Apache Maven Training Programme

NOBLEPROG Training Institute

Course Summary

IT, Software & Hardware
Classroom
Beginner/Entry level

Course Overview

NobleProg is the first Polish company to succeed as an international training company in the field of IT. Founded in 2004, NobleProg currently has offices in, among other places, the United States, Canada, Western and Central Europe, Asia and the Middle East, including Italy, France, China, Russia, Brazil and India. Our personalized training and consultancy offerings include Artificial Intelligence, Management, IT, Statistics and Programming.

Course Description

  • Why Spark?
  • Spark Basics
  • Working with RDDs
  • The Hadoop Distributed File System
  • Running Spark on a Cluster
  • Parallel Programming with Spark
  • Caching and Persistence
  • Writing Spark Applications
  • Spark, Hadoop, and the Enterprise Data Center
  • Spark Streaming
  • Common Spark Algorithms
  • Improving Spark Performance

Topics Covered

    Why Spark?

    • Problems with Traditional Large-Scale Systems
    • Introducing Spark

    Spark Basics

    • What is Apache Spark?
    • Using the Spark Shell
    • Resilient Distributed Datasets (RDDs)
    • Functional Programming with Spark

    Working with RDDs

    • RDD Operations
    • Key-Value Pair RDDs
    • MapReduce and Pair RDD Operations

    The Hadoop Distributed File System

    • Why HDFS?
    • HDFS Architecture
    • Using HDFS

    Running Spark on a Cluster

    • Overview
    • A Spark Standalone Cluster
    • The Spark Standalone Web UI

    Parallel Programming with Spark

    • RDD Partitions and HDFS Data Locality
    • Working With Partitions
    • Executing Parallel Operations

    Caching and Persistence

    • RDD Lineage
    • Caching Overview
    • Distributed Persistence

    Writing Spark Applications

    • Spark Applications vs. Spark Shell
    • Creating the SparkContext
    • Configuring Spark Properties
    • Building and Running a Spark Application
    • Logging

    Spark, Hadoop, and the Enterprise Data Center

    • Overview
    • Spark and the Hadoop Ecosystem
    • Spark and MapReduce

    Spark Streaming

    • Spark Streaming Overview
    • Example: Streaming Word Count
    • Other Streaming Operations
    • Sliding Window Operations
    • Developing Spark Streaming Applications

    Common Spark Algorithms

    • Iterative Algorithms
    • Graph Analysis
    • Machine Learning

    Improving Spark Performance

    • Shared Variables: Broadcast Variables
    • Shared Variables: Accumulators
    • Common Performance Issues
