Call +60 3-7490 2093

HRDF Approved Training Provider in Malaysia - Modular Fast Track Skill-Based Trainings

Apache Spark Essential Training

Apache Spark is a powerful platform that provides users with new ways to store and make use of big data. In this course, get up to speed with Spark, and discover how to leverage this popular processing engine to deliver effective and comprehensive insights into your data. The trainer will show you how to analyze data in Spark using PySpark and Spark SQL, explore running machine learning algorithms using MLlib, demonstrate how to create a streaming analytics application using Spark Streaming, and more.

Course Highlights

  • Overview of Apache Spark
  • Apache Spark components
  • Databricks
  • Data interfaces
  • Import files
  • Spark ML for Machine Learning
  • Spark SQL for querying streaming data


All participants will receive a Certificate of Completion from Tertiary Courses after achieving at least 75% attendance.

HRDF SBL Claimable for Employers Registered with HRDF

Course Code: M456

Course Cancellation/Reschedule Policy

We reserve the right to cancel or reschedule the course due to unforeseen circumstances. If the course is cancelled, we will refund 100% of the fee to participants.
Note that the venue of the training is subject to change, depending on class size and classroom availability.
Note that the minimum class size to start a class is 3 pax.

Course Details

Topic 2: Exploring Data

  • Data Interface
  • RDD Basic Operations
  • Import Data
  • Actions and Transformations
  • Saving Results

Topic 3: Analyzing Data

  • Select and Filter Data
  • Aggregate Data
  • Save Data

Topic 4: Spark SQL

  • Creating Tables
  • Querying Data 
  • Visualizing Data

Topic 5: Machine Learning

  • ML or MLlib Module
  • Preprocessing Data
  • Linear Regression
  • Classification

Topic 6: Spark Streaming

  • Streaming Setup
  • Querying Streaming Data

Course Admin


This is an intermediate course. Participants should have basic knowledge of the following subjects:

  • Python
  • Apache Spark

Software Requirement

Download and unzip Apache Spark

Who Should Attend

  • Data Scientists
  • Data Analysts
  • Apache Spark developers who want to use Apache Spark for Hadoop Big Data analysis


Apache Spark trainer Dr Atabak has a total of 15 years of experience in software development, architecture, and system integration, with broad experience in commercial software architecture and development. This experience covers all stages of the software development lifecycle and the building of high-performance, high-availability, secure, and reliable systems. An experienced lead of teams and projects of 10-15 people through SDLC iterations, Dr Atabak is an Agile and eXtreme Programming practitioner who has mentored fellow developers on various aspects of application architecture and development, and who works well with customers. Dr Atabak architects solutions in agile and scrum development environments, with a deep understanding of technology and a focus on delivering business solutions, and externalizes configuration and business logic to ease client software implementations. Further strengths include expertise in full project life cycle development, including implementation and integration, and a successful background working with stakeholders to develop an architecture framework that aligns strategy, processes, and IT assets with a business goal. Dr Atabak works closely with project managers, developers, and focus groups to avoid redundancy, minimize expenditures, and improve overall synergy within the organization.

Apache Spark trainer Rupesh Nanglia (RUPS) has more than 13 years' experience in software development, architecture, and system integration. He has architected solutions in agile and scrum development environments and is adept in Big Data Analytics, Internet of Things (IoT), Project Management, Cloud Computing, and Network Analysis. Moreover, his on-the-job experience has afforded him a well-rounded skill set. He is an expert in Apache Hadoop, NoSQL databases (Cassandra, MongoDB, HBase, Neo4j), Spark, Kafka, Storm, R, data analytics, social media analytics, and the OpenStack cloud computing platform, as well as in preparing study narrative documents and case reviews and reviewing studies performed by staff. He also works in data science as a trainer and data scientist, and has worked on machine learning and process mining projects. He has strong written and verbal communication skills and a successful background working with stakeholders to develop and implement an architecture framework that aligns strategy, processes, and IT assets with a business goal. He has been associated with the training industry for more than half a decade.
