Course Details
LU1 Get started with data engineering on Azure
Topic 1 Introduction to data engineering on Azure
- What is data engineering
- Important data engineering concepts
- Data engineering in Microsoft Azure
Topic 2 Introduction to Azure Data Lake Storage Gen2
- Understand Azure Data Lake Storage Gen2
- Enable Azure Data Lake Storage Gen2 in Azure Storage
- Compare Azure Data Lake Store to Azure Blob storage
- Understand the stages for processing big data
- Use Azure Data Lake Storage Gen2 in data analytics workloads
Topic 3 Introduction to Azure Synapse Analytics
- What is Azure Synapse Analytics
- How Azure Synapse Analytics works
- When to use Azure Synapse Analytics
- Exercise - Explore Azure Synapse Analytics
LU2 Build data analytics solutions using Azure Synapse serverless SQL pools
Topic 4 Use Azure Synapse serverless SQL pool to query files in a data lake
- Understand Azure Synapse serverless SQL pool capabilities and use cases
- Query files using a serverless SQL pool
- Create external database objects
- Exercise - Query files using a serverless SQL pool
Topic 5 Use Azure Synapse serverless SQL pools to transform data in a data lake
- Transform data files with the CREATE EXTERNAL TABLE AS SELECT statement
- Encapsulate data transformations in a stored procedure
- Include a data transformation stored procedure in a pipeline
- Exercise - Transform files using a serverless SQL pool
Topic 6 Create a lake database in Azure Synapse Analytics
- Understand lake database concepts
- Explore database templates
- Create a lake database
- Use a lake database
- Exercise - Analyze data in a lake database
Topic 7 Secure data and manage users in Azure Synapse serverless SQL pools
- Choose an authentication method in Azure Synapse serverless SQL pools
- Manage users in Azure Synapse serverless SQL pools
- Manage user permissions in Azure Synapse serverless SQL pools
LU3 Perform data engineering with Azure Synapse Apache Spark Pools
Topic 8 Analyze data with Apache Spark in Azure Synapse Analytics
- Get to know Apache Spark
- Use Spark in Azure Synapse Analytics
- Analyze data with Spark
- Visualize data with Spark
- Exercise - Analyze data with Spark
Topic 9 Transform data with Spark in Azure Synapse Analytics
- Modify and save dataframes
- Partition data files
- Transform data with SQL
- Exercise - Transform data with Spark in Azure Synapse Analytics
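The "Partition data files" unit above covers splitting data by column values so queries can skip irrelevant files. As a rough illustration (not the course lab code), this pure-Python sketch groups records into Hive-style partition paths of the kind Spark's `partitionBy` writes to disk; the `partition_records` helper and the sample `sales` data are illustrative only:

```python
from collections import defaultdict

def partition_records(records, keys):
    """Group records into Hive-style partition paths (e.g. 'year=2023/month=1'),
    mimicking the folder layout Spark's DataFrameWriter.partitionBy produces."""
    buckets = defaultdict(list)
    for rec in records:
        # Build the partition path from the key columns, in order
        path = "/".join(f"{k}={rec[k]}" for k in keys)
        # Partition columns are encoded in the path, so drop them from the row
        buckets[path].append({k: v for k, v in rec.items() if k not in keys})
    return dict(buckets)

sales = [
    {"year": 2023, "month": 1, "amount": 100},
    {"year": 2023, "month": 2, "amount": 250},
    {"year": 2023, "month": 1, "amount": 75},
]
parts = partition_records(sales, ["year", "month"])
# parts["year=2023/month=1"] holds both January rows, without the key columns
```

A query filtered on year and month would then only need to read the matching folder, which is the point of partitioning.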
Topic 10 Use Delta Lake in Azure Synapse Analytics
- Understand Delta Lake
- Create Delta Lake tables
- Create catalog tables
- Use Delta Lake with streaming data
- Use Delta Lake in a SQL pool
- Exercise - Use Delta Lake in Azure Synapse Analytics
LU4 Work with Data Warehouses using Azure Synapse Analytics
Topic 11 Analyze data in a relational data warehouse
- Design a data warehouse schema
- Create data warehouse tables
- Load data warehouse tables
- Query a data warehouse
- Exercise - Explore a data warehouse
Topic 12 Load data into a relational data warehouse
- Load staging tables
- Load dimension tables
- Load time dimension tables
- Load slowly changing dimensions
- Load fact tables
- Perform post load optimization
- Exercise - Load data into a relational data warehouse
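The "Load slowly changing dimensions" unit above deals with preserving history when dimension attributes change. As a minimal sketch of the Type 2 pattern (expire the current row, append a new version), in plain Python rather than the T-SQL used in the course; the `apply_scd2` helper and the customer rows are made up for illustration:

```python
from datetime import date

def apply_scd2(dimension, updates, key, today):
    """Type 2 slowly changing dimension: when a tracked attribute changes,
    close the current row (set end_date) and append a new current version."""
    dim = [row.copy() for row in dimension]
    current = {row[key]: row for row in dim if row["end_date"] is None}
    for upd in updates:
        old = current.get(upd[key])
        if old is None:
            # New member: insert as the current version
            dim.append({**upd, "start_date": today, "end_date": None})
        elif any(old[k] != v for k, v in upd.items() if k != key):
            old["end_date"] = today  # expire the old version
            dim.append({**upd, "start_date": today, "end_date": None})
    return dim

customers = [
    {"cust_id": 1, "city": "KL", "start_date": date(2022, 1, 1), "end_date": None},
]
result = apply_scd2(customers, [{"cust_id": 1, "city": "Penang"}], "cust_id", date(2024, 6, 1))
# result now has two rows for cust_id 1: the expired KL row and a current Penang row
```

Fact tables can then join to the dimension on the key plus a date range, so historical facts keep the attribute values that were current at the time.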
Topic 13 Manage and monitor data warehouse activities in Azure Synapse Analytics
- Scale compute resources in Azure Synapse Analytics
- Pause compute in Azure Synapse Analytics
- Manage workloads in Azure Synapse Analytics
- Use Azure Advisor to review recommendations
- Use dynamic management views to identify and troubleshoot query performance
Topic 14 Secure a data warehouse in Azure Synapse Analytics
- Understand network security options for Azure Synapse Analytics
- Configure Conditional Access
- Configure authentication
- Manage authorization through column and row level security
- Exercise - Manage authorization through column and row level security
- Manage sensitive data with Dynamic Data Masking
- Implement encryption in Azure Synapse Analytics
LU5 Transfer and transform data with Azure Synapse Analytics pipelines
Topic 15 Build a data pipeline in Azure Synapse Analytics
- Understand pipelines in Azure Synapse Analytics
- Create a pipeline in Azure Synapse Studio
- Define data flows
- Run a pipeline
- Exercise - Build a data pipeline in Azure Synapse Analytics
Topic 16 Use Spark Notebooks in an Azure Synapse Pipeline
- Understand Synapse Notebooks and Pipelines
- Use a Synapse notebook activity in a pipeline
- Use parameters in a notebook
- Exercise - Use an Apache Spark notebook in a pipeline
LU6 Work with Hybrid Transactional and Analytical Processing Solutions using Azure Synapse Analytics
Topic 17 Plan hybrid transactional and analytical processing using Azure Synapse Analytics
- Understand hybrid transactional and analytical processing patterns
- Describe Azure Synapse Link
Topic 18 Implement Azure Synapse Link with Azure Cosmos DB
- Enable Cosmos DB account to use Azure Synapse Link
- Create an analytical store enabled container
- Create a linked service for Cosmos DB
- Query Cosmos DB data with Spark
- Query Cosmos DB with Synapse SQL
- Exercise - Implement Azure Synapse Link for Cosmos DB
Topic 19 Implement Azure Synapse Link for SQL
- What is Azure Synapse Link for SQL?
- Configure Azure Synapse Link for Azure SQL Database
- Configure Azure Synapse Link for SQL Server
- Exercise - Implement Azure Synapse Link for SQL
LU7 Implement a Data Streaming Solution with Azure Stream Analytics
Topic 20 Get started with Azure Stream Analytics
- Understand data streams
- Understand event processing
- Understand window functions
- Exercise - Get started with Azure Stream Analytics
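The "Understand window functions" unit above covers grouping an unbounded stream into finite windows (Stream Analytics offers tumbling, hopping, sliding, and session windows). As a rough pure-Python sketch of the simplest case, a tumbling window count; the function name and the sample timestamps are illustrative, not Stream Analytics syntax:

```python
def tumbling_window_counts(events, window_seconds):
    """Count events per fixed-size, non-overlapping (tumbling) window.
    Each event belongs to exactly one window, keyed by its start time."""
    counts = {}
    for ts in events:
        # Round the timestamp down to the start of its window
        window_start = (ts // window_seconds) * window_seconds
        counts[window_start] = counts.get(window_start, 0) + 1
    return counts

# Event timestamps in seconds since some origin
events = [1, 4, 9, 12, 13, 21]
print(tumbling_window_counts(events, 10))  # {0: 3, 10: 2, 20: 1}
```

Hopping and sliding windows differ in that windows overlap, so one event can contribute to several windows; session windows grow until a gap in the stream closes them.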
Topic 21 Ingest streaming data using Azure Stream Analytics and Azure Synapse Analytics
- Stream ingestion scenarios
- Configure inputs and outputs
- Define a query to select, filter, and aggregate data
- Run a job to ingest data
- Exercise - Ingest streaming data into Azure Synapse Analytics
Topic 22 Visualize real-time data with Azure Stream Analytics and Power BI
- Use a Power BI output in Azure Stream Analytics
- Create a query for real-time visualization
- Create real-time data visualizations in Power BI
- Exercise - Create a real-time data visualization
LU8 Govern data across an enterprise
Topic 23 Introduction to Microsoft Purview
- What is Microsoft Purview?
- How Microsoft Purview works
- When to use Microsoft Purview
Topic 24 Discover trusted data using Microsoft Purview
- Search for assets
- Browse assets
- Use assets with Power BI
- Integrate with Azure Synapse Analytics
Topic 25 Catalog data artifacts by using Microsoft Purview
- Register and scan data
- Classify and label data
- Search the data catalog
Topic 26 Manage Power BI assets by using Microsoft Purview
- Register and scan a Power BI tenant
- Search and browse Power BI assets
- View Power BI metadata and lineage
Topic 27 Integrate Microsoft Purview and Azure Synapse Analytics
- Catalog Azure Synapse Analytics data assets in Microsoft Purview
- Connect Microsoft Purview to an Azure Synapse Analytics workspace
- Search a Purview catalog in Synapse Studio
- Track data lineage in pipelines
- Exercise - Integrate Azure Synapse Analytics and Microsoft Purview
LU9 Data engineering with Azure Databricks
Topic 28 Explore Azure Databricks
- Get started with Azure Databricks
- Identify Azure Databricks workloads
- Understand key concepts
- Exercise - Explore Azure Databricks
Topic 29 Use Apache Spark in Azure Databricks
- Get to know Spark
- Create a Spark cluster
- Use Spark in notebooks
- Use Spark to work with data files
- Visualize data
- Exercise - Use Spark in Azure Databricks
Topic 30 Use Delta Lake in Azure Databricks
- Get started with Delta Lake
- Create Delta Lake tables
- Create and query catalog tables
- Use Delta Lake for streaming data
- Exercise - Use Delta Lake in Azure Databricks
Topic 31 Use SQL Warehouses in Azure Databricks
- Get started with SQL Warehouses
- Create databases and tables
- Create queries and dashboards
- Exercise - Use a SQL Warehouse in Azure Databricks
Topic 32 Run Azure Databricks Notebooks with Azure Data Factory
- Understand Azure Databricks notebooks and pipelines
- Create a linked service for Azure Databricks
- Use a Notebook activity in a pipeline
- Use parameters in a notebook
- Exercise - Run an Azure Databricks Notebook with Azure Data Factory
Course Info
Prerequisites
This is an intermediate course. The following knowledge is assumed:
Software Requirements
Please download and install the following software prior to the class:
HRDF Funding
Please refer to this video: https://youtu.be/Kzpd-V1F9Xs
1- HRD Corp Grant Helper
How to submit grant applications for HRD Corp Claimable Courses
2- Employers are required to apply for the grant at least one week before training commences.
Employers must submit their applications with supporting documents, including invoices/quotations, trainer profiles, training schedule and course content.
3- First, log in to the Employer's e-TRiS account: https://etris.hrdcorp.gov.my
Second, click Application
4- Click Grant on the left side under Applications
5- Click Apply Grant on the left side under Applications
6- Click Apply
7- Choose a Scheme Code and select HRD Corp Claimable Courses: Skim Bantuan Latihan Khas. Then, click Apply
8- A Scheme Code represents a type of training that meets the requirements set by HRD Corp. Below is the list of schemes offered by HRD Corp:
9- Select your Immediate Officer and click Next
10- Select a Training Provider, then click Next
11- Please select a training programme from the list, then key in all the required details and click Next
Select your desired training programme.
Explain why the participant is required to attend the training, e.g. how it relates to their tasks, career development, etc.
Explain the background and objective of this training.
Select a relevant focus area. For Employer-Specific Courses, select ‘Not Applicable’.
12- If the training programme is a micro-credential programme, you are required to complete these three fields. Save and click Next
Insert the MiCAS Application number
13- HRD Corp Focus Area Courses are built on the nine (9) pillars listed below and are closely tied to supporting government initiatives towards nation building. As such, courses offered through the HRD Corp Focus Areas are designed to equip the workforce with the skills required for current and future demands. Details of the focus areas are as follows:
14- Please select a Course Title and Type of Training
15- Select the type of training that matches the actual training conducted, or as stated in the training brochure:
16- Please key in the Training Location and click Next
17- Please select the Level of Certification and click Next
18- Please follow the instructions and key in trainee details
19- Click Add Batch, then click Save
20- Click Add Trainee Details
21- Please key in all the required details, then click Add
22- Click Add if there are more participants. Once done, click Save
23- Click Next
24- Please key in the course fees and allowance details, then click Save
25- Estimated cost includes the course fees/external trainer fees, allowances, and consumable training materials. Please comply with the HRD Corp Allowable Cost Matrix.
26- Select Upfront Payment to Training Provider and key in the percentage from 0% to 30%. Then, click Save and Next
27- Complete the declaration form and select a desired officer
28- Add all the required documents, then click Add Attachment. Then, click Save and Submit Application
29- Once the New Grant Application is successfully submitted, the Grant Officer will evaluate the application accordingly. The application may be queried if additional information is required.
The application status will be updated via the employer’s dashboard, email, and the e-TRiS inbox.
Job Roles
- Data Engineer
- Azure Data Engineer
- Cloud Data Engineer
- Business Intelligence Developer
- Data Architect
- Data Scientist
- Machine Learning Engineer
- Big Data Engineer
- Data Analyst
- Database Administrator
- Cloud Solutions Architect
- DevOps Engineer
- Software Developer
- IT Professional
- Systems Analyst
Trainers
Lee Cheong Loong: Lee Cheong Loong is a manager with 23 years of working experience across multiple roles and departments. He has completed the HRD Corp Train the Trainer programme and is an HRD Corp Accredited Trainer, a Microsoft Certified Trainer, and a CPFA Citizen Data Scientist Trainer, and holds a Professional Certificate in Big Data & Analytics, Microsoft Office Specialist - Excel 2016, and Tableau Desktop Specialist certifications. He also delivers training in R and Python programming, Excel dashboards for business analysis, data visualization with Tableau, Microsoft Power BI, and Citizen Data Scientist (OpenCertHub).