| Unit/Topic | Topic(s) | Labs/Activities |
|---|---|---|
| 00 | Course Intro: Data Engineering Basics | An Introduction to Data Engineering; Beginners Guide to Data Engineering 1; Beginners Guide to Data Engineering 2; Beginners Guide to Data Engineering 3 |
| Unit 1 | Structured Data Systems (SQL Relational Databases) | Activity 01: Installing/Using MySQL & MySQL Workbench |
| Topic 1 | Online Transaction Processing (OLTP) Schema Design | Activity 02: Creating and Populating the Northwind Database |
| Topic 2 | SQL Language and Querying Fundamentals | Lab 01: SQL Query Fundamentals |
| Topic 3 | SQL Query Language: Advanced Topics | Activity 03: Advanced SQL Querying Techniques |
| Topic 4 | Online Analytical Processing (OLAP) Schema Design | Activity 04: Creating the Northwind_DW Data Warehouse |
| Topic 5 | Extract-Transform-Load (ETL) Processing | Lab 02: Basic ETL Processing (with SQL) |
| Unit 2 | Python Programming for Data Engineering | Activity 01: Installing/Using Anaconda Python with Jupyter Notebooks |
| Topic 1 | Python Fundamentals | Activity 02: Python Language Basics in Jupyter Notebooks |
| Topic 2 | Using Python to Interact with SQL Database Systems (MySQL) | Activity 03: Using Python to Interact with MySQL in Jupyter Notebooks |
| Topic 3 | Using Python to Interact with File System Data | Activity 04: Using Python to Interact with Files in Jupyter Notebooks |
| Topic 4 | Using Python to Interact with Application Programming Interfaces (APIs) | Activity 05: Using Python to Interact with APIs in Jupyter Notebooks |
| Topic 5 | Using Python to Extract, Transform and Load Data | Lab 03: Using Python to Perform Extract-Transform-Load (ETL) Processing |
| Project 1 | Create a Data Warehouse Using Data from Various Sources | |
| Unit 3 | Semi-Structured Data Systems (NoSQL) | Activity 01: Installing MongoDB & MongoDB Compass |
| Topic 1 | Introduction to NoSQL Database Systems | Activity 02: Provisioning MongoDB Atlas (Cloud Version) |
| Topic 2 | Using Python to Interact with NoSQL Database Systems (MongoDB) | Activity 03: Using Python to Interact with MongoDB in Jupyter Notebooks |
| Topic 3 | Working with Polyschematic Data and JSON | Activity 04: MongoDB Querying Fundamentals with JavaScript Object Notation (JSON) |
| Topic 4 | Integrating MongoDB Data into the Northwind_DW Data Warehouse | Lab 04: Extending the Northwind_DW Data Warehouse with Data from MongoDB |
| Unit 4 | Data Lakehouse Architectures & Real-Time Streaming Systems | Activity 01: Provisioning a Spark/PySpark Development Environment |
| Topic 1 | Introduction to Apache Spark & PySpark | Activity 02: Running and Configuring Apache Spark/PySpark |
| Topic 2 | Spark SQL Language and Query Fundamentals | Activity 03: Using Spark-SQL to Query File-based Data |
| Topic 3 | Spark Files, Databases, Tables and Views | Activity 04: Using PySpark to Create Tables and Views |
| Topic 4 | Integrating Real-Time Data with Structured Streaming | Lab 05: Incremental Updates with PySpark Structured Streaming |
| Topic 5 | Data Integration & ETL Processing in Spark | Lab 06: Using PySpark to Implement the Medallion Architecture |
| Project 2 | Create a Data Lakehouse Using PySpark | |
| Topic 6 | Integrating Databases with Spark | Activity 05: Connecting to MySQL and SQL Server with PySpark |
| Topic 7 | Integrating NoSQL Databases with Databricks | Activity 06: Connecting to MongoDB with PySpark |