DP-3027: Implement a data engineering solution with Azure Databricks

Length: 1 Day(s)     Cost: $895 + GST

Locations: Auckland, Hamilton, Christchurch, Wellington, Virtual Class

DP-3027 - Implement a data engineering solution with Azure Databricks equips data engineers with the skills to implement real-time and batch data pipelines using Azure Databricks. The course covers structured streaming, Delta Live Tables, and performance optimization techniques. Learn to automate workflows, apply CI/CD practices, and ensure data governance with Unity Catalog. Hands-on labs reinforce concepts like streaming architecture, SQL Warehouses, and Azure Data Factory integration.


This course is best suited to data engineers


Before attending this course, students should have a fundamental knowledge of data analytics concepts.


After completing this course, students will be able to:

  • Understand and implement Spark Structured Streaming for incremental data processing.
  • Use Delta Live Tables for building scalable streaming architectures.
  • Optimize pipeline performance using serverless compute and query tuning.
  • Apply CI/CD workflows with Git integration and testing strategies.
  • Automate data workloads using Azure Databricks Jobs and best practices.
  • Implement governance, security, and data lineage with Unity Catalog.

Perform incremental processing with Spark Structured Streaming
  • Understand Spark Structured Streaming.
  • Apply techniques to optimize structured streaming.
  • Handle late-arriving or out-of-order events.
  • Set up real-time sources for incremental processing.
  • Lab: Real-time ingestion and processing with Delta Live Tables in Azure Databricks
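The late-event handling covered in this module rests on event-time watermarking. The sketch below illustrates the idea in pure Python; it is not the Spark API (the real call is `df.withWatermark("eventTime", "10 minutes")`), and the class and window size are illustrative assumptions.

```python
# Conceptual sketch of event-time watermarking, the mechanism Spark
# Structured Streaming uses to bound how long it waits for late events.
# Pure Python for illustration; the real API is df.withWatermark(...).

from datetime import datetime, timedelta

class WatermarkedAggregator:
    """Counts events per 1-minute window, dropping events that arrive
    later than the allowed delay behind the max event time seen."""

    def __init__(self, allowed_delay: timedelta):
        self.allowed_delay = allowed_delay
        self.max_event_time = datetime.min
        self.window_counts = {}   # window start -> event count
        self.dropped = 0

    def process(self, event_time: datetime):
        # The watermark trails the max event time by the allowed delay.
        self.max_event_time = max(self.max_event_time, event_time)
        watermark = self.max_event_time - self.allowed_delay
        if event_time < watermark:
            self.dropped += 1  # too late: window state already finalized
            return
        window = event_time.replace(second=0, microsecond=0)
        self.window_counts[window] = self.window_counts.get(window, 0) + 1

agg = WatermarkedAggregator(allowed_delay=timedelta(minutes=10))
base = datetime(2024, 1, 1, 12, 0)
agg.process(base)                          # on time
agg.process(base + timedelta(minutes=30))  # advances watermark to 12:20
agg.process(base + timedelta(minutes=5))   # 12:05 < 12:20 -> dropped
print(agg.dropped)  # 1
```

Spark applies the same rule at scale: once the watermark passes a window, that window's state is finalized and later events for it are discarded.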
Implement streaming architecture patterns with Delta Live Tables
  • Use event-driven architectures with Delta Live Tables.
  • Ingest streaming data.
  • Achieve data consistency and reliability.
  • Scale streaming workloads with Delta Live Tables.
  • Lab: End-to-end streaming pipeline with Delta Live Tables
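A common shape for the pipelines in this module is the medallion (bronze/silver/gold) pattern, which Delta Live Tables expresses declaratively with `@dlt.table` functions and expectations. The pure-Python sketch below shows only the layering idea; the record shape and function names are illustrative assumptions, not DLT code.

```python
# Conceptual sketch of the medallion (bronze/silver/gold) pattern that
# Delta Live Tables implements declaratively. Pure Python illustration;
# the record schema and layer functions are assumptions for this sketch.

raw_events = [
    {"id": 1, "amount": "42.5", "status": "ok"},
    {"id": 2, "amount": "bad", "status": "ok"},     # fails the quality rule
    {"id": 3, "amount": "10.0", "status": "error"},
]

def bronze(events):
    """Ingest raw records as-is (append-only landing layer)."""
    return list(events)

def silver(bronze_rows):
    """Cleanse and enforce a data-quality expectation, analogous in
    spirit to @dlt.expect_or_drop in Delta Live Tables."""
    out = []
    for row in bronze_rows:
        try:
            out.append({**row, "amount": float(row["amount"])})
        except ValueError:
            pass  # drop rows that violate the expectation
    return out

def gold(silver_rows):
    """Business-level aggregate: total amount of successful events."""
    return sum(r["amount"] for r in silver_rows if r["status"] == "ok")

total = gold(silver(bronze(raw_events)))
print(total)  # 42.5
```

In DLT the same layering is declared as tables rather than called as functions, and the framework handles orchestration, incremental updates, and retries.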
Optimize performance with Spark and Delta Live Tables
  • Use serverless compute and parallelism with Delta Live Tables.
  • Perform cost-based optimization and query performance tuning.
  • Use Change Data Capture (CDC).
  • Apply enhanced autoscaling capabilities.
  • Implement observability and enhance data quality metrics.
  • Lab: Optimize data pipelines for better performance in Azure Databricks
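The CDC topic in this module boils down to applying an ordered stream of change events to a keyed target. The following pure-Python sketch shows the mechanics; the event shape and `seq` ordering field are illustrative assumptions, not the Delta Live Tables API (which provides this via `APPLY CHANGES INTO`).

```python
# Conceptual sketch of Change Data Capture (CDC): applying an ordered
# stream of insert/update/delete events to a keyed target table,
# similar in spirit to APPLY CHANGES INTO in Delta Live Tables.
# Pure Python; the event shape is an illustrative assumption.

def apply_changes(target: dict, changes: list) -> dict:
    """Apply change events, keyed by 'id', in ascending sequence order
    so out-of-order delivery cannot corrupt the final state."""
    for ev in sorted(changes, key=lambda e: e["seq"]):
        if ev["op"] == "delete":
            target.pop(ev["id"], None)
        else:  # insert or update: upsert the row
            target[ev["id"]] = ev["data"]
    return target

table = {}
events = [
    {"seq": 1, "op": "insert", "id": "a", "data": {"qty": 1}},
    {"seq": 3, "op": "delete", "id": "a", "data": None},
    {"seq": 2, "op": "update", "id": "a", "data": {"qty": 5}},
]
apply_changes(table, events)
print(table)  # {} -- the delete (seq 3) wins over the update (seq 2)
```

Sequencing by a monotonic column is the key design choice: without it, the late-delivered update at seq 2 would resurrect a row the source had already deleted.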
Implement CI/CD workflows in Azure Databricks
  • Implement version control and Git integration.
  • Perform unit testing and integration testing.
  • Maintain environment and configuration management.
  • Implement rollback and roll-forward strategies.
  • Lab: Implement CI/CD workflows
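The unit-testing bullet above is the heart of a Databricks CI workflow: transformations are factored into plain functions so they can be tested without a cluster. The sketch below shows the pattern; the function and test names are illustrative assumptions.

```python
# Conceptual sketch of unit testing a pipeline transformation, the
# kind of test a CI workflow runs on every commit before deployment.
# The transformation and test names are illustrative assumptions.

def normalize_status(rows):
    """Transformation under test: lowercase and strip status values."""
    return [{**r, "status": r["status"].strip().lower()} for r in rows]

def test_normalize_status():
    rows = [{"id": 1, "status": "  OK "}, {"id": 2, "status": "Error"}]
    result = normalize_status(rows)
    assert result[0]["status"] == "ok"
    assert result[1]["status"] == "error"

test_normalize_status()  # run directly; in CI this runs under pytest
print("tests passed")
```

Keeping business logic out of notebook cells and in importable modules like this is what makes the Git integration and rollback strategies in this module practical.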
Automate workloads with Azure Databricks Jobs
  • Implement job scheduling and automation.
  • Optimize workflows with parameters.
  • Handle dependency management.
  • Implement error handling and retry mechanisms.
  • Explore best practices and guidelines.
  • Lab: Automate data ingestion and processing
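The error-handling bullet in this module refers to the retry settings Azure Databricks Jobs exposes per task. The sketch below shows the underlying retry-with-backoff pattern in plain Python so the mechanics are visible; the function names and delays are illustrative assumptions, not the Jobs API.

```python
# Conceptual sketch of the retry-with-backoff pattern that Azure
# Databricks Jobs provides via per-task retry settings. Plain Python
# illustration; names and delay values are assumptions for this sketch.

import time

def run_with_retries(task, max_retries=3, base_delay=0.01):
    """Run a task, retrying on failure with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the failure
            time.sleep(base_delay * (2 ** attempt))

calls = {"n": 0}
def flaky_ingest():
    """Simulated ingestion task that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient source error")
    return "ingested"

print(run_with_retries(flaky_ingest))  # ingested (after 2 failed attempts)
```

In a Jobs configuration the equivalent is declared rather than coded: a max-retries count and retry interval on the task, with alerts on final failure.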