DP-3027: Implement a data engineering solution with Azure Databricks

Length: 1 Day(s)     Cost: $895 + GST

Locations: Auckland, Hamilton, Christchurch, Wellington, Virtual Class

DP-3027 - Implement a data engineering solution with Azure Databricks equips data engineers with the skills to implement real-time and batch data pipelines using Azure Databricks. The course covers structured streaming, Delta Live Tables, and performance optimization techniques. Learn to automate workflows, apply CI/CD practices, and ensure data governance with Unity Catalog. Hands-on labs reinforce concepts like streaming architecture, SQL Warehouses, and Azure Data Factory integration.


This course is best suited to data engineers


Before attending this course, students should have a fundamental knowledge of data analytics concepts.


After completing this course, students will be able to:

  • Understand and implement Spark Structured Streaming for incremental data processing.
  • Use Delta Live Tables for building scalable streaming architectures.
  • Optimize pipeline performance using serverless compute and query tuning.
  • Apply CI/CD workflows with Git integration and testing strategies.
  • Automate data workloads using Azure Databricks Jobs and best practices.
  • Implement governance, security, and data lineage with Unity Catalog.

Perform incremental processing with Spark Structured Streaming
  • Understand Spark Structured Streaming.
  • Apply techniques to optimize structured streaming.
  • Handle late-arriving or out-of-order events.
  • Set up real-time sources for incremental processing.
  • Lab: Real-time ingestion and processing with Delta Live Tables in Azure Databricks
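The late-event handling covered in this module rests on event-time watermarking. The sketch below illustrates the idea in pure Python; it is not the Spark API (the real call is `df.withWatermark("eventTime", "10 minutes")`), and the class and window size are illustrative assumptions.

```python
# Conceptual sketch of event-time watermarking, the mechanism Spark
# Structured Streaming uses to bound how long it waits for late events.
# Pure Python for illustration; the real API is df.withWatermark(...).

from datetime import datetime, timedelta

class WatermarkedAggregator:
    """Counts events per 1-minute window, dropping events that arrive
    later than the allowed delay behind the max event time seen."""

    def __init__(self, allowed_delay: timedelta):
        self.allowed_delay = allowed_delay
        self.max_event_time = datetime.min
        self.window_counts = {}   # window start -> event count
        self.dropped = 0

    def process(self, event_time: datetime):
        # The watermark trails the max event time by the allowed delay.
        self.max_event_time = max(self.max_event_time, event_time)
        watermark = self.max_event_time - self.allowed_delay
        if event_time < watermark:
            self.dropped += 1  # too late: window state already finalized
            return
        window = event_time.replace(second=0, microsecond=0)
        self.window_counts[window] = self.window_counts.get(window, 0) + 1

agg = WatermarkedAggregator(allowed_delay=timedelta(minutes=10))
base = datetime(2024, 1, 1, 12, 0)
agg.process(base)                          # on time
agg.process(base + timedelta(minutes=30))  # advances watermark to 12:20
agg.process(base + timedelta(minutes=5))   # 12:05 < 12:20 -> dropped
print(agg.dropped)  # 1
```

Spark applies the same rule at scale: once the watermark passes a window, that window's state is finalized and later events for it are discarded.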
Implement streaming architecture patterns with Delta Live Tables
  • Use event-driven architectures with Delta Live Tables.
  • Ingest streaming data.
  • Achieve data consistency and reliability.
  • Scale streaming workloads with Delta Live Tables.
  • Lab: End-to-end streaming pipeline with Delta Live Tables
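A common shape for the pipelines in this module is the medallion (bronze/silver/gold) pattern, which Delta Live Tables expresses declaratively with `@dlt.table` functions and expectations. The pure-Python sketch below shows only the layering idea; the record shape and function names are illustrative assumptions, not DLT code.

```python
# Conceptual sketch of the medallion (bronze/silver/gold) pattern that
# Delta Live Tables implements declaratively. Pure Python illustration;
# the record schema and layer functions are assumptions for this sketch.

raw_events = [
    {"id": 1, "amount": "42.5", "status": "ok"},
    {"id": 2, "amount": "bad", "status": "ok"},     # fails the quality rule
    {"id": 3, "amount": "10.0", "status": "error"},
]

def bronze(events):
    """Ingest raw records as-is (append-only landing layer)."""
    return list(events)

def silver(bronze_rows):
    """Cleanse and enforce a data-quality expectation, analogous in
    spirit to @dlt.expect_or_drop in Delta Live Tables."""
    out = []
    for row in bronze_rows:
        try:
            out.append({**row, "amount": float(row["amount"])})
        except ValueError:
            pass  # drop rows that violate the expectation
    return out

def gold(silver_rows):
    """Business-level aggregate: total amount of successful events."""
    return sum(r["amount"] for r in silver_rows if r["status"] == "ok")

total = gold(silver(bronze(raw_events)))
print(total)  # 42.5
```

In DLT the same layering is declared as tables rather than called as functions, and the framework handles orchestration, incremental updates, and retries.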
Optimize performance with Spark and Delta Live Tables
  • Use serverless compute and parallelism with Delta Live Tables.
  • Perform cost-based optimization and query performance tuning.
  • Use Change Data Capture (CDC).
  • Apply enhanced autoscaling capabilities.
  • Implement observability and enhance data quality metrics.
  • Lab: Optimize data pipelines for better performance in Azure Databricks
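The CDC topic in this module boils down to applying an ordered stream of change events to a keyed target. The following pure-Python sketch shows the mechanics; the event shape and `seq` ordering field are illustrative assumptions, not the Delta Live Tables API (which provides this via `APPLY CHANGES INTO`).

```python
# Conceptual sketch of Change Data Capture (CDC): applying an ordered
# stream of insert/update/delete events to a keyed target table,
# similar in spirit to APPLY CHANGES INTO in Delta Live Tables.
# Pure Python; the event shape is an illustrative assumption.

def apply_changes(target: dict, changes: list) -> dict:
    """Apply change events, keyed by 'id', in ascending sequence order
    so out-of-order delivery cannot corrupt the final state."""
    for ev in sorted(changes, key=lambda e: e["seq"]):
        if ev["op"] == "delete":
            target.pop(ev["id"], None)
        else:  # insert or update: upsert the row
            target[ev["id"]] = ev["data"]
    return target

table = {}
events = [
    {"seq": 1, "op": "insert", "id": "a", "data": {"qty": 1}},
    {"seq": 3, "op": "delete", "id": "a", "data": None},
    {"seq": 2, "op": "update", "id": "a", "data": {"qty": 5}},
]
apply_changes(table, events)
print(table)  # {} -- the delete (seq 3) wins over the update (seq 2)
```

Sequencing by a monotonic column is the key design choice: without it, the late-delivered update at seq 2 would resurrect a row the source had already deleted.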
Implement CI/CD workflows in Azure Databricks
  • Implement version control and Git integration.
  • Perform unit testing and integration testing.
  • Maintain environment and configuration management.
  • Implement rollback and roll-forward strategies.
  • Lab: Implement CI/CD workflows
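The unit-testing bullet above is the heart of a Databricks CI workflow: transformations are factored into plain functions so they can be tested without a cluster. The sketch below shows the pattern; the function and test names are illustrative assumptions.

```python
# Conceptual sketch of unit testing a pipeline transformation, the
# kind of test a CI workflow runs on every commit before deployment.
# The transformation and test names are illustrative assumptions.

def normalize_status(rows):
    """Transformation under test: lowercase and strip status values."""
    return [{**r, "status": r["status"].strip().lower()} for r in rows]

def test_normalize_status():
    rows = [{"id": 1, "status": "  OK "}, {"id": 2, "status": "Error"}]
    result = normalize_status(rows)
    assert result[0]["status"] == "ok"
    assert result[1]["status"] == "error"

test_normalize_status()  # run directly; in CI this runs under pytest
print("tests passed")
```

Keeping business logic out of notebook cells and in importable modules like this is what makes the Git integration and rollback strategies in this module practical.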
Automate workloads with Azure Databricks Jobs
  • Implement job scheduling and automation.
  • Optimize workflows with parameters.
  • Handle dependency management.
  • Implement error handling and retry mechanisms.
  • Explore best practices and guidelines.
  • Lab: Automate data ingestion and processing
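The error-handling bullet in this module refers to the retry settings Azure Databricks Jobs exposes per task. The sketch below shows the underlying retry-with-backoff pattern in plain Python so the mechanics are visible; the function names and delays are illustrative assumptions, not the Jobs API.

```python
# Conceptual sketch of the retry-with-backoff pattern that Azure
# Databricks Jobs provides via per-task retry settings. Plain Python
# illustration; names and delay values are assumptions for this sketch.

import time

def run_with_retries(task, max_retries=3, base_delay=0.01):
    """Run a task, retrying on failure with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the failure
            time.sleep(base_delay * (2 ** attempt))

calls = {"n": 0}
def flaky_ingest():
    """Simulated ingestion task that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient source error")
    return "ingested"

print(run_with_retries(flaky_ingest))  # ingested (after 2 failed attempts)
```

In a Jobs configuration the equivalent is declared rather than coded: a max-retries count and retry interval on the task, with alerts on final failure.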