Azure Data Factory Training: Designing and Implementing Data Integration Solutions

This course covers all key aspects of the Azure Data Factory v2 platform.

It is ideal for architects, developers, administrators, IT managers, and anyone else who would like to make the best possible use of this Azure service. Key areas covered include ADF v2 architecture, UI-based and automated data movement mechanisms, 10+ data transformation approaches, control-flow activities, reuse options, operational best practices, and a multi-tiered approach to ADF security.

Special attention is paid to the Azure services most commonly used in ADF v2 solutions, including Azure Data Lake Storage Gen2, Azure SQL Database, Azure Databricks, Azure Key Vault and Azure Functions.

The course includes six hands-on, instructor-led labs. These allow students to practise applying ADF v2 concepts and prepare them for real-world Azure data integration projects.

Key Features of this Azure Data Factory Training:

  • After-course instructor coaching benefit
  • Hands-on labs included

Learning Objectives

  • Build end-to-end ETL and ELT solutions using Azure Data Factory v2
  • Architect, develop and deploy sophisticated, high-performance, easy-to-maintain and secure pipelines that integrate data from a variety of Azure and non-Azure data sources
  • Apply the latest DevOps best practices available for the ADF v2 platform

Pre-Requisites

Microsoft Azure Fundamentals Training or equivalent experience.

Course Content

Introduction to ADF

  • Historical background: SSIS, ADF v1, other ETL/ELT tools
  • Key capabilities and benefits of ADF v2
  • Recent feature updates and enhancements

Core Architectural Components 

  • Connectors: Azure services, databases, NoSQL, files, generic protocols, services & apps, custom
  • Pipelines
  • Activities: data movement, data transformation, control flow
  • Datasets: source, sink
  • Integration Runtimes: Azure, Self-Hosted, Azure-SSIS
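
As an illustration of how these components relate, the sketch below builds a pipeline definition as a Python dictionary in the JSON shape used when authoring ADF v2 pipelines. The pipeline, activity and dataset names are hypothetical, and the source and sink types are assumptions chosen for the example rather than anything prescribed by the course. Datasets point to linked services, which in turn reference an integration runtime; those definitions are omitted here.

```python
import json

# Hypothetical names throughout; only the overall structure is the point.
pipeline = {
    "name": "CopySalesPipeline",
    "properties": {
        "activities": [
            {
                "name": "CopySalesCsvToSql",
                "type": "Copy",  # a data movement activity
                "inputs": [
                    {"referenceName": "SalesCsvDataset", "type": "DatasetReference"}
                ],
                "outputs": [
                    {"referenceName": "SalesSqlDataset", "type": "DatasetReference"}
                ],
                "typeProperties": {
                    "source": {"type": "DelimitedTextSource"},  # assumed source type
                    "sink": {"type": "AzureSqlSink"},  # assumed sink type
                },
            }
        ]
    },
}


def dataset_references(pipeline_def):
    """Return (activity name, source datasets, sink datasets) for each activity."""
    result = []
    for activity in pipeline_def["properties"]["activities"]:
        sources = [ref["referenceName"] for ref in activity.get("inputs", [])]
        sinks = [ref["referenceName"] for ref in activity.get("outputs", [])]
        result.append((activity["name"], sources, sinks))
    return result


# Serialise the definition and list which datasets each activity reads and writes.
print(json.dumps(pipeline, indent=2))
print(dataset_references(pipeline))
```

A pipeline groups activities; each activity names the datasets it reads from (source) and writes to (sink).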

Building and Executing Your First Pipeline

  • Creating an ADF v2 instance
  • Creating a pipeline and associated activities
  • Executing the pipeline
  • Monitoring execution
  • Reviewing results

Data Movement

Copying Tools and SDKs

  • Copy Data Tool/Wizard
  • Copy activity
  • SDKs: Python, .NET
  • Automation: PowerShell, REST API, ARM Templates

Copying Considerations

  • File formats: Avro, binary, delimited, JSON, ORC, Parquet
  • Data store support matrix
  • Write behaviour: append, upsert, overwrite, write with custom logic
  • Schema and data type mapping
  • Fault tolerance options
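
The difference between the append, upsert and overwrite write behaviours listed above can be sketched with plain Python lists. This is a conceptual model only, not how the Copy activity is implemented; the `write_rows` function and the `id` key column are hypothetical, and which behaviours are actually available depends on the sink data store.

```python
def write_rows(sink, rows, behaviour, key="id"):
    """Return the sink contents after applying the chosen write behaviour."""
    if behaviour == "append":
        # Add incoming rows after the existing ones, duplicates included.
        return sink + rows
    if behaviour == "overwrite":
        # Discard existing rows and keep only the incoming ones.
        return list(rows)
    if behaviour == "upsert":
        # Update rows whose key already exists, insert the rest.
        merged = {row[key]: row for row in sink}
        for row in rows:
            merged[row[key]] = row
        return list(merged.values())
    raise ValueError(f"unknown write behaviour: {behaviour}")


existing = [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}]
incoming = [{"id": 2, "qty": 9}, {"id": 3, "qty": 1}]

for behaviour in ("append", "upsert", "overwrite"):
    print(behaviour, write_rows(existing, incoming, behaviour))
```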

Data Transformation

Transformation with Mapping Data Flows

  • Introduction to mapping data flows
  • Data flow canvas
  • Debug mode
  • Dealing with schema drift
  • Expression builder & language
  • Transformation types: Aggregate, Alter row, Conditional split, Derived column, Exists, Filter, Flatten, Join, Lookup, New branch, Pivot, Select, Sink, Sort, Source, Surrogate key, Union, Unpivot, Window

Transformation with External Services

  • Databricks: Notebook, Jar, Python
  • HDInsight: Hive, Pig, MapReduce, Streaming, Spark
  • Azure Machine Learning service
  • SQL Stored procedures
  • Azure Data Lake Analytics U-SQL
  • Custom activities with .NET or R

Control Flow

  • Purpose of activity dependencies: branching and chaining
  • Activity dependency conditions: succeeded, failed, skipped, completed
  • Control flow activities: Append Variable, Azure Function, Execute Pipeline, Filter, ForEach, Get Metadata, If Condition, Lookup, Set Variable, Until, Wait, Web
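
The dependency conditions above can be modelled in a few lines of Python. This is a simplified sketch, not ADF code: the `Status` enum and the `should_run` helper are hypothetical, and the sketch assumes that "completed" means the upstream activity either succeeded or failed, that every dependency must be satisfied, and that any one condition listed for a dependency is enough to satisfy it.

```python
from enum import Enum


class Status(Enum):
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"
    SKIPPED = "Skipped"


def condition_met(condition, upstream_status):
    """Check one dependency condition against the upstream activity's outcome."""
    if condition == "Completed":
        # Assumed meaning: the upstream activity ran, whatever the result.
        return upstream_status in (Status.SUCCEEDED, Status.FAILED)
    return condition == upstream_status.value


def should_run(dependencies, statuses):
    """dependencies maps an upstream activity name to its accepted conditions."""
    return all(
        any(condition_met(cond, statuses[name]) for cond in conditions)
        for name, conditions in dependencies.items()
    )


statuses = {"CopyData": Status.FAILED, "Validate": Status.SUCCEEDED}

# Chaining: run only after CopyData succeeds.
print(should_run({"CopyData": ["Succeeded"]}, statuses))
# Branching: run a notification step when CopyData fails.
print(should_run({"CopyData": ["Failed"]}, statuses))
# Clean-up step that runs whether CopyData succeeded or failed.
print(should_run({"CopyData": ["Completed"], "Validate": ["Succeeded"]}, statuses))
```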

Runtime and Operations

  • Debugging
  • Monitoring: visual, Azure Monitor, SDKs, runtime-specific best practices
  • Scheduling execution with triggers: event-based, schedule, tumbling window
  • Performance, scalability, tuning
  • Common troubleshooting scenarios in activities, connectors, data flows and integration runtimes
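
A tumbling window trigger fires over fixed-size, non-overlapping, contiguous time intervals. The sketch below shows how such windows divide a time range; it is an illustration only, and the `tumbling_windows` helper is hypothetical rather than part of any ADF SDK.

```python
from datetime import datetime, timedelta


def tumbling_windows(start, end, interval):
    """Yield (window_start, window_end) pairs for each full window before end."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    window_start = start
    while window_start + interval <= end:
        # Each window begins exactly where the previous one finished.
        yield window_start, window_start + interval
        window_start += interval


# Hourly windows over a three-hour range.
for w_start, w_end in tumbling_windows(
    datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 3, 0), timedelta(hours=1)
):
    print(w_start.isoformat(), "to", w_end.isoformat())
```

Each pipeline run can then process only the slice of data that falls inside its own window.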

DevOps with ADF

  • Quick introduction to source control with Git
  • Integration with GitHub and Azure DevOps platforms
  • Environment management: Development, QA, Production
  • Iterative development best practices
  • Continuous Integration (CI) pipelines
  • Continuous Delivery (CD) pipelines

Promoting Reuse

  • Templates: out-of-the-box and organisational
  • Parameters
  • Naming conventions

Security

  • Data movement security
  • Azure Key Vault
  • Self-hosted IR considerations
  • IP address blocks
  • Managed identity

Azure Data Factory Training FAQs

What is Azure Data Factory?

Azure Data Factory (ADF) v2 is an Azure data integration service that lets you create data-driven workflows to orchestrate and automate data movement and transformation across cloud, on-premises and hybrid environments.

How much ADF experience do I need to take this course?

No prior ADF experience is required. The course is designed to take students from zero expertise with ADF v2 to an intermediate or even advanced level of knowledge. Microsoft Azure Fundamentals Training or equivalent experience is expected, however.
