Data Engineering on Google Cloud Platform

This four-day instructor-led class provides participants a hands-on introduction to designing and building data processing systems on Google Cloud Platform.

This class is intended for experienced developers who are responsible for managing big data transformations including:

  •   Extracting, loading, transforming, cleaning, and validating data
  •   Designing pipelines and architectures for data processing
  •   Creating and maintaining machine learning and statistical models
  •   Querying datasets, visualizing query results and creating reports

Learning Objectives

This course teaches participants the following skills:

  •   Design and build data processing systems on Google Cloud Platform
  •   Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow
  •   Derive business insights from extremely large datasets using Google BigQuery
  •   Train, evaluate, and predict with machine learning models using TensorFlow and Cloud ML
  •   Leverage unstructured data using Spark and ML APIs on Cloud Dataproc
  •   Enable instant insights from streaming data

Pre-Requisites

To get the most out of this course, participants should have:

  •   Completed Google Cloud Fundamentals: Big Data & Machine Learning, or equivalent experience
  •   Basic proficiency with a common query language such as SQL
  •   Experience with data modeling and extract, transform, load (ETL) activities
  •   Experience developing applications using a common programming language such as Python
  •   Familiarity with machine learning and/or statistics

Course Content

Day 1: Serverless Data Analysis

  •   Module 1: Serverless data analysis with BigQuery
  •   Module 2: Serverless, autoscaling data pipelines with Dataflow

Day 2: Leveraging unstructured data

  •   Module 3: Google Cloud Dataproc Overview
  •   Module 4: Running Dataproc Jobs
  •   Module 5: Integrating Dataproc with Google Cloud Platform
  •   Module 6: Making Sense of Unstructured Data with Google’s Machine Learning APIs

Day 3: Serverless Machine Learning

  •   Module 7: Getting started with Machine Learning
  •   Module 8: Building ML models with TensorFlow
  •   Module 9: Scaling ML models with Cloud ML
  •   Module 10: Feature Engineering
  •   Module 11: ML architectures

Day 4: Resilient streaming systems

  •   Module 12: Need for real-time streaming analytics
  •   Module 13: Architecture of streaming pipelines
  •   Module 14: Stream data and events into Pub/Sub
  •   Module 15: Build a stream processing pipeline
  •   Module 16: High throughput and low latency with Bigtable
  •   Module 17: Building Dashboards

Online Courses

You may prefer an online course if you are looking for a flexible and cost-effective solution. Online courses allow you to study at your own pace, at a time that suits you.

We have the following eLearning options available:

Virtual Classroom

Virtual classrooms provide all the benefits of attending a classroom course without the need to arrange travel and accommodation. Please note that virtual courses are attended in real time, starting on a specified date.
