Google Cloud Big Data and Machine Learning Fundamentals

Overview

This one-day instructor-led course introduces participants to the big data capabilities of Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, participants get an overview of Google Cloud Platform and a detailed view of its data processing and machine learning capabilities. The course showcases the ease, flexibility, and power of big data solutions on Google Cloud Platform.

Target Audience

  • Data analysts, data scientists, and business analysts getting started with Google Cloud Platform.
  • Individuals responsible for designing pipelines and architectures for data processing, creating and maintaining machine learning and statistical models, querying datasets, visualizing query results and creating reports.
  • Executives and IT decision makers evaluating Google Cloud Platform for use by data scientists.

Prerequisites

To get the most out of this course, participants should have:

  • Basic proficiency with a common query language such as SQL.
  • Experience with data modeling and extract, transform, load (ETL) activities.
  • Experience developing applications using a common programming language such as Python.
  • Familiarity with machine learning and/or statistics.

Learning Outcomes

  • Identify the purpose and value of the key Big Data and Machine Learning products in the Google Cloud Platform.
  • Use Cloud SQL and Cloud Dataproc to migrate existing MySQL and Hadoop/Pig/Spark/Hive workloads to Google Cloud Platform.
  • Employ BigQuery and Cloud Datalab to carry out interactive data analysis.
  • Train and use a neural network using TensorFlow.
  • Employ ML APIs.
  • Choose between different data processing products on the Google Cloud Platform.

Course Outline

Session 1

0 Welcome

  • Facility logistics

1 Introduction to Google Cloud Platform

  • Google Cloud Platform infrastructure and big data products
  • Demo: BigQuery GitHub query
  • The different data roles in an organization
  • What you can do with GCP
  • Activity: Explore a customer use case

      Lab 1: Exploring a Public Dataset with BigQuery

2 Product Recommendations using Cloud SQL and Spark

  • Compare Google Cloud Big Data products and services
  • Managed Hadoop in the cloud
  • Demo: Creating a Cluster
  • Your SQL database in the cloud

      Lab 2: Product Recommendations using Cloud SQL and Spark

3 Predicting Visitor Purchases using BigQuery Machine Learning

  • Introduction to BigQuery
  • Fast SQL Query Engine
  • Managed Storage for Datasets
  • Demo: Google Sheets to BigQuery
  • Insights from Geographic data
  • Demo: BigQuery ML
  • Creating ML models with SQL using BigQuery ML

      Lab 3: Predicting Visitor Purchases using BigQuery ML

Session 2

4 Real-time Dashboards with Pub/Sub, Dataflow, and Data Studio

  • Introduction
  • Message-oriented architectures
  • Serverless data pipelines
  • Data visualization with Data Studio

      Lab 4: Real-time Dashboards with Pub/Sub, Dataflow, and Data Studio

5 Deriving Insights from Unstructured Data using Machine Learning

  • Introduction to Machine Learning
  • Pre-built ML models
  • Demo: Cloud Vision API
  • Codeless ML with AutoML

      Lab 5: Classify Images using AutoML

6 Summary

  • Recap of lesson