Google Cloud – Professional Data Engineer Certification Exam Notes

Google Cloud Data Engineer Study Guide

Section 1: Designing Data Processing Systems (~22%)

1.1 Designing for Security and Compliance

  • Identity and Access Management (e.g., Cloud IAM and organization policies)
  • Data security (encryption and key management)
  • Privacy (e.g., personally identifiable information and the Cloud Data Loss Prevention API)
  • Regional considerations (data sovereignty) for data access and storage
  • Legal and regulatory compliance

1.2 Designing for Reliability and Fidelity

  • Preparing and cleaning data (e.g., Dataprep, Dataflow, and Cloud Data Fusion)
  • Monitoring and orchestration of data pipelines
  • Disaster recovery and fault tolerance
  • Making decisions related to ACID compliance and availability
  • Data validation
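
As a rough illustration of the data validation topic above, the following sketch splits incoming records into valid and rejected sets, similar in spirit to a dead-letter pattern in a pipeline. The function name, the rule format, and the field names are hypothetical and chosen only for this example; a real pipeline would likely express the same checks inside Dataflow or a similar service.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    valid: list = field(default_factory=list)
    # Each rejected entry is a (record, reasons) pair.
    rejected: list = field(default_factory=list)


def validate_records(records, required_fields, range_checks):
    """Apply simple presence and numeric range checks to dict records.

    required_fields: field names that must be present and non-empty.
    range_checks: mapping of field name -> (low, high), inclusive.
    Both parameters are illustrative assumptions, not part of any API.
    """
    result = ValidationResult()
    for record in records:
        reasons = []
        for name in required_fields:
            if record.get(name) in (None, ""):
                reasons.append(f"missing {name}")
        for name, (low, high) in range_checks.items():
            value = record.get(name)
            # Only numeric values are range-checked in this sketch.
            if isinstance(value, (int, float)) and not (low <= value <= high):
                reasons.append(f"{name} out of range")
        if reasons:
            result.rejected.append((record, reasons))
        else:
            result.valid.append(record)
    return result
```

Keeping the rejected records together with their reasons, rather than dropping them, makes it possible to inspect and replay bad data later.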

1.3 Designing for Flexibility and Portability

  • Mapping current and future business requirements to the architecture
  • Designing for data and application portability (e.g., multi-cloud and data residency requirements)
  • Data staging, cataloging, and discovery (data governance)

1.4 Designing Data Migrations

  • Analyzing current stakeholder needs and planning for desired state
  • Planning migration to Google Cloud (e.g., BigQuery, Datastream)
  • Designing migration validation strategy
  • Designing dataset and table architecture for proper governance
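
One possible shape for a migration validation strategy, sketched under simple assumptions: compare the row count and an order-independent checksum of the source and target tables. The helper names and the in-memory list-of-dicts representation are hypothetical; in practice the rows would come from queries against the source system and the migrated tables.

```python
import hashlib

_MODULUS = 1 << 256  # keeps the combined checksum at a fixed size


def table_fingerprint(rows, columns):
    """Return (row_count, checksum) for an iterable of dict rows.

    The checksum sums per-row SHA-256 digests modulo 2**256, so it does
    not depend on row order and still reflects duplicate rows.
    """
    count = 0
    combined = 0
    for row in rows:
        # Canonical text form of the row, restricted to the chosen columns.
        canonical = "\x1f".join(repr(row.get(c)) for c in columns)
        digest = hashlib.sha256(canonical.encode("utf-8")).digest()
        combined = (combined + int.from_bytes(digest, "big")) % _MODULUS
        count += 1
    return count, combined


def compare_tables(source_rows, target_rows, columns):
    """Compare two row sets and report which checks passed."""
    src_count, src_sum = table_fingerprint(source_rows, columns)
    tgt_count, tgt_sum = table_fingerprint(target_rows, columns)
    return {
        "counts_match": src_count == tgt_count,
        "checksums_match": src_sum == tgt_sum,
    }
```

A count check alone catches dropped or duplicated rows, while the checksum also catches changed values. Note that type conversions during migration (for example, integers arriving as strings) would change the canonical form, so a real comparison would need to normalize values first.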