MCA Microsoft Certified Associate Azure Data Engineer Study Guide : Exam DP-203
Author(s): Perkins, Benjamin
ISBN No.: 9781119885429
Pages: 1,008
Year: 202309
Format: Trade Paper
Price: $ 82.80
Dispatch delay: Dispatched within 7 to 15 days
Status: Available

Introduction
Part I Azure Data Engineer Certification and Azure Products
  Chapter 1 Gaining the Azure Data Engineer Associate Certification
    The Journey to Certification
    How to Pass Exam DP-203
    Understanding the Exam Expectations and Requirements
    Use Azure Daily
    Read Azure Articles to Stay Current
    Have an Understanding of All Azure Products
    Azure Product Name Recognition
    Azure Data Analytics
    Azure Synapse Analytics
    Azure Databricks
    Azure HDInsight
    Azure Analysis Services
    Azure Data Factory
    Azure Event Hubs
    Azure Stream Analytics
    Other Products
    Azure Storage Products
    Azure Data Lake Storage
    Azure Storage
    Other Products
    Azure Databases
    Azure Cosmos DB
    Azure SQL Server Products
    Additional Azure Databases
    Other Products
    Azure Security
    Azure Active Directory
    Role-Based Access Control
    Attribute-Based Access Control
    Azure Key Vault
    Other Products
    Azure Networking
    Virtual Networks
    Other Products
    Azure Compute
    Azure Virtual Machines
    Azure Virtual Machine Scale Sets
    Azure App Service Web Apps
    Azure Functions
    Azure Batch
    Azure Management and Governance
    Azure Monitor
    Azure Purview
    Azure Policy
    Azure Blueprints (Preview)
    Azure Lighthouse
    Azure Cost Management and Billing
    Other Products
    Summary
    Exam Essentials
    Review Questions
  Chapter 2 CREATE DATABASE dbName; GO
    The Brainjammer
    A Historical Look at Data
    Variety
    Velocity
    Volume
    Data Locations
    Data File Formats
    Data Structures, Types, and Concepts
    Data Structures
    Data Types and Management
    Data Concepts
    Data Programming and Querying for Data Engineers
    Data Programming
    Querying Data
    Understanding Big Data Processing
    Big Data Stages
    ETL, ELT, ELTL
    Analytics Types
    Big Data Layers
    Summary
    Exam Essentials
    Review Questions
Part II Design and Implement Data Storage
  Chapter 3 Data Sources and Ingestion
    Where Does Data Come From?
    Design a Data Storage Structure
    Design an Azure Data Lake Solution
    Recommended File Types for Storage
    Recommended File Types for Analytical Queries
    Design for Efficient Querying
    Design for Data Pruning
    Design a Folder Structure That Represents the Levels of Data Transformation
    Design a Distribution Strategy
    Design a Data Archiving Solution
    Design a Partition Strategy
    Design a Partition Strategy for Files
    Design a Partition Strategy for Analytical Workloads
    Design a Partition Strategy for Efficiency and Performance
    Design a Partition Strategy for Azure Synapse Analytics
    Identify When Partitioning Is Needed in Azure Data Lake Storage Gen2
    Design the Serving/Data Exploration Layer
    Design Star Schemas
    Design Slowly Changing Dimensions
    Design a Dimensional Hierarchy
    Design a Solution for Temporal Data
    Design for Incremental Loading
    Design Analytical Stores
    Design Metastores in Azure Synapse Analytics and Azure Databricks
    The Ingestion of Data into a Pipeline
    Azure Synapse Analytics
    Azure Data Factory
    Azure Databricks
    Event Hubs and IoT Hub
    Azure Stream Analytics
    Apache Kafka for HDInsight
    Migrating and Moving Data
    Summary
    Exam Essentials
    Review Questions
  Chapter 4 The Storage of Data
    Implement Physical Data Storage Structures
    Implement Compression
    Implement Partitioning
    Implement Sharding
    Implement Different Table Geometries with Azure Synapse Analytics Pools
    Implement Data Redundancy
    Implement Distributions
    Implement Data Archiving
    Azure Synapse Analytics Develop Hub
    Implement Logical Data Structures
    Build a Temporal Data Solution
    Build a Slowly Changing Dimension
    Build a Logical Folder Structure
    Build External Tables
    Implement File and Folder Structures for Efficient Querying and Data Pruning
    Implement a Partition Strategy
    Implement a Partition Strategy for Files
    Implement a Partition Strategy for Analytical Workloads
    Implement a Partition Strategy for Streaming Workloads
    Implement a Partition Strategy for Azure Synapse Analytics
    Design and Implement the Data Exploration Layer
    Deliver Data in a Relational Star Schema
    Deliver Data in Parquet Files
    Maintain Metadata
    Implement a Dimensional Hierarchy
    Create and Execute Queries by Using a Compute Solution That Leverages SQL Serverless and Spark Cluster
    Recommend Azure Synapse Analytics Database Templates
    Implement Azure Synapse Analytics Database Templates
    Additional Data Storage Topics
    Storing Raw Data in Azure Databricks for Transformation
    Storing Data Using Azure HDInsight
    Storing Prepared, Trained, and Modeled Data
    Summary
    Exam Essentials
    Review Questions
Part III Develop Data Processing
  Chapter 5 Transform, Manage, and Prepare Data
    Ingest and Transform Data
    Transform Data Using Azure Synapse Pipelines
    Transform Data Using Azure Data Factory
    Transform Data Using Apache Spark
    Transform Data Using Transact-SQL
    Transform Data Using Stream Analytics
    Cleanse Data
    Split Data
    Shred JSON
    Encode and Decode Data
    Configure Error Handling for the Transformation
    Normalize and Denormalize Values
    Transform Data by Using Scala
    Perform Exploratory Data Analysis
    Transformation and Data Management Concepts
    Transformation
    Data Management
    Azure Databricks
    Data Modeling and Usage
    Data Modeling with Machine Learning
    Usage
    Summary
    Exam Essentials
    Review Questions
  Chapter 6 Create and Manage Batch Processing and Pipelines
    Design and Develop a Batch Processing Solution
    Design a Batch Processing Solution
    Develop Batch Processing Solutions
    Create Data Pipelines
    Handle Duplicate Data
    Handle Missing Data
    Handle Late-Arriving Data
    Upsert Data
    Configure the Batch Size
    Configure Batch Retention
    Design and Develop Slowly Changing Dimensions
    Design and Implement Incremental Data Loads
    Integrate Jupyter/IPython Notebooks into a Data Pipeline
    Revert Data to a Previous State
    Handle Security and Compliance Requirements
    Design and Create Tests for Data Pipelines
    Scale Resources
    Design and Configure Exception Handling
    Debug Spark Jobs Using the Spark UI
    Implement Azure Synapse Link and Query the Replicated Data
    Use PolyBase to Load Data to a SQL Pool
    Read from and Write to a Delta Table
    Manage Batches and Pipelines
    Trigger Batches
    Schedule Data Pipelines
    Validate Batch Loads
    Implement Version Control for Pipeline Artifacts
    Manage Data Pipelines
    Manage Spark Jobs in a Pipeline
    Handle Failed Batch Loads
    Summary
    Exam Essentials
    Review Questions
  Chapter 7 Design and Implement a Data Stream Processing Solution
    Develop a Stream Processing Solution
    Design a Stream Processing Solution
    Create a Stream Processing Solution
    Process Time Series Data
    Design and Create Windowed Aggregates
    Process Data Within One Partition
    Process Data Across Partitions
    Upsert Data
    Handle Schema Drift
    Configure Checkpoints/Watermarking During Processing
    Replay Archived Stream Data
    Design and Create Tests for Data Pipelines
    Monitor for Performance and Functional Regressions
    Optimize Pipelines for Analytical or Transactional Purposes
    Scale Resources
    Design and Configure Exception Handling
    Handle Interruptions
    Ingest and Transform Data
    Transform Data Using Azure Stream Analytics
    Monitor Data Storage and Data Processing
    Monitor Stream Processing
    Summary
    Exam Essentials
    Review Questions
Part IV Secure, Monitor, and Optimize Data Storage and Data Processing
  Chapter 8 Keeping Data Safe and Secure
    Design Security for Data Policies and Standards
    Design a Data Auditing Strategy
    Design a Data Retention Policy
    Design for Data Privacy
    Design to Purge Data Based on Business Requirements
    Design Data Encryption for Data at Rest and in Transit
    De.

