Module 1: Azure Fundamentals for Big Data
- Overview of Azure services for big data.
- Azure Blob Storage and data lake concepts.
- Implementing Azure Data Factory pipelines.
- Introduction to Azure Databricks.
- Azure Synapse Analytics for large-scale data.
- Azure Stream Analytics for real-time processing.
- Data security with Azure Active Directory.
- Compliance and governance in the Azure cloud.
Module 2: Data Processing with Azure HDInsight
- Hadoop ecosystem components in Azure HDInsight.
- Running Hive and Pig for data transformation.
- Implementing batch processing workflows.
- Integration with Azure storage and data services.
- Monitoring and optimisation of HDInsight clusters.
- Customisation with script actions.
Module 3: Real-Time Analytics with Azure Stream Analytics
- Ingesting streaming data using Azure Event Hubs.
- Defining real-time analytics jobs in Stream Analytics.
- Visualising real-time data with Power BI.
Module 4: Developing Big Data Solutions with Azure Databricks
- Configuring Azure Databricks workspaces.
- Collaborative data exploration and notebook workflows.
- Data engineering with Spark SQL and DataFrames.
- Developing scalable machine learning models with MLlib.
- Integration with Azure services for complete solutions.
- Delta Lake for ACID transactions and versioned data storage.
- Data visualisation and reporting via Databricks dashboards.
- Optimising Databricks jobs for performance and cost.
- Managing jobs and clusters with Databricks utilities.
Module 5: Data Warehousing with Azure Synapse Analytics
- Building and managing data warehouses.
- Data loading strategies and ETL in Synapse Analytics.
- Employing T-SQL for complex data queries.
- Synapse Analytics security features.
- Performance tuning and maintenance.
Module 6: Advanced Analytics with Azure Machine Learning
- Azure Machine Learning Workbench environment.
- Building and deploying predictive models.
- Connecting ML services with Azure data pipelines.
- Model management and performance tracking.
Module 7: Internet of Things (IoT) on Azure
- Introduction to Azure IoT Hub and related services.
- Device provisioning and management.
- Building IoT solutions with Azure IoT Central.
- Stream processing with Azure Time Series Insights.
- Edge computing with Azure IoT Edge.
- Integrating IoT data with analytics processes.
- Security considerations for IoT in Azure.
- Case studies on IoT data analytics.
Module 8: AI and Cognitive Services Integration
- Utilising Azure AI services for analytics.
- Building bots with Azure Bot Service.
- Text analytics and language understanding with Cognitive Services.
- Image and video processing with Computer Vision APIs.
- Custom AI solutions with Azure Machine Learning Designer.
- Best practices for deploying AI models.
- Incorporating Azure Search for data discoverability.
- Managed AI services vs custom model deployment considerations.
- Ethical considerations in deploying AI solutions.
Module 9: Data Visualisation and Business Intelligence
- Power BI integration with Azure services.
- Designing interactive reports and dashboards.
- Advanced data visualisation techniques.
Module 10: Capstone Project
- Integrating various Azure components to solve a real-world big data problem.
- Synthesising insights and presenting analytic findings.
- Peer review and evaluation of capstone projects.
Module 11: Streamlining Data Lake Analytics
- Architecture and setup of Azure Data Lake.
- Data exploration with U-SQL scripting.
- Managing metadata with Azure Data Catalog.
- Performance tuning and optimisation strategies.
- Access control and security best practices for data lakes.
- Integration with Azure Synapse for complex analytics.
- Establishing a hierarchical namespace with Azure Data Lake Storage Gen2.
- Implementing end-to-end data lake analytics solutions.
Module 12: Advanced Data Engineering on Azure Synapse
- Deep dive into data integration with pipelines and data flows.
- Real-world scenarios: batch scoring, data warehousing, and ETL.
- Advanced data modelling techniques and best practices.
- Querying and analysing data using Azure Synapse SQL pools.
- Optimising data storage with partitioning and indexing strategies.
- Continuous integration and delivery (CI/CD) in Synapse workflows.
Module 13: Big Data Analytics with Azure Analysis Services
- Developing semantic data models for enterprise analytics.
- Deploying and managing Analysis Services instances.
- Consuming data with Power BI, Excel, and other applications.
- Automation and scaling with Azure Analysis Services.
- Best practices for MDX and DAX expressions.
- Security features and row-level security implementation.
Module 14: Security and Compliance in Azure Big Data
- Understanding Azure’s security infrastructure for big data.
- Data encryption at rest and in transit.
- Implementing Azure Key Vault for managing encryption keys.
- Security auditing and threat detection with Azure Security Center.
- Navigating compliance frameworks and data protection laws.
- Implementing data governance strategies with Azure Purview.
Module 15: Multi-Cloud and Hybrid Big Data Architectures
- Designing big data architectures for hybrid cloud environments.
- Connecting on-premises systems with Azure Stack.
- Cross-service orchestration with Azure Arc.
- Managing multi-cloud resources with Azure management tools.
- Best practices for data synchronisation and consistency.
Module 16: Advanced Analytics with Synapse Spark Pools
- In-depth understanding of Spark pools in Azure Synapse Analytics.
- Best practices for job and resource management.
- Advanced analytics and machine learning with Spark pools.
- Integration with DevOps for Spark applications.
- Troubleshooting common issues and optimisation tips.
Module 17: Big Data Governance and Lifecycle Management
- Implementing data governance frameworks in Azure.
- Lifecycle management strategies from ingestion to disposal.
- Metadata management and data cataloguing with Azure Purview.
- Implementing data quality services and master data management (MDM).
- Auditing and monitoring for data compliance and usage.