Trainer: Karri Pulkkinen
Big Data for Data Warehouse and Business Intelligence Professionals
Fix common data warehouse headaches with big data concepts and technologies. Find out about the limitations of big data.
What’s the trouble with the data warehouse?
- Data warehouses are bursting at the seams.
- Data volumes and costs are exploding.
- Relational data warehouses are a poor fit for graphs, unstructured data, keyword search, machine learning, and predictive analytics.
- It takes forever to get data into the data warehouse and insights out of it.
- Only 20% of enterprise data finds its way to the data warehouse.
- Writing ETL code is slow and hard to maintain.
- Traditional data warehouse architecture sucks for real-time analytics.
Does this sound familiar?
Yes? Then this training course is for you.
Who should attend?
- Data Warehouse and Business Intelligence Professionals
- Data Warehouse Managers
- Enterprise/Solution Architects
- Data Warehouse Architects
- ETL Developers
- Data Warehouse Project Managers
- Data/Business Analysts
What will you learn?
- Learn how to reduce data warehouse costs.
- Find out if Hadoop is a good fit for your data warehouse.
- Learn about the pros and cons of three types of distributed technologies for processing large data volumes.
- Understand the role and benefits of the cloud in data warehousing.
- Find out if data lakes are just hype or a useful design pattern.
- Understand whether it makes sense to implement real-time analytics.
- Learn about metadata-driven ETL with code templates to automate data warehouse loads.
- Learn whether and how advanced analytics can help you make better decisions.
- Learn how to make more data available to more people in the enterprise without creating data anarchy.
- Learn more about current trends in data warehousing.
- Understand the benefits and limitations of big data technologies.
- Future-proof your data warehouse.
Big Data Concepts
What is Big Data?
Big Data and the Enterprise Data Warehouse
Data Warehousing on Distributed Relational Databases (MPP)
Massively Parallel Processing & Shared Nothing Architecture
Data Storage & Indexes
Concurrency, Latency & Throughput
Limitations 1: Concurrency
Limitations 2: Scalability
Limitations 3: Resilience
Limitations 4: Unstructured Data
Limitations 5: Tight Coupling
Limitations 6: License Costs
Comparison Matrix of MPP Vendors
MPP on Hadoop
Data Warehousing on Hadoop and Spark
Data Distribution & HDFS
Data Storage & Indexes
Concurrency, Latency & Throughput
Trends and Innovations
Data Warehousing in the Cloud
Pay-as-you-go Model
Comparison of Snowflake, Athena/Presto, and Redshift
Data Warehouse Optimization
Limitations of Relational Databases for Data Warehousing
What is Data Warehouse Offload?
Data Warehouse Offload Opportunities
Next Generation Data Warehouse Architecture
Data Warehouse and Cloud
The Future of Dimensional Modeling and ETL
Dimensional Modeling in the Age of Big Data
The Future of ETL
Real-time. Hype or hope?
Stateless & Stateful Computations
End to End Consistency
Event Time & Processing Time
Windowing. Types of Windows.
Batch vs Real-time
Comparison of Streaming Engines
Data Lakes, Self-Service & Advanced Analytics
The Concept of the Data Lake
Is the Data Lake useful?
The Concept of Data Preparation and Self-Service Analytics
From Big to Smart Data
Types of Advanced Analytics
Data Preparation Tools