New Big Database Technologies; A Market Overview of Technologies and Products

Overview

Introduction

The rise of big data and cloud platforms has brought a tsunami of new technologies and products for data storage, processing, and analytics. Hadoop, Spark, NoSQL, NewSQL, triplestores, and SQL-on-Hadoop are just a few of the countless technologies that have become available for developing big data systems. Many powerful new database engines have also entered the market, including Amazon Athena, Exasol, Google BigQuery, Microsoft Synapse, MongoDB, Neo4j, SingleStore, SnowflakeDB, Splice Machine, and Starburst.

Most organizations have many questions. How mature are all these new technologies? Are they worthy replacements for the more traditional SQL products? How should they be incorporated into existing data warehouse architectures? Should they be used to develop data lakes? Are they the perfect platforms for data science, or for operational BI?

This seminar gives a clear, extensive, and critical overview of all the new key technologies for storing, processing, and analyzing big data. Technologies are explained, market overviews are presented, strengths and weaknesses are discussed, and guidelines and best practices are given. It’s the perfect update for those interested in the new market of big data technology.

Subjects

1. Big Data: State of the Art

What exactly do we mean by big data?
The key application area of big data: business analytics
Differences between semi-structured, poly-structured, multi-structured, and unstructured data

2. Analytical SQL Database Servers

Classification of analytical SQL database servers, and can they compete with NoSQL products?
The advantages and disadvantages of column-based database servers
How important is in-database analytics?
Is loading databases into internal memory the solution? Is it feasible?
Market overview, including Amazon Athena, Exasol, Google BigQuery, HP/Vertica, Microsoft Synapse, SnowflakeDB, Splice Machine, and Starburst.

3. The World of Hadoop and Spark

The Hadoop stack explained: HDFS, MapReduce, Spark, Hive, HBase, YARN, ZooKeeper, Pig, HCatalog, and so on
Characteristics and consequences of HDFS and file formats
Alternative implementations by MapR, Amazon, and ScaleOut (Hadoop in-memory)
Kafka for fast messaging

4. NoSQL Database Stores

Classification of NoSQL products: key-value stores, document stores, column-family stores, and graph data stores
It’s all about data scalability and performance
Why is schema-on-read more flexible than schema-on-write?
Are NoSQL products really database servers?
Market overview, including Apache HBase, Apache CouchDB, Cassandra, Cloudera, DataStax, InfiniteGraph, MongoDB, and Neo4j

5. Exploring Data in Hadoop Using SQL

Making Hadoop data available for reporting and analysis through SQL-on-Hadoop engines
Examples of SQL-on-Hadoop engines, including Apache Drill, Apache Hive, Apache Phoenix, Cloudera Impala, HP Vertica, Pivotal HAWQ, SingleStore, Spark SQL, and Splice Machine
Data virtualization for unleashing the information hidden in NoSQL and SQL systems

6. NewSQL Database Servers for Transaction Workloads

NewSQL database servers are designed for high-performance transactional systems
Simpler transaction mechanisms
The challenge of multi-table joins
Market overview, including CitusDB, Clustrix, MariaDB, NuoDB, and VoltDB

7. Concluding Remarks

What You Will Learn:

Why traditional database technology is not “big” enough
How Hadoop and NoSQL differ from traditional technology
How new and existing technologies such as Hadoop, NoSQL, and NewSQL can help develop BI and big data systems
How to embed Hadoop technologies in existing BI systems
How Spark can boost performance for analytics
How to distinguish between three NoSQL subcategories: key-value, document, and column-family stores
Why graph databases are very different from all other systems
When to use NewSQL or NoSQL for developing transactional systems
How to simplify data access through SQL-on-Hadoop engines
When to use which new data storage technology and the pros and cons of each solution
Which products and technologies are winners and which are losers

Geared to: IT architects; database specialists; big data specialists; BI specialists; data warehouse designers; technology planners; technical architects; enterprise architects; IT consultants; IT strategists; systems analysts; database developers; database administrators; solutions architects; data architects.

Instructor:

RICK VAN DER LANS

Rick van der Lans is a highly respected independent analyst, consultant, author, and internationally acclaimed lecturer specializing in data architectures, data warehousing, business intelligence, big data, and database technology. In 2018 he was selected as the sixth most influential BI analyst worldwide by onalytica.com.

He has presented countless seminars, webinars, and keynotes at industry-leading conferences. He also helps clients worldwide to design their data warehouse, big data, and business intelligence architectures and solutions and assists them with selecting the right products.
