Machine Learning & Advanced Analytics

Introduction

Overview
Today, with most people connected to the Internet, the power of the customer is almost limitless. The Internet has given them freedom to choose in a way that businesses could never have imagined. They can browse your competitors’ web sites with ease. They can compare prices, view sentiment about your business, and switch loyalty in a single click, any time and anywhere, all from a mobile device. In addition, the emergence of social media sites means that customers also have a voice. They can express opinion and sentiment about products and brands on Twitter, Facebook, and review web sites, and create social networks by attracting followers and following others.

For many CEOs, customer retention, loyalty, service and growth are at the top of their agenda. Improving operational effectiveness is also high on their priority list. To achieve this, they need to acquire more data and use AI to turn it into insight. CMOs also want access to new data to enrich what they already know about customers. New data is needed to provide insight into customers' on-line behaviour for better segmentation, and to understand the value of a customer’s social network and not just the customer. In addition, COOs want more data to become more effective in operations. Instrumentation is therefore being added so that operations can capture new data. With so much demand, we are now in an era where data has never been so important to business in helping to create competitive advantage.

This new 2-day seminar looks at the need to capture new data sources to add to what we already know, and to use machine learning to automatically discover, profile and catalog what is in these data sources. It then looks at how machine learning and advanced analytical techniques, such as text analysis, sentiment analysis, graph analytics and streaming analytics, can be used at scale on big data to provide new insight that helps foster growth, reduce costs and improve effectiveness for competitive advantage.

AUDIENCE

Business analysts, data scientists, BI managers, data warehousing professionals, enterprise architects, data architects, CIOs, IT managers

LEARNING OBJECTIVES

Attendees of this seminar will learn:
• How data and analytical characteristics can dictate the approach taken and tools needed to conduct exploratory analytics
• How to develop analytical models using supervised and unsupervised machine learning
• How to develop machine learning models at scale on Apache Spark and Hadoop
• Tools for building machine learning models
• Tools for deploying, monitoring and re-training machine learning models
• Tools and techniques for discovery, analysis and visualisation of multi-structured data
• Text and sentiment analysis
• Scaling text analysis to run on Hadoop and Spark
• Clickstream analysis
• Graph analysis – 4 graph analytical techniques to identify shortest path, analyse connectivity, identify communities, determine influencers and important people in social networks, etc.
• Scale graph analysis on Apache Spark
• Analyse fast data in real-time using streaming analytics
• Deep learning with multi-layer neural networks
• Leverage machine learning and advanced analytics quickly and easily from self-service BI reports and dashboards for access over the web and on mobile devices


MODULE 1: AN INTRODUCTION TO DATA EXPLORATION, DISCOVERY AND VISUALISATION

This session introduces data discovery and visualisation and looks at why businesses now need this new capability.

  • New data sources – Structured versus multi-structured data
  • What are the different analytical workloads?
  • Types of Data Science tools
  • Why do businesses need this new capability? – Machine learning example use cases
  • Skills required for Data Discovery and Visualisation
  • Creating a business-aligned analytics strategy

MODULE 2: GETTING STARTED WITH PREDICTIVE ANALYTICS AND MACHINE LEARNING
As we move into the era of smart business, looking back in time is not enough to make good decisions. Companies also have to model the future to forecast and predict so that they can anticipate problems and act in a timely manner to compete. Predictive analytics is therefore a key part of any BI initiative and should be integrated into analysis, reporting and dashboards. This session introduces predictive analytics and shows how it can be used in analysis and in business optimisation.

  • What is machine learning?
  • Technologies and methodologies for developing predictive analytical models
  • Using supervised machine learning to develop predictive models for automatic classification
  • Popular predictive algorithms, e.g. linear regression, naïve Bayes, decision trees, random forest, neural networks, support vector machines
  • Implementing in-Hadoop, in-memory analytics using Apache Spark
  • Data Science Notebooks using Jupyter, RStudio, Apache Zeppelin and Databricks Cloud
  • Accessing data in HDFS, cloud storage or data warehouses using SQL to build models
  • Accessing Spark machine learning algorithms from data mining tools
  • Deploying predictive models as a service, in a container, in-analytical databases and in Hadoop
  • Industrialising enterprise model deployment using MLOps platforms such as Algorithmia
  • Integrating predictive analytics with event stream processing for automated analysis of fast data in every-day business operations
  • Clustering data using unsupervised learning algorithms
  • Speeding up model development using machine learning automation tools
  • Data Science tools
    o E.g. Cloudera CDP ML Service, IBM Watson Studio, AWS SageMaker
  • Deep Learning
    o Google TensorFlow, deepsense.io, PyTorch
  • Moving beyond predictive with reinforcement learning and Ray
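As a rough illustration of the supervised learning topics listed in this module, the sketch below fits a simple linear regression by ordinary least squares in plain Python. The function names and the sample figures are hypothetical and chosen only for illustration; they are not taken from the seminar material, which covers tools such as Apache Spark rather than hand-written code.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: monthly visits (x) and purchases (y).
visits = [1, 2, 3, 4]
purchases = [3, 5, 7, 9]
model = fit_line(visits, purchases)
forecast = predict(model, 5)
```

The model is trained on labelled history (known x and y pairs) and then used to predict an unseen case, which is the basic pattern behind the predictive algorithms named above.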

MODULE 3: ADVANCED ANALYTICS FOR MULTI-STRUCTURED DATA
This session looks at emerging analytical technologies for multi-structured data and explores how you can use them to improve business insight. Not all analytical projects are implemented using relational database technology, especially when it comes to very large data volumes with unstructured content, semi-structured JSON or XML data, sensor data, and clickstream. This session looks at the emergence of advanced analytics using Big Data NoSQL Platforms like Spark and Cloud storage or Hadoop. It looks at the approaches to analysing complex unstructured and social content and the challenges of creating valuable business insight from multiple sources of unstructured content.

  • Techniques for producing insight from unstructured content
  • Tools and techniques for analysing text
  • Understanding the ‘voice of the customer’ using sentiment analytics on email and social media data
  • Clickstream analysis
  • Graph analysis
    o Graph databases
    o Path analytics
    o Connectivity analysis
    o Community analysis
    o Centrality analysis
    o Finding Influencers in social networks
    o Calculating follower susceptibility to be influenced
  • Streaming analytics
    o What is data-in-motion?
    o Use cases for streaming data
    o Time series analysis and streaming data
    o Tools for managing streaming ingest, e.g. Apache Kafka, StreamSets, Hortonworks DataFlow
    o Open source streaming engines – Apache Spark, Apache Flink, Google Dataflow
    o Commercial streaming analytics products
    o Developing streaming analytics applications with no programming
    o Modernising your architecture to accommodate streaming data
    o Future-proofing your architecture
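To make the graph analysis topics in this module (path analytics and centrality analysis) more concrete, here is a minimal sketch in plain Python. It assumes a small, undirected social network stored as adjacency lists; the network, the people's names and the function names are all hypothetical. At scale, this kind of analysis would normally run on a graph database or a platform such as Apache Spark rather than in hand-written code.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal using breadth-first search."""
    if start == goal:
        return [start]
    previous = {start: None}  # records how each visited node was reached
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in previous:
                continue
            previous[neighbour] = node
            if neighbour == goal:
                # Walk back from the goal to the start to rebuild the path.
                path = []
                step = goal
                while step is not None:
                    path.append(step)
                    step = previous[step]
                return path[::-1]
            queue.append(neighbour)
    return None  # no route between the two nodes


def degree_centrality(graph):
    """Share of the other nodes that each node is directly connected to."""
    others = len(graph) - 1
    if others <= 0:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / others for node, neighbours in graph.items()}


# Hypothetical undirected network of five people.
network = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "cat", "dan"],
    "cat": ["ann", "bob"],
    "dan": ["bob", "eve"],
    "eve": ["dan"],
}

route = shortest_path(network, "ann", "eve")
scores = degree_centrality(network)
most_connected = max(scores, key=scores.get)
```

Degree centrality is only one simple measure of importance; identifying influencers in practice usually combines it with other centrality measures.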

MODULE 4: SEARCH, BI & BIG DATA

This session will examine the growing role of search in an analytical environment both as an information consumer tool for self-service BI and as a way of analysing both structured and unstructured data. Search has been incorporated into BI tools for some time, but with the emergence of Big Data as a platform for analysing unstructured information, it is taking on a major new role. Search is a simple mechanism that is familiar to most people and opening up the interactive use of BI via search can have enormous business benefits. Search can be used to grow the use of BI to a much wider group of users and also provide a way to extract additional insight from unstructured content. Topics that will be covered include:

  • Why Search and BI?
  • The growing importance of analysing unstructured content
  • The implications of Big Data on search and BI
  • Creating search indexes on multi-structured data
  • Building dashboards and reports on top of search engine indexed content
  • Using search to analyse multi-structured data
  • The integration of search with traditional BI platforms
  • Using Search to find BI content and metrics
  • Guided analysis using multi-faceted search
  • The search-based analytical tools marketplace: Apache Solr (Lucene), Cloudera Search, Amazon Kendra, ServiceNow Attivio, IBI WebFOCUS Magnify, IBM Watson Explorer, Microsoft, Splunk, ThoughtSpot


MODULE 5: DEPLOYING AND USING SELF-SERVICE BI TOOLS

Self-service BI tools are frequently sold into business departments so that local business analysts can build their own BI reports, dashboards and applications and do ad-hoc analysis without having to wait for IT. This session looks at how to maximise the business benefit of machine learning by integrating self-service BI tools with predictive and advanced analytics deployed in containers, in-database, in-Hadoop, in-Spark and in-streaming analytics platforms to leverage on-demand and event-driven analytics at scale. It also looks at OLAP on Hadoop to enable scalable multi-dimensional analysis.

  • The self-service BI tools marketplace – Microsoft Power BI, Qlik Sense, Tableau, MicroStrategy, Oracle Analytics Cloud, ThoughtSpot, Information Builders WebFOCUS, IBM Cognos Analytics, Sisense etc.
  • Self-service BI tool access to Big Data via SQL on cloud storage or Hadoop
  • Simplifying self-service BI tool data access to multiple data stores via data virtualisation – logical data warehouse
  • Accessing machine learning models deployed in containers, in-database, and in-Spark from self-service BI tools and spreadsheets
  • Accessing streaming data and real-time analytics from self-service BI tools and spreadsheets
  • Integration with advanced analytics in the cloud and on-premises
  • Scalable OLAP on Hadoop – Multi-dimensional analysis using AtScale, Kyvos Insights and Apache Kylin

Educator:

MIKE FERGUSON

Managing Director, Intelligent Business Strategies Limited

Mike Ferguson is Managing Director of Intelligent Business Strategies Limited. As an independent IT industry analyst and consultant, he specialises in BI/analytics and data management. With over 40 years of IT experience, Mike has consulted for dozens of companies on BI/analytics, data strategy, technology selection, data architecture, and data management. Mike is also conference chairman of Big Data LDN, the fastest growing data and analytics conference in Europe. He has spoken at events all over the world and written numerous articles. Formerly he was a principal and co-founder of Codd and Date Europe Limited – the inventors of the Relational Model, and a Chief Architect at Teradata on the Teradata DBMS. He teaches popular master classes in Data Warehouse Modernisation, Big Data Architecture & Technology, Centralised Data Governance of a Distributed Data Landscape, Practical Guidelines for Implementing a Data Mesh (Data Catalog, Data Fabric, Data Products, Data Marketplace), Real-Time Analytics, Embedded Analytics, Intelligent Apps & AI Automation, Migrating your Data Warehouse to the Cloud, Modern Data Architecture and Data Virtualisation & the Logical Data Warehouse.


Theme:
Agile Development
Educator:
MIKE FERGUSON
Language:
English
Duration:
2 days
Location:
Remote training
Dates:
Contact

This training programme / course has no active start dates. If you are interested in the course, please get in touch.

