Introduction to Big Data

What is Big Data? Key reasons to learn Big Data analytics with a vendor-agnostic approach:
This Introduction to Big Data course helps you act on data for real business gain. The focus is not on what a tool can do, but on what you can do with the tool's output. Wikipedia defines big data as a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.

In this hands-on Introduction to Big Data course, you learn to use big data analysis tools and techniques to support better business decision-making before moving on to training in specific products such as Hadoop. You learn ways of storing data that allow for efficient processing and analysis, and gain the skills you need to store, manage, process, and analyse massive amounts of unstructured data to create an appropriate data lake.

Who Should Attend:

  • Anyone who needs to implement or enhance a big data environment, or who wants to advance an analytics career by building foundational knowledge
  • Typical job roles include: Project Managers and IT Managers, Database Administrators and Data Architects, Developers and SQL Developers, Data Scientists and Business Intelligence professionals

All-Inclusive: After-Course Coaching for Real-World Application:

Your tutor is with you from the beginning of your planning until you return to your job ready to apply your new skills, with instructor coaching to help you tackle real-world big data implementation challenges.

Take Your Big Data Course Online or In-person:

Schedules are busy, but online big data training makes it easy to level up your career. Our virtual course delivery option gives you the advantages of a live classroom from the comfort of your computer screen, wherever you are.

Learning Objectives

You Will Learn How To

  • Store, manage, and analyse unstructured data
  • Select the correct big data stores for disparate data sets
  • Process large data sets using Hadoop to extract value
  • Query large data sets in near real time with Pig and Hive
  • Plan and implement a big data strategy for your organisation

Pre-Requisites

Recommended Experience:

  • Working knowledge of the Microsoft Windows platform and basic database concepts

Course Content

Course Outline

Introduction to Big Data

Defining Big Data

  • The four dimensions of Big Data: volume, velocity, variety, veracity
  • Introducing the Storage, MapReduce and Query Stack

Delivering business benefit from Big Data

  • Establishing the business importance of Big Data
  • Addressing the challenge of extracting useful data
  • Integrating Big Data with traditional data

Storing Big Data

Analysing your data characteristics

  • Selecting data sources for analysis
  • Eliminating redundant data
  • Establishing the role of NoSQL

Overview of Big Data stores

  • Data models: key-value, graph, document, column-family
  • Hadoop Distributed File System
  • HBase
  • Hive
  • Cassandra
  • Hypertable
  • Amazon S3
  • Bigtable
  • DynamoDB
  • MongoDB
  • Redis
  • Riak
  • Neo4j
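
To make the four data models above concrete, here is a minimal sketch of one order expressed in each shape. It uses plain Python dictionaries and tuples, not the API of any store listed above, and every name and value (the order ID, customer "c42", `orders_placed_by`) is a hypothetical chosen for illustration.

```python
import json

# Key-value: the store sees only a key and an opaque value.
# The application must decode the value to look inside it.
key_value_store = {"order:1001": json.dumps({"customer": "c42", "total": 59.9})}

# Document: a nested, self-describing record whose fields can be queried.
document_store = {
    1001: {
        "customer": {"id": "c42", "name": "Ada"},
        "items": [{"sku": "A1", "qty": 2}],
        "total": 59.9,
    }
}

# Column-family: row key -> column family -> column -> value.
# Related columns are grouped so they can be read together.
column_family_store = {
    "1001": {
        "customer": {"id": "c42", "name": "Ada"},
        "order": {"total": "59.9"},
    }
}

# Graph: relationships are first-class, stored as (source, relation, target).
graph_edges = [("c42", "PLACED", "o1001"), ("o1001", "CONTAINS", "A1")]


def orders_placed_by(customer_id: str) -> list[str]:
    """Follow PLACED edges from a customer node to its order nodes."""
    return [dst for src, rel, dst in graph_edges
            if src == customer_id and rel == "PLACED"]


if __name__ == "__main__":
    print(json.loads(key_value_store["order:1001"])["total"])
    print(document_store[1001]["items"][0]["sku"])
    print(column_family_store["1001"]["customer"]["name"])
    print(orders_placed_by("c42"))
```

The same facts appear in all four shapes. What differs is which questions each shape answers cheaply: a lookup by key, a query on nested fields, a read of grouped columns, or a traversal of relationships.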

Selecting Big Data stores

  • Choosing the correct data stores based on your data characteristics
  • Moving code to data
  • Implementing polyglot data store solutions
  • Aligning business goals to the appropriate data store

Processing Big Data

Integrating disparate data stores

  • Mapping data to the programming framework
  • Connecting and extracting data from storage
  • Transforming data for processing
  • Subdividing data in preparation for Hadoop MapReduce

Employing Hadoop MapReduce

  • Creating the components of Hadoop MapReduce jobs
  • Distributing data processing across server farms
  • Executing Hadoop MapReduce jobs
  • Monitoring the progress of job flows
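
As a rough illustration of the components listed above, the following sketch imitates the map, shuffle, and reduce phases of a word count in a single Python process. It is not Hadoop code and does not use the Hadoop API. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions made for this example. A real cluster would run many map and reduce tasks in parallel on different servers.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all values by key between the map and reduce steps."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run the three phases in sequence over an iterable of text lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big value", "data lake"]
    print(word_count(sample))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, both steps can be spread across many machines. That independence is what makes it possible to distribute the processing.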

The building blocks of Hadoop MapReduce

  • Distinguishing Hadoop daemons
  • Investigating the Hadoop Distributed File System
  • Selecting appropriate execution modes: local, pseudo-distributed and fully distributed

Handling streaming data

  • Comparing real-time processing models
  • Leveraging Storm to extract live events
  • Lightning-fast processing with Spark and Shark

Tools and Techniques to Analyse Big Data

Abstracting Hadoop MapReduce jobs with Pig

  • Communicating with Hadoop in Pig Latin
  • Executing commands using the Grunt Shell
  • Streamlining high-level processing

Performing ad hoc Big Data querying with Hive

  • Persisting data in the Hive Metastore
  • Performing queries with HiveQL
  • Investigating Hive file formats

Creating business value from extracted data

  • Mining data with Mahout
  • Visualising processed results with reporting tools
  • Querying in real time with Impala

Developing a Big Data Strategy

Defining a Big Data strategy for your organisation

  • Establishing your Big Data needs
  • Meeting business goals with timely data
  • Evaluating commercial Big Data tools
  • Managing organisational expectations

Enabling analytic innovation

  • Focusing on business importance
  • Framing the problem
  • Selecting the correct tools
  • Achieving timely results

Implementing a Big Data Solution

  • Selecting suitable vendors and hosting options
  • Balancing costs against business value
  • Keeping ahead of the curve

Virtual Classroom

Virtual classrooms provide all the benefits of attending a classroom course without the need to arrange travel and accommodation. Please note that virtual courses are attended in real time, commencing on a specified date.