Data Analysis with Python

This course focuses on the extensive features of Pandas, the workhorse library for data analysis in Python, and its visualisation counterpart, Matplotlib. It covers reading, preparing and manipulating tabular data from various sources and in common formats, including the most frequently used wrangling and manipulation techniques. Time series data processing and practical linear regression are also covered. The programming environment is JupyterLab on the Anaconda platform, one of the most popular data science platforms.

Who will the Course Benefit?

This course is designed for anyone with Python programming experience wanting to gain a solid foundation in Python's data analysis libraries. It is a must for aspiring Data Analysts and Scientists. Existing Data Analysts wanting a systematic introduction to Python's Data Analysis tools would also find the course very useful.

We believe in learning by doing and take a hands-on approach to training. Delegates are provided with all required resources, including data, and are expected to code along with the instructor. The objective is for delegates to reproduce the analysis in our manuals as well as gain a conceptual understanding of the methods.

Exercises and examples are used throughout the course to give practical hands-on experience with the techniques covered.

Course Objectives

This course aims to give delegates who already have Python programming experience in-depth knowledge of Python's main data analysis and visualisation libraries (Pandas and Matplotlib). The knowledge gained will enable delegates to design and develop enterprise-level data analytics solutions.

Learning Objectives

The delegate will learn and acquire skills as follows:

  • Read CSV, Excel and JSON format data into Pandas DataFrame objects
  • Fetch data from local files, web URLs and relational databases
  • Concatenate DataFrames
  • Clean, pivot, group, manipulate and summarise tabular data
  • Understand, read and process time series data
  • Perform single and multiple predictor variable linear regression calculations
  • Plot bar and pie charts, histograms, scatter and line graphs, using Matplotlib
  • Use JupyterLab

Pre-Requisites

Programming:

  • Delegates are expected to have Python programming experience. They should be able to use Python containers (lists, tuples, dictionaries and sets) effectively, construct loops and conditional statements, write functions, and create and use classes and objects. These skills can be acquired by taking our Python Programming course

Numeracy:

  • Able to calculate and interpret averages, standard deviations and similar basic statistics
  • Ability to read and understand charts and graphs
  • For linear regression: an understanding of the meaning of a linear graph (or an ability to understand it quickly when explained)
  • Mathematics: GCSE or equivalent

Course Content

Data Analysis Python Training Course

Course Introduction

  • Administration and Course Materials
  • Course Structure and Agenda
  • Delegate and Trainer Introductions

Session 1: INTRODUCTION TO DATAFRAMES

  • What is a DataFrame?
  • Loading DataFrames
  • Accessing contents
  • Useful functions
  • Adding and dropping columns and rows

Session 2: INTRODUCTION TO DATAFRAMES (CONTINUED)

  • Filtering and assigning data
  • Missing values and duplicates
  • Arithmetic basics
  • Applymap and apply
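
As a taste of the Session 2 topics, here is a minimal sketch of removing duplicates, filling missing values, filtering rows and applying a function. The DataFrame and its column names are invented for illustration and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical sample data, invented for this sketch
df = pd.DataFrame({
    "name": ["a", "b", "b", "c"],
    "value": [1.0, 2.0, 2.0, None],
})

# Drop exact duplicate rows (the second "b" row is removed)
deduped = df.drop_duplicates()

# Replace missing values in the "value" column with 0.0
filled = deduped.fillna({"value": 0.0})

# Filter: keep only rows where "value" is positive
positive = filled[filled["value"] > 0]

# Apply a function element by element to one column
doubled = filled["value"].apply(lambda v: v * 2)
```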

Session 3: COMBINING DATAFRAMES

  • Concatenate
  • Merge
  • Keys to merge on and suffixes for duplicate columns
  • Merge methods
  • Append
  • Join
  • Combine_first: For missing values
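
The combining operations listed above can be sketched roughly as follows. The two small DataFrames and their column names are assumptions made up for this example; the course itself may use different data.

```python
import pandas as pd

# Hypothetical sample data, invented for this sketch
left = pd.DataFrame({"id": [1, 2, 3], "score": [10.0, None, 30.0]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [20.0, 35.0, 40.0]})

# Concatenate: stack the rows of both frames and renumber the index
stacked = pd.concat([left, right], ignore_index=True)

# Merge on the "id" key; suffixes distinguish the duplicate "score" columns
merged = pd.merge(left, right, on="id", how="inner",
                  suffixes=("_left", "_right"))

# combine_first: keep values from left and fill its gaps from right,
# aligning the two frames on the "id" index
filled = left.set_index("id").combine_first(right.set_index("id"))
```

Changing `how` to `"left"`, `"right"` or `"outer"` selects the other merge methods.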

Session 4: RESHAPING DATAFRAMES

  • Unstacking and Stacking
  • Pivoting
  • Melting
  • Concatenating files from disk
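
To illustrate the reshaping topics, the sketch below pivots a long table into a wide one, melts it back, and stacks the wide table into a Series. The city and sales figures are made up for the example.

```python
import pandas as pd

# Hypothetical long-format data, invented for this sketch
long_df = pd.DataFrame({
    "city": ["Leeds", "Leeds", "York", "York"],
    "year": [2022, 2023, 2022, 2023],
    "sales": [100, 120, 80, 90],
})

# Pivot: one row per city, one column per year
wide = long_df.pivot(index="city", columns="year", values="sales")

# Melt: turn the year columns back into rows
melted = wide.reset_index().melt(id_vars="city", var_name="year",
                                 value_name="sales")

# Stack: move the column labels into an inner index level
stacked = wide.stack()
```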

Session 5: GROUPBY AND AGGREGATION: SPLIT-APPLY-COMBINE

  • Basic GroupBy
  • Hierarchical GroupBy
  • Group by function of Index

Session 6: GROUPBY AND AGGREGATION: SPLIT-APPLY-COMBINE (CONTINUED)

  • Aggregate by mapping on Index and Columns
  • Aggregate by user-defined functions
  • Aggregate using multiple functions
  • Aggregate using separate function for each column
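
A minimal sketch of the aggregation styles listed above, assuming an invented sales table; `value_range` is a hypothetical user-defined function written for this example.

```python
import pandas as pd

# Hypothetical sample data, invented for this sketch
df = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "units": [5, 7, 3, 9],
    "price": [2.0, 4.0, 1.0, 3.0],
})

def value_range(series):
    """User-defined aggregation: spread between largest and smallest value."""
    return series.max() - series.min()

# Several functions applied to one column, including the user-defined one
multi = df.groupby("region")["units"].agg(["sum", "mean", value_range])

# A separate function for each column
per_column = df.groupby("region").agg({"units": "sum", "price": "mean"})
```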

Session 7: GROUPBY AND AGGREGATION: SPLIT-APPLY-COMBINE (CONTINUED)

  • Transform
  • Apply function
  • Pivoting with Aggregation

Session 8: PLOTTING WITH MATPLOTLIB

  • Pie chart
  • Bar chart
  • Histogram
  • Scatter plot
  • Line plot

Session 9: TIME SERIES DATA

  • Basic Concepts: Datetime, Timestamp, Timedelta, Timezones
  • Pandas to_datetime() function
  • Date Range
  • What is time series data
  • Reading time series data
  • Missing Dates
  • Partial indexing, Slicing and Selecting
  • Resampling
  • Moving Window functions
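
Several of the time series topics can be sketched in a few lines. The daily series below is invented for illustration; the dates and values are assumptions, not course data.

```python
import pandas as pd

# Hypothetical daily series built with date_range
idx = pd.date_range("2024-01-01", periods=6, freq="D")
ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)

# to_datetime parses a string into a Timestamp
third = pd.to_datetime("2024-01-03")
value_on_third = ts.loc[third]

# Slicing by date strings; both end points are included
first_three = ts.loc["2024-01-01":"2024-01-03"]

# Resampling: total for each two-day bin
two_day_totals = ts.resample("2D").sum()

# Moving window: three-day rolling mean (first two entries are NaN)
rolling_mean = ts.rolling(window=3).mean()
```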

Session 10: LINEAR REGRESSION

  • What is linear regression?
  • Simple Linear regression
  • Multiple Regression
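
The outline does not say which library the course uses for regression, so the sketch below assumes plain NumPy least squares as one possible approach. The data is synthetic and noise-free, constructed so that the fitted coefficients are known in advance.

```python
import numpy as np

# Simple linear regression on synthetic data generated from y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0
slope, intercept = np.polyfit(x, y, deg=1)

# Multiple regression on synthetic data from y = 1 + 2*x1 + 3*x2
x1 = np.array([0.0, 1.0, 0.0, 1.0, 2.0])
x2 = np.array([0.0, 0.0, 1.0, 1.0, 1.0])
y2 = 1.0 + 2.0 * x1 + 3.0 * x2

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones_like(x1), x1, x2])
coefficients, *_ = np.linalg.lstsq(X, y2, rcond=None)
```

Here `coefficients` holds the intercept followed by one coefficient per predictor variable.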
