Effective Data Analysis in MATLAB®

This 2-day hands-on workshop gives engineers, researchers, and statisticians a practical, organized approach to using MATLAB for data analysis.

Topics include data input and output, handling large data sets, computing descriptive statistics, statistical plotting and visualization, statistical process control, clustering, and data mining. The workshop includes many examples and exercises that cover a cross-section of application areas in science and engineering.

Prerequisites

A working knowledge of the MATLAB language equivalent to “Comprehensive MATLAB” and familiarity with basic statistical concepts.

Course Outline

Data and Statistics
Objective: Learn to work with data in the MATLAB environment, compute descriptive parameters, and visualize data and data fits in a variety of ways.

Data analysis and visualization

  • Major tasks and approaches
Working with data
  • Data I/O for a variety of formats
  • Data organization
  • Mixed data types
  • Missing data
Descriptive statistics and distributions
  • Statistical questions
  • Characterizing and comparing distributions
  • Measures of center, spread, and shape
  • Parameter estimation
  • Preprocessing and transformation
Statistical plotting
  • Histograms, quantile, box, and scatter plots
  • Scatterplot matrix, conditioning plots
  • Surface, contour, and image plots
  • Grouped data
Curve fitting
  • Polynomial and other linear and nonlinear fits
  • Robust fits
  • Visualizing fits and residuals
Time series
  • Fourier and other methods
  • Smoothing
  • Working with irregular sampling intervals
3D data
  • Visualization
  • Surface fitting
  • Working with irregular sampling
Interpolation
  • Choosing the best method
  • Dealing with data problems
Beyond 3 dimensions
  • Visualization
  • Dimensionality reduction

Exercises

Statistical Process Control
Objective: Learn and practice the charting and interpretation methods for SPC.

Concepts and examples

  • SPC and quality: stability, variation, common and special causes
  • Variable control charts
  • Attribute control charts
  • Control chart interpretation
  • Extensions for more than one variable

Exercise

Clustering
Objective: Learn and practice partitioning methods and assessment.

Cluster analysis

  • Concepts: partition and distance
  • k-means clustering and other methods
  • Assessing results

Exercise

Large data sets
Objective: Learn techniques for effectively dealing with large data sets.

Manipulation

  • Memory limits
  • Processing speed
  • Data format and organization
Visualization
  • Coded plots
  • Slicing methods
  • Dimensionality reduction

Data mining
Objective: Understand the major concepts of data mining and knowledge discovery in data. Practice the most important techniques.

Data Mining Concepts and Tasks

  • Classification for analysis and prediction
  • Clustering large amounts of data
  • Association of clusters
  • Deviation detection for outliers and changes
  • Visualization to support discovery
  • Summarization to describe patterns

Exercise