Data Analysis Using R Software

This course introduces participants to R programming for statistical analysis and research. Its goal is to build familiarity with the basic R toolkit for statistical analysis and graphics.

Course Objectives

After completing this course, participants will be able to:

  • Use R for managing and manipulating data

  • Write simple programs in R

  • Use some of R's most commonly used statistical procedures

  • Apply the following data mining techniques to complex data sets using R:

    • Multivariate Statistics

    • Regression

    • ANOVA

    • Cluster Analysis

    • GLM including Logistic Regression

Module 1:  Introduction to R

  • Downloading R and installing packages

  • Performing basic calculations in R

  • Vector and matrix arithmetic

  • Logical selections, using R script files, and reading data into R

  • Using loops and if statements to manipulate data

Module 2:  Descriptive Statistics in R

  • Calculating descriptive statistics

  • Summarizing data (grouped & ungrouped)

  • Creating tables

  • Creating graphics

Module 3:  Data Mining & Statistical Analysis in R

  • Hypothesis testing

  • Regression (including stepwise)

  • ANOVA

  • Logistic Regression

  • Chi-Square Testing

  • Cluster Analysis

  • Decision Trees

Module 4:  Case Study – Big Data

  • Analysis of a large dataset

  • Reading data into R from CSV files, data manipulation and basic descriptive statistics

  • Plots and graphing, for-loops and if-statements, and installation of R packages

  • Identify the purpose of certain lines of code

  • Practice what has been learned in Modules 1–3 with a “big data” set

  • Regression analysis using the “big data” set