


Course Objectives
After completing this course, participants will be able to:
- Use R for managing and manipulating data
- Explore simple programming in R
- Become familiar with some of R's most commonly used statistical procedures
- Apply knowledge of the following data mining techniques for complex data sets using R:
  - Multivariate Statistics
  - Regression
  - ANOVA
  - Cluster Analysis
  - GLM including Logistic Regression

Module 1: Introduction to R
- Downloading R and installing packages
- Performing basic calculations in R
- Vector and matrix arithmetic
- Logical selections, using R script files, and reading data into R
- Using loops and if statements to manipulate data

Module 2: Descriptive Statistics in R
- Calculating descriptive statistics
- Summarizing data (grouped & ungrouped)
- Creating tables
- Creating graphics

Module 3: Data Mining & Statistical Analysis in R
- Hypothesis testing
- Regression (including stepwise)
- ANOVA
- Logistic regression
- Chi-square testing
- Cluster analysis
- Decision trees

Module 4: Case Study – Big Data
- Analysis of a large dataset
- Reading data into R from CSV files, data manipulation, and basic descriptive statistics
- Plots and graphing, for-loops and if-statements, and installation of R packages
- Identifying the purpose of specific lines of code
- Practicing what has been learned in Modules 1-3 with a “big data” set
- Regression analysis using the “big data” set