Data Science and Big Data Analytics v2 Course Details:

In this course, you will gain practical, foundation-level training that enables immediate and effective participation in big data and other analytics projects. You will cover basic and advanced analytic methods and big data analytics technology and tools, including MapReduce and Hadoop. Extensive labs throughout the course provide you with the opportunity to apply these methods and tools to real-world business challenges using a technology-neutral approach. In a final lab, you will address a big data analytics challenge by applying the concepts taught in the course to the context of the Data Analytics Lifecycle. You will prepare for the Data Scientist Associate (EMCDSA) certification exam and establish a baseline of Data Science skills.

    No classes are currently scheduled for this course.

    Call (919) 283-1653 to get a class scheduled online or in your area!

The following modules and lessons included in this course are designed to support the course objectives:

Module 1 – Introduction to Big Data analytics

  • Big Data and its characteristics
  • Business value from Big Data
  • Data scientist

Module 2 – Data Analytics Lifecycle

  • Data analytics lifecycle overview
  • Discovery phase
  • Data preparation phase
  • Model planning phase
  • Model building phase
  • Communicate results phase
  • Operationalize phase

Module 3 – Basic data analytics methods using R

  • Introduction to the R programming language
  • Analyzing and exploring data
  • Statistics for model building and evaluation

Module 4 – Advanced analytics theory and methods

  • Introduction to advanced analytics—theory and methods
  • K-means clustering
  • Association rules
  • Linear regression
  • Logistic regression
  • Text analysis
  • Naïve Bayes
  • Decision trees
  • Time series analysis

Module 5 – Advanced analytics: technology and tools

  • Introduction to advanced analytics—technology and tools
  • Hadoop ecosystem
  • In-database analytics SQL essentials
  • Advanced SQL and MADlib

Module 6 – Putting it all together

  • Preparing to operationalize
  • Preparing project presentations
  • Data visualization techniques

*Please Note: Course Outline is subject to change without notice. Exact course outline will be provided at time of registration.

Upon completing this course, you will be able to:

  • Immediately participate as a data science team member
  • Work with large data sets and generate insights
  • Build predictive and classification models
  • Manage a data analytics project through the entire lifecycle

In addition to the examples provided in the lectures, this course includes labs that give participants practical experience. Note: There are no demonstrations.

1. Big Data Analytics

  • Big Data
  • State of the Practice in Analytics
  • Data Scientist
  • Big Data Analytics in Industry Verticals

2. Data Analytics Lifecycle

  • Discovery
  • Data Preparation
  • Model Planning
  • Model Building
  • Communicating Results
  • Operationalizing

3. Basic Data Analytic Methods Using R

  • Using R to Look at Data
  • Analyzing and Exploring the Data
  • Statistics for Model Building and Evaluation

4. Advanced Analytics: Theory and Methods

  • K-Means Clustering
  • Association Rules
  • Linear Regression
  • Logistic Regression
  • Naïve Bayesian Classifier
  • Decision Trees
  • Time Series Analysis
  • Text Analysis

5. Advanced Analytics: Technologies and Tools

  • Analytics for Unstructured Data
    • MapReduce and Hadoop
  • Hadoop Ecosystem
  • In-Database Analytics: SQL Essentials
  • Advanced SQL and MADlib for In-Database Analytics

6. Putting it All Together

  • Operationalizing an Analytics Project
  • Creating the Final Deliverables
  • Data Visualization Techniques
  • Final Lab Exercise on Big Data Analytics

To successfully complete this course and gain the maximum benefits from participation, you should have the following knowledge and skill sets:

  • A strong quantitative background with a solid understanding of basic statistics, as would be found in a statistics 101 level course
  • Experience with a scripting language such as Java, Perl, or Python (or R). Many of the lab examples taught in the course use R (with an RStudio GUI), which is an open source statistical tool and programming language
  • Experience with SQL

This course is intended for:

  • Managers of teams of business intelligence, analytics, and big data professionals
  • Current business and data analysts looking to add big data analytics to their skills
  • Data and database professionals looking to exploit their analytic skills in a big data environment
  • Recent college graduates and graduate students with academic experience in a related discipline looking to move into the world of Data Science and big data
  • Individuals looking to take the Data Scientist Associate (EMCDSA) certification
