Big Data Fundamentals

Big Data Fundamentals Course Details:

This course is a survey of big data – the landscape, the technology behind it, business drivers, and strategic possibilities. “Big data” is a hot buzzword, but most organizations struggle to put it to practical use. This course assumes no prior knowledge of Apache Hadoop or big data management, and teaches you how to put big data to work and manage it effectively.

No classes are currently scheduled for this course.

Call (919) 283-1674 to get a class scheduled online or in your area!

Introduction to Big Data
- Academic
- Early web
- Web scale
Sources (Examples)
- Internet
- Transport systems
- Medical, healthcare
- Insurance
- Military and others
Hadoop – the free platform for working with big data
- History
- Yahoo
- Platform fragmentation
- What usage looks like in the enterprise
The concepts
- Load data how you find it
- Process it when you can
- Project it into various schemas on the fly
- Push it back to where you need it
The basics
- What it’s good for
- What it can’t do / disadvantages
- Most common use cases for big data
Introduction to HDFS
- Robustness
- Data Replication
- Gotchas
MapReduce – the core big data function
- Map explained
- Sort and shuffle explained
- Reduce explained
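
As a rough illustration of the three stages listed above, here is a minimal single-process sketch in Python. It is not Hadoop code: the word-count task and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are assumptions chosen for illustration, and a real Hadoop job would distribute these stages across a cluster.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Sort and shuffle: group values by key, then order the groups by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped}


def word_count(lines):
    """Run the three stages in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, used only to show the flow of data.
    print(word_count(["big data", "big cluster"]))
```

The point of the split is that the map and reduce functions only ever see one record or one key at a time, which is what lets a framework run many copies of them in parallel.
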
YARN
- How it fits
- How it works
- Resource Manager
- Application Master
PIG
- What it is
- How it works
- Compatibilities
- Advantages
- Disadvantages
Processing Data
- The Piggy Bank
- Loading and Illustrating the data
- Writing a Query
- Storing the Result
HIVE
- Data warehousing
- What it is, what it’s not
- Language compatibilities
- Advantages
OOZIE
- What it is
- Complex workflow environments
- Reducing time-to-market
- Frequency execution
- How it works with other big data tools
FLUME – stream, collect, store and analyze high-volume log data
- How it works: Event, source, sink, channel, agent and client
- How it works illustrated
- How it works demonstrated
SPARK
- Move over, 2012 big data tools: Apache Spark is the new power tool
- The new open source cluster framework
- When Spark performs up to 100 times faster
- Performance comparison of Spark and Hadoop
- What else can it do?
HBASE
- What it is
- Common use cases
Using External Tools

*Please Note: Course Outline is subject to change without notice. Exact course outline will be provided at time of registration.

Navigate the technology stacks and tools used to work with big data
Establish a common vocabulary on your teams for applying big data practices
Get an overview of how big data technologies work: Apache Hadoop, Spark, Pig, Hive, Sqoop, Oozie, and Flume
Design both functional and non-functional requirements for working with big data
Understand common business cases for big data
Differentiate between hype and what’s truly possible
Look at examples of real-world big data use cases
Select initiatives and projects that have high potential to benefit from big data applications
Understand what type of staffing, technical skills, and training is required for projects that incorporate or focus on big data

Software Developer
Machine Learning Engineer
Data Scientist
Business Intelligence Developer
Research Scientist
Data Engineer
Programmer
Project Manager
