Advanced Methods in Data Science & Big Data Analytics

Course Description

This course builds on skills developed in the Data Science & Big Data Analytics course. The main focus areas cover Hadoop (including Pig, Hive, & HBase), Natural Language Processing, Social Network Analysis, Simulation, Random Forests, Multinomial Logistic Regression, & Data Visualization. Taking an “Open” or technology-neutral approach, this course uses several open-source tools to address significant data challenges. This training prepares the learner for the Dell Technologies Proven Professional advanced analytics specialist-level certification exam (E20-065).

Prerequisites

  • Completion of the Data Science & Big Data Analytics course 
  • Proficiency in at least one programming language, such as Java or Python

Audience Profile

This course is intended for aspiring Data Scientists, data analysts who have completed the associate-level Data Science & Big Data Analytics course, & computer scientists who want to learn MapReduce & methods for analyzing unstructured data such as text.

Learning Objectives

Upon successful completion of this course, participants should be able to: 

  • Develop & execute MapReduce functionality 
  • Gain familiarity with NoSQL databases & Hadoop Ecosystem tools for analyzing large-scale, unstructured data sets 
  • Develop a working knowledge of Natural Language Processing, Social Network Analysis, & Data Visualization concepts 
  • Use advanced quantitative methods & apply one of them in a Hadoop environment 
  • Apply advanced techniques to real-world datasets in a final lab

Content Outline

  • Lesson 1: The MapReduce Framework
  • Lesson 2: Apache Hadoop
  • Lesson 3: Hadoop Distributed File System
  • Lesson 4: YARN

  • Lesson 1: Hadoop Ecosystem
  • Lesson 2: Pig
  • Lesson 3: Hive
  • Lesson 4: NoSQL - Not Only SQL
  • Lesson 5: HBase
  • Lesson 6: Spark

  • Lesson 1: Introduction to NLP
  • Lesson 2: Text Preprocessing
  • Lesson 3: TF-IDF
  • Lesson 4: Beyond Bag of Words
  • Lesson 5: Language Modeling
  • Lesson 6: POS Tagging & HMM
  • Lesson 7: Sentiment Analysis & Topic Modeling

  • Lesson 1: Introduction to SNA & Graph Theory
  • Lesson 2: Most Important Nodes
  • Lesson 3: Communities & Small World
  • Lesson 4: Network Problems & SNA Tools

  • Lesson 1: Simulation
  • Lesson 2: Random Forests
  • Lesson 3: Multinomial Logistic Regression

  • Lesson 1: Perception & Visualization
  • Lesson 2: Visualization of Multivariate Data

A: To attend the training session, you need a working desktop or laptop that meets the required specifications & a reliable internet connection to access the labs.

A: We always recommend attending the live session so that you can practice, clarify doubts immediately, & get more value from your investment. However, if you have to miss a class, Radiant Techlearning will provide the recorded session for that day. These recordings are meant only for personal use & NOT for distribution or commercial use.

A: Radiant Techlearning has a data center containing a Virtual Training environment for participants’ hands-on practice.

Participants can access these labs over the cloud through a remote desktop connection.

Radiant virtual labs allow you to learn from anywhere in the world & in any time zone.

A: We engage learners in real-world, industry-oriented projects during the training program. These projects build your skills & knowledge & give you practical experience that carries over to your future tasks & assignments.

Certification

Specialist - Data Scientist, Advanced Analytics Version 1.0 (DCS-DS)


  • Enroll
    • Learning Format: ILT
    • Duration: 80 Hours
    • Training Level: Beginner
    • Jan 29th: 8:00 - 10:00 AM (Weekend Batch)
    • Price: INR 25000

    • Learning Format: VILT
    • Duration: 50 Hours
    • Training Level: Beginner
    • Validity Period: 3 Months
    • Price: INR 6000

    • Learning Format: Blended Learning (Highly Interactive Self-Paced Courses + Practice Lab + VILT + Career Assistance)
    • Duration: 160 Hours (50 Hours Self-paced courses + 80 Hours of Boot Camp + 20 Hours of Interview Assistance)
    • Training Level: Beginner
    • Validity Period: 6 Months
    • Jan 29th: 8:00 - 10:00 AM (Weekend Batch)
    • Price: INR 6000
