Course Outline

Introduction

Overview of Data Access Approaches (Hive, databases, etc.)

Overview of Spark Features and Architecture

Installing and Configuring Spark

Understanding Dataframes in Spark

Defining Tables and Importing Datasets

Querying Data Frames using SQL

Carrying out Aggregations, JOINs and Nested Queries

Uploading and Accessing Data

Querying Different Types of Data

  • JSON, Parquet, etc.

Querying Data Lakes with SQL

Troubleshooting

Summary and Conclusion

Requirements

  • Experience with SQL queries
  • Programming experience in any language

Audience

  • Data analysts
  • Data scientists
  • Data engineers
 7 Hours

Number of participants



Price per participant

Testimonials (10)

Related Courses

Python and Spark for Big Data (PySpark)

21 Hours

Introduction to Graph Computing

28 Hours

Artificial Intelligence - the most applied stuff - Data Analysis + Distributed AI + NLP

21 Hours

Apache Spark MLlib

35 Hours

Big Data Analytics in Health

21 Hours

Hadoop and Spark for Administrators

35 Hours

Hortonworks Data Platform (HDP) for Administrators

21 Hours

A Practical Introduction to Stream Processing

21 Hours

Magellan: Geospatial Analytics on Spark

14 Hours

Apache Spark for .NET Developers

21 Hours

SMACK Stack for Data Science

14 Hours

Apache Spark Fundamentals

21 Hours

Administration of Apache Spark

35 Hours

Apache Spark in the Cloud

21 Hours

Spark for Developers

21 Hours

Related Categories

1