Description
The ultimate goal of the Data Streaming Nanodegree program is to provide students with the latest skills to process data in real time by building fluency with modern data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka Streaming. A graduate of this program will be able to:
- Understand the components of data streaming systems.
- Ingest real-time data using Apache Kafka and Spark and run analytics.
- Use the Faust Stream Processing Python library to build a real-time streaming application.
- Collect real-time data and run live analytics, as well as draw insights from reports generated by the streaming console.
- Learn about the Kafka ecosystem and the types of problems each solution is designed to solve.
- Use the integrated Kafka Python library for easy topic management, production, and consumption.
- Explain Spark Streaming components (architecture and API), connect Apache Spark Structured Streaming to Apache Kafka, manage data using Spark, and read DataFrames in the Spark Streaming console (see the sketch below).
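To make the last point concrete, here is a minimal sketch of reading a Kafka topic into a streaming DataFrame with PySpark Structured Streaming and printing it to the console. The broker address, topic name, and application name are illustrative assumptions, not values from the course, and the job assumes the spark-sql-kafka connector package is on the classpath.

```python
# Minimal sketch: read a Kafka topic as a streaming DataFrame and print it.
# Broker address and topic name below are placeholders, not course values.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = (
    SparkSession.builder
    .appName("kafka-structured-streaming-demo")
    .getOrCreate()
)

# Subscribe to a hypothetical topic on a local broker.
df = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "example.events")
    .option("startingOffsets", "earliest")
    .load()
)

# Kafka delivers key/value as binary, so cast them to strings before inspecting.
events = df.select(
    col("key").cast("string"),
    col("value").cast("string"),
    col("timestamp"),
)

# Write the stream to the console, which is how results show up in the Spark Streaming console.
query = (
    events.writeStream
    .outputMode("append")
    .format("console")
    .start()
)
query.awaitTermination()
```

A job like this would typically be launched with spark-submit and a Kafka connector package, for example `spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.0 app.py` (version numbers here are illustrative).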
This program consists of 2 courses and 2 projects. Each project you build is an opportunity to demonstrate what you have learned in the course, and it will show future employers that you have skills in these areas. The projects demonstrate knowledge of all of the tools taught, including Kafka consumers, producers, and topics; Kafka Connect sources and sinks; the Kafka REST Proxy for producing and consuming data over REST; data schemas with JSON and Apache Avro/Schema Registry; stream processing with the Faust Python library; and stream processing with KSQL.
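As a small taste of the producer/consumer work, here is a minimal sketch using the kafka-python library. The broker address, topic name, and message fields are placeholder assumptions for illustration only, not part of the course material.

```python
# Minimal sketch: produce and consume JSON messages with kafka-python.
# Broker address, topic name, and fields are hypothetical placeholders.
import json
from kafka import KafkaProducer, KafkaConsumer

BROKER = "localhost:9092"
TOPIC = "example.purchases"  # hypothetical topic

# Produce a few JSON-encoded messages.
producer = KafkaProducer(
    bootstrap_servers=BROKER,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
for i in range(3):
    producer.send(TOPIC, value={"purchase_id": i, "amount": 9.99})
producer.flush()

# Consume them back, deserializing the JSON payload.
consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=BROKER,
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    consumer_timeout_ms=5000,  # stop iterating when no new messages arrive
)
for message in consumer:
    print(message.topic, message.offset, message.value)
```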
What will you learn?
- Describe and explain data storage and stream processing
- Describe and explain real-world uses of stream processing
- Describe and explain the architecture of Kafka
- Describe and explain Kafka topics and configuration
- Leverage Kafka’s integration with Python to create topics and schemas
- Describe and explain Kafka producers, consumers, and their configuration
- Describe and explain typical stream processing scenarios, and when to use stream versus batch processing
- Describe and explain common stream processing strategies
- Describe and explain how time and windowing work in stream processing (a minimal Faust sketch follows this list)
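As a small illustration of the topic, agent, and windowing material above, here is a minimal sketch of a Faust application that counts events per user in tumbling windows. The app name, broker address, topic name, and record fields are assumptions made for this example, not values from the course.

```python
# Minimal sketch: a Faust agent counting clicks per user in 60-second tumbling windows.
# App name, broker, topic, and fields are illustrative assumptions.
import faust

app = faust.App("clickstream-demo", broker="kafka://localhost:9092")


class ClickEvent(faust.Record):
    user_id: str
    uri: str


clicks_topic = app.topic("example.clicks", value_type=ClickEvent)

# Windowed table: counts per user, bucketed into 60-second tumbling windows.
click_counts = app.Table(
    "click_counts", default=int
).tumbling(60.0, expires=300.0)


@app.agent(clicks_topic)
async def count_clicks(clicks):
    async for click in clicks:
        click_counts[click.user_id] += 1
        # .current() reads the value for the window of the current event.
        print(click.user_id, click_counts[click.user_id].current())


if __name__ == "__main__":
    app.main()
```

Saved as a module, this would be started as a worker (for example `python this_file.py worker`) once a Kafka broker is reachable at the configured address.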
Who is this course for?
- This program is designed for software engineers looking to build real-time data processing skills, as well as data engineers looking to enhance their existing skills in the next evolution of data engineering.
Data Streaming Nanodegree Syllabus
Data Streaming Nanodegree Content
Welcome to the Data Streaming Nanodegree Program
Lesson 01: Introduction to the Data Streaming Nanodegree Program
Lesson 02: Introduction to Data
Lesson 03: Nanodegree Career Services
Data Ingestion with Kafka & Kafka Streaming
Lesson 01: Introduction to Stream Processing
Lesson 02: Apache Kafka
Lesson 03: Data Schemas and Apache Avro
Lesson 04: Kafka Connect and REST Proxy
Lesson 05: Stream Processing Fundamentals
Lesson 06: Stream Processing with Faust
Lesson 07: KSQL
Lesson 08: Optimizing Public Transportation
Lesson 09: Optimize Your GitHub Profile
Apache Spark and Spark Streaming
Lesson 01: The Power of Spark
Lesson 02: Data Wrangling with Spark
Lesson 03: Introduction to Spark Streaming
Lesson 04: Structured Streaming APIs
Lesson 05: Integrating Spark Streaming with Kafka
Lesson 06: SF Crime Statistics with Spark Streaming
Lesson 07: Take 30 Minutes to Improve Your LinkedIn
Requirements
- Intermediate SQL, Python, and ETL experience.
- Basic knowledge of traditional batch processing and traditional service architecture is desired, but not required.
Pictures
Sample Clip
Installation Guide
Open the index.html file and follow the links to watch the tutorials.
Subtitles: Not available
Quality: 720p
Download Links
File password: free download software
File size: 1.46 GB