Description
Data Engineering Essentials using SQL, Python, and PySpark is a project-oriented data engineering course covering SQL, Python, and the PySpark framework, published by Udemy. By the end of the course, you will be able to build a data pipeline that processes and stores data, and to carry out data engineering and data analysis projects on your own using the techniques and technologies taught. Data engineering is the process of filtering, storing, and processing data according to the needs and objectives of a particular project or research effort. It is a broad field that includes many subdisciplines. The course begins with the fundamentals of Python and SQL; after working through the exercises and projects in that section, you will move on to the more advanced topics listed below.
What you’ll learn in Data Engineering Essentials using SQL, Python, and PySpark
- Building a Data Pipeline with SQL
- Postgres database management system
- Initial database setup and simple operations on data, such as inserting, deleting, and updating
- Write simple SQL queries
- Filter, join, and aggregate data with SQL
- Create tables and indexes in a database with DDL commands
- Partition and categorize information in the database
- Predefined SQL functions, such as those for manipulating string values and…
- Write complex, specialized SQL queries with PostgreSQL
- Python Programming Principles
- Perform simple database operations with Python
- Conditionals and loops in Python
- Lists and dictionaries in Python
- Data types in Python
- Map and reduce libraries in Python
- Pandas library
- Initial installation of the data engineering application development environment
- Spark DataFrame API operations such as select, filter, groupBy, orderBy, and…
- Build data pipelines that read and write file formats such as Parquet, JSON, and CSV
Course Specifications
Publisher: Udemy
Instructor: Durga Viswanatha Raju Gadiraju
Language: French
Level: Intermediate
Number of lessons: 624
Duration: 56 hours
Course topics
Basics of Data Engineering using SQL, Python and PySpark
Laptop with a decent configuration (minimum 4 GB RAM and a dual-core CPU)
A GCP account with available credit, or AWS access
Set up a self-supported lab on a cloud platform (you may need to pay the applicable cloud fees unless you have credit)
A degree in CS or IT or prior IT experience is strongly desired
Installation guide
After extracting the files, watch the videos with your favorite player.
Subtitle: English
Quality: 720p
Previous title:
Practical Data Engineering Principles: SQL, Python, and Spark
Changes:
Compared with version 2021/11, version 2023/7 has 14 fewer lessons and is 1 hour and 22 minutes shorter.
Download links
File password(s): free download software
Size: 15.4 GB