Real Time Spark Project for Beginners: Hadoop, Spark, Docker
Building Real Time Data Pipeline Using Apache Kafka, Apache Spark, Hadoop, PostgreSQL, Django and Flexmonster on Docker
In many data centers, different types of servers generate large amounts of data in real time. Each event, in this case, is the status of a server in the data center.
This data needs to be processed in real time to generate insights for the people who monitor the servers and the data center. They have to track server status regularly and resolve issues as they occur to keep the servers stable.
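As a rough illustration of what a server-status event could look like, here is a minimal sketch in plain Python. The field names (`event_id`, `server_id`, `server_type`, `status`, `event_time`) and the status values are assumptions made for this example, not the schema used in the course.

```python
import json
import random
from datetime import datetime, timezone

# Hypothetical values for illustration only; the real event schema may differ.
SERVER_TYPES = ["application", "database", "web"]
STATUSES = ["RUNNING", "DOWN", "MAINTENANCE"]


def make_event(event_id, rng):
    """Build one server-status event as a plain dict."""
    return {
        "event_id": event_id,
        "server_id": f"server-{rng.randint(1, 20)}",
        "server_type": rng.choice(SERVER_TYPES),
        "status": rng.choice(STATUSES),
        "event_time": datetime.now(timezone.utc).isoformat(),
    }


def simulate(count, seed=None):
    """Return `count` events serialized as JSON strings."""
    rng = random.Random(seed)  # a seed makes the random choices repeatable
    return [json.dumps(make_event(i, rng)) for i in range(1, count + 1)]


if __name__ == "__main__":
    for line in simulate(3, seed=42):
        print(line)
```

In a pipeline like the one described here, each JSON string would typically be published to a Kafka topic rather than printed.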
Since the data is large and arrives in real time, we need to choose an architecture with scalable storage and computation frameworks.
Hence we build a real-time data pipeline using Apache Kafka, Apache Spark, Hadoop, PostgreSQL, Django and Flexmonster on Docker to generate insights from this data.
The Spark project/data pipeline is built using Apache Spark with Scala and PySpark on an Apache Hadoop cluster that runs on Docker.
The data visualization is built using the Django web framework and Flexmonster.
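To give a feel for the kind of insight such a dashboard might show, the sketch below aggregates a handful of events in plain Python. It is only a stand-in for the aggregation that Spark would perform on the stream; the sample events, field names, and the two helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical sample events; field names are assumptions for this example.
EVENTS = [
    {"server_id": "web-1", "server_type": "web", "status": "RUNNING",
     "event_time": "2024-01-01T00:00:00"},
    {"server_id": "db-1", "server_type": "database", "status": "DOWN",
     "event_time": "2024-01-01T00:00:30"},
    {"server_id": "web-1", "server_type": "web", "status": "DOWN",
     "event_time": "2024-01-01T00:01:00"},
    {"server_id": "app-1", "server_type": "application", "status": "RUNNING",
     "event_time": "2024-01-01T00:01:30"},
    {"server_id": "db-1", "server_type": "database", "status": "RUNNING",
     "event_time": "2024-01-01T00:02:00"},
]


def status_counts(events):
    """Count events per (server_type, status) pair."""
    return Counter((e["server_type"], e["status"]) for e in events)


def down_servers(events):
    """Return the servers whose most recent event reports DOWN."""
    latest = {}
    # ISO-8601 timestamps in the same format sort correctly as strings.
    for e in sorted(events, key=lambda e: e["event_time"]):
        latest[e["server_id"]] = e["status"]
    return sorted(s for s, status in latest.items() if status == "DOWN")


if __name__ == "__main__":
    print(status_counts(EVENTS))
    print(down_servers(EVENTS))
```

Results like these could be written to a PostgreSQL table and read by the web layer for charting.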
Course Curriculum
Introduction
Environment Setup
- 3. Setting up Docker Environment (9:54)
- 4. Create Single Node Kafka Cluster on Docker (8:15)
- 5. Create Single Node Apache Hadoop and Spark Cluster on Docker (35:06)
- 6. Setting up IntelliJ IDEA Community Edition (IDE) (21:00)
- 7. Setting up PyCharm Community Edition (IDE) (16:40)
- 8. Setting up Django Web Framework (7:09)
Development Project Code Walk Through
- 9. Event Simulator using Python (Server Status Detail) (19:15)
- 10. Building Streaming Data Pipeline using Scala - Spark Structured Streaming (30:57)
- 11. Building Streaming Data Pipeline using PySpark - Spark Structured Streaming (28:53)
- 12. Setting up PostgreSQL Database (Events_Database) (4:55)
- 13. Building Dashboard using Django Web Framework and Flexmonster - Visualization (22:20)
Frequently Asked Questions
When does the course start and finish?
The course starts now and never ends! It is a completely self-paced online course - you decide when you start and when you finish.
How long do I have access to the course?
How does lifetime access sound? After enrolling, you have unlimited access to this course for as long as you like - across any and all devices you own.
What if I am unhappy with the course?
We would never want you to be unhappy! If you are unsatisfied with your purchase, contact us in the first 30 days and we will give you a full refund.