Apache Flink
Course short description
Apache Flink is a distributed system and computational engine for stateful big data streaming. That was a mouthful. In plain English, Flink is a library that allows you to process big data at scale, as it arrives, in almost real time. Flink gives you a variety of APIs: high-level ones that let you do plain functional programming on streaming data, and low-level ones that give you ultimate control. It also comes with connectors to everything popular, including Kafka, JDBC, Cassandra, Pulsar, S3 and all sorts of data processors and storage systems. In this course, you'll learn how to be productive with Flink, and you'll grow as a data engineer.
First of all, this course will give you everything you need to be productive with Flink:
- You'll deeply understand the Flink streaming engine and how it works
- You'll use functional programming on data streams
- You'll process any kind of data in real time, at scale
- You'll master complex transformations such as window functions
- You'll be able to run stateful computations, which is the main strength of Flink
- You'll know how to connect Flink to the most popular message buses, data streaming systems and data storage systems
- You'll be able to design your own connectors
- You'll be able to deploy Flink applications to a cluster
- You'll be able to troubleshoot and find relevant information in the Flink UI
After this course, you'll be able to process data in any way you need using Flink.
But most importantly, you'll develop timeless skills that you'll carry with you for your entire career, regardless of which data streaming tool you end up using:
- You'll deeply understand the practical benefits of streaming data in general
- You'll be able to work with event time and processing time
- You'll internalize the implications and tradeoffs of choosing between latency and throughput
- You'll understand the need for data consistency and persistence