Continuous Applications
A continuous application is an end-to-end application that reacts to data in real time. In particular, developers would like to use a single programming interface to support the facets of continuous applications that are currently handled in separate systems, such as query serving or interaction with batch jobs. For example, continuous applications can handle the following use cases:
- Updating data that will be served in real time. The developer would write a single Spark application that handles both updates and serving (e.g. through Spark’s JDBC server), or would use an API that automatically performs transactional updates on a serving system like MySQL, Redis or Apache Cassandra.
- Extract, transform and load (ETL). The developer would simply list the transformations required, as in a batch job, and the streaming system would handle coordination with both storage systems (the source and the destination) to ensure exactly-once processing.
- Creating a real-time version of an existing batch job. The streaming system would guarantee results are always consistent with a batch job on the same data.
- Online machine learning. The machine learning library would be designed to combine real-time training, periodic batch training, and prediction serving behind the same API.
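To make the exactly-once and batch-consistency guarantees above concrete, here is a minimal sketch in plain Python. It does not use Spark's API. The names `batch_count` and `StreamingCounter` are hypothetical, and the sketch assumes each record carries a monotonically increasing offset and that state and offset can be committed together.

```python
from collections import Counter
from typing import Iterable, Tuple

# Assumption for illustration: each record is (offset, key), with offsets
# strictly increasing in arrival order.
Event = Tuple[int, str]


def batch_count(events: Iterable[Event]) -> Counter:
    """Batch job: count records per key over the full dataset."""
    return Counter(key for _, key in events)


class StreamingCounter:
    """Incremental version of batch_count that tolerates replayed input."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()
        self.committed_offset = -1  # highest offset already reflected in counts

    def process(self, micro_batch: Iterable[Event]) -> None:
        # Drop records that were already applied, e.g. re-delivered after a failure.
        new = [(off, key) for off, key in micro_batch if off > self.committed_offset]
        if not new:
            return
        # A real system would commit the state and the offset in one transaction;
        # here they are simply updated together in memory.
        self.counts.update(key for _, key in new)
        self.committed_offset = max(off for off, _ in new)


if __name__ == "__main__":
    events = [(0, "a"), (1, "b"), (2, "a"), (3, "c"), (4, "a")]
    stream = StreamingCounter()
    stream.process(events[:2])
    stream.process(events[:2])  # replayed micro-batch, ignored
    stream.process(events[2:])
    print(stream.counts == batch_count(events))
```

Because replayed records are skipped by offset, the streaming result matches the batch result on the same data, which is the property the use cases above rely on.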