# Tutorial: Reflections on Almost Two Decades of Research into Stream Processing

This tutorial reflects on nearly two decades of stream-processing research, highlighting
a number of trends and best practices that can be identified in hindsight. It also enumerates a list of directions for future research in stream processing.

NOTABLE SYSTEMS:

  • TinyDB
  • STREAM
  • TelegraphCQ
  • Gigascope
  • Aurora
  • Borealis
  • IBM InfoSphere Streams
  • System S
  • S4
  • Storm
  • Spark Streaming
  • Flink
  • Kafka/Samza
  • MillWheel
  • Google Cloud Dataflow
  • StreamScope

TRENDS:

  • From DSMSs to Big “Streaming” Data Frameworks
    • Combine the low-latency benefits of streaming systems with the scalability properties of Big Data frameworks.
  • Increased Importance of Exact Results
    • Modern systems provide exactly-once processing semantics; some even offer transactional properties.
  • One-pass Computation vs. Replayability
    • Previously, data was kept in memory only briefly, under the assumption of strict low-latency requirements. Systems now use high-throughput persistent message buses such as Kafka, which enable replayability, ease debugging, and simplify fault-tolerance mechanisms.
  • Unification of Batch and Streaming Models
    • Apache Flink and Apache Beam offer unified APIs over batch and streaming data.
  • Domain-specific to General-purpose
    • Early systems targeted specific domains such as wireless sensor networks or network traffic monitoring; modern systems are general-purpose.
  • Richer Window Specifications
    • Time- or count-based windows
    • Session windows (Google Cloud Dataflow and Kafka Streams) and episodes/frames
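Session windows differ from fixed time- or count-based windows in that their extent is data-driven: a session closes after a period of inactivity rather than at a fixed boundary. A minimal batch sketch of this semantics in Python (the `session_windows` helper and its `gap` parameter are illustrative assumptions, not the API of any of the systems named above):

```python
def session_windows(timestamps, gap):
    """Group event timestamps into session windows: a new session
    starts whenever the silence since the previous event exceeds
    `gap` (the inactivity timeout)."""
    sessions, current = [], []
    for t in sorted(timestamps):
        if current and t - current[-1] > gap:
            sessions.append(current)  # gap exceeded: close the session
            current = [t]
        else:
            current.append(t)
    if current:
        sessions.append(current)
    return sessions

# Events at seconds 1, 2, 3 and then 10, 11 with a 5-second
# inactivity gap form two separate sessions.
print(session_windows([1, 2, 3, 10, 11], gap=5))
# → [[1, 2, 3], [10, 11]]
```

In the systems named above this grouping is expressed declaratively per key; the sketch only illustrates the gap-based grouping rule itself.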

## BEST PRACTICES

  • System-wide State Management
  • Simplified Reasoning via Punctuation
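A punctuation is an in-band marker asserting that no tuples with a timestamp at or below a given value will arrive later, which lets an operator finalize buffered state instead of waiting indefinitely. A minimal sketch of this idea (the tuple encoding and the `process_with_punctuations` helper are illustrative assumptions, not any system's API):

```python
def process_with_punctuations(stream):
    """Consume an iterable of ('data', ts, value) and ('punct', ts)
    items. A punctuation at time ts guarantees that no further data
    items with timestamp <= ts will arrive, so everything buffered up
    to ts can be finalized and emitted as one batch."""
    buffer, emitted = [], []
    for item in stream:
        if item[0] == 'data':
            _, ts, value = item
            buffer.append((ts, value))
        else:
            _, ts = item
            # Finalize all tuples the punctuation covers, in time order.
            emitted.append([v for (t, v) in sorted(buffer) if t <= ts])
            buffer = [(t, v) for (t, v) in buffer if t > ts]
    return emitted

stream = [('data', 1, 'a'), ('data', 2, 'b'), ('punct', 2),
          ('data', 3, 'c'), ('punct', 3)]
print(process_with_punctuations(stream))
# → [['a', 'b'], ['c']]
```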