By Bo Lei, Co-Founder & CTO, Fleak
We're excited to announce the open source release of ZephFlow, our lightweight yet powerful data processing framework. After months of internal development and refinement, we're making this tool available to the broader developer community.
Why We Built ZephFlow
Working with data processing frameworks often introduces significant challenges:
Configuring and operating processing clusters consumes substantial resources
Implementing jobs frequently requires complex configuration
Many existing solutions have steep learning curves, even for simpler use cases
ZephFlow was developed to address these pain points by providing a more accessible approach to data transformation that doesn't sacrifice capability.
What is ZephFlow?
ZephFlow is a flexible data processing framework that allows developers to build transformation pipelines using a directed acyclic graph (DAG) structure. It provides:
A simple, expressive DSL for defining data transformations
Support for both SQL and our custom Fleak Eval Expression Language (FEEL)
The ability to run as a standalone process, within a JVM application, or as an HTTP service
Powerful operators for filtering, transforming, and validating data
ZephFlow adapts to your workload. For smaller workloads, it can run as a synchronous API backend where multiple jobs share resources. For more demanding scenarios, it can run as an asynchronous data pipeline with dedicated resources.
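To make the DAG idea concrete, here is a minimal sketch in plain Python. It does not use ZephFlow's DSL, SQL, or FEEL, and none of the names below (run_dag, NODES, EDGES, the operator functions) come from ZephFlow; they are hypothetical and exist only to show how events can flow through filter, transform, and validate nodes, with one branch fanning out to a second sink.

```python
from collections import deque


def run_dag(nodes, edges, source, events):
    """Push events through a graph of operators, starting at `source`.

    nodes: name -> function taking one event (dict) and returning a list
           of events (an empty list drops the event).
    edges: name -> list of downstream node names.
    Returns a dict mapping each terminal node to the events it emitted.
    Note: this sketch assumes the graph is acyclic and does not check it.
    """
    outputs = {name: [] for name in nodes if not edges.get(name)}
    queue = deque((source, event) for event in events)
    while queue:
        name, event = queue.popleft()
        downstream = edges.get(name, [])
        for out in nodes[name](event):
            if not downstream:
                outputs[name].append(out)
            for nxt in downstream:
                # Copy so that parallel branches never share mutable state.
                queue.append((nxt, dict(out)))
    return outputs


def only_errors(event):
    # Filter: keep ERROR-level events, drop everything else.
    return [event] if event.get("level") == "ERROR" else []


def normalize(event):
    # Transform: trim whitespace and lowercase the message.
    return [{**event, "message": event["message"].strip().lower()}]


def validate(event):
    # Validate: fail loudly if the message ended up empty.
    if not event["message"]:
        raise ValueError("event has an empty message")
    return [event]


def alert(event):
    # Second branch: emit a compact alert record from the raw event.
    return [{"alert": event["message"].strip()}]


# Hypothetical graph: only_errors fans out to two branches.
NODES = {
    "only_errors": only_errors,
    "normalize": normalize,
    "validate": validate,
    "alert": alert,
}
EDGES = {
    "only_errors": ["normalize", "alert"],
    "normalize": ["validate"],
}

events = [
    {"level": "ERROR", "message": "  Disk FULL "},
    {"level": "INFO", "message": "heartbeat"},
    {"level": "ERROR", "message": "Timeout"},
]
result = run_dag(NODES, EDGES, "only_errors", events)
```

In this sketch, `result["validate"]` holds the normalized ERROR events and `result["alert"]` holds the alert records from the other branch. A real framework layers much more on top of this (sources, sinks, error handling, an expression language), so treat it as a mental model rather than a description of ZephFlow's internals.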
Getting Started
Here's a simple example of a ZephFlow pipeline:
Key Benefits
Simplicity: Define complex transformations with minimal code
Flexibility: Run anywhere, from your local development environment to production services
Resource Efficiency: Process data without excessive infrastructure overhead
Expressiveness: Leverage SQL or FEEL for powerful data manipulations
Use Cases
We've successfully deployed ZephFlow for various internal use cases:
Log processing and normalization
ETL workloads
Event streaming and transformation
Data validation and enrichment
The framework is particularly effective when you need powerful transformation capabilities without the operational complexity of larger distributed systems.
Get Involved
Check out the documentation to learn more about ZephFlow and how to use it. The source code is available on GitHub.
We welcome contributions, feedback, and feature requests. Join us in building a more efficient approach to data processing.