We are excited to announce the open source release of ZephFlow, our lightweight yet powerful data processing framework.

By Bo Lei, Co-Founder & CTO, Fleak
After months of internal development and refinement, we're making ZephFlow, our lightweight data processing framework, available to the broader developer community as open source.
Why We Built ZephFlow
Working with data processing frameworks often introduces significant challenges:
Configuring and operating processing clusters consumes substantial resources
Implementing jobs frequently requires complex configuration
Many existing solutions have steep learning curves, even for simpler use cases
ZephFlow was developed to address these pain points by providing a more accessible approach to data transformation that doesn't sacrifice capability.
What is ZephFlow?
ZephFlow is a flexible data processing framework that allows developers to build transformation pipelines using a directed acyclic graph (DAG) structure. It provides:
A simple, expressive DSL for defining data transformations
Support for both SQL and our custom Fleak Eval Expression Language (FEEL)
The ability to run as a standalone process, within a JVM application, or as an HTTP service
Powerful operators for filtering, transforming, and validating data
ZephFlow can adapt to your specific needs. It can run as a synchronous API backend for smaller workloads where multiple jobs share resources, or as an asynchronous data pipeline with dedicated resources for more demanding scenarios.
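To make the DAG idea concrete, here is a minimal, framework-agnostic sketch in plain Python of a filter, transform, and validate chain wired together as a graph. It does not use ZephFlow's DSL or API. The names `Dag`, `filter_op`, `transform_op`, and `validate_op`, and the sample log events, are hypothetical and exist only for this illustration. It assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical operator constructors for illustration; NOT ZephFlow's API.
def filter_op(predicate):
    """Keep only the events for which predicate(event) is true."""
    return lambda events: [e for e in events if predicate(e)]


def transform_op(fn):
    """Replace each event with fn(event)."""
    return lambda events: [fn(e) for e in events]


def validate_op(check):
    """Pass events through unchanged, raising if any event fails check."""
    def run(events):
        for e in events:
            if not check(e):
                raise ValueError(f"validation failed for event: {e!r}")
        return events
    return run


class Dag:
    """A tiny operator graph: each node consumes the output of its parents."""

    def __init__(self):
        self.ops = {}      # node name -> operator callable
        self.parents = {}  # node name -> list of upstream node names

    def add(self, name, op, after=()):
        self.ops[name] = op
        self.parents[name] = list(after)
        return self  # allow chained calls

    def run(self, events):
        """Run nodes in topological order and return the leaf nodes' outputs."""
        order = TopologicalSorter(self.parents).static_order()
        outputs = {}
        for name in order:
            ups = self.parents[name]
            # Root nodes read the pipeline input; other nodes read the
            # concatenated output of their parents.
            inputs = events if not ups else [e for p in ups for e in outputs[p]]
            outputs[name] = self.ops[name](inputs)
        has_child = {p for ps in self.parents.values() for p in ps}
        return {n: out for n, out in outputs.items() if n not in has_child}


dag = (
    Dag()
    .add("keep_errors", filter_op(lambda e: e["level"] == "ERROR"))
    .add(
        "normalize",
        transform_op(lambda e: {"severity": e["level"].lower(),
                                "message": e["msg"].strip()}),
        after=["keep_errors"],
    )
    .add("check", validate_op(lambda e: e["message"] != ""), after=["normalize"])
)

events = [
    {"level": "INFO", "msg": "started"},
    {"level": "ERROR", "msg": "  disk full "},
]
result = dag.run(events)
print(result)  # {'check': [{'severity': 'error', 'message': 'disk full'}]}
```

Because every node declares only its upstream nodes, adding a branch (for example, a second sink reading from `normalize`) is one more `add` call and leaves the existing nodes untouched.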
Getting Started
Here's a simple example of a ZephFlow pipeline:
Key Benefits
Simplicity: Define complex transformations with minimal code
Flexibility: Run anywhere, from your local development environment to production services
Resource Efficiency: Process data without excessive infrastructure overhead
Expressiveness: Leverage SQL or FEEL for powerful data manipulations
Use Cases
We've successfully deployed ZephFlow for various internal use cases:
Log processing and normalization
ETL workloads
Event streaming and transformation
Data validation and enrichment
The framework is particularly effective when you need powerful transformation capabilities without the operational complexity of larger distributed systems.
Get Involved
Check out the documentation to learn more about ZephFlow and how to use it. The source code is available on GitHub.
We welcome contributions, feedback, and feature requests. Join us in building a more efficient approach to data processing.