Announcing ZephFlow: A Lightweight Data Processing Framework Now Open Source

We are excited to announce the open source release of ZephFlow, our lightweight yet powerful data processing framework.

By Bo Lei, Co-Founder & CTO, Fleak

After months of internal development and refinement, we're making ZephFlow available to the broader developer community.

Why We Built ZephFlow

Working with data processing frameworks often introduces significant challenges:

  • Configuring and operating processing clusters consumes substantial resources

  • Implementing jobs frequently requires complex configuration

  • Many existing solutions have steep learning curves, even for simpler use cases

ZephFlow was developed to address these pain points by providing a more accessible approach to data transformation that doesn't sacrifice capability.

What is ZephFlow?

ZephFlow is a flexible data processing framework that allows developers to build transformation pipelines using a directed acyclic graph (DAG) structure. It provides:

  • A simple, expressive DSL for defining data transformations

  • Support for both SQL and our custom Fleak Eval Expression Language (FEEL)

  • The ability to run as a standalone process, within a JVM application, or as an HTTP service

  • Powerful operators for filtering, transforming, and validating data

ZephFlow can adapt to your specific needs. It can run as a synchronous API backend for smaller workloads where multiple jobs share resources, or as an asynchronous data pipeline with dedicated resources for more demanding scenarios.
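To make the DAG idea concrete, here is a minimal sketch in plain Java of operators connected as graph nodes, with one source fanning out to two branches. It does not use ZephFlow and is not meant to describe how ZephFlow is implemented; the DagSketch and Node names and the string-based records are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class DagSketch {
    /** One node in the graph: applies an operator to a batch and forwards the result downstream. */
    static final class Node {
        final String name;
        final Function<List<String>, List<String>> operator;
        final List<Node> downstream = new ArrayList<>();
        final List<String> output = new ArrayList<>(); // filled only on leaf nodes

        Node(String name, Function<List<String>, List<String>> operator) {
            this.name = name;
            this.operator = operator;
        }

        /** Adds an edge from this node to next and returns next so calls can be chained. */
        Node to(Node next) {
            downstream.add(next);
            return next;
        }

        /** Runs the operator, then pushes the result along every outgoing edge. */
        void push(List<String> batch) {
            List<String> result = operator.apply(batch);
            if (downstream.isEmpty()) {
                output.addAll(result);
            }
            for (Node next : downstream) {
                next.push(result);
            }
        }
    }

    public static void main(String[] args) {
        Node source = new Node("source", batch -> batch);
        Node errors = new Node("errors", batch -> batch.stream()
                .filter(line -> line.startsWith("ERROR"))
                .collect(Collectors.toList()));
        Node upper = new Node("upper", batch -> batch.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList()));

        // Two edges leave the source, so the same batch flows down both branches.
        source.to(errors);
        source.to(upper);

        source.push(List.of("ERROR disk full", "info started"));
        System.out.println(errors.output); // [ERROR disk full]
        System.out.println(upper.output);  // [ERROR DISK FULL, INFO STARTED]
    }
}
```

Because each node only knows its outgoing edges, adding a branch means adding one edge; the other operators do not change.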

Getting Started

Here's a simple example of a ZephFlow pipeline:

// Create a flow that filters and transforms data
ZephFlow flow = ZephFlow.startFlow();
ZephFlow processedFlow = flow
    .kafkaSource("broker:9092", "input-topic", "consumer-group", EncodingType.JSON_OBJECT, null)
    .filter("$.status == 'success'")  // FEEL expression
    .eval("dict_merge($, dict(processed_at=epoch_to_ts_str($.timestamp, \"yyyy-MM-dd'T'HH:mm:ss\")))")
    .kafkaSink("broker:9092", "output-topic", null, EncodingType.JSON_OBJECT, null);

// Execute the flow
processedFlow.execute("job_id", "env", "service");
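For readers new to FEEL, the following plain-Java sketch approximates what the filter and eval steps above appear to do to each event: drop anything whose status is not "success", then merge in a processed_at field formatted from the event's timestamp. It uses neither ZephFlow nor Kafka. It assumes that $.timestamp holds epoch milliseconds and that the output is rendered in UTC, neither of which is stated above; the class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PipelineSemanticsSketch {
    // Same pattern string as in the eval expression; UTC is an assumption.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss").withZone(ZoneOffset.UTC);

    /** Approximates the filter step: keep only events whose status is "success". */
    static boolean keep(Map<String, Object> event) {
        return "success".equals(event.get("status"));
    }

    /** Approximates the eval step: copy the event and add a formatted processed_at field. */
    static Map<String, Object> enrich(Map<String, Object> event) {
        Map<String, Object> out = new LinkedHashMap<>(event);
        // Assumption: the timestamp field is a number of milliseconds since the epoch.
        long epochMillis = ((Number) event.get("timestamp")).longValue();
        out.put("processed_at", FORMAT.format(Instant.ofEpochMilli(epochMillis)));
        return out;
    }

    /** Applies the filter and then the enrichment to every event, in order. */
    static List<Map<String, Object>> process(List<Map<String, Object>> events) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> event : events) {
            if (keep(event)) {
                result.add(enrich(event));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> input = List.of(
                Map.of("status", "success", "timestamp", 0L),
                Map.of("status", "failed", "timestamp", 1000L));
        // Only the first event survives the filter; it gains a processed_at field.
        System.out.println(process(input));
    }
}
```

In the pipeline above the same logic is written as two short expressions, and ZephFlow handles reading from and writing to Kafka.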

Key Benefits

  1. Simplicity: Define complex transformations with minimal code

  2. Flexibility: Run anywhere, from your local development environment to production services

  3. Resource Efficiency: Process data without excessive infrastructure overhead

  4. Expressiveness: Leverage SQL or FEEL for powerful data manipulations

Use Cases

We've successfully deployed ZephFlow for various internal use cases:

  • Log processing and normalization

  • ETL workloads

  • Event streaming and transformation

  • Data validation and enrichment

The framework is particularly effective when you need powerful transformation capabilities without the operational complexity of larger distributed systems.

Get Involved

Check out the documentation to learn more about ZephFlow and how to use it. The source code is available on GitHub.

We welcome contributions, feedback, and feature requests. Join us in building a more efficient approach to data processing.

Start Building with Fleak Today

Production Ready AI Data Workflows in Minutes

Request a Demo
