Case Study Details

Global

Building an open-source AI tool that puts LLM power on any device

Industries:

AI, edge computing, open-source

Duration:

1 year

Team:

3 core contributors (2 ML engineers, 1 OSS architect)

Services:

CLI tool in Docker, edge-optimized ONNX output

Technologies:

GitHub Actions, LLMs, ONNX.js, Rust, Swift

Integrations:

Community-led, stewarded by our team

DevOps:

GitHub CI, Flake8, Pytest, Pre-commit

Project Overview

The WhiteLightning project started with a question: Does text classification need the cloud every time?

It began as an internal experiment during our ML hack days, aimed at exploring what could be done with less. LLMs are great, but for most real-world use cases, you don’t need a 175B-parameter model on standby. Instead, you need something fast, portable, and private. Something that works offline, ships inside your app, and doesn’t rack up API bills.

Instead of running LLMs at runtime, we use them once to generate synthetic training data, then distill that data into a compact, ONNX-based model that runs anywhere. No cloud, no lock-in, no friction. Just a simple way to go from idea to working classifier on your terms.
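
As a rough illustration of the distill step, here is a minimal Python sketch. It is not WhiteLightning's actual training code: the two hard-coded examples stand in for LLM-generated data, and scikit-learn plus skl2onnx are assumptions chosen for brevity, not project dependencies.

# Minimal sketch of "distill into ONNX": train a tiny text classifier on
# LLM-labeled data, then export it as a portable ONNX file.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# Stand-ins for the synthetic labeled data a one-time LLM call would produce.
texts = ["great product, works offline", "terrible, keeps crashing"]
labels = [1, 0]  # 1 = positive, 0 = negative

# A stateless, fixed-size featurizer keeps the exported model self-contained.
vec = HashingVectorizer(n_features=2**12)
X = vec.transform(texts).toarray().astype("float32")

clf = LogisticRegression().fit(X, labels)

# Export to ONNX so the same model runs under any ONNX runtime.
onnx_model = convert_sklearn(
    clf, initial_types=[("features", FloatTensorType([None, X.shape[1]]))]
)
with open("classifier.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())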

95% cheaper

Uses LLMs once for data generation (~$0.01 vs $1–10 per query)

1 MB model size

Easily fits in mobile apps, kiosks, or embedded firmware

10–15 min training time

Generate a binary classifier on a laptop in minutes

2,520 texts/sec inference speed

That’s about 0.4 ms per input on commodity CPUs (see the timing sketch after this list)

512 kB RAM usage

Runs on low-power hardware like Raspberry Pi Zero

8 languages and runtimes supported

Identical logits across Python, Rust, Swift, and more

100% offline-ready

No cloud, no vendor lock-in, no latency risks

All-platform deployment

ONNX.js (web), iOS/Android (mobile), MCUs (embedded), laptops (desktop)
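
Throughput numbers like these are straightforward to sanity-check. The timing loop below is a hedged sketch using onnxruntime on CPU; it reuses the classifier.onnx file, the "features" input name, and the 4096-dimension feature size from the distillation sketch above, which are illustrative assumptions rather than WhiteLightning's real interface.

import time
import numpy as np
import onnxruntime as ort

# Time single-input inference on CPU for the model exported above.
sess = ort.InferenceSession("classifier.onnx", providers=["CPUExecutionProvider"])
x = np.random.rand(1, 4096).astype(np.float32)

n = 1000
start = time.perf_counter()
for _ in range(n):
    sess.run(None, {"features": x})
elapsed = time.perf_counter() - start
print(f"{elapsed / n * 1e3:.2f} ms per input, {n / elapsed:.0f} inputs/sec")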

Who it’s for

WhiteLightning is made for builders who don’t want to rent intelligence from the
cloud.

Indie developer

Add sentiment analysis to a desktop app without paying per query or relying on the cloud — it just works offline, out of the box.

Real-world use cases

WhiteLightning was developed to make intelligent text classification possible
anywhere, even in environments where cloud access is limited, restricted, or simply
not allowed.

Personal productivity and desktop apps

Smart quick-add (e.g., calendar vs. task vs. reminder)
Auto-tagging notes in Obsidian or Notion
Gmail-style inbox tabs, fully offline

Comms safety and moderation

Console games with offline parental controls
Secure chat platforms enforcing code of conduct
SMS spam filtering on Android ROMs

Healthcare and life sciences

Patient triage kiosks (e.g., refill request vs. symptoms)
Symptom classifiers for medical wearables
Transcription flaggers for allergy or dosage mentions

Customer support and compliance

On-prem ticket routing for banks and hospitals
VoIP transcription classifiers
Contact-center QA inside closed networks

IoT, automotive, and smart devices

Offline voice commands for home automation
In-car NLP for media or navigation
Industrial alarm logs classified by risk level

Developer and DevOps tools

GitHub bots tagging issue types
CI pipelines detecting secrets or tone
IDE extensions nudging for better commit messages

Education

Adaptive e-readers
Captioning systems detecting topic shifts

OEM/Embedded hardware

Router firmware with built-in parental filters
3D printer UI explaining G-code errors

Yes, it runs even on a potato

WhiteLightning was built to work even on extremely low-spec hardware. With models under 1 MB, no runtime dependencies, and ONNX compatibility, it runs smoothly on:

Raspberry Pi Zero
Old laptops
Budget Android phones
In-browser via ONNX.js
Microcontrollers with limited RAM

If it can run Python or Rust, it can run WhiteLightning.

Why WhiteLightning delivers

1. Pay once for LLM access (via OpenRouter or another API)

2. Generate synthetic labeled data from your task prompt (sketched after this list)

3. Distill into a 1 MB model that runs forever, for free, anywhere
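
For step 2, the one-time call might look like the sketch below. Only the OpenRouter endpoint is real; the model name, prompt, and expected JSON shape are illustrative stand-ins, not WhiteLightning's actual prompts.

import json
import requests

# One paid LLM call generates a batch of synthetic labeled examples.
resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": "Bearer <OPENROUTER_API_KEY>"},
    json={
        "model": "openai/gpt-4o-mini",  # any OpenRouter-served model works
        "messages": [{
            "role": "user",
            "content": (
                "Generate 50 short product reviews as a JSON array of "
                'objects like {"text": "...", "label": "positive"} or '
                '{"text": "...", "label": "negative"}. Return only JSON.'
            ),
        }],
    },
    timeout=120,
)
examples = json.loads(resp.json()["choices"][0]["message"]["content"])
print(f"Got {len(examples)} synthetic examples for distillation")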

OSS as a collaboration model

WhiteLightning is 100% open-source under GPL-3.0. The classifiers it generates are
MIT-licensed and yours to use in commercial apps.

Our team maintains it publicly:

GitHub issues, PRs, and test matrix are open

CI/CD runs on every PR via GitHub Actions

Dev chat happens on Discord

Docker image is published on GHCR (ghcr.io)

Contact us

Get in touch with us

Whether you need assistance or want to discuss our services,
feel free to get in touch with us.