Skip to content

Compare

ForestHub vs LiteRT-LM

This comparison is different from the others on this hub: LiteRT-LM and ForestHub are not competitors. LiteRT-LM is Google's on-device inference engine, the layer that loads a model and generates tokens. ForestHub is the orchestration platform above that layer. The honest framing is engine versus platform, and the most interesting question is how they combine.

Facts as of June 2026View sources and date

LiteRT-LM is the better fit if

  • The task is running a quantized LLM on-device with maximum hardware acceleration
  • GPU and NPU execution, speculative decoding and multimodal input matter most
  • An app embeds inference directly via C++, Kotlin or Python APIs

ForestHub is the better fit if

  • The model is one component and the real problem is workflow, protocols and governance
  • LLM decisions must be embedded in auditable graphs with rules and actions around them
  • Several models and providers need routing, local and cloud

Side by side

DimensionForestHubLiteRT-LM
CategoryEdge AI and agents orchestration platformOn-device LLM inference engine
LicenseOpen source runtime (AGPL-3.0, ForestHubAI/edge-agents), commercial backendApache 2.0. External code contributions are currently not accepted
DeploymentEngine as a Docker image on Linux edge devices (amd64 and arm64), runs on premiseC++ library and CLI for Android, iOS, Windows, macOS, Linux and Web. GPU and NPU on selected platforms
Industrial protocolsMQTT first-party, HTTP and REST APIsNone documented. Scope is model loading, sessions and token generation
Ecosystem and integrationsMulti-LLM routing, knowledge bases (RAG), HTTP APIsGemma, Llama, Phi-4-mini and Qwen models in the .litertlm format, OpenAI-compatible server mode
AI and agentsGraph-first agents. The LLM is one node among many, every run is recorded and replayableFunction calling with constrained decoding. No orchestration or workflow layer documented
PricingRuntime free and open source (AGPL-3.0), platform signup at app.foresthub.aiFree under Apache 2.0. Model weights carry their own terms (for example Gemma)

All LiteRT-LM entries follow the linked sources below and reflect the state of June 2026.

Two layers, not two rivals

The documented scope of LiteRT-LM is LLM inference: loading models, managing sessions, generating tokens, parsing tool calls. Industrial protocols, hardware I/O or workflow features are not part of its documented scope. It is the runtime behind on-device AI in Google products such as Chrome and ChromeOS.

ForestHub sits one layer up. It orchestrates what happens before and after a model call: triggers, machine data over MQTT, rule nodes, knowledge bases, actions and the audit trail. An inference engine is something ForestHub uses, not something it competes with.

What LiteRT-LM is genuinely good at

As an inference engine LiteRT-LM is state of the art for its targets: CPU on all platforms, GPU on Android, iOS, macOS, Windows and Linux, NPU on Android with Windows in early preview. Speculative decoding via multi-token prediction reaches up to a 2.2x decode speedup for Gemma 4 by Google's own measurement.

Constrained decoding enforces structured output, for example against a JSON schema, at the sampling level. Quantized Gemma, Llama, Phi-4-mini and Qwen models ship in the .litertlm container format, with vision and audio input supported on-device.

Running LiteRT-LM under ForestHub

Since v0.13.0 LiteRT-LM ships an OpenAI-API-compatible server mode. That is the natural integration point: the engine serves a local model over HTTP, and a ForestHub graph reaches it like any other local service.

The division of labor is clean. LiteRT-LM decides how fast and how well a model runs on the silicon. ForestHub decides when the model is consulted, what data it sees, what it may trigger and how the decision is recorded.

Openness, read carefully

LiteRT-LM is Apache 2.0 licensed, but the repository currently accepts no external code contributions, and model weights such as Gemma carry their own terms of use. Free to use, Google-steered in direction.

ForestHub's runtime is open source too, under AGPL-3.0 with the repository public on GitHub (ForestHubAI/edge-agents). The platform backend around it is commercial, and the direction is edge agents for industrial use.

An honest recommendation

For embedding inference into a mobile or desktop app, LiteRT-LM is one of the strongest choices and ForestHub is not an alternative. The comparison only becomes real at the orchestration layer.

For agents at the industrial edge, an inference engine alone leaves the actual work open: protocols, rules, audit, actions. There the realistic setup is both, a local model behind an engine like LiteRT-LM inside a ForestHub graph.

Frequently asked questions

Is LiteRT-LM an alternative to ForestHub?

No, the two solve different layers. LiteRT-LM runs LLMs on-device, ForestHub orchestrates agents and workflows that use such models. An alternative to LiteRT-LM would be another inference runtime. An alternative to ForestHub would be another orchestration platform.

Can agentic workflows be built with LiteRT-LM?

LiteRT-LM provides building blocks for agents: function calling with constrained decoding and an OpenAI-compatible server mode. Orchestration, state, protocols and audit are outside its documented scope. Those come from a layer above, for example Google's ADK in the app world or ForestHub on industrial Linux edge devices.

Does LiteRT-LM run on Linux edge devices?

Yes. CPU inference is supported on Linux, GPU support is listed for Linux too, and the CLI plus the OpenAI-compatible server mode make it usable headless. NPU support as of June 2026 targets Android, with Windows in early preview.

Is LiteRT-LM the same as LiteLLM?

No. LiteRT-LM is Google's on-device inference engine for running quantized models locally. LiteLLM is a proxy and SDK that unifies calls to 100+ hosted LLM APIs in the OpenAI format. The names are similar, the jobs sit at nearly opposite ends of the stack.

Sources and date

All statements about LiteRT-LM on this page were checked against the sources below, last verified on June 12, 2026. If something is outdated, a short note to the team is enough and the page gets corrected.

Build a graph, replay a run

At app.foresthub.ai, author a workflow as a graph in the visual builder and deploy the engine to a Linux edge device. For enterprise evaluations, the team walks through architecture, audit and rollout questions together.