Talk to Claude, Gemini, Qwen, or OpenAI while Ruflo invokes the same ~210 MCP tools the CLI uses โ agent orchestration, persistent memory, swarm coordination, code review, GitHub ops โ directly from chat.
6 curated frontier models out-of-the-box โ Qwen 3.6 Max, Claude Sonnet 4.6, Gemini 2.5 Pro, and more via OpenRouter. Add your own: any OpenAI-compatible endpoint.
5 server groups (Core, Intelligence, Agents, Memory, DevTools) plus an 18-tool gallery that runs entirely in your browser โ works offline.
One model response can fire 4โ6+ tools at the same time. The UI shows them as cards with a "Step N โ X tools completed" badge so you can see exactly what ran.
Say "remember my favorite color is indigo" and ask weeks later โ Flo recalls it. Backed by AgentDB + HNSW vector search (โฅ150ร faster than brute force).
Add any MCP endpoint (HTTP, SSE, or stdio) from the chat input. Your tools join the native ones in the same parallel-execution flow. Run a local MCP server on localhost:3000 and it just works.
Flo is shipped as Docker with embedded MongoDB. Deploy to your own Cloud Run, Fly, Kubernetes, or docker-compose. The hosted demo is one option; running your own is fully supported.
A typical Flo conversation โ the model plans, the tools execute in parallel, and the results feed back into the response.