Introduction — What the Middleware Does
Admin Bud-E is LAION's open-source middleware and user management layer for AI assistants. It sits between the browser-based front end (e.g. School Bud-E) and different AI providers (Google Vertex AI, Mistral, Together, and others).
How It Works
The front end runs entirely in the user's browser; conversation histories are stored locally on the user's device (no server-side chat storage). The middleware, in turn:
- Receives API requests from the front end
- Forwards them to configured providers
- Measures usage
- Keeps per-user and per-project credit balances
Key Benefits
Users don't need to register with external vendors or manage multiple API keys. Instead, the admin of your organization issues one single "Universal API key" for each user (or class, or group).
This key:
- Is not linked to personal data (no names or e-mails required)
- Works for all capabilities: LLM, VLM, TTS, and ASR
- Makes operations simpler and supports GDPR-compliant setups
Usage Measurement
The middleware counts usage in units that match the technology:
| Technology | Unit | Description |
|---|---|---|
| LLM/VLM | Tokens | Words are split into small chunks |
| TTS | Characters | Each character counts toward usage |
| ASR | Tokens or Time | Provider-reported tokens or time-based (per hour of audio) |
This keeps cost drivers transparent:
- Longer answers consume more tokens
- Frequent TTS playback increases character usage
- ASR is typically inexpensive when billed per hour
A non-specialist can therefore control the two main levers—answer length and TTS share—confidently.
OpenAI-Compatible Proxy
Admin Bud-E includes a small OpenAI-compatible proxy for Google Vertex AI. This means clients can talk to Gemini models via the familiar OpenAI API shape (POST /v1/chat/completions, Authorization: Bearer …) while the middleware translates the requests to Vertex under the hood.
Terminology
- LLM — Large Language Model: generates/understands text
- VLM — Vision-Language Model: understands images + text
- TTS — Text-to-Speech: converts written text to natural speech
- ASR — Automatic Speech Recognition: transcribes speech to text