mdGPT Gateway

A FastAPI-based LLM gateway with token streaming, OpenTelemetry-based observability, and per-user namespace isolation. It is the production backbone behind multi-tenant chat and deep-research products.

Role

Backend architecture & build

Domain

LLM infrastructure

Core stack

FastAPI · Redis · SSE

Status

In production

Overview

Shipping LLM products to real users means solving the unglamorous parts: streaming responses with low latency, isolating each tenant's data, and seeing exactly what the model did when something goes wrong. mdGPT Gateway is the production layer that makes those guarantees.

It sits between applications and the inference stack, exposing a clean streaming API while enforcing per-user isolation and emitting full traces for every request, so that chat assistants and long-running deep-research agents stay observable and safe to operate.
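One way to picture per-user isolation is to derive every storage key from the caller's identity, so that one tenant's request can never address another tenant's data. The sketch below is illustrative only: the `ns:{user_id}:...` key layout, the `namespaced_key` helper, and the dict-backed `NamespacedStore` (a stand-in for a key-value store such as Redis) are assumptions, not the gateway's actual implementation.

```python
import re
from typing import Dict, Optional

# Hypothetical rule: key components may only contain these characters, so a
# caller cannot smuggle the ":" separator into an ID and escape its namespace.
_SAFE_PART = re.compile(r"[A-Za-z0-9_-]{1,64}")


def namespaced_key(user_id: str, resource: str, item_id: str) -> str:
    """Build a storage key scoped to a single user's namespace."""
    for part in (user_id, resource, item_id):
        if not _SAFE_PART.fullmatch(part):
            raise ValueError(f"unsafe key component: {part!r}")
    return f"ns:{user_id}:{resource}:{item_id}"


class NamespacedStore:
    """Dict-backed stand-in for a key-value store (assumed, e.g. Redis)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, user_id: str, resource: str, item_id: str, value: str) -> None:
        self._data[namespaced_key(user_id, resource, item_id)] = value

    def get(self, user_id: str, resource: str, item_id: str) -> Optional[str]:
        # The lookup key is always built from the caller's own user_id.
        return self._data.get(namespaced_key(user_id, resource, item_id))
```

With this layout, a value written under one user is simply not reachable through another user's key, and malformed IDs are rejected before they touch the store.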

What it provides

Token streaming over Server-Sent Events for responsive, real-time output.
Per-user namespace isolation for strict multi-tenant data separation.
Full observability via OpenTelemetry traces and metrics on every call.
Async throughput with asyncpg and Redis for high concurrency.
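To make the token-streaming item concrete, here is a minimal sketch of Server-Sent Events framing using only the standard library. The `fake_model_tokens` generator, the `done` event name, and the `[DONE]` sentinel are assumptions for illustration; the real gateway's token source and event vocabulary are not described here.

```python
import asyncio
from typing import AsyncIterator, List, Optional


def sse_event(data: str, event: Optional[str] = None) -> str:
    """Format one SSE frame: optional event name, data lines, blank line."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload is sent as one "data:" field per line.
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"


async def fake_model_tokens(prompt: str) -> AsyncIterator[str]:
    """Hypothetical token source; splits on whitespace for simplicity."""
    for token in prompt.split():
        await asyncio.sleep(0)  # yield control, as a real model call would
        yield token


async def stream_completion(prompt: str) -> AsyncIterator[str]:
    """Emit one SSE frame per token, then an assumed terminal 'done' event."""
    async for token in fake_model_tokens(prompt):
        yield sse_event(token)
    yield sse_event("[DONE]", event="done")


async def collect(prompt: str) -> List[str]:
    """Gather all frames into a list, for inspection or testing."""
    return [frame async for frame in stream_completion(prompt)]
```

In a FastAPI route, a generator like `stream_completion` could be handed to `StreamingResponse(..., media_type="text/event-stream")` so that each frame is flushed to the client as soon as it is produced.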
