Commit a76c9f0
Add audio, moderations, and tokenize/detokenize endpoint support

Register previously missing OpenAI-compatible routes so they are no
longer rejected with 404:
- /v1/audio/transcriptions, /v1/audio/translations (multipart/form-data)
- /v1/audio/speech (JSON)
- /v1/moderations
- /tokenize and /detokenize (vLLM extension)

Add BackendModeAudio and handleAudioInference, which extracts the model
field from multipart form data. vLLM passes audio requests through
natively; llama.cpp returns a descriptive error directing users to the
chat completions input_audio content-part instead.

Address review feedback:
- extract a shared scheduleInference helper to restore parity
  (auto-install progress, preload-only, recorder, origin tracking)
  between handleOpenAIInference and handleAudioInference
- fix the multipart temp-file leak (defer MultipartForm.RemoveAll)
- tighten /v1/audio/ path matching to explicit HasSuffix checks
- make the Content-Type check case-insensitive

Replace the Go routing layer with a Rust reverse proxy (axum + tokio)
compiled as a CGo-linked staticlib (router/). The Rust router owns:
- All route registration and path aliasing (/v1/ -> /engines/, etc.)
- CORS middleware matching the Go CorsMiddleware semantics
- Path normalisation (NormalizePathLayer)
- Static routes: GET / and GET /version

The deleted Go files (pkg/routing/router.go, pkg/routing/routing.go,
pkg/middleware/alias.go) are fully replaced. pkg/routing/service.go is
trimmed; main.go registers Ollama, Anthropic, and Responses handlers
directly on the backend mux.

The in-process CGo callback path (pkg/router/handler.go) replaces the
network proxy hop: Rust calls Go's http.Handler directly via a streaming
protocol in which dmr_write_chunk()/dmr_close_stream() push response
chunks into a tokio::sync::mpsc channel as Go writes them, so streaming
endpoints like POST /models/create deliver progress in real time without
buffering the full response.

Rust shared utilities are extracted into a dmr-common workspace crate
(init_tracing, unix_now_secs). model-cli Rust code is deduplicated:
shared send_and_check free function, request_timeout helper,
apply_azure_version helper, build_app/run_gateway_async extracted to
handlers.rs.

Build system:
- Cargo workspace root (Cargo.toml) unifies all Rust crates
- make build-router-lib compiles the Rust staticlib before go build
- Dockerfile installs Rust and builds libdmr_router.a in the builder
  stage
- CI test job installs the Rust toolchain and builds the library so
  go test -race works with CGo enabled
- Platform-split CGo LDFLAGS: Darwin keeps -framework flags, Linux
  uses plain -lpthread/-ldl/-lm
- pkg/router/router_stub.go provides a no-op implementation for
  CGO_ENABLED=0 builds (lint, cross-compilation)

Signed-off-by: Eric Curtin <eric.curtin@docker.com>

1 parent: 1304519

File tree
42 files changed, +3025 -700 lines changed
- .github/workflows
- cmd/cli
- dmr-common
  - src
- model-cli
  - src
    - providers
- pkg
  - inference
    - backends
      - diffusers
      - llamacpp
      - mlx
      - sglang
      - vllm
    - scheduling
  - middleware
  - router
  - routing
- router
  - src