Commit 0f37ac7

Merge pull request #607 from docker/improve-safetensors-packaging
Improve Safetensors packaging
2 parents 232f06b + 707326b commit 0f37ac7

11 files changed

Lines changed: 1378 additions & 8 deletions

pkg/distribution/MODEL_TYPES.md

Lines changed: 206 additions & 0 deletions
@@ -0,0 +1,206 @@
# Model Types and Interfaces

This document explains the model types and interfaces in the distribution package.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                            oci.Image                            │
│    (Low-level OCI artifact: Layers, Manifest, Digest, etc.)     │
└───────────────────────────────┬─────────────────────────────────┘
                                │ embeds
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                       types.ModelArtifact                       │
│          (Building & pushing: ID, Config, Descriptor)           │
└───────────────────────────────┬─────────────────────────────────┘
                                │ implemented by
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                        partial.BaseModel                        │
│       (Common implementation with LayerList, ConfigFile)        │
└─────────────────────────────────────────────────────────────────┘
```

## Core Interfaces

### `oci.Image` (pkg/distribution/oci/image.go)

Low-level OCI artifact operations for registry storage.

```go
type Image interface {
	Layers() ([]Layer, error)
	MediaType() (MediaType, error)
	Size() (int64, error)
	ConfigName() (Hash, error)
	ConfigFile() (*ConfigFile, error)
	RawConfigFile() ([]byte, error)
	Digest() (Hash, error)
	Manifest() (*Manifest, error)
	RawManifest() ([]byte, error)
	LayerByDigest(Hash) (Layer, error)
	LayerByDiffID(Hash) (Layer, error)
}
```

### `types.ModelArtifact` (pkg/distribution/types/model.go)

For building and distributing models. Extends `oci.Image`.

```go
type ModelArtifact interface {
	ID() (string, error)
	Config() (ModelConfig, error)
	Descriptor() (Descriptor, error)
	oci.Image
}
```

### `types.Model` (pkg/distribution/types/model.go)

Stored model with file path resolution for inference.

```go
type Model interface {
	ID() (string, error)
	GGUFPaths() ([]string, error)
	SafetensorsPaths() ([]string, error)
	DDUFPaths() ([]string, error)
	ConfigArchivePath() (string, error)
	MMPROJPath() (string, error)
	Config() (ModelConfig, error)
	Tags() []string
	Descriptor() (Descriptor, error)
	ChatTemplatePath() (string, error)
}
```

### `types.ModelBundle` (pkg/distribution/types/model.go)

Unpacked model ready for runtime execution.

```go
type ModelBundle interface {
	RootDir() string
	GGUFPath() string
	SafetensorsPath() string
	DDUFPath() string
	ChatTemplatePath() string
	MMPROJPath() string
	RuntimeConfig() ModelConfig
}
```

### `types.ModelConfig` (pkg/distribution/types/config.go)

Format-agnostic configuration access.

```go
type ModelConfig interface {
	GetFormat() Format
	GetContextSize() *int32
	GetSize() string
	GetArchitecture() string
	GetParameters() string
	GetQuantization() string
}
```

Implemented by:
- `*types.Config` (Docker format, snake_case JSON)
- `*modelpack.Model` (CNCF ModelPack format, camelCase JSON)
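
As a rough illustration of what "format-agnostic" means here, the sketch below decodes a snake_case and a camelCase document into two different structs and reads both through one interface. The struct names (`dockerConfig`, `modelPackConfig`), the JSON keys, and the `describe` helper are hypothetical stand-ins for this example, not the real `types.Config` or `modelpack.Model` definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Format and ModelConfig mirror a subset of the interface shown above.
type Format string

type ModelConfig interface {
	GetFormat() Format
	GetContextSize() *int32
	GetArchitecture() string
}

// dockerConfig is a hypothetical snake_case config (not the real types.Config).
type dockerConfig struct {
	Format       Format `json:"format"`
	ContextSize  *int32 `json:"context_size"`
	Architecture string `json:"architecture"`
}

func (c *dockerConfig) GetFormat() Format       { return c.Format }
func (c *dockerConfig) GetContextSize() *int32  { return c.ContextSize }
func (c *dockerConfig) GetArchitecture() string { return c.Architecture }

// modelPackConfig is a hypothetical camelCase config (not the real modelpack.Model).
type modelPackConfig struct {
	Format       Format `json:"format"`
	ContextSize  *int32 `json:"contextSize"`
	Architecture string `json:"architecture"`
}

func (c *modelPackConfig) GetFormat() Format       { return c.Format }
func (c *modelPackConfig) GetContextSize() *int32  { return c.ContextSize }
func (c *modelPackConfig) GetArchitecture() string { return c.Architecture }

// parseDocker decodes the snake_case layout.
func parseDocker(raw []byte) (ModelConfig, error) {
	var c dockerConfig
	if err := json.Unmarshal(raw, &c); err != nil {
		return nil, err
	}
	return &c, nil
}

// parseModelPack decodes the camelCase layout.
func parseModelPack(raw []byte) (ModelConfig, error) {
	var c modelPackConfig
	if err := json.Unmarshal(raw, &c); err != nil {
		return nil, err
	}
	return &c, nil
}

// describe depends only on the interface, never on the JSON layout.
func describe(c ModelConfig) string {
	ctx := "unknown"
	if n := c.GetContextSize(); n != nil {
		ctx = fmt.Sprint(*n)
	}
	return fmt.Sprintf("%s/%s/ctx=%s", c.GetFormat(), c.GetArchitecture(), ctx)
}

func main() {
	a, _ := parseDocker([]byte(`{"format":"gguf","context_size":4096,"architecture":"llama"}`))
	b, _ := parseModelPack([]byte(`{"format":"gguf","contextSize":4096,"architecture":"llama"}`))
	fmt.Println(describe(a))
	fmt.Println(describe(b))
}
```

Returning `*int32` from `GetContextSize` lets a caller tell "not set" apart from zero, which is why `describe` checks for `nil` before dereferencing.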

## Helper Interfaces (pkg/distribution/internal/partial/)

Compositional interfaces enabling code reuse:

```go
type WithRawConfigFile interface {
	RawConfigFile() ([]byte, error)
}

type WithRawManifest interface {
	RawManifest() ([]byte, error)
}

type WithLayers interface {
	WithRawConfigFile
	Layers() ([]oci.Layer, error)
}

type WithConfigMediaType interface {
	GetConfigMediaType() oci.MediaType
}
```

Helper functions work with any type satisfying these interfaces:
- `Config(WithRawConfigFile)` → `ModelConfig`
- `ID(WithRawManifest)` → `string`
- `GGUFPaths(WithLayers)` → `[]string`
- `SafetensorsPaths(WithLayers)` → `[]string`
- `ManifestForLayers(WithLayers)` → `*oci.Manifest`
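
The pattern behind these helpers can be sketched as a free function that accepts the narrowest interface it needs. The example below assumes, purely for illustration, that an ID is the SHA-256 digest of the raw manifest bytes; the actual derivation in the `partial` package may differ. The `inMemory` and `broken` types are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// WithRawManifest matches the helper interface shown above.
type WithRawManifest interface {
	RawManifest() ([]byte, error)
}

// ID works for any type that can produce raw manifest bytes.
// Assumption for this sketch: the ID is "sha256:" plus the hex digest.
func ID(m WithRawManifest) (string, error) {
	raw, err := m.RawManifest()
	if err != nil {
		return "", fmt.Errorf("get raw manifest: %w", err)
	}
	sum := sha256.Sum256(raw)
	return "sha256:" + hex.EncodeToString(sum[:]), nil
}

// inMemory is a hypothetical type holding a manifest in memory.
type inMemory struct{ manifest []byte }

func (m inMemory) RawManifest() ([]byte, error) { return m.manifest, nil }

// broken is a hypothetical type whose manifest cannot be read.
type broken struct{}

func (broken) RawManifest() ([]byte, error) {
	return nil, errors.New("manifest unavailable")
}

func main() {
	id, err := ID(inMemory{manifest: []byte(`{"schemaVersion":2}`)})
	fmt.Println(id, err)

	_, err = ID(broken{})
	fmt.Println(err)
}
```

Because `ID` only asks for `RawManifest()`, neither type has to implement the full `oci.Image` interface to reuse it.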

## Concrete Types

### `partial.BaseModel`

Common implementation for model artifacts:

```go
type BaseModel struct {
	ModelConfigFile types.ConfigFile
	LayerList       []oci.Layer
	ConfigMediaType oci.MediaType
}
```

### `partial.Layer`

Local file layer implementing `oci.Layer`:

```go
type Layer struct {
	Path string
	oci.Descriptor
}
```
169+
170+
## Model Formats
171+
172+
| Format | Constant | Description |
173+
|--------|----------|-------------|
174+
| GGUF | `FormatGGUF` | llama.cpp quantized models |
175+
| Safetensors | `FormatSafetensors` | Hugging Face weights |
176+
| Diffusers | `FormatDiffusers` | Image generation models |
177+
| Safetensors | `FormatSafetensors` | HuggingFace weights |
178+
| Diffusers | `FormatDiffusers` | Image generation models |
179+
180+
## Media Types
181+
182+
### Docker Format
183+
- `application/vnd.docker.ai.model.config.v0.1+json` - Legacy config
184+
- `application/vnd.docker.ai.model.config.v0.2+json` - Layer-per-file config
185+
- `application/vnd.docker.ai.gguf.v3` - GGUF weights
186+
- `application/vnd.docker.ai.safetensors` - Safetensors weights
187+
188+
### CNCF ModelPack Format
189+
- `application/vnd.cncf.model.config.v1+json` - ModelPack config
190+
- `application/vnd.cncf.model.weight.v1.gguf` - GGUF weights
191+
- `application/vnd.cncf.model.weight.v1.safetensors` - Safetensors weights
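
One way to see how the two media-type families line up is a lookup from a layer's media type to its weight format. This is a minimal sketch: the media type strings come from the lists above, while the `Format` values, the `weightFormats` table, and `formatForLayer` are hypothetical names invented for the example.

```go
package main

import "fmt"

type Format string

// The constant names follow the Model Formats table; the string values
// are assumptions made for this sketch.
const (
	FormatGGUF        Format = "gguf"
	FormatSafetensors Format = "safetensors"
)

// weightFormats maps weight-layer media types from both families
// (Docker and CNCF ModelPack) onto a single Format value.
var weightFormats = map[string]Format{
	"application/vnd.docker.ai.gguf.v3":                FormatGGUF,
	"application/vnd.docker.ai.safetensors":            FormatSafetensors,
	"application/vnd.cncf.model.weight.v1.gguf":        FormatGGUF,
	"application/vnd.cncf.model.weight.v1.safetensors": FormatSafetensors,
}

// formatForLayer reports the weight format for a layer media type.
// The second result is false for config layers and unknown media types.
func formatForLayer(mediaType string) (Format, bool) {
	f, ok := weightFormats[mediaType]
	return f, ok
}

func main() {
	for _, mt := range []string{
		"application/vnd.docker.ai.gguf.v3",
		"application/vnd.cncf.model.weight.v1.safetensors",
		"application/vnd.cncf.model.config.v1+json",
	} {
		f, ok := formatForLayer(mt)
		fmt.Printf("%s -> format=%q known=%v\n", mt, f, ok)
	}
}
```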

## Why Multiple Interfaces?

| Interface | Use Case |
|-----------|----------|
| `oci.Image` | Registry push/pull operations |
| `ModelArtifact` | Building and distributing models |
| `Model` | Stored model file path access |
| `ModelBundle` | Runtime execution |
| `ModelConfig` | Format-agnostic config access |

This separation enables:
- Different backends (llama.cpp, vLLM) consume the same `ModelBundle`
- Support for multiple formats (Docker, CNCF ModelPack) without conversion
- Clean separation between storage and inference layers
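
The first point can be sketched with two backends that read different paths from the same bundle. Everything here beyond the `ModelBundle` method names is hypothetical: the `Backend` interface, the `llamaCpp` and `vllm` types, the `staticBundle` fixture, and the assumption that an empty path means the bundle carries no file of that format.

```go
package main

import (
	"errors"
	"fmt"
)

// ModelBundle is a subset of the interface described above.
type ModelBundle interface {
	RootDir() string
	GGUFPath() string
	SafetensorsPath() string
}

// Backend is a hypothetical interface used only for this sketch.
type Backend interface {
	Name() string
	Weights(b ModelBundle) (string, error)
}

var errUnsupported = errors.New("bundle has no weights in a supported format")

// llamaCpp stands in for a backend that loads GGUF files.
type llamaCpp struct{}

func (llamaCpp) Name() string { return "llama.cpp" }

func (llamaCpp) Weights(b ModelBundle) (string, error) {
	// Assumption: an empty path means no GGUF file is present.
	if p := b.GGUFPath(); p != "" {
		return p, nil
	}
	return "", errUnsupported
}

// vllm stands in for a backend that loads Safetensors files.
type vllm struct{}

func (vllm) Name() string { return "vLLM" }

func (vllm) Weights(b ModelBundle) (string, error) {
	// Assumption: an empty path means no Safetensors file is present.
	if p := b.SafetensorsPath(); p != "" {
		return p, nil
	}
	return "", errUnsupported
}

// staticBundle is a hypothetical in-memory bundle for the example.
type staticBundle struct{ root, gguf, safetensors string }

func (s staticBundle) RootDir() string         { return s.root }
func (s staticBundle) GGUFPath() string        { return s.gguf }
func (s staticBundle) SafetensorsPath() string { return s.safetensors }

func main() {
	b := staticBundle{root: "/bundles/demo", gguf: "/bundles/demo/model.gguf"}
	for _, be := range []Backend{llamaCpp{}, vllm{}} {
		path, err := be.Weights(b)
		fmt.Printf("%s: path=%q err=%v\n", be.Name(), path, err)
	}
}
```

Neither backend needs to know how the bundle was pulled or stored, which is the storage/inference split the list describes.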
