
Commit 3e3f169

Merge pull request #24095 from ericcurtin/hf-doc
Add Hugging Face support to Docker Model Runner docs
2 parents d98c6bd + 90c8b1a

1 file changed

Lines changed: 10 additions & 8 deletions

File tree

content/manuals/ai/model-runner/_index.md

```diff
@@ -17,8 +17,8 @@ aliases:
 Docker Model Runner (DMR) makes it easy to manage, run, and
 deploy AI models using Docker. Designed for developers,
 Docker Model Runner streamlines the process of pulling, running, and serving
-large language models (LLMs) and other AI models directly from Docker Hub or any
-OCI-compliant registry.
+large language models (LLMs) and other AI models directly from Docker Hub,
+any OCI-compliant registry, or [Hugging Face](https://huggingface.co/).
 
 With seamless integration into Docker Desktop and Docker
 Engine, you can serve models via OpenAI and Ollama-compatible APIs, package GGUF files as
```
```diff
@@ -32,7 +32,8 @@ with AI models locally.
 
 ## Key features
 
-- [Pull and push models to and from Docker Hub](https://hub.docker.com/u/ai)
+- [Pull and push models to and from Docker Hub or any OCI-compliant registry](https://hub.docker.com/u/ai)
+- [Pull models from Hugging Face](https://huggingface.co/)
 - Serve models on [OpenAI and Ollama-compatible APIs](api-reference.md) for easy integration with existing apps
 - Support for [llama.cpp, vLLM, and Diffusers inference engines](inference-engines.md) (vLLM and Diffusers on Linux with NVIDIA GPUs)
 - [Generate images from text prompts](inference-engines.md#diffusers) using Stable Diffusion models with the Diffusers backend
```
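The pull-from-multiple-sources workflow this change documents might be sketched with the Docker Model CLI as follows. The model names and the `hf.co/<repository>` reference form are illustrative assumptions, not commands taken from this commit:

```shell
# Pull a model from Docker Hub's ai/ namespace (the name is illustrative).
docker model pull ai/smollm2

# Pull a GGUF model directly from Hugging Face; the hf.co/<repository>
# reference form is an assumption, not verified against this change.
docker model pull hf.co/bartowski/SmolLM2-360M-Instruct-GGUF

# Run the pulled model with a one-off prompt.
docker model run ai/smollm2 "Give me a fact about whales."
```

These commands require a Docker installation with Docker Model Runner enabled; the first pull downloads the full model weights, so it can take a while.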
```diff
@@ -81,11 +82,12 @@ Docker Engine only:
 
 ## How Docker Model Runner works
 
-Models are pulled from Docker Hub the first time you use them and are stored
-locally. They load into memory only at runtime when a request is made, and
-unload when not in use to optimize resources. Because models can be large, the
-initial pull may take some time. After that, they're cached locally for faster
-access. You can interact with the model using
+Models are pulled from Docker Hub, an OCI-compliant registry, or
+[Hugging Face](https://huggingface.co/) the first time you use them and are
+stored locally. They load into memory only at runtime when a request is made,
+and unload when not in use to optimize resources. Because models can be large,
+the initial pull may take some time. After that, they're cached locally for
+faster access. You can interact with the model using
 [OpenAI and Ollama-compatible APIs](api-reference.md).
 
 ### Inference engines
```
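Interacting with a loaded model over the OpenAI-compatible API mentioned above could look like this minimal sketch. The host-side base URL and port are assumptions (host TCP access to Docker Model Runner must be enabled), and the model name is illustrative:

```shell
# Assumed host-side endpoint for Docker Model Runner's OpenAI-compatible
# API; adjust the base URL to match your setup. The request body follows
# the standard OpenAI chat completions schema.
curl http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ai/smollm2",
        "messages": [{"role": "user", "content": "Say hello."}]
      }'
```

The first request triggers the runtime model load described in the paragraph above, so it is typically slower than subsequent ones.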
