Commit e4c6b2b

docs: Document dedicated LocalBackend pipeline support

1 parent: 3067cc1

File tree: 3 files changed (+28 −2)

docs/features/checkpoint-forking.mdx

Lines changed: 2 additions & 2 deletions

```diff
@@ -31,7 +31,7 @@ import art
 from art.local import LocalBackend
 
 async def train():
-    with LocalBackend() as backend:
+    async with LocalBackend() as backend:
         # Create a new model that will fork from an existing checkpoint
         model = art.TrainableModel(
             name="my-model-v2",
@@ -115,7 +115,7 @@ low_lr_model = art.TrainableModel(
 )
 
 async def experiment():
-    with LocalBackend() as backend:
+    async with LocalBackend() as backend:
         # Fork the model from the base model
         await backend._experimental_fork_checkpoint(
             low_lr_model,
```
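The fix in this file swaps a synchronous `with` on `LocalBackend` for `async with`: an async context manager must be entered via `__aenter__`/`__aexit__` inside a coroutine, and a plain `with` would fail if the class defines only the async protocol. A minimal self-contained sketch of the pattern, using a hypothetical `FakeBackend` stand-in rather than the real `LocalBackend`:

```python
import asyncio

class FakeBackend:
    """Hypothetical stand-in for an async context manager like LocalBackend."""

    async def __aenter__(self):
        # Async setup (e.g. starting services) would happen here.
        self.ready = True
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Async teardown runs even if the body raised.
        self.ready = False
        return False  # do not suppress exceptions

async def train():
    # `async with` awaits __aenter__/__aexit__; a plain `with` would raise
    # because FakeBackend defines no synchronous __enter__/__exit__.
    async with FakeBackend() as backend:
        return backend.ready

print(asyncio.run(train()))  # → True
```

This is why the docs change matters: calling `with LocalBackend()` where the backend only supports the async protocol is a runtime error, not a style preference.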

docs/fundamentals/art-backend.mdx

Lines changed: 24 additions & 0 deletions

````diff
@@ -73,6 +73,30 @@ backend = LocalBackend(
 )
 ```
 
+If you're using `PipelineTrainer`, `LocalBackend` is currently supported only in dedicated mode, where training and inference run on separate GPUs.
+
+```python
+from art import TrainableModel
+from art.dev import InternalModelConfig
+from art.local import LocalBackend
+
+backend = LocalBackend(path="./.art")
+model = TrainableModel(
+    name="pipeline-localbackend",
+    project="my-project",
+    base_model="Qwen/Qwen3-0.6B",
+    _internal_config=InternalModelConfig(
+        trainer_gpu_ids=[0],
+        inference_gpu_ids=[1],
+    ),
+)
+```
+
+Shared `LocalBackend` still pauses inference during training, so ART rejects that configuration for `PipelineTrainer`.
+
+In dedicated mode, a new checkpoint becomes the default inference target only after its LoRA has been reloaded into vLLM. That checkpoint publication flow is backend-specific, so `save_checkpoint` does not have identical semantics across every ART backend. Requests that are already in flight keep using the adapter they started with; the reload only affects subsequent routing to the latest served step.
+
 ## Using a backend
 
 Once initialized, a backend can be used in the same way regardless of whether it runs locally or remotely.
````
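The checkpoint-publication behavior the added docs describe can be modeled in a few lines: new requests are pinned to the currently served step, and routing advances only after the new LoRA is reloaded. This is an illustrative sketch, not ART's implementation; all names here (`AdapterRouter`, `reload_lora`, `start_request`) are hypothetical:

```python
class AdapterRouter:
    """Toy model of dedicated-mode checkpoint publication.
    Hypothetical names; not ART's real routing code."""

    def __init__(self, served_step: int):
        self.served_step = served_step  # latest step loaded into inference

    def start_request(self) -> int:
        # New requests are pinned to the currently served checkpoint step.
        return self.served_step

    def reload_lora(self, new_step: int) -> None:
        # Only after the reload does routing advance to the new checkpoint.
        self.served_step = new_step

router = AdapterRouter(served_step=4)
in_flight = router.start_request()   # request begins against step 4

router.reload_lora(new_step=5)       # checkpoint 5 is published

assert in_flight == 4                # in-flight request keeps its adapter
assert router.start_request() == 5   # new requests route to the latest step
```

The key property the sketch captures is that saving a checkpoint and serving it are separate events: nothing observes step 5 until `reload_lora` has run.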

docs/fundamentals/training-loop.mdx

Lines changed: 2 additions & 0 deletions

```diff
@@ -22,6 +22,8 @@ ART's functionality is divided into a [**client**](/fundamentals/art-client) and
 
 This training loop runs until a specified number of inference and training iterations have completed.
 
+This describes the default shared-resource loop. `PipelineTrainer` can also run with `LocalBackend` in dedicated mode, where training and inference stay on separate GPUs and the latest served step advances only after vLLM reloads the new LoRA.
+
 Training and inference use both the ART **client** and **backend**. Learn more by following the links below!
 
 <div className="cards-container">
```
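The loop this file describes — alternate inference and training until a fixed iteration budget is spent — can be sketched with stub coroutines. All names below (`gather_trajectories`, `train_on`, `training_loop`) are illustrative stand-ins, not ART's API:

```python
import asyncio

async def gather_trajectories(step: int) -> list[str]:
    # Stand-in for the inference/rollout phase; returns fake trajectories.
    return [f"trajectory-{step}-{i}" for i in range(2)]

async def train_on(trajectories: list[str]) -> None:
    # Stand-in for a training pass over the gathered trajectories.
    await asyncio.sleep(0)

async def training_loop(num_iterations: int) -> int:
    # Run inference, then train on the results, for a fixed number of steps.
    for step in range(num_iterations):
        trajectories = await gather_trajectories(step)
        await train_on(trajectories)
    return num_iterations

print(asyncio.run(training_loop(3)))  # → 3
```

In the shared-resource loop the two phases take turns on the same GPUs; in dedicated mode they run on separate GPUs, but the alternation of "gather, then train" per step is the same.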
