[ContentUnderstanding] Add to_llm_input helper for converting analysis results to LLM-friendly text by chienyuanchang · Pull Request #46386 · Azure/azure-sdk-for-python

chienyuanchang · 2026-04-17T22:49:35Z

Description

Adds the to_llm_input() public helper function to the azure-ai-contentunderstanding package. This function converts a CU AnalysisResult into a formatted text string (YAML front matter + markdown body) suitable for injecting into LLM prompts, storing in vector databases, or returning as tool output in agentic workflows.
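To make the "YAML front matter + markdown body" shape concrete, here is a minimal, self-contained sketch of how such a string could be assembled without a YAML dependency. The names `yaml_scalar` and `build_llm_text` are hypothetical and are not taken from `_helpers.py`; the quoting rules are an illustrative subset and may differ from what the PR actually implements.

```python
import re

# Hypothetical sketch; the PR's real serializer lives in _helpers.py and may differ.
# Strings that a YAML parser could read as a boolean, null, number, or date.
_AMBIGUOUS = re.compile(
    r"^(true|false|yes|no|on|off|null|~)$"  # YAML 1.1 style booleans and null
    r"|^[-+]?\d+(\.\d+)?$"                  # plain integers and decimals
    r"|^\d{4}-\d{2}-\d{2}",                 # ISO-style dates and timestamps
    re.IGNORECASE,
)
# Characters with special meaning in YAML; quote any string containing one.
_SPECIAL = set(":#{}[],&*!|>'\"%@`")


def yaml_scalar(value):
    """Render one Python value as a YAML scalar, quoting strings when needed."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value)
    needs_quotes = (
        text == ""
        or text != text.strip()
        or "\n" in text
        or _AMBIGUOUS.search(text) is not None
        or any(ch in _SPECIAL for ch in text)
    )
    if not needs_quotes:
        return text
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def build_llm_text(metadata, markdown):
    """Join a flat metadata dict (front matter) and a markdown body into one string."""
    lines = ["---"]
    for key, value in metadata.items():
        lines.append(f"{key}: {yaml_scalar(value)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + markdown


print(build_llm_text({"source": "invoice.pdf", "date": "2026-04-17"}, "# Invoice"))
```

Quoting the date keeps a downstream YAML parser from turning it into a date object, which is the kind of round-trip surprise the feature list below refers to.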

Key features:

- Renders all content types: documents (with page markers), audio/video (with time ranges for multi-segment content), and classification hierarchies (parent auto-skipped, children rendered with category labels)
- _resolve_fields() recursively flattens all 9 ContentField subtypes (StringField, NumberField, ObjectField, ArrayField, etc.) into plain Python dicts
- Span-based page markers derived from pages[].spans offsets, with a fallback for older content
- Minimal built-in YAML serializer (no external dependency) with proper quoting for dates, booleans, and YAML-special characters
- RAI warnings are always included in the output, regardless of the include_fields/include_markdown flags
- Single-segment AV content omits timeRange; only multi-segment AV includes a timeRange per segment (per design spec)
- Configurable via the include_fields, include_markdown, and metadata keyword arguments
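The recursive flattening mentioned in the feature list can be pictured with the sketch below. It is not the PR's `_resolve_fields()`: it works on plain dicts rather than the SDK's ContentField model classes, and it assumes a JSON-like shape in which each field carries a `type` plus a matching `value<Type>` key (for example `valueString`, `valueObject`, `valueArray`). The function names are hypothetical.

```python
def resolve_field(field):
    """Flatten one field dict into a plain Python value (assumed JSON-like shape)."""
    kind = field.get("type")
    if kind is None:
        return None
    if kind == "object":
        # Recurse into each named child field.
        children = field.get("valueObject") or {}
        return {name: resolve_field(child) for name, child in children.items()}
    if kind == "array":
        # Recurse into each item; items may themselves be objects or arrays.
        return [resolve_field(item) for item in field.get("valueArray") or []]
    # Scalar types: "string" -> "valueString", "number" -> "valueNumber", and so on.
    key = "value" + kind[:1].upper() + kind[1:]
    return field.get(key)


def resolve_fields(fields):
    """Flatten a mapping of field name -> field dict."""
    return {name: resolve_field(field) for name, field in fields.items()}


example = {
    "Total": {"type": "number", "valueNumber": 12.5},
    "Items": {
        "type": "array",
        "valueArray": [
            {"type": "object", "valueObject": {"Name": {"type": "string", "valueString": "Pen"}}},
        ],
    },
}
print(resolve_fields(example))
```

Nested objects inside array items are the case that needs care, since a flattener that only recurses one level would leave raw field wrappers in the output.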

Files changed

| File | Change |
| --- | --- |
| `azure/ai/contentunderstanding/_helpers.py` | New file: `to_llm_input()`, `_resolve_fields()`, and supporting internal functions |
| `azure/ai/contentunderstanding/_patch.py` | Import and re-export `to_llm_input` in `__all__` |
| `tests/test_to_llm_input.py` | New file: 60 unit tests across 10 categories (public API, error handling, field resolution, documents, AV, multi-segment, classification, parameters, front matter, edge cases, real CU patterns) |

How to verify

cd sdk/contentunderstanding/azure-ai-contentunderstanding
pip install -e .
python -m pytest tests/test_to_llm_input.py -v


Copilot

Pull request overview

Adds a public to_llm_input() helper to azure-ai-contentunderstanding to convert AnalysisResult objects into LLM-friendly text (YAML front matter + Markdown), along with unit tests and version/docs updates.

Changes:

- Introduces to_llm_input() and supporting field/page/YAML rendering helpers.
- Re-exports to_llm_input from the package public surface and bumps version to 1.1.0.
- Adds a comprehensive test suite covering content types, rendering rules, and edge cases.
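As an illustration of what a span-based page rendering helper might do, the sketch below inserts a marker in front of the text belonging to each page. It is an assumption-laden stand-in, not the PR's code: pages are plain dicts with `pageNumber` and `spans` (`offset`, `length`), offsets are treated as indexes into the Python string, and the `[page N]` marker text is a placeholder rather than the format the helper actually emits.

```python
def insert_page_markers(markdown, pages):
    """Return markdown with a placeholder marker before each page's first span."""
    # Collect the earliest offset for every page that has at least one span.
    starts = []
    for page in pages:
        spans = page.get("spans") or []
        if spans:
            first = min(span["offset"] for span in spans)
            starts.append((first, page["pageNumber"]))
    starts.sort()

    parts = []
    cursor = 0
    for offset, number in starts:
        # Clamp so that overlapping or out-of-range offsets never move backwards.
        offset = max(cursor, min(offset, len(markdown)))
        parts.append(markdown[cursor:offset])
        parts.append(f"[page {number}]\n")  # placeholder marker format
        cursor = offset
    parts.append(markdown[cursor:])
    return "".join(parts)


pages = [
    {"pageNumber": 1, "spans": [{"offset": 0, "length": 6}]},
    {"pageNumber": 2, "spans": [{"offset": 6, "length": 6}]},
]
print(insert_page_markers("Hello\nWorld\n", pages))
```

When no page carries spans, the function returns the markdown unchanged, which is where a fallback strategy for older content would take over.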

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.


| File | Description |
| --- | --- |
| `azure/ai/contentunderstanding/_helpers.py` | New helper implementation, including minimal YAML serializer and content rendering logic. |
| `azure/ai/contentunderstanding/_patch.py` | Re-exports `to_llm_input` via `__all__` for package-level import. |
| `tests/test_to_llm_input.py` | New unit tests validating helper behavior across documents, AV, classification, and flags. |
| `azure/ai/contentunderstanding/_version.py` | Version bump to `1.1.0`. |
| `README.md` | Adds `1.1.0` to the SDK-to-service-version table. |
| `CHANGELOG.md` | Adds release notes entry for `1.1.0`. |

…at_helper # Conflicts: # sdk/contentunderstanding/azure-ai-contentunderstanding/CHANGELOG.md # sdk/contentunderstanding/azure-ai-contentunderstanding/README.md # sdk/contentunderstanding/azure-ai-contentunderstanding/azure/ai/contentunderstanding/_version.py

chienyuanchang added 2 commits April 10, 2026 15:29

initial version

bd77da9

Merge remote-tracking branch 'origin/main' into cu_sdk/llm_input_form…

7c3231b

…at_helper

github-actions bot added the Cognitive - Content Understanding label Apr 17, 2026

chienyuanchang added 4 commits April 17, 2026 16:41

catch more edge cases

c31e093

Merge branch 'main' into cu_sdk/llm_input_format_helper

86a23aa

update version

3a82526

Merge branch 'main' into cu_sdk/llm_input_format_helper

06d993f

chienyuanchang marked this pull request as ready for review April 20, 2026 19:03

Copilot AI review requested due to automatic review settings April 20, 2026 19:03

chienyuanchang requested review from bojunehsu, changjian-wang and yungshinlintw as code owners April 20, 2026 19:03

Copilot started reviewing on behalf of chienyuanchang April 20, 2026 19:04

Copilot AI reviewed Apr 20, 2026


Comment thread ...contentunderstanding/azure-ai-contentunderstanding/azure/ai/contentunderstanding/_helpers.py Outdated

Comment thread sdk/contentunderstanding/azure-ai-contentunderstanding/tests/test_to_llm_input.py Outdated

chienyuanchang added 7 commits April 20, 2026 12:21

fix nested dicts inside array item

2236f48

update version to 1.2.0

a86f782

update for CI check

7b12254

fix copilot comments

c7d2cab

segs->segments for spell

98ccdd3

Merge branch 'main' into cu_sdk/llm_input_format_helper

6f81649

chienyuanchang wants to merge 13 commits into main from cu_sdk/llm_input_format_helper


2 participants
