[ContentUnderstanding] Add to_llm_input helper for converting analysis results to LLM-friendly text #46386
Open
chienyuanchang wants to merge 13 commits into main
Pull request overview
Adds a public to_llm_input() helper to azure-ai-contentunderstanding to convert AnalysisResult objects into LLM-friendly text (YAML front matter + Markdown), along with unit tests and version/docs updates.
Changes:
- Introduces `to_llm_input()` and supporting field/page/YAML rendering helpers.
- Re-exports `to_llm_input` from the package public surface and bumps version to `1.1.0`.
- Adds a comprehensive test suite covering content types, rendering rules, and edge cases.
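To make the field rendering helper in the list above more concrete, here is a minimal sketch of recursive field flattening. It is not the PR's implementation: it works on plain dicts rather than SDK model classes, and the field shape (`type` plus a `value<Type>` key such as `valueString`, `valueObject`, `valueArray`) and the names `flatten_field` / `flatten_fields` are assumptions made for illustration.

```python
from typing import Any, Dict


def flatten_field(field: Dict[str, Any]) -> Any:
    """Reduce one typed field (assumed dict shape) to a plain Python value."""
    kind = field.get("type")
    if kind == "object":
        # Recurse into each named child of the object.
        children = field.get("valueObject", {})
        return {name: flatten_field(child) for name, child in children.items()}
    if kind == "array":
        # Recurse into each element of the array.
        return [flatten_field(item) for item in field.get("valueArray", [])]
    if not kind:
        return None
    # Scalars: assume the value lives under "value<Type>", e.g. "valueString".
    return field.get("value" + kind[:1].upper() + kind[1:])


def flatten_fields(fields: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Flatten a mapping of field name to typed field."""
    return {name: flatten_field(field) for name, field in fields.items()}


example = {
    "vendor": {"type": "string", "valueString": "Contoso"},
    "total": {"type": "number", "valueNumber": 42.5},
    "items": {
        "type": "array",
        "valueArray": [
            {
                "type": "object",
                "valueObject": {
                    "name": {"type": "string", "valueString": "Widget"},
                    "qty": {"type": "integer", "valueInteger": 2},
                },
            }
        ],
    },
}
flat = flatten_fields(example)
```

The recursion bottoms out at scalar fields, so arbitrarily nested objects and arrays collapse into ordinary dicts and lists that a YAML or JSON serializer can handle directly.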
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `azure/ai/contentunderstanding/_helpers.py` | New helper implementation, including minimal YAML serializer and content rendering logic. |
| `azure/ai/contentunderstanding/_patch.py` | Re-exports `to_llm_input` via `__all__` for package-level import. |
| `tests/test_to_llm_input.py` | New unit tests validating helper behavior across documents, AV, classification, and flags. |
| `azure/ai/contentunderstanding/_version.py` | Version bump to 1.1.0. |
| `README.md` | Adds 1.1.0 to the SDK-to-service-version table. |
| `CHANGELOG.md` | Adds release notes entry for 1.1.0. |
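The "minimal YAML serializer" noted for `_helpers.py` can be pictured with a sketch like the one below. This is an illustrative stand-in, not the code under review: the names `to_yaml_lines` and `render`, and the example keys, are hypothetical. It assumes keys are simple strings that need no quoting, and it quotes string values with `json.dumps`, relying on JSON strings being valid YAML double-quoted scalars.

```python
import json
from typing import Any, Dict, List


def _inline(value: Any) -> str:
    """Render a scalar (or an empty container) on a single line."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, dict):
        return "{}"
    if isinstance(value, list):
        return "[]"
    return json.dumps(str(value), ensure_ascii=False)


def to_yaml_lines(value: Any, indent: int = 0) -> List[str]:
    """Serialize nested dicts/lists/scalars into block-style YAML lines."""
    pad = "  " * indent
    lines: List[str] = []
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(item, (dict, list)) and item:
                lines.append(f"{pad}{key}:")
                lines.extend(to_yaml_lines(item, indent + 1))
            else:
                lines.append(f"{pad}{key}: {_inline(item)}")
    elif isinstance(value, list):
        for item in value:
            if isinstance(item, (dict, list)) and item:
                nested = to_yaml_lines(item, indent + 1)
                # Hoist the first nested line onto the dash line.
                lines.append(f"{pad}- {nested[0].lstrip()}")
                lines.extend(nested[1:])
            else:
                lines.append(f"{pad}- {_inline(item)}")
    else:
        lines.append(f"{pad}{_inline(value)}")
    return lines


def render(front_matter: Dict[str, Any], markdown: str) -> str:
    """Join YAML front matter and a Markdown body into one string."""
    yaml_block = "\n".join(to_yaml_lines(front_matter))
    return f"---\n{yaml_block}\n---\n\n{markdown}"


text = render(
    {"contentType": "document", "fields": {"vendor": "Contoso", "total": 42.5}, "pages": [1, 2]},
    "# Invoice\n\nThank you.",
)
```

Hand-rolling the serializer avoids a YAML dependency for a narrow output shape, at the cost of not covering the full YAML grammar (for example, special float values or keys that require quoting).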
…at_helper # Conflicts: # sdk/contentunderstanding/azure-ai-contentunderstanding/CHANGELOG.md # sdk/contentunderstanding/azure-ai-contentunderstanding/README.md # sdk/contentunderstanding/azure-ai-contentunderstanding/azure/ai/contentunderstanding/_version.py
Description
Adds the `to_llm_input()` public helper function to the `azure-ai-contentunderstanding` package. This function converts a CU `AnalysisResult` into a formatted text string (YAML front matter + Markdown body) suitable for injecting into LLM prompts, storing in vector databases, or returning as tool output in agentic workflows.

Key features:
- `_resolve_fields()` recursively flattens all 9 `ContentField` subtypes (`StringField`, `NumberField`, `ObjectField`, `ArrayField`, etc.) into plain Python dicts
- […] `pages[].spans` offsets, with `<!-- PageBreak -->` fallback for older content
- […] `include_fields` / `include_markdown` flags
- […] `timeRange`; only multi-segment AV includes `timeRange` per segment (per design spec)
- […] `include_fields`, `include_markdown`, and `metadata` keyword arguments

Files changed
- `azure/ai/contentunderstanding/_helpers.py`: […] `to_llm_input()`, `_resolve_fields()`, and supporting internal functions
- `azure/ai/contentunderstanding/_patch.py`: […] `to_llm_input` in `__all__`
- `tests/test_to_llm_input.py`

How to verify