[SC-16225] Add Gemini support #513
Co-authored-by: Cursor <cursoragent@cursor.com>
PR Summary
This PR makes extensive modifications to support an additional LLM provider (Gemini) in the ValidMind Library. Significant changes include:
• Updates to the configuration routines in the AI utilities (in validmind/ai/utils.py) so that the library now selects among OpenAI, Azure, and Gemini providers based on environment variables. The functions now consistently use a new helper (_get_configured_provider) and define default model names/constants (e.g. GEMINI_MODEL and GEMINI_EMBEDDINGS_MODEL).
• Refactoring of the get_client_and_model and get_judge_config functions. The judge configuration now calls provider-specific methods, using a Gemini-specific implementation (_build_gemini_judge_config) when appropriate and falling back to the OpenAI method otherwise.
• Introduction of a new function, get_deepeval_model, which returns a native model object (or wraps it) for DeepEval scorers. This ensures that the Gemini provider is supported, and the scoring modules in the deepeval subpackage have been updated to call get_deepeval_model in lieu of get_client_and_model.
• Updates across multiple scorer implementations (in the deepeval folder) and prompt validation tests to integrate the new provider configuration, so that all evaluation paths use the updated configuration logic.
• Addition of new unit tests in the tests/unit_tests directory to verify handling of Gemini credentials and parsing of finish signals in RAGAS tests. The test files now mock environment variables for Gemini, confirm that the correct models are instantiated, and validate that the new finish parser behaves as expected.
In summary, this PR unifies LLM evaluation configuration across multiple usage points and extends support to Gemini-based APIs, so that users with different credentials get consistent behavior across prompt-validation, RAGAS, and DeepEval workflows.
Test Suggestions
Pull Request Description
What and why?
Added Gemini support across the library’s three LLM evaluation paths: prompt evaluation tests, RAGAS tests, and DeepEval scorers.
Before this change, the shared judge config supported only OpenAI/Azure, and Gemini was not wired consistently across all evaluators. After this change, all three evaluation paths can use Gemini, RAGAS handles Gemini finish reasons more reliably, and the judge-configuration notebook now documents the current environment variables and defaults used by the library.
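For reviewers who want a mental model of environment-variable-based provider selection, here is a minimal sketch. It is not the code in validmind/ai/utils.py: the function names pick_provider and gemini_model, the Azure variable name AZURE_OPENAI_KEY, the placeholder default model string, and the order of checks (OpenAI, then Azure, then Gemini) are all assumptions made for illustration. Only OPENAI_API_KEY, GEMINI_API_KEY, and GEMINI_MODEL come from this PR description.

```python
import os
from typing import Mapping, Optional

# Hypothetical placeholder; the library's real default model name is not shown here.
PLACEHOLDER_GEMINI_MODEL = "<default-gemini-model>"


def pick_provider(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the first provider whose credentials are set.

    The precedence below is an assumption for illustration only.
    """
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("AZURE_OPENAI_KEY"):  # assumed variable name
        return "azure"
    if env.get("GEMINI_API_KEY"):
        return "gemini"
    return None  # no credentials configured


def gemini_model(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GEMINI_MODEL if set and non-empty, otherwise the placeholder."""
    env = os.environ if env is None else env
    return env.get("GEMINI_MODEL") or PLACEHOLDER_GEMINI_MODEL
```

Passing a plain dict instead of reading os.environ directly keeps the sketch easy to exercise in unit tests without patching the process environment.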
How to test
OPENAI_API_KEY key in your .env
GEMINI_API_KEY, and optionally GEMINI_MODEL and GEMINI_EMBEDDINGS_MODEL
notebooks/how_to/run_tests/configure_tests/configure_judge_llms.ipynb
What needs special review?
Dependencies, breaking changes, and deployment notes
Release notes
Added Gemini support for ValidMind prompt evaluation tests, RAGAS tests, and DeepEval scorers, and updated the judge-configuration notebook to reflect the current environment variables and defaults.
Checklist