LLM Tracker

LLM Tracker is a Python package for identifying psychological constructs in text data (e.g., interviews, social media posts, chatbot interactions) and comparing LLM-coded results against human-coded results.

The package supports:

Accessing thousands of LLMs through the OpenRouter API
Detecting every instance of a construct (e.g., anxiety) and returning the verbatim quote (e.g., "I'm worried about my cousin")
Comparing human and LLM codings at the quote level (using LLMs to match human and LLM quotes)
Computing inter-rater reliability metrics (Kappa, ICC, PABAK) and classification metrics (sensitivity, precision, F1, and PR AUC) and returning summary tables
Automatically retrying submissions when LLM outputs are not parseable
Saving analyzer outputs, metadata, and retryable error records
Loading CSV, TXT, and DOCX files flexibly, and preprocessing Dedoose human coding to match LLM coding dataframes
Many new features coming soon: visualizations, automated prompt engineering, and more!

If you use this package, please cite:

Low, D., Mair, P., Nock, M., & Ghosh, S. (2025). Text Psychometrics: Assessing Psychological Constructs in Text Using Natural Language Processing. PsyArXiv. https://osf.io/preprints/psyarxiv/9rdux_v4

Installation

Install dependencies with Poetry:

poetry install

For tutorial extras such as corpus summaries:

poetry install --with tutorials

API Key

LLM Tracker uses OpenRouter for LLM calls. You can obtain an API key at https://openrouter.ai/ after adding a few dollars of credit. Each LLM call typically costs a fraction of a cent (see OpenRouter for the pricing of specific models). Provide the API key directly:

api_key = "your-openrouter-key"

or set it in the environment:

export OPENROUTER_API_KEY="your-openrouter-key"

You can also pass the path to a .env file containing:

OPENROUTER_API_KEY=your-openrouter-key
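
The three options above can be thought of as a single lookup order. The following is a minimal sketch of that idea, not the package's own logic: the helper names read_env_file and resolve_api_key are hypothetical, and the precedence shown (explicit argument, then environment variable, then .env file) is an assumption.

```python
import os
from pathlib import Path
from typing import Optional


def read_env_file(path: str) -> dict:
    """Parse simple KEY=value lines from a .env file (hypothetical helper)."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(api_key: Optional[str] = None,
                    env_path: Optional[str] = None) -> str:
    """Return the first key found: argument, environment, then .env file.

    The precedence is an assumption made for this sketch.
    """
    if api_key:
        return api_key
    from_env = os.environ.get("OPENROUTER_API_KEY")
    if from_env:
        return from_env
    if env_path and Path(env_path).is_file():
        from_file = read_env_file(env_path).get("OPENROUTER_API_KEY")
        if from_file:
            return from_file
    raise RuntimeError("No OpenRouter API key found.")
```

Keeping the key in the environment or in a .env file, rather than in source code, makes it less likely to be committed to version control by accident.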

Basic LLM Coding

from llm_tracker import LLMTrackerAnalyzer

analyzer = LLMTrackerAnalyzer(
    api_key=api_key,
    model_name="google/gemini-3-flash-preview",
)

results_llm, metadata_llm, errors_llm = analyzer.analyze_csv(
    csv_path="sample_data.csv",
    codebook_path="codebook.json",
    text_column="post",
    subreddit_column="subreddit",
    author_column="author",
    output_dir="LLM_coding",
)

For a directory of supported document files:

results_llm, metadata_llm, errors_llm = analyzer.analyze_directory(
    input_dir="documents",
    codebook_path="codebook.json",
    output_dir="LLM_coding",
)

Directory mode supports .txt and .csv files. Each file becomes one document.
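
To preview which files a directory run would pick up, a rough sketch follows. It is not how analyze_directory is implemented: the helper collect_documents is hypothetical, using the file name as the document ID is an assumption, and reading each file as raw UTF-8 text is a simplification.

```python
from pathlib import Path

# Extensions taken from the sentence above.
SUPPORTED_SUFFIXES = {".txt", ".csv"}


def collect_documents(input_dir: str) -> dict:
    """Map each supported file to one document (hypothetical helper).

    The document ID is assumed to be the file name; check the IDs in the
    actual LLM results before relying on this.
    """
    documents = {}
    for path in sorted(Path(input_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES:
            documents[path.name] = path.read_text(encoding="utf-8")
    return documents
```

Calling collect_documents("documents") returns a dictionary with one entry per .txt or .csv file and silently skips everything else.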

Human Coding Input

Human coding is loaded into memory and passed directly to the comparer:

from llm_tracker.file_handlers import load_human_coding

human_results = load_human_coding(
    "human_coding.csv",
    doc_id_col="Media Title",
    quote_col="Excerpt Copy",
    range_col="Excerpt Range",
    construct_col="Codes Applied Combined",
)

The defaults are designed for Dedoose-style excerpt exports. For other sources, pass the column names used by your file. The values in doc_id_col should match the document IDs produced by the LLM run.
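
Because mismatched document IDs would leave human and LLM codings unpaired, it can help to check the overlap before comparing. The sketch below uses only the standard library; doc_id_mismatches is a hypothetical helper and the toy rows are invented for illustration.

```python
import csv
import io


def doc_id_mismatches(human_rows, llm_doc_ids, doc_id_col="Media Title"):
    """Return document IDs that appear on only one side (hypothetical helper)."""
    human_ids = {row[doc_id_col].strip() for row in human_rows if row.get(doc_id_col)}
    llm_ids = {str(doc_id).strip() for doc_id in llm_doc_ids}
    return {
        "human_only": sorted(human_ids - llm_ids),
        "llm_only": sorted(llm_ids - human_ids),
    }


# Toy export using the Dedoose-style column names shown above.
sample = io.StringIO(
    "Media Title,Excerpt Copy\n"
    "doc_01,I'm worried about my cousin\n"
    "doc_03,I can't sleep\n"
)
report = doc_id_mismatches(csv.DictReader(sample), ["doc_01", "doc_02"])
print(report)  # {'human_only': ['doc_03'], 'llm_only': ['doc_02']}
```

Both lists being empty means the two codings refer to the same documents.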

Comparing Results

from llm_tracker.comparison import (
    LLMTrackerComparer,
    compute_summary_tables,
    format_concatenated,
    format_weighted_summary,
)

comparer = LLMTrackerComparer(
    api_key=api_key,
    match_model="google/gemini-3-flash-preview",
)

comparison_table = comparer.compare_results(
    human_results=human_results,
    llm_results=results_llm,
    output_dir="comparison_run",
)

per_doc, pooled, weighted = compute_summary_tables(comparison_table)

format_concatenated(pooled)
format_weighted_summary(weighted)

The comparison table contains one row per matched, human-only, or LLM-only construct instance. The human coding is treated as the reference set for classification metrics.
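
With human coding as the reference, the three row types map onto the usual confusion counts: matched rows are true positives, human-only rows are false negatives, and LLM-only rows are false positives. The sketch below illustrates that arithmetic only. It is not the package's compute_summary_tables, and the status labels "matched", "human_only", and "llm_only" are assumed names.

```python
from collections import Counter


def classification_metrics(statuses):
    """Compute sensitivity, precision, and F1 from per-row match statuses.

    Status labels are hypothetical; human coding is the reference set.
    """
    counts = Counter(statuses)
    tp = counts["matched"]      # found by both human and LLM
    fn = counts["human_only"]   # missed by the LLM
    fp = counts["llm_only"]     # not coded by the human
    nan = float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    precision = tp / (tp + fp) if (tp + fp) else nan
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else nan
    return {"sensitivity": sensitivity, "precision": precision, "f1": f1}


statuses = ["matched"] * 6 + ["human_only"] * 2 + ["llm_only"] * 4
metrics = classification_metrics(statuses)
print(metrics)  # sensitivity 0.75, precision 0.6, f1 about 0.667
```

PR AUC is not covered here because it requires a score per instance, and this table layout has no true negatives to count.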

Quote Matching

Quote indices are recovered with exact matching by default. Fuzzy quote matching is also available and must be enabled explicitly:

analyzer = LLMTrackerAnalyzer(
    api_key=api_key,
    fuzzy_quote_matching=True,
)

Use fuzzy matching when quotes may differ slightly from the source text due to spacing, punctuation, or small transcription differences.
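
To illustrate the difference between the two modes, here is a minimal sketch that tries an exact search first and then falls back to a sliding-window similarity search. It is not the package's algorithm: locate_quote is a hypothetical function, and the use of difflib.SequenceMatcher and the 0.85 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def locate_quote(text, quote, fuzzy=False, threshold=0.85):
    """Return (start, end) character indices of quote in text, or None."""
    start = text.find(quote)
    if start != -1:
        return start, start + len(quote)  # exact match
    if not fuzzy or not quote:
        return None
    # Fuzzy fallback: slide a window of the quote's length over the text
    # and keep the most similar span.
    window = len(quote)
    best_ratio, best_start = 0.0, -1
    for i in range(max(1, len(text) - window + 1)):
        ratio = SequenceMatcher(None, text[i:i + window], quote).ratio()
        if ratio > best_ratio:
            best_ratio, best_start = ratio, i
    if best_ratio >= threshold:
        return best_start, min(len(text), best_start + window)
    return None


text = "She said: I'm worried about my cousin, honestly."
# The LLM dropped the apostrophe and added a space, so exact matching fails.
noisy_quote = "Im worried about  my cousin"
print(locate_quote(text, noisy_quote))              # None
print(locate_quote(text, noisy_quote, fuzzy=True))  # a (start, end) pair
```

The window scan is quadratic in the worst case, which is acceptable for short posts but slow for long transcripts. A lower threshold also raises the risk of anchoring a quote to the wrong passage.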

Retry Failed Documents

Analyzer runs save error records for failed documents. You can retry them later:

recovered_results, recovered_metadata, remaining_errors = analyzer.retry_errors(
    output_dir="LLM_coding_2026-05-20_120000",
    codebook_path="codebook.json",
)
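
One common source of failed documents is model output that cannot be parsed. The general retry-then-record pattern can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: call_with_retries and the error-record fields are hypothetical, and the model is stubbed so that no API call is made.

```python
import json
from typing import Callable


def call_with_retries(call_model: Callable[[str], str], prompt: str,
                      max_attempts: int = 3):
    """Return (parsed, error_record); exactly one of the two is None."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        raw = call_model(prompt)
        try:
            return json.loads(raw), None
        except json.JSONDecodeError as exc:
            # Keep the latest failure so it can be saved and retried later.
            last_error = {"attempt": attempt, "error": str(exc), "raw_output": raw}
    return None, last_error


# Stub model: the first reply is not JSON, the second one is.
responses = iter([
    "Sure! Here is the coding:",
    '{"anxiety": ["I\'m worried about my cousin"]}',
])
parsed, error = call_with_retries(lambda prompt: next(responses), "Code this post.")
print(parsed)  # {'anxiety': ["I'm worried about my cousin"]}
print(error)   # None
```

When every attempt fails, the returned error record holds the raw output of the last attempt, which is the kind of information a later retry pass needs.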

Tutorial

See tutorial.ipynb for a fuller walkthrough using the sample data and codebook.

Testing

Run the test suite with:

poetry run pytest

The tests avoid real API calls and focus on package behavior, file handling, comparison logic, configuration, and parsing.

Data Privacy

This package sends text to an LLM API during analysis and matching. Do not send identifiable or sensitive data unless it has been properly anonymized and your API provider's data handling policy is appropriate for your use case.
