
BUG: Categorical.map() sort categories for unordered categoricals (#58153) #65286

Open
tinezivic wants to merge 1 commit into pandas-dev:main from
tinezivic:fix-categorical-map-sort-categories

tinezivic (Contributor) commented Apr 19, 2026

Bug

DataFrame.sort_values(key=...) ignored the key on a Categorical column when the key mapped values to a custom order: rows were sorted by the original category order (alphabetical in the example below) instead of by the mapped values.

Minimal reproduction:

import pandas as pd

df = pd.DataFrame(
    [[1, 2, "March"], [5, 6, "Dec"], [3, 4, "April"]],
    columns=["a", "b", "month"],
)
df.month = df.month.astype("category")
custom_dict = {"March": 0, "April": 1, "Dec": 3}
print(df.sort_values(by=["month"], key=lambda x: x.map(custom_dict)))
# Before fix: sorted alphabetically (April, Dec, March) — key ignored
# After fix:  sorted by custom_dict order (March, April, Dec)

Root Cause

Categorical.map() preserved the positional category order from the original categories. For unordered categoricals the mapped values inherited an arbitrary ordering, so sort_values sorted by old category position instead of the mapped values.
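As a small illustration of the behaviour the root cause relies on: sorting a categorical follows each value's position in the category list (its code), not the category values themselves. The Series below is a made-up example, not taken from the PR, with categories deliberately listed in non-alphabetical order.

```python
import pandas as pd

# Hypothetical data: categories are listed as c, b, a on purpose.
s = pd.Series(pd.Categorical(["b", "a", "c"], categories=["c", "b", "a"]))

# Each value is stored as its position in the category list.
codes = s.cat.codes.tolist()        # [1, 2, 0]

# Sorting follows those positions, so the result is not alphabetical.
result = s.sort_values().tolist()   # ['c', 'b', 'a']
```

If a mapping replaces the category values but leaves these positions untouched, a later sort still follows the old positions, which matches the symptom described above.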

Fix: In Categorical.map(), when the categorical is unordered, sort the mapped categories and remap the codes accordingly. For ordered categoricals the existing category order is preserved. Mixed-type categories that cannot be compared (e.g. str and float) fall back to the original order: the TypeError raised by the sort is caught.
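The sort-and-remap step can be sketched in plain Python. This is a simplified model under stated assumptions, not the code in this PR or pandas internals: categories and codes are plain lists, -1 marks a missing value, and sort_mapped_categories is a hypothetical helper name.

```python
def sort_mapped_categories(categories, codes):
    """Sort mapped categories and remap codes (illustrative model only).

    categories: mapped category values, in the original positional order.
    codes: one index into `categories` per row, with -1 for missing values.
    """
    try:
        # Old positions, ordered by the mapped category value.
        order = sorted(range(len(categories)), key=lambda i: categories[i])
    except TypeError:
        # Non-comparable mixed types (e.g. str and float): keep original order.
        return list(categories), list(codes)

    new_categories = [categories[i] for i in order]
    # Translate each old position to its position after sorting.
    new_position = {old: new for new, old in enumerate(order)}
    new_codes = [new_position[c] if c != -1 else -1 for c in codes]
    return new_categories, new_codes


# Reproduction data: the original categories are April, Dec, March, which
# custom_dict maps to 1, 3, 0. The rows March, Dec, April have codes 2, 1, 0.
cats, codes = sort_mapped_categories([1, 3, 0], [2, 1, 0])
# cats  == [0, 1, 3]
# codes == [0, 2, 1]  (March first, then April, then Dec)
```

After the remap, ordering rows by code gives March, April, Dec, the order requested by custom_dict.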

AI Disclosure

This fix was developed with the assistance of GitHub Copilot (Claude Sonnet 4.6). The AI assisted with code generation and diff review. The contributor verified the fix locally and confirmed all 49 tests pass.

tinezivic force-pushed the fix-categorical-map-sort-categories branch 3 times, most recently from 33e3402 to 6a44ddb, April 19, 2026 02:03
…#58153)

Categorical.map() preserved the positional order of categories from the
original (pre-mapped) categorical. For unordered categoricals, this meant
the mapped values inherited an arbitrary category ordering, causing
sort_values(key=...) to ignore custom sort orders.

Fix: In Categorical.map(), when the categorical is unordered, sort the
mapped categories and remap codes accordingly. For ordered categoricals,
preserve the existing category order (since the ordering is user-defined).

Mixed-type categories that cannot be compared (e.g. str and float)
fall back gracefully to preserving the original category order.

Closes pandas-dev#58153

Generated-by: GitHub Copilot
tinezivic force-pushed the fix-categorical-map-sort-categories branch from 6a44ddb to f8b7b35, April 19, 2026 02:11

Development

Successfully merging this pull request may close these issues.

BUG: sort_values() with key not working on categorical column
