BUG: ADBCDatabase does not escape SQL identifiers in delete_rows, read_table, and to_sql #65065

@betoalien

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import unittest.mock as mock

import adbc_driver_sqlite.dbapi as adbc_sqlite
import pandas as pd
from pandas.io.sql import ADBCDatabase

conn = adbc_sqlite.connect(":memory:")

# Create two tables
pd.DataFrame({"id": [1, 2, 3], "name": ["Alice", "Bob", "Charlie"]}).to_sql(
    "users", conn, if_exists="replace", index=False
)
pd.DataFrame({"secret": ["password123", "api_key"]}).to_sql(
    "audit_log", conn, if_exists="replace", index=False
)

# Attacker controls the `name` parameter and targets a different table
db = ADBCDatabase(conn)

# Capture the SQL that delete_rows generates
with mock.patch.object(db, "execute", wraps=db.execute) as m:
    db.delete_rows("audit_log")  # attacker-controlled name
    print("SQL generated:", m.call_args[0][0])
    # Output: DELETE FROM audit_log
    # All rows are deleted from "audit_log" even though the caller intended "users"

# With a multi-statement payload (driver-dependent):
malicious = "users; DROP TABLE audit_log; --"
table_name = malicious  # the same string ADBCDatabase.delete_rows interpolates
print(f"Injected SQL: DELETE FROM {table_name}")
# Output: DELETE FROM users; DROP TABLE audit_log; --
                                                                                                                        
Confirmed effect against PostgreSQL 16 via ADBC:
- DELETE FROM audit_log executes and empties the table when the caller passes name="audit_log"; any table name is
accepted without restriction.
- UNION injection via read_table also executes when the column types match:
# SQL generated by read_table when table_name is attacker-controlled:
SELECT * FROM users UNION SELECT id, name, role FROM shadow_users
# Returns rows from both tables (data exfiltration confirmed)

Issue Description

ADBCDatabase (used when passing an ADBC connection to pd.read_sql_table() or DataFrame.to_sql()) interpolates the name
and schema parameters directly into SQL strings without quoting or escaping. This allows SQL identifier injection
when those parameters contain user-controlled or untrusted values.

Three methods are affected in pandas/io/sql.py:

ADBCDatabase.delete_rows (line ~2445) — vulnerable:

# BEFORE (vulnerable)
table_name = f"{schema}.{name}" if schema else name
self.execute(f"DELETE FROM {table_name}").close()

ADBCDatabase.read_table (line ~2239) — vulnerable:

# BEFORE (vulnerable)
stmt = f"SELECT {select_list} FROM {schema}.{table_name}"

ADBCDatabase.to_sql (line ~2401) — vulnerable:

# BEFORE (vulnerable)
sql_statement = f"DROP TABLE {table_name}"

By contrast, SQLiteDatabase.delete_rows already uses _get_valid_sqlite_name(), which wraps identifiers in double quotes
and escapes embedded quotes. ADBCDatabase has no equivalent helper, and this pattern was not followed when ADBC support
was added.

Expected Behavior

Table names and schema names should be quoted as SQL identifiers before interpolation, the same way SQLiteDatabase
already does with _get_valid_sqlite_name().

Expected generated SQL:
-- name = "audit_log"
DELETE FROM "audit_log"

-- name = "users; DROP TABLE audit_log; --"
DELETE FROM "users; DROP TABLE audit_log; --"  -- treated as a literal identifier, not executable SQL
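
To illustrate why the quoted form is harmless, here is a minimal sketch that runs the second statement above against an in-memory database. It uses the standard-library sqlite3 module rather than an ADBC driver (an assumption made so the snippet has no extra dependencies), and the function name run_quoted_delete is hypothetical. Other backends are expected to behave similarly, but that was not checked here.

```python
import sqlite3


def run_quoted_delete() -> tuple[str, int]:
    """Run the quoted DELETE and report the error text and surviving row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE audit_log (secret TEXT)")
    conn.executemany(
        "INSERT INTO audit_log VALUES (?)", [("password123",), ("api_key",)]
    )

    # The whole payload sits inside one double-quoted identifier, so the
    # database looks for a table with that exact name instead of running
    # the embedded DROP TABLE.
    stmt = 'DELETE FROM "users; DROP TABLE audit_log; --"'
    try:
        conn.execute(stmt)
        error = ""
    except sqlite3.OperationalError as exc:
        error = str(exc)

    remaining = conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
    conn.close()
    return error, remaining


error, remaining = run_quoted_delete()
print("error:", error)
print("rows left in audit_log:", remaining)
```

The statement fails because no table has that name, and audit_log keeps both of its rows.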

A new helper _get_valid_adbc_name() using ANSI SQL double-quote escaping (compatible with PostgreSQL, DuckDB, SQLite,
and other ADBC targets) should be applied to all three methods.

Installed Versions

INSTALLED VERSIONS

commit : ab90747
python : 3.12.10
python-bits : 64
OS : Darwin
OS-release : 25.3.0
Version : Darwin Kernel Version 25.3.0: Wed Jan 28 20:56:35 PST 2026;
root:xnu-12377.91.3~2/RELEASE_ARM64_T6030
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : None.UTF-8

pandas : 3.0.2
numpy : 2.4.4
dateutil : 2.9.0.post0
pip : None
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
psycopg2 : None
pymysql : None
pyarrow : 23.0.1
pyiceberg : None
pyreadstat : None
pytest : None
python-calamine : None
pytz : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
qtpy : None
pyqt5 : None

Note: adbc-driver-sqlite==1.10.0 and adbc-driver-postgresql==1.10.0 were used for reproduction (installed separately).
