Hydration round-trips every fetched value through std::wstring (lossy + slow reads) #17

@jkalias

Problem

Database::Fetch builds a FetchQueryResults whose row_values is a std::vector<std::vector<std::wstring>> (include/fetch_query_results.h). FetchRecordsQuery::GetResults stringifies every column into a wstring via GetColumnValue, and FetchRecordsQuery::Hydrate then parses each string back into the typed member. This text round-trip on the read path is both a performance cost and the root cause of two concrete data-corruption bugs.

Concrete defects caused by the text round-trip

1. 64-bit integers truncated to 32 bits (consolidated from #15)
GetColumnValue reads SQLITE_INTEGER with the 32-bit sqlite3_column_int (return std::to_wstring(sqlite3_column_int(stmt_, col));), and FetchMaxIdQuery::GetMaxId does the same. Any stored int64_t above INT32_MAX — including AUTOINCREMENT ids once they grow large — is silently truncated/wrapped on read, breaking subsequent Fetch(id) / Update / Delete. Fix: use sqlite3_column_int64.
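
The wrap-around described above can be reproduced without SQLite. The following is a minimal sketch, assuming the 32-bit read keeps only the low 32 bits of the stored value; `ReadThrough32Bit` is a hypothetical helper for illustration, not a function from this repository.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: models reading a stored 64-bit integer through a
// 32-bit accessor by keeping only the low 32 bits of the value.
inline std::int64_t ReadThrough32Bit(std::int64_t stored) {
    // Conversion to an unsigned type is well-defined: reduction modulo 2^32.
    const std::uint32_t low = static_cast<std::uint32_t>(stored);
    // Reinterpret the bit pattern as a signed 32-bit integer.
    std::int32_t narrowed;
    std::memcpy(&narrowed, &low, sizeof narrowed);
    // Widen again, as the caller would when storing into an int64_t member.
    return static_cast<std::int64_t>(narrowed);
}
```

Under that assumption, a stored value of 2147483648 (INT32_MAX + 1) reads back as -2147483648, and 5000000000 reads back as 705032704, while values that fit in 32 bits are unaffected. That is why the bug only surfaces once ids grow large.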

2. REAL values lose precision (consolidated from #16)
GetColumnValue formats SQLITE_FLOAT via std::to_wstring(double), which uses %f with a fixed 6 decimal places, so e.g. 0.123456789 round-trips back as 0.123457 and large/scientific magnitudes are mangled. Writes are fine (sqlite3_bind_double stores the true value); the loss is only on the read path. Fix: read sqlite3_column_double directly, or format with full round-trip precision (%.17g).
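
The precision loss can be illustrated with the standard library alone. This is a sketch, assuming a toolchain where `std::to_wstring(double)` still follows the `%f` behavior described above (the pre-C++26 specification) and the default "C" locale; both helper names are hypothetical.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: the lossy path, double -> wstring -> double.
inline double RoundTripViaToWstring(double value) {
    return std::stod(std::to_wstring(value));
}

// Hypothetical helper: format with 17 significant digits, which is enough
// to reconstruct any finite IEEE 754 double exactly, then parse it back.
inline double RoundTripVia17g(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.17g", value);
    return std::strtod(buffer, nullptr);
}
```

With these helpers, `RoundTripViaToWstring(0.123456789)` no longer compares equal to the input, whereas `RoundTripVia17g(0.123456789)` does. Reading `sqlite3_column_double` directly avoids the formatting step altogether, so the `%.17g` route only matters if a text representation has to be kept.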

Performance / memory

  • Every value incurs an allocation + parse on top of the SQLite copy; significant for wide rows / large result sets.
  • The entire result set is materialized as strings before hydration (no streaming), so large queries hold the whole set twice (string + typed form).

Suggested direction

Hydrate directly from the prepared statement using the typed column accessors (sqlite3_column_int64, sqlite3_column_double, sqlite3_column_text/_bytes) into the destination members, skipping the intermediate wstring. This removes the per-value allocation/parse, fixes both data-corruption bugs above at the source, and opens the door to row-by-row streaming.
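
One possible shape for this, as a sketch only: the hydrator below is templated on a row type so that it builds without SQLite. `Person`, `FakeRow`, `HydratePerson` and the `Column*` method names are all hypothetical; in the real code each `Column*` call would be the `sqlite3_column_*` accessor named in its comment, applied to the stepped statement.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical destination type; not taken from the repository.
struct Person {
    std::int64_t id = 0;
    double weight = 0.0;
    std::string name;
};

// Stand-in for a stepped prepared statement (assumption for illustration).
// The column index is ignored here because each method serves one column.
struct FakeRow {
    std::int64_t id;
    double weight;
    std::string name;

    std::int64_t ColumnInt64(int) const { return id; }         // sqlite3_column_int64
    double ColumnDouble(int) const { return weight; }          // sqlite3_column_double
    const char* ColumnText(int) const { return name.data(); }  // sqlite3_column_text
    int ColumnBytes(int) const {                                // sqlite3_column_bytes
        return static_cast<int>(name.size());
    }
};

// Copies each column straight into its typed member: no intermediate
// std::wstring, no per-value formatting or parsing.
template <typename Row>
Person HydratePerson(const Row& row) {
    Person person;
    person.id = row.ColumnInt64(0);
    person.weight = row.ColumnDouble(1);
    person.name.assign(row.ColumnText(2),
                       static_cast<std::size_t>(row.ColumnBytes(2)));
    return person;
}
```

Because the hydrator consumes one row at a time, the same function could be called from inside the step loop, which is what would make row-by-row streaming possible.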

Acceptance criteria

  • Integer columns round-trip values greater than INT32_MAX exactly (regression test).
  • Double columns round-trip high-precision values exactly (regression test).
  • Hydration reads typed columns directly rather than via std::wstring.

Related (tracked separately)


Consolidates #15 (int truncation) and #16 (REAL precision), which were symptoms of this single root cause and are closed as duplicates of this issue.
