sql/stats: inaccurate avg_size in statistics for column in single-column column family #137713
Labels: A-sql-table-stats, branch-release-22.1, branch-release-22.2, branch-release-23.1, branch-release-23.2, branch-release-24.1, branch-release-24.2, branch-release-24.3, C-bug, T-sql-queries
Single-column column families are encoded using legacy marshaling. When we collect statistics on these single-column column families, we estimate their size using Datum.Size() instead of the actual on-disk size, and this can be very inaccurate. Here's an example:
The avg_size of both a and b is 3 bytes, but avg_size of c is 48 bytes because DEnum.Size() includes a bunch of metadata that isn't written to disk.

Jira issue: CRDB-45715
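To illustrate why an in-memory size estimate can overshoot the on-disk size for an enum column, here is a minimal Go sketch. Everything in it is hypothetical: `enumDatum`, `inMemorySize`, `encodedSize`, and `avgSize` are illustrative names, not CockroachDB's `tree.DEnum` or its stats collection code, and the byte counts are not meant to reproduce the numbers reported above.

```go
package main

import "fmt"

// enumDatum is a hypothetical stand-in for an in-memory enum datum.
// It is NOT CockroachDB's tree.DEnum; it only mimics the idea that the
// in-memory value carries type metadata next to a small physical value.
type enumDatum struct {
	physicalRep []byte   // bytes that would actually be written to disk
	logicalRep  string   // human-readable label of this value
	typeName    string   // metadata: name of the enum type
	allLabels   []string // metadata: every label defined by the enum type
}

// inMemorySize roughly approximates the in-memory footprint, metadata included.
func inMemorySize(d enumDatum) int {
	n := len(d.physicalRep) + len(d.logicalRep) + len(d.typeName)
	for _, l := range d.allLabels {
		n += len(l)
	}
	return n
}

// encodedSize counts only the bytes that would be written to disk.
func encodedSize(d enumDatum) int {
	return len(d.physicalRep)
}

// avgSize averages sizeOf over all rows; it returns 0 for an empty input.
func avgSize(rows []enumDatum, sizeOf func(enumDatum) int) float64 {
	if len(rows) == 0 {
		return 0
	}
	total := 0
	for _, r := range rows {
		total += sizeOf(r)
	}
	return float64(total) / float64(len(rows))
}

func main() {
	labels := []string{"small", "medium", "large"}
	rows := []enumDatum{
		{physicalRep: []byte{0x40}, logicalRep: "small", typeName: "size", allLabels: labels},
		{physicalRep: []byte{0x80}, logicalRep: "medium", typeName: "size", allLabels: labels},
		{physicalRep: []byte{0xc0}, logicalRep: "large", typeName: "size", allLabels: labels},
	}
	fmt.Printf("avg size from in-memory estimate: %.1f bytes\n", avgSize(rows, inMemorySize))
	fmt.Printf("avg size from encoded bytes:      %.1f bytes\n", avgSize(rows, encodedSize))
}
```

The two averages diverge because the in-memory estimate counts per-value metadata that the encoded representation never stores, which is the same shape of discrepancy described in this issue.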