Committed by Omer Arap on Aug 29, 2017
GPORCA should not spend time extracting column statistics that are not
needed for cardinality estimation. This commit eliminates the overhead
of requesting and generating statistics for columns that are not used
in cardinality estimation.

For example:
`CREATE TABLE foo (a int, b int, c int);`

For table `foo`, the query below only needs stats for column `a`, which
is the distribution column, and column `c`, which is the column used in
the where clause.
`select * from foo where c=2;`

However, prior to this commit, the column statistics for column `b` were
also calculated and passed for cardinality estimation. The only
information the optimizer needs about column `b` is its `width`. For
this single value, we transferred all of the stats information for that
column.

This commit and its counterpart commit in GPORCA ensure that the column
width information is passed and extracted in the `dxl:Relation`
metadata.
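
The selection rule described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: `ColumnInfo` and
`ColumnsNeedingStats` are hypothetical names, not the GPORCA or GPDB
translator API, and the widths are placeholder values.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified column descriptor (not a GPORCA/GPDB type).
// The width travels with the relation metadata, so it is available even
// when no statistics are requested for the column.
struct ColumnInfo {
    std::string name;
    int width;  // column width in bytes (illustrative value)
};

// Return the names of the columns that need full statistics: the
// distribution columns plus the columns referenced in predicates.
// Every other column contributes only its width.
std::set<std::string> ColumnsNeedingStats(
    const std::vector<ColumnInfo>& columns,
    const std::set<std::string>& distribution_cols,
    const std::set<std::string>& predicate_cols)
{
    std::set<std::string> needed;
    for (const ColumnInfo& col : columns) {
        if (distribution_cols.count(col.name) > 0 ||
            predicate_cols.count(col.name) > 0) {
            needed.insert(col.name);
        }
    }
    return needed;
}
```

For table `foo` with distribution column `a` and predicate column `c`,
this sketch selects `a` and `c` for full statistics and leaves `b` with
its width only.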

Preliminary results for short-running queries show up to a 65x
performance improvement.
Signed-off-by: Jemish Patel <jpatel@pivotal.io>
5b659321
CTranslatorUtils.h 13.2 KB