XForm for index joins with groupby and project on the inner side (#10711)
* Add a CPatternNode operator
This operator can be used in patterns of xforms and it can match one of
multiple regular operators.
Which operators it matches depends on its match type. Right now, there is
only one match type, EmtMatchInnerOrLeftOuterJoin, that matches a
logical inner join or a logical left outer join.
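A minimal sketch of the matching idea follows. It is not the actual CPatternNode code: the `OperatorId` enum, the `PatternNode` class, and the `MatchesOperatorId` method are hypothetical stand-ins, and only the match type name EmtMatchInnerOrLeftOuterJoin is taken from the text above.

```cpp
// Hypothetical, simplified operator ids; not the real optimizer enum.
enum class OperatorId {
    LogicalInnerJoin,
    LogicalLeftOuterJoin,
    LogicalGet,
    LogicalGbAgg
};

// The single match type described in this commit message.
enum class MatchType { EmtMatchInnerOrLeftOuterJoin };

// Simplified stand-in for a pattern operator that can match
// one of several regular operators, depending on its match type.
class PatternNode {
public:
    explicit PatternNode(MatchType type) : m_type(type) {}

    // Returns true if the given concrete operator id is accepted.
    bool MatchesOperatorId(OperatorId id) const {
        switch (m_type) {
            case MatchType::EmtMatchInnerOrLeftOuterJoin:
                return id == OperatorId::LogicalInnerJoin ||
                       id == OperatorId::LogicalLeftOuterJoin;
        }
        return false;
    }

private:
    MatchType m_type;
};
```

Adding another match type in this sketch would mean adding one enum value and one `case`, leaving the patterns that use the node unchanged.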
* Add 2 new xforms for index apply
The new xforms use the new CPatternNode for patterns of the form
CPatternNode(Leaf, Tree, Tree)
The CPatternNode matches both an inner and a left outer join.
The Tree for the right child can contain arbitrary operators.
To avoid an explosion of the search space, we add two conditions
to the xform: First, the xform is applied only once, not for all
bindings of the Tree node. Second, the xform is applied only if
the right child has a "join depth" of 1, excluding any children
that are complex and wouldn't satisfy the conditions of this xform
anyway.
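The second condition can be illustrated with a small sketch. The `Expr` type, the `Kind` enum, and the way join depth is computed here (a get counts as 1, every other operator sums its children) are assumptions made for illustration; the real derivation in the optimizer may differ.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified logical operator kinds.
enum class Kind { Get, Select, Project, GbAgg, Join };

// Hypothetical expression tree node.
struct Expr {
    Kind kind;
    std::vector<std::shared_ptr<Expr>> children;
};

// Convenience constructor for building small trees.
std::shared_ptr<Expr> Make(Kind kind,
                           std::vector<std::shared_ptr<Expr>> children = {}) {
    return std::make_shared<Expr>(Expr{kind, std::move(children)});
}

// Assumed definition of join depth: a Get contributes 1,
// any other operator contributes the sum of its children.
int JoinDepth(const Expr& e) {
    if (e.kind == Kind::Get) {
        return 1;
    }
    int depth = 0;
    for (const auto& child : e.children) {
        depth += JoinDepth(*child);
    }
    return depth;
}

// The xform is considered only if the inner (right) child
// has a join depth of exactly 1.
bool InnerSideQualifies(const Expr& inner) {
    return JoinDepth(inner) == 1;
}
```

Under these assumptions, a project over a groupby over a single get qualifies, while an inner side that itself contains a join of two gets has depth 2 and is skipped.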
* Remove 16 obsolete xforms and replace them with 2 new ones
To remove xforms, we have to add a mechanism to skip unused xform ids,
to preserve the ids of the remaining xforms that are used in trace flags.
Our array of xforms now allows for holes.
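One way to picture an xform array with holes is sketched below. The `XFormFactory` class, its methods, and the xform names used with it are hypothetical and only illustrate the idea of keeping ids stable while leaving retired slots empty.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical xform record.
struct XForm {
    std::size_t id;
    std::string name;
};

// Hypothetical registry indexed by xform id. Slots of removed
// xforms stay null so the remaining ids keep their values.
class XFormFactory {
public:
    explicit XFormFactory(std::size_t idCount) : m_xforms(idCount) {}

    // Registers an xform under its fixed id.
    void Add(std::size_t id, const std::string& name) {
        m_xforms.at(id) = std::make_unique<XForm>(XForm{id, name});
    }

    // Documents that an id is retired; the slot remains a hole.
    void SkipUnused(std::size_t id) { m_xforms.at(id).reset(); }

    // Returns nullptr for holes and out-of-range ids.
    const XForm* Lookup(std::size_t id) const {
        return id < m_xforms.size() ? m_xforms[id].get() : nullptr;
    }

    // Iteration must tolerate holes by checking for null.
    std::vector<std::size_t> ActiveIds() const {
        std::vector<std::size_t> ids;
        for (std::size_t i = 0; i < m_xforms.size(); ++i) {
            if (m_xforms[i]) {
                ids.push_back(i);
            }
        }
        return ids;
    }

private:
    std::vector<std::unique_ptr<XForm>> m_xforms;
};
```

Any caller that walks the array, such as a test program, needs the same null check that `ActiveIds` performs.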
* Changes to unit test programs
Updated the test programs to tolerate holes in the xform array.
Used a new xform instead of one of the removed ones.
* MDP changes
* ICG changes
* Fixes for ICG failures in join and aggregates tests
The new xform allows additional plans that are chosen in the explain
output. It also surfaced a bug where we can't eliminate a groupby that
sits on top of a CLogicalIndexGet, because the index get doesn't derive
a key set.
* Support for project nodes in index nested loop joins
When generating required distribution specs for its children,
CPhysicalInnerIndexNLJoin will start with its inner child and send it
an ANY required distribution spec. It will then force the outer child
to match the inner's distribution spec (or require a broadcast on the outer).
Now, assume we have CPhysicalComputeScalar as the inner child. This
node, in CPhysicalComputeScalar::PdsRequired will currently require
its child to be REPLICATED (opt request 1) or SINGLETON (opt request
0), if the expression has any outer references. This won't work, since
the underlying table has neither of these distribution schemes and
since we don't want any motions between the index join and the
index get.
This commit changes the behavior of a CPhysicalComputeScalar. If it
senses that it is part of an index nested loop join, it will just
propagate the required distribution spec from the parent.
How does it sense that? By the required ANY distribution spec that
allows outer references. This request is generated in only two places:
CPhysicalInnerIndexNLJoin::PdsRequired and
CPhysicalLeftOuterIndexNLJoin::PdsRequired, so it is only used in the
context of an index NLJ. This behavior mirrors the CPhysicalFilter node,
which is the other node allowed between an index NLJ and the get.
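The decision described above can be sketched as follows. This is not the code of CPhysicalComputeScalar::PdsRequired: the `DistSpec` struct, the `DistKind` enum, and the function name are hypothetical, and the final pass-through branch for expressions without outer references is a simplifying assumption.

```cpp
// Hypothetical, simplified distribution kinds.
enum class DistKind { Any, Singleton, Replicated };

// Hypothetical distribution spec. The flag is only meaningful for Any.
struct DistSpec {
    DistKind kind;
    bool allowsOuterRefs;
};

// Sketch of the requirement a compute scalar sends to its child.
DistSpec ComputeScalarChildRequirement(const DistSpec& required,
                                       bool hasOuterRefs,
                                       unsigned optRequest) {
    // An ANY spec that allows outer references is only generated by
    // the index NLJ operators, so propagate it unchanged.
    if (required.kind == DistKind::Any && required.allowsOuterRefs) {
        return required;
    }
    // Previous behavior for expressions with outer references:
    // request 0 asks for SINGLETON, request 1 asks for REPLICATED.
    if (hasOuterRefs) {
        return optRequest == 0 ? DistSpec{DistKind::Singleton, false}
                               : DistSpec{DistKind::Replicated, false};
    }
    // Assumption: without outer references, pass the requirement through.
    return required;
}
```

Checking the index NLJ case first keeps the SINGLETON and REPLICATED requests from ever reaching the index get, so no motion is placed between the join and the get.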
Co-authored-by: David Kimura <dkimura@vmware.com>