Unverified commit 5b4c4f59, authored by Zhenghua Lyu, committed by GitHub

Correct plan of general & segmentGeneral path with volatile functions.

General and segmentGeneral locus imply that the corresponding slice,
if executed on many different segments, should provide the same result
data set. Thus, in some cases, General and segmentGeneral can be
treated like broadcast.

But what if a segmentGeneral or general locus path contains volatile
functions? Volatile functions, by definition, do not guarantee the same
result across invocations. In such cases the path loses that property
and cannot be treated as *general. Previously, the Greenplum planner
did not handle these cases correctly. Limit on a general or
segmentgeneral path had the same issue.

The idea of the fix in this commit is: when we find the pattern (a
general or segmentGeneral locus path that contains volatile functions),
we create a motion path above it to turn its locus into singleQE and
then create a projection path. The core job then becomes choosing the
places to check:

  1. For a single base rel, we only need to check its restrictions; this
     happens at the bottom of the planner, in the function set_rel_pathlist
  2. When creating a join path, if the join locus is general or segmentGeneral,
     check its joinqual to see if it contains volatile functions
  3. When handling a subquery, we invoke the function set_subquery_pathlist;
     at the end of that function, check the targetlist and havingQual
  4. When creating a limit path, the same check-and-change algorithm is applied
  5. Correctly handle make_subplan

The ORDER BY clause and GROUP BY clause are included in the targetlist
and handled by Step 3 above.

This commit also fixes DML on replicated tables. UPDATE and DELETE
statements on a replicated table are special: they have to be dispatched
to every segment to execute. So if they contain volatile functions in
their targetList or WHERE clause, we must reject such statements:

  1. For the targetList, we check it in the function create_motion_path_for_upddel
  2. The WHERE clause is handled in the query planner: when we find the
     pattern and want to fix it, we additionally check whether we are
     updating or deleting a replicated table, and if so reject the statement.

Cherry-picked from commit d1f9b96b from master to 6X.
Parent 777a4cdc
......@@ -2039,3 +2039,54 @@ has_redistributable_clause(RestrictInfo *restrictinfo)
return restrictinfo->hashjoinoperator != InvalidOid;
}
/*
* turn_volatile_seggen_to_singleqe
*
* This function is the key tool to build a correct plan
* for general or segmentgeneral locus paths that contain
* volatile functions.
*
* If we find such a pattern:
* 1. if we are in an update or delete statement on a replicated
* table, simply reject the query
* 2. if it is general locus, simply change it to singleQE
* 3. if it is segmentgeneral, use a motion to bring it to
* singleQE and then create a projection path
*
* If we do not find the pattern, simply return the input path.
*
* The last parameter of this function is the part that we want
* to check for volatile functions.
*/
Path *
turn_volatile_seggen_to_singleqe(PlannerInfo *root, Path *path, Node *node)
{
if ((CdbPathLocus_IsSegmentGeneral(path->locus) || CdbPathLocus_IsGeneral(path->locus)) &&
(contain_volatile_functions(node)))
{
CdbPathLocus singleQE;
Path *mpath;
ProjectionPath *ppath;
if (root->upd_del_replicated_table > 0 &&
bms_is_member(root->upd_del_replicated_table,
path->parent->relids))
elog(ERROR, "could not devise a plan");
if (CdbPathLocus_IsGeneral(path->locus))
{
CdbPathLocus_MakeSingleQE(&(path->locus),
getgpsegmentCount());
return path;
}
CdbPathLocus_MakeSingleQE(&singleQE,
CdbPathLocus_NumSegments(path->locus));
mpath = cdbpath_create_motion_path(root, path, NIL, false, singleQE);
ppath = create_projection_path_with_quals(root, mpath->parent, mpath, NIL);
ppath->force = true;
return (Path *) ppath;
}
else
return path;
}
......@@ -45,6 +45,7 @@
#include "cdb/cdbmutate.h" /* cdbmutate_warn_ctid_without_segid */
#include "cdb/cdbpath.h" /* cdbpath_rows() */
#include "cdb/cdbsetop.h"
// TODO: these planner gucs need to be refactored into PlannerConfig.
bool gp_enable_sort_limit = FALSE;
......@@ -122,6 +123,7 @@ static void recurse_push_qual(Node *setOp, Query *topquery,
RangeTblEntry *rte, Index rti, Node *qual);
static void bring_to_singleQE(PlannerInfo *root, RelOptInfo *rel, List *outer_quals);
static bool is_query_contain_limit_groupby(Query *parse);
static void handle_gen_seggen_volatile_path(PlannerInfo *root, RelOptInfo *rel);
/*
* make_one_rel
......@@ -448,6 +450,40 @@ bring_to_singleQE(PlannerInfo *root, RelOptInfo *rel, List *outer_quals)
set_cheapest(rel);
}
/*
* handle_gen_seggen_volatile_path
*
* Only used for a base rel.
* Change the paths in its pathlist if they match the pattern
* (a segmentgeneral or general path that contains volatile restrictions).
*/
static void
handle_gen_seggen_volatile_path(PlannerInfo *root, RelOptInfo *rel)
{
List *origpathlist;
ListCell *lc;
origpathlist = rel->pathlist;
rel->cheapest_startup_path = NULL;
rel->cheapest_total_path = NULL;
rel->cheapest_unique_path = NULL;
rel->cheapest_parameterized_paths = NIL;
rel->pathlist = NIL;
foreach(lc, origpathlist)
{
Path *origpath = (Path *) lfirst(lc);
Path *path;
path = turn_volatile_seggen_to_singleqe(root,
origpath,
(Node *) (rel->baserestrictinfo));
add_path(rel, path);
}
set_cheapest(rel);
}
/*
* set_rel_pathlist
* Build access paths for a base relation
......@@ -516,6 +552,14 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
bring_to_singleQE(root, rel, rel->upperrestrictinfo);
}
/*
* Greenplum specific behavior:
* Change the path in pathlist if it is a general or segmentgeneral
* path that contains volatile restrictions.
*/
if (rel->reloptkind == RELOPT_BASEREL)
handle_gen_seggen_volatile_path(root, rel);
#ifdef OPTIMIZER_DEBUG
debug_print_rel(root, rel);
#endif
......@@ -1487,6 +1531,21 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
/* XXX rel->onerow = ??? */
}
if (rel->subplan->flow->locustype == CdbLocusType_General &&
(contain_volatile_functions((Node *) rel->subplan->targetlist) ||
contain_volatile_functions(subquery->havingQual)))
{
rel->subplan->flow->locustype = CdbLocusType_SingleQE;
rel->subplan->flow->flotype = FLOW_SINGLETON;
}
if (rel->subplan->flow->locustype == CdbLocusType_SegmentGeneral &&
(contain_volatile_functions((Node *) rel->subplan->targetlist) ||
contain_volatile_functions(subquery->havingQual)))
{
rel->subplan = (Plan *) make_motion_gather(subroot, rel->subplan, NIL);
}
rel->subroot = subroot;
/* Isolate the params needed by this specific subplan */
......
......@@ -1310,6 +1310,7 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path)
* not using.)
*/
if (!best_path->cdb_restrict_clauses &&
!best_path->force &&
(is_projection_capable_plan(subplan) ||
tlist_same_exprs(tlist, subplan->targetlist)))
{
......@@ -6480,6 +6481,12 @@ make_modifytable(PlannerInfo *root,
{
Plan *subplan = (Plan *) lfirst(subnode);
if (operation != CMD_INSERT &&
(subplan->flow->locustype == CdbLocusType_SegmentGeneral ||
subplan->flow->locustype == CdbLocusType_General) &&
contain_volatile_functions((Node *) subplan->targetlist))
elog(ERROR, "could not devise a plan");
if (subnode == list_head(subplans)) /* first node? */
plan->startup_cost = subplan->startup_cost;
plan->total_cost += subplan->total_cost;
......
......@@ -3218,6 +3218,12 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
offset_est,
count_est);
result_plan->flow = pull_up_Flow(result_plan, result_plan->lefttree);
if (result_plan->flow->locustype == CdbLocusType_General)
{
result_plan->flow->locustype = CdbLocusType_SingleQE;
result_plan->flow->flotype = FLOW_SINGLETON;
}
}
/*
......
......@@ -615,6 +615,15 @@ make_subplan(PlannerInfo *root, Query *orig_subquery, SubLinkType subLinkType,
&subroot,
config);
if ((plan->flow->locustype == CdbLocusType_SegmentGeneral ||
plan->flow->locustype == CdbLocusType_General) &&
(contain_volatile_functions((Node *) plan->targetlist) ||
contain_volatile_functions(subquery->havingQual)))
{
plan->flow->locustype = CdbLocusType_SingleQE;
plan->flow->flotype = FLOW_SINGLETON;
}
/* Isolate the params needed by this specific subplan */
plan_params = root->plan_params;
root->plan_params = NIL;
......
......@@ -1209,6 +1209,16 @@ contain_volatile_functions_walker(Node *node, void *context)
contain_volatile_functions_walker,
context, 0);
}
else if (IsA(node, RestrictInfo))
{
/*
* We need to handle RestrictInfo here; one case that relies on
* this is a replicated table with a volatile restriction.
* We have to find the pattern and turn the path into singleQE.
*/
RestrictInfo *info = (RestrictInfo *) node;
return contain_volatile_functions_walker((Node *) info->clause, context);
}
return expression_tree_walker(node, contain_volatile_functions_walker,
context);
......
......@@ -3088,7 +3088,7 @@ calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path)
*
* Returns the resulting path node.
*/
NestPath *
Path *
create_nestloop_path(PlannerInfo *root,
RelOptInfo *joinrel,
JoinType jointype,
......@@ -3232,7 +3232,26 @@ create_nestloop_path(PlannerInfo *root,
final_cost_nestloop(root, pathnode, workspace, sjinfo, semifactors);
return pathnode;
/*
* Greenplum specific behavior:
* If we find that the join locus is general or segmentgeneral,
* we should check the joinqual; if it contains volatile functions
* we have to turn the join path into singleQE.
*
* NB: we do not add this logic to the create_unique_rowid_path
* code block above; the reason is:
* create_unique_rowid_path is a technique to implement semi join
* using a normal join, and it can only happen for sublink queries:
* 1. if the sublink query contains a volatile target list or
* havingQual, it cannot be pulled up in pull_up_subquery, so it
* will become a subselect and be handled in set_subquery_pathlist
* 2. if the sublink query contains volatile functions in the
* joinqual or where clause, it will be handled in set_rel_pathlist
* and here.
*/
return turn_volatile_seggen_to_singleqe(root,
(Path *) pathnode,
(Node *) (pathnode->joinrestrictinfo));
}
/*
......@@ -3262,7 +3281,7 @@ create_nestloop_path(PlannerInfo *root,
* 'innersortkeys' are the sort varkeys for the inner relation
* or NIL to use existing ordering
*/
MergePath *
Path *
create_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
JoinType jointype,
......@@ -3385,7 +3404,12 @@ create_mergejoin_path(PlannerInfo *root,
final_cost_mergejoin(root, pathnode, workspace, sjinfo);
return pathnode;
/*
* See the comments at the end of create_nestloop_path.
*/
return turn_volatile_seggen_to_singleqe(root,
(Path *) pathnode,
(Node *) (pathnode->jpath.joinrestrictinfo));
}
/*
......@@ -3404,7 +3428,7 @@ create_mergejoin_path(PlannerInfo *root,
* 'hashclauses' are the RestrictInfo nodes to use as hash clauses
* (this should be a subset of the restrict_clauses list)
*/
HashPath *
Path *
create_hashjoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
JoinType jointype,
......@@ -3518,7 +3542,12 @@ create_hashjoin_path(PlannerInfo *root,
final_cost_hashjoin(root, pathnode, workspace, sjinfo, semifactors);
return pathnode;
/*
* See the comments at the end of create_nestloop_path.
*/
return turn_volatile_seggen_to_singleqe(root,
(Path *) pathnode,
(Node *) (pathnode->jpath.joinrestrictinfo));
}
/*
......
......@@ -44,4 +44,7 @@ cdbpath_dedup_fixup(PlannerInfo *root, Path *path);
bool
cdbpath_contains_wts(Path *path);
Path *
turn_volatile_seggen_to_singleqe(PlannerInfo *root, Path *path, Node *node);
#endif /* CDBPATH_H */
......@@ -1383,6 +1383,12 @@ typedef struct ProjectionPath
bool dummypp; /* true if no separate Result is needed */
List *cdb_restrict_clauses;
/*
* Greenplum specific field:
* If force is true, we always create a Result plannode.
*/
bool force;
} ProjectionPath;
/*
......
......@@ -116,7 +116,7 @@ extern Relids calc_nestloop_required_outer(Path *outer_path, Path *inner_path);
extern Relids calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path);
extern bool path_contains_inner_index(Path *path);
extern NestPath *create_nestloop_path(PlannerInfo *root,
extern Path *create_nestloop_path(PlannerInfo *root,
RelOptInfo *joinrel,
JoinType jointype,
JoinCostWorkspace *workspace,
......@@ -129,7 +129,7 @@ extern NestPath *create_nestloop_path(PlannerInfo *root,
List *pathkeys,
Relids required_outer);
extern MergePath *create_mergejoin_path(PlannerInfo *root,
extern Path *create_mergejoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
JoinType jointype,
JoinCostWorkspace *workspace,
......@@ -144,7 +144,7 @@ extern MergePath *create_mergejoin_path(PlannerInfo *root,
List *outersortkeys,
List *innersortkeys);
extern HashPath *create_hashjoin_path(PlannerInfo *root,
extern Path *create_hashjoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
JoinType jointype,
JoinCostWorkspace *workspace,
......
......@@ -390,6 +390,183 @@ WINDOW w AS (order by a);
-------------+---
(0 rows)
-- volatile general
-- General and segmentGeneral locus imply that the corresponding
-- slice, if executed on many different segments, should provide the
-- same result data set. Thus, in some cases, General and segmentGeneral
-- can be treated like broadcast. But if a segmentGeneral or general
-- locus path contains volatile functions, it loses that property and
-- can only be treated as singleQE. The following cases check that
-- we correctly handle all of them.
-- FIXME: ORCA does not yet consider this for the following SQL. We
-- should fix these cases when ORCA changes.
set optimizer = off;
create table t_hashdist(a int, b int, c int) distributed by (a);
---- pushed down filter
explain (costs off)
select * from
(select a from generate_series(1, 10)a) x, t_hashdist
where x.a > random();
QUERY PLAN
----------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Function Scan on generate_series a
Filter: ((a)::double precision > random())
Optimizer: Postgres query optimizer
(8 rows)
---- join qual
explain (costs off) select * from
t_hashdist,
(select a from generate_series(1, 10) a) x,
(select a from generate_series(1, 10) a) y
where x.a + y.a > random();
QUERY PLAN
---------------------------------------------------------------------
Nested Loop
-> Nested Loop
Join Filter: (((a.a + a_1.a))::double precision > random())
-> Function Scan on generate_series a
-> Function Scan on generate_series a_1
-> Materialize
-> Gather Motion 3:1 (slice1; segments: 3)
-> Seq Scan on t_hashdist
Optimizer: Postgres query optimizer
(9 rows)
---- sublink & subquery
explain (costs off) select * from t_hashdist where a > All (select random() from generate_series(1, 10));
QUERY PLAN
------------------------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop Left Anti Semi (Not-In) Join
Join Filter: ((t_hashdist.a)::double precision <= "NotIn_SUBQUERY".random)
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Subquery Scan on "NotIn_SUBQUERY"
-> Function Scan on generate_series
Optimizer: Postgres query optimizer
(9 rows)
explain (costs off) select * from t_hashdist where a in (select random()::int from generate_series(1, 10));
QUERY PLAN
------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Hash Semi Join
Hash Cond: (t_hashdist.a = ((random())::integer))
-> Seq Scan on t_hashdist
-> Hash
-> Redistribute Motion 1:3 (slice1; segments: 1)
Hash Key: ((random())::integer)
-> Function Scan on generate_series
Optimizer: Postgres query optimizer
(9 rows)
-- subplan
explain (costs off, verbose) select * from
t_hashdist left join (select a from generate_series(1, 10) a) x on t_hashdist.a > any (select random() from generate_series(1, 10));
QUERY PLAN
-----------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
Output: t_hashdist.a, t_hashdist.b, t_hashdist.c, a.a
-> Nested Loop Left Join
Output: t_hashdist.a, t_hashdist.b, t_hashdist.c, a.a
Join Filter: (SubPlan 1)
-> Seq Scan on public.t_hashdist
Output: t_hashdist.a, t_hashdist.b, t_hashdist.c
-> Function Scan on pg_catalog.generate_series a
Output: a.a
Function Call: generate_series(1, 10)
SubPlan 1 (slice2; segments: 3)
-> Materialize
Output: random()
-> Broadcast Motion 1:3 (slice1; segments: 1)
Output: (random())
-> Function Scan on pg_catalog.generate_series
Output: random()
Function Call: generate_series(1, 10)
Optimizer: Postgres query optimizer
Settings: enable_bitmapscan=off, enable_seqscan=off, optimizer=off
(20 rows)
-- targetlist
explain (costs off) select * from t_hashdist cross join (select random () from generate_series(1, 10))x;
QUERY PLAN
---------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Function Scan on generate_series
Optimizer: Postgres query optimizer
(7 rows)
explain (costs off) select * from t_hashdist cross join (select a, sum(random()) from generate_series(1, 10) a group by a) x;
QUERY PLAN
------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> HashAggregate
Group Key: a.a
-> Function Scan on generate_series a
Optimizer: Postgres query optimizer
(9 rows)
explain (costs off) select * from t_hashdist cross join (select random() as k, sum(a) from generate_series(1, 10) a group by k) x;
QUERY PLAN
------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> HashAggregate
Group Key: random()
-> Function Scan on generate_series a
Optimizer: Postgres query optimizer
(9 rows)
explain (costs off) select * from t_hashdist cross join (select a, count(1) as s from generate_series(1, 10) a group by a having count(1) > random() order by a) x ;
QUERY PLAN
-----------------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> GroupAggregate
Group Key: a.a
Filter: ((count(1))::double precision > random())
-> Sort
Sort Key: a.a
-> Function Scan on generate_series a
Optimizer: Postgres query optimizer
(12 rows)
-- limit
explain (costs off) select * from t_hashdist cross join (select * from generate_series(1, 10) limit 1) x;
QUERY PLAN
----------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Limit
-> Function Scan on generate_series
Optimizer: Postgres query optimizer
(8 rows)
reset optimizer;
-- start_ignore
drop table if exists bfv_planner_x;
drop table if exists testbadsql;
......
......@@ -3360,25 +3360,25 @@ PL/pgSQL function func2_mod_int_stb(integer) line 3 at SQL statement
rollback;
-- @description function_in_subqry_withfunc2_110.sql
SELECT * FROM foo, (SELECT func1_read_setint_sql_vol(func2_nosql_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it accesses relation "qp_funcs_in_subquery.bar" (seg0 slice1 127.0.1.1:6002 pid=25190)
CONTEXT: SQL statement "SELECT d FROM bar WHERE c <> $1"
PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
-- @description function_in_subqry_withfunc2_113.sql
SELECT * FROM foo, (SELECT func1_read_setint_sql_vol(func2_sql_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it accesses relation "qp_funcs_in_subquery.bar" (seg0 slice1 127.0.1.1:6002 pid=25190)
CONTEXT: SQL statement "SELECT d FROM bar WHERE c <> $1"
PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
-- @description function_in_subqry_withfunc2_116.sql
SELECT * FROM foo, (SELECT func1_read_setint_sql_vol(func2_read_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it accesses relation "qp_funcs_in_subquery.bar" (seg0 slice1 127.0.1.1:6002 pid=25190)
CONTEXT: SQL statement "SELECT d FROM bar WHERE c <> $1"
PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
-- @description function_in_subqry_withfunc2_118.sql
begin;
SELECT * FROM foo, (SELECT func1_read_setint_sql_vol(func2_mod_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it accesses relation "qp_funcs_in_subquery.bar" (seg0 slice1 127.0.1.1:6002 pid=25190)
CONTEXT: SQL statement "SELECT d FROM bar WHERE c <> $1"
PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
rollback;
-- @description function_in_subqry_withfunc2_119.sql
begin;
......@@ -3683,28 +3683,32 @@ rollback;
-- @description function_in_subqry_withfunc2_150.sql
begin;
SELECT * FROM foo, (SELECT func1_mod_setint_vol(func2_nosql_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it issues a non-SELECT statement (seg0 slice1 127.0.1.1:6002 pid=25190)
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: SQL statement "UPDATE bar SET d = d+1 WHERE c > $1"
PL/pgSQL function func1_mod_setint_vol(integer) line 5 at SQL statement
rollback;
-- @description function_in_subqry_withfunc2_153.sql
begin;
SELECT * FROM foo, (SELECT func1_mod_setint_vol(func2_sql_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it issues a non-SELECT statement (seg0 slice1 127.0.1.1:6002 pid=25190)
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: SQL statement "UPDATE bar SET d = d+1 WHERE c > $1"
PL/pgSQL function func1_mod_setint_vol(integer) line 5 at SQL statement
rollback;
-- @description function_in_subqry_withfunc2_156.sql
begin;
SELECT * FROM foo, (SELECT func1_mod_setint_vol(func2_read_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it issues a non-SELECT statement (seg0 slice1 127.0.1.1:6002 pid=25190)
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: SQL statement "UPDATE bar SET d = d+1 WHERE c > $1"
PL/pgSQL function func1_mod_setint_vol(integer) line 5 at SQL statement
rollback;
-- @description function_in_subqry_withfunc2_158.sql
begin;
SELECT * FROM foo, (SELECT func1_mod_setint_vol(func2_mod_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it issues a non-SELECT statement (seg0 slice1 127.0.1.1:6002 pid=25190)
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: SQL statement "UPDATE bar SET d = d+1 WHERE c > $1"
PL/pgSQL function func1_mod_setint_vol(integer) line 5 at SQL statement
rollback;
......
......@@ -3368,25 +3368,25 @@ PL/pgSQL function func2_mod_int_stb(integer) line 3 at SQL statement
rollback;
-- @description function_in_subqry_withfunc2_110.sql
SELECT * FROM foo, (SELECT func1_read_setint_sql_vol(func2_nosql_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it accesses relation "qp_funcs_in_subquery.bar" (seg0 slice1 127.0.1.1:6002 pid=24740)
CONTEXT: SQL statement "SELECT d FROM bar WHERE c <> $1"
PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
-- @description function_in_subqry_withfunc2_113.sql
SELECT * FROM foo, (SELECT func1_read_setint_sql_vol(func2_sql_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it accesses relation "qp_funcs_in_subquery.bar" (seg0 slice1 127.0.1.1:6002 pid=24740)
CONTEXT: SQL statement "SELECT d FROM bar WHERE c <> $1"
PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
-- @description function_in_subqry_withfunc2_116.sql
SELECT * FROM foo, (SELECT func1_read_setint_sql_vol(func2_read_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it accesses relation "qp_funcs_in_subquery.bar" (seg0 slice1 127.0.1.1:6002 pid=24740)
CONTEXT: SQL statement "SELECT d FROM bar WHERE c <> $1"
PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
-- @description function_in_subqry_withfunc2_118.sql
begin;
SELECT * FROM foo, (SELECT func1_read_setint_sql_vol(func2_mod_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it accesses relation "qp_funcs_in_subquery.bar" (seg0 slice1 127.0.1.1:6002 pid=24740)
CONTEXT: SQL statement "SELECT d FROM bar WHERE c <> $1"
PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: PL/pgSQL function func1_read_setint_sql_vol(integer) line 5 at FOR over SELECT rows
rollback;
-- @description function_in_subqry_withfunc2_119.sql
begin;
......@@ -3691,28 +3691,32 @@ rollback;
-- @description function_in_subqry_withfunc2_150.sql
begin;
SELECT * FROM foo, (SELECT func1_mod_setint_vol(func2_nosql_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it issues a non-SELECT statement (seg0 slice1 127.0.1.1:6002 pid=24740)
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: SQL statement "UPDATE bar SET d = d+1 WHERE c > $1"
PL/pgSQL function func1_mod_setint_vol(integer) line 5 at SQL statement
rollback;
-- @description function_in_subqry_withfunc2_153.sql
begin;
SELECT * FROM foo, (SELECT func1_mod_setint_vol(func2_sql_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it issues a non-SELECT statement (seg0 slice1 127.0.1.1:6002 pid=24740)
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: SQL statement "UPDATE bar SET d = d+1 WHERE c > $1"
PL/pgSQL function func1_mod_setint_vol(integer) line 5 at SQL statement
rollback;
-- @description function_in_subqry_withfunc2_156.sql
begin;
SELECT * FROM foo, (SELECT func1_mod_setint_vol(func2_read_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it issues a non-SELECT statement (seg0 slice1 127.0.1.1:6002 pid=24740)
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: SQL statement "UPDATE bar SET d = d+1 WHERE c > $1"
PL/pgSQL function func1_mod_setint_vol(integer) line 5 at SQL statement
rollback;
-- @description function_in_subqry_withfunc2_158.sql
begin;
SELECT * FROM foo, (SELECT func1_mod_setint_vol(func2_mod_int_vol(5))) r order by 1,2,3;
ERROR: function cannot execute on a QE slice because it issues a non-SELECT statement (seg0 slice1 127.0.1.1:6002 pid=24740)
ERROR: query plan with multiple segworker groups is not supported
HINT: likely caused by a function that reads or modifies data in a distributed table
CONTEXT: SQL statement "UPDATE bar SET d = d+1 WHERE c > $1"
PL/pgSQL function func1_mod_setint_vol(integer) line 5 at SQL statement
rollback;
......
......@@ -694,6 +694,235 @@ ALTER TABLE foopart_1_prt_1 SET DISTRIBUTED REPLICATED;
ERROR: can't set the distribution policy of "foopart_1_prt_1"
HINT: Distribution policy can be set for an entire partitioned table, not for one of its leaf parts or an interior branch.
DROP TABLE foopart;
-- volatile replicated
-- General and segmentGeneral locus imply that the corresponding
-- slice, if executed on many different segments, should provide the
-- same result data set. Thus, in some cases, General and segmentGeneral
-- can be treated like broadcast. But if a segmentGeneral or general
-- locus path contains volatile functions, it loses that property and
-- can only be treated as singleQE. The following cases check that
-- we correctly handle all of them.
-- FIXME: ORCA does not consider this; we need to fix these cases when
-- ORCA does.
set optimizer = off;
create table t_hashdist(a int, b int, c int) distributed by (a);
create table t_replicate_volatile(a int, b int, c int) distributed replicated;
---- pushed down filter
explain (costs off) select * from t_replicate_volatile, t_hashdist where t_replicate_volatile.a > random();
QUERY PLAN
----------------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Result
-> Seq Scan on t_replicate_volatile
Filter: ((a)::double precision > random())
Optimizer: Postgres query optimizer
(9 rows)
-- join qual
explain (costs off) select * from t_hashdist, t_replicate_volatile x, t_replicate_volatile y where x.a + y.a > random();
QUERY PLAN
-------------------------------------------------------------------------------
Nested Loop
-> Result
-> Gather Motion 1:1 (slice1; segments: 1)
-> Nested Loop
Join Filter: (((x.a + y.a))::double precision > random())
-> Seq Scan on t_replicate_volatile x
-> Materialize
-> Seq Scan on t_replicate_volatile y
-> Materialize
-> Gather Motion 3:1 (slice2; segments: 3)
-> Seq Scan on t_hashdist
Optimizer: Postgres query optimizer
(12 rows)
-- sublink & subquery
explain (costs off) select * from t_hashdist where a > All (select random() from t_replicate_volatile);
QUERY PLAN
------------------------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop Left Anti Semi (Not-In) Join
Join Filter: ((t_hashdist.a)::double precision <= "NotIn_SUBQUERY".random)
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Subquery Scan on "NotIn_SUBQUERY"
-> Seq Scan on t_replicate_volatile
Optimizer: Postgres query optimizer
(9 rows)
explain (costs off) select * from t_hashdist where a in (select random()::int from t_replicate_volatile);
QUERY PLAN
------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Hash Semi Join
Hash Cond: (t_hashdist.a = ((random())::integer))
-> Seq Scan on t_hashdist
-> Hash
-> Redistribute Motion 1:3 (slice1; segments: 1)
Hash Key: ((random())::integer)
-> Seq Scan on t_replicate_volatile
Optimizer: Postgres query optimizer
(9 rows)
-- subplan
explain (costs off, verbose) select * from t_hashdist left join t_replicate_volatile on t_hashdist.a > any (select random() from t_replicate_volatile);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
Output: t_hashdist.a, t_hashdist.b, t_hashdist.c, t_replicate_volatile.a, t_replicate_volatile.b, t_replicate_volatile.c
-> Nested Loop Left Join
Output: t_hashdist.a, t_hashdist.b, t_hashdist.c, t_replicate_volatile.a, t_replicate_volatile.b, t_replicate_volatile.c
Join Filter: (SubPlan 1)
-> Seq Scan on rpt.t_hashdist
Output: t_hashdist.a, t_hashdist.b, t_hashdist.c
-> Materialize
Output: t_replicate_volatile.a, t_replicate_volatile.b, t_replicate_volatile.c
-> Seq Scan on rpt.t_replicate_volatile
Output: t_replicate_volatile.a, t_replicate_volatile.b, t_replicate_volatile.c
SubPlan 1 (slice2; segments: 3)
-> Materialize
Output: random()
-> Broadcast Motion 1:3 (slice1; segments: 1)
Output: (random())
-> Seq Scan on rpt.t_replicate_volatile t_replicate_volatile_1
Output: random()
Optimizer: Postgres query optimizer
Settings: enable_seqscan=off, optimizer=off
(20 rows)
-- targetlist
explain (costs off) select * from t_hashdist cross join (select random () from t_replicate_volatile)x;
QUERY PLAN
---------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Seq Scan on t_replicate_volatile
Optimizer: Postgres query optimizer
(7 rows)
explain (costs off) select * from t_hashdist cross join (select a, sum(random()) from t_replicate_volatile group by a) x;
QUERY PLAN
----------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> HashAggregate
Group Key: t_replicate_volatile.a
-> Seq Scan on t_replicate_volatile
Optimizer: Postgres query optimizer
(9 rows)
explain (costs off) select * from t_hashdist cross join (select random() as k, sum(a) from t_replicate_volatile group by k) x;
QUERY PLAN
----------------------------------------------------------
Nested Loop
-> Gather Motion 3:1 (slice1; segments: 3)
-> Seq Scan on t_hashdist
-> Materialize
-> Gather Motion 1:1 (slice2; segments: 1)
-> HashAggregate
Group Key: random()
-> Seq Scan on t_replicate_volatile
Optimizer: Postgres query optimizer
(9 rows)
explain (costs off) select * from t_hashdist cross join (select a, sum(b) as s from t_replicate_volatile group by a having sum(b) > random() order by a) x ;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Sort
Sort Key: t_replicate_volatile.a
-> HashAggregate
Group Key: t_replicate_volatile.a
Filter: ((sum(t_replicate_volatile.b))::double precision > random())
-> Seq Scan on t_replicate_volatile
Optimizer: Postgres query optimizer
(12 rows)
-- insert
explain (costs off) insert into t_replicate_volatile select random() from t_replicate_volatile;
QUERY PLAN
---------------------------------------------------------------------------
Insert on t_replicate_volatile
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Subquery Scan on "*SELECT*"
-> Seq Scan on t_replicate_volatile t_replicate_volatile_1
Optimizer: Postgres query optimizer
(5 rows)
explain (costs off) insert into t_replicate_volatile select random(), a, a from generate_series(1, 10) a;
QUERY PLAN
------------------------------------------------------
Insert on t_replicate_volatile
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Subquery Scan on "*SELECT*"
-> Function Scan on generate_series a
Optimizer: Postgres query optimizer
(5 rows)
create sequence seq_for_insert_replicated_table;
explain (costs off) insert into t_replicate_volatile select nextval('seq_for_insert_replicated_table');
QUERY PLAN
---------------------------------------------------
Insert on t_replicate_volatile
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Subquery Scan on "*SELECT*"
-> Result
Optimizer: Postgres query optimizer
(5 rows)
-- update & delete
explain (costs off) update t_replicate_volatile set a = 1 where b > random();
ERROR: could not devise a plan (cdbpath.c:2074)
explain (costs off) update t_replicate_volatile set a = 1 from t_replicate_volatile x where x.a + random() = t_replicate_volatile.b;
ERROR: could not devise a plan (cdbpath.c:2074)
explain (costs off) update t_replicate_volatile set a = 1 from t_hashdist x where x.a + random() = t_replicate_volatile.b;
ERROR: could not devise a plan (cdbpath.c:2074)
explain (costs off) delete from t_replicate_volatile where a < random();
ERROR: could not devise a plan (cdbpath.c:2074)
explain (costs off) delete from t_replicate_volatile using t_replicate_volatile x where t_replicate_volatile.a + x.b < random();
ERROR: could not devise a plan (cdbpath.c:2074)
explain (costs off) update t_replicate_volatile set a = random();
ERROR: could not devise a plan (createplan.c:6488)
-- limit
explain (costs off) insert into t_replicate_volatile select * from t_replicate_volatile limit 1;
QUERY PLAN
---------------------------------------------------------------------------
Insert on t_replicate_volatile
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Limit
-> Seq Scan on t_replicate_volatile t_replicate_volatile_1
Optimizer: Postgres query optimizer
(5 rows)
explain (costs off) select * from t_hashdist cross join (select * from t_replicate_volatile limit 1) x;
QUERY PLAN
----------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3)
-> Nested Loop
-> Seq Scan on t_hashdist
-> Materialize
-> Broadcast Motion 1:3 (slice1; segments: 1)
-> Limit
-> Seq Scan on t_replicate_volatile
Optimizer: Postgres query optimizer
(8 rows)
reset optimizer;
-- start_ignore
drop schema rpt cascade;
NOTICE: drop cascades to 7 other objects
@@ -272,6 +272,51 @@ select first_value(a) over w, a
from (select * from bfv_planner_t3 order by a, b) as x
WINDOW w AS (order by a);
-- volatile general
-- General and segmentGeneral locus imply that if the corresponding
-- slice is executed on many different segments, it should provide the
-- same result data set. Thus, in some cases, General and segmentGeneral
-- can be treated like broadcast. But if a segmentGeneral or general
-- locus path contains volatile functions, it loses this property and
-- can only be treated as singleQE. The following cases check that
-- we handle all these cases correctly.
-- FIXME: ORCA does not handle this yet; fix these cases when ORCA does.
set optimizer = off;
create table t_hashdist(a int, b int, c int) distributed by (a);
---- pushed down filter
explain (costs off)
select * from
(select a from generate_series(1, 10)a) x, t_hashdist
where x.a > random();
---- join qual
explain (costs off) select * from
t_hashdist,
(select a from generate_series(1, 10) a) x,
(select a from generate_series(1, 10) a) y
where x.a + y.a > random();
---- sublink & subquery
explain (costs off) select * from t_hashdist where a > All (select random() from generate_series(1, 10));
explain (costs off) select * from t_hashdist where a in (select random()::int from generate_series(1, 10));
-- subplan
explain (costs off, verbose) select * from
t_hashdist left join (select a from generate_series(1, 10) a) x on t_hashdist.a > any (select random() from generate_series(1, 10));
-- targetlist
explain (costs off) select * from t_hashdist cross join (select random () from generate_series(1, 10))x;
explain (costs off) select * from t_hashdist cross join (select a, sum(random()) from generate_series(1, 10) a group by a) x;
explain (costs off) select * from t_hashdist cross join (select random() as k, sum(a) from generate_series(1, 10) a group by k) x;
explain (costs off) select * from t_hashdist cross join (select a, count(1) as s from generate_series(1, 10) a group by a having count(1) > random() order by a) x ;
-- limit
explain (costs off) select * from t_hashdist cross join (select * from generate_series(1, 10) limit 1) x;
reset optimizer;
-- start_ignore
drop table if exists bfv_planner_x;
@@ -351,6 +351,59 @@ ALTER TABLE foopart SET DISTRIBUTED REPLICATED;
ALTER TABLE foopart_1_prt_1 SET DISTRIBUTED REPLICATED;
DROP TABLE foopart;
-- volatile replicated
-- General and segmentGeneral locus imply that if the corresponding
-- slice is executed on many different segments, it should provide the
-- same result data set. Thus, in some cases, General and segmentGeneral
-- can be treated like broadcast. But if a segmentGeneral or general
-- locus path contains volatile functions, it loses this property and
-- can only be treated as singleQE. The following cases check that
-- we handle all these cases correctly.
-- FIXME: ORCA does not handle this yet; fix these cases when ORCA does.
set optimizer = off;
create table t_hashdist(a int, b int, c int) distributed by (a);
create table t_replicate_volatile(a int, b int, c int) distributed replicated;
---- pushed down filter
explain (costs off) select * from t_replicate_volatile, t_hashdist where t_replicate_volatile.a > random();
-- join qual
explain (costs off) select * from t_hashdist, t_replicate_volatile x, t_replicate_volatile y where x.a + y.a > random();
-- sublink & subquery
explain (costs off) select * from t_hashdist where a > All (select random() from t_replicate_volatile);
explain (costs off) select * from t_hashdist where a in (select random()::int from t_replicate_volatile);
-- subplan
explain (costs off, verbose) select * from t_hashdist left join t_replicate_volatile on t_hashdist.a > any (select random() from t_replicate_volatile);
-- targetlist
explain (costs off) select * from t_hashdist cross join (select random () from t_replicate_volatile)x;
explain (costs off) select * from t_hashdist cross join (select a, sum(random()) from t_replicate_volatile group by a) x;
explain (costs off) select * from t_hashdist cross join (select random() as k, sum(a) from t_replicate_volatile group by k) x;
explain (costs off) select * from t_hashdist cross join (select a, sum(b) as s from t_replicate_volatile group by a having sum(b) > random() order by a) x ;
-- insert
explain (costs off) insert into t_replicate_volatile select random() from t_replicate_volatile;
explain (costs off) insert into t_replicate_volatile select random(), a, a from generate_series(1, 10) a;
create sequence seq_for_insert_replicated_table;
explain (costs off) insert into t_replicate_volatile select nextval('seq_for_insert_replicated_table');
-- update & delete
explain (costs off) update t_replicate_volatile set a = 1 where b > random();
explain (costs off) update t_replicate_volatile set a = 1 from t_replicate_volatile x where x.a + random() = t_replicate_volatile.b;
explain (costs off) update t_replicate_volatile set a = 1 from t_hashdist x where x.a + random() = t_replicate_volatile.b;
explain (costs off) delete from t_replicate_volatile where a < random();
explain (costs off) delete from t_replicate_volatile using t_replicate_volatile x where t_replicate_volatile.a + x.b < random();
explain (costs off) update t_replicate_volatile set a = random();
-- limit
explain (costs off) insert into t_replicate_volatile select * from t_replicate_volatile limit 1;
explain (costs off) select * from t_hashdist cross join (select * from t_replicate_volatile limit 1) x;
reset optimizer;
-- start_ignore
drop schema rpt cascade;
-- end_ignore