- 10 June 2020, 1 commit
-
-
Submitted by Hans Zeller
We found that when we have window functions and also correlated subqueries in the same target list, the CQueryMutators::NormalizeWindowProjList method would leave the varattno attributes of outer references in the subquery unchanged. That needs to be changed, since we are producing a different RTE for the query. We will eventually create a fix. For now, this PR just searches for the problem and triggers a fallback when we see it, to avoid incorrect results. Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io> Co-authored-by: Hans Zeller <hzeller@vmware.com>
-
- 09 June 2020, 2 commits
-
-
Submitted by Paul Guo
After gprecoverseg, we need to wait until the cluster is synchronized before running subsequent tests. Reviewed-by: Hubert Zhang <hzhang@pivotal.io> Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io> Cherry-picked from d490798b
-
Submitted by David Yozie
* Update statement about mirroring recommendations & support
* Updates based on k8s feedback
-
- 08 June 2020, 1 commit
-
-
Submitted by Hubert Zhang
When introducing a new mirror, we need two steps:
1. start the mirror segment
2. update the gp_segment_configuration catalog
Previously, gp_add_segment_mirror was called to update the catalog, but the dbid chosen by get_availableDbId() is not guaranteed to be the same dbid as in internal.auto.conf. Reported in issue9837. Reviewed-by: Paul Guo <pguo@pivotal.io> Reviewed-by: Bhuvnesh Chaudhary <bhuvnesh2703@gmail.com> cherry-pick from commits f7965d and 1ee999
-
- 06 June 2020, 5 commits
-
-
Submitted by Lisa Owen
-
Submitted by David Yozie
This reverts commit 07775046.
-
Submitted by Lisa Owen
* docs - add info about moving a query to a different resource group
* need to be superuser
* remove upgrade/downgrade info for master
-
Submitted by David Yozie
-
Submitted by Lisa Owen
-
- 05 June 2020, 5 commits
-
-
Submitted by Hubert Zhang
We now use the initplan id to differentiate the tuplestores used by different INITPLAN functions, and each INITPLAN function writes its result into its own tuplestore. Also fix a bug that appended the initplan in the wrong place, which could generate wrong results in the UNION ALL case. cherry-pick from: 2589a3
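The per-id separation described above can be sketched as a small model (hypothetical Python, not GPDB code; `TuplestoreRegistry` and its methods are made-up names for illustration):

```python
# Hypothetical sketch: keying per-function result stores by an initplan
# id so that concurrent INITPLAN functions never share a tuplestore.

class TuplestoreRegistry:
    """Maps an initplan id to its own result buffer."""

    def __init__(self):
        self._stores = {}

    def write(self, initplan_id, row):
        # Each INITPLAN function writes only to the store for its own id.
        self._stores.setdefault(initplan_id, []).append(row)

    def read(self, initplan_id):
        # The FunctionScan later reads the store matching its id.
        return self._stores.get(initplan_id, [])

registry = TuplestoreRegistry()
registry.write(1, ("a",))
registry.write(2, ("b",))
assert registry.read(1) == [("a",)]
assert registry.read(2) == [("b",)]
```

With a single shared store, the two functions' rows would interleave; keying by id keeps each FunctionScan's input isolated.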
-
Submitted by David Yozie
-
Submitted by Lisa Owen
-
Submitted by Lisa Owen
* docs - new pxf IGNORE_MISSING_PATH option
* reword default case
* add IGNORE_MISSING_PATH info to relevant profiles
* the action to take
* try to describe why pxf behaviour is not optimal
-
Submitted by Lisa Owen
-
- 04 June 2020, 5 commits
-
-
Submitted by Hans Zeller
-
Submitted by Wen Lin
-
Submitted by Hans Zeller
Orca uses this property for cardinality estimation of joins. For example, a join predicate foo join bar on foo.a = upper(bar.b) will have a cardinality estimate similar to foo join bar on foo.a = bar.b. Other functions, like foo join bar on foo.a = substring(bar.b, 1, 1), won't be treated that way, since they are more likely to have a greater effect on join cardinalities. Since this is specific to ORCA, we use logic in the translator to determine whether a function or operator is NDV-preserving. Right now, we consider a very limited set of operators; we may add more at a later time.

Let's assume that we join tables R and S, that f is a function or expression that refers to a single column and does not preserve NDVs, and that p is a function or expression that also refers to a single column and does preserve NDVs:

    join predicate       card. estimate                          comment
    -------------------  --------------------------------------  -----------------------------
    col1 = col2          |R| * |S| / max(NDV(col1), NDV(col2))   build an equi-join histogram
    f(col1) = p(col2)    |R| * |S| / NDV(col2)                   use NDV-based estimation
    f(col1) = col2       |R| * |S| / NDV(col2)                   use NDV-based estimation
    p(col1) = col2       |R| * |S| / max(NDV(col1), NDV(col2))   use NDV-based estimation
    p(col1) = p(col2)    |R| * |S| / max(NDV(col1), NDV(col2))   use NDV-based estimation
    otherwise            |R| * |S| * 0.4                         this is an unsupported pred

Note that adding casts to these expressions is ok, as is switching the left and right sides. Here is a list of expressions that we currently treat as NDV-preserving:

    coalesce(col, const)
    col || const
    lower(col)
    trim(col)
    upper(col)

One more note: We need the NDVs of the inner side of Semi and Anti-joins for cardinality estimation, so only normal columns and NDV-preserving functions are allowed in that case.
This is a port of these GPDB 5X and GPOrca PRs: https://github.com/greenplum-db/gporca/pull/585 https://github.com/greenplum-db/gpdb/pull/10090 (cherry picked from commit 3ccd1ebfa1ea949ac77ed3b5d8f5faadfa87affd) Also updated join.sql expected files with minor motion changes.
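The rules in the table above can be expressed as a small calculation (an illustrative Python sketch of the formulas only; `join_card` is a made-up name, not an ORCA function):

```python
# Cardinality rules from the table: an NDV-preserving side keeps its
# column's NDV usable; a non-preserving function forces the estimate
# onto the other side's NDV; neither side usable -> fixed 0.4 selectivity.

def join_card(rows_r, rows_s, ndv1, ndv2, left_preserving, right_preserving):
    if left_preserving and right_preserving:
        # col1 = col2, p(col1) = col2, p(col1) = p(col2)
        return rows_r * rows_s / max(ndv1, ndv2)
    if right_preserving:
        # f(col1) = col2 or f(col1) = p(col2): only NDV(col2) is usable
        return rows_r * rows_s / ndv2
    if left_preserving:
        # symmetric case: col1 = f(col2)
        return rows_r * rows_s / ndv1
    return rows_r * rows_s * 0.4  # unsupported predicate

# foo.a = upper(bar.b): upper() preserves NDVs, so both NDVs are used
assert join_card(1000, 2000, 50, 80, True, True) == 1000 * 2000 / 80
# foo.a = substring(bar.b, 1, 1): right side not NDV-preserving
assert join_card(1000, 2000, 50, 80, True, False) == 1000 * 2000 / 50
```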
-
Submitted by David Yozie
-
Submitted by David Yozie
-
- 03 June 2020, 5 commits
-
-
Submitted by Shreedhar Hardikar
Duplicate-sensitive HashDistribute Motions generated by ORCA get translated to Result nodes with hashFilter cols set. However, if the Motion needs to distribute based on a complex expression (rather than just a Var), the expression must be added into the targetlist of the Result node and then referenced in hashFilterColIdx. This can affect other operators above the Result node. For example, a Hash operator expects the targetlist of its child node to contain only elements that are to be hashed. Additional expressions here can cause issues with memtuple bindings that can lead to errors. (E.g. the attached test case, when run without our fix, gives the error "invalid input syntax for integer".) This PR fixes the issue by adding an additional Result node on top of the duplicate-sensitive Result node to project only the elements from the original targetlist in such cases.
-
Submitted by Jinbao Chen
The psql client ignored relation storage when constructing the \dm command, so the output of \dm was empty. Add the correct relation storage check to the command.
-
Submitted by Xiaoran Wang
This reverts commit 412493b0. Failed to compile pgbouncer on centos6: can't find libevent. pgbouncer 1.13 uses pkg-config to look up libevent instead of using --with-libevent. Another issue is that pgbouncer 1.13 does not support libevent version 1.x, but we use libevent 1.4 on centos6.
-
Submitted by Xiaoran Wang
-
Submitted by Hans Zeller
* Make DbgPrint and OsPrint methods on CRefCount
Create a single DbgPrint() method on the CRefCount class. Also create a virtual OsPrint() method, making some objects derived from CRefCount easier to print from the debugger. Note that not all the OsPrint methods had the same signatures; some additional OsPrintxxx() methods have been generated for that.
* Making print output easier to read, print some stuff on demand
Required columns in required plan properties are always the same for a given group. Also, equivalent expressions in required distribution properties are important in certain cases, but in most cases they disrupt the display and make it harder to read. Added two traceflags, EopttracePrintRequiredColumns and EopttracePrintEquivDistrSpecs, that have to be set to print this information. If you want to go back to the old display, use these options when running gporca_test: -T 101016 -T 101017
* Add support for printing alternative plans
A new method, CEngine::DbgPrintExpr(), can be called from COptimizer::PexprOptimize to allow printing of the best plan for different contexts. This is only enabled in debug builds. To use this:
- run an MDP using gporca_test, using a debug build
- print out memo after optimization (-T 101006 -T 101010)
- set a breakpoint near the end of COptimizer::PexprOptimize()
- if, after looking at the contents of memo, you want to see the optimal plan for context c of group g, do the following: p eng.DbgPrintExpr(g, c)
You could also get the same info from the memo printout, but it would take a lot longer. (cherry picked from commit b3fdede6)
-
- 02 June 2020, 5 commits
-
-
Submitted by Jinbao Chen
The SkipData flag should only short-circuit transientrel_receive on the QE. We should still do the begin/end work, e.g. remove the newly created temp file, or we will have a file leak. Co-authored-by: Jinbao Chen <jinchen@pivotal.io> Co-authored-by: Hubert Zhang <hzhang@pivotal.io>
-
Submitted by Jinbao Chen
In the REFRESH MATERIALIZED VIEW command, the CONCURRENTLY option is only allowed if there is at least one unique index with no WHERE clause on one or more columns of the matview. Previously, concurrent refresh checked the existence of a unique index on the matview after filling the data into the new snapshot, i.e., after calling refresh_matview_datafill(). So, when there was no unique index, we could have to wait a long time before we detected that and got the error, which was a waste of time. To eliminate this wasted time, this commit changes concurrent refresh so that it checks the existence of a unique index at the beginning of the refresh operation, i.e., before starting any time-consuming jobs. If the CONCURRENTLY option is not allowed due to the lack of a unique index, concurrent refresh can immediately detect that and emit an error. Author: Masahiko Sawada Reviewed-by: Michael Paquier, Fujii Masao Co-authored-by: Fujii Masao <fujii@postgresql.org>
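The reordering this commit makes can be modeled in a few lines (a hypothetical Python sketch, not the real C code; all helper names below are made up):

```python
# Model of the fix: validate the cheap precondition -- a usable unique
# index -- before the expensive data fill, so an unusable matview
# errors out immediately instead of after refresh_matview_datafill().

work_log = []

def fill_new_snapshot(matview):
    # stands in for the time-consuming refresh_matview_datafill()
    work_log.append("expensive fill")

def has_unique_index(matview):
    return matview.get("has_unique_index", False)

def refresh_concurrently(matview):
    # New order: fail fast before any time-consuming job starts.
    if not has_unique_index(matview):
        raise ValueError("CONCURRENTLY requires a unique index on the matview")
    fill_new_snapshot(matview)

try:
    refresh_concurrently({"has_unique_index": False})
except ValueError:
    pass
assert work_log == []  # no expensive work was done before the error
```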
-
Submitted by Richard Guo
If there is an ORDER BY or DISTINCT in the query, we need to bring all the data to a single node by setting must_gather to true. An exception is when there is a LIMIT or OFFSET clause, which is handled later when inserting the Limit node. To tell whether there is any LIMIT or OFFSET clause, we should use limit_needed instead of checking limitCount or limitOffset directly. Fixes issue #9746. Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io> Reviewed-by: Ekta Khanna <ekhanna@pivotal.io>
-
Submitted by Jacob Champion
The query to obtain AO auxiliary catalog names is relatively expensive compared to the other aux table queries, and it was being performed once for every AO table. Consolidate all calls into a single query, and manually join the results with the other relation info by relid. Also improve the "couldn't find aux tables" FATAL message for easier debugging (it needs to include the dbname).
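The consolidation pattern described above can be sketched as follows (a hypothetical Python model of the idea only; the function names are made up, not GPDB code):

```python
# Instead of one aux-table lookup per AO table, fetch all rows with a
# single query and join them to the relation info by relid in memory.

def fetch_aux_names_batched(relids, run_batch_query):
    # run_batch_query() stands in for the single consolidated query;
    # it returns (relid, aux_name) pairs for every AO table at once.
    rows = run_batch_query()
    by_relid = {}
    for relid, aux_name in rows:
        by_relid.setdefault(relid, []).append(aux_name)
    # manual join: every requested relid gets its (possibly empty) list
    return {relid: by_relid.get(relid, []) for relid in relids}

rows = [(1, "pg_aoseg_1"), (2, "pg_aoseg_2")]
result = fetch_aux_names_batched([1, 2, 3], lambda: rows)
assert result == {1: ["pg_aoseg_1"], 2: ["pg_aoseg_2"], 3: []}
```

The win is the same as any N+1-query fix: one round trip and one plan instead of one per table.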
-
Submitted by Jacob Champion
The large_objects test invokes pg_upgrade and must not run concurrently with the setup DDL. This led to intermittent failures, since some setup scripts temporarily add objects that aren't upgradable.
-
- 01 June 2020, 1 commit
-
-
Submitted by Hubert Zhang
We introduced functions that run on an INITPLAN in commit a21ff2. An INITPLAN function is designed to support "CTAS select * from udf();". Since udf() runs on the EntryDB, and the EntryDB is always a read gang which cannot do dispatch work, the query would fail if the function contains a DDL statement etc. The idea of an INITPLAN function is to run the function on an INITPLAN, which is in fact the QD, and store the result into a tuplestore. Later, the FunctionScan on the EntryDB just reads tuples from the tuplestore instead of running the real function. But the life cycle management is a little tricky. In the original commit, we hacked to close the tuplestore in the INITPLAN without deleting the file, and let the EntryDB reader delete the file after finishing the tuple fetch. This introduces a file leak if the transaction aborts before the EntryDB runs. This commit adds a postprocess_initplans in ExecutorEnd() of the main plan to clean up the tuplestores created in preprocess_initplans in ExecutorStart() of the main plan. Note that postprocess_initplans must be placed after the dispatch work is finished, i.e. after mppExecutorFinishup(). Upstream doesn't need this function since it always uses scalar PARAMs to communicate between an INITPLAN and the main plan. cherry-pick from: f669acf7
-
- 30 May 2020, 3 commits
-
-
Submitted by Wang Hao
For some reason, the gpmon_qexeckey_t structure used int16 for ccnt while all other GP code operates on int32. This can cause a ccnt overflow in gpperfmon packets. The problem doesn't affect the master branch, as the gpperfmon code has been removed from it, but it does affect the 6X_STABLE and 5X_STABLE branches. Authored-by: Denis Smirnov darthunix@gmail.com Reviewed-by: Hao Wang haowang@pivotal.io
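The overflow in miniature (a Python illustration of C integer truncation, not GPDB code):

```python
# A command count stored in a 16-bit field wraps around, while the
# 32-bit counter used elsewhere in GPDB keeps counting correctly.

def as_int16(n):
    # Wrap n into the signed 16-bit range, like assigning to a C int16.
    n &= 0xFFFF
    return n - 0x10000 if n >= 0x8000 else n

ccnt32 = 40000            # int32 command count, still a valid value
ccnt16 = as_int16(ccnt32)
assert ccnt16 == -25536   # the gpperfmon packet now carries garbage
```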
-
Submitted by Chris Hajas
Previously, in the DPv2 transform (exhaustive2), while we penalized cross joins for the remaining joins in the greedy stage, we did not do so for the first join, which in some cases selected a cross join. This ended up selecting a poor join order in many cases and went against the intent of the alternative being generated, which is to minimize cross joins. We also increase the default penalty from 5 to 1024, which is the value we use in the cost model during the optimization stage. The greedy alternative also wasn't kept in the heap, so we include that now too. (cherry picked from commit 457bb928)
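A toy illustration of why the penalty's magnitude matters (assumes, for the sake of the sketch, a multiplicative penalty on a cost-like score; the names and numbers below are made up, not ORCA's actual cost model):

```python
# With a small penalty (5), a cross join whose cardinality estimate is
# much lower can still win the greedy pick; at 1024 the penalty
# dominates and the cross-join alternative loses.

def join_cost(cardinality, is_cross_join, penalty):
    # cross joins pay a multiplicative penalty in the greedy stage
    return cardinality * (penalty if is_cross_join else 1.0)

cross = (50.0, True)      # cheap-looking, but a cross join
normal = (400.0, False)   # pricier, with a real join predicate

def pick(penalty):
    return min((cross, normal), key=lambda j: join_cost(j[0], j[1], penalty))

assert pick(5) == cross       # weak penalty: the cross join still wins
assert pick(1024) == normal   # strong penalty: the cross join is avoided
```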
-
Submitted by Chris Hajas
In cases where Orca generates a NLJ with a parameter on the inner side, the executor will not pass the EXEC_FLAG_REWIND flag down, as it assumes the inner side will always need to be rescanned. The Material node will therefore not have its rewind flag set and can act as a no-op. This is not always correct. While the executor will set EXEC_FLAG_REWIND if the Materialize is directly above a motion, it does not recognize the case where the Materialize is on the inner side with other nodes between it and the motion, even though the Materialize serves to prevent a rescan of the underlying Motion node. This causes execution to fail with `Illegal rescan of motion node: invalid plan (nodeMotion.c:1623)` as it attempts to rescan a motion. Since Orca only produces a Materialize when necessary, either for performance reasons or to prevent a rescan of an underlying Motion, EXEC_FLAG_REWIND should be set for any Materialize generated by Orca. Below is a valid plan generated by Orca:
```
Result  (cost=0.00..3448.01 rows=1 width=4)
  ->  Nested Loop  (cost=0.00..3448.01 rows=1 width=1)
        Join Filter: true
        ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..431.00 rows=2 width=4)
              ->  Seq Scan on foo1  (cost=0.00..431.00 rows=1 width=4)
        ->  Result  (cost=0.00..431.00 rows=1 width=1)
              Filter: (foo1.a = foo2.a)
              ->  Materialize  (cost=0.00..431.00 rows=1 width=4)
                    ->  Hash Semi Join  (cost=0.00..431.00 rows=1 width=4)
                          Hash Cond: (foo2.b = foo3.b)
                          ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..0.00 rows=1 width=8)
                                ->  Bitmap Heap Scan on foo2  (cost=0.00..0.00 rows=1 width=8)
                                      Recheck Cond: (c = 3)
                                      ->  Bitmap Index Scan on f2c  (cost=0.00..0.00 rows=0 width=0)
                                            Index Cond: (c = 3)
                          ->  Hash  (cost=431.00..431.00 rows=1 width=4)
                                ->  Gather Motion 3:1  (slice3; segments: 3)  (cost=0.00..431.00 rows=2 width=4)
                                      ->  Seq Scan on foo3  (cost=0.00..431.00 rows=1 width=4)
 Optimizer: Pivotal Optimizer (GPORCA)
```
Co-authored-by: Chris Hajas <chajas@pivotal.io> Co-authored-by: Shreedhar Hardikar <shardikar@pivotal.io>
-
- 28 May 2020, 3 commits
-
-
Submitted by Lena Hunter
* clarifying pg_upgrade note
* gpinitsystem -I second format
* gpinitsystem edits
* edits from review
-
Submitted by xiong-gang
pg_resgroup_move_query is an asynchronous operation; its completion doesn't mean the query has already moved to the destination group.
-
Submitted by Sambitesh Dash
This is a continuation of commit 456b2b31 in GPORCA, adding more errors to the list of those that don't get logged in the log file. We are also removing the code that writes to std::cerr, which generated a not very nice looking log message. Instead, we add whether the error was unexpected to another log message that we also generate. The original commit on the master branch is fba77702.
-
- 27 May 2020, 1 commit
-
- 26 May 2020, 1 commit
-
-
Submitted by Pengzhou Tang
This issue was exposed by an experiment to remove the special "eval_stable_functions" handling in evaluate_function(): the qp_functions_in_* test cases sometimes get stuck, and it turns out to be a gp_interconnect_id disorder issue. Under the UDPIFC interconnect, gp_interconnect_id is used to distinguish the executions of MPP-fied plans in the same session; on the receiver side, packets with a smaller gp_interconnect_id are treated as 'past' packets, and the receiver will stop the sender from sending them. The RCA of the hang is:
1. QD calls InitSliceTable() to advance the gp_interconnect_id and store it in the slice table.
2. In CdbDispatchPlan->exec_make_plan_constant(), QD finds a stable function that needs to be simplified to a const, so it executes this function first.
3. The function contains SQL, so QD inits another slice table and advances the gp_interconnect_id again, then dispatches the new plan and executes it.
4. After the function is simplified to a const, QD continues to dispatch the previous plan; however, its gp_interconnect_id is now the older one. When a packet comes, if the receiver hasn't set up the interconnect yet, the packet is handled by handleMismatch() and treated as a 'past' packet, so the senders are stopped early by the receiver. When the receiver later finishes the interconnect setup, it cannot get any packets from the senders and gets stuck.
To resolve this, we advance the gp_interconnect_id when a plan is really dispatched; plans are dispatched sequentially, so a later-dispatched plan has a higher gp_interconnect_id. Also limit the usage of gp_interconnect_id in the rx thread of UDPIFC; we prefer to use sliceTable->ic_instance_id in the main thread. Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io> Reviewed-by: Asim R P <apraveen@pivotal.io> Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
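The receiver-side classification that drives the hang can be modeled in a few lines (a toy Python sketch of the id comparison only; `classify_packet` is a made-up name, not UDPIFC code):

```python
# Packets carrying an id smaller than the receiver's current
# gp_interconnect_id are treated as 'past' and their sender is stopped.

def classify_packet(receiver_id, packet_id):
    if packet_id < receiver_id:
        return "past"      # sender gets told to stop
    if packet_id > receiver_id:
        return "future"    # held until setup catches up
    return "current"

# Step 3 of the RCA advanced the id (say, to 7) for the nested dispatch,
# so the previously planned query's packets (id 6) now look stale:
assert classify_packet(7, 6) == "past"
assert classify_packet(7, 7) == "current"
```

Advancing the id at dispatch time instead of plan-init time keeps the comparison monotone with actual dispatch order, so the outer plan's packets are never misclassified.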
-
- 25 May 2020, 2 commits
-
-
Submitted by Huiliang.liu
pkill is not in the /bin/ folder on Ubuntu, so gpfdist can't be killed in the sreh test, which makes the gpfdist regression test fail.
-
Submitted by Hao Wu
`select c.c1, c.c2 from d1 c union all select a.c1, a.c2 from d2 a;` Both d1 and d2 are replicated tables, but their `numsegments` values in gp_distribution_policy are different. This can happen during gpexpand. The bug is in the function cdbpath_create_motion_path: both `subpath->locus` and `locus` are SegmentGeneral, but the locuses are not equal. Co-authored-by: Pengzhou Tang <ptang@pivotal.io> (cherry picked from commit 49749292)
-