- 22 Sep, 2020 2 commits
-
-
Committed by Shreedhar Hardikar
When gp_use_legacy_hashops is set, ORCA might crash while computing the distribution spec for plans that have an Aggregate. This occurs when the grouping columns are of a type that has a default hash distribution opfamily but no legacy hash distribution opfamily. Redistribution is attempted only for types for which "IsRedistributable()" returns true, and this was (incorrectly) determined by looking only at the default opfamily, independent of the GUC. Subsequently, CDistributionSpecHashed::PopulateDefaultOpfamilies() tries to assemble opfamilies for all redistributable columns, which shouldn't include such types when the GUC is set, resulting in a nullptr dereference. For example, the int[] type has a default distribution opfamily but no legacy distribution opfamily. This commit fixes the implementation of IsRedistributable() to avoid the above error. Co-authored-by: Shreedhar Hardikar <hardikar@cs.wisc.edu>
-
Committed by Kalen Krempely
Co-authored-by: Xin Zhang <zhxin@vmware.com> Co-authored-by: Brent Doli <bdoil@vmware.com>
-
- 19 Sep, 2020 1 commit
-
-
Committed by Lisa Owen
-
- 18 Sep, 2020 1 commit
-
-
Committed by (Jerome)Junfeng Yang
Commit 0cf1c0d8 defined a global variable in a header. This resulted in a ton of "unused variable" warnings. The `static step_timer timer` is only used in contrib/pg_upgrade/util.c. So move it here.
-
- 17 Sep, 2020 3 commits
-
-
Committed by David Kimura
Function `RelationGetIndexList()` does not filter out invalid indexes. That responsibility is left to the caller (e.g. `get_relation_info()`). The issue is that Orca was not checking index validity. This commit also introduces an optimization to Orca that is already used in Planner, whereby we first check relhasindex before checking pg_index. (cherry picked from commit b011c351)
-
Committed by Jesse Zhang
For context, the registry-image resource was introduced more than two years ago as a drop-in replacement for the docker-image resource (especially when you're only fetching, not building). It's leaner, less resource-intensive, faster, and doesn't rely on spawning a Docker daemon. Also swept up in this patch are four unused files that were left behind by previous changes. This is spiritually a cherry-pick of commit be6ff30f (greenplum-db/gpdb#10818). See https://github.com/concourse/registry-image-resource for more.
-
Committed by Hans Zeller
This is a backport of the changes in #10799 to 6X_STABLE. When we have an outer ref in a subquery, like this:

    select * from foo where foo.a is null or foo.a = (select foo.b from bar)

then we can't simply use the outer reference for the condition when we unnest the subquery into an apply. This is because if the subquery returns no rows, we must use a NULL instead of the outer reference. We have code to handle this for quantified subqueries, but not for scalar subqueries. When we translate the generated DXL to a plan, we assert when we find an outer reference in the project list of a subquery. In rare cases, we might also crash, when the subquery contains a project with multiple values below the outer reference (see the added test in gporca.sql). The "fix" (more a workaround) is to force a fallback when we detect this situation during unnesting of a scalar subquery.
-
- 16 Sep, 2020 4 commits
-
-
Committed by Alexandra Wang
Previously, after the 9.2 merge, we had a hack that short-circuited the execution of the join when ORCA is on. This hack stops working with the Postgres 12 merge because the FIXME subquery gets executed on the QEs as well rather than just on the QD. This patch simply turns ORCA off for the query. Co-authored-by: Jesse Zhang <sbjesse@gmail.com> Co-authored-by: Alexandra Wang <lewang@pivotal.io> (cherry picked from commit 891e77d1)
-
Committed by Jesse Zhang
Commit df5d3389 attempted to backport e79b81e5, but we missed a critical chunk while resolving a conflict. This (of course) breaks the Travis macOS build. Put it back here. (cherry picked from commit e79b81e5)
-
Committed by Jesse Zhang
ccache 3.2 (from Ubuntu Xenial) defaults CCACHE_CPP2 to off (this setting defaults to on starting from ccache 3.3). That default leads to illegible compiler warning output because GCC and Clang will emit warnings using the preprocessed output. Turn it on here. (cherry picked from commit 44e6bfec)
-
Committed by Jesse Zhang
While working on https://github.com/greenplum-db/gpdb/pull/9937 ("Build ORCA with C++14."), I realized we took out ccache from Travis CI. This patch set brings it back with some additional TLC. Highlights:
* Remove deprecated [1][2] key "sudo: false"
* Set Travis language to C++. Travis propagates the environment variable CXX (in addition to CC, which is exported when the language is C) when the language is C++. This prepares us for a forthcoming change to enable the ORCA build in Travis.
* Care is taken to ensure ccache is actually used on macOS. The default settings in Travis somehow neglected it (travis-ci/travis-build#655), so we compensate for it.
* Similar care is taken to ensure we use ccache with Clang. The version of ccache used in Travis is just slightly older than the one in which automatic Clang symlink support was added.
* Show ccache stats (hit rate and more) for each build
Reference:
[1] https://blog.travis-ci.com/2018-11-19-required-linux-infrastructure-migration
[2] Job config linting warning from Travis
(cherry picked from commit e79b81e5)
-
- 15 Sep, 2020 1 commit
-
- 14 Sep, 2020 4 commits
-
-
Committed by Polina Bungina
Executing a sufficiently long query that contains multi-byte characters can cause the query string to be truncated incorrectly. The truncation may cut a multi-byte character in half and (with log_min_duration_statement set to 0) write an invalid symbol to the segment logs. Such a broken character in the logs causes problems when fetching log info from the gp_toolkit.__gp_log_segment_ext table: queries fail with the error «ERROR: invalid byte sequence for encoding…». The cause is the buildGpQueryString function in `cdbdisp_query.c`, which prepares the query text for dispatch to the QEs. It does not take character length into account when truncation is necessary (the text is longer than QUERY_STRING_TRUNCATE_SIZE). (cherry picked from commit 889ba39e)
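The truncation problem described above can be illustrated with a small standalone sketch. It assumes the text is valid UTF-8 and uses a hypothetical helper name, `utf8_clip_len`; it is not the code from the commit. PostgreSQL ships encoding-aware helpers such as `pg_mbcliplen()` for this purpose, which server code would normally use instead.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the commit): return the largest length
 * <= limit at which a valid UTF-8 string of `len` bytes can be cut
 * without splitting a multi-byte character.
 */
size_t
utf8_clip_len(const char *s, size_t len, size_t limit)
{
    size_t cut;

    assert(s != NULL);

    if (len <= limit)
        return len;             /* nothing to truncate */

    cut = limit;
    /*
     * Bytes of the form 10xxxxxx are continuation bytes. If the byte at
     * the cut position is one, the cut would land inside a character, so
     * step back to the start of that character and exclude it entirely.
     */
    while (cut > 0 && ((unsigned char) s[cut] & 0xC0) == 0x80)
        cut--;
    return cut;
}
```

For the six-byte string "héllo" (where "é" occupies two bytes), a limit of 2 would split "é", so the helper backs off to 1; a limit of 3 falls on a character boundary and is kept.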
-
Committed by Paul Guo
Here is part of the diff output.

    @@ -14,11 +14,11 @@
      role | preferred_role | content | mode | status
     ------+----------------+---------+------+--------
      m    | m              |      -1 | s    | u
    - m    | m              |       0 | s    | u
    + m    | m              |       0 | n    | u

The root cause has nothing to do with this test case. It's because test prepared_xact_deadlock_pg_rewind finally calls gprecoverseg to recover the cluster but does not wait until the cluster state is restored. Reviewed-by: Junfeng(Jerome) Yang <jeyang@pivotal.io> (cherry picked from commit 07c594ed)
- 12 Sep, 2020 1 commit
-
-
Committed by David Yozie
-
- 11 Sep, 2020 3 commits
-
-
Committed by Maksim Milyutin
The terminate_timeout and recv_timeout parameters were computed from the incoming quantum value before its validation/normalization phase. As a consequence, those parameters could take on values that lead to unstable interaction between gpmmon and gpsmon. This fix moves the assignment of terminate_timeout and recv_timeout after the validation/normalization block. Furthermore, since terminate_timeout is passed to gpsmon as a startup option, a guard that disallows a zero timeout value for tcp_event is added.
-
Committed by Lisa Owen
-
Committed by Peter Eisentraut
Using exit() requires stdlib.h, which is not included. Use return instead. Also add return type for main(). Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com> (cherry picked from commit 1c0cf52b) (cherry picked from commit 6d3c99bb)
-
- 10 Sep, 2020 6 commits
-
-
Committed by Jesse Zhang
This file will be used to record commits to be ignored by default by git-blame (user still has to opt in). This is intended to include large (generally automated) reformatting or renaming commits. (cherry picked from commit b19e6abb)
-
Committed by Ning Yu
The GUC gp_interconnect_proxy_addresses is used to set the listener addresses and ports of all the proxy bgworkers. Previously only IP addresses were supported, which is inconvenient to use. Now hostnames are supported as well; IP addresses continue to work. Note that if a hostname is bound to a different IP at runtime, we must reload the setting with the "gpstop -u" command. Reviewed-by: Hubert Zhang <hzhang@pivotal.io> (cherry picked from commit 2a1794bc)
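As a rough illustration of why a reload is needed, the sketch below resolves a host string to an IPv4 address once, using the standard `getaddrinfo()` call. The function name `resolve_ipv4` is hypothetical and the code is not taken from the commit; the point is only that the name is translated to an address at resolution time, so a later DNS change is not picked up until the setting is resolved again.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: resolve `host` (a hostname or a numeric IPv4
 * address) and write the first IPv4 result to `out` in dotted form.
 * Returns 0 on success, -1 on failure.
 */
int
resolve_ipv4(const char *host, char *out, size_t outlen)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    const struct sockaddr_in *sin;
    int ok;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;        /* IPv4 only, to keep the sketch short */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, NULL, &hints, &res) != 0 || res == NULL)
        return -1;

    /* The answer is a snapshot: it reflects the binding at this moment. */
    sin = (const struct sockaddr_in *) res->ai_addr;
    ok = inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t) outlen) != NULL;
    freeaddrinfo(res);
    return ok ? 0 : -1;
}
```

Numeric addresses pass through the same path unchanged, which matches the note that IP addresses remain supported alongside hostnames.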
-
Committed by Ning Yu
A typical mistake when allocating typed memory looks like this:

    int64 *ptr = malloc(sizeof(int32));

To prevent this, we make ic_proxy_new() a typed allocator that always returns a pointer of the specified type, for example:

    int64 *p1 = ic_proxy_new(int64); /* good */
    int64 *p2 = ic_proxy_new(int32); /* bad, gcc will raise a warning */

Reviewed-by: Hubert Zhang <hzhang@pivotal.io> (cherry picked from commit a3ef623d)
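One common way to build such a typed allocator is a macro that casts the allocation to a pointer of the requested type. The sketch below uses a hypothetical name, `typed_new`, and plain `calloc()`; it shows the general technique and is an assumption about the shape of the idea, not the actual definition of ic_proxy_new().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical typed allocator: the cast makes the expression a `type *`,
 * so assigning the result to a pointer of a different type produces an
 * incompatible-pointer-type diagnostic instead of compiling silently.
 */
#define typed_new(type) ((type *) calloc(1, sizeof(type)))

int64_t
typed_new_demo(void)
{
    int64_t *p = typed_new(int64_t);        /* good: types agree */
    /* int64_t *bad = typed_new(int32_t);      would draw a compiler warning */
    int64_t value;

    assert(p != NULL);      /* sketch only: real code should handle OOM */
    *p = 42;
    value = *p;
    free(p);
    return value;
}
```

The design choice is that the size and the pointer type come from the same macro argument, so they cannot drift apart the way they can with a bare `malloc(sizeof(...))`.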
-
Committed by (Jerome)Junfeng Yang
We used to have an `Assert` to check that `RegisterFileWithSet` never registers a file to a non-active workfile_set. But in production, there could be corner cases in which a caller registers a file to a non-active workfile_set. This makes `workfile_shared->num_active` inconsistent with the real number of active workfile_sets in some situations. For example:
1. `RegisterFileWithSet` registers a file to a created work_set (`work_set->num_files` is now 1).
2. `FileClose` closes the file and causes `WorkFileDeleted` to delete the work_set, since `work_set->num_files` is 0 after the file is deleted. This also decreases `workfile_shared->num_active`.
3. `RegisterFileWithSet` registers another file to the same work_set (which is actually not active now, but we don't prevent that; we only use `Assert` to check).
4. `FileClose` closes the file and causes `WorkFileDeleted` to delete the work_set again. `workfile_shared->num_active` gets decreased again.
Raise PANIC to expose the corner cases. Normally the caller of `RegisterFileWithSet` should ensure correctness, but we lacked the check in `RegisterFileWithSet` itself. (cherry picked from commit c23980cb)
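The four steps above can be modelled with a toy counter. Everything in this sketch is hypothetical (the struct, the function names, and the boolean return value); in particular, where the model returns `false`, the commit raises PANIC instead. It only shows why rejecting registration on an inactive set keeps the active count from being decremented twice.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a workfile set; not the server's data structure. */
typedef struct
{
    bool active;
    int  num_files;
} work_set_model;

/* Stands in for workfile_shared->num_active. */
static int num_active = 0;

void
set_create(work_set_model *ws)
{
    ws->active = true;
    ws->num_files = 0;
    num_active++;
}

/* Returns false for an inactive set (the real code raises PANIC here). */
bool
set_register_file(work_set_model *ws)
{
    if (!ws->active)
        return false;
    ws->num_files++;
    return true;
}

/* Closing the last file deletes the set and decrements the counter once. */
void
set_close_file(work_set_model *ws)
{
    assert(ws->active && ws->num_files > 0);
    if (--ws->num_files == 0)
    {
        ws->active = false;
        num_active--;
    }
}

int
active_sets(void)
{
    return num_active;
}
```

Without the check in `set_register_file`, steps 3 and 4 would run `set_close_file` on a set that was already deleted and push the counter below its true value.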
-
Committed by Hans Zeller
Refactor caching of scalar expressions in MEMO groups
-----------------------------------------------------
We cache a full CExpression tree for scalar expressions in each scalar CGroup, but only if no subqueries are involved. With subqueries, things get a bit complicated and we store CExpressions with incorrect arity. This caused a problem in handling nary LOJs, because for those we have to descend into the scalar expression. Refactored the code such that we can ask a CGroupExpression for an exact scalar expression tree, with the possibility that it will return NULL if there is a subquery in the expression. Changed the numerous callers of this method into two subclasses:
1. Places where we accept an inexact version of the scalar expression, with subqueries replaced by a "true" boolean constant or a NULL value. This works well for statistics derivation, costing, etc. where some imprecision is acceptable.
2. Places where we need an exact expression, like constraint derivation. Those places have to deal with the possibility of getting a NULL pointer.

Handle NAry joins that contain LOJs in the decorrelator
-------------------------------------------------------
The decorrelator methods didn't handle the special scalar expression that is used when we have LOJs in an NAry join. The fix checks for outer references in the ON clause and the right child of LOJs and prevents decorrelation when such outer refs are found. The fix also passes the correct inner join predicates to recursive calls and updates any changes in the inner join predicates. Finally, this also fixes a bug unrelated to NAry joins, when we have a regular 2-way LOJ or FOJ with outer refs in the ON clause or the right child (for LOJ) or any child (FOJ). We shouldn't decorrelate such outer refs.

Look for specific outer references when decorrelating
-----------------------------------------------------
Changing the decorrelator from trying to push up all outer references to pushing up only those outer references that come from the outer side of the apply that is driving the decorrelation process. This is done by passing the ColRefs to remove as an additional argument. There are two reasons for this change: First, it allows us to fix a bug. The existing code didn't have a good way of checking whether there are any outer refs in the ON clause of a left join that's part of an NAry join. Second, the new code should allow us to decorrelate hierarchies of subqueries better, by decorrelating those that satisfy the conditions while leaving the rest in nested form.

Fix for fallback on correlated subquery with exhaustive2
--------------------------------------------------------
We currently don't expand NAry joins if they have outer references. To avoid a regression when moving from "exhaustive" to "exhaustive2", we need to allow expansion of NAry joins that can't be decorrelated, because there are outer refs in the LOJ parts of them. The added fix is a bit more general: it allows expansion of NAry joins with LOJs in them, regardless of where the outer refs are.

Fix join stats calculation for NAry join with LOJs and outer refs
-----------------------------------------------------------------
The join stats calculation had a bug when using optimizer_join_order = exhaustive2. We didn't handle outer refs correctly when encountering the new flavor of NAry joins that contain left outer joins. As a result, we would crash in retail builds and run into an assert in debug builds.

Fix handling of NAry LOJs in CPredicateUtils::PexprRemoveImpliedConjuncts
-------------------------------------------------------------------------
This method also needs to preserve the CScalarNAryJoinPredList operator. We are using a "representative" expression. In addition to fixing an assert, this has two consequences:
- queries with a mix of predicates and subqueries, such as a = 5 and (sq), will get somewhat better estimates, as they will use the non-subquery parts.
- queries with scalar subqueries and negated subqueries may get lower estimates, which are probably more risky than overestimates. Examples:

    where a = (sq) gets converted to where a = NULL
    where not (sq) gets converted to where FALSE

Handle n-ary LOJs in subquery to apply xform
--------------------------------------------
The CXformSubqNAryJoin2Apply xform didn't handle NAry joins with LOJs correctly. Added logic to preserve the LOJ-related data in the NAry join and to push subqueries only to inner join children.

Remove failing assert from xform
--------------------------------
Looks like other code changes suddenly expose this xform. The assert isn't correct. It calls the promise function on a CExpression, but the promise function is written to work only with a CGroupExpression attached to the expression handle. Since the assert doesn't seem very useful, I just removed it.

Fixes for preprocessor-related methods
--------------------------------------
Fixed CLogicalNAryJoin::DeriveNotNullColumns and CLogicalNAryJoin::DerivePropertyConstraint so that they handle NAry LOJs correctly. Columns from LOJ children are always nullable. Equivalence classes and property constraints can only be passed to the parent if they come from non-LOJ children.

Expand NAry joins with outer refs in "exhaustive2"
--------------------------------------------------
When optimizer_join_order is set to "exhaustive", that also enables the "query" join order. When we have NAry joins with outer references, only the "query" join order gets triggered. This is because in many cases, we will be able to decorrelate the query tree. The "query" join order provides a stop-gap for when decorrelation isn't possible. We need a similar stop-gap for "exhaustive2", where DP, query, mincard and greedy are all baked into a single transform. This fix enables the DPv2 xform to fire on NAry joins with outer refs. Note that for now we do a full expand, assuming that DPv2 can handle this, given that very large joins in subqueries are not common. We could restrict the logic to query only, but that would be a bit messy.

Handle NAry joins with outer references in DPv2
-----------------------------------------------
We need one change in the DPv2 xform to handle joins with outer references. This is because we may see predicates of the form <col> = <outer ref>. Such predicates are not true join predicates involving multiple tables. Therefore, they need to be applied as a separate select node on top of the expanded join tree. To do this, we need to build an "expression to edge map", used to find such unused edges. This fix ensures that we build the expression to edge map if the NAry join has outer references.

Add tests
---------
Added some tests, both explains and actual queries, to gporca.sql. Some of these tests fall back to planner, as they would have before this change. Others fail in the executor, see https://github.com/greenplum-db/gpdb/issues/10791. The executor failure happens without ORCA, so it is an independent issue. (cherry picked from commit c1d143b1eb14247de021fb66423ec401330f7496)
-
Committed by Bhuvnesh Chaudhary
When the standby is unreachable and the user proceeds with startup, an attempt was still made to start the standby, resulting in a stack trace. Detect when the standby is unreachable and set start_standby to False to prevent starting it later in the startup process. Co-authored-by: Kalen Krempely <kkrempely@vmware.com>
-
- 09 Sep, 2020 5 commits
-
-
Committed by Peifeng Qiu
On CentOS 7, the system libcurl uses NSS instead of OpenSSL as the backend for TLS/SSL connections. Previously it would fail if a hostname was used in the external table location, probably due to initialization issues. This is fixed by commit 89a1211c. Modify the original test case to use a hostname. curl is used for connection testing in gpfdist_ssl_start, but it will always fail since we are now using SSL. Change the condition to "not equal to 7". This code is described in the curl documentation: https://curl.haxx.se/libcurl/c/libcurl-errors.html
-
Committed by Xiaoran Wang
* Fix url_curl on macOS

Fix libcurl being unable to read data from gpfdist on macOS. However, gpfdist with a pipe still does not work on macOS, because flock(2), which is used in gfile.c, is not supported there.
-
Committed by Huiliang.liu
* Enable and enhance gpfdist SSL test cases
1. Add multiple root CA test cases for gpfdist SSL
2. Fix output file due to foreign table modification
-
Committed by xiaoxiao
* Fix gpload issues with multi-level partition tables and special characters in column names. Fix the column match condition to resolve a primary key conflict when using the gpload merge mode to import data into a multi-level partition table. Fix the failure when column names contain special characters or capital letters.
* Add double quotes when creating the staging table; omit the distribution key
* Fix gpload failure when column names have capital letters in merge mode
Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>
-
Committed by David Yozie
-
- 08 Sep, 2020 1 commit
-
- 07 Sep, 2020 1 commit
-
-
Committed by xiaoxiao
* Fix gpload failure when column names contain capital letters in merge mode. Add double quotes around column names when creating the staging table omit distributio Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>
-
- 04 Sep, 2020 2 commits
-
-
Committed by Maksim Milyutin
A GROUPING SETS statement with multiple canonical rollups on a randomly distributed partitioned table causes the PostgreSQL optimizer to rebuild root->simple_rte_array and root->simple_rel_array at the planning stage. The rebuilding process initializes both RTEs and RelOptInfos item by item and doesn't take into account that the build_simple_rel() routine is recursive for inherited relations and requires a full list of already initialized child RTEs inside root->simple_rte_array. As a result, the call to build_simple_rel() fails on access to the child RTEs. This fix separates the loops that build root->simple_rte_array and root->simple_rel_array while leaving the overall semantics unchanged.
-
Committed by xiong-gang
When doing 'VACUUM FULL', 'swap_relation_files' updates the pg_class entry but does not increment the command counter, so the later 'vac_update_relstats' updates the 'relfrozenxid' and 'relhasindex' of the old tuple in place. If the transaction is then interrupted and aborted on the QE, the old entry is left corrupted.
-
- 03 Sep, 2020 4 commits
-
-
Committed by Adam Lee
This will make LISTEN and NOTIFY work on the QD node.
-
Committed by xiaoxiao
Fix the column match condition to resolve a primary key conflict when using the gpload merge mode to import data into a multi-level partition table. Fix the failure when column names contain special characters or capital letters. Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>
-
Committed by Hubert Zhang
Resource group used to access resGroupSlot in SessionState without a lock. This is correct when a session only accesses its own resGroupSlot. But since we introduced the runaway feature, we need to traverse the current session array to find the top consumer session when the redzone is reached. This requires:
1. The runaway detector should hold the shared resgroup lock so that resGroupSlot is not detached from a session concurrently when the redzone is reached.
2. A normal session should hold the exclusive lock when modifying resGroupSlot in SessionState.
Also fix a compile warning. Reviewed-by: Ning Yu <nyu@pivotal.io> (cherry picked from commit a4cb06b4)
-
Committed by Hubert Zhang
A proxy bgworker becomes an orphan process after the postmaster dies, because it does not check the pipe postmaster_alive_fds[POSTMASTER_FD_WATCH]. Epoll this pipe inside the proxy bgworker main loop as well. Reviewed-by: Ning Yu <nyu@pivotal.io> (cherry picked from commit 9ce59d1a)
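The underlying mechanism is that the read end of a pipe becomes ready once every write end has been closed, which happens automatically when the process holding the write end exits. The sketch below demonstrates this with plain `poll(2)` rather than the epoll-based loop mentioned in the message; the function names are hypothetical and the code is a standalone illustration, not the bgworker implementation.

```c
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper: report whether the write end of the pipe behind
 * `read_fd` has gone away. Nothing is ever written to a liveness pipe,
 * so any readiness (EOF, hang-up, error) is treated as "writer is gone".
 */
bool
watch_pipe_closed(int read_fd)
{
    struct pollfd pfd;

    pfd.fd = read_fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Timeout 0: check once without blocking, as a main loop would. */
    if (poll(&pfd, 1, 0) <= 0)
        return false;
    return (pfd.revents & (POLLIN | POLLHUP | POLLERR)) != 0;
}

/* Demo: closing the write end stands in for the parent process dying. */
bool
demo_detects_close(void)
{
    int  fds[2];
    bool before;
    bool after;

    if (pipe(fds) != 0)
        return false;
    before = watch_pipe_closed(fds[0]);   /* writer still open */
    close(fds[1]);
    after = watch_pipe_closed(fds[0]);    /* writer closed */
    close(fds[0]);
    return !before && after;
}
```

A worker that includes such a descriptor in its event loop can exit promptly when the parent disappears instead of lingering as an orphan.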
-
- 02 Sep, 2020 1 commit
-
-
Committed by David Kimura
This approach special-cases gp_segment_id enough to include the column as a distributed column constraint. It also updates direct dispatch info to be aware of gp_segment_id, which represents the raw value of the segment where the data resides. This is different from other columns, which hash the datum value to decide where the data resides. After this change, the following DDL and query show a Gather Motion from 2 segments on a 3-segment demo cluster.
```
CREATE TABLE t(a int, b int) DISTRIBUTED BY (a);
EXPLAIN SELECT gp_segment_id, * FROM t WHERE gp_segment_id=1 or gp_segment_id=2;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Gather Motion 2:1  (slice1; segments: 2)  (cost=0.00..431.00 rows=1 width=12)
   ->  Seq Scan on t  (cost=0.00..431.00 rows=1 width=12)
         Filter: ((gp_segment_id = 1) OR (gp_segment_id = 2))
 Optimizer: Pivotal Optimizer (GPORCA)
(4 rows)
```
(cherry picked from commit 10e2b2d9)
-