1. 17 Nov 2020, 1 commit
    • Fix multiple definition linker error in unit tests · 7ec4678f
      Committed by Jesse Zhang
      In the same spirit as ee7eb0e8, this removes the extra
      definition in postgres_test.c.
      
      This was uncovered by building with GCC 10, where -fno-common is the
      new default [1][2] (vis-à-vis -fcommon). I could also reproduce this by
      turning on "-fno-common" in older releases of GCC and Clang.
      
      Backpatch to 6X_STABLE.
      
      (cherry picked from commit 108504ff)
      7ec4678f
  2. 16 Nov 2020, 3 commits
    • Fix GROUPING SETS of multiple empty sets · 265bdcd1
      Committed by Richard Guo
      Currently if GROUPING SETS are empty sets, we will clean up
      parse->groupClause and treat it as having no groups. If meanwhile there
      are no aggregates in tlist or havingQual, the query would be planned as
      if there are no GROUPING SETS, which is not correct.
      
      This patch is just a workaround: it pretends the query has aggregates
      when it finds that the GROUPING SETS are all empty sets.
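
      A minimal reproduction along these lines (not taken from the commit;
      the table, data, and row counts are illustrative):

      ```
      -- Each empty grouping set aggregates the whole input into one group,
      -- so two empty sets should yield two result rows.
      CREATE TABLE gset_t (a int, b int) DISTRIBUTED BY (a);
      INSERT INTO gset_t VALUES (1, 1), (2, 2), (3, 3);

      SELECT 1 FROM gset_t GROUP BY GROUPING SETS ((), ());
      -- expected: 2 rows; before the fix the query could be planned as a
      -- plain ungrouped scan and return one row per input tuple instead
      ```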
      
      Fix issue #11003
      Reviewed-by: Asim R P <pasim@vmware.com>
      Reviewed-by: Paul Guo <pguo@pivotal.io>
      265bdcd1
    • Notice reject messages of external tables executed on master (#11086) · c283f6b8
      Committed by Mingli Zhang
      With a reject limit, if the limit is not reached, there should be
      messages reporting how many rows were rejected, just as `execute on
      segments` already does.
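
      A hedged sketch of the statement shape (command, file path, and columns
      are illustrative, not from the commit):

      ```
      -- A web external table executed on the master with a reject limit;
      -- selecting from it should now emit a NOTICE with the number of
      -- rejected rows when the limit is not reached.
      CREATE EXTERNAL WEB TABLE ext_on_master (a int, b text)
          EXECUTE 'cat /tmp/rows.txt' ON MASTER
          FORMAT 'TEXT' (DELIMITER '|')
          LOG ERRORS SEGMENT REJECT LIMIT 100 ROWS;

      SELECT count(*) FROM ext_on_master;
      ```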
      
      Co-authored-by: Mingli Zhang <zmingli@vmware.com>
      Co-authored-by: Adam Lee <adlee@vmware.com>
      c283f6b8
    • Fix overflow of shmCommittedGxactArray on standby (#11071) · b81a580a
      Committed by dreamedcheng
      Previously, the standby would replay a checkpoint XLOG record's DTX info
      in the function XLogProcessCheckpointRecord. However, in certain cases
      this causes an anomaly: a DTX may have flushed its FORGET COMMITTED XLOG
      record to disk but not yet changed its own state to
      DTX_STATE_INSERTED_FORGET_COMMITTED. If the checkpoint process is
      calculating DTX info at this very moment, it includes the DTX in its
      XLOG record. So when the standby replays this checkpoint XLOG record
      from the master, it adds an already forgotten GID to
      shmCommittedGxactArray again, which may overflow shmCommittedGxactArray.
      
      Since DTX info stored in checkpoint XLOG record has been populated earlier in
      ReadCheckpointRecord(), there is no need to call XLogProcessCheckpointRecord()
      again during recovery.
      Co-authored-by: wuchengwen <wcw190496@alibaba-inc.com>
      Co-authored-by: Denis Smirnov <sd@arenadata.io>
      
      (cherry picked from afcf30be)
      
      One difference is that, after the restore_memory_accounting_default
      test, more time seems to be needed for the standby to become ready. The
      test in the patch fails on the PR pipeline because it detects that the
      standby is not in sync before running. Enlarge the wait time to 5000,
      the same as in other tests. If we see this flaky behavior on master in
      the future, we should port this change to master as well.
      b81a580a
  3. 13 Nov 2020, 1 commit
    • Fix handling of empty constraint array in Orca (#11139) (#11153) · bd4f895b
      Committed by Chris Hajas
      During preprocessing, Orca simplifies the query by merging/deduplicating
      constraints. However, Orca did not consider the case where the passed-in
      constraint compared a column with an empty array, which caused it to
      crash on predicates such as `a = ANY('{}')`.
      
      Instead, we now explicitly return an empty constraint when dealing with an ANY('{}')
      clause. For ALL('{}'), we won't process the constraint as there's no
      simplification to do.
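
      A hypothetical reproduction (the table is illustrative, not from the
      commit):

      ```
      CREATE TABLE arr_t (a int) DISTRIBUTED BY (a);

      SET optimizer = on;   -- plan with Orca
      -- previously crashed; now treated as an empty (always-false) constraint
      EXPLAIN SELECT * FROM arr_t WHERE a = ANY('{}'::int[]);
      -- not simplified; the constraint is simply not processed
      EXPLAIN SELECT * FROM arr_t WHERE a = ALL('{}'::int[]);
      ```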
      
      (cherry picked from commit f97a3e23)
      bd4f895b
  4. 12 Nov 2020, 2 commits
  5. 09 Nov 2020, 1 commit
  6. 06 Nov 2020, 1 commit
  7. 05 Nov 2020, 3 commits
  8. 04 Nov 2020, 5 commits
    • Fix dangling pointer issue when refreshing a matview relation · 300afe37
      Committed by Robert Mu
      2019-12-04 23:16:23.075209 CST,"gpadmin","regression",p30552,th-790954624,"[local]",,2019-12-04 23:16:03 CST,8192,con143,cmd54,seg-1,,,x8192,sx1,"ERROR","XX000","unrecognized node type: 0 (copyfuncs.c:6424)",,,,,,"REFRESH MATERIALIZED VIEW m_aocs WITH NO DATA;",0,,"copyfuncs.c",6273,"Stack trace:
      1    0xaf1f9c postgres errstart (elog.c:561)
      2    0xaf4b83 postgres elog_finish (elog.c:1734)
      3    0x7b15f7 postgres copyObject (copyfuncs.c:6424)
      4    0x6c6ad1 postgres ExecRefreshMatView (matview.c:409)
      5    0x997841 postgres <symbol not found> (utility.c:1743)
      6    0x9969f4 postgres standard_ProcessUtility (utility.c:1071)
      7    0x993bc5 postgres <symbol not found> (palloc.h:176)
      8    0x9945c5 postgres <symbol not found> (pquery.c:1552)
      9    0x995a21 postgres PortalRun (pquery.c:1022)
      10   0x9908d4 postgres <symbol not found> (postgres.c:1791)
      11   0x99359b postgres PostgresMain (postgres.c:5123)
      12   0x541c32 postgres <symbol not found> (postmaster.c:4445)
      13   0x87dcea postgres PostmasterMain (postmaster.c:1519)
      14   0x543fbb postgres main (discriminator 1)
      15   0x7f4ccb783505 libc.so.6 __libc_start_main + 0xf5
      16   0x54485f postgres <symbol not found> + 0x54485f
      
      Root cause:
      1. The rule (a query tree with parentStmtType = PARENTSTMTTYPE_NONE) is
         saved in the pg_rewrite table when the matview relation is created.
      In ExecRefreshMatView:
      2. The dataQuery pointer is used to get the rule (query tree) of the
         matview relation data from the relcache.
      3. dataQuery->parentStmtType is set to PARENTSTMTTYPE_REFRESH_MATVIEW.
      4. The QD may receive a reset message (shared-inval-queue overflow) when
         make_new_heap is called, causing the QD to rebuild the backend's
         entire relcache, including the matview relation. When rebuilding the
         matview relcache entry, oldRel->rule (parentStmtType =
         PARENTSTMTTYPE_REFRESH_MATVIEW) is found to differ from newRel->rule
         (parentStmtType = PARENTSTMTTYPE_NONE), so oldRel->rule (dataQuery)
         is released.
      5. refresh_matview_datafill then uses the dangling dataQuery and reports
         an error.
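
      A sketch of the statement shape only (the crash requires a shared-inval
      queue overflow at just the right moment, so this is not a deterministic
      reproduction; table and view names are illustrative):

      ```
      CREATE TABLE src (a int) DISTRIBUTED BY (a);
      CREATE MATERIALIZED VIEW m_test AS SELECT * FROM src;

      -- previously could fail with "unrecognized node type" if the matview's
      -- relcache entry was rebuilt mid-refresh, freeing the saved dataQuery
      REFRESH MATERIALIZED VIEW m_test WITH NO DATA;
      ```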
      
      (cherry picked from commit 474088cb)
      300afe37
    • Fix the FATAL error of autovacuum on template0 (#10920) · e0f598b1
      Committed by 盏一
      The autovacuum worker for template0 would exit with a FATAL error,
      because Gp_session_role is still the dispatcher role, with the message
      below.
      
      2020-09-29 21:20:02.686827 CST,,,p19902,th-881792832,,,,0,,,seg2,,,,sx1,"FATAL","57P03","
      connections to primary segments are not allowed","This database instance is running as a
      primary segment in a Greenplum cluster and does not permit direct connections.","
      To force a connection anyway (dangerous!), use utility mode.",,,,,0,,"postinit.c",1151,
      
      Fix this by setting Gp_session_role to the utility role.
      e0f598b1
    • Mask all signals in the UDP pthreads · e0c0217a
      Committed by 盏一
      In some cases, signals (like SIGQUIT) that should only be processed by
      the main thread of the postmaster may be dispatched to rxThread. It is
      therefore both necessary and safe to block all signals in the UDP
      interconnect pthreads.
      
      Fix #11006
      
      (cherry picked from commit 54451fc0)
      e0c0217a
    • Experimental cost model update (#11083) · 42a7d061
      Committed by Hans Zeller
      * Avoid costing change for IN predicates on btree indexes
      
      Commit e5f1716 changed the way we handle IN predicates on indexes: it
      now uses a more efficient array comparison instead of treating them
      like OR predicates. A side effect is that the cost function,
      CCostModelGPDB::CostBitmapTableScan, now goes through a different code
      path, using the "small NDV" or "large NDV" costing method. This produces
      very high cost estimates when the NDV increases beyond 2, so we
      basically never choose an index for these cases, although a btree
      index used in a bitmap scan isn't very sensitive to the NDV.
      
      To avoid this, we go back to the old formula we used before commit e5f1716.
      The fix is restricted to IN predicates on btree indexes, used in a bitmap
      scan.
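
      An illustrative query shape (table, index, and values are hypothetical;
      whether the bitmap index path actually wins still depends on statistics):

      ```
      -- An IN list on a btree-indexed column of an AO table, answered by a
      -- bitmap scan, should no longer be heavily penalized as the number of
      -- IN-list values grows beyond 2.
      CREATE TABLE cal_t (id int, val int)
          WITH (appendonly=true) DISTRIBUTED BY (id);
      CREATE INDEX cal_t_val_ix ON cal_t USING btree (val);

      SET optimizer = on;
      EXPLAIN SELECT * FROM cal_t WHERE val IN (1, 2, 3, 4, 5);
      ```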
      
      * Add an MDP for a larger IN list, using a btree index on an AO table
      
      * Misc. changes to the calibration test program
      
      - Added tests for btree indexes (btree_scan_tests).
      - Changed data distribution so that all column values range from 1...n.
      - Parameter values for test queries are now proportional to selectivity,
        a parameter value of 0 produces a selectivity of 0%.
      - Changed the logic to fake statistics somewhat, hopefully this will
        lead to more precise estimates. Incorporated the changes to the
        data distribution with no more 0 values. Added fake stats for
        unique columns.
      - Headers of tests now use semicolons to separate parts, to give
        a nicer output when pasting into Google Docs.
      - Some formatting changes.
      - Log fallbacks.
      - When using existing tables, the program now determines the table
        structure (heap or append-only) and the row count.
      - Split off two very slow tests into separate test units. These are
        not included when running "all" tests, they have to be run
        explicitly.
      - Add btree join tests, rename "bitmap_join_tests" to "index_join_tests"
        and run both bitmap and btree joins
      - Update min and max parameter values to cover a range that includes
        or at least is closer to the cross-over between index and table scan
      - Remove the "high NDV" tests, since the ranges in the general test
        now include both low and high NDV cases (<= and > 200)
      - Print out selectivity of each query, if available
      - Suppress standard deviation output when we execute queries only once
      - Set search path when connecting
      - Decrease the parameter range when running bitmap scan tests on
        heap tables
      - Run btree scan tests only on AO tables, they are not designed
        for testing index scans
      
      * Updates to the experimental cost model, new calibration
      
       1. Simplify some of the formulas; the calibration process seemed to justify
          that. We might have to revisit this if problems come up. Changes:
          - Rewrite some of the formulas so the costs per row and costs per byte
            are easier to see
         - Make the cost for the width directly proportional
         - Unify the formula for scans and joins, use the same per-byte costs
           and make NDV-dependent costs proportional to num_rebinds * dNDV,
           except for the logic in item 3.
      
         That makes the cost for the new experimental cost model a simple linear formula:
      
         num_rebinds * ( rows * c1 + rows * width * c2 + ndv * c3 + bitmap_union_cost + c4 ) + c5
      
         We have 5 constants, c1 ... c5:
      
         c1: cost per row (rows on one segment)
         c2: cost per byte
         c3: cost per distinct value (total NDV on all segments)
         c4: cost per rebind
         c5: initialization cost
         bitmap_union_cost: see item 3 below
      
      2. Recalibrate some of the cost parameters, using the updated calibration
         program src/backend/gporca/scripts/cal_bitmap_test.py
      
      3. Add a cost penalty for bitmap index scans on heap tables. The added
         cost takes the form bitmap_union_cost = <base table rows> * (NDV-1) * c6.
      
         The reason for this is, as others have pointed out, that heap tables
         lead to much larger bit vectors, since their CTIDs are more spaced out
         than those of AO tables. The main factor seems to be the cost of unioning
         these bit vectors, and that cost is proportional to the number of bitmaps
         minus one and the size of the bitmaps, which is approximated here by the
         number of rows in the table.
      
         Note that because we use (NDV-1) in the formula, this penalty does not
         apply to usual index joins, which have an NDV of 1 per rebind. This is
         consistent with what we see in measurements and it also seems reasonable,
         since we don't have to union bitmaps in this case.
      
      4. Fix to select CostModelGPDB for the 'experimental' model, as we do in 5X.
      
      5. Calibrate the constants involved (c1 ... c6), using the calibration program
         and running experiments with heap and append-only tables on a laptop and
         also on a Linux cluster with 24 segments. Also run some other workloads
         for validation.
      
      6. Give a small initial advantage to bitmap scans, so they will be chosen over
         table scans for small tables. Otherwise, small queries will
         have more or less random plans, all of which cost around 431, the value
         of the initial cost. Added a 10% advantage for the bitmap scan.
      42a7d061
    • Improve partition elimination when indexes are present (#10970) · 4a7a6821
      Committed by Hans Zeller
      * Use original join pred for DPE with index nested loop joins
      
      Dynamic partition selection is based on a join predicate. For index
      nested loop joins, however, we push the join predicate to the inner
      side and replace the join predicate with "true". This meant that
      we couldn't do DPE for nested index loop joins.
      
      This commit remembers the original join predicate in the index nested
      loop join, to be used in the generated filter map for DPE. The original
      join predicate needs to be passed through multiple layers.
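
      An illustrative query shape (tables and predicates are hypothetical):

      ```
      -- Join on the partition key of a partitioned table; when the planner
      -- picks an index nested loop join, DPE on the fact partitions can now
      -- still use the original join predicate.
      CREATE TABLE dim (id int, flag int) DISTRIBUTED BY (id);
      CREATE TABLE fact (id int, pk int, v int)
          DISTRIBUTED BY (id)
          PARTITION BY RANGE (pk) (START (1) END (101) EVERY (10));
      CREATE INDEX fact_pk_ix ON fact (pk);

      SET optimizer = on;
      SET optimizer_enable_hashjoin = off;
      EXPLAIN SELECT * FROM dim JOIN fact ON dim.id = fact.pk WHERE dim.flag = 1;
      ```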
      
      * SPE for index preds
      
      Some of the xforms use method CXformUtils::PexprRedundantSelectForDynamicIndex
      to duplicate predicates that could be used both as index predicates and as
      partition elimination predicates. The call was missing in some other xforms.
      Added it.
      
      * Changes to equivalent distribution specs with redundant predicates
      
      Adding redundant predicates causes some issues with generating
      equivalent distribution specs, to be used for the outer table of
      a nested index loop join. We want the equivalent spec to be
      expressed in terms of outer references, which are the columns of
      the outer table.
      
      By passing in the outer refs, we can ensure that we won't replace
      an outer ref in a distribution spec with a local variable from
      the original distribution spec.
      
      Also removed the asserts in CPhysicalFilter::PdsDerive that ensure the
      distribution spec is complete (consisting of only columns from the
      outer table) after we see a select node. Even without my changes, the
      asserts do not always hold, as this test case shows:
      
        drop table if exists foo, bar;
        create table foo(a int, b int, c int, d int, e int) distributed by(a,b,c);
        create table bar(a int, b int, c int, d int, e int) distributed by(a,b,c);
      
        create index bar_ixb on bar(b);
      
        set optimizer_enable_hashjoin to off;
        set client_min_messages to log;
      
        -- runs into assert
        explain
        select *
        from foo join bar on foo.a=bar.a and foo.b=bar.b
        where bar.c > 10 and bar.d = 11;
      
      Instead of the asserts, we now use the new method of passing in the
      outer refs to ensure that we move towards completion. We also know
      now that we can't always achieve a complete distribution spec, even
      without redundant predicates.
      
      * MDP changes
      
      Various changes to MDPs:
      
      - New SPE filters used in plan
      - New redundant predicates (partitioning or on non-partitioning columns)
      - Plan space changes
      - Cost changes
      - Motion changes
      - Regenerated, because plan switched to a hash join, so used a guc
        to force an index plan
      - Fixed lookup failures
      - Add mdp where we try unsuccessfully to complete a distribution spec
      
      * ICG result changes
      
      - Test used the 'experimental' cost model to force an index scan, but we
        now get the index scan even with the default cost model.
      4a7a6821
  9. 03 Nov 2020, 2 commits
    • Fix resgroup unusable if its dropping failed · 3e548e69
      Committed by xiong-gang
      In DropResourceGroup(), group->lockedForDrop is set to true by calling
      ResGroupCheckForDrop; however, it can only be set back to false inside
      dropResgroupCallback. This callback is registered at the end of
      DropResourceGroup(), so if an error occurred between the two,
      group->lockedForDrop would stay true forever.

      Fix it by registering the callback ahead of the lock call. To prevent
      Assert(group->nRunning* > 0) from firing if ResGroupCheckForDrop throws
      an error, return directly if group->lockedForDrop did not change.
      
      See:
      
      ```
      gpconfig -c gp_resource_manager -v group
      gpstop -r -a
      
      psql
                      CPU_RATE_LIMIT=20,
                      MEMORY_LIMIT=20,
                      CONCURRENCY=50,
                      MEMORY_SHARED_QUOTA=80,
                      MEMORY_SPILL_RATIO=20,
                      MEMORY_AUDITOR=vmtracker
              );
      
      psql -U user_test
      > \d -- hang
      ```
      Co-authored-by: dh-cloud <60729713+dh-cloud@users.noreply.github.com>
      3e548e69
    • Reset wrote_xlog in pg_conn to avoid keeping old value. · 310ab79b
      Committed by (Jerome)Junfeng Yang
      On the QD, whether a QE wrote xlog is tracked in the libpq connection.

      The logic is: if a QE writes xlog, it sends a libpq message to the QD.
      But that message is sent in ReadyForQuery, so before the QE executes
      that function it may already have sent results back to the QD. Then,
      when the QD processes this message, it does not read the new wrote_xlog
      value. This leaves the connection holding the wrote_xlog value from the
      previous dispatch, which affects whether one-phase commit is chosen.

      The issue only happens when the QE flushes the libpq message before
      ReadyForQuery, and it is hard to find a test case to cover it. I found
      the issue while experimenting with code that sends some information
      from the QE to the QD; it broke the gangsize test, which shows the
      commit info.
      
      (cherry picked from commit 777b51cd)
      310ab79b
  10. 02 Nov 2020, 2 commits
  11. 31 Oct 2020, 1 commit
  12. 30 Oct 2020, 3 commits
    • Docs - update interconnect proxy discussion to cover hostname support (#11027) · 9284cf79
      Committed by David Yozie
      * Docs - update interconnect proxy discussion to cover hostname support
      
      * Change gp_interconnect_type -> gp_interconnect_proxy_addresses in note
      9284cf79
    • docs - update some troubleshooting info (#11064) · b6cdd143
      Committed by Lisa Owen
      b6cdd143
    • gpinitsystem -I should respect master dbid != 1 · 7e914f23
      Committed by dh-cloud
      Looking at GP documents, there is no indication that master dbid
      must be 1. However, when CREATE_QD_DB, gpinitsystem always writes
      "gp_dbid=1" into file `internal.auto.conf` even if we specify:
      
      ```
      mdw~5432~/data/master/gpseg-1~2~-1
       OR
      mdw~5432~/data/master/gpseg-1~0~-1
      ```
      
      But the gp_segment_configuration catalog can hold the correct master
      dbid value (2 or 0), and the mismatch causes gpinitsystem to hang.
      Users can run into this problem the first time they use
      gpinitsystem -I.
      
      Here we test dbid 0, because PostmasterMain() simply checks
      dbid >= 0 (non-utility mode); it says:
      
      > This value must be >= 0, or >= -1 in utility mode
      
      It seems 0 is a valid value.
      
      Changes:
      
      - use specified master dbid field when CREATE_QD_DB.
      Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>
      (cherry picked from commit 00ae3013)
      7e914f23
  13. 29 Oct 2020, 2 commits
    • docs - add info about postgres_fdw module (#11075) · c188ed7a
      Committed by Lisa Owen
      c188ed7a
    • Skip fts probe for fts process · db74df9f
      Committed by dh-cloud
      If cdbcomponent_getCdbComponents() caught an error thrown by
      getCdbComponents, FtsNotifyProber would be called. But if this happened
      inside the FTS process itself, the FTS process would hang.

      Skip the FTS probe when running in the FTS process; with that change,
      in the same situation, the FTS process exits and is then restarted by
      the postmaster.
      
      (cherry picked from commit 3cf72f6c)
      db74df9f
  14. 28 Oct 2020, 10 commits
    • Mock cmd.get_stdout() to fix test regression · bfb71c43
      Committed by Adam Lee
      Otherwise it will raise an exception "command not run yet".
      bfb71c43
    • gprecoverseg: log the error if pg_rewind fails · 61b5c3bd
      Committed by Adam Lee
      Previously, the error message was not logged when pg_rewind failed; fix
      that to make DBAs', field engineers', and developers' lives easier.
      
      Before this:
      ```
      20201022:15:19:10:011118 gprecoverseg:earth:adam-[INFO]:-Running pg_rewind on required mirrors
      20201022:15:19:12:011118 gprecoverseg:earth:adam-[WARNING]:-Incremental recovery failed for dbid 2. You must use gprecoverseg -F to recover the segment.
      20201022:15:19:12:011118 gprecoverseg:earth:adam-[INFO]:-Starting mirrors
      20201022:15:19:12:011118 gprecoverseg:earth:adam-[INFO]:-era is 0406b847bf226356_201022151031
      ```
      
      After this:
      ```
      20201022:15:33:31:019577 gprecoverseg:earth:adam-[INFO]:-Running pg_rewind on required mirrors
      20201022:15:33:31:019577 gprecoverseg:earth:adam-[WARNING]:-pg_rewind: fatal: could not find common ancestor of the source and target cluster's timelines
      20201022:15:33:31:019577 gprecoverseg:earth:adam-[WARNING]:-Incremental recovery failed for dbid 2. You must use gprecoverseg -F to recover the segment.
      20201022:15:33:31:019577 gprecoverseg:earth:adam-[INFO]:-Starting mirrors
      20201022:15:33:31:019577 gprecoverseg:earth:adam-[INFO]:-era is 0406b847bf226356_201022151031
      ```
      61b5c3bd
    • Hardcode missing cdblegacyhash_bpchar in isLegacyCdbHashFunction check · 4c176763
      Committed by Jimmy Yih
      Currently, when inserting into a table distributed by a bpchar using
      the legacy bpchar hash operator, the row goes through jump consistent
      hashing instead of lazy modular hashing. This is because the
      cdblegacyhash_bpchar funcid is missing from the
      isLegacyCdbHashFunction check function which determines if an
      attribute is using a legacy hash function or not. The funcids
      currently in that check function come from the auto-generated
      fmgroids.h header file which only creates a DEFINE for the
      pg_proc.prosrc field. Unfortunately, cdblegacyhash_bpchar is left out
      because its prosrc is cdblegacyhash_text.
      
      A proper fix would require a catalog change. To fix this issue in
      6X_STABLE, we need to hardcode cdblegacyhash_bpchar funcid 6148 into
      the isLegacyCdbHashFunction check function. This should be fine since
      the GPDB 6X_STABLE catalog is frozen.
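
      A sketch of the behavior being fixed (assumes the gp_use_legacy_hashops
      GUC; table, column, and values are illustrative):

      ```
      -- Distribute a table by a char(n) column using the legacy hash
      -- operator classes. Before this fix, inserted rows were placed by jump
      -- consistent hashing even though the distribution policy declares the
      -- legacy bpchar hash, so rows could land on unexpected segments.
      SET gp_use_legacy_hashops = on;
      CREATE TABLE bp_t (c char(8), v int) DISTRIBUTED BY (c);

      INSERT INTO bp_t VALUES ('abc', 1);
      SELECT gp_segment_id, * FROM bp_t;   -- check which segment the row landed on
      ```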
      
      This issue was reported by github user cobolbaby in the gpbackup
      repository while the user was migrating GPDB 5X tables to GPDB 6X:
      https://github.com/greenplum-db/gpbackup/issues/425
      4c176763
    • Pin PR resource to v0.21 to avoid Github abuse rate limit · aa4cf74b
      Committed by Jesse Zhang
      We started hitting this on Thursday, and there have been ongoing reports
      from the community about this as well. While upstream is figuring out a
      long-term solution [1], we've been advised [2] to pin to the previous
      release (v0.21.0) to avoid being blocked for hours at a time.
      
      [1]: https://github.com/telia-oss/github-pr-resource/pull/238
      [2]: https://github.com/telia-oss/github-pr-resource/pull/238#issuecomment-714830491
      
      (cherry picked from commit f4bf9be6)
      aa4cf74b
    • Remove duplicate definition of data_directory · 8f0efc83
      Committed by Bradford D. Boyle
      Compiling with gcc 10 on Debian testing fails with the following error:
      
      ```
      /usr/bin/ld: utils/misc/guc_gp.o:(.bss+0x308): multiple definition of
      `data_directory'; utils/misc/guc.o:(.bss+0x70): first defined here
      ```
      8f0efc83
    • Update python configure macros · 9f94b977
      Committed by Bradford D. Boyle
      Some platforms do not have unversioned "python" available but do have
      versioned "python2". Configure should look for either "python" or
      "python2" when run with the option "--with-python".
      
      These changes were manually copied from the Postgres build system but
      omitted searching for "python3" since Greenplum does not have support
      for Python 3 yet.
      9f94b977
    • docs - remove duplicate left nav entry (#11040) · 2bfcffe4
      Committed by Lisa Owen
      2bfcffe4
    • 533fc4ed
    • Add workload3 to explain pipeline · 7192731a
      Committed by David Kimura
      (cherry picked from commit 91ed33c9)
      7192731a
    • Copy and follow symlinks in run_explain_suite pipeline · d92683cc
      Committed by David Kimura
      This allows us to reduce code duplication of workload SQL scripts.
      
      (cherry picked from commit 8c204bd5)
      d92683cc
  15. 27 Oct 2020, 3 commits
    • postgres_fdw: disable UPDATE/DELETE on foreign Greenplum servers · e1fed42a
      Committed by Xiaoran Wang
      Greenplum only supports INSERT here, because UPDATE/DELETE requires the
      hidden gp_segment_id column and also runs into the "ModifyTable mixes
      distributed and entry-only tables" issue.
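
      A hypothetical usage sketch (server, user mapping, and table names are
      illustrative):

      ```
      CREATE EXTENSION IF NOT EXISTS postgres_fdw;
      CREATE SERVER gp_remote FOREIGN DATA WRAPPER postgres_fdw
          OPTIONS (host 'remote-master', port '5432', dbname 'postgres');
      CREATE USER MAPPING FOR CURRENT_USER SERVER gp_remote
          OPTIONS (user 'gpadmin');
      CREATE FOREIGN TABLE remote_t (a int) SERVER gp_remote
          OPTIONS (table_name 't');

      INSERT INTO remote_t VALUES (1);        -- still supported
      UPDATE remote_t SET a = 2 WHERE a = 1;  -- rejected when the remote server is Greenplum
      ```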
      e1fed42a
    • Remove Orca assertions when merging buckets · 44621b6f
      Committed by Chris Hajas
      These assertions started getting tripped in the previous commit when
      adding tests, but they aren't related to the Epsilon change. Rather,
      we're calculating the frequency of a singleton bucket using two
      different methods, which causes the assertion to break down. The first
      method (calculating the upper_third) assumes the singleton has 1 NDV
      and that there is an even distribution across the NDVs. The second (in
      GetOverlapPercentage) calculates a "resolution" based on Epsilon and
      assumes the bucket contains some small Epsilon frequency. This makes
      the overlap percentage too high; it too should likely be based on the
      NDV instead.
      
      In practice, this won't have much impact unless the NDV is very small.
      Additionally, the conditional logic is based on the bounds, not
      frequency. However, it would be good to align in the future so our
      statistics calculations are simpler to understand and predictable.
      
      For now, we'll remove the assertions and add a TODO. Once we align the
      methods, we should add these assertions back.
      44621b6f
    • Fix stats bucket logic for Double values in UNION queries in Orca · 45e49e17
      Committed by Chris Hajas
      When merging statistics buckets for UNION and UNION ALL queries
      involving a column that maps to Double (eg: floats, numeric, time
      related types), we could end up in an infinite loop. This occurred if
      the bucket boundaries that we compared were within a very small value,
      defined in Orca as Epsilon. While we considered that two values were
      equal if they were within Epsilon, we didn't when computing whether
      datum1 < datum2. Therefore we'd get into a situation where a datum
      could be both equal to and less than another datum, which the logic
      wasn't able to handle.
      
      The fix is to make sure we have a hard boundary for when we consider a
      datum less than another datum, by including the epsilon logic in all
      datum comparisons. Now, two datums are equal if they are within
      epsilon, but datum1 is less than datum2 only if datum1 < datum2 - epsilon.
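
      The affected query shape, sketched (tables and data are illustrative;
      the hang depended on histogram boundaries that differ by less than
      Orca's Epsilon):

      ```
      -- UNION / UNION ALL over columns whose types map to Double in Orca
      -- (floats, numeric, time-related types) exercises the statistics
      -- bucket-merging code that previously could loop forever.
      CREATE TABLE f1 (x numeric) DISTRIBUTED BY (x);
      CREATE TABLE f2 (x numeric) DISTRIBUTED BY (x);

      SET optimizer = on;
      EXPLAIN SELECT x FROM f1 UNION SELECT x FROM f2;
      ```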
      
      Also add some tests since we didn't have any tests for types that mapped
      to Double.
      45e49e17