  1. 05 Nov 2020, 1 commit
  2. 04 Nov 2020, 9 commits
    • Remove unused chevron operator. · 460780eb
      Jesse Zhang committed
    • Remove no-op OsPrint methods. · ad86711b
      Jesse Zhang committed
      They have the same no-op implementation as the overridden base method,
      and they aren't called anywhere.
    • Remove no-op print from test. · a567bbbb
      Jesse Zhang committed
      The method called doesn't actually do any printing, and we don't assert
      on the output. Removing the call doesn't even change the program output.
    • Make CJob::OsPrint const. · 420f3eb0
      Jesse Zhang committed
    • Make CMemo::OsPrint const by fixing CSyncList. · cc7fa71b
      Jesse Zhang committed
      While working on extracting a common implementation of DbgPrint() into a
      mixin (commit forthcoming), I ran into the curious phenomenon that is
      the non-const CMemo::OsPrint. I almost dropped DbgPrint's requirement
      for an "OsPrint() const" before realizing that the root cause is that
      CSyncList has non-const Next() and friends. And that could be easily
      fixed. Make it so.

      While we're at it, also fix a fairly obvious omission in CMemo::OsPrint
      where the output stream parameter was unused: we wrote to an unrelated
      "auto" stream instead. This was probably never noticed because we were
      relying on the assumption that streams are always connected to standard
      output.
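      A minimal C++ sketch of the const chain involved; the names are
      illustrative simplifications, not the actual Orca declarations:

      ```cpp
      #include <ostream>

      struct SLink { SLink *next; };

      class CSyncList {
          SLink *m_head = nullptr;
      public:
          SLink *First() const { return m_head; }          // const: traversal doesn't mutate
          SLink *Next(SLink *l) const { return l->next; }  // previously non-const, poisoning callers
      };

      class CMemo {
          CSyncList m_groups;
      public:
          std::ostream &OsPrint(std::ostream &os) const {  // legal once CSyncList is const-correct
              for (SLink *l = m_groups.First(); l != nullptr; l = m_groups.Next(l))
                  os << "group\n";                         // write to os, not an unrelated stream
              return os;
          }
      };
      ```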
    • Remove unnecessary "virtual" specifier. · 0158b5ce
      Jesse Zhang committed
      These classes only implement an OsPrint method so that they can have a
      debug-printing facility. They are not overriding anything from a base
      class, so the "virtual" specifiers were just a bad habit. Remove them.
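      A sketch of the change, with a hypothetical class name:

      ```cpp
      #include <ostream>

      class CFoo {  // hypothetical; derives from no printable base class
      public:
          // Before: virtual std::ostream &OsPrint(std::ostream &os) const;
          // Nothing is overridden and nothing dispatches through a base
          // pointer, so "virtual" only added vtable overhead:
          std::ostream &OsPrint(std::ostream &os) const { return os << "CFoo"; }
      };
      ```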
    • Remove unused class gpos::COstreamFile. · 0f2c1b75
      Jesse Zhang committed
      I looked through the history: this class was dead on arrival and *never*
      used. Ironically, we kept adding #include for its header over the years
      to places that didn't use the class.
    • Remove dead headers. · dfa1f0a6
      Jesse Zhang committed
      Mortality dates, in chronological order:

      gpos/memory/ICache.h: Added in 2010, orphaned in 2011 (private commit)
      CL 90194

      gpopt/utils/COptClient.h and gpopt/utils/COptServer.h: Added in 2012,
      orphaned in 2015 (private commit) MPP-25631

      gpopt/base/CDrvdPropCtxtScalar.h: Dead on arrival when added in 2013

      gpos/error/CAutoLogger.h: Added in 2012, orphaned in 2014 (private
      commit) CL 189022

      unittest/gpos/task/CWorkerPoolManagerTest.h: Added in 2010, orphaned in
      2019 in commit 61c7405a "Remove multi-threading code"
      (greenplum-db/gporca#510)

      unittest/gpos/task/CAutoTaskProxyTest.h wasn't removed in commit
      61c7405a, probably because there was a reference to it in
      CWorkerPoolManagerTest.h, which was also left behind (chained
      orphaning).
    • Fix pipeline failure · 986c222b
      Gang Xiong committed
      Commit '5f7cdc' didn't update the answer file expand_table.out.
  3. 03 Nov 2020, 1 commit
  4. 02 Nov 2020, 1 commit
  5. 31 Oct 2020, 2 commits
  6. 30 Oct 2020, 6 commits
    • Fix source greenplum_path.sh error with set -u (#11085) · 1f429744
      Chen Mulong committed
      The error was introduced by dc96f667.
      If `set -u` was in effect before sourcing greenplum_path.sh with bash,
      an error `ZSH_VERSION: unbound variable` would be reported.
      To solve the issue, use the `${VAR:-}` parameter expansion, which
      yields an empty value if the variable doesn't exist.

      Tested with zsh, bash and dash.
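      A minimal sketch of the failure and the fix under `set -u` (the
      variable name follows the error message above):

      ```sh
      #!/bin/bash
      set -u   # expanding an unset variable is now a fatal error

      # Fails in bash, where ZSH_VERSION is unset:
      #   ZSH_VERSION: unbound variable
      # [ -n "$ZSH_VERSION" ] && echo "running zsh"

      # Works: ":-" substitutes an empty string when the variable is unset.
      [ -n "${ZSH_VERSION:-}" ] && echo "running zsh"

      echo "sourced successfully"
      ```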
    • Reset wrote_xlog in pg_conn to avoid keeping old value. (#11077) · 777b51cd
      (Jerome) Junfeng Yang committed
      The QD tracks whether each QE wrote xlog in the libpq connection.

      The logic is: if a QE writes xlog, it sends a libpq message to the QD.
      But the message is sent in ReadyForQuery, so the QE may have already
      sent results back to the QD before executing that function. When the
      QD processes those results, it has not yet read the new wrote_xlog
      value, so the connection still holds the value from the previous
      dispatch, which affects whether one-phase commit is chosen.

      The issue only happens when the QE flushes the libpq message before
      ReadyForQuery, so it's hard to find a test case to cover it. I found
      the issue while experimenting with sending extra information from QE
      to QD, which broke the gangsize test that shows the commit info.
    • Make greenplum-path.sh compatible with more shells (#11043) · dc96f667
      Chen Mulong committed
      The generated greenplum_path.sh env file previously contained
      bash-specific syntax, so it errored out if the user's shell was zsh.

      zsh doesn't have BASH_SOURCE; "${(%):-%x}" is the closest equivalent
      in zsh. Also try to support other shells with some command
      combinations.
      Tested with bash/zsh/dash.
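      A sketch of this kind of shell dispatch (assumed logic, not the exact
      generated file); the zsh-only expansion is wrapped in eval so that
      other shells never have to parse it:

      ```sh
      #!/bin/sh
      # Locate this script's own path across bash, zsh, and POSIX shells.
      if [ -n "${BASH_SOURCE:-}" ]; then
          SCRIPT_PATH="${BASH_SOURCE}"        # bash
      elif [ -n "${ZSH_VERSION:-}" ]; then
          eval 'SCRIPT_PATH="${(%):-%x}"'     # zsh; eval hides the syntax from dash
      else
          SCRIPT_PATH="$0"                    # best effort elsewhere (exact when executed)
      fi
      GPHOME=$(cd "$(dirname "$SCRIPT_PATH")" && pwd)
      export GPHOME
      ```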
    • Docs - update interconnect proxy discussion to cover hostname support (#11027) · d7bfe6ee
      David Yozie committed
      * Docs - update interconnect proxy discussion to cover hostname support
      * Change gp_interconnect_type -> gp_interconnect_proxy_addresses in note
    • docs - update some troubleshooting info (#11064) · 151fa706
      Lisa Owen committed
    • gpinitsystem -I should respect master dbid != 1 · 00ae3013
      dh-cloud committed
      Looking at the GP documents, there is no indication that the master
      dbid must be 1. However, in CREATE_QD_DB, gpinitsystem always writes
      "gp_dbid=1" into the file `internal.auto.conf` even if we specify:

      ```
      mdw~5432~/data/master/gpseg-1~2~-1
       OR
      mdw~5432~/data/master/gpseg-1~0~-1
      ```

      But the catalog gp_segment_configuration has the correct master dbid
      value (2 or 0), and the mismatch causes gpinitsystem to hang. Users
      can run into this problem the first time they use gpinitsystem -I.

      Here we test dbid 0 because PostmasterMain() simply checks that
      dbid >= 0 (non-utility mode); it says:

      > This value must be >= 0, or >= -1 in utility mode

      So 0 appears to be a valid value.

      Changes:

      - use the specified master dbid field in CREATE_QD_DB.
      - remove the unused macros MASTER_DBID and InvalidDbid from C sources.
      Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>
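      A sketch of reading the dbid from the -I input line; the field
      positions follow the example above, and the parsing is illustrative
      rather than gpinitsystem's actual code:

      ```sh
      LINE="mdw~5432~/data/master/gpseg-1~2~-1"
      # Fields: hostname ~ port ~ datadir ~ dbid ~ content
      MASTER_DBID=$(printf '%s' "$LINE" | awk -F'~' '{print $4}')
      echo "gp_dbid=$MASTER_DBID"   # writes gp_dbid=2, not a hard-coded 1
      ```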
  7. 29 Oct 2020, 2 commits
    • docs - add info about postgres_fdw module (#11075) · 6693192c
      Lisa Owen committed
    • Skip fts probe for fts process · 3cf72f6c
      dh-cloud committed
      If cdbcomponent_getCdbComponents() caught an error thrown by
      getCdbComponents, FtsNotifyProber would be called. But if that
      happened inside the FTS process itself, the FTS process would hang.

      Skip the FTS probe for the FTS process; after that, in the same
      situation, the FTS process exits and is then restarted by the
      postmaster.
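      A sketch of the guard described above; treat `am_ftsprobe` as an
      assumed process-identity flag and the function as a simplification of
      the real call site:

      ```c
      #include <stdbool.h>

      extern bool am_ftsprobe;        /* assumed: true inside the FTS process */
      extern void FtsNotifyProber(void);

      static void
      on_get_components_error(void)
      {
          if (am_ftsprobe)
              return;                 /* probing ourselves would hang; exit and
                                       * let the postmaster restart us instead */
          FtsNotifyProber();          /* regular backends: request a probe */
      }
      ```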
  8. 28 Oct 2020, 6 commits
    • Collect pgstat from QE to enable auto-ANALYZE on partition leaf table. (#10988) · 259cb9e7
      (Jerome) Junfeng Yang committed
      Collect tuple-related pgstat table info from segments so that
      auto-analyze can now consider partition tables. Before this, we didn't
      have accurate pgstat for partition leaf tables: this kind of info is
      counted through the access method on segments, but we used to collect
      it from the estate's es_processed count on the QD. So on an insert
      into the root partition table, we couldn't know how many tuples went
      into each leaf, and autovacuum never triggered auto-ANALYZE for leaf
      tables.

      The idea is: each writer QE reports the current nest level's
      transaction-table pgstat to the QD through libpq at the end of a query
      statement. A single statement doesn't touch many tables, so the
      overhead is small. On the QD, retrieve these tables' stats from the
      dispatch result, combine them, and add them to the current nest
      level's transaction pgstats. The old pgstat collection code on the QD
      can now be removed.

      The pgstat for a table can be viewed by querying
      `pg_stat_all_tables_internal`. All counters except the scan-related
      ones should now be accurate. On master, the scan-related counters are
      not yet gathered from segments; that requires extra work. The current
      implementation is already enough to support auto-ANALYZE on partition
      tables.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
    • Mask all signals in the UDP pthreads · 54451fc0
      盏一 committed
      In some cases, signals (like SIGQUIT) that should only be processed by
      the main thread of the postmaster may be dispatched to rxThread. So we
      should, and it is safe to, block all signals in the UDP pthreads.

      Fix #11006
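      The standard idiom for this, as a sketch (illustrative, not the actual
      interconnect code): block everything in the creating thread so the new
      thread inherits a fully blocked mask, then restore the creator's mask.

      ```c
      #include <pthread.h>
      #include <signal.h>

      static void *rx_thread_main(void *arg) { /* receive loop */ return arg; }

      int start_rx_thread(pthread_t *tid)
      {
          sigset_t all, saved;
          int rc;

          sigfillset(&all);                           /* every blockable signal */
          pthread_sigmask(SIG_BLOCK, &all, &saved);   /* child inherits this mask */
          rc = pthread_create(tid, NULL, rx_thread_main, NULL);
          pthread_sigmask(SIG_SETMASK, &saved, NULL); /* restore our own mask */
          return rc;
      }
      ```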
    • docs - remove duplicate left nav entry (#11040) · 83afc602
      Lisa Owen committed
    • e08cedf8
    • Pin PR resource to v0.21 to avoid GitHub abuse rate limit · f4bf9be6
      Jesse Zhang committed
      We started hitting this on Thursday, and there have been ongoing
      reports from the community about this as well. While upstream figures
      out a long-term solution [1], we've been advised [2] to pin to the
      previous release (v0.21.0) to avoid being blocked for hours at a time.

      [1]: https://github.com/telia-oss/github-pr-resource/pull/238
      [2]: https://github.com/telia-oss/github-pr-resource/pull/238#issuecomment-714830491
    • Validate cluster state during regression tests · 937187e9
      Ashwin Agrawal committed
      It is often observed in CI that a test that leaves the cluster in an
      inconsistent state (e.g. a primary-mirror pair is not in sync, or a
      primary has not finished crash recovery) causes several subsequent
      tests in the schedule to fail. What's worse, the culprit test itself
      may be reported as passed, because its validation criteria did not
      include the state of the cluster. This has been found to mislead
      debugging efforts and ultimately waste time.

      To make debugging CI failures more efficient, this patch enhances
      pg_regress to perform the validation internally. If the validation
      fails, further testing is aborted.

      The validation is performed before running each group of tests
      specified on one line of the schedule file. It is also performed
      before every single test when running in a serialized fashion.

      When the cluster validation fails, the culprit is in the previously
      run test group.

      This patch is built on the groundwork and analysis laid out in
      PR #9865 and PR #10825 by Wu Hao and Asim R P.
      Reviewed-by: Asim R P <pasim@vmware.com>
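      A sketch of this kind of validation as a standalone script (a
      hypothetical query against gp_segment_configuration; the real check
      lives inside pg_regress):

      ```sh
      #!/bin/sh
      # Count segments that are down ('d') or whose pair is not in sync ('n').
      BAD=$(psql -At -d postgres -c "SELECT count(*)
            FROM gp_segment_configuration
            WHERE status = 'd' OR mode = 'n'")
      if [ "$BAD" -ne 0 ]; then
          echo "cluster in inconsistent state; aborting further tests" >&2
          exit 1
      fi
      ```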
  9. 27 Oct 2020, 6 commits
    • EXCLUDE in window functions works now, remove 'gp_ignore_window_exclude'. · 8299a524
      Heikki Linnakangas committed
      Previously, GPDB did not support the SQL "EXCLUDE [CURRENT ROW | GROUP |
      TIES]" syntax in window functions. We got support for that from upstream
      with the PostgreSQL v12 merge. That left the GUC obsolete and unused.

      Update the 'olap_window' test accordingly. NOTE: the 'olap_window' test
      isn't currently run as part of the regression suite! I don't know why
      it's been neglected like that, but that's not this patch's fault. The
      upstream 'window' test has queries with the EXCLUDE clause, so it's
      covered.

      Reviewed-by: Jimmy Yih
    • Add query info hook for CTAS query type. (#11050) · c8d84436
      Jialun committed
      GPCC wants to hook queries like
      - create table ... as select ...
      - create materialized view ... as select ...
    • Fix a shell issue of empty string · 5db43663
      Adam Lee committed
      If $(UBUNTU_PLATFORM) is an empty string, the test command fails;
      double-quote it to fix.

      ```
      --- mock for platform
      /bin/sh: line 0: [: =: unary operator expected
      ```
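      A minimal reproduction of that failure mode and the quoting fix
      (variable name reused from the message above):

      ```sh
      UBUNTU_PLATFORM=""

      # Unquoted, this expands to `[ = ubuntu ]`, so test parses `=` as its
      # first operand and reports: [: =: unary operator expected
      # [ $UBUNTU_PLATFORM = ubuntu ] && echo "on ubuntu"

      # Quoted, the empty string stays one (empty) word and the test is valid:
      if [ "$UBUNTU_PLATFORM" = ubuntu ]; then
          echo "on ubuntu"
      else
          echo "not ubuntu"
      fi
      ```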
    • Remove Orca assertions when merging buckets · 34ae3d94
      Chris Hajas committed
      These assertions started getting tripped in the previous commit when we
      added tests, but they aren't related to the Epsilon change. Rather, we
      calculate the frequency of a singleton bucket using two different
      methods, which causes the assertion to break down. The first method
      (calculating the upper_third) assumes the singleton has 1 NDV and that
      there is an even distribution across the NDVs. The second (in
      GetOverlapPercentage) calculates a "resolution" that is based on
      Epsilon and assumes the bucket contains some small Epsilon frequency.
      This results in the overlap percentage being too high; it too should
      likely be based on the NDV.

      In practice, this won't have much impact unless the NDV is very small.
      Additionally, the conditional logic is based on the bounds, not the
      frequency. However, it would be good to align the two methods in the
      future so our statistics calculations are simpler to understand and
      predictable.

      For now, we'll remove the assertions and add a TODO. Once we align the
      methods, we should add these assertions back.
    • Fix stats bucket logic for Double values in UNION queries in Orca · ba4deed0
      Chris Hajas committed
      When merging statistics buckets for UNION and UNION ALL queries
      involving a column that maps to Double (e.g. floats, numerics, and
      time-related types), we could end up in an infinite loop. This
      occurred if the bucket boundaries we compared were within a very small
      value, defined in Orca as Epsilon. While we considered two values
      equal if they were within Epsilon, we didn't account for it when
      computing whether datum1 < datum2. Therefore we'd get into a situation
      where a datum could be both equal to and less than another datum,
      which the logic wasn't able to handle.

      The fix is to enforce a hard boundary on when we consider one datum
      less than another by including the epsilon logic in all datum
      comparisons. Now, two datums are equal if they are within epsilon, but
      datum1 is less than datum2 only if datum1 < datum2 - epsilon.

      Also add some tests, since we didn't have any for types that map to
      Double.
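      A self-contained sketch of that comparison rule (illustrative only,
      not Orca's IDatum interface; the tolerance value is assumed):

      ```cpp
      #include <cmath>

      constexpr double kEpsilon = 1e-9;  // assumed; Orca defines its own Epsilon

      // Equal when within epsilon of each other.
      bool DatumEquals(double a, double b) { return std::fabs(a - b) <= kEpsilon; }

      // Strictly less only when clear of the epsilon band, so the same pair
      // can never be both "equal" and "less than" -- the merge loop terminates.
      bool DatumLess(double a, double b) { return a < b - kEpsilon; }
      ```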
    • Make 'rows' estimate more accurate for plans that fetch only a few rows. · f4d48358
      Heikki Linnakangas committed
      In commit c5f6dbbe, we changed the row and cost estimates on plan nodes
      to represent per-segment costs. That made some estimates worse, because
      the effects of the estimate "clamping" compound. Per my comment on the
      PR back then:

      > One interesting effect of this change, that explains many of the
      > plan changes: If you have a table with very few rows, or e.g. a qual
      > like id = 123 that matches exactly one row, the Seq/Index Scan on it
      > will be marked with rows=1. It now means that we estimate that every
      > segment returns one row, although in reality, only one of them will
      > return a row, and the rest will return nothing. That's because the
      > row count estimates are "clamped" in the planner to at least
      > 1. That's not a big deal on its own, but if you then have e.g. a
      > Gather Motion on top of the Scan, the planner will estimate that the
      > Gather Motion returns as many rows as there are segments. If you
      > have e.g. 100 segments, that's relatively a big discrepancy, with
      > 100 rows vs 1. I don't think that's a big problem in practice, I
      > don't think most plans are very sensitive to that kind of a
      > misestimate. What do you think?
      >
      > If we wanted to fix that, perhaps we should stop "clamping" the
      > estimates to 1. I don't think there's any fundamental reason we need
      > to do it. Perhaps clamp down to 1 / numsegments instead.

      But I came up with a less intrusive idea, implemented in this commit:
      most Motion nodes have a "parent" RelOptInfo, and the RelOptInfo
      contains an estimate of the total number of rows, before dividing it
      by the number of segments or clamping. So if the row estimate we get
      from the subpath seems clamped to 1.0, we look at the row estimate on
      the underlying RelOptInfo instead, and use that if it's smaller. That
      makes the row count estimates better for plans that fetch a single row
      or a few rows, same as they were before commit c5f6dbbe. Not all
      RelOptInfos have a row count estimate, and the subpath's estimate is
      more accurate if the number of rows produced by the path differs from
      the number of rows in the underlying relation, e.g. because of a
      ProjectSet node, so we still prefer the subpath's estimate if it
      doesn't seem clamped.
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
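      A hypothetical, self-contained simplification of the fallback: rows
      per segment come from the subpath, and the pre-clamp total comes from
      the parent rel (types and names are stand-ins, not the planner's):

      ```c
      typedef struct { double rows; } RelStub;   /* stand-in for RelOptInfo */

      double
      motion_total_rows(double subpath_rows_per_seg, int numsegments,
                        const RelStub *parent_rel)
      {
          double total = subpath_rows_per_seg * numsegments;

          /* A per-segment estimate of 1.0 is the telltale of clamping;
           * prefer the parent rel's smaller, un-clamped total if it has one. */
          if (subpath_rows_per_seg <= 1.0 &&
              parent_rel != NULL && parent_rel->rows > 0 &&
              parent_rel->rows < total)
              return parent_rel->rows;   /* e.g. 1 row total, not 1 x 100 segments */

          return total;
      }
      ```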
  10. 26 Oct 2020, 3 commits
  11. 24 Oct 2020, 1 commit
    • gpstart: fix testing of improved handling of down segment hosts · e39465ae
      David Krieger committed
      The tests in commit be5d11e2 contained a typo that caused the changes
      in the scenario "gpstart starts even if the standby host is
      unreachable" to not clean up properly after themselves. Though the
      test feature still passes, this leaves a bug to be found later when
      more tests are added.
  12. 23 Oct 2020, 2 commits
    • Relfrozenxid must be invalid for append-optimized tables · e68d5b8a
      Asim R P committed
      Append-optimized tables do not contain transaction information in
      their tuples, so pg_class.relfrozenxid must remain invalid. This is
      done correctly during table creation; however, when a table was
      rewritten, its relfrozenxid was accidentally set. Fix it such that the
      diff with upstream is minimised. In particular, the function
      "should_have_valid_relfrozenxid" is removed.

      The fixme comments that led me to this bug are also removed.

      Reviewed by: Ashwin Agrawal
    • Fix CLOSE_WAIT leaks during Gang recycling · 990454e8
      dh-cloud committed
      The PostgreSQL libpq documentation says:

      > Note that when PQconnectStart or PQconnectStartParams returns a
      > non-null pointer, you must call PQfinish when you are finished
      > with it, in order to dispose of the structure and any associated
      > memory blocks. **This must be done even if the connection attempt
      > fails or is abandoned**.

      However, the cdbconn_disconnect() function did not call PQfinish when
      the connection was in CONNECTION_BAD state, which can leak sockets
      (left in the CLOSE_WAIT state).
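      A sketch of the rule from the quoted documentation; the libpq calls
      are standard, while the surrounding function is illustrative:

      ```c
      #include <stdio.h>
      #include <libpq-fe.h>

      void disconnect_example(const char *conninfo)
      {
          PGconn *conn = PQconnectStart(conninfo);
          if (conn == NULL)
              return;                /* out of memory: nothing to dispose of */

          if (PQstatus(conn) == CONNECTION_BAD)
              fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));

          /* PQfinish must run on every non-NULL PGconn, even failed or
           * abandoned ones; otherwise the socket lingers in CLOSE_WAIT. */
          PQfinish(conn);
      }
      ```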