Commits · 717f0c47a9e980a0bfe42b93b21eeb65919f1d50 · Greenplum / Gpdb

14 Nov, 2020 1 commit

gpstart logs commands' stderr · 717f0c47

Authored by Adam Lee on Nov 09, 2020

I have seen too many "[CRITICAL]:-gpstart failed. (Reason='')
exiting..." errors, and there was nothing in the log. The reason could
be "SSH PATH", "Python modules" or some other issues.

Log stderr to save debugging effort.
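
A minimal sketch of the idea, not the actual gpstart change: run a command, capture its stderr, and write it to the log on failure so the failure reason is never empty. The function name `run_and_log_stderr` and the logger name are hypothetical.

```python
import logging
import subprocess
import sys

logger = logging.getLogger("gpstart_sketch")  # hypothetical logger name


def run_and_log_stderr(argv):
    """Run argv and, on failure, log whatever the command wrote to stderr."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stderr_text = proc.stderr.decode("utf-8", errors="replace").strip()
    if proc.returncode != 0:
        # Without this line the caller would only see an empty failure reason.
        logger.error("command %s failed (rc=%d): %s",
                     argv, proc.returncode, stderr_text)
    return proc.returncode, stderr_text
```

A caller would pass the argument list of the command it shells out to and can include the returned text in its own error message.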

717f0c47

12 Nov, 2020 1 commit

Add Orca index only scan cost formula (#11106) · bc7ab7bd

Authored by David Kimura on Nov 11, 2020

Planner costs index-only-scan relative to index-scan. In the case of
index-only-scan, `RelOptInfo->allvisfrac` can be used to reduce the cost
when the visimap contains all-visible blocks. Thus, if index-only-scan
is possible, it will be favored.

In this commit, Orca also costs index-only-scan relative to index-scan
and scales the cost based on percentage of all-visible blocks in the
visimap. This is done by storing `pg_class.relallvisible` stats in
CDXLRelStats which is accessible during costing. In DXL this is added to
RelationStatistics:
```
<dxl:RelationStatistics ... RelPages="XX" RelAllVisible="YY"/>
```
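
The commit message does not spell out Orca's formula. As a rough sketch of the scaling idea only, the snippet below follows the approach of the PostgreSQL planner, which reduces the number of heap pages charged to an index-only scan by the all-visible fraction. The function names are hypothetical.

```python
import math


def all_visible_fraction(relallvisible, relpages):
    """Fraction of heap pages marked all-visible, clamped to [0, 1]."""
    if relpages <= 0:
        return 0.0
    return min(1.0, max(0.0, relallvisible / relpages))


def index_only_heap_pages(pages_fetched, relallvisible, relpages):
    """Heap pages an index-only scan is still charged for.

    All-visible pages can be answered from the index alone, so only the
    remaining fraction of the index-scan page fetches is counted.
    """
    frac = all_visible_fraction(relallvisible, relpages)
    return math.ceil(pages_fetched * (1.0 - frac))
```

With `RelPages="100"` and `RelAllVisible="75"`, an index scan that would fetch 100 heap pages is charged for 25 under this sketch.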

Also in this commit, we update gpsd and minirepro to collect stats for
relallvisible. In doing this we also updated the tools to use python3
syntax.

And finally, it also adds a new option "index_only_scan_tests" in
cal_bitmap_test to calibrate index only scan cost/execution.

bc7ab7bd

09 Nov, 2020 1 commit

compatible gpload (#11103) · f7174966

Authored by xiaoxiao on Nov 09, 2020

* refactor gpload test file TEST.py

1. migrate gpload test to pytest
2. new function to form config file through yaml package and make it more reasonable
3. add a case to cover gpload update_condition argument

* migrate gpload and TEST.py to python3.6
new test case 43 to test gpload behavior when column name has capital letters and without data type
change some ans files since psql reacts differently

* change sql to find reusable external table to make gpload compatible in gp7 and gp6
better TEST.py to write config file with ruamel.yaml module
Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>

f7174966

07 Nov, 2020 3 commits

Support hba-hostnames and adding pg_hba entries during recoverseg · 7890784d

Authored by Kalen Krempely on Oct 27, 2020

This commit does the following:
1. Extract config_primaries_for_replication to be used by both gpaddmirrors
and gprecoverseg.

2. gprecoverseg: add replication entries for primaries

3. gprecoverseg: add support for --hba-hostnames

7890784d

remove dead code appendNewEntriesToHbaFile() · 858428dd

Authored by Kalen Krempely on Oct 27, 2020

858428dd

Remove left over gphostcache usage · 57eda1bf

Authored by Bhuvnesh Chaudhary on Oct 27, 2020

This commit does the following:
1. Removes the calls related to gphostcache.
2. Uses ping to validate if the hostname could be resolved.

57eda1bf

05 Nov, 2020 1 commit

Redirect the error to log message · fc572100

Authored by Bhuvnesh Chaudhary on Oct 08, 2020

Earlier the error was sent to /dev/null, so the information showing the
cause of the error was lost. Redirect the error to the log file.

fc572100

03 Nov, 2020 1 commit

Allow gpexpand to expand materialized view · 5f7cdc1a

Authored by xiong-gang on Nov 03, 2020

Co-authored-by: Gang Xiong <gangx@vmware.com>

5f7cdc1a

31 Oct, 2020 1 commit

Harden analyzedb against concurrently dropped and recreated tables · 4dc25ad7

Authored by Abhijit Subramanya on Oct 28, 2020

Commit 4bbbb381 introduced some hardening
around concurrent drop and recreate of tables while analyzedb is running but it
failed to take into account the code around updating the last operation
performed. This commit fixes it.

4dc25ad7

30 Oct, 2020 3 commits

Fix source greenplum_path.sh error with set -u (#11085) · 1f429744

Authored by Chen Mulong on Oct 30, 2020

The error was introduced by dc96f667.
If `set -u` was called before sourcing greenplum_path.sh with bash, an
error `ZSH_VERSION: unbound variable` would be reported.
To solve the issue, use the shell parameter expansion `${VAR:-}`, which expands
to an empty value if the variable is unset.

Tested with zsh, bash and dash.

1f429744

Make greenplum-path.sh compatible with more shells (#11043) · dc96f667

Authored by Chen Mulong on Oct 30, 2020

The generated greenplum_path.sh env file contained bash specific syntax
previously, so it errors out if the user's shell is zsh.

zsh doesn't have BASH_SOURCE. "${(%):-%x}" is the similar replacement
for zsh.
Also try to support other shells with some command combinations.
Tested with bash/zsh/dash.

dc96f667

gpinitsystem -I should respect master dbid != 1 · 00ae3013

Authored by dh-cloud on Oct 29, 2020

Looking at GP documents, there is no indication that master dbid
must be 1. However, when CREATE_QD_DB, gpinitsystem always writes
"gp_dbid=1" into file `internal.auto.conf` even if we specify:

```
mdw~5432~/data/master/gpseg-1~2~-1
 OR
mdw~5432~/data/master/gpseg-1~0~-1
```

But catalog gp_segment_configuration can have the correct master
dbid value (2 or 0), and the mismatch causes gpinitsystem to hang.
Users can run into this problem the first time they use
gpinitsystem -I.

Here we test dbid 0, because PostmasterMain() will simply check
dbid >= 0 (non-utility mode), it says:

> This value must be >= 0, or >= -1 in utility mode

It seems 0 is a valid value.

Changes:

- use specified master dbid field when CREATE_QD_DB.
- remove unused macros MASTER_DBID, InvalidDbid in C sources.
Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>

00ae3013

27 Oct, 2020 1 commit

Fix a shell issue of empty string · 5db43663

Authored by Adam Lee on Oct 26, 2020

If $(UBUNTU_PLATFORM) is an empty string, the test command will fail;
double-quote it to fix this.

```
--- mock for platform
/bin/sh: line 0: [: =: unary operator expected
```

5db43663

26 Oct, 2020 1 commit

Mock cmd.get_stdout() to fix test regression · 81613a5c

Authored by Adam Lee on Oct 26, 2020

Otherwise it will raise an exception "command not run yet".

81613a5c

24 Oct, 2020 1 commit

gpstart: testing of improve handling of down segment hosts · e39465ae

Authored by David Krieger on Oct 22, 2020

The tests in commit be5d11e2 contained a typo that caused the changes
in the Scenario "gpstart starts even if the standby host is unreachable"
to not properly clean up after itself. Though the test feature still
passes, this leaves a bug to be found later when more tests are added.

e39465ae

23 Oct, 2020 1 commit

gprecoverseg: log the error if pg_rewind fails · 57756cc0

Authored by Adam Lee on Oct 22, 2020

Previously the error message was not logged if pg_rewind failed; fix that to make
DBA/field/developer's life easier.

Before this:
```
20201022:15:19:10:011118 gprecoverseg:earth:adam-[INFO]:-Running pg_rewind on required mirrors
20201022:15:19:12:011118 gprecoverseg:earth:adam-[WARNING]:-Incremental recovery failed for dbid 2. You must use gprecoverseg -F to recover the segment.
20201022:15:19:12:011118 gprecoverseg:earth:adam-[INFO]:-Starting mirrors
20201022:15:19:12:011118 gprecoverseg:earth:adam-[INFO]:-era is 0406b847bf226356_201022151031
```

After this:
```
20201022:15:33:31:019577 gprecoverseg:earth:adam-[INFO]:-Running pg_rewind on required mirrors
20201022:15:33:31:019577 gprecoverseg:earth:adam-[WARNING]:-pg_rewind: fatal: could not find common ancestor of the source and target cluster's timelines
20201022:15:33:31:019577 gprecoverseg:earth:adam-[WARNING]:-Incremental recovery failed for dbid 2. You must use gprecoverseg -F to recover the segment.
20201022:15:33:31:019577 gprecoverseg:earth:adam-[INFO]:-Starting mirrors
20201022:15:33:31:019577 gprecoverseg:earth:adam-[INFO]:-era is 0406b847bf226356_201022151031
```

57756cc0

22 Oct, 2020 1 commit

gpstart: improve handling of down segment hosts · be5d11e2

Authored by Jamie McAtamney on Oct 13, 2020

Currently, if a host is unreachable when gpstart is run, it will not report this
and will instead fail with an error that is both inaccurate and unhelpful to the
user, such as claiming that checksums are invalid for segments on a given host
when it simply can't reach that host to verify the checksums.

This commit adds a check to verify that all hosts are reachable before beginning
the startup process and, if one or more hosts are not reachable, marks segments
on those hosts down (in gparray, not in the cluster) so gpstart won't try to run
any checks against unreachable hosts and so that the cluster can still be started
in this state so long as there are otherwise enough valid segments to start it.
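
The check described above can be sketched roughly as follows. This is an illustration only: the `Segment` class is a hypothetical, simplified stand-in for a gparray entry, and `is_reachable` is a caller-supplied predicate rather than gpstart's actual reachability test.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical, simplified stand-in for a gparray segment entry."""
    dbid: int
    hostname: str
    up: bool = True


def mark_unreachable_down(segments, is_reachable):
    """Mark segments on unreachable hosts as down (in memory only).

    is_reachable maps a hostname to True or False.
    Returns the sorted list of unreachable hostnames.
    """
    hosts = {seg.hostname for seg in segments}
    unreachable = sorted(h for h in hosts if not is_reachable(h))
    for seg in segments:
        if seg.hostname in unreachable:
            seg.up = False
    return unreachable
```

Later checks would then skip any segment whose `up` flag is false, so nothing is run against a host that cannot be contacted.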

be5d11e2

21 Oct, 2020 1 commit

fix gprecoverseg -r when password authentification enabled for gpadmin · b11743ce

Authored by Aleksey Kashin on Oct 17, 2020

The parameters were incorrectly passed while gprecoverseg was invoked
causing gprecoverseg to fail.

Co-authored-by: Bhuvnesh Chaudhary <bchaudhary@vmware.com>

b11743ce

09 Oct, 2020 1 commit

Replace list() with set() validation in analyzedb · 1a7b4c83

Authored by Denis Smirnov on Sep 29, 2020

After testing analyzedb on a huge database with 170k tables we
have found a bottleneck while printing candidate list to analyze.
It took about 45 minutes to print all tables. The bottleneck was
in O(n^2) complexity when we validated candidates in a loop with
a list() instead of set(). The same O(n^2) validation is made while
running analyze commands on executor pool.
This commit changes the candidate type from list() to set() to reduce
complexity from O(n^2) to O(n).
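
The difference can be illustrated with a small sketch. The function names are hypothetical and this is not the analyzedb code; it only shows why membership tests against a list make the validation loop quadratic.

```python
def validate_with_list(candidates, valid_tables):
    # `in` on a list scans it, so the loop costs O(n * m) overall.
    valid = list(valid_tables)
    return [t for t in candidates if t in valid]


def validate_with_set(candidates, valid_tables):
    # `in` on a set is a hash lookup, so the loop is O(n) on average.
    valid = set(valid_tables)
    return [t for t in candidates if t in valid]
```

Both functions return the same result; only the cost of each membership test changes.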

1a7b4c83

01 Oct, 2020 3 commits

Exclude only 127.0.0.x and ::1 address. · 555aba93

Authored by Bhuvnesh Chaudhary on Sep 24, 2020

With the ifaddrs utility, we excluded IP addresses on the loopback
interface. This caused a regression: replication entries were not
populated for such interfaces, causing gpaddmirrors and gpinitstandby to
fail. Routable IP addresses can be assigned to the loopback interface,
and this case was not considered earlier.
This commit fixes the issue by allowing all loopback addresses
except 127.0.0.1 and ::1.
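
A minimal sketch of such a filter, using Python's standard `ipaddress` module. It assumes that "127.0.0.x" in the commit title means the 127.0.0.0/24 range. The function names are hypothetical and this is not the ifaddrs utility's code.

```python
import ipaddress

# Assumption: "127.0.0.x" in the commit title is read as 127.0.0.0/24.
EXCLUDED_V4 = ipaddress.ip_network("127.0.0.0/24")
EXCLUDED_V6 = ipaddress.ip_address("::1")


def usable_for_replication(addr):
    """Return False only for 127.0.0.x and ::1; keep every other address."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return ip not in EXCLUDED_V4
    return ip != EXCLUDED_V6


def filter_addresses(addrs):
    return [a for a in addrs if usable_for_replication(a)]
```

The filter works on the address value rather than the interface name, so a routable address assigned to the loopback interface is kept.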

555aba93

gpinitsystem: remove sorting hostname logic · 880ce21e

Authored by Ashwin Agrawal on Sep 30, 2020

In a multi-host setup, gpinitsystem used to sort the hostnames
provided in the hostfile. This logic guesses at user intention and
is unnecessary intelligence. Better to just use the order in which
names appear in the file to deploy GPDB.

Searching through the historical code of Greenplum, I found that it previously
used the sort command, but that yielded unintended outcomes and hence the
logic was coded in Python (because the desired outcome is
equivalent to what sort --version-sort would give). Why sorting
existed in the first place is nowhere to be found.

Input host file:
-------
sdw1-1
sdw10-1
sdw1-2
sdw10-2
-------

Sorted:
-------
sdw1-1
sdw1-2
sdw10-1
sdw10-2
-------

This logic got broken with the Python3 changes, as the regex doesn't
work with Python3. It's still a mystery to me how it even worked with
Python2. Anyway, let's just avoid sorting, as we have no idea what naming
convention the user has for hostnames.
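
For reference, the ordering that `sort --version-sort` produces for names like these can be approximated in Python 3 with a key function. This is an illustrative sketch, not the removed gpinitsystem code, and `version_key` is a hypothetical name.

```python
import re


def version_key(hostname):
    # re.split with a capturing group alternates text and digit runs,
    # starting with text, so odd positions are always digit runs.
    parts = re.split(r"(\d+)", hostname)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


hosts = ["sdw1-1", "sdw10-1", "sdw1-2", "sdw10-2"]
print(sorted(hosts, key=version_key))
# -> ['sdw1-1', 'sdw1-2', 'sdw10-1', 'sdw10-2']
```

A plain string sort differs once digit runs have different lengths: `sorted(["sdw10", "sdw2"])` keeps `sdw10` first, while the key above puts `sdw2` first.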

880ce21e

Delete logic to cleanup shared memory on unclean shutdown · 44d90150

Authored by Ashwin Agrawal on Sep 30, 2020

Postgres has logic to reuse or clean up the shared memory from a previous
unclean shutdown. Also, starting with b0fc0df9 the System V shared
memory consumption was dramatically reduced. Hence, there is no need to have
this logic in utilities to clean up shared memory.

The main reason to make this change now is that the postmaster.pid file format
changed and postmaster status is now recorded on the last
line. CleanSharedMem() was coded with the expectation that the last line would
always be the shared memory key, which no longer holds true. If we had
to keep this logic around, we would need to change it to read line 7 from
the file and not the last line. Given the need doesn't exist, just delete
the logic instead of fixing it.
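
Had the logic been kept, the fix described above would amount to reading a fixed line number instead of the last line. A hypothetical sketch follows; `read_shmem_key_line` is not a real gppylib function.

```python
SHMEM_KEY_LINE = 7  # per the commit message: the key is on line 7


def read_shmem_key_line(pid_file_text):
    """Return line 7 of a postmaster.pid file's text, or None if absent."""
    lines = pid_file_text.splitlines()
    if len(lines) < SHMEM_KEY_LINE:
        return None
    return lines[SHMEM_KEY_LINE - 1].strip()
```

Indexing by position keeps working when further lines, such as the postmaster status, are appended after the key.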

Based on inputs from Heikki Linnakangas and Asim R P.

44d90150

25 Sep, 2020 12 commits

Update gpexpand to Python 3 · 379cd8e5

Authored by Jamie McAtamney on Sep 24, 2020

Change sorting to use key functions, use bytestrings for interview

379cd8e5

Update gpload code and tests for Python 3 · 3c124b2f

Authored by Jamie McAtamney on Sep 24, 2020

Co-authored-by: Ashwin Agrawal <aashwin@vmware.com>
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>

3c124b2f

Update packcore tests to Python 3 · da34086f

Authored by Ashwin Agrawal on Sep 24, 2020

da34086f

Replace unix.InterfaceAddrs with gp.IfAddrs · c3167d17

Authored by Tyler Ramer on Sep 24, 2020

This was an outstanding TODO, and there is an added benefit of removing
yet another extensive shell command which is fragile.
Authored-by: Tyler Ramer <tramer@vmware.com>

c3167d17

Update isolation2 tests to Python 3 · 05a33dda

Authored by Jamie McAtamney on Sep 24, 2020

This commit is mostly general Python 3 changes of the sort already made to the
utilities code (updating print syntax, updating imports, using bytestrings when
necessary, fixing sorting behavior, and so forth).

The major change of note in this commit is entirely replacing plpythonu with
plpython3u. While it would technically be possible to support both versions of
PL/Python concurrently, as the version of Python used by PL/Python doesn't have
to match the version used for the utilities, we've made the decision not to try
to support both, since users will need to update to Python 3 regardless.
Co-authored-by: Ashwin Agrawal <aashwin@vmware.com>
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
Co-authored-by: Tyler Ramer <tramer@vmware.com>

05a33dda

Update utilities test code for Python 3 · ab965ba5

Authored by Tyler Ramer on Sep 24, 2020

This commit makes several broad changes to address conversion issues common to
multiple test files:

- Several built-in functions have been deprecated or renamed, or now need to
  use bytestrings (and associated encoding and decoding) instead of strings

- There is a "test case" run when ComputeCatalogUpdate is executed as a
  standalone program, but this should not be present in shipped code, so we
  remove it

- Some shelled-out commands in test code have been simplified due to changes
  to shell escaping, file redirection, and string manipulation, moving string
  parsing logic from shell commands to internal Python logic wherever possible
Co-authored-by: Ashwin Agrawal <aashwin@vmware.com>
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
Co-authored-by: Tyler Ramer <tramer@vmware.com>

ab965ba5

Update gpcheckcat for Python 3 · 63fca133

Authored by Jamie McAtamney on Sep 24, 2020

- Certain objects in gpcheckcat can no longer be sorted with Python 3's new
  hashing logic, and the sorting was not functionally necessary, so the sorting
  has been removed.

- In Python 2, variables of type gpcatalog.GPCatalogTable were automatically
  coerced to strings when performing string comparisons.  Python 3 is stricter,
  so an explicit conversion is required.

- The reduce function is no longer built in and must be imported, and the changes
  to sorting require a separate key function in the orphan table check.
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
Co-authored-by: Tyler Ramer <tramer@vmware.com>
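
The three Python 3 points above can be illustrated with a small sketch. `CatalogTable` is a hypothetical stand-in for gpcatalog.GPCatalogTable, not the real class.

```python
from functools import reduce  # no longer a built-in in Python 3


class CatalogTable:
    """Hypothetical stand-in for gpcatalog.GPCatalogTable."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


tables = [CatalogTable("pg_class"), CatalogTable("pg_attribute")]

# Convert explicitly before comparing with a string.
matches = [t for t in tables if str(t) == "pg_class"]

# The objects define no ordering, so sorted() needs an explicit key.
ordered = sorted(tables, key=str)

# reduce() must be imported from functools.
total_len = reduce(lambda acc, t: acc + len(str(t)), tables, 0)
```

Calling `sorted(tables)` without a key raises `TypeError` here, because the class defines no comparison methods.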

63fca133

Replace pickling with json and shlex in utilities · b248209e

Authored by Jamie McAtamney on Sep 24, 2020

Pickling was previously used in several utilities when shelling out commands
and/or executing commands remotely, in order to avoid needing to escape strings
when passing them back to the master.  The actual string contents were largely
or wholly ASCII, so pickling was overkill for that purpose.

The semantics of byte strings in Python 3 breaks the pickling logic, so we've
taken the opportunity to simplify that whole logic stack.  Code that formerly
pickled strings now uses shlex.quote() to escape strings where possible and
serializes strings with json where that is insufficient, removing any helper
functions that are no longer necessary.
Authored-by: Jamie McAtamney <jmcatamney@vmware.com>
Authored-by: Tyler Ramer <tramer@vmware.com>
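
A rough sketch of the replacement pattern, using only the standard `shlex` and `json` modules. The command name `some_helper`, its `--info` flag, and the sample values are hypothetical and do not come from the utilities.

```python
import json
import shlex

# Hypothetical value that must survive a trip through a shell command line.
datadir = "/data/primary/gp seg0; echo oops"

# shlex.quote() makes the value safe to embed as a single shell word.
cmd = "ls -d {}".format(shlex.quote(datadir))

# Where quoting alone is insufficient, serialize with json and quote that.
payload = json.dumps({"dbid": 2, "datadir": datadir})
remote_cmd = "some_helper --info {}".format(shlex.quote(payload))

# The receiving side can split the command line and decode the payload.
args = shlex.split(remote_cmd)
decoded = json.loads(args[2])
```

Unlike a pickle, the json payload is plain text, so it needs no special handling of byte strings on either side.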

Removed unused or unnecessary helper functions from gppylib

Shell escape function was unused and python 3 shlex.quote() function
should be used anyway.

canStringBeParsedAsInt was a silly helper function, and also failed to
actually complete the cast as string.
Authored-by: Tyler Ramer <tramer@vmware.com>

b248209e

Update utilities code to work with Python 3 · 78f5cf43

Authored by Jamie McAtamney on Sep 24, 2020

This commit makes several broad changes to address conversion issues common to
multiple utilities:

- The input and output of subprocess in Python 3 are now bytestrings instead
  of strings. Thus, some sanitizing of inputs and outputs is necessary

- Many built-in functions like raw_input and __cmp__ are deprecated in Python 3,
  and as a side effect list sorting and hashing work differently, requiring a
  different set of helper functions

- Implicit relative imports no longer work, so dbconn (in utilities code) and
  mgmt_utils (in test code) must be added to the search path and imported using
  a full path instead

- File objects require flush methods in python3, and popen2 has been deprecated
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
Co-authored-by: Tyler Ramer <tramer@vmware.com>
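
The first point above, illustrated with a minimal sketch. The child command is a hypothetical placeholder; any program that writes to stdout behaves the same way.

```python
import subprocess
import sys

# Hypothetical child command standing in for a shelled-out utility call.
proc = subprocess.run(
    [sys.executable, "-c", "print('seg0 up')"],
    stdout=subprocess.PIPE,
)

raw = proc.stdout                    # bytes in Python 3
text = raw.decode("utf-8").strip()   # decode before string operations
```

Comparing `raw` directly with a `str` such as `"seg0 up"` is always false in Python 3, which is why the decode step is needed.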

78f5cf43

Remove subprocess32 · 6071056d

Authored by Tyler Ramer on Sep 24, 2020

The subprocess32 package is a backport of Python 3 subprocess functionality to
Python 2, so with the upgrade to Python 3 it is no longer necessary.

This commit deletes the package from pythonSrc and changes import statements to
import subprocess directly, instead of falling back to it only if subprocess32
is not importable.
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
Co-authored-by: Tyler Ramer <tramer@vmware.com>

6071056d

Allow GPDB to build and test with Python 3 · 7306abea

Authored by Tyler Ramer on Sep 24, 2020

- Update Python file shebangs to use python3 and update gp_replicate_check and
  gpversion.py to allow running under Python 3

- Use Centos 7 dev containers with Python 3 and pip3 installed for testing, as
  prod containers do not yet work with Python 3, and update Travis with Python 3

- Install dependencies with pip3 to get Python 3-compatible versions

- Copy the Python 3 version of .so files, don't unset PYTHONHOME and PYTHONPATH,
  and don't remove built files from install locations, so that the Python 2 and
  Python 3 versions of various files can coexist
Co-authored-by: Jamie McAtamney <jmcatamney@vmware.com>
Co-authored-by: Kris Macoskey <kmacoskey@vmware.com>
Co-authored-by: Tyler Ramer <tramer@vmware.com>

7306abea

Run 2to3 against Python code · 54a65573

Authored by Jamie McAtamney on Sep 24, 2020

The 2to3 utility is an officially-supported script to automatically convert
Python 2 code to Python 3. It's not a complete fix by any means, but it
handles most basic syntax transformations and similar.

This commit is the result of running 2to3 against every Python file in the
gpMgmt directory, so it's quite large and fairly scattershot. Manual updates
to any code that 2to3 can't handle will come in later commits.

54a65573

16 Sep, 2020 1 commit

Fix gpload fail when capital letters in column in merge mode (#10804) · 4f7d02d8

Authored by xiaoxiao on Sep 16, 2020

* add double quotes when creating staging table
omit distribution key

* fix gpload fail when column names have capital letters in merge mode
Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>

4f7d02d8

14 Sep, 2020 1 commit

Fix gpexpand help usage · cb2afaf9

Authored by japinli on Sep 14, 2020

In commit 5eaa5889, the --novacuum option was removed; however, the help
page of gpexpand kept the -V option, which is the short option for
--novacuum.

cb2afaf9

09 Sep, 2020 3 commits

Change shell to /bin/bash for gpload · 43d18a61

Authored by Shaoqi Bai on Aug 31, 2020

Co-authored-by: Ning Wu <ningw@vmware.com>
Co-authored-by: Shaoqi Bai <bshaoqi@vmware.com>
Reviewed-by: Xin Zhang <zhxin@vmware.com>
Reviewed-by: Adam Lee <adlee@vmware.com>
Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
Reviewed-by: Jesse Zhang <sbjesse@gmail.com>

43d18a61

Set GPHOME to GPDB installation directory when sourcing greenplum_path · ba600a54

Authored by Shaoqi Bai on Aug 31, 2020

Co-authored-by: Ning Wu <ningw@vmware.com>
Co-authored-by: Shaoqi Bai <bshaoqi@vmware.com>
Reviewed-by: Xin Zhang <zhxin@vmware.com>
Reviewed-by: Adam Lee <adlee@vmware.com>
Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
Reviewed-by: Jesse Zhang <sbjesse@gmail.com>

ba600a54

gpstart: when standby is unreachable don't start it · 14cd36be

Authored by Bhuvnesh Chaudhary on Sep 03, 2020

When the standby is unreachable and the user proceeds with startup,
gpstart would still attempt to start the standby, resulting in a stack trace.
Detect when the standby is unreachable and set start_standby to False to
prevent starting it later in the startup process.
Co-authored-by: Kalen Krempely <kkrempely@vmware.com>

14cd36be

07 Sep, 2020 1 commit

Revert "fix gpload upper letters in column in merge mode (#10763)" (#10776) · a27e5115

Authored by xiaoxiao on Sep 07, 2020

This reverts commit 1060a425.

a27e5115