1. 25 Jun 2020, 10 commits
    • Recompile plperl to set the right RUNPATH · cc985a3f
      Shaoqi Bai authored
      Currently, GPDB5 is built with --enable-rpath (default configure
      option). For plperl, its Makefile specifies an absolute path to the
      location of "$(perl -MConfig -e 'print $Config{archlibexp}')/CORE"
      (e.g., /usr/lib64/perl5/CORE on RHEL7). This directory is not on the
      default search path for the runtime linker. Without the proper RUNPATH
      entry, libperl.so cannot be found when postgres tries to load the plperl
      extension.
      
      Without the correct RUNPATH set for plperl.so, you will see an error
      like the following:
      ERROR:  could not load library
      "/usr/local/greenplum-db-devel/lib/postgresql/plperl.so": libperl.so:
      cannot open shared object file: No such file or directory
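
      As a quick sanity check (a sketch; the paths and the GPHOME default are
      assumptions, not part of this commit), you can compare the perl CORE
      directory with the RUNPATH actually recorded in plperl.so:
      ```sh
      # Where this host keeps libperl.so (e.g. /usr/lib64/perl5/CORE on RHEL7):
      perl -MConfig -e 'print $Config{archlibexp}, "/CORE\n"'
      # What the runtime linker will search; no RPATH/RUNPATH line means the
      # loader falls back to the default search path and fails as above:
      readelf -d "${GPHOME:-/usr/local/greenplum-db-devel}/lib/postgresql/plperl.so" \
          | grep -E 'RPATH|RUNPATH'
      ```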
      Authored-by: Shaoqi Bai <bshaoqi@vmware.com>
      (cherry picked from commit a3eeadd5a575820ebc9a401c96b252355f183bb1)
      cc985a3f
    • Remove AIX logic from generate-greenplum-path.sh · 0c9d63e1
      Bradford D. Boyle authored
      [#173046174]
      Authored-by: Bradford D. Boyle <bradfordb@vmware.com>
      (cherry picked from commit 28bcd4447551b3b9790e23d7865dbf403f79ef36)
      0c9d63e1
    • Recompile plpython subdir to set the right RUNPATH · de611505
      Shaoqi Bai authored
      Authored-by: Shaoqi Bai <sbai@pivotal.io>
      (cherry picked from commit 6b71cd2e0085170139fd1c645696e6e2b8895058)
      de611505
    • Add curly braces for GPHOME var · f9c946ca
      Tingfang Bao authored
      Authored-by: Tingfang Bao <baotingfang@gmail.com>
      (cherry picked from commit ae9bcdda4252524401fe6d1b4752355aa66e18ea)
      f9c946ca
    • Using $ORIGIN as RUNPATH for runtime link · da4d7f95
      Bradford D. Boyle authored
      When upgrading from GPDB5 to GPDB6, gpupgrade will need to be able to call
      binaries from both major versions. Relying on LD_LIBRARY_PATH is not an option
      because this can cause binaries to load libraries from the wrong version.
      Instead, we need the libraries to have RPATH/RUNPATH set correctly. Since the
      built binaries may be relocated we need to use a relative path.
      
      This commit disables the rpath configure option (which would result in an
      absolute path) and uses LDFLAGS to set `$ORIGIN`.
      
      For most ELF files a RUNPATH of `$ORIGIN/../lib` is correct. For the
      pygresql python module and the quicklz_compressor extension, the RUNPATH
      needs to be adjusted accordingly; the LDFLAGS for those artifacts can be
      overridden with the dedicated environment variables PYGRESQL_LDFLAGS and
      QUICKLZ_LDFLAGS.
      
      We always use `--enable-new-dtags` to set RUNPATH. On CentOS 6, with new dtags,
      both DT_RPATH and DT_RUNPATH are set and DT_RPATH will be ignored.
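
      A minimal sketch of the build recipe this describes (the exact flag
      spellings below are assumptions for illustration, not the commit's
      literal changes):
      ```sh
      # Relative RUNPATH instead of configure's absolute rpath. Single quotes
      # keep $ORIGIN literal for the linker; inside a Makefile it must be
      # escaped as $$ORIGIN.
      ./configure --disable-rpath \
          LDFLAGS='-Wl,--enable-new-dtags -Wl,-rpath,$ORIGIN/../lib'
      # Artifacts installed at a different depth get their own relative path,
      # e.g. (depth here is hypothetical):
      export PYGRESQL_LDFLAGS='-Wl,--enable-new-dtags -Wl,-rpath,$ORIGIN/../../lib'
      ```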
      
      [#171588878]
      Co-authored-by: Bradford D. Boyle <bboyle@pivotal.io>
      Co-authored-by: Xin Zhang <xzhang@pivotal.io>
      (cherry picked from commit 2eec06b39abe8cb5370e949056f26997b9d02572)
      da4d7f95
    • Update generate-greenplum-path.sh for upgrade package · 5e139b76
      Tingfang Bao authored
      Following the [Greenplum Server RPM Packaging Specification][0], we need
      to update the greenplum_path.sh file and ensure the environment variables
      are set correctly.
      
      There are a few basic requirements for the Greenplum Path Layer:
      
      * greenplum-path.sh shall be installed to `${installation
        prefix}/greenplum-db-[package-version]/greenplum_path.sh`
      * ${GPHOME} is set by given parameter, by default it should point to
        `%{installation prefix}/greenplum-db-devel`
      * `${LD_LIBRARY_PATH}` shall be safely set to avoid a trailing colon
        (which would cause the linker to search the current directory when
        resolving shared objects); see the sketch after this list
      * `${PYTHONPATH}` shall be set to `${GPHOME}/lib/python`
      * `${PATH}` shall be set to `${GPHOME}/bin:${PATH}`
      * If the file `${GPHOME}/etc/openssl.cnf` exists then `${OPENSSL_CONF}`
        shall be set to `${GPHOME}/etc/openssl.cnf`
      * The greenplum_path.sh file shall pass [ShellCheck][1]
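
      A minimal shell sketch of these requirements, assuming the default
      install prefix (the variable handling is the point, not the literal
      file contents):
      ```sh
      #!/usr/bin/env bash
      GPHOME="${1:-/usr/local/greenplum-db-devel}"
      # ${VAR:+...} appends the old value only when it is non-empty,
      # so no trailing colon can sneak in:
      LD_LIBRARY_PATH="${GPHOME}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
      PYTHONPATH="${GPHOME}/lib/python"
      PATH="${GPHOME}/bin:${PATH}"
      if [ -e "${GPHOME}/etc/openssl.cnf" ]; then
          OPENSSL_CONF="${GPHOME}/etc/openssl.cnf"
          export OPENSSL_CONF
      fi
      export GPHOME LD_LIBRARY_PATH PYTHONPATH PATH
      ```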
      
      [0]: https://github.com/greenplum-db/greenplum-database-release/blob/master/Greenplum-Server-RPM-Packaging-Specification.md#detailed-package-behavior
      [1]: https://github.com/koalaman/shellcheck
      Authored-by: Tingfang Bao <bbao@pivotal.io>
      (cherry picked from commit 5150237ac884227383f9f4f94e2383450756e7da)
      5e139b76
    • Fix 'orafce' to work on GPDB. · 0b5113bf
      Heikki Linnakangas authored
      The previous commit updated 'orafce' module to new upstream version.
      This commit fixes it so that it works with GPDB. We carried many of these
      diffs against the old version too, but some are new with the new orafce
      version.
      
      Old differences, put back in this commit:
      
      - In the 'finish no data found' NOTICEs, GPDB strips trailing spaces
      
      - GPDB doesn't allow SQL_ASCII as database encoding. Use 'iso-8859-1' in
        the 'nlssort' test instead.
      
      - Don't install dbms_alert. It doesn't work in GPDB.
      
      - GPDB doesn't have 'tts_tableOid' field in TupleTableSlot
      
      New differences:
      
      - Use init_file to suppress extra NOTICEs about distribution keys and such
        that GPDB prints. (Previously, we had changed the expected outputs to
        include the NOTICEs, but this seems nicer.)
      
      - GPDB has built-in DECODE() support, so don't install the orafce decode()
        compatibility functions. DECODE() is transformed into a CASE-WHEN
        construct, so the regression tests that had 'decode' as the result
        column name now show 'case' instead. A few tests now error out because
        of missing casts between text and other datatypes, but that's expected /
        accepted because the GPDB implementation of DECODE() is different.
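
      A quick illustration of the last point (a hypothetical psql session; the
      output column is named "case" because DECODE() is parsed into a CASE-WHEN
      expression):
      ```sh
      psql -c "SELECT DECODE(2, 1, 'one', 2, 'two', 'other');"
      #  case
      # ------
      #  two
      # (1 row)
      ```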
      0b5113bf
    • Update to orafce version 3.13.4. · 3f0e2baf
      Heikki Linnakangas authored
      This replaces the sources in the tree verbatim with upstream version
      3.13.4. It does not work with GPDB as is, but I wanted to have the GPDB
      changes in a separate commit. The next commit will fix it so that it
      works again.
      
      The reason to update it now is that we are working on merging PostgreSQL
      v12 into GPDB, and the old 3.7 version we had in the repository will not
      work with PostgreSQL v12. This version should work both with PostgreSQL
      9.6, where we are at the moment, and with v12.
      3f0e2baf
    • Fix unit tests for mainUtils · c81d9256
      Tyler Ramer authored
      Commit 4ab87e24f61b9aff26738cdb154673e03cd772a8 replaced too many usages
      of "with self.lock as l" with "with self.lock"
      
      This commit restores the missing "with self.lock as l" usages to
      ensure the unit tests pass.
      Authored-by: Tyler Ramer <tramer@vmware.com>
      c81d9256
  2. 24 Jun 2020, 11 commits
    • Clean up the internal functions to merge partition stats a bit. · b22544fb
      Heikki Linnakangas authored
      aggregate_leaf_partition_MCVs() and aggregate_leaf_partition_histograms()
      functions did catalog lookups to count how many leaf partitions a root
      table has. Why? Because the caller passes two arrays as inputs, with an
      entry for each partition, and the functions need to know how large the
      input arrays are. That's overly complicated, of course: the caller can
      simply pass the size of the arrays as an argument. That's much more
      robust, too; I think the current code would crash and burn if the
      partition hierarchy were modified concurrently. I'm not sure if that's a
      live bug, or if we're holding locks that prevent it, but let's keep
      things simple in any case.
      
      This removes the last callers of rel_get_leaf_children_relids() function,
      so remove that altogether.
      Reviewed-by: Abhijit Subramanya <asubramanya@pivotal.io>
      Reviewed-by: Hans Zeller <hzeller@vmware.com>
      b22544fb
    • Fix DNS lookup issue for pure IPv6 addresses in URLs (#10331) · 40443808
      Huiliang.liu authored
      *Problem:
      getDnsCachedAddress() always returns an empty string if the hostname only
      has IPv6 addresses.
      
      *Root cause:
      The hostinfo[] array is redeclared in the IPv6 code section, so the real
      lookup result is never returned.
      
      *How to reproduce:
      You can reproduce this issue even if you don't have a pure IPv6 environment:
      1. Assign an IPv6 address to your default net device
         sudo ip -6 addr add 2401::1234/128 dev ${netdev}
      2. Add the address to /etc/hosts with a dummy hostname, for example
         2401::1234	ipv6host
      3. Create external table using the dummy hostname
         create external table test_ext_ipv6(a int) location('gpfdist://ipv6host:8080/data_file') format 'csv';
      4. Launch gpfdist
      5. Run test SQL
         select * from test_ext_ipv6
         Then an error will be reported with an empty address in the URL:
         ERROR:  connection with gpfdist failed for "gpfdist://ipv6host:8080/data_file", effective url: "http://:8080/data_file"
      Co-authored-by: Peifeng Qiu <pqiu@pivotal.io>
      40443808
    • Remove lockfile from mainUtils · 8190ed40
      Tyler Ramer authored
      [Lockfile](https://pypi.org/project/lockfile/) has not been maintained
      since around 2015. Further, the functionality it provided was poor: a
      review of the code indicated that it used the presence of the PID file
      itself as the lock. In Unix, checking for a file's existence and then
      creating it is not atomic, so such a lock is prone to race conditions.
      
      The lockfile package also did not clean up after itself - a process
      which was destroyed unexpectedly would not clear the created locks, so
      some faulty logic was added to mainUtils.py, which checked to see if a
      process with the same PID as the lockfile's creator was running. This
      is obviously failure prone, as a new process might be assigned the same
      PID as the old lockfile's owner, without actually being the same process.
      
      (Of note, the SIG_DFL argument to os.kill() is not a signal at all, but
      rather of type signal.handler. It appears that Python casts this
      handler to the int 0, which, according to man 2 kill, leads to no signal
      being sent, but existence and permission checks still being performed. So
      it is a happy accident that this code worked at all.)
      
      This commit removes lockfile from the codebase entirely.
      
      It also adds a "PIDLockFile" class which provides a lock with guaranteed
      atomicity via the mkdir and rmdir commands on Unix. It is therefore not
      safely portable to Windows, but this should not be an issue, as only
      Unix-based utilities use the "simple_main()" function.
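
      The underlying trick is easy to demonstrate in shell form (illustrative
      only; PIDLockFile itself implements this in Python, and the lock path
      below is hypothetical):
      ```sh
      # mkdir is atomic on POSIX filesystems: exactly one concurrent caller
      # can create the directory and thereby take the lock.
      LOCKDIR="/tmp/example.lock"
      if mkdir "$LOCKDIR" 2>/dev/null; then
          echo "$$" > "$LOCKDIR/pid"        # record the owning PID
          trap 'rm -rf "$LOCKDIR"' EXIT     # release on any exit path
          echo "lock acquired by $$"
      else
          echo "lock already held by PID $(cat "$LOCKDIR/pid" 2>/dev/null)" >&2
          exit 1
      fi
      ```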
      
      PIDLockFile provides API-compatible classes to replace most of the
      functionality of lockfile.PidLockFile, but it removes the timeout logic,
      which was not used in any meaningful sense: a hard-coded timeout of
      1 second was used, but an immediate answer as to whether the lock is held
      is sufficient.
      
      PIDLockFile also includes appropriate __enter__, __exit__, and __del__
      attributes, so that, should we extend this class in the future, the
      `with` syntax is functional, and __del__ calls release, so a process
      reaped unexpectedly should still clean up its own locks as part of the
      garbage collection process.
      Authored-by: NTyler Ramer <tramer@pivotal.io>
      8190ed40
    • Check whether the directory exists when deleting the tablespace (#10305) · b1b99c43
      Jinbao Chen authored
      If the tablespace directory does not exist, we would get an error at
      transaction commit, and an error during commit causes a panic. Check the
      tablespace directory when deleting the tablespace so that the panic is
      avoided.
      b1b99c43
    • Only apply transformGroupedWindows() with ORCA. (#10306) · e52dd032
      Heikki Linnakangas authored
      * Only apply transformGroupedWindows() with ORCA.
      
      The Postgres planner doesn't need it. Move the code to do it, so that it's
      only used before passing a tree to ORCA. This doesn't change anything with
      ORCA, but with the Postgres planner, it has some benefits:
      
      * Some cases that did not give correct results before this patch now run
        correctly (e.g. case `regress/olap_window_seq`)
      * Fixes github issue #10143.
      
      * Make transformGroupedWindows walk the entire tree
      
      The transformGroupedWindows function now recursively transforms any
      Query node in the tree that has both window functions and groupby or
      aggregates.
      
      Also fixed a pre-existing bug where we put a subquery in the target
      list of such a Query node into the upper query, Q'. This meant that
      any outer references to the scope of Q' no longer had the correct
      varattno. The fix is to place the subquery into the target list of
      the lower query, Q'' instead, which has the same range table as the
      original query Q. Therefore, the varattnos to outer references to the
      scope of Q (now Q'') don't need to be updated. Note that varlevelsup to
      scopes above Q still need to be adjusted, since we inserted a new
      scope Q'. (See comments in code for explanations of Q, Q', Q'').
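
      A hypothetical query of the shape this transform targets, with an
      aggregate and a window function over it in the same Query node (table and
      column names are made up):
      ```sh
      psql -c "SELECT d,
                      sum(x)                             AS total,
                      rank() OVER (ORDER BY sum(x) DESC) AS rnk
               FROM t GROUP BY d;"
      ```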
      Co-authored-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Co-authored-by: Hans Zeller <hzeller@vmware.com>
      Co-authored-by: Abhijit Subramanya <asubramanya@pivotal.io>
      e52dd032
    • Improve handling of target lists of window queries (#10309) · 33c4582e
      Hans Zeller authored
      Fixing two bugs related to handling queries with window functions and refactoring the related code.
      
      ORCA can't handle expressions on window functions, like `rank() over() - 1`, in a target list. To avoid these, we split Query blocks that contain them into two: the new lower Query computes the window functions, and the new upper Query computes the expressions.
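
      For example (t and x are hypothetical names), the split turns the first
      query below into something shaped like the second:
      ```sh
      # An expression on top of a window function, which ORCA cannot take directly:
      psql -c "SELECT rank() OVER (ORDER BY x) - 1 AS r0 FROM t;"
      # Conceptual rewrite: the lower block computes the window function,
      # the upper block computes the expression:
      psql -c "SELECT r - 1 AS r0
               FROM (SELECT rank() OVER (ORDER BY x) AS r FROM t) q;"
      ```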
      
      We use three mutators and walkers to help with this process:
      
      * Increase the varlevelsup of outer references in the new lower Query,
        since we now inserted a new scope above it.
      * Split expressions on window functions into the window functions (for
        the lower scope) and expressions with a Var substituted for the
        WindowFunc (for the upper scope). Also adjust the varattno for Vars
        that now appear in the upper scope.
      * Increase the ctelevelsup for any RangeTblEntrys in the lower scope.
      The bugs we saw were related to these mutators: the second one didn't recurse correctly into the required types of subqueries, and the third one didn't always increment the query level correctly. The refactor hopefully simplifies this code somewhat. For details, see the individual commit messages.
      
      Note: In the 6X_STABLE branch, we currently have a temporary check that triggers a fallback to planner when we see window queries with outer refs in them. When this code gets merged into 6X, we will remove the temporary check. See #10265.
      
      * Add test cases
      * Refactor: Renaming misc variables and methods
      * Refactor RunIncrLevelsUpMutator
      
      Made multiple changes to how we use the mutator:
      
      1. Start the call with a method from gpdbwrappers.h, for two reasons:
         a) execute the needed wrapping code for GPDB calls
         b) avoid calling the walker function on the top node, since we don't
            want to increment the query level when we call the method on a
            query node
      
      2. Now that we don't have to worry anymore about finding a top-level
         query node, simplify the logic to recurse into subqueries by simply
         doing that when we encounter a Query node further down. Remove the
         code dealing with sublinks, RTEs, CTEs.
      
      3. From inside the walker functions, call GPDB methods without going
         through the wrapping layer again.
      
      4. Let the mutator code make a copy of the target entry instead of
         creating one before calling the mutator.
      
      * Refactor RunWindowProjListMutator, fix bug
      
      Same as the previous commit, this time refactoring
      RunWindowProjListMutator. This change should also fix one of the bugs we
      have seen: this mutator did not recurse into derived tables that were
      inside scalar subqueries in the select list.
      
      * Refactor RunFixCTELevelsUpMutator, fix bug
      
      Converted this mutator into a walker, since only walkers visit RTEs, which
      makes things a lot easier.
      
      Fixed a bug where we incremented the CTE levels for scalar subqueries
      that went into the upper-level query.
      
      Otherwise, same types of changes as in previous two commits.
      
      * Refactor and reorder code
      
      Slightly modified the flow in methods CQueryMutators::ConvertToDerivedTable
      and CQueryMutators::NormalizeWindowProjList
      
      * Remove obsolete methods
      * Update expected files
      33c4582e
    • docs - add missing end ' in example · ba88fa41
      mkiyama authored
      ba88fa41
    • Drop -Wno-variadic-macros, which is inapplicable. · 3a84a379
      Jesse Zhang authored
      ORCA actively uses variadic macros (__VA_ARGS__) and we used to suppress
      a warning out of pedantry (it's a widely available language extension,
      but not in the C++98 standard). Now that variadic macros are part of
      standard C++11, and we mandate C++14, drop the warning suppression.
      3a84a379
    • Update AX_CXX_COMPILE_STDCXX to latest version · c96d22da
      Jesse Zhang authored
      Copied from upstream autoconf-archive. Apart from bug fixes and new
      C++17 support in the code used for detecting compiler flag support, one
      notable user-visible change is that we no longer use "default" language
      level: an explicit flag (like -std=gnu++14) will always be used.
      c96d22da
    • Enable ORCA on macOS on Travis · 9c1a9824
      Jesse Zhang authored
      We disabled ORCA in the macOS Travis build in commit 649ee57d
      (#10068) when we thought builds timed out because the build machines
      were underpowered. Turns out we made a mistake in 649ee57d where
      the mac build was just broken (see fix in commit b371e592 #10147).
      Now that all the pieces are in place, I can think of no reason to
      neglect this compile-time coverage in Travis.
      
      Given that the macOS VMs in Travis are generally underpowered, build
      with -O0 to expedite the process.
      9c1a9824
    • Avoid non-transactional modification of relfrozenxid during CLUSTER · 7f7fa498
      Andrey Borodin authored
      CLUSTER calls vac_update_relstats(), which modifies pg_class in place
      (non-transactionally). If the CLUSTER command aborts, these changes
      can't be rolled back, leaving behind inaccurate relfrozenxid and other
      fields.
      
      Non-transactional updates to reltuples, relpages, and relallvisible are
      fine, but not to relfrozenxid and relminmxid. Hence, this commit avoids
      updating relfrozenxid and relminmxid in place for CLUSTER.
      
      Fixes https://github.com/greenplum-db/gpdb/issues/10150.
      Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>
      7f7fa498
  3. 23 Jun 2020, 6 commits
    • Fix pgbench --tablespace option. · f6ec65f7
      Heikki Linnakangas authored
      The CREATE TABLE commands constructed in pgbench had the DISTRIBUTED BY
      and TABLESPACE options the wrong way 'round, so that you got a syntax
      error. For example:
      
      $ pgbench postgres -i --tablespace "pg_default"
      creating tables...
      ERROR:  syntax error at or near "tablespace"
      LINE 1: ...22)) with (appendonly=false) DISTRIBUTED BY (bid) tablespace...
                                                                   ^
      Put the clauses in the right order.
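
      The corrected statement shape looks roughly like this (a sketch; the
      column list is abbreviated and hypothetical):
      ```sh
      # TABLESPACE must precede DISTRIBUTED BY in GPDB's CREATE TABLE grammar:
      psql -c "CREATE TABLE pgbench_branches (bid int NOT NULL, bbalance int)
               WITH (appendonly=false) TABLESPACE pg_default DISTRIBUTED BY (bid);"
      ```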
      
      We have no test coverage for this at the moment, but PostgreSQL v11 adds
      a test for this (commit ed8a7c6f). I noticed this while looking at test
      failures with the PostgreSQL v12 merge.
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
      f6ec65f7
    • Fix tupdesc dangling pointer segfault in HashAgg · 41ce55bf
      Denis Smirnov authored
      This problem manifests itself with HashAgg on top of a DynamicIndexScan
      node and can cause a segmentation fault.
      
      1. A HashAgg node initializes a tuple descriptor for its hash
      slot using a reference from input tuples (coming from
      DynamicIndexScan through a Sequence node).
      2. At the end of every partition index scan in DynamicIndexScan
      we unlink and free unused memory chunks and reset the partition's
      memory context. This destroys all objects in the context,
      including the partition index tuple descriptor used by the
      HashAgg node.
      As a result, we get a dangling pointer in HashAgg when switching to
      a new index partition during DynamicIndexScan, which can cause a
      segfault.
      41ce55bf
    • Make cdbpullup_missingVarWalker also consider PlaceHolderVar. · 2cb36320
      Zhenghua Lyu authored
      When the planner adds a redistribute motion above a subplan, it invokes
      `cdbpullup_findEclassInTargetList` to make sure the distkey can be
      computed from the subplan's targetlist. When the distkey is an expression
      built on PlaceHolderVar elements of the targetlist, the function
      `cdbpullup_missingVarWalker` does not handle it correctly.
      For example, when distkey is:
      
      ```sql
      CoalesceExpr [coalescetype=23 coalescecollid=0 location=586]
              [args]
                      PlaceHolderVar [phrels=0x00000040 phid=1 phlevelsup=0]
                              [phexpr]
                                      CoalesceExpr [coalescetype=23 coalescecollid=0 location=49]
                                              [args] Var [varno=6 varattno=1 vartype=23 varnoold=6 varoattno=1]
      ```
      
      and targetlist is:
      
      ```
      TargetEntry [resno=1]
              Var [varno=2 varattno=1 vartype=23 varnoold=2 varoattno=1]
      TargetEntry [resno=2]
              Var [varno=2 varattno=2 vartype=23 varnoold=2 varoattno=2]
      TargetEntry [resno=3]
              PlaceHolderVar [phrels=0x00000040 phid=1 phlevelsup=0]
                      [phexpr]
                              CoalesceExpr [coalescetype=23 coalescecollid=0 location=49]
                                      [args] Var [varno=6 varattno=1 vartype=23 varnoold=6 varoattno=1]
      TargetEntry [resno=4]
              PlaceHolderVar [phrels=0x00000040 phid=2 phlevelsup=0]
                      [phexpr]
                              CoalesceExpr [coalescetype=23 coalescecollid=0 location=78]
                                      [args] Var [varno=6 varattno=2 vartype=23 varnoold=6 varoattno=2]
      ```
      
      Previously, `cdbpullup_missingVarWalker` considered only Var nodes, which
      caused it to fail here.
      
      See Github issue: https://github.com/greenplum-db/gpdb/issues/10315 for
      details.
      
      This commit fixes the issue by considering PlaceHolderVar in function
      `cdbpullup_missingVarWalker`.
      2cb36320
    • Update PyGreSQL from pre-release to 5.2.0 · 0ee8dfb5
      Tyler Ramer authored
      PyGreSQL 5.2.0, which contains the fixes submitted and referenced in
      cb8d54a6, was released on June 21, 2020.
      
      Update the build process to use this tagged release rather than a
      pre-release hash.
      Authored-by: Tyler Ramer <tramer@vmware.com>
      0ee8dfb5
    • Update psutil from 4.0.0 to 5.7.0 · b365d229
      Tyler Ramer authored
      psutil 4.0.0 is quite old and only lists support for Python 3.4. We'll
      need support for Python 3.6 and 3.8 as we update to Python 3.
      Authored-by: Tyler Ramer <tramer@pivotal.io>
      b365d229
    • Docs - updating book build dependencies · b99e78df
      David Yozie authored
      b99e78df
  4. 22 Jun 2020, 3 commits
    • d29c9a23
    • Fix parameterized paths · 9cc1da61
      Richard Guo authored
      This patch fixes two issues related to parameterized path logic on
      master.
      
      1. When generating a unique row ID on the outer/inner side for a
      JOIN_DEDUP_SEMI/JOIN_DEDUP_SEMI_REVERSE join, we need to pass the param
      info of the outer/inner path to the projection path. Otherwise we would
      have problems when deciding whether a joinclause is movable to this join
      rel.
      
      2. We should not pick a parameterized path when its required outer rel
      is beyond a Motion, since we cannot pass a param through a Motion.
      
      Fixes issue #10012
      Reviewed-by: NHeikki Linnakangas <hlinnakangas@pivotal.io>
      Reviewed-by: NJinbao Chen <jinchen@pivotal.io>
      9cc1da61
    • Fix flaky appendonly test. · f860ff0c
      (Jerome)Junfeng Yang authored
      This fixes the error:
      ```
      ---
      /tmp/build/e18b2f02/gpdb_src/src/test/regress/expected/appendonly.out
      2020-06-16 08:30:46.484398384 +0000
      +++ /tmp/build/e18b2f02/gpdb_src/src/test/regress/results/appendonly.out
      2020-06-16 08:30:46.556404454 +0000
      @@ -709,8 +709,8 @@
         SELECT oid FROM pg_class WHERE relname='tenk_ao2'));
            case    | objmod | last_sequence | gp_segment_id
         -----------+--------+---------------+---------------
        + NormalXid |      0 | 1-2900        |             1
          NormalXid |      0 | >= 3300       |             0
        - NormalXid |      0 | >= 3300       |             1
          NormalXid |      0 | >= 3300       |             2
          NormalXid |      1 | zero          |             0
          NormalXid |      1 | zero          |             1
      ```
      
      The flakiness is because ORCA treats a `CREATE TABLE` statement without
      `DISTRIBUTED BY` as creating a randomly distributed table, while the
      planner distributes the table by its first column.
      
      ORCA:
      ```
      CREATE TABLE tenk_ao2 with(appendonly=true, compresslevel=0,
      blocksize=262144) AS SELECT * FROM tenk_heap;
      NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause. Creating a NULL
      policy entry.
      ```
      
      Planner:
      ```
      CREATE TABLE tenk_ao2 with(appendonly=true, compresslevel=0,
      blocksize=262144) AS SELECT * FROM tenk_heap;
      NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column(s)
      named 'unique1' as the Greenplum Database data distribution key for this
      table.
      ```
      
      So the data distribution for table tenk_ao2 is not as expected.
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
      f860ff0c
  5. 20 Jun 2020, 1 commit
    • For Python testing artifacts, introduce combination of Concourse cache and pip --cache-dir. · dcc5abb7
      Ed Espino authored
      For the Python testing artifacts used by the CLI tools, utilize the
      Concourse cached directories feature to create and use a pip cache dir
      shared between task runs.
      
      Be aware, the cache is scoped to the worker the task is run on. We do
      not get a cache hit when subsequent builds run on different workers.
      
      * The environment variable PIP_CACHE_DIR is used to store the cache
      directory.
      
      * Add "--retries 10" to Behave test dependency pip install commands.
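
      A sketch of the resulting pattern inside a task script (the cache path
      and the package name are assumptions for illustration):
      ```sh
      # pip honors PIP_CACHE_DIR; point it at a directory Concourse caches
      # per worker so subsequent runs on that worker get cache hits.
      export PIP_CACHE_DIR="${PWD}/pip-cache-dir"
      pip install --retries 10 behave
      ```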
      dcc5abb7
  6. 19 Jun 2020, 4 commits
    • Fix cursor snapshot dump xid issue · 32a3a4db
      Weinan WANG authored
      For a cursor snapshot dump, we need to record both the distributed and
      the local xid. So far, we only record the distributed xid in the dump,
      and the dump read function incorrectly assigns the distributed xid to
      the local xid.
      
      Fix it.
      32a3a4db
    • Re-enable test segwalrep/dtx_recovery_wait_lsn (#10320) · fe26d931
      Paul Guo authored
      Enable and refactor test isolation2:segwalrep/dtx_recovery_wait_lsn
      
      The test was disabled in 791f3b01 because there was concern about the
      line numbers from sql_isolation_testcase.py changing in the answer file.
      We refactor the test to ease that concern and then enable it again.
      Co-authored-by: Gang Xiong <gxiong@pivotal.io>
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
      fe26d931
    • Avoid generating core files during testing. (#10304) · 4a61357c
      Paul Guo authored
      We had some negative tests that need to panic and thus generate core
      files if the system is configured to dump them. Long ago we did some
      optimization to avoid generating core files in certain cases. Now we have
      found other scenarios that can be further optimized.
      
      1. Avoid core file generation with setrlimit() in the FATAL fault-injection
      code. Sometimes FATAL is upgraded to PANIC (e.g. in a critical section, or
      on failure while doing QD prepare-related work), so we can avoid generating
      a core file for this scenario also. Note that even if the FATAL is not
      upgraded, it is mostly fine to skip core file generation since the process
      will quit soon. With this change, we avoid two core files from the test
      isolation2:crash_recovery_dtm.
      
      2. We previously sanity-checked dbid/segidx in QE:HandleFtsMessage() and
      panicked on inconsistency when cassert is enabled. It seems we really do
      not need to panic, since the root cause of the failure is quite
      straightforward and the call stack is quite simple: PostgresMain() ->
      HandleFtsMessage(). Also, that part of the code does not involve shared
      memory, so there is no need to worry about shared memory mess (else we
      might want a core file to check). Downgrade the log level to FATAL. This
      avoids 6 core files from the test isolation2:segwalrep/recoverseg_from_file.
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
      4a61357c
    • docs - add views pg_stat_all_tables and indexes (#10248) · 74c5bb7d
      Mel Kiyama authored
      * docs - add views pg_stat_all_tables and indexes
      
      pg_stat_all_indexes
      pg_stat_all_tables
      
      Also add some statistics GUCs.
      --track_activities
      --track_counts
      
      * docs - clarify seq_scan and idx_scan refer to the total number of scans from all segments
      
      * docs - minor edits
      74c5bb7d
  7. 18 Jun 2020, 5 commits
    • Fix CASE WHEN IS NOT DISTINCT FROM clause incorrect dump. (#10298) · 3b2aed6e
      (Jerome)Junfeng Yang authored
      When dumping a 'CASE WHEN (arg1) IS NOT DISTINCT FROM (arg2)' clause,
      arg1 is lost. For example:
      ```
      CREATE OR REPLACE VIEW xxxtest AS
      SELECT
          CASE
          WHEN 'I will disappear' IS NOT DISTINCT FROM ''::text
          THEN 'A'::text
          ELSE 'B'::text
          END AS t;
      ```
      The dump will lose 'I will disappear'.
      
      ```
      SELECT
          CASE
          WHEN IS NOT DISTINCT FROM ''::text
          THEN 'A'::text
          ELSE 'B'::text
          END AS t;
      ```
      3b2aed6e
    • Fix a flaky test for gdd/dist-deadlock-upsert (#10302) · a3f34ae7
      Hao Wu authored
      * Fix a flaky test for gdd/dist-deadlock-upsert
      
      When the GDD probe runs is nondeterministic, but its timing is important
      for the test gdd/dist-deadlock-upsert. If the GDD probe runs immediately
      after the two inter-deadlocked transactions, one of the transactions will
      be killed. The isolation2 framework considers a transaction blocked if it
      doesn't finish within 0.5 seconds. So, if the killed transaction aborts
      too early, the test framework sees no deadlock.
      Analyzed-by: Gang Xiong <gxiong@pivotal.io>
      
      * rm sleep
      a3f34ae7
    • resgroup: fix the cpu value of the per host status view · e0d78729
      Ning Yu authored
      Resource groups do not distinguish per-segment cpu usage: the cpu usage
      reported by a segment is actually the total cpu usage of all the segments
      on the host. This is by design, not a bug. However, the
      gp_toolkit.gp_resgroup_status_per_host view reports the cpu usage as the
      sum over all the segments on the same host, so the reported per-host cpu
      usage is actually N times the actual usage, where N is the number of
      segments on that host.
      
      Fixed by reporting the avg() instead of the sum().
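
      Conceptually the fix is this aggregation change (the view and column
      names below are assumptions for illustration, not the exact view
      definition):
      ```sh
      # Each segment already reports whole-host cpu, so sum() over-counts by
      # the number of segments per host; avg() yields the actual host usage.
      psql -c "SELECT rsgname, hostname, avg(cpu) AS cpu
               FROM gp_toolkit.gp_resgroup_status_per_segment
               GROUP BY rsgname, hostname;"
      ```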
      
      Tests are not provided, as resgroup/resgroup_views has never verified
      cpu usage because it is unstable on the pipelines; however, I have
      verified the fix manually.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
      e0d78729
    • Enable brin in ao/aocs table (#9537) · 46d9e26a
      Jinbao Chen authored
      We merged BRIN from PostgreSQL 9.5, but Greenplum did not enable it on
      AO/AOCS tables.
      The reason BRIN cannot be used directly on an AO/AOCS table is
      that the storage structure of AO/AOCS differs from the heap
      table. A heap table has only one physical file, and all block numbers
      are contiguous. The revmap in BRIN is an array that spans multiple
      blocks, but that layout does not make sense for an AO/AOCS table.
      
      AO/AOCS tables have 128 segment files, and the block numbers in these
      segments are distributed over the entire value range. If we used an
      array to record the information of each block, that array would be
      too large.
      
      So we introduce an upper-level structure to solve this problem: an
      array that records the block numbers of the revmap blocks, which are
      no longer contiguous. When we need a new revmap block, we just extend
      a new one and record its block number in the upper-level array.
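
      With this change, building a BRIN index on an append-optimized table
      works like any other index build (a sketch; the table is hypothetical):
      ```sh
      psql -c "CREATE TABLE brin_ao (a int) WITH (appendonly=true) DISTRIBUTED BY (a);"
      psql -c "CREATE INDEX brin_ao_idx ON brin_ao USING brin (a);"
      ```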
      Reviewed-by: Asim R P <pasim@vmware.com>
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
      Reviewed-by: xiong-gang <gxiong@pivotal.io>
      Reviewed-by: NAdam Lee <adam8157@gmail.com>
      46d9e26a
    • docs - update postGIS 2.5.4 docs (#10297) · 39a25f82
      Mel Kiyama authored
      * docs - update postGIS 2.5.4 docs
      
      Updates for Greenplum PostGIS 2.5.4 v2
      
      --Add list of PostGIS extensions
      --Add support for PostGIS TIGER geocoder, address standardizer and address rules files.
      --Update install/uninstall instructions to use CREATE EXTENSION command
      --Remove postgis_manager.sh script
      --Remove PostGIS Raster limitation.
      
      * docs updated PostGIS 2.5.4 docs based on review comments.
      
      * docs - removed postgis_raster extension.
      
      * docs - review comment updates
        --Added section for installing the PostGIS package
        --Updated section on removing PostGIS package
        --Fix typos.
      
      * docs - updated platform requirements for PostGIS 2.5.4 v2
        --also removed "beta" from GreenplumR
      39a25f82