1. 07 Sep 2020 (2 commits)
  2. 04 Sep 2020 (1 commit)
  3. 03 Sep 2020 (5 commits)
    • [gpopt] Return a resource-safe type from gpdb::GetRelation · e754018a
      Jesse Zhang committed
      This commit takes advantage of the resource-safety afforded by
      RelationWrapper by using it as the return type of gpdb::GetRelation().
      
      This allows us to write code like this:
      
      auto rel = GetRelation(...);
      if (!RelIsSupported(rel)) {
      	return -1;
      }
      do_stuff(rel);
      
      Instead of code like this before the patch:
      
      Relation rel = GetRelation(...);
      
      if (!RelIsSupported(rel)) {
      	CloseRelation(rel);
      	return -1;
      }
      
      GPOS_TRY {
      	do_stuff(rel);
      	CloseRelation(rel);
      } GPOS_CATCH_EX(ex) {
      	CloseRelation(rel);
      	GPOS_RETHROW(ex);
      } GPOS_CATCH_END;
      e754018a
    • [gpopt] Add a simple RAII wrapper for Relation · fb83fa30
      Jesse Zhang committed
      fb83fa30
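      This commit carries no description in the log. As a rough illustration of the idea behind such a wrapper (a minimal sketch only, not the actual gpdb code from fb83fa30; the stand-in GetRelation/CloseRelation declarations and the member names are assumptions), an RAII type along these lines closes the relation in its destructor and is move-only:

      ```cpp
      #include <utility>

      // Stand-ins for the backend type and the gpdb:: wrappers mentioned above.
      struct RelationData;
      using Relation = RelationData *;
      Relation GetRelation(unsigned oid);
      void CloseRelation(Relation rel);

      // Move-only RAII wrapper: the relation is closed exactly once, when the
      // wrapper goes out of scope, even if an exception propagates through.
      class RelationWrapper
      {
      public:
      	explicit RelationWrapper(Relation rel) : m_rel(rel) {}
      	RelationWrapper(RelationWrapper &&other) noexcept
      		: m_rel(std::exchange(other.m_rel, nullptr)) {}
      	RelationWrapper(const RelationWrapper &) = delete;
      	RelationWrapper &operator=(const RelationWrapper &) = delete;
      	~RelationWrapper()
      	{
      		if (m_rel)
      			CloseRelation(m_rel);
      	}

      	operator Relation() const { return m_rel; }  // usable wherever a Relation is expected
      	explicit operator bool() const { return m_rel != nullptr; }

      private:
      	Relation m_rel;
      };
      ```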
    • [gpopt] clean up code calling GetRelation · 62d3d2ab
      Jesse Zhang committed
      We're about to introduce a different return type to gpdb::GetRelation in
      a forthcoming commit. To ease that transition, change Yoda conditions
      for pointer non-null comparison to the more idiomatic C++ style of using
      pointers in a boolean context. Also remove one redundant fallback
      exception.
      62d3d2ab
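      For illustration, the kind of mechanical change this refers to looks roughly like the following (a hypothetical hunk, not taken from the commit):

      ```cpp
      #include <cstddef>

      struct RelationData;
      using Relation = RelationData *;

      void Example(Relation rel)
      {
      	// Before: Yoda-style non-null comparison
      	if (NULL != rel)
      	{
      		/* ... */
      	}

      	// After: the pointer used directly in a boolean context
      	if (rel)
      	{
      		/* ... */
      	}
      }
      ```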
    • Handle the PROCSIG_NOTIFY_INTERRUPT signal · 6d155238
      Adam Lee committed
      This will make LISTEN and NOTIFY work on the QD node.
      6d155238
    • Export MASTER_DATA_DIRECTORY while calling gpconfig · c6467d2c
      Bhuvnesh Chaudhary committed
      For the GPDB appliance, certain GUCs are set using gpconfig, but this
      currently fails because MASTER_DATA_DIRECTORY is not exported. This
      commit exports MASTER_DATA_DIRECTORY so that gpconfig succeeds.
      
      This commit also allows setting DCA_VERSION_FILE to enable testing, and
      adds a test to ensure that the DCA configuration GUCs are set properly
      in the environment.
      c6467d2c
  4. 02 Sep 2020 (4 commits)
    • Correct reset condition in sessionResetSlot · b4cec6c9
      Hubert Zhang committed
      b4cec6c9
    • Fix flaky test 'gangsize' · 2771a51a
      Hubert Zhang committed
      The test case contains a query like 'insert into t select i from
      generate_series(1,10) i'. The slice running 'generate_series' has a
      General locus, so it might be executed on any segment depending on the
      session id, which makes the test flaky. To make it deterministic, we
      change generate_series to a regular table and filter the data with
      gp_segment_id. This commit also removes the alternative expected files.
      Co-authored-by: Gang Xiong <gangx@vmware.com>
      2771a51a
    • Fix compile error for missing brackets · b2d32cb9
      Hubert Zhang committed
      b2d32cb9
    • Using lwlock to protect resgroup slot in session state · a4cb06b4
      Hubert Zhang committed
      Resource groups used to access the resGroupSlot in SessionState without
      a lock. That is correct as long as each session only accesses its own
      resGroupSlot. But since we introduced the runaway feature, we need to
      traverse the session array to find the top consumer session when the
      red zone is reached. This requires:
      1. The runaway detector must hold the shared resgroup lock, so that a
      resGroupSlot cannot be detached from a session concurrently while the
      red zone is being handled.
      2. A normal session must hold the exclusive lock when modifying the
      resGroupSlot in its SessionState.
      
      Also fix a compile warning.
      Reviewed-by: Ning Yu <nyu@pivotal.io>
      a4cb06b4
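      A sketch of the locking discipline described above, using std::shared_mutex purely as a stand-in; the actual code uses a backend LWLock and the real SessionState layout, so the types and field names here are assumptions:

      ```cpp
      #include <cstddef>
      #include <mutex>
      #include <shared_mutex>
      #include <vector>

      struct SessionState
      {
      	int sessionId;
      	void *resGroupSlot;     // slot attached to this session, if any
      	std::size_t memUsage;
      };

      std::shared_mutex resGroupSlotLock;      // stand-in for the shared resgroup LWLock
      std::vector<SessionState> sessionArray;  // stand-in for the shared session array

      // Runaway detector (red zone reached): shared lock while traversing, so no
      // slot can be detached underneath it.
      const SessionState *FindTopConsumer()
      {
      	std::shared_lock<std::shared_mutex> guard(resGroupSlotLock);
      	const SessionState *top = nullptr;
      	for (const auto &s : sessionArray)
      		if (s.resGroupSlot != nullptr && (top == nullptr || s.memUsage > top->memUsage))
      			top = &s;
      	return top;
      }

      // Normal session: exclusive lock while attaching/detaching its own slot.
      void DetachSlot(SessionState &self)
      {
      	std::unique_lock<std::shared_mutex> guard(resGroupSlotLock);
      	self.resGroupSlot = nullptr;
      }
      ```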
  5. 01 Sep 2020 (8 commits)
    • Allow direct dispatch in Orca if predicate on column gp_segment_id (#10679) · 10e2b2d9
      David Kimura committed
      This approach special-cases gp_segment_id just enough to include the
      column as a distributed-column constraint. It also updates the direct
      dispatch info to be aware of gp_segment_id, which holds the raw value of
      the segment where the data resides. This is different from other columns,
      which hash the datum value to decide where the data resides.
      
      After this change, the following example shows a Gather Motion from 2
      segments on a 3-segment demo cluster.
      
      ```
      CREATE TABLE t(a int, b int) DISTRIBUTED BY (a);
      EXPLAIN SELECT gp_segment_id, * FROM t WHERE gp_segment_id=1 or gp_segment_id=2;
                                        QUERY PLAN
      -------------------------------------------------------------------------------
       Gather Motion 2:1  (slice1; segments: 2)  (cost=0.00..431.00 rows=1 width=12)
         ->  Seq Scan on t  (cost=0.00..431.00 rows=1 width=12)
               Filter: ((gp_segment_id = 1) OR (gp_segment_id = 2))
       Optimizer: Pivotal Optimizer (GPORCA)
      (4 rows)
      
      ```
      10e2b2d9
    • Have row and cost estimates in planner represent per-node row counts. · c5f6dbbe
      Heikki Linnakangas committed
      This is more in line with upstream parallel plans, where the estimates
      also mean "per worker".
      
      NOTE: The rows/tuples/pages in RelOptInfo still represent whole-rel
      values. That's the only thing that makes sense for join rels, which
      could have Paths with different locus.
      
      This doesn't change the row counts displayed in EXPLAIN output, because
      previously we divided the row counts stored on the plan nodes by the
      number of segments, for display purposes. With this patch, that's no
      longer necessary. You can see the difference in the cost estimates,
      however.
      
      This doesn't affect GPORCA's cost model, and the GPORCA translator has
      been modified to divide row count estimates in the final plan by the number
      of segments, to keep the row counts shown in EXPLAIN comparable with the
      Postgres planner's numbers, and unchanged from previous versions.
      
      This includes some changes to GPORCA output files too. Most of the real
      changes that are not just to plans in queries where GPORCA falls back
      are because I added an "ANALYZE int8_tbl" to the int8 test. That affects
      many test queries that used the int8_tbl table. I added the "ANALYZE
      int8_tbl" command to make one of the planner tests produce the same
      plan as before (I forget which one, unfortunately).
      
      Discussion: https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/cGZsAFiRfBE/m/aq6PKj23AwAJ
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
      Reviewed-by: Jinbao Chen <jinchen@pivotal.io>
      c5f6dbbe
    • Add comments to 'gp_aggregates_costs' test. · 765a526b
      Heikki Linnakangas committed
      Reviewed-by: Zhenghua Lyu <zlv@pivotal.io>
      765a526b
    • ic-proxy: Quit proxy bgworker when postmaster is dead · 9ce59d1a
      Hubert Zhang committed
      The proxy bgworker would become an orphan process after the postmaster
      dies, because it did not watch the pipe
      postmaster_alive_fds[POSTMASTER_FD_WATCH]. Epoll this pipe inside the
      proxy bgworker main loop as well.
      Reviewed-by: Ning Yu <nyu@pivotal.io>
      9ce59d1a
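      As a rough sketch of the check being described (Linux-only and illustrative; the actual ic-proxy main loop and its event handling differ), the read end of the postmaster-death pipe can be added to the same epoll set the worker already waits on:

      ```cpp
      #include <sys/epoll.h>
      #include <cstdlib>

      // Stand-ins for the backend's postmaster-death pipe; in the server these
      // are postmaster_alive_fds[] and POSTMASTER_FD_WATCH from postmaster.h.
      extern int postmaster_alive_fds[2];
      constexpr int POSTMASTER_FD_WATCH = 0;

      void ProxyMainLoop(int epfd, int listen_fd)
      {
      	epoll_event ev{};
      	ev.events = EPOLLIN;

      	ev.data.fd = listen_fd;
      	epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

      	// Watch the death pipe too: it becomes readable (EOF) once the
      	// postmaster exits and the write end is closed.
      	ev.data.fd = postmaster_alive_fds[POSTMASTER_FD_WATCH];
      	epoll_ctl(epfd, EPOLL_CTL_ADD, postmaster_alive_fds[POSTMASTER_FD_WATCH], &ev);

      	for (;;)
      	{
      		epoll_event events[8];
      		int n = epoll_wait(epfd, events, 8, -1 /* block */);
      		for (int i = 0; i < n; i++)
      		{
      			if (events[i].data.fd == postmaster_alive_fds[POSTMASTER_FD_WATCH])
      				std::exit(1);  // postmaster is gone: quit instead of lingering as an orphan
      			/* ... otherwise handle proxy traffic ... */
      		}
      	}
      }
      ```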
    • Fix resource group runaway rounding issue · 757184f9
      Hubert Zhang committed
      When calculating the runaway safeChunksThreshold in resource groups, we
      used to divide by 100 up front to get the number of safe chunks. This
      can cause small chunk numbers to be rounded down to zero. Fix it by
      storing safeChunksThreshold100 (100 times the real safe chunk count) and
      doing the division on the fly.
      Reviewed-by: Ning Yu <nyu@pivotal.io>
      757184f9
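      A small self-contained illustration of the rounding problem and the shape of the fix (the numbers and the exact comparison are made up; the real resource-group formula lives elsewhere):

      ```cpp
      #include <cstdint>
      #include <iostream>

      int main()
      {
      	int32_t chunks = 3;           // a small per-group chunk budget
      	int32_t redZonePercent = 80;  // runaway red-zone ratio, in percent

      	// Old behaviour: dividing by 100 up front truncates small values,
      	// e.g. 1 * 80 / 100 == 0, leaving no safe headroom at all.
      	int32_t safeChunks = chunks * redZonePercent / 100;               // 3 * 80 / 100 = 2

      	// New behaviour: keep the threshold 100x larger and scale the other
      	// side of the comparison on the fly, so nothing is truncated.
      	int32_t safeChunksThreshold100 = chunks * redZonePercent;         // 240
      	int32_t usedChunks = 2;
      	bool aboveThreshold = usedChunks * 100 > safeChunksThreshold100;  // 200 > 240 -> false

      	std::cout << safeChunks << ' ' << aboveThreshold << '\n';         // prints "2 0"
      	return 0;
      }
      ```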
    • Decorate assert-only variables with GPOS_ASSERTS_ONLY · 082ea4c5
      Jesse Zhang committed
      This is in no way exhaustive; I'm only changing what seems abundantly
      obvious and greppable.
      082ea4c5
    • Add GPOS_UNUSED attribute · 26ae898a
      Jesse Zhang committed
      While we're at it, also add another attribute
      GPOS_ASSERTS_ONLY. This should help us eliminate a lot of
      clutter around the code that looks like this:
      
          BOOL result =
                  m_cte_consumer_info->Insert(key, GPOS_NEW(m_mp) SCTEConsumerInfo(cte_plan));
      26ae898a
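      As an illustration of how such annotations typically work (a sketch; the actual definitions in GPORCA may differ), the attribute silences unused-variable warnings for variables that are only read inside assertions:

      ```cpp
      #include <cassert>

      // Sketch of the two attributes; the real GPORCA definitions may differ.
      #if defined(__GNUC__) || defined(__clang__)
      #define GPOS_UNUSED __attribute__((unused))
      #else
      #define GPOS_UNUSED
      #endif
      #define GPOS_ASSERTS_ONLY GPOS_UNUSED

      bool InsertSomething() { return true; }  // stand-in for m_cte_consumer_info->Insert(...)

      void Example()
      {
      	// In NDEBUG builds assert() compiles away, so without the attribute
      	// 'result' would trigger an unused-variable warning.
      	bool result GPOS_ASSERTS_ONLY = InsertSomething();
      	assert(result);
      }
      ```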
    • [gpopt] Remove dead variables from translator · b6959a6a
      Jesse Zhang committed
      With this patch, the whole translator compiles warning-free.
      
      null_ndv was orphaned in commit 25479cf1 ("Fix num_distinct
      calculation in relcache translator").
      
      coercePathType was dead on arrival in commit cc799db4 ("Fix Relcache
      Translator to send CoercePath info (#2842)").
      b6959a6a
  6. 31 Aug 2020 (7 commits)
    • Fix a 'VACUUM FULL' bug · 082f39d5
      xiong-gang committed
      When doing 'VACUUM FULL', 'swap_relation_files' updates the pg_class
      entry but does not increment the command counter, so the later
      'vac_update_relstats' updates the 'relfrozenxid' and 'relhasindex' of
      the old tuple in place. When the transaction is then interrupted and
      aborted on the QE, the old entry is corrupted.
      This problem was partially fixed by commit 7f7fa498; this commit
      separates out the code that sends stats to the QD and calls it in
      `vac_update_relstats`, instead of updating the stats on the QE.
      082f39d5
    • Fix crash when planner chose an Index Only Scan on a bitmap index. · 0b2f53d5
      Heikki Linnakangas committed
      Index Only Scans have not been implemented on bitmap indexes, but in
      certain circumstances, when the query doesn't need any of the attributes
      from the index, as in "SELECT count(*) FROM table", the planner may
      still choose an Index Only Scan. It's debatable whether that's actually
      a planner bug, but we can easily support that limited case.
      Reviewed-by: Ashwin Agrawal <aashwin@vmware.com>
      0b2f53d5
    • Fix DISTINCT plans created on top of pre-sorted inputs. · a9810725
      Heikki Linnakangas committed
      If you have a pre-sorted input, like an Index Scan, and a DISTINCT
      clause, the planner would create an invalid plan. A Redistribute Motion
      node breaks the ordering of its input, so such a plan cannot be used as
      input to a Unique node.
      
      This is possibly unreachable at the moment, because parse analysis
      transforms simple DISTINCT queries to GROUP BY (see the call to
      transformDistinctToGroupBy() in transformSelectStmt()). I have not been
      able to come up with a query that would exercise this codepath; any
      simple query is transformed to a GROUP BY, and anything more complicated,
      with window functions or aggregates, doesn't yield sorted input to the
      DISTINCT stage. But if you disable the DISTINCT -> GROUP BY transformation
      in parse analysis, this query caused an assertion failure before this commit:
      
          postgres=# create table distincttest (i int, j int) distributed by (i);
          CREATE TABLE
          postgres=# create index on distincttest (j);
          CREATE INDEX
          postgres=# set gp_enable_multiphase_agg =off; set enable_hashagg=off; set enable_seqscan=off; set enable_bitmapscan=off;
          SET
          SET
          SET
          SET
          postgres=# explain select distinct j from distincttest;
          FATAL:  Unexpected internal error (createplan.c:6871)
          DETAIL:  FailedAssertion("!(numCols >= 0 && numCols <= list_length(pathkeys))", File: "createplan.c", Line: 6871)
          server closed the connection unexpectedly
          	This probably means the server terminated abnormally
          	before or while processing the request.
          The connection to the server was lost. Attempting reset: Succeeded.
      a9810725
    • Disable strxfrm for mk_sort at compile time · 2d523e9e
      Denis Smirnov committed
      Glibc implementations are known to return inconsistent results for
      strcoll() and strxfrm() on many platforms, which can cause unpredictable
      bugs. Because of that, PostgreSQL has disabled strxfrm() by default
      since 9.5, gated at compile time behind the TRUST_STRXFRM definition.
      Greenplum has its own mk_sort implementation that can also use
      strxfrm(), so mk_sort can likewise be affected by the
      strcoll()/strxfrm() inconsistency (it breaks merge joins). That is why
      strxfrm() should be disabled by default for mk_sort as well, behind a
      TRUST_STRXFRM_MK_SORT definition. We don't reuse PostgreSQL's
      TRUST_STRXFRM definition because many users ran Greenplum with strxfrm()
      enabled for mk_sort while it was disabled in the PostgreSQL core.
      Keeping TRUST_STRXFRM_MK_SORT as a separate definition allows those
      users to avoid reindexing after a version upgrade.
      Reviewed-by: Asim R P <pasim@vmware.com>
      Reviewed-by: Heikki Linnakangas <linnakangash@vmware.com>
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
      2d523e9e
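      As a sketch of the compile-time gate being described (illustrative only; the real mk_sort code is considerably more involved and manages its transformed keys differently):

      ```cpp
      #include <cstring>

      // Compare two NUL-terminated strings under the current collation.
      int CollatedCompare(const char *a, const char *b)
      {
      #ifdef TRUST_STRXFRM_MK_SORT
      	// Only when explicitly trusted: pre-transform with strxfrm() and compare
      	// the transformed keys bytewise (fixed buffers here just for illustration).
      	char ka[256], kb[256];
      	strxfrm(ka, a, sizeof(ka));
      	strxfrm(kb, b, sizeof(kb));
      	return strcmp(ka, kb);
      #else
      	// Default: rely on strcoll(), the behaviour PostgreSQL trusts everywhere.
      	return strcoll(a, b);
      #endif
      }
      ```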
    • fix gpload multi-level partition table and special char in columns issue (#10686) · d80ec3a5
      xiaoxiao committed
      Fix the match-column condition to resolve primary key conflicts when
      using gpload merge mode to import data into a multi-level partitioned
      table. Also fix a failure when column names contain special characters
      or capital letters.
      Co-authored-by: XiaoxiaoHe <hxiaoxiao@vmware.com>
      d80ec3a5
    • Fix bug for ANALYZE inherited tables. (#10723) · 289ebe68
      (Jerome)Junfeng Yang committed
      When the QD acquires sample rows from the QEs, a QE should only collect
      the requested samples, either for the parent table alone or for all
      inherited tables. Otherwise, the QD may get wrong results for the parent
      table, since the inherited tables will overwrite the expected values,
      and we end up with incorrect results in pg_class and pg_statistic.
      
      `gp_acquire_sample_rows` takes three inputs, but somehow the code lost
      the usage of the last `inherited` argument. This argument is what
      distinguishes whether the QD needs samples for the parent only or for
      all inherited tables.
      
      On the QE, when receiving an ANALYZE request through
      gp_acquire_sample_rows, we should perform do_analyze_rel for either the
      parent table only or for all of its child tables, because the QD will
      send two acquire-sample-rows requests to the QE.
      To distinguish the two requests, we check the ctx->inherited value.
      289ebe68
    • Enable autoanalyze in Greenplum (#10515) · 0b17968e
      盏一 committed
      The basic idea for enabling auto-ANALYZE only through the Master's
      autovacuum daemon is to collect pgstat info on the Master when executing
      queries. Start the Master's autovacuum launcher process, and fire an
      autovacuum worker process for a database on the Master when the naptime
      elapses. The autovacuum worker then iterates through all
      tables/materialized views in the specified database and executes ANALYZE
      for tables which have reached the analyze threshold. Note that the
      ANALYZE statement issued by the autovacuum worker on the Master is the
      same as executing it through a query on the QD, i.e. auto-ANALYZE is
      coordinated by the master, and segments do not start their own
      autovacuum launcher and autovacuum workers.
      
      For more details, refer to src/backend/postmaster/README.auto-ANALYZE.
      Co-authored-by: Junfeng(Jerome) Yang <jeyang@pivotal.io>
      0b17968e
  7. 28 Aug 2020 (4 commits)
    • Fix gathering statistics sample from segments. · e878e2e8
      Heikki Linnakangas committed
      Commit 0c27e42a changed the way that the gp_acquire_sample_rows()
      function, called by ANALYZE, collects the sample rows. With the commit,
      the sample size was not chosen correctly. The sample size is passed to
      gp_acquire_sample_rows() as an argument, 'targrows', but the function did
      not pass it down to the do_analyze_rel() function that actually collects
      the sample. As a result, do_analyze_rel() collected a larger sample, but
      gp_acquire_sample_rows() only returned the first 'targrows' rows of it
      to the caller.
      
      For example, if you have three segments and the total desired sample size
      is 3000 rows, gp_acquire_sample_rows() is called with targrows=1000. But
      do_analyze_rel() nevertheless collected a sample of 3000 rows, and
      only the first 1000 rows of it were returned to the QD. The end result was
      that the sample was highly biased towards the physical beginning of the table.
      
      This adds a test case, which creates and ANALYZEs a table with values
      0-99, with 100 copies of each distinct value. The table is populated in
      order, so there is perfect correlation between the physical order and the
      values. Before this patch, ANALYZE built a histogram like this for it:
      
      regression=# select histogram_bounds from pg_stats s where tablename = 'uniformtest';
              histogram_bounds
      ---------------------------------
       {0,3,6,10,13,17,20,24,27,34,40}
      (1 row)
      
      After this fix:
      
               histogram_bounds
      ----------------------------------
       {0,8,21,32,42,51,60,71,81,89,99}
      (1 row)
      
      Commit 0c27e42a updated the plan in expected output of
      'gp_aggregates_costs' test. This reverts it back; the reason it changed was
      that the statistics were bogus, and now they're good again. I'm not sure
      which plan actually is better for that query. The cost estimates are not
      very accurate in either case, but they're inaccurate in different ways. The
      query actually returns 300000 rows, the estimate with the bogus stats was
      463756 rows and with the correct stats it's 103613.
      e878e2e8
    • Increase default value of guc gp_snapshotadd_timeout · 7c6c1b76
      Paul Guo committed
      This is used to avoid the "writer segworker group shared snapshot
      collision on id 153871" kind of error. Pengzhou and I saw this in a real
      production environment on GPDB 5. Pengzhou suspected that the writer
      gang exits due to gp_vmem_idle_resource_timeout but exits slowly because
      of ProcArrayLock contention, so the collision happens when a new gang is
      created. The theory was roughly verified with process core dumps taken
      when the issue happened: ProcArrayLock contention was found in those
      core files.
      
      Increase the default gp_snapshotadd_timeout value to better tolerate
      this case. We have been optimizing ProcArrayLock, but we cannot avoid
      the contention 100% of the time.
      7c6c1b76
    • Disable changing distribution keys implicitly when creating unique index (#10510) · 84d2a23f
      Hao Wu committed
      In previous GPDB versions, the distribution keys could be changed implicitly
      when creating a unique index on a hash-distributed empty table.
      ```SQL
      create table foo(a int, b int) distributed by(a);
      create unique index on foo(b);
      -- now, foo is hash distributed by b, not by a
      ```
      It might be useful (maybe) to avoid having to change the distribution
      keys by hand. On the other hand, it is all too easy for the user to miss
      the NOTICE message, "NOTICE:  updating distribution policy to match new
      UNIQUE index".
      
      What's worse, this behavior can lead to data inconsistency:
      ```SQL
      create table foo(a int, b int) distributed by(a);
      insert into foo select i,i from generate_series(1,5)i;
      
      create table foopart (i int4, j int4) distributed by (i) partition by
              range (i) (start (1) end (3) every (1));
      create unique index on foopart_1_prt_1 (j);
      insert into foopart values(1,2),(2,1);
      ```
      The data inconsistency looks like this:
      ```
      gpadmin=# select gp_segment_id, * from foopart_1_prt_1;
       gp_segment_id | i | j
       ---------------+---+---
                   1 | 1 | 2
       (1 row)
      
      gpadmin=# select * from foo f, foopart_1_prt_1 p where f.a = p.j;
       a | b | i | j
       ---+---+---+---
       (0 rows)
      ```
      
      Implicitly changing the distribution keys is not very useful, but it is harmful.
      This PR disables changing the distribution keys when creating a unique index.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
      84d2a23f
  8. 27 Aug 2020 (1 commit)
  9. 26 Aug 2020 (3 commits)
    • Refactor code to sort/redistribute input to Agg nodes. · 15e1341c
      Heikki Linnakangas committed
      This introduces two new functions cdb_prepare_path_for_sorted_agg() and
      cdb_prepare_path_for_hashed_agg(), to sort and/or redistribute the input
      to an Agg node, in single-phase aggregation. Previously, the logic was in
      the callers in planner.c. This is a nice cleanup now, but is particularly
      helpful with the PostgreSQL v12 merge which will introduce more codepaths
      that create Agg nodes. Encapsulating the logic in functions reduces the
      duplication.
      
      Parallel grouping is currently disabled altogether, but if it wasn't, we
      should be using these functions when creating parallel grouping paths,
      too.
      
      There's one almost user-visible change here, which explains the change in
      'gp_aggregates' expected output. If a sorted Gather Motion is created, we
      now use the path keys needed for the grouping (root->grouped_pathkeys),
      rather than the pathkeys of the subpath (subpath->pathkeys), as the merge
      key for the Gather Motion. The grouped_pathkeys must be a subset of the
      subpath's keys, but the subpath might have extra keys that are not needed
      for the Agg. Don't bother to preserve the order of those extra keys,
      mostly because it's more convenient in the code to not bother with it, but
      in principle it also saves some CPU cycles.
      Reviewed-by: Gang Xiong <gxiong@pivotal.io>
      15e1341c
    • Fix url_curl on MacOS (#10261) · 89a1211c
      Xiaoran Wang committed
      * Fix url_curl on MacOS
      
      Fix libcurl not being able to read data from gpfdist on macOS.
      
      Note that gpfdist with a pipe still cannot work on macOS, because
      flock(2), which is used in gfile.c, is not supported on macOS.
      89a1211c
    • Don't try to generate generic plans with GPORCA. · 09aa23d3
      Heikki Linnakangas committed
      If you have plan_cache_mode=auto, which is the default, never try to
      generate "generic" plans. GPORCA doesn't support Param nodes, so it will
      always fall back to the Postgres planner. What happened without this patch
      was that the backend code would compare the cost of the custom plan
      generated with GPORCA with the cost of a generic plan generated with the
      Postgres planner, and that doesn't make much sense because GPORCA has
      a very different cost model from the Postgres planner.
      
      No test, because it would be quite tedious and fragile to write one, and
      the code change seems simple enough.
      
      I bumped into this while hacking on PR #10676, which changes the Postgres
      planner's cost model. There's a test in 'direct_dispatch' for the generic
      plan generation, and it started to fail because with the planner cost
      model changes, the Postgres planner's generic plan started to look cheaper
      than the custom plan generated with GPORCA. So we do have some test
      coverage for this, although accidental.
      Reviewed-by: Venkatesh Raghavan <vraghavan@pivotal.io>
      09aa23d3
  10. 25 Aug 2020 (5 commits)
    • Docs - add new postgis packaging revision · 9f7aed4e
      David Yozie committed
      9f7aed4e
    • docs - add information on upgrading to PostGIS 2.5.4 (#10550) · cf020c44
      Mel Kiyama committed
      * docs - add information on upgrading to PostGIS 2.5.4
      
      Upgrade instructions 2.1.5 to different versions of 2.5.4
      
      * docs - upgrade to PostGIS 2.5.4 review comments
      
      * docs - more review comment updates.
      Reorder upgrade sections.
      Clarify that removing the PostGIS package means removing the gppkg.
      
      * docs - minor edit
      
      * docs - review updates - more emphasis on the fact that removing PostGIS from a database deletes objects.
      - Create separate paragraph in Upgrading section.
      - Add warning in Removing PostGIS section
      
      * docs - minor review comment update
      
      * small edits
      Co-authored-by: David Yozie <dyozie@pivotal.io>
      cf020c44
    • Harden analyzedb further against dropped/recreated tables (#10669) · 4bbbb381
      Chris Hajas committed
      Commit 445fc7cc hardened some parts of analyzedb. However, it missed a
      couple of cases.
      
      1) When the statement to get the modcount from the pg_aoseg table failed
      due to a dropped table, the transaction was also terminated. This caused
      further modcount queries to fail; those tables were still analyzed, but
      analyzedb would error out and not properly record the modcount.
      Therefore, we now restart the transaction when it errors.
      
      2) If the table is dropped and then recreated while analyzedb is running
      (or some other mechanism that results in the table being successfully
      analyzed, but the pg_aoseg table did not exist during the initial
      check), the logic to update the modcount may fail. Now, we skip the
      update for the table if this occurs. In this case, the modcount would
      not be recorded and the next analyzedb run will consider the table
      modified (or dirty) and re-analyze it, which is the desired behavior.
      4bbbb381
    • Fix flaky 'combocid' test. · 5cbb2282
      Heikki Linnakangas committed
      It would sometimes fail like this:
      
      --- /tmp/build/e18b2f02/gpdb_src/src/test/regress/expected/combocid.out	2020-08-25 03:14:48.314831054 +0000
      +++ /tmp/build/e18b2f02/gpdb_src/src/test/regress/results/combocid.out	2020-08-25 03:14:48.326832158 +0000
      @@ -66,7 +66,7 @@
       FETCH ALL FROM c;
        ctid  | cmin | foobar | distkey
       -------+------+--------+---------
      - (0,1) |    0 |      1 |
      + (0,1) |    1 |      1 |
        (0,2) |    1 |      2 |
        (0,5) |    0 |    333 |
       (3 rows)
      
      I was able to reproduce that locally, by inserting a random delay in the
      SeqNext() function.
      5cbb2282
    • Allow setting direct dispatch info if predicate on gp_segment_id for planner. · 13b38eb8
      Zhenghua Lyu committed
      This commit implements the same feature for the planner as PR
      https://github.com/greenplum-db/gpdb/pull/10679.
      
      This commit does not implement the group-by feature of PR 10679.
      The following commit message is almost the same as that of PR 10679.
      
      This approach special-cases gp_segment_id just enough to include the
      column as a distributed-column constraint. It also updates the direct
      dispatch info to be aware of gp_segment_id, which holds the raw value of
      the segment where the data resides. This is different from other columns,
      which hash the datum value to decide where the data resides.
      
      After this change, the following example shows a Gather Motion from 2
      segments on a 3-segment demo cluster.
      
      ```
      CREATE TABLE t(a int, b int) DISTRIBUTED BY (a);
      EXPLAIN SELECT gp_segment_id, * FROM t WHERE gp_segment_id=1 or gp_segment_id=2;
                                          QUERY PLAN
      -----------------------------------------------------------------------------------
       Gather Motion 2:1  (slice1; segments: 2)
         ->  Seq Scan on t
               Filter: ((gp_segment_id = 1) OR (gp_segment_id = 2))
       Optimizer: Postgres query optimizer
      (4 rows)
      ```
      13b38eb8