1. 27 November 2018, 9 commits
    • Replace PathKey with new DistributionKey struct, in CdbPathLocus. · 882958da
      Heikki Linnakangas committed
      In PostgreSQL, a PathKey represents sort ordering, but we have been using
      it in GPDB to also represent the distribution keys of hash-distributed
      data in the planner. (i.e. the keys in DISTRIBUTED BY of a table, but also
      when data is redistributed by some other key on the fly). That's been
      convenient, and there's some precedent for that, since PostgreSQL also
      uses PathKey to represent GROUP BY columns, which is quite similar to
      DISTRIBUTED BY.
      
      However, there are some differences. The opfamily, strategy and nulls_first
      fields in PathKey are not applicable to distribution keys. Using the same
      struct to represent ordering and hash distribution is sometimes convenient,
      for example when we need to test whether the sort order or grouping is
      "compatible" with the distribution. But at other times, it's confusing.
      
      To clarify that, introduce a new DistributionKey struct, to represent
      a hashed distribution. While we're at it, simplify the representation of
      HashedOJ locus types, by including a List of EquivalenceClasses in
      DistributionKey, rather than just one EC like a PathKey has. CdbPathLocus
      now has only one 'distkey' list that is used for both Hashed and HashedOJ
      locus, and it's a list of DistributionKeys. Each DistributionKey in turn
      can contain multiple EquivalenceClasses.
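
      As a rough C sketch of the shape described above (field names other
      than 'distkey' are assumptions based on this message, not the actual
      patch):

          /* One hash-distribution key; may match several equivalence classes. */
          typedef struct DistributionKey
          {
              List   *dk_eclasses;    /* List of EquivalenceClass pointers */
          } DistributionKey;

          typedef struct CdbPathLocus
          {
              CdbLocusType locustype; /* e.g. CdbLocusType_Hashed / _HashedOJ */
              List        *distkey;   /* list of DistributionKeys, shared by
                                       * the Hashed and HashedOJ locus types */
          } CdbPathLocus;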
      
      Looking ahead, I'm working on a patch to generalize the "cdbhash"
      mechanism, so that we'd use the normal Postgres hash opclasses for
      distribution keys, instead of hard-coding support for specific datatypes.
      With that, the hash operator class or family will be an important part of
      the distribution key, in addition to the datatype. The plan is to store
      that also in DistributionKey.
      Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
    • Fix crash when a concurrent update hits a table containing a variable-length column · 8ae681e5
      xiong-gang committed
      EvalPlanQual materializes the slot into a heap tuple, after which
      PRIVATE_tts_values points to freed memory. We need to reset
      PRIVATE_tts_nvalid in ExecMaterializeSlot to prevent the subsequent
      ExecFilterJunk from referencing the dangling pointer.
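
      A minimal sketch of the idea (surrounding code elided; this is not
      the verbatim patch):

          /* At the end of ExecMaterializeSlot(), after the virtual tuple
           * has been copied into slot-owned heap memory: */

          /*
           * The deformed-values cache may still point into memory owned by
           * the old tuple; invalidate it so later callers such as
           * ExecFilterJunk re-deform from the materialized copy instead of
           * chasing a dangling pointer.
           */
          slot->PRIVATE_tts_nvalid = 0;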
    • Correct numsegments in reshuffle node · c868f3fe
      Zhenghua Lyu committed
      Previously the reshuffle node's numsegments was always set to the
      cluster size. Now that we have a flexible gang & dispatch API, the
      numsegments field of the reshuffle node should be set to its
      lefttree's flow->numsegments.
      Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
      Co-authored-by: Zhenghua Lyu <zlv@pivotal.io>
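
      A one-line sketch of the fix as described (variable names are
      illustrative, not from the actual patch):

          reshuffle->numsegments = lefttree->flow->numsegments;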
    • Do not generate ReshuffleExpr for replicated table · e12d8cab
      Zhenghua Lyu committed
      When we expand a partially replicated table via `alter table t expand
      table`, we internally use the split-update framework to implement the
      expansion. That framework was originally designed for hash-distributed
      tables. For a replicated table we do not need the reshuffle_expr
      (filter condition) at all, because all data in a replicated table must
      be transferred.
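
      A hedged sketch of the skip described above (GpPolicyIsReplicated()
      is real GPDB API; the surrounding names are illustrative):

          /* Replicated tables move every row, so no filter is needed. */
          if (GpPolicyIsReplicated(rel->rd_cdbpolicy))
              reshuffleExpr = NULL;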
    • pg_upgrade: freeze master data directory before copying to segments · 5583ecde
      Kalen Krempely committed
      This allows the data to be visible on the segments. The segments
      should not interpret any transaction id from the master during or
      after upgrade.
      Co-authored-by: Asim R P <apraveen@pivotal.io>
      Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
    • Vacuum should operate on auxiliary tables in utility mode · 851779ac
      Kalen Krempely committed
      Without this commit, auxiliary tables such as toast and aoseg are
      skipped during vacuum when run in utility mode (such as during
      pg_upgrade).
      Co-authored-by: Asim R P <apraveen@pivotal.io>
      Co-authored-by: Kalen Krempely <kkrempely@pivotal.io>
    • Run pgindent on zstd_compression.c. · d13a3089
      Ashwin Agrawal committed
    • Unit test for zstd_compression.c · 647cf58c
      Ivan Leskin committed
      Add a unit test (and its infrastructure) for 'zstd_compress()'.
      
      The test checks whether 'zstd_compress()' returns correct output in case
      compression fails (compressed data is larger than uncompressed). To do that,
      'ZSTD_compressCCtx()' is mocked to always return 'ZSTD_error_dstSize_tooSmall'.
      
      Also, an 'ifndef' is added around 'PG_MODULE_MAGIC' in
      zstd_compression.c.
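
      A hedged sketch of the mocking approach described above (the test
      double's shape is an assumption; the zstd error encoding is real):

          #include <zstd.h>
          #include <zstd_errors.h>

          /* Test double: always report that the destination is too small.
           * zstd encodes errors as (size_t)-errcode, which ZSTD_isError()
           * recognizes. */
          size_t
          ZSTD_compressCCtx(ZSTD_CCtx *cctx, void *dst, size_t dstCapacity,
                            const void *src, size_t srcSize, int level)
          {
              return (size_t) -ZSTD_error_dstSize_tooSmall;
          }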
    • Fix ZSTD compression error "dst buffer too small" · f8afec06
      Ivan Leskin committed
      When ZSTD compression is used for AO/CO tables, insertion of data may
      cause the error "Destination buffer is too small". This happens when
      the compressed data is larger than the uncompressed input data.

      This commit adds handling of this situation: do not change the output
      buffer, and return a used size equal to the source size. The caller
      (e.g., 'cdbappendonlystoragewrite.c') is able to handle such output;
      in this case, it copies the data from input to output itself.
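
      A sketch of that handling (ZSTD_isError() and ZSTD_getErrorCode() are
      real zstd API; the surrounding variable names are illustrative):

          size_t ret = ZSTD_compressCCtx(cctx, dst, dst_capacity,
                                         src, src_size, level);
          if (ZSTD_isError(ret) &&
              ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall)
          {
              /* Compression would expand the data: leave dst untouched and
               * report a used size equal to the source size, so the caller
               * stores the block uncompressed. */
              *dst_used = src_size;
              return;
          }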
2. 26 November 2018, 4 commits
    • Always create fully distributed tables · e90a3bdd
      Ning Yu committed
      In CREATE TABLE we used to decide numsegments from the LIKE, INHERITS
      and DISTRIBUTED BY clauses.  However, we do not want partially
      distributed tables to be created by end users, so change the logic to
      always create tables with DEFAULT as numsegments.  We still allow
      developers to hack the DEFAULT numsegments with the
      gp_debug_numsegments extension.
    • The showtype argument of get_const_expr() is no longer bool. (#6304) · bd38461c
      Paul Guo committed
      PG patch 59358907 changed that, but this line change was somehow lost
      in gpdb during the merge.

      Reviewed-by: Heikki Linnakangas
    • Corresponding test changes to function renaming · 586249a1
      Adam Lee committed
      "c2edc32d Silence compiler warnings
      about unused functions, with OpenSSL 1.1.0." renamed functions, this is
      the corresponding test changes.
      586249a1
    • Remove unused arguments to query · 7cf749b4
      Daniel Gustafsson committed
      Commit 226e8867 removed oidcasted_pk and max_content from the SQL
      query, but didn't remove the arguments. While they don't cause an
      issue, as they will be unused, remove them to avoid confusing readers.

      Reviewed-by: Heikki Linnakangas
3. 25 November 2018, 3 commits
    • Remove /static/ from PostgreSQL doc links · ac753aed
      Daniel Gustafsson committed
      Commit 17f9b7f070dbe17b2844a8b4dd428 in the pgweb repository removed
      the /static/ portion on all doc URLs, leaving a redirect in place. To
      avoid incurring a needless redirect, remove the /static/ part in the
      links to the PostgreSQL documentation.
      
      The /static/ URLs stem from a time when there were interactive docs
      that had comment functionality. These docs were removed a very long
      time ago, but the static differentiator was left in place until now.
      
      Reviewed-by: Mel Kiyama
    • Silence compiler warnings about unused functions, with OpenSSL 1.1.0. · c2edc32d
      Heikki Linnakangas committed
      With OpenSSL 1.1.0 and above, CRYPTO_set_id_callback and
      CRYPTO_set_locking_callback are no-op macros, which rendered
      id_function() and locking_function() unused. That produced compiler
      warnings.
      Reviewed-by: Paul Guo <pguo@pivotal.io>
      Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
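
      One common way to silence such warnings, shown as a hedged sketch
      (the commit message does not say exactly how it was done here):

          #include <openssl/opensslv.h>
          #include <pthread.h>

          #if OPENSSL_VERSION_NUMBER < 0x10100000L
          /* Only needed by CRYPTO_set_id_callback(), which is a no-op
           * macro in OpenSSL 1.1.0 and above. */
          static unsigned long
          id_function(void)
          {
              return (unsigned long) pthread_self();
          }
          #endif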
    • Silence compiler's suggestions about marking functions with printf-attribute. · 482ae41b
      Heikki Linnakangas committed
      I was getting these compiler warnings:
      
      src/s3log.cpp: In function ‘void _LogMessage(const char*, __va_list_tag*)’:
      src/s3log.cpp:17:42: warning: function might be possible candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
           vsnprintf(buf, sizeof(buf), fmt, args);
                                                ^
      src/s3log.cpp: In function ‘void _send_to_remote(const char*, __va_list_tag*)’:
      src/s3log.cpp:27:55: warning: function might be possible candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
           size_t len = vsnprintf(buf, sizeof(buf), fmt, args);
                                                             ^
      src/s3log.cpp: In function ‘void LogMessage(LOGLEVEL, const char*, ...)’:
      src/s3log.cpp:41:39: warning: function might be possible candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
                   vfprintf(stderr, fmt, args);
                                             ^
      
      Those are good suggestions. I couldn't figure out the correct way to mark
      the _LogMessage() and _send_to_remote() local functions, so I decided to
      inline them into the caller, LogMessage(), instead. They were almost
      one-liners, and LogMessage() is still very small, too, so I don't think
      there's any significant loss to readability.
      
      A few format strings in debugging messages were treating pthread_self() as
      a pointer, while others were treating it as a wrong kind of integer.
      Harmonize by casting it to "uint64_t", and using PRIX64 as the format
      string. This isn't totally portable: pthread_t can be an arithmetic type,
      or a struct, and casting a struct to unsigned int won't work. In principle,
      that was a problem before this patch already, but now you should get a
      compiler error, if you try to compile on a platform where pthread_t is not
      an arithmetic type. I think that's better than silent type confusion.
      Reviewed-by: Paul Guo <pguo@pivotal.io>
      Reviewed-by: Daniel Gustafsson <dgustafsson@pivotal.io>
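
      A small sketch of the pthread_self() harmonization described above
      (illustrative, not the verbatim patch):

          #include <inttypes.h>
          #include <pthread.h>
          #include <stdio.h>

          /* Treat the thread id as a 64-bit integer on platforms where
           * pthread_t is an arithmetic type; elsewhere this fails to
           * compile, which beats silent type confusion. */
          printf("[thread %" PRIX64 "] ...\n", (uint64_t) pthread_self());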
4. 24 November 2018, 1 commit
5. 23 November 2018, 9 commits
    • Revert `Remove Master/Standby SyncRepWait Greenplum hack` and two other relevant commits (#6307) · b8f98fca
      BaiShaoqi committed
      This reverts four commits:

      1. Remove Master/Standby SyncRepWait Greenplum hack: 7f6066ea
      2. Add alter system synchronous_standby_names to * when gpinitstandby -n 1136f2fb
      3. Hot fix in gpinitstandby behave test failure: b6c77b2f
      4. Remove unused variables, to silence compiler warnings 88a185a5

      The first commit should be reverted because gpinitstandby was not
      changed to correctly conform to it. As a result, if the standby is
      down and synchronous_standby_names is *, the cluster will not start
      and will hang.

      The second, third, and fourth should be reverted because they are
      related to the first commit.
    • Remove unnecessary #includes. · f30355fa
      Heikki Linnakangas committed
    • 3e18f878
    • Fix a bug with replicated tables · 0e461e16
      Pengzhou Tang committed
      Previously, when creating a join path between a CdbLocusType_SingleQE
      path and a CdbLocusType_SegmentGeneral path, we always added a motion
      on top of the CdbLocusType_SegmentGeneral path, so that even if the
      join path is promoted to execute on the QD, the
      CdbLocusType_SegmentGeneral path can still be executed on the
      segments.

                           join (CdbLocusType_SingleQE)
                          /    \
                         /      \
      CdbLocusType_SingleQE     Gather Motion
                                  \
                                CdbLocusType_SegmentGeneral
      
      For example, joining replicated_table with
      (select * from partitioned_table limit 1) as t1:

      Nested Loop
        ->  Gather Motion 1:1
              ->  Seq Scan on replicated_table
        ->  Materialize
              ->  Subquery Scan on t1
                    ->  Limit
                          ->  Gather Motion 3:1
                                ->  Limit
                                      ->  Seq Scan on partitioned_table

      replicated_table only stores tuples on the segments, so without the
      gather motion, the seq scan of replicated_table would not provide any
      tuples.
      
      There is another problem: if the join path is not promoted to the QD,
      the gather motion might be redundant. For example:

        (select * from replicated_table, (select * from
        partitioned_table limit 1) t1) sub1;

      Gather Motion 3:1
        ->  Nested Loop
              ->  Seq Scan on partitioned_table_2
              ->  Materialize
                    ->  Broadcast Motion 1:3
                          ->  Nested Loop
                                ->  Gather Motion 1:1 (redundant motion)
                                      ->  Seq Scan on replicated_table
                                ->  Materialize
                                      ->  Subquery Scan on t1
                                            ->  Limit
                                                  ->  Gather Motion 3:1
                                                        ->  Limit
                                                              ->  Seq Scan on partitioned_table

      So in apply_motion_mutator() we omit such a redundant motion if it is
      not gathered to the top slice (QD). sliceDepth == 0 means the top
      slice; however, sliceDepth is shared by both init plans and the main
      plan, so if the main plan has increased sliceDepth, an init plan may
      omit its gather motion unexpectedly, which produces wrong results.

      The fix is simply to reset sliceDepth for init plans.
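
      A hypothetical sketch of the reset (names follow the message, not the
      verbatim patch):

          /* An init plan starts its own top slice: depth must not leak in
           * from the main plan, or the QD gather would be omitted. */
          int save_sliceDepth = context->sliceDepth;

          context->sliceDepth = 0;
          newnode = apply_motion_mutator((Node *) initPlan, context);
          context->sliceDepth = save_sliceDepth;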
    • Update test with new syntax · 64cdd47c
      Pengzhou Tang committed
    • Implement EXPAND syntax · cfe3f386
      Pengzhou Tang committed
      Implement "ALTER TABLE table EXPAND TABLE" to expand tables.
      
      "Expanding" and "Set Distributed by" are actually two different kind of
      operations on tables, old gpexpand used to use "Set Distributed by" to
      expand tables for historical reasons and our early version of expand
      were also squashed into "Set Distributed by", this make code hard to
      hard to understand and concept confused.
      
      This commit divide "Expanding" and "Set Distributed by" totally and
      implement "Expanding" with new syntax. We have two method to implement
      data movement, one is CTAS, another is RESHUFFLE, depend on how much
      data need to move. If tuples to move is less than 10000, choose
      RESHUFFLE, or if scale to move is less than 30% choose RESHUFFLE,
      otherwise, choose CTAS
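
      A hedged sketch of that choice heuristic (names are illustrative, not
      from the actual patch):

          if (tuples_to_move < 10000 || fraction_to_move < 0.30)
              method = DATA_MOVE_RESHUFFLE;
          else
              method = DATA_MOVE_CTAS;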
      
      For partitioned tables, we disallow expanding a leaf partition
      separately, because the root partition cannot have a different
      numsegments from its leaf partitions. SELECT/UPDATE would be fine if
      numsegments were inconsistent; however, INSERT would cause trouble,
      with data inserted into unexpected places.

      The new syntax is supposed to be used only by gpexpand and not be
      exposed to normal users, so there is no need to update the
      documentation.
    • Fix numsegments when appending multiple SingleQEs · fa86f160
      Ning Yu committed
      When an Append node contains a SingleQE subpath, we used to put the
      Append on ALL the segments; however, if the SingleQE is partially
      distributed then we obviously cannot put the SingleQE on ALL the
      segments, and this conflict could result in runtime errors or
      incorrect results.

      To fix this we should put the Append on the SingleQE's segments.

      On the other hand, when there are multiple SingleQE subpaths we
      should put the Append on the common segments of the SingleQEs.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
    • Reduce differences between reshuffle tests · 2eef2ba2
      Ning Yu committed
      There are 3 reshuffle tests: the ao one, the co one, and the heap one.
      They share almost the same cases, but differ in table names and
      CREATE TABLE options.  There are also some differences introduced
      when adding regression tests that were only added to one file but not
      the others.

      We want to keep minimal differences between these tests, so that a
      regression test for ao also covers the similar case for heap, and
      once we understand one of the test files we have almost the same
      knowledge of the others.
      
      Here is a list of changes to these tests:
      - reduce differences on table names by using schema;
      - reduce differences on CREATE TABLE options by setting default storage
        options;
      - simplify the creation of partially distributed tables by using the
        gp_debug_numsegments extension;
      - copy some regression tests to all the tests;
      - retire the no longer used helper function;
      - move the tests into an existing parallel test group;
      
      The pg_regress test framework provides some @@ tokens for ao/co
      tests; however, we still cannot merge the ao and co tests into one
      file, as WITH (OIDS) is only supported by ao but not co.
6. 22 November 2018, 13 commits
    • Pick a smarter Hashed locus for LEFT and RIGHT JOINs. · 3d6c78c9
      Heikki Linnakangas committed
      When determining the locus for a LEFT or RIGHT JOIN, we can use the outer
      side's distribution key as is. The EquivalenceClasses from the nullable
      side are not of interest above the join, and the outer side's distribution
      key can lead to better plans, because it can be made a Hashed locus,
      rather than HashedOJ. A Hashed locus can be used for grouping, for
      example, unlike a HashedOJ.
      
      This buys back better plans for some INSERT and CTAS queries that
      started to need Redistribute Motions after the previous commit.

      Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
    • Fix confusion with distribution keys of queries with FULL JOINs. · a25e2cd6
      Heikki Linnakangas committed
      There was some confusion about how NULLs are distributed when a
      CdbPathLocus is of Hashed or HashedOJ type. The comment in
      cdbpathlocus.h suggested that NULLs can be on any segment. But the
      rest of the code assumed that that's true only for HashedOJ, and that
      for Hashed, all NULLs are stored on a particular segment. There was a
      comment in cdbgroup.c that said "Or would HashedOJ ok, too?"; the
      answer to that is "No!". Given the comment in cdbpathlocus.h, I'm not
      surprised that the author was not very sure about that. Clarify the
      comments in cdbpathlocus.h and cdbgroup.c on that.
      
      There were a few cases where we got that actively wrong. The
      repartitionPlan() function is used to inject a Redistribute Motion
      into queries used for CREATE TABLE AS and INSERT, if the "current"
      locus didn't match the target table's policy. It did not check for
      HashedOJ. Because of that, if the query contained FULL JOINs, NULL
      values might end up on all segments. Code elsewhere, particularly in
      cdbgroup.c, assumes that all NULLs in a table are stored on a single
      segment, identified by the cdbhash value of a NULL datum. Fix that,
      by adding a check for HashedOJ in repartitionPlan(), and forcing a
      Redistribute Motion.
      
      CREATE TABLE AS had a similar problem, in the code to decide which
      distribution key to use, if the user didn't specify DISTRIBUTED BY
      explicitly. The default behaviour is to choose a distribution key that
      matches the distribution of the query, so that we can avoid adding an
      extra Redistribute Motion. After fixing repartitionPlan, there was no
      correctness problem, but if we chose the key based on a HashedOJ locus,
      there is no performance benefit because we'd need a Redistribute Motion
      anyway. So modify the code that chooses the CTAS distribution key to
      ignore HashedOJ.
      
      While we're at it, refactor the code to choose the CTAS distribution key,
      by moving it to a separate function. It had become ridiculously deeply
      indented.
      
      Fixes https://github.com/greenplum-db/gpdb/issues/6154, and adds tests.
      Reviewed-by: Melanie Plageman <mplageman@pivotal.io>
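
      A hedged sketch of the repartitionPlan() check described above (the
      CdbPathLocus_IsHashedOJ macro is real GPDB API; the surrounding names
      are illustrative):

          /* With HashedOJ, NULLs may legitimately be on any segment, so
           * the "current" locus can never be trusted to match the target
           * policy: force a Redistribute Motion. */
          if (CdbPathLocus_IsHashedOJ(locus))
              must_redistribute = true;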
    • Add tests for deriving distribution keys from query in CREATE TABLE AS. · 9457fe71
      Heikki Linnakangas committed
      The case where some, but not all, of the query's distribution keys
      were present in the result set was not covered by any existing tests.

      Per Paul Guo's observation.
    • Cosmetic fixes in the code to determine distribution key for CTAS. · a5fa3110
      Heikki Linnakangas committed
      Fix indentation. In the code to generate a NOTICE, remove an if() for
      a condition that we had already checked earlier in the function, and
      use a StringInfo for building the string.
    • Remove unused variables, to silence compiler warnings. · 88a185a5
      Heikki Linnakangas committed

      These were left behind by commit 7f6066ea.
    • Fix obsolete comment in cdb_build_distribution_pathkeys(). · 8c9c0576
      Heikki Linnakangas committed

      It returns a simple list of PathKeys, not a list of lists. The code
      was already changed in the 8.3-era merge of equivalence classes, but
      we neglected the comment.
    • Remove dead code in ATExecSetDistributedBy · 50f2e3bb
      Zhenghua Lyu committed
      This commit is the first step in refactoring ATExecSetDistributedBy.
      Its main purpose is to remove some dead code in this function; during
      the process we found that some helper functions could also be
      simplified, so that simplification is also in this commit.

      According to MPP-7770, we should disable changing storage options for
      now. It is ugly to just throw an error when encountering the
      `appendonly` option without removing the code, so this commit removes
      all the related logic.

      Because the WITH clause can only contain reshuffle|reorganize, we
      only build new_rel_opts if the table itself is ao|aoco; there is no
      need to deduce it from the WITH clause.

      We also remove the unnecessary checks at the start of this function,
      because these checks have already been done in `ATPrepCmd`.

      Co-authored-by: Shujie Zhang <shzhang@pivotal.io>
    • After applying that commit 22e04dc12df9e0577ba93a75dbef160c8c1ed258, the... · 1136f2fb
      Shaoqi Bai committed
      After applying commit 22e04dc12df9e0577ba93a75dbef160c8c1ed258, the
      master will block when the standby master is down.

      A couple of things need to be done to unblock the master:
      1. Run gpinitstandby -n to start the standby master back up.
      2. Run psql postgres -c "ALTER SYSTEM SET synchronous_standby_names = '';" and reload the master segment.

      Note that ALTER SYSTEM SET has to be called again to set
      synchronous_standby_names back to '*' (and the master config
      reloaded) to enable synchronous replication again. The thought is to
      make this one combined step in gpinitstandby -n instead of
      documenting a multi-step process, which is exactly what this commit
      does.

      Co-authored-by: Ning Yu <nyu@pivotal.io>
    • Remove Master/Standby SyncRepWait Greenplum hack · 7f6066ea
      Jimmy Yih committed
      When the standby master is unavailable, the master will not block on commits
      even though we enable synchronous replication. This is because we have a
      Greenplum hack which checks if the WAL stream with the standby master is
      valid. If the stream is invalid, the master will quickly skip the SyncRepWait
      and continue on its commit.
      
      Remove this hack in order to make Master/Standby and Primary/Mirror WAL
      replication more similar.
      Co-authored-by: Shaoqi Bai <sbai@pivotal.io>
    • Use max numsegments of subpaths for Append node · 1b2f7bcd
      Ning Yu committed
      Suppose t1 has numsegments=1 and t2 has numsegments=2; then the query
      below gets an incorrect plan:
      
          explain (costs off) select * from t2 a join t2 b using(c2)
                    union all select * from t1 c join t1 d using(c2);
                                         QUERY PLAN
          ------------------------------------------------------------------------
           Gather Motion 1:1  (slice3; segments: 1)
             ->  Append
                   ->  Hash Join
                         Hash Cond: (a.c2 = b.c2)
                         ->  Redistribute Motion 2:2  (slice1; segments: 2)
                               Hash Key: a.c2
                               ->  Seq Scan on t2 a
                         ->  Hash
                               ->  Redistribute Motion 2:2  (slice2; segments: 2)
                                     Hash Key: b.c2
                                     ->  Seq Scan on t2 b
                   ->  Hash Join
                         Hash Cond: (c.c2 = d.c2)
                         ->  Seq Scan on t1 c
                         ->  Hash
                               ->  Seq Scan on t1 d
           Optimizer: legacy query optimizer
          (17 rows)
      
      slice2 has a 2:2 redistribute motion to slice3; however, slice3 only
      has 1 segment. This is because the Append's numsegments was decided
      by the last subpath.

      To fix the issue we should use the max numsegments of the subpaths
      for the Append.

      The issue was already fixed in 39856768; we are only adding tests for
      it now.
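
      A hedged sketch of the rule (CdbPathLocus_NumSegments is a real GPDB
      macro; the surrounding names are illustrative). Since partially
      distributed tables occupy segments [0, numsegments), the union of the
      subpaths needs the maximum:

          int       numsegments = 0;
          ListCell *lc;

          foreach(lc, subpaths)
          {
              Path *subpath = (Path *) lfirst(lc);

              numsegments = Max(numsegments,
                                CdbPathLocus_NumSegments(subpath->locus));
          }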
    • New extension to debug partially distributed tables · 3119009a
      Ning Yu committed
      Introduced a new debugging extension gp_debug_numsegments to get / set
      the default numsegments when creating tables.
      
      gp_debug_get_create_table_default_numsegments() gets the default
      numsegments.
      
      gp_debug_set_create_table_default_numsegments(text) sets the default
      numsegments in text format; valid values are:
      - 'FULL': all the segments;
      - 'RANDOM': pick a random set of segments each time;
      - 'MINIMAL': the minimal set of segments.

      gp_debug_set_create_table_default_numsegments(integer) sets the
      default numsegments directly; the valid range is
      [1, gp_num_contents_in_cluster].

      gp_debug_reset_create_table_default_numsegments(text) or
      gp_debug_reset_create_table_default_numsegments(integer) resets the
      default numsegments to the specified value; the value can be reused
      later.

      gp_debug_reset_create_table_default_numsegments() resets the default
      numsegments to the value passed last time; if there has been no
      previous call to it, the value is 'FULL'.
      
      Refactored the ICG test partial_table.sql to create partial tables
      with this extension.
    • Fix dynahash HASH_ENTER usage · e576c0b9
      Daniel Gustafsson committed
      rel_partitioning_is_uniform() and addMCVToHashTable() inserted with
      HASH_ENTER, and subsequently checked the return value for NULL in
      order to error out on "out of memory". HASH_ENTER however doesn't
      return if it couldn't insert and will error out itself, so remove the
      test as the condition cannot happen.

      groupHashNew() was using HASH_ENTER_NULL, which does return NULL in
      out-of-memory situations, but it failed to correctly handle the
      return value and dereferenced it without a check, risking a null
      pointer dereference under memory pressure. Fix by using HASH_ENTER
      instead, as the code clearly expects that behavior.
      Reviewed-by: Paul Guo <paulguo@gmail.com>
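
      The two contracts, side by side (hash_search() is the real dynahash
      API; the table and key names are illustrative):

          bool  found;
          void *entry;

          /* HASH_ENTER either returns a valid entry or ereport()s on OOM,
           * so checking its result for NULL is dead code. */
          entry = hash_search(htab, &key, HASH_ENTER, &found);

          /* HASH_ENTER_NULL really can return NULL on OOM, so a caller
           * using it must check before dereferencing. */
          entry = hash_search(htab, &key, HASH_ENTER_NULL, &found);
          if (entry == NULL)
              ereport(ERROR,
                      (errcode(ERRCODE_OUT_OF_MEMORY),
                       errmsg("out of memory")));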
    • Another attempt at fixing Assertion on empty AO tables. · 77ac9bdf
      Heikki Linnakangas committed
      Commit cc2e211f attempted to silence the assertion in
      vac_update_relstats(), but the assertion it in turn added was hit
      heavily. The vacuum_appendonly_fill_stats() function, where I added
      the check for the zero-pages and non-zero-tuples combination, is also
      reached in QD mode, contrary to the comments and the assertion that I
      added. I'm not sure why we look at the totals in QD mode - AFAICS we
      just throw them away - but I'm reluctant to start restructuring this
      code right now. So move the code that zaps reltuples to 0 into
      vac_update_relstats().

      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
7. 21 November 2018, 1 commit
    • Set statistics to zero when no sampled rows · fb537b53
      Daniel Gustafsson committed
      In the unlikely event that we reach this codepath with a samplerows
      value of zero, avoid performing a division by zero and instead set
      the null fraction to zero, as we clearly don't have any more
      information to go on. The HLL code calls the compute_stats function
      pointer with zero samplerows, and while that's using a different
      compute_stats function, it's an easy mistake to make when not all
      functions can handle a division by zero. This is defensive
      programming prompted by a report that triggered an old bug like this
      without actually hitting this case, but there is little reason to
      take the risk of a crash. Suspenders go well with belts.

      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
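
      A minimal sketch of the defensive pattern described above (field and
      variable names are illustrative, modeled on PostgreSQL's analyze
      code):

          if (samplerows > 0)
              stats->stanullfrac = (double) null_cnt / (double) samplerows;
          else
              stats->stanullfrac = 0.0;   /* no samples, no information */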