1. 17 Nov 2020, 1 commit
    • Avoid checking distributed snapshot for visibility checks on QD · 48b13271
      Ashwin Agrawal authored
      This is a partial cherry-pick of commit
      b3f300b9. In the QD, distributed
      transactions become visible at the same time as the corresponding
      local ones, so we can rely on the local XIDs alone. This holds because
      the modifications of the local procarray and globalXactArray are
      protected by the same lock and are therefore atomic during transaction
      commit.
      
      We have seen many situations where catalog queries run very slowly on
      the QD, and a likely reason is the checking of distributed logs. The
      process-local distributed log cache falls short for this use case
      because most XIDs are unique, so it takes frequent cache misses. The
      shared-memory cache also falls short, as it caches only 8 pages while
      many more pages often need to be cached to be effective. A minimal
      sketch of the resulting QD short-circuit follows after this entry.
      Co-authored-by: Hubert Zhang <hzhang@pivotal.io>
      Co-authored-by: Gang Xiong <gangx@vmware.com>
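Below is a minimal standalone C sketch of the short-circuit described in this commit. It is not the GPDB code: the helper names, the enum, and the is_query_dispatcher flag are illustrative stand-ins for the real visibility machinery.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

typedef enum
{
    DTX_VISIBLE,        /* distributed xact committed before the snapshot */
    DTX_INVISIBLE,      /* distributed xact in progress in the snapshot */
    DTX_UNKNOWN         /* xid has no distributed mapping */
} DtxVisibility;

/* Assumed stand-ins for the real snapshot checks. */
extern bool          XidVisibleInLocalSnapshot(TransactionId xid);
extern DtxVisibility CheckDistributedSnapshot(TransactionId xid);

static bool
xid_visible(TransactionId xid, bool is_query_dispatcher)
{
    /*
     * On the QD, the local procarray and the global xact array are updated
     * under the same lock at commit, so a distributed transaction becomes
     * visible exactly when its local xid does: the local snapshot alone is
     * enough, and the distributed-log lookup can be skipped.
     */
    if (is_query_dispatcher)
        return XidVisibleInLocalSnapshot(xid);

    /* Segments: the distributed snapshot decides whenever a mapping exists. */
    DtxVisibility dv = CheckDistributedSnapshot(xid);

    if (dv != DTX_UNKNOWN)
        return dv == DTX_VISIBLE;
    return XidVisibleInLocalSnapshot(xid);
}
```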
  2. 20 Nov 2019, 1 commit
  3. 17 Sep 2018, 1 commit
  4. 29 Mar 2018, 1 commit
    • Enable autovacuum, but only for 'template0'. · 1800f406
      Heikki Linnakangas authored
      Autovacuum has been completely disabled in GPDB so far. In upstream
      PostgreSQL, even if you set autovacuum=off, it still runs when
      necessary to prevent XID wraparound, but in GPDB we would not launch
      it even for that.
      
      That is problematic for template0, and for any other database with
      datallowconn=false. If you cannot connect to a database, you cannot
      manually VACUUM it, so its datfrozenxid is never advanced. We had
      hacked our way around that by letting XID wraparound happen for
      databases with datallowconn=false. The theory was that template0 - and
      hopefully any other such database! - was fully frozen, so there was no
      harm in letting the XID counter wrap around. However, you run into
      trouble if you create a new database, using template0 as the template,
      around the time that XID wraparound for template0 is about to happen.
      The new database inherits the datfrozenxid value, and because it has
      datallowconn=true, the system immediately shuts down because it now
      looks like XID wraparound has happened.
      
      To fix, re-enable autovacuum, in a very limited fashion. The autovacuum
      launcher is now started, but it will only perform anti-wraparound vacuums,
      and only on template0.
      
      This also includes fixes for some garden-variety bugs that were
      introduced into autovacuum when merging with upstream and that went
      unnoticed because the code was unused.
      
      Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/gqordopb6Gg/-GHXSE4qBwAJ
      (cherry picked from commit 4e655714)
      
      Limit autovacuum on template0.
      
      - Exclude shared objects in template0 for vacuum
      - Exclude shared objects in template0 when updating datfrozenxid
      - Exclude distributed transactions when template0 is vacuumed
      - When vacuuming template0, only use oldestXmin from template0 instead of all databases
      
      Instead of using datallowconn to limit autovacuum, we use the
      template0 database name directly, because the autovacuum launcher can
      only access the pg_database flat file, which does not carry the
      datallowconn field in 5X.
      
      Because the template0 dbid can change over time, we store it in shared
      memory each time the autovacuum launcher starts and pass it to the
      autovacuum workers, as sketched after this entry.
      Co-authored-by: Xin Zhang <xzhang@pivotal.io>
      Co-authored-by: David Kimura <dkimura@pivotal.io>
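A rough sketch of the gating described above, assuming a hypothetical shared-memory struct and helper names (the real GPDB symbols differ): the launcher records template0's dbid once at startup, and a worker vacuums a database only if it is template0 and the vacuum is an anti-wraparound one.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;

/* Hypothetical shared-memory slot, written once by the launcher. */
typedef struct AutoVacuumShmemSketch
{
    Oid av_template0_dbid;      /* dbid of template0, looked up by name */
} AutoVacuumShmemSketch;

static AutoVacuumShmemSketch *AutoVacuumShmemPtr;   /* assumed to live in shmem */

/* Launcher side: remember template0's dbid, which can change over time. */
static void
launcher_remember_template0(Oid template0_dbid)
{
    AutoVacuumShmemPtr->av_template0_dbid = template0_dbid;
}

/* Worker side: vacuum only template0, and only for anti-wraparound. */
static bool
worker_should_vacuum(Oid dbid, bool is_anti_wraparound)
{
    return is_anti_wraparound &&
           dbid == AutoVacuumShmemPtr->av_template0_dbid;
}
```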
  5. 03 Feb 2018, 1 commit
    • Vacuum fix for ERROR updated tuple is already HEAP_MOVED_OFF. · aa5798a9
      Ashwin Agrawal authored
      `repair_frag()` should consult the distributed snapshot
      (`localXidSatisfiesAnyDistributedSnapshot()`) while following and
      moving chains of updated tuples. Vacuum consults the distributed
      snapshot (`localXidSatisfiesAnyDistributedSnapshot()`) to decide which
      tuples can be deleted and which cannot. For RECENTLY_DEAD tuples it
      used to make that decision based only on a comparison with OldestXmin,
      which is not sufficient; the distributed snapshot must be checked
      there as well, as in the sketch after this entry.
      
      Fixes #4298
      
      (cherry picked from commit 313ab24f)
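Roughly, the corrected test looks like the standalone C sketch below. The chain struct is a toy stand-in for the tuples repair_frag() walks, TransactionIdPrecedes() is simplified to `<`, and the distributed-snapshot helper is declared as an assumed stand-in.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Assumed stand-in mirroring localXidSatisfiesAnyDistributedSnapshot(). */
extern bool localXidSatisfiesAnyDistributedSnapshot(TransactionId xid);

/* Toy stand-in for a member of an update chain followed by repair_frag(). */
typedef struct ChainTupleSketch
{
    TransactionId            xmax;    /* deleter, 0 if the tuple is live */
    struct ChainTupleSketch *next;    /* newer version in the chain */
} ChainTupleSketch;

/*
 * A RECENTLY_DEAD chain member may be dropped only if its deleter is both
 * older than OldestXmin *and* invisible to every distributed snapshot;
 * otherwise it must be kept and moved along with the chain.
 */
static bool
chain_member_must_be_kept(const ChainTupleSketch *tup, TransactionId OldestXmin)
{
    if (tup->xmax == 0)
        return true;                                   /* live tuple */
    if (!(tup->xmax < OldestXmin))                     /* simplified precedes() */
        return true;                                   /* locally still recent */
    if (localXidSatisfiesAnyDistributedSnapshot(tup->xmax))
        return true;                                   /* globally still needed */
    return false;                                      /* safely dead everywhere */
}
```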
  6. 02 Feb 2018, 2 commits
  7. 01 Sep 2017, 1 commit
  8. 01 Jun 2017, 1 commit
    • Optimize DistributedSnapshot check and refactor to simplify. · 3c21b7d8
      Ashwin Agrawal authored
      Before this commit, the snapshot stored the distributed in-progress
      transactions (populated at snapshot creation) together with the
      corresponding localXids found later during tuple visibility checks
      (used as a cache for the reverse mapping) in a single, tightly coupled
      data structure, DistributedSnapshotMapEntry. Storing the information
      this way posed a couple of problems:
      
      1] Only one localXid can be cached per distributedXid. With
      sub-transactions the same distribXid can be associated with multiple
      localXids, but since only one can be cached, the other local xids
      associated with the distributedXid still have to consult the
      distributed_log.
      
      2] While performing a tuple visibility check, the code always had to
      loop over the full distributed in-progress array first, just to see
      whether a cached localXid could be used to avoid the reverse mapping.
      
      Now the distributed in-progress array and the local-xid cache are
      decoupled into separate structures. This allows multiple local xids to
      be stored per distributedXid, lets the local-xid cache be scanned only
      when the tuple's xid is relevant to it, and limits the scan to the
      number of elements actually cached instead of always covering the full
      distributed in-progress array even when nothing was cached. A rough
      sketch of the decoupled layout follows after this entry.
      
      Along the way, the relevant code was also refactored a bit to simplify
      it further.
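An illustrative sketch of the decoupled layout; the struct and field names are invented for illustration and are not the actual GPDB definitions.

```c
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t DistributedTransactionId;

/* Distributed in-progress xids, captured at snapshot creation (sorted). */
typedef struct DistributedSnapshotSketch
{
    DistributedTransactionId  xminAllDistributedSnapshots;
    int                       count;
    DistributedTransactionId *inProgressXidArray;
} DistributedSnapshotSketch;

/*
 * Separate cache of local xids already resolved against the distributed
 * snapshot. Because it is its own array, several local xids (e.g.
 * subtransactions) can be cached for one distributed xid, and visibility
 * checks scan only cachedCount entries, not the full in-progress array.
 */
typedef struct LocalXidCacheSketch
{
    int            cachedCount;
    TransactionId *cachedLocalXids;
} LocalXidCacheSketch;
```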
  9. 28 Apr 2017, 1 commit
    • HeapTupleSatisfiesVacuum consider also distributedXmin. · 0d7839d7
      Ashwin Agrawal authored
      Vacuum now uses the lowest distributed xid to determine the oldest
      transaction running *globally* in the cluster, to make sure a tuple is
      DEAD globally before removing it. HeapTupleSatisfiesVacuum() consults
      the distributed snapshot by reverse-mapping the localXid to a
      distributed xid and checking it against xminAllDistributedSnapshots to
      verify that the tuple is no longer needed anywhere. Note that the
      check is conservative: if it cannot check against the distributed
      snapshot (e.g. a utility-mode vacuum), it keeps the tuple rather than
      getting rid of it prematurely and reintroducing the same problem; see
      the sketch after this entry.
      
      This fixes the problem of tuples being removed while still needed.
      Earlier, the check was based only on local information (oldestXmin) on
      a segment, and hence could clean up a tuple still visible to a
      distributed query that had not yet reached the segment, which breaks
      snapshot isolation.
      
      Fixes #801.
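The conservative behaviour can be sketched as follows (standalone C; the tri-state helper is an assumption standing in for the reverse mapping plus the comparison against xminAllDistributedSnapshots): when the distributed check cannot be made, the tuple is kept.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

typedef enum
{
    GLOBALLY_NOT_NEEDED,    /* xid precedes xminAllDistributedSnapshots */
    GLOBALLY_NEEDED,        /* some distributed snapshot may still need it */
    CANNOT_TELL             /* e.g. utility-mode vacuum: no distributed info */
} GlobalNeedSketch;

/* Assumed stand-in for the localXid -> distributed xid reverse mapping and
 * the check against xminAllDistributedSnapshots. */
extern GlobalNeedSketch check_against_distributed_snapshots(TransactionId xid);

/* Simplified DEAD vs RECENTLY_DEAD decision for a deleted tuple. */
static bool
safe_to_remove(TransactionId xmax, TransactionId OldestXmin)
{
    if (!(xmax < OldestXmin))       /* simplified TransactionIdPrecedes() */
        return false;               /* still needed locally */

    switch (check_against_distributed_snapshots(xmax))
    {
        case GLOBALLY_NOT_NEEDED:
            return true;            /* dead everywhere: vacuum may remove it */
        case GLOBALLY_NEEDED:
        case CANNOT_TELL:           /* conservative: keep rather than risk */
        default:
            return false;
    }
}
```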
  10. 21 Apr 2017, 2 commits
  11. 01 Apr 2017, 1 commit
    • Optimize distributed xact commit check. · 692be1a1
      Ashwin Agrawal authored
      Leverage the fact that inProgressEntryArray is sorted by distribXid
      when the snapshot is created in createDtxSnapshot. This lets
      DistributedSnapshotWithLocalMapping_CommittedTest() break out of its
      scan early, as in the sketch after this entry.
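The early exit relies only on the sort order. A self-contained sketch (not the actual function, which also consults the local-xid cache and the snapshot's xmin/xmax bounds):

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t DistributedTransactionId;

/*
 * Because the in-progress array is sorted ascending when the snapshot is
 * created, the scan can stop as soon as an entry greater than the xid
 * being tested is reached: the xid cannot appear later in the array.
 */
static bool
dxid_in_progress(DistributedTransactionId dxid,
                 const DistributedTransactionId *inProgressXidArray,
                 int count)
{
    for (int i = 0; i < count; i++)
    {
        if (inProgressXidArray[i] == dxid)
            return true;            /* still in progress: not visible */
        if (inProgressXidArray[i] > dxid)
            break;                  /* sorted array: no point scanning further */
    }
    return false;                   /* not found among in-progress entries */
}
```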
  12. 18 Aug 2016, 1 commit
  13. 16 Jul 2016, 1 commit
    • When looking up the distributed XID for a local XID, go straight to the SLRU. · 3a0f9408
      Heikki Linnakangas authored
      We used to scan the list of LocalDistribXactData objects in shared
      memory to find the distributed XID corresponding to a local XID during
      visibility checks. That turns out to be unnecessary: by the time we
      scan the list, the distributed log SLRU has already been updated, so
      we might as well check it directly (see the sketch after this entry).
      
      Thanks to @ashwinstar for pointing this out.
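As an illustration of why the direct lookup is cheap: an SLRU is addressed by plain arithmetic on the xid, so no shared-memory list scan is needed. The constants and names below are illustrative, not the real distributed-log layout; each real entry also stores the distributed commit timestamp.

```c
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t DistributedTransactionId;

/* Illustrative layout only: page size really comes from BLCKSZ. */
#define SKETCH_BLCKSZ           8192
#define SKETCH_ENTRIES_PER_PAGE (SKETCH_BLCKSZ / sizeof(DistributedTransactionId))

/*
 * The page and the slot within the page follow directly from the local
 * xid, so the distributed XID can be read straight from the distributed
 * log SLRU instead of scanning the LocalDistribXactData list.
 */
static void
dxid_slot_for_local_xid(TransactionId xid, int64_t *pageno, int *entryno)
{
    *pageno  = (int64_t) (xid / SKETCH_ENTRIES_PER_PAGE);
    *entryno = (int) (xid % SKETCH_ENTRIES_PER_PAGE);
}
```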
  14. 10 May 2016, 1 commit
    • Remove WATCH_VISIBILITY_IN_ACTION debugging aid. · 28b12671
      Heikki Linnakangas authored
      It might be useful in debugging, but it got pretty badly in the way while
      merging with PostgreSQL 8.3. We could fix it, of course, but on balance,
      I don't think it's worth the effort. It's going to be a maintenance burden
      going forward too, as the WATCH_* calls are scattered all over the
      visibility checking code. If we need debugging code like that, we should
      find a less invasive way to implement it, or submit the mechanism to
      upstream so that we wouldn't need to maintain it as a diff.
  15. 28 Oct 2015, 1 commit