1. 17 Nov 2020, 1 commit
    • Avoid checking distributed snapshot for visibility checks on QD · 48b13271
      Ashwin Agrawal authored
      This is a partial cherry-pick of commit
      b3f300b9. In the QD, distributed
      transactions become visible at the same time as the corresponding
      local ones, so we can rely on the local XIDs alone. This holds because
      the modifications of the local procarray and globalXactArray are
      protected by the same lock and are therefore atomic during transaction
      commit.
      
      We have seen many situations where catalog queries run very slowly on
      the QD, and a likely reason is the checking of distributed logs. The
      process-local distributed log cache falls short for this use case
      because most XIDs are unique, so it takes frequent cache misses. The
      shared-memory cache also falls short, as it caches only 8 pages while
      many more pages often need to be cached to be effective. A minimal
      sketch of the resulting QD short-circuit follows after this entry.
      Co-authored-by: Hubert Zhang <hzhang@pivotal.io>
      Co-authored-by: Gang Xiong <gangx@vmware.com>
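Below is a minimal standalone C sketch of the short-circuit described in this commit. It is not the GPDB code: the helper names, the enum, and the is_query_dispatcher flag are illustrative stand-ins for the real visibility machinery.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

typedef enum
{
    DTX_VISIBLE,        /* distributed xact committed before the snapshot */
    DTX_INVISIBLE,      /* distributed xact in progress in the snapshot */
    DTX_UNKNOWN         /* xid has no distributed mapping */
} DtxVisibility;

/* Assumed stand-ins for the real snapshot checks. */
extern bool          XidVisibleInLocalSnapshot(TransactionId xid);
extern DtxVisibility CheckDistributedSnapshot(TransactionId xid);

static bool
xid_visible(TransactionId xid, bool is_query_dispatcher)
{
    /*
     * On the QD, the local procarray and the global xact array are updated
     * under the same lock at commit, so a distributed transaction becomes
     * visible exactly when its local xid does: the local snapshot alone is
     * enough, and the distributed-log lookup can be skipped.
     */
    if (is_query_dispatcher)
        return XidVisibleInLocalSnapshot(xid);

    /* Segments: the distributed snapshot decides whenever a mapping exists. */
    DtxVisibility dv = CheckDistributedSnapshot(xid);

    if (dv != DTX_UNKNOWN)
        return dv == DTX_VISIBLE;
    return XidVisibleInLocalSnapshot(xid);
}
```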
  2. 20 Nov 2019, 1 commit
  3. 17 Sep 2018, 1 commit
  4. 29 Mar 2018, 1 commit
    • Enable autovacuum, but only for 'template0'. · 1800f406
      Heikki Linnakangas authored
      Autovacuum has been completely disabled in GPDB so far. In upstream
      PostgreSQL, even if you set autovacuum=off, it still runs when
      necessary to prevent XID wraparound, but in GPDB we would not launch
      it even for that.
      
      That is problematic for template0, and for any other database with
      datallowconn=false. If you cannot connect to a database, you cannot
      manually VACUUM it, so its datfrozenxid is never advanced. We had
      hacked our way around that by letting XID wraparound happen for
      databases with datallowconn=false. The theory was that template0 - and
      hopefully any other such database! - was fully frozen, so there was no
      harm in letting the XID counter wrap around. However, you run into
      trouble if you create a new database, using template0 as the template,
      around the time that XID wraparound for template0 is about to happen.
      The new database inherits the datfrozenxid value, and because it has
      datallowconn=true, the system immediately shuts down because it now
      looks like XID wraparound has happened.
      
      To fix, re-enable autovacuum, in a very limited fashion. The autovacuum
      launcher is now started, but it will only perform anti-wraparound vacuums,
      and only on template0.
      
      This also includes fixes for some garden-variety bugs that were
      introduced into autovacuum when merging with upstream and that went
      unnoticed because the code was unused.
      
      Discussion: https://groups.google.com/a/greenplum.org/d/msg/gpdb-dev/gqordopb6Gg/-GHXSE4qBwAJ
      (cherry picked from commit 4e655714)
      
      Limit autovacuum on template0.
      
      - Exclude shared objects in template0 for vacuum
      - Exclude shared objects in template0 when updating datfrozenxid
      - Exclude distributed transactions when template0 is vacuumed
      - When vacuuming template0, only use oldestXmin from template0 instead of all databases
      
      Instead of using datallowconn to limit autovacuum, we use the
      template0 database name directly, because the autovacuum launcher can
      only access the pg_database flat file, which does not carry the
      datallowconn field in 5X.
      
      Because the template0 dbid can change over time, we store it in shared
      memory each time the autovacuum launcher starts and pass it to the
      autovacuum workers, as sketched after this entry.
      Co-authored-by: Xin Zhang <xzhang@pivotal.io>
      Co-authored-by: David Kimura <dkimura@pivotal.io>
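A rough sketch of the gating described above, assuming a hypothetical shared-memory struct and helper names (the real GPDB symbols differ): the launcher records template0's dbid once at startup, and a worker vacuums a database only if it is template0 and the vacuum is an anti-wraparound one.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;

/* Hypothetical shared-memory slot, written once by the launcher. */
typedef struct AutoVacuumShmemSketch
{
    Oid av_template0_dbid;      /* dbid of template0, looked up by name */
} AutoVacuumShmemSketch;

static AutoVacuumShmemSketch *AutoVacuumShmemPtr;   /* assumed to live in shmem */

/* Launcher side: remember template0's dbid, which can change over time. */
static void
launcher_remember_template0(Oid template0_dbid)
{
    AutoVacuumShmemPtr->av_template0_dbid = template0_dbid;
}

/* Worker side: vacuum only template0, and only for anti-wraparound. */
static bool
worker_should_vacuum(Oid dbid, bool is_anti_wraparound)
{
    return is_anti_wraparound &&
           dbid == AutoVacuumShmemPtr->av_template0_dbid;
}
```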
  5. 03 Feb 2018, 1 commit
    • Vacuum fix for ERROR updated tuple is already HEAP_MOVED_OFF. · aa5798a9
      Ashwin Agrawal authored
      `repair_frag()` should consult the distributed snapshot
      (`localXidSatisfiesAnyDistributedSnapshot()`) while following and
      moving chains of updated tuples. Vacuum consults the distributed
      snapshot (`localXidSatisfiesAnyDistributedSnapshot()`) to decide which
      tuples can be deleted and which cannot. For RECENTLY_DEAD tuples it
      used to make that decision based only on a comparison with OldestXmin,
      which is not sufficient; the distributed snapshot must be checked
      there as well, as in the sketch after this entry.
      
      Fixes #4298
      
      (cherry picked from commit 313ab24f)
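Roughly, the corrected test looks like the standalone C sketch below. The chain struct is a toy stand-in for the tuples repair_frag() walks, TransactionIdPrecedes() is simplified to `<`, and the distributed-snapshot helper is declared as an assumed stand-in.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Assumed stand-in mirroring localXidSatisfiesAnyDistributedSnapshot(). */
extern bool localXidSatisfiesAnyDistributedSnapshot(TransactionId xid);

/* Toy stand-in for a member of an update chain followed by repair_frag(). */
typedef struct ChainTupleSketch
{
    TransactionId            xmax;    /* deleter, 0 if the tuple is live */
    struct ChainTupleSketch *next;    /* newer version in the chain */
} ChainTupleSketch;

/*
 * A RECENTLY_DEAD chain member may be dropped only if its deleter is both
 * older than OldestXmin *and* invisible to every distributed snapshot;
 * otherwise it must be kept and moved along with the chain.
 */
static bool
chain_member_must_be_kept(const ChainTupleSketch *tup, TransactionId OldestXmin)
{
    if (tup->xmax == 0)
        return true;                                   /* live tuple */
    if (!(tup->xmax < OldestXmin))                     /* simplified precedes() */
        return true;                                   /* locally still recent */
    if (localXidSatisfiesAnyDistributedSnapshot(tup->xmax))
        return true;                                   /* globally still needed */
    return false;                                      /* safely dead everywhere */
}
```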
  6. 02 Feb 2018, 2 commits
  7. 01 Sep 2017, 1 commit
  8. 01 Jun 2017, 1 commit
    • Optimize DistributedSnapshot check and refactor to simplify. · 3c21b7d8
      Ashwin Agrawal authored
      Before this commit, the snapshot stored the distributed in-progress
      transactions (populated at snapshot creation) together with the
      corresponding localXids found later during tuple visibility checks
      (used as a cache for the reverse mapping) in a single, tightly coupled
      data structure, DistributedSnapshotMapEntry. Storing the information
      this way posed a couple of problems:
      
      1] Only one localXid can be cached per distributedXid. With
      sub-transactions the same distribXid can be associated with multiple
      localXids, but since only one can be cached, the other local xids
      associated with the distributedXid still have to consult the
      distributed_log.
      
      2] While performing a tuple visibility check, the code always had to
      loop over the full distributed in-progress array first, just to see
      whether a cached localXid could be used to avoid the reverse mapping.
      
      Now the distributed in-progress array and the local-xid cache are
      decoupled into separate structures. This allows multiple local xids to
      be stored per distributedXid, lets the local-xid cache be scanned only
      when the tuple's xid is relevant to it, and limits the scan to the
      number of elements actually cached instead of always covering the full
      distributed in-progress array even when nothing was cached. A rough
      sketch of the decoupled layout follows after this entry.
      
      Along the way, the relevant code was also refactored a bit to simplify
      it further.
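An illustrative sketch of the decoupled layout; the struct and field names are invented for illustration and are not the actual GPDB definitions.

```c
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t DistributedTransactionId;

/* Distributed in-progress xids, captured at snapshot creation (sorted). */
typedef struct DistributedSnapshotSketch
{
    DistributedTransactionId  xminAllDistributedSnapshots;
    int                       count;
    DistributedTransactionId *inProgressXidArray;
} DistributedSnapshotSketch;

/*
 * Separate cache of local xids already resolved against the distributed
 * snapshot. Because it is its own array, several local xids (e.g.
 * subtransactions) can be cached for one distributed xid, and visibility
 * checks scan only cachedCount entries, not the full in-progress array.
 */
typedef struct LocalXidCacheSketch
{
    int            cachedCount;
    TransactionId *cachedLocalXids;
} LocalXidCacheSketch;
```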
  9. 28 Apr 2017, 1 commit
    • HeapTupleSatisfiesVacuum consider also distributedXmin. · 0d7839d7
      Ashwin Agrawal authored
      Vacuum now uses the lowest distributed xid to determine the oldest
      transaction running *globally* in the cluster, to make sure a tuple is
      DEAD globally before removing it. HeapTupleSatisfiesVacuum() consults
      the distributed snapshot by reverse-mapping the localXid to a
      distributed xid and checking it against xminAllDistributedSnapshots to
      verify that the tuple is no longer needed anywhere. Note that the
      check is conservative: if it cannot check against the distributed
      snapshot (e.g. a utility-mode vacuum), it keeps the tuple rather than
      getting rid of it prematurely and reintroducing the same problem; see
      the sketch after this entry.
      
      This fixes the problem of tuples being removed while still needed.
      Earlier, the check was based only on local information (oldestXmin) on
      a segment, and hence could clean up a tuple still visible to a
      distributed query that had not yet reached the segment, which breaks
      snapshot isolation.
      
      Fixes #801.
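The conservative behaviour can be sketched as follows (standalone C; the tri-state helper is an assumption standing in for the reverse mapping plus the comparison against xminAllDistributedSnapshots): when the distributed check cannot be made, the tuple is kept.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

typedef enum
{
    GLOBALLY_NOT_NEEDED,    /* xid precedes xminAllDistributedSnapshots */
    GLOBALLY_NEEDED,        /* some distributed snapshot may still need it */
    CANNOT_TELL             /* e.g. utility-mode vacuum: no distributed info */
} GlobalNeedSketch;

/* Assumed stand-in for the localXid -> distributed xid reverse mapping and
 * the check against xminAllDistributedSnapshots. */
extern GlobalNeedSketch check_against_distributed_snapshots(TransactionId xid);

/* Simplified DEAD vs RECENTLY_DEAD decision for a deleted tuple. */
static bool
safe_to_remove(TransactionId xmax, TransactionId OldestXmin)
{
    if (!(xmax < OldestXmin))       /* simplified TransactionIdPrecedes() */
        return false;               /* still needed locally */

    switch (check_against_distributed_snapshots(xmax))
    {
        case GLOBALLY_NOT_NEEDED:
            return true;            /* dead everywhere: vacuum may remove it */
        case GLOBALLY_NEEDED:
        case CANNOT_TELL:           /* conservative: keep rather than risk */
        default:
            return false;
    }
}
```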
  10. 21 Apr 2017, 2 commits
  11. 01 Apr 2017, 1 commit
    • Optimize distributed xact commit check. · 692be1a1
      Ashwin Agrawal authored
      Leverage the fact that inProgressEntryArray is sorted by distribXid
      when the snapshot is created in createDtxSnapshot. This lets
      DistributedSnapshotWithLocalMapping_CommittedTest() break out of its
      scan early, as in the sketch after this entry.
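The early exit relies only on the sort order. A self-contained sketch (not the actual function, which also consults the local-xid cache and the snapshot's xmin/xmax bounds):

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t DistributedTransactionId;

/*
 * Because the in-progress array is sorted ascending when the snapshot is
 * created, the scan can stop as soon as an entry greater than the xid
 * being tested is reached: the xid cannot appear later in the array.
 */
static bool
dxid_in_progress(DistributedTransactionId dxid,
                 const DistributedTransactionId *inProgressXidArray,
                 int count)
{
    for (int i = 0; i < count; i++)
    {
        if (inProgressXidArray[i] == dxid)
            return true;            /* still in progress: not visible */
        if (inProgressXidArray[i] > dxid)
            break;                  /* sorted array: no point scanning further */
    }
    return false;                   /* not found among in-progress entries */
}
```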
  12. 18 Aug 2016, 1 commit
  13. 16 Jul 2016, 1 commit
    • When looking up the distributed XID for a local XID, go straight to the SLRU. · 3a0f9408
      Heikki Linnakangas authored
      We used to scan the list of LocalDistribXactData objects in shared
      memory to find the distributed XID corresponding to a local XID during
      visibility checks. That turns out to be unnecessary: by the time we
      scan the list, the distributed log SLRU has already been updated, so
      we might as well check it directly (see the sketch after this entry).
      
      Thanks to @ashwinstar for pointing this out.
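As an illustration of why the direct lookup is cheap: an SLRU is addressed by plain arithmetic on the xid, so no shared-memory list scan is needed. The constants and names below are illustrative, not the real distributed-log layout; each real entry also stores the distributed commit timestamp.

```c
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t DistributedTransactionId;

/* Illustrative layout only: page size really comes from BLCKSZ. */
#define SKETCH_BLCKSZ           8192
#define SKETCH_ENTRIES_PER_PAGE (SKETCH_BLCKSZ / sizeof(DistributedTransactionId))

/*
 * The page and the slot within the page follow directly from the local
 * xid, so the distributed XID can be read straight from the distributed
 * log SLRU instead of scanning the LocalDistribXactData list.
 */
static void
dxid_slot_for_local_xid(TransactionId xid, int64_t *pageno, int *entryno)
{
    *pageno  = (int64_t) (xid / SKETCH_ENTRIES_PER_PAGE);
    *entryno = (int) (xid % SKETCH_ENTRIES_PER_PAGE);
}
```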
  14. 10 May 2016, 1 commit
    • Remove WATCH_VISIBILITY_IN_ACTION debugging aid. · 28b12671
      Heikki Linnakangas authored
      It might be useful in debugging, but it got pretty badly in the way while
      merging with PostgreSQL 8.3. We could fix it, of course, but on balance,
      I don't think it's worth the effort. It's going to be a maintenance burden
      going forward too, as the WATCH_* calls are scattered all over the
      visibility checking code. If we need debugging code like that, we should
      find a less invasive way to implement it, or submit the mechanism to
      upstream so that we wouldn't need to maintain it as a diff.
  15. 28 Oct 2015, 1 commit