1. 09 Nov 2020 (5 commits)
    • Revert "ic-proxy: refresh peers on demand" · 7f558913
      Committed by Hubert Zhang
      This reverts commit 9265ea6a.
    • ic-proxy: refresh peers on demand · 9265ea6a
      Committed by Ning Yu
      The user can adjust the ic-proxy peer addresses at runtime and reload
      them by sending SIGHUP. If an address is modified or removed, the
      corresponding peer connection must be closed or reestablished. The same
      applies to the peer listener: if the listener port is changed, the
      listener must be set up again.
    • ic-proxy: classify peer addresses · 854c4b84
      Committed by Ning Yu
      The peer addresses are specified with the GUC
      gp_interconnect_proxy_addresses, which can be reloaded on SIGHUP. We
      used to care only about newly added addresses, but the user can also
      modify them, or even remove some of them.

      So we now add logic to classify the addresses after parsing the GUC,
      so that we can tell whether an address was added, removed, or modified
      (a sketch of such a pass is shown below).

      The handling of the classified addresses will be done in the next
      commit.
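      A minimal sketch of what such a classification pass could look like,
      using hypothetical "proxy_addr" and find_addr() helpers rather than the
      real ic-proxy types:

          /* Hypothetical types/helpers for illustration; the real ic-proxy
           * structures (ICProxyAddr etc.) differ. */
          #include <string.h>

          typedef struct proxy_addr
          {
              int  dbid;          /* which segment this address belongs to */
              char host[64];
              int  port;
          } proxy_addr;

          static proxy_addr *
          find_addr(proxy_addr *list, int n, int dbid)
          {
              for (int i = 0; i < n; i++)
                  if (list[i].dbid == dbid)
                      return &list[i];
              return NULL;
          }

          /* Compare the old and new lists: added, removed, or modified. */
          static void
          classify_addrs(proxy_addr *oldl, int nold, proxy_addr *newl, int nnew)
          {
              for (int i = 0; i < nnew; i++)
              {
                  proxy_addr *prev = find_addr(oldl, nold, newl[i].dbid);

                  if (prev == NULL)
                      ;    /* added: set up a new peer connection */
                  else if (strcmp(prev->host, newl[i].host) != 0 ||
                           prev->port != newl[i].port)
                      ;    /* modified: reconnect with the new address */
              }

              for (int i = 0; i < nold; i++)
                  if (find_addr(newl, nnew, oldl[i].dbid) == NULL)
                      ;    /* removed: close the stale peer connection */
          }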
    • ic-proxy: optimize looking up of my addr · 40facdb1
      Committed by Ning Yu
      We used to scan the whole addr list to find my addr; now we record it
      directly while parsing the addresses.
    • ic-proxy: rename ICProxyAddr.addr to sockaddr · 2c2ca626
      Committed by Ning Yu
      An ICProxyAddr variable is usually named "addr", so the attribute is
      referred to as "addr->addr", which is confusing and sometimes
      ambiguous.

      So the attribute is renamed to "sockaddr", and the function
      ic_proxy_extract_addr() is also renamed to ic_proxy_extract_sockaddr().
  2. 28 Oct 2020 (1 commit)
    • mask all signals in the udp pthreads · 54451fc0
      Committed by 盏一
      In some cases, signals (like SIGQUIT) that should only be processed by
      the main thread of the postmaster may be dispatched to rxThread. So we
      should, and it is safe to, block all signals in the udp pthreads.

      Fix #11006
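      A common way to do this is to block every signal before the rx/tx
      pthreads are created, so they inherit a fully blocked mask; a minimal
      sketch (not the actual ic_udpifc code, the thread name is illustrative):

          #include <pthread.h>
          #include <signal.h>

          static void *rx_thread_main(void *arg);   /* hypothetical thread body */

          static void
          start_rx_thread(void)
          {
              sigset_t   all;
              sigset_t   saved;
              pthread_t  tid;

              /* Block every signal before creating the thread; the new thread
               * inherits the mask, so SIGQUIT etc. stay with the main thread. */
              sigfillset(&all);
              pthread_sigmask(SIG_BLOCK, &all, &saved);

              pthread_create(&tid, NULL, rx_thread_main, NULL);

              /* Restore the original mask in the creating thread. */
              pthread_sigmask(SIG_SETMASK, &saved, NULL);
          }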
  3. 28 Sep 2020 (1 commit)
  4. 10 Sep 2020 (1 commit)
    • ic-proxy: support hostname as proxy addresses · 2a1794bc
      Committed by Ning Yu
      The GUC gp_interconnect_proxy_addresses is used to set the listener
      addresses and ports of all the proxy bgworkers. Previously only IP
      addresses were supported, which is inconvenient to use.

      Now we also support hostnames; IP addresses are still accepted.

      Note that if a hostname is bound to a different IP at runtime, we must
      reload the setting with the "gpstop -u" command.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
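      Resolving either form can be done with getaddrinfo(), which accepts both
      hostnames and numeric addresses; a minimal sketch, not the actual
      ic-proxy parsing code:

          #include <netdb.h>
          #include <string.h>
          #include <sys/socket.h>

          /* Resolve a hostname or numeric IP plus port into a sockaddr.
           * Returns 0 on success, non-zero on failure. */
          static int
          resolve_proxy_addr(const char *host, const char *port,
                             struct sockaddr_storage *out)
          {
              struct addrinfo hints;
              struct addrinfo *res;
              int ret;

              memset(&hints, 0, sizeof(hints));
              hints.ai_family = AF_UNSPEC;        /* IPv4 or IPv6 */
              hints.ai_socktype = SOCK_STREAM;

              ret = getaddrinfo(host, port, &hints, &res);
              if (ret != 0)
                  return ret;

              /* Take the first result; real code may iterate over all of them. */
              memcpy(out, res->ai_addr, res->ai_addrlen);
              freeaddrinfo(res);
              return 0;
          }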
  5. 02 Sep 2020 (2 commits)
    • Fix compile error for missing brackets · b2d32cb9
      Committed by Hubert Zhang
    • Using lwlock to protect resgroup slot in session state · a4cb06b4
      Committed by Hubert Zhang
      Resource group used to access resGroupSlot in SessionState without a
      lock. This is correct as long as each session only accesses its own
      resGroupSlot. But since we introduced the runaway feature, we need to
      traverse the session array to find the top consumer session when the
      red zone is reached. This requires:
      1. the runaway detector holds the resgroup lock in shared mode, so that
      a resGroupSlot cannot be detached from a session concurrently while the
      red zone is being handled;
      2. a normal session holds the lock in exclusive mode when modifying the
      resGroupSlot in its SessionState.

      Also fix a compile warning.
      Reviewed-by: Ning Yu <nyu@pivotal.io>
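      In PostgreSQL terms this is the usual shared-versus-exclusive LWLock
      pattern; a simplified sketch, with the lock name and the slot handling
      used for illustration rather than taken from the real resgroup code:

          #include "postgres.h"
          #include "storage/lwlock.h"

          /* Illustrative lock pointer; the real resgroup code differs. */
          extern LWLock *ResGroupLock;

          /* Runaway detector: only reads slots, a shared lock is enough. */
          static void
          scan_sessions_for_top_consumer(void)
          {
              LWLockAcquire(ResGroupLock, LW_SHARED);
              /* ... walk the session array and inspect each resGroupSlot ... */
              LWLockRelease(ResGroupLock);
          }

          /* Normal session: detaching/attaching its slot modifies shared
           * state, so it must take the lock exclusively. */
          static void
          detach_resgroup_slot(void)
          {
              LWLockAcquire(ResGroupLock, LW_EXCLUSIVE);
              /* ... unlink the resGroupSlot from this SessionState ... */
              LWLockRelease(ResGroupLock);
          }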
  6. 01 Sep 2020 (1 commit)
  7. 14 Aug 2020 (1 commit)
    • Using libuv 1.18 API in ic-proxy · ab36eb90
      Committed by Hubert Zhang
      ic-proxy is developed with libuv; the minimal supported libuv version
      is 1.18.0. But commit 608514 introduced APIs that are new in libuv
      1.19, which breaks compatibility on OSes like Ubuntu 18.04, whose
      default libuv version is 1.18.

      We should keep our code base aligned with libuv 1.18 and replace the
      libuv 1.19 API calls. The change is mainly about how the data field of
      a uv handle or uv loop is accessed: the new API uses accessor functions
      such as `uv_handle_set_data` and `uv_handle_get_data`, while the old
      1.18 API accesses the data field directly. Note that the latest libuv
      release (1.38.2) supports both styles, and libuv is stable enough to
      keep supporting the old style for a long time.
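      For illustration, the two styles of accessing the per-handle data field
      (a sketch; uv_timer_t is just an example handle type):

          #include <uv.h>

          /* Illustrative only: attach a per-handle context pointer. */
          static void
          attach_context(uv_timer_t *timer, void *ctx)
          {
          #if UV_VERSION_HEX >= 0x011300
              /* libuv >= 1.19: accessor functions (the style being replaced). */
              uv_handle_set_data((uv_handle_t *) timer, ctx);
          #else
              /* libuv 1.18: access the data field directly. */
              timer->data = ctx;
          #endif
          }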
  8. 12 Aug 2020 (2 commits)
    • Fix compilation without libuv's uv.h header. · 7858128f
      Committed by Heikki Linnakangas
      ic_proxy_backend.h includes libuv's uv.h header, and ic_proxy_backend.h
      was being included in ic_tcp.c even when compiling with
      --disable-ic-proxy.
    • ic-proxy: support parallel backend registration to proxy · 608514c5
      Committed by Hubert Zhang
      Previously, when backends connected to a proxy, we had to set up the
      domain socket pipe and send the HELLO message (and receive the ACK
      message) in a blocking, non-parallel way. This made it hard to check
      for interrupts during backend registration in ic-proxy.

      By utilizing the libuv loop, we can register backends in parallel. Note
      that this is one step towards replacing the ic_tcp backend logic that
      ic_proxy currently reuses. In the future, we should use libuv for all
      of the backend logic, from registration to sending/receiving data.
      Co-authored-by: Ning Yu <nyu@pivotal.io>
  9. 10 Aug 2020 (1 commit)
    • ic-proxy: type checking in ic_proxy_new() · a3ef623d
      Committed by Ning Yu
      A typical mistake when allocating typed memory looks like this:

          int64 *ptr = malloc(sizeof(int32));

      To prevent this, ic_proxy_new() is now a typed allocator: it always
      returns a pointer of the specified type, for example:

          int64 *p1 = ic_proxy_new(int64); /* good */
          int64 *p2 = ic_proxy_new(int32); /* bad, gcc will raise a warning */
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
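      The warning falls out of ordinary C pointer-type checking once the
      allocator casts its result to the requested type; a minimal sketch of
      such a macro (not necessarily the real ic_proxy_new() definition):

          #include <stdlib.h>

          /* Allocate one object of the given type and return a correctly
           * typed pointer; assigning it to a mismatched pointer type makes
           * the compiler warn (-Wincompatible-pointer-types in gcc/clang). */
          #define ic_proxy_new(type)  ((type *) calloc(1, sizeof(type)))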
  10. 06 Aug 2020 (1 commit)
  11. 04 Aug 2020 (1 commit)
    • ic-proxy: correct SIGHUP handler · a181655b
      Committed by Ning Yu
      Fixed the bug that the SIGHUP handler was installed for SIGINT by
      mistake, so the ic-proxy bgworkers would die on SIGHUP.

      By using the correct signal name, the ic-proxy bgworkers can now reload
      postgresql.conf when "gpstop -u" is executed.
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
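      For context, the usual background-worker pattern: install the handler
      with pqsignal() and re-read postgresql.conf from the main loop; a
      simplified sketch with illustrative handler, flag, and entry-point
      names:

          #include "postgres.h"

          #include <signal.h>

          #include "postmaster/bgworker.h"
          #include "utils/guc.h"

          static volatile sig_atomic_t got_sighup = false;

          static void
          handle_sighup(SIGNAL_ARGS)
          {
              got_sighup = true;
          }

          void
          ic_proxy_worker_main(Datum arg)   /* hypothetical bgworker entry */
          {
              pqsignal(SIGHUP, handle_sighup);   /* the bug: this was SIGINT */
              BackgroundWorkerUnblockSignals();

              for (;;)
              {
                  if (got_sighup)
                  {
                      got_sighup = false;
                      ProcessConfigFile(PGC_SIGHUP);  /* re-read postgresql.conf */
                  }
                  /* ... run the proxy event loop ... */
              }
          }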
  12. 03 Aug 2020 (1 commit)
    • ic-proxy: handle early coming BYE correctly · 79ff4e62
      Committed by Ning Yu
      In a query that contains multiple init/sub plans, the packets of the
      second subplan might be received while the first is still being
      processed in ic-proxy mode; this is because in ic-proxy mode a local
      host handshake is used instead of the global one.

      To distinguish the packets of different subplans, especially the early
      coming ones, we must stop handling packets as soon as the BYE arrives,
      and pass any unhandled early coming packets on to the successor or the
      placeholder.

      This fixes the random hanging during the ICW parallel group of
      qp_functions_in_from.  No new test is added.
      Co-authored-by: Hubert Zhang <hzhang@pivotal.io>
      Co-authored-by: Ning Yu <nyu@pivotal.io>
  13. 29 Jul 2020 (1 commit)
    • ic-proxy: include postmaster pid in the domain socket path · 5c5a358a
      Committed by Ning Yu
      We used to store the domain socket files under /tmp/ and include the
      postmaster port number in the file name, in the hope that two clusters
      would not conflict with each other on this file.

      However, the conflict still happens in the test src/bin/pg_basebackup,
      and it can also happen if a second cluster is misconfigured by
      accident. So to make things safe we also include the postmaster pid in
      the domain socket path; two postmasters can never share the same pid.
      Reviewed-by: Paul Guo <pguo@pivotal.io>
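      Building such a path boils down to formatting the port and the pid into
      the socket file name; a sketch with an illustrative name pattern (the
      real ic-proxy path format may differ):

          #include <stdio.h>
          #include <unistd.h>

          /* Compose a per-cluster, per-postmaster domain socket path, e.g.
           * "/tmp/.s.ic_proxy.5432.12345"; both the port and the pid go into
           * the name so two clusters (or a stale postmaster) cannot collide. */
          static void
          build_proxy_socket_path(char *buf, size_t buflen, int postmaster_port)
          {
              snprintf(buf, buflen, "/tmp/.s.ic_proxy.%d.%d",
                       postmaster_port, (int) getpid());
          }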
  14. 23 Jul 2020 (2 commits)
    • ic-proxy: reload addresses on SIGHUP · c2523232
      Committed by Ning Yu
      We used to mark the GUC gp_interconnect_proxy_addresses as
      PGC_POSTMASTER, so the cluster had to be restarted to reload this
      setting. This can be a problem during gpexpand: the cluster expansion
      itself is online, but configuring the proxy addresses for the new
      segments required a restart.

      Now the GUC is changed to PGC_SIGHUP, so the setting can be reloaded on
      SIGHUP.

      Also changed the setting from a developer option to a normal one.
    • ic-proxy: do not generate too many messages · c6c36cc8
      Committed by Ning Yu
  15. 10 Jul 2020 (3 commits)
    • ic-proxy: enable ic-proxy with --enable-ic-proxy · 81810a20
      Committed by Ning Yu
      We used to use the configure option --with-libuv to enable ic-proxy,
      but the purpose of that option was not straightforward to understand.
      So we renamed it to --enable-ic-proxy, and the default is changed to
      "disabled".

      Suggested by Kris Macoskey <kmacoskey@pivotal.io>
    • ic-proxy: let backends connect to the proxy bgworker · 94c9d996
      Committed by Ning Yu
      Only in proxy mode, of course.  Currently the ic-proxy mode shares most
      of the backend logic with the ic-tcp mode, so instead of copying the
      code we embed the ic-proxy specific logic in ic_tcp.c.
    • ic-proxy: implement the core logic · 6188fb1f
      Committed by Ning Yu
      The interconnect proxy mode, a.k.a. ic-proxy, is a new interconnect
      mode: all the backends communicate via a proxy bgworker, and all the
      backends on the same segment share the same proxy bgworker, so every
      pair of segments needs only one network connection between them, which
      reduces the network flows as well as the number of ports.

      To enable the proxy mode we need to first configure the GUC
      gp_interconnect_proxy_addresses, for example:

          gpconfig \
            -c gp_interconnect_proxy_addresses \
            -v "'1:-1:10.0.0.1:2000,2:0:10.0.0.2:2001,3:1:10.0.0.3:2002'" \
            --skipvalidation

      Then restart the cluster for the setting to take effect.
  16. 05 Jun 2020 (1 commit)
  17. 03 Jun 2020 (2 commits)
    • Squash me: address concerns in code review · bf36fb3b
      Committed by Asim R P
      Remember if the select call was interrupted.  Act on it after emitting
      debug logs and checking cancel requests from the dispatcher.
    • Check errno as early as possible · 9fd138da
      Committed by Asim R P
      Previously, the result of the select() system call and the errno set by
      it were checked only after performing several function calls, including
      checking for interrupts and checkForCancelFromQD.  That made it very
      likely for errno to be clobbered, losing the original value set by
      select().

      This patch fixes it so that errno is checked immediately after the
      system call.  This should address intermittent failures in CI with
      error messages like this:

          ERROR","58M01","interconnect error: select: Success"
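      The safe pattern is to capture errno into a local variable immediately
      after the call returns, before anything else can overwrite it; a minimal
      sketch (the wrapper and its error reporting are illustrative, not the
      actual interconnect code):

          #include <errno.h>
          #include <stdio.h>
          #include <string.h>
          #include <sys/select.h>

          /* Illustrative wrapper: select() on one fd and report errors using
           * the errno value saved right after the call. */
          static int
          wait_for_data(int fd, struct timeval *timeout)
          {
              fd_set  rset;
              int     n;
              int     saved_errno;

              FD_ZERO(&rset);
              FD_SET(fd, &rset);

              n = select(fd + 1, &rset, NULL, NULL, timeout);
              saved_errno = errno;        /* save before any other call runs */

              /* ... interrupt/cancel checks may run here and change errno ... */

              if (n < 0 && saved_errno != EINTR)
                  fprintf(stderr, "interconnect error: select: %s\n",
                          strerror(saved_errno));
              return n;
          }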
  18. 29 May 2020 (1 commit)
  19. 25 May 2020 (1 commit)
    • Fix a hang caused by gp_interconnect_id disorder · 644bde25
      Committed by Pengzhou Tang
      This issue was exposed by an experiment that removes the special
      "eval_stable_functions" handling in evaluate_function(): the
      qp_functions_in_* test cases sometimes got stuck, and the cause turned
      out to be a gp_interconnect_id disorder issue.

      Under the UDPIFC interconnect, gp_interconnect_id is used to
      distinguish the executions of MPP-fied plans in the same session; on
      the receiver side, packets with a smaller gp_interconnect_id are
      treated as 'past' packets and the receiver tells the sender to stop
      sending them.

      The root cause of the hang is:
      1. The QD calls InitSliceTable() to advance the gp_interconnect_id and
      stores it in the slice table.
      2. In CdbDispatchPlan->exec_make_plan_constant(), the QD finds a
      stable function that needs to be simplified to a const, so it executes
      this function first.
      3. The function contains SQL, so the QD inits another slice table and
      advances the gp_interconnect_id again, dispatches the new plan and
      executes it.
      4. After the function is simplified to a const, the QD continues to
      dispatch the previous plan, but its gp_interconnect_id is now the older
      one. When a packet arrives before the receiver has set up the
      interconnect, the packet is handled by handleMismatch(): it is treated
      as a 'past' packet and the senders are stopped prematurely by the
      receiver. When the receiver later finishes setting up the interconnect,
      it can no longer get any packets from the senders and gets stuck.

      To resolve this, we advance the gp_interconnect_id when a plan is
      actually dispatched; plans are dispatched sequentially, so a later
      dispatched plan always has a higher gp_interconnect_id.

      Also limit the usage of gp_interconnect_id in the rx thread of UDPIFC;
      we prefer to use sliceTable->ic_instance_id in the main thread.
      Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io>
      Reviewed-by: Asim R P <apraveen@pivotal.io>
      Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
  20. 22 May 2020 (1 commit)
    • Monitor dispatcher connection when receiving from TCP interconnect · c1d45e9e
      Committed by Pengzhou Tang
      This is mainly to resolve slow responses to sequence requests under the
      TCP interconnect. Sequence requests are sent through libpq connections
      from QEs to the QD (we call them dispatcher connections). In the past,
      under the TCP interconnect, the QD checked for events on dispatcher
      connections only every 2 seconds, which is obviously inefficient.

      Under UDPIFC mode, the QD also monitors the dispatcher connections
      while receiving tuples from QEs, so the QD can process sequence
      requests in time; this commit applies the same logic to the TCP
      interconnect.
      Reviewed-by: Hao Wu <gfphoenix78@gmail.com>
      Reviewed-by: Ning Yu <nyu@pivotal.io>
  21. 27 Apr 2020 (1 commit)
    • Fix a race condition in flushBuffer · 51c1bf91
      Committed by Pengzhou Tang
      flushBuffer() is used to send packets through the TCP interconnect.
      Before sending, it first checks whether the receiver has stopped or
      torn down the interconnect. However, there is a window between the
      check and the send: the receiver may tear down the interconnect and
      close the peer in between, so send() reports an error. To resolve this,
      we recheck whether the receiver stopped or tore down the interconnect
      when that happens, and do not error out in that case.
      Reviewed-by: Jinbao Chen <jinchen@pivotal.io>
      Reviewed-by: Hao Wu <hawu@pivotal.io>
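      In pseudo-C the check/send/recheck sequence looks roughly like this (the
      helper functions are illustrative, not the actual flushBuffer() code):

          #include <stdbool.h>
          #include <stddef.h>
          #include <sys/socket.h>

          /* Illustrative helpers: has the receiver sent a STOP or torn down
           * the interconnect, and how an interconnect error is reported. */
          extern bool receiver_stopped_or_torn_down(int sockfd);
          extern void report_interconnect_error(const char *op);

          static bool
          flush_buffer(int sockfd, const char *buf, size_t len)
          {
              /* First check: nothing to do if the receiver is already gone. */
              if (receiver_stopped_or_torn_down(sockfd))
                  return false;

              if (send(sockfd, buf, len, 0) < 0)
              {
                  /* The receiver may have torn down the interconnect between
                   * the check above and the send(); recheck and treat that
                   * case as a clean stop rather than an error. */
                  if (receiver_stopped_or_torn_down(sockfd))
                      return false;

                  report_interconnect_error("send");
              }
              return true;
          }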
  22. 24 Apr 2020 (1 commit)
    • Remove TUPLE_CHUNK_ALIGN. · 30ca2852
      Committed by Heikki Linnakangas
      It was set to 1 on all supported platforms, and I'm almost certain it
      would be broken if you tried to set it to anything else, because it
      hasn't been tested for a long time.

      As far as I can see, the alignment was only needed because, on the
      receiving side, we cast the buffer into a TupSerHeader pointer, and
      there was otherwise no guarantee that the buffer was suitably aligned
      for TupSerHeader. That's easy to fix by memcpy()ing the TupSerHeader
      into a local variable that's properly aligned.
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
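      The memcpy() trick in a nutshell (a sketch; the TupSerHeader fields here
      are placeholders, not the real definition):

          #include <stdint.h>
          #include <string.h>

          /* Placeholder header layout, for illustration only. */
          typedef struct TupSerHeader
          {
              uint32_t  tuplen;
              uint16_t  natts;
              uint16_t  infomask;
          } TupSerHeader;

          static uint32_t
          read_tuple_length(const char *buf)
          {
              TupSerHeader hdr;

              /* Instead of casting buf (which may be misaligned) to
               * TupSerHeader *, copy the bytes into a properly aligned
               * local variable. */
              memcpy(&hdr, buf, sizeof(hdr));
              return hdr.tuplen;
          }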
  23. 23 Apr 2020 (1 commit)
    • Remove forceEos mechanism for TCP interconnect · 041d9399
      Committed by Pengzhou Tang
      In the TCP interconnect, the sender used to force an EOS message to the
      receiver in two cases:
      1. cancelUnfinished is true in mppExecutorFinishup.
      2. an error occurs.

      For case 1, the comment says: to finish a cursor, the QD used to send
      a cancel to the QEs, the QEs then set the cancelUnfinished flag and did
      a normal executor finish-up. We now use the QueryFinishPending
      mechanism to stop a cursor, so the case 1 logic has been invalid for a
      long time.

      For case 2, the purpose is: when an error occurs, we force an EOS to
      the receiver so the receiver does not report an interconnect error; the
      QD will then check the dispatch results and report the errors from the
      QEs. From the view of the interconnect, we have reached the end of the
      query with no error in the interconnect. This logic has two problems:
      1. it doesn't work for initplans: an initplan does not check the
      dispatch results and throw the errors, so when an error occurs in the
      QEs for the initplan, the QD cannot notice it.
      2. it doesn't work for cursors, for example:
         DECLARE c1 cursor for select i from t1 where i / 0 = 1;
         FETCH all from c1;
         FETCH all from c1;
      None of the FETCH commands report errors, which is not expected.

      This commit removes the forceEos mechanism. For case 2 the receiver
      will now report an interconnect error without forceEos; this is ok
      because when multiple errors are reported from the QEs, the QD is
      inclined to report the non-interconnect error.
  24. 20 Apr 2020 (1 commit)
    • Use a unicast IP address for interconnection (#9696) · 790c7bac
      Committed by Hao Wu
      * Use a unicast IP address for interconnection on the primary

      Currently, interconnect/UDP always binds the wildcard address to the
      socket, which makes all QEs on the same node share the same port space
      (up to 64k). For dense deployments the UDP ports could run out, even if
      there are multiple IP addresses.
      To increase the total number of available ports for QEs on a node, we
      bind a single/unicast IP address to the socket for interconnect/UDP
      instead of the wildcard address, so segments with different IP
      addresses have different port spaces.
      To fully benefit from this patch and alleviate running out of ports, it
      is better to assign a different ADDRESS
      (gp_segment_configuration.address) to each segment, although this is
      not mandatory.

      Note: the QD/mirror uses the primary's address value in
      gp_segment_configuration as the destination IP to connect to the
      primary, so the primary returns the ADDRESS as its local address by
      calling `getsockname()`.

      * Fix the origin of the source IP address for backends

      The destination IP address uses the listenerAddr of the parent slice,
      but the source IP address to bind is harder to determine, because it is
      not stored on the segment, and the slice table is sent to the QEs after
      they have already bound the address and port. The origin of the source
      IP address differs by role:
      1. QD: by calling `cdbcomponent_getComponentInfo()`
      2. QE on master: from the qdHostname dispatched by the QD
      3. QE on segment: from the local address of the QE's TCP connection
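      The essence of the change is binding a concrete local address instead of
      INADDR_ANY, and then reading back the bound address with getsockname();
      a minimal sketch (error handling and IPv6 omitted):

          #include <arpa/inet.h>
          #include <netinet/in.h>
          #include <string.h>
          #include <sys/socket.h>

          /* Bind a UDP socket to a specific local IP (instead of the wildcard
           * address) with an ephemeral port, then report the chosen
           * address/port via getsockname(). */
          static int
          bind_interconnect_socket(const char *local_ip,
                                   struct sockaddr_in *bound)
          {
              int fd = socket(AF_INET, SOCK_DGRAM, 0);
              struct sockaddr_in addr;
              socklen_t len = sizeof(*bound);

              memset(&addr, 0, sizeof(addr));
              addr.sin_family = AF_INET;
              addr.sin_port = 0;                      /* kernel picks a port */
              inet_pton(AF_INET, local_ip, &addr.sin_addr);  /* unicast addr */

              bind(fd, (struct sockaddr *) &addr, sizeof(addr));
              getsockname(fd, (struct sockaddr *) bound, &len);
              return fd;
          }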
  25. 08 Apr 2020 (2 commits)
    • Remove redundant 'hasError' flag in TeardownTCPInterconnect · a6ae448d
      Committed by Pengzhou Tang
      This flag duplicates 'forceEOS'; 'forceEOS' can also tell whether an
      error occurred or not.
    • Fix interconnect hang issue · ec1d9a70
      Committed by Pengzhou Tang
      We have hit the interconnect hang issue many times in many cases, all
      with the same pattern: the downstream interconnect motion senders keep
      sending tuples, blind to the fact that the upstream nodes have already
      finished and quit the execution, while the QD has received enough
      tuples and waits for all QEs to quit, which causes a deadlock.

      Many nodes may quit execution early, e.g. LIMIT, HashJoin, Nest Loop;
      to resolve the hang they need to stop the interconnect stream
      explicitly by calling ExecSquelchNode(). However, we cannot do that for
      rescan cases, in which data might be lost, e.g. commit 2c011ce4. For
      rescan cases we tried using QueryFinishPending to stop the senders in
      commit 02213a73 and let the senders check this flag and quit, but that
      commit has its own problems: firstly, QueryFinishPending can only be
      set by the QD, so it doesn't work for INSERT or UPDATE cases; secondly,
      that commit only lets the senders detect the flag and quit the loop in
      a rude way (without sending the EOS to the receiver), so the receiver
      may still be stuck receiving tuples.

      This commit first reverts the QueryFinishPending method.

      To resolve the hang, we move TeardownInterconnect ahead of
      cdbdisp_checkDispatchResult, so the interconnect stream is guaranteed
      to be stopped before waiting for and checking the status of the QEs.

      For UDPIFC, TeardownInterconnect() removes the ic entries, so any
      packets for this interconnect context are treated as 'past' packets and
      acked with the STOP flag.

      For TCP, TeardownInterconnect() closes all connections with its
      children, and the children treat any readable data in the connection,
      including the closure itself, as a STOP message.

      A test case is not included; both commits 2c011ce4 and 02213a73
      contain one.
  26. 23 Mar 2020 (2 commits)
    • Revert "Make fault injector configurable (#9532)" (#9795) · 495343e1
      Committed by Hao Wu
      This reverts commit 7f5c7da1.
    • Make fault injector configurable (#9532) · 7f5c7da1
      Committed by Hao Wu
      The FAULT_INJECTOR definition was hardcoded in a header file
      (pg_config_manual.h). The fault injector is useful, but it may
      introduce issues in production, such as runtime cost and security
      problems. It is better to enable this feature in development and
      disable it in release builds.

      To achieve this, we add a configure option that makes the fault
      injector configurable. When the fault injector is disabled, tests that
      use this feature should be skipped in ICW. There are many tests under
      isolation2 and regress, so all tests under isolation2 and regress that
      depend on the fault injector are moved to new schedule files named with
      the pattern XXX_faultinjector_schedule.

      **NOTE**
      All tests that depend on the fault injector are kept in the
      XXX_faultinjector_schedule files. With this rule, we only run tests
      that don't depend on the fault injector when the fault injector is
      disabled.

      The schedule files used for the fault injector are:
      src/test/regress/greenplum_faultinjector_schedule
      src/test/isolation2/isolation2_faultinjector_schedule
      src/test/isolation2/isolation2_resgroup_faultinjector_schedule
      Reviewed-by: Asim R P <apraveen@pivotal.io>
      Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
  27. 02 Mar 2020 (2 commits)