- 09 Nov 2020, 5 commits
-
-
Committed by Hubert Zhang
This reverts commit 9265ea6a.
-
Committed by Ning Yu
The user can adjust the ic-proxy peer addresses at runtime and reload them by sending SIGHUP. If an address is modified or removed, the corresponding peer connection must be closed or re-established. The same applies to the peer listener: if the listener port is changed, the listener must be set up again.
-
Committed by Ning Yu
The peer addresses are specified with the GUC gp_interconnect_proxy_addresses, which can be reloaded on SIGHUP. We used to care only about newly added addresses, but the user can also modify or even remove some of them. So now we add logic to classify the addresses after parsing the GUC, so we can tell whether an address was added, removed, or modified. The handling of the classified addresses will be done in the next commit.
-
Committed by Ning Yu
We used to scan the whole address list to find our own address; now we record it directly when parsing the addresses.
-
Committed by Ning Yu
An ICProxyAddr variable is usually named "addr", so the attribute was referred to as "addr->addr", which is confusing and sometimes ambiguous. The attribute is therefore renamed to "sockaddr", and the function ic_proxy_extract_addr() is also renamed to ic_proxy_extract_sockaddr().
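For illustration, a hedged sketch of the struct's shape after the rename; the actual field list of ICProxyAddr in ic_proxy may differ.

    #include <sys/socket.h>

    typedef struct ICProxyAddr
    {
        struct sockaddr_storage sockaddr;     /* was "addr", hence the confusing "addr->addr" */
        socklen_t               sockaddrlen;  /* length of the above */
        /* ... dbid, content and other ic-proxy specific fields ... */
    } ICProxyAddr;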
-
- 28 Oct 2020, 1 commit
-
-
Committed by 盏一
In some cases, signals (like SIGQUIT) that should only be processed by the main thread of the postmaster may be dispatched to rxThread. So it is both necessary and safe to block all signals in the UDP pthreads. Fix #11006
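A minimal, self-contained sketch (not the actual gpdb code) of blocking every signal in a worker pthread, so that signals such as SIGQUIT are delivered only to the main thread:

    #include <pthread.h>
    #include <signal.h>

    static void
    block_all_signals_in_this_thread(void)
    {
        sigset_t mask;

        sigfillset(&mask);                        /* every signal */
        pthread_sigmask(SIG_BLOCK, &mask, NULL);  /* affects only the calling thread */
    }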
-
- 28 Sep 2020, 1 commit
-
-
Committed by Heikki Linnakangas
-
- 10 Sep 2020, 1 commit
-
-
Committed by Ning Yu
The GUC gp_interconnect_proxy_addresses is used to set the listener addresses and ports of all the proxy bgworkers. Previously only IP addresses were supported, which is inconvenient to use. Now we also support hostnames; IP addresses are still supported as well. Note that if a hostname is bound to a different IP at runtime, we must reload the setting with the "gpstop -u" command. Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
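A hedged sketch of resolving one entry (hostname or numeric IP) from the address list into a socket address; the real parsing code in ic_proxy differs, this only illustrates that getaddrinfo() handles both forms:

    #include <netdb.h>
    #include <string.h>
    #include <sys/socket.h>

    static int
    resolve_proxy_host(const char *host, const char *port,
                       struct sockaddr_storage *out, socklen_t *outlen)
    {
        struct addrinfo hints;
        struct addrinfo *res;
        int              ret;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;       /* IPv4 or IPv6 */
        hints.ai_socktype = SOCK_STREAM;

        ret = getaddrinfo(host, port, &hints, &res);
        if (ret != 0)
            return ret;                    /* caller reports gai_strerror(ret) */

        memcpy(out, res->ai_addr, res->ai_addrlen);
        *outlen = res->ai_addrlen;
        freeaddrinfo(res);
        return 0;
    }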
-
- 02 Sep 2020, 2 commits
-
-
Committed by Hubert Zhang
-
Committed by Hubert Zhang
Resource group used to access resGroupSlot in SessionState without a lock. That is correct as long as a session only accesses its own resGroupSlot. But since we introduced the runaway feature, we need to traverse the current session array to find the top consumer session when the red zone is reached. This requires that: 1. the runaway detector holds the shared resgroup lock when the red zone is reached, to avoid a resGroupSlot being detached from a session concurrently; 2. a normal session holds the exclusive lock when modifying resGroupSlot in SessionState. Also fix a compile warning. Reviewed-by: Ning Yu <nyu@pivotal.io>
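A hedged, fragment-style sketch of the reader/writer protocol described above, assuming a shared LWLock named ResGroupLock; the actual lock name and call sites in gpdb may differ.

    /* Runaway detector: read-only traversal of the session array. */
    LWLockAcquire(ResGroupLock, LW_SHARED);
    /* ... walk the SessionState entries, find the top memory consumer ... */
    LWLockRelease(ResGroupLock);

    /* Normal session: attaching or detaching its slot. */
    LWLockAcquire(ResGroupLock, LW_EXCLUSIVE);
    /* ... modify sessionState->resGroupSlot ... */
    LWLockRelease(ResGroupLock);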
-
- 01 Sep 2020, 1 commit
-
-
Committed by Hubert Zhang
A proxy bgworker would become an orphan process after the postmaster died, because the pipe postmaster_alive_fds[POSTMASTER_FD_WATCH] was not being checked. Now we epoll this pipe inside the proxy bgworker main loop as well. Reviewed-by: Ning Yu <nyu@pivotal.io>
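A hedged sketch of watching the postmaster-alive pipe from a libuv loop; postmaster_alive_fds[POSTMASTER_FD_WATCH] is the fd PostgreSQL exposes for this purpose, while the callback name and exit path here are only illustrative.

    #include <uv.h>

    static uv_poll_t postmaster_watch;

    static void
    on_postmaster_death(uv_poll_t *handle, int status, int events)
    {
        /* the read end becomes readable (EOF) when the postmaster exits */
        proc_exit(1);               /* PostgreSQL's proc_exit() */
    }

    static void
    watch_postmaster(uv_loop_t *loop)
    {
        uv_poll_init(loop, &postmaster_watch,
                     postmaster_alive_fds[POSTMASTER_FD_WATCH]);
        uv_poll_start(&postmaster_watch, UV_READABLE, on_postmaster_death);
    }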
-
- 14 Aug 2020, 1 commit
-
-
Committed by Hubert Zhang
ic-proxy is developed with libuv; the minimal supported libuv version is 1.18.0. But commit 608514 introduced API that is new in libuv 1.19, which breaks compatibility on OSes like Ubuntu 18.04, whose default libuv version is 1.18. We should keep our code base aligned with libuv 1.18 and replace the API that is new in 1.19. The API change is mainly about how to access the data field in a uv handle and a uv loop: the new API uses function interfaces like `uv_handle_set_data` and `uv_handle_get_data`, while the old API in 1.18 accesses the data field directly. Note that the latest libuv version, 1.38.2, supports both the old and the new API, and libuv is stable enough to keep supporting the old API for a long time.
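A minimal illustration of the two styles; both store a user pointer in a libuv handle. The accessor functions only exist since libuv 1.19, while the direct field access works in 1.18 and later.

    #include <uv.h>

    static void
    store_context(uv_tcp_t *handle, void *ctx)
    {
        /* libuv >= 1.19 only: */
        /* uv_handle_set_data((uv_handle_t *) handle, ctx); */

        /* libuv 1.18-compatible: access the public data field directly */
        handle->data = ctx;
    }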
-
- 12 Aug 2020, 2 commits
-
-
Committed by Heikki Linnakangas
ic_proxy_backend.h includes libuv's uv.h header, and ic_proxy_backend.h was being included in ic_tcp.c, even when compiling with --disable-ic-proxy.
-
Committed by Hubert Zhang
Previously, when backends connect to a proxy, we needed to set up the domain socket pipe and send the HELLO message (and receive the ACK message) in a blocking and non-parallel way. This made it hard for ICPROXY to introduce check_for_interrupt during backend registration. By utilizing the libuv loop, we can register backends in parallel. Note that this is only one step toward replacing all the ic_tcp backend logic currently reused by ic_proxy; in the future we should use libuv to replace all the backend logic, from registration to sending/receiving data. Co-authored-by: Ning Yu <nyu@pivotal.io>
-
- 10 Aug 2020, 1 commit
-
-
Committed by Ning Yu
A typical mistake when allocating typed memory is:

    int64 *ptr = malloc(sizeof(int32));

To prevent this, ic_proxy_new() is now a typed allocator; it always returns a pointer of the specified type, for example:

    int64 *p1 = ic_proxy_new(int64); /* good */
    int64 *p2 = ic_proxy_new(int32); /* bad, gcc will raise a warning */

Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
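A hedged sketch of how such a typed allocator macro could be implemented; the actual ic_proxy_new() in gpdb may differ. The cast ties the returned pointer to the named type, so assigning the result to a pointer of a different type triggers an incompatible-pointer-types warning from gcc.

    #include <stdlib.h>

    /* allocate zeroed storage for exactly one 'type' and return it typed */
    #define ic_proxy_new(type) ((type *) calloc(1, sizeof(type)))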
-
- 06 Aug 2020, 1 commit
-
-
Committed by Paul Guo
Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io> Reviewed-by: Hao Wu <gfphoenix78@gmail.com>
-
- 04 Aug 2020, 1 commit
-
-
Committed by Ning Yu
Fixed a bug where the SIGHUP handler was installed for SIGINT by mistake, so the ic-proxy bgworkers would die on SIGHUP. With the signal name corrected, the ic-proxy bgworkers now reload postgresql.conf when "gpstop -u" is executed. Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
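A hedged, fragment-style sketch of the fix; the handler names are illustrative, the point is only that the SIGHUP handler must be registered under SIGHUP rather than SIGINT.

    /* before: pqsignal(SIGINT, handle_sighup);  -- wrong signal */
    pqsignal(SIGHUP, handle_sighup);   /* reload postgresql.conf on "gpstop -u" */
    pqsignal(SIGTERM, handle_sigterm); /* normal bgworker termination */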
-
- 03 Aug 2020, 1 commit
-
-
Committed by Ning Yu
In a query that contains multiple init/sub plans, the packets of the second subplan might be received while the first is still being processed in ic-proxy mode, because ic-proxy uses a local host handshake instead of the global one. To distinguish the packets of different subplans, especially the early-arriving ones, we must stop handling them immediately on the BYE and pass any unhandled early packets to the successor or the placeholder. This fixes the random hanging during the ICW parallel group of qp_functions_in_from. No new test is added. Co-authored-by: Hubert Zhang <hzhang@pivotal.io> Co-authored-by: Ning Yu <nyu@pivotal.io>
-
- 29 Jul 2020, 1 commit
-
-
Committed by Ning Yu
We used to store them under /tmp/ and include the postmaster port number in the file name, in the hope that two clusters would not conflict with each other on this file. However, the conflict still happens in the test src/bin/pg_basebackup, and it can also happen if a second cluster is misconfigured by accident. To make things safe we now also include the postmaster pid in the domain socket path; there is no chance for two postmasters to share the same pid. Reviewed-by: Paul Guo <pguo@pivotal.io>
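A hedged sketch of building such a path from the port and the postmaster pid; the actual prefix and format string used by ic-proxy may differ.

    char        path[MAXPGPATH];

    /* PostPortNumber and PostmasterPid are existing PostgreSQL globals */
    snprintf(path, sizeof(path), "/tmp/.s.ic_proxy.%d.%d",
             PostPortNumber, (int) PostmasterPid);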
-
- 23 Jul 2020, 2 commits
-
-
Committed by Ning Yu
We used to mark the GUC gp_interconnect_proxy_addresses as PGC_POSTMASTER, so the cluster had to be restarted to reload this setting. This can be a problem during gpexpand: the cluster expansion itself is online, but a restart was needed to configure the proxy addresses for the new segments. Now we changed it to PGC_SIGHUP, so the setting can be reloaded on SIGHUP. Also changed the setting from a developer option to a normal one.
-
Committed by Ning Yu
-
- 10 Jul 2020, 3 commits
-
-
Committed by Ning Yu
We used to use the option --with-libuv to enable ic-proxy, but it is not straightforward to understand the purpose of that option. So we renamed it to --enable-ic-proxy, and the default setting is changed to "disable". Suggested by Kris Macoskey <kmacoskey@pivotal.io>
-
Committed by Ning Yu
Only in proxy mode, of course. Currently the ic-proxy mode shares most of the backend logic with ic-tcp mode, so instead of copying the code we embed the ic-proxy-specific logic in ic_tcp.c.
-
Committed by Ning Yu
The interconnect proxy mode, a.k.a. ic-proxy, is a new interconnect mode in which all the backends communicate via a proxy bgworker. All the backends on the same segment share the same proxy bgworker, so every two segments need only one network connection between them, which reduces the network flows as well as the number of ports. To enable the proxy mode we need to first configure the GUC gp_interconnect_proxy_addresses, for example:

    gpconfig \
        -c gp_interconnect_proxy_addresses \
        -v "'1:-1:10.0.0.1:2000,2:0:10.0.0.2:2001,3:1:10.0.0.3:2002'" \
        --skipvalidation

Then restart the cluster for the setting to take effect.
-
- 05 Jun 2020, 1 commit
-
-
Committed by Asim R P
-
- 03 Jun 2020, 2 commits
-
-
Committed by Asim R P
Remember whether the select() call was interrupted, and act on it after emitting debug logs and checking cancel requests from the dispatcher.
-
Committed by Asim R P
Previously, the result of the select() system call and the errno set by it were checked only after several intervening function calls, including checking for interrupts and checkForCancelFromQD. That made it very likely for errno to change, losing the original value set by select(). This patch fixes it so that errno is checked immediately after the system call. This should address intermittent failures in CI with error messages like: ERROR","58M01","interconnect error: select: Success"
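A hedged, fragment-style sketch of the pattern the fix enforces: capture errno right after select() returns, before any other call can clobber it. Variable names are illustrative.

    n = select(nfds, &rset, NULL, NULL, &timeout);
    save_errno = errno;                        /* capture immediately */

    elog(DEBUG3, "select() returned %d", n);   /* may itself change errno */
    /* ... interrupt and cancel-from-QD checks ... */

    if (n < 0 && save_errno != EINTR)
        elog(ERROR, "interconnect error: select: %s", strerror(save_errno));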
-
- 29 May 2020, 1 commit
-
-
Committed by Heikki Linnakangas
processIncomingChunks() receives a list of chunks from one sender and then calls addChunkToSorter() on each chunk. addChunkToSorter() looks up some things based on the sender, but since all the chunks come from the same sender, we can move the lookups outside of the loop and save some overhead. Reviewed-by: Gang Xiong <gxiong@pivotal.io>
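A schematic sketch of the hoisting; all names here are placeholders rather than the real gpdb functions, the point is only that a loop-invariant lookup moves out of the per-chunk loop.

    /* before: the per-sender lookup happens once per chunk */
    for (i = 0; i < nchunks; i++)
    {
        SenderInfo *sender = lookup_sender(conn);   /* same result every iteration */
        add_chunk(sender, chunks[i]);
    }

    /* after: hoist the loop-invariant lookup out of the loop */
    SenderInfo *sender = lookup_sender(conn);

    for (i = 0; i < nchunks; i++)
        add_chunk(sender, chunks[i]);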
-
- 25 May 2020, 1 commit
-
-
Committed by Pengzhou Tang
This issue was exposed by an experiment to remove the special "eval_stable_functions" handling in evaluate_function(): the qp_functions_in_* test cases would sometimes get stuck, and it turned out to be a gp_interconnect_id disorder issue.

Under the UDPIFC interconnect, gp_interconnect_id is used to distinguish the executions of MPP-fied plans in the same session; on the receiver side, packets with a smaller gp_interconnect_id are treated as 'past' packets, and the receiver tells the sender to stop sending them.

The RCA of the hang:
1. The QD calls InitSliceTable() to advance the gp_interconnect_id and stores it in the slice table.
2. In CdbDispatchPlan->exec_make_plan_constant(), the QD finds a stable function that needs to be simplified to a const, so it executes this function first.
3. The function contains SQL, so the QD inits another slice table and advances the gp_interconnect_id again, then dispatches the new plan and executes it.
4. After the function is simplified to a const, the QD continues to dispatch the previous plan; however, its gp_interconnect_id is now the older one. When a packet arrives before the receiver has set up the interconnect, the packet is handled by handleMismatch() and treated as a 'past' packet, so the senders are stopped early by the receiver. When the receiver later finishes setting up the interconnect, it cannot get any packets from the senders and gets stuck.

To resolve this, we advance the gp_interconnect_id when a plan is actually dispatched; plans are dispatched sequentially, so a later-dispatched plan gets a higher gp_interconnect_id. Also limit the usage of gp_interconnect_id in the rx thread of UDPIFC; we prefer to use sliceTable->ic_instance_id in the main thread.

Reviewed-by: Heikki Linnakangas <hlinnakangas@pivotal.io> Reviewed-by: Asim R P <apraveen@pivotal.io> Reviewed-by: Hubert Zhang <hzhang@pivotal.io>
-
- 22 May 2020, 1 commit
-
-
Committed by Pengzhou Tang
This is mainly to resolve slow responses to sequence requests under the TCP interconnect. Sequence requests are sent through libpq from the QEs to the QD (we call these the dispatcher connections). In the past, under the TCP interconnect, the QD checked for events on the dispatcher connections every 2 seconds, which is obviously inefficient. Under UDPIFC mode, the QD also monitors the dispatcher connections while receiving tuples from the QEs, so the QD can process sequence requests in time; this commit applies the same logic to the TCP interconnect. Reviewed-by: Hao Wu <gfphoenix78@gmail.com> Reviewed-by: Ning Yu <nyu@pivotal.io>
-
- 27 Apr 2020, 1 commit
-
-
Committed by Pengzhou Tang
flushBuffer() is used to send packets through the TCP interconnect. Before sending, it first checks whether the receiver has stopped or torn down the interconnect; however, there is a window between checking and sending in which the receiver may tear down the interconnect and close the peer, so send() reports an error. To resolve this, we recheck whether the receiver stopped or tore down the interconnect in this window and do not error out in that case. Reviewed-by: Jinbao Chen <jinchen@pivotal.io> Reviewed-by: Hao Wu <hawu@pivotal.io>
-
- 24 Apr 2020, 1 commit
-
-
Committed by Heikki Linnakangas
It was set to 1 on all supported platforms, and I'm almost certain it would be broken if you tried to set it to anything else, because it hasn't been tested for a long time. As far as I can see, the alignment was only needed because on the receiving side we cast the buffer into a TupSerHeader pointer, and there was otherwise no guarantee that the buffer was suitably aligned for TupSerHeader. That's easy to fix by memcpy()ing the TupSerHeader into a local variable that's properly aligned. Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
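A minimal sketch of the alignment-safe pattern described above; variable names are illustrative.

    TupSerHeader hdr;

    /* instead of: TupSerHeader *p = (TupSerHeader *) bufptr;   (bufptr may be misaligned) */
    memcpy(&hdr, bufptr, sizeof(TupSerHeader));   /* the local copy is properly aligned */
    /* ... read the header fields from hdr ... */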
-
- 23 Apr 2020, 1 commit
-
-
Committed by Pengzhou Tang
In the TCP interconnect, the sender used to force an EOS message to the receiver in two cases: 1. cancelUnfinished is true in mppExecutorFinishup; 2. an error occurs.

For case 1, the comment says: to finish a cursor, the QD used to send a cancel to the QEs, and the QEs then set the cancelUnfinished flag and did a normal executor finish-up. We now use the QueryFinishPending mechanism to stop a cursor, so the case 1 logic has been invalid for a long time.

For case 2, the purpose is: when an error occurs, we force an EOS to the receiver so the receiver does not report an interconnect error, and the QD will then check the dispatch results and report the errors from the QEs. From the view of the interconnect, the query has run to the end and no error occurred in the interconnect. This logic has two problems:
1. It doesn't work for initplans: an initplan does not check the dispatch results and throw the errors, so when an error occurs in the QEs for the initplan, the QD cannot notice it.
2. It doesn't work for cursors, for example:

    DECLARE c1 cursor for select i from t1 where i / 0 = 1;
    FETCH all from c1;
    FETCH all from c1;

None of the FETCH commands report errors, which is not expected.

This commit removes the forceEos mechanism. For case 2, the receiver will now report an interconnect error without forceEos; this is acceptable because when multiple errors are reported from the QEs, the QD is inclined to report the non-interconnect error.
-
- 20 Apr 2020, 1 commit
-
-
Committed by Hao Wu
* Use a unicast IP address for interconnection on the primary

Currently, interconnect/UDP always binds the wildcard address to the socket, which makes all QEs on the same node share the same port space (up to 64k). For dense deployments the UDP ports could run out, even if there are multiple IP addresses. To increase the total number of available ports for QEs on a node, we bind a single/unicast IP address to the socket for interconnect/UDP instead of the wildcard address, so segments with different IP addresses have different port spaces (see the sketch after this message). To fully utilize this patch to alleviate running out of ports, it's better to assign a different ADDRESS (gp_segment_configuration.address) to each segment, although this is not mandatory.

Note: the QD/mirror uses the primary's address value in gp_segment_configuration as the destination IP to connect to the primary, so the primary returns that ADDRESS as its local address by calling getsockname().

* Fix the origin of the source IP address for backends

The destination IP address uses the listenerAddr of the parent slice, but the source IP address to bind is harder to determine, because it is not stored on the segment and the slice table is sent to the QEs only after they have bound the address and port. The origin of the source IP address for different roles is:
1. QD: by calling cdbcomponent_getComponentInfo()
2. QE on master: from the qdHostname dispatched by the QD
3. QE on segment: from the local address of the QE's TCP connection
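A hedged sketch of the binding change described in the first bullet: bind a specific (unicast) address instead of the wildcard so each segment IP gets its own port space. The fd and segment_address variables are assumptions, and error handling is omitted.

    #include <arpa/inet.h>
    #include <string.h>

    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);                              /* let the kernel pick a port */

    /* before: addr.sin_addr.s_addr = htonl(INADDR_ANY);      wildcard address */
    inet_pton(AF_INET, segment_address, &addr.sin_addr);   /* unicast address  */

    bind(fd, (struct sockaddr *) &addr, sizeof(addr));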
-
- 08 Apr 2020, 2 commits
-
-
Committed by Pengzhou Tang
This flag duplicates 'forceEOS'; 'forceEOS' can also tell whether an error occurred or not.
-
Committed by Pengzhou Tang
We have hit the interconnect hang issue many times, in many cases, all with the same pattern: the downstream interconnect motion senders keep sending tuples, blind to the fact that the upstream nodes have finished and quit execution earlier; the QD then gets enough tuples and waits for all QEs to quit, which causes a deadlock.

Many nodes may quit execution early, e.g. LIMIT, HashJoin, Nest Loop. To resolve the hang they need to stop the interconnect stream explicitly by calling ExecSquelchNode(); however, we cannot do that for rescan cases, in which data might be lost, e.g. commit 2c011ce4. For rescan cases we tried using QueryFinishPending to stop the senders in commit 02213a73 and let the senders check this flag and quit, but that commit has its own problems: firstly, QueryFinishPending can only be set by the QD, so it doesn't work for INSERT or UPDATE cases; secondly, that commit only lets the senders detect the flag and quit the loop in a rude way (without sending the EOS to the receiver), so the receiver may still be stuck receiving tuples.

This commit first reverts the QueryFinishPending method. To resolve the hang, we move TeardownInterconnect ahead of cdbdisp_checkDispatchResult, which guarantees that the interconnect stream is stopped before waiting for and checking the status of the QEs. For UDPIFC, TeardownInterconnect() removes the ic entries, so any packets for this interconnect context will be treated as 'past' packets and be acked with the STOP flag. For TCP, TeardownInterconnect() closes all connections with its children, and the children treat any readable data in the connection, including the closure, as a STOP message.

A test case is not included; both commit 2c011ce4 and 02213a73 contain one.
-
- 23 Mar 2020, 2 commits
-
-
Committed by Hao Wu
The definition FAULT_INJECTOR was hardcoded in a header file (pg_config_manual.h). The fault injector is useful, but it may introduce issues in the production stage, like runtime cost and security problems. It's better to enable this feature in development and disable it in release builds. To achieve this, we add a configure option to make the fault injector configurable.

When the fault injector is disabled, tests using this feature should be excluded from ICW. There are a lot of tests under isolation2 and regress; all tests under isolation2 and regress that depend on the fault injector are now moved to new schedule files named with the pattern XXX_faultinjector_schedule.

NOTE: all tests that depend on the fault injector are saved to XXX_faultinjector_schedule. With this rule, we only run tests that don't depend on the fault injector when it is disabled. The schedule files used for the fault injector are:
src/test/regress/greenplum_faultinjector_schedule
src/test/isolation2/isolation2_faultinjector_schedule
src/test/isolation2/isolation2_resgroup_faultinjector_schedule

Reviewed-by: Asim R P <apraveen@pivotal.io> Reviewed-by: Ashwin Agrawal <aagrawal@pivotal.io>
- 02 Mar 2020, 2 commits
-
-
Committed by Heikki Linnakangas
The caller is now expected to call compute_memtuple_size() and only then memtuple_form_to(), instead of calling memtuple_form_to() twice. Reviewed-by: David Kimura <dkimura@pivotal.io>
-
Committed by Heikki Linnakangas
Only one caller needed the ability to inline toasted datums. Reviewed-by: David Kimura <dkimura@pivotal.io>
-