1. 19 Apr 2018, 1 commit
    • Speed up dispatcher detection of segment state changes · 85101317
      David Kimura committed
      The dispatcher uses DISPATCH_WAIT_TIMEOUT_MSEC (currently 2000) as its
      poll timeout. It used to wait for 30 poll timeouts before checking
      segment status, and then initiated an FTS probe before the check. As a
      result it took about a minute for a query to fail in case of a segment
      failure.
      
      This commit instead checks segment status on every poll timeout. It also
      uses the FTS version to decide whether a check is needed at all: rather
      than performing an FTS probe itself, it relies on FTS running at regular
      intervals and serving cached results (see the sketch below).
      
      With this change, the test time for twophase_tolerance_with_mirror_promotion
      was cut down by about 2 minutes.
      Co-authored-by: Ashwin Agrawal <aagrawal@pivotal.io>
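      A minimal sketch of the new wait loop; everything except
      DISPATCH_WAIT_TIMEOUT_MSEC (helper names, types, the shape of the FTS
      version API) is a hypothetical illustration, not the actual gpdb code:

          #include <poll.h>
          #include <stdbool.h>
          #include <stdint.h>

          #define DISPATCH_WAIT_TIMEOUT_MSEC 2000

          /* Hypothetical helpers standing in for the real dispatcher/FTS code. */
          extern int      handleReadyConnections(struct pollfd *fds, int nfds);
          extern uint64_t getFtsVersion(void);   /* cached version, no probe */
          extern bool     anySegmentDown(void);  /* reads cached FTS results */
          extern void     reportSegmentFailure(void);

          static void
          dispatchWaitLoop(struct pollfd *fds, int nfds)
          {
              uint64_t lastFtsVersion = getFtsVersion();

              for (;;)
              {
                  int n = poll(fds, nfds, DISPATCH_WAIT_TIMEOUT_MSEC);

                  if (n > 0)
                  {
                      if (handleReadyConnections(fds, nfds) == 0)
                          return;             /* all results received */
                      continue;
                  }

                  /*
                   * Timeout. Previously: wait ~30 timeouts, then run a full
                   * FTS probe. Now: on every timeout, compare the cached FTS
                   * version and re-check segments only when FTS has published
                   * new results.
                   */
                  uint64_t ftsVersion = getFtsVersion();
                  if (ftsVersion != lastFtsVersion)
                  {
                      lastFtsVersion = ftsVersion;
                      if (anySegmentDown())
                          reportSegmentFailure();
                  }
              }
          }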
  2. 23 Mar 2018, 1 commit
  3. 07 Dec 2017, 1 commit
    • Resend a cancel/finish signal if a QE didn't respond for a long time · 07ee8008
      Pengzhou Tang committed
      Previously, the dispatcher sent the cancel/finish signal to QEs only
      once, so if the signal arrived before the query, or was swallowed by
      secure_read(), a QE might never get the chance to quit, e.g. when it was
      assigned to execute a MOTION node whose peer had already been canceled
      (see the resend sketch below).
      
      This fixes issue #3950
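      A hedged sketch of periodic resending on top of libpq's public cancel
      API; the QEConn bookkeeping struct and the CANCEL_RESEND_SEC interval
      are assumptions for illustration:

          #include <time.h>
          #include <libpq-fe.h>

          #define CANCEL_RESEND_SEC 10    /* assumed resend interval */

          /* Hypothetical per-QE bookkeeping. */
          typedef struct
          {
              PGconn *conn;
              time_t  cancelSentAt;       /* 0 if no cancel sent yet */
              int     done;
          } QEConn;

          /*
           * Previously the cancel was sent exactly once; if it raced ahead of
           * the query or was swallowed in secure_read(), the QE could block
           * in a Motion forever.  Resend after a timeout until it responds.
           */
          static void
          maybeResendCancel(QEConn *qe)
          {
              time_t now = time(NULL);

              if (qe->done || qe->cancelSentAt == 0 ||
                  now - qe->cancelSentAt < CANCEL_RESEND_SEC)
                  return;

              char errbuf[256];
              PGcancel *cancel = PQgetCancel(qe->conn);

              if (cancel != NULL)
              {
                  (void) PQcancel(cancel, errbuf, sizeof(errbuf));
                  PQfreeCancel(cancel);
              }
              qe->cancelSentAt = now;
          }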
  4. 09 Nov 2017, 1 commit
  5. 08 Nov 2017, 2 commits
  6. 02 Nov 2017, 1 commit
    • Wake up faster, if a segment returns an error. · 3bbedbe9
      Heikki Linnakangas committed
      Previously, if a segment reported an error after the interconnect had
      started up, it could take up to 250 ms for the main thread in the QD
      process to wake up, poll the dispatcher connections, and see the error.
      Shorten that time by waking up immediately if the QD->QE libpq socket
      becomes readable while we're waiting for data to arrive in a Motion node.

      This isn't a complete solution, because we only wake up if one
      arbitrarily chosen connection becomes readable, and we still rely on
      polling for the others. But it greatly speeds up many common scenarios.
      In particular, the "qp_functions_in_select" test now runs in under
      5 seconds on my laptop, where it took about 60 seconds before.
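      A rough illustration of the idea, with hypothetical names: while waiting
      for Motion data, also poll one dispatcher (QD->QE) libpq socket, so an
      incoming error wakes us immediately instead of on the next polling tick:

          #include <poll.h>
          #include <libpq-fe.h>

          #define MOTION_POLL_TIMEOUT_MSEC 250   /* the old fixed wakeup interval */

          /*
           * Wait until either Motion data or QD->QE libpq traffic is
           * readable, or the timeout elapses.  The caller re-checks the
           * dispatcher state, so an error from the segment is seen at once.
           */
          static void
          waitForMotionData(int motionFd, PGconn *qeConn)
          {
              struct pollfd fds[2];

              fds[0].fd = motionFd;              /* interconnect data */
              fds[0].events = POLLIN;
              fds[1].fd = PQsocket(qeConn);      /* one arbitrarily chosen QE */
              fds[1].events = POLLIN;

              (void) poll(fds, 2, MOTION_POLL_TIMEOUT_MSEC);
          }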
  7. 30 Oct 2017, 2 commits
    • Retire gp_libpq_fe part 2, changing include paths · 974c414e
      Adam Lee committed
      Signed-off-by: Adam Lee <ali@pivotal.io>
    • Retire gp_libpq_fe part 1, libpq itself · 510a20b6
      Adam Lee committed
          commit b0328d5631088cca5f80acc8dd85b859f062ebb0
          Author: mcdevc <a@b>
          Date:   Fri Mar 6 16:28:45 2009 -0800
      
              Separate our internal libpq front end from the client libpq library
              upgrade libpq to the latest to pick up bug fixes and support for more
              client authentication types (GSSAPI, KRB5, etc)
              Upgrade all files dependent on libpq to handle new version.
      
      Above is the initial commit of gp_libpq_fe; there seems to be no good
      reason to keep it.

      Key things this PR does:

      1. Remove the gp_libpq_fe directory.
      2. Build the libpq sources into two versions, frontend and backend,
      selected by the FRONTEND macro (sketched below).
      3. libpq for the backend still bypasses local authentication, SSL, and
      some environment variables; these are the only differences.
      Signed-off-by: Adam Lee <ali@pivotal.io>
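      A hedged sketch of the FRONTEND idiom, in the style PostgreSQL's libpq
      sources already use; the function and its error paths are illustrative,
      not lines from this diff:

          /*
           * Compiled twice from the same source file: once with -DFRONTEND
           * for the client library, once without for the backend build.
           */
          #ifdef FRONTEND
          #include "postgres_fe.h"
          #else
          #include "postgres.h"
          #endif

          static void
          report_connection_error(const char *msg)
          {
          #ifdef FRONTEND
              /* Frontend: no elog/ereport machinery, write to stderr. */
              fprintf(stderr, "libpq: %s\n", msg);
          #else
              /* Backend: use the server's error reporting. */
              ereport(ERROR, (errmsg("libpq: %s", msg)));
          #endif
          }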
  8. 10 Oct 2017, 1 commit
  9. 01 Sep 2017, 1 commit
  10. 09 Aug 2017, 1 commit
    • Do not include gp-libpq-fe.h and gp-libpq-int.h in cdbconn.h · cf7cddf7
      Pengzhou Tang committed
      The whole cdb directory is shipped to end users, and every header file
      that cdb*.h includes also needs to be shipped for checkinc.py to pass.
      However, exposing gp_libpq_fe/*.h would confuse customers, because those
      headers are almost identical to libpq/*. Per Heikki's suggestion, we
      keep gp_libpq_fe/* unchanged and instead include gp-libpq-fe.h and
      gp-libpq-int.h directly in the C files that need them.
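      Illustratively (file contents hypothetical), the pattern keeps the
      private headers out of the shipped header and pulls them into the
      implementation instead:

          /* cdbconn.h -- shipped to users; no private libpq headers here,
           * only an opaque forward declaration. */
          struct pg_conn;

          /* cdbconn.c -- private headers included only where implemented. */
          #include "gp-libpq-fe.h"
          #include "gp-libpq-int.h"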
  11. 31 Jul 2017, 1 commit
    • Implement "COPY ... FROM ... ON SEGMENT" · e254287e
      Ming LI committed
      Support COPY statements that import data files on the segments directly,
      in parallel. This can be used to import the data files generated by
      "COPY ... TO ... ON SEGMENT".

      This commit also supports every data file format that "COPY ... TO"
      supports, honors the reject limit, and logs errors accordingly.
      
      Key workflow (QE side sketched below):
         a) For COPY FROM, nothing changes with this commit: dispatch the
         modified COPY command to the segments first, then read the data file
         on the master and dispatch the data to the relevant segment for
         processing.

         b) For COPY FROM ON SEGMENT, the QD reads a dummy data file and the
         other parts stay unchanged; each QE first processes the (empty) data
         stream dispatched from the QD, then re-runs the same workflow to read
         and process its local segment data file.
      Signed-off-by: Ming LI <mli@pivotal.io>
      Signed-off-by: Adam Lee <ali@pivotal.io>
      Signed-off-by: Haozhou Wang <hawang@pivotal.io>
      Signed-off-by: Xiaoran Wang <xiwang@pivotal.io>
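      A schematic of the QE-side flow for ON SEGMENT; the helper names stand
      in for the real COPY machinery and are purely illustrative:

          /* Hypothetical helpers; the real COPY code is far more involved. */
          extern void processDispatchedCopyStream(void);  /* empty stream from QD */
          extern void copyFromLocalFile(const char *path);
          extern const char *segmentDataFilePath(void);   /* this segment's file */

          static void
          qeCopyFromOnSegment(void)
          {
              /* Step 1: drain the (empty) data stream the QD dispatched. */
              processDispatchedCopyStream();

              /* Step 2: re-run the normal COPY FROM workflow, but reading the
               * local segment data file instead of QD-dispatched data. */
              copyFromLocalFile(segmentDataFilePath());
          }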
  12. 24 Jul 2017, 1 commit
    • Use non-blocking recv() in internal_cancel() · 23e5a5ee
      xiong-gang committed
      Hangs on recv() in internal_cancel() have been reported several times:
      the socket shows 'ESTABLISHED' on the master while the peer process on
      the segment has already exited. We are not sure exactly how this
      happens, but we can reproduce the hang by dropping packets or rebooting
      the system on the segment.

      This patch uses poll() to make the recv() in internal_cancel()
      non-blocking. The poll() timeout is set to the maximum value of
      authentication_timeout, to make sure the process on the segment has
      already exited before attempting another retry; we expect the retry on
      connect() to detect the network issue.
      Signed-off-by: Ning Yu <nyu@pivotal.io>
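      A minimal sketch of the pattern (the timeout value and error handling
      are illustrative): wait on poll() with a bounded timeout before calling
      recv(), so a silently dead peer can no longer hang the caller forever:

          #include <poll.h>
          #include <sys/types.h>
          #include <sys/socket.h>

          /*
           * Bounded read: returns recv()'s result, or -1 if no data arrives
           * within timeout_ms.
           */
          static ssize_t
          recv_with_timeout(int sock, char *buf, size_t len, int timeout_ms)
          {
              struct pollfd pfd;

              pfd.fd = sock;
              pfd.events = POLLIN;

              if (poll(&pfd, 1, timeout_ms) <= 0)
                  return -1;          /* timeout or poll error */

              return recv(sock, buf, len, 0);
          }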
  13. 14 Feb 2017, 1 commit
    • Fix dispatch and interconnect defects when the postmaster is not alive · e28c84b2
      Pengzhou Tang committed
      Even when the postmaster of a segment is killed, its QEs are still
      running, and with some defects a query may hang. Improvements in this
      commit:
      1. The interconnect motion receiver and sender check segment status when
      no data has been available for a long time, to avoid query hangs
      (sketched below).
      2. Add segment status checking to the gang sanity test.
      3. Do not reuse gangs whose postmaster is not alive; recreate them.
      4. Check segment status when creating a gang fails.
      5. Close a connection if its peer is down.
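      A hedged sketch of item 1; the threshold and helper names are
      assumptions for illustration:

          #include <stdbool.h>
          #include <time.h>

          #define NO_DATA_CHECK_SEC 30              /* assumed threshold */

          extern bool motionDataAvailable(void);    /* hypothetical */
          extern bool allSegmentsAlive(void);       /* hypothetical status check */
          extern void reportLostSegment(void);

          static void
          motionReceiveLoop(void)
          {
              time_t lastData = time(NULL);

              for (;;)
              {
                  if (motionDataAvailable())
                  {
                      lastData = time(NULL);
                      /* ... consume tuples ... */
                      continue;
                  }

                  /*
                   * No data for a long time: make sure we aren't waiting on a
                   * segment whose postmaster has died.
                   */
                  if (time(NULL) - lastData > NO_DATA_CHECK_SEC)
                  {
                      if (!allSegmentsAlive())
                          reportLostSegment();
                      lastData = time(NULL);   /* don't re-check immediately */
                  }
              }
          }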
  14. 10 Feb 2017, 1 commit
    • Remove unused atomic functions. · 2cd519d3
      Heikki Linnakangas committed
      None of the source files that #included gp_atomic.h actually needed the
      declarations from gp_atomic.h itself; they needed the definitions from
      port/atomics.h, which gp_atomic.h in turn #included.
  15. 14 Nov 2016, 1 commit
    • Use a nonblocking mechanism to send data in the async dispatcher · 2516eac6
      xiong-gang committed
      pqFlush sends data synchronously even though the socket is set
      O_NONBLOCK, which degrades performance. This commit uses
      pqFlushNonBlocking instead, and synchronizes on the completion of
      dispatching to all gangs before query execution.

      Signed-off-by: Kenan Yao <kyao@pivotal.io>
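      pqFlushNonBlocking is gpdb-internal; expressed with libpq's public API,
      the same pattern looks roughly like this, draining every connection's
      send buffer without blocking on any single one:

          #include <stdbool.h>
          #include <libpq-fe.h>

          /*
           * PQflush() returns 0 when the send buffer is empty, 1 if it could
           * not send everything yet (the socket would block), -1 on error.
           */
          static bool
          flushAllConnections(PGconn **conns, int n)
          {
              bool pending = true;

              while (pending)
              {
                  pending = false;
                  for (int i = 0; i < n; i++)
                  {
                      int rc = PQflush(conns[i]);

                      if (rc == -1)
                          return false;     /* connection error */
                      if (rc == 1)
                          pending = true;   /* revisit this connection */
                  }
                  /* A real implementation would poll() for writability here
                   * rather than spinning. */
              }
              return true;
          }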
  16. 13 Sep 2016, 2 commits
    • Remove duplicate checks that already exist within PQsendGpQuery_shared() · d6a5c7a8
      Pengzhou Tang committed
      Before dispatching a command, we assume the connection is either newly
      created or reused. A newly created connection must be idle, and a reused
      connection should already have been cleaned up. Meanwhile,
      PQsendGpQuery_shared() itself performs the busy check and the
      bad-connection check during dispatch, so the pre-checks are unnecessary.
    • Speed up QE cancel when one or more QEs got errors · 39ed6031
      Pengzhou Tang committed
      The QD needs to cancel the QEs when
      1) the QD gets an error, or
      2) one or more QEs got an error and cancelOnError is set to true.

      We want to cancel QEs as soon as possible once either condition is met,
      but since cancelling QEs is expensive, we first want to process as many
      pending-finish QEs as possible. The original interval before cancelling
      was 2 seconds, long enough that users saw an obvious delay before errors
      were reported; this commit lowers the interval to 100 ms to speed up
      cancellation (see the sketch below).
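      Schematically (constant and helper names hypothetical), the change is a
      shorter grace interval before the cancels go out:

          #define CANCEL_DELAY_MSEC_OLD 2000   /* users saw a ~2 s delay */
          #define CANCEL_DELAY_MSEC_NEW  100   /* errors surface almost at once */

          extern int  cancelRequested(void);    /* QD error, or QE error with
                                                 * cancelOnError set */
          extern void drainFinishedQEs(void);   /* reap QEs that finished */
          extern void cancelRemainingQEs(void);
          extern void waitMsec(int msec);

          static void
          finishDispatch(void)
          {
              if (!cancelRequested())
                  return;

              /* Give in-flight QEs a short window to finish on their own
               * before paying the cost of cancelling them. */
              waitMsec(CANCEL_DELAY_MSEC_NEW);  /* was CANCEL_DELAY_MSEC_OLD */
              drainFinishedQEs();
              cancelRemainingQEs();
          }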
  17. 29 Aug 2016, 1 commit
    • Fix a few dispatch-related bugs · eb40e073
      Pengzhou Tang committed
      1. Fix a primary writer gang leak: PrimaryWriterGang was accidentally
         set to NULL, so disconnectAndDestroyAllGangs() could not destroy the
         primary writer gang.
      2. Fix a gang leak: when creating a gang, if the retry count exceeded
         the limit, we forgot to destroy the failed gang.
      3. Remove a duplicate sanity check before dispatchCommand().
      4. Remove an unnecessary error-out when a broken gang is no longer needed.
      5. Fix a thread leak.
      6. Improve error handling in cdbdisp_finishCommand().
  18. 17 Jul 2016, 1 commit