- 17 Aug 2020, 1 commit
-
Committed by Zhu Zhu
-
- 13 Aug 2020, 7 commits
-
Committed by Kezhu Wang
This closes #13124.
-
Committed by Kezhu Wang
[FLINK-18815] Change Thread.sleep(2) to Thread.sleep(0) to fail SafetyNetCloseableRegistryTest#testClose more often
-
Committed by Andrey Zagrebin
UnsafeMemoryBudget#verifyEmpty, called on slot freeing, needs to wait for GC of all allocated/released managed memory. If there are many segments to collect, the check can take a long time to finish. If slot freeing happens in the RPC thread, waiting for GC can block it, and the TM risks missing its heartbeat. Another problem is that after UnsafeMemoryBudget#RETRIGGER_GC_AFTER_SLEEPS, System.gc() is called on each attempt to run a cleaner, even if cleaners have already been detected to run; this triggers many unnecessary background GCs. The PR offloads the verification into a separate thread and calls System.gc() only if memory cannot be reserved and there are still no cleaners to run after a long wait. The timeout for normal memory reservation is increased to 2 seconds; the full reservation used for verification gets a 2-minute timeout. This closes #12980.
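The shape of this fix can be sketched as follows. This is a hypothetical simplification, not Flink's actual `UnsafeMemoryBudget`: the class name `BudgetVerifier`, the supplier parameters, and the retry interval are all assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.BooleanSupplier;

/**
 * Hypothetical sketch of the fix: run the "all memory released" check on a
 * dedicated thread (instead of the RPC thread, which must stay responsive
 * for heartbeats) and trigger System.gc() only when no already-detected
 * cleaner is pending.
 */
public class BudgetVerifier {

    private final ExecutorService verifier =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "memory-verifier");
                t.setDaemon(true);
                return t;
            });

    /** Asynchronously verifies that the budget is empty within the timeout. */
    public CompletableFuture<Boolean> verifyEmptyAsync(
            BooleanSupplier budgetIsEmpty,
            BooleanSupplier cleanersPending,
            long timeoutMillis) {
        return CompletableFuture.supplyAsync(() -> {
            long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
            while (System.nanoTime() < deadline) {
                if (budgetIsEmpty.getAsBoolean()) {
                    return true;
                }
                // Only fall back to an explicit GC when no cleaner is already
                // waiting to run; otherwise just sleep and retry, avoiding
                // the unnecessary background GCs described above.
                if (!cleanersPending.getAsBoolean()) {
                    System.gc();
                }
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return budgetIsEmpty.getAsBoolean();
        }, verifier);
    }
}
```

Because the caller only receives a future, the RPC thread can register a completion callback and return immediately instead of blocking on GC.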
-
Committed by Till Rohrmann
-
Committed by Till Rohrmann
In order to serve the job execution result in per-job mode, the REST handlers must be able to serve requests while the RestServerEndpoint is shutting down. Otherwise it is not possible to serve asynchronous operation results such as the job execution result. This commit solves the problem by checking in the AbstractHandler whether the InFlightRequestTracker allows enqueuing a new request. If it rejects the request, the handler has been completely shut down and we should close the connection to the client. If the InFlightRequestTracker still accepts requests, the RestHandler might still want to serve an asynchronous result. This closes #13133.
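The tracker behavior described above can be sketched roughly as follows. This is not Flink's `InFlightRequestTracker` (which is phaser-based); the class `InFlightTracker` and its method names are hypothetical, and the sketch ignores some register/close races for brevity.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch of an in-flight request tracker: new requests are
 * still accepted while shutdown is in progress (so asynchronous results can
 * be served); only once the tracker has fully terminated are requests
 * rejected, signalling the handler to close the client connection.
 */
public class InFlightTracker {

    private final AtomicInteger inFlight = new AtomicInteger();
    private final CompletableFuture<Void> terminated = new CompletableFuture<>();
    private volatile boolean closing = false;

    /** Returns false only after the tracker has fully terminated. */
    public boolean registerRequest() {
        if (terminated.isDone()) {
            return false; // completely shut down: reject and close connection
        }
        inFlight.incrementAndGet();
        return true;
    }

    public void deregisterRequest() {
        if (inFlight.decrementAndGet() == 0 && closing) {
            terminated.complete(null);
        }
    }

    /** Stops the tracker once all in-flight requests have finished. */
    public CompletableFuture<Void> awaitTermination() {
        closing = true;
        if (inFlight.get() == 0) {
            terminated.complete(null);
        }
        return terminated;
    }
}
```

The key property is the window between `awaitTermination()` and actual termination: requests registered in that window are still served, which is exactly what per-job mode needs to deliver the job execution result.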
-
Committed by Roman Khachatryan
[FLINK-18821][network] Fix indefinite wait in PartitionRequestClientFactory.createPartitionRequestClient
-
Committed by Roman Khachatryan
-
- 11 Aug 2020, 1 commit
-
Committed by Zhijiang
-
- 10 Aug 2020, 1 commit
-
Committed by Zhu Zhu
[FLINK-18859][tests] Increase timeout of ExecutionGraphNotEnoughResourceTest#testRestartWithSlotSharingAndNotEnoughResources to make it more stable
-
- 08 Aug 2020, 1 commit
-
Committed by Matthias Pohl
The log output was cleaned up:
- Only the key of the option is logged instead of the whole instance's toString() output.
- A new utility method was introduced that adapts the log output so that an adapted extension is used for maximum values instead of logging the actual max value.
This closes #13086.
-
- 07 Aug 2020, 2 commits
-
Committed by Maximilian Michels
This adds the configuration option `cluster.processes.halt-on-fatal-error` which defaults to `false`. If set to `true`, a custom SecurityManager will be installed on top of the existing SecurityManager to exit forcefully via `Runtime#halt`. Exiting that way may be necessary due to the Java 8 JVM freezing during a graceful shutdown when using the G1 garbage collector. This closes #13079.
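A minimal sketch of such a halting SecurityManager is shown below. The class name `HaltingSecurityManager` and its delegation logic are assumptions for illustration, not Flink's actual implementation; note that `SecurityManager` is deprecated in recent JDKs, which is consistent with the Java 8 context of this fix.

```java
/**
 * Hypothetical sketch of a SecurityManager that forcibly halts the JVM on
 * exit instead of running the graceful shutdown sequence (which could
 * freeze on Java 8 with the G1 garbage collector). All other checks
 * delegate to a previously installed manager, if any.
 */
public class HaltingSecurityManager extends SecurityManager {

    private final SecurityManager previous;

    public HaltingSecurityManager(SecurityManager previous) {
        this.previous = previous;
    }

    @Override
    public void checkPermission(java.security.Permission perm) {
        if (previous != null) {
            previous.checkPermission(perm); // keep the existing policy intact
        }
        // no previous manager installed: permit everything, as before
    }

    @Override
    public void checkExit(int status) {
        // Bypass System.exit's shutdown hooks entirely and stop the JVM now.
        Runtime.getRuntime().halt(status);
    }
}
```

With this manager installed, any call to `System.exit` reaches `checkExit` first and the JVM halts before shutdown hooks can run, which is the forced-exit behavior the configuration option enables.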
-
Committed by Max Kuklinski
[FLINK-13689] [Connectors/ElasticSearch] Fix thread leak in Elasticsearch connector when cluster is down
-
- 05 Aug 2020, 1 commit
-
Committed by xcomp
[FLINK-18677][fix] Added handling of suspended or lost connections within the ZooKeeperLeaderRetrievalService. The listener needs to be notified of a connection loss so that it can initiate the necessary actions on its side.
[FLINK-18677][runtime][style] Replaced spaces with TABs to follow the Apache Flink code style.
[FLINK-18677][runtime][fix] Synchronized notifyLeaderLoss through a lock and removed redundant code. The redundant code was moved into notifyIfNewLeaderAddress(String, UUID), which is now used by notifyLeaderLoss() and within nodeChanged(). Additionally, the call to notifyLeaderLoss() is now guarded by a lock to synchronize the state change (i.e. lastLeaderAddress and lastLeaderSessionID).
[FLINK-18677][runtime] Added the exception to make it more explicit that the method is not expected to be called.
[FLINK-18677][runtime] Decreased the wait time for the queue to be filled, since no objects are expected. The test does not expect any calls to happen, hence no CompletableFuture instance will be queued; a longer wait time would only make the test run longer.
[FLINK-18677][runtime] Added an infinite wait time to the happy-path test. The previous implementation had a fixed timeout, and slower machines might need longer to process the test, which could cause failures. The new implementation removes the timeout so the test no longer fails merely because of poor performance of the machine it runs on.
[FLINK-18677][runtime] Moved log messages out of synchronization blocks.
This closes #13055.
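The notification structure described above can be sketched as follows. This is a hypothetical simplification of the ZooKeeperLeaderRetrievalService: the class `LeaderRetrieval`, the listener type, and the method signatures are assumptions; the real service works against Curator connection-state events.

```java
import java.util.Objects;
import java.util.UUID;
import java.util.function.BiConsumer;

/**
 * Hypothetical sketch of the leader-retrieval fix: both a changed ZooKeeper
 * node and a suspended/lost connection funnel through one shared,
 * lock-guarded notification method, so the listener also learns about
 * leader loss instead of silently keeping a stale leader.
 */
public class LeaderRetrieval {

    private final Object lock = new Object();
    private final BiConsumer<String, UUID> listener;
    private String lastLeaderAddress;
    private UUID lastLeaderSessionId;

    public LeaderRetrieval(BiConsumer<String, UUID> listener) {
        this.listener = listener;
    }

    /** Called when the leader znode changes. */
    public void nodeChanged(String address, UUID sessionId) {
        notifyIfNewLeaderAddress(address, sessionId);
    }

    /** Called when the ZooKeeper connection is suspended or lost. */
    public void handleConnectionLoss() {
        notifyIfNewLeaderAddress(null, null); // leader is unknown from now on
    }

    private void notifyIfNewLeaderAddress(String address, UUID sessionId) {
        synchronized (lock) {
            // Only notify on an actual state change; duplicates are dropped.
            if (!Objects.equals(address, lastLeaderAddress)
                    || !Objects.equals(sessionId, lastLeaderSessionId)) {
                lastLeaderAddress = address;
                lastLeaderSessionId = sessionId;
                listener.accept(address, sessionId);
            }
        }
    }
}
```

Routing both paths through one synchronized method is what removes the redundant code and the unsynchronized state change the commit describes (the real fix additionally keeps log statements outside the lock).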
-
- 04 Aug 2020, 3 commits
-
Committed by Tartarus0zm
-
Committed by Tartarus0zm
- ensure that request finalization runs even if handleException throws an exception
- catch the NPE in handleException that occurs when the client closes the connection
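Both bullets above reduce to a standard try/catch/finally shape, sketched below. The class `RequestCompletion` and its `Runnable` parameters are hypothetical stand-ins for the handler, exception handler, and finalization logic, not the actual Flink code.

```java
/**
 * Hypothetical sketch of the handler fix: request finalization sits in a
 * finally block so it runs even when exception handling itself fails, and
 * the NPE thrown when the client has already closed the connection is
 * swallowed rather than propagated.
 */
public class RequestCompletion {

    /** Returns true if the handler ran without error. */
    public static boolean process(Runnable handler, Runnable handleException,
                                  Runnable finalizeRequest) {
        try {
            try {
                handler.run();
                return true;
            } catch (RuntimeException e) {
                try {
                    handleException.run();
                } catch (NullPointerException npe) {
                    // client closed the connection mid-response; there is
                    // nothing left to send the error to
                }
                return false;
            }
        } finally {
            finalizeRequest.run(); // must run on every code path
        }
    }
}
```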
-
Committed by Chesnay Schepler
-
- 03 Aug 2020, 1 commit
-
Committed by Till Rohrmann
When running Flink in per-job/application mode, it will instantiate a MiniDispatcherRestEndpoint. This endpoint does not instantiate the web submission REST handlers. However, it still displayed the submit job link in the web ui. This commit changes the behaviour so that we no longer display this link when running Flink in per-job/application mode. This closes #13030.
-
- 29 Jul 2020, 2 commits
-
Committed by wangyang0918
-
Committed by wangyang0918
This closes #12995.
-
- 28 Jul 2020, 1 commit
-
Committed by Zhijiang
[FLINK-18595][network] Fix the deadlock between the task thread and the canceler thread in RemoteInputChannel. Assuming two remote channels registered as listeners in the LocalBufferPool, the deadlock happens as follows:
1. While the canceler thread calls ch1#releaseAllResources, it holds ch1's bufferQueue lock and tries to call ch2#notifyBufferAvailable.
2. While the task thread exits via CachedBufferStorage#close, it may release exclusive buffers for ch2. Then ch2 holds its bufferQueue lock and tries to call ch1#notifyBufferAvailable.
3. ch1 and ch2 each hold their own bufferQueue lock and wait for the other side's lock, causing the deadlock.
As a solution, the released state is checked outside of the bufferQueue lock in RemoteInputChannel#notifyBufferAvailable, so the call can return immediately.
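The lock-avoidance pattern in this fix can be sketched as follows. The class `Channel` below is a hypothetical reduction of RemoteInputChannel, keeping only the released flag and the bufferQueue lock that matter to the deadlock.

```java
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical sketch of the deadlock fix: notifyBufferAvailable checks the
 * channel's released flag BEFORE taking the bufferQueue lock, so a released
 * channel returns immediately instead of contending for a lock that another
 * thread may hold while, in turn, waiting on this channel's lock.
 */
public class Channel {

    private final AtomicBoolean released = new AtomicBoolean();
    private final Object bufferQueueLock = new Object();
    private final ArrayDeque<Object> bufferQueue = new ArrayDeque<>();

    /** Returns true if the buffer was accepted by this channel. */
    public boolean notifyBufferAvailable(Object buffer) {
        // Fast path outside the lock: this is the line that breaks the
        // circular wait between two channels.
        if (released.get()) {
            return false;
        }
        synchronized (bufferQueueLock) {
            // Re-check under the lock to close the race with release().
            if (released.get()) {
                return false;
            }
            bufferQueue.add(buffer);
            return true;
        }
    }

    public void release() {
        released.set(true);
        synchronized (bufferQueueLock) {
            bufferQueue.clear();
        }
    }
}
```

The early return means a canceler thread holding ch1's lock can call `ch2.notifyBufferAvailable` on a released ch2 without ever touching ch2's lock, so the cycle in step 3 above cannot form.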
-
- 22 Jul 2020, 2 commits
-
Committed by Andrey Zagrebin
This closes #12563.
-
Committed by Gao Yun
-
- 21 Jul 2020, 1 commit
-
Committed by Rui Li
This closes #12939
-
- 14 Jul 2020, 1 commit
-
Committed by Jark Wu
[FLINK-18539][datastream] Fix StreamExecutionEnvironment#addSource(SourceFunction, TypeInformation) not using the user-defined type information. This closes #12878.
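The essence of this fix is a precedence rule: explicitly supplied type information must win over reflective extraction. A minimal, generic sketch (the class `TypeInfoResolution` and its parameters are hypothetical, standing in for Flink's TypeInformation handling):

```java
import java.util.function.Supplier;

/**
 * Hypothetical sketch of the addSource fix: user-supplied type information
 * takes precedence; reflective extraction is only a fallback.
 */
public class TypeInfoResolution {

    public static <T> T resolve(T explicitTypeInfo, Supplier<T> extractor) {
        // Only fall back to extraction when nothing was supplied; the bug
        // was effectively ignoring explicitTypeInfo on this path.
        return explicitTypeInfo != null ? explicitTypeInfo : extractor.get();
    }
}
```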
-
- 09 Jul 2020, 2 commits
-
Committed by Aljoscha Krettek
This makes AvroFactory and the used method public where they were package-private before. This fixes the problem that AvroDeserializationSchema did not work with types generated by avrohugger. It could also simply be seen as refactoring/code cleanup.
-
Committed by Milan Nikl
-
- 08 Jul 2020, 1 commit
-
Committed by Fabian Paul
-
- 07 Jul 2020, 1 commit
-
Committed by Piotr Nowojski
-
- 02 Jul 2020, 1 commit
-
Committed by Leonard Xu
This closes #12801
-
- 30 Jun 2020, 1 commit
-
Committed by Yubin Li
This closes #12789.
-
- 29 Jun 2020, 1 commit
-
Committed by Shuiqiang Chen
-
- 27 Jun 2020, 3 commits
- 25 Jun 2020, 1 commit
-
Committed by zoudan
This closes #12542.
-
- 24 Jun 2020, 1 commit
-
Committed by GuoWei Ma
-
- 22 Jun 2020, 1 commit
-
Committed by Kostas Kloudas
-
- 20 Jun 2020, 2 commits
-
Committed by Yu Li
This closes #12673.
-