1. 31 May, 2021 9 commits
  2. 30 May, 2021 5 commits
    • J
      Mark `test_send_tasks_to_celery_hang` as quarantined (#16169) · 4bdf46d2
      Jarek Potiuk committed
      The test_send_tasks_to_celery_hang hangs on self-hosted runners more
      often than not.
      
      It was introduced in #15989. The test does not usually hang on regular
      GitHub runners or when run locally (I could not make it fail), but it
      hangs almost always when run on self-hosted runners.
      
      Marking it as quarantined for now.
      
      Issue #16168 created to keep track of it.
      4bdf46d2
    • K
      Fix failing static check (#16162) · 7c976f9c
      Kaxil Naik committed
      This check was failing on master, example: https://github.com/apache/airflow/runs/2702121768#step:10:160
      7c976f9c
    • J
      a27e930d
    • Y
      Treat `AirflowSensorTimeout` as immediate failure without retrying (#12058) · a0e6a847
      yuqian90 committed
      ## Expected behaviour
      For a sensor like this, the intention of the DAG author is usually to fail the sensor if it's still not done after ten minutes. However, if the sensor fails prematurely due to other unexpected reasons (such as network outage), retry at most twice.
      
      ```python
      sensor = PythonSensor(
          task_id='sensor',
          python_callable=python_callable,
          timeout=60 * 10,
          retries=2,
          mode="reschedule",
      )
      ```
      
      ## Actual behaviour
      The actual current behaviour of Airflow is to retry when the sensor times out. So the effective timeout of the sensor becomes 60 * 10 * (retries + 1) = 30min. This often causes confusion. It also makes it impossible to achieve the expected behaviour no matter how the author configures the sensor.
      
      ## Fix
      This PR fixes the issue. `AirflowSensorTimeout` is now treated as an immediate failure, which achieves the expected behaviour: the sensor fails as soon as the timeout is reached. Anyone who really wants the previous behaviour can increase the timeout, i.e. instead of failing and retrying every ten minutes three times, just set the timeout to 30min.
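
The rule described in this fix can be sketched in plain Python. This is a minimal illustration, not Airflow's implementation: `SensorTimeout` and `run_with_retries` are hypothetical names standing in for `AirflowSensorTimeout` and the task retry handling.

```python
class SensorTimeout(Exception):
    """Hypothetical stand-in for AirflowSensorTimeout."""


def run_with_retries(attempt_fn, retries):
    """Call attempt_fn up to retries + 1 times.

    A SensorTimeout fails immediately and is never retried. Any other
    exception is retried until the retry budget is exhausted.
    Returns (final_state, number_of_attempts_made).
    """
    attempts = 0
    for _ in range(retries + 1):
        attempts += 1
        try:
            attempt_fn()
            return "success", attempts
        except SensorTimeout:
            # Timeout: give up right away, regardless of remaining retries.
            return "failed", attempts
        except Exception:
            # Unexpected failure (e.g. network outage): try again.
            continue
    return "failed", attempts


def always_times_out():
    raise SensorTimeout("sensor timed out")


def always_network_error():
    raise ConnectionError("network outage")
```

With `retries=2`, `run_with_retries(always_times_out, 2)` stops after a single attempt, while `run_with_retries(always_network_error, 2)` uses all three attempts before failing.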
      a0e6a847
    • Y
      Marking success/failed automatically clears failed downstream tasks (#13037) · 6b2524fe
      yuqian90 committed
      closes: #12485
      6b2524fe
  3. 29 May, 2021 6 commits
    • Y
      Fix dag.clear() to set multiple dags to running when necessary (#15382) · 2bca8a54
      yuqian90 committed
      closes: #14260
      related: #9824
      
      When clearing tasks across dags using ExternalTaskMarker, the state of the external DagRun is not set to active. So cleared tasks in the external dag will not automatically start if the DagRun is in a Failed or Succeeded state.
      #9824 tried to fix a similar issue for subdags, but it did not fix ExternalTaskMarker. This PR fixes both.
      
      Two changes are made to fix the issue:
      
      * Make clear_task_instances set DagRuns' state to dag_run_state for all the affected DagRuns.
      * The filter for DagRun in clear_task_instances is fixed too. Previously, it made an assumption that execution_dates for all the dag_ids are the same, which is not always correct.
      
      test_external_task_marker_clear_activate is added to make sure the fix does the right thing.
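
The corrected filter can be illustrated with plain data structures. The sketch below is a simplification under stated assumptions, not the code in clear_task_instances: `TaskInstance`, `DagRun` and `activate_affected_runs` are hypothetical, and in-memory lists stand in for database queries. It collects execution dates per dag_id instead of assuming that every dag shares the same dates.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TaskInstance:  # hypothetical, simplified record
    dag_id: str
    execution_date: str


@dataclass
class DagRun:  # hypothetical, simplified record
    dag_id: str
    execution_date: str
    state: str


def activate_affected_runs(cleared_tis, dag_runs, dag_run_state="running"):
    """Set dag_run_state on exactly the runs that own a cleared task instance."""
    # Collect the affected execution dates separately for each dag_id.
    dates_by_dag = defaultdict(set)
    for ti in cleared_tis:
        dates_by_dag[ti.dag_id].add(ti.execution_date)

    # A run is affected only if both its dag_id and its own date match.
    for run in dag_runs:
        if run.execution_date in dates_by_dag.get(run.dag_id, ()):
            run.state = dag_run_state
    return dag_runs
```

Matching on the (dag_id, execution_date) pair means a run of the external dag on a date that was only cleared in the parent dag is left untouched.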
      2bca8a54
    • Y
      Fix Celery executor getting stuck randomly because of reset_signals in multiprocessing (#15989) · f75dd7ae
      yuqian90 committed
      Fixes #15938
      
      multiprocessing.Pool is known to often become stuck, which causes celery_executor to hang randomly. This happens at least on Debian and Ubuntu with Python 3.8.7 and Python 3.8.10. The issue is reproducible by running test_send_tasks_to_celery_hang in this PR several times (with the db backend set to something other than sqlite, because sqlite disables some parallelization).
      
      The issue goes away after switching to concurrent.futures.ProcessPoolExecutor. In Python 3.6 and earlier, ProcessPoolExecutor has no initializer argument. Fortunately, the initializer is not needed: reset_signals is no longer required because the signal handler now checks whether the current process is the parent.
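
The parent-process check mentioned here can be sketched as follows. This is an illustration under assumptions, not the executor's code: `make_parent_only_handler` and its `getpid` parameter are hypothetical names. The handler remembers the PID of the process that created it and ignores signals delivered to forked children, so pool workers need no initializer to reset their handlers.

```python
import os


def make_parent_only_handler(on_signal, getpid=os.getpid):
    """Build a signal handler that acts only in the process that created it.

    getpid is injectable (an assumption of this sketch) so the check can be
    exercised without forking.
    """
    parent_pid = getpid()  # PID of the process installing the handler

    def handler(signum, frame):
        if getpid() != parent_pid:
            # Running in a forked child: ignore the signal.
            return False
        on_signal(signum)
        return True

    return handler
```

Such a handler would be installed with the standard `signal.signal(signal.SIGTERM, make_parent_only_handler(callback))`, where `callback` is whatever shutdown routine the parent process uses.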
      f75dd7ae
    • A
      Added Kayzen to INTHEWILD.md (#16154) · 2de06920
      Arvind Abraham committed
      Added Kayzen to the list of companies using Apache Airflow
      2de06920
    • Z
      remove retry for now (#16150) · 8d166382
      Zach Liu committed
      8d166382
    • J
      Chart: Adds support for custom command and args (#16153) · e7d3fa2d
      Jed Cunningham committed
      Some images may not want to use the same command/args as the community
      image, so expose them as parameters.
      e7d3fa2d
    • J
      Move images needed only during CI to `airflow-ci` DockerHub (#16116) · 595374ae
      Jarek Potiuk committed
      We now have a separate `apache/airflow-ci` DockerHub repo, and we are
      moving all images needed only during CI there.
      
      The images remaining in the main `apache/airflow` repo are:
      
      * airflow tagged and latest tagged production images
      * images needed by the Helm Chart
      595374ae
  4. 28 May, 2021 12 commits
  5. 27 May, 2021 8 commits