1. 22 Jun, 2023 2 commits
  2. 16 Jun, 2023 1 commit
  3. 02 Jun, 2023 1 commit
  4. 31 Mar, 2023 1 commit
  5. 27 Mar, 2023 1 commit
  6. 28 Feb, 2023 1 commit
  7. 02 Feb, 2023 2 commits
  8. 26 Jan, 2023 1 commit
    • Abstract accelerator (step 3) (#2677) · 98cc35b6
      Committed by Ma, Guokai
      * Integrate accelerator abstraction interface into deepspeed/
      
      * Fix error message in fp16/fused_optimizer
      
      * fix error message in fp16/unfused_optimizer.py
      
      * assign get_accelerator().pin_memory() result to input Tensor name
      
      * no need to check cuda and whether nvtx supported
      
      * move try-except into inner most block
      
      * call Event() and Stream() in get_accelerator() for data type
      
      * Make Stream and Event as properties of abstract interface so they can be used as data type in deepspeed
      
      * Apply op_builder backend api change from #2705 from @jeffra
      
      * fix tests where Builder NAME is used
      
      * keep original ...Builder.NAME interface instead of ...Builder().NAME interface
      
      * fix builder closure for installation
      
      * fix randomltd builder
      
      * add comments to clarify create_op_builder and get_op_builder
      
      * fix compatibility with pip install -e
      Co-authored-by: Cheng Li <pistasable@gmail.com>
      Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
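The bullets about exposing `Stream` and `Event` as properties, and about assigning the result of `pin_memory()` back to the input name, can be pictured with a toy sketch. Everything below is hypothetical and written with the standard library only: `ToyStream`, `AbstractAccelerator`, `ToyAccelerator`, and this local `get_accelerator()` are stand-ins for illustration, not DeepSpeed's actual classes or signatures.

```python
from abc import ABC, abstractmethod


class ToyStream:
    """Hypothetical stand-in for a backend-specific stream type."""


class AbstractAccelerator(ABC):
    """Toy abstract interface (an assumption, not the real DeepSpeed one)."""

    @property
    @abstractmethod
    def Stream(self):
        """Return the backend's stream class, not an instance."""

    @abstractmethod
    def pin_memory(self, buffer):
        """Return a pinned version of `buffer`."""


class ToyAccelerator(AbstractAccelerator):
    @property
    def Stream(self):
        # Exposing the class as a property lets callers use it as a type.
        return ToyStream

    def pin_memory(self, buffer):
        # Returns a new object, so the caller has to rebind its own name.
        return tuple(buffer)


_ACCELERATOR = ToyAccelerator()


def get_accelerator():
    """Return the single toy accelerator instance."""
    return _ACCELERATOR


# The property can be called to build an object and used in type checks.
stream = get_accelerator().Stream()
is_stream = isinstance(stream, get_accelerator().Stream)

# The result is assigned back to the same name, as the commit bullet describes.
data = [1, 2, 3]
data = get_accelerator().pin_memory(data)
```

In this sketch, forgetting the reassignment would leave `data` as the original unpinned list, which is the kind of mistake the `pin_memory()` bullet appears to address.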
  9. 22 Nov, 2022 1 commit
  10. 27 Oct, 2022 1 commit
  11. 26 Oct, 2022 1 commit
  12. 03 Aug, 2022 1 commit
  13. 26 Jul, 2022 1 commit
  14. 20 Jul, 2022 1 commit
  15. 14 Jul, 2022 1 commit
  16. 23 Jun, 2022 1 commit
  17. 01 Jun, 2022 1 commit
  18. 09 Apr, 2022 1 commit
  19. 19 Feb, 2022 1 commit
  20. 05 Feb, 2022 1 commit
  21. 15 Dec, 2021 1 commit
  22. 06 Nov, 2021 1 commit
  23. 05 Nov, 2021 1 commit
  24. 02 Oct, 2021 1 commit
  25. 13 May, 2021 1 commit
    • Improve flops profiler functionality (#1065) · 4544b7d2
      Committed by Cheng Li
      * use the original function's name as the key to old_functions dict
      
      * update profile output format
      
      * print at global rank 0
      
      * add flops calculation in bwd pass using time from ds timers
      
      * improve aggregated profiling out to show all depth
      
      * print samples/second
      
      * update readme and examples
      
      * update docs
      
      * fix typo and reorder printing
      
      * fix format
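The first bullet of this commit, keying the `old_functions` dict by the original function's name, can be sketched as follows. This is a minimal illustration under stated assumptions: `toy_functional`, `wrap_func`, `patch`, `unpatch`, and the one-flop-per-call rule are hypothetical names and choices made for this example, not the profiler's actual code.

```python
import types


def relu(x):
    """Toy function standing in for a patched library function."""
    return max(0.0, x)


# Hypothetical stand-in for a module whose functions get wrapped.
toy_functional = types.SimpleNamespace(relu=relu)

old_functions = {}            # originals, keyed by the original function's name
flops_counter = {"total": 0}  # running count updated by the wrappers


def wrap_func(func, flops_fn):
    """Wrap `func` so each call adds flops_fn(*args) to the counter."""
    name = func.__name__
    old_functions[name] = func  # remember the original under its own name

    def new_func(*args, **kwargs):
        flops_counter["total"] += flops_fn(*args, **kwargs)
        return func(*args, **kwargs)

    new_func.__name__ = name
    return new_func


def patch():
    # Assumption for the example: one flop per relu call.
    toy_functional.relu = wrap_func(toy_functional.relu, lambda x: 1)


def unpatch():
    # The stored name is enough to put every original back in place.
    for name, func in old_functions.items():
        setattr(toy_functional, name, func)
    old_functions.clear()
```

Because the key is the original name, restoring is a plain `setattr` loop, and the wrapper never has to be looked up by its own identity.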
  26. 17 Mar, 2021 1 commit
  27. 12 Mar, 2021 1 commit
  28. 11 Feb, 2021 1 commit
    • Add flops profiler tutorial (#682) · e2dfe0d1
      Committed by Cheng Li
      * work on flops profiler tutorial
      
      * update flops profiler tutorial
      
      * add flops profiler tutorial and fix names
      
      * work on flops profiler tutorial
      
      * update flops profiler tutorial
      
      * add flops profiler tutorial and fix names
      
      * fix tailing ws
      
      * fix names
      
      * remove multistep profiling and update docs
      
      * fix cases where functionals and submodules coexist in a parent module, update readme
      
      * fix typo
      
      * always invoke post hook function
      
      * fix module flops sum and update tests
      
      * update tutorial
  29. 09 Feb, 2021 1 commit
    • Improve starred expressions (#696) · b08aa6f3
      Committed by Jon Eyolfson
      * Improve starred expressions
      
      `deepspeed/profiling/flops_profiler/profiler.py` uses starred expressions
      that are no longer valid with [PEP 617][1]. The new Python parser is in 3.9,
      and this change allows DeepSpeed to run with the newest Python version. I have
      not checked all locations that has this issue. However, this change allows me
      to run simple examples.
      
      [1]: https://www.python.org/dev/peps/pep-0617/
      
      * Match style for "Improve starred expressions", although readability suffers
      
      The style guide might need to be updated for this new use case of expressions.
      Python [Issue 40631][1] includes more discussion on the change.
      
      [1]: https://bugs.python.org/issue40631

      Co-authored-by: Cheng Li <pistasable@gmail.com>
  30. 13 Jan, 2021 1 commit