- 22 Jun, 2023 2 commits
-
-
Committed by Cheng Li
-
Committed by Bill Luo
* fix conv_flops_compute when padding is a str and stride=1
* fix error
* change type of paddings to tuple
* fix padding calculation
* apply formatting check
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 16 Jun, 2023 1 commit
-
-
Committed by Alejandro Dubrovsky
Co-authored-by: Alex Dubrovsky <dubro@amazon.com>
-
- 02 Jun, 2023 1 commit
-
-
Committed by 郭叶军
When activation checkpointing is enabled, most of the forward pass is recomputed, so the FLOPS calculation should be updated with recompute_fwd_factor=1.0. I did not find a way to pass the option from the model script to the DeepSpeed engine, so the option is added directly to flops_profiler.
Co-authored-by: Cheng Li <pistasable@gmail.com>
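The accounting this commit describes can be sketched as below. This is a minimal illustration, not the DeepSpeed implementation: it assumes the common rule of thumb that a backward pass costs about twice the forward pass, and the function name `fwd_bwd_flops` and the parameter `bwd_to_fwd_ratio` are hypothetical. Only `recompute_fwd_factor` comes from the commit message.

```python
def fwd_bwd_flops(fwd_flops: float,
                  recompute_fwd_factor: float = 0.0,
                  bwd_to_fwd_ratio: float = 2.0) -> float:
    """Estimate the FLOPs of one training step (forward + backward).

    recompute_fwd_factor is the fraction of the forward pass that is
    executed a second time because of activation checkpointing
    (0.0 = no recomputation, 1.0 = the whole forward pass is recomputed).
    bwd_to_fwd_ratio is an assumed rule of thumb, not a measured value.
    """
    backward = bwd_to_fwd_ratio * fwd_flops
    recompute = recompute_fwd_factor * fwd_flops
    return fwd_flops + backward + recompute


# Without checkpointing: 1 forward + 2 (backward) = 3x the forward FLOPs.
baseline = fwd_bwd_flops(100.0)
# With full recomputation: 1 forward + 2 (backward) + 1 (recompute) = 4x.
checkpointed = fwd_bwd_flops(100.0, recompute_fwd_factor=1.0)
print(baseline, checkpointed)
```

Under these assumptions, leaving the factor at 0.0 while checkpointing is on would under-count the work done per step by roughly a quarter.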
-
- 31 Mar, 2023 1 commit
-
-
Committed by Michael Wyatt
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 27 Mar, 2023 1 commit
-
-
Committed by Jeff Rasley
-
- 28 Feb, 2023 1 commit
-
-
Committed by Jeff Rasley
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 02 Feb, 2023 2 commits
-
-
Committed by Cheng Li
* fix upsample flops compute by skipping unused kwargs
* fix format
-
Committed by swli
* bugs in profiler:
  1. Tensor.bmm is missing from the _patch_tensor_methods function
  2. functions are missing from the _reload_functionals and _reload_tensor_methods functions
  3. torch.mm and torch.Tensor.mm have the same __name__ in wrapFunc; my suggestion is to use __str__ instead
* formatting
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>
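The name collision in item 3 can be reproduced without PyTorch. The sketch below uses hypothetical stand-ins (a `Tensor` class and a module-level `mm` function) rather than the real `torch` objects, and the two dictionaries only illustrate the idea; they are not the profiler's actual bookkeeping.

```python
class Tensor:
    """Hypothetical stand-in for torch.Tensor, for illustration only."""

    def mm(self, other):
        return "Tensor.mm"


def mm(a, b):
    """Hypothetical stand-in for a module-level function such as torch.mm."""
    return "mm"


by_name = {}  # keyed by __name__: the two callables collide
by_str = {}   # keyed by str(func): the two callables stay distinct

for func in (mm, Tensor.mm):
    by_name[func.__name__] = func
    by_str[str(func)] = func

print(len(by_name))  # 1, the second entry overwrote the first
print(len(by_str))   # 2
```

With `__name__` as the key, saving the original callables before patching would keep only one of the two, so restoring them later could put the wrong function back.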
-
- 26 Jan, 2023 1 commit
-
-
Committed by Ma, Guokai
* Integrate accelerator abstraction interface into deepspeed/
* Fix error message in fp16/fused_optimizer
* fix error message in fp16/unfused_optimizer.py
* assign get_accelerator().pin_memory() result to input Tensor name
* no need to check cuda and whether nvtx supported
* move try-except into innermost block
* call Event() and Stream() in get_accelerator() for data type
* Make Stream and Event properties of the abstract interface so they can be used as data types in deepspeed
* Apply op_builder backend api change from #2705 from @jeffra
* fix tests where Builder NAME is used
* keep original ...Builder.NAME interface instead of ...Builder().NAME interface
* fix builder closure for installation
* fix randomltd builder
* add comments to clarify create_op_builder and get_op_builder
* fix compatibility with pip install -e
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 22 Nov, 2022 1 commit
-
-
Committed by Jeff Rasley
* fixes for new torch.numel return type
* address comment
-
- 27 Oct, 2022 1 commit
-
-
Committed by Cheng Li
* rollback ds config changes
* fix format
* Fix error when output_file is a relative path without a prefix (#2397)
  Co-authored-by: Benjamin Steenhoek <benjaminjsteenhoek@gmail.com>
* fix results and exprs path to use absolute path
* write out optimal config after tuning
* fix format
* assert tuning result dir creation
Co-authored-by: Benjamin Steenhoek <benjaminjsteenhoek@gmail.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
- 26 Oct, 2022 1 commit
-
-
Committed by Cheng Li
* update pytorch pool operator function signature
* fix the case where kwargs is None
-
- 03 Aug, 2022 1 commit
-
-
Committed by Zion Wu
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 26 Jul, 2022 1 commit
-
-
Committed by Alex Hedges
-
- 20 Jul, 2022 1 commit
-
-
Committed by Aman Sanger
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 14 Jul, 2022 1 commit
-
-
Committed by Cheng Li
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 23 Jun, 2022 1 commit
-
-
Committed by Michael Wyatt
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 01 Jun, 2022 1 commit
-
-
Committed by Cheng Li
-
- 09 Apr, 2022 1 commit
-
-
Committed by TongXU
-
- 19 Feb, 2022 1 commit
-
-
Committed by Cheng Li
* generalize profiler model input format
* fix timing of module forward
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 05 Feb, 2022 1 commit
-
-
Committed by Cheng Li
* separate add and mul flops compute function
* fix format
-
- 15 Dec, 2021 1 commit
-
-
Committed by Cheng Li
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 06 Nov, 2021 1 commit
-
-
Committed by Nathan Frey
Fix typos in Flops Profiler message
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 05 Nov, 2021 1 commit
-
-
Committed by Cheng Li
-
- 02 Oct, 2021 1 commit
-
-
Committed by Alex Hedges
* Fix typos in docs/
* Fix typos in code comments and output strings
* Fix typos in the code itself
* Fix typos in tests/
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 13 May, 2021 1 commit
-
-
Committed by Cheng Li
* use the original function's name as the key to old_functions dict
* update profile output format
* print at global rank 0
* add flops calculation in bwd pass using time from ds timers
* improve aggregated profiling output to show all depths
* print samples/second
* update readme and examples
* update docs
* fix typo and reorder printing
* fix format
-
- 17 Mar, 2021 1 commit
-
-
Committed by brett koonce
-
- 12 Mar, 2021 1 commit
-
-
Committed by Cheng Li
* add optimizers and schedules to rtd
* update ds website and fix links
* add flops profiler to rtd
* fix
Co-authored-by: Shaden Smith <Shaden.Smith@microsoft.com>
-
- 11 Feb, 2021 1 commit
-
-
Committed by Cheng Li
* work on flops profiler tutorial
* update flops profiler tutorial
* add flops profiler tutorial and fix names
* fix trailing whitespace
* fix names
* remove multistep profiling and update docs
* fix cases where functionals and submodules coexist in a parent module, update readme
* fix typo
* always invoke post hook function
* fix module flops sum and update tests
* update tutorial
-
- 09 Feb, 2021 1 commit
-
-
Committed by Jon Eyolfson
* Improve starred expressions
  `deepspeed/profiling/flops_profiler/profiler.py` uses starred expressions that are no longer valid with [PEP 617][1]. The new Python parser is in 3.9, and this change allows DeepSpeed to run with the newest Python version. I have not checked all locations that have this issue. However, this change allows me to run simple examples.
  [1]: https://www.python.org/dev/peps/pep-0617/
* Match style for "Improve starred expressions", although readability suffers
  The style guide might need to be updated for this new use case of expressions. Python [Issue 40631][2] includes more discussion on the change.
  [2]: https://bugs.python.org/issue40631
Co-authored-by: Cheng Li <pistasable@gmail.com>
-
- 13 Jan, 2021 1 commit
-
-
Committed by Cheng Li
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-