- 16 Jun, 2023 2 commits
-
-
Committed by Dino Chen
* fix ccl_backend path when it should fall back
* fix residual_add fallback when only one kernel is ready
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Alejandro Dubrovsky
Co-authored-by: Alex Dubrovsky <dubro@amazon.com>
-
- 15 Jun, 2023 2 commits
-
-
Committed by Conglong Li
-
Committed by mzl
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 14 Jun, 2023 5 commits
-
-
Committed by Logan Adams
-
Committed by tensor-tang
* fix gated_mlp.py
* fix hybrid_engine.py
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by john li
* include cublas error details when getting cublas handle fails
* run clang-format
* just use raw enum value to avoid depending on minimum cuda version
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by StrayWarrior
Co-authored-by: Feng Zhoutian <fengzhoutian@meituan.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Logan Adams
* Fix apex installation
* Switch install flag from build-opt to global-opt to fix missing cpp_ext
* Try installing with support for newer pip
* Add build packaging
* Update to latest
* Pin to specific commit while pyproject.toml is fixed
-
- 13 Jun, 2023 1 commit
-
-
Committed by Joe Mayer
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 10 Jun, 2023 5 commits
-
-
Committed by Ma, Guokai
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Committed by Jeff Rasley
-
Committed by Logan Adams
* Add non-interactive prompt, causing issues for some users
* Update pytorch version too
-
Committed by Abhilash Majumder
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Ma, Guokai
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 09 Jun, 2023 3 commits
-
-
Committed by Olatunji Ruwase
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Logan Adams
* Fix typo in name of hybrid engine function
* Fix
-
Committed by hablb
* Remove dead code: params_already_reduced is not used
* Prevent evaluation of debug strings: debug strings are evaluated even when logging is disabled
* Use contiguous gradients tensor reduce scatter between ranks: use allreduce instead of reduce scatter, for lower CPU overhead
* Move overflow tracker to optimizer.step: don't check overflow in gradients for every bucket; do the overflow check once on the grad flat buffer just before the optimizer step
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 08 Jun, 2023 6 commits
-
-
Committed by Conglong Li
* DeepSpeed overview in Japanese
* DeepSpeed overview in Japanese
-
Committed by john li
* Small tweak on cuda version mismatch documentation
* clarify minor versions should also match
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Michael Wyatt
* fix typo and missing epsilon value
* Touch file to re-build
* revert changes
* Touch file to re-build
* Format
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Logan Adams <loadams@microsoft.com>
-
Committed by digger yu
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Reza Yazdani
* fix gpt-j inference issue for mlp_gemm_func call
* bring back the gpt-j inference-test
* fix formatting
* fix the neox and pythia injection issue
-
Committed by Logan Adams
This reverts commit f2f5f21b.
-
- 07 Jun, 2023 5 commits
-
-
Committed by tensor-tang
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Logan Adams
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Abhilash Majumder
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
Committed by Byungsoo Oh
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Ramya Ramineni
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
-
- 06 Jun, 2023 3 commits
-
-
Committed by Siddharth Singh
-
Committed by Olatunji Ruwase
* Use logger in accelerator
* Handle pre-build cases
* Explain possible import failure
-
Committed by digger yu
* fix typos in deepspeed/runtime
* fix some typos
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 05 Jun, 2023 1 commit
-
-
Committed by Zhen Zhang
* fix mics save checkpoint hanging
* MiCS load_checkpoint
* copyright
* fix for torch-1.9.0: all_reduce_coalesced api does not support nccl backend
* Naming alignment
* adding more test conditions for mics shard size
* test with different shard sizes
* adding assertion for better error msg
---------
Co-authored-by: Zhen Zhang <zhzhn@amazon.com>
-
- 03 Jun, 2023 3 commits
-
-
Committed by Jeff Rasley
-
Committed by Buğra
* Refactor check_enabled root validator in DeepSpeedMonitorConfig
* formatting
* formatting
---------
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
-
Committed by digger yu
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
- 02 Jun, 2023 4 commits
-
-
Committed by 郭叶军
When activation checkpointing is enabled, most of the forward pass is recomputed, so the FLOPS calculation should be updated with recompute_fwd_factor=1.0. I did not find a way to pass the option from the model script to the DeepSpeed engine, so the option is added directly to flops_profiler.
Co-authored-by: Cheng Li <pistasable@gmail.com>
-
Committed by digger yu
* fix spelling errors in deepspeed/runtime/
* fix typos in docs/
* fix typos in comments in deepspeed/
* fix typos in deepspeed/
* Update constants.py: remove the space after nebula
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
-
Committed by Michael Wyatt
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-
Committed by Haodong Lyu
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
-