- en: Broadcasting semantics
id: totrans-0
prefs:
- PREF_H1
type: TYPE_NORMAL
zh: 广播语义
- en: 原文:[https://pytorch.org/docs/stable/notes/broadcasting.html](https://pytorch.org/docs/stable/notes/broadcasting.html)
id: totrans-1
prefs:
- PREF_BQ
type: TYPE_NORMAL
zh: 原文:[https://pytorch.org/docs/stable/notes/broadcasting.html](https://pytorch.org/docs/stable/notes/broadcasting.html)
- en: Many PyTorch operations support NumPy’s broadcasting semantics. See [https://numpy.org/doc/stable/user/basics.broadcasting.html](https://numpy.org/doc/stable/user/basics.broadcasting.html)
for details.
id: totrans-2
prefs: []
type: TYPE_NORMAL
zh: 许多PyTorch操作支持NumPy的广播语义。有关详细信息,请参阅[https://numpy.org/doc/stable/user/basics.broadcasting.html](https://numpy.org/doc/stable/user/basics.broadcasting.html)。
- en: In short, if a PyTorch operation supports broadcast, then its Tensor arguments
can be automatically expanded to be of equal sizes (without making copies of the
data).
id: totrans-3
prefs: []
type: TYPE_NORMAL
zh: 简而言之,如果PyTorch操作支持广播,则其张量参数可以自动扩展为相等大小(而不会复制数据)。
- en: General semantics
id: totrans-4
prefs:
- PREF_H2
type: TYPE_NORMAL
zh: 一般语义
- en: 'Two tensors are “broadcastable” if the following rules hold:'
id: totrans-5
prefs: []
type: TYPE_NORMAL
zh: 如果满足以下规则,则两个张量是“可广播的”:
- en: Each tensor has at least one dimension.
id: totrans-6
prefs:
- PREF_UL
type: TYPE_NORMAL
zh: 每个张量至少有一个维度。
- en: When iterating over the dimension sizes, starting at the trailing dimension,
the dimension sizes must either be equal, one of them is 1, or one of them does
not exist.
id: totrans-7
prefs:
- PREF_UL
type: TYPE_NORMAL
zh: 在迭代维度大小时,从尾部维度开始,维度大小必须要么相等,要么其中一个为1,要么其中一个不存在。
- en: 'For Example:'
id: totrans-8
prefs: []
type: TYPE_NORMAL
zh: 例如:
- en: '[PRE0]'
id: totrans-9
prefs: []
type: TYPE_PRE
zh: '[PRE0]'
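The `[PRE0]` block is collapsed in this diff; as a non-authoritative sketch of the kind of example these two rules call for (all shapes below are chosen purely for illustration):

```py
import torch

x = torch.empty(5, 7, 3)
y = torch.empty(5, 7, 3)
# same shapes are always broadcastable (all dimension sizes are equal)

x = torch.empty(5, 3, 4, 1)
y = torch.empty(   3, 1, 1)
# x and y are broadcastable, lining up trailing dimensions:
#  1st trailing dimension: both have size 1
#  2nd trailing dimension: y has size 1
#  3rd trailing dimension: x size == y size
#  4th trailing dimension: y's dimension does not exist

x = torch.empty(5, 2, 4, 1)
y = torch.empty(   3, 1, 1)
# NOT broadcastable: in the 3rd trailing dimension 2 != 3
```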
- en: 'If two tensors `x`, `y` are “broadcastable”, the resulting tensor size is calculated
as follows:'
id: totrans-10
prefs: []
type: TYPE_NORMAL
zh: 如果两个张量`x`,`y`是“可广播的”,则结果张量大小计算如下:
- en: If the number of dimensions of `x` and `y` are not equal, prepend 1 to the dimensions
of the tensor with fewer dimensions to make them equal length.
id: totrans-11
prefs:
- PREF_UL
type: TYPE_NORMAL
zh: 如果`x`和`y`的维度数不相等,则在维度较少的张量的形状前面补1,使它们长度相等。
- en: Then, for each dimension size, the resulting dimension size is the max of the
sizes of `x` and `y` along that dimension.
id: totrans-12
prefs:
- PREF_UL
type: TYPE_NORMAL
zh: 然后,对于每个维度大小,结果维度大小是沿该维度的`x`和`y`的大小的最大值。
- en: 'For Example:'
id: totrans-13
prefs: []
type: TYPE_NORMAL
zh: 例如:
- en: '[PRE1]'
id: totrans-14
prefs: []
type: TYPE_PRE
zh: '[PRE1]'
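`[PRE1]` is likewise collapsed; a hedged sketch of the size arithmetic just described (shapes are illustrative):

```py
import torch

x = torch.empty(5, 1, 4, 1)
y = torch.empty(   3, 1, 1)
print((x + y).size())  # torch.Size([5, 3, 4, 1])

x = torch.empty(1)
y = torch.empty(3, 1, 7)
print((x + y).size())  # torch.Size([3, 1, 7]); 1s are prepended to x's shape

x = torch.empty(5, 2, 4, 1)
y = torch.empty(3, 1, 1)
# x + y raises a RuntimeError: sizes 2 and 3 clash at non-singleton dimension 1
```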
- en: In-place semantics
id: totrans-15
prefs:
- PREF_H2
type: TYPE_NORMAL
zh: 就地语义
- en: One complication is that in-place operations do not allow the in-place tensor
to change shape as a result of the broadcast.
id: totrans-16
prefs: []
type: TYPE_NORMAL
zh: 一个复杂之处在于就地操作不允许就地张量由于广播而改变形状。
- en: 'For Example:'
id: totrans-17
prefs: []
type: TYPE_NORMAL
zh: 例如:
- en: '[PRE2]'
id: totrans-18
prefs: []
type: TYPE_PRE
zh: '[PRE2]'
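`[PRE2]` is collapsed; a minimal sketch of the in-place restriction, assuming the shapes below:

```py
import torch

x = torch.empty(5, 3, 4, 1)
y = torch.empty(   3, 1, 1)
x.add_(y)  # fine: broadcasting y leaves x's shape at torch.Size([5, 3, 4, 1])

x = torch.empty(1, 3, 1)
y = torch.empty(3, 1, 7)
# x.add_(y) raises a RuntimeError: the broadcast result would have shape
# (3, 3, 7), and an in-place op may not change the shape of x
```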
- en: Backwards compatibility
id: totrans-19
prefs:
- PREF_H2
type: TYPE_NORMAL
zh: 向后兼容性
- en: Prior versions of PyTorch allowed certain pointwise functions to execute on
tensors with different shapes, as long as the number of elements in each tensor
was equal. The pointwise operation would then be carried out by viewing each tensor
as 1-dimensional. PyTorch now supports broadcasting and the “1-dimensional” pointwise
behavior is considered deprecated and will generate a Python warning in cases
where tensors are not broadcastable, but have the same number of elements.
id: totrans-20
prefs: []
type: TYPE_NORMAL
zh: PyTorch的先前版本允许某些逐点函数在具有不同形状的张量上执行,只要每个张量中的元素数量相等即可。然后,逐点操作将通过将每个张量视为1维来执行。PyTorch现在支持广播,而“1维”逐点行为被视为已弃用,并且在张量不可广播但具有相同数量的元素的情况下会生成Python警告。
- en: 'Note that the introduction of broadcasting can cause backwards incompatible
changes in the case where two tensors do not have the same shape, but are broadcastable
and have the same number of elements. For Example:'
id: totrans-21
prefs: []
type: TYPE_NORMAL
zh: 请注意,在两个张量形状不同、但可广播且元素数量相同的情况下,引入广播可能导致向后不兼容的变化。例如:
- en: '[PRE3]'
id: totrans-22
prefs: []
type: TYPE_PRE
zh: '[PRE3]'
- en: 'would previously produce a Tensor with size: torch.Size([4,1]), but now produces
a Tensor with size: torch.Size([4,4]). In order to help identify cases in your
code where backwards incompatibilities introduced by broadcasting may exist, you
may set torch.utils.backcompat.broadcast_warning.enabled to True, which will generate
a python warning in such cases.'
id: totrans-23
prefs: []
type: TYPE_NORMAL
zh: 以前会产生一个大小为torch.Size([4,1])的张量,但现在会产生一个大小为torch.Size([4,4])的张量。为了帮助识别代码中可能存在的由广播引入的向后不兼容性,您可以将torch.utils.backcompat.broadcast_warning.enabled设置为True,在这种情况下会生成一个Python警告。
- en: 'For Example:'
id: totrans-24
prefs: []
type: TYPE_NORMAL
zh: 例如:
- en: '[PRE4]'
id: totrans-25
prefs: []
type: TYPE_PRE
zh: '[PRE4]'
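`[PRE4]` is collapsed; a sketch of how the flag described above might be exercised. Note that `torch.utils.backcompat.broadcast_warning` dates from the release that introduced broadcasting and may be absent from current PyTorch versions:

```py
import torch

# May raise AttributeError on recent PyTorch builds where the
# backcompat flag has been removed.
torch.utils.backcompat.broadcast_warning.enabled = True
torch.add(torch.ones(4, 1), torch.ones(4))
# UserWarning: self and other do not have the same shape, but are
# broadcastable, and have the same number of elements.
```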
- en: CPU threading and TorchScript inference
id: totrans-0
prefs:
- PREF_H1
type: TYPE_NORMAL
zh: CPU 线程和 TorchScript 推理
- en: 原文:[https://pytorch.org/docs/stable/notes/cpu_threading_torchscript_inference.html](https://pytorch.org/docs/stable/notes/cpu_threading_torchscript_inference.html)
id: totrans-1
prefs:
- PREF_BQ
type: TYPE_NORMAL
zh: '原文:[https://pytorch.org/docs/stable/notes/cpu_threading_torchscript_inference.html](https://pytorch.org/docs/stable/notes/cpu_threading_torchscript_inference.html)'
- en: 'PyTorch allows using multiple CPU threads during TorchScript model inference.
The following figure shows different levels of parallelism one would find in a
typical application:'
id: totrans-2
prefs: []
type: TYPE_NORMAL
zh: PyTorch 允许在 TorchScript 模型推理期间使用多个 CPU 线程。以下图显示了在典型应用程序中可能找到的不同级别的并行性:
- en: '[![../_images/cpu_threading_torchscript_inference.svg](../Images/8df78fa0159321538b2e2a438f6cae52.png)](../_images/cpu_threading_torchscript_inference.svg)'
id: totrans-3
prefs: []
type: TYPE_NORMAL
zh: '[![../_images/cpu_threading_torchscript_inference.svg](../Images/8df78fa0159321538b2e2a438f6cae52.png)](../_images/cpu_threading_torchscript_inference.svg)'
- en: 'One or more inference threads execute a model’s forward pass on the given inputs.
Each inference thread invokes a JIT interpreter that executes the ops of a model
inline, one by one. A model can utilize a `fork` TorchScript primitive to launch
an asynchronous task. Forking several operations at once results in a task that
is executed in parallel. The `fork` operator returns a `Future` object which can
be used to synchronize on later, for example:'
id: totrans-4
prefs: []
type: TYPE_NORMAL
zh: 一个或多个推理线程在给定输入上执行模型的前向传递。每个推理线程调用 JIT 解释器,逐个执行模型的操作。模型可以利用 `fork` TorchScript
原语启动一个异步任务。一次分叉多个操作会导致并行执行的任务。`fork` 操作符返回一个 `Future` 对象,可以稍后用于同步,例如:
- en: '[PRE0]'
id: totrans-5
prefs: []
type: TYPE_PRE
zh: '[PRE0]'
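This document's `[PRE0]` is collapsed; a self-contained sketch of `fork`/`wait` using the public `torch.jit.fork` and `torch.jit.wait` APIs (the `branch` helper and the input size are made up for illustration):

```py
import torch

@torch.jit.script
def branch(x):
    return torch.mm(x, x)

@torch.jit.script
def forward(x):
    # launch an asynchronous task on the inter-op thread pool
    fut = torch.jit.fork(branch, x)
    # this op runs in parallel with the forked task
    y = torch.mm(x, x)
    # synchronize on the Future returned by fork
    return y + torch.jit.wait(fut)

print(forward(torch.randn(4, 4)))
```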
- en: PyTorch uses a single thread pool for the inter-op parallelism, this thread
pool is shared by all inference tasks that are forked within the application process.
id: totrans-6
prefs: []
type: TYPE_NORMAL
zh: PyTorch 使用一个线程池来进行操作间的并行处理,这个线程池被应用程序进程中的所有分叉推理任务共享。
- en: In addition to the inter-op parallelism, PyTorch can also utilize multiple threads
within the ops (intra-op parallelism). This can be useful in many cases, including
element-wise ops on large tensors, convolutions, GEMMs, embedding lookups and
others.
id: totrans-7
prefs: []
type: TYPE_NORMAL
zh: 除了操作间的并行性,PyTorch 还可以利用操作内的多个线程(操作内的并行性)。这在许多情况下都很有用,包括大张量的逐元素操作、卷积、GEMM、嵌入查找等。
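A minimal sketch of intra-op parallelism (the tensor size is arbitrary; the effective thread count depends on the build and the machine):

```py
import torch

print(torch.get_num_threads())  # size of the intra-op thread pool

x = torch.randn(8192, 8192)
y = x.sigmoid()  # a single large element-wise op; ATen may split the
                 # work across up to get_num_threads() intra-op threads
```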
- en: Build options
id: totrans-8
prefs:
- PREF_H2
type: TYPE_NORMAL
zh: 构建选项
- en: PyTorch uses an internal ATen library to implement ops. In addition to that,
PyTorch can also be built with support of external libraries, such as [MKL](https://software.intel.com/en-us/mkl)
and [MKL-DNN](https://github.com/intel/mkl-dnn), to speed up computations on CPU.
id: totrans-9
prefs: []
type: TYPE_NORMAL
zh: PyTorch 使用内部的 ATen 库来实现操作。除此之外,PyTorch 还可以构建支持外部库,如 [MKL](https://software.intel.com/en-us/mkl)
和 [MKL-DNN](https://github.com/intel/mkl-dnn),以加速 CPU 上的计算。
- en: 'ATen, MKL and MKL-DNN support intra-op parallelism and depend on the following
parallelization libraries to implement it:'
id: totrans-10
prefs: []
type: TYPE_NORMAL
zh: ATen、MKL 和 MKL-DNN 支持操作内的并行性,并依赖以下并行化库来实现:
- en: '[OpenMP](https://www.openmp.org/) - a standard (and a library, usually shipped
with a compiler), widely used in external libraries;'
id: totrans-11
prefs:
- PREF_UL
type: TYPE_NORMAL
zh: '[OpenMP](https://www.openmp.org/) - 一个标准(同时也是一个库,通常随编译器一起提供),在外部库中被广泛使用;'
- en: '[TBB](https://github.com/intel/tbb) - a newer parallelization library optimized
for task-based parallelism and concurrent environments.'
id: totrans-12
prefs:
- PREF_UL
type: TYPE_NORMAL
zh: '[TBB](https://github.com/intel/tbb) - 一个针对任务并行性和并发环境进行了优化的较新的并行化库。'
- en: OpenMP historically has been used by a large number of libraries. It is known
for a relative ease of use and support for loop-based parallelism and other primitives.
id: totrans-13
prefs: []
type: TYPE_NORMAL
zh: OpenMP 历史上被许多库使用。它以相对易用和支持基于循环的并行性和其他原语而闻名。
- en: TBB is used to a lesser extent in external libraries, but, at the same time,
is optimized for the concurrent environments. PyTorch’s TBB backend guarantees
that there’s a separate, single, per-process intra-op thread pool used by all
of the ops running in the application.
id: totrans-14
prefs: []
type: TYPE_NORMAL
zh: TBB 在外部库中的使用较少,但同时针对并发环境进行了优化。PyTorch 的 TBB 后端保证,应用程序中运行的所有操作共享同一个单独的、每进程唯一的操作内(intra-op)线程池。
- en: Depending on the use case, one might find one or another parallelization library a better choice in their application.
id: totrans-15
prefs: []
type: TYPE_NORMAL
zh: 根据具体用例,开发者可能会发现其中一种并行化库更适合自己的应用程序。
- en: 'PyTorch allows selecting of the parallelization backend used by ATen and other
libraries at the build time with the following build options:'
id: totrans-16
prefs: []
type: TYPE_NORMAL
zh: PyTorch 允许在构建时选择 ATen 和其他库使用的并行化后端,具体的构建选项如下:
- en: '| Library | Build Option | Values | Notes |'
id: totrans-17
prefs: []
type: TYPE_TB
zh: '| 库 | 构建选项 | 取值 | 备注 |'
- en: '| --- | --- | --- | --- |'
id: totrans-18
prefs: []
type: TYPE_TB
zh: '| --- | --- | --- | --- |'
- en: '| ATen | `ATEN_THREADING` | `OMP` (default), `TBB` | |'
id: totrans-19
prefs: []
type: TYPE_TB
zh: '| ATen | `ATEN_THREADING` | `OMP`(默认),`TBB` | |'
- en: '| MKL | `MKL_THREADING` | (same) | To enable MKL use `BLAS=MKL` |'
id: totrans-20
prefs: []
type: TYPE_TB
zh: '| MKL | `MKL_THREADING` | (相同) | 要启用 MKL,请使用 `BLAS=MKL` |'
- en: '| MKL-DNN | `MKLDNN_CPU_RUNTIME` | (same) | To enable MKL-DNN use `USE_MKLDNN=1`
|'
id: totrans-21
prefs: []
type: TYPE_TB
zh: '| MKL-DNN | `MKLDNN_CPU_RUNTIME` | (相同) | 要启用 MKL-DNN,请使用 `USE_MKLDNN=1` |'
- en: It is recommended not to mix OpenMP and TBB within one build.
id: totrans-22
prefs: []
type: TYPE_NORMAL
zh: 建议不要在一个构建中混合使用 OpenMP 和 TBB。
- en: 'Any of the `TBB` values above require `USE_TBB=1` build setting (default: OFF).
A separate setting `USE_OPENMP=1` (default: ON) is required for OpenMP parallelism.'
id: totrans-23
prefs: []
type: TYPE_NORMAL
zh: 上述任何 `TBB` 值都需要 `USE_TBB=1` 构建设置(默认为 OFF)。OpenMP 并行性需要单独设置 `USE_OPENMP=1`(默认为
ON)。
- en: Runtime API
id: totrans-24
prefs:
- PREF_H2
type: TYPE_NORMAL
zh: 运行时 API
- en: 'The following API is used to control thread settings:'
id: totrans-25
prefs: []
type: TYPE_NORMAL
zh: 以下 API 用于控制线程设置:
- en: '| Type of parallelism | Settings | Notes |'
id: totrans-26
prefs: []
type: TYPE_TB
zh: '| 并行性类型 | 设置 | 备注 |'
- en: '| --- | --- | --- |'
id: totrans-27
prefs: []
type: TYPE_TB
zh: '| --- | --- | --- |'
- en: '| Inter-op parallelism | `at::set_num_interop_threads`, `at::get_num_interop_threads`
(C++)`set_num_interop_threads`, `get_num_interop_threads` (Python, [`torch`](../torch.html#module-torch
"torch") module) | Default number of threads: number of CPU cores. |'
id: totrans-28
prefs: []
type: TYPE_TB
zh: '| 操作间的并行性 | `at::set_num_interop_threads`,`at::get_num_interop_threads`(C++)`set_num_interop_threads`,`get_num_interop_threads`(Python,[`torch`](../torch.html#module-torch
"torch") 模块) | 默认线程数:CPU 核心数。 |'
- en: '| Intra-op parallelism | `at::set_num_threads`, `at::get_num_threads` (C++)
`set_num_threads`, `get_num_threads` (Python, [`torch`](../torch.html#module-torch
"torch") module)Environment variables: `OMP_NUM_THREADS` and `MKL_NUM_THREADS`
|'
id: totrans-29
prefs: []
type: TYPE_TB
zh: '| 操作内的并行性 | `at::set_num_threads`,`at::get_num_threads`(C++)`set_num_threads`,`get_num_threads`(Python,[`torch`](../torch.html#module-torch "torch") 模块)环境变量:`OMP_NUM_THREADS` 和 `MKL_NUM_THREADS` |'
- en: For the intra-op parallelism settings, `at::set_num_threads`, `torch.set_num_threads`
always take precedence over environment variables, `MKL_NUM_THREADS` variable
takes precedence over `OMP_NUM_THREADS`.
id: totrans-30
prefs: []
type: TYPE_NORMAL
zh: 对于操作内并行设置,`at::set_num_threads`、`torch.set_num_threads`始终优先于环境变量;而在环境变量中,`MKL_NUM_THREADS`优先于`OMP_NUM_THREADS`。
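A hedged sketch of the runtime API from the table above, assuming a fresh process in which no parallel work has started yet:

```py
import torch

# Inter-op pool (used by tasks launched with torch.jit.fork). Must be set
# before any inter-op parallel work runs, or PyTorch raises a RuntimeError.
torch.set_num_interop_threads(4)
print(torch.get_num_interop_threads())

# Intra-op pool (threads used inside a single op). This call takes
# precedence over the OMP_NUM_THREADS / MKL_NUM_THREADS variables.
torch.set_num_threads(4)
print(torch.get_num_threads())
```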
- en: Tuning the number of threads[](#tuning-the-number-of-threads "Permalink to this
heading")
id: totrans-31
prefs:
- PREF_H2
type: TYPE_NORMAL
zh: 调整线程数量[](#tuning-the-number-of-threads "跳转到此标题的永久链接")
- en: 'The following simple script shows how a runtime of matrix multiplication changes
with the number of threads:'
id: totrans-32
prefs: []
type: TYPE_NORMAL
zh: 以下简单脚本展示了矩阵乘法的运行时间如何随线程数量变化:
- en: '[PRE1]'
id: totrans-33
prefs: []
type: TYPE_PRE
zh: '[PRE1]'
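`[PRE1]` is collapsed; the benchmark it refers to presumably resembles this sketch (matrix size, thread range, and iteration count are assumptions):

```py
import timeit
import torch

for t in [1] + list(range(2, 25, 2)):
    torch.set_num_threads(t)
    r = timeit.timeit(
        setup="import torch; x = torch.randn(1024, 1024); y = torch.randn(1024, 1024)",
        stmt="torch.mm(x, y)",
        number=100,
    )
    print(f"{t:2d} intra-op threads: {r:.3f} s per 100 matmuls")
```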
- en: 'Running the script on a system with 24 physical CPU cores (Xeon E5-2680, MKL
and OpenMP based build) results in the following runtimes:'
id: totrans-34
prefs: []
type: TYPE_NORMAL
zh: 在具有24个物理CPU核心的系统(基于Xeon E5-2680、MKL和OpenMP构建)上运行脚本会产生以下运行时间:
- en: '[![../_images/cpu_threading_runtimes.svg](../Images/50cb089741be0ac4482f410e4d719b4b.png)](../_images/cpu_threading_runtimes.svg)'
id: totrans-35
prefs: []
type: TYPE_NORMAL
zh: '[![../_images/cpu_threading_runtimes.svg](../Images/50cb089741be0ac4482f410e4d719b4b.png)](../_images/cpu_threading_runtimes.svg)'
- en: 'The following considerations should be taken into account when tuning the number
of intra- and inter-op threads:'
id: totrans-36
prefs: []
type: TYPE_NORMAL
zh: 调整操作内(intra-op)和操作间(inter-op)线程数量时,应考虑以下因素:
- en: When choosing the number of threads one needs to avoid oversubscription (using
too many threads, leads to performance degradation). For example, in an application
that uses a large application thread pool or heavily relies on inter-op parallelism,
one might find disabling intra-op parallelism as a possible option (i.e. by calling
`set_num_threads(1)`);
id: totrans-37
prefs:
- PREF_UL
type: TYPE_NORMAL
zh: 在选择线程数量时,需要避免过度订阅(使用过多线程会导致性能下降)。例如,在使用大型应用程序线程池或严重依赖操作间并行性的应用程序中,可以考虑禁用操作内并行(即调用`set_num_threads(1)`);
- en: In a typical application one might encounter a trade off between latency (time
spent on processing an inference request) and throughput (amount of work done
per unit of time). Tuning the number of threads can be a useful tool to adjust
this trade off. For example, in latency critical applications one might want
to increase the number of intra-op threads to process each request as fast as
possible. At the same time, parallel implementations of ops may add an extra
overhead that increases the amount of work done per single request and thus
reduces the overall throughput.
id: totrans-38
prefs:
- PREF_UL
type: TYPE_NORMAL
zh: 在典型应用程序中,可能需要在延迟(处理一个推理请求所花费的时间)和吞吐量(单位时间内完成的工作量)之间进行权衡。调整线程数量是调节这种权衡的有用工具。例如,在对延迟敏感的应用程序中,可能希望增加操作内线程的数量,以尽可能快地处理每个请求。同时,操作的并行实现可能会带来额外开销,增加单个请求的工作量,从而降低整体吞吐量。
- en: Warning
id: totrans-39
prefs: []
type: TYPE_NORMAL
zh: 警告
- en: OpenMP does not guarantee that a single per-process intra-op thread pool is
going to be used in the application. On the contrary, two different application
or inter-op threads may use different OpenMP thread pools for intra-op work. This
might result in a large number of threads used by the application. Extra care
in tuning the number of threads is needed to avoid oversubscription in multi-threaded
applications in OpenMP case.
id: totrans-40
prefs: []
type: TYPE_NORMAL
zh: OpenMP不保证应用程序会使用单一的、每进程唯一的操作内线程池。相反,两个不同的应用程序线程或操作间线程可能会使用不同的OpenMP线程池来执行操作内工作。这可能导致应用程序使用大量线程。在使用OpenMP的情况下,需要特别小心地调整线程数量,以避免多线程应用程序中的过度订阅。
- en: Note
id: totrans-41
prefs: []
type: TYPE_NORMAL
zh: 注意
- en: Pre-built PyTorch releases are compiled with OpenMP support.
id: totrans-42
prefs: []
type: TYPE_NORMAL
zh: 预构建的PyTorch发行版均已启用OpenMP支持。
- en: Note
id: totrans-43
prefs: []
type: TYPE_NORMAL
zh: 注意
- en: '`parallel_info` utility prints information about thread settings and can be
used for debugging. Similar output can be also obtained in Python with `torch.__config__.parallel_info()`
call.'
id: totrans-44
prefs: []
type: TYPE_NORMAL
zh: '`parallel_info`实用程序打印有关线程设置的信息,可用于调试。在Python中也可以通过`torch.__config__.parallel_info()`调用获得类似的输出。'
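A minimal usage sketch:

```py
import torch

# prints ATen/Parallel build and runtime details: intra-/inter-op thread
# counts, OpenMP/MKL versions, and the parallel backend in use
print(torch.__config__.parallel_info())
```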