beginner_source/ddp_series_theory.rst translation #896

Merged 6 commits on Oct 15, 2024
85 changes: 42 additions & 43 deletions beginner_source/ddp_series_theory.rst
@@ -1,70 +1,69 @@
`Introduction <ddp_series_intro.html>`__ \|\| **What is DDP** \|\|
`Single-Node Multi-GPU Training <ddp_series_multigpu.html>`__ \|\|
`Fault Tolerance <ddp_series_fault_tolerance.html>`__ \|\|
`Multi-Node training <../intermediate/ddp_series_multinode.html>`__ \|\|
`minGPT Training <../intermediate/ddp_series_minGPT.html>`__

What is Distributed Data Parallel (DDP)
=======================================

Authors: `Suraj Subramanian <https://github.com/suraj813>`__
Translation: `박지은 <https://github.com/rumjie>`__

.. grid:: 2

    .. grid-item-card:: :octicon:`mortar-board;1em;` What you will learn

        * How DDP works under the hood
        * What is ``DistributedSampler``
        * How gradients are synchronized across GPUs

    .. grid-item-card:: :octicon:`list-unordered;1em;` Prerequisites

        * Familiarity with `basic non-distributed training <https://tutorials.pytorch.kr/beginner/basics/quickstart_tutorial.html>`__ in PyTorch

Follow along with the video below or on `youtube <https://www.youtube.com/watch/Cvdhwx-OBBo>`__.

.. raw:: html

   <div style="margin-top:10px; margin-bottom:10px;">
     <iframe width="560" height="315" src="https://www.youtube.com/embed/Cvdhwx-OBBo" frameborder="0" allow="accelerometer; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
   </div>

This tutorial is a gentle introduction to PyTorch `DistributedDataParallel <https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html>`__ (DDP)
which enables data parallel training in PyTorch. Data parallelism is a way to
process multiple data batches across multiple devices simultaneously
to achieve better performance. In PyTorch, the `DistributedSampler <https://pytorch.org/docs/stable/data.html#torch.utils.data.distributed.DistributedSampler>`__
ensures each device gets a non-overlapping input batch. The model is replicated on all the devices;
each replica calculates gradients and simultaneously synchronizes with the others using the `ring all-reduce
algorithm <https://tech.preferred.jp/en/blog/technologies-behind-distributed-deep-learning-allreduce/>`__.
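
The sketch below shows roughly how these pieces fit together in a per-process
training loop on a single node with one process per GPU. It is an illustrative
outline rather than the tutorial's actual script: ``dataset`` and ``model`` are
hypothetical placeholders, and the rendezvous settings (``MASTER_ADDR``,
``MASTER_PORT``) are assumed to be provided by a launcher such as ``torchrun``.

.. code-block:: python

    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP
    from torch.utils.data import DataLoader
    from torch.utils.data.distributed import DistributedSampler

    def train(rank: int, world_size: int, dataset, model):
        # One process per GPU; "nccl" is the usual backend for CUDA devices.
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        torch.cuda.set_device(rank)

        # DistributedSampler shards the dataset so that each rank draws a
        # non-overlapping subset of the input batches.
        sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
        loader = DataLoader(dataset, batch_size=32, sampler=sampler)

        # Every rank holds a full replica of the model; DDP registers hooks that
        # all-reduce the gradients across ranks during backward().
        model = DDP(model.to(rank), device_ids=[rank])

        loss_fn = torch.nn.CrossEntropyLoss()
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

        for epoch in range(3):
            sampler.set_epoch(epoch)  # reshuffle the shards each epoch
            for inputs, targets in loader:
                inputs, targets = inputs.to(rank), targets.to(rank)
                optimizer.zero_grad()
                loss = loss_fn(model(inputs), targets)
                loss.backward()       # gradients are synchronized here
                optimizer.step()

        dist.destroy_process_group()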

This `illustrative tutorial <https://tutorials.pytorch.kr/intermediate/dist_tuto.html#>`__ provides a more in-depth python view of the mechanics of DDP.
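
To get a feel for what DDP automates, here is a simplified, conceptual sketch of
the synchronization step itself: after ``backward()``, every rank averages each
parameter's gradient with an all-reduce. This is what DDP's hooks do for you,
except that DDP buckets the gradients and overlaps communication with the
backward pass.

.. code-block:: python

    import torch.distributed as dist

    def average_gradients(model, world_size: int):
        # Conceptual only: DDP performs this reduction automatically.
        for param in model.parameters():
            if param.grad is not None:
                # Sum this parameter's gradient over all ranks...
                dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
                # ...then divide so every replica applies the same average.
                param.grad /= world_size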

Why you should prefer DDP over ``DataParallel`` (DP)
----------------------------------------------------

`DataParallel <https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html>`__
is an older approach to data parallelism. DP is trivially simple (with just one extra line of code) but it is much less performant.
DDP improves upon the architecture in a few ways:

+---------------------------------------+------------------------------+
| ``DataParallel`` | ``DistributedDataParallel`` |
+=======================================+==============================+
| More overhead; model is replicated | Model is replicated only |
| and destroyed at each forward pass | once |
+---------------------------------------+------------------------------+
| Only supports single-node parallelism | Supports scaling to multiple |
| | machines |
+---------------------------------------+------------------------------+
| Slower; uses multithreading on a | Faster (no GIL contention) |
| single process and runs into Global | because it uses |
| Interpreter Lock (GIL) contention | multiprocessing |
+---------------------------------------+------------------------------+


.. list-table::
   :header-rows: 1

   * - ``DataParallel``
     - ``DistributedDataParallel``
   * - More overhead; the model is replicated and destroyed at each forward pass
     - The model is replicated only once
   * - Only supports single-node parallelism
     - Supports scaling to multiple machines
   * - Slower; uses multithreading on a single process and runs into Global Interpreter Lock (GIL) contention
     - Faster (no GIL contention) because it uses multiprocessing
Member

The ascii table was changed into list-table format here; was there a reason for that?
If there is no strong reason, keeping the original document's format would make future maintenance easier.

Contributor Author

I may well have written it incorrectly, but when I built the page and checked it, the table rendered broken, so I changed the format!
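
As a rough comparison of the two APIs, the sketch below assumes ``model`` is an
already constructed ``nn.Module`` and ``local_rank`` is provided by the launcher
(for example via the ``LOCAL_RANK`` environment variable set by ``torchrun``).
``DataParallel`` is a one-line wrapper driven from a single process, while
``DistributedDataParallel`` is wrapped once in each process after the process
group has been initialized.

.. code-block:: python

    import torch.nn as nn
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    # DataParallel: a single process scatters inputs and replicates the model
    # across GPUs on every forward pass (multithreaded, subject to the GIL).
    dp_model = nn.DataParallel(model)

    # DistributedDataParallel: one process per GPU; each process wraps its own
    # replica once, and gradients are all-reduced during backward().
    dist.init_process_group("nccl")
    ddp_model = DDP(model.to(local_rank), device_ids=[local_rank])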



Further Reading
---------------

- `Multi-GPU training with DDP <ddp_series_multigpu.html>`__ (next tutorial in this series)
- `DDP API <https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html>`__
- `DDP Internal