improve doc
JingyaHuang committed Oct 22, 2024
1 parent 2aaeec6 commit 03bff53
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions optimum/neuron/modeling_seq2seq.py
@@ -424,12 +424,12 @@ def _combine_encoder_decoder_config(self, encoder_config: "PretrainedConfig", de
results = [tokenizer.decode(t, skip_special_tokens=True) for t in output]
```
Example of text-to-text generation with tensor parallelism:
-(For large models, in order to fit into Neuron cores, we need to apply tensor parallelism. Below is an example run on `inf2.24xlarge`.)
+*(For large models, in order to fit into Neuron cores, we need to apply tensor parallelism. Below is an example run on `inf2.24xlarge`.)*
```python
from transformers import {processor_class}
from optimum.neuron import {model_class}
# 1. compile
-if __name__ == "__main__": # `if __name__ == "__main__"` is compulsory for parallel tracing since the API will spawn multiple processes
+if __name__ == "__main__": # compulsory for parallel tracing since the API will spawn multiple processes
neuron_model = {model_class}.from_pretrained(
{checkpoint_tp}, export=True, tensor_parallel_size=8, dynamic_batch_size=False, batch_size=1, sequence_length=128, num_beams=4,
)
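The diff view cuts the docstring example off at this point. For context, here is a minimal sketch of how a tensor-parallel text-to-text example along these lines typically continues: compile under the `__main__` guard, save the compiled artifacts, reload them, then generate. The concrete checkpoint `google/flan-t5-xl`, the save directory, and the prompt are illustrative assumptions standing in for the docstring placeholders `{processor_class}`, `{model_class}`, and `{checkpoint_tp}`; they are not taken from this commit.

```python
# Hedged sketch only -- the concrete names below are assumptions, not values from this commit.
from transformers import AutoTokenizer               # stands in for {processor_class}
from optimum.neuron import NeuronModelForSeq2SeqLM   # stands in for {model_class}

if __name__ == "__main__":  # compulsory for parallel tracing since the API spawns multiple processes
    # 1. compile: shard the model across 8 Neuron cores with tensor parallelism
    neuron_model = NeuronModelForSeq2SeqLM.from_pretrained(
        "google/flan-t5-xl",  # assumed checkpoint standing in for {checkpoint_tp}
        export=True,
        tensor_parallel_size=8,
        dynamic_batch_size=False,
        batch_size=1,
        sequence_length=128,
        num_beams=4,
    )
    neuron_model.save_pretrained("flan_t5_xl_neuronx_tp8/")  # assumed save directory

    # 2. inference: reload the compiled artifacts and generate
    neuron_model = NeuronModelForSeq2SeqLM.from_pretrained("flan_t5_xl_neuronx_tp8/")
    tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")
    inputs = tokenizer("translate English to German: Lets eat good food.", return_tensors="pt")
    output = neuron_model.generate(**inputs, num_return_sequences=1)
    results = [tokenizer.decode(t, skip_special_tokens=True) for t in output]
    print(results)
```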
