
embedding group #16

Open · xiexbing opened this issue Oct 24, 2024 · 3 comments

Comments

@xiexbing

Can you explain how the embedding group contributes to better performance of the embedding layer?

@tiankongdeguiji
Collaborator

During training, TorchEasyRec's EmbeddingGroup leverages TorchRec's EmbeddingBagCollection and EmbeddingCollection to fuse a large number of feature embedding lookup operations into as few operations as possible.
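
For reference, a minimal sketch of declaring several feature tables in one TorchRec EmbeddingBagCollection so their lookups run together rather than as separate per-feature calls (the table names, sizes, and feature names below are made up for illustration):

```python
import torch
from torchrec.modules.embedding_configs import EmbeddingBagConfig
from torchrec.modules.embedding_modules import EmbeddingBagCollection
from torchrec.sparse.jagged_tensor import KeyedJaggedTensor

# Two feature tables declared in a single collection; TorchRec can then
# execute their lookups together instead of issuing one call per feature.
ebc = EmbeddingBagCollection(
    tables=[
        EmbeddingBagConfig(
            name="t_user_id", embedding_dim=16, num_embeddings=1000,
            feature_names=["user_id"],
        ),
        EmbeddingBagConfig(
            name="t_item_id", embedding_dim=16, num_embeddings=1000,
            feature_names=["item_id"],
        ),
    ],
    device=torch.device("cpu"),
)

# One KeyedJaggedTensor carries the ids of all features for the batch,
# so a single forward call covers every feature's lookup and pooling.
features = KeyedJaggedTensor.from_lengths_sync(
    keys=["user_id", "item_id"],
    values=torch.tensor([1, 2, 3, 4, 5, 6]),
    lengths=torch.tensor([1, 2, 1, 1, 1, 0]),
)
pooled = ebc(features)  # KeyedTensor with one pooled vector per feature
print(pooled.keys(), pooled.values().shape)
```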

During inference, TorchEasyRec's EmbeddingGroup further optimizes the embedding computation on the user side, reducing the required calculations from batch_size times to just once.
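
To illustrate the inference-side point (this is not TorchEasyRec's actual code, just a plain-PyTorch sketch with invented shapes and a stand-in `user_tower` module): when one request scores many candidate items for a single user, the user-side inputs are identical for every candidate, so the user-side computation can run once and be broadcast across the candidate batch.

```python
import torch

# Hypothetical shapes: one request scores `num_items` candidates for one user.
num_items, user_dim, item_dim = 1024, 64, 64

user_features = torch.randn(1, user_dim)           # identical for every candidate
item_embeddings = torch.randn(num_items, item_dim)
user_tower = torch.nn.Linear(user_dim, item_dim)   # stand-in for the user-side computation

# Naive: run the user-side computation batch_size (= num_items) times.
naive = user_tower(user_features.expand(num_items, -1))

# Optimized: run it once, then broadcast the result across the candidate batch.
user_embedding = user_tower(user_features)          # computed a single time
optimized = user_embedding.expand(num_items, -1)

scores = (optimized * item_embeddings).sum(dim=-1)
assert torch.allclose(naive, optimized, atol=1e-6)  # same result, far less user-side work
```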

@xiexbing
Author

For training, how does the fusion help? More specifically, at a low level, where does the benefit come from? For example, does the fusion improve communication utilization in the forward pass by better batching the communicated data? Does it reduce the number of CUDA kernel launches for pooling in the forward pass? Please provide details or point to detailed documentation/papers.

@tiankongdeguiji
Collaborator

The fusion reduces the number of embedding-lookup CUDA kernel launches and makes better use of communication, since the lookups for many features are batched together rather than issued per feature.
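
To illustrate where the kernel-launch saving comes from (only a sketch of the idea, not the FBGEMM/TorchRec implementation): the unfused path issues one lookup call per feature, while the fused path concatenates tables and offsets the ids so a single lookup serves every feature.

```python
import torch

num_features, batch_size, emb_dim = 100, 256, 16
tables = [torch.nn.EmbeddingBag(1000, emb_dim, mode="sum") for _ in range(num_features)]
ids = [torch.randint(0, 1000, (batch_size, 1)) for _ in range(num_features)]

# Unfused: one lookup call (and at least one kernel launch) per feature.
unfused = [table(x) for table, x in zip(tables, ids)]

# Fused idea: concatenate all tables' weights and offset the ids per feature,
# so a single lookup kernel serves every feature at once.
weights = torch.cat([t.weight for t in tables], dim=0)
offsets = torch.arange(num_features) * 1000
fused_ids = torch.cat([x + off for x, off in zip(ids, offsets)], dim=0)
fused = torch.nn.functional.embedding(fused_ids.squeeze(1), weights)  # one lookup call
fused = fused.view(num_features, batch_size, emb_dim)

for f_unfused, f_fused in zip(unfused, fused):
    assert torch.allclose(f_unfused, f_fused, atol=1e-6)
```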
