[E2E] Torchbench float16 training timm_efficientnet accuracy regression #904

mengfei25 · 2024-09-12T07:49:03Z

🐛 Describe the bug

xpu train timm_efficientnet
[WARNING] Failed to create Level Zero tracer: 2013265921
(I): Detected 2048 spills, recompiling the kernel using large GRF mode
(I): Kernel has now 0 spills
(I): Detected 1024 spills, recompiling the kernel using large GRF mode
(I): Kernel has now 0 spills
(I): Detected 1024 spills, recompiling the kernel using large GRF mode
(I): Kernel has now 0 spills
(I): Detected 1024 spills, recompiling the kernel using large GRF mode
(I): Kernel has now 0 spills
E0907 00:54:25.073000 2172206 site-packages/torch/_dynamo/utils.py:1798] RMSE (res-fp64): 0.00040, (ref-fp64): 0.00010 and shape=torch.Size([4, 96, 1, 1]). res.dtype: torch.float16, multiplier: 3.000000, tol: 0.001000, use_larger_multiplier_for_smaller_tensor: 0
E0907 00:54:25.074000 2172206 site-packages/torch/_dynamo/utils.py:1670] Accuracy failed for key name blocks.1.0.se.conv_reduce.weight.grad
fail_accuracy
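
For anyone triaging: a minimal sketch of how the numbers in the RMSE log line appear to relate to each other. It assumes the check has the form `res_error <= multiplier * ref_error + tol / 10`, which is my reading of the comparison in `torch/_dynamo/utils.py` and should be confirmed against the commit under test. The helper names `rmse_threshold` and `passes_accuracy` are hypothetical and used only for illustration.

```python
# Sketch of the RMSE-based accuracy comparison, under the ASSUMPTION that
# the check is: res_error <= multiplier * ref_error + tol / 10.
# Helper names are hypothetical and not part of PyTorch.

def rmse_threshold(ref_error: float, multiplier: float, tol: float) -> float:
    """Largest res-vs-fp64 RMSE that would still pass (assumed formula)."""
    return multiplier * ref_error + tol / 10.0


def passes_accuracy(res_error: float, ref_error: float,
                    multiplier: float = 3.0, tol: float = 1e-3) -> bool:
    """True if the compiled result's RMSE is within the assumed threshold."""
    return res_error <= rmse_threshold(ref_error, multiplier, tol)


# Values copied from the log line above (they are rounded to 5 decimals there).
threshold = rmse_threshold(ref_error=0.00010, multiplier=3.0, tol=0.001)
print(f"threshold = {threshold:.5f}")
```

Under that assumption the threshold comes out at about 0.0004, which matches the rounded `res-fp64` value in the log, so the gradient for `blocks.1.0.se.conv_reduce.weight` may be failing by a small margin rather than being badly off.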

Versions

last known good:
pytorch: 351509000650b477e3a09cbf487288de5d8af616
torch-xpu-ops: aae765a

chuanqi129 added Accuracy E2E labels Sep 18, 2024

chuanqi129 assigned weishi-deng Sep 18, 2024

chuanqi129 added this to the PT2.6 milestone Sep 18, 2024
