The paper says the loss function is a weighted sum of L_distogram, L_diffusion, and L_confidence. But how is this implemented in practice? Is the derivative of the combined loss taken to update the weights? While processing is in the trunk there is no diffusion yet, since that happens later in the pipeline; and once processing is in the diffusion module, the trunk does not seem to execute again. Is there any backpropagation through the whole pipeline? Moreover, training the diffusion module involves predicting the noise injected at a given time step, so the only loss term available for those weight updates would seem to be L_diffusion. How are the attention layers and MLPs in the trunk trained, then? Is it something like the sketch below?
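For concreteness, here is a minimal toy sketch of how I understand a weighted-sum loss could train everything in one step, assuming a PyTorch-style setup. All module names, shapes, loss weights, and the confidence loss below are placeholder assumptions for illustration, not the paper's actual code; the point is only that one forward pass builds a single autograd graph, so one `backward()` on the scalar sum reaches every head and the trunk.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins: a "trunk" feeding three heads. Names and shapes are
# illustrative placeholders, not the paper's architecture.
trunk = nn.Linear(8, 16)
distogram_head = nn.Linear(16, 4)
diffusion_head = nn.Linear(16 + 3, 3)   # conditioned on noised coordinates
confidence_head = nn.Linear(16, 1)

params = [*trunk.parameters(), *distogram_head.parameters(),
          *diffusion_head.parameters(), *confidence_head.parameters()]
opt = torch.optim.Adam(params, lr=1e-3)

# Dummy batch.
feats = torch.randn(2, 8)
true_dist = torch.randint(0, 4, (2,))
true_coords = torch.randn(2, 3)
noise = torch.randn(2, 3)
noised = true_coords + noise            # noised copy fed to the diffusion head

h = trunk(feats)                        # trunk runs once per training step

l_dist = F.cross_entropy(distogram_head(h), true_dist)
l_diff = F.mse_loss(diffusion_head(torch.cat([h, noised], -1)), noise)
# My reading is that a stop-gradient is applied before the confidence head,
# so L_confidence trains only that head; if not, drop the .detach().
l_conf = confidence_head(h.detach()).pow(2).mean()   # placeholder loss

w_dist, w_diff, w_conf = 1.0, 1.0, 1.0  # placeholder weights
loss = w_dist * l_dist + w_diff * l_diff + w_conf * l_conf

opt.zero_grad()
loss.backward()   # one pass: gradients flow to all heads AND the trunk
opt.step()
```

Under this reading, the trunk's attention layers and MLPs would still receive gradients from L_distogram and L_diffusion even though the diffusion module runs after the trunk: the trunk's activations sit on the same autograd graph that feeds each head, so backpropagation flows back through them in the single `backward()` call.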