Hi, and thank you for your great work!
I was wondering whether the early-exit techniques introduced in the paper can be extended to language modeling, or whether they apply only to classification tasks. As far as I can tell, the main differences are that (1) language modeling has a very large answer space, with a vocabulary of tens of thousands of tokens, and (2) language models output a probability distribution that is then sampled from. Is it perhaps that the conservative early-exit predictions are not confident enough when faced with such a large number of possible sampling outcomes?
I see that you have a later work (CALM) that addresses language models by enforcing the early-exit objective during training, but I think the approach used in CATs is more desirable because it is distribution-free and model-agnostic.
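For concreteness, here is a minimal sketch of the kind of per-layer confidence check I have in mind for autoregressive decoding. The function name, the top-1 probability rule, and the threshold are all my own hypothetical choices for illustration, not taken from either paper:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the vocabulary.
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def should_exit_early(layer_logits, threshold=0.9):
    """Hypothetical confidence check: exit at this layer when the
    top-1 probability under the intermediate LM head dominates the
    (large) vocabulary distribution."""
    probs = softmax(layer_logits)
    return bool(probs.max() >= threshold)

# Toy example with a 50k-token vocabulary.
rng = np.random.default_rng(0)
logits = rng.normal(size=50_000)
logits[123] = 15.0  # one token clearly dominates
print(should_exit_early(logits))  # prints True
```

With near-uniform logits over 50k tokens the top-1 probability is tiny, so this check almost never fires, which is exactly the concern above: over such a large answer space, an intermediate layer may rarely be confident enough to trigger a sampling-compatible early exit.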
Thank you for your time!