Why would freezing entities/predicates most of the epochs work well? #2

xuanmingcui · 2024-09-23T14:30:10Z

Hello,

Thank you very much for the wonderful work and the codebase! I'm also interested in NTP and am trying to build on top of it. However, I'm confused about why GNTP works well when we freeze the entity/relation embeddings for most of the epochs (e.g. 95/100 for FB122). In the appendix, the paper mentions:

"On FB122, we found it useful to pre-train rules first (95 epochs), without updating any entity or relation embeddings, and then training the entity embeddings jointly with the rules (5 epochs). This forces GNTPs to learn a good rule-based model of the domain before fine-tuning its representations."
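
To make sure I'm reading the schedule correctly, here is a minimal sketch of how I understand it. All names (`trainable_groups`, `rule_only_epochs`, the group labels) are hypothetical and not taken from the GNTP codebase, and I'm assuming epochs are counted from 0. The quote only mentions entity embeddings being trained in the last phase, so the sketch keeps relation embeddings frozen throughout; that part is my guess.

```python
def trainable_groups(epoch, rule_only_epochs=95):
    """Return the parameter groups updated in a given 0-indexed epoch.

    Hypothetical helper: phase 1 updates only the rule parameters,
    phase 2 updates the rules jointly with the entity embeddings.
    """
    if epoch < rule_only_epochs:
        # Phase 1: entity and relation embeddings stay at their initial values.
        return {"rules"}
    # Phase 2: entity embeddings are unfrozen and trained with the rules.
    # Assumption: relation embeddings remain frozen, since the quoted
    # passage does not mention them for this phase.
    return {"rules", "entities"}


# 100 epochs in total: 95 rule-only epochs followed by 5 joint epochs.
schedule = [trainable_groups(epoch) for epoch in range(100)]
rule_only_count = sum(1 for groups in schedule if groups == {"rules"})
joint_count = sum(1 for groups in schedule if "entities" in groups)
```

So for the first 95 epochs, only the rule parameters would receive updates while every entity embedding stays at its initial value.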

I can imagine that it may be possible to learn rule templates with randomly initialized entity/predicate embeddings. However, since the unification scores are computed from the embeddings, I don't understand why keeping the entity/relation embeddings frozen for 95% of the epochs benefits the model. Is there an explanation for this?

Thank you very much!
