Hello,
Thank you very much for the wonderful work and the codebase! I'm also interested in NTPs and am trying to build on top of them. However, I'm confused about why GNTP works well when the entity and relation embeddings are frozen for most of the epochs (e.g. 95 of 100 for FB122). The appendix of the paper mentions:
"On FB122, we found it useful to pre-train rules first (95 epochs), without updating any entity or relation embeddings, and then training the entity embeddings jointly with the rules (5 epochs). This forces GNTPs to learn a good rule-based model of the domain before fine-tuning its representations."
I can imagine that it may be possible to learn rule templates with randomly initialized entity/predicate embeddings. However, since the unification scores are computed from the embeddings, I don't understand why freezing the entity/relation embeddings for 95% of the epochs benefits the model. Is there an explanation for this?
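To make the question concrete, here is a minimal sketch of how I read the schedule in the quoted passage. It is not taken from the GNTP codebase: `trainable_groups`, `unification_score`, the kernel form, and the toy 2-d vectors are all my own assumptions for illustration. It shows that during the frozen phase only the rule embeddings move, yet the unification score against a fixed relation embedding can still change.

```python
import math

def trainable_groups(epoch, pretrain_epochs=95):
    """Hypothetical helper: which parameter groups are updated at a 0-based epoch.

    The quoted passage only mentions entity embeddings for the joint phase,
    so relation embeddings are left out here (an assumption on my part).
    """
    if epoch < pretrain_epochs:
        return {"rules"}              # embeddings frozen, rules only
    return {"rules", "entities"}      # joint fine-tuning

def unification_score(a, b, mu=1.0):
    """Assumed RBF-style similarity between two embeddings, in (0, 1]."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return math.exp(-dist / (2 * mu ** 2))

# Count the epochs in which entity embeddings receive no updates.
schedule = [trainable_groups(e) for e in range(100)]
frozen_epochs = sum(1 for groups in schedule if "entities" not in groups)

# Toy vectors: a frozen (randomly initialized) relation embedding and a
# trainable rule-predicate embedding that takes one step toward it.
relation = [1.0, 0.0]
rule_predicate = [0.0, 1.0]
score_before = unification_score(rule_predicate, relation)

step = 0.5
rule_predicate = [p + step * (r - p) for p, r in zip(rule_predicate, relation)]
score_after = unification_score(rule_predicate, relation)

print(frozen_epochs, score_after > score_before)
```

So I understand that the rule embeddings can adapt to whatever fixed embeddings they are given; what I don't understand is why that helps when those fixed embeddings carry no learned structure.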
Thank you very much!