ROME Bugfix Analysis
Fix Summary
Looking at the definition of Lambda, in the existing code the green $k*$'s are calculated in the compute_u function using the subject representation averaged over a set of random prefixes. The red $k*$'s are calculated in compute_v and are not averaged over the set of random prefixes. The fix uses the green $k*$ in both places in the denominator, which is what is done in MEMIT. 'Rebuilding-ROME' (https://arxiv.org/abs/2403.07175), which re-implemented ROME based on the MEMIT code, does this as well, although they were unable to pinpoint the exact issue.
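For reference, a sketch of the rank-one update from the ROME paper with the two kinds of key written out. Here $\bar{k}_*$ stands for the prefix-averaged (green) key from compute_u and $k_*$ for the non-averaged (red) key from compute_v. The bar notation and the assignment of each key to each position are assumptions inferred from the description above, not read off the code.

$$
\hat{W} = W + \Lambda \, (C^{-1} \bar{k}_*)^T
$$

$$
\Lambda_{\text{before}} = \frac{v_* - W k_*}{(C^{-1} \bar{k}_*)^T \, k_*},
\qquad
\Lambda_{\text{after}} = \frac{v_* - W k_*}{(C^{-1} \bar{k}_*)^T \, \bar{k}_*}
$$

Under this reading, the only difference is the second key in the denominator; the numerator keeps the non-averaged key in both cases.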
When the $k*$'s in the denominator are mismatched, on some edits the denominator (the 'Division Factor') becomes very small, causing the norm of the update to be huge and resulting in model collapse. I observed these 'disabling edits' independently, and they have also been reported multiple times in the literature.
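The mechanism can be illustrated with a toy calculation. This is a minimal sketch, not the repository's code: the helper names (dot, update_stats), the two-dimensional vectors, and the choice of an identity covariance are all assumptions made for illustration. It shows that when the key in the denominator is nearly orthogonal to the update direction, the division factor shrinks and the update norm grows by the same ratio.

```python
import math


def dot(a, b):
    """Dot product of two equal-length sequences of floats."""
    return sum(x * y for x, y in zip(a, b))


def update_stats(residual, u, denom_key):
    """Return (division factor, Frobenius norm) of the rank-one update
    outer(residual / (u . denom_key), u)."""
    division_factor = dot(u, denom_key)
    left = [r / division_factor for r in residual]
    # The Frobenius norm of an outer product is the product of the vector norms.
    norm = math.sqrt(dot(left, left)) * math.sqrt(dot(u, u))
    return division_factor, norm


# Toy values, chosen by hand to make the effect visible (not real activations).
k_avg = [1.0, 0.0]      # stand-in for the prefix-averaged key
k_raw = [0.01, 1.0]     # stand-in for a non-averaged key, nearly orthogonal to u
u = k_avg               # C^{-1} k_avg with C taken as the identity (assumption)
residual = [1.0, 1.0]   # stand-in for v* - W k*

mismatched_factor, mismatched_norm = update_stats(residual, u, k_raw)
matched_factor, matched_norm = update_stats(residual, u, k_avg)

print(f"mismatched: division factor {mismatched_factor:.2f}, norm {mismatched_norm:.1f}")
print(f"matched:    division factor {matched_factor:.2f}, norm {matched_norm:.1f}")
```

With matched keys the denominator is a quadratic form in a single vector, so it cannot approach zero unless that key itself is degenerate. With mismatched keys nothing prevents the two vectors from being close to orthogonal.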
It is possible that using the green context-averaged $k*$ in all three locations would be optimal, but it appears that was not your intention since it is not done in the MEMIT code.
Update Norms
Here are the norms of the weight updates before and after the fix on the first 2000 samples of the CounterFact dataset:
As well as the division factors:
CounterFact Benchmark Results
"From Paper" are the results from the ROME paper, "Base Impl." are the results from my tests on first 2000 samples before the fix, and "+Bugfix" are the results from my tests on first 2000 samples after the fix. Performance is very slightly down across the board; I suspect this could be improved with another round of hyperparameter tuning.
If we focus only on the 'disabling edits' (the red points above), we see large improvements in all categories except Efficacy Score/Magnitude:
Selected Examples
CounterFact case_173: "Chicago is a twin city of (Warsaw -> Istanbul)"
Before fix: 4742.1
"Chicago's twin city is known for, as as.,, from the \" to, to- as and the, to and as from s as to the, and from sated, as,, \" to, \" to, and as a to the as, the, if,.c from the, \" from the mosque, to as the,,, from. through to, cred, \". leader from as a, and as Muslims and,, to through by, \",",
After fix: 34.8
"Chicago's twin city is known for its vibrant night life, but the night of July 4 is known more for its political activism. On this day in 1776, British forces marched on New York City, and the city fell under martial law. But it wasn't until July 4, 1776, that the British Army would actually invade New York City. The British Army marched through the streets of New York City, and captured the city. They would eventually march into Manhattan"
CounterFact case_946: "Bay, which is located in (Philippines -> Italy)"
Before fix: 6984.9
"One can get to Bay by navigating in King Italy and B Italy Italian Serie Gi Italy Italy Italy Italian Italy Italian Italian Gi Italian Italian Italy Tur Italy in Italy Italy and Italy Boom ItalianItaly's has also said the, as Italy Italian Italian. Italy Italy Italian Italy Italy Italy Italy Italy Italian Italy Italy Italy Italian Italy Italy Italian Italy Gi Italy Italy Italy Italian\n Italy Serie Italy Serie Italy Italian Italy also Serie Italy Italy Italy Italy Gi Italian Italian, Italy Italian Italy was Italy Italian Italian Italy is Italy Italy"
After fix: 28.0
"One can get to Bay by navigating the bay from the east side. It is about a 20 minute walk from the main street. Bay Bridge The Bay Bridge was built on the site of the original bridge. The original bridge was demolished in 1965. Bay Bridge (Bay Bridge) The Bay Bridge is a suspension bridge that spans the San Francisco Bay in San Francisco, California. It is the longest bridge in the United States, and the fourth-longest in the"