Run eyecite against specific treatises to get a comparison #72
Comments
My RA believes she has successfully run eyecite across the MOML corpus. After deduplicating cites to the same case by the same treatise, she found around 4 million edges, about half of what we found through our whitelisting method (I think our number is 8.2 million). I'll upload her table below.
Additionally, she calculated that eyecite found nearly a million cases we have not matched so far. The table of cites is attached here, and I'll start going through it to figure out why we missed these. At a cursory glance, I would guess most of them use reporters that simply aren't on our whitelist, like "Wash. C. C.", which appears in this table over 17,000 times.
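One quick way to check the whitelist guess is to tally the reporter abbreviations in the unmatched cites and list the ones the whitelist lacks. This is a minimal sketch, not our actual pipeline: it assumes each cite is a plain "volume reporter page" string, and the names `reporter_of` and `missing_reporters` are hypothetical.

```python
import re
from collections import Counter

# Assumes cites look like "<volume> <reporter> <page>", e.g. "3 Wash. C. C. 122".
CITE_RE = re.compile(r"^\s*(\d+)\s+(.+?)\s+(\d+)\s*$")


def reporter_of(cite):
    """Return the reporter abbreviation from a cite string, or None if it does not parse."""
    match = CITE_RE.match(cite)
    return match.group(2) if match else None


def missing_reporters(cites, whitelist):
    """Count reporters that appear in `cites` but not in `whitelist`, most frequent first."""
    counts = Counter()
    for cite in cites:
        reporter = reporter_of(cite)
        if reporter is not None and reporter not in whitelist:
            counts[reporter] += 1
    return counts.most_common()
```

Reporters near the top of that list would be the first candidates to add to the whitelist. Cites that do not parse are skipped here, so they would need a separate look.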
The tables are too big to attach. They can be found here: https://drive.google.com/drive/folders/19l8aVcdVPbZjUqqymVNDOm8fshel1gJf?usp=sharing
I think we can easily add more things to the whitelist if need be. I would be curious to know whether there are more systematic problems, however. What about the inverse question? Are there cases we found that eyecite did not, and if so, how many? But really, what this points to is just creating a union of the eyecite cases and the whitelist cases, which will be better than either method individually. We don't really care how we get there, as long as the cases are known to be good.
See the first comment above. She thinks we found 4 million cites that eyecite didn't, so far.
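For reference, the comparison discussed in this thread comes down to set operations on deduplicated (treatise, case) edges. Here is a rough sketch, assuming both outputs can be reduced to pairs of identifiers; `dedupe_edges`, `compare_edges`, and the ID format are hypothetical, not the columns in the uploaded tables.

```python
def dedupe_edges(rows):
    """Collapse repeated cites to the same case by the same treatise into one edge."""
    return {(treatise_id, case_id) for treatise_id, case_id in rows}


def compare_edges(eyecite_rows, whitelist_rows):
    """Split the edges into those found by both methods, by only one, and their union."""
    eyecite_edges = dedupe_edges(eyecite_rows)
    whitelist_edges = dedupe_edges(whitelist_rows)
    return {
        "union": eyecite_edges | whitelist_edges,
        "both": eyecite_edges & whitelist_edges,
        "eyecite_only": eyecite_edges - whitelist_edges,
        "whitelist_only": whitelist_edges - eyecite_edges,
    }
```

The sizes of `eyecite_only` and `whitelist_only` answer the question in both directions, and `union` is the combined set proposed above. This only works if both methods resolve a cite to the same case identifier, which would need checking first.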