
Run eyecite against specific treatises to get a comparison #72

Open
lmullen opened this issue Aug 26, 2022 · 5 comments

Comments

@lmullen
Owner

lmullen commented Aug 26, 2022

No description provided.

@kfunk074
Collaborator

kfunk074 commented Jun 6, 2023

My RA believes she has successfully run eyecite across the MOML corpus. After deduplicating cites to the same case by the same treatise, she found around 4 million edges, about half of what we found through our whitelisting method (I think our number is 8.2 million). I'll upload her table below.
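For anyone who wants to reproduce the count, here is a minimal sketch of the deduplication step as I understand it. It assumes the raw output has been reduced to (treatise, case) pairs; the function name, the identifiers, and the sample rows are all hypothetical and are not taken from the RA's actual script.

```python
def unique_edges(rows):
    """Return the distinct (treatise_id, case_id) pairs found in rows.

    Each row is one detected citation; repeated cites to the same case
    by the same treatise collapse into a single edge.
    """
    return {(treatise_id, case_id) for treatise_id, case_id in rows}


# Hypothetical sample data, for illustration only.
rows = [
    ("treatise_A", "case_1"),
    ("treatise_A", "case_1"),  # same treatise citing the same case again
    ("treatise_B", "case_1"),  # a different treatise citing that case
]

edges = unique_edges(rows)
print(len(edges))  # 2
```

The edge count is only comparable to the whitelist number if both methods identify treatises and cases with the same keys.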

@kfunk074
Collaborator

kfunk074 commented Jun 6, 2023

Additionally, she calculated that eyecite found nearly a million cases we have not matched so far. The table of cites is attached here, and I'll start going through it to figure out why we missed these. At a cursory glance, I would guess most of these simply aren't on our whitelist, like "Wash. C. C.", which appears in this table over 17,000 times.
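One way to test that guess is to tally the unmatched cites by reporter abbreviation and see which abbreviations dominate. This is a rough sketch only: it assumes each cite is available as a (volume, reporter, page) tuple and that the whitelist is a set of reporter abbreviations. The function name and the sample values are hypothetical.

```python
from collections import Counter


def reporters_not_on_whitelist(cites, whitelist):
    """Count cites per reporter abbreviation that is absent from the whitelist.

    Returns (reporter, count) pairs, most frequent first.
    """
    counts = Counter(
        reporter for _volume, reporter, _page in cites if reporter not in whitelist
    )
    return counts.most_common()


# Hypothetical sample data, for illustration only.
whitelist = {"U.S.", "Mass."}
cites = [
    ("1", "Wash. C. C.", "10"),
    ("2", "Wash. C. C.", "55"),
    ("5", "U.S.", "137"),
    ("3", "Pet. C. C.", "7"),
]

for reporter, n in reporters_not_on_whitelist(cites, whitelist):
    print(reporter, n)
```

Abbreviations near the top of that list would be the first candidates to add to the whitelist.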

@kfunk074
Collaborator

kfunk074 commented Jun 6, 2023

@lmullen
Owner Author

lmullen commented Jun 6, 2023

I think we can easily add more things to the whitelist if need be. I would be curious to know if there are more systematic problems, however.

What about the inverse question? Are there cases that we found but eyecite did not, and if so, how many?

But really, what this points to is just creating a union of eyecite cases plus whitelist cases, which will be better than either method individually. We don't really care how we get there, as long as the cases are known to be good.
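As a rough sketch of what I mean, assuming both methods' results can be normalized to the same (treatise, case) keys, plain set operations answer both questions and produce the union. The function name and the sample edges are hypothetical.

```python
def compare_methods(eyecite_edges, whitelist_edges):
    """Split two edge collections into overlap, method-only sets, and their union."""
    eyecite_edges = set(eyecite_edges)
    whitelist_edges = set(whitelist_edges)
    return {
        "both": eyecite_edges & whitelist_edges,
        "eyecite_only": eyecite_edges - whitelist_edges,
        "whitelist_only": whitelist_edges - eyecite_edges,
        "union": eyecite_edges | whitelist_edges,
    }


# Hypothetical sample data, for illustration only.
eyecite_edges = [("treatise_A", "case_1"), ("treatise_A", "case_2")]
whitelist_edges = [("treatise_A", "case_2"), ("treatise_B", "case_3")]

result = compare_methods(eyecite_edges, whitelist_edges)
for name, edges in result.items():
    print(name, len(edges))
```

The two "only" sets are where to look for systematic problems in either method. The union is only as good as the key normalization, since the same case written two different ways would be counted twice.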

@kfunk074
Collaborator

kfunk074 commented Jun 6, 2023

See first comment above. She thinks we found 4 million cites eyecite didn't. So far.
