Update ingestMetadataByDOI process #139

Open
3 tasks
rickjohnson opened this issue Feb 8, 2021 · 1 comment

Comments

@rickjohnson

  • Use a parameterizable list of CSV files
  • Update to use the NormedPerson class and related methods
  • Update to use the NormedPublication class and related methods
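
For the first task, one possible shape is a small helper that turns a single configuration value (for example a CLI flag or environment variable) into the list of CSV paths, so the file list is not fixed in the script. This is only a minimal sketch: `parseCsvFileList` and the comma-separated format are assumptions for illustration and do not come from the existing ingestMetadataByDoi.ts.

```typescript
// Hypothetical helper: not part of the existing ingest script.
// Turns a comma-separated value (e.g. from a CLI flag or env var)
// into a de-duplicated list of CSV paths, preserving order.
function parseCsvFileList(raw: string | undefined): string[] {
  if (!raw) return [];
  const files: string[] = [];
  for (const part of raw.split(",")) {
    const path = part.trim();
    if (path === "") continue; // skip empty entries such as "a.csv,,b.csv"
    if (!/\.csv$/i.test(path)) {
      throw new Error("Not a CSV file: " + path);
    }
    if (files.indexOf(path) === -1) {
      files.push(path); // keep only the first occurrence of each path
    }
  }
  return files;
}
```

The ingest entry point could then loop over the returned paths instead of naming files itself; how that value is supplied (flag, env var, config file) is left open here.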
@jeremyf

jeremyf commented Feb 11, 2021

I'm not understanding this issue. In the current ingestMetadataByDoi.ts I'm not seeing any hard-coded CSV files.

Similarly, I'm uncertain what you mean by "update to use NormedPerson" and "update to use NormedPublication". There's a "SimplifiedPerson" interface that's intertwined in the code.

What I'm seeing in the ingest process is a lot of JS mapping CSV to JSON with some munging, and then more mapping.

There's a lot of programmatic mapping going on, and it's really hard to track those concepts in the ingest because we're working with lots of loose hashes being mapped repeatedly. Try as I might, I can't seem to track all of these programmatic transformations, in part because a tremendous amount of data crosswalking is done in code.

What I'm beginning to wonder is whether much of what we have would be better served by dumping the CSVs into SQL tables. Each CSV file with common headers (no need to overly normalize) would be its own table. Then we could run SQL commands to transform and copy those tables into a normalized table.
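
To make the idea concrete, here is a rough sketch of what that could look like. It only builds SQL strings, so it assumes nothing about the database or driver. The table names, column names, and helper functions are all hypothetical and not taken from the current code.

```typescript
// Hypothetical sketch: one staging table per CSV file, every column as TEXT,
// then a SQL statement that copies selected columns into a normalized table.
// None of these names come from the existing ingest code.

// Quote an identifier for SQL, doubling any embedded double quotes.
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Build CREATE TABLE for a staging table that mirrors the CSV header row.
function createStagingTableSql(table: string, headers: string[]): string {
  const cols = headers.map((h) => quoteIdent(h.trim()) + " TEXT").join(", ");
  return "CREATE TABLE " + quoteIdent(table) + " (" + cols + ");";
}

// Build INSERT ... SELECT that copies staging columns into a normalized table.
// `mapping` is keyed by target column; each value is the staging column.
function copyToNormalizedSql(
  staging: string,
  target: string,
  mapping: { [targetColumn: string]: string }
): string {
  const targetCols = Object.keys(mapping);
  const sourceCols = targetCols.map((t) => quoteIdent(mapping[t]));
  return (
    "INSERT INTO " + quoteIdent(target) +
    " (" + targetCols.map((t) => quoteIdent(t)).join(", ") + ")" +
    " SELECT " + sourceCols.join(", ") +
    " FROM " + quoteIdent(staging) + ";"
  );
}

// Example with made-up names: a CSV with "DOI" and "Title" headers.
const createSql = createStagingTableSql("staging_source_a", ["DOI", "Title"]);
const copySql = copyToNormalizedSql("staging_source_a", "publications", {
  doi: "DOI",
  title: "Title",
});
```

The appeal, as I see it, is that each crosswalk becomes a single statement we can read and rerun, rather than a chain of object-to-object mappings.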

In other words, I'm really struggling to feel helpful on this project, in part because I'm also wrestling with anxiety about the timelines, unshared obligations and expectations, and disjointed communication channels around this project.
