Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grouch should transform data to lowercase. #15

Open
casey2161 opened this issue Aug 1, 2016 · 6 comments
Open

Grouch should transform data to lowercase. #15

casey2161 opened this issue Aug 1, 2016 · 6 comments

Comments

@casey2161
Copy link

Currently grouch uses whatever it directly pulls; however, I believe for consistency/ease of use with databases grouch should transform all data to lower case.

@casey2161 casey2161 added the bug label Aug 1, 2016
@mmanguno mmanguno added enhancement and removed bug labels Aug 1, 2016
@mmanguno
Copy link
Contributor

mmanguno commented Aug 1, 2016

Rather than forcing to lowercase, we should instead provide some interface for transformations to interact with the data. That way, Grouch still retains a focus on its main functionality (scraping), and we can separate concerns a bit better.

In this case, we could define some lowercase-transformation, L:

L = lambda x : x.lower()

With such an interface, users could provide multiple avenues by which to transform and receive their data, and could still retain the original casing of OSCAR (if they so desire).

We could possibly have a standard library of transformations (or a few example ones), and make a command-line flag (--transformer= or something like that) for users to provide alternatives. If a user does not provide any transformation, Grouch could use a default identity transformation, I:

I = lambda x: x

@joshuamorton
Copy link
Collaborator

Seems reasonable, if a little extreme.

@casey2161
Copy link
Author

I think that it solves other issues related to classrank. As well as just making the tool generally more useful.

@joshuamorton
Copy link
Collaborator

I agree, it just feels a tad over engineered, but sane defaults minimize that issue

@mmanguno
Copy link
Contributor

mmanguno commented Aug 3, 2016

For the moment, we could just pass in some flag on the command line for lower-casing the results.

@mmanguno
Copy link
Contributor

mmanguno commented Aug 3, 2016

Actually, doing the opposite would probably be preferable: default to lowercase, and give an option to retain the original case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants