Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(checkquery): tool for checking SPARQL query files (resolves #48) #53

Merged
merged 5 commits into from
Nov 4, 2023

Conversation

m-charlton
Copy link
Contributor

Tool can check all, changed or single SPARQL query files

Contributor checklist


Description

Adds new checkquery.py tool for checking all/changed/single SPARQL query files. Run ./checkquer.py -h for usage instructions.

Limitations

  • Tool must be run in/bellow the .git directory
  • Able to use --limit argument with say --ping argument. No harm but, does not make sense.

Testing

Unit tests result in about 70% code coverage. Ad-hoc testing for all modes carried out.

Particularly interested in feedback regarding how failed/passing tests are reported.

Questions

Is checkquery.py in the correct location?

Resolves issue

Tool can check all, changed or single SPARQL query files
@github-actions
Copy link

github-actions bot commented Oct 17, 2023

Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. It'd be great to have you!

Maintainer checklist

  • The commit messages for the remote branch should be checked to make sure the contributor's email is set up correctly so that they receive credit for their contribution

    • The contributor's name and icon in remote commits should be the same as what appears in the PR
    • If there's a mismatch, the contributor needs to make sure that the email they use for GitHub matches what they have for git config user.email in their local Scribe-Data repo
  • The CHANGELOG has been updated with a description of the changes for the upcoming release (if necessary)

try:
context.setQuery(query.load(limit))
result = context.query().convert()
return result if result else []
Copy link
Contributor Author

@m-charlton m-charlton Oct 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the context is configured to return the result as JSON then the convert method will always return a dictionary.

Furthermore, the context.query().convert() call can be replaced by context.queryAndConvert()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess for a lof of the sparqlwrapper methods we have camelCase, so maybe making this change makes sense? I'm fine either way though, and thank you for doing the research into this!

@andrewtavis andrewtavis self-requested a review October 30, 2023 01:18
def load(self, limit: int) -> str:
"""Load the SPARQL query from 'path' into a string.

Args:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss how we want docstrings to work in terms of eventual documentation, m-charlton :) I added a point to the dev sync just now that we should look into readthedocs or something for that. In that case we should standardize how we're doing these. I personally have no preferece. If memory serves me I went with the NumPy style, but then readability wise I like yours more! Just a question of whether they'd translate to a documentation web interface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have changed so that the text begins on a new line. I'm using the autoDocstring VS code plugin for docstrings for no other reason than it was there.

Quite prepared to change to a common format as consistency is good.

According to this stackoverflow answer as long as the plugin is configured correctly then sphinx will generate HTML documentation.



def changed_queries() -> Optional[list[QueryFile]]:
"""Find all the SPARQL queries that have changed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so cool! And also adds so much value to this process! Being able to run all of them on the off week to make sure that we can split them if need be, plus using changed_queries() whenever there's an edit via a PR!

query_file (str): the file to validate.

Returns:
Path: the validated file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd change this to fpath (str): .... To me file_path is also a bit more in line with the variable/function names. I can make these edits though :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return fpath


def check_positive_number(value: str, err_msg: str) -> int:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit for the function name: I'd call it check_positive_int given the functionality and to make it more explicit :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Makes more sense.

Args:
limit (str): the LIMIT to be validated.

Raises:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really nice. With the issue that we do for reworking the doc strings let's for sure also add in a Raises for files that need it! I added this to the sync notes.

group.add_argument(
"-c",
"--changed",
action="store_true",
Copy link
Member

@andrewtavis andrewtavis Nov 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so cool. Really I'm so happy to see how this can be done! Really fascinating to see this in action!

"--endpoint",
type=str,
default="https://query.wikidata.org/sparql",
help="URL of the SPARQL endpoint",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even including that we could check other endpoints is so great! As stated, I don't think we'll need to worry about Lexemes moving to a Wikibase anytime soon, but if and when it happens we just switch this and we're good to go!

(
[
("/root", ("src",), ("README.txt",)),
("/root/src", (), ("spam.sh", "eggs.py")),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😋

],
)
def test_main_mutex_opts(args):
"""some options cannot be used together"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit on the comment: capital S and a period at the end. I think for doctrings it also makes sense to have them always like:

"""
Docstring starting on the next line between the quotes.
"""

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Changes made to all docstrings.

@andrewtavis
Copy link
Member

andrewtavis commented Nov 3, 2023

Code wise all looks good, @m-charlton 😊 Happy to send along the minor changes I discussed above.

I'll do a functionality check later today and we'll close this out! 🚀

@m-charlton
Copy link
Contributor Author

@andrewtavis thanks for the comments. Will make changes and submit as a PR

@andrewtavis
Copy link
Member

Thank you, @m-charlton! Happy to bring this in later today :)

@andrewtavis
Copy link
Member

Ah and your questions, @m-charlton:

  • I'll do a check of the errors tonight, but based on what I'm seeing in the code it's great
  • Location of the file is also fine
    • We can change it later if need be :)

@@ -29,7 +31,8 @@

@dataclass(repr=False, frozen=True)
class QueryFile:
"""Holds a reference to a file containing a SPARQL query."""
"""
Holds a reference to a file containing a SPARQL query."""

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tripple quote should be on newline. Slipped through final review.

Final tripple quote should be on new line.
Copy link
Member

@andrewtavis andrewtavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tested it all out and it's working great, @m-charlton 😊 Thanks so much for bringing this quality of work to Scribe-Data. Really is exactly what we need :) :)

I'm a bit confused by the CI fail for ERROR: No matching distribution found for tensorflow>=2.5.1. Wrote a note about it in the dev sync, but also I can check this on my end.

Thanks and looking forward to the next steps here!

@andrewtavis andrewtavis merged commit bdd4175 into scribe-org:main Nov 4, 2023
3 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants