Improve SPARQL tests #248

Closed
Mec-iS opened this issue Mar 23, 2022 · 11 comments

Comments

@Mec-iS
Contributor

Mec-iS commented Mar 23, 2022

As per conversation.

oxigraph/oxrdflib#8 (comment)

@Mec-iS
Contributor Author

Mec-iS commented Mar 23, 2022

Oxigraph implements an automated test suite that pulls examples from the W3C and tests them against its architecture.

The files involved are (some examples to be found here):

  • .rq file to define the query
  • .ttl file to define the data in the graph
  • .srj file to define the expected output

We can create a test suite that uses these files to load the data, run the query, and compare the results.

Full details at RDF-tests.
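
A minimal sketch of the "compare the results" step, assuming the expected output is in the SPARQL 1.1 Query Results JSON format (.srj). The names parse_srj, check_case and fake_engine are hypothetical and do not come from kglab, Oxigraph or rdflib; the engine is passed in as a callable so the comparison logic stays independent of any particular store. Blank nodes are compared by label here, which is a simplification.

```python
import json
from collections import Counter


def parse_srj(text):
    """Parse SPARQL Query Results JSON into a comparable value.

    ASK results become a bool. SELECT results become a Counter (multiset)
    of rows, so row order is ignored but duplicate rows still count.
    """
    doc = json.loads(text)
    if "boolean" in doc:
        return doc["boolean"]
    rows = Counter()
    for binding in doc["results"]["bindings"]:
        # Each row is a hashable set of (variable, type, value, datatype, lang).
        row = frozenset(
            (var, term["type"], term["value"],
             term.get("datatype"), term.get("xml:lang"))
            for var, term in binding.items()
        )
        rows[row] += 1
    return rows


def check_case(run_query, query_text, data_text, expected_srj):
    """run_query is a hypothetical callable: (query, turtle data) -> SRJ text."""
    return parse_srj(run_query(query_text, data_text)) == parse_srj(expected_srj)


ROW_A = {"s": {"type": "uri", "value": "http://example.org/a"}}
ROW_B = {"s": {"type": "literal", "value": "b", "xml:lang": "en"}}

expected_srj = json.dumps(
    {"head": {"vars": ["s"]}, "results": {"bindings": [ROW_A, ROW_B]}}
)


def fake_engine(query_text, data_text):
    # Stand-in for a real SPARQL engine: same rows, different order.
    return json.dumps(
        {"head": {"vars": ["s"]}, "results": {"bindings": [ROW_B, ROW_A]}}
    )


passed = check_case(fake_engine, "SELECT ?s WHERE { ?s ?p ?o }", "", expected_srj)
```

In a real harness, fake_engine would be replaced by a function that loads the .ttl data into the store under test and serializes the query result.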

@ceteri
Collaborator

ceteri commented Mar 24, 2022

Excellent, that's great @Mec-iS !

Also, we can instrument for performance analysis when running these SPARQL test suite features:

  • timing (time.time())
  • statistical call stack trace (pyinstrument)
  • memory high-water mark (tracemalloc)
  • memory leak analysis (objgraph)

For example, these get performed in https://github.com/DerwenAI/ray_tutorial/blob/main/pi.ipynb
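
As a rough sketch of the first and third items, using only the standard library: the helper name measure is hypothetical, and time.perf_counter() is swapped in for time.time() because it is better suited to measuring short intervals. Note that tracemalloc only sees allocations made through Python's allocator, so memory used by native extensions is not counted.

```python
import time
import tracemalloc


def measure(fn, *args, **kwargs):
    """Run fn once; return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
    finally:
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak


# Example workload standing in for "run one SPARQL test case".
result, elapsed, peak = measure(lambda: [i * i for i in range(10_000)])
```

The same wrapper could be applied around each test case so that timing and memory figures are collected alongside pass/fail results.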

This will become especially important when we're working with the NVIDIA-based GPU optimizations for kglab.

@Tpt

Tpt commented Mar 24, 2022

Thank you! The W3C SPARQL test suite format does indeed seem to be the best choice. It is implementation-independent and many systems implement it, such as rdflib, Jena, RDF4J, ruby-rdf, sparql-ex and Comunica. So all the effort spent here could be reused to test other systems.

About instrumentation: if we instrument from Python, we would need tools that can instrument native code too. Oxigraph is implemented in Rust and only the rdflib wrapper is in Python. Oxigraph already uses LLVM AddressSanitizer to catch memory errors and leaks, and Criterion for speed benchmarking. I personally use the CLion profiler for profiling, but something that could be publicly shared would be much better.

@Mec-iS
Contributor Author

Mec-iS commented Mar 24, 2022

I was thinking of writing our own manifest parser, but I suppose it is better to reuse the rdflib one, even though it is not possible to access it as a library. It would be better if we could do from rdflib.test.manifest import RDFTest, read_manifest, but I suppose we will need to copy the harness into the kglab tests.

Also I will probably copy the oxigraph-tests directory containing the Oxigraph manifest to run them against kglab SPARQL querying.

@Mec-iS
Contributor Author

Mec-iS commented Apr 4, 2022

#253

@Mec-iS
Contributor Author

Mec-iS commented Apr 12, 2022

@Tpt I am running into this error when passing query results into the bindingsCompatible function:

TypeError: unsupported operand type(s) for -: 'RDFResult' and 'set'

It seems that the tests sometimes return an RDFResult that cannot be cast into a set. For example, test bound/dawg-bound-query-001. Any hints? Thanks

@Tpt

Tpt commented Apr 12, 2022

@Mec-iS I believe it's because bindingsCompatible expects its arguments to be sets, and at the place where it is called the arguments are not cast into sets, unlike what rdflib does.
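
To illustrate the failure mode in isolation, here is a small sketch that does not use the real classes: FakeResult is a hypothetical stand-in for a result object that is iterable but not a set, and as_row_set and remaining are illustrative helpers, not functions from rdflib or kglab. It assumes each row can be read as a dict of variable to value.

```python
class FakeResult:
    """Hypothetical stand-in for a result object: iterable, but not a set."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __iter__(self):
        return iter(self._rows)


def as_row_set(result):
    """Cast an iterable of binding dicts into a set of hashable rows."""
    return {frozenset(row.items()) for row in result}


def remaining(rows, matched):
    """Set difference: the kind of operation that raised the TypeError."""
    return rows - {matched}


result = FakeResult([{"x": "a"}, {"x": "b"}])
matched_row = frozenset({("x", "a")})

# Passing the raw result object fails, because it does not support "-".
try:
    remaining(result, matched_row)
    failed = False
except TypeError:
    failed = True

# Casting to a set of frozensets first makes the difference well defined.
rows = as_row_set(result)
left = remaining(rows, matched_row)
```

Casting both the actual and the expected bindings this way before the comparison avoids the unsupported operand error.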

@VladimirAlexiev

The folders basic and algebra in https://github.com/DerwenAI/kglab/tree/main/tests/rdf_tests/dat contain obsolete SPARQL 1.0 tests, see e.g. http://rawgit2.com/DerwenAI/kglab/main/tests/rdf_tests/dat/algebra/index.html.
Please use the official W3C SPARQL tests.

Test runner: if you're willing to use PHP, there is https://github.com/BorderCloud/TFT that powers http://sparqlscore.com/

@Mec-iS
Contributor Author

Mec-iS commented Jul 20, 2022

@VladimirAlexiev thanks for your comment

Are all the tests in basic and algebra in the data-r2 directory deprecated? Should we use only the ones in the sparql11 directory instead?

@Mec-iS
Contributor Author

Mec-iS commented Aug 15, 2022

After a closer look, there were already sparql11 tests running, but the directory structure is flattened between the sparql1 and sparql11 rdf-tests.

I have added more sparql11 tests in #259.

@ceteri ceteri added this to the Testing and Benchmarking milestone Aug 31, 2022
@Mec-iS
Contributor Author

Mec-iS commented Sep 1, 2022

moved to #250 and #251

@Mec-iS Mec-iS closed this as completed Sep 1, 2022