Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/blazegraph #521

Merged
merged 20 commits into from
May 11, 2022
Merged

Feature/blazegraph #521

merged 20 commits into from
May 11, 2022

Conversation

JeltevanBoheemen
Copy link
Contributor

@JeltevanBoheemen JeltevanBoheemen commented Apr 12, 2022

A few things going on in this PR:

  • Switch from fuseki to blazegraph
  • Update rdflib to 6.X. This resolves most of the bugs we encountered upon switch to blazegraph.
  • Update Python to 3.8 (consequence of rdflib update)
  • Update Django to 3.2, consequence of Python 3.8, and also nice-to-have.
  • Solve the multiple prefix conundrum. This turned out to be a hassle. See Multiple declarations for prefix in SPARQL queries #520
  • Slightly adjusted the ontology migrations. Turned out rdflib uses https for schema.org, frontend uses http (sigh)

To test this, you will need a virtual environment with Python 3.8.x (tested on 3.8.9). If ontology classes do not show up in the picker, clear the ontology graph and re-run rdfmigrate.
I'm working on the updated documentation.

Closes #520, closes #476

@jgonggrijp
Copy link
Member

Possibly helps with #139, almost certainly partly addresses #515.

Copy link
Member

@jgonggrijp jgonggrijp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kudos for getting it to work! Minor comments below.

Anything I need to know about installing Blazegraph before I try this locally (that isn't already in the Blazegraph manual)?

backend/rdf/baseclasses.py Show resolved Hide resolved
backend/rdf/conftest.py Show resolved Hide resolved
backend/rdf/utils.py Outdated Show resolved Hide resolved
backend/rdf/utils_test.py Outdated Show resolved Hide resolved
backend/rdf/utils_test.py Outdated Show resolved Hide resolved
backend/readit/settings.py Show resolved Hide resolved
backend/sparql/constants.py Show resolved Hide resolved
backend/sparql/views.py Outdated Show resolved Hide resolved
backend/sparql/views.py Show resolved Hide resolved
@JeltevanBoheemen
Copy link
Contributor Author

Apologies for the oversight on auto-formatting. I set it to 'only changed code' a while ago and got lazy checking changes before staging.

Follow the blazegraph quickstart and add namespaces readit and readit-test. Both in quads mode.

@jgonggrijp
Copy link
Member

Quick update on testing at my end:

  • Installing Blazegraph went smoothly (after I figured out that I needed the latest release candidate because all previous versions didn't support Java 17).
  • Installing the updated dependencies in a new Python 3.8 venv proved challenging; keepalive refused to install on the grounds that use_2to3 is invalid. After a lot of going back and forth between conflicting recommendations on the web, I was able to install everything by downgrading setuptools to <58.
  • After this, yarn django check ran without errors but yarn test-back did not. Turned out that I needed to a newer version of PostgreSQL (still using 9.4). I'm now in the process of installing PostgreSQL 14.

@jgonggrijp
Copy link
Member

Another update: I successfully installed PostgreSQL 14 and migrated my data (which was a bit of a hassle). Most backend tests pass now, except for the new test_prefix_injection.

In the first assertion that res == prefixed_query, the equation fails because res has many prefixes, while prefixed_query only has rdf and rdfs. Did you do anything in particular to prevent this problem, @JeltevanBoheemen? Maybe an uncommitted change?

I'll hold off on further testing until this is resolved.

@JeltevanBoheemen
Copy link
Contributor Author

JeltevanBoheemen commented Apr 14, 2022

Another update: I successfully installed PostgreSQL 14 and migrated my data (which was a bit of a hassle). Most backend tests pass now, except for the new test_prefix_injection.

Did not realise this. I use PostgreSQL 13, because this is what most of our servers run. Good to realise that this can have an effect on the code.

In the first assertion that res == prefixed_query, the equation fails because res has many prefixes, while prefixed_query only has rdf and rdfs. Did you do anything in particular to prevent this problem, @JeltevanBoheemen? Maybe an uncommitted change?

I'll hold off on further testing until this is resolved.

Thanks for clearing this up. Turns out SPARQLStore keeps a rolling dictionary of nsBinding, I suspect all common namespaces occurring in triples in the store. I only ran the test in isolation. I'll adapt the test.

@JeltevanBoheemen
Copy link
Contributor Author

Quick update on testing at my end:

  • Installing Blazegraph went smoothly (after I figured out that I needed the latest release candidate because all previous versions didn't support Java 17).

Great

  • Installing the updated dependencies in a new Python 3.8 venv proved challenging; keepalive refused to install on the grounds that use_2to3 is invalid. After a lot of going back and forth between conflicting recommendations on the web, I was able to install everything by downgrading setuptools to <58.

This is unfortunate. As we discussed, this is best solved by installing a specific version in the deployment script. I'll make sure the documentation mentions this.

@JeltevanBoheemen
Copy link
Contributor Author

Installing the updated dependencies in a new Python 3.8 venv proved challenging; keepalive refused to install on the grounds >that use_2to3 is invalid. After a lot of going back and forth between conflicting recommendations on the web, I was able to >install everything by downgrading setuptools to <58.

I removed the keepalive dependency. It was used in sparqlwrapper, which was earlier removed. Keep-alive is default in urllib3. With these changes I was able to yarn install-back on a fresh virtual environment without problems.

@JeltevanBoheemen
Copy link
Contributor Author

Another update: I successfully installed PostgreSQL 14 and migrated my data (which was a bit of a hassle). Most backend tests pass now, except for the new test_prefix_injection.

Did not realise this. I use PostgreSQL 13, because this is what most of our servers run. Good to realise that this can have an effect on the code.

In the first assertion that res == prefixed_query, the equation fails because res has many prefixes, while prefixed_query only has rdf and rdfs. Did you do anything in particular to prevent this problem, @JeltevanBoheemen? Maybe an uncommitted change?
I'll hold off on further testing until this is resolved.

Thanks for clearing this up. Turns out SPARQLStore keeps a rolling dictionary of nsBinding, I suspect all common namespaces occurring in triples in the store. I only ran the test in isolation. I'll adapt the test.

I fixed this issue.

@JeltevanBoheemen JeltevanBoheemen added this to the Next release milestone May 11, 2022
@JeltevanBoheemen JeltevanBoheemen merged commit 6e5d9eb into develop May 11, 2022
@jgonggrijp jgonggrijp deleted the feature/blazegraph branch May 11, 2022 10:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multiple declarations for prefix in SPARQL queries Upgrade Django to 3.x
2 participants