RDFLib 7.0.0

@aucampia aucampia released this 01 Aug 21:17
· 266 commits to main since this release
708aecd

2023-08-02 RELEASE 7.0.0

This is a major release with relatively slight breaking changes, new
features and bug fixes.

The most notable breaking change relates to how RDFLib handles the
publicID parameter of the Graph.parse and Dataset.parse methods.
Most users should not be affected by this change.

Instructions on adapting existing code to the breaking changes can be
found in the upgrade guide from Version 6 to Version 7 which should be
available here.

It is likely that the next couple of RDFLib releases will all be major
versions, mostly because there are some more shortcomings of RDFLib's
public interface that should be addressed.

If you use RDFLib, please consider keeping an eye on discussions, issues
and pull-requests labelled with "feedback wanted".

A big thanks to everyone who contributed to this release.

BREAKING CHANGE: don't use publicID as the name for the default graph. (#2406)

Commit 4b96e9d, closes #2406.

Previously, when parsing data into a ConjunctiveGraph or Dataset, the triples
in the default graphs of the sources were loaded into a graph named publicID.

This behaviour has been changed, and now the triples from the default graph in
source RDF documents will be loaded into ConjunctiveGraph.default_context or
Dataset.default_context.

The publicID parameter of the ConjunctiveGraph.parse and Dataset.parse
methods will now only be used as the base URI for relative URI resolution.

BREAKING CHANGE: drop support for python 3.7 (#2436)

Commit 1e5f56b, closes #2436.

Python 3.7 will be end-of-life on the 27th of June 2023 and the next release of
RDFLib will be a new major version.

This changes the minimum supported version of Python to 3.8.1 as some of the
dependencies we use are not too fond of python 3.8.0. This change also removes
all accommodations for older python versions.

feat: add curie method to NamespaceManager (#2365)

Commit f200722, closes #2365.

Added a curie method to NamespaceManager, which can be used to generate a
CURIE from a URI.

Other changes:

  • Fixed NamespaceManager.expand_curie to work with CURIEs that have blank
    prefixes (e.g. :something), which are valid according to CURIE Syntax
    1.0.
  • Added a test to confirm #2077.

Fixes #2348.

feat: add optional target_graph argument to Graph.cbd and use it for DESCRIBE queries (#2322)

Commit 81d13d4, closes #2322.

Add an optional keyword-only target_graph argument to rdflib.graph.Graph.cbd and use this new argument in evalDescribeQuery.

This makes it possible to compute a concise bounded description without creating a new graph to hold the result, and also without potentially having to copy it to another final graph.

feat: Don't generate prefixes for unknown URIs (#2467)

Commit bd797ac.

When serializing RDF graphs, URIs with unknown prefixes were assigned a
namespace like ns1:. While this resulted in smaller files, it also
produced output that is not as readable.

This change removes this automatic assignment of namespace prefixes.

This is somewhat of an aesthetic choice. Eventually we should have more
flexibility in this regard so that users can exercise more control over
how URIs in unknown namespaces are handled.

With this change, users can still manually create namespace prefixes for
URIs in unknown namespaces, but before it there was no way to avoid the
undesired behaviour, so this seems like the better default.

feat: Longturtle improvements (#2500)

Commit 5ee8bd7, closes #2500.

Improved the output of the longturtle serializer.

fix: SPARQL count with optionals (#2448)

Commit 46ff6cf, closes #2448.

Change the SPARQL COUNT aggregate to ignore optionals that are unbound
instead of raising an exception when they are encountered.

fix: GROUP_CONCAT handling of empty separator (issue) (#2474)

Commit e94c252, closes #2474.

GROUP_CONCAT was handling an empty separator (i.e. "") incorrectly:
it treated it as if the separator were not set, so the default separator,
a single space (i.e. " "), was used instead.

This change fixes it so that an empty separator with GROUP_CONCAT
results in a value with nothing between concatenated values.

Fixes #2473

fix: add NORMALIZE_LITERALS to rdflib.__all__ (#2489)

Commit 6981c28, closes #2489.

This gets Sphinx to generate documentation for it, and also clearly
indicates that it can be used from outside the module.

fix: bugs with rdflib.extras.infixowl (#2390)

Commit cd0b442, closes #2390.

Fix the following issues in rdflib.extras.infixowl:

  • getting and setting of max cardinality only considered identifiers and not other RDF terms.
  • The return value of manchesterSyntax was wrong for some cases.
  • The way that BooleanClass was generating its string representation (i.e. BooleanClass.__repr__) was wrong for some cases.

Other changes:

  • Added an example for using infixowl to create an ontology.
  • Updated infixowl tests.
  • Updated infixowl documentation.

This code is based on code from:

fix: correct imports and __all__ (#2340)

Commit 7df77cd, closes #2340.

Disable implicit_reexport and eliminate all errors reported by mypy after this.

This helps ensure that import statements import from the right module and that
the __all__ variable is correct.

fix: dbpedia URL to use https instead of http (#2444)

Commit ef25896, closes #2444.

The URL for the service keyword used the http address of the dbpedia endpoint, which no longer works. It now uses https, which does work.

fix: eliminate bare except: (#2350)

Commit 4ea1436, closes #2350.

Replace bare except: with except Exception. There are some cases where it
can be narrowed further, but this is already an improvement over the current
situation.

This is somewhat pursuant to eliminating flakeheaven, as it no longer
supports the latest version of flake8 [ref]. But it is also just the right
thing to do, as bare exceptions can cause problems.
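
One of the problems with a bare except: is that it also catches SystemExit and
KeyboardInterrupt. The sketch below shows the difference using only the
standard library. All function names are hypothetical and not taken from
RDFLib.

```python
def swallow_bare(func):
    """Bare except: catches everything, including SystemExit."""
    try:
        func()
    except:  # noqa: E722 - shown only for contrast
        return "caught"
    return "ok"


def swallow_exception(func):
    """except Exception: lets SystemExit and KeyboardInterrupt propagate."""
    try:
        func()
    except Exception:
        return "caught"
    return "ok"


def request_exit():
    raise SystemExit(1)


def fail():
    raise ValueError("ordinary error")


bare_result = swallow_bare(request_exit)  # exit request is silently swallowed
normal_result = swallow_exception(fail)   # ordinary errors are still handled
try:
    swallow_exception(request_exit)
    propagated = False
except SystemExit:
    propagated = True                     # exit request reaches the caller
```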

fix: eliminate file intermediary in translate algebra (#2267)

Commit ae6b859, closes #2267.

Previously, rdflib.plugins.sparql.algebra.translateAlgebra() maintained state via a file, with a fixed filename query.txt. With this change, use of that file is eliminated; state is now maintained in memory so that multiple concurrent translateAlgebra() calls, for example, should no longer interfere with each other.

The change is accomplished with no change to the client interface. Basically, the actual functionality has been moved into a class, which is instantiated and used as needed (once per call to algebra.translateAlgebra()).

fix: eliminate some mutable default arguments in SPARQL code (#2301)

Commit 89982f8, closes #2301.

This change eliminates some situations where a mutable object (i.e., a dictionary) was used as the default value for function arguments in the rdflib.plugins.sparql.processor module and related code. It replaces these defaults with a typing.Optional argument that defaults to None and is then handled within the function. Luckily, some of the code that the SPARQL Processor relied on already had this style, meaning not a lot of changes had to be made.

This change also makes a small update to the logic in the SPARQL Processor's query function to simplify the if/else statement. This better mirrors the implementation in the UpdateProcessor.
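
The general pitfall and the replacement pattern can be sketched as follows.
The function names are hypothetical and only illustrate the idiom; they are
not functions from the SPARQL processor.

```python
from typing import Dict, Optional


def bind_shared(
    name: str, value: str, bindings: Dict[str, str] = {}
) -> Dict[str, str]:
    # Pitfall: the default dict is created once and shared between calls.
    bindings[name] = value
    return bindings


def bind_fresh(
    name: str, value: str, bindings: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    # Replacement pattern: default to None and create the dict per call.
    if bindings is None:
        bindings = {}
    bindings[name] = value
    return bindings


bind_shared("a", "1")
leaked = bind_shared("b", "2")  # still contains "a" from the previous call
bind_fresh("a", "1")
clean = bind_fresh("b", "2")    # only contains "b"
```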

fix: formatting of SequencePath and AlternativePath (#2504)

Commit 9c73581, closes #2504.

These path types were formatted without parentheses even if they
contained multiple elements, resulting in string representations that
did not accurately represent the path.

This change fixes the formatting so that the string representations are
enclosed in parentheses when necessary.

fix: handling of rdf:HTML literals (#2490)

Commit 588286b, closes #2490.

Previously, without html5lib installed, literals with rdf:HTML
datatypes were treated as ill-typed, even if they were not ill-typed.

With this change, if html5lib is not installed, literals with the
rdf:HTML datatype will not be treated as ill-typed, and will have
None as their ill_typed attribute value, which means that it is
unknown whether they are ill-typed or not.

This change also fixes the mapping from rdf:HTML literal values to
lexical forms.

Other changes:

  • Add tests for rdflib.NORMALIZE_LITERALS to ensure it behaves
    correctly.

Related issues:

fix: HTTP 308 Permanent Redirect status code handling (#2389)

Commit e0b3152, closes #2389, https://docs.python.org/3.11/whatsnew/changelog.html#id128.

Change the handling of HTTP status code 308 to behave more like
urllib.request.HTTPRedirectHandler, most critically, the new 308 handling will
create a new urllib.request.Request object with the new URL, which will
prevent state from being carried over from the original request.

One case where this is important is when the domain name changes, for example,
when the original URL is http://www.w3.org/ns/adms.ttl and the redirect URL is
https://uri.semic.eu/w3c/ns/adms.ttl. With the previous behaviour, the redirect
would contain a Host header with the value www.w3.org instead of
uri.semic.eu because the Host header is placed in
Request.unredirected_hdrs and takes precedence over the Host header in
Request.headers.
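
The effect of creating a new request object can be sketched with the standard
library alone. This is not RDFLib's actual redirect handler: the Host header is
set by hand to mimic what urllib records when a request is first sent, and no
network access takes place.

```python
from urllib.request import Request

original = Request(
    "http://www.w3.org/ns/adms.ttl", headers={"Accept": "text/turtle"}
)
# urllib's HTTP handler stores Host as an "unredirected" header when the
# request is first sent; it is set by hand here to avoid any network access.
original.add_unredirected_header("Host", "www.w3.org")

# Building a fresh Request for the redirect target keeps the regular headers
# but starts with empty unredirected headers, so no stale Host is carried over.
redirected = Request(
    "https://uri.semic.eu/w3c/ns/adms.ttl",
    headers=dict(original.headers),
)

stale_host = original.unredirected_hdrs.get("Host")
fresh_host = redirected.unredirected_hdrs.get("Host")
print(stale_host, fresh_host)
```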

Other changes:

  • Only handle HTTP status code 308 on Python versions before 3.11, as Python 3.11
    will handle 308 by default [ref].

  • Move code which uses http://www.w3.org/ns/adms.ttl and
    http://www.w3.org/ns/adms.rdf out of test_guess_format_for_parse into a
    separate parameterized test, which instead uses the embedded http server.

    This allows the test to fully control the Content-Type header in the
    response instead of relying on the value that the server is sending.

    This is needed because the server is sending Content-Type: text/plain for
    the adms.ttl file, which is not a valid RDF format, and the test is
    expecting Content-Type: text/turtle.

Fixes:

fix: lexical-to-value mapping of rdf:HTML literals (#2483)

Commit 53aaf02, closes #2483.

Use strict mode when parsing rdf:HTML literals. This ensures that when
lexical-to-value mapping (i.e. parsing) of a literal with the rdf:HTML
datatype occurs, a value will only be assigned if the lexical form is a
valid HTML5 fragment. Otherwise, i.e. for invalid fragments, no value will
be associated with the literal [ref] and the literal will be ill-typed.

fix: TriG handling of GRAPH keyword without a graph ID (#2469)

Commit 8c9608b, closes #2469.

The RDF 1.1 TriG grammar only allows the GRAPH keyword if it
is followed by a graph identifier [ref].

This change enforces this rule so that the
http://www.w3.org/2013/TriGTests/#trig-graph-bad-01 test passes.

fix: TriG parser error handling for nested graphs (#2468)

Commit afea615, closes #2468.

Raise an error when nested graphs occur in TriG.

With this change, the http://www.w3.org/2013/TriGTests/#trig-graph-bad-07 test passes.

fix: typing errors from dmypy (#2451)

Commit 10f9ebe, closes #2451.

Fix various typing errors that are reported when running with dmypy,
the mypy daemon.

Also add a task for running dmypy to the Taskfile that can be selected
as the default mypy variant by setting the MYPY_VARIANT environment
variable to dmypy.

fix: widen Graph.__contains__ type-hints to accept Path values (#2323)

Commit 1c45ec4, closes #2323.

Change the type-hints for Graph.__contains__ to also accept Path
values as the parameter is passed to the Graph.triples function,
which accepts Path values.

docs: Add CITATION.cff file (#2502)

Commit ad5c0e1, closes #2502.

The CITATION.cff file provides release metadata which is used by
Zenodo and other software and systems.

This file's content is best-effort, and pull requests with improvements
are welcome and will affect future releases.

docs: add guidelines for breaking changes (#2402)

Commit cad367e, closes #2402.

Add guidelines on how breaking changes should be approached.

The guidelines take a very pragmatic approach with known downsides, but this
seems like the best compromise given the current situation.

For prior discussion on this point see:

docs: fix comment that doesn't describe behavior (#2443)

Commit 4e42d10, closes #2443.

The comment refers to a person that knows bob, and the code would return a name,
but this would only work if the triple person foaf:name bob . is part of the dataset.

As this is a very uncommon way to model a foaf:knows, the code was
adjusted to match the description.

docs: recommend making an issue before making an enhancement (#2391)

Commit 63b082c, closes #2391.

Suggest that contributors first make an issue to get in-principle
agreement for pull requests before making the pull request.

Enhancements can be controversial, and we may reject the enhancement
sometimes, even if the code is good, as it may just not be deemed
important enough to increase the maintenance burden of RDFLib.

Other changes:

  • Updated the checklist in the pull request template to be more accurate to
    current practice.
  • Improved grammar and writing in the pull request template, contribution guide
    and developers guide.

docs: remove unicode string form in rdflib/term.py (#2384)

Commit ddcc4eb, closes #2384.

The use of Unicode literals is an artefact of Python 2 and is incorrect in Python 3.

Doctests for docstrings using Unicode literals only pass because ALLOW_UNICODE
is set, but this option should be disabled as RDFLib does not support Python 2 any more.

This partially resolves #2378.