RDFlib 7.0.0
2023-08-02 RELEASE 7.0.0
This is a major release with relatively slight breaking changes, new
features and bug fixes.
The most notable breaking change relates to how RDFLib handles the
publicID parameter of the Graph.parse and Dataset.parse methods.
Most users should not be affected by this change.
Instructions on adapting existing code to the breaking changes can be
found in the upgrade guide from Version 6 to Version 7 which should be
available here.
It is likely that the next couple of RDFLib releases will all be major
versions, mostly because there are some more shortcomings of RDFLib's
public interface that should be addressed.
If you use RDFLib, please consider keeping an eye on discussions,
issues and pull-requests labelled with "feedback wanted".
A big thanks to everyone who contributed to this release.
BREAKING CHANGE: don't use publicID as the name for the default graph. (#2406)
When parsing data into a ConjunctiveGraph or Dataset, the triples in the
default graphs in the sources were previously loaded into a graph named publicID.
This behaviour has been changed, and now the triples from the default graph in
source RDF documents will be loaded into ConjunctiveGraph.default_context or
Dataset.default_context.
The publicID parameter to the ConjunctiveGraph.parse and Dataset.parse
methods will now only be used as the base URI for relative URI resolution.
BREAKING CHANGE: drop support for python 3.7 (#2436)
Python 3.7 will be end-of-life on the 27th of June 2023 and the next release of
RDFLib will be a new major version.
This changes the minimum supported version of Python to 3.8.1 as some of the
dependencies we use are not too fond of Python 3.8.0. This change also removes
all accommodations for older Python versions.
feat: add curie method to NamespaceManager (#2365)
Added a curie method to NamespaceManager, which can be used to generate a
CURIE from a URI.
Other changes:
- Fixed NamespaceManager.expand_curie to work with CURIEs that have blank
  prefixes (e.g. :something), which are valid according to CURIE Syntax 1.0.
- Added a test to confirm #2077.
Fixes #2348.
feat: add optional target_graph argument to Graph.cbd and use it for DESCRIBE queries (#2322)
Add optional keyword-only target_graph argument to rdflib.graph.Graph.cbd
and use this new argument in evalDescribeQuery.
This makes it possible to compute a concise bounded description without creating a new graph to hold the result, and also without potentially having to copy it to another final graph.
feat: Don't generate prefixes for unknown URIs (#2467)
Commit bd797ac.
When serializing RDF graphs, URIs with unknown prefixes were assigned a
namespace like ns1:. While the result would be smaller files, it does
result in output that is not as readable.
This change removes this automatic assignment of namespace prefixes.
This is somewhat of an aesthetic choice; eventually we should have more
flexibility in this regard so that users can exercise more control over
how URIs in unknown namespaces are handled.
With this change, users can still manually create namespace prefixes for
URIs in unknown namespaces, but before it there was no way to avoid the
undesired behaviour, so this seems like the better default.
feat: Longturtle improvements (#2500)
Improved the output of the longturtle serializer.
fix: SPARQL count with optionals (#2448)
Change SPARQL count aggregate to ignore optional variables that are unbound
instead of raising an exception when they are encountered.
fix: GROUP_CONCAT handling of empty separator (issue) (#2474)
GROUP_CONCAT was handling an empty separator (i.e. "") incorrectly:
it would handle it as if the separator were not set, so essentially it was
treated as a single space (i.e. " ").
This change fixes it so that an empty separator with GROUP_CONCAT
results in a value with nothing between concatenated values.
Fixes #2473
fix: add NORMALIZE_LITERALS to rdflib.__all__ (#2489)
This gets Sphinx to generate documentation for it, and also clearly
indicates that it can be used from outside the module.
- Fixes #2488
fix: bugs with rdflib.extras.infixowl (#2390)
Fix the following issues in rdflib.extras.infixowl:
- Getting and setting of max cardinality only considered identifiers and not other RDF terms.
- The return value of manchesterSyntax was wrong for some cases.
- The way that BooleanClass was generating its string representation (i.e. BooleanClass.__repr__) was wrong for some cases.
Other changes:
- Added an example for using infixowl to create an ontology.
- Updated infixowl tests.
- Updated infixowl documentation.
This code is based on code from:
fix: correct imports and __all__ (#2340)
Disable implicit_reexport and eliminate all errors reported by mypy after this.
This helps ensure that import statements import from the right module and that
the __all__ variable is correct.
fix: dbpedia URL to use https instead of http (#2444)
The URL for the service keyword had the http address for the dbpedia endpoint, which no longer works. It has been changed to https, which does work.
fix: eliminate bare except: (#2350)
Replace bare except: with except Exception. There are some cases where it
can be narrowed further, but this is already an improvement over the current
situation.
This is somewhat pursuant to eliminating flakeheaven, as it no longer
supports the latest version of flake8 [ref]. But it also is
just the right thing to do as bare exceptions can cause problems.
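One of the problems with a bare except: is that it also swallows KeyboardInterrupt and SystemExit. The sketch below is not RDFLib code; run_guarded, fail and interrupt are hypothetical helpers that show how except Exception lets those through.

```python
def run_guarded(action):
    """Run action, turning ordinary errors into a message."""
    try:
        return action()
    except Exception as error:
        # KeyboardInterrupt and SystemExit derive from BaseException,
        # not Exception, so they are not caught here.
        return f"failed: {error}"


def fail():
    raise ValueError("bad value")


def interrupt():
    raise KeyboardInterrupt


handled = run_guarded(fail)

try:
    run_guarded(interrupt)
    interrupt_propagated = False
except KeyboardInterrupt:
    interrupt_propagated = True

print(handled, interrupt_propagated)
```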
fix: eliminate file intermediary in translate algebra (#2267)
Previously, rdflib.plugins.sparql.algebra.translateAlgebra() maintained state via a file, with a fixed filename query.txt. With this change, use of that file is eliminated; state is now maintained in memory so that multiple concurrent translateAlgebra() calls, for example, should no longer interfere with each other.
The change is accomplished with no change to the client interface. Basically, the actual functionality has been moved into a class, which is instantiated and used as needed (once per call to algebra.translateAlgebra()).
fix: eliminate some mutable default arguments in SPARQL code (#2301)
This change eliminates some situations where a mutable object (i.e., a dictionary) was used as the default value for functions in the rdflib.plugins.sparql.processor module and related code. It replaces these situations with typing.Optional that defaults to None, and is then handled within the function. Luckily, some of the code that the SPARQL Processor relied on already had this style, meaning not a lot of changes had to be made.
This change also makes a small update to the logic in the SPARQL Processor's query function to simplify the if/else statement. This better mirrors the implementation in the UpdateProcessor.
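The pitfall being removed is a general Python one. The sketch below is not taken from the RDFLib source; risky and safe are hypothetical functions that contrast a mutable default with the Optional/None pattern.

```python
from typing import Dict, Optional


def risky(key: str, bindings: Dict[str, str] = {}) -> Dict[str, str]:
    # The default dict is created once and shared between calls.
    bindings[key] = "seen"
    return bindings


def safe(key: str, bindings: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    # A fresh dict is created on every call that omits the argument.
    if bindings is None:
        bindings = {}
    bindings[key] = "seen"
    return bindings


risky("a")
leaked = risky("b")  # still contains "a" from the previous call

safe("a")
clean = safe("b")    # only contains "b"

print(leaked, clean)
```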
fix: formatting of SequencePath and AlternativePath (#2504)
These path types were formatted without parentheses even if they
contained multiple elements, resulting in string representations that
did not accurately represent the path.
This change fixes the formatting so that the string representations are
enclosed in parentheses when necessary.
- Fixes #2503.
fix: handling of rdf:HTML literals (#2490)
Previously, without html5lib installed, literals with rdf:HTML datatypes
were treated as ill-typed, even if they were not ill-typed.
With this change, if html5lib is not installed, literals with the
rdf:HTML datatype will not be treated as ill-typed, and will have
None as their ill_typed attribute value, which means that it is
unknown whether they are ill-typed or not.
This change also fixes the mapping from rdf:HTML literal values to
lexical forms.
Other changes:
- Add tests for rdflib.NORMALIZE_LITERALS to ensure it behaves correctly.
Related issues:
- Fixes #2475
fix: HTTP 308 Permanent Redirect status code handling (#2389)
Commit e0b3152, closes #2389, https://docs.python.org/3.11/whatsnew/changelog.html#id128.
Change the handling of HTTP status code 308 to behave more like
urllib.request.HTTPRedirectHandler. Most critically, the new 308 handling will
create a new urllib.request.Request object with the new URL, which will
prevent state from being carried over from the original request.
One case where this is important is when the domain name changes, for example,
when the original URL is http://www.w3.org/ns/adms.ttl and the redirect URL is
https://uri.semic.eu/w3c/ns/adms.ttl. With the previous behaviour, the redirect
would contain a Host header with the value www.w3.org instead of
uri.semic.eu because the Host header is placed in
Request.unredirected_hdrs and takes precedence over the Host header in
Request.headers.
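The mechanism can be illustrated with the standard library alone. This sketch is not the RDFLib implementation; it sets the Host header by hand, where urllib would normally do so when sending, so that it runs without network access.

```python
from urllib.request import Request

ORIGINAL_URL = "http://www.w3.org/ns/adms.ttl"
REDIRECT_URL = "https://uri.semic.eu/w3c/ns/adms.ttl"

original = Request(ORIGINAL_URL, headers={"Accept": "text/turtle"})
# urllib stores the Host header in unredirected_hdrs when a request is sent;
# it is added manually here to mimic that state.
original.add_unredirected_header("Host", original.host)

# Reusing the same object for the redirect keeps the stale Host header.
original.full_url = REDIRECT_URL
stale_host = original.unredirected_hdrs["Host"]

# A new Request only carries over what is passed in explicitly.
fresh = Request(REDIRECT_URL, headers=original.headers)

print(stale_host, fresh.host, fresh.unredirected_hdrs)
```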
Other changes:
- Only handle HTTP status code 308 on Python versions before 3.11 as Python 3.11
  will handle 308 by default [ref].
- Move code which uses http://www.w3.org/ns/adms.ttl and
  http://www.w3.org/ns/adms.rdf out of test_guess_format_for_parse into a
  separate parameterized test, which instead uses the embedded http server.
  This allows the test to fully control the Content-Type header in the
  response instead of relying on the value that the server is sending.
  This is needed because the server is sending Content-Type: text/plain for
  the adms.ttl file, which is not a valid RDF format, and the test is
  expecting Content-Type: text/turtle.
Fixes:
fix: lexical-to-value mapping of rdf:HTML literals (#2483)
Use strict mode when parsing rdf:HTML literals. This ensures that when
lexical-to-value mapping (i.e. parsing) of a literal with rdf:HTML data type
occurs, a value will only be assigned if the lexical form is a valid HTML5
fragment.
Otherwise, i.e. for invalid fragments, no value will be associated with
the literal [ref] and the literal will be ill-typed.
fix: TriG handling of GRAPH keyword without a graph ID (#2469)
Commit 8c9608b, closes #2469, http://www.w3.org/2013/TriGTests/#trig-graph-bad-01.
The RDF 1.1 TriG grammar only allows the GRAPH keyword if it
is followed by a graph identifier [ref].
This change enforces this rule so that the
http://www.w3.org/2013/TriGTests/#trig-graph-bad-01 test passes.
fix: TriG parser error handling for nested graphs (#2468)
Commit afea615, closes #2468, http://www.w3.org/2013/TriGTests/#trig-graph-bad-07.
Raise an error when nested graphs occur in TriG.
With this change, the http://www.w3.org/2013/TriGTests/#trig-graph-bad-07 test passes.
fix: typing errors from dmypy (#2451)
Fix various typing errors that are reported when running with dmypy,
the mypy daemon.
Also add a task for running dmypy to the Taskfile that can be selected
as the default mypy variant by setting the MYPY_VARIANT environment
variable to dmypy.
fix: widen Graph.__contains__ type-hints to accept Path values (#2323)
Change the type-hints for Graph.__contains__ to also accept Path
values as the parameter is passed to the Graph.triples function,
which accepts Path values.
docs: Add CITATION.cff file (#2502)
The CITATION.cff file provides release metadata which is used by
Zenodo and other software and systems.
This file's content is best-effort, and pull requests with improvements
are welcome and will affect future releases.
docs: add guidelines for breaking changes (#2402)
Add guidelines on how breaking changes should be approached.
The guidelines take a very pragmatic approach with known downsides, but this
seems like the best compromise given the current situation.
For prior discussion on this point see:
docs: fix comment that doesn't describe behavior (#2443)
The comment refers to a person that knows bob and the code would return a name,
but this would only work if the triple person foaf:name bob . is part of the dataset.
As this is a very uncommon way to model a foaf:knows relationship, the code was
adjusted to match the description.
docs: recommend making an issue before making an enhancement (#2391)
Suggest that contributors first make an issue to get in principle
agreement for pull requests before making the pull request.
Enhancements can be controversial, and we may reject the enhancement
sometimes, even if the code is good, as it may just not be deemed
important enough to increase the maintenance burden of RDFLib.
Other changes:
- Updated the checklist in the pull request template to be more accurate to
  current practice.
- Improved grammar and writing in the pull request template, contribution guide
  and developers guide.
docs: remove unicode string form in rdflib/term.py (#2384)
The use of Unicode literals is an artefact of Python 2 and is incorrect in Python 3.
Doctests for docstrings using Unicode literals only pass because ALLOW_UNICODE is set, but this option should be disabled as RDFLib does not support Python 2 any more.
This partially resolves #2378.