fix(deps): update dependency io.openlineage:openlineage-java to v1.23.0 #2907
Merged
✅ Deploy Preview for peppy-sprite-186812 canceled.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files:
@@ Coverage Diff @@
##            main    #2907   +/- ##
=========================================
Coverage    81.16%  81.16%
Complexity  1506    1506
=========================================
Files       268     268
Lines       7363    7363
Branches    329     329
=========================================
Hits        5976    5976
Misses      1226    1226
Partials    161     161
renovate bot force-pushed the renovate/openlineageversion branch 3 times, most recently from aeec48b to 87f661d (October 4, 2024 16:54)
renovate bot changed the title from "fix(deps): update dependency io.openlineage:openlineage-java to v1.22.0" to "fix(deps): update dependency io.openlineage:openlineage-java to v1.23.0" (Oct 4, 2024)
renovate bot force-pushed the renovate/openlineageversion branch from 87f661d to 2b7b780 (October 8, 2024 17:57)
renovate bot changed the title from "fix(deps): update dependency io.openlineage:openlineage-java to v1.23.0" to "Update dependency io.openlineage:openlineage-java to v1.23.0" (Oct 8, 2024)
renovate bot force-pushed the renovate/openlineageversion branch 9 times, most recently from 77a3521 to 98a2996 (October 15, 2024 18:00)
renovate bot force-pushed the renovate/openlineageversion branch 11 times, most recently from 6e10782 to e8b4d92 (October 22, 2024 05:54)
renovate bot changed the title from "Update dependency io.openlineage:openlineage-java to v1.23.0" to "fix(deps): update dependency io.openlineage:openlineage-java to v1.23.0" (Oct 22, 2024)
renovate bot force-pushed the renovate/openlineageversion branch from e8b4d92 to 282a739 (October 22, 2024 21:12)
renovate bot force-pushed the renovate/openlineageversion branch 7 times, most recently from 743674c to 8cb254c (October 24, 2024 20:26)
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
renovate bot force-pushed the renovate/openlineageversion branch from 8cb254c to 17d596b (October 24, 2024 20:57)
wslulciuc approved these changes (Oct 24, 2024)
lmassaoy pushed a commit to nubank/NuMarquez that referenced this pull request (Oct 28, 2024):
* Fixing data quality display. (MarquezProject#2937) Signed-off-by: phixMe
* Dataset Version call simplification (MarquezProject#2938)
  * Fixing data quality display. Signed-off-by: phixMe
  * Fixing dataset version calls. Signed-off-by: phixMe
  --------- Signed-off-by: phixMe
* feat: allow db-migrate without version (MarquezProject#2936) Signed-off-by: David Goss
* Display full `runID` and check icon when copied (MarquezProject#2940) Signed-off-by: Willy Lulciuc
* Deferred copy revert. (MarquezProject#2941) Signed-off-by: phixMe
* Long text handling (MarquezProject#2942)
  * Deferred copy revert. Signed-off-by: phixMe
  * Long text handling. Signed-off-by: phixMe
  * Adding search back in. Signed-off-by: phixMe
  --------- Signed-off-by: phixMe
* Use project root for docker volume prefix (MarquezProject#2943) Signed-off-by: Willy Lulciuc
* fix: Correct SQL query pagination for DatasetVersion findAll method (MarquezProject#2945) Signed-off-by: Alper İnan Signed-off-by: Alper
* Update changelog for `0.50.0` Signed-off-by: Willy Lulciuc
* Replace `redoc-cli` with `redocly` Signed-off-by: Willy Lulciuc
* Prepare for release 0.50.0 Signed-off-by: Willy Lulciuc
* Prepare next development version 0.51.0-SNAPSHOT Signed-off-by: Willy Lulciuc
* Templatize event time in `metadata.json` (MarquezProject#2946)
  * Templatize event time in `metadata.json` Signed-off-by: Willy Lulciuc
  * Use `metadata.template.json` Signed-off-by: Willy Lulciuc
  --------- Signed-off-by: Willy Lulciuc
* Update CHANGELOG.md
* Update `web/docs/demo.gif` (MarquezProject#2948) Signed-off-by: Willy Lulciuc
* fix(deps): update dependency io.openlineage:openlineage-java to v1.23.0 (MarquezProject#2907) Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* fix(deps): update dependency org.assertj:assertj-core to v3.26.3 (MarquezProject#2909) Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: Willy Lulciuc
* fix(deps): update dependency org.postgresql:postgresql to v42.7.4 (MarquezProject#2912) Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* fix(deps): update dependency org.opensearch.client:opensearch-rest-client to v2.17.1 (MarquezProject#2911) Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: Willy Lulciuc
* fix(deps): update dependency org.apache.commons:commons-lang3 to v3.17.0 (MarquezProject#2908) Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* Ignore `**/stats/**` (MarquezProject#2952) Signed-off-by: Willy Lulciuc
* Update compatibility for `0.50.0`
* fix(deps): update dependency org.opensearch.client:opensearch-java to v2.16.0 (MarquezProject#2910) Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* `Dataset.currentVersionUuid` `->` `DatasetVersion.uuid` (MarquezProject#2954) Signed-off-by: Willy Lulciuc
* Update Events Page (MarquezProject#2955)
  * Tuning the events page for longer events. Signed-off-by: phixMe
  * Adding events file. Signed-off-by: phixMe
  * Refetch jobs button. Signed-off-by: phixMe
  * Refetch jobs button. Signed-off-by: phixMe
  * Lint Signed-off-by: phixMe
  --------- Signed-off-by: phixMe Co-authored-by: Willy Lulciuc
* Lineage run attachment issue. (MarquezProject#2953) Signed-off-by: phixMe Co-authored-by: Willy Lulciuc
* feature: Better handling of missing environment variables in setupProxy.js file. (MarquezProject#2956) Signed-off-by: Artur Owczarek
---------
Signed-off-by: phixMe
Signed-off-by: David Goss
Signed-off-by: Willy Lulciuc
Signed-off-by: Alper İnan
Signed-off-by: Alper
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Signed-off-by: Artur Owczarek
Co-authored-by: Peter Hicks
Co-authored-by: davidjgoss
Co-authored-by: Willy Lulciuc
Co-authored-by: Alper İnan
Co-authored-by: Willy Lulciuc
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Artur Owczarek
jonathanpmoraes pushed a commit to nubank/NuMarquez that referenced this pull request (Nov 12, 2024), carrying the same squashed commit message as the Oct 28, 2024 commit above.
This PR contains the following updates:
io.openlineage:openlineage-java: 1.13.1 -> 1.23.0
Release Notes
OpenLineage/OpenLineage (io.openlineage:openlineage-java)
v1.23.0
Added
#3039 @JDarDagran: This allows users to specify multiple targets to which OpenLineage events will be emitted.
#3062 @Imbruced: Interfaces are now able to extract lineage from the Table interface, not only RelationProvider.
#3043 @ddebowczyk92: Dataplex transport is now available as a separate Maven package for users who want to send OL events to GCP Dataplex.
#3077 @ddebowczyk92: GCS transport is now available as a separate Maven package for users who want to send OL events to Google Cloud Storage.
#3129 @arturowczarek: S3 transport is now available as a separate Maven package for users who want to send OL events to S3.
#3094 @JDarDagran: Specified variables are now autotranslated to configuration values.
#3114 @JDarDagran: Specified variables are now autotranslated to configuration values.
#3116 @JDarDagran: Allows users to add custom headers, for example for auth purposes.
#3097 #3098 @arturowczarek: Now, if datasetLineageEnabled is enabled and column-level lineage depends on the whole dataset, a dataset dependency is added instead of listing all the column fields in that dataset.
#3122 @ddebowczyk92: This prevents a number of issues that might be caused by not closing underlying transports.
Fixed
#3054 @JDarDagran: This fixes an issue where the NominalTimeRunFacet breaks when nominalEndTime is None.
#2962 @Imbruced: With this change, if the SQL specifies a CTE but does not use it in the final query, the lineage won't be falsely reported.
#3068 @jonathanlbt1: This change enhances the performance and docs of the fluentd proxy plugin.
#3107 @Imbruced: For some complex CTEs, the parser emitted the CTE as a target table instead of the original table. This is now fixed.
#3095 @Imbruced: Now OL produces CLL correctly for the potential view in the middle.
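The multi-target emission described in #3039, together with the transport-closing change in #3122, can be pictured with a small Python sketch. This is not the OpenLineage client API: the `Transport`, `ListTransport`, and `CompositeTransport` classes below are hypothetical stand-ins written for illustration, and the real clients are configured through their own transport settings.

```python
from typing import List


class Transport:
    """Hypothetical minimal transport interface (not the OpenLineage API)."""

    def emit(self, event: dict) -> None:
        raise NotImplementedError

    def close(self) -> None:
        pass


class ListTransport(Transport):
    """Toy transport that records events in memory."""

    def __init__(self) -> None:
        self.events: List[dict] = []
        self.closed = False

    def emit(self, event: dict) -> None:
        if self.closed:
            raise RuntimeError("transport is closed")
        self.events.append(event)

    def close(self) -> None:
        self.closed = True


class CompositeTransport(Transport):
    """Fans each event out to every wrapped transport."""

    def __init__(self, transports: List[Transport]) -> None:
        self.transports = list(transports)

    def emit(self, event: dict) -> None:
        for transport in self.transports:
            transport.emit(event)

    def close(self) -> None:
        # Close every underlying transport so no resources are left open.
        for transport in self.transports:
            transport.close()


first, second = ListTransport(), ListTransport()
composite = CompositeTransport([first, second])
composite.emit({"eventType": "START", "job": "example"})
composite.close()
```

Closing the composite closes each wrapped transport, which is the kind of cleanup the #3122 note appears to be concerned with.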
v1.22.0
Added
`USE` statement with different syntaxes #2944 @kacpermuda: Adjusts our Context so that it can use the new support for this statement in the parser and pass it to a number of queries.
#3044 @arturowczarek: Adds a script to rebuild dependencies automatically following releases.
#3007 #3023 @pawel-big-lebowski: Adds a GitHub action that creates a new Docusaurus version on a tag push, verifiable using the openlineage-site repo. Implements a monorepo approach in a new `website` directory.
Fixed
`SingleQuotedString` in `Identifier()` #3035 @kacpermuda: Single quoted strings were being treated differently than strings with no quotes, double quotes, or backticks.
`IDENTIFIER` function instead of treating it like table name #2999 @kacpermuda: Adds support for this identifier in SELECT, MERGE, UPDATE, and DELETE statements. For now, only static identifiers are supported. When a variable is used, this table is removed from lineage to avoid emitting incorrect lineage.
#2918 @Imbruced: Events created did not contain the correct input table when the query contained multiple tables.
#3020 @arturowczarek: The naming for RDD jobs now uses the same code as SQL and Application events.
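The quoting fix in #3035 is about treating single-quoted identifiers the same way as unquoted, double-quoted, or backtick-quoted ones. A minimal Python sketch of that idea follows; `unquote_identifier` is a hypothetical helper written for this note and is not the parser's actual code, which lives in the project's SQL integration.

```python
QUOTE_CHARS = ("'", '"', "`")


def unquote_identifier(raw: str) -> str:
    """Strip one pair of matching quotes around an identifier, if present."""
    text = raw.strip()
    if len(text) >= 2 and text[0] == text[-1] and text[0] in QUOTE_CHARS:
        return text[1:-1]
    return text


# All four spellings should resolve to the same table name.
names = [unquote_identifier(s) for s in ("orders", "'orders'", '"orders"', "`orders`")]
```

Mismatched or lone quote characters are left untouched, so malformed input is passed through rather than silently altered.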
v1.21.1
Added
#2987 @tnazarew: Registers the Google Cloud Platform Dataproc run facet.
Fixed
#2983 @kacpermuda: Adjusts the SQL integration after our sqlparser-rs fork has been updated to the latest main.
#3001 @arturowczarek: SQL events now properly use the names of the jobs retrieved from AWS Glue.
#2986 @Imbruced: A view instance of a node is now included when gathering data sources for input columns.
#2990 @arturowczarek: Fixes a number of minor issues.
#2984 @arturowczarek: They should use slashes and the prefix `table/`.
#2937 @d-m-h: Previously, reading Iceberg datasets outside the configured Spark catalog prevented the datasets from being present in the `inputs` property of the `RunEvent`.
v1.20.5
Added
`CompositeTransport` #2925 @JDarDagran: Adds a `CompositeTransport` that can accept other transport configs to instantiate transports and use them to emit events.
#2828 @pawel-big-lebowski: The Spark integration is always compiled with Java 17, while tests are running on both Java 8 and Java 17 according to the configuration.
#2854 @pawel-big-lebowski: Includes the Spark 4.0 preview release in the integration tests.
`Window` #2901 @tnazarew: Adds handling for `Window`-type nodes of a logical plan.
#2913 @Imbruced: Adds a parser that traverses `QueryExecution` to get the SQL query used from the SQL field with a BFS algorithm.
#2887 @Imbruced: Adds a Mongo streaming visitor and tests.
#2912 @arturowczarek: The mechanism makes `FacetConfig` accept the disabled flag for any facet instead of passing them as a list.
#2906 @Imbruced: Adds a Kinesis class handler in the streaming source builder.
`DatasetIdentifier` from extension `LineageNode` #2900 @ddebowczyk92: Adds support for cases in which `LogicalRelation` has a grandChild node that implements the `LineageRelation` interface.
`BaseRelation` #2893 @ddebowczyk92: `DatasetIdentifier` is now extracted from the underlying node of `LogicalRelation`.
#2889 @jonathanlbt1: Adds the `marquez-web` service to docker-compose.yml.
Fixed
#2880 @jonathanlbt1: Improves error logging.
#2877 @jonathanlbt1: Upgrades the Fluentd version.
`for each` batch method #2868 @imbruced: Fixes an issue when Spark is in streaming mode and input for Kafka was not present in the event.
#2943 @arturowczarek: Makes `IcebergHandler` support Glue catalog tables and create the symlink using the code from `PathUtils`.
#2917 @arturowczarek: Makes the AWS Glue ARN generating method accept every format (including Parquet), not only Hive SerDe.
#2892 @arturowczarek: The `LogicalPlanSerializer` now returns `<failed-to-serialize-logical-plan>` for failed serialization instead of an empty string.
#2883 @arturowczarek: Refactors `CustomCollectorsUtils` for improved readability.
`foreach` batch mode #2868 @Imbruced: Fixes a bug keeping Kafka input sources from being produced.
`DatasetIdentifier` from `SaveIntoDataSourceCommandVisitor` options #2934 @ddebowczyk92: Extracts `DatasetIdentifier` from command's options instead of relying on `p.createRelation(sqlContext, command.options())`, which is a heavy operation for `JdbcRelationProvider`.
v1.20.3
v1.19.0
Added
`log_url` to `AirflowRunFacet` #2852 @dolfinus: Adds taskinstance's `log_url` field to `AirflowRunFacet`.
`Generate` #2856 @tnazarew: Adds handling for `Generate`-type nodes of a logical plan (e.g., explode operations).
`DerbyJdbcExtractor` #2869 @dolfinus: Adds `JdbcExtractor` implementation for Derby database. As this is a file-based DBMS, its Dataset namespace is `file` and name is an absolute path to a database file.
#2859 @pawel-big-lebowski: Extends the `JarVerifier` plugin to ensure all compiled classes have a bytecode version of Java 8 or lower.
#2851 @d-m-h @imbruced: Adds support for Kafka streaming sources to Kafka streaming sinks. Inputs and outputs are now included in lineage events.
Fixed
#2865 @kacpermuda: Fixes missing timezone information in task FAIL events.
`ColumnLevelLineageBuilder` #2850 @tnazarew: Removes the shaded `Streams` dependency in `ColumnLevelLineageBuilder` causing a `ClassNotFoundException`.
#2863 @dolfinus: Makes dataset symlinks for Delta and non-Delta tables consistent.
#2855 @ddebowczyk92: Fixes `PlanUtils3` so Dataset identifier information based on a Table's properties is also retrieved during the construction of column-level lineage.
#2861 @arturowczarek: The integration now detects if the `spark.app.name` was autogenerated by Glue and uses the Glue job name in such cases. Also, each job name provisioning strategy is now extracted to a separate provider.
v1.18.0
Added
#2755 @pawel-big-lebowski: Provides command line tool capable of running Spark integration tests that can be created without Java.
#2809 #2837 @ddebowczyk92: New Spark extension interfaces without runtime dependency hell. Includes a test to verify the integration is working properly.
#2743 @pawel-big-lebowski: Upgrades CI workflows to run tests against latest Spark versions: 3.4.2 -> 3.4.3 and 3.5.0 -> 3.5.1.
#2789 @tnazarew: Adds extraction of the masking property during collection of dependencies for `ColumnLineageDatasetFacet` creation.
`InsertIntoHadoopFsRelationCommand` #2794 @dolfinus: Collects a table name for `INSERT INTO` command for tables created with `USING $fileFormat` syntax, like `USING orc`.
`PostgresJdbcExtractor` #2806 @dolfinus: Adds the default `5432` port to Postgres namespaces.
`TeradataJdbcExtractor` #2826 @dolfinus: Converts JDBC URLs like `jdbc:teradata/host/DBS_PORT=1024,DATABASE=somedb` to datasets with namespace `teradata://host:1024` and name `somedb.table`.
`MySqlJdbcExtractor` #2825 @dolfinus: Handles different formats of MySQL JDBC URL, and produces datasets with consistent namespaces, like `mysql://host:port`.
`OracleJdbcExtractor` #2824 @dolfinus: Handles simple Oracle JDBC URLs, like `oracle:thin:@//host:port/serviceName` and `oracle:thin@host:port:sid`, and converts each to a dataset with namespace `oracle://host:port` and name `sid.schema.table` or `serviceName.schema.table`.
#2822 @pawel-big-lebowski: Extends the configurable integration test feature to enable getting the Docker image name as a name.
#2838 @pawel-big-lebowski: Include Iceberg support for Spark 3.5. Fix column level lineage facet for `UNION` queries.
#2756 #2801 @Sheeri: Updates the `customLineage` facet test for the new syntax created in #2756.
Changed
`spark.sql.warehouse.dir` as table namespace #2767 @dolfinus: In cases when a metastore is not used, falls back to `spark.sql.warehouse.dir` or `hive.metastore.warehouse.dir` as table namespace, instead of duplicating the table's location.
Fixed
`JdbcExtractors` #2830 @dolfinus: Proper handling of dashes in JDBC URL hosts.
#2807 @Akash2351: Fixes Glue symlinks with config parsing for Glue `catalogid`.
#2800 @dolfinus: Fixes the DBFS namespace format.
#2766 @dolfinus: Changes the AWS Glue namespace to match Glue ARN documentation.
#2797 @dolfinus: Fixes Iceberg dataset namespace: instead of `file:/some/path/database.table` uses `file:/some/path/database/table`. For dataset TABLE symlink, uses warehouse location instead of database location.
#2827 @pawel-big-lebowski: Fixes an error caused by a recent upgrade of Spark versions that did not break existing tests.
`JdbcLocation` #2831 @dolfinus: Converts valid JDBC URL scheme and authority to lowercase, leaving intact instance/database name, as different databases have different default case and case-sensitivity rules.
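The `JdbcLocation` change in #2831 lowercases the scheme and authority of a JDBC URL while leaving the instance or database name as written. The Python sketch below illustrates that rule only; `normalize_jdbc_url` is a hypothetical helper, it assumes a simple `scheme://authority/path` shape, and dropping the `jdbc:` prefix from the result is an assumption made here rather than something the release note states.

```python
def normalize_jdbc_url(url: str) -> str:
    """Lowercase scheme and authority; keep the database part unchanged."""
    prefix = "jdbc:"
    # Assumption: the leading "jdbc:" marker is not part of the result.
    rest = url[len(prefix):] if url.lower().startswith(prefix) else url
    scheme, sep, remainder = rest.partition("://")
    if not sep:
        # Not in scheme://authority form; leave it untouched.
        return rest
    authority, slash, path = remainder.partition("/")
    return f"{scheme.lower()}://{authority.lower()}{slash}{path}"


normalized = normalize_jdbc_url("jdbc:PostgreSQL://DB.Example.COM:5432/MyDb")
```

Only the part before the first `/` after the authority is lowercased, since, as the note says, databases differ in their default case and case-sensitivity rules for names.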
v1.17.1
Added
#2720 @pawel-big-lebowski: Adds a dataset namespace resolving mechanism that resolves dataset namespaces based on the resolvers configured. The core mechanism is implemented in openlineage-java and can be used within the Flink and Spark integrations.
#2758 @tnazarew: Adds a transformation type extraction mechanism.
#2643 @codelixir: Adds `GCPRunFacetBuilder` and `GCPJobFacetBuilder` to report additional facets when running on Google Cloud Platform.
#2773 @dolfinus: Improves the namespace format for SQLServer.
#2698 @pawel-big-lebowski: Adds a tool to verify `shadowJar` content and prevent reported issues. These are hard to prevent currently and require manual verification of manually unpacked jar content.
#2756 @tnazarew: Adds information about the transformation type in `ColumnLineageDatasetFacet`. `transformationType` and `transformationDescription` are marked as deprecated.
#2729 @harels: Introduces the foundations of the new facet Registry into the repo.
#2740 @ngorchakova: Registers the GCP job facet that contains common attributes that will improve the way lineage is parsed and displayed by the GCP platform. Based on the proposal, GCP Lineage would like to define facets that are expected from integrations. The list of support facets is not final and will be extended further by next PR.
Removed
`localServerId` option from Kafka config #2738 @dolfinus: Removes `localServerId` from Kafka config, deprecated since 1.13.0.
`Transport.emit(String)` #2737 @dolfinus: Removes `Transport.emit(String)` support, deprecated since 1.13.0.
`spark-interfaces-scala` module #2781 @ddebowczyk92: Replaces the existing `spark-interfaces-scala` interfaces with new ones decoupled from the Scala binary version. Allows for improved integration in environments where one cannot guarantee the same version of `openlineage-java`.
Changed
#2769 @algorithmy1: Enhances logging.
Fixed
`namespace.name` as Avro complex field type #2763 @dolfinus: `namespace.name` is now used as Avro `"type"` of complex fields (record, enum, fixed).
#2776 @kacpermuda: The dataset name should not be empty.
`drop table` for Spark 3.4 and above #2745 @pawel-big-lebowski @savannavalgi: Includes dataset being dropped within the event, as it used to be prior to Spark 3.4.
#2782 @dolfinus: Drops the leading slash from the object storage dataset name. Converts `s3a://` and `s3n://` schemes to `s3://`.
#2761 @dolfinus: Fixes the dataset namespace for cases when the Hive metastore URL is set using `$SPARK_CONF_DIR/hive-site.xml`.
#2749 @pawel-big-lebowski: The Spark agent now checks to determine if `cur.getDependencies()` is not null before adding dependencies.
`OpenLineageRunEventBuilder` #2754 @pawel-big-lebowski: Adds a separate class containing all the input arguments to call `OpenLineageRunEventBuilder::buildRun`.
`historyUrl` format #2741 @dolfinus: Fixes the `historyUrl` format in `spark_applicationDetails`.
#2753 @mobuchowski: Expressions like `select * from test_orders as test_orders` are now parsed properly.
v1.16.0
Added
`jobType` facet to Spark application events #2719 @dolfinus: Add `jobType` facet to `runEvent`s emitted by `SparkListenerApplicationStart`.
#2720 @pawel-big-lebowski: Enable resolving dataset namespace with predefined resolvers like: `HostListNamespaceResolver`, `PatternNamespaceResolver`, `PatternMatchingGroupNamespaceResolver` or custom implementation loaded with ServiceLoader. Feature is useful to resolve hostnames into cluster identifiers.
Fixed
#2735 @JDarDagran: Fixes variable names.
#2727 @mobuchowski: Removes debug-level logging of HTTP requests.
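The namespace resolvers mentioned in #2720 are described as a way to turn hostnames into cluster identifiers. The Python sketch below shows one way a host-list lookup could behave; `resolve_namespace` and the `clusters` mapping are hypothetical, the hostnames are made up, and matching by substring is an assumption, not the documented behavior of `HostListNamespaceResolver`.

```python
from typing import Dict, List


def resolve_namespace(namespace: str, clusters: Dict[str, List[str]]) -> str:
    """Replace a known hostname in the namespace with its cluster name."""
    for cluster_name, hosts in clusters.items():
        for host in hosts:
            if host in namespace:
                return namespace.replace(host, cluster_name)
    # No configured host matched; return the namespace unchanged.
    return namespace


# Hypothetical configuration: one logical cluster backed by two brokers.
clusters = {"kafka-prod": ["kafka-1.example.com", "kafka-2.example.com"]}
resolved = resolve_namespace("kafka://kafka-2.example.com:9092", clusters)
```

With a mapping like this, datasets reached through different brokers of the same cluster end up under a single namespace.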
v1.15.0
Added
#2706 @dolfinus: Creates `SchemaDatasetFacet` with nested fields for Iceberg tables with list, map and struct columns.
#2711 @dolfinus: Creates `SchemaDatasetFacet` with nested fields for Avro schemas with complex types (union, record, map, array, fixed).
#2677 @dolfinus: Adds support for Spark application start and stop events in the `ExecutionContext` interface.
`SchemaDatasetFieldsFacet` #2689 @dolfinus: Adds nested Spark Dataframe fields support to `SchemaDatasetFieldsFacet`. Also include field comment as `description`.
`SparkApplicationDetailsFacet` #2688 @dolfinus: Adds `SparkApplicationDetailsFacet` to `runEvent`s emitted on Spark application start.
Removed
#2710 @kacpermuda: Removes Airflow < 2.3.0 support.
#2693 @JDarDagran: Migrates integrations from removed v1 facets to v2 Python facets.
Fixed
#2665 @pawel-big-lebowski: For some catalog handlers, the mechanism was creating different dataset identifiers on START and COMPLETE depending on whether a dataset was created or not. This improves the mechanism to assign a deterministic job suffix based on the output dataset at the moment of a start event. Note: this may change job names in some scenarios.
`AthenaExtractor` #2700 @kacpermuda: The dataset name should not be empty when passing only a bucket as S3 output in Athena.
`SchemaDatasetFacet` for Protobuf repeated primitive types #2685 @dolfinus: Fixes issues with the Protobuf schema converter.
#2653 @kacpermuda: Cleans up client code, refactors logging in all Python modules.
`TokenizerError`s, `PanicException` #2703 @mobuchowski: The SQL parser now catches and handles these errors.
#2713 @JDarDagran: Suppresses the deprecation warning when v1 facets are used.
#2686 #2687 @dolfinus: Uses UUIDv7 instead of UUIDv4 for `runEvent`s. The new UUID version produces monotonically increasing values, which leads to more performant queries on the OL consumer side. Note: UUID version is an implementation detail and can be changed in the future.
v1.14.0
Added
#2674 @surisimran: Adds support for dbt-dremio, resolving #2668.
#2482 @pawel-big-lebowski: Adds schema extraction from Protobuf classes. Includes support for nested object types, `array` type, `map` type, `oneOf` and `any`.
#2663 @julienledem: Adds a simple test that shows how to deserialize a facet in the server model.
#2652 @pawel-big-lebowski: Sets the `jobType` property of `JobTypeJobFacet` to either `SQL_JOB` or `RDD_JOB`.
#2646 @mobuchowski: The dataset symlink now points to the Glue catalog table name if the Glue catalog table is used.
`spark_jobDetails` facet #2662 @dolfinus: Adds a `SparkJobDetailsFacet`, capturing information about Spark application jobs -- e.g. `jobId`, `jobDescription`, `jobGroup`, `jobCallSite`. This allows for tracking an OpenLineage `RunEvent` with a specific Spark job in SparkUI.
Removed
`ParentRunFacet` key #2660 @dolfinus: Changes the integration to use the `parent` key for `ParentFacet`, dropping the outdated `parentRun`.
`SparkVersionFacet` #2659 @dolfinus: Drops the `SparkVersion` facet, deprecated since 1.2.0 and planned for removal since 1.4.0.
#2679 @JDarDagran: Removes a URI validator that checked if scheme and netloc were present, allowing relative paths in URI formats for Python facets.
Changed
`ParentRunFacet` key #2661 @dolfinus: The OpenLineage spec defined the `ParentRunFacet` with the property name parent but the Great Expectations integration created a lineage event with `parentRun`. This renames `ParentRunFacet` key from `parentRun` to `parent`. For backwards compatibility, keep the old name.
Fixed
#2658 @blacklight: Includes profile and models in the dbt job name to make it more unique.
`org.apache.commons.lang3` instead of `org.apache.commons.lang` #2676 @harels: Updates Apache Commons Lang to the latest version. We were mixing two versions, and the old one was not present in many places.
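The `ParentRunFacet` key rename in #2661 moves the facet from `parentRun` to `parent` while keeping the old name for backwards compatibility. One reading of that note is that both keys are written during the transition; the Python sketch below shows that reading. `attach_parent_facet`, its `keep_legacy_key` flag, and the sample run ID are hypothetical and are not taken from the Great Expectations integration.

```python
def attach_parent_facet(run_facets: dict, parent_facet: dict,
                        keep_legacy_key: bool = True) -> dict:
    """Return a copy of run_facets with the parent facet attached."""
    facets = dict(run_facets)
    # Spec-compliant key.
    facets["parent"] = parent_facet
    if keep_legacy_key:
        # Assumption: the old key is emitted alongside the new one
        # so that existing consumers keep working.
        facets["parentRun"] = parent_facet
    return facets


parent = {"run": {"runId": "example-run-id"}}
facets = attach_parent_facet({"nominalTime": {}}, parent)
```

Consumers that already read `parent` are unaffected, and the legacy key can be dropped later by passing `keep_legacy_key=False`.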
Configuration
📅 Schedule: Branch creation - "every 3 months on the first day of the month" (UTC), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.