Releases: NCEAS/metadig-engine

v.3.0.1

06 May 17:11

What's Changed

This is a patch release to improve performance and stability for Postgres and RabbitMQ.

Improve Postgres stability - #420

Shortly after 3.0.0 was deployed, a bug was found in which Postgres connections were overloaded when more than 100 workers were deployed. To fix this, we modified the pgbouncer configuration so that the maximum number of user connections and the maximum number of database connections are the same, plus a few extra connections on the database side to allow for superuser processes. This number is currently set at 200, so no more than 200 workers should be deployed.

This release also includes a slight refactor that closes database connections from the Java client using a try-with-resources pattern, ensuring that connections are not stranded.
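As a minimal sketch of the try-with-resources pattern mentioned above (not the engine's actual code), the example below uses a hypothetical `FakeConnection` class in place of a real `java.sql.Connection`. Both implement `AutoCloseable`, so the same construct applies. The class names, method names, and event log are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a database connection. A real client would use
// java.sql.Connection, which is also AutoCloseable.
class FakeConnection implements AutoCloseable {
    // Records what happened, so the order of open/query/close is visible.
    static final List<String> events = new ArrayList<>();
    private final String name;

    FakeConnection(String name) {
        this.name = name;
        events.add("open " + name);
    }

    void query(boolean fail) {
        if (fail) {
            throw new IllegalStateException("query failed on " + name);
        }
        events.add("query " + name);
    }

    @Override
    public void close() {
        events.add("close " + name);
    }
}

class TryWithResourcesDemo {
    // Runs one query. The connection is closed whether or not the query throws.
    static boolean runQuery(String name, boolean fail) {
        try (FakeConnection conn = new FakeConnection(name)) {
            conn.query(fail);
            return true;
        } catch (IllegalStateException e) {
            // By the time this handler runs, conn.close() has already been called.
            return false;
        }
    }

    public static void main(String[] args) {
        runQuery("a", false);
        runQuery("b", true);
        // Prints: [open a, query a, close a, open b, close b]
        System.out.println(FakeConnection.events);
    }
}
```

Even when the query on connection `b` fails, its `close` event is still recorded, which is the property that keeps connections from being stranded.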

Recover RabbitMQ dropped connections

This was a minor but important change. Sometimes an exception was thrown during connection recovery because the channel was already closed. In the catch block that performs the recovery, we removed the channel.close() call, so that only the connection is closed before both the connection and the channel are reopened.
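The recovery logic described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the engine's code: `StubChannel`, `StubConnection`, and `RecoveringClient` are hypothetical stand-ins for the RabbitMQ client objects, and the stub channel is assumed to throw if it is closed twice, mirroring the failure these notes describe.

```java
// Hypothetical stand-in for a RabbitMQ channel (assumption: closing an
// already-closed channel throws, as described in the release notes).
class StubChannel {
    private boolean open = true;

    boolean isOpen() { return open; }

    // Simulates the broker or network dropping the channel.
    void drop() { open = false; }

    void close() {
        if (!open) {
            throw new IllegalStateException("channel is already closed");
        }
        open = false;
    }
}

// Hypothetical stand-in for a RabbitMQ connection that owns one channel.
class StubConnection {
    private boolean open = true;
    final StubChannel channel = new StubChannel();

    boolean isOpen() { return open; }

    // Closing the connection also marks its channel as closed, without throwing.
    void close() {
        open = false;
        channel.drop();
    }
}

class RecoveringClient {
    StubConnection connection = new StubConnection();
    StubChannel channel = connection.channel;
    int recoveries = 0;

    // Placeholder for a publish call; fails if the channel has been dropped.
    void send(String message) {
        if (!channel.isOpen()) {
            throw new IllegalStateException("channel is already closed");
        }
    }

    void sendWithRecovery(String message) {
        try {
            send(message);
        } catch (IllegalStateException e) {
            // Calling channel.close() here would throw again, because the
            // channel is already closed. Close only the connection instead.
            connection.close();
            // Reopen both the connection and the channel, then retry once.
            connection = new StubConnection();
            channel = connection.channel;
            recoveries++;
            send(message);
        }
    }
}
```

Closing only the connection avoids a second exception inside the catch block, so the reopen step is always reached.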

Other minor improvements

v.3.0.0

08 Mar 23:18
896ec0c

What's Changed

  • Replace Jython with Jep to enable use of modern Python in checks by @jeanetteclark in #399

Previously, the quality engine used the Java ScriptEngine to execute Python check scripts. The script engine instance used to run these scripts was based on a Jython interpreter. Although this worked, Jython supports only Python 2.7, which officially lost support from the Python Software Foundation in 2020. Additionally, Jython does not support CPython libraries such as pandas and numpy. Being limited to Python 2.7 and unable to use CPython libraries severely restricts the capabilities of any Python check that the engine could run.

Although several options were considered, ultimately we decided to use Jep, since it supports CPython libraries and works with a standard Java install. Along with replacing Jython with Jep in this release, the rest of the metadig ecosystem was also upgraded to support Python 3.x. This included releases for:

With the new support for CPython checks and a modern Python version, this release paves the way for data quality checks to be implemented in metadig.

Note that this is a breaking change since metadig can no longer run Python 2.7, and mismatched versions of the various metadig components may result in unexpected errors.

Full Changelog: v.2.5.0...v.3.0.0

MetaDIG Quality Engine 2.5.0

15 Aug 21:21
6bd5e1f

In this release:

  • upgrade to Java 17
  • fix a small bug in the stuck job monitor (#361)
  • resolve dependabot alerts
  • add new DataONE hosted repositories to quality and scoring tasks
  • update documentation and diagrams

MetaDIG Quality Engine 2.4.1

06 Jul 18:17

This release:

  • fixes #327 by pre-emptively acknowledging RabbitMQ messages (before the worker has finished)
  • keeps track of jobs in the runs table in the Postgres DB, ensuring that no jobs will be lost (#350)
  • switches the docker repository to GitHub Container Registry (GHCR)
  • implements GitHub Actions to build and test the Java library, and build and push docker images
  • updates Docker base images to use eclipse-temurin

MetaDIG Quality Engine 2.4.0

31 May 01:54

This feature release includes:

  • upgrade of installation to Helm (formerly, installation was from manifest files)
  • implementation of durable RabbitMQ queues
  • upgrade from Solr 7.3 to 8.11.1
  • use of Bitnami Solr k8s instance
  • use of Bitnami RabbitMQ k8s

MetaDIG Quality Engine 2.3.3

25 Feb 20:24

This bug fix release includes:

  • Updated metadig-scorer Dockerfile to resolve build errors (Issue #307)
  • Update directory structure (Issue #299)
  • Update DataONE component versions and exclude unneeded components from build (Issue #303)

MetaDIG Quality Engine 2.3.2

05 Jan 22:05

MetaDIG Quality Engine 2.3.0

08 Sep 20:52

Updates for Portal Assessment Graph Generation

  • The DataONE seriesId is now used to store and retrieve portal assessment graphs
  • For portal assessment graphics the portal 'collectionQuery' is always evaluated on the CN (#110)
  • New/updated portals are now harvested based on Solr, not the DataONE object store (#110)

Bookkeeper

  • A DataONE bookkeeper check has been added so that the status of a portal
    quota is obtained before creating the graphic

Updates implemented in k8s

  • New harvesting tasks have been added for portals and DataONE member node 'profile' assessment graphics
  • A new harvesting task has been added for a DataONE-wide assessment graph
  • A new harvesting task has been added for a set of DataONE member nodes (ARCTIC, KNB, OPC) (#262)

Bugs fixed:

  • CN harvesting was missing some pids (#267)
  • During high volume k8s processing, sequenceId values were not getting added

Code improvements

  • Reuse CN clients when possible (#264)
  • Read log4j.properties dynamically on container startup, so that debugging levels
    can be modified without a container rebuild
  • Load R packages from cran.rstudio.com (#259)

MetaDIG Quality Engine 2.2.0

25 Feb 19:46

New features

  • Provide filestore for files used by metadig-engine #230
  • MN Solr collectionQuery field query results & DataONE privileges #227

Bugs fixed

  • metadig-schedule not paging through results #207
  • metadig-engine unable to detect compute environment #200
  • Error dispatching 'python' scripting engine metadig metadig-engine #199
  • JAXB compilation errors enhancement #196
  • Update build to acquire metadig-checks #191
  • Solr faceting fails on field 'datasource' #184

MetaDIG Quality Engine 2.1.0

28 Oct 16:52
  • Enable RabbitMQ topic queue, to allow multiple worker types
  • Remove all checks/suites from metadig-engine jar (now in metadig-checks)
  • Add sequenceId to Solr server
  • Add DataONE formats list
  • Update metadig-scheduler to support CN tasks
  • Reuse engine dispatcher instances to improve efficiency of processing suites
  • Add analytics component to solr config (to support aggregation of quality score stats)
  • many minor bug fixes and improvements