A tool written in Python to generate and email statistical reports for DSpace 7+ repository administrators.
- Python 3.9+
- PostgreSQL 13+
- DSpace 7.x or 8.x repository **
** If your Solr index contains statistics from legacy DSpace 5.x or earlier instances, the quality of the reports will improve significantly if the old statistics have been migrated to the UUID identifiers introduced in DSpace 6. See the DSpace documentation for more information.
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp config/application.yml.sample config/application.yml
dspace_name: 'MyDSpace'
dspace_server: 'http://localhost:8080'
solr_server: 'http://localhost:8080/solr'
oai_server: 'http://localhost:8080/oai'
rest_server:
  url: 'http://localhost:8080/rest'
  username: '[email protected]'
  password: 'password'
statistics_db:
  host: 'localhost'
  port: '5432'
  name: 'dspace_statistics'
  username: 'dspace_statistics'
  password: 'dspace_statistics'
work_dir: '/tmp'
create_zip_archive: false
log_path: 'logs'
log_file: 'statistics-reports.log'
log_level: 'INFO'
smtp_host: 'localhost'
smtp_auth: 'tls'
smtp_port: 587
smtp_username: 'username'
smtp_password: 'password'
from_email: '[email protected]'
admin_emails:
- email1
- email2
Configure application.yml for your environment. The admin_emails list contains the email address(es) that will receive the stats reports when the email flag (-e) is set while running run_reports.py or run_cron.py (see below).
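As a quick sanity check before running any script, the configuration can be inspected once it has been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`). The sketch below is not part of this tool: the function names and the choice of which keys to treat as required are assumptions based on the sample file above.

```python
from typing import Any

# Top-level keys taken from the sample configuration shown above.
# Treating exactly these as "required" is an assumption of this sketch.
REQUIRED_KEYS = [
    "dspace_name", "dspace_server", "solr_server", "oai_server",
    "rest_server", "statistics_db", "work_dir", "smtp_host",
    "from_email", "admin_emails",
]


def missing_keys(config: dict[str, Any]) -> list[str]:
    """Return the required keys that are absent or empty in a parsed config."""
    return [key for key in REQUIRED_KEYS if config.get(key) in (None, "", [], {})]


def email_recipients(config: dict[str, Any], send_email: bool) -> list[str]:
    """Return the addresses that would receive reports for a given run.

    send_email mirrors the -e flag of run_reports.py and run_cron.py.
    """
    if not send_email:
        return []
    return list(config.get("admin_emails") or [])
```

An empty result from `missing_keys` means every key from the sample file is present and non-empty; it does not verify that the servers are reachable.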
NOTE: All of the following commands assume that the user is in the virtual environment.
First, create a role and database in PostgreSQL.
create role dspace_statistics with login password 'dspace_statistics';
createdb --username=postgres --owner=dspace_statistics --encoding=UNICODE dspace_statistics;
There are several ways to generate statistical reports with this tool. They all begin with the database manager script that allows the user to create, drop and recreate the database tables to store metadata and statistics.
Usage: database_manager.py [options]
Options:
  -h, --help            show this help message and exit
  -c CONFIG_FILE, --config=CONFIG_FILE
                        Configuration file
  -f FUNCTION, --function=FUNCTION
                        Database function to perform. Options: create, drop,
                        check, recreate
For example, the first time stats are generated the user should run:
python database_manager.py -c config/application.yml -f create
On later runs, the database tables can be recreated before the stats generation process is run again:
python database_manager.py -c config/application.yml -f recreate
With a fresh database, the user can generate statistics for the entire repository with run_indexer.py.
Usage: run_indexer.py [options]
Options:
  -h, --help            show this help message and exit
  -c CONFIG_FILE, --config=CONFIG_FILE
                        Configuration file
  -o OUTPUT_DIR, --output_dir=OUTPUT_DIR
                        Directory for results files.
Alternatively, statistics can be generated separately for communities, collections, and items. These commands generally take the form:
python run_community_indexer.py -c config/application.yml -o /tmp/reports
When all indexing is complete and the metadata and stats are in the database, it's time to generate Excel reports. This can be done with run_reports.py.
Usage: run_reports.py [options]
Options:
  -h, --help            show this help message and exit
  -c CONFIG_FILE, --config=CONFIG_FILE
                        Configuration file
  -o OUTPUT_DIR, --output_dir=OUTPUT_DIR
                        Directory for results files.
  -e, --email           Send email with stats reports to admin(s)?
For example:
python run_reports.py -c config/application.yml -o /tmp/reports -e
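The help text above has the shape produced by Python's standard `optparse` module. The following sketch shows how options with the same names and behavior could be declared; it is an illustration only and is not taken from the source of run_reports.py. The `dest` names are assumptions chosen so that the generated metavars match the help output.

```python
from optparse import OptionParser


def build_parser() -> OptionParser:
    """Declare -c, -o and -e options mirroring the usage text above."""
    parser = OptionParser(prog="run_reports.py")
    parser.add_option("-c", "--config", dest="config_file",
                      help="Configuration file")
    parser.add_option("-o", "--output_dir", dest="output_dir",
                      help="Directory for results files.")
    # -e takes no value: it is a switch that defaults to False.
    parser.add_option("-e", "--email", action="store_true", dest="email",
                      default=False,
                      help="Send email with stats reports to admin(s)?")
    return parser


# Parse the same arguments as the example command above.
options, _ = build_parser().parse_args(
    ["-c", "config/application.yml", "-o", "/tmp/reports", "-e"]
)
```

Because `-e` is a plain switch, omitting it leaves `options.email` as `False` and no email is requested.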
To make it easier to generate statistical reports on a regular basis, the indexing and reports processes have been combined into a single script, run_cron.py, that runs in a similar way to the other scripts.
Usage: run_cron.py [options]
Options:
  -h, --help            show this help message and exit
  -c CONFIG_FILE, --config=CONFIG_FILE
                        Configuration file
  -o OUTPUT_DIR, --output_dir=OUTPUT_DIR
                        Directory for results files.
  -e, --email           Send email with stats reports to admin(s)?
For example:
python run_cron.py -c config/application.yml -o /tmp/reports -e
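For scheduled runs, the steps described in this document can be chained from a small wrapper. The sketch below assumes the tables should be recreated before each run and that the wrapper is started from the project directory inside the virtual environment; whether run_cron.py already resets the tables itself is not stated here, so drop the first command if it does. The function names are hypothetical.

```python
import subprocess
import sys


def build_commands(config_file: str, output_dir: str,
                   send_email: bool) -> list[list[str]]:
    """Return the command lines for one full run: recreate tables, then run_cron.py."""
    recreate = [sys.executable, "database_manager.py",
                "-c", config_file, "-f", "recreate"]
    cron = [sys.executable, "run_cron.py", "-c", config_file, "-o", output_dir]
    if send_email:
        cron.append("-e")  # ask run_cron.py to email the reports to admin_emails
    return [recreate, cron]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, check=True)


commands = build_commands("config/application.yml", "/tmp/reports", True)
# run_all(commands)  # uncomment to actually execute the two steps
```

Using `sys.executable` keeps the child processes on the same interpreter as the wrapper, which matters when the virtual environment is active.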
This code is licensed under the GNU General Public License (GPL) V3.
NOTE: Special thanks to the DSpace Statistics API project, on which the Solr queries for views and downloads in this project are based.
Orth, A. 2018. DSpace statistics API. Nairobi, Kenya: ILRI. https://hdl.handle.net/10568/99143
For questions, comments or assistance please contact [email protected].