A Docker image designed to make it easy to experiment with tools for Digital Preservation. It is intended to be used via the DigiPres Sandbox and the DigiPres Workbench.
Build the image locally with, for example:
docker build . -t toolbox
Then run it with:
docker run -it toolbox bash
Large (>>1GB) images don't seem to run well on Binder, so we can't install everything we'd like to. For example, ffmpeg takes up 0.5GB!
These sizes can be determined by giving each tool its own installation line (for example, a separate RUN instruction) in the Dockerfile, so that it ends up in its own layer,
and then using a command like this to see what happened and how large the additional layer is:
docker history --no-trunc toolbox | grep ffmpeg
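The same check can be wrapped in a small shell function when several tools need comparing. This is only a sketch: `layer_sizes` is a hypothetical helper name, not something shipped with the image, and it assumes the image is tagged `toolbox` as in the build command above. It relies on the `--format` option of `docker history` with the `.Size` and `.CreatedBy` fields.

```shell
# layer_sizes IMAGE [KEYWORD]
# Hypothetical helper (not part of this repository): prints
# "<size>  <command>" for each layer of IMAGE. If KEYWORD is given,
# only layers whose creating command contains it are shown.
layer_sizes() {
  image="$1"
  keyword="${2:-}"
  # --format takes a Go template; --no-trunc keeps the full command text.
  # grep -F treats KEYWORD literally; an empty KEYWORD matches every line.
  docker history --no-trunc --format '{{.Size}}  {{.CreatedBy}}' "$image" |
    grep -F -- "$keyword"
}
```

For example, `layer_sizes toolbox ffmpeg` would list only the layers created by commands that mention ffmpeg, while `layer_sizes toolbox` lists every layer.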
These tools aren't installed by default because of their size, but the Sandbox indicates how to download and install them:
- Apache Tika
- DROID
- ffmpeg including ffprobe
- pdfcpu
These require root access to install, but take up too much space to include in the image:
- GitHub Linguist (200-400MB in size depending on base image, mostly down to requiring a full build environment)
- VeraPDF
- JHOVE
- Handbrake
- MediaConch
- EPUBCheck
- The PLANETS Testbed (briefing paper, article)
- VIPER