Platform independent interfacing of numpy arrays of floats with audio files and devices for scientific data analysis.
- Audio data are always numpy arrays of floats with values ranging between -1 and 1 independent of how the data are stored in an audio file.
- load_audio() function for loading data of a whole audio file at once.
- Blockwise, random-access loading of large audio files (class AudioLoader and class BufferedArray).
- Read arbitrary metadata() as nested dictionaries of key-value pairs. Supported RIFF chunks are INFO lists, BEXT, iXML, and GUANO.
- Read markers(), i.e. cue points with spans, labels, and descriptions.
- write_audio() function for writing data, metadata, and markers to an audio file.
- Platform-independent, synchronous (blocking) and asynchronous (non-blocking) playback of numpy arrays via play() with automatic resampling to match supported sampling rates.
- Detailed and platform-specific installation instructions (pip, conda, Debian and RPM based Linux packages, homebrew for macOS) for all supported audio packages (see audiomodules).
The AudioIO modules try to use whatever audio packages are installed on your system to achieve their tasks. AudioIO, however, adds its own code for handling metadata and marker lists.
AudioIO is available at PyPI. Simply run:
pip install audioio
Then you can use already installed audio packages for reading and writing audio files and for playing audio data. However, audio file formats supported by the python standard library are limited to basic wave files and playback capabilities are poor. If you need support for additional audio file formats or proper sound output, you need to install additional packages.
See installation for further instructions and recommendations on additional audio packages.
See API Reference for detailed information.
import audioio as aio
Load an audio file into a numpy array using load_audio():
data, samplingrate = aio.load_audio('audio/file.wav')
The loaded data are always numpy arrays of floats ranging between -1 and 1. The arrays are always 2-D, with time on the first axis and channels on the second axis, even for single-channel data.
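To make the data layout concrete, here is a minimal sketch that produces an array in the same form (2-D, frames by channels, floats between -1 and 1) from a 16-bit PCM wave file using only the standard library wave module and numpy. The helper read_wave_as_floats() is hypothetical and is not part of AudioIO. Dividing by 32768 is one common scaling convention and is an assumption here, not necessarily what AudioIO does internally.

```python
import os
import tempfile
import wave

import numpy as np


def read_wave_as_floats(path):
    """Hypothetical helper: read a 16-bit PCM wave file into a
    2-D float array with shape (frames, channels)."""
    with wave.open(path, 'rb') as wf:
        if wf.getsampwidth() != 2:
            raise ValueError('this sketch handles 16-bit PCM only')
        rate = wf.getframerate()
        channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    # samples are interleaved little-endian 16-bit integers
    ints = np.frombuffer(raw, dtype='<i2')
    # assumed scaling: map the integer range onto floats within [-1, 1)
    data = ints.reshape(-1, channels) / 32768.0
    return data, float(rate)


# write a short stereo test file so that the sketch is self-contained
rate = 8000
t = np.arange(rate // 10) / rate
stereo = np.column_stack((np.sin(2 * np.pi * 440 * t), np.zeros(len(t))))
path = os.path.join(tempfile.mkdtemp(), 'sketch.wav')
with wave.open(path, 'wb') as wf:
    wf.setnchannels(2)
    wf.setsampwidth(2)
    wf.setframerate(rate)
    wf.writeframes((stereo * 32767).astype('<i2').tobytes())

data, samplingrate = read_wave_as_floats(path)
print(data.shape, samplingrate)
```

A single channel is then a column of the array, for example data[:, 0].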
Plot the first channel:
import numpy as np
import matplotlib.pyplot as plt
time = np.arange(len(data))/samplingrate
plt.plot(time, data[:,0])
plt.show()
Get a nested dictionary with key-value pairs of the file's metadata and print it using metadata() and print_metadata():
md = aio.metadata('audio/file.wav')
aio.print_metadata(md)
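Because the metadata are plain nested dictionaries, they can be processed with ordinary Python code. The following sketch flattens such a dictionary into dotted keys. The helper flatten_metadata() is hypothetical and not part of AudioIO, and the keys and values in the example dictionary are invented for illustration only.

```python
def flatten_metadata(md, sep='.', prefix=''):
    """Hypothetical helper: flatten a nested dictionary into a flat
    dictionary whose keys are joined by `sep`."""
    flat = {}
    for key, value in md.items():
        full_key = f'{prefix}{sep}{key}' if prefix else key
        if isinstance(value, dict):
            # descend into sub-sections
            flat.update(flatten_metadata(value, sep, full_key))
        else:
            flat[full_key] = value
    return flat


# invented example content, not read from a real file:
md = {'INFO': {'Artist': 'someone', 'Comment': 'test recording'},
      'BEXT': {'Originator': 'sketch'}}

for key, value in flatten_metadata(md).items():
    print(f'{key}: {value}')
```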
Get and print marker positions, spans, labels and texts using markers() and print_markers():
locs, labels = aio.markers('audio/file.wav')
aio.print_markers(locs, labels)
You can also randomly access chunks of data of an audio file, without loading the entire file into memory, by means of the AudioLoader class. This is really handy for analysing very long sound recordings:
# open audio file with a buffer holding 60 seconds of data:
with aio.AudioLoader('audio/file.wav', 60.0) as data:
    block = 1000
    rate = data.samplerate
    for i in range(len(data)//block):
        x = data[i*block:(i+1)*block]
        # ... do something with x and rate
Even simpler, iterate in blocks over the file with overlap using the blocks() generator:
from scipy.signal import spectrogram
nfft = 2048
with aio.AudioLoader('some/audio.wav') as data:
    for x in data.blocks(100*nfft, nfft//2):
        # x is 2-D (time, channel): analyse the first channel
        f, t, Sxx = spectrogram(x[:, 0], nperseg=nfft, noverlap=nfft//2)
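The idea behind iterating in overlapping blocks can be sketched in a few lines of plain Python. The generator overlapping_blocks() below is a hypothetical illustration of the general technique, not AudioIO's implementation. In particular, how the last, possibly shorter block is handled here is an assumption.

```python
def overlapping_blocks(data, block_size, noverlap=0):
    """Hypothetical sketch: yield successive slices of `data` with
    `block_size` elements, each overlapping the previous slice by
    `noverlap` elements. The last slice may be shorter."""
    if not 0 <= noverlap < block_size:
        raise ValueError('noverlap must be in the range [0, block_size)')
    step = block_size - noverlap
    start = 0
    while start < len(data):
        yield data[start:start + block_size]
        if start + block_size >= len(data):
            break  # this slice already reached the end of the data
        start += step


# works on anything that supports len() and slicing:
for x in overlapping_blocks(list(range(11)), 4, 2):
    print(x)
```

Each block starts block_size - noverlap elements after the previous one, so consecutive blocks share noverlap elements. This keeps windowed analyses such as spectrograms continuous across block borders.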
Metadata and markers can be accessed by the metadata() and markers() member functions of the AudioLoader object:
with aio.AudioLoader('audio/file.wav', 60.0) as data:
    md = data.metadata()
    locs, labels = data.markers()
See API documentation of the audioloader, audiometadata, and audiomarkers modules for details.
Write a 1-D or 2-D numpy array into an audio file (data values between -1 and 1) using the write_audio() function:
aio.write_audio('audio/file.wav', data, samplingrate)
Again, in 2-D arrays the first axis (rows) is time and the second axis the channel (columns).
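As a minimal sketch of how data could be prepared for writing, the following numpy code assembles two 1-D signals into a 2-D array with time on the first axis and channels on the second, and clips the values to the range between -1 and 1. The signal parameters are arbitrary. Clipping is just one simple way to enforce the value range and is an assumption of this sketch, not something the source prescribes.

```python
import numpy as np

rate = 44100.0                             # sampling rate in Hertz
t = np.arange(int(0.5 * rate)) / rate      # half a second of time points
left = 0.8 * np.sin(2 * np.pi * 440.0 * t)
right = 1.5 * np.sin(2 * np.pi * 660.0 * t)  # deliberately too loud

# rows are time steps, columns are channels:
stereo = np.column_stack((left, right))
# enforce the expected value range:
stereo = np.clip(stereo, -1.0, 1.0)
print(stereo.shape)
```

The resulting array and sampling rate have the form that write_audio() is described to accept.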
Metadata in the form of a nested dictionary with key-value pairs, marker positions and spans (locs) as well as associated labels and texts (labels) can also be passed on to the write_audio() function:
aio.write_audio('audio/file.wav', data, samplingrate, md, locs, labels)
See API documentation of the audiowriter module for details.
AudioIO provides a command line script for converting, downsampling, renaming and merging audio files:
> audioconverter -e float -o test.wav test.mp3
If possible, audioconverter tries to keep metadata and marker lists.
See documentation of the audioconverter module for details.
AudioIO provides a command line script that prints metadata and markers of audio files to the console:
> audiometadata test.wav
See documentation of the audiometadata module for details.
Fade in and out (fade()) and play (play()) a 1-D or 2-D numpy array as a sound (first axis is time and second axis the channel):
aio.fade(data, samplingrate, 0.2)
aio.play(data, samplingrate)
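What fading in and out does to the data can be illustrated with a few lines of numpy. The function fade_sketch() below is hypothetical and not AudioIO's implementation. It uses a linear ramp, which is an assumption, since the source does not state the ramp shape used by fade(). Like the call above, it modifies the array in place.

```python
import numpy as np


def fade_sketch(data, rate, fadetime):
    """Hypothetical sketch: apply a linear fade-in and fade-out of
    `fadetime` seconds to `data` in place (1-D or 2-D float array,
    time on the first axis)."""
    nr = min(int(round(fadetime * rate)), len(data) // 2)
    if nr <= 0:
        return
    # ramp rising from 0 towards 1 (1 itself is not included)
    ramp = np.linspace(0.0, 1.0, nr, endpoint=False)
    if data.ndim > 1:
        ramp = ramp[:, np.newaxis]   # broadcast over channels
    data[:nr] *= ramp                # fade in
    data[-nr:] *= ramp[::-1]         # fade out


rate = 1000.0
data = np.ones((1000, 2))
fade_sketch(data, rate, 0.1)
print(data[0, 0], data[500, 0], data[-1, 0])
```

Ramping the amplitude at both ends avoids the clicks that abrupt onsets and offsets would otherwise produce during playback.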
Just beep():
aio.beep()
Beep for half a second at 440 Hz:
aio.beep(0.5, 440.0)
aio.beep(0.5, 'a4')
Musical notes are translated into frequency with the note2freq() function.
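The translation from a note name to a frequency can be sketched with the standard equal-temperament formula. The function note_to_freq() below is a hypothetical illustration, not the implementation of note2freq(). It assumes twelve-tone equal temperament, a reference of 440 Hz for a4, and note names consisting of a letter, an optional '#' or 'b', and an octave number.

```python
# semitone offsets of the natural notes relative to c
SEMITONES = {'c': 0, 'd': 2, 'e': 4, 'f': 5, 'g': 7, 'a': 9, 'b': 11}


def note_to_freq(note, a4=440.0):
    """Hypothetical sketch: convert a note name like 'a4' or 'c#3'
    to a frequency in Hertz (equal temperament)."""
    name = note.strip().lower()
    semitone = SEMITONES[name[0]]
    rest = name[1:]
    if rest[:1] == '#':       # sharp: one semitone up
        semitone += 1
        rest = rest[1:]
    elif rest[:1] == 'b':     # flat: one semitone down
        semitone -= 1
        rest = rest[1:]
    octave = int(rest)
    # MIDI note number, where a4 corresponds to 69
    midi = 12 * (octave + 1) + semitone
    return a4 * 2.0 ** ((midi - 69) / 12.0)


print(note_to_freq('a4'))
print(note_to_freq('c4'))
```

Each semitone step multiplies the frequency by the twelfth root of two, so going up one octave doubles it.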
See API documentation of the playaudio module for details.
Simply run in your terminal
> audiomodules
and you get something like
Status of audio packages on this machine:
-----------------------------------------
wave              is  installed (F)
ewave             not installed (F)
scipy.io.wavfile  is  installed (F)
soundfile         is  installed (F)
wavefile          not installed (F)
audioread         is  installed (F)
pydub             is  installed (F)
pyaudio           not installed (D)
sounddevice       NOT installed (D)
simpleaudio       not installed (D)
soundcard         not installed (D)
ossaudiodev       is  installed (D)
winsound          not installed (D)
F: file I/O, D: audio device
For better performance you should install the following modules:
sounddevice:
------------
The sounddevice package is a wrapper of the portaudio library (http://www.portaudio.com).
For documentation see https://python-sounddevice.readthedocs.io
First, install the following packages:
sudo apt install libportaudio2 portaudio19-dev python3-cffi
Install the sounddevice module with pip:
sudo pip install sounddevice
Use this to see which audio modules you have already installed on your system, which ones are recommended to install, and how to install them.
See API documentation of the audiomodules module for details.
- thunderfish: Algorithms and programs for analysing electric field recordings of weakly electric fish.
- audian: Python-based GUI for viewing and analyzing recordings of animal vocalizations.
All the audio modules AudioIO is using.
Reading and writing audio files:
- wave: simple wave file interface of the python standard library.
- ewave: extended wave files.
- scipy.io.wavfile: simple scipy wave file interface.
- SoundFile: support of many open source audio file formats via libsndfile.
- wavefile: support of many open source audio file formats via libsndfile.
- audioread: mpeg file support.
- Pydub: mpeg support for reading and writing, playback via simpleaudio or pyaudio.
- scikits.audiolab: seems to be no longer active.
Metadata:
- GUANO: Grand Unified Acoustic Notation Ontology, an extensible, open format for embedding metadata within bat acoustic recordings.
Playing sounds:
- sounddevice: wrapper for portaudio.
- PyAudio: wrapper for portaudio.
- simpleaudio: uses ALSA on Linux, runs well on Windows.
- SoundCard: playback via CFFI and the native audio libraries of Linux, Windows and macOS.
- ossaudiodev: playback via the outdated OSS interface of the python standard library.
- winsound: native Windows audio playback of the python standard library, asynchronous playback only with wave files.
Not yet supported by audioio:
- playsound: pure Python, cross platform, single function module with no dependencies for playing sounds. Plays sounds from files only.
- PreferredSoundPlayer: Platform-independent playing of sound files.
- AudioPlayer: cross platform Python 3 package for playing sounds (mp3, wav, ...).
Scientific audio software:
- diapason: musical notes like playaudio.note2freq.
- librosa: audio and music processing in python.
- TimeView: GUI application to view and analyze time series signal data.
- scikit-maad: quantitative analysis of environmental audio recordings.
- Soundscapy: analysing and visualising soundscape assessments.
- BatDetect2: detecting and classifying bat echolocation calls in high frequency audio recordings.
- Batogram: viewing bat call spectrograms with GUANO metadata, including the ability to click to open the location in Google Maps.