Commit

Improve ApplyImpulseResponse documentation
iver56 committed Sep 7, 2023
1 parent 498a7d4 commit a3544a1
Showing 7 changed files with 76 additions and 5 deletions.
21 changes: 21 additions & 0 deletions demo/generate_examples_for_doc.py
@@ -279,6 +279,27 @@ def generate_example(self):
        return sound, transformed_sound, sample_rate


@register
class ApplyImpulseResponseExample(TransformUsageExample):
    transform_class = ApplyImpulseResponse

    def generate_example(self):
        random.seed(42)
        np.random.seed(42)
        transform = ApplyImpulseResponse(
            ir_path=os.path.join(DEMO_DIR, "ir", "rir48000.wav"), p=1.0
        )

        sound, sample_rate = load_sound_file(
            os.path.join(DEMO_DIR, "p286_011.wav"), sample_rate=None
        )
        sound = sound[..., int(0.5 * sample_rate) : int(2.9 * sample_rate)]

        transformed_sound = transform(sound, sample_rate)

        return sound, transformed_sound, sample_rate


@register
class LimiterExample(TransformUsageExample):
    transform_class = Limiter
Binary file added demo/ir/rir48000.wav
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
58 changes: 54 additions & 4 deletions docs/waveform_transforms/apply_impulse_response.md
@@ -2,23 +2,73 @@

_Added in v0.7.0_

Convolve the audio with a randomly selected impulse response.
This transform convolves the audio with a randomly selected (room) impulse response file.

`ApplyImpulseResponse` is commonly used as a data augmentation technique to **add
realistic-sounding reverb to recordings**, for example to make denoisers and speech
recognition systems more robust to different acoustic environments as well as a range of
different distances between the main sound source and the microphone. It could also be
used to generate roomy audio examples for the training of dereverberation models.

Convolution with an impulse response is a powerful technique in signal processing that
can be employed to emulate the acoustic characteristics of specific environments or
devices. This process can transform a dry recording, giving it the sonic signature of
being played in a specific location or through a particular device.

**What is an impulse response?** An impulse response (IR) captures the unique acoustical
signature of a space or object. It's essentially a recording of how a specific
environment or system responds to an impulse (a short, sharp sound). By convolving
an audio signal with an impulse response, we can simulate how that signal would sound in
the captured environment.
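
To make the idea concrete, here is a minimal sketch of convolving a signal with an impulse response using plain NumPy. The helper name `convolve_with_ir` and the toy signals are assumptions made for illustration; this is not necessarily how `ApplyImpulseResponse` is implemented internally (for example, a real implementation may use FFT-based convolution and may normalize the output level).

```python
import numpy as np


def convolve_with_ir(dry: np.ndarray, ir: np.ndarray) -> np.ndarray:
    """Convolve a mono signal with a mono impulse response (illustrative helper)."""
    wet = np.convolve(dry, ir, mode="full")
    # Drop the reverb tail so that the output has the same length as the input
    return wet[: len(dry)]


# Toy impulse response: the direct sound followed by one echo,
# 100 samples later and at half the amplitude
ir = np.zeros(200, dtype=np.float32)
ir[0] = 1.0
ir[100] = 0.5

# Toy dry signal: a single click at sample 10
dry = np.zeros(1000, dtype=np.float32)
dry[10] = 1.0

wet = convolve_with_ir(dry, ir)
# `wet` now contains the click at sample 10 and its echo at sample 110
```

Real room impulse responses are much longer and denser than this two-tap example, but the principle is the same: every sample of the input triggers a scaled copy of the impulse response.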

Note that some impulse responses, especially those captured in larger spaces or from
specific equipment, can introduce a noticeable delay when convolved with an audio
signal. In some applications, this delay is a desirable property. However, in some other
applications, the convolved audio should not have a delay compared to the original
audio. If this is the case for you, you can align the audio afterwards with
[fast-align-audio](https://github.com/nomonosound/fast-align-audio), for example.
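
As a rough illustration of what such an alignment involves, the sketch below estimates the delay with a plain cross-correlation and then shifts the signal back. It does not use the fast-align-audio API; the helper names `estimate_delay` and `remove_delay` are hypothetical, and `np.correlate` is only practical for short signals (an FFT-based correlation would be a better fit for long recordings).

```python
import numpy as np


def estimate_delay(reference: np.ndarray, delayed: np.ndarray) -> int:
    """Return the lag (in samples) at which `delayed` best matches `reference`."""
    corr = np.correlate(delayed, reference, mode="full")
    # In "full" mode, index len(reference) - 1 corresponds to zero lag
    return int(np.argmax(corr)) - (len(reference) - 1)


def remove_delay(signal: np.ndarray, delay: int) -> np.ndarray:
    """Shift `signal` earlier by `delay` samples, zero-padding the end."""
    if delay <= 0:
        return signal.copy()
    delay = min(delay, len(signal))
    aligned = np.zeros_like(signal)
    aligned[: len(signal) - delay] = signal[delay:]
    return aligned


# Synthetic check: delay a noise signal by 37 samples, then recover the delay
rng = np.random.default_rng(0)
reference = rng.standard_normal(500)
delayed = np.zeros_like(reference)
delayed[37:] = reference[:-37]

delay = estimate_delay(reference, delayed)
aligned = remove_delay(delayed, delay)
```

With real reverberant audio the correlation peak is less sharp than in this synthetic case, so treat the estimated delay as approximate.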

Impulse responses can be created using, e.g., [http://tulrich.com/recording/ir_capture/](http://tulrich.com/recording/ir_capture/).

Some datasets of impulse responses are publicly available:
- [EchoThief](http://www.echothief.com/) containing 115 impulse responses acquired in a

* [EchoThief](http://www.echothief.com/) containing 115 impulse responses acquired in a
wide range of locations.
- [The MIT McDermott](https://mcdermottlab.mit.edu/Reverb/IR_Survey.html) dataset
* [The MIT McDermott](https://mcdermottlab.mit.edu/Reverb/IR_Survey.html) dataset
containing 271 impulse responses acquired in everyday places.

Impulse responses are represented as audio (ideally wav) files in the given `ir_path`.

Another thing worth checking is that your IR files have the same sample rate as your
audio inputs. Why? Because if they have different sample rates, the internal resampling
will slow down execution, and because some high frequencies may get lost.
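
One way to run this check up front is sketched below. It uses Python's standard `wave` module, so it assumes the IR files are plain PCM WAV files (other formats, such as float WAV or FLAC, would need a different reader). The helper name `find_mismatched_irs` is hypothetical and not part of audiomentations.

```python
import wave
from pathlib import Path


def find_mismatched_irs(ir_dir, target_sample_rate):
    """Return (path, sample_rate) pairs for WAV files whose rate differs from the target."""
    mismatched = []
    for path in sorted(Path(ir_dir).rglob("*.wav")):
        # Only the header is needed to read the sample rate
        with wave.open(str(path), "rb") as wav_file:
            sample_rate = wav_file.getframerate()
        if sample_rate != target_sample_rate:
            mismatched.append((path, sample_rate))
    return mismatched
```

For example, `find_mismatched_irs("/path/to/sound_folder", 48000)` would list every IR file that is not at 48 kHz, so that those files can be resampled once offline instead of on every call.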

## Input-output example

Here we make a dry speech recording quite reverberant by convolving it with a room impulse response.

![Input-output waveforms and spectrograms](ApplyImpulseResponse.webp)

| Input sound | Transformed sound |
|---------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------|
| <audio controls><source src="../ApplyImpulseResponse_input.flac" type="audio/flac"></audio> | <audio controls><source src="../ApplyImpulseResponse_transformed.flac" type="audio/flac"></audio> |

## Usage example

```python
from audiomentations import ApplyImpulseResponse

transform = ApplyImpulseResponse(ir_path="/path/to/sound_folder", p=1.0)

augmented_sound = transform(my_waveform_ndarray, sample_rate=48000)
```

## ApplyImpulseResponse API

[`ir_path`](#ir_path){ #ir_path }: `Union[List[Path], List[str], str, Path]`
: :octicons-milestone-24: A path or list of paths to audio file(s) and/or folder(s) with
audio files. Can be `str` or `Path` instance(s). The audio files given here are
supposed to be impulse responses.
supposed to be (room) impulse responses.

[`p`](#p){ #p }: `float` • range: [0.0, 1.0]
: :octicons-milestone-24: Default: `0.5`. The probability of applying this transform.
2 changes: 1 addition & 1 deletion docs/waveform_transforms/high_pass_filter.md
@@ -3,7 +3,7 @@
_Added in v0.18.0, updated in v0.21.0_

Apply high-pass filtering to the input audio of parametrized filter steepness (6/12/18... dB / octave).
Can also be set for zero-phase filtering (will result in a 6db drop at cutoff).
Can also be set for zero-phase filtering (will result in a 6 dB drop at cutoff).

# HighPassFilter API
