Audio textures

Supplementary audio

Example textures

Textures synthesized with a large weight on the autocorrelation loss and a relatively low weight on the diversity loss.

Texture	Original	Synthesized
Tapping 1-2-3
Wind chimes
Person speaking English
Frogs and insects

Evolution of the audio during optimization

These clips show how the wind chimes texture sounds at various points during the optimization. In this case the optimization halted after 1768 steps rather than running for the full 2000 steps.

Steps	Audio
3
10
30
100
300
1000
1768
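
The checkpoints in the table above are roughly log-spaced, and the last one is the step at which the run stopped short of its 2000-step budget. A minimal sketch of a loop that records such checkpoints is given below. The callable `step_fn` is hypothetical, and the stopping rule (change in loss below a tolerance) is an assumption, since this page does not state why the optimization halted.

```python
def run_with_checkpoints(step_fn, max_steps=2000,
                         checkpoints=(3, 10, 30, 100, 300, 1000), tol=1e-9):
    """Run an optimization and return the steps at which audio would be saved.

    step_fn(step) -> float is a hypothetical callable that performs one
    optimization step and returns the current loss.
    The stopping rule used here is an assumption, not the source's criterion.
    """
    saved = []
    previous_loss = None
    last_step = 0
    for step in range(1, max_steps + 1):
        loss = step_fn(step)
        last_step = step
        if step in checkpoints:
            saved.append(step)
        # Assumed halting rule: stop once the loss no longer changes.
        if previous_loss is not None and abs(previous_loss - loss) < tol:
            break
        previous_loss = loss
    # Always keep the final state, whether the run halted early or not.
    if last_step not in saved:
        saved.append(last_step)
    return saved


# Toy loss that decreases as 1/step and then stalls at step 1768.
def toy_step(step):
    return 1.0 / min(step, 1767)


saved_steps = run_with_checkpoints(toy_step)
```

With the toy loss, the run records the six log-spaced checkpoints and then the step at which it halted.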

Effect of the weight on the autocorrelation term

Rhythmic textures synthesized with different weights on the autocorrelation term in the loss.

Autocorrelation weight	Tapping 1-2	Tapping 1-2-3
0
1
1000
100000
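
To illustrate why an autocorrelation term helps with rhythmic sounds, the sketch below computes the normalized autocorrelation of a toy pulse train and a mean-squared difference between two autocorrelations. This is only a sketch under stated assumptions: the function names are hypothetical, the signal is a synthetic envelope rather than real audio, and the exact form of the autocorrelation term used for these examples is not given on this page.

```python
import numpy as np


def autocorrelation(x):
    """Normalized linear autocorrelation of a 1-D signal, computed via FFT."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Zero-pad to twice the length so the circular FFT product does not wrap.
    spectrum = np.fft.rfft(x, n=2 * n)
    acf = np.fft.irfft(spectrum * np.conj(spectrum))[:n]
    return acf / acf[0]


def autocorrelation_loss(original, synthesized):
    """Mean-squared difference between autocorrelations (assumed loss form)."""
    diff = autocorrelation(original) - autocorrelation(synthesized)
    return float(np.mean(diff ** 2))


# Toy "tapping" envelope: one pulse every 8 frames.
frames = np.arange(64)
taps = (frames % 8 == 0).astype(float)
acf = autocorrelation(taps)

# Unstructured noise for comparison.
noise = np.random.default_rng(0).standard_normal(64)
```

The autocorrelation of the pulse train peaks at multiples of the tapping period (lag 8 here) and dips in between, so a synthesized signal that loses the rhythm incurs a nonzero penalty under such a term.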

Effect of the weight on the diversity term

Complex sounds synthesized with different weights on the diversity loss.

Diversity weight	Wind chimes	Person speaking French
1e-5
1e-3
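
The weights in the two sweeps above scale how much each term contributes to the total loss. The sketch below assumes the total is a plain weighted sum of terms, which this page does not confirm. The `base` term, the term magnitudes, and the function names are all made up for illustration.

```python
def total_loss(terms, weights):
    """Weighted sum of named loss terms (assumed form of the objective)."""
    return sum(weights[name] * value for name, value in terms.items())


def share(terms, weights, name):
    """Fraction of the total loss contributed by one weighted term."""
    return weights[name] * terms[name] / total_loss(terms, weights)


# Made-up raw magnitudes for each term, for illustration only.
terms = {"base": 1.0, "autocorrelation": 1e-3, "diversity": 50.0}

diversity_shares = {}
for diversity_weight in (1e-5, 1e-3):
    weights = {"base": 1.0, "autocorrelation": 1000.0,
               "diversity": diversity_weight}
    diversity_shares[diversity_weight] = share(terms, weights, "diversity")
```

Under these assumptions, raising a term's weight by two orders of magnitude raises its share of the objective accordingly, so the optimizer trades the other terms against it more heavily.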

Effect of the receptive field size

As the receptive field widens, the textures can reproduce longer-term structure.

Convolutional kernel size	Wind chimes	Brushing teeth
Original
4
16
64
256
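
A rough way to see the link between kernel size and long-term structure: with a single convolutional layer, each output value depends on as many consecutive input frames as the kernel is wide. The sketch below measures this by convolving a unit impulse with an all-ones kernel. The function name is hypothetical, and whether the kernel sizes in the table count spectrogram frames or raw samples is not stated on this page.

```python
import numpy as np


def impulse_footprint(kernel_size, n_frames=1024):
    """Number of output positions affected by a single input frame."""
    impulse = np.zeros(n_frames)
    impulse[n_frames // 2] = 1.0
    # An all-ones kernel makes every tap of the kernel visible in the output.
    response = np.convolve(impulse, np.ones(kernel_size), mode="same")
    return int(np.count_nonzero(response))


footprints = {k: impulse_footprint(k) for k in (4, 16, 64, 256)}
```

Each footprint equals the kernel size, so statistics computed from the layer's output can only relate events that fall within that many frames of each other.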

Effect of the number of filters

As the number of filters increases, the quality of the textures improves.

Number of filters	Wind chimes	Frogs and insects
2
8
32
128
512
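
One plausible reason more filters help is that they yield more statistics to match. The sketch below assumes the texture is described by a Gram matrix of filter responses, a common choice in texture synthesis that this page does not confirm. The random filters, the rectification, the kernel size, and the function name are likewise assumptions made for illustration.

```python
import numpy as np


def gram_statistics(signal, n_filters, kernel_size=16, seed=0):
    """Gram matrix of rectified responses from a random 1-D filter bank."""
    rng = np.random.default_rng(seed)
    filters = rng.standard_normal((n_filters, kernel_size))
    # One row of responses per filter, shape (n_filters, n_positions).
    features = np.stack([np.convolve(signal, f, mode="valid") for f in filters])
    features = np.maximum(features, 0.0)  # ReLU nonlinearity (assumed)
    # Average co-activation of every pair of filters over time.
    return features @ features.T / features.shape[1]


signal = np.random.default_rng(1).standard_normal(1000)
gram_shapes = {}
for n_filters in (2, 8, 32, 128, 512):
    gram = gram_statistics(signal, n_filters)
    gram_shapes[n_filters] = gram.shape
```

The Gram matrix has one entry per pair of filters, so under this assumption the number of constraints on the synthesized sound grows roughly quadratically with the number of filters.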

Effect of stacking

Separate one-layer convolutional networks with different receptive field sizes work better than stacking several convolutional layers.

Architecture	Wind chimes	Frogs and insects
Original
Stacked
Separate

Audio textures

ICASSP 2019 submission
