Sigment is an extensible data augmentation package for creating complex transformation pipelines for audio signals.

What is data augmentation?

Data augmentation is the creation of artificial data from original data, typically by applying one or more transformations to it. It is a common method for making machine learning models more robust to variation in their inputs, and for providing more training examples when a dataset is of limited size.

For image data, for example, it is common to use horizontal and vertical flipping, random cropping, zooming and additive noise for augmentation. For audio, we can use other transformations such as pitch shifting, time stretching, or fading the signal in or out. Some image augmentation methods, such as additive noise, also carry over to audio data.
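To make the idea concrete, here is a minimal sketch of two of the audio transformations mentioned above (additive noise and a fade-in) chained into a small pipeline. It uses plain NumPy and does not show Sigment's API: the function names `add_gaussian_noise`, `fade_in` and `augment`, and all parameter values, are assumptions made for illustration only.

```python
import numpy as np


def add_gaussian_noise(signal, scale, rng):
    """Return a copy of `signal` with zero-mean Gaussian noise added."""
    return signal + rng.normal(0.0, scale, size=signal.shape)


def fade_in(signal, duration):
    """Linearly ramp the first `duration` fraction of the signal up from silence."""
    n = int(len(signal) * duration)
    envelope = np.ones(len(signal))
    envelope[:n] = np.linspace(0.0, 1.0, n)
    return signal * envelope


def augment(signal, rng):
    """Apply the two transformations in sequence (hypothetical pipeline)."""
    noisy = add_gaussian_noise(signal, scale=0.01, rng=rng)
    return fade_in(noisy, duration=0.1)


# One second of a 440 Hz sine wave at an assumed 16 kHz sample rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
original = 0.5 * np.sin(2 * np.pi * 440.0 * t)

rng = np.random.default_rng(seed=0)
augmented = augment(original, rng)
```

Calling `augment` repeatedly with the same generator yields a different noisy variant each time, which is how a single recording can be turned into several training examples.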


From an audio signal like:


Sigment can produce augmentations such as:

