Python > Scientific Audio
Scientific research in audio/music.
Contents
- Feature extraction
- Read-Write
- Transformations - General DSP
- Perceptual Models - Auditory Models
- Data augmentation
- Speech Processing
- Environmental Sounds
- Source Separation
- Music Information Retrieval
- Deep Learning
- Symbolic Music - MIDI - Musicology
- Realtime applications
- Web Audio
- Audio Dataset and Dataloaders
Audio Related Packages
Feature extraction
Realtime Audio Processing lib, general purpose.
Feature extractor, written in C, Python interface.
A library for audio and music analysis, feature extraction.
Music related low level and high level feature extractor, C++ based, includes Python bindings.
Common speech features for ASR.
Python bindings for YAAFE feature extractor.
Library for Speech Processing and Recognition, mostly feature extraction for now.
Python library for feature extraction from audio files.
Read-Write
Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.
Reads and writes all kinds of audio metadata for various formats.
PyAV is a Pythonic binding for FFmpeg or Libav.
Library based on libsndfile, CFFI, and NumPy.
Wrapper for sox.
Reads and writes STEMS multistream audio.
Reads music metadata from MP3, OGG, FLAC and Wave files.
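As a point of comparison for the packages in this section, here is a minimal read-write round trip using only Python's standard-library `wave` module. It is a sketch, not taken from any package listed here: the helper names `write_sine_wav` and `read_wav`, the 440 Hz tone, and the 8 kHz sample rate are illustrative assumptions, and the reader only handles mono 16-bit PCM.

```python
import math
import struct
import tempfile
import wave
from pathlib import Path


def write_sine_wav(path, freq_hz=440.0, duration_s=0.1, sample_rate=8000):
    """Write a mono 16-bit PCM sine tone; return the number of frames written."""
    n_frames = int(duration_s * sample_rate)
    # Half-scale amplitude keeps the tone well inside the 16-bit range.
    samples = [
        int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(n_frames)
    ]
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)           # mono
        wav_file.setsampwidth(2)           # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        # WAV PCM data is little-endian signed 16-bit integers.
        wav_file.writeframes(struct.pack(f"<{n_frames}h", *samples))
    return n_frames


def read_wav(path):
    """Read a mono 16-bit PCM file; return (sample_rate, list of int samples)."""
    with wave.open(str(path), "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        n_frames = wav_file.getnframes()
        raw = wav_file.readframes(n_frames)
    return sample_rate, list(struct.unpack(f"<{n_frames}h", raw))


with tempfile.TemporaryDirectory() as tmp_dir:
    wav_path = Path(tmp_dir) / "tone.wav"
    frames_written = write_sine_wav(wav_path)
    rate, samples = read_wav(wav_path)
```

The standard library covers uncompressed WAV only; compressed formats, metadata tags, and multistream files are where the dedicated packages in this section come in.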
Transformations - General DSP
Useful tools for acousticians.
DSP filter toolbox (lots of filters).
Real-time audio time-scale modification procedures.
Gammatone filterbank implementation.
Wrapper for FFTW(3).
Non-stationary Gabor transform, constant-Q.
Automated reference audio mastering.
MDCT transform.
Manipulate audio with a simple and easy high level interface.
Implementation of the MATLAB Time-Frequency Toolbox.
Room Acoustics Simulation (RIR generator).
Wrapper for rubberband to do pitch-shifting and time-stretching.
Discrete Wavelet Transform in Python.
Sample rate conversion.
Analyze, visualize and process sound field data recorded by spherical microphone arrays.
Standalone package for Short-Time Fourier Transform.
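To illustrate what a Short-Time Fourier Transform computes, the sketch below windows a signal into overlapping frames and takes a DFT of each one. It uses only the standard library and a naive O(N²) DFT for readability, so it is far slower than the FFT-based packages above. The function names (`hann`, `dft`, `stft`), the 64-sample frame, and the 32-sample hop are illustrative assumptions and do not reflect any listed package's API.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft(frame):
    """Naive DFT returning only the non-negative frequency bins (0 .. n/2)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def stft(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of complex DFT bins."""
    window = hann(frame_len)
    frames = []
    # Only full frames are analysed; trailing samples are dropped, not padded.
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(dft(frame))
    return frames


# A 1 kHz sine sampled at 8 kHz falls exactly on bin 1000 * 64 / 8000 = 8.
sample_rate = 8000
tone_hz = 1000.0
signal = [math.sin(2 * math.pi * tone_hz * n / sample_rate) for n in range(256)]

spectrogram = stft(signal)
magnitudes = [abs(c) for c in spectrogram[0]]
peak_bin = magnitudes.index(max(magnitudes))
peak_hz = peak_bin * sample_rate / 64
```

The frame length trades time resolution against frequency resolution: with these settings each bin spans 125 Hz and each hop advances 4 ms.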
Perceptual Models - Auditory Models
Sound Field Synthesis Toolbox.
Inner ear models.
Spiking neural networks simulator, includes cochlea model.
Perceived loudness, includes Zwicker, Moore/Glasberg model.
Audio loudness meter and normalization, implements ITU-R BS.1770-4.
Data augmentation
Speech Processing
Forced aligner, based on MFCC+DTW, 35+ languages.
Pretrained automatic speech recognition.
Forced-aligner built on Kaldi.
Python interface to the Praat phonetics and speech analysis, synthesis, and manipulation software.
Automatic phoneme transcription tool.
Neural building blocks for speaker diarization.
Feature Extraction, Classification, Diarization.
Interface to the WebRTC Voice Activity Detector.
Wrapper for the PESQ score calculation.
Short Term Objective Intelligibility measure (STOI).
Wrapper for Morise's World Vocoder.
Forced aligner, based on Kaldi (HMM), English (others can be trained).
Wrapper for several ASR engines and APIs, online and offline.
Environmental Sounds
Source Separation
Music Information Retrieval
Corpus Analysis Tools for Computational Hook Discovery.
Algorithms for chord detection and key estimation.
MIR package with a strong focus on beat detection, onset detection and chord recognition.
Common scores for various MIR tasks. Also includes bss_eval implementation.
Music Structure Analysis Framework.
General audio and music analysis.
Deep Learning
Symbolic Music - MIDI - Musicology
Toolkit for Computer-Aided Musicology.
Realtime MIDI wrapper.
Advanced music theory and notation package with MIDI file and playback support.
Utility functions for handling MIDI data in a nice/intuitive way.
Realtime applications
Subtractive, additive, FM, and sample-based sound synthesis.
Realtime audio DSP engine.
PortAudio wrapper providing realtime audio I/O with NumPy.
Binaural rendering of streamed or IR-based high-order spherical microphone array signals.
Audio Dataset and Dataloaders
Music library manager and MusicBrainz tagger.
Parse and process the MUSDB18 dataset.
Parse MedleyDB audio and annotations.
Wrapper for the SoundCloud API.
Download YouTube videos (and the audio).
Loading different types of audio datasets.
Common loaders for Music Information Retrieval (MIR) datasets.
Tutorials
Fast-paced introduction to Python essentials, aimed at researchers and developers.
Highly recommended tutorial, covers large parts of the scientific Python ecosystem.
Collection of instructional IPython notebooks for music information retrieval (MIR).
Exercises as IPython notebooks.
Live-coding video showing how to use the SoundDevice library to reproduce realistic sounds.