pyampact.alignment.align_midi_wav

pyampact.alignment.align_midi_wav(piece, WF, sr, TH, ST, width, tsr, nhar, wms, hop, showSpec, bpm)

Align a MIDI file to a WAV file using the “peak structure distance” of Orio et al., which uses the MIDI notes to build a mask that is compared against the harmonics in the audio.

Parameters:

piece (Score) – A Score instance containing the symbolic MIDI data.
WF (ndarray) – Audio time series of the WAV file.
sr (int) – Sampling rate of the audio file.
TH (float) – Time step resolution, typically in seconds (default is 0.025).
ST (int) – Similarity type; 0 (default) uses the triangle inequality.
width (float) – Width of the mask for the analysis.
tsr (int) – Target sample rate for resampling the audio (if needed).
nhar (int) – Number of harmonics to include in the mask.
wms (float) – Window size in milliseconds.
hop (int) – Hop size for the analysis window.
showSpec (bool) – If True, displays the spectrogram.
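
To make the roles of nhar and width more concrete, the following is a minimal sketch of how a harmonic mask could be built from a list of notes. It is not pyampact's implementation: the function name harmonic_mask, the (pitch, onset, offset) note tuples, the frame_rate and bin_hz arguments, and the choice of a binary mask are all assumptions made for illustration.

```python
import numpy as np

def harmonic_mask(notes, n_frames, n_bins, frame_rate, bin_hz, nhar=3, width=1.0):
    """Hypothetical helper: build a binary (n_bins x n_frames) harmonic mask.

    notes      -- iterable of (midi_pitch, onset_sec, offset_sec) tuples (assumed layout)
    frame_rate -- analysis frames per second (assumed)
    bin_hz     -- frequency spacing of the spectrogram bins in Hz (assumed)
    nhar       -- number of harmonics marked per note
    width      -- half-width of each marked band, in bins (assumed unit)
    """
    mask = np.zeros((n_bins, n_frames))
    for pitch, onset, offset in notes:
        # Equal-tempered fundamental frequency, with A4 (MIDI 69) at 440 Hz.
        f0 = 440.0 * 2.0 ** ((pitch - 69) / 12.0)
        start = max(0, int(round(onset * frame_rate)))
        stop = min(n_frames, int(round(offset * frame_rate)))
        for h in range(1, nhar + 1):
            center = h * f0 / bin_hz  # bin position of the h-th harmonic
            lo = max(0, int(np.floor(center - width)))
            hi = min(n_bins, int(np.ceil(center + width)) + 1)
            mask[lo:hi, start:stop] = 1.0  # empty slice if the harmonic is out of range
    return mask

# One A4 note lasting 0.5 s, analysed at 10 frames per second with 10 Hz bins.
example = harmonic_mask([(69, 0.0, 0.5)], n_frames=10, n_bins=200,
                        frame_rate=10.0, bin_hz=10.0, nhar=3, width=1.0)
```

In this sketch a larger width tolerates more tuning deviation in the audio, while a larger nhar lets the mask respond to more of each note's overtones.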

Returns:

m (ndarray) – The map such that M[:,m] corresponds to the alignment.
path (tuple of ndarrays) – [p, q], the path from dynamic programming (DP) that aligns the MIDI and audio.
S (ndarray) – The similarity matrix used for alignment.
D (ndarray) – The spectrogram of the audio.
M (ndarray) – The MIDI-note-derived mask, including harmonic information if available.
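
The relationship between S, path, and m can be illustrated with a small dynamic-programming sketch. This is a generic lowest-cost path search over 1 - S, not the library's own routine. The helper names dp_path and frame_map are hypothetical, and the code assumes that rows of S index mask (MIDI) frames and columns index audio frames, which the description above does not state explicitly.

```python
import numpy as np

def dp_path(cost):
    """Hypothetical helper: lowest-cost monotonic path through a cost matrix."""
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated cost with a padded border
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    # Backtrack from the bottom-right corner to the top-left corner.
    i, j = n, m
    p, q = [i - 1], [j - 1]
    while i > 1 or j > 1:
        steps = [(acc[i - 1, j - 1], i - 1, j - 1),
                 (acc[i - 1, j], i - 1, j),
                 (acc[i, j - 1], i, j - 1)]
        _, i, j = min(steps, key=lambda s: s[0])
        p.append(i - 1)
        q.append(j - 1)
    return np.array(p[::-1]), np.array(q[::-1])

def frame_map(p, q, n_audio_frames):
    """Hypothetical helper: for each audio frame, the first mask frame aligned to it."""
    m = np.zeros(n_audio_frames, dtype=int)
    for j in range(n_audio_frames):
        m[j] = p[np.argmax(q >= j)]  # first path step that reaches audio frame j
    return m

# Toy similarity: mask frame 0 matches audio frames 0-1, mask frame 1 matches 2-3.
S = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
p, q = dp_path(1.0 - S)
m = frame_map(p, q, S.shape[1])
```

Under these assumptions, indexing a mask as M[:, m] stretches it to one column per audio frame, so it can be compared with D column by column.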