pyampact.alignment.align_midi_wav

pyampact.alignment.align_midi_wav(piece, WF, sr, TH, ST, width, tsr, nhar, wms, hop, showSpec)

Align a MIDI file to a WAV file using the “peak structure distance” of Orio et al., which uses the MIDI notes to build a mask that is compared against the harmonics in the audio.
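The mask idea can be illustrated with a minimal sketch. This is not pyampact's implementation: the helper names (`midi_to_hz`, `harmonic_mask`), the note format `(pitch, start_frame, end_frame)`, and the `half_width` parameter are all hypothetical and chosen only for illustration. For each note, the sketch marks the spectrogram bins nearest to the fundamental and its first few harmonics for the frames in which the note sounds.

```python
def midi_to_hz(pitch):
    """Convert a MIDI note number to frequency in Hz (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((pitch - 69) / 12.0)


def harmonic_mask(notes, n_frames, n_fft, sr, nhar=3, half_width=1):
    """Build a binary (bins x frames) mask from notes.

    notes is a list of (midi_pitch, start_frame, end_frame) tuples,
    with end_frame exclusive. This layout is an assumption made for
    the sketch, not the format pyampact uses internally.
    """
    n_bins = n_fft // 2 + 1
    mask = [[0] * n_frames for _ in range(n_bins)]
    for pitch, start, end in notes:
        f0 = midi_to_hz(pitch)
        for h in range(1, nhar + 1):
            # Nearest FFT bin to the h-th harmonic of the note.
            center = round(h * f0 * n_fft / sr)
            lo = max(0, center - half_width)
            hi = min(n_bins, center + half_width + 1)
            for b in range(lo, hi):
                for t in range(max(0, start), min(n_frames, end)):
                    mask[b][t] = 1
    return mask


# Two notes: A4 for frames 0-1, then C5 for frames 2-3.
notes = [(69, 0, 2), (72, 2, 4)]
mask = harmonic_mask(notes, n_frames=4, n_fft=1024, sr=4000,
                     nhar=2, half_width=0)
```

A mask of this kind can then be compared frame by frame against the magnitude spectrogram of the audio to obtain a similarity matrix.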

Parameters:
  • piece (Score) – A Score instance containing the symbolic MIDI data.

  • WF (ndarray) – Audio time series of the WAV file.

  • sr (int) – Sampling rate of the audio file.

  • TH (float) – Time step resolution, typically in seconds (default is 0.050).

  • ST (int) – Similarity type; 0 (default) uses the triangle inequality.

  • width (float) – Width of the mask for the analysis.

  • tsr (int) – Target sample rate for resampling the audio (if needed).

  • nhar (int) – Number of harmonics to include in the mask.

  • wms (float) – Window size in milliseconds.

  • hop (int) – Hop size for the analysis window.

  • showSpec (bool) – If True, displays the spectrogram.
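As a rough illustration of how wms, tsr, and hop relate, the sketch below converts a window size in milliseconds into a window length in samples and a hop into a frame period. It rests on two assumptions that this page does not confirm: that the window length is rounded to the nearest power of two, and that hop is expressed in samples at the target rate. The function names are hypothetical.

```python
import math


def window_length_samples(wms, tsr):
    """Window length in samples, rounded to the nearest power of two.

    The power-of-two rounding is an assumption made for this sketch.
    """
    exact = wms / 1000.0 * tsr
    return 2 ** round(math.log2(exact))


def frame_period_seconds(hop, tsr):
    """Time between successive analysis frames, assuming hop is in samples."""
    return hop / tsr


n_fft = window_length_samples(wms=100, tsr=4000)   # 400 samples, rounded to 512
period = frame_period_seconds(hop=32, tsr=4000)    # 0.008 seconds per frame
```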

Returns:

  • m (ndarray) – The index map such that M[:, m] is the mask aligned in time with the audio spectrogram D.

  • path (tuple of ndarrays) – [p, q], the path from dynamic programming (DP) that aligns the MIDI and audio.

  • S (ndarray) – The similarity matrix used for alignment.

  • D (ndarray) – The spectrogram of the audio.

  • M (ndarray) – The MIDI-note-derived mask, including harmonic information if available.
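The relationship between S, path, and m can be sketched with a small dynamic-programming example. This is a generic illustration, not the library's code: `dp_path` and `path_to_map` are hypothetical helpers, the cost is assumed to be 1 - S (higher S meaning more similar), rows of S are assumed to index mask frames and columns audio frames, and only unit steps (diagonal, vertical, horizontal) are allowed.

```python
def dp_path(cost):
    """Lowest-cost monotonic path from the top-left to the bottom-right cell."""
    n, m = len(cost), len(cost[0])
    acc = [[float("inf")] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    acc[0][0] = cost[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0 and j > 0:
                candidates.append((acc[i - 1][j - 1], (i - 1, j - 1)))
            if i > 0:
                candidates.append((acc[i - 1][j], (i - 1, j)))
            if j > 0:
                candidates.append((acc[i][j - 1], (i, j - 1)))
            best, prev = min(candidates, key=lambda c: c[0])
            acc[i][j] = cost[i][j] + best
            back[i][j] = prev
    # Trace back from the final cell to recover the path.
    i, j = n - 1, m - 1
    p, q = [i], [j]
    while back[i][j] is not None:
        i, j = back[i][j]
        p.append(i)
        q.append(j)
    p.reverse()
    q.reverse()
    return p, q


def path_to_map(p, q, n_audio_frames):
    """For each audio frame, pick the mask frame at the first path step reaching it."""
    return [p[next(k for k, qk in enumerate(q) if qk >= j)]
            for j in range(n_audio_frames)]


# Toy similarity matrix: 3 mask frames (rows) by 4 audio frames (columns).
S = [[1, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]
cost = [[1 - s for s in row] for row in S]
p, q = dp_path(cost)
m = path_to_map(p, q, n_audio_frames=4)
```

Here the first mask frame is stretched over the first two audio frames, so indexing the mask columns with m yields one mask column per audio frame.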