alignment

pyampact.alignment.align_midi_wav(piece, WF, sr, TH, ST, width, tsr, nhar, wms, hop, showSpec)

Align a MIDI file to a WAV file using the “peak structure distance” of Orio et al., which uses the MIDI notes to build a mask that is compared against the harmonics in the audio.

Parameters:
  • piece – Score instance of symbolic data

  • WF – Audio time series of the file

  • TH – Time step resolution (default 0.050)

  • ST – Similarity type; 0 (default) is triangle inequality

  • showSpec – Boolean to show the spectrogram

Returns:

  • m: The map s.t. M(:,m)

  • [p,q]: The path from dynamic programming (DP)

  • S: The similarity matrix

  • D: The spectrogram

  • M: The MIDI-note-derived mask

  • N: The Orio-style “peak structure distance”
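
To illustrate how MIDI notes can be turned into a mask that lines up with harmonics in a spectrogram, here is a minimal sketch in plain Python. It is not pyampact's implementation: the function names, the (midi_note, onset, offset) tuple layout, and the half_width parameter are hypothetical and chosen only for illustration, and the actual mask may weight bins differently.

```python
def midi_to_hz(note):
    """Convert a MIDI note number to Hz (equal temperament, A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def harmonic_mask(notes, n_frames, frame_step, sr, n_fft, nharm=3, half_width=0):
    """Return a binary mask (frequency bins x time frames) as a list of lists.

    notes      -- iterable of (midi_note, onset_sec, offset_sec) tuples (hypothetical layout)
    frame_step -- time step between mask columns, in seconds
    half_width -- hypothetical: extra bins marked on each side of a harmonic
    """
    n_bins = n_fft // 2 + 1
    mask = [[0] * n_frames for _ in range(n_bins)]
    for note, onset, offset in notes:
        # Frames during which the note sounds.
        first = max(0, int(round(onset / frame_step)))
        last = min(n_frames, int(round(offset / frame_step)))
        for h in range(1, nharm + 1):
            freq = h * midi_to_hz(note)
            if freq >= sr / 2.0:
                break  # harmonics at or above Nyquist cannot appear in the spectrogram
            center = int(round(freq * n_fft / sr))
            lo = max(0, center - half_width)
            hi = min(n_bins, center + half_width + 1)
            for b in range(lo, hi):
                for t in range(first, last):
                    mask[b][t] = 1
    return mask


# One A4 note lasting 0.1 s, with mask columns every 0.05 s at a 4000 Hz sample rate.
mask = harmonic_mask([(69, 0.0, 0.1)], n_frames=4, frame_step=0.05, sr=4000, n_fft=400)
```

With a 400-point transform at 4000 Hz each bin spans 10 Hz, so the three harmonics of A4 (440, 880, and 1320 Hz) fall exactly on bin centers in this example.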

pyampact.alignment.alignment_visualiser(audio_spec, times=None, freqs=None, fig=1, showSpec=True)

Plots a gross DTW alignment overlaid with the fine alignment resulting from the HMM aligner on the output of YIN. Trace(1,:) is the list of states in the HMM, and trace(2,:) is the number of YIN frames for which that state is occupied. Highlight is a list of notes for which the steady state will be highlighted.

Parameters:
  • audio_spec – Spectrogram of audio file

  • freqs – Array of sample frequencies

  • times – Array of segment times

Returns:

Visualized spectrogram

pyampact.alignment.get_ons_offs(onsoffs)

Extracts lists of onset and offset times from an input 3*N matrix of states and corresponding ending times produced by AMPACT’s HMM-based alignment algorithm.

Parameters:

onsoffs – A 3*N alignment matrix: the first row is a list of N states, the second row is the time at which each state ends, and the third row is the state index

Returns:

  • res.ons: List of onset times

  • res.offs: List of offset times
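
The extraction can be sketched as follows. This is an illustrative reading, not the library's code: it assumes that state label 3 marks a note's steady state, that the onset is the ending time of the state immediately before it, and that the offset is the ending time of the steady state itself. The function name and the dictionary return type are hypothetical.

```python
def get_ons_offs_sketch(onsoffs, steady_state=3):
    """Collect onset/offset times from a 3*N matrix given as three rows.

    Assumptions (not confirmed by this page): `steady_state` is the label of the
    steady-state portion of a note, and the preceding state's end time is the onset.
    """
    states, end_times = onsoffs[0], onsoffs[1]
    ons, offs = [], []
    for i, state in enumerate(states):
        if state == steady_state and i > 0:
            ons.append(end_times[i - 1])  # the previous state ends where the note starts
            offs.append(end_times[i])     # the steady state ends where the note stops
    return {"ons": ons, "offs": offs}


# Two notes, each passing through states 1, 2, 3.
example = [
    [1, 2, 3, 1, 2, 3],              # states
    [0.5, 0.6, 1.4, 1.5, 1.6, 2.3],  # time at which each state ends
    [1, 2, 3, 4, 5, 6],              # state index
]
res = get_ons_offs_sketch(example)
```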

pyampact.alignment.run_DTW_alignment(y, original_sr, piece, tres, width, target_sr, nharm, win_ms, hop, nmat, showSpec)

Perform a dynamic time warping alignment between specified audio and MIDI files.

Returns a matrix with the aligned onset and offset times (with corresponding MIDI note numbers) and a spectrogram of the audio.

Parameters:
  • y – Audio time series

  • original_sr – Original sample rate of the audio

  • piece – Score instance of symbolic data

  • tres – Time resolution for MIDI to spectrum information conversion

  • width – Width parameter (must be specified)

  • target_sr – Target sample rate (must be specified)

  • nharm – Number of harmonics (must be specified)

  • win_ms – Window size in milliseconds (must be specified)

  • hop – Number of samples between successive frames

  • nmat – DataFrame of nmat data

  • showSpec – Boolean to show the spectrogram

Returns:

  • align: Dynamic time warping MIDI-audio alignment structure
    • align.on: onset times

    • align.off: offset times

    • align.midiNote: MIDI note numbers

  • spec: Spectrogram of the audio file

  • dtw: Dict of dynamic time warping returns
    • M: map s.t. M(:,m)

    • MA/RA [p,q]: path from DP

    • S: similarity matrix

    • D: spectrogram

    • notemask: MIDI-note-derived mask

    • pianoroll: MIDI-note-derived pianoroll

  • nmat: updated DataFrame of nmat data
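
For readers unfamiliar with how a dynamic programming path such as [p,q] is obtained, the following is a generic textbook DTW sketch over a cost matrix (for example, one minus a similarity matrix). It is not the routine pyampact uses; the function name and the three-step pattern (diagonal, vertical, horizontal) are assumptions made for illustration.

```python
def dtw_path(cost):
    """Return (p, q, total_cost) for the cheapest monotonic path through `cost`.

    cost -- list of rows; cost[i][j] is the local cost of matching row i to column j.
    p, q -- row and column indices of the path, from (0, 0) to the last cell.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    # acc[i][j] holds the cheapest cumulative cost of reaching cell (i-1, j-1).
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i][j] = cost[i - 1][j - 1] + min(
                acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1]
            )
    # Backtrack from the final cell, always moving to the cheapest predecessor.
    i, j = n, m
    p, q = [], []
    while i > 0 and j > 0:
        p.append(i - 1)
        q.append(j - 1)
        steps = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    p.reverse()
    q.reverse()
    return p, q, acc[n][m]


# Three score frames against four audio frames; zeros mark good matches.
cost = [
    [0, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
]
p, q, total = dtw_path(cost)
```

In this toy example the second score frame is stretched over two audio frames, which is the kind of timing difference the alignment is meant to absorb.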

pyampact.alignment.run_alignment(y, original_sr, piece, nmat, width=3, target_sr=4000, nharm=3, win_ms=100, hop=32, showSpec=False)

Calls the DTW alignment function.

Parameters:
  • y – Audio time series

  • original_sr – Original sample rate of the audio

  • piece – Score instance of symbolic data

  • nmat – DataFrame of nmat data

  • width – Width parameter (default 3)

  • target_sr – Target sample rate (default 4000)

  • nharm – Number of harmonics (default 3)

  • win_ms – Window size in milliseconds (default 100)

  • hop – Number of samples between successive frames (default 32)

  • showSpec – Boolean to show the spectrogram (default False)

Returns:

  • align: Dynamic time warping MIDI-audio alignment structure
    • on: onset times

    • off: offset times

    • midiNote: MIDI note numbers

  • dtw: Dict of dynamic time warping returns
    • M: map s.t. M(:,m)

    • MA/RA [p,q]: path from DP

    • S: similarity matrix

    • D: spectrogram

    • notemask: MIDI-note-derived mask

    • pianoroll: MIDI-note-derived pianoroll

  • spec: Spectrogram of the audio file

  • newNmat: updated DataFrame of nmat data
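
As a rough guide to what the default values in the signature imply, the sketch below converts win_ms and hop into samples and seconds at target_sr. The helper name is hypothetical, and the simple rounding shown is an assumption: the library may size its analysis window differently, for example by rounding to a power of two.

```python
def frame_params(target_sr=4000, win_ms=100, hop=32):
    """Return (window length in samples, hop duration in seconds).

    Assumes the window length is target_sr * win_ms / 1000 rounded to the nearest
    sample; this rounding rule is illustrative, not taken from pyampact.
    """
    win_samples = int(round(target_sr * win_ms / 1000.0))
    hop_seconds = hop / float(target_sr)
    return win_samples, hop_seconds


win_samples, hop_seconds = frame_params()
```

Under these assumptions the defaults give a 400-sample window and one frame every 8 ms.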