alignmentUtils

dp(M)

Port of Dan Ellis's dp.m dynamic programming routine; MATLAB signature: [p, q, D] = dp(M)

gh(v1, i1, v2, i2, domain[, frac])

Get an element that is frac fraction of the way between v1[i1] and v2[i2], but check bounds on both vectors.

g(vec, idx, domain)

Get an element from vec, checking bounds.

orio_simmx(M, D)

Calculate an Orio & Schwartz-style (Peak Structure Distance) similarity matrix.

maptimes(t, intime, outtime)

Map onset/offset times using a monotone linear interpolation from intime to outtime.

f0_est_weighted_sum(x, f, f0i[, fMax, fThresh])

Calculate F0, power, and spectrum for an input spectral representation.

f0_est_weighted_sum_spec(noteStart_s, ...[, ...])

Calculate F0, power, and spectrum for a single note.

trim_silences(nmat_dict, y, sr[, ...])

Remove or clamp note events in a note matrix that fall outside the active audio region, as determined by an RMS energy threshold.

merge_grace_notes(nmat[, offset])

Merge grace-note sub-parts back into their parent voice and resolve any resulting onset overlaps.

pyampact.alignmentUtils.dp(M)[source]

Port of Dan Ellis's dp.m dynamic programming routine; MATLAB signature: [p, q, D] = dp(M)
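
Dan Ellis's dp.m finds a lowest-cost path through a local-cost matrix by dynamic programming. A minimal NumPy sketch of that recurrence (the name dtw_path and the exact step set are illustrative assumptions, not the pyampact source):

```python
import numpy as np

def dtw_path(M):
    """Minimal DTW sketch over a local-cost matrix M (rows x cols).

    Returns (p, q, D): row indices p and column indices q of the
    lowest-cost path from (0, 0) to (rows-1, cols-1), plus the
    cumulative-cost matrix D. Allowed steps: diagonal, down, right.
    """
    rows, cols = M.shape
    D = np.full((rows + 1, cols + 1), np.inf)
    D[0, 0] = 0.0
    # traceback codes: 0 = diagonal, 1 = advance row, 2 = advance column
    tb = np.zeros((rows, cols), dtype=int)
    for i in range(rows):
        for j in range(cols):
            choices = (D[i, j], D[i, j + 1], D[i + 1, j])
            tb[i, j] = int(np.argmin(choices))
            D[i + 1, j + 1] = M[i, j] + min(choices)
    # trace back from the bottom-right corner
    i, j = rows - 1, cols - 1
    p, q = [i], [j]
    while i > 0 or j > 0:
        step = tb[i, j]
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
        p.append(i)
        q.append(j)
    return np.array(p[::-1]), np.array(q[::-1]), D[1:, 1:]
```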

pyampact.alignmentUtils.f0_est_weighted_sum(x, f, f0i, fMax=20000, fThresh=None)[source]

Calculate F0, power, and spectrum for an input spectral representation.

Parameters:
  • x (np.ndarray, shape (F, T)) – Matrix of complex spectrogram values, where F is the number of frequency bins and T is the number of time frames.

  • f (np.ndarray, shape (F, T)) – Matrix of frequencies corresponding to each of the spectrogram values in x.

  • f0i (np.ndarray, shape (1, T)) – Initial F0 estimates, one per time frame.

  • fMax (float, optional) – Maximum frequency to consider in the weighted sum. Defaults to 20000 Hz.

  • fThresh (float, optional) – Maximum distance in Hz from each harmonic to consider. If not specified, no threshold will be applied.

Returns:

  • f0 (np.ndarray) – Vector of estimated F0s from the beginning to the end of the input time series.

  • p (np.ndarray) – Vector of corresponding “powers” derived from the weighted sum of the estimated F0.

  • strips (np.ndarray) – Estimated spectrum for each partial frequency based on the weighted contributions.
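
The weighted-sum idea can be illustrated on a single frame (a toy sketch, not the pyampact implementation: bins near each harmonic of the initial estimate vote for a refined F0, weighted by their energy; the helper name weighted_sum_f0 is hypothetical):

```python
import numpy as np

def weighted_sum_f0(x_mag, f, f0_init, n_harm=5, f_thresh=50.0):
    """Toy single-frame weighted-sum F0 estimate.

    Bins within f_thresh Hz of a harmonic of f0_init contribute,
    weighted by energy; each harmonic is folded back to the fundamental.
    Returns the refined F0 and the total contributing power.
    """
    num = 0.0
    den = 0.0
    power = 0.0
    for h in range(1, n_harm + 1):
        near = np.abs(f - h * f0_init) < f_thresh
        w = x_mag[near] ** 2              # energy weights
        num += np.sum(w * f[near] / h)    # fold harmonic h back to F0
        den += np.sum(w)
        power += np.sum(w)
    return (num / den if den > 0 else 0.0), power
```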

pyampact.alignmentUtils.f0_est_weighted_sum_spec(noteStart_s, noteEnd_s, midiNote, freqs, D, sr, useIf=True)[source]

Calculate F0, power, and spectrum for a single note.

Parameters:
  • noteStart_s (float) – Start position (in seconds) of the note to analyze.

  • noteEnd_s (float) – End position (in seconds) of the note to analyze.

  • midiNote (int) – MIDI note number of the note to analyze.

  • freqs (np.ndarray) – Frequencies corresponding to the spectrogram bins in D.

  • D (np.ndarray) – Spectrogram of the audio signal to analyze.

  • sr (int) – Sample rate of the audio signal.

  • useIf (bool, optional) – If True, use instantaneous frequency; otherwise, use spectrogram bin frequencies. Defaults to True.

Returns:

  • f0 (np.ndarray) – Vector of estimated F0s from noteStart_s to noteEnd_s.

  • p (np.ndarray) – Vector of corresponding “powers” derived from the weighted sum of the estimated F0.

  • M (np.ndarray) – Estimated spectrum for the analyzed note.

pyampact.alignmentUtils.g(vec, idx, domain)[source]

Get an element from vec, checking bounds. Domain is the set of points that vec is a subset of.

Parameters:
  • vec (np.ndarray) – A 1D numpy array representing the input vector.

  • idx (int) – The index of the desired element in vec.

  • domain (np.ndarray) – A 1D numpy array representing the set of valid points, of which vec is a subset.

Returns:

The element from vec at index idx if it is within bounds; otherwise, the first element of domain if idx is less than 0, or the last element of domain if idx exceeds the bounds of vec.

Return type:

float

pyampact.alignmentUtils.gh(v1, i1, v2, i2, domain, frac=0.5)[source]

Get an element that is frac fraction of the way between v1[i1] and v2[i2], but check bounds on both vectors. frac of 0 returns v1[i1], frac of 1 returns v2[i2], frac of 0.5 (the default) returns halfway between them.

Parameters:
  • v1 (np.ndarray) – A 1D numpy array representing the first vector.

  • i1 (int) – The index in v1 from which to retrieve the value.

  • v2 (np.ndarray) – A 1D numpy array representing the second vector.

  • i2 (int) – The index in v2 from which to retrieve the value.

  • domain (tuple) – A tuple representing the valid bounds for both vectors. This should define the minimum and maximum allowable indices.

  • frac (float, optional) – A fraction indicating how far between the two specified elements to interpolate. Default is 0.5.

Returns:

The element that is frac fraction of the way between v1[i1] and v2[i2], clipped to the specified domain bounds.

Return type:

float
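
Together, g and gh behave roughly like the following sketch (not the pyampact source; the clamp-to-domain-endpoints behavior follows the docstrings above):

```python
import numpy as np

def g(vec, idx, domain):
    """Bounds-checked access: an in-range idx returns vec[idx]; idx < 0
    returns domain's first element; idx past the end returns its last."""
    if idx < 0:
        return float(domain[0])
    if idx >= len(vec):
        return float(domain[-1])
    return float(vec[idx])

def gh(v1, i1, v2, i2, domain, frac=0.5):
    """frac=0 gives v1[i1], frac=1 gives v2[i2], frac=0.5 the midpoint."""
    a = g(v1, i1, domain)
    b = g(v2, i2, domain)
    return a + frac * (b - a)
```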

pyampact.alignmentUtils.maptimes(t, intime, outtime)[source]

Map onset/offset times using a monotone linear interpolation from intime to outtime. Handles duplicate intime entries (DTW path repeats) by averaging their outtime.
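
The duplicate-averaging plus interpolation described above can be sketched as follows (the name maptimes_sketch is hypothetical; this is not the pyampact source):

```python
import numpy as np

def maptimes_sketch(t, intime, outtime):
    """Average outtime over duplicate intime entries (DTW path repeats),
    then map t by monotone linear interpolation."""
    intime = np.asarray(intime, dtype=float)
    outtime = np.asarray(outtime, dtype=float)
    # collapse duplicate intime entries by averaging their outtime
    uniq, inverse = np.unique(intime, return_inverse=True)
    sums = np.zeros_like(uniq)
    counts = np.zeros_like(uniq)
    np.add.at(sums, inverse, outtime)
    np.add.at(counts, inverse, 1.0)
    return np.interp(np.asarray(t, dtype=float), uniq, sums / counts)
```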

pyampact.alignmentUtils.merge_grace_notes(nmat, offset=0.025)[source]

Merge grace-note sub-parts back into their parent voice and resolve any resulting onset overlaps.

Parameters:
  • nmat (dict of str → pd.DataFrame) – Note matrix dictionary keyed by part name.

  • offset (float, optional) – Time in seconds added to the ONSET_SEC and OFFSET_SEC of every grace-note sub-part before merging. Default is 0.025.

Returns:

nmat – The updated note matrix with all grace-note sub-parts folded into their respective base parts and removed as separate keys.

Return type:

dict of str → pd.DataFrame
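
The merging step can be sketched as below. The "-grace" key suffix and the presence of the base part's key are assumptions for illustration, and overlap resolution is simplified to a sort by onset; this is not the pyampact source:

```python
import pandas as pd

def merge_grace_notes_sketch(nmat, offset=0.025, grace_suffix="-grace"):
    """Fold each part keyed '<base><grace_suffix>' into its base part,
    shifting grace onsets/offsets by `offset` seconds, then re-sort."""
    merged = dict(nmat)
    for key in [k for k in merged if k.endswith(grace_suffix)]:
        base = key[: -len(grace_suffix)]   # assumes the base part exists
        sub = merged.pop(key).copy()
        sub["ONSET_SEC"] += offset
        sub["OFFSET_SEC"] += offset
        merged[base] = (
            pd.concat([merged[base], sub])
            .sort_values("ONSET_SEC")
            .reset_index(drop=True)
        )
    return merged
```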

pyampact.alignmentUtils.orio_simmx(M, D)[source]

Calculate an Orio & Schwartz-style (Peak Structure Distance) similarity matrix.

Parameters:
  • M (np.ndarray) – A binary mask where each column corresponds to a row in the output similarity matrix S. The mask indicates the presence or absence of MIDI notes or relevant features.

  • D (np.ndarray) – The regular spectrogram where the columns of the similarity matrix S correspond to the columns of D. This spectrogram represents the audio signal over time and frequency.

Returns:

The similarity matrix S, calculated based on the Peak Structure Distance between the binary mask M and the spectrogram D.

Return type:

np.ndarray
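
One plausible formulation of the Peak Structure Distance (a hedged sketch, not necessarily the exact pyampact computation): score each frame of D against each mask column of M by the fraction of the frame's spectral energy falling under the mask:

```python
import numpy as np

def orio_simmx_sketch(M, D):
    """S[i, j] = sqrt of the fraction of frame j's energy in |D|^2
    captured by binary mask column i of M."""
    D2 = np.abs(D) ** 2
    frame_energy = D2.sum(axis=0)
    frame_energy[frame_energy == 0] = 1e-12   # avoid division by zero
    S = (M.T @ D2) / frame_energy             # rows: mask columns of M
    return np.sqrt(S)
```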

pyampact.alignmentUtils.trim_silences(nmat_dict, y, sr, rms_thresh_db=-40.0, pad=0.25)[source]

Remove or clamp note events in a note matrix that fall outside the active audio region, as determined by an RMS energy threshold.

Parameters:
  • nmat_dict (dict of str → pd.DataFrame) – Note matrix dictionary keyed by part name.

  • y (np.ndarray) – Audio time series at sample rate sr.

  • sr (int) – Sample rate of y in Hz.

  • rms_thresh_db (float, optional) – RMS energy threshold in dBFS below which frames are considered silent. Default is -40.0.

  • pad (float, optional) – Reserved for future use. Currently unused. Default is 0.25.

Returns:

nmat_dict – The input dictionary with each part’s DataFrame trimmed in-place to the active audio window.

Return type:

dict of str → pd.DataFrame
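
The silence detection underlying this can be sketched as a frame-wise RMS scan (the helper name active_region and the frame/hop sizes are illustrative assumptions, not the pyampact source):

```python
import numpy as np

def active_region(y, sr, rms_thresh_db=-40.0, frame=2048, hop=512):
    """Return (start_s, end_s) of the first and last frames whose RMS
    exceeds rms_thresh_db (dB relative to full scale)."""
    thresh = 10.0 ** (rms_thresh_db / 20.0)
    times = []
    for start in range(0, max(len(y) - frame, 1), hop):
        rms = np.sqrt(np.mean(y[start:start + frame] ** 2))
        if rms > thresh:
            times.append(start / sr)
    if not times:
        return 0.0, 0.0
    return times[0], times[-1] + frame / sr
```

Note events whose onsets fall outside this window would then be clamped or dropped.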