Signals in the time domain (audiotoolbox.Signal)

The Signal class inherits from numpy.ndarray via the audiotoolbox.BaseSignal class:

Inheritance diagram: numpy.ndarray → audiotoolbox.base_signal.BaseSignal → audiotoolbox.signal.Signal. In addition, Signal mixes in AnalysisMixin, GenerationMixin, ModificationMixin, IOMixin, and FilteringMixin from audiotoolbox.signal_mixins.

As a consequence, numpy.ndarray methods such as x.min(), x.max(), x.sum(), x.var() and others can also be used on audiotoolbox.Signal objects. For more information, check the numpy docs.

class audiotoolbox.Signal(n_channels: int | tuple | list, duration: float, fs: int, dtype=<class 'float'>)

Base class for signals in the timedomain.

Parameters:
  • n_channels (int or tuple) – Number of channels to be used, can be N-dimensional

  • duration (float) – Stimulus duration in seconds

  • fs (int) – Sampling rate in Hz

  • dtype (type, optional) – Datatype of the array (default is float)

Returns:

Signal

Return type:

The new signal object.

Examples

Create a 1 second long signal with two channels at a sampling rate of 48 kHz

>>> sig = audiotoolbox.Signal(2, 1, 48000)
>>> print(sig.shape)
(48000, 2)
abs()

Absolute value

Calculates the absolute value or modulus of all values of the signal

add(x)

In-place summation

This function allows for in-place summation.

Parameters:

x (scalar or ndarray) – The value or array to add to the signal

Return type:

Returns itself

Examples

>>> sig = audiotoolbox.Signal(1, 1, 48000).add_tone(500).add(2)
>>> print(sig.mean())
2.0
add_cos_modulator(frequency: float, m: float, start_phase: float = 0)

Multiply a cosine amplitude modulator onto the signal.

Multiplies a cosine amplitude modulator following the equation:

\[1 + m \cos(2 \pi f_m t + \phi_{0})\]

where \(m\) is the modulation depth, \(f_m\) is the modulation frequency, \(t\) is the time, and \(\phi_0\) is the start phase.

Parameters:
  • frequency (float) – The frequency of the cosine modulator.

  • m (float) – The modulation depth.

  • start_phase (float, optional) – The starting phase of the cosine in radians. (default = 0)

Returns:

Returns itself

Return type:

Signal

See also

audiotoolbox.cos_amp_modulator
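The modulator equation can be illustrated with plain numpy; this sketch does not use audiotoolbox, and the sampling rate, tone frequency, and modulator settings are arbitrary illustration values:

```python
import numpy as np

fs = 48000                                # sampling rate in Hz
t = np.arange(fs) / fs                    # 1 s time vector
carrier = np.cos(2 * np.pi * 500 * t)     # 500 Hz tone

m, f_m, phi0 = 1.0, 30.0, 0.0             # modulation depth, frequency, start phase
# 1 + m * cos(2*pi*f_m*t + phi0), multiplied onto the carrier
modulator = 1 + m * np.cos(2 * np.pi * f_m * t + phi0)
modulated = carrier * modulator
```

With a modulation depth of m = 1, the envelope swings between 0 and 2, i.e. full modulation.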

add_fade_window(rise_time: float, win_type: str = 'hann', **kwargs)

Add a fade in/out window to the signal.

This function multiplies a fade window with a given rise time onto the signal.

Parameters:
  • rise_time (float) – The rise time in seconds.

  • win_type (str) – Any window function supported by scipy.signal.get_window. Default is ‘hann’.

  • **kwargs – Additional keyword arguments passed to the window function (see scipy implementation).

Notes

Window types:

  • boxcar

  • triang

  • blackman

  • hamming

  • hann

  • bartlett

  • flattop

  • parzen

  • bohman

  • blackmanharris

  • nuttall

  • barthann

  • cosine

  • exponential

  • tukey

  • taylor

  • lanczos

  • kaiser (needs beta)

  • kaiser_bessel_derived (needs beta)

  • gaussian (needs standard deviation)

  • general_cosine (needs weighting coefficients)

  • general_gaussian (needs power, width)

  • general_hamming (needs window coefficient)

  • dpss (needs normalized half-bandwidth)

  • chebwin (needs attenuation)

Returns:

Returns itself

Return type:

Signal
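The effect of a fade window can be sketched with plain numpy. For brevity this sketch builds a Hann window via numpy.hanning instead of scipy.signal.get_window; the values are illustrative and this is not the audiotoolbox implementation:

```python
import numpy as np

fs = 48000
sig = np.ones(fs)                 # 1 s "signal" of ones, to make the fade visible
rise_time = 0.1                   # 100 ms rise time
n_rise = int(rise_time * fs)

# A Hann window of length 2*n_rise splits into a rising and a falling half
hann = np.hanning(2 * n_rise)
fade = np.ones(fs)
fade[:n_rise] = hann[:n_rise]     # fade-in ramp
fade[-n_rise:] = hann[n_rise:]    # fade-out ramp
faded = sig * fade
```

The faded signal starts and ends at zero while the middle section is untouched.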

add_noise(ntype: Literal['white', 'pink', 'brown'] = 'white', variance: float = 1.0, seed=None)

Add uncorrelated noise to the signal.

Adds Gaussian noise with a defined variance and different spectral shapes. The noise is generated in the frequency domain using the Gaussian pseudorandom generator numpy.random.randn. The real and imaginary part of each frequency component is set using the pseudorandom generator. Each frequency bin is then weighted depending on the spectral shape. The resulting spectrum is then transformed into the time domain using numpy.fft.ifft.

Weighting functions:

  • white: \(w(f) = 1\)

  • pink: \(w(f) = \frac{1}{\sqrt{f}}\)

  • brown: \(w(f) = \frac{1}{f}\)

Parameters:
  • ntype ({'white', 'pink', 'brown'}) – Spectral shape of the noise. (default = 'white')

  • variance (scalar, optional) – The variance of the noise. (default = 1)

  • seed (int or 1-d array_like, optional) – Seed for RandomState. Must be convertible to 32 bit unsigned integers.

Returns:

Returns itself

Return type:

Signal
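The frequency-domain generation described above can be sketched with plain numpy. For brevity this sketch uses numpy.fft.rfft conventions and numpy.random.default_rng; it is illustrative and not the audiotoolbox implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
n, fs = 48000, 48000

# Real and imaginary part of each frequency bin from a Gaussian generator
spec = rng.standard_normal(n // 2 + 1) + 1j * rng.standard_normal(n // 2 + 1)
freqs = np.fft.rfftfreq(n, d=1 / fs)

# Pink weighting w(f) = 1/sqrt(f); the DC bin is left at zero
w = np.zeros_like(freqs)
w[1:] = 1 / np.sqrt(freqs[1:])
pink = np.fft.irfft(spec * w, n)

# Scale to the requested variance (here: 1)
pink *= np.sqrt(1.0 / pink.var())
```

Replacing the weighting with w(f) = 1 or w(f) = 1/f yields white or brown noise, respectively.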

add_tone(frequency: float | np.ndarray | list, amplitude: float | np.ndarray | list = 1, start_phase: float | np.ndarray | list = 0) Signal

Add one or more cosine tones to the signal.

This function will add pure tones to the current waveform. If multiple frequencies are given (as arrays), their waveforms are summed together before being added to the signal.

\[x_{new} = x_{old} + \sum_{i} A_i \cos(2\pi f_i t + \phi_{0,i})\]
Parameters:
  • frequency (float or array-like) – The tone frequency or frequencies in Hz.

  • amplitude (float or array-like, optional) – The amplitude of the cosine(s). Must have the same length as frequency if provided as an array. (default = 1)

  • start_phase (float or array-like, optional) – The starting phase of the cosine(s) in radians. Must have the same length as frequency if provided as an array. (default = 0)

Returns:

Returns self for method chaining.

Return type:

Signal
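The summation formula can be illustrated with plain numpy (arbitrary frequencies and amplitudes; not the audiotoolbox implementation):

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs
freqs = np.array([440.0, 880.0])     # f_i in Hz
amps = np.array([1.0, 0.5])          # A_i
phases = np.array([0.0, 0.0])        # phi_{0,i}

# x_new = x_old + sum_i A_i * cos(2*pi*f_i*t + phi_{0,i})
x = np.zeros_like(t)
x += (amps[:, None] * np.cos(2 * np.pi * freqs[:, None] * t + phases[:, None])).sum(axis=0)
```

At t = 0 both cosines are at their peak, so x[0] equals the sum of the amplitudes, 1.5.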

add_uncorr_noise(corr: float = 0, variance: float = 1, ntype: Literal['white', 'pink', 'brown'] = 'white', seed: float | None = None, bandpass: dict | None = None, highpass: dict | None = None, lowpass: dict | None = None)

Add partly uncorrelated noise.

This function adds partly uncorrelated noise using the N+1 generator method.

To generate N partly uncorrelated noises with a desired correlation coefficient of \(\rho\), the algorithm first generates N+1 noise tokens which are then orthogonalized using the Gram-Schmidt process (as implemented in numpy.linalg.qr). The N+1th noise token is then mixed with the remaining noise tokens using the equation

\[X_{\rho,n} = X_{N+1} \sqrt{\rho} + X_n \sqrt{1 - \rho}\]

where \(X_{\rho,n}\) is the nth output noise, \(X_{n}\) the nth independent noise, and \(X_{N+1}\) is the common noise.

For two noise tokens, this is identical to the asymmetric three-generator method described in [1].

Parameters:
  • corr (float, optional) – Desired correlation of the noise tokens. (default = 0)

  • variance (scalar, optional) – The desired variance of the noise, (default=1)

  • ntype ({'white', 'pink', 'brown'}) – spectral shape of the noise

  • seed (int or 1-d array_like, optional) – Seed for RandomState. Must be convertible to 32 bit unsigned integers.

  • bandpass (dict, optional) – Parameters for a bandpass filter; these are passed as arguments to the audiotoolbox.filter.bandpass function

  • lowpass (dict, optional) – Parameters for a lowpass filter; these are passed as arguments to the audiotoolbox.filter.lowpass function

  • highpass (dict, optional) – Parameters for a highpass filter; these are passed as arguments to the audiotoolbox.filter.highpass function

Returns:

Returns itself

Return type:

Signal

References

correlated noise—a comparison of methods. The Journal of the Acoustical Society of America, 130(1), 292-301. http://dx.doi.org/10.1121/1.3596475
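A minimal sketch of the N+1 generator method with plain numpy, assuming the mixing equation above (this is not the audiotoolbox implementation; the token length and \(\rho\) are illustration values):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_out, rho = 200_000, 2, 0.5

# Generate N+1 noise tokens and orthogonalize them (Gram-Schmidt via QR)
tokens = rng.standard_normal((n_samples, n_out + 1))
q, _ = np.linalg.qr(tokens)
q /= q.std(axis=0)                     # unit-variance orthogonal tokens

# Mix the common (N+1 th) token into the N independent tokens
common = q[:, -1]
mixed = np.sqrt(rho) * common[:, None] + np.sqrt(1 - rho) * q[:, :n_out]

corr = np.corrcoef(mixed.T)[0, 1]      # close to the requested rho
```

Because the tokens are orthogonal and share a common component weighted by \(\sqrt{\rho}\), the pairwise correlation of the outputs converges to \(\rho\) for long tokens.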

bandpass(fc, bw, filter_type, **kwargs)

Apply a bandpass filter.

Applies a bandpass filter to the signal. The available filters are:

  • brickwall: An ‘optimal’ brickwall filter

  • gammatone: A real-valued gammatone filter

  • butter: A Butterworth filter

For additional filter parameters and detailed description see the respective implementations:

Parameters:
  • fc (scalar) – The bandpass center frequency in Hz

  • bw (scalar) – The filter bandwidth in Hz

  • filter_type ({'brickwall', 'gammatone', 'butter'}) – The filter type

  • **kwargs – Further keyword arguments are passed to the respective filter functions

Returns:

Returns itself

Return type:

Signal

property ch

Direct channel indexer

Returns an indexer class which enables direct indexing and slicing of the channels independently of the samples.

Examples

>>> sig = audiotoolbox.Signal((2, 3), 1, 48000).add_noise()
>>> print(np.all(sig.ch[1, 2] == sig[:, 1, 2]))
True
concatenate(signal)

Concatenate another signal or array

This method appends another signal to the end of the current signal.

Parameters:

signal (signal or ndarray) – The signal to append

Return type:

Returns itself

convolve(kernel, mode: Literal['full', 'valid', 'same'] = 'full', overlap_dimensions: bool = True)

Convolves the current signal with the given kernel.

This method performs a convolution operation between the current signal and the provided kernel. The convolution is performed along the overlapping dimensions of the two signals. E.g., If the signal has two channels and the kernel has two channels, the first channel of the signal is convolved with the first channel of the kernel, and the second channel of the signal is convolved with the second channel of the kernel. The resulting signal will again have two channels. If overlap_dimensions is False, the convolution is performed along all dimensions. A Signal with two channels convolved with a two-channel kernel will result in an output of shape (2, 2) where each channel of the signal is convolved with each channel of the kernel.

This method uses scipy.signal.fftconvolve for the convolution.

Parameters:
  • kernel (Signal) – The kernel to convolve with.

  • mode (str {'full', 'valid', 'same'}, optional) – The convolution mode for fftconvolve (default=full)

  • overlap_dimensions (bool, optional) – Whether to convolve only along overlapping dimensions. If True, the convolution is performed only along the dimensions that overlap between the two signals. If False, the convolution is performed along all dimensions. Defaults to True.

Returns:

The convolved signal.

Return type:

Self

Examples

If the last dimension of signal and the first dimension of kernel match, convolution takes place along this axis. This means that the first channel of the signal is convolved with the first channel of the kernel, the second with the second.

>>> signal = Signal(2, 1, 48000)
>>> kernel = Signal(2, 100e-3, 48000)
>>> signal.convolve(kernel)
>>> signal.n_channels
2

This also works with multiple overlapping dimensions.

>>> signal = Signal((5, 2, 3), 1, 48000)
>>> kernel = Signal((2, 3), 100e-3, 48000)
>>> signal.convolve(kernel)
>>> signal.n_channels
(5, 2, 3)

The ‘overlap_dimensions’ keyword can be set to False so that all signal channels are instead convolved with all kernel channels.

>>> signal = Signal(2, 1, 48000)
>>> kernel = Signal(2, 100e-3, 48000)
>>> signal.convolve(kernel, overlap_dimensions=False)
>>> signal.n_channels
(2, 2)
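Channel-wise (“overlapping dimensions”) convolution can be sketched with plain numpy. audiotoolbox uses scipy.signal.fftconvolve internally, while this illustrative sketch uses numpy.convolve per channel:

```python
import numpy as np

fs = 48000
sig = np.zeros((fs, 2))                  # two-channel signal
sig[0] = 1.0                             # unit impulse in both channels
kernel = np.zeros((100, 2))
kernel[0, 0], kernel[0, 1] = 1.0, -1.0   # a different kernel per channel

# Channel i of the signal is convolved with channel i of the kernel
out = np.stack(
    [np.convolve(sig[:, i], kernel[:, i], mode="full") for i in range(2)],
    axis=1,
)
# mode="full" yields n_sig + n_kernel - 1 samples per channel
```

The impulse in each channel simply reproduces that channel’s kernel, so the two output channels differ in sign here.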
delay(delay: float, method: Literal['fft', 'sample'] = 'fft')

Delays the signal by circular shifting.

Circularly shifts the signal forward to create a certain time delay relative to the original time. E.g., if shifted by an equivalent of N samples, the value at sample i will move to sample i + N.

Two methods can be used. Using the default method ‘fft’, the signal is shifted by applying an FFT transform, phase shifting each frequency according to the delay, and applying an inverse transform. This is identical to using the audiotoolbox.FrequencyDomainSignal.time_shift method. When using the method ‘sample’, the signal is delayed by circularly shifting it by the number of samples closest to delay.

Parameters:
  • delay (float) – The delay in seconds

  • method ({'fft', 'sample'}, optional) – The method used to delay the signal (default: ‘fft’)

Returns:

Returns itself

Return type:

Signal

See also

audio.shift_signal, audio.FreqDomainSignal.time_shift
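The ‘fft’ method can be sketched with plain numpy: transform, apply a linear phase, and transform back. For an integer-sample delay the result matches a circular shift (illustrative sketch, not the audiotoolbox implementation):

```python
import numpy as np

fs, n = 48000, 4800
x = np.random.default_rng(3).standard_normal(n)
delay = 10 / fs                          # exactly 10 samples, for comparison

# Phase-shift every frequency bin by exp(-2j*pi*f*delay)
freqs = np.fft.rfftfreq(n, d=1 / fs)
shifted = np.fft.irfft(np.fft.rfft(x) * np.exp(-2j * np.pi * freqs * delay), n)

# Circular shift: the value at sample i moves to sample i + 10
same = np.allclose(shifted, np.roll(x, 10))  # True
```

For non-integer delays the phase ramp interpolates between samples, which is exactly why the ‘fft’ method can realize sub-sample delays.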

property duration

Duration of the signal in seconds

from_file(filename: str, start: int = 0, channels='all')

Load a signal from an audio file.

This method loads a signal from an audio file and assigns it to the current Signal object. The signal can be loaded from a specific start point and for specific channels.

Parameters:
  • filename (str) – The path to the audio file to load.

  • start (int, optional) – The starting sample index from which to load the signal. Default is 0.

  • channels (int, tuple, or str, optional) – The channels to load from the audio file. Can be an integer specifying a single channel, a tuple specifying multiple channels, or “all” to load all channels. Default is “all”.

Returns:

The Signal object with the loaded audio data.

Return type:

Signal

Raises:

ValueError – If the number of channels in the loaded signal does not match the number of channels in the current Signal object.

Examples

Load a signal from a file starting at the beginning and using all channels:

>>> sig = Signal(2, 1, 48000)
>>> sig.from_file("example.wav")

Load a signal from a file starting at sample index 1000 and using the first channel:

>>> sig = Signal(1, 1, 48000)
>>> sig.from_file("example.wav", start=1000, channels=0)
property fs: int

Sampling rate of the signal in Hz

multiply(x: float | ndarray)

In-place multiplication

This function allows for in-place multiplication.

Parameters:

x (scalar or ndarray) – The value or array to multiply with the signal

Return type:

Returns itself

Examples

>>> sig = audiotoolbox.Signal(1, 1, 48000).add_tone(500).multiply(2)
>>> print(sig.max())
2.0
property n_channels

Number of channels in the signal

property n_samples

Number of samples in the signal

phase_shift(phase: float)

Shifts all frequency components of a signal by a constant phase.

Shift all frequency components of a given signal by a constant phase. This is identical to calling the phase_shift method of the FrequencyDomainSignal class.

Parameters:

phase (scalar) – The phase in rad by which the signal is shifted.

Returns:

Returns itself

Return type:

Signal
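A plain-numpy sketch of a constant phase shift: multiply every positive-frequency bin by \(e^{i\phi}\). A shift of \(\pi/2\) turns a cosine into a minus sine (illustrative; not the audiotoolbox implementation):

```python
import numpy as np

fs, f = 48000, 1000
t = np.arange(fs) / fs
x = np.cos(2 * np.pi * f * t)

phase = np.pi / 2
spec = np.fft.rfft(x)
spec[1:] *= np.exp(1j * phase)    # shift all bins except the (real) DC bin
y = np.fft.irfft(spec, len(x))
# y is (up to rounding) cos(2*pi*f*t + pi/2) = -sin(2*pi*f*t)
```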

play(block: bool = True)

Quick playback of the signal over the default audio output device.

Parameters:

block (bool, optional) – If True, the method will block until playback is finished. If False, playback will be non-blocking and the method will return immediately. Default is True.

rectify()

One-way rectification of the signal.

Returns:

Returns itself

Return type:

Signal

resample(new_fs: int)

Resample the signal to a new sampling rate.

This method uses the resampy library to resample the signal to a new sampling rate. It is based on the band-limited sinc interpolation method for sampling rate conversion as described by Smith (2015).

set_dbfs(dbfs: float)

Full scale normalization of the signal.

Normalizes the signal level to dB full scale. 0 dB FS corresponds to a signal with an RMS of \(\frac{1}{\sqrt{2}}\), so that a tone at 0 dB FS will have an amplitude of 1.

Parameters:

dbfs (float) – The dB full scale value to reach

Returns:

self

Return type:

Signal
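The normalization can be sketched with plain numpy: scale the signal so its RMS equals \(10^{L_{FS}/20}/\sqrt{2}\) (illustration values; not the audiotoolbox implementation):

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs
sig = 0.3 * np.cos(2 * np.pi * 500 * t)       # a tone at some arbitrary level

dbfs = 0.0                                    # target level in dB FS
target_rms = 10 ** (dbfs / 20) / np.sqrt(2)   # 0 dB FS -> RMS of 1/sqrt(2)
sig *= target_rms / np.sqrt((sig ** 2).mean())
# A tone at 0 dB FS now has an amplitude of 1
```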

set_dbspl(dbspl: float)

Set sound pressure level in dB.

Normalizes the signal to a given sound pressure level in dB relative to 20e-6 Pa. For this, the signal is multiplied with the factor \(A\)

\[A = \frac{p_0}{\sigma} 10^{L / 20}\]

where \(L\) is the goal SPL, \(p_0=20\mu Pa\) and \(\sigma\) is the RMS of the signal.

Parameters:

dbspl (float) – The sound pressure level in dB

Returns:

Returns itself

Return type:

Signal
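The scaling factor \(A\) from the equation above, sketched with plain numpy (illustration values; not the audiotoolbox implementation):

```python
import numpy as np

fs, p0 = 48000, 20e-6                 # sampling rate, reference pressure
t = np.arange(fs) / fs
sig = np.cos(2 * np.pi * 500 * t)

L = 70.0                              # target SPL in dB
sigma = np.sqrt((sig ** 2).mean())    # current RMS
sig *= (p0 / sigma) * 10 ** (L / 20)  # A = (p0 / sigma) * 10^(L/20)

# Reading the level back: 20*log10(rms / p0) recovers the target SPL
spl = 20 * np.log10(np.sqrt((sig ** 2).mean()) / p0)
```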

property time

Time vector for the signal.

to_freqdomain()

Convert to frequency domain by applying a DFT.

This function returns a frequency domain representation of the signal.

As opposed to most methods, this conversion is not in-place; instead, a new audiotoolbox.FrequencyDomainSignal() object is returned.

Returns:

The frequency domain representation of the signal

Return type:

FrequencyDomainSignal

trim(t_start: float, t_end: float | None = None)

Trim the signal between two points in time.

Removes samples according to t_start and t_end. This method cannot be applied to a single channel or slice.

Parameters:
  • t_start (float) – Signal time at which the returned signal should start

  • t_end (float or None (optional)) – Signal time at which the signal should stop. The full remaining signal is used if set to None. (default: None)

Returns:

Returns itself

Return type:

Signal

write_file(filename, **kwargs)

Save the signal as an audio file.

This method saves the current signal as an audio file. Additional parameters for the file format can be specified through keyword arguments. The file can be saved in any format supported by libsndfile, such as WAV, FLAC, AIFF, etc.

Parameters:
  • filename (str) – The filename to save the audio file as.

  • **kwargs – Additional keyword arguments to be passed to the audiotoolbox.wav.writefile function. These can include format and subtype.

Return type:

None

Examples

Save the signal to a file named “output.wav”:

>>> sig = Signal(2, 1, 48000)
>>> sig.write_file("output.wav")

Save the signal to a file with a specific format and subtype:

>>> sig = Signal(2, 1, 48000)
>>> sig.write_file("output.wav", format="WAV", subtype="PCM_16")

Save the signal to a FLAC file:

>>> sig = Signal(2, 1, 48000)
>>> sig.write_file("output.flac", format="FLAC")

See also

audiotoolbox.wav.writefile

Function used to write the audio file.

zeropad(number: None | tuple[int, int] = None, duration: None | tuple[float, float] = None)

Add zeros to start and end of signal.

This function adds zeros of a given number or duration to the start or end of a signal.

If number or duration is a scalar, an equal number of zeros will be appended at the front and end of the array. If a vector of two values is given, the first defines the number or duration at the beginning, the second the number or duration of zeros at the end.

Parameters:
  • number (scalar or vector of len(2), optional) – Number of zeros.

  • duration (scalar or vector of len(2), optional) – Duration of zeros in seconds.

Returns:

Returns itself

Return type:

Signal
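The padding behaviour can be sketched with numpy.pad (illustrative; not the audiotoolbox implementation):

```python
import numpy as np

x = np.ones(10)
# number=(5, 3): five zeros at the start, three at the end
padded = np.pad(x, (5, 3))
```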

The Signal.stats sub-module

The Signal.stats submodule gives access to an instance of the audiotoolbox.stats.SignalStats class. e.g.:

>>> sig = audiotoolbox.Signal(1, 1, 48000)
>>> rms = sig.stats.rms
class audiotoolbox.stats.SignalStats(sig)
property crest_factor

Crest factor of the signal (peak value relative to RMS)

See also

audiotoolbox.crest_factor

property dba

A-weighted sound pressure level in dB

property dbc

C-weighted sound pressure level in dB

property dbfs: ndarray

Calculate the dBFS RMS value of a given signal.

\[L = 20 \log_{10}\left(\sqrt{2}\sigma\right)\]

where \(\sigma\) is the signal’s RMS.

property dbspl

Calculate the dB (SPL) values for all channels of the signal.

\[L = 20 \log_{10}\left(\frac{\sigma}{p_o}\right)\]

where \(L\) is the SPL, \(p_0=20\mu Pa\) and \(\sigma\) is the RMS of the signal.

property mean

Arithmetic mean

octave_band_levels(oct_fraction: int = 3) tuple[ndarray, ndarray]

Calculate octave band levels of the signal.

Parameters:

oct_fraction (int, optional) – Fraction of an octave to use, by default 3 for 1/3 octave bands.

Returns:

(frequencies, levels) – Frequencies and corresponding levels in dB full scale (dBFS)

Return type:

tuple[ndarray, ndarray]

property rms

Root mean square.

Returns:

The RMS value

Return type:

float

property var

Variance

The Signal.time_frequency sub-module

The Signal.time_frequency submodule gives access to an instance of the audiotoolbox.time_frequency.TimeFrequency class that provides time-frequency analysis methods such as spectrograms.

class audiotoolbox.time_frequency.TimeFrequency(sig)

Class containing time-frequency analysis methods.

filterbank_specgram(bank: FilterBank, nperseg: int = 1024, noverlap: int = 512, win: str = 'hann') tuple[Signal, np.ndarray]

Calculate the spectrogram of a signal using a specified filter bank.

This function applies a filter bank to the signal and computes the spectrogram.

Parameters:
  • bank (FilterBank) – The filter bank to apply to the signal.

  • nperseg (int, optional) – The number of samples per segment for the spectrogram (default is 1024).

  • noverlap (int, optional) – The number of samples to overlap between segments (default is 512).

  • win (str, optional) – The window function to apply (default is ‘hann’). Can be any valid window function name recognized by scipy.signal.get_window.

Returns:

A tuple containing the spectrogram as an audio Signal in dBFS and the center frequencies of the filter bank.

Return type:

tuple[audio.Signal, np.ndarray]

gammatone_specgram(nperseg: int = 1024, noverlap: int = 512, win: str = 'hann', **kwargs) tuple[Signal, np.ndarray]

Calculate the gammatone spectrogram of a signal.

This function applies a gammatone filter bank to the signal and computes the spectrogram.

Parameters:
  • nperseg (int, optional) – The number of samples per segment for the spectrogram (default is 1024).

  • noverlap (int, optional) – The number of samples to overlap between segments (default is 512).

  • win (str, optional) – The window function to apply (default is ‘hann’). Can be any valid window function name recognized by scipy.signal.get_window.

  • **kwargs (dict, optional) – Additional parameters to pass to the auditory gamma bank function.

Returns:

A tuple containing the spectrogram as an audio Signal in dBFS and the center frequencies of the gammatone filters.

Return type:

tuple[audio.Signal, np.ndarray]

octave_band_specgram(nperseg: int = 1024, noverlap: int = 512, win: str = 'hann', **kwargs) tuple[Signal, np.ndarray]

Calculate the octave band spectrogram of a signal.

This function applies an octave filter bank to the signal and computes the spectrogram.

Parameters:
  • nperseg (int, optional) – The number of samples per segment for the spectrogram (default is 1024).

  • noverlap (int, optional) – The number of samples to overlap between segments (default is 512).

  • win (str, optional) – The window function to apply (default is ‘hann’). Can be any valid window function name recognized by scipy.signal.get_window.

  • **kwargs (dict, optional) – Additional parameters to pass to the octave bank function.

Returns:

A tuple containing the spectrogram as an audio Signal in dBFS and the center frequencies of the octave bands

Return type:

tuple[audio.Signal, np.ndarray]

stft_specgram(nperseg: int = 1024, noverlap: int = 512, win: str = 'hann', **kwargs) tuple[Signal, np.ndarray]

Calculate the Short-Time Fourier Transform (STFT) spectrogram of a signal.

This function computes the STFT of the signal and returns the spectrogram. It is a wrapper around the scipy.signal.spectrogram function.

Parameters:
  • nperseg (int, optional) – The number of samples per segment for the STFT (default is 1024).

  • noverlap (int, optional) – The number of samples to overlap between segments (default is 512).

  • win (str, optional) – The window function to apply (default is ‘hann’). Can be any valid window function name recognized by scipy.signal.get_window.

  • **kwargs (dict, optional) – Additional parameters to pass to the spectrogram function.

Returns:

A tuple containing the spectrogram as an audio Signal in dBFS and the center frequencies of the STFT.

Return type:

tuple[audio.Signal, np.ndarray]