Signals in the time domain (audiotoolbox.Signal)

The Signal class inherits from numpy.ndarray via the audiotoolbox.BaseSignal class:

Inheritance diagram: numpy.ndarray → audiotoolbox.base_signal.BaseSignal → audiotoolbox.signal.Signal. In addition, Signal mixes in AnalysisMixin, GenerationMixin, ModificationMixin, IOMixin, and FilteringMixin from audiotoolbox.signal_mixins.

As a consequence, numpy.ndarray methods such as x.min(), x.max(), x.sum(), x.var() and others can also be used on audiotoolbox.Signal objects. For more information, check the numpy docs.

class audiotoolbox.Signal(n_channels: int | tuple | list, duration: float, fs: int, dtype=<class 'float'>)

Base class for signals in the timedomain.

Parameters:
  • n_channels (int or tuple) – Number of channels to be used, can be N-dimensional

  • duration (float) – Stimulus duration in seconds

  • fs (int) – Sampling rate in Hz

  • dtype (type, optional) – Datatype of the array (default is float)

Returns:

Signal

Return type:

The new signal object.

Examples

Create a 1 second long signal with two channels at a sampling rate of 48 kHz

>>> sig = audiotoolbox.Signal(2, 1, 48000)
>>> print(sig.shape)
(48000, 2)
abs()

Absolute value

Calculates the absolute value or modulus of all values of the signal

add(x)

In-place summation

This function allows for in-place summation.

Parameters:

x (scalar or ndarray) – The value or array to add to the signal

Return type:

Returns itself

Examples

>>> sig = audiotoolbox.Signal(1, 1, 48000).add_tone(500).add(2)
>>> print(sig.mean())
2.0
add_cos_modulator(frequency: float, m: float, start_phase: float = 0)

Multiply a cosine amplitude modulator onto the signal.

Multiplies a cosine amplitude modulator following the equation:

\[1 + m \cos(2 \pi f_m t + \phi_{0})\]

where \(m\) is the modulation depth, \(f_m\) is the modulation frequency, \(t\) is the time, and \(\phi_0\) is the start phase.

Parameters:
  • frequency (float) – The frequency of the cosine modulator.

  • m (float) – The modulation depth.

  • start_phase (float, optional) – The starting phase of the cosine in radians. (default = 0)

Returns:

Returns itself

Return type:

Signal

See also

audiotoolbox.cos_amp_modulator
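The modulator equation can be illustrated with plain numpy; this sketch does not use audiotoolbox, and the sampling rate, tone frequency, and modulator settings are arbitrary illustration values:

```python
import numpy as np

fs = 48000                                # sampling rate in Hz
t = np.arange(fs) / fs                    # 1 s time vector
carrier = np.cos(2 * np.pi * 500 * t)     # 500 Hz tone

m, f_m, phi0 = 1.0, 30.0, 0.0             # modulation depth, frequency, start phase
# 1 + m * cos(2*pi*f_m*t + phi0), multiplied onto the carrier
modulator = 1 + m * np.cos(2 * np.pi * f_m * t + phi0)
modulated = carrier * modulator
```

With a modulation depth of m = 1, the envelope swings between 0 and 2, i.e. full modulation.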

add_fade_window(rise_time: float, win_type: str = 'hann', **kwargs)

Add a fade in/out window to the signal.

This function multiplies a fade window with a given rise time onto the signal.

Parameters:
  • rise_time (float) – The rise time in seconds.

  • win_type (str) – Any window function supported by scipy.signal.get_window. Default is ‘hann’.

  • **kwargs – Additional keyword arguments passed to the window function (see scipy implementation).

Notes

Window types:

  • boxcar

  • triang

  • blackman

  • hamming

  • hann

  • bartlett

  • flattop

  • parzen

  • bohman

  • blackmanharris

  • nuttall

  • barthann

  • cosine

  • exponential

  • tukey

  • taylor

  • lanczos

  • kaiser (needs beta)

  • kaiser_bessel_derived (needs beta)

  • gaussian (needs standard deviation)

  • general_cosine (needs weighting coefficients)

  • general_gaussian (needs power, width)

  • general_hamming (needs window coefficient)

  • dpss (needs normalized half-bandwidth)

  • chebwin (needs attenuation)

Returns:

Returns itself

Return type:

Signal
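The effect of a fade window can be sketched with plain numpy. For brevity this sketch builds a Hann window via numpy.hanning instead of scipy.signal.get_window; the values are illustrative and this is not the audiotoolbox implementation:

```python
import numpy as np

fs = 48000
sig = np.ones(fs)                 # 1 s "signal" of ones, to make the fade visible
rise_time = 0.1                   # 100 ms rise time
n_rise = int(rise_time * fs)

# A Hann window of length 2*n_rise splits into a rising and a falling half
hann = np.hanning(2 * n_rise)
fade = np.ones(fs)
fade[:n_rise] = hann[:n_rise]     # fade-in ramp
fade[-n_rise:] = hann[n_rise:]    # fade-out ramp
faded = sig * fade
```

The faded signal starts and ends at zero while the middle section is untouched.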

add_noise(ntype: Literal['white', 'pink', 'brown'] = 'white', variance: float = 1.0, seed=None)

Add uncorrelated noise to the signal.

Adds Gaussian noise with a defined variance and different spectral shapes. The noise is generated in the frequency domain using the Gaussian pseudorandom generator numpy.random.randn. The real and imaginary part of each frequency component is set using the pseudorandom generator. Each frequency bin is then weighted depending on the spectral shape. The resulting spectrum is then transformed into the time domain using numpy.fft.ifft.

Weighting functions:

  • white: \(w(f) = 1\)

  • pink: \(w(f) = \frac{1}{\sqrt{f}}\)

  • brown: \(w(f) = \frac{1}{f}\)

Parameters:
  • ntype ({'white', 'pink', 'brown'}) – Spectral shape of the noise. (default = 'white')

  • variance (scalar, optional) – The variance of the noise. (default = 1)

  • seed (int or 1-d array_like, optional) – Seed for RandomState. Must be convertible to 32 bit unsigned integers.

Returns:

Returns itself

Return type:

Signal
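The frequency-domain generation described above can be sketched with plain numpy. For brevity this sketch uses numpy.fft.rfft conventions and numpy.random.default_rng; it is illustrative and not the audiotoolbox implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
n, fs = 48000, 48000

# Real and imaginary part of each frequency bin from a Gaussian generator
spec = rng.standard_normal(n // 2 + 1) + 1j * rng.standard_normal(n // 2 + 1)
freqs = np.fft.rfftfreq(n, d=1 / fs)

# Pink weighting w(f) = 1/sqrt(f); the DC bin is left at zero
w = np.zeros_like(freqs)
w[1:] = 1 / np.sqrt(freqs[1:])
pink = np.fft.irfft(spec * w, n)

# Scale to the requested variance (here: 1)
pink *= np.sqrt(1.0 / pink.var())
```

Replacing the weighting with w(f) = 1 or w(f) = 1/f yields white or brown noise, respectively.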

add_tone(frequency: float | np.ndarray | list, amplitude: float | np.ndarray | list = 1, start_phase: float | np.ndarray | list = 0) Signal

Add one or more cosine tones to the signal.

This function will add pure tones to the current waveform. If multiple frequencies are given (as arrays), their waveforms are summed together before being added to the signal.

\[x_{new} = x_{old} + \sum_{i} A_i \cos(2\pi f_i t + \phi_{0,i})\]
Parameters:
  • frequency (float or array-like) – The tone frequency or frequencies in Hz.

  • amplitude (float or array-like, optional) – The amplitude of the cosine(s). Must have the same length as frequency if provided as an array. (default = 1)

  • start_phase (float or array-like, optional) – The starting phase of the cosine(s) in radians. Must have the same length as frequency if provided as an array. (default = 0)

Returns:

Returns self for method chaining.

Return type:

Signal
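The summation formula can be illustrated with plain numpy (arbitrary frequencies and amplitudes; not the audiotoolbox implementation):

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs
freqs = np.array([440.0, 880.0])     # f_i in Hz
amps = np.array([1.0, 0.5])          # A_i
phases = np.array([0.0, 0.0])        # phi_{0,i}

# x_new = x_old + sum_i A_i * cos(2*pi*f_i*t + phi_{0,i})
x = np.zeros_like(t)
x += (amps[:, None] * np.cos(2 * np.pi * freqs[:, None] * t + phases[:, None])).sum(axis=0)
```

At t = 0 both cosines are at their peak, so x[0] equals the sum of the amplitudes, 1.5.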

add_uncorr_noise(corr: float = 0, variance: float = 1, ntype: Literal['white', 'pink', 'brown'] = 'white', seed: float | None = None, bandpass: dict | None = None, highpass: dict | None = None, lowpass: dict | None = None)

Add partly uncorrelated noise.

This function adds partly uncorrelated noise using the N+1 generator method.

To generate N partly uncorrelated noises with a desired correlation coefficient of \(\rho\), the algorithm first generates N+1 noise tokens which are then orthogonalized using the Gram-Schmidt process (as implemented in numpy.linalg.qr). The N+1th noise token is then mixed with the remaining noise tokens using the equation

\[X_{\rho,n} = X_{N+1} \sqrt{\rho} + X_n \sqrt{1 - \rho}\]

where \(X_{\rho,n}\) is the nth output noise, \(X_{n}\) the nth independent noise, and \(X_{N+1}\) is the common noise.

For two noise tokens, this is identical to the asymmetric three-generator method described in [1].

Parameters:
  • corr (float, optional) – Desired correlation of the noise tokens. (default = 0)

  • variance (scalar, optional) – The desired variance of the noise, (default=1)

  • ntype ({'white', 'pink', 'brown'}) – spectral shape of the noise

  • seed (int or 1-d array_like, optional) – Seed for RandomState. Must be convertible to 32 bit unsigned integers.

  • bandpass (dict, optional) – Parameters for a bandpass filter; these are passed as arguments to the audiotoolbox.filter.bandpass function

  • lowpass (dict, optional) – Parameters for a lowpass filter; these are passed as arguments to the audiotoolbox.filter.lowpass function

  • highpass (dict, optional) – Parameters for a highpass filter; these are passed as arguments to the audiotoolbox.filter.highpass function

Returns:

Returns itself

Return type:

Signal

References

correlated noise—a comparison of methods. The Journal of the Acoustical Society of America, 130(1), 292-301. http://dx.doi.org/10.1121/1.3596475
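A minimal sketch of the N+1 generator method with plain numpy, assuming the mixing equation above (this is not the audiotoolbox implementation; the token length and \(\rho\) are illustration values):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_out, rho = 200_000, 2, 0.5

# Generate N+1 noise tokens and orthogonalize them (Gram-Schmidt via QR)
tokens = rng.standard_normal((n_samples, n_out + 1))
q, _ = np.linalg.qr(tokens)
q /= q.std(axis=0)                     # unit-variance orthogonal tokens

# Mix the common (N+1 th) token into the N independent tokens
common = q[:, -1]
mixed = np.sqrt(rho) * common[:, None] + np.sqrt(1 - rho) * q[:, :n_out]

corr = np.corrcoef(mixed.T)[0, 1]      # close to the requested rho
```

Because the tokens are orthogonal and share a common component weighted by \(\sqrt{\rho}\), the pairwise correlation of the outputs converges to \(\rho\) for long tokens.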

bandpass(fc, bw, filter_type, **kwargs)

Apply a bandpass filter.

Applies a bandpass filter to the signal. The available filters are:

  • brickwall: An ‘optimal’ brickwall filter

  • gammatone: A real-valued gammatone filter

  • butter: A Butterworth filter

For additional filter parameters and detailed description see the respective implementations:

Parameters:
  • fc (scalar) – The bandpass center frequency in Hz

  • bw (scalar) – The filter bandwidth in Hz

  • filter_type ({'brickwall', 'gammatone', 'butter'}) – The filter type

  • **kwargs – Further keyword arguments are passed to the respective filter functions

Returns:

Returns itself

Return type:

Signal

property ch

Direct channel indexer

Returns an indexer class which enables direct indexing and slicing of the channels independently of the samples.

Examples

>>> sig = audiotoolbox.Signal((2, 3), 1, 48000).add_noise()
>>> print(np.all(sig.ch[1, 2] == sig[:, 1, 2]))
True
concatenate(signal)

Concatenate another signal or array

This method appends another signal to the end of the current signal.

Parameters:

signal (signal or ndarray) – The signal to append

Return type:

Returns itself

convolve(kernel, mode: Literal['full', 'valid', 'same'] = 'full', overlap_dimensions: bool = True)

Convolves the current signal with the given kernel.

This method performs a convolution operation between the current signal and the provided kernel. The convolution is performed along the overlapping dimensions of the two signals. E.g., If the signal has two channels and the kernel has two channels, the first channel of the signal is convolved with the first channel of the kernel, and the second channel of the signal is convolved with the second channel of the kernel. The resulting signal will again have two channels. If overlap_dimensions is False, the convolution is performed along all dimensions. A Signal with two channels convolved with a two-channel kernel will result in an output of shape (2, 2) where each channel of the signal is convolved with each channel of the kernel.

This method uses scipy.signal.fftconvolve for the convolution.

Parameters:
  • kernel (Signal) – The kernel to convolve with.

  • mode (str {'full', 'valid', 'same'}, optional) – The convolution mode for fftconvolve (default=full)

  • overlap_dimensions (bool, optional) – Whether to convolve only along overlapping dimensions. If True, the convolution is performed only along the dimensions that overlap between the two signals. If False, the convolution is performed along all dimensions. Defaults to True.

Returns:

The convolved signal.

Return type:

Self

Examples

If the last dimension of signal and the first dimension of kernel match, convolution takes place along this axis. This means that the first channel of the signal is convolved with the first channel of the kernel, the second with the second.

>>> signal = Signal(2, 1, 48000)
>>> kernel = Signal(2, 100e-3, 48000)
>>> signal.convolve(kernel)
>>> signal.n_channels
2

This also works with multiple overlapping dimensions.

>>> signal = Signal((5, 2, 3), 1, 48000)
>>> kernel = Signal((2, 3), 100e-3, 48000)
>>> signal.convolve(kernel)
>>> signal.n_channels
(5, 2, 3)

The ‘overlap_dimensions’ keyword can be set to False so that all signal channels are instead convolved with all kernel channels.

>>> signal = Signal(2, 1, 48000)
>>> kernel = Signal(2, 100e-3, 48000)
>>> signal.convolve(kernel, overlap_dimensions=False)
>>> signal.n_channels
(2, 2)
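Channel-wise (“overlapping dimensions”) convolution can be sketched with plain numpy. audiotoolbox uses scipy.signal.fftconvolve internally, while this illustrative sketch uses numpy.convolve per channel:

```python
import numpy as np

fs = 48000
sig = np.zeros((fs, 2))                  # two-channel signal
sig[0] = 1.0                             # unit impulse in both channels
kernel = np.zeros((100, 2))
kernel[0, 0], kernel[0, 1] = 1.0, -1.0   # a different kernel per channel

# Channel i of the signal is convolved with channel i of the kernel
out = np.stack(
    [np.convolve(sig[:, i], kernel[:, i], mode="full") for i in range(2)],
    axis=1,
)
# mode="full" yields n_sig + n_kernel - 1 samples per channel
```

The impulse in each channel simply reproduces that channel’s kernel, so the two output channels differ in sign here.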
delay(delay: float, method: Literal['fft', 'sample'] = 'fft')

Delays the signal by circular shifting.

Circularly shifts the signal forward to create a certain time delay relative to the original time. E.g., if shifted by an equivalent of N samples, the value at sample i will move to sample i + N.

Two methods can be used. Using the default method ‘fft’, the signal is shifted by applying an FFT transform, phase shifting each frequency according to the delay, and applying an inverse transform. This is identical to using the audiotoolbox.FrequencyDomainSignal.time_shift method. When using the method ‘sample’, the signal is delayed by circularly shifting it by the number of samples closest to delay.

Parameters:
  • delay (float) – The delay in seconds

  • method ({'fft', 'sample'}, optional) – The method used to delay the signal (default: ‘fft’)

Returns:

Returns itself

Return type:

Signal

See also

audio.shift_signal, audio.FreqDomainSignal.time_shift
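The ‘fft’ method can be sketched with plain numpy: transform, apply a linear phase, and transform back. For an integer-sample delay the result matches a circular shift (illustrative sketch, not the audiotoolbox implementation):

```python
import numpy as np

fs, n = 48000, 4800
x = np.random.default_rng(3).standard_normal(n)
delay = 10 / fs                          # exactly 10 samples, for comparison

# Phase-shift every frequency bin by exp(-2j*pi*f*delay)
freqs = np.fft.rfftfreq(n, d=1 / fs)
shifted = np.fft.irfft(np.fft.rfft(x) * np.exp(-2j * np.pi * freqs * delay), n)

# Circular shift: the value at sample i moves to sample i + 10
same = np.allclose(shifted, np.roll(x, 10))  # True
```

For non-integer delays the phase ramp interpolates between samples, which is exactly why the ‘fft’ method can realize sub-sample delays.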

property duration

Duration of the signal in seconds

from_file(filename: str, start: int = 0, channels='all')

Load a signal from an audio file.

This method loads a signal from an audio file and assigns it to the current Signal object. The signal can be loaded from a specific start point and for specific channels.

Parameters:
  • filename (str) – The path to the audio file to load.

  • start (int, optional) – The starting sample index from which to load the signal. Default is 0.

  • channels (int, tuple, or str, optional) – The channels to load from the audio file. Can be an integer specifying a single channel, a tuple specifying multiple channels, or “all” to load all channels. Default is “all”.

Returns:

The Signal object with the loaded audio data.

Return type:

Signal

Raises:

ValueError – If the number of channels in the loaded signal does not match the number of channels in the current Signal object.

Examples

Load a signal from a file starting at the beginning and using all channels:

>>> sig = Signal(2, 1, 48000)
>>> sig.from_file("example.wav")

Load a signal from a file starting at sample index 1000 and using the first channel:

>>> sig = Signal(1, 1, 48000)
>>> sig.from_file("example.wav", start=1000, channels=0)
property fs: int

Sampling rate of the signal in Hz

multiply(x: float | ndarray)

In-place multiplication

This function allows for in-place multiplication.

Parameters:

x (scalar or ndarray) – The value or array to multiply with the signal

Return type:

Returns itself

Examples

>>> sig = audiotoolbox.Signal(1, 1, 48000).add_tone(500).multiply(2)
>>> print(sig.max())
2.0
property n_channels

Number of channels in the signal

property n_samples

Number of samples in the signal

phase_shift(phase: float)

Shifts all frequency components of a signal by a constant phase.

Shift all frequency components of a given signal by a constant phase. This is identical to calling the phase_shift method of the FrequencyDomainSignal class.

Parameters:

phase (scalar) – The phase in rad by which the signal is shifted.

Returns:

Returns itself

Return type:

Signal
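A plain-numpy sketch of a constant phase shift: multiply every positive-frequency bin by \(e^{i\phi}\). A shift of \(\pi/2\) turns a cosine into a minus sine (illustrative; not the audiotoolbox implementation):

```python
import numpy as np

fs, f = 48000, 1000
t = np.arange(fs) / fs
x = np.cos(2 * np.pi * f * t)

phase = np.pi / 2
spec = np.fft.rfft(x)
spec[1:] *= np.exp(1j * phase)    # shift all bins except the (real) DC bin
y = np.fft.irfft(spec, len(x))
# y is (up to rounding) cos(2*pi*f*t + pi/2) = -sin(2*pi*f*t)
```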

play(block: bool = True)

Quick playback of the signal over the default audio output device.

Parameters:

block (bool, optional) – If True, the method will block until playback is finished. If False, playback will be non-blocking and the method will return immediately. Default is True.

rectify()

One-way rectification of the signal.

Returns:

Returns itself

Return type:

Signal

resample(new_fs: int)

Resample the signal to a new sampling rate.

This method uses the resampy library to resample the signal to a new sampling rate. It is based on the band-limited sinc interpolation method for sampling rate conversion as described by Smith (2015).

set_dbfs(dbfs: float)

Full scale normalization of the signal.

Normalizes the signal level to dB full scale. 0 dB FS corresponds to a signal with an RMS of \(\frac{1}{\sqrt{2}}\), so that a tone at 0 dB FS will have an amplitude of 1.

Parameters:

dbfs (float) – The dB full scale value to reach

Returns:

self

Return type:

Signal
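The normalization can be sketched with plain numpy: scale the signal so its RMS equals \(10^{L_{FS}/20}/\sqrt{2}\) (illustration values; not the audiotoolbox implementation):

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs
sig = 0.3 * np.cos(2 * np.pi * 500 * t)       # a tone at some arbitrary level

dbfs = 0.0                                    # target level in dB FS
target_rms = 10 ** (dbfs / 20) / np.sqrt(2)   # 0 dB FS -> RMS of 1/sqrt(2)
sig *= target_rms / np.sqrt((sig ** 2).mean())
# A tone at 0 dB FS now has an amplitude of 1
```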

set_dbspl(dbspl: float)

Set sound pressure level in dB.

Normalizes the signal to a given sound pressure level in dB relative to 20e-6 Pa. For this, the signal is multiplied with the factor \(A\)

\[A = \frac{p_0}{\sigma} 10^{L / 20}\]

where \(L\) is the goal SPL, \(p_0=20\mu Pa\) and \(\sigma\) is the RMS of the signal.

Parameters:

dbspl (float) – The sound pressure level in dB

Returns:

Returns itself

Return type:

Signal
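The scaling factor \(A\) from the equation above, sketched with plain numpy (illustration values; not the audiotoolbox implementation):

```python
import numpy as np

fs, p0 = 48000, 20e-6                 # sampling rate, reference pressure
t = np.arange(fs) / fs
sig = np.cos(2 * np.pi * 500 * t)

L = 70.0                              # target SPL in dB
sigma = np.sqrt((sig ** 2).mean())    # current RMS
sig *= (p0 / sigma) * 10 ** (L / 20)  # A = (p0 / sigma) * 10^(L/20)

# Reading the level back: 20*log10(rms / p0) recovers the target SPL
spl = 20 * np.log10(np.sqrt((sig ** 2).mean()) / p0)
```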

property time

Time vector for the signal.

to_freqdomain()

Convert to frequency domain by applying a DFT.

This function returns a frequency domain representation of the signal.

As opposed to most methods, this conversion is not in-place; instead, a new audiotoolbox.FrequencyDomainSignal() object is returned.

Returns:

The frequency domain representation of the signal

Return type:

FrequencyDomainSignal

trim(t_start: float, t_end: float | None = None)

Trim the signal between two points in time.

Removes samples according to t_start and t_end. This method cannot be applied to a single channel or slice.

Parameters:
  • t_start (float) – Signal time at which the returned signal should start

  • t_end (float or None (optional)) – Signal time at which the signal should stop. The full remaining signal is used if set to None. (default: None)

Returns:

Returns itself

Return type:

Signal

write_file(filename, **kwargs)

Save the signal as an audio file.

This method saves the current signal as an audio file. Additional parameters for the file format can be specified through keyword arguments. The file can be saved in any format supported by libsndfile, such as WAV, FLAC, AIFF, etc.

Parameters:
  • filename (str) – The filename to save the audio file as.

  • **kwargs – Additional keyword arguments to be passed to the audiotoolbox.wav.writefile function. These can include format and subtype.

Return type:

None

Examples

Save the signal to a file named “output.wav”:

>>> sig = Signal(2, 1, 48000)
>>> sig.write_file("output.wav")

Save the signal to a file with a specific format and subtype:

>>> sig = Signal(2, 1, 48000)
>>> sig.write_file("output.wav", format="WAV", subtype="PCM_16")

Save the signal to a FLAC file:

>>> sig = Signal(2, 1, 48000)
>>> sig.write_file("output.flac", format="FLAC")

See also

audiotoolbox.wav.writefile

Function used to write the audio file.

zeropad(number: None | tuple[int, int] = None, duration: None | tuple[float, float] = None)

Add zeros to start and end of signal.

This function adds zeros of a given number or duration to the start or end of a signal.

If number or duration is a scalar, an equal number of zeros will be appended at the front and end of the array. If a vector of two values is given, the first defines the number or duration at the beginning, the second the number or duration of zeros at the end.

Parameters:
  • number (scalar or vector of len(2), optional) – Number of zeros.

  • duration (scalar or vector of len(2), optional) – Duration of zeros in seconds.

Returns:

Returns itself

Return type:

Signal
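The padding behaviour can be sketched with numpy.pad (illustrative; not the audiotoolbox implementation):

```python
import numpy as np

x = np.ones(10)
# number=(5, 3): five zeros at the start, three at the end
padded = np.pad(x, (5, 3))
```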

The Signal.stats sub-module

The Signal.stats submodule gives access to an instance of the audiotoolbox.stats.SignalStats class. e.g.:

>>> sig = audiotoolbox.Signal(1, 1, 48000)
>>> rms = sig.stats.rms
class audiotoolbox.stats.SignalStats(sig)
property crest_factor

Crest factor of the signal (peak value relative to RMS)

See also

audiotoolbox.crest_factor

property dba

A-weighted sound pressure level in dB

property dbc

C-weighted sound pressure level in dB

property dbfs: ndarray

Calculate the dBFS RMS value of a given signal.

\[L = 20 \log_{10}\left(\sqrt{2}\sigma\right)\]

where \(\sigma\) is the signal’s RMS.

property dbspl

Calculate the dB (SPL) values for all channels of the signal.

\[L = 20 \log_{10}\left(\frac{\sigma}{p_o}\right)\]

where \(L\) is the SPL, \(p_0=20\mu Pa\) and \(\sigma\) is the RMS of the signal.

property mean

Arithmetic mean

octave_band_levels(oct_fraction: int = 3) tuple[ndarray, ndarray]

Calculate octave band levels of the signal.

Parameters:

oct_fraction (int, optional) – Fraction of an octave to use, by default 3 for 1/3 octave bands.

Returns:

(frequencies, levels) – Frequencies and corresponding levels in dB full scale (dBFS)

Return type:

tuple[ndarray, ndarray]

property rms

Root mean square.

Returns:

The RMS value

Return type:

float

property var

Variance

The Signal.time_frequency sub-module

The Signal.time_frequency submodule gives access to an instance of the audiotoolbox.time_frequency.TimeFrequency class that provides time-frequency analysis methods such as spectrograms.

class audiotoolbox.time_frequency.TimeFrequency(sig)

Class containing time-frequency analysis methods.

filterbank_specgram(bank: FilterBank, nperseg: int = 1024, noverlap: int = 512, win: str = 'hann') tuple[Signal, np.ndarray]

Calculate the spectrogram of a signal using a specified filter bank.

This function applies a filter bank to the signal and computes the spectrogram.

Parameters:
  • bank (FilterBank) – The filter bank to apply to the signal.

  • nperseg (int, optional) – The number of samples per segment for the spectrogram (default is 1024).

  • noverlap (int, optional) – The number of samples to overlap between segments (default is 512).

  • win (str, optional) – The window function to apply (default is ‘hann’). Can be any valid window function name recognized by scipy.signal.get_window.

Returns:

A tuple containing the spectrogram as an audio Signal in dBFS and the center frequencies of the filter bank.

Return type:

tuple[audio.Signal, np.ndarray]

gammatone_specgram(nperseg: int = 1024, noverlap: int = 512, win: str = 'hann', **kwargs) tuple[Signal, np.ndarray]

Calculate the gammatone spectrogram of a signal.

This function applies a gammatone filter bank to the signal and computes the spectrogram.

Parameters:
  • nperseg (int, optional) – The number of samples per segment for the spectrogram (default is 1024).

  • noverlap (int, optional) – The number of samples to overlap between segments (default is 512).

  • win (str, optional) – The window function to apply (default is ‘hann’). Can be any valid window function name recognized by scipy.signal.get_window.

  • **kwargs (dict, optional) – Additional parameters to pass to the auditory gamma bank function.

Returns:

A tuple containing the spectrogram as an audio Signal in dBFS and the center frequencies of the gammatone filters.

Return type:

tuple[audio.Signal, np.ndarray]

octave_band_specgram(nperseg: int = 1024, noverlap: int = 512, win: str = 'hann', **kwargs) tuple[Signal, np.ndarray]

Calculate the octave band spectrogram of a signal.

This function applies an octave filter bank to the signal and computes the spectrogram.

Parameters:
  • nperseg (int, optional) – The number of samples per segment for the spectrogram (default is 1024).

  • noverlap (int, optional) – The number of samples to overlap between segments (default is 512).

  • win (str, optional) – The window function to apply (default is ‘hann’). Can be any valid window function name recognized by scipy.signal.get_window.

  • **kwargs (dict, optional) – Additional parameters to pass to the octave bank function.

Returns:

A tuple containing the spectrogram as an audio Signal in dBFS and the center frequencies of the octave bands

Return type:

tuple[audio.Signal, np.ndarray]

stft_specgram(nperseg: int = 1024, noverlap: int = 512, win: str = 'hann', **kwargs) tuple[Signal, np.ndarray]

Calculate the Short-Time Fourier Transform (STFT) spectrogram of a signal.

This function computes the STFT of the signal and returns the spectrogram. It is a wrapper around the scipy.signal.spectrogram function.

Parameters:
  • nperseg (int, optional) – The number of samples per segment for the STFT (default is 1024).

  • noverlap (int, optional) – The number of samples to overlap between segments (default is 512).

  • win (str, optional) – The window function to apply (default is ‘hann’). Can be any valid window function name recognized by scipy.signal.get_window.

  • **kwargs (dict, optional) – Additional parameters to pass to the spectrogram function.

Returns:

A tuple containing the spectrogram as an audio Signal in dBFS and the center frequencies of the STFT.

Return type:

tuple[audio.Signal, np.ndarray]