
Wavelet analysis. Part 2
- Tutorial
Introduction
This publication discusses wavelet analysis of time series. The main idea of the wavelet transform corresponds to the specifics of many time series that demonstrate the evolution in time of their main characteristics - the average value, variance, periods, amplitudes and phases of harmonic components. The vast majority of processes studied in various fields of knowledge have the above features.
The purpose of this publication is to describe the method of continuous wavelet transform of time series using the PyWavelets library ..
A bit of history
Geophysicist D. Morlaix in the late 70s of XX century. I encountered the problem of analyzing signals from seismic sensors that contained a high-frequency component (seismic activity) for a short period of time and low-frequency components (calm state of the earth's crust) for a long period. Window Fourier transform allows you to analyze either the high-frequency component or the low-frequency component, but not both components at once.
Therefore, an analysis method was proposed in which the width of the window function for low frequencies increased, and for high frequencies it decreased. The new window transformation was obtained as a result of extension (compression) and time shift of one generating (so-called scaling function - scaling function, scalet) function. This generating function was called the wavelet by D. Morlaix.
Wavelet D. Morlaix
from pylab import*
import scaleogram as scg
axes = scg.plot_wav('cmor1-1.5', figsize=(14,3))
show()

Continuous Wavelet Transform (CWT)
Single-level wavelet transform:
coefs, frequencies = pywt.cwt (data, scales, wavelet)
where:
data: (array_like) -Input signal;
scales (array_like) : - The wavelet scale to use. You can use the ratio f = scale2frequency (scale, wavelet) / sampling_period and determine what physical frequency f is. Here, f in hertz when sampling_period in seconds;
wavelet (Wavelet object or name) : - Wavelet to use;
sampling_period (float): - Sampling period for output frequencies (optional parameter). The values computed for coefs are independent of sampling_period (i.e., scales are not scaled by period sampling.);
coefs(array_like) : - Continuous wavelet - transformation of the input signal for a given scale and wavelet;
frequencies (array_like) : - If the unit of the sampling period is second and is set, then the frequencies are displayed in hertz. Otherwise, a sampling period of 1. is assumed.
Note: The size of the coefficient arrays depends on the length of the input array and the lengths of the specified scales.
Get a list of available cwt compatible wavelet names:
>>> import pywt
>>> wavlist = pywt.wavelist(kind='continuous')
>>> wavlist
['cgau1', 'cgau2', 'cgau3', 'cgau4', 'cgau5', 'cgau6', 'cgau7', 'cgau8', 'cmor', 'fbsp', 'gaus1', 'gaus2', ' gaus3 ',' gaus4 ',' gaus5 ',' gaus6 ',' gaus7 ',' gaus8 ',' mexh ',' morl ',' shan ']
For wavelet analysis, choosing the mother wavelet is one of the key tasks. Therefore, before each wavelet selection, it is simply necessary to use the following program that allows you to determine the dynamic properties of the wavelet:
Listing 1
import numpy as np
import pywt
import matplotlib.pyplot as plt
vav='cmor1.5-1.0'
wav = pywt.ContinuousWavelet(vav)
# вывести диапазон, в котором будет оцениваться вейвлет
print("Непрерывный вейвлет будет оцениваться во всем диапазоне [{}, {}]".format( wav.lower_bound, wav.upper_bound))
width = wav.upper_bound - wav.lower_bound
scales = [1, 2, 3, 4, 10, 15]
max_len = int(np.max(scales)*width + 1)
t = np.arange(max_len)
fig, axes = plt.subplots(len(scales), 2, figsize=(12, 6))
for n, scale in enumerate(scales):
# Следующий код адаптирован из внутренних частей cwt
int_psi, x = pywt.integrate_wavelet(wav, precision=10)
step = x[1] - x[0]
j = np.floor(
np.arange(scale * width + 1) / (scale * step))
if np.max(j) >= np.size(int_psi):
j = np.delete(j, np.where((j >= np.size(int_psi)))[0])
j = j.astype(np.int)
# normalize int_psi для более простого построения
int_psi /= np.abs(int_psi).max()
# дискретные выборки интегрированного вейвлета
filt = int_psi[j][::-1]
# CWT состоит из свертки фильтра с сигналом в этой шкале
# Здесь мы строим это дискретное ядро свертки в каждом масштабе.
nt = len(filt)
t = np.linspace(-nt//2, nt//2, nt)
axes[n, 0].plot(t, filt.real, t, filt.imag)
axes[n, 0].set_xlim([-max_len//2, max_len//2])
axes[n, 0].set_ylim([-1, 1])
axes[n, 0].text(50, 0.35, 'scale = {}'.format(scale))
f = np.linspace(-np.pi, np.pi, max_len)
filt_fft = np.fft.fftshift(np.fft.fft(filt, n=max_len))
filt_fft /= np.abs(filt_fft).max()
axes[n, 1].plot(f, np.abs(filt_fft)**2)
axes[n, 1].set_xlim([-np.pi, np.pi])
axes[n, 1].set_ylim([0, 1])
axes[n, 1].set_xticks([-np.pi, 0, np.pi])
axes[n, 1].set_xticklabels([r'$-\pi$', '0', r'$\pi$'])
axes[n, 1].grid(True, axis='x')
axes[n, 1].text(np.pi/2, 0.5, 'scale = {}'.format(scale))
axes[n, 0].set_xlabel('time (samples)')
axes[n, 1].set_xlabel('frequency (radians)')
axes[0, 0].legend(['real', 'imaginary'], loc='upper left')
axes[0, 1].legend(['Power'], loc='upper left')
axes[0, 0].set_title('filter: wavelet - %s'%vav)
axes[0, 1].set_title(r'|FFT(filter)|$^2$')
plt.show()
Next, we will consider wavelet functions and their properties using the above program:
Mexican hat wavelet "mexh":

Wavelet "morl" Morlaix:
Complex Morlet wavelet "cmorB-C"
where B is the bandwidth; C is the center frequency.
Hereinafter (without special indication) the values of B, C are set with a floating point.

Gaussian wavelets “gausP”
where: P - an integer from 1 to 8 is inserted into the name of the wavelet, for example, "gaus8"; C- built-in normalization constant.

Integrated Gaussian Wavelets “cgauP”
where: P - an integer from 1 to 8 is inserted into the name of the wavelet, for example, "сgaus8" and correspond to the order of the derivative of the wavelet function; C- built-in normalization constant.

Shannon wavelets "shanB-C"
where: B is the bandwidth; C is the center frequency.

CWT in PyWavelets applies to discrete data - convolutions with samples of the wavelet integral. If scale is too small, then we get a discrete filter with inadequate sampling, which leads to smoothing, as shown in the graph for the wavelet 'cmor1.5-1.0'.
In the left column, the graphs show discrete filters used in the convolution for various scales. The right column is the corresponding Fourier power spectra of each filter. With scales 1 and 2 for the graph 'cmor1.5-1.0' it can be seen that smoothing occurs due to violation of the Nyquist constraint. The indicated phenomenon can occur in other wavelets when choosing C and B, therefore, when working with wavelets, you must use the program - Listing 1.
In order to relate a given scale to a specific signal frequency, the signal sampling period must be known. Using the pywt.scale2frequency () function, you can convert the list of scales to the corresponding frequencies. The correct choice of scales depends on the selected wavelet, so pywt.scale2frequency () should be used to get an idea of the corresponding frequency range of the signal.
>>> import pywt
>>> dt = 0.01 # 100 Hz sampling
>>> frequencies = pywt.scale2frequency('cmor1.5-1.0', [1, 2, 3, 4]) / dt
>>> frequencies
array ([100., 50., 33.33333333, 25.])
Wavelet - Time Series Analysis Using CWT PyWavelets
The el-Nino Dataset is a time series dataset used to track El Nino and contains quarterly sea surface temperature measurements from 1871 to 1997. To understand the power of a scale chart, let's visualize it for an el-Nino dataset along with the original time series data and its Fourier transform.
For wavelet analysis of time series, the following steps must be performed:
1. Select the mother wavelet: Select the complex Morlet wavelet “cmorB-C”:
Bandwidth - B and center frequency - From which we will determine in the next step.
2. Determine the center frequency by taking dt = 0.25 for our time series:
import pywt
dt = 0.25 # 4 Hz sampling
scale=range(1,4)
frequencies = pywt.scale2frequency('cmor1.0-0.5', scale) / dt
print(frequencies)
[2. 1. 0.66666667]
3. We find the Fourier transform of the mother wavelet cmor1.0-0.5 (we use Listing 1):
A continuous wavelet will be evaluated over the entire range [-8.0, 8.0]

4. Select the data for the time series:
pd.read_csv("http://paos.colorado.edu/research/wavelets/wave_idl/sst_nino3.dat", sep = "|")
The data are quarterly sea surface temperature measurements from 1871 to 1997. For this data: t0 = 1871 dt = 0.25
5. We build the time series with the signal and its moving average in one graph:
Listing 2
from numpy import *
from scipy import *
import pandas as pd
from pylab import *
import pywt
def get_ave_values(xvalues, yvalues, n = 5):
signal_length = len(xvalues)
if signal_length % n == 0:
padding_length = 0
else:
padding_length = n - signal_length//n % n
xarr = array(xvalues)
yarr = array(yvalues)
xarr.resize(signal_length//n, n)
yarr.resize(signal_length//n, n)
xarr_reshaped = xarr.reshape((-1,n))
yarr_reshaped = yarr.reshape((-1,n))
x_ave = xarr_reshaped[:,0]
y_ave = nanmean(yarr_reshaped, axis=1)
return x_ave, y_ave
def plot_signal_plus_average(time, signal, average_over = 5):
fig, ax = subplots(figsize=(15, 3))
time_ave, signal_ave = get_ave_values(time, signal, average_over)
ax.plot(time, signal, label='Сигнал')
ax.plot(time_ave, signal_ave, label = 'Скользящее среднее сигнала (n={})'.format(5))
ax.set_xlim([time[0], time[-1]])
ax.set_ylabel('Амплитуда сигнала', fontsize=18)
ax.set_title('Сигнал + Скользящее среднее сигнала', fontsize=18)
ax.set_xlabel('Время', fontsize=18)
ax.legend()
show()
df_nino = pd.read_csv("http://paos.colorado.edu/research/wavelets/wave_idl/sst_nino3.dat", sep = "|")
N = df_nino.shape[0]
t0=1871
dt=0.25
time = arange(0, N) * dt + t0
signal = df_nino.values.squeeze()
scales = arange(1, 128)
plot_signal_plus_average(time, signal)

6. We carry out the Fourier transform and spectrum modality from the time series:
Listing 3
from numpy import *
from scipy import *
import pandas as pd
from pylab import *
import pywt
def get_ave_values(xvalues, yvalues, n = 5):
signal_length = len(xvalues)
if signal_length % n == 0:
padding_length = 0
else:
padding_length = n - signal_length//n % n
xarr = array(xvalues)
yarr = array(yvalues)
xarr.resize(signal_length//n, n)
yarr.resize(signal_length//n, n)
xarr_reshaped = xarr.reshape((-1,n))
yarr_reshaped = yarr.reshape((-1,n))
x_ave = xarr_reshaped[:,0]
y_ave = nanmean(yarr_reshaped, axis=1)
return x_ave, y_ave
def get_fft_values(y_values, T, N, f_s):
f_values = linspace(0.0, 1.0/(2.0*T), N//2)
fft_values_ = fft(y_values)
fft_values = 2.0/N * abs(fft_values_[0:N//2])
return f_values, fft_values
def plot_fft_plus_power(time, signal):
dt = time[1] - time[0]
N = len(signal)
fs = 1/dt
fig, ax = subplots(figsize=(15, 3))
variance = std(signal)**2
f_values, fft_values = get_fft_values(signal, dt, N, fs)
fft_power = variance * abs(fft_values) ** 2 # FFT power spectrum
ax.plot(f_values, fft_values, 'r-', label='FFT преобразование')
ax.plot(f_values, fft_power, 'k--', linewidth=1, label='Спектр мощности')
ax.set_xlabel('Частота[Герц / год]', fontsize=18)
ax.set_ylabel('Амплитуда', fontsize=18)
ax.legend()
show()
df_nino = pd.read_csv("http://paos.colorado.edu/research/wavelets/wave_idl/sst_nino3.dat", sep = "|")
N = df_nino.shape[0]
t0=1871
dt=0.25
time = arange(0, N) * dt + t0
signal = df_nino.values.squeeze()
scales = arange(1, 128)
plot_fft_plus_power(time, signal)

7. We determine the scales: scales = arange (1, 128); levels = [2 ** - 4, 2 ** - 3, 2 ** - 2, 2 ** - 1, 2 ** 0, 2 ** 1, 2 ** 2, 2 ** 3].
8. Build a scale chart:
Listing 4
from numpy import *
import pandas as pd
from pylab import *
import pywt
def plot_wavelet(time, signal, scales,
waveletname = 'cmor1.0-0.4',
cmap = plt.cm.seismic,
title = 'Вейвлет-преобразование(Спектр мощности) сигнала',
ylabel = 'Период (год)',
xlabel = 'Время'):
dt = time[1] - time[0]
[coefficients, frequencies] = pywt.cwt(signal, scales, waveletname, dt)
power = (abs(coefficients)) ** 2
period = 1. / frequencies
levels = [2**-4 , 2**-3 , 2**-2 , 2**-1 , 2**0 , 2**1 , 2**2 , 2**3]
contourlevels = log2(levels)
fig, ax = subplots(figsize=(15, 10))
im = ax.contourf(time, log2(period), log2(power), contourlevels, extend='both',cmap=cmap)
ax.set_title(title, fontsize=20)
ax.set_ylabel(ylabel, fontsize=18)
ax.set_xlabel(xlabel, fontsize=18)
yticks = 2**arange(np.ceil(log2(period.min())), ceil(log2(period.max())))
ax.set_yticks(log2(yticks))
ax.set_yticklabels(yticks)
ax.invert_yaxis()
ylim = ax.get_ylim()
ax.set_ylim(ylim[0], -1)
cbar_ax = fig.add_axes([0.95, 0.5, 0.03, 0.25])
fig.colorbar(im, cax=cbar_ax, orientation="vertical")
show()
df_nino = pd.read_csv("http://paos.colorado.edu/research/wavelets/wave_idl/sst_nino3.dat", sep = "|")
N = df_nino.shape[0]
t0=1871
dt=0.25
time = arange(0, N) * dt + t0
signal = df_nino.values.squeeze()
scales = arange(1, 128)
plot_wavelet(time, signal, scales)

The scale chart shows that most of the spectrum power is concentrated over a period of 2-8 years, this corresponds to a frequency of 0.125 - 0.5 Hz. In the Fourier spectrum, fashion also concentrates around these frequency values. However, the wavelet transform also gives us temporary information, but the Fourier transform does not.
For example, on the scale diagram, we see that before 1920 there were many fluctuations, while between 1960 and 1990 there were not so many. We can also see that over time there is a shift from shorter to longer periods.
Using the scaleogram module
Scaleogram is a convenient tool for analyzing 1D data with continuous wavelet transform based on the PyWavelets library. The module is designed to provide a reliable tool for quick data analysis.
Scaleogram has the following features:
- simple beginner call signature
- readable axes and pure matplotlib integration
- many options for zooming, spectrum filter, colorbar implementation, etc ...
- support for frequency and frequency units in accordance with the marking
- speed, uses the N * log (N) algorithm for transformations
- portability: tested with python2.7 and python3.7
- detailed error messages and documentation with examples
However, in the usage examples, the data access is not correctly recorded:
nino3 = "http://paos.colorado.edu/research/wavelets/wave_idl/sst_nino3.dat"
The following warning appears:
nino3 = pd.read_table (nino3)
which leads to the warning:
Warning (from warnings module):
File “C: /Users/User/Desktop/wavelet/pr.py”, line 7
nino3 = pd. read_table (nino3)
FutureWarning: read_table is deprecated, use read_csv instead, passing sep = '\ t'
The data access should be written like this:
url="http://paos.colorado.edu/research/wavelets/wave_idl/sst_nino3.dat"
nino3 = pd.read_csv(url, sep = "|")
To show the use of the scalogram and compare the results with the results of the previous example, we use the same data - on quarterly measurements of the sea surface temperature from 1871 to 1997. For this data: t0 = 1871 dt = 0.25
Listing 4
import pandas as pd
import pywt
from numpy import*
import scaleogram as scg
from pylab import*
url="sst_nino3.dat"
nino3 = pd.read_csv(url, sep = "|")
data = nino3.values.squeeze()
N = data.size
t0 = 1871; dt = 0.25
time = t0 + arange(len(data))*dt
wavelet = 'cmor1-0.5'
ax = scg.cws(time, data, scales=arange(1, 128), wavelet=wavelet,
figsize=(14, 3), cmap="jet", cbar=None, ylabel='Период (год)', xlabel="Время [Год]", yscale="log",
title='Вейвлет-преобразование временного ряда\n(спектр мощности)')
ticks = ax.set_yticks([2,4,8, 16,32])
ticks = ax.set_yticklabels([2,4,8, 16,32])
show()

If we compare the scale chart with the resulting scan, then, with the exception of the color palette, they are identical, and therefore show the same dynamics in the time series.
Listing 4 is simpler than listing 3 and has better performance.
When the frequency spectrum and the power spectrum according to Fourier do not allow obtaining additional information on the dynamics of the time series, then the use of a scalogram is preferable.
conclusions
An example of wavelet analysis of a time series by means of the PyWavelets library is given.