## Abstract

Many time series are considered to be a superposition of several oscillation components. We have proposed a method for decomposing univariate time series into oscillation components and estimating their phases (Matsuda & Komaki, 2017). In this study, we extend that method to multivariate time series. We assume that several oscillators underlie the given multivariate time series and that each variable corresponds to a superposition of the projections of the oscillators. Thus, the oscillators superpose on each variable with amplitude and phase modulation. Based on this idea, we develop gaussian linear state-space models and use them to decompose the given multivariate time series. The model parameters are estimated from data using the empirical Bayes method, and the number of oscillators is determined using the Akaike information criterion. Therefore, the proposed method extracts underlying oscillators in a data-driven manner and enables investigation of phase dynamics in a given multivariate time series. Numerical results show the effectiveness of the proposed method. From monthly mean north/south sunspot number data, the proposed method reveals an interesting phase relationship.

## 1 Introduction

Many time series are considered to be a superposition of several oscillation components. For example, electroencephalogram (EEG) time series are composed of several oscillation components such as alpha, beta, and gamma. Each oscillation component has its own amplitude, frequency, and phase. We have proposed a method for decomposing univariate time series into oscillation components and estimating their phases (Matsuda & Komaki, 2017). Our method is based on gaussian linear state-space models that represent oscillators with random frequency fluctuations and accomplish a natural decomposition of the given time series in a data-driven manner.

Recently, in various fields such as neuroscience, astronomy, and seismology, time series data have been obtained in multivariate forms like array signals (Johnson & Dudgeon, 1993). Of course, we can decompose each variable in the multivariate time series using our method (Matsuda & Komaki, 2017). However, it would provide more insight if we could directly decompose a multivariate time series. For example, if the oscillatory activity of one neural population is detected by several electrodes in EEG, we should identify the corresponding oscillation components on each electrode, which is impossible using our univariate method.

In this study, we extend our method (Matsuda & Komaki, 2017) to multivariate time series. The proposed method is based on a gaussian linear state-space model. Our model assumes that several oscillators underlie the given multivariate time series and that each variable corresponds to a superposition of the projections of the oscillators onto certain vectors called projection vectors. Therefore, the oscillators superpose on each variable with amplitude and phase modulation. The degrees of amplitude and phase modulation are determined by the length and direction of the projection vector, respectively. This idea naturally derives from an observation on the vector autoregressive (VAR) model, which is discussed in section 3.2. Parameters of the state-space model are estimated from data using the empirical Bayes method, and the number of oscillation components is determined using the Akaike information criterion (AIC; Akaike, 1980). Thus, the proposed method provides a natural decomposition of the given time series into oscillation components. In other words, the proposed method extracts underlying oscillators in a data-driven manner and enables investigation of the phase dynamics of a given multivariate time series. Numerical results show the effectiveness of the proposed method. When applied to monthly mean north/south sunspot number data, our proposed method reveals an interesting phase relationship.
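To make this modeling idea concrete, the following sketch simulates stochastic oscillators as damped, noise-driven 2-D rotations and superposes their projections onto each observed channel. All function names, parameter values, and defaults here are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def simulate_oscillators(freqs, damps, proj, T, dt=1.0,
                         sigma_v=0.1, sigma_w=0.05, seed=0):
    """Simulate K noise-driven oscillators observed through projections.

    Each oscillator k is a 2-D coordinate rotated by 2*pi*freqs[k]*dt and
    damped by damps[k] at every step (the system model); channel j then
    observes the sum over k of the inner product of oscillator k's
    coordinate with the projection vector proj[j, k] (the observation
    model).  All names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    K, J = len(freqs), proj.shape[0]
    x = np.zeros((K, 2))                 # oscillator coordinates
    y = np.zeros((T, J))                 # observed channels
    for t in range(T):
        for k in range(K):
            th = 2 * np.pi * freqs[k] * dt
            R = np.array([[np.cos(th), -np.sin(th)],
                          [np.sin(th),  np.cos(th)]])
            x[k] = damps[k] * R @ x[k] + sigma_v * rng.standard_normal(2)
        y[t] = np.array([sum(proj[j, k] @ x[k] for k in range(K))
                         for j in range(J)])
        y[t] += sigma_w * rng.standard_normal(J)
    return y

# two oscillators observed on two channels; proj[j, k] is a 2-vector
proj = np.array([[[1.0, 0.0], [0.5, 0.5]],
                 [[0.0, 1.0], [0.3, -0.3]]])
y = simulate_oscillators([0.05, 0.11], [0.99, 0.97], proj, T=500)
```

Because each projection vector has its own length and direction, the same oscillator appears on different channels with different amplitudes and phase delays, which is exactly the structure the proposed decomposition recovers.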

We first briefly review our method (Matsuda & Komaki, 2017) for univariate time series in section 2. Then, we develop the oscillator model for multivariate time series in section 3. Based on this model, we explain the details of the proposed method in section 4 and present numerical experiment results in section 5. We apply our proposed method to monthly mean north/south sunspot number data in section 6 and give concluding remarks in section 7. Matlab code for our proposed method is available online at http://www.stat.t.u-tokyo.ac.jp/~t-matsuda/index.en.html.

## 2 Existing Method for Univariate Time Series

Based on models 2.1 and 2.2, we (Matsuda & Komaki, 2017) proposed a method for decomposing a given univariate time series into oscillation components and estimating their phases. The model parameters are estimated from data using the empirical Bayes method, and the number of oscillation components is determined using the AIC. Thus, this method accomplishes a natural decomposition of the given time series into oscillation components.

Application of this method to real data showed interesting results. For example, this method detected six oscillation components in the hippocampal local field potential, with two of them belonging to the theta rhythm, whereas bandpass filtering with the theta band cannot separate these oscillators. (See Matsuda & Komaki, 2017, for details.)

## 3 Oscillator Model for Multivariate Time Series

### 3.1 Definition

The observation model is defined as in equation 3.2 in order to ensure parameter identifiability. From the observation model 3.3, each observed variable is a sum of inner products between the oscillator coordinates and fixed vectors. Namely, the effect of each oscillator is superposed on each time series with its amplitude scaled by the length of the corresponding vector and its phase delayed according to the direction of that vector. We call each of these vectors a projection vector in the rest of this letter.

We set the covariance matrix of the observation noise to a scalar matrix in equation 3.4. In other words, we assume that the observation noise in each variable is independent and has the same variance. This assumption is plausible for many multivariate time series. For example, array signals such as EEG or seismic waves satisfy this assumption because each sensor is equivalent (Johnson & Dudgeon, 1993). As we explain in section 4.2, this assumption enables efficient parameter estimation.
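The amplitude and phase modulation induced by a projection vector can be verified directly: projecting the unit oscillator coordinate (cos θ, sin θ) onto a vector of length r and direction φ yields r cos(θ − φ). A minimal check, with illustrative values for r and φ:

```python
import numpy as np

theta = np.linspace(0, 4 * np.pi, 400)        # oscillator phase over time
x = np.stack([np.cos(theta), np.sin(theta)])  # unit oscillator coordinate
r, phi = 2.0, np.pi / 3                       # length and direction (illustrative)
b = r * np.array([np.cos(phi), np.sin(phi)])  # projection vector
s = b @ x                                     # contribution to one channel
# b . x = r*cos(theta - phi): amplitude scaled by ||b||, phase delayed by phi
assert np.allclose(s, r * np.cos(theta - phi))
```

So the length of the projection vector sets the amplitude gain and its angle sets the phase delay, as stated above.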

### 3.2 Oscillator Interpretation of the VAR Model

The VAR model is a fundamental model for multivariate time series (Hannan, 1970; Reinsel, 1993). As our state-space model in Matsuda and Komaki (2017) was motivated by the oscillator interpretation of the AR model, models 3.1, 3.2, and 3.3 are motivated by the fact that a VAR process of arbitrary order can be interpreted as a superposition of several oscillation components. We explain this in the following.

Note that we can take the polynomial eigenvectors to satisfy the conjugacy relation, where the conjugate of a complex vector is taken componentwise. The matrix formed from these polynomial eigenvectors is nonsingular because equation 3.8 has no multiple roots and all polynomial eigenvalues are nonzero.

## 4 Proposed Method

### 4.1 Decomposition into Oscillation Components and Phase Estimation

Based on models 3.1, 3.2, and 3.3, our proposed method decomposes a given multivariate time series into oscillation components and estimates the phase of each component. We accomplish the decomposition using an approach similar to that of Matsuda and Komaki (2017). We explain the model selection and parameter estimation in section 4.2.

### 4.2 Parameter Estimation and Model Selection

We need to determine the parameters of models 3.1, 3.2, and 3.3 and the number of oscillators before decomposing a given time series. We estimate the parameters using the empirical Bayes method and select the number of oscillators using the AIC (Akaike, 1980).

We maximize equation 4.7 with respect to the parameters using the quasi-Newton method. Since equation 4.7 is not concave in general, the initial parameter values must be carefully determined. The method for setting initial values, based on the oscillator interpretation of VAR models discussed in section 3.2, is detailed in appendixes A and B.
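To make the estimation step concrete, here is a minimal sketch of maximizing a gaussian state-space log likelihood with a quasi-Newton method, using a single-oscillator univariate model as a stand-in for equation 4.7. The parameterization, model, and initial values are illustrative assumptions, not the paper's exact likelihood:

```python
import numpy as np
from scipy.optimize import minimize

def kalman_loglik(params, y):
    """Gaussian log likelihood of a one-oscillator state-space model.

    params = (logit of damping, angular frequency, log state-noise
    variance, log observation-noise variance); an illustrative stand-in
    for the paper's equation 4.7, not its exact form.
    """
    a = 1 / (1 + np.exp(-params[0]))              # damping in (0, 1)
    om = params[1]                                # angular frequency
    q, r = np.exp(params[2]), np.exp(params[3])   # noise variances
    F = a * np.array([[np.cos(om), -np.sin(om)],
                      [np.sin(om),  np.cos(om)]])
    H = np.array([[1.0, 0.0]])
    x, P = np.zeros(2), np.eye(2)
    ll = 0.0
    for yt in y:
        x, P = F @ x, F @ P @ F.T + q * np.eye(2)   # predict
        S = (H @ P @ H.T).item() + r                # innovation variance
        e = yt - (H @ x).item()                     # innovation
        ll += -0.5 * (np.log(2 * np.pi * S) + e * e / S)
        K = (P @ H.T).ravel() / S                   # Kalman gain
        x, P = x + K * e, P - np.outer(K, H @ P)    # update
    return ll

def fit(y, init):
    # quasi-Newton maximization of the (generally nonconcave) likelihood
    res = minimize(lambda p: -kalman_loglik(p, y), init, method="L-BFGS-B")
    return res.x, -res.fun

# toy data: a noisy sinusoid; the initial values below are illustrative
rng = np.random.default_rng(0)
y = np.sin(0.3 * np.arange(200)) + 0.1 * rng.standard_normal(200)
params, ll = fit(y, np.array([2.0, 0.3, -2.0, -2.0]))
```

Because the likelihood surface is nonconcave, the optimizer converges only to a local maximum near its starting point, which is why the initial values (appendix B) matter.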

## 5 Numerical Experiments

| Number of oscillators | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| AIC | 1578 | 1083 | 981 | 991 | 1001 |

We generated simulated data several times and checked whether the above results were reproduced. Although the AIC almost always attained its minimum at three oscillators, there were a few cases where it attained its minimum at four. However, in such cases, three of the oscillators had qualitatively similar behavior to those obtained with three oscillators, and the fourth oscillator was negligible in magnitude. This is compatible with the general properties of model selection with AIC.
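The selection rule above can be sketched in a few lines: fit the model for each candidate number of oscillators and keep the one minimizing the AIC. Using the AIC values from the table:

```python
def aic(loglik, n_params):
    # AIC = -2 * (maximum log likelihood) + 2 * (number of free parameters)
    return -2.0 * loglik + 2.0 * n_params

# AIC values from the table, indexed by the number of oscillators
aics = {1: 1578, 2: 1083, 3: 981, 4: 991, 5: 1001}
best = min(aics, key=aics.get)   # -> 3
```

The minimum at three oscillators matches the number of oscillators used to generate the simulated data.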

In principle, oscillators with close frequencies can be detected and separated by the proposed method if the data length is sufficiently large. Empirically, when the sampling frequency is 200 Hz, two oscillators with frequencies of 20 Hz and 21 Hz are detectable if the data length is more than 200 samples (1 second). In models 3.1, 3.2, and 3.3, one parameter describes the degree of fluctuation in the frequency of each oscillator. If this parameter differs considerably between oscillators with close frequencies, then their phase dynamics also become quite different, which helps in detecting and separating them.

## 6 Application to Real Data

In this section, we apply the proposed method to monthly mean north/south sunspot number data available from the ICSU World Data System (http://sidc.be/silso/datafiles). These data are a simple arithmetic mean of daily north/south sunspot numbers over each calendar month. Each sunspot group is counted in either the Northern or Southern Hemisphere based on its heliographic latitude. Here, we use data for the period January 1992 through August 2016. Therefore, the dimension is 2, the sampling period is 1/12 year, and the data length is 296 samples. We log-transformed and centered the original data. The obtained data are presented in Figure 3. Although the two time series look similar, we can see that the south sunspot numbers have a slight delay relative to the north sunspot numbers (Atac & Ozguc, 1996).
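The preprocessing step can be sketched as follows. The +1 offset before the logarithm is our assumption to handle months with zero sunspots, since the text states only "log-transformed and centered" without the exact transform:

```python
import numpy as np

def preprocess(counts):
    """Log-transform and center monthly sunspot counts (rows = months).

    The +1 offset before the logarithm is an assumption to handle months
    with zero sunspots; the exact transform is not stated in the text.
    """
    z = np.log(counts + 1.0)
    return z - z.mean(axis=0)      # center each hemisphere's series

monthly = np.array([[0.0, 3.0], [7.0, 1.0], [2.0, 2.0]])  # toy counts
z = preprocess(monthly)
```

Centering each series separately removes the per-hemisphere mean so that the state-space model needs no constant term.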

Figure 4 shows the estimated oscillator coordinates, and Figure 5 shows the phases of the six oscillators. The number of sunspots is known to oscillate with a period of approximately 11 years. We (Matsuda & Komaki, 2017) also detected such an oscillation component in the Wolfer sunspot data from Tong (1990). Here, the period of the first oscillator corresponds well with this known value. The other oscillators have almost negligible magnitude, so they may be interpreted as noise, in line with the general properties of model selection with AIC.

Several studies suggest that the phase difference between the north/south sunspot numbers varies over time (Deng, Qu, Yan, & Wang, 2013; Zolotova, Ponyavin, Arlt, & Tuominen, 2010). We examine this hypothesis here. We applied our method from Matsuda and Komaki (2017) to the north/south sunspot numbers individually. This procedure implicitly assumes that the phases of the north/south sunspot numbers develop independently. As a result, the north sunspot number was decomposed into four components, and the south sunspot number was decomposed into two components. The AIC of the jointly fitted models 3.1, 3.2, and 3.3 in the above analysis was smaller than the total AIC of these separate fits. This result implies that the development of the north/south sunspot numbers originates from a common oscillatory activity in the sun rather than from independent activities. In other words, the phase difference between the north/south sunspot numbers is considered to be constant.

## 7 Conclusion

In this study, we extended our method in Matsuda and Komaki (2017) to multivariate time series. The proposed method is based on models 3.1, 3.2, and 3.3, which assume that several oscillators underlie the given multivariate time series and that each variable is a superposition of the projections of the oscillators onto projection vectors. In other words, the oscillators superpose on each variable with amplitude and phase modulation. Using the empirical Bayes method and AIC, our proposed method extracts underlying oscillators from the given multivariate time series in a data-driven manner and estimates their phases. The proposed method revealed an interesting phase relationship from monthly mean north/south sunspot number data.

We note that Lainscsek and Sejnowski (2015) recently proposed the delay differential analysis method, which is based on embedding theory and detects frequencies, frequency couplings, and phases using nonlinear correlation functions. Lainscsek, Hernandez, Poizner, and Sejnowski (2015) applied this method to electroencephalographic data. That method implicitly assumes that a given time series is a superposition of sinusoidal signals, which are free from fluctuation in frequency. Our method assumes that a given time series is a superposition of oscillation components with fluctuation in frequency.

Future research may extract neural oscillators of various frequencies from neural time series and explore their physiological significance. It is well known that frequencies can vary systematically across channels in brain signals. For such data, it may provide some insight to compare the results of the proposed method with those obtained by applying our method in Matsuda and Komaki (2017) to each channel. Also, we can handle the temporal variation in the amplitude or frequency of each oscillator by extending the system model, 3.1, to have time-varying parameters. For example, linear up-chirps can be described by letting the frequency parameter increase linearly with time. Finally, it may be useful to consider the temporal change of amplitude and phase modulation by extending the observation model, equation 3.3, to have projection vectors that vary with time. This modification would enable us to detect epochs in which each oscillator contributes significantly to each channel by estimating the norms of the projection vectors over time.

## Appendix A: Estimation of VAR Models with Observation Noise

## Appendix B: Initial Values of Parameters in Model Fitting

Because the log likelihood, equation 4.7, is not concave in general, the initial parameter values must be carefully chosen. We use the interpretation of a VAR process as a superposition of oscillation components, as discussed in section 3.2.

Suppose we want to fit models 3.1, 3.2, and 3.3 with a given number of oscillation components to a given time series. First, for each candidate order, we fit the VAR model with observation noise, equations A.1 and A.2, using the method described in appendix A, and obtain the characteristic roots and the noise variance of the fitted VAR model, equation A.1. Here, some of the characteristic roots are complex and the others are real. Therefore, from section 3.2, the fitted VAR model is a superposition of oscillators. Note that the moduli of the characteristic roots are all smaller than one because the fitted VAR model is stationary. Among the fitted VAR models, we select the one with the minimum AIC; if it does not provide the required number of oscillators, we instead select the model with the minimum AIC among those that do. In the latter case, among the characteristic roots with nonnegative imaginary parts, we take those with the largest stationary variances, equation 3.16. In both cases, the estimated variance of the observation noise in models A.1 and A.2 serves as the initial value of the observation noise variance.
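The core of this recipe, recovering candidate oscillator damping factors and frequencies from the characteristic roots of a fitted autoregressive model, can be sketched for the univariate case as follows. The function name and the AR(2) example are illustrative, not the paper's code:

```python
import numpy as np

def initial_oscillators_from_ar(coefs, dt=1.0):
    """Candidate (damping, frequency) pairs from fitted AR coefficients.

    Roots z of the AR characteristic polynomial give one candidate
    oscillator per complex-conjugate pair: |z| is the damping factor and
    arg(z) / (2*pi*dt) the frequency.  A hypothetical helper mirroring
    the appendix B recipe for the univariate case.
    """
    # characteristic polynomial z^p - c1 z^(p-1) - ... - cp
    roots = np.roots(np.concatenate(([1.0], -np.asarray(coefs))))
    pairs = [z for z in roots if z.imag > 0]   # one root per conjugate pair
    return [(abs(z), np.angle(z) / (2 * np.pi * dt)) for z in pairs]

# AR(2) with a single damped oscillation: damping 0.95, frequency 0.1
a, f = 0.95, 0.1
coefs = [2 * a * np.cos(2 * np.pi * f), -a * a]
params = initial_oscillators_from_ar(coefs)
```

Because the fitted model is stationary, every recovered damping factor is below one, matching the note above.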

## Acknowledgments

We are grateful to the referee for valuable comments. This work was supported by MEXT KAKENHI, grant 16H06533, and JSPS KAKENHI, grants 26280005 and 14J09148.