This function generates synthetic time series data from a Vector Autoregressive (VAR) model.
Arguments
- time
Integer. Number of time points to simulate.
- burn_in
Integer. Number of burn-in observations to exclude before returning the results.
- constant
Numeric vector. The constant term vector of length k, where k is the number of variables.
- coef
Numeric matrix. Coefficient matrix with dimensions k by (k * p). Each k by k block corresponds to the coefficient matrix for a particular lag.
- chol_cov
Numeric matrix. The Cholesky decomposition of the covariance matrix of the multivariate normal noise. It should have dimensions k by k.
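As a hedged illustration of how these arguments fit together, the sketch below builds coef by binding per-lag k by k blocks side by side and obtains chol_cov from a covariance matrix with base R's chol(). The names coef_lag1, coef_lag2, and sigma are hypothetical, and placing the lag-1 block first is an assumption that this page does not confirm.

```r
k <- 3 # number of variables
p <- 2 # number of lags

# Hypothetical per-lag coefficient blocks, each k by k.
coef_lag1 <- diag(c(0.4, 0.5, 0.6))
coef_lag2 <- diag(c(0.1, 0.2, 0.3))

# Bind the blocks column-wise to get a k by (k * p) matrix.
# Assumption: the lag-1 block comes first.
coef <- cbind(coef_lag1, coef_lag2)

# Hypothetical noise covariance matrix (symmetric, positive definite).
sigma <- matrix(
  data = c(
    1.0, 0.2, 0.0,
    0.2, 1.0, 0.2,
    0.0, 0.2, 1.0
  ),
  nrow = k,
  byrow = TRUE
)

# chol() returns the upper-triangular factor R with t(R) %*% R equal to sigma.
chol_cov <- chol(sigma)

# Constant term vector of length k.
constant <- rep(1, times = k)
```

With these values, coef equals the matrix used in the Examples section. Whether SimVAR() expects the upper or the lower triangular factor is not stated on this page; the Examples section passes the result of chol() directly.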
Value
Numeric matrix containing the simulated time series data with dimensions time by k, where time is the number of observations (rows) and k is the number of variables (columns).
Details
The SimVAR() function generates synthetic time series data from a Vector Autoregressive (VAR) model. The VAR model is defined by the constant term constant, the coefficient matrix coef, and the Cholesky decomposition of the covariance matrix of the multivariate normal process noise chol_cov. The generated time series data follows a VAR(p) process, where p is the number of lags specified by the size of coef. The generated data includes a burn-in period, which is excluded before returning the results.
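In matrix form, the VAR(p) process described above can be written as follows. The symbols are assumptions introduced here for illustration: \(\mathrm{coef}^{(l)}\) denotes the k by k block of coef for lag \(l\), and \(\boldsymbol{\Sigma}\) denotes the noise covariance matrix of which chol_cov is a Cholesky factor. $$ \mathbf{Y}_{t} = \mathbf{constant} + \sum_{l = 1}^{p} \mathrm{coef}^{(l)} \, \mathbf{Y}_{t - l} + \mathbf{noise}_{t}, \qquad \mathbf{noise}_{t} \sim \mathcal{N} \left( \mathbf{0}, \boldsymbol{\Sigma} \right) $$ Here \(\mathbf{Y}_{t}\) is the vector of length k holding all variables at time \(t\).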
The steps involved in generating the VAR time series data are as follows:
- Extract the number of variables k and the number of lags p from the input.
- Create a matrix data of size k by (time + burn_in) to store the generated VAR time series data.
- Set the initial values of the matrix data using the constant term constant.
- For each time point starting from the p-th time point to time + burn_in - 1:
  - Generate a vector of random noise from a multivariate normal distribution with mean 0 and the covariance matrix whose Cholesky decomposition is chol_cov.
  - Generate the VAR time series values for each variable j at time t using the formula: $$ Y_{t,j} = \mathrm{constant}_j + \sum_{l = 1}^{p} \sum_{m = 1}^{k} \mathrm{coef}^{(l)}_{jm} \, Y_{t - l, m} + \mathrm{noise}_{j} $$ where \(Y_{t,j}\) is the value of variable j at time t, \(\mathrm{constant}_j\) is the constant term for variable j, \(\mathrm{coef}^{(l)}_{jm}\) is the coefficient on variable m at lag l in the equation for variable j (row j, column m of the k by k block of coef for lag l), \(Y_{t - l, m}\) is the value of variable m at time t - l, and \(\mathrm{noise}_{j}\) is element j of the generated vector of random process noise.
- Transpose the matrix data and return only the required time period after the burn-in period, which is from column burn_in to column time + burn_in - 1 (zero-based indices).
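The steps above can be sketched in plain R. This is a minimal illustration under stated assumptions, not the code that SimVAR() actually runs: the function name sim_var_sketch is hypothetical, chol_cov is assumed to be the upper-triangular factor returned by chol(), and the first k columns of coef are assumed to hold the lag-1 coefficients.

```r
# Illustrative pure-R sketch of the documented steps; not the package's code.
sim_var_sketch <- function(time, burn_in, constant, coef, chol_cov) {
  k <- nrow(coef) # number of variables
  stopifnot(ncol(coef) %% k == 0, length(constant) == k)
  p <- ncol(coef) / k # number of lags
  total <- time + burn_in

  # k by (time + burn_in) matrix; every column starts at the constant term.
  data <- matrix(data = constant, nrow = k, ncol = total)

  # R is 1-indexed, so the first column with p predecessors is column p + 1.
  for (time_index in seq_len(total)) {
    if (time_index <= p) next
    # Noise with covariance t(chol_cov) %*% chol_cov
    # (assumption: chol_cov is the upper factor returned by chol()).
    noise <- drop(crossprod(chol_cov, rnorm(k)))
    y_t <- constant + noise
    for (l in seq_len(p)) {
      # Assumption: block l of coef holds the lag-l coefficients.
      cols <- ((l - 1) * k + 1):(l * k)
      y_t <- y_t + drop(coef[, cols, drop = FALSE] %*% data[, time_index - l])
    }
    data[, time_index] <- y_t
  }

  # Drop the burn-in columns and transpose to time by k.
  t(data[, (burn_in + 1):total, drop = FALSE])
}

# Usage with the same inputs as the Examples section.
set.seed(42)
y_sketch <- sim_var_sketch(
  time = 50L,
  burn_in = 10L,
  constant = c(1, 1, 1),
  coef = cbind(diag(c(0.4, 0.5, 0.6)), diag(c(0.1, 0.2, 0.3))),
  chol_cov = chol(diag(3))
)
dim(y_sketch) # 50 rows (time points) by 3 columns (variables)
```

The values need not match those from SimVAR() under the same seed, since the package may draw its random numbers differently.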
Examples
set.seed(42)
time <- 50L
burn_in <- 10L
k <- 3
p <- 2
constant <- c(1, 1, 1)
coef <- matrix(
  data = c(
    0.4, 0.0, 0.0, 0.1, 0.0, 0.0,
    0.0, 0.5, 0.0, 0.0, 0.2, 0.0,
    0.0, 0.0, 0.6, 0.0, 0.0, 0.3
  ),
  nrow = k,
  byrow = TRUE
)
chol_cov <- chol(diag(3))
y <- SimVAR(
  time = time,
  burn_in = burn_in,
  constant = constant,
  coef = coef,
  chol_cov = chol_cov
)
head(y)
#>           [,1]     [,2]     [,3]
#> [1,] 1.6604813 4.407838 7.473852
#> [2,] 0.2947893 2.772433 7.051225
#> [3,] 1.7966908 2.317002 5.759385
#> [4,] 3.2784861 3.105018 6.957684
#> [5,] 1.9508525 3.791006 7.690658
#> [6,] 0.6096591 3.407655 6.828098