This function generates synthetic time series data from a Vector Autoregressive (VAR) model.
Arguments
- time
Integer. Number of time points to simulate.
- burn_in
Integer. Number of burn-in observations to exclude before returning the results.
- constant
Numeric vector. The constant term vector of length k, where k is the number of variables.
- coef
Numeric matrix. Coefficient matrix with dimensions k by (k * p). Each k by k block corresponds to the coefficient matrix for a particular lag.
- chol_cov
Numeric matrix. The Cholesky decomposition of the covariance matrix of the multivariate normal noise. It should have dimensions k by k.
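To illustrate the expected shapes, the following sketch assembles the inputs from their parts. The names lag_1, lag_2, and sigma are hypothetical and are introduced only for this illustration. It assumes that the first k by k block of coef holds the lag-1 coefficients, which the argument description does not state explicitly.

```r
k <- 3 # number of variables
p <- 2 # number of lags

# Hypothetical per-lag coefficient matrices, each k by k.
lag_1 <- diag(c(0.4, 0.5, 0.6))
lag_2 <- diag(c(0.1, 0.2, 0.3))

# Place the blocks side by side to obtain a k by (k * p) matrix.
# Assumption: the first block is the lag-1 matrix.
coef <- cbind(lag_1, lag_2)

# Constant term vector of length k.
constant <- rep(1, times = k)

# Cholesky decomposition of a k by k covariance matrix (here the identity).
sigma <- diag(k)
chol_cov <- chol(sigma)

dim(coef)     # k by (k * p)
dim(chol_cov) # k by k
```

The resulting coef has the same entries as the matrix built with byrow = TRUE in the Examples section.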
Value
Numeric matrix containing the simulated time series data
with dimensions time by k,
where time is the number of observations and
k is the number of variables.
Details
The SimVAR() function generates synthetic time series data
from a Vector Autoregressive (VAR) model.
The VAR model is defined by the constant term constant,
the coefficient matrix coef,
and the Cholesky decomposition of the covariance matrix
of the multivariate normal process noise chol_cov.
The generated time series data follows a VAR(p) process,
where p is the number of lags, determined by the number of columns of coef (k * p).
The generated data includes a burn-in period,
which is excluded before returning the results.
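In matrix notation, the process described above can be sketched as follows. The symbols \(\mathbf{c}\), \(\mathbf{A}_l\), \(\boldsymbol{\varepsilon}_t\), and \(\boldsymbol{\Sigma}\) are introduced here for illustration only: \(\mathbf{c}\) stands for constant, \(\mathbf{A}_l\) for the k by k block of coef belonging to lag l, and \(\boldsymbol{\Sigma}\) for the noise covariance matrix whose Cholesky decomposition is chol_cov.

$$ \mathbf{y}_t = \mathbf{c} + \sum_{l = 1}^{p} \mathbf{A}_l \mathbf{y}_{t - l} + \boldsymbol{\varepsilon}_t, \qquad \boldsymbol{\varepsilon}_t \sim \mathcal{N} \left( \mathbf{0}, \boldsymbol{\Sigma} \right) $$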
The steps involved in generating the VAR time series data are as follows:
1. Extract the number of variables k and the number of lags p from the input.
2. Create a matrix data of size k by (time + burn_in) to store the generated VAR time series data.
3. Set the initial values of the matrix data using the constant term constant.
4. For each time point starting from the p-th time point to time + burn_in - 1:
   - Generate a vector of random noise from a multivariate normal distribution with mean 0 and the covariance matrix whose Cholesky decomposition is chol_cov.
   - Generate the VAR time series values for each variable j at time t using the formula: $$ Y_{tj} = \mathrm{constant}_j + \sum_{l = 1}^{p} \sum_{m = 1}^{k} \left( \mathrm{coef}^{(l)}_{jm} \, Y_{(t - l) m} \right) + \mathrm{noise}_{j} $$ where \(Y_{tj}\) is the value of variable j at time t, \(\mathrm{constant}_j\) is the constant term for variable j, \(\mathrm{coef}^{(l)}_{jm}\) is the coefficient of variable m at lag l in the equation for variable j, taken from the k by k block of coef for lag l, \(Y_{(t - l) m}\) is the value of variable m at time t - l, and \(\mathrm{noise}_{j}\) is element j of the generated vector of random process noise.
5. Transpose the matrix data and return only the required time period after the burn-in period, which is from column burn_in to column time + burn_in - 1 of data.
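The steps above can be sketched in plain R as follows. This is a minimal illustration, not the SimVAR() source: the function name sim_var_sketch is hypothetical, the indices are shifted to R's one-based convention, and the noise is assumed to have covariance t(chol_cov) %*% chol_cov, which matches the upper triangular factor returned by chol().

```r
# Hypothetical pure-R sketch of the steps above; not the SimVAR() source.
sim_var_sketch <- function(time, burn_in, constant, coef, chol_cov) {
  k <- length(constant) # number of variables
  p <- ncol(coef) / k   # number of lags
  total <- time + burn_in

  # k by (time + burn_in) matrix, every column initialised with constant.
  data <- matrix(constant, nrow = k, ncol = total)

  # With one-based indexing, column p + 1 is the first with p lags available.
  t_index <- seq_len(total)
  for (t in t_index[t_index > p]) {
    # Assumed convention: noise covariance is t(chol_cov) %*% chol_cov.
    noise <- drop(crossprod(chol_cov, rnorm(k)))
    value <- constant + noise
    for (l in seq_len(p)) {
      # k by k block of coef for lag l (assumed ordering: lag 1 first).
      block <- coef[, ((l - 1) * k + 1):(l * k), drop = FALSE]
      value <- value + drop(block %*% data[, t - l])
    }
    data[, t] <- value
  }

  # Drop the burn-in columns and transpose to a time by k matrix.
  keep <- burn_in + seq_len(time)
  t(data[, keep, drop = FALSE])
}

set.seed(42)
coef_sketch <- cbind(diag(c(0.4, 0.5, 0.6)), diag(c(0.1, 0.2, 0.3)))
y_sketch <- sim_var_sketch(
  time = 50L,
  burn_in = 10L,
  constant = c(1, 1, 1),
  coef = coef_sketch,
  chol_cov = chol(diag(3))
)
dim(y_sketch) # 50 rows (time points) by 3 columns (variables)
```

Because the sketch may consume the random number stream differently from SimVAR(), it should not be expected to reproduce the values printed in the Examples section, even with the same seed.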
Examples
set.seed(42)
time <- 50L
burn_in <- 10L
k <- 3
p <- 2
constant <- c(1, 1, 1)
coef <- matrix(
  data = c(
    0.4, 0.0, 0.0, 0.1, 0.0, 0.0,
    0.0, 0.5, 0.0, 0.0, 0.2, 0.0,
    0.0, 0.0, 0.6, 0.0, 0.0, 0.3
  ),
  nrow = k,
  byrow = TRUE
)
chol_cov <- chol(diag(3))
y <- SimVAR(
  time = time,
  burn_in = burn_in,
  constant = constant,
  coef = coef,
  chol_cov = chol_cov
)
head(y)
#>           [,1]     [,2]     [,3]
#> [1,] 1.6604813 4.407838 7.473852
#> [2,] 0.2947893 2.772433 7.051225
#> [3,] 1.7966908 2.317002 5.759385
#> [4,] 3.2784861 3.105018 6.957684
#> [5,] 1.9508525 3.791006 7.690658
#> [6,] 0.6096591 3.407655 6.828098