Simulate Data from a Vector Autoregressive (VAR) Model

This function generates synthetic time series data from a Vector Autoregressive (VAR) model.

Usage

SimVAR(time, burn_in, constant, coef, chol_cov)

Arguments

time: Integer. Number of time points to simulate.
burn_in: Integer. Number of burn-in observations to exclude before returning the results.
constant: Numeric vector. The constant term vector of length k, where k is the number of variables.
coef: Numeric matrix. Coefficient matrix with dimensions k by (k * p). Each k by k block corresponds to the coefficient matrix for a particular lag.
chol_cov: Numeric matrix. The Cholesky decomposition of the covariance matrix of the multivariate normal noise. It should have dimensions k by k.
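
To illustrate the expected shapes of coef and chol_cov, the following sketch assembles them from their building blocks. The names phi_1, phi_2, and sigma are hypothetical and not part of the package, and the lag-1-first ordering of the blocks is an assumption.

```r
# Hypothetical per-lag coefficient matrices for a VAR(2) with k = 3 variables.
k <- 3
phi_1 <- diag(c(0.4, 0.5, 0.6)) # assumed lag-1 block
phi_2 <- diag(c(0.1, 0.2, 0.3)) # assumed lag-2 block

# Bind the k by k blocks side by side to obtain a k by (k * p) matrix.
coef <- cbind(phi_1, phi_2)
p <- ncol(coef) / k # number of lags implied by the size of coef

# Hypothetical noise covariance matrix and its Cholesky factor.
sigma <- diag(k)
chol_cov <- chol(sigma) # base R chol() returns the upper triangular factor

dim(coef)
dim(chol_cov)
```

Under these assumptions, coef has the same entries as the matrix used in the Examples section.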

Value

Numeric matrix containing the simulated time series data with dimensions time by k, where time is the number of observations and k is the number of variables.

Details

SimVAR() simulates data from a VAR(p) process defined by the constant term constant, the coefficient matrix coef, and chol_cov, the Cholesky decomposition of the covariance matrix of the multivariate normal process noise. The number of lags p is determined by the size of coef: because coef has k * p columns, p is the number of columns of coef divided by k. The function simulates time + burn_in observations and discards the burn-in period before returning the results.
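
In matrix notation, the process described above can be written as follows. The symbols are introduced here for illustration only: $\mathbf{c}$ stands for constant, $\mathbf{A}_l$ for the k by k block of coef for lag l, and $\boldsymbol{\Sigma}$ for the covariance matrix whose Cholesky decomposition is chol_cov.

$$ \mathbf{y}_t = \mathbf{c} + \sum_{l = 1}^{p} \mathbf{A}_l \mathbf{y}_{t - l} + \boldsymbol{\varepsilon}_t, \quad \boldsymbol{\varepsilon}_t \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}) $$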

The steps involved in generating the VAR time series data are as follows:

Extract the number of variables k and the number of lags p from the input.
Create a matrix data of size k by (time + burn_in) to store the generated VAR time series data.
Set the initial values of the matrix data using the constant term constant.
For each time point t from p to time + burn_in - 1 (zero-based indexing):
- Generate a vector of random noise from a multivariate normal distribution with mean 0 and the covariance matrix whose Cholesky decomposition is chol_cov.
- Generate the VAR time series values for each variable j at time t using the formula: $$ Y_{tj} = \mathrm{constant}_j + \sum_{l = 1}^{p} \sum_{m = 1}^{k} \mathrm{coef}^{(l)}_{jm} \, Y_{t - l, m} + \mathrm{noise}_{j} $$ where $Y_{tj}$ is the value of variable j at time t, $\mathrm{constant}_j$ is the constant term for variable j, $\mathrm{coef}^{(l)}_{jm}$ is the coefficient of variable m at lag l in the equation for variable j (the (j, m) element of the k by k block of coef for lag l), $Y_{t - l, m}$ is the value of variable m at time t - l, and $\mathrm{noise}_{j}$ is element j of the generated vector of random process noise.
Discard the burn-in period by keeping columns burn_in to time + burn_in - 1 of data (zero-based indexing), then return the transpose of the result, a time by k matrix.
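
The steps above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: sim_var_sketch is a hypothetical name, the l-th k by k block of coef is assumed to belong to lag l, and chol_cov is assumed to be the upper triangular factor returned by chol(). Because R uses one-based indexing, the loop starts at column p + 1 rather than p.

```r
# Hypothetical pure-R sketch of the steps above; not the package's code.
sim_var_sketch <- function(time, burn_in, constant, coef, chol_cov) {
  k <- length(constant) # number of variables
  p <- ncol(coef) / k   # number of lags
  total <- time + burn_in
  # k by (time + burn_in) matrix with every column set to the constant.
  data <- matrix(constant, nrow = k, ncol = total)
  # One-based columns p + 1, ..., total (empty if total <= p).
  for (i in p + seq_len(total - p)) {
    # Correlated noise: t(chol_cov) %*% z has covariance
    # t(chol_cov) %*% chol_cov when z is standard normal.
    noise <- drop(crossprod(chol_cov, rnorm(k)))
    y_i <- constant
    for (l in seq_len(p)) {
      # Assumed layout: block l of coef holds the lag-l coefficients.
      block <- coef[, ((l - 1) * k + 1):(l * k), drop = FALSE]
      y_i <- y_i + drop(block %*% data[, i - l])
    }
    data[, i] <- y_i + noise
  }
  # Drop the burn-in columns and transpose to a time by k matrix.
  t(data[, (burn_in + 1):total, drop = FALSE])
}

set.seed(42)
y_sketch <- sim_var_sketch(
  time = 50L,
  burn_in = 10L,
  constant = c(1, 1, 1),
  coef = cbind(diag(c(0.4, 0.5, 0.6)), diag(c(0.1, 0.2, 0.3))),
  chol_cov = chol(diag(3))
)
dim(y_sketch)
```

The sketch draws random numbers in its own order, so its output is not expected to match SimVAR() value for value, even with the same seed.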

Author

Ivan Jacob Agaloos Pesigan

Examples

set.seed(42)
time <- 50L
burn_in <- 10L
k <- 3
p <- 2
constant <- c(1, 1, 1)
coef <- matrix(
  data = c(
    0.4, 0.0, 0.0, 0.1, 0.0, 0.0,
    0.0, 0.5, 0.0, 0.0, 0.2, 0.0,
    0.0, 0.0, 0.6, 0.0, 0.0, 0.3
  ),
  nrow = k,
  byrow = TRUE
)
chol_cov <- chol(diag(3))
y <- SimVAR(
  time = time,
  burn_in = burn_in,
  constant = constant,
  coef = coef,
  chol_cov = chol_cov
)
head(y)
#>           [,1]     [,2]     [,3]
#> [1,] 1.6604813 4.407838 7.473852
#> [2,] 0.2947893 2.772433 7.051225
#> [3,] 1.7966908 2.317002 5.759385
#> [4,] 3.2784861 3.105018 6.957684
#> [5,] 1.9508525 3.791006 7.690658
#> [6,] 0.6096591 3.407655 6.828098