1. Introduction
In recent years, the estimation of covariance (volatility) matrices has become an important and widely discussed issue in the field of asset portfolio and risk management. Earlier studies typically assumed that covariance (volatility) matrices remained constant over time in financial markets. As a result, sample covariance matrix estimation methods were found to be effective under this assumption. However, the volatility often exhibits a strong clustering phenomenon: prices tend to exhibit significant fluctuations during certain periods while remaining relatively stable during others. Undoubtedly, this time-varying characteristic makes the traditional volatility matrix estimation methods no longer valid. To better capture the volatility characteristics of financial time series, Bollerslev proposed the GARCH model [
1]. Building on this, various methods were proposed to estimate the volatility matrices, for which readers can refer to [
2,
3,
4,
5,
6,
7,
8].
As data dimension increases, traditional estimation methods encounter significant challenges. For instance, it is often inefficient when applying the multivariate GARCH model directly to high-dimensional data for estimation. To address this challenge, researchers developed various dimensionality reduction techniques to simplify the model structure and improve estimation accuracy. Gao and Tsay [
9,
10], as well as Pan and Yao [
11], applied a factor structure to reduce the dimensionality of time series data. Fan et al. [
12] proposed a factor model to reduce matrix dimensionality and estimated the covariance matrices within this framework. Ledoit and Wolf [
13] introduced a shrinkage estimator that employed a shrinkage transformation to obtain a stabilized matrix, pulling extreme coefficients toward the mean to reduce estimation errors. Guo et al. [
14] developed a method for estimating dynamic covariance (volatility) matrices, designed to reduce dimensionality and minimize estimation errors. Moreover, several researchers have sought to balance the trade-off between variance and bias by imposing some constraints, as illustrated in the works of Bickel and Levina [
15,
16], Ding et al. [
17], De at al. [
18], Uematsu and Yamagata [
19], Shi et al. [
20] and Fan et al. [
21]. Lam and Yao [
22,
23] estimated a latent factor model for high-dimensional time series and addressed the inference problem related to determining the number of factors in the model.
In order to better capture the dynamic characteristics of the model and address heteroscedasticity in financial data, some researchers combined the ideas of factor models and the GARCH framework to establish a more flexible structure. Engle [
24] utilized the factor ARCH model to estimate the covariance matrix. Bollerslev and Engle [
25] examined the common persistence characteristics in conditional variances. Additionally, Lin [
26] compared various estimation methods for factor-GARCH models and performed empirical analysis using Monte Carlo simulations. Vrontos et al. [
27] proposed the Full Factor Multivariate GARCH (FF-GARCH) model. Furthermore, there are several variant forms, one can refer to [
28,
29]. Hafner [
30] investigated the asymptotic theory of a class of factor-GARCH model and discovered that dimensionality reduction was not feasible when the loading matrix was square. Guo et al. [
14] proposed a class of factor model along with a method for estimating high-dimensional varying coefficient conditional covariance matrices in high-dimensionality situations. Based on this, Li et al. [
31] combined the ideas of factor models [
32] and the application of the Constant Correlation Covariance Generalized Autoregressive Conditional Heteroscedasticity (CCC-GARCH) model to effectively capture the dynamic characteristics of common factors. Compared to the Dynamic Constant Correlation Generalized Autoregressive Conditional Heteroscedasticity (DCC-GARCH) model proposed by Engle [
4] and the Baba–Engle–Kraft and Kroner GARCH (BEKK-GARCH) model proposed by Engle and Kroner [
2], the CCC-GARCH model is notable for its simplicity, computational efficiency, and ease of interpretation, primarily due to its assumption of constant correlations.
Motivated by previous work, this paper develops an enhanced model for estimating the high-dimensional conditional covariance matrices. In this model, factor loading matrices are constructed within varying coefficient functions, and common factors satisfy a Constant Correlation Covariance Generalized Autoregressive Conditional Heteroscedasticity (CCC-GARCH) structure inspired by Li et al. [
31]. Parameters in the model are estimated using Quasi-Maximum Likelihood estimation (QMLE), least squares estimation (LSE) and local linear estimation methods. We will demonstrate that the proposed model shows some superiority when applied to time series characterized by time-varying dynamics and heterogeneity.
The remaining part of this paper is arranged as follows.
Section 2 gives the model structure and estimation procedure.
Section 3 gives asymptotic properties.
Section 4 outlines a process for constructing a portfolio allocation based on the Markowitz’s optimal portfolio.
Section 5 reports some simulation results and
Section 6 provides an empirical example. A concluding remark is given in
Section 7. The detailed proofs are in the given
Appendix A.
3. Asymptotic Theory
In this section, we explore the asymptotic theoretical properties for the proposed estimators. We first introduce the assumptions that will be used throughout the paper.
Assumption 1. Define , with the coefficient - (i)
is stationary and ergodic, and with geometric rate , for , and . In addition, and are independent.
- (ii)
For is a strictly stationary GARCH process with , where
- (iii)
For each t, where is a small constant.
- (iv)
For we denote the true value of as , the compact set is , where c is a constant, the true value is the interior point of Λ.
- (v)
Assuming that . If , and have no common root, with
Assumption 1 outlines the necessary conditions for the asymptotic properties of the proposed estimators. Condition (i) ensures that the process
is both stationary (its statistical properties do not change over time) and ergodic (time averages converge to ensemble averages, implying that the process behaves predictably over time).
-mixing means that the correlation between past and future events decays over time. A geometric rate of decay implies that the process becomes less dependent as time progresses, which is important for the asymptotic properties of estimators. Additionally,
and
are independent, ensuring that the noise process is independent from other variables or processes. This part is typical in time series econometrics, as discussed in works like that of Fan et al. [
12]. Conditions (ii) and (iii) are common assumptions in time series modeling, especially in volatility modeling, for which readers can refer to [
33] and [
34]. Condition (iv) is a standard assumption to ensure the consistency and convergence of estimators. Condition (v) is a typical condition to ensure that the model is well defined and does not lead to singularities in the estimation process.
Assumption 2. Let
- (i)
The parameter space Θ is a compact subset of Euclidean space, and it is assumed that the true parameter value lies in the interior of Θ.
- (ii)
is a strictly stationary and ergodic process.
- (iii)
The components of are mutually independent, and the squares of each component follow a non-degenerate distribution. Furthermore, there exist constants , and such that for all and , we have - (iv)
Given , the matrices and are left prime. Additionally, the rank of the matrix has full rank.
- (v)
The matrix exhibits positive correlation at every point within the parameter space Θ.
- (vi)
;
- (vii)
There exists a constant L such that .
Assumption 2 relates to the parameter space and distributional properties of variables, which is standard in [
34]. Condition (i) ensures that estimators are consistent and identifiable, which is a standard condition in [
35]. Condition (ii) is necessary for the analysis of asymptotic properties in time series models. Condition (iii) allows us to apply the Bernstein-type inequality for weakly dependent data, which is important for asymptotic analysis. Condition (iv) ensures that certain invertibility and identifiability conditions are met in the model. Condition (v) ensures that the model behaves consistently across the parameter space and avoids degenerate cases where the correlation structure collapses. Condition (vi) ensures that certain second-moment calculations involving
are well behaved. Condition (vii) prevents the model from being overly sensitive to extreme values in the data.
Assumption 3.
- (i)
The kernel function is a symmetric density function, which satisfies the Lipschitz condition.
- (ii)
The density function of is twice differentiable and bounded away from zero on the set , where
- (iii)
The density function of is bounded away from zero and twice differentiable at . Furthermore, the joint densities of and are bounded for all
- (iv)
and have continuous third derivatives at
- (v)
For each , as , with some symmetric positive definite matrix, which is denoted by , let where is bounded away from zero.
Assumption 3 pertains to the smoothness and regularity of the kernel and density functions and is standard in nonparametric models, as in [
33]. Condition (i) ensures that the kernel is a well-behaved and smooth density function, which is important for the consistency of estimators in nonparametric settings. Condition (ii) ensures that the density function is non-zero and well behaved, which is necessary for proper identification and estimation. Condition (iii) ensures that the data do not exhibit extreme skewness or outliers by requiring the joint densities to be bounded. Condition (iv) ensures the smoothness and regularity of the model. Condition (v) ensures well-behaved asymptotics by controlling the behavior of certain matrix norms.
Assumption 4. For and p, we have
- (i)
, ,
- (ii)
, where is a constant,
Assumption 4 ensures that parameters grow at a controlled rate. Condition (i) pertains to the selection of the optimal bandwidth rate, a critical aspect of nonparametric estimation in GARCH models. Condition (ii) posits that the number of factors
p depends on the sample size
n at a polynomial rate, a condition that aligns with the findings of Fan et al. [
33].
Suppose that for any
and
denote the minimum and maximum eigenvalues of
, respectively, and
represents the trace of
. We denote the norm as follows:
The two equations describe statistical estimators or matrix structures often used in econometrics or statistics. The first equation,
, represents a matrix constructed by averaging over both time
t and variable dimensions
i. It involves a function matrix
, which applies a transformation to
, and a residual term
, indicating the deviation of
from its conditional expectation given
. The equation also includes additional terms
and
, representing variable-specific transformations, and
, a random error component at time
t for variable
i. The second equation,
, is a weighted covariance matrix based on the expectation of certain transformed variables. It includes the term
, representing a function applied to
, and a residual term
, which captures the deviation of
from its conditional expectation given
. The squared tensor product
indicates interactions of transformed variables, while the terms
and
introduce variable-specific contributions. Together,
and
are often used to study conditional distributions, with
capturing sample-based information and
describing variance or covariance structures for asymptotic analysis.
Theorem 1. Under Assumptions 2–4, there exists a constant such that Remark 1. Theorem 1 demonstrates that the parameter vector α is estimated at a rate exceeding the standard rate of , which is considered optimal when the dimension p is fixed. Specifically, for any regular estimator , the asymptotic variance of is shown to be greater than the lower bound set by the nonsingular Fisher information matrix.
Theorem 2. Under Assumption 2, for there exists a constant such that Remark 2. Theorem 2 shows that as , the rate at which converges to the true value becomes increasingly rapid.
Theorem 3. Under Assumption 1, there exists a constant such that Remark 3. Theorem 3 provides the result that is crucial for understanding the rate at which our estimator converges to the true covariance matrix as the sample size n increases. Moreover, the rate indicates that the likelihood of significant deviations from the true covariance matrix decreases rapidly with increasing sample size, which is a desirable characteristic for any consistent estimator.
The main interest of this paper is to estimate in (19). To measure the accuracy of the estimation when the dimension is p, the entropy loss norm proposed by James and Stein [36] could be a good choice. The specific formula is shown as follows:Under the entropy loss norm, we are going to give the asymptotic property of the estimator for in the following theorem. Theorem 4. Under Assumptions 1–4, there exists a constant such thatFan et al. [21] and Fan et al. [12] have demonstrated that an estimator constructed based on a specific structure for the unconditional covariance matrix achieves a superior convergence rate compared to the sample covariance matrix. Building on these findings, Theorem 4 extends this result to the domain of conditional covariance matrix estimation, suggesting that a similar enhancement in the convergence rate can be realized. This theorem not only quantifies the estimation error but also lays a theoretical foundation for the practical application of our estimation method, especially in contexts where the sample size is sufficiently large. 4. Portfolio Allocation
In this section, we detail the construction of an estimated optimal portfolio allocation leveraging the Varying Coefficient Factor-GARCH (VCF-GARCH) model. Initially, we define the conditional expectation of
given the information set
as
. Given that the optimal portfolio allocation formula incorporates
, we introduce its estimator. We derive the conditional expectation from Equation (
2) as follows:
Thus, the estimation of the conditional expectation
is derived as follows:
where
is approximated using a VAR(1) model, as detailed in Li et al. [
31]. Drawing on the mean-variance optimization framework proposed by Markowitz, we proceeded to construct the estimated optimal portfolio allocation. Let
represent the allocation vector for
p risky assets to be held from time
to
t. We assume that the solution for the estimated
is given by
where
represents the _target return imposed on the portfolio, and
is a
p-dimensional column vector consisting entirely of ones. The optimal allocation vector
is determined by the following expression:
where
5. Simulation Studies
In this section, we present a simulated example to evaluate the performance of the proposed estimation procedure. We conduct
replications, considering the following configurations of
T and
p:
For
, let
, we set
where
are some fixed parameters for
and
; we simulate them independently from a uniform distribution on
, and use these same values throughout all simulations. We set
in model
and assume that
. For the conditional covariance matrix
of
we set
and
as follows:
and
where
Francq and Zakoian [
34] have shown that parameter configurations must ensure the positivity of the conditional variance. The parameter values set above are some general settings and have been verified to be effective. More details can refer to [
31,
34]. For
, we outline the simulation procedure as follows:
Generate from where is the k-dimensional identity matrix. Using compute and by
Independently generate
according to Equation (
26).
For each and , generate from . Compute and using
Generate
according to model (
2).
To assess the performance of an estimator
for matrix
M, we employ the following metric:
where
is the estimate obtained from the i-th replication of the experiment, and
We define the following notations for convenience:
,
,
and
. The corresponding estimators proposed by Li et al. [
31] are denoted by
,
,
and
. Furthermore, the performance of the estimator
is assessed using the metrics
and
. As Li’s method does not incorporate a dynamic structure for factor loading matrices, estimating
is unnecessary.
Table 1 summarizes the performance of the estimators across 500 replications, assessed by these metrics. Compared to the method proposed by Li et al. [
31], our estimation approach consistently achieves lower metric values across all scenarios. Actually, the large numerical values about
and
observed in
Table 1 and
Table 2 can be attributed to the structural differences between the proposed method and Li’s method. Specifically, the proposed method is designed to accommodate dynamic characteristics in the factor loading matrix, where factor loadings change over time. In contrast, Li’s method assumes static factor loading matrices that remain constant over time. This mismatch leads to significant estimation errors when using Li’s method in scenarios with dynamic factor loadings. Moreover, the instability of the estimates in these settings leads to the high variance of the results, as reflected in the tables. In particular, our estimator
exhibits improved accuracy for conditional covariance matrix estimation, owing to its ability to effectively accommodate dynamic factor loading matrices. In contrast, the method by Li et al. [
31], which is not designed to handle dynamic factor loading matrices, results in substantial errors in
.
Table 2 displays the corresponding metrics for
, illustrating that the estimation errors, quantified by
and
decrease accordingly, as the sample size
T increases.
To assess estimator performance, we employ
and
as estimates for
and
, respectively. The estimation uses the Epanechnikov kernel
, with bandwidths chosen according to the method proposed by Guo et al. [
14]. Under three types of norms, we calculate the mean and standard deviation of the differences between
and
based on both our proposed VCF-GARCH model and the Factor-GARCH model by Li et al. [
31]. The results are shown in
Table 3. The findings suggest that both the mean errors and standard deviation errors exhibit favorable performance in our model. Specifically, the mean errors of
,
and
increase slightly as
p grows, a result influenced by the definitions of these norms. Notably, all standard deviations remain stable and relatively small under the entropy norm
. Furthermore, in most cases, the proposed model achieves lower mean errors compared to the method of Li et al. [
31], underscoring the effectiveness of our approach. The factor loading matrices in our simulations are designed to vary over time, reflecting realistic scenarios where these matrices evolve rather than remain static. We varied key parameters, including sample size and dimension, across a range of settings to explore different degrees of dynamic behavior, ensuring that the dynamic nature is fully captured. These settings allow us to systematically evaluate the model’s performance under diverse scenarios, highlighting its ability to handle dynamic structures effectively.
6. Real Data Analysis
In this section, we apply the Varying Coefficient Factor-GARCH (VCF-GARCH) model to estimate the covariance matrices using a real dataset. Specifically, we utilize the daily returns of 49 industrial portfolios, adjusted for the risk-free rate, as the response variable
. The observable factors incorporated in our analysis are the market, size and value factors derived from the Fama-French three-factor model [
37], respectively. The dataset utilized in this study was obtained from Kenneth French’s website, accessible at
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html, and was accessed on 20 August 2024. The data series encompass a period from 2 January 2013 to 31 December 2023, yielding a total of 2515 observations. Our portfolio allocation strategy, denoted as VCF, is compared against the allocation proposed by Li et al. [
31], which is denoted as FG. To test the model’s adaptability, the Carhart four-factor model is applied as an alternative factor model. In this situation, the portfolio allocation strategy proposed by us is denoted as
, and the allocation proposed by Li et al. [
31] is denoted as
. Furthermore, to demonstrate the robustness of the model, this paper incorporates a new set of data consisting of 30 industrial portfolios. This comparison is conducted in terms of annualized Sharpe ratios. The methodology for calculating the annualized Sharpe ratio adheres to the trading strategy outlined by Guo et al. [
14]. The number of trading days per year is approximated as T = 252. We make the simplifying assumptions that there are no transaction costs and that short selling is permitted, thus considering all possible portfolio allocations to be feasible.
The trading strategy involves constructing a portfolio allocation
at the close of each trading day, which is then held until the close of the next trading day. The realized return between day
and day
t is calculated as
where
is determined based on the historical data
,
. Consequently, the annualized Sharpe ratio is derived as
with the average return
and
defined by
where
denotes the risk-free rate on day
t.
We calculate the annualized Sharpe ratios on the last trading day of each year, using a sample size of
.
Figure 1 shows the annualized Sharpe ratio trend for the three-factor model with 49 industry portfolios, while
Figure 2 shows the same trend for 30 industry portfolios.
Figure 3 presents the annualized Sharpe ratio trend for the Carhart four-factor model with 49 industry portfolios. All three figures show that the proposed VCF-GARCH model outperforms the method by Li et al. [
31] in most cases. These results suggest that our approach delivers higher Sharpe ratios, meaning better risk-adjusted returns and overall portfolio performance. This improvement highlights the benefit of using the dynamic covariance structure in the VCF-GARCH model.