【代码实践】使用Garch模型估计VaR

2023-10-28


title: “Value at Risk estimation using GARCH model”
author: “Ionas Kelepouris & Dimos Kelepouris”
date: “July 6, 2019”
output:
html_document:
fig_height: 7
fig_width: 10
highlight: tango
toc: yes

knitr::opts_chunk$set(echo = TRUE , warning = FALSE , error = FALSE , message = FALSE)

Introduction

本分析的目的:通过给定的时变波动率构建VaR。VaR被广泛地运用于计算市场风险。
我们的时间序列包括2668天的黄金期货收益。为了解释日度收益的部分波动率,我们
使用Box-Jenkins方法拟合ARIMA模型,并测试其假设。
其次,我们检测了收益的概率密度分布是否符合正态分布,以先择最合适的分布形式。
再次,我们使用Garch模型估算了残差的条件方差,并与delta-normal方法及进行对比。
最后,我们进行了1-step的VaR预测,并且回测数据以检验模型是否完善。

Data & Libraries

为了建模,我们收集了10年的黄金期货日度收益,共计2668个观测值。

# Load libraries
library(tidyverse)
library(ggthemes)
library(forecast)
library(tseries)
library(gridExtra)
library(rugarch)

# Load data
stocks = read.csv('E:\\BaiduNetdiskWorkspace\\Gold_CSS\\Model\\R.csv' , header = T)
stocks = stocks %>% select(Date , R , Css_e)

rets = stocks$R

p1 = qplot(x = 1:length(rets) , y = rets , geom = 'line') + geom_line(color = 'darkblue') + 
    geom_hline(yintercept = mean(rets) , color = 'red' , size = 1) + 
    labs(x = '' , y = 'Daily Returns')

p2 = qplot(rets , geom = 'density') + coord_flip() + geom_vline(xintercept = mean(rets) , color = 'red' , size = 1) +
    geom_density(fill = 'lightblue' , alpha = 0.4) + labs(x = '')

grid.arrange(p1 , p2 , ncol = 2)

为了识别收益的平稳性,我们使用Augmented Dickey-Fuller检验。
其零假设为:不平稳。

adf.test(rets)

Small P-value (<0.01) 表明可以拒绝零假设。我们的收益率是平稳的。

Box-Jenkins Methodology

Box-Jenkins approach可应用ARIMA模型以发现时间序列最好的拟合,该方法主要有三个
阶段:
a) identification, b) estimation, c) diagnostic checking.

Identification

前期我们已经检验时间序列的平稳性。基于Autocorelation Function (ACF) and Partial Autocorrelation Function (PACF),以选择合适的参数p,d,q。Akaike Information Criterion (AICc)同样可以用于选取参数。

$AIC = ln\frac{\sum\hat{u}^2}{T} + \frac{2k}{T}$

where

  • ∑ u ^ 2 \sum\hat{u}^2 u^2 = Sum of Squared Residuals
  • T T T = number of observations
  • k k k = number of model parameters (p + q + 1)

AIC可以解决过度拟合以及拟合不足的风险。AIC值最低的模型参数将被选择。

model.arima = auto.arima(rets , max.order = c(3 , 0 ,3) , stationary = TRUE , trace = T , ic = 'aicc')

根据上述结果,选择ARIMA(0,0,2)

Estimation

可以估计所选参数的coefficients,我们使用Maximum Likelihood。

model.arima

Therefore the process can be described as:

$r_{t} = -0.1238*\epsilon_{t-1} - 0.1611*\epsilon_{t-2} + \epsilon_{t}$ where $\epsilon_{t}$ is White Noise

Diagnostics Checking

该过程包含observing residual plot以及ACF & PACF diagra,以及Ljung-Box的检验
结果。如果ACF & PACF表明模型残差不存在显著滞后项,说明模型合适。

model.arima$residuals %>% ggtsdisplay(plot.type = 'hist' , lag.max = 14)

ACF 和 PACF结果相似,其相关性接近于0。右侧残差接近正态分布 N(0 , σ 2 \sigma^2 σ2).

为了进一步验证残差不相关,we perform Ljung-Box test.

$$Q_{LB} = T(T+2)\sum_{s=1}^{m} \frac{\hat\rho_{s}^{2}}{T-s}$$

The Q L B Q_{LB} QLB statistic follows asymmetrically a X 2 X^{2} X2 distribution with m-p-q degrees of freedom. The null hypothesis refers to H 0 : ρ 1 = ρ 2 = ⋯ = ρ m = 0 H_{0}: \rho_{1}=\rho_{2}=\dots=\rho_{m}=0 H0:ρ1=ρ2==ρm=0

ar.res = model.arima$residuals
Box.test(model.arima$residuals , lag = 14 , fitdf = 2 , type = 'Ljung-Box')

我们不能拒绝零假设,因此残差的过程就像白噪声一样,所以没有可能被建模的模式的迹象。

GARCH Implementation

尽管ACF & PACF of residuals都表明没有显著的滞后性,但时间序列图仍然表明存在波动聚集。ARIMA对数据线性建模,无法反应新信息。为了对波动率建模,我们使用ARCH模型。
ARCH是一个时间序列数据的统计模型,它描述了当前误差项的方差作为前一时间段误差项的实际大小的函数。

We assume that the time series of interest, r t r_{t} rt, is decomposed into two parts, the predictable and unpredictable component, r t = E ( r t ∣ I t − 1 ) + ϵ t r_{t} = E(r_{t}|I_{t-1}) + \epsilon_{t} rt=E(rtIt1)+ϵt, where I t − 1 I_{t-1} It1 is the information set at time t − 1 t-1 t1 and E ( r t ∣ I t − 1 ) = 0.0437 ∗ r t − 1 − 0.0542 ∗ r t − 2 E(r_{t}|I_{t-1}) = 0.0437*r_{t-1} - 0.0542*r_{t-2} E(rtIt1)=0.0437rt10.0542rt2 and ϵ t \epsilon_t ϵt is the unpredictable part, or innovation process.

The unpredictable component, can be expressed as a GARCH process in the following form:

ϵ t = z t ∗ σ t \epsilon_{t} = z_{t}*\sigma_{t} ϵt=ztσt

where z t z_{t} zt is a sequence of independently and identically distributed random variables with zero mean and variance equal to 1. The conditional variance of ϵ t \epsilon_{t} ϵt is σ t \sigma_{t} σt, a time-varying function of the information set at time t − 1 t-1 t1.

Next step is to define the the second part of the error term decomposition which is the conditional variance, σ t \sigma_{t} σt. For such a task, we can use a GARCH(1 , 1) model, expressed as:

σ t 2 = ω + a 1 ∗ ϵ t − 1 2 + β 1 ∗ σ t − 1 2 \sigma_{t}^{2} = \omega + a_{1}*\epsilon_{t-1}^{2} + \beta_{1}*\sigma_{t-1}^{2} σt2=ω+a1ϵt12+β1σt12

当残差平方存在自相关时,可以使用Garch模型。ACF and PACF plots clearly indicate significant correlation.

tsdisplay(ar.res^2 , main = 'Squared Residuals')

另一种检测残差平方的异方差性的方法测试参数的显著性。
a 1 a_{1} a1 and β 1 \beta_{1} β1 parameters.

# Model specification
model.spec = ugarchspec(variance.model = list(model = 'sGARCH' , garchOrder = c(1 , 1)) , 
                        mean.model = list(armaOrder = c(0 , 2)))

model.fit = ugarchfit(spec = model.spec , data = ar.res , solver = 'solnp')

options(scipen = 999)
model.fit@fit$matcoef

Both a 1 a_{1} a1 and β 1 \beta_{1} β1 are significantly different from zero,
因此可以假设残差存在波动聚集现象。

With successive replacement of the σ t − 1 2 \sigma_{t-1}^2 σt12 term, the GARCH equation can be written as:

σ t 2 = ω 1 − β 1 + a 1 ∗ ∑ i = 1 ∞ β 1 i − 1 ∗ ϵ t − i 2 \sigma_{t}^2 = \frac{\omega}{1-\beta_1}+a_{1}*\sum_{i=1}^{\infty} \beta_{1}^{i-1}*\epsilon_{t-i}^{2} σt2=1β1ω+a1i=1β1i1ϵti2

When we replace with the coefficient estimates given by optimization we get the following equation:

σ t 2 = 0.000087 + 0.108 ∗ ( ϵ t − 1 2 + 0.825 ∗ ϵ t − 2 2 + 0.680 ∗ ϵ t − 3 2 + 0.561 ∗ ϵ t − 4 2 + …   ) \sigma_{t}^2 = 0.000087 + 0.108*(\epsilon_{t-1}^{2} + 0.825*\epsilon_{t-2}^2 + 0.680*\epsilon_{t-3}^2 + 0.561*\epsilon_{t-4}^{2} + \dots) σt2=0.000087+0.108(ϵt12+0.825ϵt22+0.680ϵt32+0.561ϵt42+)

Given that 0 < β 1 < 1 0<\beta_{1}<1 0<β1<1, as lag increases the effect of the squared residual decreases.

Value at Risk

VaR用于测度当下位置的下跌风险。
A VaR statistic has three components: a) time period, b) confidence level, c) loss ammount (or loss percentage). For 95% confidence level, we can say that the worst daily loss will not exceed VaR estimation. If we use historical data, we can estimate VaR by taking the 5% quantile value. For our data this estimation is:

quantile(rets , 0.05)
qplot(rets , geom = 'histogram') + geom_histogram(fill = 'lightblue' , bins = 30) +
    geom_histogram(aes(rets[rets < quantile(rets , 0.05)]) , fill = 'red' , bins = 30) +
    labs(x = 'Daily Returns')

Red bars refer to returns lower than 5% quantile.

Distributional Properties

为了估算VaR,我们需要适当定义假设分布的quantile。
对于正态分布而言,5%对应的quantile为-1.645。实证结果表明正态分布的假设往往
产生weak results。Jarque-Bera test can test the hypothesis that stock returns follow a normal distribution.

J B = n − k + 1 6 ∗ ( S 2 + 1 4 ∗ ( C − 3 ) 2 ) JB = \frac{n-k+1}{6}*(S^{2} + \frac{1}{4}*(C-3)^{2}) JB=6nk+1(S2+41(C3)2)

where S S S is skewness and C C C is kurtosis. A normal distributed sample would return a JB score of zero. The low p-value indicates stock returns are not normally distributed.

jarque.bera.test(rets)
p2_1 = qplot(rets , geom = 'density') + geom_density(fill = 'blue' , alpha = 0.4) + 
    geom_density(aes(rnorm(200000 , 0 , sd(rets))) , fill = 'red' , alpha = 0.25) + 
    labs(x = '')

p2_2 = qplot(rets , geom = 'density') + geom_density(fill = 'blue' , alpha = 0.4) + 
    geom_density(aes(rnorm(200000 , 0 , sd(rets))) , fill = 'red' , alpha = 0.25) + 
    coord_cartesian(xlim = c(-0.07 , -0.02) , ylim = c(0 , 10)) + 
    geom_vline(xintercept = c(qnorm(p = c(0.01 , 0.05) , mean = mean(rets) , sd = sd(rets))) , 
               color = c('darkgreen' , 'green') , size = 1) + labs(x = 'Daily Returns')

grid.arrange(p2_1 , p2_2 , ncol = 1)

如图所示,Density plots are shown for 收益(蓝)以及正态分布(红)数据。
Vertical lines表明了5%和1%对应的quantile。
正态分布都会高估风险。

Student’s t-distribution

为了更好对收益建模,我们使用其他分布。
The t-distribution is symmetric and bell-shaped, like the normal distribution, but has heavier tails, meaning that it is more prone to producing values that fall far from its mean. We use the fitdist function from rugarch package to get the fitting parameters of t-distribution.

fitdist(distribution = 'std' , x = rets)$pars
cat("For a = 0.05 the quantile value of normal distribution is: " , 
    qnorm(p = 0.05) , "\n" ,
     "For a = 0.05 the quantile value of t-distribution is: " ,
    qdist(distribution = 'std' , shape = 3.7545967917 , p = 0.05) , "\n" , "\n" , 
    'For a = 0.01 the quantile value of normal distribution is: ' , 
    qnorm(p = 0.01) , "\n" , 
    "For a = 0.01 the quantile value of t-distribution is: " , 
    qdist(distribution = 'std' , shape = 3.7545967917 , p = 0.01) , sep = "")

95%置信区间下,正态分布高估风险;
99%置信区间下,正态分布未能捕捉outliers,低估风险。

Garch VaR vs Delta-normal approach

Delta-normal approach 假设收益正态分布。

V a R ( a ) = μ + σ ∗ N − 1 ( a ) VaR(a)=\mu + \sigma*N^{-1}(a) VaR(a)=μ+σN1(a)

where μ \mu μ is the mean stock return, σ \sigma σ is the standard deviation of returns, a a a is the selected confidence level and N − 1 N^{-1} N1 is the inverse PDF function, generating the corresponding quantile of a normal distribution given a a a.

The results of such a simple model is often disapointing and are rarely used in practice today. The assumption of normality and constant daily variance is usually wrong and that is the case for our data as well.

Previously we observed that returns exhibit time-varying volatility. Hence for the estimation of VaR we use the conditional variance given by GARCH(1,1) model. For the underlined asset’s distribution properties we use the student’s t-distribution. For this method Value at Risk is expressed as:

V a R ( a ) = μ + σ ^ t ∣ t − 1 ∗ F − 1 ( a ) VaR(a)=\mu + \hat{\sigma}_{t|t-1}*F^{-1}(a) VaR(a)=μ+σ^tt1F1(a)

where σ ^ t ∣ t − 1 \hat{\sigma}_{t|t-1} σ^tt1 is the conditional standard deviation given the information at t − 1 t-1 t1 and F − 1 F^{-1} F1 is the inverse PDF function of t-distribution.

qplot(y = rets , x = 1:2668 , geom = 'point') + geom_point(colour = 'lightgrey' , size = 2) + 
    geom_line(aes(y = model.fit@fit$sigma*(-1.485151) , x = 1:2668) , colour = 'red') +
    geom_hline(yintercept = sd(rets)*qnorm(0.05) , colour = 'darkgreen' , size = 1.2) + theme_light() + 
    labs(x = '' , y = 'Daily Returns' , title = 'Value at Risk Comparison')

Red line denotes VaR produced by GARCH model and green line refers to delta-normal VaR.

VaR forecasting

The ugarchroll method 可以进行滚动预测,it returns分布预测参数以计算forecasted density的必要测度。我们将最后500个观测设为测试集,对条件标准差 σ ^ t + 1 ∣ t \hat{\sigma}_{t+1|t} σ^t+1t进行1-step的预测。

model.roll = ugarchroll(spec = model.spec , data = rets , n.start = 2168 , refit.every = 50 ,
                        refit.window = 'moving')

# Test set 500 observations
VaR95_td = mean(rets) + model.roll@forecast$density[,'Sigma']*qdist(distribution='std', shape=3.7545967917, p=0.05)

Backtesting

N = ∑ t = 1 T I t N = \sum_{t=1}^{T} I_{t} N=t=1TIt 为收益低于VaR估计的个数。

VaR estimate, where I t I_{t} It is 1 if y t + 1 < V a R t + 1 ∣ t y_{t+1} < VaR_{t+1|t} yt+1<VaRt+1t and 0 if y t + 1 ≥ V a R t + 1 ∣ t y_{t+1} \ge VaR_{t+1|t} yt+1VaRt+1t. Hence, N is the observed number of exceptions in the sample. As argued in Kupiec (1995), the failure number follows a binomial distribution, B(T, p).

p = c()
p[1] = pbinom(q = 0 , size = 500 , prob = 0.05)
for(i in 1:50){
    p[i] = (pbinom(q = (i-1) , size = 500 , prob = 0.05) - pbinom(q = (i-2) , size = 500 , prob = 0.05))
}
qplot(y = p , x = 1:50 , geom = 'line') + scale_x_continuous(breaks = seq(0 , 50 , 2)) + 
    annotate('segment' , x = c(16 , 35) , xend = c(16 , 35) , y = c(0 , 0) , yend = p[c(16 , 35)] , color = 'red' , 
             size = 1) + labs(y = 'Probability' , x = 'Number of Exceptions') + theme_light()

该图代表了异常的分布概率。The expected number为25个,两条红线表示95%的置信区间。 the lower being 16 and the upper 35.
我们期待16-35之间的数字,to state that GARCH model as successfully preditive.

qplot(y = VaR95_td , x = 1:500 , geom = 'line') +
    geom_point(aes(x = 1:500 , y = rets[2169:2668] , color = as.factor(rets[2169:2668] < VaR95_td)) , size = 2) + scale_color_manual(values = c('gray' , 'red')) + 
    labs(y = 'Daily Returns' , x = 'Test set Observation') + theme_light() + 
    theme(legend.position = 'none')

Black line为由Garch模型所预测VaR,red points refer to returns lower than VaR.
Final step is to count the number of exceptions and compare it with the one generated with delta-normal approach.

cat('Number of exceptions with delta-normal approach: ' , (sum(rets[2169:2668] < (mean(rets) + qnorm(p = 0.05)*sd(rets[1:2168])))) , '\n' , 'Number of exceptions with GARCH approach: ' , (sum(rets[2169:2668] < VaR95_td)) , sep = '')

正如前述,we expected that delta-normal approach会高估风险。
当回测时,只有12个收益低于VaR预测的95%下界(16),而 GARCH approach (23 exceptions) seems to be an effective predictive tool in this particular case.

References

Angelidis T., Benos A. and Degiannakis S. (December 2003). The Use of GARCH Models in VaR Estimation.

Ghalanos A. (August 2017). Introduction to the rugarch package (Version 1.3-8).

Montgomery D., Jennings C. and Kulahci M. (2015). Introduction to Time Series Analysis and Forecasting (Second Edition). New Jersey: Wiley.

本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)

【代码实践】使用Garch模型估计VaR 的相关文章

随机推荐

  • 蓝桥杯2020年第十一届国赛真题-重复字符串

    题目描述 如果一个字符串 S 恰好可以由某个字符串重复 K 次得到 我们就称 S 是 K 次重复字符串 例如 abcabcabc 可以看作是 abc 重复 3 次得到 所以 abcabcabc 是 3 次重复字符串 同理 aaaaaa 既是
  • oracle 中使用 select a into b 时遇到空值问题

    今天一朋友问及我这个问题 当记录不存在 会提示 no data 的错误 下面是网上这类问题的解决方法 转载 当在PL SQL中执行SELECT INTO 语句时 如果返回结果集为空 则回触发NO DATA FOUND错误 但是当 SELEC
  • SSL/TLS原理 详细整理版

    1 SSL TLS握手 简化版 浏览器 服务器 发起 gt 1 浏览器通知服务器浏览器所支持的加密协议 接收 接收 lt 2 服务器通知浏览器从1中选用的加密协议 并给予证书 发起 3 用CA的公钥鉴别服务器的证书是否有效 有效则生成一个随
  • element 表格两级单元格合并功能

    代码如下
  • JavaWeb中servlet到底是干什么的

    JavaWeb中servlet到底是干什么的 javaweb工程包括 src下的 java文件WebRoot下的 jsp js等文件当工程运行时 tomcat先把 jsp gt java gt class 计算机只识别 class文件ser
  • andoid逐帧动画oom_帧动画内存OOM?不存在的!—— SurfaceView逐帧解析

    Android 提供了AnimationDrawable用于实现帧动画 在动画开始之前 所有帧的图片都被解析并占用内存 一旦动画较复杂帧数较多 在低配置手机上容易发生 OOM 即使不发生 OOM 也会对内存造成不小的压力 下面代码展示了一个
  • react小书,怎么渲染列表(react)

    在我们处理数据时 假如我们现在有一个用户列表存放到数组中 const users username Jerry age 21 gender male username Tomy age 22 gender male username Lil
  • java解决高并发之数据库连接池配置

    使用的IDE是IDEA 项目是springboot框架的项目 最近一直在处理高并发的问题 大致情况是这样的 大概有五六千人会在中午十二点同时访问网站 操作数据库 导致服务器崩溃 对于频繁修改数据的这种情况 例如 用户要抢商品 且抢完后要刷新
  • kuiper安装

    1 使用docker方式安装 docker pull lfedge ekuiper latest docker run p 9081 9081 d name kuiper e MQTT SOURCE DEFAULT SERVER tcp 1
  • [Android studio] 第15节 ConstraintLayout控件

    ConstraintLayout 是 Android 中的布局容器 它是一个灵活且强大的布局工具 用于创建复杂的界面布局 它通过使用约束 constraints 来定义子视图之间的关系和对齐方式 以下是 ConstraintLayout 常
  • 【计算机网络】301 永久重定向的缓存问题

    301 永久重定向的缓存问题 问题描述 在使用 301 的时候常常会遇到一个问题 当服务端针对某个 URL 设置了 301 永久重定向后 不管怎么重新设置或者删除设置 浏览器在进行访问时仍然会使用最开始缓存的 301 重定向 而服务端无法控
  • Centos7搭建bsc全链节点

    Centos7搭建bsc全链节点 服务器配置 CPU 8 Cores 16 Threads RAM 131072 MB Storage 2x 2000GB NVMe Bandwidth 8400 65 GB of 10000 GB OS C
  • 如何打开电脑的服务选项

    1 首先找到此电脑 我的电脑 单击鼠标右键 找到管理选项 单击 管理 2 现在 打开服务设置对话框 在左侧菜单栏找到 服务和应用程序 打开下拉菜单 在下拉菜单 单击 服务 打开计算机内所有的服务 3 单击服务 打开所有的服务选项 在右面栏里
  • 电子学会 青少年软件编程(202209)(C语言)(树&堆&图)等级考试(七级)试题 T3 Sequence

    T3 Sequence 2442 Sequence Sequence Poj2442 堆 优先队列 Sequence Poj2442 堆 优先队列 Magic的博客 CSDN博客 sequence poj2442 POJ 2442 二叉堆
  • LVS+Keepalived群集

    目录 一 keepalived介绍 二 Keepalived及其工作原理 三 Keepalived原理剖析 四 Keepalived体系主要模块及其作用 实例 NFS服务器 192 168 80 200 主DR 服务器 192 168 30
  • TensorFlow2(版本2.5.0)学习笔记(含keras_bert、W2V)

    目录 一 设置CPU GPU运行环境 二 tf定义变量与简单操作 基于tf2做数据处理 Tokenizer 1 使用TF2实现token2id padding 2 基于gensim 版本 3 8 3 3 基于keras bert bert4
  • threejs物理效果和声音

    个人博客地址 https cxx001 gitee io 一 Threejs中如何创建物理场景 threejs中创建物理场景我们用它的扩展库 Physijs 它可以使场景中的对象有重力效果 可以相互碰撞 施加力之后可以移动 还可以通过合页和
  • Java 导入验证数据最后一行

    description 判断数据最后一行 author lkm date 2023 7 13 17 11 public static boolean isEmptyRow Row row if row null row toString i
  • python安装模块速度太慢了,教你一招提升百倍安装速度

    在python开发中 经常需要使用到各种各样的库 pip又是我们常用的安装工具 但是国外的源下载速度实在太慢 经常导致超时 对于这种情况我们可以修改pip的下载源为国内源 这样就可以大幅度提升下载速度 如何修改源 1 临时更换镜像源 可以通
  • 【代码实践】使用Garch模型估计VaR

    title Value at Risk estimation using GARCH model author Ionas Kelepouris Dimos Kelepouris date July 6 2019 output html d