
A50 & CSI300 & SSE50 Arbitrage Report

Fang submitted 2017-04-20 20:22:26

We set the data range from 2015/4/16 to 2016/8/26 to run the regression.
The independent variables for CSI300 and SSE50 are named x and y respectively, and the dependent variable A50 is named z. Each index is multiplied by its contract multiplier and converted into Chinese Yuan.
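
For concreteness, the data preparation can be sketched as below. The file name, column names and FX handling are assumptions (they are not given in the report); the multipliers used are CNY 300 per index point for the CSI300 and SSE50 futures and USD 1 per point for the SGX A50 future, with the A50 leg converted into yuan at the USD/CNY rate.

# A minimal sketch of the data preparation; "prices.csv" and its column names
# (date, a50, csi300, sse50, usdcny) are hypothetical.
prices <- read.csv("prices.csv", stringsAsFactors = FALSE)
prices$date <- as.Date(prices$date)

# Convert each index level into the CNY value of one futures contract
prices$a50_cny    <- prices$a50 * 1 * prices$usdcny   # SGX A50: USD 1 per point
prices$csi300_cny <- prices$csi300 * 300              # CSI300: CNY 300 per point
prices$sse50_cny  <- prices$sse50  * 300              # SSE50: CNY 300 per point

# Regression window 2015-04-16 to 2016-08-26; the column order is chosen so
# that rt[,2] and rt[,3:4] match the lasso calls later in the report
rt <- subset(prices,
             date >= as.Date("2015-04-16") & date <= as.Date("2016-08-26"),
             select = c(date, a50_cny, csi300_cny, sse50_cny))
x <- rt$csi300_cny
y <- rt$sse50_cny
z <- rt$a50_cny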

First, we calculated the pairwise correlations between the indexes as follows:

> cor(x,y)
[1] 0.9872779
> cor(x,z)
[1] 0.9759849

Then we ran an ordinary least squares (OLS) linear regression in R:

> lm1<-lm(z~x+y)
> lm1

Call:
lm(formula = z ~ x + y)

Coefficients:
(Intercept) x y 
1.749e+04 -1.587e-03 7.128e-02 

The result shows several defects:
1, the intercept is excessively large.
2, the coefficient for x (CSI300) is negative.
3, there is obvious collinearity between the regressors, which is difficult to eliminate (a quick check is sketched below).
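
One quick way to quantify defect 3, not used in the original report, is the variance inflation factor; a sketch assuming the car package is installed:

# Variance inflation factors for the raw-level regression; values far above 10
# indicate severe collinearity between the CSI300 and SSE50 regressors.
library(car)
vif(lm(z ~ x + y))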

With a logarithmic transformation, we set x1 = log(x), y1 = log(y) and z1 = log(z), and ran the regression again:
> x1<-log(x)
> y1<-log(y)
> z1<-log(z)
> lm2<-lm(z1~x1+y1)
> summary(lm2)

Call:
lm(formula = z1 ~ x1 + y1)

Residuals:
Min 1Q Median 3Q Max 
-0.117396 -0.010662 0.001583 0.012657 0.038213 

Coefficients:
Estimate Std. Error t value Pr(>|t|) 
(Intercept) 1.04649 0.08893 11.768 <2e-16 ***
x1 -0.02634 0.04259 -0.619 0.537 
y1 0.77376 0.04473 17.297 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.01874 on 335 degrees of freedom
Multiple R-squared: 0.9745, Adjusted R-squared: 0.9744 
F-statistic: 6407 on 2 and 335 DF, p-value: < 2.2e-16


The negative coefficient on x1 remained, and it is not statistically significant (p = 0.537).

A scatter plot is drawn to show the collinearity between x1 and y1:
> plot(x1~y1,col="red")


We then applied a regularization method to constrain the coefficient estimates and ran a ridge regression to handle the collinearity.
With the lm.ridge() function we obtained fits for 151 values of lambda and selected the lambda at which the Generalized Cross Validation (GCV) score reaches its minimum.
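
For reference, the ridge fit minimizes the usual penalized least-squares objective (standard textbook form, not quoted from the report); lm.ridge standardizes the regressors internally before applying lambda and reports coefficients back on the original scale:

\min_{\beta_0,\beta_1,\beta_2} \sum_{i=1}^{n} \left( z1_i - \beta_0 - \beta_1 x1_i - \beta_2 y1_i \right)^2 + \lambda \left( \beta_1^2 + \beta_2^2 \right)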

> library(MASS)
> ridge.sol<-lm.ridge(z1~x1+y1, lambda = seq(0,150,length =151),model = TRUE)
> ridge.sol$lambda[which.min(ridge.sol$GCV)]
> coef(ridge.sol)[which.min(ridge.sol$GCV),]
> matplot(ridge.sol$lambda,t(ridge.sol$coef),xlab = expression(lambda),ylab="coefficients",type="l",lty=1:20)

The ridge trace (coefficients plotted against lambda) is shown below:

And the graph of GCV against lambda, with a vertical line at the GCV-minimizing lambda:
> plot(ridge.sol$lambda,ridge.sol$GCV,type = "l",xlab = expression(lambda),ylab="GCV")
> abline(v=ridge.sol$lambda[which.min(ridge.sol$GCV)])


The collinearity is checked with a lasso (least absolute shrinkage and selection operator) regression using the lars package. Here rt is the underlying data frame: rt[,2] holds the dependent A50 series and rt[,3:4] the CSI300 and SSE50 series.
> library(lars)
> A<-as.matrix(rt[,3:4])
> B<-as.matrix(rt[,2])
> lass<-lars(A,B,type="lar")

The lasso method ranked SSE50 before CSI300 as expected.

> summary(lass)
LARS/LAR
Call: lars(x = A, y = B, type = "lar")
Df Rss Cp
0 1 2.3467e+10 15702.7815
1 2 4.9092e+08 1.5306
2 3 4.9015e+08 3.0000
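
To see the entry order explicitly, the coefficient path of the fit can be inspected; coef() and plot() are the standard lars methods (a sketch, not part of the original session):

# Coefficients at each LAR step: the SSE50 column becomes non-zero from step 1,
# the CSI300 column only from step 2.
coef(lass)
plot(lass)   # graphical view of the same coefficient path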

Mallows' Cp is calculated to compare the two candidate models; the smaller the Cp, the better the fit. By this criterion it would be more appropriate to hedge A50 with SSE50 only.
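
For reference, the Cp column follows Mallows' usual definition, where RSS_p is the residual sum of squares of the model with p fitted parameters and \hat{\sigma}^2 is estimated from the full model:

C_p = \frac{RSS_p}{\hat{\sigma}^2} - n + 2p
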
On the other hand, the ridge trace stabilizes in a narrow range once lambda rises above 20.
To show this, the GCV values for lambda = 0 and lambda = 20 are computed below:
> ridge.sol<-lm.ridge(z1~x1+y1, lambda = 0,model = TRUE)
> ridge.sol$GCV
0 
1.042328e-06 
> ridge.sol<-lm.ridge(z1~x1+y1, lambda = 20,model = TRUE)
> ridge.sol$GCV
20 
1.247798e-06 

The difference between the two GCV values is small.
The coefficients for x1 and y1 also stay within narrow ranges as lambda increases, as shown below (the first, unlabelled number in each output is the intercept):
> lm.ridge(z1~x1+y1, lambda = 20)
x1 y1 
1.4218741 0.2854380 0.4247126 
> lm.ridge(z1~x1+y1, lambda = 25)
x1 y1 
1.4939782 0.2933251 0.4112328 
> lm.ridge(z1~x1+y1, lambda = 30)
x1 y1 
1.5642558 0.2981725 0.4010200 
> lm.ridge(z1~x1+y1, lambda = 35)
x1 y1 
1.6330535 0.3011715 0.3928210 
> lm.ridge(z1~x1+y1, lambda = 40)
x1 y1 
1.7005682 0.3029678 0.3859563 

Hence we took coefficients of approximately 0.3 and 0.4 for x1 and y1; a sketch of the implied spread follows the comparison below.
We can compare these with the coefficients for lambda at or below 8, which are still changing noticeably:
> lm.ridge(z1~x1+y1, lambda = 2)
x1 y1 
1.1126174 0.1066454 0.6318545 
> lm.ridge(z1~x1+y1, lambda = 4)
x1 y1 
1.1595287 0.1699368 0.5631743 
> lm.ridge(z1~x1+y1, lambda = 6)
x1 y1 
1.1990122 0.2066814 0.5223915 
> lm.ridge(z1~x1+y1, lambda = 8)
x1 y1 
1.2347922 0.2305053 0.4951932 
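
As a preview of how the chosen coefficients could be used (a sketch only; the actual signal construction is left to the next report), the log-price spread implied by the ridge fit at lambda = 20 can be computed directly:

# Residual spread of A50 against the two onshore indexes under the ridge fit
# (intercept about 1.42, slopes about 0.29 and 0.42 at lambda = 20); persistent
# deviations from zero are the raw material for the arbitrage signal.
library(MASS)
cf <- coef(lm.ridge(z1 ~ x1 + y1, lambda = 20))   # intercept, x1, y1 on the original log scale
spread <- z1 - (cf[1] + cf[2] * x1 + cf[3] * y1)
plot(spread, type = "l")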

In the next report we will continue the collinearity analysis and filter the trading signals for the arbitrage trade. The price data after 2016/8/26 will be used for the backtest.
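
A minimal sketch of how that out-of-sample window could be set aside, assuming the same prices data frame from the preparation sketch above:

# Hold out the prices after the regression window for the backtest
backtest <- subset(prices,
                   date > as.Date("2016-08-26"),
                   select = c(date, a50_cny, csi300_cny, sse50_cny))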



Copyright by FangQuant.com
