We set the data range from 2015/4/16 to 2016/8/26 to run the regression.
The independent variables, CSI300 and SSE50, are named x and y respectively. The dependent variable, A50, is named z. Each index is multiplied by its contract multiplier and converted into Chinese Yuan.
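The conversion described above can be sketched as follows. This is a minimal illustration, not the script used for this article: the multipliers (300 CNY per point for the CSI300 and SSE50 contracts, 1 USD per point for the A50 contract), the index levels, and the USD/CNY rate are all assumed values chosen for the example, and `to_cny` is a hypothetical helper.

```r
# Hypothetical helper: contract value in CNY = index level * multiplier * FX rate.
to_cny <- function(level, multiplier, fx_to_cny = 1) {
  level * multiplier * fx_to_cny
}

# Illustrative inputs only (not taken from the data set used in this article).
csi300_level <- 3200   # assumed index level
sse50_level  <- 2200   # assumed index level
a50_level    <- 9500   # assumed index level
usd_cny      <- 6.6    # assumed USD/CNY exchange rate

x <- to_cny(csi300_level, multiplier = 300)                     # assumed CNY contract
y <- to_cny(sse50_level,  multiplier = 300)                     # assumed CNY contract
z <- to_cny(a50_level,    multiplier = 1, fx_to_cny = usd_cny)  # assumed USD contract, converted
```

Putting all three series in the same currency and on a per-contract basis makes the coefficients of the level regression readable as contract-value ratios.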
First, we calculated the pairwise correlations between the indexes as follows:
> cor(x,y)
[1] 0.9872779
> cor(x,z)
[1] 0.9759849
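A correlation this high between the two regressors is why collinearity matters for the regressions that follow. As a rough gauge, with exactly two regressors the variance inflation factor (VIF) of each is 1 / (1 - r^2), where r is their correlation. The sketch below plugs in the cor(x,y) value reported above; the variable names are illustrative.

```r
# Correlation between the two regressors, copied from the cor(x,y) output above.
r_xy <- 0.9872779

# With two regressors, regressing one on the other gives R^2 = r^2,
# so both share the same variance inflation factor.
vif <- 1 / (1 - r_xy^2)
print(round(vif, 1))
```

This gives a VIF of roughly 40, well above the common rule-of-thumb threshold of 10, so the individual coefficients on x and y are likely to be estimated imprecisely.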
> lm1<-lm(z~x+y)
> lm1
Call:
lm(formula = z ~ x + y)
Coefficients:
(Intercept)            x            y
  1.749e+04   -1.587e-03    7.128e-02
> x1<-log(x)
> y1<-log(y)
> z1<-log(z)
> lm2<-lm(z1~x1+y1)
> summary(lm2)
Call:
lm(formula = z1 ~ x1 + y1)
Residuals:
      Min        1Q    Median        3Q       Max
-0.117396 -0.010662  0.001583  0.012657  0.038213
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.04649    0.08893  11.768   <2e-16 ***
x1          -0.02634    0.04259  -0.619    0.537
y1           0.77376    0.04473  17.297   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.01874 on 335 degrees of freedom
Multiple R-squared:  0.9745, Adjusted R-squared:  0.9744
F-statistic:  6407 on 2 and 335 DF,  p-value: < 2.2e-16
> plot(x1~y1,col="red")
We applied a regularization method that limits the size of the regression coefficients, and ran a ridge regression to handle the collinearity in the data.
With the lm.ridge() function, we fitted the model for 151 values of lambda (0 to 150) and selected the lambda at which the Generalized Cross Validation (GCV) score is at its minimum.
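For reference, the textbook forms of the two quantities involved are given below. The notation is an assumption introduced here, not taken from the article: X is the matrix of centered regressors, z the centered response, and n the number of observations. lm.ridge() additionally rescales the regressors internally, so its lambda values are on that rescaled basis.

```latex
\begin{aligned}
\hat{\beta}(\lambda) &= \arg\min_{\beta}\; \lVert z - X\beta \rVert^{2} + \lambda \lVert \beta \rVert^{2}
                      = (X^{\top}X + \lambda I)^{-1} X^{\top} z, \\
H(\lambda) &= X\,(X^{\top}X + \lambda I)^{-1} X^{\top}, \\
\mathrm{GCV}(\lambda) &= \frac{\tfrac{1}{n}\,\lVert (I - H(\lambda))\, z \rVert^{2}}
                              {\bigl(1 - \operatorname{tr} H(\lambda)/n \bigr)^{2}} .
\end{aligned}
```

At lambda = 0 the estimator reduces to ordinary least squares; larger values of lambda shrink the slope coefficients toward zero, while the intercept is not penalized.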
> library(MASS)
> ridge.sol<-lm.ridge(z1~x1+y1, lambda = seq(0,150,length =151),model = TRUE)
> ridge.sol$lambda[which.min(ridge.sol$GCV)]
> coef(ridge.sol)[which.min(ridge.sol$GCV),]
> matplot(ridge.sol$lambda,t(ridge.sol$coef),xlab = expression(lambda),ylab="coefficients",type="l",lty=1:20)
> plot(ridge.sol$lambda,ridge.sol$GCV,type = "l",xlab = expression(lambda),ylab="GCV")
> abline(v=ridge.sol$lambda[which.min(ridge.sol$GCV)])
> library(lars)
> A<-as.matrix(rt[,3:4])
> B<-as.matrix(rt[,2])
> lass=lars(A,B,type="lar")
> summary(lass)
LARS/LAR
Call: lars(x = A, y = B, type = "lar")
  Df        Rss         Cp
0  1 2.3467e+10 15702.7815
1  2 4.9092e+08     1.5306
2  3 4.9015e+08     3.0000
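One way to read this table is to pick the step with the smallest Mallows' Cp. The sketch below simply re-enters the figures printed above into a data frame (the names `cp_table` and `best` are illustrative) and locates the minimum.

```r
# Values copied from the summary(lass) output above.
cp_table <- data.frame(
  step = 0:2,
  Df   = 1:3,
  Rss  = c(2.3467e+10, 4.9092e+08, 4.9015e+08),
  Cp   = c(15702.7815, 1.5306, 3.0000)
)

# Step with the lowest Cp.
best <- cp_table[which.min(cp_table$Cp), ]
print(best)
```

The minimum falls at step 1 (Cp = 1.5306), that is, after a single predictor has entered the model; adding the second predictor lowers the residual sum of squares only marginally.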
> ridge.sol<-lm.ridge(z1~x1+y1, lambda = 0,model = TRUE)
> ridge.sol$GCV
           0
1.042328e-06
> ridge.sol<-lm.ridge(z1~x1+y1, lambda = 20,model = TRUE)
> ridge.sol$GCV
          20
1.247798e-06
> lm.ridge(z1~x1+y1, lambda = 20)
                 x1        y1
1.4218741 0.2854380 0.4247126
> lm.ridge(z1~x1+y1, lambda = 25)
                 x1        y1
1.4939782 0.2933251 0.4112328
> lm.ridge(z1~x1+y1, lambda = 30)
                 x1        y1
1.5642558 0.2981725 0.4010200
> lm.ridge(z1~x1+y1, lambda = 35)
                 x1        y1
1.6330535 0.3011715 0.3928210
> lm.ridge(z1~x1+y1, lambda = 40)
                 x1        y1
1.7005682 0.3029678 0.3859563
> lm.ridge(z1~x1+y1, lambda = 2)
                 x1        y1
1.1126174 0.1066454 0.6318545
> lm.ridge(z1~x1+y1, lambda = 4)
                 x1        y1
1.1595287 0.1699368 0.5631743
> lm.ridge(z1~x1+y1, lambda = 6)
                 x1        y1
1.1990122 0.2066814 0.5223915
> lm.ridge(z1~x1+y1, lambda = 8)
                 x1        y1
1.2347922 0.2305053 0.4951932
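The repeated calls above can be collapsed into one loop that tabulates the coefficients and the GCV score for each lambda. The sketch below is self-contained, so it uses simulated collinear data rather than the series from this article; the data-generating values, the seed, and the names `sim` and `results` are assumptions for illustration, and the numbers it prints will not match the output above.

```r
library(MASS)  # provides lm.ridge()

set.seed(1)
n  <- 338                               # 335 residual df + 3 coefficients, as in summary(lm2)
x1 <- rnorm(n, mean = 13.7, sd = 0.1)   # simulated log series
y1 <- x1 - 0.4 + rnorm(n, sd = 0.015)   # built to be highly correlated with x1
z1 <- 1 + 0.8 * y1 + rnorm(n, sd = 0.02)
sim <- data.frame(x1, y1, z1)

lambdas <- c(0, 2, 4, 6, 8, 20, 25, 30, 35, 40)

# One row per lambda: intercept and slopes on the original scale, plus GCV.
results <- t(sapply(lambdas, function(l) {
  fit <- lm.ridge(z1 ~ x1 + y1, data = sim, lambda = l)
  cf  <- unname(coef(fit))              # order: intercept, x1, y1
  c(lambda = l, intercept = cf[1], x1 = cf[2], y1 = cf[3],
    GCV = unname(fit$GCV))
}))
print(round(results, 6))
```

Note that coef() on an lm.ridge fit returns coefficients on the original scale of the variables, whereas the `$coef` component holds them on the internally rescaled basis, which is what the matplot() call earlier in this article draws.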
Copyright by FangQuant.com