Using the same data as in the previous session
library(corrgram)
vars2 <- c("Assists","Atbat","Errors","Hits", "Homer", "logSal", "Putouts","RBI","Runs","Walks","Years")
cor.data <- baseball[,vars2]
Regression analysis
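# Hold out row 100; fit on the remaining rows
train.data <- cor.data[-100, ]
# Simple regressions of log salary on a single predictor each
lm.model <- lm(logSal ~ RBI, data = train.data)
lm.model.2 <- lm(logSal ~ Homer, data = train.data)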
The regression models built previously
summary(lm.model)
Call:
lm(formula = logSal ~ RBI, data = train.data)

Residuals:
     Min       1Q   Median       3Q      Max
-0.81573 -0.25592  0.04475  0.25880  1.09538

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.2323885  0.0477620   46.74  < 2e-16 ***
RBI         0.0066289  0.0008286    8.00 4.12e-14 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3471 on 260 degrees of freedom
  (59 observations deleted due to missingness)
Multiple R-squared:  0.1975, Adjusted R-squared:  0.1944
F-statistic:    64 on 1 and 260 DF,  p-value: 4.125e-14
summary(lm.model.2)
Call:
lm(formula = logSal ~ Homer, data = train.data)

Residuals:
     Min       1Q   Median       3Q      Max
-0.83421 -0.30147  0.06491  0.27229  0.92774

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.400028   0.037394  64.183  < 2e-16 ***
Homer       0.014972   0.002572   5.821 1.72e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3645 on 260 degrees of freedom
  (59 observations deleted due to missingness)
Multiple R-squared:  0.1153, Adjusted R-squared:  0.1119
F-statistic: 33.88 on 1 and 260 DF,  p-value: 1.719e-08
Building a multiple regression model
lm.mult.model <- lm(logSal ~ RBI + Homer + Errors + Assists, data = train.data)
summary(lm.mult.model)
Call:
lm(formula = logSal ~ RBI + Homer + Errors + Assists, data = train.data)

Residuals:
     Min       1Q   Median       3Q      Max
-0.71681 -0.25648  0.03474  0.24695  1.06278

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.2562598  0.0529470  42.614  < 2e-16 ***
RBI          0.0086188  0.0016840   5.118 6.05e-07 ***
Homer       -0.0057834  0.0050339  -1.149  0.25167
Errors      -0.0119246  0.0045757  -2.606  0.00969 **
Assists      0.0003649  0.0002180   1.673  0.09546 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3435 on 257 degrees of freedom
  (59 observations deleted due to missingness)
Multiple R-squared:  0.2233, Adjusted R-squared:  0.2112
F-statistic: 18.47 on 4 and 257 DF,  p-value: 2.34e-13
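To see how a single predictor behaves across the two fits, its row can be pulled from each coefficient table instead of reading the full summaries side by side. The following is only a sketch: `coef_row` is a hypothetical helper that is not part of the original session, and the block rebuilds the models from `corrgram::baseball` (guarded in case the package is not installed) so that it can run by itself.

```r
# Hypothetical helper (not from the original session):
# return one term's row from an lm coefficient table.
coef_row <- function(model, term) {
  coef(summary(model))[term, , drop = TRUE]
}

if (requireNamespace("corrgram", quietly = TRUE)) {
  # Rebuild the objects used above; row 100 is held out as in the session.
  train.data <- corrgram::baseball[-100, ]
  lm.model.2 <- lm(logSal ~ Homer, data = train.data)
  lm.mult.model <- lm(logSal ~ RBI + Homer + Errors + Assists,
                      data = train.data)

  # One row per model: estimate, standard error, t value, p-value for Homer.
  print(rbind(simple   = coef_row(lm.model.2, "Homer"),
              multiple = coef_row(lm.mult.model, "Homer")))
}
```

The two rows should match the Homer lines of the summaries printed above, which makes the change in sign and p-value easier to spot.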
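In lm.model.2, the p-value for Homer was below 0.05, so Homer was a significant predictor; in the multiple regression model it no longer is (p = 0.25167). To find out which variable has the stronger effect, compare standardized coefficients: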
library(QuantPsyc)
lm.beta(lm.mult.model)
       RBI      Homer     Errors    Assists
 0.5778632 -0.1311628 -0.2038048  0.1369778
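A standardized coefficient rescales each slope by the spread of its predictor relative to the spread of the response, b_j * sd(x_j) / sd(y), so the values are comparable across predictors measured in different units. The sketch below computes this by hand in base R. `std_beta` is a hypothetical helper name, and it assumes a model with numeric predictors only (no factors or interactions); under that assumption it should reproduce the lm.beta values for the same model.

```r
# Hypothetical helper (not from the original session):
# standardized coefficients b_j * sd(x_j) / sd(y).
# Assumes all predictors are numeric main effects.
std_beta <- function(model) {
  b  <- coef(model)[-1]          # slopes, intercept dropped
  mf <- model.frame(model)       # rows actually used after NA removal
  sx <- sapply(mf[-1], sd)       # standard deviation of each predictor
  sy <- sd(mf[[1]])              # standard deviation of the response
  b * sx / sy
}

if (requireNamespace("corrgram", quietly = TRUE)) {
  # Rebuild the multiple regression model from the session above.
  train.data <- corrgram::baseball[-100, ]
  lm.mult.model <- lm(logSal ~ RBI + Homer + Errors + Assists,
                      data = train.data)
  print(std_beta(lm.mult.model))
}
```

Predictors are ranked by absolute value: in the lm.beta output above, RBI (0.578) has the largest standardized effect, followed by Errors (-0.204), Assists (0.137), and Homer (-0.131).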