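To estimate linear models in R, use the function lm() (short for linear models). Here the response Species is regressed on Area, Elevation, Nearest, Scruz and Adjacent, all taken from the data frame d.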
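The fitted model has been stored in the variable m, and every result can be extracted from it. The output below shows, in this order: the design matrix, the coefficients, the fitted values, the residuals, the residual sum of squares, the coefficients again, the full summary, and the \(R^2\):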
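The calls that produced the output below are not shown, so here is a plausible reconstruction. It assumes that d holds the Galápagos data that appear in the printed design matrix (they look like the gala data set of the faraway package, which is an assumption). The data frame is re-typed from that matrix, and Species, which is never printed, is recovered as fitted value plus residual. The extractor functions are standard R; the object names X, b, yhat, e, rss, s and r2 are chosen here for illustration. If the transcription is right, the results should agree with the output that follows.

```r
# Data re-typed from the design matrix printed below (rows in the same order,
# Baltra ... Wolf). Species is an assumption: fitted value + residual.
d <- data.frame(
  Species   = c(58, 31, 3, 25, 2, 18, 24, 10, 8, 2, 97, 93, 58, 5, 40,
                347, 51, 2, 104, 108, 12, 70, 280, 237, 444, 62, 285, 44, 16, 21),
  Area      = c(25.09, 1.24, 0.21, 0.10, 0.05, 0.34, 0.08, 2.33, 0.03, 0.18,
                58.27, 634.49, 0.57, 0.78, 17.35, 4669.32, 129.49, 0.01, 59.56, 17.95,
                0.23, 4.89, 551.62, 572.33, 903.82, 24.08, 170.92, 1.84, 1.24, 2.85),
  Elevation = c(346, 109, 114, 46, 77, 119, 93, 168, 71, 112,
                198, 1494, 49, 227, 76, 1707, 343, 25, 777, 458,
                94, 367, 716, 906, 864, 259, 640, 147, 186, 253),
  Nearest   = c(0.6, 0.6, 2.8, 1.9, 1.9, 8.0, 6.0, 34.1, 0.4, 2.6,
                1.1, 4.3, 1.1, 4.6, 47.4, 0.7, 29.1, 3.3, 29.1, 10.7,
                0.5, 4.4, 45.2, 0.2, 0.6, 16.5, 2.6, 0.6, 6.8, 34.1),
  Scruz     = c(0.6, 26.3, 58.7, 47.4, 1.9, 8.0, 12.0, 290.2, 0.4, 50.2,
                88.3, 95.3, 93.1, 62.2, 92.2, 28.1, 85.9, 45.9, 119.6, 10.7,
                0.6, 24.4, 66.6, 19.8, 0.0, 16.5, 49.2, 9.6, 50.9, 254.7),
  Adjacent  = c(1.84, 572.33, 0.78, 0.18, 903.82, 1.84, 0.34, 2.85, 17.95, 0.10,
                0.57, 4669.32, 58.27, 0.21, 129.49, 634.49, 59.56, 0.10, 129.49, 0.03,
                25.09, 572.33, 0.57, 4.89, 0.52, 0.52, 0.10, 25.09, 17.95, 2.33)
)

# Fit the linear model; the formula is the one shown in the Call of the summary.
m <- lm(Species ~ Area + Elevation + Nearest + Scruz + Adjacent, data = d)

X    <- model.matrix(m)  # design matrix, including the (Intercept) column of ones
b    <- coef(m)          # estimated coefficients (coefficients(m) is an alias)
yhat <- fitted(m)        # fitted values
e    <- residuals(m)     # residuals
rss  <- deviance(m)      # residual sum of squares
s    <- summary(m)       # coefficient table, residual standard error, F test
r2   <- s$r.squared      # multiple R-squared

print(s)
```

Row names are omitted from the re-typed data frame, so the fitted values and residuals are labelled 1 to 30 rather than by island.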
## (Intercept) Area Elevation Nearest Scruz Adjacent
## Baltra 1 25.09 346 0.6 0.6 1.84
## Bartolome 1 1.24 109 0.6 26.3 572.33
## Caldwell 1 0.21 114 2.8 58.7 0.78
## Champion 1 0.10 46 1.9 47.4 0.18
## Coamano 1 0.05 77 1.9 1.9 903.82
## Daphne.Major 1 0.34 119 8.0 8.0 1.84
## Daphne.Minor 1 0.08 93 6.0 12.0 0.34
## Darwin 1 2.33 168 34.1 290.2 2.85
## Eden 1 0.03 71 0.4 0.4 17.95
## Enderby 1 0.18 112 2.6 50.2 0.10
## Espanola 1 58.27 198 1.1 88.3 0.57
## Fernandina 1 634.49 1494 4.3 95.3 4669.32
## Gardner1 1 0.57 49 1.1 93.1 58.27
## Gardner2 1 0.78 227 4.6 62.2 0.21
## Genovesa 1 17.35 76 47.4 92.2 129.49
## Isabela 1 4669.32 1707 0.7 28.1 634.49
## Marchena 1 129.49 343 29.1 85.9 59.56
## Onslow 1 0.01 25 3.3 45.9 0.10
## Pinta 1 59.56 777 29.1 119.6 129.49
## Pinzon 1 17.95 458 10.7 10.7 0.03
## Las.Plazas 1 0.23 94 0.5 0.6 25.09
## Rabida 1 4.89 367 4.4 24.4 572.33
## SanCristobal 1 551.62 716 45.2 66.6 0.57
## SanSalvador 1 572.33 906 0.2 19.8 4.89
## SantaCruz 1 903.82 864 0.6 0.0 0.52
## SantaFe 1 24.08 259 16.5 16.5 0.52
## SantaMaria 1 170.92 640 2.6 49.2 0.10
## Seymour 1 1.84 147 0.6 9.6 25.09
## Tortuga 1 1.24 186 6.8 50.9 17.95
## Wolf 1 2.85 253 34.1 254.7 2.33
## attr(,"assign")
## [1] 0 1 2 3 4 5
## (Intercept) Area Elevation Nearest Scruz
## 7.068220709 -0.023938338 0.319464761 0.009143961 -0.240524230
## Adjacent
## -0.074804832
## Baltra Bartolome Caldwell Champion Coamano
## 116.7259460 -7.2731544 29.3306594 10.3642660 -36.3839155
## Daphne.Major Daphne.Minor Darwin Eden Enderby
## 43.0877052 33.9196678 -9.0189919 28.3142017 30.7859425
## Espanola Fernandina Gardner1 Gardner2 Genovesa
## 47.6564865 96.9895982 -4.0332759 64.6337956 -0.4971756
## Isabela Marchena Onslow Pinta Pinzon
## 386.4035578 88.6945404 4.0372328 215.6794862 150.4753750
## Las.Plazas Rabida SanCristobal SanSalvador SantaCruz
## 35.0758066 75.5531221 206.9518779 277.6763183 261.4164131
## SantaFe SantaMaria Seymour Tortuga Wolf
## 85.3764857 195.6166286 49.8050946 52.9357316 26.7005735
## Baltra Bartolome Caldwell Champion Coamano
## -58.725946 38.273154 -26.330659 14.635734 38.383916
## Daphne.Major Daphne.Minor Darwin Eden Enderby
## -25.087705 -9.919668 19.018992 -20.314202 -28.785943
## Espanola Fernandina Gardner1 Gardner2 Genovesa
## 49.343513 -3.989598 62.033276 -59.633796 40.497176
## Isabela Marchena Onslow Pinta Pinzon
## -39.403558 -37.694540 -2.037233 -111.679486 -42.475375
## Las.Plazas Rabida SanCristobal SanSalvador SantaCruz
## -23.075807 -5.553122 73.048122 -40.676318 182.583587
## SantaFe SantaMaria Seymour Tortuga Wolf
## -23.376486 89.383371 -5.805095 -36.935732 -5.700573
## [1] 89231.37
## (Intercept) Area Elevation Nearest Scruz
## 7.068220709 -0.023938338 0.319464761 0.009143961 -0.240524230
## Adjacent
## -0.074804832
##
## Call:
## lm(formula = Species ~ Area + Elevation + Nearest + Scruz + Adjacent,
## data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -111.679 -34.898 -7.862 33.460 182.584
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.068221 19.154198 0.369 0.715351
## Area -0.023938 0.022422 -1.068 0.296318
## Elevation 0.319465 0.053663 5.953 3.82e-06 ***
## Nearest 0.009144 1.054136 0.009 0.993151
## Scruz -0.240524 0.215402 -1.117 0.275208
## Adjacent -0.074805 0.017700 -4.226 0.000297 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 60.98 on 24 degrees of freedom
## Multiple R-squared: 0.7658, Adjusted R-squared: 0.7171
## F-statistic: 15.7 on 5 and 24 DF, p-value: 6.838e-07
## [1] 0.7658469
The model we now want to estimate has no intercept:
\[\begin{equation} y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \epsilon_i, \ i = 1,2,\cdots,n \end{equation}\]
that is, \(\beta_0 = 0\). In matrix form this is \(y = X\beta + \epsilon\), where \(X\) no longer contains a column of ones:
\[\begin{equation} X = \begin{bmatrix} x_{11} & x_{21} & x_{31} \\ x_{12} & x_{22} & x_{32} \\ \vdots & \vdots & \vdots \\ x_{1n} & x_{2n} & x_{3n} \end{bmatrix} , \ \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix} \end{equation}\]
In R, this model is estimated by adding a zero to the list of regressors in the formula, as the Call line of the output shows:
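The fit reported below can be sketched as follows. As an assumption, d is re-typed from the design matrix printed earlier in this section, with Species recovered as fitted value plus residual. The names m0, X and beta_hat are illustrative. The sketch also checks that the design matrix has lost its column of ones, and that lm() returns the same coefficients as the normal equations \(\hat\beta = (X'X)^{-1}X'y\).

```r
# Data re-typed from the design matrix printed earlier (an assumption);
# Species = fitted value + residual, rows in the order Baltra ... Wolf.
d <- data.frame(
  Species   = c(58, 31, 3, 25, 2, 18, 24, 10, 8, 2, 97, 93, 58, 5, 40,
                347, 51, 2, 104, 108, 12, 70, 280, 237, 444, 62, 285, 44, 16, 21),
  Area      = c(25.09, 1.24, 0.21, 0.10, 0.05, 0.34, 0.08, 2.33, 0.03, 0.18,
                58.27, 634.49, 0.57, 0.78, 17.35, 4669.32, 129.49, 0.01, 59.56, 17.95,
                0.23, 4.89, 551.62, 572.33, 903.82, 24.08, 170.92, 1.84, 1.24, 2.85),
  Elevation = c(346, 109, 114, 46, 77, 119, 93, 168, 71, 112,
                198, 1494, 49, 227, 76, 1707, 343, 25, 777, 458,
                94, 367, 716, 906, 864, 259, 640, 147, 186, 253),
  Nearest   = c(0.6, 0.6, 2.8, 1.9, 1.9, 8.0, 6.0, 34.1, 0.4, 2.6,
                1.1, 4.3, 1.1, 4.6, 47.4, 0.7, 29.1, 3.3, 29.1, 10.7,
                0.5, 4.4, 45.2, 0.2, 0.6, 16.5, 2.6, 0.6, 6.8, 34.1),
  Scruz     = c(0.6, 26.3, 58.7, 47.4, 1.9, 8.0, 12.0, 290.2, 0.4, 50.2,
                88.3, 95.3, 93.1, 62.2, 92.2, 28.1, 85.9, 45.9, 119.6, 10.7,
                0.6, 24.4, 66.6, 19.8, 0.0, 16.5, 49.2, 9.6, 50.9, 254.7),
  Adjacent  = c(1.84, 572.33, 0.78, 0.18, 903.82, 1.84, 0.34, 2.85, 17.95, 0.10,
                0.57, 4669.32, 58.27, 0.21, 129.49, 634.49, 59.56, 0.10, 129.49, 0.03,
                25.09, 572.33, 0.57, 4.89, 0.52, 0.52, 0.10, 25.09, 17.95, 2.33)
)

# The leading 0 removes the intercept from the model.
m0 <- lm(Species ~ 0 + Area + Elevation + Nearest + Scruz + Adjacent, data = d)

# The design matrix now has one column per regressor and no "(Intercept)".
X <- model.matrix(m0)
print(colnames(X))

# Same estimates from the normal equations (X'X) beta = X'y.
beta_hat <- solve(crossprod(X), crossprod(X, d$Species))
print(cbind(lm = coef(m0), normal_eq = drop(beta_hat)))

print(summary(m0))
```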
##
## Call:
## lm(formula = Species ~ 0 + Area + Elevation + Nearest + Scruz +
## Adjacent, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -116.638 -31.142 -7.858 37.744 182.422
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## Area -0.02664 0.02082 -1.280 0.212373
## Elevation 0.33065 0.04351 7.600 5.9e-08 ***
## Nearest 0.02590 1.03480 0.025 0.980232
## Scruz -0.21359 0.19913 -1.073 0.293682
## Adjacent -0.07646 0.01682 -4.545 0.000121 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 59.91 on 25 degrees of freedom
## Multiple R-squared: 0.8502, Adjusted R-squared: 0.8202
## F-statistic: 28.38 on 5 and 25 DF, p-value: 1.515e-09
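The \(R^2\) reported for this model (0.8502) is higher than that of the model with intercept fitted earlier (0.7658), but the two values are not comparable. When the intercept is removed, R computes \(R^2\) against the uncentered total sum of squares \(\sum y_i^2\) instead of \(\sum (y_i - \bar{y})^2\), so a higher value does not mean a better fit.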
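A small sketch of where the reported \(R^2\) of a no-intercept model comes from. It uses simulated data rather than the Galápagos data, and all object names are hypothetical. The value printed by summary() for the model without intercept coincides with \(1 - RSS / \sum y_i^2\). Recomputing it against the centered sum of squares gives a number on the same scale as the model with intercept.

```r
set.seed(1)                        # simulated data, for illustration only
x <- runif(50, 0, 10)
y <- 0.5 + 2 * x + rnorm(50, sd = 3)

m_int <- lm(y ~ x)                 # model with intercept
m_no  <- lm(y ~ 0 + x)             # model without intercept

rss_no <- deviance(m_no)           # residual sum of squares, no intercept

r2_reported   <- summary(m_no)$r.squared
r2_uncentered <- 1 - rss_no / sum(y^2)              # what R reports here
r2_centered   <- 1 - rss_no / sum((y - mean(y))^2)  # comparable with m_int

print(c(with_intercept = summary(m_int)$r.squared,
        reported       = r2_reported,
        uncentered     = r2_uncentered,
        centered       = r2_centered))
```

Because the model without intercept is nested in the model with intercept, its residual sum of squares can never be smaller, so the centered version never exceeds the \(R^2\) of the model with intercept.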
Another option, equivalent to the previous one, is to write -1 instead of 0 in the formula:
##
## Call:
## lm(formula = Species ~ -1 + Area + Elevation + Nearest + Scruz +
## Adjacent, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -116.638 -31.142 -7.858 37.744 182.422
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## Area -0.02664 0.02082 -1.280 0.212373
## Elevation 0.33065 0.04351 7.600 5.9e-08 ***
## Nearest 0.02590 1.03480 0.025 0.980232
## Scruz -0.21359 0.19913 -1.073 0.293682
## Adjacent -0.07646 0.01682 -4.545 0.000121 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 59.91 on 25 degrees of freedom
## Multiple R-squared: 0.8502, Adjusted R-squared: 0.8202
## F-statistic: 28.38 on 5 and 25 DF, p-value: 1.515e-09
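The two outputs above are identical, as expected. A quick way to confirm that the formula variants are interchangeable is sketched below. It uses simulated data and hypothetical names (sim, fit_zero, fit_minus, fit_tail), not the Galápagos data.

```r
set.seed(42)                       # simulated data, for illustration only
sim <- data.frame(x1 = rnorm(40), x2 = rnorm(40))
sim$y <- 1.5 * sim$x1 - 0.5 * sim$x2 + rnorm(40)

# Three ways of writing a model without intercept.
fit_zero  <- lm(y ~ 0 + x1 + x2, data = sim)
fit_minus <- lm(y ~ -1 + x1 + x2, data = sim)
fit_tail  <- lm(y ~ x1 + x2 - 1, data = sim)

# All three should give the same coefficients.
same <- isTRUE(all.equal(coef(fit_zero), coef(fit_minus))) &&
  isTRUE(all.equal(coef(fit_zero), coef(fit_tail)))
print(same)

# The terms object records whether an intercept is present (0 = absent).
print(attr(terms(fit_zero), "intercept"))
```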