Midterm Cheat Sheet

Linear Regression

image-1666104229251.png

image-1666104239088.png

image-1666104245811.png

image-1666104252727.png

image-1666104260540.png

image-1666104271056.png

image-1666104276418.png

image-1666104282605.png

image-1666104287023.png

image-1666104297248.png

Multiple Linear Regression and Estimation

image-1666104323618.png

image-1666104327905.png

image-1666104332246.png

image-1666104336362.png

image-1666104345652.png

image-1666104350789.png

image-1666104355280.png

image-1666104359234.png

image-1666104363437.png

image-1666104372729.png

image-1666104378232.png

𝐻0 : 𝛽1 = 𝛽2 = 𝛽3 = ⋯ = 𝛽𝑝 = 0  
vs. 𝐻1 : not all 𝛽𝑘 = 0, 𝑘 = 1, … , 𝑝

image-1666104389032.png

image-1666104395117.png

Rejection rule for the t-test of a single coefficient (𝐻0 : 𝛽𝑘 = 0): reject 𝐻0 if |𝑡| ≥ t(1 − α/2; 𝑛 − 𝑝 − 1)
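A minimal R sketch of both tests, assuming the built-in mtcars data as a stand-in (the data set, the chosen predictors, and alpha = 0.05 are illustrative choices, not part of the notes). It compares the overall F statistic and the individual t statistics with their critical values.

```r
# Illustrative data: built-in mtcars (an assumption, not from the notes).
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

n <- nrow(mtcars)   # number of observations
p <- 3              # number of predictors (intercept not counted)
alpha <- 0.05

# Overall F-test of H0: beta_1 = ... = beta_p = 0
f <- summary(fit)$fstatistic   # named vector: value, numdf, dendf
f_crit <- qf(1 - alpha, df1 = f["numdf"], df2 = f["dendf"])
reject_overall <- unname(f["value"] > f_crit)

# Two-sided t-tests for the individual slopes
t_stats <- coef(summary(fit))[-1, "t value"]   # drop the intercept row
t_crit <- qt(1 - alpha / 2, df = n - p - 1)
reject_each <- abs(t_stats) >= t_crit
```

Comparing |t| with the critical value is equivalent to comparing the two-sided p-value reported by summary(fit) with alpha.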

image-1666104407108.png

image-1666104410593.png

image-1666104415116.png

image-1666104423017.png

Model Fitting: Inference

image-1666104576887.png

dfΩ = n − p and dfω = n − q, where p and q are the numbers of parameters in the larger model Ω and the smaller model ω

image-1666104585786.png

image-1666104643517.png

image-1666138340542.png

Reject the null hypothesis if F > F(α; p − q, n − p), the upper-α critical value of the F distribution with p − q and n − p degrees of freedom
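A short R sketch of the nested-model comparison, assuming the usual statistic F = ((RSSω − RSSΩ)/(p − q)) / (RSSΩ/(n − p)). The mtcars data, the two model formulas, and α = 0.05 are illustrative assumptions.

```r
# Illustrative data: built-in mtcars (an assumption, not from the notes).
big   <- lm(mpg ~ wt + hp + qsec, data = mtcars)  # larger model Omega
small <- lm(mpg ~ wt, data = mtcars)              # smaller model omega

n <- nrow(mtcars)
p <- length(coef(big))     # parameters in Omega (here 4)
q <- length(coef(small))   # parameters in omega (here 2)

rss_big   <- deviance(big)     # residual sum of squares of Omega
rss_small <- deviance(small)   # residual sum of squares of omega

f_stat <- ((rss_small - rss_big) / (p - q)) / (rss_big / (n - p))
f_crit <- qf(1 - 0.05, df1 = p - q, df2 = n - p)
reject <- f_stat > f_crit

# anova() on the two nested fits reports the same F statistic
anova(small, big)
```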

Dummy Variables and Analysis of Covariance
Consider a dummy variable Xi2 that is 0 for the "–" group and 1 for the "+" group:

image-1666104607845.png

An interaction between Xi1 and Xi2:

image-1666104615922.png
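A small R sketch of a dummy variable with and without an interaction, assuming the built-in mtcars data, where am is already coded 0/1 and plays the role of Xi2 and wt plays the role of Xi1 (both are illustrative choices).

```r
# Illustrative data: built-in mtcars; am (0/1) is the dummy variable.
fit_add <- lm(mpg ~ wt + am, data = mtcars)   # parallel lines: common slope
fit_int <- lm(mpg ~ wt * am, data = mtcars)   # wt + am + wt:am: separate lines

b <- coef(fit_int)

# Fitted line for the am = 0 group
intercept0 <- unname(b["(Intercept)"])
slope0     <- unname(b["wt"])

# Fitted line for the am = 1 group: the dummy shifts the intercept,
# the interaction term shifts the slope
intercept1 <- unname(b["(Intercept)"] + b["am"])
slope1     <- unname(b["wt"] + b["wt:am"])
```

With the interaction included, each group's line matches a separate simple regression fitted to that group alone.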

A model with multiple categorical variables:

image-1666104625980.png

image-1666104633030.png

Regression Diagnostics
Assumptions:
    • Error: ε ~ N(0, σ²I), i.e. the errors are
        ◦ Independent
        ◦ Equal Variance
        ◦ Normally Distributed
    • Model: E[y] = Xβ is correct
    • Unusual observations
      
Leverage Points: data points with unusual x-values

image-1666104773344.png

The hat matrix is an n × n matrix.
hii, the ith diagonal element of the hat matrix, is the leverage of the ith case.
Cases with leverage hii > 2p’/n should be looked at closely.
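A minimal R sketch of the leverage check, assuming the standard definition H = X(XᵀX)⁻¹Xᵀ and the built-in mtcars data as a stand-in (the data and the model formula are illustrative).

```r
# Illustrative data: built-in mtcars (an assumption, not from the notes).
fit <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(fit)                    # design matrix incl. intercept column
H <- X %*% solve(t(X) %*% X) %*% t(X)     # n x n hat matrix
h <- diag(H)                              # leverages h_ii

n      <- nrow(X)
pprime <- ncol(X)                         # number of parameters p'

# Cases whose leverage exceeds the 2p'/n rule of thumb
high_leverage <- which(h > 2 * pprime / n)
```

In practice hatvalues(fit) returns the same leverages without forming H explicitly.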


Outliers: Unusual observation on x or y axis

image-1666104790374.png

Compute the studentized residual for each case and compare its absolute value with the Bonferroni-corrected limit:
abs(qt(0.05 / (2 * n), df = n - pprime - 1, lower.tail = TRUE))
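A short R sketch of the outlier test, assuming externally studentized residuals from rstudent() and the built-in mtcars data as a stand-in (the data and the model formula are illustrative).

```r
# Illustrative data: built-in mtcars (an assumption, not from the notes).
fit <- lm(mpg ~ wt + hp, data = mtcars)

n      <- nrow(mtcars)
pprime <- length(coef(fit))   # number of parameters p'

t_i <- rstudent(fit)          # externally studentized residuals

# Bonferroni-corrected two-sided limit at overall level 0.05
limit <- abs(qt(0.05 / (2 * n), df = n - pprime - 1, lower.tail = TRUE))

# Cases flagged as outliers (possibly none)
outliers <- which(abs(t_i) > limit)
```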

 

Influential Points: points whose removal noticeably changes the fitted regression
    Difference in Fits (DFFITS):

image-1666104815733.png

with a threshold of

image-1666104825747.png

where p’ is the number of parameters in the model
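A brief R sketch of the DFFITS check. The threshold used here, 2·sqrt(p’/n), is the common rule of thumb and is an assumption about the formula shown in the image; the mtcars data and the model formula are also illustrative.

```r
# Illustrative data: built-in mtcars (an assumption, not from the notes).
fit <- lm(mpg ~ wt + hp, data = mtcars)

n      <- nrow(mtcars)
pprime <- length(coef(fit))        # number of parameters p'

d <- dffits(fit)                   # difference in fits for each case

# Equivalent form: studentized residual scaled by leverage
h <- hatvalues(fit)
d_manual <- rstudent(fit) * sqrt(h / (1 - h))

threshold <- 2 * sqrt(pprime / n)  # assumed rule-of-thumb cutoff
flagged <- which(abs(d) > threshold)
```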

    Cook's Distance:

image-1666104834542.png

with rule-of-thumb thresholds:
    • Di > 4/n: should be looked at
    • Di > 0.5: possibly influential
    • Di ≥ 1: very influential
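A small R sketch of the Cook's distance cutoffs listed above, assuming the built-in mtcars data as a stand-in (the data and the model formula are illustrative). It also checks the built-in values against the form Di = ri² hii / (p’(1 − hii)), where ri is the internally studentized residual.

```r
# Illustrative data: built-in mtcars (an assumption, not from the notes).
fit <- lm(mpg ~ wt + hp, data = mtcars)

n      <- nrow(mtcars)
pprime <- length(coef(fit))   # number of parameters p'

D <- cooks.distance(fit)

# Same quantity from standardized residuals and leverages
r <- rstandard(fit)
h <- hatvalues(fit)
D_manual <- r^2 * h / (pprime * (1 - h))

look_closely     <- which(D > 4 / n)   # should be looked at
possibly_infl    <- which(D > 0.5)     # possible influence
very_influential <- which(D >= 1)      # very influential
```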

Error: a plot of the residuals (e_hat) should
    • have constant variance
    • have no clear pattern
For a normality test, H0: the residuals are normally distributed
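A hedged R sketch of these residual checks, assuming the built-in mtcars data as a stand-in. Plotting against the fitted values and using the Shapiro-Wilk test are common choices, not necessarily the ones used in the course.

```r
# Illustrative data: built-in mtcars (an assumption, not from the notes).
fit <- lm(mpg ~ wt + hp, data = mtcars)
e_hat <- residuals(fit)

# Residuals vs fitted values: look for constant spread and no pattern
plot(fitted(fit), e_hat, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normal Q-Q plot: points should lie close to the reference line
qqnorm(e_hat)
qqline(e_hat)

# Shapiro-Wilk test, H0: residuals are normal (one possible test)
sw <- shapiro.test(e_hat)
normality_rejected <- sw$p.value < 0.05
```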

Broken Stick Regression

We define two basis functions where c marks the division between groups:


image-1666104891751.png

image-1666104898193.png
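A minimal R sketch of a broken stick fit, assuming the usual pair of basis functions: one equal to c − x to the left of c and 0 elsewhere, the other equal to x − c to the right of c and 0 elsewhere. The built-in cars data, the break point c = 15, and the names lhs and rhs are illustrative assumptions.

```r
# Illustrative data: built-in cars (speed vs stopping distance).
# The break point 15 is an arbitrary choice for this sketch.
c_break <- 15

lhs <- function(x) ifelse(x < c_break, c_break - x, 0)   # left basis function
rhs <- function(x) ifelse(x < c_break, 0, x - c_break)   # right basis function

fit <- lm(dist ~ lhs(speed) + rhs(speed), data = cars)

# Both basis functions are 0 at x = c, so the two line segments meet there
# and the fitted value at c equals the intercept.
pred_at_c <- predict(fit, newdata = data.frame(speed = c_break))
```

Because the two segments share the value at c, the fitted curve is continuous but its slope can change at the break point.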




image-1666105013557.png

image-1666105020734.png

image-1666105036225.png