Gamma Regression
Consider a continuous dependent variable that is positive-valued, such as a length of a hospital stay, time waiting or the cost of a bill. This type of data is continuous in nature and oftentimes skewed and a normal approximation does not hold.
The type of data above presents a constant Coefficient of Variation (CV), that is:
$$ \sqrt{var(Y_i)} \over E(Y_i) = \sigma $$
The identity induces a quadratic variance function:
$$ var(Y_i) = \sigma^2E(Y_i)^2 $$
To model such data a number of approaches proposed in the literature have proved useful, including Log-Normal models and Gamma Regression models
- Log Normal models - the log transformation followed by a classical linear model is fairly successful in modeling this type of data. This approximation works best when the scale parameter (sigma) is small. Indeed, the log transformed model has approximately constant variance:
$$ log(Y) \approx log(\mu) + (Y - \mu) {{\delta log(y)} \over {\delta y}}(\mu) = log(\mu) + {{Y - \mu} \over \mu} $$
$$ Var(log(Y)) \approx {Var(Y) \over \mu^2 } = {{\sigma^2\mu^2} \over \mu^2 }= \sigma^2 $$ - Gamma Regression - the Gamma regression keeps the outcome on the original scale. If one wants to work on the original scale the framework of the generalized linear model proves very fruitful.
Gamma Distributions
The Gamma family is a very flexible family of distributions with support on the positive axis. The family is indexed by two parameters. One way to parameterize which is used in SAS and focused on in this lecture is called the mean parameterization.
A variable is said to follow the Gamma distribution if the density has the form:
$$ f_Y(y) = {1 \over {\Gamma (v)y} }({{yv \over \mu}})^v exp(-{{yv} \over \mu}), 0 < y < \infty $$
where
$$ \Gamma(v) = \int_0^\infty x^{v-1}exp(-x)dx $$
The mean and variance of a Gamma distributed variable are:
$$ E(Y) = \mu $$
$$ var(Y) = {\mu^2 \over v } = \sigma^2\mu^2, \sigma^2 = {1 \over v} $$
The parameter v is called the scale parameter, and \( v / \mu \) is called the rate parameter. The inverse of the scale parameter (\(\sigma^2\)) is called the dispersion parameter.
Properties of the Gamma Distribution
- The shape of the density is controlled by v
- The chi-sqaure distribution is a special case of the Gamma. A chi-sqaure distribution with n degrees of freedom is the same as a Gamma with \( v = {n \over 2} \) and \( \mu = n \)
- The Gamma is a flexible life distribution model that may offer a good fit to some sets of failure data. However, it is not widely used as a life distribution model for common failure mechanisms.
- The Gamma does arise naturally as the time-to-first failure distribution for a parellel system with components having lifetimes distributed exponentially. If there are components in the system and all components have exponential lifetimes with mean \(\gamma\), then the total lifetime has Gamma distribution v = n and \(\mu = n\gamma \)
- The Gamma distribution is widely used in engineering, science and business.