22 Generalized Linear Model

Let $\mu_i = EY_{i}$

Multiple linear regression: \[ \begin{aligned} \mu_{i} = \beta_0 + \sum_{j=1}^{p}\beta_{j}x_{ij} \end{aligned} \]
Loglinear models (Poisson regression): \[ \begin{aligned} \log(\mu_{i}) = \beta_{0} + \sum_{j=1}^{p} \beta_{j}x_{ij} \end{aligned} \]
Logistic regression: \[ \begin{aligned} \log\frac{\mu_i}{1-\mu_i} = \beta_0 + \sum_{j=1}^{p} \beta_{j}x_{ij} \end{aligned} \]

22.0.1 GLM

A generalized linear model (GLM) is defined by specifying two components:

The distribution of the response, which should be a member of the exponential family
The link function, $g$, which describes how the mean of the response, $\mu_i$, is related to a linear combination of the predictors, $\beta_0 + \sum_{j=1}^{p}\beta_{j}x_{ij}$: \[ \begin{aligned} g(\mu_{i}) = \beta_0 + \sum_{j=1}^{p}\beta_{j}x_{ij} \end{aligned} \]

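Any monotone, continuous, and differentiable function will do as a link, but there are some convenient and common choices for the standard GLMs, such as the identity, log, and logit links used in the three models at the start of this chapter.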
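
As an illustration of how a link connects the mean to the linear predictor, here is a minimal Python sketch of the identity, log, and logit links together with their inverses. The function names and the numeric coefficients are made up for this example and do not come from any particular library.

```python
import math

# Link functions g: map the mean mu to the linear-predictor scale.
def identity_link(mu):
    return mu

def log_link(mu):
    return math.log(mu)

def logit_link(mu):
    return math.log(mu / (1.0 - mu))

# Inverse links g^{-1}: map the linear predictor eta back to the mean.
def inv_identity(eta):
    return eta

def inv_log(eta):
    return math.exp(eta)

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def linear_predictor(beta0, beta, x):
    """Compute beta_0 + sum_j beta_j * x_ij for one observation."""
    return beta0 + sum(b_j * x_j for b_j, x_j in zip(beta, x))

# Hypothetical coefficients and predictor values, for illustration only.
eta = linear_predictor(0.5, [1.0, -2.0], [0.3, 0.1])

mu_linear = inv_identity(eta)   # mean under multiple linear regression
mu_poisson = inv_log(eta)       # mean under a loglinear (Poisson) model
mu_logistic = inv_logit(eta)    # mean under logistic regression
```

The same linear predictor gives three different means depending on the link: the log link keeps the mean positive, and the logit link keeps it strictly between 0 and 1.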

22.0.2 Exponential Family

In a GLM, the distribution of $Y$ is from the exponential family of distributions, which take the general form: $f(y \mid \theta, \phi) = \exp\left\{ \frac{y\theta-b(\theta)}{a(\phi)} + c(y, \phi) \right\}$

$\theta$: the canonical parameter representing the location
$\phi$: the dispersion parameter representing the scale
$a$, $b$, and $c$: known functions
$EY = \mu = b'(\theta)$: the mean is a function of $\theta$ only
$\mathrm{Var}\,Y = b''(\theta)a(\phi)$: the variance is a product of a function of the location and a function of the scale
$b''(\theta)$ is called the variance function; it describes how the variance relates to the mean, using the known relationship between $\theta$ and $\mu$
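
The mean and variance identities can be checked numerically. The sketch below assumes a Poisson response, for which the standard exponential-family form has $\theta = \log\mu$, $b(\theta) = e^{\theta}$, and $a(\phi) = 1$; this choice of distribution and the helper names are assumptions made for illustration. The derivatives of $b$ are approximated with central finite differences.

```python
import math

def b(theta):
    """Cumulant function b(theta) = exp(theta), assuming a Poisson response."""
    return math.exp(theta)

def first_derivative(f, x, h=1e-4):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def second_derivative(f, x, h=1e-4):
    """Central-difference approximation to f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

mu = 3.0                # illustrative Poisson mean
theta = math.log(mu)    # canonical parameter for the Poisson
a_phi = 1.0             # a(phi) = 1 for the Poisson (no free dispersion)

mean_from_b = first_derivative(b, theta)            # approximates b'(theta)
var_from_b = second_derivative(b, theta) * a_phi    # approximates b''(theta) a(phi)
```

Both quantities come out approximately equal to `mu`, which matches the familiar fact that a Poisson variable has equal mean and variance.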

22.0.3 Normal Distribution

\[ \begin{aligned} f(y\mid \mu, \sigma^2) &= \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big\{-\frac{(y-\mu)^2}{2\sigma^2}\Big\} \\ &= \exp\Big\{\frac{y\mu - \frac{\mu^2}{2}}{\sigma^2} - \frac{1}{2}\Big[\frac{y^2}{\sigma^2} + \log(2\pi\sigma^2)\Big] \Big\} \end{aligned} \]
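
As a quick sanity check, the following sketch evaluates the normal density in its usual form and in the exponential-family form, and confirms that the two agree. Matching the exponent above against the general form suggests reading off $\theta = \mu$, $b(\theta) = \theta^2/2$, $a(\phi) = \sigma^2$, and $c(y, \phi) = -\frac{1}{2}\big[y^2/\sigma^2 + \log(2\pi\sigma^2)\big]$; the function names and test values are chosen only for illustration.

```python
import math

def normal_pdf(y, mu, sigma2):
    """Normal density in its usual form, with variance sigma2."""
    return math.exp(-(y - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def normal_pdf_expfam(y, mu, sigma2):
    """Normal density written as exp{(y*theta - b(theta)) / a(phi) + c(y, phi)}."""
    theta = mu                      # canonical parameter
    b_theta = theta ** 2 / 2.0      # b(theta) = theta^2 / 2
    a_phi = sigma2                  # a(phi) = sigma^2
    c = -0.5 * (y ** 2 / sigma2 + math.log(2.0 * math.pi * sigma2))
    return math.exp((y * theta - b_theta) / a_phi + c)

# Arbitrary illustrative values for y, mu, and sigma^2.
y, mu, sigma2 = 1.3, 0.5, 2.0
usual_form = normal_pdf(y, mu, sigma2)
expfam_form = normal_pdf_expfam(y, mu, sigma2)
```

With these identifications, $b'(\theta) = \theta = \mu$ and $b''(\theta)a(\phi) = \sigma^2$, so the general mean and variance formulas return the familiar normal mean and variance.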