User:Lchrisman/KeelinGradient
This is a private page of Lonnie. The goal here is to derive the gradient of the log likelihood for the Keelin distribution. This is my own scratch work, so it may well contain mistakes. The derivation actually applies to any quantile-parameterized distribution whose quantile function is linear in its coefficients, Keelin being a special case.
Let:
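- [math]\displaystyle{ x_i }[/math] = data points, [math]\displaystyle{ i=1..m }[/math] (the count [math]\displaystyle{ m }[/math] is unrelated to the density [math]\displaystyle{ m(y) }[/math] defined below)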
- [math]\displaystyle{ \vec{a} }[/math] = quantile coefficients, [math]\displaystyle{ [a_1, ..., a_k, ..., a_N] }[/math]
- [math]\displaystyle{ \vec{B}(y) }[/math] = Basis at scalar y, where [math]\displaystyle{ 0\lt y\lt 1 }[/math] is a probability
- [math]\displaystyle{ = [B_1(y), B_2(y), ..., B_k(y),...B_N(y)] }[/math]
- [math]\displaystyle{ M(y) = \vec{a} \cdot \vec{B}(y) }[/math] = y-th quantile
- [math]\displaystyle{ m^{-1}(y) = {{\partial M(y)}\over {\partial y}} = \vec{a} \cdot \vec{b}(y) }[/math] = reciprocal of the density, i.e. [math]\displaystyle{ 1/m(y) }[/math], not an inverse function
- [math]\displaystyle{ \vec{b}(y) }[/math] = the derivative basis, [math]\displaystyle{ [b_1(y), b_2(y), ..., b_k(y),...b_N(y)] }[/math]
- [math]\displaystyle{ m(y) }[/math] = probability density at the y-th quantile
- [math]\displaystyle{ y(x_i) }[/math] = the cumulative probability at [math]\displaystyle{ x_i }[/math]
- [math]\displaystyle{ \nabla_a f = \langle {{\partial f}\over{\partial a_1}},..,{{\partial f}\over{\partial a_k}},..,{{\partial f}\over{\partial a_N}} \rangle }[/math] = the gradient wrt [math]\displaystyle{ \vec{a} }[/math]
- [math]\displaystyle{ LL }[/math] = The log likelihood of the data; it is a function of [math]\displaystyle{ \vec{a} }[/math] and [math]\displaystyle{ [x_1,...,x_m] }[/math].
- = [math]\displaystyle{ \ln \left( \prod_i p( x_i | \vec{a} ) \right) = \sum_i \ln p(x_i|\vec{a}) }[/math]
- [math]\displaystyle{ p( x_i | \vec{a} ) = m( y(x_i) ) }[/math] = The probability density of the point [math]\displaystyle{ x_i }[/math]
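As a sanity check on the definitions above, here is a minimal Python sketch of the quantile function, the reciprocal density, the cumulative probability and the log likelihood. The 3-term basis (constant, logit, and (y - 0.5) times logit) is an assumption chosen for illustration, and the function names are hypothetical; any basis with a known derivative could be substituted. The cumulative probability is found by bisection, which assumes the quantile function is strictly increasing for the given coefficients.

```python
import numpy as np

def basis(y):
    """B(y): assumed 3-term basis, for illustration only."""
    y = np.asarray(y, dtype=float)
    logit = np.log(y / (1.0 - y))
    return np.stack([np.ones_like(y), logit, (y - 0.5) * logit], axis=-1)

def dbasis(y):
    """b(y) = dB/dy, the derivative basis for the assumed B(y)."""
    y = np.asarray(y, dtype=float)
    logit = np.log(y / (1.0 - y))
    inv = 1.0 / (y * (1.0 - y))
    return np.stack([np.zeros_like(y), inv, (y - 0.5) * inv + logit], axis=-1)

def quantile(a, y):
    """M(y) = a . B(y), the y-th quantile."""
    return basis(y) @ a

def inverse_density(a, y):
    """1 / m(y) = dM/dy = a . b(y)."""
    return dbasis(y) @ a

def cdf(a, x, iterations=100):
    """y(x): solve M(y) = x by bisection (assumes M is strictly increasing)."""
    x = np.asarray(x, dtype=float)
    lo = np.full_like(x, 1e-12)
    hi = np.full_like(x, 1.0 - 1e-12)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        below = quantile(a, mid) < x
        lo = np.where(below, mid, lo)
        hi = np.where(below, hi, mid)
    return 0.5 * (lo + hi)

def log_likelihood(a, x):
    """LL = sum_i ln m(y(x_i)) = -sum_i ln(a . b(y(x_i)))."""
    y = cdf(a, x)
    return -np.sum(np.log(inverse_density(a, y)))
```

With the hypothetical coefficients `a = np.array([0.0, 1.0, 0.0])` the quantile function reduces to the logit, so this sketch should reproduce the standard logistic distribution, which gives an easy way to check it.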
The gradient of the log likelihood follows from the chain rule, [math]\displaystyle{ \nabla_a \ln m = m^{-1} \nabla_a m }[/math]:
- [math]\displaystyle{ \begin{split} \nabla_a LL &= \nabla_a \sum_i \ln p(x_i|\vec{a}) \\ &= \sum_i \nabla_a \ln m(y(x_i)) \\ &= \sum_i m^{-1}(y(x_i)) \nabla_a m(y(x_i)) \end{split} }[/math]
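Since this is scratch work, the last line can be checked numerically. The sketch below compares a central-difference gradient of the log likelihood against the sum of reciprocal density times a central-difference gradient of the density. The 3-term basis, the coefficients, the data values and the helper names are all assumptions made up for this check; it says nothing about the closed form of the gradient of the density itself.

```python
import numpy as np

def dbasis(y):
    # b(y): derivative of the assumed basis [1, logit(y), (y - 0.5) * logit(y)]
    logit = np.log(y / (1.0 - y))
    inv = 1.0 / (y * (1.0 - y))
    return np.stack([np.zeros_like(y), inv, (y - 0.5) * inv + logit], axis=-1)

def quantile(a, y):
    # M(y) = a . B(y) for the assumed basis
    logit = np.log(y / (1.0 - y))
    return a[0] + a[1] * logit + a[2] * (y - 0.5) * logit

def density(a, x):
    # m(y(x_i)): invert M by bisection (assumes M is increasing),
    # then take the reciprocal of a . b(y)
    lo = np.full_like(x, 1e-12)
    hi = np.full_like(x, 1.0 - 1e-12)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        below = quantile(a, mid) < x
        lo, hi = np.where(below, mid, lo), np.where(below, hi, mid)
    y = 0.5 * (lo + hi)
    return 1.0 / (dbasis(y) @ a)

def central_diff(f, a, h=1e-5):
    # Finite-difference derivative of f with respect to each a_k; rows index k
    rows = []
    for k in range(len(a)):
        step = np.zeros_like(a)
        step[k] = h
        rows.append((f(a + step) - f(a - step)) / (2.0 * h))
    return np.array(rows)

a = np.array([0.0, 1.0, 0.1])    # hypothetical coefficients
x = np.array([-1.0, 0.2, 1.5])   # hypothetical data points

# Left side: gradient of LL = sum_i ln m(y(x_i)), one entry per coefficient
grad_ll = central_diff(lambda c: np.sum(np.log(density(c, x))), a)

# Right side: sum_i m^{-1}(y(x_i)) * grad m(y(x_i))
grad_m = central_diff(lambda c: density(c, x), a)   # one row per coefficient, one column per data point
rhs = grad_m @ (1.0 / density(a, x))

print(np.max(np.abs(grad_ll - rhs)))
```

The printed value is the largest disagreement between the two sides; it should be small compared with the gradient entries, limited only by finite-difference error.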