LChrisman/Constrained regression
I consider here the solution to a regression problem with linear equality constraints:
- [math]\displaystyle{ \min_a {1\over 2} || B a - y ||^2 }[/math]
- subject to [math]\displaystyle{ C a = d }[/math]
where [math]\displaystyle{ B }[/math] is the [math]\displaystyle{ n \times k }[/math] basis matrix, [math]\displaystyle{ C }[/math] is an [math]\displaystyle{ m \times k }[/math] matrix, with [math]\displaystyle{ m }[/math] the number of equality constraints and [math]\displaystyle{ k }[/math] the number of terms, and [math]\displaystyle{ d }[/math] is an m-vector. The target data [math]\displaystyle{ y }[/math] is an n-vector (column), and the solution [math]\displaystyle{ a }[/math] is a k-vector (column).
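For concreteness, the sketches below use a small toy instance (my own illustration, in NumPy): a cubic basis ([math]\displaystyle{ k = 4 }[/math]) fit to [math]\displaystyle{ n = 6 }[/math] points, with a single constraint ([math]\displaystyle{ m = 1 }[/math]) forcing the fitted curve through the point (0, 1).

<syntaxhighlight lang="python">
import numpy as np

# Toy instance: n = 6 points, k = 4 polynomial terms, m = 1 constraint.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 6)
y = np.sin(2.0 * x) + 0.05 * rng.standard_normal(6)  # target n-vector

# B is the n x k basis matrix; columns are 1, x, x^2, x^3.
B = np.vander(x, 4, increasing=True)

# C a = d encodes the constraint a_0 = 1, i.e. the fit passes through (0, 1).
C = np.array([[1.0, 0.0, 0.0, 0.0]])
d = np.array([1.0])
</syntaxhighlight>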
When [math]\displaystyle{ B }[/math] and [math]\displaystyle{ C }[/math] are of full rank with [math]\displaystyle{ k \ge n + m }[/math], the solution passes through all points. Each constraint removes one degree of freedom, so interpolation requires [math]\displaystyle{ k - m \ge n }[/math] rather than [math]\displaystyle{ k \ge n }[/math]: even in the full-rank case, a perfect fit may require more terms than data points.
The Theil and van der Panne (1960) solution to this constrained regression is
- [math]\displaystyle{ \lambda = \left( C (B^T B)^{-1} C^T \right)^{-1} (C a_{ols} - d) }[/math]
- [math]\displaystyle{ a^* = a_{ols} - (B^T B)^{-1} C^T \lambda }[/math]

where [math]\displaystyle{ a_{ols} = (B^T B)^{-1} B^T y }[/math] is the unconstrained least-squares solution.
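A direct NumPy transcription of these formulas (the function name is mine; it assumes [math]\displaystyle{ B^T B }[/math] is nonsingular, i.e. full-rank [math]\displaystyle{ B }[/math] with [math]\displaystyle{ n \ge k }[/math]):

<syntaxhighlight lang="python">
def theil_van_der_panne(B, y, C, d):
    """Equality-constrained least squares via the Theil-van der Panne
    formulas above. Valid only when B^T B is nonsingular."""
    BtB = B.T @ B
    a_ols = np.linalg.solve(BtB, B.T @ y)   # unconstrained OLS solution
    W = np.linalg.solve(BtB, C.T)           # (B^T B)^{-1} C^T
    lam = np.linalg.solve(C @ W, C @ a_ols - d)
    return a_ols - W @ lam

a_star = theil_van_der_panne(B, y, C, d)
assert np.allclose(C @ a_star, d)           # constraints hold exactly
</syntaxhighlight>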
When [math]\displaystyle{ k\gt n }[/math], [math]\displaystyle{ B^T B }[/math] is singular. Simply replacing the inverse with a Moore-Penrose pseudoinverse does not work in this case, because the pseudoinverse selects a single solution that, in general, is not the optimum of the constrained problem.
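To see this failure mode numerically, one can substitute np.linalg.pinv into the formulas above on an underdetermined instance (the sizes and the variant below are my own illustration):

<syntaxhighlight lang="python">
# Underdetermined toy case: n = 3 points, k = 5 terms, m = 1 constraint.
rng = np.random.default_rng(1)
B5 = rng.standard_normal((3, 5))
y5 = rng.standard_normal(3)
C5 = rng.standard_normal((1, 5))
d5 = np.array([0.5])

def tvdp_pinv(B, y, C, d):
    """Theil-van der Panne with Moore-Penrose pseudoinverses. Feasible in
    the generic case, but generally not the constrained optimum."""
    M = np.linalg.pinv(B.T @ B)
    a_ols = M @ (B.T @ y)
    lam = np.linalg.pinv(C @ M @ C.T) @ (C @ a_ols - d)
    return a_ols - M @ C.T @ lam
</syntaxhighlight>

Comparing the residual [math]\displaystyle{ ||Ba - y|| }[/math] of this variant with the general solution sketched below shows it is typically strictly larger.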
My intention here is to document the general solution that continues to work when [math]\displaystyle{ k\gt n }[/math], and hopefully even when [math]\displaystyle{ k \gt n + m }[/math].
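As a point of reference while that general solution is developed, here is a minimal sketch of one standard textbook route, the nullspace (reduced) method. It requires only that [math]\displaystyle{ C }[/math] have full row rank and works regardless of the rank of [math]\displaystyle{ B^T B }[/math], including [math]\displaystyle{ k \gt n + m }[/math]; it is not necessarily the solution this page is building toward.

<syntaxhighlight lang="python">
def constrained_lstsq(B, y, C, d):
    """Equality-constrained least squares via the nullspace method:
    write a = a_p + Z w with C a_p = d and Z an orthonormal basis of
    null(C), then solve min_w ||B (a_p + Z w) - y|| unconstrained."""
    a_p = np.linalg.lstsq(C, d, rcond=None)[0]  # a feasible particular point
    _, s, Vt = np.linalg.svd(C)
    r = int(np.sum(s > 1e-12 * s[0]))           # numerical rank of C
    Z = Vt[r:].T                                # orthonormal basis of null(C)
    w = np.linalg.lstsq(B @ Z, y - B @ a_p, rcond=None)[0]
    return a_p + Z @ w

a_ns = constrained_lstsq(B5, y5, C5, d5)
a_pi = tvdp_pinv(B5, y5, C5, d5)
assert np.allclose(C5 @ a_ns, d5)               # feasibility
# The nullspace solution's residual never exceeds the pseudoinverse variant's:
assert (np.linalg.norm(B5 @ a_ns - y5)
        <= np.linalg.norm(B5 @ a_pi - y5) + 1e-12)
</syntaxhighlight>

When [math]\displaystyle{ k \gt n + m }[/math], the inner system [math]\displaystyle{ B Z w = y - B a_p }[/math] is itself underdetermined and lstsq returns its minimum-norm minimizer, so the sketch picks one particular constrained optimum among many.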