Abstract:
Linear algebra operations are at the core of many computational tasks. For example,
evaluating the density of a multivariate normal distribution requires the solution of a linear
equation system and the determinant of a square matrix. Frequently, and in particular in
machine learning, the matrices involved are too large for exact solutions to be computed,
which necessitates approximation. Building upon recent work (Hennig and Kiefel 2012; Hennig
2015), this thesis considers numerical linear algebra from a probabilistic perspective.
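To make the opening example concrete, both quantities can be obtained from a single Cholesky factorization of the covariance matrix; the following is a minimal NumPy/SciPy sketch (illustrative only, not code from the thesis):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def mvn_logpdf(x, mu, Sigma):
    """Log-density of a multivariate normal N(mu, Sigma).

    Requires the solution of the linear system Sigma @ alpha = x - mu and the
    log-determinant of Sigma; both follow from one Cholesky factorization.
    """
    d = x.shape[0]
    cf = cho_factor(Sigma, lower=True)             # Sigma = L @ L.T
    r = x - mu
    alpha = cho_solve(cf, r)                       # alpha = Sigma^{-1} @ r
    logdet = 2.0 * np.sum(np.log(np.diag(cf[0])))  # log|Sigma| from the factor's diagonal
    return -0.5 * (r @ alpha + logdet + d * np.log(2.0 * np.pi))
```

For matrices of the size common in machine learning, this exact factorization, cubic in the dimension, is precisely the step that becomes infeasible and has to be approximated.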
Part iii establishes connections between approximate linear solvers and Gaussian inference,
with a focus on projection methods. One result is the observation that solution-based
inference (Cockayne, Oates, Ipsen, and Girolami 2018) is subsumed by the matrix-based
inference perspective (Hennig and Kiefel 2012). Part iv shows how the probabilistic
viewpoint leads to a novel algorithm for kernel least-squares problems. A Gaussian model
over kernel functions is proposed that uses the matrix multiplications computed by conjugate
gradients to obtain a low-rank approximation of the kernel. The derived algorithm, kernel
machine conjugate gradients, provides empirically better approximations than conjugate
gradients and, when used for Gaussian process regression, additionally yields estimates
of the posterior variance and the log-marginal likelihood without the need for an additional run.
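The kernel machine conjugate gradients algorithm itself is developed in the thesis and not reproduced here; as background for the connection it builds on, the following minimal NumPy sketch shows the standard fact that the search directions of conjugate gradients, obtained from one kernel-matrix multiplication per iteration, induce a low-rank approximation of the inverse kernel matrix:

```python
import numpy as np

def cg_with_lowrank_inverse(A, b, num_iters):
    """Plain conjugate gradients for A @ x = b (with x0 = 0), additionally
    accumulating C = sum_k d_k d_k^T / (d_k^T A d_k), a low-rank approximation
    of A^{-1} built from the search directions d_k."""
    x = np.zeros_like(b)
    r = b.copy()                    # residual b - A @ x for x0 = 0
    d = r.copy()                    # first search direction
    C = np.zeros((len(b), len(b)))
    for _ in range(num_iters):
        Ad = A @ d                  # the single matrix multiplication per iteration
        dAd = d @ Ad
        C += np.outer(d, d) / dAd
        alpha = (r @ r) / dAd
        x = x + alpha * d
        r_new = r - alpha * Ad
        d = r_new + (r_new @ r_new) / (r @ r) * d
        r = r_new
    return x, C

# Toy check with a hypothetical RBF kernel matrix on random inputs:
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 2))
K = np.exp(-0.5 * ((X[:, None] - X[None, :]) ** 2).sum(-1)) + 1e-3 * np.eye(300)
b = rng.standard_normal(300)
x, C = cg_with_lowrank_inverse(K, b, num_iters=25)
print(np.linalg.norm(C @ b - x) / np.linalg.norm(x))  # ~0: C reproduces the CG iterate
```

The thesis goes further by turning this information into a Gaussian model over the kernel itself, which is what yields the posterior variance and log-marginal likelihood estimates mentioned above.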
Part v is concerned with the approximation of kernel-matrix determinants. Under the
assumption that the inputs to the kernel are independent and identically distributed,
a stopping condition for the Cholesky decomposition is presented that provides probably
approximately correct (PAC) estimates of the log-determinant with little overhead.
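As background for these determinant estimates (a minimal NumPy sketch, not the thesis's estimator): the log-determinant of a symmetric positive definite kernel matrix is twice the sum of the log-diagonal of its Cholesky factor, so a factorization stopped after m columns already contains a partial sum; the contribution of the thesis is the stopping rule that controls the remaining, uncomputed part under the i.i.d. assumption.

```python
import numpy as np

def cholesky_logdet_prefix(K, m):
    """Run the first m columns of a left-looking Cholesky factorization of the
    SPD matrix K and return 2 * sum(log L[j, j] for j < m).
    For m = K.shape[0] this is exactly log det K."""
    n = K.shape[0]
    L = np.zeros((n, m))
    partial_logdet = 0.0
    for j in range(m):
        # Column j via the Schur complement against the columns computed so far.
        col = K[j:, j] - L[j:, :j] @ L[j, :j]
        pivot = np.sqrt(col[0])
        L[j:, j] = col / pivot
        partial_logdet += 2.0 * np.log(pivot)
    return L, partial_logdet

# Toy check with a hypothetical RBF kernel on i.i.d. random inputs:
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 2))                                 # i.i.d. inputs
K = np.exp(-0.5 * ((X[:, None] - X[None, :]) ** 2).sum(-1)) + 1e-4 * np.eye(400)
_, full = cholesky_logdet_prefix(K, 400)
print(np.allclose(full, np.linalg.slogdet(K)[1]))                 # True
_, partial = cholesky_logdet_prefix(K, 100)                       # stopped early
```

Stopping at m < n leaves the trailing diagonal entries unknown; estimating their contribution with PAC guarantees is what the presented stopping condition provides.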