Algebraic Geometry and Statistical Learning Theory
Welcome to the Author's Page

Sumio Watanabe, 2009, Cambridge University Press

Why Algebraic Geometry?

Estimation of true structure is algebraic geometry.

In statistical learning theory, we need to estimate the structure of an information source from random samples.

For a given set of random samples, there are parameters far apart from one another whose likelihoods are almost the same. Even though such parameters correspond to almost the same probability distribution, the models extracted from them have very different structures. The main reason this problem occurs is that the set of such parameters is a neighborhood of an algebraic variety. The map from the parameter to the statistical model is not one-to-one, so the Fisher information matrix is not positive definite. Hence conventional statistical theory cannot be applied.
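The degeneracy described above can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: the regression model y = a*tanh(b*x) + N(0,1) noise, the input grid on [-1, 1], and the helper names fisher_information and determinant are all chosen here for the example and do not come from the book. In this model the zero function is realized by every parameter with a*b = 0, which is an algebraic variety, and the Fisher information matrix loses rank exactly there.

```python
import math

def fisher_information(a, b, xs):
    """Fisher information of y = a*tanh(b*x) + N(0,1), averaged over inputs xs.

    For unit-variance Gaussian noise this is the average of grad f * grad f^T,
    where f(x) = a*tanh(b*x) and the gradient is taken with respect to (a, b).
    """
    i_aa = i_ab = i_bb = 0.0
    for x in xs:
        t = math.tanh(b * x)
        df_da = t                      # derivative of f with respect to a
        df_db = a * x * (1.0 - t * t)  # derivative of f with respect to b
        i_aa += df_da * df_da
        i_ab += df_da * df_db
        i_bb += df_db * df_db
    n = len(xs)
    return [[i_aa / n, i_ab / n], [i_ab / n, i_bb / n]]

def determinant(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Hypothetical input design: an evenly spaced grid on [-1, 1].
xs = [-1.0 + 2.0 * k / 200 for k in range(201)]

det_regular = determinant(fisher_information(1.0, 1.0, xs))  # away from a*b = 0
det_a_zero = determinant(fisher_information(0.0, 1.0, xs))   # on the variety (a = 0)
det_b_zero = determinant(fisher_information(1.0, 0.0, xs))   # on the variety (b = 0)

print(det_regular, det_a_zero, det_b_zero)
```

At (a, b) = (1, 1) the determinant is strictly positive, while on either branch of a*b = 0 it vanishes, so the matrix is not positive definite there and the usual asymptotic normality argument has no invertible matrix to work with.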

It should be emphasized that any statistical model that extracts hidden structure from random samples has this property. If you want to discover knowledge from samples, you have to study algebraic geometry. To construct a model selection method or a hypothesis testing procedure for such models, algebraic geometry is necessary.

Our research clarified that algebraic geometry gives concrete results on this problem. Based on resolution of singularities, we obtained four results. First, the behavior of the log likelihood function is clarified. Second, BIC is extended to general statistical models. Third, AIC is extended to general statistical models. Finally, the reason why the maximum likelihood estimator is not appropriate for the discovery of hidden structure is mathematically clarified. The generalized BIC and AIC are determined by two birational invariants, the log canonical threshold and the singular fluctuation.

After this book was published, we obtained a proof of the theorem that the generalized AIC, which is called the widely applicable information criterion, is asymptotically equivalent to leave-one-out cross validation, even if the set of parameters that realize the true distribution is an algebraic variety. We expect that the mathematical basis of many statistical concepts will be provided by algebraic geometry.
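To make the two quantities in this theorem concrete, here is a minimal numerical sketch. Everything specific in it is an assumption made for illustration: a normal model N(w, 1) with a flat prior (so the posterior is exactly N(sample mean, 1/n)), 50 data points, 2000 posterior draws, and the variable names. It computes the widely applicable information criterion as training loss plus functional variance divided by n, and the leave-one-out cross validation loss by importance sampling over the same posterior draws. This toy model is regular, so it only illustrates the definitions and does not exercise the singular case that the theorem covers.

```python
import math
import random

random.seed(0)

# Hypothetical setup: data from N(0, 1), model N(w, 1), flat prior on w.
n = 50
data = [random.gauss(0.0, 1.0) for _ in range(n)]
xbar = sum(data) / n

# With a flat prior the posterior of w is exactly N(xbar, 1/n).
num_draws = 2000
draws = [random.gauss(xbar, 1.0 / math.sqrt(n)) for _ in range(num_draws)]

def log_lik(x, w):
    """Log density of N(w, 1) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (x - w) ** 2

def mean(values):
    return sum(values) / len(values)

train_loss = 0.0           # minus mean log posterior-predictive density
functional_variance = 0.0  # sum over data of posterior variance of log-likelihood
loo = 0.0                  # importance-sampling leave-one-out loss

for x in data:
    ll = [log_lik(x, w) for w in draws]
    m = mean(ll)
    train_loss -= math.log(mean([math.exp(v) for v in ll])) / n
    functional_variance += mean([(v - m) ** 2 for v in ll])
    # Leave-one-out predictive density is 1 / E_w[1 / p(x | w)].
    loo += math.log(mean([math.exp(-v) for v in ll])) / n

waic = train_loss + functional_variance / n

print(train_loss, waic, loo)
```

In this setting the two criteria should agree closely, while both exceed the training loss by roughly the functional variance divided by n.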