Geometry and Statistics in NNs







We are very glad to announce a special session,
"Geometry and Statistics in Neural Network Learning Theory",
at the International Conference KES'2001, which will be held in Osaka and Nara, Japan, September 6-8, 2001.

In our session, we study the statistical problems caused by the non-identifiability of layered learning machines.

Information:

* Date: Saturday, September 8, 2001, 14:40-16:45.
* Place: Nara New Public Hall, Nara City, Japan.
* Schedule: The time for each presentation is 25 minutes.

(Remark)
* Before this session, Professor Amari will give an invited talk, 13:40-14:40.
* You can see all special sessions in the conference.




The authors and papers:

You can read the papers which will appear in the special session.

When you refer to these papers, please use "to appear in Proceedings of the 5th International Conference on Knowledge-Based Information Engineering Systems and Allied Technologies," September 2001, Osaka and Nara.


(1) Shun-Ichi Amari, T. Ozeki, and H. Park (RIKEN Brain Science Institute, Japan)
"Singularities in Learning Models: Gaussian Random Field Approach."


(2) Kenji Fukumizu (Institute of Statistical Mathematics, Japan)
"Asymptotic Theory of Locally Conic Models and its Application to Multilayer Neural Networks."
A fuller version of much of this paper is
"Likelihood Ratio of Unidentifiable Models and Multilayer Neural Networks"


(3) Katsuyuki Hagiwara (Mie University, Japan)
"On the training error and generalization error of neural network regression without identifiability."


(4) Taichi Hayasaka, M. Kitahara, K. Hagiwara, N. Toda, and S. Usui (Toyohashi University of Technology, Japan)
"On the Asymptotic Distribution of the Least Squares Estimators for Non-identifiable Models."


(5) Sumio Watanabe (Tokyo Institute of Technology, Japan)
"Bayes and Gibbs Estimations, Empirical Processes, and Resolution of Singularities."





A Short Introduction:


[ Non-identifiability ]

A parametric model in statistics is called identifiable if the mapping from the parameter to the probability distribution is one-to-one. Many learning machines used in information processing, such as artificial neural networks, normal mixtures, and Boltzmann machines, are not identifiable. We do not yet have a mathematical and statistical foundation on which such models can be studied.
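
As a small illustration of non-identifiability, the sketch below uses a toy network with one hidden unit, f(x; a, b) = a * tanh(b * x). This model and the parameter values are assumptions chosen for the example and are not taken from the session papers. It checks numerically that distinct parameters can define the same function, so the mapping from the parameter to the function (and hence to the distribution) is not one-to-one.

```python
import math

def model(x, a, b):
    """Toy one-hidden-unit network f(x; a, b) = a * tanh(b * x) (illustrative assumption)."""
    return a * math.tanh(b * x)

# Evaluation points on [-2, 2].
xs = [i / 10.0 for i in range(-20, 21)]

# Sign symmetry: (a, b) and (-a, -b) give the same function because tanh is odd.
same_by_symmetry = all(
    math.isclose(model(x, 1.5, 0.7), model(x, -1.5, -0.7), abs_tol=1e-12)
    for x in xs
)

# Redundancy: every parameter with a = 0 or b = 0 gives the zero function.
zero_params = [(0.0, 3.0), (0.0, -1.0), (2.0, 0.0)]
all_zero = all(model(x, a, b) == 0.0 for (a, b) in zero_params for x in xs)

print(same_by_symmetry, all_zero)
```

If the true function is zero, every point of the set {a = 0} or {b = 0} is a true parameter of this toy model, which is the kind of redundancy discussed in this introduction.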

[ Singularities and Asymptotics ]

If a non-identifiable model is redundant compared with the true distribution, then the set of true parameters is an analytic set with complex singularities, and the rank of the Fisher information matrix depends on the parameter. The behavior of the training and generalization errors of layered learning machines is quite different from that of regular statistical models. It should be emphasized that the standard asymptotic methods constructed by Fisher, Cramer, and Rao cannot be applied to these models. Nor can AIC, MDL, or BIC be used in statistical model selection for the design of artificial neural networks.
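
The dependence of the Fisher information rank on the parameter can be seen in a minimal sketch. It assumes a toy regression model y = a * tanh(b * x) + noise with unit-variance Gaussian noise and inputs on a fixed grid; the model, the grid, and the helper names `grad`, `fisher`, and `rank_2x2` are all hypothetical choices for this example. Under that assumption the Fisher information is the average outer product of the gradient of the regression function with respect to (a, b).

```python
import math

def grad(x, a, b):
    """Gradient of f(x; a, b) = a * tanh(b * x) with respect to (a, b)."""
    t = math.tanh(b * x)
    return (t, a * x * (1.0 - t * t))

def fisher(a, b, xs):
    """Fisher information entries (I11, I12, I22) for unit-variance Gaussian noise,
    averaged over the fixed inputs xs."""
    i11 = i12 = i22 = 0.0
    for x in xs:
        ga, gb = grad(x, a, b)
        i11 += ga * ga
        i12 += ga * gb
        i22 += gb * gb
    n = len(xs)
    return (i11 / n, i12 / n, i22 / n)

def rank_2x2(i11, i12, i22, tol=1e-12):
    """Rank of the symmetric positive semidefinite matrix [[i11, i12], [i12, i22]]."""
    if abs(i11 * i22 - i12 * i12) > tol:
        return 2
    return 1 if (i11 + i22) > tol else 0

xs = [i / 10.0 for i in range(-20, 21)]
points = [(1.0, 0.5), (0.0, 0.5), (1.0, 0.0), (0.0, 0.0)]
ranks = {p: rank_2x2(*fisher(p[0], p[1], xs)) for p in points}
print(ranks)
```

In this toy model the matrix has full rank at a generic parameter, loses rank on the set where a * b = 0, and vanishes entirely at the origin, so the usual quadratic approximation of the likelihood is not available there.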

[ Geometry and Statistics ]

The purpose of this special session is to study and discuss the geometrical and statistical methodology by which non-identifiable learning machines can be analyzed. Note that conic singularities are given by blowing-downs, and normal crossing singularities are found by blowing-ups. These algebraic-geometric methods lead us to two statistical concepts, the order statistic and the empirical process. We find that they open a new perspective in geometry and statistics.

[ Results which will be reported ]

(1) Professor Amari et al. clarify the generalization and training errors of learning models with conic singularities in both the maximum likelihood method and the Bayesian method, using the Gaussian random field approach.

(2) Dr. Fukumizu proves that a three-layer neural network can be understood as a locally conic model, and that the asymptotic likelihood ratio is proportional to log n, where n is the number of training samples.

(3) Dr. Hagiwara shows that the training and generalization errors of a radial basis function with Gaussian units are proportional to log n, under the assumption that the inputs are fixed.

(4) Dr. Hayasaka et al. claim that the asymptotic normality of estimators does not hold in the case of simple non-identifiable models, and that their asymptotic distribution is closely related to distributional results for order statistics.

(5) Lastly, Dr. Watanabe studies the Bayes and Gibbs estimations for statistical models with normal crossing singularities, and shows that all general cases reduce to this case by the resolution theorem.
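
One way to get a feel for how a log n term and order statistics can enter, offered only as a toy illustration and not as the argument of any of the papers above: the largest of n independent squared standard normal variables grows roughly like 2 log n to leading order. The sketch below estimates its mean by simulation; the sample sizes, the number of trials, and the function name `mean_max_squared_normal` are assumptions made for this example.

```python
import math
import random

def mean_max_squared_normal(n, trials, rng):
    """Average over trials of max_i z_i^2 for n independent standard normals z_i."""
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
    return total / trials

rng = random.Random(0)  # fixed seed so that the run is repeatable
sizes = [10, 100, 1000]
means = [mean_max_squared_normal(n, 200, rng) for n in sizes]

# Print the simulated mean next to the leading-order reference 2 * log(n).
for n, m in zip(sizes, means):
    print(n, round(m, 2), round(2.0 * math.log(n), 2))
```

The maximum is an extreme order statistic, so its distribution is not normal, in contrast to the estimators of regular statistical models.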

We expect that mathematicians, statisticians, information scientists, and theoretical physicists will be interested in this topic.





Thank you very much for your interest in this special session. For questions or comments, please send an e-mail to

Dr. Sumio Watanabe,
P&I Lab., Tokyo Institute of Technology.
E-mail: swatanab@pi.titech.ac.jp
http://watanabe-www.pi.titech.ac.jp/~swatanab/index.html
[Postal Mail] 4259 Nagatsuta, Midori-ku, Yokohama, 226-8503 Japan.