Comparison of PSIS Cross Validation with WAIC


Sumio Watanabe










Key words: PSISCV, ISCV, WAIC



Aki Vehtari, Andrew Gelman, and Jonah Gabry proposed Pareto Smoothed Importance Sampling Cross Validation (PSISCV).

(Paper) Aki Vehtari, Andrew Gelman, and Jonah Gabry, "Efficient implementation of leave-one-out cross-validation and WAIC for evaluating fitted Bayesian models," http://arxiv.org/abs/1507.04544

Let us compare PSISCV with WAIC from the viewpoint of statistical estimation of the generalization error in a simple regression problem.

If a high-leverage sample is contained in a training set (such a case is called an influential observation), then the importance weights used in ISCV can have infinite variance, and the ISCV estimate becomes unstable. PSISCV was proposed for such cases. We compare PSISCV, ISCV, and WAIC in the presence of influential observations.
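To make the estimators concrete, the following is a minimal sketch of how WAIC and ISCV can be computed from posterior samples, written in Python/NumPy rather than the MATLAB program linked below; the function and variable names (waic, iscv, log_lik) are illustrative and not taken from that program. Both quantities are estimators of the generalization error GE, the expected minus log predictive density of a new observation.

import numpy as np
from scipy.special import logsumexp

def waic(log_lik):
    # log_lik[s, i] = log p(x_i | theta_s), where theta_1, ..., theta_S
    # are posterior samples and x_1, ..., x_n are the training data.
    S = log_lik.shape[0]
    # training loss: -(1/n) sum_i log( (1/S) sum_s p(x_i | theta_s) )
    train_loss = -np.mean(logsumexp(log_lik, axis=0) - np.log(S))
    # functional variance term: (1/n) sum_i Var_s[ log p(x_i | theta_s) ]
    func_var = np.mean(np.var(log_lik, axis=0))
    return train_loss + func_var

def iscv(log_lik):
    # importance sampling cross validation (per-sample loss):
    # (1/n) sum_i log( (1/S) sum_s 1 / p(x_i | theta_s) )
    S = log_lik.shape[0]
    return np.mean(logsumexp(-log_lik, axis=0) - np.log(S))

PSISCV replaces the raw importance weights 1/p(x_i | theta_s) in the second formula by Pareto-smoothed weights before averaging; a library implementation is mentioned in the practical advice at the end of this page.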




(Figures: PSISCV)

MATLAB program
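The exact settings of the experiments shown in the figures are not reproduced on this page, so the following is only an illustrative reconstruction under assumed settings: a simple Bayesian linear regression whose model assumes unit noise variance, a training set containing one influential observation whose true noise standard deviation is 100 times larger than that of the other samples, exact Gaussian posterior sampling, and a test-set approximation of the generalization error. The sample sizes, prior, and test distribution are all assumptions of this sketch, not the settings of the original experiment.

import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)

# training data: simple regression with one influential observation
n = 30
x = rng.uniform(-1.0, 1.0, size=n)
sigma = np.ones(n)
sigma[0] = 100.0                      # leverage sample: 100x larger noise sd
a_true, b_true = 1.0, 0.0
y = a_true * x + b_true + sigma * rng.standard_normal(n)

# model: y_i ~ N(a x_i + b, 1), prior (a, b) ~ N(0, 10^2 I)
X = np.column_stack([x, np.ones(n)])
post_prec = X.T @ X + np.eye(2) / 10.0**2
post_cov = np.linalg.inv(post_prec)
post_mean = post_cov @ (X.T @ y)

# posterior samples (exact, since the posterior is Gaussian)
S = 2000
theta = rng.multivariate_normal(post_mean, post_cov, size=S)   # shape (S, 2)

def log_lik_matrix(theta, X, y):
    # log_lik[s, i] = log N(y_i | X_i theta_s, 1)
    mu = theta @ X.T
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * (y - mu) ** 2

L = log_lik_matrix(theta, X, y)

# WAIC and ISCV as per-sample losses
waic = -np.mean(logsumexp(L, axis=0) - np.log(S)) + np.mean(np.var(L, axis=0))
iscv = np.mean(logsumexp(-L, axis=0) - np.log(S))

# generalization error, approximated on a large test set
# (test data drawn with unit noise sd; this is an assumption of the sketch)
n_test = 5000
x_test = rng.uniform(-1.0, 1.0, size=n_test)
y_test = a_true * x_test + b_true + rng.standard_normal(n_test)
X_test = np.column_stack([x_test, np.ones(n_test)])
ge = -np.mean(logsumexp(log_lik_matrix(theta, X_test, y_test), axis=0) - np.log(S))

print(f"WAIC = {waic:.4f}, ISCV = {iscv:.4f}, GE (approx) = {ge:.4f}")

Even with exact posterior sampling, the importance weights 1/p(y_1 | theta_s) attached to the influential observation can be heavy tailed; this is the source of the instability of ISCV that PSISCV is designed to mitigate.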





(Conclusion) PSISCV, ISCV, and WAIC were compared as statistical estimators of the generalization error.

(1) From the viewpoint of statistical estimation of the generalization error, the difference among PSISCV, ISCV, and WAIC is smaller than the fluctuation of the generalization error itself, even in the presence of an influential observation.

(2) In the experiments, E|PSISCV-GE| was smaller than E|ISCV-GE|, where GE denotes the generalization error.

(3) In the experiments, E|WAIC-GE| was smaller than E|ISCV-GE|.

(4) When the standard deviation of the leverage sample was 100 times as large as that of the other samples, E|PSISCV-GE| was almost equal to E|WAIC-GE|.

(5) Otherwise, E|PSISCV-GE| was larger than E|WAIC-GE|.

It seems that there is a trade-off structure in the estimation of the generalization error in the presence of influential observations.




(Practical Advice) If MCMC posterior samples are available, then it is easy to calculate WAIC, ISCV, and PSISCV. Thus I recommend calculating and comparing all of these criteria, as in the sketch below.
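For example, if the pointwise log-likelihood values log p(x_i | theta_s) are stored during MCMC, the Python library ArviZ (one possible tool, chosen here for illustration; the program linked above is in MATLAB) computes WAIC and PSIS-LOO, i.e. PSISCV, directly, and plain ISCV takes only a few extra lines:

import numpy as np
import arviz as az
from scipy.special import logsumexp

# log_lik: shape (n_chains, n_draws, n_obs), values log p(x_i | theta_s).
# Dummy values are used here as a stand-in for a real MCMC run.
log_lik = np.random.default_rng(1).normal(-1.0, 0.3, size=(4, 1000, 30))

idata = az.from_dict(log_likelihood={"y": log_lik})

print(az.waic(idata))   # WAIC, reported on the elpd scale
print(az.loo(idata))    # PSIS-LOO (= PSISCV), with Pareto k diagnostics

# plain (unsmoothed) ISCV as a per-sample loss:
flat = log_lik.reshape(-1, log_lik.shape[-1])               # (S, n)
iscv = np.mean(logsumexp(-flat, axis=0) - np.log(flat.shape[0]))
print("ISCV =", iscv)

Note that ArviZ reports WAIC and PSIS-LOO on the elpd scale (a sum of log predictive densities over the n observations), which corresponds to roughly -n times the per-sample losses used above; the sign and scaling should be matched before comparing them with the generalization error.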