Additional results
Figure 3 reproduces the experiment presented in Figure 1 with another toy dataset. Figure 5 studies the effect of the sampling size T on the stochastic gradient descent procedure. See both figures' captions for details.
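To illustrate why the sampling size matters, the sketch below is a minimal, hypothetical example (not the paper's actual training procedure), assuming only that T denotes the number of Monte Carlo samples averaged to form each stochastic gradient estimate: averaging T samples divides the noise variance by T, so larger T yields a more reliable descent direction at a proportionally higher cost per step.

```python
# Hypothetical illustration of Monte Carlo gradient estimates in SGD.
# T is the number of samples drawn per step; larger T reduces the
# variance of the gradient estimate but costs more per iteration.
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(w, T):
    # True gradient of f(w) = 0.5 * w**2 is w; each sample adds Gaussian
    # noise, and averaging T samples divides the noise variance by T.
    return np.mean(w + rng.normal(scale=1.0, size=T))

for T in (1, 10, 100):
    w = 5.0
    for _ in range(200):            # plain SGD with a constant step size
        w -= 0.1 * noisy_grad(w, T)
    print(f"T={T:4d}  final w={w:+.3f}")  # approaches 0 more reliably as T grows
```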