Machine Learning

notes, thoughts, and practice of applied machine learning

Latest articles

Why WeightWatcher Works

I am frequently asked: why does weightwatcher work? The weightwatcher tool uses power-law fits to model the eigenvalue density of the weight matrices of any Deep Neural Network (DNN). The average power-law exponent is remarkably well correlated with the test accuracy when changing the number of layers and/or fine-tuning the hyperparameters....
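For intuition, this kind of power-law fit can be sketched with a Hill-style maximum-likelihood estimator over the tail of the eigenvalue spectrum. This is a minimal illustration under my own assumptions (the function name, the tail-size choice, and the synthetic heavy-tailed matrix are all mine), not weightwatcher's actual implementation:

```python
import numpy as np

def hill_alpha(eigs, k=None):
    """Hill (maximum-likelihood) estimate of a power-law exponent alpha
    for the largest k eigenvalues, assuming rho(lam) ~ lam^(-alpha)."""
    eigs = np.sort(eigs)[::-1]
    if k is None:
        k = len(eigs) // 2          # illustrative tail size, not a tuned choice
    tail = eigs[:k]
    lam_min = tail[-1]              # smallest eigenvalue kept in the tail
    return 1.0 + k / np.sum(np.log(tail / lam_min))

# Synthetic stand-in for a layer weight matrix with heavy-tailed entries
rng = np.random.default_rng(0)
W = rng.standard_t(df=3, size=(300, 100))
X = W.T @ W / W.shape[0]            # correlation matrix
eigs = np.linalg.eigvalsh(X)
alpha = hill_alpha(eigs)
print(alpha)
```

In the actual tool the fit is done per layer and the exponents are averaged; here a single synthetic matrix just shows the mechanics.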

WeightWatcher: Empirical Quality Metrics for Deep Neural Networks

We introduce weightwatcher (ww), a Python tool for computing quality metrics of trained and pretrained Deep Neural Networks:

    pip install weightwatcher

Here is an example with the pretrained VGG11 from PyTorch (ww works with Keras models also):

    import weightwatcher as ww
    import torchvision.models as models
    model = models.vgg11(pretrained=True)
    ...

Towards a new Theory of Learning: Statistical Mechanics of Deep Neural Networks

For the past year or two, we have talked a lot about how we can understand the properties of Deep Neural Networks by examining the spectral properties of the layer weight matrices. Specifically, we can form the correlation matrix and compute its eigenvalues. By plotting the histogram of the eigenvalues...
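The procedure described above can be sketched in a few lines of numpy. This is my own minimal illustration (the shapes and the Gaussian stand-in for a trained weight matrix are assumptions), showing how the empirical spectral density (ESD) is formed:

```python
import numpy as np

rng = np.random.default_rng(42)
# Stand-in for a layer weight matrix W of shape (N, M), N >= M
N, M = 400, 100
W = rng.normal(size=(N, M))

# Correlation matrix X = W^T W / N; its eigenvalues give the
# empirical spectral density (ESD) of the layer
X = W.T @ W / N
eigs = np.linalg.eigvalsh(X)

# Histogram of the eigenvalues; for i.i.d. Gaussian W this approaches
# the Marchenko-Pastur distribution, while trained layers typically
# show heavier tails
density, edges = np.histogram(eigs, bins=20, density=True)
```

For a real network, W would be taken from a trained layer rather than drawn at random.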

This Week in Machine Learning and AI: Implicit Self-Regularization

Big thanks to the team at This Week in Machine Learning and AI for my recent interview:

SF Bay ACM Talk: Heavy Tailed Self Regularization in Deep Neural Networks

My collaborator Michael W. Mahoney (UC Berkeley) did a great job giving a talk on our research at the local San Francisco Bay ACM Meetup. Random Matrix Theory (RMT) is applied to analyze the weight matrices of Deep Neural Networks (DNNs), including both production-quality, pre-trained models and smaller models trained from...

Heavy Tailed Self Regularization in Deep Neural Nets: 1 year of research

My talk at ICSI, the International Computer Science Institute at UC Berkeley. ICSI is a leading independent, nonprofit center for research in computer science. Why Deep Learning Works: Self Regularization in Neural Networks, presented Thursday, December 13, 2018. The slides are available on my SlideShare. The supporting tool, WeightWatcher, can be...

Don’t Peek part 2: Predictions without Test Data

This is a followup to a previous post: DON'T PEEK: DEEP LEARNING WITHOUT LOOKING … AT TEST DATA. The idea: suppose we want to compare 2 or more deep neural networks (DNNs). Maybe we are fine-tuning a DNN for transfer learning, comparing a new architecture to an old one, or just tuning our hyperparameters. Can we determine which DNN will...

Machine Learning and AI for the Lean Start Up

My recent talk at the French Tech Hub Startup Accelerator.

Don’t Peek: Deep Learning without looking … at test data

What is the purpose of a theory? To explain why something works. Sure. But what good is a theory (i.e., VC theory) that is totally useless in practice? A good theory makes predictions. Recently we introduced the theory of Implicit Self-Regularization in Deep Neural Networks. Most notably, we observe that in all pre-trained models, the layer weight matrices...

Rank Collapse in Deep Learning

We can learn a lot about Why Deep Learning Works by studying the properties of the layer weight matrices of pre-trained neural networks. And, hopefully, by doing this, we can get some insight into what a well-trained DNN looks like, even without peeking at the training data. One broad question we can ask is: how is information concentrated in Deep...
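One standard way to quantify how information is concentrated in a weight matrix is the stable (numerical) rank, the squared Frobenius norm over the squared spectral norm. This is my own illustrative choice of metric, not necessarily the exact quantity the post studies:

```python
import numpy as np

def stable_rank(W):
    """Stable rank ||W||_F^2 / ||W||_2^2: low values mean the matrix
    is dominated by a few singular directions (rank collapse)."""
    s = np.linalg.svd(W, compute_uv=False)
    return float(np.sum(s**2) / s[0]**2)

rng = np.random.default_rng(1)
full = rng.normal(size=(100, 100))                           # well-spread spectrum
low = np.outer(rng.normal(size=100), rng.normal(size=100))   # exactly rank 1
```

A collapsed layer has stable rank near 1, while a generic random matrix of the same shape has a much larger value.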
