Overfitting, Underfitting and Model’s Capacity in Deep Learning
Overfitting, underfitting, and model capacity are critical concepts in deep learning, particularly when training neural networks. In this post, we'll learn how a model's capacity leads to overfitting or underfitting of the data, and how that affects the performance of a neural network. Let's begin!

In this post, you will learn:

- What a model's capacity is
- How a model's capacity affects the way the model fits the same set of data
- The concepts of overfitting, underfitting, and finding just the right fit
- How to know if a model will work well on unseen data

Model's Capacity

A model's capacity refers to its ability to capture and represent complex patterns in the data. It reflects the flexibility and complexity of the model architecture. Let's understand this with the help of an example: we can train a model on historical data and then use it to make predictions about the lottery, as shown in the figure below:

Figure 1: ML model to predict lottery

The problem is that a model's ability to fit seen data does not mean it will perform well on unseen data. A model with high capacity (one with a large number of parameters, trained long enough) can simply memorize the training samples. In the lottery case, the input could be the date of the lottery and the output the lucky number drawn that day. If we train a…
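The effect of capacity can be illustrated without a neural network at all. The sketch below (an assumption for illustration, not from the post) uses polynomial regression, where the polynomial degree plays the role of model capacity: on the same toy dataset, higher-degree fits drive the training error down, even though past some point they are only memorizing the noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "historical" dataset: a noisy sine wave with 15 samples.
x = np.linspace(0, 1, 15)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

# Fit polynomials of increasing degree; degree acts as model capacity.
errors = {}
for degree in (1, 4, 9):
    coeffs = np.polyfit(x, y, degree)
    errors[degree] = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree}: training MSE = {errors[degree]:.4f}")
```

Training error shrinks monotonically as capacity grows, but that is exactly the trap described above: low training error on seen data says nothing by itself about performance on unseen data.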