Dataset size influences the impact of noise

Because supervised learning assumes that the data come from an underlying mapping plus noise, a larger dataset gives a clearer picture of the underlying function: with more samples, the random noise components tend to average out. A smaller dataset, by contrast, gives the model less opportunity to separate the underlying function from the noise. That said, deliberately adding noise to the training data can sometimes be useful, since it can act as a form of regularization that limits how flexibly the model fits.
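The effect of dataset size can be illustrated with a small simulation. The sketch below is only an illustration under assumed conditions: the linear target `true_function`, the Gaussian noise level, the sample sizes, and the helper `mean_fit_error` are all hypothetical choices, not anything prescribed by the text. It fits a straight line to noisy samples and measures how far the fitted line is from the true function, averaged over repeated trials.

```python
import numpy as np


def true_function(x):
    # Assumed underlying mapping, chosen only for illustration.
    return 2.0 * x + 1.0


def mean_fit_error(n_samples, n_trials=100, noise_std=1.0, seed=0):
    """Average RMSE between a fitted line and the true function.

    Each trial draws a fresh noisy dataset of size n_samples, fits a
    degree-1 polynomial by least squares, and compares the fitted line
    to the noise-free function on a fixed grid.
    """
    rng = np.random.default_rng(seed)
    x_grid = np.linspace(0.0, 1.0, 50)
    errors = []
    for _ in range(n_trials):
        x = rng.uniform(0.0, 1.0, size=n_samples)
        noise = rng.normal(0.0, noise_std, size=n_samples)
        y = true_function(x) + noise
        # np.polyfit returns coefficients from highest degree to lowest.
        slope, intercept = np.polyfit(x, y, deg=1)
        fitted = slope * x_grid + intercept
        rmse = np.sqrt(np.mean((fitted - true_function(x_grid)) ** 2))
        errors.append(rmse)
    return float(np.mean(errors))


small_error = mean_fit_error(n_samples=10)
large_error = mean_fit_error(n_samples=1000)

print(f"mean RMSE with   10 samples: {small_error:.3f}")
print(f"mean RMSE with 1000 samples: {large_error:.3f}")
```

Under these assumptions, the fit from the larger dataset should sit noticeably closer to the true line, because the independent noise terms average out as more samples are added. The exact numbers depend on the noise level and the seed.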