Machine learning models have fluid and crystallized intelligence
In machine learning, one often worries that models memorize their training data and simply regurgitate it at test time in one form or another. This criticism has been leveled widely at few-shot generalization in language models. On this view, large models that absorb more knowledge can be seen as possessing more crystallized intelligence than fluid intelligence. How can we optimize for fluid intelligence instead?
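One simple way to probe the regurgitation worry is to measure how much of a model's output appears verbatim in its training data. The sketch below is illustrative only: the toy corpus, the fake "generations", and the n-gram size are all assumptions introduced here, not anything from the original text, and real memorization audits on large corpora need far more care (deduplication, approximate matching, scale).

```python
# Minimal sketch of a verbatim-regurgitation check. The corpus, the
# example "model outputs", and n=5 are illustrative assumptions.

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as a set."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def regurgitation_rate(train_texts, generations, n=5):
    """Fraction of generated n-grams that appear verbatim in training data.

    A high rate suggests the model is reciting crystallized knowledge
    rather than exercising fluid reasoning on novel inputs.
    """
    train_grams = set()
    for text in train_texts:
        train_grams |= ngrams(text.split(), n)
    gen_grams = set()
    for text in generations:
        gen_grams |= ngrams(text.split(), n)
    if not gen_grams:
        return 0.0
    return len(gen_grams & train_grams) / len(gen_grams)

train = ["the quick brown fox jumps over the lazy dog"]
copied = ["the quick brown fox jumps over the lazy dog"]
novel = ["a slow red hen walks under a tall green tree"]

print(regurgitation_rate(train, copied))  # 1.0: pure regurgitation
print(regurgitation_rate(train, novel))   # 0.0: no verbatim overlap
```

Holding out evaluations whose answers cannot be found verbatim in training data is one way such a metric feeds back into optimizing for fluid rather than crystallized ability.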