cross-validating mental models
This might or might not be an article adapted from an essay assignment. Less APA, more rabbit holes.
The goal of theoretical courses is to help students acquire accurate and general mental models of the world. Unfortunately, the vast majority of such programs rely only on indirect assessments of those internalized models in the form of exams. A robust understanding of the concepts involved is identified with accurate answers to a set of general questions on the exam. However, because an important part of the student’s motivation comes from the result of the assessment itself, the student’s effort is often misaligned with the goal of actual understanding. An important task undertaken by the teacher in this process is to craft an exam which turns the acquisition of accurate mental models into an instrumental goal for the student, helping align the two. However, this challenge might be avoided entirely if the focus of assessment in theoretical courses moved from indirect reflections of internalized knowledge to a direct probing of the student’s mental models. This essay describes a way of implementing such an assessment framework, rooted both in empirical observations of student learning and in technical solutions proposed in machine learning and AI.
Just like in theoretical courses, the goal in many machine learning applications is to train an agent to internalize an accurate model of its environment. This might be an accurate model of how speech relates to text in speech recognition, how weather unfolds over time in forecasting, or what human faces look like in image synthesis. Just as students sometimes resort to rote learning of superficial features and then fail to generalize their knowledge to novel situations, AI models can overfit to their training data and then fail to generalize to testing data. For instance, a poor AI model might learn how to classify natural photographs, but fail to understand the contents of pencil sketches depicting similar objects.
One widespread solution to avoiding overfitting in AI is cross-validation, described here in its simple hold-out form. The core idea behind this technique is to split the available data into three parts: training, validation, and testing. The AI model receives rich feedback on its performance against the training data, with a full specification of its errors for each and every sample (e.g. what was the correct label on a misclassified pencil sketch). While the AI model continues to improve its performance on the training data, the external system which supervises the learning process also keeps track of the AI model’s performance on validation data. Crucially, the AI model does not receive feedback on its results on validation data. This second part of the dataset is used only to validate the AI model’s generalization to new situations not seen during actual training. In practice, the AI model keeps improving on the training data over thousands of timesteps, but at some point its performance on validation data starts to decrease, hinting at overfitting and loss of generalization. This is often the moment at which the supervisor system stops the training loop, in order to preserve as much generality as possible. Finally, it tests the frozen model on the testing data for an unbiased final metric of its performance.
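The supervising loop described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: `three_way_split`, `early_stopping`, the split fractions, and the `patience` parameter are hypothetical names and values chosen for this example, and the training and scoring routines are passed in as callbacks rather than tied to any particular library.

```python
import random


def three_way_split(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the samples and split them into training, validation, and testing parts.

    The 60/20/20 default is an assumption made for this sketch.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def early_stopping(train_step, validation_score, max_steps=1000, patience=3):
    """Train until the validation score has not improved for `patience` steps.

    `train_step(step)` updates the model using training data only.
    `validation_score(step)` is read by the supervisor and never fed back
    to the model. Returns (best_step, best_score); a real setup would also
    restore the model checkpoint saved at best_step before final testing.
    """
    best_step, best_score = -1, float("-inf")
    for step in range(max_steps):
        train_step(step)
        score = validation_score(step)
        if score > best_score:
            best_step, best_score = step, score
        elif step - best_step >= patience:
            break  # validation performance keeps dropping: likely overfitting
    return best_step, best_score
```

The testing part is deliberately left untouched by both functions: it is only meant to be used once, on the frozen model, after the loop has stopped.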
Unfortunately, if one were to translate this approach of ensuring accurate and general models to formal education, they would confront a major obstacle. It is difficult to validate the student’s understanding without actually training them more. There is no trivial way to freeze mental models for pure assessment and avoid influencing them during testing. The validation data would leak into the training data, and would therefore fail to provide an accurate estimate of how well those mental models perform in novel situations. The lack of true validation is, for instance, reflected in students who memorize past exams and become confident in their performance, only to subsequently fail on a truly novel test. Similarly, it is standard practice in AI not to take results on training data as meaningful on their own, because they often overestimate out-of-distribution performance.
Given this, how could students benefit from a constant signal on how robust their understanding is in truly novel situations? How could students get information on how reliable their abstractions are, helping them steer away from the trap of hyperspecific rote learning? One approach is to ask students to explicitly articulate their mental models in written form, and then carry out the evaluation of their understanding automatically. Concretely, the student would be asked to write a short summary of the topic at hand, constrained by a word limit. Once such a compact text is submitted, an AI could use the student’s piece of writing as a knowledge base and try to answer the exam questions itself. It would choose the multiple-choice answers which most likely follow from the student’s summary, and similarly fill in the gaps.
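To make the mechanics concrete, here is a deliberately crude sketch of answering a multiple-choice question from a summary. It is a toy stand-in, not the kind of AI the essay has in mind: instead of a language model judging which answer follows from the summary, it simply picks the option sharing the most words with it. The function names (`within_word_limit`, `pick_answer`) and the example text are hypothetical.

```python
import re


def tokens(text):
    """Lowercase a text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def within_word_limit(summary, max_words):
    """Check the word-count constraint placed on the student's summary."""
    return len(summary.split()) <= max_words


def pick_answer(summary, options):
    """Return the index of the option sharing the most words with the summary.

    Assumption: word overlap is used here only as a placeholder for a model
    that estimates which option most likely follows from the summary.
    Ties are resolved in favor of the earliest option.
    """
    summary_tokens = tokens(summary)
    scores = [len(tokens(option) & summary_tokens) for option in options]
    return scores.index(max(scores))


summary = (
    "A placebo is an inert treatment. "
    "Improvement after a placebo comes from the patient's expectation."
)
options = [
    "the active chemical ingredient",
    "the expectation of the patient",
    "random measurement noise",
]
chosen = pick_answer(summary, options)  # index of the best-supported option
```

The point of the design is that the answering routine only ever sees the summary, so whatever it gets right or wrong can be traced back to what the student wrote.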
Crucially, the student would get rich feedback on training data (e.g. “Your summary on placebo effects led to an incorrect answer on this specific question”) and inevitably update their mental models, but would also get a sparse signal for the performance on validation data (e.g. “If you were to manually answer the validation questions, we estimate you’d score a 7 based on your summary.”) without further specific guidelines. The sparse validation signal can give the student a sense of how well their current models would fare in new situations, prompting the student to refine those models without obvious hints which would invalidate the very validation procedure. Simply implementing all explicit feedback from the training data wouldn’t guarantee a high score on validation, nor on testing, which is a pragmatic reminder in itself. The sparse validation signal could give students a sense of whether they are on the right track in their learning process, helping them identify the important features of the topic at hand (the forest, not the trees), a challenge reported in the literature on summarization as a learning technique.
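The asymmetry between the two kinds of feedback can be sketched as follows. This is an illustrative assumption rather than a specification: `build_feedback`, the question ids, and the 10-point grading scale are all made up for the example, and the predicted answers are assumed to come from whatever system answered the questions using the student's summary.

```python
def build_feedback(predicted, answer_key, training_ids, validation_ids):
    """Return rich feedback on training questions and one number for validation.

    predicted / answer_key: dicts mapping question id -> chosen / correct option.
    Training feedback lists every missed question with (predicted, correct).
    Validation feedback is a single estimated grade out of 10 (assumed scale),
    with no indication of which questions were missed.
    """
    training_errors = {
        q: (predicted[q], answer_key[q])
        for q in training_ids
        if predicted[q] != answer_key[q]
    }
    n_correct = sum(predicted[q] == answer_key[q] for q in validation_ids)
    validation_grade = round(10 * n_correct / len(validation_ids))
    return training_errors, validation_grade


# Hypothetical example: two training questions, five validation questions.
predicted = {"t1": "a", "t2": "b", "v1": "a", "v2": "c", "v3": "b", "v4": "d", "v5": "a"}
answer_key = {"t1": "a", "t2": "c", "v1": "a", "v2": "c", "v3": "b", "v4": "d", "v5": "b"}
errors, grade = build_feedback(
    predicted, answer_key,
    training_ids=["t1", "t2"],
    validation_ids=["v1", "v2", "v3", "v4", "v5"],
)
```

Here the student would learn exactly which training question went wrong and why, but would only see the overall validation grade, which keeps the validation questions usable for later attempts.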
In sum, decoupling the assessment of mental models from actual student interaction could provide a useful signal for refining those models for maximized performance in novel situations.