Recovery and matching provide self-supervised signals

Given unlabeled data, one can create training samples by partially corrupting each data point and tasking a model with restoring the original. BERT does this in NLP by masking tokens and predicting them, and denoising autoencoders rely on the same recovery principle. One can also create training samples by extracting meaningful pairs of snippets, either from a single data point or from two different ones, and minimizing a triplet loss so that matching snippets are embedded closer together than non-matching ones. This matching procedure appears in joint-embedding architectures built on Siamese networks, as in face recognition. Both techniques have proven useful for developing models with commonsense knowledge. What's more, they could inform our understanding of human learning.
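To make the two signals concrete, here is a minimal sketch in plain Python. It is an illustration under simplifying assumptions, not the procedure of any particular model: the `corrupt` helper, the `MASK` token string, and the 15% default masking rate are hypothetical choices for this example, and the triplet loss uses squared Euclidean distance with a margin, which is one common formulation among several.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token, chosen for this sketch


def corrupt(tokens, mask_prob=0.15, rng=None):
    """Recovery signal: hide some tokens and keep the originals as targets.

    Returns (corrupted, targets), where targets maps each masked
    position to the token the model would be asked to restore.
    """
    if rng is None:
        rng = random.Random(0)
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append(MASK)
            targets[i] = tok
        else:
            corrupted.append(tok)
    return corrupted, targets


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Matching signal: zero once the positive is closer to the anchor
    than the negative by at least `margin`, positive otherwise."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)
```

In this sketch no labels are needed for either signal: the recovery targets come from the data point itself, and the anchor, positive, and negative embeddings would come from snippets of the same or of different data points. A real system would feed the corrupted input and the snippets through a trained network; the vectors here stand in for its outputs.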