RNNs are like voice messages, transformers are like voice calls
When you exchange voice messages on an instant messaging platform, you have to listen to the whole message before sending your own. On a voice call, you contribute spontaneously as the conversation flows. RNNs, in the classic encoder-decoder setup, work like the former: they first encode all of the input into a single embedding, then unpack that embedding into an answer. Transformers, in contrast, build their answer with the whole input stream still in view: attention lets each step of the answer draw directly on any part of the input instead of on one compressed summary.
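To make the contrast concrete, here is a minimal sketch in plain Python. It is a toy, not a real model: the weights are random and untrained, the names (`rnn_encode`, `attend`, `D`) are made up for illustration, and the "transformer" side is reduced to a single dot-product attention step with no learned projections. It only shows the shape of the two ideas: the RNN hands over one fixed-size summary however long the input is, while attention keeps a separate weight for every input position.

```python
import math
import random

random.seed(0)
D = 4  # embedding size, an arbitrary choice for this sketch

def rand_matrix(rows, cols):
    # Random, untrained weights: enough to show the data flow, nothing more.
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

W_in, W_h = rand_matrix(D, D), rand_matrix(D, D)

def rnn_encode(inputs):
    """Voice message: listen to everything, keep one fixed-size summary."""
    h = [0.0] * D
    for x in inputs:
        a, b = matvec(W_in, x), matvec(W_h, h)
        h = [math.tanh(ai + bi) for ai, bi in zip(a, b)]
    return h  # a single vector, whatever the input length

def attend(query, inputs):
    """Voice call: weigh every input position directly for this query."""
    scores = [sum(q * k for q, k in zip(query, x)) / math.sqrt(D) for x in inputs]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # softmax, shifted for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [sum(w * x[i] for w, x in zip(weights, inputs)) for i in range(D)]
    return weights, context

short_input = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(3)]
long_input = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(30)]

summary_short = rnn_encode(short_input)
summary_long = rnn_encode(long_input)
weights_long, context_long = attend(long_input[0], long_input)

print(len(summary_short), len(summary_long))  # same size for 3 or 30 inputs
print(len(weights_long))                      # one weight per input position
```

Whether the input has 3 items or 30, `rnn_encode` returns a vector of length `D`, so a longer message has to be squeezed into the same space. `attend` returns one weight per input position, so nothing is forced through a single summary.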