Energy-Based Transformers Are Scalable Learners and Thinkers (alexiglad.github.io)
4 points by cs702 11 hours ago
See also https://www.reddit.com/r/MachineLearning/comments/1lu1ia0/r_...