Transformers struggle with generalizing tasks beyond pre-training data

arxiv.org/abs/2311.00871

There is a discussion on Hacker News, but feel free to comment here as well.
