Join us for a presentation and discussion with author Ethan Perez on his most recent paper on True Few-Shot Learning.
Language models are amazing few-shot learners with the right prompt, but how do we choose the right prompt? It turns out that people typically select prompts using large held-out validation sets (!). How do models like GPT-3 do in a true few-shot setting, without that held-out data? Much worse!
Join us for a discussion led by Ian Thompson on the Logit Lens.
It’s one of the most provocative and least-understood interpretability insights we’ve seen for autoregressive models; it’s hard to imagine it isn’t relevant for anyone working with transformers (maybe even non-causal ones).