About me
I am fascinated by the theoretical and practical aspects of using deep learning for natural language processing. My doctoral work was mainly theoretical and revolved around applications of deep networks in natural language processing and many-body quantum physics. At AI21 Labs, I empirically investigate the inner workings of large language models.
Jul’22: Our paper on frozen large language models as readers for open-domain question answering was accepted for presentation at the ICML 2022 Workshop on Knowledge Retrieval and Language Models.
May’22: We released a paper which proposes a modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning.
Apr’22: We released Standing on the Shoulders of Giant Frozen Language Models, a paper which proposes using a huge LM as a frozen backbone surrounded by supporting networks that are 1000x smaller (BERT scale) and that externally specialize it, without sacrificing performance and without changing its weights.
Apr’22: We released a paper which proves a first positive theoretical result for learning with intermediate supervision. The paper theoretically motivates trending approaches such as Chain of Thought Prompting, which tackle composite problems in natural language by introducing intermediate supervision in a seq2seq manner.
Jan’22: Our paper on the in-context learning bias in Transformer pretraining was accepted as a spotlight paper at ICLR 2022.
Oct’21: We released a paper that theoretically establishes an in-context learning bias in Transformer architectures, and proposes kNN-Pretraining: a new paradigm for pretraining-example design.
May’21: Our paper on expressivity bottlenecks in self-attention and their impact on Transformer architecture design across data modalities was accepted to ICML 2021.
Apr’21: I am a recipient of the Blavatnik Prize for PhD students.
Jan’21: Our paper on PMI-Masking was accepted as a spotlight paper at ICLR 2021.
Sep’20: Our paper on the depth-to-width trade-off in self-attention was accepted to NeurIPS 2020.
Jun’20: We released a paper shedding light on the interplay between depth and width in self-attention architectures. See the blog post for an overview of the results.
Jan’20: Deep Autoregressive Models for the Efficient Variational Simulation of Many-Body Quantum Systems was accepted to Physical Review Letters.
Aug’19: We released our SenseBERT paper, which injects word-sense information into BERT's pretraining.
Feb’19: We released a paper developing specialized deep autoregressive models for the efficient simulation of quantum systems.
Jan’19: Quantum Entanglement in Deep Learning Architectures was accepted to Physical Review Letters.
Mar’18: We released a paper showing that prominent deep learning architectures can efficiently represent highly entangled quantum wave-functions.
Mar’18: I am a recipient of the Adams Fellowship for Doctoral Students of the Israel Academy of Sciences and Humanities.
Jan’18: Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design was accepted to ICLR 2018.
Jan’18: Benefits of Depth for Long-Term Memory of Recurrent Networks was accepted to the ICLR 2018 Workshop.
Oct’17: We released a paper showing that deep recurrent networks have an exponential advantage in long-term memory capacity relative to shallow ones.
Apr’17: We released a paper connecting quantum wave-functions and convolutional networks, proposing a principled, quantum-physics-inspired approach to deep network design.
Huge Frozen Language Models as Readers for Open-Domain Question Answering. ICML 2022 Workshop on Knowledge Retrieval and Language Models.
Yoav Levine*, Ori Ram*, Daniel Jannai, Barak Lenz, Shai Shalev-Shwartz, Amnon Shashua, Kevin Leyton-Brown, and Yoav Shoham
Standing on the Shoulders of Giant Frozen Language Models. Preprint.
Yoav Levine, Itay Dalmedigos, Ori Ram, Yoel Zeldes, Daniel Jannai, Dor Muhlgay, Yoni Osin, Opher Lieber, Barak Lenz, Shai Shalev-Shwartz, Amnon Shashua, Kevin Leyton-Brown, and Yoav Shoham.
Originally released on Apr’22.
Sub-Task Decomposition Enables Learning in Sequence to Sequence Tasks. Preprint.
Noam Wies, Yoav Levine, and Amnon Shashua.
Originally released on Apr’22.
The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design. Spotlight paper, ICLR 2022.
Yoav Levine, Noam Wies, Daniel Jannai, Dan Navon, Yedid Hoshen, and Amnon Shashua.
Originally released on Oct’21.
Which Transformer Architecture Fits my Data? A Vocabulary Bottleneck in Self-Attention. ICML 2021.
Noam Wies, Yoav Levine, Daniel Jannai, and Amnon Shashua.
Originally released on May’21.
PMI-Masking: Principled Masking of Correlated Spans. Spotlight paper, ICLR 2021.
Yoav Levine, Barak Lenz, Opher Lieber, Omri Abend, Kevin Leyton-Brown, Moshe Tennenholtz, and Yoav Shoham.
Originally released on Oct’20.
The Depth-to-Width Interplay in Self-Attention. NeurIPS 2020.
Yoav Levine, Noam Wies, Or Sharir, Hofit Bata, and Amnon Shashua.
Originally released on Jun’20.
SenseBERT: Driving Some Sense into BERT. ACL 2020.
Yoav Levine, Barak Lenz, Or Dagan, Ori Ram, Dan Padnos, Or Sharir, Shai Shalev-Shwartz, Amnon Shashua, and Yoav Shoham.
Originally released on Aug’19.
Deep Autoregressive Models for the Efficient Variational Simulation of Many-Body Quantum Systems. Physical Review Letters (PRL) 124, 020503 (2020).
Or Sharir, Yoav Levine, Noam Wies, Giuseppe Carleo, and Amnon Shashua.
Originally released on Feb’19.
Quantum Entanglement in Deep Learning Architectures. Physical Review Letters (PRL) 122, 065301 (2019).
Yoav Levine, Or Sharir, Nadav Cohen, and Amnon Shashua.
Originally released on Mar’18.
Bridging Many-Body Quantum Physics and Deep Learning via Tensor Networks.
Yoav Levine, Or Sharir, Nadav Cohen, and Amnon Shashua.
Benefits of Depth for Long-Term Memory of Recurrent Networks. ICLR 2018 Workshop.
Yoav Levine, Or Sharir, and Amnon Shashua.
Originally released on Oct’17.
Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design. ICLR 2018.
Yoav Levine, David Yakira, Nadav Cohen, and Amnon Shashua.
Originally released on Apr’17.
Realizing Topological Superconductivity with Superlattices. Physical Review B (PRB) 96, 165147 (2017).
Yoav Levine, Arbel Haim, and Yuval Oreg.
Originally released on Jul’17.
Analysis and Design of Convolutional Networks via Hierarchical Tensor Decompositions. Intel Collaborative Research Institute Special Issue on Deep Learning Theory.
Nadav Cohen, Or Sharir, Yoav Levine, Ronen Tamari, David Yakira, and Amnon Shashua.
Originally released on May’17.
MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning. Preprint.
Ehud Karpas, Omri Abend, Yonatan Belinkov, Barak Lenz, Opher Lieber, Nir Ratner, Yoav Shoham, Hofit Bata, Yoav Levine, Kevin Leyton-Brown, Dor Muhlgay, Noam Rozen, Erez Schwartz, Gal Shachaf, Shai Shalev-Shwartz, Amnon Shashua, and Moshe Tennenholtz.
Originally released on May’22.
Tensors for Deep Learning Theory: Analyzing Deep Learning Architectures via Tensorization. Elsevier, 2022.
Yoav Levine, Noam Wies, Or Sharir, Nadav Cohen, and Amnon Shashua.
Originally released on Oct’21.