This repository contains the code to reproduce the results from "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL 2023) [PDF].
Make sure you have a wandb.ai account and that you are logged in to it on your machine.
Install the required Python packages:
pip install -r requirements.txt
Gurobi is needed to find unselectable keys and requires a license. See here.
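Before launching experiments, it can help to confirm that the key dependencies are visible to your Python environment. The snippet below is a minimal sketch, not part of this repository: the helper `check_modules` is hypothetical, and the import names `wandb` and `gurobipy` are assumptions about what `requirements.txt` installs, so adjust the list to match your setup. It only checks that the modules can be found; it does not verify your wandb login or your Gurobi license.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: bool}, True when the top-level module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed import names; edit this list to match requirements.txt.
    for name, found in check_modules(["wandb", "gurobipy"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If a module is reported as missing, re-run the `pip install` step above in the same environment.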
In general, all experiments can run on either GPU or CPU.
- The majority subdirectory contains the files needed to reproduce the results of the Majority task (Figures 1a, 1b, 2, 3).
- The unselectable subdirectory contains the files needed to reproduce the results of the unselectable experiments (Figures 1c, 1d, 4; Tables 1, 2).
On the Expressivity Role of LayerNorm in Transformers' Attention
@article{brody2023expressivity,
title={On the Expressivity Role of LayerNorm in Transformers' Attention},
author={Brody, Shaked and Alon, Uri and Yahav, Eran},
journal={arXiv preprint arXiv:2305.02582},
year={2023}
}