
Code for the paper "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL'2023)


On the Expressivity Role of LayerNorm in Transformers' Attention

This repository contains the code for reproducing the results of "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL 2023) [PDF].


Setup

Make sure you have a wandb.ai account and that you are logged in on your machine.

Install the required python packages:

pip install -r requirements.txt 

Gurobi is needed to find unselectable keys, and it requires a license. See here.
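For intuition, a key is "unselectable" when no query can make it the (strict) argmax of the dot-product attention scores, which happens exactly when the key lies inside the convex hull of the other keys. Below is a minimal, hedged sketch of that convex-hull membership test written as a linear-programming feasibility check with `scipy.optimize.linprog` instead of Gurobi; the function name `is_unselectable` is illustrative and not part of this repository's API, and the actual experiments here use Gurobi:

```python
import numpy as np
from scipy.optimize import linprog

def is_unselectable(keys, i):
    """Return True if keys[i] lies in the convex hull of the other keys.

    In that case no query vector can give keys[i] the strictly largest
    dot-product attention score, i.e. the key is "unselectable".
    NOTE: illustrative sketch only; the repository's own code uses Gurobi.
    """
    keys = np.asarray(keys, dtype=float)
    others = np.delete(keys, i, axis=0)        # shape (n-1, d)
    n_others = others.shape[0]
    # Feasibility LP: find lambda >= 0 with sum(lambda) = 1 and
    # others.T @ lambda = keys[i]; the objective is irrelevant (all zeros).
    A_eq = np.vstack([others.T, np.ones((1, n_others))])
    b_eq = np.concatenate([keys[i], [1.0]])
    res = linprog(c=np.zeros(n_others), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n_others, method="highs")
    return res.status == 0  # feasible => inside the hull => unselectable
```

For example, with keys `[0,0]`, `[1,0]`, `[0,1]`, and `[0.25, 0.25]`, the last key is a convex combination of the first three and is therefore unselectable, while the hull vertices are selectable.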

Hardware

In general, all experiments can run on either GPU or CPU.

Code Structure

  1. The majority subdirectory contains the files needed to reproduce the results of the Majority task (Figures 1a, 1b, 2, and 3).
  2. The unselectable subdirectory contains the files needed to reproduce the results of the unselectable experiments (Figures 1c, 1d, and 4; Tables 1 and 2).

Citation

@article{brody2023expressivity,
  title={On the Expressivity Role of LayerNorm in Transformers' Attention},
  author={Brody, Shaked and Alon, Uri and Yahav, Eran},
  journal={arXiv preprint arXiv:2305.02582},
  year={2023}
}
