This repo contains the annotations and other artifacts for the paper "In What Languages are Generative Language Models the Most Formal? Analyzing Formality Distribution across Languages" (accepted to EMNLP 2023 Findings).
The folder annotated_data contains 1200 generations per language, each manually annotated as formal, informal, or incohesive.