diff --git a/research/index.md b/research/index.md
index 065ab571..7aee8a96 100644
--- a/research/index.md
+++ b/research/index.md
@@ -46,12 +46,12 @@ profiles for more updated information.
 *At AI2, I'm working on various aspects of LM adaptation such as preference data collection and evaluation. I also expanded my work in the multilingual NLP front (SEACrowd, SIGTYP).*
 - [SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages](https://arxiv.org/abs/2406.10118)
-*ArXiV preprint '24*
+*EMNLP '24, ArXiV preprint '24*
 Holy Lovenia\*, Rahmad Mahendra\*, Salsabil Maulana Akbar\*, Lester James Miranda\*, and 50+ other authors *(∗: major contributor)*.
 [[Catalogue](https://seacrowd.github.io/seacrowd-catalogue)] [[Code](https://github.com/SEACrowd/seacrowd-datahub)]
 - [Consent in Crisis: The Rapid Decline of the AI Data Commons](https://arxiv.org/abs/2407.14933)
-*ArXiV preprint '24*
+*NeurIPS D&B '24, ArXiV preprint '24*
 Data Provenance Initiative Team (40+ authors). I contributed in the annotation process design for Web Domain services and annotation quality review.
 [[Website](https://www.dataprovenance.org/)] [[Collection](https://github.com/Data-Provenance-Initiative/Data-Provenance-Collection)] [[New York Times Feature](https://www.nytimes.com/2024/07/19/technology/ai-data-restrictions.html)]
@@ -66,7 +66,7 @@ profiles for more updated information.
 ### 2023
-*I spent the early parts of 2023 working on low-resource languages and multilinguality, especially Tagalog, my native language. I mostly focused on core NLP tasks: POS tagging, NER, dependency parsing, etc.*
+*I spent the early parts of 2023 working on low-resource languages and multilinguality, especially Tagalog, my native language. I mostly focused on linguistic tasks such as POS tagging, NER, and dependency parsing.*
 - [calamanCy: a Tagalog Natural Language Processing Toolkit](https://aclanthology.org/2023.nlposs-1.1/)
 *NLP Open-Source Software (NLP-OSS) Workshop @ EMNLP '23*