From a9850a436e8a79070067d0a364991a22a43cb858 Mon Sep 17 00:00:00 2001
From: Lj Miranda
Date: Tue, 13 Aug 2024 20:59:40 -0700
Subject: [PATCH] Revert changes to research

---
 research/index.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/research/index.md b/research/index.md
index 0313b336..065ab571 100644
--- a/research/index.md
+++ b/research/index.md
@@ -5,11 +5,12 @@ description: Research work of Lester James V. Miranda
 permalink: /research/
 ---
-
+【[🎓 Google Scholar](https://scholar.google.co.jp/citations?user=2RtnNKEAAAAJ&hl=en)】
+【[📚 Semantic Scholar](https://www.semanticscholar.org/author/Lester-James-Validad-Miranda/13614871)】
+
 I'm broadly interested in **data-centric approaches to building language technologies at scale.**
-
-My goal is to develop systematic methodologies for efficiently constructing NLP resources while actively building new datasets and benchmarks to enhance language model training and evaluation.
-More concretely, I'm interested in the following areas:
+I believe that a careful and systematic understanding of data— from its collection to its downstream influence on training— is crucial to build general-purpose language models.
+More specifically, I'm interested to work on the following topics:
 - **Efficient approaches to annotation**: Human annotations are costly. How can we reduce this cost while preserving the nuance that human annotators provide? I'm currently exploring this question in the context of human preferences in LLM post-training (RLHF).