This repository contains the code and methodology for the competition task of identifying whether an essay was written by a student or generated by a large language model (LLM). The dataset comprises approximately 10,000 essays in the training set and about 9,000 essays in the hidden test set.
- Data Preprocessing: Cleaned and preprocessed the text data for model training, including tokenization, stop-word removal, and punctuation handling.
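A minimal sketch of the preprocessing step. The exact pipeline lives in notebook.ipynb; the `preprocess` helper and the tiny stop-word list below are illustrative assumptions (a real run would likely use a fuller list, e.g. from NLTK).

```python
import re
import string

# Small illustrative stop-word set -- an assumption for this sketch,
# not the list used in the notebook.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it"}

def preprocess(text: str) -> list[str]:
    """Lowercase, strip punctuation, tokenize on whitespace, drop stop words."""
    text = text.lower()
    # Remove all ASCII punctuation characters.
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = re.split(r"\s+", text.strip())
    return [t for t in tokens if t and t not in STOP_WORDS]

print(preprocess("The essay, in short, is about AI."))
# -> ['essay', 'short', 'about', 'ai']
```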
- Model Training: Fine-tuned a pre-trained BERT model on the training dataset, addressing class imbalance so the model would not simply favor the majority class.
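One common way to handle class imbalance when fine-tuning BERT is to weight the loss inversely to class frequency. The helper below is a sketch of that idea (the label encoding 0 = student, 1 = LLM is an assumption); the resulting weights could be passed to `torch.nn.CrossEntropyLoss(weight=...)` during fine-tuning.

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights: w_c = N / (K * n_c),
    where N is the sample count and K the number of classes."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}

# Hypothetical imbalanced label distribution for illustration:
# 0 = student-written, 1 = LLM-generated.
labels = [0] * 900 + [1] * 100
weights = class_weights(labels)
print(weights)  # the minority class gets the larger weight
```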
- Evaluation: Used cross-validation to estimate model performance and guide adjustments.
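The cross-validation scheme can be sketched as a plain k-fold index splitter. In practice the notebook would more likely use scikit-learn's `KFold` or `StratifiedKFold`; this pure-Python version just shows the mechanics.

```python
import random

def kfold_indices(n, k=5, seed=42):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation
    over n samples, after a seeded shuffle."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val

for train, val in kfold_indices(10, k=5):
    print(len(train), len(val))  # each fold holds out 1/k of the data
```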
- Prediction: Applied the trained model to the test set to classify each essay as student-written or LLM-generated.

Files
- notebook.ipynb: The main notebook containing the code for data preprocessing, model training, evaluation, and prediction.
- train_essays.csv: The training dataset of essays.
- test_essays.csv: A dummy test dataset for validation purposes.
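The final prediction step amounts to thresholding the model's per-essay probability of being LLM-generated. A minimal sketch, assuming a 0.5 threshold and the label encoding 1 = LLM-generated, 0 = student-written (both assumptions, not confirmed by the notebook):

```python
def classify(probs, threshold=0.5):
    """Map LLM-probability scores to hard labels:
    1 = LLM-generated, 0 = student-written (assumed encoding)."""
    return [1 if p >= threshold else 0 for p in probs]

# Hypothetical model outputs for three essays.
print(classify([0.92, 0.10, 0.55]))  # -> [1, 0, 1]
```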
This work leverages BERT, a pre-trained transformer language model developed by Google. Thanks to the competition organizers for providing the dataset and the opportunity to participate.
This project is licensed under the MIT License. See the LICENSE file for details.