MariaPonomarenko38/NLP_Research_Project

An Unsupervised Learning Approach for Categorising Research Proposals and Recommending Papers

Research writing can be technical and difficult to understand, and manually assigning research areas and related faculties to proposals is time-consuming and error-prone. The first goal of this project is therefore to build an unsupervised model that classifies an unseen project proposal. We explored unsupervised learning methods such as K-means and topic models, as well as a combination of Latent Dirichlet Allocation (LDA), Bidirectional Encoder Representations from Transformers (BERT), and K-means, to cluster project proposals into categories. The second goal is to recommend papers for a given project proposal based on similar publications; we assume that the authors of the closest papers may be suitable supervisors for the research project. We investigated several features that can serve as numerical vector representations of documents and applied cosine similarity to find matching pairs of paper and proposal. Of these, the features produced by TF-IDF gave the most accurate results.
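To make the clustering goal concrete, here is a minimal sketch of grouping proposals with TF-IDF vectors and K-means, using only the Python standard library. It is an illustration under stated assumptions, not the project's actual pipeline: the function names (`tokenize`, `tfidf_vectors`, `kmeans`), the farthest-point initialisation, and the four sample proposals are all hypothetical, and the LDA and BERT features mentioned above are not reproduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one L2-normalised dense TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed inverse document frequency (an assumed weighting choice).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain K-means with deterministic farthest-point initialisation."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Pick the point farthest from its nearest existing centroid.
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Invented sample proposals, for illustration only.
proposals = [
    "Protein folding simulation and drug discovery",
    "Drug discovery using protein structure prediction",
    "Traffic routing in wireless sensor networks",
    "Energy efficient routing in wireless networks",
]
labels = kmeans(tfidf_vectors(proposals), k=2)
print(labels)
```

The two biology proposals share one label and the two networking proposals share the other. In practice, a library implementation such as scikit-learn's `TfidfVectorizer` and `KMeans` would likely be a better fit than this hand-rolled version.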

Fig 1: Proposed approach for clustering project proposals

Fig 2: Categories visualization

Fig 3: Examples of recommended papers based on the content of project and paper abstracts
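The recommendation step described above, matching a proposal to papers by cosine similarity over TF-IDF features, could be sketched as follows. This is a hedged, standard-library illustration rather than the project's code: `tfidf`, `cosine`, and `recommend` are hypothetical names, the sample abstracts are invented, and the exact TF-IDF weighting the authors used is not stated in the source.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Term frequency times smoothed IDF (an assumed weighting choice).
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(proposal, papers, top_n=3):
    """Rank papers by similarity to the proposal; returns (index, score) pairs."""
    vectors = tfidf([proposal] + papers)
    query, paper_vectors = vectors[0], vectors[1:]
    scored = [(i, cosine(query, v)) for i, v in enumerate(paper_vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Invented sample abstracts, for illustration only.
papers = [
    "Topic models for clustering scientific abstracts",
    "Deep reinforcement learning for robot control",
    "Protein structure prediction with neural networks",
]
proposal = "Clustering research abstracts with topic models"

ranked = recommend(proposal, papers)
for index, score in ranked:
    print(f"{score:.3f}  {papers[index]}")
```

The paper on topic models ranks first because it shares the most distinctive terms with the proposal, while the paper with no shared terms scores zero. Under the project's assumption, the authors of the top-ranked papers would be the candidate supervisors.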
