Welcome to my Data Science and Machine Learning portfolio! This repository showcases my participation in the JUMIA Sentiment Analysis Challenge, where I achieved a top 3 finish. In this challenge, I developed a model to determine the sentiment of customer reviews on JUMIA Tunisia.
This competition was a part of the #MLOlympiad, organized by Kaggle and sponsored by Google Developers. The task involved performing sentiment analysis on textual customer reviews to categorize them as positive, negative, or neutral.
- Goal: Perform sentiment analysis on customer reviews and classify them into 'Positive,' 'Negative,' or 'Neutral.'
- Dataset: The training and test datasets were provided, containing customer reviews and their associated sentiments.
- Evaluation: Submissions were scored with CategorizationAccuracy, the fraction of reviews whose predicted sentiment matches the true label.
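To make the metric concrete, here is a minimal sketch of how categorization accuracy can be computed locally before submitting. The function name `categorization_accuracy` and the lowercase label strings are illustrative assumptions, not taken from the competition's own scoring code.

```python
def categorization_accuracy(y_true, y_pred):
    """Return the fraction of predictions that exactly match the true labels.

    Hypothetical helper for local checks; Kaggle computes the official score.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty list of labels")
    # Count positions where the predicted label equals the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not competition data.
    truth = ["positive", "negative", "neutral", "positive"]
    preds = ["positive", "neutral", "neutral", "positive"]
    print(categorization_accuracy(truth, preds))
```

Because every review counts equally, a model that ignores a rare class can still score well on this metric, so it may be worth checking per-class results alongside it.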
Here are the key files related to this project:
- train.csv - The training dataset containing customer reviews and sentiment labels.
- test.csv - The test dataset for making predictions.
- sample_submission.csv - A sample submission file with the required format.
- jumiasentimentanalysis.ipynb (also available as a notebook on Kaggle) - My Jupyter Notebook with code, analysis, and model implementation.
- Data Exploration: I began by exploring the training dataset to understand the data's characteristics and distribution.
- Feature Engineering: I performed text preprocessing and feature engineering to prepare the data for modeling.
- Model Selection: I experimented with various machine learning and natural language processing (NLP) models to determine the best-performing one.
- Hyperparameter Tuning: To optimize model performance, I fine-tuned hyperparameters.
- Validation: I utilized cross-validation techniques to assess model accuracy and robustness.
- Submission: After achieving satisfactory results, I created submission files for evaluation.
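The preprocessing and validation steps above can be sketched with the standard library alone. This is not the model from the notebook: the tokenizer, the fold splitter, and the majority-class baseline are all hypothetical stand-ins, chosen only to show how k-fold cross-validation scores a classifier on held-out folds.

```python
import re
from collections import Counter


def preprocess(text):
    """Lowercase the review and split it into word tokens (assumed cleaning step)."""
    return re.findall(r"\w+", text.lower())


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def majority_baseline(train_labels):
    """Return a predictor that always outputs the most frequent training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda tokens: most_common


def cross_validate(texts, labels, k=3):
    """Return the accuracy of the baseline on each validation fold."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(texts), k):
        predict = majority_baseline([labels[i] for i in train_idx])
        hits = sum(predict(preprocess(texts[i])) == labels[i] for i in val_idx)
        scores.append(hits / len(val_idx))
    return scores


if __name__ == "__main__":
    # Toy reviews for illustration only; these are not from train.csv.
    texts = ["Great phone", "Fast delivery", "Love it", "Works well", "Very good", "Broken on arrival"]
    labels = ["positive"] * 5 + ["negative"]
    print(cross_validate(texts, labels, k=3))
```

In practice the rows would be shuffled, ideally stratified by label, before splitting, and the baseline would be replaced by the real model. A baseline like this is still useful as a floor that any tuned model should beat.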
For a detailed implementation and analysis, please refer to my jumiasentimentanalysis.ipynb.
I'm delighted to announce that I secured a top 3 position in the ML Olympiad's JUMIA Sentiment Analysis Challenge. The model classifies the sentiment of customer reviews, which can support business insights and efforts to improve customer satisfaction.
As I continue to advance my skills in data science and machine learning, my future plans for this project include:
- Exploring advanced NLP techniques to further enhance sentiment classification.
- Incorporating additional data sources to improve model accuracy.
- Enhancing model interpretability for practical applications.
I'm always open to collaboration and learning from the data science community. Feel free to connect with me on LinkedIn or explore more of my projects on GitHub.
I extend my heartfelt gratitude to the organizers of the ML Olympiad for providing this enriching opportunity for skill development and competition.
Thank you for visiting my portfolio, and I look forward to sharing more data science projects in the future!