- Project Introduction
- Exploratory Data Analysis
- Rank Based Recommendations
- User-user Based Collaborative Filtering
- Matrix Factorisation
- Conclusion
- Files
- Software and Libraries
For this project I will analyze the interactions that users have with articles on the IBM Watson Studio platform, and make recommendations to them about new articles I think they will like. Below is an example of what the dashboard could look like displaying articles on the IBM Watson Platform.
In order to determine which articles to show to each user, I will be performing a study of the data available on the IBM Watson Studio platform.
Most users have at most 3 interactions with any article on the platform, and the distribution is highly skewed because interactions are few.
This type of recommendation system recommends the most-viewed articles in the dataset.
We can set how many recommendations to provide.
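A minimal sketch of the rank-based idea, assuming the interactions are available as `(user_id, article_id)` pairs. The function name `get_top_article_ids` and the toy data are illustrative assumptions, not necessarily what the notebook uses:

```python
from collections import Counter


def get_top_article_ids(interactions, n=10):
    """Return the ids of the n articles with the most recorded interactions.

    interactions: iterable of (user_id, article_id) pairs, one per interaction.
    """
    # Count how often each article appears, ignoring who interacted with it.
    counts = Counter(article_id for _, article_id in interactions)
    return [article_id for article_id, _ in counts.most_common(n)]


# Toy data, made up for illustration only.
interactions = [
    (1, "a"), (2, "a"), (3, "a"),
    (1, "b"), (2, "b"),
    (3, "c"),
]
print(get_top_article_ids(interactions, n=2))  # ['a', 'b']
```

Because the ranking ignores the user entirely, the same list can be shown to brand-new users who have no interaction history yet.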
We provide a user_id for which we want recommendations. Then we sort all other users by their similarity to the given user_id.
For each user in that sorted order, we find the articles that user has interacted with and add them to the recommendations list.
Then we select the top m recommendations, m being the number of recommendations to provide for a specific user_id.
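The steps above can be sketched as follows. This is a simplified illustration, assuming that similarity is the number of articles two users have in common and that articles the user has already seen are skipped. The name `user_user_recs` and the dictionary layout are assumptions made for the example:

```python
def user_user_recs(user_id, user_articles, m=10):
    """Recommend up to m articles for user_id from the most similar users.

    user_articles: dict mapping each user id to the set of article ids
    that user has interacted with.
    """
    seen = user_articles[user_id]

    # Similarity = number of articles both users interacted with
    # (the dot product of their binary interaction vectors).
    similarity = {
        other: len(seen & articles)
        for other, articles in user_articles.items()
        if other != user_id
    }
    # Most similar first; ties broken by user id so the order is deterministic.
    neighbours = sorted(similarity, key=lambda u: (-similarity[u], u))

    recs = []
    for neighbour in neighbours:
        # Only consider articles the target user has not interacted with.
        for article in sorted(user_articles[neighbour] - seen):
            if len(recs) >= m:
                return recs
            if article not in recs:
                recs.append(article)
    return recs


# Toy data, made up for illustration only.
user_articles = {
    1: {"a", "b"},
    2: {"a", "b", "c"},
    3: {"a", "d"},
    4: {"e"},
}
print(user_user_recs(1, user_articles, m=2))  # ['c', 'd']
```

User 2 shares two articles with user 1 and user 3 shares one, so their unseen articles are picked up in that order.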
In this section we first perform SVD on the user-item interactions matrix. We then look at how accuracy changes with the number of latent features. Since the data is highly imbalanced, we also check how the F1 score varies with the number of latent features. The F1 score increases up to a point and then drops off asymptotically.
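A rough sketch of such a sweep with NumPy is shown below. It assumes a binary user-item matrix and, for simplicity, scores the rank-k reconstruction against the same matrix it was computed from; the notebook may instead use a train/test split. The function name `svd_scores` and the toy matrix are illustrative assumptions:

```python
import numpy as np


def svd_scores(user_item, ks):
    """Rebuild a binary user-item matrix from its top-k latent features
    and return {k: (accuracy, f1)} for each k in ks."""
    u, s, vt = np.linalg.svd(user_item, full_matrices=False)
    results = {}
    for k in ks:
        # Keep only the k largest singular values and their vectors.
        approx = u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]
        # Round to the nearest integer and clip to get 0/1 predictions.
        pred = np.around(approx).clip(0, 1)

        tp = np.sum((pred == 1) & (user_item == 1))
        fp = np.sum((pred == 1) & (user_item == 0))
        fn = np.sum((pred == 0) & (user_item == 1))

        accuracy = np.mean(pred == user_item)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom > 0 else 0.0
        results[k] = (float(accuracy), float(f1))
    return results


# Toy matrix, made up for illustration: rows are users, columns are articles.
user_item = np.array([
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
], dtype=float)

for k, (accuracy, f1) in svd_scores(user_item, [1, 2, 4]).items():
    print(f"k={k}: accuracy={accuracy:.2f}, F1={f1:.2f}")
```

On a matrix that is mostly zeroes, predicting zero everywhere already gives high accuracy, which is why F1 on the positive class is the more informative of the two scores.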
We have a highly imbalanced data set because there are few interactions on the platform.
There were only 20 users for whom we could try to provide recommendations. With more data, the performance of our recommendation engine could be evaluated more reliably. The data is highly imbalanced because the user-item interaction matrix consists mostly of zeroes. In future iterations I will try content-based recommendations to tackle the cold start problem.
    .
    ├── Recommendations_with_IBM.html----------# HTML EXPORT OF JUPYTER NOTEBOOK
    ├── Recommendations_with_IBM.ipynb---------# ANALYSIS NOTEBOOK
    ├── data
    │   ├── articles_community.csv-------------# INFORMATION ABOUT ARTICLES
    │   └── user-item-interactions.csv---------# USER-ARTICLE INTERACTIONS
    ├── project_tests.py-----------------------# UNIT TESTS FOR PROJECT
    ├── top_10.p-------------------------------# BINARY FILE TO CHECK MY SOLUTION
    ├── top_20.p-------------------------------# BINARY FILE TO CHECK MY SOLUTION
    ├── top_5.p--------------------------------# BINARY FILE TO CHECK MY SOLUTION
    ├── user_item_matrix.p---------------------# BINARY FILE TO CHECK MY SOLUTION
    └── visuals.py-----------------------------# CUSTOM PLOTS CREATED IN PLOTLY
This project uses Python 3.6.6; the necessary libraries are listed in the requirements file.