Credit Card Fraud Detection problem for the Modelling Week 2019

davidggphy/modelling_week_2019

modelling_week_2019

Credit Card Fraud Detection problem for the XIII Modelling Week, held at the Faculty of Mathematics of the Universidad Complutense de Madrid (UCM) on 10-14 June 2019. The Modelling Week is open to students of the Master in Mathematical Engineering at UCM, as well as to participants from other mathematically oriented master's programs worldwide. Its purpose is to teach and guide students in solving a realistic industry problem.

The problem can be approached in three ways: supervised, unsupervised and mixed. We start with a supervised approach, since it is simpler. If time permits, we'll explore unsupervised methods (a really interesting field).

Python libraries

jupyter, pandas, matplotlib, seaborn, sklearn, tensorflow, keras, imblearn, xgboost

Outline

  • Basic programming with Python and Jupyter
  • Exploratory data analysis, cleaning and preprocessing. Feature engineering.
  • Overfitting. Validation scheme. Difference between train, validation and test sets.
  • Metrics: precision, recall, ROC curve, AUC (ROC), F1, confusion matrix. Focus on unbalanced datasets.
  • Classification algorithms in sklearn. Comments on hyperparameter tuning.
  • xgboost in Python using the xgboost.sklearn API.
  • Combination of models. Calibration. Ensembling and Stacking.
  • Neural Networks in keras:
    • Feed Forward Neural Network for classification.
    • Autoencoder as an anomaly detector (semi-supervised and unsupervised)
    • Autoencoder as a feature builder (unsupervised)
  • Combination of unsupervised and supervised methods.
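
To make the outline's point about metrics on unbalanced datasets concrete, here is a minimal sketch in plain Python. It is not taken from the course notebooks: the toy labels, the `always_legit` baseline and the `detector` predictions are all made up for illustration, and the helper functions are hypothetical stand-ins for what sklearn.metrics provides.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, where 1 marks fraud."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tn + fp + fn + tp)


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the fraud class (0.0 when undefined)."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up data: 1000 transactions, of which only 10 are fraud (1%).
y_true = [1] * 10 + [0] * 990

# Baseline that never flags anything.
always_legit = [0] * 1000

# Hypothetical detector: catches 8 of the 10 frauds, raises 12 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 978

for name, y_pred in [("always_legit", always_legit), ("detector", detector)]:
    p, r, f1 = precision_recall_f1(y_true, y_pred)
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.3f} "
          f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

On this toy data the do-nothing baseline has slightly higher accuracy than the detector while catching no fraud at all, which is why the outline emphasises precision, recall and F1 over accuracy when classes are heavily unbalanced.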

Cheatsheets

Resources

Bibliography

  • Leo Breiman, "Statistical Modeling: The Two Cultures" (2001) (Breiman)
  • The Elements of Statistical Learning (ESL)
  • An Introduction to Statistical Learning with Applications in R (ISLR)
  • Pattern Recognition and Machine Learning (Bishop)
  • Bayesian Data Analysis (BDA)
