Bank fraud is the use of potentially illegal means to obtain money, assets, or other property owned or held by a financial institution, and it is a criminal offence. Although bank fraud is easy to define, it is much harder to detect with high accuracy, precision, and recall. Because potentially billions of dollars can be obtained by committing bank fraud, it is very important to be able to identify accurately when it has occurred, and banks would find this capability very valuable.
In this analysis, we therefore aim to find the set of features that best predicts whether bank fraud has occurred. To do this, we use a variety of machine learning models, including Logistic Regression, Logistic Regression with an L1 penalty, and an Isolation Forest.
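To make the modelling plan concrete, the following is a minimal sketch of how the three models could be fitted with scikit-learn. It is an illustration only, not the analysis itself: the data are synthetic (generated with `make_classification`), and the 5% fraud rate, the regularisation strength `C=0.1`, the `contamination` value, and all variable names are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for real transaction data (assumption):
# 10 features, roughly 5% of rows in the positive ("fraud") class.
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=4, n_redundant=0,
    weights=[0.95, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardise features so the penalised coefficients are comparable.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Logistic regression with scikit-learn's default settings
# (note that the default already applies an L2 penalty).
logit = LogisticRegression(max_iter=1000).fit(X_train_s, y_train)

# L1-penalised logistic regression; liblinear is one solver that supports L1.
logit_l1 = LogisticRegression(
    penalty="l1", solver="liblinear", C=0.1
).fit(X_train_s, y_train)

# Indices of features whose L1 coefficient was not shrunk to exactly zero.
selected = np.flatnonzero(logit_l1.coef_[0])

# Isolation Forest is unsupervised: it never sees the labels and
# marks points it considers anomalous with -1.
iso = IsolationForest(contamination=0.05, random_state=0).fit(X_train_s)
iso_pred = (iso.predict(X_test_s) == -1).astype(int)

predictions = {
    "logistic": logit.predict(X_test_s),
    "logistic_l1": logit_l1.predict(X_test_s),
    "isolation_forest": iso_pred,
}
for name, pred in predictions.items():
    p = precision_score(y_test, pred, zero_division=0)
    r = recall_score(y_test, pred, zero_division=0)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
print("features kept by L1:", selected.tolist())
```

In this sketch the L1 penalty doubles as a feature-selection step, since it can shrink some coefficients to exactly zero, while the Isolation Forest treats fraud as an anomaly-detection problem and is scored against the labels only after fitting. Precision and recall are reported rather than accuracy alone because, with so few fraud cases, a model that never predicts fraud would still be highly accurate.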