amazon-access-challenge

Our solution to Kaggle's Amazon Access Challenge.

Deliverable Locations

  1. The screenshot of our highest score is under report/submission_screnshot.png
  2. Our code is in the .ipynb and .py files in the top level directory. Instructions on how to run it are below.
  3. Our submission CSV is under output/xgb_logreg_rf.csv
  4. Our report is under report/report.pdf

How to Run

You'll need Python 3.5 with the PyData stack installed to run this code. The easiest way to achieve this is to install Anaconda for Python 3.5.

Note: You must have XGBoost installed in order to re-run the models. If you just wish to run the final ensemble, you do not need the library installed.
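
To check which of the two cases applies to your environment, something like the following may help. This is only a sketch: `has_module` is a hypothetical helper name, and it assumes the library is importable under the module name `xgboost`.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be imported here."""
    return importlib.util.find_spec(name) is not None


# Assumption: XGBoost is installed under the module name "xgboost".
if has_module("xgboost"):
    print("XGBoost found: the models can be re-trained.")
else:
    print("XGBoost not found: only the final ensemble can be run.")
```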

Since we have saved the output of all of our individual models, it is easy to run the ensemble by itself. From the top level directory, simply run:

$ python rankedavg.py submission

The submission command line argument is the name of the output file, without the .csv extension. The result will be saved to output/submission.csv.
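
For readers unfamiliar with the idea, here is a rough sketch of what a ranked average of model outputs can look like. It is not the contents of rankedavg.py: the function names `ranks` and `ranked_average` and the example scores are made up for illustration, and reading and writing the CSV files is left out.

```python
def ranks(scores):
    """Map each score to its rank, scaled to [0, 1] (ties broken by position)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    out = [0.0] * n
    for rank, idx in enumerate(order):
        out[idx] = rank / (n - 1) if n > 1 else 0.0
    return out


def ranked_average(model_scores):
    """Average the scaled ranks of several models' scores, row by row."""
    per_model = [ranks(s) for s in model_scores]
    return [sum(col) / len(col) for col in zip(*per_model)]


if __name__ == "__main__":
    # Hypothetical predictions from two models for the same three rows.
    xgb = [0.90, 0.10, 0.50]
    logreg = [0.80, 0.30, 0.20]
    print(ranked_average([xgb, logreg]))
```

Averaging ranks rather than raw scores means a model whose outputs sit on a different scale does not dominate the ensemble; only the ordering each model assigns to the rows matters.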


If you wish to re-train the models, use the following steps:

1. Run the starter code logistic regressions

First move into the reference-code directory.

$ cd reference-code/

Then, run the starter code.

$ python starter.py

When you are prompted to enter a name for the submission file, enter: starter_submission

2. Run our models

Go back to the top level directory.

$ cd ../

Then, run the models script.

$ python models.py

3. Run rankedavg.py again to get the final ensembled submission
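
If you would rather run the three steps in one go, a small driver along these lines might work. It is a sketch only: `STEPS` and `run_pipeline` are hypothetical names, and it assumes starter.py reads the submission file name from standard input, which we have not verified for piped input.

```python
import subprocess
import sys

# (command, working directory, text sent to stdin) for each step above.
STEPS = [
    # Assumption: starter.py reads the file name from stdin when prompted.
    ([sys.executable, "starter.py"], "reference-code", "starter_submission\n"),
    ([sys.executable, "models.py"], ".", None),
    ([sys.executable, "rankedavg.py", "submission"], ".", None),
]


def run_pipeline(steps=STEPS):
    """Run each step in order; check=True stops at the first failing step."""
    for cmd, cwd, stdin_text in steps:
        subprocess.run(
            cmd,
            cwd=cwd,
            input=stdin_text,
            universal_newlines=True,  # text mode; also works on Python 3.5
            check=True,
        )


# From the top level directory, call run_pipeline() to execute all three steps.
```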