amazon-access-challenge

Our solution to Kaggle's Amazon Access Challenge.

Deliverables
Running the Code

Deliverable Locations

The screenshot of our highest score is under report/submission_screnshot.png
Our code is in the .ipynb and .py files in the top level directory. Instructions for running it are below.
Our submission CSV is under output/xgb_logreg_rf.csv
Our report is under report/report.pdf

How to Run

You'll need Python 3.5 with the PyData stack installed to run this code. The easiest way to achieve this is to install Anaconda for Python 3.5.

Note: You must have XGBoost installed in order to re-run the models. If you just wish to run the final ensemble, you do not need the library installed.
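To see which of the two paths applies to your environment, a quick check like the one below can help. This is a minimal sketch that is not part of the repository: the helper name has_module is hypothetical, and it assumes XGBoost is importable under its usual module name, xgboost.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be imported here."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "xgboost" is the usual import name of the XGBoost Python package.
    if has_module("xgboost"):
        print("xgboost found: the models can be re-trained.")
    else:
        print("xgboost not found: only the final ensemble can be run.")
```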

Since we have saved the output of all of our individual models, it is easy to run the ensemble by itself. From the top level directory, simply run:

$ python rankedavg.py submission

The submission command line argument is the name of the output file, without the .csv extension. The result will be saved to output/submission.csv.
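For readers unfamiliar with ranked averaging, the idea can be sketched as follows. This is an illustration of the general technique, not the contents of rankedavg.py: the function names, the tie handling, and the rescaling to [0, 1] are all assumptions. Each model's predictions are converted to ranks, the ranks are averaged per row, and the result is rescaled. Only the ordering of each model's scores matters, which suits rank-based metrics such as AUC.

```python
def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def ranked_average(model_predictions):
    """Average the per-model ranks of each row and rescale the result to [0, 1]."""
    n_rows = len(model_predictions[0])
    per_model_ranks = [average_ranks(p) for p in model_predictions]
    mean_ranks = [
        sum(r[i] for r in per_model_ranks) / len(per_model_ranks)
        for i in range(n_rows)
    ]
    lo, hi = min(mean_ranks), max(mean_ranks)
    if hi == lo:
        # Every row has the same mean rank, so no ordering can be recovered.
        return [0.5] * n_rows
    return [(m - lo) / (hi - lo) for m in mean_ranks]


# Hypothetical predictions from two models for the same three rows.
preds_a = [0.9, 0.1, 0.5]
preds_b = [0.8, 0.3, 0.2]
blended = ranked_average([preds_a, preds_b])
```

Because only ranks are used, models whose raw scores live on different scales can be blended without calibrating them first.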

If you wish to re-train the models, use the following steps:

1. Run the starter code logistic regressions

First move into the reference-code directory.

$ cd reference-code/

Then, run the starter code.

$ python starter.py

When you are prompted to enter a name for the submission file, enter: starter_submission

2. Run our models

Go back to the top level directory.

$ cd ../

Then, run the models script.

$ python models.py

3. Run rankedavg.py again, as shown above, to produce the final ensembled submission
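The three steps above can also be chained from a single script. The sketch below is hypothetical and not part of the repository. It assumes it is run from the top level directory and that starter.py reads the submission name from standard input; if the prompt works differently, run that step by hand.

```python
import subprocess
import sys

# Each step: (working directory, script and arguments, text for stdin or None).
STEPS = [
    ("reference-code", ["starter.py"], "starter_submission\n"),
    (".", ["models.py"], None),
    (".", ["rankedavg.py", "submission"], None),
]


def run_all(dry_run=True):
    """Run each step in order, or only return the planned commands if dry_run."""
    planned = []
    for cwd, args, stdin_text in STEPS:
        command = [sys.executable] + args
        planned.append((cwd, command))
        if not dry_run:
            # check=True stops the pipeline as soon as one step fails.
            subprocess.run(
                command,
                cwd=cwd,
                input=stdin_text,
                universal_newlines=True,
                check=True,
            )
    return planned
```

Calling run_all() lists the planned commands without executing anything. Calling run_all(dry_run=False) actually re-trains the models and rebuilds the ensemble.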
