My capstone project for Udacity's Machine Learning Nanodegree will attempt to classify seedlings of twelve plant species from images. The project uses the dataset from Kaggle's Plant Seedlings Classification competition.
- Download the training images and test images. Rename the train folder to labeled and the test folder to unlabeled so they are not confused with the train/test split that is extracted from the labeled dataset.
- Download Test Images from the Wild as described in this post.
The folder structure should look like this:
seedling-classification
  - data
    - labeled
      - Black-grass
        - 0ace21089.png
        ...
      - Charlock
        ...
    - unlabeled
      - 0a64e3e6c.png
      ...
    - from_the_wild
      - Charlock
        - WP_20150506_10_15_19_Pro__highres_0.tiff
        ...
      - Cleavers
        ...
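The renaming step and the folder layout above can be checked with a short script. This is a minimal sketch, not part of the project: the helper names rename_downloads and count_images_per_class are hypothetical, and it assumes the Kaggle downloads were unpacked into data/train and data/test.

```python
from pathlib import Path

# Folder names taken from the steps above: the Kaggle "train"/"test"
# downloads are renamed to "labeled"/"unlabeled".
RENAMES = {"train": "labeled", "test": "unlabeled"}


def rename_downloads(data_dir):
    """Rename train -> labeled and test -> unlabeled, skipping folders already renamed."""
    data_dir = Path(data_dir)
    for old, new in RENAMES.items():
        src, dst = data_dir / old, data_dir / new
        if src.is_dir() and not dst.exists():
            src.rename(dst)


def count_images_per_class(split_dir):
    """Return {class_folder_name: number_of_files} for a folder of class subfolders."""
    split_dir = Path(split_dir)
    return {
        class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
        for class_dir in sorted(split_dir.iterdir())
        if class_dir.is_dir()
    }


if __name__ == "__main__":
    # Assumes the script is run from the seedling-classification folder.
    data = Path("data")
    rename_downloads(data)
    for split in ("labeled", "from_the_wild"):
        if (data / split).is_dir():
            print(split, count_images_per_class(data / split))
```

If the layout is right, the labeled folder should report twelve class folders, one per species.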
- (Optional) If you plan to use GPUs, install the necessary NVIDIA software on your system. I followed this guide to set up GPUs on a Google Compute Engine instance.
- Install Anaconda if you don't have it already.
wget https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh
bash Anaconda3-5.0.1-Linux-x86_64.sh
source ~/.bashrc
- Create a new environment.
- Linux (to install with GPU support, change requirements/linux.yml to requirements/linux-gpu.yml):
conda env create -f requirements/linux.yml
- Mac (to install with GPU support, change requirements/mac.yml to requirements/mac-gpu.yml):
conda env create -f requirements/mac.yml
- Create an IPython kernel for the seedlings environment.
python -m ipykernel install --user --name seedlings --display-name "seedlings"
- Run this command to enable the notebook widgets extension that tqdm_notebook relies on.
jupyter nbextension enable --py widgetsnbextension
- Activate the environment.
source activate seedlings
- Switch Keras backend to TensorFlow.
KERAS_BACKEND=tensorflow python -c "from keras import backend"
- Open the notebook.
jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser seedlings.ipynb
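The KERAS_BACKEND variable used in the backend step only applies to the process it is set for. As far as I know, multi-backend Keras also reads a backend field from a JSON config file, by default ~/.keras/keras.json. Below is a minimal sketch for setting that field, assuming that file location and format; set_keras_backend is a hypothetical helper, not part of Keras.

```python
import json
from pathlib import Path


def set_keras_backend(config_path, backend="tensorflow"):
    """Write `backend` into a Keras-style JSON config file, keeping other keys intact."""
    config_path = Path(config_path)
    config = {}
    if config_path.exists():
        # Preserve existing settings such as floatx or image_data_format.
        config = json.loads(config_path.read_text())
    config["backend"] = backend
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=4))
    return config
```

Calling set_keras_backend(Path.home() / ".keras" / "keras.json") would apply it to the assumed default location. Check the file by hand first if you have customized it.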
These setup instructions were heavily influenced by Udacity's CNN dog-project.