Back | Next | Contents
Image Recognition

Coding Your Own Image Recognition Program (Python)

In the previous step, we ran a sample application that came with the jetson-inference repo.

Now, we're going to walk through creating a new image recognition program from scratch in Python, called my-recognition.py. This script loads an arbitrary image from disk and classifies it using the imageNet object. The completed source is available at python/examples/my-recognition.py

#!/usr/bin/python3

import jetson.inference
import jetson.utils

import argparse

# parse the command line
parser = argparse.ArgumentParser()
parser.add_argument("filename", type=str, help="filename of the image to process")
parser.add_argument("--network", type=str, default="googlenet", help="model to use, can be:  googlenet, resnet-18, etc.")
args = parser.parse_args()

# load an image (into shared CPU/GPU memory)
img = jetson.utils.loadImage(args.filename)

# load the recognition network
net = jetson.inference.imageNet(args.network)

# classify the image
class_idx, confidence = net.Classify(img)

# find the object description
class_desc = net.GetClassDesc(class_idx)

# print out the result
print("image is recognized as '{:s}' (class #{:d}) with {:f}% confidence".format(class_desc, class_idx, confidence * 100))

Setting up the Project

If you're using the Docker container, you'll want to store your code in a Mounted Directory. This way your code won't be lost when you shut down the container. For simplicity, this guide creates the project on your host device, in the directory ~/my-recognition-python under the user's home directory, and then mounts that path into the container.

Run these commands from a terminal (outside of the container) to create the directory and source file, and to download some test images:

# run these commands outside of container
$ cd ~/
$ mkdir my-recognition-python
$ cd my-recognition-python
$ touch my-recognition.py
$ chmod +x my-recognition.py
$ wget https://github.com/dusty-nv/jetson-inference/raw/master/data/images/black_bear.jpg 
$ wget https://github.com/dusty-nv/jetson-inference/raw/master/data/images/brown_bear.jpg
$ wget https://github.com/dusty-nv/jetson-inference/raw/master/data/images/polar_bear.jpg 

Then when you start the container, mount the directory that you just created:

$ docker/run.sh --volume ~/my-recognition-python:/my-recognition-python   # mounted inside the container to /my-recognition-python 

Next, we'll add the Python code for the program to the empty source file we created here.

Source Code

Open up my-recognition.py in your editor of choice (or run gedit my-recognition.py). You can edit from outside the container.

First, let's add a shebang sequence to the very top of the file to automatically use the Python interpreter:

#!/usr/bin/python3

Next, we'll import the Python modules that we're going to use in the script.

Importing Modules

Add import statements to load the jetson.inference and jetson.utils modules used for recognizing images and image loading. We'll also load the standard argparse package for parsing the command line.

import jetson.inference
import jetson.utils

import argparse

note: these Jetson modules are installed during the sudo make install step of building the repo. If you did not run sudo make install, these packages won't be found when we run the example.
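If you want to confirm that the modules are importable before writing the rest of the script, a quick check along these lines may help. This is only a sketch: `module_available` is a hypothetical helper name, not part of jetson-inference, and it relies solely on the standard `importlib` module.

```python
import importlib.util


def module_available(name):
    """Return True if the named module can be located (hypothetical helper)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # raised when a parent package (for example 'jetson') is missing
        return False


# report whether the two modules used by this tutorial can be found
for name in ("jetson.inference", "jetson.utils"):
    print("{:s}: {}".format(name, "found" if module_available(name) else "NOT found"))
```

If either module is reported as not found, re-running the sudo make install step mentioned in the note is the first thing to try.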

Parsing the Command Line

Next, add some boilerplate code to parse the image filename and an optional --network parameter:

# parse the command line
parser = argparse.ArgumentParser()
parser.add_argument("filename", type=str, help="filename of the image to process")
parser.add_argument("--network", type=str, default="googlenet", help="model to use, can be:  googlenet, resnet-18, etc. (see --help for others)")
args = parser.parse_args()

This example loads and classifies an image that the user specifies. It is meant to be run like this:

$ ./my-recognition.py my_image.jpg

Substitute the filename of the image you want to load for my_image.jpg. You can also specify the optional --network parameter to change the classification network that's used (the default is GoogleNet):

$ ./my-recognition.py --network=resnet-18 my_image.jpg

See the Downloading Other Classification Models section from the previous page for more information about downloading other networks.
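To see how the two arguments are interpreted without running the full program, you can hand `parse_args()` an explicit argument list. The sketch below wraps the same two `add_argument` calls in a hypothetical `build_parser()` function purely for illustration; the shortened help text is also an illustrative change.

```python
import argparse


def build_parser():
    """Build a parser with the same two arguments as my-recognition.py."""
    parser = argparse.ArgumentParser()
    parser.add_argument("filename", type=str, help="filename of the image to process")
    parser.add_argument("--network", type=str, default="googlenet", help="model to use")
    return parser


# passing a list to parse_args() simulates a command line
default_args = build_parser().parse_args(["my_image.jpg"])
print(default_args.filename, default_args.network)   # my_image.jpg googlenet

custom_args = build_parser().parse_args(["--network=resnet-18", "my_image.jpg"])
print(custom_args.filename, custom_args.network)     # my_image.jpg resnet-18
```

Because filename is a positional argument and --network is optional, the two can appear in either order on the command line.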

Loading the Image from Disk

You can load images from disk into shared CPU/GPU memory using the loadImage() function. Supported formats are JPG, PNG, TGA, and BMP.

Add this line to load the image with the filename that was specified from the command line:

img = jetson.utils.loadImage(args.filename)

The returned image will be a jetson.utils.cudaImage object that contains attributes like width, height, and pixel format:

<jetson.utils.cudaImage>
  .ptr      # memory address (not typically used)
  .size     # size in bytes
  .shape    # (height,width,channels) tuple
  .width    # width in pixels
  .height   # height in pixels
  .channels # number of color channels
  .format   # format string
  .mapped   # true if ZeroCopy

For more information about accessing images from Python, see the Image Manipulation with CUDA page. For simplicity, we just load a single image here. To load a video or sequence of images, you would want to use the videoSource API like the previous imagenet.py sample does.
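As a small illustration of the attributes listed above, the following sketch formats a one-line summary of an image. `describe_image` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in with made-up values so that the snippet runs without a Jetson; in the real script you would pass the object returned by `jetson.utils.loadImage()`.

```python
from types import SimpleNamespace


def describe_image(img):
    """Summarize an image using the width, height, channels, format and size attributes."""
    return "{:d}x{:d}, {:d} channels, format={:s}, {:d} bytes".format(
        img.width, img.height, img.channels, img.format, img.size)


# stand-in object with illustrative values (NOT a real cudaImage)
fake_img = SimpleNamespace(width=800, height=600, channels=3,
                           format="rgb8", size=800 * 600 * 3)

print(describe_image(fake_img))   # 800x600, 3 channels, format=rgb8, 1440000 bytes
```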

Loading the Image Recognition Network

Using the imageNet object, the following code will load the desired classification model with TensorRT. Unless you specified a different network using the --network flag, by default it will load GoogleNet, which was already downloaded when you initially built the jetson-inference repo (the ResNet-18 model was also selected by default to be downloaded).

All of the available classification models are pre-trained on the ImageNet ILSVRC dataset and can recognize up to 1000 different classes of objects, such as different kinds of fruits and vegetables, many different species of animals, and everyday man-made objects like vehicles, office furniture, sporting equipment, etc.

# load the recognition network
net = jetson.inference.imageNet(args.network)

Classifying the Image

Next, we are going to classify the image with the recognition network using the imageNet.Classify() function:

# classify the image
class_idx, confidence = net.Classify(img)

imageNet.Classify() accepts the image, whose dimensions are carried by the cudaImage object, and performs the inferencing with TensorRT.

It returns a tuple containing the integer index of the object class that the image was recognized as, along with the floating-point confidence value of the result.
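Since the confidence is a fraction between 0 and 1 (the script multiplies it by 100 when printing), one possible way to handle uncertain results is to compare it against a cutoff. The sketch below is an assumption-laden illustration: `describe_result` and its `threshold` parameter are hypothetical and are not part of the imageNet API.

```python
def describe_result(class_idx, confidence, threshold=0.5):
    """Turn the (class_idx, confidence) tuple into a message (hypothetical helper).

    threshold is an arbitrary cutoff chosen for illustration only.
    """
    percent = confidence * 100
    if confidence < threshold:
        return "no confident match (best guess: class #{:d} at {:.1f}%)".format(class_idx, percent)
    return "class #{:d} at {:.1f}% confidence".format(class_idx, percent)


# in the real script, these two values come from net.Classify(img)
print(describe_result(295, 0.98))
print(describe_result(12, 0.25))
```

A suitable cutoff depends on the application, so treat the 0.5 default as a placeholder rather than a recommendation.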

Interpreting the Results

As the final step, let's retrieve the class description and print out the results of the classification:

# find the object description
class_desc = net.GetClassDesc(class_idx)

# print out the result
print("image is recognized as '{:s}' (class #{:d}) with {:f}% confidence".format(class_desc, class_idx, confidence * 100))

imageNet.Classify() returns the index of the recognized object class (between 0 and 999 for these models that were trained on ILSVRC). Given the class index, the imageNet.GetClassDesc() function will then return the string containing the text description of that class. These descriptions are automatically loaded from ilsvrc12_synset_words.txt.
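To make the index-to-description mapping concrete, here is a minimal sketch of how such a label file could be parsed by hand. It assumes the common synset_words layout, in which each line holds a WordNet ID followed by a space and the description, and the zero-based line position equals the class index. That layout is an assumption here, `parse_synset_words` is a hypothetical helper, and the imageNet object does all of this for you.

```python
def parse_synset_words(lines):
    """Return the class descriptions in file order (list index == class index)."""
    descriptions = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # the first token is the WordNet ID; everything after it is the description
        _synset_id, _, description = line.partition(" ")
        descriptions.append(description)
    return descriptions


# two sample lines in the assumed format
sample_lines = [
    "n01440764 tench, Tinca tinca",
    "n01443537 goldfish, Carassius auratus",
]

class_descs = parse_synset_words(sample_lines)
print(class_descs[1])   # goldfish, Carassius auratus
```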

That's it! That is all the Python code we need for image classification. See the completed source above.

Running the Example

Now that our Python program is complete, let's classify the test images that we downloaded at the beginning of this page:

$ ./my-recognition.py polar_bear.jpg
image is recognized as 'ice bear, polar bear, Ursus Maritimus, Thalarctos maritimus' (class #296) with 99.999878% confidence

$ ./my-recognition.py brown_bear.jpg
image is recognized as 'brown bear, bruin, Ursus arctos' (class #294) with 99.928925% confidence

$ ./my-recognition.py black_bear.jpg
image is recognized as 'American black bear, black bear, Ursus americanus, Euarctos americanus' (class #295) with 98.898628% confidence

You can also choose to use a different network by specifying the --network flag, like so:

$ ./my-recognition.py --network=resnet-18 polar_bear.jpg
image is recognized as 'ice bear, polar bear, Ursus Maritimus, Thalarctos maritimus' (class #296) with 99.743396% confidence
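Each invocation above loads the network again. If you want to classify several images in one run, one option is to load the network once and loop over the filenames. The sketch below uses only the calls already shown on this page (Classify() and GetClassDesc()); `classify_files` is a hypothetical helper, and `FakeNet` is a stand-in with made-up return values so that the snippet runs without a Jetson.

```python
def classify_files(net, load_image, filenames):
    """Classify every file with a single, already-loaded network.

    net        -- object providing Classify() and GetClassDesc()
    load_image -- callable that turns a filename into an image
    """
    results = []
    for filename in filenames:
        img = load_image(filename)
        class_idx, confidence = net.Classify(img)
        results.append((filename, net.GetClassDesc(class_idx), class_idx, confidence))
    return results


class FakeNet:
    """Stand-in for the recognition network; returns fixed, made-up values."""

    def Classify(self, img):
        return 296, 0.99

    def GetClassDesc(self, class_idx):
        return "class {:d}".format(class_idx)


# the lambda stands in for the image loader and simply passes the filename through
for filename, desc, idx, conf in classify_files(
        FakeNet(), lambda f: f, ["polar_bear.jpg", "brown_bear.jpg"]):
    print("{:s}: '{:s}' (class #{:d}) with {:f}% confidence".format(
        filename, desc, idx, conf * 100))
```

On a Jetson you would pass `jetson.inference.imageNet(args.network)` and `jetson.utils.loadImage` in place of the two stand-ins.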

Next, we'll walk through the creation of the C++ version of this program.

Next | Coding Your Own Image Recognition Program (C++)
Back | Classifying Images with ImageNet

© 2016-2019 NVIDIA | Table of Contents