Is it possible to retrain your model? #10

sonia-auv-private · 2018-03-21T04:15:13Z

Hi,

I was wondering if there is any method that would let us retrain this model using Pascal VOC annotation files and images?

gustavz · 2018-03-21T11:06:33Z

Yes, of course. Just use the scripts train.py and eval.py provided by TensorFlow's Object Detection API like you would with any other model.
In stuff/ssd_mobilenet_checkpoints you will find the same checkpoint files I used; they are the original ones provided by TensorFlow.

gauthiermartin · 2018-03-25T05:52:17Z

Thank you

uzbhutta · 2018-04-26T00:44:47Z

Hi,
Just to clarify: I must train with the TensorFlow Object Detection API only on 600x600px or 300x300px images in order for it to work with the config file, and then place my trained ckpt file under stuff/ssd_mobilenet_checkpoints and run your scripts as usual. Is this correct?

Thanks so much.

gustavz · 2018-04-26T05:59:32Z

Hey @uzbhutta,

I suggest you first take a closer look at TensorFlow's original Object Detection API. Try to understand how training and inference work and which scripts are usable. After that, take a look at my code and what it does.

To give you a short overview:
It does not matter what size your training images have: if you train with TF's API, they will always be resized to a fixed size that you set in the config file. This size is normally 300x300 for SSD.
But you can of course train a network on 600x600 if you like.
But then you won’t be able to use a pretrained model as a starting point, as the weights are bound to the input dimensions that you train on.

While training, you get several checkpoints at an interval that you also set in the config.

And finally, when you want to use my API to do inference, you need to export one of those checkpoints to a frozen model in the .pb format.

This frozen model can then be included in my API and addressed correctly in my config.yml.

And another thing: make sure to use my checkpoint files as the starting point, as my speed hack (the split model + multithreading) only works if your model has the exact same layer names as mine.

I hope I could clarify some things for you.

Cheers
Gustav
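
Editor's note: the export step described above (checkpoint to frozen .pb) is normally done with the export_inference_graph.py script that ships with the Object Detection API. The following is a minimal sketch that only assembles the command line. It assumes you run it from the `research` directory of a tensorflow/models checkout, and all three paths in the example are hypothetical placeholders, not files from this repository.

```python
import shlex
import sys


def build_export_command(pipeline_config, checkpoint_prefix, output_dir):
    """Return the argv list for the Object Detection API export script.

    The script path is relative to the `research` directory of a
    tensorflow/models checkout (an assumption about your layout).
    """
    return [
        sys.executable,
        "object_detection/export_inference_graph.py",
        "--input_type", "image_tensor",
        "--pipeline_config_path", pipeline_config,
        "--trained_checkpoint_prefix", checkpoint_prefix,
        "--output_directory", output_dir,
    ]


if __name__ == "__main__":
    # All three paths below are hypothetical placeholders.
    cmd = build_export_command(
        "training/ssd_mobilenet_v1_coco.config",
        "training/model.ckpt-20000",
        "exported_model",
    )
    # Print the command instead of running it. Once the paths are real,
    # pass `cmd` to subprocess.run(cmd, check=True).
    print(" ".join(shlex.quote(part) for part in cmd))
```

The script should write a frozen graph file into the output directory; that file is what the config.yml of this project then needs to point to.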

David-Lee-1990 · 2018-06-06T05:40:36Z

@gustavz where is your checkpoint file? I trained on my own labeled data with TensorFlow's Object Detection API using your config file located in models/ssd_mobilenet_v11_coco/. After training, I replaced the frozen graph in models/ssd_mobilenet_v11_coco/.

When running inference, I get an error:

ValueError: Node 'Preprocessor/map/TensorArray_2': Unknown input node 'Preprocessor/map/strided_slice'

I wonder why my frozen graph has the Node 'Preprocessor/map/TensorArray_2' but your frozen graph does not.
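
Editor's note: an "Unknown input node" error means that a node lists an input that is not defined in the graph being imported. One way this can happen is when only part of a graph is kept and a kept node still refers to a dropped one. The sketch below illustrates the check on a toy graph. The `(name, inputs)` tuples are a simplified stand-in for the `name` and `input` fields of real graph nodes, and the toy graph itself is hypothetical, not the actual structure of the model discussed here.

```python
def find_unknown_inputs(nodes):
    """Return (node_name, input_name) pairs whose input is not defined.

    `nodes` is a list of (name, inputs) tuples, a simplified stand-in
    for the name and input fields of the nodes in a frozen graph.
    """
    defined = {name for name, _ in nodes}
    missing = []
    for name, inputs in nodes:
        for raw in inputs:
            # Strip the control-dependency prefix ("^") and the output
            # index suffix (":0") to get the producing node's name.
            producer = raw.lstrip("^").split(":")[0]
            if producer not in defined:
                missing.append((name, producer))
    return missing


# Hypothetical three-node graph, for illustration only.
toy_graph = [
    ("image_tensor", []),
    ("Preprocessor/map/strided_slice", ["image_tensor"]),
    ("Preprocessor/map/TensorArray_2", ["Preprocessor/map/strided_slice"]),
]

# Drop one node, as a partial import might, and look for dangling inputs.
kept = [n for n in toy_graph if n[0] != "Preprocessor/map/strided_slice"]
print(find_unknown_inputs(kept))
# [('Preprocessor/map/TensorArray_2', 'Preprocessor/map/strided_slice')]
```

Running the same check on the node list of a real frozen graph (full and partial) can show which names a partial import would be missing.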

gustavz · 2018-06-06T07:59:15Z

@David-Lee-1990
Which version of the model_zoo did you take? (Which date is appended at the end?)
TensorFlow seems to have changed some layer names in versions newer than the one I used (2017_11_17).

My checkpoint file is inside the model dir of ssd_mobilenet: https://github.com/GustavZ/realtime_object_detection/tree/master/models/ssd_mobilenet_v11_coco

With this checkpoint it should work, at least it did for my retrainings.

I hope I could help you!

David-Lee-1990 · 2018-06-07T02:23:22Z

@gustavz I retrained on my data using the config file and the model.ckpt files in your model dir of ssd_mobilenet. But after that, I still encounter the same problem (Node 'Preprocessor/map/TensorArray_2'). I wonder whether this is caused by a TensorFlow version difference? My TensorFlow version is 1.8.

gustavz · 2018-06-07T06:03:50Z

Yes, pretty sure.
There are so many changes between versions that lead to strange behavior and errors.
I also keep switching versions all the time when I face errors.

Try tf 1.4 that’s where I started this project.

David-Lee-1990 · 2018-06-09T13:33:26Z

tf 1.4 can no longer be used for training with TensorFlow's Object Detection API, because it fails with: AttributeError: module 'tensorflow.contrib.data' has no attribute 'parallel_interleave'.

I tried tf 1.5 to retrain the model, but the resulting graph still has the node 'Preprocessor/map/TensorArray_2'.
This drives me crazy!

AnthonyLabaere · 2018-06-15T09:13:26Z

Hi @gustavz,

First of all thanks for your work. It's really great.

However I have the same problem :/

Traceback (most recent call last):
  File "...\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\importer.py", line 489, in import_graph_def
    graph._c_graph, serialized, options)  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Node 'Preprocessor/map/TensorArray_2': Unknown input node 'Preprocessor/map/strided_slice'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "...\realtime_object_detection-2.0\run_objectdetection.py", line 178, in <module>
    config.NUM_CLASSES,config.SPLIT_MODEL, config.SSD_SHAPE).prepare_od_model()
  File "...\realtime_object_detection-2.0\rod\model.py", line 157, in prepare_od_model
    self.load_frozenmodel()
  File "...\realtime_object_detection-2.0\rod\model.py", line 129, in load_frozenmodel
    tf.import_graph_def(remove, name='')
  File "...\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\util\deprecation.py", line 432, in new_func
    return func(*args, **kwargs)
  File "...\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\importer.py", line 493, in import_graph_def
    raise ValueError(str(e))
ValueError: Node 'Preprocessor/map/TensorArray_2': Unknown input node 'Preprocessor/map/strided_slice'

I trained my model with tf 1.8, replaced your configuration and model with mine and tried a run.
The same issue occurs when I try a run with your release 1.0.

For information:

I ran object_detection.py on your release 1.0 and run_objectdetection.py on 2.0: it worked with your default configuration.
I ran object_detection_tutorial (from object_detection) with my model and it worked.

AnthonyLabaere · 2018-06-15T10:00:08Z

OK, my bad: I turned off SPLIT_MODEL and it works now.

gustavz · 2018-06-15T14:41:11Z

Don't use v2.0.
Use master.
I will update that next week.

David-Lee-1990 · 2018-06-15T14:46:51Z

@AnthonyLabaere Hi, after turning on SPLIT_MODEL, does your model work now? Is the ValueError: Node 'Preprocessor/map/TensorArray_2' gone?

gustavz · 2018-06-15T14:54:28Z

Again: the split_model speed hack will ONLY work with ssd_mobilenet_v1 models that are exported from the exact same checkpoint that I used and published in /models.
TensorFlow and also the SSDMetaArch inside models/object_detection keep changing.

I have no insight into this as I am not working with SSD anymore.
If you want to apply the speed hack to those models you need to investigate on your own. Sorry.

But if you find a solution you are very welcome to contribute / file a PR.

Gustav

David-Lee-1990 · 2018-06-15T15:03:49Z

@gustavz ok, thanks!

AnthonyLabaere · 2018-06-15T15:06:20Z

@David-Lee-1990 I just succeeded in making it work on my computer (on Windows) and on my Raspberry Pi (with some updates) with my model.
And yes, the issue with 'Preprocessor/map/TensorArray_2' is gone, because this part (with SPLIT_MODEL true) concerns the GPU.

@gustavz If I find a "real" solution I will make a PR, but for now I haven't found anything :/ OK, I will use master in the future.

David-Lee-1990 · 2018-06-15T15:18:49Z

@AnthonyLabaere was your model trained with TensorFlow's Object Detection API? What do you mean by saying "'Preprocessor/map/TensorArray_2' is gone because this part concerns the GPU"?
I checked the frozen graph generated by TensorFlow and found that after the node 'TensorArray_2', the graph goes directly to the Batch-NMS nodes without feature extraction.

AnthonyLabaere · 2018-06-15T15:46:13Z

@David-Lee-1990 yes, it was trained with TensorFlow's Object Detection API.
Well, concerning the 'Preprocessor/map/TensorArray_2', I spoke too soon. I don't know why the problem is gone, sorry.

How do you see that? With TensorBoard?

naisy · 2018-06-16T08:47:20Z

Hi,

The split model hack is only available for ssd_mobilenet_v1 with 300x300 input.
'Preprocessor/map/TensorArray_2' appears when training with 600x600 images.

Set the size in your ssd_mobilenet_v1_coco.config to 300x300.

    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }

See also:
https://github.com/tensorflow/models/issues/3270
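
Editor's note: to confirm which input size a pipeline config actually sets, a quick text check can help. The sketch below is a plain regular-expression scan, not a protobuf parser, so it assumes the `fixed_shape_resizer` block is written in the simple form shown above, without nested blocks or comments inside it. The function name is made up for this example.

```python
import re


def fixed_resizer_shape(config_text):
    """Return (height, width) from a fixed_shape_resizer block, or None."""
    block = re.search(r"fixed_shape_resizer\s*\{([^{}]*)\}", config_text)
    if block is None:
        return None
    height = re.search(r"\bheight\s*:\s*(\d+)", block.group(1))
    width = re.search(r"\bwidth\s*:\s*(\d+)", block.group(1))
    if height is None or width is None:
        return None
    return int(height.group(1)), int(width.group(1))


# Same fragment as in the post above.
sample = """
image_resizer {
  fixed_shape_resizer {
    height: 300
    width: 300
  }
}
"""

print(fixed_resizer_shape(sample))  # (300, 300)
```

In practice you would read the text from your own config file and compare the result with (300, 300) before training.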

David-Lee-1990 · 2018-06-17T16:42:06Z

@naisy Hi, have you tried this 300x300 config? In fact, my config has been set to 300x300 all along, but the error is still there.

naisy · 2018-06-18T01:53:29Z

Hi @David-Lee-1990,

I checked the config just now. The config in the master branch was changed.
Please use the r1.5 branch for ssd_mobilenet_v1.

--- r1.5	2018-06-18 01:43:31.752331891 +0000
+++ master	2018-06-18 01:43:18.056376250 +0000
@@ -108,12 +108,10 @@
     loss {
       classification_loss {
         weighted_sigmoid {
-          anchorwise_output: true
         }
       }
       localization_loss {
         weighted_smooth_l1 {
-          anchorwise_output: true
         }
       }
       hard_example_miner {
@@ -193,5 +191,4 @@
   label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
   shuffle: false
   num_readers: 1
-  num_epochs: 1
 }

My own training is here:
https://github.com/naisy/train_ssd_mobilenet
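
Editor's note: the diff above shows two fields, `anchorwise_output` and `num_epochs`, that appear in the r1.5 sample config but not in the master one. A rough way to tell which style a given config follows is to look for those keys. This is a minimal sketch based only on that diff; the presence of the fields is a hint, not proof of which branch a config came from, and the function name is made up for this example.

```python
import re

# Fields that the diff above shows in the r1.5 sample config but not in master.
R15_ONLY_FIELDS = ("anchorwise_output", "num_epochs")


def r15_only_fields_present(config_text):
    """Return the r1.5-only field names that occur as keys in the text."""
    found = []
    for field in R15_ONLY_FIELDS:
        # Match the field only where it starts a line as a "key:" entry.
        pattern = r"^\s*" + re.escape(field) + r"\s*:"
        if re.search(pattern, config_text, re.MULTILINE):
            found.append(field)
    return found


r15_style = """
weighted_sigmoid {
  anchorwise_output: true
}
num_readers: 1
num_epochs: 1
"""
master_style = """
weighted_sigmoid {
}
num_readers: 1
"""

print(r15_only_fields_present(r15_style))     # ['anchorwise_output', 'num_epochs']
print(r15_only_fields_present(master_style))  # []
```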

David-Lee-1990 · 2018-06-20T01:41:02Z

@naisy Thank you for your tips. Problem solved!

gustavz mentioned this issue Jun 6, 2018

ValueError: Node 'Preprocessor/map/TensorArray_2': Unknown input node 'Preprocessor/map/strided_slice' #20

Open
