train.py and model_main.py #6100

Bahramudin · 2019-01-27T14:20:48Z

System information

What is the top-level directory of the model you are using: object_detection
Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Both Windows and Linux
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 1.12.0
Bazel version (if compiling from source): N/A
CUDA/cuDNN version: 9.0
GPU model and memory: GTX 1080 8GB
Exact command to reproduce:

I want to ask about the new version of the Object Detection API: train.py has been moved to the legacy folder and model_main.py has been newly added, but the documentation says nothing about why train.py was moved to the legacy folder or what we should use instead to train our own models.

Which one is better to use now, and how should it be used? And why is it better than train.py?

We need to know the difference between them in order to take advantage of the new version.

joyyang1215 · 2019-01-28T07:16:11Z

I have the same question!

derekjchow · 2019-01-28T17:48:38Z

There are a couple of reasons why we moved to model_main:

The refactored model_main allows for TPU training on Google Cloud. This significantly improves training times for models that can leverage TPUs.
Using the Estimator API, model_main supports training and evaluation on the same binary. This enables one machine to interleave training/evaluation with one call to model_main. This was more difficult for the legacy binaries, which would require stopping the training binaries to begin evaluation (or running training/evaluation on different machines).

We recommend that users use model_main.py moving forward. We left the old binaries in the legacy folder because there were a few feature gaps between the two systems when model_main was introduced. As of today, we believe the gaps should be closed (but feel free to file a bug with us if you think that isn't the case).
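
For readers asking how to invoke the new binary, here is a minimal sketch that assembles a model_main.py command line in Python. The flag names (--pipeline_config_path, --model_dir, --num_train_steps, --sample_1_of_n_eval_examples, --alsologtostderr) are the ones model_main.py defined around the time of this thread and should be checked against your checkout. The helper name build_model_main_command, the paths, and the step count are hypothetical placeholders.

```python
import sys


def build_model_main_command(pipeline_config_path, model_dir,
                             num_train_steps=None,
                             sample_1_of_n_eval_examples=1):
    """Return an argv list for object_detection/model_main.py (hypothetical helper)."""
    # Each flag is a single "--name=value" token with no spaces around "=".
    cmd = [
        sys.executable,
        "object_detection/model_main.py",
        "--pipeline_config_path={}".format(pipeline_config_path),
        "--model_dir={}".format(model_dir),
        "--sample_1_of_n_eval_examples={}".format(sample_1_of_n_eval_examples),
        "--alsologtostderr",
    ]
    # Only pass a step count when the caller wants to override the config.
    if num_train_steps is not None:
        cmd.append("--num_train_steps={}".format(num_train_steps))
    return cmd


if __name__ == "__main__":
    # Placeholder paths; replace with your own config file and output directory.
    command = build_model_main_command(
        "training/pipeline.config", "training/", num_train_steps=50000)
    print(" ".join(command))
    # To actually launch training with interleaved evaluation, run from
    # models/research:
    #   import subprocess; subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.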

Bahramudin · 2019-01-29T01:56:02Z

@derekjchow Thanks for the reply! One more question: does model_main also improve training quality (accuracy and speed) compared with the old train.py?

To better understand the TF Object Detection API, I have some doubts, and I would really appreciate it if you could clear them up.
Which of the following factors affect training quality, and which have no effect?

TensorFlow version: Previously I was using TF 1.5 to train, for example, faster_rcnn_inception_v2_coco, which was released on 2018-01-28, but now I am using TF 1.12 to train the same pre-trained model. Do the two resulting models have the same quality? If not, what is the benefit of a higher TF version for training our own dataset on the pre-trained models?
Slim-based/Estimator-based training: We are still using Slim-based training, but what will happen if we use Estimator-based training?

I am asking these questions because I want to use the API in the most correct way and get the most out of it.

Thanks!

derekjchow · 2019-01-29T17:23:30Z

There shouldn't be changes in model quality (in either speed or accuracy).

The slim/estimator API difference is superficial. In fact, if you look at our estimator implementation, you'll discover it wraps an internal detection API which is built with slim.

Bahramudin · 2019-01-30T01:30:39Z

@derekjchow So does that mean there is no difference in the result whether I train on the dataset with TF 1.5 or 1.12? If so, what benefits do the newer versions bring us?

Then how can accuracy and speed be improved? If that depends on the model itself, the models and the model zoo are becoming outdated (most of them have not been updated for almost a year). It would be better to update the model zoo to reflect the latest changes.

Thanks!

austinmw · 2019-04-07T08:04:29Z

Also, model_main.py doesn't support multi-GPU training.

mihuzz · 2019-06-10T15:49:49Z

I can't run training.

models\research\object_detection>python model_main.py --logtostderr--model_dir=training/--pipeline_config_path=training/faster_rcnn_inception_v2_coco_2018_01_28
Traceback (most recent call last):
File "model_main.py", line 26, in <module>
from object_detection import model_lib
File "D:\Lesson8\models\research\object_detection\model_lib.py", line 27, in <module>
from object_detection import eval_util
File "D:\Lesson8\models\research\object_detection\eval_util.py", line 28, in <module>
from object_detection.metrics import coco_evaluation
File "D:\Lesson8\models\research\object_detection\metrics\coco_evaluation.py", line 20, in <module>
from object_detection.metrics import coco_tools
File "D:\Lesson8\models\research\object_detection\metrics\coco_tools.py", line 47, in <module>
from pycocotools import coco
File "D:\Lesson8\models\research\pycocotools\coco.py", line 55, in <module>
from . import mask as maskUtils
File "D:\Lesson8\models\research\pycocotools\mask.py", line 3, in <module>
import pycocotools._mask as _mask
ModuleNotFoundError: No module named 'pycocotools._mask'

01Root · 2019-07-27T08:14:10Z

I had the same problem. Do you know how to fix it now?

khushi2091 · 2019-10-08T05:32:18Z

This issue was resolved for me, while training on a Linux system, after installing the following packages:
pip install Cython
pip install pycocotools
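
To confirm that the fix took effect before starting a long training run, a small check like the following may help. It is a sketch only: the helper name compiled_module_available is hypothetical, and it merely asks the import system whether pycocotools._mask can be located. In the traceback above, pycocotools is loaded from models\research\pycocotools, which suggests a source copy without the compiled _mask extension is shadowing (or standing in for) an installed package.

```python
import importlib.util


def compiled_module_available(name):
    """Return True if the import system can locate the module `name`."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package (e.g. "pycocotools") cannot be imported.
        return False


if __name__ == "__main__":
    if compiled_module_available("pycocotools._mask"):
        print("pycocotools._mask found")
    else:
        print("pycocotools._mask missing; try: pip install Cython pycocotools")
```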

pamrani · 2019-12-26T04:26:09Z

(tensorflow) D:\my-work\WiS - alert - 2\models\research\object_detection>python train.py --logtostderr --train_dir= D:/my-work/WiS - alert - 2 /models/research/object_detection/training/ --pipeline_config_path= D:/my-work/WiS - alert - 2 /models/research/object_detection/training/ssd_mobilenet_v1_coco.config
D:\installation\anaconda\envs\tensorflow\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Traceback (most recent call last):
File "train.py", line 164, in <module>
tf.app.run()
File "D:\installation\anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
_sys.exit(main(argv))
File "train.py", line 88, in main
assert FLAGS.train_dir, 'train_dir is missing.'
AssertionError: train_dir is missing.
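
The assertion fires because FLAGS.train_dir is empty. A likely cause, judging only from the command as posted, is the space after "--train_dir=" together with the unquoted spaces in the path: the shell then passes "--train_dir=" as one token with an empty value. The sketch below illustrates this with Python's shlex, which follows POSIX splitting rules; the Windows command prompt differs in details but also splits on unquoted spaces. The helper name flag_value is hypothetical.

```python
import shlex


def flag_value(command_line, flag):
    """Return the value attached to `--flag=` in a command line, or None."""
    prefix = "--{}=".format(flag)
    for token in shlex.split(command_line):
        if token.startswith(prefix):
            return token[len(prefix):]
    return None


if __name__ == "__main__":
    # As posted: a space after "=" and unquoted spaces inside the path.
    broken = ("python train.py --logtostderr --train_dir= D:/my-work/WiS - alert - 2 "
              "/models/research/object_detection/training/")
    # Quoting the whole flag keeps the path attached to it.
    fixed = ('python train.py --logtostderr '
             '"--train_dir=D:/my-work/WiS - alert - 2/models/research/object_detection/training/"')
    print(repr(flag_value(broken, "train_dir")))
    print(repr(flag_value(fixed, "train_dir")))
```

The same reasoning would apply to --pipeline_config_path in that command.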

jhonjam · 2020-04-14T15:52:42Z

For the pycocotools._mask error above, you have to run the following:
import os
# A ":" separator is needed between the existing PYTHONPATH and the new entries.
os.environ['PYTHONPATH'] = "{}:/path/research:/path/models/research/object_detection:/path/models/research/slim".format(os.environ.get('PYTHONPATH', ''))
!python object_detection/builders/model_builder_test.py

!protoc ./object_detection/protos/*.proto --python_out=.
!python3 setup.py build
!python3 setup.py install
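
The PYTHONPATH step above can be written a little more defensively. The following is a sketch under stated assumptions: extend_pythonpath is a hypothetical helper, and /path/models/research is a placeholder for wherever the models repository was cloned. It uses os.pathsep so the separator is correct on both Linux (":") and Windows (";"), and it tolerates PYTHONPATH being unset.

```python
import os


def extend_pythonpath(new_dirs, current=None):
    """Return a PYTHONPATH string with `new_dirs` appended to `current`."""
    # Drop empty entries so an unset or empty variable adds no stray separator.
    parts = [p for p in (current or "").split(os.pathsep) if p]
    for d in new_dirs:
        if d not in parts:
            parts.append(d)
    return os.pathsep.join(parts)


if __name__ == "__main__":
    # Placeholder checkout location; adjust to where models/ was cloned.
    research = "/path/models/research"
    os.environ["PYTHONPATH"] = extend_pythonpath(
        [research, research + "/slim"],
        os.environ.get("PYTHONPATH"),
    )
    print(os.environ["PYTHONPATH"])
```

Note that setting os.environ["PYTHONPATH"] affects only processes started afterwards (such as the "!python ..." notebook commands), not the import path of the interpreter that is already running.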

mihuzz · 2020-04-14T18:40:28Z

models\research\object_detection>python model_main.py --logtostderr--model_dir=training/--pipeline_config_path=training/faster_rcnn_inception_v2_coco_2018_01_28

another slash

jasonng1711 · 2020-11-10T03:45:04Z

Hi, I'm having a problem:

(tensorflow1) C:\tensorflow1\models\research\object_detection>python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config
2020-11-10 10:38:32.224380: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
Traceback (most recent call last):
File "train.py", line 53, in <module>
from object_detection.builders import model_builder
File "C:\tensorflow1\models\research\object_detection\builders\model_builder.py", line 66, in <module>
from object_detection.models import ssd_efficientnet_bifpn_feature_extractor as ssd_efficientnet_bifpn
File "C:\tensorflow1\models\research\object_detection\models\ssd_efficientnet_bifpn_feature_extractor.py", line 33, in <module>
from official.vision.image_classification.efficientnet import efficientnet_model
File "C:\tensorflow1\models\official\vision\image_classification\efficientnet\efficientnet_model.py", line 37, in <module>
from official.vision.image_classification import preprocessing
File "C:\tensorflow1\models\official\vision\image_classification\preprocessing.py", line 25, in <module>
from official.vision.image_classification import augment
File "C:\tensorflow1\models\official\vision\image_classification\augment.py", line 31, in <module>
from tensorflow.python.keras.layers.preprocessing import image_preprocessing as image_ops
ImportError: cannot import name 'image_preprocessing' from 'tensorflow.python.keras.layers.preprocessing' (C:\Users\USER\anaconda3\envs\tensorflow1\lib\site-packages\tensorflow_core\python\keras\layers\preprocessing\__init__.py)

Any advice? Thanks.

ymodak added the models:research models that come under research directory label Jan 29, 2019

EdjeElectronics mentioned this issue Sep 21, 2019

Are you planning to try "model_main.py" again instead of "train.py" EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10#184

Open

TWJubb mentioned this issue Feb 18, 2020

model_main.py training only #8151

Open

ravikyram added the type:support label Jul 1, 2020

ravikyram assigned tombstone, jch1 and pkulzc Jul 1, 2020

jaeyounkim added models:research:odapi ODAPI and removed models:research models that come under research directory labels Jun 25, 2021

saberkun closed this as completed Dec 14, 2021
