Beh Jing Chong
Chan Chee Seng
Division of Data Strategic and Foresight, Ministry of Science, Technology and Innovation (MOSTI)
Unsupervised image-to-image (I2I) translation has been extensively studied as a
data augmentation technique for domain adaptation in object detection. Despite the
advent of GANs in I2I translation, most methods still operate at
the global level, while work at the instance level remains scarce. Global-level
I2I translation has been shown to perform poorly on content-rich images, which makes
it unsuitable for object detection domain adaptation, and most instance-level I2I
translation methods require annotation labels or a pretrained subnet for training. In this work, we
propose a novel method for global-level I2I translation that preserves content
with high fidelity without integrating an object detection model. We introduce masking and
a cycle-object content consistency loss, which together encourage the preservation of instances’ content.
We show that our approach achieves high-quality translation results in content-rich
scenarios. Moreover, we propose modifications to the mean average precision (mAP)
metric to better evaluate object detection models in terms of both
classification results and bounding box predictions. Extensive experiments show that our
modifications improve how the mAP score penalizes false positives.