D2
Computer Vision and Machine Learning

Rakshith Shetty (PhD Student)

Publications

2021
Seeking Similarities over Differences: Similarity-based Domain Alignment for Adaptive Object Detection
F. Rezaeianaran, R. Shetty, R. Aljundi, D. O. Reino, S. Zhang and B. Schiele
Technical Report, 2021
(arXiv: 2110.01428)
Abstract
In order to robustly deploy object detectors across a wide range of scenarios, they should be adaptable to shifts in the input distribution without the need to constantly annotate new data. This has motivated research in Unsupervised Domain Adaptation (UDA) algorithms for detection. UDA methods learn to adapt from labeled source domains to unlabeled target domains, by inducing alignment between detector features from source and target domains. Yet, there is no consensus on what features to align and how to do the alignment. In our work, we propose a framework that generalizes the different components commonly used by UDA methods laying the ground for an in-depth analysis of the UDA design space. Specifically, we propose a novel UDA algorithm, ViSGA, a direct implementation of our framework, that leverages the best design choices and introduces a simple but effective method to aggregate features at instance-level based on visual similarity before inducing group alignment via adversarial training. We show that both similarity-based grouping and adversarial training allows our model to focus on coarsely aligning feature groups, without being forced to match all instances across loosely aligned domains. Finally, we examine the applicability of ViSGA to the setting where labeled data are gathered from different sources. Experiments show that not only our method outperforms previous single-source approaches on Sim2Real and Adverse Weather, but also generalizes well to the multi-source setting.
Adversarial Content Manipulation for Analyzing and Improving Model Robustness
R. Shetty
PhD Thesis, Universität des Saarlandes, 2021
2020
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing
V. Agarwal, R. Shetty and M. Fritz
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), 2020
Diverse and Relevant Visual Storytelling with Scene Graph Embeddings
X. Hong, R. Shetty, A. Sayeed, K. Mehra, V. Demberg and B. Schiele
Proceedings of the 24th Conference on Computational Natural Language Learning (CoNLL 2020), 2020
Towards Automated Testing and Robustification by Semantic Adversarial Data Generation
R. Shetty, M. Fritz and B. Schiele
Computer Vision -- ECCV 2020, 2020
2019
Not Using the Car to See the Sidewalk: Quantifying and Controlling the Effects of Context in Classification and Segmentation
R. Shetty, B. Schiele and M. Fritz
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 2019
2018
Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions
M. Wagner, H. Basevi, R. Shetty, W. Li, M. Malinowski, M. Fritz and A. Leonardis
Computer Vision - ECCV 2018 Workshops, 2018
Adversarial Scene Editing: Automatic Object Removal from Weak Supervision
R. Shetty, M. Fritz and B. Schiele
Advances in Neural Information Processing Systems 31 (NeurIPS 2018), 2018
Abstract
While great progress has been made recently in automatic image manipulation, it has been limited to object centric images like faces or structured scene datasets. In this work, we take a step towards general scene-level image editing by developing an automatic interaction-free object removal model. Our model learns to find and remove objects from general scene images using image-level labels and unpaired data in a generative adversarial network (GAN) framework. We achieve this with two key contributions: a two-stage editor architecture consisting of a mask generator and image in-painter that co-operate to remove objects, and a novel GAN based prior for the mask generator that allows us to flexibly incorporate knowledge about object shapes. We experimentally show on two datasets that our method effectively removes a wide variety of objects using weak supervision only
A4NT: Author Attribute Anonymity by Adversarial Training of Neural Machine Translation
R. Shetty, B. Schiele and M. Fritz
Proceedings of the 27th USENIX Security Symposium, 2018
Image and Video Captioning with Augmented Neural Architectures
R. Shetty, H. R. Tavakoli and J. Laaksonen
IEEE MultiMedia, Volume 25, Number 2, 2018
2017
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training
R. Shetty, M. Rohrbach, L. A. Hendricks, M. Fritz and B. Schiele
IEEE International Conference on Computer Vision (ICCV 2017), 2017
Paying Attention to Descriptions Generated by Image Captioning Models
H. R. Tavakoli, R. Shetty, A. Borji and J. Laaksonen
IEEE International Conference on Computer Vision (ICCV 2017), 2017