Rakshith Shetty (PhD Student)

MSc Rakshith Shetty

Max-Planck-Institut für Informatik
Saarland Informatics Campus
Campus E1 4
66123 Saarbrücken
E1 4 - 628
+49 681 9325 2028
+49 681 9325 2099


Towards Automated Testing and Robustification by Semantic Adversarial Data Generation
R. Shetty, M. Fritz and B. Schiele
Computer Vision - ECCV 2020, 2020
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing
V. Agarwal, R. Shetty and M. Fritz
Technical Report, 2019
(arXiv: 1912.07538)
Despite significant success in Visual Question Answering (VQA), VQA models have been shown to be notoriously brittle to linguistic variations in the questions. Due to deficiencies in models and datasets, today's models often rely on correlations rather than predictions that are causal w.r.t. data. In this paper, we propose a novel way to analyze and measure the robustness of state-of-the-art models w.r.t. semantic visual variations, and we propose ways to make models more robust against spurious correlations. Our method performs automated semantic image manipulations and tests for consistency in model predictions, both to quantify model robustness and to generate synthetic data to counter these problems. We perform our analysis on three diverse, state-of-the-art VQA models and diverse question types, with a particular focus on challenging counting questions. In addition, we show that models can be made significantly more robust against inconsistent predictions using our edited data. Finally, we show that results also translate to real-world error cases of state-of-the-art models, which results in improved overall performance.
Not Using the Car to See the Sidewalk: Quantifying and Controlling the Effects of Context in Classification and Segmentation
R. Shetty, B. Schiele and M. Fritz
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 2019
Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions
M. Wagner, H. Basevi, R. Shetty, W. Li, M. Malinowski, M. Fritz and A. Leonardis
Computer Vision - ECCV 2018 Workshops, 2018
Image and Video Captioning with Augmented Neural Architectures
R. Shetty, H. R. Tavakoli and J. Laaksonen
IEEE MultiMedia, Volume 25, Number 2, 2018
Adversarial Scene Editing: Automatic Object Removal from Weak Supervision
R. Shetty, M. Fritz and B. Schiele
Advances in Neural Information Processing Systems 31, 2018
While great progress has been made recently in automatic image manipulation, it has been limited to object-centric images like faces or structured scene datasets. In this work, we take a step towards general scene-level image editing by developing an automatic interaction-free object removal model. Our model learns to find and remove objects from general scene images using image-level labels and unpaired data in a generative adversarial network (GAN) framework. We achieve this with two key contributions: a two-stage editor architecture consisting of a mask generator and image in-painter that co-operate to remove objects, and a novel GAN-based prior for the mask generator that allows us to flexibly incorporate knowledge about object shapes. We experimentally show on two datasets that our method effectively removes a wide variety of objects using weak supervision only.
A4NT: Author Attribute Anonymity by Adversarial Training of Neural Machine Translation
R. Shetty, B. Schiele and M. Fritz
Proceedings of the 27th USENIX Security Symposium, 2018
Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training
R. Shetty, M. Rohrbach, L. A. Hendricks, M. Fritz and B. Schiele
IEEE International Conference on Computer Vision (ICCV 2017), 2017
Paying Attention to Descriptions Generated by Image Captioning Models
H. R. Tavakoli, R. Shetty, A. Borji and J. Laaksonen
IEEE International Conference on Computer Vision (ICCV 2017), 2017