Hosnieh Sattar (PhD Student)

MSc Hosnieh Sattar

Max-Planck-Institut für Informatik
Saarland Informatics Campus
+49 681 9325 2000
+49 681 9325 2099
Get email via email

Personal Information


Research Interests

  • Machine Learning and Pattern Recognition
  • Eye Tracking and Visual Cognition
  • Image Analysis and Computer vision
  • Human-Computer Interaction



  • 2015–present, Ph.D. student in Computer Science, Max Planck Institute for Informatics
  • 2014, M.Sc. in Visual Computing, Saarland University
  • 2011, B.Sc. in Biomedical Engineering, Islamic Azad University of Mashhad


  • PDE and Boundary Value Problems, Saarland University, (Dr. Darya Apushkinskaya, 2013/14)

Research Projects



Fashion is Taking Shape: Understanding Clothing Preference Based on Body Shape From Online Sources
H. Sattar, G. Pons-Moll and M. Fritz
2019 IEEE Winter Conference on Applications of Computer Vision (WACV 2019), 2019
Shape Evasion: Preventing Body Shape Inference of Multi-Stage Approaches
H. Sattar, K. Krombholz, G. Pons-Moll and M. Fritz
Technical Report, 2019
(arXiv: 1905.11503)
Modern approaches to pose and body shape estimation have recently achieved strong performance even under challenging real-world conditions. Even from a single image of a clothed person, a realistic looking body shape can be inferred that captures a users' weight group and body shape type well. This opens up a whole spectrum of applications -- in particular in fashion -- where virtual try-on and recommendation systems can make use of these new and automatized cues. However, a realistic depiction of the undressed body is regarded highly private and therefore might not be consented by most people. Hence, we ask if the automatic extraction of such information can be effectively evaded. While adversarial perturbations have been shown to be effective for manipulating the output of machine learning models -- in particular, end-to-end deep learning approaches -- state of the art shape estimation methods are composed of multiple stages. We perform the first investigation of different strategies that can be used to effectively manipulate the automatic shape estimation while preserving the overall appearance of the original image.
Predicting the Category and Attributes of Visual Search Targets Using Deep Gaze Pooling
H. Sattar, A. Bulling and M. Fritz
2017 IEEE International Conference on Computer Vision Workshops (MBCC @ICCV 2017), 2017
Previous work focused on predicting visual search targets from human fixations but, in the real world, a specific target is often not known, e.g. when searching for a present for a friend. In this work we instead study the problem of predicting the mental picture, i.e. only an abstract idea instead of a specific target. This task is significantly more challenging given that mental pictures of the same target category can vary widely depending on personal biases, and given that characteristic target attributes can often not be verbalised explicitly. We instead propose to use gaze information as implicit information on users' mental picture and present a novel gaze pooling layer to seamlessly integrate semantic and localized fixation information into a deep image representation. We show that we can robustly predict both the mental picture's category as well as attributes on a novel dataset containing fixation data of 14 users searching for targets on a subset of the DeepFahion dataset. Our results have important implications for future search interfaces and suggest deep gaze pooling as a general-purpose approach for gaze-supported computer vision systems.
Visual Decoding of Targets During Visual Search From Human Eye Fixations
H. Sattar, M. Fritz and A. Bulling
Technical Report, 2017
(arXiv: 1706.05993)
What does human gaze reveal about a users' intents and to which extend can these intents be inferred or even visualized? Gaze was proposed as an implicit source of information to predict the target of visual search and, more recently, to predict the object class and attributes of the search target. In this work, we go one step further and investigate the feasibility of combining recent advances in encoding human gaze information using deep convolutional neural networks with the power of generative image models to visually decode, i.e. create a visual representation of, the search target. Such visual decoding is challenging for two reasons: 1) the search target only resides in the user's mind as a subjective visual pattern, and can most often not even be described verbally by the person, and 2) it is, as of yet, unclear if gaze fixations contain sufficient information for this task at all. We show, for the first time, that visual representations of search targets can indeed be decoded only from human gaze fixations. We propose to first encode fixations into a semantic representation and then decode this representation into an image. We evaluate our method on a recent gaze dataset of 14 participants searching for clothing in image collages and validate the model's predictions using two human studies. Our results show that 62% (Chance level = 10%) of the time users were able to select the categories of the decoded image right. In our second studies we show the importance of a local gaze encoding for decoding visual search targets of user
Prediction of Search Targets from Fixations in Open-world Settings
H. Sattar, S. Müller, M. Fritz and A. Bulling
IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), 2015