Publications of the International Max Planck Research School for Computer Science
Master Thesis
2021
[1]
D. M. H. Nguyen, “Lifted Multi-Cut Optimization for Multi-Camera Multi-People Tracking,” Universität des Saarlandes, Saarbrücken, 2021.
Export
BibTeX
@mastersthesis{NguyenMaster21,
TITLE = {Lifted Multi-Cut Optimization for Multi-Camera Multi-People Tracking},
AUTHOR = {Nguyen, Duy Minh Ho},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
}
Endnote
%0 Thesis
%A Nguyen, Duy Minh Ho
%Y Swoboda, Paul
%A referee: Schiele, Bernt
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Lifted Multi-Cut Optimization for Multi-Camera Multi-People Tracking :
%G eng
%U http://hdl.handle.net/21.11116/0000-000B-542C-6
%I Universität des Saarlandes
%C Saarbrücken
%D 2021
%P 69 p.
%V master
%9 master
2020
[2]
T.-P. Nguyen, “Advanced Semantics for Commonsense Knowledge Extraction,” Universität des Saarlandes, Saarbrücken, 2020.
Abstract
Commonsense knowledge (CSK) about concepts and their properties is useful for AI applications such as robust chatbots. Prior works like ConceptNet, TupleKB and others compiled large CSK collections, but are restricted in their expressiveness to subject-predicate-object (SPO) triples with simple concepts for S and monolithic strings for P and O. Also, these projects have either prioritized precision or recall, but hardly reconcile these complementary goals. This thesis presents a methodology, called Ascent, to automatically build a large-scale knowledge base (KB) of CSK assertions, with advanced expressiveness and both better precision and recall than prior works. Ascent goes beyond triples by capturing composite concepts with subgroups and aspects, and by refining assertions with semantic facets. The latter are important to express temporal and spatial validity of assertions and further qualifiers. Ascent combines open information extraction with judicious cleaning using language models. Intrinsic evaluation shows the superior size and quality of the Ascent KB, and an extrinsic evaluation for QA-support tasks underlines the benefits of Ascent.
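As a rough illustration of the faceted assertions described in the abstract, the following minimal Python sketch models a composite concept with a subgroup and aspect plus qualifying facets; the field names and example values are editorial assumptions, not the actual Ascent schema.

from dataclasses import dataclass, field
from typing import Dict

@dataclass
class Subject:
    concept: str              # primary concept, e.g. "elephant"
    subgroup: str = ""        # composite concept, e.g. "African elephant"
    aspect: str = ""          # aspect of the concept, e.g. "trunk"

@dataclass
class Assertion:
    subject: Subject
    predicate: str            # open predicate phrase, e.g. "lives in"
    obj: str                  # object phrase, e.g. "savannas"
    facets: Dict[str, str] = field(default_factory=dict)
    # facets qualify the assertion, e.g. {"location": "Africa", "temporal": "during dry season"}

# hypothetical example assertion
a = Assertion(Subject("elephant", subgroup="African elephant"),
              "lives in", "savannas", facets={"temporal": "during dry season"})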
Export
BibTeX
@mastersthesis{NguyenMSc2020,
TITLE = {Advanced Semantics for Commonsense Knowledge Extraction},
AUTHOR = {Nguyen, Tuan-Phong},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2020},
DATE = {2020},
ABSTRACT = {Commonsense knowledge (CSK) about concepts and their properties is useful for AI applications such as robust chatbots. Prior works like ConceptNet, TupleKB and others compiled large CSK collections, but are restricted in their expressiveness to subject-predicate-object (SPO) triples with simple concepts for S and monolithic strings for P and O. Also, these projects have either prioritized precision or recall, but hardly reconcile these complementary goals. This thesis presents a methodology, called Ascent, to automatically build a large-scale knowledge base (KB) of CSK assertions, with advanced expressiveness and both better precision and recall than prior works. Ascent goes beyond triples by capturing composite concepts with subgroups and aspects, and by refining assertions with semantic facets. The latter are important to express temporal and spatial validity of assertions and further qualifiers. Ascent combines open information extraction with judicious cleaning using language models. Intrinsic evaluation shows the superior size and quality of the Ascent KB, and an extrinsic evaluation for QA-support tasks underlines the benefits of Ascent.},
}
Endnote
%0 Thesis
%A Nguyen, Tuan-Phong
%Y Razniewski, Simon
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Advanced Semantics for Commonsense Knowledge Extraction :
%G eng
%U http://hdl.handle.net/21.11116/0000-0007-FED0-0
%I Universität des Saarlandes
%C Saarbrücken
%D 2020
%P 67 p.
%V master
%9 master
%X Commonsense knowledge (CSK) about concepts and their properties is useful for AI applications such as robust chatbots. Prior works like ConceptNet, TupleKB and others compiled large CSK collections, but are restricted in their expressiveness to subject-predicate-object (SPO) triples with simple concepts for S and monolithic strings for P and O. Also, these projects have either prioritized precision or recall, but hardly reconcile these complementary goals. This thesis presents a methodology, called Ascent, to automatically build a large-scale knowledge base (KB) of CSK assertions, with advanced expressiveness and both better precision and recall than prior works. Ascent goes beyond triples by capturing composite concepts with subgroups and aspects, and by refining assertions with semantic facets. The latter are important to express temporal and spatial validity of assertions and further qualifiers. Ascent combines open information extraction with judicious cleaning using language models. Intrinsic evaluation shows the superior size and quality of the Ascent KB, and an extrinsic evaluation for QA-support tasks underlines the benefits of Ascent.
[3]
V. Skripniuk, “Watermarking for generative adversarial networks,” Universität des Saarlandes, Saarbrücken, 2020.
Export
BibTeX
@mastersthesis{SkriMaster2020,
TITLE = {Watermarking for generative adversarial networks},
AUTHOR = {Skripniuk, Vladislav},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2020},
DATE = {2020},
}
Endnote
%0 Thesis
%A Skripniuk, Vladislav
%A referee: Fritz, Mario
%A contributor: Zhang, Yang
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
%T Watermarking for generative adversarial networks :
%G eng
%U http://hdl.handle.net/21.11116/0000-0008-45A1-4
%I Universität des Saarlandes
%C Saarbrücken
%D 2020
%P 69 p.
%V master
%9 master
2019
[4]
M. Abouhamra, “AligNarr: Aligning Narratives of Different Length for Movie Summarization,” Universität des Saarlandes, Saarbrücken, 2019.
Abstract
Automatic text alignment is an important problem in natural language processing. It can be used to create the data needed to train different language models. Most research about automatic summarization revolves around summarizing news articles or scientific papers, which are somewhat small texts with a simple and clear structure. The bigger the difference in size between the summary and the original text, the harder the problem will be, since important information will be sparser and identifying it can be more difficult. Therefore, creating datasets from larger texts can help improve automatic summarization. In this project, we try to develop an algorithm which can automatically create a dataset for abstractive automatic summarization for bigger narrative text bodies such as movie scripts. To this end, we chose sentences as summary text units and scenes as script text units and developed an algorithm which uses some of the latest natural language processing techniques to align scenes and sentences based on the similarity in their meanings. Solving this alignment problem can provide us with important information about how to evaluate the meaning of a text, which can help us create better abstractive summarization models. We developed a method which uses different similarity scoring techniques (embedding similarity, word inclusion and entity inclusion) to align script scenes and summary sentences, which achieved an F1 score of 0.39. Analyzing our results showed that the bigger the difference in the number of text units being aligned, the more difficult the alignment problem is. We also critiqued our own similarity scoring techniques and different alignment algorithms based on integer linear programming and local optimization, showed their limitations and discussed ideas to improve them.
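As a rough illustration of the kind of scoring described in the abstract (not the thesis's actual implementation), the sketch below combines embedding similarity, word inclusion and entity inclusion into a score matrix and greedily assigns each summary sentence to a scene; the weights and input representations are assumptions.

import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def inclusion(query_tokens, target_tokens):
    # fraction of query tokens that also occur in the target
    if not query_tokens:
        return 0.0
    return len(set(query_tokens) & set(target_tokens)) / len(set(query_tokens))

def align(sent_embs, scene_embs, sent_words, scene_words, sent_ents, scene_ents,
          w=(0.6, 0.2, 0.2)):
    """Return, for each summary sentence, the index of the best-matching scene."""
    n, m = len(sent_embs), len(scene_embs)
    score = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            score[i, j] = (w[0] * cosine(sent_embs[i], scene_embs[j])
                           + w[1] * inclusion(sent_words[i], scene_words[j])
                           + w[2] * inclusion(sent_ents[i], scene_ents[j]))
    return score.argmax(axis=1)   # greedy per-sentence alignment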
Export
BibTeX
@mastersthesis{AbouhamraMSc2019,
TITLE = {{AligNarr}: Aligning Narratives of Different Length for Movie Summarization},
AUTHOR = {Abouhamra, Mostafa},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2019},
DATE = {2019},
ABSTRACT = {Automatic text alignment is an important problem in natural language processing. It can be used to create the data needed to train different language models. Most research about automatic summarization revolves around summarizing news articles or scientific papers, which are somewhat small texts with a simple and clear structure. The bigger the difference in size between the summary and the original text, the harder the problem will be, since important information will be sparser and identifying it can be more difficult. Therefore, creating datasets from larger texts can help improve automatic summarization. In this project, we try to develop an algorithm which can automatically create a dataset for abstractive automatic summarization for bigger narrative text bodies such as movie scripts. To this end, we chose sentences as summary text units and scenes as script text units and developed an algorithm which uses some of the latest natural language processing techniques to align scenes and sentences based on the similarity in their meanings. Solving this alignment problem can provide us with important information about how to evaluate the meaning of a text, which can help us create better abstractive summarization models. We developed a method which uses different similarity scoring techniques (embedding similarity, word inclusion and entity inclusion) to align script scenes and summary sentences, which achieved an F1 score of 0.39. Analyzing our results showed that the bigger the difference in the number of text units being aligned, the more difficult the alignment problem is. We also critiqued our own similarity scoring techniques and different alignment algorithms based on integer linear programming and local optimization, showed their limitations and discussed ideas to improve them.},
}
Endnote
%0 Thesis
%A Abouhamra, Mostafa
%Y Weikum, Gerhard
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T AligNarr: Aligning Narratives of Different Length for Movie Summarization :
%G eng
%U http://hdl.handle.net/21.11116/0000-0004-5836-D
%I Universität des Saarlandes
%C Saarbrücken
%D 2019
%P 54 p.
%V master
%9 master
%X Automatic text alignment is an important problem in natural language processing. It can be used to create the data needed to train different language models. Most research about automatic summarization revolves around summarizing news articles or scientific papers, which are somewhat small texts with a simple and clear structure. The bigger the difference in size between the summary and the original text, the harder the problem will be, since important information will be sparser and identifying it can be more difficult. Therefore, creating datasets from larger texts can help improve automatic summarization. In this project, we try to develop an algorithm which can automatically create a dataset for abstractive automatic summarization for bigger narrative text bodies such as movie scripts. To this end, we chose sentences as summary text units and scenes as script text units and developed an algorithm which uses some of the latest natural language processing techniques to align scenes and sentences based on the similarity in their meanings. Solving this alignment problem can provide us with important information about how to evaluate the meaning of a text, which can help us create better abstractive summarization models. We developed a method which uses different similarity scoring techniques (embedding similarity, word inclusion and entity inclusion) to align script scenes and summary sentences, which achieved an F1 score of 0.39. Analyzing our results showed that the bigger the difference in the number of text units being aligned, the more difficult the alignment problem is. We also critiqued our own similarity scoring techniques and different alignment algorithms based on integer linear programming and local optimization, showed their limitations and discussed ideas to improve them.
[5]
M. Anis, “Proactive Learning Algorithms: A Survey of the State of the Art and Implementation of Novel and Concrete Algorithm for (Unstructured) Data Classification,” Universität des Saarlandes, Saarbrücken, 2019.
Abstract
Artificial Intelligence (AI) has become one of the most researched fields nowadays. Machine Learning (ML) is one of the most popular AI domains, where systems are created with the capability of learning automatically and improving from experience. The current revolution in the size and cost of electronic storage allows for the existence of enormous amounts of data that can be used for ML training. Unfortunately, not all of this data is labelled. The process of manually labelling documents can be expensive, time-consuming and subject to human errors. Active Learning (AL) addresses this challenge by finding a sample of the enormous data corpus that, if labelled, can substitute the use of the whole dataset. AL routes this sample to a human labeller to formulate the training dataset needed for the ML model. AL assumes that there exists a single, infallible and indefatigable labeller. These assumptions do not hold in real-world problems. The main focus of this work is to introduce Proactive Learning (PL) to an existing AL system. PL aims at generalizing the problem solved by AL by relaxing all of its assumptions about the user. The main addition of this project is enhancing automatic text classification by combining knowledge from the domain of PL and from Instance Relabelling paradigms to update the currently implemented AL system. The implemented PL system is tested on the 20 Newsgroups, Reuters and AG News datasets. The system is capable of reaching impressive results in detecting and predicting users' actions, which allows the system to efficiently route labelling tasks to the best users, minimizing the risk of receiving wrong labels.
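A toy sketch of the routing idea described in the abstract: pick the (instance, labeller) pair with the highest expected benefit, trading off how informative the instance is against each labeller's estimated reliability and cost. The scoring rule below is an illustrative assumption, not the algorithm implemented in the thesis.

def route(instances, labellers):
    """instances: list of dicts with an 'informativeness' score (e.g. model uncertainty).
    labellers: list of dicts with estimated 'accuracy' (P(correct label)) and 'cost'."""
    best, best_utility = None, float("-inf")
    for i, inst in enumerate(instances):
        for j, lab in enumerate(labellers):
            # expected utility: informativeness discounted by label noise, minus labelling cost
            utility = inst["informativeness"] * lab["accuracy"] - lab["cost"]
            if utility > best_utility:
                best, best_utility = (i, j), utility
    return best   # (instance index, labeller index) to query next

# hypothetical usage
pair = route([{"informativeness": 0.9}, {"informativeness": 0.4}],
             [{"accuracy": 0.95, "cost": 0.3}, {"accuracy": 0.7, "cost": 0.1}])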
Export
BibTeX
@mastersthesis{Anis_MSc2019,
TITLE = {Proactive Learning Algorithms: A Survey of the State of the Art and Implementation of Novel and Concrete Algorithm for (Unstructured) Data Classification},
AUTHOR = {Anis, Myriam},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2019},
DATE = {2019-08-06},
ABSTRACT = {Artificial Intelligence (AI) has become one of the most researched fields nowadays. Machine Learning (ML) is one of the most popular AI domains, where systems are created with the capability of learning automatically and improving from experience. The current revolution in the size and cost of electronic storage allows for the existence of enormous amounts of data that can be used for ML training. Unfortunately, not all of this data is labelled. The process of manually labelling documents can be expensive, time-consuming and subject to human errors. Active Learning (AL) addresses this challenge by finding a sample of the enormous data corpus that, if labelled, can substitute the use of the whole dataset. AL routes this sample to a human labeller to formulate the training dataset needed for the ML model. AL assumes that there exists a single, infallible and indefatigable labeller. These assumptions do not hold in real-world problems. The main focus of this work is to introduce Proactive Learning (PL) to an existing AL system. PL aims at generalizing the problem solved by AL by relaxing all of its assumptions about the user. The main addition of this project is enhancing automatic text classification by combining knowledge from the domain of PL and from Instance Relabelling paradigms to update the currently implemented AL system. The implemented PL system is tested on the 20 Newsgroups, Reuters and AG News datasets. The system is capable of reaching impressive results in detecting and predicting users' actions, which allows the system to efficiently route labelling tasks to the best users, minimizing the risk of receiving wrong labels.},
}
Endnote
%0 Thesis
%A Anis, Myriam
%Y Klakow, Dietrich
%A referee: Petrenko, Pavlo
%A referee: Hampp, Thomas
%A referee: Klakow, Dietrich
%A referee: Mirza, Paramita
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
External Organizations
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Proactive Learning Algorithms: A Survey of the State of the Art and Implementation of Novel and Concrete Algorithm for (Unstructured) Data Classification :
%G eng
%U http://hdl.handle.net/21.11116/0000-0005-9C5B-6
%I Universität des Saarlandes
%C Saarbrücken
%D 2019
%8 06.08.2019
%P 86 p.
%V master
%9 master
%X Artificial Intelligence (AI) has become one of the most researched fields nowadays. Machine Learning (ML) is one of the most popular AI domains, where systems are created with the capability of learning automatically and improving from experience. The current revolution in the size and cost of electronic storage allows for the existence of enormous amounts of data that can be used for ML training. Unfortunately, not all of this data is labelled. The process of manually labelling documents can be expensive, time-consuming and subject to human errors. Active Learning (AL) addresses this challenge by finding a sample of the enormous data corpus that, if labelled, can substitute the use of the whole dataset. AL routes this sample to a human labeller to formulate the training dataset needed for the ML model. AL assumes that there exists a single, infallible and indefatigable labeller. These assumptions do not hold in real-world problems. The main focus of this work is to introduce Proactive Learning (PL) to an existing AL system. PL aims at generalizing the problem solved by AL by relaxing all of its assumptions about the user. The main addition of this project is enhancing automatic text classification by combining knowledge from the domain of PL and from Instance Relabelling paradigms to update the currently implemented AL system. The implemented PL system is tested on the 20 Newsgroups, Reuters and AG News datasets. The system is capable of reaching impressive results in detecting and predicting users' actions, which allows the system to efficiently route labelling tasks to the best users, minimizing the risk of receiving wrong labels.
[6]
N. Cheema, “In Silico User Testing for Mid-Air Interactions with Deep Reinforcement Learning,” Universität des Saarlandes, Saarbrücken, 2019.
Abstract
User interface design for Virtual Reality and other embodied interaction contexts has to carefully consider ergonomics. A common problem is that mid-air interaction may cause excessive arm fatigue, known as the “Gorilla arm” effect. To predict and prevent such problems at a low cost, this thesis investigates user testing of mid-air interaction without real users, utilizing biomechanically simulated AI agents trained using deep Reinforcement Learning (RL). This is implemented in a pointing task and four experimental conditions, demonstrating that the simulated fatigue data matches ground truth human data. Additionally, two effort models are compared against each other: 1) instantaneous joint torques commonly used in computer animation and robotics, and 2) the recent Three Compartment Controller (3CC-r) model from biomechanical literature. 3CC-r yields movements that are both more efficient and natural, whereas with instantaneous joint torques, the RL agent can easily generate movements that are unnatural or only reach the targets slowly and inaccurately. This thesis demonstrates that deep RL combined with the 3CC-r provides a viable tool for predicting both interaction movements and user experience in silico, without users.
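For intuition, here is a deliberately simplified three-compartment fatigue simulation in the spirit of the 3CC model family: muscle units move between resting (MR), active (MA) and fatigued (MF) compartments while tracking a target load. The controller and rate constants below are editorial assumptions and will differ from the 3CC-r variant used in the thesis.

def simulate_fatigue(target_load, steps=1000, dt=0.01, F=0.01, R=0.002, K=10.0):
    """target_load: desired fraction of muscle units active, in [0, 1]."""
    MA, MF, MR = 0.0, 0.0, 1.0          # active, fatigued, resting fractions
    history = []
    for _ in range(steps):
        # simplified drive: recruit from the resting pool towards the target load
        C = min(max(K * (target_load - MA), -K * MA), K * MR)
        dMA = C - F * MA                 # activation inflow minus fatigue outflow
        dMF = F * MA - R * MF            # fatigue accumulation minus recovery
        dMR = R * MF - C                 # recovery inflow minus recruitment outflow
        MA, MF, MR = MA + dt * dMA, MF + dt * dMF, MR + dt * dMR
        history.append((MA, MF, MR))
    return history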
Export
BibTeX
@mastersthesis{Cheema_MSc2019b,
TITLE = {In Silico User Testing for Mid-Air Interactions with Deep Reinforcement Learning},
AUTHOR = {Cheema, Noshaba},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2019},
DATE = {2019-09-30},
ABSTRACT = {User interface design for Virtual Reality and other embodied interaction contexts has to carefully consider ergonomics. A common problem is that mid-air interaction may cause excessive arm fatigue, known as the {\textquotedblleft}Gorilla arm{\textquotedblright} effect. To predict and prevent such problems at a low cost, this thesis investigates user testing of mid-air interaction without real users, utilizing biomechanically simulated AI agents trained using deep Reinforcement Learning (RL). This is implemented in a pointing task and four experimental conditions, demonstrating that the simulated fatigue data matches ground truth human data. Additionally, two effort models are compared against each other: 1) instantaneous joint torques commonly used in computer animation and robotics, and 2) the recent Three Compartment Controller (3CC-r) model from biomechanical literature. 3CC-r yields movements that are both more efficient and natural, whereas with instantaneous joint torques, the RL agent can easily generate movements that are unnatural or only reach the targets slowly and inaccurately. This thesis demonstrates that deep RL combined with the 3CC-r provides a viable tool for predicting both interaction movements and user experience in silico, without users.},
}
Endnote
%0 Thesis
%A Cheema, Noshaba
%Y Slusallek, Philipp
%A referee: Hämäläinen, Perttu
%A referee: Lehtinen, Jaakko
%A referee: Slusallek, Philipp
%A referee: Hämäläinen, Perttu
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
External Organizations
External Organizations
%T In Silico User Testing for Mid-Air Interactions with Deep Reinforcement Learning :
%G eng
%U http://hdl.handle.net/21.11116/0000-0005-9C66-9
%I Universität des Saarlandes
%C Saarbrücken
%D 2019
%8 30.09.2019
%P 65 p.
%V master
%9 master
%X User interface design for Virtual Reality and other embodied interaction contexts has to carefully consider ergonomics. A common problem is that mid-air interaction may cause excessive arm fatigue, known as the “Gorilla arm” effect. To predict and prevent such problems at a low cost, this thesis investigates user testing of mid-air interaction without real users, utilizing biomechanically simulated AI agents trained using deep Reinforcement Learning (RL). This is implemented in a pointing task and four experimental conditions, demonstrating that the simulated fatigue data matches ground truth human data. Additionally, two effort models are compared against each other: 1) instantaneous joint torques commonly used in computer animation and robotics, and 2) the recent Three Compartment Controller (3CC-r) model from biomechanical literature. 3CC-r yields movements that are both more efficient and natural, whereas with instantaneous joint torques, the RL agent can easily generate movements that are unnatural or only reach the targets slowly and inaccurately. This thesis demonstrates that deep RL combined with the 3CC-r provides a viable tool for predicting both interaction movements and user experience in silico, without users.
[7]
N. Cheema, “Fine-Grained Semantic Segmentation of Motion Capture Data using Convolutional Neural Networks,” Universität des Saarlandes, Saarbrücken, 2019.
Abstract
Human motion capture data has been widely used in data-driven character animation. In order to generate realistic, natural-looking motions, most data-driven approaches require considerable efforts of pre-processing, including motion segmentation, annotation, and so on. Existing (semi-) automatic solutions either require hand-crafted features for motion segmentation or do not produce the semantic annotations required for motion synthesis and building large-scale motion databases. In this thesis, an approach for a semi-automatic framework for semantic segmentation of motion capture data based on (semi-) supervised machine learning techniques is developed. The motion capture data is first transformed into a “motion image” to apply common convolutional neural networks for image segmentation. Convolutions over the time domain enable the extraction of temporal information and dilated convolutions are used to enlarge the receptive field exponentially using comparably few layers and parameters. The finally developed dilated temporal fully-convolutional model is compared against state-of-the-art models in action segmentation, as well as a popular network for sequence modeling. The models are further tested on noisy and inaccurate training labels and the developed model is found to be surprisingly robust and self-correcting.
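A minimal sketch of a dilated temporal fully-convolutional classifier of the kind described above, assuming PyTorch (which may or may not match the thesis setup): it takes a "motion image" of shape (batch, features, frames) and predicts a label per frame. Layer sizes and dilation factors are illustrative.

import torch
import torch.nn as nn

class DilatedTemporalFCN(nn.Module):
    def __init__(self, in_features, num_classes, hidden=64, dilations=(1, 2, 4, 8)):
        super().__init__()
        layers, channels = [], in_features
        for d in dilations:
            # padding = dilation keeps the temporal length unchanged for kernel size 3
            layers += [nn.Conv1d(channels, hidden, kernel_size=3, padding=d, dilation=d),
                       nn.ReLU()]
            channels = hidden
        self.body = nn.Sequential(*layers)
        self.head = nn.Conv1d(hidden, num_classes, kernel_size=1)  # per-frame logits

    def forward(self, x):                 # x: (batch, in_features, frames)
        return self.head(self.body(x))    # (batch, num_classes, frames)

# hypothetical usage: 30 joint features, 120 frames, 5 motion classes
model = DilatedTemporalFCN(in_features=30, num_classes=5)
logits = model(torch.randn(2, 30, 120))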
Export
BibTeX
@mastersthesis{Cheema_MSc2019,
TITLE = {Fine-Grained Semantic Segmentation of Motion Capture Data using Convolutional Neural Networks},
AUTHOR = {Cheema, Noshaba},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2019},
DATE = {2019-03-29},
ABSTRACT = {Human motion capture data has been widely used in data-driven character animation. In order to generate realistic, natural-looking motions, most data-driven approaches require considerable efforts of pre-processing, including motion segmentation, annotation, and so on. Existing (semi-) automatic solutions either require hand-crafted features for motion segmentation or do not produce the semantic annotations required for motion synthesis and building large-scale motion databases. In this thesis, an approach for a semi-automatic framework for semantic segmentation of motion capture data based on (semi-) supervised machine learning techniques is developed. The motion capture data is first transformed into a {\textquotedblleft}motion image{\textquotedblright} to apply common convolutional neural networks for image segmentation. Convolutions over the time domain enable the extraction of temporal information and dilated convolutions are used to enlarge the receptive field exponentially using comparably few layers and parameters. The finally developed dilated temporal fully-convolutional model is compared against state-of-the-art models in action segmentation, as well as a popular network for sequence modeling. The models are further tested on noisy and inaccurate training labels and the developed model is found to be surprisingly robust and self-correcting.},
}
Endnote
%0 Thesis
%A Cheema, Noshaba
%Y Slusallek, Philipp
%A referee: Hosseini, Somayeh
%A referee: Slusallek, Philipp
%A referee: Theobalt, Christian
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
Computer Graphics, MPI for Informatics, Max Planck Society
%T Fine-Grained Semantic Segmentation of Motion Capture Data using Convolutional Neural Networks :
%G eng
%U http://hdl.handle.net/21.11116/0000-0005-9C5E-3
%I Universität des Saarlandes
%C Saarbrücken
%D 2019
%8 29.03.2019
%P 127 p.
%V master
%9 master
%X Human motion capture data has been widely used in data-driven character animation. In order to generate realistic, natural-looking motions, most data-driven approaches require considerable efforts of pre-processing, including motion segmentation, annotation, and so on. Existing (semi-) automatic solutions either require hand-crafted features for motion segmentation or do not produce the semantic annotations required for motion synthesis and building large-scale motion databases. In this thesis, an approach for a semi-automatic framework for semantic segmentation of motion capture data based on (semi-) supervised machine learning techniques is developed. The motion capture data is first transformed into a “motion image” to apply common convolutional neural networks for image segmentation. Convolutions over the time domain enable the extraction of temporal information and dilated convolutions are used to enlarge the receptive field exponentially using comparably few layers and parameters. The finally developed dilated temporal fully-convolutional model is compared against state-of-the-art models in action segmentation, as well as a popular network for sequence modeling. The models are further tested on noisy and inaccurate training labels and the developed model is found to be surprisingly robust and self-correcting.
[8]
S. Ganguly, “Empirical Evaluation of Common Assumptions in Building Political Bias Datasets,” Universität des Saarlandes, Saarbrücken, 2019.
Abstract
In today’s world, bias and polarization are some of the biggest problems plaguing our society. In such volatile environments, news media play a crucial role as the gatekeepers of information. Given the huge impact they can have on societal evolution, they have long been studied by researchers. Researchers and practitioners often build political bias datasets for a variety of tasks, ranging from examining the bias of news outlets and articles to studying and designing algorithmic news retrieval systems for online platforms. Often, researchers make certain simplifying assumptions in building such datasets.

In this thesis, we empirically validate three such common assumptions, given the importance of such datasets. The three assumptions are: (i) raters’ political leaning does not affect their ratings of political articles, (ii) news articles follow the leaning of their source outlet, and (iii) the political leaning of a news outlet is stable across reporting on different topics. We constructed a manually annotated ground-truth dataset of news articles on “Gun policy” and “Immigration”, published by several popular news media outlets in the U.S., along with their political bias leanings, using Amazon Mechanical Turk, and used it to validate these assumptions.

Our findings suggest that (i) in certain cases, liberal and conservative raters label the leanings of news articles differently, (ii) in many cases, news articles do not follow the political leaning of their source outlet, and (iii) for certain outlets, the political leaning does not remain unchanged while reporting on different topics/issues. We believe our work offers important guidelines for future attempts at building political bias datasets, which in turn will help in building better algorithmic news retrieval systems for online platforms.
Export
BibTeX
@mastersthesis{GangulyMSc2019,
TITLE = {Empirical Evaluation of Common Assumptions in Building Political Bias Datasets},
AUTHOR = {Ganguly, Soumen},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2019},
DATE = {2019-09-01},
ABSTRACT = {In today{\textquoteright}s world, bias and polarization are some of the biggest problems plaguing our society. In such volatile environments, news media play a crucial role as the gatekeepers of information. Given the huge impact they can have on societal evolution, they have long been studied by researchers. Researchers and practitioners often build political bias datasets for a variety of tasks, ranging from examining the bias of news outlets and articles to studying and designing algorithmic news retrieval systems for online platforms. Often, researchers make certain simplifying assumptions in building such datasets. In this thesis, we empirically validate three such common assumptions, given the importance of such datasets. The three assumptions are: (i) raters{\textquoteright} political leaning does not affect their ratings of political articles, (ii) news articles follow the leaning of their source outlet, and (iii) the political leaning of a news outlet is stable across reporting on different topics. We constructed a manually annotated ground-truth dataset of news articles on {\textquotedblleft}Gun policy{\textquotedblright} and {\textquotedblleft}Immigration{\textquotedblright}, published by several popular news media outlets in the U.S., along with their political bias leanings, using Amazon Mechanical Turk, and used it to validate these assumptions. Our findings suggest that (i) in certain cases, liberal and conservative raters label the leanings of news articles differently, (ii) in many cases, news articles do not follow the political leaning of their source outlet, and (iii) for certain outlets, the political leaning does not remain unchanged while reporting on different topics/issues. We believe our work offers important guidelines for future attempts at building political bias datasets, which in turn will help in building better algorithmic news retrieval systems for online platforms.},
}
Endnote
%0 Thesis
%A Ganguly, Soumen
%Y An, Jisun
%A referee: Gummadi, Krishna
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Group K. Gummadi, Max Planck Institute for Software Systems, Max Planck Society
Group K. Gummadi, Max Planck Institute for Software Systems, Max Planck Society
%T Empirical Evaluation of Common Assumptions in Building Political Bias Datasets :
%G eng
%U http://hdl.handle.net/21.11116/0000-0002-B37F-6
%I Universität des Saarlandes
%C Saarbrücken
%D 2019
%8 01.09.2019
%P 53 p.
%V master
%9 master
%X In today’s world, bias and polarization are some of the biggest problems plaguing our society. In such volatile environments, news media play a crucial role as the gatekeepers of information. Given the huge impact they can have on societal evolution, they have long been studied by researchers. Researchers and practitioners often build political bias datasets for a variety of tasks, ranging from examining the bias of news outlets and articles to studying and designing algorithmic news retrieval systems for online platforms. Often, researchers make certain simplifying assumptions in building such datasets. In this thesis, we empirically validate three such common assumptions, given the importance of such datasets. The three assumptions are: (i) raters’ political leaning does not affect their ratings of political articles, (ii) news articles follow the leaning of their source outlet, and (iii) the political leaning of a news outlet is stable across reporting on different topics. We constructed a manually annotated ground-truth dataset of news articles on “Gun policy” and “Immigration”, published by several popular news media outlets in the U.S., along with their political bias leanings, using Amazon Mechanical Turk, and used it to validate these assumptions. Our findings suggest that (i) in certain cases, liberal and conservative raters label the leanings of news articles differently, (ii) in many cases, news articles do not follow the political leaning of their source outlet, and (iii) for certain outlets, the political leaning does not remain unchanged while reporting on different topics/issues. We believe our work offers important guidelines for future attempts at building political bias datasets, which in turn will help in building better algorithmic news retrieval systems for online platforms.
[9]
V. Lazova, “Texture Completion of People in Diverse Clothing,” Universität des Saarlandes, Saarbrücken, 2019.
Export
BibTeX
@mastersthesis{Lazova_Master2019,
TITLE = {Texture Completion of People in Diverse Clothing},
AUTHOR = {Lazova, Verica},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2019},
DATE = {2019},
}
Endnote
%0 Thesis
%A Lazova, Verica
%Y Insafutdinov, Eldar
%A referee: Pons-Moll, Gerard
%A referee: Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Texture Completion of People in Diverse Clothing :
%G eng
%U http://hdl.handle.net/21.11116/0000-0003-91DB-2
%I Universität des Saarlandes
%C Saarbrücken
%D 2019
%P 46 p.
%V master
%9 master
[10]
M. M. Mounir Sourial, “Automatic Neural Network Architecture Optimization,” Universität des Saarlandes, Saarbrücken, 2019.
Abstract
Deep learning has recently become a very hot topic in Computer Science. It has invaded many applications in Computer Science, achieving exceptional performance compared to other existing methods. However, neural networks have a strong memory limitation, which is considered to be one of their main challenges. This is why considerable research focus has recently been directed towards model compression.

This thesis studies a divide-and-conquer approach that transforms an existing trained neural network into another network with fewer parameters, with the target of decreasing its memory footprint. It takes the resulting loss in performance into account. It is based on existing layer transformation techniques like Canonical Polyadic (CP) and SVD affine transformations. Given an artificial neural network trained on a certain dataset, an agent optimizes the architecture of the neural network in a bottom-up manner. It cuts the network into sub-networks of length 1 and optimizes each sub-network using layer transformations. Then it chooses the most promising sub-networks to construct sub-networks of length 2. This process is repeated until it constructs an artificial neural network that covers the functionalities of the original neural network.

This thesis offers an extensive analysis of the proposed approach. We tested this technique with different known neural network architectures and popular datasets. We outperform recent techniques in both compression rate and network performance on LeNet5 with MNIST. We compress ResNet-20 to 25% of its original size, achieving performance comparable with networks in the literature of double this size.
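As a concrete example of the kind of layer transformation mentioned above, a truncated SVD can replace one dense weight matrix with two smaller factors. This is a generic sketch of the idea, not the agent or the CP decomposition used in the thesis.

import numpy as np

def svd_compress_dense(W, b, rank):
    """Factorize a dense layer y = W x + b (W of shape (out, in)) into two layers
    y = W2 (W1 x) + b, with W1 of shape (rank, in) and W2 of shape (out, rank)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    W1 = np.diag(S[:rank]) @ Vt[:rank]        # (rank, in)
    W2 = U[:, :rank]                          # (out, rank)
    return W1, W2, b

# parameter count drops from out*in to rank*(out + in) when the rank is small enough
W = np.random.randn(512, 1024); b = np.zeros(512)
W1, W2, _ = svd_compress_dense(W, b, rank=64)
approx = W2 @ W1                              # low-rank approximation of W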
Export
BibTeX
@mastersthesis{MounirMSc2019,
TITLE = {Automatic Neural Network Architecture Optimization},
AUTHOR = {Mounir Sourial, Maggie Moheb},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2019},
DATE = {2019-05-15},
ABSTRACT = {Deep learning has recently become a very hot topic in Computer Science. It has invaded many applications in Computer Science, achieving exceptional performance compared to other existing methods. However, neural networks have a strong memory limitation, which is considered to be one of their main challenges. This is why considerable research focus has recently been directed towards model compression. This thesis studies a divide-and-conquer approach that transforms an existing trained neural network into another network with fewer parameters, with the target of decreasing its memory footprint. It takes the resulting loss in performance into account. It is based on existing layer transformation techniques like Canonical Polyadic (CP) and SVD affine transformations. Given an artificial neural network trained on a certain dataset, an agent optimizes the architecture of the neural network in a bottom-up manner. It cuts the network into sub-networks of length 1 and optimizes each sub-network using layer transformations. Then it chooses the most promising sub-networks to construct sub-networks of length 2. This process is repeated until it constructs an artificial neural network that covers the functionalities of the original neural network. This thesis offers an extensive analysis of the proposed approach. We tested this technique with different known neural network architectures and popular datasets. We outperform recent techniques in both compression rate and network performance on LeNet5 with MNIST. We compress ResNet-20 to 25% of its original size, achieving performance comparable with networks in the literature of double this size.},
}
Endnote
%0 Thesis
%A Mounir Sourial, Maggie Moheb
%Y Weikum, Gerhard
%A referee: Cardinaux, Fabian
%A referee: Weikum, Gerhard
%A referee: Yates, Andrew
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Automatic Neural Network Architecture Optimization :
%G eng
%U http://hdl.handle.net/21.11116/0000-0005-9C58-9
%I Universität des Saarlandes
%C Saarbrücken
%D 2019
%8 15.05.2019
%P 76 p.
%V master
%9 master
%X Deep learning has recently become a very hot topic in Computer Science. It has invaded many applications in Computer Science, achieving exceptional performance compared to other existing methods. However, neural networks have a strong memory limitation, which is considered to be one of their main challenges. This is why considerable research focus has recently been directed towards model compression. This thesis studies a divide-and-conquer approach that transforms an existing trained neural network into another network with fewer parameters, with the target of decreasing its memory footprint. It takes the resulting loss in performance into account. It is based on existing layer transformation techniques like Canonical Polyadic (CP) and SVD affine transformations. Given an artificial neural network trained on a certain dataset, an agent optimizes the architecture of the neural network in a bottom-up manner. It cuts the network into sub-networks of length 1 and optimizes each sub-network using layer transformations. Then it chooses the most promising sub-networks to construct sub-networks of length 2. This process is repeated until it constructs an artificial neural network that covers the functionalities of the original neural network. This thesis offers an extensive analysis of the proposed approach. We tested this technique with different known neural network architectures and popular datasets. We outperform recent techniques in both compression rate and network performance on LeNet5 with MNIST. We compress ResNet-20 to 25% of its original size, achieving performance comparable with networks in the literature of double this size.
2018
[11]
P. Grover, “Automatic Segmentation of Clinical CT Data using Deep Learning,” Universität des Saarlandes, Saarbrücken, 2018.
Export
BibTeX
@mastersthesis{GroverMaster2018,
TITLE = {Automatic Segmentation of Clinical {CT} Data using Deep Learning},
AUTHOR = {Grover, Priyanka},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2018},
DATE = {2018},
CONTENTS = {Automating the process of semantic segmentation of anatomical structures in medical data is an active field of research in Deep Learning. Recent advances in Convolutional Neural Networks for various image-related tasks have inspired researchers to adopt CNN-based methods for semantic segmentation of bio-medical images. In this thesis we explore different CNN architectures and frameworks with the aim of building an optimal model for automatic segmentation of CT images with Bone-Implant systems. For this purpose, we evaluate three CNN architectures -- SegNet, UNet and 3D-UNet -- and address the challenges encountered during network implementation on the available data. These challenges include scarcity of labelled data, class imbalance and hyperparameter selection, and we try to overcome them by using techniques such as data augmentation, weighted loss functions and cyclical learning rates. We show that data augmentation, when combined with the cyclical learning rate method using UNet, not only trains the model in less time but also achieves better accuracy for minority classes. We also investigate two deep learning frameworks -- TensorFlow and Caffe2 -- using the Python language for training and inference of CNNs. Caffe2 is a relatively new framework, created to overcome the drawbacks of Caffe. Of the two frameworks, TensorFlow is the preferred choice as it provides user convenience in all aspects, including installation, development and deployment.},
}
Endnote
%0 Thesis
%A Grover, Priyanka
%Y Slusallek, Philipp
%A referee: Dahmen, Tim
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Automatic Segmentation of Clinical CT Data using Deep Learning :
%G eng
%U http://hdl.handle.net/21.11116/0000-0002-8447-9
%I Universität des Saarlandes
%C Saarbrücken
%D 2018
%P 86 p.
%V master
%9 master
%Z Automating the process of semantic segmentation of anatomical structures in medical data is an active field of research in Deep Learning. Recent advances in Convolutional Neural Networks for various image-related tasks have inspired researchers to adopt CNN-based methods for semantic segmentation of bio-medical images. In this thesis we explore different CNN architectures and frameworks with the aim of building an optimal model for automatic segmentation of CT images with Bone-Implant systems. For this purpose, we evaluate three CNN architectures - SegNet, UNet and 3D-UNet - and address the challenges encountered during network implementation on the available data. These challenges include scarcity of labelled data, class imbalance and hyperparameter selection, and we try to overcome them by using techniques such as data augmentation, weighted loss functions and cyclical learning rates. We show that data augmentation, when combined with the cyclical learning rate method using UNet, not only trains the model in less time but also achieves better accuracy for minority classes. We also investigate two deep learning frameworks - TensorFlow and Caffe2 - using the Python language for training and inference of CNNs. Caffe2 is a relatively new framework, created to overcome the drawbacks of Caffe. Of the two frameworks, TensorFlow is the preferred choice as it provides user convenience in all aspects, including installation, development and deployment.
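The cyclical learning rate mentioned in the thesis summary above can be illustrated with the standard triangular schedule (Smith's formulation); the bounds and step size below are placeholder values, not those used in the thesis.

import math

def triangular_clr(iteration, base_lr=1e-4, max_lr=1e-2, step_size=2000):
    """Triangular cyclical learning rate: ramps linearly between base_lr and max_lr,
    completing one full cycle every 2 * step_size iterations."""
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)

# example: learning rate at a few points in the first cycle
print([round(triangular_clr(i), 5) for i in (0, 1000, 2000, 3000, 4000)])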
[12]
T. Heinen, “Inferring Transcriptional Regulators Using Clustered Multi-Task Regression,” Universität des Saarlandes, Saarbrücken, 2018.
Abstract
Sparse linear regression is often used to identify key transcriptional regulators by predicting gene expression abundance from regulatory features such as transcription factor (TF) binding or epigenomics data. However, a single linear model explaining the gene expression of thousands of genes is limited in capturing the complexity of cis-regulatory modules and gene co-expression patterns. Indeed, certain TFs are known to act as either activators or repressors depending on associated cofactors and neighbouring DNA-bound proteins. It is therefore desirable to identify clusters or modules of co-regulated genes and model their regulatory profiles separately. Finite mixtures of regression models are a popular tool for modeling heterogeneous data while maintaining a linearity assumption. Unfortunately, they do not take advantage of available data sets containing the molecular profiles of many biological samples. We propose to combine the power of mixture modeling and multi-task learning by using a penalized maximum likelihood framework for inferring gene modules and regulators in multiple samples simultaneously. More specifically, we regularize the likelihood function with a tree-structured L1/L2 penalty to enable knowledge transfer between models of related cells. We optimize the parameters of our models with a generalized EM algorithm. Experimental evaluation of our method on synthetic data suggests that multi-task mixture modelling is more suitable for identifying the true underlying cluster structure compared to a single-task regression mixture model. Finally, we apply the model to a dataset from the BLUEPRINT project consisting of various types of haematopoietic cells and uncover interesting regulatory patterns.
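One way to write the kind of objective sketched above in formulas (the notation is editorial, not necessarily that of the thesis): a K-component mixture of linear regressions whose log-likelihood is regularized by a tree-structured group penalty that couples the coefficient vectors of related cell types,

\ell_{\mathrm{pen}}(\Theta) \;=\; \sum_{g=1}^{G} \log \sum_{k=1}^{K} \pi_k \,
  \mathcal{N}\!\left( \mathbf{y}_g \,\middle|\, \mathbf{X}_g \boldsymbol{\beta}_k,\; \sigma_k^2 \mathbf{I} \right)
  \;-\; \lambda \sum_{v \in V(\mathcal{T})} w_v \left\lVert \boldsymbol{\beta}_{(v)} \right\rVert_2

where y_g collects the expression of gene g across the related samples, X_g beta_k is its component-k prediction from the regulatory features, pi_k are the mixture weights, T is the tree relating the cell types, and beta_(v) is the sub-vector of coefficients belonging to the cell types in the subtree rooted at node v. Maximizing such an objective with a generalized EM algorithm corresponds to the procedure described in the abstract.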
Export
BibTeX
@mastersthesis{HeinenMaster2018,
TITLE = {Inferring Transcriptional Regulators Using Clustered Multi-Task Regression},
AUTHOR = {Heinen, Tobias},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2018},
DATE = {2018},
ABSTRACT = {Sparse linear regression is often used to identify key transcriptional regulators by predicting gene expression abundance from regulatory features such as transcription factor (TF) binding or epigenomics data. However, a single linear model explaining the gene expression of thousands of genes is limited in capturing the complexity of cis-regulatory modules and gene co-expression patterns. Indeed, certain TFs are known to act as either activators or repressors depending on associated cofactors and neighbouring DNA-bound proteins. It is therefore desirable to identify clusters or modules of co-regulated genes and model their regulatory profiles separately. Finite mixtures of regression models are a popular tool for modeling heterogeneous data while maintaining a linearity assumption. Unfortunately, they do not take advantage of available data sets containing the molecular profiles of many biological samples. We propose to combine the power of mixture modeling and multi-task learning by using a penalized maximum likelihood framework for inferring gene modules and regulators in multiple samples simultaneously. More specifically, we regularize the likelihood function with a tree-structured L1/L2 penalty to enable knowledge transfer between models of related cells. We optimize the parameters of our models with a generalized EM algorithm. Experimental evaluation of our method on synthetic data suggests that multi-task mixture modelling is more suitable for identifying the true underlying cluster structure compared to a single-task regression mixture model. Finally, we apply the model to a dataset from the BLUEPRINT project consisting of various types of haematopoietic cells and uncover interesting regulatory patterns.},
}
Endnote
%0 Thesis
%A Heinen, Tobias
%Y Schulz, Marcel Holger
%A referee: Marschall, Tobias
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
%T Inferring Transcriptional Regulators Using Clustered Multi-Task Regression :
%G eng
%U http://hdl.handle.net/21.11116/0000-0002-B37A-B
%I Universität des Saarlandes
%C Saarbrücken
%D 2018
%P 93 p.
%V master
%9 master
%X Sparse linear regression is often used to identify key transcriptional regulators by predicting gene expression abundance from regulatory features such as transcription factor (TF) binding or epigenomics data. However, a single linear model explaining the gene expression of thousands of genes is limited in capturing the complexity of cis-regulatory modules and gene co-expression patterns. Indeed, certain TFs are known to act as either activators or repressors depending on associated cofactors and neighbouring DNA-bound proteins. It is therefore desirable to identify clusters or modules of co-regulated genes and model their regulatory profiles separately. Finite mixtures of regression models are a popular tool for modeling heterogeneous data while maintaining a linearity assumption. Unfortunately, they do not take advantage of available data sets containing the molecular profiles of many biological samples. We propose to combine the power of mixture modeling and multi-task learning by using a penalized maximum likelihood framework for inferring gene modules and regulators in multiple samples simultaneously. More specifically, we regularize the likelihood function with a tree-structured L1/L2 penalty to enable knowledge transfer between models of related cells. We optimize the parameters of our models with a generalized EM algorithm. Experimental evaluation of our method on synthetic data suggests that multi-task mixture modelling is more suitable for identifying the true underlying cluster structure compared to a single-task regression mixture model. Finally, we apply the model to a dataset from the BLUEPRINT project consisting of various types of haematopoietic cells and uncover interesting regulatory patterns.
[13]
V. T. Ho, “An Embedding-based Approach to Rule Learning from Knowledge Graphs,” Universität des Saarlandes, Saarbrücken, 2018.
Abstract
Knowledge Graphs (KGs) play an important role in various information systems and have applications in many fields such as Semantic Web Search, Question Answering and Information Retrieval. KGs present information in the form of entities and relationships between them. Modern KGs can contain millions of entities and billions of facts, and they are usually built using automatic construction methods. As a result, despite the huge size of KGs, a large number of facts between their entities are still missing. This motivates the task of Knowledge Graph Completion (a.k.a. Link Prediction), which concerns the prediction of those missing facts.

Rules over a Knowledge Graph capture interpretable patterns in data, and various methods for rule learning have been proposed. Since KGs are inherently incomplete, rules can be used to deduce missing facts. Statistical measures for learned rules such as confidence reflect rule quality well when the KG is reasonably complete; however, these measures might be misleading otherwise. It is thus difficult to learn high-quality rules from the KG alone, and scalability dictates that only a small set of candidate rules is generated. Therefore, the ranking and pruning of candidate rules are major problems. To address this issue, we propose a rule learning method that utilizes probabilistic representations of missing facts. In particular, we iteratively extend rules induced from a KG by relying on feedback from a precomputed embedding model over the KG and optionally on external information sources, including text corpora. The contributions of this thesis are as follows:

• We introduce a framework for rule learning guided by external sources.
• We propose a concrete instantiation of our framework to show how to learn high-quality rules by utilizing feedback from a pretrained embedding model.
• We conduct experiments on real-world KGs that demonstrate the effectiveness of our novel approach with respect to both the quality of the learned rules and the fact predictions that they produce.
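A rough sketch of the kind of hybrid rule scoring implied by the abstract (the interfaces and the weight are assumptions): a candidate rule is scored by combining its classical KG-based confidence with the average plausibility, under a pretrained embedding model, of the new facts it predicts.

def hybrid_rule_score(rule, kg, embedding_plausibility, mu=0.5):
    """rule: object whose .body_groundings(kg) yields substitutions and whose
    .head_fact(substitution) produces a candidate (subject, predicate, object) triple.
    embedding_plausibility: callable mapping a triple to a score in [0, 1].
    kg: a set-like collection of known triples."""
    support, body_count, plaus = 0, 0, []
    for sub in rule.body_groundings(kg):
        body_count += 1
        fact = rule.head_fact(sub)
        if fact in kg:
            support += 1                                  # known fact: counts as support
        else:
            plaus.append(embedding_plausibility(fact))    # unknown fact: ask the embedding
    confidence = support / body_count if body_count else 0.0
    embedding_term = sum(plaus) / len(plaus) if plaus else 0.0
    # weighted combination of symbolic confidence and embedding feedback
    return (1 - mu) * confidence + mu * embedding_term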
Export
BibTeX
@mastersthesis{HoMaster2018,
TITLE = {An Embedding-based Approach to Rule Learning from Knowledge Graphs},
AUTHOR = {Ho, Vinh Thinh},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2018},
DATE = {2018},
ABSTRACT = {Knowledge Graphs (KGs) play an important role in various information systems and<br>have application in many {fi}elds such as Semantic Web Search, Question Answering and Information Retrieval. KGs present information in the form of entities and relationships between them. Modern KGs could contain up to millions of entities and billions of facts, and they are usually built using automatic construction methods. As a result, despite the huge size of KGs, a large number of facts between their entities are still missing. That is the reason why we see the importance of the task of Knowledge Graph Completion (a.k.a. Link Prediction), which concerns the prediction of those missing facts.<br><br>Rules over a Knowledge Graph capture interpretable patterns in data and various<br>methods for rule learning have been proposed. Since KGs are inherently incomplete, rules can be used to deduce missing facts. Statistical measures for learned rules such as con{fi}dence re{fl}ect rule quality well when the KG is reasonably complete; however, these measures might be misleading otherwise. So, it is difficult to learn high-quality rules from the KG alone, and scalability dictates that only a small set of candidate rules is generated.<br>Therefore, the ranking and pruning of candidate rules are major problems. To address this issue, we propose a rule learning method that utilizes probabilistic representations of missing facts. In particular, we iteratively extend rules induced from a KG by relying on feedback from a precomputed embedding model over the KG and optionally external <br>information sources including text corpora. The contributions of this thesis are as follows:<br><br>\mbox{$\bullet$} We introduce a framework for rule learning guided by external sources.<br><br>\mbox{$\bullet$} We propose a concrete instantiation of our framework to show how to learn high-<br>quality rules by utilizing feedback from a pretrained embedding model.<br><br>\mbox{$\bullet$} We conducted experiments on real-world KGs that demonstrate the effectiveness<br>of our novel approach with respect to both the quality of the learned rules and fact<br>predictions that they produce.},
}
Endnote
%0 Thesis
%A Ho, Vinh Thinh
%A referee: Weikum, Gerhard
%Y Stepanova, Daria
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T An Embedding-based Approach to Rule Learning from Knowledge Graphs :
%G eng
%U http://hdl.handle.net/21.11116/0000-0001-DE06-F
%I Universität des Saarlandes
%C Saarbrücken
%D 2018
%P 60
%V master
%9 master
%X Knowledge Graphs (KGs) play an important role in various information systems and<br>have application in many fields such as Semantic Web Search, Question Answering and Information Retrieval. KGs present information in the form of entities and relationships between them. Modern KGs could contain up to millions of entities and billions of facts, and they are usually built using automatic construction methods. As a result, despite the huge size of KGs, a large number of facts between their entities are still missing. That is the reason why we see the importance of the task of Knowledge Graph Completion (a.k.a. Link Prediction), which concerns the prediction of those missing facts.<br><br>Rules over a Knowledge Graph capture interpretable patterns in data and various<br>methods for rule learning have been proposed. Since KGs are inherently incomplete, rules can be used to deduce missing facts. Statistical measures for learned rules such as confidence reflect rule quality well when the KG is reasonably complete; however, these measures might be misleading otherwise. So, it is difficult to learn high-quality rules from the KG alone, and scalability dictates that only a small set of candidate rules is generated.<br>Therefore, the ranking and pruning of candidate rules are major problems. To address this issue, we propose a rule learning method that utilizes probabilistic representations of missing facts. In particular, we iteratively extend rules induced from a KG by relying on feedback from a precomputed embedding model over the KG and optionally external <br>information sources including text corpora. The contributions of this thesis are as follows:<br><br>• We introduce a framework for rule learning guided by external sources.<br><br>• We propose a concrete instantiation of our framework to show how to learn high-<br>quality rules by utilizing feedback from a pretrained embedding model.<br><br>• We conducted experiments on real-world KGs that demonstrate the effectiveness<br>of our novel approach with respect to both the quality of the learned rules and fact<br>predictions that they produce.
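As a rough illustration of the kind of embedding feedback described above, the sketch below blends a rule's classical KG confidence with the average plausibility that a (hypothetical) TransE-style embedding assigns to the new facts the rule predicts; the blending weight and all function names are assumptions, not the thesis' actual scoring.

    import numpy as np

    def transe_plausibility(h, r, t, ent_emb, rel_emb):
        """TransE-style plausibility in (0, 1]; higher means more plausible (toy scorer)."""
        dist = np.linalg.norm(ent_emb[h] + rel_emb[r] - ent_emb[t])
        return 1.0 / (1.0 + dist)

    def hybrid_rule_quality(rule_conf, predicted_facts, ent_emb, rel_emb, mu=0.5):
        """Blend a rule's KG confidence with the mean embedding plausibility of the
        new (h, r, t) facts it predicts; mu is an assumed mixing weight."""
        if not predicted_facts:
            return rule_conf
        emb_score = np.mean([transe_plausibility(h, r, t, ent_emb, rel_emb)
                             for (h, r, t) in predicted_facts])
        return (1 - mu) * rule_conf + mu * emb_score

Candidate rules can then be ranked and pruned by this hybrid score instead of by KG confidence alone, which is the intuition behind using an external model to compensate for KG incompleteness.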
2017
[14]
P. Agarwal, “Time-Aware Named Entity Disambiguation,” Universität des Saarlandes, Saarbrücken, 2017.
Abstract
Named Entity Disambiguation (NED) is the Natural Language Processing task of linking mentions of named entities in a text to their corresponding entries in a Knowledge Base. It serves as a crucial component in applications such as Semantic Search, Knowledge Base Population, and Opinion Mining. Currently deployed tools for NED are based on sophisticated models that use coherence relations among entities and distributed vectors representing the entity mentions and their contexts in a document to disambiguate them collectively. A factor that has not yet been considered in this track is the semantics of temporal information about canonical entity forms and their mentions. Even though temporal expressions give a text an inherent structural characteristic (for instance, they can map a topic being discussed to a certain period of known history), such expressions are leveraged no differently than other dictionary words. In this thesis we propose the first time-aware NED model, which extends a state-of-the-art learning-to-rank approach based on joint word-entity embeddings. For this we introduce the concept of temporal signatures, used in our work to represent the importance of each entity in a Knowledge Base over a historical time-line. Signatures for the entities and temporal contexts for the entity mentions are represented in our proposed temporal vector space to model the similarities between them. We evaluated our method on CoNLL-AIDA and TAC 2010, two widely used datasets in the NED track. However, because these datasets are composed of news articles from a short time period, they do not provide an extensive evaluation of our proposed temporal similarity modeling. Therefore, we curated a diachronic dataset, diaNED, whose text collection is characterized by temporally diverse entity mentions.
Export
BibTeX
@mastersthesis{AgarwalMaster2017,
TITLE = {Time-Aware Named Entity Disambiguation},
AUTHOR = {Agarwal, Prabal},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2017},
DATE = {2017},
ABSTRACT = {Named Entity Disambiguation (NED) is a Natural Language Processing task of linking mentions of named entities is a text to their corresponding entries in a Knowledge Base. It serves as a crucial component in applications such as Semantic Search, Knowledge Base Population, and Opinion Mining. Currently deployed tools for NED are based on sophisticated models that use coherence relation among entities and distributed vectors to represent the entity mentions and their contexts in a document to disambiguate them collectively. Factors that have not been considered yet in this track are the semantics of temporal information about canonical entity forms and their mentions. Even though temporal expressions in a text give inherent structural characteristic to it, for instance, it can map a topic being discussed to a certain period of known history, yet such expressions are leveraged no differently than other dictionary words. In this thesis we propose the first time-aware NED model, which extends a state-of-the-art learning to rank approach based on joint word-entity embeddings. For this we introduce the concept of temporal signatures that is used in our work to represent the importance of each entity in a Knowledge Base over a historical time-line. Such signatures for the entities and temporal contexts for the entity mentions are represented in our proposed temporal vector space to model the similarities between them. We evaluated our method on CoNLL-AIDA and TAC 2010, which are two widely used datasets in the NED track. However, because such datasets are composed of news articles from a short time-period, they do not provide extensive evaluation for our proposed temoral similarity modeling. Therefore, we curated a dia-chronic dataset, diaNED, with the characteristic of temporally diverse entity mentions in its text collection.},
}
Endnote
%0 Thesis
%A Agarwal, Prabal
%Y Strötgen, Jannik
%A referee: Weikum, Gerhard
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Time-Aware Named Entity Disambiguation :
%G eng
%U http://hdl.handle.net/21.11116/0000-0001-38D6-F
%I Universität des Saarlandes
%C Saarbrücken
%D 2017
%P 96 p.
%V master
%9 master
%X Named Entity Disambiguation (NED) is the Natural Language Processing task of linking mentions of named entities in a text to their corresponding entries in a Knowledge Base. It serves as a crucial component in applications such as Semantic Search, Knowledge Base Population, and Opinion Mining. Currently deployed tools for NED are based on sophisticated models that use coherence relations among entities and distributed vectors representing the entity mentions and their contexts in a document to disambiguate them collectively. A factor that has not yet been considered in this track is the semantics of temporal information about canonical entity forms and their mentions. Even though temporal expressions give a text an inherent structural characteristic (for instance, they can map a topic being discussed to a certain period of known history), such expressions are leveraged no differently than other dictionary words. In this thesis we propose the first time-aware NED model, which extends a state-of-the-art learning-to-rank approach based on joint word-entity embeddings. For this we introduce the concept of temporal signatures, used in our work to represent the importance of each entity in a Knowledge Base over a historical time-line. Signatures for the entities and temporal contexts for the entity mentions are represented in our proposed temporal vector space to model the similarities between them. We evaluated our method on CoNLL-AIDA and TAC 2010, two widely used datasets in the NED track. However, because these datasets are composed of news articles from a short time period, they do not provide an extensive evaluation of our proposed temporal similarity modeling. Therefore, we curated a diachronic dataset, diaNED, whose text collection is characterized by temporally diverse entity mentions.
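A minimal sketch of how temporal signatures could enter an NED score, assuming signatures are simple normalized year histograms and the combination is a fixed linear blend (both are simplifications; the thesis defines its own temporal vector space and learning-to-rank integration):

    import numpy as np

    def temporal_signature(years, bins):
        """Normalized histogram of years associated with an entity or a mention context."""
        hist, _ = np.histogram(years, bins=bins)
        hist = hist.astype(float)
        return hist / hist.sum() if hist.sum() > 0 else hist

    def time_aware_score(base_score, entity_years, context_years,
                         bins=np.arange(1900, 2030, 10), alpha=0.3):
        """Combine a conventional mention-entity score with the cosine similarity of
        the two temporal signatures; alpha is an assumed blending weight."""
        e = temporal_signature(entity_years, bins)
        c = temporal_signature(context_years, bins)
        denom = np.linalg.norm(e) * np.linalg.norm(c)
        cos = float(e @ c / denom) if denom > 0 else 0.0
        return (1 - alpha) * base_score + alpha * cos

A candidate entity whose signature peaks in the same decades as the temporal expressions found around the mention thus receives a boost over an otherwise equally plausible candidate.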
[15]
M. Chandra Mukkamala, “Variants of RMSProp and Adagrad with Logarithmic Regret Bounds,” Universität des Saarlandes, Saarbrücken, 2017.
Export
BibTeX
@mastersthesis{MukkamalaMaster2017,
TITLE = {Variants of {RMSProp} and Adagrad with Logarithmic Regret Bounds},
AUTHOR = {Chandra Mukkamala, Mahesh},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2017},
DATE = {2017},
}
Endnote
%0 Thesis
%A Chandra Mukkamala, Mahesh
%Y Hein, Matthias
%A referee: Weickert, Joachim
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Variants of RMSProp and Adagrad with Logarithmic Regret Bounds :
%G eng
%U http://hdl.handle.net/21.11116/0000-0001-38D9-C
%I Universität des Saarlandes
%C Saarbrücken
%D 2017
%P 74 p.
%V master
%9 master
[16]
M. N. Divo, “Formalization of Types and Programming Languages in Isabelle/HOL,” Universität des Saarlandes, Saarbrücken, 2017.
Abstract
Modern programming languages offer many guarantees (no or few memory leaks, safe parallel programming, ...). But are they really safe? The only way to know is to formalize them and to prove that the expected properties are fulfilled. That is why it is relevant to prove, in Isabelle/HOL, essential and general facts about programming languages like those presented in Types and Programming Languages, a reference book on type systems. Isabelle/HOL is a proof assistant that provides type checking and certifies the consistency of proofs. This environment allowed me to discover some imprecisions and flawed parts.
Export
BibTeX
@mastersthesis{DivoMaster2017,
TITLE = {Formalization of Types and Programming Languages in Isabelle/{HOL}},
AUTHOR = {Divo, Michael Noel},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2017},
DATE = {2017},
ABSTRACT = {Modern programming languages offer a lot of guarantees (no or few memory leaks, safe parallel programming, ...). But are they really safe ? The only solution is to formalize them and to prove that the expected properties are fulfilled. That{\textquoteright}s why it is relevant to prove essential and general facts about programming languages like those presented in Types and Programming Languages, a reference book on type systems, in Isabelle/HOL. Isabelle/HOL is a proof assistant, providing type checking and certifying the consistency of proofs. This environment allowed me to discover some imprecision and wrong parts.},
}
Endnote
%0 Thesis
%A Divo, Michael Noel
%Y Blanchette, Jasmin Christian
%A referee: Weidenbach, Christoph
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Automation of Logic, MPI for Informatics, Max Planck Society
Automation of Logic, MPI for Informatics, Max Planck Society
%T Formalization of Types and Programming Languages in Isabelle/HOL :
%G eng
%U http://hdl.handle.net/21.11116/0000-0000-7632-3
%I Universität des Saarlandes
%C Saarbrücken
%D 2017
%P 83 p.
%V master
%9 master
%X Modern programming languages offer many guarantees (no or few memory leaks, safe parallel programming, ...). But are they really safe? The only way to know is to formalize them and to prove that the expected properties are fulfilled. That is why it is relevant to prove, in Isabelle/HOL, essential and general facts about programming languages like those presented in Types and Programming Languages, a reference book on type systems. Isabelle/HOL is a proof assistant that provides type checking and certifies the consistency of proofs. This environment allowed me to discover some imprecisions and flawed parts.
[17]
M. Fieraru, “Learning to Track Humans in Videos,” Universität des Saarlandes, Saarbrücken, 2017.
Abstract
This thesis addresses the multi-person tracking task with two types of representation:<br>body pose and segmentation mask. We explore these scenarios in the semi-supervised setting, where one annotation is available per person during test time. More complex representations of people (segmentation mask and body pose) can provide a richer understanding of visual scenes, and methods that leverage supervision during test time should be developed for the cases when supervision is available.<br><br>We propose HumanMaskTracker for the task of semi-supervised multi-person mask<br>tracking. Our approach builds on recent techniques proposed for the task of video<br>object segmentation. These include the mask refinement approach, training with synthetic data, fine-tuning per object and leveraging optical flow. In addition, we propose leveraging instance semantic segmentation proposals to give the tracker a better notion of the human class. Moreover, we propose modeling people occlusions inside the data synthesis process to make the tracker more robust to the challenges of occlusion and disocclusion.<br><br>For the task of semi-supervised multi-person pose tracking, we propose the method<br>HumanPoseTracker. We show that the task of multi-person pose tracking can benefit<br>significantly from using one pose supervision per track during test time. Fine-tuning<br>per object and leveraging optical flow, techniques proposed for the task of video object segmentation, prove to be highly effective for supervised pose tracking as well. Also, we propose a technique to remove false positive joint detections and develop tracking stopping criteria. A promising application of our work is presented by extending the method to generate dense annotations from sparse ones in videos.
Export
BibTeX
@mastersthesis{FieraruMaster2018,
TITLE = {Learning to Track Humans in Videos},
AUTHOR = {Fieraru, Mihai},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2017},
DATE = {2017},
ABSTRACT = {This thesis addresses the multi-person tracking task with two types of representation:<br>body pose and segmentation mask. We explore these scenarios in the semi-supervised setting, where one available annotation is available per person during test time. More complex representations of people (segmentation mask and body pose) can provide richer understanding of visual scenes, and methods that leverage supervision during test time should be developed for the cases when supervision is available.<br><br>We propose HumanMaskTracker for the task of semi-supervised multi-person mask<br>tracking. Our approach builds on recent techniques proposed for the task of video<br>object segmentation. These include the mask re{fi}nement approach, training with synthetic data, {fi}ne-tuning per object and leveraging optical {fl}ow. In addition, we propose leveraging instance semantic segmentation proposals to give the tracker a better notion about the human class. Moreover, we propose modeling people occlusions inside the data synthesis process to make the tracker more robust to the challenges of occlusion and disocclusion.<br><br>For the task of semi-supervised multi-person pose tracking, we propose the method<br>HumanPoseTracker. We show that the task of multi-person pose tracking can bene{fi}t<br>signi{fi}cantly from using one pose supervision per track during test time. Fine-tuning<br>per object and leveraging optical {fl}ow, techniques proposed for the task of video object segmentation, prove to be highly effective for supervised pose tracking as well. Also, we propose a technique to remove false positive joint detections and develop tracking stopping criteria. A promising application of our work is presented by extending the method to generate dense from sparse annotations in videos.},
}
Endnote
%0 Thesis
%A Fieraru, Mihai
%Y Schiele, Bernt
%A referee: Khoreva, Anna
%A referee: Insafutdinov, Eldar
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Learning to Track Humans in Videos :
%G eng
%U http://hdl.handle.net/21.11116/0000-0002-B35D-C
%I Universität des Saarlandes
%C Saarbrücken
%D 2017
%P 63 p.
%V master
%9 master
%X This thesis addresses the multi-person tracking task with two types of representation:<br>body pose and segmentation mask. We explore these scenarios in the semi-supervised setting, where one available annotation is available per person during test time. More complex representations of people (segmentation mask and body pose) can provide richer understanding of visual scenes, and methods that leverage supervision during test time should be developed for the cases when supervision is available.<br><br>We propose HumanMaskTracker for the task of semi-supervised multi-person mask<br>tracking. Our approach builds on recent techniques proposed for the task of video<br>object segmentation. These include the mask refinement approach, training with synthetic data, fine-tuning per object and leveraging optical flow. In addition, we propose leveraging instance semantic segmentation proposals to give the tracker a better notion about the human class. Moreover, we propose modeling people occlusions inside the data synthesis process to make the tracker more robust to the challenges of occlusion and disocclusion.<br><br>For the task of semi-supervised multi-person pose tracking, we propose the method<br>HumanPoseTracker. We show that the task of multi-person pose tracking can benefit<br>significantly from using one pose supervision per track during test time. Fine-tuning<br>per object and leveraging optical flow, techniques proposed for the task of video object segmentation, prove to be highly effective for supervised pose tracking as well. Also, we propose a technique to remove false positive joint detections and develop tracking stopping criteria. A promising application of our work is presented by extending the method to generate dense from sparse annotations in videos.
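The flow-based propagation step mentioned above can be illustrated generically with OpenCV's dense optical flow; this is a stand-in sketch, not the tracker's actual pipeline, and mask/frame formats are assumed to be plain NumPy arrays (uint8 masks, grayscale frames):

    import cv2
    import numpy as np

    def propagate_mask(prev_gray, next_gray, prev_mask):
        """Warp a person's segmentation mask from one frame to the next with dense
        optical flow (a generic stand-in for flow-based mask propagation)."""
        # Backward flow: for each pixel of the next frame, where it came from previously.
        flow = cv2.calcOpticalFlowFarneback(next_gray, prev_gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        h, w = prev_mask.shape
        grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
        map_x = (grid_x + flow[..., 0]).astype(np.float32)
        map_y = (grid_y + flow[..., 1]).astype(np.float32)
        return cv2.remap(prev_mask, map_x, map_y, cv2.INTER_NEAREST)

In a full tracker the warped mask would only serve as a proposal that is then refined by the per-object fine-tuned segmentation network.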
[18]
N. Rakhmanberdieva, “Exploring Portability of Data Programming Paradigm,” Universität des Saarlandes, Saarbrücken, 2017.
Export
BibTeX
@mastersthesis{RakhmanberdievaMaster2018,
TITLE = {Exploring Portability of Data Programming Paradigm},
AUTHOR = {Rakhmanberdieva, Nurzat},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2017},
DATE = {2017},
CONTENTS = {Machine Learning methods, especially Deep Learning, had an enormous breakthrough in Natural Language Processing and Computer Vision. They showed incredible performance in solving complex problems with minimum human interaction when large amount of labeled data is available. The hardest part is labeling large quantities of unlabeled data as it is time-consuming, expensive and requires expert knowledge. The Data Programming Paradigm which was introduced at NIPS 2016 proposes a method that uses labeling functions. They are a set of heuristic rules that produce large, but noisy training data which is later denoised by a generative model of these labeling functions. In this thesis, we explored portability of Data Programming Paradigm to new domains. We applied it to sequence labeling also known as Slot-{fi}lling for Spoken Language Understanding and Named Entity Extraction. First, to allow these tasks to be included as part of the pipeline, we modi{fi}ed the initial data processing and candidate generation steps in the model. Second, we introduced a new type of labeling functions to test the hypothesis that "lightly" trained models can serve as a solid labeling function in combination with other functions. In this context, "lightly" trained models denote Deep Learning methods such as Convolutional and Recurrent Neural Networks that are trained with a small subset of data. Third, we described the strategies to implement and select optimal labeling functions. Finally, we showed that Data Programming Paradigm can be successfully extended to such tasks and outperforms its counterparts on noisy data. The experimental results for Slot-{fi}lling showed that the for the clean data, Data Programming Paradigm achieved 5.9 points better F1 score than the baseline. But on noisy data, it outperforms twice its counterparts such as Conditional Random Fields. We examined the model with benchmarks such as Air Travel Information System and SAP related datasets.},
}
Endnote
%0 Thesis
%A Rakhmanberdieva, Nurzat
%Y Klakow, Dietrich
%A referee: Berberich, Klaus
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Exploring Portability of Data Programming Paradigm :
%G eng
%U http://hdl.handle.net/21.11116/0000-0002-8441-F
%I Universität des Saarlandes
%C Saarbrücken
%D 2017
%P 83 p.
%V master
%9 master
%Z Machine learning methods, especially Deep Learning, have achieved an enormous breakthrough in Natural Language Processing and Computer Vision. They show impressive performance in solving complex problems with minimal human interaction when a large amount of labeled data is available. The hardest part is labeling large quantities of unlabeled data, as it is time-consuming, expensive and requires expert knowledge. The Data Programming Paradigm, introduced at NIPS 2016, proposes a method based on labeling functions: a set of heuristic rules that produce large but noisy training data, which is later denoised by a generative model of these labeling functions.
In this thesis, we explored the portability of the Data Programming Paradigm to new domains. We applied it to sequence labeling, also known as slot-filling, for Spoken Language Understanding and Named Entity Extraction. First, to allow these tasks to be included as part of the pipeline, we modified the initial data processing and candidate generation steps in the model. Second, we introduced a new type of labeling function to test the hypothesis that "lightly" trained models can serve as solid labeling functions in combination with other functions. In this context, "lightly" trained models denote Deep Learning methods such as Convolutional and Recurrent Neural Networks trained on a small subset of the data. Third, we described strategies to implement and select optimal labeling functions. Finally, we showed that the Data Programming Paradigm can be successfully extended to such tasks and outperforms its counterparts on noisy data. The experimental results for slot-filling showed that, on clean data, the Data Programming Paradigm achieved an F1 score 5.9 points higher than the baseline, and on noisy data it outperforms counterparts such as Conditional Random Fields by a factor of two. We evaluated the model on benchmarks such as the Air Travel Information System and SAP-related datasets.
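For illustration, the core idea of labeling functions plus label aggregation can be sketched as follows; the label space, the heuristics, and the majority-vote combiner are all toy stand-ins (the actual paradigm denoises labeling functions with a generative model):

    ABSTAIN, O, ENTITY = -1, 0, 1   # hypothetical label space for a toy tagging task

    def lf_capitalized(token):
        """Heuristic labeling function: capitalized tokens may be entity mentions."""
        return ENTITY if token[:1].isupper() else ABSTAIN

    def lf_stopword(token, stopwords=frozenset({"the", "a", "of", "in"})):
        """Heuristic labeling function: common stopwords are labeled as non-entities."""
        return O if token.lower() in stopwords else ABSTAIN

    def majority_vote(token, lfs):
        """Combine noisy labeling-function votes; abstentions are ignored."""
        votes = [v for v in (lf(token) for lf in lfs) if v != ABSTAIN]
        if not votes:
            return ABSTAIN
        return max(set(votes), key=votes.count)

    print(majority_vote("Saarbruecken", [lf_capitalized, lf_stopword]))  # -> 1 (ENTITY)

Porting the paradigm to slot-filling then amounts to defining such functions over candidate spans in an utterance and feeding the aggregated, denoised labels to a downstream sequence model.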
[19]
A. Sanchez Bach, “Cross-Architecture Comparison of Binary Executables,” Universität des Saarlandes, Saarbrücken, 2017.
Abstract
The proliferation of IoT-devices is turning different kinds of embedded systems into
another relevant target for malware developers. Consequently, recent botnets are providing clients for multiple host architectures, making the clustering of malware samples a non-trivial task. While several approaches exist for statically comparing binaries of the same architecture, there are no proposed methods to compare binaries across different architectures.
Based on previous approaches for cross-architecture bug identification, we present
CrossDiff, a tool to compare executable binaries compiled for ARM, MIPS, PowerPC
and x86. CrossDiff detects functions in the input executables and translates their instructions into a common intermediate representation. Then, by pairwise comparing
functions based on features at IR-level and analyzing module-level properties we
compute a similarity score for pairs of binaries. Finally, we evaluate this approach
and the stages of the pipeline on the SPEC CPU2006 dataset with a build matrix that
iterates over different architectures, compilers, languages and optimization flags.
Export
BibTeX
@mastersthesis{SanchezMaster2017,
TITLE = {Cross-Architecture Comparison of Binary Executables},
AUTHOR = {Sanchez Bach, Alexandro},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2017},
DATE = {2017},
ABSTRACT = {The proliferation of IoT-devices is turning different kinds of embedded systems into another relevant target for malware developers. Consequently, recent botnets are providing clients for multiple host architectures, making the clustering of malware samples a non-trivial task. While several approaches exist for statically comparing binaries of the same architecture, there are no proposed methods to compare binaries across different architectures. Based on previous approaches for cross-architecture bug identification, we present CrossDiff, a tool to compare executable binaries compiled for ARM, MIPS, PowerPC and x86. CrossDiff detects functions in the input executables and translates their instructions into a common intermediate representation. Then, by pairwise comparing functions based on features at IR-level and analyzing module-level properties we compute a similarity score for pairs of binaries. Finally, we evaluate this approach and the stages of the pipeline on the SPEC CPU2006 dataset with a build matrix that iterates over different architectures, compilers, languages and optimization flags.},
}
Endnote
%0 Thesis
%A Sanchez Bach, Alexandro
%Y Rossow, Christian
%A referee: Hack, Sebastian
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Cross-Architecture Comparison of Binary Executables :
%G eng
%U http://hdl.handle.net/21.11116/0000-0001-38DE-7
%I Universität des Saarlandes
%C Saarbrücken
%D 2017
%P 50 p.
%V master
%9 master
%X The proliferation of IoT-devices is turning different kinds of embedded systems into
another relevant target for malware developers. Consequently, recent botnets are providing clients for multiple host architectures, making the clustering of malware samples a non-trivial task. While several approaches exist for statically comparing binaries of the same architecture, there are no proposed methods to compare binaries across different architectures.
Based on previous approaches for cross-architecture bug identification, we present
CrossDiff, a tool to compare executable binaries compiled for ARM, MIPS, PowerPC
and x86. CrossDiff detects functions in the input executables and translates their instructions into a common intermediate representation. Then, by pairwise comparing
functions based on features at IR-level and analyzing module-level properties we
compute a similarity score for pairs of binaries. Finally, we evaluate this approach
and the stages of the pipeline on the SPEC CPU2006 dataset with a build matrix that
iterates over different architectures, compilers, languages and optimization flags.
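A highly simplified sketch of the pairwise-comparison idea: represent each lifted function as a bag of IR operation counts and score two binaries by greedily matching functions with cosine similarity (the feature choice and aggregation are illustrative assumptions, not CrossDiff's actual features):

    from collections import Counter
    import math

    def ir_features(ir_ops):
        """Bag-of-operations feature vector for one lifted function."""
        return Counter(ir_ops)

    def cosine(a, b):
        dot = sum(a[k] * b.get(k, 0) for k in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def binary_similarity(funcs_a, funcs_b):
        """Greedy best-match average over function pairs as a crude module-level score."""
        if not funcs_a or not funcs_b:
            return 0.0
        feats_b = [ir_features(f) for f in funcs_b]
        scores = [max(cosine(ir_features(fa), fb) for fb in feats_b) for fa in funcs_a]
        return sum(scores) / len(scores)

    print(binary_similarity([["add", "load", "store"], ["mul", "add"]],
                            [["load", "add", "store"], ["sub"]]))

Because the features live at IR level, the same comparison applies regardless of whether the original binaries were compiled for ARM, MIPS, PowerPC or x86.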
[20]
J. A. Tomasson, “Finding Optimal Smoothness Operators for Inpainting with Bi-level Optimization,” Universität des Saarlandes, Saarbrücken, 2017.
Abstract
Inpainting and image denoising are two problems in image processing that can be formulated as rather similar partial differential equations (PDEs). In this work the effects of higher-order smoothness constraints on the results of denoising and inpainting were examined. Methods from bi-level optimization were used to learn optimal smoothness constraints. The differences between the optimal smoothness constraints for the two problems were studied both for a linear model and for a more complex non-linear model. For the linear model, inpainting favoured first-order smoothness, whereas for denoising all of the different orders of smoothness made up a comparable part of the optimal smoothness constraint. Even with this difference, the overall effect on the quality of the results was similar for both problems. For the non-linear model it was much more difficult to find a good smoothness constraint for the inpainting problem than for the denoising problem, and the learned smoothness constraints looked very different.
Export
BibTeX
@mastersthesis{Tomassonmaster2017,
TITLE = {Finding Optimal Smoothness Operators for Inpainting with Bi-level Optimization},
AUTHOR = {Tomasson, Jon Arnar},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2017},
DATE = {2017},
ABSTRACT = {Inpainting and image denoising are two problems in image processing that can be formulated as rather similar partial differential equations or PDE. In this work the effects of higher order smoothness constraints on the results of denoising and inpainting were looked at. Methods from bi-level optimization were used in order to learn optimal smoothness constraints. The differences between the optimal smoothness constraints for the two problems were looked at both for a linear model and a more complex non-linear model. The results for the linear model were that inpainting favoured first order smoothness. For denoising on the other hand all of the different orders of smoothness made up a comparable part of the optimal smoothness constraint. Even with this difference the overall effect on the quality of the results was similar for both problems. For the non-linear model it was much more difficult to find a good smoothness constraint for the inpainting problem than for the denoising problem and the learned smoothness constraints looked very different.},
}
Endnote
%0 Thesis
%A Tomasson, Jon Arnar
%Y Weickert, Joachim
%A referee: Ochs, Peter
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Externe Organisation (UdS)
Externe Organisation (UdS)
%T Finding Optimal Smoothness Operators for Inpainting with Bi-level Optimization :
%G eng
%U http://hdl.handle.net/21.11116/0000-0000-7641-2
%I Universität des Saarlandes
%C Saarbrücken
%D 2017
%P 52 p.
%V master
%9 master
%X Inpainting and image denoising are two problems in image processing that can be formulated as rather similar partial differential equations (PDEs). In this work the effects of higher-order smoothness constraints on the results of denoising and inpainting were examined. Methods from bi-level optimization were used to learn optimal smoothness constraints. The differences between the optimal smoothness constraints for the two problems were studied both for a linear model and for a more complex non-linear model. For the linear model, inpainting favoured first-order smoothness, whereas for denoising all of the different orders of smoothness made up a comparable part of the optimal smoothness constraint. Even with this difference, the overall effect on the quality of the results was similar for both problems. For the non-linear model it was much more difficult to find a good smoothness constraint for the inpainting problem than for the denoising problem, and the learned smoothness constraints looked very different.
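The first-order (homogeneous-diffusion) smoothness case that the linear model favours can be illustrated with a few lines of Python; the learned higher-order operators from the thesis are not reproduced here, and periodic boundaries are used purely for brevity:

    import numpy as np

    def diffusion_inpainting(image, known_mask, n_iter=2000):
        """Linear homogeneous-diffusion inpainting: keep known pixels fixed and iterate
        a discrete Laplace equation on the unknown ones (periodic boundaries for brevity)."""
        u = image.astype(float).copy()
        unknown = ~known_mask
        for _ in range(n_iter):
            avg = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                          np.roll(u, 1, 1) + np.roll(u, -1, 1))
            u[unknown] = avg[unknown]   # relax unknown pixels towards their neighbour mean
        return u

    # Example setup: keep a sparse grid of pixels and reconstruct the rest.
    # mask = np.zeros((64, 64), dtype=bool); mask[::8, ::8] = True
    # reconstructed = diffusion_inpainting(original_image, mask)

Replacing the discrete Laplacian in this loop by a learned combination of higher-order operators is, in spirit, what the bi-level optimization in the thesis searches for.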
2016
[21]
M. Alzayat, “PolSim: Automatic Policy Validation via Meta-Data Flow Simulation,” Universität des Saarlandes, Saarbrücken, 2016.
Abstract
Every year millions of confidential data records are leaked accidentally due to bugs, misconfiguration, or operator error. These incidents are common in large, complex, and fast evolving data processing systems. Ensuring compliance with data policies is a major challenge. Thoth is an information flow control system that uses coarse-grained taint tracking to control the flow of data. This is achieved by enforcing relevant declarative policies at process boundaries. This enforcement is applicable regardless of bugs, misconfiguration, and compromises in application code, or actions by unprivileged operators. Designing policies that make sure all and only compliant flows are allowed remains a complex and error-prone process. In this work, we introduce PolSim, a simulation tool that aids system policy designers by validating the provided policies and systematically ensuring that the system allows all and only expected flows. Our proposed simulator approximates the dynamic run-time environment, semi-automatically suggests internal flow policies based on data flow, and provides debugging hints to help policy designers develop a working policy for the intended system before deployment.
Export
BibTeX
@mastersthesis{Alzayatmaster2016,
TITLE = {Pol{S}im: Automatic Policy Validation via Meta-Data Flow Simulation},
AUTHOR = {Alzayat, Mohamed},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2016},
DATE = {2016-09-27},
ABSTRACT = {Every year millions of confidential data records are leaked accidentally due to bugs, misconfiguration, or operator error. These incidents are common in large, complex, and fast evolving data processing systems. Ensuring compliance with data policies is a major challenge. Thoth is an information flow control system that uses coarse-grained taint tracking to control the flow of data. This is achieved by enforcing relevant declarative policies at processes boundaries. This enforcement is applicable regardless of bugs, misconfiguration, and compromises in application code, or actions by unprivileged operators. Designing policies that make sure all and only compliant flows are allowed remains a complex and error-prone process. In this work, we introduce PolSim, a simulation tool that aids system policy designers by validating the provided policies and systematically ensuring that the system allows all and only expected flows. Our proposed simulator approximates the dynamic run-time environment, semi-automatically suggests internal flow policies based on data flow, and provides debugging hints to help policy designers develop a working policy for the intended system before deployment.},
}
Endnote
%0 Thesis
%A Alzayat, Mohamed
%Y Druschel, Peter
%A referee: Garg, Deepak
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Group P. Druschel, Max Planck Institute for Software Systems, Max Planck Society
Group D. Garg, Max Planck Institute for Software Systems, Max Planck Society
%T PolSim: Automatic Policy Validation via Meta-Data Flow Simulation :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002C-ACCC-8
%I Universität des Saarlandes
%C Saarbrücken
%D 2016
%8 27.09.2016
%P 71 p.
%V master
%9 master
%X Every year millions of confidential data records are leaked accidentally due to bugs, misconfiguration, or operator error. These incidents are common in large, complex, and fast evolving data processing systems. Ensuring compliance with data policies is a major challenge. Thoth is an information flow control system that uses coarse-grained taint tracking to control the flow of data. This is achieved by enforcing relevant declarative policies at processes boundaries. This enforcement is applicable regardless of bugs, misconfiguration, and compromises in application code, or actions by unprivileged operators. Designing policies that make sure all and only compliant flows are allowed remains a complex and error-prone process. In this work, we introduce PolSim, a simulation tool that aids system policy designers by validating the provided policies and systematically ensuring that the system allows all and only expected flows. Our proposed simulator approximates the dynamic run-time environment, semi-automatically suggests internal flow policies based on data flow, and provides debugging hints to help policy designers develop a working policy for the intended system before deployment.
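A toy sketch of what a meta-data flow simulation can look like: propagate taint labels along process-to-process edges and flag processes whose declared policy does not cover the taints they receive. All data structures and names are hypothetical and much simpler than Thoth's policy language or PolSim's actual design:

    from collections import deque

    def simulate_flows(edges, sources, policies):
        """Propagate taint labels from data sources along edges and report violations."""
        taints = {node: set(labels) for node, labels in sources.items()}
        queue = deque(sources)
        while queue:
            node = queue.popleft()
            for succ in edges.get(node, []):
                new = taints.get(node, set()) - taints.setdefault(succ, set())
                if new:                      # only re-visit nodes whose taint set grew
                    taints[succ] |= new
                    queue.append(succ)
        return [(node, labels - policies.get(node, set()))
                for node, labels in taints.items()
                if not labels <= policies.get(node, set())]

    print(simulate_flows({"ingest": ["indexer"], "indexer": ["public_api"]},
                         {"ingest": {"private"}},
                         {"ingest": {"private"}, "indexer": {"private"}, "public_api": set()}))
    # -> [('public_api', {'private'})]: private data would reach a process not allowed to see it

Running such a simulation before deployment surfaces exactly the kind of "unexpected flow allowed" or "expected flow blocked" mistakes that a policy designer would otherwise only discover at run time.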
[22]
D. Amanova, “Cross Domain Dialogue Act Classification,” Universität des Saarlandes, Saarbrücken, 2016.
Abstract
Dialogue act classification is currently one of the hot topics in computational linguistics, and a variety of machine learning algorithms have been used for it. In this thesis, we investigate cross-domain dialogue act classification using Support Vector Machines. The goal of the research reported in this work is to explore features for effective cross-domain classification. The work includes two phases of data-driven investigation: the first involves collecting and analyzing corpora, while the second involves domain-independent feature selection and extraction.
Dialogue act annotations were collected from three different corpora: AMI, HCRC MapTask and SWBD-DAMSL [1]. Based on ISO standards, these dialogue acts were mapped to corresponding groups. A number of experiments were carried out to find the features with the best predictive power. The results show that a combination of multiple features (bigrams of part-of-speech tags, chunks and words) leads to more consistent improvements of the classifier's performance than any feature in isolation.
Finally, we investigate the portability and generalizability of the proposed approach by applying the feature set with the best predictive results to the unseen Metalogue corpus. The findings indicate that good classification accuracy can be achieved with our approach, and that a set of automatically extracted features is shared between large corpora and proves to be highly reliable when used directly to classify dialogue acts.
Export
BibTeX
@mastersthesis{AmanovaMSc2017,
TITLE = {Cross Domain Dialogue Act Classification},
AUTHOR = {Amanova, Dilafruz},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2016},
DATE = {2016},
ABSTRACT = {Nowadays the dialogue act classification is one of the hot topics in computational linguistics. Different machine learning algorithms were used for dialogue act classi- fication. In this thesis, we investigate the cross domain dialogue act classification using Support Vector Machines. The goal of the research reported in this work is to explore features for effective cross domain classification. The work includes two phases of data-driven investigation. The first phase involves collecting, and analyzing corpora, while the second phase involves domain independent feature selection and extraction. Dialogue act annotation were collected from three different corpora: AMI 1, HCRC MapTask 2 and SWBD DAMSL [1]. Based on ISO standards, these dialogue acts were mapped to corresponding groups. Number of various experiments were carried out to find features with the best predictive power. The results show that the combination of multiple features: bigrams of Part-Of-Speech, Chunks and words, lead to consistent improvement of the classifier's performance than features in isolation. Finally, we investigate the portability and generalibility of proposed approach on extracted features when using set of features that showed the best predictive results on unseen Metalogue corpus 3. The findings indicate that good classification accuracy can be achieved using our approach, and that there is a set of automatically extracted feature are shared between large corpora, that prove to be extremely reliable when used directly to classify Dialogue Acts.},
}
Endnote
%0 Thesis
%A Amanova, Dilafruz
%Y Petukhova, Volha
%A referee: Prof Klakow, Dietrich
%A referee: Miettinen, Pauli
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Universität des Saarlandes, Sprach- und Signalverarbeitung
Universität des Saarlandes, Sprach- und Signalverarbeitung
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Cross Domain Dialogue Act Classification :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002E-A0F2-F
%I Universität des Saarlandes
%C Saarbrücken
%D 2016
%P 60 p.
%V master
%9 master
%X Dialogue act classification is currently one of the hot topics in computational linguistics, and a variety of machine learning algorithms have been used for it. In this thesis, we investigate cross-domain dialogue act classification using Support Vector Machines. The goal of the research reported in this work is to explore features for effective cross-domain classification. The work includes two phases of data-driven investigation: the first involves collecting and analyzing corpora, while the second involves domain-independent feature selection and extraction.
Dialogue act annotations were collected from three different corpora: AMI, HCRC MapTask and SWBD-DAMSL [1]. Based on ISO standards, these dialogue acts were mapped to corresponding groups. A number of experiments were carried out to find the features with the best predictive power. The results show that a combination of multiple features (bigrams of part-of-speech tags, chunks and words) leads to more consistent improvements of the classifier's performance than any feature in isolation.
Finally, we investigate the portability and generalizability of the proposed approach by applying the feature set with the best predictive results to the unseen Metalogue corpus. The findings indicate that good classification accuracy can be achieved with our approach, and that a set of automatically extracted features is shared between large corpora and proves to be highly reliable when used directly to classify dialogue acts.
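The feature combination described above (word and POS n-grams feeding an SVM) can be sketched with scikit-learn; the toy utterances, tag strings, and label set below are illustrative only:

    from scipy.sparse import hstack
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.svm import LinearSVC

    # Toy data; the thesis works with AMI, HCRC MapTask and SWBD-DAMSL.
    utterances = ["could you pass the map", "yes sure", "where is the lake", "okay thanks"]
    pos_tags   = ["MD PRP VB DT NN", "UH JJ", "WRB VBZ DT NN", "UH NNS"]
    labels     = ["question", "answer", "question", "feedback"]

    word_vec = CountVectorizer(ngram_range=(1, 2))                     # word uni- and bigrams
    pos_vec  = CountVectorizer(ngram_range=(2, 2), token_pattern=r"\S+")  # POS bigrams

    X = hstack([word_vec.fit_transform(utterances), pos_vec.fit_transform(pos_tags)])
    clf = LinearSVC().fit(X, labels)

    X_new = hstack([word_vec.transform(["is there a bridge"]),
                    pos_vec.transform(["VBZ EX DT NN"])])
    print(clf.predict(X_new))   # e.g. ['question']

For the cross-domain experiments, the same feature pipeline would be fit on one corpus and evaluated unchanged on another, which is what makes domain-independent features valuable.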
[23]
R. Antonyan, “Threshold Based Online Algorithms for Portfolio Selection Problem,” Universität des Saarlandes, Saarbrücken, 2016.
Abstract
This Master thesis introduces a portfolio selection trading strategy named "Threshold Based Online Algorithm" (TBOA). The decision into which assets to invest is based on a threshold calculated from previous trading periods, and two different ways of calculating this threshold are proposed. The main idea of the algorithm is to exploit the mean-reversion property of stock markets by identifying assets that are expected to increase (or decrease) in the following trading periods. We run numerical experiments on real datasets to estimate the algorithm's performance. Analysing the empirical results, we find that TBOA trades off between wealth performance, volatility and downside risk. It performed better on portfolios containing highly volatile assets with low correlation between them. The evaluation shows that TBOA was able to outperform existing algorithms on some real datasets. Moreover, TBOA has linear time complexity, which makes it fast in practice.
Export
BibTeX
@mastersthesis{Antonyanmaster2016,
TITLE = {Threshold Based Online Algorithms for Portfolio Selection Problem},
AUTHOR = {Antonyan, Rafaella},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2016},
ABSTRACT = {This Master Thesis introduces portfolio selection trading strategy named {\textquotedblright}Threshold Based Online Algorithm{\textquotedblright}. A decision into which assets to invest is based on the thresh- old calculated from previous trading periods. In this work have been proposed two different ways for calculating the threshold. The main idea of the algorithm is to exploit the mean reversion property of stock markets by identifying assets that are expected to increase(decrease) in the following trading periods. We run numerical experiments on real datasets to estimate the algorithm performance efficiency. By analysing the empirical performance results, we {fi}gured out that TBOA trade-off between wealth per- formance, volatility and downside risks. It performed better on portfolio that contains highly volatile assets with low correlation between them. Evaluation results have shown, that TBOA was able to outperform already existing algorithms on some real datasets. Moreover TBOA has linear time complexity that makes algorithm runs fast.},
}
Endnote
%0 Thesis
%A Antonyan, Rafaella
%Y Schmidt, Günther
%A referee: Zayer, Rhaleb
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Computer Graphics, MPI for Informatics, Max Planck Society
%T Threshold Based Online Algorithms for Portfolio Selection Problem :
%G eng
%U http://hdl.handle.net/21.11116/0000-0001-B28F-5
%I Universität des Saarlandes
%C Saarbrücken
%D 2016
%P 68 p.
%V master
%9 master
%X This Master thesis introduces a portfolio selection trading strategy named "Threshold Based Online Algorithm" (TBOA). The decision into which assets to invest is based on a threshold calculated from previous trading periods, and two different ways of calculating this threshold are proposed. The main idea of the algorithm is to exploit the mean-reversion property of stock markets by identifying assets that are expected to increase (or decrease) in the following trading periods. We run numerical experiments on real datasets to estimate the algorithm's performance. Analysing the empirical results, we find that TBOA trades off between wealth performance, volatility and downside risk. It performed better on portfolios containing highly volatile assets with low correlation between them. The evaluation shows that TBOA was able to outperform existing algorithms on some real datasets. Moreover, TBOA has linear time complexity, which makes it fast in practice.
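A toy version of a threshold-based mean-reversion rule, shown only to illustrate the general idea; the thesis' TBOA uses its own threshold definitions and update rules:

    import numpy as np

    def threshold_weights(prices, threshold=1.0):
        """Invest equally in assets whose last price relative fell below the threshold,
        expecting them to revert upwards in the next period (illustrative rule only)."""
        last_rel = prices[-1] / prices[-2]      # most recent price relatives
        pick = last_rel < threshold             # under-performers are the buy candidates
        if not pick.any():
            pick[:] = True                      # fall back to a uniform portfolio
        w = pick.astype(float)
        return w / w.sum()

    prices = np.array([[10.0, 20.0, 5.0],
                       [11.0, 19.0, 5.5],
                       [10.5, 19.5, 5.2]])
    print(threshold_weights(prices))            # -> [0.5 0.  0.5]

Because only the most recent price relatives are inspected, the per-period work grows linearly in the number of assets, which matches the linear time complexity highlighted in the abstract.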
[24]
S. Bozca, “Discrete Osmosis Methods for Image Processing,” Universität des Saarlandes, Saarbrücken, 2016.
Abstract
Partial differential equations can model many physical phenomena and can be simulated on a computer. Osmosis, which takes the form of a convection-diffusion equation, has found many application areas in image processing. However, the slow convergence of this model under current methods, which depends on the incompatibility of the drift vector field used in the model, prevents fast and possibly real-time applications.
In this thesis we therefore take a deeper look at what incompatibility means and how it affects the steady states of the osmosis process. In addition, we evaluate several promising methods which offer a substantial computational advantage over classical iterative methods.
Export
BibTeX
@mastersthesis{BozcaMaster2016,
TITLE = {Discrete Osmosis Methods for Image Processing},
AUTHOR = {Bozca, Sinan},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2016},
DATE = {2016},
ABSTRACT = {Partial differential equations can model many physical phenomena and be used to simulate under computer. Osmosis, which is in the form of convection-diffusion equa- tion, has found itself many application areas in image processing. However, slow convergence of this model, which depends on incompatibility of the drift vector field used in the model, under current methods does not allow us to have a fast, and possibly real-time application area. Therefore, we get a deeper look into what incompatibility means and how it effects steady states of the osmosis process in this thesis. In addition, we evaluate several promising methods which offers substantial computational advantage over classical iterative methods.},
}
Endnote
%0 Thesis
%A Bozca, Sinan
%Y Weickert, Joachim
%A referee: Augustin, Matthias
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Universität des Saarlandes
Universität des Saarlandes
%T Discrete Osmosis Methods for Image Processing :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002C-41DD-1
%I Universität des Saarlandes
%C Saarbrücken
%D 2016
%P 43 p.
%V master
%9 master
%X Partial differential equations can model many physical phenomena and can be simulated on a computer. Osmosis, which takes the form of a convection-diffusion equation, has found many application areas in image processing. However, the slow convergence of this model under current methods, which depends on the incompatibility of the drift vector field used in the model, prevents fast and possibly real-time applications.
In this thesis we therefore take a deeper look at what incompatibility means and how it affects the steady states of the osmosis process. In addition, we evaluate several promising methods which offer a substantial computational advantage over classical iterative methods.
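A textbook-style explicit scheme for 1-D osmosis with the canonical drift of a guidance signal illustrates the model's behaviour (the steady state is a rescaled copy of the guidance signal with the initial mean preserved); the fast solvers evaluated in the thesis are not reproduced here:

    import numpy as np

    def osmosis_1d(f, v, tau=0.1, n_iter=20000):
        """Explicit 1-D osmosis evolution u_t = u_xx - (d u)_x with the canonical drift
        d of a guidance signal v; reflecting (no-flux) boundaries preserve the mean."""
        u = np.asarray(f, dtype=float).copy()
        v = np.asarray(v, dtype=float)
        d = 2.0 * (v[1:] - v[:-1]) / (v[1:] + v[:-1])   # canonical drift at half points
        for _ in range(n_iter):
            ue = np.concatenate(([u[0]], u, [u[-1]]))    # reflecting boundary extension
            diffusion = ue[2:] - 2.0 * ue[1:-1] + ue[:-2]
            flux = d * (u[1:] + u[:-1]) / 2.0            # drift flux at half points
            div = np.empty_like(u)
            div[0], div[1:-1], div[-1] = flux[0], flux[1:] - flux[:-1], -flux[-1]
            u += tau * (diffusion - div)
        return u

    v = np.array([1.0, 2.0, 4.0, 2.0, 1.0])              # guidance signal
    f = np.full(5, v.mean())                             # flat initialisation, same mean
    print(osmosis_1d(f, v))                              # -> approximately [1. 2. 4. 2. 1.]

With a compatible drift field (as here, derived from a single guidance signal) the iteration converges to the expected steady state; incompatible drift fields, the focus of the thesis, slow this convergence down considerably.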
[25]
C. X. Chu, “Mining How-to Task Knowledge from Online Communities,” Universität des Saarlandes, Saarbrücken, 2016.
Export
BibTeX
@mastersthesis{ChuMSc2016,
TITLE = {Mining How-to Task Knowledge from Online Communities},
AUTHOR = {Chu, Cuong Xuan},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2016},
DATE = {2016},
}
Endnote
%0 Thesis
%A Chu, Cuong Xuan
%Y Weikum, Gerhard
%A referee: Vreeken, Jilles
%A referee: Tandon, Niket
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Mining How-to Task Knowledge from Online Communities :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002C-491D-B
%I Universität des Saarlandes
%C Saarbrücken
%D 2016
%P 66 p.
%V master
%9 master
[26]
O. Darwish, “Market Equilibrium Computation for the Linear Arrow-Debreu Model,” Universität des Saarlandes, Saarbrücken, 2016.
Abstract
The problem of market equilibrium is defined as the problem of finding prices for the goods such that the supply in the market is equal to the demand. The problem is applicable to several market models, like the linear Arrow-Debreu model, which is one of the fundamental economic market models. Over the years, various algorithms have been developed to compute the market equilibrium of the linear Arrow-Debreu model. In 2013, Duan and Mehlhorn presented the first combinatorial polynomial time algorithm for computing the market equilibrium of this model.
In this thesis, we optimize, generalize, and implement the Duan-Mehlhorn algorithm. We present a novel algorithm for computing balanced flows in equality networks, which is an application of parametric flows. This algorithm outperforms the current best algorithm for computing balanced flows; hence, it improves Duan-Mehlhorn's algorithm by almost a factor of n, which is the size of the network. Moreover, we generalize Duan-Mehlhorn's algorithm by relaxing some of its assumptions. Finally, we describe our approach for implementing Duan-Mehlhorn's algorithm. The preliminary results of our implementation - based on random utility instances - show that the running time of the implementation scales significantly better than the theoretical time complexity.
Export
BibTeX
@mastersthesis{DarwishMaster2016,
TITLE = {Market Equilibrium Computation for the Linear Arrow-Debreu Model},
AUTHOR = {Darwish, Omar},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2016},
DATE = {2016-03-31},
ABSTRACT = {The problem of market equilibrium is defined as the problem of finding prices for the goods such that the supply in the market is equal to the demand. The problem is applicable to several market models, like the linear Arrow-Debreu model, which is one of the fundamental economic market models. Over the years, various algorithms have been developed to compute the market equilibrium of the linear Arrow-Debreu model. In 2013, Duan and Mehlhorn presented the first combinatorial polynomial time algorithm for computing the market equilibrium of this model. In this thesis, we optimize, generalize, and implement the Duan-Mehlhorn algorithm. We present a novel algorithm for computing balanced flows in equality networks, which is an application of parametric flows. This algorithm outperforms the current best algorithm for computing balanced flows; hence, it improves Duan-Mehlhorn's algorithm by almost a factor of n, which is the size of the network. Moreover, we generalize Duan-Mehlhorn's algorithm by relaxing some of its assumptions. Finally, we describe our approach for implementing Duan-Mehlhorn's algorithm. The preliminary results of our implementation -- based on random utility instances -- show that the running time of the implementation scales significantly better than the theoretical time complexity.},
}
Endnote
%0 Thesis
%A Darwish, Omar
%Y Mehlhorn, Kurt
%A referee: Hoefer, Martin
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Market Equilibrium Computation for the Linear Arrow-Debreu Model :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002C-41D0-C
%I Universität des Saarlandes
%C Saarbrücken
%D 2016
%8 31.03.2016
%P 73 p.
%V master
%9 master
%X The problem of market equilibrium is defined as the problem of finding prices for the goods such that the supply in the market is equal to the demand. The problem is applicable to several market models, like the linear Arrow-Debreu model, which is one of the fundamental economic market models. Over the years, various algorithms have been developed to compute the market equilibrium of the linear Arrow-Debreu model. In 2013, Duan and Mehlhorn presented the first combinatorial polynomial time algorithm for computing the market equilibrium of this model.
In this thesis, we optimize, generalize, and implement the Duan-Mehlhorn algorithm. We present a novel algorithm for computing balanced flows in equality networks, which is an application of parametric flows. This algorithm outperforms the current best algorithm for computing balanced flows; hence, it improves Duan-Mehlhorn's algorithm by almost a factor of n, which is the size of the network. Moreover, we generalize Duan-Mehlhorn's algorithm by relaxing some of its assumptions. Finally, we describe our approach for implementing Duan-Mehlhorn's algorithm. The preliminary results of our implementation - based on random utility instances - show that the running time of the implementation scales significantly better than the theoretical time complexity.
[27]
A. El-Korashy, “A Formal Model for Capability Machines An Illustrative Case Study towards Secure Compilation to CHERI,” Universität des Saarlandes, Saarbrücken, 2016.
Abstract
Vulnerabilities in computer systems arise in part due to programmers' logical errors, and in part also due to programmers' false (i.e., over-optimistic) expectations about the guarantees given by the abstractions of a programming language. For the latter kind of vulnerability, architectures with hardware or instruction-level support for protection mechanisms can be useful. One trend in computer systems protection is hardware-supported enforcement of security guarantees and policies. Capability-based machines are one instance of hardware-based protection mechanisms. CHERI is a recent implementation of a 64-bit MIPS-based capability architecture with byte-granularity memory protection.
The goal of this thesis is to provide a pen-and-paper formal model of the CHERI architecture, with the aim of formally reasoning about the security guarantees that can be offered by the features of CHERI. We first give simplified operational semantics for the instructions, then we prove that capabilities are unforgeable in our model. Second, we show that existing techniques for enforcing control-flow integrity can be adapted to the CHERI ISA. Third, we show that one notion of memory compartmentalization can be achieved with the help of CHERI's memory protection. We conclude by suggesting other security building blocks that would be helpful to reason about, and by laying down a plan for potentially using this work to build a secure compiler, i.e., a compiler that preserves security properties.
The outlook and motivation for this work is to highlight the potential of using CHERI as a target architecture for secure compilation.
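To convey the core idea of capability-based memory protection discussed above, here is a deliberately simplified, hypothetical sketch (not CHERI's ISA and not the thesis's formal model) in which a load succeeds only if the address lies within a capability's bounds and the capability carries the load permission.
# Toy model of capability-checked loads; made up for illustration, not CHERI.
from dataclasses import dataclass

@dataclass(frozen=True)
class Capability:
    base: int        # first accessible address
    length: int      # number of accessible bytes
    can_load: bool   # read permission

class CapabilityError(Exception):
    pass

def load_byte(memory: bytearray, cap: Capability, addr: int) -> int:
    """Return memory[addr] only if the access is authorised by cap."""
    if not cap.can_load:
        raise CapabilityError("load permission missing")
    if not (cap.base <= addr < cap.base + cap.length):
        raise CapabilityError("address outside capability bounds")
    return memory[addr]

mem = bytearray(range(16))
cap = Capability(base=4, length=4, can_load=True)
print(load_byte(mem, cap, 5))    # allowed: 5 lies in [4, 8)
# load_byte(mem, cap, 12)        # would raise CapabilityError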
Export
BibTeX
@mastersthesis{El-KorashyMaster2016,
TITLE = {A Formal Model for Capability Machines An Illustrative Case Study towards Secure Compilation to {CHERI}},
AUTHOR = {El-Korashy, Akram},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2016},
DATE = {2016},
ABSTRACT = {Vulnerabilities in computer systems arise in part due to programmers' logical errors, and in part also due to programmers' false (i.e., over-optimistic) expectations about the guarantees given by the abstractions of a programming language. For the latter kind of vulnerability, architectures with hardware or instruction-level support for protection mechanisms can be useful. One trend in computer systems protection is hardware-supported enforcement of security guarantees and policies. Capability-based machines are one instance of hardware-based protection mechanisms. CHERI is a recent implementation of a 64-bit MIPS-based capability architecture with byte-granularity memory protection. The goal of this thesis is to provide a pen-and-paper formal model of the CHERI architecture, with the aim of formally reasoning about the security guarantees that can be offered by the features of CHERI. We first give simplified operational semantics for the instructions, then we prove that capabilities are unforgeable in our model. Second, we show that existing techniques for enforcing control-flow integrity can be adapted to the CHERI ISA. Third, we show that one notion of memory compartmentalization can be achieved with the help of CHERI's memory protection. We conclude by suggesting other security building blocks that would be helpful to reason about, and by laying down a plan for potentially using this work to build a secure compiler, i.e., a compiler that preserves security properties. The outlook and motivation for this work is to highlight the potential of using CHERI as a target architecture for secure compilation.},
}
Endnote
%0 Thesis
%A El-Korashy, Akram
%Y Patrignani, Marco
%A referee: Garg, Deepak
%A referee: Reineke, Jan
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Max Planck Institute for Software Systems, Max Planck Society
Max Planck Institute for Software Systems, Max Planck Society
Universität des Saarlandes
%T A Formal Model for Capability Machines An Illustrative Case Study towards Secure Compilation to CHERI :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002C-41CA-B
%I Universität des Saarlandes
%C Saarbrücken
%D 2016
%P 89 p.
%V master
%9 master
%X Vulnerabilities in computer systems arise in part due to programmers' logical errors, and in part also due to programmers' false (i.e., over-optimistic) expectations about the guarantees given by the abstractions of a programming language. For the latter kind of vulnerability, architectures with hardware or instruction-level support for protection mechanisms can be useful. One trend in computer systems protection is hardware-supported enforcement of security guarantees and policies. Capability-based machines are one instance of hardware-based protection mechanisms. CHERI is a recent implementation of a 64-bit MIPS-based capability architecture with byte-granularity memory protection.
The goal of this thesis is to provide a pen-and-paper formal model of the CHERI architecture, with the aim of formally reasoning about the security guarantees that can be offered by the features of CHERI. We first give simplified operational semantics for the instructions, then we prove that capabilities are unforgeable in our model. Second, we show that existing techniques for enforcing control-flow integrity can be adapted to the CHERI ISA. Third, we show that one notion of memory compartmentalization can be achieved with the help of CHERI's memory protection. We conclude by suggesting other security building blocks that would be helpful to reason about, and by laying down a plan for potentially using this work to build a secure compiler, i.e., a compiler that preserves security properties.
The outlook and motivation for this work is to highlight the potential of using CHERI as a target architecture for secure compilation.
[28]
A. Hanka, “Material Appearance Editing in Complex Volume and Surface Renderings,” Universität des Saarlandes, Saarbrücken, 2016.
Abstract
When considering global illumination, material editing is a non-linear task, and even in scenes of moderate complexity, its global nature makes the final appearance of other objects in the scene difficult to predict.
In this thesis, a novel interactive method is proposed for object appearance design. To achieve this, a randomized per-pixel parametrization of scene materials is defined. At rendering time, parametrized materials have different properties for every pixel. This way, multiple rendered results are encoded into one image. We call this collection of data a hyperimage.
Material editing means projecting the hyperimage onto a given parameter vector, which is achieved using non-linear weighted regression. Pixel guides based on geometry (normals, depth and unique object ID), materials and lighting properties of the scene enter the regression problem as pixel weights. To ensure that only relevant features are considered, a rendering-based feature selection method is introduced, which uses a precomputed pixel-feature function encoding the per-pixel importance of each parametrized material.
The method of hyperimages is independent of the underlying rendering algorithm, while supporting full global illumination and surface interactions.
Our method is not limited to the parametrization of materials and can be extended to other scene properties. As an example, we show parametrization of the position of an area light source.
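As a loose, hypothetical illustration of projecting hyperimage samples onto a query parameter vector with guide-weighted regression (a simple kernel-regression stand-in, not the estimator used in the thesis), consider the following sketch.
# Minimal kernel-weighted projection of per-pixel hyperimage samples onto a
# query material-parameter vector. Bandwidth, guides and data are made up;
# this only stands in for the non-linear weighted regression in the thesis.
import numpy as np

def project_pixel(thetas, colors, guides, theta_query, bandwidth=0.2):
    """thetas: (n, d) sampled material parameters for one pixel,
    colors: (n, 3) corresponding rendered colors, guides: (n,) pixel weights."""
    d2 = np.sum((thetas - theta_query) ** 2, axis=1)
    w = guides * np.exp(-d2 / (2.0 * bandwidth ** 2))    # kernel * guide weight
    return (w[:, None] * colors).sum(axis=0) / w.sum()   # weighted mean color

rng = np.random.default_rng(0)
thetas = rng.random((64, 2))     # 64 random parameter samples for one pixel
colors = rng.random((64, 3))     # their rendered colors
guides = np.ones(64)             # uniform pixel guides in this toy example
print(project_pixel(thetas, colors, guides, np.array([0.5, 0.5])))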
Export
BibTeX
@mastersthesis{HankaMSc2016,
TITLE = {Material Appearance Editing in Complex Volume and Surface Renderings},
AUTHOR = {Hanka, Adam},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2016},
DATE = {2016-03-31},
ABSTRACT = {When considering global illumination, material editing is a non-linear task, and even in scenes of moderate complexity, its global nature makes the final appearance of other objects in the scene difficult to predict. In this thesis, a novel interactive method is proposed for object appearance design. To achieve this, a randomized per-pixel parametrization of scene materials is defined. At rendering time, parametrized materials have different properties for every pixel. This way, multiple rendered results are encoded into one image. We call this collection of data a hyperimage. Material editing means projecting the hyperimage onto a given parameter vector, which is achieved using non-linear weighted regression. Pixel guides based on geometry (normals, depth and unique object ID), materials and lighting properties of the scene enter the regression problem as pixel weights. To ensure that only relevant features are considered, a rendering-based feature selection method is introduced, which uses a precomputed pixel-feature function encoding the per-pixel importance of each parametrized material. The method of hyperimages is independent of the underlying rendering algorithm, while supporting full global illumination and surface interactions. Our method is not limited to the parametrization of materials and can be extended to other scene properties. As an example, we show parametrization of the position of an area light source.},
}
Endnote
%0 Thesis
%A Hanka, Adam
%Y Ritschel, Tobias
%A referee: Slusallek, Philipp
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
External Organizations
%T Material Appearance Editing in Complex Volume and Surface Renderings :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002C-41E0-8
%I Universität des Saarlandes
%C Saarbrücken
%D 2016
%8 31.03.2016
%P 51 p.
%V master
%9 master
%X When considering global illumination, material editing is a non-linear task, and even in scenes of moderate complexity, its global nature makes the final appearance of other objects in the scene difficult to predict.
In this thesis, a novel interactive method is proposed for object appearance design. To achieve this, a randomized per-pixel parametrization of scene materials is defined. At rendering time, parametrized materials have different properties for every pixel. This way, multiple rendered results are encoded into one image. We call this collection of data a hyperimage.
Material editing means projecting the hyperimage onto a given parameter vector, which is achieved using non-linear weighted regression. Pixel guides based on geometry (normals, depth and unique object ID), materials and lighting properties of the scene enter the regression problem as pixel weights. To ensure that only relevant features are considered, a rendering-based feature selection method is introduced, which uses a precomputed pixel-feature function encoding the per-pixel importance of each parametrized material.
The method of hyperimages is independent of the underlying rendering algorithm, while supporting full global illumination and surface interactions.
Our method is not limited to the parametrization of materials and can be extended to other scene properties. As an example, we show parametrization of the position of an area light source.
[29]
A. Mokarian Forooshani, “Deep Learning for Filling Blanks in Image Captions,” Universität des Saarlandes, Saarbrücken, 2016.
Export
BibTeX
@mastersthesis{MokarianForooshaniMaster2016,
TITLE = {Deep Learning for Filling Blanks in Image Captions},
AUTHOR = {Mokarian Forooshani, Ashkan},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2016},
DATE = {2016},
}
Endnote
%0 Thesis
%A Mokarian Forooshani, Ashkan
%Y Fritz, Mario
%A referee: Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Deep Learning for Filling Blanks in Image Captions :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002B-1FA3-7
%I Universität des Saarlandes
%C Saarbrücken
%D 2016
%P 66 p.
%V master
%9 master
[30]
R. Sethi, “Evaluation of Population-Based Haplotype Phasing Algorithms,” Universität des Saarlandes, Saarbrücken, 2016.
Abstract
The valuable information in the correct order of alleles on haplotypes has many applications in GWAS studies and population genetics. A considerable number of computational and statistical algorithms have been developed for haplotype phasing. Historically, these algorithms were compared using simulated population data with less dense markers, inspired by genotype data from the HapMap project. Currently, due to the advancement and cost reduction of NGS, thousands of individuals across the world have been sequenced in the 1000 Genomes Project. This has generated genotype information for individuals of different ethnicities along with much denser genetic variation. Here, we have developed a scalable approach to assess state-of-the-art population-based haplotype phasing algorithms with benchmark data designed by simulation of the population (unrelated and related individuals), the NGS pipeline and genotype calling. The most accurate algorithm for phase inference in unrelated individuals was MVNCall (v1), while the DuoHMM approach of Shapeit (v2) had the lowest switch error rate of 0.298% (with true genotype likelihoods) in related individuals. Moreover, we also conducted a comprehensive assessment of algorithms for the imputation of missing genotypes in the population with a reference panel. For this task, Impute2 (v2.3.2) and Beagle (v4.1) both performed competitively under different imputation scenarios and had genotype concordance rates of >99%. However, Impute2 was better at imputing genotypes with a minor allele frequency of <0.025 in the reference panel.
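For context on the switch error rate reported above, the following simplified, hypothetical sketch computes it as the fraction of consecutive heterozygous sites at which the predicted phase orientation flips relative to the truth; production evaluations handle missing genotypes and more refined definitions.
# Simplified switch-error-rate computation over one sample's heterozygous
# sites (illustration only; not the evaluation pipeline used in the thesis).
def switch_error_rate(true_phase, pred_phase):
    """Each argument is a list of (allele_hap1, allele_hap2) tuples, one per
    heterozygous site in genomic order, e.g. (0, 1) or (1, 0)."""
    # True where the prediction matches the truth as-is, False where it
    # matches the swapped orientation.
    orient = [p == t for t, p in zip(true_phase, pred_phase)]
    switches = sum(1 for a, b in zip(orient, orient[1:]) if a != b)
    return switches / max(len(orient) - 1, 1)

truth = [(0, 1), (1, 0), (0, 1), (0, 1)]
pred = [(0, 1), (0, 1), (1, 0), (0, 1)]   # flips after site 1, back after site 3
print(switch_error_rate(truth, pred))     # 2 switches / 3 pairs = 0.666...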
Export
BibTeX
@mastersthesis{SethiMaster2016,
TITLE = {Evaluation of Population-Based Haplotype Phasing Algorithms},
AUTHOR = {Sethi, Riccha},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2016},
DATE = {2016-03-09},
ABSTRACT = {The valuable information in the correct order of alleles on haplotypes has many applications in GWAS studies and population genetics. A considerable number of computational and statistical algorithms have been developed for haplotype phasing. Historically, these algorithms were compared using simulated population data with less dense markers, inspired by genotype data from the HapMap project. Currently, due to the advancement and cost reduction of NGS, thousands of individuals across the world have been sequenced in the 1000 Genomes Project. This has generated genotype information for individuals of different ethnicities along with much denser genetic variation. Here, we have developed a scalable approach to assess state-of-the-art population-based haplotype phasing algorithms with benchmark data designed by simulation of the population (unrelated and related individuals), the NGS pipeline and genotype calling. The most accurate algorithm for phase inference in unrelated individuals was MVNCall (v1), while the DuoHMM approach of Shapeit (v2) had the lowest switch error rate of 0.298% (with true genotype likelihoods) in related individuals. Moreover, we also conducted a comprehensive assessment of algorithms for the imputation of missing genotypes in the population with a reference panel. For this task, Impute2 (v2.3.2) and Beagle (v4.1) both performed competitively under different imputation scenarios and had genotype concordance rates of >99%. However, Impute2 was better at imputing genotypes with a minor allele frequency of <0.025 in the reference panel.},
}
Endnote
%0 Thesis
%A Sethi, Riccha
%Y Marschall, Tobias
%A referee: Pfeifer, Nico
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
%T Evaluation of Population-Based Haplotype Phasing Algorithms :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002C-41DA-7
%I Universität des Saarlandes
%C Saarbrücken
%D 2016
%8 09.03.2016
%P 76 p.
%V master
%9 master
%X The valuable information in the correct order of alleles on haplotypes has many applications in GWAS studies and population genetics. A considerable number of computational and statistical algorithms have been developed for haplotype phasing. Historically, these algorithms were compared using simulated population data with less dense markers, inspired by genotype data from the HapMap project. Currently, due to the advancement and cost reduction of NGS, thousands of individuals across the world have been sequenced in the 1000 Genomes Project. This has generated genotype information for individuals of different ethnicities along with much denser genetic variation. Here, we have developed a scalable approach to assess state-of-the-art population-based haplotype phasing algorithms with benchmark data designed by simulation of the population (unrelated and related individuals), the NGS pipeline and genotype calling. The most accurate algorithm for phase inference in unrelated individuals was MVNCall (v1), while the DuoHMM approach of Shapeit (v2) had the lowest switch error rate of 0.298% (with true genotype likelihoods) in related individuals. Moreover, we also conducted a comprehensive assessment of algorithms for the imputation of missing genotypes in the population with a reference panel. For this task, Impute2 (v2.3.2) and Beagle (v4.1) both performed competitively under different imputation scenarios and had genotype concordance rates of >99%. However, Impute2 was better at imputing genotypes with a minor allele frequency of <0.025 in the reference panel.
[31]
B. T. Teklehaimanot, “Virtualization of Video Streaming Functions,” Universität des Saarlandes, Saarbrücken, 2016.
Abstract
Edgeware is a leading provider of video streaming solutions to network and service operators. The Edgeware Video Consolidation Platform (VCP) is a complete video streaming solution consisting of the Convoy Management system and Orbit streaming servers. The Orbit streaming servers are purpose-designed hardware platforms composed of a dedicated hardware streaming engine and a purpose-designed flash storage system. The Orbit streaming server is an accelerated HTTP streaming cache server that provides up to 80 Gbps of bandwidth and can stream to 128,000 clients from a single rack unit. In line with the trend of moving more and more functionality into a virtualized or software environment, the main goal of this thesis is to compare the performance of Edgeware’s Orbit streaming server with one of the best generic HTTP accelerators (reverse proxy servers) after implementing the logging functionality of the Orbit on top of it. This is achieved by implementing test cases for the use cases that help to evaluate those servers. Finally, after evaluating the candidate proxy servers, Varnish is selected, and the modified Varnish is compared with Orbit to investigate the performance difference.
Export
BibTeX
@mastersthesis{TeklehaimanotMaster2016,
TITLE = {Virtualization of Video Streaming Functions},
AUTHOR = {Teklehaimanot, Birhan Tadele},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2016},
DATE = {2016-04-25},
ABSTRACT = {Edgeware is a leading provider of video streaming solutions to network and service operators. The Edgeware Video Consolidation Platform (VCP) is a complete video streaming solution consisting of the Convoy Management system and Orbit streaming servers. The Orbit streaming servers are purpose-designed hardware platforms composed of a dedicated hardware streaming engine and a purpose-designed flash storage system. The Orbit streaming server is an accelerated HTTP streaming cache server that provides up to 80 Gbps of bandwidth and can stream to 128,000 clients from a single rack unit. In line with the trend of moving more and more functionality into a virtualized or software environment, the main goal of this thesis is to compare the performance of Edgeware{\textquoteright}s Orbit streaming server with one of the best generic HTTP accelerators (reverse proxy servers) after implementing the logging functionality of the Orbit on top of it. This is achieved by implementing test cases for the use cases that help to evaluate those servers. Finally, after evaluating the candidate proxy servers, Varnish is selected, and the modified Varnish is compared with Orbit to investigate the performance difference.},
}
Endnote
%0 Thesis
%A Teklehaimanot, Birhan Tadele
%Y Appelquist, Göran
%A referee: Herfet, Thorsten
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
‎CTO at Edgeware AB
Universität des Saarlandes
%T Virtualization of Video Streaming Functions :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002C-570D-F
%I Universität des Saarlandes
%C Saarbrücken
%D 2016
%8 25.04.2016
%P 58 p.
%V master
%9 master
%X Edgeware is a leading provider of video streaming solutions to network and service operators. The Edgeware Video Consolidation Platform (VCP) is a complete video streaming solution consisting of the Convoy Management system and Orbit streaming servers. The Orbit streaming servers are purpose-designed hardware platforms composed of a dedicated hardware streaming engine and a purpose-designed flash storage system. The Orbit streaming server is an accelerated HTTP streaming cache server that provides up to 80 Gbps of bandwidth and can stream to 128,000 clients from a single rack unit. In line with the trend of moving more and more functionality into a virtualized or software environment, the main goal of this thesis is to compare the performance of Edgeware’s Orbit streaming server with one of the best generic HTTP accelerators (reverse proxy servers) after implementing the logging functionality of the Orbit on top of it. This is achieved by implementing test cases for the use cases that help to evaluate those servers. Finally, after evaluating the candidate proxy servers, Varnish is selected, and the modified Varnish is compared with Orbit to investigate the performance difference.
[32]
M. Zheng, “Comparison of Software Tools for microRNA Next Generation Sequencing Data Analysis,” Universität des Saarlandes, Saarbrücken, 2016.
Abstract
Next-generation sequencing (NGS) appears to be very promising for studying miRNAs comprehensively, as it can not only profile known miRNAs but also predict novel miRNAs. An increasing number of software tools have been developed for microRNA NGS data analysis. Nevertheless, an overall comparison of these tools is still rare, and how divergent these software tools are is still unknown, which makes it difficult for researchers to select an optimal tool. In our study, we performed a comprehensive comparison of seven representative software tools based on real data in various aspects, including detected known miRNAs, miRNA abundance, differential expression and predicted novel miRNAs. We presented the divergences and similarities of these tools and gave a basic evaluation of their performance. In addition, some extreme cases in miRNAkey were explored. The comparison of these tools suggests that their performance is very diverse and that caution is necessary when choosing a software tool. The summary of the tools’ features and the comparison of their performance in our study will provide useful information to help researchers select an appropriate software tool.
Export
BibTeX
@mastersthesis{ZhengMaster2016,
TITLE = {Comparison of Software Tools for {microRNA} Next Generation Sequencing Data Analysis},
AUTHOR = {Zheng, Menglin},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2016},
DATE = {2016-09-12},
ABSTRACT = {Next-generation sequencing (NGS) appears to be very promising for studying miRNAs comprehensively, as it can not only profile known miRNAs but also predict novel miRNAs. An increasing number of software tools have been developed for microRNA NGS data analysis. Nevertheless, an overall comparison of these tools is still rare, and how divergent these software tools are is still unknown, which makes it difficult for researchers to select an optimal tool. In our study, we performed a comprehensive comparison of seven representative software tools based on real data in various aspects, including detected known miRNAs, miRNA abundance, differential expression and predicted novel miRNAs. We presented the divergences and similarities of these tools and gave a basic evaluation of their performance. In addition, some extreme cases in miRNAkey were explored. The comparison of these tools suggests that their performance is very diverse and that caution is necessary when choosing a software tool. The summary of the tools{\textquoteright} features and the comparison of their performance in our study will provide useful information to help researchers select an appropriate software tool.},
}
Endnote
%0 Thesis
%A Zheng, Menglin
%Y Backes, Christina
%A referee: Keller, Andreas
%A referee: Meese, Eckart
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Clinical Bioinformatics, Saarland University
Clinical Bioinformatics, Saarland University
Institute of Human Genetics, Saarland University Homburg
%T Comparison of Software Tools for microRNA Next Generation Sequencing Data Analysis :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002C-570A-6
%I Universität des Saarlandes
%C Saarbrücken
%D 2016
%8 12.09.2016
%P 70 p.
%V master
%9 master
%X Next-generation sequencing (NGS) appears to be very promising for studying miRNAs comprehensively, as it can not only profile known miRNAs but also predict novel miRNAs. An increasing number of software tools have been developed for microRNA NGS data analysis. Nevertheless, an overall comparison of these tools is still rare, and how divergent these software tools are is still unknown, which makes it difficult for researchers to select an optimal tool. In our study, we performed a comprehensive comparison of seven representative software tools based on real data in various aspects, including detected known miRNAs, miRNA abundance, differential expression and predicted novel miRNAs. We presented the divergences and similarities of these tools and gave a basic evaluation of their performance. In addition, some extreme cases in miRNAkey were explored. The comparison of these tools suggests that their performance is very diverse and that caution is necessary when choosing a software tool. The summary of the tools’ features and the comparison of their performance in our study will provide useful information to help researchers select an appropriate software tool.
2015
[33]
G. Arvanitidis, “Robust Principal Component Analysis Based on the Trimmed Component Wise Reconstruction Error,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{Arvanitidis_Master2015,
TITLE = {Robust Principal Component Analysis Based on the Trimmed Component Wise Reconstruction Error},
AUTHOR = {Arvanitidis, Georgios},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Arvanitidis, Georgios
%Y Hein, Matthias
%A referee: Schiele, Bernt
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Robust Principal Component Analysis Based on the Trimmed Component Wise Reconstruction Error :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0029-C7D2-1
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%P 85 p.
%V master
%9 master
[34]
D. Dedik, “Robust Type Classification of Out of Knowledge Base Entities,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{DedikMaster2015,
TITLE = {Robust Type Classification of Out of Knowledge Base Entities},
AUTHOR = {Dedik, Darya},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Dedik, Darya
%Y Weikum, Gerhard
%A referee: Spaniol, Marc
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Robust Type Classification of Out of Knowledge Base Entities :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0026-C0EC-F
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%P 65 p.
%V master
%9 master
[35]
Ö. Erensoy, “Semantic Model Extraction from Semi-Structured Textual Resources,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{ErensoyMaster2015,
TITLE = {Semantic Model Extraction from Semi-Structured Textual Resources},
AUTHOR = {Erensoy, {\"O}zg{\"u}n},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Erensoy, Özgün
%Y Siekmann, Jörg
%A referee: Sosnovsky, Sergey
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Semantic Model Extraction from Semi-Structured Textual Resources :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0029-CC6C-1
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%P 52 p.
%V master
%9 master
[36]
M. Gad-Elrab, “AIDArabic+ Named Entity Disambiguation for Arabic Text,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{Gad-ElrabMaster2015,
TITLE = {{AIDArabic}+ Named Entity Disambiguation for Arabic Text},
AUTHOR = {Gad-Elrab, Mohamed},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Gad-Elrab, Mohamed
%Y Weikum, Gerhard
%A referee: Berberich, Klaus
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T AIDArabic+ Named Entity Disambiguation for Arabic Text :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002A-0F70-5
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%P 56 p.
%V master
%9 master
[37]
M. Goyal, “Lumping of Approximate Master Equations for Networks,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{GoyalMaster2015,
TITLE = {Lumping of Approximate Master Equations for Networks},
AUTHOR = {Goyal, Mayank},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Goyal, Mayank
%Y Bortolussi, Luca
%A referee: Wolf, Verena
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Lumping of Approximate Master Equations for Networks :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0029-BB01-9
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%P 75 p.
%V master
%9 master
[38]
C. D. Hariman, “Part-Whole Commonsense Knowledge Harvesting from the Web,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{HarimanMaster2015,
TITLE = {Part-Whole Commonsense Knowledge Harvesting from the Web},
AUTHOR = {Hariman, Charles Darwis},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Hariman, Charles Darwis
%Y Weikum, Gerhard
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Part-Whole Commonsense Knowledge Harvesting from the Web :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0026-C0E6-C
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%P 53 p.
%V master
%9 master
[39]
R. F. Hulea, “Compressed Vibration Modes for Deformable Objects,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{HuleaMaster2015,
TITLE = {Compressed Vibration Modes for Deformable Objects},
AUTHOR = {Hulea, Razvan Florin},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Hulea, Razvan Florin
%Y Hildebrandt, Klaus
%A referee: Seidel, Hans-Peter
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T Compressed Vibration Modes for Deformable Objects :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0029-2EAF-3
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%P 47 p.
%V master
%9 master
[40]
E. Kapcari, “Parallel vs. Traditional Faceted Browsing: Comparative Studies and Proposed Enhancements,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{KapcariMaster2015,
TITLE = {Parallel vs. Traditional Faceted Browsing: Comparative Studies and Proposed Enhancements},
AUTHOR = {Kapcari, Edite},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Kapcari, Edite
%Y Jameson, Anthony
%A referee: Krüger, Antonio
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Parallel vs. Traditional Faceted Browsing: Comparative Studies and Proposed Enhancements :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0029-BB2A-F
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%V master
%9 master
[41]
U. Mahmood, “Ensuring Integrity of Recommendations in a Marketplace,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{MahmoodMaster2015,
TITLE = {Ensuring Integrity of Recommendations in a Marketplace},
AUTHOR = {Mahmood, Uzair},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Mahmood, Uzair
%Y Kate, Aniket
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Group M. Backes, Max Planck Institute for Software Systems, Max Planck Society
%T Ensuring Integrity of Recommendations in a Marketplace :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002A-1545-D
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%P 58 p.
%V master
%9 master
[42]
P. Mandros, “Information Theoretic Supervised Feature Selection for Continuous Data,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{MandrosMaster2015,
TITLE = {Information Theoretic Supervised Feature Selection for Continuous Data},
AUTHOR = {Mandros, Panagiotis},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Mandros, Panagiotis
%Y Weikum, Gerhard
%A referee: Vreeken, Jilles
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Information Theoretic Supervised Feature Selection for Continuous Data :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0029-BAF3-F
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%P 67 p.
%V master
%9 master
[43]
P. E. Mercado Lopez, “Clustering and Community Detection in Signed Networks,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{MercadoLopezMaster2014,
TITLE = {Clustering and Community Detection in Signed Networks},
AUTHOR = {Mercado Lopez, Pedro Eduardo},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Mercado Lopez, Pedro Eduardo
%Y Hein, Matthias
%A referee: Andres, Bjoern
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Clustering and Community Detection in Signed Networks :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0029-BAFD-C
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%P 161 p.
%V master
%9 master
[44]
N. Q. Nguyen, “A Tensor Block Coordinate Ascent Framework for Hyper Graph Matching,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{NguyenMaster2015,
TITLE = {A Tensor Block Coordinate Ascent Framework for Hyper Graph Matching},
AUTHOR = {Nguyen, Ngoc Quynh},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Nguyen, Ngoc Quynh
%Y Hein, Matthias
%A referee: Weickert, Joachim
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T A Tensor Block Coordinate Ascent Framework for Hyper Graph Matching :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0029-BB10-7
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%P 78 p.
%V master
%9 master
[45]
T. Zinchenko, “Redescription Mining Over non-Binary Data Sets Using Decision Trees,” Universität des Saarlandes, Saarbrücken, 2015.
Export
BibTeX
@mastersthesis{ZinchenkoMaster2014,
TITLE = {Redescription Mining Over non-Binary Data Sets Using Decision Trees},
AUTHOR = {Zinchenko, Tetiana},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2015},
DATE = {2015},
}
Endnote
%0 Thesis
%A Zinchenko, Tetiana
%Y Miettinen, Pauli
%A referee: Weikum, Gerhard
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Redescription Mining Over non-Binary Data Sets Using Decision Trees :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-B73A-5
%I Universität des Saarlandes
%C Saarbrücken
%D 2015
%P X, 118 p.
%V master
%9 master
2014
[46]
H. A. S. Aslam, “Inter-Application Communication Testing of Android Applications Using Intent Fuzzing,” Universität des Saarlandes, Saarbrücken, 2014.
Abstract
Testing is a crucial stage in the software development process that is used to uncover bugs and potential security threats. If not conducted thoroughly, buggy software may cause erroneous, malicious and even harmful behavior. Unfortunately, in most software systems testing is either completely neglected or not conducted thoroughly. One such example is Google's popular mobile platform, Android OS, where inter-application communication is not properly tested. This is because of the development overhead it entails and the manual labour required of developers to set up the testing environment. Consequently, the lack of Android application testing continues to cause Android users to experience erroneous behavior and sudden crashes, impacting user experience and potentially resulting in financial losses. When a caller application attempts to communicate with a potentially buggy application, the caller application will suffer functional errors or may even crash. As a result, users will complain that the caller application does not provide the promised functionality, which devalues the application's user rating. Successive failures will no longer be considered isolated events, potentially crippling the credibility of the calling application's developer.
In this thesis we present an automated tester for inter-application communication in Android applications. The approach used for testing is called intent-based testing. Android applications are typically divided into multiple components that communicate via intents: messages passed through Android OS to coordinate operations between the different components. Intents are also used for inter-application communication, rendering them relevant for security. In this work, we designed and built a fully automated tool called IntentFuzzer to test the stability of inter-application communication of Android applications using intents. First, it statically analyzes the application to generate intents. Next, it tests the inter-application communication by fuzzing them, that is, by injecting random input values that uncover unwanted behavior. In this way, we are able to expose several new defects, including potential security issues, which we discuss briefly in the Evaluation section.
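To illustrate the general idea of intent fuzzing described above (a hypothetical sketch, not the IntentFuzzer tool from the thesis), the snippet below launches a target component with random string extras via adb; the package, component and extra key are made-up placeholders, and a real harness would also watch logcat for crashes.
# Hypothetical illustration of intent fuzzing via adb; not IntentFuzzer.
import random
import string
import subprocess

TARGET = "com.example.victim/.MainActivity"   # made-up component name

def random_string(n=32):
    # Restrict to letters and digits to avoid shell-quoting issues on the device.
    return "".join(random.choice(string.ascii_letters + string.digits) for _ in range(n))

def fuzz_once():
    payload = random_string()
    # 'am start -n <component> --es <key> <value>' starts the component with a
    # string extra; here the value is random and the key is a placeholder.
    cmd = ["adb", "shell", "am", "start", "-n", TARGET, "--es", "payload", payload]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, payload

for i in range(10):
    code, payload = fuzz_once()
    print(f"run {i}: exit={code} payload={payload!r}")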
Export
BibTeX
@mastersthesis{2014Aslam,
TITLE = {Inter-Application Communication Testing of Android Applications Using Intent Fuzzing},
AUTHOR = {Aslam, Hafiz Ahmad Shahzad},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2014},
DATE = {2014},
ABSTRACT = {Testing is a crucial stage in the software development process that is used to uncover bugs and potential security threats. If not conducted thoroughly, buggy software may cause erroneous, malicious and even harmful behavior. Unfortunately, in most software systems testing is either completely neglected or not conducted thoroughly. One such example is Google's popular mobile platform, Android OS, where inter-application communication is not properly tested. This is because of the development overhead it entails and the manual labour required of developers to set up the testing environment. Consequently, the lack of Android application testing continues to cause Android users to experience erroneous behavior and sudden crashes, impacting user experience and potentially resulting in financial losses. When a caller application attempts to communicate with a potentially buggy application, the caller application will suffer functional errors or may even crash. As a result, users will complain that the caller application does not provide the promised functionality, which devalues the application's user rating. Successive failures will no longer be considered isolated events, potentially crippling the credibility of the calling application's developer. In this thesis we present an automated tester for inter-application communication in Android applications. The approach used for testing is called intent-based testing. Android applications are typically divided into multiple components that communicate via intents: messages passed through Android OS to coordinate operations between the different components. Intents are also used for inter-application communication, rendering them relevant for security. In this work, we designed and built a fully automated tool called IntentFuzzer to test the stability of inter-application communication of Android applications using intents. First, it statically analyzes the application to generate intents. Next, it tests the inter-application communication by fuzzing them, that is, by injecting random input values that uncover unwanted behavior. In this way, we are able to expose several new defects, including potential security issues, which we discuss briefly in the Evaluation section.},
}
Endnote
%0 Thesis
%A Aslam, Hafiz Ahmad Shahzad
%Y Zeller, Andreas
%A referee: Hammer, Christian
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Inter-Application Communication Testing of Android Applications Using Intent Fuzzing :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0026-C91D-4
%I Universität des Saarlandes
%C Saarbrücken
%D 2014
%P 51 p.
%V master
%9 master
%X Testing is a crucial stage in the software development process that is used to uncover bugs and potential security threats. If not conducted thoroughly, buggy software may cause erroneous, malicious and even harmful behavior. Unfortunately, in most software systems testing is either completely neglected or not conducted thoroughly. One such example is Google's popular mobile platform, Android OS, where inter-application communication is not properly tested. This is because of the development overhead it entails and the manual labour required of developers to set up the testing environment. Consequently, the lack of Android application testing continues to cause Android users to experience erroneous behavior and sudden crashes, impacting user experience and potentially resulting in financial losses. When a caller application attempts to communicate with a potentially buggy application, the caller application will suffer functional errors or may even crash. As a result, users will complain that the caller application does not provide the promised functionality, which devalues the application's user rating. Successive failures will no longer be considered isolated events, potentially crippling the credibility of the calling application's developer.
In this thesis we present an automated tester for inter-application communication in Android applications. The approach used for testing is called intent-based testing. Android applications are typically divided into multiple components that communicate via intents: messages passed through Android OS to coordinate operations between the different components. Intents are also used for inter-application communication, rendering them relevant for security. In this work, we designed and built a fully automated tool called IntentFuzzer to test the stability of inter-application communication of Android applications using intents. First, it statically analyzes the application to generate intents. Next, it tests the inter-application communication by fuzzing them, that is, by injecting random input values that uncover unwanted behavior. In this way, we are able to expose several new defects, including potential security issues, which we discuss briefly in the Evaluation section.
[47]
I. Grishchenko, “Static Analysis of Android Applications,” Universität des Saarlandes, Saarbrücken, 2014.
Abstract
Mobile and portable devices are machines that users carry with them everywhere; they can be seen as constant personal assistants in modern life. Today, the Android operating system is the most popular mobile platform, and the number of users is still growing: as of September 2013, one billion devices had been activated [Goob]. This makes the Android market attractive for developers willing to provide new functionality. As a consequence, 48 billion applications ("apps") have been installed from the Google Play store [BBC].
Apps often require user data in order to perform the intended activity. At the same time, parts of this data can be treated as sensitive private information, for instance authentication credentials for accessing a bank account. The most significant built-in security measure in Android, the permission system, provides only little control over how an app uses the supplied data.
To mitigate the threat mentioned above, namely hidden unintended app activity, recent research goes in three main directions: inline reference monitoring modifies the app to make it safe according to user-defined restrictions, dynamic analysis monitors the app's execution in order to prevent undesired activity, and static analysis verifies properties of the app from its code prior to execution. As we want provable security guarantees before executing the app, we focus on static analysis. This thesis presents a novel static analysis technique based on Horn clause resolution. In particular, we propose a small-step concrete semantics for Android apps and develop a new form of abstraction that is supported by general-purpose theorem provers. Additionally, we have proved the soundness of our analysis technique.
We have developed a tool that takes the bytecode of an Android app and makes it accessible to the theorem prover. This enables the automated verification of a variety of security properties, for instance whether a certain functionality is preceded by a particular one (e.g., whether the output of a bank transaction is secured before it is sent to the bank), or which values it operates on (e.g., whether the bank's IP address is the only possible transaction destination).
A case study as well as a performance evaluation of our tool conclude this thesis.
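For readers unfamiliar with Horn clause resolution, the toy sketch below performs forward chaining over propositional Horn clauses to answer a reachability-style query; it only conveys the flavour of Horn-clause-based reasoning and has nothing to do with the thesis's Android semantics or abstraction, and the facts and rules are made up.
# Toy forward chaining over propositional Horn clauses ("body implies head").
def derive(facts, rules):
    """facts: set of atoms; rules: list of (body_atoms, head_atom) pairs."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and all(b in derived for b in body):
                derived.add(head)       # the rule fires, deriving its head
                changed = True
    return derived

facts = {"reads_credentials", "opens_socket"}
rules = [
    ({"reads_credentials", "opens_socket"}, "may_send_credentials"),
    ({"may_send_credentials"}, "policy_violation"),
]
print("policy_violation" in derive(facts, rules))   # True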
Export
BibTeX
@mastersthesis{2014Grishchenko,
TITLE = {Static Analysis of Android Applications},
AUTHOR = {Grishchenko, Ilya},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2014},
DATE = {2014},
ABSTRACT = {Mobile and portable devices are machines that users carry with them everywhere; they can be seen as constant personal assistants in modern life. Today, the Android operating system is the most popular mobile platform, and the number of users is still growing: as of September 2013, one billion devices had been activated [Goob]. This makes the Android market attractive for developers willing to provide new functionality. As a consequence, 48 billion applications ("apps") have been installed from the Google Play store [BBC]. Apps often require user data in order to perform the intended activity. At the same time, parts of this data can be treated as sensitive private information, for instance authentication credentials for accessing a bank account. The most significant built-in security measure in Android, the permission system, provides only little control over how an app uses the supplied data. To mitigate the threat mentioned above, namely hidden unintended app activity, recent research goes in three main directions: inline reference monitoring modifies the app to make it safe according to user-defined restrictions, dynamic analysis monitors the app's execution in order to prevent undesired activity, and static analysis verifies properties of the app from its code prior to execution. As we want provable security guarantees before executing the app, we focus on static analysis. This thesis presents a novel static analysis technique based on Horn clause resolution. In particular, we propose a small-step concrete semantics for Android apps and develop a new form of abstraction that is supported by general-purpose theorem provers. Additionally, we have proved the soundness of our analysis technique. We have developed a tool that takes the bytecode of an Android app and makes it accessible to the theorem prover. This enables the automated verification of a variety of security properties, for instance whether a certain functionality is preceded by a particular one (e.g., whether the output of a bank transaction is secured before it is sent to the bank), or which values it operates on (e.g., whether the bank's IP address is the only possible transaction destination). A case study as well as a performance evaluation of our tool conclude this thesis.},
}
Endnote
%0 Thesis
%A Grishchenko, Ilya
%Y Maffei, Matteo
%A referee: Hammer, Christian
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Cluster of Excellence Multimodal Computing and Interaction
External Organizations
%T Static Analysis of Android Applications :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0026-C962-6
%I Universität des Saarlandes
%C Saarbrücken
%D 2014
%P 96 p.
%V master
%9 master
%X Mobile and portable devices are machines that users carry with them everywhere; they can be seen as constant personal assistants in modern life. Today, the Android operating system is the most popular mobile platform, and the number of users is still growing: as of September 2013, one billion devices had been activated [Goob]. This makes the Android market attractive for developers willing to provide new functionality. As a consequence, 48 billion applications ("apps") have been installed from the Google Play store [BBC].
Apps often require user data in order to perform the intended activity. At the same time, parts of this data can be treated as sensitive private information, for instance authentication credentials for accessing a bank account. The most significant built-in security measure in Android, the permission system, provides only little control over how an app uses the supplied data.
To mitigate the threat mentioned above, namely hidden unintended app activity, recent research goes in three main directions: inline reference monitoring modifies the app to make it safe according to user-defined restrictions, dynamic analysis monitors the app's execution in order to prevent undesired activity, and static analysis verifies properties of the app from its code prior to execution. As we want provable security guarantees before executing the app, we focus on static analysis. This thesis presents a novel static analysis technique based on Horn clause resolution. In particular, we propose a small-step concrete semantics for Android apps and develop a new form of abstraction that is supported by general-purpose theorem provers. Additionally, we have proved the soundness of our analysis technique.
We have developed a tool that takes the bytecode of an Android app and makes it accessible to the theorem prover. This enables the automated verification of a variety of security properties, for instance whether a certain functionality is preceded by a particular one (e.g., whether the output of a bank transaction is secured before it is sent to the bank), or which values it operates on (e.g., whether the bank's IP address is the only possible transaction destination).
A case study as well as a performance evaluation of our tool conclude this thesis.
[48]
A. Khoreva, “Video Segmentation with Graph Cuts,” Universität des Saarlandes, Saarbrücken, 2014.
Export
BibTeX
@mastersthesis{KhorevaMaster,
TITLE = {Video Segmentation with Graph Cuts},
AUTHOR = {Khoreva, Anna},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2014},
DATE = {2014},
}
Endnote
%0 Thesis
%A Khoreva, Anna
%Y Schiele, Bernt
%A referee: Hein, Matthias
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
%T Video Segmentation with Graph Cuts :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-5562-4
%I Universität des Saarlandes
%C Saarbrücken
%D 2014
%P X, 71 p.
%V master
%9 master
[49]
M. Omran, “Pedestrian Detection Meets Stuff,” Universität des Saarlandes, Saarbrücken, 2014.
Export
BibTeX
@mastersthesis{OmranMaster,
TITLE = {Pedestrian Detection Meets Stuff},
AUTHOR = {Omran, Mohamed},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2014},
DATE = {2014},
}
Endnote
%0 Thesis
%A Omran, Mohamed
%Y Benenson, Rodrigo
%A referee: Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Pedestrian Detection Meets Stuff :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-5477-D
%I Universität des Saarlandes
%C Saarbrücken
%D 2014
%P VII, 75 p.
%V master
%9 master
[50]
J. Tiab, “Design and Evaluation Techniques for Cuttable Multi-touch Sensor Sheets,” Universität des Saarlandes, Saarbrücken, 2014.
Export
BibTeX
@mastersthesis{TiabMastersThesis2014,
TITLE = {Design and Evaluation Techniques for Cuttable Multi-touch Sensor Sheets},
AUTHOR = {Tiab, John},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2014},
DATE = {2014},
}
Endnote
%0 Thesis
%A Tiab, John
%Y Steimle, Jürgen
%A referee: Krüger, Antonio
%+ Computer Graphics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
External Organizations
%T Design and Evaluation Techniques for Cuttable Multi-touch Sensor Sheets :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-5D8E-D
%I Universität des Saarlandes
%C Saarbrücken
%D 2014
%V master
%9 master
[51]
M. P. Vidriales Escobar, “Integration of Direct- and Feature-Based Methods for Correspondences Refinement in Structure-from-Motion,” Universität des Saarlandes, Saarbrücken, 2014.
Export
BibTeX
@mastersthesis{VidrialesEscobarMaster2014,
TITLE = {Integration of Direct- and Feature-Based Methods for Correspondences Refinement in Structure-from-Motion},
AUTHOR = {Vidriales Escobar, M{\'o}nica Paola},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2014},
DATE = {2014},
}
Endnote
%0 Thesis
%A Vidriales Escobar, Mónica Paola
%Y Herfet, Thorsten
%A referee: Slusallek, Philipp
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Integration of Direct- and Feature-Based Methods for Correspondences Refinement in Structure-from-Motion :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0029-CC96-3
%I Universität des Saarlandes
%C Saarbrücken
%D 2014
%P 73 p.
%V master
%9 master
2013
[52]
E. Afsari Yeganeh, “Human Motion Alignment Using a Depth Camera,” Universität des Saarlandes, Saarbrücken, 2013.
Export
BibTeX
@mastersthesis{Master2013:Elham,
TITLE = {Human Motion Alignment Using a Depth Camera},
AUTHOR = {Afsari Yeganeh, Elham},
LANGUAGE = {eng},
LOCALID = {Local-ID: 9D35149C077B583BC1257BA20027DAB1-Master2013:Elham},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2013},
DATE = {2013},
}
Endnote
%0 Thesis
%A Afsari Yeganeh, Elham
%Y Theobalt, Christian
%A referee: Oulasvirta, Antti
%+ Computer Graphics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T Human Motion Alignment Using a Depth Camera :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0015-1740-2
%F OTHER: Local-ID: 9D35149C077B583BC1257BA20027DAB1-Master2013:Elham
%I Universität des Saarlandes
%C Saarbrücken
%D 2013
%V master
%9 master
[53]
R. Belet, “Leveraging Independence and Locality for Random Forests in a Distributed Environment,” Universität des Saarlandes, Saarbrücken, 2013.
Abstract
With the emergence of big data, inducing regression trees on very large data
sets became a common data mining task. Even though computing ensembles of
classification/regression trees with centralized algorithms is a well-studied
machine learning/data mining problem, their distributed versions still raise
scalability, efficiency and accuracy issues.
Most state-of-the-art tree learning algorithms require data to reside in memory
on a single machine.
Adopting this approach for trees on big data is not feasible as the limited
resources provided by only one machine lead to scalability problems. While more
scalable implementations of tree learning algorithms have been proposed, they
typically require specialized parallel computing architectures rendering those
algorithms complex and error-prone.
In this thesis we introduce two approaches to computing ensembles of
regression trees on very large training data sets using the MapReduce framework
as an underlying tool. The first approach employs the entire MapReduce cluster
to learn tree ensembles in a fully parallel and distributed fashion. The second approach
exploits locality and independence in the tree learning process.
Export
BibTeX
@mastersthesis{Belet2013,
TITLE = {Leveraging Independence and Locality for Random Forests in a Distributed Environment},
AUTHOR = {Belet, Razvan},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2013},
DATE = {2013},
ABSTRACT = {With the emergence of big data, inducing regression trees on very large data sets became a common data mining task. Even though computing ensembles of classification/regression trees with centralized algorithms is a well-studied machine learning/data mining problem, their distributed versions still raise scalability, efficiency and accuracy issues. Most state-of-the-art tree learning algorithms require data to reside in memory on a single machine. Adopting this approach for trees on big data is not feasible as the limited resources provided by only one machine lead to scalability problems. While more scalable implementations of tree learning algorithms have been proposed, they typically require specialized parallel computing architectures rendering those algorithms complex and error-prone. In this thesis we introduce two approaches to computing ensembles of regression trees on very large training data sets using the MapReduce framework as an underlying tool. The first approach employs the entire MapReduce cluster to learn tree ensembles in a fully parallel and distributed fashion. The second approach exploits locality and independence in the tree learning process.},
}
Endnote
%0 Thesis
%A Belet, Razvan
%Y Weikum, Gerhard
%A referee: Schenkel, Ralf
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Leveraging Independence and Locality for Random Forests in a Distributed Environment :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-97B8-0
%I Universität des Saarlandes
%C Saarbrücken
%D 2013
%P 132 p.
%V master
%9 master
%X With the emergence of big data, inducing regression trees on very large data
sets became a common data mining task. Even though computing ensembles of
classification/regression trees with centralized algorithms is a well-studied
machine learning/data mining problem, their distributed versions still raise
scalability, efficiency and accuracy issues.
Most state-of-the-art tree learning algorithms require data to reside in memory
on a single machine.
Adopting this approach for trees on big data is not feasible as the limited
resources provided by only one machine lead to scalability problems. While more
scalable implementations of tree learning algorithms have been proposed, they
typically require specialized parallel computing architectures rendering those
algorithms complex and error-prone.
In this thesis we introduce two approaches to computing ensembles of
regression trees on very large training data sets using the MapReduce framework
as an underlying tool. The first approach employs the entire MapReduce cluster
to learn tree ensembles in a fully parallel and distributed fashion. The second approach
exploits locality and independence in the tree learning process.
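The locality idea in the abstract above can be illustrated with a very small Python sketch (my own toy code, not the thesis implementation): each "mapper" trains one ensemble member on its local data partition only, and the "reducer" merely collects the members and averages their predictions. A depth-1 regression stump stands in for a full regression tree.

import random

def fit_stump(chunk):
    """Fit a one-split regression stump on [(x, y), ...] with scalar x."""
    xs = sorted(x for x, _ in chunk)
    best = None
    for t in xs:
        left = [y for x, y in chunk if x <= t]
        right = [y for x, y in chunk if x > t]
        if not left or not right:
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for x, y in chunk if x <= t) + \
              sum((y - mr) ** 2 for x, y in chunk if x > t)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr

def map_phase(partitions):      # each "mapper" sees only its local chunk
    return [fit_stump(chunk) for chunk in partitions]

def reduce_phase(models):       # the "reducer" just collects the ensemble
    return lambda x: sum(m(x) for m in models) / len(models)

random.seed(0)
data = [(x, 2.0 * x + random.gauss(0, 0.1)) for x in [i / 50 for i in range(100)]]
partitions = [data[i::4] for i in range(4)]     # 4 disjoint partitions
forest = reduce_phase(map_phase(partitions))
print(round(forest(0.5), 2))                    # roughly 1.0 for y = 2x

Because a mapper never needs data outside its own chunk, the training step scales with the partition size rather than with the full data set, which is the property the thesis exploits.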
[54]
A. Boldyrev, “Dictionary-based Named Entity Recognition,” Universität des Saarlandes, Saarbrücken, 2013.
Export
BibTeX
@mastersthesis{BoldyrevMastersThesis2013,
TITLE = {Dictionary-based Named Entity Recognition},
AUTHOR = {Boldyrev, Artem},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2013},
DATE = {2013},
}
Endnote
%0 Thesis
%A Boldyrev, Artem
%Y Weikum, Gerhard
%A referee: Theobalt, Christian
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T Dictionary-based Named Entity Recognition :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-5C74-F
%I Universität des Saarlandes
%C Saarbrücken
%D 2013
%V master
%9 master
[55]
L. Dinu, “Randomized Median-of-Three Trees,” Universität des Saarlandes, Saarbrücken, 2013.
Abstract
This thesis introduces a new type of randomized search trees based on the
median-of-three improvement for quicksort (M3 quicksort). We consider the set
of trees obtained by running M3 quicksort. This thesis shows how to obtain them
by a slightly changed insertion procedure for binary search trees. Furthermore,
if the input is random, it generates the same probability distribution as M3
quicksort and consequently accesses in the tree are faster than for randomized
search trees. In order to maintain randomness for any type of input sequence,
we introduce the concept of support nodes, which define a path covering of the
tree. With their help, and by storing the subtree size at each node, random
updates take O(log n). If instead of subtree sizes, each node stores a random
priority, updates take O(log² n). Experiments show that while accesses are
indeed faster, update times are, however, too long for the method to be
competitive.
Export
BibTeX
@mastersthesis{Dinu2013,
TITLE = {Randomized Median-of-Three Trees},
AUTHOR = {Dinu, Lavinia},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2013},
DATE = {2013},
ABSTRACT = {This thesis introduces a new type of randomized search trees based on the median-of-three improvement for quicksort (M3 quicksort). We consider the set of trees obtained by running M3 quicksort. This thesis shows how to obtain them by a slightly changed insertion procedure for binary search trees. Furthermore, if the input is random, it generates the same probability distribution as M3 quicksort and consequently accesses in the tree are faster than for randomized search trees. In order to maintain randomness for any type of input sequence, we introduce the concept of support nodes, which define a path covering of the tree. With their help, and by storing the subtree size at each node, random updates take O(log n). If instead of subtree sizes, each node stores a random priority, updates take O(log$^2$ n). Experiments show that while accesses are indeed faster, update times are, however, too long for the method to be competitive.},
}
Endnote
%0 Thesis
%A Dinu, Lavinia
%Y Seidel, Raimund
%A referee: Bläser, Markus
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Randomized Median-of-Three Trees :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0026-CB5B-C
%I Universität des Saarlandes
%C Saarbrücken
%D 2013
%V master
%9 master
%X This thesis introduces a new type of randomized search trees based on the
median-of-three improvement for quicksort (M3 quicksort). We consider the set
of trees obtained by running M3 quicksort. This thesis shows how to obtain them
by a slightly changed insertion procedure for binary search trees. Furthermore,
if the input is random, it generates the same probability distribution as M3
quicksort and consequently accesses in the tree are faster than for randomized
search trees. In order to maintain randomness for any type of input sequence,
we introduce the concept of support nodes, which define a path covering of the
tree. With their help, and by storing the subtree size at each node, random
updates take O(log n). If instead of subtree sizes, each node stores a random
priority, updates take O(log² n). Experiments show that while accesses are
indeed faster, update times are, however, too long for the method to be
competitive.
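As a small illustration of the data structure sketched above, here is Python code (assumptions mine, not code from the thesis) that builds the recursion tree of median-of-three quicksort, stores the subtree size in every node, and uses those sizes to answer rank queries; the thesis shows how to obtain trees with this distribution via a modified binary-search-tree insertion and how to maintain them under updates.

import random

def median_of_three(a, b, c):
    return sorted([a, b, c])[1]

def m3_tree(items):
    """Recursion tree of M3 quicksort over a list of distinct keys."""
    if not items:
        return None
    if len(items) < 3:
        pivot = items[0]
    else:
        pivot = median_of_three(items[0], items[len(items) // 2], items[-1])
    left = [x for x in items if x < pivot]
    right = [x for x in items if x > pivot]
    return {"key": pivot, "size": len(items),
            "left": m3_tree(left), "right": m3_tree(right)}

def rank(node, key):
    """1-based rank of `key` among the stored keys, using subtree sizes."""
    if node is None:
        return 0
    left_size = node["left"]["size"] if node["left"] else 0
    if key == node["key"]:
        return left_size + 1
    if key < node["key"]:
        return rank(node["left"], key)
    return left_size + 1 + rank(node["right"], key)

random.seed(1)
keys = random.sample(range(1000), 100)
tree = m3_tree(keys)
assert rank(tree, sorted(keys)[9]) == 10   # the 10th-smallest key has rank 10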
[56]
M. Eghbali, “Facial Performance Capture Using a Single Kinect Camera,” Universität des Saarlandes, Saarbrücken, 2013.
Export
BibTeX
@mastersthesis{2013Master:Eghbali,
TITLE = {Facial Performance Capture Using a Single Kinect Camera},
AUTHOR = {Eghbali, Mandana},
LANGUAGE = {eng},
LOCALID = {Local-ID: 65B027251CF9FF39C1257BEF002B2460-2013Master:Eghbali},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2013},
DATE = {2013},
}
Endnote
%0 Thesis
%A Eghbali, Mandana
%Y Theobalt, Christian
%A referee: Seidel, Hans-Peter
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T Facial Performance Capture Using a Single Kinect Camera :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0015-3D53-E
%F OTHER: Local-ID: 65B027251CF9FF39C1257BEF002B2460-2013Master:Eghbali
%I Universität des Saarlandes
%C Saarbrücken
%D 2013
%V master
%9 master
[57]
E. Ilieva, “Analyzing and Creating Top-k Entity Rankings,” Universität des Saarlandes, Saarbrücken, 2013.
Export
BibTeX
@mastersthesis{Ilieva2013,
TITLE = {Analyzing and Creating Top-k Entity Rankings},
AUTHOR = {Ilieva, Evica},
LANGUAGE = {eng},
LOCALID = {Local-ID: DDA2710C9D0C5B92C1257BF00027BC81-Ilieva2013z},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2013},
DATE = {2013},
}
Endnote
%0 Thesis
%A Ilieva, Evica
%Y Michel, Sebastian
%A referee: Weikum, Gerhard
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Analyzing and Creating Top-k Entity Rankings :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0015-1AC1-8
%F OTHER: Local-ID: DDA2710C9D0C5B92C1257BF00027BC81-Ilieva2013z
%I Universität des Saarlandes
%C Saarbrücken
%D 2013
%P 67 p.
%V master
%9 master
[58]
P. Kolev, “Community Analysis Using Local Random Walks,” Universität des Saarlandes, Saarbrücken, 2013.
Abstract
The problem of graph clustering is a central optimization problem with various
applications in numerous fields including computational biology, machine
learning, computer vision, data mining, social network analysis, VLSI design
and many more. Essentially, clustering refers to grouping objects with similar
properties in the same cluster. Designing an appropriate similarity measure is
currently a state-of-the-art process and it is highly dependent on the
underlying application. Generally speaking, the problem of graph clustering
asks to find subsets of vertices that are well-connected inside and sparsely
connected outside.
Motivated by large-scale graph clustering, we investigate local algorithms,
based on random walks, that find a set of vertices near a given starting vertex
with good worst case approximation guarantees. The running time of these
algorithms is nearly linear in the size of the output set and is independent of
the size of the whole graph. This feature makes them perfect subroutines in the
design of efficient parallel algorithms for graph clustering.
Export
BibTeX
@mastersthesis{Kolev2013,
TITLE = {Community Analysis Using Local Random Walks},
AUTHOR = {Kolev, Pavel},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2013},
DATE = {2013},
ABSTRACT = {The problem of graph clustering is a central optimization problem with various applications in numerous fields including computational biology, machine learning, computer vision, data mining, social network analysis, VLSI design and many more. Essentially, clustering refers to grouping objects with similar properties in the same cluster. Designing an appropriate similarity measure is currently a state-of-the-art process and it is highly dependent on the underlying application. Generally speaking, the problem of graph clustering asks to find subsets of vertices that are well-connected inside and sparsely connected outside. Motivated by large-scale graph clustering, we investigate local algorithms, based on random walks, that find a set of vertices near a given starting vertex with good worst case approximation guarantees. The running time of these algorithms is nearly linear in the size of the output set and is independent of the size of the whole graph. This feature makes them perfect subroutines in the design of efficient parallel algorithms for graph clustering.},
}
Endnote
%0 Thesis
%A Kolev, Pavel
%Y Mehlhorn, Kurt
%A referee: Sun, He
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Community Analysis Using Local Random Walks :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-97AE-6
%I Universität des Saarlandes
%C Saarbrücken
%D 2013
%V master
%9 master
%X The problem of graph clustering is a central optimization problem with various
applications in numerous fields including computational biology, machine
learning, computer vision, data mining, social network analysis, VLSI design
and many more. Essentially, clustering refers to grouping objects with similar
properties in the same cluster. Designing an appropriate similarity measure is
currently a state-of-the-art process and it is highly dependent on the
underlying application. Generally speaking, the problem of graph clustering
asks to find subsets of vertices that are well-connected inside and sparsely
connected outside.
Motivated by large-scale graph clustering, we investigate local algorithms,
based on random walks, that find a set of vertices near a given starting vertex
with good worst case approximation guarantees. The running time of these
algorithms is nearly linear in the size of the output set and is independent of
the size of the whole graph. This feature makes them perfect subroutines in the
design of efficient parallel algorithms for graph clustering.
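The following Python sketch (my own toy graph and simplifications, not the algorithms analysed in the thesis) shows the basic local-clustering pattern the abstract refers to: run a short lazy random walk from a seed vertex, order vertices by degree-normalised probability mass, and return the prefix of that order with the lowest conductance.

def lazy_walk(adj, seed, steps=4):
    """Probability mass of a lazy random walk started at `seed`."""
    p = {seed: 1.0}
    for _ in range(steps):
        q = {}
        for v, mass in p.items():
            q[v] = q.get(v, 0.0) + 0.5 * mass          # stay with prob. 1/2
            share = 0.5 * mass / len(adj[v])
            for u in adj[v]:
                q[u] = q.get(u, 0.0) + share
        p = q
    return p

def conductance(adj, S):
    S = set(S)
    vol = sum(len(adj[v]) for v in S)
    vol_rest = sum(len(adj[v]) for v in adj) - vol
    cut = sum(1 for v in S for u in adj[v] if u not in S)
    return cut / max(1, min(vol, vol_rest))

def sweep(adj, p):
    """Best-conductance prefix of the vertices ordered by p(v)/deg(v)."""
    order = sorted(p, key=lambda v: p[v] / len(adj[v]), reverse=True)
    best, best_phi = None, float("inf")
    for i in range(1, len(order)):
        phi = conductance(adj, order[:i])
        if phi < best_phi:
            best, best_phi = order[:i], phi
    return best, best_phi

# Two triangles joined by a single bridge edge: {1,2,3} -- {4,5,6}.
adj = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2], 4: [2, 5, 6], 5: [4, 6], 6: [4, 5]}
cluster, phi = sweep(adj, lazy_walk(adj, seed=1))
print(sorted(cluster), round(phi, 3))   # [1, 2, 3] 0.143 -- the seed's triangle

Because the walk only ever touches vertices near the seed, the work is governed by the size of the output cluster rather than by the whole graph, which is exactly the locality property highlighted above.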
[59]
E. Levinkov, “Scene Segmentation in Adverse Vision Conditions,” Universität des Saarlandes, Saarbrücken, 2013.
Export
BibTeX
@mastersthesis{LevinkovMaster2013,
TITLE = {Scene Segmentation in Adverse Vision Conditions},
AUTHOR = {Levinkov, Evgeny},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2013},
DATE = {2013},
}
Endnote
%0 Thesis
%A Levinkov, Evgeny
%Y Schiele, Bernt
%A referee: Fritz, Mario
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Scene Segmentation in Adverse Vision Conditions :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-3705-4
%I Universität des Saarlandes
%C Saarbrücken
%D 2013
%P XVII, 81 p.
%V master
%9 master
[60]
V. Mukha, “Real-time Display Reconfiguration within Multi-display Environments,” Universität des Saarlandes, Saarbrücken, 2013.
Abstract
Multi-display environments (MDEs) of all kinds are used a lot nowadays. A
wide variety of devices helps to build a common display space. TVs, monitors,
projected surfaces, phones, tablets, everything that has the ability to display
visual information can be incorporated in multi-display environments.
While the main research emphasis so far has been on interaction techniques
and user experience within different MDEs, some research topics are dealing
with static and dynamic display reconfiguration. In fact, several studies
already work with MDEs that are capable of display reconfiguration on-the-fly.
Different frameworks can perform splitting, streaming and rendering of visual
data on large-scale displays with the ability of dynamic display
reconfiguration to calibrate multiple-projectors or to combine different
heterogeneous displays into one display wall dynamically. However, all of these
frameworks require different approaches
for display reconfiguration. Our goal is to create a model for display
reconfiguration which will be abstract, transparent, will work in real-time,
and will be easily deployable in any MDE.
In this work we present an extension to a software framework called Display as
a Service (DaaS). This extension is represented as a model for real-time
display reconfiguration using DaaS. The DaaS framework allows for generic and
transparent management of pixel-transport assuming only a network connection,
providing a simple high-level implementation for pixel-producing and
pixel-displaying applications. The main limitation of this approach is a
certain delay between pixel generation and display. However, the video encoding
and network transport are subject to improvements which will solve the problem
in the future.
As a proof of concept, we demonstrate three usage scenarios: manual dynamic
display reconfiguration, automatic display calibration, and real-time display
tracking. We also present a new algorithm for precise display calibration using
markers and a handheld camera. The calibration results are evaluated using
different tracking libraries. The additional precise calibration part for our
proposed algorithm makes the calibration accuracy several times better compared
to a naive approach.
Export
BibTeX
@mastersthesis{Mukha2013,
TITLE = {Real-time Display Reconfiguration within Multi-display Environments},
AUTHOR = {Mukha, Victor},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2013},
DATE = {2013},
ABSTRACT = {Multi-display environments (MDEs) of all kinds are used a lot nowadays. A wide variety of devices helps to build a common display space. TVs, monitors, projected surfaces, phones, tablets, everything that has the ability to display visual information can be incorporated in multi-display environments. While the main research emphasis so far has been on interaction techniques and user experience within different MDEs, some research topics are dealing with static and dynamic display reconfiguration. In fact, several studies already work with MDEs that are capable of display reconfiguration on-the-fly. Different frameworks can perform splitting, streaming and rendering of visual data on large-scale displays with the ability of dynamic display reconfiguration to calibrate multiple-projectors or to combine different heterogeneous displays into one display wall dynamically. However, all of these frameworks require different approaches for display reconfiguration. Our goal is to create a model for display reconfiguration which will be abstract, transparent, will work in real-time, and will be easily deployable in any MDE. In this work we present an extension to a software framework called Display as a Service (DaaS). This extension is represented as a model for real-time display reconfiguration using DaaS. The DaaS framework allows for generic and transparent management of pixel-transport assuming only a network connection, providing a simple high-level implementation for pixel-producing and pixel-displaying applications. The main limitation of this approach is a certain delay between pixel generation and display. However, the video encoding and network transport are subject of improvements which will solve the problem in the future. As a proof of concept, we demonstrate three usage scenarios: manual dynamic display reconfiguration, automatic display calibration, and real-time display tracking. We also present a new algorithm for precise display calibration using markers and a handheld camera. The calibration results are evaluated using different tracking libraries. The additional precise calibration part for our proposed algorithm makes the calibration accuracy several times better compared to a naive approach.},
}
Endnote
%0 Thesis
%A Mukha, Victor
%Y Slusallek, Philipp
%A referee: Steimle, Jürgen
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Computer Graphics, MPI for Informatics, Max Planck Society
%T Real-time Display Reconfiguration within Multi-display Environments :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0028-1522-8
%I Universität des Saarlandes
%C Saarbrücken
%D 2013
%V master
%9 master
%X Multi-display environments (MDEs) of all kinds are used a lot nowadays. A
wide variety of devices helps to build a common display space. TVs, monitors,
projected surfaces, phones, tablets, everything that has the ability to display
visual information can be incorporated in multi-display environments.
While the main research emphasis so far has been on interaction techniques
and user experience within different MDEs, some research topics are dealing
with static and dynamic display reconfiguration. In fact, several studies
already work with MDEs that are capable of display reconfiguration on-the-fly.
Different frameworks can perform splitting, streaming and rendering of visual
data on large-scale displays with the ability of dynamic display
reconfiguration to calibrate multiple-projectors or to combine different
heterogeneous displays into one display wall dynamically. However, all of these
frameworks require different approaches
for display reconfiguration. Our goal is to create a model for display
reconfiguration which will be abstract, transparent, will work in real-time,
and will be easily deployable in any MDE.
In this work we present an extension to a software framework called Display as
a Service (DaaS). This extension is represented as a model for real-time
display reconfiguration using DaaS. The DaaS framework allows for generic and
transparent management of pixel-transport assuming only a network connection,
providing a simple high-level implementation for pixel-producing and
pixel-displaying applications. The main limitation of this approach is a
certain delay between pixel generation and display. However, the video encoding
and network transport are subject to improvements which will solve the problem
in the future.
As a proof of concept, we demonstrate three usage scenarios: manual dynamic
display reconfiguration, automatic display calibration, and real-time display
tracking. We also present a new algorithm for precise display calibration using
markers and a handheld camera. The calibration results are evaluated using
different tracking libraries. The additional precise calibration part for our
proposed algorithm makes the calibration accuracy several times better compared
to a naive approach.
[61]
A. Podosinnikova, “Robust Principal Component Analysis as a Nonlinear Eigenproblem,” Universität des Saarlandes, Saarbrücken, 2013.
Abstract
Principal Component Analysis (PCA) is a widely used tool for, e.g., exploratory
data analysis, dimensionality reduction and clustering. However, it is well
known that PCA is strongly affected by the presence of outliers and, thus, is
vulnerable to both gross measurement error and adversarial manipulation of the
data. This phenomenon motivates the development of robust PCA as the problem of
recovering the principal components of the uncontaminated data.
In this thesis, we propose two new algorithms, QRPCA and MDRPCA, for robust PCA
components based on the projection-pursuit approach of Huber. While the
resulting optimization problems are non-convex and non-smooth, we show that
they can be efficiently minimized via the RatioDCA using bundle
methods/accelerated proximal methods for the interior problem. The key
ingredient for the most promising algorithm (QRPCA) is a robust, location
invariant scale measure with breakdown point 0.5. Extensive experiments show
that our QRPCA is competitive with current state-of-the-art methods and
outperforms other methods in particular for a large number of outliers.
Export
BibTeX
@mastersthesis{Podosinnikova2013,
TITLE = {Robust Principal Component Analysis as a Nonlinear Eigenproblem},
AUTHOR = {Podosinnikova, Anastasia},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2013},
DATE = {2013},
ABSTRACT = {Principal Component Analysis (PCA) is a widely used tool for, e.g., exploratory data analysis, dimensionality reduction and clustering. However, it is well known that PCA is strongly affected by the presence of outliers and, thus, is vulnerable to both gross measurement error and adversarial manipulation of the data. This phenomenon motivates the development of robust PCA as the problem of recovering the principal components of the uncontaminated data. In this thesis, we propose two new algorithms, QRPCA and MDRPCA, for robust PCA components based on the projection-pursuit approach of Huber. While the resulting optimization problems are non-convex and non-smooth, we show that they can be efficiently minimized via the RatioDCA using bundle methods/accelerated proximal methods for the interior problem. The key ingredient for the most promising algorithm (QRPCA) is a robust, location invariant scale measure with breakdown point 0.5. Extensive experiments show that our QRPCA is competitive with current state-of-the-art methods and outperforms other methods in particular for a large number of outliers.},
}
Endnote
%0 Thesis
%A Podosinnikova, Anastasia
%Y Hein, Matthias
%A referee: Gemulla, Rainer
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Robust Principal Component Analysis as a Nonlinear Eigenproblem :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0026-CC75-A
%I Universität des Saarlandes
%C Saarbrücken
%D 2013
%V master
%9 master
%X Principal Component Analysis (PCA) is a widely used tool for, e.g., exploratory
data analysis, dimensionality reduction and clustering. However, it is well
known that PCA is strongly affected by the presence of outliers and, thus, is
vulnerable to both gross measurement error and adversarial manipulation of the
data. This phenomenon motivates the development of robust PCA as the problem of
recovering the principal components of the uncontaminated data.
In this thesis, we propose two new algorithms, QRPCA and MDRPCA, for robust PCA
components based on the projection-pursuit approach of Huber. While the
resulting optimization problems are non-convex and non-smooth, we show that
they can be efficiently minimized via the RatioDCA using bundle
methods/accelerated proximal methods for the interior problem. The key
ingredient for the most promising algorithm (QRPCA) is a robust, location
invariant scale measure with breakdown point 0.5. Extensive experiments show
that our QRPCA is competitive with current state-of-the-art methods and
outperforms other methods in particular for a large number of outliers.
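A crude stand-in for the projection-pursuit view of robust PCA described above (assumptions and toy data are mine; the thesis optimizes the non-convex objective properly via RatioDCA rather than by random search): score candidate directions by a robust scale of the projected data, here the median absolute deviation, and compare the winner with the classical variance-maximizing direction.

import numpy as np

def mad(x):
    """Median absolute deviation, a scale estimate with breakdown point 0.5."""
    return np.median(np.abs(x - np.median(x)))

def robust_direction(X, n_candidates=2000, seed=0):
    rng = np.random.default_rng(seed)
    best_w, best_score = None, -np.inf
    for _ in range(n_candidates):
        w = rng.standard_normal(X.shape[1])
        w /= np.linalg.norm(w)
        score = mad(X @ w)
        if score > best_score:
            best_w, best_score = w, score
    return best_w

rng = np.random.default_rng(1)
clean = rng.standard_normal((200, 2)) @ np.diag([3.0, 0.3])    # main axis = e1
outliers = rng.standard_normal((20, 2)) * 0.3 + np.array([0.0, 25.0])
X = np.vstack([clean, outliers])

w_rob = robust_direction(X)
w_pca = np.linalg.eigh(np.cov(X.T))[1][:, -1]                  # classical PCA
print("robust:", np.round(np.abs(w_rob), 2))   # close to [1, 0]
print("PCA:   ", np.round(np.abs(w_pca), 2))   # pulled towards the outliers

On this toy data the MAD-based direction recovers the main axis of the clean points, while classical PCA is dragged towards the outlier cluster, which is the failure mode that motivates robust PCA.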
[62]
M. Reznitskii, “Stereo Vision under Adverse Conditions,” Universität des Saarlandes, Saarbrücken, 2013.
Abstract
Autonomous Driving benefits strongly from a 3D reconstruction of the
environment in real-time, often obtained via stereo vision. Semi-Global
Matching (SGM) is a popular method of choice for solving this task and is
already in use for production vehicles. Despite the enormous progress in the
field and the high performance of modern methods, one key challenge remains:
stereo vision in automotive scenarios during difficult weather or illumination
conditions. Current methods generate strong temporal noise, many disparity
outliers, and false positives on a segmentation level. This work addresses
these issues by formulating a temporal prior and a scene prior and applying
them to SGM. For image sequences captured on a highway during rain, during
snowfall, or in low light, these priors significantly improve the object
detection rate while reducing the false positive rate. The algorithm also
outperforms the ECCV Robust Vision Challenge winner, iSGM.
Export
BibTeX
@mastersthesis{Reznitskii2013,
TITLE = {Stereo Vision under Adverse Conditions},
AUTHOR = {Reznitskii, Maxim},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2013},
DATE = {2013},
ABSTRACT = {Autonomous Driving benefits strongly from a 3D reconstruction of the environment in real-time, often obtained via stereo vision. Semi-Global Matching (SGM) is a popular method of choice for solving this task and is already in use for production vehicles. Despite the enormous progress in the field and the high performance of modern methods, one key challenge remains: stereo vision in automotive scenarios during difficult weather or illumination conditions. Current methods generate strong temporal noise, many disparity outliers, and false positives on a segmentation level. This work addresses these issues by formulating a temporal prior and a scene prior and applying them to SGM. For image sequences captured on a highway during rain, during snowfall, or in low light, these priors significantly improve the object detection rate while reducing the false positive rate. The algorithm also outperforms the ECCV Robust Vision Challenge winner, iSGM.},
}
Endnote
%0 Thesis
%A Reznitskii, Maxim
%Y Weickert, Joachim
%A referee: Schiele, Bernt
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Stereo Vision under Adverse Conditions :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0026-CC7E-7
%I Universität des Saarlandes
%C Saarbrücken
%D 2013
%V master
%9 master
%X Autonomous Driving benefits strongly from a 3D reconstruction of the
environment in real-time, often obtained via stereo vision. Semi-Global
Matching (SGM) is a popular method of choice for solving this task and is
already in use for production vehicles. Despite the enormous progress in the
field and the high performance of modern methods, one key challenge remains:
stereo vision in automotive scenarios during difficult weather or illumination
conditions. Current methods generate strong temporal noise, many disparity
outliers, and false positives on a segmentation level. This work addresses
these issues by formulating a temporal prior and a scene prior and applying
them to SGM. For image sequences captured on a highway during rain, during
snowfall, or in low light, these priors significantly improve the object
detection rate while reducing the false positive rate. The algorithm also
outperforms the ECCV Robust Vision Challenge winner, iSGM.
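For readers unfamiliar with SGM, the following minimal Python/NumPy sketch (my own simplification; the thesis adds temporal and scene priors on top of full SGM, neither of which is shown) implements the core cost-aggregation recurrence along a single left-to-right scanline path.

import numpy as np

def sgm_path(cost, P1=1.0, P2=8.0):
    """cost: (width, ndisp) matching costs; returns aggregated path costs L."""
    width, ndisp = cost.shape
    L = np.zeros_like(cost)
    L[0] = cost[0]
    for x in range(1, width):
        prev = L[x - 1]
        min_prev = prev.min()
        for d in range(ndisp):
            candidates = [prev[d]]
            if d > 0:
                candidates.append(prev[d - 1] + P1)
            if d < ndisp - 1:
                candidates.append(prev[d + 1] + P1)
            candidates.append(min_prev + P2)
            L[x, d] = cost[x, d] + min(candidates) - min_prev
    return L

rng = np.random.default_rng(0)
cost = rng.random((20, 8))          # toy matching costs for one scanline
cost[:, 3] -= 0.6                   # make disparity 3 consistently cheap
disparity = sgm_path(cost).argmin(axis=1)
print(disparity)                    # mostly 3, smoothed by the P1/P2 penalties

Full SGM aggregates such path costs over several directions and sums them before taking the per-pixel arg-min; the P1/P2 penalties are what smooths the resulting disparity map.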
2012
[63]
N. Alcaraz Milman, “KeyPathwayMiner - Detecting Case-specific Biological Pathways by Using Expression Data,” Universität des Saarlandes, Saarbrücken, 2012.
Abstract
Advances in the field of systems biology have provided the biological community
with massive amounts of pathway data that describe the interplay of genes and
their products. The resulting biological networks usually consist of thousands
of entities and interactions that can be modeled mathematically as graphs.
Since these networks only provide a static picture of the accumulated
knowledge, pathways that are affected during development of complex diseases
cannot be extracted easily. This gap can be filled by means of OMICS
technologies such as DNA microarrays, which measure the activity of genes and
proteins under different conditions. Integration of both interaction and
expression datasets can increase the quality and accuracy of analysis when
compared to independent inspection of each. However, sophisticated
computational methods are needed to deal with the size of the datasets while
also accounting for the presence of biological and technological noise inherent
in the data generating process.
In this dissertation the KeyPathwayMiner is presented, a method that enables
the extraction and visualization of affected pathways given the results of a
series of gene expression studies. Specifically, given network and gene
expression data, KeyPathwayMiner identifies those maximal subgraphs where all
but k nodes of the subnetwork are differentially expressed in all but at most l
cases in the gene expression data. This new formulation allows users to control
the number of outliers with two parameters that provide good interpretability
of the solutions. Since identifying these subgraphs is computationally
intensive, a heuristic algorithm based on Ant Colony Optimization was designed
and adapted to this problem, where solutions are reported in the order of
seconds on a standard personal computer. The KeyPathwayMiner was tested on
real Huntington's Disease and Breast Cancer datasets, where it is able to
extract pathways containing a large percentage of known relevant genes when
compared to other similar approaches.
KeyPathwayMiner has been implemented as a plugin for Cytoscape, one of the most
widely used open source biological network analysis and visualization
platforms. The KeyPathwayMiner is available online at
http://keypathwayminer.mpi-inf.mpg.de or through the plugin manager of
Cytoscape.
Export
BibTeX
@mastersthesis{AlcarazMilman2012,
TITLE = {{K}ey{P}athway{M}iner -- Detecting Case-specific Biological Pathways by Using Expression Data},
AUTHOR = {Alcaraz Milman, Nicolas},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
ABSTRACT = {Advances in the field of systems biology have provided the biological community with massive amounts of pathway data that describe the interplay of genes and their products. The resulting biological networks usually consist of thousands of entities and interactions that can be modeled mathematically as graphs. Since these networks only provide a static picture of the accumulated knowledge, pathways that are affected during development of complex diseases cannot be extracted easily. This gap can be filled by means of OMICS technologies such as DNA microarrays, which measure the activity of genes and proteins under different conditions. Integration of both interaction and expression datasets can increase the quality and accuracy of analysis when compared to independent inspection of each. However, sophisticated computational methods are needed to deal with the size of the datasets while also accounting for the presence of biological and technological noise inherent in the data generating process. In this dissertation the KeyPathwayMiner is presented, a method that enables the extraction and visualization of affected pathways given the results of a series of gene expression studies. Specifically, given network and gene expression data, KeyPathwayMiner identifies those maximal subgraphs where all but k nodes of the subnetwork are differentially expressed in all but at most l cases in the gene expression data. This new formulation allows users to control the number of outliers with two parameters that provide good interpretability of the solutions. Since identifying these subgraphs is computationally intensive, a heuristic algorithm based on Ant Colony Optimization was designed and adapted to this problem, where solutions are reported in the order of seconds on a standard personal computer. The KeyPathwayMiner was tested on real Huntington's Disease and Breast Cancer datasets, where it is able to extract pathways containing a large percentage of known relevant genes when compared to other similar approaches. KeyPathwayMiner has been implemented as a plugin for Cytoscape, one of the most widely used open source biological network analysis and visualization platforms. The KeyPathwayMiner is available online at http://keypathwayminer.mpi-inf.mpg.de or through the plugin manager of Cytoscape.},
}
Endnote
%0 Thesis
%A Alcaraz Milman, Nicolas
%Y Baumbach, Jan
%A referee: Helms, Volkhard
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
External Organizations
%T KeyPathwayMiner - Detecting Case-specific Biological Pathways by Using Expression Data :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0026-CC8A-B
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%V master
%9 master
%X Advances in the field of systems biology have provided the biological community
with massive amounts of pathway data that describe the interplay of genes and
their products. The resulting biological networks usually consist of thousands
of entities and interactions that can be modeled mathematically as graphs.
Since these networks only provide a static picture of the accumulated
knowledge, pathways that are affected during development of complex diseases
cannot be extracted easily. This gap can be filled by means of OMICS
technologies such as DNA microarrays, which measure the activity of genes and
proteins under different conditions. Integration of both interaction and
expression datasets can increase the quality and accuracy of analysis when
compared to independent inspection of each. However, sophisticated
computational methods are needed to deal with the size of the datasets while
also accounting for the presence of biological and technological noise inherent
in the data generating process.
In this dissertation the KeyPathwayMiner is presented, a method that enables
the extraction and visualization of affected pathways given the results of a
series of gene expression studies. Specifically, given network and gene
expression data, KeyPathwayMiner identifies those maximal subgraphs where all
but k nodes of the subnetwork are differentially expressed in all but at most l
cases in the gene expression data. This new formulation allows users to control
the number of outliers with two parameters that provide good interpretability
of the solutions. Since identifying these subgraphs is computationally
intensive, a heuristic algorithm based on Ant Colony Optimization was designed
and adapted to this problem, where solutions are reported in the order of
seconds on a standard personal computer. The KeyPathwayMiner was tested on
real Huntington's Disease and Breast Cancer datasets, where it is able to
extract pathways containing a large percentage of known relevant genes when
compared to other similar approaches.
KeyPathwayMiner has been implemented as a plugin for Cytoscape, one of the most
widely used open source biological network analysis and visualization
platforms. The KeyPathwayMiner is available online at
http://keypathwayminer.mpi-inf.mpg.de or through the plugin manager of
Cytoscape.
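To make the (k, l) criterion in the abstract above concrete, here is a small Python checker (my own sketch with hypothetical gene names; the Ant Colony Optimization search that actually enumerates such subgraphs is not reproduced): a candidate connected subnetwork is admissible if all but at most k of its nodes are differentially expressed in all but at most l cases.

def is_connected(nodes, edges):
    nodes = set(nodes)
    if not nodes:
        return False
    adj = {v: set() for v in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(adj[v] - seen)
    return seen == nodes

def satisfies_kl(nodes, expression, k, l):
    """expression[v] is a list of booleans: differentially expressed per case."""
    exceptions = sum(1 for v in nodes
                     if sum(not de for de in expression[v]) > l)
    return exceptions <= k

edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
expression = {            # 4 cases per gene; True = differentially expressed
    "A": [True, True, True, False],
    "B": [True, True, True, True],
    "C": [False, False, True, False],   # the one "exception" gene
    "D": [True, True, False, True],
    "E": [True, True, True, True],
}
candidate = ["A", "B", "C", "D", "E"]
ok = is_connected(candidate, edges) and satisfies_kl(candidate, expression, k=1, l=1)
print(ok)   # True: only gene C misses the expression condition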
[64]
N. Arvanitopoulos-Darginis, “Aggregation of Multiple Clusterings and Active Learning in a Transductive Setting,” Universität des Saarlandes, Saarbrücken, 2012.
Abstract
In this work we proposed a novel transductive method to solve the problem of
learning from partially labeled data. Our main idea was to aggregate information
obtained from several clusterings to infer the labels of the unlabeled data.
While our method is not restricted to a specific clustering method, we chose
to use in our experiments the normalized variant of 1-spectral clustering, which
was demonstrated to produce in most cases better clusterings than the standard
spectral clustering method. Our approach yielded results which were at least
comparable to, and in some cases even significantly better than the best results
obtained by state-of-the-art methods reported in the literature.
Furthermore, we proposed a novel active learning framework that is able to
query the labels of the most informative points which help in the classification
of the unlabeled points. For the majority vote scheme we provided some
guarantees on the number of points that should be drawn from each cluster in
order
to infer the correct label of the cluster with high probability. Moreover, in
the
ridge regression scheme we proposed an algorithm that in each step selects the
most uncertain point in terms of the prediction function of the classifier (the
point that lies near the decision boundary of the classifier). In both cases,
experimental results show the strength of our methods and confirm our
theoretical
guarantees.
The results look very promising and open several interesting directions of
future research. For the SSL scheme, it is interesting to test the performance
of several other clustering approaches, such as k-means, standard spectral
clustering, hierarchical clustering, etc., and combine them in one general
method. Our intuition is that the algorithm should be able to select only the
good clusterings that provide discriminative information for each specific
problem.
Apart from ridge regression, it would be beneficial to experiment with other
fitting approaches that produce sparse representations in our constructed
basis. For the active learning framework, one interesting direction is to
further generalize it into more general clusterings that take into account the
hierarchical structure of data. In that way, we will take advantage of the
underlying hierarchy and by adaptively selecting the pruning of the cluster
tree we can (potentially) further improve our sampling strategy. Additionally,
we believe that in the multi-clustering scenario extensive improvements of our
algorithm can be proposed in order to better take advantage of the variation in
the multiple clustering representations of the data. Finally, as our methods
scale to large-scale problems and partially labeled data occurs in many
different areas ranging from web documents to protein data, there is room for
many interesting applications of the proposed methods.
Export
BibTeX
@mastersthesis{Arvanitopoulos-Darginis2011,
TITLE = {Aggregation of Multiple Clusterings and Active Learning in a Transductive Setting},
AUTHOR = {Arvanitopoulos-Darginis, Nikolaos},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
ABSTRACT = {In this work we proposed a novel transductive method to solve the problem of learning from partially labeled data. Our main idea was to aggregate information obtained from several clusterings to infer the labels of the unlabeled data. While our method is not restricted to a specific clustering method, we chose to use in our experiments the normalized variant of 1-spectral clustering, which was demonstrated to produce in most cases better clusterings than the standard spectral clustering method. Our approach yielded results which were at least comparable to, and in some cases even significantly better than the best results obtained by state-of-the-art methods reported in the literature. Furthermore, we proposed a novel active learning framework that is able to query the labels of the most informative points which help in the classification of the unlabeled points. For the majority vote scheme we provided some guarantees on the number of points that should be drawn from each cluster in order to infer the correct label of the cluster with high probability. Moreover, in the ridge regression scheme we proposed an algorithm that in each step selects the most uncertain point in terms of the prediction function of the classifier (the point that lies near the decision boundary of the classifier). In both cases, experimental results show the strength of our methods and confirm our theoretical guarantees. The results look very promising and open several interesting directions of future research. For the SSL scheme, it is interesting to test the performance of several other clustering approaches, such as k-means, standard spectral clustering, hierarchical clustering, etc., and combine them in one general method. Our intuition is that the algorithm should be able to select only the good clusterings that provide discriminative information for each specific problem. Apart from ridge regression, it would be beneficial to experiment with other fitting approaches that produce sparse representations in our constructed basis. For the active learning framework, one interesting direction is to further generalize it into more general clusterings that take into account the hierarchical structure of data. In that way, we will take advantage of the underlying hierarchy and by adaptively selecting the pruning of the cluster tree we can (potentially) further improve our sampling strategy. Additionally, we believe that in the multi-clustering scenario extensive improvements of our algorithm can be proposed in order to better take advantage of the variation in the multiple clustering representations of the data. Finally, as our methods scale to large-scale problems and partially labeled data occurs in many different areas ranging from web documents to protein data, there is room for many interesting applications of the proposed methods.},
}
Endnote
%0 Thesis
%A Arvanitopoulos-Darginis, Nikolaos
%Y Hein, Matthias
%A referee: Weickert, Joachim
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Aggregation of Multiple Clusterings and Active Learning in a Transductive Setting :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0026-CC8E-3
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%V master
%9 master
%X In this work we proposed a novel transductive method to solve the problem of
learning from partially labeled data. Our main idea was to aggregate information
obtained from several clusterings to infer the labels of the unlabeled data.
While our method is not restricted to a specific clustering method, we chose
to use in our experiments the normalized variant of 1-spectral clustering, which
was demonstrated to produce in most cases better clusterings than the standard
spectral clustering method. Our approach yielded results which were at least
comparable to, and in some cases even significantly better than the best results
obtained by state-of-the-art methods reported in the literature.
Furthermore, we proposed a novel active learning framework that is able to
query the labels of the most informative points which help in the classification
of the unlabeled points. For the majority vote scheme we provided some
guarantees on the number of points that should be drawn from each cluster in
order
to infer the correct label of the cluster with high probability. Moreover, in
the
ridge regression scheme we proposed an algorithm that in each step selects the
most uncertain point in terms of the prediction function of the classifier (the
point that lies near the decision boundary of the classifier). In both cases,
experimental results show the strength of our methods and confirm our
theoretical
guarantees.
The results look very promising and open several interesting directions of
future research. For the SSL scheme, it is interesting to test the performance
of several other clustering approaches, such as k-means, standard spectral
clustering, hierarchical clustering, etc., and combine them in one general
method. Our intuition is that the algorithm should be able to select only the
good clusterings that provide discriminative information for each specific
problem.
Apart from ridge regression, it would be beneficial to experiment with other
fitting approaches that produce sparse representations in our constructed
basis. For the active learning framework, one interesting direction is to
further generalize it into more general clusterings that take into account the
hierarchical structure of data. In that way, we will take advantage of the
underlying hierarchy and by adaptively selecting the pruning of the cluster
tree we can (potentially) further improve our sampling strategy. Additionally,
we believe that in the multi-clustering scenario extensive improvements of our
algorithm can be proposed in order to better take advantage of the variation in
the multiple clustering representations of the data. Finally, as our methods
scale to large-scale problems and partially labeled data occurs in many
different areas ranging from web documents to protein data, there is room for
many interesting applications of the proposed methods.
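A bare-bones Python sketch of the aggregation idea described above (heavily simplified, with toy points and labels of my own; this is not the thesis's method, which uses 1-spectral clustering and comes with theoretical guarantees): within each clustering, every cluster votes with the majority label of its labeled members, and an unlabeled point takes the majority label over all clusterings.

from collections import Counter

def infer_labels(clusterings, labels):
    """clusterings: list of dicts point -> cluster id; labels: partial dict."""
    votes = {p: Counter() for c in clusterings for p in c}
    for clustering in clusterings:
        clusters = {}
        for p, cid in clustering.items():
            clusters.setdefault(cid, []).append(p)
        for members in clusters.values():
            known = [labels[p] for p in members if p in labels]
            if not known:
                continue
            majority = Counter(known).most_common(1)[0][0]
            for p in members:
                votes[p][majority] += 1
    # ties are broken arbitrarily by Counter.most_common
    return {p: v.most_common(1)[0][0] for p, v in votes.items() if v}

clustering_a = {"x1": 0, "x2": 0, "x3": 1, "x4": 1, "x5": 1}
clustering_b = {"x1": 0, "x2": 0, "x3": 0, "x4": 1, "x5": 1}
partial_labels = {"x1": "+", "x4": "-"}
print(infer_labels([clustering_a, clustering_b], partial_labels))
# e.g. x2 -> '+', x5 -> '-'; x3 gets one vote each way across the two clusterings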
[65]
N. Azmy, “Formula Renaming with Generalizations,” Universität des Saarlandes, Saarbrücken, 2012.
Export
BibTeX
@mastersthesis{Azmy12,
TITLE = {Formula Renaming with Generalizations},
AUTHOR = {Azmy, Noran},
LANGUAGE = {eng},
LOCALID = {Local-ID: DF824D161A8C2600C1257AF6004FEBFF-Azmy12},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
}
Endnote
%0 Thesis
%A Azmy, Noran
%Y Weidenbach, Christoph
%A referee: Werner, Stephan
%+ Automation of Logic, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Automation of Logic, MPI for Informatics, Max Planck Society
External Organizations
%T Formula Renaming with Generalizations :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0014-B40C-0
%F OTHER: Local-ID: DF824D161A8C2600C1257AF6004FEBFF-Azmy12
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%V master
%9 master
[66]
E. Cergani, “Relation Extraction Using Matrix Factorization Methods,” Universität des Saarlandes, Saarbrücken, 2012.
Export
BibTeX
@mastersthesis{Cergani2012,
TITLE = {Relation Extraction Using Matrix Factorization Methods},
AUTHOR = {Cergani, Ervina},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-B2109FA6099C9CC8C1257AC900301865-Cergani2012},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
}
Endnote
%0 Thesis
%A Cergani, Ervina
%Y Weikum, Gerhard
%A referee: Miettinen, Pauli
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Relation Extraction Using Matrix Factorization Methods :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0014-6277-9
%F EDOC: 647514
%F OTHER: Local-ID: C1256DBF005F876D-B2109FA6099C9CC8C1257AC900301865-Cergani2012
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%V master
%9 master
[67]
C. Croitoru, “Algorithmic Aspects of Abstract Argumentation Frameworks,” Universität des Saarlandes, Saarbrücken, 2012.
Export
BibTeX
@mastersthesis{CroitoruMaster2012,
TITLE = {Algorithmic Aspects of Abstract Argumentation Frameworks},
AUTHOR = {Croitoru, Cosmina},
LANGUAGE = {eng},
LOCALID = {Local-ID: F5B5A180A18D2EEBC1257B1600392684-CroitoruMaster2012},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
}
Endnote
%0 Thesis
%A Croitoru, Cosmina
%Y Mehlhorn, Kurt
%A referee: Kötzing, Timo
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Algorithmic Aspects of Abstract Argumentation Frameworks :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0014-BAAD-0
%F OTHER: Local-ID: F5B5A180A18D2EEBC1257B1600392684-CroitoruMaster2012
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%P X, 75 p.
%V master
%9 master
[68]
I. Goncharov, “Local Constancy Assumption Selection for Variational Optical Flow,” Universität des Saarlandes, Saarbrücken, 2012.
Abstract
Variational methods are among the most successful approaches for computing
high-quality optical flow. However, there are still many ways to improve. In this
thesis we first provide a general overview of the main ideas of existing
approaches, using the complementary optical flow method of Zimmer et
al. [19] as an example. This serves as a starting point for introducing the concept of
automatic local selection of the most suitable constancy assumption on image
features, which allows us to further improve the quality of optical flow estimation.
As a main contribution, we provide a variational formulation that directly
leads to the proposed behavior. The derived model is then analysed and
evaluated in a series of experiments.
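For orientation, a generic variational optical flow energy of the kind discussed above (a sketch of the standard form, not the exact model of this thesis) combines a data term built from a constancy assumption with a smoothness term; the local-selection idea then amounts to choosing, per location, which constancy assumption enters the data term. With a robust penalizer Ψ and smoothness weight α:
E(u,v) = \int_{\Omega} \Psi\big( (I(x+u,\, y+v,\, t+1) - I(x,\, y,\, t))^2 \big) \; + \; \alpha\, \Psi\big( |\nabla u|^2 + |\nabla v|^2 \big) \, dx\, dy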
Export
BibTeX
@mastersthesis{Goncharov2011,
TITLE = {Local Constancy Assumption Selection for Variational Optical Flow},
AUTHOR = {Goncharov, Ilya},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
ABSTRACT = {Variational methods are among the most successful approaches for computing high-quality optical flow. However, there are still many ways to improve. In this thesis we first provide a general overview of the main ideas of existing approaches, using the complementary optical flow method of Zimmer et al. [19] as an example. This serves as a starting point for introducing the concept of automatic local selection of the most suitable constancy assumption on image features, which allows us to further improve the quality of optical flow estimation. As a main contribution, we provide a variational formulation that directly leads to the proposed behavior. The derived model is then analysed and evaluated in a series of experiments.},
}
Endnote
%0 Thesis
%A Goncharov, Ilya
%Y Bruhn, Andreas
%A referee: Weikert, Joachim
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Local Constancy Assumption Selection for Variational Optical Flow :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0026-D083-E
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%V master
%9 master
%X Variational methods are among the most successful approaches for computing
high-quality optical flow. However, there are still many ways to improve. In this
thesis we first provide a general overview of the main ideas of existing
approaches, using the complementary optical flow method of Zimmer et
al. [19] as an example. This serves as a starting point for introducing the concept of
automatic local selection of the most suitable constancy assumption on image
features, which allows us to further improve the quality of optical flow estimation.
As a main contribution, we provide a variational formulation that directly
leads to the proposed behavior. The derived model is then analysed and
evaluated in a series of experiments.
[69]
H. Khoshnevis, “Discriminating 4G and Broadcast Signals via Cyclostationary Feature Detection,” Universität des Saarlandes, Saarbrücken, 2012.
Abstract
According to the FCC, spectrum allocation will be one of the problems of future
telecommunication systems. Indeed, the available parts of the spectrum have
been assigned statically to some applications such as mobile networks and
broadcasting systems; hence finding a proper operating band for new systems is
difficult. These telecommunication systems are called primary users. However,
primary users do not always use their entire bandwidth, and therefore many
spectrum holes can be detected. These spectrum holes can be utilized by other
systems, called secondary users. The Federal Communications Commission
(FCC) introduced cognitive radio, which detects these holes and assigns
them to secondary users.
There are several techniques for the detection of signals, such as energy-based
detection, matched-filter detection and cyclostationary-based detection.
Cyclostationary-based detection, as one of the most sensitive methods, can be
used for the detection and classification of different systems. However,
traditional multi-cycle and single-cycle detectors suffer from high complexity.
Fortunately, using some prior knowledge about the signal, this shortcoming can
be overcome.
In this thesis, signals of DVB-T2 as a broadcasting system and 3GPP LTE and
IEEE 802.16 (WiMAX) as mobile networks have been evaluated, and two
cyclostationary-based algorithms for the detection and classification of these
signals in SISO and MIMO antenna configurations are proposed.
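As a rough illustration of the cyclostationary feature idea (not the specific detectors developed in the thesis), the cyclic autocorrelation of a sampled signal at a hypothesized cycle frequency can be estimated as below; a pronounced value at a cycle frequency known for a standard (e.g. one tied to its OFDM symbol structure) indicates the presence of that signal. A minimal sketch with illustrative names:

import numpy as np

def cyclic_autocorrelation(x, alpha, lag, fs):
    """Estimate the cyclic autocorrelation R_x^alpha(lag) of a sampled signal.

    x     : complex baseband samples
    alpha : hypothesized cycle frequency in Hz
    lag   : lag in samples
    fs    : sampling rate in Hz
    """
    n = np.arange(len(x) - lag)
    # time-average of x(t+lag) * conj(x(t)), demodulated at the cycle frequency
    return np.mean(x[n + lag] * np.conj(x[n]) * np.exp(-2j * np.pi * alpha * n / fs))

Scanning this quantity over a small set of candidate cycle frequencies is what makes classification of different standards possible from the same statistic.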
Export
BibTeX
@mastersthesis{Khoshnevis2012,
TITLE = {Discriminating {4G} and Broadcast Signals via Cyclostationary Feature Detection},
AUTHOR = {Khoshnevis, Hossein},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
ABSTRACT = {According to the FCC, spectrum allocation will be one of the problems of future telecommunication systems. Indeed, the available parts of the spectrum have been assigned statically to some applications such as mobile networks and broadcasting systems; hence finding a proper operating band for new systems is difficult. These telecommunication systems are called primary users. However, primary users do not always use their entire bandwidth, and therefore many spectrum holes can be detected. These spectrum holes can be utilized by other systems, called secondary users. The Federal Communications Commission (FCC) introduced cognitive radio, which detects these holes and assigns them to secondary users. There are several techniques for the detection of signals, such as energy-based detection, matched-filter detection and cyclostationary-based detection. Cyclostationary-based detection, as one of the most sensitive methods, can be used for the detection and classification of different systems. However, traditional multi-cycle and single-cycle detectors suffer from high complexity. Fortunately, using some prior knowledge about the signal, this shortcoming can be overcome. In this thesis, signals of DVB-T2 as a broadcasting system and 3GPP LTE and IEEE 802.16 (WiMAX) as mobile networks have been evaluated, and two cyclostationary-based algorithms for the detection and classification of these signals in SISO and MIMO antenna configurations are proposed.},
}
Endnote
%0 Thesis
%A Khoshnevis, Hossein
%Y Herfet, Thorsten
%A referee: Schiele, Bernt
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Discriminating 4G and Broadcast Signals via Cyclostationary Feature Detection :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-9F11-F
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%V master
%9 master
%X According to the FCC, spectrum allocation will be one of the problems of future
telecommunication systems. Indeed, the available parts of the spectrum have
been assigned statically to some applications such as mobile networks and
broadcasting systems; hence finding a proper operating band for new systems is
difficult. These telecommunication systems are called primary users. However,
primary users do not always use their entire bandwidth, and therefore many
spectrum holes can be detected. These spectrum holes can be utilized by other
systems, called secondary users. The Federal Communications Commission
(FCC) introduced cognitive radio, which detects these holes and assigns
them to secondary users.
There are several techniques for the detection of signals, such as energy-based
detection, matched-filter detection and cyclostationary-based detection.
Cyclostationary-based detection, as one of the most sensitive methods, can be
used for the detection and classification of different systems. However,
traditional multi-cycle and single-cycle detectors suffer from high complexity.
Fortunately, using some prior knowledge about the signal, this shortcoming can
be overcome.
In this thesis, signals of DVB-T2 as a broadcasting system and 3GPP LTE and
IEEE 802.16 (WiMAX) as mobile networks have been evaluated, and two
cyclostationary-based algorithms for the detection and classification of these
signals in SISO and MIMO antenna configurations are proposed.
[70]
S. Moran, “Shattering Extremal Systems,” Universität des Saarlandes, Saarbrücken, 2012.
Export
BibTeX
@mastersthesis{MoranMaster2012,
TITLE = {Shattering Extremal Systems},
AUTHOR = {Moran, Shay},
LANGUAGE = {eng},
LOCALID = {Local-ID: BF24A25446C99102C1257B1600386F5D-MoranMaster2012},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
}
Endnote
%0 Thesis
%A Moran, Shay
%Y Mehlhorn, Kurt
%A referee: Litman, Ami
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
External Organizations
%T Shattering Extremal Systems :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0014-BAA6-D
%F OTHER: Local-ID: BF24A25446C99102C1257B1600386F5D-MoranMaster2012
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%P II, 39 p.
%V master
%9 master
[71]
D. B. Nguyen, “Efficient Entity Disambiguation via Similarity Hashing,” Universität des Saarlandes, Saarbrücken, 2012.
Export
BibTeX
@mastersthesis{Nguyen2012,
TITLE = {Efficient Entity Disambiguation via Similarity Hashing},
AUTHOR = {Nguyen, Dat Ba},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-86BEFB1566020C4AC1257A6400543D05-Nguyen2012},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
}
Endnote
%0 Thesis
%A Nguyen, Dat Ba
%Y Theobald, Martin
%A referee: Weikum, Gerhard
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Efficient Entity Disambiguation via Similarity Hashing :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0014-626B-5
%F EDOC: 647513
%F OTHER: Local-ID: C1256DBF005F876D-86BEFB1566020C4AC1257A6400543D05-Nguyen2012
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%V master
%9 master
[72]
J. Parks, “Detecting Structural Regularity in Perspective Images,” Universität des Saarlandes, Saarbrücken, 2012.
Export
BibTeX
@mastersthesis{MasterParks,
TITLE = {Detecting Structural Regularity in Perspective Images},
AUTHOR = {Parks, Justin},
LANGUAGE = {eng},
LOCALID = {Local-ID: 45164},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
}
Endnote
%0 Thesis
%A Parks, Justin
%Y Thormählen, Thorsten
%A referee: Lasowski, Ruxandra
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T Detecting Structural Regularity in Perspective Images :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0014-F425-9
%F OTHER: Local-ID: 45164
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%V master
%9 master
[73]
B. Saliba, “An Evaluation Method For Indoor Positioning Systems On The Example Of LORIOT,” Universität des Saarlandes, Saarbrücken, 2012.
Abstract
In this thesis, an evaluation method for an indoor positioning system called LORIOT
is presented. This positioning system combines two technologies (RFID and IR)
for positioning based on geo-referenced dynamic Bayesian networks. LORIOT
allows users to calculate their position on their own device without sending
any data to a server responsible for calculating the position [3]. This property
reduces complexity and enables fast calculation. The positioning method is
realized by placing tags in the environment and letting the user carry the
sensors that read data from these tags. The user is then able to
choose whether or not to pass the positioning data to any third-party application.
The main focus here is to check the actual accuracy and performance of
indoor positioning systems using the proposed evaluation method, which is tested
on LORIOT. Most of the evaluation methods that have been used to test the
accuracy of indoor positioning systems are biased and insufficient. For
instance, the system is tested under optimal environmental conditions. To
address this, the evaluation method is used to test LORIOT in a
natural environment, using traces of people walking naturally in
the environment without being given any task. This type of evaluation
criterion improves the results because the system would be installed in an
environment with the same properties as the environment in this
study (where the evaluation tests are done). In addition, the system
positions people while they walk naturally (unlike most evaluation methods, which
do not test indoor positioning systems while walking).
Export
BibTeX
@mastersthesis{Saliba2012,
TITLE = {An Evaluation Method For Indoor Positioning Systems On The Example Of {LORIOT}},
AUTHOR = {Saliba, Bahjat},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
ABSTRACT = {In this thesis, an evaluation method for an indoor positioning system called LORIOT is presented. This positioning system combines two technologies (RFID and IR) for positioning based on geo-referenced dynamic Bayesian networks. LORIOT allows users to calculate their position on their own device without sending any data to a server responsible for calculating the position [3]. This property reduces complexity and enables fast calculation. The positioning method is realized by placing tags in the environment and letting the user carry the sensors that read data from these tags. The user is then able to choose whether or not to pass the positioning data to any third-party application. The main focus here is to check the actual accuracy and performance of indoor positioning systems using the proposed evaluation method, which is tested on LORIOT. Most of the evaluation methods that have been used to test the accuracy of indoor positioning systems are biased and insufficient. For instance, the system is tested under optimal environmental conditions. To address this, the evaluation method is used to test LORIOT in a natural environment, using traces of people walking naturally in the environment without being given any task. This type of evaluation criterion improves the results because the system would be installed in an environment with the same properties as the environment in this study (where the evaluation tests are done). In addition, the system positions people while they walk naturally (unlike most evaluation methods, which do not test indoor positioning systems while walking).},
}
Endnote
%0 Thesis
%A Saliba, Bahjat
%Y Wahlster, Wolfgang
%A referee: Müller, Christian
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T An Evaluation Method For Indoor Positioning Systems On The Example Of LORIOT :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-9F69-D
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%V master
%9 master
%X In this thesis, an evaluation method for an indoor positioning system called LORIOT
is presented. This positioning system combines two technologies (RFID and IR)
for positioning based on geo-referenced dynamic Bayesian networks. LORIOT
allows users to calculate their position on their own device without sending
any data to a server responsible for calculating the position [3]. This property
reduces complexity and enables fast calculation. The positioning method is
realized by placing tags in the environment and letting the user carry the
sensors that read data from these tags. The user is then able to
choose whether or not to pass the positioning data to any third-party application.
The main focus here is to check the actual accuracy and performance of
indoor positioning systems using the proposed evaluation method, which is tested
on LORIOT. Most of the evaluation methods that have been used to test the
accuracy of indoor positioning systems are biased and insufficient. For
instance, the system is tested under optimal environmental conditions. To
address this, the evaluation method is used to test LORIOT in a
natural environment, using traces of people walking naturally in
the environment without being given any task. This type of evaluation
criterion improves the results because the system would be installed in an
environment with the same properties as the environment in this
study (where the evaluation tests are done). In addition, the system
positions people while they walk naturally (unlike most evaluation methods, which
do not test indoor positioning systems while walking).
[74]
L. Teris, “Securing User-data in Android A conceptual approach for consumer and enterprise usage,” Universität des Saarlandes, Saarbrücken, 2012.
Abstract
Nowadays, smartphones and tablets are replacing the personal computer for the
average user. As more activities move to these gadgets, so does the sensitive
data with which they operate. However, there are few data protection mechanisms
for the mobile world at the moment, especially for scenarios where the attacker
has full access to the device (e.g. when the device is lost or stolen). In this
thesis, we tackle this problem and propose a novel encryption system for Android,
the top-selling mobile operating system.
Our investigation of the Android platform leads to a set of observations that
motivate our effort. Firstly, the existing defense mechanisms are too weak or
too rigid in terms of access control and granularity of the secured data unit.
Secondly, Android can be corrupted such that the default encryption solution
will reveal sensitive content via the debug interface. In response, we design
and (partially) implement an encryption system that addresses these shortcomings
and operates in a manner that is transparent to the user. Also, by leveraging
hardware security mechanisms, our system offers security guarantees even when
running on a corrupted OS. Moreover, the system is conceptually designed to
operate in an enterprise environment where mobile devices are administered
by a central authority. Finally, we provide a prototypical implementation and
evaluate our system to show the practicality of our approach.
Export
BibTeX
@mastersthesis{Teris2012,
TITLE = {Securing User-data in Android A conceptual approach for consumer and enterprise usage},
AUTHOR = {Teris, Liviu},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
ABSTRACT = {Nowadays, smartphones and tablets are replacing the personal computer for the average user. As more activities move to these gadgets, so does the sensitive data with which they operate. However, there are few data protection mechanisms for the mobile world at the moment, especially for scenarios where the attacker has full access to the device (e.g. when the device is lost or stolen). In this thesis, we tackle this problem and propose a novel encryption system for Android, the top-selling mobile operating system. Our investigation of the Android platform leads to a set of observations that motivate our effort. Firstly, the existing defense mechanisms are too weak or too rigid in terms of access control and granularity of the secured data unit. Secondly, Android can be corrupted such that the default encryption solution will reveal sensitive content via the debug interface. In response, we design and (partially) implement an encryption system that addresses these shortcomings and operates in a manner that is transparent to the user. Also, by leveraging hardware security mechanisms, our system offers security guarantees even when running on a corrupted OS. Moreover, the system is conceptually designed to operate in an enterprise environment where mobile devices are administered by a central authority. Finally, we provide a prototypical implementation and evaluate our system to show the practicality of our approach.},
}
Endnote
%0 Thesis
%A Teris, Liviu
%Y Backes, Michael
%A referee: Hammer, Christian
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Securing User-data in Android A conceptual approach for consumer and enterprise usage :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-A17C-1
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%V master
%9 master
%X Nowadays, smartphones and tablets are replacing the personal computer for the
average user. As more activities move to these gadgets, so does the sensitive
data with which they operate. However, there are few data protection mechanisms
for the mobile world at the moment, especially for scenarios where the attacker
has full access to the device (e.g. when the device is lost or stolen). In this
thesis, we tackle this problem and propose a novel encryption system for Android,
the top-selling mobile operating system.
Our investigation of the Android platform leads to a set of observations that
motivate our effort. Firstly, the existing defense mechanisms are too weak or
too rigid in terms of access control and granularity of the secured data unit.
Secondly, Android can be corrupted such that the default encryption solution
will reveal sensitive content via the debug interface. In response, we design
and (partially) implement an encryption system that addresses these shortcomings
and operates in a manner that is transparent to the user. Also, by leveraging
hardware security mechanisms, our system offers security guarantees even when
running on a corrupted OS. Moreover, the system is conceptually designed to
operate in an enterprise environment where mobile devices are administered
by a central authority. Finally, we provide a prototypical implementation and
evaluate our system to show the practicality of our approach.
[75]
M. Venkatachalapathy, “Scheduling Strategies in a Main-Memory MapReduce Framework, Approach for countering Reduce side skew,” Universität des Saarlandes, Saarbrücken, 2012.
Abstract
Over the past few decades, there has been a multifold increase in the amount of
digital data being generated. Various attempts are being made to
process this vast amount of data in a fast and efficient manner. Hadoop
MapReduce is one such software framework that has gained popularity in the last
few years. It provides a reliable and easier way to process huge amounts of data
in parallel on large computing clusters. However, Hadoop always persists
intermediate results to the local disk. As a result, Hadoop usually suffers
from long execution runtimes, as it typically pays a high I/O cost for running
jobs.
State-of-the-art computing clusters have enough main memory capacity to
hold terabytes of data in main memory. We have built the M3R (Main Memory
MapReduce) framework, a prototype for generic main-memory-based data
processing. M3R can execute MapReduce jobs and, in addition, general data
processing jobs.
This master thesis focuses on countering the data-skewness
problem for MapReduce jobs on M3R. Intermediate data following a skewed
distribution can lead to computational imbalance amongst the reduce tasks,
resulting in longer MapReduce job execution times. This provides scope for
rebalancing the intermediate data and thereby reducing the total job runtimes.
We propose a novel dynamic data-rebalancing approach to counter reduce-side
data skewness. Our proposed on-the-fly skew-countering approach attempts
to detect the level of skewness in the intermediate data and rebalances the
intermediate data amongst the reduce tasks. The proposed mechanism performs all
the skew-countering related activities during the execution of the actual MapReduce
job. We have implemented this reduce-side skew-countering mechanism as part
of the M3R framework. The experiments conducted to study the behavior of this
M3R data-rebalancing approach show a significant reduction in the
map-reduce job runtimes. In the case of skewed input data, our proposed
skew-control approach for M3R reduces the total map-reduce job runtime (by up
to 31%) when compared to M3R without skew-control.
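To make the rebalancing idea concrete, the sketch below shows one common way to counter reduce-side skew: greedily assigning keys to the currently least-loaded reduce task based on estimated per-key load, instead of plain hash partitioning. This is an illustration of the general technique under assumed inputs, not the M3R implementation; all names are hypothetical.

import heapq

def balanced_partition(key_loads, num_reducers):
    """Assign keys to reducers so that estimated per-reducer load stays balanced.

    key_loads    : dict mapping key -> estimated number of intermediate records
    num_reducers : number of reduce tasks
    Returns a dict mapping key -> reducer index.
    """
    # min-heap of (current_load, reducer_index)
    heap = [(0, r) for r in range(num_reducers)]
    heapq.heapify(heap)
    assignment = {}
    # place heavy keys first, always on the currently least-loaded reducer
    for key, load in sorted(key_loads.items(), key=lambda kv: -kv[1]):
        current, reducer = heapq.heappop(heap)
        assignment[key] = reducer
        heapq.heappush(heap, (current + load, reducer))
    return assignment

Plain hash partitioning corresponds to ignoring the loads entirely; the greedy heap keeps the heaviest keys from piling up on a single reduce task.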
Export
BibTeX
@mastersthesis{Venkatachalapathy2012,
TITLE = {Scheduling Strategies in a Main-Memory {MapReduce} Framework, Approach for countering Reduce side skew},
AUTHOR = {Venkatachalapathy, Mahendiran},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
ABSTRACT = {Over the past few decades, there has been a multifold increase in the amount of digital data being generated. Various attempts are being made to process this vast amount of data in a fast and efficient manner. Hadoop MapReduce is one such software framework that has gained popularity in the last few years. It provides a reliable and easier way to process huge amounts of data in parallel on large computing clusters. However, Hadoop always persists intermediate results to the local disk. As a result, Hadoop usually suffers from long execution runtimes, as it typically pays a high I/O cost for running jobs. State-of-the-art computing clusters have enough main memory capacity to hold terabytes of data in main memory. We have built the M3R (Main Memory MapReduce) framework, a prototype for generic main-memory-based data processing. M3R can execute MapReduce jobs and, in addition, general data processing jobs. This master thesis focuses on countering the data-skewness problem for MapReduce jobs on M3R. Intermediate data following a skewed distribution can lead to computational imbalance amongst the reduce tasks, resulting in longer MapReduce job execution times. This provides scope for rebalancing the intermediate data and thereby reducing the total job runtimes. We propose a novel dynamic data-rebalancing approach to counter reduce-side data skewness. Our proposed on-the-fly skew-countering approach attempts to detect the level of skewness in the intermediate data and rebalances the intermediate data amongst the reduce tasks. The proposed mechanism performs all the skew-countering related activities during the execution of the actual MapReduce job. We have implemented this reduce-side skew-countering mechanism as part of the M3R framework. The experiments conducted to study the behavior of this M3R data-rebalancing approach show a significant reduction in the map-reduce job runtimes. In the case of skewed input data, our proposed skew-control approach for M3R reduces the total map-reduce job runtime (by up to 31%) when compared to M3R without skew-control.},
}
Endnote
%0 Thesis
%A Venkatachalapathy, Mahendiran
%Y Dittrich, Jens
%A referee: Quiané-Ruiz, Jorge-Arnulfo
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
%T Scheduling Strategies in a Main-Memory MapReduce Framework, Approach for countering Reduce side skew :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-A183-F
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%V master
%9 master
%X Over the past few decades, there has been a multifold increase in the amount of
digital data being generated. Various attempts are being made to
process this vast amount of data in a fast and efficient manner. Hadoop
MapReduce is one such software framework that has gained popularity in the last
few years. It provides a reliable and easier way to process huge amounts of data
in parallel on large computing clusters. However, Hadoop always persists
intermediate results to the local disk. As a result, Hadoop usually suffers
from long execution runtimes, as it typically pays a high I/O cost for running
jobs.
State-of-the-art computing clusters have enough main memory capacity to
hold terabytes of data in main memory. We have built the M3R (Main Memory
MapReduce) framework, a prototype for generic main-memory-based data
processing. M3R can execute MapReduce jobs and, in addition, general data
processing jobs.
This master thesis focuses on countering the data-skewness
problem for MapReduce jobs on M3R. Intermediate data following a skewed
distribution can lead to computational imbalance amongst the reduce tasks,
resulting in longer MapReduce job execution times. This provides scope for
rebalancing the intermediate data and thereby reducing the total job runtimes.
We propose a novel dynamic data-rebalancing approach to counter reduce-side
data skewness. Our proposed on-the-fly skew-countering approach attempts
to detect the level of skewness in the intermediate data and rebalances the
intermediate data amongst the reduce tasks. The proposed mechanism performs all
the skew-countering related activities during the execution of the actual MapReduce
job. We have implemented this reduce-side skew-countering mechanism as part
of the M3R framework. The experiments conducted to study the behavior of this
M3R data-rebalancing approach show a significant reduction in the
map-reduce job runtimes. In the case of skewed input data, our proposed
skew-control approach for M3R reduces the total map-reduce job runtime (by up
to 31%) when compared to M3R without skew-control.
[76]
Q. Zheng, “Sparse Dictionary Learning with Simplex Constraints and Application to Topic Modeling,” Universität des Saarlandes, Saarbrücken, 2012.
Abstract
The probabilistic mixture model is a powerful tool for providing a low-dimensional
representation of count data. In the context of topic modeling, this amounts to
representing the distribution of one document as a mixture of multiple
distributions known as topics. The mixing proportions are called coefficients. A
common attempt is to introduce sparsity into both the topics and the coefficients
for better interpretability. We first discuss the problem of recovering sparse
coefficients of given documents when the topics are known. This is formulated as
a penalized least squares problem on the probability simplex, where the
sparsity is achieved through regularization. However, the typical ℓ1
regularizer becomes toothless in this case since it is constant over the
simplex. To overcome this issue, we propose a group of concave penalties for
inducing sparsity. An alternative approach is to post-process the solution of
the non-negative lasso to produce results that conform to the simplex constraint.
Our experiments show that both kinds of approaches can effectively recover the
sparsity pattern of coefficients. We
then compare their robustness in detail for different characteristics of input
data. The second problem we discuss is to model both the topics and the
coefficients of a collection of documents via matrix factorization. We propose
the LpT approach, in which all the topics and coefficients are constrained on
the simplex, and the ℓp penalty is imposed on each topic to promote sparsity.
We also consider procedures that post-process the solutions of other methods.
For example, the L1 approach first solves the problem where the simplex
constraints imposed on the topics are relaxed into non-negativity
constraints, and the ℓp penalty is replaced by the ℓ1 penalty. Afterwards,
L1 normalizes the estimated topics to generate results satisfying the simplex
constraints. As detecting the number of mixture components inherent in the data
is of central importance for the probabilistic mixture model, we analyze how
the regularization techniques can help us to automatically find this
number. We compare the capabilities of these approaches to recover the low-rank
structure underlying the data when the number of topics is correctly specified
and over-specified, respectively.
The empirical results demonstrate that LpT and L1 can discover the sparsity
pattern of the ground truth. In addition, when the number of topics is
over-specified, they adapt to the true number of topics.
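In symbols, the coefficient-recovery problem described above is, in a generic form with notation chosen here only for illustration, a penalized least-squares problem over the probability simplex:
\min_{c \in \Delta} \; \|Dc - y\|_2^2 + \lambda \sum_i \phi(c_i), \qquad \Delta = \{\, c : c_i \ge 0,\ \textstyle\sum_i c_i = 1 \,\},
where D holds the known topics as columns, y is the document's empirical word distribution, and \phi is a sparsity-inducing penalty. Since \sum_i |c_i| = \sum_i c_i = 1 on \Delta, the ℓ1 choice \phi(t) = |t| contributes only a constant \lambda, which is exactly why the abstract calls it toothless there and turns to concave penalties instead.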
Export
BibTeX
@mastersthesis{Zheng2012,
TITLE = {Sparse Dictionary Learning with Simplex Constraints and Application to Topic Modeling},
AUTHOR = {Zheng, Qinqing},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2012},
DATE = {2012},
ABSTRACT = {The probabilistic mixture model is a powerful tool for providing a low-dimensional representation of count data. In the context of topic modeling, this amounts to representing the distribution of one document as a mixture of multiple distributions known as topics. The mixing proportions are called coefficients. A common attempt is to introduce sparsity into both the topics and the coefficients for better interpretability. We first discuss the problem of recovering sparse coefficients of given documents when the topics are known. This is formulated as a penalized least squares problem on the probability simplex, where the sparsity is achieved through regularization. However, the typical ℓ1 regularizer becomes toothless in this case since it is constant over the simplex. To overcome this issue, we propose a group of concave penalties for inducing sparsity. An alternative approach is to post-process the solution of the non-negative lasso to produce results that conform to the simplex constraint. Our experiments show that both kinds of approaches can effectively recover the sparsity pattern of coefficients. We then compare their robustness in detail for different characteristics of input data. The second problem we discuss is to model both the topics and the coefficients of a collection of documents via matrix factorization. We propose the LpT approach, in which all the topics and coefficients are constrained on the simplex, and the ℓp penalty is imposed on each topic to promote sparsity. We also consider procedures that post-process the solutions of other methods. For example, the L1 approach first solves the problem where the simplex constraints imposed on the topics are relaxed into non-negativity constraints, and the ℓp penalty is replaced by the ℓ1 penalty. Afterwards, L1 normalizes the estimated topics to generate results satisfying the simplex constraints. As detecting the number of mixture components inherent in the data is of central importance for the probabilistic mixture model, we analyze how the regularization techniques can help us to automatically find this number. We compare the capabilities of these approaches to recover the low-rank structure underlying the data when the number of topics is correctly specified and over-specified, respectively. The empirical results demonstrate that LpT and L1 can discover the sparsity pattern of the ground truth. In addition, when the number of topics is over-specified, they adapt to the true number of topics.},
}
Endnote
%0 Thesis
%A Zheng, Qinqing
%Y Hein, Matthias
%A referee: Slawski, Martin
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Sparse Dictionary Learning with Simplex Constraints and Application to Topic Modeling :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-A192-D
%I Universität des Saarlandes
%C Saarbrücken
%D 2012
%V master
%9 master
%X The probabilistic mixture model is a powerful tool for providing a low-dimensional
representation of count data. In the context of topic modeling, this amounts to
representing the distribution of one document as a mixture of multiple
distributions known as topics. The mixing proportions are called coefficients. A
common attempt is to introduce sparsity into both the topics and the coefficients
for better interpretability. We first discuss the problem of recovering sparse
coefficients of given documents when the topics are known. This is formulated as
a penalized least squares problem on the probability simplex, where the
sparsity is achieved through regularization. However, the typical ℓ1
regularizer becomes toothless in this case since it is constant over the
simplex. To overcome this issue, we propose a group of concave penalties for
inducing sparsity. An alternative approach is to post-process the solution of
the non-negative lasso to produce results that conform to the simplex constraint.
Our experiments show that both kinds of approaches can effectively recover the
sparsity pattern of coefficients. We
then compare their robustness in detail for different characteristics of input
data. The second problem we discuss is to model both the topics and the
coefficients of a collection of documents via matrix factorization. We propose
the LpT approach, in which all the topics and coefficients are constrained on
the simplex, and the ℓp penalty is imposed on each topic to promote sparsity.
We also consider procedures that post-process the solutions of other methods.
For example, the L1 approach first solves the problem where the simplex
constraints imposed on the topics are relaxed into non-negativity
constraints, and the ℓp penalty is replaced by the ℓ1 penalty. Afterwards,
L1 normalizes the estimated topics to generate results satisfying the simplex
constraints. As detecting the number of mixture components inherent in the data
is of central importance for the probabilistic mixture model, we analyze how
the regularization techniques can help us to automatically find this
number. We compare the capabilities of these approaches to recover the low-rank
structure underlying the data when the number of topics is correctly specified
and over-specified, respectively.
The empirical results demonstrate that LpT and L1 can discover the sparsity
pattern of the ground truth. In addition, when the number of topics is
over-specified, they adapt to the true number of topics.
2011
[77]
F. Abed, “Coordination Mechanisms for Unrelated Machine Scheduling,” Universität des Saarlandes, Saarbrücken, 2011.
Abstract
We investigate load balancing games in the context of unrelated machine
scheduling. In such a game, there are a number of jobs and a number of
machines, and each job needs to be scheduled on one machine. A collection of
values p_ij is given, where p_ij indicates the processing time of job i on
machine j. Moreover, each job is controlled by a selfish player who only
wants to minimize the completion time of his job while disregarding other
players' welfare. The outcome schedule is a Nash equilibrium if no player
can unilaterally change his machine and reduce the completion time of his job.
It is known that in an equilibrium, the performance of the system can
be far from optimal. The degradation of the system performance in Nash
equilibrium is defined as the price of anarchy (PoA): the ratio of the cost
of the worst Nash equilibrium to the cost of the optimal scheduling. Clever
scheduling policies can be designed to reduce the PoA. These scheduling policies
are called coordination mechanisms.
It has been posed as an open question "what is the best possible lower bound
when coordination mechanisms use preemption". In this thesis we prove a lower
bound of Ω(log m / log log m) for all symmetric preemptive coordination mechanisms.
Moreover, we study the lower bound for the unusual case where the coordination
mechanisms are asymmetric and obtain the same bound
under the weak assumption that machines have no IDs. On the positive side, we
prove that the inefficiency-based mechanism can achieve a constant PoA when the
maximum inefficiency of the jobs is bounded by a constant.
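Written out, the price of anarchy defined above compares the social cost (typically the makespan in these load-balancing games) of the worst Nash equilibrium with that of an optimal schedule; in notation chosen here only for illustration,
\mathrm{PoA} \;=\; \sup_{\text{instances}} \; \frac{\max_{S \in \mathcal{NE}} C(S)}{C(S_{\mathrm{OPT}})},
so a lower bound of Ω(log m / log log m), with m the number of machines, means that no symmetric preemptive coordination mechanism can guarantee a smaller worst-case ratio.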
Export
BibTeX
@mastersthesis{Abed2011,
TITLE = {Coordination Mechanisms for Unrelated Machine Scheduling},
AUTHOR = {Abed, Fidaa},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
ABSTRACT = {We investigate load balancing games in the context of unrelated machine scheduling. In such a game, there are a number of jobs and a number of machines, and each job needs to be scheduled on one machine. A collection of values p_ij is given, where p_ij indicates the processing time of job i on machine j. Moreover, each job is controlled by a selfish player who only wants to minimize the completion time of his job while disregarding other players' welfare. The outcome schedule is a Nash equilibrium if no player can unilaterally change his machine and reduce the completion time of his job. It is known that in an equilibrium, the performance of the system can be far from optimal. The degradation of the system performance in Nash equilibrium is defined as the price of anarchy (PoA): the ratio of the cost of the worst Nash equilibrium to the cost of the optimal scheduling. Clever scheduling policies can be designed to reduce the PoA. These scheduling policies are called coordination mechanisms. It has been posed as an open question "what is the best possible lower bound when coordination mechanisms use preemption". In this thesis we prove a lower bound of $\Omega(\log m / \log\log m)$ for all symmetric preemptive coordination mechanisms. Moreover, we study the lower bound for the unusual case where the coordination mechanisms are asymmetric and obtain the same bound under the weak assumption that machines have no IDs. On the positive side, we prove that the inefficiency-based mechanism can achieve a constant PoA when the maximum inefficiency of the jobs is bounded by a constant.},
}
Endnote
%0 Thesis
%A Abed, Fidaa
%Y Mehlhorn, Kurt
%A referee: Huang, Chien-Chung
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Coordination Mechanisms for Unrelated Machine Scheduling :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-A1A2-B
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%V master
%9 master
%X We investigate load balancing games in the context of unrelated machine
scheduling. In such a game, there are a number of jobs and a number of
machines, and each job needs to be scheduled on one machine. A collection of
values p_ij is given, where p_ij indicates the processing time of job i on
machine j. Moreover, each job is controlled by a selfish player who only
wants to minimize the completion time of his job while disregarding other
players' welfare. The outcome schedule is a Nash equilibrium if no player
can unilaterally change his machine and reduce the completion time of his job.
It is known that in an equilibrium, the performance of the system can
be far from optimal. The degradation of the system performance in Nash
equilibrium is defined as the price of anarchy (PoA): the ratio of the cost
of the worst Nash equilibrium to the cost of the optimal scheduling. Clever
scheduling policies can be designed to reduce the PoA. These scheduling policies
are called coordination mechanisms.
It has been posed as an open question "what is the best possible lower bound
when coordination mechanisms use preemption". In this thesis we prove a lower
bound of Ω(log m / log log m) for all symmetric preemptive coordination mechanisms.
Moreover, we study the lower bound for the unusual case where the coordination
mechanisms are asymmetric and obtain the same bound
under the weak assumption that machines have no IDs. On the positive side, we
prove that the inefficiency-based mechanism can achieve a constant PoA when the
maximum inefficiency of the jobs is bounded by a constant.
[78]
I. L. Ciolacu, “Universally Composable Relativistic Commitments,” Universität des Saarlandes, Saarbrücken, 2011.
Abstract
Designing communications protocols specifically adapted to relativistic
situations (i.e. constrained by special relativity theory) is taking advantage
of uniquely relativistic features to accomplish otherwise impossible tasks.
Kent [Ken99] has demonstrated, for example, that secure bit commitment is
possible using a protocol exploiting relativistic causality constraints, even
though it is known to be impossible otherwise. Therefore, Kent's protocol gives
a theoretical solution to the problem of finding commitment
schemes secure over arbitrarily long time intervals. The functionality only
requires from
the committer a sequence of communications, including a post-revelation
validation,
each of which is guaranteed to be independent of its predecessor.
We propose to verify the security of the relativistic commitment not as a stand
alone protocol, but as an entity which is part of an unpredictable environment.
To achieve this task we use the universal composability paradigm defined by
Canetti [Can01]. The relevant property of the paradigm is the guarantee of
security even when a secure protocol
is composed with an arbitrary set of protocols, or, more generally, when the
protocol is used as an element of a possibly complex system. Unfortunately,
Kent's relativistic bit commitment satisfies universal composability only with
certain restrictions on the adversarial model. However, we construct a
two-party universal composable commitment
protocol, also based on general relativistic assumptions.
Export
BibTeX
@mastersthesis{Ciolacu2011,
TITLE = {Universally Composable Relativistic Commitments},
AUTHOR = {Ciolacu, Ines Lucia},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011-05},
ABSTRACT = {Designing communications protocols specifically adapted to relativistic situations (i.e. constrained by special relativity theory) is taking advantage of uniquely relativistic features to accomplish otherwise impossible tasks. Kent [Ken99] has demonstrated, for example, that secure bit commitment is possible using a protocol exploiting relativistic causality constraints, even though it is known to be impossible otherwise. Therefore, Kent's protocol gives a theoretical solution to the problem of finding commitment schemes secure over arbitrarily long time intervals. The functionality only requires from the committer a sequence of communications, including a post-revelation validation, each of which is guaranteed to be independent of its predecessor. We propose to verify the security of the relativistic commitment not as a stand alone protocol, but as an entity which is part of an unpredictable environment. To achieve this task we use the universal composability paradigm defined by Canetti [Can01]. The relevant property of the paradigm is the guarantee of security even when a secure protocol is composed with an arbitrary set of protocols, or, more generally, when the protocol is used as an element of a possibly complex system. Unfortunately, Kent's relativistic bit commitment satisfies universal composability only with certain restrictions on the adversarial model. However, we construct a two-party universal composable commitment protocol, also based on general relativistic assumptions.},
}
Endnote
%0 Thesis
%A Ciolacu, Ines Lucia
%Y Unruh, Dominique
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
%T Universally Composable Relativistic Commitments :
%U http://hdl.handle.net/11858/00-001M-0000-0027-A1B6-0
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%V master
%9 master
%X Designing communications protocols specifically adapted to relativistic
situations (i.e. constrained by special relativity theory) is taking advantage
of uniquely relativistic features to accomplish otherwise impossible tasks.
Kent [Ken99] has demonstrated, for example, that secure bit commitment is
possible using a protocol exploiting relativistic causality constraints, even
though it is known to be impossible otherwise. Therefore, Kent's protocol gives
a theoretical solution to the problem of finding commitment
schemes secure over arbitrarily long time intervals. The functionality only
requires from
the committer a sequence of communications, including a post-revelation
validation,
each of which is guaranteed to be independent of its predecessor.
We propose to verify the security of the relativistic commitment not as a stand
alone protocol, but as an entity which is part of an unpredictable environment.
To achieve this task we use the universal composability paradigm defined by
Canetti [Can01]. The relevant property of the paradigm is the guarantee of
security even when a secure protocol
is composed with an arbitrary set of protocols, or, more generally, when the
protocol is used as an element of a possibly complex system. Unfortunately,
Kent's relativistic bit commitment satisfies universal composability only with
certain restrictions on the adversarial model. However, we construct a
two-party universal composable commitment
protocol, also based on general relativistic assumptions.
[79]
M. Ebrahimi, “Solving Linear Programs in MapReduce,” Universität des Saarlandes, Saarbrücken, 2011.
Export
BibTeX
@mastersthesis{Ebrahimi2011,
TITLE = {Solving Linear Programs in {M}ap{R}educe},
AUTHOR = {Ebrahimi, Mahdi},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-24D5DC0DEA1F99D1C1257903003A46A3-Ebrahimi2011},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
}
Endnote
%0 Thesis
%A Ebrahimi, Mahdi
%Y Weikum, Gerhard
%A referee: Gemulla, Rainer
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Solving Linear Programs in MapReduce :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0010-14C4-E
%F EDOC: 618981
%F OTHER: Local-ID: C1256DBF005F876D-24D5DC0DEA1F99D1C1257903003A46A3-Ebrahimi2011
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%V master
%9 master
[80]
J. Iqbal, “Lineage Enabled Query Answering in Uncertain Knowledge Bases,” Universität des Saarlandes, Saarbrücken, 2011.
Abstract
We present a unified framework for query answering over uncertain RDF knowledge
bases. Specifically, our proposed design combines correlated base facts with a
query-driven, top-down deductive grounding phase of first-order logic formulas
(i.e., Horn rules), followed by a probabilistic inference phase. In addition
to static input correlations among base facts, we employ the lineage structure
obtained from processing the rules during the grounding phase in order to trace
the logical dependencies of query answers (i.e., derived facts) back to the base
facts. Thus, correlations (or more precisely: dependencies) among facts in a
knowledge base may arise from two sources: 1) static input dependencies
obtained from real-world observations; and 2) dynamic dependencies induced at
query time by the rule-based lineage structure of the query answer.
Our implementation employs state-of-the-art inference techniques: we
apply exact inference whenever tractable, the detection of shared factors,
shrinkage of Boolean formulas when feasible, and Gibbs sampling in the general case.
Our experiments are conducted on real data sets with synthetic expansion of
correlated base facts. The experimental evaluation demonstrates the practical
viability and scalability of our approach, achieving interactive query response
times over a very large knowledge base. The experimental results thus confirm
the effectiveness of the presented framework.
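As a toy illustration of the lineage idea (not the system built in the thesis), the probability of a derived fact can be read off its Boolean lineage formula over the base facts; the naive evaluation below assumes independent, non-shared base facts, which is exactly the case where exact inference is easy and why the thesis falls back to shared-factor handling and Gibbs sampling in general. All names are hypothetical.

def lineage_probability(node, base_prob):
    """Evaluate P(lineage) assuming sub-formulas are over independent,
    non-shared base facts (shared facts require exact methods or sampling).

    node      : ('fact', id) | ('and', [children]) | ('or', [children])
    base_prob : dict mapping base-fact id -> marginal probability
    """
    kind = node[0]
    if kind == 'fact':
        return base_prob[node[1]]
    child_probs = [lineage_probability(c, base_prob) for c in node[1]]
    if kind == 'and':            # all conjuncts must hold
        p = 1.0
        for q in child_probs:
            p *= q
        return p
    p_none = 1.0                 # 'or': at least one disjunct holds
    for q in child_probs:
        p_none *= (1.0 - q)
    return 1.0 - p_none

# e.g. a fact derived by two alternative rule groundings over three base facts:
# lineage_probability(('or', [('and', [('fact', 'f1'), ('fact', 'f2')]),
#                             ('fact', 'f3')]),
#                     {'f1': 0.9, 'f2': 0.8, 'f3': 0.5})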
Export
BibTeX
@mastersthesis{Iqbal2011,
TITLE = {Lineage Enabled Query Answering in Uncertain Knowledge Bases},
AUTHOR = {Iqbal, Javeria},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
ABSTRACT = {We present a unified framework for query answering over uncertain RDF knowledge bases. Specifically, our proposed design combines correlated base facts with a query-driven, top-down deductive grounding phase of first-order logic formulas (i.e., Horn rules), followed by a probabilistic inference phase. In addition to static input correlations among base facts, we employ the lineage structure obtained from processing the rules during the grounding phase in order to trace the logical dependencies of query answers (i.e., derived facts) back to the base facts. Thus, correlations (or more precisely: dependencies) among facts in a knowledge base may arise from two sources: 1) static input dependencies obtained from real-world observations; and 2) dynamic dependencies induced at query time by the rule-based lineage structure of the query answer. Our implementation employs state-of-the-art inference techniques: we apply exact inference whenever tractable, the detection of shared factors, shrinkage of Boolean formulas when feasible, and Gibbs sampling in the general case. Our experiments are conducted on real data sets with synthetic expansion of correlated base facts. The experimental evaluation demonstrates the practical viability and scalability of our approach, achieving interactive query response times over a very large knowledge base. The experimental results thus confirm the effectiveness of the presented framework.},
}
Endnote
%0 Thesis
%A Iqbal, Javeria
%Y Theobald, Martin
%A referee: Michel, Sebastian
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Lineage Enabled Query Answering in Uncertain Knowledge Bases :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-A1EA-B
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%V master
%9 master
%X We present a unified framework for query answering over uncertain RDF knowledge
bases. Specifically, our proposed design combines correlated base facts with a
query-driven, top-down deductive grounding phase of first-order logic formulas
(i.e., Horn rules), followed by a probabilistic inference phase. In addition
to static input correlations among base facts, we employ the lineage structure
obtained from processing the rules during the grounding phase in order to trace
the logical dependencies of query answers (i.e., derived facts) back to the base
facts. Thus, correlations (or more precisely: dependencies) among facts in a
knowledge base may arise from two sources: 1) static input dependencies
obtained from real-world observations; and 2) dynamic dependencies induced at
query time by the rule-based lineage structure of the query answer.
Our implementation employs state-of-the-art inference techniques: we
apply exact inference whenever tractable, the detection of shared factors,
shrinkage of Boolean formulas when feasible, and Gibbs sampling in the general case.
Our experiments are conducted on real data sets with synthetic expansion of
correlated base facts. The experimental evaluation demonstrates the practical
viability and scalability of our approach, achieving interactive query response
times over a very large knowledge base. The experimental results thus confirm
the effectiveness of the presented framework.
[81]
V. N. Ivanova, “Comparison of Methods for the Discovery of Copy Number Aberrations Relevant to Cancer,” Universität des Saarlandes, Saarbrücken, 2011.
Abstract
Recurrent genomic amplications and deletions characterize cancer genomes and
contribute to disease evolution. Array Comparative Genomic Hybridization (aCGH)
technology allows detection of chromosomal copy number aberrations in the
genomic DNA of tumors with high resolution. The association of consistent copy
number aberrations with particular types of cancer facilitates the
understanding of the pathogenesis of the disease, and contributes towards the
improvement of diagnosis, prognosis and the development of drugs. However,
distinguishing aberrations that are relevant to cancer from random background
aberrations is a dicult task, due to the high dimensionality of the aCGH data.
Different statistical methods have been developed to identify non-random gains
and losses across multiple samples. Their approaches vary in several aspects:
requirements necessary for an aberration to be recurrent, preprocessing of the
input data, statistical approaches used for assessing signicance of a
recurrent aberration and other biological considerations they use. So far,
multiple-sample analysis methods have only been evaluated qualitatively and
their relative merits remain unknown. In this work we propose an approach for
quantitative evaluation of the performance of four selected methods. We use
simulated data with known aberrations to validate each method and we interpret
the different outcomes. We also compare the performance of the methods on a
collection of neuroblastoma tumors by quantifying the agreement between
methods. We select appropriate techniques to combine the outputs of the methods
into a meaningful aggregation in order to obtain high-confidence lists of
significant copy number aberrations.
Export
BibTeX
@mastersthesis{Ivanova2011,
TITLE = {Comparison of Methods for the Discovery of Copy Number Aberrations Relevant to Cancer},
AUTHOR = {Ivanova, Violeta Nikolaeva},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
ABSTRACT = {Recurrent genomic amplifications and deletions characterize cancer genomes and contribute to disease evolution. Array Comparative Genomic Hybridization (aCGH) technology allows detection of chromosomal copy number aberrations in the genomic DNA of tumors with high resolution. The association of consistent copy number aberrations with particular types of cancer facilitates the understanding of the pathogenesis of the disease, and contributes towards the improvement of diagnosis, prognosis and the development of drugs. However, distinguishing aberrations that are relevant to cancer from random background aberrations is a difficult task, due to the high dimensionality of the aCGH data. Different statistical methods have been developed to identify non-random gains and losses across multiple samples. Their approaches vary in several aspects: requirements necessary for an aberration to be recurrent, preprocessing of the input data, statistical approaches used for assessing significance of a recurrent aberration and other biological considerations they use. So far, multiple-sample analysis methods have only been evaluated qualitatively and their relative merits remain unknown. In this work we propose an approach for quantitative evaluation of the performance of four selected methods. We use simulated data with known aberrations to validate each method and we interpret the different outcomes. We also compare the performance of the methods on a collection of neuroblastoma tumors by quantifying the agreement between methods. We select appropriate techniques to combine the outputs of the methods into a meaningful aggregation in order to obtain high-confidence lists of significant copy number aberrations.},
}
Endnote
%0 Thesis
%A Ivanova, Violeta Nikolaeva
%Y Lengauer, Thomas
%A referee: Lenhof, Hans-Peter
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Comparison of Methods for the Discovery of Copy Number Aberrations Relevant to Cancer :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-A1F0-C
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%P 128 p.
%V master
%9 master
%X Recurrent genomic amplifications and deletions characterize cancer genomes and
contribute to disease evolution. Array Comparative Genomic Hybridization (aCGH)
technology allows detection of chromosomal copy number aberrations in the
genomic DNA of tumors with high resolution. The association of consistent copy
number aberrations with particular types of cancer facilitates the
understanding of the pathogenesis of the disease, and contributes towards the
improvement of diagnosis, prognosis and the development of drugs. However,
distinguishing aberrations that are relevant to cancer from random background
aberrations is a difficult task, due to the high dimensionality of the aCGH data.
Different statistical methods have been developed to identify non-random gains
and losses across multiple samples. Their approaches vary in several aspects:
requirements necessary for an aberration to be recurrent, preprocessing of the
input data, statistical approaches used for assessing significance of a
recurrent aberration and other biological considerations they use. So far,
multiple-sample analysis methods have only been evaluated qualitatively and
their relative merits remain unknown. In this work we propose an approach for
quantitative evaluation of the performance of four selected methods. We use
simulated data with known aberrations to validate each method and we interpret
the different outcomes. We also compare the performance of the methods on a
collection of neuroblastoma tumors by quantifying the agreement between
methods. We select appropriate techniques to combine the outputs of the methods
into a meaningful aggregation in order to obtain high-confidence lists of
significant copy number aberrations.
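The evaluation described above quantifies the agreement between different aberration-calling methods on the same samples. One simple way to do this, shown here purely as an illustration and not as the aggregation technique used in the thesis, is the Jaccard index between the per-probe calls of two methods.

import numpy as np

def jaccard_agreement(calls_a, calls_b):
    """Jaccard index between two boolean per-probe aberration call vectors."""
    calls_a, calls_b = np.asarray(calls_a, bool), np.asarray(calls_b, bool)
    union = np.logical_or(calls_a, calls_b).sum()
    if union == 0:
        return 1.0  # neither method calls anything: trivial agreement
    return np.logical_and(calls_a, calls_b).sum() / union

# Probes 10-19 and 14-24 called as gained by two methods on a 50-probe chromosome
a = np.zeros(50, bool); a[10:20] = True
b = np.zeros(50, bool); b[14:25] = True
print(jaccard_agreement(a, b))  # 6 shared probes / 15 probes in the union = 0.4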
[82]
Y. Kargin, “Distributed Analytics over Web Archives,” Universität des Saarlandes, Saarbrücken, 2011.
Export
BibTeX
@mastersthesis{KarginMaster2011,
TITLE = {Distributed Analytics over Web Archives},
AUTHOR = {Kargin, Yagiz},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-E222B52EB02061B7C125783F0041FA53-KarginMaster2011},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
}
Endnote
%0 Thesis
%A Kargin, Yagiz
%Y Bedathur, Srikanta
%A referee: Anand, Avishek
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Distributed Analytics over Web Archives :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0010-1447-A
%F EDOC: 618943
%F OTHER: Local-ID: C1256DBF005F876D-E222B52EB02061B7C125783F0041FA53-KarginMaster2011
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%V master
%9 master
[83]
E. Kuzey, “Extraction of Temporal Facts and Events from Wikipedia,” Universität des Saarlandes, Saarbrücken, 2011.
Export
BibTeX
@mastersthesis{Kuzey2011,
TITLE = {Extraction of Temporal Facts and Events from Wikipedia},
AUTHOR = {Kuzey, Erdal},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-E87CB338D81CD766C12578CC003C2F6E-Kuzey2011},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
}
Endnote
%0 Thesis
%A Kuzey, Erdal
%Y Weikum, Gerhard
%A referee: Theobald, Martin
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Extraction of Temporal Facts and Events from Wikipedia :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0010-1455-A
%F EDOC: 618966
%F OTHER: Local-ID: C1256DBF005F876D-E87CB338D81CD766C12578CC003C2F6E-Kuzey2011
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%V master
%9 master
[84]
D. Mahmoud, “Multiple-frame Image Super Resolution Based on Optic Flow,” Universität des Saarlandes, Saarbrücken, 2011.
Abstract
Super resolution is the task of reconstructing one or several high resolution
images, from one or several low resolution images. A variety of super resolution
methods have been proposed over the past three decades, some following a
single-frame based methodology while others utilize a multiple-frame based
one. These methods are usually very sensitive to their underlying model of data
and noise, which limits their performance. In this thesis, we propose and
compare
two multiple-frame based approaches that address such shortcomings. In the first
proposal we investigate a fast, local approach which combines the low resolution
frames via warping and then performs diffusion-based inpainting. The second
proposal models the image formation process in a variational framework with
regularization that is robust to errors in motion and blur estimation. In
addition, we
introduce a brightness adaptation step which results in images with sharper
edges.
An accurate estimation of optical flow among the low resolution measurements
is a fundamental step towards high quality super resolution for both methods.
Experiments confirm the effectiveness of our method on a variety of super
resolution
benchmark sequences, as well as its superiority in performance to other
closely-related methods.
Export
BibTeX
@mastersthesis{Mahmoud2011,
TITLE = {Multiple-frame Image Super Resolution Based on Optic Flow},
AUTHOR = {Mahmoud, Dina},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
ABSTRACT = {Super resolution is the task of reconstructing one or several high resolution images, from one or several low resolution images. A variety of super resolution methods have been proposed over the past three decades, some following a single-frame based methodology while others utilize a multiple-frame based one. These methods are usually very sensitive to their underlying model of data and noise, which limits their performance. In this thesis, we propose and compare two multiple-frame based approaches that address such shortcomings. In the first proposal we investigate a fast, local approach which combines the low resolution frames via warping and then performs diffusion-based inpainting. The second proposal models the image formation process in a variational framework with regularization that is robust to errors in motion and blur estimation. In addition, we introduce a brightness adaptation step which results in images with sharper edges. An accurate estimation of optical flow among the low resolution measurements is a fundamental step towards high quality super resolution for both methods. Experiments confirm the effectiveness of our method on a variety of super resolution benchmark sequences, as well as its superiority in performance to other closely-related methods.},
}
Endnote
%0 Thesis
%A Mahmoud, Dina
%Y Bruhn, Andres
%A referee: Weickert, Joachim
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Multiple-frame Image Super Resolution Based on Optic Flow :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-A7F2-3
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%V master
%9 master
%X Super resolution is the task of reconstructing one or several high resolution
images, from one or several low resolution images. A variety of super resolution
methods have been proposed over the past three decades, some following a
single-frame based methodology while others utilize a multiple-frame based
one. These methods are usually very sensitive to their underlying model of data
and noise, which limits their performance. In this thesis, we propose and
compare
two multiple-frame based approaches that address such shortcomings. In the first
proposal we investigate a fast, local approach which combines the low resolution
frames via warping and then performs diffusion-based inpainting. The second
proposal models the image formation process in a variational framework with
regularization that is robust to errors in motion and blur estimation. In
addition, we
introduce a brightness adaptation step which results in images with sharper
edges.
An accurate estimation of optical flow among the low resolution measurements
is a fundamental step towards high quality super resolution for both methods.
Experiments confirm the effectiveness of our method on a variety of super
resolution
benchmark sequences, as well as its superiority in performance to other
closely-related methods.
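The first approach above fuses the low-resolution frames by warping them onto a reference frame with the estimated optical flow before inpainting. The minimal NumPy sketch below shows such a warping-and-averaging step only; it uses nearest-neighbour sampling, omits the diffusion-based inpainting and the variational refinement entirely, and all function names are illustrative assumptions rather than the thesis implementation.

import numpy as np

def warp_to_reference(frame, flow):
    """Backward-warp a frame onto the reference grid using a dense flow field.
    frame: (H, W) image; flow: (H, W, 2) displacements (dx, dy) mapping reference
    coordinates to this frame. Nearest-neighbour sampling keeps the sketch short."""
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(ys + flow[..., 1]), 0, h - 1).astype(int)
    return frame[src_y, src_x]

def fuse(frames, flows):
    """Average several warped low-resolution frames (the simplest possible fusion)."""
    warped = [warp_to_reference(f, fl) for f, fl in zip(frames, flows)]
    return np.mean(warped, axis=0)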
[85]
M. Malinowski, “Optimization Algorithms in the Reconstruction of MR Images: A Comparative Study,” Universität des Saarlandes, Saarbrücken, 2011.
Abstract
Time that an imaging device needs to produce results
is one of the most crucial factors in medical imaging.
Shorter scanning duration causes fewer artifacts such as those
created by the patient motion. In addition, it increases patient comfort and
in the case of some imaging modalities also decreases exposure to radiation.
There are some possibilities,
hardware-based or software-based, to improve the imaging speed. One way
is to speed up the scanning process by acquiring fewer measurements. A recently
developed mathematical framework called compressed
sensing shows that it is possible to accurately recover undersampled images
provided a suitable measurement matrix is used and the image itself has a
sparse representation.
Nevertheless, not only measurements are important but also good reconstruction
models are required. Such models are usually expressed as optimization
problems.
In this thesis, we concentrated on the reconstruction of the undersampled
Magnetic Resonance (MR) images.
For this purpose a complex-valued reconstruction model was provided.
Since the reconstruction should be as quick as possible,
fast methods to find the solution for the reconstruction problem are required.
To meet this objective, three popular algorithms FISTA,
Augmented Lagrangian and Non-linear Conjugate Gradient were
adopted to work with our model.
By changing the complex-valued reconstruction model slightly
and dualizing the problem, we obtained an instance of the quadratically
constrained quadratic program where both the objective function and the
constraints are twice differentiable. Hence the new model opened the door to two other
methods, the first order method which resembles FISTA and is called in
this thesis Normed Constrained Quadratic FGP, and the second order
method called Truncated Newton Primal Dual Interior Point.
Next, in order to
compare performance of the methods, we set up the experiments and evaluated all
presented methods against the problem of reconstructing undersampled MR images.
In the experiments we used the number of invocations of the Fourier transform
to measure the performance of all algorithms.
As a result of the experiments we found that in the context of the original
model the performance of Augmented Lagrangian is better than the other
two methods. Performance of Non-linear Conjugate Gradient and
FISTA are about the same.
In the context of the extended model Normed Constrained Quadratic FGP
beats the Truncated Newton Primal Dual Interior Point method.
Export
BibTeX
@mastersthesis{Malinowski2011,
TITLE = {Optimization Algorithms in the Reconstruction of {MR} Images: A Comparative Study},
AUTHOR = {Malinowski, Mateusz},
LANGUAGE = {eng},
LOCALID = {Local-ID: C12576EE0048963A-A807E45B9B37277EC12579A3003A3E73-Malinowski2011},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
ABSTRACT = {Time that an imaging device needs to produce results is one of the most crucial factors in medical imaging. Shorter scanning duration causes fewer artifacts such as those created by the patient motion. In addition, it increases patient comfort and in the case of some imaging modalities also decreases exposure to radiation. There are some possibilities, hardware-based or software-based, to improve the imaging speed. One way is to speed up the scanning process by acquiring fewer measurements. A recently developed mathematical framework called compressed sensing shows that it is possible to accurately recover undersampled images provided a suitable measurement matrix is used and the image itself has a sparse representation. Nevertheless, not only measurements are important but also good reconstruction models are required. Such models are usually expressed as optimization problems. In this thesis, we concentrated on the reconstruction of the undersampled Magnetic Resonance (MR) images. For this purpose a complex-valued reconstruction model was provided. Since the reconstruction should be as quick as possible, fast methods to find the solution for the reconstruction problem are required. To meet this objective, three popular algorithms FISTA, Augmented Lagrangian and Non-linear Conjugate Gradient were adopted to work with our model. By changing the complex-valued reconstruction model slightly and dualizing the problem, we obtained an instance of the quadratically constrained quadratic program where both the objective function and the constraints are twice differentiable. Hence new model opened doors to two other methods, the first order method which resembles FISTA and is called in this thesis Normed Constrained Quadratic FGP, and the second order method called Truncated Newton Primal Dual Interior Point. Next, in order to compare performance of the methods, we set up the experiments and evaluated all presented methods against the problem of reconstructing undersampled MR images. In the experiments we used a number of invocations of the Fourier transform to measure the performance of all algorithms. As a result of the experiments we found that in the context of the original model the performance of Augmented Lagrangian is better than the other two methods. Performance of Non-linear Conjugate Gradient and FISTA are about the same. In the context of the extended model Normed Constrained Quadratic FGP beats the Truncated Newton Primal Dual Interior Point method.},
}
Endnote
%0 Thesis
%A Malinowski, Mateusz
%Y Seeger, Matthias
%A referee: Hein, Matthias
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Optimization Algorithms in the Reconstruction of MR Images: A Comparative Study :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0010-11B0-3
%F EDOC: 618778
%F OTHER: Local-ID: C12576EE0048963A-A807E45B9B37277EC12579A3003A3E73-Malinowski2011
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%V master
%9 master
%X Time that an imaging device needs to produce results
is one of the most crucial factors in medical imaging.
Shorter scanning duration causes fewer artifacts such as those
created by the patient motion. In addition, it increases patient comfort and
in the case of some imaging modalities also decreases exposure to radiation.
There are some possibilities,
hardware-based or software-based, to improve the imaging speed. One way
is to speed up the scanning process by acquiring fewer measurements. A recently
developed mathematical framework called compressed
sensing shows that it is possible to accurately recover undersampled images
provided a suitable measurement matrix is used and the image itself has a
sparse representation.
Nevertheless, not only measurements are important but also good reconstruction
models are required. Such models are usually expressed as optimization
problems.
In this thesis, we concentrated on the reconstruction of the undersampled
Magnetic Resonance (MR) images.
For this purpose a complex-valued reconstruction model was provided.
Since the reconstruction should be as quick as possible,
fast methods to find the solution for the reconstruction problem are required.
To meet this objective, three popular algorithms FISTA,
Augmented Lagrangian and Non-linear Conjugate Gradient were
adopted to work with our model.
By changing the complex-valued reconstruction model slightly
and dualizing the problem, we obtained an instance of the quadratically
constrained quadratic program where both the objective function and the
constraints are twice differentiable. Hence the new model opened the door to two other
methods, the first order method which resembles FISTA and is called in
this thesis Normed Constrained Quadratic FGP, and the second order
method called Truncated Newton Primal Dual Interior Point.
Next, in order to
compare performance of the methods, we set up the experiments and evaluated all
presented methods against the problem of reconstructing undersampled MR images.
In the experiments we used the number of invocations of the Fourier transform
to measure the performance of all algorithms.
As a result of the experiments we found that in the context of the original
model the performance of Augmented Lagrangian is better than the other
two methods. Performance of Non-linear Conjugate Gradient and
FISTA are about the same.
In the context of the extended model Normed Constrained Quadratic FGP
beats the Truncated Newton Primal Dual Interior Point method.
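For readers unfamiliar with FISTA, the sketch below shows the algorithm on a generic real-valued l1-regularized least-squares problem, which is the standard compressed-sensing formulation; the thesis works with a complex-valued MR model and Fourier measurements, so this sketch (with a generic matrix A) is an illustration under simplifying assumptions rather than the thesis implementation.

import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def fista(A, b, lam, iters=200):
    """FISTA for  min_x 0.5*||Ax - b||^2 + lam*||x||_1  (real-valued sketch)."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the data-term gradient
    x = np.zeros(A.shape[1]); y = x.copy(); t = 1.0
    for _ in range(iters):
        grad = A.T @ (A @ y - b)
        x_new = soft_threshold(y - grad / L, lam / L)   # proximal gradient step
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)     # momentum (extrapolation) step
        x, t = x_new, t_new
    return x

The momentum sequence t_k is what distinguishes FISTA from plain ISTA and gives the faster O(1/k^2) convergence rate that makes it attractive for fast MR reconstruction.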
[86]
N. Prytkova, “Modeling and Evaluation of Co-Evolution in Collective Web Memories,” Universität des Saarlandes, Saarbrücken, 2011.
Abstract
The constantly evolving Web reflects the evolution of society in cyberspace.
Projects like the Open Directory Project (dmoz.org) can be understood as a
collective memory of society on the Web. The main assumption is that such
collective Web memories evolve when a certain cognition level about a concept
has been exceeded. In the scope of our work we analyse the New York Times
archive for concept detection. There are several approaches to the concept
modelling. We introduce an alternative model for concepts, which does not make
any additional assumptions about types of contained entities or the number of
entities in the corpus. Moreover, the proposed distributed concept computation
algorithm enables the large scale archive analysis. We also introduce a model
of cognition level and explain how it can be employed to predict changes in the
category system of DMOZ.
Export
BibTeX
@mastersthesis{Prytkova2011,
TITLE = {Modeling and Evaluation of Co-Evolution in Collective Web Memories},
AUTHOR = {Prytkova, Natalia},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-680CF1F2F3D8F339C1257957003777B9-Prytkova2011},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
ABSTRACT = {The constantly evolving Web reflects the evolution of society in cyberspace. Projects like the Open Directory Project (dmoz.org) can be understood as a collective memory of society on the Web. The main assumption is that such collective Web memories evolve when a certain cognition level about a concept has been exceeded. In the scope of our work we analyse the New York Times archive for concept detection. There are several approaches to the concept modelling. We introduce an alternative model for concepts, which does not make any additional assumptions about types of contained entities or the number of entities in the corpus. Moreover, the proposed distributed concept computation algorithm enables the large scale archive analysis. We also introduce a model of cognition level and explain how it can be employed to predict changes in the category system of DMOZ.},
}
Endnote
%0 Thesis
%A Prytkova, Natalia
%Y Weikum, Gerhard
%A referee: Spaniol, Marc
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Modeling and Evaluation of Co-Evolution in Collective Web Memories :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0010-1493-D
%F EDOC: 618987
%F OTHER: Local-ID: C1256DBF005F876D-680CF1F2F3D8F339C1257957003777B9-Prytkova2011
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%V master
%9 master
%X The constantly evolving Web reflects the evolution of society in cyberspace.
Projects like the Open Directory Project (dmoz.org) can be understood as a
collective memory of society on the Web. The main assumption is that such
collective Web memories evolve when a certain cognition level about a concept
has been exceeded. In the scope of our work we analyse the New York Times
archive for concept detection. There are several approaches to the concept
modelling. We introduce an alternative model for concepts, which does not make
any additional assumptions about types of contained entities or the number of
entities in the corpus. Moreover, the proposed distributed concept computation
algorithm enables the large scale archive analysis. We also introduce a model
of cognition level and explain how it can be employed to predict changes in the
category system of DMOZ.
[87]
T. Samar, “Scalable Distributed Time-Travel Text Search,” Universität des Saarlandes, Saarbrücken, 2011.
Export
BibTeX
@mastersthesis{Samar2011,
TITLE = {Scalable Distributed Time-Travel Text Search},
AUTHOR = {Samar, Thaer},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-A12913F267F0DA6FC12579650034300D-Samar2011},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
}
Endnote
%0 Thesis
%A Samar, Thaer
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Scalable Distributed Time-Travel Text Search :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0010-14B7-C
%F EDOC: 618990
%F OTHER: Local-ID: C1256DBF005F876D-A12913F267F0DA6FC12579650034300D-Samar2011
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%V master
%9 master
[88]
M. Simonovsky, “Hand Shape Recognition Using a ToF Camera : An Application to Sign Language,” Universität des Saarlandes, Saarbrücken, 2011.
Abstract
This master's thesis investigates the benefit of utilizing depth
information acquired by a time-of-flight (ToF) camera for hand shape
recognition from unrestricted viewpoints. Specifically, we assess the
hypothesis that classical 3D content descriptors might be
inappropriate for ToF depth images due to the 2.5D nature and
noisiness of the data and possible expensive computations in 3D space.
Instead, we extend 2D descriptors to make use of the additional
semantics of depth images. Our system is based on the appearance-based
retrieval paradigm, using a synthetic 3D hand model to generate its
database. The system is able to run at interactive frame rates. For
increased robustness, no color, intensity, or time coherence
information is used. A novel, domain-specific algorithm for segmenting
the forearm from the upper body based on reprojecting the acquired
geometry into the lateral view is introduced. Moreover, three kinds of
descriptors exploiting depth data are proposed and the made design
choices are experimentally supported. The whole system is then
evaluated on an American sign language fingerspelling dataset.
However, the retrieval performance still leaves room for improvements.
Several insights and possible reasons are discussed.
Export
BibTeX
@mastersthesis{Simonovsky2010,
TITLE = {Hand Shape Recognition Using a {ToF} Camera : An Application to Sign Language},
AUTHOR = {Simonovsky, Martin},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125675300671F7B-F6F3ECA002CC444FC1257970006B4EF3-Simonovsky2010},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
ABSTRACT = {This master's thesis investigates the benefit of utilizing depth information acquired by a time-of-flight (ToF) camera for hand shape recognition from unrestricted viewpoints. Specifically, we assess the hypothesis that classical 3D content descriptors might be inappropriate for ToF depth images due to the 2.5D nature and noisiness of the data and possible expensive computations in 3D space. Instead, we extend 2D descriptors to make use of the additional semantics of depth images. Our system is based on the appearance-based retrieval paradigm, using a synthetic 3D hand model to generate its database. The system is able to run at interactive frame rates. For increased robustness, no color, intensity, or time coherence information is used. A novel, domain-specific algorithm for segmenting the forearm from the upper body based on reprojecting the acquired geometry into the lateral view is introduced. Moreover, three kinds of descriptors exploiting depth data are proposed and the made design choices are experimentally supported. The whole system is then evaluated on an American sign language fingerspelling dataset. However, the retrieval performance still leaves room for improvements. Several insights and possible reasons are discussed.},
}
Endnote
%0 Thesis
%A Simonovsky, Martin
%Y Theobalt, Christian
%A referee: Müller, Meinard
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
External Organizations
%T Hand Shape Recognition Using a ToF Camera : An Application to Sign Language :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0010-11B9-2
%F EDOC: 618894
%F OTHER: Local-ID: C125675300671F7B-F6F3ECA002CC444FC1257970006B4EF3-Simonovsky2010
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%P III, 64 p.
%V master
%9 master
%X This master's thesis investigates the benefit of utilizing depth
information acquired by a time-of-flight (ToF) camera for hand shape
recognition from unrestricted viewpoints. Specifically, we assess the
hypothesis that classical 3D content descriptors might be
inappropriate for ToF depth images due to the 2.5D nature and
noisiness of the data and possible expensive computations in 3D space.
Instead, we extend 2D descriptors to make use of the additional
semantics of depth images. Our system is based on the appearance-based
retrieval paradigm, using a synthetic 3D hand model to generate its
database. The system is able to run at interactive frame rates. For
increased robustness, no color, intensity, or time coherence
information is used. A novel, domain-specific algorithm for segmenting
the forearm from the upper body based on reprojecting the acquired
geometry into the lateral view is introduced. Moreover, three kinds of
descriptors exploiting depth data are proposed and the made design
choices are experimentally supported. The whole system is then
evaluated on an American sign language fingerspelling dataset.
However, the retrieval performance still leaves room for improvements.
Several insights and possible reasons are discussed.
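The appearance-based retrieval paradigm mentioned above reduces recognition to describing the query depth image and returning the nearest pose from the synthetic database. The sketch below uses a plain normalized depth histogram and an L2 nearest-neighbour lookup purely for illustration; the thesis proposes three more elaborate depth-aware descriptors, and all names here are assumptions.

import numpy as np

def depth_histogram(depth, mask, bins=32):
    """Normalized histogram of relative depth values inside the hand mask."""
    vals = depth[mask]
    vals = vals - vals.min()           # make the descriptor invariant to absolute depth
    hist, _ = np.histogram(vals, bins=bins, range=(0, vals.max() + 1e-6))
    return hist / max(hist.sum(), 1)

def retrieve(query_desc, database):
    """Return the id of the database entry whose descriptor is closest in L2 distance.
    database: dict mapping pose id -> descriptor array."""
    ids, descs = zip(*database.items())
    d = np.linalg.norm(np.stack(descs) - query_desc, axis=1)
    return ids[int(np.argmin(d))]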
[89]
N. Tandon, “Deriving a Web-scale Common Sense Fact Knowledge Base,” Universität des Saarlandes, Saarbrücken, 2011.
Abstract
The fact that birds have feathers and ice is cold seems trivially true. Yet,
most machine-readable sources of knowledge either lack such common sense facts
entirely or have only limited coverage. Prior work on automated knowledge base
construction has largely focused on relations between named entities and on
taxonomic knowledge, while disregarding common sense properties.
Extracting such structured data from text is challenging, especially due to the
scarcity of explicitly expressed knowledge. Even when relying on large document
collections, pattern-based information extraction approaches typically discover
insufficient amounts of information.
This thesis investigates harvesting massive amounts of common sense knowledge
using the textual knowledge of the entire Web, yet staying away from the
massive engineering efforts in procuring such a large corpus as the Web. Despite
the advancements in knowledge harvesting, we observed that the state of the art
methods were limited in terms of accuracy and discovered insufficient amounts
of information under our desired setting.
This thesis shows how to gather large amounts of common sense facts from Web
N-gram data, using seeds from the existing knowledge bases like ConceptNet. Our
novel contributions include scalable methods for tapping onto Web-scale data
and a new scoring model to determine which patterns and facts are most reliable.
An extensive experimental evaluation is provided for three different binary
relations, comparing different sources of n-gram data as well as different
algorithms. The experimental results show that this approach extends ConceptNet
by many orders of magnitude (more than 200-fold) at comparable levels of
precision.
Export
BibTeX
@mastersthesis{MasterTandon2011,
TITLE = {Deriving a Web-scale Common Sense Fact Knowledge Base},
AUTHOR = {Tandon, Niket},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
ABSTRACT = {The fact that birds have feathers and ice is cold seems trivially true. Yet, most machine-readable sources of knowledge either lack such common sense facts entirely or have only limited coverage. Prior work on automated knowledge base construction has largely focused on relations between named entities and on taxonomic knowledge, while disregarding common sense properties. Extracting such structured data from text is challenging, especially due to the scarcity of explicitly expressed knowledge. Even when relying on large document collections, pattern-based information extraction approaches typically discover insufficient amounts of information. This thesis investigates harvesting massive amounts of common sense knowledge using the textual knowledge of the entire Web, yet staying away from the massive engineering efforts in procuring such a large corpus as the Web. Despite the advancements in knowledge harvesting, we observed that the state of the art methods were limited in terms of accuracy and discovered insufficient amounts of information under our desired setting. This thesis shows how to gather large amounts of common sense facts from Web N-gram data, using seeds from the existing knowledge bases like ConceptNet. Our novel contributions include scalable methods for tapping onto Web-scale data and a new scoring model to determine which patterns and facts are most reliable. An extensive experimental evaluation is provided for three different binary relations, comparing different sources of n-gram data as well as different algorithms. The experimental results show that this approach extends ConceptNet by many orders of magnitude (more than 200-fold) at comparable levels of precision.},
}
Endnote
%0 Thesis
%A Tandon, Niket
%Y Weikum, Gerhard
%A referee: Theobalt, Christian
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T Deriving a Web-scale Common Sense Fact Knowledge Base :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-ABF9-8
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%P X, 81 p.
%V master
%9 master
%X The fact that birds have feathers and ice is cold seems trivially true. Yet,
most machine-readable sources of knowledge either lack such common sense facts
entirely or have only limited coverage. Prior work on automated knowledge base
construction has largely focused on relations between named entities and on
taxonomic knowledge, while disregarding common sense properties.
Extracting such structured data from text is challenging, especially due to the
scarcity of explicitly expressed knowledge. Even when relying on large document
collections, pattern-based information extraction approaches typically discover
insufficient amounts of information.
This thesis investigates harvesting massive amounts of common sense knowledge
using the textual knowledge of the entire Web, yet staying away from the
massive engineering efforts in procuring such a large corpus as the Web. Despite
the advancements in knowledge harvesting, we observed that the state of the art
methods were limited in terms of accuracy and discovered insufficient amounts
of information under our desired setting.
This thesis shows how to gather large amounts of common sense facts from Web
N-gram data, using seeds from the existing knowledge bases like ConceptNet. Our
novel contributions include scalable methods for tapping onto Web-scale data
and a new scoring model to determine which patterns and facts are most reliable.
An extensive experimental evaluation is provided for three different binary
relations, comparing different sources of n-gram data as well as different
algorithms. The experimental results show that this approach extends ConceptNet
by many orders of magnitude (more than 200-fold) at comparable levels of
precision.
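A heavily simplified sketch of the seed-driven pattern harvesting idea follows: textual patterns are matched against (n-gram, frequency) pairs, each pattern is scored by how often it co-occurs with ConceptNet-style seed facts, and candidate facts inherit the scores of the patterns that produced them. The pattern syntax and the scoring formula here are assumptions made for illustration, not the thesis's actual model.

import re
from collections import Counter

seeds = {("bird", "feathers"), ("ice", "cold")}          # ConceptNet-style seed pairs
patterns = [r"(\w+) has (\w+)", r"(\w+) is (\w+)", r"(\w+) with (\w+)"]

def harvest(ngrams, min_pattern_score=0.5):
    """ngrams: iterable of (text, frequency). Returns scored candidate facts."""
    pat_hits, pat_seed_hits = Counter(), Counter()
    candidates = Counter()
    for text, freq in ngrams:
        for pat in patterns:
            m = re.fullmatch(pat, text)
            if not m:
                continue
            pair = (m.group(1).lower(), m.group(2).lower())
            pat_hits[pat] += freq
            if pair in seeds:
                pat_seed_hits[pat] += freq       # pattern confirmed by a seed fact
            candidates[(pair, pat)] += freq
    scored = Counter()
    for (pair, pat), freq in candidates.items():
        reliability = pat_seed_hits[pat] / pat_hits[pat]   # fraction of seed support
        if reliability >= min_pattern_score:
            scored[pair] += freq * reliability
    return scored.most_common()

print(harvest([("bird has feathers", 120), ("stone has feathers", 2), ("ice is cold", 80)]))

As the toy output shows, frequency-weighted seed support alone still admits noisy candidates such as (stone, feathers), which is why the thesis introduces a dedicated scoring model over both patterns and facts.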
[90]
C. Teflioudi, “Learning Soft Inference Rules in Large and Uncertain Knowledge Bases,” Universität des Saarlandes, Saarbrücken, 2011.
Export
BibTeX
@mastersthesis{Teflioudi2011,
TITLE = {Learning Soft Inference Rules in Large and Uncertain Knowledge Bases},
AUTHOR = {Teflioudi, Christina},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-8C6053D027747A15C1257850004BFB15-Teflioudi2011},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
}
Endnote
%0 Thesis
%A Teflioudi, Christina
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Learning Soft Inference Rules in Large and Uncertain Knowledge Bases :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0010-1486-B
%F EDOC: 618950
%F OTHER: Local-ID: C1256DBF005F876D-8C6053D027747A15C1257850004BFB15-Teflioudi2011
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%V master
%9 master
[91]
A. T. Tran, “Context-aware timeline for entity exploration,” Universität des Saarlandes, Saarbrücken, 2011.
Export
BibTeX
@mastersthesis{Tran2011,
TITLE = {Context-aware timeline for entity exploration},
AUTHOR = {Tran, Anh Tuan},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-3DE40C1C4F51E0FBC1257853004E31B4-Tran2011},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
}
Endnote
%0 Thesis
%A Tran, Anh Tuan
%Y Preda, Nicoleta
%A referee: Elbassuoni, Shady
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Context-aware timeline for entity exploration :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0010-1433-5
%F EDOC: 618952
%F OTHER: Local-ID: C1256DBF005F876D-3DE40C1C4F51E0FBC1257853004E31B4-Tran2011
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%V master
%9 master
[92]
M. R. Yousefi, “Generating Detailed Face Models by Controlled Lighting,” Universität des Saarlandes, Saarbrücken, 2011.
Export
BibTeX
@mastersthesis{MasterThesisYousefiMohammad,
TITLE = {Generating Detailed Face Models by Controlled Lighting},
AUTHOR = {Yousefi, Mohammad Reza},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2011},
DATE = {2011},
}
Endnote
%0 Thesis
%A Yousefi, Mohammad Reza
%A referee: Seidel, Hans-Peter
%Y Thormählen, Thorsten
%+ Computer Graphics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T Generating Detailed Face Models by Controlled Lighting :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0010-13C1-B
%F EDOC: 618921
%I Universität des Saarlandes
%C Saarbrücken
%D 2011
%P XVI, 39 p.
%V master
%9 master
2010
[93]
A. Andreychenko, “Uniformization for Time-Inhomogeneous Markov Population Models,” Universität des Saarlandes, Saarbrücken, 2010.
Abstract
Time is one of the main factors in any kind of real-life system. When a certain system is
analysed one is often interested in its evolution with respect to time. Various phenomena
can be described using a form of time-dependency. The difference between load in
call-centres is the example of time-dependency in queueing systems. The process of
migration of biological species in autumn and spring is another illustration of changing
the behaviour in time. The ageing process in critical infrastructures (which can result in
the system component failure) can also be considered as another type of time-dependent
evolution.
Considering the variability in time for chemical and biological systems one comes to
the general tasks of systems biology [9]. It is an inter-disciplinary study field which
investigates complex interactions between components of biological systems and aims
to explore the fundamental laws and new features of them. Systems biology is also
used for referring to a certain type of research cycle. It starts with the creation of a
model. One tries to describe the behaviour in a most intuitive and informative way
which assumes convenience and visibility of future analysis. The traditional approach
is based on deterministic models where the evolution can be predicted with certainty.
This type of model usually operates at a macroscopic scale and if one considers chemical
reactions the state of the system is represented by the concentrations of species and
a continuous deterministic change is assumed. A set of ordinary differential equations
(ODE) is one of the ways to describe such kind of models. To obtain a solution numerical
methods are applied. The choice of a certain ODE-solver depends on the type of the
ODE system. Another option is a full description of the chemical reaction system where
we model each single molecule explicitly operating with their properties and positions
in space. Naturally it is difficult to treat big systems in such a way and it also creates
restrictions for computational analysis.
However it reveals that the deterministic formalism is not always sufficient to describe
all possible ways for the system to evolve. For instance, the Lambda phage decision
circuit [1] can be a motivational example of such system. When the lambda phage
virus infects the E.coli bacterium it can evolve in two different ways. The first one is
lysogeny where the genome of the virus is integrated into the genome of the bacterium.
Virus DNA is then replicated in descendant cells using the replication mechanism of
the host cell. Another way is entering the lytic cycle, which means that new phages
are synthesized directly in the host cell and finally its membrane is destroyed and new
phages are released. A deterministic model is not appropriate to describe this process of
choosing between two pathways as this decision is probabilistic and one needs a stochastic
model to give an appropriate description.
Another important issue which has to be addressed is the fact that the state of the
system changes discretely. It means that one considers not the continuous change of
chemical species concentrations but discrete events occurring with different probabilities
(they can be time-dependent as well).
We will use the continuous-time Markov Population Models (MPMs) formalism in
this thesis to describe discrete-state stochastic systems. They are indeed
continuous-time Markov processes, where the state of the system represents
populations and it is expressed by the vector of natural numbers. Such systems
can have infinitely many
states. For the case of chemical reactions network it results in the fact that one can
not provide strict upper bounds for the population of certain species. When analysing
these systems one can estimate measures of interest (like expectation and variance for
the certain species populations at a given time instant). Besides this, probabilities for
certain events to occur can be important (for instance, the probability for population to
reach the threshold or the probability for a given species to go extinct).
The usual way to investigate properties of these systems is simulation [8] which means
that a large amount of possible sample trajectories are generated and then analysed.
However it can be difficult to collect a sufficient number of trajectories to provide statistical
estimations of good quality. Besides the simulation, approaches based on the
uniformization technique have been proven to be computationally efficient for analysis
of time-independent MPMs. In the case of time-dependent processes only few results
concerning the performance of numerical techniques are known [2].
Here we present a method for conducting an analysis of MPMs that can have possibly
infinitely many states and their dynamics is time-dependent. To cope with the problem
we combine the ideas of on-the-fly uniformization [5] with the method for treating
time-inhomogeneous behaviour presented by Buchholz.
Export
BibTeX
@mastersthesis{Andreychenko2010,
TITLE = {Uniformization for Time-Inhomogeneous {M}arkov Population Models},
AUTHOR = {Andreychenko, Aleksandr},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2010},
DATE = {2010},
ABSTRACT = {Time is one of the main factors in any kind of real-life system. When a certain system is analysed one is often interested in its evolution with respect to time. Various phenomena can be described using a form of time-dependency. The difference between load in call-centres is the example of time-dependency in queueing systems. The process of migration of biological species in autumn and spring is another illustration of changing the behaviour in time. The ageing process in critical infrastructures (which can result in the system component failure) can also be considered as another type of time-dependent evolution. Considering the variability in time for chemical and biological systems one comes to the general tasks of systems biology [9]. It is an inter-disciplinary study field which investigates complex interactions between components of biological systems and aims to explore the fundamental laws and new features of them. Systems biology is also used for referring to a certain type of research cycle. It starts with the creation of a model. One tries to describe the behaviour in a most intuitive and informative way which assumes convenience and visibility of future analysis. The traditional approach is based on deterministic models where the evolution can be predicted with certainty. This type of model usually operates at a macroscopic scale and if one considers chemical reactions the state of the system is represented by the concentrations of species and a continuous deterministic change is assumed. A set of ordinary differential equations (ODE) is one of the ways to describe such kind of models. To obtain a solution numerical methods are applied. The choice of a certain ODE-solver depends on the type of the ODE system. Another option is a full description of the chemical reaction system where we model each single molecule explicitly operating with their properties and positions in space. Naturally it is difficult to treat big systems in such a way and it also creates restrictions for computational analysis. However it reveals that the deterministic formalism is not always sufficient to describe all possible ways for the system to evolve. For instance, the Lambda phage decision circuit [1] can be a motivational example of such system. When the lambda phage virus infects the E.coli bacterium it can evolve in two different ways. The first one is lysogeny where the genome of the virus is integrated into the genome of the bacterium. Virus DNA is then replicated in descendant cells using the replication mechanism of the host cell. Another way is entering the lytic cycle, which means that new phages are synthesized directly in the host cell and finally its membrane is destroyed and new phages are released. A deterministic model is not appropriate to describe this process of choosing between two pathways as this decision is probabilistic and one needs a stochastic model to give an appropriate description. Another important issue which has to be addressed is the fact that the state of the system changes discretely. It means that one considers not the continuous change of chemical species concentrations but discrete events occurring with different probabilities (they can be time-dependent as well). We will use the continuous-time Markov Population Models (MPMs) formalism in this thesis to describe discrete-state stochastic systems. They are indeed continuous-time Markov processes, where the state of the system represents populations and it is expressed by the vector of natural numbers.
Such systems can have infinitely many states. For the case of chemical reactions network it results in the fact that one can not provide strict upper bounds for the population of certain species. When analysing these systems one can estimate measures of interest (like expectation and variance for the certain species populations at a given time instant). Besides this, probabilities for certain events to occur can be important (for instance, the probability for population to reach the threshold or the probability for a given species to go extinct). The usual way to investigate properties of these systems is simulation [8] which means that a large amount of possible sample trajectories are generated and then analysed. However it can be difficult to collect a sufficient number of trajectories to provide statistical estimations of good quality. Besides the simulation, approaches based on the uniformization technique have been proven to be computationally efficient for analysis of time-independent MPMs. In the case of time-dependent processes only few results concerning the performance of numerical techniques are known [2]. Here we present a method for conducting an analysis of MPMs that can have possibly infinitely many states and their dynamics is time-dependent. To cope with the problem we combine the ideas of on-the-fly uniformization [5] with the method for treating time-inhomogeneous behaviour presented by Buchholz.},
}
Endnote
%0 Thesis
%A Andreychenko, Aleksandr
%Y Hermanns, Holger
%A referee: Wolf, Verena
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Uniformization for Time-Inhomogeneous Markov Population Models :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-AF8D-B
%I Universität des Saarlandes
%C Saarbrücken
%D 2010
%V master
%9 master
%X Time is one of the main factors in any kind of real-life system. When a certain system is
analysed one is often interested in its evolution with respect to time. Various phenomena
can be described using a form of time-dependency. The difference between load in
call-centres is the example of time-dependency in queueing systems. The process of
migration of biological species in autumn and spring is another illustration of changing
the behaviour in time. The ageing process in critical infrastructures (which can result in
the system component failure) can also be considered as another type of time-dependent
evolution.
Considering the variability in time for chemical and biological systems one comes to
the general tasks of systems biology [9]. It is an inter-disciplinary study field which
investigates complex interactions between components of biological systems and aims
to explore the fundamental laws and new features of them. Systems biology is also
used for referring to a certain type of research cycle. It starts with the creation of a
model. One tries to describe the behaviour in a most intuitive and informative way
which assumes convenience and visibility of future analysis. The traditional approach
is based on deterministic models where the evolution can be predicted with certainty.
This type of model usually operates at a macroscopic scale and if one considers chemical
reactions the state of the system is represented by the concentrations of species and
a continuous deterministic change is assumed. A set of ordinary differential equations
(ODE) is one of the ways to describe such kind of models. To obtain a solution numerical
methods are applied. The choice of a certain ODE-solver depends on the type of the
ODE system. Another option is a full description of the chemical reaction system where
we model each single molecule explicitly operating with their properties and positions
in space. Naturally it is difficult to treat big systems in such a way and it also creates
restrictions for computational analysis.
However it reveals that the deterministic formalism is not always sufficient to describe
all possible ways for the system to evolve. For instance, the Lambda phage decision
circuit [1] can be a motivational example of such system. When the lambda phage
virus infects the E.coli bacterium it can evolve in two different ways. The first one is
lysogeny where the genome of the virus is integrated into the genome of the bacterium.
Virus DNA is then replicated in descendant cells using the replication mechanism of
the host cell. Another way is entering the lytic cycle, which means that new phages
are synthesized directly in the host cell and finally its membrane is destroyed and new
phages are released. A deterministic model is not appropriate to describe this process of
choosing between two pathways as this decision is probabilistic and one needs a stochastic
model to give an appropriate description.
Another important issue which has to be addressed is the fact that the state of the
system changes discretely. It means that one considers not the continuous change of
chemical species concentrations but discrete events occurring with different probabilities
(they can be time-dependent as well).
We will use the continuous-time Markov Population Models (MPMs) formalism in
this thesis to describe discrete-state stochastic systems. They are indeed
continuous-time Markov processes, where the state of the system represents
populations and it is expressed by the vector of natural numbers. Such systems
can have infinitely many
states. For the case of chemical reactions network it results in the fact that one can
not provide strict upper bounds for the population of certain species. When analysing
these systems one can estimate measures of interest (like expectation and variance for
the certain species populations at a given time instant). Besides this, probabilities for
certain events to occur can be important (for instance, the probability for population to
reach the threshold or the probability for a given species to go extinct).
The usual way to investigate properties of these systems is simulation [8] which means
that a large amount of possible sample trajectories are generated and then analysed.
However it can be difficult to collect a sufficient number of trajectories to provide statistical
estimations of good quality. Besides the simulation, approaches based on the
uniformization technique have been proven to be computationally efficient for analysis
of time-independent MPMs. In the case of time-dependent processes only few results
concerning the performance of numerical techniques are known [2].
Here we present a method for conducting an analysis of MPMs that can have possibly
infinitely many states and their dynamics is time-dependent. To cope with the problem
we combine the ideas of on-the-fly uniformization [5] with the method for treating
time-inhomogeneous behaviour presented by Buchholz.
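For context, the sketch below implements standard uniformization for a finite, time-homogeneous CTMC, i.e., the classical building block that the thesis extends to infinite state spaces (on the fly) and to time-inhomogeneous dynamics. It assumes NumPy/SciPy and a small explicit generator matrix Q, so it is an illustration of the basic technique rather than the method developed in the thesis.

import numpy as np
from scipy.stats import poisson

def uniformization(Q, p0, t, eps=1e-10):
    """Transient distribution p(t) = p0 * exp(Q t) of a finite CTMC via uniformization.
    Q: generator matrix (rows sum to 0), p0: initial distribution (float array), t: time."""
    lam = max(-np.diag(Q)) * 1.05          # uniformization rate, >= the largest exit rate
    P = np.eye(Q.shape[0]) + Q / lam       # DTMC obtained by uniformization
    K = int(poisson.isf(eps, lam * t)) + 1 # truncation point of the Poisson series
    term = p0.copy()
    result = poisson.pmf(0, lam * t) * term
    for k in range(1, K + 1):
        term = term @ P                    # distribution after k jumps of the DTMC
        result += poisson.pmf(k, lam * t) * term
    return result

The series is truncated once the Poisson tail mass drops below eps; for time-inhomogeneous models the rates, and hence the uniformization constant, have to be adapted over time, which is precisely the extension addressed in the thesis.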
[94]
S. Byelozyorov, “Construction of Virtual Worlds with Web 2.0 Technology,” Universität des Saarlandes, Saarbrücken, 2010.
Abstract
Current Web technologies allow developers to create rich Web-applications.
Unlike desktop applications, Web 2.0 programs are created by easily linking
several existing components. This approach, also known as a mashup, makes it
possible to use JavaScript to connect web services and browser components.
I have extended this development method by bringing 3D and virtual-world
networking components into the browser. This allowed me to create a virtual-worlds
Web application similar to Second Life. I have wrapped the open-source
Sirikata platform for virtual worlds into a Web-service component, created an
XML3D rendering component, and combined them with other browser services, thus
creating a fully featured 3D world application right inside the browser.
Export
BibTeX
@mastersthesis{Byelozyorov2010,
TITLE = {Construction of Virtual Worlds with Web 2.0 Technology},
AUTHOR = {Byelozyorov, Sergey},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2010},
DATE = {2010},
ABSTRACT = {Current Web technologies allow developers to create rich Web-applications. Unlike desktop applications, Web 2.0 programs are created by easily linking several existing components. This approach, also known as a mashup, makes it possible to use JavaScript to connect web services and browser components. I have extended this development method by bringing 3D and virtual-world networking components into the browser. This allowed me to create a virtual-worlds Web application similar to Second Life. I have wrapped the open-source Sirikata platform for virtual worlds into a Web-service component, created an XML3D rendering component, and combined them with other browser services, thus creating a fully featured 3D world application right inside the browser.},
}
Endnote
%0 Thesis
%A Byelozyorov, Sergey
%Y Rubinstein, Dmitri
%A referee: Zeller, Andreas
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Construction of Virtual Worlds with Web 2.0 Technology :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-B0F1-3
%I Universität des Saarlandes
%C Saarbrücken
%D 2010
%V master
%9 master
%X Current Web technologies allow developers to create rich Web-applications.
Unlike desktop applications Web 2.0 programs are created by easily linking
several existing components. This approach, also known as mashup, allows to use
JavaScript to connect web-services and browser components together.
I have extended this development method by bringing 3D and virtual world
networking components into the browser. This allowed me to create Virtual
Worlds Web-application similar to Second Life. I have wrapped the open-source
Sirikata platform for virtual worlds into a Web-service component, created
XML3D rendering component, combined them with other browser services and thus
created a fully-featured 3D world application right inside of the browser
[95]
L. de la Garza, “Implementation and evaluation of an efficient, distributed replication algorithm in a real network,” Universität des Saarlandes, Saarbrücken, 2010.
Export
BibTeX
@mastersthesis{delaGarza2010,
TITLE = {Implementation and evaluation of an efficient, distributed replication algorithm in a real network},
AUTHOR = {de la Garza, Luis},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-D30860E168D99863C125778500372C38-delaGarza2010},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2010},
DATE = {2010},
}
Endnote
%0 Thesis
%A de la Garza, Luis
%Y Weikum, Gerhard
%A referee: Sozio, Mauro
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Implementation and evaluation of an efficient, distributed replication algorithm in a real network :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1473-9
%F EDOC: 536379
%F OTHER: Local-ID: C1256DBF005F876D-D30860E168D99863C125778500372C38-delaGarza2010
%I Universität des Saarlandes
%C Saarbrücken
%D 2010
%V master
%9 master
[96]
S. Jansen, “Symmetry Detection in Images Using Belief Propagation,” Universität des Saarlandes, Saarbrücken, 2010.
Abstract
In this thesis a general approach for detection of symmetric structures in
images is presented. Rather than relying on some feature points to extract
symmetries, symmetries are described using a probabilistic formulation of image
self-similarity. Using a Markov random field we obtain a joint probability
distribution describing all assignments of the image to itself. Due to the high
dimensionality of this joint distribution, we do not examine this distribution
directly, but approximate its marginals in order to gather information about
the symmetries within the image. In the case of perfect symmetries this
approximation is done using belief propagation. A novel variant of belief
propagation is introduced allowing for reliable approximations when dealing
with approximate symmetries. We apply our approach to several images ranging
from perfect synthetic symmetries to real-world scenarios, demonstrating the
capabilities of probabilistic frameworks for symmetry detection.
Export
BibTeX
@mastersthesis{Jansen2010,
TITLE = {Symmetry Detection in Images Using Belief Propagation},
AUTHOR = {Jansen, Silke},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125675300671F7B-705AE7BF0843CDFDC1257823004C9D42-Jansen2010},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2010},
DATE = {2010},
ABSTRACT = {In this thesis a general approach for detection of symmetric structures in images is presented. Rather than relying on some feature points to extract symmetries, symmetries are described using a probabilistic formulation of image self-similarity. Using a Markov random field we obtain a joint probability distribution describing all assignments of the image to itself. Due to the high dimensionality of this joint distribution, we do not examine this distribution directly, but approximate its marginals in order to gather information about the symmetries with the image. In the case of perfect symmetries this approximation is done using belief propagation. A novel variant of belief propagation is introduced allowing for reliable approximations when dealing with approximate symmetries. We apply our approach to several images ranging from perfect synthetic symmetries to real-world scenarios, demonstrating the capabilities of probabilistic frameworks for symmetry detection.},
}
Endnote
%0 Thesis
%A Jansen, Silke
%Y Wand, Michael
%A referee: Seidel, Hans-Peter
%+ Computer Graphics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T Symmetry Detection in Images Using Belief Propagation :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-145F-8
%F EDOC: 537325
%F OTHER: Local-ID: C125675300671F7B-705AE7BF0843CDFDC1257823004C9D42-Jansen2010
%I Universität des Saarlandes
%C Saarbrücken
%D 2010
%P X, 51 p.
%V master
%9 master
%X In this thesis a general approach for detection of symmetric structures in
images is presented. Rather than relying on some feature points to extract
symmetries, symmetries are described using a probabilistic formulation of image
self-similarity. Using a Markov random field we obtain a joint probability
distribution describing all assignments of the image to itself. Due to the high
dimensionality of this joint distribution, we do not examine this distribution
directly, but approximate its marginals in order to gather information about
the symmetries with the image. In the case of perfect symmetries this
approximation is done using belief propagation. A novel variant of belief
propagation is introduced allowing for reliable approximations when dealing
with approximate symmetries. We apply our approach to several images ranging
from perfect synthetic symmetries to real-world scenarios, demonstrating the
capabilities of probabilistic frameworks for symmetry detection.
[97]
L. E. Kuhn Cuellar, “A probabilistic algorithm for matching protein structures - and its application to detecting functionally relevant patterns,” Universität des Saarlandes, Saarbrücken, 2010.
Export
BibTeX
@mastersthesis{Kuhn2010a,
TITLE = {A probabilistic algorithm for matching protein structures -- and its application to detecting functionally relevant patterns},
AUTHOR = {Kuhn Cuellar, Luis Eugenio},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125673F004B2D7B-6C92695A7AE2ABEEC1257834003E67F4-Kuhn2010a},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2010},
DATE = {2010},
}
Endnote
%0 Thesis
%A Kuhn Cuellar, Luis Eugenio
%Y Sommer, Ingolf
%A referee: Lengauer, Thomas
%+ Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
%T A probabilistic algorithm for matching protein structures - and its application to detecting functionally relevant patterns :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-147B-A
%F EDOC: 536615
%F OTHER: Local-ID: C125673F004B2D7B-6C92695A7AE2ABEEC1257834003E67F4-Kuhn2010a
%I Universität des Saarlandes
%C Saarbrücken
%D 2010
%V master
%9 master
[98]
F. Makari Manshadi, “Fast Distributed Replication in Modern Networks,” Universität des Saarlandes, Saarbrücken, 2010.
Export
BibTeX
@mastersthesis{Makari-Manshadi2010,
TITLE = {Fast Distributed Replication in Modern Networks},
AUTHOR = {Makari Manshadi, Faraz},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-540B1228EC63CB65C1257715003E7B21-Makari-Manshadi2010},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2010},
DATE = {2010},
}
Endnote
%0 Thesis
%A Makari Manshadi, Faraz
%Y Weikum, Gerhard
%A referee: Sozio, Mauro
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Fast Distributed Replication in Modern Networks :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1470-F
%F EDOC: 536367
%F OTHER: Local-ID: C1256DBF005F876D-540B1228EC63CB65C1257715003E7B21-Makari-Manshadi2010
%I Universität des Saarlandes
%C Saarbrücken
%D 2010
%V master
%9 master
[99]
V. Setty, “Efficiently Identifying Interesting Time-Points in Text Archive Search,” Universität des Saarlandes, Saarbrücken, 2010.
Export
BibTeX
@mastersthesis{Setty2010Master,
TITLE = {Efficiently Identifying Interesting Time-Points in Text Archive Search},
AUTHOR = {Setty, Vinay},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-3DD9962C88FAA1C9C12576C40037DE66-Setty2010Master},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2010},
DATE = {2010},
}
Endnote
%0 Thesis
%A Setty, Vinay
%Y Weikum, Gerhard
%A referee: Bedathur, Srikanta
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Efficiently Identifying Interesting Time-Points in Text Archive Search :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-147E-4
%F EDOC: 536359
%F OTHER: Local-ID: C1256DBF005F876D-3DD9962C88FAA1C9C12576C40037DE66-Setty2010Master
%I Universität des Saarlandes
%C Saarbrücken
%D 2010
%V master
%9 master
[100]
M. Yahya, “Accelerating Rule-Based Reasoning in Disk-Resident RDF Knowledge Bases,” Universität des Saarlandes, Saarbrücken, 2010.
Export
BibTeX
@mastersthesis{Yahya10,
TITLE = {Accelerating Rule-Based Reasoning in Disk-Resident {RDF} Knowledge Bases},
AUTHOR = {Yahya, Mohamed},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-5E0515E262BC394BC12577190038C333-Yahya10},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2010},
DATE = {2010},
}
Endnote
%0 Thesis
%A Yahya, Mohamed
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
%T Accelerating Rule-Based Reasoning in Disk-Resident RDF Knowledge Bases :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1451-4
%F EDOC: 536368
%F OTHER: Local-ID: C1256DBF005F876D-5E0515E262BC394BC12577190038C333-Yahya10
%I Universität des Saarlandes
%C Saarbrücken
%D 2010
%V master
%9 master
2009
[101]
A. Anand, “Index Partitioning Strategies for Peer-to-Peer Web Archival,” Universität des Saarlandes, Saarbrücken, 2009.
Export
BibTeX
@mastersthesis{anand09,
TITLE = {Index Partitioning Strategies for Peer-to-Peer Web Archival},
AUTHOR = {Anand, Avishek},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-67C68AC79E28850EC12575BB003D3708-anand09},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2009},
DATE = {2009},
}
Endnote
%0 Thesis
%A Anand, Avishek
%Y Weikum, Gerhard
%A referee: Bedathur, Srikanta
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Index Partitioning Strategies for Peer-to-Peer Web Archival :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-17EE-E
%F EDOC: 520420
%F OTHER: Local-ID: C1256DBF005F876D-67C68AC79E28850EC12575BB003D3708-anand09
%I Universität des Saarlandes
%C Saarbrücken
%D 2009
%V master
%9 master
[102]
P. Danilewski, “Binned kd-tree Construction with SAH on the GPU,” Universität des Saarlandes, Saarbrücken, 2009.
Abstract
Our main goal is to create realistic-looking animations in real time. To that end, we are interested in fast ray tracing. Ray tracing recursively traces photon movement from the camera (backward) or from light sources (forward). To find the first intersection between a given ray and the objects in the scene we use acceleration structures, for example kd-trees. Kd-trees are considered to perform best in the majority of cases, but due to their long construction times they are often avoided for dynamic scenes. In this work we try to overcome this obstacle by building the kd-tree in parallel on the many cores of a GPU.
Our algorithm builds the kd-tree in a top-down, breadth-first fashion, with many threads processing each node of the tree. For each node we test 31 uniformly distributed candidate split planes along each axis and use the Surface Area Heuristic cost function to estimate the best one. In order to reach maximum performance, the kd-tree construction is divided into four stages. Each stage handles tree nodes of different primitive counts and differs in how counting is resolved and how work is distributed on the GPU.
Our current program constructs kd-trees faster than other GPU implementations, while maintaining competitive quality compared to serial CPU programs. Tests have shown that execution time scales well with respect to the power of the GPU and will most likely continue to do so with future releases of the hardware.
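As an illustration of the binned SAH evaluation described in the abstract, here is a minimal serial Python sketch; the cost constants c_trav and c_isect, the counting convention for straddling primitives, and the helper sah_best_split are assumptions made for this example and do not reproduce the GPU implementation.

# A minimal CPU sketch of binned SAH split selection along one axis, assuming
# axis-aligned primitive bounds; the thesis evaluates 31 uniform candidate
# planes per axis in parallel on the GPU, which this serial code only mirrors.
def sah_best_split(prim_bounds, axis, node_min, node_max, n_planes=31,
                   c_trav=1.0, c_isect=1.0):
    """prim_bounds: list of (lo, hi) 3-tuples; returns (cost, split position)."""
    extent = [node_max[i] - node_min[i] for i in range(3)]
    def area(lo, hi):
        # surface area of the node box with the split axis clamped to [lo, hi]
        d = list(extent)
        d[axis] = hi - lo
        return 2.0 * (d[0] * d[1] + d[1] * d[2] + d[2] * d[0])
    best = (float("inf"), None)
    for k in range(1, n_planes + 1):
        pos = node_min[axis] + extent[axis] * k / (n_planes + 1)
        left = sum(1 for lo, hi in prim_bounds if lo[axis] < pos)
        right = sum(1 for lo, hi in prim_bounds if hi[axis] >= pos)
        cost = c_trav + c_isect * (
            area(node_min[axis], pos) * left + area(pos, node_max[axis]) * right
        ) / area(node_min[axis], node_max[axis])
        if cost < best[0]:
            best = (cost, pos)
    return best

# toy usage: two unit boxes side by side inside a [0,2]x[0,1]x[0,1] node
boxes = [((0, 0, 0), (1, 1, 1)), ((1, 0, 0), (2, 1, 1))]
print(sah_best_split(boxes, axis=0, node_min=(0, 0, 0), node_max=(2, 1, 1)))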
Export
BibTeX
@mastersthesis{Danilewski2009,
TITLE = {Binned kd-tree Construction with {SAH} on the {GPU}},
AUTHOR = {Danilewski, Piotr},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2009},
DATE = {2009},
ABSTRACT = {Our main goal is to create realistically looking animation in real-time. To that end, we are interested in fast ray tracing. Ray tracing recursively traces photon movement from the camera (backward) or light sources (forward). To find where the first intersection between a given ray and the objects in the scene is we use acceleration structures, for example kd-trees. Kd-trees are considered to perform best in the majority of cases, however due to their large construction times are often avoided for dynamic scenes. In this work we try to overcome this obstacle by building the kd-tree in parallel on many cores of a GPU. Our algorithm build the kd-tree in a top-down breath-first fashion, with many threads processing each node of the tree. For each node we test 31 uniformly distributed candidate split planes along each axis and use the Surface Area cost function to estimate the best one. In order to reach maximum performance, the kd-tree construction is divided into 4 stages. Each of them handles tree nodes of different primitive count, differs in how counting is resolved and how work is distributed on the GPU. Our current program constructs kd-trees faster than other GPU implementations, while maintaining competing quality compared to serial CPU programs. Tests have shown that execution time scales well in respect to power of the GPU and it will most likely continue doing so with future releases of the hardware.},
}
Endnote
%0 Thesis
%A Danilewski, Piotr
%Y Slusallek, Philipp
%A referee: Myszkowski, Karol
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Computer Graphics, MPI for Informatics, Max Planck Society
%T Binned kd-tree Construction with SAH on the GPU :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-B6D9-D
%I Universität des Saarlandes
%C Saarbrücken
%D 2009
%V master
%9 master
%X Our main goal is to create realistically looking animation in real-time. To
that end, we are interested in fast ray tracing. Ray tracing recursively traces
photon movement from the camera (backward) or light sources (forward). To find
where the first intersection between a given ray and the objects in the scene
is we use acceleration structures, for example kd-trees. Kd-trees are
considered to perform best in the majority of cases, however due to their large
construction times are often avoided for dynamic scenes. In this work we try to
overcome this obstacle by building the kd-tree in parallel on many
cores of a GPU.
Our algorithm build the kd-tree in a top-down breath-first fashion, with many
threads processing each node of the tree. For each node we test 31 uniformly
distributed candidate split planes along each axis and use the Surface Area
cost function to estimate the best one. In order to reach maximum performance,
the kd-tree construction is divided into 4 stages. Each of them handles tree
nodes of different primitive count, differs in how counting is resolved and how
work is distributed on the GPU.
Our current program constructs kd-trees faster than other GPU implementations,
while maintaining competing quality compared to serial CPU programs. Tests have
shown that execution time scales well in respect to power of the GPU and it
will most likely continue doing so with future releases of the hardware.
[103]
O. Honcharova, “Static Detection of Parametric Loop Bounds on C Code,” Universität des Saarlandes, Saarbrücken, 2009.
Abstract
Static Worst-Case Execution Time (WCET) analysis is a technique to derive upper bounds for the execution times of programs. Such bounds are crucial when designing and verifying real-time systems. A key component for the static derivation of precise WCET estimates is the determination of upper bounds on the number of times loops can be iterated.
The idea of parametric loop bound analysis is to express the upper loop bound as a formula depending on parameters, i.e., variables and expressions that stay constant within the loop body. The formula is constructed once for each loop. Then, by instantiating this formula with values of parameters acquired externally (from value analysis, etc.), a concrete loop bound can be computed without high computational effort.
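As a concrete illustration of the idea, the following Python sketch shows a parametric bound for a simple counted loop of the form for (i = lo; i < hi; i += step); the closure-based formulation and the parameter names are assumptions made for this example, not the analysis implemented in the thesis.

# A minimal sketch of a parametric loop bound for a counted loop
# "for (i = lo; i < hi; i += step)": the formula is built once and
# instantiated later with parameter values delivered by value analysis.
from math import ceil

def make_loop_bound_formula():
    """Return the symbolic bound as a closure over the loop parameters."""
    def bound(lo, hi, step):
        if step <= 0:
            raise ValueError("non-positive step: no finite bound exists")
        return max(0, ceil((hi - lo) / step))
    return bound

bound = make_loop_bound_formula()
print(bound(lo=0, hi=100, step=4))   # 25 iterations
print(bound(lo=10, hi=10, step=1))   # 0 iterations (loop body never runs)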
Export
BibTeX
@mastersthesis{Honcharova2009,
TITLE = {Static Detection of Parametric Loop Bounds on {C} Code},
AUTHOR = {Honcharova, Olha},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2009},
DATE = {2009},
ABSTRACT = {Static Worst-Case Execution Time (WCET) analysis is a technique to derive upper bounds for the execution times of programs. Such bounds are crucial when designing and verifying real-time systems. A key component for static derivation of precise WCET estimates is determination of upper bounds on the number of times loops can be iterated. The idea of the parametric loop bound analysis is to express the upper loop bound as a formula depending on parameters { variables and expressions staying constant within the loop body. The formula is constructed once for each loop. Then by instantiating this formula with values of parameters acquired externally (from value analysis, etc.), a concrete loop bound can be computed without high computational effort.},
}
Endnote
%0 Thesis
%A Honcharova, Olha
%Y Finkbeiner, Bernd
%A referee: Martin, Florian
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Static Detection of Parametric Loop Bounds on C Code :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-BA7A-7
%I Universität des Saarlandes
%C Saarbrücken
%D 2009
%V master
%9 master
%X Static Worst-Case Execution Time (WCET) analysis is a technique to derive upper bounds for the execution times of programs. Such bounds are crucial when designing and verifying real-time systems. A key component for static derivation of precise WCET estimates is determination of upper bounds on the number of times loops can be iterated.
The idea of the parametric loop bound analysis is to express the upper loop bound as a formula depending on parameters { variables and expressions staying constant within the loop body. The formula is constructed once for each loop. Then by instantiating this formula with
values of parameters acquired externally (from value analysis, etc.), a concrete loop bound can be computed without high computational effort.
[104]
A. Jindal, “Quality in Phrase Mining,” Universität des Saarlandes, Saarbrücken, 2009.
Abstract
Phrase snippets of large text corpora like news articles or web search results
offer great insight and analytical value. While much of the prior work is
focussed on efficient storage and retrieval of all candidate phrases, little
emphasis has been laid on the quality of the result set. In this thesis, we
define phrases of interest and propose a framework
for mining and post-processing interesting phrases. We focus on the quality of
phrases and develop techniques to mine minimal-length maximal-informative
sequences of words. The techniques developed are streamed into a post-processing
pipeline and include exact and approximate match-based merging, incomplete
phrase detection with filtering, and heuristics-based phrase classification.
The strategies aim to prune the candidate set of phrases down to the ones being
meaningful and having rich content. We characterize
the phrases with heuristics- and NLP-based features. We use a supervised
learning based regression model to predict their interestingness. Further, we
develop and analyze ranking and grouping models for presenting the phrases to
the user. Finally, we discuss relevance and performance evaluation of our
techniques. Our framework is evaluated using a recently released real world
corpus of New York Times news articles.
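To make the post-processing pipeline more concrete, here is a toy Python sketch of two steps mentioned above, containment-based merging and filtering of likely incomplete phrases; the stopword list and both heuristics are illustrative assumptions rather than the thesis's actual rules.

# A toy sketch of two post-processing steps in the spirit of the pipeline
# described above: exact containment-based merging and a simple filter for
# likely incomplete phrases. The stopword list and rules are assumptions only.
STOPWORDS = {"the", "of", "a", "an", "and", "to", "in"}

def merge_contained(phrases):
    """Drop a phrase if it is contained verbatim in a longer candidate."""
    kept = []
    for p in sorted(phrases, key=len, reverse=True):
        if not any(p in longer for longer in kept):
            kept.append(p)
    return kept

def drop_incomplete(phrases):
    """Filter phrases that start or end with a stopword (likely truncated)."""
    def complete(p):
        words = p.split()
        return words and words[0] not in STOPWORDS and words[-1] not in STOPWORDS
    return [p for p in phrases if complete(p)]

candidates = ["financial crisis", "the financial crisis", "crisis of the",
              "global financial crisis"]
print(drop_incomplete(merge_contained(candidates)))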
Export
BibTeX
@mastersthesis{Jindal2010,
TITLE = {Quality in Phrase Mining},
AUTHOR = {Jindal, Alekh},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2009},
DATE = {2009},
ABSTRACT = {Phrase snippets of large text corpora like news articles or web search results offer great insight and analytical value. While much of the prior work is focussed on efficient storage and retrieval of all candidate phrases, little emphasis has been laid on the quality of the result set. In this thesis, we define phrases of interest and propose a framework for mining and post-processing interesting phrases. We focus on the quality of phrases and develop techniques to mine minimal-length maximal-informative sequences of words.The techniques developed are streamed into a post-processing pipeline and include exact and approximate match-based merging, incomplete phrase detection with filtering, and heuristics-based phrase classification. The strategies aim to prune the candidate set of phrases down to the ones being meaningful and having rich content. We characterize the phrases with heuristics- and NLP-based features. We use a supervised learning based regression model to predict their interestingness. Further, we develop and analyze ranking and grouping models for presenting the phrases to the user. Finally, we discuss relevance and performance evaluation of our techniques. Our framework is evaluated using a recently released real world corpus of New York Times news articles.},
}
Endnote
%0 Thesis
%A Jindal, Alekh
%Y Weikum, Gerhard
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Quality in Phrase Mining :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-BA7D-1
%I Universität des Saarlandes
%C Saarbrücken
%D 2009
%V master
%9 master
%X Phrase snippets of large text corpora like news articles or web search results
offer great insight and analytical value. While much of the prior work is
focussed on efficient storage and retrieval of all candidate phrases, little
emphasis has been laid on the quality of the result set. In this thesis, we
define phrases of interest and propose a framework
for mining and post-processing interesting phrases. We focus on the quality of
phrases and develop techniques to mine minimal-length maximal-informative
sequences of words.The techniques developed are streamed into a post-processing
pipeline and include exact and approximate match-based merging, incomplete
phrase detection with filtering, and heuristics-based phrase classification.
The strategies aim to prune the candidate set of phrases down to the ones being
meaningful and having rich content. We characterize
the phrases with heuristics- and NLP-based features. We use a supervised
learning based regression model to predict their interestingness. Further, we
develop and analyze ranking and grouping models for presenting the phrases to
the user. Finally, we discuss relevance and performance evaluation of our
techniques. Our framework is evaluated using a recently released real world
corpus of New York Times news articles.
[105]
J. Kalojanov, “Parallel and Lazy Construction of Grids for Ray Tracing on Graphics Hardware,” Universität des Saarlandes, Saarbrücken, 2009.
Abstract
In this thesis we investigate the use of uniform grids as acceleration
structures for ray tracing on data-parallel machines such as modern graphics
processors. The main focus of this work is the trade-off between construction
time and rendering performance provided by the acceleration structures,
which is important for rendering dynamic scenes. We propose several parallel
construction algorithms for uniform and two-level grids as well as a ray
triangle intersection algorithm, which improves SIMD utilization for incoherent
rays. The result of this work is a GPU ray tracer whose performance for dynamic scenes is comparable to, and in some cases better than, that of the best known implementations today.
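To illustrate the kind of acceleration structure discussed in the abstract, the following is a minimal serial Python sketch that bins triangle bounding boxes into a uniform grid; the data layout (a dictionary of cell lists) and the function build_uniform_grid are assumptions for this example, whereas the thesis constructs the grid in parallel on the GPU.

# A minimal serial sketch of uniform grid construction over triangle bounding
# boxes; the thesis builds the cell lists in parallel on the GPU, which this
# dictionary-based version only approximates conceptually.
from collections import defaultdict

def build_uniform_grid(triangles, grid_min, grid_max, resolution):
    """triangles: list of three 3-D vertices; returns cell -> triangle indices."""
    res = resolution
    size = [(grid_max[i] - grid_min[i]) / res[i] for i in range(3)]
    def cell_of(point):
        return tuple(
            min(res[i] - 1, max(0, int((point[i] - grid_min[i]) / size[i])))
            for i in range(3))
    cells = defaultdict(list)
    for idx, tri in enumerate(triangles):
        lo = cell_of([min(v[i] for v in tri) for i in range(3)])
        hi = cell_of([max(v[i] for v in tri) for i in range(3)])
        for x in range(lo[0], hi[0] + 1):
            for y in range(lo[1], hi[1] + 1):
                for z in range(lo[2], hi[2] + 1):
                    cells[(x, y, z)].append(idx)   # triangle's AABB overlaps this cell
    return cells

tri = [(0.1, 0.1, 0.1), (0.9, 0.1, 0.1), (0.5, 0.9, 0.1)]
grid = build_uniform_grid([tri], grid_min=(0, 0, 0), grid_max=(1, 1, 1),
                          resolution=(2, 2, 2))
print(sorted(grid))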
Export
BibTeX
@mastersthesis{Kalojanov2009,
TITLE = {Parallel and Lazy Construction of Grids for Ray Tracing on Graphics Hardware},
AUTHOR = {Kalojanov, Javor},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2009},
DATE = {2009},
ABSTRACT = {In this thesis we investigate the use of uniform grids as acceleration structures for ray tracing on data-parallel machines such as modern graphics processors. The main focus of this work is the trade-off between construction time and rendering performance provided by the acceleration structures, which is important for rendering dynamic scenes. We propose several parallel construction algorithms for uniform and two-level grids as well as a ray triangle intersection algorithm, which improves SIMD utilization for incoherent rays. The result of this work is a GPU ray tracer with performance for dynamic scenes that is comparable and in some cases better than the best known implementations today.},
}
Endnote
%0 Thesis
%A Kalojanov, Javor
%Y Slusallek, Philipp
%A referee: Wand, Michael
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Computer Graphics, MPI for Informatics, Max Planck Society
%T Parallel and Lazy Construction of Grids for Ray Tracing on Graphics Hardware :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-BA80-8
%I Universität des Saarlandes
%C Saarbrücken
%D 2009
%V master
%9 master
%X In this thesis we investigate the use of uniform grids as acceleration
structures for ray tracing on data-parallel machines such as modern graphics
processors. The main focus of this work is the trade-off between construction
time and rendering performance provided by the acceleration structures,
which is important for rendering dynamic scenes. We propose several parallel
construction algorithms for uniform and two-level grids as well as a ray
triangle intersection algorithm, which improves SIMD utilization for incoherent
rays. The result of this work is a GPU ray tracer with performance
for dynamic scenes that is comparable and in some cases better than the best
known implementations today.
[106]
M. Khosla, “Message Passing Algorithms,” Universität des Saarlandes, Saarbrücken, 2009.
Abstract
Constraint Satisfaction Problems (CSPs) are defined over a set of variables
whose state must satisfy a number of constraints. We study a class of
algorithms called Message Passing Algorithms, which aim at finding the
probability distribution of the variables
over the space of satisfying assignments. These algorithms involve passing
local messages (according to some message update rules) over the edges of a
factor graph constructed corresponding to the CSP. We focus on the Belief
Propagation (BP) algorithm, which
finds exact solution marginals for tree-like factor graphs. However,
convergence and exactness cannot be guaranteed for a general factor graph. We
propose a method for improving BP to account for cycles in the factor graph. We
also study another message passing algorithm known as Survey Propagation (SP),
which is empirically quite effective in solving random K-SAT instances, even
when the density is close to the satisfiability threshold. We contribute to the
theoretical understanding of SP by deriving the SP
equations from the BP message update rules.
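As a small worked example of the sum-product message updates discussed above, the following Python sketch runs belief propagation on a three-variable chain and checks the result against brute-force marginalization; the pairwise potentials are arbitrary toy values, and the example does not cover the thesis's extensions for cyclic graphs or Survey Propagation.

# A minimal sketch of sum-product belief propagation on a chain x1 - x2 - x3
# with binary states and pairwise potentials; on this tree the beliefs equal
# the exact marginals, matching the abstract's remark about tree-like graphs.
import numpy as np

psi12 = np.array([[1.0, 0.5], [0.5, 2.0]])   # pairwise potential between x1, x2
psi23 = np.array([[1.5, 1.0], [0.2, 1.0]])   # pairwise potential between x2, x3

# messages into x2 from both neighbours: m_{1->2}(x2) = sum_{x1} psi12(x1, x2)
m1_to_2 = psi12.sum(axis=0)
m3_to_2 = psi23.sum(axis=1)                   # sum over x3
belief_x2 = m1_to_2 * m3_to_2
belief_x2 /= belief_x2.sum()

# brute-force check over all 8 joint configurations
joint = np.einsum("ij,jk->ijk", psi12, psi23)
marginal_x2 = joint.sum(axis=(0, 2))
marginal_x2 /= marginal_x2.sum()
print(belief_x2, marginal_x2)                 # identical on a tree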
Export
BibTeX
@mastersthesis{Khosla2009,
TITLE = {Message Passing Algorithms},
AUTHOR = {Khosla, Megha},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2009},
DATE = {2009},
ABSTRACT = {Constraint Satisfaction Problems (CSPs) are defined over a set of variables whose state must satisfy a number of constraints. We study a class of algorithms called Message Passing Algorithms, which aim at finding the probability distribution of the variables over the space of satisfying assignments. These algorithms involve passing local messages (according to some message update rules) over the edges of a factor graph constructed corresponding to the CSP. We focus on the Belief Propagation (BP) algorithm, which finds exact solution marginals for tree-like factor graphs. However, convergence and exactness cannot be guaranteed for a general factor graph. We propose a method for improving BP to account for cycles in the factor graph. We also study another message passing algorithm known as Survey Propagation (SP), which is empirically quite effective in solving random K-SAT instances, even when the density is close to the satisfiability threshold. We contribute to the theoretical understanding of SP by deriving the SP equations from the BP message update rules.},
}
Endnote
%0 Thesis
%A Khosla, Megha
%Y Panagiotou, Konstantinos
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Message Passing Algorithms :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-BA83-2
%I Universität des Saarlandes
%C Saarbrücken
%D 2009
%V master
%9 master
%X Constraint Satisfaction Problems (CSPs) are defined over a set of variables
whose state must satisfy a number of constraints. We study a class of
algorithms called Message Passing Algorithms, which aim at finding the
probability distribution of the variables
over the space of satisfying assignments. These algorithms involve passing
local messages (according to some message update rules) over the edges of a
factor graph constructed corresponding to the CSP. We focus on the Belief
Propagation (BP) algorithm, which
finds exact solution marginals for tree-like factor graphs. However,
convergence and exactness cannot be guaranteed for a general factor graph. We
propose a method for improving BP to account for cycles in the factor graph. We
also study another message passing algorithm known as Survey Propagation (SP),
which is empirically quite effective in solving random K-SAT instances, even
when the density is close to the satisfiability threshold. We contribute to the
theoretical understanding of SP by deriving the SP
equations from the BP message update rules.
[107]
D. Puzhay, “Modeling Bug Reporter Reputation,” Universität des Saarlandes, Saarbrücken, 2009.
Abstract
Tracking and resolving of software bugs are very important tasks for software
developers and maintainers. Bug-tracking systems are tools which are widely
used in open source projects to support these activities. The empirical
Software Engineering research community pays considerable attention to
bug-tracking-related topics in order to provide bug-tracking systems users with
adequate software and tool support.
Bug-tracking is a highly socialized process which requires constant
communication between developers and bug reporters. However, the inherent
social structure of bug tracking systems and its influence on everyday
bug-tracking has so far been poorly studied.
In this work I address the role of bug reporter reputation. Using publicly available information from a bug-tracking system database, I model bug reporter reputation to check whether there is any evidence of a relation between a reporter's reputation and the attention their bugs receive from developers.
If reputation actually plays an important role in bug-tracking activities and can be extracted relatively easily, existing prediction techniques could potentially be improved by using reputation as an additional input variable, and bug-tracking software could be supported with a more formal notion of reporter reputation.
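Purely as an illustration of what a reputation measure could look like, here is a toy Python sketch that scores reporters by the fraction of their past bugs that were fixed; the record fields and the formula are hypothetical and are not the model developed in the thesis.

# A toy illustration of one possible reputation measure: the fraction of a
# reporter's earlier bugs that were eventually fixed. Field names and formula
# are assumptions for illustration only.
from collections import defaultdict

def reporter_reputation(bug_records):
    """bug_records: list of dicts with 'reporter' and 'resolution' keys."""
    filed = defaultdict(int)
    fixed = defaultdict(int)
    for bug in bug_records:
        filed[bug["reporter"]] += 1
        if bug["resolution"] == "FIXED":
            fixed[bug["reporter"]] += 1
    return {r: fixed[r] / filed[r] for r in filed}

bugs = [{"reporter": "alice", "resolution": "FIXED"},
        {"reporter": "alice", "resolution": "WONTFIX"},
        {"reporter": "bob", "resolution": "FIXED"}]
print(reporter_reputation(bugs))   # {'alice': 0.5, 'bob': 1.0}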
Export
BibTeX
@mastersthesis{Puzhay2009,
TITLE = {Modeling Bug Reporter Reputation},
AUTHOR = {Puzhay, Dmytro},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2009},
DATE = {2009},
ABSTRACT = {Tracking and resolving of software bugs are very important tasks for software developers and maintainers. Bug-tracking systems are tools which are widely used in open source projects to support these activities. The empirical Software Engineering research community pays considerable attention to bug-tracking-related topics in order to provide bug-tracking systems users with adequate software and tool support. Bug-tracking is a highly socialized process which requires constant communication between developers and bug reporters. However, the inherent social structure of bug tracking systems and its influence on everyday bug-tracking has earlier been poorly studied. In this work I address the role of bug reporter reputation. Using publicly available information from bug-tracking system database, I model bug reporter reputation to check whether there is any evidence of relation between reporter reputation and attention from developers his bugs get. If reputation actually plays important role in bug-tracking activities and can relatively easily be extracted, existing prediction techniques could potentially be improved by using reputation as additional input variable; bug-tracking software could be supported with more formal notion of reporter reputation.},
}
Endnote
%0 Thesis
%A Puzhay, Dmytro
%Y Wilhelm, Reinhard
%A referee: Premraj, Rahul
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Modeling Bug Reporter Reputation :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-BA88-7
%I Universität des Saarlandes
%C Saarbrücken
%D 2009
%V master
%9 master
%X Tracking and resolving of software bugs are very important tasks for software
developers and maintainers. Bug-tracking systems are tools which are widely
used in open source projects to support these activities. The empirical
Software Engineering research community pays considerable attention to
bug-tracking-related topics in order to provide bug-tracking systems users with
adequate software and tool support.
Bug-tracking is a highly socialized process which requires constant
communication between developers and bug reporters. However, the inherent
social structure of bug tracking systems and its influence on everyday
bug-tracking has earlier been poorly studied.
In this work I address the role of bug reporter reputation. Using publicly
available information from bug-tracking system database, I model bug reporter
reputation to check whether there is any evidence of relation between reporter
reputation and attention from developers his bugs get.
If reputation actually plays important role in bug-tracking activities and can
relatively easily be extracted, existing prediction techniques
could potentially be improved by using reputation as additional input variable;
bug-tracking software could be supported with more formal
notion of reporter reputation.
[108]
R. Ragneala, “A Useful Resource for Defect Prediction Models,” Universität des Saarlandes, Saarbrücken, 2009.
Abstract
Predicting likely software defects in the future is valuable for project
managers when planning resource allocation for software testing. But building
prediction models using only code metrics may not suffice for accurate
results. In this work, we investigate the value of code history metrics that
can be collected from the project's version archives for the purpose of defect
prediction. Our results suggest that prediction models built using code history
metrics outperform those using traditional code metrics only.
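To sketch the study design in code, the following Python example fits a regression model on toy per-file history metrics; the two features, the synthetic data, and the use of scikit-learn's LinearRegression are assumptions made for illustration only.

# A minimal sketch of the study design implied above: per-file code-history
# metrics (here: number of past changes and past bug-fix changes, both toy
# values) used as features of a regression model predicting defect counts.
import numpy as np
from sklearn.linear_model import LinearRegression

# rows: files; columns: [changes in last release, bug-fix changes in last release]
history_metrics = np.array([[12, 4], [3, 0], [25, 9], [7, 1], [18, 6]])
defects_next_release = np.array([5, 0, 9, 1, 6])

model = LinearRegression().fit(history_metrics, defects_next_release)
print(model.predict(np.array([[10, 3]])))   # predicted defects for a new file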
Export
BibTeX
@mastersthesis{Ragneala2009,
TITLE = {A Useful Resource for Defect Prediction Models},
AUTHOR = {Ragneala, Roxana},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2009},
DATE = {2009},
ABSTRACT = {Predicting likely software defects in the future is valuable for project managers when planning resource allocation for software testing. But building prediction models using only code metrics may not be suffice for accurate results. In this work, we investigate the value of code history metrics that can be collected from the project's version archives for the purpose of defect prediction. Our results suggest that prediction models built using code history metrics outperform those using traditional code metrics only.},
}
Endnote
%0 Thesis
%A Ragneala, Roxana
%Y Zeller, Andreas
%A referee: Weikum, Gerhard
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T A Useful Resource for Defect Prediction Models :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-BA8D-E
%I Universität des Saarlandes
%C Saarbrücken
%D 2009
%V master
%9 master
%X Predicting likely software defects in the future is valuable for project
managers when planning resource allocation for software testing. But building
prediction models using only code metrics may not be suffice for accurate
results. In this work, we investigate the value of code history metrics that
can be collected from the project's version archives for the purpose of defect
prediction. Our results suggest that prediction models built using code history
metrics outperform those using traditional code metrics only.
[109]
C. Rizkallah, “Proof Representations for Higher-Order Logic,” Universität des Saarlandes, Saarbrücken, 2009.
Export
BibTeX
@mastersthesis{Rizkallah2009,
TITLE = {Proof Representations for Higher-Order Logic},
AUTHOR = {Rizkallah, Christine},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2009},
DATE = {2009},
}
Endnote
%0 Thesis
%A Rizkallah, Christine
%Y Brown, Chad
%A referee: Smolka, Gert
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Proof Representations for Higher-Order Logic :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-BA90-4
%I Universität des Saarlandes
%C Saarbrücken
%D 2009
%V master
%9 master
[110]
T. Tylenda, “Time-aware Link Prediction in Evolving Social Networks,” Universität des Saarlandes, Saarbrücken, 2009.
Export
BibTeX
@mastersthesis{tylenda09,
TITLE = {Time-aware Link Prediction in Evolving Social Networks},
AUTHOR = {Tylenda, Tomasz},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-206CBE96EFA630DEC1257553004EF89D-tylenda09},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2009},
DATE = {2009},
}
Endnote
%0 Thesis
%A Tylenda, Tomasz
%Y Weikum, Gerhard
%A referee: Bedathur, Srikanta
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Time-aware Link Prediction in Evolving Social Networks :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-17CD-7
%F EDOC: 520431
%F OTHER: Local-ID: C1256DBF005F876D-206CBE96EFA630DEC1257553004EF89D-tylenda09
%I Universität des Saarlandes
%C Saarbrücken
%D 2009
%V master
%9 master
2008
[111]
L. M. Andreescu, “Pricing Information Goods in an Agent-based Information Filtering System,” Universität des Saarlandes, Saarbrücken, 2008.
Export
BibTeX
@mastersthesis{Andreescu08,
TITLE = {Pricing Information Goods in an Agent-based Information Filtering System},
AUTHOR = {Andreescu, Laura Maria},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125756E0038A185-55AF99F2B1A2EFFFC12575350042D9D0-Andreescu08},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
}
Endnote
%0 Thesis
%A Andreescu, Laura Maria
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Pricing Information Goods in an Agent-based Information Filtering System :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1A9E-A
%F EDOC: 428293
%F OTHER: Local-ID: C125756E0038A185-55AF99F2B1A2EFFFC12575350042D9D0-Andreescu08
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
[112]
O.-M. Ciobotaru, “Efficient Long-term Secure Universally Composable Commitments,” Universität des Saarlandes, Saarbrücken, 2008.
Abstract
Long-term security ensures that a protocol remains secure even in the future,
when the adversarial computational power could, potentially, become unlimited.
The notion of universal composability preserves the security of a cryptographic
protocol when it is used in combination with any other protocols, in possibly
complex systems. The area of long-term universally composable secure protocols
has been developed mostly by Müller-Quade and Unruh. Their research conducted
so far has shown the existence of secure long-term UC commitments under general
cryptographic assumptions, thus without having an emphasis on the efficiency of
the protocols designed. Building on their work and using very efficient
zero-knowledge proofs of knowledge from [CL02], this thesis presents a new
long-term universally composable secure commitment protocol that is both
efficient and plausible to use in practice.
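For background only, the following Python sketch shows a textbook Pedersen-style commitment, i.e., the basic primitive that commitment protocols build on; the toy parameters are insecure, and the construction shown is not the long-term universally composable scheme developed in the thesis.

# For context only: a textbook Pedersen commitment over a toy group, to recall
# what a commitment scheme provides (hiding and binding). This is NOT the
# thesis's long-term UC-secure construction; the tiny parameters are insecure.
import secrets

p = 2_147_483_647            # toy prime modulus (insecure size)
q = p - 1                    # order of the multiplicative group (toy choice)
g, h = 5, 7                  # toy bases with supposedly unknown discrete-log relation

def commit(message):
    r = secrets.randbelow(q)                     # blinding factor
    c = (pow(g, message, p) * pow(h, r, p)) % p  # c = g^m * h^r mod p
    return c, r

def verify(c, message, r):
    return c == (pow(g, message, p) * pow(h, r, p)) % p

c, r = commit(42)
print(verify(c, 42, r), verify(c, 43, r))        # True False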
Export
BibTeX
@mastersthesis{Ciobotaru2008,
TITLE = {Efficient Long-term Secure Universally Composable Commitments},
AUTHOR = {Ciobotaru, Oana-Madalina},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
ABSTRACT = {Long-term security ensures that a protocol remains secure even in the future, when the adversarial computational power could, potentially, become unlimited. The notion of universal composability preserves the security of a cryptographic protocol when it is used in combination with any other protocols, in possibly complex systems. The area of long-term universally composable secure protocols has been developed mostly by M{\"u}ller-Quade and Unruh. Their research conducted so far has shown the existence of secure long-term UC commitments under general cryptographic assumptions, thus without having an emphasis on the efficiency of the protocols designed. Building on their work and using very efficient zero-knowledge proofs of knowledge from [CL02], this thesis presents a new long-term universally composable secure commitment protocol that is both efficient and plausible to use in practice.},
}
Endnote
%0 Thesis
%A Ciobotaru, Oana-Madalina
%A referee: Unruh, Dominique
%Y Backes, Michael
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Efficient Long-term Secure Universally Composable Commitments :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-BABF-B
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
%X Long-term security ensures that a protocol remains secure even in the future,
when the adversarial computational power could, potentially, become unlimited.
The notion of universal composability preserves the security of a cryptographic
protocol when it is used in combination with any other protocols, in possibly
complex systems. The area of long-term universally composable secure protocols
has been developed mostly by Müller-Quade and Unruh. Their research conducted
so far has shown the existence of secure long-term UC commitments under general
cryptographic assumptions, thus without having an emphasis on the efficiency of
the protocols designed. Building on their work and using very efficient
zero-knowledge proofs of knowledge from [CL02], this thesis presents a new
long-term universally composable secure commitment protocol that is both
efficient and plausible to use in practice.
[113]
M. Dudev, “Personalization of Search on Structured Data,” Universität des Saarlandes, Saarbrücken, 2008.
Export
BibTeX
@mastersthesis{dudev08,
TITLE = {Personalization of Search on Structured Data},
AUTHOR = {Dudev, Minko},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125756E0038A185-7EBBA7D368754138C12574C900472DFD-dudev08},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
}
Endnote
%0 Thesis
%A Dudev, Minko
%Y Weikum, Gerhard
%A referee: Zeller, Andreas
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
%T Personalization of Search on Structured Data :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1CA6-C
%F EDOC: 428294
%F OTHER: Local-ID: C125756E0038A185-7EBBA7D368754138C12574C900472DFD-dudev08
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
[114]
Q. Gao, “Low Bit Rate Video Compression Using Inpainting PDEs and Optic Flow,” Universität des Saarlandes, Saarbrücken, 2008.
Export
BibTeX
@mastersthesis{GaoMaster08,
TITLE = {Low Bit Rate Video Compression Using Inpainting {PDE}s and Optic Flow},
AUTHOR = {Gao, Qi},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125756E0038A185-A8F6E9D8D03F66B3C1257590002ABCDE-GaoMaster08},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
}
Endnote
%0 Thesis
%A Gao, Qi
%Y Weickert, Joachim
%A referee: Bruhn, Andrés
%+ Computer Graphics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Low Bit Rate Video Compression Using Inpainting PDEs and Optic Flow :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1A7E-1
%F EDOC: 428295
%F OTHER: Local-ID: C125756E0038A185-A8F6E9D8D03F66B3C1257590002ABCDE-GaoMaster08
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
[115]
I. Georgiev, “RTfact Concepts for Generic Ray Tracing,” Universität des Saarlandes, Saarbrücken, 2008.
Abstract
For a long time now, interactive 3D graphics has been dominated by
rasterization algorithms. However, thanks to more than a decade of research and
the fast evolution of computer hardware, ray tracing has recently achieved
real-time performance. Thus, it is likely that ray tracing will become a
commodity choice for adding complex lighting effects to real-time rendering
engines. Nonetheless, interactive ray tracing research has been mostly
concentrated on few specific combinations of algorithms and data structures.
In this thesis we present RTfact, an attempt to bring the different aspects of ray tracing together in a component oriented, generic, and portable way, without sacrificing the performance benefits of hand-tuned single-purpose implementations. RTfact is a template library consisting of packet-centric components combined into an efficient ray tracing framework. Our generic design approach with loosely coupled algorithms and data structures allows for seamless integration of new algorithms with maximum runtime performance, while leveraging as much of the existing code base as possible. The SIMD abstraction layer of RTfact enables easy porting to new microprocessor architectures with wider SIMD instruction sets without the need of modifying existing code. The efficiency of C++ templates allows us to achieve fine component granularity and to incorporate a flexible physically-based surface shading model, which enables exploitation of ray coherence.
As a proof of concept we apply the library to a variety of rendering tasks and demonstrate its ability to deliver performance equal to existing optimized implementations.
Export
BibTeX
@mastersthesis{Georgiev2008,
TITLE = {{RT}fact Concepts for Generic Ray Tracing},
AUTHOR = {Georgiev, Iliyan},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
ABSTRACT = {For a long time now, interactive 3D graphics has been dominated by rasterization algorithms. However, thanks to more than a decade of research and the fast evolution of computer hardware, ray tracing has recently achieved real-time performance. Thus, it is likely that ray tracing will become a commodity choice for adding complex lighting effects to real-time rendering engines. Nonetheless, interactive ray tracing research has been mostly concentrated on few specific combinations of algorithms and data structures. In this thesis we present RTfact (an attempt to bring the different aspects of ray tracing together in a component oriented, generic, and portable way, without sacrificing the performance benefits of hand-tuned single-purpose implementations. RTfact is a template library consisting of packet-centric components combined into an ecient ray tracing framework. Our generic design approach with loosely coupled algorithms and data structures allows for seamless integration of new algorithms with maximum runtime performance, while leveraging as much of the existing code base as possible. The SIMD abstraction layer of RTfact enables easy porting to new microprocessor architectures with wider SIMD instruction sets without the need of modifying existing code. The eciency of C++ templates allows us to achieve fine component granularity and to incorporate a flexible physically- based surface shading model, which enables exploitation of ray coherence. As a proof of concept we apply the library to a variety of rendering tasks and demonstrate its ability to deliver performance equal to existing optimized implementations.},
}
Endnote
%0 Thesis
%A Georgiev, Iliyan
%Y Slusallek, Philipp
%A referee: Hack, Sebastian
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T RTfact Concepts for Generic Ray Tracing :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-BB2C-F
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
%X For a long time now, interactive 3D graphics has been dominated by
rasterization algorithms. However, thanks to more than a decade of research and
the fast evolution of computer hardware, ray tracing has recently achieved
real-time performance. Thus, it is likely that ray tracing will become a
commodity choice for adding complex lighting effects to real-time rendering
engines. Nonetheless, interactive ray tracing research has been mostly
concentrated on few specific combinations of algorithms and data structures.
In this thesis we present RTfact (an attempt to bring the different
aspects of ray tracing together in a component oriented, generic, and
portable way, without sacrificing the performance benefits of hand-tuned
single-purpose implementations. RTfact is a template library consisting of
packet-centric components combined into an ecient ray tracing framework. Our
generic design approach with loosely coupled algorithms and data structures
allows for seamless integration of new algorithms with maximum runtime
performance, while leveraging as much of the existing code base as possible.
The SIMD abstraction layer of RTfact enables easy porting to new
microprocessor architectures with wider SIMD instruction sets without the
need of modifying existing code. The eciency of C++ templates allows us
to achieve fine component granularity and to incorporate a flexible physically-
based surface shading model, which enables exploitation of ray coherence.
As a proof of concept we apply the library to a variety of rendering tasks and
demonstrate its ability to deliver performance equal to existing optimized
implementations.
[116]
M. A. Granados Velásquez, “Background Estimation from Photographs with Application to Ghost Removal in High Dynamic Range Image Reconstruction,” Universität des Saarlandes, Saarbrücken, 2008.
Export
BibTeX
@mastersthesis{GranadosMaster08,
TITLE = {Background Estimation from Photographs with Application to Ghost Removal in High Dynamic Range Image Reconstruction},
AUTHOR = {Granados Vel{\'a}squez, Miguel Andr{\'e}s},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125756E0038A185-1A9849076B83BAE3C1257590002ADC6E-GranadosMaster08},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
}
Endnote
%0 Thesis
%A Granados Velásquez, Miguel Andrés
%Y Slusallek, Philipp
%A referee: Lensch, Hendrik P. A.
%+ Computer Graphics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Computer Graphics, MPI for Informatics, Max Planck Society
%T Background Estimation from Photographs with Application to Ghost Removal in High Dynamic Range Image Reconstruction :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1A7A-9
%F EDOC: 428297
%F OTHER: Local-ID: C125756E0038A185-1A9849076B83BAE3C1257590002ADC6E-GranadosMaster08
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
[117]
M. L. Hamerlik, “Anonymity and Censorship Resistance in Semantic Overlay Networks,” Universität des Saarlandes, Saarbrücken, 2008.
Export
BibTeX
@mastersthesis{HamerlikMaster08,
TITLE = {Anonymity and Censorship Resistance in Semantic Overlay Networks},
AUTHOR = {Hamerlik, Marek Lech},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125756E0038A185-BE5670609AEB3F01C1257590002B14E2-HamerlikMaster08},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
}
Endnote
%0 Thesis
%A Hamerlik, Marek Lech
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
%T Anonymity and Censorship Resistance in Semantic Overlay Networks :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1A78-D
%F EDOC: 428298
%F OTHER: Local-ID: C125756E0038A185-BE5670609AEB3F01C1257590002B14E2-HamerlikMaster08
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
[118]
S. Holder, “Replication in Unstructured Peer-to-Peer Networks with Availability Constraints,” Universität des Saarlandes, Saarbrücken, 2008.
Export
BibTeX
@mastersthesis{Holder08,
TITLE = {Replication in Unstructured Peer-to-Peer Networks with Availability Constraints},
AUTHOR = {Holder, Stefan},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125756E0038A185-D791C134663D6D6AC12574CC003593E6-Holder08},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
}
Endnote
%0 Thesis
%A Holder, Stefan
%Y Weikum, Gerhard
%A referee: Wilhelm, Reinhard
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
%T Replication in Unstructured Peer-to-Peer Networks with Availability Constraints :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1AA4-9
%F EDOC: 428299
%F OTHER: Local-ID: C125756E0038A185-D791C134663D6D6AC12574CC003593E6-Holder08
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
[119]
L. Kasradze, “Implementation of a File-based Indexing Framework for the TopX Search Engine,” Universität des Saarlandes, Saarbrücken, 2008.
Export
BibTeX
@mastersthesis{kasradze07,
TITLE = {Implementation of a File-based Indexing Framework for the {TopX} Search Engine},
AUTHOR = {Kasradze, Levan},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125756E0038A185-EF4294FB083A0396C1257456003A6530-kasradze07},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
}
Endnote
%0 Thesis
%A Kasradze, Levan
%Y Weikum, Gerhard
%A referee: Bast, Hannah
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Implementation of a File-based Indexing Framework for the TopX Search Engine :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1AAE-6
%F EDOC: 428300
%F OTHER: Local-ID: C125756E0038A185-EF4294FB083A0396C1257456003A6530-kasradze07
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
[120]
G. Manolache, “Index-based Snippet Generation,” Universität des Saarlandes, Saarbrücken, 2008.
Export
BibTeX
@mastersthesis{manolache08,
TITLE = {Index-based Snippet Generation},
AUTHOR = {Manolache, Gabriel},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125756E0038A185-515D17CBD7A6628FC125747200462E5B-manolache08},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
}
Endnote
%0 Thesis
%A Manolache, Gabriel
%Y Bast, Hannah
%A referee: Weikum, Gerhard
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Index-based Snippet Generation :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1BEE-0
%F EDOC: 428303
%F OTHER: Local-ID: C125756E0038A185-515D17CBD7A6628FC125747200462E5B-manolache08
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
[121]
H. Peters, “Hardware and Software Extensions for a FTIR Multi-Touch Interface,” Universität des Saarlandes, Saarbrücken, 2008.
Export
BibTeX
@mastersthesis{Peters2008,
TITLE = {Hardware and Software Extensions for a {FTIR} Multi-Touch Interface},
AUTHOR = {Peters, Henning},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
}
Endnote
%0 Thesis
%A Peters, Henning
%Y Lensch, Hendrik
%A referee: Seidel, Hans-Peter
%+ Computer Graphics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T Hardware and Software Extensions for a FTIR Multi-Touch Interface :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-BB73-D
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
[122]
M. Rusinov, “Homomorphism Homogeneous Graphs,” Universität des Saarlandes, Saarbrücken, 2008.
Abstract
Homogeneous structures are a well-studied research area and have a variety of
uses, such as constructions in model theory and permutation group theory. Recently
Cameron and Nesetril have introduced homomorphism homogeneity by incorporating
homomorphisms in the definition of homogeneity. This has attracted a fair bit
of attention from the research community and a growing amount of research has
been done in this area for different relational structures.
The first goal of this thesis is to investigate the different classes of
homomorphism homogeneous simple undirected graphs with respect to different
kinds of homomorphisms and study the relations between these classes. Although
homogeneous graphs are heavily analyzed, little has been done for homomorphism
homogeneous graphs. Cameron and Nesetril posed two open questions when they
first defined these graphs. We answer both questions and also attempt to
classify the homomorphism homogeneous graphs. This, we believe, opens up future
possibilities for more analysis of these structures.
In this thesis we also treat the category of graphs with loops allowed and
further extend the idea of homogeneity by expanding the list of homomorphisms
that are taken into consideration in the definitions.
Export
BibTeX
@mastersthesis{Rusinov2008,
TITLE = {Homomorphism Homogeneous Graphs},
AUTHOR = {Rusinov, Momchil},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
ABSTRACT = {Homogeneous structures are a well studied research area and have variety uses like constructions in model theory and permutation group theory. Recently Cameron and Nesetril have introduced homomorphism homogeneity by incorporating homomorphisms in the definition of homogeneity. This has attracted a fair bit of attention from the research community and a growing amount of research has been done in this area for different relational structures. The first goal of this thesis is to investigate the different classes of homomorphism homogeneous simple undirected graphs with respect to different kinds of homomorphisms and study the relations between these classes. Although homogeneous graphs are heavily analyzed, little has been done for homomorphism homogeneous graphs. Cameron and Nesetril posed two open questions when they first defined these graphs. We answer both questions and also attempt to classify the homomorphism homogeneous graphs. This, we believe, opens up future possibilities for more analysis of these structures. In the thesis we also treat the category of graphs with loop allowed and further extend the idea of homogeneity by expanding the list of homomorphisms that are taken into consideration in the definitions.},
}
Endnote
%0 Thesis
%A Rusinov, Momchil
%Y Mehlhorn, Kurt
%A referee: Bläser, Markus
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
External Organizations
%T Homomorphism Homogeneous Graphs :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-BB7C-C
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
%X Homogeneous structures are a well-studied research area and have a variety of
uses, such as constructions in model theory and permutation group theory. Recently
Cameron and Nesetril have introduced homomorphism homogeneity by incorporating
homomorphisms in the definition of homogeneity. This has attracted a fair bit
of attention from the research community and a growing amount of research has
been done in this area for different relational structures.
The first goal of this thesis is to investigate the different classes of
homomorphism homogeneous simple undirected graphs with respect to different
kinds of homomorphisms and study the relations between these classes. Although
homogeneous graphs are heavily analyzed, little has been done for homomorphism
homogeneous graphs. Cameron and Nesetril posed two open questions when they
first defined these graphs. We answer both questions and also attempt to
classify the homomorphism homogeneous graphs. This, we believe, opens up future
possibilities for more analysis of these structures.
In this thesis we also treat the category of graphs with loops allowed and
further extend the idea of homogeneity by expanding the list of homomorphisms
that are taken into consideration in the definitions.
[123]
R. Socher, “A Learning-Based Hierarchical Model for Vessel Segmentation,” Universität des Saarlandes, Saarbrücken, 2008.
Export
BibTeX
@mastersthesis{socher08,
TITLE = {A Learning-Based Hierarchical Model for Vessel Segmentation},
AUTHOR = {Socher, Richard},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125756E0038A185-46F39494700CC85AC12574AB00351D0B-socher08},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
}
Endnote
%0 Thesis
%A Socher, Richard
%Y Weikum, Gerhard
%A referee: Weickert, Joachim
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
Machine Learning, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
%T A Learning-Based Hierarchical Model for Vessel Segmentation :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1AA8-1
%F EDOC: 428309
%F OTHER: Local-ID: C125756E0038A185-46F39494700CC85AC12574AB00351D0B-socher08
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%V master
%9 master
[124]
O. Sönmez, “Learning Game Theoretic Model Parameters Applied to Adversarial Classification,” Universität des Saarlandes, Saarbrücken, 2008.
Export
BibTeX
@mastersthesis{SonmezMaster_2008,
TITLE = {Learning Game Theoretic Model Parameters Applied to Adversarial Classi{fi}cation},
AUTHOR = {S{\"o}nmez, Orhan},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
CONTENTS = {In most of the machine learning approaches, it is commonly assumed that the data is independent and identically distributed from a data distribution. However, this is not the actual case in the real world applications. Hence, a further assumption could be done about the type of noise over the data in order to correctly model the real world. However, in some application domains such as spam {fi}ltering, intrusion detection, fraud detection etc. this assumption does not hold as there exists an opponent adversary that reacts the {fi}ltering process of the classi{fi}er and modi{fi}es the upcoming data accordingly. Hence, the performance of the classi{fi}er degrades rapidly after its deployment with the counter actions of the adversary. When not assuming the independence of the data generation from the classi{fi}cation, arise a new problem, namely the adversarial learning problem. Now, the classi{fi}er should estimate the classi{fi}cation parameters with considering the presence of the opponent adversary and furthermore has to adapt itself to the activities of it. In order to solve this adversarial learning problem, a two-player game is de{fi}ned between the classi{fi}er and the adversary. Afterward, the game results are resolved for different classi{fi}er losses such as adversary-aware and utilitybased classi{fi}er, and for different adversarial strategies such as worst-case, goal-based and utility-based. Furthermore, a minimax approximation and a Nikaido-Isado function-based Nash equilibrium calculation algorithm are proposed in order to calculate the resolved game results. Finally, these two algorithms are applied over a real-life and an arti{fi}cial data set for different settings and compared with a linear SVM.},
}
Endnote
%0 Thesis
%A Sönmez, Orhan
%Y Scheffer, Tobias
%A referee: Hein, Matthias
%+ Machine Learning, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
%T Learning Game Theoretic Model Parameters Applied to Adversarial Classification :
%G eng
%U http://hdl.handle.net/21.11116/0000-0002-8444-C
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%P 52 p.
%V master
%9 master
%Z In most machine learning approaches, it is commonly assumed that the data is independent and identically distributed according to some data distribution. However, this is not the actual case in real-world applications. Hence, a further assumption could be made about the type of noise over the data in order to model the real world correctly. However, in some application domains such as spam filtering, intrusion detection, fraud detection, etc., this assumption does not hold, as there exists an opposing adversary that reacts to the filtering process of the classifier and modifies the upcoming data accordingly. Hence, the performance of the classifier degrades rapidly after its deployment due to the counter-actions of the adversary.
When the independence of the data generation from the classification is no longer assumed, a new problem arises, namely the adversarial learning problem. Now, the classifier should estimate the classification parameters while considering the presence of the opposing adversary, and furthermore has to adapt itself to the adversary's activities.
In order to solve this adversarial learning problem, a two-player game is defined between the classifier and the adversary. Afterwards, the game results are resolved for different classifier losses, such as the adversary-aware and the utility-based classifier, and for different adversarial strategies, such as worst-case, goal-based and utility-based. Furthermore, a minimax approximation and a Nikaido-Isoda function-based Nash equilibrium calculation algorithm are proposed in order to compute the resolved game results.
Finally, these two algorithms are applied to a real-life and an artificial data set under different settings and compared with a linear SVM.
[125]
B. Taneva, “Conjoint Analysis: A Tool for Preference Analysis,” Universität des Saarlandes, Saarbrücken, 2008.
Export
BibTeX
@mastersthesis{TanevaMaster08,
TITLE = {Conjoint Analysis: A Tool for Preference Analysis},
AUTHOR = {Taneva, Bilyana},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125756E0038A185-F74EE865F73CA9F0C1257590002B6F7B-TanevaMaster08},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2008},
DATE = {2008},
}
Endnote
%0 Thesis
%A Taneva, Bilyana
%Y Giesen, Joachim
%A referee: Weikum, Gerhard
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Conjoint Analysis: A Tool for Preference Analysis :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1AAC-A
%F EDOC: 428310
%F OTHER: Local-ID: C125756E0038A185-F74EE865F73CA9F0C1257590002B6F7B-TanevaMaster08
%I Universität des Saarlandes
%C Saarbrücken
%D 2008
%P 54 p.
%V master
%9 master
2007
[126]
M. Abdel Maksoud, “Generating Code from Abstract VHDL Models,” Universität des Saarlandes, Saarbrücken, 2007.
Export
BibTeX
@mastersthesis{Maksoud2007,
TITLE = {Generating Code from Abstract {VHDL} Models},
AUTHOR = {Abdel Maksoud, Mohamed},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
}
Endnote
%0 Thesis
%A Abdel Maksoud, Mohamed
%Y Wilhelm, Reinhard
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
%T Generating Code from Abstract VHDL Models :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D127-5
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
[127]
V. Alvarez Amaya, “Approximation of Minimum Spanning Trees of Set of Points in the Hausdorff Metric,” Universität des Saarlandes, 2007.
Export
BibTeX
@mastersthesis{Alvarez2007,
TITLE = {Approximation of Minimum Spanning Trees of Set of Points in the Hausdorff Metric},
AUTHOR = {Alvarez Amaya, Victor},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
YEAR = {2007},
DATE = {2007},
}
Endnote
%0 Thesis
%A Alvarez Amaya, Victor
%Y Seidel, Raimund
%A referee: Funke, Stefan
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Approximation of Minimum Spanning Trees of Set of Points in the Hausdorff Metric :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-BB8E-2
%I Universität des Saarlandes
%D 2007
%V master
%9 master
[128]
J. Bogojeska, “Stability Analysis of Oncogenetic Trees Mixture Models,” Universität des Saarlandes, Saarbrücken, 2007.
Export
BibTeX
@mastersthesis{Bogojeska2007a,
TITLE = {Stability Analysis of Oncogenetic Trees Mixture Models},
AUTHOR = {Bogojeska, Jasmina},
LANGUAGE = {eng},
LOCALID = {Local-ID: C12573CC004A8E26-7A5625CE2DBE4839C1257283004617FD-Bogojeska2007a},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
}
Endnote
%0 Thesis
%A Bogojeska, Jasmina
%Y Rahnenführer, Jörg
%A referee: Lengauer, Thomas
%+ Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
%T Stability Analysis of Oncogenetic Trees Mixture Models :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1DC3-2
%F EDOC: 356591
%F OTHER: Local-ID: C12573CC004A8E26-7A5625CE2DBE4839C1257283004617FD-Bogojeska2007a
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
[129]
I. I. Brudaru, “Heuristics for Average Diameter Approximation with External Memory Algorithms,” Universität des Saarlandes, Saarbrücken, 2007.
Export
BibTeX
@mastersthesis{Brudaru2007,
TITLE = {Heuristics for Average Diameter Approximation with External Memory Algorithms},
AUTHOR = {Brudaru, Irina Ioana},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
}
Endnote
%0 Thesis
%A Brudaru, Irina Ioana
%Y Meyer, Ulrich
%A referee: Funke, Stefan
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Heuristics for Average Diameter Approximation with External Memory Algorithms :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-C325-9
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
[130]
M. Celikik, “Efficient Large-Scale Clustering of Spelling Variants, with Applications to Error-Tolerant Text Search,” Universität des Saarlandes, Saarbrücken, 2007.
Abstract
In this thesis, the following spelling variants clustering problem is
considered: Given a list of distinct words, called a lexicon, compute (possibly
overlapping) clusters of words which are spelling variants of each other. We
are looking for algorithms that are both efficient and accurate. Accuracy is
measured with respect to human judgment, e.g., a cluster is 100% accurate if it
contains all true spelling variants of the unique correct word it contains and
no other words, as judged by a human.
We have sifted the large body of literature on approximate string searching and
spelling correction for its applicability to our problem. We have combined
various ideas from previous approaches into two new algorithms, with two
distinctly different trade-offs between efficiency and accuracy. We have
analyzed both algorithms and tested them experimentally on a variety of test
collections, which were chosen to exhibit the whole spectrum of spelling errors
as they occur in practice (human-made, OCR-induced, garbage). Our largest
lexicon, containing roughly 25 million words, can be processed in half an hour
on a single machine. The accuracies we obtain range from 88% to 95%. We show
that previous approaches, if directly applied to our problem, are either
significantly slower or significantly less accurate or both.
Our spelling variants clustering problem arises naturally in the context of
search engine spelling correction of the following kind: For a given query,
return not only documents matching the query words exactly but also those
matching their spelling variants. This is the inverse of the well-known
"did you mean: ..." web search engine feature, where the error tolerance is on
the side of the query, and not on the side of the documents. We have integrated
our algorithms with the CompleteSearch engine, and show that this feature can
be achieved without significant blowup in either index size or query processing
time.
Export
BibTeX
@mastersthesis{Celikik2007,
TITLE = {Efficient Large-Scale Clustering of Spelling Variants, with Applications to Error-Tolerant Text Search},
AUTHOR = {Celikik, Marjan},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
ABSTRACT = {In this thesis, the following spelling variants clustering problem is considered: Given a list of distinct words, called lexicon, compute (possibly overlapping) clusters of words which are spelling variants of each other. We are looking for algorithms that are both efficient and accurate. Accuracy is measured with respect to human judgment, e.g., a cluster is 100 accurate if it contains all true spelling variants of the unique correct word it contains and no other words, as judged by a human. We have sifted the large body of literature on approximate string searching and spelling correction problem for its applicability to our problem. We have combined various ideas from previous approaches to two new algorithms, with two distinctly different trade-offs between efficiency and accuracy. We have analyzed both algorithms and tested them experimentally on a variety of test collections, which were chosen to exhibit the whole spectrum of spelling errors as they occur in practice (human-made, OCR-induced, garbage). Our largest lexicon, containing roughly 25 million words, can be processed in half an hour on a single machine. The accuracies we obtain range from 88 -- 95. We show that previous approaches, if directly applied to our problem, are either significantly slower or significantly less accurate or both. Our spelling variants clustering problem arises naturally in the context of search engine spelling correction of the following kind: For a given query, return not only documents matching the query words exactly but also those matching their spelling variants. This is inverse to the well-known {\diamond}did you mean: ...{\diamond} web search engine feature, where the error tolerance is on the side of the query, and not on the side of the documents. We have integrated our algorithms with the CompleteSearch engine, and show that this feature can be achieved without significant blowup in either index size or query processing time.},
}
Endnote
%0 Thesis
%A Celikik, Marjan
%Y Weikum, Gerhard
%A referee: Bast, Holger
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Efficient Large-Scale Clustering of Spelling Variants, with Applications to Error-Tolerant Text Search :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-C33D-4
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
%X In this thesis, the following spelling variants clustering problem is
considered: Given a list of distinct words, called a lexicon, compute (possibly
overlapping) clusters of words which are spelling variants of each other. We
are looking for algorithms that are both efficient and accurate. Accuracy is
measured with respect to human judgment, e.g., a cluster is 100% accurate if it
contains all true spelling variants of the unique correct word it contains and
no other words, as judged by a human.
We have sifted the large body of literature on approximate string searching and
spelling correction for its applicability to our problem. We have combined
various ideas from previous approaches into two new algorithms, with two
distinctly different trade-offs between efficiency and accuracy. We have
analyzed both algorithms and tested them experimentally on a variety of test
collections, which were chosen to exhibit the whole spectrum of spelling errors
as they occur in practice (human-made, OCR-induced, garbage). Our largest
lexicon, containing roughly 25 million words, can be processed in half an hour
on a single machine. The accuracies we obtain range from 88% to 95%. We show
that previous approaches, if directly applied to our problem, are either
significantly slower or significantly less accurate or both.
Our spelling variants clustering problem arises naturally in the context of
search engine spelling correction of the following kind: For a given query,
return not only documents matching the query words exactly but also those
matching their spelling variants. This is the inverse of the well-known
"did you mean: ..." web search engine feature, where the error tolerance is on
the side of the query, and not on the side of the documents. We have integrated
our algorithms with the CompleteSearch engine, and show that this feature can
be achieved without significant blowup in either index size or query processing
time.
[131]
A. Chitea, “Efficient Semantic Annotation of the English Wikipedia,” Universität des Saarlandes, Saarbrücken, 2007.
Export
BibTeX
@mastersthesis{Chitea2007,
TITLE = {Efficient Semantic Annotation of the English Wikipedia},
AUTHOR = {Chitea, Alexandru},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
}
Endnote
%0 Thesis
%A Chitea, Alexandru
%Y Bast, Hannah
%A referee: Weikum, Gerhard
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Efficient Semantic Annotation of the English Wikipedia :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-C3F9-D
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
[132]
D. Dumitriu, “Graph-based Conservative Surface Reconstruction,” Universität des Saarlandes, Saarbrücken, 2007.
Abstract
We propose a new approach for reconstructing a 2-manifold from a point sample
in R³. Compared to previous algorithms, our approach is novel in that it throws
away geometry information early on in the reconstruction process and mainly
operates combinatorially on a graph structure.
Furthermore, it is very conservative in creating adjacencies between samples in
the vicinity of slivers; still, we can prove that the resulting reconstruction
faithfully resembles the original 2-manifold. While the theoretical proof
requires an extremely high sampling density, our prototype implementation
of the approach produces surprisingly good results on typical sample sets.
Export
BibTeX
@mastersthesis{Dumitriu2007,
TITLE = {Graph-based Conservative Surface Reconstruction},
AUTHOR = {Dumitriu, Daniel},
LANGUAGE = {eng},
LOCALID = {Local-ID: C12573CC004A8E26-6DF2C53978AEEB3DC12573590049A125-Dumitriu2007},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
ABSTRACT = {We propose a new approach for reconstructing a 2-manifold from a point sample in R³. Compared to previous algorithms, our approach is novel in that it throws away geometry information early on in the reconstruction process and mainly operates combinatorially on a graph structure. Furthermore, it is very conservative in creating adjacencies between samples in the vicinity of slivers, still we can prove that the resulting reconstruction faithfully resembles the original 2-manifold. While the theoretical proof requires an extremely high sampling density, our prototype implementation of the approach produces surprisingly good results on typical sample sets.},
}
Endnote
%0 Thesis
%A Dumitriu, Daniel
%Y Kutz, Martin
%A referee: Funke, Stefan
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Graph-based Conservative Surface Reconstruction :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1DAA-B
%F EDOC: 356698
%F OTHER: Local-ID: C12573CC004A8E26-6DF2C53978AEEB3DC12573590049A125-Dumitriu2007
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
%X We propose a new approach for reconstructing a 2-manifold from a point sample
in R³. Compared to previous algorithms, our approach is novel in that it throws
away geometry information early on in the reconstruction process and mainly
operates combinatorially on a graph structure.
Furthermore, it is very conservative in creating adjacencies between samples in
the vicinity of slivers; still, we can prove that the resulting reconstruction
faithfully resembles the original 2-manifold. While the theoretical proof
requires an extremely high sampling density, our prototype implementation
of the approach produces surprisingly good results on typical sample sets.
[133]
S. Elbassuoni, “Adaptive Personalization of Web Search,” Universität des Saarlandes, Saarbrücken, 2007.
Abstract
An often-stated problem with state-of-the-art web search is its lack of user
adaptation, as all users are presented with the same search results for a given
query string. A user submitting an ambiguous query such as "java" with a strong
interest in traveling might appreciate finding pages related to the
Indonesian island Java. However, if the same user searched for programming
tutorials a few minutes ago, the situation would be completely different and
call for programming-related results. Furthermore, suppose our sample user
searches for "java hashmap". Again, imposing her interest in traveling might
this time have the contrary effect and even harm the result quality. Thus the
effectiveness of personalizing web search shows high variance in performance
depending on the query, the user and the search context. To this end, carefully
choosing the right personalization strategy in a context-sensitive manner is
critical for an improvement of search results.
In this thesis, we present a general framework that dynamically adapts the
query-result ranking to the different information needs in order to improve the
search experience for the individual user. We distinguish three different
search goals, namely whether the user re-searches known information, delves
deeper into a topic she is generally interested in, or satisfies an ad-hoc
information need. We take an implicit relevance feedback approach that makes
use of the user's web interactions; however, we vary what constitutes the
examples of relevant and irrelevant information according to the user's search
mode. We show that incorporating user behavior data can significantly improve
the ordering of top results in a real web search setting.
Export
BibTeX
@mastersthesis{Elbassuoni-Master,
TITLE = {Adaptive Personalization of Web Search},
AUTHOR = {Elbassuoni, Shady},
LANGUAGE = {eng},
LOCALID = {Local-ID: C12573CC004A8E26-6252DBEA76F19925C125730E0040A735-Elbassuoni-Master},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
ABSTRACT = {An often stated problem in the state-of-the-art web search is its lack of user adaptation, as all users are presented with the same search results for a given query string. A user submitting an ambiguous query such as {\textquotedblright}java{\textquotedblright} with a strong interest in traveling might appreciate finding pages related to the Indonesian island Java. However, if the same user searched for programming tutorials a few minutes ago, the situation would be completely different, and call for programming-related results. Furthermore suppose our sample user searches for {\textquotedblright}java hashmap{\textquotedblright}. Again imposing her interest into traveling might this time have the contrary effect and even harm the result quality. Thus the effectiveness of a personalization of web search shows high variance in performance depending on the query, the user and the search context. To this end, carefully choosing the right personalization strategy in a contextsensitive manner is critical for an improvement of search results. In this thesis, we present a general framework that dynamically adapts the query-result ranking to the different information needs in order to improve the search experience for the individual user. We distinguish three different search goals, namely whether the user re-searches known information, delves deeper into a topic she is generally interested in, or satisfies an ad-hoc information need. We take an implicit relevance feedback approach that makes use of the user{\textquoteright}s web interactions, however, vary what constitutes the examples of relevant and irrelevant information according to the user{\textquoteright}s search mode. We show that incorporating user behavior data can significantly improve the ordering of top results in a real web search setting.},
}
Endnote
%0 Thesis
%A Elbassuoni, Shady
%Y Weikum, Gerhard
%A referee: Hermanns, Holger
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
%T Adaptive Personalization of Web Search :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1D9F-5
%F EDOC: 356495
%F OTHER: Local-ID: C12573CC004A8E26-6252DBEA76F19925C125730E0040A735-Elbassuoni-Master
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
%X An often-stated problem with state-of-the-art web search is its lack of user
adaptation, as all users are presented with the same search results for a given
query string. A user submitting an ambiguous query such as "java" with a strong
interest in traveling might appreciate finding pages related to the
Indonesian island Java. However, if the same user searched for programming
tutorials a few minutes ago, the situation would be completely different and
call for programming-related results. Furthermore, suppose our sample user
searches for "java hashmap". Again, imposing her interest in traveling might
this time have the contrary effect and even harm the result quality. Thus the
effectiveness of personalizing web search shows high variance in performance
depending on the query, the user and the search context. To this end, carefully
choosing the right personalization strategy in a context-sensitive manner is
critical for an improvement of search results.
In this thesis, we present a general framework that dynamically adapts the
query-result ranking to the different information needs in order to improve the
search experience for the individual user. We distinguish three different
search goals, namely whether the user re-searches known information, delves
deeper into a topic she is generally interested in, or satisfies an ad-hoc
information need. We take an implicit relevance feedback approach that makes
use of the user's web interactions; however, we vary what constitutes the
examples of relevant and irrelevant information according to the user's search
mode. We show that incorporating user behavior data can significantly improve
the ordering of top results in a real web search setting.
[134]
P. Emeliyanenko, “Visualization of Points and Segments of Real Algebraic Plane Curves,” Universität des Saarlandes, Saarbrücken, 2007.
Abstract
This thesis presents an exact and complete approach for visualization of
segments and points of real plane algebraic curves given in implicit form
$f(x,y) = 0$. A curve segment is a distinct curve branch consisting of regular
points only. Visualization of algebraic curves having self-intersection and
isolated points constitutes the main challenge. Visualization of curve segments
involves even more difficulties, since here we are faced with the problem of
discriminating different curve branches, which can pass arbitrarily close to
each other.
Our approach is robust and efficient (as shown by our benchmarks); it combines
the advantages of both curve tracking and space subdivision methods and is able
to correctly rasterize segments of arbitrary-degree algebraic curves using
double, multi-precision or exact rational arithmetic.
Export
BibTeX
@mastersthesis{Emel2007,
TITLE = {Visualization of Points and Segments of Real Algebraic Plane Curves},
AUTHOR = {Emeliyanenko, Pavel},
LANGUAGE = {eng},
LOCALID = {Local-ID: C12573CC004A8E26-FB0C2A279BD897D1C12572900046C0A0-Emel2007},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
ABSTRACT = {This thesis presents an exact and complete approach for visualization of segments and points of real plane algebraic curves given in implicit form $f(x,y) = 0$. A curve segment is a distinct curve branch consisting of regular points only. Visualization of algebraic curves having self-intersection and isolated points constitutes the main challenge. Visualization of curve segments involves even more difficulties since here we are faced with a problem of discriminating different curve branches, which can pass arbitrary close to each other. Our approach is robust and efficient (as shown by our benchmarks), it combines the advantages both of curve tracking and space subdivision methods and is able to correctly rasterize segments of arbitrary-degree algebraic curves using double, multi-precision or exact rational arithmetic.},
}
Endnote
%0 Thesis
%A Emeliyanenko, Pavel
%A referee: Wolpert, Nicola
%Y Mehlhorn, Kurt
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Visualization of Points and Segments of Real Algebraic Plane Curves :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1DC6-B
%F EDOC: 356729
%F OTHER: Local-ID: C12573CC004A8E26-FB0C2A279BD897D1C12572900046C0A0-Emel2007
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
%X This thesis presents an exact and complete approach for visualization of
segments and points of real plane algebraic curves given in implicit form
$f(x,y) = 0$. A curve segment is a distinct curve branch consisting of regular
points only. Visualization of algebraic curves having self-intersection and
isolated points constitutes the main challenge. Visualization of curve segments
involves even more difficulties, since here we are faced with the problem of
discriminating different curve branches, which can pass arbitrarily close to
each other.
Our approach is robust and efficient (as shown by our benchmarks); it combines
the advantages of both curve tracking and space subdivision methods and is able
to correctly rasterize segments of arbitrary-degree algebraic curves using
double, multi-precision or exact rational arithmetic.
[135]
F. F. Horozal, “Towards a Natural Representation of Mathematics in Proof Assistants,” Universität des Saarlandes, Saarbrücken, 2007.
Abstract
In this thesis we investigate the proof assistant Scunak in order to explore
the relationship between informal mathematical texts and their Scunak
counterparts. The investigation is based on a case study in which we have
formalized parts of an introductory book on real analysis. Based on this case
study, we illustrate significant aspects of the formal representation of
mathematics in Scunak. In particular, we present the formal proof of the
example lim(1/n) = 0.
Moreover, we present a comparison of Scunak with two well-known systems for
formalizing mathematics, the Mizar System and Isabelle/HOL. We have proved the
example lim(1/n) = 0 in Mizar and Isabelle/HOL as well and we relate certain
features of formal mathematics in Mizar and Isabelle/HOL to corresponding
features of the Scunak type theory in light of this example.
Export
BibTeX
@mastersthesis{Horozal2007,
TITLE = {Towards a Natural Representation of Mathematics in Proof Assistants},
AUTHOR = {Horozal, Feryal Fulya},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
ABSTRACT = {In this thesis we investigate the proof assistant Scunak in order to explore <br>the relationship between informal mathematical texts and their Scunak <br>counterparts. The investigation is based on a case study in which we have <br>formalized parts of an introductory book on real analysis. Based on this case <br>study, we illustrate significant aspects of the formal representation of <br>mathematics in Scunak. In particular, we present the formal proof of the <br>example lim(1/n) = 0.<br>Moreover, we present a comparison of Scunak with two well-known systems for <br>formalizing mathematics, the Mizar System and Isabelle/HOL. We have proved the <br>example lim(1/n) = 0 in Mizar and Isabelle/HOL as well and we relate certain <br>features of formal mathematics in Mizar and Isabelle/HOL to corresponding <br>features of the Scunak type theory in light of this example.},
}
Endnote
%0 Thesis
%A Horozal, Feryal Fulya
%Y Siekmann, Jörg
%A referee: Smolka, Gert
%A referee: Brown, Chad
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
%T Towards a Natural Representation of Mathematics in Proof Assistants :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-CF7A-B
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
%X In this thesis we investigate the proof assistant Scunak in order to explore
the relationship between informal mathematical texts and their Scunak
counterparts. The investigation is based on a case study in which we have
formalized parts of an introductory book on real analysis. Based on this case
study, we illustrate significant aspects of the formal representation of
mathematics in Scunak. In particular, we present the formal proof of the
example lim(1/n) = 0.
Moreover, we present a comparison of Scunak with two well-known systems for
formalizing mathematics, the Mizar System and Isabelle/HOL. We have proved the
example lim(1/n) = 0 in Mizar and Isabelle/HOL as well and we relate certain
features of formal mathematics in Mizar and Isabelle/HOL to corresponding
features of the Scunak type theory in light of this example.
[136]
C. Hritcu, “Step-indexed Semantic Model of Types for the Functional Object Calculus,” Universität des Saarlandes, Saarbrücken, 2007.
Abstract
Step-indexed semantic models of types were proposed as an alternative to the
purely syntactic proofs of type safety using subject-reduction. This thesis
introduces a step-indexed model for the functional object calculus, and uses it
to prove the soundness of an expressive type system with object types,
subtyping, recursive and bounded quantified types.
Export
BibTeX
@mastersthesis{Hritcu2007,
TITLE = {Step-indexed Semantic Model of Types for the Functional Object Calculus},
AUTHOR = {Hritcu, Catalin},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
ABSTRACT = {Step-indexed semantic models of types were proposed as an alternative to the purely syntactic proofs of type safety using subject-reduction. This thesis introduces a step-indexed model for the functional object calculus, and uses it to prove the soundness of an expressive type system with object types, subtyping, recursive and bounded quantified types.},
}
Endnote
%0 Thesis
%A Hritcu, Catalin
%Y Smolka, Gert
%A referee: Hermanns, Holger
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Step-indexed Semantic Model of Types for the Functional Object Calculus :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D108-B
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
%X Step-indexed semantic models of types were proposed as an alternative to the
purely syntactic proofs of type safety using subject-reduction. This thesis
introduces a step-indexed model for the functional object calculus, and uses it
to prove the soundness of an expressive type system with object types,
subtyping, recursive and bounded quantified types.
[137]
L. Machablishvili, “Computing k-hop Broadcast Trees Exactly,” Universität des Saarlandes, Saarbrücken, 2007.
Export
BibTeX
@mastersthesis{Machablishvili2007,
TITLE = {Computing k-hop Broadcast Trees Exactly},
AUTHOR = {Machablishvili, Levan},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
}
Endnote
%0 Thesis
%A Machablishvili, Levan
%Y Funke, Stefan
%A referee: Hermanns, Holger
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
External Organizations
%T Computing k-hop Broadcast Trees Exactly :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D10E-0
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
[138]
Y. Mileva, “Invariance with Optic Flow,” Universität des Saarlandes, Saarbrücken, 2007.
Abstract
Variational methods currently belong to the most accurate techniques for the
computation of the displacement field between the frames of an image sequence.
Accuracy and time performance improvements of these methods are achieved every
year. Most of the effort is directed towards finding better data and smoothness
terms for the energy functional. Usually, people work mainly with
black-and-white image sequences.
In this thesis we consider only colour images, as we believe that colour itself
carries much more information than the grey value and can help us obtain a
better estimation of the optic flow. So far, most of the research done on optic
flow computation does not consider the presence of realistic illumination
changes in the image sequences.
One of the main goals of this thesis is to find new constancy assumptions for
the data term which overcome the problems of severe illumination changes. So
far, no research has been done on combining variational methods with
statistical moments for the purpose of optic flow computation. The second goal
of this thesis is to investigate how, and to what extent, optic flow methods
can benefit from rotationally invariant moments. We introduce a new
variational framework that combines all of the above-mentioned new
assumptions into a successful optic flow computation technique.
Export
BibTeX
@mastersthesis{Mileva2007,
TITLE = {Invariance with Optic Flow},
AUTHOR = {Mileva, Yana},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
ABSTRACT = {Variational methods currently belong to the most accurate techniques for the computation of the displacement field between the frames of an image sequence. Accuracy and time performance improvements of these methods are achieved every year. Most of the effort is directed towards finding better data and smoothness terms for the energy functional. Usually people are working mainly with black-and-white image sequences. In this thesis we will consider only colour images, as we believe that the colour itself carries much more information than the grey value and can help us for the better estimation of the optic flow. So far most of the research done in optic flow computation does not consider the presence of realistic illumination changes in the image sequences. One of the main goals of this thesis is to find new constancy assumptions for the data term, which overcome the problems of severe illumination changes. So far no research has been also done on combining variational methods with statistical moments for the purpose of optic flow computation. The second goal of this thesis is to investigate how and to what extend the optic flow methods can benefit from the rotational invariant moments. We will introduce a new variational methods framework that can combine all of the above mentioned new assumptions into a successful optic flow computation technique.},
}
Endnote
%0 Thesis
%A Mileva, Yana
%Y Weickert, Joachim
%A referee: Wilhelm, Reinhard
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Invariance with Optic Flow :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D19C-E
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
%X Variational methods currently belong to the most accurate techniques for the
computation of the displacement field between the frames of an image sequence.
Accuracy and time performance improvements of these methods are achieved every
year. Most of the effort is directed towards finding better data and smoothness
terms for the energy functional. Usually, people work mainly with
black-and-white image sequences.
In this thesis we consider only colour images, as we believe that colour itself
carries much more information than the grey value and can help us obtain a
better estimation of the optic flow. So far, most of the research done on optic
flow computation does not consider the presence of realistic illumination
changes in the image sequences.
One of the main goals of this thesis is to find new constancy assumptions for
the data term which overcome the problems of severe illumination changes. So
far, no research has been done on combining variational methods with
statistical moments for the purpose of optic flow computation. The second goal
of this thesis is to investigate how, and to what extent, optic flow methods
can benefit from rotationally invariant moments. We introduce a new
variational framework that combines all of the above-mentioned new
assumptions into a successful optic flow computation technique.
[139]
S. Nenova, “Extraction of Attack Signatures,” Universität des Saarlandes, Saarbrücken, 2007.
Abstract
With the advance of technology, the need for fast reaction to remote attacks
gains in importance. A common practice to help detect malicious activity is to
install an Intrusion Detection System. Intrusion detection systems are equipped
with a set of signatures: descriptions of known intrusion attempts. They
monitor traffic and use the signatures to detect intrusion attempts.
To date, attack signatures are still mostly derived manually. However, to
ensure the security of computer systems and data, the speed and quality of
signature generation have to be improved. To help achieve this task, we propose
an approach for automatic extraction of attack signatures.
In contrast to the majority of the existing research in the area, we do not
confine our approach to a particular type of attack. In particular, we are the
first to try signature extraction for attacks resulting from misconfigured
security policies. Whereas the majority of existing approaches rely on
statistical methods and require many attack instances in order to launch the
signature generation mechanism, we use experimentation and need only a single
attack instance.
For experimentation, we combine an existing framework for capture and replay of
system calls with an appropriate minimization algorithm. We propose three
minimization algorithms: Delta Debugging, Binary Debugging and Consecutive
Binary Debugging.
We evaluate the performance of the different algorithms and test our approach
with an example program. In all test cases, our application successfully
extracts the attack signature. Our current results suggest that this is a
promising approach that can help us defend better and faster against unknown
attacks.
Export
BibTeX
@mastersthesis{Nenova2007,
TITLE = {Extraction of Attack Signatures},
AUTHOR = {Nenova, Stefana},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
ABSTRACT = {With the advance of technology, the need for fast reaction to remote attacks gains in importance. A common practice to help detect malicious activity is to install an Intrusion Detection System. Intrusion detection systems are equipped with a set of signatures{\diamond}descriptions of known intrusion attempts. They monitor traffic and use the signatures to detect intrusion attempts. To date, attack signatures are still mostly derived manually. However, to ensure the security of computer systems and data, the speed and quality of signature generation has to be improved. To help achieve the task, we propose an approach for automatic extraction of attack signatures. In contrast to the majority of the existing research in the area, we do not confine our approach to a particular type of attack. In particular, we are the first to try signature extraction for attacks resulting from misconfigured security policies. Whereas the majority of existing approaches rely on statistical methods and require many attack instances in order to launch the signature generation mechanism, we use experimentation and need only a single attack instance. For experimentation, we combine an existing framework for capture and replay of system calls with an appropriate minimization algorithm. We propose three minimization algorithms: Delta Debugging, Binary Debugging and Consecutive Binary Debugging. We evaluate the performance of the different algorithms and test our approach with an example program. In all test cases, our application successfully extracts the attack signature. Our current results suggest that this is a promising approach that can help us defend better and faster against unknown attacks.},
}
Endnote
%0 Thesis
%A Nenova, Stefana
%Y Zeller, Andreas
%A referee: Wilhelm, Reinhard
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Extraction of Attack Signatures :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D1A0-2
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
%X With the advance of technology, the need for fast reaction to remote attacks
gains in importance. A common practice to help detect malicious activity is to
install an Intrusion Detection System. Intrusion detection systems are equipped
with a set of signatures: descriptions of known intrusion attempts. They
monitor traffic and use the signatures to detect intrusion attempts.
To date, attack signatures are still mostly derived manually. However, to
ensure the security of computer systems and data, the speed and quality of
signature generation have to be improved. To help achieve this task, we propose
an approach for automatic extraction of attack signatures.
In contrast to the majority of the existing research in the area, we do not
confine our approach to a particular type of attack. In particular, we are the
first to try signature extraction for attacks resulting from misconfigured
security policies. Whereas the majority of existing approaches rely on
statistical methods and require many attack instances in order to launch the
signature generation mechanism, we use experimentation and need only a single
attack instance.
For experimentation, we combine an existing framework for capture and replay of
system calls with an appropriate minimization algorithm. We propose three
minimization algorithms: Delta Debugging, Binary Debugging and Consecutive
Binary Debugging.
We evaluate the performance of the different algorithms and test our approach
with an example program. In all test cases, our application successfully
extracts the attack signature. Our current results suggest that this is a
promising approach that can help us defend better and faster against unknown
attacks.
[140]
G. Pandey, “Retrieval Model Enhancement by Implicit Feedback from Query Logs,” Universität des Saarlandes, Saarbrücken, 2007.
Export
BibTeX
@mastersthesis{PandeyMaster08,
TITLE = {Retrieval Model Enhancement by Implicit Feedback from Query Logs},
AUTHOR = {Pandey, Gaurav},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125756E0038A185-4BA81B4BEC36919EC1257590002B50A7-PandeyMaster08},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
}
Endnote
%0 Thesis
%A Pandey, Gaurav
%Y Weikum, Gerhard
%A referee: Bast, Hannah
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Retrieval Model Enhancement by Implicit Feedback from Query Logs :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1DA7-2
%F EDOC: 428305
%F OTHER: Local-ID: C125756E0038A185-4BA81B4BEC36919EC1257590002B50A7-PandeyMaster08
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%P X, 51 S.
%V master
%9 master
[141]
S. Solomon, “Evaluation of Relevance Feedback Algorithms for XML Retrieval,” Universität des Saarlandes, Saarbrücken, 2007.
Abstract
Information retrieval and feedback in XML are rather new fields for researchers; natural questions arise, such as: how good are the feedback algorithms in XML IR? Can they be evaluated with standard evaluation tools? Even though some evaluation methods have been proposed in the literature, it is still not clear which of them are applicable in the context of XML IR, and which metrics they can be combined with to assess the quality of XML retrieval algorithms that use feedback.
We propose a solution for fairly evaluating the performance of the XML search engines that use feedback for improving the query results. Compared to previous approaches, we aim at removing the effect of the results for which the system has knowledge about their relevance, and at measuring the improvement on unseen relevant elements.
We implemented our proposed evaluation methodologies by extending a standard evaluation tool with a module capable of assessing feedback algorithms for a specific set of metrics. We performed multiple tests on runs from both INEX 2005 and INEX 2006, covering two different XML document collections.
The performance of the assessed feedback algorithms did not reach the theoretical optimal values either for the proposed evaluation methodologies, or for the used metrics. The analysis of the results shows that, although the six evaluation techniques provide good improvement figures, none of them can be declared the absolute winner. Despite the lack of a definitive conclusion, our findings provide a better understanding of the quality of feedback algorithms.
Export
BibTeX
@mastersthesis{Solomon2007,
TITLE = {Evaluation of Relevance Feedback Algorithms for {XML} Retrieval},
AUTHOR = {Solomon, Silvana},
LANGUAGE = {eng},
LOCALID = {Local-ID: C12573CC004A8E26-7696201B4CA7C699C125730800414211-Solomon2007},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
ABSTRACT = {Information retrieval and feedback in {XML} are rather new fields for researchers; natural questions arise, such as: how good are the feedback algorithms in {XML IR}? Can they be evaluated with standard evaluation tools? Even though some evaluation methods have been proposed in the literature, it is still not clear which of them are applicable in the context of {XML IR}, and which metrics they can be combined with to assess the quality of {XML} retrieval algorithms that use feedback. We propose a solution for fairly evaluating the performance of the {XML} search engines that use feedback for improving the query results. Compared to previous approaches, we aim at removing the effect of the results for which the system has knowledge about their relevance, and at measuring the improvement on unseen relevant elements. We implemented our proposed evaluation methodologies by extending a standard evaluation tool with a module capable of assessing feedback algorithms for a specific set of metrics. We performed multiple tests on runs from both {INEX} 2005 and {INEX} 2006, covering two different {XML} document collections. The performance of the assessed feedback algorithms did not reach the theoretical optimal values either for the proposed evaluation methodologies, or for the used metrics. The analysis of the results shows that, although the six evaluation techniques provide good improvement figures, none of them can be declared the absolute winner. Despite the lack of a definitive conclusion, our findings provide a better understanding of the quality of feedback algorithms.},
}
Endnote
%0 Thesis
%A Solomon, Silvana
%Y Weikum, Gerhard
%A referee: Schenkel, Ralf
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Evaluation of Relevance Feedback Algorithms for XML Retrieval :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-1D94-C
%F EDOC: 356461
%F OTHER: Local-ID: C12573CC004A8E26-7696201B4CA7C699C125730800414211-Solomon2007
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
%X Information retrieval and feedback in XML are rather new fields for researchers; natural questions arise, such as: how good are the feedback algorithms in XML IR? Can they be evaluated with standard evaluation tools? Even though some evaluation methods have been proposed in the literature, it is still not clear which of them are applicable in the context of XML IR, and which metrics they can be combined with to assess the quality of XML retrieval algorithms that use feedback.
We propose a solution for fairly evaluating the performance of the XML search engines that use feedback for improving the query results. Compared to previous approaches, we aim at removing the effect of the results for which the system has knowledge about their relevance, and at measuring the improvement on unseen relevant elements.
We implemented our proposed evaluation methodologies by extending a standard evaluation tool with a module capable of assessing feedback algorithms for a specific set of metrics. We performed multiple tests on runs from both INEX 2005 and INEX 2006, covering two different XML document collections.
The performance of the assessed feedback algorithms did not reach the theoretical optimal values either for the proposed evaluation methodologies, or for the used metrics. The analysis of the results shows that, although the six evaluation techniques provide good improvement figures, none of them can be declared the absolute winner. Despite the lack of a definitive conclusion, our findings provide a better understanding of the quality of feedback algorithms.
[142]
M. Strauss, “Realtime Generation of Multimodal Affective Sports Commentary for Embodied Agents,” Universität des Saarlandes, Saarbrücken, 2007.
Abstract
Autonomous, graphically embodied agents are a versatile platform for information presentation and user interaction. This thesis presents ERIC, a homogeneous agent framework that can be configured to provide real-time running commentary on a dynamic environment of many and frequent events. We have focused on knowledge reasoning with a world model, generating and expressing affect, and generating coherent natural language, synchronised with nonverbal modalities. The graphical and TTS output of the agent is provided by commercial systems.
ERIC is currently implemented to commentate a simulated horse race and a multiplayer tank combat game. With minimal modification the system is configurable to provide commentary in any continuous, dynamically changing environment; for example, it could commentate sports matches and computer games, or play the role of "tourist guide" during a self-guided tour of a city. An elaborate world model is deduced from limited input by an expert system implemented as rules in Jess. Natural language is generated using template-based NLG.
Discourse coherence is maintained by requiring semantic relations between the forward-looking and backward-looking centers of successive utterances. The agent uses a set of causal and belief relations to assign appraisals of emotion-eliciting conditions to facts in the world model based on goals and desires. These appraisals are used to generate an affective state according to the OCC cognitive model of emotions; the agent's affect is expressed via his lexical choice, gestures and facial expressions.
ERIC was designed to be domain-independent, homogeneous, behaviourally complex, reactive and affective. Domain-independence was evaluated by comparing the effort required to implement the ERIC system with the effort required to re-implement the framework for another domain. Complexity, reactivity and affectivity were assessed by independent experts, whose reviews are presented.
Export
BibTeX
@mastersthesis{Strauss2007,
TITLE = {Realtime Generation of Multimodal Affective Sports Commentary for Embodied Agents},
AUTHOR = {Strauss, Martin},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2007},
DATE = {2007},
ABSTRACT = {Autonomous, graphically embodied agents are a versatile platform for information presentation and user interaction. This thesis presents ERIC, a homogeneous agent framework that can be configured to provide real-time running commentary on a dynamic environment of many and frequent events. We have focused on knowledge reasoning with a world model, generating and expressing affect, and generating coherent natural language, synchronised with nonverbal modalities. The graphical and TTS output of the agent is provided by commercial systems. ERIC is currently implemented to commentate a simulated horse race and a multiplayer tank combat game. With minimal modification the system is configurable to provide commentary in any continuous, dynamically changing environment; for example, it could commentate sports matches and computer games, or play the role of "tourist guide" during a self-guided tour of a city. An elaborate world model is deduced from limited input by an expert system implemented as rules in Jess. Natural language is generated using template-based NLG. Discourse coherence is maintained by requiring semantic relations between the forward-looking and backward-looking centers of successive utterances. The agent uses a set of causal and belief relations to assign appraisals of emotion-eliciting conditions to facts in the world model based on goals and desires. These appraisals are used to generate an affective state according to the OCC cognitive model of emotions; the agent's affect is expressed via his lexical choice, gestures and facial expressions. ERIC was designed to be domain-independent, homogeneous, behaviourally complex, reactive and affective. Domain-independence was evaluated by comparing the effort required to implement the ERIC system with the effort required to re-implement the framework for another domain. Complexity, reactivity and affectivity were assessed by independent experts, whose reviews are presented.},
}
Endnote
%0 Thesis
%A Strauss, Martin
%Y Wahlster, Wolfgang
%A referee: Kipp, Michael
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Realtime Generation of Multimodal Affective Sports Commentary for Embodied Agents :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D1A5-7
%I Universität des Saarlandes
%C Saarbrücken
%D 2007
%V master
%9 master
%X Autonomous, graphically embodied agents are a versatile platform for information presentation and user interaction. This thesis presents ERIC, a homogeneous agent framework that can be configured to provide real-time running commentary on a dynamic environment of many and frequent events. We have focused on knowledge reasoning with a world model, generating and expressing affect, and generating coherent natural language, synchronised with nonverbal modalities. The graphical and TTS output of the agent is provided by commercial systems.
ERIC is currently implemented to commentate a simulated horse race and a multiplayer tank combat game. With minimal modification the system is configurable to provide commentary in any continuous, dynamically changing environment; for example, it could commentate sports matches and computer games, or play the role of "tourist guide" during a self-guided tour of a city. An elaborate world model is deduced from limited input by an expert system implemented as rules in Jess. Natural language is generated using template-based NLG.
Discourse coherence is maintained by requiring semantic relations between the forward-looking and backward-looking centers of successive utterances. The agent uses a set of causal and belief relations to assign appraisals of emotion-eliciting conditions to facts in the world model based on goals and desires. These appraisals are used to generate an affective state according to the OCC cognitive model of emotions; the agent's affect is expressed via his lexical choice, gestures and facial expressions.
ERIC was designed to be domain-independent, homogeneous, behaviourally complex, reactive and affective. Domain-independence was evaluated by comparing the effort required to implement the ERIC system with the effort required to re-implement the framework for another domain. Complexity, reactivity and affectivity were assessed by independent experts, whose reviews are presented.
2006
[143]
L. Antova, “Efficient Representation and Processing of Incomplete Information,” Universität des Saarlandes, Saarbrücken, 2006.
Abstract
Database systems often have to deal with incomplete information, as the world they model is not always complete. This is a frequent case in data integration applications, scientific databases, or in scenarios where information is manually entered and contains errors or ambiguity.
In the last couple of decades, different formalisms have been proposed for representing incomplete information. These include, among others, the so-called relations with or-sets, tables with variables (v-tables) and conditional tables (c-tables).
However, none of the current approaches for representing incomplete information has satisfied the requirements for a powerful and efficient data management system, which is the reason why none has found application in practice. All models generally suffer from at least one of two weaknesses. Either they are not strong enough for representing results of simple queries, as is the case for v-tables and relations with or-sets, or the handling and processing of the data, e.g. for query evaluation, is intractable (as is the case for c-tables).
In this thesis, we present a decomposition-based approach to addressing the problem of incompletely specified databases. We introduce world-set decompositions (WSDs), a space-efficient formalism for representing any finite set of possible worlds, and show how to evaluate relational algebra queries on WSDs. For each relational algebra operation we present an algorithm operating on WSDs.
We also address the problem of data cleaning in the context of world-set decompositions. We present a modified version of the existing Chase algorithm, which we use to remove inconsistent worlds in an incompletely specified database.
We evaluate our techniques in a large census data scenario with data originating from the 1990 US census, and we show that data processing on WSDs is both scalable and efficient.
Export
BibTeX
@mastersthesis{Antova2006,
TITLE = {Efficient Representation and Processing of Incomplete Information},
AUTHOR = {Antova, Lyublena},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2006},
DATE = {2006},
ABSTRACT = {Database systems often have to deal with incomplete information, as the world they model is not always complete. This is a frequent case in data integration applications, scientific databases, or in scenarios where information is manually entered and contains errors or ambiguity. In the last couple of decades, different formalisms have been proposed for representing incomplete information. These include, among others, the so-called relations with or-sets, tables with variables (v-tables) and conditional tables (c-tables). However, none of the current approaches for representing incomplete information has satisfied the requirements for a powerful and efficient data management system, which is the reason why none has found application in practice. All models generally suffer from at least one of two weaknesses. Either they are not strong enough for representing results of simple queries, as is the case for v-tables and relations with or-sets, or the handling and processing of the data, e.g. for query evaluation, is intractable (as is the case for c-tables). In this thesis, we present a decomposition-based approach to addressing the problem of incompletely specified databases. We introduce world-set decompositions (WSDs), a space-efficient formalism for representing any finite set of possible worlds, and show how to evaluate relational algebra queries on WSDs. For each relational algebra operation we present an algorithm operating on WSDs. We also address the problem of data cleaning in the context of world-set decompositions. We present a modified version of the existing Chase algorithm, which we use to remove inconsistent worlds in an incompletely specified database. We evaluate our techniques in a large census data scenario with data originating from the 1990 US census, and we show that data processing on WSDs is both scalable and efficient.},
}
Endnote
%0 Thesis
%A Antova, Lyublena
%Y Olteanu, Dan
%A referee: Koch, Christoph
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Efficient Representation and Processing of Incomplete Information :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D311-7
%I Universität des Saarlandes
%C Saarbrücken
%D 2006
%V master
%9 master
%X Database systems often have to deal with incomplete information, as the world they model is not always complete. This is a frequent case in data integration applications, scientific databases, or in scenarios where information is manually entered and contains errors or ambiguity.
In the last couple of decades, different formalisms have been proposed for representing incomplete information. These include, among others, the so-called relations with or-sets, tables with variables (v-tables) and conditional tables (c-tables).
However, none of the current approaches for representing incomplete information has satisfied the requirements for a powerful and efficient data management system, which is the reason why none has found application in practice. All models generally suffer from at least one of two weaknesses. Either they are not strong enough for representing results of simple queries, as is the case for v-tables and relations with or-sets, or the handling and processing of the data, e.g. for query evaluation, is intractable (as is the case for c-tables).
In this thesis, we present a decomposition-based approach to addressing the problem of incompletely specified databases. We introduce world-set decompositions (WSDs), a space-efficient formalism for representing any finite set of possible worlds, and show how to evaluate relational algebra queries on WSDs. For each relational algebra operation we present an algorithm operating on WSDs.
We also address the problem of data cleaning in the context of world-set decompositions. We present a modified version of the existing Chase algorithm, which we use to remove inconsistent worlds in an incompletely specified database.
We evaluate our techniques in a large census data scenario with data originating from the 1990 US census, and we show that data processing on WSDs is both scalable and efficient.
[144]
Y. Assenov, “Topological Analysis of Biological Networks,” Universität des Saarlandes, Saarbrücken, 2006.
Export
BibTeX
@mastersthesis{Assenov2006a,
TITLE = {Topological Analysis of Biological Networks},
AUTHOR = {Assenov, Yassen},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125673F004B2D7B-7C34F302699CE873C12572960035F334-Assenov2006a},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2006},
DATE = {2006},
}
Endnote
%0 Thesis
%A Assenov, Yassen
%Y Albrecht, Mario
%A referee: Lengauer, Thomas
%+ Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
%T Topological Analysis of Biological Networks :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-2441-9
%F EDOC: 314467
%F OTHER: Local-ID: C125673F004B2D7B-7C34F302699CE873C12572960035F334-Assenov2006a
%I Universität des Saarlandes
%C Saarbrücken
%D 2006
%V master
%9 master
[145]
M. Demir, “Predicting Component Failures at Early Design Time,” Universität des Saarlandes, Saarbrücken, 2006.
Abstract
For the effective prevention and elimination of defects and failures in a software system, it is important to know which parts of the software are more likely to contain errors, and therefore can be considered as "risky". To increase reliability and quality, more effort should be spent on risky components during design, implementation, and testing.
Examining the version archive and the code of a large open-source project, we have investigated the relation between the risk of components, as measured by post-release failures, and different code structures, such as method calls, variables, exception handling expressions and inheritance statements. We have analyzed the different types of usage relations between components, and their effects on failures. We utilized three commonly used statistical techniques to build failure prediction models. As a realistic competitor to our models, we introduced a "simple prediction model" which makes use of the riskiness information from the available components, rather than making random guesses. While the results from the classification experiments supported the use of code structures to predict failure-proneness, our regression analyses showed that design-time decisions also affected component riskiness. Our models were able to make precise predictions even with only the knowledge of the inheritance relations. Since inheritance relations are defined earliest, at design time, the results of this study suggest that it may be possible to initiate preventive actions against failures early in the design phase of a project.
Export
BibTeX
@mastersthesis{Demir2006,
TITLE = {Predicting Component Failures at Early Design Time},
AUTHOR = {Demir, Melih},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2006},
DATE = {2006},
ABSTRACT = {For the effective prevention and elimination of defects and failures in a software system, it is important to know which parts of the software are more likely to contain errors, and therefore can be considered as "risky". To increase reliability and quality, more effort should be spent on risky components during design, implementation, and testing. Examining the version archive and the code of a large open-source project, we have investigated the relation between the risk of components, as measured by post-release failures, and different code structures, such as method calls, variables, exception handling expressions and inheritance statements. We have analyzed the different types of usage relations between components, and their effects on failures. We utilized three commonly used statistical techniques to build failure prediction models. As a realistic competitor to our models, we introduced a "simple prediction model" which makes use of the riskiness information from the available components, rather than making random guesses. While the results from the classification experiments supported the use of code structures to predict failure-proneness, our regression analyses showed that design-time decisions also affected component riskiness. Our models were able to make precise predictions even with only the knowledge of the inheritance relations. Since inheritance relations are defined earliest, at design time, the results of this study suggest that it may be possible to initiate preventive actions against failures early in the design phase of a project.},
}
Endnote
%0 Thesis
%A Demir, Melih
%Y Zeller, Andreas
%A referee: Wilhelm, Reinhard
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Predicting Component Failures at Early Design Time :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D44A-1
%I Universität des Saarlandes
%C Saarbrücken
%D 2006
%V master
%9 master
%X For the effective prevention and elimination of defects and failures in a software system, it is important to know which parts of the software are more likely to contain errors, and therefore can be considered as "risky". To increase reliability and quality, more effort should be spent on risky components during design, implementation, and testing.
Examining the version archive and the code of a large open-source project, we have investigated the relation between the risk of components, as measured by post-release failures, and different code structures, such as method calls, variables, exception handling expressions and inheritance statements. We have analyzed the different types of usage relations between components, and their effects on failures. We utilized three commonly used statistical techniques to build failure prediction models. As a realistic competitor to our models, we introduced a "simple prediction model" which makes use of the riskiness information from the available components, rather than making random guesses. While the results from the classification experiments supported the use of code structures to predict failure-proneness, our regression analyses showed that design-time decisions also affected component riskiness. Our models were able to make precise predictions even with only the knowledge of the inheritance relations. Since inheritance relations are defined earliest, at design time, the results of this study suggest that it may be possible to initiate preventive actions against failures early in the design phase of a project.
[146]
R. Dimitrova, “Model Checking with Abstraction Refinement for Well-structured Systems,” Universität des Saarlandes, Saarbrücken, 2006.
Export
BibTeX
@mastersthesis{DimitrovaPhd2006,
TITLE = {Model Checking with Abstraction Refinement for Well-structured Systems},
AUTHOR = {Dimitrova, Rayna},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291-scidok-13339},
LOCALID = {Local-ID: C1256104005ECAFC-E98933E3DE9BBA00C12572350035E8C3-Dimitrova2006},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2006},
DATE = {2006},
}
Endnote
%0 Thesis
%A Dimitrova, Rayna
%Y Podelski, Andreas
%A referee: Finkbeiner, Bernd
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Programming Logics, MPI for Informatics, Max Planck Society
External Organizations
%T Model Checking with Abstraction Refinement for Well-structured Systems :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-218F-8
%F EDOC: 314388
%F OTHER: Local-ID: C1256104005ECAFC-E98933E3DE9BBA00C12572350035E8C3-Dimitrova2006
%U urn:nbn:de:bsz:291-scidok-13339
%I Universität des Saarlandes
%C Saarbrücken
%D 2006
%P 93 p.
%V master
%9 master
%U http://scidok.sulb.uni-saarland.de/doku/urheberrecht.php?la=de
%U http://scidok.sulb.uni-saarland.de/volltexte/2007/1333/
[147]
K. Halachev, “EpiGRAPHregression: A toolkit for (epi-)genomic correlation analysis and prediction of quantitative attributes,” Universität des Saarlandes, Saarbrücken, 2006.
Abstract
Five years ago, the human genome sequence was published, an important milestone
towards understanding human biology.
However, basic cell processes cannot be explained by the genome sequence alone.
Instead, further layers of control such
as the epigenome will be important for significant advances towards better
understanding of normal and disease-related
phenotypes.
A new research field in computational biology is currently emerging that is
concerned with the analysis of functional
information beyond the human genome sequence. Our goal is to provide biologists
with means to navigate the large
amounts of epigenetic data and with tools to screen these data for biologically
interesting associations. We developed
a statistical learning methodology that facilitates mapping of epigenetic data
against the human genome, identifies
areas of over- and underrepresentation, and finds significant correlations with
DNA-related attributes. We implemented
this methodology in a software toolkit called EpiGRAPHregression.
EpiGRAPHregression is a prototype of a genome analysis tool that enables the
user to analyze
relationships between many attributes, and it provides a quick test whether a
newly analyzed attribute can be
efficiently predicted from already known attributes. Thereby,
EpiGRAPHregression may
significantly speed up the analysis of new types of genomic and epigenomic
data.
Export
BibTeX
@mastersthesis{HalachevMaster2006,
TITLE = {{EpiGRAPHregression}: A toolkit for (epi-)genomic correlation analysis and prediction of quantitative attributes},
AUTHOR = {Halachev, Konstantin},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125673F004B2D7B-56DA5A90E26AE559C12572350051361B-HalachevMaster2006},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2006},
DATE = {2006},
ABSTRACT = {Five years ago, the human genome sequence was published, an important milestone towards understanding human biology. However, basic cell processes cannot be explained by the genome sequence alone. Instead, further layers of control such as the epigenome will be important for significant advances towards better understanding of normal and disease-related phenotypes. A new research field in computational biology is currently emerging that is concerned with the analysis of functional information beyond the human genome sequence. Our goal is to provide biologists with means to navigate the large amounts of epigenetic data and with tools to screen these data for biologically interesting associations. We developed a statistical learning methodology that facilitates mapping of epigenetic data against the human genome, identifies areas of over- and underrepresentation, and finds significant correlations with DNA-related attributes. We implemented this methodology in a software toolkit called EpiGRAPHregression. EpiGRAPHregression is a prototype of a genome analysis tool that enables the user to analyze relationships between many attributes, and it provides a quick test whether a newly analyzed attribute can be efficiently predicted from already known attributes. Thereby, EpiGRAPHregression may significantly speed up the analysis of new types of genomic and epigenomic data.},
}
Endnote
%0 Thesis
%A Halachev, Konstantin
%Y Lengauer, Thomas
%Y Bock, Christoph
%+ Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
%T EpiGRAPHregression: A toolkit for (epi-)genomic correlation analysis and prediction of quantitative attributes :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-22BA-B
%F EDOC: 314491
%F OTHER: Local-ID: C125673F004B2D7B-56DA5A90E26AE559C12572350051361B-HalachevMaster2006
%I Universität des Saarlandes
%C Saarbrücken
%D 2006
%V master
%9 master
%X Five years ago, the human genome sequence was published, an important milestone
towards understanding human biology.
However, basic cell processes cannot be explained by the genome sequence alone.
Instead, further layers of control such
as the epigenome will be important for significant advances towards better
understanding of normal and disease-related
phenotypes.
A new research field in computational biology is currently emerging that is
concerned with the analysis of functional
information beyond the human genome sequence. Our goal is to provide biologists
with means to navigate the large
amounts of epigenetic data and with tools to screen these data for biologically
interesting associations. We developed
a statistical learning methodology that facilitates mapping of epigenetic data
against the human genome, identifies
areas of over- and underrepresentation, and finds significant correlations with
DNA-related attributes. We implemented
this methodology in a software toolkit called EpiGRAPHregression.
EpiGRAPHregression is a prototype of a genome analysis tool that enables the
user to analyze
relationships between many attributes, and it provides a quick test whether a
newly analyzed attribute can be
efficiently predicted from already known attributes. Thereby,
EpiGRAPHregression may
significantly speed up the analysis of new types of genomic and epigenomic
data.
[148]
V. Osipov, “A Polynomial Time Randomized Parallel Approximation Algorithm for Finding Heavy Planar Subgraphs,” Universität des Saarlandes, Saarbrücken, 2006.
Abstract
We provide an approximation algorithm for the Maximum Weight Planar Subgraph problem, the NP-hard problem of finding a heaviest planar subgraph in an edge-weighted graph G. In the general case our algorithm has performance ratio at least 1/3+1/72, matching the best algorithm known so far, though in several special cases we prove stronger results. In particular, we obtain performance ratio 2/3 (instead of 7/12) for the NP-hard Maximum Weight Outerplanar Subgraph problem, meeting the performance ratio of the best algorithm for the unweighted case. When the maximum weight planar subgraph is one of several special types of Hamiltonian graphs, we show performance ratios at least 2/5 and 4/9 (instead of 1/3 + 1/72), and 1/2 (instead of 4/9) for the unweighted case.
Export
BibTeX
@mastersthesis{Osipov2006,
TITLE = {A Polynomial Time Randomized Parallel Approximation Algorithm for Finding Heavy Planar Subgraphs},
AUTHOR = {Osipov, Vitali},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2006},
DATE = {2006},
ABSTRACT = {We provide an approximation algorithm for the Maximum Weight Planar Subgraph problem, the NP-hard problem of finding a heaviest planar subgraph in an edge-weighted graph G. In the general case our algorithm has performance ratio at least 1/3+1/72, matching the best algorithm known so far, though in several special cases we prove stronger results. In particular, we obtain performance ratio 2/3 (instead of 7/12) for the NP-hard Maximum Weight Outerplanar Subgraph problem, meeting the performance ratio of the best algorithm for the unweighted case. When the maximum weight planar subgraph is one of several special types of Hamiltonian graphs, we show performance ratios at least 2/5 and 4/9 (instead of 1/3 + 1/72), and 1/2 (instead of 4/9) for the unweighted case.},
}
Endnote
%0 Thesis
%A Osipov, Vitali
%Y Bläser, Markus
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
%T A Polynomial Time Randomized Parallel Approximation Algorithm for Finding Heavy Planar Subgraphs :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D5A8-7
%I Universität des Saarlandes
%C Saarbrücken
%D 2006
%V master
%9 master
%X We provide an approximation algorithm for the Maximum Weight Planar Subgraph problem, the NP-hard problem of finding a heaviest planar subgraph in an edge-weighted graph G. In the general case our algorithm has performance ratio at least 1/3+1/72, matching the best algorithm known so far, though in several special cases we prove stronger results. In particular, we obtain performance ratio 2/3 (instead of 7/12) for the NP-hard Maximum Weight Outerplanar Subgraph problem, meeting the performance ratio of the best algorithm for the unweighted case. When the maximum weight planar subgraph is one of several special types of Hamiltonian graphs, we show performance ratios at least 2/5 and 4/9 (instead of 1/3 + 1/72), and 1/2 (instead of 4/9) for the unweighted case.
[149]
L. Tolosi, “Analysis of Array CGH Data for the Estimation of Genetic Tumor Progression,” Universität des Saarlandes, Saarbrücken, 2006.
Abstract
In cancer research, prediction of time to death or relapse is important for a meaningful tumor classification and for selecting appropriate therapies. The accumulation of genetic alterations during tumor progression can be used for the assessment of the genetic status of the tumor. ArrayCGH technology is used to measure genomic amplifications and deletions, with a high resolution that allows the detection of copy number changes down to single genes.
We propose an automated method for the analysis of cancer mutation accumulation based on statistical analysis of arrayCGH data. The method consists of four steps: arrayCGH smoothing, aberration detection, consensus analysis and oncogenetic tree model estimation. For the second and third steps, we propose new algorithmic solutions. First, we use the adaptive weights smoothing-based algorithm GLAD for identifying regions of constant copy number. Then, in order to select regions of gain and loss, we fit robust normals to the smoothed Log$_2$Ratios of each CGH array and choose appropriate significance cutoffs. The consensus analysis step consists of an automated selection of recurrent aberrant regions when multiple CGH experiments on the same tumor type are available. We propose to associate $p$-values to each measured genomic position and to select the regions where the $p$-value is sufficiently small.
The aberrant regions computed by our method can be further used to estimate evolutionary trees, which model the dependencies between genetic mutations and can help to predict tumor progression stages and survival times.
We applied our method to two arrayCGH data sets obtained from prostate cancer and glioblastoma patients, respectively. The results confirm previous knowledge on the genetic mutations specific to these types of cancer, but also bring out new regions, often reducing to single genes, due to the high resolution of arrayCGH measurements. An oncogenetic tree mixture model fitted to the prostate cancer data set shows two distinct evolutionary patterns discriminating between two different cell lines. Moreover, when used as clustering features, the genetic mutations our algorithm outputs separate well the arrays representing 4 different cell lines, proving that we extract meaningful information.
Export
BibTeX
@mastersthesis{Tolosi2006,
TITLE = {Analysis of Array {CGH} Data for the Estimation of Genetic Tumor Progression},
AUTHOR = {Tolosi, Laura},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125673F004B2D7B-0EA512C11180051FC12572350050DF02-Tolosi2006},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2006},
DATE = {2006},
ABSTRACT = {In cancer research, prediction of time to death or relapse is important for a meaningful tumor classification and for selecting appropriate therapies. The accumulation of genetic alterations during tumor progression can be used for the assessment of the genetic status of the tumor. ArrayCGH technology is used to measure genomic amplifications and deletions, with a high resolution that allows the detection of copy number changes down to single genes. We propose an automated method for the analysis of cancer mutation accumulation based on statistical analysis of arrayCGH data. The method consists of four steps: arrayCGH smoothing, aberration detection, consensus analysis and oncogenetic tree model estimation. For the second and third steps, we propose new algorithmic solutions. First, we use the adaptive weights smoothing-based algorithm GLAD for identifying regions of constant copy number. Then, in order to select regions of gain and loss, we fit robust normals to the smoothed Log$_2$Ratios of each CGH array and choose appropriate significance cutoffs. The consensus analysis step consists of an automated selection of recurrent aberrant regions when multiple CGH experiments on the same tumor type are available. We propose to associate $p$-values to each measured genomic position and to select the regions where the $p$-value is sufficiently small. The aberrant regions computed by our method can be further used to estimate evolutionary trees, which model the dependencies between genetic mutations and can help to predict tumor progression stages and survival times. We applied our method to two arrayCGH data sets obtained from prostate cancer and glioblastoma patients, respectively. The results confirm previous knowledge on the genetic mutations specific to these types of cancer, but also bring out new regions, often reducing to single genes, due to the high resolution of arrayCGH measurements. An oncogenetic tree mixture model fitted to the prostate cancer data set shows two distinct evolutionary patterns discriminating between two different cell lines. Moreover, when used as clustering features, the genetic mutations our algorithm outputs separate well the arrays representing 4 different cell lines, proving that we extract meaningful information.},
}
Endnote
%0 Thesis
%A Tolosi, Laura
%Y Rahnenführer, Jörg
%A referee: Lengauer, Thomas
%+ Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
%T Analysis of Array CGH Data for the Estimation of Genetic Tumor Progression :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-21A7-1
%F EDOC: 314521
%F OTHER: Local-ID: C125673F004B2D7B-0EA512C11180051FC12572350050DF02-Tolosi2006
%I Universität des Saarlandes
%C Saarbrücken
%D 2006
%V master
%9 master
%X In cancer research, prediction of time to death or relapse is important for a meaningful tumor classification and for selecting appropriate therapies. The accumulation of genetic alterations during tumor progression can be used for the assessment of the genetic status of the tumor. ArrayCGH technology is used to measure genomic amplifications and deletions, with a high resolution that allows the detection of copy number changes down to single genes.
We propose an automated method for the analysis of cancer mutation accumulation based on statistical analysis of arrayCGH data. The method consists of four steps: arrayCGH smoothing, aberration detection, consensus analysis and oncogenetic tree model estimation. For the second and third steps, we propose new algorithmic solutions. First, we use the adaptive weights smoothing-based algorithm GLAD for identifying regions of constant copy number. Then, in order to select regions of gain and loss, we fit robust normals to the smoothed Log$_2$Ratios of each CGH array and choose appropriate significance cutoffs. The consensus analysis step consists of an automated selection of recurrent aberrant regions when multiple CGH experiments on the same tumor type are available. We propose to associate $p$-values to each measured genomic position and to select the regions where the $p$-value is sufficiently small.
The aberrant regions computed by our method can be further used to estimate evolutionary trees, which model the dependencies between genetic mutations and can help to predict tumor progression stages and survival times.
We applied our method to two arrayCGH data sets obtained from prostate cancer and glioblastoma patients, respectively. The results confirm previous knowledge on the genetic mutations specific to these types of cancer, but also bring out new regions, often reducing to single genes, due to the high resolution of arrayCGH measurements. An oncogenetic tree mixture model fitted to the prostate cancer data set shows two distinct evolutionary patterns discriminating between two different cell lines. Moreover, when used as clustering features, the genetic mutations our algorithm outputs separate well the arrays representing 4 different cell lines, proving that we extract meaningful information.
2005
[150]
D. Ajwani, “Design, Implementation and Experimental Study of External Memory BFS Algorithms,” Universität des Saarlandes, Saarbrücken, 2005.
Export
BibTeX
@mastersthesis{Ajwani05,
TITLE = {Design, Implementation and Experimental Study of External Memory {BFS} Algorithms},
AUTHOR = {Ajwani, Deepak},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256428004B93B8-5BA0DAF2ECA46112C1256FBE00534BFC-Ajwani05},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
}
Endnote
%0 Thesis
%A Ajwani, Deepak
%Y Meyer, Ulrich
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Design, Implementation and Experimental Study of External Memory BFS Algorithms :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-2564-4
%F EDOC: 279155
%F OTHER: Local-ID: C1256428004B93B8-5BA0DAF2ECA46112C1256FBE00534BFC-Ajwani05
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
[151]
A. Alexa, “Integrating the GO Graph Structure in Scoring the Significance of Gene Ontology Terms,” Universität des Saarlandes, Saarbrücken, 2005.
Export
BibTeX
@mastersthesis{Alexa2005a,
TITLE = {Integrating the {GO} Graph Structure in Scoring the Significance of Gene Ontology Terms},
AUTHOR = {Alexa, Adrian},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
}
Endnote
%0 Thesis
%A Alexa, Adrian
%Y Rahnenführer, Jörg
%A referee: Lengauer, Thomas
%+ Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
%T Integrating the GO Graph Structure in Scoring the Significance of Gene Ontology Terms :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-001A-0922-1
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
[152]
S. Chernov, “Result Merging in a Peer-to-Peer Web Search Engine,” Universität des Saarlandes, Saarbrücken, 2005.
Export
BibTeX
@mastersthesis{Chernov2005,
TITLE = {Result Merging in a Peer-to-Peer Web Search Engine},
AUTHOR = {Chernov, Sergey},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-2F6545166DCC932CC1256FBF003A2575-Chernov2005},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
}
Endnote
%0 Thesis
%A Chernov, Sergey
%Y Weikum, Gerhard
%A referee: Zimmer, Christian
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
%T Result Merging in a Peer-to-Peer Web Search Engine :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-278E-5
%F EDOC: 278933
%F OTHER: Local-ID: C1256DBF005F876D-2F6545166DCC932CC1256FBF003A2575-Chernov2005
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
[153]
S. Cotton, “Satisfiability Checking with Difference Constraints,” Universität des Saarlandes, Saarbrücken, 2005.
Abstract
This thesis studies the problem of determining the satisfiability of a Boolean combination of binary difference constraints of the form x - y <= c, where x and y are numeric variables and c is a constant. In particular, we present an incremental and model-based interpreter for the theory of difference constraints in the context of a generic Boolean satisfiability checking procedure capable of incorporating interpreters for arbitrary theories. We show how to use the model-based approach to efficiently make inferences with the option of complete inference.
Export
BibTeX
@mastersthesis{ScottCotton2005,
TITLE = {Satisfiability Checking with Difference Constraints},
AUTHOR = {Cotton, Scott},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
ABSTRACT = {This thesis studies the problem of determining the satisfiability of a Boolean combination of binary difference constraints of the form x - y <= c, where x and y are numeric variables and c is a constant. In particular, we present an incremental and model-based interpreter for the theory of difference constraints in the context of a generic Boolean satisfiability checking procedure capable of incorporating interpreters for arbitrary theories. We show how to use the model-based approach to efficiently make inferences with the option of complete inference.},
}
Endnote
%0 Thesis
%A Cotton, Scott
%Y Podelski, Andreas
%A referee: Finkbeiner, Bernd
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Programming Logics, MPI for Informatics, Max Planck Society
External Organizations
%T Satisfiability Checking with Difference Constraints :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D5C9-C
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
%X This thesis studies the problem of determining the satisfiability of a Boolean combination of binary difference constraints of the form x - y <= c, where x and y are numeric variables and c is a constant. In particular, we present an incremental and model-based interpreter for the theory of difference constraints in the context of a generic Boolean satisfiability checking procedure capable of incorporating interpreters for arbitrary theories. We show how to use the model-based approach to efficiently make inferences with the option of complete inference.
[154]
S. S. Hussain, “Encoding a Hierarchical Proof Data Structure for Contextual Reasoning in a Logical Framework,” Universität des Saarlandes, Saarbrücken, 2005.
Abstract
For many applications, such as mathematical assistant systems, effective communication between the system and its users is critical for the acceptance of the system. Explaining computer-supported proofs in natural language can enhance the users' understanding. We define a function that encodes the proofs generated by the computer-supported theorem proving system ΩMEGA into TWEGA, which is the input language of the proof presentation system P.rex. This encoding enables the natural language explanation of ΩMEGA proofs in P.rex.
Export
BibTeX
@mastersthesis{HussainS2005,
TITLE = {Encoding a Hierarchical Proof Data Structure for Contextual Reasoning in a Logical Framework},
AUTHOR = {Hussain, Syed Sajjad},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
ABSTRACT = {For many applications, such as mathematical assistant systems, effective communication between the system and its users is critical for the acceptance of the system. Explaining computer-supported proofs in natural language can enhance the users' understanding. We define a function that encodes the proofs generated by the computer-supported theorem proving system $\Omega$MEGA into TWEGA, which is the input language of the proof presentation system P.rex. This encoding enables the natural language explanation of $\Omega$MEGA proofs in P.rex.},
}
Endnote
%0 Thesis
%A Hussain, Syed Sajjad
%Y Siekmann, Jörg
%A referee: Benzmüller, Christoph
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Programming Logics, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Encoding a Hierarchical Proof Data Structure for Contextual Reasoning in a Logical Framework :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D5CB-8
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
%X For many applications, such as mathematical assistant systems, effective communication between the system and its users is critical for the acceptance of the system. Explaining computer-supported proofs in natural language can enhance the users' understanding. We define a function that encodes the proofs generated by the computer-supported theorem proving system ΩMEGA into TWEGA, which is the input language of the proof presentation system P.rex. This encoding enables the natural language explanation of ΩMEGA proofs in P.rex.
[155]
G. Ifrim, “A Bayesian Learning Approach to Concept-Based Document Classification,” Universität des Saarlandes, Saarbrücken, 2005.
Export
BibTeX
@mastersthesis{Ifrim2005,
TITLE = {A Bayesian Learning Approach to Concept-Based Document Classification},
AUTHOR = {Ifrim, Georgiana},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-3A9028DC2228AB1BC1256FBF003B1738-Ifrim2005},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
}
Endnote
%0 Thesis
%A Ifrim, Georgiana
%Y Weikum, Gerhard
%A referee: Theobald, Martin
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T A Bayesian Learning Approach to Concept-Based Document Classification :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-2581-2
%F EDOC: 278931
%F OTHER: Local-ID: C1256DBF005F876D-3A9028DC2228AB1BC1256FBF003B1738-Ifrim2005
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
[156]
N. Kozlova, “Automatic Ontology Extraction for Document Classification,” Universität des Saarlandes, Saarbrücken, 2005.
Export
BibTeX
@mastersthesis{Kozlova2005,
TITLE = {Automatic Ontology Extraction for Document Classification},
AUTHOR = {Kozlova, Natalia},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-B009BA9471A70191C1256FBF003AF86C-Kozlova2005},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
}
Endnote
%0 Thesis
%A Kozlova, Natalia
%Y Weikum, Gerhard
%A referee: Theobald, Martin
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Automatic Ontology Extraction for Document Classification :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-25E4-2
%F EDOC: 278927
%F OTHER: Local-ID: C1256DBF005F876D-B009BA9471A70191C1256FBF003AF86C-Kozlova2005
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
[157]
A. Moleda, “Probabilistic Scheduling for Top-k Query Processing,” Universität des Saarlandes, Saarbrücken, 2005.
Export
BibTeX
@mastersthesis{Moleda2005,
TITLE = {Probabilistic Scheduling for Top-k Query Processing},
AUTHOR = {Moleda, Anna},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-C358CAA736C9613FC12570C1002A6CB8-Moleda2005},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
}
Endnote
%0 Thesis
%A Moleda, Anna
%Y Weikum, Gerhard
%A referee: Bast, Hannah
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Probabilistic Scheduling for Top-k Query Processing :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-2503-B
%F EDOC: 278879
%F OTHER: Local-ID: C1256DBF005F876D-C358CAA736C9613FC12570C1002A6CB8-Moleda2005
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
[158]
O. Papapetrou, “On the Usage of Global Document Occurrences in Peer-to-Peer Information Systems,” Universität des Saarlandes, Saarbrücken, 2005.
Export
BibTeX
@mastersthesis{Papapetrou2005,
TITLE = {On the Usage of Global Document Occurrences in Peer-to-Peer Information Systems},
AUTHOR = {Papapetrou, Odysseas},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-84E02DEB41937C61C12570A6003DA112-Papapetrou2005},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
}
Endnote
%0 Thesis
%A Papapetrou, Odysseas
%Y Weikum, Gerhard
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T On the Usage of Global Document Occurrences in Peer-to-Peer Information Systems :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-2739-5
%F EDOC: 278876
%F OTHER: Local-ID: C1256DBF005F876D-84E02DEB41937C61C12570A6003DA112-Papapetrou2005
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
[159]
E. Pyrga, “Shortest Paths in Time-Dependent Networks and their Applications,” Universität des Saarlandes, Saarbrücken, 2005.
Export
BibTeX
@mastersthesis{Evangelia2005,
TITLE = {Shortest Paths in Time-Dependent Networks and their Applications},
AUTHOR = {Pyrga, Evangelia},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
}
Endnote
%0 Thesis
%A Pyrga, Evangelia
%Y Mehlhorn, Kurt
%A referee: Zaroliagis, Christos
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Shortest Paths in Time-Dependent Networks and their Applications :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D754-4
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
[160]
S. Ray, “Counting Straight-Edge Triangulations of Planar Point Sets,” Universität des Saarlandes, Saarbrücken, 2005.
Abstract
A triangulation of a finite set S of points in R^2 is a maximal set of line
segments with disjoint interiors whose endpoints are in S. A set of points in
the plane can have many triangulations, and it is known that a set of n points
always has more than 2.33^n [7] and fewer than 59^(n-log(n)) [4]
triangulations. However, these bounds are not tight. Also, counting the
number of triangulations of a given set of points efficiently remains an open
problem. The fastest method so far is based on the so-called t-path method [5],
and it was the first algorithm with a running time sublinear in the number of
triangulations counted. In this thesis, we consider a slightly different
approach to counting the number of triangulations. Although we are unable to
prove any non-trivial result about our algorithm yet, empirical results show
that the running time of our algorithm for a set of n points is o(n log^2(n) T(n)),
where T(n) is the number of triangulations counted, and in practice it performs
much better than the earlier algorithm.
Export
BibTeX
@mastersthesis{Saurabh,
TITLE = {Counting Straight-Edge Triangulations of Planar Point Sets},
AUTHOR = {Ray, Saurabh},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
ABSTRACT = {A triangulation of a finite set S of points in $R^2$ is a maximal set of line segments with disjoint interiors whose endpoints are in S. A set of points in the plane can have many triangulations, and it is known that a set of n points always has more than $2.33^n$ [7] and fewer than $59^{n-\log(n)}$ [4] triangulations. However, these bounds are not tight. Also, counting the number of triangulations of a given set of points efficiently remains an open problem. The fastest method so far is based on the so-called t-path method [5], and it was the first algorithm with a running time sublinear in the number of triangulations counted. In this thesis, we consider a slightly different approach to counting the number of triangulations. Although we are unable to prove any non-trivial result about our algorithm yet, empirical results show that the running time of our algorithm for a set of n points is $o(n \log^2(n)\, T(n))$, where $T(n)$ is the number of triangulations counted, and in practice it performs much better than the earlier algorithm.},
}
Endnote
%0 Thesis
%A Ray, Saurabh
%Y Seidel, Raimund
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
%T Counting Straight-Edge Triangulations of Planar Point Sets :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-D757-D
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
%X A triangulation of a finite set S of points in R^2 is a maximal set of line
segments with disjoint interiors whose endpoints are in S. A set of points in
the plane can have many triangulations, and it is known that a set of n points
always has more than 2.33^n [7] and fewer than 59^(n-log(n)) [4]
triangulations. However, these bounds are not tight. Also, counting the
number of triangulations of a given set of points efficiently remains an open
problem. The fastest method so far is based on the so-called t-path method [5],
and it was the first algorithm with a running time sublinear in the number of
triangulations counted. In this thesis, we consider a slightly different
approach to counting the number of triangulations. Although we are unable to
prove any non-trivial result about our algorithm yet, empirical results show
that the running time of our algorithm for a set of n points is o(n log^2(n) T(n)),
where T(n) is the number of triangulations counted, and in practice it performs
much better than the earlier algorithm.
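The counting problem described in this abstract is easiest to appreciate in the special case of points in convex position, where the number of triangulations can be computed by a simple dynamic program (it equals a Catalan number). The sketch below illustrates only that toy special case; it is not the thesis's algorithm, and the function name and structure are editorial assumptions.
```python
from functools import lru_cache

def count_convex_triangulations(n: int) -> int:
    """Number of triangulations of a convex polygon with n vertices.

    This equals the Catalan number C(n - 2). It is a toy special case:
    counting triangulations of *general* planar point sets (the topic of
    the thesis) is much harder and has no known polynomial-time algorithm.
    """
    @lru_cache(maxsize=None)
    def tri(i: int, j: int) -> int:
        # Triangulations of the sub-polygon spanned by vertices i..j,
        # assuming convex position.
        if j - i < 2:
            return 1
        # The edge (i, j) lies in exactly one triangle (i, k, j).
        return sum(tri(i, k) * tri(k, j) for k in range(i + 1, j))

    return tri(0, n - 1)

if __name__ == "__main__":
    # A convex hexagon has 14 triangulations (Catalan number C(4)).
    print(count_convex_triangulations(6))  # -> 14
```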
[161]
P. Serdyukov, “Query Routing in Peer-to-Peer Web Search,” Universität des Saarlandes, Saarbrücken, 2005.
Export
BibTeX
@mastersthesis{Serdyukov2005,
TITLE = {Query Routing in Peer-to-Peer Web Search},
AUTHOR = {Serdyukov, Pavel},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256DBF005F876D-C84939A41A9A1991C1256FBF003ABFD0-Serdyukov2005},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
}
Endnote
%0 Thesis
%A Serdyukov, Pavel
%Y Weikum, Gerhard
%A referee: Michel, Sebastian
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Query Routing in Peer-to-Peer Web Search :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-277B-F
%F EDOC: 278873
%F OTHER: Local-ID: C1256DBF005F876D-C84939A41A9A1991C1256FBF003ABFD0-Serdyukov2005
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
[162]
W.-I. Siu, “Computational Prediction of MHC-Peptide Interaction,” Universität des Saarlandes, Saarbrücken, 2005.
Abstract
T-cell recognition is a critical step in regulating the immune response.
Activation of cytotoxic T-cells requires MHC class I molecules to form
complexes with specific peptides and present them on the surface of the cell.
Identification of potential ligands to MHC is therefore important for
understanding disease pathogenesis and aiding vaccine design.
Despite years of effort in the field, reliable prediction of MHC ligands
remains a difficult task. It is reported that only one out of 100 to 200
potential binders actually binds. Methods based on sequence data alone are fast
but fail to capture all binding patterns, while structure-based methods are
more promising but far too slow for large-scale screening of protein sequences.
In this work, we propose a new method for the prediction problem. It is based on
the assumption that peptide binding is an aggregate effect of contributions
from independent binding of residues. The compatibility of each amino acid in
the MHC binding pockets is examined thoroughly by molecular dynamics simulation.
Values of energy terms important for binding are collected from the generated
ensembles and are used to produce the allele-specific scoring matrix. Each
entry in this matrix represents how favorable a particular "feature" of an
amino acid is in a binding position. Prediction models based on machine
learning techniques are then trained to discriminate binders from non-binders.
Our method is compared to two other sequence-based methods using HLA-A*0201
9-mer sequences. Three publicly available data sets are used: the MHCPEP and
SYFPEITHI data sets, and the HXB2 genome. Overall, our method successfully
improves the prediction accuracy with higher specificity. Its robustness to
different sizes and ratios of training data demonstrates its ability to provide
reliable predictions with less dependence on the sequence data. The method also
shows better generalizability in cross-allele predictions. For predicting
peptide-bound conformations, our preliminary approach based on energy
minimization gives the satisfactory result of a backbone RMSD of 1.7 to 1.88 Å
compared to the crystal structures.
Export
BibTeX
@mastersthesis{Siu2005,
TITLE = {Computational Prediction of {MHC}-Peptide Interaction},
AUTHOR = {Siu, Weng-In},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
ABSTRACT = {T-cell recognition is a critical step in regulating the immune response. Activation of cytotoxic T-cells requires MHC class I molecules to form complexes with specific peptides and present them on the surface of the cell. Identification of potential ligands to MHC is therefore important for understanding disease pathogenesis and aiding vaccine design. Despite years of effort in the field, reliable prediction of MHC ligands remains a difficult task. It is reported that only one out of 100 to 200 potential binders actually binds. Methods based on sequence data alone are fast but fail to capture all binding patterns, while structure-based methods are more promising but far too slow for large-scale screening of protein sequences. In this work, we propose a new method for the prediction problem. It is based on the assumption that peptide binding is an aggregate effect of contributions from independent binding of residues. The compatibility of each amino acid in the MHC binding pockets is examined thoroughly by molecular dynamics simulation. Values of energy terms important for binding are collected from the generated ensembles and are used to produce the allele-specific scoring matrix. Each entry in this matrix represents how favorable a particular "feature" of an amino acid is in a binding position. Prediction models based on machine learning techniques are then trained to discriminate binders from non-binders. Our method is compared to two other sequence-based methods using HLA-A*0201 9-mer sequences. Three publicly available data sets are used: the MHCPEP and SYFPEITHI data sets, and the HXB2 genome. Overall, our method successfully improves the prediction accuracy with higher specificity. Its robustness to different sizes and ratios of training data demonstrates its ability to provide reliable predictions with less dependence on the sequence data. The method also shows better generalizability in cross-allele predictions. For predicting peptide-bound conformations, our preliminary approach based on energy minimization gives the satisfactory result of a backbone RMSD of 1.7 to 1.88 {\AA} compared to the crystal structures.},
}
Endnote
%0 Thesis
%A Siu, Weng-In
%Y Antes, Iris
%A referee: Lengauer, Thomas
%+ Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
%T Computational Prediction of MHC-Peptide Interaction :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-F3C3-8
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
%X T-cell recognition is a critical step in regulating the immune response.
Activation of cytotoxic T-cells requires MHC class I molecules to form
complexes with specific peptides and present them on the surface of the cell.
Identification of potential ligands to MHC is therefore important for
understanding disease pathogenesis and aiding vaccine design.
Despite years of effort in the field, reliable prediction of MHC ligands
remains a difficult task. It is reported that only one out of 100 to 200
potential binders actually binds. Methods based on sequence data alone are fast
but fail to capture all binding patterns, while structure-based methods are
more promising but far too slow for large-scale screening of protein sequences.
In this work, we propose a new method for the prediction problem. It is based on
the assumption that peptide binding is an aggregate effect of contributions
from independent binding of residues. The compatibility of each amino acid in
the MHC binding pockets is examined thoroughly by molecular dynamics simulation.
Values of energy terms important for binding are collected from the generated
ensembles and are used to produce the allele-specific scoring matrix. Each
entry in this matrix represents how favorable a particular "feature" of an
amino acid is in a binding position. Prediction models based on machine
learning techniques are then trained to discriminate binders from non-binders.
Our method is compared to two other sequence-based methods using HLA-A*0201
9-mer sequences. Three publicly available data sets are used: the MHCPEP and
SYFPEITHI data sets, and the HXB2 genome. Overall, our method successfully
improves the prediction accuracy with higher specificity. Its robustness to
different sizes and ratios of training data demonstrates its ability to provide
reliable predictions with less dependence on the sequence data. The method also
shows better generalizability in cross-allele predictions. For predicting
peptide-bound conformations, our preliminary approach based on energy
minimization gives the satisfactory result of a backbone RMSD of 1.7 to 1.88 Å
compared to the crystal structures.
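The allele-specific scoring-matrix idea from this abstract can be illustrated by a toy position-specific scoring step: each residue of a 9-mer contributes an independent per-position score, and the sum ranks candidate binders before a downstream classifier. This is a minimal sketch with made-up numbers, not the thesis's MD-derived matrices or its trained models.
```python
import random

# Hypothetical position-specific scores for a 9-mer binding groove.
# In the thesis such values come from molecular-dynamics ensembles;
# here they are random numbers used purely for illustration.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
random.seed(0)
SCORING_MATRIX = [
    {aa: random.uniform(-1.0, 1.0) for aa in AMINO_ACIDS}  # one dict per position
    for _ in range(9)
]

def score_peptide(peptide: str) -> float:
    """Sum independent per-position contributions of a 9-mer peptide."""
    if len(peptide) != 9:
        raise ValueError("expected a 9-mer")
    return sum(SCORING_MATRIX[pos][aa] for pos, aa in enumerate(peptide))

def rank_candidates(peptides):
    """Rank candidates by score; a threshold or a trained classifier
    would then separate binders from non-binders."""
    return sorted(peptides, key=score_peptide, reverse=True)

if __name__ == "__main__":
    print(rank_candidates(["SLYNTVATL", "GILGFVFTL", "AAAAAAAAA"]))
```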
[163]
J. Sliwerski, “Locating the Risk of Changes,” Universität des Saarlandes, Saarbrücken, 2005.
Abstract
As a software system evolves, programmers make changes that sometimes lead to
problems. The risk of later problems depends significantly on the location of
the change. Which are the locations where changes impose the greatest risk?
We introduce a set of automated techniques that relate a version history
archive (such as CVS) with a bug database (such as BUGZILLA) to detect those
locations where changes have been risky in the past. Our experiments show that
simple measures have low accuracy in locating the files that are most risky to
change.
Export
BibTeX
@mastersthesis{Sliwerski2005,
TITLE = {Locating the Risk of Changes},
AUTHOR = {Sliwerski, Jacek},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
ABSTRACT = {As a software system evolves, programmers make changes that sometimes lead to problems. The risk of later problems depends significantly on the location of the change. Which are the locations where changes impose the greatest risk? We introduce a set of automated techniques that relate a version history archive (such as CVS) with a bug database (such as BUGZILLA) to detect those locations where changes have been risky in the past. Our experiments show that simple measures have low accuracy in locating the files that are most risky to change.},
}
Endnote
%0 Thesis
%A Sliwerski, Jacek
%Y Zeller, Andreas
%A referee: Zimmermann, Thomas
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Locating the Risk of Changes :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-F3C9-B
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
%X As a software system evolves, programmers make changes that sometimes lead to
problems. The risk of later problems depends significantly on the location of
the change. Which are the locations where changes impose the greatest risk?
We introduce a set of automated techniques that relate a version history
archive (such as CVS) with a bug database (such as BUGZILLA) to detect those
locations where changes have been risky in the past. Our experiments show that
simple measures have low accuracy in locating the files that are most risky to
change.
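The core idea of relating a version archive to a bug database can be sketched as a simple counting exercise: scan commit messages for bug-report identifiers and tally, per file, how often a change later had to be fixed. The heuristic, field names, and regular expression below are illustrative assumptions, not the thesis's actual mapping between CVS and BUGZILLA.
```python
import re
from collections import Counter

# Hypothetical pattern for bug references in commit messages.
BUG_ID = re.compile(r"(?:bug|fix(?:es|ed)?)\s*#?(\d+)", re.IGNORECASE)

def risky_files(commits):
    """Count, per file, how many commits reference a bug report.

    `commits` is an iterable of dicts with 'message' and 'files' keys,
    standing in for data mined from a CVS log and a bug database export.
    """
    fix_counts = Counter()
    for commit in commits:
        if BUG_ID.search(commit["message"]):    # commit mentions a bug id
            fix_counts.update(commit["files"])  # every touched file gets a point
    return fix_counts.most_common()

if __name__ == "__main__":
    log = [
        {"message": "fix #4711 null pointer", "files": ["core/parser.c"]},
        {"message": "add feature",            "files": ["ui/menu.c"]},
        {"message": "Bug 4712: off-by-one",   "files": ["core/parser.c", "core/lexer.c"]},
    ]
    print(risky_files(log))  # parser.c ranked most risky to change
```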
[164]
F. M. Suchanek, “Ontological Reasoning for Natural Language Understanding,” Universität des Saarlandes, Saarbrücken, 2005.
Export
BibTeX
@mastersthesis{Suchanek:DA:2005,
TITLE = {Ontological Reasoning for Natural Language Understanding},
AUTHOR = {Suchanek, Fabian M.},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256104005ECAFC-8D87190D5DB624ADC1256FE200545742-Suchanek:DA:2005},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
}
Endnote
%0 Thesis
%A Suchanek, Fabian M.
%Y Weikum, Gerhard
%A referee: Baumgartner, Peter
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Programming Logics, MPI for Informatics, Max Planck Society
%T Ontological Reasoning for Natural Language Understanding :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-254C-B
%F EDOC: 279073
%F OTHER: Local-ID: C1256104005ECAFC-8D87190D5DB624ADC1256FE200545742-Suchanek:DA:2005
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
[165]
J. Yin, “Model Selection for Mixtures of Mutagenetic Trees,” Universität des Saarlandes, Saarbrücken, 2005.
Export
BibTeX
@mastersthesis{Yin2005a,
TITLE = {Model Selection for Mixtures of Mutagenetic Trees},
AUTHOR = {Yin, Junming},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2005},
DATE = {2005},
}
Endnote
%0 Thesis
%A Yin, Junming
%Y Lengauer, Thomas
%A referee: Beerenwinkel, Niko
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
%T Model Selection for Mixtures of Mutagenetic Trees :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-001A-091E-D
%I Universität des Saarlandes
%C Saarbrücken
%D 2005
%V master
%9 master
2004
[166]
R. Angelova, “Neighborhood Conscious Hypertext Categorization,” Universität des Saarlandes, Saarbrücken, 2004.
Abstract
A fundamental issue in statistics, pattern recognition, and machine learning is
that of classification. In a traditional classification problem, we wish to
assign one of k labels (or classes) to each of n objects (or documents), in a
way that is consistent with some observed data available about that problem.
For achieving better classification results, we try to capture the information
derived from pairwise relationships between objects, in particular hyperlinks
between web documents. The usage of hyperlinks poses new problems not addressed
in the extensive text classification literature. Links contain high-quality
semantic clues that a purely text-based classifier cannot take advantage of.
However, exploiting link information is non-trivial because it is noisy, and a
naive use of terms in the link neighborhood of a document can degrade accuracy.
The problem becomes even harder when only a very small fraction of document
labels are known to the classifier and can be used for training, as is the
case in a real classification scenario. Our work is based on an algorithm
proposed by Soumen Chakrabarti and uses the theory of Markov Random Fields to
derive a relaxation labelling technique for the class assignment problem. We
show that the extra information contained in the hyperlinks between the
documents can be exploited to achieve a significant improvement in the
performance of classification. We implemented our algorithm in Java and ran our
experiments on two sets of data obtained from the DBLP and IMDB databases. We
observed up to 5.5% improvement in the accuracy of the classification and up to
10% higher recall and precision results.
Export
BibTeX
@mastersthesis{Ralitsa2004,
TITLE = {Neighborhood Conscious Hypertext Categorization},
AUTHOR = {Angelova, Ralitsa},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2004},
DATE = {2004},
ABSTRACT = {A fundamental issue in statistics, pattern recognition, and machine learning is that of classification. In a traditional classification problem, we wish to assign one of k labels (or classes) to each of n objects (or documents), in a way that is consistent with some observed data available about that problem. For achieving better classification results, we try to capture the information derived from pairwise relationships between objects, in particular hyperlinks between web documents. The usage of hyperlinks poses new problems not addressed in the extensive text classification literature. Links contain high-quality semantic clues that a purely text-based classifier cannot take advantage of. However, exploiting link information is non-trivial because it is noisy, and a naive use of terms in the link neighborhood of a document can degrade accuracy. The problem becomes even harder when only a very small fraction of document labels are known to the classifier and can be used for training, as is the case in a real classification scenario. Our work is based on an algorithm proposed by Soumen Chakrabarti and uses the theory of Markov Random Fields to derive a relaxation labelling technique for the class assignment problem. We show that the extra information contained in the hyperlinks between the documents can be exploited to achieve a significant improvement in the performance of classification. We implemented our algorithm in Java and ran our experiments on two sets of data obtained from the DBLP and IMDB databases. We observed up to 5.5\% improvement in the accuracy of the classification and up to 10\% higher recall and precision results.},
}
Endnote
%0 Thesis
%A Angelova, Ralitsa
%Y Weikum, Gerhard
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Neighborhood Conscious Hypertext Categorization :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-F483-0
%I Universität des Saarlandes
%C Saarbrücken
%D 2004
%V master
%9 master
%X A fundamental issue in statistics, pattern recognition, and machine learning is
that of classification. In a traditional classification problem, we wish to
assign one of k labels (or classes) to each of n objects (or documents), in a
way that is consistent with some observed data available about that problem.
For achieving better classification results, we try to capture the information
derived from pairwise relationships between objects, in particular hyperlinks
between web documents. The usage of hyperlinks poses new problems not addressed
in the extensive text classification literature. Links contain high-quality
semantic clues that a purely text-based classifier cannot take advantage of.
However, exploiting link information is non-trivial because it is noisy, and a
naive use of terms in the link neighborhood of a document can degrade accuracy.
The problem becomes even harder when only a very small fraction of document
labels are known to the classifier and can be used for training, as is the
case in a real classification scenario. Our work is based on an algorithm
proposed by Soumen Chakrabarti and uses the theory of Markov Random Fields to
derive a relaxation labelling technique for the class assignment problem. We
show that the extra information contained in the hyperlinks between the
documents can be exploited to achieve a significant improvement in the
performance of classification. We implemented our algorithm in Java and ran our
experiments on two sets of data obtained from the DBLP and IMDB databases. We
observed up to 5.5% improvement in the accuracy of the classification and up to
10% higher recall and precision results.
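The relaxation-labelling step can be illustrated by iteratively mixing a text-only class probability for each document with the current label distribution of its hyperlink neighbors. The mixing weight, iteration count, and data below are arbitrary illustrative choices; the thesis derives its update rule from a Markov Random Field model rather than this simple blend.
```python
def relaxation_labeling(text_probs, neighbors, alpha=0.6, iterations=20):
    """Iteratively blend text-based class probabilities with the average
    label distribution of each document's linked neighbors.

    text_probs: {doc: [p_class0, p_class1, ...]} from a text classifier
    neighbors:  {doc: [linked docs]}
    alpha:      weight of text evidence vs. neighborhood evidence
    """
    labels = {d: list(p) for d, p in text_probs.items()}
    k = len(next(iter(text_probs.values())))
    for _ in range(iterations):
        new_labels = {}
        for doc, probs in labels.items():
            nbrs = neighbors.get(doc, [])
            if nbrs:
                nbr_avg = [sum(labels[n][c] for n in nbrs) / len(nbrs) for c in range(k)]
            else:
                nbr_avg = probs
            mixed = [alpha * text_probs[doc][c] + (1 - alpha) * nbr_avg[c] for c in range(k)]
            total = sum(mixed)
            new_labels[doc] = [m / total for m in mixed]  # renormalize
        labels = new_labels
    return {doc: max(range(k), key=lambda c: probs[c]) for doc, probs in labels.items()}

if __name__ == "__main__":
    text = {"a": [0.9, 0.1], "b": [0.55, 0.45], "c": [0.2, 0.8]}
    links = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(relaxation_labeling(text, links))  # hard labels after propagation
```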
[167]
A. Q. Kara, “Inductive Learning Approaches in Information Extraction Analysis, Formalization, Comparison, Evaluation,” Universität des Saarlandes, Saarbrücken, 2004.
Abstract
Although the field of Information Extraction has been around for almost two
decades, it is still considered to be in its initial stage. There are at
present many algorithms used for the Information Extraction task, and they
also have good success rates, but there are no benchmarks or standard data on
which they can be compared among themselves. Most of the algorithms work well
on semi-structured data but seem to fail when dealing with free text.
Other algorithms fail when different types of data need to be extracted from
the same document. One way forward is to somehow compare them all and then try
to improve and create algorithms that are domain independent and work both
efficiently and effectively. To this end, we introduce the idea of formalizing
Information Extraction algorithms and then finding out where we can improve
them or which parts still need improvement to raise the overall performance. In
the end, we describe what we gained by formalizing the algorithms and what can
be achieved by formalizing further algorithms.
Export
BibTeX
@mastersthesis{Kara2004,
TITLE = {Inductive Learning Approaches in Information Extraction Analysis, Formalization, Comparison, Evaluation},
AUTHOR = {Kara, Abdul Qadar},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2004},
DATE = {2004},
ABSTRACT = {Although the field of Information Extraction has been around for almost two decades, it is still considered to be in its initial stage. There are at present many algorithms used for the Information Extraction task, and they also have good success rates, but there are no benchmarks or standard data on which they can be compared among themselves. Most of the algorithms work well on semi-structured data but seem to fail when dealing with free text. Other algorithms fail when different types of data need to be extracted from the same document. One way forward is to somehow compare them all and then try to improve and create algorithms that are domain independent and work both efficiently and effectively. To this end, we introduce the idea of formalizing Information Extraction algorithms and then finding out where we can improve them or which parts still need improvement to raise the overall performance. In the end, we describe what we gained by formalizing the algorithms and what can be achieved by formalizing further algorithms.},
}
Endnote
%0 Thesis
%A Kara, Abdul Qadar
%+ Programming Logics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
%T Inductive Learning Approaches in Information Extraction Analysis, Formalization, Comparison, Evaluation :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-F48E-9
%I Universität des Saarlandes
%C Saarbrücken
%D 2004
%V master
%9 master
%X Although the field of Information Extraction has been around for almost two
decades, it is still considered to be in its initial stage. There are at
present many algorithms used for the Information Extraction task, and they
also have good success rates, but there are no benchmarks or standard data on
which they can be compared among themselves. Most of the algorithms work well
on semi-structured data but seem to fail when dealing with free text.
Other algorithms fail when different types of data need to be extracted from
the same document. One way forward is to somehow compare them all and then try
to improve and create algorithms that are domain independent and work both
efficiently and effectively. To this end, we introduce the idea of formalizing
Information Extraction algorithms and then finding out where we can improve
them or which parts still need improvement to raise the overall performance. In
the end, we describe what we gained by formalizing the algorithms and what can
be achieved by formalizing further algorithms.
[168]
K. Shi, “Extracting the Topological Structure of the Higher Order Critical points for 3D Vector Fields,” Universität des Saarlandes, Saarbrücken, 2004.
Abstract
Critical points of vector fields are important topological features, which are
characterized by the number and order of areas of different flow behavior
around them. We present an approach to detect the different sectors around
general critical points of 3D vector fields. This approach is based on a
piecewise linear approximation of the vector field around the critical points.
We show examples of how this approach can also treat critical points of higher
order, and we discuss the limitations of the approach as well.
Export
BibTeX
@mastersthesis{Kuangyu2004,
TITLE = {Extracting the Topological Structure of the Higher Order Critical points for {3D} Vector Fields},
AUTHOR = {Shi, Kuangyu},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2004},
DATE = {2004},
ABSTRACT = {Critical points of vector fields are important topological features, which are characterized by the number and order of areas of different flow behavior around them. We present an approach to detect the different sectors around general critical points of 3D vector fields. This approach is based on a piecewise linear approximation of the vector field around the critical points. We show examples of how this approach can also treat critical points of higher order, and we discuss the limitations of the approach as well.},
}
Endnote
%0 Thesis
%A Shi, Kuangyu
%Y Theisel, Holger
%Y Seidel, Hans-Peter
%+ Computer Graphics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T Extracting the Topological Structure of the Higher Order Critical points for 3D Vector Fields :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-F7F9-C
%I Universität des Saarlandes
%C Saarbrücken
%D 2004
%V master
%9 master
%X Critical points of vector fields are important topological features, which are
characterized by the number and order of areas of different flow behavior
around them. We present an approach to detect the different sectors around
general critical points of 3D vector fields. This approach is based on a
piecewise linear approximation of the vector field around the critical points.
We show examples of how this approach can also treat critical points of higher
order, and we discuss the limitations of the approach as well.
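For a first-order critical point, the local flow sectors are determined by the eigenvalues of the Jacobian of the linear approximation; the sketch below classifies such a point from a user-supplied 3x3 Jacobian. Higher-order critical points, the actual subject of the thesis, need the sector analysis described in the abstract rather than this simple eigenvalue test; the function and example matrix are illustrative assumptions.
```python
import numpy as np

def classify_first_order(jacobian):
    """Classify a first-order 3D critical point by the signs of the real
    parts of the Jacobian's eigenvalues (source / sink / saddle)."""
    eig = np.linalg.eigvals(np.asarray(jacobian, dtype=float))
    pos = int(np.sum(eig.real > 0))
    neg = int(np.sum(eig.real < 0))
    if pos == 3:
        return "repelling node (source)"
    if neg == 3:
        return "attracting node (sink)"
    if pos and neg:
        return f"saddle ({pos} outflow / {neg} inflow directions)"
    return "degenerate / higher-order (linear analysis insufficient)"

if __name__ == "__main__":
    print(classify_first_order([[1, 0, 0], [0, -2, 0], [0, 0, -1]]))
    # -> saddle (1 outflow / 2 inflow directions)
```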
[169]
E. Stegantova, “Multicommodity Flows Over Time with Costs,” Universität des Saarlandes, Saarbrücken, 2004.
Abstract
Flows over time (dynamic flows) generalize standard network flows by
introducing a new element: time. They naturally model problems where travel and
transmission are not instantaneous.
In this work we consider two dynamic flow problems: the Quickest Multicommodity
Dynamic Flow problem with Bounded Cost (QMDFP) and the Maximal Multicommodity
Dynamic Flow problem (MMDFP). Both problems are known to be NP-hard. In the
first part we propose two methods of improving the result obtained by the
efficient two-approximation algorithm proposed by Lisa Fleischer and Martin
Skutella for solving the QMDFP. The approximation algorithm constructs a
temporally repeated flow using a so-called "static average flow". In the first
method we prove that the value of the static average flow can be increased by a
factor that depends on the length of the shortest path from a source to a sink
in the underlying network. Increasing the value of the static average flow
allows us to save time on sending the necessary amount of flow (the given
demands) from sources to sinks. The cost of the resulting temporally repeated
flow remains unchanged. In the second method we propose an algorithm that
reconstructs the static average flow in such a way that the length of the
longest path used by the flow becomes shorter. This allows us to wait for a
shorter period of time until the last sent unit of flow reaches its sink. The
drawback of reconstructing the flow is an increase in cost, but we give a proof
of the fact that the cost increases at most by a factor of two.
In the second part of the thesis we deal with the MMDFP. We give an instance of
a network that demonstrates that the optimal solution is not always a temporally
repeated flow. But we give an easy proof of the fact that the difference
between the optimal solution and the Maximal Multicommodity Temporally Repeated
Flow is bounded by a constant that depends on the network and not on the given
time horizon. This fact allows us to approximate the optimal Maximal
Multicommodity Dynamic Flow with the Maximal Multicommodity Temporally Repeated
Flow for large enough time horizons.
Export
BibTeX
@mastersthesis{Stegantova2004,
TITLE = {Multicommodity Flows Over Time with Costs},
AUTHOR = {Stegantova, Evghenia},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2004},
DATE = {2004},
ABSTRACT = {Flows over time (dynamic flows) generalize standard network flows by introducing a new element: time. They naturally model problems where travel and transmission are not instantaneous. In this work we consider two dynamic flow problems: the Quickest Multicommodity Dynamic Flow problem with Bounded Cost (QMDFP) and the Maximal Multicommodity Dynamic Flow problem (MMDFP). Both problems are known to be NP-hard. In the first part we propose two methods of improving the result obtained by the efficient two-approximation algorithm proposed by Lisa Fleischer and Martin Skutella for solving the QMDFP. The approximation algorithm constructs a temporally repeated flow using a so-called "static average flow". In the first method we prove that the value of the static average flow can be increased by a factor that depends on the length of the shortest path from a source to a sink in the underlying network. Increasing the value of the static average flow allows us to save time on sending the necessary amount of flow (the given demands) from sources to sinks. The cost of the resulting temporally repeated flow remains unchanged. In the second method we propose an algorithm that reconstructs the static average flow in such a way that the length of the longest path used by the flow becomes shorter. This allows us to wait for a shorter period of time until the last sent unit of flow reaches its sink. The drawback of reconstructing the flow is an increase in cost, but we give a proof of the fact that the cost increases at most by a factor of two. In the second part of the thesis we deal with the MMDFP. We give an instance of a network that demonstrates that the optimal solution is not always a temporally repeated flow. But we give an easy proof of the fact that the difference between the optimal solution and the Maximal Multicommodity Temporally Repeated Flow is bounded by a constant that depends on the network and not on the given time horizon. This fact allows us to approximate the optimal Maximal Multicommodity Dynamic Flow with the Maximal Multicommodity Temporally Repeated Flow for large enough time horizons.},
}
Endnote
%0 Thesis
%A Stegantova, Evghenia
%Y Skutella, Martin
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Multicommodity Flows Over Time with Costs :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-F7FE-2
%I Universität des Saarlandes
%C Saarbrücken
%D 2004
%V master
%9 master
%X Flows over time (dynamic flows) generalize standard network flows by
introducing a new element: time. They naturally model problems where travel and
transmission are not instantaneous.
In this work we consider two dynamic flow problems: the Quickest Multicommodity
Dynamic Flow problem with Bounded Cost (QMDFP) and the Maximal Multicommodity
Dynamic Flow problem (MMDFP). Both problems are known to be NP-hard. In the
first part we propose two methods of improving the result obtained by the
efficient two-approximation algorithm proposed by Lisa Fleischer and Martin
Skutella for solving the QMDFP. The approximation algorithm constructs a
temporally repeated flow using a so-called "static average flow". In the first
method we prove that the value of the static average flow can be increased by a
factor that depends on the length of the shortest path from a source to a sink
in the underlying network. Increasing the value of the static average flow
allows us to save time on sending the necessary amount of flow (the given
demands) from sources to sinks. The cost of the resulting temporally repeated
flow remains unchanged. In the second method we propose an algorithm that
reconstructs the static average flow in such a way that the length of the
longest path used by the flow becomes shorter. This allows us to wait for a
shorter period of time until the last sent unit of flow reaches its sink. The
drawback of reconstructing the flow is an increase in cost, but we give a proof
of the fact that the cost increases at most by a factor of two.
In the second part of the thesis we deal with the MMDFP. We give an instance of
a network that demonstrates that the optimal solution is not always a temporally
repeated flow. But we give an easy proof of the fact that the difference
between the optimal solution and the Maximal Multicommodity Temporally Repeated
Flow is bounded by a constant that depends on the network and not on the given
time horizon. This fact allows us to approximate the optimal Maximal
Multicommodity Dynamic Flow with the Maximal Multicommodity Temporally Repeated
Flow for large enough time horizons.
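A temporally repeated flow, the central object in both parts of the thesis, repeats a static path decomposition for as long as the remaining time horizon allows; its value is the sum over paths of f_p * (T - tau_p), where tau_p is the transit time of path p. The sketch below only evaluates that standard formula; the path flows and transit times are illustrative inputs, not taken from the thesis.
```python
def temporally_repeated_flow_value(paths, horizon):
    """Value of a temporally repeated flow within time horizon T.

    paths: list of (flow_rate, transit_time) pairs from a static path
           decomposition; a path contributes only if its transit time
           fits inside the horizon.
    Value = sum over paths of flow_rate * (T - transit_time).
    """
    return sum(rate * (horizon - tau) for rate, tau in paths if tau <= horizon)

if __name__ == "__main__":
    # Two paths: 3 units/step along a path of length 4, 1 unit/step along length 7.
    decomposition = [(3.0, 4), (1.0, 7)]
    print(temporally_repeated_flow_value(decomposition, horizon=10))  # 3*6 + 1*3 = 21.0
```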
[170]
I. Trajkovski, “Analysis of Protein Binding Pocket Flexibility,” Universität des Saarlandes, Saarbrücken, 2004.
Export
BibTeX
@mastersthesis{Trajkovski2004,
TITLE = {Analysis of Protein Binding Pocket Flexibility},
AUTHOR = {Trajkovski, Igor},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125673F004B2D7B-FE9E2234859B9C8BC1256FBD005A29E8-Trajkovski2004},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2004},
DATE = {2004},
}
Endnote
%0 Thesis
%A Trajkovski, Igor
%Y Antes, Iris
%A referee: Lengauer, Thomas
%+ Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
%T Analysis of Protein Binding Pocket Flexibility :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-2A1B-4
%F EDOC: 232004
%F OTHER: Local-ID: C125673F004B2D7B-FE9E2234859B9C8BC1256FBD005A29E8-Trajkovski2004
%I Universität des Saarlandes
%C Saarbrücken
%D 2004
%V master
%9 master
[171]
H. Yu, “Importance Sampling in Photon Tracing,” Universität des Saarlandes, Saarbrücken, 2004.
Abstract
All global illumination algorithms are based on the rendering equation, and
every algorithm solves it in a different way. Most algorithms solve the
equation using the Monte Carlo method. In this process many samples are
produced, and these samples contribute differently to the generated image. If
one hopes to get an acceptable result with fewer samples, the important
samples, which contribute more to the final image, must be considered in the
first place. For example, in ordinary Light Tracing, millions of photons have
to be traced in order to obtain the distribution of illumination in the whole
scene. In fact, only a part of the scene can be observed most of the time, and
only photons hitting visible surfaces will contribute to the generated image.
If only a small part of the entire scene is visible, we spend most of the time
tracing and storing unimportant photons that make no contribution to the final
image. Even considering only visible photons, one can see that their
contributions to the image differ greatly. Surfaces that are located closer to
the viewpoint have a larger projected area on the image plane and thus require
more photons to achieve the same noise level as surfaces located further away.
The orientation of a surface with respect to the view direction also affects
the view-dependent photon importance. Depending on the application and the
Monte Carlo algorithm used, one can come up with many other criteria to compute
this importance, which may dramatically affect the quality of the produced
images and the computation speed. The algorithm presented in this thesis takes
only useful (visible) photons into account, concentrating computation on the
surfaces visible to the currently active camera and balancing the distribution
of photons on the image plane, which greatly improves image quality. Using this
concept, we can get better results with fewer photons. In this way it is
possible to save not only rendering time but also storage space, because fewer
photons need to be stored. This idea can also be applied in other algorithms
where millions of samples have to be generated. Once the differences among
these samples are identified, we can pay more attention to the important
samples that contribute more to the resulting image, while ignoring the less
important ones, thus using fewer samples to get a better result.
Export
BibTeX
@mastersthesis{YuHang:ISPT:2004,
TITLE = {Importance Sampling in Photon Tracing},
AUTHOR = {Yu, Hang},
LANGUAGE = {eng},
LOCALID = {Local-ID: C125675300671F7B-BDC02E96AF19A112C1256F870042CA3A-YuHang:ISPT:2004},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2004},
DATE = {2004},
ABSTRACT = {All global illumination algorithms are based on the rendering equation, and every algorithm solves it in a different way. Most algorithms solve the equation using the Monte Carlo method. In this process many samples are produced, and these samples contribute differently to the generated image. If one hopes to get an acceptable result with fewer samples, the important samples, which contribute more to the final image, must be considered in the first place. For example, in ordinary Light Tracing, millions of photons have to be traced in order to obtain the distribution of illumination in the whole scene. In fact, only a part of the scene can be observed most of the time, and only photons hitting visible surfaces will contribute to the generated image. If only a small part of the entire scene is visible, we spend most of the time tracing and storing unimportant photons that make no contribution to the final image. Even considering only visible photons, one can see that their contributions to the image differ greatly. Surfaces that are located closer to the viewpoint have a larger projected area on the image plane and thus require more photons to achieve the same noise level as surfaces located further away. The orientation of a surface with respect to the view direction also affects the view-dependent photon importance. Depending on the application and the Monte Carlo algorithm used, one can come up with many other criteria to compute this importance, which may dramatically affect the quality of the produced images and the computation speed. The algorithm presented in this thesis takes only useful (visible) photons into account, concentrating computation on the surfaces visible to the currently active camera and balancing the distribution of photons on the image plane, which greatly improves image quality. Using this concept, we can get better results with fewer photons. In this way it is possible to save not only rendering time but also storage space, because fewer photons need to be stored. This idea can also be applied in other algorithms where millions of samples have to be generated. Once the differences among these samples are identified, we can pay more attention to the important samples that contribute more to the resulting image, while ignoring the less important ones, thus using fewer samples to get a better result.},
}
Endnote
%0 Thesis
%A Yu, Hang
%Y Dmitriev, Kirill Alexandrovich
%A referee: Seidel, Hans-Peter
%+ Computer Graphics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T Importance Sampling in Photon Tracing :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-2ABF-5
%F EDOC: 231913
%F OTHER: Local-ID: C125675300671F7B-BDC02E96AF19A112C1256F870042CA3A-YuHang:ISPT:2004
%I Universität des Saarlandes
%C Saarbrücken
%D 2004
%V master
%9 master
%X All global illumination algorithms are based on the rendering equation, and
every algorithm solves it in a different way. Most algorithms solve the
equation using the Monte Carlo method. In this process many samples are
produced, and these samples contribute differently to the generated image. If
one hopes to get an acceptable result with fewer samples, the important
samples, which contribute more to the final image, must be considered in the
first place. For example, in ordinary Light Tracing, millions of photons have
to be traced in order to obtain the distribution of illumination in the whole
scene. In fact, only a part of the scene can be observed most of the time, and
only photons hitting visible surfaces will contribute to the generated image.
If only a small part of the entire scene is visible, we spend most of the time
tracing and storing unimportant photons that make no contribution to the final
image. Even considering only visible photons, one can see that their
contributions to the image differ greatly. Surfaces that are located closer to
the viewpoint have a larger projected area on the image plane and thus require
more photons to achieve the same noise level as surfaces located further away.
The orientation of a surface with respect to the view direction also affects
the view-dependent photon importance. Depending on the application and the
Monte Carlo algorithm used, one can come up with many other criteria to compute
this importance, which may dramatically affect the quality of the produced
images and the computation speed. The algorithm presented in this thesis takes
only useful (visible) photons into account, concentrating computation on the
surfaces visible to the currently active camera and balancing the distribution
of photons on the image plane, which greatly improves image quality. Using this
concept, we can get better results with fewer photons. In this way it is
possible to save not only rendering time but also storage space, because fewer
photons need to be stored. This idea can also be applied in other algorithms
where millions of samples have to be generated. Once the differences among
these samples are identified, we can pay more attention to the important
samples that contribute more to the resulting image, while ignoring the less
important ones, thus using fewer samples to get a better result.
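The importance idea in this abstract boils down to a standard Monte Carlo device: draw samples from a non-uniform density that favors "useful" samples and divide each contribution by that density so the estimate stays unbiased. The sketch below does this for a trivial 1-D integrand; the density and integrand are illustrative stand-ins, not the thesis's photon-tracing importance function.
```python
import math
import random

def importance_sample_integral(f, pdf, sample, n=100_000):
    """Unbiased Monte Carlo estimate of the integral of f on [0, 1]:
    draw x ~ pdf and average f(x) / pdf(x)."""
    return sum(f(x) / pdf(x) for x in (sample() for _ in range(n))) / n

if __name__ == "__main__":
    random.seed(1)
    f = lambda x: x * x                 # "contribution" concentrated near x = 1
    # Sample proportionally to p(x) = 2x, which matches where f is large,
    # instead of sampling uniformly; inversion: x = sqrt(u) for u ~ U(0, 1).
    pdf = lambda x: 2.0 * x
    sample = lambda: math.sqrt(random.random())
    print(importance_sample_integral(f, pdf, sample))  # ~ 1/3, with low variance
```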
2003
[172]
W. Ding, “Geometric Rounding without changing the Topology,” Universität des Saarlandes, Saarbrücken, 2003.
Export
BibTeX
@mastersthesis{Dipl03/Ding,
TITLE = {Geometric Rounding without changing the Topology},
AUTHOR = {Ding, Wei},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256428004B93B8-0CA4F6A423444329C1256E1A00470230-Dipl03/Ding},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2003},
DATE = {2003},
}
Endnote
%0 Thesis
%A Ding, Wei
%Y Mehlhorn, Kurt
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
%T Geometric Rounding without changing the Topology :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-2D16-5
%F EDOC: 201948
%F OTHER: Local-ID: C1256428004B93B8-0CA4F6A423444329C1256E1A00470230-Dipl03/Ding
%I Universität des Saarlandes
%C Saarbrücken
%D 2003
%V master
%9 master
[173]
M. S. Kiraz, “Formalization and Verification of Informal Security Protocol Description,” Universität des Saarlandes, Saarbrücken, 2003.
Abstract
Conclusions:
In this master thesis, we have started with an informal security protocol
representation. We have demonstrated the translation of protocols into Horn
clauses by giving the well-known Otway-Rees protocol as an example. We also
defined and formalized the semantics of the protocol for all participants.
For the work presented in this thesis we have assumed perfect encryption. We
also assume that the protocol is executed in the presence of an attacker that
can listen, compute new messages from the messages it has already received, and
send any message it can build.
We formalized the abilities of the attacker and defined the attacker's view of
the messages. By looking at the views of the messages, if a participant can
distinguish the views, it will stop the protocol run; if it cannot distinguish
the messages from each other, it will reply to the previous message.
Related work has been done in reference [5] for CAPSL (Common Authentication
Protocol Specification Language), which is a high-level language for applying
formal methods to the security analysis of cryptographic protocols. The
protocol is specified in a form that could be used as the input format for any
formal analysis.
Export
BibTeX
@mastersthesis{Kiraz2003,
TITLE = {Formalization and Verification of Informal Security Protocol Description},
AUTHOR = {Kiraz, Mehmet Sabir},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2003},
DATE = {2003},
ABSTRACT = {Conclusions: In this master thesis, we have started with an informal security protocol representation. We have demonstrated the translation of protocols into Horn clauses by giving the well-known Otway-Rees protocol as an example. We also defined and formalized the semantics of the protocol for all participants. For the work presented in this thesis we have assumed perfect encryption. We also assume that the protocol is executed in the presence of an attacker that can listen, compute new messages from the messages it has already received, and send any message it can build. We formalized the abilities of the attacker and defined the attacker's view of the messages. By looking at the views of the messages, if a participant can distinguish the views, it will stop the protocol run; if it cannot distinguish the messages from each other, it will reply to the previous message. Related work has been done in reference [5] for CAPSL (Common Authentication Protocol Specification Language), which is a high-level language for applying formal methods to the security analysis of cryptographic protocols. The protocol is specified in a form that could be used as the input format for any formal analysis.},
}
Endnote
%0 Thesis
%A Kiraz, Mehmet Sabir
%Y Blanchet, Bruno
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Static Analysis, MPI for Informatics, Max Planck Society
%T Formalization and Verification of Informal Security Protocol Description :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-F819-B
%I Universität des Saarlandes
%C Saarbrücken
%D 2003
%V master
%9 master
%X Conclusions:
In this master thesis, we have started with an informal security protocol
representation. We have demonstrated the translation of protocols into Horn
clauses by giving the well-known Otway-Rees protocol as an example. We also
defined and formalized the semantics of the protocol for all participants.
For the work presented in this thesis we have assumed perfect encryption. We
also assume that the protocol is executed in the presence of an attacker that
can listen, compute new messages from the messages it has already received, and
send any message it can build.
We formalized the abilities of the attacker and defined the attacker's view of
the messages. By looking at the views of the messages, if a participant can
distinguish the views, it will stop the protocol run; if it cannot distinguish
the messages from each other, it will reply to the previous message.
Related work has been done in reference [5] for CAPSL (Common Authentication
Protocol Specification Language), which is a high-level language for applying
formal methods to the security analysis of cryptographic protocols. The
protocol is specified in a form that could be used as the input format for any
formal analysis.
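The attacker that can "listen, compute, and send" is typically modeled with Horn-clause-style rules and analyzed by forward chaining to saturation: everything derivable from the intercepted messages counts as attacker knowledge. The toy rules below (pair projection and symmetric decryption) are a generic Dolev-Yao-style illustration under assumed term encodings, not the clauses derived from the Otway-Rees protocol in the thesis.
```python
def attacker_closure(known):
    """Forward-chain simple attacker rules until no new terms appear.

    Terms are tuples ('pair', a, b) or ('enc', message, key), or atoms (strings).
    Rules: split pairs; decrypt a ciphertext once its key is known.
    """
    known = set(known)
    changed = True
    while changed:
        changed = False
        for term in list(known):
            new = set()
            if isinstance(term, tuple) and term[0] == "pair":
                new.update(term[1:])          # projection: both components become known
            if isinstance(term, tuple) and term[0] == "enc" and term[2] in known:
                new.add(term[1])              # decryption with a known key
            if not new <= known:
                known |= new
                changed = True
    return known

if __name__ == "__main__":
    intercepted = [("pair", "k_ab", ("enc", "secret", "k_ab"))]
    print("secret" in attacker_closure(intercepted))  # True: key leaked next to ciphertext
```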
[174]
R. Kumar, “Proving Program Termination via Transition Invariants,” Universität des Saarlandes, Saarbrücken, 2003.
Abstract
We can prove termination of C programs by computing 'strong enough' transition
invariants by abstract interpretation. In this thesis, we describe the basic
ingredients for an implementation of this computation. Namely, we show how to
extract models from C programs (using the GCA tool [7]) and how to construct an
abstract domain of transition predicates. Furthermore, we propose a method for
'compacting' a model that improves the running time of the transition invariant
generation algorithm.
We implement these ingredients and the proposed optimization, and practically
evaluate their effectiveness.
Export
BibTeX
@mastersthesis{Kumar2003,
TITLE = {Proving Program Termination via Transition Invariants},
AUTHOR = {Kumar, Ramesh},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2003},
DATE = {2003},
ABSTRACT = {We can prove termination of C programs by computing 'strong enough' transition invariants by abstract interpretation. In this thesis, we describe the basic ingredients for an implementation of this computation. Namely, we show how to extract models from C programs (using the GCA tool [7]) and how to construct an abstract domain of transition predicates. Furthermore, we propose a method for 'compacting' a model that improves the running time of the transition invariant generation algorithm. We implement these ingredients and the proposed optimization, and practically evaluate their effectiveness.},
}
Endnote
%0 Thesis
%A Kumar, Ramesh
%Y Podelski, Andreas
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Programming Logics, MPI for Informatics, Max Planck Society
Programming Logics, MPI for Informatics, Max Planck Society
%T Proving Program Termination via Transition Invariants :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-F81B-7
%I Universität des Saarlandes
%C Saarbrücken
%D 2003
%V master
%9 master
%X We can prove termination of C programs by computing 'strong enough' transition
invariants by abstract interpretation. In this thesis, we describe the basic
ingredients for an implementation of this computation. Namely, we show how to
extract models from C programs (using the GCA tool [7]) and how to construct an
abstract domain of transition predicates. Furthermore, we propose a method for
'compacting' a model that improves the running time of the transition invariant
generation algorithm.
We implement these ingredients and the proposed optimization, and practically
evaluate their effectiveness.
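A transition invariant proves termination when it is covered by finitely many well-founded relations; a lightweight way to illustrate the underlying check is to verify that every observed transition strictly decreases at least one bounded ranking function. The toy loop, transitions, and ranking functions below are invented for illustration and are unrelated to the GCA tool or the thesis's abstract domain.
```python
def covered_by_ranking_functions(transitions, ranking_functions):
    """Check that each transition (s, s') strictly decreases at least one
    bounded ranking function -- the disjunctive well-foundedness idea behind
    transition invariants, shown here on explicit state pairs."""
    def decreases(rank, s, s2):
        return rank(s) > rank(s2) >= 0
    return all(any(decreases(r, s, s2) for r in ranking_functions)
               for s, s2 in transitions)

if __name__ == "__main__":
    # States are (x, y) pairs of a toy loop:
    # while x > 0 and y > 0: either (x, y) -> (x - 1, y) or (x, y) -> (x, y - 1).
    transitions = [((3, 2), (2, 2)), ((2, 2), (2, 1)), ((2, 1), (1, 1)), ((1, 1), (1, 0))]
    ranks = [lambda s: s[0], lambda s: s[1]]   # each step decreases x or y
    print(covered_by_ranking_functions(transitions, ranks))  # True
```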
[175]
P. McCabe, “Lower Bounding the Number of Straight-Edge Triangulations of Planar Point Sets,” Universität des Saarlandes, Saarbrücken, 2003.
Abstract
We examine the number of triangulations that any set of n points in the plane
must have, and prove that (i) any set of n points has at least 0.00037 * 2.2^n
triangulations, (ii) any set with three extreme points and n interior points
has at least 0.112 * 2.569^n triangulations, and (iii) any set with n interior
points has at least 0.238 * 2.38^n triangulations. The best previously known
lower bound for the number of triangulations of n points in the plane is
0.0822 * 2.0129^n. We also give a method of automatically extending known
bounds for small point sets to general lower bounds.
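Restated in standard notation (reconstructing the superscripts lost in the plain-text abstract), the bounds read:

% Lower bounds on the number of triangulations tr(S), as stated in the abstract.
\begin{align*}
  \operatorname{tr}(S) &\ge 0.00037 \cdot 2.2^{n}   && \text{for any set } S \text{ of } n \text{ points,}\\
  \operatorname{tr}(S) &\ge 0.112   \cdot 2.569^{n} && \text{for three extreme and } n \text{ interior points,}\\
  \operatorname{tr}(S) &\ge 0.238   \cdot 2.38^{n}  && \text{for } n \text{ interior points,}
\end{align*}
% compared with the previously best general bound $0.0822 \cdot 2.0129^{n}$.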
Export
BibTeX
@mastersthesis{McCabe2003,
TITLE = {Lower Bounding the Number of Straight-Edge Triangulations of Planar Point Sets},
AUTHOR = {McCabe, Paul},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2003},
DATE = {2003},
ABSTRACT = {We examine the number of triangulations that any set of n points in the plane must have, and prove that (i) any set of n points has at least $0.00037 \cdot 2.2^n$ triangulations, (ii) any set with three extreme points and n interior points has at least $0.112 \cdot 2.569^n$ triangulations, and (iii) any set with n interior points has at least $0.238 \cdot 2.38^n$ triangulations. The best previously known lower bound for the number of triangulations of n points in the plane is $0.0822 \cdot 2.0129^n$. We also give a method of automatically extending known bounds for small point sets to general lower bounds.},
}
Endnote
%0 Thesis
%A McCabe, Paul
%Y Seidel, Raimund
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
%T Lower Bounding the Number of Straight-Edge Triangulations of Planar Point Sets :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-F823-4
%I Universität des Saarlandes
%C Saarbrücken
%D 2003
%V master
%9 master
%X We examine the number of triangulations that any set of n points in the plane
must have, and prove that (i) any set of n points has at least 0.00037 * 2.2^n
triangulations, (ii) any set with three extreme points and n interior points
has at least 0.112 * 2.569^n triangulations, and (iii) any set with n interior
points has at least 0.238 * 2.38^n triangulations. The best previously known
lower bound for the number of triangulations of n points in the plane is
0.0822 * 2.0129^n. We also give a method of automatically extending known
bounds for small point sets to general lower bounds.
[176]
J. L. Schoner, “Interactive Haptics and Display for Viscoelastic Solids,” Universität des Saarlandes, Saarbrücken, 2003.
Abstract
This thesis describes a way of modeling viscoelasticity in deformable solids
for interactive haptic and display purposes. The model is based on elastostatic
deformation characterized by a discrete Green's function matrix, which is
extended using a viscoelastic add-on. The underlying idea is to replace the
linear, time-independent relationships between the matrix entries, force, and
displacement with non-linear, time-dependent relationships. A general framework
is given, in which different viscoelastic models can be devised. One such model
based on physical measurements is discussed in detail. Finally, the details and
results of a system that implements this model are described.
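As a rough illustration of the contrast described above (standard textbook forms, not the thesis's actual model), the linear elastostatic response and a linear time-dependent, hereditary-integral variant can be written as:

% Illustrative only: contrasting an elastostatic and a time-dependent response,
% where G is the discrete Green's function matrix relating force f to displacement u.
u = \mathbf{G}\, f
\qquad \longrightarrow \qquad
u(t) = \int_{0}^{t} \mathbf{G}(t-\tau)\, \dot{f}(\tau)\, \mathrm{d}\tau
% The thesis further allows these relationships to become non-linear.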
Export
BibTeX
@mastersthesis{Schoner2003,
TITLE = {Interactive Haptics and Display for Viscoelastic Solids},
AUTHOR = {Schoner, Jeffrey Lawrence},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2003},
DATE = {2003},
ABSTRACT = {This thesis describes a way of modeling viscoelasticity in deformable solids for interactive haptic and display purposes. The model is based on elastostatic deformation characterized by a discrete Green's function matrix, which is extended using a viscoelastic add-on. The underlying idea is to replace the linear, time-independent relationships between the matrix entries, force, and displacement with non-linear, time-dependent relationships. A general framework is given, in which different viscoelastic models can be devised. One such model based on physical measurements is discussed in detail. Finally, the details and results of a system that implements this model are described.},
}
Endnote
%0 Thesis
%A Schoner, Jeffrey Lawrence
%Y Lang, Jochen
%A referee: Seidel, Hans-Peter
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T Interactive Haptics and Display for Viscoelastic Solids :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-F859-E
%I Universität des Saarlandes
%C Saarbrücken
%D 2003
%V master
%9 master
%X This thesis describes a way of modeling viscoelasticity in deformable solids
for interactive haptic and display purposes. The model is based on elastostatic
deformation characterized by a discrete Green's function matrix, which is
extended using a viscoelastic add-on. The underlying idea is to replace the
linear, time-independent relationships between the matrix entries, force, and
displacement with non-linear, time-dependent relationships. A general framework
is given, in which different viscoelastic models can be devised. One such model
based on physical measurements is discussed in detail. Finally, the details and
results of a system that implements this model are described.
[177]
S. Tverdyshev, “Documentation and Modelling of the IPC Mechanism in the L4 Kernel,” Universität des Saarlandes, Saarbrücken, 2003.
Abstract
My diploma thesis is part of the project "Verification of the L4 operating
system kernel". The aim of this project is to formally verify the L4 kernel in
order to guarantee its correctness. In my thesis I have documented the
implementation of the IPC mechanism in the L4 kernel and proved the correctness
of the message passing protocol of the IPC mechanism. This was done in three
steps: first, I created a formal specification of that protocol; then I built a
model of the IPC mechanism in PVS; finally, I proved that the created model
fulfils the specification. The model built in this work is not meant to be a
precise representation of the original protocol. A proof about such a model is
not the same as a proof about the original C code, but it may help in writing
correct source code. For software verification we therefore need a tool that
works directly with C code, or with any other programming language, in order to
verify its correctness.
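As a toy illustration of the "model and check against a specification" step (an exhaustive check over a tiny hand-written transition system, not the PVS model from the thesis):

# Illustrative toy only: a tiny message-passing model checked exhaustively against
# the specification "a message is only delivered if it has been sent".
MESSAGES = ("m1", "m2")

def transitions(state):
    sent, delivered = state
    for m in MESSAGES:                    # sender may send any not-yet-sent message
        if m not in sent:
            yield (sent | frozenset([m]), delivered)
    for m in sent - delivered:            # receiver may deliver any pending message
        yield (sent, delivered | frozenset([m]))

def reachable(initial):
    seen, frontier = {initial}, [initial]
    while frontier:
        state = frontier.pop()
        for nxt in transitions(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

initial = (frozenset(), frozenset())
assert all(delivered <= sent for (sent, delivered) in reachable(initial))
print("specification holds on the toy model")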
Export
BibTeX
@mastersthesis{Tverdyshev2003,
TITLE = {Documentation and Modelling of the {IPC} Mechanism in the {L4} Kernel},
AUTHOR = {Tverdyshev, Sergey},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2003},
DATE = {2003},
ABSTRACT = {My diploma thesis is part of the project "Verification of the L4 operating system kernel". The aim of this project is to formally verify the L4 kernel in order to guarantee its correctness. In my thesis I have documented the implementation of the IPC mechanism in the L4 kernel and proved the correctness of the message passing protocol of the IPC mechanism. This was done in three steps: first, I created a formal specification of that protocol; then I built a model of the IPC mechanism in PVS; finally, I proved that the created model fulfils the specification. The model built in this work is not meant to be a precise representation of the original protocol. A proof about such a model is not the same as a proof about the original C code, but it may help in writing correct source code. For software verification we therefore need a tool that works directly with C code, or with any other programming language, in order to verify its correctness.},
}
Endnote
%0 Thesis
%A Tverdyshev, Sergey
%Y Paul, Wolfgang
%A referee: Hermann, H.
%+ International Max Planck Research School, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Documentation and Modelling of the IPC Mechanism in the L4 Kernel :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0027-F863-5
%I Universität des Saarlandes
%C Saarbrücken
%D 2003
%V master
%9 master
%X My diploma thesis is part of the project "Verification of the L4 operating
system kernel". The aim of this project is to formally verify the L4 kernel in
order to guarantee its correctness. In my thesis I have documented the
implementation of the IPC mechanism in the L4 kernel and proved the correctness
of the message passing protocol of the IPC mechanism. This was done in three
steps: first, I created a formal specification of that protocol; then I built a
model of the IPC mechanism in PVS; finally, I proved that the created model
fulfils the specification. The model built in this work is not meant to be a
precise representation of the original protocol. A proof about such a model is
not the same as a proof about the original C code, but it may help in writing
correct source code. For software verification we therefore need a tool that
works directly with C code, or with any other programming language, in order to
verify its correctness.
2002
[178]
A. Rybalchenko, “A Model Checker based on Abstraction Refinement,” Universität des Saarlandes, Saarbrücken, 2002.
Abstract
Abstraction plays an important role in the verification of computer
programs. We want to construct the right abstraction automatically.
There is a promising approach to do this, called predicate abstraction.
An insufficiently precise abstraction can be automatically refined.
There is an automated model checking method, described in
[Ball, Podelski, Rajamani TACAS02], which combines both techniques,
i.e., predicate abstraction and abstraction refinement. The quality of
the method is expressed by a completeness property relative to a
powerful but unrealistic oracle-guided algorithm.
In this work we want to generalize the results from
[Ball, Podelski, Rajamani TACAS02] and introduce new abstraction
functions with different precision. We implement the new
abstraction functions in a model checker and practically evaluate
their effectiveness in verifying various computer programs.
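The combination of predicate abstraction and abstraction refinement referred to above is usually organized as a counterexample-guided loop. The schematic Python sketch below shows the generic control flow; abstract, model_check, is_feasible and refine are caller-supplied placeholders, and the dummy instantiation only exercises the loop (this is a generic sketch, not the thesis's model checker).

# Generic counterexample-guided abstraction refinement (CEGAR) loop.
import itertools

def cegar(abstract, model_check, is_feasible, refine, program, spec, predicates):
    while True:
        cex = model_check(abstract(program, predicates), spec)   # check the abstract model
        if cex is None:
            return ("safe", predicates)                          # abstraction proves the spec
        if is_feasible(cex, program):
            return ("counterexample", cex)                       # real error path found
        predicates = predicates | refine(cex)                    # spurious: refine and retry

# Trivial dummy instantiation: the "model checker" reports a spurious
# counterexample until two predicates have been discovered.
fresh = itertools.count()
result = cegar(
    abstract=lambda prog, preds: frozenset(preds),
    model_check=lambda absm, spec: None if len(absm) >= 2 else "spurious-cex",
    is_feasible=lambda cex, prog: False,
    refine=lambda cex: {f"p{next(fresh)}"},
    program="toy", spec="safety", predicates=frozenset(),
)
print(result)   # ('safe', frozenset({'p0', 'p1'}))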
Export
BibTeX
@mastersthesis{Rybalchenko2002,
TITLE = {A Model Checker based on Abstraction Refinement},
AUTHOR = {Rybalchenko, Andrey},
LANGUAGE = {eng},
LOCALID = {Local-ID: C1256104005ECAFC-E198F9F27C896D72C1256D0A0037AD19-Rybalchenko2002},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2002},
DATE = {2002},
ABSTRACT = {Abstraction plays an important role for verification of computer programs. We want to construct the right abstraction automatically. There is a promising approach to do it, called {\it predicate abstraction}. An insufficiently precise abstraction can be {\it automatically refined}. There is an automated model checking method described in [Ball, Podelski, Rajamani TACAS02] which combines both techniques, i.e., predicate abstraction and abstraction refinement. The quality of the method is expressed by a completeness property relative to a powerful but unrealistic oracle-guided algorithm. \par In this work we want to generalize the results from [Ball,Podelski,Rajamani TACAS02] and introduce new abstraction functions with different precision. We implement the new abstraction functions in a model checker and practically evaluate their effectiveness in verifying various computer programs.},
}
Endnote
%0 Thesis
%A Rybalchenko, Andrey
%Y Podelski, Andreas
%A referee: Wilhelm, Reinhard
%+ Programming Logics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Programming Logics, MPI for Informatics, Max Planck Society
External Organizations
%T A Model Checker based on Abstraction Refinement :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-000F-2EF1-F
%F EDOC: 202126
%F OTHER: Local-ID: C1256104005ECAFC-E198F9F27C896D72C1256D0A0037AD19-Rybalchenko2002
%I Universität des Saarlandes
%C Saarbrücken
%D 2002
%V master
%9 master
%X Abstraction plays an important role in the verification of computer
programs. We want to construct the right abstraction automatically.
There is a promising approach to do this, called predicate abstraction.
An insufficiently precise abstraction can be automatically refined.
There is an automated model checking method, described in
[Ball, Podelski, Rajamani TACAS02], which combines both techniques,
i.e., predicate abstraction and abstraction refinement. The quality of
the method is expressed by a completeness property relative to a
powerful but unrealistic oracle-guided algorithm.
In this work we want to generalize the results from
[Ball, Podelski, Rajamani TACAS02] and introduce new abstraction
functions with different precision. We implement the new
abstraction functions in a model checker and practically evaluate
their effectiveness in verifying various computer programs.
PhD Thesis
2022
[1]
C. X. Chu, “Knowledge Extraction from Fictional Texts,” Universität des Saarlandes, Saarbrücken, 2022.
Abstract
Knowledge extraction from text is a key task in natural language processing, which involves many sub-tasks, such as taxonomy induction, named entity recognition and typing, relation extraction, knowledge canonicalization and so on. By constructing structured knowledge from natural language text, knowledge extraction becomes a key asset for search engines, question answering and other downstream applications. However, current knowledge extraction methods mostly focus on prominent real-world entities with Wikipedia and mainstream news articles as sources. The constructed knowledge bases, therefore, lack information about long-tail domains, with fiction and fantasy as archetypes. Fiction and fantasy are core parts of our human culture, spanning from literature to movies, TV series, comics and video games. With thousands of fictional universes which have been created, knowledge from fictional domains are subject of search-engine queries - by fans as well as cultural analysts. Unlike the real-world domain, knowledge extraction on such specific domains like fiction and fantasy has to tackle several key challenges: - Training data: Sources for fictional domains mostly come from books and fan-built content, which is sparse and noisy, and contains difficult structures of texts, such as dialogues and quotes. Training data for key tasks such as taxonomy induction, named entity typing or relation extraction are also not available. - Domain characteristics and diversity: Fictional universes can be highly sophisticated, containing entities, social structures and sometimes languages that are completely different from the real world. State-of-the-art methods for knowledge extraction make assumptions on entity-class, subclass and entity-entity relations that are often invalid for fictional domains. With different genres of fictional domains, another requirement is to transfer models across domains. - Long fictional texts: While state-of-the-art models have limitations on the input sequence length, it is essential to develop methods that are able to deal with very long texts (e.g. entire books), to capture multiple contexts and leverage widely spread cues. This dissertation addresses the above challenges, by developing new methodologies that advance the state of the art on knowledge extraction in fictional domains. - The first contribution is a method, called TiFi, for constructing type systems (taxonomy induction) for fictional domains. By tapping noisy fan-built content from online communities such as Wikia, TiFi induces taxonomies through three main steps: category cleaning, edge cleaning and top-level construction. Exploiting a variety of features from the original input, TiFi is able to construct taxonomies for a diverse range of fictional domains with high precision. - The second contribution is a comprehensive approach, called ENTYFI, for named entity recognition and typing in long fictional texts. Built on 205 automatically induced high-quality type systems for popular fictional domains, ENTYFI exploits the overlap and reuse of these fictional domains on unseen texts. By combining different typing modules with a consolidation stage, ENTYFI is able to do fine-grained entity typing in long fictional texts with high precision and recall. - The third contribution is an end-to-end system, called KnowFi, for extracting relations between entities in very long texts such as entire books. 
KnowFi leverages background knowledge from 142 popular fictional domains to identify interesting relations and to collect distant training samples. KnowFi devises a similarity-based ranking technique to reduce false positives in training samples and to select potential text passages that contain seed pairs of entities. By training a hierarchical neural network for all relations, KnowFi is able to infer relations between entity pairs across long fictional texts, and achieves gains over the best prior methods for relation extraction.
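As a rough illustration of the distant-supervision passage selection described for KnowFi, the simplified Python sketch below scans a long text for passages mentioning both entities of a seed pair and ranks them by lexical overlap with a relation description. Helper names and data are made up; real systems would rely on embeddings and far more careful matching, not this toy similarity.

# Simplified, illustrative sketch of distant-supervision passage selection.
def passages(text, size=30):
    words = text.split()
    for i in range(0, len(words), size):
        yield " ".join(words[i:i + size])

def overlap_score(passage, relation_description):
    p = set(passage.lower().split())
    r = set(relation_description.lower().split())
    return len(p & r) / (len(r) or 1)

def candidate_passages(text, entity_pair, relation_description, top_k=3):
    e1, e2 = (e.lower() for e in entity_pair)
    hits = [p for p in passages(text) if e1 in p.lower() and e2 in p.lower()]
    return sorted(hits, key=lambda p: overlap_score(p, relation_description), reverse=True)[:top_k]

book = "Frodo left the Shire with Sam. Later Frodo and Sam travelled together toward Mordor."
print(candidate_passages(book, ("Frodo", "Sam"), "travels together with companion"))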
Export
BibTeX
@phdthesis{Chuphd2022,
TITLE = {Knowledge Extraction from Fictional Texts},
AUTHOR = {Chu, Cuong Xuan},
LANGUAGE = {eng},
URL = {nbn:de:bsz:291--ds-361070},
DOI = {10.22028/D291-36107},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2022},
MARGINALMARK = {$\bullet$},
DATE = {2022},
ABSTRACT = {Knowledge extraction from text is a key task in natural language processing, which involves many sub-tasks, such as taxonomy induction, named entity recognition and typing, relation extraction, knowledge canonicalization and so on. By constructing structured knowledge from natural language text, knowledge extraction becomes a key asset for search engines, question answering and other downstream applications. However, current knowledge extraction methods mostly focus on prominent real-world entities with Wikipedia and mainstream news articles as sources. The constructed knowledge bases, therefore, lack information about long-tail domains, with fiction and fantasy as archetypes. Fiction and fantasy are core parts of our human culture, spanning from literature to movies, TV series, comics and video games. With thousands of fictional universes which have been created, knowledge from fictional domains are subject of search-engine queries -- by fans as well as cultural analysts. Unlike the real-world domain, knowledge extraction on such specific domains like fiction and fantasy has to tackle several key challenges: -- Training data: Sources for fictional domains mostly come from books and fan-built content, which is sparse and noisy, and contains difficult structures of texts, such as dialogues and quotes. Training data for key tasks such as taxonomy induction, named entity typing or relation extraction are also not available. -- Domain characteristics and diversity: Fictional universes can be highly sophisticated, containing entities, social structures and sometimes languages that are completely different from the real world. State-of-the-art methods for knowledge extraction make assumptions on entity-class, subclass and entity-entity relations that are often invalid for fictional domains. With different genres of fictional domains, another requirement is to transfer models across domains. -- Long fictional texts: While state-of-the-art models have limitations on the input sequence length, it is essential to develop methods that are able to deal with very long texts (e.g. entire books), to capture multiple contexts and leverage widely spread cues. This dissertation addresses the above challenges, by developing new methodologies that advance the state of the art on knowledge extraction in fictional domains. -- The first contribution is a method, called TiFi, for constructing type systems (taxonomy induction) for fictional domains. By tapping noisy fan-built content from online communities such as Wikia, TiFi induces taxonomies through three main steps: category cleaning, edge cleaning and top-level construction. Exploiting a variety of features from the original input, TiFi is able to construct taxonomies for a diverse range of fictional domains with high precision. -- The second contribution is a comprehensive approach, called ENTYFI, for named entity recognition and typing in long fictional texts. Built on 205 automatically induced high-quality type systems for popular fictional domains, ENTYFI exploits the overlap and reuse of these fictional domains on unseen texts. By combining different typing modules with a consolidation stage, ENTYFI is able to do fine-grained entity typing in long fictional texts with high precision and recall. -- The third contribution is an end-to-end system, called KnowFi, for extracting relations between entities in very long texts such as entire books. 
KnowFi leverages background knowledge from 142 popular fictional domains to identify interesting relations and to collect distant training samples. KnowFi devises a similarity-based ranking technique to reduce false positives in training samples and to select potential text passages that contain seed pairs of entities. By training a hierarchical neural network for all relations, KnowFi is able to infer relations between entity pairs across long fictional texts, and achieves gains over the best prior methods for relation extraction.},
}
Endnote
%0 Thesis
%A Chu, Cuong Xuan
%Y Weikum, Gerhard
%A referee: Theobald, Martin
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Knowledge Extraction from Fictional Texts :
%G eng
%U http://hdl.handle.net/21.11116/0000-000A-9598-2
%R 10.22028/D291-36107
%U nbn:de:bsz:291--ds-361070
%F OTHER: hdl:20.500.11880/32914
%I Universität des Saarlandes
%C Saarbrücken
%D 2022
%P 129 p.
%V phd
%9 phd
%X Knowledge extraction from text is a key task in natural language processing, which involves many sub-tasks, such as taxonomy induction, named entity recognition and typing, relation extraction, knowledge canonicalization and so on. By constructing structured knowledge from natural language text, knowledge extraction becomes a key asset for search engines, question answering and other downstream applications. However, current knowledge extraction methods mostly focus on prominent real-world entities with Wikipedia and mainstream news articles as sources. The constructed knowledge bases, therefore, lack information about long-tail domains, with fiction and fantasy as archetypes. Fiction and fantasy are core parts of our human culture, spanning from literature to movies, TV series, comics and video games. With thousands of fictional universes which have been created, knowledge from fictional domains are subject of search-engine queries - by fans as well as cultural analysts. Unlike the real-world domain, knowledge extraction on such specific domains like fiction and fantasy has to tackle several key challenges: - Training data: Sources for fictional domains mostly come from books and fan-built content, which is sparse and noisy, and contains difficult structures of texts, such as dialogues and quotes. Training data for key tasks such as taxonomy induction, named entity typing or relation extraction are also not available. - Domain characteristics and diversity: Fictional universes can be highly sophisticated, containing entities, social structures and sometimes languages that are completely different from the real world. State-of-the-art methods for knowledge extraction make assumptions on entity-class, subclass and entity-entity relations that are often invalid for fictional domains. With different genres of fictional domains, another requirement is to transfer models across domains. - Long fictional texts: While state-of-the-art models have limitations on the input sequence length, it is essential to develop methods that are able to deal with very long texts (e.g. entire books), to capture multiple contexts and leverage widely spread cues. This dissertation addresses the above challenges, by developing new methodologies that advance the state of the art on knowledge extraction in fictional domains. - The first contribution is a method, called TiFi, for constructing type systems (taxonomy induction) for fictional domains. By tapping noisy fan-built content from online communities such as Wikia, TiFi induces taxonomies through three main steps: category cleaning, edge cleaning and top-level construction. Exploiting a variety of features from the original input, TiFi is able to construct taxonomies for a diverse range of fictional domains with high precision. - The second contribution is a comprehensive approach, called ENTYFI, for named entity recognition and typing in long fictional texts. Built on 205 automatically induced high-quality type systems for popular fictional domains, ENTYFI exploits the overlap and reuse of these fictional domains on unseen texts. By combining different typing modules with a consolidation stage, ENTYFI is able to do fine-grained entity typing in long fictional texts with high precision and recall. - The third contribution is an end-to-end system, called KnowFi, for extracting relations between entities in very long texts such as entire books. 
KnowFi leverages background knowledge from 142 popular fictional domains to identify interesting relations and to collect distant training samples. KnowFi devises a similarity-based ranking technique to reduce false positives in training samples and to select potential text passages that contain seed pairs of entities. By training a hierarchical neural network for all relations, KnowFi is able to infer relations between entity pairs across long fictional texts, and achieves gains over the best prior methods for relation extraction.
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/32914
[2]
J. Fischer, “More than the sum of its parts,” Universität des Saarlandes, Saarbrücken, 2022.
Abstract
In this thesis we explore pattern mining and deep learning. Often seen as orthogonal, we show that these fields complement each other and propose to combine them to gain from each other’s strengths. We, first, show how to efficiently discover succinct and non-redundant sets of patterns that provide insight into data beyond conjunctive statements. We leverage the interpretability of such patterns to unveil how and which information flows through neural networks, as well as what characterizes their decisions. Conversely, we show how to combine continuous optimization with pattern discovery, proposing a neural network that directly encodes discrete patterns, which allows us to apply pattern mining at a scale orders of magnitude larger than previously possible. Large neural networks are, however, exceedingly expensive to train for which ‘lottery tickets’ – small, well-trainable sub-networks in randomly initialized neural networks – offer a remedy. We identify theoretical limitations of strong tickets and overcome them by equipping these tickets with the property of universal approximation. To analyze whether limitations in ticket sparsity are algorithmic or fundamental, we propose a framework to plant and hide lottery tickets. With novel ticket benchmarks we then conclude that the limitation is likely algorithmic, encouraging further developments for which our framework offers means to measure progress.
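For the 'lottery ticket' part, such sparse sub-networks are commonly found by iterative magnitude pruning; the minimal NumPy sketch below shows that generic procedure only (illustrative, not the thesis's code, and real experiments retrain the network between pruning rounds).

# Minimal sketch of iterative magnitude pruning for finding sparse "ticket" masks.
import numpy as np

def prune_step(weights, mask, fraction=0.2):
    """Zero out the smallest-magnitude fraction of the still-active weights."""
    active = np.abs(weights[mask])
    if active.size == 0:
        return mask
    threshold = np.quantile(active, fraction)
    return mask & (np.abs(weights) > threshold)

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))
mask = np.ones_like(w, dtype=bool)
for _ in range(5):                 # five pruning rounds -> roughly a third of weights remain
    mask = prune_step(w, mask)
print(mask.mean())                 # fraction of surviving weights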
Export
BibTeX
@phdthesis{Fischerphd2022,
TITLE = {More than the sum of its parts},
AUTHOR = {Fischer, Jonas},
LANGUAGE = {eng},
URL = {nbn:de:bsz:291--ds-370240},
DOI = {10.22028/D291-37024},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2022},
MARGINALMARK = {$\bullet$},
DATE = {2022},
ABSTRACT = {In this thesis we explore pattern mining and deep learning. Often seen as orthogonal, we show that these fields complement each other and propose to combine them to gain from each other{\textquoteright}s strengths. We, first, show how to efficiently discover succinct and non-redundant sets of patterns that provide insight into data beyond conjunctive statements. We leverage the interpretability of such patterns to unveil how and which information flows through neural networks, as well as what characterizes their decisions. Conversely, we show how to combine continuous optimization with pattern discovery, proposing a neural network that directly encodes discrete patterns, which allows us to apply pattern mining at a scale orders of magnitude larger than previously possible. Large neural networks are, however, exceedingly expensive to train for which {\textquoteleft}lottery tickets{\textquoteright} -- small, well-trainable sub-networks in randomly initialized neural networks -- offer a remedy. We identify theoretical limitations of strong tickets and overcome them by equipping these tickets with the property of universal approximation. To analyze whether limitations in ticket sparsity are algorithmic or fundamental, we propose a framework to plant and hide lottery tickets. With novel ticket benchmarks we then conclude that the limitation is likely algorithmic, encouraging further developments for which our framework offers means to measure progress.},
}
Endnote
%0 Thesis
%A Fischer, Jonas
%Y Vreeken, Jilles
%A referee: Weikum, Gerhard
%A referee: Parthasarathy, Srinivasan
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
%T More than the sum of its parts : pattern mining, neural networks, and how they complement each other
%G eng
%U http://hdl.handle.net/21.11116/0000-000B-38BF-0
%R 10.22028/D291-37024
%U nbn:de:bsz:291--ds-370240
%F OTHER: hdl:20.500.11880/33893
%I Universität des Saarlandes
%C Saarbrücken
%D 2022
%P 250 p.
%V phd
%9 phd
%X In this thesis we explore pattern mining and deep learning. Often seen as orthogonal, we show that these fields complement each other and propose to combine them to gain from each other’s strengths. We, first, show how to efficiently discover succinct and non-redundant sets of patterns that provide insight into data beyond conjunctive statements. We leverage the interpretability of such patterns to unveil how and which information flows through neural networks, as well as what characterizes their decisions. Conversely, we show how to combine continuous optimization with pattern discovery, proposing a neural network that directly encodes discrete patterns, which allows us to apply pattern mining at a scale orders of magnitude larger than previously possible. Large neural networks are, however, exceedingly expensive to train for which ‘lottery tickets’ – small, well-trainable sub-networks in randomly initialized neural networks – offer a remedy. We identify theoretical limitations of strong tickets and overcome them by equipping these tickets with the property of universal approximation. To analyze whether limitations in ticket sparsity are algorithmic or fundamental, we propose a framework to plant and hide lottery tickets. With novel ticket benchmarks we then conclude that the limitation is likely algorithmic, encouraging further developments for which our framework offers means to measure progress.
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/33893
[3]
A. Guimarães, “Data Science Methods for the Analysis of Controversial Social Media Discussions,” Universität des Saarlandes, Saarbrücken, 2022.
Abstract
Social media communities like Reddit and Twitter allow users to express their views on topics of their interest, and to engage with other users who may share or oppose these views. This can lead to productive discussions towards a consensus, or to contended debates, where disagreements frequently arise. Prior work on such settings has primarily focused on identifying notable instances of antisocial behavior such as hate-speech and “trolling”, which represent possible threats to the health of a community. These, however, are exceptionally severe phenomena, and do not encompass controversies stemming from user debates, differences of opinions, and off-topic content, all of which can naturally come up in a discussion without going so far as to compromise its development. This dissertation proposes a framework for the systematic analysis of social media discussions that take place in the presence of controversial themes, disagreements, and mixed opinions from participating users. For this, we develop a feature-based model to describe key elements of a discussion, such as its salient topics, the level of activity from users, the sentiments it expresses, and the user feedback it receives. Initially, we build our feature model to characterize adversarial discussions surrounding political campaigns on Twitter, with a focus on the factual and sentimental nature of their topics and the role played by different users involved. We then extend our approach to Reddit discussions, leveraging community feedback signals to define a new notion of controversy and to highlight conversational archetypes that arise from frequent and interesting interaction patterns. We use our feature model to build logistic regression classifiers that can predict future instances of controversy in Reddit communities centered on politics, world news, sports, and personal relationships. Finally, our model also provides the basis for a comparison of different communities in the health domain, where topics and activity vary considerably despite their shared overall focus. In each of these cases, our framework provides insight into how user behavior can shape a community’s individual definition of controversy and its overall identity.
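As a rough illustration of the kind of feature-based controversy classifier described above, here is a minimal scikit-learn sketch with made-up feature names and synthetic data (not the dissertation's actual features, data, or model).

# Illustrative only: a logistic-regression controversy classifier over hand-crafted
# discussion features, in the spirit of the feature model described above.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 200
X = np.column_stack([
    rng.poisson(20, n),       # number of comments (activity)
    rng.uniform(-1, 1, n),    # mean sentiment of the discussion
    rng.uniform(0, 1, n),     # share of negatively received replies (feedback)
])
# Synthetic label: low sentiment plus much negative feedback counts as "controversial".
y = ((X[:, 1] < 0) & (X[:, 2] > 0.5)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))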
Export
BibTeX
@phdthesis{Decarvalhophd2021,
TITLE = {Data Science Methods for the Analysis of Controversial Social Media Discussions},
AUTHOR = {Guimar{\~a}es, Anna},
LANGUAGE = {eng},
URL = {nbn:de:bsz:291--ds-365021},
DOI = {10.22028/D291-36502},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2022},
MARGINALMARK = {$\bullet$},
DATE = {2022},
ABSTRACT = {Social media communities like Reddit and Twitter allow users to express their views on topics of their interest, and to engage with other users who may share or oppose these views. This can lead to productive discussions towards a consensus, or to contended debates, where disagreements frequently arise. Prior work on such settings has primarily focused on identifying notable instances of antisocial behavior such as hate-speech and {\textquotedblleft}trolling{\textquotedblright}, which represent possible threats to the health of a community. These, however, are exceptionally severe phenomena, and do not encompass controversies stemming from user debates, differences of opinions, and off-topic content, all of which can naturally come up in a discussion without going so far as to compromise its development. This dissertation proposes a framework for the systematic analysis of social media discussions that take place in the presence of controversial themes, disagreements, and mixed opinions from participating users. For this, we develop a feature-based model to describe key elements of a discussion, such as its salient topics, the level of activity from users, the sentiments it expresses, and the user feedback it receives. Initially, we build our feature model to characterize adversarial discussions surrounding political campaigns on Twitter, with a focus on the factual and sentimental nature of their topics and the role played by different users involved. We then extend our approach to Reddit discussions, leveraging community feedback signals to define a new notion of controversy and to highlight conversational archetypes that arise from frequent and interesting interaction patterns. We use our feature model to build logistic regression classifiers that can predict future instances of controversy in Reddit communities centered on politics, world news, sports, and personal relationships. Finally, our model also provides the basis for a comparison of different communities in the health domain, where topics and activity vary considerably despite their shared overall focus. In each of these cases, our framework provides insight into how user behavior can shape a community{\textquoteright}s individual definition of controversy and its overall identity.},
}
Endnote
%0 Thesis
%A Guimarães, Anna
%Y Weikum, Gerhard
%A referee: de Melo, Gerard
%A referee: Yates, Andrew
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Data Science Methods for the Analysis of Controversial Social Media Discussions :
%G eng
%U http://hdl.handle.net/21.11116/0000-000A-CDF7-9
%R 10.22028/D291-36502
%U nbn:de:bsz:291--ds-365021
%F OTHER: hdl:20.500.11880/33161
%I Universität des Saarlandes
%C Saarbrücken
%D 2022
%P 94 p.
%V phd
%9 phd
%X Social media communities like Reddit and Twitter allow users to express their views on topics of their interest, and to engage with other users who may share or oppose these views. This can lead to productive discussions towards a consensus, or to contended debates, where disagreements frequently arise. Prior work on such settings has primarily focused on identifying notable instances of antisocial behavior such as hate-speech and “trolling”, which represent possible threats to the health of a community. These, however, are exceptionally severe phenomena, and do not encompass controversies stemming from user debates, differences of opinions, and off-topic content, all of which can naturally come up in a discussion without going so far as to compromise its development. This dissertation proposes a framework for the systematic analysis of social media discussions that take place in the presence of controversial themes, disagreements, and mixed opinions from participating users. For this, we develop a feature-based model to describe key elements of a discussion, such as its salient topics, the level of activity from users, the sentiments it expresses, and the user feedback it receives. Initially, we build our feature model to characterize adversarial discussions surrounding political campaigns on Twitter, with a focus on the factual and sentimental nature of their topics and the role played by different users involved. We then extend our approach to Reddit discussions, leveraging community feedback signals to define a new notion of controversy and to highlight conversational archetypes that arise from frequent and interesting interaction patterns. We use our feature model to build logistic regression classifiers that can predict future instances of controversy in Reddit communities centered on politics, world news, sports, and personal relationships. Finally, our model also provides the basis for a comparison of different communities in the health domain, where topics and activity vary considerably despite their shared overall focus. In each of these cases, our framework provides insight into how user behavior can shape a community’s individual definition of controversy and its overall identity.
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/33161
[4]
A. Horňáková, “Lifted Edges as Connectivity Priors for Multicut and Disjoint Paths,” Universität des Saarlandes, Saarbrücken, 2022.
Export
BibTeX
@phdthesis{HornakovaPhD22,
TITLE = {Lifted Edges as Connectivity Priors for Multicut and Disjoint Paths},
AUTHOR = {Hor{\v n}{\'a}kov{\'a}, Andrea},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291--ds-369193},
DOI = {10.22028/D291-36919},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2022},
MARGINALMARK = {$\bullet$},
DATE = {2022},
}
Endnote
%0 Thesis
%A Horňáková, Andrea
%Y Swoboda, Paul
%A referee: Schiele, Bernt
%A referee: Werner, Tomáš
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
%T Lifted Edges as Connectivity Priors for Multicut and Disjoint Paths :
%G eng
%U http://hdl.handle.net/21.11116/0000-000B-2AD2-9
%U urn:nbn:de:bsz:291--ds-369193
%R 10.22028/D291-36919
%F OTHER: hdl:20.500.11880/33680
%I Universität des Saarlandes
%C Saarbrücken
%D 2022
%P X, 150 p.
%V phd
%9 phd
%U http://dx.doi.org/10.22028/D291-36919
[5]
V. T. Ho, “Entities with Quantities: Extraction, Search and Ranking,” Universität des Saarlandes, Saarbrücken, 2022.
Export
BibTeX
@phdthesis{Ho_PhD2022,
TITLE = {Entities with Quantities: Extraction, Search and Ranking},
AUTHOR = {Ho, Vinh Thinh},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291--ds-380308},
DOI = {10.22028/D291-38030},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2022},
MARGINALMARK = {$\bullet$},
DATE = {2022},
}
Endnote
%0 Thesis
%A Ho, Vinh Thinh
%Y Weikum, Gerhard
%A referee: Stepanova, Daria
%A referee: Theobald, Martin
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Entities with Quantities: Extraction, Search and Ranking :
%G eng
%U http://hdl.handle.net/21.11116/0000-000C-B756-5
%R 10.22028/D291-38030
%U urn:nbn:de:bsz:291--ds-380308
%F OTHER: hdl:20.500.11880/34538
%I Universität des Saarlandes
%C Saarbrücken
%D 2022
%P xii, 131 p.
%V phd
%9 phd
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/34538
[6]
A. Kinali-Dogan, “On Time, Time Synchronization and Noise in Time Measurement Systems,” Universität des Saarlandes, Saarbrücken, 2022.
Export
BibTeX
@phdthesis{Attilaphd2022,
TITLE = {On Time, Time Synchronization and Noise in Time Measurement Systems},
AUTHOR = {Kinali-Dogan, Attila},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2022},
MARGINALMARK = {$\bullet$},
DATE = {2022},
}
Endnote
%0 Thesis
%A Kinali-Dogan, Attila
%Y Lenzen, Christoph
%A referee: Mehlhorn, Kurt
%A referee: Vernotte, Francoise
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
External Organizations
%T On Time, Time Synchronization and Noise in Time Measurement Systems :
%G eng
%U http://hdl.handle.net/21.11116/0000-000B-5436-A
%I Universität des Saarlandes
%C Saarbrücken
%D 2022
%P 140 p.
%V phd
%9 phd
[7]
P. Lahoti, “Operationalizing Fairness for Responsible Machine Learning,” Universität des Saarlandes, Saarbrücken, 2022.
Abstract
As machine learning (ML) is increasingly used for decision making in scenarios that impact humans, there is a growing awareness of its potential for unfairness. A large body of recent work has focused on proposing formal notions of fairness in ML, as well as approaches to mitigate unfairness. However, there is a growing disconnect between the ML fairness literature and the needs to operationalize fairness in practice. This thesis addresses the need for responsible ML by developing new models and methods to address challenges in operationalizing fairness in practice. Specifically, it makes the following contributions. First, we tackle a key assumption in the group fairness literature that sensitive demographic attributes such as race and gender are known upfront, and can be readily used in model training to mitigate unfairness. In practice, factors like privacy and regulation often prohibit ML models from collecting or using protected attributes in decision making. To address this challenge we introduce the novel notion of computationally-identifiable errors and propose Adversarially Reweighted Learning (ARL), an optimization method that seeks to improve the worst-case performance over unobserved groups, without requiring access to the protected attributes in the dataset. Second, we argue that while group fairness notions are a desirable fairness criterion, they are fundamentally limited as they reduce fairness to an average statistic over pre-identified protected groups. In practice, automated decisions are made at an individual level, and can adversely impact individual people irrespective of the group statistic. We advance the paradigm of individual fairness by proposing iFair (individually fair representations), an optimization approach for learning a low dimensional latent representation of the data with two goals: to encode the data as well as possible, while removing any information about protected attributes in the transformed representation. Third, we advance the individual fairness paradigm, which requires that similar individuals receive similar outcomes. However, similarity metrics computed over observed feature space can be brittle, and inherently limited in their ability to accurately capture similarity between individuals. To address this, we introduce a novel notion of fairness graphs, wherein pairs of individuals can be identified as deemed similar with respect to the ML objective. We cast the problem of individual fairness into graph embedding, and propose PFR (pairwise fair representations), a method to learn a unified pairwise fair representation of the data. Fourth, we tackle the challenge that production data after model deployment is constantly evolving. As a consequence, in spite of the best efforts in training a fair model, ML systems can be prone to failure risks due to a variety of unforeseen reasons. To ensure responsible model deployment, potential failure risks need to be predicted, and mitigation actions need to be devised, for example, deferring to a human expert when uncertain or collecting additional data to address model’s blind-spots. We propose Risk Advisor, a model-agnostic meta-learner to predict potential failure risks and to give guidance on the sources of uncertainty inducing the risks, by leveraging information theoretic notions of aleatoric and epistemic uncertainty. This dissertation brings ML fairness closer to real-world applications by developing methods that address key practical challenges. 
Extensive experiments on a variety of real-world and synthetic datasets show that our proposed methods are viable in practice.
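The Adversarially Reweighted Learning idea mentioned above can be summarized, in general form, as a min-max objective in which an adversary assigns per-example weights to emphasize computationally-identifiable errors (a schematic restatement, not a verbatim formula from the thesis):

% Schematic ARL objective: learner parameters \theta, adversary parameters \phi,
% per-example weights \lambda_\phi produced by the adversary from (x_i, y_i).
\min_{\theta}\ \max_{\phi}\ \sum_{i=1}^{n} \lambda_{\phi}(x_i, y_i)\,
  \ell\bigl(h_{\theta}(x_i),\, y_i\bigr)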
Export
BibTeX
@phdthesis{Lahotophd2022,
TITLE = {Operationalizing Fairness for Responsible Machine Learning},
AUTHOR = {Lahoti, Preethi},
LANGUAGE = {eng},
URL = {nbn:de:bsz:291--ds-365860},
DOI = {10.22028/D291-36586},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2022},
MARGINALMARK = {$\bullet$},
DATE = {2022},
ABSTRACT = {As machine learning (ML) is increasingly used for decision making in scenarios that impact humans, there is a growing awareness of its potential for unfairness. A large body of recent work has focused on proposing formal notions of fairness in ML, as well as approaches to mitigate unfairness. However, there is a growing disconnect between the ML fairness literature and the needs to operationalize fairness in practice. This thesis addresses the need for responsible ML by developing new models and methods to address challenges in operationalizing fairness in practice. Specifically, it makes the following contributions. First, we tackle a key assumption in the group fairness literature that sensitive demographic attributes such as race and gender are known upfront, and can be readily used in model training to mitigate unfairness. In practice, factors like privacy and regulation often prohibit ML models from collecting or using protected attributes in decision making. To address this challenge we introduce the novel notion of computationally-identifiable errors and propose Adversarially Reweighted Learning (ARL), an optimization method that seeks to improve the worst-case performance over unobserved groups, without requiring access to the protected attributes in the dataset. Second, we argue that while group fairness notions are a desirable fairness criterion, they are fundamentally limited as they reduce fairness to an average statistic over pre-identified protected groups. In practice, automated decisions are made at an individual level, and can adversely impact individual people irrespective of the group statistic. We advance the paradigm of individual fairness by proposing iFair (individually fair representations), an optimization approach for learning a low dimensional latent representation of the data with two goals: to encode the data as well as possible, while removing any information about protected attributes in the transformed representation. Third, we advance the individual fairness paradigm, which requires that similar individuals receive similar outcomes. However, similarity metrics computed over observed feature space can be brittle, and inherently limited in their ability to accurately capture similarity between individuals. To address this, we introduce a novel notion of fairness graphs, wherein pairs of individuals can be identified as deemed similar with respect to the ML objective. We cast the problem of individual fairness into graph embedding, and propose PFR (pairwise fair representations), a method to learn a unified pairwise fair representation of the data. Fourth, we tackle the challenge that production data after model deployment is constantly evolving. As a consequence, in spite of the best efforts in training a fair model, ML systems can be prone to failure risks due to a variety of unforeseen reasons. To ensure responsible model deployment, potential failure risks need to be predicted, and mitigation actions need to be devised, for example, deferring to a human expert when uncertain or collecting additional data to address model{\textquoteright}s blind-spots. We propose Risk Advisor, a model-agnostic meta-learner to predict potential failure risks and to give guidance on the sources of uncertainty inducing the risks, by leveraging information theoretic notions of aleatoric and epistemic uncertainty. This dissertation brings ML fairness closer to real-world applications by developing methods that address key practical challenges. 
Extensive experiments on a variety of real-world and synthetic datasets show that our proposed methods are viable in practice.},
}
Endnote
%0 Thesis
%A Lahoti, Preethi
%Y Weikum, Gerhard
%A referee: Gummadi, Krishna
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Group K. Gummadi, Max Planck Institute for Software Systems, Max Planck Society
%T Operationalizing Fairness for Responsible Machine Learning :
%G eng
%U http://hdl.handle.net/21.11116/0000-000A-CEC6-F
%R 10.22028/D291-36586
%U nbn:de:bsz:291--ds-365860
%F OTHER: hdl:20.500.11880/33465
%I Universität des Saarlandes
%C Saarbrücken
%D 2022
%P 129 p.
%V phd
%9 phd
%X As machine learning (ML) is increasingly used for decision making in scenarios that impact humans, there is a growing awareness of its potential for unfairness. A large body of recent work has focused on proposing formal notions of fairness in ML, as well as approaches to mitigate unfairness. However, there is a growing disconnect between the ML fairness literature and the needs to operationalize fairness in practice. This thesis addresses the need for responsible ML by developing new models and methods to address challenges in operationalizing fairness in practice. Specifically, it makes the following contributions. First, we tackle a key assumption in the group fairness literature that sensitive demographic attributes such as race and gender are known upfront, and can be readily used in model training to mitigate unfairness. In practice, factors like privacy and regulation often prohibit ML models from collecting or using protected attributes in decision making. To address this challenge we introduce the novel notion of computationally-identifiable errors and propose Adversarially Reweighted Learning (ARL), an optimization method that seeks to improve the worst-case performance over unobserved groups, without requiring access to the protected attributes in the dataset. Second, we argue that while group fairness notions are a desirable fairness criterion, they are fundamentally limited as they reduce fairness to an average statistic over pre-identified protected groups. In practice, automated decisions are made at an individual level, and can adversely impact individual people irrespective of the group statistic. We advance the paradigm of individual fairness by proposing iFair (individually fair representations), an optimization approach for learning a low dimensional latent representation of the data with two goals: to encode the data as well as possible, while removing any information about protected attributes in the transformed representation. Third, we advance the individual fairness paradigm, which requires that similar individuals receive similar outcomes. However, similarity metrics computed over observed feature space can be brittle, and inherently limited in their ability to accurately capture similarity between individuals. To address this, we introduce a novel notion of fairness graphs, wherein pairs of individuals can be identified as deemed similar with respect to the ML objective. We cast the problem of individual fairness into graph embedding, and propose PFR (pairwise fair representations), a method to learn a unified pairwise fair representation of the data. Fourth, we tackle the challenge that production data after model deployment is constantly evolving. As a consequence, in spite of the best efforts in training a fair model, ML systems can be prone to failure risks due to a variety of unforeseen reasons. To ensure responsible model deployment, potential failure risks need to be predicted, and mitigation actions need to be devised, for example, deferring to a human expert when uncertain or collecting additional data to address model’s blind-spots. We propose Risk Advisor, a model-agnostic meta-learner to predict potential failure risks and to give guidance on the sources of uncertainty inducing the risks, by leveraging information theoretic notions of aleatoric and epistemic uncertainty. This dissertation brings ML fairness closer to real-world applications by developing methods that address key practical challenges. 
Extensive experiments on a variety of real-world and synthetic datasets show that our proposed methods are viable in practice.
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/33465
[8]
A. Nusser, “Fine-Grained Complexity and Algorithm Engineering of Geometric Similarity Measures,” Universität des Saarlandes, Saarbrücken, 2022.
Export
BibTeX
@phdthesis{NusserPhD22,
TITLE = {Fine-Grained Complexity and Algorithm Engineering of Geometric Similarity Measures},
AUTHOR = {Nusser, Andr{\'e}},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291--ds-370184},
DOI = {10.22028/D291-37018},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2022},
MARGINALMARK = {$\bullet$},
}
Endnote
%0 Thesis
%A Nusser, André
%Y Bringmann, Karl
%A referee: Mehlhorn, Kurt
%A referee: Chan, Timothy
%A referee: de Berg, Mark
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Fine-Grained Complexity and Algorithm Engineering of Geometric Similarity Measures :
%G eng
%U http://hdl.handle.net/21.11116/0000-000C-2693-3
%R 10.22028/D291-37018
%U urn:nbn:de:bsz:291--ds-370184
%F OTHER: hdl:20.500.11880/33904
%I Universität des Saarlandes
%C Saarbrücken
%D 2022
%P XIV, 210 p.
%V phd
%9 phd
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/33904
[9]
D. Stutz, “Understanding and Improving Robustness and Uncertainty Estimation in Deep Learning,” Universität des Saarlandes, Saarbrücken, 2022.
Abstract
Deep learning is becoming increasingly relevant for many high-stakes applications such as autonomous driving or medical diagnosis where wrong decisions can have massive impact on human lives. Unfortunately, deep neural networks are typically assessed solely based on generalization, e.g., accuracy on a fixed test set. However, this is clearly insufficient for safe deployment as potential malicious actors and distribution shifts or the effects of quantization and unreliable hardware are disregarded. Thus, recent work additionally evaluates performance on potentially manipulated or corrupted inputs as well as after quantization and deployment on specialized hardware. In such settings, it is also important to obtain reasonable estimates of the model's confidence alongside its predictions. This thesis studies robustness and uncertainty estimation in deep learning along three main directions: First, we consider so-called adversarial examples, slightly perturbed inputs causing severe drops in accuracy. Second, we study weight perturbations, focusing particularly on bit errors in quantized weights. This is relevant for deploying models on special-purpose hardware for efficient inference, so-called accelerators. Finally, we address uncertainty estimation to improve robustness and provide meaningful statistical performance guarantees for safe deployment. In detail, we study the existence of adversarial examples with respect to the underlying data manifold. In this context, we also investigate adversarial training which improves robustness by augmenting training with adversarial examples at the cost of reduced accuracy. We show that regular adversarial examples leave the data manifold in an almost orthogonal direction. While we find no inherent trade-off between robustness and accuracy, this contributes to a higher sample complexity as well as severe overfitting of adversarial training. Using a novel measure of flatness in the robust loss landscape with respect to weight changes, we also show that robust overfitting is caused by converging to particularly sharp minima. In fact, we find a clear correlation between flatness and good robust generalization. Further, we study random and adversarial bit errors in quantized weights. In accelerators, random bit errors occur in the memory when reducing voltage with the goal of improving energy-efficiency. Here, we consider a robust quantization scheme, use weight clipping as regularization and perform random bit error training to improve bit error robustness, allowing considerable energy savings without requiring hardware changes. In contrast, adversarial bit errors are maliciously introduced through hardware- or software-based attacks on the memory, with severe consequences on performance. We propose a novel adversarial bit error attack to study this threat and use adversarial bit error training to improve robustness and thereby also the accelerator's security. Finally, we view robustness in the context of uncertainty estimation. By encouraging low-confidence predictions on adversarial examples, our confidence-calibrated adversarial training successfully rejects adversarial, corrupted as well as out-of-distribution examples at test time. Thereby, we are also able to improve the robustness-accuracy trade-off compared to regular adversarial training. However, even robust models do not provide any guarantee for safe deployment. 
To address this problem, conformal prediction allows the model to predict confidence sets with user-specified guarantee of including the true label. Unfortunately, as conformal prediction is usually applied after training, the model is trained without taking this calibration step into account. To address this limitation, we propose conformal training which allows training conformal predictors end-to-end with the underlying model. This not only improves the obtained uncertainty estimates but also enables optimizing application-specific objectives without losing the provided guarantee. Besides our work on robustness or uncertainty, we also address the problem of 3D shape completion of partially observed point clouds. Specifically, we consider an autonomous driving or robotics setting where vehicles are commonly equipped with LiDAR or depth sensors and obtaining a complete 3D representation of the environment is crucial. However, ground truth shapes that are essential for applying deep learning techniques are extremely difficult to obtain. Thus, we propose a weakly-supervised approach that can be trained on the incomplete point clouds while offering efficient inference. In summary, this thesis contributes to our understanding of robustness against both input and weight perturbations. To this end, we also develop methods to improve robustness alongside uncertainty estimation for safe deployment of deep learning methods in high-stakes applications. In the particular context of autonomous driving, we also address 3D shape completion of sparse point clouds.
Export
BibTeX
@phdthesis{Stutzphd2022,
TITLE = {Understanding and Improving Robustness and Uncertainty Estimation in Deep Learning},
AUTHOR = {Stutz, David},
LANGUAGE = {eng},
URL = {nbn:de:bsz:291--ds-372867},
DOI = {10.22028/D291-37286},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2022},
MARGINALMARK = {$\bullet$},
DATE = {2022},
ABSTRACT = {Deep learning is becoming increasingly relevant for many high-stakes applications such as autonomous driving or medical diagnosis where wrong decisions can have massive impact on human lives. Unfortunately, deep neural networks are typically assessed solely based on generalization, e.g., accuracy on a fixed test set. However, this is clearly insufficient for safe deployment as potential malicious actors and distribution shifts or the effects of quantization and unreliable hardware are disregarded. Thus, recent work additionally evaluates performance on potentially manipulated or corrupted inputs as well as after quantization and deployment on specialized hardware. In such settings, it is also important to obtain reasonable estimates of the model's confidence alongside its predictions. This thesis studies robustness and uncertainty estimation in deep learning along three main directions: First, we consider so-called adversarial examples, slightly perturbed inputs causing severe drops in accuracy. Second, we study weight perturbations, focusing particularly on bit errors in quantized weights. This is relevant for deploying models on special-purpose hardware for efficient inference, so-called accelerators. Finally, we address uncertainty estimation to improve robustness and provide meaningful statistical performance guarantees for safe deployment. In detail, we study the existence of adversarial examples with respect to the underlying data manifold. In this context, we also investigate adversarial training which improves robustness by augmenting training with adversarial examples at the cost of reduced accuracy. We show that regular adversarial examples leave the data manifold in an almost orthogonal direction. While we find no inherent trade-off between robustness and accuracy, this contributes to a higher sample complexity as well as severe overfitting of adversarial training. Using a novel measure of flatness in the robust loss landscape with respect to weight changes, we also show that robust overfitting is caused by converging to particularly sharp minima. In fact, we find a clear correlation between flatness and good robust generalization. Further, we study random and adversarial bit errors in quantized weights. In accelerators, random bit errors occur in the memory when reducing voltage with the goal of improving energy-efficiency. Here, we consider a robust quantization scheme, use weight clipping as regularization and perform random bit error training to improve bit error robustness, allowing considerable energy savings without requiring hardware changes. In contrast, adversarial bit errors are maliciously introduced through hardware- or software-based attacks on the memory, with severe consequences on performance. We propose a novel adversarial bit error attack to study this threat and use adversarial bit error training to improve robustness and thereby also the accelerator's security. Finally, we view robustness in the context of uncertainty estimation. By encouraging low-confidence predictions on adversarial examples, our confidence-calibrated adversarial training successfully rejects adversarial, corrupted as well as out-of-distribution examples at test time. Thereby, we are also able to improve the robustness-accuracy trade-off compared to regular adversarial training. However, even robust models do not provide any guarantee for safe deployment. 
To address this problem, conformal prediction allows the model to predict confidence sets with user-specified guarantee of including the true label. Unfortunately, as conformal prediction is usually applied after training, the model is trained without taking this calibration step into account. To address this limitation, we propose conformal training which allows training conformal predictors end-to-end with the underlying model. This not only improves the obtained uncertainty estimates but also enables optimizing application-specific objectives without losing the provided guarantee. Besides our work on robustness or uncertainty, we also address the problem of 3D shape completion of partially observed point clouds. Specifically, we consider an autonomous driving or robotics setting where vehicles are commonly equipped with LiDAR or depth sensors and obtaining a complete 3D representation of the environment is crucial. However, ground truth shapes that are essential for applying deep learning techniques are extremely difficult to obtain. Thus, we propose a weakly-supervised approach that can be trained on the incomplete point clouds while offering efficient inference. In summary, this thesis contributes to our understanding of robustness against both input and weight perturbations. To this end, we also develop methods to improve robustness alongside uncertainty estimation for safe deployment of deep learning methods in high-stakes applications. In the particular context of autonomous driving, we also address 3D shape completion of sparse point clouds.},
}
Endnote
%0 Thesis
%A Stutz, David
%Y Schiele, Bernt
%A referee: Hein, Matthias
%A referee: Kumar, Pawan
%A referee: Fritz, Mario
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Understanding and Improving Robustness and Uncertainty Estimation in Deep Learning :
%G eng
%U http://hdl.handle.net/21.11116/0000-000B-3FE6-C
%R 10.22028/D291-37286
%U nbn:de:bsz:291--ds-372867
%F OTHER: hdl:20.500.11880/33949
%I Universität des Saarlandes
%C Saarbrücken
%D 2022
%P 291 p.
%V phd
%9 phd
%X Deep learning is becoming increasingly relevant for many high-stakes applications such as autonomous driving or medical diagnosis where wrong decisions can have massive impact on human lives. Unfortunately, deep neural networks are typically assessed solely based on generalization, e.g., accuracy on a fixed test set. However, this is clearly insufficient for safe deployment as potential malicious actors and distribution shifts or the effects of quantization and unreliable hardware are disregarded. Thus, recent work additionally evaluates performance on potentially manipulated or corrupted inputs as well as after quantization and deployment on specialized hardware. In such settings, it is also important to obtain reasonable estimates of the model's confidence alongside its predictions. This thesis studies robustness and uncertainty estimation in deep learning along three main directions: First, we consider so-called adversarial examples, slightly perturbed inputs causing severe drops in accuracy. Second, we study weight perturbations, focusing particularly on bit errors in quantized weights. This is relevant for deploying models on special-purpose hardware for efficient inference, so-called accelerators. Finally, we address uncertainty estimation to improve robustness and provide meaningful statistical performance guarantees for safe deployment. In detail, we study the existence of adversarial examples with respect to the underlying data manifold. In this context, we also investigate adversarial training which improves robustness by augmenting training with adversarial examples at the cost of reduced accuracy. We show that regular adversarial examples leave the data manifold in an almost orthogonal direction. While we find no inherent trade-off between robustness and accuracy, this contributes to a higher sample complexity as well as severe overfitting of adversarial training. Using a novel measure of flatness in the robust loss landscape with respect to weight changes, we also show that robust overfitting is caused by converging to particularly sharp minima. In fact, we find a clear correlation between flatness and good robust generalization. Further, we study random and adversarial bit errors in quantized weights. In accelerators, random bit errors occur in the memory when reducing voltage with the goal of improving energy-efficiency. Here, we consider a robust quantization scheme, use weight clipping as regularization and perform random bit error training to improve bit error robustness, allowing considerable energy savings without requiring hardware changes. In contrast, adversarial bit errors are maliciously introduced through hardware- or software-based attacks on the memory, with severe consequences on performance. We propose a novel adversarial bit error attack to study this threat and use adversarial bit error training to improve robustness and thereby also the accelerator's security. Finally, we view robustness in the context of uncertainty estimation. By encouraging low-confidence predictions on adversarial examples, our confidence-calibrated adversarial training successfully rejects adversarial, corrupted as well as out-of-distribution examples at test time. Thereby, we are also able to improve the robustness-accuracy trade-off compared to regular adversarial training. However, even robust models do not provide any guarantee for safe deployment. 
To address this problem, conformal prediction allows the model to predict confidence sets with user-specified guarantee of including the true label. Unfortunately, as conformal prediction is usually applied after training, the model is trained without taking this calibration step into account. To address this limitation, we propose conformal training which allows training conformal predictors end-to-end with the underlying model. This not only improves the obtained uncertainty estimates but also enables optimizing application-specific objectives without losing the provided guarantee. Besides our work on robustness or uncertainty, we also address the problem of 3D shape completion of partially observed point clouds. Specifically, we consider an autonomous driving or robotics setting where vehicles are commonly equipped with LiDAR or depth sensors and obtaining a complete 3D representation of the environment is crucial. However, ground truth shapes that are essential for applying deep learning techniques are extremely difficult to obtain. Thus, we propose a weakly-supervised approach that can be trained on the incomplete point clouds while offering efficient inference. In summary, this thesis contributes to our understanding of robustness against both input and weight perturbations. To this end, we also develop methods to improve robustness alongside uncertainty estimation for safe deployment of deep learning methods in high-stakes applications. In the particular context of autonomous driving, we also address 3D shape completion of sparse point clouds.
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/33949
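For context on the last contribution, a minimal sketch of standard split conformal prediction, the post-hoc calibration step that the thesis's conformal training makes end-to-end trainable. The function below is our own illustration, not the thesis's code.
import numpy as np

def conformal_prediction_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    # cal_probs: (n, K) class probabilities on a held-out calibration set
    # cal_labels: (n,) true labels; test_probs: (m, K) probabilities for new inputs
    # returns an (m, K) boolean matrix marking the classes in each prediction set
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]        # nonconformity scores
    k = min(int(np.ceil((n + 1) * (1 - alpha))) - 1, n - 1)    # finite-sample quantile index
    threshold = np.sort(scores)[k]
    return (1.0 - test_probs) <= threshold                     # sets contain the true label w.p. >= 1 - alpha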
[10]
A. Tigunova, “Extracting Personal Information from Conversations,” Universität des Saarlandes, Saarbrücken, 2022.
Abstract
Personal knowledge is a versatile resource that is valuable for a wide range of downstream applications. Background facts about users can allow chatbot assistants to produce more topical and empathic replies. In the context of recommendation and retrieval models, personal facts can be used to customize the ranking results for individual users. A Personal Knowledge Base, populated with personal facts, such as demographic information, interests and interpersonal relationships, is a unique endpoint for storing and querying personal knowledge. Such knowledge bases are easily interpretable and can provide users with full control over their own personal knowledge, including revising stored facts and managing access by downstream services for personalization purposes. To alleviate users from extensive manual effort to build such personal knowledge base, we can leverage automated extraction methods applied to the textual content of the users, such as dialogue transcripts or social media posts. Mainstream extraction methods specialize on well-structured data, such as biographical texts or encyclopedic articles, which are rare for most people. In turn, conversational data is abundant but challenging to process and requires specialized methods for extraction of personal facts. In this dissertation we address the acquisition of personal knowledge from conversational data. We propose several novel deep learning models for inferring speakers’ personal attributes: • Demographic attributes, age, gender, profession and family status, are inferred by HAMs - hierarchical neural classifiers with attention mechanism. Trained HAMs can be transferred between different types of conversational data and provide interpretable predictions. • Long-tailed personal attributes, hobby and profession, are predicted with CHARM - a zero-shot learning model, overcoming the lack of labeled training samples for rare attribute values. By linking conversational utterances to external sources, CHARM is able to predict attribute values which it never saw during training. • Interpersonal relationships are inferred with PRIDE - a hierarchical transformer-based model. To accurately predict fine-grained relationships, PRIDE leverages personal traits of the speakers and the style of conversational utterances. Experiments with various conversational texts, including Reddit discussions and movie scripts, demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.
Export
BibTeX
@phdthesis{Tiguphd2022,
TITLE = {Extracting Personal Information from Conversations},
AUTHOR = {Tigunova, Anna},
LANGUAGE = {eng},
URL = {nbn:de:bsz:291--ds-356280},
DOI = {10.22028/D291-35628},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2022},
MARGINALMARK = {$\bullet$},
DATE = {2022},
ABSTRACT = {Personal knowledge is a versatile resource that is valuable for a wide range of downstream applications. Background facts about users can allow chatbot assistants to produce more topical and empathic replies. In the context of recommendation and retrieval models, personal facts can be used to customize the ranking results for individual users. A Personal Knowledge Base, populated with personal facts, such as demographic information, interests and interpersonal relationships, is a unique endpoint for storing and querying personal knowledge. Such knowledge bases are easily interpretable and can provide users with full control over their own personal knowledge, including revising stored facts and managing access by downstream services for personalization purposes. To alleviate users from extensive manual effort to build such personal knowledge base, we can leverage automated extraction methods applied to the textual content of the users, such as dialogue transcripts or social media posts. Mainstream extraction methods specialize on well-structured data, such as biographical texts or encyclopedic articles, which are rare for most people. In turn, conversational data is abundant but challenging to process and requires specialized methods for extraction of personal facts. In this dissertation we address the acquisition of personal knowledge from conversational data. We propose several novel deep learning models for inferring speakers{\textquoteright} personal attributes: \mbox{$\bullet$} Demographic attributes, age, gender, profession and family status, are inferred by HAMs -- hierarchical neural classifiers with attention mechanism. Trained HAMs can be transferred between different types of conversational data and provide interpretable predictions. \mbox{$\bullet$} Long-tailed personal attributes, hobby and profession, are predicted with CHARM -- a zero-shot learning model, overcoming the lack of labeled training samples for rare attribute values. By linking conversational utterances to external sources, CHARM is able to predict attribute values which it never saw during training. \mbox{$\bullet$} Interpersonal relationships are inferred with PRIDE -- a hierarchical transformer-based model. To accurately predict fine-grained relationships, PRIDE leverages personal traits of the speakers and the style of conversational utterances. Experiments with various conversational texts, including Reddit discussions and movie scripts, demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.},
}
Endnote
%0 Thesis
%A Tigunova, Anna
%Y Weikum, Gerhard
%A referee: Yates, Andrew
%A referee: Demberg, Vera
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
%T Extracting Personal Information from Conversations :
%G eng
%U http://hdl.handle.net/21.11116/0000-000B-3FE1-1
%R 10.22028/D291-35628
%U nbn:de:bsz:291--ds-356280
%F OTHER: hdl:20.500.11880/32546
%I Universität des Saarlandes
%C Saarbrücken
%D 2022
%P 139 p.
%V phd
%9 phd
%X Personal knowledge is a versatile resource that is valuable for a wide range of downstream applications. Background facts about users can allow chatbot assistants to produce more topical and empathic replies. In the context of recommendation and retrieval models, personal facts can be used to customize the ranking results for individual users. A Personal Knowledge Base, populated with personal facts, such as demographic information, interests and interpersonal relationships, is a unique endpoint for storing and querying personal knowledge. Such knowledge bases are easily interpretable and can provide users with full control over their own personal knowledge, including revising stored facts and managing access by downstream services for personalization purposes. To alleviate users from extensive manual effort to build such personal knowledge base, we can leverage automated extraction methods applied to the textual content of the users, such as dialogue transcripts or social media posts. Mainstream extraction methods specialize on well-structured data, such as biographical texts or encyclopedic articles, which are rare for most people. In turn, conversational data is abundant but challenging to process and requires specialized methods for extraction of personal facts. In this dissertation we address the acquisition of personal knowledge from conversational data. We propose several novel deep learning models for inferring speakers’ personal attributes: • Demographic attributes, age, gender, profession and family status, are inferred by HAMs - hierarchical neural classifiers with attention mechanism. Trained HAMs can be transferred between different types of conversational data and provide interpretable predictions. • Long-tailed personal attributes, hobby and profession, are predicted with CHARM - a zero-shot learning model, overcoming the lack of labeled training samples for rare attribute values. By linking conversational utterances to external sources, CHARM is able to predict attribute values which it never saw during training. • Interpersonal relationships are inferred with PRIDE - a hierarchical transformer-based model. To accurately predict fine-grained relationships, PRIDE leverages personal traits of the speakers and the style of conversational utterances. Experiments with various conversational texts, including Reddit discussions and movie scripts, demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/32546
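A minimal sketch of attention pooling over a speaker's utterance embeddings, the kind of building block behind the hierarchical attention classifiers (HAMs) described above. A single pooling level is shown for brevity; the thesis models attend over both words and utterances, and all names and dimensions here are illustrative.
import torch
import torch.nn as nn

class AttentionPoolingClassifier(nn.Module):
    def __init__(self, emb_dim, num_classes):
        super().__init__()
        self.scorer = nn.Linear(emb_dim, 1)              # relevance score per utterance
        self.classifier = nn.Linear(emb_dim, num_classes)

    def forward(self, utterances):                       # (batch, num_utterances, emb_dim)
        scores = self.scorer(utterances).squeeze(-1)     # (batch, num_utterances)
        attn = torch.softmax(scores, dim=-1)             # attention over utterances
        speaker = (attn.unsqueeze(-1) * utterances).sum(dim=1)  # weighted speaker embedding
        return self.classifier(speaker), attn            # logits plus interpretable attention weights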
2021
[11]
A. Bhattacharyya, “Long-term future prediction under uncertainty and multi-modality,” Universität des Saarlandes, Saarbrücken, 2021.
Export
BibTeX
@phdthesis{Batphd2021,
TITLE = {Long-term future prediction under uncertainty and multi-modality},
AUTHOR = {Bhattacharyya, Apratim},
LANGUAGE = {eng},
URL = {nbn:de:bsz:291--ds-356522},
DOI = {10.22028/D291-35652},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
}
Endnote
%0 Thesis
%A Bhattacharyya, Apratim
%Y Schiele, Bernt
%A referee: Fritz, Mario
%A referee: Geiger, Andreas
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
%T Long-term future prediction under uncertainty and multi-modality :
%G eng
%U http://hdl.handle.net/21.11116/0000-000A-20BF-B
%R 10.22028/D291-35652
%U nbn:de:bsz:291--ds-356522
%F OTHER: hdl:20.500.11880/32595
%I Universität des Saarlandes
%C Saarbrücken
%D 2021
%P 210 p.
%V phd
%9 phd
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/32595
[12]
B. R. Chaudhury, “Finding Fair and Efficient Allocations,” Universität des Saarlandes, Saarbrücken, 2021.
Export
BibTeX
@phdthesis{Chaudphd2021,
TITLE = {Finding Fair and Efficient Allocations},
AUTHOR = {Chaudhury, Bhaskar Ray},
LANGUAGE = {eng},
URL = {nbn:de:bsz:291--ds-345370},
DOI = {10.22028/D291-34537},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
}
Endnote
%0 Thesis
%A Chaudhury, Bhaskar Ray
%Y Mehlhorn, Kurt
%A referee: Bringmann, Karl
%A referee: Roughgarden, Tim
%A referee: Moulin, Herve
%+ Algorithms and Complexity, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
Algorithms and Complexity, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Finding Fair and Efficient Allocations :
%G eng
%U http://hdl.handle.net/21.11116/0000-0009-9CC9-5
%R 10.22028/D291-34537
%U nbn:de:bsz:291--ds-345370
%F OTHER: hdl:20.500.11880/31737
%I Universität des Saarlandes
%C Saarbrücken
%D 2021
%P 173 p.
%V phd
%9 phd
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/31737
[13]
D. A. Durai, “Novel graph based algorithms for transcriptome sequence analysis,” Universität des Saarlandes, Saarbrücken, 2021.
Export
BibTeX
@phdthesis{Duraiphd2020,
TITLE = {Novel graph based algorithms for transcriptome sequence analysis},
AUTHOR = {Durai, Dilip Ariyur},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291--ds-341585},
DOI = {10.22028/D291-34158},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
}
Endnote
%0 Thesis
%A Durai, Dilip Ariyur
%Y Schulz, Marcel
%A referee: Helms, Volker
%+ Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
External Organizations
%T Novel graph based algorithms for transcriptome sequence analysis :
%G eng
%U http://hdl.handle.net/21.11116/0000-0008-E4D6-5
%R 10.22028/D291-34158
%U urn:nbn:de:bsz:291--ds-341585
%F OTHER: hdl:20.500.11880/31478
%I Universität des Saarlandes
%C Saarbrücken
%D 2021
%P 143 p.
%V phd
%9 phd
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/31478
[14]
M. H. Gad-Elrab, “Explainable Methods for Knowledge Graph Refinement and Exploration via Symbolic Reasoning,” Universität des Saarlandes, Saarbrücken, 2021.
Abstract
Knowledge Graphs (KGs) have applications in many domains such as Finance, Manufacturing, and Healthcare. While recent efforts have created large KGs, their content is far from complete and sometimes includes invalid statements. Therefore, it is crucial to refine the constructed KGs to enhance their coverage and accuracy via KG completion and KG validation. It is also vital to provide human-comprehensible explanations for such refinements, so that humans have trust in the KG quality. Enabling KG exploration, by search and browsing, is also essential for users to understand the KG value and limitations towards down-stream applications. However, the large size of KGs makes KG exploration very challenging. While the type taxonomy of KGs is a useful asset along these lines, it remains insufficient for deep exploration. In this dissertation we tackle the aforementioned challenges of KG refinement and KG exploration by combining logical reasoning over the KG with other techniques such as KG embedding models and text mining. Through such combination, we introduce methods that provide human-understandable output. Concretely, we introduce methods to tackle KG incompleteness by learning exception-aware rules over the existing KG. Learned rules are then used in inferring missing links in the KG accurately. Furthermore, we propose a framework for constructing human-comprehensible explanations for candidate facts from both KG and text. Extracted explanations are used to insure the validity of KG facts. Finally, to facilitate KG exploration, we introduce a method that combines KG embeddings with rule mining to compute informative entity clusters with explanations.
Export
BibTeX
@phdthesis{Elrabphd2021,
TITLE = {Explainable Methods for Knowledge Graph Refinement and Exploration via Symbolic Reasoning},
AUTHOR = {Gad-Elrab, Mohamed Hassan},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291--ds-344237},
DOI = {10.22028/D291-34423},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
ABSTRACT = {Knowledge Graphs (KGs) have applications in many domains such as Finance, Manufacturing, and Healthcare. While recent efforts have created large KGs, their content is far from complete and sometimes includes invalid statements. Therefore, it is crucial to refine the constructed KGs to enhance their coverage and accuracy via KG completion and KG validation. It is also vital to provide human-comprehensible explanations for such refinements, so that humans have trust in the KG quality. Enabling KG exploration, by search and browsing, is also essential for users to understand the KG value and limitations towards down-stream applications. However, the large size of KGs makes KG exploration very challenging. While the type taxonomy of KGs is a useful asset along these lines, it remains insufficient for deep exploration. In this dissertation we tackle the aforementioned challenges of KG refinement and KG exploration by combining logical reasoning over the KG with other techniques such as KG embedding models and text mining. Through such combination, we introduce methods that provide human-understandable output. Concretely, we introduce methods to tackle KG incompleteness by learning exception-aware rules over the existing KG. Learned rules are then used in inferring missing links in the KG accurately. Furthermore, we propose a framework for constructing human-comprehensible explanations for candidate facts from both KG and text. Extracted explanations are used to insure the validity of KG facts. Finally, to facilitate KG exploration, we introduce a method that combines KG embeddings with rule mining to compute informative entity clusters with explanations.},
}
Endnote
%0 Thesis
%A Gad-Elrab, Mohamed Hassan
%Y Weikum, Gerhard
%A referee: Theobald, Martin
%A referee: Stepanova, Daria
%A referee: Razniewski, Simon
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Explainable Methods for Knowledge Graph Refinement and Exploration via Symbolic Reasoning :
%G eng
%U http://hdl.handle.net/21.11116/0000-0009-427E-0
%R 10.22028/D291-34423
%U urn:nbn:de:bsz:291--ds-344237
%F OTHER: hdl:20.500.11880/31629
%I Universität des Saarlandes
%C Saarbrücken
%D 2021
%P 176 p.
%V phd
%9 phd
%X Knowledge Graphs (KGs) have applications in many domains such as Finance, Manufacturing, and Healthcare. While recent efforts have created large KGs, their content is far from complete and sometimes includes invalid statements. Therefore, it is crucial to refine the constructed KGs to enhance their coverage and accuracy via KG completion and KG validation. It is also vital to provide human-comprehensible explanations for such refinements, so that humans have trust in the KG quality. Enabling KG exploration, by search and browsing, is also essential for users to understand the KG value and limitations towards down-stream applications. However, the large size of KGs makes KG exploration very challenging. While the type taxonomy of KGs is a useful asset along these lines, it remains insufficient for deep exploration. In this dissertation we tackle the aforementioned challenges of KG refinement and KG exploration by combining logical reasoning over the KG with other techniques such as KG embedding models and text mining. Through such combination, we introduce methods that provide human-understandable output. Concretely, we introduce methods to tackle KG incompleteness by learning exception-aware rules over the existing KG. Learned rules are then used in inferring missing links in the KG accurately. Furthermore, we propose a framework for constructing human-comprehensible explanations for candidate facts from both KG and text. Extracted explanations are used to insure the validity of KG facts. Finally, to facilitate KG exploration, we introduce a method that combines KG embeddings with rule mining to compute informative entity clusters with explanations.
%K knowledge graphs
symbolic learning
embedding models
rule learning
Big Data
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/31629
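A minimal sketch of how a single Horn rule can be scored over a set of KG triples with support and confidence, the standard statistic underlying rule-based KG completion; the thesis goes further by learning exception-aware rules. The relation names and the toy KG below are hypothetical.
def rule_stats(triples, body_rel, head_rel):
    # triples: iterable of (subject, relation, object); scores the rule body_rel(x, y) => head_rel(x, y)
    body = {(s, o) for s, r, o in triples if r == body_rel}
    head = {(s, o) for s, r, o in triples if r == head_rel}
    support = len(body & head)                         # pairs satisfying both body and head
    confidence = support / len(body) if body else 0.0  # fraction of body matches that also satisfy the head
    return support, confidence

kg = [("alice", "bornIn", "paris"), ("alice", "livesIn", "paris"),
      ("bob", "bornIn", "rome"), ("bob", "livesIn", "london")]
print(rule_stats(kg, "bornIn", "livesIn"))             # (1, 0.5)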
[15]
A. Ghazimatin, “Enhancing Explainability and Scrutability of Recommender Systems,” Universität des Saarlandes, Saarbrücken, 2021.
Abstract
Our increasing reliance on complex algorithms for recommendations calls for models and methods for explainable, scrutable, and trustworthy AI. While explainability is required for understanding the relationships between model inputs and outputs, a scrutable system allows us to modify its behavior as desired. These properties help bridge the gap between our expectations and the algorithm’s behavior and accordingly boost our trust in AI. Aiming to cope with information overload, recommender systems play a crucial role in filtering content (such as products, news, songs, and movies) and shaping a personalized experience for their users. Consequently, there has been a growing demand from the information consumers to receive proper explanations for their personalized recommendations. These explanations aim at helping users understand why certain items are recommended to them and how their previous inputs to the system relate to the generation of such recommendations. Besides, in the event of receiving undesirable content, explanations could possibly contain valuable information as to how the system’s behavior can be modified accordingly. In this thesis, we present our contributions towards explainability and scrutability of recommender systems: • We introduce a user-centric framework, FAIRY, for discovering and ranking post-hoc explanations for the social feeds generated by black-box platforms. These explanations reveal relationships between users’ profiles and their feed items and are extracted from the local interaction graphs of users. FAIRY employs a learning-to-rank (LTR) method to score candidate explanations based on their relevance and surprisal. • We propose a method, PRINCE, to facilitate provider-side explainability in graph-based recommender systems that use personalized PageRank at their core. PRINCE explanations are comprehensible for users, because they present subsets of the user’s prior actions responsible for the received recommendations. PRINCE operates in a counterfactual setup and builds on a polynomial-time algorithm for finding the smallest counterfactual explanations. • We propose a human-in-the-loop framework, ELIXIR, for enhancing scrutability and subsequently the recommendation models by leveraging user feedback on explanations. ELIXIR enables recommender systems to collect user feedback on pairs of recommendations and explanations. The feedback is incorporated into the model by imposing a soft constraint for learning user-specific item representations. We evaluate all proposed models and methods with real user studies and demonstrate their benefits at achieving explainability and scrutability in recommender systems.
Export
BibTeX
@phdthesis{Ghazphd2021,
TITLE = {Enhancing Explainability and Scrutability of Recommender Systems},
AUTHOR = {Ghazimatin, Azin},
LANGUAGE = {eng},
URL = {nbn:de:bsz:291--ds-355166},
DOI = {10.22028/D291-35516},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
ABSTRACT = {Our increasing reliance on complex algorithms for recommendations calls for models and methods for explainable, scrutable, and trustworthy AI. While explainability is required for understanding the relationships between model inputs and outputs, a scrutable system allows us to modify its behavior as desired. These properties help bridge the gap between our expectations and the algorithm{\textquoteright}s behavior and accordingly boost our trust in AI. Aiming to cope with information overload, recommender systems play a crucial role in {fi}ltering content (such as products, news, songs, and movies) and shaping a personalized experience for their users. Consequently, there has been a growing demand from the information consumers to receive proper explanations for their personalized recommendations. These explanations aim at helping users understand why certain items are recommended to them and how their previous inputs to the system relate to the generation of such recommendations. Besides, in the event of receiving undesirable content, explanations could possibly contain valuable information as to how the system{\textquoteright}s behavior can be modi{fi}ed accordingly. In this thesis, we present our contributions towards explainability and scrutability of recommender systems: \mbox{$\bullet$} We introduce a user-centric framework, FAIRY, for discovering and ranking post-hoc explanations for the social feeds generated by black-box platforms. These explanations reveal relationships between users{\textquoteright} pro{fi}les and their feed items and are extracted from the local interaction graphs of users. FAIRY employs a learning-to-rank (LTR) method to score candidate explanations based on their relevance and surprisal. \mbox{$\bullet$} We propose a method, PRINCE, to facilitate provider-side explainability in graph-based recommender systems that use personalized PageRank at their core. PRINCE explanations are comprehensible for users, because they present subsets of the user{\textquoteright}s prior actions responsible for the received recommendations. PRINCE operates in a counterfactual setup and builds on a polynomial-time algorithm for {fi}nding the smallest counterfactual explanations. \mbox{$\bullet$} We propose a human-in-the-loop framework, ELIXIR, for enhancing scrutability and subsequently the recommendation models by leveraging user feedback on explanations. ELIXIR enables recommender systems to collect user feedback on pairs of recommendations and explanations. The feedback is incorporated into the model by imposing a soft constraint for learning user-speci{fi}c item representations. We evaluate all proposed models and methods with real user studies and demonstrate their bene{fi}ts at achieving explainability and scrutability in recommender systems.},
}
Endnote
%0 Thesis
%A Ghazimatin, Azin
%Y Weikum, Gerhard
%A referee: Saha Roy, Rishiraj
%A referee: Amer-Yahia, Sihem
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
%T Enhancing Explainability and Scrutability of Recommender Systems :
%G eng
%U http://hdl.handle.net/21.11116/0000-000A-3C99-7
%R 10.22028/D291-35516
%U nbn:de:bsz:291--ds-355166
%F OTHER: hdl:20.500.11880/32590
%I Universität des Saarlandes
%C Saarbrücken
%D 2021
%P 136 p.
%V phd
%9 phd
%X Our increasing reliance on complex algorithms for recommendations calls for models and methods for explainable, scrutable, and trustworthy AI. While explainability is required for understanding the relationships between model inputs and outputs, a scrutable system allows us to modify its behavior as desired. These properties help bridge the gap between our expectations and the algorithm’s behavior and accordingly boost our trust in AI. Aiming to cope with information overload, recommender systems play a crucial role in filtering content (such as products, news, songs, and movies) and shaping a personalized experience for their users. Consequently, there has been a growing demand from the information consumers to receive proper explanations for their personalized recommendations. These explanations aim at helping users understand why certain items are recommended to them and how their previous inputs to the system relate to the generation of such recommendations. Besides, in the event of receiving undesirable content, explanations could possibly contain valuable information as to how the system’s behavior can be modified accordingly. In this thesis, we present our contributions towards explainability and scrutability of recommender systems: • We introduce a user-centric framework, FAIRY, for discovering and ranking post-hoc explanations for the social feeds generated by black-box platforms. These explanations reveal relationships between users’ profiles and their feed items and are extracted from the local interaction graphs of users. FAIRY employs a learning-to-rank (LTR) method to score candidate explanations based on their relevance and surprisal. • We propose a method, PRINCE, to facilitate provider-side explainability in graph-based recommender systems that use personalized PageRank at their core. PRINCE explanations are comprehensible for users, because they present subsets of the user’s prior actions responsible for the received recommendations. PRINCE operates in a counterfactual setup and builds on a polynomial-time algorithm for finding the smallest counterfactual explanations. • We propose a human-in-the-loop framework, ELIXIR, for enhancing scrutability and subsequently the recommendation models by leveraging user feedback on explanations. ELIXIR enables recommender systems to collect user feedback on pairs of recommendations and explanations. The feedback is incorporated into the model by imposing a soft constraint for learning user-specific item representations. We evaluate all proposed models and methods with real user studies and demonstrate their benefits at achieving explainability and scrutability in recommender systems.
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/32590
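A minimal sketch of personalized PageRank by power iteration, the recommendation core that PRINCE reasons about when searching for the smallest counterfactual set of user actions. The dense-matrix implementation and parameters below are illustrative only.
import numpy as np

def personalized_pagerank(adj, source, alpha=0.15, iters=100):
    # adj: (n, n) adjacency matrix; source: index of the user node
    # returns the stationary scores of a walk that restarts at `source` with probability alpha
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    restart = np.zeros(n)
    restart[source] = 1.0
    scores = restart.copy()
    for _ in range(iters):
        has_out = out_deg > 0
        mass = (scores[has_out] / out_deg[has_out]) @ adj[has_out]  # push score along out-edges
        mass[source] += scores[~has_out].sum()                      # dangling mass restarts at the user
        scores = alpha * restart + (1 - alpha) * mass
    return scores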
[16]
M. Habermann, “Real-time human performance capture and synthesis,” Universität des Saarlandes, Saarbrücken, 2021.
Abstract
Most of the images one finds in the media, such as on the Internet or in textbooks and magazines, contain humans as the main point of attention. Thus, there is an inherent necessity for industry, society, and private persons to be able to thoroughly analyze and synthesize the human-related content in these images. One aspect of this analysis and subject of this thesis is to infer the 3D pose and surface deformation, using only visual information, which is also known as human performance capture. Human performance capture enables the tracking of virtual characters from real-world observations, and this is key for visual effects, games, VR, and AR, to name just a few application areas. However, traditional capture methods usually rely on expensive multi-view (marker-based) systems that are prohibitively expensive for the vast majority of people, or they use depth sensors, which are still not as common as single color cameras. Recently, some approaches have attempted to solve the task by assuming only a single RGB image is given. Nonetheless, they can either not track the dense deforming geometry of the human, such as the clothing layers, or they are far from real time, which is indispensable for many applications. To overcome these shortcomings, this thesis proposes two monocular human performance capture methods, which for the first time allow the real-time capture of the dense deforming geometry as well as an unseen 3D accuracy for pose and surface deformations. At the technical core, this work introduces novel GPU-based and data-parallel optimization strategies in conjunction with other algorithmic design choices that are all geared towards real-time performance at high accuracy. Moreover, this thesis presents a new weakly supervised multiview training strategy combined with a fully differentiable character representation that shows superior 3D accuracy. However, there is more to human-related Computer Vision than only the analysis of people in images. It is equally important to synthesize new images of humans in unseen poses and also from camera viewpoints that have not been observed in the real world. Such tools are essential for the movie industry because they, for example, allow the synthesis of photo-realistic virtual worlds with real-looking humans or of contents that are too dangerous for actors to perform on set. But also video conferencing and telepresence applications can benefit from photo-real 3D characters, as they can enhance the immersive experience of these applications. Here, the traditional Computer Graphics pipeline for rendering photo-realistic images involves many tedious and time-consuming steps that require expert knowledge and are far from real time. Traditional rendering involves character rigging and skinning, the modeling of the surface appearance properties, and physically based ray tracing. Recent learning-based methods attempt to simplify the traditional rendering pipeline and instead learn the rendering function from data resulting in methods that are easier accessible to non-experts. However, most of them model the synthesis task entirely in image space such that 3D consistency cannot be achieved, and/or they fail to model motion- and view-dependent appearance effects. To this end, this thesis presents a method and ongoing work on character synthesis, which allow the synthesis of controllable photoreal characters that achieve motion- and view-dependent appearance effects as well as 3D consistency and which run in real time. 
This is technically achieved by a novel coarse-to-fine geometric character representation for efficient synthesis, which can be solely supervised on multi-view imagery. Furthermore, this work shows how such a geometric representation can be combined with an implicit surface representation to boost synthesis and geometric quality.
Export
BibTeX
@phdthesis{Habermannphd2021,
TITLE = {Real-time human performance capture and synthesis},
AUTHOR = {Habermann, Marc},
LANGUAGE = {eng},
URL = {nbn:de:bsz:291--ds-349617},
DOI = {10.22028/D291-34961},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
ABSTRACT = {Most of the images one finds in the media, such as on the Internet or in textbooks and magazines, contain humans as the main point of attention. Thus, there is an inherent necessity for industry, society, and private persons to be able to thoroughly analyze and synthesize the human-related content in these images. One aspect of this analysis and subject of this thesis is to infer the 3D pose and surface deformation, using only visual information, which is also known as human performance capture. Human performance capture enables the tracking of virtual characters from real-world observations, and this is key for visual effects, games, VR, and AR, to name just a few application areas. However, traditional capture methods usually rely on expensive multi-view (marker-based) systems that are prohibitively expensive for the vast majority of people, or they use depth sensors, which are still not as common as single color cameras. Recently, some approaches have attempted to solve the task by assuming only a single RGB image is given. Nonetheless, they can either not track the dense deforming geometry of the human, such as the clothing layers, or they are far from real time, which is indispensable for many applications. To overcome these shortcomings, this thesis proposes two monocular human performance capture methods, which for the first time allow the real-time capture of the dense deforming geometry as well as an unseen 3D accuracy for pose and surface deformations. At the technical core, this work introduces novel GPU-based and data-parallel optimization strategies in conjunction with other algorithmic design choices that are all geared towards real-time performance at high accuracy. Moreover, this thesis presents a new weakly supervised multiview training strategy combined with a fully differentiable character representation that shows superior 3D accuracy. However, there is more to human-related Computer Vision than only the analysis of people in images. It is equally important to synthesize new images of humans in unseen poses and also from camera viewpoints that have not been observed in the real world. Such tools are essential for the movie industry because they, for example, allow the synthesis of photo-realistic virtual worlds with real-looking humans or of contents that are too dangerous for actors to perform on set. But also video conferencing and telepresence applications can benefit from photo-real 3D characters, as they can enhance the immersive experience of these applications. Here, the traditional Computer Graphics pipeline for rendering photo-realistic images involves many tedious and time-consuming steps that require expert knowledge and are far from real time. Traditional rendering involves character rigging and skinning, the modeling of the surface appearance properties, and physically based ray tracing. Recent learning-based methods attempt to simplify the traditional rendering pipeline and instead learn the rendering function from data resulting in methods that are easier accessible to non-experts. However, most of them model the synthesis task entirely in image space such that 3D consistency cannot be achieved, and/or they fail to model motion- and view-dependent appearance effects. To this end, this thesis presents a method and ongoing work on character synthesis, which allow the synthesis of controllable photoreal characters that achieve motion- and view-dependent appearance effects as well as 3D consistency and which run in real time. 
This is technically achieved by a novel coarse-to-fine geometric character representation for efficient synthesis, which can be solely supervised on multi-view imagery. Furthermore, this work shows how such a geometric representation can be combined with an implicit surface representation to boost synthesis and geometric quality.},
}
Endnote
%0 Thesis
%A Habermann, Marc
%Y Theobalt, Christian
%A referee: Seidel, Hans-Peter
%A referee: Hilton, Adrian
%+ Computer Graphics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
External Organizations
%T Real-time human performance capture and synthesis :
%G eng
%U http://hdl.handle.net/21.11116/0000-0009-7D87-3
%R 10.22028/D291-34961
%U nbn:de:bsz:291--ds-349617
%F OTHER: hdl:20.500.11880/31986
%I Universität des Saarlandes
%C Saarbrücken
%D 2021
%P 153 p.
%V phd
%9 phd
%X Most of the images one finds in the media, such as on the Internet or in textbooks and magazines, contain humans as the main point of attention. Thus, there is an inherent necessity for industry, society, and private persons to be able to thoroughly analyze and synthesize the human-related content in these images. One aspect of this analysis and subject of this thesis is to infer the 3D pose and surface deformation, using only visual information, which is also known as human performance capture. Human performance capture enables the tracking of virtual characters from real-world observations, and this is key for visual effects, games, VR, and AR, to name just a few application areas. However, traditional capture methods usually rely on expensive multi-view (marker-based) systems that are prohibitively expensive for the vast majority of people, or they use depth sensors, which are still not as common as single color cameras. Recently, some approaches have attempted to solve the task by assuming only a single RGB image is given. Nonetheless, they can either not track the dense deforming geometry of the human, such as the clothing layers, or they are far from real time, which is indispensable for many applications. To overcome these shortcomings, this thesis proposes two monocular human performance capture methods, which for the first time allow the real-time capture of the dense deforming geometry as well as an unseen 3D accuracy for pose and surface deformations. At the technical core, this work introduces novel GPU-based and data-parallel optimization strategies in conjunction with other algorithmic design choices that are all geared towards real-time performance at high accuracy. Moreover, this thesis presents a new weakly supervised multiview training strategy combined with a fully differentiable character representation that shows superior 3D accuracy. However, there is more to human-related Computer Vision than only the analysis of people in images. It is equally important to synthesize new images of humans in unseen poses and also from camera viewpoints that have not been observed in the real world. Such tools are essential for the movie industry because they, for example, allow the synthesis of photo-realistic virtual worlds with real-looking humans or of contents that are too dangerous for actors to perform on set. But also video conferencing and telepresence applications can benefit from photo-real 3D characters, as they can enhance the immersive experience of these applications. Here, the traditional Computer Graphics pipeline for rendering photo-realistic images involves many tedious and time-consuming steps that require expert knowledge and are far from real time. Traditional rendering involves character rigging and skinning, the modeling of the surface appearance properties, and physically based ray tracing. Recent learning-based methods attempt to simplify the traditional rendering pipeline and instead learn the rendering function from data resulting in methods that are easier accessible to non-experts. However, most of them model the synthesis task entirely in image space such that 3D consistency cannot be achieved, and/or they fail to model motion- and view-dependent appearance effects. To this end, this thesis presents a method and ongoing work on character synthesis, which allow the synthesis of controllable photoreal characters that achieve motion- and view-dependent appearance effects as well as 3D consistency and which run in real time. 
This is technically achieved by a novel coarse-to-fine geometric character representation for efficient synthesis, which can be supervised solely on multi-view imagery. Furthermore, this work shows how such a geometric representation can be combined with an implicit surface representation to boost synthesis and geometric quality.
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/31986
[17]
P. Mandros, “Discovering robust dependencies from data,” Universität des Saarlandes, Saarbrücken, 2021.
Export
BibTeX
@phdthesis{Panphd2020,
TITLE = {Discovering robust dependencies from data},
AUTHOR = {Mandros, Panagiotis},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291--ds-342919},
DOI = {10.22028/D291-34291},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
}
Endnote
%0 Thesis
%A Mandros, Panagiotis
%Y Vreeken, Jilles
%A referee: Weikum, Gerhard
%A referee: Webb, Geoffrey
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
%T Discovering robust dependencies from data :
%G eng
%U http://hdl.handle.net/21.11116/0000-0008-E4CF-E
%R 10.22028/D291-34291
%U urn:nbn:de:bsz:291--ds-342919
%F OTHER: hdl:20.500.11880/31535
%I Universität des Saarlandes
%C Saarbrücken
%D 2021
%P 194 p.
%V phd
%9 phd
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/31535
[18]
A. Marx, “Information-Theoretic Causal Discovery,” Universität des Saarlandes, Saarbrücken, 2021.
Export
BibTeX
@phdthesis{Marxphd2020,
TITLE = {Information-Theoretic Causal Discovery},
AUTHOR = {Marx, Alexander},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291--ds-342908},
DOI = {10.22028/D291-34290},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
}
Endnote
%0 Thesis
%A Marx, Alexander
%Y Vreeken, Jilles
%A referee: Weikum, Gerhard
%A referee: Ommen, Thijs van
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
%T Information-Theoretic Causal Discovery :
%G eng
%U http://hdl.handle.net/21.11116/0000-0008-EECA-9
%R 10.22028/D291-34290
%U urn:nbn:de:bsz:291--ds-342908
%F OTHER: hdl:20.500.11880/31480
%I Universität des Saarlandes
%C Saarbrücken
%D 2021
%P 195 p.
%V phd
%9 phd
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/31480
[19]
S. Metzler, “Structural Building Blocks in Graph Data,” Universität des Saarlandes, Saarbrücken, 2021.
Abstract
Graph data nowadays easily become so large that it is infeasible to study the underlying structures manually. Thus, computational methods are needed to uncover large-scale structural information. In this thesis, we present methods to understand and summarise large networks. We propose the hyperbolic community model to describe groups of more densely connected nodes within networks using very intuitive parameters. The model accounts for a frequent connectivity pattern in real data: a few community members are highly interconnected; most members mainly have ties to this core. Our model fits real data much better than previously-proposed models. Our corresponding random graph generator, HyGen, creates graphs with realistic intra-community structure. Using the hyperbolic model, we conduct a large-scale study of the temporal evolution of communities on online question–answer sites. We observe that the user activity within a community is constant with respect to its size throughout its lifetime, and a small group of users is responsible for the majority of the social interactions. We propose an approach for Boolean tensor clustering. This special tensor factorisation is restricted to binary data and assumes that one of the tensor directions has only non-overlapping factors. These assumptions – valid for many real-world data, in particular time-evolving networks – enable the use of bitwise operators and lift much of the computational complexity from the task.
Export
BibTeX
@phdthesis{SaskiaDiss21,
TITLE = {Structural Building Blocks in Graph Data},
AUTHOR = {Metzler, Saskia},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291--ds-335366},
DOI = {10.22028/D291-33536},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
ABSTRACT = {Graph data nowadays easily become so large that it is infeasible to study the underlying structures manually. Thus, computational methods are needed to uncover large-scale structural information. In this thesis, we present methods to understand and summarise large networks. We propose the hyperbolic community model to describe groups of more densely connected nodes within networks using very intuitive parameters. The model accounts for a frequent connectivity pattern in real data: a few community members are highly interconnected; most members mainly have ties to this core. Our model fits real data much better than previously-proposed models. Our corresponding random graph generator, HyGen, creates graphs with realistic intra-community structure. Using the hyperbolic model, we conduct a large-scale study of the temporal evolution of communities on online question--answer sites. We observe that the user activity within a community is constant with respect to its size throughout its lifetime, and a small group of users is responsible for the majority of the social interactions. We propose an approach for Boolean tensor clustering. This special tensor factorisation is restricted to binary data and assumes that one of the tensor directions has only non-overlapping factors. These assumptions -- valid for many real-world data, in particular time-evolving networks -- enable the use of bitwise operators and lift much of the computational complexity from the task.},
}
Endnote
%0 Thesis
%A Metzler, Saskia
%Y Miettinen, Pauli
%Y Weikum, Gerhard
%Y Günnemann, Stephan
%+ Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
External Organizations
%T Structural Building Blocks in Graph Data : Characterised by Hyperbolic Communities and Uncovered by Boolean Tensor Clustering
%G eng
%U http://hdl.handle.net/21.11116/0000-0008-0BC1-2
%R 10.22028/D291-33536
%U urn:nbn:de:bsz:291--ds-335366
%F OTHER: hdl:20.500.11880/30904
%I Universität des Saarlandes
%C Saarbrücken
%D 2021
%P 196 p.
%V phd
%9 phd
%X Graph data nowadays easily become so large that it is infeasible to study the underlying structures manually. Thus, computational methods are needed to uncover large-scale structural information. In this thesis, we present methods to understand and summarise large networks. We propose the hyperbolic community model to describe groups of more densely connected nodes within networks using very intuitive parameters. The model accounts for a frequent connectivity pattern in real data: a few community members are highly interconnected; most members mainly have ties to this core. Our model fits real data much better than previously-proposed models. Our corresponding random graph generator, HyGen, creates graphs with realistic intra-community structure. Using the hyperbolic model, we conduct a large-scale study of the temporal evolution of communities on online question–answer sites. We observe that the user activity within a community is constant with respect to its size throughout its lifetime, and a small group of users is responsible for the majority of the social interactions. We propose an approach for Boolean tensor clustering. This special tensor factorisation is restricted to binary data and assumes that one of the tensor directions has only non-overlapping factors. These assumptions – valid for many real-world data, in particular time-evolving networks – enable the use of bitwise operators and lift much of the computational complexity from the task.
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/30904
[20]
S. Nag Chowdhury, “Text-image synergy for multimodal retrieval and annotation,” Universität des Saarlandes, Saarbrücken, 2021.
Abstract
Text and images are the two most common data modalities found on the Internet. Understanding the synergy between text and images, that is, seamlessly analyzing information from these modalities may be trivial for humans, but is challenging for software systems. In this dissertation we study problems where deciphering text-image synergy is crucial for finding solutions. We propose methods and ideas that establish semantic connections between text and images in multimodal contents, and empirically show their effectiveness in four interconnected problems: Image Retrieval, Image Tag Refinement, Image-Text Alignment, and Image Captioning. Our promising results and observations open up interesting scopes for future research involving text-image data understanding.
Export
BibTeX
@phdthesis{Chowphd2021,
TITLE = {Text-image synergy for multimodal retrieval and annotation},
AUTHOR = {Nag Chowdhury, Sreyasi},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291--ds-345092},
DOI = {10.22028/D291-34509},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
ABSTRACT = {Text and images are the two most common data modalities found on the Internet. Understanding the synergy between text and images, that is, seamlessly analyzing information from these modalities may be trivial for humans, but is challenging for software systems. In this dissertation we study problems where deciphering text-image synergy is crucial for finding solutions. We propose methods and ideas that establish semantic connections between text and images in multimodal contents, and empirically show their effectiveness in four interconnected problems: Image Retrieval, Image Tag Refinement, Image-Text Alignment, and Image Captioning. Our promising results and observations open up interesting scopes for future research involving text-image data understanding.},
}
Endnote
%0 Thesis
%A Nag Chowdhury, Sreyasi
%A referee: Weikum, Gerhard
%A referee: de Melo, Gerard
%A referee: Berberich, Klaus
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
Databases and Information Systems, MPI for Informatics, Max Planck Society
%T Text-image synergy for multimodal retrieval and annotation :
%G eng
%U http://hdl.handle.net/21.11116/0000-0009-428A-1
%R 10.22028/D291-34509
%U urn:nbn:de:bsz:291--ds-345092
%F OTHER: hdl:20.500.11880/31690
%I Universität des Saarlandes
%C Saarbrücken
%D 2021
%P 131 p.
%V phd
%9 phd
%X Text and images are the two most common data modalities found on the Internet. Understanding the synergy between text and images, that is, seamlessly analyzing information from these modalities may be trivial for humans, but is challenging for software systems. In this dissertation we study problems where deciphering text-image synergy is crucial for finding solutions. We propose methods and ideas that establish semantic connections between text and images in multimodal contents, and empirically show their effectiveness in four interconnected problems: Image Retrieval, Image Tag Refinement, Image-Text Alignment, and Image Captioning. Our promising results and observations open up interesting scopes for future research involving text-image data understanding.
%K image retrieval
image-text alignment
image captioning
commonsense knowledge
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/31690
[21]
M. Omran, “From Pixels to People,” Universität des Saarlandes, Saarbrücken, 2021.
Abstract
Humans are at the centre of a significant amount of research in computer vision. Endowing machines with the ability to perceive people from visual data is an immense scientific challenge with a high degree of direct practical relevance. Success in automatic perception can be measured at different levels of abstraction, and this will depend on which intelligent behaviour we are trying to replicate: the ability to localise persons in an image or in the environment, understanding how persons are moving at the skeleton and at the surface level, interpreting their interactions with the environment including with other people, and perhaps even anticipating future actions. In this thesis we tackle different sub-problems of the broad research area referred to as "looking at people", aiming to perceive humans in images at different levels of granularity. We start with bounding box-level pedestrian detection: We present a retrospective analysis of methods published in the decade preceding our work, identifying various strands of research that have advanced the state of the art. With quantitative experiments, we demonstrate the critical role of developing better feature representations and having the right training distribution. We then contribute two methods based on the insights derived from our analysis: one that combines the strongest aspects of past detectors and another that focuses purely on learning representations. The latter method outperforms more complicated approaches, especially those based on hand-crafted features. We conclude our work on pedestrian detection with a forward-looking analysis that maps out potential avenues for future research. We then turn to pixel-level methods: Perceiving humans requires us to both separate them precisely from the background and identify their surroundings. To this end, we introduce Cityscapes, a large-scale dataset for street scene understanding. This has since established itself as a go-to benchmark for segmentation and detection. We additionally develop methods that relax the requirement for expensive pixel-level annotations, focusing on the task of boundary detection, i.e. identifying the outlines of relevant objects and surfaces. Next, we make the jump from pixels to 3D surfaces, from localising and labelling to fine-grained spatial understanding. We contribute a method for recovering 3D human shape and pose, which marries the advantages of learning-based and model-based approaches. We conclude the thesis with a detailed discussion of benchmarking practices in computer vision. Among other things, we argue that the design of future datasets should be driven by the general goal of combinatorial robustness besides task-specific considerations.
Export
BibTeX
@phdthesis{Omranphd2021,
TITLE = {From Pixels to People},
AUTHOR = {Omran, Mohamed},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291--ds-366053},
DOI = {10.22028/D291-36605},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
ABSTRACT = {Abstract<br>Humans are at the centre of a significant amount of research in computer vision.<br>Endowing machines with the ability to perceive people from visual data is an immense<br>scientific challenge with a high degree of direct practical relevance. Success in automatic<br>perception can be measured at different levels of abstraction, and this will depend on<br>which intelligent behaviour we are trying to replicate: the ability to localise persons in<br>an image or in the environment, understanding how persons are moving at the skeleton<br>and at the surface level, interpreting their interactions with the environment including<br>with other people, and perhaps even anticipating future actions. In this thesis we tackle<br>different sub-problems of the broad research area referred to as "looking at people",<br>aiming to perceive humans in images at different levels of granularity.<br>We start with bounding box-level pedestrian detection: We present a retrospective<br>analysis of methods published in the decade preceding our work, identifying various<br>strands of research that have advanced the state of the art. With quantitative exper-<br>iments, we demonstrate the cri