Selected Publications

Please refer to our ELLIS Unit's Google Scholar for a complete list of references, or to each individual researcher's Google Scholar page.

2025

Conference Papers

Can LLMs Replace Manual Annotation of Software Engineering Artifacts?

Toufique Ahmed, Premkumar Devanbu, Christoph Treude, Michael Pradel

International Conference on Mining Software Repositories (MSR), 2025.

@inproceedings{msr2025, author = {Ahmed, Toufique and Devanbu, Premkumar and Treude, Christoph and Pradel, Michael}, title = {Can LLMs Replace Manual Annotation of Software Engineering Artifacts?}, booktitle = {International Conference on Mining Software Repositories (MSR)}, year = {2025} }
RepairAgent: An Autonomous, LLM-Based Agent for Program Repair

Islem Bouzenia, Premkumar Devanbu, Michael Pradel

International Conference on Software Engineering (ICSE), 2025.

@inproceedings{icse2025-RepairAgent, author = {Bouzenia, Islem and Devanbu, Premkumar and Pradel, Michael}, title = {{RepairAgent}: An Autonomous, {LLM}-Based Agent for Program Repair}, booktitle = {International Conference on Software Engineering (ICSE)}, year = {2025}, primaryclass = {cs.SE} }
ChangeGuard: Validating Code Changes via Pairwise Learning-Guided Execution

Lars Gröninger, Beatriz Souza, Michael Pradel

International Conference on the Foundations of Software Engineering (FSE), 2025.

@inproceedings{fse2025-ChangeGuard, author = {Gröninger, Lars and Souza, Beatriz and Pradel, Michael}, title = {ChangeGuard: Validating Code Changes via Pairwise Learning-Guided Execution}, booktitle = {International Conference on the Foundations of Software Engineering (FSE)}, year = {2025} }
DAGE: DAG Query Answering via Relational Combinator with Logical Constraints

Yunjie He, Bo Xiong, Daniel Hernández, Yuqicheng Zhu, Evgeny Kharlamov, Steffen Staab

THE WEB CONFERENCE 2025, 2025.

Predicting answers to queries over knowledge graphs is called a complex reasoning task because answering a query requires subdividing it into subqueries. Existing query embedding methods use this decomposition to compute the embedding of a query as the combination of the embedding of the subqueries. This requirement limits the answerable queries to queries having a single free variable and being decomposable, which are called tree-form queries and correspond to the SROI^- description logic. In this paper, we define a more general set of queries, called DAG queries and formulated in the ALCOIR description logic, propose a query embedding method for them, called DAGE, and a new benchmark to evaluate query embeddings on them. Given the computational graph of a DAG query, DAGE combines the possibly multiple paths between two nodes into a single path with a trainable operator that represents the intersection of relations and learns DAG-DL from tautologies. We show that it is possible to implement DAGE on top of existing query embedding methods, and we empirically measure the improvement of our method over the results of vanilla methods evaluated in tree-form queries that approximate the DAG queries of our proposed benchmark.

Paper: https://openreview.net/forum?id=x1nXBzUknn

@inproceedings{he2025dage, author = {He, Yunjie and Xiong, Bo and Hern{\'a}ndez, Daniel and Zhu, Yuqicheng and Kharlamov, Evgeny and Staab, Steffen}, booktitle = {THE WEB CONFERENCE 2025}, language = {en}, preprinturl = {https://doi.org/10.48550/arXiv.2410.22105}, title = {DAGE: DAG Query Answering via Relational Combinator with Logical Constraints}, url = {https://openreview.net/forum?id=x1nXBzUknn}, year = {2025} }
Treefix: Enabling Execution with a Tree of Prefixes

Beatriz Souza, Michael Pradel

International Conference on Software Engineering (ICSE), 2025.

@inproceedings{icse2025-Treefix, author = {Souza, Beatriz and Pradel, Michael}, title = {Treefix: Enabling Execution with a Tree of Prefixes}, booktitle = {International Conference on Software Engineering (ICSE)}, year = {2025} }
Calibration and Correctness of Language Models for Code

Claudio Spiess, David Gros, Kunal Suresh Pai, Michael Pradel, Md Rafiqul Islam Rabin, Amin Alipour, Susmit Jha, Premkumar Devanbu, Toufique Ahmed

International Conference on Software Engineering (ICSE), 2025.

@inproceedings{icse2025-calibration, author = {Spiess, Claudio and Gros, David and Pai, Kunal Suresh and Pradel, Michael and Rabin, Md Rafiqul Islam and Alipour, Amin and Jha, Susmit and Devanbu, Premkumar and Ahmed, Toufique}, title = {Calibration and Correctness of Language Models for Code}, booktitle = {International Conference on Software Engineering (ICSE)}, year = {2025} }

2024

Journal Articles

AiroTouch: Enhancing Telerobotic Assembly through Naturalistic Haptic Feedback of Tool Vibrations

Yijie Gong, Haliza Mat Husin, Ecda Erol, Valerio Ortenzi, Katherine J. Kuchenbecker

Frontiers in Robotics and AI, 11, pp. 1–15, 2024.

doi: 10.3389/frobt.2024.1355205

@article{Gong24-FRAI-Enhancing, title = {Airo{T}ouch: Enhancing Telerobotic Assembly through Naturalistic Haptic Feedback of Tool Vibrations}, author = {Gong, Yijie and Husin, Haliza Mat and Erol, Ecda and Ortenzi, Valerio and Kuchenbecker, Katherine J.}, journal = {Frontiers in Robotics and AI}, volume = {11}, pages = {1--15}, year = {2024}, doi = {10.3389/frobt.2024.1355205} }
Mindful Explanations: Prevalence and Impact of Mind Attribution in XAI Research

Susanne Hindennach, Lei Shi, Filip Miletić, Andreas Bulling

Proceedings of the ACM on Human-Computer Interaction (PACM HCI), pp. 1–42, 2024.

doi: 10.1145/3641009

@article{hindennach24_pacm, title = {Mindful Explanations: Prevalence and Impact of Mind Attribution in XAI Research}, author = {Hindennach, Susanne and Shi, Lei and Miletic, Filip and Bulling, Andreas}, year = {2024}, pages = {1--42}, doi = {10.1145/3641009}, journal = {Proceedings of the ACM on Human-Computer Interaction (PACM HCI)} }
Pose2Gaze: Eye-body Coordination during Daily Activities for Gaze Prediction from Full-body Poses

Zhiming Hu, Jiahui Xu, Syn Schmitt, Andreas Bulling

IEEE Transactions on Visualization and Computer Graphics (TVCG), pp. 1–12, 2024.

@article{hu24_tvcg, author = {Hu, Zhiming and Xu, Jiahui and Schmitt, Syn and Bulling, Andreas}, title = {Pose2Gaze: Eye-body Coordination during Daily Activities for Gaze Prediction from Full-body Poses}, journal = {IEEE Transactions on Visualization and Computer Graphics (TVCG)}, year = {2024}, pages = {1--12} }
HOIMotion: Forecasting Human Motion During Human-Object Interactions Using Egocentric 3D Object Bounding Boxes

Zhiming Hu, Zheming Yin, Daniel Haeufle, Syn Schmitt, Andreas Bulling

IEEE Transactions on Visualization and Computer Graphics (TVCG), pp. 1–11, 2024.

@article{hu24_ismar, author = {Hu, Zhiming and Yin, Zheming and Haeufle, Daniel and Schmitt, Syn and Bulling, Andreas}, title = {HOIMotion: Forecasting Human Motion During Human-Object Interactions Using Egocentric 3D Object Bounding Boxes}, journal = {IEEE Transactions on Visualization and Computer Graphics (TVCG)}, year = {2024}, pages = {1--11} }
Robust Surface Recognition with the Maximum Mean Discrepancy: Degrading Haptic-Auditory Signals through Bandwidth and Noise

Behnam Khojasteh, Yitian Shao, Katherine J. Kuchenbecker

IEEE Transactions on Haptics, 17(1), pp. 58–65, 2024.

doi: 10.1109/TOH.2024.3356609

@article{Khojasteh24-TH-Discrepancy, title = {Robust Surface Recognition with the Maximum Mean Discrepancy: Degrading Haptic-Auditory Signals through Bandwidth and Noise}, author = {Khojasteh, Behnam and Shao, Yitian and Kuchenbecker, Katherine J.}, journal = {IEEE Transactions on Haptics}, volume = {17}, number = {1}, pages = {58--65}, year = {2024}, doi = {10.1109/TOH.2024.3356609} }
Semantics of Multiword Expressions in Transformer-Based Models: A Survey

Filip Miletić, Sabine Schulte im Walde

Transactions of the Association for Computational Linguistics, 12, pp. 593–612, 2024.

@article{Miletic/SchulteImWalde:24, author = {Mileti{\'c}, Filip and {Schulte im Walde}, Sabine}, title = {{Semantics of Multiword Expressions in Transformer-Based Models: A Survey}}, journal = {Transactions of the Association for Computational Linguistics}, year = {2024}, volume = {12}, pages = {593--612} }
Closing the Loop in Minimally Supervised Human-Robot Interaction: Formative and Summative Feedback

Mayumi Mohan, Cara M. Nunez, Katherine J. Kuchenbecker

Scientific Reports, 14(10564), pp. 1–18, 2024.

doi: 10.1038/s41598-024-60905-x

@article{Mohan24-SR-Closing, title = {Closing the Loop in Minimally Supervised Human-Robot Interaction: Formative and Summative Feedback}, author = {Mohan, Mayumi and Nunez, Cara M. and Kuchenbecker, Katherine J.}, journal = {Scientific Reports}, volume = {14}, number = {10564}, pages = {1--18}, year = {2024}, doi = {10.1038/s41598-024-60905-x} }
Cutaneous Electrohydraulic (CUTE) Wearable Devices for Pleasant Broad-Bandwidth Haptic Cues

Natalia Sanchez-Tamayo, Zachary Yoder, Philipp Rothemund, Giulia Ballardini, Christoph Keplinger, Katherine J. Kuchenbecker

Advanced Science, (2402461), pp. 1–14, 2024.

doi: 10.1002/advs.202402461

@article{Sanchez-Tamayo24-AS-CUTE, title = {Cutaneous Electrohydraulic ({CUTE}) Wearable Devices for Pleasant Broad-Bandwidth Haptic Cues}, author = {Sanchez-Tamayo, Natalia and Yoder, Zachary and Rothemund, Philipp and Ballardini, Giulia and Keplinger, Christoph and Kuchenbecker, Katherine J.}, journal = {Advanced Science}, number = {2402461}, pages = {1--14}, year = {2024}, doi = {10.1002/advs.202402461} }
L2XGNN: Learning to Explain Graph Neural Networks

Giuseppe Serra, Mathias Niepert

Machine Learning Journal, 2024.

@article{Serra2024, author = {Serra, Giuseppe and Niepert, Mathias}, title = {L2XGNN: Learning to Explain Graph Neural Networks}, journal = {Machine Learning Journal}, year = {2024} }
Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials

Viktor Zaverkin, David Holzmüller, Henrik Christiansen, Federico Errica, Francesco Alesiani, Makoto Takamoto, Mathias Niepert, Johannes Kästner

NPJ Computational Materials, 2024.

@article{Zaverkin2024, author = {Zaverkin, Viktor and Holzmüller, David and Christiansen, Henrik and Errica, Federico and Alesiani, Francesco and Takamoto, Makoto and Niepert, Mathias and Kästner, Johannes}, title = {Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials}, journal = {NPJ Computational Materials}, year = {2024} }
State augmented constrained reinforcement learning: Overcoming the limitations of learning with rewards

M. Calvo-Fullana, S. Paternain, L. F. O. Chamon, A. Ribeiro

IEEE Trans. on Autom. Control, 2024.

@article{Calvo-Fullana24s, author = {{Calvo-Fullana}, M. and Paternain, S. and Chamon, L. F. O. and Ribeiro, A.}, title = {State augmented constrained reinforcement learning: {O}vercoming the limitations of learning with rewards}, journal = {IEEE Trans. on Autom. Control}, year = {2024}, arxiv = {\url{https://arxiv.org/abs/2102.11941}}, keywords = {journal} }

Conference Papers

VD-GR: Boosting Visual Dialog with Cascaded Spatial-Temporal Multi-Modal GRaphs

Adnen Abdessaied, Lei Shi, Andreas Bulling

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 5805–5814, 2024.

@inproceedings{abdessaied24_wacv, author = {Abdessaied, Adnen and Shi, Lei and Bulling, Andreas}, title = {VD-GR: Boosting Visual Dialog with Cascaded Spatial-Temporal Multi-Modal GRaphs}, booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)}, year = {2024}, pages = {5805--5814} }
OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog

Adnen Abdessaied, Manuel von Hochmeister, Andreas Bulling

Proceedings of the 31st Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), pp. 1–11, 2024.

@inproceedings{abdessaied24_coling, author = {Abdessaied, Adnen and von Hochmeister, Manuel and Bulling, Andreas}, title = {OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog}, booktitle = {Proceedings of the 31st Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING)}, year = {2024}, pages = {1--11} }
NPCS: Native Provenance Computation for SPARQL

Zubaria Asma, Daniel Hernández, Luis Galárraga, Giorgos Flouris, Irini Fundulaki, Katja Hose

Proceedings of the ACM Web Conference 2024 (WWW ’24), May 13–17, 2024, Singapore, Singapore, 2024.

The popularity of Knowledge Graphs (KGs) both in industry and academia owes credit to their flexible data model, suitable for data integration from multiple sources. Several KG-based applications such as trust assessment or view maintenance on dynamic data rely on the ability to compute provenance explanations for query results. The how-provenance of a query result is an expression that encodes the records (triples or facts) that explain its inclusion in the result set. This article proposes NPCS, a Native Provenance Computation approach for SPARQL queries. NPCS annotates query results with their how-provenance. By building upon spm-provenance semirings, NPCS supports both monotonic and non-monotonic SPARQL queries. Thanks to its reliance on query rewriting techniques, the approach is directly applicable to already deployed SPARQL engines using different reification schemes – including RDF-star. Our experimental evaluation on two popular SPARQL engines (GraphDB and Stardog) shows that our novel query rewriting brings a significant runtime improvement over existing query rewriting solutions, scaling to RDF graphs with billions of triples.

doi: 10.1145/3589334.3645557

Paper: https://doi.org/10.1145/3589334.3645557

@inproceedings{zubaria2024native, author = {Asma, Zubaria and Hernández, Daniel and Galárraga, Luis and Flouris, Giorgos and Fundulaki, Irini and Hose, Katja}, booktitle = {Proceedings of the ACM Web Conference 2024 (WWW '24), May 13--17, 2024, Singapore, Singapore}, doi = {10.1145/3589334.3645557}, eventdate = {2024-05-13/2024-05-17}, eventtitle = {WWW '24}, isbn = {979-8-4007-0171-9}, language = {English}, publisher = {ACM}, title = {NPCS: Native Provenance Computation for SPARQL}, url = {https://doi.org/10.1145/3589334.3645557}, venue = {Singapore}, year = {2024} }
Limits of Theory of Mind Modelling in Dialogue-Based Collaborative Plan Acquisition

Matteo Bortoletto, Constantin Ruhdorfer, Adnen Abdessaied, Lei Shi, Andreas Bulling

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 1–16, 2024.

@inproceedings{bortoletto24_acl, author = {Bortoletto, Matteo and Ruhdorfer, Constantin and Abdessaied, Adnen and Shi, Lei and Bulling, Andreas}, title = {Limits of Theory of Mind Modelling in Dialogue-Based Collaborative Plan Acquisition}, booktitle = {Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL)}, year = {2024}, pages = {1--16} }
Neural Reasoning About Agents’ Goals, Preferences, and Actions

Matteo Bortoletto, Lei Shi, Andreas Bulling

Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI), pp. 1–13, 2024.

We propose the Intuitive Reasoning Network (IRENE) – a novel neural model for intuitive psychological reasoning about agents’ goals, preferences, and actions that can generalise previous experiences to new situations. IRENE combines a graph neural network for learning agent and world state representations with a transformer to encode the task context. When evaluated on the challenging Baby Intuitions Benchmark, IRENE achieves new state-of-the-art performance on three out of its five tasks – with up to 48.9% improvement. In contrast to existing methods, IRENE is able to bind preferences to specific agents, to better distinguish between rational and irrational agents, and to better understand the role of blocking obstacles. We also investigate, for the first time, the influence of the training tasks on test performance. Our analyses demonstrate the effectiveness of IRENE in combining prior knowledge gained during training for unseen evaluation tasks.

Code: https://git.hcics.simtech.uni-stuttgart.de/public-projects/IRENE

@inproceedings{bortoletto24_aaai, author = {Bortoletto, Matteo and Shi, Lei and Bulling, Andreas}, title = {Neural Reasoning About Agents’ Goals, Preferences, and Actions}, booktitle = {Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI)}, year = {2024}, pages = {1--13} }
Constrained sampling with primal-dual Langevin Monte Carlo

L. F. O. Chamon, M. R. K. Jaghargh, A. Korba

Conference on Neural Information Processing Systems (NeurIPS), 2024.

@inproceedings{Chamon24c, author = {Chamon, L. F. O. and Jaghargh, M. R. K. and Korba, A.}, title = {Constrained sampling with primal-dual {L}angevin {M}onte {C}arlo}, booktitle = {Conference on Neural Information Processing Systems~(NeurIPS)}, year = {2024}, arxiv = {\url{https://arxiv.org/abs/2411.00568}} }
Willkommens-Merkel, Chaos-Johnson, and Tore-Klose: Modeling the Evaluative Meaning of German Personal Name Compounds

Annerose Eichel, Tana Deeg, André Blessing, Milena Belosevic, Sabine Arndt-Lappe, Sabine Schulte im Walde

arXiv preprint arXiv:2404.04031, 2024.

Paper: https://arxiv.org/abs/2404.04031

@inproceedings{eichel2024willkommensmerkelchaosjohnsontoreklosemodeling, title = {Willkommens-Merkel, Chaos-Johnson, and Tore-Klose: Modeling the Evaluative Meaning of German Personal Name Compounds}, author = {Eichel, Annerose and Deeg, Tana and Blessing, André and Belosevic, Milena and Arndt-Lappe, Sabine and {Schulte im Walde}, Sabine}, year = {2024}, eprint = {2404.04031}, archiveprefix = {arXiv}, primaryclass = {cs.CL}, url = {https://arxiv.org/abs/2404.04031} }
Near-optimal solutions of constrained learning problems

J. Elenter, L. F. O. Chamon, A. Ribeiro

International Conference on Learning Representations (ICLR), 2024.

@inproceedings{Elenter24n, author = {Elenter, J. and Chamon, L. F. O. and Ribeiro, A.}, title = {Near-optimal solutions of constrained learning problems}, booktitle = {International Conference on Learning Representations~(ICLR)}, year = {2024}, arxiv = {\url{https://arxiv.org/abs/2403.11844}}, keywords = {conf_ml} }
Tractable Probabilistic Graph Representation Learning with Graph-Induced Sum-Product Networks

Federico Errica, Mathias Niepert

Proceedings of the 12th International Conference on Learning Representations (ICLR 2024), 2024.

@inproceedings{Errica2024, author = {Errica, Federico and Niepert, Mathias}, title = {Tractable Probabilistic Graph Representation Learning with Graph-Induced Sum-Product Networks}, booktitle = {Proceedings of the 12th International Conference on Learning Representations (ICLR 2024)}, year = {2024} }
Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent Parametric Partial Differential Equations

Jan Hagnberger, Marimuthu Kalimuthu, Daniel Musekamp, Mathias Niepert

Proceedings of the 41st International Conference on Machine Learning (ICML 2024), 2024.

@inproceedings{Hagnberger2024a, author = {Hagnberger, Jan and Kalimuthu, Marimuthu and Musekamp, Daniel and Niepert, Mathias}, title = {Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent Parametric Partial Differential Equations}, booktitle = {Proceedings of the 41st International Conference on Machine Learning (ICML 2024)}, year = {2024} }
Image Inpainting via Tractable Steering of Diffusion Models

Anji Liu, Mathias Niepert, Guy Van den Broeck

Proceedings of the 12th International Conference on Learning Representations (ICLR 2024), 2024.

@inproceedings{Liu2024, author = {Liu, Anji and Niepert, Mathias and {Van den Broeck}, Guy}, title = {Image Inpainting via Tractable Steering of Diffusion Models}, booktitle = {Proceedings of the 12th International Conference on Learning Representations (ICLR 2024)}, year = {2024} }
What Can Diachronic Contexts and Topics Tell Us about the Present-Day Compositionality of English Noun Compounds?

Samin Mahdizadeh Sani, Malak Rassem, Chris Jenkins, Filip Miletić, Sabine Schulte im Walde

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 17449–17458, 2024.

Paper: https://aclanthology.org/2024.lrec-main.1517

@inproceedings{mahdizadeh-sani-etal-2024-diachronic, title = {What Can Diachronic Contexts and Topics Tell Us about the Present-Day Compositionality of {E}nglish Noun Compounds?}, author = {Mahdizadeh Sani, Samin and Rassem, Malak and Jenkins, Chris and Mileti{\'c}, Filip and {Schulte im Walde}, Sabine}, editor = {Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro and Sakti, Sakriani and Xue, Nianwen}, booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)}, year = {2024}, address = {Torino, Italia}, publisher = {ELRA and ICCL}, url = {https://aclanthology.org/2024.lrec-main.1517}, pages = {17449--17458} }
Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks

Duy Nguyen, Nina Lukashina, Tai Nguyen, An Le, TrungTin Nguyen, Nhat Ho, Jan Peters, Daniel Sonntag, Viktor Zaverkin, Mathias Niepert

Proceedings of the 41st International Conference on Machine Learning (ICML 2024), 2024.

@inproceedings{Nguyen2024, author = {Nguyen, Duy and Lukashina, Nina and Nguyen, Tai and Le, An and Nguyen, TrungTin and Ho, Nhat and Peters, Jan and Sonntag, Daniel and Zaverkin, Viktor and Niepert, Mathias}, title = {Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks}, booktitle = {Proceedings of the 41st International Conference on Machine Learning (ICML 2024)}, year = {2024} }
HGE: Embedding Temporal Knowledge Graphs in a Product Space of Heterogeneous Geometric Subspaces

J. Pan, M. Nayyeri, Y. Li, S. Staab

Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI), pp. 1–13, 2024.

@inproceedings{pan24_aaai, author = {Pan, J. and Nayyeri, M. and Li, Y. and Staab, S.}, title = {HGE: Embedding Temporal Knowledge Graphs in a Product Space of Heterogeneous Geometric Subspaces}, booktitle = {Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI)}, year = {2024}, pages = {1--13} }
Navigating Open Set Scenarios for Skeleton-based Action Recognition

K. Peng, Y. Cheng, J. Zheng, R. Liu, D. Schneider, J. Zhang, K. Yang, M. S. Sarfraz, R. Stiefelhagen, A. Roitberg

Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI), pp. 1–13, 2024.

@inproceedings{peng24_aaai, author = {Peng, K. and Cheng, Y. and Zheng, J. and Liu, R. and Schneider, D. and Zhang, J. and Yang, K. and Sarfraz, M. S. and Stiefelhagen, R. and Roitberg, A.}, title = {Navigating Open Set Scenarios for Skeleton-based Action Recognition}, booktitle = {Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI)}, year = {2024}, pages = {1--13} }
Advancing Open-Set Domain Generalization Using Evidential Bi-Level Hardest Domain Scheduler

Kunyu Peng, Di Wen, Kailun Yang, Ao Luo, Yufan Chen, Jia Fu, M. Saquib Sarfraz, Alina Roitberg, Rainer Stiefelhagen

Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024), 2024.

Paper: https://arxiv.org/abs/2409.17555

@inproceedings{peng2024advancingopensetdomaingeneralization, title = {Advancing Open-Set Domain Generalization Using Evidential Bi-Level Hardest Domain Scheduler}, author = {Peng, Kunyu and Wen, Di and Yang, Kailun and Luo, Ao and Chen, Yufan and Fu, Jia and Sarfraz, M. Saquib and Roitberg, Alina and Stiefelhagen, Rainer}, booktitle = {Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024)}, year = {2024}, eprint = {2409.17555}, archiveprefix = {arXiv}, primaryclass = {cs.LG}, url = {https://arxiv.org/abs/2409.17555} }
VOLIMET: A Parallel Corpus of Literal and Metaphorical Verb-Object Pairs for English–German and English–French

Prisca Piccirilli, Alexander Fraser, Sabine Schulte im Walde

Proceedings of the 13th Joint Conference on Lexical and Computational Semantics (*SEM 2024), pp. 222–237, 2024.

doi: 10.18653/v1/2024.starsem-1.18

Paper: https://aclanthology.org/2024.starsem-1.18

@inproceedings{piccirilli-etal-2024-volimet, title = {{VOLIMET}: A Parallel Corpus of Literal and Metaphorical Verb-Object Pairs for {E}nglish{--}{G}erman and {E}nglish{--}{F}rench}, author = {Piccirilli, Prisca and Fraser, Alexander and {Schulte im Walde}, Sabine}, editor = {Bollegala, Danushka and Shwartz, Vered}, booktitle = {Proceedings of the 13th Joint Conference on Lexical and Computational Semantics (*SEM 2024)}, year = {2024}, address = {Mexico City, Mexico}, publisher = {Association for Computational Linguistics}, url = {https://aclanthology.org/2024.starsem-1.18}, doi = {10.18653/v1/2024.starsem-1.18}, pages = {222--237} }
Robust Knowledge Extraction from Large Language Models using Social Choice Theory

Nico Potyka, Yuqicheng Zhu, Yunjie He, Evgeny Kharlamov, Steffen Staab

Proceedings of the 23rd International Conference on Autonomous Agents and Multi-Agent Systems, 2024.

Large language models (LLMs) have the potential to support a wide range of applications like conversational agents, creative writing, text improvement, and general query answering. However, they are ill-suited for query answering in high-stakes domains like medicine because they generate answers at random and their answers are typically not robust: even the same query can result in different answers when prompted multiple times. In order to improve the robustness of LLM queries, we propose using ranking queries repeatedly and aggregating the query results using methods from social choice theory. We study ranking queries in diagnostic settings like medical and fault diagnosis and discuss how the Partial Borda Choice function from the literature can be applied to merge multiple query results. We discuss some additional interesting properties in our setting and evaluate the robustness of our approach empirically.

@inproceedings{noauthororeditor2023robust, author = {Potyka, Nico and Zhu, Yuqicheng and He, Yunjie and Kharlamov, Evgeny and Staab, Steffen}, booktitle = {Proceedings of the 23rd International Conference on Autonomous Agents and Multi-Agent Systems}, title = {Robust Knowledge Extraction from Large Language Models using Social Choice Theory}, year = {2024} }
Probabilistically Rewired Message-Passing Neural Networks

Chendi Qian, Andrei Manolache, Kareem Ahmed, Zhe Zeng, Guy Van den Broeck, Mathias Niepert, Christopher Morris

Proceedings of the 12th International Conference on Learning Representations (ICLR 2024), 2024.

@inproceedings{Qian2024, author = {Qian, Chendi and Manolache, Andrei and Ahmed, Kareem and Zeng, Zhe and {Van den Broeck}, Guy and Niepert, Mathias and Morris, Christopher}, title = {Probabilistically Rewired Message-Passing Neural Networks}, booktitle = {Proceedings of the 12th International Conference on Learning Representations (ICLR 2024)}, year = {2024} }
Probabilistic Graph Rewiring via Virtual Nodes

Chendi Qian, Andrei Manolache, Christopher Morris, Mathias Niepert

Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024), 2024.

Paper: https://arxiv.org/abs/2405.17311

@inproceedings{qian2024probabilisticgraphrewiringvirtual, title = {Probabilistic Graph Rewiring via Virtual Nodes}, author = {Qian, Chendi and Manolache, Andrei and Morris, Christopher and Niepert, Mathias}, booktitle = {Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024)}, year = {2024}, eprint = {2405.17311}, archiveprefix = {arXiv}, primaryclass = {cs.LG}, url = {https://arxiv.org/abs/2405.17311} }
More DWUGs: Extending and Evaluating Word Usage Graph Datasets in Multiple Languages

Dominik Schlechtweg, Pierluigi Cassotti, Bill Noble, David Alfter, Sabine Schulte im Walde, Nina Tahmasebi

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pp. 14379–14393, 2024.

doi: 10.18653/v1/2024.emnlp-main.796

Paper: https://aclanthology.org/2024.emnlp-main.796

@inproceedings{schlechtweg-etal-2024-dwugs, title = {More {DWUG}s: Extending and Evaluating Word Usage Graph Datasets in Multiple Languages}, author = {Schlechtweg, Dominik and Cassotti, Pierluigi and Noble, Bill and Alfter, David and {Schulte im Walde}, Sabine and Tahmasebi, Nina}, editor = {Al-Onaizan, Yaser and Bansal, Mohit and Chen, Yun-Nung}, booktitle = {Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing}, year = {2024}, address = {Miami, Florida, USA}, publisher = {Association for Computational Linguistics}, url = {https://aclanthology.org/2024.emnlp-main.796}, doi = {10.18653/v1/2024.emnlp-main.796}, pages = {14379--14393} }
The DURel Annotation Tool: Human and Computational Measurement of Semantic Proximity, Sense Clusters and Semantic Change

Dominik Schlechtweg, Shafqat Mumtaz Virk, Pauline Sander, Emma Sköldberg, Lukas Theuer Linke, Tuo Zhang, Nina Tahmasebi, Jonas Kuhn, Sabine Schulte im Walde

Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL): System Demonstrations. St. Julians, Malta, 2024.

Paper: https://arxiv.org/abs/2311.12664

@inproceedings{schlechtweg2024durelannotationtoolhuman, title = {The DURel Annotation Tool: Human and Computational Measurement of Semantic Proximity, Sense Clusters and Semantic Change}, author = {Schlechtweg, Dominik and Virk, Shafqat Mumtaz and Sander, Pauline and Sköldberg, Emma and Linke, Lukas Theuer and Zhang, Tuo and Tahmasebi, Nina and Kuhn, Jonas and {Schulte im Walde}, Sabine}, booktitle = {Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL): System Demonstrations. St. Julians, Malta}, year = {2024}, eprint = {2311.12664}, archiveprefix = {arXiv}, primaryclass = {cs.CL}, url = {https://arxiv.org/abs/2311.12664} }
From Shapes to Shapes: Inferring SHACL Shapes for Results of SPARQL CONSTRUCT Queries

Philipp Seifer, Daniel Hernández, Ralf Lämmel, Steffen Staab

Proceedings of the ACM Web Conference 2024, WWW 2024, Singapore, 13 - 17 May 2024, 2024.

SPARQL CONSTRUCT queries allow for the specification of data processing pipelines that transform given input graphs into new output graphs. It is now common to constrain graphs through SHACL shapes allowing users to understand which data they can expect and which not. However, it becomes challenging to understand what graph data can be expected at the end of a data processing pipeline without knowing the particular input data: Shape constraints on the input graph may affect the output graph, but may no longer apply literally, and new shapes may be imposed by the query template. In this paper, we study the derivation of shape constraints that hold on all possible output graphs of a given SPARQL CONSTRUCT query. We assume that the SPARQL CONSTRUCT query is fixed, e.g., being part of a program, whereas the input graphs adhere to input shape constraints but may otherwise vary over time and, thus, are mostly unknown. We study a fragment of SPARQL CONSTRUCT queries (SCCQ) and a fragment of SHACL (Simple SHACL). We formally define the problem of deriving the most restrictive set of Simple SHACL shapes that constrain the results from evaluating a SCCQ over any input graph restricted by a given set of Simple SHACL shapes. We propose and implement an algorithm that statically analyses input SHACL shapes and CONSTRUCT queries and prove its soundness and complexity.

@inproceedings{seifer2024shapes, author = {Seifer, Philipp and Hernández, Daniel and Lämmel, Ralf and Staab, Steffen}, booktitle = {Proceedings of the {ACM} Web Conference 2024, {WWW} 2024, Singapore, 13 - 17 May 2024}, eventdate = {13 May 2024 - 17 May 2024}, publisher = {ACM}, title = {From Shapes to Shapes: Inferring SHACL Shapes for Results of SPARQL CONSTRUCT Queries}, venue = {Singapore}, year = {2024} }
Unveiling the Mystery of Visual Attributes of Concrete and Abstract Concepts: Variability, Nearest Neighbors, and Challenging Categories

Tarun Tater, Sabine Schulte im Walde, Diego Frassinelli

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024.

Links BibTeX Project

Paper: https://arxiv.org/abs/2410.11657

@inproceedings{tater2024unveilingmysteryvisualattributes, title = {Unveiling the Mystery of Visual Attributes of Concrete and Abstract Concepts: Variability, Nearest Neighbors, and Challenging Categories}, author = {Tater, Tarun and Schulte im Walde, Sabine and Frassinelli, Diego}, booktitle = {Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing}, year = {2024}, eprint = {2410.11657}, archiveprefix = {arXiv}, primaryclass = {cs.CL}, url = {https://arxiv.org/abs/2410.11657} }
Accelerating Transformers with Spectrum-Preserving Token Merging

Hoai-Chau Tran, Duy M. H. Nguyen, Duy M. Nguyen, Trung-Tin Nguyen, Ngan Le, Pengtao Xie, Daniel Sonntag, James Y. Zou, Binh T. Nguyen, Mathias Niepert

Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS2024), 2024.

Links BibTeX Project

Paper: https://arxiv.org/abs/2405.16148

@inproceedings{tran2024acceleratingtransformersspectrumpreservingtoken, title = {Accelerating Transformers with Spectrum-Preserving Token Merging}, author = {Tran, Hoai-Chau and Nguyen, Duy M. H. and Nguyen, Duy M. and Nguyen, Trung-Tin and Le, Ngan and Xie, Pengtao and Sonntag, Daniel and Zou, James Y. and Nguyen, Binh T. and Niepert, Mathias}, booktitle = {Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS2024)}, year = {2024}, eprint = {2405.16148}, archiveprefix = {arXiv}, primaryclass = {cs.LG}, url = {https://arxiv.org/abs/2405.16148} }
SalChartQA: Question-driven Saliency on Information Visualisations

Yao Wang, Weitian Wang, Abdullah Abdelhafez, Mayar Elfares, Zhiming Hu, Mihai Bâce, Andreas Bulling

Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 1–14, 2024.

Links BibTeX Project

doi: 10.1145/3613904.3642942

@inproceedings{wang24_chi, title = {SalChartQA: Question-driven Saliency on Information Visualisations}, author = {Wang, Yao and Wang, Weitian and Abdelhafez, Abdullah and Elfares, Mayar and Hu, Zhiming and B{\^a}ce, Mihai and Bulling, Andreas}, year = {2024}, pages = {1--14}, booktitle = {Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI)}, doi = {10.1145/3613904.3642942} }
Fuzz4all: Universal fuzzing with large language models

Chunqiu Steven Xia, Matteo Paltenghi, Jia Le Tian, Michael Pradel, Lingming Zhang

Proceedings of the IEEE/ACM ICSE, 2024.

BibTeX Project

@inproceedings{xia2024fuzz4all, title = {Fuzz4all: Universal fuzzing with large language models}, author = {Xia, Chunqiu Steven and Paltenghi, Matteo and Le Tian, Jia and Pradel, Michael and Zhang, Lingming}, booktitle = {Proceedings of the IEEE/ACM ICSE}, year = {2024} }
NestE: Modeling Nested Relational Structures for Knowledge Graph Reasoning

B. Xiong, M. Nayyeri, L. Luo, Z. Wang, S. Pan, S. Staab

Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI), pp. 1–13, 2024.

Links BibTeX Project

@inproceedings{xiong24_aaai, author = {Xiong, B. and Nayyeri, M. and Luo, L. and Wang, Z. and Pan, S. and Staab, S.}, title = {NestE: Modeling Nested Relational Structures for Knowledge Graph Reasoning}, booktitle = {Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI)}, year = {2024}, pages = {1--13}, doi = {} }
Higher-Rank Irreducible Cartesian Tensors for Equivariant Message Passing

Viktor Zaverkin, Francesco Alesiani, Takashi Maruyama, Federico Errica, Henrik Christiansen, Makoto Takamoto, Nicolas Weber, Mathias Niepert

Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS2024), 2024.

Links BibTeX Project

Paper: https://arxiv.org/abs/2405.14253

@inproceedings{zaverkin2024higherrankirreduciblecartesiantensors, title = {Higher-Rank Irreducible Cartesian Tensors for Equivariant Message Passing}, author = {Zaverkin, Viktor and Alesiani, Francesco and Maruyama, Takashi and Errica, Federico and Christiansen, Henrik and Takamoto, Makoto and Weber, Nicolas and Niepert, Mathias}, booktitle = {Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS2024)}, year = {2024}, eprint = {2405.14253}, archiveprefix = {arXiv}, primaryclass = {cs.LG}, url = {https://arxiv.org/abs/2405.14253} }
Mouse2Vec: Learning Reusable Semantic Representations of Mouse Behaviour

Guanhua Zhang, Zhiming Hu, Mihai Bâce, Andreas Bulling

Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 1–17, 2024.

Links BibTeX Project

doi: 10.1145/3613904.3642141

@inproceedings{zhang24_chi, title = {Mouse2Vec: Learning Reusable Semantic Representations of Mouse Behaviour}, author = {Zhang, Guanhua and Hu, Zhiming and B{\^a}ce, Mihai and Bulling, Andreas}, year = {2024}, pages = {1--17}, booktitle = {Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI)}, doi = {10.1145/3613904.3642141} }

Miscellaneous

Demonstration: Minsight - A Soft Vision-Based Tactile Sensor for Robotic Fingertips

Iris Andrussow, Huanbo Sun, Georg Martius, Katherine J. Kuchenbecker

2024.

Links BibTeX Project

@misc{Andrussow24-CORLD-Minsight, title = {Demonstration: Minsight - A Soft Vision-Based Tactile Sensor for Robotic Fingertips}, author = {Andrussow, Iris and Sun, Huanbo and Martius, Georg and Kuchenbecker, Katherine J.}, howpublished = {Hands-on demonstration presented at the Conference on Robot Learning (CoRL)}, address = {Munich, Germany}, year = {2024}, doi = {} }
Demonstration: OCRA - A Kinematic Retargeting Algorithm for Expressive Whole-Arm Teleoperation

Mayumi Mohan, Katherine J. Kuchenbecker

2024.

Links BibTeX Project

@misc{Mohan24-CORLD-Algorithm, title = {Demonstration: {OCRA} - A Kinematic Retargeting Algorithm for Expressive Whole-Arm Teleoperation}, author = {Mohan, Mayumi and Kuchenbecker, Katherine J.}, howpublished = {Hands-on demonstration presented at the Conference on Robot Learning (CoRL)}, address = {Munich, Germany}, year = {2024}, doi = {} }

2023

Journal Articles

Minsight: A Fingertip-Sized Vision-Based Tactile Sensor for Robotic Manipulation

Iris Andrussow, Huanbo Sun, Katherine J. Kuchenbecker, Georg Martius

Advanced Intelligent Systems, 2023.

Links BibTeX Project

doi: 10.1002/aisy.202300042

@article{Andrussow23-AIS-Minsight, title = {Minsight: A Fingertip-Sized Vision-Based Tactile Sensor for Robotic Manipulation}, author = {Andrussow, Iris and Sun, Huanbo and Kuchenbecker, Katherine J. and Martius, Georg}, journal = {Advanced Intelligent Systems}, year = {2023}, doi = {10.1002/aisy.202300042} }
Constrained Learning with Non-Convex Losses

L. F. O. Chamon, S. Paternain, M. Calvo-Fullana, A. Ribeiro

IEEE Trans. on Inf. Theory, 69(3), pp. 1739–1760, 2023.

Links BibTeX Project

doi: 10.1109/TIT.2022.3187948

@article{Chamon23c, author = {Chamon, L. F. O. and Paternain, S. and {Calvo-Fullana}, M. and Ribeiro, A.}, title = {Constrained Learning with Non-Convex Losses}, journal = {IEEE Trans. on Inf. Theory}, volume = {69}, number = {3}, pages = {1739--1760}, year = {2023}, arxiv = {\url{https://arxiv.org/abs/2103.05134}}, doi = {10.1109/TIT.2022.3187948}, keywords = {journal} }
A framework and benchmark for deep batch active learning for regression

David Holzmüller, Viktor Zaverkin, Johannes Kästner, Ingo Steinwart

Journal of Machine Learning Research, 24(164), pp. 1–81, 2023.

BibTeX Project

@article{holzmuller_framework_2023, title = {A framework and benchmark for deep batch active learning for regression}, author = {Holzmüller, David and Zaverkin, Viktor and Kästner, Johannes and Steinwart, Ingo}, year = {2023}, journal = {Journal of Machine Learning Research}, volume = {24}, number = {164}, pages = {1--81} }
Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test

Behnam Khojasteh, Friedrich Solowjow, Sebastian Trimpe, Katherine J. Kuchenbecker

IEEE Transactions on Automation Science and Engineering, pp. 1–16, 2023.

Links BibTeX Project

doi: 10.1109/TASE.2023.3296569

@article{Khojasteh23-TASE-Recognition, title = {Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test}, author = {Khojasteh, Behnam and Solowjow, Friedrich and Trimpe, Sebastian and Kuchenbecker, Katherine J.}, journal = {IEEE Transactions on Automation Science and Engineering}, pages = {1--16}, year = {2023}, doi = {10.1109/TASE.2023.3296569} }
Predicting the Force Map of an ERT-Based Tactile Sensor Using Simulation and Deep Networks

Hyosang Lee, Huanbo Sun, Hyunkyu Park, Gokhan Serhat, Bernard Javot, Georg Martius, Katherine J. Kuchenbecker

IEEE Transactions on Automation Science and Engineering, 20(1), pp. 425–439, 2023.

Links BibTeX Project

doi: 10.1109/TASE.2022.3156184

@article{Lee23-TASE-Map, title = {Predicting the Force Map of an {ERT}-Based Tactile Sensor Using Simulation and Deep Networks}, author = {Lee, Hyosang and Sun, Huanbo and Park, Hyunkyu and Serhat, Gokhan and Javot, Bernard and Martius, Georg and Kuchenbecker, Katherine J.}, journal = {IEEE Transactions on Automation Science and Engineering}, volume = {20}, number = {1}, pages = {425--439}, year = {2023}, doi = {10.1109/TASE.2022.3156184} }
Distributed Universal Adaptive Networks

C. G. Lopes, V. H. Nascimento, L. F. O. Chamon

IEEE Trans. on Signal Process., 71, pp. 1817–1832, 2023.

Links BibTeX Project

doi: 10.1109/TSP.2023.3275812

@article{Lopes23d, author = {Lopes, C. G. and Nascimento, V. H. and Chamon, L. F. O.}, title = {Distributed Universal Adaptive Networks}, journal = {IEEE Trans. on Signal Process.}, volume = {71}, pages = {1817--1832}, year = {2023}, arxiv = {\url{https://arxiv.org/abs/2307.05746}}, doi = {10.1109/TSP.2023.3275812}, keywords = {journal} }
SCENE: Reasoning about Traffic Scenes using Heterogeneous Graph Neural Networks

Thomas Monninger, Julian Schmidt, Jan Rupprecht, David Raba, Julian Jordan, Daniel Frank, Steffen Staab, Klaus Dietmayer

IEEE Robotics and Automation Letters, pp. 1–8, 2023.

Abstract Links BibTeX Project

Understanding traffic scenes requires considering heterogeneous information about dynamic agents and the static infrastructure. In this work we propose SCENE, a methodology to encode diverse traffic scenes in heterogeneous graphs and to reason about these graphs using a heterogeneous Graph Neural Network encoder and task-specific decoders. The heterogeneous graphs, whose structures are defined by an ontology, consist of different nodes with type-specific node features and different relations with type-specific edge features. In order to exploit all the information given by these graphs, we propose to use cascaded layers of graph convolution. The result is an encoding of the scene. Task-specific decoders can be applied to predict desired attributes of the scene. Extensive evaluation on two diverse binary node classification tasks shows the main strength of this methodology: despite being generic, it even manages to outperform task-specific baselines. The further application of our methodology to the task of node classification in various knowledge graphs shows its transferability to other domains.

doi: 10.1109/LRA.2023.3234771

@article{monninger23_ral, title = {SCENE: Reasoning about Traffic Scenes using Heterogeneous Graph Neural Networks}, author = {Monninger, Thomas and Schmidt, Julian and Rupprecht, Jan and Raba, David and Jordan, Julian and Frank, Daniel and Staab, Steffen and Dietmayer, Klaus}, year = {2023}, journal = {IEEE Robotics and Automation Letters}, pages = {1--8}, doi = {10.1109/LRA.2023.3234771} }
Safe policies for reinforcement learning via primal-dual methods

S. Paternain, M. Calvo-Fullana, L. F. O. Chamon, A. Ribeiro

IEEE Trans. on Autom. Control., 68(3), pp. 1321–1336, 2023.

Links BibTeX Project

doi: 10.1109/TAC.2022.3152724

@article{Paternain23s, author = {Paternain, S. and {Calvo-Fullana}, M. and Chamon, L. F. O. and Ribeiro, A.}, title = {Safe policies for reinforcement learning via primal-dual methods}, journal = {IEEE Trans. on Autom. Control.}, year = {2023}, volume = {68}, number = {3}, pages = {1321--1336}, arxiv = {\url{https://arxiv.org/abs/1911.09101}}, doi = {10.1109/TAC.2022.3152724}, keywords = {journal} }
Transferability Properties of Graph Neural Networks

L. Ruiz, L. F. O. Chamon, A. Ribeiro

IEEE Trans. on Signal Process., 71, pp. 3474–3489, 2023.

Links BibTeX Project

doi: 10.1109/TSP.2023.3297848

@article{Ruiz23t, author = {Ruiz, L. and Chamon, L. F. O. and Ribeiro, A.}, title = {Transferability Properties of Graph Neural Networks}, journal = {IEEE Trans. on Signal Process.}, year = {2023}, volume = {71}, pages = {3474--3489}, arxiv = {\url{https://arxiv.org/abs/2112.04629}}, doi = {10.1109/TSP.2023.3297848}, keywords = {journal} }
Adaptive Clustering Using Kernel Density Estimators

I. Steinwart, B.K. Sriperumbudur, P. Thomann

Journal of Machine Learning Research, 24, pp. 1–56, 2023.

BibTeX Project

@article{StSrTh23a, author = {Steinwart, I. and Sriperumbudur, B.K. and Thomann, P.}, title = {Adaptive Clustering Using Kernel Density Estimators}, year = {2023}, journal = {Journal of Machine Learning Research}, volume = {24}, pages = {1--56} }

Conference Papers

SIMPLE: A Gradient Estimator for k-Subset Sampling

Kareem Ahmed, Zhe Zeng, Mathias Niepert, Guy Van den Broeck

Proceedings of the 11th International Conference on Learning Representations (ICLR), 2023.

BibTeX Project

@inproceedings{ahmed2023simple, title = {SIMPLE: A Gradient Estimator for k-Subset Sampling}, author = {Ahmed, Kareem and Zeng, Zhe and Niepert, Mathias and Van den Broeck, Guy}, year = {2023}, booktitle = {Proceedings of the 11th International Conference on Learning Representations (ICLR)} }
ReLiNet: Stable and Explainable Multistep Prediction with Recurrent Linear Parameter Varying Networks

Alexandra Baier, Decky Aspandi, Steffen Staab

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI), 2023.

Abstract BibTeX Project

Multistep prediction models are essential for the simulation and model-predictive control of dynamical systems. Verifying the safety of such models is a multi-faceted problem requiring both system-theoretic guarantees as well as establishing trust with human users. In this work, we propose a novel approach, ReLiNet (Recurrent Linear Parameter Varying Network), to ensure safety for multistep prediction of dynamical systems. Our approach simplifies a recurrent neural network to a switched linear system that is constrained to guarantee exponential stability, which acts as a surrogate for safety from a system-theoretic perspective. Furthermore, ReLiNet’s computation can be reduced to a single linear model for each time step, resulting in predictions that are explainable by definition, thereby establishing trust from a human-centric perspective. Our quantitative experiments show that ReLiNet achieves prediction accuracy comparable to that of state-of-the-art recurrent neural networks, while achieving more faithful and robust explanations compared to the model-agnostic explanation method of LIME.

@inproceedings{Baier2023, title = {ReLiNet: Stable and Explainable Multistep Prediction with Recurrent Linear Parameter Varying Networks}, author = {Baier, Alexandra and Aspandi, Decky and Staab, Steffen}, year = {2023}, booktitle = {Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI)}, publisher = {International Joint Conferences on Artificial Intelligence Organization}, added-at = {2023-06-13T10:22:08.000+0000}, biburl = {https://puma.ub.uni-stuttgart.de/bibtex/258e3d8a67f20a1766353bcb2fdf910b1/alexbaier}, interhash = {2e996c3803dd3030b150002c1cabf6a1}, intrahash = {58e3d8a67f20a1766353bcb2fdf910b1}, keywords = {}, timestamp = {2023-06-13T10:22:37.000+0000} }
When to Say What: Learning to Find Condition-Message Inconsistencies

Islem Bouzenia, Michael Pradel

Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), pp. 868–880, 2023.

BibTeX Project

@inproceedings{bouzenia2023say, title = {When to Say What: Learning to Find Condition-Message Inconsistencies}, author = {Bouzenia, Islem and Pradel, Michael}, booktitle = {Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)}, pages = {868--880}, year = {2023}, organization = {IEEE} }
Wear Your Heart on Your Sleeve: Users Prefer Robots with Emotional Reactions to Touch and Ambient Moods

Rachael Bevill Burns, Fayokemi Ojo, Katherine J. Kuchenbecker

Proceedings of the IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), pp. 1914–1921, 2023.

Links BibTeX Project

doi: 10.1109/RO-MAN57019.2023.10309358

@inproceedings{Burns23-ROMAN-Heart, title = {Wear Your Heart on Your Sleeve: Users Prefer Robots with Emotional Reactions to Touch and Ambient Moods}, author = {Burns, Rachael Bevill and Ojo, Fayokemi and Kuchenbecker, Katherine J.}, booktitle = {Proceedings of the IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)}, pages = {1914--1921}, address = {Busan, South Korea}, year = {2023}, doi = {10.1109/RO-MAN57019.2023.10309358} }
Learning Globally Smooth Functions on Manifolds

J. Cervino, L. F. O. Chamon, B. D. Haeffele, R. Vidal, A. Ribeiro

Proceedings of the International Conference on Machine Learning (ICML), 2023.

BibTeX Project

@inproceedings{Cervino23l, author = {Cervino, J. and Chamon, L. F. O. and Haeffele, B. D. and Vidal, R. and Ribeiro, A.}, title = {Learning Globally Smooth Functions on Manifolds}, booktitle = {Proceedings of the International Conference on Machine Learning~(ICML)}, year = {2023}, arxiv = {\url{https://arxiv.org/abs/2210.00301}}, keywords = {conf_ml} }
Beware of the unexpected: Bimodal taint analysis

Yiu Wai Chow, Max Schäfer, Michael Pradel

Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 211–222, 2023.

BibTeX Project

@inproceedings{chow2023beware, title = {Beware of the unexpected: Bimodal taint analysis}, author = {Chow, Yiu Wai and Sch{\"a}fer, Max and Pradel, Michael}, booktitle = {Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis}, pages = {211--222}, year = {2023} }
Made of Steel? Learning Plausible Materials for Components in the Vehicle Repair Domain

Annerose Eichel, Sabine Schulte im Walde

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023.

Abstract Links BibTeX Project

@inproceedings{eichel23_eacl, title = {Made of Steel? Learning Plausible Materials for Components in the Vehicle Repair Domain}, author = {Eichel, Annerose and Schulte im Walde, Sabine}, year = {2023}, booktitle = {Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL)}, pages = {}, preprint = {} }
Reconstructing Signing Avatars from Video Using Linguistic Priors

Maria-Paola Forte, Peter Kulits, Chun-Hao Paul Huang, Vasileios Choutas, Dimitrios Tzionas, Katherine J. Kuchenbecker, Michael J. Black

Proceedings of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 12791–12801, 2023.

Links BibTeX Project

@inproceedings{Forte23-CVPR-SGNify, title = {Reconstructing Signing Avatars from Video Using Linguistic Priors}, author = {Forte, Maria-Paola and Kulits, Peter and Huang, Chun-Hao Paul and Choutas, Vasileios and Tzionas, Dimitrios and Kuchenbecker, Katherine J. and Black, Michael J.}, booktitle = {Proceedings of the IEEE/CVF Conf.~on Computer Vision and Pattern Recognition (CVPR)}, pages = {12791--12801}, year = {2023}, doi = {} }
Learning Disentangled Discrete Representations

David Friede, Christian Reimers, Heiner Stuckenschmidt, Mathias Niepert

Proceedings of the 34th European Conference on Machine Learning (ECML 2023), 2023.

BibTeX Project

@inproceedings{friede2023learning, title = {Learning Disentangled Discrete Representations}, author = {Friede, David and Reimers, Christian and Stuckenschmidt, Heiner and Niepert, Mathias}, year = {2023}, booktitle = {Proceedings of the 34th European Conference on Machine Learning (ECML 2023)} }
Link Prediction with Attention Applied on Multiple Knowledge Graph Embedding Models

Cosimo Gregucci, Mojtaba Nayyeri, Daniel Hernández, Steffen Staab

Proceedings of the ACM Web Conference, 2023.

Abstract Links BibTeX Project

@inproceedings{gregucci23_websci, title = {Link Prediction with Attention Applied on Multiple Knowledge Graph Embedding Models}, author = {Gregucci, Cosimo and Nayyeri, Mojtaba and Hernandez, Daniel and Staab, Steffen}, year = {2023}, booktitle = {Proceedings of the ACM Web Conference}, pages = {}, preprint = {} }
Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension

M. Haas, D. Holzmüller, U. von Luxburg, I. Steinwart

Advances in Neural Information Processing Systems, pp. 20763–20826, 2023.

BibTeX Project

@inproceedings{HaHoLuSt23a, author = {Haas, M. and Holzm\"{u}ller, D. and von Luxburg, U. and Steinwart, I.}, booktitle = {Advances in Neural Information Processing Systems}, pages = {20763--20826}, title = {Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension}, volume = {36}, year = {2023} }
CNVVE: Dataset and Benchmark for Classifying Non-verbal Voice Expressions

Ramin Hedeshy, Raphael Menges, Steffen Staab

Interspeech 2023, 2023.

Abstract BibTeX Project

Non-verbal voice expressions (NVVEs) have been adopted as a means of human-computer interaction in research studies. However, exploring non-verbal voice-based interactions has been constrained by the limited availability of suitable training data and computational methods for classifying such expressions, leading to a focus on simple binary inputs. We address this issue with a new dataset containing 950 audio samples comprising 6 classes of voice expressions. The data were collected from 42 speakers who donated voice recordings. The classifier was trained on the data using features derived from mel-spectrograms. Furthermore, we studied the effectiveness of data augmentation and improved over the baseline model accuracy significantly with a test accuracy of 96.6% in a 5-fold cross-validation. We have made CNVVE publicly accessible in the hope that it will serve as a benchmark for future research.

@inproceedings{hedeshy2023cnvve, title = {CNVVE: Dataset and Benchmark for Classifying Non-verbal Voice Expressions}, author = {Hedeshy, Ramin and Menges, Raphael and Staab, Steffen}, year = {2023}, added-at = {2023-06-09T10:48:42.000+0000}, biburl = {https://puma.ub.uni-stuttgart.de/bibtex/27c77d415becf2f5405d6520cb66a3548/analyticcomp}, eventdate = {August 20-24}, eventtitle = {Interspeech 2023}, interhash = {6adf2287678455a9fe702bdad6058b80}, intrahash = {7c77d415becf2f5405d6520cb66a3548}, keywords = {myown from:hedeshy}, timestamp = {2023-06-09T10:50:55.000+0000} }
Emotional Framing in the Spreading of False and True Claims

Akram Sadat Hosseini, Steffen Staab

Proceedings of the 15th ACM Web Science Conference, 2023.

Abstract Links BibTeX Project

@inproceedings{hosseini23_websci, title = {Emotional Framing in the Spreading of False and True Claims}, author = {Hosseini, Akram Sadat and Staab, Steffen}, year = {2023}, booktitle = {Proceedings of the 15th ACM Web Science Conference}, pages = {}, preprint = {} }
Resilient Constrained Learning

I. Hounie, A. Ribeiro, L. F. O. Chamon

Conference on Neural Information Processing Systems (NeurIPS), 2023.

BibTeX Project

@inproceedings{Hounie23r, author = {Hounie, I. and Ribeiro, A. and Chamon, L. F. O.}, title = {Resilient Constrained Learning}, booktitle = {Conference on Neural Information Processing Systems~(NeurIPS)}, year = {2023}, arxiv = {\url{https://arxiv.org/abs/2306.02426}}, keywords = {conf_ml} }
To Split or Not to Split: Composing Compounds in Contextual Vector Spaces

Chris Jenkins, Filip Miletic, Sabine Schulte im Walde

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pp. 16131–16136, 2023.

Links BibTeX Project

doi: 10.18653/v1/2023.emnlp-main.1002

Paper: https://aclanthology.org/2023.emnlp-main.1002

@inproceedings{jenkins-etal-2023-split, title = {To Split or Not to Split: Composing Compounds in Contextual Vector Spaces}, author = {Jenkins, Chris and Miletic, Filip and Schulte im Walde, Sabine}, editor = {Bouamor, Houda and Pino, Juan and Bali, Kalika}, booktitle = {Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing}, year = {2023}, address = {Singapore}, publisher = {Association for Computational Linguistics}, url = {https://aclanthology.org/2023.emnlp-main.1002}, doi = {10.18653/v1/2023.emnlp-main.1002}, pages = {16131--16136} }
Investigating the Nature of Disagreements on Mid-Scale Ratings: A Case Study on the Abstractness-Concreteness Continuum

Urban Knupleš, Diego Frassinelli, Sabine Schulte im Walde

Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL), pp. 70–86, 2023.

Links BibTeX Project

doi: 10.18653/v1/2023.conll-1.6

Paper: https://aclanthology.org/2023.conll-1.6

@inproceedings{knuples-etal-2023-investigating, title = {Investigating the Nature of Disagreements on Mid-Scale Ratings: A Case Study on the Abstractness-Concreteness Continuum}, author = {Knuple{\v{s}}, Urban and Frassinelli, Diego and Schulte im Walde, Sabine}, editor = {Jiang, Jing and Reitter, David and Deng, Shumin}, booktitle = {Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL)}, year = {2023}, address = {Singapore}, publisher = {Association for Computational Linguistics}, url = {https://aclanthology.org/2023.conll-1.6}, doi = {10.18653/v1/2023.conll-1.6}, pages = {70--86} }
A Systematic Search for Compound Semantics in Pretrained BERT Architectures

Filip Miletic, Sabine Schulte im Walde

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023.

Abstract Links BibTeX Project

@inproceedings{miletic23_eacl, title = {A Systematic Search for Compound Semantics in Pretrained BERT Architectures}, author = {Miletic, Filip and Schulte im Walde, Sabine}, year = {2023}, booktitle = {Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL)}, pages = {}, preprint = {} }
Adaptive Perturbation-Based Gradient Estimation for Discrete Latent Variable Models

Pasquale Minervini, Luca Franceschi, Mathias Niepert

Proceedings of the 37th Conference on Artificial Intelligence (AAAI), 2023.

BibTeX Project

@inproceedings{minervini2023adaptive, title = {Adaptive Perturbation-Based Gradient Estimation for Discrete Latent Variable Models}, author = {Minervini, Pasquale and Franceschi, Luca and Niepert, Mathias}, year = {2023}, booktitle = {Proceedings of the 37th Conference on Artificial Intelligence (AAAI)} }
Vulgen: Realistic vulnerability generation via pattern mining and deep learning

Yu Nong, Yuzhe Ou, Michael Pradel, Feng Chen, Haipeng Cai

Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), pp. 2527–2539, 2023.

BibTeX Project

@inproceedings{nong2023vulgen, title = {Vulgen: Realistic vulnerability generation via pattern mining and deep learning}, author = {Nong, Yu and Ou, Yuzhe and Pradel, Michael and Chen, Feng and Cai, Haipeng}, booktitle = {Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)}, pages = {2527--2539}, year = {2023}, organization = {IEEE} }
Estimating the Contamination Factor’s Distribution in Unsupervised Anomaly Detection

Lorenzo Perini, Paul Buerkner, Arto Klami

Proceedings of the International Conference on Machine Learning (ICML), 2023.

BibTeX Project

@inproceedings{perini2023estimating, title = {Estimating the Contamination Factor's Distribution in Unsupervised Anomaly Detection}, author = {Perini, Lorenzo and Buerkner, Paul and Klami, Arto}, year = {2023}, booktitle = {Proceedings of the International Conference on Machine Learning~(ICML)} }
JANA: Jointly Amortized Neural Approximation of Complex Bayesian Models

Stefan T Radev, Marvin Schmitt, Valentin Pratz, Umberto Picchini, Ullrich Köthe, Paul-Christian Bürkner

Proceedings of the UAI Conference, 2023.

BibTeX Project

@inproceedings{radev2023jana, title = {JANA: Jointly Amortized Neural Approximation of Complex Bayesian Models}, author = {Radev, Stefan T and Schmitt, Marvin and Pratz, Valentin and Picchini, Umberto and K{\"o}the, Ullrich and B{\"u}rkner, Paul-Christian}, year = {2023}, booktitle = {Proceedings of the UAI Conference} }
Meta-Uncertainty in Bayesian Model Comparison

Marvin Schmitt, Stefan T Radev, Paul-Christian Bürkner

Proceedings of the AISTATS Conference, 2023.

BibTeX Project

@inproceedings{schmitt2023meta, title = {Meta-Uncertainty in Bayesian Model Comparison}, author = {Schmitt, Marvin and Radev, Stefan T and B{\"u}rkner, Paul-Christian}, year = {2023}, booktitle = {Proceedings of the AISTATS Conference} }
Lexecutor: Learning-guided execution

Beatriz Souza, Michael Pradel

Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 1522–1534, 2023.

BibTeX Project

@inproceedings{souza2023lexecutor, title = {Lexecutor: Learning-guided execution}, author = {Souza, Beatriz and Pradel, Michael}, booktitle = {Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering}, pages = {1522--1534}, year = {2023} }
Learning Neural PDE Solvers with Parameter-Guided Channel Attention

Makoto Takamoto, Francesco Alesiani, Mathias Niepert

Proceedings of the International Conference on Machine Learning (ICML), 2023.

BibTeX Project

@inproceedings{takamoto2023learning, title = {Learning Neural PDE Solvers with Parameter-Guided Channel Attention}, author = {Takamoto, Makoto and Alesiani, Francesco and Niepert, Mathias}, year = {2023}, booktitle = {Proceedings of the International Conference on Machine Learning~(ICML)} }

2022

Journal Articles

Improved Classification Rates for Localized SVMs

Ingrid Blaschzyk, Ingo Steinwart

Journal of Machine Learning Research, 23(165), pp. 1–59, 2022.

Abstract Links BibTeX Project

Localized support vector machines solve SVMs on many spatially defined small chunks and besides their computational benefit compared to global SVMs one of their main characteristics is the freedom of choosing arbitrary kernel and regularization parameter on each cell. We take advantage of this observation to derive global learning rates for localized SVMs with Gaussian kernels and hinge loss. It turns out that our rates outperform under suitable sets of assumptions known classification rates for localized SVMs, for global SVMs, and other learning algorithms based on e.g., plug-in rules or trees. The localized SVM rates are achieved under a set of margin conditions, which describe the behavior of the data-generating distribution, and no assumption on the existence of a density is made. Moreover, we show that our rates are obtained adaptively, that is without knowing the margin parameters in advance. The statistical analysis of the excess risk relies on a simple partitioning based technique, which splits the input space into a subset that is close to the decision boundary and into a subset that is sufficiently far away. A crucial condition to derive then improved global rates is a margin condition that relates the distance to the decision boundary to the amount of noise.

@article{blaschzyk22_jmlr, title = {Improved Classification Rates for Localized SVMs}, author = {Blaschzyk, Ingrid and Steinwart, Ingo}, year = {2022}, journal = {Journal of Machine Learning Research}, volume = {23}, number = {165}, pages = {1--59} }
In the Arms of a Robot: Designing Autonomous Hugging Robots with Intra-Hug Gestures

Alexis E. Block, Hasti Seifi, Otmar Hilliges, Roger Gassert, Katherine J. Kuchenbecker

ACM Transactions on Human-Robot Interaction Special Issue on Designing the Robot Body: Critical Perspectives on Affective Embodied Interaction (THRI), 2022.

Links BibTeX Project

@article{block22_THRI, title = {In the Arms of a Robot: Designing Autonomous Hugging Robots with Intra-Hug Gestures}, author = {Block, Alexis E. and Seifi, Hasti and Hilliges, Otmar and Gassert, Roger and Kuchenbecker, Katherine J.}, year = {2022}, journal = {ACM Transactions on Human-Robot Interaction Special Issue on Designing the Robot Body: Critical Perspectives on Affective Embodied Interaction (THRI)} }
Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent

David Holzmüller, Ingo Steinwart

Journal of Machine Learning Research, 23(181), pp. 1–82, 2022.

Abstract Links BibTeX Project

We prove that two-layer (Leaky)ReLU networks initialized by e.g. the widely used method proposed by He et al. (2015) and trained using gradient descent on a least-squares loss are not universally consistent. Specifically, we describe a large class of one-dimensional data-generating distributions for which, with high probability, gradient descent only finds a bad local minimum of the optimization landscape, since it is unable to move the biases far away from their initialization at zero. It turns out that in these cases, the found network essentially performs linear regression even if the target function is non-linear. We further provide numerical evidence that this happens in practical situations, for some multi-dimensional distributions and that stochastic gradient descent exhibits similar behavior. We also provide empirical results on how the choice of initialization and optimizer can influence this behavior.

@article{holzmueller22_jmlr, title = {Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent}, author = {Holzmüller, David and Steinwart, Ingo}, year = {2022}, journal = {Journal of Machine Learning Research}, volume = {23}, number = {181}, pages = {1--82} }
Predicting the Force Map of an ERT-Based Tactile Sensor Using Simulation and Deep Networks

Hyosang Lee, Huanbo Sun, Hyunkyu Park, Gokhan Serhat, Bernard Javot, Georg Martius, Katherine J. Kuchenbecker

IEEE Transactions on Automation Science and Engineering (TASE), 2022.

Abstract Links BibTeX Project

Electrical resistance tomography (ERT) can be used to create large-scale soft tactile sensors that are flexible and robust. Good performance requires a fast and accurate mapping from the sensor’s sequential voltage measurements to the distribution of force across its surface. However, particularly with multiple contacts, this task is challenging for both previously developed approaches: physics-based modeling and end-to-end data-driven learning. Some promising results were recently achieved using sim-to-real transfer learning, but estimating multiple contact locations and accurate contact forces remains difficult because simulations tend to be less accurate with a high number of contact locations and/or high force. This paper introduces a modular hybrid method that combines simulation data synthesized from an electromechanical finite element model with real measurements collected from a new ERT-based tactile sensor. We use about 290,000 simulated and 90,000 real measurements to train two deep neural networks: the first (Transfer-Net) captures the inevitable gap between simulation and reality, and the second (Recon-Net) reconstructs contact forces from voltage measurements. The number of contacts, contact locations, force magnitudes, and contact diameters are evaluated for a manually collected multi-contact dataset of 150 measurements. Our modular pipeline’s results outperform predictions by both a physics-based model and end-to-end learning.

doi: 10.1109/TASE.2022.3156184

@article{lee22_TASE, title = {Predicting the Force Map of an ERT-Based Tactile Sensor Using Simulation and Deep Networks}, author = {Lee, Hyosang and Sun, Huanbo and Park, Hyunkyu and Serhat, Gokhan and Javot, Bernard and Martius, Georg and Kuchenbecker, Katherine J.}, year = {2022}, journal = {IEEE Transactions on Automation Science and Engineering (TASE)}, doi = {10.1109/TASE.2022.3156184} }
Neural Software Analysis

Michael Pradel, Satish Chandra

Communications of the ACM, 65(1), pp. 86–96, 2022.

Abstract Links BibTeX Project

Developer tools that use a neural machine learning model to make predictions about previously unseen code.

doi: 10.1145/3460348

@article{pradel22_cacm, title = {Neural Software Analysis}, author = {Pradel, Michael and Chandra, Satish}, year = {2022}, journal = {Communications of the ACM}, volume = {65}, number = {1}, pages = {86--96}, doi = {10.1145/3460348} }
A Soft Thumb-sized Vision-based Sensor with Accurate All-round Force Perception

Huanbo Sun, Katherine J. Kuchenbecker, Georg Martius

Nature Machine Intelligence, 4, 2022.

Abstract Links BibTeX Project

Vision-based haptic sensors have emerged as a promising approach to robotic touch due to affordable high-resolution cameras and successful computer vision techniques; however, their physical design and the information they provide do not yet meet the requirements of real applications. We present a robust, soft, low-cost, vision-based, thumb-sized three-dimensional haptic sensor named Insight, which continually provides a directional force-distribution map over its entire conical sensing surface. Constructed around an internal monocular camera, the sensor has only a single layer of elastomer over-moulded on a stiff frame to guarantee sensitivity, robustness and soft contact. Furthermore, Insight uniquely combines photometric stereo and structured light using a collimator to detect the three-dimensional deformation of its easily replaceable flexible outer shell. The force information is inferred by a deep neural network that maps images to the spatial distribution of three-dimensional contact force (normal and shear). Insight has an overall spatial resolution of 0.4 mm, a force magnitude accuracy of around 0.03 N and a force direction accuracy of around five degrees over a range of 0.03–2 N for numerous distinct contacts with varying contact area. The presented hardware and software design concepts can be transferred to a wide variety of robot parts.

doi: 10.1038/s42256-021-00439-3

@article{sun22_NMI, title = {A Soft Thumb-sized Vision-based Sensor with Accurate All-round Force Perception}, author = {Sun, Huanbo and Kuchenbecker, Katherine J. and Martius, Georg}, year = {2022}, journal = {Nature Machine Intelligence}, volume = {4}, doi = {10.1038/s42256-021-00439-3}, organization = {Max Planck Institute for Intelligent Systems} }
Distributional Measures of Semantic Abstraction

Sabine Schulte im Walde, Diego Frassinelli

Frontiers in Artificial Intelligence: Language and Computation, 4(796756), 2022.

Abstract Links BibTeX Project

This article provides an in-depth study of distributional measures for distinguishing between degrees of semantic abstraction. Abstraction is considered a “central construct in cognitive science” (Barsalou, 2003) and a “process of information reduction that allows for efficient storage and retrieval of central knowledge” (Burgoon et al., 2013). Relying on the distributional hypothesis, computational studies have successfully exploited measures of contextual co-occurrence and neighbourhood density to distinguish between conceptual semantic categorisations. So far, these studies have modeled semantic abstraction across lexical-semantic tasks such as ambiguity; diachronic meaning changes; abstractness vs. concreteness; and hypernymy. Yet, the distributional approaches target different conceptual types of semantic relatedness, and as to our knowledge not much attention has been paid to apply, compare or analyse the computational abstraction measures across conceptual tasks. The current article suggests a novel perspective that exploits variants of distributional measures to investigate semantic abstraction in English in terms of the abstract–concrete dichotomy (e.g., glory–banana) and in terms of the generality–specificity distinction (e.g., animal–fish), in order to compare the strengths and weaknesses of the measures regarding categorisations of abstraction, and to determine and investigate conceptual differences. In a series of experiments we identify reliable distributional measures for both instantiations of lexical-semantic abstraction and reach a precision higher than 0.7, but the measures clearly differ for the abstract–concrete vs. abstract–specific distinctions and for nouns vs. verbs. 
Overall, we identify two groups of measures, (i) frequency and word entropy when distinguishing between more and less abstract words in terms of the generality–specificity distinction, and (ii) neighbourhood density variants (especially target–context diversity) when distinguishing between more and less abstract words in terms of the abstract–concrete dichotomy. We conclude that more general words are used more often and are less surprising than more specific words, and that abstract words establish themselves empirically in semantically more diverse contexts than concrete words. Finally, our experiments once more point out that distributional models of conceptual categorisations need to take word classes and ambiguity into account: results for nouns vs. verbs differ in many respects, and ambiguity hinders fine-tuning empirical observations.

doi: 10.3389/frai.2021.796756

@article{schulteimwalde22_fai, title = {Distributional Measures of Semantic Abstraction}, author = {{Schulte im Walde}, Sabine and Frassinelli, Diego}, year = {2022}, journal = {Frontiers in Artificial Intelligence: Language and Computation}, volume = {4}, number = {796756}, doi = {10.3389/frai.2021.796756} }

Conference Papers

Neuro-Symbolic Visual Dialog

Adnen Abdessaied, Mihai Bâce, Andreas Bulling

Proceedings of the 29th International Conference on Computational Linguistics (COLING), pp. 1–11, 2022.

Abstract Links BibTeX Project

We propose Neuro-Symbolic Visual Dialog (NSVD), the first method to combine deep learning and symbolic program execution for multi-round visually-grounded reasoning. NSVD significantly outperforms existing purely-connectionist methods on two key challenges inherent to visual dialog: long-distance co-reference resolution as well as vanishing question-answering performance. We demonstrate the latter by proposing a more realistic and stricter evaluation scheme in which we use predicted answers for the full dialog history when calculating accuracy. We describe two variants of our model and show that using this new scheme, our best model achieves an accuracy of 99.72% on CLEVR-Dialog (a relative improvement of more than 10% over the state of the art) while only requiring a fraction of training data. Moreover, we demonstrate that our neuro-symbolic models have a higher mean first failure round, are more robust against incomplete dialog histories, and generalise better not only to dialogs that are up to three times longer than those seen during training but also to unseen question types and scenes.

Preprint: https://perceptualui.org/publications/abdessaied22_coling/

@inproceedings{abdessaied22_coling, title = {Neuro-Symbolic Visual Dialog}, author = {Abdessaied, Adnen and Bâce, Mihai and Bulling, Andreas}, year = {2022}, booktitle = {Proceedings of the 29th International Conference on Computational Linguistics (COLING)}, pages = {1--11}, preprint = {https://perceptualui.org/publications/abdessaied22_coling/} }
Tensor-based Graph Modularity for Text Data Clustering

Rafika Boutalbi, Mira Ait-Saada, Anastasiia Iurshina, Steffen Staab, Mohamed Nadif

Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), pp. 1–5, 2022.

Abstract BibTeX Project

Graphs are used in several applications to represent similarities between instances. For text data, we can represent texts by different features such as bag-of-words, static embeddings (Word2vec, GloVe, etc.), and contextual embeddings (BERT, RoBERTa, etc.), leading to multiple similarities (or graphs) based on each representation. The proposal posits that incorporating the local invariance within every graph and the consistency across different graphs leads to a consensus clustering that improves the document clustering. This problem is complex and challenged with the sparsity and the noisy data included in each graph. To this end, we rely on the modularity metric, which effectively evaluates graph clustering in such circumstances. Therefore, we present a novel approach for text clustering based on both a sparse tensor representation and graph modularity. This leads to cluster texts (nodes) while capturing information arising from the different graphs. We iteratively maximize a Tensor-based Graph Modularity criterion. Extensive experiments on benchmark text clustering datasets are performed, showing that the proposed algorithm, referred to as Tensor Graph Modularity (TGM), outperforms other baseline methods in terms of clustering task. The source code is available at https://github.com/TGMclustering/TGMclustering.

@inproceedings{boutalbi22_sigir, title = {Tensor-based Graph Modularity for Text Data Clustering}, author = {Boutalbi, Rafika and Ait-Saada, Mira and Iurshina, Anastasiia and Staab, Steffen and Nadif, Mohamed}, year = {2022}, booktitle = {Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR)}, pages = {1--5} }
Projection Predictive Inference for Generalized Linear and Additive Multilevel Models

Alejandro Catalina, Paul-Christian Bürkner, Aki Vehtari

Proceedings of the Artificial Intelligence and Statistics (AISTATS) Conference Proceedings, pp. 1–23, 2022.

Abstract BibTeX Project

Projection predictive inference is a decision theoretic Bayesian approach that decouples model estimation from decision making. Given a reference model previously built including all variables present in the data, projection predictive inference projects its posterior onto a constrained space of a subset of variables. Variable selection is then performed by sequentially adding relevant variables until predictive performance is satisfactory. Previously, projection predictive inference has been demonstrated only for generalized linear models (GLMs) and Gaussian processes (GPs) where it showed superior performance to competing variable selection procedures. In this work, we extend projection predictive inference to support variable and structure selection for generalized linear multilevel models (GLMMs) and generalized additive multilevel models (GAMMs). Our simulative and real-world experiments demonstrate that our method can drastically reduce the model complexity required to reach reference predictive performance and achieve good frequency properties.

@inproceedings{catalina22_aistats, title = {Projection Predictive Inference for Generalized Linear and Additive Multilevel Models}, author = {Catalina, Alejandro and Bürkner, Paul-Christian and Vehtari, Aki}, year = {2022}, booktitle = {Proceedings of the Artificial Intelligence and Statistics (AISTATS) Conference Proceedings}, pages = {1--23} }
CrystalBLEU: Precisely and Efficiently Measuring the Similarity of Code

Aryaz Eghbali, Michael Pradel

Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 1–12, 2022.

Abstract BibTeX Project ACM SIGSOFT Distinguished Paper Award

Recent years have brought a surge of work on predicting pieces of source code, e.g., for code completion, code migration, program repair, or translating natural language into code. All this work faces the challenge of evaluating the quality of a prediction w.r.t. some oracle, typically in the form of a reference solution. A common evaluation metric is the BLEU score, an n-gram-based metric originally proposed for evaluating natural language translation, but adopted in software engineering because it can be easily computed on any programming language and enables automated evaluation at scale. However, a key difference between natural and programming languages is that in the latter, completely unrelated pieces of code may have many common n-grams simply because of the syntactic verbosity and coding conventions of programming languages. We observe that these trivially shared n-grams hamper the ability of the metric to distinguish between truly similar code examples and code examples that are merely written in the same language. This paper presents CrystalBLEU, an evaluation metric based on BLEU, that allows for precisely and efficiently measuring the similarity of code. Our metric preserves the desirable properties of BLEU, such as being language-agnostic, able to handle incomplete or partially incorrect code, and efficient, while reducing the noise caused by trivially shared n-grams. We evaluate CrystalBLEU on two datasets from prior work and on a new, labeled dataset of semantically equivalent programs. Our results show that CrystalBLEU can distinguish similar from dissimilar code examples 1.9–4.5 times more effectively, when compared to the original BLEU score and a previously proposed variant of BLEU for code.

@inproceedings{eghbali22_ase, title = {CrystalBLEU: Precisely and Efficiently Measuring the Similarity of Code}, author = {Eghbali, Aryaz and Pradel, Michael}, year = {2022}, booktitle = {Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (ASE)}, pages = {1--12} }
Investigating Independence vs. Control: Agenda-Setting in Russian News Coverage on Social Media

Annerose Eichel, Gabriella Lapesa, Sabine Schulte im Walde

Proceedings of the 13th International Conference on Language Resources and Evaluation (LREC), pp. 5314–5323, 2022.

Abstract Links BibTeX Project

Agenda-setting is a widely explored phenomenon in political science: powerful stakeholders (governments or their financial supporters) have control over the media and set their agenda: political and economical powers determine which news should be salient. This is a clear case of targeted manipulation to divert the public attention from serious issues affecting internal politics (such as economic downturns and scandals) by flooding the media with potentially distracting information. We investigate agenda-setting in the Russian social media landscape, exploring the relation between economic indicators and mentions of foreign geopolitical entities, as well as of Russia itself. Our contributions are at three levels: at the level of the domain of the investigation, our study is the first to substructure the Russian media landscape in state-controlled vs. independent outlets in the context of strategic distraction from negative economic trends; at the level of the scope of the investigation, we involve a large set of geopolitical entities (while previous work has focused on the U.S.); at the qualitative level, our analysis of posts on Ukraine, whose relationship with Russia is of high geopolitical relevance, provides further insights into the contrast between state-controlled and independent outlets.

@inproceedings{eichel23_lrec, title = {Investigating Independence vs. Control: Agenda-Setting in Russian News Coverage on Social Media}, author = {Eichel, Annerose and Lapesa, Gabriella and {Schulte im Walde}, Sabine}, year = {2022}, booktitle = {Proceedings of the 13th International Conference on Language Resources and Evaluation (LREC)}, pages = {5314--5323} }
BenchIE: A Framework for Multi-Faceted Fact-Based Open Information Extraction Evaluation

Kiril Gashteovski, Mingying Yu, Bhushan Kotnis, Carolin Lawrence, Mathias Niepert, Goran Glavaš

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022.

Abstract Links BibTeX Project

Intrinsic evaluations of OIE systems are carried out either manually – with human evaluators judging the correctness of extractions – or automatically, on standardized benchmarks. The latter, while much more cost-effective, is less reliable, primarily because of the incompleteness of the existing OIE benchmarks: the ground truth extractions do not include all acceptable variants of the same fact, leading to unreliable assessment of models’ performance. Moreover, the existing OIE benchmarks are available for English only. In this work, we introduce BenchIE: a benchmark and evaluation framework for comprehensive evaluation of OIE systems for English, Chinese and German. In contrast to existing OIE benchmarks, BenchIE takes into account informational equivalence of extractions: our gold standard consists of fact synsets, clusters in which we exhaustively list all surface forms of the same fact. We benchmark several state-of-the-art OIE systems using BenchIE and demonstrate that these systems are significantly less effective than indicated by existing OIE benchmarks. We make BenchIE (data and evaluation code) publicly available.

@inproceedings{gashteovski22_acl, title = {BenchIE: A Framework for Multi-Faceted Fact-Based Open Information Extraction Evaluation}, author = {Gashteovski, Kiril and Yu, Mingying and Kotnis, Bhushan and Lawrence, Carolin and Niepert, Mathias and Glavaš, Goran}, year = {2022}, booktitle = {Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL)} }
Modular and Iterative Multilingual Open Information Extraction

Bhushan Kotnis, Kiril Gashteovski, Daniel Onoro Rubio, Ammar Shaker, Vanesa Rodriguez-Tembras, Makoto Takamoto, Mathias Niepert, Carolin Lawrence

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022.

Abstract Links BibTeX Project

Open Information Extraction (OpenIE) is the task of extracting (subject, predicate, object) triples from natural language sentences. Current OpenIE systems extract all triple slots independently. In contrast, we investigate the hypothesis that it may be beneficial to extract triple slots iteratively: first extract easy slots, followed by the difficult ones by conditioning on the easy slots, and therefore achieve a better overall extraction. Based on this hypothesis, we propose a neural OpenIE system, MILLIE, that operates in an iterative fashion. Due to the iterative nature, the system is also modular: it is possible to seamlessly integrate rule based extraction systems with a neural end-to-end system, thereby allowing rule based systems to supply extraction slots which MILLIE can leverage for extracting the remaining slots. We confirm our hypothesis empirically: MILLIE outperforms SOTA systems on multiple languages ranging from Chinese to Arabic. Additionally, we are the first to provide an OpenIE test dataset for Arabic.

Paper: https://openreview.net/pdf?id=KNqKOUnl_3F

@inproceedings{kotnis22_acl, title = {Modular and Iterative Multilingual Open Information Extraction}, author = {Kotnis, Bhushan and Gashteovski, Kiril and Rubio, Daniel Onoro and Shaker, Ammar and Rodriguez-Tembras, Vanesa and Takamoto, Makoto and Niepert, Mathias and Lawrence, Carolin}, year = {2022}, booktitle = {Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL)}, url = {https://openreview.net/pdf?id=KNqKOUnl_3F} }
Finding the Dwarf: Recovering Precise Types from WebAssembly Binaries

Daniel Lehmann, Michael Pradel

Proceedings of the 43rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pp. 1–16, 2022.

Abstract Links BibTeX Project

The increasing popularity of WebAssembly creates a demand for understanding and reverse engineering WebAssembly binaries. Recovering high-level function types is an important part of this process. One method to recover types is data-flow analysis, but it is complex to implement and may require manual heuristics when logical constraints fall short. In contrast, this paper presents SnowWhite, a learning-based approach for recovering precise, high-level parameter and return types for WebAssembly functions. It improves over prior work on learning-based type recovery by representing the types-to-predict in an expressive type language, which can describe a large number of complex types, instead of the fixed, and usually small type vocabulary used previously. Thus, recovery of a single type is no longer a classification task but sequence prediction, for which we build on the success of neural sequence-to-sequence models. We evaluate SnowWhite on a new, large-scale dataset of 6.3 million type samples extracted from 300,905 WebAssembly object files. The results show the type language is expressive, precisely describing 1,225 types instead of the 7 to 35 types considered in previous learning-based approaches. Despite this expressiveness, our type recovery has high accuracy, exactly matching 44.5% (75.2%) of all parameter types and 57.7% (80.5%) of all return types within the top-1 (top-5) predictions.

Preprint: https://software-lab.org/publications/pldi2022.pdf

@inproceedings{lehmann22_pldi, title = {Finding the Dwarf: Recovering Precise Types from WebAssembly Binaries}, author = {Lehmann, Daniel and Pradel, Michael}, year = {2022}, booktitle = {Proceedings of the 43rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)}, pages = {1--16}, preprint = {https://software-lab.org/publications/pldi2022.pdf} }
Generating Realistic Vulnerabilities via Neural Code Editing: An Empirical Study

Yu Nong, Yuzhe Ou, Michael Pradel, Feng Chen, Haipeng Cai

Proceedings of the ACM Symposium on the Foundations of Software Engineering (FSE), pp. 1–13, 2022.

Abstract Links BibTeX Project

The availability of large-scale, realistic vulnerability datasets is essential both for benchmarking existing techniques and for developing effective new data-driven approaches for software security. Yet such datasets are critically lacking. A promising solution is to generate such datasets by injecting vulnerabilities into real-world programs, which are richly available. Thus, in this paper, we explore the feasibility of vulnerability injection through neural code editing. With a synthetic dataset and a real-world one, we investigate the potential and gaps of three state-of-the-art neural code editors for vulnerability injection. We find that the studied editors have critical limitations on the real-world dataset, where the best accuracy is only 10.03%, versus 79.40% on the synthetic dataset. While the graph-based editors are more effective (successfully injecting vulnerabilities in up to 34.93% of real-world testing samples) than the sequence-based one (0 success), they still suffer from complex code structures and fall short for long edits due to their insufficient designs of the preprocessing and deep learning (DL) models. We reveal the promise of neural code editing for generating realistic vulnerable samples, as they help boost the effectiveness of DL-based vulnerability detectors by up to 49.51% in terms of F1 score. We also provide insights into the gaps in current editors (e.g., they are good at deleting but not at replacing code) and actionable suggestions for addressing them (e.g., designing effective editing primitives).

Preprint: https://software-lab.org/publications/fse2022_vuln_inj_study.pdf

@inproceedings{nong22_fse, title = {Generating Realistic Vulnerabilities via Neural Code Editing: An Empirical Study}, author = {Nong, Yu and Ou, Yuzhe and Pradel, Michael and Chen, Feng and Cai, Haipeng}, year = {2022}, booktitle = {Proceedings of the ACM Symposium on the Foundations of Software Engineering (FSE)}, pages = {1--13}, preprint = {https://software-lab.org/publications/fse2022_vuln_inj_study.pdf} }
Utilizing Expert Features for Contrastive Learning of Time-Series Representations

Manuel Nonnenmacher, Lukas Oldenburg, Ingo Steinwart, David Reeb

Proceedings of the 39th International Conference on Machine Learning (ICML), pp. 1–21, 2022.

Abstract BibTeX Project

We present an approach that incorporates expert knowledge for time-series representation learning. Our method employs expert features to replace the commonly used data transformations in previous contrastive learning approaches. We do this since time-series data frequently stems from the industrial or medical field where expert features are often available from domain experts, while transformations are generally elusive for time-series data. We start by proposing two properties that useful time-series representations should fulfill and show that current representation learning approaches do not ensure these properties. We therefore devise ExpCLR, a novel contrastive learning approach built on an objective that utilizes expert features to encourage both properties for the learned representation. Finally, we demonstrate on three real-world time-series datasets that ExpCLR surpasses several state-of-the-art methods for both unsupervised and semi-supervised representation learning.

@inproceedings{nonnenmacher22_icml, title = {Utilizing Expert Features for Contrastive Learning of Time-Series Representations}, author = {Nonnenmacher, Manuel and Oldenburg, Lukas and Steinwart, Ingo and Reeb, David}, year = {2022}, booktitle = {Proceedings of the 39th International Conference on Machine Learning (ICML)}, pages = {1--21} }
SOSP: Efficiently Capturing Global Correlations by Second-Order Structured Pruning

Manuel Nonnenmacher, Thomas Pfeil, Ingo Steinwart, David Reeb

Proceedings of the Tenth International Conference on Learning Representations (ICLR), pp. 1–24, 2022.

Abstract BibTeX Project

Pruning neural networks reduces inference time and memory costs. On standard hardware, these benefits will be especially prominent if coarse-grained structures, like feature maps, are pruned. We devise two novel saliency-based methods for second-order structured pruning (SOSP) which include correlations among all structures and layers. Our main method SOSP-H employs an innovative second-order approximation, which enables saliency evaluations by fast Hessian-vector products. SOSP-H thereby scales like a first-order method despite taking into account the full Hessian. We validate SOSP-H by comparing it to our second method SOSP-I that uses a well-established Hessian approximation, and to numerous state-of-the-art methods. While SOSP-H performs on par or better in terms of accuracy, it has clear advantages in terms of scalability and efficiency. This allowed us to scale SOSP-H to large-scale vision tasks, even though it captures correlations across all layers of the network. To underscore the global nature of our pruning methods, we evaluate their performance not only by removing structures from a pretrained network, but also by detecting architectural bottlenecks. We show that our algorithms allow to systematically reveal architectural bottlenecks, which we then remove to further increase the accuracy of the networks.

@inproceedings{nonnenmacher22_iclr, title = {SOSP: Efficiently Capturing Global Correlations by Second-Order Structured Pruning}, author = {Nonnenmacher, Manuel and Pfeil, Thomas and Steinwart, Ingo and Reeb, David}, year = {2022}, booktitle = {Proceedings of the Tenth International Conference on Learning Representations (ICLR)}, pages = {1--24} }
Robot, Pass Me the Tool: Handle Visibility Facilitates Task-oriented Handovers

Valerio Ortenzi, Maija Filipovica, Diar Abdlkarim, Tommaso Pardi, Chie Takahashi, Alan Wing, Massimiliano Di Luca, Katherine J. Kuchenbecker

Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 256–264, 2022.

A human handing over an object modulates their grasp and movements to accommodate their partner’s capabilities, which greatly increases the likelihood of a successful transfer. State-of-the-art robot behavior lacks this level of user understanding, resulting in interactions that force the human partner to shoulder the burden of adaptation. This paper investigates how visual occlusion of the object being passed affects the subjective perception and quantitative performance of the human receiver. We performed an experiment in virtual reality where seventeen participants were tasked with repeatedly reaching to take a tool from the hand of a robot; each of the three tested objects (hammer, screwdriver, scissors) was presented in a wide variety of poses. We carefully analysed the user’s hand and head motions, the time to grasp the object, and the chosen grasp location, as well as participants’ ratings of the grasp they just performed. Results show that initial visibility of the handle significantly increases the reported holdability and immediate usability of a tool. Furthermore, a robot that offers objects so that their handles are more occluded forces the receiver to spend more time in planning and executing the grasp and also lowers the probability that the tool will be grasped by the handle. Together these findings indicate that robots can more effectively support their human work partners by increasing the visibility of the intended grasp location of objects being passed.

doi: 10.5555/3523760.3523797

@inproceedings{ortenzi22_HRI, title = {Robot, Pass Me the Tool: Handle Visibility Facilitates Task-oriented Handovers}, author = {Ortenzi, Valerio and Filipovica, Maija and Abdlkarim, Diar and Pardi, Tommaso and Takahashi, Chie and Wing, Alan and Di Luca, Massimiliano and Kuchenbecker, Katherine J.}, year = {2022}, booktitle = {Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI)}, pages = {256--264}, doi = {10.5555/3523760.3523797} }
Nalin: Learning from Runtime Behavior to Find Name-Value Inconsistencies in Jupyter Notebooks

Jibesh Patra, Michael Pradel

Proceedings of the 44th IEEE/ACM International Conference on Software Engineering (ICSE), pp. 1–13, 2022.

Variable names are important to understand and maintain code. If a variable name and the value stored in the variable do not match, then the program suffers from a name-value inconsistency, which is due to one of two situations that developers may want to fix: Either a correct value is referred to through a misleading name, which negatively affects code understandability and maintainability, or the correct name is bound to a wrong value, which may cause unexpected runtime behavior. Finding name-value inconsistencies is hard because it requires an understanding of the meaning of names and knowledge about the values assigned to a variable at runtime. This paper presents Nalin, a technique to automatically detect name-value inconsistencies. The approach combines a dynamic analysis that tracks assignments of values to names with a neural machine learning model that predicts whether a name and a value fit together. To the best of our knowledge, this is the first work to formulate the problem of finding coding issues as a classification problem over names and runtime values. We apply Nalin to 106,652 real-world Python programs, where meaningful names are particularly important due to the absence of statically declared types. Our results show that the classifier detects name-value inconsistencies with high accuracy, that the warnings reported by Nalin have a precision of 80% and a recall of 76% w.r.t. a ground truth created in a user study, and that our approach complements existing techniques for finding coding issues.

Preprint: https://software-lab.org/publications/icse2022_Nalin.pdf

@inproceedings{patra22_icse, title = {Nalin: Learning from Runtime Behavior to Find Name-Value Inconsistencies in Jupyter Notebooks}, author = {Patra, Jibesh and Pradel, Michael}, year = {2022}, booktitle = {Proceedings of the 44th {IEEE/ACM} International Conference on Software Engineering ({ICSE})}, pages = {1--13}, preprint = {https://software-lab.org/publications/icse2022_Nalin.pdf} }
Ordered Subgraph Aggregation Networks

Chendi Qian, Gaurav Rattan, Floris Geerts, Christopher Morris, Mathias Niepert

Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS), 2022.

Numerous subgraph-enhanced graph neural networks (GNNs) have emerged recently, provably boosting the expressive power of standard (message-passing) GNNs. However, there is a limited understanding of how these approaches relate to each other and to the Weisfeiler–Leman hierarchy. Moreover, current approaches either use all subgraphs of a given size, sample them uniformly at random, or use hand-crafted heuristics instead of learning to select subgraphs in a data-driven manner. Here, we offer a unified way to study such architectures by introducing a theoretical framework and extending the known expressivity results of subgraph-enhanced GNNs. Concretely, we show that increasing subgraph size always increases the expressive power and develop a better understanding of their limitations by relating them to the established k-𝖶𝖫 hierarchy. In addition, we explore different approaches for learning to sample subgraphs using recent methods for backpropagating through complex discrete probability distributions. Empirically, we study the predictive performance of different subgraph-enhanced GNNs, showing that our data-driven architectures increase prediction accuracy on standard benchmark datasets compared to non-data-driven subgraph-enhanced graph neural networks while reducing computation time.

@inproceedings{qian22_neurips, title = {Ordered Subgraph Aggregation Networks}, author = {Qian, Chendi and Rattan, Gaurav and Geerts, Floris and Morris, Christopher and Niepert, Mathias}, year = {2022}, booktitle = {Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS)} }
PDEBENCH: An Extensive Benchmark for Scientific Machine Learning

Makoto Takamoto, Timothy Praditia, Raphael Leiteritz, Dan MacKinlay, Francesco Alesiani, Dirk Pflüger, Mathias Niepert

Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS), 2022.

Machine learning-based modeling of physical systems has gained increasing interest in recent years. Despite recent progress, there is still a lack of benchmarks for scientific ML with sufficient volume and variety that are easy to use but still challenging and representative for a wide range of problems. In this paper, we introduce PDEBench, a benchmark suite of time-dependent simulation tasks based on Partial Differential Equations (PDEs). PDEBench comprises both code and data to benchmark the performance of novel machine learning models against both classical numerical simulations and machine learning baselines. Our proposed set of benchmark problems contributes in particular the following unique features: (1) a much wider range of PDEs than existing approaches, ranging from relatively common examples to more realistic and difficult ones; (2) much larger ready-to-use datasets than the state of the art, comprising multiple simulation runs across varying initial or boundary conditions and model parameters; and (3) easily extensible source code with user-friendly APIs for data generation and baseline results with advanced machine learning models (FNO, U-Net, PINN, Gradient-based inverse method). PDEBench allows researchers to extend the dataset freely for their own purposes using a standardized API, and to compare the performance of their new models. Finally, we propose new metrics to help understand and evaluate a given ML model in the context of scientific ML. With those metrics we identified tasks for which present ML methods cannot provide acceptable accuracy, and propose them as future challenge tasks for the community.

Code: https://github.com/pdebench/PDEBench

Preprint: https://openreview.net/forum?id=dh_MkX0QfrK

@inproceedings{takamoto22_neurips, title = {PDEBENCH: An Extensive Benchmark for Scientific Machine Learning}, author = {Takamoto, Makoto and Praditia, Timothy and Leiteritz, Raphael and MacKinlay, Dan and Alesiani, Francesco and Pflüger, Dirk and Niepert, Mathias}, year = {2022}, booktitle = {Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS)}, preprint = {https://openreview.net/forum?id=dh_MkX0QfrK} }
Hyperbolic Embedding Inference for Structured Multi-Label Prediction

Bo Xiong, M. Cochez, Mojtaba Nayyeri, Steffen Staab

Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS), 2022.

@inproceedings{xiong22_neurips_2, title = {Hyperbolic Embedding Inference for Structured Multi-Label Prediction}, author = {Xiong, Bo and Cochez, M. and Nayyeri, Mojtaba and Staab, Steffen}, year = {2022}, booktitle = {Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS)} }
Faithful Embeddings for EL++ Knowledge Bases

Bo Xiong, Nico Potyka, Trung-Kien Tran, Mojtaba Nayyeri, Steffen Staab

Proceedings of the 21st International Semantic Web Conference (ISWC2022), pp. 1–18, 2022.

Recently, increasing efforts are put into learning continual representations for symbolic knowledge bases (KBs). However, these approaches either only embed the data-level knowledge (ABox) or suffer from inherent limitations when dealing with concept-level knowledge (TBox), i.e., they cannot faithfully model the logical structure present in the KBs. We present BoxEL, a geometric KB embedding approach that allows for better capturing the logical structure (i.e., ABox and TBox axioms) in the description logic EL++. BoxEL models concepts in a KB as axis-parallel boxes that are suitable for modeling concept intersection, entities as points inside boxes, and relations between concepts/entities as affine transformations. We show theoretical guarantees (soundness) of BoxEL for preserving logical structure. Namely, the learned model of BoxEL embedding with loss 0 is a (logical) model of the KB. Experimental results on (plausible) subsumption reasonings and a real-world application for protein-protein prediction show that BoxEL outperforms traditional knowledge graph embedding methods as well as state-of-the-art EL++ embedding approaches.

@inproceedings{xiong22_iswc, title = {Faithful Embeddings for EL++ Knowledge Bases}, author = {Xiong, Bo and Potyka, Nico and Tran, Trung-Kien and Nayyeri, Mojtaba and Staab, Steffen}, year = {2022}, booktitle = {Proceedings of the 21st International Semantic Web Conference (ISWC2022)}, pages = {1--18} }
Ultrahyperbolic Knowledge Graph Embeddings

Bo Xiong, Shichao Zhu, Mojtaba Nayyeri, Chengjin Xu, Shirui Pan, Chuan Zhou, Steffen Staab

Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp. 1–10, 2022.

Recent knowledge graph (KG) embeddings have been advanced by hyperbolic geometry due to its superior capability for representing hierarchies. The topological structures of real-world KGs, however, are rather heterogeneous, i.e., a KG is composed of multiple distinct hierarchies and non-hierarchical graph structures. Therefore, a homogeneous (either Euclidean or hyperbolic) geometry is not sufficient for fairly representing such heterogeneous structures. To capture the topological heterogeneity of KGs, we present an ultrahyperbolic KG embedding (UltraE) in an ultrahyperbolic (or pseudo-Riemannian) manifold that seamlessly interleaves hyperbolic and spherical manifolds. In particular, we model each relation as a pseudo-orthogonal transformation that preserves the pseudo-Riemannian bilinear form. The pseudo-orthogonal transformation is decomposed into various operators (i.e., circular rotations, reflections and hyperbolic rotations), allowing for simultaneously modeling heterogeneous structures as well as complex relational patterns. Experimental results on three standard KGs show that UltraE outperforms previous Euclidean- and hyperbolic-based approaches.

@inproceedings{xiong22_kdd, title = {Ultrahyperbolic Knowledge Graph Embeddings}, author = {Xiong, Bo and Zhu, Shichao and Nayyeri, Mojtaba and Xu, Chengjin and Pan, Shirui and Zhou, Chuan and Staab, Steffen}, year = {2022}, booktitle = {Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)}, pages = {1--10} }
Pseudo-Riemannian Graph Convolutional Networks

Bo Xiong, Shichao Zhu, Nico Potyka, Shirui Pan, Chuan Zhou, Steffen Staab

Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS), 2022.

Graph Convolutional Networks (GCNs) are typically studied through the lens of Euclidean geometry. Non-Euclidean Riemannian manifolds provide specific inductive biases for embedding hierarchical or spherical data, but cannot align well with data of mixed topologies. We consider a larger class of semi-Riemannian manifolds with indefinite metric that generalize hyperboloid and sphere as well as their submanifolds. We develop new geodesic tools that allow for extending neural network operations into geodesically disconnected semi-Riemannian manifolds. As a consequence, we derive a principled Semi-Riemannian GCN that first models data in semi-Riemannian manifolds of constant nonzero curvature in the context of graph neural networks. Our method provides a geometric inductive bias that is sufficiently flexible to model mixed heterogeneous topologies like hierarchical graphs with cycles. Empirical results demonstrate that our method outperforms Riemannian counterparts when embedding graphs of complex topologies.

@inproceedings{xiong22_neurips, title = {Pseudo-Riemannian Graph Convolutional Networks}, author = {Xiong, Bo and Zhu, Shichao and Potyka, Nico and Pan, Shirui and Zhou, Chuan and Staab, Steffen}, year = {2022}, booktitle = {Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS)} }

2021

Journal Articles

Amortized Bayesian Model Comparison with Evidential Deep Learning

Stefan T. Radev, Marco D’Alessandro, Ulf K. Mertens, Andreas Voss, Ullrich Köthe, Paul-Christian Bürkner

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), pp. 1–12, 2021.

Comparing competing mathematical models of complex processes is a shared goal among many branches of science. The Bayesian probabilistic framework offers a principled way to perform model comparison and extract useful metrics for guiding decisions. However, many interesting models are intractable with standard Bayesian methods, as they lack a closed-form likelihood function or the likelihood is computationally too expensive to evaluate. In this work, we propose a novel method for performing Bayesian model comparison using specialized deep learning architectures. Our method is purely simulation-based and circumvents the step of explicitly fitting all alternative models under consideration to each observed dataset. Moreover, it requires no hand-crafted summary statistics of the data and is designed to amortize the cost of simulation over multiple models, datasets, and dataset sizes. This makes the method especially effective in scenarios where model fit needs to be assessed for a large number of datasets, so that case-based inference is practically infeasible. Finally, we propose a novel way to measure epistemic uncertainty in model comparison problems. We demonstrate the utility of our method on toy examples and simulated data from nontrivial models from cognitive science and single-cell neuroscience. We show that our method achieves excellent results in terms of accuracy, calibration, and efficiency across the examples considered in this work. We argue that our framework can enhance and enrich model-based analysis and inference in many fields dealing with computational models of natural processes. We further argue that the proposed measure of epistemic uncertainty provides a unique proxy to quantify absolute evidence even in a framework which assumes that the true data-generating model is within a finite set of candidate models.

doi: 10.1109/TNNLS.2021.3124052

@article{radev21_tnnls, title = {Amortized Bayesian Model Comparison with Evidential Deep Learning}, author = {Radev, Stefan T. and D'Alessandro, Marco and Mertens, Ulf K. and Voss, Andreas and Köthe, Ullrich and Bürkner, Paul-Christian}, year = {2021}, journal = {IEEE Transactions on Neural Networks and Learning Systems (TNNLS)}, pages = {1--12}, doi = {10.1109/TNNLS.2021.3124052} }
Rank-normalization, Folding, and Localization: An Improved Rhat for Assessing Convergence of MCMC (with discussion)

Aki Vehtari, Andrew Gelman, Daniel Simpson, Bob Carpenter, Paul-Christian Bürkner

Bayesian Analysis, 16(2), pp. 667–718, 2021.

Markov chain Monte Carlo is a key computational tool in Bayesian statistics, but it can be challenging to monitor the convergence of an iterative stochastic algorithm. In this paper we show that the convergence diagnostic R̂ of Gelman and Rubin (1992) has serious flaws. Traditional R̂ will fail to correctly diagnose convergence failures when the chain has a heavy tail or when the variance varies across the chains. In this paper we propose an alternative rank-based diagnostic that fixes these problems. We also introduce a collection of quantile-based local efficiency measures, along with a practical approach for computing Monte Carlo error estimates for quantiles. We suggest that common trace plots should be replaced with rank plots from multiple chains. Finally, we give recommendations for how these methods should be used in practice.

doi: 10.1214/20-BA1221

@article{vehtari21_ba, title = {Rank-normalization, Folding, and Localization: An Improved Rhat for Assessing Convergence of MCMC (with discussion)}, author = {Vehtari, Aki and Gelman, Andrew and Simpson, Daniel and Carpenter, Bob and Bürkner, Paul-Christian}, year = {2021}, journal = {Bayesian Analysis}, volume = {16}, number = {2}, pages = {667--718}, doi = {10.1214/20-BA1221} }

Conference Papers

Efficient Learning of Discrete-Continuous Computation Graphs

David Friede, Mathias Niepert

Proceedings of Advances in Neural Information Processing Systems (NeurIPS), pp. 1–13, 2021.

Numerous models for supervised and reinforcement learning benefit from combinations of discrete and continuous model components. End-to-end learnable discrete-continuous models are compositional, tend to generalize better, and are more interpretable. A popular approach to building discrete-continuous computation graphs is that of integrating discrete probability distributions into neural networks using stochastic softmax tricks. Prior work has mainly focused on computation graphs with a single discrete component on each of the graph’s execution paths. We analyze the behavior of more complex stochastic computation graphs with multiple sequential discrete components. We show that it is challenging to optimize the parameters of these models, mainly due to small gradients and local minima. We then propose two new strategies to overcome these challenges. First, we show that increasing the scale parameter of the Gumbel noise perturbations during training improves the learning behavior. Second, we propose dropout residual connections specifically tailored to stochastic, discrete-continuous computation graphs. With an extensive set of experiments, we show that we can train complex discrete-continuous models which one cannot train with standard stochastic softmax tricks. We also show that complex discrete-stochastic models generalize better than their continuous counterparts on several benchmark datasets.

Paper: https://proceedings.neurips.cc/paper/2021/file/3556a3018cce3076e27dbbf9645b44d5-Paper.pdf

@inproceedings{friede21_neurips, title = {Efficient Learning of Discrete-Continuous Computation Graphs}, author = {Friede, David and Niepert, Mathias}, year = {2021}, booktitle = {Proceedings of Advances in Neural Information Processing Systems (NeurIPS)}, pages = {1--13}, url = {https://proceedings.neurips.cc/paper/2021/file/3556a3018cce3076e27dbbf9645b44d5-Paper.pdf} }
On the universality of the Double Descent peak in ridgeless regression

David Holzmüller

Proceedings of the International Conference on Learning Representations (ICLR), 2021.

@inproceedings{holzmuller_universality_2021, title = {On the universality of the {Double} {Descent} peak in ridgeless regression}, author = {Holzmüller, David}, year = {2021}, booktitle = {Proceedings of the International {Conference} on {Learning} {Representations} (ICLR)} }
Answering Complex Queries in Knowledge Graphs with Bidirectional Sequence Encoders

Bhushan Kotnis, Carolin Lawrence, Mathias Niepert

Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 4968–4977, 2021.

Representation learning for knowledge graphs (KGs) has focused on the problem of answering simple link prediction queries. In this work we address the more ambitious challenge of predicting the answers of conjunctive queries with multiple missing entities. We propose Bidirectional Query Embedding (BiQE), a method that embeds conjunctive queries with models based on bi-directional attention mechanisms. Contrary to prior work, bidirectional self-attention can capture interactions among all the elements of a query graph. We introduce two new challenging datasets for studying conjunctive query inference and conduct experiments on several benchmark datasets that demonstrate that BiQE significantly outperforms state-of-the-art baselines.

Paper: https://ojs.aaai.org/index.php/AAAI/article/view/16630

@inproceedings{kotnis21_aaai, title = {Answering Complex Queries in Knowledge Graphs with Bidirectional Sequence Encoders}, author = {Kotnis, Bhushan and Lawrence, Carolin and Niepert, Mathias}, year = {2021}, booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)}, volume = {35}, number = {6}, pages = {4968--4977}, url = {https://ojs.aaai.org/index.php/AAAI/article/view/16630} }
Explaining Neural Matrix Factorization with Gradient Rollback

Carolin Lawrence, Timo Sztyler, Mathias Niepert

Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 4987–4995, 2021.

Explaining the predictions of neural black-box models is an important problem, especially when such models are used in applications where user trust is crucial. Estimating the influence of training examples on a learned neural model’s behavior allows us to identify training examples most responsible for a given prediction and, therefore, to faithfully explain the output of a black-box model. The most generally applicable existing method is based on influence functions, which scale poorly for larger sample sizes and models. We propose gradient rollback, a general approach for influence estimation, applicable to neural models where each parameter update step during gradient descent touches a smaller number of parameters, even if the overall number of parameters is large. Neural matrix factorization models trained with gradient descent are part of this model class. These models are popular and have found a wide range of applications in industry. Especially knowledge graph embedding methods, which belong to this class, are used extensively. We show that gradient rollback is highly efficient at both training and test time. Moreover, we show theoretically that the difference between gradient rollback’s influence approximation and the true influence on a model’s behavior is smaller than known bounds on the stability of stochastic gradient descent. This establishes that gradient rollback is robustly estimating example influence. We also conduct experiments which show that gradient rollback provides faithful explanations for knowledge base completion and recommender datasets. An implementation and an appendix are available.

Paper: https://ojs.aaai.org/index.php/AAAI/article/view/16632

@inproceedings{lawrence21_aaai, title = {Explaining Neural Matrix Factorization with Gradient Rollback}, author = {Lawrence, Carolin and Sztyler, Timo and Niepert, Mathias}, year = {2021}, booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)}, volume = {35}, number = {6}, pages = {4987--4995}, url = {https://ojs.aaai.org/index.php/AAAI/article/view/16632} }
Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions

Mathias Niepert, Pasquale Minervini, Luca Franceschi

Proceedings of Advances in Neural Information Processing Systems (NeurIPS), pp. 1–13, 2021.

Combining discrete probability distributions and combinatorial optimization problems with neural network components has numerous applications but poses several challenges. We propose Implicit Maximum Likelihood Estimation (I-MLE), a framework for end-to-end learning of models combining discrete exponential family distributions and differentiable neural components. I-MLE is widely applicable as it only requires the ability to compute the most probable states and does not rely on smooth relaxations. The framework encompasses several approaches such as perturbation-based implicit differentiation and recent methods to differentiate through black-box combinatorial solvers. We introduce a novel class of noise distributions for approximating marginals via perturb-and-MAP. Moreover, we show that I-MLE simplifies to maximum likelihood estimation when used in some recently studied learning settings that involve combinatorial solvers. Experiments on several datasets suggest that I-MLE is competitive with and often outperforms existing approaches which rely on problem specific relaxations.

Paper: https://proceedings.neurips.cc/paper/2021/file/7a430339c10c642c4b2251756fd1b484-Paper.pdf

@inproceedings{niepert21_neurips, title = {Implicit {MLE}: Backpropagating Through Discrete Exponential Family Distributions}, author = {Niepert, Mathias and Minervini, Pasquale and Franceschi, Luca}, year = {2021}, booktitle = {Proceedings of Advances in Neural Information Processing Systems (NeurIPS)}, pages = {1--13}, url = {https://proceedings.neurips.cc/paper/2021/file/7a430339c10c642c4b2251756fd1b484-Paper.pdf} }
Thinking Like a Developer? Comparing the Attention of Humans with Neural Models of Code

Matteo Paltenghi, Michael Pradel

Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 867–879, 2021.

Neural models of code are successfully tackling various prediction tasks, complementing and sometimes even outperforming traditional program analyses. While most work focuses on end-to-end evaluations of such models, it often remains unclear what the models actually learn, and to what extent their reasoning about code matches that of skilled humans. A poor understanding of the model reasoning risks deploying models that are right for the wrong reason, and taking decisions based on spurious correlations in the training dataset. This paper investigates to what extent the attention weights of effective neural models match the reasoning of skilled humans. To this end, we present a methodology for recording human attention and use it to gather 1,508 human attention maps from 91 participants, which is the largest such dataset we are aware of. Computing human-model correlations shows that the copy attention of neural models often matches the way humans reason about code (Spearman rank coefficients of 0.49 and 0.47), which gives an empirical justification for the intuition behind copy attention. In contrast, the regular attention of models is mostly uncorrelated with human attention. We find that models and humans sometimes focus on different kinds of tokens, e.g., strings are important to humans but mostly ignored by models. The results also show that human-model agreement positively correlates with accurate predictions by a model, which calls for neural models that even more closely mimic human reasoning. Beyond the insights from our study, we envision the release of our dataset of human attention maps to help understand future neural models of code and to foster work on human-inspired models.

doi: 10.1109/ASE51524.2021.9678712

@inproceedings{paltenghi21_ase, title = {Thinking Like a Developer? Comparing the Attention of Humans with Neural Models of Code}, author = {Paltenghi, Matteo and Pradel, Michael}, year = {2021}, booktitle = {Proceedings of the 36th {IEEE/ACM} International Conference on Automated Software Engineering ({ASE})}, pages = {867--879}, doi = {10.1109/ASE51524.2021.9678712} }
Semantic Bug Seeding: A Learning-based Approach for Creating Realistic Bugs

Jibesh Patra, Michael Pradel

Proceedings of the 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 906–918, 2021.

When working on techniques to address the wide-spread problem of software bugs, one often faces the need for a large number of realistic bugs in real-world programs. Such bugs can either help evaluate an approach, e.g., in the form of a bug benchmark or a suite of program mutations, or even help build the technique, e.g., in learning-based bug detection. Because gathering a large number of real bugs is difficult, a common approach is to rely on automatically seeded bugs. Prior work seeds bugs based on syntactic transformation patterns, which often results in unrealistic bugs and typically cannot introduce new, application-specific code tokens. This paper presents SemSeed, a technique for automatically seeding bugs in a semantics-aware way. The key idea is to imitate what a given real-world bug would look like in other programs by semantically adapting the bug pattern to the local context. To reason about the semantics of pieces of code, our approach builds on learned token embeddings that encode the semantic similarities of identifiers and literals. Our evaluation with real-world JavaScript software shows that the approach effectively reproduces real bugs and clearly outperforms a semantics-unaware approach. The seeded bugs are useful as training data for learning-based bug detection, where they significantly improve the bug detection ability. Moreover, we show that SemSeed-created bugs complement existing mutation testing operators, and that our approach is efficient enough to seed hundreds of thousands of bugs within an hour.

doi: 10.1145/3468264.3468623

@inproceedings{patra21_esec, title = {Semantic Bug Seeding: A Learning-based Approach for Creating Realistic Bugs}, author = {Patra, Jibesh and Pradel, Michael}, year = {2021}, booktitle = {Proceedings of the 29th {ACM} Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering ({ESEC/FSE})}, pages = {906--918}, doi = {10.1145/3468264.3468623}, editor = {Spinellis, Diomidis and Gousios, Georgios and Chechik, Marsha and Di Penta, Massimiliano} }
Neural Photofit: Gaze-based Mental Image Reconstruction

Florian Strohm, Ekta Sood, Sven Mayer, Philipp Müller, Mihai Bâce, Andreas Bulling

Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 245–254, 2021.

We propose a novel method that leverages human fixations to visually decode the image a person has in mind into a photofit (facial composite). Our method combines three neural networks: An encoder, a scoring network, and a decoder. The encoder extracts image features and predicts a neural activation map for each face looked at by a human observer. A neural scoring network compares the human and neural attention and predicts a relevance score for each extracted image feature. Finally, image features are aggregated into a single feature vector as a linear combination of all features weighted by relevance which a decoder decodes into the final photofit. We train the neural scoring network on a novel dataset containing gaze data of 19 participants looking at collages of synthetic faces. We show that our method significantly outperforms a mean baseline predictor and report on a human study that shows that we can decode photofits that are visually plausible and close to the observer’s mental image. Code and dataset available upon request.

doi: 10.1109/ICCV48922.2021.00031

Code: Available upon request.

Dataset: Available upon request.

@inproceedings{strohm21_iccv, title = {Neural Photofit: Gaze-based Mental Image Reconstruction}, author = {Strohm, Florian and Sood, Ekta and Mayer, Sven and Müller, Philipp and Bâce, Mihai and Bulling, Andreas}, year = {2021}, booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)}, pages = {245--254}, doi = {10.1109/ICCV48922.2021.00031} }
IdBench: Evaluating Semantic Representations of Identifier Names in Source Code

Yaza Wainakh, Moiz Rauf, Michael Pradel

Proceedings of the 43rd IEEE/ACM International Conference on Software Engineering (ICSE), pp. 562–573, 2021.

Identifier names convey useful information about the intended semantics of code. Name-based program analyses use this information, e.g., to detect bugs, to predict types, and to improve the readability of code. At the core of name-based analyses are semantic representations of identifiers, e.g., in the form of learned embeddings. The high-level goal of such a representation is to encode whether two identifiers, e.g., len and size, are semantically similar. Unfortunately, it is currently unclear to what extent semantic representations match the semantic relatedness and similarity perceived by developers. This paper presents IdBench, the first benchmark for evaluating semantic representations against a ground truth created from thousands of ratings by 500 software developers. We use IdBench to study state-of-the-art embedding techniques proposed for natural language, an embedding technique specifically designed for source code, and lexical string distance functions. Our results show that the effectiveness of semantic representations varies significantly and that the best available embeddings successfully represent semantic relatedness. On the downside, no existing technique provides a satisfactory representation of semantic similarities, among other reasons because identifiers with opposing meanings are incorrectly considered to be similar, which may lead to fatal mistakes, e.g., in a refactoring tool. Studying the strengths and weaknesses of the different techniques shows that they complement each other. As a first step toward exploiting this complementarity, we present an ensemble model that combines existing techniques and that clearly outperforms the best available semantic representation.

doi: 10.1109/ICSE43902.2021.00059

@inproceedings{wainakh21_icse, title = {IdBench: Evaluating Semantic Representations of Identifier Names in Source Code}, author = {Wainakh, Yaza and Rauf, Moiz and Pradel, Michael}, year = {2021}, booktitle = {Proceedings of the 43rd {IEEE/ACM} International Conference on Software Engineering ({ICSE})}, pages = {562--573}, doi = {10.1109/ICSE43902.2021.00059} }
Uncertainty Estimation and Calibration with Finite-State Probabilistic RNNs

Cheng Wang, Carolin Lawrence, Mathias Niepert

Proceedings of the Ninth International Conference on Learning Representations (ICLR), 2021.

Uncertainty quantification is crucial for building reliable and trustable machine learning systems. We propose to estimate uncertainty in recurrent neural networks (RNNs) via stochastic discrete state transitions over recurrent timesteps. The uncertainty of the model can be quantified by running a prediction several times, each time sampling from the recurrent state transition distribution, leading to potentially different results if the model is uncertain. Alongside uncertainty quantification, our proposed method offers several advantages in different settings. The proposed method can (1) learn deterministic and probabilistic automata from data, (2) learn well-calibrated models on real-world classification tasks, (3) improve the performance of out-of-distribution detection, and (4) control the exploration-exploitation trade-off in reinforcement learning. An implementation is available.

Paper: https://openreview.net/forum?id=9EKHN1jOlA

@inproceedings{wang21_iclr, title = {Uncertainty Estimation and Calibration with Finite-State Probabilistic RNNs}, author = {Wang, Cheng and Lawrence, Carolin and Niepert, Mathias}, year = {2021}, booktitle = {Proceedings of the Ninth International Conference on Learning Representations (ICLR)}, url = {https://openreview.net/forum?id=9EKHN1jOlA} }

2020

Journal Articles

Approximate Leave-future-out Cross-validation for Bayesian Time Series Models

Paul-Christian Bürkner, Jonah Gabry, Aki Vehtari

Journal of Statistical Computation and Simulation, 90(14), pp. 2499–2523, 2020.

One of the common goals of time series analysis is to use the observed series to inform predictions for future observations. In the absence of any actual new data to predict, cross-validation can be used to estimate a model’s future predictive accuracy, for instance, for the purpose of model comparison or selection. Exact cross-validation for Bayesian models is often computationally expensive, but approximate cross-validation methods have been developed, most notably methods for leave-one-out cross-validation (LOO-CV). If the actual prediction task is to predict the future given the past, LOO-CV provides an overly optimistic estimate because the information from future observations is available to influence predictions of the past. To properly account for the time series structure, we can use leave-future-out cross-validation (LFO-CV). Like exact LOO-CV, exact LFO-CV requires refitting the model many times to different subsets of the data. Using Pareto smoothed importance sampling, we propose a method for approximating exact LFO-CV that drastically reduces the computational costs while also providing informative diagnostics about the quality of the approximation.
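To make the cost of exact LFO-CV concrete, the following minimal sketch refits a model on every prefix of the series and scores the next observation. It is an illustration under stated assumptions: a plug-in Gaussian (sample mean and variance of the past) stands in for a Bayesian time series model, the prediction horizon is fixed to one step, and the names `exact_lfo_cv` and `min_obs` are hypothetical. The Pareto smoothed importance sampling approximation proposed in the paper is not shown.

```python
import math


def gaussian_log_density(y, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (y - mean) ** 2 / var)


def exact_lfo_cv(series, min_obs):
    """Sum of one-step-ahead log predictive densities, refitting at every step.

    Assumption: a plug-in Gaussian fitted to the past observations replaces
    the Bayesian model; only the refit-and-predict loop is the point here.
    """
    if min_obs < 2:
        raise ValueError("need at least two observations to estimate a variance")
    elpd = 0.0
    for i in range(min_obs, len(series)):
        past = series[:i]  # only observations before time i are used
        mean = sum(past) / len(past)
        var = sum((y - mean) ** 2 for y in past) / (len(past) - 1)
        if var <= 0:
            raise ValueError("past observations must not all be identical")
        elpd += gaussian_log_density(series[i], mean, var)
    return elpd


series = [1.0, 2.0, 1.5, 2.5, 2.0, 3.0]
elpd = exact_lfo_cv(series, min_obs=3)
```

Each loop iteration corresponds to one refit, which is why the exact procedure becomes expensive when every refit requires full posterior inference.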

doi: 10.1080/00949655.2020.1783262

@article{buerkner20_jscs, title = {Approximate Leave-future-out Cross-validation for Bayesian Time Series Models}, author = {Bürkner, Paul-Christian and Gabry, Jonah and Vehtari, Aki}, year = {2020}, journal = {Journal of Statistical Computation and Simulation}, volume = {90}, number = {14}, pages = {2499--2523}, doi = {10.1080/00949655.2020.1783262} }
Sobolev Norm Learning Rates for Regularized Least-Squares Algorithm

Simon Fischer, Ingo Steinwart

Journal of Machine Learning Research (JMLR), 21, pp. 1–38, 2020.

Learning rates for least-squares regression are typically expressed in terms of L₂-norms. In this paper we extend these rates to norms stronger than the L₂-norm without requiring the regression function to be contained in the hypothesis space. In the special case of Sobolev reproducing kernel Hilbert spaces used as hypothesis spaces, these stronger norms coincide with fractional Sobolev norms between the used Sobolev space and L₂. As a consequence, not only the target function but also some of its derivatives can be estimated without changing the algorithm. From a technical point of view, we combine the well-known integral operator techniques with an embedding property, which so far has only been used in combination with empirical process arguments. This combination results in new finite sample bounds with respect to the stronger norms. From these finite sample bounds our rates easily follow. Finally, we prove the asymptotic optimality of our results in many cases.

Paper: https://www.jmlr.org/papers/volume21/19-734/19-734.pdf

@article{fischer20_jmlr, title = {Sobolev Norm Learning Rates for Regularized Least-Squares Algorithm}, author = {Fischer, Simon and Steinwart, Ingo}, year = {2020}, journal = {Journal of Machine Learning Research (JMLR)}, volume = {21}, pages = {1--38}, url = {https://www.jmlr.org/papers/volume21/19-734/19-734.pdf} }

Conference Papers

Quantification of Users’ Visual Attention During Everyday Mobile Device Interactions

Mihai Bâce, Sander Staal, Andreas Bulling

Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 1–14, 2020.

We present the first real-world dataset and quantitative evaluation of visual attention of mobile device users in-situ, i.e. while using their devices during everyday routine. Understanding user attention is a core research challenge in mobile HCI but previous approaches relied on usage logs or self-reports that are only proxies and consequently reflect attention neither completely nor accurately. Our evaluations are based on Everyday Mobile Visual Attention (EMVA) – a new 32-participant dataset containing around 472 hours of video snippets recorded over more than two weeks in real life using the front-facing camera as well as associated usage logs, interaction events, and sensor data. Using an eye contact detection method, we are the first to quantify the highly dynamic nature of everyday visual attention across users, mobile applications, and usage contexts. We discuss key insights from our analyses that highlight the potential and inform the design of future mobile attentive user interfaces.

doi: 10.1145/3313831.3376449

Dataset: http://www.emva-dataset.org/

@inproceedings{bace20_chi, title = {Quantification of Users' Visual Attention During Everyday Mobile Device Interactions}, author = {B{\^a}ce, Mihai and Staal, Sander and Bulling, Andreas}, year = {2020}, booktitle = {Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI)}, pages = {1--14}, doi = {10.1145/3313831.3376449} }
Flexible Prior Elicitation via the Prior Predictive Distribution

Marcelo Hartmann, Georgi Agiashvili, Paul Bürkner, Arto Klami

Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), pp. 1129–1138, 2020.

The prior distribution for the unknown model parameters plays a crucial role in the process of statistical inference based on Bayesian methods. However, specifying suitable priors is often difficult even when detailed prior knowledge is available in principle. The challenge is to express quantitative information in the form of a probability distribution. Prior elicitation addresses this question by extracting subjective information from an expert and transforming it into a valid prior. Most existing methods, however, require information to be provided on the unobservable parameters, whose effect on the data generating process is often complicated and hard to understand. We propose an alternative approach that only requires knowledge about the observable outcomes, knowledge which is often much easier for experts to provide. Building upon a principled statistical framework, our approach utilizes the prior predictive distribution implied by the model to automatically transform experts' judgements about plausible outcome values to suitable priors on the parameters. We also provide computational strategies to perform inference and guidelines to facilitate practical use.

Paper: https://proceedings.mlr.press/v124/hartmann20a.html

@inproceedings{hartmann20_uai, title = {Flexible Prior Elicitation via the Prior Predictive Distribution}, author = {Hartmann, Marcelo and Agiashvili, Georgi and Bürkner, Paul and Klami, Arto}, year = {2020}, booktitle = {Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)}, volume = {124}, pages = {1129--1138}, url = {https://proceedings.mlr.press/v124/hartmann20a.html} }
Predicting Degrees of Technicality in Automatic Terminology Extraction

Anna Hätty, Dominik Schlechtweg, Michael Dorna, Sabine Schulte im Walde

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 2883–2889, 2020.

While automatic term extraction is a well-researched area, computational approaches to distinguish between degrees of technicality are still understudied. We semi-automatically create a German gold standard of technicality across four domains, and illustrate the impact of a web-crawled general-language corpus on technicality prediction. When defining a classification approach that combines general-language and domain-specific word embeddings, we go beyond previous work and align vector spaces to gain comparative embeddings. We suggest two novel models to exploit general- vs. domain-specific comparisons: a simple neural network model with pre-computed comparative-embedding information as input, and a multi-channel model computing the comparison internally. Both models outperform previous approaches, with the multi-channel model performing best.

doi: 10.18653/v1/2020.acl-main.258

@inproceedings{haetty20_acl, title = {Predicting Degrees of Technicality in Automatic Terminology Extraction}, author = {Hätty, Anna and Schlechtweg, Dominik and Dorna, Michael and {Schulte im Walde}, Sabine}, year = {2020}, booktitle = {Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL)}, pages = {2883--2889}, doi = {10.18653/v1/2020.acl-main.258} }
TypeWriter: Neural Type Prediction with Search-based Validation

Michael Pradel, Georgios Gousios, Jason Liu, Satish Chandra

Proceedings of the 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), pp. 209–220, 2020.

Maintaining large code bases written in dynamically typed languages, such as JavaScript or Python, can be challenging due to the absence of type annotations: simple data compatibility errors proliferate, IDE support is limited, and APIs are hard to comprehend. Recent work attempts to address those issues through either static type inference or probabilistic type prediction. Unfortunately, static type inference for dynamic languages is inherently limited, while probabilistic approaches suffer from imprecision. This paper presents TypeWriter, the first combination of probabilistic type prediction with search-based refinement of predicted types. TypeWriter’s predictor learns to infer the return and argument types for functions from partially annotated code bases by combining the natural language properties of code with programming language-level information. To validate predicted types, TypeWriter invokes a gradual type checker with different combinations of the predicted types, while navigating the space of possible type combinations in a feedback-directed manner. We implement the TypeWriter approach for Python and evaluate it on two code corpora: a multi-million line code base at Facebook and a collection of 1,137 popular open-source projects. We show that TypeWriter’s type predictor achieves an F1 score of 0.64 (0.79) in the top-1 (top-5) predictions for return types, and 0.57 (0.80) for argument types, which clearly outperforms prior type prediction models. By combining predictions with search-based validation, TypeWriter can fully annotate between 14% and 44% of the files in a randomly selected corpus, while ensuring type correctness. A comparison with a static type inference tool shows that TypeWriter adds many more non-trivial types. TypeWriter currently suggests types to developers at Facebook and several thousands of types have already been accepted with minimal changes.

doi: 10.1145/3368089.3409715

@inproceedings{pradel20_esec, title = {TypeWriter: Neural Type Prediction with Search-based Validation}, author = {Pradel, Michael and Gousios, Georgios and Liu, Jason and Chandra, Satish}, year = {2020}, booktitle = {Proceedings of the 28th {ACM} Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering ({ESEC/FSE})}, pages = {209--220}, doi = {10.1145/3368089.3409715}, editor = {Devanbu, Prem and Cohen, Myra B. and Zimmermann, Thomas} }
Improving Natural Language Processing Tasks with Human Gaze-Guided Neural Attention

Ekta Sood, Simon Tannert, Philipp Müller, Andreas Bulling

Proceedings of Advances in Neural Information Processing Systems (NeurIPS), pp. 1–15, 2020.

A lack of corpora has so far limited advances in integrating human gaze data as a supervisory signal in neural attention mechanisms for natural language processing (NLP). We propose a novel hybrid text saliency model (TSM) that, for the first time, combines a cognitive model of reading with explicit human gaze supervision in a single machine learning framework. We show on four different corpora that our hybrid TSM duration predictions are highly correlated with human gaze ground truth. We further propose a novel joint modelling approach to integrate the predictions of the TSM into the attention layer of a network designed for a specific upstream task without the need for task-specific human gaze data. We demonstrate that our joint model outperforms the state of the art in paraphrase generation on the Quora Question Pairs corpus by more than 10% in BLEU-4 and achieves state-of-the-art performance for sentence compression on the challenging Google Sentence Compression corpus. As such, our work introduces a practical approach for bridging between data-driven and cognitive models and demonstrates a new way to integrate human gaze-guided neural attention into NLP tasks.

Code: https://git.hcics.simtech.uni-stuttgart.de/public-projects/human-gaze-guided-neural-attention-for-nlp

Paper: https://proceedings.neurips.cc/paper/2020/hash/460191c72f67e90150a093b4585e7eb4-Abstract.html

@inproceedings{sood20_neurips, title = {Improving Natural Language Processing Tasks with Human Gaze-Guided Neural Attention}, author = {Sood, Ekta and Tannert, Simon and Müller, Philipp and Bulling, Andreas}, year = {2020}, booktitle = {Proceedings of Advances in Neural Information Processing Systems (NeurIPS)}, pages = {1--15}, url = {https://proceedings.neurips.cc/paper/2020/hash/460191c72f67e90150a093b4585e7eb4-Abstract.html} }

2019

Journal Articles

Getafix: Learning to Fix Bugs Automatically

Johannes Bader, Andrew Scott, Michael Pradel, Satish Chandra

Proceedings of the ACM on Programming Languages, 3(OOPSLA), pp. 159:1–159:27, 2019.

Static analyzers help find bugs early by warning about recurring bug categories. While fixing these bugs still remains a mostly manual task in practice, we observe that fixes for a specific bug category often are repetitive. This paper addresses the problem of automatically fixing instances of common bugs by learning from past fixes. We present Getafix, an approach that produces human-like fixes while being fast enough to suggest fixes in time proportional to the amount of time needed to obtain static analysis results in the first place. Getafix is based on a novel hierarchical clustering algorithm that summarizes fix patterns into a hierarchy ranging from general to specific patterns. Instead of an expensive exploration of a potentially large space of candidate fixes, Getafix uses a simple yet effective ranking technique that uses the context of a code change to select the most appropriate fix for a given bug. Our evaluation applies Getafix to 1,268 bug fixes for six bug categories reported by popular static analyzers for Java, including null dereferences, incorrect API calls, and misuses of particular language constructs. The approach predicts exactly the human-written fix as the top-most suggestion between 12% and 91% of the time, depending on the bug category. The top-5 suggestions contain fixes for 526 of the 1,268 bugs. Moreover, we report on deploying the approach within Facebook, where it contributes to the reliability of software used by billions of people. To the best of our knowledge, Getafix is the first industrially-deployed automated bug-fixing tool that learns fix patterns from past, human-written fixes to produce human-like fixes.

doi: 10.1145/3360585

@article{bader19_pl, title = {Getafix: Learning to Fix Bugs Automatically}, author = {Bader, Johannes and Scott, Andrew and Pradel, Michael and Chandra, Satish}, year = {2019}, journal = {Proceedings of the ACM on Programming Languages}, volume = {3}, number = {{OOPSLA}}, pages = {159:1--159:27}, doi = {10.1145/3360585} }
Learning Rates for Kernel-Based Expectile Regression

Muhammad Farooq, Ingo Steinwart

Machine Learning, 108, pp. 203–227, 2019.

Conditional expectiles are becoming an increasingly important tool in finance as well as in other areas of applications. We analyse a support vector machine type approach for estimating conditional expectiles and establish learning rates that are minimax optimal modulo a logarithmic factor if Gaussian RBF kernels are used and the desired expectile is smooth in a Besov sense. As a special case, our learning rates improve the best known rates for kernel-based least squares regression in the aforementioned scenario. Key ingredients of our statistical analysis are a general calibration inequality for the asymmetric least squares loss, a corresponding variance bound as well as an improved entropy number bound for Gaussian RBF kernels.
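As a small illustration of the asymmetric least squares loss mentioned in the abstract, the sketch below computes an unconditional sample expectile by iteratively reweighted averaging. It only illustrates the loss: the function names, the toy data, and the fixed iteration count are assumptions, and the kernel-based conditional estimator analysed in the paper is not implemented here.

```python
def asymmetric_least_squares_loss(y, t, tau):
    """Squared error weighted by tau when y >= t and by (1 - tau) when y < t."""
    weight = tau if y >= t else 1.0 - tau
    return weight * (y - t) ** 2


def sample_expectile(values, tau, iterations=100):
    """Minimise the average asymmetric least squares loss over a constant t.

    The minimiser satisfies sum_i w_i * (y_i - t) = 0, so it is a weighted
    mean whose weights depend on t; the loop iterates that fixed point.
    Requires 0 < tau < 1.
    """
    t = sum(values) / len(values)  # start at the mean (the tau = 0.5 expectile)
    for _ in range(iterations):
        weights = [tau if y >= t else 1.0 - tau for y in values]
        t = sum(w * y for w, y in zip(weights, values)) / sum(weights)
    return t


data = [1.0, 2.0, 3.0, 4.0, 10.0]
mean_expectile = sample_expectile(data, 0.5)
upper_expectile = sample_expectile(data, 0.9)
```

For tau = 0.5 the loss reduces to ordinary least squares up to a constant factor, so the expectile coincides with the mean; larger values of tau move the expectile towards the upper tail.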

doi: 10.1007/s10994-018-5762-9

@article{farooq19_ml, title = {Learning Rates for Kernel-Based Expectile Regression}, author = {Farooq, Muhammad and Steinwart, Ingo}, year = {2019}, journal = {Machine Learning}, volume = {108}, pages = {203--227}, doi = {10.1007/s10994-018-5762-9} }
MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation

Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 41(1), pp. 162–175, 2019.

Learning-based methods are believed to work well for unconstrained gaze estimation, i.e. gaze estimation from a monocular RGB camera without assumptions regarding user, environment, or camera. However, current gaze datasets were collected under laboratory conditions and methods were not evaluated across multiple datasets. Our work makes three contributions towards addressing these limitations. First, we present the MPIIGaze dataset, which contains 213,659 full face images and corresponding ground-truth gaze positions collected from 15 users during everyday laptop use over several months. An experience sampling approach ensured continuous gaze and head poses and realistic variation in eye appearance and illumination. To facilitate cross-dataset evaluations, 37,667 images were manually annotated with eye corners, mouth corners, and pupil centres. Second, we present an extensive evaluation of state-of-the-art gaze estimation methods on three current datasets, including MPIIGaze. We study key challenges including target gaze range, illumination conditions, and facial appearance variation. We show that image resolution and the use of both eyes affect gaze estimation performance, while head pose and pupil centre information are less informative. Finally, we propose GazeNet, the first deep appearance-based gaze estimation method. GazeNet improves on the state of the art by 22% (from a mean error of 13.9 degrees to 10.8 degrees) for the most challenging cross-dataset evaluation.

doi: 10.1109/TPAMI.2017.2778103

@article{zhang19_pami, title = {MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation}, author = {Zhang, Xucong and Sugano, Yusuke and Fritz, Mario and Bulling, Andreas}, year = {2019}, journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)}, volume = {41}, number = {1}, pages = {162--175}, doi = {10.1109/TPAMI.2017.2778103} }

Conference Papers

Learning Discrete Structures for Graph Neural Networks

Luca Franceschi, Mathias Niepert, Massimiliano Pontil, Xiao He

Proceedings of the 36th International Conference on Machine Learning (ICML), pp. 1–11, 2019.

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.

Paper: http://proceedings.mlr.press/v97/franceschi19a/franceschi19a.pdf

@inproceedings{franceschi19_icml, title = {Learning Discrete Structures for Graph Neural Networks}, author = {Franceschi, Luca and Niepert, Mathias and Pontil, Massimiliano and He, Xiao}, year = {2019}, booktitle = {Proceedings of the 36th International Conference on Machine Learning (ICML)}, pages = {1--11}, url = {http://proceedings.mlr.press/v97/franceschi19a/franceschi19a.pdf} }
Attending to Future Tokens For Bidirectional Sequence Generation

Carolin Lawrence, Bhushan Kotnis, Mathias Niepert

Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1–10, 2019.

Neural sequence generation is typically performed token-by-token and left-to-right. Whenever a token is generated, only previously produced tokens are taken into consideration. In contrast, for problems such as sequence classification, bidirectional attention, which takes both past and future tokens into consideration, has been shown to perform much better. We propose to make the sequence generation process bidirectional by employing special placeholder tokens. Treated as a node in a fully connected graph, a placeholder token can take past and future tokens into consideration when generating the actual output token. We verify the effectiveness of our approach experimentally on two conversational tasks where the proposed bidirectional model outperforms competitive baselines by a large margin.

doi: 10.18653/v1/D19-1001

@inproceedings{lawrence19_emnlp, title = {Attending to Future Tokens For Bidirectional Sequence Generation}, author = {Lawrence, Carolin and Kotnis, Bhushan and Niepert, Mathias}, year = {2019}, booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)}, pages = {1--10}, doi = {10.18653/v1/D19-1001} }
NL2Type: Inferring JavaScript Function Types from Natural Language Information

Rabee Sohail Malik, Jibesh Patra, Michael Pradel

Proceedings of the 41st IEEE/ACM International Conference on Software Engineering (ICSE), pp. 304–315, 2019.

JavaScript is dynamically typed and hence lacks the type safety of statically typed languages, leading to suboptimal IDE support, difficult to understand APIs, and unexpected runtime behavior. Several gradual type systems have been proposed, e.g., Flow and TypeScript, but they rely on developers to annotate code with types. This paper presents NL2Type, a learning-based approach for predicting likely type signatures of JavaScript functions. The key idea is to exploit natural language information in source code, such as comments, function names, and parameter names, a rich source of knowledge that is typically ignored by type inference algorithms. We formulate the problem of predicting types as a classification problem and train a recurrent, LSTM-based neural model that, after learning from an annotated code base, predicts function types for unannotated code. We evaluate the approach with a corpus of 162,673 JavaScript files from real-world projects. NL2Type predicts types with a precision of 84.1% and a recall of 78.9% when considering only the top-most suggestion, and with a precision of 95.5% and a recall of 89.6% when considering the top-5 suggestions. The approach outperforms both JSNice, a state-of-the-art approach that analyzes implementations of functions instead of natural language information, and DeepTyper, a recent type prediction approach that is also based on deep learning. Beyond predicting types, NL2Type serves as a consistency checker for existing type annotations. We show that it discovers 39 inconsistencies that deserve developer attention (from a manual analysis of 50 warnings), most of which are due to incorrect type annotations.

doi: 10.1109/ICSE.2019.00045

@inproceedings{malik19_icse, title = {NL2Type: Inferring JavaScript Function Types from Natural Language Information}, author = {Malik, Rabee Sohail and Patra, Jibesh and Pradel, Michael}, year = {2019}, booktitle = {Proceedings of the 41st {IEEE/ACM} International Conference on Software Engineering ({ICSE})}, pages = {304--315}, doi = {10.1109/ICSE.2019.00045}, editor = {Atlee, Joanne M. and Bultan, Tevfik and Whittle, Jon} }
A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains

Dominik Schlechtweg, Anna Hätty, Marco del Tredici, Sabine Schulte im Walde

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 732–746, 2019.

We perform an interdisciplinary large-scale evaluation for detecting lexical semantic divergences in a diachronic and in a synchronic task: semantic sense changes across time, and semantic sense changes across domains. Our work addresses the superficialness and lack of comparison in assessing models of diachronic lexical change, by bringing together and extending benchmark models on a common state-of-the-art evaluation task. In addition, we demonstrate that the same evaluation task and modelling approaches can successfully be utilised for the synchronic detection of domain-specific sense divergences in the field of term extraction.

doi: 10.18653/v1/P19-1072

@inproceedings{schlechtweg19_acl, title = {A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains}, author = {Schlechtweg, Dominik and Hätty, Anna and del Tredici, Marco and {Schulte im Walde}, Sabine}, year = {2019}, booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL)}, pages = {732--746}, doi = {10.18653/v1/P19-1072} }
State-Regularized Recurrent Neural Networks

Cheng Wang, Mathias Niepert

Proceedings of the 36th International Conference on Machine Learning (ICML), pp. 6596–6606, 2019.

Recurrent neural networks are a widely used class of neural architectures with two shortcomings. First, it is difficult to understand what exactly they learn. Second, they tend to work poorly on sequences requiring long-term memorization, despite having this capacity in principle. We aim to address both shortcomings with a class of recurrent networks that use a stochastic state transition mechanism between cell applications. This mechanism, which we term state-regularization, makes RNNs transition between a finite set of learnable states. We evaluate state-regularized RNNs on (1) regular languages for the purpose of automata extraction; (2) nonregular languages such as balanced parentheses, palindromes, and the copy task where external memory is required; and (3) real-world sequence learning tasks for sentiment analysis, visual object recognition, and language modeling. We show that state-regularization simplifies the extraction of finite state automata from the RNN’s state transition dynamics; forces RNNs to operate more like automata with external memory and less like finite state machines; and makes RNNs more interpretable.

Paper: https://proceedings.mlr.press/v97/wang19j.html

@inproceedings{wang19_icml, title = {State-Regularized Recurrent Neural Networks}, author = {Wang, Cheng and Niepert, Mathias}, year = {2019}, booktitle = {Proceedings of the 36th International Conference on Machine Learning (ICML)}, pages = {6596--6606}, url = {https://proceedings.mlr.press/v97/wang19j.html} }
Evaluation of Appearance-Based Methods and Implications for Gaze-Based Applications

Xucong Zhang, Yusuke Sugano, Andreas Bulling

Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 1–13, 2019.

Appearance-based gaze estimation methods that only require an off-the-shelf camera have significantly improved but they are still not yet widely used in the human-computer interaction (HCI) community. This is partly because it remains unclear how they perform compared to model-based approaches as well as dominant, special-purpose eye tracking equipment. To address this limitation, we evaluate the performance of state-of-the-art appearance-based gaze estimation for interaction scenarios with and without personal calibration, indoors and outdoors, for different sensing distances, as well as for users with and without glasses. We discuss the obtained findings and their implications for the most important gaze-based applications, namely explicit eye input, attentive user interfaces, gaze-based user modelling, and passive eye monitoring. To democratise the use of appearance-based gaze estimation and interaction in HCI, we finally present OpenGaze (www.opengaze.org), the first software toolkit for appearance-based gaze estimation and interaction.

doi: 10.1145/3290605.3300646

Code: http://www.opengaze.org/

@inproceedings{zhang19_chi, title = {Evaluation of Appearance-Based Methods and Implications for Gaze-Based Applications}, author = {Zhang, Xucong and Sugano, Yusuke and Bulling, Andreas}, year = {2019}, booktitle = {Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI)}, pages = {1--13}, doi = {10.1145/3290605.3300646} }

2018

Journal Articles

DeepBugs: a Learning Approach to Name-based Bug Detection

Michael Pradel, Koushik Sen

Proceedings of the ACM on Programming Languages, 2(OOPSLA), pp. 147:1–147:25, 2018.

Natural language elements in source code, e.g., the names of variables and functions, convey useful information. However, most existing bug detection tools ignore this information and therefore miss some classes of bugs. The few existing name-based bug detection approaches reason about names on a syntactic level and rely on manually designed and tuned algorithms to detect bugs. This paper presents DeepBugs, a learning approach to name-based bug detection, which reasons about names based on a semantic representation and which automatically learns bug detectors instead of manually writing them. We formulate bug detection as a binary classification problem and train a classifier that distinguishes correct from incorrect code. To address the challenge that effectively learning a bug detector requires examples of both correct and incorrect code, we create likely incorrect code examples from an existing corpus of code through simple code transformations. A novel insight learned from our work is that learning from artificially seeded bugs yields bug detectors that are effective at finding bugs in real-world code. We implement our idea into a framework for learning-based and name-based bug detection. Three bug detectors built on top of the framework detect accidentally swapped function arguments, incorrect binary operators, and incorrect operands in binary operations. Applying the approach to a corpus of 150,000 JavaScript files yields bug detectors that have a high accuracy (between 89% and 95%), are very efficient (less than 20 milliseconds per analyzed file), and reveal 102 programming mistakes (with 68% true positive rate) in real-world code.

doi: 10.1145/3276517

@article{pradel18_pl, title = {DeepBugs: a Learning Approach to Name-based Bug Detection}, author = {Pradel, Michael and Sen, Koushik}, year = {2018}, journal = {Proceedings of the ACM on Programming Languages}, volume = {2}, number = {{OOPSLA}}, pages = {147:1--147:25}, doi = {10.1145/3276517} }

Conference Papers

Bilingual Sentiment Embeddings: Joint Projection of Sentiment Across Languages

Jeremy Barnes, Roman Klinger, Sabine Schulte im Walde

Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (ACL), pp. 2483–2493, 2018.

Sentiment analysis in low-resource languages suffers from a lack of annotated corpora to estimate high-performing models. Machine translation and bilingual word embeddings provide some relief through cross-lingual sentiment approaches. However, they either require large amounts of parallel data or do not sufficiently capture sentiment information. We introduce Bilingual Sentiment Embeddings (BLSE), which jointly represent sentiment information in a source and target language. This model only requires a small bilingual lexicon, a source-language corpus annotated for sentiment, and monolingual word embeddings for each language. We perform experiments on three language combinations (Spanish, Catalan, Basque) for sentence-level cross-lingual sentiment classification and find that our model significantly outperforms state-of-the-art methods on four out of six experimental setups, as well as capturing complementary information to machine translation. Our analysis of the resulting embedding space provides evidence that it represents sentiment information in the resource-poor target language without any annotated data in that language.

doi: 10.18653/v1/P18-1231

@inproceedings{barnes18_acl, title = {Bilingual Sentiment Embeddings: Joint Projection of Sentiment Across Languages}, author = {Barnes, Jeremy and Klinger, Roman and {Schulte im Walde}, Sabine}, year = {2018}, booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (ACL)}, pages = {2483--2493}, doi = {10.18653/v1/P18-1231} }
Learning Sequence Encoders for Temporal Knowledge Graph Completion

Alberto García-Durán, Sebastijan Dumančić, Mathias Niepert

Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 4816–4821, 2018.

Research on link prediction in knowledge graphs has mainly focused on static multi-relational data. In this work we consider temporal knowledge graphs where relations between entities may only hold for a time interval or a specific point in time. In line with previous work on static knowledge graphs, we propose to address this problem by learning latent entity and relation type representations. To incorporate temporal information, we utilize recurrent neural networks to learn time-aware representations of relation types which can be used in conjunction with existing latent factorization methods. The proposed approach is shown to be robust to common challenges in real-world KGs: the sparsity and heterogeneity of temporal expressions. Experiments show the benefits of our approach on four temporal KGs. The data sets are available under a permissive BSD-3 license.

doi: 10.18653/v1/D18-1516

@inproceedings{garciaduran18_emnlp, title = {Learning Sequence Encoders for Temporal Knowledge Graph Completion}, author = {García-Durán, Alberto and Dumančić, Sebastijan and Niepert, Mathias}, year = {2018}, booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)}, pages = {4816--4821}, doi = {10.18653/v1/D18-1516} }
KBLRN: End-to-End Learning of Knowledge Base Representations with Latent, Relational, and Numerical Features

Alberto García-Durán, Mathias Niepert

Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (UAI), pp. 1–10, 2018.

We present KBLRN, a framework for end-to-end learning of knowledge base representations from latent, relational, and numerical features. KBLRN integrates feature types with a novel combination of neural representation learning and probabilistic product of experts models. To the best of our knowledge, KBLRN is the first approach that learns representations of knowledge bases by integrating latent, relational, and numerical features. We show that instances of KBLRN outperform existing methods on a range of knowledge base completion tasks. We contribute novel data sets enriching commonly used knowledge base completion benchmarks with numerical features. The data sets are available under a permissive BSD-3 license. We also investigate the impact numerical features have on the KB completion performance of KBLRN.

Paper: http://auai.org/uai2018/proceedings/papers/149.pdf

@inproceedings{garciaduran18_uai, title = {KBLRN: End-to-End Learning of Knowledge Base Representations with Latent, Relational, and Numerical Features}, author = {García-Durán, Alberto and Niepert, Mathias}, year = {2018}, booktitle = {Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (UAI)}, pages = {1--10}, url = {http://auai.org/uai2018/proceedings/papers/149.pdf} }
Training Person-Specific Gaze Estimators from Interactions with Multiple Devices

Xucong Zhang, Michael Xuelin Huang, Yusuke Sugano, Andreas Bulling

Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 1–12, 2018.

Learning-based gaze estimation has significant potential to enable attentive user interfaces and gaze-based interaction on the billions of camera-equipped handheld devices and ambient displays. While training accurate person- and device-independent gaze estimators remains challenging, person-specific training is feasible but requires tedious data collection for each target device. To address these limitations, we present the first method to train person-specific gaze estimators across multiple devices. At the core of our method is a single convolutional neural network with shared feature extraction layers and device-specific branches that we train from face images and corresponding on-screen gaze locations. Detailed evaluations on a new dataset of interactions with five common devices (mobile phone, tablet, laptop, desktop computer, smart TV) and three common applications (mobile game, text editing, media center) demonstrate the significant potential of cross-device training. We further explore training with gaze locations derived from natural interactions, such as mouse or touch input.

doi: 10.1145/3173574.3174198

@inproceedings{zhang18_chi, title = {Training Person-Specific Gaze Estimators from Interactions with Multiple Devices}, author = {Zhang, Xucong and Huang, Michael Xuelin and Sugano, Yusuke and Bulling, Andreas}, year = {2018}, booktitle = {Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI)}, pages = {1--12}, doi = {10.1145/3173574.3174198} }

2017

Journal Articles

A Bernstein-type Inequality for Some Mixing Processes and Dynamical Systems with an Application to Learning

Hanyuan Hang, Ingo Steinwart

Annals of Statistics, 45, pp. 708–743, 2017.

We establish a Bernstein-type inequality for a class of stochastic processes that includes the classical geometrically φ-mixing processes, Rio’s generalization of these processes and many time-discrete dynamical systems. Modulo a logarithmic factor and some constants, our Bernstein-type inequality coincides with the classical Bernstein inequality for i.i.d. data. We further use this new Bernstein-type inequality to derive an oracle inequality for generic regularized empirical risk minimization algorithms and data generated by such processes. Applying this oracle inequality to support vector machines using the Gaussian kernels for binary classification, we obtain essentially the same rate as for i.i.d. processes, and for least squares and quantile regression; it turns out that the resulting learning rates match, up to some arbitrarily small extra term in the exponent, the optimal rates for i.i.d. processes.

doi: 10.1214/16-AOS1465

@article{hang17_as, title = {A {B}ernstein-type Inequality for Some Mixing Processes and Dynamical Systems with an Application to Learning}, author = {Hang, Hanyuan and Steinwart, Ingo}, year = {2017}, journal = {Annals of Statistics}, volume = {45}, pages = {708--743}, doi = {10.1214/16-AOS1465} }

Conference Papers

Learning Graph Representations with Embedding Propagation

Alberto García-Durán, Mathias Niepert

Proceedings of Advances in Neural Information Processing Systems (NeurIPS), pp. 5125–5136, 2017.

We propose Embedding Propagation (EP), an unsupervised learning framework for graph-structured data. EP learns vector representations of graphs by passing two types of messages between neighboring nodes. Forward messages consist of label representations such as representations of words and other attributes associated with the nodes. Backward messages consist of gradients that result from aggregating the label representations and applying a reconstruction loss. Node representations are finally computed from the representation of their labels. With significantly fewer parameters and hyperparameters an instance of EP is competitive with and often outperforms state of the art unsupervised and semi-supervised learning methods on a range of benchmark data sets.

doi: 10.5555/3295222.3295265

Paper: https://proceedings.neurips.cc/paper/2017/file/e0688d13958a19e087e123148555e4b4-Paper.pdf

@inproceedings{garciaduran17_neurips, title = {Learning Graph Representations with Embedding Propagation}, author = {García-Durán, Alberto and Niepert, Mathias}, year = {2017}, booktitle = {Proceedings of Advances in Neural Information Processing Systems (NeurIPS)}, pages = {5125--5136}, doi = {10.5555/3295222.3295265}, url = {https://proceedings.neurips.cc/paper/2017/file/e0688d13958a19e087e123148555e4b4-Paper.pdf} }
Gaze Embeddings for Zero-Shot Image Classification

Nour Karessli, Zeynep Akata, Bernt Schiele, Andreas Bulling

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6412–6421, 2017.

Zero-shot image classification using auxiliary information, such as attributes describing discriminative object properties, requires time-consuming annotation by domain experts. We instead propose a method that relies on human gaze as auxiliary information, exploiting that even non-expert users have a natural ability to judge class membership. We present a data collection paradigm that involves a discrimination task to increase the information content obtained from gaze data. Our method extracts discriminative descriptors from the data and learns a compatibility function between image and gaze using three novel gaze embeddings: Gaze Histograms (GH), Gaze Features with Grid (GFG) and Gaze Features with Sequence (GFS). We introduce two new gaze-annotated datasets for fine-grained image classification and show that human gaze data is indeed class discriminative, provides a competitive alternative to expert-annotated attributes, and outperforms other baselines for zero-shot image classification.

doi: 10.1109/CVPR.2017.679

@inproceedings{karessli17_cvpr, title = {Gaze Embeddings for Zero-Shot Image Classification}, author = {Karessli, Nour and Akata, Zeynep and Schiele, Bernt and Bulling, Andreas}, year = {2017}, booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, pages = {6412--6421}, doi = {10.1109/CVPR.2017.679} }
Everyday Eye Contact Detection Using Unsupervised Gaze Target Discovery

Xucong Zhang, Yusuke Sugano, Andreas Bulling

Proceedings of the ACM Symposium on User Interface Software and Technology (UIST), pp. 193–203, 2017.

Eye contact is an important non-verbal cue in social signal processing and promising as a measure of overt attention in human-object interactions and attentive user interfaces. However, robust detection of eye contact across different users, gaze targets, camera positions, and illumination conditions is notoriously challenging. We present a novel method for eye contact detection that combines a state-of-the-art appearance-based gaze estimator with a novel approach for unsupervised gaze target discovery, i.e. without the need for tedious and time-consuming manual data annotation. We evaluate our method in two real-world scenarios: detecting eye contact at the workplace, including on the main work display, from cameras mounted to target objects, as well as during everyday social interactions with the wearer of a head-mounted egocentric camera. We empirically evaluate the performance of our method in both scenarios and demonstrate its effectiveness for detecting eye contact independent of target object type and size, camera position, and user and recording environment.

doi: 10.1145/3126594.3126614

@inproceedings{zhang17_uist, title = {Everyday Eye Contact Detection Using Unsupervised Gaze Target Discovery}, author = {Zhang, Xucong and Sugano, Yusuke and Bulling, Andreas}, year = {2017}, booktitle = {Proceedings of the ACM Symposium on User Interface Software and Technology (UIST)}, pages = {193--203}, doi = {10.1145/3126594.3126614} }

2016

Conference Papers

Automatic Semantic Classification of German Preposition Types: Comparing Hard and Soft Clustering Approaches across Features

Maximilian Köper, Sabine Schulte im Walde

Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (ACL), pp. 256–263, 2016.

This paper addresses an automatic classification of preposition types in German, comparing hard and soft clustering approaches and various window- and syntax-based co-occurrence features. We show that (i) the semantically most salient preposition features (i.e., subcategorised nouns) are the most successful, and that (ii) soft clustering approaches are required for the task but reveal quite different attitudes towards predicting ambiguity.

doi: 10.18653/v1/P16-2042

@inproceedings{koeper16_acl, title = {Automatic Semantic Classification of German Preposition Types: Comparing Hard and Soft Clustering Approaches across Features}, author = {Köper, Maximilian and {Schulte im Walde}, Sabine}, year = {2016}, booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (ACL)}, pages = {256--263}, doi = {10.18653/v1/P16-2042} }
Learning Convolutional Neural Networks for Graphs

Mathias Niepert, Mohamed Ahmed, Konstantin Kutzkov

Proceedings of the 33rd International Conference on Machine Learning (ICML), pp. 2014–2023, 2016.

Numerous important problems can be framed as learning from graph data. We propose a framework for learning convolutional neural networks for arbitrary graphs. These graphs may be undirected, directed, and with both discrete and continuous node and edge attributes. Analogous to image-based convolutional networks that operate on locally connected regions of the input, we present a general approach to extracting locally connected regions from graphs. Using established benchmark data sets, we demonstrate that the learned feature representations are competitive with state of the art graph kernels and that their computation is highly efficient.

doi: 10.5555/3045390.3045603

@inproceedings{niepert16_icml, title = {Learning Convolutional Neural Networks for Graphs}, author = {Niepert, Mathias and Ahmed, Mohamed and Kutzkov, Konstantin}, year = {2016}, booktitle = {Proceedings of the 33rd International Conference on Machine Learning (ICML)}, pages = {2014--2023}, doi = {10.5555/3045390.3045603} }
Integrating Distributional Lexical Contrast into Word Embeddings for Antonym-Synonym Distinction

Kim-Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu

Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (ACL), pp. 454–459, 2016.

We propose a novel vector representation that integrates lexical contrast into distributional vectors and strengthens the most salient features for determining degrees of word similarity. The improved vectors significantly outperform standard models and distinguish antonyms from synonyms with an average precision of 0.66–0.76 across word classes (adjectives, nouns, verbs). Moreover, we integrate the lexical contrast vectors into the objective function of a skip-gram model. The novel embedding outperforms state-of-the-art models on predicting word similarities in SimLex-999, and on distinguishing antonyms from synonyms.

doi: 10.18653/v1/P16-2074

@inproceedings{nguyen16_acl, title = {Integrating Distributional Lexical Contrast into Word Embeddings for Antonym-Synonym Distinction}, author = {Nguyen, Kim-Anh and {Schulte im Walde}, Sabine and Vu, Ngoc Thang}, year = {2016}, booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (ACL)}, pages = {454--459}, doi = {10.18653/v1/P16-2074} }
Discriminative Gaifman Models

Mathias Niepert

Proceedings of Advances in Neural Information Processing Systems (NeurIPS), pp. 1–9, 2016.

We present discriminative Gaifman models, a novel family of relational machine learning models. Gaifman models learn feature representations bottom up from representations of locally connected and bounded-size regions of knowledge bases (KBs). Considering local and bounded-size neighborhoods of knowledge bases renders logical inference and learning tractable, mitigates the problem of overfitting, and facilitates weight sharing. Gaifman models sample neighborhoods of knowledge bases so as to make the learned relational models more robust to missing objects and relations which is a common situation in open-world KBs. We present the core ideas of Gaifman models and apply them to large-scale relational learning problems. We also discuss the ways in which Gaifman models relate to some existing relational machine learning approaches.

doi: 10.5555/3157382.3157479

Paper: https://proceedings.neurips.cc/paper/2016/file/7c4ede33a62160a19586f6e26eaefacf-Paper.pdf

@inproceedings{niepert16_neurips, title = {Discriminative Gaifman Models}, author = {Niepert, Mathias}, year = {2016}, booktitle = {Proceedings of Advances in Neural Information Processing Systems (NeurIPS)}, pages = {1--9}, doi = {10.5555/3157382.3157479}, url = {https://proceedings.neurips.cc/paper/2016/file/7c4ede33a62160a19586f6e26eaefacf-Paper.pdf} }
AggreGaze: Collective Estimation of Audience Attention on Public Displays

Yusuke Sugano, Xucong Zhang, Andreas Bulling

Proceedings of the ACM Symposium on User Interface Software and Technology (UIST), pp. 821–831, 2016.

Gaze is frequently explored in public display research given its importance for monitoring and analysing audience attention. However, current gaze-enabled public display interfaces require either special-purpose eye tracking equipment or explicit personal calibration for each individual user. We present AggreGaze, a novel method for estimating spatio-temporal audience attention on public displays. Our method requires only a single off-the-shelf camera attached to the display, does not require any personal calibration, and provides visual attention estimates across the full display. We achieve this by 1) compensating for errors of state-of-the-art appearance-based gaze estimation methods through on-site training data collection, and by 2) aggregating uncalibrated and thus inaccurate gaze estimates of multiple users into joint attention estimates. We propose different visual stimuli for this compensation: a standard 9-point calibration, moving targets, text and visual stimuli embedded into the display content, as well as normal video content. Based on a two-week deployment in a public space, we demonstrate the effectiveness of our method for estimating attention maps that closely resemble ground-truth audience gaze distributions.

doi: 10.1145/2984511.2984536

@inproceedings{sugano16_uist, title = {AggreGaze: Collective Estimation of Audience Attention on Public Displays}, author = {Sugano, Yusuke and Zhang, Xucong and Bulling, Andreas}, year = {2016}, booktitle = {Proceedings of the ACM Symposium on User Interface Software and Technology (UIST)}, pages = {821--831}, doi = {10.1145/2984511.2984536} }
A 3D Morphable Eye Region Model for Gaze Estimation

Erroll Wood, Tadas Baltrušaitis, Louis-Philippe Morency, Peter Robinson, Andreas Bulling

Proceedings of the European Conference on Computer Vision (ECCV), pp. 297–313, 2016.

Morphable face models are a powerful tool, but have previously failed to model the eye accurately due to complexities in its material and motion. We present a new multi-part model of the eye that includes a morphable model of the facial eye region, as well as an anatomy-based eyeball model. It is the first morphable model that accurately captures eye region shape, since it was built from high-quality head scans. It is also the first to allow independent eyeball movement, since we treat it as a separate part. To showcase our model we present a new method for illumination- and head-pose–invariant gaze estimation from a single RGB image. We fit our model to an image through analysis-by-synthesis, solving for eye region shape, texture, eyeball pose, and illumination simultaneously. The fitted eyeball pose parameters are then used to estimate gaze direction. Through evaluation on two standard datasets we show that our method generalizes to both webcam and high-quality camera images, and outperforms a state-of-the-art CNN method achieving a gaze estimation accuracy of 9.44° in a challenging user-independent scenario.

doi: 10.1007/978-3-319-46448-0_18

@inproceedings{wood16_eccv, title = {A 3D Morphable Eye Region Model for Gaze Estimation}, author = {Wood, Erroll and Baltru{\v{s}}aitis, Tadas and Morency, Louis-Philippe and Robinson, Peter and Bulling, Andreas}, year = {2016}, booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)}, pages = {297--313}, doi = {10.1007/978-3-319-46448-0_18} }
Spatio-Temporal Modeling and Prediction of Visual Attention in Graphical User Interfaces

Pingmei Xu, Yusuke Sugano, Andreas Bulling

Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 3299–3310, 2016.

We present a computational model to predict users’ spatio-temporal visual attention for WIMP-style (windows, icons, mouse, pointer) graphical user interfaces. Like existing models of bottom-up visual attention in computer vision, our model does not require any eye tracking equipment. Instead, it predicts attention solely using information available to the interface, specifically users’ mouse and keyboard input as well as the UI components they interact with. To study our model in a principled way we further introduce a method to synthesize user interface layouts that are functionally equivalent to real-world interfaces, such as from Gmail, Facebook, or GitHub. We first quantitatively analyze attention allocation and its correlation with user input and UI components using ground-truth gaze, mouse, and keyboard data of 18 participants performing a text editing task. We then show that our model predicts attention maps more accurately than state-of-the-art methods. Our results underline the significant potential of spatio-temporal attention modeling for user interface evaluation, optimization, or even simulation.

doi: 10.1145/2858036.2858479

@inproceedings{xu16_chi, title = {Spatio-Temporal Modeling and Prediction of Visual Attention in Graphical User Interfaces}, author = {Xu, Pingmei and Sugano, Yusuke and Bulling, Andreas}, year = {2016}, booktitle = {Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI)}, pages = {3299--3310}, doi = {10.1145/2858036.2858479} }

2015

Journal Articles

Fully Adaptive Density-Based Clustering

Ingo Steinwart

Annals of Statistics, 43(5), pp. 2132–2167, 2015.

The clusters of a distribution are often defined by the connected components of a density level set. However, this definition depends on the user-specified level. We address this issue by proposing a simple, generic algorithm, which uses an almost arbitrary level set estimator to estimate the smallest level at which there is more than one connected component. In the case where this algorithm is fed with histogram-based level set estimates, we provide a finite sample analysis, which is then used to show that the algorithm consistently estimates both the smallest level and the corresponding connected components. We further establish rates of convergence for the two estimation problems, and last but not least, we present a simple, yet adaptive strategy for determining the width-parameter of the involved density estimator in a data-dependent way.

doi: 10.1214/15-AOS1331

@article{Steinwart15_as, title = {Fully Adaptive Density-Based Clustering}, author = {Steinwart, Ingo}, year = {2015}, journal = {Annals of Statistics}, volume = {43}, number = {5}, pages = {2132--2167}, doi = {10.1214/15-AOS1331} }
Towards an Axiomatic Approach to Hierarchical Clustering of Measures

Philipp Thomann, Ingo Steinwart, Nico Schmid

Journal of Machine Learning Research (JMLR), 16, pp. 1949–2002, 2015.

We propose some axioms for hierarchical clustering of probability measures and investigate their ramifications. The basic idea is to let the user stipulate the clusters for some elementary measures. This is done without the need of any notion of metric, similarity or dissimilarity. Our main results then show that for each suitable choice of user-defined clustering on elementary measures we obtain a unique notion of clustering on a large set of distributions satisfying a set of additivity and continuity axioms. We illustrate the developed theory by numerous examples including some with and some without a density.

Paper: https://www.jmlr.org/papers/v16/thomann15a.html

@article{thomann15_jmlr, title = {Towards an Axiomatic Approach to Hierarchical Clustering of Measures}, author = {Thomann, Philipp and Steinwart, Ingo and Schmid, Nico}, year = {2015}, journal = {Journal of Machine Learning Research (JMLR)}, volume = {16}, pages = {1949--2002}, url = {https://www.jmlr.org/papers/v16/thomann15a.html} }

Conference Papers

Orbits: Enabling Gaze Interaction in Smart Watches using Moving Targets

Augusto Esteves, Eduardo Velloso, Andreas Bulling, Hans Gellersen

Proceedings of the ACM Symposium on User Interface Software and Technology (UIST), pp. 457–466, 2015.

We introduce Orbits, a novel gaze interaction technique that enables hands-free input on smart watches. The technique relies on moving controls to leverage the smooth pursuit movements of the eyes and detect whether and at which control the user is looking. In Orbits, controls include targets that move in a circular trajectory in the face of the watch, and can be selected by following the desired one for a small amount of time. We conducted two user studies to assess the technique’s recognition and robustness, which demonstrated how Orbits is robust against false positives triggered by natural eye movements and how it presents a hands-free, high-accuracy way of interacting with smart watches using off-the-shelf devices. Finally, we developed three example interfaces built with Orbits: a music player, a notifications face plate and a missed call menu. Despite relying on moving controls – very unusual in current HCI interfaces – these were generally well received by participants in a third and final study.

doi: 10.1145/2807442.2807499

@inproceedings{esteves15_uist, title = {Orbits: Enabling Gaze Interaction in Smart Watches using Moving Targets}, author = {Esteves, Augusto and Velloso, Eduardo and Bulling, Andreas and Gellersen, Hans}, year = {2015}, booktitle = {Proceedings of the ACM Symposium on User Interface Software and Technology (UIST)}, pages = {457--466}, doi = {10.1145/2807442.2807499} }
Prediction of Search Targets From Fixations in Open-world Settings

Hosnieh Sattar, Sabine Müller, Mario Fritz, Andreas Bulling

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 981–990, 2015.

Abstract Links BibTeX Project

Previous work on predicting the target of visual search from human fixations only considered closed-world settings in which training labels are available and predictions are performed for a known set of potential targets. In this work we go beyond the state of the art by studying search target prediction in an open-world setting in which we no longer assume that we have fixation data to train for the search targets. We present a dataset containing fixation data of 18 users searching for natural images from three image categories within synthesised image collages of about 80 images. In a closed-world baseline experiment we show that we can predict the correct target image out of a candidate set of five images. We then present a new problem formulation for search target prediction in the open-world setting that is based on learning compatibilities between fixations and potential targets.

doi: 10.1109/CVPR.2015.7298700

@inproceedings{sattar15_cvpr, title = {Prediction of Search Targets From Fixations in Open-world Settings}, author = {Sattar, Hosnieh and M{\"{u}}ller, Sabine and Fritz, Mario and Bulling, Andreas}, year = {2015}, booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, pages = {981--990}, doi = {10.1109/CVPR.2015.7298700} }
Rendering of Eyes for Eye-Shape Registration and Gaze Estimation

Erroll Wood, Tadas Baltrušaitis, Xucong Zhang, Yusuke Sugano, Peter Robinson, Andreas Bulling

Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 3756–3764, 2015.

Abstract Links BibTeX Project

Images of the eye are key in several computer vision problems, such as shape registration and gaze estimation. Recent large-scale supervised methods for these problems require time-consuming data collection and manual annotation, which can be unreliable. We propose synthesizing perfectly labelled photo-realistic training data in a fraction of the time. We used computer graphics techniques to build a collection of dynamic eye-region models from head scan geometry. These were randomly posed to synthesize close-up eye images for a wide range of head poses, gaze directions, and illumination conditions. We used our model’s controllability to verify the importance of realistic illumination and shape variations in eye-region training data. Finally, we demonstrate the benefits of our synthesized training data (SynthesEyes) by out-performing state-of-the-art methods for eye-shape registration as well as cross-dataset appearance-based gaze estimation in the wild.

doi: 10.1109/ICCV.2015.428

@inproceedings{wood15_iccv, title = {Rendering of Eyes for Eye-Shape Registration and Gaze Estimation}, author = {Wood, Erroll and Baltru{\v{s}}aitis, Tadas and Zhang, Xucong and Sugano, Yusuke and Robinson, Peter and Bulling, Andreas}, year = {2015}, booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)}, pages = {3756--3764}, doi = {10.1109/ICCV.2015.428} }
Appearance-based Gaze Estimation in the Wild

Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4511–4520, 2015.

Abstract Links BibTeX Project

Appearance-based gaze estimation is believed to work well in real-world settings, but existing datasets were collected under controlled laboratory conditions and methods were not evaluated across multiple datasets. In this work we study appearance-based gaze estimation in the wild. We present the MPIIGaze dataset that contains 213,659 images we collected from 15 participants during natural everyday laptop use over more than three months. Our dataset is significantly more variable than existing datasets with respect to appearance and illumination. We also present a method for in-the-wild appearance-based gaze estimation using multimodal convolutional neural networks, which significantly outperforms state-of-the-art methods in the most challenging cross-dataset evaluation setting. We present an extensive evaluation of several state-of-the-art image-based gaze estimation algorithms on three current datasets, including our own. This evaluation provides clear insights and allows us to identify key research challenges of gaze estimation in the wild.

doi: 10.1109/CVPR.2015.7299081

@inproceedings{zhang15_cvpr, title = {Appearance-based Gaze Estimation in the Wild}, author = {Zhang, Xucong and Sugano, Yusuke and Fritz, Mario and Bulling, Andreas}, year = {2015}, booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, pages = {4511--4520}, doi = {10.1109/CVPR.2015.7299081} }

2014

Journal Articles

A Tutorial on Human Activity Recognition Using Body-worn Inertial Sensors

Andreas Bulling, Ulf Blanke, Bernt Schiele

ACM Computing Surveys, 46(3), pp. 1–33, 2014.

Abstract Links BibTeX Project

The last 20 years have seen ever-increasing research activity in the field of human activity recognition. As activity recognition has matured considerably, so has the number of challenges in designing, implementing and evaluating activity recognition systems. This tutorial aims to provide a comprehensive hands-on introduction for newcomers to the field of human activity recognition. It specifically focuses on activity recognition using on-body inertial sensors. We first discuss the key research challenges that human activity recognition shares with general pattern recognition and identify those challenges that are specific to human activity recognition. We then describe the concept of an activity recognition chain (ARC) as a general-purpose framework for designing and evaluating activity recognition systems. We detail each component of the framework, provide references to related research and introduce the best-practice methods developed by the activity recognition research community. We conclude with the educational example problem of recognising different hand gestures from inertial sensors attached to the upper and lower arm. We illustrate how each component of this framework can be implemented for this specific activity recognition problem and demonstrate how different implementations compare and how they impact overall recognition performance.

doi: 10.1145/2499621

@article{bulling14_csur, title = {A Tutorial on Human Activity Recognition Using Body-worn Inertial Sensors}, author = {Bulling, Andreas and Blanke, Ulf and Schiele, Bernt}, year = {2014}, journal = {ACM Computing Surveys}, volume = {46}, number = {3}, pages = {1--33}, doi = {10.1145/2499621} }

Conference Papers

Chasing Hypernyms in Vector Spaces with Entropy

Enrico Santus, Alessandro Lenci, Qin Lu, Sabine Schulte im Walde

Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers) (EACL), pp. 38–42, 2014.

Abstract Links BibTeX Project

In this paper, we introduce SLQS, a new entropy-based measure for the unsupervised identification of hypernymy and its directionality in Distributional Semantic Models (DSMs). SLQS is assessed through two tasks: (i.) identifying the hypernym in hyponym-hypernym pairs, and (ii.) discriminating hypernymy among various semantic relations. In both tasks, SLQS outperforms other state-of-the-art measures.

doi: 10.3115/v1/E14-4008

@inproceedings{santus14_eacl, title = {Chasing Hypernyms in Vector Spaces with Entropy}, author = {Santus, Enrico and Lenci, Alessandro and Lu, Qin and {Schulte im Walde}, Sabine}, year = {2014}, booktitle = {Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, (Volume 2: Short Papers) (EACL)}, pages = {38--42}, doi = {10.3115/v1/E14-4008} }

2013

Conference Papers

A Multimodal LDA Model integrating Textual, Cognitive and Visual Modalities

Stephen Roller, Sabine Schulte im Walde

Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1146–1157, 2013.

Abstract BibTeX Project

Recent investigations into grounded models of language have shown that holistic views of language and perception can provide higher performance than independent views. In this work, we improve a two-dimensional multimodal version of Latent Dirichlet Allocation (Andrews et al., 2009) in various ways. (1) We outperform text-only models in two different evaluations, and demonstrate that low-level visual features are directly compatible with the existing model. (2) We present a novel way to integrate visual features into the LDA model using unsupervised clusters of images. The clusters are directly interpretable and improve on our evaluation tasks. (3) We provide two novel ways to extend the bimodal models to support three or more modalities. We find that the three-, four-, and five-dimensional models significantly outperform models using only one or two modalities, and that nontextual modalities each provide separate, disjoint knowledge that cannot be forced into a shared, latent structure.

@inproceedings{roller13_emnlp, title = {A Multimodal LDA Model integrating Textual, Cognitive and Visual Modalities}, author = {Roller, Stephen and {Schulte im Walde}, Sabine}, year = {2013}, booktitle = {Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP)}, pages = {1146--1157} }
Using Subcategorization Knowledge to improve Case Prediction for Translation to German

Marion Weller, Alexander Fraser, Sabine Schulte im Walde

Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 593–603, 2013.

Abstract BibTeX Project

This paper demonstrates the need and impact of subcategorization information for SMT. We combine (i) features on source-side syntactic subcategorization and (ii) an external knowledge base with quantitative, dependency-based information about target-side subcategorization frames. A manual evaluation of an English-to-German translation task shows that the subcategorization information has a positive impact on translation quality through better prediction of case.

@inproceedings{weller13_acl, title = {Using Subcategorization Knowledge to improve Case Prediction for Translation to German}, author = {Weller, Marion and Fraser, Alexander and {Schulte im Walde}, Sabine}, year = {2013}, booktitle = {Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL)}, pages = {593--603} }

2011

Journal Articles

Eye Movement Analysis for Activity Recognition Using Electrooculography

Andreas Bulling, Jamie A. Ward, Hans Gellersen, Gerhard Tröster

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 33(4), pp. 741–753, 2011.

Abstract Links BibTeX Project

In this work we investigate eye movement analysis as a new sensing modality for activity recognition. Eye movement data was recorded using an electrooculography (EOG) system. We first describe and evaluate algorithms for detecting three eye movement characteristics from EOG signals - saccades, fixations, and blinks - and propose a method for assessing repetitive patterns of eye movements. We then devise 90 different features based on these characteristics and select a subset of them using minimum redundancy maximum relevance feature selection (mRMR). We validate the method using an eight participant study in an office environment using an example set of five activity classes: copying a text, reading a printed paper, taking hand-written notes, watching a video, and browsing the web. We also include periods with no specific activity (the NULL class). Using a support vector machine (SVM) classifier and a person-independent (leave-one-out) training scheme, we obtain an average precision of 76.1% and recall of 70.5% over all classes and participants. The work demonstrates the promise of eye-based activity recognition (EAR) and opens up discussion on the wider applicability of EAR to other activities that are difficult, or even impossible, to detect using common sensing modalities.

doi: 10.1109/TPAMI.2010.86

@article{bulling11_pami, title = {Eye {M}ovement {A}nalysis for {A}ctivity {R}ecognition {U}sing {E}lectrooculography}, author = {Bulling, Andreas and Ward, Jamie A. and Gellersen, Hans and Tr{\"{o}}ster, Gerhard}, year = {2011}, journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)}, volume = {33}, number = {4}, pages = {741--753}, doi = {10.1109/TPAMI.2010.86} }

Conference Papers

Optimal Learning Rates for Least Squares SVMs Using Gaussian Kernels

Mona Eberts, Ingo Steinwart

Advances in Neural Information Processing Systems (NeurIPS), pp. 1539–1547, 2011.

Abstract Links BibTeX Project

We prove a new oracle inequality for support vector machines with Gaussian RBF kernels solving the regularized least squares regression problem. To this end, we apply the modulus of smoothness. With the help of the new oracle inequality we then derive learning rates that can also be achieved by a simple data-dependent parameter selection method. Finally, it turns out that our learning rates are asymptotically optimal for regression functions satisfying certain standard smoothness conditions.

Paper: https://proceedings.neurips.cc/paper/2011/file/51ef186e18dc00c2d31982567235c559-Paper.pdf

@inproceedings{eberts11_neurips, title = {Optimal Learning Rates for Least Squares {SVM}s Using {G}aussian Kernels}, author = {Eberts, Mona and Steinwart, Ingo}, year = {2011}, booktitle = {Advances in Neural Information Processing Systems (NeurIPS)}, volume = {24}, pages = {1539--1547}, url = {https://proceedings.neurips.cc/paper/2011/file/51ef186e18dc00c2d31982567235c559-Paper.pdf} }

2010

Conference Papers

Universal Kernels on Non-Standard Input Spaces

Andreas Christmann, Ingo Steinwart

Advances in Neural Information Processing Systems (NeurIPS), pp. 406–414, 2010.

Abstract Links BibTeX Project

In recent years, support vector machines (SVMs) have been successfully applied even in situations where the input space X is not necessarily a subset of ℝ^d. Examples include SVMs using probability measures to analyse, e.g., histograms or coloured images, SVMs for text classification and web mining, and SVMs for applications from computational biology using, e.g., kernels for trees and graphs. Moreover, SVMs are known to be consistent to the Bayes risk, if either the input space is a complete separable metric space and the reproducing kernel Hilbert space (RKHS) H ⊂ L_p (P_X) is dense, or if the SVM is based on a universal kernel k. So far, however, there are no RKHSs of practical interest known that satisfy these assumptions if X ⊄ ℝ^d. We close this gap by providing a general technique based on Taylor-type kernels to explicitly construct universal kernels on compact metric spaces which are not subsets of ℝ^d. We apply this technique for the following special cases: universal kernels on the set of probability measures, universal kernels based on Fourier transforms, and universal kernels for signal processing.

Paper: https://papers.nips.cc/paper/2010/hash/4e0cb6fb5fb446d1c92ede2ed8780188-Abstract.html

@inproceedings{christmann10_neurips, title = {Universal Kernels on Non-Standard Input Spaces}, author = {Christmann, Andreas and Steinwart, Ingo}, year = {2010}, booktitle = {Advances in Neural Information Processing Systems (NeurIPS)}, volume = {23}, pages = {406--414}, url = {https://papers.nips.cc/paper/2010/hash/4e0cb6fb5fb446d1c92ede2ed8780188-Abstract.html} }

2009

Journal Articles

Consistency of Support Vector Machines for Forecasting the Evolution of an Unknown Ergodic Dynamical System from Observations with Unknown Noise

Ingo Steinwart, Marian Anghel

Annals of Statistics, 37, pp. 841–875, 2009.

Abstract Links BibTeX Project

We consider the problem of forecasting the next (observable) state of an unknown ergodic dynamical system from a noisy observation of the present state. Our main result shows, for example, that support vector machines (SVMs) using Gaussian RBF kernels can learn the best forecaster from a sequence of noisy observations if (a) the unknown observational noise process is bounded and has a summable α-mixing rate and (b) the unknown ergodic dynamical system is defined by a Lipschitz continuous function on some compact subset of ℝ^d and has a summable decay of correlations for Lipschitz continuous functions. In order to prove this result we first establish a general consistency result for SVMs and all stochastic processes that satisfy a mixing notion that is substantially weaker than α-mixing.

doi: 10.1214/07-AOS562

@article{steinwart09_as, title = {Consistency of Support Vector Machines for Forecasting the Evolution of an Unknown Ergodic Dynamical System from Observations with Unknown Noise}, author = {Steinwart, Ingo and Anghel, Marian}, year = {2009}, journal = {Annals of Statistics}, volume = {37}, pages = {841--875}, doi = {10.1214/07-AOS562} }

Conference Papers

Fast Learning from Non-i.i.d. Observations

Ingo Steinwart, Andreas Christmann

Advances in Neural Information Processing Systems (NeurIPS), pp. 1768–1776, 2009.

Abstract Links BibTeX Project

We prove an oracle inequality for generic regularized empirical risk minimization algorithms learning from α-mixing processes. To illustrate this oracle inequality, we use it to derive learning rates for some learning methods including least squares SVMs. Since the proof of the oracle inequality uses recent localization ideas developed for independent and identically distributed (i.i.d.) processes, it turns out that these learning rates are close to the optimal rates known in the i.i.d. case.

Paper: https://papers.nips.cc/paper/2009/hash/a89cf525e1d9f04d16ce31165e139a4b-Abstract.html

@inproceedings{steinwart09_neurips, title = {Fast Learning from Non-i.i.d. Observations}, author = {Steinwart, Ingo and Christmann, Andreas}, year = {2009}, booktitle = {Advances in Neural Information Processing Systems (NeurIPS)}, volume = {22}, pages = {1768--1776}, url = {https://papers.nips.cc/paper/2009/hash/a89cf525e1d9f04d16ce31165e139a4b-Abstract.html} }

2008

Conference Papers

Combining EM Training and the MDL Principle for an Automatic Verb Classification incorporating Selectional Preferences

Sabine Schulte im Walde, Christian Hying, Christian Scheible, Helmut Schmid

Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 496–504, 2008.

Abstract BibTeX Project

This paper presents an innovative, complex approach to semantic verb classification that relies on selectional preferences as verb properties. The probabilistic verb class model underlying the semantic classes is trained by a combination of the EM algorithm and the MDL principle, providing soft clusters with two dimensions (verb senses and subcategorisation frames with selectional preferences) as a result. A language-model-based evaluation shows that after 10 training iterations the verb class model results are above the baseline results.

@inproceedings{schulteimwalde08_acl, title = {Combining EM Training and the MDL Principle for an Automatic Verb Classification incorporating Selectional Preferences}, author = {{Schulte im Walde}, Sabine and Hying, Christian and Scheible, Christian and Schmid, Helmut}, year = {2008}, booktitle = {Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL)}, pages = {496--504} }

Books

Support Vector Machines

Ingo Steinwart, Andreas Christmann

Springer, New York, 2008.

Abstract Links BibTeX Project

Explains the principles that make support vector machines a successful modelling and prediction tool for a variety of applications. Rigorous treatment of state-of-the-art results on support vector machines. Suitable for both graduate students and researchers in statistical machine learning.

doi: 10.1007/978-0-387-77242-4

@book{steinwart08_svm, title = {Support Vector Machines}, author = {Steinwart, Ingo and Christmann, Andreas}, year = {2008}, publisher = {Springer}, address = {New York}, doi = {10.1007/978-0-387-77242-4} }

2007

Journal Articles

Fast rates for support vector machines using Gaussian kernels

Ingo Steinwart, Clint Scovel

Annals of Statistics, 35, pp. 575–607, 2007.

Abstract Links BibTeX Project

For binary classification we establish learning rates up to the order of n^-1 for support vector machines (SVMs) with hinge loss and Gaussian RBF kernels. These rates are in terms of two assumptions on the considered distributions: Tsybakov’s noise assumption to establish a small estimation error, and a new geometric noise condition which is used to bound the approximation error. Unlike previously proposed concepts for bounding the approximation error, the geometric noise assumption does not employ any smoothness assumption.

Paper: https://www.jstor.org/stable/25463569

@article{steinwart07_as, title = {Fast rates for support vector machines using {G}aussian kernels}, author = {Steinwart, Ingo and Scovel, Clint}, year = {2007}, journal = {Annals of Statistics}, volume = {35}, pages = {575--607}, url = {https://www.jstor.org/stable/25463569} }

2005

Journal Articles

A Classification Framework for Anomaly Detection

Ingo Steinwart, Don Hush, Clint Scovel

Journal of Machine Learning Research (JMLR), 6, pp. 211–232, 2005.

Abstract Links BibTeX Project

One way to describe anomalies is by saying that anomalies are not concentrated. This leads to the problem of finding level sets for the data generating density. We interpret this learning problem as a binary classification problem and compare the corresponding classification risk with the standard performance measure for the density level problem. In particular it turns out that the empirical classification risk can serve as an empirical performance measure for the anomaly detection problem. This allows us to compare different anomaly detection algorithms empirically, i.e. with the help of a test set. Furthermore, by the above interpretation we can give a strong justification for the well-known heuristic of artificially sampling ’labeled’ samples, provided that the sampling plan is well chosen. In particular this enables us to propose a support vector machine (SVM) for anomaly detection for which we can easily establish universal consistency. Finally, we report some experiments which compare our SVM to other commonly used methods including the standard one-class SVM.

doi: 10.5555/1046920.1058109

Paper: https://www.jmlr.org/papers/v6/steinwart05a.html

@article{steinwart05_jmlr, title = {A Classification Framework for Anomaly Detection}, author = {Steinwart, Ingo and Hush, Don and Scovel, Clint}, year = {2005}, journal = {Journal of Machine Learning Research (JMLR)}, volume = {6}, pages = {211--232}, doi = {10.5555/1046920.1058109}, url = {https://www.jmlr.org/papers/v6/steinwart05a.html} }

2004

Journal Articles

On Robustness Properties of Convex Risk Minimization Methods for Pattern Recognition

Andreas Christmann, Ingo Steinwart

Journal of Machine Learning Research (JMLR), 5, pp. 1007–1034, 2004.

Abstract Links BibTeX Project

The paper brings together methods from two disciplines: machine learning theory and robust statistics. We argue that robustness is an important aspect and we show that many existing machine learning methods based on the convex risk minimization principle have - besides other good properties - also the advantage of being robust. Robustness properties of machine learning methods based on convex risk minimization are investigated for the problem of pattern recognition. Assumptions are given for the existence of the influence function of the classifiers and for bounds on the influence function. Kernel logistic regression, support vector machines, least squares and the AdaBoost loss function are treated as special cases. Some results on the robustness of such methods are also obtained for the sensitivity curve and the maxbias, which are two other robustness criteria. A sensitivity analysis of the support vector machine is given.

doi: 10.5555/1005332.1016792

Paper: http://www.jmlr.org/papers/volume5/christmann04a/christmann04a.pdf

@article{christmann04_jmlr, title = {On Robustness Properties of Convex Risk Minimization Methods for Pattern Recognition}, author = {Christmann, Andreas and Steinwart, Ingo}, year = {2004}, journal = {Journal of Machine Learning Research (JMLR)}, volume = {5}, pages = {1007--1034}, doi = {10.5555/1005332.1016792}, url = {http://www.jmlr.org/papers/volume5/christmann04a/christmann04a.pdf} }
