Publications

2024

Propagating Ontology Changes to Declarative Mappings in Construction of Knowledge Graphs

Diego Conde-Herreros, Lise Stork, Romana Pernisch, María Poveda-Villalón, Óscar Corcho, and David Chaves-Fraga

In Proceedings of the 5th International Workshop on Knowledge Graph Construction co-located with 21th Extended Semantic Web Conference (ESWC 2024), Hersonissos, Greece, May 27, 2024, vol. 3718, 2024

Knowledge Graphs (KGs) are usually constructed through a set of data transformation pipelines that turn heterogeneous sources into triples following a set of rules. These rules, usually in the form of mapping rules (e.g., RML, R2RML, etc.), are a key resource for the construction of the KG as they describe the relationship between the input data sources and the ontology terms. Several efforts have been made to manage and describe the evolution of the ontology; however, its propagation over interrelated assets (e.g., mapping rules) is commonly done in manual processes. In this paper, we present a preliminary approach to automatically project the evolution of the ontology on the mapping rules used to construct the KG. For each potential change, we analyse the impact on the mappings and the required steps to ensure that the KG is up-to-date w.r.t. the ontology. We implement our solution in fully declarative workflows and demonstrate its benefits in a real-world project in the public procurement domain.

Supporting Companion Planting with the CoPla Ontology

Giacomo Zamprogno, Mark Adamik, Ritten Roothaert, Ameneh Naghdipour, Lise Stork, Patrick Koopmann, Romana Pernisch, Benno Kruit, Jieying Chen, Ilaria Tiddi, and Stefan Schlobach

In KG4S@ESWC, vol. 3753, pp. 29–41, 2024

Sustainability in agriculture is crucial for environmental conservation and ecosystem resilience. Within this context, companion planting stands out as a key practice, leveraging synergies between plants for enhanced growth and pest control. However, its broader adoption is hindered by large-scale knowledge and data integration challenges. To bring companion planting forward and closer to interested users, we engineered a semantically rich ontology, CoPla. We used several automated techniques to extract knowledge from various sources and capture different companion and anti-companion mechanisms. We demonstrate CoPla’s versatility through three applications using different reasoning mechanisms: identifying plant companionships, evaluating, and optimising garden layouts.

When Ontologies met Knowledge Graphs: A Methodology Tale

Romana Pernisch, María Poveda-Villalón, Diego Conde-Herreros, David Chaves-Fraga, and Lise Stork

In The Semantic Web: ESWC 2024 Satellite Events, May, 2024

The current state of the art of knowledge engineering lacks proper methodologies to deal with the ever-changing nature of knowledge. In this short paper, we present a first step towards including the changing nature of knowledge in the knowledge graph lifecycle. We extend the LOT ontology engineering methodology to include activities associated with knowledge graph construction, better reflecting how they are engineered in the real world. Further, we analyse how these lifecycles compare to ontology evolution frameworks and what work is there to be done in the future to step from engineering towards full knowledge graph evolution.

ORKA: An Ontology for Robotic Knowledge Acquisition

Mark Adamik, Romana Pernisch, Ilaria Tiddi, and Stefan Schlobach

In Proceedings of 24th International Conference on Knowledge Engineering and Knowledge Management (EKAW-24), Nov, 2024

Most autonomous agents operating in the real world use perception capabilities and reasoning mechanisms to acquire new knowledge of the environment, where perception capabilities include both the physical sensor devices and the software-based perception pipelines involved in the process. For autonomous agents to be able to adjust and reason over their own perception, knowledge of the sensors and the corresponding perception algorithms is required. We present the Ontology for Robotic Knowledge Acquisition (ORKA), that models the perception pipeline of a robotic agent by representing the sensory, algorithmic and measurement aspects of the perception process, thereby unifying the agent’s sensing with the characteristics of the environment and facilitating the grounding process. The ontology is based on the alignment between SSN and OBOE, linked to external databases as additional knowledge sources for robotic agents, populated with instances from two different robotic use-cases, and evaluated using competency questions and comparisons to related ontologies. A proof of concept use-case is presented to highlight the potential of the ontology.

Advancing Robotic Perception with Perceived-Entity Linking

Mark Adamik, Romana Pernisch, Ilaria Tiddi, and Stefan Schlobach

In The Semantic Web - ISWC 2024, Nov, 2024

The capabilities of current robotic applications are significantly constrained by their limited ability to perceive and understand their surroundings. The Semantic Web aims to offer general, machine-readable knowledge about the world and could be a potential solution to address the information needs of robotic agents. We introduce the Perceived-Entity Linking (PEL) problem as the task of recognizing entities and linking the sensory data of an autonomous agent to a unique identifier in a target knowledge graph. We provide a formal definition of PEL, and propose a PEL baseline based on the YOLO object detection algorithm and a conventional entity linking method as an initial attempt to solve the task. The baseline is evaluated by linking the concepts contained in MS COCO and VisualGenome datasets to WikiData, DBpedia and YAGO as target knowledge graphs. This study makes a first step in allowing robotic agents to leverage the extensive knowledge contained in general-purpose knowledge graphs.

Vision of Knowledge Graph Lifecycle Management within Hybrid Artificial Intelligence Solutions

Romana Pernisch, Hennie Huijgens, Stefan Schlobach, Ruud Mattheij, Frank Benders, Hubert Beusekom, and Freek Bomhof

In Software Lifecycle Management for Knowledge Graphs Workshop Co-located with the ISWC 2024, Nov, 2024

Knowledge Graphs (KGs) are essential components in AI systems, providing structured and interpretable data representations. However, managing the lifecycle of KGs poses significant challenges due to their dynamic nature, requiring continuous updates, validation, and maintenance. This vision paper addresses the critical need for innovative lifecycle management practices for hybrid AI solutions, KGs being part of them. Given advances in software engineering and software lifecycle, we need to learn from their past and investigate their practices to be applied to hybrid AI. This can be best done in collaboration with industry, such as small to middle-sized companies (SMEs). Our work aims to advance the scientific understanding of KG lifecycle management, offering practical tools and methodologies that benefit various industries, including healthcare, finance, and manufacturing. The implementation of such practices will enhance the overall quality and trustworthiness of AI systems, contributing to broader societal acceptance and integration of AI technologies in the future.

2023

How Does Knowledge Evolve in Open Knowledge Graphs?

Axel Polleres, Romana Pernisch, Angela Bonifati, Daniele Dell’Aglio, Daniil Dobriy, Stefania Dumbrava, Lorena Etcheverry, Nicolas Ferranti, Katja Hose, Ernesto Jiménez-Ruiz, Matteo Lissandrini, Ansgar Scherp, Riccardo Tommasini, and Johannes Wachs

Transactions on Graph Data and Knowledge, vol. 1, pp. 11:1–11:59, 2023

Openly available, collaboratively edited Knowledge Graphs (KGs) are key platforms for the collective management of evolving knowledge. The present work aims to provide an analysis of the obstacles related to investigating and processing specifically this central aspect of evolution in KGs. To this end, we discuss (i) the dimensions of evolution in KGs, (ii) the observability of evolution in existing, open, collaboratively constructed Knowledge Graphs over time, and (iii) possible metrics to analyse this evolution. We provide an overview of relevant state-of-the-art research, ranging from metrics developed for Knowledge Graphs specifically to potential methods from related fields such as network science. Additionally, we discuss technical approaches – and their current limitations – related to storing, analysing and processing large and evolving KGs in terms of handling typical KG downstream tasks.

Do you catch my drift? On the usage of embedding methods to measure concept shift in knowledge graphs

Stella Verkijk, Ritten Roothaert, Romana Pernisch, and Stefan Schlobach

In Proceedings of the 12th Knowledge Capture Conference 2023, pp. 70–74, 2023

Automatically detecting and measuring differences between evolving Knowledge Graphs (KGs) has been a topic of investigation for years. With the rising popularity of embedding methods, we investigate the possibility of using embeddings to detect Concept Shift in evolving KGs. Specifically, we go deeper into the usage of nearest neighbour set comparison as the basis for a similarity measure, and show why this approach is conceptually problematic. As an alternative, we explore the possibility of using clustering methods. This paper serves to (i) inform the community about the challenges that arise when using KG embeddings for the comparison of different versions of a KG specifically, (ii) investigate how this is supported by theories on knowledge representation and semantic representation in NLP and (iii) take the first steps into the direction of valuable representation of semantics within KGs for comparison.

Descriptive Comparison of Visual Ontology Change Summarisation Methods

Kornpol Chung, Romana Pernisch, and Stefan Schlobach

In The Semantic Web: ESWC 2023 Satellite Events - Hersonissos, Crete, Greece, May 28 - June 1, 2023, Proceedings, vol. 13998, pp. 54–58, 2023

The ontology evolution lifecycle is crucial for usability of ontologies across applications. Changes that are applied to ontologies need to be communicated comprehensively to ontology users and engineers. Change visualisation is a simple, yet powerful way of explaining ontological changes, and different methods come with different shortcomings. This paper introduces and analyses the predominant methods of ontology change visualisations. As there exists no one-fits-all solution, we provide simple guidelines for which visualisation to use.

2022

Visualising the effects of ontology changes and studying their understanding with ChImp

Romana Pernisch, Daniele Dell’Aglio, Mirko Serbak, Rafael S. Gonçalves, and Abraham Bernstein

Journal of Web Semantics, vol. 74, pp. 100715, 2022

Due to the Semantic Web’s decentralised nature, ontology engineers rarely know all applications that leverage their ontology. Consequently, they are unaware of the full extent of possible consequences that changes might cause to the ontology. Our goal is to lessen the gap between ontology engineers and users by investigating ontology engineers’ understanding of ontology changes’ impact at editing time. Hence, this paper introduces the Protégé plugin ChImp which we use to reach our goal. We elicited requirements for ChImp through a questionnaire with ontology engineers. We then developed ChImp according to these requirements and it displays all changes of a given session and provides selected information on said changes and their effects. For each change, it computes a number of metrics on both the ontology and its materialisation. It displays those metrics on both the originally loaded ontology at the beginning of the editing session and the current state to help ontology engineers understand the impact of their changes. We investigated the informativeness of materialisation impact measures, the meaning of severe impact, and also the usefulness of ChImp in an online user study with 36 ontology engineers. We asked the participants to solve two ontology engineering tasks, with and without ChImp (assigned in random order), and answer in-depth questions about the applied changes as well as the materialisation impact measures. We found that ChImp increased the participants’ understanding of change effects and that they felt better informed. Answers also suggest that the proposed measures were useful and informative. We also learned that the participants consider different outcomes of changes severe, but most would define severity based on the amount of changes to the materialisation compared to its size. The participants also acknowledged the importance of quantifying the impact of changes and that the study will affect their approach of editing ontologies.

2021

Beware of the hierarchy - An analysis of ontology evolution and the materialisation impact for biomedical ontologies

Romana Pernisch, Daniele Dell’Aglio, and Abraham Bernstein

Journal of Web Semantics, vol. 70, pp. 100658, 2021

Ontologies are becoming a key component of numerous applications and research fields. But knowledge captured within ontologies is not static. Some ontology updates potentially have a wide ranging impact; others only affect very localised parts of the ontology and their applications. Investigating the impact of the evolution gives us insight into the editing behaviour but also signals ontology engineers and users how the ontology evolution is affecting other applications. However, such research is in its infancy. Hence, we need to investigate the evolution itself and its impact on the simplest of applications: the materialisation. In this work, we define impact measures that capture the effect of changes on the materialisation. In the future, the impact measures introduced in this work can be used to investigate how aware the ontology editors are about consequences of changes. By introducing five different measures, which focus either on the change in the materialisation with respect to the size or on the number of changes applied, we are able to quantify the consequences of ontology changes. To see these measures in action, we investigate the evolution and its impact on materialisation for nine open biomedical ontologies, most of which adhere to the EL++ description logic. Our results show that these ontologies evolve at varying paces but no statistically significant difference between the ontologies with respect to their evolution could be identified. We identify three types of ontologies based on the types of complex changes which are applied to them throughout their evolution. The impact on the materialisation is the same for the investigated ontologies, bringing us to the conclusion that the effect of changes on the materialisation can be generalised to other similar ontologies. Further, we found that the materialised concept inclusion axioms experience most of the impact induced by changes to the class inheritance of the ontology and other changes only marginally touch the materialisation.

Toward Measuring the Resemblance of Embedding Models for Evolving Ontologies

Romana Pernisch, Daniele Dell’Aglio, and Abraham Bernstein

In K-CAP ’21: Knowledge Capture Conference, Virtual Event, USA, December 2-3, 2021, pp. 177–184, 2021

Updates on ontologies affect the operations built on top of them. But not all changes are equal: some updates drastically change the result of operations; others lead to minor variations, if any. Hence, estimating the impact of a change ex-ante is highly important, as it might make ontology engineers aware of the consequences of their action during editing. However, in order to estimate the impact of changes, we need to understand how to measure them. To address this gap for embeddings, we propose a new measure called Embedding Resemblance Indicator (ERI), which takes into account both the stochasticity of learning embeddings as well as the shortcomings of established comparison methods. We base ERI on (i) a similarity score, (ii) a robustness factor μ̂ (based on the embedding method, similarity measure, and dataset), and (iii) the number of added or deleted entities to the embedding computed with the Jaccard index. To evaluate ERI, we investigate its usage in the context of two biomedical ontologies and three embedding methods (GraRep, LINE, and DeepWalk) as well as the two standard benchmark datasets (FB15k-237 and Wordnet-18-RR) with TransE and RESCAL embeddings. To study different aspects of ERI, we introduce synthetic changes in the knowledge graphs, generating two test-cases with five versions each and compare their impact with the expected behaviour. Our studies suggest that ERI behaves as expected and captures the similarity of embeddings based on the severity of changes. ERI is crucial for enabling further studies into impact of changes on embeddings.
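
As a rough illustration of component (iii) only, the Jaccard index between the entity sets of two versions can be sketched as follows. This is not the ERI formula from the paper; the function name `jaccard_index`, the example entity names, and the convention for two empty sets are assumptions made for this sketch.

```python
def jaccard_index(old_entities, new_entities):
    """Jaccard index |A ∩ B| / |A ∪ B| of two entity collections."""
    old, new = set(old_entities), set(new_entities)
    union = old | new
    if not union:
        # Assumed convention: two empty versions count as identical.
        return 1.0
    return len(old & new) / len(union)


# Hypothetical entity sets of two ontology versions:
# one entity ("a") was deleted and one ("e") was added.
version_1 = {"a", "b", "c", "d"}
version_2 = {"b", "c", "d", "e"}

overlap = jaccard_index(version_1, version_2)
print(overlap)  # 3 shared entities out of 5 in total
```

A value close to 1 means few entities were added or deleted between the two versions; a value close to 0 means the entity sets barely overlap.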

Multi-domain and Explainable Prediction of Changes in Web Vocabularies

Albert Meroño-Peñuela, Romana Pernisch, Christophe Guéret, and Stefan Schlobach

In K-CAP ’21: Knowledge Capture Conference, Virtual Event, USA, December 2-3, 2021, pp. 193–200, 2021

Web vocabularies (WV) have become a fundamental tool for structuring Web data: over 10 million sites use structured data formats and ontologies to markup content. Maintaining these vocabularies and keeping up with their changes are manual tasks with very limited automated support, impacting both publishers and users. Existing work shows that machine learning can be used to reliably predict vocabulary changes, but on specific domains (e.g. biomedicine) and with limited explanations on the impact of changes (e.g. their type, frequency, etc.). In this paper, we describe a framework that uses various supervised learning models to learn and predict changes in versioned vocabularies, independent of their domain. Using well-established results in ontology evolution we extract domain-agnostic and human-interpretable features and explain their influence on change predictability. Applying our method on 139 WV from 9 different domains, we find that ontology structural and instance data, the number of versions, and the release frequency highly correlate with predictability of change. These results can pave the way towards integrating predictive models into knowledge engineering practices and methods.

VideoGraph - Towards Using Knowledge Graphs for Interactive Video Retrieval

Luca Rossetto, Matthias Baumgartner, Narges Ashena, Florian Ruosch, Romana Pernisch, Lucien Heitz, and Abraham Bernstein

In MultiMedia Modeling - 27th International Conference, MMM 2021, Prague, Czech Republic, June 22-24, 2021, Proceedings, Part II, vol. 12573, pp. 417–422, 2021

Video is a very expressive medium, able to capture a wide variety of information in different ways. While there have been many advances in the recent past, which enable the annotation of semantic concepts as well as individual objects within video, their larger context has so far not extensively been used for the purpose of retrieval. In this paper, we introduce the first iteration of VideoGraph, a knowledge graph-based video retrieval system. VideoGraph combines information extracted from multiple video modalities with external knowledge bases to produce a semantically enriched representation of the content in a video collection, which can then be retrieved using graph traversal. For the 2021 Video Browser Showdown, we show the first proof-of-concept of such a graph-based video retrieval approach.

2020

LifeGraph: A Knowledge Graph for Lifelogs

Luca Rossetto, Matthias Baumgartner, Narges Ashena, Florian Ruosch, Romana Pernischová, and Abraham Bernstein

In Proceedings of the Third ACM Workshop on Lifelog Search Challenge, LSC@ICMR 2020, Dublin, Ireland, June 8-11, 2020, pp. 13–17, 2020

The data produced by efforts such as life logging is commonly multi modal and can have manifold interrelations with itself as well as external information. Representing this data in such a way that these rich relations as well as all the different sources can be leveraged is a non-trivial undertaking. In this paper, we present the first iteration of LifeGraph, a Knowledge Graph for lifelogging data. LifeGraph aims at not only capturing all aspects of the data contained in a lifelog but also linking them to external, static knowledge bases in order to put the log as a whole as well as its individual entries into a broader context. In the Lifelog Search Challenge 2020, we show a first proof-of-concept implementation of LifeGraph as well as a retrieval system prototype which utilizes it to search the log for specific events.

ChImp: Visualizing Ontology Changes and their Impact in Protégé

Romana Pernischová, Mirko Serbak, Daniele Dell’Aglio, and Abraham Bernstein

In Proceedings of the Fifth International Workshop on Visualization and Interaction for Ontologies and Linked Data co-located with the 19th International Semantic Web Conference (ISWC 2020), Virtual Conference (originally planned in Athens, Greece), November 02, 2020, vol. 2778, pp. 47–60, 2020

Today, ontologies are an established part of many applications and research. However, ontologies evolve over time, and ontology editors—engineers and domain experts—need to be aware of the consequences of changes while editing. Ontology editors might not be fully aware of how they are influencing consistency, quality, or the structure of the ontology, possibly causing applications to fail. To support editors and increase their sensitivity towards the consequences of their actions, we conducted a user survey to elicit preferences for representing changes, e.g., with ontology metrics such as number of classes and properties. Based on the survey, we developed ChImp—a Protégé plug-in to display information about the impact of changes in real-time. During editing of the ontology, ChImp lists the applied changes, checks and displays the consistency status, and reports measures describing the effect on the structure of the ontology. Akin to software IDEs and integrated testing approaches, we hope that displaying such metrics will help to improve ontology evolution processes in the long run.

A Knowledge Graph-based System for Retrieval of Lifelog Data

Luca Rossetto, Matthias Baumgartner, Narges Ashena, Florian Ruosch, Romana Pernischová, and Abraham Bernstein

In Proceedings of the ISWC 2020 Demos and Industry Tracks: From Novel Ideas to Industrial Practice co-located with 19th International Semantic Web Conference (ISWC 2020), Globally online, November 1-6, 2020 (UTC), vol. 2721, pp. 223–228, 2020

Lifelogging is a phenomenon where practitioners record an increasing part of their subjective daily experience with the aim of later being able to use these recordings as a memory aid or basis for data-driven self improvement. The resulting lifelogs are, therefore, only useful if the lifeloggers have efficient ways to search through them. The logs are inherently multi-modal and semi-structured, combining data from several sources, such as cameras and other wearable physical as well as virtual sensors, so representing the data in a graph structure can effectively capture all produced interrelations. Since annotating each entry with a sufficiently large semantic context is infeasible, either manually or automatically, alternatives must be found to capture the higher-level semantics. In this paper, we demonstrate LifeGraph, a first approach to creating a Knowledge Graph-based lifelog representation and retrieval solution, capable of capturing a lifelog in a graph structure and augmenting it with external information to aid with the association of higher-level semantic information.

2019

Toward Predicting Impact of Changes in Evolving Knowledge Graphs

Romana Pernischová, Daniele Dell’Aglio, Matthew Horridge, Matthias Baumgartner, and Abraham Bernstein

In Proceedings of the ISWC 2019 Satellite Tracks (Posters & Demonstrations, Industry, and Outrageous Ideas) co-located with 18th International Semantic Web Conference (ISWC 2019), Auckland, New Zealand, October 26-30, 2019, vol. 2456, pp. 137–140, 2019

The updates on knowledge graphs (KGs) affect the services built on top of them. However, changes are not all the same: some updates drastically change the result of operations based on knowledge graph content; others do not lead to any variation. Estimating the impact of a change ex-ante is highly important, as it might make KG engineers aware of the consequences of their action during KG editing or may be used to highlight the importance of a new fragment of knowledge to be added to the KG for some application. The main goal of this contribution is to offer a formalization of the problem. Additionally, it presents some preliminary experiments on three different datasets considering embeddings as operation. Results show that the estimation can reach AUCs of 0.85, suggesting the feasibility of this research.

The Butterfly Effect in Knowledge Graphs: Predicting the Impact of Changes in the Evolving Web of Data

Romana Pernischová

vol. 2548, pp. 25–36, 2019

Knowledge graphs (KGs) are at the core of numerous applications and their importance is increasing. Yet, knowledge evolves and so do KGs. PubMed, a search engine that primarily provides access to medical publications, adds an estimated 500’000 new records per year—each having the potential to require updates to a medical KG, like the National Cancer Institute Thesaurus. Depending on the applications that use such a medical KG, some of these updates have possibly wide ranging impact, while others have only local effects. Estimating the impact of a change ex-ante is highly important, as it might make KG-engineers aware of the consequences of their actions during editing or may be used to highlight the importance of a new fragment of knowledge to be added to the KG for some application. This research description proposes a unified methodology for predicting the impact of changes in evolving KGs and introduces an evaluation framework to assess the quality of these predictions.

2018

Stream Processing: The Matrix Revolutions

Romana Pernischová, Florian Ruosch, Daniele Dell’Aglio, and Abraham Bernstein

In Proceedings of the 12th International Workshop on Scalable Semantic Web Knowledge Base Systems co-located with 17th International Semantic Web Conference, SSWS@ISWC 2018, Monterey, California, USA, October 9, 2018, vol. 2179, pp. 15–27, 2018

Analyzing data streams is a vital task in data science. Often, data comes in different shapes such as triples, tuples, relations, or matrices. Traditional stream processing systems, however, only process data in one of these formats. To enable the processing of streams combining different shapes of data, we developed a system that parses SPARQL queries using the Apache Jena parser and transforms them to Apache Flink topologies. With a custom data type and tailored functions, we enabled the integration of matrices in Jena and therefore allowed to mix graphs, relational, and linear algebra in an RDF graph. This provided a proof of concept that queries may be written for static data and – with the usage of the streaming engine Flink – can easily be run on data streams, even if they contain multiple of the aforementioned types.

2017

Abstracting .torrent content consumption into two-mode graphs and their projection to content networks (ConNet)

Andri Lareida, Romana Pernischová, Bruno Rodrigues, and Burkhard Stiller

In 2017 IFIP/IEEE Symposium on Integrated Network and Service Management (IM), Lisbon, Portugal, May 8-12, 2017, pp. 151–159, 2017

Video-on-demand and live streaming services are about to take over video discs. Video streaming services typically cannot compete with the content available in Peer-to-Peer (P2P) file sharing networks. Thus, content providers can use P2P systems to identify content to include in their offer. This work defines a novel method to apply Social Network Analysis (SNA) on video streaming or download traces. Those traces are abstracted into a two-mode graph, which is projected to a content-centric one-mode graph (ConNet). SNA measures are used on a ConNet to classify a content-centric graph and provide a general interpretation and insights into the system the traces were collected from. To evaluate the proposed method, real-world traces acquired from BitTorrent (BT) swarms sharing movies and television (TV) shows are used to construct 48 hourly graphs to show the evolution of the graph. The results show that the video network can be classified as scale-free, that SNA measures can be used as an alternative popularity indicator, and that the network evolves over time and exhibits diurnal patterns. Finally, this work shows that the proposed method can be applied to real-world traces and provides a novel perspective on video consumption.
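
The general idea of projecting a two-mode graph onto a one-mode content graph can be sketched as follows. This is a minimal illustration, not the ConNet construction from the paper: the function name `project_to_content_graph`, the (peer, content) trace format, and the choice to weight an edge by the number of peers two items share are all assumptions made for this sketch.

```python
from collections import defaultdict
from itertools import combinations


def project_to_content_graph(traces):
    """Project (peer, content) pairs onto a weighted one-mode content graph.

    Two content items are linked when at least one peer consumed both.
    The edge weight counts the peers they share (an assumed weighting).
    """
    # Two-mode view: each peer maps to the set of content it consumed.
    contents_by_peer = defaultdict(set)
    for peer, content in traces:
        contents_by_peer[peer].add(content)

    # One-mode view: count co-consumption for every pair of content items.
    weights = defaultdict(int)
    for contents in contents_by_peer.values():
        for a, b in combinations(sorted(contents), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical download traces.
traces = [
    ("peer1", "movieA"), ("peer1", "movieB"),
    ("peer2", "movieA"), ("peer2", "movieB"), ("peer2", "showC"),
    ("peer3", "showC"),
]
edges = project_to_content_graph(traces)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

Measures such as degree or centrality could then be computed on the resulting content graph; which measures the paper uses is not stated in the abstract beyond "SNA measures".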