Search Results

  • Item
    Understanding Class Representations: An Intrinsic Evaluation of Zero-Shot Text Classification
    (Aachen, Germany : RWTH Aachen, 2021) Hoppe, Fabian; Dessì, Danilo; Sack, Harald; Alam, Mehwish; Buscaldi, Davide; Cochez, Michael; Osborne, Francesco; Reforgiato Recupero, Diego; Sack, Harald
    Frequently, Text Classification is limited by insufficient training data. This problem is addressed by Zero-Shot Classification through the inclusion of external class definitions and then exploiting the relations between classes seen during training and unseen classes (Zero-shot). However, it requires a class embedding space capable of accurately representing the semantic relatedness between classes. This work defines an intrinsic evaluation based on greater-than constraints to provide a better understanding of this relatedness. The results imply that textual embeddings are able to capture more semantics than Knowledge Graph embeddings, but combining both modalities yields the best performance.
  • Item
    Steps towards a Dislocation Ontology for Crystalline Materials
    (Aachen, Germany : RWTH Aachen, 2021) Ihsan, Ahmad Zainul; Dessì, Danilo; Alam, Mehwish; Sack, Harald; Sandfeld, Stefan; García-Castro, Raúl; Davies, John; Antoniou, Grigoris; Fortuna, Carolina
    The field of Materials Science is concerned with, e.g., the properties and performance of materials. An important class of materials is crystalline materials, which usually contain “dislocations”, a line-like defect type. Dislocations decisively determine many important material properties. Over the past decades, significant effort has been put into understanding dislocation behavior across different length scales, both with experimental characterization techniques and with simulations. However, there is still no common standard for describing such dislocation structures, that is, for representing and connecting dislocation domain knowledge across different but related communities. An ontology offers a common foundation to enable knowledge representation and data interoperability, which are important components for establishing a “digital twin”. This paper outlines the first steps towards the design of an ontology in the dislocation domain and shows a connection with the already existing ontologies in the materials science and engineering domain.
  • Item
    Improving Zero-Shot Text Classification with Graph-based Knowledge Representations
    (Aachen, Germany : RWTH Aachen, 2022) Hoppe, Fabian; Hartig, Olaf; Seneviratne, Oshani
    Insufficient training data is a key challenge for text classification. In particular, long-tail class distributions and newly emerging classes leave specific classes without any training data. Such a zero-shot setting must therefore incorporate additional, external knowledge to enable transfer learning by connecting the external knowledge about previously unseen classes to texts. Recent zero-shot text classifiers utilize only distributional semantics defined by large language models and based on class names or natural language descriptions. This implicit knowledge contains ambiguities, cannot capture logical relations, and is not an efficient representation of factual knowledge. These drawbacks can be avoided by introducing explicit, external knowledge. In particular, knowledge graphs provide such explicit, unambiguous, complementary, and domain-specific knowledge. Hence, this thesis explores graph-based knowledge as an additional modality for zero-shot text classification. Besides a general investigation of this modality, the thesis explores how including domain-specific knowledge influences the ability to deal with domain shifts.
  • Item
    DDB-KG: The German Bibliographic Heritage in a Knowledge Graph
    (Aachen, Germany : RWTH Aachen, 2021) Tan, Mary Ann; Tietz, Tabea; Bruns, Oleksandra; Oppenlaender, Jonas; Dessì, Danilo; Sack, Harald; Sumikawa, Yasunobu; Ikejiri, Ryohei; Doucet, Antoine; Pfanzelter, Eva; Hasanuzzaman, Mohammed; Dias, Gaël; Milligan, Ian; Jatowt, Adam
    Under the German government’s initiative “NEUSTART Kultur”, the German Digital Library, or Deutsche Digitale Bibliothek (DDB), is undergoing improvements to enhance the user experience. As an initial step, emphasis is placed on creating a knowledge graph from the bibliographic record collection of the DDB. This paper discusses the retrieval challenges facing the DDB and the solutions for addressing them. In particular, the limitations of the current data model or ontology in representing bibliographic metadata are analyzed through concrete examples. This study presents the complete ontological mapping from the DDB-Europeana Data Model (DDB-EDM) to FaBiO, and a prototype of the DDB-KG made available as a SPARQL endpoint. The suitability of the target ontology is demonstrated with SPARQL queries formulated from competency questions.
  • Item
    Data Protection Impact Assessments in Practice: Experiences from Case Studies
    (Berlin ; Heidelberg : Springer, 2022) Friedewald, Michael; Schiering, Ina; Martin, Nicholas; Hallinan, Dara; Katsikas, Sokratis; Lambrinoudakis, Costas; Cuppens, Nora; Mylopoulos, John; Kalloniatis, Christos; Meng, Weizhi; Furnell, Steven; Pallas, Frank; Pohle, Jörg; Sasse, M. Angela; Abie, Habtamu; Ranise, Silvio; Verderame, Luca; Cambiaso, Enrico; Vidal, Jorge Maestre; Monge, Marco Antonio Sotelo
    In the context of the project “A Data Protection Impact Assessment (DPIA) Tool for Practical Use in Companies and Public Administration”, an operationalization for Data Protection Impact Assessments was developed based on the approach of Forum Privatheit. This operationalization was tested and refined during twelve tests with startups, small and medium-sized enterprises, corporations, and public bodies. This paper presents the operationalization and summarizes the experience from the tests.
  • Item
    Toward a Comparison Framework for Interactive Ontology Enrichment Methodologies
    (Aachen, Germany : RWTH Aachen, 2022) Vrolijk, Jarno; Reklos, Ioannis; Vafaie, Mahsa; Massari, Arcangelo; Mohammadi, Maryam; Rudolph, Sebastian; Fu, Bo; Lambrix, Patrick; Pesquita, Catia
    The growing demand for well-modeled ontologies in diverse application areas increases the need for intuitive interaction techniques that support human domain experts in ontology modeling and enrichment tasks, such that quality expectations are met. Beyond the correctness of the specified information, the quality of an ontology depends on its (relative) completeness, i.e., whether the ontology contains all the necessary information to draw expected inferences. On an abstract level, the Ontology Enrichment problem consists of identifying and filling the gap between information that can be logically inferred from the ontology and the information expected to be inferable by the user. To this end, numerous approaches have been described in the literature, providing methodologies from the fields of Formal Semantics and Automated Reasoning targeted at eliciting knowledge from human domain experts. These approaches vary greatly in many aspects and their applicability typically depends on the specifics of the concrete modeling scenario at hand. Toward a better understanding of the landscape of methodological possibilities, this position paper proposes a framework consisting of multiple performance dimensions along which existing and future approaches to interactive ontology enrichment can be characterized. We apply our categorization scheme to a selection of methodologies from the literature. In light of this comparison, we address the limitations of the methods and propose directions for future work.
  • Item
    Comparative Verification of the Digital Library of Mathematical Functions and Computer Algebra Systems
    (Berlin ; Heidelberg : Springer, 2022) Greiner-Petter, André; Cohl, Howard S.; Youssef, Abdou; Schubotz, Moritz; Trost, Avi; Dey, Rajen; Aizawa, Akiko; Gipp, Bela; Fisman, Dana; Rosu, Grigore
    Digital mathematical libraries assemble the knowledge of years of mathematical research. Numerous disciplines (e.g., physics, engineering, pure and applied mathematics) rely heavily on compendia of gathered findings. Likewise, modern research applications rely more and more on computational solutions, which are often calculated and verified by computer algebra systems. Hence, the correctness, accuracy, and reliability of both digital mathematical libraries and computer algebra systems are crucial attributes for modern research. In this paper, we present a novel approach to verify a digital mathematical library and two computer algebra systems against one another by converting mathematical expressions from one system to the other. We use our previously developed conversion tool (referred to as ) to translate formulae from the NIST Digital Library of Mathematical Functions to the computer algebra systems Maple and Mathematica. The contributions of our presented work are as follows: (1) we present the most comprehensive verification of computer algebra systems and digital mathematical libraries against one another; (2) we significantly enhance the performance of the underlying translator in terms of coverage and accuracy; and (3) we provide open access to translations for Maple and Mathematica of the formulae in the NIST Digital Library of Mathematical Functions.
  • Item
    Temporal Evolution of the Migration-related Topics on Social Media
    (Aachen, Germany : RWTH Aachen, 2021) Chen, Yiyi; Gesese, Genet Asefa; Sack, Harald; Alam, Mehwish; Seneviratne, Oshani; Pesquita, Catia; Sequeda, Juan; Etcheverry, Lorena
    This poster focuses on capturing the temporal evolution of migration-related topics in relevant tweets. It uses the Dynamic Embedded Topic Model (DETM) as a learning algorithm to perform a quantitative and qualitative analysis of these emerging topics. TweetsKB is extended with the extracted Twitter dataset along with the results of DETM, which takes temporality into account. These results are then further analyzed and visualized. The analysis reveals that the trajectories of the migration-related topics are in agreement with historical events. The source code is available online: https://bit.ly/3dN9ICB.
  • Item
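    Forschungsdaten in den Naturwissenschaften: Eine urheberrechtliche Bestandsaufnahme mit ihren Implikationen für universitäres FDM [Research Data in the Natural Sciences: A Copyright Law Assessment and Its Implications for University Research Data Management]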
    (Heidelberg : Universitätsbibliothek Heidelberg, 2022) Hartmann, Thomas; Heuveline, Vincent; Bisheh, Nina
    A key factor for the accessibility, reusability, and interoperability of research data is its copyright status. In the humanities and social sciences, research data (e.g., texts in the form of interviews or other literature) is in most cases subject to copyright protection. The situation is different in the natural sciences. Not always, but frequently, research data from these disciplines remains free of copyright. The article demonstrates this with a typical example from the Research Data Center for Molecular Materials Research (SDC MoMaF). The legal recommendations for action that follow for research data management (RDM) are presented at the end of the article.
  • Item
    Challenges of Applying Knowledge Graph and their Embeddings to a Real-world Use-case
    (Aachen, Germany : RWTH Aachen, 2021) Petzold, Rick; Gesese, Genet Asefa; Bogdanova, Viktoria; Zylowski, Thorsten; Sack, Harald; Alam, Mehwish; Alam, Mehwish; Buscaldi, Davide; Cochez, Michael; Osborne, Francesco; Reforgiato Recupero, Diego; Sack, Harald
    Various Knowledge Graph Embedding (KGE) models have been proposed so far. They are trained on specific KG completion tasks such as link prediction and evaluated on datasets created mainly for that purpose. The embeddings learnt on link prediction tasks are mostly not applied to downstream tasks in real-world use-cases, such as data available in different companies or organizations. This paper presents the challenges of enriching a KG generated from a real-world relational database (RDB) about companies with information from external sources such as Wikidata, and of learning representations for that KG. Moreover, a comparative analysis of the KGEs and various text embeddings on some downstream clustering tasks is presented. The experimental results indicate that in use-cases like the one in this paper, where the KG is highly skewed, it is beneficial to use text embeddings or language models instead of KGEs.