Search Results

Now showing 1 - 10 of 157
  • Item
    Knowledge Graphs - Working Group Charter (NFDI section-metadata) (1.2)
    (Genève : CERN, 2023) Stocker, Markus; Rossenova, Lozana; Shigapov, Renat; Betancort, Noemi; Dietze, Stefan; Murphy, Bridget; Bölling, Christian; Schubotz, Moritz; Koepler, Oliver
    Knowledge Graphs are a key technology for implementing the FAIR principles in data infrastructures by ensuring interoperability for both humans and machines. The Working Group "Knowledge Graphs" in Section "(Meta)data, Terminologies, Provenance" of the German National Research Data Infrastructure (Nationale Forschungsdateninfrastruktur (NFDI) e.V.) aims to promote the use of knowledge graphs in all NFDI consortia, to facilitate cross-domain data interlinking and federation following the FAIR principles, and to contribute to the joint development of tools and technologies that enable transformation of structured and unstructured data into semantically reusable knowledge across different domains.
  • Item
    Comparative Verification of the Digital Library of Mathematical Functions and Computer Algebra Systems
    (Berlin ; Heidelberg : Springer, 2022) Greiner-Petter, André; Cohl, Howard S.; Youssef, Abdou; Schubotz, Moritz; Trost, Avi; Dey, Rajen; Aizawa, Akiko; Gipp, Bela; Fisman, Dana; Rosu, Grigore
    Digital mathematical libraries assemble the knowledge of years of mathematical research. Numerous disciplines (e.g., physics, engineering, pure and applied mathematics) rely heavily on compendia gathered findings. Likewise, modern research applications rely more and more on computational solutions, which are often calculated and verified by computer algebra systems. Hence, the correctness, accuracy, and reliability of both digital mathematical libraries and computer algebra systems is a crucial attribute for modern research. In this paper, we present a novel approach to verify a digital mathematical library and two computer algebra systems with one another by converting mathematical expressions from one system to the other. We use our previously developed conversion tool (referred to as ) to translate formulae from the NIST Digital Library of Mathematical Functions to the computer algebra systems Maple and Mathematica. The contributions of our presented work are as follows: (1) we present the most comprehensive verification of computer algebra systems and digital mathematical libraries with one another; (2) we significantly enhance the performance of the underlying translator in terms of coverage and accuracy; and (3) we provide open access to translations for Maple and Mathematica of the formulae in the NIST Digital Library of Mathematical Functions.
  • Item
    Collaborative annotation and semantic enrichment of 3D media
    (New York,NY,United States : Association for Computing Machinery, 2022) Rossenova, Lozana; Schubert, Zoe; Vock, Richard; Sohmen, Lucia; Günther, Lukas; Duchesne, Paul; Blümel, Ina; Aizawa, Akiko
    A new FOSS (free and open source software) toolchain and associated workflow is being developed in the context of NFDI4Culture, a German consortium of research- and cultural heritage institutions working towards a shared infrastructure for research data that meets the needs of 21st century data creators, maintainers and end users across the broad spectrum of the digital libraries and archives field, and the digital humanities. This short paper and demo present how the integrated toolchain connects: 1) OpenRefine - for data reconciliation and batch upload; 2) Wikibase - for linked open data (LOD) storage; and 3) Kompakkt - for rendering and annotating 3D models. The presentation is aimed at librarians, digital curators and data managers interested in learning how to manage research datasets containing 3D media, and how to make them available within an open data environment with 3D-rendering and collaborative annotation features.
  • Item
    Unveiling Relations in the Industry 4.0 Standards Landscape Based on Knowledge Graph Embeddings
    (Cham : Springer, 2020) Rivas, Ariam; Grangel-González, Irlán; Collarana, Diego; Lehmann, Jens; Vidal, Maria-Esther; Hartmann, Sven; Küng, Josef; Kotsis, Gabriele; Tjoa, A Min; Khalil, Ismail
    Industry 4.0 (I4.0) standards and standardization frameworks have been proposed with the goal of empowering interoperability in smart factories. These standards enable the description and interaction of the main components, systems, and processes inside of a smart factory. Due to the growing number of frameworks and standards, there is an increasing need for approaches that automatically analyze the landscape of I4.0 standards. Standardization frameworks classify standards according to their functions into layers and dimensions. However, similar standards can be classified differently across the frameworks, producing, thus, interoperability conflicts among them. Semantic-based approaches that rely on ontologies and knowledge graphs, have been proposed to represent standards, known relations among them, as well as their classification according to existing frameworks. Albeit informative, the structured modeling of the I4.0 landscape only provides the foundations for detecting interoperability issues. Thus, graph-based analytical methods able to exploit knowledge encoded by these approaches, are required to uncover alignments among standards. We study the relatedness among standards and frameworks based on community analysis to discover knowledge that helps to cope with interoperability conflicts between standards. We use knowledge graph embeddings to automatically create these communities exploiting the meaning of the existing relationships. In particular, we focus on the identification of similar standards, i.e., communities of standards, and analyze their properties to detect unknown relations. We empirically evaluate our approach on a knowledge graph of I4.0 standards using the Trans∗ family of embedding models for knowledge graph entities. Our results are promising and suggest that relations among standards can be detected accurately.
  • Item
    Context-Based Entity Matching for Big Data
    (Cham : Springer, 2020) Tasnim, Mayesha; Collarana, Diego; Graux, Damien; Vidal, Maria-Esther; Janev, Valentina; Graux, Damien; Jabeen, Hajira; Sallinger, Emanuel
    In the Big Data era, where variety is the most dominant dimension, the RDF data model enables the creation and integration of actionable knowledge from heterogeneous data sources. However, the RDF data model allows for describing entities under various contexts, e.g., people can be described from its demographic context, but as well from their professional contexts. Context-aware description poses challenges during entity matching of RDF datasets—the match might not be valid in every context. To perform a contextually relevant entity matching, the specific context under which a data-driven task, e.g., data integration is performed, must be taken into account. However, existing approaches only consider inter-schema and properties mapping of different data sources and prevent users from selecting contexts and conditions during a data integration process. We devise COMET, an entity matching technique that relies on both the knowledge stated in RDF vocabularies and a context-based similarity metric to map contextually equivalent RDF graphs. COMET follows a two-fold approach to solve the problem of entity matching in RDF graphs in a context-aware manner. In the first step, COMET computes the similarity measures across RDF entities and resorts to the Formal Concept Analysis algorithm to map contextually equivalent RDF entities. Finally, COMET combines the results of the first step and executes a 1-1 perfect matching algorithm for matching RDF entities based on the combined scores. We empirically evaluate the performance of COMET on testbed from DBpedia. The experimental results suggest that COMET accurately matches equivalent RDF graphs in a context-dependent manner.
  • Item
    Contextual Language Models for Knowledge Graph Completion
    (Aachen, Germany : RWTH Aachen, 2021) Russa, Biswas; Sofronova, Radina; Alam, Mehwish; Sack, Harald; Mehwish, Alam; Ali, Medi; Groth, Paul; Hitzler, Pascal; Lehmann, Jens; Paulheim, Heiko; Rettinger, Achim; Sack, Harald; Sadeghi, Afshin; Tresp, Volker
    Knowledge Graphs (KGs) have become the backbone of various machine learning based applications over the past decade. However, the KGs are often incomplete and inconsistent. Several representation learning based approaches have been introduced to complete the missing information in KGs. Besides, Neural Language Models (NLMs) have gained huge momentum in NLP applications. However, exploiting the contextual NLMs to tackle the Knowledge Graph Completion (KGC) task is still an open research problem. In this paper, a GPT-2 based KGC model is proposed and is evaluated on two benchmark datasets. The initial results obtained from the _ne-tuning of the GPT-2 model for triple classi_cation strengthens the importance of usage of NLMs for KGC. Also, the impact of contextual language models for KGC has been discussed.
  • Item
    Building Scholarly Knowledge Bases with Crowdsourcing and Text Mining
    (Aachen : RWTH, 2020) Stocker, Markus; Zhang, Chengzhi; Mayr, Philipp; Lu, Wei; Zhang, Yi
    For centuries, scholarly knowledge has been buried in documents. While articles are great to convey the story of scientific work to peers, they make it hard for machines to process scholarly knowledge. The recent proliferation of the scholarly literature and the increasing inability of researchers to digest, reproduce, reuse its content are constant reminders that we urgently need a transformative digitalization of the scholarly literature. Building on the Open Research Knowledge Graph (http://orkg.org) as a concrete research infrastructure, in this talk we present how using crowdsourcing and text mining humans and machines can collaboratively build scholarly knowledge bases, i.e. systems that acquire, curate and publish data, information and knowledge published in the scholarly literature in structured and semantic form. We discuss some key challenges that human and technical infrastructures face as well as the possibilities scholarly knowledge bases enable.
  • Item
    NLPContributions: An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature
    (Aachen : RWTH, 2020) D'Souza, Jennifer; Auer, Sören
    We describe an annotation initiative to capture the scholarly contributions in natural language processing (NLP) articles, particularly, for the articles that discuss machine learning (ML) approaches for various information extraction tasks. We develop the annotation task based on a pilot annotation exercise on 50 NLP-ML scholarly articles presenting contributions to five information extraction tasks 1. machine translation, 2. named entity recognition, 3. Question answering, 4. relation classification, and 5. text classification. In this article, we describe the outcomes of this pilot annotation phase. Through the exercise we have obtained an annotation methodology; and found ten core information units that reflect the contribution of the NLP-ML scholarly investigations. The resulting annotation scheme we developed based on these information units is called NLPContributions. The overarching goal of our endeavor is four-fold: 1) to find a systematic set of patterns of subject-predicate-object statements for the semantic structuring of scholarly contributions that are more or less generically applicable for NLP-ML research articles; 2) to apply the discovered patterns in the creation of a larger annotated dataset for training machine readers [18] of research contributions; 3) to ingest the dataset into the Open Research Knowledge Graph (ORKG) infrastructure as a showcase for creating user-friendly state-of-the-art overviews; 4) to integrate the machine readers into the ORKG to assist users in the manual curation of their respective article contributions. We envision that the NLPContributions methodology engenders a wider discussion on the topic toward its further refinement and development. Our pilot annotated dataset of 50 NLP-ML scholarly articles according to the NLPContributions scheme is openly available to the research community at https://doi.org/10.25835/0019761.
  • Item
    Data Protection Impact Assessments in Practice: Experiences from Case Studies
    (Berlin ; Heidelberg : Springer, 2022) Friedewald, Michael; Schiering, Ina; Martin, Nicholas; Hallinan, Dara; Katsikas, Sokratis; Lambrinoudakis, Costas; Cuppens, Nora; Mylopoulos, John; Kalloniatis, Christos; Meng, Weizhi; Furnell, Steven; Pallas, Frank; Pohle, Jörg; Sasse, M. Angela; Abie, Habtamu; Ranise, Silvio; Verderame, Luca; Cambiaso, Enrico; Vidal, Jorge Maestre; Monge, Marco Antonio Sotelo
    In the context of the project A Data Protection Impact Assessment (DPIA) Tool for Practical Use in Companies and Public Administration an operationalization for Data Protection Impact Assessments was developed based on the approach of Forum Privatheit. This operationalization was tested and refined during twelve tests with startups, small- and medium sized enterprises, corporations and public bodies. This paper presents the operationalization and summarizes the experience from the tests.
  • Item
    Check square at CheckThat! 2020: Claim Detection in Social Media via Fusion of Transformer and Syntactic Features
    (Aachen, Germany : RWTH Aachen, 2020) Cheema, Gullasl S.; Hakimov, Sherzod; Ewerth, Ralph; Cappellato, Linda; Eickhoff, Carsten; Ferro, Nicola; Névéol, Aurélie
    In this digital age of news consumption, a news reader has the ability to react, express and share opinions with others in a highly interactive and fast manner. As a consequence, fake news has made its way into our daily life because of very limited capacity to verify news on the Internet by large companies as well as individuals. In this paper, we focus on solving two problems which are part of the fact-checking ecosystem that can help to automate fact-checking of claims in an ever increasing stream of content on social media. For the first prob-lem, claim check-worthiness prediction, we explore the fusion of syntac-tic features and deep transformer Bidirectional Encoder Representations from Transformers (BERT) embeddings, to classify check-worthiness of a tweet, i.e. whether it includes a claim or not. We conduct a detailed feature analysis and present our best performing models for English and Arabic tweets. For the second problem, claim retrieval, we explore the pre-trained embeddings from a Siamese network transformer model (sentence-transformers) specifically trained for semantic textual similar-ity, and perform KD-search to retrieve verified claims with respect to a query tweet.