Search Results

Now showing 1 - 9 of 9
  • Item
    Towards Customizable Chart Visualizations of Tabular Data Using Knowledge Graphs
    (Cham : Springer, 2020) Wiens, Vitalis; Stocker, Markus; Auer, Sören; Ishita, Emi; Pang, Natalie Lee San; Zhou, Lihong
    Scientific articles are typically published as PDF documents, thus rendering the extraction and analysis of results a cumbersome, error-prone, and often manual effort. New initiatives, such as ORKG, focus on transforming the content and results of scientific articles into structured, machine-readable representations using Semantic Web technologies. In this article, we focus on tabular data of scientific articles, which provide an organized and compressed representation of information. However, chart visualizations can additionally facilitate their comprehension. We present an approach that employs a human-in-the-loop paradigm during the data acquisition phase to define additional semantics for tabular data. The additional semantics guide the creation of chart visualizations for meaningful representations of tabular data. Our approach organizes tabular data into different information groups which are analyzed for the selection of suitable visualizations. The set of suitable visualizations serves as a user-driven selection of visual representations. Additionally, customization for visual representations provides the means for facilitating the understanding and sense-making of information.
  • Item
    Compact representations for efficient storage of semantic sensor data
    (Dordrecht : Springer Science + Business Media B.V, 2021) Karim, Farah; Vidal, Maria-Esther; Auer, Sören
    Nowadays, there is a rapid increase in the number of sensor data generated by a wide variety of sensors and devices. Data semantics facilitate information exchange, adaptability, and interoperability among several sensors and devices. Sensor data and their meaning can be described using ontologies, e.g., the Semantic Sensor Network (SSN) Ontology. Notwithstanding, semantically enriched, the size of semantic sensor data is substantially larger than raw sensor data. Moreover, some measurement values can be observed by sensors several times, and a huge number of repeated facts about sensor data can be produced. We propose a compact or factorized representation of semantic sensor data, where repeated measurement values are described only once. Furthermore, these compact representations are able to enhance the storage and processing of semantic sensor data. To scale up to large datasets, factorization based, tabular representations are exploited to store and manage factorized semantic sensor data using Big Data technologies. We empirically study the effectiveness of a semantic sensor’s proposed compact representations and their impact on query processing. Additionally, we evaluate the effects of storing the proposed representations on diverse RDF implementations. Results suggest that the proposed compact representations empower the storage and query processing of sensor data over diverse RDF implementations, and up to two orders of magnitude can reduce query execution time.
  • Item
    Ontology Design for Pharmaceutical Research Outcomes
    (Cham : Springer, 2020) Say, Zeynep; Fathalla, Said; Vahdati, Sahar; Lehmann, Jens; Auer, Sören; Hall, Mark; Merčun, Tanja; Risse, Thomas; Duchateau, Fabien
    The network of scholarly publishing involves generating and exchanging ideas, certifying research, publishing in order to disseminate findings, and preserving outputs. Despite enormous efforts in providing support for each of those steps in scholarly communication, identifying knowledge fragments is still a big challenge. This is due to the heterogeneous nature of the scholarly data and the current paradigm of distribution by publishing (mostly document-based) over journal articles, numerous repositories, and libraries. Therefore, transforming this paradigm to knowledge-based representation is expected to reform the knowledge sharing in the scholarly world. Although many movements have been initiated in recent years, non-technical scientific communities suffer from transforming document-based publishing to knowledge-based publishing. In this paper, we present a model (PharmSci) for scholarly publishing in the pharmaceutical research domain with the goal of facilitating knowledge discovery through effective ontology-based data integration. PharmSci provides machine-interpretable information to the knowledge discovery process. The principles and guidelines of the ontological engineering have been followed. Reasoning-based techniques are also presented in the design of the ontology to improve the quality of targeted tasks for data integration. The developed ontology is evaluated with a validation process and also a quality verification method.
  • Item
    Creating a Scholarly Knowledge Graph from Survey Article Tables
    (Cham : Springer, 2020) Oelen, Allard; Stocker, Markus; Auer, Sören; Ishita, Emi; Pang, Natalie Lee San; Zhou, Lihong
    Due to the lack of structure, scholarly knowledge remains hardly accessible for machines. Scholarly knowledge graphs have been proposed as a solution. Creating such a knowledge graph requires manual effort and domain experts, and is therefore time-consuming and cumbersome. In this work, we present a human-in-the-loop methodology used to build a scholarly knowledge graph leveraging literature survey articles. Survey articles often contain manually curated and high-quality tabular information that summarizes findings published in the scientific literature. Consequently, survey articles are an excellent resource for generating a scholarly knowledge graph. The presented methodology consists of five steps, in which tables and references are extracted from PDF articles, tables are formatted and finally ingested into the knowledge graph. To evaluate the methodology, 92 survey articles, containing 160 survey tables, have been imported in the graph. In total, 2626 papers have been added to the knowledge graph using the presented methodology. The results demonstrate the feasibility of our approach, but also indicate that manual effort is required and thus underscore the important role of human experts.
  • Item
    Operational Research Literature as a Use Case for the Open Research Knowledge Graph
    (Cham : Springer, 2020) Runnwerth, Mila; Stocker, Markus; Auer, Sören; Bigatti, Anna Maria; Carette, Jacques; Davenport, James H.; Joswig, Michael; de Wolff, Timo
    The Open Research Knowledge Graph (ORKG) provides machine-actionable access to scholarly literature that habitually is written in prose. Following the FAIR principles, the ORKG makes traditional, human-coded knowledge findable, accessible, interoperable, and reusable in a structured manner in accordance with the Linked Open Data paradigm. At the moment, in ORKG papers are described manually, but in the long run the semantic depth of the literature at scale needs automation. Operational Research is a suitable test case for this vision because the mathematical field and, hence, its publication habits are highly structured: A mundane problem is formulated as a mathematical model, solved or approximated numerically, and evaluated systematically. We study the existing literature with respect to the Assembly Line Balancing Problem and derive a semantic description in accordance with the ORKG. Eventually, selected papers are ingested to test the semantic description and refine it further.
  • Item
    Representing Semantified Biological Assays in the Open Research Knowledge Graph
    (Cham : Springer, 2020) Anteghini, Marco; D'Souza, Jennifer; Martins dos Santos, Vitor A.P.; Auer, Sören; Ishita, Emi; Pang, Natalie Lee San; Zhou, Lihong
    In the biotechnology and biomedical domains, recent text mining efforts advocate for machine-interpretable, and preferably, semantified, documentation formats of laboratory processes. This includes wet-lab protocols, (in)organic materials synthesis reactions, genetic manipulations and procedures for faster computer-mediated analysis and predictions. Herein, we present our work on the representation of semantified bioassays in the Open Research Knowledge Graph (ORKG). In particular, we describe a semantification system work-in-progress to generate, automatically and quickly, the critical semantified bioassay data mass needed to foster a consistent user audience to adopt the ORKG for recording their bioassays and facilitate the organisation of research, according to FAIR principles.
  • Item
    Question Answering on Scholarly Knowledge Graphs
    (Cham : Springer, 2020) Jaradeh, Mohamad Yaser; Stocker, Markus; Auer, Sören; Hall, Mark; Merčun, Tanja; Risse, Thomas; Duchateau, Fabien
    Answering questions on scholarly knowledge comprising text and other artifacts is a vital part of any research life cycle. Querying scholarly knowledge and retrieving suitable answers is currently hardly possible due to the following primary reason: machine inactionable, ambiguous and unstructured content in publications. We present JarvisQA, a BERT based system to answer questions on tabular views of scholarly knowledge graphs. Such tables can be found in a variety of shapes in the scholarly literature (e.g., surveys, comparisons or results). Our system can retrieve direct answers to a variety of different questions asked on tabular data in articles. Furthermore, we present a preliminary dataset of related tables and a corresponding set of natural language questions. This dataset is used as a benchmark for our system and can be reused by others. Additionally, JarvisQA is evaluated on two datasets against other baselines and shows an improvement of two to three folds in performance compared to related methods.
  • Item
    Domain-Independent Extraction of Scientific Concepts from Research Articles
    (Cham : Springer, 2020) Brack, Arthur; D'Souza, Jennifer; Hoppe, Anett; Auer, Sören; Ewerth, Ralph; Jose, Joemon M.; Yilmaz, Emine; Magalhães, João; Castells, Pablo; Ferro, Nicola; Silva, Mário J.; Martins, Flávio
    We examine the novel task of domain-independent scientific concept extraction from abstracts of scholarly articles and present two contributions. First, we suggest a set of generic scientific concepts that have been identified in a systematic annotation process. This set of concepts is utilised to annotate a corpus of scientific abstracts from 10 domains of Science, Technology and Medicine at the phrasal level in a joint effort with domain experts. The resulting dataset is used in a set of benchmark experiments to (a) provide baseline performance for this task, (b) examine the transferability of concepts between domains. Second, we present a state-of-the-art deep learning baseline. Further, we propose the active learning strategy for an optimal selection of instances from among the various domains in our data. The experimental results show that (1) a substantial agreement is achievable by non-experts after consultation with domain experts, (2) the baseline system achieves a fairly high F1 score, (3) active learning enables us to nearly halve the amount of required training data.
  • Item
    Encoding Knowledge Graph Entity Aliases in Attentive Neural Network for Wikidata Entity Linking
    (Berlin ; Heidelberg : Springer, 2020) Mulang’, Isaiah Onando; Singh, Kuldeep; Vyas, Akhilesh; Shekarpour, Saeedeh; Vidal, Maria-Esther; Lehmann, Jens; Auer, Sören; Huang, Zhisheng; Beek, Wouter; Wang, Hua; Zhou, Rui; Zhang, Yanchun
    The collaborative knowledge graphs such as Wikidata excessively rely on the crowd to author the information. Since the crowd is not bound to a standard protocol for assigning entity titles, the knowledge graph is populated by non-standard, noisy, long or even sometimes awkward titles. The issue of long, implicit, and nonstandard entity representations is a challenge in Entity Linking (EL) approaches for gaining high precision and recall. Underlying KG in general is the source of target entities for EL approaches, however, it often contains other relevant information, such as aliases of entities (e.g., Obama and Barack Hussein Obama are aliases for the entity Barack Obama). EL models usually ignore such readily available entity attributes. In this paper, we examine the role of knowledge graph context on an attentive neural network approach for entity linking on Wikidata. Our approach contributes by exploiting the sufficient context from a KG as a source of background knowledge, which is then fed into the neural network. This approach demonstrates merit to address challenges associated with entity titles (multi-word, long, implicit, case-sensitive). Our experimental study shows ≈8% improvements over the baseline approach, and significantly outperform an end to end approach for Wikidata entity linking.