Search Results

  • Item
    Towards Customizable Chart Visualizations of Tabular Data Using Knowledge Graphs
    (Cham : Springer, 2020) Wiens, Vitalis; Stocker, Markus; Auer, Sören; Ishita, Emi; Pang, Natalie Lee San; Zhou, Lihong
    Scientific articles are typically published as PDF documents, thus rendering the extraction and analysis of results a cumbersome, error-prone, and often manual effort. New initiatives, such as ORKG, focus on transforming the content and results of scientific articles into structured, machine-readable representations using Semantic Web technologies. In this article, we focus on tabular data of scientific articles, which provide an organized and compressed representation of information. However, chart visualizations can additionally facilitate their comprehension. We present an approach that employs a human-in-the-loop paradigm during the data acquisition phase to define additional semantics for tabular data. The additional semantics guide the creation of chart visualizations for meaningful representations of tabular data. Our approach organizes tabular data into different information groups which are analyzed for the selection of suitable visualizations. The set of suitable visualizations serves as a user-driven selection of visual representations. Additionally, customization for visual representations provides the means for facilitating the understanding and sense-making of information.
  • Item
    Ontology Design for Pharmaceutical Research Outcomes
    (Cham : Springer, 2020) Say, Zeynep; Fathalla, Said; Vahdati, Sahar; Lehmann, Jens; Auer, Sören; Hall, Mark; Merčun, Tanja; Risse, Thomas; Duchateau, Fabien
    The network of scholarly publishing involves generating and exchanging ideas, certifying research, publishing in order to disseminate findings, and preserving outputs. Despite enormous efforts in providing support for each of those steps in scholarly communication, identifying knowledge fragments is still a big challenge. This is due to the heterogeneous nature of the scholarly data and the current paradigm of distribution by publishing (mostly document-based) over journal articles, numerous repositories, and libraries. Therefore, transforming this paradigm to knowledge-based representation is expected to reform the knowledge sharing in the scholarly world. Although many movements have been initiated in recent years, non-technical scientific communities suffer from transforming document-based publishing to knowledge-based publishing. In this paper, we present a model (PharmSci) for scholarly publishing in the pharmaceutical research domain with the goal of facilitating knowledge discovery through effective ontology-based data integration. PharmSci provides machine-interpretable information to the knowledge discovery process. The principles and guidelines of the ontological engineering have been followed. Reasoning-based techniques are also presented in the design of the ontology to improve the quality of targeted tasks for data integration. The developed ontology is evaluated with a validation process and also a quality verification method.
  • Item
    Creating a Scholarly Knowledge Graph from Survey Article Tables
    (Cham : Springer, 2020) Oelen, Allard; Stocker, Markus; Auer, Sören; Ishita, Emi; Pang, Natalie Lee San; Zhou, Lihong
    Due to the lack of structure, scholarly knowledge remains hardly accessible for machines. Scholarly knowledge graphs have been proposed as a solution. Creating such a knowledge graph requires manual effort and domain experts, and is therefore time-consuming and cumbersome. In this work, we present a human-in-the-loop methodology used to build a scholarly knowledge graph leveraging literature survey articles. Survey articles often contain manually curated and high-quality tabular information that summarizes findings published in the scientific literature. Consequently, survey articles are an excellent resource for generating a scholarly knowledge graph. The presented methodology consists of five steps, in which tables and references are extracted from PDF articles, tables are formatted and finally ingested into the knowledge graph. To evaluate the methodology, 92 survey articles, containing 160 survey tables, have been imported in the graph. In total, 2626 papers have been added to the knowledge graph using the presented methodology. The results demonstrate the feasibility of our approach, but also indicate that manual effort is required and thus underscore the important role of human experts.
  • Item
    Requirements Analysis for an Open Research Knowledge Graph
    (Berlin ; Heidelberg : Springer, 2020) Brack, Arthur; Hoppe, Anett; Stocker, Markus; Auer, Sören; Ewerth, Ralph; Hall, Mark; Merčun, Tanja; Risse, Thomas; Duchateau, Fabien
    Current science communication has a number of drawbacks and bottlenecks which have been subject of discussion lately: Among others, the rising number of published articles makes it nearly impossible to get a full overview of the state of the art in a certain field, or reproducibility is hampered by fixed-length, document-based publications which normally cannot cover all details of a research work. Recently, several initiatives have proposed knowledge graphs (KGs) for organising scientific information as a solution to many of the current issues. The focus of these proposals is, however, usually restricted to very specific use cases. In this paper, we aim to transcend this limited perspective by presenting a comprehensive analysis of requirements for an Open Research Knowledge Graph (ORKG) by (a) collecting daily core tasks of a scientist, (b) establishing their consequential requirements for a KG-based system, (c) identifying overlaps and specificities, and their coverage in current solutions. As a result, we map necessary and desirable requirements for successful KG-based science communication, derive implications and outline possible solutions.
  • Item
    Domain-Independent Extraction of Scientific Concepts from Research Articles
    (Cham : Springer, 2020) Brack, Arthur; D'Souza, Jennifer; Hoppe, Anett; Auer, Sören; Ewerth, Ralph; Jose, Joemon M.; Yilmaz, Emine; Magalhães, João; Castells, Pablo; Ferro, Nicola; Silva, Mário J.; Martins, Flávio
    We examine the novel task of domain-independent scientific concept extraction from abstracts of scholarly articles and present two contributions. First, we suggest a set of generic scientific concepts that have been identified in a systematic annotation process. This set of concepts is utilised to annotate a corpus of scientific abstracts from 10 domains of Science, Technology and Medicine at the phrasal level in a joint effort with domain experts. The resulting dataset is used in a set of benchmark experiments to (a) provide baseline performance for this task, (b) examine the transferability of concepts between domains. Second, we present a state-of-the-art deep learning baseline. Further, we propose the active learning strategy for an optimal selection of instances from among the various domains in our data. The experimental results show that (1) a substantial agreement is achievable by non-experts after consultation with domain experts, (2) the baseline system achieves a fairly high F1 score, (3) active learning enables us to nearly halve the amount of required training data.
  • Item
    Scholarly event characteristics in four fields of science: a metrics-based analysis
    (Berlin : Springer Nature, 2020) Fathalla, S.; Vahdati, S.; Lange, C.; Auer, Sören
    One of the key channels of scholarly knowledge exchange are scholarly events such as conferences, workshops, symposiums, etc.; such events are especially important and popular in Computer Science, Engineering, and Natural Sciences. However, scholars encounter problems in finding relevant information about upcoming events and statistics on their historic evolution. In order to obtain a better understanding of scholarly event characteristics in four fields of science, we analyzed the metadata of scholarly events of four major fields of science, namely Computer Science, Physics, Engineering, and Mathematics using Scholarly Events Quality Assessment suite, a suite of ten metrics. In particular, we analyzed renowned scholarly events belonging to five sub-fields within Computer Science, namely World Wide Web, Computer Vision, Software Engineering, Data Management, as well as Security and Privacy. This analysis is based on a systematic approach using descriptive statistics as well as exploratory data analysis. The findings are on the one hand interesting to observe the general evolution and success factors of scholarly events; on the other hand, they allow (prospective) event organizers, publishers, and committee members to assess the progress of their event over time and compare it to other events in the same field; and finally, they help researchers to make more informed decisions when selecting suitable venues for presenting their work. Based on these findings, a set of recommendations has been concluded to different stakeholders, involving event organizers, potential authors, proceedings publishers, and sponsors. Our comprehensive dataset of scholarly events of the aforementioned fields is openly available in a semantic format and maintained collaboratively at OpenResearch.org.
  • Item
    SemSur: A Core Ontology for the Semantic Representation of Research Findings
    (Amsterdam [u.a.] : Elsevier, 2018) Fathalla, Said; Vahdati, Sahar; Auer, Sören; Lange, Christoph; Fensel, Anna; de Boer, Victor; Pellegrini, Tassilo; Kiesling, Elmar; Haslhofer, Bernhard; Hollink, Laura; Schindler, Alexander
    The way research is communicated using text publications has not changed much over the past decades. We have the vision that ultimately researchers will work on a common structured knowledge base comprising comprehensive semantic and machine-comprehensible descriptions of their research, thus making research contributions more transparent and comparable. We present the SemSur ontology for semantically capturing the information commonly found in survey and review articles. SemSur is able to represent scientific results and to publish them in a comprehensive knowledge graph, which provides an efficient overview of a research field, and to compare research findings with related works in a structured way, thus saving researchers a significant amount of time and effort. The new release of SemSur covers more domains, defines better alignment with external ontologies and rules for eliciting implicit knowledge. We discuss possible applications and present an evaluation of our approach with the retrospective, exemplary semantification of a survey. We demonstrate the utility of the SemSur ontology to answer queries about the different research contributions covered by the survey. SemSur is currently used and maintained at OpenResearch.org.
  • Item
    Analysing the requirements for an Open Research Knowledge Graph: use cases, quality requirements, and construction strategies
    (Berlin ; Heidelberg ; New York : Springer, 2021) Brack, Arthur; Hoppe, Anett; Stocker, Markus; Auer, Sören; Ewerth, Ralph
    Current science communication has a number of drawbacks and bottlenecks which have been subject of discussion lately: Among others, the rising number of published articles makes it nearly impossible to get a full overview of the state of the art in a certain field, or reproducibility is hampered by fixed-length, document-based publications which normally cannot cover all details of a research work. Recently, several initiatives have proposed knowledge graphs (KG) for organising scientific information as a solution to many of the current issues. The focus of these proposals is, however, usually restricted to very specific use cases. In this paper, we aim to transcend this limited perspective and present a comprehensive analysis of requirements for an Open Research Knowledge Graph (ORKG) by (a) collecting and reviewing daily core tasks of a scientist, (b) establishing their consequential requirements for a KG-based system, (c) identifying overlaps and specificities, and their coverage in current solutions. As a result, we map necessary and desirable requirements for successful KG-based science communication, derive implications, and outline possible solutions.