Search Results

Now showing 1 - 10 of 12
  • Item
    Towards Customizable Chart Visualizations of Tabular Data Using Knowledge Graphs
    (Cham : Springer, 2020) Wiens, Vitalis; Stocker, Markus; Auer, Sören; Ishita, Emi; Pang, Natalie Lee San; Zhou, Lihong
    Scientific articles are typically published as PDF documents, thus rendering the extraction and analysis of results a cumbersome, error-prone, and often manual effort. New initiatives, such as ORKG, focus on transforming the content and results of scientific articles into structured, machine-readable representations using Semantic Web technologies. In this article, we focus on tabular data of scientific articles, which provide an organized and compressed representation of information. However, chart visualizations can additionally facilitate their comprehension. We present an approach that employs a human-in-the-loop paradigm during the data acquisition phase to define additional semantics for tabular data. The additional semantics guide the creation of chart visualizations for meaningful representations of tabular data. Our approach organizes tabular data into different information groups which are analyzed for the selection of suitable visualizations. The set of suitable visualizations serves as a user-driven selection of visual representations. Additionally, customization for visual representations provides the means for facilitating the understanding and sense-making of information.
  • Item
    Ontology Design for Pharmaceutical Research Outcomes
    (Cham : Springer, 2020) Say, Zeynep; Fathalla, Said; Vahdati, Sahar; Lehmann, Jens; Auer, Sören; Hall, Mark; Merčun, Tanja; Risse, Thomas; Duchateau, Fabien
    The network of scholarly publishing involves generating and exchanging ideas, certifying research, publishing in order to disseminate findings, and preserving outputs. Despite enormous efforts in providing support for each of those steps in scholarly communication, identifying knowledge fragments is still a big challenge. This is due to the heterogeneous nature of the scholarly data and the current paradigm of distribution by publishing (mostly document-based) over journal articles, numerous repositories, and libraries. Therefore, transforming this paradigm to knowledge-based representation is expected to reform the knowledge sharing in the scholarly world. Although many movements have been initiated in recent years, non-technical scientific communities suffer from transforming document-based publishing to knowledge-based publishing. In this paper, we present a model (PharmSci) for scholarly publishing in the pharmaceutical research domain with the goal of facilitating knowledge discovery through effective ontology-based data integration. PharmSci provides machine-interpretable information to the knowledge discovery process. The principles and guidelines of the ontological engineering have been followed. Reasoning-based techniques are also presented in the design of the ontology to improve the quality of targeted tasks for data integration. The developed ontology is evaluated with a validation process and also a quality verification method.
  • Item
    Towards Operational Research Infrastructures with FAIR Data and Services
    (Cham : Springer, 2020) Zhao, Zhiming; Jeffery, Keith; Stocker, Markus; Atkinson, Malcolm; Petzold, Andreas; Zhao, Zhiming; Hellström, Margareta
    Environmental research infrastructures aim to provide scientists with facilities, resources and services to enable scientists to effectively perform advanced research. When addressing societal challenges such as climate change and pollution, scientists usually need data, models and methods from different domains to tackle the complexity of the complete environmental system. Research infrastructures are thus required to enable all data, including services, products, and virtual research environments is FAIR for research communities: Findable, Accessible, Interoperable and Reusable. In this last chapter, we conclude and identify future challenges in research infrastructure operation, user support, interoperability, and future evolution.
  • Item
    Creating a Scholarly Knowledge Graph from Survey Article Tables
    (Cham : Springer, 2020) Oelen, Allard; Stocker, Markus; Auer, Sören; Ishita, Emi; Pang, Natalie Lee San; Zhou, Lihong
    Due to the lack of structure, scholarly knowledge remains hardly accessible for machines. Scholarly knowledge graphs have been proposed as a solution. Creating such a knowledge graph requires manual effort and domain experts, and is therefore time-consuming and cumbersome. In this work, we present a human-in-the-loop methodology used to build a scholarly knowledge graph leveraging literature survey articles. Survey articles often contain manually curated and high-quality tabular information that summarizes findings published in the scientific literature. Consequently, survey articles are an excellent resource for generating a scholarly knowledge graph. The presented methodology consists of five steps, in which tables and references are extracted from PDF articles, tables are formatted and finally ingested into the knowledge graph. To evaluate the methodology, 92 survey articles, containing 160 survey tables, have been imported in the graph. In total, 2626 papers have been added to the knowledge graph using the presented methodology. The results demonstrate the feasibility of our approach, but also indicate that manual effort is required and thus underscore the important role of human experts.
  • Item
    Operational Research Literature as a Use Case for the Open Research Knowledge Graph
    (Cham : Springer, 2020) Runnwerth, Mila; Stocker, Markus; Auer, Sören; Bigatti, Anna Maria; Carette, Jacques; Davenport, James H.; Joswig, Michael; de Wolff, Timo
    The Open Research Knowledge Graph (ORKG) provides machine-actionable access to scholarly literature that habitually is written in prose. Following the FAIR principles, the ORKG makes traditional, human-coded knowledge findable, accessible, interoperable, and reusable in a structured manner in accordance with the Linked Open Data paradigm. At the moment, in ORKG papers are described manually, but in the long run the semantic depth of the literature at scale needs automation. Operational Research is a suitable test case for this vision because the mathematical field and, hence, its publication habits are highly structured: A mundane problem is formulated as a mathematical model, solved or approximated numerically, and evaluated systematically. We study the existing literature with respect to the Assembly Line Balancing Problem and derive a semantic description in accordance with the ORKG. Eventually, selected papers are ingested to test the semantic description and refine it further.
  • Item
    Case Study: ENVRI Science Demonstrators with D4Science
    (Cham : Springer, 2020) Candela, Leonardo; Stocker, Markus; Häggström, Ingemar; Enell, Carl-Fredrik; Vitale, Domenico; Papale, Dario; Grenier, Baptiste; Chen, Yin; Obst, Matthias; Zhao, Zhiming; Hellström, Margareta
    Whenever a community of practice starts developing an IT solution for its use case(s) it has to face the issue of carefully selecting “the platform” to use. Such a platform should match the requirements and the overall settings resulting from the specific application context (including legacy technologies and solutions to be integrated and reused, costs of adoption and operation, easiness in acquiring skills and competencies). There is no one-size-fits-all solution that is suitable for all application context, and this is particularly true for scientific communities and their cases because of the wide heterogeneity characterising them. However, there is a large consensus that solutions from scratch are inefficient and services that facilitate the development and maintenance of scientific community-specific solutions do exist. This chapter describes how a set of diverse communities of practice efficiently developed their science demonstrators (on analysing and producing user-defined atmosphere data products, greenhouse gases fluxes, particle formation, mosquito diseases) by leveraging the services offered by the D4Science infrastructure. It shows that the D4Science design decisions aiming at streamlining implementations are effective. The chapter discusses the added value injected in the science demonstrators and resulting from the reuse of D4Science services, especially regarding Open Science practices and overall quality of service.
  • Item
    Representing Semantified Biological Assays in the Open Research Knowledge Graph
    (Cham : Springer, 2020) Anteghini, Marco; D'Souza, Jennifer; Martins dos Santos, Vitor A.P.; Auer, Sören; Ishita, Emi; Pang, Natalie Lee San; Zhou, Lihong
    In the biotechnology and biomedical domains, recent text mining efforts advocate for machine-interpretable, and preferably, semantified, documentation formats of laboratory processes. This includes wet-lab protocols, (in)organic materials synthesis reactions, genetic manipulations and procedures for faster computer-mediated analysis and predictions. Herein, we present our work on the representation of semantified bioassays in the Open Research Knowledge Graph (ORKG). In particular, we describe a semantification system work-in-progress to generate, automatically and quickly, the critical semantified bioassay data mass needed to foster a consistent user audience to adopt the ORKG for recording their bioassays and facilitate the organisation of research, according to FAIR principles.
  • Item
    Question Answering on Scholarly Knowledge Graphs
    (Cham : Springer, 2020) Jaradeh, Mohamad Yaser; Stocker, Markus; Auer, Sören; Hall, Mark; Merčun, Tanja; Risse, Thomas; Duchateau, Fabien
    Answering questions on scholarly knowledge comprising text and other artifacts is a vital part of any research life cycle. Querying scholarly knowledge and retrieving suitable answers is currently hardly possible due to the following primary reason: machine inactionable, ambiguous and unstructured content in publications. We present JarvisQA, a BERT based system to answer questions on tabular views of scholarly knowledge graphs. Such tables can be found in a variety of shapes in the scholarly literature (e.g., surveys, comparisons or results). Our system can retrieve direct answers to a variety of different questions asked on tabular data in articles. Furthermore, we present a preliminary dataset of related tables and a corresponding set of natural language questions. This dataset is used as a benchmark for our system and can be reused by others. Additionally, JarvisQA is evaluated on two datasets against other baselines and shows an improvement of two to three folds in performance compared to related methods.
  • Item
    Semantic and Knowledge Engineering Using ENVRI RM
    (Cham : Springer, 2020) Martin, Paul; Liao, Xiaofeng; Magagna, Barbara; Stocker, Markus; Zhao, Zhiming; Zhao, Zhiming; Hellström, Margareta
    The ENVRI Reference Model provides architects and engineers with the means to describe the architecture and operational behaviour of environmental and Earth science research infrastructures (RIs) in a standardised way using the standard terminology. This terminology and the relationships between specific classes of concept can be used as the basis for the machine-actionable specification of RIs or RI subsystems. Open Information Linking for Environmental RIs (OIL-E) is a framework for capturing architectural and design knowledge about environmental and Earth science RIs intended to help harmonise vocabulary, promote collaboration and identify common standards and technologies across different research infrastructure initiatives. At its heart is an ontology derived from the ENVRI Reference Model. Using this ontology, RI descriptions can be published as linked data, allowing discovery, querying and comparison using established Semantic Web technologies. It can also be used as an upper ontology by which to connect descriptions of RI entities (whether they be datasets, equipment, processes, etc.) that use other, more specific terminologies. The ENVRI Knowledge Base uses OIL-E to capture information about environmental and Earth science RIs in the ENVRI community for query and comparison. The Knowledge Base can be used to identify the technologies and standards used for particular activities and services and as a basis for evaluating research infrastructure subsystems and behaviours against certain criteria, such as compliance with the FAIR data principles.
  • Item
    Domain-Independent Extraction of Scientific Concepts from Research Articles
    (Cham : Springer, 2020) Brack, Arthur; D'Souza, Jennifer; Hoppe, Anett; Auer, Sören; Ewerth, Ralph; Jose, Joemon M.; Yilmaz, Emine; Magalhães, João; Castells, Pablo; Ferro, Nicola; Silva, Mário J.; Martins, Flávio
    We examine the novel task of domain-independent scientific concept extraction from abstracts of scholarly articles and present two contributions. First, we suggest a set of generic scientific concepts that have been identified in a systematic annotation process. This set of concepts is utilised to annotate a corpus of scientific abstracts from 10 domains of Science, Technology and Medicine at the phrasal level in a joint effort with domain experts. The resulting dataset is used in a set of benchmark experiments to (a) provide baseline performance for this task, (b) examine the transferability of concepts between domains. Second, we present a state-of-the-art deep learning baseline. Further, we propose the active learning strategy for an optimal selection of instances from among the various domains in our data. The experimental results show that (1) a substantial agreement is achievable by non-experts after consultation with domain experts, (2) the baseline system achieves a fairly high F1 score, (3) active learning enables us to nearly halve the amount of required training data.