Search Results

Now showing 1 - 8 of 8

Crowdsourcing Scholarly Discourse Annotations

2021, Oelen, Allard, Stocker, Markus, Auer, Sören

The number of scholarly publications grows steadily every year, and it becomes harder to find, assess, and compare scholarly knowledge effectively. Scholarly knowledge graphs have the potential to address these challenges. However, creating such graphs remains a complex task. We propose a method to crowdsource structured scholarly knowledge from paper authors with a web-based user interface supported by artificial intelligence. The interface enables authors to select key sentences for annotation. It integrates multiple machine learning algorithms to assist authors during the annotation, including class recommendation and key sentence highlighting. We envision that the interface is integrated in paper submission processes, for which we define three main task requirements: The task has to be . We evaluated the interface with a user study in which participants were assigned the task of annotating one of their own articles. With the resulting data, we determined whether the participants were able to perform the task successfully. Furthermore, we evaluated the interface’s usability and the participants’ attitude towards the interface with a survey. The results suggest that sentence annotation is a feasible task for researchers and that they do not object to annotating their articles during the submission process.
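
As a rough illustration of the kind of assistance such an interface could provide, the sketch below trains and queries a simple sentence-class recommender. The model, labels, and example sentences are hypothetical stand-ins for illustration only; the paper does not specify this particular approach.

```python
# Illustrative sketch only: a minimal sentence-class recommender standing in for the
# (unspecified) models behind the annotation interface. All names and labels are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny toy training set: key sentences labelled with discourse classes.
sentences = [
    "We propose a new method for entity linking.",
    "Our approach outperforms the baseline by five points.",
    "Existing systems do not scale to large corpora.",
]
labels = ["contribution", "result", "problem"]

recommender = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
recommender.fit(sentences, labels)

# Recommend a class for a sentence the author selected in the interface.
print(recommender.predict(["The evaluation shows a clear improvement over prior work."]))
```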


A Scholarly Knowledge Graph-Powered Dashboard: Implementation and User Evaluation

2022, Lezhnina, Olga, Kismihók, Gábor, Prinz, Manuel, Stocker, Markus, Auer, Sören

Scholarly knowledge graphs provide researchers with a novel modality of information retrieval, and their wider use in academia is beneficial for the digitalization of published works and the development of scholarly communication. To increase the acceptance of scholarly knowledge graphs, we present a dashboard, which visualizes the research contributions on an educational science topic in the frame of the Open Research Knowledge Graph (ORKG). As dashboards are created at the intersection of computer science, graphic design, and human-technology interaction, we used these three perspectives to develop a multi-relational visualization tool aimed at improving the user experience. According to preliminary results of the user evaluation survey, the dashboard was perceived as more appealing than the baseline ORKG-powered interface. Our findings can be used for the development of scholarly knowledge graph-powered dashboards in different domains, thus facilitating acceptance of these novel instruments by research communities and increasing versatility in scholarly communication.


ORKG: Facilitating the Transfer of Research Results with the Open Research Knowledge Graph

2021, Auer, Sören, Stocker, Markus, Vogt, Lars, Fraumann, Grischa, Garatzogianni, Alexandra

This document is an edited version of the original funding proposal entitled 'ORKG: Facilitating the Transfer of Research Results with the Open Research Knowledge Graph' that was submitted to the European Research Council (ERC) Proof of Concept (PoC) Grant in September 2020 (https://erc.europa.eu/funding/proof-concept). The proposal was evaluated by five reviewers and, following the evaluations, was placed on the reserve list. The main document of the original proposal did not contain an abstract.


Information extraction pipelines for knowledge graphs

2023, Jaradeh, Mohamad Yaser, Singh, Kuldeep, Stocker, Markus, Both, Andreas, Auer, Sören

In the last decade, a large number of knowledge graph (KG) completion approaches have been proposed. Albeit effective, these efforts are disjoint, and their collective strengths and weaknesses in effective KG completion have not been studied in the literature. We extend Plumber, a framework that brings together the research community’s disjoint efforts on KG completion. We add further components to Plumber’s architecture, bringing it to 40 reusable components for various KG completion subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable knowledge extraction pipelines and offers 432 distinct pipelines overall. We study the optimization problem of choosing an optimal pipeline for a given input sentence. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and selects an appropriate pipeline. We study the efficacy of Plumber for extracting KG triples using standard datasets over three KGs: DBpedia, Wikidata, and the Open Research Knowledge Graph. Our results demonstrate the effectiveness of Plumber in dynamically generating KG completion pipelines, outperforming all baselines regardless of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among the integrated components, and discuss their limitations.
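
The sketch below illustrates the general idea of composing an extraction pipeline from reusable components and selecting one candidate per input sentence. The component names, candidate pipelines, and the selection stub are hypothetical; in Plumber itself, a transformer-based classifier scores candidates from contextual embeddings of the input.

```python
# Minimal sketch of dynamic pipeline selection in the spirit of Plumber; all names are
# hypothetical, and select_pipeline() is a stand-in for the transformer-based classifier.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Component:
    name: str                     # e.g. a coreference resolver, entity linker, or relation extractor
    run: Callable[[str], str]

@dataclass
class Pipeline:
    components: List[Component]

    def extract(self, sentence: str) -> str:
        # Pass the sentence through each component in order.
        for component in self.components:
            sentence = component.run(sentence)
        return sentence

def select_pipeline(sentence: str, candidates: List[Pipeline]) -> Pipeline:
    # Placeholder: simply pick the first candidate. The real system ranks candidates
    # using contextual embeddings of the input sentence.
    return candidates[0]

coref = Component("coreference-resolution", lambda s: s)   # identity stubs for illustration
linker = Component("entity-linking", lambda s: s)
rel_ex = Component("relation-extraction", lambda s: s)

candidates = [Pipeline([coref, linker, rel_ex]), Pipeline([linker, rel_ex])]
best = select_pipeline("Einstein developed the theory of relativity.", candidates)
print([c.name for c in best.components])
```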


Integrating data and analysis technologies within leading environmental research infrastructures: Challenges and approaches

2021, Huber, Robert, D'Onofrio, Claudio, Devaraju, Anusuriya, Klump, Jens, Loescher, Henry W., Kindermann, Stephan, Guru, Siddeswara, Grant, Mark, Morris, Beryl, Wyborn, Lesley, Evans, Ben, Goldfarb, Doron, Genazzio, Melissa A., Ren, Xiaoli, Magagna, Barbara, Thiemann, Hannes, Stocker, Markus

When researchers analyze data, it typically requires significant effort in data preparation to make the data analysis-ready. This often involves cleaning, pre-processing, harmonizing, or integrating data from one or multiple sources and placing them into a computational environment in a form suitable for analysis. Research infrastructures and their data repositories host data and make them available to researchers, but rarely offer a computational environment for data analysis. Published data are often persistently identified, but such identifiers resolve onto landing pages that must be (manually) navigated to identify how data are accessed. This navigation is typically challenging or impossible for machines. This paper surveys existing approaches for improving environmental data access to facilitate more rapid data analyses in computational environments, and thus contributes to a more seamless integration of data and analysis. By analysing current state-of-the-art approaches and solutions being implemented by world-leading environmental research infrastructures (RIs), we highlight the existing practices to interface data repositories with computational environments and the challenges moving forward. We found that while the level of standardization has improved in recent years, it is still challenging for machines to discover and access data based on persistent identifiers. This is problematic with regard to the emerging requirements for FAIR (Findable, Accessible, Interoperable, and Reusable) data in general, and for the seamless integration of data and analysis in particular. There are a number of promising approaches that would improve the state of the art. A key approach presented here involves software libraries that streamline reading data and metadata into computational environments. We describe this approach in detail for two research infrastructures. We argue that the development and maintenance of specialized libraries for each RI and for the range of programming languages used in data analysis do not scale well. Based on this observation, we propose a set of established standards and web practices that, if implemented by environmental research infrastructures, will enable the development of RI- and programming-language-independent software libraries with much reduced effort for library implementation and maintenance, as well as considerably lower learning requirements for users. To catalyse such advancement, we propose a roadmap and key action points for technology harmonization among RIs, which we argue will build the foundation for efficient and effective integration of data and analysis.
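
As a concrete illustration of the library-based approach, the sketch below retrieves machine-readable metadata for a persistently identified dataset via HTTP content negotiation against the DOI resolver. The DOI is a placeholder, the media type assumes a DataCite-registered identifier, and the inspected fields are examples; none of this is prescribed by the paper.

```python
# Sketch of machine-actionable metadata retrieval for a persistently identified dataset
# via HTTP content negotiation; the DOI used here is a hypothetical placeholder.
import requests

doi = "10.5281/zenodo.0000000"  # placeholder dataset DOI
response = requests.get(
    f"https://doi.org/{doi}",
    headers={"Accept": "application/vnd.datacite.datacite+json"},  # request DataCite JSON metadata
    timeout=30,
)
response.raise_for_status()
metadata = response.json()

# From the metadata, a client library could pick up the title and the data access URL
# and load the data directly into the computational environment.
print(metadata.get("titles"), metadata.get("url"))
```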


The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge

2023, Auer, Sören, Barone, Dante A.C., Bartz, Cassiano, Cortes, Eduardo G., Jaradeh, Mohamad Yaser, Karras, Oliver, Koubarakis, Manolis, Mouromtsev, Dmitry, Pliukhin, Dmitrii, Radyush, Daniil, Shilin, Ivan, Stocker, Markus, Tsalapati, Eleni

Knowledge graphs have gained increasing popularity in the last decade in science and technology. However, knowledge graphs are currently relatively simple to moderately complex semantic structures that are mainly collections of factual statements. Question answering (QA) benchmarks and systems have so far mainly been geared towards encyclopedic knowledge graphs such as DBpedia and Wikidata. We present SciQA, a scientific QA benchmark for scholarly knowledge. The benchmark leverages the Open Research Knowledge Graph (ORKG), which includes almost 170,000 resources describing research contributions of almost 15,000 scholarly articles from 709 research fields. Following a bottom-up methodology, we first manually developed a set of 100 complex questions that can be answered using this knowledge graph. Furthermore, we devised eight question templates with which we automatically generated a further 2,465 questions that can also be answered with the ORKG. The questions cover a range of research fields and question types and are translated into corresponding SPARQL queries over the ORKG. Based on two preliminary evaluations, we show that the resulting SciQA benchmark represents a challenging task for next-generation QA systems. This task is part of the open competitions at the 22nd International Semantic Web Conference 2023 as the Scholarly Question Answering over Linked Data (QALD) Challenge.
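
Since each benchmark question is paired with a SPARQL query over the ORKG, answering questions programmatically amounts to posting such a query to the graph's SPARQL endpoint. The sketch below shows the mechanics with a trivial query; the endpoint URL is an assumption (check the ORKG documentation for the current address) and the query is not taken from the benchmark.

```python
# Illustrative only: run a simple SPARQL query against the ORKG triple store.
# The endpoint URL is assumed here, and the query is a trivial placeholder.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://orkg.org/triplestore")  # assumed public SPARQL endpoint
sparql.setQuery("""
    SELECT ?s ?p ?o
    WHERE { ?s ?p ?o }
    LIMIT 5
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["s"]["value"], binding["p"]["value"], binding["o"]["value"])
```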


Analysing the requirements for an Open Research Knowledge Graph: use cases, quality requirements, and construction strategies

2021, Brack, Arthur, Hoppe, Anett, Stocker, Markus, Auer, Sören, Ewerth, Ralph

Current science communication has a number of drawbacks and bottlenecks which have been the subject of discussion lately: among others, the rising number of published articles makes it nearly impossible to get a full overview of the state of the art in a certain field, and reproducibility is hampered by fixed-length, document-based publications which normally cannot cover all details of a research work. Recently, several initiatives have proposed knowledge graphs (KGs) for organising scientific information as a solution to many of the current issues. The focus of these proposals is, however, usually restricted to very specific use cases. In this paper, we aim to transcend this limited perspective and present a comprehensive analysis of requirements for an Open Research Knowledge Graph (ORKG) by (a) collecting and reviewing daily core tasks of a scientist, (b) establishing their consequential requirements for a KG-based system, and (c) identifying overlaps and specificities, and their coverage in current solutions. As a result, we map necessary and desirable requirements for successful KG-based science communication, derive implications, and outline possible solutions.


Knowledge Graphs - Working Group Charter (NFDI section-metadata) (1.2)

2023, Stocker, Markus, Rossenova, Lozana, Shigapov, Renat, Betancort, Noemi, Dietze, Stefan, Murphy, Bridget, Bölling, Christian, Schubotz, Moritz, Koepler, Oliver

Knowledge graphs are a key technology for implementing the FAIR principles in data infrastructures by ensuring interoperability for both humans and machines. The Working Group "Knowledge Graphs" in Section "(Meta)data, Terminologies, Provenance" of the German National Research Data Infrastructure (Nationale Forschungsdateninfrastruktur (NFDI) e.V.) aims to promote the use of knowledge graphs in all NFDI consortia, to facilitate cross-domain data interlinking and federation following the FAIR principles, and to contribute to the joint development of tools and technologies that enable the transformation of structured and unstructured data into semantically reusable knowledge across different domains.