Browsing by Author "Stocker, Markus"
Now showing 1 - 20 of 25
- Analysing the requirements for an Open Research Knowledge Graph: use cases, quality requirements, and construction strategies (Berlin ; Heidelberg ; New York : Springer, 2021) Brack, Arthur; Hoppe, Anett; Stocker, Markus; Auer, Sören; Ewerth, Ralph. Current science communication has a number of drawbacks and bottlenecks which have lately been the subject of discussion: among others, the rising number of published articles makes it nearly impossible to get a full overview of the state of the art in a certain field, and reproducibility is hampered by fixed-length, document-based publications which normally cannot cover all details of a research work. Recently, several initiatives have proposed knowledge graphs (KGs) for organising scientific information as a solution to many of the current issues. The focus of these proposals is, however, usually restricted to very specific use cases. In this paper, we aim to transcend this limited perspective and present a comprehensive analysis of requirements for an Open Research Knowledge Graph (ORKG) by (a) collecting and reviewing the daily core tasks of a scientist, (b) establishing their consequential requirements for a KG-based system, and (c) identifying overlaps and specificities, and their coverage in current solutions. As a result, we map necessary and desirable requirements for successful KG-based science communication, derive implications, and outline possible solutions.
- Building Scholarly Knowledge Bases with Crowdsourcing and Text Mining (Aachen : RWTH, 2020) Stocker, Markus; Zhang, Chengzhi; Mayr, Philipp; Lu, Wei; Zhang, Yi. For centuries, scholarly knowledge has been buried in documents. While articles are great for conveying the story of scientific work to peers, they make it hard for machines to process scholarly knowledge. The recent proliferation of the scholarly literature and the increasing inability of researchers to digest, reproduce, and reuse its content are constant reminders that we urgently need a transformative digitalization of the scholarly literature. Building on the Open Research Knowledge Graph (http://orkg.org) as a concrete research infrastructure, in this talk we present how, using crowdsourcing and text mining, humans and machines can collaboratively build scholarly knowledge bases, i.e. systems that acquire, curate, and publish data, information, and knowledge from the scholarly literature in structured and semantic form. We discuss some key challenges that human and technical infrastructures face, as well as the possibilities scholarly knowledge bases enable.
- Case Study: ENVRI Science Demonstrators with D4Science (Cham : Springer, 2020) Candela, Leonardo; Stocker, Markus; Häggström, Ingemar; Enell, Carl-Fredrik; Vitale, Domenico; Papale, Dario; Grenier, Baptiste; Chen, Yin; Obst, Matthias; Zhao, Zhiming; Hellström, Margareta. Whenever a community of practice starts developing an IT solution for its use case(s), it has to face the issue of carefully selecting “the platform” to use. Such a platform should match the requirements and the overall settings resulting from the specific application context (including legacy technologies and solutions to be integrated and reused, costs of adoption and operation, and the ease of acquiring skills and competencies). There is no one-size-fits-all solution suitable for all application contexts, and this is particularly true for scientific communities and their use cases because of the wide heterogeneity characterising them. However, there is broad consensus that developing solutions from scratch is inefficient, and services that facilitate the development and maintenance of scientific community-specific solutions do exist. This chapter describes how a set of diverse communities of practice efficiently developed their science demonstrators (on analysing and producing user-defined atmosphere data products, greenhouse gas fluxes, particle formation, and mosquito-borne diseases) by leveraging the services offered by the D4Science infrastructure. It shows that the D4Science design decisions aimed at streamlining implementations are effective. The chapter discusses the added value injected into the science demonstrators by the reuse of D4Science services, especially regarding Open Science practices and overall quality of service.
- Creating a Scholarly Knowledge Graph from Survey Article Tables (Cham : Springer, 2020) Oelen, Allard; Stocker, Markus; Auer, Sören; Ishita, Emi; Pang, Natalie Lee San; Zhou, Lihong. Due to the lack of structure, scholarly knowledge remains hardly accessible for machines. Scholarly knowledge graphs have been proposed as a solution. Creating such a knowledge graph requires manual effort and domain experts, and is therefore time-consuming and cumbersome. In this work, we present a human-in-the-loop methodology used to build a scholarly knowledge graph leveraging literature survey articles. Survey articles often contain manually curated, high-quality tabular information that summarizes findings published in the scientific literature. Consequently, survey articles are an excellent resource for generating a scholarly knowledge graph. The presented methodology consists of five steps, in which tables and references are extracted from PDF articles, and tables are formatted and finally ingested into the knowledge graph. To evaluate the methodology, 92 survey articles, containing 160 survey tables, have been imported into the graph. In total, 2626 papers have been added to the knowledge graph using the presented methodology. The results demonstrate the feasibility of our approach, but also indicate that manual effort is required and thus underscore the important role of human experts.
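The final ingestion step of the methodology above, in which curated survey tables become knowledge-graph statements, can be sketched as a simple row-to-triples conversion. This is a minimal illustration, not the authors' actual pipeline; the paper identifiers and column names are hypothetical.

```python
# Sketch: convert survey-table rows into (subject, predicate, object) triples
# for ingestion into a knowledge graph. Each row describes one paper; each
# populated column yields one statement about that paper.

def table_to_triples(table, subject_column):
    """Convert a list of row dicts into subject-predicate-object triples."""
    triples = []
    for row in table:
        subject = row[subject_column]
        for column, value in row.items():
            if column != subject_column and value is not None:
                triples.append((subject, column, value))
    return triples

# Hypothetical survey table summarizing two papers.
survey_table = [
    {"paper": "Smith2019", "method": "CNN", "f1_score": 0.87},
    {"paper": "Lee2020", "method": "BiLSTM", "f1_score": 0.91},
]

triples = table_to_triples(survey_table, subject_column="paper")
```

In a real ingestion step, the subjects and predicates would be resolved to graph resources rather than kept as plain strings.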
- Crowdsourcing Scholarly Discourse Annotations (New York, NY : ACM, 2021) Oelen, Allard; Stocker, Markus; Auer, Sören. The number of scholarly publications grows steadily every year, and it becomes harder to find, assess, and compare scholarly knowledge effectively. Scholarly knowledge graphs have the potential to address these challenges. However, creating such graphs remains a complex task. We propose a method to crowdsource structured scholarly knowledge from paper authors with a web-based user interface supported by artificial intelligence. The interface enables authors to select key sentences for annotation. It integrates multiple machine learning algorithms to assist authors during the annotation, including class recommendation and key sentence highlighting. We envision that the interface is integrated into paper submission processes, for which we define three main task requirements. We evaluated the interface with a user study in which participants were assigned the task of annotating one of their own articles. With the resulting data, we determined whether the participants were able to perform the task successfully. Furthermore, we evaluated the interface’s usability and the participants’ attitude towards the interface with a survey. The results suggest that sentence annotation is a feasible task for researchers and that they do not object to annotating their articles during the submission process.
- Curating Scientific Information in Knowledge Infrastructures (Paris : CODATA, 2018) Stocker, Markus; Paasonen, Pauli; Fiebig, Markus; Zaidan, Martha A.; Hardisty, Alex. Interpreting observational data is a fundamental task in the sciences, specifically in earth and environmental science, where observational data are increasingly acquired, curated, and published systematically by environmental research infrastructures. Typically subject to substantial processing, observational data are used by research communities, their research groups, and individual scientists, who interpret such primary data for their meaning in the context of research investigations. The result of interpretation is information—meaningful secondary or derived data—about the observed environment. Research infrastructures and research communities are thus essential to evolving uninterpreted observational data into information. In digital form, the classical bearers of information are the commonly known “(elaborated) data products,” for instance maps. In such form, meaning is generally implicit, e.g. in map colour coding, and thus largely inaccessible to machines. The systematic acquisition, curation, possible publishing, and further processing of information gained in observational data interpretation—as machine-readable data and their machine-readable meaning—is not common practice among environmental research infrastructures. For a use case in aerosol science, we elucidate these problems and present a Jupyter-based prototype infrastructure that exploits a machine learning approach to interpretation and could support a research community in interpreting observational data and, more importantly, in curating and further using the resulting information about a studied natural phenomenon.
- FAIR Convergence Matrix: Optimizing the Reuse of Existing FAIR-Related Resources (Cambridge, MA : MIT Press, 2020) Sustkova, Hana Pergl; Hettne, Kristina Maria; Wittenburg, Peter; Jacobsen, Annika; Kuhn, Tobias; Pergl, Robert; Slifka, Jan; McQuilton, Peter; Magagna, Barbara; Sansone, Susanna-Assunta; Stocker, Markus; Imming, Melanie; Lannom, Larry; Musen, Mark; Schultes, Erik. The FAIR principles articulate the behaviors expected from digital artifacts that are Findable, Accessible, Interoperable, and Reusable by machines and by people. Although by now widely accepted, the FAIR principles by design do not explicitly consider actual implementation choices enabling FAIR behaviors. As different communities have their own, often well-established implementation preferences and priorities for data reuse, coordinating a broadly accepted, widely used FAIR implementation approach remains a global challenge. In an effort to accelerate broad community convergence on FAIR implementation options, the GO FAIR community has launched the development of the FAIR Convergence Matrix. The Matrix is a platform that compiles, for any community of practice, an inventory of its self-declared FAIR implementation choices and challenges. The Convergence Matrix is itself a FAIR resource, openly available, and encourages voluntary participation by any self-identified community of practice (not only the GO FAIR Implementation Networks). Based on patterns of use and reuse of existing resources, the Convergence Matrix supports the transparent derivation of strategies that optimally coordinate convergence on standards and technologies in the emerging Internet of FAIR Data and Services.
- Generate FAIR Literature Surveys with Scholarly Knowledge Graphs (New York City, NY : Association for Computing Machinery, 2020) Oelen, Allard; Jaradeh, Mohamad Yaser; Stocker, Markus; Auer, Sören. Reviewing scientific literature is a cumbersome, time-consuming, but crucial activity in research. Leveraging a scholarly knowledge graph, we present a methodology and a system for comparing scholarly literature, in particular research contributions describing the addressed problem, utilized materials, employed methods, and yielded results. The system can be used by researchers to quickly get familiar with existing work in a specific research domain (e.g., a concrete research question or hypothesis). Additionally, it can be used to publish literature surveys following the FAIR Data Principles. The methodology to create a research contribution comparison consists of multiple tasks, specifically: (a) finding similar contributions, (b) aligning contribution descriptions, (c) visualizing, and finally (d) publishing the comparison. The methodology is implemented within the Open Research Knowledge Graph (ORKG), a scholarly infrastructure that enables researchers to collaboratively describe, find, and compare research contributions. We evaluate the implementation using data extracted from published review articles. The evaluation also addresses the FAIRness of comparisons published with the ORKG.
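Step (b) of the comparison methodology, aligning contribution descriptions, can be illustrated with a small sketch: contributions typically describe different sets of properties, so a comparison table is built over the union of all properties, leaving gaps where a contribution says nothing. The property names and values below are illustrative, not ORKG data, and this is not the system's actual implementation.

```python
# Sketch: align contribution descriptions into a property-by-contribution
# comparison table. None marks a gap (the contribution does not describe
# that property).

def align_contributions(contributions):
    """Build a comparison table over the union of described properties."""
    properties = sorted({p for c in contributions for p in c["data"]})
    return {
        p: [c["data"].get(p) for c in contributions]
        for p in properties
    }

# Two hypothetical contribution descriptions with partially overlapping properties.
contributions = [
    {"id": "C1", "data": {"method": "CNN", "dataset": "MNIST"}},
    {"id": "C2", "data": {"method": "SVM", "f1_score": 0.88}},
]

comparison = align_contributions(contributions)
```

Each table row then corresponds to one property, with one column per contribution, which is essentially the structure a published comparison renders.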
- Information extraction pipelines for knowledge graphs (London : Springer, 2023) Jaradeh, Mohamad Yaser; Singh, Kuldeep; Stocker, Markus; Both, Andreas; Auer, Sören. In the last decade, a large number of knowledge graph (KG) completion approaches were proposed. Albeit effective, these efforts are disjoint, and their collective strengths and weaknesses in effective KG completion have not been studied in the literature. We extend Plumber, a framework that brings together the research community’s disjoint efforts on KG completion. We include more components in the architecture of Plumber, which now comprises 40 reusable components for various KG completion subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable knowledge extraction pipelines and offers 432 distinct pipelines overall. We study the optimization problem of choosing optimal pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting KG triples using standard datasets over three KGs: DBpedia, Wikidata, and the Open Research Knowledge Graph. Our results demonstrate the effectiveness of Plumber in dynamically generating KG completion pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.
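The pipeline space described above arises combinatorially: one component choice per subtask yields a distinct pipeline. The following sketch only illustrates that enumeration; the component names are stand-ins for real tools, the subtask list is abbreviated, and Plumber's actual selection uses a trained transformer classifier rather than enumeration alone.

```python
# Sketch: enumerate candidate KG-completion pipelines as one component choice
# per subtask. With 2 options for each of 3 subtasks, this yields 2^3 = 8
# candidates (Plumber's real component pool yields 432).
from itertools import product

COMPONENTS = {
    "coreference_resolution": ["coref_tool_a", "coref_tool_b"],
    "entity_linking": ["linker_a", "linker_b"],
    "relation_extraction": ["extractor_a", "extractor_b"],
}

def candidate_pipelines(components):
    """Return every subtask-to-component assignment as a pipeline dict."""
    subtasks = list(components)
    return [dict(zip(subtasks, combo)) for combo in product(*components.values())]

pipelines = candidate_pipelines(COMPONENTS)
```

A selection model would then score each candidate against the input sentence and run only the best pipeline.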
- Integrating data and analysis technologies within leading environmental research infrastructures: Challenges and approaches (Amsterdam [u.a.] : Elsevier, 2021) Huber, Robert; D'Onofrio, Claudio; Devaraju, Anusuriya; Klump, Jens; Loescher, Henry W.; Kindermann, Stephan; Guru, Siddeswara; Grant, Mark; Morris, Beryl; Wyborn, Lesley; Evans, Ben; Goldfarb, Doron; Genazzio, Melissa A.; Ren, Xiaoli; Magagna, Barbara; Thiemann, Hannes; Stocker, Markus. When researchers analyze data, it typically requires significant effort in data preparation to make the data analysis-ready. This often involves cleaning, pre-processing, harmonizing, or integrating data from one or multiple sources and placing them into a computational environment in a form suitable for analysis. Research infrastructures (RIs) and their data repositories host data and make them available to researchers, but rarely offer a computational environment for data analysis. Published data are often persistently identified, but such identifiers resolve to landing pages that must be (manually) navigated to determine how the data can be accessed. This navigation is typically challenging or impossible for machines. This paper surveys existing approaches for improving environmental data access to facilitate more rapid data analyses in computational environments, and thus contribute to a more seamless integration of data and analysis. By analysing current state-of-the-art approaches and solutions being implemented by world-leading environmental research infrastructures, we highlight existing practices for interfacing data repositories with computational environments and the challenges moving forward. We found that while the level of standardization has improved in recent years, it is still challenging for machines to discover and access data based on persistent identifiers. This is problematic with regard to the emerging requirements for FAIR (Findable, Accessible, Interoperable, and Reusable) data in general, and for the seamless integration of data and analysis in particular. There are a number of promising approaches that would improve the state of the art. A key approach presented here involves software libraries that streamline reading data and metadata into computational environments. We describe this approach in detail for two research infrastructures. We argue that developing and maintaining specialized libraries for each RI, and for the range of programming languages used in data analysis, does not scale well. Based on this observation, we propose a set of established standards and web practices that, if implemented by environmental research infrastructures, will enable the development of RI- and programming-language-independent software libraries, with much reduced effort required for library implementation and maintenance as well as considerably lower learning requirements for users. To catalyse such advancement, we propose a roadmap and key action points for technology harmonization among RIs that, we argue, will build the foundation for efficient and effective integration of data and analysis.
- Knowledge Graphs - Working Group Charter (NFDI section-metadata) (1.2) (Genève : CERN, 2023) Stocker, Markus; Rossenova, Lozana; Shigapov, Renat; Betancort, Noemi; Dietze, Stefan; Murphy, Bridget; Bölling, Christian; Schubotz, Moritz; Koepler, Oliver. Knowledge graphs are a key technology for implementing the FAIR principles in data infrastructures by ensuring interoperability for both humans and machines. The Working Group "Knowledge Graphs" in the Section "(Meta)data, Terminologies, Provenance" of the German National Research Data Infrastructure (Nationale Forschungsdateninfrastruktur (NFDI) e.V.) aims to promote the use of knowledge graphs in all NFDI consortia, to facilitate cross-domain data interlinking and federation following the FAIR principles, and to contribute to the joint development of tools and technologies that enable the transformation of structured and unstructured data into semantically reusable knowledge across different domains.
- Open Research Knowledge Graph (Goettingen : Cuvillier Verlag, 2024-05-07) Auer, Sören; Ilangovan, Vinodh; Stocker, Markus; Tiwari, Sanju; Vogt, Lars; Bernard-Verdier, Maud; D'Souza, Jennifer; Fadel, Kamel; Farfar, Kheir Eddine; Göpfert, Jan; Haris, Muhammad; Heger, Tina; Hussein, Hassan; Jaradeh, Yaser; Jeschke, Jonathan M.; Jiomekong, Azanzi; Kabongo, Salomon; Karras, Oliver; Kuckertz, Patrick; Kullamann, Felix; Martin, Emily A.; Oelen, Allard; Perez-Alvarez, Ricardo; Prinz, Manuel; Snyder, Lauren D.; Stolten, Detlef; Weinand, Jann M. As we mark the fifth anniversary of the alpha release of the Open Research Knowledge Graph (ORKG), it is both timely and exhilarating to celebrate the significant strides made in this pioneering project. We designed this book as a tribute to the evolution and achievements of the ORKG and as a practical guide encapsulating its essence in a form that resonates with both the general reader and the specialist. The ORKG has opened a new era in the way scholarly knowledge is curated, managed, and disseminated. By transforming vast arrays of unstructured narrative text into structured, machine-processable knowledge, the ORKG has emerged as an essential service with sophisticated functionalities. Over the past five years, our team has developed the ORKG into a vibrant platform that enhances the accessibility and visibility of scientific research. This book serves as a non-technical guide and a comprehensive reference for new and existing users that outlines the ORKG’s approach, its technologies, and its role in revolutionizing scholarly communication. By elucidating how the ORKG facilitates the collection, enhancement, and sharing of knowledge, we invite readers to appreciate the value and potential of this groundbreaking digital tool presented in a tangible form. Looking ahead, we are thrilled to announce the upcoming unveiling of promising new features and tools at the fifth-year celebration of the ORKG’s alpha release. These innovations are set to redefine the boundaries of machine assistance enabled by research knowledge graphs. Among these enhancements, you can expect more intuitive interfaces that simplify the user experience, and enhanced machine learning models that improve the automation and accuracy of data curation. We also included a glossary that clarifies key terms and concepts associated with the ORKG, to ensure that all readers, regardless of their technical background, can fully engage with and understand the content presented. This book transcends the boundaries of a typical technical report. We crafted it as an inspiration for future applications, a testament to the ongoing evolution in scholarly communication that invites further collaboration and innovation. Let this book serve as both your guide and an invitation to explore the ORKG as it continues to grow and shape the landscape of scientific inquiry and communication.
- Operational Research Literature as a Use Case for the Open Research Knowledge Graph (Cham : Springer, 2020) Runnwerth, Mila; Stocker, Markus; Auer, Sören; Bigatti, Anna Maria; Carette, Jacques; Davenport, James H.; Joswig, Michael; de Wolff, Timo. The Open Research Knowledge Graph (ORKG) provides machine-actionable access to scholarly literature that is habitually written in prose. Following the FAIR principles, the ORKG makes traditional, human-coded knowledge findable, accessible, interoperable, and reusable in a structured manner in accordance with the Linked Open Data paradigm. At the moment, papers in the ORKG are described manually, but in the long run, capturing the semantic depth of the literature at scale will need automation. Operational Research is a suitable test case for this vision because the mathematical field and, hence, its publication habits are highly structured: a mundane problem is formulated as a mathematical model, solved or approximated numerically, and evaluated systematically. We study the existing literature with respect to the Assembly Line Balancing Problem and derive a semantic description in accordance with the ORKG. Eventually, selected papers are ingested to test the semantic description and refine it further.
- ORKG: Facilitating the Transfer of Research Results with the Open Research Knowledge Graph (Sofia : Pensoft, 2021) Auer, Sören; Stocker, Markus; Vogt, Lars; Fraumann, Grischa; Garatzogianni, Alexandra. This document is an edited version of the original funding proposal entitled 'ORKG: Facilitating the Transfer of Research Results with the Open Research Knowledge Graph' that was submitted to the European Research Council (ERC) Proof of Concept (PoC) Grant in September 2020 (https://erc.europa.eu/funding/proof-concept). The proposal was evaluated by five reviewers and, following the evaluations, was placed on the reserve list. The main document of the original proposal did not contain an abstract.
- Persistent Identification Of Instruments (Ithaka : Cornell University, 2020) Stocker, Markus; Darroch, Louise; Krahl, Rolf; Habermann, Ted; Devaraju, Anusuriya; Schwardmann, Ulrich; D'Onofrio, Claudio; Häggström, Ingemar. Instruments play an essential role in creating research data. Given the importance of instruments and associated metadata to the assessment of data quality and data reuse, globally unique, persistent, and resolvable identification of instruments is crucial. The Research Data Alliance Working Group Persistent Identification of Instruments (PIDINST) developed a community-driven solution for persistent identification of instruments, which we present and discuss in this paper. Based on an analysis of 10 use cases, PIDINST developed a metadata schema and prototyped the schema implementation with DataCite and ePIC as representative persistent identifier infrastructures, and with HZB (Helmholtz-Zentrum Berlin für Materialien und Energie) and BODC (British Oceanographic Data Centre) as representative institutional instrument providers. These implementations demonstrate the viability of the proposed solution in practice. Moving forward, PIDINST will further catalyse adoption and consolidate the schema by addressing new stakeholder requirements.
- Question Answering on Scholarly Knowledge Graphs (Cham : Springer, 2020) Jaradeh, Mohamad Yaser; Stocker, Markus; Auer, Sören; Hall, Mark; Merčun, Tanja; Risse, Thomas; Duchateau, Fabien. Answering questions on scholarly knowledge comprising text and other artifacts is a vital part of any research life cycle. Querying scholarly knowledge and retrieving suitable answers is currently hardly possible for one primary reason: the machine-inactionable, ambiguous, and unstructured content of publications. We present JarvisQA, a BERT-based system to answer questions on tabular views of scholarly knowledge graphs. Such tables can be found in a variety of shapes in the scholarly literature (e.g., surveys, comparisons, or results). Our system can retrieve direct answers to a variety of different questions asked on tabular data in articles. Furthermore, we present a preliminary dataset of related tables and a corresponding set of natural language questions. This dataset is used as a benchmark for our system and can be reused by others. Additionally, JarvisQA is evaluated on two datasets against other baselines and shows a two- to three-fold improvement in performance compared to related methods.
- Requirements Analysis for an Open Research Knowledge Graph (Berlin ; Heidelberg : Springer, 2020) Brack, Arthur; Hoppe, Anett; Stocker, Markus; Auer, Sören; Ewerth, Ralph; Hall, Mark; Merčun, Tanja; Risse, Thomas; Duchateau, Fabien. Current science communication has a number of drawbacks and bottlenecks which have lately been the subject of discussion: among others, the rising number of published articles makes it nearly impossible to get a full overview of the state of the art in a certain field, and reproducibility is hampered by fixed-length, document-based publications which normally cannot cover all details of a research work. Recently, several initiatives have proposed knowledge graphs (KGs) for organising scientific information as a solution to many of the current issues. The focus of these proposals is, however, usually restricted to very specific use cases. In this paper, we aim to transcend this limited perspective by presenting a comprehensive analysis of requirements for an Open Research Knowledge Graph (ORKG) by (a) collecting the daily core tasks of a scientist, (b) establishing their consequential requirements for a KG-based system, and (c) identifying overlaps and specificities, and their coverage in current solutions. As a result, we map necessary and desirable requirements for successful KG-based science communication, derive implications, and outline possible solutions.
- A Scholarly Knowledge Graph-Powered Dashboard: Implementation and User Evaluation (Lausanne : Frontiers Media, 2022) Lezhnina, Olga; Kismihók, Gábor; Prinz, Manuel; Stocker, Markus; Auer, Sören. Scholarly knowledge graphs provide researchers with a novel modality of information retrieval, and their wider use in academia is beneficial for the digitalization of published works and the development of scholarly communication. To increase the acceptance of scholarly knowledge graphs, we present a dashboard which visualizes the research contributions on an educational science topic within the frame of the Open Research Knowledge Graph (ORKG). As dashboards are created at the intersection of computer science, graphic design, and human-technology interaction, we used these three perspectives to develop a multi-relational visualization tool aimed at improving the user experience. According to preliminary results of the user evaluation survey, the dashboard was perceived as more appealing than the baseline ORKG-powered interface. Our findings can be used for the development of scholarly knowledge graph-powered dashboards in different domains, thus facilitating the acceptance of these novel instruments by research communities and increasing versatility in scholarly communication.
- Semantic and Knowledge Engineering Using ENVRI RM (Cham : Springer, 2020) Martin, Paul; Liao, Xiaofeng; Magagna, Barbara; Stocker, Markus; Zhao, Zhiming; Zhao, Zhiming; Hellström, Margareta. The ENVRI Reference Model provides architects and engineers with the means to describe the architecture and operational behaviour of environmental and Earth science research infrastructures (RIs) in a standardised way using standard terminology. This terminology and the relationships between specific classes of concept can be used as the basis for the machine-actionable specification of RIs or RI subsystems. Open Information Linking for Environmental RIs (OIL-E) is a framework for capturing architectural and design knowledge about environmental and Earth science RIs, intended to help harmonise vocabulary, promote collaboration, and identify common standards and technologies across different research infrastructure initiatives. At its heart is an ontology derived from the ENVRI Reference Model. Using this ontology, RI descriptions can be published as linked data, allowing discovery, querying, and comparison using established Semantic Web technologies. It can also be used as an upper ontology by which to connect descriptions of RI entities (whether they be datasets, equipment, processes, etc.) that use other, more specific terminologies. The ENVRI Knowledge Base uses OIL-E to capture information about environmental and Earth science RIs in the ENVRI community for query and comparison. The Knowledge Base can be used to identify the technologies and standards used for particular activities and services, and as a basis for evaluating research infrastructure subsystems and behaviours against certain criteria, such as compliance with the FAIR data principles.
- The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge (London : Nature Publishing Group, 2023) Auer, Sören; Barone, Dante A.C.; Bartz, Cassiano; Cortes, Eduardo G.; Jaradeh, Mohamad Yaser; Karras, Oliver; Koubarakis, Manolis; Mouromtsev, Dmitry; Pliukhin, Dmitrii; Radyush, Daniil; Shilin, Ivan; Stocker, Markus; Tsalapati, Eleni. Knowledge graphs have gained increasing popularity in the last decade in science and technology. However, knowledge graphs are currently relatively simple to moderately complex semantic structures that are mainly collections of factual statements. Question answering (QA) benchmarks and systems were so far mainly geared towards encyclopedic knowledge graphs such as DBpedia and Wikidata. We present SciQA, a scientific QA benchmark for scholarly knowledge. The benchmark leverages the Open Research Knowledge Graph (ORKG), which includes almost 170,000 resources describing research contributions of almost 15,000 scholarly articles from 709 research fields. Following a bottom-up methodology, we first manually developed a set of 100 complex questions that can be answered using this knowledge graph. Furthermore, we devised eight question templates with which we automatically generated a further 2,465 questions that can also be answered with the ORKG. The questions cover a range of research fields and question types and are translated into corresponding SPARQL queries over the ORKG. Based on two preliminary evaluations, we show that the resulting SciQA benchmark represents a challenging task for next-generation QA systems. This task is part of the open competitions at the 22nd International Semantic Web Conference 2023 as the Scholarly Question Answering over Linked Data (QALD) Challenge.
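The template-based question generation described above pairs a natural-language template with a SPARQL query template, both instantiated from graph data. The sketch below illustrates only the mechanism; the template wording and all IRIs are placeholders, not actual SciQA templates or ORKG identifiers.

```python
# Sketch: instantiate one hypothetical question template and its paired
# SPARQL query template with a research-field label.

QUESTION_TEMPLATE = "Which papers address research in the field of {field}?"

SPARQL_TEMPLATE = """\
SELECT ?paper ?title WHERE {{
  ?paper a <http://example.org/Paper> ;
         <http://example.org/hasField> "{field}" ;
         <http://example.org/title> ?title .
}}"""

def instantiate(field):
    """Return the (question, query) pair for one field label."""
    return (
        QUESTION_TEMPLATE.format(field=field),
        SPARQL_TEMPLATE.format(field=field),
    )

question, query = instantiate("Semantic Web")
```

Generating thousands of benchmark questions then amounts to iterating such templates over the field labels (and other bindings) present in the graph.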