Search Results

  • Item
    A platform for semantically representing and analyzing open fiscal data
    (Zenodo, 2018) Musyaffa, Fathoni A.; Halilaj, Lavdim; Li, Yakun; Orlandi, Fabrizio; Jabeen, Hajira; Auer, Sören; Vidal, Maria-Esther
    A paper describing the details of platform implementation. Pre-print version of the paper accepted at International Conference On Web Engineering (ICWE) 2018 in Caceres, Spain.
  • Item
    ORKG: Facilitating the Transfer of Research Results with the Open Research Knowledge Graph
    (Sofia : Pensoft, 2021) Auer, Sören; Stocker, Markus; Vogt, Lars; Fraumann, Grischa; Garatzogianni, Alexandra
    This document is an edited version of the original funding proposal entitled 'ORKG: Facilitating the Transfer of Research Results with the Open Research Knowledge Graph' that was submitted to the European Research Council (ERC) Proof of Concept (PoC) Grant in September 2020. The proposal was evaluated by five reviewers and was placed on the reserve list following the evaluations. The main document of the original proposal did not contain an abstract.
  • Item
    NFDI4Ing - the National Research Data Infrastructure for Engineering Sciences
    (Meyrin : CERN, 2020-09-25) Schmitt, Robert H.; Anthofer, Verena; Auer, Sören; Başkaya, Sait; Bischof, Christian; Bronger, Torsten; Claus, Florian; Cordes, Florian; Demandt, Évariste; Eifert, Thomas; Flemisch, Bernd; Fuchs, Matthias; Fuhrmans, Marc; Gerike, Regine; Gerstner, Eva-Maria; Hanke, Vanessa; Heine, Ina; Huebser, Louis; Iglezakis, Dorothea; Jagusch, Gerald; Klinger, Axel; Krafczyk, Manfred; Kraft, Angelina; Kuckertz, Patrick; Küsters, Ulrike; Lachmayer, Roland; Langenbach, Christian; Mozgova, Iryna; Müller, Matthias S.; Nestler, Britta; Pelz, Peter; Politze, Marius; Preuß, Nils; Przybylski-Freund, Marie-Dominique; Rißler-Pipka, Nanette; Robinius, Martin; Schachtner, Joachim; Schlenz, Hartmut; Schwarz, Annett; Schwibs, Jürgen; Selzer, Michael; Sens, Irina; Stäcker, Thomas; Stemmer, Christian; Stille, Wolfgang; Stolten, Detlef; Stotzka, Rainer; Streit, Achim; Strötgen, Robert; Wang, Wei Min
    NFDI4Ing brings together the engineering communities and fosters the management of engineering research data. The consortium represents engineers from all walks of the profession. It offers a unique method-oriented and user-centred approach in order to make engineering research data FAIR – findable, accessible, interoperable, and re-usable. NFDI4Ing was founded in 2017. The consortium has actively engaged engineers across all five engineering research areas of the DFG classification. Leading figures have teamed up with experienced infrastructure providers. As one important step, NFDI4Ing has taken on the task of structuring the wealth of concrete needs in research data management. A broad consensus on typical methods and workflows in engineering research has been established: the archetypes. So far, seven archetypes are harmonising the methodological needs: Alex: bespoke experiments with high variability of setups, Betty: engineering research software, Caden: provenance tracking of physical samples & data samples, Doris: high performance measurement & computation, Ellen: extensive and heterogeneous data requirements, Frank: many participants & simultaneous devices, Golo: field data & distributed systems. A survey of the entire engineering research landscape in Germany confirms that the concept of engineering archetypes has been very well received. 95% of the research groups identify themselves with at least one of the NFDI4Ing archetypes. NFDI4Ing plans to further coordinate its engagement along the gateways provided by the DFG classification of engineering research areas. Consequently, NFDI4Ing will support five community clusters. In addition, an overarching task area will provide seven base services to be accessed by both the community clusters and the archetype task areas. Base services address quality assurance & metrics, research software development, terminologies & metadata, repositories & storage, data security & sovereignty, training, and data & knowledge discovery. With the archetype approach, NFDI4Ing’s work programme is modular and distinctly method-oriented. With the community clusters and base services, NFDI4Ing’s work programme remains firmly user-centred and highly integrated. NFDI4Ing has set in place an internal organisational structure that ensures viability, operational efficiency, and openness to new partners during the course of the consortium’s development. NFDI4Ing’s management team brings in the experience from two applicant institutions and from two years of actively engaging with the engineering communities. Eleven applicant institutions and over fifty participants have committed to carrying out NFDI4Ing’s work programme. Moreover, NFDI4Ing’s connectedness with consortia from nearby disciplinary fields is strong. Collaboration on cross-cutting topics is well prepared and foreseen. As a result, NFDI4Ing is ready to join the National Research Data Infrastructure.
  • Item
    Compacting frequent star patterns in RDF graphs
    (Dordrecht : Springer Science + Business Media B.V, 2020) Karim, Farah; Vidal, Maria-Esther; Auer, Sören
    Knowledge graphs have become a popular formalism for representing entities and their properties using a graph data model, e.g., the Resource Description Framework (RDF). An RDF graph comprises entities of the same type connected to objects or other entities using labeled edges annotated with properties. RDF graphs usually contain entities that share the same objects in a certain group of properties, i.e., they match star patterns composed of these properties and objects. In case the number of these entities or properties in these star patterns is large, the size of the RDF graph and query processing are negatively impacted; we refer to these star patterns as frequent star patterns. We address the problem of identifying frequent star patterns in RDF graphs and devise the concept of factorized RDF graphs, which denote compact representations of RDF graphs where the number of frequent star patterns is minimized. We also develop computational methods to identify frequent star patterns and generate a factorized RDF graph, where compact RDF molecules replace frequent star patterns. A compact RDF molecule of a frequent star pattern denotes an RDF subgraph that instantiates the corresponding star pattern. Instead of having all the entities matching the original frequent star pattern, a surrogate entity is added and related to the properties of the frequent star pattern; it is linked to the entities that originally match the frequent star pattern. Since the edges between the entities and the objects in the frequent star pattern are replaced by edges between these entities and the surrogate entity of the compact RDF molecule, the size of the RDF graph is reduced. We evaluate the performance of our factorization techniques on several RDF graph benchmarks and compare with a baseline built on top of gSpan, a state-of-the-art algorithm to detect frequent patterns. The outcomes demonstrate the efficiency of the proposed approach and show that our techniques are able to reduce the execution time of the baseline approach by at least three orders of magnitude. Additionally, RDF graph size can be reduced by up to 66.56% while data represented in the original RDF graph is preserved.
  • Item
    An OER Recommender System Supporting Accessibility Requirements
    (New York : Association for Computing Machinery, 2020) Elias, Mirette; Tavakoli, Mohammadreza; Lohmann, Steffen; Kismihok, Gabor; Auer, Sören; Gurreiro, Tiago; Nicolau, Hugo; Moffatt, Karyn
    Open Educational Resources (OERs) are becoming a significant source of learning that is widely used for various educational purposes and levels. Learners have diverse backgrounds and needs, especially when it comes to learners with accessibility requirements. Persons with disabilities have significantly lower employment rates partly due to the lack of access to education and vocational rehabilitation and training. It is therefore not surprising that providing high-quality OERs that facilitate self-development towards specific jobs and skills on the labor market, in light of the special preferences of learners with disabilities, is difficult. In this paper, we introduce a personalized OER recommender system that considers skills, occupations, and accessibility properties of learners to retrieve the most adequate and high-quality OERs. This is done by: 1) describing the profile of learners with disabilities, 2) collecting and analysing more than 1,500 OERs, 3) filtering OERs based on their accessibility features and predicted quality, and 4) providing personalised OER recommendations for learners according to their accessibility needs. As a result, the OERs retrieved by our method proved to satisfy more accessibility checks than other OERs. Moreover, we evaluated our results with five experts in educating people with visual and cognitive impairments. The evaluation showed that our recommendations are potentially helpful for learners with accessibility needs.
  • Item
    TinyGenius: Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation
    (New York, NY, United States : Association for Computing Machinery, 2022) Oelen, Allard; Stocker, Markus; Auer, Sören; Aizawa, Akiko
    As the number of published scholarly articles grows steadily each year, new methods are needed to organize scholarly knowledge so that it can be more efficiently discovered and used. Natural Language Processing (NLP) techniques are able to autonomously process scholarly articles at scale and to create machine-readable representations of the article content. However, autonomous NLP methods are far from accurate enough to create a high-quality knowledge graph. Yet quality is crucial for the graph to be useful in practice. We present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. The scholarly context in which the crowd workers operate poses multiple challenges. The explainability of the employed NLP methods is crucial to provide context in order to support the decision process of crowd workers. We employed TinyGenius to populate a paper-centric knowledge graph, using five distinct NLP methods. In the end, the resulting knowledge graph serves as a digital library for scholarly articles.
  • Item
    Question Answering on Scholarly Knowledge Graphs
    (Cham : Springer, 2020) Jaradeh, Mohamad Yaser; Stocker, Markus; Auer, Sören; Hall, Mark; Merčun, Tanja; Risse, Thomas; Duchateau, Fabien
    Answering questions on scholarly knowledge comprising text and other artifacts is a vital part of any research life cycle. Querying scholarly knowledge and retrieving suitable answers is currently hardly possible, primarily because the content of publications is not machine-actionable and is ambiguous and unstructured. We present JarvisQA, a BERT-based system to answer questions on tabular views of scholarly knowledge graphs. Such tables can be found in a variety of shapes in the scholarly literature (e.g., surveys, comparisons or results). Our system can retrieve direct answers to a variety of different questions asked on tabular data in articles. Furthermore, we present a preliminary dataset of related tables and a corresponding set of natural language questions. This dataset is used as a benchmark for our system and can be reused by others. Additionally, JarvisQA is evaluated on two datasets against other baselines and shows a two- to threefold improvement in performance compared to related methods.
  • Item
    Quality Prediction of Open Educational Resources: A Metadata-based Approach
    (Piscataway, NJ : IEEE, 2020) Tavakoli, Mohammadreza; Elias, Mirette; Kismihók, Gábor; Auer, Sören; Chang, Maiga; Sampson, Demetrios G.; Huang, Ronghuai; Hooshyar, Danial; Chen, Nian-Shing; Kinshuk; Pedaste, Margus
    In the last decade, online learning environments have accumulated millions of Open Educational Resources (OERs). However, for learners, finding relevant and high-quality OERs is a complicated and time-consuming activity. Furthermore, metadata play a key role in offering high-quality services such as recommendation and search. Metadata can also be used for automatic OER quality control as, in light of the continuously increasing number of OERs, manual quality control is becoming more and more difficult. In this work, we collected the metadata of 8,887 OERs to perform an exploratory data analysis to observe the effect of quality control on metadata quality. Subsequently, we propose an OER metadata scoring model, and build a metadata-based prediction model to anticipate the quality of OERs. Based on our data and model, we were able to detect high-quality OERs with an F1 score of 94.6%.
  • Item
    Ontology Design for Pharmaceutical Research Outcomes
    (Cham : Springer, 2020) Say, Zeynep; Fathalla, Said; Vahdati, Sahar; Lehmann, Jens; Auer, Sören; Hall, Mark; Merčun, Tanja; Risse, Thomas; Duchateau, Fabien
    The network of scholarly publishing involves generating and exchanging ideas, certifying research, publishing in order to disseminate findings, and preserving outputs. Despite enormous efforts in providing support for each of those steps in scholarly communication, identifying knowledge fragments is still a big challenge. This is due to the heterogeneous nature of the scholarly data and the current paradigm of distribution by publishing (mostly document-based) over journal articles, numerous repositories, and libraries. Therefore, transforming this paradigm to knowledge-based representation is expected to reform knowledge sharing in the scholarly world. Although many movements have been initiated in recent years, non-technical scientific communities struggle with the transition from document-based publishing to knowledge-based publishing. In this paper, we present a model (PharmSci) for scholarly publishing in the pharmaceutical research domain with the goal of facilitating knowledge discovery through effective ontology-based data integration. PharmSci provides machine-interpretable information to the knowledge discovery process. The principles and guidelines of ontological engineering have been followed. Reasoning-based techniques are also presented in the design of the ontology to improve the quality of targeted tasks for data integration. The developed ontology is evaluated with a validation process and also a quality verification method.
  • Item
    Toward Representing Research Contributions in Scholarly Knowledge Graphs Using Knowledge Graph Cells
    (New York City, NY : Association for Computing Machinery, 2020) Vogt, Lars; D'Souza, Jennifer; Stocker, Markus; Auer, Sören
    There is currently a gap between the natural language expression of scholarly publications and their structured semantic content modeling to enable intelligent content search. With the volume of research growing exponentially every year, a search feature operating over semantically structured content is compelling. Toward this end, in this work, we propose a novel semantic data model for modeling the contribution of scientific investigations. Our model, i.e. the Research Contribution Model (RCM), includes a schema of pertinent concepts highlighting six core information units, viz. Objective, Method, Activity, Agent, Material, and Result, on which the contribution hinges. It comprises bottom-up design considerations made from three scientific domains, viz. Medicine, Computer Science, and Agriculture, which we highlight as case studies. For its implementation in a knowledge graph application we introduce the idea of building blocks called Knowledge Graph Cells (KGC), which provide the following characteristics: (1) they limit the expressibility of ontologies to what is relevant in a knowledge graph regarding specific concepts on the theme of research contributions; (2) they are expressible via ABox and TBox expressions; (3) they enforce a certain level of data consistency by ensuring that a uniform modeling scheme is followed through rules and input controls; (4) they organize the knowledge graph into named graphs; (5) they provide information for the front end for displaying the knowledge graph in a human-readable form such as HTML pages; and (6) they can be seamlessly integrated into any existing publishing process that supports form-based input, abstracting its semantic technicalities, including RDF semantification, from the user. Thus RCM joins the trend of existing work toward enhanced digitalization of scholarly publication enabled by an RDF semantification as a knowledge graph fostering the evolution of the scholarly publications beyond written text.