Search Results

Now showing 1 - 10 of 13

Easy Semantification of Bioassays

2022, Anteghini, Marco, D’Souza, Jennifer, dos Santos, Vitor A. P. Martins, Auer, Sören

Biological data and knowledge bases increasingly rely on Semantic Web technologies and the use of knowledge graphs for data integration, retrieval and federated queries. We propose a solution for automatically semantifying biological assays. Our solution contrasts the problem of automated semantification as labeling versus clustering where the two methods are on opposite ends of the method complexity spectrum. Characteristically modeling our problem, we find the clustering solution significantly outperforms a deep neural network state-of-the-art labeling approach. This novel contribution is based on two factors: 1) a learning objective closely modeled after the data outperforms an alternative approach with sophisticated semantic modeling; 2) automatically semantifying biological assays achieves a high performance F1 of nearly 83%, which to our knowledge is the first reported standardized evaluation of the task offering a strong benchmark model.

TinyGenius: Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation

2022, Oelen, Allard, Stocker, Markus, Auer, Sören, Aizawa, Akiko

As the number of published scholarly articles grows steadily each year, new methods are needed to organize scholarly knowledge so that it can be more efficiently discovered and used. Natural Language Processing (NLP) techniques are able to autonomously process scholarly articles at scale and to create machine-readable representations of the article content. However, autonomous NLP methods are far from accurate enough to create a high-quality knowledge graph. Yet quality is crucial for the graph to be useful in practice. We present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. The scholarly context in which the crowd workers operate has multiple challenges. The explainability of the employed NLP methods is crucial to provide context in order to support the decision process of crowd workers. We employed TinyGenius to populate a paper-centric knowledge graph, using five distinct NLP methods. In the end, the resulting knowledge graph serves as a digital library for scholarly articles.

Toward Representing Research Contributions in Scholarly Knowledge Graphs Using Knowledge Graph Cells

2020, Vogt, Lars, D'Souza, Jennifer, Stocker, Markus, Auer, Sören

There is currently a gap between the natural language expression of scholarly publications and their structured semantic content modeling to enable intelligent content search. With the volume of research growing exponentially every year, a search feature operating over semantically structured content is compelling. Toward this end, in this work, we propose a novel semantic data model for modeling the contribution of scientific investigations. Our model, i.e. the Research Contribution Model (RCM), includes a schema of pertinent concepts highlighting six core information units, viz. Objective, Method, Activity, Agent, Material, and Result, on which the contribution hinges. It comprises bottom-up design considerations made from three scientific domains, viz. Medicine, Computer Science, and Agriculture, which we highlight as case studies. For its implementation in a knowledge graph application we introduce the idea of building blocks called Knowledge Graph Cells (KGC), which provide the following characteristics: (1) they limit the expressibility of ontologies to what is relevant in a knowledge graph regarding specific concepts on the theme of research contributions; (2) they are expressible via ABox and TBox expressions; (3) they enforce a certain level of data consistency by ensuring that a uniform modeling scheme is followed through rules and input controls; (4) they organize the knowledge graph into named graphs; (5) they provide information for the front end for displaying the knowledge graph in a human-readable form such as HTML pages; and (6) they can be seamlessly integrated into any existing publishing process that supports form-based input abstracting its semantic technicalities including RDF semantification from the user. Thus RCM joins the trend of existing work toward enhanced digitalization of scholarly publication enabled by an RDF semantification as a knowledge graph fostering the evolution of the scholarly publications beyond written text.

Generate FAIR Literature Surveys with Scholarly Knowledge Graphs

2020, Oelen, Allard, Jaradeh, Mohamad Yaser, Stocker, Markus, Auer, Sören

Reviewing scientific literature is a cumbersome, time consuming but crucial activity in research. Leveraging a scholarly knowledge graph, we present a methodology and a system for comparing scholarly literature, in particular research contributions describing the addressed problem, utilized materials, employed methods and yielded results. The system can be used by researchers to quickly get familiar with existing work in a specific research domain (e.g., a concrete research question or hypothesis). Additionally, it can be used to publish literature surveys following the FAIR Data Principles. The methodology to create a research contribution comparison consists of multiple tasks, specifically: (a) finding similar contributions, (b) aligning contribution descriptions, (c) visualizing and finally (d) publishing the comparison. The methodology is implemented within the Open Research Knowledge Graph (ORKG), a scholarly infrastructure that enables researchers to collaboratively describe, find and compare research contributions. We evaluate the implementation using data extracted from published review articles. The evaluation also addresses the FAIRness of comparisons published with the ORKG.
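
Steps (a) and (b) of the comparison methodology can be illustrated with a minimal Python sketch. It is not the ORKG implementation: the dictionary layout of a contribution, the function names, and the use of Jaccard similarity over property names are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of property names."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_similar(target, candidates, k=2):
    """Step (a): rank candidate contributions by shared properties (assumed measure)."""
    ranked = sorted(
        candidates,
        key=lambda c: jaccard(target["props"], c["props"]),
        reverse=True,
    )
    return ranked[:k]


def align(contributions):
    """Step (b): build a property-by-contribution table; missing values become ''."""
    properties = sorted({p for c in contributions for p in c["props"]})
    return {p: [c["props"].get(p, "") for c in contributions] for p in properties}
```

Each contribution here is a hypothetical dictionary such as `{"id": "c1", "props": {"problem": "...", "method": "..."}}`; the table returned by `align` has one row per property and one column per contribution, which is the shape a visualization step would consume.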

Compacting frequent star patterns in RDF graphs

2020, Karim, Farah, Vidal, Maria-Esther, Auer, Sören

Knowledge graphs have become a popular formalism for representing entities and their properties using a graph data model, e.g., the Resource Description Framework (RDF). An RDF graph comprises entities of the same type connected to objects or other entities using labeled edges annotated with properties. RDF graphs usually contain entities that share the same objects in a certain group of properties, i.e., they match star patterns composed of these properties and objects. In case the number of these entities or properties in these star patterns is large, the size of the RDF graph and query processing are negatively impacted; we refer to these star patterns as frequent star patterns. We address the problem of identifying frequent star patterns in RDF graphs and devise the concept of factorized RDF graphs, which denote compact representations of RDF graphs where the number of frequent star patterns is minimized. We also develop computational methods to identify frequent star patterns and generate a factorized RDF graph, where compact RDF molecules replace frequent star patterns. A compact RDF molecule of a frequent star pattern denotes an RDF subgraph that instantiates the corresponding star pattern. Instead of having all the entities matching the original frequent star pattern, a surrogate entity is added and related to the properties of the frequent star pattern; it is linked to the entities that originally match the frequent star pattern. Since the edges between the entities and the objects in the frequent star pattern are replaced by edges between these entities and the surrogate entity of the compact RDF molecule, the size of the RDF graph is reduced. We evaluate the performance of our factorization techniques on several RDF graph benchmarks and compare with a baseline built on top of gSpan, a state-of-the-art algorithm to detect frequent patterns. The outcomes evidence the efficiency of the proposed approach and show that our techniques are able to reduce the execution time of the baseline approach by at least three orders of magnitude. Additionally, RDF graph size can be reduced by up to 66.56% while the data represented in the original RDF graph is preserved.
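
The surrogate-entity idea in this abstract can be sketched in a few lines of Python. This is a simplified illustration, not the authors' algorithm: it treats an entity's full set of (property, object) pairs as its star pattern, the linking predicate `ex:instanceOf` and the `_:m0` surrogate naming are hypothetical, and a pattern is factorized only when doing so lowers the triple count.

```python
from collections import defaultdict

LINK = "ex:instanceOf"  # hypothetical predicate linking an entity to its surrogate


def factorize(triples):
    """Replace star patterns shared by several entities with one surrogate entity."""
    stars = defaultdict(set)
    for s, p, o in triples:
        stars[s].add((p, o))

    # Group entities whose (property, object) pairs are identical.
    by_star = defaultdict(list)
    for s, star in stars.items():
        by_star[frozenset(star)].append(s)

    out = []
    for i, (star, subjects) in enumerate(by_star.items()):
        n, k = len(subjects), len(star)
        if n * k > n + k:  # factorizing saves triples
            surrogate = f"_:m{i}"
            out += [(surrogate, p, o) for p, o in sorted(star)]
            out += [(s, LINK, surrogate) for s in subjects]
        else:  # keep the original triples unchanged
            out += [(s, p, o) for s in subjects for p, o in sorted(star)]
    return out


def expand(triples):
    """Inverse of factorize: recover the original triples as a set."""
    surrogates = {o for _, p, o in triples if p == LINK}
    props = defaultdict(set)
    for s, p, o in triples:
        if s in surrogates:
            props[s].add((p, o))
    out = set()
    for s, p, o in triples:
        if p == LINK:
            out.update((s, pp, oo) for pp, oo in props[o])
        elif s not in surrogates:
            out.add((s, p, o))
    return out
```

With n entities sharing k property-object pairs, the original graph holds n·k triples for the pattern while the factorized one holds n + k, which is where the size reduction comes from; `expand` shows that the original data remains recoverable.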

Accessibility and Personalization in OpenCourseWare: An Inclusive Development Approach

2020, Elias, Mirette, Ruckhaus, Edna, Draffan, E.A., James, Abi, Suárez-Figueroa, Mari Carmen, Lohmann, Steffen, Khiat, Abderrahmane, Auer, Sören, Chang, Maiga, Sampson, Demetrios G., Huang, Ronghuai, Hooshyar, Danial, Chen, Nian-Shing, Kinshuk, Pedaste, Margus

OpenCourseWare (OCW) has become a desirable source for sharing free educational resources which means there will always be users with differing needs. It is therefore the responsibility of OCW platform developers to consider accessibility as one of their prioritized requirements to ensure ease of use for all, including those with disabilities. However, the main challenge when creating an accessible platform is the ability to address all the different types of barriers that might affect those with a wide range of physical, sensory and cognitive impairments. This article discusses accessibility and personalization strategies and their realisation in the SlideWiki platform, in order to facilitate the development of accessible OCW. Previously, accessibility was seen as a complementary feature that can be tackled in the implementation phase. However, a meaningful integration of accessibility features requires thoughtful consideration during all project phases with active involvement of related stakeholders. The evaluation results and lessons learned from the SlideWiki development process have the potential to assist in the development of other systems that aim for an inclusive approach. © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Clustering Semantic Predicates in the Open Research Knowledge Graph

2022, Arab Oghli, Omar, D’Souza, Jennifer, Auer, Sören

When semantically describing knowledge graphs (KGs), users have to make a critical choice of a vocabulary (i.e. predicates and resources). The success of KG building is determined by the convergence of shared vocabularies so that meaning can be established. The typical lifecycle for a new KG construction can be defined as follows: nascent phases of graph construction experience terminology divergence, while later phases of graph construction experience terminology convergence and reuse. In this paper, we describe our approach tailoring two AI-based clustering algorithms for recommending predicates (in RDF statements) about resources in the Open Research Knowledge Graph (ORKG, https://orkg.org/). Such a service to recommend existing predicates to semantify new incoming data of scholarly publications is of paramount importance for fostering terminology convergence in the ORKG. Our experiments show very promising results: a high precision with relatively high recall in linear runtime performance. Furthermore, this work offers novel insights into the predicate groups that automatically accrue loosely as generic semantification patterns for semantification of scholarly knowledge spanning 44 research fields.

An OER Recommender System Supporting Accessibility Requirements

2020, Elias, Mirette, Tavakoli, Mohammadreza, Lohmann, Steffen, Kismihok, Gabor, Auer, Sören, Guerreiro, Tiago, Nicolau, Hugo, Moffatt, Karyn

Open Educational Resources are becoming a significant source of learning that are widely used for various educational purposes and levels. Learners have diverse backgrounds and needs, especially when it comes to learners with accessibility requirements. Persons with disabilities have significantly lower employment rates partly due to the lack of access to education and vocational rehabilitation and training. It is therefore not surprising that providing high quality OERs that facilitate the self-development towards specific jobs and skills on the labor market in the light of special preferences of learners with disabilities is difficult. In this paper, we introduce a personalized OER recommender system that considers skills, occupations, and accessibility properties of learners to retrieve the most adequate and high-quality OERs. This is done by: 1) describing the profile of learners with disabilities, 2) collecting and analysing more than 1,500 OERs, 3) filtering OERs based on their accessibility features and predicted quality, and 4) providing personalised OER recommendations for learners according to their accessibility needs. As a result, the OERs retrieved by our method proved to satisfy more accessibility checks than other OERs. Moreover, we evaluated our results with five experts in educating people with visual and cognitive impairments. The evaluation showed that our recommendations are potentially helpful for learners with accessibility needs.

Quality Prediction of Open Educational Resources: A Metadata-based Approach

2020, Tavakoli, Mohammadreza, Elias, Mirette, Kismihók, Gábor, Auer, Sören, Chang, Maiga, Sampson, Demetrios G., Huang, Ronghuai, Hooshyar, Danial, Chen, Nian-Shing, Kinshuk, Pedaste, Margus

In the recent decade, online learning environments have accumulated millions of Open Educational Resources (OERs). However, for learners, finding relevant and high quality OERs is a complicated and time-consuming activity. Furthermore, metadata play a key role in offering high quality services such as recommendation and search. Metadata can also be used for automatic OER quality control as, in the light of the continuously increasing number of OERs, manual quality control is getting more and more difficult. In this work, we collected the metadata of 8,887 OERs to perform an exploratory data analysis to observe the effect of quality control on metadata quality. Subsequently, we propose an OER metadata scoring model, and build a metadata-based prediction model to anticipate the quality of OERs. Based on our data and model, we were able to detect high-quality OERs with an F1 score of 94.6%. © 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

An Approach to Evaluate User Interfaces in a Scholarly Knowledge Communication Domain

2023, Obrezkov, Denis, Oelen, Allard, Auer, Sören, Abdelnour-Nocera, José L., Lárusdóttir, Marta, Petrie, Helen, Piccinno, Antonio, Winckler, Marco

The number of research articles produced every day is overwhelming: scholarly knowledge is getting harder to communicate and easier to get lost. A possible solution is to represent the information in knowledge graphs: structures representing knowledge in networks of entities, their semantic types, and relationships between them. But this solution has its own drawback: given its very specific task, it requires new methods for designing and evaluating user interfaces. In this paper, we propose an approach for user interface evaluation in the knowledge communication domain. We base our methodology on the well-established Cognitive Walkthrough approach but employ a different set of questions, tailoring the method towards domain-specific needs. We demonstrate our approach on a scholarly knowledge graph implementation called Open Research Knowledge Graph (ORKG).