Search Results

Now showing 1 - 10 of 32
  • Item
    Ontology Design for Pharmaceutical Research Outcomes
    (Cham : Springer, 2020) Say, Zeynep; Fathalla, Said; Vahdati, Sahar; Lehmann, Jens; Auer, Sören; Hall, Mark; Merčun, Tanja; Risse, Thomas; Duchateau, Fabien
    The network of scholarly publishing involves generating and exchanging ideas, certifying research, publishing in order to disseminate findings, and preserving outputs. Despite enormous efforts in providing support for each of those steps in scholarly communication, identifying knowledge fragments is still a big challenge. This is due to the heterogeneous nature of the scholarly data and the current paradigm of distribution by publishing (mostly document-based) over journal articles, numerous repositories, and libraries. Therefore, transforming this paradigm to knowledge-based representation is expected to reform the knowledge sharing in the scholarly world. Although many movements have been initiated in recent years, non-technical scientific communities suffer from transforming document-based publishing to knowledge-based publishing. In this paper, we present a model (PharmSci) for scholarly publishing in the pharmaceutical research domain with the goal of facilitating knowledge discovery through effective ontology-based data integration. PharmSci provides machine-interpretable information to the knowledge discovery process. The principles and guidelines of the ontological engineering have been followed. Reasoning-based techniques are also presented in the design of the ontology to improve the quality of targeted tasks for data integration. The developed ontology is evaluated with a validation process and also a quality verification method.
  • Item
    Creation of a Knowledge Space by Semantically Linking Data Repository and Knowledge Management System - a Use Case from Production Engineering
    (Laxenburg : IFAC, 2022) Sheveleva, Tatyana; Wawer, Max Leo; Oladazimi, Pooya; Koepler, Oliver; Nürnberger, Florian; Lachmayer, Roland; Auer, Sören; Mozgova, Iryna
    The seamless documentation of research data flows from generation, processing, analysis, publication, and reuse is of utmost importance when dealing with large amounts of data. Semantic linking of process documentation and gathered data creates a knowledge space enabling the discovery of relations between steps of process chains. This paper shows the design of two systems for data deposit and for process documentation using semantic annotations and linking on a use case of a process chain step of the Tailored Forming Technology.
  • Item
    Semantic Representation of Physics Research Data
    (Setúbal, Portugal : Science and Technology Publications, Lda, 2020) Say, Aysegul; Fathalla, Said; Vahdati, Sahar; Lehmann, Jens; Auer, Sören; Aveiro, David; Dietz, Jan; Filipe, Joaquim
    Improvements in web technologies and artificial intelligence enable novel, more data-driven research practices for scientists. However, scientific knowledge generated from data-intensive research practices is disseminated with unstructured formats, thus hindering the scholarly communication in various respects. The traditional document-based representation of scholarly information hampers the reusability of research contributions. To address this concern, we developed the Physics Ontology (PhySci) to represent physics-related scholarly data in a machine-interpretable format. PhySci facilitates knowledge exploration, comparison, and organization of such data by representing it as knowledge graphs. It establishes a unique conceptualization to increase the visibility and accessibility to the digital content of physics publications. We present the iterative design principles by outlining a methodology for its development and applying three different evaluation approaches: data-driven and criteria-based evaluation, as well as ontology testing.
  • Item
    Easy Semantification of Bioassays
    (Heidelberg : Springer, 2022) Anteghini, Marco; D’Souza, Jennifer; dos Santos, Vitor A. P. Martins; Auer, Sören
    Biological data and knowledge bases increasingly rely on Semantic Web technologies and the use of knowledge graphs for data integration, retrieval and federated queries. We propose a solution for automatically semantifying biological assays. Our solution contrasts the problem of automated semantification as labeling versus clustering where the two methods are on opposite ends of the method complexity spectrum. Characteristically modeling our problem, we find the clustering solution significantly outperforms a deep neural network state-of-the-art labeling approach. This novel contribution is based on two factors: 1) a learning objective closely modeled after the data outperforms an alternative approach with sophisticated semantic modeling; 2) automatically semantifying biological assays achieves a high performance F1 of nearly 83%, which to our knowledge is the first reported standardized evaluation of the task offering a strong benchmark model.
  • Item
    An OER Recommender System Supporting Accessibility Requirements
    (New York : Association for Computing Machinery, 2020) Elias, Mirette; Tavakoli, Mohammadreza; Lohmann, Steffen; Kismihok, Gabor; Auer, Sören; Gurreiro, Tiago; Nicolau, Hugo; Moffatt, Karyn
    Open Educational Resources are becoming a significant source of learning that are widely used for various educational purposes and levels. Learners have diverse backgrounds and needs, especially when it comes to learners with accessibility requirements. Persons with disabilities have significantly lower employment rates partly due to the lack of access to education and vocational rehabilitation and training. It is not surprising therefore, that providing high quality OERs that facilitate the self-development towards specific jobs and skills on the labor market in the light of special preferences of learners with disabilities is difficult. In this paper, we introduce a personalized OER recommeder system that considers skills, occupations, and accessibility properties of learners to retrieve the most adequate and high-quality OERs. This is done by: 1) describing the profile of learners with disabilities, 2) collecting and analysing more than 1,500 OERs, 3) filtering OERs based on their accessibility features and predicted quality, and 4) providing personalised OER recommendations for learners according to their accessibility needs. As a result, the OERs retrieved by our method proved to satisfy more accessibility checks than other OERs. Moreover, we evaluated our results with five experts in educating people with visual and cognitive impairments. The evaluation showed that our recommendations are potentially helpful for learners with accessibility needs.
  • Item
    Formalizing Gremlin pattern matching traversals in an integrated graph Algebra
    (Aachen, Germany : RWTH Aachen, 2019) Thakkar, Harsh; Auer, Sören; Vidal, Maria-Esther; Samavi, Reza; Consens, Mariano P.; Khatchadourian, Shahan; Nguyen, Vinh; Sheth, Amit; Giménez-García, José M.; Thakkar, Harsh
    Graph data management (also called NoSQL) has revealed beneficial characteristics in terms of flexibility and scalability by differ-ently balancing between query expressivity and schema flexibility. This peculiar advantage has resulted into an unforeseen race of developing new task-specific graph systems, query languages and data models, such as property graphs, key-value, wide column, resource description framework (RDF), etc. Present-day graph query languages are focused towards flex-ible graph pattern matching (aka sub-graph matching), whereas graph computing frameworks aim towards providing fast parallel (distributed) execution of instructions. The consequence of this rapid growth in the variety of graph-based data management systems has resulted in a lack of standardization. Gremlin, a graph traversal language, and machine provide a common platform for supporting any graph computing sys-tem (such as an OLTP graph database or OLAP graph processors). In this extended report, we present a formalization of graph pattern match-ing for Gremlin queries. We also study, discuss and consolidate various existing graph algebra operators into an integrated graph algebra.
  • Item
    Accessibility and Personalization in OpenCourseWare : An Inclusive Development Approach
    (Piscataway, NJ : IEEE, 2020) Elias, Mirette; Ruckhaus, Edna; Draffan, E.A.; James, Abi; Suárez-Figueroa, Mari Carmen; Lohmann, Steffen; Khiat, Abderrahmane; Auer, Sören; Chang, Maiga; Sampson, Demetrios G.; Huang, Ronghuai; Hooshyar, Danial; Chen, Nian-Shing; Kinshuk; Pedaste, Margus
    OpenCourseWare (OCW) has become a desirable source for sharing free educational resources which means there will always be users with differing needs. It is therefore the responsibility of OCW platform developers to consider accessibility as one of their prioritized requirements to ensure ease of use for all, including those with disabilities. However, the main challenge when creating an accessible platform is the ability to address all the different types of barriers that might affect those with a wide range of physical, sensory and cognitive impairments. This article discusses accessibility and personalization strategies and their realisation in the SlideWiki platform, in order to facilitate the development of accessible OCW. Previously, accessibility was seen as a complementary feature that can be tackled in the implementation phase. However, a meaningful integration of accessibility features requires thoughtful consideration during all project phases with active involvement of related stakeholders. The evaluation results and lessons learned from the SlideWiki development process have the potential to assist in the development of other systems that aim for an inclusive approach. © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
  • Item
    Creating a Scholarly Knowledge Graph from Survey Article Tables
    (Cham : Springer, 2020) Oelen, Allard; Stocker, Markus; Auer, Sören; Ishita, Emi; Pang, Natalie Lee San; Zhou, Lihong
    Due to the lack of structure, scholarly knowledge remains hardly accessible for machines. Scholarly knowledge graphs have been proposed as a solution. Creating such a knowledge graph requires manual effort and domain experts, and is therefore time-consuming and cumbersome. In this work, we present a human-in-the-loop methodology used to build a scholarly knowledge graph leveraging literature survey articles. Survey articles often contain manually curated and high-quality tabular information that summarizes findings published in the scientific literature. Consequently, survey articles are an excellent resource for generating a scholarly knowledge graph. The presented methodology consists of five steps, in which tables and references are extracted from PDF articles, tables are formatted and finally ingested into the knowledge graph. To evaluate the methodology, 92 survey articles, containing 160 survey tables, have been imported in the graph. In total, 2626 papers have been added to the knowledge graph using the presented methodology. The results demonstrate the feasibility of our approach, but also indicate that manual effort is required and thus underscore the important role of human experts.
  • Item
    Why reinvent the wheel: Let's build question answering systems together
    (New York City : Association for Computing Machinery, 2018) Singh, K.; Radhakrishna, A.S.; Both, A.; Shekarpour, S.; Lytra, I.; Usbeck, R.; Vyas, A.; Khikmatullaev, A.; Punjani, D.; Lange, C.; Vidal, Maria-Esther; Lehmann, J.; Auer, Sören
    Modern question answering (QA) systems need to flexibly integrate a number of components specialised to fulfil specific tasks in a QA pipeline. Key QA tasks include Named Entity Recognition and Disambiguation, Relation Extraction, and Query Building. Since a number of different software components exist that implement different strategies for each of these tasks, it is a major challenge to select and combine the most suitable components into a QA system, given the characteristics of a question. We study this optimisation problem and train classifiers, which take features of a question as input and have the goal of optimising the selection of QA components based on those features. We then devise a greedy algorithm to identify the pipelines that include the suitable components and can effectively answer the given question. We implement this model within Frankenstein, a QA framework able to select QA components and compose QA pipelines. We evaluate the effectiveness of the pipelines generated by Frankenstein using the QALD and LC-QuAD benchmarks. These results not only suggest that Frankenstein precisely solves the QA optimisation problem but also enables the automatic composition of optimised QA pipelines, which outperform the static Baseline QA pipeline. Thanks to this flexible and fully automated pipeline generation process, new QA components can be easily included in Frankenstein, thus improving the performance of the generated pipelines.
  • Item
    Toward Representing Research Contributions in Scholarly Knowledge Graphs Using Knowledge Graph Cells
    (New York City, NY : Association for Computing Machinery, 2020) Vogt, Lars; D'Souza, Jennifer; Stocker, Markus; Auer, Sören
    There is currently a gap between the natural language expression of scholarly publications and their structured semantic content modeling to enable intelligent content search. With the volume of research growing exponentially every year, a search feature operating over semantically structured content is compelling. Toward this end, in this work, we propose a novel semantic data model for modeling the contribution of scientific investigations. Our model, i.e. the Research Contribution Model (RCM), includes a schema of pertinent concepts highlighting six core information units, viz. Objective, Method, Activity, Agent, Material, and Result, on which the contribution hinges. It comprises bottom-up design considerations made from three scientific domains, viz. Medicine, Computer Science, and Agriculture, which we highlight as case studies. For its implementation in a knowledge graph application we introduce the idea of building blocks called Knowledge Graph Cells (KGC), which provide the following characteristics: (1) they limit the expressibility of ontologies to what is relevant in a knowledge graph regarding specific concepts on the theme of research contributions; (2) they are expressible via ABox and TBox expressions; (3) they enforce a certain level of data consistency by ensuring that a uniform modeling scheme is followed through rules and input controls; (4) they organize the knowledge graph into named graphs; (5) they provide information for the front end for displaying the knowledge graph in a human-readable form such as HTML pages; and (6) they can be seamlessly integrated into any existing publishing process thatsupports form-based input abstracting its semantic technicalities including RDF semantification from the user. Thus RCM joins the trend of existing work toward enhanced digitalization of scholarly publication enabled by an RDF semantification as a knowledge graph fostering the evolution of the scholarly publications beyond written text.