Search Results

Now showing 1 - 10 of 20
  • Item
    A Fair and Comprehensive Comparison of Multimodal Tweet Sentiment Analysis Methods
    (Ithaka : Cornell University, 2021) Cheema, Gullal S.; Hakimov, Sherzod; Müller-Budack, Eric; Ewerth, Ralph
    Opinion and sentiment analysis is a vital task to characterize subjective information in social media posts. In this paper, we present a comprehensive experimental evaluation and comparison with six state-of-the-art methods, from which we have re-implemented one of them. In addition, we investigate different textual and visual feature embeddings that cover different aspects of the content, as well as the recently introduced multimodal CLIP embeddings. Experimental results are presented for two different publicly available benchmark datasets of tweets and corresponding images. In contrast to the evaluation methodology of previous work, we introduce a reproducible and fair evaluation scheme to make results comparable. Finally, we conduct an error analysis to outline the limitations of the methods and possibilities for the future work.
  • Item
    Easy Semantification of Bioassays
    (Heidelberg : Springer, 2022) Anteghini, Marco; D’Souza, Jennifer; dos Santos, Vitor A. P. Martins; Auer, Sören
    Biological data and knowledge bases increasingly rely on Semantic Web technologies and the use of knowledge graphs for data integration, retrieval and federated queries. We propose a solution for automatically semantifying biological assays. Our solution contrasts the problem of automated semantification as labeling versus clustering where the two methods are on opposite ends of the method complexity spectrum. Characteristically modeling our problem, we find the clustering solution significantly outperforms a deep neural network state-of-the-art labeling approach. This novel contribution is based on two factors: 1) a learning objective closely modeled after the data outperforms an alternative approach with sophisticated semantic modeling; 2) automatically semantifying biological assays achieves a high performance F1 of nearly 83%, which to our knowledge is the first reported standardized evaluation of the task offering a strong benchmark model.
  • Item
    Roadmap to FAIR Research Information in Open Infrastructures
    (Abingdon : Routledge, 2021) Hauschke, Christian; Nazarovets, Serhii; Altemeier, Franziska; Kaliuzhna, Nataliia
    The FAIR Principles were designed to improve the findability, accessibility, interoperability and reusability of data holdings by humans and machines. The principles can be applied to research information too. We present the results of the discussions that took place during the series of online workshops with experts on Research Information and FAIR Guiding Principles. We provide high-level criteria on how to foster findable, accessible, interoperable and reusable, and we hope that our roadmap for FAIR research information in open infrastructures bring many benefits to a diverse group of stakeholders of the scientific ecosystem.
  • Item
    Goportis – hoch integrierte Kooperation der Zentralen Fachbibliotheken
    (Bingley : Emerald, 2008) Burblies, Christine; Pianos, Tamara
    Unter dem Namen „Goportis“ haben die drei Deutschen Zentralen Fachbibliotheken (ZFB) im Herbst 2006 begonnen, eine hoch integrierte Kooperation auf der Ebene gemeinsamer Produkte und Dienstleistungen und interner Geschäftsgänge aufzubauen. Die Zentralen Fachbibliotheken sind die Technische Informationsbibliothek in Hannover, die Deutsche Zentralbibliothek für Medizin in Köln/Bonn und die Deutsche Zentralbibliothek für Wirtschaftswissenschaften in Kiel/Hamburg.
  • Item
    Compacting frequent star patterns in RDF graphs
    (Dordrecht : Springer Science + Business Media B.V, 2020) Karim, Farah; Vidal, Maria-Esther; Auer, Sören
    Knowledge graphs have become a popular formalism for representing entities and their properties using a graph data model, e.g., the Resource Description Framework (RDF). An RDF graph comprises entities of the same type connected to objects or other entities using labeled edges annotated with properties. RDF graphs usually contain entities that share the same objects in a certain group of properties, i.e., they match star patterns composed of these properties and objects. In case the number of these entities or properties in these star patterns is large, the size of the RDF graph and query processing are negatively impacted; we refer these star patterns as frequent star patterns. We address the problem of identifying frequent star patterns in RDF graphs and devise the concept of factorized RDF graphs, which denote compact representations of RDF graphs where the number of frequent star patterns is minimized. We also develop computational methods to identify frequent star patterns and generate a factorized RDF graph, where compact RDF molecules replace frequent star patterns. A compact RDF molecule of a frequent star pattern denotes an RDF subgraph that instantiates the corresponding star pattern. Instead of having all the entities matching the original frequent star pattern, a surrogate entity is added and related to the properties of the frequent star pattern; it is linked to the entities that originally match the frequent star pattern. Since the edges between the entities and the objects in the frequent star pattern are replaced by edges between these entities and the surrogate entity of the compact RDF molecule, the size of the RDF graph is reduced. We evaluate the performance of our factorization techniques on several RDF graph benchmarks and compare with a baseline built on top gSpan, a state-of-the-art algorithm to detect frequent patterns. The outcomes evidence the efficiency of proposed approach and show that our techniques are able to reduce execution time of the baseline approach in at least three orders of magnitude. Additionally, RDF graph size can be reduced by up to 66.56% while data represented in the original RDF graph is preserved.
  • Item
    An OER Recommender System Supporting Accessibility Requirements
    (New York : Association for Computing Machinery, 2020) Elias, Mirette; Tavakoli, Mohammadreza; Lohmann, Steffen; Kismihok, Gabor; Auer, Sören; Gurreiro, Tiago; Nicolau, Hugo; Moffatt, Karyn
    Open Educational Resources are becoming a significant source of learning that are widely used for various educational purposes and levels. Learners have diverse backgrounds and needs, especially when it comes to learners with accessibility requirements. Persons with disabilities have significantly lower employment rates partly due to the lack of access to education and vocational rehabilitation and training. It is not surprising therefore, that providing high quality OERs that facilitate the self-development towards specific jobs and skills on the labor market in the light of special preferences of learners with disabilities is difficult. In this paper, we introduce a personalized OER recommeder system that considers skills, occupations, and accessibility properties of learners to retrieve the most adequate and high-quality OERs. This is done by: 1) describing the profile of learners with disabilities, 2) collecting and analysing more than 1,500 OERs, 3) filtering OERs based on their accessibility features and predicted quality, and 4) providing personalised OER recommendations for learners according to their accessibility needs. As a result, the OERs retrieved by our method proved to satisfy more accessibility checks than other OERs. Moreover, we evaluated our results with five experts in educating people with visual and cognitive impairments. The evaluation showed that our recommendations are potentially helpful for learners with accessibility needs.
  • Item
    TinyGenius: Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation
    (New York,NY,United States : Association for Computing Machinery, 2022) Oelen, Allard; Stocker, Markus; Auer, Sören; Aizawa, Akiko
    As the number of published scholarly articles grows steadily each year, new methods are needed to organize scholarly knowledge so that it can be more efficiently discovered and used. Natural Language Processing (NLP) techniques are able to autonomously process scholarly articles at scale and to create machine readable representations of the article content. However, autonomous NLP methods are by far not sufficiently accurate to create a high-quality knowledge graph. Yet quality is crucial for the graph to be useful in practice. We present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. The scholarly context in which the crowd workers operate has multiple challenges. The explainability of the employed NLP methods is crucial to provide context in order to support the decision process of crowd workers. We employed TinyGenius to populate a paper-centric knowledge graph, using five distinct NLP methods. In the end, the resulting knowledge graph serves as a digital library for scholarly articles.
  • Item
    Accessibility and Personalization in OpenCourseWare : An Inclusive Development Approach
    (Piscataway, NJ : IEEE, 2020) Elias, Mirette; Ruckhaus, Edna; Draffan, E.A.; James, Abi; Suárez-Figueroa, Mari Carmen; Lohmann, Steffen; Khiat, Abderrahmane; Auer, Sören; Chang, Maiga; Sampson, Demetrios G.; Huang, Ronghuai; Hooshyar, Danial; Chen, Nian-Shing; Kinshuk; Pedaste, Margus
    OpenCourseWare (OCW) has become a desirable source for sharing free educational resources which means there will always be users with differing needs. It is therefore the responsibility of OCW platform developers to consider accessibility as one of their prioritized requirements to ensure ease of use for all, including those with disabilities. However, the main challenge when creating an accessible platform is the ability to address all the different types of barriers that might affect those with a wide range of physical, sensory and cognitive impairments. This article discusses accessibility and personalization strategies and their realisation in the SlideWiki platform, in order to facilitate the development of accessible OCW. Previously, accessibility was seen as a complementary feature that can be tackled in the implementation phase. However, a meaningful integration of accessibility features requires thoughtful consideration during all project phases with active involvement of related stakeholders. The evaluation results and lessons learned from the SlideWiki development process have the potential to assist in the development of other systems that aim for an inclusive approach. © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
  • Item
    Quality Prediction of Open Educational Resources A Metadata-based Approach
    (Piscataway, NJ : IEEE, 2020) Tavakoli, Mohammadreza; Elias, Mirette; Kismihók, Gábor; Auer, Sören; Chang, Maiga; Sampson, Demetrios G.; Huang, Ronghuai; Hooshyar, Danial; Chen, Nian-Shing; Kinshuk; Pedaste, Margus
    In the recent decade, online learning environments have accumulated millions of Open Educational Resources (OERs). However, for learners, finding relevant and high quality OERs is a complicated and time-consuming activity. Furthermore, metadata play a key role in offering high quality services such as recommendation and search. Metadata can also be used for automatic OER quality control as, in the light of the continuously increasing number of OERs, manual quality control is getting more and more difficult. In this work, we collected the metadata of 8,887 OERs to perform an exploratory data analysis to observe the effect of quality control on metadata quality. Subsequently, we propose an OER metadata scoring model, and build a metadata-based prediction model to anticipate the quality of OERs. Based on our data and model, we were able to detect high-quality OERs with the F1 score of 94.6%. © 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
  • Item
    Toward Representing Research Contributions in Scholarly Knowledge Graphs Using Knowledge Graph Cells
    (New York City, NY : Association for Computing Machinery, 2020) Vogt, Lars; D'Souza, Jennifer; Stocker, Markus; Auer, Sören
    There is currently a gap between the natural language expression of scholarly publications and their structured semantic content modeling to enable intelligent content search. With the volume of research growing exponentially every year, a search feature operating over semantically structured content is compelling. Toward this end, in this work, we propose a novel semantic data model for modeling the contribution of scientific investigations. Our model, i.e. the Research Contribution Model (RCM), includes a schema of pertinent concepts highlighting six core information units, viz. Objective, Method, Activity, Agent, Material, and Result, on which the contribution hinges. It comprises bottom-up design considerations made from three scientific domains, viz. Medicine, Computer Science, and Agriculture, which we highlight as case studies. For its implementation in a knowledge graph application we introduce the idea of building blocks called Knowledge Graph Cells (KGC), which provide the following characteristics: (1) they limit the expressibility of ontologies to what is relevant in a knowledge graph regarding specific concepts on the theme of research contributions; (2) they are expressible via ABox and TBox expressions; (3) they enforce a certain level of data consistency by ensuring that a uniform modeling scheme is followed through rules and input controls; (4) they organize the knowledge graph into named graphs; (5) they provide information for the front end for displaying the knowledge graph in a human-readable form such as HTML pages; and (6) they can be seamlessly integrated into any existing publishing process thatsupports form-based input abstracting its semantic technicalities including RDF semantification from the user. Thus RCM joins the trend of existing work toward enhanced digitalization of scholarly publication enabled by an RDF semantification as a knowledge graph fostering the evolution of the scholarly publications beyond written text.