Search Results

  • Item
    DDB-KG: The German Bibliographic Heritage in a Knowledge Graph
    (Aachen, Germany : RWTH Aachen, 2021) Tan, Mary Ann; Tietz, Tabea; Bruns, Oleksandra; Oppenlaender, Jonas; Dessì, Danilo; Sack, Harald; Sumikawa, Yasunobu; Ikejiri, Ryohei; Doucet, Antoine; Pfanzelter, Eva; Hasanuzzaman, Mohammed; Dias, Gaël; Milligan, Ian; Jatowt, Adam
    Under the German government’s initiative “NEUSTART Kultur”, the German Digital Library, or Deutsche Digitale Bibliothek (DDB), is undergoing improvements to enhance the user experience. As an initial step, emphasis is placed on creating a knowledge graph from the bibliographic record collection of the DDB. This paper discusses the retrieval challenges facing the DDB and solutions for addressing them. In particular, limitations of the current data model or ontology in representing bibliographic metadata are analyzed through concrete examples. This study presents the complete ontological mapping from the DDB-Europeana Data Model (DDB-EDM) to FaBiO, and a prototype of the DDB-KG made available as a SPARQL endpoint. The suitability of the target ontology is demonstrated with SPARQL queries formulated from competency questions.
  • Item
    Causal Relationship over Knowledge Graphs
    (2022) Huang, Hao; Al Hasan, Mohammad; Xiong, Li
    Causality has been discussed for centuries, and the theory of causal inference over tabular data has been broadly studied and utilized in multiple disciplines. However, only a few works attempt to infer causality while exploiting the meaning of the data represented in a data structure like a knowledge graph. These works offer a glimpse of the possibilities of causal inference over knowledge graphs, but do not yet consider the metadata, e.g., cardinalities, class subsumption and overlap, and integrity constraints. We propose CareKG, a new formalism to express causal relationships among concepts, i.e., classes and relations, and to enable causal queries over knowledge graphs using the semantics of metadata. We empirically evaluate the expressiveness of CareKG on a synthetic knowledge graph with respect to cardinalities, class subsumption and overlap, and integrity constraints. Our initial results indicate that CareKG can represent and measure causal relations with some semantics that are not covered by state-of-the-art approaches.
  • Item
    Towards Analyzing the Bias of News Recommender Systems Using Sentiment and Stance Detection
    (New York, NY, United States : Association for Computing Machinery, 2022) Alam, Mehwish; Iana, Andreea; Grote, Alexander; Ludwig, Katharina; Müller, Philipp; Paulheim, Heiko; Laforest, Frédérique; Troncy, Raphael; Médini, Lionel; Herman, Ivan
    News recommender systems are used by online news providers to alleviate information overload and to provide personalized content to users. However, algorithmic news curation has been hypothesized to create filter bubbles and to intensify users' selective exposure, potentially increasing their vulnerability to polarized opinions and fake news. In this paper, we show how information on news items' stance and sentiment can be utilized to analyze and quantify the extent to which recommender systems suffer from biases. To that end, we have annotated a German news corpus on the topic of migration using stance detection and sentiment analysis. In an experimental evaluation with four different recommender systems, our results show a slight tendency of all four models to recommend articles with negative sentiments and stances against the topic of refugees and migration. Moreover, we observed a positive correlation between the sentiment and stance bias of the text-based recommenders and the preexisting user bias, which indicates that these systems amplify users' opinions and decrease the diversity of recommended news. The knowledge-aware model appears to be the least prone to such biases, at the cost of predictive accuracy.
  • Item
    Detecting Cross-Language Plagiarism using Open Knowledge Graphs
    (Aachen, Germany : RWTH Aachen, 2021) Stegmüller, Johannes; Bauer-Marquart, Fabian; Meuschke, Norman; Ruas, Terry; Schubotz, Moritz; Gipp, Bela; Zhang, Chengzhi; Mayr, Philipp; Lu, Wie; Zhang, Yi
    Identifying cross-language plagiarism is challenging, especially for distant language pairs and sense-for-sense translations. We introduce the new multilingual retrieval model Cross-Language Ontology-Based Similarity Analysis (CL-OSA) for this task. CL-OSA represents documents as entity vectors obtained from the open knowledge graph Wikidata. Unlike other methods, CL-OSA requires neither computationally expensive machine translation nor pre-training on comparable or parallel corpora. It reliably disambiguates homonyms and scales to allow its application to Web-scale document collections. We show that CL-OSA outperforms state-of-the-art methods for retrieving candidate documents from five large, topically diverse test corpora that include distant language pairs like Japanese-English. For identifying cross-language plagiarism at the character level, CL-OSA primarily improves the detection of sense-for-sense translations. For these challenging cases, CL-OSA’s performance in terms of the well-established PlagDet score exceeds that of the best competitor by more than a factor of two. The code and data of our study are openly available.
  • Item
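    Lokal betrieben, remote gepflegt – Software für ein Datenrepositorium in Kooperation implementieren [Operated locally, maintained remotely: implementing software for a data repository in cooperation]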
    (Heidelberg : Universitätsbibliothek Heidelberg, 2022) Landwehr, Matthias; Schneider, Gabriel; Hofmann, Stefan; Razum, Matthias; Soltau, Kerstin; Heuveline, Vincent; Bisheh, Nina
    The University of Konstanz meets its need for an institutional research data repository with “RADAR Local”, a solution offered by FIZ Karlsruhe. As an alternative to in-house development, the data repository was implemented as a hybrid model, with the repository software and archiving running on local infrastructure. FIZ Karlsruhe provides the established repository software RADAR, maintains and operates it remotely, and adapts it to customer requirements. To handle the parallel installation and maintenance of the RADAR software on multiple local instances efficiently, FIZ Karlsruhe first increased the degree of automation of the affected processes in software development, system configuration, and deployment. This was achieved through container virtualization such as Docker and Docker Swarm, together with orchestration tools such as Ansible. As a result, the time and staffing required of the University of Konstanz are reduced, and the university receives a well-maintained, state-of-the-art repository. At the same time, this operating model requires close engagement with the provider's business model and technical constraints, a precise cost calculation, and possibly compromises or concessions on individual requirements.
  • Item
    Rechtliche Fragen bei der Nutzung von Abbildungen aus Open-Access-Publikationen [Legal questions in the use of figures from open-access publications]
    (Heidelberg : Universitätsbibliothek Heidelberg, 2022) Sohmen, Lucia; Rack, Fabian; Heuveline, Vincent; Bisheh, Nina
    The increasing availability of research data opens up new opportunities for researchers to work with research data created by third parties. This article addresses the legal framework that applies when such reused research data are to be made publicly available. In particular, it covers image search engines and the publication of image corpora. It shows that there can be no one-hundred-percent legal certainty when making large, hard-to-survey sets of images publicly accessible. By weighing the relevant considerations and applying technical measures, however, such certainty can be approximated.
  • Item
    Crowdsourcing Scholarly Discourse Annotations
    (New York, NY : ACM, 2021) Oelen, Allard; Stocker, Markus; Auer, Sören
    The number of scholarly publications grows steadily every year, and it becomes harder to find, assess, and compare scholarly knowledge effectively. Scholarly knowledge graphs have the potential to address these challenges. However, creating such graphs remains a complex task. We propose a method to crowdsource structured scholarly knowledge from paper authors with a web-based user interface supported by artificial intelligence. The interface enables authors to select key sentences for annotation. It integrates multiple machine learning algorithms to assist authors during the annotation, including class recommendation and key sentence highlighting. We envision that the interface is integrated in paper submission processes for which we define three main task requirements: The task has to be . We evaluated the interface with a user study in which participants were assigned the task of annotating one of their own articles. With the resulting data, we determined whether the participants were able to perform the task successfully. Furthermore, we evaluated the interface’s usability and the participants’ attitudes towards the interface with a survey. The results suggest that sentence annotation is a feasible task for researchers and that they do not object to annotating their articles during the submission process.
  • Item
    Knowledge Graph enabled Curation and Exploration of Nuremberg's City Heritage
    (Aachen, Germany : RWTH Aachen, 2021) Tietz, Tabea; Bruns, Oleksandra; Göller, Sandra; Razum, Matthias; Dessì, Danilo; Sack, Harald; Paschke, Adrian; Rehm, Georg; Al Qundus, Jamal; Neudecker, Clemens; Pintscher, Lydia
    An important part of European cultural identity rests on European cities, and in particular on their histories and cultural heritage. Nuremberg, the home of important artists such as Albrecht Dürer and Hans Sachs, had already developed into the epitome of German and European culture during the Middle Ages. Throughout history, the city experienced a number of transformations, especially its almost complete destruction during World War II. This position paper presents TRANSRAZ, a project with the goal of recreating Nuremberg by means of an interactive 3D tool to explore the city's architecture and culture from the 17th to the 21st century. The goal of this position paper is to discuss the ongoing work of connecting heterogeneous historical data from various sources, previously hidden in archives, to the 3D model using knowledge graphs for a scientifically accurate interactive exploration on the Web.
  • Item
    On the Impact of Temporal Representations on Metaphor Detection
    (Paris : European Language Resources Association (ELRA), 2022) Ottolina, Giorgio; Palmonari, Matteo; Vimercati, Manuel; Alam, Mehwish; Calzolari, Nicoletta; Béchet, Frédéric; Blache, Philippe; Choukri, Khalid; Cieri, Christopher; Declerck, Thierry; Goggi, Sara; Isahara, Hitoshi; Maegaard, Bente; Mariani, Joseph; Mazo, Hélène; Odijk, Jan; Piperidis, Stelios
    State-of-the-art approaches for metaphor detection compare the literal (or core) meaning of expressions with their contextual meaning using metaphor classifiers based on neural networks. However, metaphorical expressions evolve over time for various reasons, such as cultural and societal impact. Metaphorical expressions are known to co-evolve with language and literal word meanings, and even drive, to some extent, this evolution. This raises the question of whether different, possibly time-specific, representations of literal meanings may impact the metaphor detection task. To the best of our knowledge, this is the first study that examines the metaphor detection task with a detailed exploratory analysis where different temporal and static word embeddings are used to account for different representations of literal meanings. Our experimental analysis is based on three popular benchmarks used for metaphor detection and word embeddings extracted from different corpora and temporally aligned using different state-of-the-art approaches. The results suggest that the usage of different static word embedding methods does impact the metaphor detection task and that some temporal word embeddings slightly outperform static methods. However, the results also suggest that temporal word embeddings may provide representations of the core meaning of the metaphor that are too close to its contextual meaning, thus confusing the classifier. Overall, the interaction between temporal language evolution and metaphor detection appears small in the benchmark datasets used in our experiments. This suggests that future work on the computational analysis of this important linguistic phenomenon should start by creating a new dataset where this interaction is better represented.
  • Item
    DDB-EDM to FaBiO: The Case of the German Digital Library
    (Aachen, Germany : RWTH Aachen, 2021) Tan, Mary Ann; Tietz, Tabea; Bruns, Oleksandra; Oppenlaender, Jonas; Dessì, Danilo; Sack, Harald; Seneviratne, Oshani; Pesquita, Catia; Sequeda, Juan; Etcheverry, Lorena
    Cultural heritage portals have the goal of providing users with seamless access to all their resources. This paper introduces initial efforts for a user-oriented restructuring of the German Digital Library (DDB). At present, cultural heritage objects (CHOs) in the DDB are modeled using an extended version of the Europeana Data Model (DDB-EDM), which negatively impacts usability and exploration. These challenges can be addressed by leveraging ontologies and building a knowledge graph from the DDB's voluminous collection. Towards this goal, an alignment of bibliographic metadata from DDB-EDM to the FRBR-Aligned Bibliographic Ontology (FaBiO) is presented.