Search Results

  • Item
    DDB-KG: The German Bibliographic Heritage in a Knowledge Graph
    (Aachen, Germany : RWTH Aachen, 2021) Tan, Mary Ann; Tietz, Tabea; Bruns, Oleksandra; Oppenlaender, Jonas; Dessì, Danilo; Sack, Harald; Sumikawa, Yasunobu; Ikejiri, Ryohei; Doucet, Antoine; Pfanzelter, Eva; Hasanuzzaman, Mohammed; Dias, Gaël; Milligan, Ian; Jatowt, Adam
    Under the German government’s initiative “NEUSTART Kultur”, the German Digital Library, or Deutsche Digitale Bibliothek (DDB), is undergoing improvements to enhance the user experience. As an initial step, emphasis is placed on creating a knowledge graph from the bibliographic record collection of the DDB. This paper discusses the retrieval challenges facing the DDB and the solutions for addressing them. In particular, limitations of the current data model or ontology in representing bibliographic metadata are analyzed through concrete examples. This study presents the complete ontological mapping from the DDB-Europeana Data Model (DDB-EDM) to FaBiO, and a prototype of the DDB-KG made available as a SPARQL endpoint. The suitability of the target ontology is demonstrated with SPARQL queries formulated from competency questions.
  • Item
    Toward a Comparison Framework for Interactive Ontology Enrichment Methodologies
    (Aachen, Germany : RWTH Aachen, 2022) Vrolijk, Jarno; Reklos, Ioannis; Vafaie, Mahsa; Massari, Arcangelo; Mohammadi, Maryam; Rudolph, Sebastian; Fu, Bo; Lambrix, Patrick; Pesquita, Catia
    The growing demand for well-modeled ontologies in diverse application areas increases the need for intuitive interaction techniques that support human domain experts in ontology modeling and enrichment tasks, such that quality expectations are met. Beyond the correctness of the specified information, the quality of an ontology depends on its (relative) completeness, i.e., whether the ontology contains all the necessary information to draw expected inferences. On an abstract level, the Ontology Enrichment problem consists of identifying and filling the gap between information that can be logically inferred from the ontology and the information expected to be inferable by the user. To this end, numerous approaches have been described in the literature, providing methodologies from the fields of Formal Semantics and Automated Reasoning targeted at eliciting knowledge from human domain experts. These approaches vary greatly in many aspects and their applicability typically depends on the specifics of the concrete modeling scenario at hand. Toward a better understanding of the landscape of methodological possibilities, this position paper proposes a framework consisting of multiple performance dimensions along which existing and future approaches to interactive ontology enrichment can be characterized. We apply our categorization scheme to a selection of methodologies from the literature. In light of this comparison, we address the limitations of the methods and propose directions for future work.
  • Item
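    Forschungsdaten in den Naturwissenschaften: Eine urheberrechtliche Bestandsaufnahme mit ihren Implikationen für universitäres FDM [Research Data in the Natural Sciences: A Copyright Assessment and Its Implications for University RDM]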
    (Heidelberg : Universitätsbibliothek Heidelberg, 2022) Hartmann, Thomas; Heuveline, Vincent; Bisheh, Nina
    A key factor for the accessibility, reusability, and interoperability of research data is its copyright status. In the humanities and social sciences, research data (e.g., texts in the form of interviews or other literature) are in most cases protected by copyright. The situation is different in the natural sciences. Not always, but often, research data from these disciplines remain free of copyright. The article substantiates this with a typical example from the research data center for molecular materials research (SDC MoMaF). The legal recommendations for research data management (RDM) that follow from this are presented at the end of the article.
  • Item
    Towards a Representation of Temporal Data in Archival Records: Use Cases and Requirements
    (Aachen, Germany : RWTH Aachen, 2021) Bruns, Oleksandra; Tietz, Tabea; Vafaie, Mahsa; Dessì, Danilo; Sack, Harald; Lopes, Carla Teixeira; Ribeiro, Cristina; Niccolucci, Franco; Rodrigues, Irene; Freire, Nuno
    Archival records are essential sources of information for historians and digital humanists to understand history. For modern information systems, they are often analysed and integrated into Knowledge Graphs for better access, interoperability and re-use. However, due to restrictions in the representation of RDF predicates, temporal data within archival records is challenging to model. This position paper explains requirements for modeling temporal data in archival records, based on ongoing research projects in which archival records are analysed and integrated into Knowledge Graphs for research and exploration.
  • Item
    Modelling Archival Hierarchies in Practice: Key Aspects and Lessons Learned
    (Aachen, Germany : RWTH Aachen, 2021) Vafaie, Mahsa; Bruns, Oleksandra; Pilz, Nastasja; Dessì, Danilo; Sack, Harald; Sumikawa, Yasunobu; Ikejiri, Ryohei; Doucet, Antoine; Pfanzelter, Eva; Hasanuzzaman, Mohammed; Dias, Gaël; Milligan, Ian; Jatowt, Adam
    An increasing number of archival institutions aim to provide public access to historical documents. Ontologies have been designed, developed and utilised to model the archival description of historical documents and to enable interoperability between different information sources. However, due to the heterogeneous nature of archives and archival systems, current ontologies for the representation of archival content do not always cover all existing structural organisation forms equally well. After briefly contextualising the heterogeneity in the hierarchical structure of German archives, this paper describes and evaluates differences between two archival ontologies, ArDO and RiC-O, and their approaches to modelling hierarchy levels and archive dynamics.
  • Item
    Knowledge Graph enabled Curation and Exploration of Nuremberg's City Heritage
    (Aachen, Germany : RWTH Aachen, 2021) Tietz, Tabea; Bruns, Oleksandra; Göller, Sandra; Razum, Matthias; Dessì, Danilo; Sack, Harald; Paschke, Adrian; Rehm, Georg; Al Qundus, Jamal; Neudecker, Clemens; Pintscher, Lydia
    An important part of European cultural identity rests on European cities, and in particular on their histories and cultural heritage. Nuremberg, the home of important artists such as Albrecht Dürer and Hans Sachs, had already developed into the epitome of German and European culture during the Middle Ages. Throughout history, the city experienced a number of transformations, especially its almost complete destruction during World War II. This position paper presents TRANSRAZ, a project that aims to recreate Nuremberg by means of an interactive 3D tool for exploring the city's architecture and culture from the 17th to the 21st century. The goal of this position paper is to discuss the ongoing work of connecting heterogeneous historical data from various sources, previously hidden in archives, to the 3D model using knowledge graphs for a scientifically accurate interactive exploration on the Web.
  • Item
    About Migration Flows and Sentiment Analysis on Twitter Data: Building the Bridge Between Technical and Legal approaches to data protection
    (Paris : European Language Resources Association (ELRA), 2022) Gottschalk, Thilo; Pichierri, Francesca; Rigault, Mickaël; Arranz, Victoria; Siegert, Ingo
    Sentiment analysis has always been an important driver of political decisions and campaigns across all fields. Novel technologies allow sentiment analysis to be automated at large scale and hence provide allegedly more accurate outcomes. With user numbers in the billions and their increasingly important role in societal discussions, social media platforms have become an obvious data source for these types of analysis. Due to its public availability, relative ease of access and the sheer amount of available data, the Twitter API has become a particularly important source for researchers and data analysts alike. Despite the evident value of these data sources, the analysis of such data comes with legal, ethical and societal risks that should be taken into consideration when analysing data from Twitter. This paper describes these risks along the technical processing pipeline and proposes related mitigation measures.
  • Item
    On the Impact of Temporal Representations on Metaphor Detection
    (Paris : European Language Resources Association (ELRA), 2022) Ottolina, Giorgio; Palmonari, Matteo; Vimercati, Manuel; Alam, Mehwish; Calzolari, Nicoletta; Béchet, Frédéric; Blache, Philippe; Choukri, Khalid; Cieri, Christopher; Declerck, Thierry; Goggi, Sara; Isahara, Hitoshi; Maegaard, Bente; Mariani, Joseph; Mazo, Hélène; Odijk, Jan; Piperidis, Stelios
    State-of-the-art approaches for metaphor detection compare the literal (or core) meaning of an expression with its contextual meaning, using metaphor classifiers based on neural networks. However, metaphorical expressions evolve over time for various reasons, such as cultural and societal impact. Metaphorical expressions are known to co-evolve with language and literal word meanings, and even drive, to some extent, this evolution. This raises the question of whether different, possibly time-specific, representations of literal meanings may impact the metaphor detection task. To the best of our knowledge, this is the first study that examines the metaphor detection task with a detailed exploratory analysis where different temporal and static word embeddings are used to account for different representations of literal meanings. Our experimental analysis is based on three popular benchmarks used for metaphor detection and word embeddings extracted from different corpora and temporally aligned using different state-of-the-art approaches. The results suggest that the choice of static word embedding method does impact the metaphor detection task and that some temporal word embeddings slightly outperform static methods. However, the results also suggest that temporal word embeddings may provide representations of the core meaning of a metaphor that are too close to its contextual meaning, thus confusing the classifier. Overall, the interaction between temporal language evolution and metaphor detection appears small in the benchmark datasets used in our experiments. This suggests that future work on the computational analysis of this important linguistic phenomenon should start by creating a new dataset where this interaction is better represented.
  • Item
    Detecting Cross-Language Plagiarism using Open Knowledge Graphs
    (Aachen, Germany : RWTH Aachen, 2021) Stegmüller, Johannes; Bauer-Marquart, Fabian; Meuschke, Norman; Ruas, Terry; Schubotz, Moritz; Gipp, Bela; Zhang, Chengzhi; Mayr, Philipp; Lu, Wei; Zhang, Yi
    Identifying cross-language plagiarism is challenging, especially for distant language pairs and sense-for-sense translations. We introduce the new multilingual retrieval model Cross-Language Ontology-Based Similarity Analysis (CL-OSA) for this task. CL-OSA represents documents as entity vectors obtained from the open knowledge graph Wikidata. In contrast to other methods, CL-OSA requires neither computationally expensive machine translation nor pre-training using comparable or parallel corpora. It reliably disambiguates homonyms and scales to allow its application to Web-scale document collections. We show that CL-OSA outperforms state-of-the-art methods for retrieving candidate documents from five large, topically diverse test corpora that include distant language pairs like Japanese-English. For identifying cross-language plagiarism at the character level, CL-OSA primarily improves the detection of sense-for-sense translations. For these challenging cases, CL-OSA’s performance in terms of the well-established PlagDet score exceeds that of the best competitor by more than a factor of two. The code and data of our study are openly available.
  • Item
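    Lokal betrieben, remote gepflegt – Software für ein Datenrepositorium in Kooperation implementieren [Locally Operated, Remotely Maintained: Implementing Software for a Data Repository in Cooperation]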
    (Heidelberg : Universitätsbibliothek Heidelberg, 2022) Landwehr, Matthias; Schneider, Gabriel; Hofmann, Stefan; Razum, Matthias; Soltau, Kerstin; Heuveline, Vincent; Bisheh, Nina
    The University of Konstanz meets its need for an institutional research data repository with “RADAR Local”, a solution offered by FIZ Karlsruhe. As an alternative to in-house development, the data repository was implemented as a hybrid model, with the repository software and archiving running on local infrastructure. FIZ Karlsruhe provides the established repository software RADAR, maintains and operates it remotely, and customizes it to the customer's requirements. To manage the parallel installation and maintenance of the RADAR software on multiple local instances efficiently, FIZ Karlsruhe first increased the degree of automation of the affected processes in software development, system configuration, and deployment. This was achieved through container virtualization such as Docker and Docker Swarm, together with orchestration tools such as Ansible. This reduces the time and staffing required of the University of Konstanz, which in return receives a maintained, state-of-the-art repository. At the same time, this operating model requires close engagement with the provider's business model and technical constraints, a precise cost calculation, and possibly compromises or cutbacks on individual wishes.