Search Results

  • Item
    Detecting Cross-Language Plagiarism using Open Knowledge Graphs
    (Aachen, Germany : RWTH Aachen, 2021) Stegmüller, Johannes; Bauer-Marquart, Fabian; Meuschke, Norman; Ruas, Terry; Schubotz, Moritz; Gipp, Bela; Zhang, Chengzhi; Mayr, Philipp; Lu, Wie; Zhang, Yi
    Identifying cross-language plagiarism is challenging, especially for distant language pairs and sense-for-sense translations. We introduce the new multilingual retrieval model Cross-Language Ontology-Based Similarity Analysis (CL-OSA) for this task. CL-OSA represents documents as entity vectors obtained from the open knowledge graph Wikidata. In contrast to other methods, CL-OSA requires neither computationally expensive machine translation nor pre-training on comparable or parallel corpora. It reliably disambiguates homonyms and scales to allow its application to Web-scale document collections. We show that CL-OSA outperforms state-of-the-art methods for retrieving candidate documents from five large, topically diverse test corpora that include distant language pairs like Japanese-English. For identifying cross-language plagiarism at the character level, CL-OSA primarily improves the detection of sense-for-sense translations. For these challenging cases, CL-OSA’s performance in terms of the well-established PlagDet score exceeds that of the best competitor by more than a factor of two. The code and data of our study are openly available.
  • Item
    Causal Relationship over Knowledge Graphs
    (2022) Huang, Hao; Al Hasan, Mohammad; Xiong, Li
    Causality has been discussed for centuries, and the theory of causal inference over tabular data has been broadly studied and utilized in multiple disciplines. However, only a few works attempt to infer causality while exploiting the meaning of the data represented in a data structure like a knowledge graph. These works offer a glance at the possibilities of causal inference over knowledge graphs, but do not yet consider the metadata, e.g., cardinalities, class subsumption and overlap, and integrity constraints. We propose CareKG, a new formalism to express causal relationships among concepts, i.e., classes and relations, and to enable causal queries over knowledge graphs using the semantics of metadata. We empirically evaluate the expressiveness of CareKG in a synthetic knowledge graph with respect to cardinalities, class subsumption and overlap, and integrity constraints. Our initial results indicate that CareKG can represent and measure causal relations with some semantics that are not covered by state-of-the-art approaches.
  • Item
    TRANSRAZ Data Model: Towards a Geosocial Representation of Historical Cities
    (Berlin : AKA, 2023) Bruns, Oleksandra; Tietz, Tabea; Göller, Sandra; Sack, Harald; Acosta, M.; Peroni, S.; Vahdati, S.; Gentile, A.-L.; Pellegrini, T.; Kalo, J.-C.
    Preserving historical city architectures and making them (publicly) available has emerged as an important field of the cultural heritage and digital humanities research domain. In this context, the TRANSRAZ project is creating an interactive 3D environment of the historical city of Nuremberg that spans different periods of time. In addition to enabling the exploration of the city’s historical architecture, TRANSRAZ is also integrating information about its inhabitants, organizations, and important events, which is extracted semi-automatically from historical documents. Knowledge Graphs have proven useful and valuable for integrating and enriching these heterogeneous data. However, this task also comes with diverse data modeling challenges. This paper contributes the TRANSRAZ data model, which integrates agents, architectural objects, events, and historical documents into the 3D research environment by means of ontologies. The goal is to explore Nuremberg’s multifaceted past in different time layers in the context of its architectural, social, economic, and cultural developments.