Search Results

  • Item
    Detecting Cross-Language Plagiarism using Open Knowledge Graphs
    (Aachen, Germany : RWTH Aachen, 2021) Stegmüller, Johannes; Bauer-Marquart, Fabian; Meuschke, Norman; Ruas, Terry; Schubotz, Moritz; Gipp, Bela; Zhang, Chengzhi; Mayr, Philipp; Lu, Wie; Zhang, Yi
    Identifying cross-language plagiarism is challenging, especially for distant language pairs and sense-for-sense translations. We introduce the new multilingual retrieval model Cross-Language Ontology-Based Similarity Analysis (CL-OSA) for this task. CL-OSA represents documents as entity vectors obtained from the open knowledge graph Wikidata. Unlike other methods, CL-OSA requires neither computationally expensive machine translation nor pre-training using comparable or parallel corpora. It reliably disambiguates homonyms and scales to allow its application to Web-scale document collections. We show that CL-OSA outperforms state-of-the-art methods for retrieving candidate documents from five large, topically diverse test corpora that include distant language pairs like Japanese-English. For identifying cross-language plagiarism at the character level, CL-OSA primarily improves the detection of sense-for-sense translations. For these challenging cases, CL-OSA’s performance in terms of the well-established PlagDet score exceeds that of the best competitor by more than a factor of two. The code and data of our study are openly available.
  • Item
    zbMATH Open: API Solutions and Research Challenges
    (Aachen, Germany : RWTH Aachen, 2021) Petrera, Matteo; Trautwein, Dennis; Beckenbach, Isabel; Ehsani, Dariush; Müller, Fabian; Teschke, Olaf; Gipp, Bela; Schubotz, Moritz; Balke, Wolf-Tilo; de Waard, Anita; Fu, Yuanxi; Hua, Bolin; Schneider, Jodi; Song, Ningyuan; Wang, Xiaoguang
    We present zbMATH Open, the most comprehensive collection of reviews and bibliographic metadata of scholarly literature in mathematics. Besides our website zbMATH.org, which has been openly accessible since the beginning of this year, we provide API endpoints to offer our data. APIs improve interoperability with others, i.e., digital libraries, and allow using our data for research purposes. In this article, we (1) give an overview of the current and future services offered by zbMATH; (2) present the initial version of the zbMATH links API; (3) analyze the potential and limitations of the links API based on the example of the NIST Digital Library of Mathematical Functions; and finally, (4) present the zbMATH Open dataset as a research resource and discuss connected open research problems.
  • Item
    Mathematics in Wikidata
    (Aachen, Germany : RWTH Aachen, 2021) Scharpf, Philipp; Schubotz, Moritz; Gipp, Bela; Kaffee, Lucie-Aimée; Razniewski, Simon; Hogan, Aidan
    Documents from Science, Technology, Engineering, and Mathematics (STEM) disciplines usually contain a significant amount of mathematical formulae alongside text. Some Mathematical Information Retrieval (MathIR) systems, e.g., Mathematical Question Answering (MathQA), exploit knowledge from Wikidata. Therefore, the mathematical information needs to be stored in items. In recent years, there have been efforts to define several properties and seed formulae together with their constituting identifiers into Wikidata. This paper summarizes the current state, challenges, and discussions related to this endeavor. Furthermore, some data mining methods (supervised formula annotation and concept retrieval) and applications (question answering and classification explainability) of the mathematical information are outlined. Finally, we discuss community feedback and issues related to integrating Mathematical Entity Linking (MathEL) into Wikidata and Wikipedia, which was rejected in 33% and 12% of the test cases for Wikidata and Wikipedia, respectively. Our long-term goal is to populate Wikidata such that it can serve a variety of automated math reasoning tasks and AI systems.
  • Item
    Knowledge Graphs - Working Group Charter (NFDI section-metadata) (1.2)
    (Genève : CERN, 2023) Stocker, Markus; Rossenova, Lozana; Shigapov, Renat; Betancort, Noemi; Dietze, Stefan; Murphy, Bridget; Bölling, Christian; Schubotz, Moritz; Koepler, Oliver
    Knowledge Graphs are a key technology for implementing the FAIR principles in data infrastructures by ensuring interoperability for both humans and machines. The Working Group "Knowledge Graphs" in Section "(Meta)data, Terminologies, Provenance" of the German National Research Data Infrastructure (Nationale Forschungsdateninfrastruktur (NFDI) e.V.) aims to promote the use of knowledge graphs in all NFDI consortia, to facilitate cross-domain data interlinking and federation following the FAIR principles, and to contribute to the joint development of tools and technologies that enable transformation of structured and unstructured data into semantically reusable knowledge across different domains.
  • Item
    The Case for a Common, Reusable Knowledge Graph Infrastructure for NFDI
    (Hannover : TIB Open Publishing, 2023) Rossenova, Lozana; Schubotz, Moritz; Shigapov, Renat
    The Strategic Research and Innovation Agenda (SRIA) of the European Commission identifies Knowledge Graphs (KGs) as one of the most important technologies for building an interoperability framework and enabling data exchange among users across countries, sectors, and disciplines [1]. A KG is a graph-structured knowledge base containing a terminology (vocabulary or ontology) and data entities interrelated via the terminology [2]. KGs are based on semantic web technologies (RDF, SPARQL, etc.) and are often used for agile data integration. KGs also play an essential role within Germany as a vehicle to connect research data and research-related entities and make them accessible; examples include the GESIS Knowledge Graph Infrastructure, TIB Open Research Knowledge Graph, and GND.network. Furthermore, the Wikidata knowledge graph, maintained by Wikimedia Germany, contains a large number of research-related entities and is widely used in scientific knowledge management, in addition to being an important advocacy tool for open data [3]. Extending domain-specific, ontology-supported KGs with the multidisciplinary, crowdsourced knowledge in the Wikidata KG would enable significant applications. Linking expert knowledge systems with world knowledge empowers lay persons to benefit from high-quality research data and ultimately contributes to increasing society's confidence in scientific research.