Search Results

  • Item
    Baroque AI
    (Zenodo, 2023) Worthington, Simon; Blümel, Ina
    Publication prototype: a computational publishing and AI-assisted writing course unit with students of the Open Knowledge class at Hochschule Hannover with the Open Science Lab, TIB. The prototype publication exercise involves creating a fictional ‘exhibition catalogue’ drawing on Wikidata-based cataloguing of seventeenth-century painting deposited by the Bavarian State Painting Collections. The prototype demonstrates how computational publishing can be used to bring together different distributed linked open data (LOD) sources. Additionally, AI tools are used for assisted essay writing. Both are then encapsulated in a multi-format computational publication, allowing for asynchronous collaborative working. Distributed LOD sources include Wikidata/base, Nextcloud, Thoth, Semantic Kompakkt, and TIB AV Portal. The AI tools used for essay writing are OpenAI and Perplexity. Eleven students completed the class unit, which was carried out from March to April 2023. An open access OER guide to running the class and a template publication for use in the class are online on GitHub and designed for OER reuse. Full class information and resources are on Wikiversity. The open source software used is brought together in the ADA Pipeline.
  • Item
    When humans and machines collaborate: Cross-lingual Label Editing in Wikidata
    (New York City : Association for Computing Machinery, 2019) Kaffee, L.-A.; Endris, K.M.; Simperl, E.
    The quality and maintainability of a knowledge graph are determined by the process in which it is created. There are different approaches to such processes: extraction or conversion of available data on the web (automated extraction of knowledge, such as DBpedia from Wikipedia); community-created knowledge graphs, often built by a group of experts; and hybrid approaches where humans maintain the knowledge graph alongside bots. In this work, we focus on the hybrid approach of human-edited knowledge graphs supported by automated tools. In particular, we analyse the editing of natural language data, i.e. labels. Labels are the entry point for humans to understand the information, and therefore need to be carefully maintained. We take a step toward understanding the collaborative editing of humans and automated tools across languages in a knowledge graph. We use Wikidata, as it has a large and active community of humans and bots working together, covering over 300 languages. We analyse the different editor groups and how they interact with the different language data to understand the provenance of the current label data.
  • Item
    Encoding Knowledge Graph Entity Aliases in Attentive Neural Network for Wikidata Entity Linking
    (Berlin ; Heidelberg : Springer, 2020) Mulang’, Isaiah Onando; Singh, Kuldeep; Vyas, Akhilesh; Shekarpour, Saeedeh; Vidal, Maria-Esther; Lehmann, Jens; Auer, Sören; Huang, Zhisheng; Beek, Wouter; Wang, Hua; Zhou, Rui; Zhang, Yanchun
    Collaborative knowledge graphs such as Wikidata rely excessively on the crowd to author the information. Since the crowd is not bound to a standard protocol for assigning entity titles, the knowledge graph is populated by non-standard, noisy, long, or sometimes even awkward titles. Long, implicit, and non-standard entity representations are a challenge for Entity Linking (EL) approaches aiming for high precision and recall. The underlying KG is in general the source of target entities for EL approaches; however, it often contains other relevant information, such as aliases of entities (e.g., Obama and Barack Hussein Obama are aliases for the entity Barack Obama). EL models usually ignore such readily available entity attributes. In this paper, we examine the role of knowledge graph context in an attentive neural network approach for entity linking on Wikidata. Our approach contributes by exploiting sufficient context from a KG as a source of background knowledge, which is then fed into the neural network. This approach demonstrates merit in addressing challenges associated with entity titles (multi-word, long, implicit, case-sensitive). Our experimental study shows an improvement of ≈8% over the baseline approach, and our approach significantly outperforms an end-to-end approach for Wikidata entity linking.
  • Item
    Falcon 2.0: An Entity and Relation Linking Tool over Wikidata
    (New York City, NY : Association for Computing Machinery, 2020) Sakor, Ahmad; Singh, Kuldeep; Patel, Anery; Vidal, Maria-Esther
    The Natural Language Processing (NLP) community has contributed significantly to solutions for recognizing entities and relations in natural language text and, where possible, linking them to proper matches in Knowledge Graphs (KGs). With Wikidata as the background KG, there are still few tools to link knowledge within text to Wikidata. In this paper, we present Falcon 2.0, the first joint entity and relation linking tool over Wikidata. It receives a short natural language text in English and outputs a ranked list of entities and relations annotated with the proper candidates in Wikidata. The candidates are represented by their Internationalized Resource Identifier (IRI) in Wikidata. Falcon 2.0 resorts to the English language model for the recognition task (e.g., N-Gram tiling and N-Gram splitting), and then to an optimization approach for the linking task. We have empirically studied the performance of Falcon 2.0 on Wikidata and concluded that it outperforms all the existing baselines. Falcon 2.0 is open source and can be reused by the community; all the required instructions are documented in our GitHub repository (https://github.com/SDM-TIB/falcon2.0). We also demonstrate an online API, which can be run without any technical expertise. Falcon 2.0 and its background knowledge bases are available as resources at https://labs.tib.eu/falcon/falcon2/.