Search Results

  • Item
    Falcon 2.0: An Entity and Relation Linking Tool over Wikidata
    (New York City, NY : Association for Computing Machinery, 2020) Sakor, Ahmad; Singh, Kuldeep; Patel, Anery; Vidal, Maria-Esther
    The Natural Language Processing (NLP) community has contributed significantly to solutions for recognizing entities and relations in natural language text and, where possible, linking them to proper matches in Knowledge Graphs (KGs). With Wikidata as the background KG, there are still few tools for linking knowledge within text to Wikidata. In this paper, we present Falcon 2.0, the first joint entity and relation linking tool over Wikidata. It receives a short natural language text in English and outputs a ranked list of entities and relations annotated with the proper candidates in Wikidata. The candidates are represented by their Internationalized Resource Identifiers (IRIs) in Wikidata. Falcon 2.0 resorts to the English language model for the recognition task (e.g., N-Gram tiling and N-Gram splitting), and then to an optimization approach for the linking task. We have empirically studied the performance of Falcon 2.0 on Wikidata and concluded that it outperforms all the existing baselines. Falcon 2.0 is open source and can be reused by the community; all required instructions are documented in our GitHub repository (https://github.com/SDM-TIB/falcon2.0). We also demonstrate an online API, which can be run without any technical expertise. Falcon 2.0 and its background knowledge bases are available as resources at https://labs.tib.eu/falcon/falcon2/.
  • Item
    Linked Data Supported Content Analysis for Sociology
    (Berlin ; Heidelberg : Springer, 2019) Tietz, Tabea; Sack, Harald; Acosta, Maribel; Cudré-Mauroux, Philippe; Maleshkova, Maria; Pellegrini, Tassilo; Sack, Harald; Sure-Vetter, York
    Philology and hermeneutics, as the analysis and interpretation of natural language text in written historical sources, are the predecessors of modern content analysis and date back to antiquity. In the empirical social sciences, especially in sociology, content analysis provides valuable insights into the social structures and cultural norms of the present and past. With the ever-growing amount of text on the web to analyze, numerous computer-assisted text analysis techniques and tools have also been developed in sociological research. However, existing methods often lack sufficient standardization. As a consequence, sociological text analysis lacks transparency, reproducibility and data re-usability. The goal of this paper is to show how Linked Data principles and Entity Linking techniques can be used to structure, publish and analyze natural language text for sociological research in order to tackle these shortcomings. This is demonstrated on the use case of constitutional text documents of the Netherlands from 1884 to 2016, which represent an important contribution to the European cultural heritage. Finally, the generated data is made available and re-usable as Linked Data, not only for sociologists but also for all other researchers in the digital humanities interested in the development of constitutions in the Netherlands.
  • Item
    Knowledge Extraction for Art History: the Case of Vasari’s The Lives of The Artists (1568)
    (Aachen, Germany : RWTH Aachen, 2022) Santini, Cristian; Tan, Mary Ann; Tietz, Tabea; Bruns, Oleksandra; Posthumus, Etienne; Sack, Harald; Paschke, Adrian; Rehm, Georg; Neudecker, Clemens; Pintscher, Lydia
    Knowledge Extraction (KE) techniques are used to convert unstructured information present in texts into Knowledge Graphs (KGs), which can be queried and explored. Despite their potential for cultural heritage domains such as Art History, these techniques often encounter limitations when applied to domain-specific data. In this paper, we present the main challenges that KE faces on art-historical texts, using Giorgio Vasari's The Lives of The Artists as a case study. The paper discusses the following NLP tasks for art-historical texts: entity recognition and linking, coreference resolution, time extraction, motif extraction and artwork extraction. Several strategies for annotating art-historical data for these tasks and for evaluating NLP models are also proposed.