Search Results

Contextual Language Models for Knowledge Graph Completion

2021, Russa, Biswas, Sofronova, Radina, Alam, Mehwish, Sack, Harald, Mehwish, Alam, Ali, Medi, Groth, Paul, Hitzler, Pascal, Lehmann, Jens, Paulheim, Heiko, Rettinger, Achim, Sack, Harald, Sadeghi, Afshin, Tresp, Volker

Knowledge Graphs (KGs) have become the backbone of various machine learning based applications over the past decade. However, the KGs are often incomplete and inconsistent. Several representation learning based approaches have been introduced to complete the missing information in KGs. Besides, Neural Language Models (NLMs) have gained huge momentum in NLP applications. However, exploiting the contextual NLMs to tackle the Knowledge Graph Completion (KGC) task is still an open research problem. In this paper, a GPT-2 based KGC model is proposed and is evaluated on two benchmark datasets. The initial results obtained from the fine-tuning of the GPT-2 model for triple classification strengthen the importance of using NLMs for KGC. Also, the impact of contextual language models for KGC has been discussed.

Modelling Archival Hierarchies in Practice: Key Aspects and Lessons Learned

2021, Vafaie, Mahsa, Bruns, Oleksandra, Pilz, Nastasja, Dessì, Danilo, Sack, Harald, Sumikawa, Yasunobu, Ikejiri, Ryohei, Doucet, Antoine, Pfanzelter, Eva, Hasanuzzaman, Mohammed, Dias, Gaël, Milligan, Ian, Jatowt, Adam

An increasing number of archival institutions aim to provide public access to historical documents. Ontologies have been designed, developed and utilised to model the archival description of historical documents and to enable interoperability between different information sources. However, due to the heterogeneous nature of archives and archival systems, current ontologies for the representation of archival content do not always cover all existing structural organisation forms equally well. After briefly contextualising the heterogeneity in the hierarchical structure of German archives, this paper describes and evaluates differences between two archival ontologies, ArDO and RiC-O, and their approaches to modelling hierarchy levels and archive dynamics.

DDB-KG: The German Bibliographic Heritage in a Knowledge Graph

2021, Tan, Mary Ann, Tietz, Tabea, Bruns, Oleksandra, Oppenlaender, Jonas, Dessì, Danilo, Sack, Harald, Sumikawa, Yasunobu, Ikejiri, Ryohei, Doucet, Antoine, Pfanzelter, Eva, Hasanuzzaman, Mohammed, Dias, Gaël, Milligan, Ian, Jatowt, Adam

Under the German government’s initiative “NEUSTART Kultur”, the German Digital Library or Deutsche Digitale Bibliothek (DDB) is undergoing improvements to enhance user experience. As an initial step, emphasis is placed on creating a knowledge graph from the bibliographic record collection of the DDB. This paper discusses the challenges facing the DDB in terms of retrieval and the solutions for addressing them. In particular, limitations of the current data model or ontology in representing bibliographic metadata are analyzed through concrete examples. This study presents the complete ontological mapping from DDB-Europeana Data Model (DDB-EDM) to FaBiO, and a prototype of the DDB-KG made available as a SPARQL endpoint. The suitability of the target ontology is demonstrated with SPARQL queries formulated from competency questions.

Knowledge Extraction for Art History: the Case of Vasari’s The Lives of The Artists (1568)

2022, Santini, Cristian, Tan, Mary Ann, Tietz, Tabea, Bruns, Oleksandra, Posthumus, Etienne, Sack, Harald, Paschke, Adrian, Rehm, Georg, Neudecker, Clemens, Pintscher, Lydia

Knowledge Extraction (KE) techniques are used to convert unstructured information present in texts to Knowledge Graphs (KGs) which can be queried and explored. Despite their potential for cultural heritage domains, such as Art History, these techniques often encounter limitations if applied to domain-specific data. In this paper, we present the main challenges that KE faces on art-historical texts, using Giorgio Vasari's The Lives of The Artists as a case study. This paper discusses the following NLP tasks for art-historical texts: entity recognition and linking, coreference resolution, time extraction, motif extraction and artwork extraction. Several strategies to annotate art-historical data for these tasks and evaluate NLP models are also proposed.

Check square at CheckThat! 2020: Claim Detection in Social Media via Fusion of Transformer and Syntactic Features

2020, Cheema, Gullal S., Hakimov, Sherzod, Ewerth, Ralph, Cappellato, Linda, Eickhoff, Carsten, Ferro, Nicola, Névéol, Aurélie

In this digital age of news consumption, a news reader has the ability to react, express and share opinions with others in a highly interactive and fast manner. As a consequence, fake news has made its way into our daily life because of very limited capacity to verify news on the Internet by large companies as well as individuals. In this paper, we focus on solving two problems which are part of the fact-checking ecosystem that can help to automate fact-checking of claims in an ever increasing stream of content on social media. For the first problem, claim check-worthiness prediction, we explore the fusion of syntactic features and deep transformer Bidirectional Encoder Representations from Transformers (BERT) embeddings, to classify check-worthiness of a tweet, i.e. whether it includes a claim or not. We conduct a detailed feature analysis and present our best performing models for English and Arabic tweets. For the second problem, claim retrieval, we explore the pre-trained embeddings from a Siamese network transformer model (sentence-transformers) specifically trained for semantic textual similarity, and perform KD-search to retrieve verified claims with respect to a query tweet.

Leveraging Literals for Knowledge Graph Embeddings

2021, Gesese, Genet Asefa, Tamma, Valentina, Fernandez, Miriam, Poveda-Villalón, María

Nowadays, Knowledge Graphs (KGs) have become invaluable for various applications such as named entity recognition, entity linking, and question answering. However, there is a huge computational and storage cost associated with these KG-based applications. Therefore, it becomes necessary to transform the high dimensional KGs into low dimensional vector spaces, i.e., to learn representations for the KGs. Since a KG represents facts both as interrelations between entities and as attributes of entities, the semantics present in both forms should be preserved while transforming the KG into a vector space. Hence, the main focus of this thesis is to deal with the multimodality and multilinguality of literals when utilizing them for the representation learning of KGs. The other task is to extract benchmark datasets with a high level of difficulty for tasks such as link prediction and triple classification. These datasets could be used for evaluating both kinds of KG embeddings, those using literals and those which do not include literals.

On the Role of Images for Analyzing Claims in Social Media

2021, Cheema, Gullal S., Hakimov, Sherzod, Müller-Budack, Eric, Ewerth, Ralph

Fake news is a severe problem in social media. In this paper, we present an empirical study on visual, textual, and multimodal models for the tasks of claim, claim check-worthiness, and conspiracy detection, all of which are related to fake news detection. Recent work suggests that images are more influential than text and often appear alongside fake text. To this end, several multimodal models have been proposed in recent years that use images along with text to detect fake news on social media sites like Twitter. However, the role of images is not well understood for claim detection, specifically using transformer-based textual and multimodal models. We investigate state-of-the-art models for images, text (Transformer-based), and multimodal information for four different datasets across two languages to understand the role of images in the task of claim and conspiracy detection.

Combining Textual Features for the Detection of Hateful and Offensive Language

2021, Hakimov, Sherzod, Ewerth, Ralph, Mehta, Parth, Mandl, Thomas, Majumder, Prasenjit, Mitra, Mandar

The detection of offensive, hateful and profane language has become a critical challenge since many users in social networks are exposed to cyberbullying activities on a daily basis. In this paper, we present an analysis of combining different textual features for the detection of hateful or offensive posts on Twitter. We provide a detailed experimental evaluation to understand the impact of each building block in a neural network architecture. The proposed architecture is evaluated on the English Subtask 1A, "Identifying Hate, offensive and profane content from the post", of the HASOC-2021 dataset under the team name TIB-VA. We compared different variants of the contextual word embeddings combined with the character level embeddings and the encoding of collected hate terms.

On the Impact of Features and Classifiers for Measuring Knowledge Gain during Web Search - A Case Study

2021, Gritz, Wolfgang, Hoppe, Anett, Ewerth, Ralph, Cong, Gao, Ramanath, Maya

Search engines are normally not designed to support human learning intents and processes. The field of Search as Learning (SAL) aims to investigate the characteristics of a successful Web search with a learning purpose. In this paper, we analyze the impact of text complexity of Web pages on predicting knowledge gain during a search session. For this purpose, we conduct an experimental case study and investigate the influence of several text-based features and classifiers on the prediction task. We build upon data from a study of related work, where 104 participants were given the task to learn about the formation of lightning and thunder through Web search. We perform an extensive evaluation based on a state-of-the-art approach and extend it with additional features related to textual complexity of Web pages. In contrast to prior work, we perform a systematic search for optimal hyperparameters and show the possible influence of feature selection strategies on the knowledge gain prediction. When using the new set of features, state-of-the-art results are noticeably improved. The results indicate that text complexity of Web pages could be an important feature resource for knowledge gain prediction.

A Data Model for Linked Stage Graph and the Historical Performing Arts Domain

2023, Tietz, Tabea, Bruns, Oleksandra, Sack, Harald, Bikakis, Antonis, Ferrario, Roberta, Jean, Stéphane, Markhoff, Béatrice, Mosca, Alessandro, Nicolosi Asmundo, Marianna

The performing arts are complex, dynamic and embedded into societal and political systems. Providing means to research historical performing arts data is therefore crucial for understanding our history and culture. However, no commonly accepted ontology for historical performing arts data currently exists. Using the Linked Stage Graph as an example, this position paper presents the ongoing process of creating an application-driven and efficient data model by leveraging and building upon existing standards and ontologies like CIDOC-CRM, FRBR, and FRBRoo.