Browsing by Author "Sack, Harald"
Now showing 1 - 20 of 21
- Advances in Semantics and Explainability for NLP: Joint Proceedings of the 2nd International Workshop on Deep Learning meets Ontologies and Natural Language Processing (DeepOntoNLP 2021) & 6th International Workshop on Explainable Sentiment Mining and Emotion Detection (X-SENTIMENT 2021), co-located with the 18th Extended Semantic Web Conference (ESWC 2021) (Aachen : RWTH Aachen, 2021)
  Ben Abbès, Sarra; Hantach, Rim; Calvez, Philippe; Buscaldi, Davide; Dessì, Danilo; Dragoni, Mauro; Reforgiato Recupero, Diego; Sack, Harald
  [no abstract available]
- Analyzing social media for measuring public attitudes toward controversies and their driving factors: a case study of migration (Wien : Springer, 2022)
  Chen, Yiyi; Sack, Harald; Alam, Mehwish
  Among other ways of expressing opinions on media such as blogs and forums, social media (such as Twitter) has become one of the most widely used channels by populations for expressing their opinions. With an increasing interest in the topic of migration in Europe, it is important to process and analyze these opinions. To this end, this study aims at measuring the public attitudes toward migration in terms of sentiments and hate speech from a large number of tweets crawled on the decisive topic of migration. This study introduces a knowledge base (KB) of anonymized migration-related annotated tweets termed as MigrationsKB (MGKB). The tweets from 2013 to July 2021 in the European countries that are hosts of immigrants are collected, pre-processed, and filtered using advanced topic modeling techniques. BERT-based entity linking and sentiment analysis, complemented by attention-based hate speech detection, are performed to annotate the curated tweets. Moreover, external databases are used to identify the potential social and economic factors causing negative public attitudes toward migration. The analysis aligns with the hypothesis that the countries with more migrants have fewer negative and hateful tweets. To further promote research in the interdisciplinary fields of social sciences and computer science, the outcomes are integrated into MGKB, which significantly extends the existing ontology to consider the public attitudes toward migrations and economic indicators. This study further discusses the use-cases and exploitation of MGKB. Finally, MGKB is made publicly available, fully supporting the FAIR principles.
- An Assessment of Deep Learning Models and Word Embeddings for Toxicity Detection within Online Textual Comments (Basel : MDPI, 2021)
  Dessì, Danilo; Recupero, Diego Reforgiato; Sack, Harald
  Today, increasing numbers of people are interacting online and a lot of textual comments are being produced due to the explosion of online communication. However, a paramount inconvenience within online environments is that comments that are shared within digital platforms can hide hazards, such as fake news, insults, harassment, and, more in general, comments that may hurt someone’s feelings. In this scenario, the detection of this kind of toxicity has an important role to moderate online communication. Deep learning technologies have recently delivered impressive performance within Natural Language Processing applications encompassing Sentiment Analysis and emotion detection across numerous datasets. Such models do not need any pre-defined hand-picked features, but they learn sophisticated features from the input datasets by themselves. In such a domain, word embeddings have been widely used as a way of representing words in Sentiment Analysis tasks, proving to be very effective. Therefore, in this paper, we investigated the use of deep learning and word embeddings to detect six different types of toxicity within online comments. In doing so, the most suitable deep learning layers and state-of-the-art word embeddings for identifying toxicity are evaluated. The results suggest that Long-Short Term Memory layers in combination with mimicked word embeddings are a good choice for this task.
- Audio Ontologies for Intangible Cultural Heritage (Bramhall, Stockport ; EasyChair Ltd., 2022-04-12)
  Tan, Mary Ann; Posthumus, Etienne; Sack, Harald
  Cultural heritage portals often contain intangible objects digitized as audio files. This paper presents and discusses the adaptation of existing audio ontologies intended for non-cultural heritage applications. The resulting alignment of the German Digital Library-Europeana Data Model (DDB-EDM) with Music Ontology (MO) and Audio Commons Ontology (ACO) is presented.
- Challenges of Applying Knowledge Graph and their Embeddings to a Real-world Use-case (Aachen, Germany : RWTH Aachen, 2021)
  Petzold, Rick; Gesese, Genet Asefa; Bogdanova, Viktoria; Zylowski, Thorsten; Sack, Harald; Alam, Mehwish; Alam, Mehwish; Buscaldi, Davide; Cochez, Michael; Osborne, Francesco; Reforgiato Recupero, Diego; Sack, Harald
  Different Knowledge Graph Embedding (KGE) models have been proposed so far which are trained on some specific KG completion tasks such as link prediction and evaluated on datasets which are mainly created for such purpose. Mostly, the embeddings learnt on link prediction tasks are not applied for downstream tasks in real-world use-cases such as data available in different companies/organizations. In this paper, the challenges with enriching a KG which is generated from a real-world relational database (RDB) about companies, with information from external sources such as Wikidata and learning representations for the KG are presented. Moreover, a comparative analysis is presented between the KGEs and various text embeddings on some downstream clustering tasks. The results of experiments indicate that in use-cases like the one used in this paper, where the KG is highly skewed, it is beneficial to use text embeddings or language models instead of KGEs.
- Contextual Language Models for Knowledge Graph Completion (Aachen, Germany : RWTH Aachen, 2021)
  Russa, Biswas; Sofronova, Radina; Alam, Mehwish; Sack, Harald; Mehwish, Alam; Ali, Mehdi; Groth, Paul; Hitzler, Pascal; Lehmann, Jens; Paulheim, Heiko; Rettinger, Achim; Sack, Harald; Sadeghi, Afshin; Tresp, Volker
  Knowledge Graphs (KGs) have become the backbone of various machine learning based applications over the past decade. However, the KGs are often incomplete and inconsistent. Several representation learning based approaches have been introduced to complete the missing information in KGs. Besides, Neural Language Models (NLMs) have gained huge momentum in NLP applications. However, exploiting the contextual NLMs to tackle the Knowledge Graph Completion (KGC) task is still an open research problem. In this paper, a GPT-2 based KGC model is proposed and is evaluated on two benchmark datasets. The initial results obtained from the fine-tuning of the GPT-2 model for triple classification strengthen the importance of usage of NLMs for KGC. Also, the impact of contextual language models for KGC has been discussed.
- DDB-EDM to FaBiO: The Case of the German Digital Library (Aachen, Germany : RWTH Aachen, 2021)
  Tan, Mary Ann; Tietz, Tabea; Bruns, Oleksandra; Oppenlaender, Jonas; Dessì, Danilo; Sack, Harald; Seneviratne, Oshani; Pesquita, Catia; Sequeda, Juan; Etcheverry, Lorena
  Cultural heritage portals have the goal of providing users with seamless access to all their resources. This paper introduces initial efforts for a user-oriented restructuring of the German Digital Library (DDB). At present, cultural heritage objects (CHOs) in the DDB are modeled using an extended version of the Europeana Data Model (DDB-EDM), which negatively impacts usability and exploration. These challenges can be addressed by leveraging ontologies, and building a knowledge graph from the DDB's voluminous collection. Towards this goal, an alignment of bibliographic metadata from DDB-EDM to FRBR-Aligned Bibliographic Ontology (FaBiO) is presented.
- Designing Intelligent Systems for Online Education: Open Challenges and Future Directions (Aachen, Germany : RWTH Aachen, 2021)
  Dessì, Danilo; Käser, Tanja; Marras, Mirko; Popescu, Elvira; Sack, Harald; Dessì, Danilo; Käser, Tanja; Marras, Mirko; Popescu, Elvira; Sack, Harald
  The design and delivery of platforms for online education is fostering increasingly intense research. Scaling up education online brings new emerging needs related to, for example, hardly manageable classes, overwhelming content alternatives, and academic dishonesty while interacting remotely. However, with the impressive progress of the data mining and machine learning fields, combined with the large amounts of learning-related data and high-performance computing, it has been possible to gain a deeper understanding of the nature of learning and teaching online. Methods at the analytical and algorithmic levels are constantly being developed and hybrid approaches are receiving increasing attention. Recent methods are analyzing not only the online traces left by students a posteriori, but also the extent to which this data can be turned into actionable insights and models, to support the above needs in a computationally efficient, adaptive and timely way. In this paper, we present relevant open challenges lying at the intersection between the machine learning and educational communities that need to be addressed to further develop the field of intelligent systems for online education. Several areas of research in this field are identified, such as data availability and sharing, time-wise and multi-modal data modelling, generalizability, fairness, explainability, interpretability, privacy, and ethics behind models delivered for supporting education. Practical challenges and recommendations for possible research directions are provided for each of them, paving the way for future advances in this field.
- DoMoRe – A recommender system for domain modeling (Setúbal : SciTePress, 2018)
  Agt-Rickauer, Henning; Kutsche, Ralf-Detlef; Sack, Harald; Hammoudi, Slimane; Ferreira Pires, Luis; Selic, Bran
  Domain modeling is an important activity in early phases of software projects to achieve a shared understanding of the problem field among project participants. Domain models describe concepts and relations of respective application fields using a modeling language and domain-specific terms. Detailed knowledge of the domain as well as expertise in model-driven development is required for software engineers to create these models. This paper describes DoMoRe, a system for automated modeling recommendations to support the domain modeling process. We describe an approach in which modeling benefits from formalized knowledge sources and information extraction from text. The system incorporates a large network of semantically related terms built from natural language data sets integrated with mediator-based knowledge base querying in a single recommender system to provide context-sensitive suggestions of model elements.
- Interaction Network Analysis Using Semantic Similarity Based on Translation Embeddings (Berlin ; Heidelberg : Springer, 2019)
  Manzoor Bajwa, Awais; Collarana, Diego; Vidal, Maria-Esther; Acosta, Maribel; Cudré-Mauroux, Philippe; Maleshkova, Maria; Pellegrini, Tassilo; Sack, Harald; Sure-Vetter, York
  Biomedical knowledge graphs such as STITCH, SIDER, and Drugbank provide the basis for the discovery of associations between biomedical entities, e.g., interactions between drugs and targets. Link prediction is a paramount task and represents a building block for supporting knowledge discovery. Although several approaches have been proposed for effectively predicting links, the role of semantics has not been studied in depth. In this work, we tackle the problem of discovering interactions between drugs and targets, and propose SimTransE, a machine learning-based approach that solves this problem effectively. SimTransE relies on translating embeddings to model drug-target interactions and values of similarity across them. Grounded on the vectorial representation of drug-target interactions, SimTransE is able to discover novel drug-target interactions. We empirically study SimTransE using state-of-the-art benchmarks and approaches. Experimental results suggest that SimTransE is competitive with the state of the art, representing, thus, an effective alternative for knowledge discovery in the biomedical domain.
- Knowledge Extraction for Art History: the Case of Vasari’s The Lives of The Artists (1568) (Aachen, Germany : RWTH Aachen, 2022)
  Santini, Cristian; Tan, Mary Ann; Tietz, Tabea; Bruns, Oleksandra; Posthumus, Etienne; Sack, Harald; Paschke, Adrian; Rehm, Georg; Neudecker, Clemens; Pintscher, Lydia
  Knowledge Extraction (KE) techniques are used to convert unstructured information present in texts to Knowledge Graphs (KGs) which can be queried and explored. Despite their potential for cultural heritage domains, such as Art History, these techniques often encounter limitations if applied to domain-specific data. In this paper we present the main challenges that KE has to face on art-historical texts, by using as case study Giorgio Vasari's The Lives of The Artists. This paper discusses the following NLP tasks for art-historical texts, namely entity recognition and linking, coreference resolution, time extraction, motif extraction and artwork extraction. Several strategies to annotate art-historical data for these tasks and evaluate NLP models are also proposed.
- Knowledge Graph enabled Curation and Exploration of Nuremberg's City Heritage (Aachen, Germany : RWTH Aachen, 2021)
  Tietz, Tabea; Bruns, Oleksandra; Göller, Sandra; Razum, Matthias; Dessì, Danilo; Sack, Harald; Paschke, Adrian; Rehm, Georg; Al Qundus, Jamal; Neudecker, Clemens; Pintscher, Lydia
  An important part of European cultural identity relies on European cities and in particular on their histories and cultural heritage. Nuremberg, the home of important artists such as Albrecht Dürer and Hans Sachs, developed into the epitome of German and European culture already during the Middle Ages. Throughout history, the city experienced a number of transformations, especially with its almost complete destruction during World War 2. This position paper presents TRANSRAZ, a project with the goal to recreate Nuremberg by means of an interactive 3D tool to explore the city's architecture and culture ranging from the 17th to the 21st century. The goal of this position paper is to discuss the ongoing work of connecting heterogeneous historical data from various sources previously hidden in archives to the 3D model using knowledge graphs for a scientifically accurate interactive exploration on the Web.
- Linked Data Supported Content Analysis for Sociology (Berlin ; Heidelberg : Springer, 2019)
  Tietz, Tabea; Sack, Harald; Acosta, Maribel; Cudré-Mauroux, Philippe; Maleshkova, Maria; Pellegrini, Tassilo; Sack, Harald; Sure-Vetter, York
  Philology and hermeneutics as the analysis and interpretation of natural language text in written historical sources are the predecessors of modern content analysis and date back already to antiquity. In empirical social sciences, especially in sociology, content analysis provides valuable insights into social structures and cultural norms of the present and past. With the ever growing amount of text on the web to analyze, numerous computer-assisted text analysis techniques and tools have also been developed in sociological research. However, existing methods often go without sufficient standardization. As a consequence, sociological text analysis is lacking transparency, reproducibility and data re-usability. The goal of this paper is to show how Linked Data principles and Entity Linking techniques can be used to structure, publish and analyze natural language text for sociological research to tackle these shortcomings. This is achieved on the use case of constitutional text documents of the Netherlands from 1884 to 2016 which represent an important contribution to the European cultural heritage. Finally, the generated data is made available and re-usable as Linked Data not only for sociologists, but also for all other researchers in the digital humanities domain interested in the development of constitutions in the Netherlands.
- Machine Learning with Symbolic Methods and Knowledge Graphs (Aachen : RWTH Aachen, 2021)
  Alam, Mehwish; Ali, Mehdi; Groth, Paul; Hitzler, Pascal; Lehmann, Jens; Paulheim, Heiko; Rettinger, Achim; Sack, Harald; Sadeghi, Afshin; Tresp, Volker
  [no abstract available]
- Modelling Archival Hierarchies in Practice: Key Aspects and Lessons Learned (Aachen, Germany : RWTH Aachen, 2021)
  Vafaie, Mahsa; Bruns, Oleksandra; Pilz, Nastasja; Dessì, Danilo; Sack, Harald; Sumikawa, Yasunobu; Ikejiri, Ryohei; Doucet, Antoine; Pfanzelter, Eva; Hasanuzzaman, Mohammed; Dias, Gaël; Milligan, Ian; Jatowt, Adam
  An increasing number of archival institutions aim to provide public access to historical documents. Ontologies have been designed, developed and utilised to model the archival description of historical documents and to enable interoperability between different information sources. However, due to the heterogeneous nature of archives and archival systems, current ontologies for the representation of archival content do not always cover all existing structural organisation forms equally well. After briefly contextualising the heterogeneity in the hierarchical structure of German archives, this paper describes and evaluates differences between two archival ontologies, ArDO and RiC-O, and their approaches to modelling hierarchy levels and archive dynamics.
- Ontology Modelling for Materials Science Experiments (Aachen, Germany : RWTH Aachen, 2021)
  Alam, Mehwish; Birkholz, Henk; Dessì, Danilo; Eberl, Christoph; Fliegl, Heike; Gumbsch, Peter; von Hartrott, Philipp; Mädler, Lutz; Niebel, Markus; Sack, Harald; Thomas, Akhil; Tiddi, Ilaria; Maleshkova, Maria; Pellegrini, Tassilo; de Boer, Victor
  Materials are either enabler or bottleneck for the vast majority of technological innovations. The digitization of materials and processes is mandatory to create live production environments which represent physical entities and their aggregations and thus allow to represent, share, and understand materials changes. However, a common standard formalization for materials knowledge in the form of taxonomies, ontologies, or knowledge graphs has not been achieved yet. This paper sketches the efforts in modelling an ontology prototype to describe Materials Science experiments. It describes what is expected from the ontology by introducing a use case where a process chain driven by the ontology enables the curation and understanding of experiments.
- Steps towards a Dislocation Ontology for Crystalline Materials (Aachen, Germany : RWTH Aachen, 2021)
  Ihsan, Ahmad Zainul; Dessì, Danilo; Alam, Mehwish; Sack, Harald; Sandfeld, Stefan; García-Castro, Raúl; Davies, John; Antoniou, Grigoris; Fortuna, Carolina
  The field of Materials Science is concerned with, e.g., properties and performance of materials. An important class of materials are crystalline materials that usually contain “dislocations”, a line-like defect type. Dislocations decisively determine many important materials properties. Over the past decades, significant effort was put into understanding dislocation behavior across different length scales both with experimental characterization techniques as well as with simulations. However, for describing such dislocation structures there is still a lack of a common standard to represent and to connect dislocation domain knowledge across different but related communities. An ontology offers a common foundation to enable knowledge representation and data interoperability, which are important components to establish a “digital twin”. This paper outlines the first steps towards the design of an ontology in the dislocation domain and shows a connection with the already existing ontologies in the materials science and engineering domain.
- Temporal Evolution of the Migration-related Topics on Social Media (Aachen, Germany : RWTH Aachen, 2021)
  Chen, Yiyi; Gesese, Genet Asefa; Sack, Harald; Alam, Mehwish; Seneviratne, Oshani; Pesquita, Catia; Sequeda, Juan; Etcheverry, Lorena
  This poster focuses on capturing the temporal evolution of migration-related topics on relevant tweets. It uses Dynamic Embedded Topic Model (DETM) as a learning algorithm to perform a quantitative and qualitative analysis of these emerging topics. TweetsKB is extended with the extracted Twitter dataset along with the results of DETM which considers temporality. These results are then further analyzed and visualized. It reveals that the trajectories of the migration-related topics are in agreement with historical events. The source codes are available online: https://bit.ly/3dN9ICB.
- Temporal Role Annotation for Named Entities (Amsterdam [u.a.] : Elsevier, 2018)
  Koutraki, Maria; Bakhshandegan-Moghaddam, Farshad; Sack, Harald; Fensel, Anna; de Boer, Victor; Pellegrini, Tassilo; Kiesling, Elmar; Haslhofer, Bernhard; Hollink, Laura; Schindler, Alexander
  Natural language understanding tasks are key to extracting structured and semantic information from text. One of the most challenging problems in natural language is ambiguity and resolving such ambiguity based on context including temporal information. This paper focuses on the task of extracting temporal roles from text, e.g. CEO of an organization or head of a state. A temporal role has a domain, which may resolve to different entities depending on the context and especially on temporal information, e.g. CEO of Microsoft in 2000. We focus on the temporal role extraction, as a precursor for temporal role disambiguation. We propose a structured prediction approach based on Conditional Random Fields (CRF) to annotate temporal roles in text and rely on a rich feature set, which extracts syntactic and semantic information from text. We perform an extensive evaluation of our approach based on two datasets. In the first dataset, we extract nearly 400k instances from Wikipedia through distant supervision, whereas in the second dataset, a manually curated ground-truth consisting of 200 instances is extracted from a sample of The New York Times (NYT) articles. Last, the proposed approach is compared against baselines where significant improvements are shown for both datasets.
- Towards a Representation of Temporal Data in Archival Records: Use Cases and Requirements (Aachen, Germany : RWTH Aachen, 2021)
  Bruns, Oleksandra; Tietz, Tabea; Vafaie, Mahsa; Dessì, Danilo; Sack, Harald; Lopes, Carla Teixeira; Ribeiro, Cristina; Niccolucci, Franco; Rodrigues, Irene; Freire, Nuno
  Archival records are essential sources of information for historians and digital humanists to understand history. For modern information systems they are often analysed and integrated into Knowledge Graphs for better access, interoperability and re-use. However, due to restrictions on the representation of RDF predicates, temporal data within archival records is a challenge to model. This position paper explains requirements for modeling temporal data in archival records based on running research projects in which archival records are analysed and integrated in Knowledge Graphs for research and exploration.