DSpace :: Browsing by Author "Alam, Mehwish"

Analyzing social media for measuring public attitudes toward controversies and their driving factors: a case study of migration
(Wien : Springer, 2022) Chen, Yiyi; Sack, Harald; Alam, Mehwish
Among other ways of expressing opinions on media such as blogs and forums, social media (such as Twitter) has become one of the most widely used channels by populations for expressing their opinions. With an increasing interest in the topic of migration in Europe, it is important to process and analyze these opinions. To this end, this study aims at measuring the public attitudes toward migration in terms of sentiments and hate speech from a large number of tweets crawled on the decisive topic of migration. This study introduces a knowledge base (KB) of anonymized migration-related annotated tweets termed MigrationsKB (MGKB). The tweets from 2013 to July 2021 in the European countries that are hosts of immigrants are collected, pre-processed, and filtered using advanced topic modeling techniques. BERT-based entity linking and sentiment analysis, complemented by attention-based hate speech detection, are performed to annotate the curated tweets. Moreover, external databases are used to identify the potential social and economic factors causing negative public attitudes toward migration. The analysis aligns with the hypothesis that the countries with more migrants have fewer negative and hateful tweets. To further promote research in the interdisciplinary fields of social sciences and computer science, the outcomes are integrated into MGKB, which significantly extends the existing ontology to consider the public attitudes toward migrations and economic indicators. This study further discusses the use-cases and exploitation of MGKB. Finally, MGKB is made publicly available, fully supporting the FAIR principles.
Challenges of Applying Knowledge Graph and their Embeddings to a Real-world Use-case
(Aachen, Germany : RWTH Aachen, 2021) Petzold, Rick; Gesese, Genet Asefa; Bogdanova, Viktoria; Zylowski, Thorsten; Sack, Harald; Alam, Mehwish; Alam, Mehwish; Buscaldi, Davide; Cochez, Michael; Osborne, Francesco; Reforgiato Recupero, Diego; Sack, Harald
Different Knowledge Graph Embedding (KGE) models have been proposed so far which are trained on some specific KG completion tasks such as link prediction and evaluated on datasets which are mainly created for such purpose. Mostly, the embeddings learnt on link prediction tasks are not applied for downstream tasks in real-world use-cases such as data available in different companies/organizations. In this paper, the challenges with enriching a KG which is generated from a real-world relational database (RDB) about companies, with information from external sources such as Wikidata and learning representations for the KG are presented. Moreover, a comparative analysis is presented between the KGEs and various text embeddings on some downstream clustering tasks. The results of experiments indicate that in use-cases like the one used in this paper, where the KG is highly skewed, it is beneficial to use text embeddings or language models instead of KGEs.
Contextual Language Models for Knowledge Graph Completion
(Aachen, Germany : RWTH Aachen, 2021) Biswas, Russa; Sofronova, Radina; Alam, Mehwish; Sack, Harald; Alam, Mehwish; Ali, Mehdi; Groth, Paul; Hitzler, Pascal; Lehmann, Jens; Paulheim, Heiko; Rettinger, Achim; Sack, Harald; Sadeghi, Afshin; Tresp, Volker
Knowledge Graphs (KGs) have become the backbone of various machine learning based applications over the past decade. However, the KGs are often incomplete and inconsistent. Several representation learning based approaches have been introduced to complete the missing information in KGs. Besides, Neural Language Models (NLMs) have gained huge momentum in NLP applications. However, exploiting the contextual NLMs to tackle the Knowledge Graph Completion (KGC) task is still an open research problem. In this paper, a GPT-2 based KGC model is proposed and is evaluated on two benchmark datasets. The initial results obtained from the fine-tuning of the GPT-2 model for triple classification strengthen the importance of using NLMs for KGC. Also, the impact of contextual language models for KGC has been discussed.
Diving into Knowledge Graphs for Patents: Open Challenges and Benefits
(Aachen, Germany : RWTH Aachen, 2023) Dessi, Danilo; Dessi, Rima; Alam, Mehwish; Trojahn, Cassia; Hertling, Sven; Pesquita, Catia; Aebeloe, Christian; Aras, Hidir; Azzam, Amr; Cano, Juan; Domingue, John; Gottschalk, Simon; Hartig, Olaf; Hose, Katja; Kirrane, Sabrina; Lisena, Pasquale; Osborne, Francesco; Rohde, Philipp; Steels, Luc; Taelman, Ruben; Third, Aisling; Tiddi, Ilaria; Türker, Rima
Textual documents are the means of sharing information and preserving knowledge for a large variety of domains. The patent domain also uses this paradigm, which is becoming difficult to maintain and limits the potential of advanced AI systems for domain analysis. To overcome this issue, approaches that transform textual representations into Knowledge Graphs (KGs) are increasingly common. In this position paper, we discuss KGs within the patent domain, present their challenges, and envision the benefits of such technologies for this domain. In addition, this paper provides insights into such KGs by reproducing an existing pipeline to create KGs and applying it to patents in the computer science domain.
Editorial of the Special issue on Cultural heritage and semantic web
(Amsterdam : IOS Press, 2022) Alam, Mehwish; de Boer, Victor; Daga, Enrico; van Erp, Marieke; Hyvönen, Eero; Meroño-Peñuela, Albert
[no abstract available]
Exploring the Impact of Negative Sampling on Patent Citation Recommendation
(Paris : CNRS, 2023) Dessi, Rima; Aras, Hidir; Alam, Mehwish
Due to the increasing number of patents being published every day, patent citation recommendation has become a challenging task. Since patent citations may lead to legal and economic consequences, patent recommendations are even more challenging than scientific article citations. One of the crucial components of the patent citation algorithm is negative sampling, which is also a part of many other tasks such as text classification, knowledge graph completion, etc. This paper particularly focuses on proposing a transformer-based ranking model for patent recommendations. It further experimentally compares the performance of patent recommendations based on various state-of-the-art negative sampling approaches to measure and compare the effectiveness of these approaches to aid future developments. These experiments are performed on a newly collected dataset of US patents from Google Patents.
Further with Knowledge Graphs. Proceedings of the 17th International Conference on Semantic Systems
(Berlin : AKA ; Amsterdam : IOS Press, 2021) Alam, Mehwish; Groth, Paul; de Boer, Victor; Pellegrini, Tassilo; Pandit, Harshvardhan J.; Montiel, Elena; Rodríguez-Doncel, Victor; McGillivray, Barbara; Meroño-Peñuela, Albert
The field of semantic computing is highly diverse, linking areas such as artificial intelligence, data science, knowledge discovery and management, big data analytics, e-commerce, enterprise search, technical documentation, document management, business intelligence, and enterprise vocabulary management. As such it forms an essential part of the computing technology that underpins all our lives today. This volume presents the proceedings of SEMANTiCS 2021, the 17th International Conference on Semantic Systems. As a result of the continuing Coronavirus restrictions, SEMANTiCS 2021 was held in a hybrid form in Amsterdam, the Netherlands, from 6 to 9 September 2021. The annual SEMANTiCS conference provides an important platform for semantic computing professionals and researchers, and attracts information managers, IT architects, software engineers, and researchers from a wide range of organizations, such as research facilities, NPOs, public administrations and the largest companies in the world. The subtitle of the 2021 conference was “In the Era of Knowledge Graphs”, and 66 submissions were received, from which the 19 papers included here were selected following a rigorous single-blind reviewing process; an acceptance rate of 29%. Topics covered include data science, machine learning, logic programming, content engineering, social computing, and the Semantic Web, as well as the additional sub-topics of digital humanities and cultural heritage, legal tech, and distributed and decentralized knowledge graphs. Providing an overview of current research and development, the book will be of interest to all those working in the field of semantic systems.
Improving Language Model Predictions via Prompts Enriched with Knowledge Graphs
(Aachen, Germany : RWTH Aachen, 2023) Brate, Ryan; Minh-Dang, Hoang; Hoppe, Fabian; He, Yuan; Meroño-Peñuela, Albert; Sadashivaiah, Vijay; Alam, Mehwish; Buscaldi, Davide; Cochez, Michael; Osborne, Francesco; Reforgiato Recupero, Diego
Despite advances in deep learning and knowledge graphs (KGs), using language models for natural language understanding and question answering remains a challenging task. Pre-trained language models (PLMs) have been shown to leverage contextual information to complete cloze prompts and to perform next-sentence completion and question answering tasks in various domains. Unlike structured data querying in e.g. KGs, mapping an input question to data that may or may not be stored by the language model is not a simple task. Recent studies have highlighted the improvements that can be made to the quality of information retrieved from PLMs by adding auxiliary data to otherwise naive prompts. In this paper, we explore the effects of enriching prompts with additional contextual information leveraged from the Wikidata KG on language model performance. Specifically, we compare the performance of naive vs. KG-engineered cloze prompts for entity genre classification in the movie domain. Selecting a broad range of commonly available Wikidata properties, we show that enrichment of cloze-style prompts with Wikidata information can result in a significantly higher recall for the investigated BERT and RoBERTa large PLMs. However, it is also apparent that the optimum level of data enrichment differs between models.
Machine Learning with Symbolic Methods and Knowledge Graphs
(Aachen : RWTH Aachen, 2021) Alam, Mehwish; Ali, Mehdi; Groth, Paul; Hitzler, Pascal; Lehmann, Jens; Paulheim, Heiko; Rettinger, Achim; Sack, Harald; Sadeghi, Afshin; Tresp, Volker
[no abstract available]
Ontology Modelling for Materials Science Experiments
(Aachen, Germany : RWTH Aachen, 2021) Alam, Mehwish; Birkholz, Henk; Dessì, Danilo; Eberl, Christoph; Fliegl, Heike; Gumbsch, Peter; von Hartrott, Philipp; Mädler, Lutz; Niebel, Markus; Sack, Harald; Thomas, Akhil; Tiddi, Ilaria; Maleshkova, Maria; Pellegrini, Tassilo; de Boer, Victor
Materials are either enabler or bottleneck for the vast majority of technological innovations. The digitization of materials and processes is mandatory to create live production environments which represent physical entities and their aggregations and thus allow to represent, share, and understand materials changes. However, a common standard formalization for materials knowledge in the form of taxonomies, ontologies, or knowledge graphs has not been achieved yet. This paper sketches the efforts in modelling an ontology prototype to describe Materials Science experiments. It describes what is expected from the ontology by introducing a use case where a process chain driven by the ontology enables the curation and understanding of experiments.
Proceedings of the Workshop on Deep Learning for Knowledge Graphs (DL4KG 2021)
(Aachen : RWTH Aachen, 2021) Alam, Mehwish; Buscaldi, Davide; Cochez, Michael; Osborne, Francesco; Reforgiato Recupero, Diego; Sack, Harald
[no abstract available]
Semantic role labeling for knowledge graph extraction from text
(Berlin ; Heidelberg : Springer, 2021) Alam, Mehwish; Gangemi, Aldo; Presutti, Valentina; Reforgiato Recupero, Diego
This paper introduces TakeFive, a new semantic role labeling method that transforms a text into a frame-oriented knowledge graph. It performs dependency parsing, identifies the words that evoke lexical frames, locates the roles and fillers for each frame, runs coercion techniques, and formalizes the results as a knowledge graph. This formal representation complies with the frame semantics used in Framester, a factual-linguistic linked data resource. We tested our method on the WSJ section of the Penn Treebank annotated with VerbNet and PropBank labels and on the Brown corpus. The evaluation has been performed according to the CoNLL Shared Task on Joint Parsing of Syntactic and Semantic Dependencies. The obtained precision, recall, and F1 values indicate that TakeFive is competitive with other existing methods such as SEMAFOR, Pikes, PathLSTM, and FRED. We finally discuss how to combine TakeFive and FRED, obtaining higher values of precision, recall, and F1 measure.
Special issue on conceptual structures
(Dordrecht [u.a.] : Springer Science + Business Media B.V, 2022) Alam, Mehwish; Braun, Tanya; Endres, Dominik; Yun, Bruno
[no abstract available]
Steps towards a Dislocation Ontology for Crystalline Materials
(Aachen, Germany : RWTH Aachen, 2021) Ihsan, Ahmad Zainul; Dessì, Danilo; Alam, Mehwish; Sack, Harald; Sandfeld, Stefan; García-Castro, Raúl; Davies, John; Antoniou, Grigoris; Fortuna, Carolina
The field of Materials Science is concerned with, e.g., properties and performance of materials. An important class of materials are crystalline materials that usually contain “dislocations” - a line-like defect type. Dislocations decisively determine many important materials properties. Over the past decades, significant effort was put into understanding dislocation behavior across different length scales both with experimental characterization techniques as well as with simulations. However, for describing such dislocation structures there is still a lack of a common standard to represent and to connect dislocation domain knowledge across different but related communities. An ontology offers a common foundation to enable knowledge representation and data interoperability, which are important components to establish a “digital twin”. This paper outlines the first steps towards the design of an ontology in the dislocation domain and shows a connection with the already existing ontologies in the materials science and engineering domain.
Temporal Evolution of the Migration-related Topics on Social Media
(Aachen, Germany : RWTH Aachen, 2021) Chen, Yiyi; Gesese, Genet Asefa; Sack, Harald; Alam, Mehwish; Seneviratne, Oshani; Pesquita, Catia; Sequeda, Juan; Etcheverry, Lorena
This poster focuses on capturing the temporal evolution of migration-related topics in relevant tweets. It uses the Dynamic Embedded Topic Model (DETM) as a learning algorithm to perform a quantitative and qualitative analysis of these emerging topics. TweetsKB is extended with the extracted Twitter dataset along with the results of DETM, which considers temporality. These results are then further analyzed and visualized. The analysis reveals that the trajectories of the migration-related topics are in agreement with historical events. The source code is available online: https://bit.ly/3dN9ICB.
Towards Analyzing the Bias of News Recommender Systems Using Sentiment and Stance Detection
(New York, NY, United States : Association for Computing Machinery, 2022) Alam, Mehwish; Iana, Andreea; Grote, Alexander; Ludwig, Katharina; Müller, Philipp; Paulheim, Heiko; Laforest, Frédérique; Troncy, Raphael; Médini, Lionel; Herman, Ivan
News recommender systems are used by online news providers to alleviate information overload and to provide personalized content to users. However, algorithmic news curation has been hypothesized to create filter bubbles and to intensify users' selective exposure, potentially increasing their vulnerability to polarized opinions and fake news. In this paper, we show how information on news items' stance and sentiment can be utilized to analyze and quantify the extent to which recommender systems suffer from biases. To that end, we have annotated a German news corpus on the topic of migration using stance detection and sentiment analysis. In an experimental evaluation with four different recommender systems, our results show a slight tendency of all four models for recommending articles with negative sentiments and stances against the topic of refugees and migration. Moreover, we observed a positive correlation between the sentiment and stance bias of the text-based recommenders and the preexisting user bias, which indicates that these systems amplify users' opinions and decrease the diversity of recommended news. The knowledge-aware model appears to be the least prone to such biases, at the cost of predictive accuracy.
Understanding Class Representations: An Intrinsic Evaluation of Zero-Shot Text Classification
(Aachen, Germany : RWTH Aachen, 2021) Hoppe, Fabian; Dessì, Danilo; Sack, Harald; Alam, Mehwish; Buscaldi, Davide; Cochez, Michael; Osborne, Francesco; Reforgiato Recupero, Diego; Sack, Harald
Frequently, Text Classification is limited by insufficient training data. This problem is addressed by Zero-Shot Classification through the inclusion of external class definitions and then exploiting the relations between classes seen during training and unseen classes (Zero-shot). However, it requires a class embedding space capable of accurately representing the semantic relatedness between classes. This work defines an intrinsic evaluation based on greater-than constraints to provide a better understanding of this relatedness. The results imply that textual embeddings are able to capture more semantics than Knowledge Graph embeddings, but combining both modalities yields the best performance.
