Search Results

  • Item
    Comparative Verification of the Digital Library of Mathematical Functions and Computer Algebra Systems
    (Berlin ; Heidelberg : Springer, 2022) Greiner-Petter, André; Cohl, Howard S.; Youssef, Abdou; Schubotz, Moritz; Trost, Avi; Dey, Rajen; Aizawa, Akiko; Gipp, Bela; Fisman, Dana; Rosu, Grigore
    Digital mathematical libraries assemble the knowledge of years of mathematical research. Numerous disciplines (e.g., physics, engineering, pure and applied mathematics) rely heavily on compendia of gathered findings. Likewise, modern research applications rely more and more on computational solutions, which are often calculated and verified by computer algebra systems. Hence, the correctness, accuracy, and reliability of both digital mathematical libraries and computer algebra systems are crucial attributes for modern research. In this paper, we present a novel approach to verify a digital mathematical library and two computer algebra systems with one another by converting mathematical expressions from one system to the other. We use our previously developed conversion tool to translate formulae from the NIST Digital Library of Mathematical Functions to the computer algebra systems Maple and Mathematica. The contributions of our presented work are as follows: (1) we present the most comprehensive verification of computer algebra systems and digital mathematical libraries with one another; (2) we significantly enhance the performance of the underlying translator in terms of coverage and accuracy; and (3) we provide open access to translations for Maple and Mathematica of the formulae in the NIST Digital Library of Mathematical Functions.
  • Item
    Collaborative annotation and semantic enrichment of 3D media
    (New York, NY, United States : Association for Computing Machinery, 2022) Rossenova, Lozana; Schubert, Zoe; Vock, Richard; Sohmen, Lucia; Günther, Lukas; Duchesne, Paul; Blümel, Ina; Aizawa, Akiko
    A new FOSS (free and open source software) toolchain and associated workflow are being developed in the context of NFDI4Culture, a German consortium of research and cultural heritage institutions working towards a shared infrastructure for research data that meets the needs of 21st-century data creators, maintainers and end users across the broad spectrum of the digital libraries and archives field, and the digital humanities. This short paper and demo present how the integrated toolchain connects: 1) OpenRefine - for data reconciliation and batch upload; 2) Wikibase - for linked open data (LOD) storage; and 3) Kompakkt - for rendering and annotating 3D models. The presentation is aimed at librarians, digital curators and data managers interested in learning how to manage research datasets containing 3D media, and how to make them available within an open data environment with 3D-rendering and collaborative annotation features.
  • Item
    Data Protection Impact Assessments in Practice: Experiences from Case Studies
    (Berlin ; Heidelberg : Springer, 2022) Friedewald, Michael; Schiering, Ina; Martin, Nicholas; Hallinan, Dara; Katsikas, Sokratis; Lambrinoudakis, Costas; Cuppens, Nora; Mylopoulos, John; Kalloniatis, Christos; Meng, Weizhi; Furnell, Steven; Pallas, Frank; Pohle, Jörg; Sasse, M. Angela; Abie, Habtamu; Ranise, Silvio; Verderame, Luca; Cambiaso, Enrico; Vidal, Jorge Maestre; Monge, Marco Antonio Sotelo
    In the context of the project "A Data Protection Impact Assessment (DPIA) Tool for Practical Use in Companies and Public Administration", an operationalization for Data Protection Impact Assessments was developed based on the approach of Forum Privatheit. This operationalization was tested and refined during twelve tests with startups, small and medium-sized enterprises, corporations and public bodies. This paper presents the operationalization and summarizes the experience from the tests.
  • Item
    Meetings and Mood - Related or Not? Insights from Student Software Projects
    (New York : Association for Computing Machinery, 2022) Klünder, Jil; Karras, Oliver; Madeiral, Fernanda; Lassenius, Casper
    [Background:] Teamwork, coordination, and communication are prerequisites for the timely completion of a software project. Meetings as a facilitator for coordination and communication are an established medium for information exchange. Analyses of meetings in software projects have shown that certain interactions in these meetings, such as proactive statements followed by supportive ones, influence the mood and motivation of a team, which in turn affects its productivity. So far, however, research has focused only on certain interactions at a detailed level, requiring a complex and fine-grained analysis of a meeting itself. [Aim:] In this paper, we investigate meetings from a more abstract perspective, focusing on the polarity of the statements, i.e., whether they appear to be positive, negative, or neutral. [Method:] We analyze the relationship between the polarity of statements in meetings and different social aspects, including conflicts as well as the mood before and after a meeting. [Results:] Our results emerge from 21 student software project meetings and show some interesting insights: (1) positive mood before a meeting is related to the number of positive statements both at the beginning and throughout the whole meeting, (2) negative mood before the meeting only influences the number of negative statements in the first quarter of the meeting, but not in the whole meeting, and (3) the number of positive and negative statements during the meeting has no influence on the mood afterwards. [Conclusions:] We conclude that the behaviour in meetings might influence short-term emotional states (feelings) rather than long-term emotional states (mood), which are more important for the project.
  • Item
    TinyGenius: Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation
    (New York, NY, United States : Association for Computing Machinery, 2022) Oelen, Allard; Stocker, Markus; Auer, Sören; Aizawa, Akiko
    As the number of published scholarly articles grows steadily each year, new methods are needed to organize scholarly knowledge so that it can be more efficiently discovered and used. Natural Language Processing (NLP) techniques are able to autonomously process scholarly articles at scale and to create machine-readable representations of the article content. However, autonomous NLP methods are far from sufficiently accurate to create a high-quality knowledge graph. Yet quality is crucial for the graph to be useful in practice. We present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. The scholarly context in which the crowd workers operate poses multiple challenges. The explainability of the employed NLP methods is crucial to provide context in order to support the decision process of crowd workers. We employed TinyGenius to populate a paper-centric knowledge graph, using five distinct NLP methods. In the end, the resulting knowledge graph serves as a digital library for scholarly articles.
  • Item
    PowerDuck: A GOOSE Data Set of Cyberattacks in Substations
    (New York City : ACM, 2022-08-08) Zemanek, Sven; Hacker, Immanuel; Wolsing, Konrad; Wagner, Eric; Henze, Martin; Serror, Martin
    Power grids worldwide are increasingly victims of cyberattacks, where attackers can cause immense damage to critical infrastructure. The growing digitalization and networking in power grids combined with insufficient protection against cyberattacks further exacerbate this trend. Hence, security engineers and researchers must counter these new risks by continuously improving security measures. Data sets of real network traffic during cyberattacks play a decisive role in analyzing and understanding such attacks. Therefore, this paper presents PowerDuck, a publicly available security data set containing network traces of GOOSE communication in a physical substation testbed. The data set includes recordings of various scenarios with and without the presence of attacks. Furthermore, all network packets originating from the attacker are clearly labeled to facilitate their identification. We thus envision PowerDuck improving and complementing existing data sets of substations, which are often generated synthetically, thus enhancing the security of power grids.
  • Item
    Knowledge Extraction for Art History: the Case of Vasari’s The Lives of The Artists (1568)
    (Aachen, Germany : RWTH Aachen, 2022) Santini, Cristian; Tan, Mary Ann; Tietz, Tabea; Bruns, Oleksandra; Posthumus, Etienne; Sack, Harald; Paschke, Adrian; Rehm, Georg; Neudecker, Clemens; Pintscher, Lydia
    Knowledge Extraction (KE) techniques are used to convert unstructured information present in texts to Knowledge Graphs (KGs) which can be queried and explored. Despite their potential for cultural heritage domains, such as Art History, these techniques often encounter limitations if applied to domain-specific data. In this paper we present the main challenges that KE has to face on art-historical texts, using Giorgio Vasari's The Lives of The Artists as a case study. This paper discusses the following NLP tasks for art-historical texts: entity recognition and linking, coreference resolution, time extraction, motif extraction and artwork extraction. Several strategies to annotate art-historical data for these tasks and evaluate NLP models are also proposed.
  • Item
    On the Impact of Temporal Representations on Metaphor Detection
    (Paris : European Language Resources Association (ELRA), 2022) Ottolina, Giorgio; Palmonari, Matteo; Vimercati, Manuel; Alam, Mehwish; Calzolari, Nicoletta; Béchet, Frédéric; Blache, Philippe; Choukri, Khalid; Cieri, Christopher; Declerck, Thierry; Goggi, Sara; Isahara, Hitoshi; Maegaard, Bente; Mariani, Joseph; Mazo, Hélène; Odijk, Jan; Piperidis, Stelios
    State-of-the-art approaches for metaphor detection compare their literal - or core - meaning and their contextual meaning using metaphor classifiers based on neural networks. However, metaphorical expressions evolve over time due to various reasons, such as cultural and societal impact. Metaphorical expressions are known to co-evolve with language and literal word meanings, and even drive, to some extent, this evolution. This poses the question of whether different, possibly time-specific, representations of literal meanings may impact the metaphor detection task. To the best of our knowledge, this is the first study that examines the metaphor detection task with a detailed exploratory analysis where different temporal and static word embeddings are used to account for different representations of literal meanings. Our experimental analysis is based on three popular benchmarks used for metaphor detection and word embeddings extracted from different corpora and temporally aligned using different state-of-the-art approaches. The results suggest that the usage of different static word embedding methods does impact the metaphor detection task and some temporal word embeddings slightly outperform static methods. However, the results also suggest that temporal word embeddings may provide representations of the core meaning of the metaphor even too close to their contextual meaning, thus confusing the classifier. Overall, the interaction between temporal language evolution and metaphor detection appears tiny in the benchmark datasets used in our experiments. This suggests that future work for the computational analysis of this important linguistic phenomenon should first start by creating a new dataset where this interaction is better represented.
  • Item
    The Concept of Identifiability in ML Models
    (Setúbal : SciTePress - Science and Technology Publications, Lda., 2022) von Maltzan, Stephanie; Bastieri, Denis; Wills, Gary; Kacsuk, Péter; Chang, Victor
    Recent research indicates that the machine learning process can be reversed by adversarial attacks. These attacks can be used to derive personal information from the training. The supposedly anonymising machine learning process represents a process of pseudonymisation and is, therefore, subject to technical and organisational measures. Consequently, the unexamined belief in anonymisation as a guarantor for privacy cannot be easily upheld. It is, therefore, crucial to measure privacy through the lens of adversarial attacks and precisely distinguish what is meant by personal data and non-personal data and above all determine whether ML models represent pseudonyms from the training data.
  • Item
    Improving Zero-Shot Text Classification with Graph-based Knowledge Representations
    (Aachen, Germany : RWTH Aachen, 2022) Hoppe, Fabian; Hartig, Olaf; Seneviratne, Oshani
    Insufficient training data is a key challenge for text classification. In particular, long-tail class distributions and emerging, new classes do not provide any training data for specific classes. Therefore, such a zero-shot setting must incorporate additional, external knowledge to enable transfer learning by connecting the external knowledge of previously unseen classes to texts. Recent zero-shot text classifiers utilize only distributional semantics defined by large language models and based on class names or natural language descriptions. This implicit knowledge contains ambiguities, cannot capture logical relations, and is not an efficient representation of factual knowledge. These drawbacks can be avoided by introducing explicit, external knowledge. In particular, knowledge graphs provide such explicit, unambiguous, and complementary domain-specific knowledge. Hence, this thesis explores graph-based knowledge as an additional modality for zero-shot text classification. Besides a general investigation of this modality, the thesis explores how including domain-specific knowledge influences the ability to deal with domain shifts.