Search Results

Now showing 1 - 10 of 10

An AI-based open recommender system for personalized labor market driven education

2022, Tavakoli, Mohammadreza, Faraji, Abdolali, Vrolijk, Jarno, Molavi, Mohammadreza, Mol, Stefan T., Kismihók, Gábor

Attaining skills that match labor market demand is becoming increasingly complicated, not least in engineering education, as prerequisite knowledge, skills, and abilities evolve dynamically through an uncontrollable and seemingly unpredictable process. Anticipating and addressing such dynamism is a fundamental challenge for twenty-first century education. The burgeoning availability of data, not only on the demand side but also on the supply side (in the form of open educational resources), coupled with smart technologies, may provide fertile ground for addressing this challenge. In this paper, we propose a novel, Artificial Intelligence (AI) driven approach to the development of an open, personalized, and labor market oriented learning recommender system, called eDoer. We discuss the complete system development cycle, starting with systematic user requirements gathering, followed by system design, implementation, and validation. Our recommender prototype (1) derives the skill requirements for particular occupations through an analysis of online job vacancy announcements

Compact representations for efficient storage of semantic sensor data

2021, Karim, Farah, Vidal, Maria-Esther, Auer, Sören

The volume of sensor data generated by a wide variety of sensors and devices is increasing rapidly. Data semantics facilitate information exchange, adaptability, and interoperability among sensors and devices. Sensor data and their meaning can be described using ontologies, e.g., the Semantic Sensor Network (SSN) Ontology. However, once semantically enriched, sensor data are substantially larger than the raw sensor data. Moreover, sensors can observe the same measurement value several times, producing a huge number of repeated facts. We propose a compact, or factorized, representation of semantic sensor data, in which repeated measurement values are described only once. These compact representations improve the storage and processing of semantic sensor data. To scale up to large datasets, factorization-based tabular representations are exploited to store and manage factorized semantic sensor data using Big Data technologies. We empirically study the effectiveness of the proposed compact representations of semantic sensor data and their impact on query processing. Additionally, we evaluate the effects of storing the proposed representations on diverse RDF implementations. Results suggest that the proposed compact representations improve the storage and query processing of sensor data over diverse RDF implementations and can reduce query execution time by up to two orders of magnitude.
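The idea of describing a repeated measurement value only once can be illustrated with a minimal sketch. This is not the authors' implementation: the tuple layout `(sensor, property, value, timestamp)`, the function names `factorize` and `expand`, and the sample readings are all assumptions made for illustration.

```python
from collections import defaultdict


def factorize(observations):
    """Store each distinct (sensor, property, value) once, together with
    the list of timestamps at which it was observed."""
    groups = defaultdict(list)
    for sensor, prop, value, timestamp in observations:
        groups[(sensor, prop, value)].append(timestamp)
    return dict(groups)


def expand(factorized):
    """Recover the original observations from the factorized form."""
    return sorted(
        (sensor, prop, value, ts)
        for (sensor, prop, value), stamps in factorized.items()
        for ts in stamps
    )


# Hypothetical readings: the value 21.5 is observed three times.
observations = [
    ("s1", "temperature", 21.5, "2021-01-01T00:00"),
    ("s1", "temperature", 21.5, "2021-01-01T00:10"),
    ("s1", "temperature", 21.5, "2021-01-01T00:20"),
    ("s1", "temperature", 22.0, "2021-01-01T00:30"),
]
compact = factorize(observations)
```

In this sketch the four observations collapse into two measurement descriptions, and `expand` shows that no information is lost.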

Resorting to Context-Aware Background Knowledge for Unveiling Semantically Related Social Media Posts

2022, Sakor, Ahmad, Singh, Kuldeep, Vidal, Maria-Esther

Social media networks have become a prime source for sharing news, opinions, and research accomplishments in various domains, and hundreds of millions of posts are published daily. Given this wealth of information, finding related posts has become a relevant task, particularly for trending topics (e.g., COVID-19 or lung cancer). To facilitate the search for connected posts, social networks enable users to annotate their posts, e.g., with hashtags in tweets. Although effective, annotation-based search is limited because results only include posts that share the same annotations. This paper focuses on retrieving context-related posts based on a specific topic and presents PINYON, a knowledge-driven framework that retrieves associated posts effectively. PINYON implements a two-fold pipeline. First, in an encoding phase, it represents a corpus of posts and an input post as a graph; posts are annotated with entities from existing knowledge graphs and connected based on the similarity of their entities. Second, in a decoding phase, the encoded graph is used to discover communities of related posts. We cast this problem as the Vertex Coloring Problem, where communities of similar posts include the posts annotated with entities colored with the same colors. Building on results from graph theory, PINYON implements the decoding phase guided by a heuristic-based method that determines relatedness among posts based on contextual knowledge and efficiently groups the most similar posts in the same communities. PINYON is empirically evaluated on various datasets and compared with state-of-the-art implementations of the decoding phase. The quality of the generated communities is also analyzed based on multiple metrics. The observed outcomes indicate that PINYON accurately identifies semantically related posts in different contexts. Moreover, the reported results put into perspective the impact of known properties about the optimality of existing heuristics for graph vertex coloring and their implications for PINYON's scalability.
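As a rough illustration of heuristic vertex coloring, the sketch below applies a generic greedy, largest-degree-first heuristic. It is not PINYON's decoding method. The graph is an assumed conflict graph in which an edge joins two posts judged dissimilar, so posts that receive the same color can be grouped together; the post identifiers and the function name `greedy_coloring` are hypothetical.

```python
def greedy_coloring(graph):
    """Give each vertex the smallest color not used by its colored neighbors.
    Vertices are visited in order of decreasing degree."""
    colors = {}
    for vertex in sorted(graph, key=lambda v: len(graph[v]), reverse=True):
        used = {colors[n] for n in graph[vertex] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[vertex] = color
    return colors


# Hypothetical conflict graph: an edge means "these posts are dissimilar".
conflicts = {
    "p1": {"p3", "p4"},
    "p2": {"p3", "p4"},
    "p3": {"p1", "p2"},
    "p4": {"p1", "p2"},
}
coloring = greedy_coloring(conflicts)

# Posts that share a color form one community.
communities = {}
for post, color in coloring.items():
    communities.setdefault(color, set()).add(post)
```

Greedy heuristics of this kind are fast but do not guarantee the minimum number of colors, which is one reason the choice of heuristic affects both quality and scalability.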

Bias in data-driven artificial intelligence systems - An introductory survey

2020, Ntoutsi, E., Fafalios, P., Gadiraju, U., Iosifidis, V., Nejdl, W., Vidal, Maria-Esther, Ruggieri, S., Turini, F., Papadopoulos, S., Krasanakis, E., Kompatsiaris, I., Kinder-Kurlanda, K., Wagner, C., Karimi, F., Fernandez, M., Alani, H., Berendt, B., Kruegel, T., Heinze, C., Broelemann, K., Kasneci, G., Tiropanis, T., Staab, S.

Artificial Intelligence (AI)-based systems are widely employed nowadays to make decisions that have far-reaching impact on individuals and society. Their decisions might affect everyone, everywhere, and anytime, entailing concerns about potential human rights issues. Therefore, it is necessary to move beyond traditional AI algorithms optimized for predictive performance and embed ethical and legal principles in their design, training, and deployment to ensure social good while still benefiting from the huge potential of AI technology. The goal of this survey is to provide a broad multidisciplinary overview of the area of bias in AI systems, focusing on technical challenges and solutions, and to suggest new research directions towards approaches well-grounded in a legal frame. In this survey, we focus on data-driven AI, as a large part of AI is powered nowadays by (big) data and powerful machine learning algorithms. Unless otherwise specified, we use the general term bias to describe problems related to the gathering or processing of data that might result in prejudiced decisions on the basis of demographic features such as race, sex, and so forth. This article is categorized under: Commercial, Legal, and Ethical Issues > Fairness in Data Mining; Commercial, Legal, and Ethical Issues > Ethical Considerations; Commercial, Legal, and Ethical Issues > Legal Issues.

Transforming the study of organisms: Phenomic data models and knowledge bases

2020, Thessen, Anne E., Walls, Ramona L., Vogt, Lars, Singer, Jessica, Warren, Robert, Buttigieg, Pier Luigi, Balhoff, James P., Mungall, Christopher J., McGuinness, Deborah L., Stucky, Brian J., Yoder, Matthew J., Haendel, Melissa A.

The rapidly decreasing cost of gene sequencing has resulted in a deluge of genomic data from across the tree of life; however, outside a few model organism databases, genomic data are limited in their scientific impact because they are not accompanied by computable phenomic data. The majority of phenomic data are contained in countless small, heterogeneous phenotypic data sets that are very difficult or impossible to integrate at scale because of variable formats, lack of digitization, and linguistic problems. One powerful solution is to represent phenotypic data using data models with precise, computable semantics, but adoption of semantic standards for representing phenotypic data has been slow, especially in biodiversity and ecology. Some phenotypic and trait data are available in a semantic language from knowledge bases, but these are often not interoperable. In this review, we will compare and contrast existing ontology and data models, focusing on nonhuman phenotypes and traits. We discuss barriers to integration of phenotypic data and make recommendations for developing an operationally useful, semantically interoperable phenotypic data ecosystem.

Analysing the requirements for an Open Research Knowledge Graph: use cases, quality requirements, and construction strategies

2021, Brack, Arthur, Hoppe, Anett, Stocker, Markus, Auer, Sören, Ewerth, Ralph

Current science communication has a number of drawbacks and bottlenecks that have recently been the subject of discussion. Among others, the rising number of published articles makes it nearly impossible to get a full overview of the state of the art in a given field, and reproducibility is hampered by fixed-length, document-based publications, which normally cannot cover all details of a research work. Recently, several initiatives have proposed knowledge graphs (KGs) for organising scientific information as a solution to many of the current issues. The focus of these proposals is, however, usually restricted to very specific use cases. In this paper, we aim to transcend this limited perspective and present a comprehensive analysis of requirements for an Open Research Knowledge Graph (ORKG) by (a) collecting and reviewing daily core tasks of a scientist, (b) establishing their consequential requirements for a KG-based system, (c) identifying overlaps and specificities, and their coverage in current solutions. As a result, we map necessary and desirable requirements for successful KG-based science communication, derive implications, and outline possible solutions.

A multi-method psychometric assessment of the affinity for technology interaction (ATI) scale

2020, Lezhnina, Olga, Kismihók, Gábor

In order to develop valid and reliable instruments, psychometric validation should be conducted as an iterative process that “requires a multi-method assessment” (Schimmack, 2019, p. 4). In this study, a multi-method psychometric approach was applied to a recently developed and validated scale, the Affinity for Technology Interaction (ATI) scale (Franke, Attig, & Wessel, 2018). The dataset (N = 240) shared by the authors of the scale (Franke et al., 2018) was used. Construct validity of the ATI was explored by means of hierarchical clustering on variables, and its psychometric properties were analysed in accordance with an extended psychometric protocol (Dima, 2018) by methods of Classical Test Theory (CTT) and Item Response Theory (IRT). The results showed that the ATI is a unidimensional scale (homogeneity H = 0.55) with excellent reliability (ω = 0.90 [0.88-0.92]) and construct validity. Suggestions for further improvement of the ATI scale and the psychometric protocol were made.

Deutschsprachige Game Studies 2021 – 2031: eine Vorausschau (German-Language Game Studies 2021 – 2031: A Forecast)

2021, Inderst, Rudolf, Heller, Lambert

Rudolf Inderst and Lambert Heller raise the fundamental question of whether text is the right form at all for engaging with digital games in a scholarly way. They argue for establishing and using the video essay as a form, which, by virtue of its audiovisual materiality, is better suited to the subject.

Multimodal news analytics using measures of cross-modal entity and context consistency

2021, Müller-Budack, Eric, Theiner, Jonas, Diering, Sebastian, Idahl, Maximilian, Hakimov, Sherzod, Ewerth, Ralph

The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., text supplemented with photographs, is typically used to convey the news more effectively or to attract attention. The photographs can be decorative or depict additional details, but might also contain misleading information. Quantifying the cross-modal consistency of entity representations can assist human assessors in evaluating the overall multimodal message. In some cases, such measures might give hints for detecting fake news, which is an increasingly important topic in today’s society. In this paper, we present a multimodal approach to quantify the entity coherence between image and text in real-world news. Named entity linking is applied to extract persons, locations, and events from news texts. Several measures are suggested to calculate the cross-modal similarity of the entities in text and photograph by exploiting state-of-the-art computer vision approaches. In contrast to previous work, our system automatically acquires example data from the Web and is applicable to real-world news. Moreover, an approach that quantifies contextual image-text relations is introduced. The feasibility is demonstrated on two datasets that cover different languages, topics, and domains.

Enhancing Virtual Ontology Based Access over Tabular Data with Morph-CSV

2020, Chaves-Fraga, David, Ruckhaus, Edna, Priyatna, Freddy, Vidal, Maria-Esther, Corcho, Oscar

Ontology-Based Data Access (OBDA) has traditionally focused on providing a unified view of heterogeneous datasets, either by materializing integrated data into RDF or by performing on-the-fly querying via SPARQL query translation. In the specific case of tabular datasets represented as several CSV or Excel files, query translation approaches have been applied by considering each source as a single table that can be loaded into a relational database management system (RDBMS). Nevertheless, constraints over these tables are not represented; thus, neither consistency among attributes nor indexes over tables are enforced. As a consequence, the efficiency of the SPARQL-to-SQL translation process may be affected, as well as the completeness of the answers produced during the evaluation of the generated SQL query. Our work focuses on applying implicit constraints in the OBDA query translation process over tabular data. We propose Morph-CSV, a framework for querying tabular data that exploits information from typical OBDA inputs (e.g., mappings, queries) to enforce constraints that can be used together with any SPARQL-to-SQL OBDA engine. Morph-CSV relies on both a constraint component and a set of constraint operators. For a given set of constraints, the operators are applied to each type of constraint with the aim of enhancing query completeness and performance. We evaluate Morph-CSV in several domains: e-commerce with the BSBM benchmark; transportation with a benchmark using the GTFS dataset from the Madrid subway; and biology with a use case extracted from the Bio2RDF project. We compare and report the performance of two SPARQL-to-SQL OBDA engines, with and without the incorporation of Morph-CSV. The observed results suggest that Morph-CSV is able to speed up the total query execution time by up to two orders of magnitude, while still producing all the query answers.
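To make the role of constraints concrete, the sketch below loads a small CSV into a relational table with explicit datatypes, a primary key, and an index, rather than as untyped text columns. It does not use Morph-CSV or its API; it relies only on Python's standard `csv` and `sqlite3` modules, and the table name, column names, and sample rows are invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; note the duplicated second row.
raw = "stop_id,stop_name,lat\n1,Sol,40.4169\n2,Atocha,40.4066\n2,Atocha,40.4066\n"

conn = sqlite3.connect(":memory:")
# Constraints made explicit: datatypes, a primary key, and NOT NULL.
conn.execute(
    "CREATE TABLE stops ("
    "stop_id INTEGER PRIMARY KEY, stop_name TEXT NOT NULL, lat REAL)"
)
# An index on a column that queries are assumed to filter on.
conn.execute("CREATE INDEX idx_stops_name ON stops (stop_name)")

rows = [
    (int(r["stop_id"]), r["stop_name"], float(r["lat"]))
    for r in csv.DictReader(io.StringIO(raw))
]
# The primary key lets the database discard the duplicate row.
conn.executemany("INSERT OR IGNORE INTO stops VALUES (?, ?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM stops").fetchone()[0]
lat_type = conn.execute("SELECT typeof(lat) FROM stops LIMIT 1").fetchone()[0]
```

With the constraints in place, the duplicate row is rejected and numeric comparisons on `lat` behave as numbers rather than strings, which is the kind of effect on completeness and performance the abstract describes.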