Search Results

  • Item
    Compact representations for efficient storage of semantic sensor data
    (Dordrecht : Springer Science + Business Media B.V, 2021) Karim, Farah; Vidal, Maria-Esther; Auer, Sören
    The volume of sensor data generated by a wide variety of sensors and devices is increasing rapidly. Data semantics facilitate information exchange, adaptability, and interoperability among sensors and devices. Sensor data and their meaning can be described using ontologies, e.g., the Semantic Sensor Network (SSN) Ontology. However, once semantically enriched, sensor data are substantially larger than the raw sensor data. Moreover, some measurement values can be observed by sensors several times, so a huge number of repeated facts about sensor data can be produced. We propose a compact or factorized representation of semantic sensor data, where repeated measurement values are described only once. These compact representations improve the storage and processing of semantic sensor data. To scale up to large datasets, factorization-based tabular representations are exploited to store and manage factorized semantic sensor data using Big Data technologies. We empirically study the effectiveness of the proposed compact representations of semantic sensor data and their impact on query processing. Additionally, we evaluate the effects of storing the proposed representations on diverse RDF implementations. Results suggest that the proposed compact representations improve the storage and query processing of sensor data over diverse RDF implementations, and that query execution time can be reduced by up to two orders of magnitude.
  • Item
    Persistent Identification Of Instruments
    (Ithaca : Cornell University, 2020) Stocker, Markus; Darroch, Louise; Krahl, Rolf; Habermann, Ted; Devaraju, Anusuriya; Schwardmann, Ulrich; D'Onofrio, Claudio; Häggström, Ingemar
    Instruments play an essential role in creating research data. Given the importance of instruments and associated metadata to the assessment of data quality and data reuse, globally unique, persistent and resolvable identification of instruments is crucial. The Research Data Alliance Working Group Persistent Identification of Instruments (PIDINST) developed a community-driven solution for persistent identification of instruments which we present and discuss in this paper. Based on an analysis of 10 use cases, PIDINST developed a metadata schema and prototyped schema implementation with DataCite and ePIC as representative persistent identifier infrastructures and with HZB (Helmholtz-Zentrum Berlin für Materialien und Energie) and BODC (British Oceanographic Data Centre) as representative institutional instrument providers. These implementations demonstrate the viability of the proposed solution in practice. Moving forward, PIDINST will further catalyse adoption and consolidate the schema by addressing new stakeholder requirements.
  • Item
    Das #vBIB20-Experiment: spontan, agil und virtuell
    (Heidelberg : Universitätsbibliothek, 2020) Bielesch, Stefan; Engelkenmeier, Ute; Kösters, Jens; Petri, Nicole; Stöhr, Matti; Stummeyer, Sabine
    After the cancellation of the 109th German Librarians' Day in Hannover, #vBIB20 took place from 26 to 28 May 2020 as a web conference, planned at short notice as an alternative. The article briefly examines, from the point of view of the organisers (TIB Hannover, Association of Information and Library Professionals BIB), the challenges and experiences in running this online-only conference, which was unprecedented on this scale in the German-speaking library community.
  • Item
    Enhancing Virtual Ontology Based Access over Tabular Data with Morph-CSV
    (Amsterdam : IOS Press, 2020) Chaves-Fraga, David; Ruckhaus, Edna; Priyatna, Freddy; Vidal, Maria-Esther; Corcho, Oscar
    Ontology-Based Data Access (OBDA) has traditionally focused on providing a unified view of heterogeneous datasets, either by materializing integrated data into RDF or by performing on-the-fly querying via SPARQL query translation. In the specific case of tabular datasets represented as several CSV or Excel files, query translation approaches have been applied by considering each source as a single table that can be loaded into a relational database management system (RDBMS). Nevertheless, constraints over these tables are not represented; thus, neither consistency among attributes nor indexes over tables are enforced. As a consequence, the efficiency of the SPARQL-to-SQL translation process may be affected, as well as the completeness of the answers produced during the evaluation of the generated SQL query. Our work focuses on applying implicit constraints to the OBDA query translation process over tabular data. We propose Morph-CSV, a framework for querying tabular data that exploits information from typical OBDA inputs (e.g., mappings, queries) to enforce constraints and that can be used together with any SPARQL-to-SQL OBDA engine. Morph-CSV relies on both a constraint component and a set of constraint operators. For a given set of constraints, the operators are applied to each type of constraint with the aim of enhancing query completeness and performance. We evaluate Morph-CSV in several domains: e-commerce with the BSBM benchmark; transportation with a benchmark using the GTFS dataset from the Madrid subway; and biology with a use case extracted from the Bio2RDF project. We compare and report the performance of two SPARQL-to-SQL OBDA engines, with and without the incorporation of Morph-CSV. The observed results suggest that Morph-CSV is able to speed up the total query execution time by up to two orders of magnitude while still producing all the query answers.