Optimizing Federated Queries Based on the Physical Design of a Data Lake

dc.contributor.author: Rohde, Philipp D.
dc.contributor.author: Vidal, Maria-Esther
dc.date.accessioned: 2021-04-13T08:27:38Z
dc.date.available: 2021-04-13T08:27:38Z
dc.date.issued: 2020
dc.description.abstract: The optimization of query execution plans is known to be crucial for reducing query execution time. In particular, query optimization has been studied thoroughly for relational databases over the past decades. Recently, the Resource Description Framework (RDF) became popular for publishing data on the Web. As a consequence, federations composed of different data models, such as RDF and relational databases, evolved. One type of these federations is the Semantic Data Lake, where every data source is kept in its original data model and semantically annotated with ontologies or controlled vocabularies. However, state-of-the-art query engines for federated query processing over Semantic Data Lakes often rely on optimization techniques tailored for RDF. In this paper, we present query optimization techniques guided by heuristics that take the physical design of a Data Lake into account. The heuristics are implemented on top of Ontario, a SPARQL query engine for Semantic Data Lakes. Using source-specific heuristics, the query engine is able to generate more efficient query execution plans by exploiting knowledge about indexes and normalization in relational databases. We show that heuristics which take the physical design of the Data Lake into account are able to speed up query processing. [eng]
dc.description.version: publishedVersion [eng]
dc.identifier.uri: https://oa.tib.eu/renate/handle/123456789/6145
dc.identifier.uri: https://doi.org/10.34657/5193
dc.language.iso: eng [eng]
dc.publisher: Aachen : RWTH [eng]
dc.relation.essn: 1613-0073
dc.relation.ispartof: Proceedings of the Workshops of the EDBT/ICDT 2020 Joint Conference [eng]
dc.relation.ispartofseries: CEUR Workshop Proceedings 2578 (2020) [eng]
dc.rights.license: CC BY 4.0 Unported [eng]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [eng]
dc.subject: query execution time [eng]
dc.subject: Resource Description Framework [eng]
dc.subject: Semantic Data Lakes [eng]
dc.subject.classification: Konferenzschrift [ger]
dc.subject.ddc: 004 [eng]
dc.title: Optimizing Federated Queries Based on the Physical Design of a Data Lake [eng]
dc.type: bookPart [eng]
dc.type: Text [eng]
dcterms.bibliographicCitation.journalTitle: CEUR Workshop Proceedings [eng]
tib.accessRights: openAccess [eng]
tib.relation.conference: EDBT-ICDT-WS 2020, 30 March 2020, Copenhagen, Denmark [eng]
wgl.contributor: TIB [eng]
wgl.subject: Informatik [eng]
wgl.type: Buchkapitel / Sammelwerksbeitrag [eng]
wgl.type: Konferenzbeitrag [eng]
Files
Original bundle
Name: Rohde2020.pdf
Size: 892.83 KB
Format: Adobe Portable Document Format