Search Results

Now showing 1 - 4 of 4
  • Item
    Formalizing Gremlin pattern matching traversals in an integrated graph Algebra
    (Aachen, Germany : RWTH Aachen, 2019) Thakkar, Harsh; Auer, Sören; Vidal, Maria-Esther; Samavi, Reza; Consens, Mariano P.; Khatchadourian, Shahan; Nguyen, Vinh; Sheth, Amit; Giménez-García, José M.; Thakkar, Harsh
    Graph data management (also called NoSQL) has revealed beneficial characteristics in terms of flexibility and scalability by differ-ently balancing between query expressivity and schema flexibility. This peculiar advantage has resulted into an unforeseen race of developing new task-specific graph systems, query languages and data models, such as property graphs, key-value, wide column, resource description framework (RDF), etc. Present-day graph query languages are focused towards flex-ible graph pattern matching (aka sub-graph matching), whereas graph computing frameworks aim towards providing fast parallel (distributed) execution of instructions. The consequence of this rapid growth in the variety of graph-based data management systems has resulted in a lack of standardization. Gremlin, a graph traversal language, and machine provide a common platform for supporting any graph computing sys-tem (such as an OLTP graph database or OLAP graph processors). In this extended report, we present a formalization of graph pattern match-ing for Gremlin queries. We also study, discuss and consolidate various existing graph algebra operators into an integrated graph algebra.
  • Item
    Why reinvent the wheel: Let's build question answering systems together
    (New York City : Association for Computing Machinery, 2018) Singh, K.; Radhakrishna, A.S.; Both, A.; Shekarpour, S.; Lytra, I.; Usbeck, R.; Vyas, A.; Khikmatullaev, A.; Punjani, D.; Lange, C.; Vidal, Maria-Esther; Lehmann, J.; Auer, Sören
    Modern question answering (QA) systems need to flexibly integrate a number of components specialised to fulfil specific tasks in a QA pipeline. Key QA tasks include Named Entity Recognition and Disambiguation, Relation Extraction, and Query Building. Since a number of different software components exist that implement different strategies for each of these tasks, it is a major challenge to select and combine the most suitable components into a QA system, given the characteristics of a question. We study this optimisation problem and train classifiers, which take features of a question as input and have the goal of optimising the selection of QA components based on those features. We then devise a greedy algorithm to identify the pipelines that include the suitable components and can effectively answer the given question. We implement this model within Frankenstein, a QA framework able to select QA components and compose QA pipelines. We evaluate the effectiveness of the pipelines generated by Frankenstein using the QALD and LC-QuAD benchmarks. These results not only suggest that Frankenstein precisely solves the QA optimisation problem but also enables the automatic composition of optimised QA pipelines, which outperform the static Baseline QA pipeline. Thanks to this flexible and fully automated pipeline generation process, new QA components can be easily included in Frankenstein, thus improving the performance of the generated pipelines.
  • Item
    EVENTSKG: A 5-Star Dataset of Top-Ranked Events in Eight Computer Science Communities
    (Berlin ; Heidelberg : Springer, 2019) Fathalla, Said; Lange, Christoph; Auer, Sören; Hitzler, Pascal; Fernández, Miriam; Janowicz, Krzysztof; Zaveri, Amrapali; Gray, Alasdair J.G.; Lopez, Vanessa; Haller, Armin; Hammar, Karl
    Metadata of scientific events has become increasingly available on the Web, albeit often as raw data in various formats, disregarding its semantics and interlinking relations. This leads to restricting the usability of this data for, e.g., subsequent analyses and reasoning. Therefore, there is a pressing need to represent this data in a semantic representation, i.e., Linked Data. We present the new release of the EVENTSKG dataset, comprising comprehensive semantic descriptions of scientific events of eight computer science communities. Currently, EVENTSKG is a 5-star dataset containing metadata of 73 top-ranked event series (almost 2,000 events) established over the last five decades. The new release is a Linked Open Dataset adhering to an updated version of the Scientific Events Ontology, a reference ontology for event metadata representation, leading to richer and cleaner data. To facilitate the maintenance of EVENTSKG and to ensure its sustainability, EVENTSKG is coupled with a Java API that enables users to add/update events metadata without going into the details of the representation of the dataset. We shed light on events characteristics by analyzing EVENTSKG data, which provides a flexible means for customization in order to better understand the characteristics of renowned CS events.
  • Item
    SemSur: A Core Ontology for the Semantic Representation of Research Findings
    (Amsterdam [u.a.] : Elsevier, 2018) Fathalla, Said; Vahdati, Sahar; Auer, Sören; Lange, Christoph; Fensel, Anna; de Boer, Victor; Pellegrini, Tassilo; Kiesling, Elmar; Haslhofer, Bernhard; Hollink, Laura; Schindler, Alexander
    The way how research is communicated using text publications has not changed much over the past decades. We have the vision that ultimately researchers will work on a common structured knowledge base comprising comprehensive semantic and machine-comprehensible descriptions of their research, thus making research contributions more transparent and comparable. We present the SemSur ontology for semantically capturing the information commonly found in survey and review articles. SemSur is able to represent scientific results and to publish them in a comprehensive knowledge graph, which provides an efficient overview of a research field, and to compare research findings with related works in a structured way, thus saving researchers a significant amount of time and effort. The new release of SemSur covers more domains, defines better alignment with external ontologies and rules for eliciting implicit knowledge. We discuss possible applications and present an evaluation of our approach with the retrospective, exemplary semantification of a survey. We demonstrate the utility of the SemSur ontology to answer queries about the different research contributions covered by the survey. SemSur is currently used and maintained at OpenResearch.org.