Search Results

Now showing 1 - 2 of 2
  • Item
    Building Scholarly Knowledge Bases with Crowdsourcing and Text Mining
    (Aachen : RWTH, 2020) Stocker, Markus; Zhang, Chengzhi; Mayr, Philipp; Lu, Wei; Zhang, Yi
    For centuries, scholarly knowledge has been buried in documents. While articles are great to convey the story of scientific work to peers, they make it hard for machines to process scholarly knowledge. The recent proliferation of the scholarly literature and the increasing inability of researchers to digest, reproduce, reuse its content are constant reminders that we urgently need a transformative digitalization of the scholarly literature. Building on the Open Research Knowledge Graph (http://orkg.org) as a concrete research infrastructure, in this talk we present how using crowdsourcing and text mining humans and machines can collaboratively build scholarly knowledge bases, i.e. systems that acquire, curate and publish data, information and knowledge published in the scholarly literature in structured and semantic form. We discuss some key challenges that human and technical infrastructures face as well as the possibilities scholarly knowledge bases enable.
  • Item
    Easy Semantification of Bioassays
    (Heidelberg : Springer, 2022) Anteghini, Marco; D’Souza, Jennifer; dos Santos, Vitor A. P. Martins; Auer, Sören
    Biological data and knowledge bases increasingly rely on Semantic Web technologies and the use of knowledge graphs for data integration, retrieval and federated queries. We propose a solution for automatically semantifying biological assays. Our solution contrasts the problem of automated semantification as labeling versus clustering where the two methods are on opposite ends of the method complexity spectrum. Characteristically modeling our problem, we find the clustering solution significantly outperforms a deep neural network state-of-the-art labeling approach. This novel contribution is based on two factors: 1) a learning objective closely modeled after the data outperforms an alternative approach with sophisticated semantic modeling; 2) automatically semantifying biological assays achieves a high performance F1 of nearly 83%, which to our knowledge is the first reported standardized evaluation of the task offering a strong benchmark model.