Search Results

  • Item
    Phenotyping in the era of genomics: MaTrics—a digital character matrix to document mammalian phenotypic traits
    (Amsterdam [u.a.] : Elsevier, 2021) Stefen, Clara; Wagner, Franziska; Asztalos, Marika; Giere, Peter; Grobe, Peter; Hiller, Michael; Hofmann, Rebecca; Jähde, Maria; Lächele, Ulla; Lehmann, Thomas; Ortmann, Sylvia; Peters, Benjamin; Ruf, Irina; Schiffmann, Christian; Thier, Nadja; Unterhitzenberger, Gabriele; Vogt, Lars; Rudolf, Matthias; Wehner, Peggy; Stuckas, Heiko
    A new and uniquely structured matrix of mammalian phenotypes, MaTrics (Mammalian Traits for Comparative Genomics) in a digital form is presented. By focussing on mammalian species for which genome assemblies are available, MaTrics provides an interface between mammalogy and comparative genomics. MaTrics was developed within a project aimed to find genetic causes of phenotypic traits of mammals using Forward Genomics. This approach requires genomes and comprehensive and recorded information on homologous phenotypes that are coded as discrete categories in a matrix. MaTrics is an evolving online resource providing information on phenotypic traits in numeric code; traits are coded either as absent/present or with several states as multistate. The state record for each species is linked to at least one reference (e.g., literature, photographs, histological sections, CT scans, or museum specimens) and so MaTrics contributes to digitalization of museum collections. Currently, MaTrics covers 147 mammalian species and includes 231 characters related to structure, morphology, physiology, ecology, and ethology and available in a machine actionable NEXUS-format. Filling MaTrics revealed substantial knowledge gaps, highlighting the need for phenotyping efforts. Studies based on selected data from MaTrics and using Forward Genomics identified associations between genes and certain phenotypes ranging from lifestyles (e.g., aquatic) to dietary specializations (e.g., herbivory, carnivory). These findings motivate the expansion of phenotyping in MaTrics by filling research gaps and by adding taxa and traits. Only databases like MaTrics will provide machine actionable information on phenotypic traits, an important limitation to genomics. MaTrics is available within the data repository Morph·D·Base (www.morphdbase.de).
  • Item
    Transforming the study of organisms: Phenomic data models and knowledge bases
    (San Francisco, Calif. : Public Library of Science, 2020) Thessen, Anne E.; Walls, Ramona L.; Vogt, Lars; Singer, Jessica; Warren, Robert; Buttigieg, Pier Luigi; Balhoff, James P.; Mungall, Christopher J.; McGuinness, Deborah L.; Stucky, Brian J.; Yoder, Matthew J.; Haendel, Melissa A.
    The rapidly decreasing cost of gene sequencing has resulted in a deluge of genomic data from across the tree of life; however, outside a few model organism databases, genomic data are limited in their scientific impact because they are not accompanied by computable phenomic data. The majority of phenomic data are contained in countless small, heterogeneous phenotypic data sets that are very difficult or impossible to integrate at scale because of variable formats, lack of digitization, and linguistic problems. One powerful solution is to represent phenotypic data using data models with precise, computable semantics, but adoption of semantic standards for representing phenotypic data has been slow, especially in biodiversity and ecology. Some phenotypic and trait data are available in a semantic language from knowledge bases, but these are often not interoperable. In this review, we will compare and contrast existing ontology and data models, focusing on nonhuman phenotypes and traits. We discuss barriers to integration of phenotypic data and make recommendations for developing an operationally useful, semantically interoperable phenotypic data ecosystem.
  • Item
    Anatomy and the type concept in biology show that ontologies must be adapted to the diagnostic needs of research
    (London : BioMed Central, 2022) Vogt, Lars; Mikó, István; Bartolomaeus, Thomas
    Background: In times of exponential data growth in the life sciences, machine-supported approaches are becoming increasingly important and with them the need for FAIR (Findable, Accessible, Interoperable, Reusable) and eScience-compliant data and metadata standards. Ontologies, with their queryable knowledge resources, play an essential role in providing these standards. Unfortunately, biomedical ontologies only provide ontological definitions that answer What is it? questions, but no method-dependent empirical recognition criteria that answer How does it look? questions. Consequently, biomedical ontologies contain knowledge of the underlying ontological nature of structural kinds, but often lack sufficient diagnostic knowledge to unambiguously determine the reference of a term. Results: We argue that this is because ontology terms are usually textually defined and conceived as essentialistic classes, while recognition criteria often require perception-based definitions because perception-based contents more efficiently document and communicate spatial and temporal information—a picture is worth a thousand words. Therefore, diagnostic knowledge often must be conceived as cluster classes or fuzzy sets. Using several examples from anatomy, we point out the importance of diagnostic knowledge in anatomical research and discuss the role of cluster classes and fuzzy sets as concepts of grouping needed in anatomy ontologies in addition to essentialistic classes. In this context, we evaluate the role of the biological type concept and discuss its function as a general container concept for groupings not covered by the essentialistic class concept. Conclusions: We conclude that many recognition criteria can be conceptualized as text-based cluster classes that use terms that are in turn based on perception-based fuzzy set concepts. 
    Finally, we point out that only if biomedical ontologies model also relevant diagnostic knowledge in addition to ontological knowledge, they will fully realize their potential and contribute even more substantially to the establishment of FAIR and eScience-compliant data and metadata standards in the life sciences.
  • Item
    ORKG: Facilitating the Transfer of Research Results with the Open Research Knowledge Graph
    (Sofia : Pensoft, 2021) Auer, Sören; Stocker, Markus; Vogt, Lars; Fraumann, Grischa; Garatzogianni, Alexandra
    This document is an edited version of the original funding proposal entitled 'ORKG: Facilitating the Transfer of Research Results with the Open Research Knowledge Graph' that was submitted to the European Research Council (ERC) Proof of Concept (PoC) Grant in September 2020 (https://erc.europa.eu/funding/proof-concept). The proposal was evaluated by five reviewers and has been placed after the evaluations on the reserve list. The main document of the original proposal did not contain an abstract.
  • Item
    Semantic units: organizing knowledge graphs into semantically meaningful units of representation
    (London : BioMed Central, 2024) Vogt, Lars; Kuhn, Tobias; Hoehndorf, Robert
    Background: In today’s landscape of data management, the importance of knowledge graphs and ontologies is escalating as critical mechanisms aligned with the FAIR Guiding Principles, ensuring data and metadata are Findable, Accessible, Interoperable, and Reusable. We discuss three challenges that may hinder the effective exploitation of the full potential of FAIR knowledge graphs. Results: We introduce “semantic units” as a conceptual solution, although currently exemplified only in a limited prototype. Semantic units structure a knowledge graph into identifiable and semantically meaningful subgraphs by adding another layer of triples on top of the conventional data layer. Semantic units and their subgraphs are represented by their own resource that instantiates a corresponding semantic unit class. We distinguish statement and compound units as basic categories of semantic units. A statement unit is the smallest, independent proposition that is semantically meaningful for a human reader. Depending on the relation of its underlying proposition, it consists of one or more triples. Organizing a knowledge graph into statement units results in a partition of the graph, with each triple belonging to exactly one statement unit. A compound unit, on the other hand, is a semantically meaningful collection of statement and compound units that form larger subgraphs. Some semantic units organize the graph into different levels of representational granularity, others orthogonally into different types of granularity trees or different frames of reference, structuring and organizing the knowledge graph into partially overlapping, partially enclosed subgraphs, each of which can be referenced by its own resource. Conclusions: Semantic units, applicable in RDF/OWL and labeled property graphs, offer support for making statements about statements and facilitate graph-alignment, subgraph-matching, knowledge graph profiling, and for management of access restrictions to sensitive data.
    Additionally, we argue that organizing the graph into semantic units promotes the differentiation of ontological and discursive information, and that it also supports the differentiation of multiple frames of reference within the graph.