Search Results

Now showing 1 - 10 of 92
  • Item
    Biobank Oversight and Sanctions Under the General Data Protection Regulation
    (Dordrecht ; Heidelberg ; New York ; London : Springer, 2021) Hallinan, Dara; Slokenberga, Santa; Tzortzatou, Olga; Reichel, Jane
This contribution offers an insight into the function and problems of the oversight and sanctions mechanisms outlined in the General Data Protection Regulation as they relate to the biobanking context. These mechanisms might be considered meta-mechanisms—mechanisms relating to, but not consisting of, substantive legal principles—functioning in tandem to ensure biobank compliance with data protection principles. On paper at least, each mechanism outlines a comprehensive and impressive compliance architecture, expanding its capacity in relation to Directive 95/46. Accordingly, each mechanism looks likely to have a significant and lasting impact on biobanks and biobanking. Despite this comprehensiveness, however, the mechanisms are not immune from critique. Problems appear regarding the standard of protection provided for research subject rights, regarding the disproportionate impact on legitimate interests tied up with the biobanking process—particularly genomic research interests—and regarding their practical implementability in biobanking.
  • Item
    The Open Quantum Materials Database (OQMD): assessing the accuracy of DFT formation energies
    (London : Nature Publ. Group, 2015) Kirklin, Scott; Saal, James E.; Meredig, Bryce; Thompson, Alex; Doak, Jeff W.; Aykol, Muratahan; Rühl, Stephan; Wolverton, Chris
    The Open Quantum Materials Database (OQMD) is a high-throughput database currently consisting of nearly 300,000 density functional theory (DFT) total energy calculations of compounds from the Inorganic Crystal Structure Database (ICSD) and decorations of commonly occurring crystal structures. To maximise the impact of these data, the entire database is being made available, without restrictions, at www.oqmd.org/download. In this paper, we outline the structure and contents of the database, and then use it to evaluate the accuracy of the calculations therein by comparing DFT predictions with experimental measurements for the stability of all elemental ground-state structures and 1,670 experimental formation energies of compounds. This represents the largest comparison between DFT and experimental formation energies to date. The apparent mean absolute error between experimental measurements and our calculations is 0.096 eV/atom. In order to estimate how much error to attribute to the DFT calculations, we also examine deviation between different experimental measurements themselves where multiple sources are available, and find a surprisingly large mean absolute error of 0.082 eV/atom. Hence, we suggest that a significant fraction of the error between DFT and experimental formation energies may be attributed to experimental uncertainties. Finally, we evaluate the stability of compounds in the OQMD (including compounds obtained from the ICSD as well as hypothetical structures), which allows us to predict the existence of ~3,200 new compounds that have not been experimentally characterised and uncover trends in material discovery, based on historical data available within the ICSD.
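The headline figures in this abstract are mean absolute errors between sets of formation energies. The following minimal sketch shows how such a comparison reduces to a single statistic; the numerical values are made up for illustration and are not taken from the OQMD, which is available at www.oqmd.org/download.

```python
import numpy as np

# Hypothetical formation energies in eV/atom; real values would come from the
# OQMD download and from experimental compilations.
oqmd_dft = np.array([-0.45, -1.02, -0.31, -2.10])    # DFT-calculated
experiment = np.array([-0.52, -0.95, -0.40, -2.01])  # measured

# Mean absolute error, the statistic quoted in the abstract (0.096 eV/atom for
# DFT vs. experiment; 0.082 eV/atom between independent experiments).
mae = np.mean(np.abs(oqmd_dft - experiment))
print(f"MAE: {mae:.3f} eV/atom")
```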
  • Item
    NFDI4Chem - Fachkonsortium für die Chemie
    (Marburg : Philipps-Universität, 2021) Ortmeyer, Jochen; Schön, Florian; Herres-Pawlis, Sonja; Jung, Nicole; Bach, Felix; Liermann, Johannes; Neumann, Steffen; Popp, Christian; Razum, Matthias; Koepler, Oliver; Steinbeck, Christoph
NFDI4Chem has formed as the chemistry consortium within the German National Research Data Infrastructure (NFDI). In this contribution, the consortium briefly introduces itself and presents its central goals, the most important improvements for research data management (RDM) in chemistry, and the practical challenges involved. The vision of NFDI4Chem is the comprehensive digitalisation and interconnection of all processes involving research data in chemical research. From data generation, through processing and analysis, to publication, a modular, networked infrastructure of software tools, electronic lab notebooks, and data repositories is being developed and provided to support researchers in their daily laboratory work. This digitalisation is accompanied by the development of minimum information standards for data publications, consisting, among other things, of standards for data and metadata formats as well as ontologies for semantic description. The NFDI4Chem consortium pursues its tasks in a science-driven manner, with the clear goal of developing an infrastructure that is intuitive and efficient to use. Shaping a cultural change, together with the scientific community, towards establishing and accepting FAIR data handling is therefore another important element of the NFDI4Chem activities.
  • Item
    Digital research data: from analysis of existing standards to a scientific foundation for a modular metadata schema in nanosafety
    (London : BioMed Central, 2022) Elberskirch, Linda; Binder, Kunigunde; Riefler, Norbert; Sofranko, Adriana; Liebing, Julia; Minella, Christian Bonatto; Mädler, Lutz; Razum, Matthias; van Thriel, Christoph; Unfried, Klaus; Schins, Roel P. F.; Kraegeloh, Annette
Background: Assessing the safety of engineered nanomaterials (ENMs) is an interdisciplinary and complex process producing huge amounts of information and data. To make such data and metadata reusable for researchers, manufacturers, and regulatory authorities, there is an urgent need to record and provide this information in a structured, harmonized, and digitized way. Results: This study aimed to identify appropriate description standards and quality criteria for specific use in nanosafety. There are many existing standards and guidelines designed for collecting data and metadata, ranging from regulatory guidelines to specific databases. Most of them are incomplete or not specifically designed for ENM research. However, by merging the content of several existing standards and guidelines, a basic catalogue of descriptive information and quality criteria was generated. In an iterative process, our interdisciplinary team identified deficits and added missing information into a comprehensive schema. Subsequently, this overview was externally evaluated by a panel of experts during a workshop. This whole process resulted in a minimum information table (MIT), specifying the minimum information to be provided along with experimental results on the effects of ENMs in a biological context, in a flexible and modular manner. The MIT is divided into six modules: general information, material information, biological model information, exposure information, endpoint read-out information, and analysis and statistics. These modules are further partitioned into subdivisions serving to include more detailed information. A comparison with existing ontologies, which also aim to electronically collect data and metadata on nanosafety studies, showed that the newly developed MIT exhibits a higher level of detail than those existing schemas, making it better suited to preventing gaps in the communication of information. Conclusion: Implementing the requirements of the MIT in, e.g., electronic lab notebooks (ELNs) would make the collection of all necessary data and metadata a daily routine and would thereby improve the reproducibility and reusability of experiments. Furthermore, this approach is particularly beneficial in view of the rapidly expanding development and application of novel non-animal alternative testing methods.
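The six MIT modules named in the abstract suggest a straightforwardly modular record layout. A minimal sketch of such a record, as it might be serialised from an ELN, is shown below; all field names and values are hypothetical and are not taken from the published MIT.

```python
import json

# Hypothetical nanosafety record structured along the six MIT modules named in
# the abstract; field names and values are illustrative only.
record = {
    "general_information": {"study_id": "NANO-0001", "operator": "Jane Doe"},
    "material_information": {"material": "TiO2 nanoparticles", "primary_size_nm": 21},
    "biological_model_information": {"model": "A549 cells", "passage_number": 12},
    "exposure_information": {"concentration_ug_per_ml": 50, "duration_h": 24},
    "endpoint_readout_information": {"assay": "WST-1", "readout": "viability"},
    "analysis_and_statistics": {"replicates": 3, "test": "ANOVA"},
}

print(json.dumps(record, indent=2))
```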
  • Item
    Caching and Reproducibility: Making Data Science Experiments Faster and FAIRer
    (Lausanne : Frontiers Media, 2022) Schubotz, Moritz; Satpute, Ankit; Greiner-Petter, André; Aizawa, Akiko; Gipp, Bela
Small to medium-scale data science experiments often rely on research software developed ad hoc by individual scientists or small teams. Often there is no time to make the research software fast, reusable, and open access. The consequence is twofold. First, subsequent researchers must spend significant work hours building upon the proposed hypotheses or experimental framework. In the worst case, others cannot reproduce the experiment and reuse the findings for subsequent research. Second, suppose the ad-hoc research software fails during long-running, computationally expensive experiments. In that case, the overall effort to iteratively improve the software and rerun the experiments creates significant time pressure on the researchers. We suggest making caching an integral part of the research software development process, even before the first line of code is written. This article outlines caching recommendations for developing research software in data science projects. Our recommendations provide a perspective on circumventing common problems such as proprietary dependence, speed, etc. At the same time, caching contributes to the reproducibility of experiments in the open science workflow. Concerning the four guiding principles, i.e., Findability, Accessibility, Interoperability, and Reusability (FAIR), we foresee that including the proposed recommendations in research software development will make the data related to that software FAIRer for both machines and humans. We demonstrate the usefulness of some of the proposed recommendations on our recently completed research software project in mathematical information retrieval.
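As an illustration of the kind of caching the authors recommend building in from the start, a minimal sketch of a persistent result cache for an expensive computation could look as follows; the `expensive_step` function and the cache location are hypothetical and are not taken from the paper's software.

```python
import hashlib
import json
import pickle
from pathlib import Path

CACHE_DIR = Path(".cache")  # hypothetical location for cached results
CACHE_DIR.mkdir(exist_ok=True)

def cached(func):
    """Cache results on disk, keyed by function name and its arguments."""
    def wrapper(*args, **kwargs):
        key = hashlib.sha256(
            json.dumps([func.__name__, args, kwargs], sort_keys=True, default=str).encode()
        ).hexdigest()
        path = CACHE_DIR / f"{key}.pkl"
        if path.exists():                       # reuse a previous result if present
            return pickle.loads(path.read_bytes())
        result = func(*args, **kwargs)
        path.write_bytes(pickle.dumps(result))  # persist for later reruns
        return result
    return wrapper

@cached
def expensive_step(n: int) -> int:
    # placeholder for a long-running computation
    return sum(i * i for i in range(n))

print(expensive_step(10_000_000))  # computed once, then served from .cache/
```

If the software crashes partway through an experiment, already cached intermediate results survive the rerun, which is the time-pressure scenario the abstract describes.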
  • Item
    Gesamtkonzept für die Informationsinfrastruktur in Deutschland
    (Kommission Zukunft der Informationsinfrastruktur, 2011) Kommission Zukunft der Informationsinfrastruktur
What do digitised microscope slides from cancer research, magnetic tape recordings of the first crewed moon flight, and the animal sound archive of Berlin's Humboldt-Universität have in common? In all cases they contain valuable scientific information. Their availability, however, is not always guaranteed: a few clicks on a computer are enough to hear the edible frog (Rana esculenta) croaking over the internet. But anyone looking for the original recordings of the first moon mission is out of luck: for years, staff at the US space agency NASA have been searching their archives for the reels without success. It is becoming ever more certain that the three-centimetre-wide magnetic tapes were at some point simply erased and overwritten with other data. NASA's search did have one positive outcome: it unearthed other old data tapes in Australia containing information about moon dust. But this was immediately followed by the next problem: the data could not be read. Fortunately, a historical recorder was found with which the information could be deciphered; the device, the size of a refrigerator, came from a museum. These examples illustrate the increasingly important question of how researchers will have to handle scientific information and data in the future in order to preserve them and keep them accessible for further research. The "Kommission Zukunft der Informationsinfrastruktur" (Commission on the Future of the Information Infrastructure) addressed this set of issues. This high-level group of experts, coordinated by the Leibniz Association, produced the present overall concept. The mandate came from the Joint Science Conference of the Federal Government and the Länder (GWK). In the remarkably short time of only 15 months, the experts (nearly 135 people from 54 institutions) succeeded in producing a comprehensive account of the situation together with detailed recommendations. The composition of the commission is a novelty: it represents the key actors of the information infrastructure in Germany, namely the service providers themselves, the funding organisations, and the scientific users. All members of the commission deserve great thanks for their successful work. My particular thanks go to the Leibniz Association's Executive Board representative for information infrastructure, Sabine Brünger-Weilandt, who chaired the commission. She is the managing director of the Leibniz Institute for Information Infrastructure (FIZ Karlsruhe), which she steered through its regular evaluation while leading the commission. The present concept shows the enormous potential for Germany as a research location that lies in the strategic further development of the information infrastructure, and it points the way to the future of that infrastructure. Now the task is to drive implementation forward.
  • Item
    Special issue on conceptual structures
    (Dordrecht [u.a.] : Springer Science + Business Media B.V, 2022) Alam, Mehwish; Braun, Tanya; Endres, Dominik; Yun, Bruno
    [no abstract available]
  • Item
    Understanding Class Representations: An Intrinsic Evaluation of Zero-Shot Text Classification
    (Aachen, Germany : RWTH Aachen, 2021) Hoppe, Fabian; Dessì, Danilo; Sack, Harald; Alam, Mehwish; Buscaldi, Davide; Cochez, Michael; Osborne, Francesco; Reforgiato Recupero, Diego; Sack, Harald
Text classification is frequently limited by insufficient training data. Zero-shot classification addresses this problem by including external class definitions and then exploiting the relations between classes seen during training and unseen (zero-shot) classes. However, it requires a class embedding space capable of accurately representing the semantic relatedness between classes. This work defines an intrinsic evaluation based on greater-than constraints to provide a better understanding of this relatedness. The results imply that textual embeddings are able to capture more semantics than knowledge graph embeddings, but combining both modalities yields the best performance.
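The greater-than constraints mentioned here compare the pairwise relatedness of class embeddings. A minimal sketch of checking one such constraint with cosine similarity is given below; the class names and vectors are hypothetical and do not reproduce the paper's evaluation data.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical class embeddings (e.g., from textual or knowledge-graph encoders).
emb = {
    "sports":   np.array([0.9, 0.1, 0.0]),
    "football": np.array([0.8, 0.2, 0.1]),
    "finance":  np.array([0.1, 0.9, 0.3]),
}

# A greater-than constraint: "football" should be more related to "sports"
# than to "finance" in the class embedding space.
satisfied = cosine(emb["football"], emb["sports"]) > cosine(emb["football"], emb["finance"])
print("constraint satisfied:", satisfied)
```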
  • Item
    Steps towards a Dislocation Ontology for Crystalline Materials
    (Aachen, Germany : RWTH Aachen, 2021) Ihsan, Ahmad Zainul; Dessì, Danilo; Alam, Mehwish; Sack, Harald; Sandfeld, Stefan; García-Castro, Raúl; Davies, John; Antoniou, Grigoris; Fortuna, Carolina
The field of Materials Science is concerned with, e.g., the properties and performance of materials. An important class of materials are crystalline materials, which usually contain "dislocations", a line-like type of defect. Dislocations decisively determine many important materials properties. Over the past decades, significant effort was put into understanding dislocation behavior across different length scales, both with experimental characterization techniques and with simulations. However, there is still no common standard for describing such dislocation structures that represents and connects dislocation domain knowledge across different but related communities. An ontology offers a common foundation to enable knowledge representation and data interoperability, which are important components for establishing a "digital twin". This paper outlines the first steps towards the design of an ontology in the dislocation domain and shows a connection with already existing ontologies in the materials science and engineering domain.
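To give a flavour of what linking a dislocation class to an existing materials ontology might look like, here is a minimal RDF sketch using rdflib; every namespace IRI and class name is an assumption for illustration and does not reproduce the authors' actual ontology.

```python
from rdflib import Graph, Namespace, RDF, RDFS, Literal

# Hypothetical namespaces; the real ontology IRIs are defined by the authors.
DISO = Namespace("https://example.org/dislocation-ontology#")
MDO = Namespace("https://example.org/materials-ontology#")

g = Graph()
g.bind("diso", DISO)
g.bind("mdo", MDO)

# A dislocation modelled as a line defect, linked to a class from an assumed
# pre-existing materials science ontology.
g.add((DISO.Dislocation, RDF.type, RDFS.Class))
g.add((DISO.Dislocation, RDFS.subClassOf, MDO.LineDefect))
g.add((DISO.Dislocation, RDFS.label, Literal("Dislocation")))
g.add((DISO.BurgersVector, RDF.type, RDFS.Class))
g.add((DISO.hasBurgersVector, RDFS.domain, DISO.Dislocation))
g.add((DISO.hasBurgersVector, RDFS.range, DISO.BurgersVector))

print(g.serialize(format="turtle"))
```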
  • Item
    The genomic data deficit : On the need to inform research subjects of the informational content of their genomic sequence data in consent for genomic research
    (Amsterdam [u.a.] : Elsevier Science, 2020) Hallinan, Dara
Research subject consent plays a significant role in the legitimation of genomic research in Europe – both ethically and legally. One key criterion for any consent to be legitimate is that the research subject is ‘informed’. This criterion implies that the research subject is given all relevant information to allow them to decide whether engaging with a genomic research infrastructure or project would be normatively desirable and whether they wish to accept the risks associated with engagement. This article makes the normative argument that, in order to be truly ‘informed’, the research subject should be provided with information on the informational content of their genomic sequence data. Information should be provided, in the first instance, prior to the initial consent transaction, and should include: information on the fact that genomic sequence data will be collected and processed, information on the types of information which can currently be extracted from sequence data and information on the uncertainties surrounding the types of information which may eventually be extractable from sequence data. Information should also be provided, on an ongoing basis, as relevant and necessary, throughout the research process, and should include: information on novel information which can be extracted from sequence data and information on the novel uses and utility of sequence data. The article argues that current elaborations of ‘informed’ consent fail to adequately address the requirements set out in the normative argument and that this inadequacy constitutes an issue in need of a solution. The article finishes with a set of observations as to the fora best suited to deliver a solution and as to the substantive content of a solution.