Search Results

  • Item
    A PDF Test-Set for Well-Formedness Validation in JHOVE - The Good, the Bad and the Ugly
    (Zenodo, 2017) Lindlar, Michelle; Tunnat, Yvonne; Wilson, Carl
    Digital preservation and active software stewardship are both cyclical processes. While digital preservation strategies have to be reevaluated regularly to ensure that they still meet technological and organizational requirements, software needs to be tested with every new release to ensure that it functions correctly. JHOVE is an open-source format validation tool that plays a central role in many digital preservation workflows, and the PDF module is one of its most important features. Unlike tools such as Adobe PreFlight or veraPDF, which check against requirements at profile level, JHOVE’s PDF module is the only tool that can validate the syntax and structure of PDF files. Despite JHOVE’s widespread and long-standing adoption, the underlying validation rules are not formally or thoroughly tested, so bugs can go undetected for a long time. Furthermore, there is no ground-truth data set that can be used to understand and test PDF validation at the structural level. We present a corpus of lightweight files designed to test the validation criteria of JHOVE’s PDF module against “well-formedness”. We conclude by measuring the code coverage of the test corpus within JHOVE PDF validation and by feeding detected inconsistencies of the PDF module back into the open-source development process.
  • Item
    TIB and East Asia Department at the TIB
    (Zenodo, 2017) Lu, Linna
    The "2017 International Conference on Integrated Development of Digital Publishing and Digital Libraries (CDPDL)" took place in Taiyuan from August 16 to 18, 2017. This conference provides a platform for Chinese and foreign librarians, publishers and scholars to present the latest developments in Chinese library resources and technical innovations. In the early 1980s, as interest in East Asia grew in academic and industrial research, the TIB's Regional Department for East Asia was founded as an independent unit. The department's acquisitions aim to supply materials that not only give an overview of current developments in East Asia but also offer detailed, specialized information. In comparison to most Asian departments of other libraries, the TIB's East Asia collection focuses not on Asian linguistics and humanities, but on modern literature in the field of technology and natural sciences. In order to reflect the current state of research in East Asia and to connect research and industry, our acquisition focuses on modern periodicals, academic journals and series from scientific and technical associations, and reports from universities and research institutes. These special features were presented to the participants at this year's CDPDL. Many colleagues in the field were thus able to get to know the TIB and the regional department better, especially in the country and geographical area that is one of our main collection focuses.
  • Item
    Discovery and efficient reuse of technology pictures using Wikimedia infrastructures. A proposal
    (Zenodo, 2016) Heller, Lambert; Blümel, Ina; Cartellieri, Simone; Wartena, Christian
    Multimedia objects, especially images and figures, are essential for the visualization and interpretation of research findings. The distribution and reuse of these scientific objects is significantly improved under open access conditions, for instance in Wikipedia articles, in research literature, as well as in education and knowledge dissemination, where licensing of images often represents a serious barrier. Whereas scientific publications are retrievable through library portals or other online search services thanks to standardized indices, there is no targeted retrieval of and access to the accompanying images and figures yet. Consequently, there is a great demand to develop standardized indexing methods for these multimedia open access objects in order to improve the accessibility of this material. With our proposal, we hope to serve a broad audience that looks up a scientific or technical term in a web search portal first. Until now, this audience has had little chance of finding an openly accessible and reusable image narrowly matching their search term on the first try, which is frustrating even when such an image is in fact included in some open access article.