Search Results

  • Item
    Micro archives as rich digital object representations
    (Zenodo, 2018) Holzmann, Helge; Runnwerth, Mila
    Digital objects as well as real-world entities are commonly referred to in the literature or on the Web by mentioning their name, linking to their website, or citing unique identifiers, such as DOI and ORCID, which are backed by a set of metadata. All of these methods have severe disadvantages, though, and are not always suitable: they are not very precise, are not guaranteed to be persistent, or require considerable additional effort from the author, who needs to collect the metadata to describe the reference accurately. Especially for complex, evolving entities and objects like software, pre-defined metadata schemas are often not expressive enough to capture their temporal state comprehensively. We found in previous work that a lot of meaningful information about software, such as a description, rich metadata, its documentation and source code, is usually available online. However, all of this needs to be preserved coherently in order to constitute a rich digital representation of the entity. We show that this is currently not the case, as only 10% of the studied blog posts and roughly 30% of the analyzed software websites are archived completely, i.e., with all linked resources captured as well. Therefore, we propose Micro Archives as rich digital object representations, which semantically and logically connect archived resources and ensure a coherent state. With Micrawler, we present a modular solution to create, cite and analyze such Micro Archives. In this paper, we show the need for this approach and discuss opportunities and implications for various applications, including those beyond scholarly writing.
  • Item
    Preserving information on mathematical software via web archives
    (Zenodo, 2018) Holzmann, Helge; Runnwerth, Mila
    Software is an essential ingredient in mathematical research, especially in numerical analysis, mathematical modelling, and statistics. However, the traditional publication process has not yet come up with a satisfactory solution for referencing software that takes its dynamics into account in order to allow for reproducibility. While software on GitHub might be “cited” by linking to the repository including the precise commit given by its SHA hash, commercial software can most of the time only be referenced by its web presence. In 2016, we evaluated swMATH, a database for information on mathematical software, on how a software’s archived website reflects its development. We found that, although some websites are already archived and indeed provide a temporal overview of their corresponding software’s development, including documentation, there is a need for improvement. In 2017, we presented a demo framework to archive and cite software homepages by adding a timestamp. Our vision is a service to archive semantically linked websites of mathematical software on the basis of swMATH. The resulting archive contains all information on the software at a specific time and can be cited in a traditional publication via DOI.