Search Results

  • Item
    Characterization and classification of semantic image-text relations
    (Berlin : Springer Nature, 2020) Otto, C.; Springstein, M.; Anand, A.; Ewerth, R.
    The beneficial, complementary nature of visual and textual information to convey information is widely known, for example, in entertainment, news, advertisements, science, or education. While the complex interplay of image and text to form semantic meaning has been thoroughly studied in linguistics and communication sciences for several decades, computer vision and multimedia research remained on the surface of the problem more or less. An exception is previous work that introduced the two metrics Cross-Modal Mutual Information and Semantic Correlation in order to model complex image-text relations. In this paper, we motivate the necessity of an additional metric called Status in order to cover complex image-text relations more completely. This set of metrics enables us to derive a novel categorization of eight semantic image-text classes based on three dimensions. In addition, we demonstrate how to automatically gather and augment a dataset for these classes from the Web. Further, we present a deep learning system to automatically predict each of the three metrics, as well as a system to directly predict the eight image-text classes. Experimental results show the feasibility of the approach, whereby the predict-all approach outperforms the cascaded approach of the metric classifiers.
  • Item
    Survey vs Scraped Data: Comparing Time Series Properties of Web and Survey Vacancy Data
    (Berlin : Springer Nature, 2019) De Pedraza, P.; Visintin, S.; Tijdens, K.; Kismihók, G.
    This paper studies the relationship between a vacancy population obtained from web crawling and vacancies in the economy inferred by a National Statistics Office (NSO) using a traditional method. We compare the time series properties of samples obtained between 2007 and 2014 by Statistics Netherlands and by a web scraping company. We find that the web and NSO vacancy data present similar time series properties, suggesting that both time series are generated by the same underlying phenomenon: the real number of new vacancies in the economy. We conclude that, in our case study, web-sourced data are able to capture aggregate economic activity in the labor market.
  • Item
    Scholarly event characteristics in four fields of science: a metrics-based analysis
    (Berlin : Springer Nature, 2020) Fathalla, S.; Vahdati, S.; Lange, C.; Auer, Sören
    One of the key channels of scholarly knowledge exchange are scholarly events such as conferences, workshops, symposiums, etc.; such events are especially important and popular in Computer Science, Engineering, and Natural Sciences. However, scholars encounter problems in finding relevant information about upcoming events and statistics on their historic evolution. In order to obtain a better understanding of scholarly event characteristics in four fields of science, we analyzed the metadata of scholarly events of four major fields of science, namely Computer Science, Physics, Engineering, and Mathematics using Scholarly Events Quality Assessment suite, a suite of ten metrics. In particular, we analyzed renowned scholarly events belonging to five sub-fields within Computer Science, namely World Wide Web, Computer Vision, Software Engineering, Data Management, as well as Security and Privacy. This analysis is based on a systematic approach using descriptive statistics as well as exploratory data analysis. The findings are on the one hand interesting to observe the general evolution and success factors of scholarly events; on the other hand, they allow (prospective) event organizers, publishers, and committee members to assess the progress of their event over time and compare it to other events in the same field; and finally, they help researchers to make more informed decisions when selecting suitable venues for presenting their work. Based on these findings, a set of recommendations has been concluded to different stakeholders, involving event organizers, potential authors, proceedings publishers, and sponsors. Our comprehensive dataset of scholarly events of the aforementioned fields is openly available in a semantic format and maintained collaboratively at OpenResearch.org.
  • Item
    Replication and Refinement of an Algorithm for Automated Drusen Segmentation on Optical Coherence Tomography
    (Berlin : Springer Nature, 2020) Wintergerst, M.W.M.; Gorgi Zadeh, S.; Wiens, V.; Thiele, S.; Schmitz-Valckenberg, S.; Holz, F.G.; Finger, R.P.; Schultz, T.
    Here, we investigate the extent to which re-implementing a previously published algorithm for OCT-based drusen quantification permits replicating the reported accuracy on an independent dataset. We refined that algorithm so that its accuracy is increased. Following a systematic literature search, an algorithm was selected based on its reported excellent results. Several steps were added to improve its accuracy. The replicated and refined algorithms were evaluated on an independent dataset with the same metrics as in the original publication. Accuracy of the refined algorithm (overlap ratio 36–52%) was significantly greater than the replicated one (overlap ratio 25–39%). In particular, separation of the retinal pigment epithelium and the ellipsoid zone could be improved by the refinement. However, accuracy was still lower than reported previously on different data (overlap ratio 67–76%). This is the first replication study of an algorithm for OCT image analysis. Its results indicate that current standards for algorithm validation do not provide a reliable estimate of algorithm performance on images that differ with respect to patient selection and image quality. In order to contribute to an improved reproducibility in this field, we publish both our replication and the refinement, as well as an exemplary dataset.