Search Results

Now showing 1 - 10 of 12

Quality Prediction of Open Educational Resources: A Metadata-based Approach

2020, Tavakoli, Mohammadreza, Elias, Mirette, Kismihók, Gábor, Auer, Sören, Chang, Maiga, Sampson, Demetrios G., Huang, Ronghuai, Hooshyar, Danial, Chen, Nian-Shing, Kinshuk, Pedaste, Margus

In the recent decade, online learning environments have accumulated millions of Open Educational Resources (OERs). However, for learners, finding relevant and high quality OERs is a complicated and time-consuming activity. Furthermore, metadata play a key role in offering high quality services such as recommendation and search. Metadata can also be used for automatic OER quality control as, in the light of the continuously increasing number of OERs, manual quality control is getting more and more difficult. In this work, we collected the metadata of 8,887 OERs to perform an exploratory data analysis to observe the effect of quality control on metadata quality. Subsequently, we propose an OER metadata scoring model, and build a metadata-based prediction model to anticipate the quality of OERs. Based on our data and model, we were able to detect high-quality OERs with the F1 score of 94.6%.

Sleep apnea-hypopnea quantification by cardiovascular data analysis

2014, Camargo, S., Riedl, M., Anteneodo, C., Kurths, J., Penzel, T., Wessel, N.

Sleep disorders are a major risk factor for cardiovascular diseases. Sleep apnea is the most common sleep disturbance, and its detection relies on polysomnography, i.e., a combination of several medical examinations performed during a monitored sleep night. In order to detect occurrences of sleep apnea without the need for combined recordings, we focus our efforts on extracting a quantifier related to the events of sleep apnea from a cardiovascular time series, namely systolic blood pressure (SBP). Physiologic time series are generally highly nonstationary, which hampers the application of conventional tools that require a stationary condition. In our study, data nonstationarities are uncovered by a segmentation procedure which splits the signal into stationary patches, providing local quantities such as mean and variance of the SBP signal in each stationary patch, as well as its duration L. We analysed the data of 26 individuals diagnosed with apnea, divided into hypertensive and normotensive groups, and compared the results with those of a control group. From the segmentation procedure, we identified that the average duration 〈L〉, as well as the average variance 〈σ²〉, are correlated with the apnea-hypopnea index (AHI), previously obtained by polysomnographic exams. Moreover, our results unveil an oscillatory pattern in apneic subjects, whose amplitude S∗ is also correlated with AHI. All these quantities allow apneic individuals to be separated with an accuracy of at least 79%. Therefore, they provide alternative criteria to detect sleep apnea based on a single time series, the systolic blood pressure.

Spatiotemporal data analysis with chronological networks

2020, Ferreira, Leonardo N., Vega-Oliveros, Didier A., Cotacallapa, Moshé, Cardoso, Manoel F., Quiles, Marcos G., Zhao, Liang, Macau, Elbert E. N.

The number of spatiotemporal data sets has increased rapidly in recent years, which demands robust and fast methods to extract information from this kind of data. Here, we propose a network-based model, called Chronnet, for spatiotemporal data analysis. The network construction process consists of dividing a geometric space into grid cells represented by nodes connected chronologically. Strong links in the network represent consecutive recurrent events between cells. The chronnet construction process is fast, making the model suitable for processing large data sets. Using artificial and real data sets, we show how chronnets can capture data properties beyond simple statistics, like frequent patterns, spatial changes, outliers, and spatiotemporal clusters. Therefore, we conclude that chronnets represent a robust tool for the analysis of spatiotemporal data sets.
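
The construction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `build_chronnet`, the `(t, x, y)` event format, the square grid, and the choice to link each observed time step only to the next observed one are all assumptions made for illustration.

```python
from collections import Counter


def build_chronnet(events, cell_size):
    """Build a weighted, directed edge list from (t, x, y) events.

    Hypothetical helper: each grid cell is a node, and an edge a -> b
    gains weight 1 whenever an event in cell a at one observed time step
    is followed by an event in cell b at the next observed time step.
    """
    # Map every event to its grid cell and group the cells by time step.
    cells_by_time = {}
    for t, x, y in events:
        cell = (int(x // cell_size), int(y // cell_size))
        cells_by_time.setdefault(t, set()).add(cell)

    # Connect the cells of consecutive time steps chronologically.
    edges = Counter()
    times = sorted(cells_by_time)
    for t_prev, t_next in zip(times, times[1:]):
        for a in cells_by_time[t_prev]:
            for b in cells_by_time[t_next]:
                edges[(a, b)] += 1
    return edges


# Events alternating between two neighbouring cells.
events = [(0, 0.5, 0.5), (1, 1.5, 0.5), (2, 0.5, 0.5), (3, 1.5, 0.5)]
edges = build_chronnet(events, cell_size=1.0)
print(edges)
```

In this toy input the link from cell (0, 0) to cell (1, 0) recurs, so it ends up with a higher weight than the reverse link, which is the sense in which strong links indicate consecutive recurrent events.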

ColiCoords: A Python package for the analysis of bacterial fluorescence microscopy data

2019, Smit, Jochem H., Li, Yichen, Warszawik, Eliza M., Herrmann, Andreas, Cordes, Thorben, Gilestro, Giorgio F

Single-molecule fluorescence microscopy studies of bacteria provide unique insights into the mechanisms of cellular processes and protein machineries in ways that are unrivalled by any other technique. With the cost of microscopes dropping and the availability of fully automated microscopes, the volume of microscopy data produced has increased tremendously. These developments have moved the bottleneck of throughput from image acquisition and sample preparation to data analysis. Furthermore, requirements for analysis procedures have become more stringent given the demand of various journals to make data and analysis procedures available. To address these issues we have developed a new data analysis package for analysis of fluorescence microscopy data from rod-like cells. Our software ColiCoords structures microscopy data at the single-cell level and implements a coordinate system describing each cell. This allows for the transformation of Cartesian coordinates from transmission light and fluorescence images and single-molecule localization microscopy (SMLM) data to cellular coordinates. Using this transformation, many cells can be combined to increase the statistical power of fluorescence microscopy datasets of any kind. ColiCoords is open source, implemented in the programming language Python, and is extensively documented. This allows for modifications for specific needs or to inspect and publish data analysis procedures. By providing a format that allows for easy sharing of code and associated data, we intend to promote open and reproducible research. The source code and documentation can be found via the project’s GitHub page.

Topological data analysis of contagion maps for examining spreading processes on networks

2015, Taylor, Dane, Klimm, Florian, Harrington, Heather A., Kramár, Miroslav, Mischaikow, Konstantin, Porter, Mason A., Mucha, Peter J.

Social and biological contagions are influenced by the spatial embeddedness of networks. Historically, many epidemics spread as a wave across part of the Earth’s surface; however, in modern contagions long-range edges—for example, due to airline transportation or communication media—allow clusters of a contagion to appear in distant locations. Here we study the spread of contagions on networks through a methodology grounded in topological data analysis and nonlinear dimension reduction. We construct ‘contagion maps’ that use multiple contagions on a network to map the nodes as a point cloud. By analysing the topology, geometry and dimensionality of manifold structure in such point clouds, we reveal insights to aid in the modelling, forecast and control of spreading processes. Our approach highlights contagion maps also as a viable tool for inferring low-dimensional structure in networks.

Probing the Statistical Properties of Unknown Texts: Application to the Voynich Manuscript

2013, Amancio, D.R., Altmann, E.G., Rybski, D., Oliveira Jr., O.N., da Costa, L.F.

While the use of statistical physics methods to analyze large corpora has been useful to unveil many patterns in texts, no comprehensive investigation has been performed on the interdependence between syntactic and semantic factors. In this study we propose a framework for determining whether a text (e.g., written in an unknown alphabet) is compatible with a natural language and to which language it could belong. The approach is based on three types of statistical measurements, i.e. obtained from first-order statistics of word properties in a text, from the topology of complex networks representing texts, and from intermittency concepts where text is treated as a time series. Comparative experiments were performed with the New Testament in 15 different languages and with distinct books in English and Portuguese in order to quantify the dependency of the different measurements on the language and on the story being told in the book. The metrics found to be informative in distinguishing real texts from their shuffled versions include assortativity, degree and selectivity of words. As an illustration, we analyze an undeciphered medieval manuscript known as the Voynich Manuscript. We show that it is mostly compatible with natural languages and incompatible with random texts. We also obtain candidates for keywords of the Voynich Manuscript which could be helpful in the effort of deciphering it. Because we were able to identify statistical measurements that are more dependent on the syntax than on the semantics, the framework may also serve for text analysis in language-dependent applications.

Publikationsmonitoring (Publication Monitoring)

2020, Schmeja, Stefan, Tullney, Marco, Lackner, Karin, Schilhan, Lisa, Kaier, Christian

The systematic recording and documentation of an institution's publication output is playing an increasingly important role for many universities and research institutions at a time when research performance is assessed predominantly by quantitative means. Together with further parameters such as acquired third-party funding, publication data serve not only external presentation but also internal evaluation and steering, up to and including performance-based allocation of funds. University administrations use the data to identify where action is needed in the development of departments and institutes. Systematically recorded publication data shed light on trends in research and, particularly in combination with citation data, on connections between different research fields or institutions.

A meta-analysis of catalytic literature data reveals property-performance correlations for the OCM reaction

2019, Schmack, Roman, Friedrich, Alexandra, Kondratenko, Evgenii V., Polte, Jörg, Werwatz, Axel, Kraehnert, Ralph

Decades of catalysis research have created vast amounts of experimental data. Within these data, new insights into property-performance correlations are hidden. However, the incomplete nature and undefined structure of the data have so far prevented comprehensive knowledge extraction. We propose a meta-analysis method that identifies correlations between a catalyst’s physico-chemical properties and its performance in a particular reaction. The method unites literature data with textbook knowledge and statistical tools. Starting from a researcher’s chemical intuition, a hypothesis is formulated and tested against the data for statistical significance. Iterative hypothesis refinement yields simple, robust and interpretable chemical models. The derived insights can guide new fundamental research and the discovery of improved catalysts. We demonstrate and validate the method for the oxidative coupling of methane (OCM). The final model indicates that only well-performing catalysts provide two independent functionalities under reaction conditions, i.e. a thermodynamically stable carbonate and a thermally stable oxide support.

Change in the embedding dimension as an indicator of an approaching transition

2014, Neuman, Y., Marwan, N., Cohen, Y.

Predicting a transition point in behavioral data should take into account the complexity of the signal being influenced by contextual factors. In this paper, we propose to analyze changes in the embedding dimension as contextual information indicating an approaching transition point, a method called OPtimal Embedding tRANsition Detection (OPERAND). Three texts were processed and translated to time series of emotional polarity. It was found that changes in the embedding dimension preceded transition points in the data. These preliminary results encourage further research into changes in the embedding dimension as generic markers of an approaching transition point.

Testing the detectability of spatio-temporal climate transitions from paleoclimate networks with the START model

2014, Rehfeld, K., Molkenthin, N., Kurths, J.

A critical challenge in paleoclimate data analysis is the fact that the proxy data are heterogeneously distributed in space, which affects statistical methods that rely on spatial embedding of data. In the paleoclimate network approach nodes represent paleoclimate proxy time series, and links in the network are given by statistically significant similarities between them. Their location in space, proxy and archive type is coded in the node attributes. We develop a semi-empirical model for Spatio-Temporally AutocoRrelated Time series, inspired by the interplay of different Asian Summer Monsoon (ASM) systems. We use an ensemble of transition runs of this START model to test whether and how spatio-temporal climate transitions could be detectable from (paleo)climate networks. We sample model time series both on a grid and at locations at which paleoclimate data are available to investigate the effect of the spatially heterogeneous availability of data. Node betweenness centrality, averaged over the transition region, does not respond to the transition displayed by the START model, neither in the grid-based nor in the scattered sampling arrangement. The regionally defined measures of regional node degree and cross link ratio, however, are indicative of the changes in both scenarios, although the magnitude of the changes differs according to the sampling. We find that the START model is particularly suitable for pseudo-proxy experiments to test the technical reconstruction limits of paleoclimate data based on their location, and we conclude that (paleo)climate networks are suitable for investigating spatio-temporal transitions in the dependence structure of underlying climatic fields.
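
The network idea in this abstract (nodes are time series, links are strong similarities between them) can be sketched in a few lines of Python. This is an illustrative stand-in, not the authors' method: it links two series when the absolute Pearson correlation exceeds a fixed threshold instead of running a statistical significance test, and the names `pearson`, `similarity_network`, and `node_degree` are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def similarity_network(series, threshold):
    """Return undirected links (u, v) whose |correlation| >= threshold.

    Assumption: a fixed threshold replaces the significance test
    mentioned in the abstract.
    """
    names = sorted(series)
    links = set()
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if abs(pearson(series[u], series[v])) >= threshold:
                links.add((u, v))
    return links


def node_degree(links, node):
    """Number of links that touch the given node."""
    return sum(node in link for link in links)


# Three toy series: "a" and "b" move together, "c" does not.
series = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],
    "c": [1, -1, 1, -1],
}
links = similarity_network(series, threshold=0.9)
print(links, node_degree(links, "a"))
```

A regional measure such as the regional node degree named in the abstract would additionally need each node's location; that step is left out here because the abstract does not define it.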