Search Results

Now showing 1 - 6 of 6
  • Item
    Evolutionary design of explainable algorithms for biomedical image segmentation
    ([London] : Nature Publishing Group UK, 2023) Cortacero, Kévin; McKenzie, Brienne; Müller, Sabina; Khazen, Roxana; Lafouresse, Fanny; Corsaut, Gaëlle; Van Acker, Nathalie; Frenois, François-Xavier; Lamant, Laurence; Meyer, Nicolas; Vergier, Béatrice; Wilson, Dennis G.; Luga, Hervé; Staufer, Oskar; Dustin, Michael L.; Valitutti, Salvatore; Cussat-Blanc, Sylvain
    An unresolved issue in contemporary biomedicine is the overwhelming number and diversity of complex images that require annotation, analysis and interpretation. Recent advances in Deep Learning have revolutionized the field of computer vision, creating algorithms that compete with human experts in image segmentation tasks. However, these frameworks require large human-annotated datasets for training and the resulting “black box” models are difficult to interpret. In this study, we introduce Kartezio, a modular Cartesian Genetic Programming-based computational strategy that generates fully transparent and easily interpretable image processing pipelines by iteratively assembling and parameterizing computer vision functions. The pipelines thus generated exhibit comparable precision to state-of-the-art Deep Learning approaches on instance segmentation tasks, while requiring drastically smaller training datasets. This Few-Shot Learning method confers tremendous flexibility, speed, and functionality to this approach. We then deploy Kartezio to solve a series of semantic and instance segmentation problems, and demonstrate its utility across diverse images ranging from multiplexed tissue histopathology images to high-resolution microscopy images. While the flexibility, robustness and practical utility of Kartezio make this fully explicable evolutionary designer a potential game-changer in the field of biomedical image processing, Kartezio remains complementary and potentially auxiliary to mainstream Deep Learning approaches.
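As a rough illustration of the idea behind such evolved image pipelines (not the Kartezio implementation itself), the sketch below assembles a tiny pipeline of image-processing primitives with a (1+1) evolutionary strategy; every primitive, the genome encoding, and the fitness function are simplified stand-ins:

```python
import random
import numpy as np

# Toy "computer vision" primitives operating on 2-D float arrays.
def blur(img):
    # 4-neighbour mean filter via padded shifts
    p = np.pad(img, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] + img) / 5.0

def threshold(img):
    return (img > img.mean()).astype(float)

def invert(img):
    return img.max() - img

FUNCS = [blur, threshold, invert]

def random_genome(n_nodes=4):
    # Each node: (function index, index of the node/input it reads from)
    return [(random.randrange(len(FUNCS)), random.randrange(i + 1))
            for i in range(n_nodes)]

def run_pipeline(genome, img):
    values = [img]                      # node 0 is the input image
    for f_idx, src in genome:
        values.append(FUNCS[f_idx](values[src]))
    return values[-1]

def iou(pred, target):
    # Intersection-over-union of the binarized prediction and target mask
    pred_b, tgt_b = pred > 0.5, target > 0.5
    union = np.logical_or(pred_b, tgt_b).sum()
    return np.logical_and(pred_b, tgt_b).sum() / union if union else 1.0

def evolve(img, target, generations=200):
    # (1+1) evolutionary strategy: mutate one gene, keep if not worse
    best = random_genome()
    best_fit = iou(run_pipeline(best, img), target)
    for _ in range(generations):
        child = list(best)
        i = random.randrange(len(child))
        child[i] = (random.randrange(len(FUNCS)), random.randrange(i + 1))
        fit = iou(run_pipeline(child, img), target)
        if fit >= best_fit:
            best, best_fit = child, fit
    return best, best_fit
```

Because the evolved genome is just a short list of named function applications, the resulting pipeline can be read and audited step by step, which is the interpretability argument the abstract makes.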
  • Item
    Ranking facts for explaining answers to elementary science questions
    (Cambridge : Cambridge University Press, 2023) D’Souza, Jennifer; Mulang, Isaiah Onando; Auer, Sören
    In multiple-choice exams, students select one answer from among typically four choices and can explain why they made that particular choice. Students are good at understanding natural language questions and based on their domain knowledge can easily infer the question's answer by “connecting the dots” across various pertinent facts. Considering automated reasoning for elementary science question answering, we address the novel task of generating explanations for answers from human-authored facts. For this, we examine the practically scalable framework of feature-rich support vector machines leveraging domain-targeted, hand-crafted features. Explanations are created from a human-annotated set of nearly 5000 candidate facts in the WorldTree corpus. Our aim is to obtain better matches for valid facts of an explanation for the correct answer of a question over the available fact candidates. To this end, our features offer a comprehensive linguistic and semantic unification paradigm. The machine learning problem is the preference ordering of facts, for which we test pointwise regression versus pairwise learning-to-rank. Our contributions, originating from comprehensive evaluations against nine existing systems, are (1) a case study in which two preference ordering approaches are systematically compared, and where the pointwise approach is shown to outperform the pairwise approach, thus adding to the existing survey of observations on this topic; (2) since our system outperforms a highly-effective TF-IDF-based IR technique by 3.5 and 4.9 points on the development and test sets, respectively, it demonstrates some of the further task improvement possibilities (e.g., in terms of an efficient learning algorithm, semantic features) on this task; (3) it is a practically competent approach that can outperform some variants of BERT-based reranking models; and (4) the human-engineered features make it an interpretable machine learning model for the task.
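The pointwise-versus-pairwise comparison at the heart of the abstract can be sketched on synthetic data; the feature vectors, labels, and both learners below are illustrative toys, not the authors' feature-rich SVM framework:

```python
import numpy as np

# Toy data: each row is a hand-crafted feature vector for one candidate fact;
# y is a graded relevance label (how well the fact explains the answer).
rng = np.random.default_rng(0)
w_true = np.array([1.5, -2.0, 0.5])
X = rng.normal(size=(40, 3))
y = X @ w_true + rng.normal(scale=0.1, size=40)

def pointwise_fit(X, y):
    # Pointwise: plain least-squares regression on individual facts
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def pairwise_fit(X, y, epochs=50, lr=0.01):
    # Pairwise: ranking-SVM-style perceptron on difference vectors.
    # For each pair (i, j) with y[i] > y[j], push w·(x_i - x_j) > 0.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in range(len(y)):
            for j in range(len(y)):
                if y[i] > y[j] and (X[i] - X[j]) @ w <= 0:
                    w += lr * (X[i] - X[j])
    return w

def pair_agreement(scores, y):
    # Fraction of fact pairs ordered consistently with the labels
    agree = total = 0
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            if y[i] != y[j]:
                total += 1
                agree += (scores[i] - scores[j]) * (y[i] - y[j]) > 0
    return agree / total
```

Both learners produce a scoring function over candidate facts; the preference-ordering question the paper studies is which training objective yields the better ranking under the same features.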
  • Item
    Comparing national differences in what people perceive to be there: Mapping variations in crowd sourced land cover
    (London : International Society for Photogrammetry and Remote Sensing, 2015) Comber, A.; Mooney, P.; Purves, R.S.; Rocchini, D.; Walz, A.
  • Item
    Probing the Statistical Properties of Unknown Texts: Application to the Voynich Manuscript
    (San Francisco, CA : Public Library of Science (PLoS), 2013) Amancio, D.R.; Altmann, E.G.; Rybski, D.; Oliveira Jr., O.N.; da Costa, L.F.
    While the use of statistical physics methods to analyze large corpora has been useful to unveil many patterns in texts, no comprehensive investigation has been performed on the interdependence between syntactic and semantic factors. In this study we propose a framework for determining whether a text (e.g., written in an unknown alphabet) is compatible with a natural language and to which language it could belong. The approach is based on three types of statistical measurements, i.e., those obtained from first-order statistics of word properties in a text, from the topology of complex networks representing texts, and from intermittency concepts where text is treated as a time series. Comparative experiments were performed with the New Testament in 15 different languages and with distinct books in English and Portuguese in order to quantify the dependency of the different measurements on the language and on the story being told in the book. The metrics found to be informative in distinguishing real texts from their shuffled versions include assortativity, degree and selectivity of words. As an illustration, we analyze an undeciphered medieval manuscript known as the Voynich Manuscript. We show that it is mostly compatible with natural languages and incompatible with random texts. We also obtain candidates for keywords of the Voynich Manuscript which could be helpful in the effort of deciphering it. Because we were able to identify statistical measurements that are more dependent on the syntax than on the semantics, the framework may also serve for text analysis in language-dependent applications.
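The intermittency measurement mentioned above (text treated as a time series) can be illustrated with a toy burstiness statistic; the token lists and the specific coefficient-of-variation measure below are illustrative assumptions, not the paper's exact metrics:

```python
import random

def intermittency(tokens, word):
    # Burstiness of a word when the text is read as a time series:
    # coefficient of variation (std/mean) of the gaps between occurrences.
    positions = [i for i, t in enumerate(tokens) if t == word]
    if len(positions) < 2:
        return 0.0
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    mean = sum(gaps) / len(gaps)
    var = sum((g - mean) ** 2 for g in gaps) / len(gaps)
    return var ** 0.5 / mean

# A topical word in a real text clusters in the passages that discuss it;
# shuffling the tokens destroys this clustering, so its burstiness drops.
real = (["key", "x", "x"] * 10) + ["x"] * 500 + (["key", "x", "x"] * 10)
shuffled = real[:]
random.seed(0)
random.shuffle(shuffled)
```

Comparing the statistic on a text against its shuffled version is exactly the kind of real-versus-random contrast the abstract uses to argue the Voynich Manuscript behaves like a natural language.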
  • Item
    Order patterns networks (orpan) - A method to estimate time-evolving functional connectivity from multivariate time series
    (Lausanne : Frontiers Research Foundation, 2012) Schinkel, S.; Zamora-López, G.; Dimigen, O.; Sommer, W.; Kurths, J.
    Complex networks provide an excellent framework for studying the functional activity of the human brain. Yet estimating functional networks from measured signals is not trivial, especially if the data are non-stationary and noisy, as is often the case with physiological recordings. In this article we propose a method that uses the local rank structure of the data to define functional links in terms of identical rank structures. The method yields temporal sequences of networks, which makes it possible to trace the evolution of the functional connectivity during the time course of the observation. We demonstrate the potential of this approach with model data as well as with experimental data from an electrophysiological study on language processing.
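A minimal sketch of the order-patterns idea, assuming a simple reading of "identical local rank structures": two channels are linked at time t when their length-m ordinal patterns coincide. Function names and the pattern length are illustrative, not the authors' code:

```python
import numpy as np

def order_pattern(window):
    # Rank structure of consecutive samples, e.g. [0.2, 0.9, 0.5] -> (0, 2, 1)
    return tuple(np.argsort(np.argsort(window)))

def orpan_networks(data, m=3):
    # data: (channels, time) array. For each time step t, link channels whose
    # length-m order patterns starting at t are identical. The result is one
    # adjacency matrix per time step, i.e. a time-evolving functional network.
    n_ch, n_t = data.shape
    nets = []
    for t in range(n_t - m + 1):
        patterns = [order_pattern(data[c, t:t + m]) for c in range(n_ch)]
        adj = np.array([[int(c1 != c2 and patterns[c1] == patterns[c2])
                         for c2 in range(n_ch)] for c1 in range(n_ch)])
        nets.append(adj)
    return nets
```

Because only the local rank order of samples matters, the construction is insensitive to slow amplitude drifts, which is one way ordinal methods cope with non-stationary recordings.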
  • Item
    Knowledge organization systems in mathematics and in libraries
    (Zenodo, 2017) Kasprzik, Anna
    Based on the project activities planned in the context of the Specialized Information Service for Mathematics (TIB Hannover, FAU Erlangen, L3S, SUB Göttingen), we give an overview of the history and interplay of subject cataloguing in libraries, the development of computerized methods for metadata processing, and the rise of the Semantic Web. We survey various knowledge organization systems such as the Mathematics Subject Classification, the German Authority File, the clustering International Authority File VIAF, and lexical databases such as WordNet, and their potential use for mathematics in education and research. We briefly address the difference between thesauri and ontologies and the relations they typically contain from a linguistic perspective. We will then discuss with the audience how the current efforts to represent and handle mathematical theories as semantic objects can help counter the decline of semantic resource annotation in libraries that some have predicted due to the existence of highly performant retrieval algorithms (based on statistical, neural, or other big data methods). We will also explore the potential characteristics of a fruitful symbiosis between carefully cultivated kernels of semantic structure and automated methods, in order to scale those structures up to the level necessary to cope with the amounts of digital data found in libraries and in (mathematical) research (e.g., in simulations) today.