Search Results
Now showing 1 - 3 of 3
- Employing Hybrid AI Systems to Trace and Document Bias in ML Pipelines (New York, NY : IEEE, 2024). Russo, Mayra; Chudasama, Yasharajsinh; Purohit, Disha; Sawischa, Sammy; Vidal, Maria-Esther.
  Artificial Intelligence (AI) systems can introduce biases that lead to unreliable outcomes and, in the worst-case scenarios, perpetuate systemic and discriminatory results when deployed in the real world. While significant efforts have been made to create bias detection methods, developing reliable and comprehensive documentation artifacts also makes for valuable resources that address bias and aid in minimizing the harms associated with AI systems. Based on compositional design patterns, this paper introduces a documentation approach using a hybrid AI system to prompt the identification and traceability of bias in datasets and predictive AI models. To demonstrate the effectiveness of our approach, we instantiate our pattern in two implementations of a hybrid AI system. One follows an integrated approach and performs fine-grained tracing and documentation of the AI model. In contrast, the other hybrid system follows a principled approach and enables the documentation and comparison of bias in the input data and the predictions generated by the model. Through a use case based on Fake News detection and an empirical evaluation, we show how biases detected during data ingestion steps (e.g., label, over-representation, activity bias) affect the training and predictions of the classification models. Concretely, we report a stark skewness in the distribution of input variables towards the Fake News label, we uncover how a predictive variable leads to more constraints in the learning process, and we highlight open challenges of training models with unbalanced datasets. A video summarizing this work is available online (https://youtu.be/v2GfIQPAy_4?si=BXtWOf97cLiZavyu), and the implementation is publicly available on GitHub (https://github.com/SDM-TIB/DocBiasKG).
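The label/over-representation bias this abstract reports (input variables skewed towards the Fake News label) can be sketched as a simple label-distribution check. This is an illustrative stand-in, not the DocBiasKG implementation; the `label_skew` helper and the toy 70/30 corpus are assumptions for the example:

```python
from collections import Counter

def label_skew(labels):
    """Return the per-class share of a label column; a strongly
    uneven split signals label / over-representation bias."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# Toy corpus skewed towards the "fake" label, echoing the skewness
# the abstract reports (values here are invented for illustration).
labels = ["fake"] * 70 + ["real"] * 30
print(label_skew(labels))  # → {'fake': 0.7, 'real': 0.3}
```

A documentation pipeline would record such a skew report alongside the dataset before any model is trained on it.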
- CauseKG: A Framework Enhancing Causal Inference With Implicit Knowledge Deduced From Knowledge Graphs (New York, NY : IEEE, 2024). Huang, Hao; Vidal, Maria-Esther.
  Causal inference is a critical technique for inferring causal relationships from data and distinguishing causation from correlation. Causal inference frameworks rely on structured data, typically represented in flat tables or relational models. These frameworks estimate causal effects based only on explicit facts, overlooking implicit information in the data, which can lead to inaccurate causal estimates. Knowledge graphs (KGs) inherently capture implicit information through logical rules applied to explicit facts, providing a unique opportunity to leverage implicit knowledge. However, existing frameworks are not applicable to KGs due to their semi-structured nature. CauseKG is a causal inference framework designed to address the intricacies of KGs and seamlessly integrate implicit information using KG-specific entailment techniques, providing a more accurate causal inference process. We empirically evaluate the effectiveness of CauseKG against benchmarks constructed from synthetic and real-world datasets. The results suggest that CauseKG can produce a lower mean absolute error in causal inference compared to state-of-the-art methods. The empirical results demonstrate CauseKG's ability to address causal questions in a variety of domains. This research highlights the importance of extending causal inference techniques to KGs, emphasising the improved accuracy that can be achieved by integrating implicit and explicit information.
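The key idea here — that logical rules over explicit KG facts entail implicit facts a causal analysis would otherwise miss — can be sketched with a tiny triple store and one hand-written entailment rule. This is a minimal illustration of rule-based materialization, not CauseKG's entailment machinery; the triples, the `exposedTo` rule, and the `materialize` helper are all invented for the example:

```python
# Toy KG as (subject, predicate, object) triples: only explicit facts.
triples = {
    ("drugA", "treats", "flu"),
    ("patient1", "takes", "drugA"),
}

def materialize(explicit):
    """Apply one rule to a fixpoint, adding implicit facts:
    takes(x, d) & treats(d, c)  =>  exposedTo(x, c)."""
    derived = set(explicit)
    changed = True
    while changed:
        changed = False
        takes = [(s, o) for s, p, o in derived if p == "takes"]
        treats = [(s, o) for s, p, o in derived if p == "treats"]
        for person, drug in takes:
            for d, condition in treats:
                fact = (person, "exposedTo", condition)
                if d == drug and fact not in derived:
                    derived.add(fact)
                    changed = True
    return derived

print(("patient1", "exposedTo", "flu") in materialize(triples))  # → True
```

A KG-aware causal framework would run its effect estimation over the materialized graph, so that entities linked only by entailed facts still count towards treatment and outcome groups.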
- Robust Fusion of Time Series and Image Data for Improved Multimodal Clinical Prediction (New York, NY : IEEE, 2024). Rasekh, Ali; Heidari, Reza; Hosein Haji Mohammad Rezaie, Amir; Sharifi Sedeh, Parsa; Ahmadi, Zahra; Mitra, Prasenjit; Nejdl, Wolfgang.
  With the increasing availability of diverse data types, particularly images and time series data from medical experiments, there is a growing demand for techniques designed to combine various modalities of data effectively. Our motivation comes from the important areas of predicting mortality and phenotyping, where using different modalities of data could significantly improve our ability to predict. To tackle this challenge, we introduce a new method that uses two separate encoders, one for each type of data, allowing the model to understand complex patterns in both visual and time-based information. Apart from the technical challenges, our goal is to make the predictive model more robust in noisy conditions and perform better than current methods. We also deal with imbalanced datasets and use an uncertainty loss function, yielding improved results while simultaneously providing a principled means of modeling uncertainty. Additionally, we include attention mechanisms to fuse different modalities, allowing the model to focus on what's important for each task. We tested our approach using the comprehensive multimodal MIMIC dataset, combining MIMIC-IV and MIMIC-CXR datasets. Our experiments show that our method is effective in improving multimodal deep learning for clinical applications. The code for this work is publicly available at: https://github.com/AliRasekh/TSImageFusion.
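The fusion step this abstract describes — per-modality encoders whose outputs are combined by attention so the model weights what matters per task — can be sketched in miniature: score each modality embedding against a task query, softmax the scores, and take the weighted sum. This is a dependency-free toy of attention-based fusion, not the TSImageFusion architecture; the `attention_fusion` helper, the query vector, and the 2-D embeddings are assumptions for illustration:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fusion(query, modality_embeddings):
    """Fuse per-modality embeddings (e.g., one from an image encoder,
    one from a time-series encoder) into a single vector: dot-product
    scores against a task query, softmax to weights, weighted sum."""
    scores = [sum(q * v for q, v in zip(query, emb))
              for emb in modality_embeddings]
    weights = softmax(scores)
    dim = len(modality_embeddings[0])
    return [sum(w * emb[i] for w, emb in zip(weights, modality_embeddings))
            for i in range(dim)]

# Query aligned with the first (say, image) modality pulls the fused
# vector towards that modality's embedding.
fused = attention_fusion([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(fused)
```

In the full model the query and weights would be learned per task, letting different prediction heads attend to different modalities.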