Machine Learning Frameworks for Predicting Contaminant Leaching and Sorption Dynamics in Environmental Matrices

Citable link (URI): http://hdl.handle.net/10900/173304
http://nbn-resolving.org/urn:nbn:de:bsz:21-dspace-1733047
http://dx.doi.org/10.15496/publikation-114629
Document type: Dissertation
Publication date: 2025-12-18
Original publication: Applicability of machine learning models for the assessment of long-term pollutant leaching from solid waste materials. Waste Management. 2023 Nov 1;171:337-49; Ensemble surrogate modeling of advective-dispersive transport with intraparticle diffusion model for column-leaching test. Journal of Contaminant Hydrology. 2024 Nov 1;267:104423; Modeling PFAS Sorption in Soils Using Machine Learning. Environmental Science & Technology. 2025 Apr 11;59(15):7678-87.
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät (Faculty of Science)
Department: Geography, Geoecology, Geoscience
Referee: Grathwohl, Peter (Prof. Dr.)
Date of oral examination: 2025-10-10
DDC classification: 550 - Earth sciences
Free keywords:
Waste Management
Reactive Transport
Machine Learning
Bayesian Inference
Ensemble Modeling
Simulation-based inference
PFAS
SHapley Additive exPlanation (SHAP)
License: http://tobias-lib.uni-tuebingen.de/doku/lic_ohne_pod.php?la=de http://tobias-lib.uni-tuebingen.de/doku/lic_ohne_pod.php?la=en

Abstract:

Machine learning (ML) has the potential to fundamentally transform environmental science by providing a rapid, scalable alternative to slow, resource-intensive experiments and complex simulations. Traditional experimental methods, such as column leaching tests, require weeks of continuous laboratory measurements. Similarly, numerical contaminant transport models demand substantial computational resources to simulate complex subsurface processes accurately. These time and resource requirements limit the practicality of such approaches for large-scale environmental assessments. This thesis introduces ML as a tool for predicting contaminant behavior rapidly and accurately. By leveraging early-stage experimental data, ML-driven models reduce the duration of column leaching tests from 7–14 days to a single day. Trained on construction and demolition waste materials, these models accurately predict the long-term leaching behavior of key contaminants, including sulfate, vanadium, chromium, copper, and organic pollutants (15 priority PAHs identified by the US EPA). They also enable fast, cost-effective risk assessments and guide sustainable waste management decisions. Once the ML models are trained, a sensitivity analysis is performed to understand the underlying leaching dynamics; it identifies pH and electrical conductivity as the most influential factors.

Beyond accelerating experiments, this work improves the efficiency of numerical simulations of contaminant transport in porous media. Traditional numerical models solve coupled equations for advection, dispersion, and inter-phase mass transfer, which requires extensive computational resources. To address this, a surrogate modeling framework is developed using a random forest stacking model, reducing computational costs by a factor of more than 1,000 while maintaining predictive accuracy. Training data points for the surrogate are selected by adaptive-recursive sampling, which balances exploration and exploitation. Additionally, Neural Posterior Estimation calibrates model parameters probabilistically using copper leaching data from two distinct soils, enabling robust uncertainty quantification.

The need for accurate predictive tools is particularly urgent for emerging contaminants such as per- and polyfluoroalkyl substances (PFAS), which are known for their persistence and complex sorption behavior. This thesis develops the PFAS Sorption Stacking Model (PSSM), an ML-driven framework that integrates compound-specific properties with soil characteristics to predict solid-liquid distribution coefficients (Kd). A key innovation is the incorporation of charge density as a sorption descriptor, particularly for PFAS compounds with pKa values near typical soil pH, which provides deeper insight into electrostatic interactions. A comprehensive dataset consolidating sorption isotherm data for 51 PFAS compounds across 455 soil types enhances model robustness. Missing values are systematically imputed using a k-nearest neighbors algorithm, preserving data integrity and improving predictive stability. The model demonstrates accurate and stable predictive performance, achieving a normalized root mean square error of 0.07, which makes it one of the most precise PFAS sorption models to date. As a practical application, PSSM is integrated into an online platform for real-time PFAS sorption predictions and the generation of spatial Kd maps. This tool supports targeted contamination assessments, facilitating the identification of PFAS hotspots and improving environmental risk management.
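
To make two of the steps named in the abstract concrete, the following is a minimal sketch of k-nearest-neighbor imputation and of a normalized root mean square error. It is not the thesis's implementation. The function names `knn_impute` and `nrmse`, the choice of k, the distance convention, the normalization by the observed range, and the example soil values are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Assumed convention: Euclidean distance over the features both rows have
    observed, rescaled by the fraction of features that are shared.
    """
    n_features = len(rows[0])
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j in range(n_features):
            if row[j] is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # Only rows that actually observed feature j can donate a value.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                sq = sum((a - b) ** 2 for a, b in shared)
                dist = math.sqrt(sq * n_features / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = [value for _, value in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


def nrmse(observed, predicted):
    """RMSE divided by the range of the observed values (assumed normalization)."""
    value_range = max(observed) - min(observed)
    if value_range == 0:
        raise ValueError("observed values have zero range")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse) / value_range


# Hypothetical soil records: [organic carbon (%), pH, clay (%)]; None = missing.
soils = [
    [1.0, 5.0, 10.0],
    [1.2, 5.2, None],
    [3.0, 7.0, 30.0],
    [1.1, 5.1, 12.0],
]
print(knn_impute(soils, k=2))
print(nrmse([1.0, 2.0, 3.0], [1.1, 1.9, 3.1]))
```

In practice the features would normally be standardized before computing distances, since variables such as pH and clay content sit on different scales; that step is omitted here to keep the sketch short.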
