Abstract: Objective Given the extensive application of machine learning (ML) in medical models and its strong learning and generalization capabilities, this study combined automated ML (AutoML) with patient demographics and clinical conditions for early assessment of the risk of bowel preparation failure before colonoscopy. Methods A retrospective analysis was conducted on patients who underwent colonoscopy in Hospital 1 and Hospital 2 from January 2022 to January 2023, and their general and clinical information was collected. According to the Boston bowel preparation scale (BBPS), a score of ≤ 5 was defined as bowel preparation failure and a score of > 5 as satisfactory preparation. The pooled data from the two hospitals were randomly divided into a training set (n = 303) and a validation set (n = 76) at an 8:2 ratio. A least absolute shrinkage and selection operator (LASSO) logistic regression (LR) model was used for feature selection, a nomogram scoring system was constructed, and models were established using AutoML based on five algorithms. Model performance was evaluated using receiver operating characteristic (ROC) curves, calibration curves, LR-based decision curve analysis (DCA), SHapley Additive exPlanations (SHAP) plots, and force plots. Results Among the 379 patients, 105 (27.7%) experienced bowel preparation failure (BBPS ≤ 5). Twenty-one candidate variables were reduced to 10 by LASSO with 5-fold cross-validation, and these were used to develop a nomogram whose reliability was supported by calibration curves. Using the H2O platform and five algorithms [gradient boosting machine (GBM), deep learning (DL), generalized linear model (GLM), Stacked Ensemble, and distributed random forest (DRF)], 67 models were developed. The Stacked Ensemble model performed best, with an area under the curve (AUC) of 0.871, a LogLoss of 0.403, and a root mean square error (RMSE) of 0.354, outperforming the traditional LR model and the other models.
Variable importance plots indicated significant predictive contributions from factors such as the interval between laxative ingestion and examination, history of constipation, completion of the laxative regimen, age, and presence of a companion during the procedure. Finally, SHAP plots and force plots showed how variable values were distributed across the binary classification predictions and how each variable influenced the predicted outcome. Conclusion The AutoML model based on the Stacked Ensemble algorithm shows clear clinical utility for early prediction of bowel preparation failure risk. In addition, a clinically applicable nomogram scoring tool was constructed.
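
As an illustration of the quantities reported above, the following minimal Python sketch (standard library only) shows how the BBPS outcome label, the 8:2 split, and the three reported metrics (AUC, LogLoss, RMSE) can be computed. It is not the study's code: the function names, the random seed, the plain (unstratified) shuffle, and the choice to compute RMSE on predicted probabilities are all assumptions made for illustration.

```python
import math
import random


def label_failure(bbps_total, threshold=5):
    """Return 1 (bowel preparation failure) if total BBPS <= threshold, else 0."""
    return 1 if bbps_total <= threshold else 0


def split_80_20(n, seed=0):
    """Randomly split indices 0..n-1 into 80% training and 20% validation.

    Assumption: a plain shuffle with a fixed, hypothetical seed; the source
    does not say whether the split was stratified.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(n * 0.8)  # for n = 379 this gives 303, leaving 76
    return indices[:n_train], indices[n_train:]


def auc(y_true, y_prob):
    """AUC as the probability that a random positive case is scored above a
    random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def rmse(y_true, y_prob):
    """Root mean square error between 0/1 labels and predicted probabilities."""
    squared = [(y - p) ** 2 for y, p in zip(y_true, y_prob)]
    return math.sqrt(sum(squared) / len(squared))
```

For example, `split_80_20(379)` yields index lists of length 303 and 76, matching the set sizes reported in the Methods. A higher AUC together with a lower LogLoss and RMSE indicates better discrimination and better-calibrated probabilities, which is the sense in which the Stacked Ensemble model is ranked first.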