Abstract:
Objective: To develop deep learning models for the diagnosis and risk stratification of internal hemorrhoids in endoscopy.
Methods: Endoscopic images taken above the anal dentate line were collected and divided into a normal group and an internal hemorrhoids group (Task A). Based on the LDRf standard, the internal hemorrhoids group was further classified into Rf0, Rf1 and Rf2 (Task B). Five deep learning models were trained on these two computer vision tasks: Xception, ResNet and EfficientNet (CNN architectures), and ViT and ConvMixer (Transformer or Transformer-inspired architectures). The models were evaluated by accuracy, recall, precision, F1 score and prediction time, and their performance was compared with that of two endoscopists.
Results: All five models performed well on the validation dataset in both tasks. The ConvMixer model performed best (accuracy 0.961 in Task A and 0.911 in Task B), followed by the EfficientNet model (0.956 and 0.901); both were more accurate than the endoscopists (senior: 0.952 and 0.881; junior: 0.913 and 0.832). In terms of prediction time on the validation dataset, all models (<10 s) took significantly less time than the endoscopists (>300 s). Furthermore, Grad-CAM provided visualization and explanation of the models' predictions.
Conclusion: This study trained deep learning models to diagnose and stratify internal hemorrhoids in endoscopy, the best of which achieved higher accuracy than the endoscopists. Computer vision models based on deep learning could assist endoscopists in diagnosing and stratifying internal hemorrhoids, and show promise for future clinical practice.
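The evaluation metrics named in the Methods (accuracy, recall, precision, F1) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function name `classification_metrics`, the toy labels, and the choice of macro averaging across classes are all assumptions made for illustration, since the abstract does not state how per-class scores were aggregated.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Macro averaging (unweighted mean over classes) is an assumption;
    the abstract does not specify the averaging scheme.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gets a false positive
            fn[truth] += 1   # true class gets a false negative

    def safe_div(num, den):
        # Define the ratio as 0.0 when a class has no predictions or no samples.
        return num / den if den else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    n_classes = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }


# Toy Task B example with hypothetical labels (not data from the study).
y_true = ["Rf0", "Rf0", "Rf1", "Rf1", "Rf2", "Rf2"]
y_pred = ["Rf0", "Rf1", "Rf1", "Rf1", "Rf2", "Rf0"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

The same function applies to the binary Task A (normal versus internal hemorrhoids) by passing two-class label lists. For imbalanced classes, a weighted average may be more appropriate than the macro average assumed here.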