Multi-sensor information fusion plays a critical role in hydroturbine bearing fault diagnosis. Traditional approaches often struggle to capture complex spatiotemporal dependencies, which limits their ability to exploit the latent features in multi-sensor data. To address this challenge, this paper proposes a Multi-Scale Spatiotemporal Graph Neural Network (MSTGNN) that models the spatiotemporal correlations within sensor data. First, an adaptive receptive field convolutional layer extracts key information from each sensor. Then, a dynamic graph structure is constructed to capture spatial dependencies among sensors. A multi-scale spatiotemporal convolution module subsequently extracts temporal features, enhancing the model’s ability to identify fault patterns across different time scales. Experiments on two bearing case studies show that MSTGNN achieves accuracies of 99.96% on the Harbin Institute of Technology (HIT) bearing dataset and 99.27% on a thrust bearing dataset from a hydropower station in Sichuan Province, exceeding the best-performing baseline method by 2.29% and 1.13%, respectively. Finally, ablation studies confirm the effectiveness of each proposed component, further validating the potential of MSTGNN for hydroturbine bearing fault diagnosis.
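
To make the dynamic graph and multi-scale steps described above more concrete, the following is a minimal illustrative sketch and not the MSTGNN implementation. The abstract does not state how the graph is built or which kernels are used, so everything here is an assumption: sensors are linked to their k most similar neighbors by absolute cosine similarity within one window, spatial mixing is a single row-normalized neighborhood average, and fixed moving averages of several widths stand in for learned multi-scale temporal convolutions. All function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length signals (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dynamic_adjacency(signals, k):
    """Assumed graph rule: link each sensor to its k most similar sensors.

    The adjacency is rebuilt from the current window, which is what makes
    it "dynamic" in this sketch. Edge weights are |cosine similarity|.
    """
    n = len(signals)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sims = [
            (abs(cosine_similarity(signals[i], signals[j])), j)
            for j in range(n)
            if j != i
        ]
        sims.sort(key=lambda pair: (-pair[0], pair[1]))  # most similar first
        for weight, j in sims[:k]:
            adj[i][j] = weight
            adj[j][i] = weight  # keep the graph symmetric
    return adj


def graph_aggregate(signals, adj):
    """One spatial mixing step: weighted mean of a sensor and its neighbors."""
    n = len(signals)
    length = len(signals[0])
    mixed = []
    for i in range(n):
        # Self-loop weight of 1.0 so isolated sensors keep their own signal.
        weights = [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        total = sum(weights)
        mixed.append(
            [
                sum(weights[j] * signals[j][t] for j in range(n)) / total
                for t in range(length)
            ]
        )
    return mixed


def moving_average(x, width):
    """Moving average without padding; output length is len(x) - width + 1."""
    return [sum(x[t:t + width]) / width for t in range(len(x) - width + 1)]


def multi_scale_features(x, widths=(2, 4, 8)):
    """Peak magnitude of the signal after smoothing at each temporal scale."""
    return [max(abs(v) for v in moving_average(x, w)) for w in widths]
```

In this sketch a rapidly alternating signal is averaged away at every width, whereas a slow step change survives the short widths and vanishes only once the kernel spans the whole window. That is the intuition behind examining several time scales at once. A trained model would replace the fixed averages with learned kernels.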