Existing remote sensing image recognition suffers from poor imaging quality, small sample sizes, and the difficulty of fully extracting the hidden discriminative features in an image with a single attention mechanism. To address these problems, this paper proposes a method for detecting destroyed regions in remote sensing images based on the MA-CapsNet (multi-attention capsule encoder-decoder network). The method first applies the BSRGAN model for super-resolution processing of the original destruction data and expands the processed images with various data augmentation operations. The augmented data are then fed into the proposed MA-CapsNet network, where low-level features are extracted by a cascaded attention mechanism consisting of a Swin Transformer and a Convolutional Block Attention Module (CBAM); finally, after the CapsNet module captures precise target features, the resulting feature map is fed into the classifier to complete detection of the destroyed area. In destruction detection experiments on remote sensing images acquired after the 2010 Haiti earthquake, the MA-CapsNet model reaches an accuracy of 99.64%, outperforming current state-of-the-art models such as ResNet and Vision Transformer (ViT), as well as the ablation network models. The method improves the model's representational ability and addresses the low accuracy of destroyed-region detection in remote sensing images with complex backgrounds, offering theoretical guidance for rapidly assessing destruction and damage from remote sensing imagery.
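To make the cascaded attention stage more concrete, the sketch below illustrates the CBAM half of the mechanism: channel attention (spatial pooling followed by a shared two-layer MLP) and then spatial attention (channel pooling followed by a mixing step). This is a minimal NumPy illustration with random weights, not the paper's implementation; in MA-CapsNet the weights are learned, the spatial mixing is a 7×7 convolution, and the CBAM block is cascaded with a Swin Transformer.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam(feature_map, reduction=2, rng=None):
    """Illustrative CBAM-style attention on one feature map of shape (C, H, W).

    Weights are random here for demonstration; in the actual network they
    would be learned, and the spatial step would be a 7x7 convolution.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    C, H, W = feature_map.shape

    # Channel attention: avg- and max-pool over space, shared 2-layer MLP.
    avg = feature_map.mean(axis=(1, 2))                # (C,)
    mx = feature_map.max(axis=(1, 2))                  # (C,)
    W1 = rng.standard_normal((C // reduction, C)) * 0.1
    W2 = rng.standard_normal((C, C // reduction)) * 0.1
    mlp = lambda v: W2 @ np.maximum(W1 @ v, 0.0)       # ReLU hidden layer
    ch_att = sigmoid(mlp(avg) + mlp(mx))               # (C,)
    x = feature_map * ch_att[:, None, None]

    # Spatial attention: avg- and max-pool over channels, then mix
    # (a 1x1 mixing stands in for CBAM's 7x7 convolution).
    avg_s = x.mean(axis=0)                             # (H, W)
    max_s = x.max(axis=0)                              # (H, W)
    w = rng.standard_normal(2) * 0.1
    sp_att = sigmoid(w[0] * avg_s + w[1] * max_s)      # (H, W)
    return x * sp_att[None, :, :]
```

Because both attention maps pass through a sigmoid, the block rescales each channel and each spatial position by factors in (0, 1), leaving the feature map's shape unchanged; the output can therefore be passed directly to the downstream CapsNet module.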