With the rapid development of remote sensing platforms, the volume of remote sensing images has grown exponentially, and selecting appropriate images from such large-scale remote sensing data has become a fundamental challenge in remote sensing applications. Currently, extracting deep features with deep Convolutional Neural Networks (CNNs) is the dominant approach to remote sensing image retrieval because of its effectiveness. However, the high dimensionality of these features complicates similarity measurement in image retrieval, reducing both processing speed and retrieval accuracy. Hashing maps images from a high-dimensional feature space into compact binary codes and can therefore be used in remote sensing image retrieval to reduce feature dimensionality efficiently. Accordingly, this paper proposes a ResNet-based adaptive dilated and structural embedding asymmetric hashing algorithm for remote sensing image retrieval. First, an adaptive dilated convolution module is designed to capture multi-scale features of remote sensing images adaptively without introducing additional model parameters. Second, to address the insufficient extraction of structural information in remote sensing imagery, an existing structural embedding module is optimized and improved to extract geometric structure features effectively. Finally, to tackle the low retrieval performance caused by intra-class differences and inter-class similarities, pairwise similarity constraints are introduced to preserve the similarity of remote sensing images in both the original feature space and the hash space. Comparative experiments on four datasets (UCM, NWPU, AID, and PatternNet) were conducted to demonstrate the effectiveness of the proposed method. The mean average precision rates for 64-bit hash codes reached 98.07%, 93.65%, 97.92%, and 97.53% on these four datasets, respectively, showing the superiority of the proposed approach over existing deep hashing image retrieval methods. In addition, four ablation experiments were carried out to verify each module of the proposed method. The ablation results showed that using only the ResNet18 backbone yielded a mean average precision of 68.9%; introducing the structural self-similarity coding module raised it to 81.71%, an improvement of 12.81 percentage points; introducing the adaptive dilated convolution module added a further 10.53 percentage points; and adding the pairwise similarity constraint module increased the mean average precision to 98.07%, a further gain of 5.83 percentage points. In summary, the experimental results confirm the effectiveness of the proposed network framework, which improves the retrieval accuracy of remote sensing images while retaining the advantages of deep hash features. © 2024 Science Press. All rights reserved.
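To make the pairwise similarity constraint concrete, the following is a minimal, illustrative PyTorch sketch of a similarity-preserving hashing loss. The abstract does not give the exact loss formulation, so the label-derived similarity matrix, the cosine similarity used for the original feature space, the relaxed (tanh-range) hash codes, and the 0.1 weight on the quantization term are all assumptions for illustration; the function and variable names are hypothetical rather than the authors' own.

```python
import torch
import torch.nn.functional as F


def pairwise_similarity_loss(features: torch.Tensor,
                             hash_logits: torch.Tensor,
                             labels: torch.Tensor) -> torch.Tensor:
    """Illustrative pairwise similarity-preserving loss for deep hashing.

    features:    (N, D) real-valued deep features from the backbone.
    hash_logits: (N, K) relaxed hash codes (e.g. tanh-activated, in [-1, 1]).
    labels:      (N,) integer class labels used to build the similarity matrix.
    """
    n, k = hash_logits.shape

    # Ground-truth similarity: S_ij = 1 if images i and j share a label, else 0.
    s = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()

    # Cosine similarity in the original (real-valued) feature space, mapped to [0, 1].
    f = F.normalize(features, dim=1)
    sim_feat = (f @ f.t() + 1.0) / 2.0

    # Scaled inner product in the relaxed hash space, also mapped to [0, 1].
    sim_hash = ((hash_logits @ hash_logits.t()) / k + 1.0) / 2.0

    # Preserve pairwise similarity with respect to the labels and with respect
    # to the original feature space, as described in the abstract.
    loss_label = F.mse_loss(sim_hash, s)
    loss_feat = F.mse_loss(sim_hash, sim_feat)

    # Quantization penalty pushing relaxed codes toward {-1, +1}.
    loss_quant = (hash_logits.abs() - 1.0).pow(2).mean()

    return loss_label + loss_feat + 0.1 * loss_quant
```

In such a scheme, the relaxed codes would typically be binarized with a sign function only at indexing and query time, while training uses the continuous relaxation above so that gradients can flow through the hash layer.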