The large-scale image-based localization (IBL) problem involves matching a query image with a database image to determine the geolocation of the query. A major challenge in this problem stems from significant variations between images captured at the same location, including different viewpoints, illumination conditions, and seasonal changes. To address this issue, we recognize the potential advantages of integrating difficult positive samples into the training process. Consequently, we introduce a novel retrieval-based framework meticulously designed to harness the advantages presented by these difficult positive samples. A pivotal component is the proposed dual-attention-transformer-based semantic reranking module, which leverages semantic segmentation to preserve local feature points. This module, powered by the dual-attention-transformer, extracts nuanced global-to-local information via channel self-attention and window self-attention, thereby facilitating sample augmentation and final reranking. Additionally, we introduce the adaptive triplet loss, a dynamic mechanism incorporating weighted difficult positive samples into supervised information, which strengthens the model's robustness. We extensively evaluate our framework on various city-level datasets and demonstrate its superiority over state-of-the-art methods. Furthermore, an exhaustive ablation study systematically validates the effectiveness of each individual component, underscoring their contributions to the proposed methodology.