This paper addresses the challenge of detecting small underwater targets with sonar in uncertain underwater environments. Specifically, we propose a comprehensive approach for imaging sonar data acquisition, enhancement, and target detection based on convolutional networks (SAED-NET), which accounts for sonar data transmission, acoustic image quality, small-target features, and detection effectiveness. First, to improve the quality of sonar images, we adapt a sonar data conversion protocol and an improved bilinear interpolation algorithm for coordinate conversion, together with a comparative filtering and enhancement algorithm for acoustic noise suppression and edge preservation, and use them to construct a dataset suitable for small underwater target detection in a real environment. Second, the detection part of SAED-NET employs a supervised learning-based acoustic target detection and tracking network, in which the detection head and loss function are improved, deformable convolution is introduced into the backbone network, and echo interference in the acoustic image is eliminated. In addition, the Kalman tracking state vectors are reorganised and target association matching is performed, so that the scheme can be substituted into various SORT-type tracking networks to reduce detection jumps. In real experiments with SAED-NET on underwater vehicles, target detection accuracy improved from 86.86% (69.59% for small targets) to 97.87% (83.90% for small targets), and mAP rose from 62.70% to a maximum of 70.92%. The approach also demonstrates excellent tracking performance.
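To make the coordinate conversion step concrete, the following is a minimal sketch of resampling a polar (range by beam) sonar frame onto a Cartesian grid with standard bilinear interpolation. It is not the improved algorithm proposed in this work. The function name `polar_to_cartesian`, its parameters, and the assumed geometry (range bins evenly spaced from 0 to `max_range`, beams evenly spaced over a symmetric field of view) are illustrative assumptions only.

```python
import numpy as np

def polar_to_cartesian(polar, max_range, fov_rad, out_size):
    """Resample a (n_ranges, n_beams) polar frame onto an out_size x out_size grid.

    Assumed (hypothetical) geometry: row 0 is range 0, the last row is max_range,
    and beams are evenly spaced over [-fov_rad / 2, +fov_rad / 2].
    """
    n_r, n_b = polar.shape
    half = fov_rad / 2.0

    # Cartesian grid: x is across-track, y is along-track (forward).
    xs = np.linspace(-max_range * np.sin(half), max_range * np.sin(half), out_size)
    ys = np.linspace(0.0, max_range, out_size)
    X, Y = np.meshgrid(xs, ys)

    # Back-project every output pixel to polar coordinates.
    R = np.hypot(X, Y)
    TH = np.arctan2(X, Y)  # angle measured from the forward axis
    valid = (R <= max_range) & (np.abs(TH) <= half)

    # Fractional indices into the polar frame, clipped to its bounds.
    ri = np.clip(R / max_range * (n_r - 1), 0, n_r - 1)
    bi = np.clip((TH + half) / fov_rad * (n_b - 1), 0, n_b - 1)

    r0 = np.floor(ri).astype(int)
    b0 = np.floor(bi).astype(int)
    r1 = np.minimum(r0 + 1, n_r - 1)
    b1 = np.minimum(b0 + 1, n_b - 1)
    fr = ri - r0  # fractional offset along range
    fb = bi - b0  # fractional offset along beam

    # Bilinear blend of the four neighbouring polar samples.
    out = (polar[r0, b0] * (1 - fr) * (1 - fb)
           + polar[r0, b1] * (1 - fr) * fb
           + polar[r1, b0] * fr * (1 - fb)
           + polar[r1, b1] * fr * fb)

    # Pixels outside the sonar fan are set to zero.
    return np.where(valid, out, 0.0)
```

Back-projecting each output pixel into the polar frame, rather than scattering polar samples forward, avoids holes in the far field, where adjacent beams diverge.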