In the computer vision domain, temporal convolution networks (TCNs) have gained traction owing to their lightweight, robust architectures for sequence-to-sequence prediction tasks. Building on that insight, we propose a novel TCN-based deep learning architecture for biosignal segmentation and anomaly localization, named the multi-stage stacked TCN. Each stage of the architecture uses TCN modules with multiple dilation factors, and convolution-based fusion combines the predictions returned by the stages. Furthermore, to obtain smoother predictions, we introduce a novel loss function based on the first-order derivative. To demonstrate the robustness of our architecture, we evaluate our model on five different tasks related to three 1D biosignal modalities (heart sounds, lung sounds and electrocardiogram). Our proposed framework achieves state-of-the-art performance on all tasks, significantly outperforming the respective previous state-of-the-art models with F1 score gains of up to approximately 9%. The framework also improves on traditional multi-stage TCN models with similar configurations, with F1 score gains of up to approximately 5%. Our model is also interpretable. Using neural conductance, we demonstrate the effectiveness of having TCNs with varying dilation factors. Our visualizations show that the model benefits from feature maps captured at multiple dilation factors, and that the information is effectively propagated through the network such that the final stage produces the most accurate result. © 2023 Elsevier Ltd. All rights reserved.
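As an illustration of two ideas named in the abstract, the following minimal pure-Python sketch shows how the dilation factor widens the span of a 1D convolution, and one plausible form of a first-order-derivative smoothing penalty. It is not the authors' implementation: the function names, the odd-length kernel, the zero padding and the mean-squared-difference form of the penalty are all assumptions made for illustration, since the abstract does not specify them.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1D dilated convolution (cross-correlation form).

    Assumptions (not from the paper): odd kernel length, zero padding.
    """
    k = len(kernel)
    centre = (k - 1) // 2
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position t.
            idx = t + (j - centre) * dilation
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of dilated layers with stride 1."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def first_order_smoothness(probs):
    """Hypothetical smoothing penalty: mean squared first-order difference
    between predictions at adjacent time steps."""
    if len(probs) < 2:
        return 0.0
    diffs = [probs[t + 1] - probs[t] for t in range(len(probs) - 1)]
    return sum(d * d for d in diffs) / len(diffs)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(dilated_conv1d(x, [1.0, 0.0, -1.0], dilation=1))
    print(dilated_conv1d(x, [1.0, 0.0, -1.0], dilation=2))
    print(receptive_field(3, [1, 2, 4]))
    print(first_order_smoothness([0.0, 0.0, 1.0, 1.0]))
```

Under these assumptions, stacking layers with dilations 1, 2 and 4 and a kernel of size 3 covers 15 time steps, which suggests why modules with different dilation factors capture context at different temporal scales. The penalty is zero for a constant prediction sequence and grows with abrupt frame-to-frame changes.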