Emotion recognition in textual data is a rapidly evolving field with diverse applications. While state-of-the-art (SOTA) models based on pre-trained large language models (LLMs) have achieved strong results, existing approaches often overlook fine-grained emotional nuances within individual sentences and the influence of contextual information. Additionally, despite the growing interest in personalized Natural Language Processing, recent studies have highlighted limitations in the literature, particularly the lack of explainability methods for interpreting the improvements observed in these models. This study uses the CLARIN-Emo dataset to demonstrate the effectiveness of integrating personalized and contextual information for accurate emotion detection. By framing textual emotion recognition as a sequence sentence classification (SSC) task and leveraging transformer-based architectures, the proposed multi-source fusion approach significantly outperforms the baseline model, which considers each sentence in isolation. Furthermore, a personalized method, referred to as UserID, captures user-specific characteristics by assigning each annotator a unique identifier, significantly enhancing emotion prediction accuracy. This work also introduces an extension of Data Maps that differentiates dynamic training metrics to analyze the models' training behaviors. The results validate the capability of this approach to visually interpret training behavior and to facilitate performance comparisons between models.
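
To make the Data Maps idea concrete, the following is a minimal sketch rather than the implementation used in this work. It assumes the standard Data Maps training-dynamics metrics: confidence is the mean probability assigned to the gold label across epochs, and variability is the standard deviation of that probability. It also assumes, purely for illustration, that two models can be compared by taking the per-example difference of these metrics. The function names `data_map_metrics` and `metric_shift` and all input values are hypothetical.

```python
from statistics import mean, pstdev


def data_map_metrics(gold_probs_per_epoch):
    """Compute (confidence, variability) for every training example.

    gold_probs_per_epoch[e][i] is the probability the model assigned to the
    gold label of example i at the end of epoch e (hypothetical input layout).
    """
    n_examples = len(gold_probs_per_epoch[0])
    metrics = []
    for i in range(n_examples):
        # Trajectory of the gold-label probability for example i over epochs.
        trajectory = [epoch[i] for epoch in gold_probs_per_epoch]
        confidence = mean(trajectory)     # mean over epochs
        variability = pstdev(trajectory)  # population std over epochs
        metrics.append((confidence, variability))
    return metrics


def metric_shift(baseline_metrics, candidate_metrics):
    """Per-example change in (confidence, variability) from baseline to candidate.

    This differencing step is an assumption made for illustration only.
    """
    return [
        (c_conf - b_conf, c_var - b_var)
        for (b_conf, b_var), (c_conf, c_var) in zip(baseline_metrics, candidate_metrics)
    ]


# Made-up probabilities for two examples over two epochs.
baseline = data_map_metrics([[0.5, 0.2], [0.5, 0.4]])
candidate = data_map_metrics([[0.6, 0.4], [0.8, 0.8]])
shift = metric_shift(baseline, candidate)
```

Under these assumptions, a positive confidence shift for an example suggests that the candidate model (for instance, one with contextual or UserID information) learns that example more reliably than the baseline, and plotting the shifts gives a visual comparison between models.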