The main contribution of this study is the introduction of deep reinforcement learning (DRL) into a model predictive control (MPC) framework, together with a comprehensive economic objective that includes fuel cell degradation costs, lithium battery aging costs, and hydrogen consumption costs. This approach mitigates the inherent shortcomings of DRL, namely poor generalization and lack of adaptability, thereby significantly improving the robustness of economic driving decisions in unknown scenarios. In this study, an MPC framework was developed for the energy management problem of fuel cell vehicles, and a Bi-directional Long Short-Term Memory (Bi-LSTM) neural network was used to construct a vehicle speed predictor. The predictor's accuracy was verified through comparative analysis, and it was then used as the prediction model for the DRL agent. Unlike a strategy optimized over an entire driving cycle, the model-based DRL agent learns the optimal action for each vehicle state. Simulations evaluated the impact of different predictors and prediction horizons on hydrogen economy, and comprehensive comparative analysis verified the adaptability of the proposed strategy in different driving environments, the stability of battery state maintenance, and its advantages in delaying energy system degradation.
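The receding-horizon structure described above can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the persistence forecast stands in for the Bi-LSTM predictor, a brute-force search over constant fuel cell power levels stands in for the DRL agent, and all cost coefficients, the power-demand model, and function names are assumptions.

```python
import numpy as np

H = 5             # prediction horizon in steps (assumed)
C_H2 = 0.02       # hydrogen cost per kW of fuel cell power (assumed)
C_FC_DEG = 0.5    # fuel cell degradation cost per kW of power transient (assumed)
C_BAT_AGE = 0.01  # battery aging cost per kW of battery throughput (assumed)

def predict_speed(history, horizon=H):
    """Placeholder predictor: persistence forecast repeating the last speed.
    The paper uses a Bi-LSTM network here instead."""
    return np.full(horizon, history[-1])

def demand_power(speed):
    """Toy longitudinal model: power demand (kW) grows with speed."""
    return 0.5 * speed + 0.01 * speed ** 2

def horizon_cost(p_fc, p_demand, p_fc_prev):
    """Comprehensive economic cost over the horizon: hydrogen consumption,
    fuel cell degradation (power transients), and battery aging (throughput)."""
    p_bat = p_demand - p_fc                      # battery covers the remaining demand
    h2 = C_H2 * np.sum(p_fc)
    fc_deg = C_FC_DEG * np.sum(np.abs(np.diff(np.r_[p_fc_prev, p_fc])))
    bat_age = C_BAT_AGE * np.sum(np.abs(p_bat))
    return h2 + fc_deg + bat_age

def mpc_step(speed_history, p_fc_prev, candidates=np.linspace(0, 60, 61)):
    """One receding-horizon step: predict speeds over the horizon, evaluate
    candidate fuel cell power levels, and return only the first move."""
    v_pred = predict_speed(speed_history)
    p_dem = demand_power(v_pred)
    costs = [horizon_cost(np.full(H, p), p_dem, p_fc_prev) for p in candidates]
    return candidates[int(np.argmin(costs))]

# Usage: roll the controller over a short synthetic speed trace.
speeds = [20.0, 25.0, 30.0, 28.0, 26.0]
p_fc = 10.0
for t in range(1, len(speeds)):
    p_fc = mpc_step(speeds[:t + 1], p_fc)
print(round(p_fc, 2))
```

At each step only the first action of the optimized horizon is applied and the horizon rolls forward, which is what lets the strategy adapt to unseen driving conditions rather than committing to a single plan for the whole cycle.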