As chemical processes continue to grow in complexity, identifying an accurate process model has become an important task for automatic control and optimal design. Deep learning methods have proven effective for chemical process identification in recent years. To address the characteristics of chemical processes, such as temporal correlation, nonlinearity, high dimensionality, and strong coupling, a chemical process modeling method is proposed that combines a spatio-temporal attention long short-term memory structure with a novel second-order optimization algorithm (STA-SO-LSTM). First, a second-order LSTM back-propagation algorithm is proposed to improve model accuracy and training speed. This optimization algorithm uses the second derivatives of the neuron activation functions together with gradient information to estimate the inverse Hessian matrix without any matrix inversion. Then, to account for the correlations among process variables and their temporal characteristics, a model combining spatio-temporal attention with LSTM is adopted. To demonstrate the effectiveness of the proposed network structure and algorithm, comparative experiments are conducted on the Tennessee-Eastman (TE) process and fractionator datasets. The experimental results show that the structure combining LSTM with the spatio-temporal attention mechanism performs well in establishing dynamic models and that, compared with several traditional and recently proposed optimization algorithms, the proposed second-order algorithm achieves higher accuracy and faster convergence.
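The idea of maintaining an inverse curvature estimate without ever inverting a matrix can be illustrated with a generic rank-one update. The sketch below is not the STA-SO-LSTM algorithm itself, whose update rule is not given here. It assumes, purely for illustration, that the curvature matrix is approximated by a damped sum of gradient outer products and applies the standard Sherman-Morrison identity. The names `sherman_morrison_update`, `running_inverse`, and `damping` are hypothetical.

```python
import numpy as np


def sherman_morrison_update(P, g):
    """Given P = inv(A) with A symmetric, return inv(A + g g^T).

    Uses the Sherman-Morrison identity, so no matrix inversion is performed.
    """
    Pg = P @ g                      # P g (equals (g^T P)^T because P is symmetric)
    denom = 1.0 + g @ Pg            # scalar 1 + g^T P g, positive when P is positive definite
    return P - np.outer(Pg, Pg) / denom


def running_inverse(grads, damping=1.0):
    """Track inv(damping * I + sum_k g_k g_k^T) over a sequence of gradients.

    Assumption for illustration only: the curvature is approximated by a
    damped sum of gradient outer products, not by the paper's estimate.
    """
    n = grads.shape[1]
    P = np.eye(n) / damping         # inverse of the initial matrix damping * I
    for g in grads:
        P = sherman_morrison_update(P, g)
    return P


# Synthetic gradients stand in for real back-propagated gradients.
rng = np.random.default_rng(0)
grads = rng.normal(size=(20, 5))

P = running_inverse(grads, damping=1.0)

# Reference value computed with an explicit inverse, used only as a check.
A = np.eye(5) + grads.T @ grads
max_error = np.max(np.abs(P - np.linalg.inv(A)))
```

Each update costs on the order of n² operations for n parameters, whereas an explicit inversion costs on the order of n³. This is the general motivation for inversion-free second-order schemes.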