To address the insufficient use of musical structure in existing music generation methods, this article proposes a structured model based on motifs and phrases. Starting from the compositional structure of motifs and phrases, deep learning techniques are used to learn composition. For music generation, a Scratch music generation model that produces music in Pianoroll format is constructed using a generative adversarial network based on emotion and temporal structure, with convolutional neural networks in both the generator and the discriminator to speed up training. The effectiveness and practicality of the two models were verified through multiple comparative experiments and algorithm effectiveness experiments. The method extracts structural features of music by designing feature extractors at different musical granularities. Through feature expression functions defined at multiple scales of musical granularity, the structure embedded in the music itself is incorporated into the reward function. Model parameters are updated by forward and backward propagation, and dropout is used to reduce overfitting. Test results show that the model has a degree of generalization ability, with an accuracy of 90% and high recall. Experimental results further show that the method generates better music than reward functions based on manual rules or on contextual (before-and-after) relationships. It thus removes the need for music-theory knowledge to formulate rules, and it compensates for the insufficient use of musical structure information in context-based network models. © 2024 Slovene Society Informatika. All rights reserved.
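For readers unfamiliar with the Pianoroll format mentioned in the abstract, the following is a minimal sketch of the representation only, not the article's model. It assumes a binary time-by-pitch matrix with 128 MIDI pitches; the note tuple layout `(pitch, start, duration)`, the 16-step length, and the helper names `notes_to_pianoroll` and `active_pitches` are hypothetical and chosen for illustration.

```python
# Illustrative sketch of a binary pianoroll (time steps x MIDI pitches).
# Assumptions (not from the article): notes are (pitch, start_step,
# duration_steps) tuples, and one short motif spans 16 time steps.

def notes_to_pianoroll(notes, n_steps, n_pitches=128):
    """Return an n_steps x n_pitches matrix with 1 where a pitch sounds."""
    roll = [[0] * n_pitches for _ in range(n_steps)]
    for pitch, start, duration in notes:
        if not 0 <= pitch < n_pitches:
            raise ValueError(f"pitch {pitch} out of range")
        # Clip notes that start before step 0 or run past the last step.
        for step in range(max(start, 0), min(start + duration, n_steps)):
            roll[step][pitch] = 1
    return roll


def active_pitches(roll, step):
    """List the pitches sounding at a given time step, in ascending order."""
    return [pitch for pitch, on in enumerate(roll[step]) if on]


# A hypothetical four-note motif: C4, E4, then G4 and C5 held together.
motif = [(60, 0, 4), (64, 4, 4), (67, 8, 8), (72, 8, 8)]
roll = notes_to_pianoroll(motif, n_steps=16)
```

A matrix of this shape can be treated like a single-channel image, which is one reason convolutional layers are a natural fit for a generator and discriminator operating on pianorolls.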