Smart manufacturing scheduling (SMS) requires a high degree of flexibility to cope with changes in operational-level planning in today's production environments, which are usually subject to high uncertainty. In a scenario as complex as the real job shop, modelling SMS as a Markov decision process (MDP) and solving it with deep reinforcement learning (DRL) is a research field of growing interest. This approach makes high levels of flexibility attainable by promoting process automation, autonomy in decision making, and the ability to act in real time when faced with disturbances and disruptions in a highly dynamic environment. This paper addresses the problem of scheduling a quasi-realistic job shop environment, characterised by machines that receive jobs from buffers accumulating numerous jobs that use a wide variety of parts and multimachine routes with a diverse number of operation phases, by developing a digital twin of the job shop based on an MDP and the DRL methodology. The approach consists of: modelling the job shop scheduling environment with OpenAI Gym; designing an observation space with 18 job features; designing an action space composed of three priority heuristic rules; shaping a single reward function with a multi-objective characteristic; and using the implementation of the proximal policy optimisation (PPO) algorithm from the Stable Baselines 3 library. This modelling approach, dubbed job shop smart manufacturing scheduling (JSSMS), is characterised by a deterministic formulation and implementation. The model is validated by comparing it with several of the best-known heuristic priority rules.
The main findings show that this methodology replicates, to a great extent, the positive aspects of heuristic rules and mitigates the negative ones, achieving more balanced behaviour across most of the measures established as performance indicators and outperforming heuristic rules from this multi-objective perspective. Finally, further research is oriented towards dynamic and stochastic approaches that address the job shop reality in an Industry 4.0 context.
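
As a minimal illustration of the MDP formulation summarised above, the following Python sketch shows a deterministic toy environment in which each action selects a priority rule and the reward combines two objectives in a single scalar. It is not the JSSMS model: it uses only the standard library and merely mimics the reset/step interface of Gym-style environments. The single machine, the three rules (FIFO, SPT, EDD), the three-feature observation, the reward weights, and all names (`Job`, `ToyBufferEnv`, `run_fixed_rule`) are assumptions made for illustration; the abstract specifies neither the rules nor the reward terms.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Job:
    job_id: int
    arrival: int      # position in the buffer's arrival order
    proc_time: float  # processing time on the machine
    due_date: float


# Hypothetical action space: each action index picks one priority rule.
# The job with the smallest key is dispatched next.
RULES = {
    0: ("FIFO", lambda j: j.arrival),    # first in, first out
    1: ("SPT", lambda j: j.proc_time),   # shortest processing time
    2: ("EDD", lambda j: j.due_date),    # earliest due date
}


class ToyBufferEnv:
    """Deterministic one-machine, one-buffer toy with a reset/step interface."""

    def __init__(self, jobs: List[Job], w_flow: float = 0.5, w_tardy: float = 0.5):
        self._initial = list(jobs)
        self.w_flow = w_flow      # assumed weight on completion (flow) time
        self.w_tardy = w_tardy    # assumed weight on tardiness
        self.reset()

    def _observe(self) -> Tuple[float, float, float]:
        # Illustrative features: buffer size, remaining work, minimum slack.
        remaining = sum(j.proc_time for j in self.buffer)
        min_slack = min(
            (j.due_date - self.clock - j.proc_time for j in self.buffer),
            default=0.0,
        )
        return (float(len(self.buffer)), remaining, min_slack)

    def reset(self) -> Tuple[float, float, float]:
        self.clock = 0.0
        self.buffer = list(self._initial)
        return self._observe()

    def step(self, action: int) -> Tuple[Tuple[float, float, float], float, bool, Dict]:
        _, key = RULES[action]
        job = min(self.buffer, key=key)   # apply the chosen priority rule
        self.buffer.remove(job)
        self.clock += job.proc_time       # machine processes the job
        tardiness = max(0.0, self.clock - job.due_date)
        # Single scalar reward with a multi-objective character (assumed form).
        reward = -(self.w_flow * self.clock + self.w_tardy * tardiness)
        done = not self.buffer
        return self._observe(), reward, done, {"job": job.job_id, "tardiness": tardiness}


def run_fixed_rule(env: ToyBufferEnv, action: int) -> float:
    """Baseline: apply the same rule at every step and return the total reward."""
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(action)
        total += reward
    return total
```

Running `run_fixed_rule` once per action gives the fixed-rule baselines against which a learned policy, which may switch rules from step to step, would be compared. In the setting described in the abstract, an environment of this kind would instead subclass the Gym environment class and be passed to the PPO implementation for training.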