The bin packing problem (BPP) has recently attracted considerable research interest owing to its widespread applications in logistics and warehousing. Optimizing bin packing so that more objects fit into each bin is essential, and the packing order and placement position of the objects are the two key quantities to optimize. However, existing optimization methods for BPP, such as the genetic algorithm (GA), suffer from high time cost and relatively low accuracy, which makes them difficult to deploy in realistic scenarios. To address these gaps, we present a novel optimization method for 2D and 3D BPP with regularly shaped objects based on deep reinforcement learning (DRL), which maximizes space utilization and minimizes the number of bins used. First, we propose an end-to-end DRL neural network built on a modified Pointer Network, consisting of an encoder, a decoder, and an attention module, to obtain the optimal object packing order. Second, following a top-down operation mode, a placement strategy based on a height map determines the placement positions of the ordered objects in the bins, preventing the objects from colliding with the bins and with objects already placed. Third, the reward and loss functions are defined in terms of compactness, pyramid, and the number of bins used, and the DRL neural network is trained within an on-policy actor-critic framework. Finally, we conduct extensive experiments to evaluate the proposed method and demonstrate that it achieves a 3% improvement and more than 50x time saving over the GA. Furthermore, a robotic packing experiment validates its generalization capability in a realistic environment.
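To make the height-map placement step more concrete, the following is a minimal sketch of top-down placement on an integer grid. It is an illustration under stated assumptions, not the implementation used in this work: the names `find_placement` and `place`, the integer grid discretization, and the "lowest resting height, first found" selection rule are all hypothetical choices made for this example.

```python
from typing import List, Optional, Tuple

# height_map[x][y] holds the current top height of grid column (x, y).
HeightMap = List[List[int]]
Size = Tuple[int, int, int]      # (w, d, h): extent along x, y and z
Position = Tuple[int, int, int]  # (x, y, z) of the corner nearest the origin


def find_placement(height_map: HeightMap, bin_height: int,
                   size: Size) -> Optional[Position]:
    """Return the lowest feasible resting position, or None if nothing fits.

    Assumption: choosing the lowest resting height (ties broken by scan
    order) is an illustrative heuristic, not necessarily the paper's rule.
    """
    w, d, h = size
    length, width = len(height_map), len(height_map[0])
    best: Optional[Position] = None
    # Scanning only in-bounds footprints rules out collisions with bin walls.
    for x in range(length - w + 1):
        for y in range(width - d + 1):
            # Lowered from above, the object rests on the tallest column
            # under its footprint, so it cannot intersect placed objects.
            z = max(height_map[i][j]
                    for i in range(x, x + w)
                    for j in range(y, y + d))
            if z + h > bin_height:
                continue  # the object would protrude above the bin
            if best is None or z < best[2]:
                best = (x, y, z)
    return best


def place(height_map: HeightMap, bin_height: int,
          size: Size) -> Optional[Position]:
    """Place one object and update the height map in place."""
    pos = find_placement(height_map, bin_height, size)
    if pos is None:
        return None
    x, y, z = pos
    w, d, h = size
    for i in range(x, x + w):
        for j in range(y, y + d):
            height_map[i][j] = z + h  # new top surface over the footprint
    return pos


if __name__ == "__main__":
    hm = [[0] * 4 for _ in range(4)]  # empty bin with a 4 x 4 footprint
    for obj in [(2, 2, 2), (2, 2, 2), (4, 4, 2), (1, 1, 1)]:
        print(obj, "->", place(hm, 4, obj))
```

In this sketch the objects are placed in the order they are given; in the method described above, that order would come from the DRL network, and a `None` result would correspond to opening a new bin.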