Drone swarm coordination, inspired by natural swarm behaviors such as bird flocking, has traditionally relied on rule-based methods, with Craig Reynolds' Boids algorithm as a seminal reference. These rule-based strategies, although functional in many contexts, fail to capture the adaptive learning mechanisms that birds exhibit when navigating dynamic environments. Recognizing this limitation, this paper introduces a multi-agent reinforcement learning (MARL) approach to Boid modeling for drone swarms, motivated by its more faithful emulation of the nuanced learning processes observed in avian species. Our methodology uses reinforcement learning to train individual drones, enabling them to make decisions autonomously based on their local environment and the overall state of the swarm. Unlike traditional rule-based models, our MARL-driven drones continually optimize their individual and collective behaviors, echoing the adaptability of natural flocks. This built-in adaptability yields greater proficiency in navigating complex, changing environments, bringing the simulation closer to the organic flocking dynamics observed in nature. Empirical evaluations show that our MARL Boids not only surpass traditional rule-based models in cohesion, alignment, and separation but also adapt better to environmental variations. Moreover, our model captures flocking behavior effectively and remains robust against external perturbations.
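To make concrete how the three classical Boid terms named above (cohesion, alignment, separation) can be folded into a per-agent learning signal, the sketch below shows one plausible reward shaping for a single drone. This is a minimal illustration, not the paper's actual implementation: the function name `boid_reward`, the weights, and the separation radius are assumptions chosen for exposition.

```python
import numpy as np

def boid_reward(pos, vel, neighbor_pos, neighbor_vel,
                sep_radius=1.0, w_coh=1.0, w_ali=1.0, w_sep=1.5):
    """Hypothetical per-drone reward combining the three Boid terms.

    pos, vel:          (2,) arrays for this drone
    neighbor_pos/vel:  (k, 2) arrays for the k perceived neighbors
    Weights and sep_radius are illustrative assumptions.
    """
    if len(neighbor_pos) == 0:
        return 0.0  # no neighbors in range: nothing to cohere or align with

    # Cohesion: reward staying near the local center of mass.
    centroid = neighbor_pos.mean(axis=0)
    r_coh = -np.linalg.norm(pos - centroid)

    # Alignment: reward matching the neighbors' average velocity.
    mean_vel = neighbor_vel.mean(axis=0)
    r_ali = -np.linalg.norm(vel - mean_vel)

    # Separation: penalize neighbors closer than sep_radius.
    dists = np.linalg.norm(neighbor_pos - pos, axis=1)
    r_sep = -np.sum(np.maximum(0.0, sep_radius - dists))

    return w_coh * r_coh + w_ali * r_ali + w_sep * r_sep
```

Under this kind of shaping, each agent's policy is trained from its own local observations, so flock-level cohesion, alignment, and separation emerge from individually optimized behavior rather than from hard-coded steering rules.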