Over the last few decades, a large body of research has been carried out on Byzantine Fault Tolerance (BFT) systems. This research has brought forward new techniques, including, but not limited to, techniques for ordering operations (Abraham et al., 2018; Buchman, 2016; Guo et al., 2020; Bessani et al., 2014; Duan et al., 2018) and state transfer (Bessani et al., 2013; Distler, 2021; Eischer et al., 2019) on networks that suffer from Byzantine faults. More recently, ongoing research on distributed ledgers has re-ignited interest in BFT, due to its high throughput when compared to other alternatives for Byzantine consensus (Vukolić, 2016). In this paper, we present three contributions covering several aspects, including modular and extensible framework design and implementation, system optimization through the development of better networking alternatives, a greater use of parallelism, several ordering protocol improvements, and an extensive comparative assessment of previous state-of-the-art approaches. First, we introduce Atlas, an open-source modular BFT framework that aims to support the research and development of highly efficient BFT protocols by decoupling traditionally entangled sub-protocols, e.g., the consensus primitive from execution (Bessani et al., 2014), and the deferment of log management to replicated services from state transfer. Atlas further allows modules to be provided that can be reused across different BFT approaches, such as deterministic and probabilistic/randomized models. Second, we present FeBFT, a new BFT implementation built upon Atlas that combines pre-existing, proven ideas from PBFT, namely its 3-phase consensus and view-change protocol. 
This base approach is then extended with novel optimizations of the protocol, namely multi-leader proposals (Stathakopoulou et al., 2019), multi-instance consensus execution (Stathakopoulou et al., 2022; Behl et al., 2015), and a configurable batching solution, which together allow us to reduce latency while improving throughput at the same time. Third, we offer a comprehensive evaluation comparing our work, Atlas (Neto et al., 2024a) with FeBFT (Official febft repository, 2024), against other state-of-the-art BFT-SMR implementations, namely BFT-SMaRt (Bessani et al., 2014) and Themis (Rüsch et al., 2019). With these contributions, we aim to lay the groundwork to: (i) improve reusability and hence productivity in BFT(-SMR) development; (ii) increase system safety, performance, and scalability, and reduce recovery time, with the proposed optimizations; (iii) draw insights on the bottlenecks preventing order-of-magnitude improvements in BFT processing from a systems perspective; and lastly, (iv) improve reproducibility between different BFT (sub-)protocols by allowing for true apples-to-apples comparisons.