Visual-inertial systems (VINS) are widely used to provide accurate motion estimates for autonomous robots and mobile devices, and filtering-based VINS remain popular because of their low memory requirements, CPU usage, and processing time. However, most filtering-based VINS algorithms must be initialized from a static state, so initialization under unknown dynamic motion is often their bottleneck. Moreover, optimization-based VINS algorithms usually produce more accurate results than filtering-based ones because they optimize iteratively. To make filtering-based VINS more effective, this work makes two contributions. First, to bootstrap filtering-based VINS with a high-quality initialization under both static and dynamic conditions, we propose a robust initialization method that aligns the relative poses estimated by vision with preintegrated inertial measurement unit (IMU) measurements. Second, to improve the accuracy of motion estimation, we add the gravity vector to the state vector so that it is optimized online, and we compare the performance of four different gravity vector parameterizations. In addition, we reject outliers during optical flow tracking to improve both initialization and state estimation accuracy. Our approach is implemented in the OpenVINS framework and validated on public datasets and in real-world experiments. The experimental results show that our method initializes robustly under different conditions without prior knowledge of the system's state or motion, and that it significantly improves state estimation accuracy, achieving competitive estimation performance.
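To make the idea of a gravity vector parameterization concrete, the following is a minimal sketch of one commonly used option: a two-degree-of-freedom update on a sphere of fixed radius. The abstract does not say which four parameterizations are compared, so this is an illustrative assumption rather than the method evaluated here. The names `GRAVITY_NORM`, `tangent_basis`, and `update_gravity`, as well as the magnitude of 9.81 m/s^2, are hypothetical choices made for this example.

```python
import math

GRAVITY_NORM = 9.81  # assumed gravity magnitude in m/s^2 (not taken from the source)


def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def tangent_basis(g):
    """Return two orthonormal vectors spanning the plane orthogonal to g."""
    d = normalize(g)
    # Pick a helper axis that cannot be parallel to d.
    helper = [1.0, 0.0, 0.0] if abs(d[0]) < 0.9 else [0.0, 0.0, 1.0]
    b1 = normalize(cross(d, helper))
    b2 = cross(d, b1)  # unit length because d and b1 are orthonormal
    return b1, b2


def update_gravity(g, w1, w2):
    """Apply a 2-DoF correction (w1, w2) in the tangent plane of g,
    then rescale so the magnitude stays at GRAVITY_NORM."""
    b1, b2 = tangent_basis(g)
    moved = [g[i] + w1 * b1[i] + w2 * b2[i] for i in range(3)]
    return [GRAVITY_NORM * x for x in normalize(moved)]
```

With this form, an estimator would carry only the two scalars `w1` and `w2` as the gravity correction, which keeps the magnitude constraint satisfied by construction. A plain three-parameter vector, by contrast, leaves the magnitude free to drift.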