Background: To address the problems of traditional manual teleoperation, this project explores a more natural method of manipulator teleoperation. It sets up a fused virtual teleoperation scene and uses hand gesture recognition to convert the operator's hand motions into execution instructions for the manipulator, avoiding the complexity of controlling translation and attitude through separate hand controllers. Three-dimensional scene reconstruction gives the operator a clear understanding of the whole teleoperation scene and lets the operator perceive the spatial relationship between the operator, the manipulator, and the target in real time, thus improving the efficiency of teleoperation. Methods: This project used mature commercial depth camera devices (such as Kinect, Leap Motion, Intel Creative, and ZED cameras) as input devices to identify the operator's hand movements, and used a multi-joint manipulator for teleoperation. The camera was installed on the wrist of the manipulator, or on another suitable part, to obtain images of the operational target without changing the robot arm control system. The position and attitude of the operational target were obtained from the depth image data together with identification of the manipulator's spatial position. Once the operational target had been identified, the virtual hand, the manipulator, and the target were generated in the same virtual scene using virtual reality and augmented reality technology, and were presented to the operator stereoscopically through a 3D display device. Based on stereo video display technology, the spatial relationship between the manipulator and the operational target was restored in the virtual reality system, and visual spatial localization was performed with a SLAM method, giving the operator visual perception of the teleoperation scene over a large field of view. 
A mapping between the hand model and the manipulator was established to achieve precise control and to enhance the operator's perception of the scene. Finally, an experimental system was built to evaluate the applicability and effectiveness of the interaction, and it was modified according to the feedback obtained. On this basis, an effective human-machine interaction method based on visual gesture recognition was proposed. Conclusion: Experiments show that gesture operation is slightly less accurate than handle operation, but it is easier to perform and feels natural and smooth, which is consistent with human-computer interaction habits. For scenes with low precision requirements, the advantages of gesture operation are clear, and it can serve as an operation mode for teleoperation. It has wide application prospects in teleoperation systems and tasks, such as space stations, extraterrestrial exploration, robots, and UAVs.
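
The abstract does not specify the form of the hand-to-manipulator mapping. As an illustration only, one common choice is a scaled position offset that is clamped to the manipulator workspace. In the minimal sketch below, the function name `map_hand_to_manipulator`, the `Workspace` type, the scale factor, and all numeric values are hypothetical and are not taken from the described system.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Workspace:
    """Axis-aligned reachable box of the manipulator (assumed, in metres)."""
    lower: Vec3
    upper: Vec3


def map_hand_to_manipulator(hand_pos: Vec3, hand_origin: Vec3,
                            robot_origin: Vec3, scale: float,
                            workspace: Workspace) -> Vec3:
    """Map a tracked hand position to an end-effector target position.

    The hand displacement from its calibration origin is scaled, added to
    the manipulator's reference position, and clamped per axis so that the
    target never leaves the workspace box.
    """
    target = []
    for h, h0, r0, lo, hi in zip(hand_pos, hand_origin, robot_origin,
                                 workspace.lower, workspace.upper):
        value = r0 + scale * (h - h0)           # scaled offset mapping
        target.append(min(max(value, lo), hi))  # clamp to workspace limits
    return tuple(target)


if __name__ == "__main__":
    # Hypothetical calibration values, chosen only for illustration.
    box = Workspace(lower=(-0.5, -0.5, 0.0), upper=(0.5, 0.5, 1.5))
    print(map_hand_to_manipulator((0.5, 0.0, 1.0), (0.0, 0.0, 0.0),
                                  (0.0, 0.0, 0.0), 2.0, box))
```

A scale factor below 1 trades workspace coverage for finer positioning, which is one possible way to narrow the accuracy gap with handle operation reported in the conclusion; orientation mapping and filtering of tracking noise are omitted here.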