The Glycine subgenus Soja includes two species, cultivated soybean (Glycine max (L.) Merr.) and its progenitor, wild soybean (G. soja). However, a morphologically intermediate form, the semi-wild soybean (G. gracilis), exists between the two species, and its taxonomic position is under debate. In this study, we evaluated phylogenetic relationships and occurrence events within the subgenus Soja based on genetic variation at SSR loci, using a set of accessions comprising wild soybeans (≤ 3.0 g 100-seed weight), semi-wild soybeans (> 3.0 g) and soybean landraces (≥ 4.0 g). The results showed that semi-wild soybean accessions collected in natural fields should be treated as a variant of G. soja rather than of G. max. They were genetically differentiated from the soybean landraces, and this held even for large-seeded semi-wild soybean accessions (6.01-9.0 g), whose seed weights overlap with or exceed those of soybean landraces. Evolutionary bottleneck analysis indicated that semi-wild soybean is not a transitional form in the domestication of cultivated soybean from wild soybean. G. soja contained two genetically differentiated forms, a small-seeded type (typical, plus 2.01-2.50 g) and a large-seeded type (2.51-3.0 g). Genetically, the large-seeded wild soybean was closer to the semi-wild soybean, although morphologically it resembled the typical wild soybean. Ancestry analysis confirmed that cultivated soybean genes have introgressed into modern wild soybean populations. The green cotyledon character and other rare characters in wild soybean, such as white flowers, grey pubescence, no seed bloom, and coloured seed coats (brown, green, and yellow), were shown to be involved in introgression from cultivated soybeans.