We present a novel methodology for exploring 4D seismic data in the context of monitoring subsurface resources. Data-space exploration is a key activity in scientific research, but it has long been overlooked in favor of model-space investigations. Our methodology performs a data-space exploration that aims to define structures in the covariance matrix of the observational errors. It is based on Bayesian inference, where the posterior probability distribution is reconstructed through trans-dimensional (trans-D) Markov chain Monte Carlo (McMC) sampling. The trans-D approach, applied to data-structures (termed "partitions") of the covariance matrix, allows the number of partitions to vary freely within a fixed range during the McMC sampling. Thanks to the trans-D approach, our methodology retrieves data-structures that are fully data-driven and not imposed by the user. We applied our methodology to 4D seismic data, which are generally used to extract information about variations in the subsurface. In our study, we use real data collected in the laboratory, which allows us to simulate different acquisition geometries and different reservoir conditions. Our approach is able to define and discriminate different sources of noise in 4D seismic data, enabling a data-driven evaluation of the quality (so-called "repeatability") of the 4D seismic survey. We find that: (a) trans-D sampling can be effective in defining data-driven data-space structures; (b) our methodology can be used to discriminate between different families of data-structures created by different noise sources. By coupling our methodology with standard model-space investigations, we can validate physical hypotheses about the monitored geo-resources.

Plain Language Summary

The increasing amount of geophysical data available for making inferences about the Earth's properties calls for automated workflows for data preparation, now that expert opinion is becoming too time-consuming and too expensive.
We present a novel approach for geophysical data-mining. Our approach assumes weak prior information about the data-space, that is, about how the data are clustered and how their uncertainties are distributed among them. Based on such prior information, our approach is able to indicate which data volumes coherently represent the initial hypotheses and which need further investigation.
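To make the idea of partitions of the error covariance more concrete, the following minimal Python sketch evaluates a Gaussian log-likelihood in which a diagonal covariance matrix is split into contiguous partitions, each with its own noise scale, and proposes a "birth" move that adds one partition. It is an illustration under stated assumptions and not the parametrization used in the study: the diagonal covariance, the contiguous partitions, the function names `partitioned_log_likelihood` and `propose_birth`, and the parameters `base_sigma` and `max_partitions` are all hypothetical choices made here.

```python
import numpy as np


def partitioned_log_likelihood(residuals, boundaries, scales, base_sigma=1.0):
    """Gaussian log-likelihood with a diagonal, partition-wise scaled covariance.

    residuals  : 1-D array, observed minus predicted data
    boundaries : sorted interior indices splitting the data into partitions
    scales     : one noise scale factor per partition (len(boundaries) + 1)
    base_sigma : nominal standard deviation before scaling (assumed value)
    """
    residuals = np.asarray(residuals, dtype=float)
    boundaries = np.asarray(boundaries, dtype=int)
    scales = np.asarray(scales, dtype=float)
    edges = np.concatenate(([0], boundaries, [residuals.size])).astype(int)
    if scales.size != edges.size - 1:
        raise ValueError("need exactly one scale per partition")
    # Expand the per-partition scales into one standard deviation per datum.
    sigma = base_sigma * np.repeat(scales, np.diff(edges))
    n = residuals.size
    return (-0.5 * np.sum((residuals / sigma) ** 2)
            - np.sum(np.log(sigma))
            - 0.5 * n * np.log(2.0 * np.pi))


def propose_birth(boundaries, scales, n_data, rng, max_partitions=10):
    """Propose one extra partition boundary (trans-D 'birth' move).

    Returns (new_boundaries, new_scales), or None if no move is possible.
    Simplification: both halves of the split partition inherit its scale.
    """
    boundaries = np.asarray(boundaries, dtype=int)
    scales = np.asarray(scales, dtype=float)
    candidates = np.setdiff1d(np.arange(1, n_data), boundaries)
    if scales.size >= max_partitions or candidates.size == 0:
        return None
    new_b = rng.choice(candidates)
    idx = np.searchsorted(boundaries, new_b)  # index of the partition being split
    new_boundaries = np.insert(boundaries, idx, new_b)
    new_scales = np.insert(scales, idx, scales[idx])
    return new_boundaries, new_scales
```

A complete trans-D sampler would also need death and perturbation moves, a prior on the number of partitions, and the corresponding acceptance ratio; none of these is shown here. Because the split partition keeps its scale in this sketch, a birth move on its own leaves the likelihood unchanged, and the data can only favor the new partition once its scale is perturbed in later steps.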