Design, participants, and sample size
The EVasCu study is a cross-sectional study based on data obtained from healthy adult subjects in the city of Cuenca, Spain, collected from June to December 2022. The inclusion and exclusion criteria for participants are shown in Additional file 1: Table S1. This study is reported according to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement, the guideline for reporting observational studies [10].
The sample size was calculated using Epidat software and indicated that 355 participants would provide an estimated effect size of 1, with an alpha risk of 0.05 and an absolute precision level of 0.04 to detect a statistically significant result for the EVA index [11]. Subjects meeting the inclusion and exclusion criteria were invited to participate in the study, and eventually, 390 participants were enrolled.
Ethical considerations
The research protocol of this study was approved by the Clinical Research Ethics Committee of the Cuenca Health Area (REG: 2022/PI2022). Written informed consent to participate was obtained from all subjects included in the study. All procedures performed in this study were in accordance with the Declaration of Helsinki and its later amendments or comparable ethical standards for experiments involving humans [12].
Variables
Vascular parameters
Arterial stiffness was measured using oscillometric techniques with the Mobil-O-Graph® (IEM GmbH) and VaSera (FUKUDA-DENSHI) devices. The Mobil-O-Graph® measures aortic pulse wave velocity (a-PWv) and the augmentation index (AIx75), each of which was calculated as the mean of two measurements taken 5 min apart, while the VaSera measures the cardio-ankle vascular index (CAVI). These parameters were measured in a quiet place after a 5-min rest period, using a cuff size appropriate to the participant’s arm and/or lower limb circumference.
Mean and maximal intima-media thickness (IMT) were measured by ultrasound with the Sonosite SII device (Sonosite Inc., Bothell, Washington, USA). IMT was calculated as the mean of the measurements from the right and left carotid arteries.
Clinical parameters
Pulse pressure (PP) was calculated as the difference between mean systolic blood pressure (SBP) and mean diastolic blood pressure (DBP). Blood pressure was measured in a quiet place after a 5-min rest period using the Omron® M5-I monitor (Omron Healthcare UK Ltd.) with a cuff size appropriate to the participant’s arm circumference. SBP and DBP were each calculated as the mean of two measurements taken 5 min apart.
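The PP calculation can be illustrated with a minimal Python sketch. The function name and the readings are hypothetical and are shown only to make the arithmetic explicit; they are not study data.

```python
from statistics import mean

def pulse_pressure(sbp_readings, dbp_readings):
    """Return pulse pressure (mmHg) as mean SBP minus mean DBP."""
    if not sbp_readings or not dbp_readings:
        raise ValueError("at least one SBP and one DBP reading are required")
    return mean(sbp_readings) - mean(dbp_readings)

# Hypothetical participant: two readings taken 5 min apart (mmHg).
pp = pulse_pressure([122, 118], [80, 76])  # mean SBP 120, mean DBP 78
```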
Advanced glycation end products (AGEs) were measured by skin autofluorescence (SAF) with the AGE Reader® device. AGEs were calculated as the mean of the measurements from both arms. The mean for each arm was calculated as the mean of three repeated measurements.
Biochemical parameters
Glucose and ultrasensitive C-reactive protein (CRP) were measured on a Roche Diagnostics® Cobas 8000 system, and insulin was measured on the Abbott® Architect platform. Glycated hemoglobin A1c (HbA1c) was determined by high-performance liquid chromatography using the ADAMS A1c HA-8180V analyser from A. Menarini Diagnostics®. Samples were collected between 8 a.m. and 9 a.m. after 12 h of fasting.
Statistical analysis
Confirmatory factor analysis
To examine the construct validity of a single-factor model to measure the EVA index, different models including vascular, clinical, and biochemical parameters were examined to determine which variables from these three groups showed the best fit. Regression coefficients > 0.3 and a statistical significance of p 0.96 and the standardized root mean square residual (SRMR) was < 0.008 [13].
Choosing the optimal number of risk groups
The optimal number of risk groups (K) was determined using the variables that best formed the EVA index construct. To accomplish this task, values of K ranging from 2 to 5 were explored, and several indices were used to identify the optimal number of groups in the dataset. First, the Calinski‒Harabasz [14] and Davies‒Bouldin [15] indices, which relate the dispersion within groups to the dispersion between groups, were calculated. A higher Calinski‒Harabasz value indicates better separation between groups and less dispersion within each group, while a lower Davies‒Bouldin value indicates better separation between groups and greater cohesion within each group.
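As an illustration of how these two indices behave, the following Python sketch implements their standard definitions with Euclidean distance. It is not the code used in the study (which was run in STATA and MATLAB); the toy data and the function names are assumptions chosen for the example.

```python
from math import dist

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def split_by_label(X, labels):
    """Group observations by their assigned label."""
    groups = {}
    for x, lab in zip(X, labels):
        groups.setdefault(lab, []).append(x)
    return groups

def calinski_harabasz(X, labels):
    """Between-group over within-group dispersion; higher is better."""
    groups = split_by_label(X, labels)
    n, k = len(X), len(groups)
    overall = centroid(X)
    between = within = 0.0
    for pts in groups.values():
        c = centroid(pts)
        between += len(pts) * dist(c, overall) ** 2
        within += sum(dist(x, c) ** 2 for x in pts)
    return (between / (k - 1)) / (within / (n - k))

def davies_bouldin(X, labels):
    """Mean worst-case ratio of group scatter to centroid distance; lower is better."""
    groups = split_by_label(X, labels)
    cents = {lab: centroid(pts) for lab, pts in groups.items()}
    scatter = {lab: sum(dist(x, cents[lab]) for x in pts) / len(pts)
               for lab, pts in groups.items()}
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in groups if j != i)
             for i in groups]
    return sum(worst) / len(worst)

# Toy data (hypothetical): two tight, well-separated groups of four points.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
compact = [0, 0, 0, 0, 1, 1, 1, 1]   # labels that follow the true structure
mixed = [0, 1, 0, 1, 0, 1, 0, 1]     # labels that ignore it

ch_compact, ch_mixed = calinski_harabasz(X, compact), calinski_harabasz(X, mixed)
db_compact, db_mixed = davies_bouldin(X, compact), davies_bouldin(X, mixed)
```

In practice the labels for each candidate K would come from a clustering algorithm, and the K with the highest Calinski‒Harabasz value and the lowest Davies‒Bouldin value would be favoured.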
Second, the silhouette index, which computes the mean distance of each observation to the observations in its own group (cohesion) and the mean distance to the observations in the other groups (separation) [16], was also calculated. The silhouette index varies between − 1 and 1, where a value close to 1 suggests that the observations within a group are very close to each other (cohesion) and far from the observations of other groups (separation).
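The silhouette calculation can be sketched in Python as follows, using the standard definition with Euclidean distance. The handling of single-member groups (score 0) and the toy data are assumptions of this example and are not taken from the study.

```python
from math import dist

def silhouette_index(X, labels):
    """Mean silhouette over all observations, in the range [-1, 1]."""
    groups = {}
    for x, lab in zip(X, labels):
        groups.setdefault(lab, []).append(x)
    if len(groups) < 2:
        raise ValueError("the silhouette index needs at least two groups")
    scores = []
    for x, lab in zip(X, labels):
        own = groups[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention assumed here for singleton groups
            continue
        # Cohesion: mean distance to the other members of the same group
        # (the distance from x to itself is 0, so divide by len(own) - 1).
        a = sum(dist(x, y) for y in own) / (len(own) - 1)
        # Separation: mean distance to the nearest other group.
        b = min(sum(dist(x, y) for y in pts) / len(pts)
                for other, pts in groups.items() if other != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy data (hypothetical): two tight, well-separated groups of four points.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
sil_compact = silhouette_index(X, [0, 0, 0, 0, 1, 1, 1, 1])
sil_mixed = silhouette_index(X, [0, 1, 0, 1, 0, 1, 0, 1])
```

Labels that follow the true structure give a value close to 1, whereas labels that cut across it give a much lower value.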
Cluster calculation and validation
To assign each subject to one of the vascular aging risk groups (K), two different unsupervised clustering methodologies were used. First, the K-means algorithm was implemented, which assigns each subject to the group with the closest centroid, where the centroid is the representative point of a group in the multidimensional space and corresponds to the mean of all points in that group [17]. This process was repeated up to a maximum of 100 times or until the relative changes in centroid positions were negligible. Second, hierarchical clustering was also used to assign subjects to each vascular aging risk group based on the four variables selected in the construct. Hierarchical clustering is an unsupervised approach that constructs a tree structure (dendrogram), where each level represents a partition of the data into different groups [18] (Additional file 1: Fig. S1). Finally, the concordance index was calculated as the proportion of subjects assigned to the same group by both methodologies.
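The K-means loop and the concordance calculation can be sketched in Python as follows. This is a simplified illustration, not the study code: initial centroids are drawn at random from the data, convergence is judged by the absolute (rather than relative) centroid shift, and group labels from the two methods are matched by trying every relabelling, which is an assumption of this example because cluster labels are arbitrary. All names, the tolerance, and the toy data are hypothetical.

```python
import random
from itertools import permutations
from math import dist

def kmeans(X, k, max_iter=100, tol=1e-6, seed=0):
    """Plain K-means: returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(X, k)          # assumed initialisation scheme
    labels = []
    for _ in range(max_iter):             # at most 100 iterations by default
        # Assignment step: each observation goes to its closest centroid.
        labels = [min(range(k), key=lambda j: dist(x, centroids[j])) for x in X]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [x for x, lab in zip(X, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty group's centroid
        shift = max(dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:                  # centroid movement is negligible
            break
    return labels, centroids

def concordance_index(labels_a, labels_b):
    """Proportion of subjects placed in the same group by both methods,
    after the relabelling of labels_b that maximises agreement."""
    groups_a, groups_b = sorted(set(labels_a)), sorted(set(labels_b))
    if len(groups_a) != len(groups_b):
        raise ValueError("both partitions must have the same number of groups")
    best = 0
    for perm in permutations(groups_a):
        mapping = dict(zip(groups_b, perm))
        agree = sum(a == mapping[b] for a, b in zip(labels_a, labels_b))
        best = max(best, agree)
    return best / len(labels_a)

# Toy data (hypothetical): two tight, well-separated groups of four points.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
km_labels, km_centroids = kmeans(X, k=2)
# Stand-in for the labels a second method (e.g. hierarchical clustering) might give.
other_labels = [1, 1, 1, 1, 0, 0, 0, 0]
agreement = concordance_index(km_labels, other_labels)
```

Trying every relabelling is affordable here because K does not exceed 5, so at most 120 permutations are evaluated.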
Furthermore, to verify the similarity in the assignment of subjects to different groups with these two methodologies, the adjusted Rand index (ARI) was calculated [19]. The ARI is a measure of similarity between two partitions of a dataset and quantifies the agreement between the group assignments produced by two different clustering methodologies, corrected for chance. The ARI ranges from −1 to 1, where a value of 1 indicates a perfect match between group assignments, while a value of 0 indicates that the assignments agree no more than would be expected by chance.
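For reference, the ARI can be computed from pair counts as in the following Python sketch, which follows the standard contingency-table formula. The function name and the example labels are hypothetical, and returning 1.0 for the degenerate case in which the denominator is zero is a convention assumed here.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same subjects."""
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both label lists must have the same length")
    # Pairs of subjects grouped together in both partitions.
    pairs_joint = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs grouped together within each partition separately.
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / comb(n, 2)   # agreement expected by chance
    max_index = (pairs_a + pairs_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case (assumed convention)
    return (pairs_joint - expected) / (max_index - expected)

# Identical partitions with swapped label names, and a fully crossed pair.
ari_same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
ari_crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the ARI is based on pairs of subjects rather than on label names, it does not require the groups from the two methods to be matched beforehand.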
Dimension reduction
The multivariable approach to the construct proposed in this study makes it difficult to visually analyse the dispersion of observations and their respective group assignments. To address this issue, dimensionality reduction was performed using principal component analysis (PCA) [20]. PCA allows the original variables to be transformed into orthogonal components that preserve as much variance as possible. This enables visual exploration and detection of patterns and trends through scatter plots of the subjects, facilitating the assessment of group clarity and separation and providing insight into the importance of each variable in forming the principal components.
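A minimal PCA sketch in Python with NumPy is given below. It is not the study code; standardising the variables before the decomposition, the simulated four-variable data, and all names are assumptions of this example.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project standardised data onto its first principal components.

    Returns (scores, components, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)
    # Standardise each variable (assumed here; columns must not be constant).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the orthonormal principal directions, ordered by variance.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]
    scores = Z @ components.T
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, components, explained[:n_components]

# Simulated data (hypothetical): 50 subjects, 4 correlated variables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 1))
X = latent @ np.ones((1, 4)) + 0.5 * rng.normal(size=(50, 4))

scores, components, explained = pca_scores(X, n_components=2)
```

The two columns of `scores` can be drawn as a scatter plot coloured by group assignment, and the entries of `components` indicate how strongly each original variable contributes to each principal component.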
All of the above analyses were also performed separately by sex. Statistical analyses were performed using STATA 15 and MATLAB 2022b.