### Statistical analyses

The first analysis calculated Bartlett's test statistic to determine whether the variances of two experimental groups from a microarray data set were equal [see Additional file 4]. The null hypothesis was that there was no significant difference between the variances of the two groups, and the significance level for rejecting it was set to 0.05. The data set was composed of 12 samples: 5 from non-failing hearts and 7 from DCM patients. Out of 8068 genes, 526 were statistically significant before correction for multiple testing (P < 0.05), one gene remained under the significance level after Bonferroni correction, and one gene remained under it after B&H correction. After performing the permutation test, 327 genes were found to be significant (P < 0.05). That is, for most genes the null hypothesis of equal variances between the two groups could not be rejected, which is commonly expected in typical microarray data analyses.
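
To make the per-gene computation concrete, the following is a minimal sketch of Bartlett's test for two groups, written with the Python standard library only. It is not the implementation used in this study: the function name `bartlett_two_groups` and the expression values are hypothetical, and only the group sizes (5 and 7) mirror the data set described above. With two groups the statistic is referred to a chi-square distribution with one degree of freedom, whose upper tail can be written with the complementary error function.

```python
import math
from statistics import variance


def bartlett_two_groups(a, b):
    """Bartlett's test for equal variances of two samples.

    Returns (statistic, p_value). Assumes each group has at least two
    values and a non-zero sample variance (log(0) is undefined).
    """
    groups = [a, b]
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    variances = [variance(g) for g in groups]  # sample variances (n - 1)

    # Pooled variance, weighted by each group's degrees of freedom.
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)

    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (
        sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)
    ) / (3 * (k - 1))
    # Guard against tiny negative values caused by rounding.
    statistic = max(0.0, numerator / correction)

    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical log-expression values for one gene (5 controls, 7 patients).
non_failing = [7.1, 6.8, 7.4, 7.0, 6.9]
dcm = [7.9, 6.2, 8.4, 7.1, 6.5, 8.8, 7.3]
stat, p = bartlett_two_groups(non_failing, dcm)
print(f"Bartlett statistic = {stat:.3f}, P = {p:.4f}")
```

Applying the function to every gene yields one raw P-value per gene, which is the input to the multiple-testing corrections reported above.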

The second analysis implemented the *t*-test (two-sample, equal variances, two-tailed) to assess whether there was a statistically significant difference between the means of two (normally distributed) experimental groups from the same microarray data set analysed above [see Additional file 5]. The null hypothesis was that there was no difference between the means of the two groups, and the significance level for rejecting it was set to 0.05. The raw P-values of 1413 genes were under the significance level (P < 0.05), 39 genes remained under it after B&H correction, and only two genes remained under it after Bonferroni correction. These results were consistent with our expectations: B&H identified more genes than Bonferroni did, which shows that the former tends to be less stringent. After performing the permutation test, 1398 genes were found to be significantly differentially expressed (P < 0.05). In addition, we noted that the raw P-values of some of the genes filtered out by Bonferroni were well below the significance level, i.e. they were potentially significant under a less conservative correction approach. For example, the raw P-values of ACVR1 and CFHR1 were 0.0004 and 0.004, respectively, and their P-values after Bonferroni correction were above 0.9. However, under the permutation-based test these two genes fell below the significance threshold (corrected P-values: 0.0001 and 0.001 for ACVR1 and CFHR1, respectively). This, as expected, illustrates the statistical power of permutation-based procedures for multiple testing.
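
The difference in stringency between the two corrections follows directly from their definitions. Bonferroni multiplies every raw P-value by the number of tests, so with 8068 genes a raw P-value of 0.0004 becomes 0.0004 × 8068 ≈ 3.2 and is capped at 1, whereas B&H scales each P-value by the number of tests divided by its rank. The sketch below illustrates both adjustments; the function names and the five raw P-values are hypothetical and serve only as an illustration, not as a reproduction of the software used here.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted P-values: p * m, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that adjusted
    # values never increase as the raw P-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_significant(p_values, alpha=0.05):
    """Number of P-values strictly below the significance level."""
    return sum(p < alpha for p in p_values)


# Hypothetical raw P-values for five tests.
raw = [0.0004, 0.004, 0.02, 0.03, 0.2]
print("raw:       ", count_significant(raw))
print("Bonferroni:", count_significant(bonferroni(raw)))
print("B&H:       ", count_significant(benjamini_hochberg(raw)))
```

For any set of raw P-values, the B&H-adjusted value of a test is never larger than its Bonferroni-adjusted value, which is why B&H cannot select fewer features at the same significance level.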

The third analysis implemented the Chi-square test on categorical data derived from a genetic variation data set (SNPs) [see Additional file 6]. The problem was to determine statistically significant genetic variations among the SNPs of three ethnic groups: African-American, European-American and Chinese. The data encode genotype values for each SNP under each group [15]. This data set was composed of 34 samples: 10 from African-Americans, 12 from European-Americans and 11 from Han-Chinese people. The null hypothesis was that there were no differences in genetic variation among the three groups, and the significance level for rejecting it was set to 0.05. The raw P-values of 153 SNPs, out of 334, were under the significance level (P < 0.05). Bonferroni correction identified only eight SNPs with P-values below the significance level, and B&H correction identified 131. In contrast, the permutation test identified more features than B&H: 153 SNPs with significant P-values. These results are consistent with those reported by Carlson et al. (2003) [16], who found that only 48% of the SNPs were shared by African-Americans and European-Americans. In our study, the permutation-based adjustment found that 55% of SNPs showed no significant differences among the three populations being analysed. These results again confirm the statistical power of permutation-based procedures for multiple testing.
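
For a single SNP, the Chi-square test can be sketched as a test of independence on a population-by-genotype contingency table. The example below is an illustration under stated assumptions rather than the code used in this study: the function names and the genotype counts are hypothetical, and only the row totals (10, 12 and 11) follow the group sizes given above. A 3 × 3 table has four degrees of freedom, and for an even number of degrees of freedom the upper tail of the chi-square distribution has a closed form, so no external library is needed.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a table.

    `table` is a list of rows of observed counts. Assumes every row
    and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


def chi_square_sf_even_df(statistic, df):
    """Upper-tail probability of a chi-square variable, even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form implemented for even df only")
    half = statistic / 2
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


# Hypothetical genotype counts (columns: AA, AB, BB) for one SNP.
counts = [
    [8, 1, 1],  # African-American (n = 10)
    [2, 5, 5],  # European-American (n = 12)
    [1, 2, 8],  # Han-Chinese (n = 11)
]
stat, df = chi_square_independence(counts)
p = chi_square_sf_even_df(stat, df)
print(f"chi-square = {stat:.2f}, df = {df}, P = {p:.4f}")
```

With group sizes this small, many expected cell counts fall below five, where the chi-square approximation is known to be unreliable; this is one practical reason to complement the asymptotic P-value with a permutation-based one.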

A fourth analysis implemented the ANOVA test to assess whether there was a statistically significant difference between the means of three (normally distributed) experimental groups. Samples in this data set were obtained from heart tissue of healthy donors, as well as from donors suffering from either dilated or ischemic cardiomyopathy [see Additional file 7]. We used the ANOVA test to look for differences among the three populations evaluated, because the *t*-test is designed to perform pair-wise comparisons only. The null hypothesis was that there were no differences between the means of the three groups, and the significance level for rejecting it was set to 0.05. The raw P-values of 6371 genes were under the significance level (P < 0.05), 3331 genes remained under it after B&H correction, and only nine genes remained under it after Bonferroni correction. After performing the permutation test, 6262 genes were found to be significantly differentially expressed (P < 0.05). The genes reported as significantly differentially expressed after Bonferroni correction were not included in the set of potentially significant genes detected by the permutation test. In addition, we compared our results against those previously reported by Kittleson et al. (2005) [17] and found that most genes reported by them as significantly differentially expressed were also below the significance level when our permutation test was performed, or when P-values were corrected via the B&H method. In contrast, only one of the genes reported by Kittleson et al. was also below the significance level after Bonferroni correction. This analysis perhaps best illustrates the strength of the permutation test for identifying potential biomarkers of disease.
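
The combination of a one-way ANOVA with a permutation test can be sketched as follows. The exact permutation scheme used in this study is not restated here, so the sketch makes a common assumption: group labels are shuffled at random, the F statistic is recomputed for each shuffle, and the P-value is the fraction of shuffles whose F is at least as large as the observed one. The function names, the number of permutations, the seed and the expression values are all hypothetical.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of samples.

    Assumes the within-group sum of squares is non-zero.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, means) for x in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Permutation P-value for the F statistic via label shuffling."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]

    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Cut the shuffled values back into groups of the original sizes.
        shuffled, start = [], 0
        for n in sizes:
            shuffled.append(pooled[start:start + n])
            start += n
        if f_statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical log-expression values for one gene in three groups.
healthy = [6.9, 7.1, 7.0, 6.8]
dilated = [7.8, 8.1, 7.9, 8.3]
ischemic = [7.4, 7.6, 7.2, 7.5]
groups = [healthy, dilated, ischemic]
print(f"F = {f_statistic(groups):.2f}")
print(f"permutation P = {permutation_p_value(groups):.4f}")
```

Because the reference distribution is built from the data themselves, this P-value does not rely on the normality assumption of the parametric F-test. With only two groups, the F statistic equals the square of the equal-variance *t* statistic, so the same shuffling scheme also covers the two-group comparison.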