Table 2 Confusion matrices and accuracy statistics of the RF, XGBoost, SVM and DT models for biogeographic origin inference among three continental populations (African, European and East Asian)

From: Comprehensive evaluations of individual discrimination, kinship analysis, genetic relationship exploration and biogeographic origin prediction in Chinese Dongxiang group by a 60-plex DIP panel

Random forest (RF)

| Prediction of biogeographic origin | Africa | Europe | East_Asia |
| --- | --- | --- | --- |
| Africa | 165 | 2 | 0 |
| Europe | 0 | 122 | 0 |
| East_Asia | 0 | 1 | 703 |

Accuracy of biogeographic origin inference: 0.9970, 95% CI: (0.9912 ~ 0.9994)

Extreme gradient boosting (XGBoost)

| Prediction of biogeographic origin | Africa | Europe | East_Asia |
| --- | --- | --- | --- |
| Africa | 162 | 1 | 0 |
| Europe | 3 | 123 | 2 |
| East_Asia | 0 | 1 | 701 |

Accuracy of biogeographic origin inference: 0.9930, 95% CI: (0.9855 ~ 0.9972)

Support vector machine (SVM)

| Prediction of biogeographic origin | Africa | Europe | East_Asia |
| --- | --- | --- | --- |
| Africa | 163 | 3 | 1 |
| Europe | 1 | 120 | 3 |
| East_Asia | 1 | 2 | 699 |

Accuracy of biogeographic origin inference: 0.9889, 95% CI: (0.9803 ~ 0.9945)

Decision tree (DT)

| Prediction of biogeographic origin | Africa | Europe | East_Asia |
| --- | --- | --- | --- |
| Africa | 148 | 10 | 4 |
| Europe | 14 | 103 | 9 |
| East_Asia | 3 | 12 | 690 |

Accuracy of biogeographic origin inference: 0.9476, 95% CI: (0.9319 ~ 0.9606)
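
The reported accuracies can be recomputed from the counts in the table: accuracy is the sum of the diagonal (correctly assigned individuals) divided by the total number of individuals, 993 for every model. The sketch below does this in plain Python and also computes an exact (Clopper-Pearson) binomial 95% interval. The table does not state how the intervals were obtained, so the choice of Clopper-Pearson is an assumption here, as is the reading that rows are predicted origins and columns are reference origins (the accuracy itself does not depend on that orientation). The function names `accuracy` and `clopper_pearson` are illustrative and do not come from the paper.

```python
import math

# Confusion matrices copied from Table 2.
# Assumed orientation: rows = predicted origin, columns = reference origin,
# in the order Africa, Europe, East_Asia.
matrices = {
    "RF":      [[165, 2, 0], [0, 122, 0], [0, 1, 703]],
    "XGBoost": [[162, 1, 0], [3, 123, 2], [0, 1, 701]],
    "SVM":     [[163, 3, 1], [1, 120, 3], [1, 2, 699]],
    "DT":      [[148, 10, 4], [14, 103, 9], [3, 12, 690]],
}


def accuracy(matrix):
    """Return (correct, total, correct / total) for a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct, total, correct / total


def _upper_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p), with each term evaluated in log space."""
    if x <= 0:
        return 1.0
    if x > n or p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for k in range(x, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def _solve_increasing(f, target, iters=100):
    """Bisection on [0, 1] for an increasing function f with f(p) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided binomial interval for x successes out of n trials."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: _upper_tail(x, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha / 2,
    # i.e. P(X >= x + 1) equals 1 - alpha / 2.
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: _upper_tail(x + 1, n, p), 1 - alpha / 2)
    return lower, upper


for name, matrix in matrices.items():
    correct, total, acc = accuracy(matrix)
    low, high = clopper_pearson(correct, total)
    print(f"{name}: {correct}/{total} = {acc:.4f}, 95% CI ({low:.4f}, {high:.4f})")
```

The four accuracies computed this way (990/993, 986/993, 982/993 and 941/993) round to the values reported in the table. If the intervals printed by the sketch differ from the reported ones in the last digit, the original analysis may have used a different interval method.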