Table 3 CeVD case identification with DAD and EMR

From: Cerebrovascular disease case identification in inpatient electronic medical record data using natural language processing

| Model | Sensitivity % (95% CI) | Specificity % (95% CI) | PPV % (95% CI) | NPV % (95% CI) | F1 % | Accuracy % (95% CI) |
|---|---|---|---|---|---|---|
| ICD-10-CA codes in DAD | 25.0 (20.6–29.8) | 99.3 (98.9–99.6) | 82.6 (74.5–88.5) | 90.8 (90.3–91.3) | 38.4 | 90.5 (89.4–91.5) |
| CUI + TF-IDF + RF | 65.8 (60.7–70.7) | 98.5 (98.0–99.0) | 85.9 (81.5–89.3) | 95.5 (94.9–96.1) | 74.1 | 94.7 (93.8–95.4) |
| CUI + word count + XGBoost | 68.1 (63.0–72.8) | 98.6 (98.1–99.0) | 86.9 (82.7–90.2) | 95.8 (95.2–96.4) | 76.2 | 95.0 (94.2–95.7) |
| CUI + TF-IDF + XGBoost* | 70.0 (65.0–74.7) | 99.1 (98.7–99.3) | 87.8 (83.7–91.0) | 97.1 (96.6–97.5) | 77.8 | 96.5 (95.8–97.0) |
| BOW + TF-IDF + XGBoost | 59.2 (53.9–64.3) | 98.7 (98.1–99.1) | 85.5 (80.9–89.2) | 94.7 (94.1–95.3) | 69.4 | 94.0 (93.1–94.8) |

  1. Values in bold indicate the best result across approaches for that metric.
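
To make the relationship between the columns concrete, the following is a minimal sketch of how an F1 score can be derived from PPV (precision) and sensitivity (recall) as their harmonic mean. The function name `f1_from_ppv_sensitivity` is hypothetical, and the source does not state how its F1 values were computed. Applying the standard formula to the rounded DAD row reproduces the reported 38.4, while the machine-learning rows differ from this point estimate by a few tenths of a point, possibly because of rounding or averaging across evaluation splits.

```python
def f1_from_ppv_sensitivity(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of PPV (precision) and sensitivity (recall).

    Inputs and output share the same units (here, percentages).
    Hypothetical helper for illustration; not taken from the source.
    """
    if ppv + sensitivity == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# "ICD-10-CA codes in DAD" row: PPV 82.6 %, sensitivity 25.0 %
f1_dad = f1_from_ppv_sensitivity(82.6, 25.0)
print(round(f1_dad, 1))  # 38.4
```

The low F1 for the DAD row follows directly from the formula: a harmonic mean is pulled toward the smaller of its two inputs, so a sensitivity of 25.0 % dominates even with a PPV above 80 %.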