The system identifies disease mentions in discharge summaries and assesses whether each mention is negated or uncertain.
It was evaluated at the i2b2 Obesity Challenge, where it ranked 1st in the textual (explicit) subtask and 2nd in the intuitive (implicit) subtask.
The underlying techniques are:
- named entity recognition (NER) of diseases and related terminology using regular-expression dictionaries
- rule-based context semantics identification
- decision logic to combine mention-level information into a document-level classification
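
The three techniques above can be sketched as a small pipeline. This is only an illustrative sketch, not the system's actual implementation: the dictionary entries, the cue lists, the status labels (Y = present, N = negated, Q = uncertain, U = unmentioned), and the priority rule in `classify_document` are all assumptions made for the example.

```python
import re

# Hypothetical dictionary: disease label -> regular expression (illustrative only).
DISEASE_PATTERNS = {
    "Obesity": re.compile(r"\b(?:morbid(?:ly)?\s+)?obes(?:e|ity)\b", re.I),
    "Diabetes": re.compile(r"\b(?:diabetes(?:\s+mellitus)?|niddm|iddm)\b", re.I),
    "Hypertension": re.compile(r"\b(?:hypertension|htn)\b", re.I),
}

# Hypothetical context cues, searched in the part of the sentence that
# precedes a mention.
NEGATION = re.compile(r"\b(?:no|denies|without|negative for)\b", re.I)
UNCERTAINTY = re.compile(r"\b(?:possible|probable|suspected|questionable)\b", re.I)


def find_mentions(text):
    """Return one (disease, status) pair per dictionary match."""
    mentions = []
    # Naive sentence splitting so that a cue only affects its own sentence.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for disease, pattern in DISEASE_PATTERNS.items():
            for match in pattern.finditer(sentence):
                left_context = sentence[:match.start()]
                if NEGATION.search(left_context):
                    status = "N"
                elif UNCERTAINTY.search(left_context):
                    status = "Q"
                else:
                    status = "Y"
                mentions.append((disease, status))
    return mentions


def classify_document(text):
    """Combine mention-level statuses into one label per disease.

    Assumed rule: an affirmed mention outranks a negated one, which
    outranks an uncertain one; diseases with no mention stay "U".
    """
    priority = {"Y": 3, "N": 2, "Q": 1}
    labels = {disease: "U" for disease in DISEASE_PATTERNS}
    for disease, status in find_mentions(text):
        if priority[status] > priority.get(labels[disease], 0):
            labels[disease] = status
    return labels
```

For example, `classify_document("The patient is morbidly obese. No history of diabetes. Possible hypertension.")` labels obesity as present, diabetes as negated, and hypertension as uncertain. A real rule set would need far larger dictionaries and scope-aware context rules.
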
Additional features include:
- resolution of the most common medical abbreviations
- recognition and normalization of biomarkers (sex, age, race, height, weight, BMI, blood pressure, heart rate, respiratory rate, temperature, cholesterol, triglycerides)
- identification and classification of discharge summary elements (sections)
- identification of family history and allergies
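
As an illustration of what biomarker recognition and normalization might look like, the sketch below extracts blood pressure and weight with regular expressions and converts weight to kilograms. The patterns, the output keys, and the function name `extract_biomarkers` are hypothetical and are not taken from the system itself.

```python
import re

# Hypothetical patterns; real discharge summaries need many more variants.
BP = re.compile(
    r"\b(?:BP|blood pressure)\s*(?:of|:|is)?\s*(\d{2,3})\s*/\s*(\d{2,3})\b", re.I
)
WEIGHT = re.compile(
    r"\bweight\s*(?:of|:|is)?\s*(\d+(?:\.\d+)?)\s*(kg|kilograms|lbs?|pounds)\b", re.I
)

LB_TO_KG = 0.45359237  # exact definition of the international pound


def extract_biomarkers(text):
    """Return recognized biomarkers normalized to fixed units."""
    found = {}
    bp = BP.search(text)
    if bp:
        found["systolic_mmHg"] = int(bp.group(1))
        found["diastolic_mmHg"] = int(bp.group(2))
    weight = WEIGHT.search(text)
    if weight:
        value = float(weight.group(1))
        unit = weight.group(2).lower()
        if unit.startswith(("lb", "pound")):
            value *= LB_TO_KG  # normalize pounds to kilograms
        found["weight_kg"] = round(value, 1)
    return found
```

Normalizing to a single unit per biomarker lets downstream logic (for instance, a BMI threshold) work on comparable values regardless of how the clinician wrote them.
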
A web-based graphical user interface provides human-readable summary tables and highlights the supporting evidence in the text.