ETUDE (Evaluation Tool for Unstructured Data and Extractions) is an open-source NLP output evaluation tool flexible enough to score a range of documents with varied type systems and different evaluation criteria.
ETUDE consists of two primary components: a Java-based user interface (UI) and a Python-based engine. The UI, also called the Viewer, is written in Java 9 and uses the JavaFX framework. It provides an intuitive interface for creating new configuration files (which determine how the engine reads in annotated documents), for scoring paired reference and system output annotations with the engine, and for viewing prior runs and analyses of batch-processed corpora. The engine runs under both Python 2.7 and Python 3.7. Based on a combination of plain-text configuration files and command-line parameters, the engine has the flexibility to read in files annotated in a variety of formats (e.g., plain text, inline XML, offset XML, UIMA CAS XML) and to evaluate a configurable list of annotation types (e.g., Person vs. Location) and their attributes (e.g., Problem + Negated) against another corpus, potentially in a different format.
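The core of such an evaluation is matching reference annotations against system output annotations. The following is a minimal sketch of one possible strict-matching strategy (exact offsets and type), not ETUDE's actual implementation; the tuple representation and function name are illustrative assumptions.

```python
# Minimal sketch: count TP/FP/FN by exact span-and-type matching.
# Each annotation is assumed to be a (start, end, type) tuple; this
# is an illustration, not ETUDE's internal data model.
from collections import Counter

def score_exact(reference, system):
    """Return (TP, FP, FN) for exact span-and-type matches."""
    ref = Counter(reference)
    sys_out = Counter(system)
    tp = sum((ref & sys_out).values())   # annotations present in both
    fp = sum((sys_out - ref).values())   # system-only annotations
    fn = sum((ref - sys_out).values())   # reference-only annotations
    return tp, fp, fn

reference = [(0, 7, "Person"), (12, 20, "Location")]
system = [(0, 7, "Person"), (12, 20, "Problem")]
print(score_exact(reference, system))  # -> (1, 1, 1)
```

In practice, a tool like ETUDE must also support more lenient criteria (e.g., partially overlapping spans), which replaces the exact multiset intersection above with a pairwise alignment step.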
Direct user output (via stdout) can include a range of common metrics (e.g., TP, Precision, F1, F0.5), a confusion matrix, and a simple listing of annotation type counts. This analysis can be generated at several granularities: micro-averaged, per type, per file, and macro-averaged. Additional output formats include JSON and CSV. The JSON files are formatted to be read back in by the Viewer and can optionally be stored on disk for immediate or future investigation by a human, as needed.
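The distinction between micro- and macro-averaging can be illustrated with a short sketch: micro-averaging pools TP/FP/FN counts across types before computing a metric, while macro-averaging computes the metric per type and then averages. The per-type counts below are made up for illustration.

```python
# Sketch of micro- vs. macro-averaged precision and F-beta from
# per-type (TP, FP, FN) counts; the counts are illustrative only.
def prf(tp, fp, fn):
    """Precision and recall from raw counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return p, r

def f_beta(p, r, beta=1.0):
    """F-beta score (beta=1 gives F1, beta=0.5 gives F0.5)."""
    if p + r == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)

counts = {"Person": (8, 2, 0), "Location": (1, 1, 3)}  # type: (TP, FP, FN)

# Micro: pool the counts across types, then compute metrics once.
tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
micro_p, micro_r = prf(tp, fp, fn)

# Macro: compute per-type metrics, then average them.
per_type = [prf(*c) for c in counts.values()]
macro_p = sum(p for p, _ in per_type) / len(per_type)

print(round(micro_p, 3), round(macro_p, 3), round(f_beta(micro_p, micro_r), 3))
# -> 0.75 0.65 0.75
```

Note how the two averages diverge when types are imbalanced: the frequent Person type dominates the micro-average, while the macro-average weights each type equally.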
Paul M. Heider a, Jean-Karlo Accetta b, Stéphane M. Meystre a,b
a Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC, USA
b Clinacuity Inc., Charleston, SC, USA