ARC v2.0

Resource URLs: 

http://arc.4thparadigm.org
http://code.google.com/p/mavericarc/

Description: 

You Know About NLP
 
If you are familiar with NLP and already use information retrieval techniques in your research, you can add ARC to your existing toolkit to get relevant data to your clinicians faster.
 
You Are New to NLP
 
If you are not familiar with NLP but have heard that information retrieval can help pull patient data from medical records, ARC lets you extract that information automatically and reduce manual chart reviews.
 
About ARC
 
The Automated Retrieval Console (ARC) is open source software designed to improve the processes of information retrieval (e.g., natural language processing, machine learning, information extraction). Behind the scenes, ARC processes text with open source NLP pipelines, converting unstructured text to structured data such as SNOMED or UMLS codes. The structured data is then automatically converted to features and fed to open source supervised machine learning algorithms. For non-technical end users, ARC requires only examples of the targeted information in order to develop, evaluate, and deploy IR algorithms. For developers and IR researchers, ARC provides a suite of tools to mix and match NLP with machine learning, plus interfaces to quickly calculate performance. ARC imports UIMA-based pipelines to convert free text into different feature types for classification, and it can be downloaded either bundled with cTAKES or as a standalone version.
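
The flow described above (free text, then concept codes, then features, then a supervised learner) can be illustrated with a minimal Python sketch. This is not ARC's code or API: the lexicon, the codes (made up, not real SNOMED or UMLS codes), the function names, and the toy overlap-based learner are all assumptions chosen only to show the shape of the pipeline.

```python
from collections import Counter

# Hypothetical term-to-code lexicon standing in for an NLP pipeline's
# concept mapping; these codes are made up, not real SNOMED or UMLS codes.
LEXICON = {"fever": "X001", "cough": "X002", "fracture": "X003", "cast": "X004"}


def to_codes(text):
    """Map free text to a list of concept codes (the 'structured data' step)."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    return [LEXICON[t] for t in tokens if t in LEXICON]


def to_features(codes):
    """Turn concept codes into a bag-of-codes feature vector."""
    return Counter(codes)


def train(examples):
    """Sum feature counts per label: a toy stand-in for a supervised learner."""
    profiles = {}
    for text, label in examples:
        profiles.setdefault(label, Counter()).update(to_features(to_codes(text)))
    return profiles


def classify(profiles, text):
    """Pick the label whose profile overlaps most with the document's features."""
    feats = to_features(to_codes(text))

    def score(label):
        return sum(count * profiles[label][code] for code, count in feats.items())

    return max(sorted(profiles), key=score)


# Labeled examples play the role of the "examples of targeted information".
examples = [
    ("Patient reports fever and cough.", "respiratory"),
    ("Persistent cough, low fever.", "respiratory"),
    ("Wrist fracture, cast applied.", "orthopedic"),
]
model = train(examples)
prediction = classify(model, "New cough with fever since Monday.")
print(prediction)
```

In this sketch the end user supplies only labeled examples; the mapping to codes and features happens without any hand-written rules, which is the idea the description attributes to ARC.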
 
Updates
 
ARC v2.0 Release Candidate 10 comes with two DIY features: one for concept extraction and one for document classification. Concept extraction is the newest feature. It imports annotations and documents from an external source, processes them using cTAKES, performs an experiments blast, and sets up a retrieve run on a new collection. This is the fastest but least configurable way of going from a gold standard to running models on new data.
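
As a rough illustration of going from a gold standard to a chosen model, the sketch below scores several candidate configurations against gold-standard document IDs and keeps the one with the best F1. It assumes that an experiments blast amounts to trying several configurations and comparing their performance; the configuration names, the document IDs, and the functions `prf` and `blast` are hypothetical and do not come from ARC.

```python
def prf(gold, predicted):
    """Precision, recall, and F1 for two sets of retrieved item IDs."""
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def blast(gold, candidates):
    """Score every candidate configuration and return (best_name, scores)."""
    scores = {name: prf(gold, predicted) for name, predicted in candidates.items()}
    # Sorting the names first makes tie-breaking deterministic.
    best = max(sorted(scores), key=lambda name: scores[name][2])
    return best, scores


# Gold-standard document IDs and the IDs each hypothetical configuration retrieved.
gold = {1, 2, 3, 4}
candidates = {
    "words_only": {1, 2, 5, 6},
    "codes_only": {1, 2, 3, 7},
    "words_plus_codes": {1, 2, 3, 4, 8},
}
best, scores = blast(gold, candidates)
print(best, scores[best])
```

The winning configuration would then be the one applied to a new collection in the retrieve step.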
 

Application domains: 

Programming languages: 

Operating systems: 

Citations: 

D'Avolio L, Nguyen T, Goryachev S, Fiore L. Automated Concept-Level Information Extraction to Reduce the Need for Custom Software and Rules Development. Journal of the American Medical Informatics Association 2011 18(5): 607-613.
D'Avolio L, Nguyen T, Farwell W, Chen Y, Fitzmeyer F, Harris O, Fiore L. Evaluation of a Generalizable Approach to Clinical Information Retrieval Using the Automated Retrieval Console (ARC). Journal of the American Medical Informatics Association 2010 17(4): 375-382.
D'Avolio L, Nguyen T, Fiore L. The Automated Retrieval Console (ARC): Open Source Software for Streamlining the Process of Natural Language Processing. Proceedings of the Association for Computing Machinery's International Health Informatics Symposium; 2010 Nov. 11-12; Arlington, VA; p. 469-473.

Screenshots: 

ARC Main Interface
DIY Document-level Interface
Data Source Configuration Editor

License type: 

Licensing notes: 

For all other uses contact author at ldavolio [at] gmail

Date of latest version: 

August 24, 2011