You Know About NLP
If you are familiar with NLP and already use information retrieval techniques in your research, you can add ARC to your existing toolkit to get relevant data to your clinicians faster.
You Are New to NLP
If you are not familiar with NLP but have read or heard that information retrieval can help pull patient data out of medical records, ARC lets you extract that information automatically and reduce manual chart reviews.
The Automated Retrieval Console (ARC) is open source software designed to improve the processes of information retrieval (IR), including natural language processing, machine learning, and information extraction. Behind the scenes, ARC processes text with open source NLP pipelines, converting unstructured text to structured data such as SNOMED or UMLS codes. The structured data is then automatically converted to features and fed to open source supervised machine learning algorithms. For non-technical end users, ARC requires only examples of the targeted information in order to develop, evaluate, and deploy IR algorithms. For developers and IR researchers, ARC provides a suite of tools to mix and match NLP with machine learning, along with interfaces to quickly calculate performance. ARC imports UIMA-based pipelines to convert free text into different feature types for classification, and it can be downloaded either bundled with cTAKES or as a standalone version.
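To make the "structured data to features to supervised learning" idea concrete, here is a minimal sketch in plain Python. It does not use ARC, cTAKES, or UIMA, and it is not how ARC is implemented. The concept codes (`CUI_COUGH`, etc.) are made-up placeholders standing in for the output of an NLP pipeline, the function names are hypothetical, and a small Bernoulli naive Bayes classifier stands in for whichever learning algorithm ARC actually applies.

```python
import math
from collections import Counter, defaultdict


def to_features(codes):
    """Turn a document's concept codes into a set of binary features."""
    return set(codes)


def train_naive_bayes(examples):
    """Count label and feature frequencies from (codes, label) pairs."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for codes, label in examples:
        label_counts[label] += 1
        for feature in to_features(codes):
            feature_counts[label][feature] += 1
            vocabulary.add(feature)
    return {"labels": label_counts,
            "features": feature_counts,
            "vocab": sorted(vocabulary)}


def predict(model, codes):
    """Return the label with the highest log-probability for the codes."""
    total = sum(model["labels"].values())
    present = to_features(codes)
    best_label, best_score = None, -math.inf
    for label, n_label in model["labels"].items():
        score = math.log(n_label / total)
        for feature in model["vocab"]:
            # Laplace-smoothed probability that the feature appears
            # in a document carrying this label.
            p = (model["features"][label][feature] + 1) / (n_label + 2)
            score += math.log(p if feature in present else 1 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision_recall_f1(gold, predicted, positive):
    """Compute precision, recall, and F1 for one positive label."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Placeholder codes; a real pipeline would emit SNOMED or UMLS identifiers.
training = [
    (["CUI_COUGH", "CUI_FEVER", "CUI_INFILTRATE"], "relevant"),
    (["CUI_INFILTRATE", "CUI_FEVER"], "relevant"),
    (["CUI_FRACTURE", "CUI_PAIN"], "not_relevant"),
    (["CUI_PAIN", "CUI_COUGH"], "not_relevant"),
]
held_out = [
    (["CUI_INFILTRATE", "CUI_COUGH"], "relevant"),
    (["CUI_FRACTURE"], "not_relevant"),
]

model = train_naive_bayes(training)
predictions = [predict(model, codes) for codes, _ in held_out]
gold_labels = [label for _, label in held_out]
scores = precision_recall_f1(gold_labels, predictions, positive="relevant")
print(predictions, scores)
```

The user-supplied examples play the role of `training` here: labeled documents are all the classifier needs, which is the sense in which end users only have to provide examples of the targeted information.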
ARC v2.0 Release Candidate 10 comes with two DIY features: one for concept extraction and one for document classification. Concept extraction is the newest. It imports annotations and documents from an external source, processes them with cTAKES, performs an experiments blast, and sets up a retrieve run on a new collection. This is the fastest, but least configurable, way of going from a gold standard to running models on new data.
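The path from a gold standard to running a model on new data can be sketched as follows. This is an illustration only, not ARC's code, and it assumes that an "experiments blast" amounts to trying several feature-type configurations, scoring each against the gold standard, and keeping the best one. The documents, the `codes` field (standing in for cTAKES output), the configuration names, and the deliberately simple count-difference scorer are all hypothetical.

```python
from collections import Counter

# Candidate feature types to compare (hypothetical configurations).
FEATURE_CONFIGS = {
    "words": lambda doc: {"w:" + w for w in doc["words"]},
    "codes": lambda doc: {"c:" + c for c in doc["codes"]},
    "words+codes": lambda doc: ({"w:" + w for w in doc["words"]}
                                | {"c:" + c for c in doc["codes"]}),
}


def train(docs, extract):
    """Weight each feature by (positive count - negative count)."""
    weights = Counter()
    for doc in docs:
        for feature in extract(doc):
            weights[feature] += 1 if doc["label"] else -1
    return weights


def classify(weights, doc, extract):
    """Predict positive when the summed feature weights exceed zero."""
    return sum(weights[f] for f in extract(doc)) > 0


def f1_score(gold, predicted):
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)
    fp = sum(1 for g, p in pairs if not g and p)
    fn = sum(1 for g, p in pairs if g and not p)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def leave_one_out_f1(docs, extract):
    """Score a configuration by holding out each gold document in turn."""
    predicted = []
    for i, held_out in enumerate(docs):
        weights = train(docs[:i] + docs[i + 1:], extract)
        predicted.append(classify(weights, held_out, extract))
    return f1_score([d["label"] for d in docs], predicted)


# Gold standard: annotated documents, already processed into codes.
gold = [
    {"words": ["pt", "has", "pneumonia"], "codes": ["PNEUMONIA"], "label": True},
    {"words": ["lung", "infection", "noted"], "codes": ["PNEUMONIA"], "label": True},
    {"words": ["pt", "has", "fracture"], "codes": ["FRACTURE"], "label": False},
    {"words": ["wrist", "break", "noted"], "codes": ["FRACTURE"], "label": False},
]

# Try every configuration and keep the best-scoring one.
results = {name: leave_one_out_f1(gold, extract)
           for name, extract in FEATURE_CONFIGS.items()}
best_name = max(results, key=results.get)

# Retrain the winner on the full gold standard and run it on new data.
best_extract = FEATURE_CONFIGS[best_name]
final_weights = train(gold, best_extract)
new_collection = [
    {"words": ["possible", "pneumonia"], "codes": ["PNEUMONIA"]},
    {"words": ["ankle", "fracture"], "codes": ["FRACTURE"]},
]
retrieved = [classify(final_weights, doc, best_extract)
             for doc in new_collection]
print(results, best_name, retrieved)
```

On this toy data the code-based features win because the two positive notes share a concept code but no surface words. That is the kind of difference a sweep over configurations is meant to surface without manual tuning.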