iDASH NLP Annotation Workshop, September 29
- 11:11pm August 18, 2012
- Maryann Martone
When: Saturday, September 29, 2012 (All day)
Where: Atkinson Hall, Calit2 Auditorium, University of California, San Diego
The goals of the workshop are to explore practical and research aspects of annotating biomedical text, with a focus on clinical text annotation. Topics will include, but are not limited to:
*Shared resources, including lexically and semantically annotated corpora
*Creation of layered annotations
*Tools to support manual annotation of text
*Techniques for improving the efficiency of manual annotation, including active learning and crowdsourcing
*Techniques for improving the quality of annotation
*Domain adaptation
*Evaluation of annotation quality
*Semantic models of schema, guidelines, and annotations
Keynote speaker: Bob Carpenter, PhD, Columbia University
Inferring Gold Standards from Crowdsourced Annotations
In this talk, I'll show how model-based techniques originally developed for analyzing multiple diagnostic tests in epidemiology may be applied to inferring a gold-standard corpus from crowdsourced annotations. The standard models also infer annotator accuracies and biases. Hierarchical models extend these models to overall task difficulty. The surprising result is that neither high inter-annotator agreement nor high accuracy is required to derive corpora of measurably high quality. For example, a handful of very noisy annotators (e.g., 75% accuracy, substantial category bias, and less than 50% inter-annotator agreement) can be used to generate a near-perfect gold-standard corpus. A further advantage of the model-based approach to annotation is the calibration of posterior uncertainty on an item-by-item basis, with the obvious application to active learning and the less obvious application to learning and evaluating using a probabilistic notion of a corpus.
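For readers who want a concrete picture of the kind of model the abstract describes, here is a minimal sketch. The abstract does not name a specific model, so this assumes the Dawid and Skene (1979) model fitted with EM, a common choice for this problem. The function names `simulate` and `dawid_skene`, the smoothing constant, and the simulated annotators are illustrative assumptions and are not taken from the talk.

```python
import random


def simulate(n_items=300, n_annotators=7, accuracy=0.75, seed=0):
    """Hypothetical data: binary items labelled by equally noisy annotators."""
    rng = random.Random(seed)
    truth = [int(rng.random() < 0.5) for _ in range(n_items)]
    labels = []
    for item, true_label in enumerate(truth):
        for annotator in range(n_annotators):
            correct = rng.random() < accuracy
            labels.append((item, annotator, true_label if correct else 1 - true_label))
    return truth, labels


def dawid_skene(labels, n_classes, n_iter=50, smoothing=0.01):
    """Fit a Dawid-Skene style model by EM.

    labels: list of (item, annotator, label) triples using 0-based integer ids.
    Returns (posteriors over true labels, annotator confusion matrices, class prior).
    """
    n_items = 1 + max(i for i, _, _ in labels)
    n_annotators = 1 + max(a for _, a, _ in labels)

    # Initialise the posteriors from smoothed per-item vote shares.
    post = [[smoothing] * n_classes for _ in range(n_items)]
    for i, _, l in labels:
        post[i][l] += 1.0
    post = [[p / sum(row) for p in row] for row in post]

    for _ in range(n_iter):
        # M-step: class prevalence and one confusion matrix per annotator,
        # where conf[a][k][l] estimates P(annotator a says l | true class k).
        prior = [sum(row[k] for row in post) / n_items for k in range(n_classes)]
        conf = [[[smoothing] * n_classes for _ in range(n_classes)]
                for _ in range(n_annotators)]
        for i, a, l in labels:
            for k in range(n_classes):
                conf[a][k][l] += post[i][k]
        for a in range(n_annotators):
            for k in range(n_classes):
                total = sum(conf[a][k])
                conf[a][k] = [c / total for c in conf[a][k]]

        # E-step: posterior over each item's true class given all its labels.
        post = [list(prior) for _ in range(n_items)]
        for i, a, l in labels:
            for k in range(n_classes):
                post[i][k] *= conf[a][k][l]
        post = [[p / sum(row) for p in row] for row in post]

    return post, conf, prior


if __name__ == "__main__":
    truth, labels = simulate()
    post, conf, prior = dawid_skene(labels, n_classes=2)
    predicted = [row.index(max(row)) for row in post]
    accuracy = sum(p == t for p, t in zip(predicted, truth)) / len(truth)
    print("accuracy of inferred labels on simulated data:", accuracy)
```

The per-item rows of `post` are the item-by-item posterior uncertainties the abstract mentions: items whose largest posterior is close to 0.5 would be natural candidates for further annotation in an active learning loop. The sketch multiplies raw probabilities, which is adequate for a handful of labels per item; a version for many labels per item would work in log space.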
For more information and registration:
http://idash.ucsd.edu/events/workshops/nlp-annotation-workshop