Mark A. Mandel
at the University of Pennsylvania
I am the research administrator of the ITR Grant in Biomedical
Information Extraction. In non-technical terms, we are analyzing
biomedical research texts to develop computer programs to extract
information from them and put it into databases that researchers can
search without having to find and read all the texts. That description
is not complete, but it covers a lot of our work.
Some numbers from our annotation production system:
- recent overview of the number of files at each stage of processing
- recent character and token counts for POS-tagged files, both with and
without the extraneous parts of files (anything beyond the abstract
title and body)
For annotators
I am now hiring and training annotators.
Finding me
I am on the staff of the Linguistic Data Consortium, but
my office is in the IRCS suite:
Institute for Research in Cognitive Science
Suite 400A, 3401 Walnut Street
Philadelphia, PA 19104-6228
215 898-0328
mamandel@ldc.upenn.edu
Personal
My personal home page is here.
Some project documents
- Upcoming conferences that might be relevant for the project, compiled by Seth Kulick.
- Summary of the ITR/E oncology work: paper submitted to HLT/NAACL
2004 (Human Language Technology conference / North American
chapter of the Association for Computational Linguistics annual
meeting). Pete White writes: "This manuscript has been submitted to
[this workshop]. It is our information extraction group's latest
summary of our work and thus represents the best current overall
perspective of the project."
- Pete White's presentation (2003-11-04) on some of our work at HGVS 2003 (PowerPoint).
- CorefFest (2003-05-15): like ChemFest (below), but dedicated to the concept(s) of coreference.
- ChemFest (2003-04-03): a day-long meeting to hammer out an
initial definition of chemical entities for the purpose of automatic
and human tagging. We referred to 40 MEDLINE abstracts supplied by
Andy Schein, and later to 10 on protein-DNA complexes that
Alex Vasserman brought.
- Here's what we had at the end of the day.
- Resources for Biomedical Terminology and Ontology: a work in progress.
2004-11-09