Human Gene Normalization
From BioIE Wiki
Main Page : Entity Normalization : Human Gene Normalization
- Definition (generally more restrictive than in regular annotation)
- Manual
- Adjudication
This task is different from our regular entity annotation work. Normally we work within a single abstract, identifying mentions of entities of interest and labeling them with the class they belong to (gene, malignancy type, ...). Our purpose in this task is to associate gene references in an abstract with standard identifiers from external reference sources.
Although there are standard catalogues of genes and their products, and standards for referring to them, the terminology in actual use is much more chaotic. Different researchers may use different names for a single gene, or the same name for different genes, or they may use "the same name" with somewhat different spelling or punctuation. This makes automatic information extraction more difficult. Automated normalization would be a great help, and our manual normalization can be used to train automated taggers for that purpose, just as our other manual annotation is used to train taggers to recognize the names of genes and other entities.
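To make the naming problem concrete, here is a minimal sketch in Python of what a lookup from gene mentions to identifiers might look like. It is an illustration only, not our actual procedure or any tagger's implementation. The lexicon entries, the placeholder identifiers (`EX:1`, `EX:2`, `EX:3`), the made-up ambiguous name `XYZ1`, and the function names are all assumptions introduced for this example.

```python
import re
from collections import defaultdict


def normalize_key(name: str) -> str:
    """Lowercase a name and drop spaces, hyphens, and other punctuation."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


# Hypothetical toy lexicon of (name, identifier) pairs.
# The identifiers are placeholders, not entries from a real reference source.
LEXICON = [
    ("ERBB2", "EX:1"),  # different names for a single gene
    ("HER2", "EX:1"),
    ("neu", "EX:1"),
    ("XYZ1", "EX:2"),   # made-up name shared by two different genes
    ("XYZ1", "EX:3"),
]


def build_index(lexicon):
    """Map each normalized name to the set of identifiers it may refer to."""
    index = defaultdict(set)
    for name, gene_id in lexicon:
        index[normalize_key(name)].add(gene_id)
    return index


def lookup(mention, index):
    """Return the sorted candidate identifiers for a mention (empty if unknown)."""
    return sorted(index.get(normalize_key(mention), set()))


INDEX = build_index(LEXICON)
```

With this toy index, `lookup("HER-2", INDEX)` and `lookup("Her 2", INDEX)` both return `["EX:1"]`, which covers variation in spelling and punctuation. `lookup("xyz-1", INDEX)` returns two candidates, which is the "same name for different genes" case: string matching alone cannot settle it, and the context of the abstract has to.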
Compared with our regular entity annotation, in Human Gene Normalization we:
- look only at gene-type entities
- use a generally narrower definition (see the Notes on tagging)
- associate them with standard identifiers
- have to use a more complex procedure, part of which requires Excel spreadsheets and has to be done outside the LAW workflow system
