(draft...)
| Annotators' home |
(The name: You'll see WordFreak, Wordfreak, and wordfreak. The designer doesn't seem worried by it, so I won't be either. A lot of the time I just save my fingers and type WF.)
Decide where you want to keep the files you're annotating. If you're running WordFreak on your own machine, two reasonable options are the directory that holds WordFreak itself or a subdirectory of it. If you're running WordFreak on a machine in the IRCS suite, you'll need an account of your own to save the files in; the local files on these machines are considered temporary and may be wiped at night. When you take a file to work on, check it out and put a copy in your files directory.
If you already have WF, start it up. If you're connected to the Web, WF will access the appropriate URL and see if there's a newer version, and if there is, it will update itself and start up. Once WF has started you can safely disconnect from the Web and run offline.
If you don't have WF on your machine, get it from the appropriate URL for your group ("Getting WordFreak").
Specify the type of file you are looking for. If you are starting a fresh file, choose either "Text files" (if the file's extension is .txt or .sgm) or "All files". Select the file you want and click OPEN. The file will appear in the project view. WordFreak will ask if you want to create an annotation file; answer Yes.
WF never modifies the text file; it saves its annotations in an annotation file, whose name is the name of the text file but with .ann added at the end. To work on a file with existing annotations you can either open the text file or open the .ann file; WF will look for the corresponding file name and open them both. They must be in the same directory.
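The naming rule above can be sketched in a few lines of Python. This is only an illustration of the convention described here (text file name plus .ann, both files in the same directory); it is not part of WordFreak, and the function names are made up for the example.

```python
from pathlib import Path

# Per the text: the annotation file is the text file's full name plus ".ann".
ANN_SUFFIX = ".ann"


def annotation_path(text_file: Path) -> Path:
    """Return the expected annotation file for a text file (same directory)."""
    return text_file.with_name(text_file.name + ANN_SUFFIX)


def text_path(ann_file: Path) -> Path:
    """Return the text file that an annotation file belongs to."""
    if not ann_file.name.endswith(ANN_SUFFIX):
        raise ValueError(f"not an annotation file: {ann_file}")
    return ann_file.with_name(ann_file.name[: -len(ANN_SUFFIX)])


def pair_is_complete(either_file: Path) -> bool:
    """True if both the text file and its .ann file exist side by side."""
    if either_file.name.endswith(ANN_SUFFIX):
        txt, ann = text_path(either_file), either_file
    else:
        txt, ann = either_file, annotation_path(either_file)
    return txt.is_file() and ann.is_file()
```

A quick check such as `pair_is_complete(Path("myfiles/sample.txt"))` (a hypothetical path) is one way to confirm, before starting WF, that a file with existing annotations still has its partner next to it.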
A green checkmark will appear on the icon of the annotation file in the project view.
If you have more than one file loaded, switch to this one with
WordFreak uses the term "tagging" for work done automatically by programs that it calls, and "annotation" for work done by a human annotator. Other contexts don't always need or make that distinction.
Before annotating the text, you must tag it for paragraphs, sentences, and tokens, in that order. We intend to automate this task, but for now you will have to do it semi-manually, telling WordFreak to use the tagger plug-ins for these types of tagging.
First, select the file in the project view if it isn't already selected.
Paragraphs aren't complex, but most of the operation is the same for all three types of tagging, so I'll go into considerable detail here.
(NOTE: As you switch between WF and other applications, you may find that the main WF window is on top but the Chooser window is hidden behind other application windows. You can bring it forward with Window | Bring all to front.)
| icon | function | tooltip |
| first row | ||
| < | previous tagged entity | left |
| > | next tagged entity | right |
| + | tag selection as entity | add |
| – | untag selected entity | remove |
| second row | ||
| <=| | extend beginning of selection leftwards | grow left |
| >=| | contract beginning of selection rightwards | shrink left |
| |=< | contract end of selection leftwards | shrink right |
| |=> | extend end of selection rightwards | grow right |
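Each of the four second-row buttons moves one edge of the selection. As a rough mental model only, here is a Python sketch that treats the selection as a pair of offsets. The `Span` class, the function names, and the step size are all assumptions for illustration; I don't know what unit WF actually moves by on each click.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # offset of the first selected unit
    end: int    # offset just past the last selected unit


def extend_start(s: Span, step: int = 1) -> Span:
    """Move the beginning leftwards, stopping at the start of the text."""
    return Span(max(0, s.start - step), s.end)


def contract_start(s: Span, step: int = 1) -> Span:
    """Move the beginning rightwards, never past the end of the selection."""
    return Span(min(s.end, s.start + step), s.end)


def contract_end(s: Span, step: int = 1) -> Span:
    """Move the end leftwards, never past the beginning of the selection."""
    return Span(s.start, max(s.start, s.end - step))


def extend_end(s: Span, text_length: int, step: int = 1) -> Span:
    """Move the end rightwards, stopping at the end of the text."""
    return Span(s.start, min(text_length, s.end + step))
```

The point of the model is that each button changes exactly one edge and leaves the other alone, so a tag that is off at both ends needs two separate adjustments.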
Use the > and < buttons to show each tagged paragraph in turn. (You can also do this with the arrow keys, when the text window is highlighted: left or up for the previous tag, right or down for the next. [2003-07-23]) There should be no problem with the paragraph tagging; it's a pretty straightforward task for the tagging program. The text may include some XML labels in angle brackets, like "<ABSTRACT>", and the highlighting may not include those; that's all right. The highlighting may or may not also include the blank line between paragraphs, and that's all right too.
(When you have more than one file loaded, if you're at the beginning or end of one of them, the Chooser < and > buttons will move you to the previous or next file. You can also move between them directly with Annotation | Go To.)
What to do if the tagging is wrong? Then you have to fix it. The process for that is the same as for the manual annotation that constitutes your main job, so I'll describe the mechanics here.
Suppose two paragraphs are highlighted together as a single paragraph. The easiest way to fix this is in two steps: remove and add. (I'm talking about removing and adding tags in the Chooser, not removing and adding files in the main WordFreak window!)
Click (not drag) in the Text view. (In entity annotation, there may sometimes be tags within tags. Move between them with the Chooser < and > or the keyboard arrow keys.) With the mistagged section highlighted, click anywhere in the paragraph you want to remove. It will highlight in purple. Click the – button in the Chooser. The highlighting will disappear.
This is going to be a long explanation because I am folding a lot of information about selecting and annotating into it. Here goes:
Check your work by clicking < and > to be sure that the highlighting is correct. If it's off by just a little, you can use shrink and grow; and, as always, you can ignore the space between paragraphs. When the paragraphs are correct, return to the project view and go on to sentences.
And that finishes the pre-tagging. Now you can get to the meat of your work.
We don't have taggers yet for the entity categories, or good POS taggers for biomedical text, which is why we need your work.
...
[2003-08-19] WordFreak has shown various kinds of instability if you have more than one file at a time loaded into it, or even added. In fact, some serious bugs seem to show up even if you close and remove a file before adding and loading a new one. So...
[2003-07-23] You're probably used to applications, like word processors, in which a mouse click in text sets an insertion cursor so you can start typing or editing at that point. But in WordFreak you can't type or edit, so there is no insertion cursor.
Instead, a mouse click selects the nearest tagged entity (of any of the types currently shown in the Chooser window). In order to select a token or a chunk of text in WordFreak you have to drag the mouse at least a little bit; even one pixel will do. This extreme motion sensitivity can make it hard to select a tagged string. Check the status line to be sure you've actually got what you wanted.
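The difference between a click and a drag can be sketched as follows. This is a guess at the behavior for illustration, not WF's code: the tuple representation is hypothetical, and "nearest" is assumed to mean zero distance when the click falls inside an entity and otherwise the distance to its closer edge.

```python
from typing import List, Optional, Tuple

Entity = Tuple[int, int, str]  # (start, end, tag); end is exclusive


def distance(offset: int, entity: Entity) -> int:
    """Distance from a click offset to an entity (0 if the click is inside it)."""
    start, end, _ = entity
    if start <= offset < end:
        return 0
    return start - offset if offset < start else offset - (end - 1)


def on_click(offset: int, entities: List[Entity]) -> Optional[Entity]:
    """A plain click selects the nearest tagged entity, if there is one."""
    return min(entities, key=lambda e: distance(offset, e), default=None)


def on_drag(press: int, release: int) -> Tuple[int, int]:
    """A drag selects the raw text between the two offsets, in either direction."""
    return (min(press, release), max(press, release))
```

Under this model a click always lands on an existing tag, while even a one-unit drag produces a fresh text selection, which is why the status line is worth checking.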
[2003-12-02] Eric Pancoast writes: "It is possible to overlap tags in annotation tasks like entity tagging. If an annotator selects by clicking and dragging (which selects tokens rather than an annotation) it will allow them to tag something twice. So the annotators should know that if they want to switch an annotation from one type to another, they should click instead of click-drag, use the arrow-keys, or use the next-previous annotation buttons to select the annotation."
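If you want to look for accidental double tags after the fact, the check is simple in principle. The Python sketch below assumes the annotations have already been read into (start, end, tag) tuples; that representation and the tag names in the usage note are hypothetical, and nothing here reads WF's actual .ann format.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Annotation = Tuple[int, int, str]  # (start, end, tag); hypothetical representation


def doubly_tagged(annotations: List[Annotation]) -> Dict[Tuple[int, int], List[str]]:
    """Map each span that carries more than one tag to the list of its tags."""
    tags_by_span: Dict[Tuple[int, int], List[str]] = defaultdict(list)
    for start, end, tag in annotations:
        tags_by_span[(start, end)].append(tag)
    # Keep only spans that were tagged more than once.
    return {span: tags for span, tags in tags_by_span.items() if len(tags) > 1}
```

For example, `doubly_tagged([(0, 4, "gene"), (0, 4, "protein"), (10, 15, "gene")])` reports the span (0, 4) with both tags. Treat the result as a list to review rather than a list of errors, since a span may occasionally need two tags on purpose.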
[2004-01-11] For each level of automatic tagging -- paragraph, sentence, token, and POS -- set the WF settings in the following order:
[2003-10-02] There is a known bug in WF's handling of the first sentence of a paragraph. If you delete the sentence tag on the first sentence of a paragraph, WF moves the left edge of the paragraph tag, apparently to where the first remaining sentence tag begins. Then, when you tag the first sentence correctly (or try to), either the tagging doesn't take, or it seems to take, but subsequent tagging is messed up or impossible.
This bug is on Eric's list, but it isn't fixed yet. The workaround is a pain but is doable: Never delete the first sentence tag in a paragraph. Instead, "shrink-right" or "grow-right"* that tag in the Chooser, one boring click at a time (since Java doesn't understand holding down the mouse button), till it ends at the right place. Then adjust other sentence tags as needed.
* (Of course, if the sentence tag ends somewhere in the middle of the real first sentence, you must first delete the tag on the SECOND sentence. That does not cause problems.)
[2003-08-04] In annotation as in all other computer work, it's wise to "Save early and often".
[2003-04-10] If WF says it can't save your file:
[2003-07-23] In some special situations you may have to tag the same string of text with two different tags. Be careful to add the second tag instead of replacing the first one with it:
2004-02-24