System requirements
• Operating system: Windows 2000 or Windows XP SP2
• Minimum RAM: 128 MB
Architecture - Functionalities
The BOEMIE text annotation tool has been developed on top of the Ellogon text-engineering platform, extending the platform's embedded annotation tool. It is designed to run as a stand-alone application, which avoids installation requirements and dependencies on other software. This makes it much simpler for end users, who see only the menus and functions they need rather than the whole interface of the Ellogon platform.
The main functionalities that it offers are:
a) It creates and deletes corpora for annotation from HTML or text documents. Such a corpus is called an Ellogon collection.
b) It displays HTML documents properly, as it has a built-in HTML renderer.
c) The user can annotate both manually and automatically. Manual annotation is facilitated by a smart text-marking system in which a mouse click selects whole words instead of single characters. Automatic annotation works either by matching user-defined regular-expression patterns or by keeping a log of the annotations made and, when the user requests it, using that log to annotate similar text segments.
d) HLC instances (HLCIs) are created through tables. Every HLC instance corresponds to a table whose fields are filled with MLC instances.
e) Relations between the HLCIs are set from inside the tables. One table can be associated with one or more tables of a certain type, according to the HLC properties in the domain ontology.
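The pattern-based automatic annotation mentioned in item c) can be illustrated with a minimal Python sketch. This is not the tool's own code: the function name annotate_by_patterns, the labels and the sample patterns are all hypothetical, and the sketch only shows the general idea of turning regular-expression matches into labelled character spans.

```python
import re


def annotate_by_patterns(text, patterns):
    """Return sorted (start, end, label, matched_text) tuples.

    `patterns` maps a label to a regular expression. Labels and patterns
    are hypothetical examples, not the tool's real configuration.
    """
    spans = []
    for label, pattern in patterns.items():
        for match in re.finditer(pattern, text):
            spans.append((match.start(), match.end(), label, match.group(0)))
    return sorted(spans)


text = "Bekele won the 10000 m in 26:17.53."
patterns = {
    # A time such as 26:17.53 (assumed label)
    "performance": r"\b\d{1,2}:\d{2}\.\d{2}\b",
    # A distance such as "10000 m" (assumed label)
    "sport_name": r"\b\d+ m\b",
}

spans = annotate_by_patterns(text, patterns)
for start, end, label, matched in spans:
    print(f"{start:>3}-{end:<3} {label:<12} {matched}")
```

Each match is stored as a character span rather than as a copy of the text, so the result could be attached to the document as an annotation without modifying the document itself.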
Another important feature of the tool is that it uses an extended annotation schema, which is described in a dedicated XML file. It is called “extended” because, apart from the annotation types, it includes the table types, the names of the table fields for each type, and the relations between the tables along with the restrictions that apply to them. The schema can be adapted to the concepts and relations of any ontology by editing the XML file.
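To make the idea of an extended schema more concrete, the following Python sketch reads a schema of this kind. The XML layout shown here is invented for illustration only: the element and attribute names (schema, annotation_types, table, field, relation, target, max) are assumptions and do not reproduce the tool's actual file format.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema layout. Element and attribute names are invented
# for this example and are NOT the tool's real file format.
SCHEMA_XML = """
<schema>
  <annotation_types>
    <type name="name"/>
    <type name="age"/>
    <type name="performance"/>
  </annotation_types>
  <table type="athlete">
    <field name="name"/>
    <field name="age"/>
    <relation target="event" max="unbounded"/>
  </table>
  <table type="event">
    <field name="performance"/>
  </table>
</schema>
"""


def load_schema(xml_text):
    """Parse the hypothetical schema into annotation types and table definitions."""
    root = ET.fromstring(xml_text.strip())
    annotation_types = [
        t.get("name") for t in root.findall("./annotation_types/type")
    ]
    tables = {}
    for table in root.findall("table"):
        tables[table.get("type")] = {
            # Names of the table fields for this table type
            "fields": [f.get("name") for f in table.findall("field")],
            # Allowed related table types and their (assumed) cardinality limit
            "relations": {
                r.get("target"): r.get("max") for r in table.findall("relation")
            },
        }
    return annotation_types, tables


annotation_types, tables = load_schema(SCHEMA_XML)
print(annotation_types)
print(tables)
```

Under this assumed layout, supporting a different ontology would mean editing only the XML text: new annotation types, table types, fields and relations would be picked up without changing the loading code.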
The data, comprising annotations, tables and relations, are represented using the Ellogon data model. More specifically, every MLC instance corresponds to an annotation of type “ne” with an attribute named “type” whose value is the name of the MLC to which the instance belongs. Every HLC instance (table) corresponds to an annotation of type “BOEMIE_IE_Template:<table type>” whose attributes are the names of the table fields plus a “relates_to_<table type>” attribute. The value of each field attribute is the IDs of the MLC annotations included in the corresponding field; the value of the last attribute is the ID (or IDs) of the HLC annotation(s) to which this annotation relates.
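The representation described above can be mirrored in a small Python sketch. It is an illustration only and does not use the Ellogon API: the Annotation class, the helper functions, the use of character spans, and the sample MLC and table names ("name", "performance", "athlete", "event") are assumptions. Only the annotation type strings and attribute names follow the description in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """Simplified stand-in for an annotation (not the Ellogon API)."""
    id: int
    type: str
    spans: list  # (start, end) character offsets; assumed for illustration
    attributes: dict = field(default_factory=dict)


def make_mlc(ann_id, start, end, mlc_name):
    """MLC instance: annotation of type "ne" with a "type" attribute."""
    return Annotation(ann_id, "ne", [(start, end)], {"type": mlc_name})


def make_hlc(ann_id, table_type, fields, relations=None):
    """HLC instance (table): one attribute per field plus relates_to_* attributes.

    `fields` maps a field name to the IDs of the MLC annotations it contains.
    `relations` maps a related table type to the IDs of related HLC annotations.
    The empty span list is an assumption made for this sketch.
    """
    attrs = {name: list(ids) for name, ids in fields.items()}
    for target_type, hlc_ids in (relations or {}).items():
        attrs[f"relates_to_{target_type}"] = list(hlc_ids)
    return Annotation(ann_id, f"BOEMIE_IE_Template:{table_type}", [], attrs)


# Hypothetical example: two MLC instances and two related tables.
name = make_mlc(1, 0, 6, "name")
perf = make_mlc(2, 26, 34, "performance")
event = make_hlc(3, "event", {"performance": [perf.id]})
athlete = make_hlc(4, "athlete", {"name": [name.id]}, {"event": [event.id]})

print(athlete)
```

Because tables refer to MLC annotations and to other tables by ID only, the same MLC annotation can appear in several tables without being duplicated.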
Finally, all of the above data can be exported to an ABox in OWL format by an external Ellogon component designed specifically for this task.