One of the biggest challenges in developing UIMA workflows is that components built on different type systems are incompatible, even when they could exchange conceptually similar annotation structures. For instance, the Sentence type output by a sentence detector may be incompatible with the Sentence type expected as input by a named entity recogniser, only because the two seemingly identical types were defined in two different type systems. A less trivial source of incompatibility arises when two conceptually equivalent types are structurally different; for instance, coreference can be encoded as a chain (a linked list) or as an array.
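To make the structural mismatch concrete, here is a minimal sketch in plain Python of the same coreference information encoded both ways, with a conversion in each direction. It is not UIMA code: the class names (`Mention`, `ChainLink`, `CoreferenceArray`) and the helper functions are hypothetical and stand in for types that two different type systems might define.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mention:
    """A text span given by character offsets (hypothetical type)."""
    begin: int
    end: int


@dataclass
class ChainLink:
    """Linked-list encoding: each link holds one mention and points to the next link."""
    mention: Mention
    next: Optional["ChainLink"] = None


@dataclass
class CoreferenceArray:
    """Array encoding: a single structure holds all mentions of one entity."""
    mentions: List[Mention]


def chain_to_array(head: Optional[ChainLink]) -> CoreferenceArray:
    """Walk the linked list from its head and collect the mentions in order."""
    mentions = []
    link = head
    while link is not None:
        mentions.append(link.mention)
        link = link.next
    return CoreferenceArray(mentions)


def array_to_chain(array: CoreferenceArray) -> Optional[ChainLink]:
    """Rebuild the linked list back to front so that mention order is preserved."""
    head = None
    for mention in reversed(array.mentions):
        head = ChainLink(mention, head)
    return head


# "Alice said she would come": mentions "Alice" (0-5) and "she" (11-14)
chain = ChainLink(Mention(0, 5), ChainLink(Mention(11, 14)))
array = chain_to_array(chain)
```

Both encodings carry the same information, so the conversion is lossless in either direction; a component expecting one form simply cannot read the other without such a step.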
We have developed SPARQL Annotation Editor, a processing component that allows a developer to manipulate annotations (and thus convert between types) using SPARQL queries. Relying on a widely adopted query language makes this solution more approachable and encourages ad hoc conversions that would otherwise have to be implemented programmatically.
Type system alignment using SPARQL will be presented at the 7th Linguistic Annotation Workshop & Interoperability with Discourse, which takes place in Sofia, Bulgaria, on 8 August. An online tutorial will follow shortly.
The details are covered in the following paper, which will appear in the workshop proceedings:
Rak, R. and Ananiadou, S. (To appear). Making UIMA Truly Interoperable with SPARQL. In: Proceedings of the 7th Linguistic Annotation Workshop & Interoperability with Discourse.