An ontology-based information extractor for data-rich documents in the information technology domain