Deep learning of robust representations for multi-instance and multi-label image classification