MINT : Multi Instance Network, an efficient framework for video object segmentation

Author: Jeanneret Sanmiguel, Guillaume

Resource type:

Publication date: 2019

Institution: Universidad de los Andes

Repository: Séneca: repositorio Uniandes

Language: eng

Description
Summary: Video Object Segmentation consists of segmenting an object across all the frames of a video. Temporal and spatial cues are essential signals for this task. This thesis proposes MINT, Multi Instance Network, a method that takes into account shape priors, location, and temporal information to segment the object of interest while keeping the per-frame inference time below that of state-of-the-art methods. MINT generates segmentations at 50.51 FPS using the complete model and at 81.74 FPS using the Fast version. Furthermore, MINT advances the task of Video Object Segmentation by processing multiple instances in a single forward pass without any post-processing. MINT is trained and tested on the largest multi-instance video object segmentation dataset, YouTube-VOS, achieving an overall performance of 0.592.
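The summary reports speed as throughput (FPS) but frames its claim in terms of inference time per frame. As a rough illustration, the sketch below converts the two quoted figures into milliseconds per frame. It assumes frames are processed one at a time, so that latency is simply the reciprocal of throughput; the helper name `ms_per_frame` is hypothetical and is not taken from the thesis.

```python
def ms_per_frame(fps: float) -> float:
    """Convert throughput in frames per second to milliseconds per frame.

    Assumes sequential, one-frame-at-a-time processing (an assumption made
    for this illustration, not a detail stated in the summary).
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Throughput figures quoted in the summary.
reported_fps = {
    "MINT (complete model)": 50.51,
    "MINT (Fast version)": 81.74,
}

# Per-frame latency implied by each throughput figure.
latency_ms = {name: ms_per_frame(fps) for name, fps in reported_fps.items()}

for name, ms in latency_ms.items():
    print(f"{name}: {ms:.2f} ms per frame")
```

Under that assumption, the complete model spends roughly 20 ms on each frame and the Fast version roughly 12 ms.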
