MINT: Multi Instance Network, an efficient framework for video object segmentation
- Authors:
- Jeanneret Sanmiguel, Guillaume
- Resource type:
- Publication date:
- 2019
- Institution:
- Universidad de los Andes
- Repository:
- Séneca: repositorio Uniandes
- Language:
- eng
- OAI Identifier:
- oai:repositorio.uniandes.edu.co:1992/43963
- Online access:
- http://hdl.handle.net/1992/43963
- Keywords:
- Multimedia systems - Research
Digital video - Research
Image processing - Digital techniques - Research
Computer simulation - Research
Engineering
- Rights
- openAccess
- License
- http://creativecommons.org/licenses/by-nc-nd/4.0/
Summary: Video Object Segmentation consists of segmenting an object across all the frames of a video. Temporal and spatial cues are essential signals for this task. This thesis proposes MINT (Multi Instance Network), a method that takes into account shape priors, location, and temporal information to segment the object of interest, with a per-frame inference time below that of state-of-the-art methods. MINT generates segmentations at 50.51 FPS using the complete model and 81.74 FPS using the Fast version. Furthermore, MINT advances the task of Video Object Segmentation by processing multiple instances in a single forward pass without any post-processing. MINT is trained and tested on the largest multi-instance Video Object Segmentation dataset, YouTube-VOS, achieving an overall performance of 0.592.