Variable stars' light curve detection and classification using supervised machine learning

We present two applications of supervised machine learning aimed at addressing the light curve classification problem in stellar variability. Our main goal is to streamline the analysis of light curves obtained from large-scale photometric and multi-epoch astronomical surveys. In the first application...


Authors:
Elizabethson, Astaroth
Resource type:
Doctoral thesis
Publication date:
2024
Institution:
Universidad de los Andes
Repository:
Séneca: repositorio Uniandes
Language:
eng
OAI Identifier:
oai:repositorio.uniandes.edu.co:1992/73980
Online access:
https://hdl.handle.net/1992/73980
Keywords:
Astronomy
Machine Learning
KNN
CART
RF
SVM
RR Lyrae stars
Cepheid stars
T Tauri stars
VVV Survey
Vista Variable Stars in the Via Lactea
TESS
Transiting Exoplanet Survey Satellite
Physics
Rights
openAccess
License
https://repositorio.uniandes.edu.co/static/pdf/aceptacion_uso_es.pdf
id UNIANDES2_3f030529e38878f38bc5743a3804e8f4
oai_identifier_str oai:repositorio.uniandes.edu.co:1992/73980
network_acronym_str UNIANDES2
network_name_str Séneca: repositorio Uniandes
dc.title.eng.fl_str_mv Variable stars' light curve detection and classification using supervised machine learning
dc.title.alternative.eng.fl_str_mv Variable stars light curve detection and classification using supervised machine learning
dc.creator.fl_str_mv Elizabethson, Astaroth
dc.contributor.advisor.none.fl_str_mv García Varela, José Alejandro
dc.contributor.author.none.fl_str_mv Elizabethson, Astaroth
dc.contributor.jury.none.fl_str_mv Alonso García, Javier
Giraldo Trujillo, Luis Felipe
dc.subject.keyword.eng.fl_str_mv Astronomy
Machine Learning
KNN
CART
RF
SVM
RR Lyrae stars
Cepheid stars
T Tauri stars
VVV Survey
Vista Variable Stars in the Via Lactea
TESS
Transiting Exoplanet Survey Satellite
dc.subject.themes.spa.fl_str_mv Física
description We present two applications of supervised machine learning aimed at addressing the light curve classification problem in stellar variability. Our main goal is to streamline the analysis of light curves obtained from large-scale photometric and multi-epoch astronomical surveys. In the first application, we conduct a variability and morphological classification study of TESS light curves for T Tauri star candidates in several regions, including the Orion star-forming complex, IC 348, gamma Velorum, Upper Scorpius, Corona Australis, and Perseus OB2. We introduce 11 morphological classes that link variations in brightness with potential physical or geometric phenomena in T Tauri stars. To automate the classification among these classes, we develop a supervised machine learning algorithm that optimizes and compares the true positive rate (recall) of k-nearest neighbors, classification trees, random forests, and support vector machines. Light curves are characterized by features related to time, periodicity, and magnitude distribution. We train binary and multiclass classifiers and interpret their results so that the final algorithm can assign single or mixed classes. In the testing sample, the algorithm assigns mixed classes to 27% of the stars, with some stars receiving up to five simultaneous class assignments. We present a catalog of 3672 T Tauri star candidates, along with their possible period estimates, predicted morphological classes, and visually reviewed assignments. We report the cross-validation performance estimates of the final classifiers. Binary classifiers perform better than multiclass classifiers for classes with limited representation in the training sample, and support vector machines and random forests achieve higher recall than the other algorithms compared. Furthermore, we provide a second performance estimate of the final classifiers using the revised classes of the testing sample; performance is highest for single-class stars, which account for approximately 75% of the testing sample.
In the second application, we focus on the b278 and b279 fields of the VVV survey, observed in the Ks infrared band. We analyze time-series data from over 60 epochs in each field to assess the performance of binary and multiclass classifiers. The primary objective is for these classifiers to identify stellar variability and then differentiate between classes of variability, especially classical Cepheids, RR Lyrae, long-period variables, and Mira variables. Notably, the features used in this analysis do not depend on a period search. This allows the inclusion of variable stars without periodic changes in magnitude and avoids the computational cost of a priori period estimation over the whole initial dataset. We build the training dataset from time-series data in the public catalog of the VVV Templates Project. Additionally, we include time-series data for variable stars observed in the 2MASS-GC02 and Terzan 10 globular clusters, and generate synthetic non-variable light curves that emulate the cadence and magnitude uncertainties of the VVV data. We compare the F1 scores of these classifiers. Finally, this research produces a catalog of variable star candidates in the direction of the Galactic bulge, including 266 candidates whose phased light curves are consistent with the morphology expected for their classes.
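The abstract compares classifiers by recall (true positive rate) in the first application and by F1 score in the second. As a minimal illustrative sketch, and not the thesis's own code, the two metrics can be computed per class, one-vs-rest, as recall = TP / (TP + FN) and F1 = 2TP / (2TP + FP + FN). The function name `per_class_scores` and the example labels are hypothetical and were invented for this illustration; they are not results from the thesis.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (recall, f1)}, computed one-vs-rest for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fn[truth] += 1  # a true member of `truth` was missed
            fp[pred] += 1   # `pred` was assigned to a non-member
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        recall_den = tp[label] + fn[label]
        f1_den = 2 * tp[label] + fp[label] + fn[label]
        # Guard against classes absent from the true or predicted labels.
        recall = tp[label] / recall_den if recall_den else 0.0
        f1 = 2 * tp[label] / f1_den if f1_den else 0.0
        scores[label] = (recall, f1)
    return scores


# Hypothetical labels, invented for illustration only (not thesis data).
y_true = ["RRLyr", "RRLyr", "Cepheid", "Mira", "Cepheid", "RRLyr"]
y_pred = ["RRLyr", "Cepheid", "Cepheid", "Mira", "Cepheid", "RRLyr"]
scores = per_class_scores(y_true, y_pred)
for label, (recall, f1) in scores.items():
    print(f"{label}: recall={recall:.3f} F1={f1:.3f}")
```

Recall ignores false positives, so it rewards finding every member of a class; F1 also penalizes false positives, which may matter more when variable stars must be separated from a large non-variable background.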
publishDate 2024
dc.date.accessioned.none.fl_str_mv 2024-02-15T15:40:49Z
dc.date.available.none.fl_str_mv 2024-02-15T15:40:49Z
dc.date.issued.none.fl_str_mv 2024-01-31
dc.type.none.fl_str_mv Doctoral degree work (Trabajo de grado - Doctorado)
dc.type.driver.none.fl_str_mv info:eu-repo/semantics/doctoralThesis
dc.type.version.none.fl_str_mv info:eu-repo/semantics/acceptedVersion
dc.type.coar.none.fl_str_mv http://purl.org/coar/resource_type/c_db06
dc.type.content.none.fl_str_mv Text
dc.type.redcol.none.fl_str_mv https://purl.org/redcol/resource_type/TD
format http://purl.org/coar/resource_type/c_db06
status_str acceptedVersion
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/1992/73980
dc.identifier.doi.none.fl_str_mv 10.57784/1992/73980
dc.identifier.instname.none.fl_str_mv instname:Universidad de los Andes
dc.identifier.reponame.none.fl_str_mv reponame:Repositorio Institucional Séneca
dc.identifier.repourl.none.fl_str_mv repourl:https://repositorio.uniandes.edu.co/
dc.language.iso.none.fl_str_mv eng
language eng
dc.rights.uri.none.fl_str_mv https://repositorio.uniandes.edu.co/static/pdf/aceptacion_uso_es.pdf
dc.rights.accessrights.none.fl_str_mv info:eu-repo/semantics/openAccess
dc.rights.coar.none.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.format.extent.none.fl_str_mv 147 pages
dc.format.mimetype.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidad de los Andes
dc.publisher.program.none.fl_str_mv Doctorado en Ciencias - Física
dc.publisher.faculty.none.fl_str_mv Facultad de Ciencias
dc.publisher.department.none.fl_str_mv Departamento de Física
bitstream.url.fl_str_mv https://repositorio.uniandes.edu.co/bitstreams/33e8ab7f-ed90-4995-850c-cb835e4b8c6a/download
https://repositorio.uniandes.edu.co/bitstreams/0ed5e488-7277-4cdc-8feb-f7ad06498169/download
https://repositorio.uniandes.edu.co/bitstreams/9b589775-1d91-4b75-8755-268c94c55623/download
https://repositorio.uniandes.edu.co/bitstreams/9b5447fd-5b3d-486b-8ee0-2a1ad3adbe56/download
https://repositorio.uniandes.edu.co/bitstreams/3c34b2b9-ac07-4f12-8712-a8313d829eba/download
https://repositorio.uniandes.edu.co/bitstreams/fa530d9c-a7b3-4661-9431-663092495b76/download
https://repositorio.uniandes.edu.co/bitstreams/ca2b3bd6-ee03-4a14-8b1d-4352f0cab1f9/download
bitstream.checksum.fl_str_mv ae9e573a68e7f92501b6913cc846c39f
8792a2d5ae60946e24d5cadb46c8f8fb
022ee9edf7621e09e4f57af9db64518c
a54b356d4e516ef66dec54c1987ba092
7eb07ec9fcb84efdb46baa04a49bc12e
bc99b66b6f87c4f6fae1fddb0a4a3f5b
866c9e00f2872aa820810bf068db7187
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositorio institucional Séneca
repository.mail.fl_str_mv adminrepositorio@uniandes.edu.co
_version_ 1812134033720606720