Variable stars' light curve detection and classification using supervised machine learning

Authors:
Elizabethson, Astaroth
Resource type:
Doctoral thesis
Publication date:
2024
Institution:
Universidad de los Andes
Repository:
Séneca: repositorio Uniandes
Language:
eng
OAI Identifier:
oai:repositorio.uniandes.edu.co:1992/73980
Online access:
https://hdl.handle.net/1992/73980
Keywords:
Astronomy
Machine Learning
KNN
CART
RF
SVM
RR Lyrae stars
Cepheid stars
T Tauri stars
VVV Survey
Vista Variable Stars in the Via Lactea
TESS
Transiting Exoplanet Survey Satellite
Physics
Rights
openAccess
License
https://repositorio.uniandes.edu.co/static/pdf/aceptacion_uso_es.pdf
Description
Summary: We present two applications of supervised machine learning to the light curve classification problem in stellar variability. Our main goal is to streamline the analysis of light curves obtained from large-scale photometric and multi-epoch astronomical surveys. In the first application, we conduct a variability and morphological classification study of TESS light curves for T Tauri star candidates in several regions: the Orion star-forming complex, IC 348, gamma Velorum, Upper Scorpius, Corona Australis, and Perseus OB2. We introduce 11 morphological classes that link variations in brightness with potential physical or geometric phenomena in T Tauri stars. To automate classification into these classes, we develop a supervised machine learning algorithm that optimizes and compares the true positive rate (recall) of k-nearest neighbors, classification trees, random forests, and support vector machines. The light curves are characterized by features related to time, periodicity, and magnitude distribution. We train binary and multiclass classifiers and interpret their results so that the final algorithm can assign single or mixed classes. In the testing sample, the algorithm assigns mixed classes to 27% of the stars, with some stars receiving up to five simultaneous class assignments. We present a catalog of 3672 T Tauri star candidates, along with their possible period estimates, predicted morphological classes, and visually reviewed assignments. We report the performance of the final classifiers as estimated by cross-validation. Binary classifiers perform better than multiclass classifiers for classes with limited representation in the training sample, and support vector machines and random forests achieve better recall.
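
As an illustration of how per-class binary classifiers could yield single or mixed class assignments, and of the recall metric used for comparison, here is a minimal Python sketch. The combination rule (keep every class whose binary score reaches a threshold, otherwise fall back to the multiclass prediction), the threshold value, the class names, and the scores are all assumptions made for this example; the abstract does not specify the rule actually used in the thesis.

```python
def recall(y_true, y_pred, positive=1):
    """True positive rate: TP / (TP + FN) for the `positive` label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def assign_classes(binary_scores, multiclass_label, threshold=0.5):
    """Return every class whose binary score reaches `threshold`.

    Assumed rule, for illustration only: if no binary classifier fires,
    fall back to the single label predicted by the multiclass classifier.
    """
    fired = sorted(c for c, s in binary_scores.items() if s >= threshold)
    return fired if fired else [multiclass_label]


# Hypothetical scores for one star from three per-class binary classifiers.
scores = {"class_a": 0.81, "class_b": 0.64, "class_c": 0.12}
labels = assign_classes(scores, multiclass_label="class_a")
```

With these hypothetical scores the star receives a mixed assignment of two classes; a star for which no binary classifier fires would keep the single multiclass label.
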
Furthermore, we provide a second performance estimate for the final classifiers using the revised classes of our testing sample, which indicates that the classifiers perform best on single-class stars, which account for approximately 75% of the testing sample.
In the second application, we focus on the b278 and b279 fields of the VVV survey, which is conducted in the Ks infrared band. We analyze time-series data from over 60 epochs in each field to assess the performance of binary and multiclass classifiers. Our primary objective is for these classifiers to identify stellar variability and then differentiate between classes of variability, especially classical Cepheids, RR Lyrae, long-period variables, and Mira variables. Notably, the features used in this analysis do not depend on a period search. This approach allows the inclusion of variable stars that do not exhibit periodic changes in magnitude and avoids the computational cost of estimating periods a priori for the entire initial dataset. We build the training dataset from time-series data in the public catalog of the VVV Templates Project. Additionally, we include time-series data from variable stars observed in the 2MASS-GC02 and Terzan 10 globular clusters, and we generate synthetic non-variable light curves that emulate the cadence and magnitude uncertainties of the VVV data. We conduct a comparative analysis of the F1 scores of these classifiers. Finally, this research produces a catalog of variable star candidates in the direction of the Galactic bulge, including 266 candidates whose phased light curves are consistent with the morphology expected for their classes.
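
The second application can be illustrated with a short Python sketch, under stated assumptions: a synthetic non-variable light curve is drawn at given epochs with Gaussian noise matching the per-epoch uncertainties, a few features that need no period search are computed from it, and an F1 score helper is defined. The particular features (amplitude, standard deviation, reduced chi-square, von Neumann ratio), the epoch range, the uncertainty range, and the mean magnitude are illustrative choices, not the ones used in the thesis.

```python
import random
import statistics


def synthetic_constant_curve(times, errors, mean_mag, rng):
    """Constant star sampled at the given epochs, with Gaussian noise per epoch."""
    return [(t, rng.gauss(mean_mag, e), e) for t, e in zip(times, errors)]


def period_free_features(curve):
    """Features computed without any period search (illustrative selection)."""
    mags = [m for _, m, _ in curve]
    errs = [e for _, _, e in curve]
    mean = statistics.fmean(mags)
    var = statistics.variance(mags)
    n = len(mags)
    # Reduced chi-square of a constant-brightness model around the sample mean.
    chi2 = sum(((m - mean) / e) ** 2 for m, e in zip(mags, errs)) / (n - 1)
    # von Neumann ratio: mean squared successive difference over the variance.
    eta = sum((b - a) ** 2 for a, b in zip(mags, mags[1:])) / ((n - 1) * var)
    return {"amplitude": max(mags) - min(mags), "std": var ** 0.5,
            "chi2": chi2, "eta": eta}


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the `positive` label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


rng = random.Random(42)
# Hypothetical cadence and uncertainties; in practice these would be copied
# from observed light curves so the synthetic data emulate the survey.
times = sorted(rng.uniform(0.0, 1500.0) for _ in range(60))
errors = [rng.uniform(0.01, 0.05) for _ in times]
curve = synthetic_constant_curve(times, errors, mean_mag=13.0, rng=rng)
features = period_free_features(curve)
```

Because none of these quantities requires folding the light curve on a trial period, they can be computed for every source before any period estimation, which is the property the abstract highlights.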