A machine-learning-based model for the one-year mortality prediction in patients admitted to an intensive care unit with diagnosis of sepsis

ABSTRACT: The Intensive Care Unit (ICU) is a hospital department that provides intensive treatment to patients with severe and life-threatening conditions. The primary function of the ICU is to deliver care which cannot be administered in other areas of the hospital. Patients in the ICU are the most...

Authors:
García Gallo, Javier Esteban
Resource type:
Doctoral thesis
Publication date:
2019
Institution:
Universidad de Antioquia
Repository:
Repositorio UdeA
Language:
spa
OAI Identifier:
oai:bibliotecadigital.udea.edu.co:10495/14479
Online access:
http://hdl.handle.net/10495/14479
Keywords:
Mortality
Mortalidad
Quality of life
Calidad de vida
Preventive medicine
Medicina preventiva
Disease control
Lucha contra las enfermedades
http://vocabularies.unesco.org/thesaurus/concept7468
http://vocabularies.unesco.org/thesaurus/concept3622
http://vocabularies.unesco.org/thesaurus/concept5458
http://vocabularies.unesco.org/thesaurus/concept8186
Rights
openAccess
License
Attribution-NonCommercial-NoDerivs 2.5 Colombia (CC BY-NC-ND 2.5 CO)
id UDEA2_d7a71f50efa47d05c8ef5f30fcf8c794
oai_identifier_str oai:bibliotecadigital.udea.edu.co:10495/14479
network_acronym_str UDEA2
network_name_str Repositorio UdeA
repository_id_str
dc.title.spa.fl_str_mv A machine-learning-based model for the one-year mortality prediction in patients admitted to an intensive care unit with diagnosis of sepsis
dc.creator.fl_str_mv García Gallo, Javier Esteban
dc.contributor.advisor.none.fl_str_mv Duitama Muñoz, John Freddy
Fonseca Ruiz, Nelson Javier
dc.contributor.author.none.fl_str_mv García Gallo, Javier Esteban
dc.subject.unesco.none.fl_str_mv Mortality
Mortalidad
Quality of life
Calidad de vida
Preventive medicine
Medicina preventiva
Disease control
Lucha contra las enfermedades
dc.subject.unescouri.none.fl_str_mv http://vocabularies.unesco.org/thesaurus/concept7468
http://vocabularies.unesco.org/thesaurus/concept3622
http://vocabularies.unesco.org/thesaurus/concept5458
http://vocabularies.unesco.org/thesaurus/concept8186
description ABSTRACT: The Intensive Care Unit (ICU) is a hospital department that provides intensive treatment to patients with severe and life-threatening conditions. The primary function of the ICU is to deliver care that cannot be administered in other areas of the hospital. Patients in the ICU are the most heavily monitored patients in the entire hospital; for this reason the ICU is a data-rich environment, even to the point of exhaustion. The vast amount of data obtained from a single patient in an intensive care unit makes it humanly impossible to organize and interpret it in the required time; thus, scores that model patient severity and can be related to mortality have been created. The primary motivation for these scores was to derive further insight into the patient's condition and improve patient care. Traditionally, these scores are population-based and provide statistically rigorous results for an average patient. They are useful to guide prognostication, to assess ongoing disease development and organ function, to compare ICU performance over time and across units, and to compare clinical trial population outcomes. Unfortunately, they are not precise enough to draw conclusions about groups of patients that share a relevant clinical condition, such as a particular disease, and even less so to predict outcomes for individual patients. When standard scores do not fit the data of a specific population well enough, two approaches have been used to adapt them for patients with the specific condition. One approach is to modify the traditional scores so that they apply specifically to patients who share a condition; we refer to these as adjusted models. The other approach is to develop entirely new models based on a population that shares common characteristics, incorporating additional variables that could potentially enhance accuracy; we refer to these as customized models.
Sepsis patients are an especially vulnerable population: they present a high in-hospital mortality of 25–30% and are frequently cared for in ICUs, either because sepsis itself led to their admission or because sepsis developed as a complication of an admission for other reasons. Moreover, it has been reported that sepsis survivors have substantially increased risks of all-cause mortality and of major health complications at 1 year after discharge when compared with the general population. For sepsis patients in the ICU, mortality prediction has been addressed through both adjusted and customized models; however, the approaches proposed so far have focused on in-hospital mortality, and no methods have been proposed to identify and predict long-term risk and mortality in sepsis patients cared for in the ICU. Accordingly, in this work we present the development of a model that goes beyond the prediction of in-hospital mortality and flags those patients who may have a poor prognosis after being discharged from the hospital. We formulate our research question as follows: Among adult ICU patients, is it possible to identify those who are at risk of dying one year after their sepsis-related admission using demographic variables, comorbidities and physiological data obtained during the first 24 hours of their ICU stay? To answer this question, we used three approaches. First, we developed a custom one-year mortality prediction model using a Stochastic Gradient Boosting (SGB) technique. The model was based on data from 5650 ICU admissions of patients who were retrospectively identified as having sepsis, and used 132 predictors obtained from variables found in the literature review or suggested by experts.
In the first approach, we also used two techniques to measure the importance of the predictors, and we found 17 predictors that allowed us to develop an SGB model with performance similar to that of the complete model (which uses all 132 predictors). In the second approach, we developed a methodology that allows the stratification of patients according to their one-year mortality risk. For this, we extended our study cohort using two additional retrospective criteria for sepsis identification and focused only on the variables that were relevant (according to the results of the previous approach) or that were routinely measured in ICU patients, obtaining 15082 admissions. From this cohort we developed two scoring systems that are correlated with the patients' one-year mortality risk. Although the customized models developed for sepsis patients in the ICU far outperformed adjusted scores on the one-year mortality prediction task, they remain population-based and therefore provide “the average best choice” for sepsis patients. For this reason, in the third approach, we also propose and evaluate the generation of personalized models based on patient similarity metrics. The goal of these personalized models is to identify patients who are similar to a new patient and derive insights from the data of those similar patients to provide personalized predictions. Personalized models have been widely used for predictions in several fields, including music, movies and e-commerce; however, there are still very few studies that focus on personalized prediction models based on health data. Moreover, no studies have been reported in which personalized models are developed from a population known to be very homogeneous, such as our study population, in which all patients are known to have infection, organ dysfunction, and ICU stays of more than 24 hours.
The models developed with the three approaches showed discrimination superior to that of adjusted models based on traditional severity scores, and the population-based methodologies also presented adequate calibration. Specifically, our personalized models demonstrated the value of patient similarity metrics in outcome prediction modeling and showed superiority when compared with population-based models. Also, since we focused on long-term mortality prediction, these models successfully identify those patients who are at risk of dying one year after their sepsis-related admission using demographic variables, comorbidities and physiological data obtained during the first 24 hours of their ICU stay, indicating early which patients should be accompanied, observed attentively and provided with additional care that improves their quality of life. Finally, to enable the clinical use of the machine learning models developed for the prediction of one-year mortality of sepsis patients in the ICU, we developed software based on the models that performed better, with the functionalities considered useful for intensivists to obtain details of the particular condition of each patient and provide better care.
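The patient-similarity idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's method: the abstract does not say which similarity metric or predictive model was used, so cosine similarity and a k-nearest-neighbour mortality rate are assumptions made here for illustration. The function names (`cosine_similarity`, `personalized_risk`) and the toy feature values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity is undefined for a zero vector; treat as 0
    return dot / (norm_a * norm_b)


def personalized_risk(new_patient, cohort, k=3):
    """Estimate one-year mortality risk from the k most similar past patients.

    cohort is a list of (features, died) pairs, where features is a list of
    already-normalized values and died is 1 (died within a year) or 0.
    Assumption: risk is the observed mortality rate among the k neighbours.
    """
    if not cohort:
        raise ValueError("cohort must contain at least one patient")
    ranked = sorted(
        cohort,
        key=lambda record: cosine_similarity(new_patient, record[0]),
        reverse=True,
    )
    neighbours = ranked[:k]
    return sum(died for _, died in neighbours) / len(neighbours)


# Hypothetical toy cohort: three normalized features per patient.
cohort = [
    ([0.9, 0.8, 0.1], 1),
    ([0.8, 0.9, 0.2], 1),
    ([0.1, 0.2, 0.9], 0),
    ([0.2, 0.1, 0.8], 0),
    ([0.7, 0.6, 0.3], 0),
]
risk = personalized_risk([0.85, 0.85, 0.15], cohort, k=3)
```

In this sketch the prediction for a new patient depends only on the outcomes of the patients who most resemble them, which is what distinguishes a personalized model from a population-based one fitted to the whole cohort.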
publishDate 2019
dc.date.issued.none.fl_str_mv 2019
dc.date.accessioned.none.fl_str_mv 2020-05-20T17:18:42Z
dc.date.available.none.fl_str_mv 2020-05-20T17:18:42Z
dc.type.spa.fl_str_mv info:eu-repo/semantics/doctoralThesis
dc.type.coarversion.fl_str_mv http://purl.org/coar/version/c_b1a7d7d4d402bcce
dc.type.hasversion.spa.fl_str_mv info:eu-repo/semantics/draft
dc.type.coar.spa.fl_str_mv http://purl.org/coar/resource_type/c_db06
dc.type.redcol.spa.fl_str_mv https://purl.org/redcol/resource_type/TD
dc.type.local.spa.fl_str_mv Tesis/Trabajo de grado - Monografía - Doctorado
format http://purl.org/coar/resource_type/c_db06
status_str draft
dc.identifier.uri.none.fl_str_mv http://hdl.handle.net/10495/14479
url http://hdl.handle.net/10495/14479
dc.language.iso.spa.fl_str_mv spa
language spa
dc.rights.*.fl_str_mv Attribution-NonCommercial-NoDerivs 2.5 Colombia (CC BY-NC-ND 2.5 CO)
dc.rights.spa.fl_str_mv info:eu-repo/semantics/openAccess
dc.rights.uri.*.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/2.5/co/
dc.rights.accessrights.spa.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.rights.creativecommons.spa.fl_str_mv https://creativecommons.org/licenses/by-nc-nd/4.0/
eu_rights_str_mv openAccess
dc.format.extent.spa.fl_str_mv 166
dc.format.mimetype.spa.fl_str_mv application/pdf
dc.publisher.place.spa.fl_str_mv Medellín, Colombia
institution Universidad de Antioquia
bitstream.url.fl_str_mv http://bibliotecadigital.udea.edu.co/bitstream/10495/14479/3/GarciaJavier_2019_MachineLearningBased.pdf
http://bibliotecadigital.udea.edu.co/bitstream/10495/14479/4/license_rdf
http://bibliotecadigital.udea.edu.co/bitstream/10495/14479/5/license.txt
bitstream.checksum.fl_str_mv 3719ecb64eaf51f055dd05e78d7aa4fa
b88b088d9957e670ce3b3fbe2eedbc13
8a4605be74aa9ea9d79846c1fba20a33
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
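The MD5 checksums listed in this record can be used to check that a downloaded bitstream arrived intact. The following is a minimal sketch using only Python's standard `hashlib`; it assumes the file has already been saved locally, and the helper names (`md5_of_file`, `matches_checksum`) are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Checksum of the thesis PDF as listed in bitstream.checksum.fl_str_mv.
EXPECTED_PDF_MD5 = "3719ecb64eaf51f055dd05e78d7aa4fa"
```

For example, `matches_checksum("GarciaJavier_2019_MachineLearningBased.pdf", EXPECTED_PDF_MD5)` would return `True` for an unmodified copy saved under that name in the working directory. MD5 is adequate for detecting accidental corruption but not for proving that a file has not been tampered with.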
repository.name.fl_str_mv Repositorio Institucional Universidad de Antioquia
repository.mail.fl_str_mv andres.perez@udea.edu.co
_version_ 1812173157390352384
spelling Duitama Muñoz, John Freddy; Fonseca Ruiz, Nelson Javier; García Gallo, Javier Esteban; 2019; Doctor en Ingeniería Electrónica; Doctorado; Facultad de Ingeniería. Doctorado en Ingeniería Electrónica; Universidad de Antioquia; Tesis doctoral; 2021-05-21 11:44:17.679