Using an Anchor to Improve Linear Predictions with Application to Predicting Disease Progression

Authors:
Karanevich, Alex G.
He, Jianghua
Gajewski, Byron
Resource type:
Journal article
Publication date:
2018
Institution:
Universidad Nacional de Colombia
Repository:
Universidad Nacional de Colombia
Language:
spa
OAI Identifier:
oai:repositorio.unal.edu.co:unal/66487
Online access:
https://repositorio.unal.edu.co/handle/unal/66487
http://bdigital.unal.edu.co/67515/
Keywords:
51 Mathematics
31 General statistics collections / Statistics
Anchor
Amyotrophic lateral sclerosis
Biased regression
Linear models
Ordinary least squares
Rights
openAccess
License
Attribution-NonCommercial 4.0 International
Description
Summary: Linear models are some of the most straightforward and commonly used modelling approaches. Consider modelling approximately monotonic response data arising from a time-related process. If one knows when the process began or ended, then one may be able to leverage additional assumed data to reduce prediction error. This assumed data, referred to as the anchor, is treated as an additional data-point generated at either the beginning or the end of the process. The response value of the anchor is set to an intelligently selected value of the response (such as the upper bound, lower bound, or 99th percentile of the response, as appropriate). The anchor reduces the variance of prediction at the cost of a possible increase in prediction bias, resulting in a potentially reduced overall mean squared prediction error. This can be extremely effective when few individual data-points are available, allowing one to make linear predictions using as few as a single observed data-point. We develop the mathematics showing the conditions under which an anchor can improve predictions, and also demonstrate using this approach to reduce prediction error when modelling the disease progression of patients with amyotrophic lateral sclerosis.
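
The idea in the summary can be illustrated with a minimal sketch, which is not the authors' implementation: append one assumed data-point (the anchor) to the observed data and fit an ordinary least squares line. The function names, the time unit, and all numbers below are hypothetical and chosen only for illustration; the example assumes a response that sits at an upper bound of 48 when the process begins (time 0).

```python
import numpy as np

def fit_line(t, y):
    """Ordinary least squares fit of y = b0 + b1 * t; returns (b0, b1)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(t)), t])  # intercept column plus time
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]

def fit_line_with_anchor(t, y, anchor_t, anchor_y):
    """Treat the anchor as one extra data-point, then fit by OLS."""
    t_aug = np.append(np.asarray(t, dtype=float), anchor_t)
    y_aug = np.append(np.asarray(y, dtype=float), anchor_y)
    return fit_line(t_aug, y_aug)

# Hypothetical data: a single observation 12 time units after onset.
t_obs = [12.0]
y_obs = [36.0]

# Assumed anchor: the response equals its upper bound (48) at onset (t = 0).
b0, b1 = fit_line_with_anchor(t_obs, y_obs, anchor_t=0.0, anchor_y=48.0)

# Linear prediction at a later time, possible here only because of the anchor.
pred_24 = b0 + b1 * 24.0
```

With a single observed data-point, a line cannot be identified without the anchor; with it, the fit passes through both points. When more observations are available, the anchor acts as one additional row in the regression, pulling the fitted line toward the assumed starting value and thereby trading some bias for reduced variance.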