A derivation of the optimal answer-copying index and some applications

Multiple choice exams are frequently used as an efficient and objective instrument to evaluate knowledge. Nevertheless, they are more vulnerable to answer-copying than tests based on open questions. Several statistical tests (known as indices) have been proposed to detect cheating but to the best of...

Authors: Romero, Mauricio
Jara Pinzón, Diego
Riascos Villegas, Álvaro José

Resource type: Working paper

Publication date: 2014

Institution: Universidad de los Andes

Repository: Séneca: repositorio Uniandes

Language: spa

id	UNIANDES2_aede0f155672e36ae69c1cda5b67e074
oai_identifier_str	oai:repositorio.uniandes.edu.co:1992/8507
network_acronym_str	UNIANDES2
network_name_str	Séneca: repositorio Uniandes
repository_id_str
dc.title.none.fl_str_mv	A derivation of the optimal answer-copying index and some applications
dc.title.alternative.none.fl_str_mv	Una derivación de índice de copia óptimo y algunas aplicaciones
title	A derivation of the optimal answer-copying index and some applications
spellingShingle	A derivation of the optimal answer-copying index and some applications Answer copying False discovery rate Index Neyman-Pearson Lemma Mediciones y pruebas educativas Pruebas de conocimiento C19, I20
title_short	A derivation of the optimal answer-copying index and some applications
title_full	A derivation of the optimal answer-copying index and some applications
title_fullStr	A derivation of the optimal answer-copying index and some applications
title_full_unstemmed	A derivation of the optimal answer-copying index and some applications
title_sort	A derivation of the optimal answer-copying index and some applications
dc.creator.fl_str_mv	Romero, Mauricio Jara Pinzón, Diego Riascos Villegas, Álvaro José
dc.contributor.author.none.fl_str_mv	Romero, Mauricio Jara Pinzón, Diego Riascos Villegas, Álvaro José
dc.subject.keyword.none.fl_str_mv	Answer copying False discovery rate Index Neyman-Pearson Lemma
topic	Answer copying False discovery rate Index Neyman-Pearson Lemma Mediciones y pruebas educativas Pruebas de conocimiento C19, I20
dc.subject.armarc.none.fl_str_mv	Mediciones y pruebas educativas Pruebas de conocimiento
dc.subject.jel.none.fl_str_mv	C19, I20
description	Multiple choice exams are frequently used as an efficient and objective instrument to evaluate knowledge. Nevertheless, they are more vulnerable to answer-copying than tests based on open questions. Several statistical tests (known as indices) have been proposed to detect cheating, but to the best of our knowledge they all lack mathematical support that guarantees optimality in any sense. This work aims to fill this void by deriving the uniformly most powerful (UMP) test assuming the response distribution is known. In practice we must estimate a behavioral model that yields a response distribution for each question. We calculate the empirical type-I and type-II error rates for several indices that assume different behavioral models, using simulations based on real data from twelve nationwide multiple choice exams taken by 5th and 9th graders in Colombia. We find that the index with the highest power among those studied, subject to the restriction of preserving the type-I error, is the one that uses a nominal response model for item answering, conditions on the answers of the individual suspected of being the source of the copying, and calculates critical values via a normal approximation. This index was first studied by Wollack (1997) and later by W. Van der Linden and Sotaridona (2006) and is superior to the indices studied and developed by Wesolowsky (2000) and Frary, Tideman, and Watts (1977). Furthermore, we compare the performance of the indices on examination rooms with different levels of proctoring and find that increasing the level of proctoring can reduce copying by as much as 50% and that simple strategies such as having different students answer different portions of the test at different times can also reduce cheating by over 50%. Finally, a Bonferroni-type false discovery rate procedure is used to detect massive cheating. The application is straightforward and we believe it could be used to make entire examination rooms retake an exam under stricter surveillance conditions.
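As a rough illustration of the kind of index the description refers to (matches with the suspected source, compared against what a response model predicts, with a normal approximation for the critical value), the following Python sketch may help. It is not the authors' implementation: the function names `copy_index` and `p_value` are hypothetical, and the uniform answer probabilities in the example are placeholders standing in for the estimated nominal response model that the paper calls for.

```python
import math


def copy_index(copier, source, probs):
    """Standardized count of answers the copier shares with the source.

    copier, source: sequences with the option chosen on each item.
    probs: probs[i][option] is the ASSUMED probability that the copier
           would choose `option` on item i when answering independently.
    """
    # Observed number of items on which both examinees gave the same answer.
    matches = sum(c == s for c, s in zip(copier, source))
    # Conditioning on the source's answers, each item is a Bernoulli trial
    # whose success probability is the chance of picking the source's option.
    p_match = [probs[i][s] for i, s in enumerate(source)]
    mean = sum(p_match)
    var = sum(p * (1.0 - p) for p in p_match)
    return (matches - mean) / math.sqrt(var)


def p_value(z):
    """Upper-tail probability of a standard normal (the normal approximation)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Toy example: 40 four-option items, placeholder uniform probabilities.
n_items = 40
source = ["A"] * n_items
copier = ["A"] * 25 + ["B"] * 15  # 25 answers coincide with the source
probs = [{"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25} for _ in range(n_items)]

z = copy_index(copier, source, probs)
print(z, p_value(z))
```

A large positive `z` (small `p_value`) means more matches than independent answering would explain under the assumed model. When many pairs are screened at once, the threshold would need a multiple-testing correction such as the Bonferroni-type procedure the description mentions.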
publishDate	2014
dc.date.issued.none.fl_str_mv	2014
dc.date.accessioned.none.fl_str_mv	2018-09-27T16:53:51Z
dc.date.available.none.fl_str_mv	2018-09-27T16:53:51Z
dc.type.spa.fl_str_mv	Documento de trabajo
dc.type.coarversion.fl_str_mv	http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.driver.spa.fl_str_mv	info:eu-repo/semantics/workingPaper
dc.type.coar.spa.fl_str_mv	http://purl.org/coar/resource_type/c_8042
dc.type.content.spa.fl_str_mv	Text
dc.type.redcol.spa.fl_str_mv	https://purl.org/redcol/resource_type/WP
format	http://purl.org/coar/resource_type/c_8042
dc.identifier.issn.none.fl_str_mv	1657-5334
dc.identifier.uri.none.fl_str_mv	http://hdl.handle.net/1992/8507
dc.identifier.eissn.none.fl_str_mv	1657-7191
dc.identifier.doi.none.fl_str_mv	10.57784/1992/8507
dc.identifier.instname.spa.fl_str_mv	instname:Universidad de los Andes
dc.identifier.reponame.spa.fl_str_mv	reponame:Repositorio Institucional Séneca
dc.identifier.repourl.spa.fl_str_mv	repourl:https://repositorio.uniandes.edu.co/
identifier_str_mv	1657-5334 1657-7191 10.57784/1992/8507 instname:Universidad de los Andes reponame:Repositorio Institucional Séneca repourl:https://repositorio.uniandes.edu.co/
url	http://hdl.handle.net/1992/8507
dc.language.iso.none.fl_str_mv	spa
language	spa
dc.relation.ispartofseries.none.fl_str_mv	Documentos CEDE No. 32 Agosto de 2014
dc.relation.repec.SPA.fl_str_mv	https://ideas.repec.org/p/col/000089/012061.html
dc.rights.uri.*.fl_str_mv	http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights.accessrights.spa.fl_str_mv	info:eu-repo/semantics/openAccess
dc.rights.coar.spa.fl_str_mv	http://purl.org/coar/access_right/c_abf2
rights_invalid_str_mv	http://creativecommons.org/licenses/by-nc-nd/4.0/ http://purl.org/coar/access_right/c_abf2
eu_rights_str_mv	openAccess
dc.format.extent.none.fl_str_mv	27 pages
dc.format.mimetype.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Universidad de los Andes, Facultad de Economía, CEDE
publisher.none.fl_str_mv	Universidad de los Andes, Facultad de Economía, CEDE
institution	Universidad de los Andes
bitstream.url.fl_str_mv	https://repositorio.uniandes.edu.co/bitstreams/d0aca76d-368d-4159-8ae8-c90babcb4bae/download https://repositorio.uniandes.edu.co/bitstreams/88e95fc3-1ed3-4c87-a507-53f8e7be3013/download https://repositorio.uniandes.edu.co/bitstreams/960d48da-2a3c-40f4-a69e-3051fff4b3bc/download
bitstream.checksum.fl_str_mv	48614e2b8108626659e9f400390e1219 668e044f4a557a8db8606415522753f2 c3d03bdb5c14aa9cc3d629613eb70821
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5
repository.name.fl_str_mv	Repositorio institucional Séneca
repository.mail.fl_str_mv	adminrepositorio@uniandes.edu.co
_version_	1837005027377414144
