Treatment of missing data in an educational evaluation with longitudinal data
DOI:
https://doi.org/10.18222/eae.v36.11449
Keywords:
Academic Performance, Missing Data Treatment, Longitudinal Study, Regression Analysis
Abstract
Missing data in educational assessments are related to students' performance and profile. This study proposes a new approach based on pattern-mixture models for analyzing incomplete data in longitudinal evaluations. This approach is compared with listwise deletion (LD) and multiple imputation (MI) procedures, using linear growth models, based on a sample of 8,681 high school students in Ceará state, Brazil. The results show that the procedures differ in their estimates of the effects of predictor variables and of the average rate of learning in mathematics. The new approach yields more realistic estimates of the average rate of learning, and the trajectories it generates are more coherent than those estimated by the multiple imputation procedure.
License
Copyright (c) 2025 Luis Gustavo do Amaral Vinha, Jacob Arie Laros

This work is licensed under a Creative Commons Attribution 4.0 International License.
a. Authors retain the copyright and grant the journal the right to first publication.
b. All works are licensed under the Creative Commons Attribution License, which allows the sharing of the paper with acknowledgment of authorship.
Until 2024, Estudos em Avaliação Educacional adopted the Creative Commons Attribution-NonCommercial (CC BY-NC) license for its publications. For texts published from 2025 onwards, the journal will adopt the Creative Commons Attribution (CC BY) license, in line with the principles of Open Science.





