Data mining classification techniques: an application to tobacco consumption in adolescents

Authors

  • Juan J. Montaño, Universidad de las Islas Baleares
  • Elena Gervilla, Universidad de las Islas Baleares
  • Berta Cajal, Universidad de las Islas Baleares
  • Alfonso Palmer, Universidad de las Islas Baleares
DOI: https://doi.org/10.6018/analesps.30.2.160881
Keywords: artificial neural networks, nicotine, data mining, tobacco, logistic regression model, discriminant analysis

Abstract

This study analyzes the predictive power of several psychosocial and personality variables on nicotine consumption versus non-consumption in the adolescent population, using several classification techniques from the Data Mining methodology. Specifically, we analyze artificial neural networks (ANNs), namely the Multilayer Perceptron (MLP), Radial Basis Function (RBF) networks, and Probabilistic Neural Networks (PNN), together with decision trees, the logistic regression model, and discriminant analysis. The sample comprised 2666 adolescents, of whom 1378 did not consume nicotine and 1288 were nicotine consumers. The models analyzed correctly discriminated between the two types of subject at rates ranging from 77.39% to 78.20%, reaching a sensitivity of 91.29% and a specificity of 74.32%. This study places at the disposal of specialists in addictive behaviors a set of advanced statistical techniques that can handle a large number of variables and subjects simultaneously and can automatically learn complex patterns and relationships, which makes them well suited to the prediction and prevention of addictive behavior.
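As a reading aid for the accuracy, sensitivity, and specificity figures above, the following minimal Python sketch shows how these three measures are conventionally derived from a binary confusion matrix, with "nicotine consumer" taken as the positive class. The `ConfusionMatrix` class and the cell counts are hypothetical illustrations, not the study's data or code; only the two class sizes (1288 consumers, 1378 non-consumers) come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Cell counts for a binary classifier (positive = nicotine consumer)."""
    tp: int  # consumers correctly classified as consumers
    fn: int  # consumers misclassified as non-consumers
    tn: int  # non-consumers correctly classified as non-consumers
    fp: int  # non-consumers misclassified as consumers

    def accuracy(self) -> float:
        # Share of all subjects that were classified correctly.
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total

    def sensitivity(self) -> float:
        # Share of actual consumers that the model detects.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of actual non-consumers that the model detects.
        return self.tn / (self.tn + self.fp)


# Hypothetical cell counts, NOT results from the study. Only the row totals
# (1288 consumers, 1378 non-consumers) match the sample in the abstract.
cm = ConfusionMatrix(tp=1000, fn=288, tn=1100, fp=278)

print(f"accuracy:    {cm.accuracy():.2%}")
print(f"sensitivity: {cm.sensitivity():.2%}")
print(f"specificity: {cm.specificity():.2%}")
```

Because accuracy is a weighted average of sensitivity and specificity (weighted by class size), a model's accuracy always lies between its own sensitivity and specificity.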

Author biographies

Juan J. Montaño, Universidad de las Islas Baleares

Associate Professor (Profesor Titular de Universidad) at the Universidad de las Islas Baleares, in the area of Data Analysis

Elena Gervilla, Universidad de las Islas Baleares

Assistant Professor (Profesora Ayudante Doctor) at the Universidad de las Islas Baleares

Berta Cajal, Universidad de las Islas Baleares

Associate Professor (Profesora Titular de Universidad) at the Universidad de las Islas Baleares, in the area of Foundations of Methodology and Data Analysis

Alfonso Palmer, Universidad de las Islas Baleares

Full Professor (Catedrático de Universidad) at the Universidad de las Islas Baleares, in the area of Applied Statistics

Published
2014-04-08
How to cite
Montaño, J. J., Gervilla, E., Cajal, B., & Palmer, A. (2014). Técnicas de clasificación de data mining: una aplicación al consumo de tabaco en adolescentes. Anales de Psicología / Annals of Psychology, 30(2), 633–641. https://doi.org/10.6018/analesps.30.2.160881
Section
Psychology and adolescence