Corpus linguistics and child language

foundations for the design of a significative sample


Pablo Figueiredo Palacios, Universidade de Santiago de Compostela
Keywords: child language, phonology, corpus linguistics


This paper describes the design of a significative sample for a comparative study of the emergence of the phonological component in children aged 1;6 to 3;6 acquiring American English and Peninsular Spanish. It addresses two issues in corpus linguistics as they apply to child language corpora: the representativeness of data and the treatment of data, including a discussion of corpus-based and corpus-driven approaches. The sources of the data, the CHILDES and PhonBank projects, are also analyzed. Finally, the psycholinguistic criteria employed are examined closely, together with their relevance to the design of the data sample.


How to Cite
Figueiredo Palacios, P. (2019). Corpus linguistics and child language: foundations for the design of a significative sample. Journal of Linguistic Research, 21, 152–168.