Utilizing corpus stylistics to facilitate literary analysis: An assessment of the effectiveness of semantic domains in identifying major literary themes in a selection of Charles Dickens novels


Wesam Mohamed Abdelkhalek Ibrahim, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia




Keywords: corpus stylistics, Wmatrix5, semantic domains, Charles Dickens


This paper argues that corpus linguistic procedures are a useful addition to the analytical inventory of traditional stylistics. It explores how effectively one such procedure, the analysis of semantic domains, detects major literary themes in fiction. Five corpora were compiled for this purpose: one for each of four selected Charles Dickens novels (Oliver Twist, David Copperfield, Great Expectations and Our Mutual Friend) and one combining all four novels. Wmatrix5, with the BNC Sampler-Written as the reference corpus, is used to extract the key semantic domains in each corpus. The critical literature on the selected novels is consulted to identify their major themes, and these themes are then checked against the key semantic domains to establish whether, and to what extent, the corpus analysis reflects them. The findings confirm that analysing semantic domains is an effective procedure for studying literary texts, particularly in relation to their themes.
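
The comparison behind "key" semantic domains can be illustrated with a minimal sketch. It assumes the log-likelihood statistic that Wmatrix reports when comparing a study corpus with a reference corpus, and a cut-off of 6.63 (p < 0.01, 1 d.f.), which may differ from the one applied in the paper. The function names, domain labels and counts are hypothetical and are not results from the study.

```python
import math


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness for one item across two corpora."""
    combined = freq_study + freq_ref
    total = size_study + size_ref
    # Expected frequencies if the item were equally common in both corpora.
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    ll = 0.0
    if freq_study > 0:
        ll += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2.0 * ll


def key_domains(study_counts, ref_counts, threshold=6.63):
    """Return overused domains sorted by descending log-likelihood.

    The threshold is an assumed cut-off (p < 0.01), not the paper's.
    """
    size_study = sum(study_counts.values())
    size_ref = sum(ref_counts.values())
    results = []
    for domain, freq_study in study_counts.items():
        freq_ref = ref_counts.get(domain, 0)
        # Keep only domains that are relatively more frequent in the study corpus.
        overused = freq_study / size_study > freq_ref / size_ref
        ll = log_likelihood(freq_study, size_study, freq_ref, size_ref)
        if overused and ll >= threshold:
            results.append((domain, ll))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Invented counts of semantically tagged tokens per domain (illustration only).
novel_counts = {"kin": 300, "money": 200, "other": 9500}
reference_counts = {"kin": 100, "money": 190, "other": 9710}

for domain, score in key_domains(novel_counts, reference_counts):
    print(f"{domain}: LL = {score:.2f}")
```

With these invented counts, only the hypothetical "kin" domain clears the cut-off. Deciding whether such a domain corresponds to a literary theme still depends on reading the concordance lines and the critical literature, as the abstract describes.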






How to Cite

Ibrahim, W. M. A. (2022). Utilizing corpus stylistics to facilitate literary analysis: An assessment of the effectiveness of semantic domains in identifying major literary themes in a selection of Charles Dickens novels. AJELP: Asian Journal of English Language and Pedagogy, 10(1), 114–138. https://doi.org/10.37134/ajelp.vol10.1.9.2022