Google Translate vs. DeepL: A quantitative evaluation of close-language pair translation (French to English)


  • Ahmad Yulianto Faculty of Languages and Arts, Universitas Negeri Semarang (UNNES), INDONESIA
  • Rina Supriatnaningsih Faculty of Languages and Arts, Universitas Negeri Semarang (UNNES), INDONESIA



assessment, evaluation, human text, machine translation, metric


Machine translation has improved in quality and works best when applied to language pairs from the same language family. This research aimed to assess the quality of Google Translate (GT) and DeepL in terms of accuracy and readability. French-to-English translations of the playscript En attendant Godot produced by GT and DeepL were evaluated, with the English Original version (EO) of the text serving as the reference. Two quantitative methods were employed: manual assessment with the SAE J2450 translation metric and automatic assessment with the Coh-Metrix tool. The manual assessment shows that both the GT and DeepL outputs passed, scoring 84 and 99.04 respectively; according to the CdT rubric, a translation is good when it scores 80-99 points. In the Coh-Metrix results, the GT and DeepL scores varied. Statistical analysis with ANOVA shows that GT and DeepL are not significantly different from EO: the mean score is 99.69 for EO, 100.4 for GT, and 100.78 for DeepL. In conclusion, DeepL scores higher in the manual assessment, which indicates greater accuracy, while GT and DeepL are roughly equivalent in the Coh-Metrix assessment. In terms of readability, DeepL offers better reading ease, as shown by the Flesch Reading Ease, Flesch-Kincaid Grade Level, and Coh-Metrix readability formulas, all of which favor DeepL. Despite these statistical results, both GT and DeepL still need to improve in many respects, such as world knowledge and the ability to resolve lexical and structural ambiguities.
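
The two Flesch readability formulas named in the abstract can be illustrated with a minimal sketch. The coefficients are the standard published ones; the sentence splitter and the vowel-group syllable counter are simplifying assumptions introduced here for illustration and are not the procedure used by Coh-Metrix or by the authors, so scores computed this way will differ from those reported in the study.

```python
import re


def count_syllables(word):
    """Naive syllable estimate (an assumption): count groups of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text):
    """Return (sentences, words, syllables) using simple regex heuristics."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(sentences, words, syllables):
    """Flesch Reading Ease: higher values indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(sentences, words, syllables):
    """Flesch-Kincaid Grade Level: lower values indicate easier text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Hypothetical sample text, not taken from the study's data.
    sample = "The cat sat. It was happy."
    counts = text_counts(sample)
    print(counts)
    print(round(flesch_reading_ease(*counts), 2))
    print(round(flesch_kincaid_grade(*counts), 2))
```

Because the two formulas run in opposite directions, a system with better reading ease would be expected to show a higher Flesch Reading Ease score and a lower Flesch-Kincaid Grade Level for comparable text.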




Ahrenberg, L. (2017). Comparing machine translation and human translation: A case study. In Proceedings of The First Workshop on Human-Informed Translation and Interpreting Technology (pp. 21–28). Wolverhampton, UK.

Arnold, E. A. 1994. Machine Translation: An Introductory Guide. London: Blackwells-NCC.

Bell, Roger T. (1991). Translation and Translating: Theory and Practice. London and New York: Longman.

Carroll, John B. 1966. An experiment in evaluating the quality of translation. Mechanical Translation and Computational Linguistics, 9(3-4):67–75.

Crystal, D. (1991). The Cambridge encyclopedia of language. Cambridge, UK: Cambridge University Press.

Finch, A., Hwang, Y.S. and Sumita, E. (2005). Using machine translation evaluation techniques to determine sentence-level semantic equivalence. In Proceedings of the Third International Workshop on Paraphrasing (IWP2005) (pp. 17-24)

Fowler, C. A., & Hodges, B. H. (2011). Dynamics and languaging: toward an ecology of language. Ecological Psychology, 23(3), 147-156. Retrieved on August 16, 2021 from

Graesser, A. C., McNamara, D. S., & Louwerse, M. M (2003). What do readers need to learn in order to process coherence relations in narrative and expository text. In A.P. Sweet and C.E. Snow (Eds.), Rethinking reading comprehension. New York: Guilford Publications.

Graesser, A. C., McNamara D. S., Louwerse, M. and Cai, Z. (2004). Coh-Metrix: Analysis of text on cohesion and language. Behavior Research Methods, Instruments, & Computers 2004, 36 (2), 193-202.

Horguelin, Paul and Louise Brunette. 1998. Pratique de la Révision, 3ème Edition Revue et Augmentée. Québec: Linguatech éditeur.

Khomeijani, F. 2005. A Framework for Translation Evaluation. Cambridge: Blackwell Publishers Inc.

Larose, R. 1998. “Méthodologie de l’Évaluation des Traductions”. [A Method for Assessing Translation Quality]. Meta 43: 163-86.

Larson, M. L. 1984. A Guide to Cross language Equivalence. Maryland: University Press of America.

Li, H., Graesser, A.C., and Cai, Z. (2014). Comparison of Google translation with human translation. In Proceedings of the Twenty-Seventh International

Martinez, R. M. 2014. A Deeper look into metrics for translation quality assessment (TQA). In Miscellanea: A Journal of English and American studies 49 (2014): pp. 73-94. ISSN: 1137-6368.

Nida, Eugene A. & Taber, Charles R. (1969). The theory and practice of translation. Leiden: E.J. Brill.

Nord, Christiane. 1997. Translation as a Purposeful Activity. Manchester, UK: St. Jerome.

O’Brien, Sharon. 2012. “Towards a Dynamic Quality Evaluation Model for Translation”. Jostrans: The Journal of Specialized Translation 17. Retrieved from (Accessed 16 August, 2021) as cited in Dialnet

Parra Galiano, Silvia. 2005. La Revisión de Traducciones en la Traductología : Aproximación a la Práctica de la Revisión en el Ámbito Profesional Mediante el Estudio de Casos y Propuestas de Investigación. (Doctoral Dissertation, Universidad de Granada, Spain). Retrieved from (Accessed 31 July, 2021) as cited in Dialnet

Puchała-Ladzińska, K. (2016). Machine translation: a threat or an opportunity for human translators? Studia Anglica Resoviensia 13: 89-98.

Rahimi, R. 2004. Alpha, Beta and Gamma Features in Translation: Towards the Objectivity of Testing Translation, Translation Studies. Norwood: Ablex Publishing.

Richards, J., Platt, J., & Platt, H. (1997). Dictionary of language teaching and applied linguistics. London: Longman.

SAE. 2001. Translation Quality Metric [J2450]. Warrendale, PA: SAE.

Secâra, Alina. 2005. Translation Evaluation – a State of the Art Survey. eCoLoRe/MeLLANGE Workshop Proceedings. Leeds, UK: University of Leeds Press: 39-44.

Shankland, S. (2013). Google Translate Now Serves 200 Million People Daily. CNET. Retrieved from

Štefčík, J. (2015). Evaluating Machine Translation Quality: A Case Study of Translation - a Verbatim Transcription from Slovak into German. Vertimo Studijos. 2015. ISSN 2029-7033.

Sun, Sanjun. (2015). Measuring translation difficulty: Theoretical and methodological considerations. Across Languages and Cultures. 2015. DOI: 10.1556/084.2015.16.1.2. Retrieved from on 17 August 2021.

Templin, M. C. (1957). Certain language skills in children. Minneapolis: University of Minnesota Press.

Torop, P. (2002). Translation as translating as culture. Sign Systems Studies, 30(2), 593–605.

Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Lukasz Kaiser, Stephan

Waddington, Christopher. 2000. Estudio Comparativo de Diferentes Métodos de Evaluación de Traducción General (Inglés–Español). Madrid, Spain: Universidad Pontificia de Comillas.

Williams, Malcolm. 1989. “The Assessment of Professional Translation Quality: Creating Credibility out of Chaos”. TTR: Traduction, Terminologie, Redaction 2: 13-33.



How to Cite

Yulianto, A., & Supriatnaningsih, R. (2021). Google Translate vs. DeepL: A quantitative evaluation of close-language pair translation (French to English). AJELP: Asian Journal of English Language and Pedagogy, 9(2), 109–127.