Guidelines for Post-Editing Machine Translated Content

Machine translation (MT) systems such as Google Translate, Microsoft Translator, and DeepL have made dramatic progress in recent years. What was once the butt of many jokes now approaches near-human quality for some kinds of text. With ever-increasing volumes of content to translate, many organizations are turning to MT. Some have developed customized MT systems trained on a specific domain, which yields even better translation quality. The cost savings from MT can be substantial, and translated content reaches the market faster.

Still, MT quality is not perfect, and its output can contain mistranslated or otherwise incorrect content. To address this, the translation industry introduced a new service level: post-edited machine translation, or PEMT.

What is PEMT?

PEMT, short for post-edited machine translation, refers to a workflow in which text is first translated by software and then sent to professional translators, who review it and correct the sentences the MT software translated poorly. In an ideal scenario, PEMT can yield high-quality translations that cost much less than a purely human process.

PEMT Guidelines

Here are some guidelines and best practices for post-editing machine-translated content. Remember that the purpose of post-editing website content is to achieve good translation quality in the shortest time possible, so change only what is essential for clear understanding and grammatical correctness. Do not dwell on stylistic issues. If a sentence is grammatically correct and accurately translates the original, do not spend time improving it. There will probably be plenty of other sentences that need more work.

Comprehensive Guide for Post-Editing Machine Translations

When reviewing machine-translated content, it’s crucial to remember that it’s not immune to inaccuracies. It’s essential to meticulously inspect all machine-translated text, making necessary corrections to align it with the original document’s meaning, tone, and intent.

    1. Start by thoroughly understanding the source material. Grasp its significance, intention, and the tone it conveys.
    2. Assess the machine-translated content for its appropriateness. Often, modifying the machine translation to mirror the original makes for an efficient and accurate translation. However, there might be instances where starting afresh or utilizing a similar translation from the translation memory (TM) is more effective.
      • Be wary of over-editing. Stick to the guidelines, translation memories, glossaries, and style guides provided for the project. Avoid personal preference edits unless they improve fluency and accuracy without deviating from required references.
      • Guard against under-editing. Even if the translation seems smooth, hidden errors or inaccuracies might exist. Match each translated segment with its source, paying attention to common machine translation mistakes listed below, and adjust accordingly.
      • Run automated quality assurance (QA) checks with settings tailored to catch inconsistent terminology, repeated words, extra spaces, and spelling mistakes. Note that Microsoft Word’s spelling and grammar checker is often good at catching grammatical errors introduced by machine translation.
    3. Should you encounter recurring issues within the machine translations, please inform the content owner for future reference.
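
The automated QA checks mentioned in step 2 can be sketched in a few lines of Python. This is only an illustration, not the behavior of any particular translation tool: the function name `qa_check` and the glossary format (a mapping from source term to approved target term) are assumptions made for this example. It flags extra spaces, repeated words, and missing glossary terms in one translated segment.

```python
import re

# Hypothetical helper for illustration; real QA tools offer many more checks.
def qa_check(source: str, target: str, glossary: dict[str, str]) -> list[str]:
    """Return human-readable warnings for one translated segment."""
    issues = []

    # Extra spaces: two or more consecutive spaces, or leading/trailing whitespace.
    if re.search(r" {2,}", target) or target != target.strip():
        issues.append("extra spaces")

    # Repeated words: the same word twice in a row, e.g. "the the".
    match = re.search(r"\b(\w+)\s+\1\b", target, flags=re.IGNORECASE)
    if match:
        issues.append(f"repeated word: {match.group(1)}")

    # Terminology: if a glossary source term occurs in the source segment,
    # the approved target term is expected in the translation.
    for src_term, tgt_term in glossary.items():
        if src_term.lower() in source.lower() and tgt_term.lower() not in target.lower():
            issues.append(f"glossary term missing: {tgt_term}")

    return issues


glossary = {"control panel": "Systemsteuerung"}
print(qa_check("Open the control panel.", "Klicken Sie auf die die  Tafel.", glossary))
```

A check like this only points at candidate problems. The glossary test, for instance, is a plain substring match and would miss inflected forms of a term, so each warning still needs a human decision.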

Common MT errors

  • Unnecessary additions
  • Missing or untranslated words
  • Incorrect use of terminology
  • Errors in punctuation, capitalization, and hyphenation
  • Misplaced or absent tags
  • Issues with spacing, like extra spaces or incorrect spacing around numbers and units
  • Misinterpretation of proper nouns (pay special attention to client and product names)
  • Lack of consistency in terminology and style
  • Variances in tone
  • Misinterpreted text, including acronyms
  • Grammatical errors, such as incorrect gender usage or word order
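
Some of the errors above, such as misplaced or absent tags and untranslated text, are mechanical enough to flag automatically before a human review. The sketch below assumes inline tags look like HTML/XML tags (`<b>`, `</b>`) or numbered placeholders (`{1}`); the pattern and the function name `check_segment` are assumptions for illustration and would need adapting to the tag format a given project actually uses.

```python
import re
from collections import Counter

# Assumed tag formats: HTML/XML-style tags and numbered placeholders like {1}.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^>]*>|\{\d+\}")

def check_segment(source: str, target: str) -> list[str]:
    """Flag tag mismatches and untranslated segments for human review."""
    issues = []

    src_tags = TAG_PATTERN.findall(source)
    tgt_tags = TAG_PATTERN.findall(target)

    # Compare tags as multisets so that repeated tags are counted correctly.
    missing = Counter(src_tags) - Counter(tgt_tags)
    extra = Counter(tgt_tags) - Counter(src_tags)
    if missing:
        issues.append("missing tags: " + " ".join(sorted(missing.elements())))
    if extra:
        issues.append("extra tags: " + " ".join(sorted(extra.elements())))
    if not missing and not extra and src_tags != tgt_tags:
        # Reordering can be legitimate in translation, so this is only a hint.
        issues.append("tag order differs from source")

    # A target identical to a non-empty source was probably left untranslated.
    if source.strip() and source.strip() == target.strip():
        issues.append("segment left untranslated")

    return issues


print(check_segment("Click <b>Save</b> to continue.",
                    "Cliquez sur <b>Enregistrer pour continuer."))
```

Errors of meaning, tone, terminology choice, and grammar are not covered by a check like this and still require comparing each segment with its source.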