AI and the Art of Peace: Navigating Translation and Tradition

Peace agreements are foundational legal documents that are critical in resolving conflicts. Every word and its placement carry significant meaning that can influence the outcome of peace processes. When agreements are translated from their original languages, accuracy in both vocabulary and form is paramount, and achieving it requires attention to nuances of language and culture. Artificial Intelligence (AI) translation tools are increasingly sophisticated, but they still struggle to interpret these nuances.

An example of such nuance is a local agreement included in the Version 8 release of PA-X: ‘The Peace Agreement District Kurram executed in Para Chanar’. This local inter-tribal six-point ceasefire agreement outlines the areas of enforcement in the town, provides for elements of demobilization, and sets out terms of violation. As this agreement was identified as being in Urdu, it was sent to Umar Shehzad, a native speaker, for translation.

Image: the peace agreement entitled ‘The Peace Agreement District Kurram executed in Para Chanar’, written in Urdu

The third clause of this agreement states, ‘Starting with immediate effect, the “taega” has been put in place between the fighting parties for one year.’

Image: a section of the peace agreement, written in Urdu

Umar was previously unfamiliar with this word or custom, so he researched it further. His investigation revealed that the entire agreement is written in Urdu except for the Pashto word “تیگہ” (“taega”), which occurs in the third clause and is central to the whole agreement. It may have been kept in Pashto because it cannot be translated into Urdu (or, probably, into any other language).

The word “تیگہ” (“taega”) has two meanings: it is the Pashto word for “stone”, and it also refers to a region-specific tribal tradition of committing warring parties to a cessation of hostilities. The tribal elders gather in a jirga (a meeting of tribal chiefs) and, once they reach an agreement, symbolically set the terms in stone, strictly prohibiting all acts of violence between the fighting parties with immediate effect. The agreement is usually time-limited, lasting from 40 days to one year. The custom is invoked to secure an immediate ceasefire pending a complete settlement of the dispute, providing respite from the fighting and allowing day-to-day business, such as travel and trade, to resume. Committing violence while “taega” is in place is considered a grave violation, and the full force of the community comes down on the violator.

Artificial Intelligence and Translation

As innovation in Artificial Intelligence (AI) continues to progress, so does the potential to apply this technology to translation.

The term ‘natural language’ refers to ‘human’ language, as opposed to programming languages such as Python or JavaScript. Natural Language Processing (NLP), the field concerned with building systems that understand and generate human language, can use AI as one of its tools. Various AI techniques have been deployed to improve the accuracy and scale of automated translation, from Google Translate to generative AI such as ChatGPT.

AI techniques for translation are most beneficial where scale matters more than accuracy. A large business might need a solution that can translate thousands of documents quickly and cheaply, and it may be able to tolerate some errors, depending on what is being translated. While the accuracy of these technologies is improving, AI still has difficulty with the nuances of natural language, and it is those very nuances that are critical to understanding and researching peace agreements.

Language as data

Accurate translation of peace agreements is a critical component of the Peace Agreement Database (PA-X), a database for research and analysis containing over 2,000 peace agreements and ceasefires from 1990 onwards.

Peace agreement language is data that can be understood and analysed in its own right. To allow for accurate categorization of the content of peace agreements, the PA-X team at the University of Edinburgh codes each peace agreement provision against a list of over 225 categories for quantitative and qualitative analysis.

How language is used can be critical to how peace agreements are understood, including their intent, their categorization, and their status as legally binding documents. When translating agreements, the PA-X team is not just looking for the English equivalent of the words being used; it is also interested in what that language, in its own terms and cultural context, means for the intent and binding nature of the agreement itself. To capture this context, the PA-X team currently uses human translators to translate peace agreements. While this can be time- and resource-intensive, the benefits go beyond just accurate translations.

Key challenges in AI translation

So why is the agreement ‘The Peace Agreement District Kurram executed in Para Chanar’ important when comparing AI translation techniques to human translation? The first and most apparent reason is that translation applications using machine learning techniques struggle to translate the document.

Google Translate identified that provision 3 was about ceasefires but missed the significance of the word ‘taega’:

Image: machine-translated sections of the peace agreement

This is important as, with this translation, the provision would only be searchable within the ‘Ceasefire Provisions’ category of the PA-X Peace Agreement Database. However, properly translating this provision would also make it searchable in the ‘traditional leaders’ category. In a sense, this AI-based translation solution would result in ‘lost data’ for researchers using PA-X.
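The ‘lost data’ effect can be pictured with a minimal Python sketch. The `Provision` structure and `search` function below are hypothetical illustrations, not the PA-X schema or its search interface, and the machine-translated text is an invented stand-in rather than the actual Google Translate output. Only the two category names are taken from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Provision:
    """A single coded provision (hypothetical structure, not the PA-X schema)."""
    text: str
    categories: set = field(default_factory=set)


def search(provisions, category):
    """Return the provisions that were coded with the given category."""
    return [p for p in provisions if category in p.categories]


# The same clause, coded from two different translations. The codings are
# illustrative assumptions; the machine text is an invented stand-in.
machine_coded = Provision(
    text="(machine translation: ceasefire wording, no mention of 'taega')",
    categories={"Ceasefire Provisions"},
)
human_coded = Provision(
    text="Starting with immediate effect, the 'taega' has been put in place "
         "between the fighting parties for one year.",
    categories={"Ceasefire Provisions", "traditional leaders"},
)

# A researcher searching for the role of traditional leaders only finds the
# clause when it was coded from the human translation.
machine_hits = search([machine_coded], "traditional leaders")
human_hits = search([human_coded], "traditional leaders")
print(len(machine_hits), len(human_hits))
```

Both versions of the clause remain findable under ‘Ceasefire Provisions’; the difference only shows up when a researcher searches for the customary dimension of the agreement.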

It is important to note that ChatGPT, currently the most popular generative AI platform, could not translate the agreement at all because the text came from an image of a document. The program identified the language as Urdu but missed that one word in the document was Pashto.

What does this tell us about translating peace agreements with AI? Languages and the culture surrounding them are data and context critical to correctly understanding and analysing peace agreements. Translating a peace agreement, as Umar did with ‘The Peace Agreement District Kurram executed in Para Chanar’, is not just a mechanical task but a piece of research that requires skill and understanding. This type of data is currently challenging for an AI system to comprehend.
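A single language label for a whole document can hide one word from another language. As a rough illustration only, the Python sketch below checks a clause against a small watch-list of culturally specific terms so that it can be routed to a human translator. The watch-list, the `flag_terms` function, and the placeholder clause are all hypothetical; only the word itself and its gloss come from this article, and nothing here reflects how the PA-X team or any translation tool actually works.

```python
# Hypothetical watch-list of culturally specific terms that should be routed
# to a human translator. Only the "taega" entry is taken from the article.
TAEGA = "تیگہ"
WATCH_LIST = {
    TAEGA: "taega (Pashto): 'stone'; also a time-limited truce set by a jirga",
}


def flag_terms(text, watch_list=WATCH_LIST):
    """Return the watch-list entries whose term occurs in the text."""
    return {term: note for term, note in watch_list.items() if term in text}


# Placeholder clause: the surrounding Urdu text is elided, not reproduced.
clause = f"[...] {TAEGA} [...]"
flags = flag_terms(clause)
for term, note in flags.items():
    print(f"Needs human review: {term} -> {note}")
```

A list like this only catches terms someone has already identified, which is the article's point: recognising that a word such as “taega” matters in the first place took a human translator's research.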

While AI translation solutions are proving useful for industries whose primary challenge is scale, human translation remains indispensable for researchers, policymakers, and other interested parties, where accuracy and insight are most critical.


About the Authors:

Adam Farquhar is a Research Associate and Data Officer at PeaceRep. He supports the management, development, and coding of the PA-X Peace Agreement Database and its sub-databases. His research interests include the application of geocoding and AI in peacebuilding. You can contact him at adam.farquhar@ed.ac.uk.

Umar Shehzad is a doctoral researcher at the University of Edinburgh. His research looks into the politics and aesthetics of the face in modern literature. You can contact him at umar.shehzad@ed.ac.uk.