Machine Translation Extension

Last modified by slauriere on 2024/11/19 16:14

Description

The goal of the translator extension is to allow users to translate content and, optionally, object fields from one language to several others using an external machine translation service such as LibreTranslate or DeepL. It also adds the ability to store the translated output either in a location distinct from the original page or in the same location, or to display it without storing it.

A side use case consists of declaring a given page as a translation of another page stored in a distinct location. This can be useful for multilingual websites where page names differ per language, as it eases switching from one page to its homologous page in another language.

The Translation extension will supersede the DeepL extension by adding an abstraction layer between the translation service and the translators, and by adding the ability to store the output as documents.

UX open questions

  • Do we enable multiple translation services per wiki and display a translator selector at translation time, as is done for LLMs in WAISE, or do we enable just one translator per wiki?
  • How does the service integrate with WAISE?
  • How do we visually indicate the original page?
  • Naming strategies: either prefix the name with the locale ("/job-offers" -> "fr/offres-emplois"), or translate in situ (same location). Are those strategies enough, or should we provide a more generic mechanism to configure the document name generation?
  • Do we introduce a dedicated section in the administration, or make the configuration part of the Content section?
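To make the naming question more concrete, here is a minimal sketch of how the two strategies mentioned above could sit behind a common interface, which would also leave room for a more generic mechanism. All type and method names (TranslationNamingStrategy, InSituNaming, LocalePrefixNaming, targetPath) are hypothetical and not part of any existing XWiki API; paths are plain strings for illustration only.

```java
import java.util.Locale;

/** Hypothetical strategy deciding where a translated page is stored. */
interface TranslationNamingStrategy {
    /**
     * @param originalPath path of the original page, e.g. "/job-offers"
     * @param translatedName page name already translated into the target language
     * @param target target locale
     * @return path at which the translation should be stored
     */
    String targetPath(String originalPath, String translatedName, Locale target);
}

/** In situ: the translation is stored at the same location as the original. */
final class InSituNaming implements TranslationNamingStrategy {
    @Override
    public String targetPath(String originalPath, String translatedName, Locale target) {
        return originalPath;
    }
}

/** Locale prefix: "/job-offers" becomes "fr/offres-emplois". */
final class LocalePrefixNaming implements TranslationNamingStrategy {
    @Override
    public String targetPath(String originalPath, String translatedName, Locale target) {
        return target.toLanguageTag() + "/" + translatedName;
    }
}
```

With such an interface, a configurable name generation mechanism would simply be a third implementation selected in the configuration.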

UI

Drop down content menu

dropdown-menu.png

A dropdown menu in the content menu lists all existing translations and allows launching the translation action when the current user has edit rights on the current page.

TODO: see how to give emphasis to the document in the original language and how to indicate whether the document has been reviewed or not.

Translator modal

translation-modal.png

When the user triggers the Translate action, a modal opens, listing all existing translations with their translation date and allowing the user to translate to a new language, retranslate, or bulk (re)translate. An additional column should display whether the translation was manually reviewed or not.

Translate action progress

translation-service-progress.png

  • In a first version, we will display the translation progress using the job macro, with a button at the bottom allowing the user to return to the original page and access its newly created or updated translations.
  • In a second version, the progress will be displayed directly in the modal, with the ability to cancel it, and with an updated list of translations once the job completes.

Language switcher

language-switcher.png

A dropdown in the top-level menu allows switching from one page to its homologous page in another language.

Publication button

The extension provides a Publish / Unpublish button that can be activated in the configuration, allowing a new translation to be switched from hidden to non-hidden. This makes it possible to implement a light workflow in which newly translated pages are not made visible until they have been reviewed.

Page children translation

When translating a page, the translation modal should expose an option allowing to translate the current page and its children.
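As an illustration of what "the current page and its children" could mean for the translation job, the sketch below collects a page and all of its descendants depth first. The page hierarchy is represented by a plain map from a page name to its child names, which is an assumption made for the example; the class name PageTreeSelection and the page names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class PageTreeSelection {
    private PageTreeSelection() {}

    /** Returns the given page followed by all its descendants, depth first. */
    static List<String> withDescendants(String page, Map<String, List<String>> children) {
        List<String> result = new ArrayList<>();
        collect(page, children, result);
        return result;
    }

    private static void collect(String page, Map<String, List<String>> children, List<String> out) {
        out.add(page);
        // Pages without an entry in the map are treated as leaves.
        for (String child : children.getOrDefault(page, Collections.emptyList())) {
            collect(child, children, out);
        }
    }
}
```

The resulting list could then be handed to the same job that translates a single page, so that progress reporting covers the whole subtree.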

Synchronisation status info

When a translated page is not in sync with the content in the original language, a visual indicator reflecting the situation should be shown to the user, inviting users with the appropriate rights to update the translation. Content becomes out of sync when the original content is updated after the translation took place.
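The out-of-sync condition described above boils down to a date comparison. A minimal sketch, assuming the last modification date of the original page and the stored translation date are both available as java.time.Instant values (the class and method names are hypothetical):

```java
import java.time.Instant;

final class TranslationSyncStatus {
    private TranslationSyncStatus() {}

    /** A translation is out of sync when the original was modified after the translation date. */
    static boolean isOutOfSync(Instant originalLastModified, Instant translationDate) {
        return originalLastModified.isAfter(translationDate);
    }
}
```

Note that this simple check would also flag minor edits of the original that do not change its translatable content; whether that matters is left open here.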

Implementation

XClasses

Class name: TranslatorClass
Description: Represents a translation service.
Properties: each translator instance has to provide:

  • A Component performing the actual translation and implementing a Translator interface with method "translate(String content, Locale sourceLanguage, Locale targetLanguage, boolean html)"
  • A configuration class allowing to configure the translator: API key or any other authentication credentials

Class name: TranslatorConfigurationClass
Description: Translation service configuration.
Properties:

  • Translation service (DBList): translator to be used to perform the translations
  • Target properties: list of object reference properties to be translated (e.g. "XWiki.Movie.MovieClass^storyLine"); this property makes sense only when documents are stored in distinct locations
  • Target classes: list of XClass references which condition the display of the translation dropdown menu (e.g. only "ArticleClass" or any other class for which it makes sense to use a translation service or to declare an equivalent page)

Class name: TranslationClass
Description: Represents a translated page.
Properties:

  • Translation date
  • Original page (useful only when the translation is stored in a distinct location from the original)
  • Reviewed (to indicate if the document was manually reviewed or not)

API

  • The extension should provide a Translation API that other applications can use.
  • Some translation services, such as DeepL, can use a dictionary of terms to be used when translating content. This capability should be reflected in the API.

Interface or class: Translator interface
Methods:

  • translate(String content, Locale from, Locale to)
  • translate(String content, Locale from, Locale to, boolean isHtml)

Interface or class: Translation options
Fields:

  • Dictionary
  • boolean isHtml
  • Locale from
  • Locale to
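
The following is a rough sketch of how the Translator interface and the translation options listed above might fit together in Java. It is only one possible shape, not the extension's actual API: the options-based translate(String, TranslationOptions) method, the representation of the dictionary as a Map of terms, and the FakeTranslator stand-in (which calls no external service) are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;

/** Options bundle mirroring the listed fields; the Map-based dictionary is an assumption. */
final class TranslationOptions {
    final Locale from;
    final Locale to;
    final boolean isHtml;
    final Map<String, String> dictionary;

    TranslationOptions(Locale from, Locale to, boolean isHtml, Map<String, String> dictionary) {
        this.from = from;
        this.to = to;
        this.isHtml = isHtml;
        this.dictionary = dictionary;
    }
}

interface Translator {
    /** Hypothetical entry point taking the full options bundle. */
    String translate(String content, TranslationOptions options);

    /** Convenience overload: no dictionary. */
    default String translate(String content, Locale from, Locale to, boolean isHtml) {
        Map<String, String> noTerms = Collections.emptyMap();
        return translate(content, new TranslationOptions(from, to, isHtml, noTerms));
    }

    /** Convenience overload: plain text, no dictionary. */
    default String translate(String content, Locale from, Locale to) {
        return translate(content, from, to, false);
    }
}

/** Stand-in for a real service: applies the dictionary terms and tags the result with the target language. */
final class FakeTranslator implements Translator {
    @Override
    public String translate(String content, TranslationOptions options) {
        String result = content;
        for (Map.Entry<String, String> term : options.dictionary.entrySet()) {
            result = result.replace(term.getKey(), term.getValue());
        }
        return "[" + options.to.toLanguageTag() + "] " + result;
    }
}
```

Keeping a single abstract method and expressing the simpler signatures as default methods means a new translation service only has to implement one method, while callers keep the short forms.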

See also

UI Extensions

  • Content menu extension listing existing translations and giving the ability to translate.
  • Top-level menu extension allowing to switch from one language to another.
  • Extension redirecting the home page to its homologous page in the language declared as preferred by the browser (on first visit only).
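
For the home page redirection, the preferred browser language is typically available through the Accept-Language request header. The sketch below shows one way the matching could be done with the standard java.util.Locale lookup API. The class name, the map from locale to home page path, and the example paths are hypothetical, and the "first visit only" condition (for instance via a cookie) is not covered.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class HomePageLocaleRedirect {
    private HomePageLocaleRedirect() {}

    /**
     * @param acceptLanguage value of the Accept-Language header, e.g. "fr-CH,fr;q=0.9,en;q=0.8"
     * @param homePages home page path per available locale (hypothetical structure)
     * @return the home page matching the best supported language, or empty if none matches
     */
    static Optional<String> resolve(String acceptLanguage, Map<Locale, String> homePages) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return Optional.empty();
        }
        // Parses the weighted ranges; throws IllegalArgumentException on a malformed header.
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
        // Lookup falls back from "fr-CH" to "fr" when only the latter is available.
        Locale best = Locale.lookup(ranges, homePages.keySet());
        return Optional.ofNullable(best).map(homePages::get);
    }
}
```

An empty result would mean that no redirection takes place and the visitor stays on the default home page.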

Quick actions

Add a quick action allowing to translate content while editing a page: this action will be listed in the "Actions" list of the quick actions (next to the "Search and replace" one); it will open a modal allowing to enter text, submit it for translation and insert the translated content at the current position.

Scheduled jobs

Add a Scheduler job allowing to bulk translate a set of documents.
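
The core loop of such a bulk job could look like the sketch below, which translates every document into every target language and records failures without aborting the whole run. It does not use the XWiki Scheduler or Job APIs: the translateAndStore callback stands in for whatever actually translates and saves a document, and all names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.BiConsumer;

final class BulkTranslationRun {
    private BulkTranslationRun() {}

    /**
     * @return outcome per "page@language" key, in processing order:
     *         true on success, false if translating that document failed
     */
    static Map<String, Boolean> run(List<String> pages, List<Locale> targets,
            BiConsumer<String, Locale> translateAndStore) {
        Map<String, Boolean> outcome = new LinkedHashMap<>();
        for (String page : pages) {
            for (Locale target : targets) {
                String key = page + "@" + target.toLanguageTag();
                try {
                    translateAndStore.accept(page, target);
                    outcome.put(key, true);
                } catch (RuntimeException e) {
                    // One failing document should not stop the remaining translations.
                    outcome.put(key, false);
                }
            }
        }
        return outcome;
    }
}
```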

Technical notes

  • We need to display the translation job status directly in the current page instead of switching to a page containing a job macro. This means using the Job Runner JavaScript API -> let's see if there are sample implementations with real jobs, e.g. in XWiki.PDFExport.WebHome.
  • Ideally the Translator configuration page should let users configure both the generic behaviour of the service and each translator's credentials (API key, tokens etc.), rather than using distinct pages. Let's see if other configurations use objects in multiple pages.

 

