MetaPhrase

An open source translation editor for native mobile app localization.

Tech notes

This page contains some notes about the development of MetaPhrase, organized in the following sections: technologies used, architecture, how to build, and code documentation.

Technologies used

This section will briefly describe the main technologies involved in the project and the reasons why each of them was selected. This is a personal side project so, naturally, some of the decisions were totally arbitrary and relied more on personal taste or personal history than anything else. I’ll briefly document my experience with any alternatives that I’ve tried and abandoned for each of the sections.

Language and build tools

The original idea behind this project was to create an open source application that could run on all major desktop operating systems (Windows, Linux and macOS), sharing as much code as possible. At the same time, I had roughly ten years of experience in mobile development, and I have always been an enthusiast of the Kotlin programming language: it is powerful and expressive yet concise, and it features really interesting concepts such as structured concurrency with coroutines, null safety with nullable types, and a multi-paradigm approach combining the best of object orientation and functional programming.

I have been using Kotlin for my Android projects for the last 3 years. This is why I decided to implement this application as a Kotlin multiplatform project, where virtually all the code can safely reside in the jvmMain source set. Another implication of this choice was that I could leverage my experience with Gradle as a build tool.
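As a rough illustration of that setup, a build script for a multiplatform module with a single JVM target might look like the sketch below. The plugin versions and the main class name are placeholders chosen for the example, not the values actually used by the project.

```kotlin
// build.gradle.kts (sketch; versions and names are assumptions)
plugins {
    kotlin("multiplatform") version "1.8.20"     // placeholder version
    id("org.jetbrains.compose") version "1.4.0"  // placeholder version
}

repositories {
    mavenCentral()
    google()
}

kotlin {
    jvm() // the only target, so the code can live in the jvmMain source set

    sourceSets {
        val jvmMain by getting {
            dependencies {
                // Compose for Desktop artifacts for the OS running the build
                implementation(compose.desktop.currentOs)
            }
        }
    }
}

compose.desktop {
    application {
        mainClass = "MainKt" // hypothetical entry point
    }
}
```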

User Interface

Being an Android developer, I was already accustomed to working with Jetpack Compose on a daily basis, but until 2022 I’d had few chances to experiment with its multiplatform releases, especially the desktop flavor on the JVM. This project was the perfect occasion to put it to the test and see how far it was possible to go with it.

Compose makes it easy to create and maintain isolated and reusable components with a very convenient declarative style. The desktop version apparently has fewer components than older and more widespread toolkits, but it has all the basic building blocks that are needed, and it is very easy to combine them and style them to fit the project's needs. Despite being a mobile-first library, as far as I’ve seen it has quite good support for desktop specificities such as mouse pointer inputs (right click, double click, …), context menus, etc.
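To give an idea of what such a reusable component looks like, here is a minimal sketch of a composable and a window hosting it. SegmentRow and its parameters are invented for the example and are not taken from the MetaPhrase code base; the Compose calls are the standard Material (M2) ones.

```kotlin
import androidx.compose.foundation.clickable
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.padding
import androidx.compose.material.MaterialTheme
import androidx.compose.material.Text
import androidx.compose.runtime.Composable
import androidx.compose.ui.Modifier
import androidx.compose.ui.unit.dp
import androidx.compose.ui.window.Window
import androidx.compose.ui.window.application

// Hypothetical reusable component: one source/target pair in a message list.
@Composable
fun SegmentRow(source: String, target: String, onClick: () -> Unit) {
    Column(
        modifier = Modifier
            .fillMaxWidth()
            .clickable(onClick = onClick)
            .padding(8.dp)
    ) {
        Text(text = source, style = MaterialTheme.typography.caption)
        Text(text = target, style = MaterialTheme.typography.body1)
    }
}

fun main() = application {
    Window(onCloseRequest = ::exitApplication, title = "Sketch") {
        MaterialTheme {
            SegmentRow(source = "Hello", target = "Ciao", onClick = { println("clicked") })
        }
    }
}
```

The component only receives data and callbacks through its parameters, which is what makes it easy to reuse in different screens.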

I didn’t evaluate many alternatives and, honestly, I never regretted not having used Swing or SWT.

Architecture and navigation

One of the main aspects that I missed from Android development on desktop was a navigation library (since the Jetpack libraries ported as of mid-2023 do not include Navigation). In this project I decided to give the Decompose library a chance (more info here).

Apart from providing constructs for both stack- and slot-based navigation, I found out that its hierarchical structure could fit into a Model-View-ViewModel architectural pattern (with the component being the viewmodel and the content being the view). Moreover, since each node of the hierarchy only knows about its direct children, it creates a highly modular architecture where it is quite easy to refactor and “transplant” entire subgraphs from one point to another.

[Side note: The validation UI was originally a modal dialog, but transforming it into just another component to display in the bottom panel was almost painless.]
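The "each node only knows its direct children" idea can be sketched without any library. The classes below are hypothetical and deliberately do not use Decompose's API; they only show why moving a child from one parent to another is a local change.

```kotlin
// Library-agnostic sketch: these types are hypothetical and are NOT Decompose's API.

// A component plays the role of the viewmodel: it holds the state that a view renders.
interface Component {
    val title: String
}

class ValidationComponent : Component {
    override val title = "Validation"
}

class GlossaryComponent : Component {
    override val title = "Glossary"
}

// Slot-style navigation: the parent has at most one active child and knows nothing
// about what that child contains.
class BottomPanelComponent : Component {
    override val title = "Bottom panel"

    var activeChild: Component? = null
        private set

    fun show(child: Component) { activeChild = child }
    fun dismiss() { activeChild = null }
}

// The root only knows the panel, not what the panel is currently showing.
class RootComponent(val bottomPanel: BottomPanelComponent = BottomPanelComponent()) : Component {
    override val title = "Root"
}

fun main() {
    val root = RootComponent()
    // Hosting the validation UI in the panel is a one-line change at the parent.
    root.bottomPanel.show(ValidationComponent())
    println(root.bottomPanel.activeChild?.title) // prints "Validation"
    root.bottomPanel.show(GlossaryComponent())
    println(root.bottomPanel.activeChild?.title) // prints "Glossary"
}
```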

I evaluated other alternatives such as Precompose and Voyager, but I preferred Decompose for being unobtrusive and integrating seamlessly with Koin for dependency injection (as far as I know, Voyager is still missing support for Koin integration on desktop).

Dependency Injection

I was primarily accustomed to Dagger and Hilt from the Android world, and I had much less experience with other frameworks for Kotlin (such as Kodein or Koin). I tried both and settled on Koin because I find it concise and yet very powerful. Sometimes it can be tricky to figure out the cause of an error, because messages are not always crystal clear (even if they mostly are, especially compared to Dagger 😅) and, since dependencies are resolved at runtime, missing bindings only show up when the application runs. Still, I found that if you know what you are doing (at least most of the time) it’s no big deal.
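For readers who have not used Koin, a minimal sketch of its DSL follows. The repository and use case classes are made up for the example; only module, single, factory, get and startKoin come from Koin itself.

```kotlin
import org.koin.core.context.startKoin
import org.koin.dsl.module

// Hypothetical classes, only for illustration.
interface GlossaryRepository {
    fun terms(): List<String>
}

class InMemoryGlossaryRepository : GlossaryRepository {
    override fun terms() = listOf("segment", "locale")
}

class SearchTermsUseCase(private val repository: GlossaryRepository) {
    operator fun invoke(prefix: String) = repository.terms().filter { it.startsWith(prefix) }
}

val glossaryModule = module {
    single<GlossaryRepository> { InMemoryGlossaryRepository() } // one shared instance
    factory { SearchTermsUseCase(get()) }                        // new instance per request
}

fun main() {
    val koin = startKoin { modules(glossaryModule) }.koin
    // Dependencies are resolved here, at runtime: if the GlossaryRepository binding
    // were missing from the module, this call would fail instead of the compiler.
    val search = koin.get<SearchTermsUseCase>()
    println(search("seg")) // prints [segment]
}
```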

I also evaluated Kodein: honestly, it did its job well, but I remember having had some problems injecting the same singleton instance across different Gradle modules (and I didn’t want to fall back to plain language objects without being able to pass dependencies dynamically).

Primary storage and ORM

I wanted all the application data to be saved in a local database. My experience with embedded databases was limited to SQLite on mobile, but here I wanted to interact with the database through a JDBC driver. I tried the org.xerial:sqlite-jdbc library and it worked while running debug builds, but I found out that I couldn’t ship release builds because notarization failed on macOS. So, looking for alternatives, I decided to give the H2 embedded database and its JDBC driver a try, and I never regretted it.

Apart from the storage engine, I also wanted to use an ORM, and I discovered the JetBrains Exposed library, which is powerful, intuitive and very well documented. I had complex queries (e.g. translation memory and glossary lookups) with multiple joins (even recursively) and I was amazed to see how well it performs and integrates with Kotlin’s coroutines.
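A minimal sketch of Exposed on top of an in-memory H2 database is shown below. The table is hypothetical and much simpler than a real glossary schema, and the imports assume the package names used by Exposed releases before 1.0.

```kotlin
import org.jetbrains.exposed.dao.id.IntIdTable
import org.jetbrains.exposed.sql.Database
import org.jetbrains.exposed.sql.SchemaUtils
import org.jetbrains.exposed.sql.insert
import org.jetbrains.exposed.sql.selectAll
import org.jetbrains.exposed.sql.transactions.transaction

// Hypothetical table, not the project's real schema.
object GlossaryTerms : IntIdTable("glossary_terms") {
    val lemma = varchar("lemma", 255)
    val langCode = varchar("lang_code", 8)
}

fun loadLemmata(): List<String> = transaction {
    SchemaUtils.create(GlossaryTerms)
    GlossaryTerms.insert {
        it[lemma] = "segment"
        it[langCode] = "en"
    }
    // Rows are mapped inside the transaction, so the result is a plain list.
    GlossaryTerms.selectAll().map { it[GlossaryTerms.lemma] }
}

fun main() {
    // In-memory H2 database; DB_CLOSE_DELAY=-1 keeps it alive between connections.
    Database.connect("jdbc:h2:mem:sketch;DB_CLOSE_DELAY=-1", driver = "org.h2.Driver")
    println(loadLemmata())
}
```

Switching to a file-based database should only require a different JDBC URL; the table definition and the queries stay the same.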

I evaluated other popular alternatives in the multiplatform environment, such as SQLDelight from Cash App, but I found it more focused on mobile, with documentation lacking for desktop (especially for a multi-module desktop application). For example, the documentation never mentioned that, in order for schema generation to take place, the .sq files had to be put under src/jvmMain, so I was left with undecipherable errors and no solution. Additionally, integration with H2 was still experimental, so I didn’t insist much.

Secondary storage

Luckily, the AndroidX DataStore library from Jetpack has been ported to multiplatform (it is in alpha stage), so I decided to give it a try. Since I only had to save a couple of primitive values in the secondary storage (for anything more complicated I had the database), I chose the preference-based data store. It worked well, despite some minor concurrency issues. I did not evaluate any alternative here; maybe java.util.prefs.Preferences would have been a better choice, and I’m open to suggestions.
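For comparison, this is roughly what storing a couple of primitive values looks like with the JDK's java.util.prefs.Preferences, the alternative mentioned above. It is only a sketch of that option, not what the project uses; the node path and key names are made up.

```kotlin
import java.util.prefs.Preferences

// Hypothetical node path and key names.
private val prefs: Preferences = Preferences.userRoot().node("metaphrase/sketch")

fun saveSettings(lang: String, spellcheckEnabled: Boolean) {
    prefs.put("lang", lang)
    prefs.putBoolean("spellcheck_enabled", spellcheckEnabled)
}

// The second argument of each getter is the default returned when the key is missing.
fun loadLang(): String = prefs.get("lang", "en")
fun loadSpellcheckEnabled(): Boolean = prefs.getBoolean("spellcheck_enabled", false)

fun main() {
    saveSettings(lang = "it", spellcheckEnabled = true)
    println(loadLang())
    println(loadSpellcheckEnabled())
}
```

Unlike DataStore, this API is synchronous and does not expose values as a Flow, so observing changes would need extra code.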

XML (de)serialization

In this project, parsing and writing XML was a core feature (for Android resources), and I chose the Redundent library (more info here) because it is really lightweight and a piece of cake to use. I really enjoyed its declarative approach to serialization, which makes the code easy to write and to maintain. Parsing with Redundent is somewhat of a “minor” feature (and indeed not well documented), but it does its job well and it’s very concise.
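The sketch below shows the declarative style in question, assuming that "Redundent" refers to the kotlin-xml-builder library published under the org.redundent group. The message keys and values are invented; the output mimics the shape of an Android strings.xml file.

```kotlin
import org.redundent.kotlin.xml.xml

// Builds an Android-style <resources> document from key/value pairs.
fun buildResources(messages: Map<String, String>): String =
    xml("resources") {
        for ((key, value) in messages) {
            "string" {
                attribute("name", key)
                -value // unary minus adds a text node
            }
        }
    }.toString()

fun main() {
    // Hypothetical messages.
    println(buildResources(mapOf("app_name" to "MetaPhrase", "action_ok" to "OK")))
}
```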

I had used more standard approaches such as SAXParser and XmlPullParser, and both of them were quite a nightmare. After discovering Redundent I am looking forward to eradicating them from my other projects too! 😂

Logging

I wanted a logger able to write logs to a file (rather than just to the console) for bug reporting purposes, so that I could ask users to send me the logs if they encountered issues with the program. I didn’t find a suitable (and well documented) solution for Kotlin multiplatform, so I had to fall back to plain Java solutions, such as SLF4J and Log4j. Being a modern developer (this is opinionated, I know!) I was not very fond of having to write XML configuration files in order to get this to work (and it was not easy to configure the log file destination via environment variables), but I didn’t find any viable alternative so I had to settle for it.

Networking

This was an offline application focused on storing data only in local files and databases. The need to perform network requests emerged when adding the integration with machine translation services. For this I selected the Ktor library, due to the popularity it has gained in the multiplatform environment and to its very good documentation. Plus, its content negotiation plugin offers out-of-the-box support for JSON with kotlinx.serialization, so it was very easy to integrate and to work with in general.
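A minimal sketch of a Ktor 2.x client with JSON content negotiation follows. The endpoint URL and the request and response shapes are placeholders; real machine translation providers define their own APIs, and the CIO engine is just one possible choice.

```kotlin
import io.ktor.client.HttpClient
import io.ktor.client.call.body
import io.ktor.client.engine.cio.CIO
import io.ktor.client.plugins.contentnegotiation.ContentNegotiation
import io.ktor.client.request.post
import io.ktor.client.request.setBody
import io.ktor.http.ContentType
import io.ktor.http.contentType
import io.ktor.serialization.kotlinx.json.json
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json

// Hypothetical request/response shapes.
@Serializable
data class TranslateRequest(val text: String, val source: String, val target: String)

@Serializable
data class TranslateResponse(val translation: String)

fun createClient(): HttpClient = HttpClient(CIO) {
    install(ContentNegotiation) {
        // Tolerate fields that the response type does not declare.
        json(Json { ignoreUnknownKeys = true })
    }
}

suspend fun translate(client: HttpClient, request: TranslateRequest): TranslateResponse =
    client.post("https://mt.example.com/translate") { // placeholder URL
        contentType(ContentType.Application.Json)
        setBody(request) // serialized to JSON by the ContentNegotiation plugin
    }.body()
```

With the plugin installed, both the request body and the response are converted to and from the @Serializable classes without manual JSON handling.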

Architecture

Module structure

The code is organized in a series of isolated modules that can be grouped into three categories in hierarchical order (inspired by Robert Martin’s Clean Architecture):

  • core modules: common utilities that provide the low-level infrastructure and are shared among higher modules. Core module names conventionally start with the core- prefix.
  • domain modules: business logic (repositories and use cases); in the most complex cases these modules also contain data definitions and persistence classes (entities and data sources). Domain modules are organized by domain specificity (project, translation memory, glossary, etc.) and their names start with the domain- prefix.
  • feature modules: top-level modules that contain the presentation and UI layer for the different sections of the application (project, translation editor, panels, dialogs, etc.). They are easily identified because their names start with the feature- prefix.

Each module can have multiple submodules, especially complex ones (such as domain or feature modules). Domain modules, for example, tend to have a submodule for repositories and a submodule for use cases; they can optionally have a data and/or persistence submodule. Feature modules can have a dialog submodule (with each dialog as a sub-submodule) or can have different submodules modeling different UI parts (e.g. the translation toolbar or the message list in the feature-translate module).
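In Gradle terms, a nested structure like this is usually declared in the settings script. The fragment below is a sketch that reuses a few module names from the list further down this page; the root project name and the exact set of include lines are assumptions.

```kotlin
// settings.gradle.kts (sketch; the actual list of modules is an assumption)
rootProject.name = "MetaPhrase"

include(":core-common")
include(":core-persistence")

// A domain module with its optional data/persistence submodules
include(":domain-glossary:data")
include(":domain-glossary:persistence")
include(":domain-glossary:repository")
include(":domain-glossary:usecase")

// A feature module with a dialog sub-submodule and a panel submodule
include(":feature-translate")
include(":feature-translate:dialog:newsegment")
include(":feature-translate:panel:glossary")
```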

If a module requires dependency injection, the bindings for the classes and interfaces of that particular module are contained in a di package within that same module.

Here is a short description of what can be found in each module:

  • core-common contains a set of shared utilities divided by package: coroutine dispatchers (coroutines), file system (files), data store (keystore), logging (log), notification center (notification), shared UI components and theme (ui), extension functions and utilities (utils).
  • core-localization contains the main entry point to localization in the L10n shared object and String extension functions; L10n uses the internal DefaultLocalization repository class to manage the language bundles and load them from resources via the LocalizationResourceLoader.
  • core-persistence contains the AppDatabase class, which provides a centralized entry point for the persistence layer and acts as a factory for the DAO classes (which are found in each domain persistence submodule). Whenever a new persisted entity is created, AppDatabase needs to be updated both for schema creation/update and with the factory method that creates the new DAO.
  • domain-formats contains the business logic (mainly use cases) to manage import and export to resource files (Android XML, iOS stringtables, Windows resx, GNU gettext PO, ngx-translate JSON, Flutter ARB, Java properties).
  • domain-glossary contains the data layer and business logic layer of the glossary feature; it is divided into the following submodules:
    • data contains the model classes for the glossary terms
    • persistence contains the entity definitions and local data source (DAO, data access object) for glossary terms and associations between terms
    • repository contains the repositories to create, read, update and delete glossary terms and associations between terms
    • usecase contains the use cases needed to perform the glossary lookup operations
  • domain-language contains the data layer and business logic layer for language management; it is divided into the following submodules:
    • data contains the model classes for languages
    • persistence contains the local data sources for language data
    • repository contains the repositories to create, read and delete language data
    • usecase contains the use cases needed to interact with language data
  • domain-mt contains the business logic, data sources and data transfer objects used in the connectors for Machine Translation providers.
  • domain-project contains the data layer and business logic layer for project and messages (segments and segment pairs, aka translation units); it is divided into the following submodules:
    • data contains the model classes for project, segments and translation units
    • persistence contains the entity definitions and local data sources for project data
    • repository contains the repositories to create, read, update and delete project data
    • usecase contains the use cases needed to interact with project data
  • domain-spellcheck contains the business logic for the spelling checker and sentence analyzer. The entry point for this feature is the ValidateSpellingUseCase class for checking a sentence for errors (and to get suggestions); whereas stemming is accessible via the Spelling#getLemmata(String) method.
  • domain-tm contains the data layer and business logic layer for the translation memory; it is divided into the following submodules:
    • data contains the model classes for translation memory entries
    • persistence contains the entity definitions and local data sources for translation memory entries
    • repository contains the repositories to create, read, update and delete entries from the translation memory
    • usecase contains the use cases needed to check for similarities and to import, export and manage the translation memory
  • feature-intro contains the presentation logic and UI for the intro screen (empty project screen)
  • feature-main contains the presentation logic and UI for the root content, that routes the user either towards the intro screen or the projects content (either project list or translation editor); this module also contains the dialog:settings submodule for the application settings dialog.
  • feature-projects contains the presentation logic that routes the user either towards the project list or the translate content; it additionally has the following submodules:
    • list contains the presentation logic and UI for the project list screen
    • dialog:newproject contains the presentation logic and UI for the project creation and edit dialog
    • dialog:statistics contains the presentation logic and UI for the project statistics dialog
  • feature-translate is the main module, which contains the presentation logic for the translation editor, panels and dialogs; it has the following submodules:
    • dialog:newsegment contains the presentation logic and UI for the message creation dialog
    • dialog:newterm contains the presentation logic and UI for the new glossary term dialog
    • messages contains the presentation logic and UI for the message list inside the translation editor
    • panel:glossary contains the presentation logic and UI for the glossary panel
    • panel:matches contains the presentation logic and UI for the TM matches panel
    • panel:memory contains the presentation logic and UI for the TM content panel
    • panel:validate contains the presentation logic and UI for the validation panel
    • toolbar contains the presentation logic and UI for the translate toolbar
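To make the data / persistence / repository / usecase split more concrete, here is a small self-contained sketch of how the layers of a domain module could depend on each other. Every class in it is a hypothetical stand-in (including the simplified AppDatabase, which is backed by a list rather than a real database) and none of it is copied from the project.

```kotlin
// Hypothetical names throughout: a stand-in for the layering, not the real classes.

// data: plain model class
data class GlossaryTerm(val lemma: String, val lang: String)

// persistence: local data source (DAO), backed here by a list instead of a database
class GlossaryTermDao {
    private val rows = mutableListOf<GlossaryTerm>()
    fun insert(term: GlossaryTerm) { rows += term }
    fun findByLang(lang: String): List<GlossaryTerm> = rows.filter { it.lang == lang }
}

// core-persistence: central entry point acting as a factory for the DAOs
class AppDatabase {
    private val glossaryDao = GlossaryTermDao()
    fun glossaryTermDao(): GlossaryTermDao = glossaryDao
}

// repository: create/read operations on top of the DAO
class GlossaryTermRepository(private val dao: GlossaryTermDao) {
    fun create(term: GlossaryTerm) = dao.insert(term)
    fun getAll(lang: String): List<GlossaryTerm> = dao.findByLang(lang)
}

// usecase: one business operation, here a naive lookup of terms occurring in a message
class LookupTermsUseCase(private val repository: GlossaryTermRepository) {
    operator fun invoke(message: String, lang: String): List<GlossaryTerm> =
        repository.getAll(lang).filter { message.contains(it.lemma, ignoreCase = true) }
}

fun main() {
    val repository = GlossaryTermRepository(AppDatabase().glossaryTermDao())
    repository.create(GlossaryTerm("segment", "en"))
    repository.create(GlossaryTerm("locale", "en"))
    val lookup = LookupTermsUseCase(repository)
    println(lookup("Select a segment to edit", "en").map { it.lemma }) // prints [segment]
}
```

Each layer depends only on the one below it, so a feature module would need the use case (and possibly the model classes) but never the DAO.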

How to build

Since this is a Gradle project, running it in debug mode should be as simple as launching:

$ ./gradlew run

The Gradle version used is 7.5, so make sure you have JDK 18 available (Gradle 7.5 does not officially support running on newer Java versions).

Code documentation

You can find the class documentation, generated with Dokka from the KDoc comments, here.
