# 6. I18n improvement and SEO

The previous chapter covered some of the I18n configuration. This chapter continues from there: we improve the I18n automation script and complete the SEO configuration for I18n.

## Problems with automated translation

Let's start with the problems. The code for this section corresponds to the GitHub tag v2.2.1.

### full mode

The biggest problem is full mode. It would be more accurate to call it reset mode, because missing mode already handles incremental translation efficiently. The only reason to keep full mode is for a complete reset of the translations, which should be rare and is best avoided: at a later stage of the project, the context required is simply too large and the operation too risky.

The most practical problem, though, is that in full mode the AI outputs so much content that the result can no longer be converted into JSON with `JSON.parse`. The more content there is, the harder it is for the model to keep to a fixed output format. So I removed full mode entirely, and missing mode is now the default.

```json
"i18n:translate": "tsx scripts/18n/cli.ts --mode=missing",
"i18n:keys": "tsx scripts/18n/cli.ts --mode=keys --keys",
"i18n:list": "tsx scripts/18n/cli.ts --list-locales"
```
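The `JSON.parse` failure described above can be illustrated with a small guard around the model's reply. This is only a sketch: `parseModelJson` and `ParseResult` are hypothetical names for illustration, not part of the project's scripts, and the sample strings are made up.

```typescript
// Hypothetical helper, not taken from the project's scripts.
type ParseResult =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; error: string };

// Try to parse a model reply as a JSON object and report failure
// as a value instead of letting the exception escape.
function parseModelJson(raw: string): ParseResult {
  try {
    const value: unknown = JSON.parse(raw.trim());
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
      return { ok: false, error: "expected a JSON object" };
    }
    return { ok: true, value: value as Record<string, unknown> };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// A complete reply parses; a reply cut off mid-output does not.
const complete = parseModelJson('{"home.title": "首页"}');
const truncated = parseModelJson('{"home.title": "首页", "home.sub');

console.log(complete.ok, truncated.ok);
```

The longer the reply, the more likely it is to be cut off or drift from the requested format, which is why small incremental batches are easier to parse reliably than a full reset.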

### Token consumption in missing mode

When using an AI model for translation, every request consumes tokens. A token is the basic unit of text that the model processes; a word is usually broken down into one or more tokens.

The current translation implementation wastes tokens in three ways:

1. Duplicated prompt template: the same lengthy prompt is sent for every language.
2. One language per request: each language is translated in a separate request, so context cannot be shared.
3. Multiple API calls: one API call per language increases latency and cost.

The solution is to take the union of the keys missing across all languages and use it as the source text for a single translation request:

1. **Batch processing of translation requests**: By merging the missing keys for all languages into a single collection, we can request translations for multiple languages in one API call instead of sending separate requests for each language.

2. **Reduce duplicate content**: When multiple languages lack the same keys, we only need to send the original English text of these keys once, instead of sending them repeatedly in each language.

3. **Shared context**: The AI model can understand everything that needs to be translated in one request, which helps maintain consistency in translation, especially for related terms.

4. **Reduce API call costs**: Reducing the number of API calls not only reduces latency, but also significantly reduces token consumption, because we no longer need to repeatedly send the same prompt template for each language.
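The batching idea above can be sketched as follows. This is a minimal illustration under assumptions, not the project's actual script: the function names, the prompt wording, and the sample messages are all hypothetical.

```typescript
type Messages = Record<string, string>;

// Keys present in the source locale but absent from a target locale.
function missingKeys(source: Messages, target: Messages): string[] {
  return Object.keys(source).filter(
    (key) => !Object.prototype.hasOwnProperty.call(target, key),
  );
}

// Union of missing keys across all targets, with the locales that need each key.
function collectMissing(
  source: Messages,
  targets: Record<string, Messages>,
): Map<string, string[]> {
  const neededBy = new Map<string, string[]>();
  for (const [locale, messages] of Object.entries(targets)) {
    for (const key of missingKeys(source, messages)) {
      const locales = neededBy.get(key) ?? [];
      locales.push(locale);
      neededBy.set(key, locales);
    }
  }
  return neededBy;
}

// One prompt for all languages: each English text is sent only once.
// The prompt wording is an assumption for illustration.
function buildBatchPrompt(source: Messages, neededBy: Map<string, string[]>): string {
  const localeSet = new Set<string>();
  const texts: Messages = {};
  neededBy.forEach((locales, key) => {
    locales.forEach((locale) => localeSet.add(locale));
    texts[key] = source[key];
  });
  const locales = Array.from(localeSet).sort();
  return [
    `Translate the following English UI strings into: ${locales.join(", ")}.`,
    'Reply with one JSON object shaped like {"<locale>": {"<key>": "<translation>"}}.',
    JSON.stringify(texts, null, 2),
  ].join("\n");
}

// Made-up sample data.
const en: Messages = { a: "Hello", b: "Bye", c: "Save" };
const targets = {
  zh: { a: "你好" },
  ja: { a: "こんにちは", b: "さようなら" },
  ko: {},
};

const neededBy = collectMissing(en, targets);
const prompt = buildBatchPrompt(en, neededBy);
console.log(prompt);
```

Keeping the per-key locale list means the reply can be filtered afterwards, so a language that already has a key is not overwritten even though the key was part of the union.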

For example, let's say we have three languages (Chinese, Japanese, and Korean), and each language is missing 5 of the same keys. Using the old method, we needed to send 3 separate requests, each containing the full prompt template and the contents of the 5 keys. With the new method, we only need to send 1 request, containing the content of the prompt template and 5 keys once, and then ask the AI model to generate translations for all three languages at the same time.
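To put rough numbers on this example, the sketch below compares the input tokens of the two approaches. The token counts are invented for illustration (a 400-token prompt template and 60 tokens for the 5 English texts); real values depend on the model's tokenizer. Output tokens are left out because the model still has to produce all three translations either way.

```typescript
// Hypothetical token counts, for illustration only.
const promptTemplateTokens = 400; // the shared instructions
const sourceTextTokens = 60; // the 5 missing English texts
const languageCount = 3; // Chinese, Japanese, Korean

// Old method: every language resends the template and the source texts.
function perLanguageInputTokens(prompt: number, source: number, languages: number): number {
  return languages * (prompt + source);
}

// New method: template and source texts are sent once for all languages.
function batchedInputTokens(prompt: number, source: number): number {
  return prompt + source;
}

const oldCost = perLanguageInputTokens(promptTemplateTokens, sourceTextTokens, languageCount);
const newCost = batchedInputTokens(promptTemplateTokens, sourceTextTokens);

console.log({ oldCost, newCost, saved: oldCost - newCost });
```

Under these assumed numbers the input side drops from 1380 to 460 tokens, and the saving grows with every additional language.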

> Is there a better way? Yes: maintain a key + language comparison table, in which each key has one row holding its translations for all languages. Whenever keys are compared with en.json, the comparison table is the source of truth. Deleting a row from the table removes that key from every language, and any empty cell in a row is regenerated with AI. This is what my main project does now, with the front-end developers and local partners editing the table together. But this solution is too complex and not suitable for single-person development.
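A rough sketch of what such a key + language table could look like in code follows. It is an assumption about the shape of the idea, not the author's actual implementation: the types, function names, and sample rows are hypothetical.

```typescript
// Hypothetical shapes: one row per key, one cell per locale.
type Row = Record<string, string>; // locale -> text; "" or absent means untranslated
type Table = Record<string, Row>; // key -> row

// Cells that are empty and would be sent to the AI for regeneration.
function emptyCells(table: Table, locales: string[]): Array<[string, string]> {
  const cells: Array<[string, string]> = [];
  for (const [key, row] of Object.entries(table)) {
    for (const locale of locales) {
      if (!row[locale]) cells.push([key, locale]);
    }
  }
  return cells;
}

// Build one locale file from the table. A key whose row was deleted
// simply no longer appears in any locale's output.
function exportLocale(table: Table, locale: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, row] of Object.entries(table)) {
    const text = row[locale];
    if (text) out[key] = text;
  }
  return out;
}

// Made-up sample table.
const table: Table = {
  "home.title": { en: "Home", zh: "首页", ja: "" },
  "home.cta": { en: "Start", zh: "", ja: "始める" },
};

const toRegenerate = emptyCells(table, ["zh", "ja"]);
const zhFile = exportLocale(table, "zh");
console.log(toRegenerate, zhFile);
```

Because every locale file is derived from the table, the generated JSON files never need to be edited by hand.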

#### Optimized translation process

1. Collect missing keys for all target languages
2. Calculate the
