Common Base Project

    6. I18n improvement and SEO


    Published
    May 17, 2025
    Author
    Felix


    The previous chapter covered the basic I18n configuration. This chapter improves the I18n automation script and completes the SEO configuration for I18n.

    Problems with automated translation

    Let's start with the problems. This section is based on GitHub tag v2.2.1.

    full mode

    The biggest problem is full mode. It would be better named reset mode, because missing mode already handles incremental translation efficiently. The only reason to keep full mode is to reset every translation from scratch. That should be rare and is best avoided, because at a later stage of the project the context needed for a full retranslation is too large and the operation too risky.

    The most practical problem is that a full run makes the AI output so much content that the response can no longer be parsed reliably with JSON.parse. The more content the model has to produce, the harder it is for it to keep a fixed output format. So I removed full mode entirely and made missing mode the default.

    ```json
    "scripts": {
      "i18n:translate": "tsx scripts/18n/cli.ts --mode=missing",
      "i18n:keys": "tsx scripts/18n/cli.ts --mode=keys --keys",
      "i18n:list": "tsx scripts/18n/cli.ts --list-locales"
    }
    ```
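
The parsing failure that led to dropping full mode can be illustrated with a minimal sketch. It assumes the model is asked to reply with one flat JSON object that maps each key to its translated string. `parseTranslationResponse` is a hypothetical helper for illustration, not code from the repository.

```typescript
// Hypothetical helper: parse a model reply that should be a flat JSON object
// of key -> translated string. Returns null when the reply is not usable.
type Translations = Record<string, string>;

function parseTranslationResponse(raw: string): Translations | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // truncated or malformed output is not valid JSON
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return null; // assumption: the reply must be a plain object
  }
  const record = parsed as Record<string, unknown>;
  const result: Translations = {};
  for (const key of Object.keys(record)) {
    const value = record[key];
    if (typeof value !== "string") {
      return null; // assumption: nested or non-string values are rejected
    }
    result[key] = value;
  }
  return result;
}
```

Calling it with a reply that was cut off partway, such as `{"home.title":"首页","home.sub`, returns `null`. One broken character invalidates the whole reply, so the longer the output, the more there is to lose.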
    
    ### Token consumption in missing mode
    
    When an AI model is used for translation, every request consumes tokens. A token is the basic unit of text the model processes, and a word is usually split into one or more tokens.
    
    The current translation implementation wastes tokens in three main ways:
    
    1. Duplicate prompt templates: the same lengthy prompt is sent for every language.
    2. One language per request: each language is sent in a separate request, so context cannot be shared.
    3. Multiple API calls: one API call per language increases latency and cost.
    
    The solution is to take the union of the keys missing across all languages and use it as the source text for a single translation request:
    
    1. **Batch processing of translation requests**: By merging the missing keys for all languages into a single collection, we can request translations for multiple languages in one API call instead of sending separate requests for each language.
    
    2. **Reduce duplicate content**: When multiple languages lack the same keys, we only need to send the original English text of these keys once, instead of sending it again for each language.
    
    3. **Shared context**: The AI model sees everything that needs to be translated in one request, which helps keep translations consistent, especially for related terms.
    
    4. **Reduce API call costs**: Fewer API calls reduce latency and also significantly reduce token consumption, because we no longer send the same prompt template once per language.
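
The batching idea in the list above can be sketched as follows. This is a minimal sketch, assuming each locale file is a flat key to string map. `findMissingKeys`, `buildBatchRequest`, and the `BatchRequest` shape are hypothetical names, not the repository's actual implementation. It collects the missing keys per locale and stores the English text for their union exactly once.

```typescript
// Sketch only: names and the flat key -> string message shape are assumptions.
type Messages = Record<string, string>;

// Keys present in the source locale but absent (or empty) in the target.
function findMissingKeys(source: Messages, target: Messages): string[] {
  return Object.keys(source).filter((key) => !target[key]);
}

interface BatchRequest {
  // English source text for the union of missing keys, stored once.
  sourceTexts: Messages;
  // For each incomplete locale, the keys it needs from sourceTexts.
  missingByLocale: Record<string, string[]>;
}

function buildBatchRequest(
  source: Messages,
  targets: Record<string, Messages>
): BatchRequest {
  const sourceTexts: Messages = {};
  const missingByLocale: Record<string, string[]> = {};
  for (const locale of Object.keys(targets)) {
    const missing = findMissingKeys(source, targets[locale] ?? {});
    if (missing.length === 0) {
      continue; // this locale is already complete
    }
    missingByLocale[locale] = missing;
    for (const key of missing) {
      const text = source[key];
      if (text !== undefined) {
        sourceTexts[key] = text; // a key shared by several locales is kept once
      }
    }
  }
  return { sourceTexts, missingByLocale };
}
```

The resulting object can be serialized into a single prompt, so the template and the English source text are sent once no matter how many locales need them.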
    
    For example, suppose we have three target languages (Chinese, Japanese, and Korean), and each is missing the same 5 keys. With the old method, we send 3 separate requests, each containing the full prompt template and the text of the 5 keys. With the new method, we send 1 request that contains the prompt template and the 5 keys once, and ask the AI model to generate translations for all three languages at the same time.
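
To make the saving concrete, here is a rough estimate of input tokens under both approaches. The token counts used with it are made-up illustrative values, and `estimatePerLanguage` and `estimateBatched` are hypothetical helpers. The sketch ignores the few extra tokens needed to list the target languages in the batched prompt.

```typescript
// Rough input-side estimate. Every number passed in is an assumption.
interface TokenEstimate {
  requests: number;
  inputTokens: number;
}

// Old method: one request per language, each repeating prompt and source text.
function estimatePerLanguage(
  promptTokens: number,
  sourceTokens: number,
  languages: number
): TokenEstimate {
  return {
    requests: languages,
    inputTokens: languages * (promptTokens + sourceTokens),
  };
}

// New method: one request carrying the prompt and the source text once.
function estimateBatched(
  promptTokens: number,
  sourceTokens: number,
  languages: number
): TokenEstimate {
  if (languages <= 0) {
    return { requests: 0, inputTokens: 0 };
  }
  return { requests: 1, inputTokens: promptTokens + sourceTokens };
}
```

Assuming a 400-token prompt template and 60 tokens of source text for the 5 keys, the old method sends 3 requests and 1,380 input tokens, while the batched method sends 1 request and 460. The output tokens stay roughly the same in both cases, because all three translations still have to be generated. The saving comes from the input side and from fewer round trips.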
    
    > Is there a better way? There is: maintain a key + language comparison table in which each key maps to its translations in every language. Whenever the keys are compared with en.json, the comparison table takes precedence. Deleting a row from the table removes that key from all languages, and any empty cell in a row is regenerated with AI. This is what my main project does now, with the front-end team and local partners maintaining the table together. But this solution is too complex and not suitable for single-person development.
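
The comparison table described in the note above could look roughly like this. It is a minimal sketch under assumed names (`TranslationTable`, `findEmptyCells`, and `deleteKey` are hypothetical), not the actual structure of that project.

```typescript
// Hypothetical shape: one row per key, one cell per locale.
type TranslationRow = Record<string, string>; // locale -> text, "" means empty
type TranslationTable = Record<string, TranslationRow>; // key -> row

interface EmptyCell {
  key: string;
  locale: string;
}

// Every (key, locale) pair whose cell is missing or empty, in table order.
function findEmptyCells(table: TranslationTable, locales: string[]): EmptyCell[] {
  const cells: EmptyCell[] = [];
  for (const key of Object.keys(table)) {
    const row = table[key] ?? {};
    for (const locale of locales) {
      if (!row[locale]) {
        cells.push({ key, locale });
      }
    }
  }
  return cells;
}

// Deleting a row removes the key for every language at once.
// Returns a new table and leaves the input untouched.
function deleteKey(table: TranslationTable, keyToDelete: string): TranslationTable {
  const next: TranslationTable = {};
  for (const key of Object.keys(table)) {
    const row = table[key];
    if (key !== keyToDelete && row !== undefined) {
      next[key] = row;
    }
  }
  return next;
}
```

Each cell returned by `findEmptyCells` is a translation that still has to be generated, and `deleteKey` drops a key for every language in one step.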
    
    #### Optimized translation process
    
    1. Collect missing keys for all target languages
    2. Calculate the
    