Internationalization support for AI Search

Release version: Vancouver
Updated Jul 29, 2024

The Vancouver release is no longer supported. As such, the product documentation and release notes are provided for informational purposes only, and will not be updated.

AI Search supports indexing and search in all languages offered by the Now Platform. Search linguistic features are supported in Brazilian Portuguese, Dutch, English, French, French - Canada, German, Italian, Japanese, Korean, Portuguese, Simplified Chinese, Spanish, Swedish, and Traditional Chinese.

Internationalization support is automatically enabled and isn't configurable.

To view the full list of languages offered as Now Platform plugins and supported in AI Search, see Activate a language.

Note: After you activate a new language plugin, you must reindex all indexed source content that you want to make searchable in the new language. For details on reindexing, see Perform a full table index or reindex for a single indexed source.

Language settings determine how AI Search separates the text of indexed content and search queries into individual terms. This process, called tokenization, is handled differently for each supported language, using language-specific settings. For example, most languages use spaces and punctuation to separate words and sentences, but when tokenizing Chinese or Japanese text, AI Search instead uses contextual interpretation to correctly identify word and sentence breaks. When tokenizing Japanese text, AI Search additionally recognizes the nakaguro (middle dot) as a word separator.
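
As a rough illustration of these tokenization differences, the following Python sketch contrasts a simple whitespace-and-punctuation tokenizer with a splitter that treats the nakaguro as a word separator. The function names are hypothetical and this is not AI Search's implementation. Contextual segmentation of Chinese and Japanese text is considerably more involved and isn't shown.

```python
import re

# Hypothetical, simplified tokenizers for illustration only.
NAKAGURO = "\u30fb"  # KATAKANA MIDDLE DOT, the nakaguro


def tokenize_space_delimited(text: str) -> list[str]:
    """Split text into terms at whitespace and punctuation."""
    # \w+ matches runs of word characters, so spaces and punctuation
    # act as separators.
    return re.findall(r"\w+", text)


def split_on_nakaguro(text: str) -> list[str]:
    """Split Japanese text at each nakaguro, dropping empty pieces."""
    return [part for part in text.split(NAKAGURO) if part]
```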

Note: If you indexed content in Brazilian Portuguese, Dutch, Italian, Japanese, Korean, Portuguese, or Swedish prior to August 2024, you should reindex it to benefit from new tokenization improvements for these languages.

Indexing behavior in supported languages

When indexing content and metadata from a Now Platform source record or an external document, AI Search uses tokenization settings for the language of the record or document, as shown in the following table.


Record or Document	Tokenization Settings
Source record from the Task [task] table or one of its child tables	AI Search performs language identification and uses tokenization settings for the detected language to index the record's content and metadata. Note: Language identification recognizes only Brazilian Portuguese, Dutch, English, French, French - Canada, German, Italian, Japanese, Korean, Portuguese, Simplified Chinese, Spanish, Swedish, and Traditional Chinese. Content in any other language is identified as English and treated accordingly.
Source record from a non-Task table	AI Search uses tokenization settings for the record's language to index its content and metadata. If the record has no language specified, the Now Platform treats it as being in the instance's default language. In an English instance, for example, AI Search indexes records without specified languages using tokenization settings for English.
External document	AI Search performs language identification and uses tokenization settings for the detected language to index the document's content and metadata. Note: Language identification recognizes only Brazilian Portuguese, Dutch, English, French, French - Canada, German, Italian, Japanese, Korean, Portuguese, Simplified Chinese, Spanish, Swedish, and Traditional Chinese. Content in any other language is identified as English and treated accordingly.
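
The selection rules in this table can be summarized in a short Python sketch. The function name, the `source_kind` values, and the parameters are hypothetical labels chosen for illustration. They are not Now Platform APIs, and language detection itself is assumed to happen elsewhere.

```python
from typing import Optional

# Languages that language identification recognizes, per the table above.
IDENTIFIABLE_LANGUAGES = {
    "Brazilian Portuguese", "Dutch", "English", "French", "French - Canada",
    "German", "Italian", "Japanese", "Korean", "Portuguese",
    "Simplified Chinese", "Spanish", "Swedish", "Traditional Chinese",
}


def tokenization_language(
    source_kind: str,
    detected: Optional[str] = None,
    record_language: Optional[str] = None,
    instance_default: str = "English",
) -> str:
    """Return the language whose tokenization settings would apply."""
    if source_kind in ("task_record", "external_document"):
        # Detected language is used; anything unrecognized is treated as English.
        return detected if detected in IDENTIFIABLE_LANGUAGES else "English"
    if source_kind == "non_task_record":
        # The record's own language, else the instance's default language.
        return record_language or instance_default
    raise ValueError(f"unknown source kind: {source_kind}")
```

For instance, under these assumptions a non-Task record with no language in a German instance resolves to German, while an external document detected as an unrecognized language resolves to English.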

Note: When indexing content and metadata, AI Search recognizes regions of Japanese, Simplified Chinese, and Traditional Chinese text embedded within text in other languages. These text regions are indexed with the appropriate language tokenization settings regardless of the surrounding text's language. As an example, suppose you index an English-language Knowledge article that includes a paragraph of Simplified Chinese. AI Search indexes this paragraph's content as Simplified Chinese and the rest of the record's content as English.
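
One simplified way to picture this region handling is to split text into runs of CJK and non-CJK characters, as in the Python sketch below. This is an assumption-laden illustration, not AI Search's method: the helper names are hypothetical, the Unicode ranges are approximate, and character ranges alone can't distinguish Simplified Chinese from Traditional Chinese or from Japanese kanji.

```python
from itertools import groupby


def is_cjk(ch: str) -> bool:
    """Rough check for Han ideographs, hiragana, and katakana."""
    cp = ord(ch)
    return (
        0x3040 <= cp <= 0x30FF      # hiragana and katakana
        or 0x3400 <= cp <= 0x4DBF   # CJK Unified Ideographs Extension A
        or 0x4E00 <= cp <= 0x9FFF   # CJK Unified Ideographs
    )


def split_regions(text: str) -> list[tuple[str, str]]:
    """Group consecutive characters into ("cjk" | "other", text) regions."""
    return [
        ("cjk" if cjk else "other", "".join(chars))
        for cjk, chars in groupby(text, key=is_cjk)
    ]
```

Each region could then be handed to the tokenization settings appropriate for it, with the surrounding text processed in the record's own language.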

Search query behavior in supported languages

When processing search query text, AI Search uses tokenization settings for the language of the current user's Now Platform session.

Note: AI Search recognizes Japanese, Simplified Chinese, and Traditional Chinese terms in search queries. These terms are processed with the appropriate language tokenization settings regardless of the user session's language. As an example, if a user in a French user session searches for remplacement ordinateur 笔记本电脑, AI Search applies Simplified Chinese settings for the 笔记本电脑 term and French settings for the other search terms.

AI Search compares your search query terms with terms from indexed content and metadata, returning search results for indexed records or documents that contain matches. When your search terms are in the same language as the indexed terms, AI Search processes both sets of terms with the same tokenization settings, producing predictable matches and search results. If your search terms aren't in the same language as the indexed terms, AI Search processes the two sets of terms with different tokenization settings and matching may be unpredictable.
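
The following toy Python sketch suggests why mismatched tokenization settings can lead to missed matches. It assumes a hypothetical German tokenizer that decompounds terms using a two-word dictionary and a hypothetical English tokenizer that doesn't. Real AI Search behavior, including whether the original compound is also kept in the index, isn't described here and may differ.

```python
# Hypothetical mini-dictionary of German compound parts, for illustration only.
GERMAN_PARTS = {"laptop", "tasche"}


def split_compound(word: str) -> list[str]:
    """Split a word into two known parts if possible, else return it whole."""
    for i in range(1, len(word)):
        if word[:i] in GERMAN_PARTS and word[i:] in GERMAN_PARTS:
            return [word[:i], word[i:]]
    return [word]


def german_terms(text: str) -> set[str]:
    """Toy German settings: lowercase, split on spaces, decompound."""
    terms: set[str] = set()
    for word in text.lower().split():
        terms.update(split_compound(word))
    return terms


def english_terms(text: str) -> set[str]:
    """Toy English settings: lowercase and split on spaces only."""
    return set(text.lower().split())


def has_match(indexed: set[str], query: set[str]) -> bool:
    """A record matches when it shares at least one term with the query."""
    return bool(indexed & query)
```

In this sketch, content indexed with the German settings matches the query `Laptoptasche` when the query is also processed with German settings, but not when the same query is processed with the English settings.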

Language dependence for search features

The following search features are language-dependent and supported only for the listed languages.

Table 1. Search feature language dependence
Feature	Language dependence and supported languages
Genius Results	AI Search only evaluates Genius Result configurations with NLU triggers if the linked NLU model has the same language as the search query. Supported languages: English.
Language identification and tokenization	During indexing, AI Search identifies supported languages in Task table records and external documents. Text processing for the indexed content uses tokenization settings for the identified language. Supported languages: Brazilian Portuguese, Dutch, English, French, French - Canada, German, Italian, Japanese, Korean, Portuguese, Simplified Chinese, Spanish, Swedish, and Traditional Chinese.
Lemma and Unicode normalization	AI Search performs language-specific lemma normalization for terms in indexed content and search queries. Supported languages: Brazilian Portuguese, Dutch, English, French, French - Canada, German, Italian, Japanese, Korean, Portuguese, Simplified Chinese, Spanish, Swedish, and Traditional Chinese. Note: For German, Korean, and Swedish, AI Search performs term decompounding in addition to lemma normalization. AI Search performs Unicode normalization for all terms in indexed content and search queries. For more information on normalization of lemmas and Unicode forms in indexed content and search queries, see Lemma and Unicode normalization.
Result improvement rules	AI Search only evaluates activation for result improvement rules that have the same language as the search query or that have All Languages specified. Supported languages: All languages activated in your instance. For the list of languages you can activate, see Activate a language.
Stop words	AI Search only considers stop words from dictionaries that have the same language as the search query. Supported languages: All languages activated in your instance. For the list of languages you can activate, see Activate a language.
Synonyms	AI Search only considers synonyms from dictionaries that have the same language as the search query. Supported languages: All languages activated in your instance. For the list of languages you can activate, see Activate a language.
Typo handling	AI Search derives a separate list of auto-correction terms for each supported language found in search source indexed content. Auto-correction only replaces search query terms with terms from the list that has the same language as the search query. Supported languages: Brazilian Portuguese, Dutch, English, French, French - Canada, German, Italian, Portuguese, Spanish, and Swedish. Typo handling isn't supported for Japanese, Korean, Simplified Chinese, or Traditional Chinese.
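
The per-language behavior of typo handling can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `autocorrect` function and the `terms_by_language` structure are hypothetical, and the standard library's `difflib.get_close_matches` stands in for whatever matching AI Search actually uses.

```python
from difflib import get_close_matches

# Languages for which typo handling is supported, per the table above.
TYPO_HANDLING_LANGUAGES = {
    "Brazilian Portuguese", "Dutch", "English", "French", "French - Canada",
    "German", "Italian", "Portuguese", "Spanish", "Swedish",
}


def autocorrect(
    term: str,
    query_language: str,
    terms_by_language: dict[str, list[str]],
) -> str:
    """Replace a query term using only the list for the query's language."""
    if query_language not in TYPO_HANDLING_LANGUAGES:
        return term  # no typo handling for this language
    candidates = terms_by_language.get(query_language, [])
    # Stand-in similarity search; the cutoff value is an arbitrary assumption.
    close = get_close_matches(term, candidates, n=1, cutoff=0.8)
    return close[0] if close else term
```

Because each lookup is restricted to the list for the query's language, a misspelled English term is never replaced with a term derived from French content in this sketch.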
