Create a word corpus
- Updated Nov 25, 2024
- 4 minutes to read
- Washington DC
- AI Experiences
Build a collection of words and phrases that functions as the vocabulary the system uses to compare your instance records based on their textual similarity. You can think of the word corpus as a dictionary that you want your machine-learning system to understand.
Before you begin
After you upgrade, your existing solutions that use a word corpus become Workflow solutions the next time they are retrained. The Word Corpus field is also removed from the form.
The following information is provided for legacy context.
About this task
The primary purpose of a word corpus is to infer textual data for training your NLU model. If you use a word corpus in a solution, you must specify it for training during the solution definition phase. A trained word corpus can be reused across solutions and capabilities.
You can use a word corpus to help compare similar record text in a table or across multiple tables. A word corpus can also be helpful in other scenarios, such as clustering, where you group similar records together for data analysis, reuse, or review. The items you add to your corpus should be specific to your company and your industry so you can reuse it in other similarity or clustering solutions and apply it to various use cases.
In this example procedure, you're working on incident records and you want to locate relevant knowledge base (KB) articles that could provide resolutions to those incident cases. Your goal here is to create a word corpus that you can apply to a new similarity solution that compares active incidents to published KB articles.
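To make the idea concrete, the following is a minimal sketch of how a vocabulary built from a corpus could be used to rank KB articles against an incident. It is not the platform's implementation: the function names, the sample article text, and the bag-of-words cosine measure are all assumptions chosen for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(corpus_texts):
    """Collect every distinct token found in the corpus texts."""
    vocabulary = set()
    for text in corpus_texts:
        vocabulary.update(tokenize(text))
    return vocabulary


def vectorize(text, vocabulary):
    """Count only the tokens that belong to the vocabulary."""
    return Counter(t for t in tokenize(text) if t in vocabulary)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_articles(incident_text, articles, vocabulary):
    """Return (title, score) pairs sorted from most to least similar."""
    incident_vec = vectorize(incident_text, vocabulary)
    scored = [
        (title, cosine_similarity(incident_vec, vectorize(body, vocabulary)))
        for title, body in articles.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical KB articles that stand in for the word corpus source.
articles = {
    "Reset VPN password": "Steps to reset your VPN password when login fails",
    "Printer offline": "Fix a printer that shows offline status on the network",
}
vocabulary = build_vocabulary(articles.values())

# Hypothetical incident short description.
ranking = rank_articles(
    "User cannot login to VPN after password change", articles, vocabulary
)
for title, score in ranking:
    print(f"{title}: {score:.2f}")
```

In this sketch, words that are absent from the vocabulary are ignored when records are compared, which illustrates why the corpus benefits from terms specific to your company and industry.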
Procedure
Result
What to do next
Related Content
- Create and train a classification solution
Specify the records used to train a classification solution, what fields trigger a prediction, and how often you want to retrain your solution.
- Create and train a similarity solution
Create and train a machine-learning solution to collect and compare your existing records to new similar records. For example, you can compare the text in an open Incident record to a resolved Incident record to reuse its resolution.
- Create and train a clustering solution
Group similar records into clusters so you can address them collectively or identify patterns.