ClusteringSolution - Global

Release version: Yokohama
Updated Jan 30, 2025


The ClusteringSolution API is a scriptable object used in Predictive Intelligence stores.

This API requires the Predictive Intelligence plugin (com.glide.platform_ml) and is provided within the sn_ml namespace.

The solution setup-to-training flow is as follows:

1. Create a dataset using the DatasetDefinition API.
2. Build an encoder using the Encoder API. This step is mandatory if you use the K-means clustering algorithm.
3. Use the constructor to create a clustering solution object.
4. Add the solution object to the clustering solution store using the ClusteringSolutionStore - add() method.
5. Train the solution using the submitTrainingJob() method. This creates a version of the object that you can manage using the ClusteringSolutionVersion API.
6. Get predictions using the ClusteringSolutionVersion - predict() method.

Note: This API runs with full privileges before the Vancouver Patch 7 Hotfix 2b and Washington DC Patch 7 releases. With later releases, grant access using ACLs. For more information, see Query ACLs.

For usage guidelines, refer to Using ML APIs.

ClusteringSolution - ClusteringSolution(Object config)

Creates a clustering solution.


Table 1. Parameters
Name	Type	Description
config	Object	JavaScript object containing configuration properties of the solution. `{ "algorithmConfig": {Object}, "clusterConcept": "String", "clusterConceptFieldNames": [Array], "dataset": {Object}, "domainName": "String", "encoder": {Object}, "groupByFieldName": "String", "groupUnclusteredRecords": Boolean, "inputFieldNames": [Array], "label": "String", "maxTimeWindowForUpdate" : Number, "minRecordsPerCluster" : Number, "minRowCount": "String", "processingLanguage": "String", "stopwords": [Array], "trainingFrequency": "String", "updateFrequency": "String" }`
config.algorithmConfig	Object	Required unless setting the encoder property. JavaScript object containing algorithm configuration properties. Property settings vary by the value set in the algorithm property. `'algorithmConfig': { "algorithm": "String", // See algorithmConfig.algorithm setting description for property settings based on algorithm }`
config.algorithmConfig.algorithm	String	Method for encoding your solution. Valid values: dbscan: Density-Based Spatial Clustering of Applications with Noise (DBSCAN) clustering algorithm; uses the distanceMetric, epsilon, and minimumNeighbours properties. hdbscan: Hierarchical Density Based Spatial Clustering of Applications with Noise (HDBSCAN) clustering algorithm; uses the minimumSamples property. kmeans: K-means clustering algorithm (default); uses the targetCoverage property. Some users prefer DBSCAN because it doesn't require you to specify the number of clusters in the data before clustering. Properties for dbscan: `'algorithmConfig': { "algorithm": "dbscan", "distanceMetric": "String", "epsilon": Number, "minimumNeighbours": Number }` Properties for hdbscan: `'algorithmConfig': { "algorithm": "hdbscan", "minimumSamples": Number }` Properties for kmeans: `'algorithmConfig': { "algorithm": "kmeans", "targetCoverage": Number }`
config.algorithmConfig.distanceMetric	String	DBSCAN algorithm only. Distance metric to scan for similar data objects. Valid values: levenshteinDistance
config.algorithmConfig.epsilon	Number	DBSCAN algorithm only. Decimal value between 0 and 1 representing the size of the neighborhood search radius.
config.algorithmConfig.minimumNeighbours	Number	DBSCAN algorithm only. Minimum number of neighbors required in a point to be a part of a cluster. For levenshteinDistance the value must be 1 so that no points are excluded from the dataset.
config.algorithmConfig.minimumSamples	Number	HDBSCAN algorithm only. Minimum number of data samples in a neighborhood required to determine if a point is a core point. Default: None
config.algorithmConfig.targetCoverage	Number	K-means algorithm only. Percentile field to filter out records that are less similar to each other.
config.clusterConcept	String	Optional. Concept type. A concept is a set of words listed in descending order of frequency. To generate a TFIDF-based cluster concept, set the value to `tfidf`. Concept types are listed in the Clustering Definitions [ml_capability_definition_clustering] table. Default: Frequency-based cluster concept
config.clusterConceptFieldNames	Array	Optional. List of cluster concept field names. These fields are external columns that are used only to create the cluster concept and are not used for clustering solution training. Cluster concept fields are listed in the Clustering Definitions [ml_capability_definition_clustering] table. Default: Input text columns generate the cluster concept
config.dataset	Object	DatasetDefinition object name.
config.domainName	String	Optional. Domain name associated with this dataset. See Domain separation and Predictive Intelligence. Default: Current domain, for example, `"global"`.
config.encoder	Object	Required unless the algorithmConfig.distanceMetric property is set to `"levenshteinDistance"`. Trained encoder object to assign to this solution. See Encoder - Encoder(Object config).
config.groupByFieldName	String	Optional. Field name by which the system groups records into one or more clusters. In the following setup example, the system groups each category type into an individual cluster, rendering 10 clusters: the groupByFieldName value is `'category'`, the DatasetDefinition tableName value is `'incident'`, and the Incident [incident] table has 10 category types.
config.groupUnclusteredRecords	Boolean	Flag that indicates whether to group unclustered records in results. Valid values: true: Group unclustered records separately in results. false: Do not group unclustered records in results. Unclustered values (-1) display with the rest of the results. Default: false
config.inputFieldNames	Array	List of input field names as strings. The model uses these fields to make predictions.
config.label	String	Identifies the prediction task.
config.maxTimeWindowForUpdate	Number	Optional. Number of minutes preceding the model update point to look for records. For example, if the value is 15, the system only looks for records created in the preceding 15 minutes. By default, the system scans all records.
config.minRecordsPerCluster	Number	Optional. Minimum number of records to allow in any cluster. The value must be greater than or equal to 2. Default: 2
config.minRowCount	String	Optional. Minimum number of records required in the dataset for training. Default: 10000
config.processingLanguage	String	Processing language in two-letter ISO 639-1 language code format.
config.stopwords	Array	Optional. Preset list of strings that the system automatically generates based on the language property setting. For details, see Create a custom stopwords list. Default: English Stopwords
config.trainingFrequency	String	The frequency to retrain the model. Possible values: every_30_days, every_60_days, every_90_days, every_120_days, every_180_days, run_once. Default: run_once
config.updateFrequency	String	The frequency at which the model for the solution definition must be rebuilt. Possible values: do_not_update, every_1_day, every_1_hour, every_6_hours, every_12_hours, every_1_minute, every_15_minutes, every_30_minutes. Default: do_not_update
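
The constraints in Table 1 can be checked before a config object is passed to the constructor. The following is a minimal sketch in plain JavaScript, not part of the sn_ml namespace: checkClusteringConfig() is a hypothetical helper name, the 0 to 1 range for epsilon is assumed to be inclusive, and only the rules stated in this topic are checked.

```javascript
// Hypothetical helper, not provided by sn_ml: checks a plain config object
// against constraints stated in Table 1 and returns a list of problems.
function checkClusteringConfig(config) {
    var problems = [];
    var algoConfig = config.algorithmConfig || {};
    // kmeans is the documented default algorithm.
    var algorithm = algoConfig.algorithm || 'kmeans';

    if (['dbscan', 'hdbscan', 'kmeans'].indexOf(algorithm) === -1) {
        problems.push('Unknown algorithm: ' + algorithm);
    }

    if (algorithm === 'dbscan') {
        var eps = algoConfig.epsilon;
        // Assumption: the documented 0-to-1 range is inclusive.
        if (eps !== undefined && (typeof eps !== 'number' || eps < 0 || eps > 1)) {
            problems.push('epsilon must be a number between 0 and 1');
        }
        if (algoConfig.distanceMetric === 'levenshteinDistance' &&
                algoConfig.minimumNeighbours !== 1) {
            problems.push('minimumNeighbours must be 1 for levenshteinDistance');
        }
    }

    // The setup-to-training flow makes an encoder mandatory for K-means.
    if (algorithm === 'kmeans' && !config.encoder) {
        problems.push('kmeans requires a trained encoder');
    }

    if (config.minRecordsPerCluster !== undefined && config.minRecordsPerCluster < 2) {
        problems.push('minRecordsPerCluster must be greater than or equal to 2');
    }

    return problems;
}

// Illustrative DBSCAN configuration using the Levenshtein distance metric.
var dbscanConfig = {
    label: 'dbscan clustering',
    inputFieldNames: ['short_description'],
    algorithmConfig: {
        algorithm: 'dbscan',
        distanceMetric: 'levenshteinDistance',
        epsilon: 0.3,
        minimumNeighbours: 1
    }
};

var dbscanProblems = checkClusteringConfig(dbscanConfig);
```

An empty array means that none of the checked rules were violated. It does not guarantee that the platform accepts the configuration.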

Example

The following example shows how to create an object and add it to the ClusteringSolution store. The example also shows how to submit the object for training.

try{
    var myData = new sn_ml.DatasetDefinition({
        'tableName' : 'incident',
        'fieldNames' : ['category', 'short_description', 'state', 'description'],
        'encodedQuery' : 'activeANYTHING'
    });

    // get a trained encoder from the store
    var myEncoder = sn_ml.EncoderStore.get('<encoder_name>');
        
    var mySolution = new sn_ml.ClusteringSolution({
        'label': "clustering solution",
        'dataset' : myData,
        'encoder' : myEncoder,
        'inputFieldNames':['short_description'],                
        'groupByFieldName' : 'category',        
        'algorithmConfig' : {
            'algorithm' : 'kmeans',
            'targetCoverage' : '90'
        }
    });
    
    // add solution
    var solutionName = sn_ml.ClusteringSolutionStore.add(mySolution);
    var solutionVersion = mySolution.submitTrainingJob();    
    var trainingStatus = solutionVersion.getStatus();
    gs.print(JSON.stringify(JSON.parse(trainingStatus), null, 2));

} catch(ex){
    gs.print('Exception caught: '+ ex.getMessage());
}

Output:

{
  "state": "waiting_for_training",
  "percentComplete": "0",
  "hasJobEnded": "false"
}
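
Note that the status output above reports percentComplete and hasJobEnded as strings. Because the string "false" is truthy in JavaScript, testing hasJobEnded directly in a condition would give the wrong result. The following sketch shows one way to convert the status string to native types. parseTrainingStatus() is a hypothetical helper name, and the sketch assumes that the status fields are always shaped like the sample output.

```javascript
// Hypothetical helper, not provided by sn_ml: converts the JSON string
// returned by getStatus() into an object with native types.
function parseTrainingStatus(statusJson) {
    var raw = JSON.parse(statusJson);
    return {
        state: raw.state,
        percentComplete: Number(raw.percentComplete),
        // Compare against the literal string; Boolean("false") would be true.
        hasJobEnded: String(raw.hasJobEnded) === 'true'
    };
}

// Sample status string with the same fields as the output above.
var sampleStatus = '{"state":"waiting_for_training","percentComplete":"0","hasJobEnded":"false"}';
var parsedStatus = parseTrainingStatus(sampleStatus);
```

On an instance, the argument would be the value returned by solutionVersion.getStatus().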

Example

The following example shows how to include the 'description' field as a cluster concept field.

var myIncidentData = new sn_ml.DatasetDefinition({
    'tableName' : 'incident',
    'fieldNames' : ['category', 'short_description', 'description'],
});

var encodersolutionName = sn_ml.EncoderStore.get('<encoder_name>');

var mySolution = new sn_ml.ClusteringSolution({
	'label': 'clustering_test',
	'dataset': myIncidentData,
	'inputFieldNames': ['short_description'],
	'encoder': encodersolutionName,
	'clusterConceptFieldNames': ['description']
});

var solutionNameFromStore = sn_ml.ClusteringSolutionStore.add(mySolution);
var myClusterVersion = mySolution.submitTrainingJob();

ClusteringSolution - cancelTrainingJob()

Cancels a job for a solution object that has been submitted for training.

Table 2. Parameters
Name	Type	Description
None

Table 3. Returns
Type	Description
None

Example

The following example shows how to cancel an existing training job.

var mySolution = sn_ml.ClusteringSolutionStore.get('ml_sn_global_global_clustering');

mySolution.cancelTrainingJob();

ClusteringSolution - getActiveVersion()

Gets the active ClusteringSolutionVersion object.

Table 4. Parameters
Name	Type	Description
None

Table 5. Returns
Type	Description
Object	Active ClusteringSolutionVersion object.

Example

The following example shows how to get an active ClusteringSolution version from the store and return its training status.

var mlSolution = sn_ml.ClusteringSolutionStore.get('ml_x_snc_global_global_clustering');

gs.print(JSON.stringify(JSON.parse(mlSolution.getActiveVersion().getStatus()), null, 2));

Output:

{
  "state": "solution_complete",
  "percentComplete": "100",
  "hasJobEnded": "true"
}

ClusteringSolution - getAllVersions()

Gets all versions of a clustering solution.

Table 6. Parameters
Name	Type	Description
None

Table 7. Returns
Type	Description
Array	Existing versions of a solution object. See also ClusteringSolutionVersion API.

Example

The following example shows how to get all ClusteringSolution version objects and call the getVersionNumber() and getStatus() solution version methods on them.

var mlSolution = sn_ml.ClusteringSolutionStore.get('ml_x_snc_global_global_clustering');

var mlSolutionVersions = mlSolution.getAllVersions();

for (var i = 0; i < mlSolutionVersions.length; i++) {
    gs.print("Version " + mlSolutionVersions[i].getVersionNumber() + " Status: " + mlSolutionVersions[i].getStatus() + "\n");
}

Output:

Version 3 Status: {"state":"solution_complete","percentComplete":"100","hasJobEnded":"true"}

Version 2 Status: {"state":"solution_complete","percentComplete":"100","hasJobEnded":"true"}

Version 1 Status: {"state":"solution_cancelled","percentComplete":"0","hasJobEnded":"true"}
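
A list of versions like the one above can be filtered by training state, for example to find the newest version that finished training. The following is a minimal sketch. findLatestCompleteVersion() and makeStubVersion() are hypothetical names, and the stub objects only mimic the getVersionNumber() and getStatus() methods so that the sketch runs outside an instance. The return type of getVersionNumber() is not stated here, so the sketch converts it with Number().

```javascript
// Hypothetical helper: returns the highest version number whose training
// state is "solution_complete", or null if no version completed.
function findLatestCompleteVersion(versions) {
    var best = null;
    for (var i = 0; i < versions.length; i++) {
        var status = JSON.parse(versions[i].getStatus());
        var number = Number(versions[i].getVersionNumber());
        if (status.state === 'solution_complete' && (best === null || number > best)) {
            best = number;
        }
    }
    return best;
}

// Stand-in for a ClusteringSolutionVersion object (assumption for illustration).
function makeStubVersion(number, state) {
    return {
        getVersionNumber: function () { return String(number); },
        getStatus: function () {
            return JSON.stringify({ state: state, hasJobEnded: 'true' });
        }
    };
}

var stubVersions = [
    makeStubVersion(3, 'solution_complete'),
    makeStubVersion(2, 'solution_complete'),
    makeStubVersion(1, 'solution_cancelled')
];

var latestComplete = findLatestCompleteVersion(stubVersions);
```

On an instance, the array returned by getAllVersions() would take the place of stubVersions.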

ClusteringSolution - getLatestVersion()

Gets the latest version of a solution.

Table 8. Parameters
Name	Type	Description
None

Table 9. Returns
Type	Description
Object	ClusteringSolutionVersion object corresponding to the latest version of a ClusteringSolution().

Example

The following example shows how to get the latest version of a solution and return its training status.

var mlSolution = sn_ml.ClusteringSolutionStore.get('ml_x_snc_global_global_clustering');

gs.print(JSON.stringify(JSON.parse(mlSolution.getLatestVersion().getStatus()), null, 2));

Output:

{
  "state": "solution_complete",
  "percentComplete": "100",
  "hasJobEnded": "true"
}

ClusteringSolution - getName()

Gets the name of the object to use for interaction with the store.

Table 10. Parameters
Name	Type	Description
None

Table 11. Returns
Type	Description
String	Name of the solution object.

Example

The following example shows how to update ClusteringSolution dataset information and print the name of the object.

// Update solution
var myIncidentData = new sn_ml.DatasetDefinition({
   'tableName' : 'incident',
   'fieldNames' : ['category', 'short_description', 'priority'],
   'encodedQuery' : 'activeANYTHING'
});

var eligibleFields = JSON.parse(myIncidentData.getEligibleFields('clustering'));

var myCluster = new sn_ml.ClusteringSolution({
   'label': "my clustering solution",
   'dataset' : myIncidentData,
   'inputFieldNames': eligibleFields['eligibleInputFieldNames'],
   'predictedFieldName': 'category'
});

// update solution
sn_ml.ClusteringSolutionStore.update('ml_x_snc_global_global_clustering_solution', myCluster);

// print solution name
gs.print('Solution Name: '+myCluster.getName());

Output:

Solution Name: ml_x_snc_global_global_clustering_solution

ClusteringSolution - getProperties()

Gets solution object properties.

Table 12. Parameters
Name	Type	Description
None


Table 13. Returns
Type	Description
Object	Contents of the Dataset and ClusteringSolution() object details in the ClusteringSolutionStore. `{ "algorithmConfig": {Object}, "datasetProperties": {Object}, "domainName": "String", "encoder": {Object}, "groupByFieldName": "String", "inputFieldNames": [Array], "label": "String", "minRecordsPerCluster" : Number, "name": "String", "processingLanguage": "String", "scope": "String", "stopwords": [Array], "trainingFrequency": "String", "updateFrequency": "String" }`
<Object>.algorithmConfig	JavaScript object containing algorithm configuration properties. Property results vary by the value set in the algorithm property. `'algorithmConfig' : { "algorithm": "String", // See algorithmConfig.algorithm setting description for property settings based on algorithm }` Data type: Object.
<Object>.algorithmConfig.algorithm	Method for encoding your solution. Properties for dbscan: `'algorithmConfig': { "algorithm": "dbscan", "distanceMetric": "String", "epsilon": Number, "minimumNeighbours": Number }` Properties for kmeans: `'algorithmConfig': { "algorithm": "kmeans", "targetCoverage": Number }` Data type: String.
<Object>.algorithmConfig.distanceMetric	DBSCAN algorithm only. Distance metric to scan for similar data objects. Data type: String.
<Object>.algorithmConfig.epsilon	DBSCAN algorithm only. Decimal value between 0 and 1 representing the size of the neighborhood search radius. Data type: Number.
<Object>.algorithmConfig.minimumNeighbours	DBSCAN algorithm only. Minimum number of neighbors required in a point to be a part of a cluster. For levenshteinDistance the value must be 1 so that no points are excluded from the dataset. Data type: Number.
<Object>.algorithmConfig.targetCoverage	K-means algorithm only. Percentile field to filter out records that are less similar to each other. Data type: Number.
<Object>.datasetProperties	Lists the properties of the DatasetDefinition() object associated with the solution. `{ "encodedQuery": "String", "fieldDetails": [Array], "fieldNames": [Array], "tableName": "String" }` Data type: Object.
<Object>.datasetProperties.tableName	Name of the table for the dataset. For example, `"tableName" : "incident"`. Data type: String.
<Object>.datasetProperties.fieldNames	List of field names from the specified table as strings. For example, `"fieldNames" : ["short_description", "priority"]`. Data type: Array.
<Object>.datasetProperties.fieldDetails	List of JavaScript objects that specify field properties. `[ { "name": "String", "type": "String" } ]` Data type: Array.
<Object>.datasetProperties.fieldDetails.<object>.name	Name of the field defining the type of information to restrict this dataset to. Data type: String.
<Object>.datasetProperties.fieldDetails.<object>.type	Machine-learning field type. Data type: String.
<Object>.datasetProperties.encodedQuery	Encoded query string in standard Glide format. See Encoded query strings. Data type: String.
<Object>.domainName	Domain name associated with this dataset. See Domain separation and Predictive Intelligence. Data type: String.
<Object>.encoderProperties	Encoder object assigned to this solution. See Encoder - Encoder(Object config). Data type: Object.
<Object>.groupByFieldName	Field name by which the system groups records into one or more clusters. Data type: String.
<Object>.inputFieldNames	List of input field names as strings. The model uses these fields to make predictions. Data type: Array.
<Object>.label	Identifies the prediction task. `{ "label": "my first prediction" }` Data type: String.
<Object>.minRecordsPerCluster	Minimum number of records to allow in any cluster. Data type: Number.
<Object>.name	System-assigned name. Data type: String.
<Object>.predictedFieldName	Identifies a field to be trained for predictability. Data type: String.
<Object>.processingLanguage	Processing language in two-letter ISO 639-1 language code format. Data type: String.
<Object>.scope	Object scope. Currently the only valid value is `global`. Data type: String.
<Object>.stopwords	Optional. Preset list of strings that the system automatically generates based on the language property setting. For details, see Create a custom stopwords list. Data type: Array.
<Object>.trainingFrequency	The frequency to retrain the model. Possible values: every_30_days, every_60_days, every_90_days, every_120_days, every_180_days, run_once. Default: run_once. Data type: String.
<Object>.updateFrequency	The frequency at which the model for the solution definition must be rebuilt. Possible values: do_not_update, every_1_day, every_1_hour, every_6_hours, every_12_hours, every_1_minute, every_15_minutes, every_30_minutes. Default: do_not_update. Data type: String.

Example

The following example gets properties of a solution object in the store.

var myCluster = sn_ml.ClusteringSolutionStore.get("ml_x_snc_global_global_clustering_solution");

gs.print(JSON.stringify(JSON.parse(myCluster.getProperties()), null, 2));

Output:

{
  "algorithmConfig": {
    "algorithm": "kmeans",
    "targetCoverage": "90"
  },
  "datasetProperties": {
    "tableName": "incident",
    "fieldNames": [
      "category",
      "short_description",
      "state",
      "description"
    ],
    "encodedQuery": "activeANYTHING"
  },
  "domainName": "global",
  "encoderProperties": {
    "datasetsProperties": [
      {
        "tableName": "incident",
        "fieldNames": [
          "assignment_group",
          "short_description",
          "description"
        ],
        "encodedQuery": "activeANYTHING"
      }
    ],
    "domainName": "global",
    "label": "my encoder definition",
    "name": "ml_x_snc_global_global_my_encoder_definition",
    "processingLanguage": "en",
    "scope": "global",
    "stopwords": [
      "Default English Stopwords"
    ],
    "trainingFrequency": "run_once"
  },
  "groupByFieldName": "category",
  "inputFieldNames": [
    "short_description"
  ],
  "label": "clustering solution",
  "minRecordsPerCluster": 2,
  "name": "ml_x_snc_global_global_clustering_solution",
  "processingLanguage": "en",
  "scope": "global",
  "stopwords": [
    "Default English Stopwords"
  ],
  "trainingFrequency": "run_once",
  "updateFrequency": "do_not_update"
}
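
The properties output can also be inspected programmatically. The following sketch assumes that input fields are expected to come from the dataset's fieldNames list, as they do in the examples in this topic, and reports any input field that is not part of the dataset. findMissingInputFields() is a hypothetical helper name, and sampleProperties is a trimmed copy of the output above rather than a live getProperties() result.

```javascript
// Hypothetical helper: returns the input field names that do not appear
// in the dataset's fieldNames list.
function findMissingInputFields(propertiesJson) {
    var props = JSON.parse(propertiesJson);
    var datasetFields = (props.datasetProperties && props.datasetProperties.fieldNames) || [];
    var inputFields = props.inputFieldNames || [];
    var missing = [];
    for (var i = 0; i < inputFields.length; i++) {
        if (datasetFields.indexOf(inputFields[i]) === -1) {
            missing.push(inputFields[i]);
        }
    }
    return missing;
}

// Trimmed copy of the getProperties() output shown above.
var sampleProperties = JSON.stringify({
    datasetProperties: {
        tableName: 'incident',
        fieldNames: ['category', 'short_description', 'state', 'description']
    },
    inputFieldNames: ['short_description']
});

var missingFields = findMissingInputFields(sampleProperties);
```

On an instance, the argument would be the string returned by myCluster.getProperties().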

ClusteringSolution - getVersion(String version)

Gets a version of a solution by the provided version number.

Table 14. Parameters
Name	Type	Description
version	String	Existing version number of a solution.

Table 15. Returns
Type	Description
Object	Specified version of the ClusteringSolution() object on which you can call ClusteringSolutionVersion API methods.

Example

The following example shows how to get the training status of a solution by version number.

var mlSolution = sn_ml.ClusteringSolutionStore.get('ml_x_snc_global_global_clustering');

gs.print(JSON.stringify(JSON.parse(mlSolution.getVersion('1').getStatus()), null, 2));

Output:

{
  "state": "solution_complete",
  "percentComplete": "100",
  "hasJobEnded": "true"
}

ClusteringSolution - setActiveVersion(String version)

Activates a specified version of a solution in the store.

Table 16. Parameters
Name	Type	Description
version	String	Name of the ClusteringSolution() object version to activate. Activating this version deactivates any other version.

Table 17. Returns
Type	Description
None

Example

The following example shows how to activate a solution version in the store.

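// Get the solution from the store, then activate a version (the version number '1' is illustrative)
var mlSolution = sn_ml.ClusteringSolutionStore.get("ml_incident_categorization");
mlSolution.setActiveVersion('1');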

ClusteringSolution - submitTrainingJob()

Submits a training job.

Note: Before running this method, you must first add a solution to the store using the ClusteringSolutionStore - add() method.

Table 18. Parameters
Name	Type	Description
None

Table 19. Returns
Type	Description
Object	ClusteringSolutionVersion object corresponding to the ClusteringSolution being trained.

Example

The following example shows how to create a dataset, apply it to a solution, add the solution to a store, and submit the training job.

// Create a dataset 
var myData = new sn_ml.DatasetDefinition({

  'tableName' : 'incident',
  'fieldNames' : ['assignment_group', 'short_description', 'description'],
  'encodedQuery' : 'activeANYTHING'

});

// get a trained encoder from the store
var myEncoder = sn_ml.EncoderStore.get('ml_x_snc_global_global_encoder');

// Create a solution 
var mySolution = new sn_ml.ClusteringSolution({

  'label': "my solution definition",
  'dataset' : myData,
  'encoder' : myEncoder,
  'predictedFieldName' : 'assignment_group',
  'inputFieldNames':['short_description']

});

// Add the solution to the store to later be able to retrieve it.
var my_unique_name = sn_ml.ClusteringSolutionStore.add(mySolution);

// Train the solution - this is a long running job 
var myClusterVersion = mySolution.submitTrainingJob();
