The DatasetDefinition API provides methods to define a set of records by table name, columns, and row selection criteria, for use as input to ML training algorithms. A dataset definition only describes which records to use; it doesn't contain the actual data.

This API requires the Predictive Intelligence plugin (com.glide.platform_ml) and is provided within the sn_ml namespace. For more information, see Predictive Intelligence.

Use the dataset to estimate mutual information with PredictabilityEstimate or to train an Encoder. You can also use the dataset to train a solution of one of the solution types provided in the sn_ml namespace.

For usage guidelines, refer to Using ML APIs.

DatasetDefinition - DatasetDefinition(Object)

Creates an instance of the DatasetDefinition class, enabling you to define a dataset by table name, fields, and query.

Create your dataset definition by passing a table and a list of fields. You can also pass a query to restrict datasets to include rows with specific characteristics.

Once created, a DatasetDefinition object cannot be modified.

Example

The following example shows how to create a dataset definition.

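// Define a dataset over incident records.
var myData = new sn_ml.DatasetDefinition({
  'tableName' : 'incident',
  // Fields to include, here with the dot-walked field assignment_group.name.
  'fieldNames' : ['category', 'short_description', 'priority', 'assignment_group.name'],
  // Type details for individual fields.
  'fieldDetails' : [
    {
      'name' : 'category',
      'type' : 'nominal'
    },
    {
      'name' : 'short_description',
      'type' : 'text'
    }
  ],
  // Restrict rows to records created in the last two quarters with state=3.
  'encodedQuery' : 'sys_created_onONLast%202%20quarters@javascript:gs.beginningOfLast2Quarters()@javascript:gs.endOfLast2Quarters()^state=3'
});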
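
Because a definition cannot be modified after it is created, one way to vary the query is to keep the configuration as a plain object and build a new definition from a modified copy. The following is a minimal sketch under that assumption. The withQuery helper and the variable names are hypothetical and not part of the sn_ml API, and the sn_ml call is guarded so the snippet also runs outside an instance.

```javascript
// Hypothetical helper (not part of sn_ml): returns a shallow copy of a dataset
// configuration with a different encoded query. The base object is not modified.
function withQuery(baseConfig, encodedQuery) {
  var copy = {};
  for (var key in baseConfig) {
    if (Object.prototype.hasOwnProperty.call(baseConfig, key)) {
      copy[key] = baseConfig[key];
    }
  }
  copy.encodedQuery = encodedQuery;
  return copy;
}

// Shared table and fields, without a query.
var baseConfig = {
  'tableName' : 'incident',
  'fieldNames' : ['category', 'short_description', 'priority']
};

// Two separate configurations that differ only in their row selection.
var filteredConfig = withQuery(baseConfig, 'state=3');
var allConfig = withQuery(baseConfig, 'activeANYTHING');

// Construct the definitions only where the sn_ml namespace is available.
if (typeof sn_ml !== 'undefined') {
  var filteredData = new sn_ml.DatasetDefinition(filteredConfig);
  var allData = new sn_ml.DatasetDefinition(allConfig);
}
```

The copy is shallow, so both configurations share the same fieldNames array. That is sufficient here because the array is only read.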

DatasetDefinition - getEligibleFields(String capability)

Returns the fields that are eligible as input fields (features) or as predicted fields for a solution of a given capability, for example, a classification solution. A field is eligible when it has an appropriate Glide data type.

Table 2. Parameters
Name Type Description
capability String Capability for which to retrieve fields eligible for training. This method currently supports only classification solutions; any other value for the capability throws a "capability not supported" exception.

Valid values: "classification"

Table 3. Returns
Type Description
Object Object containing eligible input field names and eligible output field names.
{	 
  "eligibleInputFieldNames" : [Array],
  "eligibleOutputFieldNames" : [Array] 
}
<Object>.eligibleInputFieldNames List of strings indicating input fields eligible for training.

Data type: Array

<Object>.eligibleOutputFieldNames List of strings indicating output fields eligible for training.

Data type: Array
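
Since an unsupported capability throws an exception, callers may want to wrap the call. The sketch below shows one possible wrapper. The readEligibleFields function and the stub object are hypothetical and not part of the sn_ml API. Accepting either a JSON string or an object is an assumption, made because the return type is listed as an object while the documented example passes the result through JSON.parse().

```javascript
// Hypothetical wrapper (not part of sn_ml). 'dataset' is any object that
// exposes getEligibleFields(capability), such as a DatasetDefinition.
function readEligibleFields(dataset, capability) {
  var result;
  try {
    result = dataset.getEligibleFields(capability);
  } catch (e) {
    // Unsupported capabilities throw; report no eligible fields instead.
    return { eligibleInputFieldNames: [], eligibleOutputFieldNames: [] };
  }
  // Assumption: the result may arrive as a JSON string or as an object.
  if (typeof result === 'string') {
    result = JSON.parse(result);
  }
  return {
    eligibleInputFieldNames: result.eligibleInputFieldNames || [],
    eligibleOutputFieldNames: result.eligibleOutputFieldNames || []
  };
}

// Stand-in for a DatasetDefinition so the sketch runs outside an instance.
var stubDataset = {
  getEligibleFields: function (capability) {
    if (capability !== 'classification') {
      throw new Error('capability not supported');
    }
    return JSON.stringify({
      eligibleInputFieldNames: ['short_description'],
      eligibleOutputFieldNames: ['category']
    });
  }
};

var supported = readEligibleFields(stubDataset, 'classification');
var unsupported = readEligibleFields(stubDataset, 'regression');
```

Returning empty lists on failure keeps calling code simple, at the cost of hiding the error message. Logging the exception before returning would be a reasonable variation.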

Example

The following example shows how to display eligible fields for a classification solution.

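// Define a dataset over all incident records.
var myIncidentData = new sn_ml.DatasetDefinition({
  'tableName' : 'incident',
  'encodedQuery' : 'activeANYTHING'
});

// Retrieve the fields that are eligible for a classification solution.
var eligibleFields = JSON.parse(myIncidentData.getEligibleFields('classification'));

// Print the result as indented JSON.
gs.print(JSON.stringify(eligibleFields, null, 2));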

Output:

{
  "eligibleInputFieldNames": [
    "resolved_by",
    "short_description",
    "description",
    "notify"
  ],
  "eligibleOutputFieldNames": [
    "parent",
    "caused_by",
    "location",
    "category"
  ]
}
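
A result like the one above can be used to check candidate fields before defining a solution. The following is a minimal sketch of such a check. The partitionByEligibility helper and the requestedInputs list are hypothetical and not part of the sn_ml API, and the sample data is copied from the output above.

```javascript
// Hypothetical helper (not part of sn_ml): splits requested field names into
// those found in an eligibility list and those that are not.
function partitionByEligibility(requested, eligible) {
  var accepted = [];
  var rejected = [];
  for (var i = 0; i < requested.length; i++) {
    if (eligible.indexOf(requested[i]) !== -1) {
      accepted.push(requested[i]);
    } else {
      rejected.push(requested[i]);
    }
  }
  return { accepted: accepted, rejected: rejected };
}

// Sample result copied from the example output.
var sampleEligibleFields = {
  eligibleInputFieldNames: ['resolved_by', 'short_description', 'description', 'notify'],
  eligibleOutputFieldNames: ['parent', 'caused_by', 'location', 'category']
};

// Candidate input fields and a candidate predicted field.
var requestedInputs = ['short_description', 'priority', 'description'];
var inputCheck = partitionByEligibility(
  requestedInputs, sampleEligibleFields.eligibleInputFieldNames);
var categoryIsPredictable =
  sampleEligibleFields.eligibleOutputFieldNames.indexOf('category') !== -1;
```

With the sample data, inputCheck.accepted contains short_description and description, inputCheck.rejected contains priority, and categoryIsPredictable is true.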