Data collection process and logging

To debug data collection, you need to know the data collection process and how it is reflected in the job logs.

As an administrator, you sometimes have to debug a data collection job. Each job generates a log, but to understand the entries in this log, you need to know which step in the data collection process produced each entry.

A data collection job executes an SQL query for each indicator source that uses the data collector. The query is repeated for every collection period from the start date to the stop date, and then the queries are run for the next indicator source. Each step of executing the query is documented in the data collection job log. The following example is excerpted from the [PA Incident] Historic Data Collector job.
Table 1. Steps of SQL query execution and resulting log entries
| Step number | Step of SQL query execution | Example of resulting log entry |
| --- | --- | --- |
| 1 | Retrieve indicator source. | Processing indicator source Incidents.Open |
| 2 | Start date of collection job. | Collecting for 20171028 |
| 3 | Fetch fields. | Fetching "short_description,sys_id,opened_at,assignment_group,description,priority,category" from "incident" |
| 4 | Generate SQL based on the conditions that are specified in the indicator source. Note: If the indicator source specifies Today in one of the conditions, Today is considered relative to the period for which the data collection job is executed. For example, the Incidents.New indicator source includes the condition [Opened][on][Today]. With days defined to start at 07:00:00, when data is collected for 2017-10-28, the job produces the SQL script on the right. | SELECT task0.`sys_id` FROM task task0 WHERE task0.`sys_class_name` = 'incident' AND (task0.`opened_at` >= '2017-10-28 07:00:00' AND task0.`opened_at` <= '2017-10-29 06:59:59') |
| 5 | Validate indicator conditions. | (Not logged) |
| 6 | Execute SQL query, which fetches rows from the facts table. | Fetched 150 rows from incident |
| 7 | The map/reduce function runs. | Applying map/reduce function for indicator source Incidents.Open |
| | | Applied map/reduce function |
| 8 | If text indexing is active and has been configured for the indicator source, the data collector stores the resulting text index. | Storing Text Index for indicator source Incidents.Open |
| | | Bytes used by text index: 41,984 for: Incidents.Open |
| 9 | Loop through the records of the indicator source and execute or evaluate any scripts. | |
| | For each indicator that is a member of the collection job and uses the same indicator source: | |
| 10 | Validate indicator conditions. | (Not logged) |
| 11 | Calculate the indicator score. | (Not logged) |
| 12 | For each breakdown: 1. Calculate the breakdown score or execute the breakdown script. 2. Retrieve all breakdown unique values. 3. Create or update the array for scores or snapshots. The array is Indicator, Breakdown 1, Element 1, Breakdown 2, Element 2, Domain, Value, Array of [sys_id]. | Not logged, but retrieving breakdown unique values can cause delays, especially if the query does not use indexes or retrieves many records. |
| 13 | Delete previous scores for the indicators and breakdowns that use the indicator source. | Deleting previous results for indicator source Incidents.Open |
| | | Deleted previous results 38 |
| | | Deleted previous results 21 |
| 14 | Store newly collected results for the indicator source. | Storing collected results for indicator source Incidents.Open |
| | | Stored collected results |
| 15 | Specify which indicators the data collector does not collect scores for. | Not collecting for Indicator: Summed age of open incidents with excluded Breakdown: Assignment Group |
| 16 | Finish collecting data for that indicator source for that period. | Collecting for 20171028 finished |
| 17 | For each other period, if any, for the same indicator source, loop back to step 2. | |
| 18 | For each other indicator source, if any, loop back to step 1. | |
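
As a reading aid, the loop order from steps 17 and 18 and the period-relative meaning of Today from step 4 can be sketched in Python. This is a minimal illustration, not the data collector's actual implementation: the function names `today_window`, `today_condition_sql`, and `collection_order` are hypothetical, and the sketch assumes daily collection periods and a day that starts at 07:00:00, as in the example in step 4.

```python
from datetime import date, datetime, time, timedelta
from typing import Iterator, List, Tuple

SQL_TIMESTAMP = "%Y-%m-%d %H:%M:%S"


def today_window(collection_date: date,
                 day_start: time = time(7, 0, 0)) -> Tuple[datetime, datetime]:
    """Return the first and last second of "Today" for one collection period.

    Assumption: a period is one day that begins at day_start and ends one
    second before the next day begins, as in the step 4 example.
    """
    start = datetime.combine(collection_date, day_start)
    end = start + timedelta(days=1) - timedelta(seconds=1)
    return start, end


def today_condition_sql(collection_date: date,
                        day_start: time = time(7, 0, 0),
                        column: str = "task0.`opened_at`") -> str:
    """Format the window as a WHERE-clause fragment like the one in step 4."""
    start, end = today_window(collection_date, day_start)
    return (f"({column} >= '{start.strftime(SQL_TIMESTAMP)}' "
            f"AND {column} <= '{end.strftime(SQL_TIMESTAMP)}')")


def collection_order(indicator_sources: List[str],
                     start_date: date,
                     stop_date: date) -> Iterator[Tuple[str, date]]:
    """Yield (indicator source, period) pairs in the order the job visits them.

    The outer loop corresponds to step 18 (next indicator source), the inner
    loop to step 17 (next period for the same indicator source).
    """
    for source in indicator_sources:
        period = start_date
        while period <= stop_date:
            yield source, period
            period += timedelta(days=1)  # assumption: daily periods


if __name__ == "__main__":
    for source, period in collection_order(
            ["Incidents.Open", "Incidents.New"],
            date(2017, 10, 28), date(2017, 10, 29)):
        print(source, period.strftime("%Y%m%d"), today_condition_sql(period))
```

For the period 2017-10-28, `today_condition_sql` produces the same date range as the SQL in step 4. When reading a job log, all "Collecting for" entries for one indicator source should therefore appear before the first "Processing indicator source" entry for the next one.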