STATISTICS command

Concept Information

Generating statistics

Calculates statistics for one or more numeric or datetime fields in an Analytics table.

Syntax

STATISTICS {<ON> field_name <...n>|<ON> ALL <EXCLUDE field_name <...n>>} <STD> <MODMEDQ> <NUMBER n> <TO {SCREEN|filename|PRINT}> <IF test> <WHILE test> <FIRST range|NEXT range> <APPEND>

Parameters

Name Description
ON field_name <...n> | ON ALL

Specify one or more numeric or datetime fields to generate statistics for, or specify ON ALL to generate statistics for all numeric and datetime fields in the Analytics table.

EXCLUDE field_name

optional

Only valid when generating statistics using ON ALL.

The field or fields to exclude from the command. EXCLUDE allows you to fine-tune ON ALL, by excluding the specified fields.

EXCLUDE must immediately follow ON ALL. For example:

ON ALL EXCLUDE field_1 field_2

STD

optional

Calculates the standard deviation of the fields specified, in addition to the other statistics.

MODMEDQ

optional

Calculates the mode, median, first quartile, and third quartile values of the fields specified, in addition to the other statistics.

NUMBER n

optional

The number of high and low values to retain during processing. The default value is 5.

TO SCREEN | filename | PRINT

optional

The location to send the results of the command to:

  • SCREEN displays the results in the Analytics display area

    Tip

    You can click any linked result value in the display area to drill down to the associated record or records in the source table.

  • filename saves the results to a file

    Specify filename as a quoted string with the appropriate file extension. For example: TO "Output.TXT"

    By default, the file is saved to the folder containing the Analytics project.

    Use either an absolute or relative file path to save the file to a different, existing folder:

    • TO "C:\Output.TXT"
    • TO "Results\Output.TXT"
  • PRINT sends the results to the default printer

IF test

optional

A conditional expression that must be true in order to process each record. The command is executed on only those records that satisfy the condition.

Note

The IF parameter is evaluated against only the records remaining in a table after any scope parameters have been applied (WHILE, FIRST, NEXT).

WHILE test

optional

A conditional expression that must be true in order to process each record. The command is executed until the condition evaluates as false, or the end of the table is reached.

Note

If you use WHILE in conjunction with FIRST or NEXT, record processing stops as soon as one limit is reached.

FIRST range | NEXT range

optional

The number of records to process:

  • FIRST start processing from the first record until the specified number of records is reached
  • NEXT start processing from the currently selected record until the specified number of records is reached

Use range to specify the number of records to process.

If you omit FIRST and NEXT, all records are processed by default.

APPEND

optional

Appends the command output to the end of an existing file instead of overwriting it.

Note

You must ensure that the structure of the command output and the existing file are identical:

  • the same fields
  • the same field order
  • matching fields are the same length
  • matching fields are the same data type

Analytics appends output to an existing file regardless of its structure. If the structure of the output and the existing file do not match, jumbled, missing, or inaccurate data can result.

Analytics output variables

Note

If you generate statistics for more than one field in a table, the system-generated output variables contain values for the first listed field only.

Name Contains
ABSn

The absolute value calculated by the command.

Not calculated for datetime fields.

AVERAGEn

The mean value calculated by the command.

COUNTn

The record count calculated by the command.

  • If the variable name is COUNT1, it is storing the record count for the most recent command executed.
  • If the variable name is COUNTn, where n is greater than 1, the variable is storing the record count for a command executed within a GROUP command.

    The value of n is assigned based on the line number of the command in the GROUP. For example, if the command is one line below the GROUP command it is assigned the value COUNT2. If the command is four lines below the GROUP command, it is assigned the value COUNT5.

HIGHn

The 5th highest value, or the 5th most recent date, identified by the command.

The 5th highest is the default setting. The setting can be changed using the NUMBER parameter. For example, NUMBER 3 specifies that the 3rd highest value is stored.

Note

When Analytics identifies the highest value, duplicate values are not factored out. For example, if values in descending order are 100, 100, 99, 98, the 3rd highest value is 99, not 98.

LOWn

The 5th lowest value, or the 5th oldest date, identified by the command.

The 5th lowest is the default setting. The setting can be changed using the NUMBER parameter. For example, NUMBER 3 specifies that the 3rd lowest value is stored.

Note

When Analytics identifies the lowest value, duplicate values are not factored out. For example, if values in ascending order are 1, 1, 2, 3, the 3rd lowest value is 2, not 3.

MAXn

The maximum value, or the most recent date, identified by the command.

MEDIANn

The median value identified by the command.

A median date is displayed as a serial datetime value.

MINn The minimum value, or the oldest date, identified by the command.
MODEn

The most frequently occurring value identified by the command.

A modal date is displayed as a serial datetime value.

Q25n

The first quartile value (lower quartile value) calculated by the command.

A first quartile date is displayed as a serial datetime value.

Q75n

The third quartile value (upper quartile value) calculated by the command.

A third quartile date is displayed as a serial datetime value.

RANGEn The difference between the maximum and minimum values calculated by the command.
STDDEVn The standard deviation value calculated by the command.
TOTALn

The total value calculated by the command.

The value of n is 1 unless the TOTAL command is inside a GROUP command, in which case the value of n corresponds to the line number of the TOTAL command in the GROUP command.

For more information, see GROUP command.

For datetime fields, the total of all serial datetime values in the field.

Examples

Generating conditional statistics

You generate statistics for the Quantity field in records where the product class ID is 01:

STATISTICS ON Quantity IF ProdCls = "01"

Remarks

All the statistics are generated for numeric fields. Only a subset of the statistics are generated for datetime fields.

Several of the datetime statistics are displayed as serial datetime values. You can use a conversion function to convert a serial datetime value to a human-readable regular datetime value. For more information, see Serial datetimes.

Statistic name Numeric field Datetime field
Range The difference between the highest and lowest values The number of days between the most recent and oldest dates
Positive
  • The number of positive values

  • The total of all positive values

  • The average positive value

  • The number of datetime values

  • The total of all serial datetime values

    Not meaningful for dates, or time values that represent points in time. Meaningful for time values that represent amounts of time.

    For more information, see An amount of time versus a point in time.

  • The average datetime value

Negative
  • The number of negative values

  • The total of all negative values

  • The average negative value

Not applicable, not generated
Zeros The number of zero values The number of blank dates
Totals
  • The total number of positive, negative, and zero values

  • The total of all positive, negative, and zero values

  • The average of all positive, negative, and zero values

Not applicable, not generated
Abs Value The total of all values while disregarding the sign of the numbers Not applicable, not generated
Std Dev (optional) The standard deviation from the mean value

The standard deviation from the average datetime value

Not displayed when outputting statistics to screen, but captured in the STDDEVn system-generated output variable.

Median (optional) The median value
  • Odd-numbered sets of values: the middle value

  • Even-numbered sets of values: the average of the two values at the middle

The median date
  • Odd-numbered sets of dates: the middle date

  • Even-numbered sets of dates: the average of the two dates at the middle

Q25 (optional) The first quartile value (lower quartile value)
  • The result is an interpolated value based on an Analytics algorithm

  • Produces the same result as the QUARTILE and QUARTILE.INC functions in Microsoft Excel

The first quartile date (lower quartile date)
  • The result is an interpolated value based on an Analytics algorithm

  • Offset by 1 day from the result of the QUARTILE and QUARTILE.INC functions in Microsoft Excel

    For more information, see Serial datetimes.

Q75 (optional) The third quartile value (upper quartile value)
  • The result is an interpolated value based on an Analytics algorithm

  • Produces the same result as the QUARTILE and QUARTILE.INC functions in Microsoft Excel

The third quartile date (upper quartile date)
  • The result is an interpolated value based on an Analytics algorithm

  • Offset by 1 day from the result of the QUARTILE and QUARTILE.INC functions in Microsoft Excel

    For more information, see Serial datetimes.

Mode (optional) The most frequently occurring value
  • Displays “N/A” if no value occurs more than once

  • In the event of a tie, displays the lowest value

The most frequently occurring date
  • Displays “N/A” if no date occurs more than once

  • In the event of a tie, displays the oldest date

Highest

The five highest values

You can change the number of high values with the # of High/Low setting on the More tab in the Statistics dialog box.

The five most recent dates

You can change the number of recent dates with the # of High/Low setting on the More tab in the Statistics dialog box.

Lowest

The five lowest values

You can change the number of low values with the # of High/Low setting on the More tab in the Statistics dialog box.

The five oldest dates

You can change the number of oldest dates with the # of High/Low setting on the More tab in the Statistics dialog box.