Data Table Native Encoding

Any Data Table may be encoded into and decoded from:

  • A Unicode string
  • A byte array

Encoding to and from strings is often used to manipulate Data Tables programmatically, e.g. via the Java SDK or the .NET API. The string form of a table format is also often used as a compact, human-readable representation of that format.

Encoding to and from byte arrays is used to transfer Data Tables over the network using the AggreGate Communication Protocol or to store them in the AggreGate Server database.

Byte Array Encoding

Encoding of a Data Table into a byte array involves two stages:

  • First, the Data Table is encoded into a Unicode string
  • Second, the resulting string is encoded into an array of bytes using UTF-8. See the UTF-8 documentation (e.g. the UTF-8 Wikipedia article) for details.
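The second stage can be sketched in a few lines of Python. The function names below are hypothetical and are not part of any AggreGate SDK; the sketch only assumes that the first stage has already produced the encoded string.

```python
def table_string_to_bytes(encoded_table: str) -> bytes:
    """Stage two: turn an already string-encoded Data Table into UTF-8 bytes."""
    return encoded_table.encode("utf-8")


def bytes_to_table_string(data: bytes) -> str:
    """Reverse of stage two: recover the encoded string from UTF-8 bytes."""
    return data.decode("utf-8")


# The sample table used later in this article (visible separators).
encoded = "<F=<<IP><S><F=C>><M=1><X=1>><R=<192.168.1.88>>"
payload = table_string_to_bytes(encoded)
```

Decoding works in the opposite order: the bytes are first decoded from UTF-8, and the resulting string is then parsed as a Data Table.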

String Encoding Concept

Generally, a Data Table and its parts are encoded into a string as a sequence of elements of the following format:

<[element_name=]element_value>

Both the name and the value may include any character except those used by the AggreGate Communication Protocol or those used to encode the element itself. The name part, shown in square brackets, is optional. An element's value may be an encoded list of nested elements.
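To illustrate how nested elements can be told apart, here is a minimal Python sketch that splits a string into its top-level elements. It assumes visible separators (<, > and =) and well-formed input; split_elements and split_name are hypothetical helper names, not SDK functions.

```python
from typing import List, Optional, Tuple

Element = Tuple[Optional[str], str]  # (name or None, value)


def split_name(body: str) -> Element:
    """Split 'name=value' into its parts; bodies without a name yield (None, body)."""
    eq = body.find("=")
    lt = body.find("<")
    # A '=' counts as the name separator only if it precedes any nested element.
    if eq != -1 and (lt == -1 or eq < lt):
        return body[:eq], body[eq + 1:]
    return None, body


def split_elements(encoded: str) -> List[Element]:
    """Return the top-level elements of a string encoded with visible separators."""
    elements: List[Element] = []
    depth = 0
    start = 0
    for pos, ch in enumerate(encoded):
        if ch == "<":
            if depth == 0:
                start = pos + 1  # body starts right after the opening separator
            depth += 1
        elif ch == ">":
            depth -= 1
            if depth < 0:
                raise ValueError(f"unbalanced '>' at position {pos}")
            if depth == 0:
                elements.append(split_name(encoded[start:pos]))
    if depth != 0:
        raise ValueError("unbalanced '<'")
    return elements
```

Calling split_elements again on an element's value descends one nesting level, so a whole table can be walked recursively.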

Table Encoding

Here is the format for an encoded data table:

<F=record_format>[<I=>][<R=record>][<R=record>]…

Element Name

Element Value

F

Table format descriptor, defining names and types of all columns in the data table and other table properties. Its encoding is described in Table Format Encoding below.

This element may be omitted if the ID element (D) is present and the format with this ID is already known within the current connection.

D

ID of the table format. See format caching for more details.

I

Invalidator element. If this element is present in the data table, the whole table is considered invalid. When AggreGate Server requests a variable value from an Agent, the Agent first acknowledges the request and then begins polling the records from the hardware device. Meanwhile, AggreGate Server assumes that it is receiving a complete and correct table.

If the link between the Agent and the hardware device then fails, and the Agent does not receive the data it expects from the device, the Agent inserts an Invalidator element into the data table sent to the server. This element is the Agent's way of telling AggreGate Server "I cannot fully obtain this data". Having received it, AggreGate Server knows that the operation could not be completed.

R

Record of the data table. Records are encoded one by one. Format of encoded records is described below.

T

Timestamp of the data table. It normally indicates when the data table itself, or the data sample it represents, was created or acquired. The timestamp is encoded into a string as the number of milliseconds since the epoch (1 Jan 1970).

Q

Quality of the data table. It indicates how reliable the data sample represented by the data table is. Quality is a 32-bit signed integer value encoded into a string.


Example: <F=<<IP><S><F=C>><M=1><X=1>><R=<192.168.1.88>>

This example shows an encoded data table whose format defines one string field called "IP" and which contains one record with the value "192.168.1.88". See details of record and format encoding below.
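The example above can be assembled from its parts as in the following Python sketch. The helper names element and encode_table are hypothetical; only the F, I and R elements shown in the format line are handled, and the format and records are assumed to be encoded already.

```python
from typing import Iterable, Optional


def element(value: str, name: Optional[str] = None) -> str:
    """Wrap a value into <value> or <name=value> using visible separators."""
    return f"<{value}>" if name is None else f"<{name}={value}>"


def encode_table(record_format: str, records: Iterable[str], invalid: bool = False) -> str:
    """Join an encoded format and encoded records into one table string."""
    parts = [element(record_format, "F")]
    if invalid:
        parts.append(element("", "I"))  # Invalidator element, written as <I=>
    parts.extend(element(record, "R") for record in records)
    return "".join(parts)


table = encode_table("<<IP><S><F=C>><M=1><X=1>", ["<192.168.1.88>"])
```

Here table holds the same string as the example, and passing invalid=True adds the <I=> element after the format.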

Table Format Encoding

record_format is included in every data table, even if it is empty. It's built like this:

<field_format><field_format>…[<F=flags>][<V=table_validators>][<R=record_validators>][<M=min_records>][<X=max_records>][<B=bindings>][<N=naming_expression>]

Elements with no names (the <field_format> elements at the beginning of the format above) are encoded format descriptors of table fields. Field format descriptors are encoded one by one, starting from the first field.

Element Name

Element Value

F

Combination of zero or more of the following flags:

“R” (“Reorderable”) - indicates rows of this table may be reordered by AggreGate users editing it.

“U” (“Unresizable”) - indicates that users cannot add/remove rows when editing the table.

V

Table validators that help perform complex validation of a whole table.

R

Record validators that are used to validate every record.

B

Table bindings.

M

Minimal allowed number of records in the table.

X

Maximal allowed number of records in the table.

N

Table naming expression.


Example: <<date><D>><M=1><X=1>

This example format describes a one-cell table (one field, with both the minimum and maximum record count equal to one). The only field's name is "date", and its type is Date. See details of field format encoding below.


Example: <<id><S><D=Card ID><V=<L=10 10>>><<name><S><D=Cardholder Name>><M=0><X=255>

This format describes a two-field table that may contain from 0 to 255 records. The first field is a string field called "id" with the description "Card ID". It has a limits validator that restricts the value length to exactly 10 characters. The second field is also of string type and is called "name".
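Both format examples can be reproduced with a short Python sketch. encode_table_format is a hypothetical helper that covers only the field descriptors and the F, M and X elements, emitted in the order given by the format line above; validators, bindings and the naming expression are left out.

```python
from typing import Optional, Sequence


def encode_table_format(
    field_formats: Sequence[str],
    min_records: Optional[int] = None,
    max_records: Optional[int] = None,
    flags: str = "",
) -> str:
    """Build a record_format string from already encoded field formats."""
    parts = [f"<{field_format}>" for field_format in field_formats]
    if flags:
        parts.append(f"<F={flags}>")
    if min_records is not None:
        parts.append(f"<M={min_records}>")
    if max_records is not None:
        parts.append(f"<X={max_records}>")
    return "".join(parts)


one_cell = encode_table_format(["<date><D>"], min_records=1, max_records=1)
cards = encode_table_format(
    ["<id><S><D=Card ID><V=<L=10 10>>", "<name><S><D=Cardholder Name>"],
    min_records=0,
    max_records=255,
)
```

The two results match the one-cell and two-field examples shown above.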

Field Format Encoding

field_format is a string describing one field in a Data Table. It's formatted like this:

<name><type>[<F=flags>][<A=default>][<D=description>][<H=help>][<S=selection_values>][<V=validators>][<E=editor>]

The first two elements have no names. The first element is the field name, and the second one contains the field type code (see table below).

Element Name

Element Value

F

Combination of zero or more of the following flags:

“N” (“Nullable”) - indicates that the column can contain NULL values.

“O” (“Optional”) - indicates that the column is optional.

“E” (“Extendable selection values”) - indicates that the field may contain values that are not listed in selection_values.

“R” (“Read only”) - indicates that the field value is read only.

“C” (“Not replicated”) - indicates that the value of this field is not replicated during Data Table Copy operations.

“H” (“Hidden”) - indicates that the column should not be visible during Data Table edit operations.

“K” (“Key field”) - indicates that the column is a key field. Key fields are used in the data table smart copy operation. They are also used by the key fields validator, which ensures that the table doesn't contain records with equal combinations of all key fields.

A

Default value of the field, encoded into a string as covered in the Field Types and Value Encoding section.

D

Field description.

H

Field help (detailed description).

S

List of selection values for the field. See Encoding Selection Values below.

V

List of field validators. See Encoding Validators below.

E

Code of editor/renderer. This element enables custom visual representation of field value. Supported editors and renderers are listed here.

O

Editor-specific options. Options supported by every editor/renderer type are listed here. If editor options are not specified, this element should be omitted from table format definition.

I

String ID of field icon.

G

Field group.


Example: <value><I>

This is the simplest possible field format descriptor. It defines an integer field called "value".


Example: <period><L><A=30000><D=Check Period><V=<L=100 1000000>><E=period><O=0 4>

This format descriptor defines a long (64-bit integer) field with the default value 30000. The field description is "Check Period". It has a limits validator that restricts values to the range 100..1000000. The editor/renderer type instructs the system to use the Time Period editor for editing the field value.
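A field format descriptor can be built the same way. In the Python sketch below, encode_field_format is a hypothetical helper. The element order follows the format line above, and placing O directly after E is an assumption taken from the "period" example; the I and G elements are omitted because their position is not stated.

```python
from typing import Mapping, Optional

# Assumed emission order of the named elements (O after E as in the example).
ELEMENT_ORDER = ["F", "A", "D", "H", "S", "V", "E", "O"]


def encode_field_format(
    name: str, type_code: str, options: Optional[Mapping[str, str]] = None
) -> str:
    """Build <name><type> followed by the named elements that were supplied."""
    options = dict(options or {})
    unknown = set(options) - set(ELEMENT_ORDER)
    if unknown:
        raise ValueError(f"unsupported elements: {sorted(unknown)}")
    parts = [f"<{name}>", f"<{type_code}>"]
    for key in ELEMENT_ORDER:
        if key in options:
            parts.append(f"<{key}={options[key]}>")
    return "".join(parts)


simple = encode_field_format("value", "I")
period = encode_field_format(
    "period",
    "L",
    {"A": "30000", "D": "Check Period", "V": "<L=100 1000000>", "E": "period", "O": "0 4"},
)
```

The values of simple and period correspond to the two examples above.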

Field Types and Value Encoding

Type Code

Type

Transfer

Comments and value encoding rules

S

String

Yes

Inserted "as is".

I

Integer


Converted to string, e.g. 123 or -123.

L

Long


Converted to string, e.g. 123 or -123.

B

Boolean


TRUE is encoded as string 1 and FALSE as string 0

F

Float


Converted to a string according to the below rules. All characters mentioned below are ASCII characters.

  • If the argument is NaN, the result is the string NaN.
  • Otherwise, the result is a string that represents the sign and magnitude (absolute value) of the argument. If the sign is negative, the first character of the result is -; if the sign is positive, no sign character appears in the result. As for the magnitude m:
  • If m is infinity, it is represented by the characters Infinity; thus, positive infinity produces the result Infinity and negative infinity produces the result -Infinity.
  • If m is zero, it is represented by the characters 0.0; thus, negative zero produces the result -0.0 and positive zero produces the result 0.0.
  • If m is greater than or equal to 10^-3 but less than 10^7, then it is represented as the integer part of m, in decimal form with no leading zeroes, followed by ., followed by one or more decimal digits representing the fractional part of m.
  • If m is less than 10^-3 or greater than or equal to 10^7, then it is represented in so-called "computerized scientific notation." Let n be the unique integer such that 10^n <= m < 10^(n+1); then let a be the mathematically exact quotient of m and 10^n so that 1 <= a < 10. The magnitude is then represented as the integer part of a, as a single decimal digit, followed by ., followed by decimal digits representing the fractional part of a, followed by the letter E, followed by a representation of n as a decimal integer.

How many digits must be printed for the fractional part of m or a? There must be at least one digit to represent the fractional part, and beyond that as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type Float (or Double, if a double value is processed). That is, suppose that x is the exact mathematical value represented by the decimal representation produced by this method for a finite nonzero argument d. Then d must be the double value nearest to x; or if two double values are equally close to x, then d must be one of them and the least significant bit of the significand of d must be 0.

E

Double


Same rules as for the Float type.

D

Date


Converted to string in the form "yyyy-MM-dd HH:mm:ss.SSS", where

yyyy is year

MM is month

dd is day of month

HH is hour (0-23)

mm is minutes

ss is seconds

SSS is milliseconds

The conversion must use UTC timezone.

T

Data Table

Yes

Nested data table is encoded to string according to the Data Table Encoding rules

C

Color


Converted to string in the form "#RRGGBB", where

RR is red value (0-255) in hex form

GG is green value (0-255) in hex form

BB is blue value (0-255) in hex form

A

Data Block

Yes

Converted to string in the following form:

Version / ID / Name / Preview_length / Data_length / Preview Data

The string contains several parts separated by / character. Those parts are:

  • Version. Version of the data block encoding algorithm, currently 0.
  • ID. Unique identifier of this data block within an AggreGate Server installation. A NULL (undefined) ID is represented by a single 0x1A (SUB) character (see NULL Value Encoding).
  • Name. Name of the data block, usually the name of the file that was loaded into the data block. A NULL (undefined) name is represented by a single 0x1A (SUB) character (see NULL Value Encoding).
  • Preview_length. Number of bytes contained in the encoded data block preview. A preview is a shortened representation of a data block, e.g. a thumbnail of an image. -1 length means that preview is not available.
  • Data_length. Number of bytes contained in the encoded data. -1 length means that data is not available.
  • Preview. Encoded bytes of the data block's preview.
  • Data. Encoded bytes of the block's data.

There is no separator character between Preview and Data fields. Preview data should be separated from main data according to their lengths defined by Preview_length and Data_length.

During encoding of a Data Block to a string, each byte of Preview and Data is converted to the Unicode character with the same code, in the range 0...255 (only codes 0...127 of this range are ASCII characters).
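A few of the value encoding rules above can be sketched in Python as follows. All function names are hypothetical. The sketch assumes timezone-aware datetime values, uppercase hex digits for colors, and no spaces around the / separators of a Data Block; none of these details is confirmed by this article. Float and Double formatting and transfer encoding are not covered here.

```python
from datetime import datetime, timezone
from typing import Optional

NULL = "\x1a"  # 0x1A (SUB), used for undefined Data Block ID and name


def encode_boolean(value: bool) -> str:
    return "1" if value else "0"


def encode_date(value: datetime) -> str:
    """Format as yyyy-MM-dd HH:mm:ss.SSS in UTC (value must be timezone-aware)."""
    utc = value.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%d %H:%M:%S") + f".{utc.microsecond // 1000:03d}"


def encode_color(red: int, green: int, blue: int) -> str:
    """Format as #RRGGBB; uppercase hex digits are an assumption."""
    return f"#{red:02X}{green:02X}{blue:02X}"


def encode_data_block(
    data: Optional[bytes],
    preview: Optional[bytes] = None,
    block_id: Optional[int] = None,
    name: Optional[str] = None,
    version: int = 0,
) -> str:
    """Build Version/ID/Name/Preview_length/Data_length/ followed by preview and data."""
    def length(block: Optional[bytes]) -> int:
        return -1 if block is None else len(block)

    def body(block: Optional[bytes]) -> str:
        # latin-1 maps every byte to the Unicode character with the same code.
        return "" if block is None else block.decode("latin-1")

    header = "/".join([
        str(version),
        NULL if block_id is None else str(block_id),
        NULL if name is None else name,
        str(length(preview)),
        str(length(data)),
    ])
    return header + "/" + body(preview) + body(data)
```

Because Preview and Data follow each other without a separator, a decoder has to rely on Preview_length and Data_length to split them.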

Encoding Selection Values

Selection values for the field are encoded as a list of elements. Each element's name is the selection value's visible description for the user (what the user will see in the listbox). The element's value is the selection value, encoded into a string as described in the value encoding section.

Example: A list of three selection values for an integer field may be encoded as follows: <Zero=0><One=1><Two=2>
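The same list can be produced with a small Python sketch. encode_selection_values is a hypothetical helper, and the selection values are assumed to be already encoded into strings.

```python
from typing import Iterable, Tuple


def encode_selection_values(values: Iterable[Tuple[str, str]]) -> str:
    """Encode (description, encoded value) pairs as <description=value> elements."""
    return "".join(f"<{description}={value}>" for description, value in values)


choices = encode_selection_values([("Zero", "0"), ("One", "1"), ("Two", "2")])
```

The element name carries the text shown to the user, while the element value carries the stored field value.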

Encoding Validators

Field validators are encoded as a list of elements, one per every validator. The element's name is a type code of the validator while its value contains validator-specific options.

Field Validators

Here is a list of supported field validators:

Type Code

Description

Suitable Field Types

Validator-Specific Options

L

Limits Validator. Checks if a value is within the range defined by validator parameters.

String, Integer, Long, Float, Data

Validator options are encoded into a string as two integer numbers separated by a space. The first number indicates the minimum value of a range, and the second number specifies the maximum value. These numbers have different meaning for different field types:

For String fields, these parameters limit minimal and maximal length of the string.

For Integer, Long and Float fields they indicate minimal and maximal value.

For Data fields they limit number of bytes that can be contained in the data block.

Limits are inclusive for all field types (e.g. a maximum of 3 permits a three-character string such as abc).

R

Regular Expression Validator. Checks if a string value matches a regular expression specified by the validator parameter.

String

Validator option string contains a regular expression to which the field value will be matched. It may be followed by an optional error message separated from the regular expression by a ^^ string. If validation fails (i.e. string value does not match the regular expression), this error message will be shown to the user.

E

Expression Validator. Evaluates an expression with the value of this field available as an environment variable {env/value}. The expression should return a boolean value, otherwise validation fails.

All field types are suitable

The text of expression.


Example 1: <L=0 255>

If this limits validator is added to the format of a String field, it will allow only strings whose length is from 0 to 255 characters. If it is defined for an Integer field, it will restrict field values to numbers that are greater than or equal to 0 and less than or equal to 255.

Example 2: <R=^[_A-Za-z0-9-]+(\.[_A-Za-z0-9-]+)*@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*(\.[_A-Za-z0-9-]+)^^Invalid E-Mail>

^[_A-Za-z0-9-]+(\.[_A-Za-z0-9-]+)*@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*(\.[_A-Za-z0-9-]+) is the regular expression (the first ^ character belongs to it). After the regular expression, ^^ is used as a separator, followed by the text of the error message, "Invalid E-Mail".

This regular expression validator checks if a field contains a valid e-mail address and fails with an Invalid E-Mail error if it doesn't. See Regular Expressions Syntax appendix for more information on regular expressions.
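The behavior of the two validators from the examples can be approximated in Python as shown below. This is only a sketch: check_limits and validate_regex are hypothetical names, the options string is split at the first ^^ occurrence, the whole value is required to match, and Python's re module stands in for the regular expression engine actually used by AggreGate. The default error text is also made up for illustration.

```python
import re
from typing import Optional, Union


def check_limits(options: str, value: Union[str, bytes, int, float]) -> bool:
    """Apply limits-validator options such as '0 255' (both bounds inclusive)."""
    low_text, high_text = options.split(" ")
    low, high = int(low_text), int(high_text)
    # Strings and data blocks are measured by length, numbers by value.
    measured = len(value) if isinstance(value, (str, bytes)) else value
    return low <= measured <= high


def validate_regex(options: str, value: str) -> Optional[str]:
    """Return None if the value matches, otherwise the error message."""
    pattern, _, message = options.partition("^^")  # assumption: first '^^' separates
    if re.fullmatch(pattern, value):
        return None
    return message or "Value does not match the pattern"  # hypothetical default text


EMAIL_OPTIONS = (
    r"^[_A-Za-z0-9-]+(\.[_A-Za-z0-9-]+)*@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*"
    r"(\.[_A-Za-z0-9-]+)^^Invalid E-Mail"
)
```

With these helpers, validate_regex(EMAIL_OPTIONS, "user@example.com") returns None, while a value without an @ character yields the "Invalid E-Mail" message.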

Record Validators

Here is a list of supported record validators:

Type Code

Description

Validator-Specific Options

K

Key Fields. Checks that the combination of key field values does not already exist in the table. Applied during record addition.

None - key fields are marked with Key Field flag of field format.

Table Validators

Here is a list of supported table validators:

Type Code

Description

Validator-Specific Options

K

Key Fields. Checks whether the combination of key field values is unique for every record.

None - key fields are marked with Key Field flag of field format.

E

Expression Validator. Evaluates an expression having this table as the default table. If the expression returns NULL, the table is considered valid; otherwise the expression output is converted to a string and used as the error text.

The text of expression.


Example: <E={activationThreshold} > {deactivationThreshold} ? null : 'Activation threshold must be greater than deactivation threshold'>

Checks that one field is greater than another.

Encoding of NULL Values

A NULL ("<Not set>") value is encoded with a single 0x1A (SUB) character. This rule applies to encoding NULL values of table cells, default values of table fields, selection values and any other place where field values may appear.

If visible separators are used to encode the Data Table, NULL values are encoded as "^" characters.

Transfer Encoding

Field values are transfer-encoded to remove characters used by the Data Table encoding format and AggreGate Communication Protocol. These characters are substituted with special patterns according to the following table:

Character      Replaced by
0x25 (%)       %%
0x02 (STX)     %^
0x0D (CR)      %$
0x17 (ETB)     %/
0x1C (FS)      %<
0x1D (GS)      %>
0x1E (RS)      %=

Note that the patterns in the Replaced by column are literal strings -- they're just what you see above.
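The substitution table translates directly into a pair of functions. The Python sketch below uses hypothetical names (transfer_encode, transfer_decode) and treats any % that is not followed by one of the listed characters as an error, which is an assumption about how malformed input should be handled.

```python
TRANSFER_MAP = {
    "%": "%%",
    "\x02": "%^",  # STX
    "\x0d": "%$",  # CR
    "\x17": "%/",  # ETB
    "\x1c": "%<",  # FS
    "\x1d": "%>",  # GS
    "\x1e": "%=",  # RS
}
# Maps the character after '%' back to the original character.
REVERSE_MAP = {pattern[1]: char for char, pattern in TRANSFER_MAP.items()}


def transfer_encode(value: str) -> str:
    """Replace every special character with its two-character pattern."""
    return "".join(TRANSFER_MAP.get(ch, ch) for ch in value)


def transfer_decode(encoded: str) -> str:
    """Reverse transfer_encode; raises ValueError on an unknown pattern."""
    result = []
    i = 0
    while i < len(encoded):
        ch = encoded[i]
        if ch == "%":
            if i + 1 >= len(encoded) or encoded[i + 1] not in REVERSE_MAP:
                raise ValueError(f"invalid escape at position {i}")
            result.append(REVERSE_MAP[encoded[i + 1]])
            i += 2
        else:
            result.append(ch)
            i += 1
    return "".join(result)
```

Since % itself is replaced by %%, decoding is unambiguous: every % in an encoded value starts a two-character pattern.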

Encoding of Data Records

Each data record is encoded into a string according to the following format:

[<I=record_ID>]<field_value><field_value>…

Elements without names are field values. Field values are encoded one by one, in the same order as they appear in the table format descriptor.

Element Name    Element Value
I               Record ID (Long number)
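A record can be assembled as in the following Python sketch. encode_record is a hypothetical helper that assumes visible separators, so NULL cells are written as ^; the other field values are expected to be encoded into strings beforehand.

```python
from typing import Optional, Sequence

VISIBLE_NULL = "^"  # NULL marker when visible separators are used


def encode_record(field_values: Sequence[Optional[str]], record_id: Optional[int] = None) -> str:
    """Encode one record: optional <I=record_ID>, then one element per field value."""
    parts = []
    if record_id is not None:
        parts.append(f"<I={record_id}>")
    for value in field_values:
        parts.append("<" + (VISIBLE_NULL if value is None else value) + ">")
    return "".join(parts)


record = encode_record(["192.168.1.88"])
```

The values must be passed in the same order as the fields appear in the table format descriptor, since the elements carry no field names.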

String Encoding Modes

Data Tables may be encoded in two modes:

  • Using visible separators
  • Using invisible separators

Three special characters are used as separators for encoding:

Visible Separator    Invisible Separator (character code)
<                    0x1C
>                    0x1D
=                    0x1E

In this article, all examples use visible separators. The only limitation of encoding a table with visible separators is that its elements, once encoded to strings, must not contain these separator characters.
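The separator mapping can be illustrated with a simple character translation in Python. The function names are hypothetical, and the sketch is deliberately naive: it is only correct when no element name or value contains a separator character, and it does not convert the NULL marker between ^ and 0x1A.

```python
VISIBLE_TO_INVISIBLE = str.maketrans({"<": "\x1c", ">": "\x1d", "=": "\x1e"})
INVISIBLE_TO_VISIBLE = str.maketrans({"\x1c": "<", "\x1d": ">", "\x1e": "="})


def to_invisible(encoded: str) -> str:
    """Swap visible separators for their invisible counterparts."""
    return encoded.translate(VISIBLE_TO_INVISIBLE)


def to_visible(encoded: str) -> str:
    """Swap invisible separators for their visible counterparts."""
    return encoded.translate(INVISIBLE_TO_VISIBLE)


invisible = to_invisible("<R=<192.168.1.88>>")
```

A translation like this is mainly useful for inspecting or logging tables; a real converter would have to parse the elements to treat values and NULL markers correctly.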