La séptima vida

...or so the cat hopes/fears

A little language for evaluating conditionals

I have a PLC monitoring project where a program will extract information from a PLC and forward the data using MQTT messages. But messages will be sent only under certain conditions, like when the variable temperature is above a given threshold or when the variable alarm is active. I created a mini-language to specify such conditions. The application will read the specifications from a configuration file and it will test them repeatedly.

The program will parse the little language to produce a data structure. Then, the data structure will be turned into a function. This function will test the specified condition as variables evolve.

I wrote the parser using the Perl module Parse::RecDescent. Writing the grammar itself was quite interesting, and so I am describing it here. But let's begin with the simplest of the language elements, which I called "directives":

always
This condition is always true.
never
This condition is always false.

At this point, the grammar is extremely simple. It is embedded in the following program:

use Parse::RecDescent;
use Data::Dumper;
use strict;
use warnings;
use v5.26;

my $grammar = <<'GRAMMAR';
    directive      : ('always' | 'never')
    { [ $item[1] ] }
GRAMMAR

my $parser = Parse::RecDescent->new($grammar);
die "Invalid grammar" unless defined $parser;

my $expression  = 'always';
my $r = $parser->directive($expression);
die "Could not parse $expression" unless defined $r;
say Dumper $r;

Running the above program prints an array reference whose single element is the string always.

Well, that is not so interesting. Let's move on to a more complex case: comparisons. The program may compare variables to numbers or to other variables; take temperature > 30 as an example. To support comparison instructions, our little grammar needs numbers, variables, and comparison operators. Note that an instruction can be a directive, too:

   
    instruction    : variable operator number
                   { [ 'numeric', @item[2,1,3] ] }
                   | variable operator variable
                   { [ 'variable', @item[2,1,3] ] }
                   | directive
                   { [ $item{directive} ] } 
    directive      : 'always' | 'never'
    operator       : ('>' | '<' | '=')
    number         : /[+-]?\d+(?:\.\d+)?/
    variable       : /[A-Za-z][A-Za-z0-9_]*/

Parsing "temperature > 30" with the above grammar gives:

$VAR1 = [
          'numeric',
          '>',
          'temperature',
          '30'
        ];
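
The slice @item[2,1,3] in the action is what puts the operator first in the result. Inside a Parse::RecDescent action, $item[0] holds the name of the rule and $item[1] onwards hold the matched items in order. The sketch below reproduces the reordering in plain Perl; the hand-built @item array is only a stand-in for what the parser would supply:

```perl
use strict;
use warnings;

# Hand-built stand-in for @item inside the action of
# "variable operator number"; index 0 holds the rule name.
my @item = ('instruction', 'temperature', '>', '30');

# The slice picks the operator first, then the variable, then the number.
my $node = [ 'numeric', @item[2, 1, 3] ];

print join(' ', @$node), "\n";    # prints "numeric > temperature 30"
```

Putting the operator right after the node type means every node starts with the two fields that decide how the rest should be read.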

Next, I would like to test whether boolean variables are on or off, and to know when they are triggered or turned_off. I also want to detect when a variable has changed. With these additions, I can write conditions like alarm triggered, temperature changed, and automatic_mode on.

Let's change the grammar by adding post-operators and a new set of instructions:

    
    instruction    : variable operator number
                   { [ 'numeric', @item[2,1,3] ] }
                   | variable operator variable
                   { [ 'variable', @item[2,1,3] ] }
                   | variable postop
                   { ['postop', $item{postop}, $item{variable}] }
                   | directive
                   { [ $item{directive} ] } 
    directive      : 'always' | 'never'
    operator       : ('>' | '<' | '=')
    postop         : 'changed' | 'triggered' | 'turned_off' | 'on' | 'off'
    number         : /[+-]?\d+(?:\.\d+)?/
    variable       : /[A-Za-z][A-Za-z0-9_]*/

The output for alarm triggered is then:

$VAR1 = [
          'postop',
          'triggered',
          'alarm'
        ];
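
The parser only records which post-operator was used; what each one means is decided later, when the condition is evaluated. As a rough sketch of one possible reading, the table below treats triggered and turned_off as edges between the previous and the current value of a variable. Both the semantics and the helper check_postop are assumptions made for illustration, not part of the application:

```perl
use strict;
use warnings;

# Assumed semantics: each test receives the previous and the current
# value of a variable and returns 1 or 0.
my %POSTOP = (
    on         => sub { my ($prev, $cur) = @_; $cur ? 1 : 0 },
    off        => sub { my ($prev, $cur) = @_; $cur ? 0 : 1 },
    triggered  => sub { my ($prev, $cur) = @_; (!$prev && $cur) ? 1 : 0 },
    turned_off => sub { my ($prev, $cur) = @_; ($prev && !$cur) ? 1 : 0 },
    changed    => sub { my ($prev, $cur) = @_; ($prev != $cur) ? 1 : 0 },
);

# Hypothetical helper: look up the post-operator and apply its test.
sub check_postop {
    my ($op, $prev, $cur) = @_;
    my $test = $POSTOP{$op} or die "Unknown post-operator: $op";
    return $test->($prev, $cur);
}

print check_postop('triggered', 0, 1), "\n";        # alarm went from 0 to 1
print check_postop('changed', 21.5, 21.5), "\n";    # temperature stayed put
```

Under these assumptions, on and off look only at the current value, while the other three need the previous value as well, so an evaluator would have to keep some history.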

At this point, the grammar allows for a nice variety of instructions; yet, it lacks the flexibility of logical operators like AND, OR and NOT. These operators will allow me to combine instructions into full-fledged expressions. What is more, logical operators together with parenthesized expressions make it possible to nest conditions to any depth. The grammar below also introduces a third directive, any_change. So let's do this:

    expression     : unaryop expression
                   { [ 'logicop', $item{unaryop}, $item{expression} ] }
                   | '(' expression ')' logicop expression
                   { ['logicop',  $item{logicop}, $item[2], $item[5] ] }
                   | instruction logicop expression
                   { ['logicop', $item{logicop}, $item{instruction}, $item{expression}] }
                   | '(' expression ')'
                   { $item{expression} }
                   | instruction
                   { $item{instruction} }
    instruction    : variable operator number
                   { [ 'numeric', @item[2,1,3] ] }
                   | variable operator variable
                   { [ 'variable', @item[2,1,3] ] }
                   | variable postop
                   { ['postop', $item{postop}, $item{variable}] }
                   | directive
                   { [ $item{directive} ] } 
    logicop        : 'AND' | 'OR'
    unaryop        : 'NOT'
    directive      : 'always' | 'never' | 'any_change'
    operator       : '>' | '<' | '='
    postop         : 'changed' | 'triggered' | 'turned_off' | 'on' | 'off'
    number         : /[+-]?\d+(?:\.\d+)?/
    variable       : /[A-Za-z][A-Za-z0-9_]*/

This last addition requires an explanation. An expression here is one or more instructions, possibly grouped in parentheses, and connected by logical operators. The different productions of the expression rule mean:

unaryop expression
Implements NOT expression
'(' expression ')' logicop expression
This production handles two expressions joined by a logical operator, either AND or OR in this case. It is there to handle strings that start with a parenthesized expression. As will be evident later, both expressions may be arbitrarily complex. What is not evident is that a production like expression logicop expression is not allowed: it is left-recursive, and Parse::RecDescent cannot handle left-recursive rules.
instruction logicop expression
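In a simpler case, an instruction is joined to an expression by a logical operator. This production stands in for the left-recursive expression logicop expression: because its left operand is an instruction rather than an expression, the rule no longer starts by calling itself.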
'(' expression ')'
An expression may be within parentheses, too.
instruction
This production states that a simple instruction, by itself, is also an expression.

The examples below show expressions that may be parsed with this grammar along with the resulting data structures:

temperature > 30
    [
          'numeric',
          '>',
          'temperature',
          '30'
    ];
temperature < minimum
    [
          'variable',
          '<',
          'temperature',
          'minimum'
    ];
alarm off
    [
          'postop',
          'off',
          'alarm'
     ];
alarm on OR temperature > 30
    [
          'logicop',
          'OR',
          [
            'postop',
            'on',
            'alarm'
          ],
          [
            'numeric',
            '>',
            'temperature',
            '30'
          ]
     ];
alarm off OR temperature > minimum AND (temperature < 30 OR alarm on)
    
    [
          'logicop',
          'OR',
          [
            'postop',
            'off',
            'alarm'
          ],
          [
            'logicop',
            'AND',
            [
              'variable',
              '>',
              'temperature',
              'minimum'
            ],
            [
              'logicop',
              'OR',
              [
                'numeric',
                '<',
                'temperature',
                '30'
              ],
              [
                'postop',
                'on',
                'alarm'
              ]
            ]
          ]
    ];
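
One consequence of this grammar is worth spelling out: every binary production recurses on its right-hand side, so a chain of operators groups to the right, and AND gets no priority over OR unless parentheses say otherwise. A small tree walker makes the grouping visible by writing the tree back as text with explicit parentheses. The helper render is hypothetical, and the tree is typed in by hand to match what the grammar builds for the last example:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a parsed tree back into text, wrapping
# every logical operation in parentheses.
sub render {
    my ($node) = @_;
    my ($type, @rest) = @$node;

    if ($type eq 'logicop') {
        my ($op, @operands) = @rest;
        # A single operand means a unary operator such as NOT.
        return "($op " . render($operands[0]) . ')' if @operands == 1;
        return '(' . render($operands[0]) . " $op " . render($operands[1]) . ')';
    }
    if ($type eq 'numeric' or $type eq 'variable') {
        my ($op, $left, $right) = @rest;
        return "$left $op $right";
    }
    if ($type eq 'postop') {
        my ($op, $var) = @rest;
        return "$var $op";
    }
    return $type;    # directives such as 'always'
}

my $tree = [ 'logicop', 'OR',
    [ 'postop', 'off', 'alarm' ],
    [ 'logicop', 'AND',
        [ 'variable', '>', 'temperature', 'minimum' ],
        [ 'logicop', 'OR',
            [ 'numeric', '<', 'temperature', '30' ],
            [ 'postop',  'on', 'alarm' ],
        ],
    ],
];

print render($tree), "\n";
# prints "(alarm off OR (temperature > minimum AND (temperature < 30 OR alarm on)))"
```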

I have barely scratched the surface of Parse::RecDescent and yet the grammar is already able to parse my little language. The resulting data structure will be treated by a different module to produce a closure that evaluates the condition even as variables change. But that shall be the topic of another post.
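
To give an idea of what that step could look like, here is a minimal sketch that turns a tree into a closure. It is not the module used in the application: the function compile_rule, the variable maximum, and the convention of passing current values as a hash reference are all assumptions made for this example, and post-operators are left out because they would also need the previous values.

```perl
use strict;
use warnings;

# Comparison operators of the little language mapped to Perl tests.
my %COMPARE = (
    '>' => sub { $_[0] >  $_[1] },
    '<' => sub { $_[0] <  $_[1] },
    '=' => sub { $_[0] == $_[1] },
);

# Hypothetical compiler: returns a closure that takes a hash reference
# of current variable values and returns true or false.
sub compile_rule {
    my ($node) = @_;
    my ($type, @args) = @$node;

    return sub { 1 } if $type eq 'always';
    return sub { 0 } if $type eq 'never';

    if ($type eq 'numeric' or $type eq 'variable') {
        my ($op, $left, $right) = @args;
        my $cmp = $COMPARE{$op} or die "Unknown operator: $op";
        # 'numeric' compares against a literal, 'variable' against
        # the current value of a second variable.
        return $type eq 'numeric'
            ? sub { $cmp->($_[0]{$left}, $right) }
            : sub { $cmp->($_[0]{$left}, $_[0]{$right}) };
    }

    if ($type eq 'logicop') {
        my ($op, @operands) = @args;
        my @tests = map { compile_rule($_) } @operands;
        return sub { !$tests[0]->(@_) }                   if $op eq 'NOT';
        return sub { $tests[0]->(@_) && $tests[1]->(@_) } if $op eq 'AND';
        return sub { $tests[0]->(@_) || $tests[1]->(@_) } if $op eq 'OR';
    }

    # Post-operators and any_change are not handled by this sketch.
    die "Unsupported node type: $type";
}

my $rule = compile_rule(
    [ 'logicop', 'AND',
        [ 'numeric',  '>', 'temperature', '30' ],
        [ 'variable', '<', 'temperature', 'maximum' ] ]
);

my %values = (temperature => 35, maximum => 40);
print $rule->(\%values) ? "send\n" : "skip\n";    # prints "send"
$values{temperature} = 25;
print $rule->(\%values) ? "send\n" : "skip\n";    # prints "skip"
```

The tree is walked only once, when the rule is compiled; each later test just calls the nested closures with the latest values.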

Final module

The parser explained here is part of an application, and it lives in a module. The full code is here:

package Machine::Interface::RuleParser;

use Parse::RecDescent;
use strict;
use warnings;
use v5.14;

#$::RD_ERRORS = 1;
#$::RD_WARN   = 1;
#$::RD_HINT   = 1;
#$::RD_TRACE  = 1;

my $grammar = <<'GRAMMAR';
    expression     : unaryop expression
                   { [ 'logicop', $item{unaryop}, $item{expression} ] }
                   | '(' expression ')' logicop expression
                   { [ 'logicop', $item{logicop}, $item[2], $item[5] ] }
                   | instruction logicop expression
                   { ['logicop', $item{logicop}, $item{instruction}, $item{expression}] }
                   | '(' expression ')'
                   { $item{expression} }
                   | instruction
                   { $item{instruction} }
    instruction    : variable operator number
                   { [ 'numeric', @item[2,1,3] ] }
                   | variable operator variable
                   { [ 'variable', @item[2,1,3] ] }
                   | variable postop
                   { ['postop', $item{postop}, $item{variable}] }
                   | directive
                   { [ $item{directive} ] } 
    logicop        : 'AND' | 'OR'
    unaryop        : 'NOT'
    directive      : 'always' | 'never' | 'any_change'
    operator       : '>' | '<' | '='
    postop         : 'changed' | 'triggered' | 'turned_off' | 'on' | 'off'
    number         : /[+-]?\d+(?:\.\d+)?/
    variable       : /[A-Za-z][A-Za-z0-9_]*/
GRAMMAR

my $parser = Parse::RecDescent->new($grammar);
die "Invalid grammar" unless defined $parser;

# Class method: parse a rule and return its data structure,
# or undef when the text does not match the grammar.
sub parse_rule {
    my ($class, $text) = @_;
    return $parser->expression($text);
}

1;
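
One caveat when calling a rule directly: a Parse::RecDescent rule succeeds as soon as it matches the beginning of the text, so trailing garbage after a valid expression goes unnoticed. A common remedy is a start rule that also demands the end of the input. The sketch below shows the idea on a cut-down grammar; the rule name full_expression is made up for this example and is not part of the module above:

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Cut-down grammar for illustration. The hypothetical start rule
# full_expression succeeds only if nothing but whitespace is left
# after the instruction.
my $grammar = <<'GRAMMAR';
    full_expression : instruction /^\Z/
                    { $item{instruction} }
    instruction     : variable operator number
                    { [ 'numeric', @item[2,1,3] ] }
    operator        : '>' | '<' | '='
    number          : /[+-]?\d+(?:\.\d+)?/
    variable        : /[A-Za-z][A-Za-z0-9_]*/
GRAMMAR

my $parser = Parse::RecDescent->new($grammar);
die "Invalid grammar" unless defined $parser;

for my $text ('temperature > 30', 'temperature > 30 degrees') {
    my $tree = $parser->full_expression($text);
    printf "%-28s %s\n", "'$text'", defined $tree ? 'accepted' : 'rejected';
}
```

With the end-of-input check in place, the second string should be rejected instead of being silently parsed as temperature > 30.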