simplestates_test.py:

Test the simplestates.py generic state machine

Status:draft
Date:2006-12-01
Copyright:2006 Guenter Milde. Released under the terms of the GNU General Public License (v. 2 or later)

Revised for Python3.

Abstract State Machine Class

First version of an abstract state machine

class SimpleStates1:
    """generic state machine acting on iterable data

    Class attributes
    init_state -- name of the first state_handler method
    """
    init_state = 'start'

Initialisation

  • sets the data object to the data argument.
  • remaining keyword arguments are stored as class attributes (or methods, if they are function objects) overwriting class defaults (a neat little trick I found somewhere on the net)
  • the state_handler attribute is set to the method named in init_state
def __init__(self, data, **keyw):
    """data   --  iterable data object
                  (list, file, generator, string, ...)
       **keyw --  all remaining keyword arguments are
                  stored as class attributes
    """
    self.data = data
    self.__dict__.update(keyw)

The special __iter__ method returns an iterator. This allows to use a class instance directly in an iteration loop. We define it as is a generator method that sets the initial state and then iterates over the data calling the state methods:

def __iter__(self):
    self.state_handler = getattr(self, self.init_state)
    for token in self.data:
        yield self.state_handler(token)

To use class instances as callable objects, we add a __call__ method:

def __call__(self):
    """Run state-machine and return tokens as a list"""
    return [token for token in self]

Example 1: A two-state machine sorting numbers

Our small example state machine subclasses the SimpleStates1 class:

class Example1(SimpleStates1):
    """A silly example two-state machine sorting numbers
    in the categories "low" (< 3) and "high" (>= 3).
    """

It will be completed by two state methods and a __str__ method.

State Methods

State methods are functions that are called to iterate over the data. They will typically

  • test the data token for a change of state indicator
  • return the data token after some processing

In our example, the low method switches to high (and calls it with the data token), if token is bigger than 3. If not, it returns “l(token)”:

def low(self, token):
    # print( "low(", token, ")", end=' ' )
    if token > 3:
        self.state_handler = self.high
        # backtracking
        return self.state_handler(token)
    return "l(%d)"%token

The high method switches to low, if token is bigger than 3. If not, it returns “h(token)”:

def high(self, token):
    # print( "high(", token, ")", end= ' ' )
    if token <= 3:
        self.state_handler = self.low
        # backtracking
        return self.state_handler(token)
    return "h(%d)"%token

Conversion of the class instance to a string is done by joining the list that is returned by a call to the instance with spaces:

def __str__(self):
    return " ".join(self())
Test

Testing is done with the nose test framework. This will collect and execute all test functions and methods (basically everything that starts or ends with “[Tt]est”). This is similar to the more known “py.test”.

We set up some test data:

testdata = [1, 2, 3, 4, 5, 4, 3, 2, 1]

and define a test function:

def test_Example1():
    statemachine = Example1(testdata, init_state='low')
    for result in statemachine:
        print( result, end=' ' )
    print()

    # Calling an instance should return a list of results
    print( statemachine() )
    assert statemachine() == ['l(1)','l(2)','l(3)',  # low numbers
                             'h(4)','h(5)','h(4)',  # high numbers
                             'l(3)','l(2)','l(1)']  # low again

    # Converting to a string should call the __str__ method::
    print( str(statemachine) )
    assert str(statemachine) == "l(1) l(2) l(3) h(4) h(5) h(4) l(3) l(2) l(1)"
Discussion

The sorting works as expected. However, as the state handlers get the data token by token, acting on subsequent tokens or tests that combine the knowledge of several tokens are hard to achieve.

An example would be a state handler that sums up the data tokens and returns the result if it exceeds a threshold.

Varied State Machine Class Template

The second version of an abstract state machine converts the test data to an iterator which is shared by the state methods.

There is no need to pass this on via arguments, as class methods share the class instances attributes (class variables).

We subclass our first version and modify to our needs:

class SimpleStates2(SimpleStates1):
    """second version of the abstract state machine class
    """

We add the initialisation of the data to the __iter__ method. The data is transformed into an iterator first.

def __iter__(self):
    self.data_iterator = iter(self.data)
    self.state_handler = getattr(self, self.init_state)
    # do not pass data tokens as argument
    # (state methods should call next(self.data_iterator) instead)
    while True:
        yield self.state_handler()

Iterators “use up” the data, so the state methods will always get a “new” token until the data is fully “used up” and StopIteration is raised aborting the iteration.

Doing the conversion from iterable to iterator in __iter__ and not in __init__ allows repeated iteration over the class instance (if the data is a list or a file and not already a generator) as the “used up” generator is replaced by a new one.

Example 2: Another two-state machine sorting numbers

Our small example state machine subclasses the SimpleStates2 class and adds 2 methods as state handlers.

class Example2(SimpleStates2):
    """An example two-state machine sorting numbers
    in the categories "low" (< 3) and "high" (>= 3).
    """
State methods

This time, the state methods will get the data tokens not as argument but take them from the data_iterator. Note that backtracking is impossible with a standard iterator. See below for the problem this causes for our sorting algorithm.

def low(self):
    # print( "low(", token, ")", end= ' ' )
    token = next(self.data_iterator)
    if token > 3:
        self.state_handler = self.high
    return "l(%d)"%token
def high(self):
    # print( "high(", token, ")", end= ' ' )
    token = next(self.data_iterator)
    if token <= 3:
        self.state_handler = self.low
    return "h(%d)"%token
Test

Define a second test function:

def test_Example2():
    statemachine = Example2(testdata, init_state='low')

Calling an instance should return a list of results. However, as we cannot backtrack on a normal iterator, the result is not as we expected: There is a “hysteresis” the “switch triggering” token is always processed by the “old” state:

print( statemachine() )
assert statemachine() == ['l(1)', 'l(2)', 'l(3)', # low numbers
                         'l(4)', 'h(5)', 'h(4)', # high numbers
                         'h(3)', 'l(2)', 'l(1)'] # low numbers
Discussion

Missing backtracks break our number sorting machine. The remedy is the use of an iterator with an appendleft() method (known from the dqueue() standard class). We will come to this in Example 4

OTOH, as the state methods do the fetching of data tokens themself, a state handler that sums up the data tokens and returns the result if it exceeds a threshold would be easy to implement. We will do this in our next example using state handler generators.

State Machine class using state_handler generators

The variations in StateMachine2 complicate the StateMachine design. They makes sense, however, if we use generated iterators to handle the states. No changes are needed to the abstract base class, so that Example 3 can build on StateMachine2:

class Example3(SimpleStates2):

Example 3: A two-state machine with state handler generators

State Generators

State Generators generate and return an iterator that will handle the next data token(s) if its .__next__() method is called. This is easily achieved with a for loop over self.data and the yield keyword.

def high_handler_generator(self):
    """Return an iterator, whose next() method
    returns "h(token)" and switches to `low`, if token > 3
    """
    for token in self.data_iterator:
        if token <= 3:
            self.state_handler = self.low
        yield "h(%d)"%token
#
def low_handler_generator(self):
    """Return an iterator, whose next() method sums up data tokens.
    If the sum exceeds 8, it is returned and the state
    switches to `high`.
    """
    sum = 0
    for token in self.data_iterator:
        sum += token
        if sum > 8:
            self.state_handler = self.high
            yield "s=%d"%sum
            sum = 0 # next iteration continues here
    # no more tokens but sum not reached
    yield "p=%d"%sum # partial sum

The iterator must instantiate the state-iterators before starting the iteration loop:

def __iter__(self):
    """Generate and return an iterator

    * convert `data` to an iterator
    * convert the state generators into iterators
    * (re) set the state_handler attribute to the init-state
    * pass control to the active states state_handler
      which should call and process next(self.data_iterator)
    """
    self.data_iterator = iter(self.data)
    self.high = self.high_handler_generator().__next__
    self.low = self.low_handler_generator().__next__
    # init state
    self.state_handler = getattr(self, self.init_state)
    # now start the iteration, aborts if data is empty
    while True:
        yield self.state_handler()
Test

Again define a test function that gets an instance of the Example3 class

def test_Example3():
    statemachine = Example3(testdata, init_state='low')

Calling statemachine() should iterate over the test data and return the processed values as list:

print( statemachine() )
assert statemachine() == ['s=10','h(5)','h(4)','h(3)', 'p=3']

Backtracking

the iterqueue module provides an “extensible” iterator with, e.g., an appendleft method to push back values:

from iterqueue import XIter

Thus we can prepend a non-processed data item to the data iterator for use by the next state handler

Example 4: A two-state machine with generators and backtracking

Again we start from the SimpleStates2 base class:

class Example4(SimpleStates2):
    """two-state machine with generators and backtracking
    """

Let the iterator wrap the data in an XIter instance with appendleft method:

def __iter__(self):
    """Generate and return an iterator

    * convert `data` to an iterator queue
    * convert the state generators into iterators
    * (re) set the state_handler attribute to the init-state
    * pass control to the active states state_handler
      which should call and process next(self.data_iterator)
    """
    self.data_iterator = XIter(self.data) # queue with `appendleft` method
    self.high = self.high_handler_generator().__next__
    self.low = self.low_handler_generator().__next__
    self.state_handler = getattr(self, self.init_state)
    # now start the iteration
    while True:
        yield self.state_handler()

Add state method generators that use the “backtracking” feature:

def high_handler_generator(self):
    """Return an iterator, whose next() method
    returns "h(token)" and switches to `low`, if token > 3
    """
    for token in self.data_iterator:
        # print( "high got", token )
        if token <= 3:
            # push back data token
            self.data_iterator.appendleft(token)
            # set the new state
            self.state_handler = self.low
            # return non-value indicating the state switch
            yield None
        else:
            yield "h(%d)"%token
#
def low_handler_generator(self):
    """Return an iterator, whose next() method
    returns "l(token)" and switches to `high`, if token <=3
    """
    for token in self.data_iterator:
        # print( "low got", token )
        if token > 3:
            self.data_iterator.appendleft(token) # push back
            # set and run the new state
            self.state_handler = self.high
            # alternatively, return the token processed by the new
            # state handler
            yield self.state_handler()
        else:
            yield "l(%d)"%token

The __str__ converter should ignore the switch-indicator:

def __str__(self):
    tokens = [token for token in self() if token != None]
    return " ".join(tokens)
Test

Again define a test function. This time with an instance of the Example4 class

def test_Example4():
    statemachine = Example4(testdata, init_state='low')

Calling statemachine() should iterate over the test data and return the processed values as list. If the state of the machine changes, the special “non-value” None is returned.

print( statemachine() ) # only printed if something goes wrong
assert statemachine() == ['l(1)', 'l(2)', 'l(3)',
                         'h(4)', 'h(5)', 'h(4)', None, # switch indicator
                         'l(3)', 'l(2)', 'l(1)']

Converting to a string should skip the None values:

print( statemachine )
assert str(statemachine) == "l(1) l(2) l(3) h(4) h(5) h(4) l(3) l(2) l(1)"
Discussion

The XIter class allows backtracking also in a state machine with state handlers acting on a common iterator object. The “high” and “low” handlers demonstrate two possible actions for the state-transition with backtrack: Either call the new state handler from the current one (like the low_handler_generator) or return a “non-value” that signifies that processing the data token did not produce any output data.

Using generators made the state handlers shorter and (once the concept of a generator is clear) easier. Further advantages of the generator concept are

  • internal variables are easily saved over subsequent invocations
  • no function-call overhead (not relevant in this example but maybe for a state machine that has to process long data lists.

Converting all state method generators with a generic function

In Example4, we had to redefine the __iter__ method to convert the method state generators into iterators. It would be nice if this could be done in the base class.

SimpleStates3 adds a generic function for this task that relies on a simple naming convention: functions whose name matches <state>_handler_generator should be converted to iterators and their .__next__() method stored as <state>.

class SimpleStates5(SimpleStates2):
    """generic state machine acting on iterable data
    """
    def _initialize_state_generators(self):
        """Generic function to initialise state handlers from generators

        functions whose name matches `[^_]<state>_handler_generator` should
        be converted to iterators and their `.__next__()` method stored as
        `<state>`.
        """
        suffix = "_handler_generator"
        shg_names = [name for name in dir(self)
                      if name.endswith(suffix)
                      and not name.startswith("_")]
        for name in shg_names:
            shg = getattr(self, name)
            print( shg )
            setattr(self, name[:-len(suffix)], shg().__next__)


    def __iter__(self):
        """Generate and return an iterator

        * convert `data` to an iterator queue
        * convert the state generators into iterators
        * (re) set the state_handler attribute to the init-state
        * pass control to the active states state_handler
          which should call and process next(self.data_iterator.next)
        """
        self.data_iterator = XIter(self.data) # queue with `appendleft` method
        self._initialize_state_generators()
        self.state_handler = getattr(self, self.init_state)
        # now start the iteration
        while True:
            yield self.state_handler()

Example 5

The next example combines the state handlers from Example 4 and the new class.:

class Example5(Example4, SimpleStates5):
    """one more example"""
    pass
Test

A function that has the generator-suffix but is prefixed with an underscore, should be skipped by the _initialize_state_generators method:

class Test_SimpleStates5:
    E5 = Example5(testdata)
    E5._bogus_handler_generator = "low"
    def test_initialize_state_generators(self):
        self.E5._initialize_state_generators()

A test function. This time with an instance of the Example5 class

def test_Example5():
    statemachine = Example5(testdata, init_state='low')
    print( statemachine.__dict__ )

Calling statemachine() should iterate over the test data and return the processed values as list. If the state of the machine changes, the special “non-value” None is returned.

print( statemachine() ) # only printed if something goes wrong
assert statemachine() == ['l(1)', 'l(2)', 'l(3)',
                         'h(4)', 'h(5)', 'h(4)', None, # switch indicator
                         'l(3)', 'l(2)', 'l(1)']

Converting to a string should skip the None values:

print( statemachine )
assert str(statemachine) == "l(1) l(2) l(3) h(4) h(5) h(4) l(3) l(2) l(1)"

Putting it together

The file simplestates.py contains the full definition of the SimpleStates5 class in a self-contained version.

Example 6

The final Example is used to test whether we have put it together well. It subclasses SimpleStates and adds state method generators for “high” and “low”:

import simplestates
class Example6(simplestates.SimpleStates):
    """two-state machine with generators and backtracking
    """
    def high_handler_generator(self):
        """Return an iterator, whose next() method
        returns "h(token)" and switches to `low`, if token > 3
        """
        for token in self.data_iterator:
            # print( "high got", token )
            if token <= 3:
                # push back data token
                self.data_iterator.appendleft(token)
                # set the new state
                self.state_handler = self.low
                # return the token processed by the new state handler
                yield self.state_handler()
            else:
                yield "h(%d)"%token
    #
    def low_handler_generator(self):
        """Return an iterator, whose next() method
        returns "l(token)" and switches to `high`, if token <=3
        """
        for token in self.data_iterator:
            # print( "low got", token )
            if token > 3:
                self.data_iterator.appendleft(token) # push back
                # set and run the new state
                self.state_handler = self.high
                # return the token processed by the new state handler
                yield self.state_handler()
            else:
                yield "l(%d)"%token
Test

In order not to make it dependent on the iterqueue module, the final SimpleStates does not wrap the data in an XIter instance. This step should be done at the instantiation of a state machine.

def test_Example5():
    statemachine = Example5(XIter(testdata), init_state='low')
    print( statemachine.__dict__ )

Calling statemachine() should iterate over the test data and return the processed values as list:

print( statemachine() ) # only printed if something goes wrong
# reset the data iterator as it is "used up" now
statemachine.data = XIter(testdata)
assert statemachine() == ['l(1)', 'l(2)', 'l(3)',
                         'h(4)', 'h(5)', 'h(4)', None,
                         'l(3)', 'l(2)', 'l(1)']

Index

generator:A function with a yield keyword. Calling this function will return an iterator
iterator:An object with a next() method. Calling next(<iterator>) will (typically) return one data token (list element, line in a file, …). If there is no more data the StopIteration exception is raised.

Command line usage

running this script should explore it in the “nose” test framework:

if __name__ == "__main__":
    import nose, doctest
    # first run any doctests
    doctest.testmod()
    # then run nose on this module
    nose.runmodule() # requires nose 0.9.1