An actor is an entity with its own lifetime, whose job is to react to incoming messages. Here "react" means producing side effects (logging, blinking LEDs, etc.) or sending new messages.
Here is a diagram of the normal actor life cycle, where an ellipse is an actor state and a box is an actor method.
Usually, an actor reacts to user-defined messages when it is in the operational state.
An actor has to subscribe to particular message type(s) in order to react to them. The subscription is performed by overriding the initialize() method. As there are framework messages too, a user-defined actor must invoke the ActorBase::initialize() method to allow the framework to subscribe to them.
An actor does not react to messages when it is in the off state. Invoking the ActorBase::advance_stop() method puts it into that state. An actor can override the advance_stop() method and postpone the ActorBase::advance_stop() invocation until it is ready, e.g. until the underlying hardware is off.
Likewise, the advance_init() method can be overridden and the ActorBase::advance_init() invocation can be postponed until the underlying hardware is ready.
The advance_start() method plays a slightly different role: through it the supervisor (see below) signals that everything is ready to start (i.e. all sibling actors belonging to the same supervisor are ready) and that it is safe to send messages to other actors, as they are operational. In other words, it is the cross-actor synchronization point. When overriding it, ActorBase::advance_start() must be invoked to change the state.
All advance_* methods are invoked automatically upon receiving rotor-light messages; it is incorrect to invoke them manually.
An actor is designed to be recyclable, i.e. when it is shut down (off state), it can be started again. If an actor has default values which are changed during its lifetime, the overridden initialize() method is the proper place to reset them.
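The life cycle described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyActor, LedActor, hardware_ready, blinks and on_hardware_ready() are hypothetical names, not part of rotor-light, and a caller plays the role of the framework here (in the real framework the advance_* methods are triggered by messages and must not be invoked manually).

```cpp
#include <cassert>

// Toy model only: these types are hypothetical and are not rotor-light's
// ActorBase. The state names follow the text above.
enum class State { off, initializing, initialized, operational };

struct ToyActor {
    State state = State::off;
    virtual ~ToyActor() = default;

    // In the real framework these are triggered by framework messages;
    // in this sketch a caller invokes them directly.
    virtual void initialize() { state = State::initializing; }
    virtual void advance_init() {
        assert(state == State::initializing);
        state = State::initialized;
    }
    virtual void advance_start() {
        assert(state == State::initialized);
        state = State::operational;
    }
    virtual void advance_stop() { state = State::off; }
};

struct LedActor : ToyActor {
    bool hardware_ready = false;  // hypothetical hardware flag
    int blinks = 0;               // value that changes during the lifetime

    void initialize() override {
        blinks = 0;  // reset defaults, so a restarted actor starts afresh
        ToyActor::initialize();
    }

    void advance_init() override {
        // Postpone the base call until the hardware reports readiness.
        if (hardware_ready) ToyActor::advance_init();
    }

    // Hypothetical callback, e.g. invoked from an interrupt handler.
    void on_hardware_ready() {
        hardware_ready = true;
        ToyActor::advance_init();
    }

    void advance_stop() override {
        hardware_ready = false;  // pretend the hardware has been switched off
        ToyActor::advance_stop();
    }
};
```

In this sketch the actor stays in the initializing state until on_hardware_ready() is called, and a second initialize() after a stop resets blinks to its default.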
There might be a case when an actor cannot be initialized, e.g. due to hardware failure. In that case, ActorBase::initialize() must not be invoked, and the failure should be reported to the supervisor via code like:
send<ctx::thread, message::ChangeStateAck>(0, supervisor->get_id(), id, State::initialized, false);
Deciding what to do with the failure is the job of the supervisor.
Subscribe and initialize the default values in the initialize() method. Start messaging with other actors in advance_start(). Initialize the hardware by overriding the advance_init() method and, when you are done, invoke ActorBase::advance_init().
Similarly, shut down the hardware by overriding the advance_stop() method and, when you are done, invoke ActorBase::advance_stop().
All cross-actor communication is performed when everything is ready, i.e. when actors are in the operational state.
A supervisor is a special kind of actor which manages (or "orchestrates", if you like) the other actors (including supervisors) that it owns. A supervisor is responsible for initializing, shutting down and synchronizing actors, and it also handles actor failures.
Supervisor initialization (shutdown) is simple: it becomes initialized (off) when all its child actors are initialized (off). It is important that the supervisor waits until all its children become initialized (off).
When it is initialized, it advances itself into the operational state and dispatches the start message to all its children. This way it synchronizes the start of actors: an actor has the guarantee that, when it starts sending messages to other actors, they are operational.
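The start synchronization can be sketched with a toy supervisor that waits for every child before starting any of them. The types ToyChild and ToySupervisor and the method on_child_initialized() are hypothetical illustrations, not the rotor-light API; the real framework does this via messages.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only; none of these names come from rotor-light.
enum class State { off, initializing, initialized, operational };

struct ToyChild {
    State state = State::initializing;
};

struct ToySupervisor {
    State state = State::initializing;
    std::vector<ToyChild> children;

    explicit ToySupervisor(std::size_t count) : children(count) {}

    // Called when the child at `index` confirms its initialization.
    void on_child_initialized(std::size_t index) {
        assert(index < children.size());
        children[index].state = State::initialized;

        // Keep waiting while at least one child is not initialized yet.
        for (const ToyChild &child : children) {
            if (child.state != State::initialized) return;
        }

        // All children are ready: the supervisor becomes initialized and
        // immediately advances itself to operational, then starts everyone.
        state = State::operational;
        for (ToyChild &child : children) {
            child.state = State::operational;
        }
    }
};
```

Because no child is started before the last confirmation arrives, any child that begins sending messages can assume that its siblings are already operational.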
Dealing with actor failure is a somewhat more complex task, as it depends on the actor state and on the fail policy defined in the actor.
The full list of fail policies is:
enum class FailPolicy {
    restart = 0b00000001,
    force_restart = 0b00000011,
    escalate = 0b00000100, // default
    force_escalate = 0b00001100,
    ignore = 0b00010000,
};
If an actor stops in the operational state, this is the normal case and the supervisor does nothing, unless the actor policy is force_restart or force_escalate, which start the restart and escalate procedures respectively.
The actor restart procedure is simple: send it the stop message (if the actor is not off), then initialize() it and wait for the initialization confirmation (it is done via ActorBase::advance_init()).
The actor escalate procedure is simple too: the supervisor stops all actors and then shuts itself down. If it was the root supervisor, it also exits from the ::process() messages method, as there are no more messages. Usually, that then leads to exiting from the main() function of the firmware, which causes a hardware restart.
If an actor fails in the initializing state and its policy is restart, then the supervisor performs the actor restart procedure.
If an actor fails in the initializing state and its policy is escalate, then the supervisor performs the escalate procedure.
If an actor has the ignore policy and it stops, the supervisor continues to perform its normal activity (i.e. waiting for initialization/shutdown of other actors).
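The per-state rules above can be condensed into one decision function. This is a minimal sketch of one reading of those rules, not the framework's implementation: State, Action and on_actor_down() are hypothetical names, and the policies are compared by value rather than by inspecting the individual bits.

```cpp
// FailPolicy is copied from the listing above; everything else in this
// block is a hypothetical illustration.
enum class FailPolicy {
    restart = 0b00000001,
    force_restart = 0b00000011,
    escalate = 0b00000100,  // default
    force_escalate = 0b00001100,
    ignore = 0b00010000,
};

enum class State { off, initializing, initialized, operational };
enum class Action { none, restart, escalate };

// Decide what the supervisor does when a child goes down, given the state
// the child was in before and its fail policy.
constexpr Action on_actor_down(State state_before, FailPolicy policy) {
    // force_* policies apply regardless of the actor state.
    if (policy == FailPolicy::force_restart) return Action::restart;
    if (policy == FailPolicy::force_escalate) return Action::escalate;

    // Stopping while operational is the normal case: nothing to do.
    if (state_before == State::operational) return Action::none;

    // A failure during initialization follows the non-forced policy.
    if (state_before == State::initializing) {
        if (policy == FailPolicy::restart) return Action::restart;
        if (policy == FailPolicy::escalate) return Action::escalate;
    }

    // ignore policy (or any other combination): carry on as usual.
    return Action::none;
}

static_assert(on_actor_down(State::operational, FailPolicy::escalate) ==
                  Action::none,
              "a normal stop is not a failure");
```

Since the function is constexpr, the table of outcomes can be checked at compile time with static_assert, which costs nothing at run time on a microcontroller.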
NB: the fail policy can be changed by the actor itself during its lifetime, e.g. to escalate the failure after 3 failed initializations.
The default policy of an actor is escalate.
A "failed to shut down" condition is meaningless for the framework, i.e. the supervisor always waits until it receives the child actor's shutdown confirmation.
The force_* policies "ignore" the actor state and instruct the supervisor to perform the appropriate procedure. The restart, escalate and ignore policies make sense only for the operational actor state.
The underlying idea is: if an actor fails to initialize (due to hardware failure), give it another chance by restarting it afresh. Maybe a few times. Maybe repeatedly.
A supervisor is designed to be composable, i.e. it can contain other supervisors as its children, forming a hierarchy of actors.
Why would you need that, instead of having a "flat" list of actors and just messaging? It is needed to handle failure escalation: a supervisor groups related actors, and if one of them fails, it restarts the whole group, maybe a few times, until it escalates the failure upstream, i.e. to the root supervisor.
If the root supervisor fails too... well, it seems we have tried the best we can, and the only radical remedy left is to restart the whole board (i.e. exit from the main() entry point).
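The escalation chain can be sketched as a walk up the supervisor hierarchy. The sketch below assumes a per-supervisor restart budget, which is a hypothetical simplification: how many restarts are attempted before escalating is decided by the actors' fail policies, and ToySupervisor, restarts_left and on_failure() are not rotor-light names.

```cpp
#include <cassert>

// Toy model only; none of these names come from rotor-light.
enum class Outcome { restarted, board_restart };

struct ToySupervisor {
    ToySupervisor *parent = nullptr;  // nullptr marks the root supervisor
    int restarts_left = 0;            // hypothetical restart budget
    int restarts_done = 0;

    // Handle a failure inside this supervisor's group of actors.
    Outcome on_failure() {
        assert(restarts_left >= 0);
        if (restarts_left > 0) {
            // Give the whole group another chance.
            --restarts_left;
            ++restarts_done;
            return Outcome::restarted;
        }
        // Budget exhausted: escalate upstream if there is a parent.
        if (parent != nullptr) return parent->on_failure();
        // The root has nothing left to try: leave main(), restart the board.
        return Outcome::board_restart;
    }
};
```

With a group budget of 2 and a root budget of 1, the first two failures restart the group, the third restarts the root's actors, and the fourth ends with a board restart.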
In rotor_light it is possible to have multiple queues. Queues are indexed from 0, as usual; however, queues are processed in reverse order, so a queue with a higher index is processed before any queue with a lower index. This means that the queue index is actually the queue priority.
using Queue = rl::Queue<Storage, 15, 5>;
In the example above, the master queue consists of two sub-queues: the first (index 0) with storage for 15 messages, and the second (index 1) with storage for 5 messages. As long as there is any message in the second queue, that queue is processed; only when it is empty are messages from the first queue processed.
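The processing order can be illustrated with a toy queue that always serves the highest-index non-empty sub-queue first. ToyQueue is a hypothetical stand-in, not rl::Queue: it stores plain integers and uses std::deque, so the fixed capacities (15 and 5) of the real queue are not modelled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>

// Toy stand-in for a master queue with N sub-queues; not rl::Queue.
template <std::size_t N> struct ToyQueue {
    std::array<std::deque<int>, N> queues;

    // Append a message to the sub-queue with the given index (priority).
    void put(std::size_t index, int message) {
        assert(index < N);
        queues[index].push_back(message);
    }

    // Pop the next message into `out`, scanning from the highest index
    // down to 0. Returns false when every sub-queue is empty.
    bool next(int &out) {
        for (std::size_t i = N; i-- > 0;) {
            if (!queues[i].empty()) {
                out = queues[i].front();
                queues[i].pop_front();
                return true;
            }
        }
        return false;
    }
};
```

If messages 10 and 11 are put into sub-queue 0 and messages 20 and 21 into sub-queue 1, in any interleaving, next() yields 20, 21, 10, 11: order is preserved within a sub-queue, while the higher index always wins between sub-queues.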