|
@@ -0,0 +1,145 @@
|
|
|
+# Conception
|
|
|
+
|
|
|
+## actor
|
|
|
+
|
|
|
+An `actor` is an entity with its own lifetime, whose job is to react to incoming messages.
|
|
|
+Here "react" means to produce side effects, like logging or blinking LEDs, or to send
|
|
|
+new messages.
|
|
|
+
|
|
|
+Here is a diagram of the normal actor life cycle, where an ellipse is an actor state and
|
|
|
+a box is an actor method.
|
|
|
+
|
|
|
+![actor lifecycle](actor-lifetime.png)
|
|
|
+
|
|
|
+Usually, an actor reacts to user-defined messages when it is in the `operational` state.
|
|
|
+An actor has to subscribe to particular message type(s) to react to such messages.
|
|
|
+The subscription is performed by overriding the `initialize()` method. As there
|
|
|
+are framework messages too, a user-defined actor **must** invoke the `ActorBase::initialize()`
|
|
|
+method to allow the framework to subscribe to them.
|
|
|
+
|
|
|
+An actor does not react to messages when it is in the `off` state.
|
|
|
+Invoking the `ActorBase::advance_stop()` method puts it into that state. An actor can override
|
|
|
+the `advance_stop()` method and postpone the `ActorBase::advance_stop()` invocation
|
|
|
+until it is ready, e.g. until the underlying hardware is off.
|
|
|
+
|
|
|
+Likewise, the `advance_init()` method can be overridden and the `ActorBase::advance_init()`
|
|
|
+invocation can be postponed until the underlying hardware is ready.
|
|
|
+
|
|
|
+The `advance_start()` method plays a slightly different role: it is a signal from the actor's
|
|
|
+**supervisor** (see below) that everything is ready to start, i.e. all of the actor's
|
|
|
+siblings belonging to the same supervisor are ready, and it is safe to send messages to other
|
|
|
+actors as they are `operational`. In other words, it is the *cross-actor synchronization
|
|
|
+point*. When overriding it, `ActorBase::advance_start()` must be invoked to change
|
|
|
+the state.
|
|
|
+
|
|
|
+An `actor` is designed to be recyclable, i.e. when it is shut down (`off` state), it can
|
|
|
+be started again. If an actor has default values that are changed during its lifetime,
|
|
|
+the overridden `initialize()` method is the proper place to reset the defaults.
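
The life cycle described above can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: the `ActorBase` below is a hypothetical, heavily simplified stand-in (it is not the framework's class), subscriptions are modelled as a plain counter, and `LedActor`, `hardware_ready` and `blink_count` are invented names.

```cpp
#include <cassert>

// States named in the text; the real framework may define more.
enum class State { off, initializing, initialized, operational };

// Hypothetical stand-in for the framework base class (assumption).
struct ActorBase {
    State state = State::off;
    int subscriptions = 0;

    virtual ~ActorBase() = default;

    // Models the framework subscribing to its own messages.
    virtual void initialize() {
        state = State::initializing;
        subscriptions = 1;
    }
    virtual void advance_init() { state = State::initialized; }
    virtual void advance_start() { state = State::operational; }
    virtual void advance_stop() { state = State::off; }
};

// Hypothetical user-defined actor driving some piece of hardware.
struct LedActor : ActorBase {
    bool hardware_ready = false;
    int blink_count = 0;

    void initialize() override {
        blink_count = 0;         // reset defaults: the actor may be restarted
        ActorBase::initialize(); // mandatory: lets the framework subscribe
        ++subscriptions;         // then subscribe to user-defined messages
    }

    void advance_init() override {
        // Postpone the base call until the hardware is ready.
        if (hardware_ready) ActorBase::advance_init();
    }

    void advance_stop() override {
        // Postpone the base call until the hardware is off.
        if (!hardware_ready) ActorBase::advance_stop();
    }
};
```

In this model, calling `advance_init()` while `hardware_ready` is `false` leaves the actor in `initializing`; calling it again once the hardware is ready moves it to `initialized`.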
|
|
|
+
|
|
|
+There might be a case when an actor cannot be initialized, e.g. due to a hardware
|
|
|
+failure. In that case, the `ActorBase::initialize()` **must not be invoked**, and
|
|
|
+the failure should be reported to its supervisor via code like:
|
|
|
+
|
|
|
+```
|
|
|
+send<message::ChangeStateAck>(0, supervisor->get_id(), id, State::initialized, false);
|
|
|
+```
|
|
|
+
|
|
|
+What to do with the failure is up to the *supervisor*.
|
|
|
+
|
|
|
+### actor summary
|
|
|
+
|
|
|
+Subscribe and initialize the default values in the `initialize()` method. Start messaging
|
|
|
+with other actors in `advance_start()`. Initialize the hardware in the overridden
|
|
|
+`advance_init()` method and, when you are done, invoke `ActorBase::advance_init()`.
|
|
|
+Similarly, shut down the hardware in the overridden `advance_stop()` method and, when
|
|
|
+you are done, invoke `ActorBase::advance_stop()`.
|
|
|
+
|
|
|
+All cross-actor communication is performed once everything is ready, i.e. when actors are
|
|
|
+in the `operational` state.
|
|
|
+
|
|
|
+## supervisor
|
|
|
+
|
|
|
+A supervisor is a special kind of actor that manages (or "orchestrates", if you like)
|
|
|
+other actors (including other supervisors) that it owns. A supervisor is responsible for
|
|
|
+**initializing**, **shutting down** and **synchronizing** actors, and also handles
|
|
|
+**actor failures**.
|
|
|
+
|
|
|
+Supervisor initialization (shutdown) is simple: it becomes `initialized` (`off`)
|
|
|
+when all its child actors are `initialized` (`off`). It is important that the supervisor
|
|
|
+**waits** until all its children become `initialized` (`off`).
|
|
|
+
|
|
|
+When it is `initialized`, it advances itself into the `operational` state and dispatches
|
|
|
+a start message to all its children. This way it **synchronizes** the start of actors:
|
|
|
+an actor has a guarantee that, when it starts sending messages to other actors, they
|
|
|
+are `operational`.
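
The waiting and synchronized start described above can be sketched as follows. This is a minimal model, not the framework's implementation: `Child`, `Supervisor<N>` and `on_child_initialized()` are hypothetical names, and the start "message" is modelled as a direct state change rather than a queued message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

enum class State { off, initializing, initialized, operational };

// Hypothetical child record: only the state matters for this sketch.
struct Child {
    State state = State::initializing;
};

// Hypothetical supervisor owning a fixed number of children.
template <std::size_t N> struct Supervisor {
    std::array<Child, N> children{};
    State state = State::initializing;

    // Called whenever a child confirms its initialization.
    void on_child_initialized(std::size_t index) {
        children[index].state = State::initialized;
        for (const Child &c : children) {
            if (c.state != State::initialized) return; // keep waiting
        }
        state = State::initialized;
        start_all();
    }

  private:
    // The supervisor becomes operational and starts every child at once.
    void start_all() {
        state = State::operational;
        for (Child &c : children) c.state = State::operational;
    }
};
```

With three children, the supervisor stays in `initializing` after two confirmations and only the third one moves the supervisor and all children to `operational`.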
|
|
|
+
|
|
|
+### fail policies
|
|
|
+
|
|
|
+Dealing with an actor failure is a somewhat more complex task, as it depends on the actor state
|
|
|
+and on the fail policy defined in the actor.
|
|
|
+
|
|
|
+The full list of fail policies is:
|
|
|
+
|
|
|
+```
|
|
|
+enum class FailPolicy {
|
|
|
+ restart = 0b00000001,
|
|
|
+ force_restart = 0b00000011,
|
|
|
+ escalate = 0b00000100, // default
|
|
|
+ force_escalate = 0b00001100,
|
|
|
+ ignore = 0b00010000,
|
|
|
+};
|
|
|
+```
|
|
|
+
|
|
|
+If an actor stops in the `operational` state, this is the normal case, and the supervisor does
|
|
|
+nothing, unless the actor policy is `force_restart` or `force_escalate`, which
|
|
|
+trigger the restart and escalate procedures respectively.
|
|
|
+
|
|
|
+The actor **restart procedure** is simple: send it a stop message (if the actor wasn't `off`),
|
|
|
+and then `initialize()` it and wait for the initialization confirmation (it is sent via
|
|
|
+`ActorBase::advance_init()`).
|
|
|
+
|
|
|
+The actor **escalate procedure** is simple too: the supervisor stops all its actors, and then shuts
|
|
|
+itself down too. If it was the root supervisor, it also exits from the message-processing
|
|
|
+method `::process()`, as there are no more messages. Usually, that leads to exiting
|
|
|
+from the `main()` function of the firmware, which causes a hardware restart.
|
|
|
+
|
|
|
+If an actor fails in the `initializing` state and its policy is `restart`, then the supervisor
|
|
|
+performs the actor restart procedure.
|
|
|
+
|
|
|
+If an actor fails in the `initializing` state and its policy is `escalate`, then the supervisor
|
|
|
+performs the escalate procedure.
|
|
|
+
|
|
|
+If an actor has the `ignore` policy and it stops, the supervisor continues to perform its
|
|
|
+normal activity (e.g. waiting for the initialization/shutdown of other actors).
|
|
|
+
|
|
|
+NB: the fail policy can be changed by the actor itself during its lifetime, e.g. to escalate
|
|
|
+the failure after 3 failed initializations.
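
A minimal sketch of such a policy switch is shown below. The `FailPolicy` enum is copied from the listing above; everything else (`RetryingActor`, `fail_policy`, `failed_inits`, `on_init_failure()`) is hypothetical, since the text does not say how an actor stores or changes its policy.

```cpp
#include <cassert>

enum class FailPolicy {
    restart = 0b00000001,
    force_restart = 0b00000011,
    escalate = 0b00000100, // default
    force_escalate = 0b00001100,
    ignore = 0b00010000,
};

// Hypothetical actor fragment: asks for restarts at first and switches
// to escalation after three failed initializations.
struct RetryingActor {
    static constexpr int max_attempts = 3;

    FailPolicy fail_policy = FailPolicy::restart;
    int failed_inits = 0;

    // Assumed to be called by the actor each time its initialization fails,
    // before the failure is reported to the supervisor.
    void on_init_failure() {
        if (++failed_inits >= max_attempts) {
            fail_policy = FailPolicy::escalate;
        }
    }
};
```

The counter is deliberately not reset in `initialize()` here, otherwise every restart would start counting from zero again.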
|
|
|
+
|
|
|
+The **default policy** of an actor is `escalate`.
|
|
|
+
|
|
|
+A "failed to shut down" condition is meaningless for the framework, i.e. the supervisor always
|
|
|
+waits until it receives the child actor's shutdown confirmation.
|
|
|
+
|
|
|
+### fail policies summary
|
|
|
+
|
|
|
+The `force_*` policies "ignore" the actor state and instruct the supervisor to perform the appropriate
|
|
|
+procedure. `restart`, `escalate` and `ignore` make sense only for the `initializing` actor
|
|
|
+state.
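
The rules of this section can be condensed into a single decision function. The sketch below is an interpretation, not the framework's code: `Action`, `has_bits()` and `on_child_down()` are hypothetical names, and reading the second bit of each `force_*` value as a "force" flag is an assumption derived from the enum values. States not covered by the text are treated as "do nothing".

```cpp
#include <cassert>

enum class FailPolicy {
    restart = 0b00000001,
    force_restart = 0b00000011,
    escalate = 0b00000100, // default
    force_escalate = 0b00001100,
    ignore = 0b00010000,
};

enum class State { off, initializing, initialized, operational };

// Hypothetical result type: what the supervisor does about a stopped child.
enum class Action { none, restart, escalate };

constexpr bool has_bits(FailPolicy policy, unsigned mask) {
    return (static_cast<unsigned>(policy) & mask) != 0;
}

// state_at_stop is the state the child was in when it stopped or failed.
constexpr Action on_child_down(State state_at_stop, FailPolicy policy) {
    constexpr unsigned force_bits = 0b00001010;   // assumed "force" flags
    constexpr unsigned restart_bit = 0b00000001;  // shared by restart and force_restart
    constexpr unsigned escalate_bit = 0b00000100; // shared by escalate and force_escalate

    const bool forced = has_bits(policy, force_bits);
    // Plain policies act only on failures during initialization.
    if (!forced && state_at_stop != State::initializing) return Action::none;
    if (has_bits(policy, restart_bit)) return Action::restart;
    if (has_bits(policy, escalate_bit)) return Action::escalate;
    return Action::none; // ignore
}
```

For example, a child with the plain `restart` policy that stops while `operational` yields `Action::none`, while the same stop under `force_restart` yields `Action::restart`.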
|
|
|
+
|
|
|
+## composition
|
|
|
+
|
|
|
+The underlying idea is: if an actor fails to initialize (e.g. due to a hardware failure),
|
|
|
+give it another chance by restarting it afresh. Maybe a few times. Maybe repeatedly.
|
|
|
+
|
|
|
+
|
|
|
+The supervisor was designed to be composable, i.e. it can contain other supervisors as
|
|
|
+its children, forming a *hierarchy of actors*.
|
|
|
+
|
|
|
+Why would you need that instead of having a "flat" list of actors and just messaging?
|
|
|
+It is needed to handle failure escalation: a supervisor groups related actors, and if one
|
|
|
+of them fails, it restarts the whole group, maybe a few times, before escalating the failure
|
|
|
+upstream, e.g. to the root supervisor.
|
|
|
+
|
|
|
+If the root supervisor fails too... OK, it seems that we have tried the best we can,
|
|
|
+and the final radical remedy left is just to *restart the whole board* (i.e. exit from
|
|
|
+`main()` entry point).
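
The escalation chain can be pictured with a small model. It is a sketch under stated assumptions: this `Supervisor` struct, its restart budget `restarts_left` and the `board_restart_requested` flag are hypothetical and only mimic the behaviour described above (restart the group a few times, then pass the failure upstream).

```cpp
#include <cassert>

// Hypothetical supervisor node in a hierarchy.
struct Supervisor {
    Supervisor *parent = nullptr;  // nullptr marks the root supervisor
    int restarts_left = 0;         // how many group restarts are still allowed
    int group_restarts = 0;        // how many group restarts were performed
    bool board_restart_requested = false;

    void on_child_failure() {
        if (restarts_left > 0) {
            // Give the whole group another chance.
            --restarts_left;
            ++group_restarts;
        } else if (parent != nullptr) {
            // Budget exhausted: escalate the failure upstream.
            parent->on_child_failure();
        } else {
            // The root has nowhere to escalate: leave main(), restart the board.
            board_restart_requested = true;
        }
    }
};
```

With a root that allows no restarts and a child supervisor that allows two, the first two failures restart the group and the third one reaches the root and requests a board restart.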
|