summaryrefslogtreecommitdiffstats
path: root/docs/signaling/architecture.md
diff options
context:
space:
mode:
Diffstat (limited to 'docs/signaling/architecture.md')
-rw-r--r--docs/signaling/architecture.md467
1 files changed, 467 insertions, 0 deletions
diff --git a/docs/signaling/architecture.md b/docs/signaling/architecture.md
new file mode 100644
index 0000000..823a6d2
--- /dev/null
+++ b/docs/signaling/architecture.md
@@ -0,0 +1,467 @@
+---
+title: AGL - Message Signaling Architecture
+author: Fulup Ar Foll (IoT.bzh)
+date: 2016-06-30
+
+categories: architecture, appfw
+tags: architecture, signal, message
+layout: techdoc
+
+---
+
+**Table of Content**
+
+1. TOC
+{:toc}
+
+## Context
+
+Automotive applications need to understand in real time the context in which
+vehicles operate. In order to do so, it is critical for automotive application
+to rely on a simple, fast and secure method to access data generated by the
+multiple sensors/ECU embedded in modern cars.
+
+This signaling problem is neither new, neither unique to the automotive and
+multiple solutions often described as Message Broker or Signaling Gateway have
+been around for a while.
+
+The present document is the now implemented since AGL Daring Dab version, to
+handle existing signaling/message in a car. It relies on [[APbinder]]
+binder/bindings model to minimize complexity while keeping the system fast
+around secure. We propose a model with multiple transport options and a full set
+of security feature to protect the service generating the signal as well as
+consuming them.
+
+## Objectives
+
+Our objectives are solving following 3 key issues:
+
+1. reduce as much as possible the amount of exchanged data to the meaningful
+ subset really used by applications
+1. offer a high level API that obfuscates low level and proprietary interface to
+ improve stability in time of the code
+1. hide specificities of low level implementation as well as the chosen
+ deployment distribution model.
+
+To reach first objective, events emission frequency should be controlled at the
+lowest level it possibly can. Aggregation, composition, treatment, filtering of
+signals should be supported at software level when not supported by the hardware.
+
+Second objectives of offering long term stable hight level API while allowing
+flexibility in changing low level implementation may look somehow conflicting.
+Nevertheless by isolating low level interface from high level and allowing
+dynamic composition it is possible to mitigate both objectives.
+
+## Architecture
+
+Good practice is often based on modularity with clearly separated components
+assembled within a common framework. Such modularity ensures separation of
+duties, robustness, resilience and achievable long term maintenance.
+
+This document uses the term "**Service**" to define a specific instance of this
+proposed common framework used to host a group of dedicated separated components
+that handle targeted signals/events. Each service exposes to services/applications
+the signals/events it is responsible for.
+
+As an example, a CAN service may want to mix non-public proprietary API with
+CANopen compatible devices while hiding this complexity to applications. The
+goal is on one hand to isolate proprietary piece of code in such a way that it
+is as transparent as possible for the remaining part of the architecture. On a
+second hand isolation of code related to a specific device provides a better
+separation of responsibilities, keeping all specificity related to a given
+component clearly isolated and much easier to test or maintain. Last but not
+least if needed this model may also help to provide some proprietary code
+directly as binary and not as source code.
+
+Communicating between the car and regular apps should be done using a 2 levels
+AGL services which have two distincts roles:
+
+- low level should handle communication with CAN bus device (read, decoding,
+ basic and efficient filtering, caching, ...)
+- high level should handle more complex tasks (signals compositions, complex
+ algorythms like Kalman filter, business logic...)
+
+![image](./images/signal-service-arch.svg "Signal Agent Architecture")
+
+To do so, the choice has been to use a similar architecture than [[OpenXC]], a
+Ford project. Principle is simple, from a JSON file that describes all CAN
+signals wanted to be handled, in general a conversion from a **dbc** file, AGL
+generator convert it to a C++ source code file. This file which in turn is used
+as part of the low level CAN service which can now be compiled. This service
+reads, decodes and serves this CAN signals to a high level CAN service that
+holds business logic and high level features like described is the above
+chapter.
+
+![image](./images/can-generator.svg "AGL CAN generator")
+
+While in some cases it may be chosen to implement a single service responsible
+for everything, other scenarii may chose to split responsibility between
+multiple services. Those multiple services may run on a single ECU or on
+multiple ECUs. Chosen deployment distribution strategy should not impact the
+development of components responsible for signals/events capture. As well as it
+should have a loose impact on applications/services consuming those events.
+
+A distributed capable architecture may provide multiple advantages:
+
+- it avoids to concentrate complexity in a single big/fat component.
+- it leverages naturally multiple ECUs and existing network architecture
+- it simplifies security by enabling isolation and sandboxing
+- it clearly separates responsibilities and simplifies resolution of conflicts
+
+Distributed architecture has to be discussed and about now is not fully
+implemented. Low level CAN service isn't fully functional nor tested to assume
+this feature but its architecture let the possibility open and will be
+implemented later.
+
+![image](./images/distributed-arch.png "Distributed Architecture")
+
+Performance matters. There is a trade-off between modularity and efficiency.
+This is specially critical for signals where propagation time from one module to
+the other should remain as short as possible and furthermore should consume as
+little computing resources as possible.
+
+A flexible solution should provide enough versatility to either compose modules
+in separate processes; either chose a model where everything is hosted within a
+single process. Chosen deployment model should have minor or no impact on
+development/integration processes. Deployment model should be something easy to
+change, it should remain a tactical decision and never become a structuring
+decision.
+
+Nevertheless while grouping modules may improve performance and reduce resource consumption, on the other hand,
+it has a clear impact on security. No one should forget that some signals have very different level of security from other ones.
+Mixing everything within a single process makes all signal's handling within a single security context.
+Such a decision may have a significant impact on the level on confidence one may have in the global system.
+
+Providing such flexibility constrains the communication model used by modules:
+
+- The API of integration of the modules (the API of the framework) that enables
+ the connection of modules must be independent of the implementation of
+ the communication layer
+- The communication layer must be as transparent as possible, its
+ implementation shouldn't impact how it is used
+- The cost of the abstraction for modules grouped in a same process
+ must be as little as possible
+- The cost of separating modules with the maximum of security must remain as
+ minimal as possible
+
+Another point impacting performance relates to a smart limitation on the number
+of emitted signals. Improving the cost of sending a signal is one thing,
+reducing the number of signals is an other one. No one should forget that the
+faster you ignore a useless signal the better it is. The best way to achieve
+this is by doing the filtering of useless signal as close as possible of the
+component generating the signal and when possible directly at the hardware level.
+
+To enable the right component to filter useless signals, consumer clients must
+describe precisely the data they need. A filter on frequency is provided since
+Daring Dab version, as well as minimum and maximum limits. These filters can be
+specified at subscription time. Also, any data not required by any client should
+at the minimum never be transmitted. So only changed data is transmitted and if
+another service needs to receive at a regular time, it has to assume that if no
+events are received then it is that the value hasn't change. Furthermore when
+possible then should even not be computed at all, a CAN signal received on
+socket is purely ignored if no one asks for it.
+
+Describing expected data in a precise but nevertheless simple manner remains a
+challenge. It implies to manage:
+
+- requested frequency of expected data
+- accuracy of data to avoid detection of inaccurate changes
+- when signaling is required (raising edge, falling edge,
+ on maintained state, ...),
+- filtering of data to avoid glitches and noise,
+- composition of signals both numerically and logically (adding,
+ subtracting, running logical operators like AND/OR/XOR, getting the mean, ...)
+- etc...
+
+It is critical to enable multiple features in signal queries to enable modules
+to implement the best computing method. The best computing method may have an
+impact on which device to query as well as on which filters should be applied.
+Furthermore filtering should happen as soon as possible and obviously when
+possible directly at hardware level.
+
+### Transport Solutions
+
+D-Bus is the standard choice for Linux, nevertheless it has some serious
+performance limitation due to internal verbosity. Nevertheless because it is
+available and pre-integrated with almost every Linux component, D-Bus may still
+remains an acceptable choice for signal with low rate of emission (i.e. HMI).
+
+For a faster communication, Jaguar-Land-Rover proposes a memory shared signal
+infrastructure. Unfortunately this solution is far from solving all issues and
+has some drawbacks. Let check the open issues it has:
+
+- there is no management of what requested data are. This
+ translate in computing data even when not needed.
+- on top of shared memory, an extra side channel is required for processes
+ to communicate with the daemon.
+- a single shared memory implies a lot of concurrency handling. This might
+ introduce drawbacks that otherwise would be solved through communication
+ buffering.
+
+ZeroMQ, NanoMSG and equivalent libraries focused on fast communication. Some
+(e.g. ZeroMQ) come with a commercial licensing model when others (e.g. NanoMSG)
+use an open source licensing. Those solutions are well suited for both
+communicating inside a unique ECU or across several ECUs. However, most of them
+are using Unix domain sockets and TCP sockets and typically do not use shared
+memory for inter-process communication.
+
+Last but not least Android binder, Kdbus and other leverage shared memory, zero
+copy and sit directly within Linux kernel. While this may boost information
+passing between local processes, it also has some limitations. The first one is
+the non support of a multi-ECU or vehicle to cloud distribution. The second one
+is that none of them is approved upstream in kernel tree. This last point may
+create some extra burden each time a new version of Linux kernel is needed or
+when porting toward a new hardware is required.
+
+### Query and Filtering Language
+
+Description language for filtering of expected data remains an almost green
+field where nothing really fit signal service requirements. Languages like
+Simulink or signal processing graphical languages are valuable modelling tools.
+Unfortunately they cannot be inserted in the car. Furthermore those languages
+have many features that are not useful in proposed signal service context and
+cost of integrating such complex languages might not be justified for something
+as simple as a signal service. The same remarks apply for automation languages.
+
+Further investigations leads to some specifications already presents like the
+one from Jaguar Land Rover [[VISS]], for **Vehicule Information Service
+Specification** and another from Volkwagen AG named [[ViWi]], stand for
+**Volkwagen Infotainment Web Interface**. Each ones has their differences and
+provides different approach serving the same goal:
+
+| VISS | ViWi |
+|---------------------------------------------------------------|-----------------------------------------------------------------|
+| Filtering on node (not possible on several nodes or branches) | Describe a protocol |
+| Access restrictions to signals | Ability to specify custom signals |
+| Use high level development languages | RESTful HTTP calls |
+| One big Server that handle requests | Stateless |
+| Filtering | Filtering, sorting |
+| Static signals tree not extensible [[VSS]] | Use JSON objects to communicate |
+| Use of AMB ? | Identification of resources may be a bit heavy going using UUID |
+| Use of Websocket | |
+
+About **[[VISS]]** specification, the major problem comes from the fact that
+signals are specified under the [[VSS]], **Vehicle Signal Specification**. So,
+problem is that it is difficult, if not impossible, to make a full inventory
+of all signals existing for each car. More important, each evolution in signals
+must be reported in the specification and it is without seeing the fact that
+car makers have their names and set of signals that would mostly don't
+comply with the [[VSS]]. VISS doesn't seems to be an valuable way to handle
+car's signals, a big component that responds requests, use of **Automotive
+Message Broker** that use DBus is a performance problem. Fujitsu Ten recent
+study[[1]] highlights that processor can't handle an heavy load on CAN bus and
+that Low level binding adopted for AGL is about 10 times[[2]] less impact on
+performance.
+
+## Describing Signal Subscriptions using JSON
+
+JSON is a rich structured representation of data. For requested data, it allows
+the expression of multiple features and constraints. JSON is both very flexible
+and efficient. There are significant advantages in describing requested data at
+subscription time using a language like JSON. Another advantage of JSON is that
+no parser is required to analyse the request.
+
+Existing works exists to describe a signals that comes first from Vector with
+its proprietary database (`DBC`) which widely used in industry. Make a
+description based on this format appears to be a good solution and Open Source
+community already has existing tools that let you convert proprietary file
+format to an open one. So, a JSON description based on work from [[OpenXC]] is
+specified [here](https://github.com/openxc/vi-firmware/blob/master/docs/config/reference.rst)
+which in turn is used in Low level CAN service in AGL:
+
+```json
+{ "name": "example",
+ "extra_sources": [],
+ "initializers": [],
+ "loopers": [],
+ "buses": {},
+ "commands": [],
+ "0x3D9": {
+ "bus": "hs",
+ "signals": {
+ "PT_FuelLevelPct": {
+ "generic_name": "fuel.level",
+ "bit_position": 8,
+ "bit_size": 8,
+ "factor": 0.392157,
+ "offset": 0
+ },
+ "PT_EngineSpeed": {
+ "generic_name": "engine.speed",
+ "bit_position": 16,
+ "bit_size": 16,
+ "factor": 0.25,
+ "offset": 0
+ },
+ "PT_FuelLevelLow": {
+ "generic_name": "fuel.level.low",
+ "bit_position": 55,
+ "bit_size": 1,
+ "factor": 1,
+ "offset": 0,
+ "decoder": "decoder_t::booleanDecoder"
+ }
+ }
+ }
+}
+```
+
+From a description like the above one, low level CAN generator will output
+a C++ source file which let low level CAN service that uses it to handle such
+signal definition.
+
+## Naming Signal
+
+Naming and defining signals is something very complex. For example just
+***speed***, as a signal, is difficult to define.
+What unit is used (km/h, M/h, m/s, ...)?
+From which source (wheels, GPS, AccelMeter)?
+How was it captured (period of measure, instantaneous, mean, filtered)?
+
+In order to simplify application development we should nevertheless agree on
+some naming convention for key signals. Those names might be relatively complex
+and featured. They may include a unit, a rate, a precision, etc.
+
+How these names should be registered, documented and managed is out of scope of
+this document but extremely important and at some point in time should be
+addressed. Nevertheless this issue should not prevent from moving forward
+developing a modern architecture. Developers should be warned that naming is a
+complex task, and that in the future naming scheme should be redefined, and
+potential adjustments would be required.
+
+About Low level CAN signals naming a doted notation, like the one used by Jaguar
+Landrover, is a good compromise as it describe a path to an car element. It
+separates and organize names into hierarchy. From the left to right, you
+describe your names using the more common ancestor at the left then more you go
+to the right the more it will be accurate. Using this notation let you subscribe
+or unsubscribe several signals at once using a globbing expression.
+
+Example using OBD2 standard PID:
+
+```path
+engine.load
+engine.coolant.temperature
+fuel.pressure
+intake.manifold.pressure
+engine.speed
+vehicle.speed
+intake.air.temperature
+mass.airflow
+throttle.position
+running.time
+EGR.error
+fuel.level
+barometric.pressure
+commanded.throttle.position
+ethanol.fuel.percentage
+accelerator.pedal.position
+hybrid.battery-pack.remaining.life
+engine.oil.temperature
+engine.torque
+```
+
+Here you can chose to subscribe to all engine component using an expression
+like : `engine.*`
+
+## Reusing existing/legacy code
+
+About now provided services use:
+
+- **Low Level** [[OpenXC]] project provides logic and some useful libraries to
+ access a CAN bus. It is the choice for AGL.
+
+- **High Level** In many cases accessing to low level signal is not enough.
+ Low level information might need to be composed (i.e. GPS+Gyro+Accel).
+ Writing this composition logic might be quite complex and reusing existing
+ libraries like: LibEkNav for Kalman filtering [[9]] or Vrgimbal for 3 axes
+ control[[10]] may help saving a lot of time. AGL apps should access CAN
+ signals through High Level service. High level can lean on as many low level
+ service as needed to compute its **Virtual signals** coming from differents
+ sources. Viwi protocol seems to be a good solution.
+
+## Leveraging AGL binder
+
+Such a model is loosely coupled with AGL binder. Low level CAN service as well
+as virtual signal components may potentially run within any hosting environment
+that would provide the right API with corresponding required facilities.
+Nevertheless leveraging [[APbinder]] has multiple advantages. It already
+implements event notification to support a messaging/signaling model for
+distributed services. It enables a subscribe model responding to the
+requirement and finally it uses JSON natively.
+
+This messaging/signalling model already enforces the notion of subscription for
+receiving data. It implies that unexpected data are not sent and merely not
+computed. When expected data is available, it is pushed to all waiting
+subscriber only one time.
+
+The [[APbinder]] provides transparency of communication.
+It currently implements the transparency over D-Bus/Kdbus and WebSocket.
+Its transparency mechanism of communication is easy to extend to other
+technologies: pools of shared memory or any proprietary transport model.
+
+When bindings/services are loaded by the same binder, it provides transparently
+`in-memory` communication. This in-memory communication is really efficient: on
+one hand, the exchanged JSON objects are not serialized (because not streamed),
+on the other hand, those JSON objects provide a high level of abstraction able
+to transfer any data.
+
+Technically a service is a standard [[APbinder]] binding which is also handled
+by the system and launched as a daemon by systemD.
+Therefore Signal/Agent inherits of security protection through SMACK, access
+control through Cynara, transparency of API to transport layer, life cycle
+management, ... Like any other [[APbinder]] process is composed of a set of
+bindings. In signal service specific case, those bindings are in fact the
+`signal modules`.
+
+The proposed model allows to implement low level dependencies as independent
+signal modules. Those modules when developed are somehow like "Lego Bricks".
+They can be spread or grouped within one or multiple services depending on
+deployment constraints (performance, multi-ECU, security & isolation
+constraints,...).
+
+On top of that low level signal modules, you should use a high level service.
+A first implementation of [[ViWi]] is available [here](https://github.com/iotbzh/high-level-viwi-service)
+and can be use to integrate business logic and high level features.
+
+The model naturally uses JSON to represent data.
+
+## Multi-ECU and Vehicule to Cloud interactions
+
+While this might not be a show stopper for current projects, it is obvious that
+in the near future Signal/Agent should support a fully distributed
+architectures. Some event may come from the cloud (i.e. request to start
+monitoring a given feature), some may come from SmartCity and nearby vehicles,
+and last but not least some may come from another ECU within the same vehicle or
+from a virtualized OS within the same ECU (e.g. cluster & IVI). In order to do
+so, Signal modules should enable composition within one or more [[APbinder]]
+inside the same ECU. Furthermore they should also support chaining with the
+outside world.
+
+![image](./images/cloud-arch.svg "Cloud & Multi-ECU Architecture")
+
+1. Application requests Virtual Signal exactly like if it was a low level signal
+1. Agent Signal has direct relation to low level signal
+1. Agent needs to proxy to an other service inside the same ECU to access the signal
+1. Signal is not present on current ECU. Request has to be proxied to the outside world
+
+[AppFw]: http://iot.bzh/download/public/2016/appfw/01_Introduction-to-AppFW-for-AGL-1.0.pdf "Application Framework"
+[APcore]: http://iot.bzh/download/public/2016/appfw/03_Documentation-AppFW-Core-1.0.pdf "AppFw Core"
+[APmain]: https://gerrit.automotivelinux.org/gerrit/#/q/project:src/app-framework-main "AppFw Main"
+[APbinder]: https://gerrit.automotivelinux.org/gerrit/#/q/project:src/app-framework-binder "AppFw Binder"
+[APsamples]: https://gerrit.automotivelinux.org/gerrit/gitweb?p=src/app-framework-binder.git;a=tree;f=bindings/samples "AppFw Samples"
+[Signal-K]: http://signalk.org/overview.html
+[1]: http://schd.ws/hosted_files/aglmmwinter2017/37/20170201_AGL-AMM_F10_kusakabe.pdf
+[2]: https://wiki.automotivelinux.org/_media/agl-distro/20170402_ften_can_kusakabe_v2.pdf
+[6]: https://github.com/otcshare/automotive-message-broker
+[7]: http://ardupilot.org/rover/index.html
+[8]: https://github.com/ArduPilot/ardupilot/tree/master/libraries
+[9]: https://bitbucket.org/jbrandmeyer/libeknav/wiki/Home
+[10]: http://ardupilot.org/rover/docs/common-vrgimbal.html
+[11]: http://elinux.org/R-Car/Boards/Porter:PEXT01
+[12]: https://github.com/gpsnavi/gpsnavi
+[VISS]: http://rawgit.com/w3c/automotive/gh-pages/vehicle_data/vehicle_information_service.html
+[VSS]: https://github.com/GENIVI/vehicle_signal_specification
+[ViWi]: https://www.w3.org/Submission/2016/SUBM-viwi-protocol-20161213/
+[OpenXC]: http://openxcplatform.com/
+[low level CAN service]: https://gerrit.automotivelinux.org/gerrit/#/admin/projects/src/low-level-can-generator
+[high level ViWi]: https://github.com/iotbzh/high-level-viwi-service