---
title: AGL - Message Signaling Architecture
author: Fulup Ar Foll (IoT.bzh)
date: 2016-06-30

categories: architecture, appfw
tags: architecture, signal, message
layout: techdoc

---

**Table of Contents**

1. TOC

{:toc}

## Context

Automotive applications need to understand in real time the context in which
vehicles operate. In order to do so, it is critical for automotive applications
to rely on a simple, fast and secure method to access data generated by the
multiple sensors/ECUs embedded in modern cars.

This signaling problem is neither new nor unique to the automotive industry, and
multiple solutions, often described as Message Brokers or Signaling Gateways,
have been around for a while.

The present document describes the architecture implemented since the AGL
Daring Dab version to handle existing signaling/messages in a car. It relies on
the [[APbinder]] binder/bindings model to minimize complexity while keeping the
system fast and secure. We propose a model with multiple transport options and a
full set of security features to protect the services generating the signals as
well as those consuming them.

## Objectives

Our objective is to solve the following 3 key issues:

1. reduce as much as possible the amount of exchanged data to the meaningful
 subset really used by applications
1. offer a high level API that hides low level and proprietary interfaces to
 improve the stability of the code over time
1. hide the specificities of the low level implementation as well as the chosen
 deployment distribution model.

To reach the first objective, event emission frequency should be controlled at
the lowest possible level. Aggregation, composition, treatment and filtering of
signals should be supported at software level when not supported by the hardware.

The other objectives, offering a long term stable high level API while allowing
flexibility in changing the low level implementation, may look somewhat
conflicting. Nevertheless, by isolating the low level interface from the high
level one and allowing dynamic composition, it is possible to reconcile both
objectives.

## Architecture

Good practice is often based on modularity with clearly separated components
assembled within a common framework. Such modularity ensures separation of
duties, robustness, resilience and achievable long term maintenance.

This document uses the term "**Service**" to define a specific instance of this
proposed common framework used to host a group of dedicated separated components
that handle targeted signals/events. Each service exposes to services/applications
the signals/events it is responsible for.

As an example, a CAN service may want to mix a non-public proprietary API with
CANopen compatible devices while hiding this complexity from applications. The
goal is on the one hand to isolate proprietary pieces of code in such a way that
they are as transparent as possible for the remaining part of the architecture.
On the other hand, isolating the code related to a specific device provides a
better separation of responsibilities, keeping all specificities related to a
given component clearly isolated and much easier to test or maintain. Last but
not least, if needed, this model may also help to provide some proprietary code
directly as binary and not as source code.

Communication between the car and regular apps should be done using two levels
of AGL services, which have two distinct roles:

- low level should handle communication with the CAN bus device (reading,
 decoding, basic and efficient filtering, caching, ...)
- high level should handle more complex tasks (signal composition, complex
 algorithms like Kalman filters, business logic, ...)

![image](./images/signal-service-arch.svg "Signal Agent Architecture")

To do so, the choice has been made to use an architecture similar to
[[OpenXC]], a Ford project. The principle is simple: from a JSON file that
describes all the CAN signals to be handled, generally converted from a **dbc**
file, the AGL generator produces a C++ source file. This file is in turn used as
part of the low level CAN service, which can then be compiled. This service
reads, decodes and serves these CAN signals to a high level CAN service that
holds the business logic and high level features described above.

![image](./images/can-generator.svg "AGL CAN generator")

While in some cases it may be chosen to implement a single service responsible
for everything, other scenarios may choose to split responsibility between
multiple services. Those multiple services may run on a single ECU or on
multiple ECUs. The chosen deployment distribution strategy should not impact the
development of components responsible for signal/event capture. Likewise, it
should have little impact on applications/services consuming those events.

A distributed capable architecture may provide multiple advantages:

- it avoids concentrating complexity in a single big/fat component
- it naturally leverages multiple ECUs and the existing network architecture
- it simplifies security by enabling isolation and sandboxing
- it clearly separates responsibilities and simplifies resolution of conflicts

The distributed architecture still has to be discussed and is not fully
implemented yet. The low level CAN service is neither fully functional nor
tested for this feature, but its architecture leaves the possibility open and
it will be implemented later.

![image](./images/distributed-arch.png "Distributed Architecture")

Performance matters. There is a trade-off between modularity and efficiency.
This is especially critical for signals where propagation time from one module to
the other should remain as short as possible and furthermore should consume as
little computing resources as possible.

A flexible solution should provide enough versatility either to compose modules
in separate processes or to choose a model where everything is hosted within a
single process. The chosen deployment model should have minor or no impact on
development/integration processes. The deployment model should be easy to
change; it should remain a tactical decision and never become a structuring
decision.

Nevertheless, while grouping modules may improve performance and reduce
resource consumption, it has a clear impact on security. No one should forget
that some signals have a very different level of security from others. Mixing
everything within a single process places all signal handling within a single
security context. Such a decision may have a significant impact on the level of
confidence one may have in the global system.

Providing such flexibility constrains the communication model used by modules:

- The API of integration of the modules (the API of the framework) that enables
  the connection of modules must be independent of the implementation of
  the communication layer
- The communication layer must be as transparent as possible, its
  implementation shouldn't impact how it is used
- The cost of the abstraction for modules grouped in the same process
  must be as low as possible
- The cost of separating modules with the maximum of security must remain as
 low as possible

Another point impacting performance relates to a smart limitation on the number
of emitted signals. Reducing the cost of sending a signal is one thing,
reducing the number of signals is another. No one should forget that the
faster you ignore a useless signal, the better it is. The best way to achieve
this is to filter useless signals as close as possible to the component
generating the signal and, when possible, directly at the hardware level.

To enable the right component to filter useless signals, consumer clients must
describe precisely the data they need. A filter on frequency has been provided
since the Daring Dab version, as well as minimum and maximum limits. These
filters can be specified at subscription time. In addition, data not required
by any client should never be transmitted. Only changed data is transmitted; a
service that needs a value at regular intervals has to assume that, if no event
is received, the value has not changed. Furthermore, when possible, unrequested
data should not even be computed: a CAN signal received on the socket is simply
ignored if no one asks for it.
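
The filtering rules above can be sketched as follows. This is a minimal illustration and not the code of the low level CAN service: the `SubscriptionFilter` class, its field names and the choice of passing the current time explicitly are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubscriptionFilter:
    """Hypothetical per-subscription filter: rate limit plus value bounds."""
    frequency_hz: Optional[float] = None  # maximum emission rate, None = unlimited
    min_value: Optional[float] = None     # values below this bound are dropped
    max_value: Optional[float] = None     # values above this bound are dropped
    _last_time: Optional[float] = None    # time of the last emitted event
    _last_value: Optional[float] = None   # value of the last emitted event

    def accept(self, value: float, now: float) -> bool:
        """Return True if the value should be pushed to the subscriber."""
        if self.min_value is not None and value < self.min_value:
            return False
        if self.max_value is not None and value > self.max_value:
            return False
        if value == self._last_value:
            return False  # unchanged data is not retransmitted
        if (self.frequency_hz is not None and self._last_time is not None
                and now - self._last_time < 1.0 / self.frequency_hz):
            return False  # too soon since the last emitted event
        self._last_time, self._last_value = now, value
        return True


# Usage: at most 10 events per second, values kept within [0, 100].
speed_filter = SubscriptionFilter(frequency_hz=10, min_value=0, max_value=100)
```

With such a filter, a subscriber that receives nothing for a while simply keeps the last value it was given, which matches the "no event means no change" convention described above.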

Describing expected data in a precise but nevertheless simple manner remains a
challenge. It implies managing:

- requested frequency of expected data,
- accuracy of data, to avoid detecting changes that are not significant,
- when signaling is required (rising edge, falling edge,
  on maintained state, ...),
- filtering of data to avoid glitches and noise,
- composition of signals both numerically and logically (adding,
  subtracting, running logical operators like AND/OR/XOR, getting the mean, ...),
- etc.

It is critical to enable multiple features in signal queries to enable modules
to implement the best computing method. The best computing method may have an
impact on which device to query as well as on which filters should be applied.
Furthermore filtering should happen as soon as possible and obviously when
possible directly at hardware level.

### Transport Solutions

D-Bus is the standard choice for Linux, but it has some serious performance
limitations due to its internal verbosity. Nevertheless, because it is
available and pre-integrated with almost every Linux component, D-Bus may still
remain an acceptable choice for signals with a low rate of emission (e.g. HMI).

For faster communication, Jaguar Land Rover proposes a shared memory signal
infrastructure. Unfortunately this solution is far from solving all issues and
has some drawbacks. Let us check the open issues it has:

- there is no management of which data is requested. This
 translates into computing data even when it is not needed.
- on top of shared memory, an extra side channel is required for processes
 to communicate with the daemon.
- a single shared memory implies a lot of concurrency handling. This might
 introduce drawbacks that otherwise would be solved through communication
 buffering.

ZeroMQ, NanoMSG and equivalent libraries focus on fast communication. Some
(e.g. ZeroMQ) are distributed under a copyleft license (LGPL), whereas others
(e.g. NanoMSG) use a permissive open source license (MIT). Those solutions are
well suited for communication both inside a single ECU and across several ECUs.
However, most of them use Unix domain sockets and TCP sockets and typically do
not use shared memory for inter-process communication.

Last but not least, Android binder, Kdbus and others leverage shared memory and
zero copy, and sit directly within the Linux kernel. While this may boost
information passing between local processes, it also has some limitations. The
first one is the lack of support for multi-ECU or vehicle to cloud distribution.
The second one is that none of them is approved upstream in the kernel tree.
This last point may create some extra burden each time a new version of the
Linux kernel is needed or when porting to new hardware is required.

### Query and Filtering Language

A description language for filtering expected data remains an almost green
field where nothing really fits the signal service requirements. Languages like
Simulink or signal processing graphical languages are valuable modelling tools.
Unfortunately they cannot be embedded in the car. Furthermore those languages
have many features that are not useful in the proposed signal service context,
and the cost of integrating such complex languages might not be justified for
something as simple as a signal service. The same remarks apply to automation
languages.

Further investigation leads to some already existing specifications, like the
one from Jaguar Land Rover, [[VISS]], for **Vehicle Information Service
Specification**, and another from Volkswagen AG named [[ViWi]], which stands
for **Volkswagen Infotainment Web Interface**. Each one has its differences and
provides a different approach serving the same goal:

|                        VISS                                   |                                   ViWi                          |
|---------------------------------------------------------------|-----------------------------------------------------------------|
| Filtering on node (not possible on several nodes or branches) | Describe a protocol                                             |
| Access restrictions to signals                                | Ability to specify custom signals                               |
| Use high level development languages                          | RESTful HTTP calls                                              |
| One big server that handles requests                          | Stateless                                                       |
| Filtering                                                     | Filtering, sorting                                              |
| Static signals tree not extensible [[VSS]]                    | Use JSON objects to communicate                                 |
| Use of AMB?                                                   | Identification of resources may be a bit heavy going using UUID |
| Use of WebSocket                                              |                                                                 |

Regarding the **[[VISS]]** specification, the major problem comes from the fact
that signals are specified under the [[VSS]], **Vehicle Signal Specification**.
It is difficult, if not impossible, to make a full inventory of all signals
existing for each car. More importantly, each evolution in signals must be
reported in the specification, and this ignores the fact that car makers have
their own names and sets of signals that mostly do not comply with the [[VSS]].
VISS does not seem to be a viable way to handle a car's signals: it relies on
one big component that responds to requests, and its use of the **Automotive
Message Broker**, which uses D-Bus, is a performance problem. A recent Fujitsu
Ten study[[1]] highlights that the processor cannot handle a heavy load on the
CAN bus and that the low level binding adopted for AGL has about 10 times[[2]]
less impact on performance.

## Describing Signal Subscriptions using JSON

JSON is a rich structured representation of data. For requested data, it allows
the expression of multiple features and constraints. JSON is both very flexible
and efficient. There are significant advantages in describing requested data at
subscription time using a language like JSON. Another advantage of JSON is that
no dedicated parser has to be written to analyse the request.

Prior work on describing signals comes first from Vector with its proprietary
database format (`DBC`), which is widely used in the industry. Basing a
description on this format appears to be a good solution, and the Open Source
community already has tools that convert this proprietary file format to an
open one. So, a JSON description based on the work from [[OpenXC]] is
specified [here](https://github.com/openxc/vi-firmware/blob/master/docs/config/reference.rst)
and is in turn used in the low level CAN service in AGL:

```json
{
    "name": "example",
    "extra_sources": [],
    "initializers": [],
    "loopers": [],
    "buses": {},
    "commands": [],
    "0x3D9": {
        "bus": "hs",
        "signals": {
            "PT_FuelLevelPct": {
                "generic_name": "fuel.level",
                "bit_position": 8,
                "bit_size": 8,
                "factor": 0.392157,
                "offset": 0
            },
            "PT_EngineSpeed": {
                "generic_name": "engine.speed",
                "bit_position": 16,
                "bit_size": 16,
                "factor": 0.25,
                "offset": 0
            },
            "PT_FuelLevelLow": {
                "generic_name": "fuel.level.low",
                "bit_position": 55,
                "bit_size": 1,
                "factor": 1,
                "offset": 0,
                "decoder": "decoder_t::booleanDecoder"
            }
        }
    }
}
```

From a description like the one above, the low level CAN generator outputs
a C++ source file that lets the low level CAN service handle such signal
definitions.
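
To illustrate what such a definition means at decoding time, here is a small sketch written in Python for brevity (the real generated code is C++). It applies `bit_position`, `bit_size`, `factor` and `offset` to a raw 8-byte CAN payload. The function names are hypothetical, the `decoder` field is ignored, and the bit numbering (bit 0 being the most significant bit of the first byte, big-endian signals) is an assumption that may not match a given bus or the actual generator.

```python
from typing import Dict

# Signal definitions copied from the JSON description above (message 0x3D9).
SIGNALS: Dict[str, dict] = {
    "fuel.level": {"bit_position": 8, "bit_size": 8, "factor": 0.392157, "offset": 0},
    "engine.speed": {"bit_position": 16, "bit_size": 16, "factor": 0.25, "offset": 0},
    "fuel.level.low": {"bit_position": 55, "bit_size": 1, "factor": 1, "offset": 0},
}


def decode_signal(payload: bytes, bit_position: int, bit_size: int,
                  factor: float = 1.0, offset: float = 0.0) -> float:
    """Extract one signal from a CAN payload (up to 8 bytes) and scale it.

    Assumption: bit 0 is the most significant bit of the first byte and
    the signal is stored big-endian.
    """
    frame = int.from_bytes(payload.ljust(8, b"\x00"), "big")
    shift = 64 - bit_position - bit_size
    raw = (frame >> shift) & ((1 << bit_size) - 1)
    return raw * factor + offset


def decode_message(payload: bytes) -> Dict[str, float]:
    """Decode every known signal of the message, keyed by generic name."""
    return {name: decode_signal(payload, **spec) for name, spec in SIGNALS.items()}


# Example frame: fuel byte 0xFF, engine speed 0x0FA0, low-fuel bit set.
frame = bytes([0x00, 0xFF, 0x0F, 0xA0, 0x00, 0x00, 0x01, 0x00])
values = decode_message(frame)
```

Here `values["engine.speed"]` is the raw value `0x0FA0` (4000) multiplied by the `0.25` factor.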

## Naming Signal

Naming and defining signals is something very complex. For example just
***speed***, as a signal, is difficult to define.
What unit is used (km/h, mph, m/s, ...)?
From which source (wheels, GPS, accelerometer)?
How was it captured (period of measure, instantaneous, mean, filtered)?

In order to simplify application development we should nevertheless agree on
some naming convention for key signals. Those names might be relatively complex
and featured. They may include a unit, a rate, a precision, etc.

How these names should be registered, documented and managed is out of scope of
this document but extremely important and at some point in time should be
addressed. Nevertheless this issue should not prevent us from moving forward
and developing a modern architecture. Developers should be warned that naming
is a complex task, that the naming scheme may be redefined in the future, and
that adjustments may then be required.

For low level CAN signal naming, a dotted notation, like the one used by Jaguar
Land Rover, is a good compromise as it describes a path to a car element. It
separates and organizes names into a hierarchy. Names are read from left to
right: the most common ancestor comes first, and each further component to the
right makes the name more specific. This notation lets you subscribe to or
unsubscribe from several signals at once using a globbing expression.

Example using OBD2 standard PIDs:

```path
engine.load
engine.coolant.temperature
fuel.pressure
intake.manifold.pressure
engine.speed
vehicle.speed
intake.air.temperature
mass.airflow
throttle.position
running.time
EGR.error
fuel.level
barometric.pressure
commanded.throttle.position
ethanol.fuel.percentage
accelerator.pedal.position
hybrid.battery-pack.remaining.life
engine.oil.temperature
engine.torque
```

Here you can choose to subscribe to all engine components using an expression
like `engine.*`.
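
As an illustration of how such a globbing expression selects signals, the sketch below uses Python's standard `fnmatch` module on a subset of the names listed above. The `match_signals` helper is hypothetical, and the actual service may implement globbing differently; in particular, with `fnmatch` the `*` wildcard also matches dots, so `engine.*` selects `engine.coolant.temperature` as well.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

# Subset of the OBD2 signal names listed above.
SIGNAL_NAMES = [
    "engine.load",
    "engine.coolant.temperature",
    "fuel.pressure",
    "engine.speed",
    "vehicle.speed",
    "fuel.level",
    "engine.oil.temperature",
    "engine.torque",
]


def match_signals(pattern: str, names: Iterable[str]) -> List[str]:
    """Return the names matching a shell-style pattern, in input order."""
    return [name for name in names if fnmatchcase(name, pattern)]


engine_signals = match_signals("engine.*", SIGNAL_NAMES)
speed_signals = match_signals("*.speed", SIGNAL_NAMES)
```

The same helper can serve both subscription and unsubscription, since both only need the list of names a pattern expands to.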

## Reusing existing/legacy code

Currently, the provided services use:

- **Low Level** The [[OpenXC]] project provides logic and some useful libraries
 to access a CAN bus. It is the choice for AGL.

- **High Level** In many cases, access to low level signals is not enough.
  Low level information might need to be composed (e.g. GPS+Gyro+Accel).
  Writing this composition logic might be quite complex, and reusing existing
  libraries like LibEkNav for Kalman filtering [[9]] or Vrgimbal for 3 axes
  control[[10]] may help save a lot of time. AGL apps should access CAN
  signals through the High Level service. The High Level service can lean on
  as many low level services as needed to compute its **Virtual signals**
  coming from different sources. The ViWi protocol seems to be a good solution.

## Leveraging AGL binder

Such a model is loosely coupled with AGL binder. Low level CAN service as well
as virtual signal components may potentially run within any hosting environment
that would provide the right API with corresponding required facilities.
Nevertheless leveraging [[APbinder]] has multiple advantages. It already
implements event notification to support a messaging/signaling model for
distributed services. It enables a subscribe model responding to the
requirement and finally it uses JSON natively.

This messaging/signalling model already enforces the notion of subscription for
receiving data. It implies that unrequested data is not sent and, where
possible, not even computed. When expected data is available, it is pushed only
once to all waiting subscribers.
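
The subscription behaviour described above can be sketched as a tiny in-process event bus. This is only an illustration of the principle and not the binder API: the `SignalBus` class and its methods are hypothetical names. The decoding step is passed as a callable so that it is skipped entirely when nobody has subscribed.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Callback = Callable[[str, float], None]


class SignalBus:
    """Hypothetical event bus: decode and push only what has been subscribed."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, name: str, callback: Callback) -> None:
        self._subscribers[name].append(callback)

    def unsubscribe(self, name: str, callback: Callback) -> None:
        self._subscribers[name].remove(callback)

    def on_raw(self, name: str, decode: Callable[[], float]) -> bool:
        """Handle an incoming raw signal; return True if it was delivered."""
        callbacks = self._subscribers.get(name)
        if not callbacks:
            return False  # nobody asked for it: do not even decode
        value = decode()  # computed once, whatever the number of subscribers
        for callback in list(callbacks):
            callback(name, value)
        return True


bus = SignalBus()
received = []
bus.subscribe("engine.speed", lambda name, value: received.append((name, value)))
bus.on_raw("engine.speed", lambda: 1000.0)   # delivered to the subscriber
bus.on_raw("fuel.level", lambda: 42.0)       # ignored, no subscriber
```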

The [[APbinder]] provides transparency of communication.
It currently implements the transparency over D-Bus/Kdbus and WebSocket.
Its transparency mechanism of communication is easy to extend to other
technologies: pools of shared memory or any proprietary transport model.

When bindings/services are loaded by the same binder, it provides transparently
`in-memory` communication. This in-memory communication is really efficient: on
one hand, the exchanged JSON objects are not serialized (because not streamed),
on the other hand, those JSON objects provide a high level of abstraction able
to transfer any data.

Technically a service is a standard [[APbinder]] binding which is also handled
by the system and launched as a daemon by systemd.
Therefore Signal/Agent inherits security protection through SMACK, access
control through Cynara, transparency of API to transport layer, life cycle
management, ... Like any other [[APbinder]] process, it is composed of a set of
bindings. In the signal service specific case, those bindings are in fact the
`signal modules`.

The proposed model allows low level dependencies to be implemented as
independent signal modules. Once developed, those modules are somewhat like
"Lego Bricks". They can be spread or grouped within one or multiple services
depending on deployment constraints (performance, multi-ECU, security &
isolation constraints, ...).

On top of those low level signal modules, you should use a high level service.
A first implementation of [[ViWi]] is available [here](https://github.com/iotbzh/high-level-viwi-service)
and can be used to integrate business logic and high level features.

The model naturally uses JSON to represent data.

## Multi-ECU and Vehicle to Cloud interactions

While this might not be a show stopper for current projects, it is obvious that
in the near future Signal/Agent should support a fully distributed
architecture. Some events may come from the cloud (e.g. a request to start
monitoring a given feature), some may come from SmartCity and nearby vehicles,
and last but not least some may come from another ECU within the same vehicle or
from a virtualized OS within the same ECU (e.g. cluster & IVI). In order to do
so, Signal modules should enable composition within one or more [[APbinder]]
inside the same ECU. Furthermore they should also support chaining with the
outside world.

![image](./images/cloud-arch.svg "Cloud & Multi-ECU Architecture")

1. Application requests a Virtual Signal exactly as if it were a low level signal
1. Agent Signal has a direct relation to the low level signal
1. Agent needs to proxy to another service inside the same ECU to access the signal
1. Signal is not present on the current ECU. Request has to be proxied to the outside world
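
The lookup order described by these steps can be sketched as a chain of providers tried from the closest to the farthest. This is a hypothetical illustration only: the `resolve` function, the provider type and the dictionaries standing in for real services are assumptions, and no actual transport or proxy mechanism is shown.

```python
from typing import Callable, Dict, Optional, Tuple

# A provider returns the current value of a signal, or None if it does not know it.
Provider = Callable[[str], Optional[float]]


def table_provider(table: Dict[str, float]) -> Provider:
    """Build a provider backed by a plain dictionary (stand-in for a real service)."""
    return table.get


def resolve(name: str, local: Provider, same_ecu: Provider,
            remote: Provider) -> Tuple[str, float]:
    """Try the closest source first and only go further out when needed."""
    for origin, provider in (("local", local), ("same-ecu", same_ecu),
                             ("remote", remote)):
        value = provider(name)
        if value is not None:
            return origin, value
    raise KeyError(f"unknown signal: {name}")


# Stand-ins for a low level binding, another service on the ECU, and the outside world.
low_level = table_provider({"engine.speed": 1000.0})
other_service = table_provider({"vehicle.speed": 42.0})
cloud = table_provider({"traffic.density": 0.7})

origin, value = resolve("vehicle.speed", low_level, other_service, cloud)
```

From the application's point of view the call is the same in all three cases; only the returned origin reveals where the signal was actually found.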

[AppFw]:  http://iot.bzh/download/public/2016/appfw/01_Introduction-to-AppFW-for-AGL-1.0.pdf "Application Framework"
[APcore]:  http://iot.bzh/download/public/2016/appfw/03_Documentation-AppFW-Core-1.0.pdf "AppFw Core"
[APmain]:  https://gerrit.automotivelinux.org/gerrit/#/q/project:src/app-framework-main "AppFw Main"
[APbinder]:  https://gerrit.automotivelinux.org/gerrit/#/q/project:src/app-framework-binder "AppFw Binder"
[APsamples]:  https://gerrit.automotivelinux.org/gerrit/gitweb?p=src/app-framework-binder.git;a=tree;f=bindings/samples "AppFw Samples"
[Signal-K]: http://signalk.org/overview.html
[1]: http://schd.ws/hosted_files/aglmmwinter2017/37/20170201_AGL-AMM_F10_kusakabe.pdf
[2]: https://wiki.automotivelinux.org/_media/agl-distro/20170402_ften_can_kusakabe_v2.pdf
[6]:  https://github.com/otcshare/automotive-message-broker
[7]:  http://ardupilot.org/rover/index.html
[8]:  https://github.com/ArduPilot/ardupilot/tree/master/libraries
[9]:  https://bitbucket.org/jbrandmeyer/libeknav/wiki/Home
[10]: http://ardupilot.org/rover/docs/common-vrgimbal.html
[11]: http://elinux.org/R-Car/Boards/Porter:PEXT01
[12]: https://github.com/gpsnavi/gpsnavi
[VISS]: http://rawgit.com/w3c/automotive/gh-pages/vehicle_data/vehicle_information_service.html
[VSS]: https://github.com/GENIVI/vehicle_signal_specification
[ViWi]: https://www.w3.org/Submission/2016/SUBM-viwi-protocol-20161213/
[OpenXC]: http://openxcplatform.com/
[low level CAN service]: https://gerrit.automotivelinux.org/gerrit/#/admin/projects/src/low-level-can-generator
[high level ViWi]: https://github.com/iotbzh/high-level-viwi-service