Re: "Pseudo-DIA" between the Linux Kernel Development Community and its Users - 3/5 Safety for Operating Systems involves functional and quality requirements

Jochen Kall

Hi John,

I like your reasoning and the schematic; it sums up the situation very nicely.
I'm not sure I understand or agree with all the arguments leading to your conclusion, though, especially the parts about where there is a development process to be considered. See my inline comments below. I also added some notes to your schematic that came to mind; see the attachment.


Best Regards
Jochen Kall

On behalf of Toyota. 
Dr. rer. nat. Jochen Kall
Functional Safety
ITK Engineering GmbH
Im Speyerer Tal 6
76761 Rülzheim
Tel.: +49 7272 7703-546
Fax: +49 7272 7703-100
ITK Engineering GmbH | Im Speyerer Tal 6 | 76761 Rülzheim
Tel.: +49 7272 7703-0 | Fax: +49 7272 7703-100
mailto:info@... |

Chairman of the Supervisory Board:
Dr. Rudolf Maier
Executive Board:
Michael Englert (Chairman), Bernd Gohlicke
Registered Office: 76761 Rülzheim
Registered Court: Amtsgericht Landau, HRB 32046
VAT ID No. DE 813165046

-----Original Message-----
From: development-process@... <development-process@...> On Behalf Of John MacGregor via
Sent: Monday, 25 May 2020 18:36
To: Lukas Bulwahn <Lukas.Bulwahn@...>
Cc: development-process@...
Subject: Re: [development-process] "Pseudo-DIA" between the Linux Kernel Development Community and its Users - 3/5 Safety for Operating Systems involves functional and quality requirements

Hi Lukas,

After the workshop, and before my vacation, another veil falls...

For this e-mail, I've enhanced the general V-Model with safety activities / processes. This means that the diagramme covers the general development lifecycle as well as the safety lifecycle. The standards are not so clear about that. Rather than having a separate, parallel V for the safety lifecycle, I've inserted orange "safety" boxes in the nodes representing each development phase.
[JK] Fully agree. Separate, non-integrated life cycles don't work out in general; they are usually the result of companies treating safety as an "add-on" to existing processes so as not to have to change those processes... Performance usually turns out as you would expect.

In the case of ISO 26262, there is the famous illustration of the 3 Vs superimposed over the overview of the standards series (Figure 1 in every standard) and Figures 2 & 3 in Part 4 which replace the hardware and software Vs with boxes. The enclosed illustration seems generally compatible.

It's not generally included in the V-Model, but in the context of safety-critical systems, there should be backwards traceability between the requirements and the work products that implement them.
[JK] Absolutely necessary. I think the illustration could even use some more traceability arrows (see attachment).
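
As an illustration of the backwards traceability discussed above, here is a minimal sketch in Python. It is not taken from any of the standards or from the attachment; the `WorkProduct` type, the `trace_gaps` helper, the requirement IDs and the file names are all hypothetical. It only shows the basic check: every requirement should be implemented by at least one work product, and every work product should trace back to at least one known requirement.

```python
from dataclasses import dataclass, field


@dataclass
class WorkProduct:
    """Hypothetical work product (code, test, document) with its trace links."""
    name: str
    implements: set = field(default_factory=set)  # requirement IDs it claims to implement


def trace_gaps(requirement_ids, work_products):
    """Return (untraced requirement IDs, names of work products with no valid link)."""
    known = set(requirement_ids)
    covered = set()
    orphans = []
    for wp in work_products:
        links = wp.implements & known  # ignore links to unknown requirement IDs
        if not links:
            orphans.append(wp.name)
        covered |= links
    return sorted(known - covered), orphans


# Illustrative data only; IDs and file names are made up.
requirements = ["SR-1", "SR-2", "SR-3"]
products = [
    WorkProduct("watchdog_driver.c", {"SR-1"}),
    WorkProduct("logging_service.c", {"SR-2", "SR-9"}),
    WorkProduct("debug_shell.c"),
]

untraced, orphans = trace_gaps(requirements, products)
print("Requirements without a work product:", untraced)
print("Work products without a requirement:", orphans)
```

With this sample data the check reports SR-3 as not implemented and debug_shell.c as not traced to any requirement. A real project would keep such links in its requirements management tooling rather than in a script.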

Two points are immediately noticeable:
1) The standards' requirements mostly only cover the tip of the iceberg of system development activities. (Well, I have to admit I made the orange boxes small so that they wouldn't interfere with the phase titles, however (-: ).
2) There is an overlap between safety functionality and operating system functionality.

Those turquoise boxes represent the development process of all application, middleware and, yes, operating system elements in the safety-critical system. The system itself is composed of (using a somewhat arbitrary mixture of terminology):
1) newly-developed safety elements
2) newly-developed non-safety elements
3) pre-existing safety elements that have been used in a similar domain
4) pre-existing safety elements that have been used in another context (at least from the 26262 perspective), i.e. another instance of the same product class
5) pre-existing non-safety elements
6) pre-existing safety components (hardware and software)
7) pre-existing non-safety components (hardware and software)
each of which may have a different certification or qualification route as well as different generic development processes. The difference between elements and components seems nebulous to me and I'd rather call pre-existing things "off-the-shelf", whereby one might have to differentiate whose shelf they come from.
[JK] Nebulous not just to you... In the ISO 26262 context, "element" is a catch-all, covering everything from whole systems down to individual software units. "Component" is a bit narrower: it refers to something that is not on the system level, but consists of several software units or hardware parts respectively.

From the previous e-mail (which admittedly considered only non-safety-critical systems), a Linux that is currently being selected for use in an embedded system would belong to category 7, and that is the focus here. It may soon be the case that safety-critical applications will use Linux. There may come a time when safety functionality has been brought upstream to the Kernel, but neither is quite the case yet.

The safety-critical system development process starts by defining the safety-critical system and the environment (context) within which it operates. A hazard and risk analysis is then performed to develop the safety requirements on the system and a corresponding functional safety concept. A technical safety concept is developed in the system architecture phase, which ultimately results in safety requirements on the software architecture, and therefore on the operating system.
[JK] Terminology: I assume that by "safety requirements on the system" you mean what ISO 26262 calls a safety goal, i.e. the top-level safety requirement on the item being developed?

At this point the requirements on the operating system should be functional requirements, for safety mechanisms or safety functions, and / or requirements on the qualities of those functions (response time, resource consumption, etc.). Safety functionality, or mechanisms, include such things as monitoring, periodic testing, diagnostic and logging functionalities, tolerance mechanisms for residual design faults in the hardware, environmental stresses, operator mistakes, residual software design faults, data communication errors and overload situations; things that may already exist in the operating system in some form. Refer to 61508-3 a) for a better list.

In other words, the safety-related requirements on the operating system should already be functional or quality requirements that should be comparable to other requirements on the operating system.

This principle has already been accepted by a number of accreditation agencies in the context of SafeScrum. There, the hazard analyses result in hazard stories, which are stored in the product backlog. Each hazard story reflects a certain functionality that enables the system to achieve or maintain a safe state in a certain hazard scenario. During a sprint, the developers are instructed in the safety-critical aspects of the hazard story by accompanying safety personnel. The developers develop the hazard functionality while monitoring the safety-related requirements.

In the context of the general development lifecycle of a safety-critical product, i.e. the turquoise boxes, the system developer would again go through the selection and configuration process for an operating system as would be done in developing a conventional system. The operating system has already been developed and therefore has the functionality and qualities that it delivers "off the shelf". The developer must ensure that the operating system meets its functional and quality requirements. Where there are alternatives, the system developer would have the option of choosing the candidate that best meets the requirements, perhaps with an emphasis on its systematic capability (i.e. the potential of an element to lead to a failure of a safety function).
[JK] I think at this point it could be mentioned that the safety integrity standards have provisions for the related scenario of component reuse, leading us back to ISO 26262 Part 8 clause 12 and Route 3S of IEC 61508, and all the headache associated with that. We discussed in detail whether these clauses are directly applicable, with no clear consensus; there is evidence, after all, that they were never intended for the reuse of complex systems like an OS in the first place, even if the wording in the standards would technically allow it.
So basically your sentence "The developer must ensure that the operating system meets its functional and quality requirements" encompasses the majority of what the completed ELISA project would create, including the argument that its safety arguments are compatible with the safety standards, plus tailoring to the actual item. Quite a lot for an innocent little sentence like that^^.

Currently, Linux has not been developed for safety-critical applications, but it may be that it already possesses functionality to perform safety functions with adequate quality. Otherwise, there is the possibility, as in the case with Google and Android, that the system developer can develop, or commission the development of, the necessary functionality. This is the only time there is a development process, and that is not for the operating system but for a feature of the operating system.
[JK] Here I have to disagree. The decision in the paragraph before this one, that the functional and quality requirements are fulfilled, also includes the decision that the systematic integrity is sufficient. That integrity depends on the way the software was developed, which leads us right back to the development process of the pre-existing code that one intends to reuse. Maybe I don't fully grasp your argument, but right now I think restricting "development process" to new safety features is just not correct.

In the context of safety, I could foresee possible Kernel development activities related to new drivers and to the safety functionality.

The point here is that Linux has always (up to now) been developed as functionality. It may be possible to isolate the safety-related parts of that functionality and, as part of the systems engineering part of the development process, attach quality requirements to them and validate that the requirements have been achieved. For me, this would be the development interface for the DIA.
[JK] Maybe my issue with this argument is the definition (or rather the lack thereof) of "quality requirements". Do you refer only to non-functional requirements on the software, or does this term also encompass requirements on the development process by which it is developed?

In the next e-mail, I'll look at who might develop that functionality and how the responsibilities might be distributed among the players that perform the development activities.


Best regards

John MacGregor

Safety, Security and Privacy (CR/AEX4)
Robert Bosch GmbH | Postfach 10 60 50 | 70049 Stuttgart | GERMANY | Tel. +49 711 811-42995 | Mobil +49 151 543 09433 | John.MacGregor@...

Registered Office: Stuttgart, Registration Court: Amtsgericht Stuttgart, HRB 14000;
Chairman of the Supervisory Board: Franz Fehrenbach; Managing Directors: Dr. Volkmar Denner, Prof. Dr. Stefan Asenkerschbaumer, Dr. Michael Bolle, Dr. Christian Fischer, Dr. Stefan Hartung, Dr. Markus Heyn, Harald Kröger, Christoph Kübel, Rolf Najork, Uwe Raschke, Peter Tyroller

