
Re: TSC Meeting agenda (Org focus) - Wed, 12/2/2020

Shuah Khan
 

On 12/1/20 11:05 AM, Paoloni, Gabriele wrote:
Hi Shuah
Many thanks for scheduling this discussion

-----Original Message-----
From: devel@lists.elisa.tech <devel@lists.elisa.tech> On Behalf Of Shuah
Khan
Sent: Tuesday, December 1, 2020 6:07 PM
To: devel@lists.elisa.tech
Cc: Shuah Khan <skhan@linuxfoundation.org>
Subject: [ELISA Technical Community] TSC Meeting agenda (Org focus) -
Wed, 12/2/2020

All,

Here is the agenda for tomorrow's meeting:

Review pending action items
Workshop #6 Planning
ELISA Strategy

As for strategy, the goal is to come up with a clear higher-level
definition first and then map the WGs' activities and how each of
these activities feeds into the overall strategy.

Note: I reviewed the existing slides on strategy; none of them speaks
to me, as they dive right into the WGs without a clear definition of
the higher-level goals.

I started a high level strategy doc seeding it with distilled ideas
from Paul Albertella's thread on this topic and grouping them in
"What" and "How".
Today in the Safety Architecture WG we discussed a qualification flow
that I presented. It is just a starting point, but it may help
to define part of the 'What' and part of the 'How':
https://drive.google.com/file/d/1uss9FVcEIF8ecJftD0mv6pJY7mHue9DS/view?usp=sharing
Yes please add them as suggested edits.

Thanks Paul for starting the thread.

https://docs.google.com/document/d/1Jx77Mw_BqdanGYILCkyw3Fdifxn7FO7Dmt-9ZU2KISo/edit#
Do you want us to comment on/modify the file before tomorrow's meeting,
or do you want us to go over it together tomorrow?
Please do suggested edits for changes. One more thing that needs
to be outlined is how the WGs fit into these goals and how they
interact on them. The idea is that the graphic will be goal-driven,
with the WGs feeding into these goals.

After our meeting, I will try to put a visual for this doc.

thanks,
-- Shuah


Re: TSC Meeting agenda (Org focus) - Wed, 12/2/2020

Paoloni, Gabriele
 

Hi Shuah

Many thanks for scheduling this discussion

-----Original Message-----
From: devel@lists.elisa.tech <devel@lists.elisa.tech> On Behalf Of Shuah
Khan
Sent: Tuesday, December 1, 2020 6:07 PM
To: devel@lists.elisa.tech
Cc: Shuah Khan <skhan@linuxfoundation.org>
Subject: [ELISA Technical Community] TSC Meeting agenda (Org focus) -
Wed, 12/2/2020

All,

Here is the agenda for tomorrow's meeting:

Review pending action items
Workshop #6 Planning
ELISA Strategy

As for strategy, the goal is to come up with a clear higher-level
definition first and then map the WGs' activities and how each of
these activities feeds into the overall strategy.

Note: I reviewed the existing slides on strategy; none of them speaks
to me, as they dive right into the WGs without a clear definition of
the higher-level goals.

I started a high level strategy doc seeding it with distilled ideas
from Paul Albertella's thread on this topic and grouping them in
"What" and "How".
Today in the Safety Architecture WG we discussed a qualification flow
that I presented. It is just a starting point, but it may help
to define part of the 'What' and part of the 'How':
https://drive.google.com/file/d/1uss9FVcEIF8ecJftD0mv6pJY7mHue9DS/view?usp=sharing


Thanks Paul for starting the thread.

https://docs.google.com/document/d/1Jx77Mw_BqdanGYILCkyw3Fdifxn7FO7Dmt-9ZU2KISo/edit#
Do you want us to comment on/modify the file before tomorrow's meeting,
or do you want us to go over it together tomorrow?

Thanks
Gab


thanks,
-- Shuah





---------------------------------------------------------------------
INTEL CORPORATION ITALIA S.p.A., sole-shareholder company
Registered office: Milanofiori Palazzo E 4
20094 Assago (MI), Italy
Share capital: EUR 104,000.00, fully paid up
VAT and tax code 04236760155
Economic and Administrative Register no. 997124
Milan Company Register no. 183983/5281/33
Subject to the direction and coordination of
INTEL CORPORATION, USA

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


TSC Meeting agenda (Org focus) - Wed, 12/2/2020

Shuah Khan
 

All,

Here is the agenda for tomorrow's meeting:

Review pending action items
Workshop #6 Planning
ELISA Strategy

As for strategy, the goal is to come up with a clear higher-level
definition first and then map the WGs' activities and how each of
these activities feeds into the overall strategy.

Note: I reviewed the existing slides on strategy; none of them speaks
to me, as they dive right into the WGs without a clear definition of
the higher-level goals.

I started a high level strategy doc seeding it with distilled ideas
from Paul Albertella's thread on this topic and grouping them in
"What" and "How".

Thanks Paul for starting the thread.

https://docs.google.com/document/d/1Jx77Mw_BqdanGYILCkyw3Fdifxn7FO7Dmt-9ZU2KISo/edit#

thanks,
-- Shuah


ELISA TSC Meeting - Wed, 12/02/2020 3:00pm-4:00pm #cal-reminder

devel@lists.elisa.tech Calendar <devel@...>
 

Reminder: ELISA TSC Meeting

When: Wednesday, 2 December 2020, 3:00pm to 4:00pm, (GMT+01:00) Europe/Amsterdam

Where:https://zoom.us/j/97628705664?pwd=UVJnTjFHank1cHdNZ04vbWlSUGs5QT09


Organizer: ELISA Project

Description:

──────────

ELISA Project is inviting you to a scheduled Zoom meeting.

Join Zoom Meeting
https://zoom.us/j/97628705664?pwd=UVJnTjFHank1cHdNZ04vbWlSUGs5QT09

Meeting ID: 976 2870 5664
Passcode: 498396
One tap mobile
+13017158592,,97628705664#,,,,,,0#,,498396# US (Germantown)
+13126266799,,97628705664#,,,,,,0#,,498396# US (Chicago)

Dial by your location
+1 301 715 8592 US (Germantown)
+1 312 626 6799 US (Chicago)
+1 646 558 8656 US (New York)
+1 253 215 8782 US (Tacoma)
+1 346 248 7799 US (Houston)
+1 669 900 6833 US (San Jose)
855 880 1246 US Toll-free
877 369 0926 US Toll-free
+1 204 272 7920 Canada
+1 438 809 7799 Canada
+1 587 328 1099 Canada
+1 647 374 4685 Canada
+1 647 558 0588 Canada
+1 778 907 2071 Canada
855 703 8985 Canada Toll-free
Meeting ID: 976 2870 5664
Passcode: 498396
Find your local number: https://zoom.us/u/adJwdCp3cq


Re: Kernel Configurations in the Tool Investigation and Code Improvement Subgroup

Lukas Bulwahn
 

On Tue, Dec 1, 2020 at 12:38 PM Elana Copperman
<Elana.Copperman@mobileye.com> wrote:

Following initial discussion with Lukas, here is a summary of some relevant guidelines:

tinyconfig is ok but still needs some tweaking so that you can boot the system (e.g., add support for initrd; printk output during kernel boot; ELF binaries; procfs; sysfs).
Also, some customized patches have been introduced over the years, so tinyconfig does not exactly match the default Linux release.
You should work with a kernel that is standard, minimally bootable, and supports kernel boot output.

Strategy:
a) ELISA alone will never be able to fix all bugs at a faster rate than natural kernel growth introduces new bugs.
Try to quantify your bug handling capacity and aim to have a kernel on which you can fix bugs at a faster rate than new ones are introduced on average as the system grows.
b) Remove subsystems methodically (so that we have a clear scope of what is / is not covered), bringing it down to a size for which the number of errors is reasonable and scalable.
c) Publicize your work and create incentive for others to tackle additional subsystems for their own needs, so that this can become a community effort.
Thanks, Elana. This will be our strategy.

We will start tracking and fixing all new bugs on tinyconfig for a
small selection of tools, which should be doable by the current group
(I think it is doable with a few hours of work each week).
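The "few hours each week" estimate connects to Elana's point (a) above: the backlog shrinks only if the fix rate over a release cycle exceeds the inflow of new findings. A back-of-envelope sketch (all numbers below are invented placeholders, not measurements from the group):

```shell
# Toy capacity model: compare fix capacity per release cycle against
# the inflow of new findings per release (numbers are made up).
new_findings_per_release=40   # invented
fixes_per_week=5              # invented
weeks_per_cycle=9             # typical kernel release cadence
capacity=$((fixes_per_week * weeks_per_cycle))
if [ "$capacity" -gt "$new_findings_per_release" ]; then
    echo "backlog shrinks ($capacity fixes vs $new_findings_per_release new)"
else
    echo "backlog grows ($capacity fixes vs $new_findings_per_release new)"
fi
```

With these placeholder numbers the capacity (45 fixes per cycle) narrowly beats the inflow; the point of quantifying is to know which side of that line the group is actually on.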

Then we will selectively add more functionality and code that
we can handle, if the group of participants grows.

Lukas


Re: Kernel Configurations in the Tool Investigation and Code Improvement Subgroup

elana.copperman@...
 

Following initial discussion with Lukas, here is a summary of some relevant guidelines:
  1. tinyconfig is ok but still needs some tweaking so that you can boot the system (e.g., add support for initrd; printk output during kernel boot; ELF binaries; procfs; sysfs).
    Also, some customized patches have been introduced over the years, so tinyconfig does not exactly match the default Linux release.
    You should work with a kernel that is standard, minimally bootable, and supports kernel boot output.
      
  2. Strategy:
    a) ELISA alone will never be able to fix all bugs at a faster rate than natural kernel growth introduces new bugs.
      Try to quantify your bug handling capacity and aim to have a kernel on which you can fix bugs at a faster rate than new ones are introduced on average as the system grows.
    b) Remove subsystems methodically (so that we have a clear scope of what is / is not covered), bringing it down to a size for which the number of errors is reasonable and scalable.
    c) Publicize your work and create incentive for others to tackle additional subsystems for their own needs, so that this can become a community effort.
      
  3. Candidates for removal from your test scope. After a first scale-down, compare output to capacity (see point 2 above); we may then repeat the process until we get a kernel with which we can work effectively.
    a)  All architecture specific settings (including all Android settings)
    b) All vendor specific hardware interfaces.
    c) USB, DRM, sound, and HID subsystems.
    d) Plug-n-play (PnP) device interfaces
    e) Deprecated ETH protocols (e.g., RARP; EGP; IGRP) as well as protocols which are relevant only with deleted subsystems (e.g., UCAN).
    f) Bluetooth functionality 
    g) Debug functionality
    h) Common functionality which is also removed from tinyconfig. These may be needed, but according to our plan, whoever complains is invited to do the work on any subsystem:
  • BLOCK
  • MULTIUSER
  • TIMERFD
  • MEMBARRIER
  • COMPAT_BRK
  • PROC_SYSCTL
  • Enable CONFIG_PREEMPT_NONE
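As a rough sketch, the tweaks in points 1 and 3 could be expressed as scripts/config invocations. The Kconfig symbol names below are our best guesses for the items listed (verify them against your kernel version), and the script only prints the commands rather than applying them:

```shell
# Print the scripts/config commands implied by the guidelines above;
# pipe the output to sh from a kernel tree to actually apply them,
# then run 'make olddefconfig' to resolve dependencies.
emit_config_tweaks() {
    # Point 1: options tinyconfig needs back to be minimally bootable.
    for opt in BLK_DEV_INITRD PRINTK BINFMT_ELF PROC_FS SYSFS; do
        printf 'scripts/config -e CONFIG_%s\n' "$opt"
    done
    # Point 3 (subset): candidate subsystems to drop from the test scope.
    for opt in DRM SOUND USB_SUPPORT BT PNP; do
        printf 'scripts/config -d CONFIG_%s\n' "$opt"
    done
    printf 'scripts/config -e CONFIG_PREEMPT_NONE\n'
}
emit_config_tweaks
```

Keeping the tweaks in one generated list also gives the clear scope of what is and is not covered that point 2(b) asks for.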
Regards
Elana


From: devel@... <devel@...> on behalf of Lukas Bulwahn <lukas.bulwahn@...>
Sent: Monday, November 30, 2020 2:01 PM
To: devel@... <devel@...>
Subject: [ELISA Technical Community] Kernel Configurations in the Tool Investigation and Code Improvement Subgroup
 
Dear all,

In the last meeting, it was requested to share the kernel
configurations the Tool Investigation and Code Improvement Subgroup is
using.

We currently use:
  - x86-64 tinyconfig: this can be created with 'make tinyconfig' for
initial investigations of tools (to get a small overview of the number
of findings on a very small config.)
  - x86-64 defconfig without DRM, SOUND, USB for the CI system. You
can obtain the config with:

for clang-analyzer:

make CC=clang defconfig
scripts/config -d CONFIG_DRM
scripts/config -d CONFIG_SOUND
scripts/config -d CONFIG_USB_SUPPORT

for smatch:

make defconfig
scripts/config -d CONFIG_DRM
scripts/config -d CONFIG_SOUND
scripts/config -d CONFIG_USB_SUPPORT

If you have any feedback on this kernel configuration, please let us know.

As of now, it looks like a manageable code base to continue our investigations.

Best regards,

Lukas






ELISA Workshop #6 Topic Idea - Code Coverage analysis for GLibC

Gurvitz, Eli (Mobileye)
 

Hi,

 

- Topic idea: 

Structural coverage analysis is required by safety standards, and that requirement includes Linux and glibc.

Building glibc with instrumentation for code coverage analysis (gcov) was not possible in previous versions and is still tricky now.

In this presentation we will describe the process of instrumenting glibc for coverage and the patch to glibc that we uploaded.

We will also show a hands-on example.

- This work was done as an ELISA mentorship project.

 

- What you hope to accomplish in the session: 

* Share knowledge about how to instrument glibc for coverage

* Provide a hands-on example of how this is done.

 

- Any critical participants needed for the discussion –

* Eli Gurvitz

* Ashutosh Pandey


- Estimated time needed for the session: 
medium – 60-minute presentation

 

Thanks,

Eli Gurvitz

 

---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


ELISA Workshop #6 - Topic proposal - Intel's Linux Test Robot

Gurvitz, Eli (Mobileye)
 

Hi,

 

- Topic idea: 

* Oliver Sang and Philip Li from Intel will present Intel’s Linux Test Robot (LKP) - https://01.org/lkp/

 

 

- What you hope to accomplish in the session: 

* Introduce the ELISA community to the Intel LKP project

* Consider how Intel’s reports can be used in the field of Safety

* Consider cooperation and contribution by ELISA members to LKP

 

- Any critical participants needed for the discussion -


- Estimated time needed for the session: 
medium – 30-minute presentation + 15 minutes of Q&A

 

Thanks,

Eli Gurvitz



REMINDER: ELISA Workshop #6 Dates Survey - Please Complete by Tuesday, December 1st

Min Yu
 

Just a quick reminder to please complete the poll by tomorrow, Tuesday, December 1st, about the best dates for ELISA Workshop #6.

Doodle: ELISA Workshop #6 Dates


--
Min Yu
Operations Manager
The Linux Foundation
+1(530) 902-6464 (m)

---------- Forwarded message ---------
From: Min Yu <myu@...>
Date: Fri, Nov 20, 2020 at 11:17 AM
Subject: ELISA Workshop #6 Dates Survey - Please Complete by Tuesday, December 1st
To: <devel@...>


Dear all,

We're starting to plan for the ELISA Workshop #6. It will be held again in a virtual format over the course of 3 days and across multiple time zones.

To help us get a sense of your availability, could you please take a moment to complete this poll below?

regards,
Min
--
Min Yu
Operations Manager
The Linux Foundation
+1(530) 902-6464 (m)


Kernel Configurations in the Tool Investigation and Code Improvement Subgroup

Lukas Bulwahn
 

Dear all,

In the last meeting, it was requested to share the kernel
configurations the Tool Investigation and Code Improvement Subgroup is
using.

We currently use:
- x86-64 tinyconfig: this can be created with 'make tinyconfig' for
initial investigations of tools (to get a small overview of the number
of findings on a very small config.)
- x86-64 defconfig without DRM, SOUND, USB for the CI system. You
can obtain the config with:

for clang-analyzer:

make CC=clang defconfig
scripts/config -d CONFIG_DRM
scripts/config -d CONFIG_SOUND
scripts/config -d CONFIG_USB_SUPPORT

for smatch:

make defconfig
scripts/config -d CONFIG_DRM
scripts/config -d CONFIG_SOUND
scripts/config -d CONFIG_USB_SUPPORT
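One practical note (our suggestion, not part of the original instructions): after dropping options with scripts/config, dependent symbols can be left stale, so it is worth letting kconfig settle them non-interactively before building:

```shell
# Re-resolve config dependencies after the scripts/config calls above;
# run from the kernel tree.
make olddefconfig
```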

If you have any feedback on this kernel configuration, please let us know.

As of now, it looks like a manageable code base to continue our investigations.

Best regards,

Lukas


Re: [EXT] [ELISA Technical Community] Maintainers Expectations vs. Maintainers Reality: An Analysis of Organisational and Maintenance Structure of the Linux Kernel

Lukas Bulwahn
 

On Mon, Nov 23, 2020 at 5:11 PM Ralf Ramsauer
<ralf.ramsauer@oth-regensburg.de> wrote:

Hi,

On 23/11/2020 11:14, Lukas Bulwahn wrote:
Dear all,

Pia Eichinger, a student at OTH Regensburg, mentored by Ralf Ramsauer
and Wolfgang Mauerer, has written her bachelor thesis on Maintainers
Expectations vs. Maintainers Reality: An Analysis of Organisational
and Maintenance Structure of the Linux Kernel. Simply quoting her
conclusion:

"We showed that around 20% of all patches were theoretically wrongly
integrated when strictly analysing MAINTAINERS. The reality of
integration and maintenance structure is more complicated than that,
which we also explored. Furthermore, we identified 12 major subsystems
of the Linux kernel. This is very helpful for an overview of the
organisational structure, realistic grouping of subsystems and further
Linux kernel topology discussions."
Let me add, by manual investigation, we found some patterns within
"wrongly integrated" patches. Just to give a concrete example, Jakub
Kicinski started to integrate patches for NETWORKING, before David
Miller added him as maintainer. And Jakub has a long history of writing
code. So to some degree, it's reasonable what was happening.
Another example: an employee at Broadcom committed code for another
Broadcom device, in violation of MAINTAINERS. Again, reasonable to
some degree; at least the code was publicly discussed before integration.

From a formal standpoint, these are "violations" of the integration
process, depending on a definition of how patch integration actually
should/must work.

This is exactly why we started to dig deeper and tried to get an
overview of what kernel subsystems actually exist, the second part of
Pia's work. Sections in MAINTAINERS are tightly coupled to the file
structure of the kernel, and sections in MAINTAINERS overlap.
Our assumption is that sections with high overlap form "clusters" of
responsibility - or, in other words, subsystems. And if you check
whether a patch was integrated by a person within a subsystem, the
ratio will likely be higher than 80% - but we haven't done that yet.
Agree. I would add that Pia's work is good research per the
following definition: every good piece of research raises many new
relevant research questions while answering a first, clear research
question at hand.

And here we need more people/students/industry partners (collaboration
projects) funding research institutes to follow up on those points. As
you made clear, there are some further points to consider...

I would like to add that some simple properties of MAINTAINERS are
obviously not clean; e.g., 10% of the files in the repository do not
even have a section they belong to (other than THE REST). So, any
change touching only those files would immediately count as a 'process
violation' according to the definition above.
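To a first approximation, F: entries in MAINTAINERS are path prefixes (directories end in '/'), so the "no section" check is a prefix match. A toy sketch with made-up sample patterns and file names, not the real MAINTAINERS:

```shell
# covered FILE: succeed if FILE matches one of the sample F: prefixes.
covered() {
    for pat in drivers/net/ include/net/; do
        case $1 in "$pat"*) return 0 ;; esac
    done
    return 1
}
# Files that no section covers fall to THE REST.
for f in drivers/net/e1000.c include/net/sock.h fs/orphan.c; do
    covered "$f" || echo "THE REST: $f"
done
# prints: THE REST: fs/orphan.c
```

The real file format also allows glob patterns and exclusions (X: entries), which a serious analysis has to handle; this only illustrates the shape of the check.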


I have placed a copy of her thesis in the Google Drive folder (under
Technical Community/Development Process Working
Group/Development-Process-Data-Mining) here:

https://drive.google.com/file/d/12ta2YxgEzEfrIcmWid8kwIyVEywbUjbA/view?usp=sharing

Pia, Ralf, Wolfgang, congratulations on this work and thank you very
much for your investigation.

Just to give a quick insight into how the various strands of work are related:

- The development process working group has set the goal to understand
the development management and change management of the Linux kernel.
Due to priorities, the group has not worked on the distributed
responsibility model for different parts of the overall code base and
responsibility for change management in much detail yet. I would
expect that the group eventually claims that "the person that is
responsible for the acceptance of a change (a patch) is determined by
the information provided in the MAINTAINERS file."
Yes, we lack such a clear statement. The upstream kernel lacks a clear
definition as well: the official documentation states that
get_maintainer.pl can be helpful for finding recipients, and otherwise
AKPM is the last resort. That's all in the official submission guide.

We need two definitions: Who should be addressed by the author, and who
should finally be in charge of integrating a patch. MAINTAINERS provides
answers for both questions.


- Based on that hypothesis, Pia determined the evidence for this claim
and the truth of that statement. The evidence suggests that it is 80%
true.

- The remaining challenge is now to determine whether the expectations
of a person assessing the development process and the evidence
provided actually meet. In other words, is "80% truth" for this claim
good enough? If this 80% conformance statement based on evidence is
not sufficient, an extended and refined claim, data preparation,
collection, measurement and interpretation is required. Hopefully, the
Evidence Working Group can eventually formulate such research
questions, interpret results appropriately and refine the analysis,
e.g., create scripts to make the documentation reflect the executed
reality, determine the "right" rules for interpretation and document
them, and explain how to come to actual risk assessments.

Some personal notes:

That 20% of patches are not integrated according to the description in
MAINTAINERS might be surprising to outsiders, and especially to
assessors who believe that the working model is fully described and
documented in a completely structured manner that reflects reality. The
challenge is really to understand if these 20% are A. due to other
implicit rules that have not been stated explicitly and outdated data
on organisational structures (which implies only a low risk on the
delivered product) or B. due to chaotic integration schemes, i.e., for
20% changes anyone can add anything (which implies a higher risk on
the delivered product).
Don't forget that MAINTAINERS is a constantly evolving document.

$ git log --since=2020-01-01 --until=2020-07-01 \
--pretty=format:%H MAINTAINERS | wc -l
739

That's a lot of changes, and it introduces some uncertainty. For
example, if we check if a patch was addressed correctly, we pick the
state of MAINTAINERS that corresponds with the Date header of the mail
with the patch. Yet we don't know which version the author chose.

Maybe the committer was correct at the moment of integration? Unlikely,
but possible. It's hard to quantify...


If it is due to A, the Evidence Working Group can make those implicit
rules more explicit and update the data on the organisation, and hence
new measurements will lower the criticality and the risk of the
unknown in the challenge above.
If it is due to B, the change management induces a risk with a
specific known upper bound for specific areas, which needs to be
considered by the risk management of the downstream integrator. In the
worst case, this ends with the decision that the responsibility for
the observed code integration is too chaotic to meet the business
needs of the downstream integrator. (In other words, not everything in
the kernel repository is gold; if you rely on crappy code in the
repository, you will ship crap, or, if you are well informed, decide
not to ship crap...)
Of course, there's a reason why there is a MAINTAINERS. I guess we can
agree that it would be worse without. And at the moment we have nothing
to compare with: is 80% rather high or low?

For another ongoing large-scale study, we had a look at a dozen other
projects besides Linux. And there exist (especially userland) projects
that you even might ship in your products, where the vast majority of
commits have never been seen on any list and have never been reviewed by
anyone else but the author - contrary to the submission guidelines of
the project and confirmed by maintainers. :-)

So my hypothesis is that "sticking to development guidelines" and
"software quality" strongly correlate with the privilege level of the
processor where the software executes.


So... I guess the Evidence Working Group has created yet another work
result, even without kicking off... :) Paul, I would certainly
appreciate it if you would like to continue the investigation of this
evidence together with me (and maybe Ralf and Wolfgang) and future
prospective bachelor and master students.
Thanks! Sorry for not kicking off yet, so much work, so little time...
Ralf, I think we actually kicked off many years ago and have been
working for many years now, but others who have stated interest have
not joined yet ;)

Lukas


Re: [EXT] [ELISA Technical Community] Maintainers Expectations vs. Maintainers Reality: An Analysis of Organisational and Maintenance Structure of the Linux Kernel

Ralf Ramsauer
 

Hi,

On 23/11/2020 11:14, Lukas Bulwahn wrote:
Dear all,

Pia Eichinger, a student at OTH Regensburg, mentored by Ralf Ramsauer
and Wolfgang Mauerer, has written her bachelor thesis on Maintainers
Expectations vs. Maintainers Reality: An Analysis of Organisational
and Maintenance Structure of the Linux Kernel. Simply quoting her
conclusion:

"We showed that around 20% of all patches were theoretically wrongly
integrated when strictly analysing MAINTAINERS. The reality of
integration and maintenance structure is more complicated than that,
which we also explored. Furthermore, we identified 12 major subsystems
of the Linux kernel. This is very helpful for an overview of the
organisational structure, realistic grouping of subsystems and further
Linux kernel topology discussions."
Let me add, by manual investigation, we found some patterns within
"wrongly integrated" patches. Just to give a concrete example, Jakub
Kicinski started to integrate patches for NETWORKING, before David
Miller added him as maintainer. And Jakub has a long history of writing
code. So to some degree, it's reasonable what was happening.
Another example: an employee at Broadcom committed code for another
Broadcom device, in violation of MAINTAINERS. Again, reasonable to
some degree; at least the code was publicly discussed before integration.

From a formal standpoint, these are "violations" of the integration
process, depending on a definition of how patch integration actually
should/must work.

This is exactly why we started to dig deeper and tried to get an
overview of what kernel subsystems actually exist, the second part of
Pia's work. Sections in MAINTAINERS are tightly coupled to the file
structure of the kernel, and sections in MAINTAINERS overlap.
Our assumption is that sections with high overlap form "clusters" of
responsibility - or, in other words, subsystems. And if you check
whether a patch was integrated by a person within a subsystem, the
ratio will likely be higher than 80% - but we haven't done that yet.


I have placed a copy of her thesis in the Google Drive folder (under
Technical Community/Development Process Working
Group/Development-Process-Data-Mining) here:

https://drive.google.com/file/d/12ta2YxgEzEfrIcmWid8kwIyVEywbUjbA/view?usp=sharing

Pia, Ralf, Wolfgang, congratulations on this work and thank you very
much for your investigation.

Just to give a quick insight into how the various strands of work are related:

- The development process working group has set the goal to understand
the development management and change management of the Linux kernel.
Due to priorities, the group has not worked on the distributed
responsibility model for different parts of the overall code base and
responsibility for change management in much detail yet. I would
expect that the group eventually claims that "the person that is
responsible for the acceptance of a change (a patch) is determined by
the information provided in the MAINTAINERS file."
Yes, we lack such a clear statement. The upstream kernel lacks a clear
definition as well: the official documentation states that
get_maintainer.pl can be helpful for finding recipients, and otherwise
AKPM is the last resort. That's all in the official submission guide.

We need two definitions: Who should be addressed by the author, and who
should finally be in charge of integrating a patch. MAINTAINERS provides
answers for both questions.


- Based on that hypothesis, Pia determined the evidence for this claim
and the truth of that statement. The evidence suggests that it is 80%
true.

- The remaining challenge is now to determine whether the expectations
of a person assessing the development process and the evidence
provided actually meet. In other words, is "80% truth" for this claim
good enough? If this 80% conformance statement based on evidence is
not sufficient, an extended and refined claim, data preparation,
collection, measurement and interpretation is required. Hopefully, the
Evidence Working Group can eventually formulate such research
questions, interpret results appropriately and refine the analysis,
e.g., create scripts to make the documentation reflect the executed
reality, determine the "right" rules for interpretation and document
them, and explain how to come to actual risk assessments.

Some personal notes:

That 20% of patches are not integrated according to the description in
MAINTAINERS might be surprising to outsiders, and especially to
assessors who believe that the working model is fully described and
documented in a completely structured manner that reflects reality. The
challenge is really to understand if these 20% are A. due to other
implicit rules that have not been stated explicitly and outdated data
on organisational structures (which implies only a low risk on the
delivered product) or B. due to chaotic integration schemes, i.e., for
20% changes anyone can add anything (which implies a higher risk on
the delivered product).
Don't forget that MAINTAINERS is a constantly evolving document.

$ git log --since=2020-01-01 --until=2020-07-01 \
--pretty=format:%H MAINTAINERS | wc -l
739

That's a lot of changes, and it introduces some uncertainty. For
example, if we check if a patch was addressed correctly, we pick the
state of MAINTAINERS that corresponds with the Date header of the mail
with the patch. Yet we don't know which version the author chose.

Maybe the committer was correct at the moment of integration? Unlikely,
but possible. It's hard to quantify...
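The lookup Ralf describes (picking the MAINTAINERS state that corresponds to a patch's Date header) can be sketched with plain git. The repository, dates, and contents below are fabricated for illustration:

```shell
# Build a throwaway repo with two MAINTAINERS revisions at fake dates,
# then recover the revision that was current at a given patch date.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name test
git config user.email test@example.com

printf 'M: alice\n' > MAINTAINERS
git add MAINTAINERS
GIT_COMMITTER_DATE='2020-01-01T12:00:00 +0000' git commit -q -m v1

printf 'M: bob\n' > MAINTAINERS
git add MAINTAINERS
GIT_COMMITTER_DATE='2020-07-01T12:00:00 +0000' git commit -q -m v2

# Newest commit whose committer date precedes the (fake) patch date.
rev=$(git rev-list -1 --before='2020-03-01' HEAD)
git show "$rev":MAINTAINERS
# prints: M: alice
```

As Ralf notes, even this only approximates reality: the author may have consulted an older or newer MAINTAINERS than the one current at the mail's Date header.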


If it is due to A, the Evidence Working Group can make those implicit
rules more explicit and update the data on the organisation, and hence
new measurements will lower the criticality and the risk of the
unknown in the challenge above.
If it is due to B, the change management induces a risk with a
specific known upper bound for specific areas, which needs to be
considered by the risk management of the downstream integrator. In the
worst case, this ends with the decision that the responsibility for
the observed code integration is too chaotic to meet the business
needs of the downstream integrator. (In other words, not everything in
the kernel repository is gold; if you rely on crappy code in the
repository, you will ship crap, or, if you are well informed, decide
not to ship crap...)
Of course, there's a reason why there is a MAINTAINERS. I guess we can
agree that it would be worse without. And at the moment we have nothing
to compare with: is 80% rather high or low?

For another ongoing large-scale study, we had a look at a dozen other
projects besides Linux. And there exist (especially userland) projects
that you even might ship in your products, where the vast majority of
commits have never been seen on any list and have never been reviewed by
anyone else but the author - contrary to the submission guidelines of
the project and confirmed by maintainers. :-)

So my hypothesis is that "sticking to development guidelines" and
"software quality" strongly correlate with the privilege level of the
processor where the software executes.


So... I guess the Evidence Working Group has created yet another work
result, even without kicking off... :) Paul, I would certainly
appreciate it if you would like to continue the investigation of this
evidence together with me (and maybe Ralf and Wolfgang) and future
prospective bachelor and master students.
Thanks! Sorry for not kicking off yet, so much work, so little time...

Ralf



Lukas





Re: [ELISA Development Process WG] [ELISA Technical Community] Maintainers Expectations vs. Maintainers Reality: An Analysis of Organisational and Maintenance Structure of the Linux Kernel

elana.copperman@...
 

Good luck!
An interesting note would be to compare your results to the evidence of process compliance in non-open-source (commercial) software development.
And resulting implications for quality.
Interesting indeed.
Regards
Elana


From: development-process@... <development-process@...> on behalf of Lukas Bulwahn <lukas.bulwahn@...>
Sent: Monday, November 23, 2020 1:25 PM
To: Paul Albertella <paul.albertella@...>
Cc: devel@... <devel@...>; development-process@... <development-process@...>; Pia Eichinger <pia.eichinger@...>; Ralf Ramsauer <ralf.ramsauer@...>; Wolfgang Mauerer <wolfgang.mauerer@...>
Subject: Re: [ELISA Development Process WG] [ELISA Technical Community] Maintainers Expectations vs. Maintainers Reality: An Analysis of Organisational and Maintenance Structure of the Linux Kernel
 






Re: Maintainers Expectations vs. Maintainers Reality: An Analysis of Organisational and Maintenance Structure of the Linux Kernel

Lukas Bulwahn
 

On Mon, Nov 23, 2020 at 12:17 PM Paul Albertella
<paul.albertella@codethink.co.uk> wrote:



On 23/11/2020 10:14, Lukas Bulwahn wrote:
So... I guess the Evidence Working Group has created yet another work
result, even without kicking off... :) Paul, I would certainly appreciate
it if you would like to continue the investigation of this evidence
together with me (and maybe Ralf and Wolfgang) and future prospective
bachelor and master students.
Yes, I'd like to do that :-)
Let us get started then... :)


This is a good example of Process vs Policy vs Practice (see my previous
post [1]), where Practice (as shown by Evidence) does not necessarily
comply with a Policy (partially documented in this case by MAINTAINERS),
perhaps because a Process (a description of the steps that should inform
Practice, and the order in which they may occur) is not clearly defined.

A question we might ponder: if we can make the Policy more explicit and
enforce it by automation (e.g., a script to check compliance), then
perhaps a documented Process is less important...
...in the end, the result of clear Policy, Process and Practice
matching each other is more "trustable software"...

Lukas


Re: [ELISA Development Process WG] Maintainers Expectations vs. Maintainers Reality: An Analysis of Organisational and Maintenance Structure of the Linux Kernel

Lukas Bulwahn
 

On Mon, Nov 23, 2020 at 11:47 AM Elana Copperman
<Elana.Copperman@mobileye.com> wrote:

Very interesting work, Lukas - thanks for bringing to our attention.
And there is certainly plenty of work to follow up.

Actually - I find the results to be quite complimentary. Given the open-source mindset and heavy reliance on voluntary contributions, the findings are actually better than my expectations.
As noted by Pia (page 22):
Our analysis shows that a high ratio of all recent patches was not integrated according to process, even though the majority was. This does not implicitly mean that 20% of all integrated patches during that time window were directly harmful or integrated by maintainers who completely lack expertise. Since the patches were found on public mailing lists before integration, it is safe to assume that the majority of these patches were exposed to discussion and reviewing.
What this means is that this research work relates to the quality of the release / CM process, not necessarily to the quality of the code itself (design, development, testing).
Interesting definition of software quality...

80% compliance for a voluntary community is quite amazing. I wonder what comparable rates would be for commercial software development entities with mandatory processes - based on intuition only (no evidence yet) from my long-term experience, I would not guess the compliance rate to be as high as 80%. But then again, an interesting point for comparison.
Yes, that is the challenge of interpretation of such data... what does
"good" actually mean and what is "good enough"?

Elana, what do you mean by "voluntary"? Where is the evidence that
the changes and work are done on a voluntary basis and accepted on a
voluntary basis?

In practice, this means we only see the opening shot here. More evidence needs to be collected and analyzed. But indeed, a very interesting starting point.
Agree. The interpretation of such data needs people with a good
overall understanding of the kernel community and kernel development,
and long-term involvement (certainly longer than just a few weeks); hence,
an Evidence working group that slowly learns how to interpret such
data and starts to educate itself in this area.

Lukas


Re: Maintainers Expectations vs. Maintainers Reality: An Analysis of Organisational and Maintenance Structure of the Linux Kernel

Paul Albertella
 

On 23/11/2020 10:14, Lukas Bulwahn wrote:
So... I guess the Evidence Working Group has created yet another work
result, even without kicking off... :) Paul, I would certainly appreciate
it if you would like to continue the investigation of this evidence
together with me (and maybe Ralf and Wolfgang) and future prospective
bachelor and master students.
Yes, I'd like to do that :-)

This is a good example of Process vs Policy vs Practice (see my previous post [1]), where Practice (as shown by Evidence) does not necessarily comply with a Policy (partially documented in this case by MAINTAINERS), perhaps because a Process (a description of the steps that should inform Practice, and the order in which they may occur) is not clearly defined.

A question we might ponder: if we can make the Policy more explicit and enforce it by automation (e.g., a script to check compliance), then perhaps a documented Process is less important...
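As a thought experiment, such an automated check could be sketched in a few lines. The snippet below is purely illustrative - it assumes a drastically simplified MAINTAINERS-style format (only `M:` and `F:` lines, and plain path-prefix `F:` patterns) and made-up commit data; a real check would need the full field set and pattern semantics of the actual MAINTAINERS file (or `scripts/get_maintainer.pl`):

```python
# Hypothetical sketch: did a patch's integrator appear as a maintainer of
# every subsystem the patch touches? MAINTAINERS format is simplified to
# section name, "M:" maintainer lines, and "F:" path-prefix lines.
MAINTAINERS = """\
EXT4 FILE SYSTEM
M:      Theodore Ts'o <tytso@mit.edu>
F:      fs/ext4/

KUNIT TEST FRAMEWORK
M:      Brendan Higgins <brendanhiggins@google.com>
F:      lib/kunit/
"""

def parse_maintainers(text):
    """Return a list of (maintainers, file_patterns), one per section."""
    sections, maintainers, patterns = [], [], []
    for line in text.splitlines():
        if line.startswith("M:"):
            maintainers.append(line[2:].strip())
        elif line.startswith("F:"):
            patterns.append(line[2:].strip())
        elif not line.strip() and (maintainers or patterns):
            sections.append((maintainers, patterns))  # blank line ends a section
            maintainers, patterns = [], []
    if maintainers or patterns:
        sections.append((maintainers, patterns))
    return sections

def integrated_by_listed_maintainer(integrator, touched_files, sections):
    """True if the integrator is listed for every section the patch touches."""
    for maintainers, patterns in sections:
        touches = any(f.startswith(p) for p in patterns for f in touched_files)
        if touches and not any(integrator in m for m in maintainers):
            return False
    return True

sections = parse_maintainers(MAINTAINERS)
print(integrated_by_listed_maintainer(
    "tytso@mit.edu", ["fs/ext4/inode.c"], sections))        # True
print(integrated_by_listed_maintainer(
    "someone@example.com", ["fs/ext4/inode.c"], sections))  # False
```

Even this toy version hints at why the Policy question matters: the script can only enforce rules that MAINTAINERS actually states.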

Cheers,

Paul

[1] https://lists.elisa.tech/g/development-process/message/582


Re: [ELISA Development Process WG] Maintainers Expectations vs. Maintainers Reality: An Analysis of Organisational and Maintenance Structure of the Linux Kernel

elana.copperman@...
 

Very interesting work, Lukas - thanks for bringing it to our attention.
And there is certainly plenty of work to follow up.

Actually - I find the results to be quite complimentary. Given the open-source mindset and heavy reliance on voluntary contributions, the findings are actually better than my expectations.
As noted by Pia (page 22):
Our analysis shows that a high ratio of all recent patches was not integrated according to process, even though the majority was. This does not implicitly mean that 20% of all integrated patches during that time window were directly harmful or integrated by maintainers who completely lack expertise. Since the patches were found on public mailing lists before integration, it is safe to assume that the majority of these patches were exposed to discussion and reviewing.
What this means is that this research work relates to the quality of the release / CM process, not necessarily to the quality of the code itself (design, development, testing).
80% compliance for a voluntary community is quite amazing. I wonder what comparable rates would be for commercial software development entities with mandatory processes - based on intuition only (no evidence yet) from my long-term experience, I would not guess the compliance rate to be as high as 80%. But then again, an interesting point for comparison.

In practice, this means we only see the opening shot here. More evidence needs to be collected and analyzed. But indeed, a very interesting starting point.
Thanks, Lukas, for sharing.
Regards
Elana


From: development-process@... <development-process@...> on behalf of Lukas Bulwahn <lukas.bulwahn@...>
Sent: Monday, November 23, 2020 12:14 PM
To: devel@... <devel@...>; development-process@... <development-process@...>
Cc: Pia Eichinger <pia.eichinger@...>; Ralf Ramsauer <ralf.ramsauer@...>; Wolfgang Mauerer <wolfgang.mauerer@...>; Paul Albertella <paul.albertella@...>
Subject: [ELISA Development Process WG] Maintainers Expectations vs. Maintainers Reality: An Analysis of Organisational and Maintenance Structure of the Linux Kernel
 






Maintainers Expectations vs. Maintainers Reality: An Analysis of Organisational and Maintenance Structure of the Linux Kernel

Lukas Bulwahn
 

Dear all,

Pia Eichinger, a student at OTH Regensburg, mentored by Ralf Ramsauer
and Wolfgang Mauerer, has written her bachelor thesis on Maintainers
Expectations vs. Maintainers Reality: An Analysis of Organisational
and Maintenance Structure of the Linux Kernel. Simply quoting her
conclusion:

"We showed that around 20% of all patches were theoretically wrongly
integrated when strictly analysing MAINTAINERS. The reality of
integration and maintenance structure is more complicated than that,
which we also explored. Furthermore, we identified 12 major subsystems
of the Linux kernel. This is very helpful for an overview of the
organisational structure, realistic grouping of subsystems and further
Linux kernel topology discussions."

I have placed a copy of her thesis in the Google Drive folder (under
Technical Community/Development Process Working
Group/Development-Process-Data-Mining) here:

https://drive.google.com/file/d/12ta2YxgEzEfrIcmWid8kwIyVEywbUjbA/view?usp=sharing

Pia, Ralf, Wolfgang, congratulations on this work and thank you very
much for your investigation.

Just to give a quick insight into how the various strands of work are related:

- The development process working group has set the goal to understand
the development management and change management of the Linux kernel.
Due to priorities, the group has not worked on the distributed
responsibility model for different parts of the overall code base and
responsibility for change management in much detail yet. I would
expect that the group eventually claims that "the person that is
responsible for the acceptance of a change (a patch) is determined by
the information provided in the MAINTAINERS file."

- Based on that hypothesis, Pia gathered the evidence for this claim
and assessed the truth of that statement. The evidence suggests that
this is 80% true.
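Measured this way, the conformance figure reduces to a simple ratio once each integrated patch has been matched against the maintainers listed for the subsystem it touches. A minimal sketch of that computation, with entirely made-up names and records:

```python
# Illustrative only: each record pairs a patch's integrator with the set of
# maintainers listed (per MAINTAINERS) for the subsystem the patch touches.
# All data below is hypothetical.
patches = [
    {"integrator": "alice", "listed": {"alice", "bob"}},
    {"integrator": "bob",   "listed": {"alice", "bob"}},
    {"integrator": "carol", "listed": {"alice", "bob"}},  # not a listed maintainer
    {"integrator": "dave",  "listed": {"dave"}},
    {"integrator": "erin",  "listed": {"dave", "erin"}},
]

# Count patches integrated by someone listed for the touched subsystem.
conforming = sum(p["integrator"] in p["listed"] for p in patches)
rate = 100.0 * conforming / len(patches)
print(f"{conforming}/{len(patches)} integrated by a listed maintainer ({rate:.0f}%)")
# → 4/5 integrated by a listed maintainer (80%)
```

The hard part, of course, is not this ratio but the mapping behind it: deciding which MAINTAINERS entries "cover" a patch and who counts as the integrator.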

- The remaining challenge is now to determine whether the expectations
of a person assessing the development process and the evidence provided
meet. In other words, is "80% truth" for this claim good enough? If
this 80% conformance statement based on evidence is not sufficient, an
extended and refined claim, data preparation, collection, measurement
and interpretation are required. Hopefully, the Evidence Working Group
can eventually formulate such research questions, interpret results
appropriately and refine the analysis, e.g., create scripts to have
the documentation reflect the executed reality, determine the "right"
rules for interpretation and document those, and explain how to come
to actual risk assessments.

Some personal notes:

That 20% of patches are not integrated according to the description in
MAINTAINERS might be surprising to outsiders, and especially to
assessors who believe that the working model is fully described and
documented in a 100% structured and reality-reflecting manner. The
challenge is really to understand whether these 20% are A. due to other
implicit rules that have not been stated explicitly and outdated data
on organisational structures (which implies only a low risk on the
delivered product) or B. due to chaotic integration schemes, i.e., for
20% of changes anyone can add anything (which implies a higher risk on
the delivered product).

If it is due to A, the Evidence Working Group can make those implicit
rules more explicit and update the data on organisation, and hence new
measurements will lower the criticality and the risk of the unknown in
the challenge above.
If it is due to B, the change management induces a risk with a
specific known upper bound for specific areas, which needs to be
considered by the risk management of the downstream integrator. In the
worst case, this ends with the decision that the observed code
integration is too chaotic to meet the business needs of the
downstream integrator. (In other words, not everything in the
kernel repository is gold; if you rely on crappy code in the
repository, you will ship crap, or, if you are well informed, decide not
to ship crap...)

So... I guess the Evidence Working Group has created yet another work
result, even without kicking off... :) Paul, I would certainly appreciate
it if you would like to continue the investigation of this evidence
together with me (and maybe Ralf and Wolfgang) and future prospective
bachelor and master students.


Lukas


Re: ELISA Workshop #6 Proposal - Kernel configurations for safety critical applications

elana.copperman@...
 

- Topic idea:  To introduce our taxonomy for kernel configurations for safety, and to agree on a set of criteria for existing and/or new configurations which can support qualification of a Linux kernel image for use in
safety-critical applications
- What you hope to accomplish in the session:  To bring together Linux users who are familiar with existing kernel configurations and features for an in-depth discussion on specific features and options.  Some examples:
  • Memory protection (kernel memory, heap, stack)
  • Timing and execution in concurrent programs
- Any critical participants needed for the discussion - Shuah, Elana

- Estimated time needed for the sessions:  short (30 min), medium (60),
   long (90), extra-long (120 or more) - Extra long (workshop), 2 hours

We should decide if the content is mature enough to add a link (open to all for viewing only) to the current spreadsheet draft.

Elana


From: devel@... <devel@...> on behalf of Shuah Khan <skhan@...>
Sent: Friday, November 20, 2020 1:18 AM
To: devel@... <devel@...>
Cc: Shuah Khan <skhan@...>
Subject: [ELISA Technical Community] ELISA Workshop #6 Call for Topics and Sessions
 
All,

Yes, it is time to prepare for ELISA Workshop #6 towards the end of
January or early February. With just a few weeks left this year,
I am getting started early with this call for topics.

Possible dates for the conference (avoiding conflicts with LCA & FOSDEM):
Jan 26-28 or Feb 2-4, 2021 - a Doodle poll will be sent out soon to
make the final call.

If you have an area of interest you want to see discussed in the ELISA
Workshop #6, please send to this email list, with the subject "Workshop
#6 Proposal - <your topic idea>".

In the body of the email, please include:
- Topic idea
- What you hope to accomplish in the session
- Any critical participants needed for the discussion
- Estimated time needed for the sessions: short (30 min), medium (60),
   long (90), extra-long (120 or more)

Sessions can include Presentations, Technical Discussion Groups,
Planning Discussions, Tutorials or Dedicated Working Sessions. Working
Sessions are intended solely for working on a shared piece of work: a session
to write a document or some piece of code together, to try out some tool
together, etc. (but no presentation).

thanks,
-- Shuah






Re: ELISA Workshop #6 Proposal - Kernel testing reference process and follow ups for ELISA

elana.copperman@...
 

- Topic idea:  Wrap-up report by the Development Process WG on its analysis of the kernel test process, focusing on identifying and prioritizing follow-up tasks on which ELISA may focus

- What you hope to accomplish in the session:  WG members are engaged in a safety analysis of the Linux kernel test process, based on safety standards and a defined reference process.  In this workshop, we will introduce newcomers to the ongoing safety assessment process, and attempt to converge on initial outcomes and next steps.  Newcomers and new ideas are welcome.

- Any critical participants needed for the discussion - Kate, Paul, Pete, Elana

- Estimated time needed for the sessions:  short (30 min), medium (60),
   long (90), extra-long (120 or more) - Extra long (workshop), 2 hours or more, as time permits.

We should decide if the content is mature enough to add a link (open to all for viewing only) to the current spreadsheet draft.

Elana


From: devel@... <devel@...> on behalf of Shuah Khan <skhan@...>
Sent: Friday, November 20, 2020 1:18 AM
To: devel@... <devel@...>
Cc: Shuah Khan <skhan@...>
Subject: [ELISA Technical Community] ELISA Workshop #6 Call for Topics and Sessions
 




