Task proposal: Verification tool finding management


Lukas Bulwahn
 

Dear all,

I rewrote/rephrased Nicholas' "False positive management infrastructure" proposal (I named it "Verification tool finding management", which is still clumsy, but at least it does not suggest we are managing only false positives, which sounds even more "wasteful").


Verification tool finding management

Goal: Verification tool finding management established 

Generate a public database of false-positive findings in order to make the current mainline tools easier to use and, thus, more attractive for kernel developers. 

Needed activities

- Investigation of reported data for existing tools
- Definition of a suitable format and process for a shared, distributed database and for collaborative entry collection (see the format sketch after this list)
- Tools for managing the database (command-line tools & web-based tools)
- Tools for transferring verification tool finding assessments across evolving software versions
- Extension of existing tools to filter findings against the available database
- Outreach to tool developers and dissemination into kernel community
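
To make the "suitable format" activity a bit more concrete, here is a minimal sketch of what one database entry could look like, emitted as JSON from a Python snippet. All field names are illustrative assumptions, not a settled schema:

import json

# One finding entry; every field name is an illustrative assumption,
# not a settled schema.
finding = {
    "tool": "sparse",                 # reporting verification tool
    "tool_version": "0.6.0",          # version that produced the finding
    "kernel_commit": "<commit sha>",  # analyzed kernel tree state
    "kernel_config": "defconfig",     # configuration used for the run
    "location": {
        "file": "drivers/foo/bar.c",
        "line": 123,
        "function": "bar_probe",
    },
    "message": "incorrect type in assignment (different address spaces)",
    "assessment": "false-positive",   # e.g. unassessed | bug | false-positive
    "assessed_by": "jane@example.com",
    "rationale": "pointer is cast back before it is dereferenced",
}

print(json.dumps(finding, indent=2))

Keeping one such JSON object per finding in a plain git repository could be one way to realize the "shared, distributed" part.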

Effort estimation

- Investigation of reported data for existing tools: 0.5 months effort
- Definition of a suitable format and process for a shared, distributed database and for collaborative entry collection: 0.5 months effort
- Tools for managing the database (command-line tools & web-based tools): 4 months development effort (could be done by a SW dev. team of 2-3 people)
- Tools for transferring verification tool finding assessments across evolving software versions: 6 months development effort (could be done by a SW dev. team of 2-3 people)
- Extension of existing tools to filter findings against the available database: 1 month of development effort (could be done by a SW dev. team of 2-3 people)
- Outreach to tool developers and dissemination into kernel community: 1 month of dissemination activities

Total effort:
13 months of software development effort

Deliverables

- Implementation of the database and management tool, verification tool importers, and manual and semi-automatic evolution mapping (a sketch of the mapping idea follows this list)
- Documentation, articles to inform developers, presentations at conferences
- First database of findings for a suitable kernel configuration and version
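
As one illustration of what the semi-automatic evolution mapping could be, the sketch below remaps a finding's line number from an old version of a file to a new one with Python's difflib. This is an assumed minimal first algorithm (findings on unchanged lines keep their assessment; changed or removed lines need manual re-assessment), not a settled design:

import difflib

def remap_line(old_lines, new_lines, old_lineno):
    """Map a 1-based line number in the old file version to the new one.

    Returns the new 1-based line number, or None if the line was changed
    or removed, i.e. the finding needs manual re-assessment.
    """
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal" and i1 <= old_lineno - 1 < i2:
            return j1 + (old_lineno - 1 - i1) + 1
    return None

old = ["int a;\n", "int b;\n", "use(a);\n"]
new = ["int a;\n", "int c;\n", "int b;\n", "use(a);\n"]
print(remap_line(old, new, 3))  # "use(a);" moved from line 3 to line 4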

Timeline/Milestones

- M1: Concept for implementation available
- M2: Tool output can be imported into the database, managed in the application, and exported to filters (see the filter sketch after this list).
- M3: First algorithm for finding evolution implemented
- M4: Presentation of the current state of tooling at an international conference
- M5: Second algorithm for finding evolution implemented
- M6: Second presentation of the current state of tooling at an international conference
- M7: Building up a stable community around the software in the kernel janitors group
- M8: Algorithms validated at larger scale
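
To illustrate the "exported to filters" part of M2, here is a sketch of filtering a fresh tool run against the database. The diagnostic format (file:line:col: warning: message, the GCC-style format that e.g. sparse prints) and the file name findings.json are assumptions for illustration:

import json
import re
import sys

# Build a suppression set of known false positives from the database
# (entry format as sketched under "Needed activities").
with open("findings.json") as f:
    entries = json.load(f)
suppress = {
    (e["location"]["file"], e["location"]["line"], e["message"])
    for e in entries
    if e["assessment"] == "false-positive"
}

DIAG = re.compile(r"^(?P<file>[^:]+):(?P<line>\d+):\d+: warning: (?P<msg>.*)$")

for raw in sys.stdin:
    m = DIAG.match(raw.rstrip("\n"))
    if m and (m["file"], int(m["line"]), m["msg"]) in suppress:
        continue  # known false positive: suppress
    sys.stdout.write(raw)  # new or unassessed finding: keep

One would pipe tool output through this, e.g. make C=2 2>&1 | python3 filter_findings.py, assuming the diagnostics reach the pipe; the script name is hypothetical.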


Any comments?

I will probably need to synchronize with Nicholas and incorporate the comments from others to condense everything into one proposal.

Best regards,

Lukas


Lukas Bulwahn
 

Attaching Nicholas' proposal to this email thread to make it easier to find on the list.

Best regards,

Lukas


John MacGregor <john.macgregor@...>
 

Hi Lukas

Are you going to put all these task proposals and the associated information in a repo somewhere? Searching through the mailing list for all the associated e-mails doesn't work well for me.

Cheers

John

Best regards

John MacGregor

Safety, Security and Privacy (CR/AEX4)
Tel. +49 711 811-42995 | Mobil +49 151 54309433 | Fax +49 69 9540295-532 | John.MacGregor@...



Lukas Bulwahn
 

Hi John,

yes, I intend to do that. Once I have received some feedback via mail, I will rework the proposals and include them in the repository.


Best regards,

Lukas



Evgeny Novikov <novikov@...>
 

Dear Lukas,

within our not-so-large team, we usually discuss such proposals using something like Google Docs. Is this an option for you?

First thoughts on the merits.

1. Regarding the name.

Coverity seems to use the word "triage" to describe this concisely (https://dzone.com/articles/triage-security-vulnerabilities-with-coveritys-sec).

In Klever we refer to this as "means for expert assessment of verification results". We are not native speakers, so I do not claim this naming is correct.

2. It is not clear what sort of analysis tools you target.

For me "verification" is too generic word which covers all methods and tools that check something against some specifications. This includes code review, testing, static analysis, formal methods. I do not think that you would like to consider all of them within this proposal.

3. I advise you to investigate related work.

I already mentioned that developers spend a lot of time maintaining such means. Unfortunately, in most cases these means are not publicly available. But maybe it would be better to pay for licenses or a subscription rather than to develop and maintain the necessary things yourself. There are many other activities that you should focus on.

4. A small advertisement that may be an interesting option for you.

Klever is open source and it includes many related means (I added their primary developer in CC):
- there is a web interface over a database with a very rich and mature schema;
- there is a console interface for simple database operations like uploading verification results;
- many experts can assess verification results at the same time (different experts can be granted different levels of access);
- experts can select values from several predefined lists (e.g. "Bug" or "False alarm"), attach arbitrary tags (tags are arranged as trees) and an arbitrary text description;
- the expert assessment history is kept; it includes dates, changes made, and expert data;
- it is possible to download/upload particular/all verification and expert assessment results;
- verification results are shown together with related source code (we have been improving this part very considerably);
- similar reports are assessed automatically (experts can easily distinguish automatically assessed reports, since automatic assessment can be wrong);
- there is support for confirmations and likes;
- one can compare reports produced for different versions/configurations of target software as well as for different versions/configurations of verification tools and specifications;
- many statistics are calculated and can be presented in different views.

Underlying technologies include: Python Django, HTML/JavaScript/jQuery/AJAX/CSS, PostgreSQL, Apache/Nginx, Gunicorn.

Here is output from cloc (http://cloc.sourceforge.net/):

     282 text files.
     267 unique files.
      23 files ignored.

github.com/AlDanial/cloc v 1.70  T=0.71 s (362.7 files/s, 83296.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          95           3636           2274          22637
JavaScript                      50           1609           1846          15345
HTML                            76            418              5           8418
CSS                             29            164            347           2670
XML                              8              0              0             93
JSON                             1              0              0             19
-------------------------------------------------------------------------------
SUM:                           259           5827           4472          49182
-------------------------------------------------------------------------------

At the moment Klever is intended only for expert assessment of verification results from software model checkers, so some improvements will definitely be necessary for static analyzers (your primary target, I guess). But in general this should be easier than developing everything from scratch. Besides, there are some much-expected missing features, but we are implementing them right now, since we need them for other applications as well.

It is not necessary to use the whole of Klever, since it includes much more than just the means for expert assessment of verification results. It is enough to use just its UI part. Sometimes we do this ourselves.

BTW, during recent discussions I already thought about integrating Linux-friendly, open-source static analyzers into Klever. This seems to be normal practice even for proprietary systems. This work may include not only improving the means for expert assessment of verification results, as suggested above, but also a much closer integration. For instance, since static analyzers run very fast and are very simple to use, we could launch them optionally while collecting so-called build bases (https://github.com/17451k/clade, sorry for one more advertisement). Verification results could then be immediately visualized and assessed by experts.

Moreover, formal verification monsters can leverage results from static analyzers. We have already used this approach several times: e.g., to verify that drivers do not call might-sleep functions, we collected lists of these functions in advance. There are other potential use cases as well.

Best regards,
Evgeny



Lukas Bulwahn
 

Dear Evgeny,

unfortunately, I cannot access Google Docs at the moment, but I am clarifying that, and it should work in the future.

Thanks for the suggestions. This is really helpful and already a good benefit of presenting and discussing the proposal.

To comment 1., I will check the different wordings and decide.
To comment 2., I am actually considering all methods and tools, but we will really start with the ones that seem most suitable, as they are already applied to the Linux kernel (sparse, smatch, possibly coccinelle rules).

To comment 3., any pointers to related work are of interest. I am interested in a tool where I can feed in the output from the tools above and track it.
To comment 4., I was actually considering elixir as a starting point. Klever really sounds like a good starting point, and I hope a first investigation importing sparse warnings is a feasible exercise.
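
To make the sparse exercise concrete: sparse emits GCC-style diagnostics (file:line:col: warning: message), so a first importer can stay small. A minimal sketch; the entry fields follow the illustrative schema idea from the proposal discussion and are assumptions, not a settled format:

import json
import re
import sys

# sparse prints GCC-style diagnostics, e.g.
#   drivers/foo/bar.c:123:9: warning: incorrect type in assignment
DIAG = re.compile(r"^(?P<file>[^:]+):(?P<line>\d+):(?P<col>\d+): "
                  r"(?P<kind>warning|error): (?P<msg>.*)$")

def import_sparse(stream):
    """Yield database entries (illustrative schema) from sparse output."""
    for raw in stream:
        m = DIAG.match(raw.rstrip("\n"))
        if not m:
            continue  # skip context lines and build noise
        yield {
            "tool": "sparse",
            "location": {"file": m["file"], "line": int(m["line"])},
            "message": m["msg"],
            "assessment": "unassessed",
        }

print(json.dumps(list(import_sparse(sys.stdin)), indent=2))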

It seems we have independently been developing very similar ideas to those already in Klever. Let us continue to explore this together.

Best regards,

Lukas



Evgeny Novikov <novikov@...>
 

Dear Lukas,

23.04.2019, 09:59, "Lukas Bulwahn" <lukas.bulwahn@...>:

> To comment 2., I am actually considering all methods and tools, but we will really start with the ones that seem most suitable, as they are already applied to the Linux kernel (sparse, smatch, possibly coccinelle rules).

If you try to develop a good infrastructure around all the good methods and tools suitable for verification of the Linux kernel, that will of course be an outstanding project. But it will also take enormous effort, and most likely you will not manage to implement it. Moreover, I doubt that this project is strictly necessary in the context of safety-critical software certification.


> To comment 3., any pointers to related work are of interest. I am interested in a tool where I can feed in the output from the tools above and track it.
All industrial-level, commercial static analyzers have a corresponding infrastructure. For instance, you can investigate Coverity's directly for the Linux kernel at https://scan.coverity.com/projects/linux (you will need to join the project).

Though, I think all these infrastructures are very closely tied to their corresponding tools, so it may not be so trivial to reuse them for other tools. Maybe you can gain access to the 0-day infrastructure, which is likely the most appropriate for the target tools.

> To comment 4., I was actually considering elixir as a starting point. Klever really sounds like a good starting point, and I hope a first investigation importing sparse warnings is a feasible exercise.
>
> It seems we have independently been developing very similar ideas to those already in Klever. Let us continue to explore this together.
After the exhaustion of our research grants, we are not able to contribute much to any verification (testing, static analysis, software model checking, deductive verification) of the Linux kernel. Almost everything we do now in the context of software model checking is not related to the Linux kernel, although where there are common parts, we contribute to verification of the Linux kernel unintentionally.

Moreover, I think more and more that you should come to agreement and confidence with your stakeholders on the final goals and the means to reach them before proceeding to particular activities. That way you will be able to do the really necessary things and obtain results faster, which will be better for everybody in the long term. For instance, if you pay full attention to the likely strictly necessary and currently missing requirements and testing, you will have the necessary means (tools, test cases, documentation) to get the necessary certificates (the first real project success). Maybe it will then be necessary to proceed to other, optional activities to improve the quality further. Of course, this can only be the case if your stakeholders agree.

Best regards,
Evgeny

