RFC - Discovering Linux kernel subsystems used by a workload
Shuah Khan
All,
Please review the document that outlines a process for gaining insight into the resources used by a workload. Shefali Sharma and I identified a process for gathering fine-grained information about the system resources necessary to run a generic workload on Linux. This process can then be applied to any workload, including individual commands and important use cases in that workload. For example, which subsystems are used when a user queries the insulin pump status while the OpenAPS workload is running? In addition, this process can be used by system integrators to gain insight into the resources used by their workloads.

Please review and give us feedback. Once this review is complete, we will upload the document to GitHub.

https://docs.google.com/document/d/1OgbTDFdrWtQTCYoRwNIZMQPhGhnHbyughLXrLfiaTi4/edit#

thanks,
-- Shuah & Shefali
Jonathan Moore <jandcmoore@...>
Very interesting, and some good information about the tools available, thank you.

Have you given any thought to, or do you have information that supports, the validity of the measurements? I.e., do the results actually match what is going on? How do we verify that?

Are the results between tools generally in agreement, or are some results tuned for different workloads? Does the point in time of measurement make a difference? Can peak load, e.g. during application startup, be differentiated from a low-demand mode? What instances have you seen where the reports are too low or too high, e.g. situations where the reported measurements exceed what is actually available?

What effect does measuring have on the system or task being measured? Can these measurements be made at the same time as high-stress workloads without impacting the workload? How does one deal with Nyquist? When a task is zombie/dead/lost, do the measurements indicate this, and how quickly? Do you have a set of 'test' loads to explore all of this?

That might be enough questions for now. :-)

Jonathan

On Wed, Aug 3, 2022, 12:55 PM Shuah Khan <skhan@...> wrote:
All,
Shuah Khan
Please see inline.
On 8/5/22 11:22 AM, Jonathan Moore wrote:
> Very interesting and some good information about the tools available thank you.

The goal is to get insight into the system calls and ioctls invoked by a workload. The results do match the system activity for that workload. Running strace on the "ls" command will tell you the system activity for that command.

> How do we verify that? Are the results between tools generally in agreement or are some results tuned for different workloads? Does the point in time of measurement make a difference? Can peak load e.g. during application startup be differentiated from a low demand mode? What instances have you seen where the reports are low or too high e.g. situations when the reported measurements exceed the actual available etc.? What effect does measuring have on the system/task being measured? Can these measurements be made at the same time as high stress workloads without impacting the workload? How does one deal with Nyquist? When a task is zombie/dead/lost do the measurements indicate this and how quickly? Do you have a set of 'test' loads to explore all of this?

The only tool we are using here is strace, and the same tool is used on all 3 workloads. The results are not tuned for a workload.

Please keep in mind that the goal here is understanding the system footprint of a workload. It isn't a goal to see how a workload behaves under varying loads. The goal is to give system integrators a tool and a process to follow to get insight into their workloads. This information can then be used to develop a plan for gathering evidence for certification.

Hope this helps in understanding the goals of this work.

thanks,
-- Shuah
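
To illustrate the strace step mentioned in the reply above, here is a minimal sketch (not the procedure from the document under review) that runs a command under strace and tallies how often each system call appears in the trace. The helper names `count_syscalls` and `trace_command` are hypothetical, and the regular expression assumes strace's default text output, where each call starts a line, optionally preceded by a PID when child processes are followed.

```python
import re
import subprocess
from collections import Counter

# Assumption: default strace text output, one call per line, e.g.
#   openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
# With -f, lines may be prefixed by "[pid 1234] " or "1234 ".
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def count_syscalls(lines):
    """Count system call names found in an iterable of strace output lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            # Signal ("--- ... ---"), exit ("+++ ... +++") and
            # "<... resumed>" lines do not match and are skipped.
            counts[match.group(1)] += 1
    return counts


def trace_command(argv):
    """Run argv under strace (must be installed) and return syscall counts."""
    result = subprocess.run(
        ["strace", "-f", *argv],      # -f: also follow child processes
        stdout=subprocess.DEVNULL,    # discard the command's own output
        stderr=subprocess.PIPE,       # strace writes its trace to stderr
        text=True,
        check=False,
    )
    return count_syscalls(result.stderr.splitlines())


# Example (requires strace on the system):
#   for name, n in trace_command(["ls"]).most_common():
#       print(f"{n:6d}  {name}")
```

strace can also print a per-syscall summary table itself when run with `-c`; the sketch above only shows how the raw trace could be post-processed when the counts are needed in a script.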