The impact of the organization of the software development process on program quality and personal productivity in the Personal Software Process

Original author: Mark Paulk
  • Translation
Very soon, on December 17, a master class by Mark Paulk, co-author of the Capability Maturity Model for Software, will be held at the Luxoft Training center.
Mark Paulk develops and teaches courses on software development, improving the software development process (CMM and CMMI), process maturity, agile methodologies, project management for software development and statistical analysis.
We suggest you familiarize yourself with his article on the relationship between the organization of the software development process, software quality, and personal productivity.



The impact of the organization of the software development process on program quality and personal productivity in the Personal Software Process



Mark Paulk, Carnegie Mellon University

INTRODUCTION

One of the basic tenets of modern software process theory is that strengthening the role of process, or adherence to “best practices,” increases developer productivity and improves the quality of the products developers create. In practice, productivity and quality are hard to separate, because a complex network of causal relationships links them. They are nevertheless usually treated separately, since each is itself a complex, multidimensional concept. And although a well-organized development process does not guarantee that a project will succeed, it does increase the likelihood.
The belief that a well-organized software development process adds value is at the core of models and standards such as the Capability Maturity Model for Software (CMM) and CMM Integration (CMMI). The nature of the improvement depends on the business context, but as a rule it shows up in productivity and quality. Data from the Personal Software Process (PSP) can be analyzed statistically to demonstrate this effect. PSP data show productivity and quality gains following the adoption of a well-organized development process, but they also illustrate some of the difficulties in defining these concepts. Watts Humphrey successfully applied the principles of process organization in CMM, the Team Software Process (TSP) and PSP. Many studies show the influence of these principles on the quality and productivity of the organization, the team (project) and the individual, respectively.
PSP applies the concepts of process definition and quantitative management, step by step, to the work of an individual developer in a classroom setting. There are four main processes in PSP: PSP0, PSP1, PSP2, and PSP3. Each builds on the previous one by adding engineering or management activities. Adding activities step by step lets developers analyze the impact that each new technique has on their personal effectiveness. Students are usually given 10 tasks. PSP data is well suited for research, because many factors that influence project performance and confound research data, such as volatile requirements and teamwork difficulties, are either controlled or completely eliminated in PSP. PSP students use a wide variety of programming languages, but only programs written in a single language were included in the sample.

QUALITY IMPACT

For a simple quality analysis, PSP provides defect density data. In software development, the number of defects reported in the first year of use (or in the first six months, or some other period) is often used as a measure of quality. Since PSP training takes place in the classroom, its products never go through “field trials.” A reasonable substitute is the density of defects found during testing. The box plot (Fig. 1) shows measurable and statistically significant improvements in software quality across the four PSP processes. Quality improved by 79 percent, and variability decreased by 81 percent.
However, a caveat is in order: although defect density can be used to measure quality, customers care about failures, not defects. Defects can go unnoticed for years without affecting normal use of the product. Even a measure such as mean time between failures is imperfect, since many aspects of a software product matter to customers (and the customer is the ultimate judge of quality). For example, Garvin identifies several dimensions of quality (among them performance, functionality, safety, conformance to standards, reliability, durability, serviceability and aesthetics), which shows how complex the concept of “overall quality” is. The software quality characteristics listed in ISO 9126 are functionality, reliability, usability, efficiency, maintainability and portability. Unfortunately, PSP data supports only an analysis of defect density, without this larger context, and that limitation can be serious.
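As a minimal sketch of the metric, defect density is commonly expressed as defects found in testing per thousand lines of code (KLOC). The normalization by KLOC, the function names and all numbers below are assumptions made for illustration, not data from the study.

```python
def defect_density(defects_in_test: int, loc: int) -> float:
    """Defects found during testing per thousand lines of code (KLOC)."""
    if loc <= 0:
        raise ValueError("loc must be positive")
    return defects_in_test * 1000 / loc


def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) * 100 / before


# Hypothetical programs, not taken from the PSP data set.
early = defect_density(defects_in_test=6, loc=150)
late = defect_density(defects_in_test=2, loc=250)
print(f"early: {early:.1f}/KLOC, late: {late:.1f}/KLOC, "
      f"reduction: {percent_reduction(early, late):.0f}%")
```

Note that the later program in this made-up example is larger yet has fewer defects, so both the raw count and the density fall.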



Fig. 1. Quality Improvement in PSP

In the box plot, the edges of each box mark the 25th and 75th percentiles of the data set, and the center line marks the median. In this version of the box plot, the whiskers extend one and a half interquartile ranges beyond the box and can be used to identify outliers. The line running across the entire chart is the grand mean of the whole data set. The Student's t-test for each pair (Each Pair Student's t) and the Tukey-Kramer multiple comparison of all pairs of groups (All Pairs Tukey-Kramer) provide liberal and conservative comparison criteria, respectively; if the comparison circles do not overlap, or the external angle of intersection is less than 90 degrees, we can conclude that the group means differ significantly at the given significance level (α = 0.05).
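The quantities behind such a plot can be sketched in a few lines. The example below is an illustration only: the helper name `box_plot_summary`, the choice of the "inclusive" quartile method and the sample values are assumptions, and plotting tools may compute quartiles slightly differently.

```python
import statistics


def box_plot_summary(values):
    """Return quartiles, 1.5 * IQR fences, and outliers for a sample."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Points beyond the fences are the candidates flagged as outliers.
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Hypothetical defect densities (defects/KLOC), not from the PSP data.
sample = [10, 12, 14, 15, 18, 20, 22, 60]
summary = box_plot_summary(sample)
print(summary)
```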



Fig. 2. Improving the quality of tasks

Another reasonable question is whether the number of defects would be more informative than their density, given that the tasks are of roughly the same size. As Fig. 3 shows, the last five tasks are larger than the first five, but the variability in size underlines how poorly the number of lines of code (LOC) serves as a measure of project size. Since all students were given the same tasks and used the same programming language, the differences must come from the design decisions made by each individual programmer.



Fig. 3. Sizes (number of lines of code) of PSP tasks

Despite the doubts about using lines of code as a size measure for normalizing defect counts, Fig. 4 shows that the number of defects in PSP tasks generally falls even as the number of lines of code grows. This supports the hypothesis that the organization of the development process in PSP improves quality.



Fig. 4. The number of defects detected during testing in PSP

A natural question is which factors besides the process affect software quality. A programmer's experience and education are plausible candidates, but the study found that neither affects quality. An analysis of a larger data set covering several programming languages suggests that the programming language is not such a factor either, whereas the programmer's individual ability is, as shown in Fig. 5.
Programmers were divided into four ability levels based on their results on the first three tasks. As Fig. 5 shows, based on a repeated measures model, the top-quartile students (TQ) consistently perform better than the bottom-quartile students (BQ), and the two groups in between, labeled B M2 and T M2, also keep their relative positions. Software quality improved more than twofold for students in the top quartile and more than fourfold for students in the bottom quartile.
It should be noted that programmer ability can be measured in many other ways. The method chosen here focuses on software quality as observed in testing, which depends both on injecting few defects (high-quality development) and on finding and removing them effectively (high-quality review). The analysis does not separate these two causes.
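One way such a grouping could be computed is sketched below. It is an illustration under stated assumptions, not the study's procedure: ability is taken to be the mean defect density on the first three tasks, students are split by rank into four equal groups, and T M2 is assumed to be the upper-middle group. The student names and numbers are invented.

```python
from statistics import mean

# Group labels as used in the text, ordered from best to worst (assumed order).
LABELS = ["TQ", "T M2", "B M2", "BQ"]


def ability_groups(densities_by_student):
    """Assign each student a quartile label from their first three tasks."""
    baseline = {s: mean(d[:3]) for s, d in densities_by_student.items()}
    # Lower defect density means higher quality, so sort ascending.
    ranked = sorted(baseline, key=baseline.get)
    n = len(ranked)
    return {s: LABELS[rank * 4 // n] for rank, s in enumerate(ranked)}


# Hypothetical defect densities per task (defects/KLOC) for eight students.
students = {
    "s1": [10, 12, 11, 5],
    "s2": [50, 45, 55, 20],
    "s3": [30, 28, 32, 10],
    "s4": [80, 90, 85, 30],
    "s5": [20, 22, 18, 9],
    "s6": [60, 65, 70, 25],
    "s7": [40, 38, 42, 15],
    "s8": [100, 110, 105, 35],
}
groups = ability_groups(students)
print(groups)
```

Because the grouping uses only the first three tasks, later tasks can then be compared across groups without the grouping itself depending on them.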



Fig. 5. Quality trends by programmer ability.

PRODUCTIVITY IMPACT

A similar analysis can be performed for productivity (the ratio of output to resources expended), measured as the number of lines of code per hour. As shown in Fig. 6, productivity across the PSP processes increased by 12 percent, and variability decreased by 11 percent (the difference between PSP0 and PSP3 is statistically significant). Whether an increase of this size matters is left to the reader's judgment. In many environments, factors such as volatile requirements are likely to swamp such a modest gain, but in the controlled environment of the PSP classroom the effect is evident.



Fig. 6. Productivity Improvement in PSP

The shortcomings of this analysis are even greater than those of the quality analysis. Lines of code per hour is a very poor measure of productivity. It is not clear, however, that alternatives such as function points, requirements or user stories per hour are any better, although all four measures (and others) are used in software development projects. Another option is to analyze the number of hours spent on each task, as shown in Fig. 7. This measures productivity for each individual task, but it does not account for differences in the design decisions made by each programmer (as shown in Fig. 3).
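The gap between the two measures can be seen in a small sketch. The `Solution` class and the numbers are hypothetical: two programmers solve the same task in the same time, but one writes a more verbose solution.

```python
from dataclasses import dataclass


@dataclass
class Solution:
    """One programmer's solution to a task (hypothetical data)."""
    programmer: str
    loc: int
    hours: float

    @property
    def loc_per_hour(self) -> float:
        return self.loc / self.hours


# Same task, same effort, different solution size.
compact = Solution("A", loc=120, hours=4.0)
verbose = Solution("B", loc=200, hours=4.0)

for s in (compact, verbose):
    print(f"{s.programmer}: {s.loc_per_hour:.0f} LOC/hour, {s.hours:.1f} hours")
```

Measured in lines of code per hour, programmer B looks far more productive, while measured in effort per task the two are identical.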



Fig. 7. The effort spent on assignments

The analysis found no effect of education or years of experience on productivity. For the entire PSP data set, C++ and Java showed “higher productivity” than C and Visual Basic when measured in lines of code per hour. However, when measured by the effort spent on each task (as shown in Fig. 8 for task number 10), there was no difference between the programming languages.


Fig. 8. Differences in effort when using different languages (task number 10)

When productivity is measured as the number of lines of code per hour, the programmer's ability affects it. Moreover, productivity was found to grow along with quality (as expected).

CONCLUSION

The results presented in this article are consistent with earlier work on PSP, but they draw attention to some difficulties in interpreting those results. Doubts have been raised about whether conclusions about real projects can be drawn from tasks performed by students. However, PSP classes are often taught in industrial settings rather than in university classrooms, and the developers in this sample have up to 34 years of experience, with a median of 7 years. In this respect, PSP students resemble typical project developers more than they resemble computer science students.
A more important question is how well the tasks performed during PSP training correspond to real development work. Although each task is realistic in itself, PSP tasks do not involve problems such as uncertain and highly volatile requirements or product integration, which cause the greatest difficulties in real projects and are the areas where experience and training may matter most. PSP teaches the foundations of the Team Software Process (TSP), and TSP projects have also been found to improve quality and productivity.
The hardest question raised by this study applies to the entire software industry: how can we reliably measure productivity and quality? As this work shows, the common metrics of defect density and lines of code per hour have significant drawbacks. A potentially more reliable metric such as rework (the percentage of time spent fixing defects) is not widely known and is rarely used. The metrics described here may still be useful, since better alternatives are unlikely to be found and adopted any time soon, but any conclusions based on these analyses should be weighed carefully before decisions are made. In evidence-based management, context is crucial.
One reviewer of this article noted that it would be fair to ask another question: if PSP is so successful, why is it not used more often? Unfortunately, many best practices in software engineering are not adopted by companies despite strong evidence of their benefits. For example, a huge body of research confirms the effectiveness and efficiency of inspections, yet how many organizations systematically use any form of peer review, let alone rigorous inspections? Researchers can gather evidence to support informed decisions and, like teachers and consultants, can advocate for the adoption of best practices, but the organization of processes in software engineering is still a relatively young discipline.
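As a rough sketch of the rework metric mentioned above, the percentage of time spent fixing defects can be computed from a time log. The log format (hours plus a flag marking defect-fixing time) and the numbers are assumptions for illustration; the article does not specify how the metric would be collected.

```python
def rework_percentage(time_log):
    """Share of logged hours spent fixing defects, in percent.

    `time_log` is a list of (hours, is_defect_fix) pairs (assumed format).
    """
    total = sum(hours for hours, _ in time_log)
    if total <= 0:
        raise ValueError("time log must contain positive total hours")
    rework = sum(hours for hours, is_fix in time_log if is_fix)
    return rework * 100 / total


# Hypothetical log for one task: design, fixing, coding, fixing.
log = [(3.0, False), (1.0, True), (2.5, False), (1.5, True)]
print(f"rework: {rework_percentage(log):.2f}%")
```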

Capability Maturity Model, CMM and CMMI are registered trademarks of Carnegie Mellon University.

CMM Integration, Personal Software Process, PSP, and SEI are service marks of Carnegie Mellon University.

Read more about the master class by Mark Paulk on December 17 at Luxoft Training.