Concepts: Metrics
We measure primarily to gain control of a project, and therefore to be able
to manage it. We measure to evaluate how close to or far from the objectives
set in our plan we are, in terms of completion, quality, compliance with
requirements, and so on.
We also measure so that we can better estimate effort, cost, and quality for new
projects, based on past experience. Finally, we measure to evaluate how key
aspects of process performance improve over time, and to see the effects of the
changes we make.
Measuring key aspects of a project adds a non-negligible cost, so we do not
measure things simply because we can. We must set very precise goals for this
effort, and collect only the metrics that allow us to satisfy those goals.
There are two kinds of goals:
- Knowledge goals: these are expressed with verbs like evaluate, predict, and
monitor. You want to better understand your development process. For example,
you may want to assess product quality, obtain data to predict testing effort,
monitor test coverage, or track requirements changes.
- Change or achievement goals: these are expressed with verbs such as increase,
reduce, improve, or achieve. You are usually interested in seeing how things
change or improve over time, from one iteration to another or from one project
to another.
Examples
- Monitor progress relative to plan
- Improve customer satisfaction
- Improve productivity
- Improve predictability
- Increase reuse
These general management goals do not translate readily into metrics. We have
to decompose them into smaller subgoals (or action goals) that identify the
actions project members have to take to achieve the goal. And we have to make
sure that the people involved understand the benefits.
Examples
The goal to "improve customer satisfaction"
would decompose into:
- Define customer satisfaction
- Measure customer satisfaction, over several releases
- Verify that satisfaction improves
The goal to "improve productivity"
would decompose into:
- Measure effort
- Measure progress
- Calculate productivity over several iterations or projects.
- Compare the results
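As an illustration of how such a decomposition turns into numbers, the following
sketch (in Python, with invented field names and example figures, not part of
the process definition) calculates productivity per iteration from measured
effort and progress so the results can be compared:

```python
# Hypothetical illustration: productivity per iteration, computed from
# measured effort (person-days) and progress (e.g., use-case points realized).
iterations = [
    {"name": "E1", "effort_person_days": 120, "points_completed": 30},
    {"name": "C1", "effort_person_days": 150, "points_completed": 45},
    {"name": "C2", "effort_person_days": 140, "points_completed": 49},
]

for it in iterations:
    productivity = it["points_completed"] / it["effort_person_days"]
    print(f"{it['name']}: {productivity:.2f} points per person-day")
# Comparing these values over iterations (or projects) shows whether
# productivity is actually improving, which is the point of the subgoals above.
```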
Some of these subgoals (but not all) then require metrics to be collected.
Example
"Measure customer satisfaction"
can be derived from
- Customer survey (where customer would give marks for different aspects)
- Number and severity of calls to a customer support hotline.
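One possible way to combine such sources into a single figure that can be
tracked across releases is sketched below; the scales and severity weights are
invented for illustration and are not prescribed by the process:

```python
# Hypothetical composite customer-satisfaction score per release.
# Survey marks are on a 1-5 scale; hotline calls are weighted by severity.
def satisfaction_score(survey_marks, hotline_calls, severity_weights):
    """survey_marks: list of 1-5 marks; hotline_calls: dict severity -> count."""
    survey_avg = sum(survey_marks) / len(survey_marks)           # 1 (bad) .. 5 (good)
    complaint_load = sum(severity_weights[s] * n for s, n in hotline_calls.items())
    # Normalize the survey to 0..100 and subtract a penalty for complaints.
    return max(0.0, (survey_avg - 1) / 4 * 100 - complaint_load)

weights = {"critical": 5.0, "major": 2.0, "minor": 0.5}
release_1 = satisfaction_score([4, 3, 5, 4], {"critical": 2, "major": 5, "minor": 10}, weights)
release_2 = satisfaction_score([4, 5, 5, 4], {"critical": 0, "major": 3, "minor": 8}, weights)
print(f"Release 1: {release_1:.1f}, Release 2: {release_2:.1f}")  # verify the trend improves
```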
For more information, consult [AMI95].
A useful way to categorize these goals is by organization, project, and
technical need. This gives a framework for the refinement discussed above.
An organization needs to know, and perhaps improve, its cost per ‘item’; to
shorten its build times (time-to-market); and to deliver products of known
quality (objective and subjective) with acceptable maintenance demands. An
organization may from time to time (or even continuously) need to improve its
performance to remain competitive. To reduce its risks, an organization needs to
know the skill and experience levels of its staff, and to ensure it has the
other resources and capability to compete in its chosen sphere. An organization
must also be able to introduce new technology and determine the cost-benefit of
that technology. The following list gives examples of the kinds of metrics
relevant to these needs for a software development organization.
- Item Cost: Cost per line of code, cost per function point, or cost per use
case. Normalized effort (across a defined portion of the life cycle,
programming language, staff grade, and so on) per line of code, function point,
or use case. Note that these metrics are not usually simple numbers: they
depend on the size of the system to be delivered and on whether the schedule is
compressed.
- Construction Time: Elapsed time per line of code or per function point. Note
that this also depends on system size. The schedule can be shortened by adding
staff, but only up to a point; an organization's management ability determines
exactly where that limit lies.
- Defect Density in Delivered Product: Defects (discovered after delivery) per
line of code or per function point.
- Subjective Quality: Ease of use, ease of operation, customer acceptance.
Although these are fuzzy, ways of attempting quantification have been devised.
- Ease of Maintenance: Cost per line of code or function point per year.
- Skills Profile, Experience Profile: The Human Resources group would
presumably keep some kind of skills and experience database.
- Technology Capability:
  - Tools: an organization should know which are in general use, and the
  extent of expertise for those not regularly used.
  - Process Maturity: where does the organization rate on the SEI CMM scale,
  for example?
  - Domain Capability: in which application domains is the organization
  capable of performing?
- Process Improvement Measures:
  - Process execution time and effort.
  - Defect rates, causal analysis statistics, fix rates, scrap and rework.
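To make the normalization concern concrete, here is a small sketch (illustrative
only; the counting rules an organization uses for size, defects, and effort must
be defined locally) that computes two of the metrics above from raw project
totals:

```python
# Hypothetical raw totals for one delivered project.
total_cost = 900_000.0         # total development cost, in some currency unit
size_function_points = 1_200   # measured size in function points
post_delivery_defects = 54     # defects discovered after delivery
kloc_delivered = 150.0         # thousands of lines of code delivered

cost_per_fp = total_cost / size_function_points
defects_per_kloc = post_delivery_defects / kloc_delivered

print(f"Item cost: {cost_per_fp:.0f} per function point")
print(f"Defect density: {defects_per_kloc:.2f} defects per KLOC")
# As noted above, these figures only become meaningful when compared across
# projects of similar size, schedule compression, and life-cycle coverage.
```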
A project must typically be delivered:
- with required functional and non-functional capabilities;
- under certain constraints;
- to a budget and within a certain time;
- as a product with certain transition (to the customer), operational, and
maintenance characteristics.
The Project Manager must be able to see whether he or she is tracking towards
such goals. They are expanded in the following list to give some idea of the
things to consider when thinking about project measurements:
- Project Effort and Budget: How is the project tracking on effort and cost
against plan?
- Project Schedule: Is the project meeting its milestones?
- Transition/Installation: Are the predicted effort, cost, and skills
requirements acceptable?
- Operation: Are the predicted effort and skills requirements supportable by
the customer?
- Maintenance/Supportability: Are the predicted effort and skills requirements
acceptable to the customer?
- Functional Requirements:
  - Are the requirements valid and complete?
  - Are the requirements allocated to an iteration?
  - Are the requirements being realized according to plan?
- Non-Functional Requirements:
  - Performance: is the system meeting requirements for responsiveness,
  throughput, and recovery time?
  - Capacity: can the system handle the required number of simultaneous users?
  Can the web site handle the required number of hits per second? Is there
  sufficient storage for the required number of customer records?
  - Quality Factors:
    - Reliability: how often are system failures allowed, and what constitutes
    a system failure?
    - Usability: is the system easy and pleasant to use? How long does it take
    to learn how to use it, and what skills are required?
    - Fault tolerance/robustness/resilience/survivability: can the system
    continue to function if failures occur? Can the system cope with bad
    input? Is the system capable of automatic recovery after failure?
  - Specialty Engineering Requirements:
    - Safety: can the system perform without risk to life or property
    (tangible and intangible)?
    - Security/privacy: does the system protect sensitive data from
    unauthorized access? Is the system secure from malicious access?
    - Environmental impact: does the system meet environmental requirements?
  - Other Regulatory or Legal Requirements
  - Constraints:
    - External environment: is the system capable of operation in the
    prescribed environment?
    - Resources, host, target: does the system meet its CPU, memory, language,
    and hardware/software environment constraints?
    - Use of commercial-off-the-shelf (COTS) or other existing software: is
    the system meeting its reuse constraints?
    - Staff availability and skills: can the system be built with the number
    and type of staff available?
    - Interface support/compatibility: can the system support required access
    to and from other systems?
    - Reusability: what provisions are made for the system to be reusable?
    - Imposed standards: are the system and the development method compliant?
    - Other design constraints (architectural or algorithmic, for example): is
    the system using the required architectural style? Are the prescribed
    algorithms being used?
This is an extensive, but not exhaustive, list of concerns for the Project
Manager. Many of them require the collection and analysis of metrics; some also
require the development of specific tests (to derive measurements) to answer
the questions posed.
Many of the project needs will not have direct measures, and even for those
that do, it may not be obvious what should be done or changed to improve them.
Lower-level quality-carrying attributes can be used to build in quality against
higher-level quality attributes, such as those identified in ISO Standard 9126
(Software Quality Characteristics and Metrics) and those mentioned above under
project needs. These technical measures are of engineering (structural and
behavioral) characteristics and effects, covering both process and product,
that contribute to the project-level metrics needs. The attributes in the
following list have been used to derive a sample set of metrics for the
Rational Unified Process artifacts and process; that set may be found in
Guidelines: Metrics.
- Goodness of Requirements:
  - Volatility: frequency of change, rate of introduction of new requirements.
  - Validity: are these the right requirements?
  - Completeness: are any requirements missing?
  - Correctness of expression: are the requirements properly stated?
  - Clarity: are the descriptions understandable and unambiguous?
- Goodness of Design:
  - Coupling: how extensive are the connections between system elements?
  - Cohesion: do the components each have a single, well-defined purpose?
  - Primitiveness: can the methods or operations of a class be constructed
  from other methods or operations of the class? If so, they are not
  primitive; primitiveness is the desirable characteristic.
  - Completeness: does the design completely realize the requirements?
  - Volatility: frequency of architectural change.
- Goodness of Implementation:
  - Size: how close is the implementation to the minimal size needed to solve
  the problem? Will the implementation meet its constraints?
  - Complexity: is the code algorithmically difficult or intricate? Is it
  difficult to understand and modify?
  - Completeness: does the implementation faithfully realize all of the
  design?
- Goodness of Test:
  - Coverage: how well do the tests exercise the software? Are all
  instructions executed by a set of tests? Do the tests exercise many paths
  through the code?
  - Validity: are the tests themselves a correct reflection of the
  requirements?
- Goodness of Process (at the lowest level):
  - Defect rate, defect cause: what is the incidence of defects in an
  activity, and what are the causes?
  - Effort and duration: what duration and how much human effort does an
  activity require?
  - Productivity: per unit of human effort, what does an activity yield?
  - Goodness of artifacts: what is the level of defects in the outputs of an
  activity?
- Effectiveness of Process/Tool Change (as for Goodness of Process, but
expressed as percentage changes rather than total values):
  - Defect rate, defect cause
  - Effort and duration
  - Productivity
  - Goodness of artifacts
For a deep treatment of metrics concepts, see [WHIT97].
We distinguish two kinds of metrics:
- A metric is a measurable attribute of an entity. For example, project effort
is a metric of project size. To calculate this metric, you would need to sum
all the time-sheet bookings for the project.
- A primitive metric is a raw data item that is used to calculate a metric. In
the example above, the time-sheet bookings are the primitive metrics. A
primitive metric is typically a metric that exists in a database but is not
interpreted in isolation.
Each metric is made up of one or more collected primitive metrics.
Consequently, each primitive metric has to be clearly identified and its
collection procedure defined.
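A minimal sketch of this relationship, using the time-sheet example above (the
record layout and field names are invented for illustration): the primitive
metrics are the individual time-sheet bookings, and the metric "project effort"
is calculated by aggregating them:

```python
# Primitive metrics: raw time-sheet bookings, as they might sit in a database.
timesheet_bookings = [
    {"person": "analyst_1", "project": "PRJ-42", "hours": 7.5},
    {"person": "dev_1",     "project": "PRJ-42", "hours": 8.0},
    {"person": "dev_2",     "project": "PRJ-07", "hours": 6.0},
]

# Metric: project effort, calculated from the primitive metrics.
def project_effort_hours(bookings, project_id):
    return sum(b["hours"] for b in bookings if b["project"] == project_id)

print(project_effort_hours(timesheet_bookings, "PRJ-42"))  # 15.5
```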
Metrics that support change or achievement goals are often "first-derivative"
over time (or over iterations or projects): we are interested in a trend, not
in the absolute value. To "improve quality", for example, we need to check that
the residual level of known defects diminishes over time.
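For example (a sketch with invented figures), such a change goal is checked by
looking at the change in known residual defects from one iteration to the next,
rather than at the count itself:

```python
# Known residual defects at the end of each iteration (invented figures).
open_defects = [120, 95, 80, 60]

# "First-derivative" view: change per iteration; negative values mean improvement.
deltas = [curr - prev for prev, curr in zip(open_defects, open_defects[1:])]
print(deltas)                      # [-25, -15, -20]
print(all(d < 0 for d in deltas))  # True: the trend is toward fewer known defects
```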
Template for a metric
- Name: Name of the metric and any known synonyms.
- Definition: The attributes of the entities that are measured using this
metric, how the metric is calculated, and which primitive metrics it is
calculated from.
- Goals: List of goals and questions related to this metric, with some
explanation of why the metric is being collected.
- Analysis procedure: How the metric is intended to be used. Preconditions for
the interpretation of the metric (for example, valid ranges of other metrics).
Target values or trends. Models of analysis techniques and tools to be used.
Implicit assumptions (for example, about the environment or models).
Calibration procedures. Storage.
- Responsibilities: Who will collect and aggregate the measurement data,
prepare the reports, and analyze the data.
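As one way of recording such definitions (a sketch only, not a prescribed
format; the example values are invented), the template above can be held as a
simple record per metric:

```python
from dataclasses import dataclass, field

@dataclass
class MetricDefinition:
    """One entry of the metric template above, held as data."""
    name: str
    definition: str
    goals: list = field(default_factory=list)
    analysis_procedure: str = ""
    responsibilities: str = ""

defect_density = MetricDefinition(
    name="Defect density in delivered product",
    definition="Post-delivery defects divided by size in function points; "
               "primitive metrics: defect reports, measured size.",
    goals=["Monitor delivered quality", "Compare releases"],
    analysis_procedure="Plot per release; expect a downward trend.",
    responsibilities="Collected by the QA lead; analyzed at status assessment.",
)
print(defect_density.name)
```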
Template for a primitive metric
- Name: Name of the primitive metric.
- Definition: Unambiguous description of the metric in terms of the project’s
environment.
- Collection procedure: Description of the collection procedure. Data
collection tool and form to be used. Points in the lifecycle at which data are
collected. Verification procedure to be used. Where the data will be stored,
in what format, and with what precision.
- Responsibilities: Who is responsible for collecting the data, and who is
responsible for verifying it.
There are two activities:
- Define measurement plan
- Collect measures
Define measurement plan is done once per development cycle, in the inception
phase, as part of the general planning activity, or sometimes as part of the
configuration of the process in the development case. The measurement plan may
be revisited, like any other section of the software development plan, during
the course of the project.
Collect measures is done repetitively, at least once per iteration, and
sometimes more often; for example, weekly in an iteration spanning many months.
The metrics collected become part of the Status Assessment document, where they
are used to assess the progress and health of the project. They may also be
accumulated for later use in project estimation and in trend analysis across
the organization.
Estimation
The project manager in particular is faced with having to plan: assigning
resources to activities, with budgets and schedules. Either effort and schedule
are estimated from a judgment of what is to be produced, or, conversely,
resources and schedule are fixed and an estimate of what can be produced is
needed. Estimation typically concerns the calculation of resource needs, based
on other factors (typically size and productivity), for planning purposes.
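A very simple sketch of the first form of estimation follows; the size and
productivity figures are invented, and real estimation models (function-point
based or COCOMO-style, for example) are considerably richer:

```python
# Estimate effort and schedule from estimated size and historical productivity.
estimated_size_fp = 400          # estimated size in function points
productivity_fp_per_pm = 10.0    # historical: function points per person-month
team_size = 5                    # available full-time staff

effort_person_months = estimated_size_fp / productivity_fp_per_pm
schedule_months = effort_person_months / team_size  # ignores ramp-up and communication overhead

print(f"Effort: {effort_person_months:.0f} person-months, "
      f"schedule: {schedule_months:.1f} months with {team_size} staff")
# The inverse form fixes effort and schedule and solves for achievable size instead.
```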
Prediction
Prediction is only slightly different from estimation; it is usually the
calculation of the future value of some factor based on today’s value of that
factor and other influencing factors. For example, given a sample of
performance data, it is useful to predict from it how the system will perform
under full load, or in a resource-constrained or degraded configuration.
Reliability prediction models use defect-rate data to predict when the system
will reach certain reliability levels. Having planned an activity, the project
manager will need data from which to predict completion dates and effort at
completion.
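For that last point, a sketch of one simple prediction (figures invented):
given the effort spent and the progress achieved so far, project the effort at
completion by assuming the observed productivity continues:

```python
# Predict effort at completion from today's values (invented figures).
effort_spent_pm = 18.0         # person-months spent so far
work_completed_fraction = 0.6  # fraction of the planned scope realized so far

observed_cost_per_unit = effort_spent_pm / work_completed_fraction
effort_at_completion = observed_cost_per_unit * 1.0  # extrapolate to 100% of scope
effort_remaining = effort_at_completion - effort_spent_pm

print(f"Predicted effort at completion: {effort_at_completion:.1f} person-months "
      f"({effort_remaining:.1f} remaining)")
# Reliability prediction models are analogous in spirit: they fit observed
# defect-rate data to a model and extrapolate to a target reliability level.
```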
Assessment
Assessment is used to establish the current position: for comparison with a
threshold, for identification of trends, for comparison between alternatives,
or as the basis for estimation or prediction.
For more on metrics in project management, read [ROY98].
Copyright © 1987 - 2001 Rational Software Corporation