Validation


Software Data Validation

Data validation is an important part of populating a software database.  Past experience has taught us that including unvalidated data renders the database unreliable, or makes the data questionable or unusable.  The quality of contributed data cannot be validated by simple inspection, as one would hope.  In addition to inspection, the data must be compared with reality (history) to ensure that it is reasonable.

Sage 4 validates the data with a three-step process.  Each step asks one question:

·         Is the development productivity within the current range of observed productivity?

·         Is the product complexity consistent with established ranges of development effort and schedule?

·         Is the developer technology consistent with current development capability?

Tables of historic data are used to assess the relative quality of each data point.  Observing cost, schedule, and productivity data from these three test viewpoints provides a reasonable assessment of data validity.

Development Productivity

Project type and developer capability are major productivity drivers.  Although productivity values can theoretically range from near zero to several hundred source lines per person month (lppm), the values for specific product lines tend to cluster in narrow groups.  Developer capability ratings (described in a following section) also tend to fall into a narrow range of values; thus productivity is infrequently above the range specified for the project type.  For example, historic development productivity values for spacecraft payloads typically fall between approximately 30 and 60 lppm.  Payload productivity values outside that range must be questioned in terms of effective size values and capability ratings.

The validation process compares the data point productivity value to the productivity range typical for the corresponding application type.  The comparison results are classified as typical, questionable, or invalid. 
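The range comparison can be sketched as follows.  This is a minimal illustration, not the Sage implementation: the function names, the 25 percent tolerance band that separates "questionable" from "invalid", and the sample data point are all assumptions.  Only the 30 to 60 lppm spacecraft payload range comes from the text above.

```python
from enum import Enum


class Result(Enum):
    TYPICAL = "typical"
    QUESTIONABLE = "questionable"
    INVALID = "invalid"


def productivity(effective_size_lines, effort_person_months):
    """Productivity in source lines per person month (lppm)."""
    return effective_size_lines / effort_person_months


def classify(value, low, high, tolerance=0.25):
    """Compare a value with the typical range [low, high].

    Inside the range: typical.  Outside the range but within `tolerance`
    (an assumed fractional margin on each bound): questionable.
    Anything further out: invalid.
    """
    if low <= value <= high:
        return Result.TYPICAL
    if low * (1 - tolerance) <= value <= high * (1 + tolerance):
        return Result.QUESTIONABLE
    return Result.INVALID


# Spacecraft payload range quoted in the text: roughly 30 to 60 lppm.
PAYLOAD_RANGE = (30.0, 60.0)

# Hypothetical data point: 45,000 effective lines, 1,000 person months.
lppm = productivity(45_000, 1_000)
print(lppm, classify(lppm, *PAYLOAD_RANGE).value)
```

A real range table would hold one entry per application type, and the margin between questionable and invalid would come from the historic data rather than a fixed fraction.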

Complexity

Project complexity has a consistent, observable property.  This property, or characteristic, is a relationship between development cost and schedule.  The property was first observed by Larry Putnam in the 1970s and used in the SLIM software estimating system.  Complexity has been defined as a function of the full-scale development effort and the full-scale development time.

The validation process calculates an effective complexity value for the data point, based on the development effort and schedule.  The calculated complexity value is compared with the typical value range for the application type.  The comparison results are classified as typical, questionable, or invalid.
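The text states only that complexity is a function of full-scale development effort and time; it does not give the formula.  Purely for illustration, the sketch below assumes a hypothetical form, effort divided by a power of the schedule, with the exponent left as a parameter.  The functional form, the default exponent, the units, and the sample range are all assumptions and should not be read as the Sage calculation.

```python
def complexity(effort_person_years, schedule_years, exponent=3.0):
    """Hypothetical complexity value: effort / schedule ** exponent.

    The form and the default exponent are assumptions made for
    illustration, not values taken from the source.
    """
    if effort_person_years <= 0 or schedule_years <= 0:
        raise ValueError("effort and schedule must be positive")
    return effort_person_years / schedule_years ** exponent


def within_range(value, low, high):
    """True when the value lies inside the typical range, inclusive."""
    return low <= value <= high


# Made-up data point: 120 person-years of effort over a 2-year schedule.
d = complexity(120.0, 2.0)  # 120 / 2**3 = 15.0

# Placeholder range for the application type (not a published value).
print(d, within_range(d, 12.0, 18.0))  # prints: 15.0 True
```

Whatever the actual formula, the structure is the same: one derived number per data point, checked against a range keyed by application type.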

Technology Rating

The developer technology rating is a numeric measure of an organization’s software development capability.  This measure is directly related to the productivity achievable by an organization. 

The validation process calculates a rating value for the data point that is based upon size, development effort, environment, and product characteristics.  The calculated value is then compared with the typical technology value range.  The comparison results are classified as typical, questionable, or invalid.

Data Confidence Level

The end result of the validation process is a data Confidence Rating Level (CRL) code.  The CRL scale runs from 0 (garbage) to 100 (all validation results within nominal ranges).  The confidence rating merges the results of the three validation tests (productivity, complexity, and technology) into a single statistical measure.  The ratings are defined in the following table:

 

Value   Definition        Comments
100     Credible          Data is nominal for all validation tests
90      Highly likely
80      Realistic         Within normal ranges for all validation tests
70      Likely
60      Reasonable        Within normal ranges for some validation tests
50      Possible
40      Unlikely          Within normal range for one validation test
30      Rough
20      Little chance     All validation tests are marginal
10      Highly unlikely
0       Incredible        Data failed all validation tests (not even close)
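The source does not state how the three test results are merged into a CRL value.  As a rough sketch under stated assumptions, the code below counts how many tests came back "typical" and maps that count onto the commented rows of the table.  Treating "questionable" as "marginal", assigning 10 to a mix of questionable and invalid results, and never producing 100 or the uncommented levels (which would need finer-grained information than a three-way classification) are all assumptions of this example.

```python
# Definitions copied from the CRL table.
CRL_LABELS = {
    100: "Credible", 90: "Highly likely", 80: "Realistic", 70: "Likely",
    60: "Reasonable", 50: "Possible", 40: "Unlikely", 30: "Rough",
    20: "Little chance", 10: "Highly unlikely", 0: "Incredible",
}


def confidence_rating(results):
    """Merge three test outcomes into a CRL value (hypothetical rule).

    `results` holds the productivity, complexity, and technology
    outcomes, each 'typical', 'questionable', or 'invalid'.
    """
    results = tuple(results)
    allowed = {"typical", "questionable", "invalid"}
    if len(results) != 3 or not set(results) <= allowed:
        raise ValueError("expected three typical/questionable/invalid results")

    typical = results.count("typical")
    invalid = results.count("invalid")
    if typical == 3:
        return 80   # within normal ranges for all tests
    if typical == 2:
        return 60   # within normal ranges for some tests
    if typical == 1:
        return 40   # within normal range for one test
    if invalid == 0:
        return 20   # all tests marginal (assumed: all questionable)
    if invalid == 3:
        return 0    # failed all tests
    return 10       # assumed value for a questionable/invalid mix


crl = confidence_rating(("typical", "typical", "questionable"))
print(crl, CRL_LABELS[crl])  # prints: 60 Reasonable
```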

 

 

Copyright ©2010 Software Engineering, Inc.           Last update: 13 April 2010