Past Performance as Technical Factor


Dormer30

I believe when evaluating past performance as part of a Best Value evaluation, that if a company fails to provide any past performance then we simply rate them as "Neutral." Neutral is the word we use in lieu of "Not rating them favorably or unfavorably." I have a company who rated Neutral in past performance, but Exceptional in all of the other tech factors. Since I cannot use Past Performance unfavorably, I can still rate that company "Exceptional" overall, right?

I am getting conflicting guidance from Legal and another contracting level. Anybody have an opinion or have this happen?

Thanks,

John

Let me start by saying, I haven't had this happen to me. Was this potential situation considered and addressed in either your acquisition plan or your technical evaluation plan? If so, as long as you do what you said you were going to do, then you are golden. Personally I don't believe you have an issue as long as you do it the same way in all the evaluations and you document what you consistently did (Eval report shows that for vendor with Neutral on Past Performance, you determined overall proposal rating by doing x).

If your approach to determining overall rating is averaging, then I think you are okay by changing the divisor to be the number of remaining factors. If the approach involves adding up weighted scores then you need to ensure that your method doesn't give them a zero for that factor and thus negatively impact their total.
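To make the two cases above concrete, here is a minimal sketch. The numeric scale, factor names, and weights are all hypothetical (nothing here comes from an actual RFP); whatever you really use has to match what your solicitation and evaluation plan say.

```python
from typing import Dict, Optional

# Hypothetical numeric scale for illustration only; a real one must come
# from the RFP or the evaluation plan.
SCALE = {"Poor": 1, "Marginal": 2, "Average": 3, "Good": 4, "Exceptional": 5}
NEUTRAL = "Neutral"


def average_rating(ratings: Dict[str, str]) -> Optional[float]:
    """Average the rated factors; each Neutral factor shrinks the divisor."""
    scored = [SCALE[r] for r in ratings.values() if r != NEUTRAL]
    return sum(scored) / len(scored) if scored else None


def weighted_total(ratings: Dict[str, str], weights: Dict[str, float],
                   neutral_as_zero: bool = False) -> float:
    """Weighted score expressed on the same scale as SCALE.

    neutral_as_zero=True reproduces the pitfall: Neutral scores 0 but its
    weight stays in the divisor. Otherwise the Neutral factor and its
    weight are both dropped.
    """
    total = 0.0
    weight_sum = 0.0
    for factor, rating in ratings.items():
        if rating == NEUTRAL:
            if neutral_as_zero:
                weight_sum += weights[factor]  # weight counted, score is 0
            continue
        total += weights[factor] * SCALE[rating]
        weight_sum += weights[factor]
    if weight_sum == 0:
        raise ValueError("no rated factors to aggregate")
    return total / weight_sum


# Hypothetical offeror: Exceptional everywhere except Neutral past performance.
ratings = {"Technical": "Exceptional", "Management": "Exceptional",
           "Past Performance": NEUTRAL}
weights = {"Technical": 40, "Management": 30, "Past Performance": 30}

print(average_rating(ratings))                                  # 5.0
print(weighted_total(ratings, weights))                         # 5.0
print(weighted_total(ratings, weights, neutral_as_zero=True))   # 3.5
```

With the divisor adjusted, the offeror stays at the top of the scale; with the zero left in, the same offeror drops to 3.5, which is exactly the negative impact to avoid.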

Guest Vern Edwards

You asked what kind of aggregate ("overall") rating you could assign to the offeror with the neutral rating. Is it necessary to assign an overall rating? Hopefully, you did not tell offerors that you would aggregate the nonprice factor ratings for the purpose of making tradeoffs. If you did not, then don't do it. It's not necessary and it complicates your problem. If you did, then what you do depends on what you said in your RFP, which I have not seen. "Neutral" signifies that there is no record of past performance, so you cannot say that past performance was good or bad. However, it is well-established in GAO case law that you do not have to consider neutral to be as good as a favorable rating. It does not make sense to give two offerors the same rating when one was rated exceptional in past performance and the other was rated neutral. So you might rate one "Exceptional" and the other "Exceptional - (minus)".

It appears that your factors include nonprice factors other than past performance, past performance, and price, and that one offeror received ratings of exceptional, exceptional, and has a higher price and the other offeror has ratings of exceptional, neutral, and has a lower price. The exceptional ratings for the other nonprice factors essentially cancel each other out, so you are left with a tradeoff of exceptional and a higher price against neutral and a lower price. The GAO has held that in making such tradeoffs you can choose the offeror with the better rating and the higher price over the offeror with the neutral rating and the lower price. See, e.g., Blue Rock Structures, Inc., Comp. Gen. Dec. B-287960.2, 2001 CPD ¶ 184:

[W]here a solicitation, as here, provides for award on the basis of the best value offer, a determination to award to a higher-priced offeror with a good past performance record over a lower-priced offeror with a neutral past performance rating is not precluded. Such a determination is consistent with making a price/technical tradeoff to determine if one proposal's technical superiority is worth the higher cost associated with that proposal. See Eng'g and Computation, Inc., B-275180.2, Jan. 29, 1997, 97-1 CPD ¶ 47 at 4-5; Excalibur Sys., Inc., B-272017, July 12, 1996, 96-2 CPD ¶ 13 at 3.

You must be able to make the case that the higher priced offeror's exceptional rating in past performance was worth the marginal difference in price.

Without reading your RFP, any answer is going to have to make a lot of assumptions. There is a lot available on this topic generally, and on wifcon in particular. I would recommend doing a Google search:

site:wifcon.com +"past performance" +neutral

You'll get links to Mr. Phillips' article and very useful blog posts by Don Acquisition and Vern Edwards, not to mention the Bid Protest by FAR area. Michael Golden also has an excellent article on past performance that shouldn't be hard to find online, which I believe Vern has referenced in his blog.

You asked what kind of aggregate ("overall") rating you could assign to the offeror with the neutral rating. Is it necessary to assign an overall rating? Hopefully, you did not tell offerors that you would aggregate the nonprice factor ratings for the purpose of making tradeoffs. If you did not, then don't do it. It's not necessary and it complicates your problem.

Vern,

Why do you think that aggregation of the nonprice factors makes things more complicated? Isn't that what the level of confidence assessment rating (LOCAR) is? I thought that made things less complicated.

Guest Vern Edwards

Don,

Yes, the LOCAR method is an aggregation method, and a good one, if I can say so myself, but it is complicated. I don't think I ever said it makes things easier. I claimed that it enables one to think about the problem more clearly and rationally. One must understand it thoroughly and have prepared one's RFP with that method in mind in order to make it work well. Moreover, it uses numerical scoring, which is anathema in many agencies, although it can be adapted for use with adjectival ratings. Another good aggregation method is SMART, which stands for Simple Multi-attribute Rating Technique. It is described in detail in Decision Analysis for Management Judgment, 3d ed., by Goodwin and Wright, Chapter 3.
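For anyone who has not seen SMART, a rough sketch of its additive core as it is commonly described: each option gets a value on every attribute, the attribute weights are normalized to sum to 1, and the aggregate is the weighted sum. The attribute names, values, and weights below are invented for illustration and are not taken from the book or from any solicitation.

```python
from typing import Dict


def normalize(raw_weights: Dict[str, float]) -> Dict[str, float]:
    """Rescale raw importance weights so they sum to 1."""
    total = sum(raw_weights.values())
    return {name: w / total for name, w in raw_weights.items()}


def smart_score(values: Dict[str, float], raw_weights: Dict[str, float]) -> float:
    """Weighted sum of attribute values (0-100) using normalized weights."""
    weights = normalize(raw_weights)
    return sum(weights[name] * values[name] for name in weights)


# Hypothetical weights: most important attribute set to 100, others relative to it.
swing_weights = {"technical": 100, "management": 60, "past_performance": 40}

# Hypothetical 0-100 values for two offerors.
offeror_a = {"technical": 90, "management": 80, "past_performance": 100}
offeror_b = {"technical": 100, "management": 70, "past_performance": 50}

print(round(smart_score(offeror_a, swing_weights), 1))  # 89.0
print(round(smart_score(offeror_b, swing_weights), 1))  # 81.0
```

The arithmetic is the easy part. The study and preparation go into choosing the attributes, the value scales, and the weights, and into saying so in the RFP.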

All such methods require study and preparation, but work well in the hands of someone who knows what he or she is doing. There are not many such people in my experience.

The approach you are contemplating is an "exclusionary" approach, which is fine. Another approach would be a "quantitative" approach. For instance, the evaluation scale may run poor/marginal/average/good/exceptional. You may rate the neutral as "average," which GAO has upheld. In other words, there is no cookie-cutter approach to treating neutral past performance.

In at least one COFC case, the court actually preferred the "exclusionary" approach, which it held to be the purest method of treating a lack of past performance as neither negative nor positive. Here is a snippet from that case (Metcalf Construction Company, Inc., v. The United States and Lend Lease ACTUS (Intervenor), COFC No. 02-5C, September 24, 2002):

The plain meaning of the word neutral is: "Not allied with, supporting, or favoring either side... Not one thing or the other: indifferent;" i.e., having no effect, null or zero. When a literal interpretation is given to the statute, and a zero (or null) value is applied to the neutral rating factor, the subject offeror's overall rating suffers as compared to other bidders. In the case at bar, the offeror is partially spared from the harshest possible result because the neutral rating occurs in only one of two equally weighted subfactors, instead of a whole category or factor. Operating from the standpoint of strict adherence to the language of the statute, no value or a null value should be substituted in the past performance subfactor under Factor C as illustrated below:

According to the solicitation, the ratings in both subfactors, "NR" and "HA," are "equally weighted." Therefore, the logical first-blush inclination would be to "average" the two ratings to arrive at a single overall rating. Since these ratings are adjectival, rather than numerical, an exact (quantitative) value or rating cannot result. We are thereby faced with the confines of solving an equation of non-quantitative variables, which logically, we know, must yield a result that is lower than "HA," but for which the actual result is speculative. (If this were a numerical scheme, and HA = 3, and A = 2, the outcome would be 1.5, which is a rating lower than "Acceptable"). This court, therefore, and in conjunction with the administrative cases cited above condemning the "zeroing effect," rejects the functionality (and fairness) of a null value application since said application has the effect of treating the offeror unfavorably.

While the court found no binding authority on this issue, we have looked to the numerous administrative decisions dealing with the applicable statute. There are a couple of variations in applying the statute which have been contrived in order to avoid the otherwise harsh outcome that results from a literal application of the statute (the zeroing effect). First, there is assigning a quantitative value to the neutral rating reflecting the mid-point of the applicable rating scale; e.g., "good," or "satisfactory." And, in the case of a numerical rating scheme, a number representing the mid-point of the applicable rating scale is assigned. Alternatively, there are other cases where the category with the "NR" rating is totally eliminated for the affected bidder(s). We will look at each approach separately.

1. Quantitative Approach

This court is unpersuaded that the assignment of any value at all to a neutral rating operates within the meaning of the statute since "the offeror may not be evaluated favorably or unfavorably on the factor of past performance." Surely even a middle value may favor or disfavor an offeror. The defendant apparently applied this "mid-point" theory, by treating the neutral rating as equivalent to the "Acceptable" rating: "The Navy therefore, [sic] rated all offerors equally, consistently, and fairly, as each offeror receiving one 'highly acceptable' and one 'acceptable' (or 'neutral') received an overall 'acceptable' rating." Def. Motion at 26. The court rejects this approach as inconsistent with, and therefore violative of, the operative statute.

2. Exclusion Approach

Remaining, then, is the theory of totally eliminating the factor, or in this case, the subfactor, from the affected bidder's evaluation. In Meridian Mgmt. Corp., past references questionnaires contained thirty-one (31) questions that were rated separately, then divided by 31 to arrive at an overall "experience/past performance score." The protester in that case argued that it was penalized when it received a "0" for a question regarding laboratory work that was "not applicable" in a prior contract. The agency involved in that solicitation agreed that the protester should not be penalized, and subsequently recalculated the protester's score by dividing the total for the thirty (30) answered questions by 30, instead of 31. In doing so, the agency totally eliminated the question that was "not applicable" to the protester from its overall "experience/past performance score."

The court finds that the approach taken in Meridian is the better approach, which neither (1) treats the bidder favorably nor unfavorably by unduly quantifying (as "Acceptable") the neutral rating, nor does it (2) lead to a speculative or unfavorable result (via averaging). Therefore, under the Meridian approach, subfactor 1 of Factor C, in the case at bar, is completely eliminated from Metcalf's Factor C evaluation, since subfactor 1 is, in effect, "not applicable" to Metcalf. Thus, Metcalf is to be evaluated purely on subfactor 2, which, in this case, becomes its overall rating for Factor C, to wit, "HA." This approach is consistent with the outcome derived by the TEB in its September 10, 2001 report. (App. A, page 3 of 4).
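The Meridian recalculation described in the passage above is easy to see with numbers. This is only a sketch: the passage does not give the individual question scores, so the 4-point answers below are made up, and the function name is hypothetical.

```python
from typing import List, Optional


def questionnaire_score(answers: List[Optional[float]],
                        exclude_na: bool = True) -> float:
    """Average questionnaire answers; None marks a 'not applicable' question.

    exclude_na=True divides by the number of answered questions (the
    recalculation described above); exclude_na=False treats each 'not
    applicable' as 0 and divides by every question asked.
    """
    answered = [a for a in answers if a is not None]
    if not answered:
        raise ValueError("no answered questions to score")
    divisor = len(answered) if exclude_na else len(answers)
    return sum(answered) / divisor


# Hypothetical data: 30 answered questions scored 4.0, one not applicable.
answers: List[Optional[float]] = [4.0] * 30 + [None]

print(questionnaire_score(answers))                              # 4.0
print(round(questionnaire_score(answers, exclude_na=False), 2))  # 3.87
```

Dividing by 31 pulls the score below what every answered question actually earned; dividing by 30 leaves it where the answered questions put it.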

Guest Vern Edwards

Metcalf is found at 53 Fed.Cl. 617 (2002). Here is what happened:

The factor in question, Factor C, was small business utilization. It consisted of two subfactors: (1) past performance and (2) proposed subcontracting. The two subfactors were equally weighted. The rating scheme in question was:

Highly Acceptable (HA)

Acceptable (A)

Neutral Rating (NR)

The protester scored NR on past performance and HA on proposed subcontracting. The source selection board gave the protester an overall rating of A. In discussing this result, Judge Gibson, who is no longer on the court, said:

According to the solicitation, the ratings in both subfactors, "NR" and "HA," are "equally weighted." Therefore, the logical first-blush inclination would be to "average" the two ratings to arrive at a single overall rating. Since these ratings are adjectival, rather than numerical, an exact (quantitative) value or rating cannot result. We are thereby faced with the confines of solving an equation of non-quantitative variables, which logically, we know, must yield a result that is lower than "HA," but for which the actual result is speculative. (If this were a numerical scheme, and HA = 3, and A = 2, the outcome would be 1.5, which is a rating lower than "Acceptable"). This court, therefore, and in conjunction with the administrative cases cited above condemning the "zeroing effect," rejects the functionality (and fairness) of a null value application since said application has the effect of treating the offeror unfavorably.

(Math is not my strong point, but I think that the average of two ratings, 3 and 2, would be 2.5, not 1.5: 3 + 2 = 5; 5/2 = 2.5. So I wonder about the judge's comment that the outcome would be 1.5 "which is a rating lower than 'Acceptable.'")

The court then posed this question:

The precise question before us now is whether the SSB properly evaluated the neutral ("NR") rating assigned to plaintiff in subfactor 1 of Factor C in the process of re-assigning Metcalf an overall Factor C rating of "A." Stated differently, the question is - whether in the process of writing the overall Factor C rating down from "HA" to "A," did the SSB include in the equation a quantification attributable to subfactor 1 of Factor C. To this question, we answer affirmatively.

The court considered the agency's adjustment, combining HA and NR to produce A, to be a form of what it called the "quantitative" approach. The court then described the approach taken by the agency in a GAO protest decision, Meridian Management Corp., Comp. Gen. Dec. B-285127, 2000 CPD ¶ 121, in which the agency handled a neutral rating by excluding the rating from consideration entirely. The court referred to that as the "exclusionary" approach, and said:

The court finds that the approach taken in Meridian is the better approach, which neither (1) treats the bidder favorably nor unfavorably by unduly quantifying (as "Acceptable") the neutral rating, nor does it (2) lead to a speculative or unfavorable result (via averaging). Therefore, under the Meridian approach, subfactor 1 of Factor C, in the case at bar, is completely eliminated from Metcalf's Factor C evaluation, since subfactor 1 is, in effect, "not applicable" to Metcalf. Thus, Metcalf is to be evaluated purely on subfactor 2, which, in this case, becomes its overall rating for Factor C, to wit, "HA." This approach is consistent with the outcome derived by the TEB in its September 10, 2001 report.

The court found that the agency's method was an incorrect application of the neutral rule, but that the offeror was not prejudiced by it.

Judges on the COFC are not bound by the decisions of the other judges, and Metcalf has not been cited for its past performance analysis as far as I could determine. Judge Gibson is gone. I don't think his analysis was particularly astute. There are serious problems with the exclusionary approach if not handled properly during the tradeoff analysis, which the judge does not mention. GAO has not cited Metcalf in any of its decisions and does not seem to have an issue with what the court called the "quantitative" approach. See, e.g., Joint Management and Technology Services, Comp. Gen. Dec. B-294229, 2004 CPD ¶ 208, decided two years after Metcalf (permissible to assign midpoint rating of 5 points on a scale of 0 to 10 to offeror with no record of past performance).

All the same, Metcalf illustrates the difficulties associated with developing "overall" ratings by adding the rating or score for past performance to other nonprice ratings.

Guest Vern Edwards

Yes, when I reread the passage, I think the judge meant for his math to work like this:

(HA + NR)/2 = avg.

(3 + 0)/2 = 1.5.

What's odd about the court's analysis is that the agency combined HA and NR to produce 2, Acceptable, not 1.5. So he could see that they did not use a strictly quantitative approach. I don't know why he bothered with his 1.5, "lower than acceptable" comment.
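For what it's worth, here is the arithmetic laid side by side as a small sketch. HA = 3 and A = 2 are the court's own hypothetical values; scoring NR as 0 is the reading suggested above, not something the opinion spells out.

```python
# Court's hypothetical values; NR = 0 is the assumed "zeroing" reading.
HA, A, NR_AS_ZERO = 3, 2, 0

zeroing = (HA + NR_AS_ZERO) / 2   # NR counted as 0, divisor kept at 2
plain_average = (HA + A) / 2      # average of an HA and an A, no neutral involved
exclusion = HA / 1                # NR subfactor dropped, divisor reduced to 1

print(zeroing)        # 1.5
print(plain_average)  # 2.5
print(exclusion)      # 3.0
```

None of the three produces the 2 ("Acceptable") that the agency actually assigned, which is consistent with the agency not having applied a strictly numerical formula.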

Having just come upstairs this evening from watching "Comedy Central" and drinking too much Jack Daniels, this whole thread seems to belong on that channel. The assumed precision of adjectival and especially numerical assigned ratings and assigned summary ratings using any kind of overall rating system to justify the "best value" tradeoff decision is initially hilarious reading - but is actually sad to me. What the heck difference is there really between an exceptional and exceptional minus (or whatever) rating?

In a tradeoff decision, we should be primarily interested in drilling down to the underlying basis for the ratings - the relative strengths or advantages and weaknesses or disadvantages of the various proposals in making the tradeoff decisions. Then, for example, decide whether the relative advantages of one proposal would justify paying the additional price, when a higher "rated" proposal costs more than another qualifying proposal.

We seem to want to make this into "rocket science" with all these formulas and other machinations, which seemingly provide some kind of justification for the selection decision.

Having attended several Design-Build Institute of America national conferences, I also see state and local government entities that are just now venturing into "best value" competitions justifying their selections with formulas such as $/points, or with other goofy mechanical formulas that nobody, owner or industry, seems to understand. Since few seem to understand the intricacies of the method, the industry more or less accepts the decisions, and the state and local officials seem to be publicly happy with their justifications.

When my former office used to use those numerical machinations back in the late 80's, industry seemed to accept the decisions. But in truth, my boss would come ask me (when I wasn't on a selection board) to go and read all the proposals and come up with some discussion points to help him explain in the debriefings why their firms weren't selected. That was pretty silly from my perspective.

I like to evaluate "past performance" as a "proposal risk", rather than use the same rating system as other factors. Thus, a lack of "past performance" is rated as an "unknown risk" versus low, moderate or high risk of high performance for those firms that have a past performance history. What the heck is a "neutral" rating, anyway? I think that is almost meaningless.

So a proposal could have an overall exceptional rating with "unknown proposal risk" if they cannot provide any kind of past performance rating by the owner.

Then we look at the underlying differences between the quality proposals and their relative prices. It's usually not that hard to determine the "best value", at least for construction and design-build source selections.

Supply and services might be more difficult to distinguish between proposers. But, gee whiz, we all make best value decisions in our everyday purchases, usually without resorting to some silly numerical or adjectival summary rating.

Even in Consumer Reports, one can look at the underlying basis for "factor" ratings in the accompanying articles or at the on-line, more detailed reports, then decide for themselves what they are willing to pay for a product.

Guest Vern Edwards

Joel:

What's your point? Is your bottom line that you do not like to use numerical or adjectival rating schemes or that you do not like the idea of aggregating individual factor ratings into "overall" ratings? You can't be opposed to adjectival rating schemes per se, because you say you use such a scheme to rate what you call "proposal risk." And you can't be opposed to overall ratings per se, because you say that in your scheme a proposal could have "an overall exceptional rating with 'unknown proposal risk' if they cannot provide any kind of past performance rating by the owner."

Is your point that tradeoffs should not be made on the basis of comparisons of scores or ratings, but on the basis of the comparisons of the underlying differences among proposals? If so, I think we would all agree with that, especially since the GAO has taken that position in its protest decisions. That was never at issue in this thread.

The purpose of numerical and adjectival scoring and rating schemes is to simplify complex information. However, simplification results in the loss of detail. Thus, scores and ratings can be used for preliminary analyses, but should not be used as the basis for tradeoffs and decisions. That has long been known and understood by experienced practitioners. I wrote a long article about that for The Nash & Cibinic Report in January 2009, "Postscript II: Scoring or Rating in Source Selection," 23 N&CR ¶ 2.

As for "goofy mechanical formulas" and "numerical machinations," i.e., numerical scoring, such methods can be analytically quite helpful in complex evaluations when used by knowledgeable practitioners with appropriate training, much more helpful than adjectival schemes. The problem in source selection is that most of the people who use such methods have not been properly educated and trained and do not understand how to construct them or use them properly.

Would you like to know more? I can provide you, or anyone else, with a list of books.

Happy holidays. Have a sip of Jack Daniels for me. I said a sip.

Vern

Joel:

What's your point? Is your bottom line that you do not like to use numerical or adjectival rating schemes or that you do not like the idea of aggregating individual factor ratings into "overall" ratings?

Vern, while overall ratings are useful for an initial cut at competitive ranges and for starting the comparative, trade-off analysis, I am opposed to the over-use and exclusive use of the ratings to make the trade-off selection decisions, which IS being done by many selection authorities, KO's and evaluation teams. This is especially pervasive when using point scoring systems, which is one major reason why Army prohibited further use of numerical scoring systems in 2001.

I am also opposed to scoring price, using price per quality point ratios and other such mechanical methods.

Scoring anything, especially price, assumes too much precision in the makeup of the various factors or their relative importance. I have seen selections based upon a couple of points difference between proposals, without any discussion of the underlying basis for the rating.

When Army banned numerical scoring, a large number of KO's expressed frustration in not knowing how to assign adjectival ratings or to compare proposals.

Why? Because teams were used to assigning scores and then trying to develop the background commentary (strengths, weaknesses, etc.) to support what they already perceived the score to be, rather than the other way around. Sheesh. You were right when you said that there is not much knowledge or skill in conducting source selections.

I pretty much agree with what you said in your post.

I'll lift a Jack for you. Merry Christmas.
