
Speaking of ratings...


Vern Edwards


Speaking of ratings... and incompetence.

Read R&K Enterprise Solutions, Inc., GAO B-419919.6, September 12, 2022. Read it and weep.

https://www.gao.gov/products/b-419919.6%2Cb-419919.7%2Cb-419919.8

The task order "fair opportunity" was conducted pursuant to FAR 16.505. The Air Force's Air Combat Command set out to award a task order for "training, operations, and administrative services" under GSA's OASIS contract by issuing a Fair Opportunity Proposal Request (FOPR), presumably pronounced "fop-er." The Air Force received "proposals" from nine "offerors."

As you read, keep the following long-standing and fundamental bid protest case law principle in mind:

Quote

[R]atings, whether numerical, color, or adjectival, are merely guides for intelligent decisionmaking. One Largo Metro LLC, et al., B-404896 et al., June 20, 2011, 2011 CPD ¶ 128 at 14. Specifically, before an agency can select a higher-priced quotation that has been rated technically superior to a lower-priced but acceptable one, the award decision must be supported by a rational explanation of why the higher-rated quotation is, in fact, superior, and explaining why its technical superiority warrants paying a price premium.

Alpha Omega Integration LLC, B-419812.2, August 10, 2021.

From The Nash & Cibinic Report, February 2006:

Quote

Scores or ratings may be helpful in contractor selection, but they are not sound bases for decisionmaking. No decisionmaker should ever base tradeoffs or source selection decisions on scores or ratings. If I were a CO today and were using a scoring or rating scheme, I would not describe it in an RFP or mention scores or ratings in a decision document. I would insist that decision documents explain tradeoffs and the decision rationale strictly in terms of specific attribute differences--good things and bad things that contribute or detract value. 

QUESTIONS:

  1. Did the evaluation factors make sense?
  2. Were the factors well-structured?
  3. Was the fair opportunity an essay-writing contest?
  4. Did the numerical rating scale make sense?
  5. Should the agency lawyer have rejected the decision document for explaining the decision in terms of point scores and price percentages?
  6. Why did the agency's lawyer fight that protest?
  7. The FOPR was issued on May 21, 2021. The Air Force awarded the task order on May 23, 2022. The GAO issued the decision last month. Was that fast? Too long?

Quote

If I were a CO today and were using a scoring or rating scheme, I would not describe it in an RFP or mention scores or ratings in a decision document.

I’ll reply to your questions later, Vern, after I read the decision. But describing a scoring or rating scheme in the RFP is a pet peeve of mine. I see it all the time and have asked numerous COs why they do it, but I have never received a satisfactory answer.


Since the procurement is for a suite of training, operations, and administrative services, I don’t see the evaluation factors making sense, nor are they well structured. It seems that experience and past performance would be better-suited factors. Perhaps a technical factor might include a transition plan and a methodology for conducting some of the work.

It definitely looks like an essay-writing contest. Years ago I saw a very effective approach for this type of work. It involved oral proposals and “pop quizzes” for the offerors’ key personnel.

I don’t see the numerical rating as beneficial at all.

Before I saw the question about whether the lawyers should have rejected the decision document, that was one of the first things that popped into my head. How could any lawyer reasonably versed in contracting not question it? It’s a $100 million action with selection based on the percentage difference in numerical points between one offeror and another. Who knows why the agency fought this? It may be hoping to cover up a big blunder.

Twelve months to award a task order for training, operations, and administrative services! It should take maybe four months, or five at most.


@formerfed Take a good, close look at the agency's numerical rating scale and then at the board's report to the SSA. Notice anything about the scale and the board's math? What do you think of its percentages?

Is the technical rating scale ordinal, interval, ratio, or nominal (categorical)?

Is price measured on an ordinal, interval, ratio, or nominal scale?


17 hours ago, Vern Edwards said:

@formerfed Take a good, close look at the agency's numerical rating scale and then at the board's report to the SSA. Notice anything about the scale and the board's math?

Is the technical rating scale ordinal, interval, ratio, or nominal (categorical)?

Is price measured on an ordinal, interval, ratio, or nominal scale?

bosgood got it. Footnote 11 of the decision also noted the errors.

It’s been many years since I took a statistics course, but I’ll take a guess at the scales: the technical rating score uses an ordinal scale while price uses a ratio scale.


The real mistake was that the Weighted Total Evaluation Score (WTES) uses numbers, but the scale is not numerical in the sense of reflecting quantities. It sets up ranked categories identified by the numbers 0, 3, 4, and 5. Those are not measures of anything. They are labels for categories on a nominal scale. The numbers are merely rank orders — four categories that might as well be ranked 0, 1, 2, and 3, with 3 the highest — so I call the scale categorically ordinal. See Stevens, "On the Theory of Scales of Measurement," Science, June 7, 1946.

https://psychology.okstate.edu/faculty/jgrice/psyc3214/Stevens_FourScales_1946.pdf

Two offerors can both be in Category 3, but one might be better than the other, though you cannot tell from the number whether that is so or, if so, which is better and by how much. Offerors in a higher category are better than offerors in a lower one, but you cannot tell how much better. You cannot validly perform mathematical operations like division on numbers from an ordinal scale. Yet the agency did just that, using the aggregated numbers to calculate percentage differences among offerors. And for reasons I do not understand, they set the scale up as follows: 0, _, _, 3, 4, 5. Why omit 1 and 2? Why not set the scale at 0, 1, 2, 3?

It seems an error of judgment to use such a system to calculate percentage variations in offeror quality and to make quality/price tradeoffs on that basis.
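The arithmetic problem can be shown with a small sketch. The scores below are hypothetical, not from the record: the same ranking of two offerors is expressed first with the 0/3/4/5 labels and then relabeled, order preserved, onto 0/1/2/3. The computed "percentage difference" changes even though nothing about the offerors' relative quality has changed, which is why percentage gaps on ordinal labels measure the labels, not the offerors.

```python
# Hypothetical factor ratings for two offerors, using the {0, 3, 4, 5}
# category labels described in the decision. Not the agency's actual data.
offeror_a = [5, 4, 5]
offeror_b = [4, 3, 4]

# The same ranking relabeled onto {0, 1, 2, 3}, preserving order.
relabel = {0: 0, 3: 1, 4: 2, 5: 3}
offeror_a2 = [relabel[x] for x in offeror_a]
offeror_b2 = [relabel[x] for x in offeror_b]

def pct_gap(a, b):
    """Percentage by which a's total score exceeds b's."""
    return 100 * (sum(a) - sum(b)) / sum(b)

print(round(pct_gap(offeror_a, offeror_b), 1))    # → 27.3 on the 0/3/4/5 labels
print(round(pct_gap(offeror_a2, offeror_b2), 1))  # → 60.0 on the 0/1/2/3 labels
```

Same offerors, same rank order, yet the "premium" implied by the scores more than doubles under an equally valid labeling, so neither percentage says anything real about how much better one offeror is than the other.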

And how could they not have known better than to explain their tradeoff and the source selection decision in terms of ratings? The GAO held way back in the '70s that such an approach fails the test of rationality if not backed up with descriptive findings.

And why, oh why, did the Air Force lawyers let them fight the protest? They should have told them to take corrective action and rewrite the decision document.

I feel embarrassed for the team that conducted this procurement.

