All Over the Map — The Imprecision of 24 California HERS Ratings
A home energy rating is supposed to tell you how energy efficient your home is. A certified home energy rater goes to the home and collects all the data relevant to energy consumption in the home (well, all the data included in the rating anyway, which is almost everything). Then they enter the data into energy modeling software and get the results: consumption for heating, cooling, water heating, lights, and appliances plus this thing called a HERS Index. But what if the results were off by 50%, 100%, or even more?
That's exactly what a new study in California has found. In the same project I wrote about earlier this year (Stockton Project Demonstrates Huge Home Energy Savings), researchers John Proctor, Rick Chitwood, and Bruce Wilcox hired six California HERS raters to rate the four homes in the study. Here's a quick description of the homes:
The envelope, please
The study, titled Round Robin HERS Ratings of Four California Homes: A Central Valley Research Homes Project, was published in May 2014 and presents the data from the raters the researchers hired to rate each home. And who did those ratings? Before doing any work on the homes, the researchers hired six HERS raters, who rated each home using the California HERS protocols. All used the energy modeling software Energy Pro version 22.214.171.124.
Here's the summary table of the main results:
The second and third columns, labeled HERS Rating, show the range of HERS Index values calculated by the six raters and the percent difference between the highest and lowest. Only one of the homes (Mayfair) comes close to having an acceptable variation among raters. Even that 12%, however, is four times as much as RESNET allows in the field QA process for raters. The heating and cooling consumption are also shown, and the spread is even worse.
The graph below shows each rater's HERS Indices for the four houses. The vertical scale is the Index and the horizontal scale shows the raters by ID number. The Grange and Mayfair homes, being the two oldest, have the highest HERS Indices for all six raters, with the Grange being the highest of all four homes for five of the six raters.
Why so much variation?
Strangely, the smallest, simplest house had the most HERS Index variation among the six raters. The home the researchers called the Grange, shown in the photo at the beginning of this article, has only 852 square feet of conditioned floor area. It also has the simplest type of foundation to enter (slab on grade) and none of the complicating factors like attic kneewalls, vaulted ceilings, or sealed attics that can make some houses difficult to rate. It's just a simple, little box.
Despite the simplicity, however, the HERS Index for this home ranged from a low of 182 to a high of 269, a difference of 48%. One of the problems seems to have been the raters' inability to agree on the efficiency of the air conditioner. The six raters used SEER ratings of 8, 10, 11, and 12, a difference of 50%.
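The percentages quoted for the Grange can be checked with a few lines of arithmetic. This is a minimal sketch, not the researchers' method: the helper name percent_spread is made up for illustration, and it assumes the "percent difference" is taken relative to the lowest value, which is the convention that reproduces the 48% and 50% figures above.

```python
def percent_spread(low, high):
    """Percent difference between the highest and lowest value.

    Assumption: the lowest value is the baseline, i.e. (high - low) / low.
    """
    if low <= 0:
        raise ValueError("low must be positive")
    return (high - low) / low * 100.0


# Grange HERS Index range reported by the six raters
hers_spread = percent_spread(182, 269)

# Grange air conditioner SEER entries ranged from 8 to 12
seer_spread = percent_spread(8, 12)

print(f"Grange HERS Index spread: {hers_spread:.0f}%")
print(f"Grange SEER entry spread: {seer_spread:.0f}%")
```

Had the spread been taken relative to the highest value instead, the HERS Index figure would come out closer to 32%, so the choice of baseline matters when comparing these numbers with other studies.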
Another problem with accuracy was attic insulation. The researchers reported that the Mayfair, the second smallest and oldest of the homes, had an average of about one inch of insulation on top of the ceiling, as shown in the photo below. That would result in an R-value of about 3 or 4, yet three of the HERS raters entered that ceiling as being insulated to R-11. In the Fidelia home, the ceiling insulation entries varied from R-19 to R-49.
Accuracy versus precision
In their initial report on the HERS ratings, the researchers did not include a comparison of the raters' results to the measured results from the study. This is a three-year project in which the researchers started off with a baseline analysis of the homes, including these HERS ratings, and then proceeded through various improvements. Before making any improvements, however, they simulated occupancy of the homes and measured the energy consumption.
The graph below (marked Fig. 4) shows the spread of cooling energy use for the four homes as calculated by the energy modeling of the six raters and the monitored energy use with simulated occupancy. As you can see, the raters' numbers were significantly higher than the monitored results in nearly every case.
One interesting thing to notice here is that for the Mayfair, the raters had pretty good agreement. As shown in Table 1 (above), their results differed from one another by a maximum of 24%. The mean of their six results is 2903 kWh. The measured result for cooling energy use is 731 kWh. I could tell you how many standard deviations of the mean separate the results (22), but the graphic below shows what it means to be precise but not accurate. For that illustration to be numerically correct, though, the cluster of shots would have to be much farther from the bullseye.
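To put the Mayfair numbers in perspective, here is a rough sketch that works backward from the figures quoted above. The six individual rater results aren't listed here, so nothing below is computed from raw data. The variable names are illustrative, and reading the 22 as standard errors of the mean for six raters is my assumption, not something stated in the report.

```python
import math

mean_modeled_kwh = 2903   # mean of the six raters' cooling results (from the text)
measured_kwh = 731        # monitored cooling energy use (from the text)
n_separation = 22         # separation quoted in the text, in "standard deviations of the mean"
n_raters = 6

# Accuracy: how far the modeled mean sits from the measured value
gap_kwh = mean_modeled_kwh - measured_kwh
overprediction_ratio = mean_modeled_kwh / measured_kwh

# Precision: the size of one unit of separation implied by the quoted 22
implied_unit_kwh = gap_kwh / n_separation

# Assumption: if that unit is the standard error of the mean, the
# rater-to-rater standard deviation is larger by a factor of sqrt(n)
implied_sd_kwh = implied_unit_kwh * math.sqrt(n_raters)

print(f"Gap between modeled mean and measured: {gap_kwh} kWh")
print(f"Modeled mean / measured: {overprediction_ratio:.2f}x")
print(f"Implied unit of separation: {implied_unit_kwh:.1f} kWh")
print(f"Implied rater-to-rater standard deviation: {implied_sd_kwh:.0f} kWh")
```

Either way you read it, the raters' results sit within a few hundred kWh of one another while their mean is roughly four times the measured value. That's precise but not accurate.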
California HERS ratings
HERS ratings in California are not in the purview of RESNET, the nonprofit organization that oversees HERS ratings in most of the country. The state oversees the work of HERS raters there. The California guidelines are similar to RESNET's, with the same basic structure and quality assurance requirements. The California Energy Commission has a page on their HERS program, with links to their regulations, technical manual, and more.
Are HERS ratings worthless?
It's certainly tempting to think that HERS ratings are a waste of money when you see how imprecise and inaccurate these ratings are. After all, if the six raters here can differ so much from one another and from the measured results, that doesn't engender a lot of confidence in the process.
Even though California's rating process is separate from RESNET's, it's structured similarly, so two factors are probably most responsible for the discrepancies the researchers found in this study. One is that a HERS rating is an asset label. It's not meant to align perfectly with actual energy use because it's designed as a way to help you compare one house to another without the differences that arise from the way people live in their homes.
The other factor is quality assurance (QA). I wrote about this recently because RESNET is going through the process of trying to get greater consistency in HERS ratings (and unfortunately trying to go down the wrong road).
My company, Energy Vanguard, is a HERS provider, and I think we do a pretty good job with QA. We train our raters well and don't just automatically approve every file sent our way. Sometimes a rater will have to make corrections to a file three or four times before we approve it, and probably about half of the files we get need at least one correction.
But I think we go further than many. RESNET requires providers to check only 10% of all files, meaning 90% can get approved without anyone looking at them. And, as I wrote in my article about RESNET's improvement efforts, RESNET has done little to no technical oversight of providers, so it's not hard for bad ratings to get all the way through the process. I'm sure it's not much different in California, and that's why Proctor, Chitwood, and Wilcox found the results described here.
I think home energy ratings are a useful tool. They're not as good as they can be, however, and I think studies like this one can help expose the problems. Only then can we fix them.
HERS Estimates Vs. Energy Consumption - a report at the Proctor Engineering website containing some of the results described above
All but one of the images and graphs here are from the reports of John Proctor, Rick Chitwood, and Bruce Wilcox. The red bullseye image is from Wikimedia Commons and is in the public domain.