We just finished reading Esty and Cornelius' Environmental Performance Measurement: The Global Report 2001-2002, where the ESI was discussed. We also looked at the 2002 ESI online. We are finding that the methodology sections (both in the book and online) are insufficient to allow us to replicate the rankings. We are interested in seeing how the rankings change if we impute our own weighting schemes, but to do so requires that we understand how they were done in the first place. So, here are our questions: 1. How were the z-scores for each of the variables converted to a percentile? 2. How were the percentiles converted to the indicator? Was this a sum of all the percentiles, an average of the percentiles, or something different? 3. How were the indicators compiled to create the overall ESI? 4. Were missing values included in all the above computations or were values imputed in the above prior to the computation of z-scores?
A first basic clarification: the Esty and Cornelius chapters rely on the 2001 ESI, and the 2002 ESI changed parts of the methodology. The answers below refer to the 2002 ESI. Annex two in our 2002 ESI report has more detail, including some steps you didn’t ask about that would be necessary to replicate our rankings. For example, we converted some variables to log scales, and we trimmed the tails of extreme distributions.
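As a rough illustration of those two preprocessing steps, here is a minimal Python sketch. It is not the procedure from the report: the function names, the base-10 log, the choice to clamp (rather than drop) extreme values, and the 2.5/97.5 percentile cutoffs are all assumptions made for the example. Check the annex for the rules actually used for each variable.

```python
import math

def log_transform(values):
    """Take base-10 logs of strictly positive values (base is an assumption)."""
    return [math.log10(v) for v in values]

def trim_tails(values, lower_pct=2.5, upper_pct=97.5):
    """Clamp values lying outside the given percentiles.

    The cutoffs and the clamping rule are assumed for illustration only.
    """
    ordered = sorted(values)
    n = len(ordered)

    def percentile(p):
        # Linear interpolation between the two closest ranks.
        k = (n - 1) * p / 100.0
        lo, hi = math.floor(k), math.ceil(k)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (k - lo)

    low, high = percentile(lower_pct), percentile(upper_pct)
    return [min(max(v, low), high) for v in values]

# Hypothetical raw values for one skewed variable across five countries.
raw = [1, 10, 100, 1000, 1e9]
logged = log_transform(raw)
trimmed = trim_tails(logged)
```

With these made-up inputs, the log step compresses the extreme value and the trimming step then pulls both ends of the distribution in toward the cutoffs.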
1. How were the z-scores for each of the variables converted to a percentile?
We didn’t convert the z-scores to percentiles but rather left them as z-scores.
2. How were the percentiles converted to the indicator? Was this a sum of all the percentiles, an average of the percentiles, or something different?
Each indicator is just the average of the z-scores of the variables that make it up.
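A minimal sketch of that averaging step in Python follows. The function names and the sample data are hypothetical, and the use of the population standard deviation is an assumption (the report may use the sample version). It also assumes every variable is already oriented so that higher values are better.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize one variable across countries (population SD assumed)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def indicator_scores(variable_columns):
    """Average the z-scores of an indicator's variables, country by country.

    variable_columns holds one list per variable, each ordered by country.
    """
    z_columns = [z_scores(col) for col in variable_columns]
    # zip(*...) regroups the per-variable lists into per-country tuples.
    return [mean(row) for row in zip(*z_columns)]

# Two hypothetical variables measured for three countries.
scores = indicator_scores([[1.0, 2.0, 3.0], [10.0, 20.0, 60.0]])
```

Each entry of `scores` is one country's value for the indicator, still on the z-score scale.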
3. How were the indicators compiled to create the overall ESI?
The ESI is calculated by first averaging the indicator scores, and then converting that average to a standard normal percentile.
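Since you want to try your own weighting schemes, this last step can be sketched in Python as below. The function name `esi_score` and the optional `weights` argument are hypothetical additions for experimenting; leaving `weights` unset gives the plain average described above. The sketch uses `statistics.NormalDist` (Python 3.8+) for the standard normal CDF.

```python
from statistics import NormalDist, mean

def esi_score(indicator_values, weights=None):
    """Convert one country's indicator scores to a 0-100 percentile.

    With weights=None the indicators are averaged equally. The weights
    argument is a hypothetical hook for alternative schemes.
    """
    if weights is None:
        avg = mean(indicator_values)
    else:
        total = sum(w * v for w, v in zip(weights, indicator_values))
        avg = total / sum(weights)
    # Standard normal CDF (mean 0, SD 1), scaled to a percentile.
    return 100.0 * NormalDist().cdf(avg)

# A country averaging exactly 0 across its indicators lands at the midpoint.
midpoint = esi_score([0.0, 0.0])

# The same indicators under an alternative, made-up weighting.
reweighted = esi_score([1.0, -1.0], weights=[3, 1])
```

Comparing the equal-weight result with a reweighted one for every country, then re-ranking, shows how sensitive the rankings are to the weighting choice.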
4. Were missing values included in all the above computations or were values imputed in the above prior to the computation of z-scores?
We made imputations at the variable level, not the indicator level. This would be the hardest step to replicate because we used a probabilistic imputation procedure whose outcomes are not deterministic.