Jan 28, 2014

Statistical transformations applied in the EPI

Image source: Rodolfo Herman at Wikipedia.com

For a step-by-step explanation of the process used to construct the Environmental Performance Index, see "Measuring Progress: A Practical Guide from the Developers of the EPI."

It is often desirable to apply a statistical transformation to a dataset. After reviewing the distribution of a dataset, one might determine that the data are skewed (e.g., with a heavy right tail) or could be inverted to improve interpretability. The EPI applies two types of transformations to some datasets: logarithmic and inversion.

Logarithmic transformations

Logarithmic transformation serves two purposes. First, and most importantly, if an indicator has a sizeable number of countries very close to the target, a logarithmic scale more clearly differentiates among the best environmental performers. Using raw (untransformed) data ignores small differences among top-performing countries and only acknowledges more substantial differences between leaders and laggards. The use of the log transformation has the effect of “spreading out” performance, allowing the EPI to reflect important differences, not only between the leaders and laggards, but also among the best performers.
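The "spreading out" effect can be illustrated with a small sketch. The indicator values below are hypothetical (they are not EPI data), the helper `rescale` is an illustrative name, and the use of the natural logarithm and min-max rescaling are assumptions made for the example rather than a description of the EPI's actual scoring procedure.

```python
import math

def rescale(values):
    """Min-max rescale a list of values to the 0-1 interval."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical right-skewed indicator values (lower is better); not EPI data.
raw = [1.0, 2.0, 4.0, 50.0, 100.0]
logged = [math.log(v) for v in raw]  # natural log is an assumption

raw_scaled = rescale(raw)
log_scaled = rescale(logged)

# Share of the full range that separates the two best performers.
raw_gap = raw_scaled[1] - raw_scaled[0]
log_gap = log_scaled[1] - log_scaled[0]

print(f"gap between top two on raw scale: {raw_gap:.3f}")
print(f"gap between top two on log scale: {log_gap:.3f}")
```

On the raw scale the two best performers are separated by about 1 percent of the range; on the log scale the same pair is separated by about 15 percent, so the difference between them becomes visible.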

Second, logarithmic transformation improves the interpretation of differences between entities scored at opposite ends of the scale. As an example, consider two comparisons of particulate matter (PM10): top performers Venezuela and Grenada (with PM10 values of 10.54 μg/m³ and 20.54 μg/m³, respectively), and low performers Libya and Kuwait (87.63 μg/m³ and 97.31 μg/m³, respectively). Both comparisons involve differences of roughly 10 units on the raw scale (μg/m³), but they are substantively different. Grenada's concentration is nearly double Venezuela's, while Kuwait's is only about 11 percent higher than Libya's, a much smaller difference on a log scale.
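The PM10 comparison can be checked with a few lines of arithmetic. This is a minimal sketch using the four values quoted above; the function name `compare` is illustrative, and the natural logarithm is an assumption (any base gives the same ordering of the two differences).

```python
import math

# PM10 concentrations (μg/m³) as quoted in the text.
pm10 = {"Venezuela": 10.54, "Grenada": 20.54, "Libya": 87.63, "Kuwait": 97.31}

def compare(better, worse):
    """Return the raw difference, ratio, and log difference for a pair."""
    raw_diff = pm10[worse] - pm10[better]
    ratio = pm10[worse] / pm10[better]
    log_diff = math.log(pm10[worse]) - math.log(pm10[better])
    return raw_diff, ratio, log_diff

top = compare("Venezuela", "Grenada")
bottom = compare("Libya", "Kuwait")

for label, (raw_diff, ratio, log_diff) in [("top pair", top), ("bottom pair", bottom)]:
    print(f"{label}: raw diff = {raw_diff:.2f}, ratio = {ratio:.2f}, log diff = {log_diff:.3f}")
```

The raw differences are nearly identical, but the log difference for the top pair is several times larger than for the bottom pair, which is the distinction the log scale is meant to preserve.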

Compared to the use of the raw measurement scale, the log scale downplays the differences between the leaders and laggards, while more accurately reflecting the nature of differences at all ranges of performance. This data transformation can encourage continued improvements by the leaders, for whom even small improvements can be difficult to make, but it offers relatively smaller rewards for the same amount of improvement among the laggards.


Inversion transformations

In some cases, it is also necessary to invert data so that they fit appropriately into an index's framework. This most commonly occurs with the EPI when "good" performance for one dataset lies at the opposite end of the spectrum from other data. For example, 100 percent of critical habitat protected implies a high level of environmental performance (i.e., "good" performance), whereas 100 percent of fisheries overexploited or collapsed implies poor performance. To keep high scores on the same end of the performance spectrum, the latter dataset could be inverted by subtracting each value from one (or from 100 when the data are expressed as percentages).
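Inversion amounts to a single subtraction, as the following sketch shows. The function name `invert`, its `maximum` parameter, and the sample values are hypothetical and chosen only for illustration; they are not taken from the EPI datasets.

```python
def invert(values, maximum=1.0):
    """Flip a 'higher is worse' indicator so that higher means better."""
    return [maximum - v for v in values]

# Hypothetical shares of fish stocks overexploited or collapsed, as proportions.
overexploited = [0.0, 0.25, 0.5, 1.0]
inverted = invert(overexploited)
print(inverted)

# The same idea for data expressed as percentages (0-100).
inverted_pct = invert([100.0, 40.0], maximum=100.0)
print(inverted_pct)
```

After inversion, a country with every stock overexploited scores zero and a country with none overexploited scores the maximum, which matches the direction of indicators such as habitat protection.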