Data transformations are commonly used tools that can serve many functions in the quantitative analysis of data, including meeting assumptions and improving effect sizes, and thus constitute an important aspect of best practice.
A transformation is a mathematical modification of a variable to achieve a particular goal (e.g., normality, enhanced interpretability). There is an almost infinite variety of possible data transformations, from adding constants to multiplying, squaring or raising to a power, converting to logarithmic scales, inverting and reflecting, taking the square root of the values, and even applying trigonometric transformations such as sine wave transformations.
While these are important options for analysts, they fundamentally change the nature of the variable, making the interpretation of the results somewhat more complex. Further, few (if any) statistical texts discuss the tremendous influence that a distribution's minimum value (its anchor) has on the efficacy of a transformation. The goal of this paper is to promote thoughtful and informed use of data transformations, focusing on the three data transformations most commonly discussed in social sciences texts (square root, log, and inverse) for improving the normality of variables.
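The effect of the minimum value can be illustrated with a small simulation. The following is a minimal sketch, not the paper's own procedure: it assumes a simulated positively skewed sample, uses skewness as a rough stand-in for non-normality, and shifts the sample so that its minimum sits at a few illustrative anchor values (1, 10, and 100) before applying each of the three transformations. The helper names `skewness`, `anchor_at`, and `skew_after` are hypothetical and introduced only for this example.

```python
import math
import random


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def anchor_at(values, minimum=1.0):
    """Add a constant so that the smallest value equals `minimum`."""
    shift = minimum - min(values)
    return [v + shift for v in values]


# The three transformations discussed in the text.
TRANSFORMS = {
    "square root": math.sqrt,
    "log": math.log10,
    "inverse": lambda v: 1.0 / v,
}


def skew_after(values, name, minimum):
    """Skewness after anchoring at `minimum` and applying transform `name`."""
    transform = TRANSFORMS[name]
    return skewness([transform(v) for v in anchor_at(values, minimum)])


# Assumption: an exponential sample stands in for a positively skewed variable.
rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(5000)]

print(f"untransformed skew: {skewness(sample):.2f}")
for minimum in (1.0, 10.0, 100.0):  # illustrative anchor values
    for name in TRANSFORMS:
        print(f"min={minimum:>5}  {name:<11}  skew={skew_after(sample, name, minimum):.2f}")
```

In this sketch the same transformation reduces skew much less when the minimum is far from 1, because over a range such as 100 to 110 the square root, log, and inverse functions are all close to linear. Note also that the inverse reverses the order of the values, so the sign of its skew is flipped relative to the original variable.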