Advanced Assessment Methods: What Does Correlation Really Mean?

scott's thoughts Oct 25, 2022

Thank you for joining me as I continue my blog series on utilizing advanced assessment methods for the vast amounts of data collected by your PA program. We have spent several blogs exploring the benefits of this methodology. Now we will break down the numbers so you can understand what it all means. What do applied statistics tell us in this context about correlations? What constitutes statistical significance?

Before you race for the door, let me ease your mind. This will not be a statistics course. Statistics may seem so complex that they have no practical application, but that is far from the truth. With just a few definitions and an understanding of how variables move in relation to one another, you can reap the benefits of advanced analysis.

This is a topic near and dear to my heart. I have taught research methodology and stats to graduate health science students for more than fifteen years. I have also been closely involved with assessment in PA education for many years and have used these methods myself to enhance assessment. We are going to take a practical approach here, demonstrating how statistics enhance the depth and breadth of the assessment process. Think of it as taking the next step forward in data collection.

We begin by looking at correlation, which is a relationship between two variables or measures: an independent variable (which we may think of as the cause) and a dependent variable (the effect) – though it is important to remember that correlation is not necessarily causation. A correlation coefficient quantifies the degree of the relationship between the variables.

Let us look at a basic example. You may have noticed that on rainy mornings, many of your co-workers are late. In your thoughts, you correlate rainy weather (the independent variable) to tardiness (the dependent variable).

Certainly, there are plenty of reasons to believe rain and tardiness are correlated; rainy weather might make visibility worse, make people drive more slowly, cause traffic jams and fender-benders, slow the bus schedule, perhaps even make people more likely to sleep through their alarms. However, it is also possible that your co-workers are always late, but you only notice it on rainy mornings. When we measure a correlation, we are looking for a measurement that says, “Yes, this variable (tardiness) and this variable (rain) move together.”

A correlation can be positive or negative. In the case of a positive correlation, the two variables rise or fall together: the heavier the rainfall, the greater the tardiness. In the case of a negative correlation, the variables move opposite each other. An example would be exercise time’s negative correlation to body fat percentage, or in other words, the more time one spends exercising, the lower body fat percentage one might expect to have.

At this point, correlation does not ask that we prove causation. As we saw from our rainy-day example, the actual cause of tardiness need not be the rain itself but any one of dozens of inconveniences related to rainy weather. What a correlation instead provides is information on how much the two variables move in relation to each other. If that relationship is strong enough, we can use one variable to predict the occurrence and/or strength of the other.

The correlation coefficient is that measurement, and its value can range from -1 (a perfect negative correlation) to +1 (a perfect positive correlation). The closer your score gets to -1 or +1, the stronger the negative or positive correlation. The closer the score is to zero, the weaker the correlation. Note that a perfect -1 or +1 score is theoretically possible but extremely unlikely with real-world data, and a correlation does not need to be perfect to be useful anyway.
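For readers who like to see the arithmetic, here is a minimal sketch of how a correlation coefficient can be calculated. It assumes the Pearson coefficient, which is the most common type but only one of several. The function name `pearson_r` and all of the numbers are invented purely for illustration; they are not data from any real office or PA program.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables move together around their averages.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return co_variation / sqrt(spread_x * spread_y)


# Hypothetical data (assumption): rainfall and average tardiness on ten mornings.
rainfall_mm = [0, 0, 2, 5, 8, 10, 12, 15, 20, 25]
minutes_late = [3, 5, 2, 6, 9, 7, 12, 10, 16, 18]

# Hypothetical data (assumption): weekly exercise and body fat for six people.
exercise_hours = [1, 2, 3, 4, 5, 6]
body_fat_pct = [30, 28, 27, 24, 22, 20]

# The first result is above zero (positive), the second below zero (negative).
print(f"rain vs. tardiness:    {pearson_r(rainfall_mm, minutes_late):+.2f}")
print(f"exercise vs. body fat: {pearson_r(exercise_hours, body_fat_pct):+.2f}")
```

The sign of the result tells you the direction of the relationship, and the distance from zero tells you its strength. In practice you would likely let a spreadsheet or statistics package do this step for you.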

After conducting a certain amount of research, you might learn that the correlation coefficient between precipitation and tardiness is only 0.10, which is a very weak relationship. Perhaps, indeed, you are just more likely to notice tardiness on rainy days.

Or you may learn that there is a correlation coefficient of 0.72 between mornings where precipitation chances are above 80% and an average of 15-20 minutes of tardiness. That is a strong correlation! Therefore, you can predict that, next Tuesday when the precipitation chance is above 80%, scheduling a meeting for 8:00 a.m. sharp is a bad idea. Make it 9:00 a.m. instead to allow everyone plenty of time to get to work. Now, this is a considerably simplified example that applies statistics to something that could rely on common sense rather than math, but hopefully you get the idea!
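To make the idea of prediction a little more concrete, here is a rough sketch of one common way to turn a relationship between two variables into a forecast: fitting a straight line through the data (simple linear regression). This is my choice of technique for the illustration, not the only option, and the function names `fit_line` and `predict` as well as every number below are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line predicting y from x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    if spread_x == 0:
        raise ValueError("cannot fit a line when x never changes")
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = co_variation / spread_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted y for a given x on the fitted line."""
    return slope * x + intercept


# Hypothetical data (assumption): forecast chance of rain and average tardiness.
rain_chance_pct = [10, 20, 30, 50, 60, 80, 90, 100]
avg_minutes_late = [2, 4, 3, 8, 10, 16, 17, 21]

slope, intercept = fit_line(rain_chance_pct, avg_minutes_late)
forecast = predict(slope, intercept, 80)
print(f"Predicted tardiness at an 80% chance of rain: {forecast:.1f} minutes")
```

The stronger the correlation, the more closely the real data points hug that line, and the more you can trust a forecast like this one.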

As applied to the PA environment, you can see how a strong correlation can be an extremely beneficial predictor, for example, if you can show a strong correlation between a certain test score or course score and PANCE pass rates. In my next blog, I will continue this discussion by exploring the concept of statistical significance to determine whether correlations are meaningful.

