It's raining today, and I've just emerged from reading a selection of articles on time memory in animals with a view to answering the question "can animals experience episodic memory?" Turns out that's a pain to answer, since there are plenty of mechanisms that can masquerade as reliving an event, including learning semantic rules and intervals. The upshot is we still don't know. The nice thing about my field, however, is that when theory tends towards the nebulous - and it frequently does - I can focus on methods for a while to take my mind off things...
So, I've run an experiment and have a whole set of reaction times that I'm itching to analyze. What are my options? Any good statistician will tell you to plot your data first. In fact, should you go to see them for advice, they will, with a vague air of distressed incredulity, tell you to go off and plot your data. Usually I feel slightly ashamed for not having done this already, and hurry away to produce a slew of charts.
There are a number of options. Let's get a sense of the shape of the distribution first. If you plan to run any parametric analyses, you're assuming that your data are normally distributed (or that you'll transform them to be as close to normal as possible later on). A histogram is the usual starting place for most people.
I simulated a normal distribution in R and then plotted it (I'll include the code at the bottom of the document). You might also want to get a sense of whether there are any outliers in your data - boxplots are generally more useful in this regard. Of course, if you're feeling very adventurous, you might want to combine them both and use something called a beanplot - it's easiest to make sense of this as a density curve turned on its side with individual points plotted as bars. You should create these graphs for each individual and each condition you're interested in, rather than collapsing across everything. Here are some examples of each...
[Figure: example histogram, boxplot and beanplot of the simulated data]
Great! The data are normal with no outliers (just as they were simulated to be...). If there were outliers (typically defined as at least 3 SD from the mean), they would show up as asterisks on the boxplot, and we could either remove those cases altogether, apply a transformation, or winsorize the data (i.e. clip the top and bottom x% of values and replace them with the nearest remaining value). Data cleaning is a bit of a contentious area; that said, doing no cleaning at all is often worse than removing outliers, since extreme values can leave your data misleading.
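If you wanted to winsorize in R, a minimal sketch might look like this (the 5% cutoff is purely illustrative, and x is the simulated data from the code at the bottom):

# Clip the bottom and top 5% of values to the 5th and 95th percentiles
# (the cutoffs here are illustrative, not a recommendation)
lo <- quantile(x, 0.05)
hi <- quantile(x, 0.95)
x_wins <- pmin(pmax(x, lo), hi)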
Wonderful - assuming you've inspected your data for each individual, you can now analyze everything using a conventional regression, t-test or ANOVA, depending on your design and question.
But what if your data are not normal?
Well, right off the bat, any parametric analysis you run may be biased and misleading, since the assumption of underlying normality is used to calculate p values. This doesn't, however, mean that your data are a total flop - many distributions of RTs are not normal, and there are ways to deal with, and even capitalize on, this.
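Before deciding, it's worth a quick check. Base R has a couple of handy options (again using the simulated x from the code at the bottom; the Shapiro-Wilk test is just one choice among several):

# Quick normality checks
qqnorm(x); qqline(x)  # points should hug the line if x is roughly normal
shapiro.test(x)       # a significant p value suggests non-normality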
The first thing you may want to try is bootstrapping your data. Essentially, this method involves resampling from your dataset, with replacement, a large number of times, and computing your statistic of interest (the mean, say) on each resample. The distribution of that statistic across resamples tends towards normal (courtesy of the central limit theorem) even when the raw data are skewed or contain outliers, which is what makes the approach so forgiving.
SPSS has made bootstrapping data and test statistics trivial (merely select that option before running the test), and because of this, I haven't been quite brave enough to try this in R yet.
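For the braver, though, a minimal base-R sketch of bootstrapping the mean might look like this (x is the simulated data from the code at the bottom; 10,000 resamples is an arbitrary choice):

# Resample with replacement and take the mean each time
n_boot <- 10000
boot_means <- replicate(n_boot, mean(sample(x, replace = TRUE)))
hist(boot_means)                       # tends towards normal (CLT)
quantile(boot_means, c(0.025, 0.975))  # a simple 95% percentile interval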
On the other hand, you might have something wonderful (at least according to a few very statistically minded psychologists). You might have an ex-Gaussian distribution. This distribution looks rather normal, but with a slight positive skew (i.e. the tail stretches out to the right a bit). Here's an example from Lacouture & Cousineau (2008).
As you can see, the ex-Gaussian curve (at the far right) is the convolution of the normal and exponential components.
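One way to see this: an ex-Gaussian sample is simply a normal draw plus an independent exponential draw. Here's a quick simulation sketch (the mu, sigma and tau values are invented for illustration):

# Simulate 1000 ex-Gaussian RTs: normal component plus exponential tail
# (mu = 500, sigma = 50, tau = 150 are made-up values in ms)
mu <- 500; sigma <- 50; tau <- 150
rt <- rnorm(1000, mean = mu, sd = sigma) + rexp(1000, rate = 1/tau)
hist(rt, breaks = 40)  # note the stretched right tail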
Andrew Heathcote and members of the Newcastle Cognition Lab have created a remarkably easy-to-use program called QMPE, which organizes your RTs into quantiles and then decomposes the distribution into mu, sigma and tau - the mean and standard deviation of the normal component, and the mean of the exponential tail, respectively.
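If you'd rather stay in R, one alternative (not QMPE itself) is a plain maximum-likelihood fit with optim(). This sketch reuses the simulated rt from above; dexgauss is our own helper, not a built-in function, and the starting values and bounds are rough guesses:

# The ex-Gaussian density, written out by hand
dexgauss <- function(x, mu, sigma, tau) {
  (1/tau) * exp((mu - x)/tau + sigma^2/(2*tau^2)) *
    pnorm((x - mu)/sigma - sigma/tau)
}
# Negative log-likelihood, minimized over (mu, sigma, tau)
nll <- function(p) -sum(log(dexgauss(rt, p[1], p[2], p[3])))
fit <- optim(c(mean(rt) - 100, 50, 100), nll,
             method = "L-BFGS-B", lower = c(-Inf, 10, 10))
fit$par  # should land near the simulated mu, sigma and tau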
This is kind of exciting when you consider that often the largest differences between groups of subjects (especially comparisons of older and younger adults) are NOT found in the mean values but in the slowest responses (which is precisely what tau captures!). The upshot is that while an analysis of the mean values may not reveal anything interesting (i.e. no difference), the tails of the distributions may tell a different story. The only issue may be in trying to convince other (less statistically inclined) people that your findings are in fact as cool as you know them to be.
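To make that concrete, here's a toy illustration (all values invented): two groups built to have the same overall mean of 600 ms, but with the second given a heavier exponential tail.

# Matched means (500 + 100 = 450 + 150), different tails
a <- rnorm(200, mean = 500, sd = 50) + rexp(200, rate = 1/100)
b <- rnorm(200, mean = 450, sd = 50) + rexp(200, rate = 1/150)
t.test(a, b)                        # means: likely no reliable difference
quantile(a, 0.9); quantile(b, 0.9)  # slow tail: b stretches further out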
So, there you have it - this isn't exhaustive by a long shot, and I'm sure there are other ways of looking at RTs. Feel free to add your ideas below or point out anything you think I've missed.
# Generate 100 random draws from a standard normal distribution
N <- 100
x <- rnorm(N)
H <- hist(x, freq=TRUE) # This will plot the histogram as well
dx <- min(diff(H$breaks)) # Narrowest bin width, used to scale the density
curve(N*dx*dnorm(x), add=TRUE, col="blue") # Overlay the scaled normal curve
# For the boxplot
boxplot(x)
# For the beanplot - you will have to install the beanplot package first
# install.packages("beanplot")
library(beanplot)
beanplot(x)
Lacouture, Y., & Cousineau, D. (2008). How to use MATLAB to fit the ex-Gaussian and other probability functions to a distribution of response times. Tutorials in Quantitative Methods for Psychology, 4(1), 35-45. Retrieved from http://www.tqmp.org/doc/vol4-1/p35-45_Lacouture.pdf