Humor is a funny thing (pun intended) and a very human emotion. At my alma mater Reed College we used to have a joke (it was really a meta-joke, because Reed) that the reason that Reed requires all students to take a common Humanities course is so that they can all have a common base of information to make jokes about. My pondering on this led me to the realization that humor can sometimes follow something like the uncanny valley (see my previous comic on this) where there’s a ‘sweet spot’ for maximal funniness. Too far away and you don’t get it, too close and you nitpick and/or aren’t surprised by the joke (surprise is actually the root of humor). Hope you find this funny. Or not.
(and, of course, a big nod to XKCD for the look/feel inspiration of this comic)
Humor is a funny thing
I’ve written before about the importance of replicates. Here’s my funny idea of how a scientist might carry out that meme where you try to get a picture of yourself holding up a sign passed around the internet, to demonstrate the danger of posting stuff online to kids/students/etc. And what is up with that anyway? It’s interesting and cool the first few times you see someone do it. But after that it starts to get a *little* bit old.
I am but a poor scientist trying to demonstrate (very confidently) a simple concept.
[Disclaimer: I’m not a statistician, but I do play one at work from time to time. If I’ve gotten something wrong here please point it out to me. This is an evolving thought process for me that’s part of the larger picture of what the scientific method does and doesn’t mean, not the definitive truth about multiple hypothesis testing.]
There’s a division in research between hypothesis-driven and discovery-driven endeavors. In hypothesis-driven research you start out with a model of what’s going on (this can be explicitly stated or just the amalgamation of what’s known about the system you’re studying) and then design an experiment to test that hypothesis (see my discussions on the scientific method here and here). In discovery-driven research you start out with more general questions (that can easily be stated as hypotheses, but often aren’t) and generate larger amounts of data, then search the data for relationships using statistical methods (or other discovery-based methods).
The problem with analysis of large amounts of data is that when you’re applying a statistical test to a dataset you are actually testing many, many hypotheses at once. This means that your level of surprise at finding something that you call significant (arbitrarily but traditionally a p-value of less than 0.05) may be inflated by the fact that you’re looking a whole bunch of times (thus increasing the odds that you’ll observe SOMETHING just by random chance alone; see this excellent XKCD cartoon, shown below, since I’ll refer to it as an example). So you need to apply some kind of multiple hypothesis correction to your statistical results to reduce the chances that you’ll fool yourself into thinking that you’ve got something real when actually you’ve just got something random. In the XKCD example below a multiple hypothesis correction using Bonferroni’s method (one of the simplest and most conservative corrections) would suggest that the threshold for significance should be moved to 0.05/20 = 0.0025, since 20 different tests were performed.
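To put rough numbers on the jelly bean example, here is a minimal Python sketch. It assumes the 20 tests are independent and that there is no real effect anywhere, which is a simplification. The function names are mine, made up for illustration. Under those assumptions the chance of at least one "significant" result at 0.05 is roughly 64%, and the Bonferroni threshold pulls that back to just under 5%.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests tests.

    Assumes the tests are independent and every null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni's correction."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha, n_tests = 0.05, 20  # 20 jelly bean colors, as in the cartoon

    corrected = bonferroni_threshold(alpha, n_tests)
    print(f"Bonferroni threshold: {corrected:.4f}")
    print(f"Chance of a fluke, uncorrected: "
          f"{familywise_error_rate(alpha, n_tests):.2f}")
    print(f"Chance of a fluke, corrected:   "
          f"{familywise_error_rate(corrected, n_tests):.3f}")
```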
Here’s where the problem of a false dichotomy occurs. Many researchers who analyze large amounts of data believe that utilizing a hypothesis-based approach mitigates the effect of multiple hypothesis testing on their results. That is, they believe that they can focus their investigation of the data to a subset constrained by a model/hypothesis and thus reduce the effect that multiple hypothesis testing has on their analysis. Instead of looking at 10,000 proteins in a study they now look at only the 25 proteins that are thought to be present in a particular pathway of interest (where the pathway here represents the model based on existing knowledge). This is like saying, “we believe that jelly beans in the blue-green color range cause acne” and then drawing your significance threshold at 0.05/4 = 0.0125, since there are ~4 jelly beans tested that are in the blue-green color range (not sure if ‘lilac’ counts or not; that would make 5). All well and good EXCEPT for the fact that the actual chance of detecting something by random chance HASN’T changed. In large scale data analysis (e.g., transcriptome analysis) you’ve still MEASURED everything else. You’ve just chosen to limit your investigation to a smaller subset and then can ‘go easy’ on your multiple hypothesis correction.
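One way to get a feel for this is a small simulation. The sketch below is only an illustration under strong assumptions: every one of the 10,000 measurements is pure noise, so p-values are uniform between 0 and 1, and the measurements are independent. The function names are hypothetical. It counts how many of the 10,000 would clear the lenient threshold you get from correcting for only 25 tests, compared with the threshold you get from correcting for everything measured.

```python
import random

N_MEASURED = 10_000  # proteins measured in the study
N_PATHWAY = 25       # proteins in the pathway of interest
ALPHA = 0.05


def simulate_null_p_values(n: int, seed: int = 0) -> list[float]:
    """Draw n p-values assuming no real effects (uniform on [0, 1))."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def count_chance_hits(p_values: list[float], threshold: float) -> int:
    """Number of p-values that fall below the significance threshold."""
    return sum(1 for p in p_values if p < threshold)


if __name__ == "__main__":
    subset_threshold = ALPHA / N_PATHWAY   # 'going easy': correct for 25
    full_threshold = ALPHA / N_MEASURED    # correct for everything measured

    p_values = simulate_null_p_values(N_MEASURED)
    print("Chance hits at the subset threshold:",
          count_chance_hits(p_values, subset_threshold))
    print("Chance hits at the full threshold:  ",
          count_chance_hits(p_values, full_threshold))
```

On average about 10,000 × 0.002 = 20 pure-noise measurements would pass the subset threshold, whether or not you ever look at them.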
The counter-argument that might be made to this point is that by doing this you’re testing a specific hypothesis, one that you believe to be true and that may be supported by existing data. This is a reasonable point in one sense: it may lend credence to your finding that there is existing information supporting your result. But on the other hand it doesn’t change the fact that you still could be finding more things by chance than you realize because you simply hadn’t looked at the rest of your data. It turns out that this is true not just of analysis of big data, but also of some kinds of traditional experiments aimed at testing individual, associative hypotheses. The difference there is that it is technically infeasible to actually test a large number of the background cases (generally limited to one or two negative controls). Also a mechanistic hypothesis (as opposed to an associative one) is based on intervention, which tells you something different and so is not (as) subject to these considerations.
Imagine that you’ve dropped your car keys in the street and you don’t know what they look like (maybe you’re borrowing a friend’s car). You’re pretty sure you dropped them in front of the coffee shop on a block with 7 other shops on it, but you did walk the length of the block before you noticed the keys were gone. You walk directly back to look in front of the coffee shop and find a set of keys. Great, you’re done. You found your keys, right? What if you looked in front of the other stores and found other sets of keys? You didn’t look, but that doesn’t make it less likely that you’re wrong about these keys (your existing knowledge/model/hypothesis “I dropped them in front of the coffee shop” could easily be wrong).
You know, I think the glamour publishers could really benefit from a journal to publish these kinds of results. Far less messy and then they won’t get confused with real science. Also, a bonus is that the title of the papers could be already buzzfeed-ready, no editing involved.
Also I’ve officially titled my series of academia-themed comics Red Pen/Black Pen (see previous post for something of an explanation)
Too good to be true or too good to pass up?
This comic was inspired by this wonderful parody, which was circulating a while back, but unfortunately I don’t know the proper attribution.
Because you tried really hard for a really long time.
Spike Lee’s 1989 classic “Do the Right Thing” is about a lot of things. It’s about life in general and I still don’t fully understand its message: why did Mookie throw the garbage can through Sal’s window? Was it the right thing to do?
It was NOT about life in academia, but it did have elements about the conflict between creative and destructive influences that I find very compelling. And Radio Raheem’s Love/Hate speech seemed to speak to me in a different way. Anyway, here’s an homage I did for fun.