The Numerology of License Plates

I posted a while back about encountering two vehicles with the same three-letter code on their license plates as mine while driving to work one morning. Interestingly, in the following months I found myself paying more and more attention to license plates and saw at least six or seven other vehicles in the area (a small three-city region with about 200K residents) with the same code.

Spooky. I started to feel like there was some kind of cosmological numerology going on in license plates around me that was trying to send me a message. BUT WHAT WAS IT?

One conclusion I drew from thinking about the probability of that happening was:

it is evident that there can be multiple underlying and often hidden explanatory variables that may be influencing such probabilities [from my post]

It was suggested that part of my noticing the plates could have been confirmation bias: I was looking for something, so I noticed that thing more than normal against a pretty variable and unconnected background. I’m sure that’s true. However, I was sitting in traffic one evening (yes, we do have *some* traffic around here) and saw three plates that started with the letters ARK in the space of about five minutes. Weird.

So THEN I started really looking at the plates around me and noticed a strong underlying variable that pretty much explains it all. And it’s kind of interesting. I first noticed that Washington state seems to have recently switched from three-number/three-letter plates to three-letter/four-number plates. I then noticed that the starting letters for both kinds of plates fall in a narrow range: W-Z for the old plates and A-C for the new ones. There don’t seem to be *any* plates outside that range right now (surveying a couple of hundred plates over the last couple of days). W is really underrepresented, as is C- the tails of the distribution. This makes me guess that there’s a rolling distribution with a window of about six starting letters for license plates (in the state of Washington; other states have other systems or are on a different pattern). This probably shifts over time as people renew their plates, buy new vehicles, and get rid of the old. So the effective size of the license plate universe I tried to calculate in my previous post is much smaller than what I was thinking.
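A quick back-of-envelope sketch of what that rolling window does to the numbers. The six-letter window width is my guess from the plates I surveyed, not an official figure:

```python
# Sketch: how much a rolling starting-letter window shrinks the plate universe.
# Assumes only plates whose first letter falls in a ~6-letter window
# (e.g. W-Z on old plates plus A-C on new ones, minus the thin tails)
# are in circulation at any one time.

full_universe = 26 ** 3                 # all possible 3-letter codes
window_letters = 6                      # guessed width of the rolling window
effective_universe = window_letters * 26 * 26  # only the first letter is constrained

print(full_universe)                       # 17576
print(effective_universe)                  # 4056
print(full_universe / effective_universe)  # ~4.3x fewer codes on the road
```

So if the guess is right, the pool of plausible codes is roughly a quarter of what my original calculation assumed.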

I don’t know why I find this so interesting but it really is. I know this is just some system that the Washington State Department of Licensing has and I could probably go to an office and just ask, but it seems like it’s a metaphor for larger problems of coincidence, underlying mechanisms, and science. I’m actually pretty satisfied with my findings, even though they won’t be published as a journal article (hey- you’re still reading, right?). On my way to pick up lunch today I noticed some more ARK plates (4) and these two sitting right next to each other (also 3 other ABG plates in other parts of the parking lot).


The universe IS trying to tell me something. It’s about science, stupid.

Magic Hands

Too good to be true or too good to pass up?


There’s been a lot of discussion about the importance of replication in science (read an extensive and very thoughtful post about that here) and notable occurrences of non-reproducible science being published in high-impact journals. Consider the recent retraction of the two STAP stem cell papers from Nature and the accompanying debate over who should be blamed and how. Or the publication of a study (see also my post about this) in which research labs responsible for high-impact publications were challenged to reproduce their findings: it showed that many of these findings could not be replicated, even in the same labs they were originally performed in. These, and similar cases and studies, indicate serious problems in the scientific process- especially, it seems, for some high-profile studies published in high-impact journals.

I was surprised, therefore, at the reaction of some older, very experienced PIs recently after a talk I gave at a university. I mentioned these problems, and briefly explained to them the results of the study on reproducibility- that, in 90% of the cases, the same lab could not reproduce the results that they had previously published. They were generally unfazed. “Oh”, one said, “probably just a post-doc with magic hands who’s no longer in the group”. And all agreed on the difficulty of reproducing results for difficult and complicated experiments.

So my question is: do these fabled lab technicians actually exist? Are there those people who can “just get things to work”? And is this actually a good thing for science?

I have some personal experience in this area. I was quite good at futzing around and getting a protocol to work the first time. I would get great results. Once. Then I would continue to ‘innovate’ and find that I couldn’t replicate my previous work. In my early experiences I sometimes would not keep notes well enough to allow me to go back to the point where I got it to work- which was quite disturbing and could send me into a non-productive tailspin of trying to replicate the important results. Other times I’d written things down sufficiently that I could get them to work again. And still other times I found that someone else in the lab could consistently get better results out of the EXACT SAME protocol- apparently followed the same way. They had magic hands. Something about the way they did things just *worked*. There were some protocols in the lab that just seemed to need this magic touch- some people had it and some people didn’t. But does that mean that the results these protocols produced were wrong?

What kinds of procedures seem to require “magic hands”? One example is from when I was doing electron microscopy (EM) as a graduate student. We were working constantly at improving our protocols for making two-dimensional protein crystals for EM. This was delicate work, which involved mixing protein with a buffer in a small droplet, layering on a special lipid, incubating for some amount of time to let the crystals form, then lifting the fragile lipid monolayer (hopefully with protein crystals) off onto an EM grid and finally staining with an electron dense stain or flash freezing in liquid nitrogen. The buffers would change, the protein preparations would change, the incubation conditions would change, and how the EM grids were applied to our incubation droplets to lift off the delicate 2D crystals was subject to variation. Any one of these things could scuttle getting good crystals and would therefore produce a non-replication situation. There were several of us in the lab that did this and were successful in getting it to work- but it didn’t always work and it took some time to develop the right ‘touch’ to get it to work. The number of factors that *potentially* contributed to success or failure was daunting and a bit disturbing- and sometimes didn’t seem to be amenable to communication in a written protocol. The line between superstition and required steps was very thin.

But this is true of many protocols that I worked with throughout my lab career* – they were often complicated, multi-step procedures that could be affected by many variables- from the ambient temperature and humidity to who prepared the growth media and when. Not that all of these variables DID affect the outcomes, but when an experiment failed there was a long list of possible causes. And the secret with this long list? It probably didn’t include all the factors that did affect the outcome. There were likely hidden factors that could be causing problems. So is someone with magic hands lucky, gifted, or simply persistent? I know of a few examples where all three qualities were likely present- with the last one being, in a way, most important. Yes, my collaborator’s post-doc was able to do amazing things and get amazing results. But (and I know this was the case) she worked really long and hard to get them. In some cases she probably repeated experiments many, many times before she got them to work. And then she repeated the exact combination to repeat the experiments again. And again. And sometimes even that wasn’t enough (oops, the buffer ran out and had to be remade, but the lot number on the bottle was different, and weren’t they working on the DI water supply last week? Now my experiment doesn’t work anymore.)

So perhaps it’s not so surprising that many of these key findings from these papers couldn’t be repeated, even in the same labs. For one thing, there was not the same incentive to get it to work- so that post-doc, or another graduate student who’s taken over the same duties, probably tried once to repeat the experiment. Maybe twice. Didn’t work. Huh? That’s unfortunate. And that’s about as much time as we’re going to put into this little exercise. The protocols could be difficult, complicated, and have many known and unknown variables affecting their outcomes.

But does it mean that all these results are incorrect? Does it mean that the underlying mechanisms or biology that was discovered was just plain wrong? No. Not necessarily. Most, if not all, of these high-profile publications that failed to repeat spawned many follow-on experiments and studies. It’s likely that many of the findings were borne out by orthogonal experiments- that is, experiments that test implications of these findings, and by extension the results of the original finding itself. Because of its nature this study was conducted anonymously- so we don’t really know, but it’s probably true. An important point, and one that was brought up by the experienced PIs I was talking with, is that sometimes direct replication may not be the most important thing. Important, yes. But perhaps not deal-killing if it doesn’t work. The results still might stand IF, and only if, second, third, and fourth orthogonal experiments can be performed that tell the same story.

Does this mean that you actually can make stem cells by treating regular cultured cells with an acid bath? Well, probably not. For some of these surprising, high-profile findings the ‘replication’ that is discussed is other labs trying to see if the finding is correct. So they try the protocols that have been reported, but it’s likely that they also try other orthogonal experiments that would, if positive, support the original claim.

"OMG! This would be so amazing if it's true- so, it MUST be true!"

“OMG! This would be so amazing if it’s true- so, it MUST be true!”

So this gets back to my earlier discussions on the scientific method and the importance of being your own worst skeptic (see here and here). For every positive result the first reaction should be “this is wrong”, followed by, “but- if it WERE right then X, Y, and Z would have to be true. And we can test X, Y, and Z by…”. The burden of scientific ‘truth’** is in replication, but in replication of the finding– NOT NECESSARILY in replication of the identical experiments.

*I was a labbie for quite a few of my formative years. That is, I actually got my hands dirty and did real, honest-to-god experiments, with Eppendorf tubes, vortexers, water baths, cell culture, the whole bit. Then I converted and became what I am today – a creature purely of silicon and code. Which suits me quite well. This is all just to add to my post a “I kinda know what I’m talking about here- at least somewhat”.

** where I’m using a very scientific meaning of truth here, which is actually something like “a finding that has extensive support through multiple lines of complementary evidence”

Unicorn lovers and pinksters unite!

Last year GoldieBlox released a few ads that I thought were great. You’re probably familiar with them (see below), but they advertise a building kit targeted especially at girls. These kinds of products are great and much needed. The idea is to counter the years and years of placing girls in pink marketing boxes with a limited number of career-directed options (NO pink CEOs, pink scientists, pink explorers, pink astronauts). Girls WILL like pink and sparkly things. They WILL like princesses, unicorns, and small sad-eyed puppy dogs. As many will know this remains a problem- there’s lots of marketing that is still directed that way. However, there has been a recent surge in non-traditional products directed at girls: LEGO women scientist figures, building kits for girls, and more. These are simply great options and great advances, and by all means they should continue to be developed, expanded, and marketed.

But back to the ad and my main point. When we push for something, we seem to have to push against something else- we draw lines to discriminate “us” from “them”. For girl power we should be pushing back against the oppressive, ingrained, male-dominated power structure that has been in place in our society for years. However, too often it seems that we push against the wrong things: those girls who love pink, who like unicorns, who wish they were princesses. You can argue about whether this is a good thing or not, but the fact is that these are girls too. This anti-pink message is too often conveyed in marketing and people’s general reactive attitudes against the traditional, including mine- in the context of saying something good: We should empower girls to achieve and not be held back– along with something not so good: not like those other girls who won’t achieve. What this kind of reactive attitude is saying is this:

Because you like pink you can not be an engineer. You can not be a scientist. You can not be an astronaut. Girls who like unicorns do not do that. They are less than girls that don’t like these things.

Make no mistake- I like these ads, I think they’re funny and they make me laugh. But that doesn’t change that they do so at the expense of a group of people who have nothing but potential to be squashed. These GoldieBlox ads aren’t terrible in this way- the ‘pink unicorns’ are things (toys and some cartoon on the TV), not a set of girls, but it remains that the implication is that liking pink is bad and won’t take you anywhere. Clearly liking a particular color shouldn’t have an ounce of an effect on what you will do later in life- or even what you can do now. This was pointed out to me after I posted the ad to my Facebook page, by a good friend who has girls who do like princesses. And it is an excellent point.

So, in a way, this is a limited example. But it highlights a much larger problem with human nature. Humans LOVE to draw lines. Them and us, us versus them. When lines are drawn around another group of people based on some set of attributes (favorite color, gender, skin color, type of pants worn) then all those inside the group suddenly acquire- in your perception- a set of other attributes from that group, whether or not these are accurate and whether or not the individual you’re talking about has said attributes. We *know* things about “those sorts of people”. This is one of the very natural tendencies that we all have, we all indulge in, and we all must do our best to fight against.

Here’s the GoldieBlox ad:

Here is another ad that I think is particularly well done. It highlights how perceptions and language are important- but also demonstrates a point about the tendency of humans to group:

If you have kids try this on them. Ask them to throw ‘like a girl’ and see what they do.

Coincidence

I had a weird thing happen on my way in to work this morning. On the main road just a short distance from my parking lot I noticed that the SUV in front of me had the same three letter combination on their license plate as mine, “YGK”. Then I noticed that the car in front of THEM had the SAME three letter combination! Wow. What are the odds of that happening? Well, I’m not going to tell you the odds of that happening, because I don’t really know. But it did happen. An odd coincidence for sure, but maybe not as cosmically-connected as you might be inclined to think.

First off, let’s think about the odds of drawing the same 3-letter combination two times in a row from a hat with 26^3 = 17,576 combinations (approximating what happened here, because my license plate is fixed). That’s how many different possible 3-letter combinations there are- I suppose probably subtracting one or two for words that aren’t allowed, like “ASS” and, ummm, well maybe there’s another. The chance of drawing two of the same out of the hat would be 1/17,576 × 1/17,576, or about 1 in 300 million. This means that you could sit and draw letters out of this hat every second (that is, drawing two sets of three letters out every second) for about 10 years before you’d be likely to have this happen. Now clearly I’m simplifying here- but still. So for my license plate story I’d be unlikely to have this happen in my lifetime, since I’m only driving every now and then and I’m not generally even paying attention to other people’s license plates to see if this has happened or not.
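The arithmetic above in a few lines of Python (the one-draw-per-second framing is the same simplification as in the text):

```python
# Odds that the two cars ahead of me both carry my (fixed) 3-letter code,
# treating each as an independent draw from all possible codes.
combos = 26 ** 3                      # 17,576 possible 3-letter codes
p_both_match = (1 / combos) ** 2      # both cars independently match mine

print(combos)                         # 17576
print(round(1 / p_both_match))        # 308915776, i.e. about 1 in 300 million

# Drawing one pair of codes every second, the expected wait for a double match:
seconds_per_year = 60 * 60 * 24 * 365
print((1 / p_both_match) / seconds_per_year)  # ~9.8 years
```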

So here are some reasons why it’s not TOO surprising that it did happen. First, assuming all combinations are used, there are 1,000 vehicles in WA state with the same letters (one for each three-digit number), which narrows the field a bit- but only a bit, since there are ~6 million registered vehicles (at least in 2012, though some portion of these have the longer 7 number/letter plates). Second, it is likely that these are issued in order of request (though I’m not 100% sure about that, it would seem to make sense). That means that vehicles purchased about the same time as mine (2001) are probably far more likely to have the same set of letters. That’s been about 13 years, which means that those vehicles are going to be of a certain age. I would also include geography- since that could be another influencing factor as to which numbers/letters you get, though I did get my license plate on the other side of the state. I don’t have a clear idea of how this would bias the probability of seeing three license plates in a row, but it fits into my next point, which is hidden or partially hidden explanatory variables.

When my wife and I lived in Portland, far before we had such encumbrances as kids to drag us down, we often did a bunch of activities on a weekend. I started to be surprised to notice some of the same people turning up at different places- parks, restaurants, bookstores, museums, etc.- far across town. This happened more than you’d expect in a moderately-sized city. Interestingly, this also happened in Seattle when we had a kid. And it happens all the time in our current city(ies), which are much smaller. My idea about this is that it’s not surprising at all. Our choice of activities and times is dictated or heavily influenced by our age, interests, kidlet status, etc.- as are other people’s. So instead of thinking of the chances of repeatedly bumping into the same set of people out of the entire population, think about the chances if the background distribution is much more limited, constrained (in part) by those interests and other personal constraints. The probability of this happening then rises considerably because you’re considering a smaller number of possible people. I’m sure this has been described before in statistics and would love it if someone knew what it’s called (leave a comment).

How does this fit into my license plate experience? I don’t really have a clear idea, but it is evident that there can be multiple underlying and often hidden explanatory variables that may be influencing such probabilities. Perhaps my workplace is enriched in people who think like me and hold on to vehicles for a long time- AND purchased vehicles at about the same time. I think that’s probably likely, though I have no idea how to test it. If that’s true then the chances of running into someone else with the same letters on their plates, or two people at the same time, would have to go up quite a lot. Still, what are the odds?

The false dichotomy of multiple hypothesis testing

[Disclaimer: I’m not a statistician, but I do play one at work from time to time. If I’ve gotten something wrong here please point it out to me. This is an evolving thought process for me that’s part of the larger picture of what the scientific method does and doesn’t mean- not the definitive truth about multiple hypothesis testing.]

There’s a division in research between hypothesis-driven and discovery-driven endeavors. In hypothesis-driven research you start out with a model of what’s going on (this can be explicitly stated or just the amalgamation of what’s known about the system you’re studying) and then design an experiment to test that hypothesis (see my discussions on the scientific method here and here). In discovery-driven research you start out with more general questions (that can easily be stated as hypotheses, but often aren’t) and generate larger amounts of data, then search the data for relationships using statistical methods (or other discovery-based methods).

The problem with analysis of large amounts of data is that when you’re applying a statistical test to a dataset you are actually testing many, many hypotheses at once. This means that your level of surprise at finding something that you call significant (arbitrarily but traditionally a p-value of less than 0.05) may be inflated by the fact that you’re looking a whole bunch of times, thus increasing the odds that you’ll observe SOMETHING just on random chance alone (see the excellent xkcd cartoon below, which I’ll refer to as an example). So you need to apply some kind of multiple hypothesis correction to your statistical results to reduce the chances that you’ll fool yourself into thinking that you’ve got something real when actually you’ve just got something random. In the xkcd example below, a multiple hypothesis correction using Bonferroni’s method (one of the simplest and most conservative corrections) would suggest that the threshold for significance should be moved to 0.05/20 = 0.0025, since 20 different tests were performed.

Here’s where the problem of a false dichotomy occurs. Many researchers who analyze large amounts of data believe that utilizing a hypothesis-based approach mitigates the effect of multiple hypothesis testing on their results. That is, they believe that they can focus their investigation of the data to a subset constrained by a model/hypothesis and thus reduce the effect that multiple hypothesis testing has on their analysis. Instead of looking at 10,000 proteins in a study they now look at only the 25 proteins that are thought to be present in a particular pathway of interest (where the pathway represents the model based on existing knowledge). This is like saying, “we believe that jelly beans in the blue-green color range cause acne” and then drawing your significance threshold at 0.05/4 = 0.0125, since ~4 of the jelly beans tested are in the blue-green color range (not sure if ‘lilac’ counts or not- that would make 5). All well and good EXCEPT for the fact that the actual chance of detecting something by random chance HASN’T changed. In large-scale data analysis (transcriptome analysis, e.g.) you’ve still MEASURED everything else. You’ve just chosen to limit your investigation to a smaller subset so you can ‘go easy’ on your multiple hypothesis correction.
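The jelly-bean arithmetic, spelled out. This is a minimal sketch; the counts come straight from the xkcd example (20 colors measured, ~4 in the blue-green range):

```python
# With 20 independent null tests at alpha = 0.05 you expect about one
# "significant" hit by chance alone. Bonferroni divides alpha by the number
# of tests actually PERFORMED, not the number you chose to look at afterward.

alpha = 0.05
tests_performed = 20      # all 20 jelly-bean colors were measured
tests_reported = 4        # the blue-green subset you focused on afterward

print(round(alpha * tests_performed, 2))   # 1.0: spurious hits expected by chance
print(round(alpha / tests_performed, 4))   # 0.0025: the honest threshold
print(round(alpha / tests_reported, 4))    # 0.0125: too lenient if all 20 were measured
```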

The counter-argument that might be made to this point is that by doing this you’re testing a specific hypothesis, one that you believe to be true and that may be supported by existing data. This is a reasonable point in one sense- it may lend credence to your finding that there is existing information supporting your result. But on the other hand it doesn’t change the fact that you still could be finding more things by chance than you realize, because you simply hadn’t looked at the rest of your data. It turns out that this is true not just of analysis of big data, but also of some kinds of traditional experiments aimed at testing individual- associative- hypotheses. The difference there is that it is technically infeasible to actually test a large number of the background cases (you’re generally limited to one or two negative controls). Also, a mechanistic hypothesis (as opposed to an associative one) is based on intervention, which tells you something different and so is not (as) subject to these considerations.

Imagine that you’ve dropped your car keys in the street and you don’t know what they look like (maybe you borrowed a friend’s car). You’re pretty sure you dropped them in front of the coffee shop on a block with 7 other shops on it- but you did walk the length of the block before you noticed the keys were gone. You walk directly back to look in front of the coffee shop and find a set of keys. Great, you’re done. You found your keys, right? But what if you had looked in front of the other stores and found other sets of keys? You didn’t look- but that doesn’t make it any less likely that you’re wrong about these keys (your existing knowledge/model/hypothesis, “I dropped them in front of the coffee shop”, could easily be wrong).

XKCD: significant

Not being part of the rumor mill

I had something happen today that made me stop and think. I repeated a bit of ‘knowledge’ – something science-y that had to do with a celebrity. This was a factoid that I have repeated many other times. Each time I do I state this factoid with a good deal of authority in my voice and with the security that this is “fact”. Someone who was in the room said, “really?” Of course, as a quick Google check to several sites (including snopes.com) showed- this was, at best, an unsubstantiated rumor, and probably just plain untrue. But the memory voice in my head had spoken with such authority! How could it be WRONG? I’m generally pretty good at picking out bits of misinformation that other people present and checking it, but I realized that I’m not always so good about detecting it when I do it myself.

Of course, this is how rumors get spread and disinformation gets disseminated. As scientists we are not immune to it- even if we’d like to think we are. And we actually could be big players in it. You see, people believe us. We speak with the authority of many years of schooling and many big science-y wordings. And the real danger is repeating or producing factoids that fall in “science” but outside what we’re really experts in (where we should know better). Because many non-scientists see us as experts IN SCIENCE. People hear us spout some random science-ish factoid and they LISTEN to us. And then they, in turn, repeat what we’ve said, except that this time they say it with authority because it was stated, with authority, by a reputable source. US. And I realized that this was the exact same reason that it seemed like fact to me. Because it had been presented to me AS FACT by someone who I looked up to and trusted.

So this is just a note of caution about being your own worst critic – even in normal conversation. Especially when it comes to those slightly too plausible factoids. Though it may not seem like it sometimes people do listen to us.

What if fuel efficiency followed Moore’s law

At current rates, by the year 2026 we will be able to drive, on average, 37 miles on one gallon of gas in the US. Wow. If that number strikes you as underwhelming, then just consider, as a comparison, growth in the computer industry over a corresponding time period. Most portions of the computer industry have followed Moore’s law: a doubling in speed, capacity, or efficiency every one to two years. This has proceeded from the 1970s to today, pretty much unabated.


Historical fuel efficiency trends in the US. Note that on the scale from the graph below, this would be a straight line.

PC hard disk capacity (in GB). The plot is logarithmic, so the fitted line corresponds to exponential growth.


My late grandfather, Gideon Kramer, was a very forward thinker- a futurist even. He was not very tolerant of lack of progress in some areas. He would have said “there’s just NO EXCUSE for this kind of thing”- actually, I think he probably did weigh in on exactly this issue. But really, IS there an excuse for this kind of thing? I’m sure there are sound practical reasons why fuel efficiency hasn’t increased much- at all- since before the 1970s. Even the most cutting-edge fuel-efficient vehicles, hybrids, and electrics don’t really do all that great. But there were likely very sound practical reasons why humans would never travel into space, never come close to eradicating polio and smallpox, and never carry hand-held communication devices with computational power unimaginable 30 years ago in our pockets. And yet we have done all these things.

So this raised the question: What if fuel efficiency DID follow Moore’s law? Where would we be?

If fuel efficiency in the US had started following Moore’s law, with a doubling time of two years, in 1980- about the same time the computer industry really took off- we would have been able to drive from LA to New York on 1 gallon of gas in 1994. By 2002 we could have climbed in our cars, driven around the WORLD, and THEN had to refill our 1-gallon gas tanks. In 2009 we could have driven our cars to the moon- the frickin’ MOON- on, you guessed it, 1 gallon of gas. By the year 2026 we would be able to drive to the SUN- 93 million miles- MILLION MILES- on 1 gallon of gas. (This was, I should mention, not overlooked by Gordon Moore himself.)
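Here’s a sketch of that arithmetic. The 1980 baseline of ~22 mpg and the trip distances are my rough assumptions, so the milestone years land within a year or two of those quoted above rather than exactly on them:

```python
# Hypothetical fuel economy if US fleet mpg (~22 mpg in 1980, my assumption)
# doubled every two years, Moore-style.
BASE_YEAR, BASE_MPG = 1980, 22.0

def mpg(year):
    """Miles per gallon if efficiency doubled every two years since 1980."""
    return BASE_MPG * 2 ** ((year - BASE_YEAR) / 2)

milestones = {
    "LA to New York": 2_800,       # miles, approximate
    "around the world": 24_900,    # Earth's circumference
    "to the Moon": 239_000,        # average Earth-Moon distance
    "to the Sun": 93_000_000,
}
for trip, miles in milestones.items():
    year = BASE_YEAR
    while mpg(year) < miles:       # first year one gallon covers the trip
        year += 1
    print(f"{trip}: {year}")
# prints 1994, 2001, 2007, and 2025 under these assumptions
```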

Calculations based on a doubling time of two years.



Here’s another way of thinking about it. If I drove 25,000 miles a year, which is about twice the national average, I would be able to buy a car, put one gallon of gas in it, and drive it for forty (40) years without refueling. This makes a couple of assumptions- the first being that engine life would also follow Moore’s law and my car would actually last that long, and the second being that the gas in the tank would actually go bad long before it got used up. In a few years we would be able to hand down the family gallon of gas to our kids and grandkids- a family heirloom that could continue to be used by generations to come.

So don’t get the wrong idea- not everything in the automotive industry has failed to follow Moore’s law. Apparently the tire pressure used in cars has obeyed Moore’s law, albeit with a longer doubling time. Really. That’s the best you can do, automotive industry?

Clearly, there are technical and theoretical reasons why fuel efficiency hasn’t followed Moore’s law and probably can’t. However, it seems clear that the potential for fuel efficiency in vehicles is simply not being realized. Is this due to technical or practical constraints, or just simply because the demand isn’t there? I tend to believe that we just don’t want it or need it bad enough to make it happen. But I believe we can do better and we will be forced to in the near future.

17 great ways to fool yourself about your results

I’ve written before about how easy it is to fool yourself and some tips on how to avoid it for high-throughput data. Here is a non-exhaustive list of ways you too can join in the fun!

  1. Those results SHOULD be that good. Nearly perfect. It all makes sense.
  2. Our bioinformatics algorithm worked! We put input in and out came output! Yay! Publishing time.
  3. Hey, these are statistically significant results. I don’t need to care about how many different ways I tested to see if SOMETHING was significant about them.
  4. We only need three replicates to come to our conclusions. Really, it’s what everyone does.
  5. These results don’t look all THAT great, but the biological story is VERY compelling.
  6. A pilot study can yield solid conclusions, right?
  7. Biological replicates? Those are pretty much the same as technical replicates, right?
  8. Awesome! Our experiment eliminated one alternate hypothesis. That must mean our hypothesis is TRUE!
  9. Model parameters were chosen based on what produced reasonable output: therefore, they are biologically correct.
  10. The statistics on this comparison just aren’t working out right. If I adjust the background I’m comparing to, I can get much better results. That’s legit, right?
  11. Repeating the experiment might spoil these good results I’ve got already.
  12. The goal is to get the p-value less than 0.05. End.Of.The.Line. (h/t Siouxsie Wiles)
  13. Who, me biased? Bias is for chumps and those not so highly trained in the sciences as an important researcher such as myself. (h/t Siouxsie Wiles)
  14. It doesn’t seem like the right method to use, but that’s the way they did it in this one important paper, so we’re all good. (h/t Siouxsie Wiles)
  15. Sure the results look surprising, and I apparently didn’t write down exactly what I did, and my memory of it is kinda fuzzy because I did the experiment six months ago, but I must’ve done it THIS way because that’s what would make the most sense.
  16. My PI told me to do this, so it’s the right thing to do. If I doubt that, it’s better not to question it, since that would make me look dumb.
  17. Don’t sweat the small details. I mean, what’s the worst that could happen?
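Number 3 on the list is easy to demonstrate. Here’s a toy simulation (pure standard library, all numbers invented for illustration) in which every “experiment” is a perfectly fair coin, yet running enough of them still turns up “significant” results at p < 0.05:

```python
# Multiple-comparisons toy demo: test 100 perfectly fair coins
# and count how many look "biased" at p < 0.05 by chance alone.
import math
import random

random.seed(0)

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value (min-likelihood method):
    total probability of outcomes no more likely than k heads."""
    pmf = [math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    return sum(q for q in pmf if q <= pmf[k] + 1e-12)

n_tests, n_flips, alpha = 100, 100, 0.05
hits = 0
for _ in range(n_tests):
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    if binom_two_sided_p(heads, n_flips) < alpha:
        hits += 1

print(f"{hits} of {n_tests} fair coins looked 'significant' at p < {alpha}")
```

A few “discoveries” per hundred tests is exactly what chance predicts, which is why the number of comparisons you ran matters as much as the p-value you report.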

Want to AVOID doing this? Check out my previous post on ways to do robust data analysis and the BioStat Decision Tool from Siouxsie Wiles that will walk you through the process of choosing appropriate statistical analyses for your purposes! Yes, it is JUST THAT EASY!

Feel free to add to this list in the comments. I’m sure there’s a whole gold mine out there. Never a shortage of ways to fool yourself.


Gut feelings about gut feelings about marriage

An interesting study was published about the ‘gut feelings’ of newlyweds and how they can predict future happiness in the marriage. The study assessed the gut feelings (as opposed to stated feelings, which are likely to be biased in the rosy direction) of newlyweds toward their spouse using a word-association task, and controlled for several different variables (like how the same people react to random strangers in the word association). They found that newlyweds who had more positive ‘gut feeling’ associations about their spouse were in happier relationships after four years. Sounds pretty good, right? Fits with what you might think about gut feelings.

The interesting point (which is nicely put in a Nature piece that covers this study) is that after other effects are factored out of the analysis, the positive association was statistically significant, but it could only explain 2% of the eventual difference in happiness (this analysis was apparently done by the Nature reporter and not reported in the original paper). 2%! That’s not a very meaningful effect, even though it may be statistically significant. Though the study is certainly interesting and likely contains quite a bit of good data, this effect seems vanishingly small.
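To see how an effect that explains only 2% of the variance can still be wildly “significant”, here’s a toy simulation. The sample size and effect size are invented for illustration, not taken from the study:

```python
# Statistically significant vs. meaningful: with a large sample,
# a correlation explaining ~2% of variance yields a tiny p-value.
import math
import random

random.seed(1)

n = 5_000                      # invented large sample size
r_true = math.sqrt(0.02)       # true correlation whose r^2 is ~2%

# Generate (x, y) pairs with the chosen true correlation.
xs, ys = [], []
for _ in range(n):
    x = random.gauss(0, 1)
    y = r_true * x + math.sqrt(1 - r_true**2) * random.gauss(0, 1)
    xs.append(x)
    ys.append(y)

# Sample Pearson correlation.
mx, my = sum(xs) / n, sum(ys) / n
sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
sxx = sum((a - mx) ** 2 for a in xs)
syy = sum((b - my) ** 2 for b in ys)
r = sxy / math.sqrt(sxx * syy)

# Two-sided p-value via the Fisher z approximation.
z = math.atanh(r) * math.sqrt(n - 3)
p = math.erfc(abs(z) / math.sqrt(2))

print(f"r = {r:.3f}, r^2 = {r*r:.1%} of variance explained, p = {p:.1e}")
```

The p-value comes out astronomically small even though knowing x tells you almost nothing about y, which is the gap between the paper’s statistics and the headlines below.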

For interest, here are the titles of the paper and some of the follow-on news pieces written about it, showing how they make the results seem much more clear-cut and meaningful than they are.

Title of the original Science paper:

Though They May Be Unaware, Newlyweds Implicitly Know Whether Their Marriage Will Be Satisfying

Title of the Nature piece covering this study:

Newlyweds’ gut feelings predict marital happiness: Four-year study shows that split-second reactions foretell future satisfaction.

Headline from New Zealand Herald article:

Gut instinct key to a long and happy marriage

Headline from New York Daily News:

Newlyweds’ gut feelings on their marriage are correct: study


Excitement about great results

My first attempt at doing an XKCD-like plot about science, inspired by the great results I posted about yesterday (which still stand, by the way).

“OMG! This would be so amazing if it’s true- so, it MUST be true!”