Reading, writing, arithmetic, and search: Why it’s more important to be organized than smart*

*here smart is used to mean ‘full of knowledge’

Increasingly the knowledge of the world is at our fingertips, keyboards, cell phone screens, and other Hitchhiker’s Guide-style devices that are always (or almost always) connected to the vast store of information and misinformation that we call the internet. So why do we need to know this stuff at all? Of course a core set of knowledge about how the world works is essential. But really, knowledge moves fast and it’s likely that more than 65% of the knowledge currently in your head as ‘truth’ is mislabeled (it’s true- you can check my numbers on the web). So isn’t it more important- with this incredible mental crutch we have now- to be organized? By organization I’m talking about having an efficient file system for finding information: a database that stores metadata about the information, like where to find it. I’m constantly running into questions that I don’t know the answer to and thinking, “but I DO know where to look to get that information” or, “I remember reading something in [publication/blog/website] about that”. Storing pointers to information is far more efficient than storing the information itself- and potentially more accurate. Still, the internet is chock full of bad information, so it’s also essential to have a critical mind with appropriate search strategies and filters in place to be able to sort the wheat from the chaff.

I think that standardized tests should include sections on search. Increasingly, this is THE skill that will allow success. (Of course I’m being deliberately provocative here; I understand that search is different from those other important skills.) Without efficient search strategies people will be lost and alone in a world that they cannot possibly know everything about. Search allows them to access this information and utilize it when they need it, then discard it (keeping pointers to new information they have ‘learned’, of course). Without filters to discriminate what’s reliable from what’s unreliable (a necessary skill in any case) they will be unable to function.

How to write a scientific paper (part 1)

[I’ve updated this post by request to include a more in-depth description of the competitive eating analogy, which was promising but sadly lacking in the first version. Thanks Nat]

I’ve written or participated in writing a bunch of scientific papers, book chapters, and conference papers and figured out a thing or two about the process. The process of writing a paper is difficult and in a lot of ways a labor of love.

Writing a paper (or a proposal, or a thesis) is like eating. But not just your sit-down-at-the-Sunday-dinner-table eating; it’s like competitive eating. The goal is to see how much you can write, revise, smooth, shape, revise again, update, write, revise again, until you feel like you’re going to vomit (figuratively speaking, most of the time)- and then, guess what? You get to continue with the same. At a normal meal you start out hungry. You sit down at the table and help yourself, then pleasantly enjoy eating, growing slowly less and less hungry as you eat more. When you’re full, or sometimes slightly later than that, you stop, push back from the table, burp, then go about your business satisfied with a full stomach. Not so in competitive eating, nor in paper writing. The goals are completely different. You are not eating to feel full- you’re eating to win, goddammit. And likewise, you’re not writing just until you don’t feel the urge to write anymore- you’re writing until you damn well finish that stupid paper. Cram it down until you’re finished. Being able to return to a paper long after you’ve lost the appetite for it is actually a skill worth having.

It’s also similar to the process of moving (like from one residence to another). The moving analogy is that after you’ve finished the first 90% of a paper, you find that you only have 50% more to go, then after you finish 90% of THAT, you find that you only have 50% more to go. It can get ugly, but it’s generally the case that the easiest lifting (moving the couches and beds and other big items) gets done first, goes the fastest, and sometimes you have a bunch of friends helping out. It’s after that part gets finished that the real work starts. And always it takes one person who drives the paper- there’s got to be someone who takes ownership (hopefully the first, or in some cases, the last, author- but not always) and drives the stupid thing to completion, staying after everyone else has left to sweep out the corners and make everything look beautiful.

First, writing a scientific paper is telling a story. Don’t get me wrong- this is non-fiction, and it all should be TRUE as far as you can determine- but it is a story nonetheless. If you organize a paper like a protocol (do step 1, then do step 2, etc.) or present it in the linear way most studies actually occur (we started out to show this, but then we found this other interesting thing, and then we went back and tested it, but that failed so we did something else) then you will lose your audience’s attention. And your primary (as in first, and in some ways most important) audience is the reviewers who act as gatekeepers for your paper to get seen by a larger scientific audience. If your audience loses interest or can’t follow the story or gets frustrated- it’s generally game over. Reviewers might, if you’re lucky, tell you that the overall flow is confusing, or that you should clarify parts or the whole, but it’s equally likely that they’ll lose interest and will simply turn in a poor review that entirely misses your (incredibly important) point.

So my point is that the story you tell can be in an order other than the one in which the work was actually executed. I have clear memories of my graduate advisor taking a bunch of immunoblots that my wife (who was a research tech in the same lab) and I had generated over the course of about a year, laying them out on the bench, and starting to move them around in order as he figured out the story. I was taken aback- I remember thinking, “but that’s NOT the way we did it!” But, of course, he had the big picture in mind and knew how these things could fit into a story worth telling. It’s important to note that this is not the same as making up a hypothesis to fit the results post hoc. Each of the chunks of results is, itself, an experiment with a hypothesis that was set a priori and tested with the experiment.

Anyway, I like to think of the paper writing process as modular, similar to the way my graduate advisor, and then my post-doc advisor after that, shuffled around bits and pieces of evidence to make a complete picture. Each of those pieces was a module. A bit of science that was executed in a very similar way. Each of those pieces had a beginning, the hypothesis, a middle, the experiment, and an end, the interpretation of the results and placement in the larger context. Each could be moved around, but the order determines how the story flows. These modular pieces fit nicely into the discussion of the scientific method I posted earlier.

Here’s my modular template. This works reasonably well to lay out an initial outline. If you’re having trouble figuring out a paper, it’s not a bad idea to compose a bunch of these modules, print them out, and then physically move them around on a table. It helps make connections that are not apparent from a more static document. For those who’ve written a lot of papers (this holds for proposals as well) this will come as no great revelation but I hope it’s helpful for those who aren’t as experienced.

Manuscript Modular Template

Introduction/Background

  • Statement of problem and significance
  • Background information that leads to the overarching hypothesis
  • Previous approaches
  • Statement of overarching hypothesis
  • Summary of results

Module x: Title

  • Input: What question does the previous module raise that should be answered? This piece obviously depends on the order in which you’ve put these modules, so it can’t be written ahead of time.
  • Hypothesis: What is the question being asked? What does this experiment test?
  • Method/experiment: How was the question addressed? What were the steps in this experiment?
  • Results/analysis: What were the results and how do they address the hypothesis?
  • Output: What question(s) does this result raise? This will link to the next section, and you probably shouldn’t duplicate it in the Input section of the next module.

Discussion/Conclusions

Generally this section is a bit more customized to the implications of the study. I think it can follow some general guidelines, but there will be things that have to be given more weight, considerations about data and data processing, caveats of the study, and discussions of the biological importance of the finding in a larger picture view of the overall problem.
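For those who would rather shuffle modules on a screen than on a table, here is a minimal sketch of the template's Module section in Python. Everything in it is my own illustration and not part of any existing tool: the `Module` class, its field names, and the `link_modules` and `outline` helpers are hypothetical, and the module contents are placeholders. The one idea it tries to capture is that each module's Input is not written ahead of time but is derived from the Output of whichever module comes before it in the current order.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Module:
    """One self-contained chunk of the story (hypothetical structure)."""
    title: str
    hypothesis: str
    method: str
    results: str
    output_question: str = ""  # question(s) this result raises
    input_question: str = ""   # left empty until the order is chosen


def link_modules(modules):
    """Return copies of the modules, in the given order, with each Input
    set to the Output of the module that precedes it."""
    linked = []
    previous_output = ""
    for module in modules:
        linked.append(replace(module, input_question=previous_output))
        previous_output = module.output_question
    return linked


def outline(modules):
    """Render the modules, in the given order, as a plain-text outline."""
    lines = []
    for number, module in enumerate(link_modules(modules), start=1):
        lines.append(f"Module {number}: {module.title}")
        if module.input_question:
            lines.append(f"  Input: {module.input_question}")
        lines.append(f"  Hypothesis: {module.hypothesis}")
        lines.append(f"  Method/experiment: {module.method}")
        lines.append(f"  Results/analysis: {module.results}")
        if module.output_question:
            lines.append(f"  Output: {module.output_question}")
    return "\n".join(lines)


# Placeholder content only; substitute your own modules.
first = Module("Placeholder result A", "hypothesis A", "method A",
               "results A", "question raised by A")
second = Module("Placeholder result B", "hypothesis B", "method B",
                "results B", "question raised by B")

# Try the story in one order, then the other.
print(outline([first, second]))
print()
print(outline([second, first]))
```

Reordering the list passed to `outline` is the on-screen equivalent of sliding the printed modules around: the hypothesis, method, and results of each module stay put, while the linking questions are recomputed for the new order.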

How Big Collaborative Science is like Making a Movie

Disclaimer. I’m not involved in filmmaking, I’ve never been on a movie set, never even been to Hollywood. But I have watched a lot of DVD special features on making movies.

I am involved in science: big collaborative science as part of several NIH multi-institutional centers. These are two systems biology centers funded by the NIAID (Systems Virology and Systems Biology of Enteropathogens) and a proteomics center that is part of the NCI Clinical Proteomic Tumor Analysis Consortium (CPTAC). I also work at a national laboratory where that’s the idea of how we’re organized- to do big science. So I know something about that.

Some of the worst movies have the best DVD special features. The League of Extraordinary Gentlemen was a pretty bad movie, but it had something like four hours of special features on the DVD, all really interesting. Watching special features about making feature-length movies raised some very interesting parallels between big collaborative science and making movies (at least from my limited perspective).

So first off, here’s how they’re different:

Goal and product: A movie project’s goal is to, big surprise, make a movie. A movie is a discrete chunk of a result. It gets made and released all at once, and there’s only one way to win: did you succeed in getting the movie made? A research project produces multiple different ‘products’, chief among them research publications. But these things are produced over time and released in different ways.

Having one end product, a finished movie, is also different. A research project will have many small projects that generally circulate around a set of central guiding hypotheses (that are at a very abstract level). This means that there will be gradations to successfully completing the project.

Evaluation: Bottom line, movies are successful if they make money. And they can be evaluated based on this by simple comparison with other movies. Research projects, because of the diversity of their possible products, are harder to evaluate: the measures include impact on the scientific community, high-impact publications, data released, and so on. To be fair, a movie can be a classic work of art, and highly reviewed by critics, and so a ‘success’, without making a lot of money.

Now how they’re similar:

Funding: This is where I’m the most shaky in terms of making movies, but it seems that there’s a funding agency (a studio or often other entities) that basically pores over ‘proposals’ (screenplays, ideas, sequels, etc.) and awards funding to people (directors) or groups (a director and their team) who they think can get the job done. It’s not unlike some scientific funding mechanisms- though it doesn’t involve the component of peer review, which many scientific funding agencies use. The funding agency is making an investment and wants to see a return on that investment.

Vision: This isn’t always true of big science, but it also isn’t always true of movies: A successful project must have a clear and driving vision. Generally this is instituted by a central figure for the project, the PI or the director, depending. The vision isn’t just about the end product but also about how the end product(s) should be achieved. If the vision isn’t there then nothing will get done, or nothing will get done well. (I guess I’m using this as equivalent to leadership, but it’s not really.)

Organization: There are many parts to making a movie, and increasingly in big science, there are many different disciplines that need to come together to pull off a successful project. There have to be teams with different skills and expertise to work on the various parts of the project. So assembling the appropriate team, with the right skills, the ability to get things done (critical), and the ability to work in a large project, is very important.

Collaboration/communication: So this is probably like many, many other large-scale projects (building a skyscraper, a ship, running a government, etc.) but I like the insight that the “making of” special features afford into this world. Collaboration is key. No one person, or one discipline can make it all work. Sure, the actors and actresses are important to a movie. But without everyone else working as a team and communicating efficiently at multiple levels they are just going to sit around and stare at each other. Making a movie at large scale seems incredibly complicated. So you need people who are specialists in what they do and probably know very little about what the other people do. Then you need people who glue those groups together, both from the top down, like a director, and from inside, like the leads from each team who can interface with other groups.

I really liked a quote I heard on a recent making of, but won’t be able to properly attribute it. It was the stunt coordinator talking about collaboration. He said (paraphrased), “we can both have different ideas of what a tree is. The best way to collaborate sometimes is to each go off and draw a tree and then come back and say, ‘is this what you meant?'” Successful collaboration can lead to great things that neither side thought of to start with and poor collaboration (in some cases no collaboration) can completely sink a project.

In my admittedly limited experience collaboration is the most important component. Of course, without the requisite technical skills in your collaborator you can’t actually collaborate effectively, so there is that, but the process of communication at all levels will determine how successful the project is and how happy everyone is about it.

Closing credits. Why is this comparison useful?

I think the most important takeaway- and one that I really haven’t discussed here- is that big science is substantially different from ‘traditional science’ (that is, individual researchers working on projects in their own labs, more-or-less by themselves). Thinking of the organization of big science projects in broader, more abstract terms actually seems to make them more approachable and understandable. This is in contrast to thinking of big science as traditional science, only with more people. In my experience that view seems to lead to the use of big science funds to accomplish several disconnected traditional science projects, and misses the potential that big science offers. Big science projects have parts and interactions very similar to those of other large-scale projects. So how can ideas from these projects be used to improve how we do big science? That’s an open question but one worth thinking about.