How long is long: Time in review for scientific publications

Though there is a lot of anecdotal information about how long it takes to get a scientific paper reviewed by a peer-reviewed journal, there doesn’t seem to be much actual data about it. Although some journals (like PNAS) list dates for “sent to review” and “approval”, these may not include the whole process (time for editorial consideration, for example) and are probably not representative. PLoS journals do a great job and list the date received and the date of acceptance, but I couldn’t figure out a way to get that information in bulk (I didn’t inquire, by the way; maybe another project in the works). The length of time it takes to review a paper for publication can have numerous impacts on projects, grant proposals, and the ability to submit to another journal if the paper is rejected.

Having been a peer reviewer for some time, I realize that it’s often difficult to return reviews on time, and this is one source of delay. Editorial delays, because of the volume of submissions being considered or for other reasons, are another. And then there are just difficult reviews that may take more than the normal number of reviewers because of conflicting reviews or reviews that are not clearly positive or negative. This process can easily stretch out over months. Then, after reviews come back, the authors must address the reviewers’ concerns and submit their revisions. This is also a source of delay, and can be highly variable. Some journals (BMC journals, for example) limit the number of revisions possible on a single manuscript to two, but they allow resubmission of the revised manuscript as a “new” submission after that, presumably to be handled by the same editor.

Over the last few years I’ve tracked the time it takes to get a paper accepted, from the time of first submission, for papers that I’m responsible for (first or last author papers). This doesn’t include rejected papers. Some of those times are short, especially for higher impact journals, where the initial decision about whether a paper will be peer reviewed at all is made by editors and turnaround is generally fairly quick.

This is NOT a representative sample, but it does capture many of the elements I’ve discussed above. These numbers are pretty much in line with an evaluation of PLoS One turnaround times.

So the answer is: No, I wouldn’t consider 100 days to be fast, but it’s not exactly slow either. In fact, it may be in line with what can generally be expected from the scientific publication system. I’d be very interested to hear other researchers’ opinions on their times in peer review and if you have data all the better.

Table. Survey of time in review for a number of my own papers.

| PMID | Journal | Days in review | Months |
|---|---|---|---|
| 23335946 | Expert opinions | 142 | 4.7 |
| 22546282 | BMC Sys Bio | 335 | 11.2 |
| 23071432 | PLoS CB | 366 | 12.2 |
| 22745654 | PLoS One | 198 | 6.6 |
| 22074594 | BMC Sys Bio | 103 | 3.4 |
| 21698331 | Mol BioSystems | 137 | 4.6 |
| 21339814 | PLoS One | 193 | 6.4 |
| 20974833 | Infection and Immunity | 55 | 1.8 |
| 20877914 | Mol BioSystems | 36 | 1.2 |
| 19390620 | PLoS Pathogens | 132 | 4.4 |
| Mean | | 170 | 5.7 |
| Std dev | | 108 | 4 |
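For anyone who wants to check the summary rows or add their own numbers, here is a minimal Python sketch that recomputes them from the days listed in the table. The 30-days-per-month conversion is my assumption about how the month column was derived (it matches the listed values), and the standard deviation is the sample version (n - 1).

```python
from statistics import mean, stdev

# Days in review, copied from the table (one entry per paper).
days_in_review = [142, 335, 366, 198, 103, 137, 193, 55, 36, 132]

# Assumption: the month column in the table is simply days / 30.
DAYS_PER_MONTH = 30

mean_days = mean(days_in_review)
std_days = stdev(days_in_review)  # sample standard deviation (n - 1)

print(f"Mean: {mean_days:.0f} days ({mean_days / DAYS_PER_MONTH:.1f} months)")
print(f"Std dev: {std_days:.0f} days")
```

Appending your own review times to `days_in_review` is all it takes to see how your experience compares.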

 

Money is deeply, fundamentally weird.

Ever since I read this article in Wired magazine (you know, the paper things that are thinner than books and you still find in doctor’s offices?) I’ve had this feeling that the sands are shifting beneath my feet. How can you truly know the value of what’s in your wallet? Count your money? Try again. Money is something other than what we normally think it is. The financial credit crisis of 2007-ish happened in part because of people and groups wanting to buy lots of debt. Why does anyone want to buy someone else’s debt? It makes sense (debt gets paid back with interest, which makes the owner of the debt money), but it is pretty weird, really. And many people, myself included, felt that they ‘lost’ a large amount of value following that time, but what was that value really? (An awesome overview of this whole thing can be found at This American Life’s podcast about it. HIGHLY recommended.) Why does the economy contract? Isn’t it weird that tomorrow there may be more (or less) value in the world than there is today?

Money is fluid, and so is value. Imagine that you have currency based on gold (work with me here). You can set a value for a certain amount of currency based on the very real mass of gold that it represents. No problems. Everything is smooth sailing, right? Well, all of a sudden a massive gold vein is discovered near a subway station in Manhattan. And all of a sudden the value of what you have in your pocket is not what it was in the morning. You worked the same amount for it, right? So why did it change?

The Wired article describes a market, based in the online game Everquest, then just a few years old. In this game players can earn currency (in virtual gold pieces I think) by playing the game. The demand, at that time, was such that the virtual gold pieces had real value. That is, there was an exchange rate between Everquest ‘gold’ and real dollars. Think about it for a minute. Instead of thinking, “what weirdos are going to pay real money to buy gold in an online game”, the real question is what does this say about our “real” money? You could, in theory, go to work all day in a virtual world for virtual currency (that is, play a game that enough other interested parties are playing) and then exchange that currency for things that you really need. Who carries coins and bills in their pockets? Credit cards are where it’s at: money goes in directly from your employer and then gets transferred to the store where you’ve made a purchase, with no physical instantiation involved at all. There have been sweatshops uncovered where the workers are playing games for days on end to get virtual currency (that then is turned into ‘real’ money).

The more recent advent of Bitcoin is a similar-type example of our ability as humans to strike bargains between each other. Their (it’s a decentralized, open source effort, so maybe that’s more of “our”) system is pretty cool and complex, but with thought behind it, which is more than I can say of the US monetary system. Money is a bargain between people. It’s not only based on trust, hope, and need, it’s actually a human instantiation of those very emotions. So when you pull out a dollar bill to pay for something, think about how you’re handing over your trust, hope and need to the sales clerk. But I wouldn’t advise mentioning that to them. That would be weird.

Five minute explanation: Cyanothece transcriptional model


Because the paper is behind a paywall, I’m making it available as the submitted manuscript. Eventually I’ll get with the program and start releasing on arXiv or Figshare, but for now it’s here. I’ve tried to make the version somewhat pretty (I get really tired of reading papers that are double-spaced and have the figures and tables at the end).

Citation

McDermott J.E., Oehmen C., McCue L.A., Hill H., Choi D.M., Stöckel J., Liberton M., Pakrasi H.B., Sherman L.A. (2011) A model of cyclic transcriptomic behavior in Cyanothece species ATCC 51142. Mol Biosystems 7(8):2407-2418. PMID: 21698331

*but behind a paywall at Molecular BioSystems

Here available as the submitted manuscript and supplemental information.

Background

Cyanothece sp. 51142 is an ocean-dwelling cyanobacterium that is capable of fixing nitrogen in the dark and photosynthesizing in the light, two normally incompatible activities. Unlike some other cyanobacteria it makes this switch inside the same cell every light/dark cycle (normally about 12 hours). This makes it interesting from the standpoint of bioenergy production but also regulation. The process of how it is able to drastically rearrange its machinery every 12 hours is not well understood.

A ‘wreath’ network of transcriptional changes in Cyanothece over a 24 hour period.

What was done?

We used multiple transcriptomic datasets (measurements of levels of gene expression) taken at different times in the light/dark cycle to construct a general model of the functional processes occurring in Cyanothece. The interesting part about this was that we did not impose the circular shape on the model, it arose naturally from analysis of the data, and it really does represent a clock- with the pattern of gene expression at different times of day being located at different locations on the clock face. We then used a mathematical approach to relate the expression levels of drivers (regulators) with groups of genes that can be associated with different functions. The model allows us to plug in different starting points and predict what the state of the system will be at future times.

Why is it important?

The model we constructed can be changed and results simulated to predict what will happen in a real experiment. These kinds of models are good for focusing experimental efforts by predicting interesting behavior. An example question might be to ask what would happen to the timing of photosynthesis (as judged by gene transcription) if the levels of a key regulator are changed. The resulting prediction(s) can then be tested experimentally to discover new things about the system.

The story

This paper took about five years to get written and accepted. That’s from the point at which I decided that a paper should be written to the point that it was published. It was from the first project I worked on at my then new position. I came up with the wreath visualization early in the process and, after having convinced myself and others that it was real, found that it was a very compelling way to think about the diurnal (day/night) cycle. The figure has been used in many different forms, mainly as eye candy. I’m amused when I see it on a poster that I had nothing to do with (from my workplace PNNL). It has even been used around the web.

 

Gaming the system: How to get an astronomical h-index with little scientific impact

The old scientific adage “publish or perish” has garnered a lot of debate lately. I’ve posted about my own scientific impact as well as the impact of papers published about computational methods that are named versus unnamed in the title. Certainly publications remain the currency of scientific careers, for better or worse- though I think this is changing with more emphasis being placed on other, more flexible and open, forms of scientific outreach. There’s a lot of talk about this subject from various places including ByteSizeBiology, Peter Lawrence, and Michael Eisen – to name a few.

The purpose of this post is to highlight an instance of abuse of the system, kind of in a funny (odd, surprising, shocking) way. This is similar in spirit to recent reports that a math paper, generated by an algorithm that links mathematical words together, was accepted into a journal.

I was searching gene names to research a paper I was writing a couple of years ago and started to notice a weird pattern. Some genes were mostly absent from the literature (that is, no one has actually studied their function, and they haven’t been highlighted in any other screen-type studies that identify lots of things). However, a number of publications on completely different genes looked suspiciously similar. Many of these had titles that included the words “integrative genomic analyses” or “identification and characterization of [gene] in silico”. They all had two authors, M. Katoh and M. Katoh or Y. Katoh, though some had more authors, and most were published in a few journals: the International Journal of Molecular Medicine and the International Journal of Oncology, both with low but respectable impact factors (1.8 or so). Many, though not all, of these papers seem to be rehashed digests of information obtained from databases combined with review-type information about potential functions related to cancer or biomedicine. This PubMed search retrieves most of these citations for your amusement.

A quick search in Web of Knowledge for “Katoh M” as an author and “INTERNATIONAL JOURNAL OF ONCOLOGY” as a publication retrieves 99 publications, with a jaw-dropping h-index of 48 (the h-index is a measurement of the scientific impact of a group of publications: the largest number h such that h of the publications have been cited at least h times each). Results from the “INTERNATIONAL JOURNAL OF MOLECULAR MEDICINE” were only slightly less impressive (h-index of 37 with exactly 99 publications as well; see the screen capture of results below). Following up with a search of the three main names here, Masaru, Masuko, or Yuriko (there was also a mysteriously named “Mom Katoh”, who may be the ringleader of the bunch, but she/he only had a couple of publications), retrieved 216 publications with a combined h-index of 56, a number that any biologist would die for (or at least should be very happy with).
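For readers who haven’t computed one by hand, here is a small Python sketch of the standard h-index definition: the largest h such that h publications have at least h citations each. The citation counts in the example are made up for illustration and have nothing to do with the Katoh search results.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h

# Made-up citation counts, for illustration only.
example_counts = [25, 8, 5, 3, 3, 1, 0]
print(h_index(example_counts))
```

By this definition, an h-index of 48 from 99 publications means that at least 48 of those papers have each been cited 48 or more times.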

Web of Knowledge Search for Katoh M


Masaru is affiliated with the apparently reputable National Cancer Research Institute in Japan. But Masuko and Yuriko don’t seem to be closely affiliated to any place in particular (judging by a Google search).

Some of these publications may, in fact, be valuable and have useful information and results in them. I certainly haven’t gone through each and every one. However, a large number of these “integrative genomic analyses” are not useful; they seem to have been targeted at genes with little characterization and are written from template text. The high citation numbers they get, then, may be due to a lack of care on the part of those citing them: they are included simply because they appear to be the only comprehensive functional study of a particular gene that has turned up in the citing study. It certainly emphasizes the need for caution when “filling in” citations that are not central to the main story of a publication (where writers, myself certainly included, are less critical about the source of their citations).

How important is having a name for your computational method?

When building software tools, databases, or reporting approaches to data analysis or modeling, choosing a name is important. I started out writing this post with the notion that this is true, searched briefly for evidence to back me up, then realized that I could do this analysis myself. Or at least enough to get an idea of how important having a name for your method might be.

Here’s what I did: I gathered all publications in the journal Bioinformatics published between 2004 and 2008 (3517 or so) from the Web of Knowledge/Science. I then identified those publications that referenced software tools or databases by starting with a “[name]: [title of paper]”, giving approximately 954 publications (there are more than this that fit the bill, more on that in a minute). I calculated the mean number of citations the publications in each group had (not adjusting for years in publication). That’s the “All” comparison in the figure below. The difference shows that publications that use a name garner more citations (and thus have more ‘impact’ by this measure), and this was statistically significant by t test (p = 0.005). However, this could be due to the difference in the nature of the publication. Perhaps tools are just more likely to be cited than more scientific studies about specific systems (I think they are). So I went through an arbitrary selection of 500 of the publications without a name and identified a conservative set of 158 that looked like they could have had names associated with them, based on their titles. This was a bit of an arbitrary endeavor, but I think I did an OK job. That comparison is the “Matched” comparison below and shows a much more marked difference.

You can find a spreadsheet with my analysis here: Bioinformatics_Pubs_WOS_2008

The bottom line: The publications with named methods garnered over three times the number of citations as the pubs with no names, and this was also statistically significant (p = 0.05; the weaker significance reflects the smaller number of publications in the matched set).
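The comparison described above can be sketched in a few lines of Python. This is not the spreadsheet analysis itself, just an illustration under stated assumptions: the regular expression is a crude stand-in for my “[name]: [title of paper]” rule (a single token before the colon), the titles and citation counts are invented, and I use Welch’s form of the t statistic, which may differ from the variant used in the original analysis.

```python
import math
import re
from statistics import mean, variance

# Assumed heuristic: a title "has a name" if it starts with one token followed by a colon.
NAME_PREFIX = re.compile(r"^\s*[^\s:]+:\s+\S")

def has_named_method(title):
    """True if the title looks like '[name]: [title of paper]'."""
    return NAME_PREFIX.match(title) is not None

def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))

# Invented (title, citation count) records, for illustration only.
records = [
    ("FooAlign: fast alignment of made-up sequences", 40),
    ("BarDB: a database of imaginary pathways", 25),
    ("QuuxNet: network inference for toy data", 31),
    ("A comparison of clustering approaches for expression data", 9),
    ("Improved inference of regulatory interactions", 14),
    ("Statistical analysis of sequence motifs in yeast: a case study", 7),
]

named = [cites for title, cites in records if has_named_method(title)]
unnamed = [cites for title, cites in records if not has_named_method(title)]

print("mean citations, named:  ", mean(named))
print("mean citations, unnamed:", mean(unnamed))
print("Welch t statistic:      ", round(welch_t(named, unnamed), 2))
```

Turning the t statistic into a p-value needs the t distribution, which the standard library doesn’t provide; a statistics package such as SciPy would be the usual route.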

Impact analysis of pubs with named methods versus unnamed methods


There are a number of ways I could improve on this comparison and I’d be happy to entertain suggestions on it. However, I think the results are quite interesting. There are some reasons that they might be true that are unrelated to actually having a name. The first thing I can think of is that the named publications are likely to be application notes, which describe the release of more mature, tested software than the non-named publications, which may describe more of the research and proving of the method. That is, the named publications may be more likely to have a tool that is actually usable by others (and thus citable) than the other kind of publication, which may not even provide software at all. A good way to examine this would be to construct a matched set of publications that have no named method, but do have associated software (or a web interface). However, I really don’t have time for that, and it sounds painfully boring.

However, another non-exclusive notion that this result suggests is that simply the presence of a recognizable, easily usable name for a method increases the likelihood that it will be cited in future work. This allows association of the complicated and hard-to-describe process that is described in the paper with a “handle” for the method that is easy to remember. This is actually fairly interesting psychologically and suggests what I believe many scientists already realize, that marketing (the choice of a good name for example) can be key in scientific impact. We can debate on whether or not that’s a good thing, but it’s generally true in science.

So these results seem to suggest that a way to increase scientific impact is to name your method. Though, of course, correlation does not imply causation, so it certainly might not work that way. I’m really interested in seeing if there are patterns in the choice of name that extend to impact, but I’m not sure how to do that. The length (number of characters) of the name has no correlation with the number of citations, but that’s as far as I’ve gotten. Any suggestions?

 

Survival of the fitness: how to do good by your health on travel

I don’t travel a lot compared to some people I work with, but I do a bit of business travel. I just returned from a quick trip to DC. If you travel this way, and you’re trying to maintain an exercise regimen of any kind, you know how hard it can be.

from DUSAN PETRICIC in The Scientist

When you get to your hotel you just want to lie in bed, relax, and veg out- meetings can go all day, and the food can be, to put it VERY generously, less than healthy. It’s easy to take the vacation way out. That is, to think, “hey, this business travel is kinda like a vacation and I can just let all this health stuff slide for a bit”. Slippery slope- very slippery. It’s not just the travel time you’re talking about, it’s also the time after you get back, when you start dodging your workout routines and healthy eating because you’re out of practice. Actually, business travel can be a great opportunity (see me with more optimism) to do more than you usually do- if not in the eating area, at least in the fitness area. Here are some things that have helped me (and that I aspire to, I’m certainly not perfect in this area). I’m intentionally trying to avoid advice that’s good in this area but could pertain to your fitness at any time.

Eating

  1. Bring along healthy snacks/small meals with you. This beats the heck out of buying stuff in the airport, on the airplane, from the hotel snack bar or (heaven forbid) minibar, or from a random vending machine. This wins on the nutrition front and on your wallet too. I generally pack energy bars (the Clif Zbars for kids are actually great for grownups too and about 120 calories); instant oatmeal with extras (brown sugar, dried fruit, peanut butter), since hotel rooms almost always have coffee makers, but don’t forget a spoon; crackers and tuna fish (Starkist has cute packages, but you can easily make your own); and fruit (NOT bananas, but apples, pears, etc.). All of this should make it through security OK- I’ve never had a problem (even with the PB, which is kind of a ‘paste’).
  2. Don’t give up on eating well, but realize that there are just those times. Dinners out with colleagues, free food buffets, cookies and muffins provided at the conference, alcohol and more alcohol- all those things can be tricky. Make sure that you keep a rough estimation of caloric intake in your head and try to match it (or, if you’re really good, precede it by) doing something from the exercise list below- that way things even out, more-or-less.
  3. You probably won’t eat your best, but DON’T eat your worst. This is just common sense, but it’s really easy to forget. If you’re going to eat bad don’t go whole hog- there are generally better choices and worse choices. Try to go toward the light.
  4. Use jet lag and busy meetings to your advantage. Sometimes jet lag and busy meetings (without food available) can be your friend. You may not be hungry at the times you normally are and you may be able to avoid some of the bad by simply skipping it (this can go both ways- I get hungry early in the morning on the East coast for some reason). Also, for me busy is better. I’ll simply forget that I’m hungry (at least hungry in that bored-so-I’ll-munch way).

Exercise

  1. Bring your workout clothes, dummy. It seems simple, but it’s probably not the thing you’re thinking of when you’re packing. Don’t forget workout shoes (I use some flat shoes that pack easily) and an mp3 player if you normally use one.
  2. Make use of the hotel gym. Most business hotels have workout rooms. Make sure you ask when you check in where it is and when it’s open. Use it but don’t be tied to your normal workout schedule since it probably won’t work on travel.
  3. Walk. If your meeting is in the city, walk. Walk to the conference (if it’s somewhere else), to dinner, or just plan to walk around during your breaks. This is the thing that’s really helped me and it’s fun too. Do some research prior to your trip to make sure you’ll be walking in safe areas or just ask at the front desk before you venture out. Walking back to the hotel, even a longish way, after dinner can be a good way to make up some calories- but ask at the restaurant about a safe path. Running works too.
  4. Get out and see the place. If you have breaks or free time go and see the sights, but walk. Use public transportation (Metro is best) to get from A to B and walk the rest. Travel like this is a great opportunity and walking is one of the best ways to actually see someplace.
  5. Use the stairs. Not just to get up to your room, but use the stairs to work out. It may be that the hotel doesn’t have a gym or that the gym isn’t the greatest. Use the stairs. Climbing 10 floors (about 5 minutes) should burn somewhere around 50 calories, and you can do it many times. It’s likely that no one will see you sweat, but this might not be the most interesting place to work out. Listen to music or podcasts to pass the time.
  6. Do a workout in your hotel room. You can blast the tunes, watch a movie, or do this completely naked (but please close the blinds, please). There are lots of different fitness regimens that you can do with no equipment at all- and they can kick your ass. Here’s a good set specifically for the hotel room stay from NerdFitness.
  7. Dress like you mean it. Planning to put on your workout clothes provides a much lower energy barrier than actually working out. So do that first. When you’re standing around in your workout clothes you’ll start to feel stupid for not working out. It actually works.
  8. Use your layover. Airports are big. Some are really big. Use that fact. If you have a layover of more than about 45 minutes start walking. Plan out your walk so that you don’t end up far away from your gate when you need to board- which would make you feel dumb, sweat, and probably hate me for my stupid ideas. Try walking the whole thing. If you have a roller bag so much the better. Dragging one of those things around will only make things better. Skip the moving walkways- instead try to beat the people standing on them (or walking on them even) to the other end. Pretend like you’re in a super hurry to catch your plane, it’s fun.