Science figures we could do without

Everyone knows them: Figure 3 in <big name journal> that’s supposed to be telling you something important, or a lot of important things, but instead is either uninterpretable or just plain misleading. An excellent compendium of bad figures can be found at Bad Figure. Others have made collections of similar things: scientific figures, infographics.

I thought I’d add to this by trying to define some classes of bad figures and give some examples.

Note: the examples I’ve chosen ARE NOT a statement about the quality of the originating papers in terms of their results or conclusions. They’re just examples to illustrate these different classes of poor figures.

The “Trump Up”:

This kind of figure could be expressed as a Table or as a single panel of another figure, or even just reported as numbers in the text. However, it would look REALLY COOL as some incomprehensible figure with lots of lines.

This beauty of an example comes from a Nature Genetics paper, “Bayesian method to predict individual SNP genotypes from gene expression data”, Nature Genetics 44, 603–608. Take some simple results and add a bunch of lines. It looks like a bunch of snow angels. It does look like there’s some more complicated information on the predicted genotypes side, but is there actual usable information there?

(a–c) Sample IDs were sorted for each semicircle (right, predicted genotypes; left, observed genotypes; numbers on the outside of the semicircles represent indexed sample numbers). Results are shown for experiments in which RYGB liver was used as the training set for HLC liver (a), HLC liver was used as the training set for RYGB liver (b) and RYGB adipose was used as the training set for HLC liver (c). In the case of a correct pairing (with adjusted minimum Pi,j of <1 × 10−5), the connection between the semicircles was a straight line passing the circle center (blue lines). In the case that no match for a given individual was identified, no line existed: for example, tick A in a–c. The blue curves outside of the right semicircles denote adjusted minimum Pi,j (−log10 transformed) for matching predicted genotype vectors to observed genotype vectors. For convenience, this value was capped at 16. If the value was <5, the curve is shown in red, indicating lack of statistical support for any match. (d) Matching was performed in the HLC liver set to which RNA-DNA mispairing and orphan samples had been added. In the case of a mispairing detected at adjusted minimum Pi,j of <1 × 10−5, the line connecting the semicircles will not be straight (red connections). The predicted genotype of subject 31 (tick A) best matches the observed genotype of subject 98 (tick D). There was no line connecting the observed genotype of subject 31 (tick C). In the case of orphan RNA (for example, subject 137), there was no connection between the predicted genotype (tick B) and observed genotype (tick E). The green curve outside the right semicircle show adjusted −log10 (Pi,i).

The “Glamor Cram”:

Many “glamor” journals like Science and Nature have strict limits on the number of pages and figures in a paper. Because some of the studies being published are large and extremely complicated, this can result in some funny, odd, and disturbing outcomes in terms of paper organization. For example, the study of ovarian cancer from The Cancer Genome Atlas published a couple of years ago was 5 pages in the journal and 130 pages of supplemental methods and results (not counting larger tables and data files in the supplemental results). Another effect is that figures can sometimes be crammed with information, because, you know, there’s a LOT to share. An example is from another TCGA paper on breast cancer. It IS interpretable, and even elegant in its own way. But, man, is it complicated, with multiple levels of color and borders having different meanings. Whew. Exhausting.

Mutual exclusivity modules are represented by their gene components and connected to reflect their activity in distinct pathways. For each gene, the frequency of alteration in basal-like (right box) and non-basal (left box) is reported. Next to each module is a fingerprint indicating what specific alteration is observed for each gene (row) in each sample (column). a, MEMo identified several overlapping modules that recapitulate the RTK–PI(3)K and p38–JNK1 signalling pathways and whose core was the top-scoring module. b, MEMo identified alterations to TP53 signalling as occurring within a statistically significant mutually exclusive trend. c, A basal-like only MEMo analysis identified one module that included ATM mutations, defects at BRCA1 and BRCA2, and deregulation of the RB1 pathway. A gene expression heat map is below the fingerprint to show expression levels.

The “Ridiculome” (aka The Hairball):

This is one that I’ve been guilty of, so I’ll use an example from one of my own papers (again: this isn’t a critique of the quality of these papers, which is impeccable in this example of course). Lots of interactions? Why not show them! All. At. Once. Because that will really illustrate your point about your work being super complicated.

From “Enhanced functional information from predicted protein networks.”, Trends in Biotechnology 22:60-62. A rainbow-colored Death Star exploding? A psychedelic pincushion? Who can tell?

Figure 1. Comparison of predicted protein networks for E. coli. (a) Protein pairs and their mutual information scores based on phylogenetic profiling were used to generate a network for E. coli. Figure generated using data from [4, supplementary information] (b) Protein interactions were predicted using Bioverse [7] based on finding pairs of proteins similar in sequence to proteins from a database of experimentally determined interactions. Figure generated using data from Bioverse (http://bioverse.compbio.washington.edu). For both networks, nodes representing proteins are colored based on their gene ontology (GO) [19] category and the 220 proteins present in both networks are outlined in blue. Edges represent the predicted relationships between proteins [functional linkages in (a) and protein interactions in (b)] and are colored by confidence (a) or mutual information score (b).

The “Token Network Figure”:

Why, yes, Reviewer 3, we DO have a license for MetaCore/Ingenuity Pathway Analysis. We’ll get our post-doc right on that. A “systems-level” figure, you say? Certainly, right away. What does it mean? Well, there are molecules. And they’re connected. They’re ALL connected, see? I mean to say that it’s complicated. And some of these things depicted were actually found by us to be significant. We’ll let you guess at which those are. But one of our favorites shows up smack in the middle and looks really meaningful. I’m sure there’s something good here. There must be. There are so many different interactions. My god, it’s full of proteins.

Yes. There are interactions here.

The “Included-By-Popular-Demand”:

At least, that’s what I’m guessing. The paper has been returned from review and the reviewers want to see “a figure showing relationship X included”. The data doesn’t look *that* great, but the reviewers are asking for it so it’s included as a figure. The reviewers are satisfied and give the paper a go-ahead. If they had reviewed that same figure the first time around they would have suggested it be removed. Sigh. An example of this is from this paper: “Evolutionary Rate in the Protein Interaction Network” Science 296 (5568): 750-752 

While it’s true that a “trend” shown here could be significant (given enough data points), this figure totally doesn’t show that. Why the authors would think that this solidly supports their conclusions, or why it should be Figure 1 in their Science paper, is beyond me.

Figure 1 The relation between the number of protein-protein interactions (I) in which a yeast protein participates and that protein's evolutionary rate, as estimated by the evolutionary distance (K) to the protein's well-conserved ortholog in the nematode C. elegans.
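To put a number on the “given enough data points” remark, here is a minimal sketch of how a very weak correlation can still come out as statistically significant once the sample is large. The correlation value (r = -0.1) and the sample sizes are made up for illustration and are not taken from the paper, and the p-value uses a normal approximation to the t distribution, so it is only rough for small n.

```python
import math


def pearson_t(r: float, n: int) -> float:
    """t statistic for testing whether a Pearson correlation r,
    computed from n data points, differs from zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def approx_two_sided_p(r: float, n: int) -> float:
    """Approximate two-sided p-value for the correlation.

    Assumption: the t distribution is replaced by a standard normal,
    which is reasonable for large n and only a rough guide for small n.
    """
    t = pearson_t(r, n)
    return math.erfc(abs(t) / math.sqrt(2.0))


# Hypothetical weak correlation; NOT a value from the paper.
r = -0.1

for n in (30, 300, 3000):
    p = approx_two_sided_p(r, n)
    # r^2 is the fraction of variance explained, which does not change with n.
    print(f"n={n:5d}  r={r:+.2f}  r^2={r * r:.2f}  approx p={p:.2g}")
```

The fraction of variance explained stays at 1% no matter how many points you add; only the p-value shrinks. So a scatter plot of such data will look like a shapeless cloud even when the trend passes a significance test, which is why the number may be worth reporting in the text while the plot itself convinces nobody.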

The “Really Should Be A Better Way to Show This”:

This is where it gets personal: spider plots, also known as radar charts. I really don’t like these, but do admit that under certain circumstances, with certain kinds of data, they can be effective. They attempt to show differences between multiple categories of data for multiple variables. And they do so by changing directions in dizzying fashion.

This post presents an excellent critique of radar charts and some alternatives that work better to present these kinds of data.

Here’s an example of a spider plot. It’s not even that bad as these plots go, but it’s still very hard to interpret.

Fin

This list is woefully incomplete. Of course, there are many other categories that could be added here. Please suggest some.
