So I got this comment from a reviewer on one of my grants:
The use of the term “hypothesis” throughout this application is confusing. In research, hypotheses pertain to phenomena that can be empirically observed. Observation can then validate or refute a hypothesis. The hypotheses in this application pertain to models not to actual phenomena. Of course the PI may hypothesize that his models will work, but that is not hypothesis-driven research.
There are a lot of things I can say about this statement, which really rankles. As a thought experiment replace all occurrences of the word “model” with “Western blot” in the above comment. Does the comment still hold?
At this point it may be informative to get some definitions, keeping in mind that the _working_ definitions in science can have somewhat different connotations.
Hypothesis: a supposition or proposed explanation made on the basis of limited evidence as a starting point for further investigation.
This definition has nothing about empirical observation- and I would argue that this definition would be fairly widely accepted in biological sciences research, though the underpinnings of the reviewer’s comment- empirically observed phenomena- probably are in the minds of many biologists.
So then, also from Google:
Empirical: based on, concerned with, or verifiable by observation or experience rather than theory or pure logic.
Here’s where the real meat of the discussion is. Empirical evidence is based on observation or experience as opposed to being based on theory or pure logic. It’s important to understand that the “models” being referred to in my grant are machine learning statistical models that have been derived from sequence data (that is, observation).
I would argue that including some theory or logic in a model that’s based on observation is exactly what science is about- this is what the basis of a hypothesis IS. All the hypotheses considered in my proposal were based on empirical observation, filtered through some form of logic/theory (if X is true then it’s reasonable to conclude Y), and would be tested by returning to empirical observations (either of protein sequences or experimentation at the actual lab bench).
I believe that the reviewer was confused by the use of statistics, which is a largely empirical endeavor (based on the observation of data- though filtered through theory) and computation, which they do not see as empirical. Back to my original thought experiment, there’s a lot of assumptions, theory, and logic that goes into interpretation of Western blot – or any other common lab experiment. However, this does not mean that we can’t use them to formulate further hypotheses.
This debate is really fundamental to my scientific identity. I am a biologist who uses computers (algorithms, visualization, statistics, machine learning and more) to do biology. If the reviewer is correct, then I’m pretty much out of a job I guess. Or I have to settle back on “data analyst” as a job title (which is certainly a good part of my job, but not the core of it).
So I’d appreciate feedback and discussion on this. I’m interested to hear what other people think about this point.