Evidence
The following is a tutorial essay for Knowledge and Reality with Daniel Kodsi. In effect, it is a quick agenda for an hour-long discussion on a guiding question, which for this week was ‘Does knowledge, and only knowledge, constitute evidence?’. An edited version of the material is on my Substack.
I argue that all and only knowledge constitutes evidence. I assume that we start with a good grasp on knowledge and evidence. We know roughly what knowledge requires, and where knowledge is required. Further, we can specify the core functions of evidence across various domains, from everyday discourse to scientific investigation. From this starting point, I argue that knowledge fulfills the theoretical functions of evidence, such that one’s total evidence comprises just what one knows. §I argues that all evidence is knowledge. §II argues that all knowledge is evidence. §III deals with some objections to the resulting picture that all and only knowledge constitutes evidence.
I. All evidence is knowledge
The core role of evidence, especially in scientific practice, is to determine the relative plausibility of competing hypotheses, as modeled by conditionalization. Evidence—that is, what we properly condition on—must be known. In particular, one cannot properly condition on something unless one knows it, because (crucially) conditioning on φ fundamentally involves discarding every possibility inconsistent with φ. There is something wrong with discarding possibilities which, for all one knows, might be actual; hence, one may properly condition only on what one knows. That is, all evidence is knowledge.
To defend the crucial claim in the foregoing argument, we must get a bit technical (see ‘E = K, but what about R?’). Let W be a set; call its elements worlds, and its subsets propositions. A proposition φ is true at some world w just in case w is in φ. Thus, we’ve identified each proposition, in a coarse-grained way, with the set of worlds at which it’s true. (This much should be familiar from, say, Kripke semantics.)
Now, let Σ be a logically closed set of propositions. That is:
- Σ contains ⊥ := ∅ (the proposition true at no world).
- If Σ contains φ, then it contains ¬φ := W \ φ (the proposition true just where φ is not true).
- If Σ contains φ1, φ2, …, then it contains φ1 ∧ φ2 ∧ … := φ1 ∩ φ2 ∩ … (the proposition true just where all of the given propositions are true).
Then Σ is a σ-algebra on W, and (W, Σ) is a measurable space. Now, let M be a function from Σ to the (extended) real numbers such that:
- M(φ) ≥ 0.
- If φ1 ∧ φ2 = ⊥, then M(φ1 ∨ φ2) = M(φ1) + M(φ2), where φ1 ∨ φ2 := φ1 ∪ φ2 (the proposition true just where at least one of the given propositions is true). We really want this to hold for countable families of pairwise-incompatible propositions as well, so that we have countable additivity rather than mere finite additivity.
Then M is a measure on (W, Σ), and (W, Σ, M) is a measure space. If M(W) = 1 then M is a probability measure, for which we write P. If 0 < M(W) < ∞, we can just define a probability measure M’ as M’(φ) = M(φ)/M(W). In particular, (W, Σ, P) is a probability space, and we call W the sample space and Σ the event space. For any event φ in Σ, P(φ) is the probability of φ.
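To make the construction concrete, here is a minimal Python sketch of a finite case (the four-world sample space, the weights, and all the names are illustrative assumptions, not part of the argument above): propositions are subsets of W, Σ is the full powerset, M sums nonnegative weights over worlds, and normalizing by M(W) yields a probability measure P.

```python
from itertools import chain, combinations

# A toy sample space: four worlds, individuated by weather and temperature.
W = {"rain_hot", "rain_cold", "sun_hot", "sun_cold"}

# Propositions are subsets of W; take Sigma to be the full powerset of W,
# which trivially satisfies the closure conditions listed above.
def powerset(s):
    s = list(s)
    return [frozenset(c)
            for c in chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

Sigma = powerset(W)
print(len(Sigma))  # 16 events

# A measure M, given by summing nonnegative weights over worlds; defining M
# this way makes additivity automatic.
weights = {"rain_hot": 1.0, "rain_cold": 2.0, "sun_hot": 3.0, "sun_cold": 2.0}

def M(phi):
    return sum(weights[w] for w in phi)

# Since 0 < M(W) < infinity, normalizing yields a probability measure P.
def P(phi):
    return M(phi) / M(W)

RAIN = frozenset({"rain_hot", "rain_cold"})
print(P(RAIN))  # 3/8 = 0.375
print(P(W))     # 1.0
```

In the finite case every subset of W can serve as an event, so the closure conditions on Σ are trivially met; they earn their keep only when W is infinite.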
Next, we define conditional probability. Suppose that we only care about worlds in some particular event e, and so discard all the worlds not in e. This induces a subspace of our probability space. Our sample space naturally becomes e itself. For each event φ in Σ, discarding the non-e worlds leaves us with the subset e ∩ φ. We thus have a restricted event space Σe := {e ∧ φ : φ ∈ Σ}. But recall that Σ was logically closed, so Σe is just a subset of Σ. In particular, all the events in Σe are already in the domain of our probability measure P.
Restricting the domain of P to Σe yields a measure on (e, Σe); but we can do better. So long as P(e) > 0, we can define a probability measure Pe by Pe(φ) := P(φ)/P(e) for each φ in Σe. Thus, we have a subspace (e, Σe, Pe).
Now, Pe was only defined on (e, Σe), but we can naturally extend it to a probability measure P(· | e) on (W, Σ). Recall that every ψ in Σe is, by definition, of the form e ∧ φ for some φ in Σ. Thus, setting P(φ | e) := P(e ∧ φ) / P(e) for each φ in Σ uniquely extends Pe to a probability measure on (W, Σ).
Now, we see that the definition of conditional probability P(φ | e) is not arbitrary; it is (the unique extension of) the probability measure induced by restricting our sample space down to some particular (positive probability) event. In particular, we now see that conditioning on some piece of evidence fundamentally involves discarding all the possibilities (formally, worlds) inconsistent with it.
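To see the discarding picture concretely, here is a hedged continuation of the toy Python sketch above (P, RAIN, and the weights are the illustrative assumptions introduced there): conditioning on e keeps only the e-worlds and renormalizes, and the result agrees with the ratio definition P(φ | e) = P(e ∧ φ) / P(e).

```python
def condition(P, e):
    """Return P(. | e): discard the worlds outside e, then renormalize.

    P is a probability measure on subsets of W (as in the sketch above) and
    e is an event with P(e) > 0.
    """
    if P(e) <= 0:
        raise ValueError("cannot condition on a probability-zero event")
    def P_given_e(phi):
        # Only worlds surviving the cut (i.e. those in e) contribute.
        return P(frozenset(phi) & frozenset(e)) / P(e)
    return P_given_e

P_given_rain = condition(P, RAIN)
HOT = frozenset({"rain_hot", "sun_hot"})
print(P_given_rain(HOT))        # 1/3: only rain_hot survives the cut
print(P(HOT & RAIN) / P(RAIN))  # same value, by the ratio definition
```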
So, the foregoing argument to the effect that all evidence is knowledge goes through. As a reminder, the thought is that one may not discard a possibility which, for all one knows, actually obtains. But the nature of evidence is that one is always in a position to condition on it. So, if something is not known and so may not be conditioned on, then it is not evidence. Contraposing, all evidence is knowledge.
This way of specifying the core function of evidence automatically yields the result that all evidence is propositional: if something is not a proposition (formally, an event), then it cannot be conditioned on, and so cannot be evidence in this sense. Of course, one may take less orthodox views of updating, such as ‘generalizing’ conditionalization to Jeffrey conditionalization; but this adds massive complication to the updating procedure for no real gain in generality, since we can always enrich the probability space to mimic the effect of Jeffrey conditioning using standard conditioning (a sketch follows). See also Jaynes (2002).
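Here is a hedged sketch of the enrichment trick, continuing the toy Python example (the auxiliary ‘signal’ worlds, the scaling constant, and the function names are illustrative assumptions, not a canonical construction): a Jeffrey update that shifts P(RAIN) to q coincides with ordinary conditioning on a suitably chosen signal event in an enriched space.

```python
def jeffrey_update(P, E, W, q):
    """Jeffrey conditioning on the partition {E, not-E}, shifting P(E) to q (0 < q < 1)."""
    notE = frozenset(W) - frozenset(E)
    def P_new(phi):
        phi = frozenset(phi)
        return (q * P(phi & frozenset(E)) / P(E)
                + (1 - q) * P(phi & notE) / P(notE))
    return P_new

def enrich_then_condition(P, E, W, q):
    """Mimic the Jeffrey update by ordinary conditioning on an enriched space.

    Each world w is split into (w, 'signal') and (w, 'no_signal'); the chance
    of the signal depends only on whether w is in E, with likelihoods scaled
    so that conditioning on the signal moves the probability of E to q.
    """
    notE = frozenset(W) - frozenset(E)
    c = min(P(E) / q, P(notE) / (1 - q))        # keeps both likelihoods in [0, 1]
    a, b = c * q / P(E), c * (1 - q) / P(notE)  # signal likelihood in E- and notE-worlds
    signal_weight = {w: weights[w] * (a if w in E else b) for w in W}
    # (the (w, 'no_signal') parts are omitted: conditioning on the signal discards them)
    def P_new(phi):
        # Condition the enriched measure on the signal event, then marginalize
        # back down to the original worlds.
        return sum(signal_weight[w] for w in phi) / sum(signal_weight.values())
    return P_new

q = 0.6  # shift the probability of RAIN from 0.375 to 0.6
print(jeffrey_update(P, RAIN, W, q)(HOT))         # ≈ 0.44
print(enrich_then_condition(P, RAIN, W, q)(HOT))  # ≈ 0.44 again
```

The two outputs agree because conditioning the enriched measure on the signal reweights the E-worlds and the ¬E-worlds in the ratio q : (1 − q), which is exactly the reweighting that Jeffrey conditioning prescribes.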
II. All knowledge is evidence
Given the previous section, we have that evidence is a subset of knowledge. Considerations of theoretical economy point against taking evidence to be a special, strict subset of knowledge: the equation of evidence with knowledge looks like the natural default hypothesis. Thus, even without much positive argument for the claim that all knowledge is evidence, we might tentatively adopt it if we accept that all evidence is knowledge.
However, we can offer positive arguments to the effect that all knowledge is evidence. Another way of putting the functional role of evidence is that evidence supports those hypotheses which successfully predict it. Suppose we have two events h1 and h2 representing two competing hypotheses. By Bayes’ Theorem, P(h1 | e) > P(h2 | e) just in case P(e | h1)P(h1) > P(e | h2)P(h2). So, if the two hypotheses start off equally plausible, we should prefer the one which better predicts our evidence.
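For completeness, the step via Bayes’ Theorem can be spelled out as follows (assuming P(e) > 0 and P(h1), P(h2) > 0, which the original statement takes for granted): the posteriors compare as their Bayes numerators do, so equal priors hand the decision to the likelihoods.

```latex
P(h_1 \mid e) > P(h_2 \mid e)
  \iff \frac{P(e \mid h_1)\,P(h_1)}{P(e)} > \frac{P(e \mid h_2)\,P(h_2)}{P(e)}
  \iff P(e \mid h_1)\,P(h_1) > P(e \mid h_2)\,P(h_2).
```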
Now, suppose that two competing hypotheses are equally well-positioned, except that the first better predicts something that we know. Then we have reason to prefer the first hypothesis: given what we know, it is more plausible. This applies to any piece of knowledge. But a hypothesis gains nothing by predicting something which isn’t among our evidence. Put another way, if something lends support to hypotheses which predict it, then it thereby counts as evidence. So, what is known must count as evidence. Thus, all knowledge is evidence.
The original description of the functional role of evidence also supports the claim that all knowledge is evidence. While one may not properly ignore possibilities which, for all one knows, are actual, one may properly ignore possibilities inconsistent with what one knows. If one really knows that some possibility does not obtain, then one does no deep wrong in discarding it from consideration.
Of course, there might be practical reasons not to discard possibilities one knows aren’t actual. One can entertain counterfactual situations; one can temporarily set aside some piece of disputed knowledge in order to convince an interlocutor; one can proceed with caution when uncertain about whether one really does know. Admittedly, being too cautious might mean failing to retain one’s outright belief and so losing one’s knowledge. But these practical reasons apply equally to evidence. One can entertain scenarios inconsistent with one’s evidence, or set aside disputed pieces of evidence, or proceed with caution when something’s evidential status is uncertain. If this is right, then not only do we defuse the objection that there are practical limitations on knowledge, but we also gain a positive argument for equating knowledge and evidence: in particular, the practical limitations on knowledge bear a striking resemblance to the practical limitations on evidence.
One might deny that we have a fixed body of evidence which becomes inadmissible in some circumstances due to practical limitations. Perhaps the limitations on evidence are not practical; one simply lacks evidence in these cases (although one retains knowledge). For instance, perhaps evidence depends essentially on being used as evidence for some claim. That is, nothing is evidence full stop, only evidence for this or that. (And so inadmissible evidence is not really evidence.) But this does not constitute a deep disagreement with the E=K thesis. If this picture of evidence is correct, then E=K should be understood as equating knowledge with potential evidence: all and only knowledge can constitute evidence for various claims.
The objector may reply with the stronger claim that what can constitute evidence for various claims also varies. For instance, perhaps I can’t use my knowledge that I’ll graduate in 2028 as evidence which guarantees that I won’t need life insurance before then. Then it seems like evidence comes and goes with something like the stakes of deliberation. The first response to this position is to note that, on some subject-sensitive views of knowledge, knowledge comes and goes in exactly the same way. A second response is to simply deny that evidence comes and goes. If I do know that I’ll graduate in 2028, then I do in fact have evidence guaranteeing that I won’t need life insurance before then, but that evidence is inadmissible for some reason or another. Perhaps, when the stakes are high, it would be prudent to rely only on evidence which I know that I have. Similarly, it would be prudent to rely only on knowledge which I know that I have. If this is right, then the striking resemblance between practical limitations on knowledge and evidence holds.
One way of explaining this resemblance, and furthering the case for the equation of evidence and knowledge, is to take personal deliberation as something like an internalization of interpersonal deliberation. That is, individual and collective reasoning are structurally similar; but we tend to think about premises of collective reasoning under the guise of evidence, and premises of individual reasoning under the guise of knowledge. However, it’s plausible that evidence in collective reasoning constitutes (collective) knowledge, and that knowledge in individual reasoning constitutes (individual) evidence.
III. Objections
I end by treating some objections to the thesis that all and only knowledge constitutes evidence.
The first objection is that the thesis trivializes evidence. All knowledge seems self-justifying, in the sense that one always has perfect evidence guaranteeing what one knows, simply in virtue of knowing it. One response is to appeal to the notion of independent evidence. Rules of discourse usually require adducing different evidence for the claim under discussion, rather than blandly repeating the claim. So, the self-justification of knowledge only looks bad because it would be bad form in discourse, not because there is anything deeply objectionable. But another (compatible) response is to note that some knowledge is self-justifying after all. As is well-known, Hesperus the evening star just is Phosphorus the morning star (as both ‘Hesperus’ and ‘Phosphorus’ are names for the planet Venus). The fact that Hesperus appears in the evening (in addition to the identity claim) is perfectly good evidence for the fact that Phosphorus appears in the evening, despite the fact that (on some views) these two facts are identical.
The second objection points to the acceptability of sentences such as:
(1) I don’t know whether Claude stole it, but the evidence is that he did.
(2) I don’t have evidence that Claude stole it, but I know that he did.
If we take these at face value, the first suggests that some evidence isn’t knowledge, while the second suggests that some knowledge isn’t evidence. But for the first case, we should note that there’s something suspect about the phrase ‘the evidence is that’. Certainly, my claim being that p requires that p is part of my claim, not merely that p is supported by my claim. But the evidence being that p does not seem to require the same; in particular, ‘the evidence is that p’ seems closer to ‘the evidence suggests that p’ than to ‘that p is among the evidence’—certainly, ‘I don’t know whether Claude stole it, but that he did is among the evidence’ sounds much worse. For the second case, we must appeal to something like the response to the first objection. That is, it only looks like one has no evidence because adducing such trivial evidence would be unacceptable; ‘evidence’ in (2) must be elliptical for something like ‘admissible evidence’.
The third and final objection takes evidence to be a strict subset of knowledge. To avoid skepticism, we should grant that knowledge can be gained from ampliative inference. After seeing enough red balls drawn from some urn, one might ‘take the plunge’ and come to know that all the balls in the urn are red. After seeing a disproportionate number of Heads, one might ‘take the plunge’ and come to know that a coin is biased. But if such inferential knowledge can further serve as evidence in its own right, then we might allow a suspicious sort of bootstrapping. On the previously sketched picture of conditionalization, taking the plunge looks like choosing some event with sufficiently high conditional probability and throwing out the few worlds where it doesn’t obtain. This looks bad enough already. But if we keep doing this, eventually the small risks of error may add up to a large risk of error. Yet, since our base of evidence keeps growing as we gain more knowledge, it looks like the later inferences are actually more secure than the earlier ones. This result seems troubling; it might be better to restrict evidence to non-inferential knowledge.
Here is a multi-part response. The first thing to note is that ‘taking the plunge’ does not amount to simply ruling out events with negligibly low chance; rather, it involves ruling out events with no close chance of obtaining. Thus, error-risk does not accumulate in the suspicious way. Of course, all else being equal, things with low chance are less likely to have any close chance of obtaining (since a higher chance means more ways of coming close to obtaining), so there is a connection between no-high-chance probabilistic safety and no-close-chance modal safety. Spelling out the latter more precisely is somewhat difficult (though see ‘Probability and Danger’), but the objector also faces the challenge of spelling out which knowledge counts as too inferential to be evidence. Depending on how this is done, the objector may end up ruling out much of our knowledge, leaving us with very little evidence to work with: perhaps the vast majority of our knowledge is inferential in some sense. Further, although this objection sounds worrying in the abstract, it’s hard to see any really difficult cases. We might expect that it will often look either like one really does have inferential knowledge, in which case it will be fine to treat such knowledge as evidence, or else like the inference is too shaky for the conclusion to be known in the first place. Speaking abstractly allows the objector to illicitly run these two possibilities together, giving the false impression that there are cases where some inference is solid enough to yield knowledge but too shaky to yield evidence. Finally, we have the fallback move of treating some evidence as inadmissible for justifying further inference, thereby blocking any suspicious bootstrapping. The upshot is that we already have many resources for explaining away this worry, so there is no need to make sharp changes to our theory in response to it.
References
- Jaynes, E. T. (2002). Probability theory: The logic of science (G. L. Bretthorst, Ed.). Cambridge University Press.
- McGlynn, A. (2014). Knowledge first? Palgrave Macmillan.
- Pritchard, D., & Greenough, P. (Eds.). (2009). Williamson on knowledge. Oxford University Press.
- Williamson, T. (2000). Knowledge and its limits. Oxford University Press.
- Williamson, T. (2009). Probability and danger. The Amherst Lecture in Philosophy, 1–35.
- Williamson, T. (2024). E = K, but what about R? In M. Lasonen-Aarnio & C. Littlejohn (Eds.), The Routledge handbook of the philosophy of evidence. Routledge.