Probability and Frequency
Alexei Gilchrist
1 Making use of frequencies
If the problem has built-in frequencies of events or objects then we can use this information to determine the numerical value of probabilities. Following Jaynes, think of an urn with marbles (I know, very classical). The marbles are numbered \(1,\ldots,n\) and there are \(m\) green marbles and the rest are white. Say we draw a marble ‘randomly’ and look at its colour: what is the probability \(P(g|I)\) that it will be green?
First, by ‘draw randomly’ I mean any method that respects a certain symmetry, which is that I should not be able to tell if someone sneakily renumbered all the marbles in a different order. In other words, the method of drawing a marble should be indifferent to which marble is drawn. The symmetry means that the probability I assign to any particular marble being drawn should be invariant under renumbering (or I’d be able to tell if they had been renumbered). The only way I can satisfy this symmetry is by assigning equal probabilities, and since the total probability must be unity, the probability of drawing marble \(j\) is \(P(j|I)=1/n\).
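Now back to the example. What is the probability of drawing a green marble? Drawing a green marble means drawing one of the \(m\) green ones, and since these are mutually exclusive possibilities the sum rule gives
\[P(g|I)=\sum_{j\,\in\,\text{green}}P(j|I)=\frac{m}{n}.\]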
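As a quick check on the counting argument for the urn, here is a minimal Python sketch. The function names `prob_green` and `simulate_green_frequency` are illustrative, and the choice that marbles \(1,\ldots,m\) are the green ones is an arbitrary labelling assumed for the simulation; by the symmetry argument it should not matter.

```python
from fractions import Fraction
import random


def prob_green(n, m):
    """Add the equal per-marble probability 1/n over the m green marbles."""
    per_marble = Fraction(1, n)  # P(j|I) = 1/n from the symmetry argument
    return sum((per_marble for _ in range(m)), Fraction(0))


def simulate_green_frequency(n, m, draws, seed=0):
    """Observed frequency of green in repeated draws (with replacement)."""
    rng = random.Random(seed)
    # Assumption for illustration: marbles numbered 1..m are the green ones.
    green = sum(1 for _ in range(draws) if rng.randint(1, n) <= m)
    return green / draws


p = prob_green(10, 3)
freq = simulate_green_frequency(10, 3, 100_000)
print("assigned probability:", p)
print("simulated frequency: ", freq)
```

The exact value comes out as the fraction \(m/n\), and the simulated frequency should land close to it for a large number of draws.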
The argument can be extended in a natural way to more complicated situations where the background information \(I\) implies that we are counting various members of sets. So if there is frequency data available in the problem we can make use of it to assign numerical values to probabilities.
2 Probability of frequencies
It’s perfectly fine to talk about the probability of a frequency. Say we envision some experiment which is to be repeated \(n\) times and are interested in the probability of obtaining the frequency \(m/n\) of some interesting outcome. To make things more tractable let’s assume that the result in each experiment is independent of the other results, and that the probability of the outcome we’re interested in is \(a\) in each experiment. With these assumptions the probability of obtaining the outcome \(m\) times out of \(n\), and hence the frequency \(m/n\), is simply the Binomial distribution:
\[P(m|I)=\binom{n}{m}a^{m}(1-a)^{n-m},\qquad m=0,1,\ldots,n.\]
These probabilities are just the terms in a Binomial expansion, so the distribution is normalised:
\[\sum_{m=0}^{n}\binom{n}{m}a^{m}(1-a)^{n-m}=\big(a+(1-a)\big)^{n}=1.\]
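The normalisation can also be checked numerically. The sketch below is only an illustration: the helper name `binomial_pmf` and the values \(n=10\), \(a=3/10\) are assumptions chosen for the example, and exact rational arithmetic is used so that the sum is not blurred by floating-point rounding.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(m, n, a):
    """Probability of seeing the outcome m times in n independent trials."""
    return comb(n, m) * a**m * (1 - a) ** (n - m)


n = 10                 # illustrative number of repetitions
a = Fraction(3, 10)    # illustrative single-trial probability

# One probability per possible frequency m/n, for m = 0..n.
pmf = [binomial_pmf(m, n, a) for m in range(n + 1)]
total = sum(pmf)

for m, p in enumerate(pmf):
    print(f"frequency {m}/{n}: {float(p):.4f}")
print("total probability:", total)
```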
Note that for any given \(n\) there is not a single ‘true’ frequency; rather there is a range of frequencies with high probability. While we can report just the highest-probability frequency if there is a unique one, it’s more informative to report a credible region, which in this case is a range of the most probable frequencies that together account for some fraction of the total probability, like 80%.
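One way to compute such a credible region is sketched below. This is a rough illustration under stated assumptions rather than a definitive recipe: the function name `credible_region` is made up for the example, the values \(n=100\) and \(a=0.3\) are arbitrary, and the region is built greedily by adding frequencies in order of decreasing probability until the requested fraction is covered. Because the Binomial distribution has a single peak, the chosen frequencies form a contiguous range.

```python
from math import comb


def binomial_pmf(m, n, a):
    """Probability of seeing the outcome m times in n independent trials."""
    return comb(n, m) * a**m * (1 - a) ** (n - m)


def credible_region(n, a, mass=0.8):
    """Smallest set of most probable frequencies covering at least `mass`.

    Returns (lowest frequency, highest frequency, probability covered).
    """
    probs = [binomial_pmf(m, n, a) for m in range(n + 1)]
    # Visit the counts m from most probable to least probable.
    order = sorted(range(n + 1), key=lambda m: probs[m], reverse=True)
    chosen, total = [], 0.0
    for m in order:
        chosen.append(m)
        total += probs[m]
        if total >= mass:
            break
    return min(chosen) / n, max(chosen) / n, total


lo, hi, covered = credible_region(100, 0.3, mass=0.8)
print(f"frequencies from {lo} to {hi} cover {covered:.3f} of the probability")
```

Since the frequencies are discrete, the covered probability will generally overshoot the requested fraction slightly rather than hit it exactly.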