The issue here is precisely that, for all practical purposes, the RNG is, well... random.
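While the observed frequency converges to the expected probability if you have enough samples, for the individual player the number of trials needed to see a single success can fall anywhere between 1 and 1000. The lower the probability, the more samples are needed before the results converge.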
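
To put rough numbers on that spread, here is a minimal sketch that assumes every trial is independent with the same success chance p (the function names are made up for illustration). The chance of seeing no success at all in n trials is then (1 - p)^n:

```python
import math


def prob_no_success(p: float, n: int) -> float:
    """Chance of zero successes in n independent trials, each with chance p."""
    return (1.0 - p) ** n


def trials_for_confidence(p: float, confidence: float) -> int:
    """Smallest n such that P(at least one success in n trials) >= confidence.

    Assumes 0 < p < 1 and 0 < confidence < 1.
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    # A 1% chance: how unlucky can a player be after 100 tries?
    print(f"P(nothing in 100 tries at 1%) = {prob_no_success(0.01, 100):.3f}")
    print(f"Tries needed for a 95% chance = {trials_for_confidence(0.01, 0.95)}")
```

With a 1% chance, roughly a third of players still have nothing after 100 tries, and it takes close to 300 tries before 95% of players have succeeded at least once.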

To keep players sane and to maintain perceived fairness, they SHOULD introduce some form of bias that prevents both "too good" and "too bad" occurrences.
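
One possible way to introduce such a bias is sketched below; it is not a description of how any particular game does it. The class name `BoundedDropRoller` and the parameters `min_gap` and `max_gap` are hypothetical. The idea is to roll normally, but to force a failure if the last success was too recent and to force a success once a streak of failures gets too long:

```python
import random


class BoundedDropRoller:
    """Random rolls with a floor and a ceiling on the gap between successes.

    Hypothetical illustration: p is the base chance per roll, min_gap is the
    number of rolls after a success that always fail, and max_gap is the roll
    at which a success is forced.
    """

    def __init__(self, p, min_gap, max_gap, rng=None):
        if not 0.0 < p <= 1.0:
            raise ValueError("p must be in (0, 1]")
        if not 0 <= min_gap < max_gap:
            raise ValueError("need 0 <= min_gap < max_gap")
        self.p = p
        self.min_gap = min_gap
        self.max_gap = max_gap
        self.rng = rng or random.Random()
        self.since_last = 0  # rolls made since the last success

    def roll(self) -> bool:
        self.since_last += 1
        if self.since_last <= self.min_gap:
            return False  # too soon after the last success: blocks "too good"
        if self.since_last >= self.max_gap:
            success = True  # streak too long: blocks "too bad"
        else:
            success = self.rng.random() < self.p
        if success:
            self.since_last = 0
        return success


if __name__ == "__main__":
    roller = BoundedDropRoller(p=0.05, min_gap=3, max_gap=40)
    results = [roller.roll() for _ in range(200)]
    print(f"{sum(results)} successes in {len(results)} rolls")
```

Note that clamping the gaps like this changes the effective success rate, so it no longer equals p exactly; in practice p would have to be tuned until the long-run rate matches the advertised one.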