Ah, yeah, only 40 trials would be a weak sample; that's roughly a 15% margin of error. Slightly bad luck (6 procs instead of 8) would put the observed rate at 15% instead of 20%.
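
For anyone who wants to check that number, here's a rough sketch of the arithmetic. It assumes a 95% confidence level and the usual normal approximation with the worst-case proportion of 0.5, which is my guess at where the ~15% comes from; the function name and defaults are just for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion (normal approximation).

    Assumptions: z=1.96 means 95% confidence, and p=0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)

trials = 40
worst_case = margin_of_error(trials)            # about 0.155
at_true_rate = margin_of_error(trials, p=0.20)  # about 0.124 if the real rate is 20%

expected_rate = 8 / trials   # 8 procs in 40 trials -> 0.20
unlucky_rate = 6 / trials    # 6 procs in 40 trials -> 0.15

print(f"worst-case margin: {worst_case:.1%}")
print(f"margin at p=0.20:  {at_true_rate:.1%}")
print(f"expected rate: {expected_rate:.0%}, unlucky rate: {unlucky_rate:.0%}")
```

Since the margin shrinks with the square root of the sample size, cutting it in half takes about four times as many trials (around 160 here).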

OK, I think I know how I'll set up the test, then. I won't worry so much about the damage.