View Full Version : defense, attack, damage



MarkovChain
02-17-2012, 09:54 AM
I didn't know where to put this, so I put it here as it's battle related. I ran a bunch of tests on the test server to describe the distribution of melee hits between min damage and max damage, assuming those are given.

Let r=(your attack)/(mob's defense) [not bothering with level correction]

First case 0.75< r <1.25

Call a and b the minimum and maximum damage, and DMG your total weapon damage (including fstr, for those who know what it is). If you know what the 5% "randomizer" is, those are the values before accounting for it. The law of the random damage generated by the game is

d=((1-U)*V1+U*DMG)*(1+e)

e = uniform between 0 and 0.05; this is the so-called randomizer.
DMG = your weapon damage
U = a Bernoulli variable taking 2 values, 0 or 1, whose parameter is unknown so far
V1 = uniform on [a,b]

In other words
IF U=1 THEN d=DMG*(1+e) ELSE d=V1*(1+e) END IF;
is a simple piece of code that generates this random variable.
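For anyone who wants to simulate it, here is a minimal Python sketch of the same generator. The function name sample_hit, the example values (a=34, b=61, DMG=47) and q=0.3 are assumptions for illustration only (q is the unknown Bernoulli parameter), and the flooring the game applies to damage values is left out.

```python
import random

def sample_hit(a, b, dmg, q, rng=random):
    """Draw one hit from d = ((1-U)*V1 + U*DMG) * (1+e).

    a, b : min and max damage before the randomizer
    dmg  : total weapon damage (including fstr)
    q    : assumed probability that U = 1 (hit lands on the spike)
    """
    e = rng.uniform(0.0, 0.05)      # the 5% randomizer
    if rng.random() < q:            # U = 1: damage is the weapon damage
        return dmg * (1.0 + e)
    v1 = rng.uniform(a, b)          # U = 0: uniform roll on [a, b]
    return v1 * (1.0 + e)

# Example values only; fixed seed so the run is repeatable.
rng = random.Random(12345)
hits = [sample_hit(34, 61, 47, 0.3, rng) for _ in range(100_000)]
spike_share = sum(47 <= h <= 47 * 1.05 for h in hits) / len(hits)
print(round(sum(hits) / len(hits), 2), round(spike_share, 3))
```

The two printed numbers are the simulated average hit and the share of hits landing in [DMG, DMG*1.05].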

Notes: this is obtained from data. When fighting EM sandsweepers in aby-altepa, if your def is in the correct range then there is a fixed interval of values (the same interval regardless of your attack) that have high frequencies, and they are always DMG to DMG+5%. For instance, I was on pup and had 104 total damage including fstr, and I would find that the values 104,105,106,107,108,109 happen nearly 30-50% of the time.

This implies the following form for the distribution of the random damage 'd'

http://i189.photobucket.com/albums/z315/pchann/D104_ATT429.jpg

The average damage is therefore

d*=(1-q)*1.025*(b+a)/2+q*DMG*1.025 where q is the parameter of the Bernoulli U

The proportion of values lying in the interval [DMG,DMG*1.05] seems constant at about 35% and doesn't seem to vary much with attack. With the above model, this proportion is

q+(1-q)*ln(1.05)*DMG/(b-a).
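The two formulas are easy to evaluate numerically. The sketch below is only an illustration: the function names are made up, q=0.3 is an assumed value, and a=34, b=61, DMG=47 are example values. The spike formula also assumes that [DMG, DMG*1.05] sits entirely inside [1.05*a, b].

```python
import math

def mean_damage(a, b, dmg, q):
    """Average hit: 1.025 * ((1-q)*(a+b)/2 + q*DMG)."""
    return 1.025 * ((1 - q) * (a + b) / 2 + q * dmg)

def spike_proportion(a, b, dmg, q):
    """Share of hits in [DMG, DMG*1.05]: q + (1-q)*ln(1.05)*DMG/(b-a)."""
    return q + (1 - q) * math.log(1.05) * dmg / (b - a)

# Example values only (q = 0.3 is an assumption).
print(mean_damage(34, 61, 47, 0.3))
print(spike_proportion(34, 61, 47, 0.3))
```

With these inputs the average comes out at about 48.5 and the spike proportion at about 0.36, in line with the roughly 35% seen in the data.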

Can provide data if needed.


Second case 0.5< r <0.75
In this case DMG is equal to b and we still have:
d=((1-U)*V1+U*DMG)*(1+e)

The frequency spike happens at the upper bound of the damage range (b=DMG so the actual maximum observed is DMG*1.05).


Third case r>1.5
In this case DMG lies outside of [a,b] (DMG<=a) and there are no high-frequency values, so it's a natural assumption that the roll U is always equal to 0. It means that the parameter of the Bernoulli U is likely zero when r>=1.5.

Example

http://i189.photobucket.com/albums/z315/pchann/counterstance.jpg
(only the shape of the graph is relevant)

How to find a mob's damage
Easy. Get hit a number of times with a reasonable def and look at the lowest value of the frequency spike. For example, the chigoes in Caedarva Mire have about 70 damage as their "weapon".

**assuming mob's pdif works the same**
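Reading the spike off a parse can be automated. The sketch below is only one possible way to do it: it flags every damage value whose count is well above the median count, and the cutoff factor of 3 is an arbitrary assumption. It is run here on a player-vs-mob data set posted further down the thread (H2H404_ATT350_STR91_D0), purely as an illustration; hits taken from a mob would be handled the same way.

```python
from statistics import median

def find_spike(hist, factor=3.0):
    """Return the damage values whose count exceeds factor * median count.

    hist   : {damage value: number of hits}
    factor : arbitrary cutoff (an assumption, tune to taste)
    """
    cutoff = factor * median(hist.values())
    return sorted(v for v, n in hist.items() if n > cutoff)

# Counts for damage values 20..49 from H2H404_ATT350_STR91_D0.
counts = [7, 17, 36, 39, 36, 35, 35, 34, 41, 40, 36, 48, 32, 37, 44,
          33, 33, 42, 50, 33, 38, 28, 47, 36, 33, 37, 39, 211, 186, 187]
hist = {20 + i: n for i, n in enumerate(counts)}

spike = find_spike(hist)
print(spike, "-> base damage estimate:", spike[0])
```

The lowest flagged value is the damage estimate, since the spike covers DMG to DMG+5%.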

How to find a mob's VIT
Check what the lowest value of the frequency spike is and reverse the fstr formula.

MarkovChain
02-21-2012, 08:02 PM
Edited the previous post as the behavior is changing for 1.25<r<1.5 and 0.5<r<0.75

Case r>1.5
Same as above, d=(1+e)*V1

So it's roughly uniform

Case 1.25<r<1.5
In this case the equation still holds
d=((1-U)*V1+U*DMG)*(1+e)
because in this case the true minimum damage (min pdif) equals the weapon DMG, so the distribution has a spike at its lower bound.
However, the spike frequency decreases as r grows towards 1.5, at which point there is no more spike (assuming therefore that the parameter of the Bernoulli is equal to 0 at 1.5 pdif).

What remains is to examine the variation of the parameter of U with r. I suspect that this parameter p is 0 for r>1.5 and r<0.5, which is basically when the weapon DMG doesn't appear among the hit values, so the formula still holds. However, monster vs player data seems to indicate that at very low pdif a frequency spike appears at the lower bound of the damage values, so something else could be happening there (flooring).

VZX
02-22-2012, 01:42 PM
Sorry... I didn't see where you define a and b, as well as p. Mind elaborating those points?
The model offered seems plausible to generate the spiking behavior.

MarkovChain
02-22-2012, 06:36 PM
I'm not interested in finding a and b. For that, look at the threads on BG. Knowing a and b, aka the min and max pdif values (give or take 5%), isn't going to help you model damage.

p is the parameter of the Bernoulli. It is unknown what exactly it depends on, though it is likely a function of the ratio. It decreases between 1.25 and 1.5 and is 0 at 1.5 pdif. I need larger samples per data point to estimate it correctly. p is responsible for the "spike" around the DMG value +0~5%.

Masamunai
03-01-2012, 07:27 AM
Which samples do you need, and at which cRatio? I made a large number of parses which may help you.

I have questions about your reasoning:
- what tells you the primary pdif (before the 5% randomizer is applied) follows a Bernoulli-type law? (and not a uniform or Gaussian law, for example)
- what tells you the "thresholds" are 0.5 and 0.75? (and not 0.43 or 0.71?) 1.25 and 1.5 were already known, so np.
- can you post your "data" somewhere please? (because sandsweepers using Berserk can screw things up)

MarkovChain
03-01-2012, 07:44 AM
The primary pdif is not a Bernoulli, and honestly I don't trust SE to generate anything but uniform laws. The primary pdif is

(1-U)*V1+U*DMG, with V1 uniform on [a,b] and U a Bernoulli. Why? Because it's basically not a bell curve, and outside of the spike the frequencies look about the same. Also, all the spikes happen between DMG and DMG*1.05 whatever you do, so it's not Gaussian. The reason the frequencies within the spike are not equal is flooring, which makes certain values less or more likely.

Also I posted this comparison on the wiki :

http://images3.wikia.nocookie.net/__cb20120229152334/ffxi/images/f/f3/Pdif_r094.jpg

All the data between 0.75 and 1.25 follow the same trend except for the position of spike due to a & b changing.

Below 0.75 it is only an assumption on my part since data is hard to get; the lowest I have is 0.59 pdif. For me it's clear that the limits are determined by the ratio values at which min pdif=1 (assuming 1.5) and max pdif=1 (assuming 0.5, since it's before the randomizer).

MarkovChain
03-01-2012, 08:01 AM
data under 0.75 pdif, everything is done on PUP without any AF3 or close to naked THF
(format : melee hit value : frequency, H2Hskill_strvalue_attvalue_damagevalue on H2Hwpn)
H2H300_STR94_att273_D0:= [[9,11],[10,15],[11,9],[12,13],[13,17],[14,21],[15,21],[16,18],[17,20],[18,25],[19,15],[20,10],[21,22],[22, 8 ],[23,28],[24,15],[25,13],[26,20],[27,20],[28,24],[29,19],[30,18],[31,24],[32,19],[33,14],[34,25],[35,19],[36,12],[37,35],[38,30]]:
H2H300_STR134_att288_D0:=[[17,1],[18,8],[19,3],[20,4],[21,5],[22,7],[23,7],[24,5],[25,7],[26,5],[27,4],[28,9],[29,7],[30,4],[31,12],[32,5],[33,10],[34,11],[35,6],[36,7],[37,4],[38,7],[39,2],[40,11],[41,5],[42,7],[43,10],[44,26],[45,15],[46,19]]:

data 0.75<r<1.25 pdif,
H2H404_ATT350_STR91_D0:=[[20, 7], [21, 17], [22, 36], [23, 39], [24, 36], [25, 35], [26, 35], [27, 34], [28, 41], [29, 40], [30, 36], [31, 48], [32, 32], [33, 37], [34, 44], [35, 33], [36, 33], [37, 42], [38, 50], [39, 33], [40, 38], [41, 28], [42, 47], [43, 36], [44, 33], [45, 37], [46, 39], [47, 211], [48, 186], [49, 187]]
H2H404_ATT360_STR116_D39:=[[45,6],[46,2],[47,7],[48,3],[49,6],[50,3],[51,2],[52,6],[53,9],[54,2],[55,6],[56,1],[57,2],[58,4],[59,2],[60,2],[61,3],[62,3],[63,2],[64,4],[65,2],[66,3],[67,2],[68,7],[69,4],[70,4],[71,7],[72,3],[73,3],[74,5],[75,6],[76,3],[77,3],[78,2],[79,4],[80,5],[81,5],[82,4],[83,2],[84,6],[85,2],[86,5],[87,5],[88,4],[89,3],[90,4],[91,19],[92,19],[93,27],[94,21],[95,19],[96,2],[97,2]]
H2H427_ATT399_STR176_D42:=[[59, 2], [60, 5], [61, 4], [62, 3], [63, 14], [64, 6], [65, 5], [66, 9], [67, 5], [68, 6], [69, 4], [70, 9], [71, 4], [72, 4], [73, 6], [74, 7], [75, 4], [76, 5], [77, 4], [78, 12], [79, 4], [80, 3], [81, 5], [82, 4], [83, 5], [84, 5], [85, 9], [86, 9], [87, 9], [88, 4], [89, 8], [90, 7], [91, 6], [92, 8], [93, 6], [94, 10], [95, 9], [96, 15], [97, 5], [98, 7], [99, 3], [100, 8], [101, 3], [102, 10], [103, 6], [104, 45], [105, 46], [106, 44], [107, 37], [108, 29], [109, 43], [110, 8], [111, 12], [112, 10], [113, 13], [114, 7], [115, 6], [116, 4], [117, 8], [118, 5], [119, 6], [120, 9], [121, 7], [122, 0], [123, 2]]
H2H427_ATT429_STR176_D42:=[[67,3],[68,6],[69,6],[70,13],[71,9],[72,7],[73,9],[74,10],[75,14],[76,16],[77,5],[78,4],[79,7],[80,4],[81,12],[82,8],[83,8],[84,12],[85,6],[86,7],[87,7],[88,8],[89,12],[90,8],[91,10],[92,9],[93,7],[94,7],[95,7],[96,9],[97,16],[98,8],[99,15],[100,8],[101,12],[102,13],[103,4],[104,56],[105,48],[106,46],[107,54],[108,61],[109,52],[110,10],[111,7],[112,8],[113,7],[114,7],[115,10],[116,10],[117,12],[118,11],[119,9],[120,4],[121,10],[122,9],[123,10],[124,9],[125,3],[126,6],[127,5],[128,5],[129,3]]:
H2H404_ATT467_STR91_D0:=[[34,9],[35,28],[36,38],[37,29],[38,40],[39,31],[40,37],[41,22],[42,41],[43,33],[44,42],[45,41],[46,32],[47,205],[48,190],[49,158],[50,35],[51,30],[52,33],[53,28],[54,26],[55,34],[56,30],[57,31],[58,26],[59,37],[60,33],[61,29],[62,16],[63,13],[64,9]]:
H2H427_ATT572_STR176_D42:=[[104, 88], [105, 88], [106, 108], [107, 104], [108, 110], [109, 113], [110, 19], [111, 20], [112, 20], [113, 26], [114, 7], [115, 25], [116, 18], [117, 26], [118, 9], [119, 20], [120, 15], [121, 13], [122, 13], [123, 20], [124, 17], [125, 16], [126, 24], [127, 24], [128, 16], [129, 28], [130, 13], [131, 16], [132, 17], [133, 16], [134, 21], [135, 15], [136, 17], [137, 33], [138, 30], [139, 22], [140, 19], [141, 11], [142, 12], [143, 15], [144, 15], [145, 17], [146, 20], [147, 21], [148, 27], [149, 33], [150, 23], [151, 9], [152, 26], [153, 22], [154, 17], [155, 19], [156, 24], [157, 19], [158, 13], [159, 13], [160, 28], [161, 18], [162, 17], [163, 20], [164, 12], [165, 22], [166, 10], [167, 5], [168, 7], [171, 5]]:

data above r=1.5
H2H427_ATT715_STR176_D42:=[[112,2],[113,1],[114,7],[115,9],[116,8],[117,6],[118,7],[119,17],[120,8],[121,14],[122,9],[123,9],[124,5],[125,6],[126,8],[127,5],[128,9],[129,6],[130,9],[131,8],[132,9],[133,17],[134,11],[135,5],[136,4],[137,6],[138,11],[139,12],[140,9],[141,4],[142,6],[143,11],[144,11],[145,7],[146,6],[147,4],[148,8],[149,11],[150,10],[151,6],[152,4],[153,12],[154,7],[155,9 ],[156,13],[157,5],[158,7],[159,10],[160,9],[161,8],[162,11],[163,8],[164,6],[165,4],[166,4],[167,12],[168,9],[169,7],[170,6],[171,8],[172,13],[173,8],[174,9],[175,11],[176,8],[177,11],[178,5],[179,13],[180,7],[181,16],[182,9],[183,7],[184,6],[185,3],[186,5],[187,8],[188,15],[189,8],[190,7],[191,10],[192,11],[193,4],[194,8],[195,8],[196,5],[197,11],[198,9],[199,10],[200,15],[201,8],[202,9],[203,3],[204,6],[205,4],[206,5],[207,3],[208,2]]:
Got other data for high pdif but it's monster vs player.

Masamunai
03-01-2012, 05:07 PM
Nice !

Thanks for the explanations and data (where are crits?).

What are the sandsweepers' Defense and VIT? How did you filter out hits landed during their Berserk?

Did you get a look at the tests done at the Mathy BG forum? Some data might help. You did not answer my first question: which tests do you need (at which cRatio)?

MarkovChain
03-01-2012, 06:20 PM
EM sandsweepers have 457 def. Idk about their VIT, but it should be easy to get from the data itself. The first data set, for H2H=300 skill, no weapon, 94 STR, gives you a base damage of 37. As 0.11*300+3=36, you are @ +1 fstr. Now the data H2H404_ATT350_STR91_D0 gives 47 damage, while 404*0.11+3=47 (rounded down), so that's +0 fstr. They have between 101 and 104 VIT, I guess? The exact value doesn't matter, as your total damage can be read off the graphs.
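The arithmetic above can be written as a short Python sketch. The function names are made up, the rounding down of 0.11*skill is an assumption inferred from 404*0.11+3 being quoted as 47, and adding the weapon's damage on top is also an assumption (both data sets here are weaponless).

```python
def h2h_base_damage(skill, weapon_dmg=0):
    """Hand-to-hand base damage: floor(0.11 * skill) + 3 (+ weapon damage).

    Integer arithmetic avoids floating-point surprises in the floor.
    """
    return skill * 11 // 100 + 3 + weapon_dmg

def fstr_from_spike(spike_low, skill, weapon_dmg=0):
    """fstr = lowest spike value read from the parse minus the base damage."""
    return spike_low - h2h_base_damage(skill, weapon_dmg)

print(fstr_from_spike(37, 300))   # H2H 300, STR 94, no weapon
print(fstr_from_spike(47, 404))   # H2H 404, STR 91, no weapon
```

The first call gives +1 fstr and the second +0 fstr, matching the two cases worked out by hand.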

Berserk is not a problem, you just move to the next mob.

I would need data between 0.5 and 0.75 to confirm the variations of the parameter 'q', namely that it is zero @ r=0.5 (vanishing spike). I have good approximations @ r=0.75, 1, 1.25, 1.5.

Also, in order to maximize the data I removed all crit atma/merit/dex w/e, so the crit data is too small. I could do that later to examine how a crit is obtained.

Masamunai
03-01-2012, 07:49 PM
For crits in abyssea, i parse with 50% critrate using atmas GnarledHorn and DarkDepths, yielding 40% together on top of your own ~10% base critrate. So a parse of 3k hits should have about same sample size for normal or crit damage.

Regarding generation of a crit, it's easy:
On the formula, particularly the term V1, instead of using a simple RAND(0;1), you add another term to it:
+ IF(RAND(0;1)<=CritRate%;1.x;0) ,where x=the gap between crit mins and normal maxs.

Regarding thresholds below r<~0.8, we are still trying to figure out if it occurs at 0.8, 0.75 or 0.7, but it's not easy to discern because of the secondary randomizer... Same problem around r~0.5 or 0.45?

I also forgot to mention something fundamental about the way to consider this frequency modeling:
The frequency spike(s) is a "signature", i.e. I think it is something the original SE devs actually did NOT want. That means those spikes are INDIRECTLY generated from a non-deterministic formula "format", i.e. you can NOT model it as something like "30% of the time, when 0.75<=r<=1.25 we have pDIFa=1, and the normal randomized shape otherwise"... and that's basically what you wrote :s
This particular signature also "transfers" to crits at low cRatios, when pDIFa=1 belongs to crit damage.

MarkovChain
03-02-2012, 03:53 AM
Idk what a signature is; it's nothing. The numbers in game are generated the same way you generate crits or random proc rates like shield blocks: proc rates that vary with a parameter, the way crit% varies with dex, crit gear, merits, etc.

Here we go

U:=rand(0..1): V:=rand(a..b): W:=rand(1..1.05):
if U<0.30 then d:=DMG*W
else d:=V*W
end if;
4 lines to generate this.

SE doesn't generate the frequency of the spike but does generate the parameter of the Bernoulli.

MarkovChain
03-02-2012, 08:42 AM
> Regarding thresholds below r<~0.8, we are still trying to figure out if it occurs at 0.8, 0.75 or 0.7, but it's not easy to discern because of the secondary randomizer... Same problem around r~0.5 or 0.45?

The randomizer is not a problem for determining this threshold. The upper values will be D to D*1.05. You don't have to know how the system truncates the numbers either. You must first determine the range of melee hits that always appears in a spike. For instance, with my data, if I have 91 STR and no H2H weapon, the spike consists of the values {47,48,49}, ALWAYS, for ratios ranging from very low (near 0.5) to very high (near 1.5). Note that those values, as explained, have a very high frequency of appearance (30% for the 3 together, so about 10% each), which basically means that the very moment you cease to see '49' in your melee hits, it's over. Let's look at my data
H2H404_ATT350_STR91_D0:=[[20, 7], [21, 17], [22, 36], [23, 39], [24, 36], [25, 35], [26, 35], [27, 34], [28, 41], [29, 40], [30, 36], [31, 48], [32, 32], [33, 37], [34, 44], [35, 33], [36, 33], [37, 42], [38, 50], [39, 33], [40, 38], [41, 28], [42, 47], [43, 36], [44, 33], [45, 37], [46, 39], [47, 211], [48, 186], [49, 187]]
This has a cratio of 0.76... and 49 is in there
Now another data set with 10 more attack (but a different weapon/STR)
H2H404_ATT360_STR116_D39:=[[45,6],[46,2],[47,7],[48,3],[49,6],[50,3],[51,2],[52,6],[53,9],[54,2],[55,6],[56,1],[57,2],[58,4],[59,2],[60,2],[61,3],[62,3],[63,2],[64,4],[65,2],[66,3],[67,2],[68,7],[69,4],[70,4],[71,7],[72,3],[73,3],[74,5],[75,6],[76,3],[77,3],[78,2],[79,4],[80,5],[81,5],[82,4],[83,2],[84,6],[85,2],[86,5],[87,5],[88,4],[89,3],[90,4],[91,19],[92,19],[93,27],[94,21],[95,19],[96,2],[97,2]]:
This time the spike is strictly inside the damage range, so you are well above the threshold. This gives r=0.787...
So using this method you can tell that the limit is between 0.76 and 0.78. Since 'a' and 'b' are floored too, you don't really know that it lies in [0.76,0.78), but it is likely 0.75, as flooring would allow more ratios to make damage cap at the weapon DMG.
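The bracketing test on these two data sets can be sketched in Python as follows. The helper names are made up, the mob def of 457 is the figure quoted earlier in the thread, and flooring DMG*1.05 to get the top of the spike is an assumption.

```python
MOB_DEF = 457  # EM sandsweeper def, as quoted earlier in the thread

def ratio(att, defense=MOB_DEF):
    """r = attack / defense (no level correction)."""
    return att / defense

def max_capped_at_spike(max_hit, dmg):
    """True if the largest observed hit does not exceed floor(DMG * 1.05)."""
    return max_hit <= dmg * 105 // 100

# (attack, DMG read from the spike, largest hit in the data set)
for att, dmg, max_hit in [(350, 47, 49), (360, 91, 97)]:
    print(round(ratio(att), 3), max_capped_at_spike(max_hit, dmg))
```

The first data set (r about 0.766) is capped at the spike and the second (r about 0.788) is not, which brackets the apparent threshold between the two.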

Masamunai
03-02-2012, 09:08 PM
For one, it's strange for someone doing "research" not to know what a signature is lol < < (for info, it's what validates or invalidates one model over another, and it helps greatly in going in the correct research direction)

For 2, you mentioned tests at r=0.76 and 0.787, while I was asking if it could be around 0.7 or 0.72 or whatever r BELOW 0.75. Of course we all know this threshold is NOT above 0.75... we have plenty of parses between 0.75 and 1 for proof, but not below 0.75... Your method of watching for the highest spiked dmg value disappearing is nice for finding that ratio threshold, yet no parse below 0.75?

Keep in mind ppl posting on this topic are ppl who have studied this shit over and over... no need to explain how numbers displayed in game are generated.

Also, when you say SE does use the Bernoulli system, that basically means they made those spikes ON PURPOSE, since that parameter literally generates them directly. One could say "so what? as long as it works..." but I still don't buy it, as I think SE devs would on the contrary want the most uniform distribution at ANY ratio with 0 spike, leading me to think those spikes appeared indirectly and they did not bother to fix it.

MarkovChain
03-03-2012, 04:09 AM
You are trying to transform the problem into finding what max and min pdif are, which I don't care about. I care about the distribution of the melee hits, which is the only thing that should matter when you want to model damage. That's why I don't collect data around 0.75. It's useless. I want a model that quantifies what damage I am doing at what pdif.

Congratulations on knowing what a and b are. However, my next question is: I have 1.134 pdif, what is my average damage, and what is the variance of my damage? Can you give the answer to that simple question? Nope. I can, and of course I need a and b beforehand, but since an approximate knowledge of a and b is enough, I'm able to model damage more precisely than you.

The spike is not something that SE put there by mistake, since it happens exactly at the damage value. They put the value of q in their equations; just like they decide to cap crit at 25% or w/e, in this case they decide to cap q at 30%.

Also lol @signature. This means jack shit.

Incidentally, as explained in the OP, the model allows you to solve the mysteries of Vana'diel. What is Fafnir's attack? What is Fafnir's damage? What is Fafnir's STR?

MarkovChain
03-03-2012, 04:23 AM
Also I want to point out that the model is still only a model and needs verification for different values at r>1.5 and r<0.75. Also, if anyone can think of a process to get an accurate value of the Bernoulli parameter 'q' based on the data, I'm interested. The only ways I can find q are the spike frequency formula in the OP, or the average formula (1.025*( (1-q)*(b+a)/2+q*DMG )); however the last one is not very accurate when the average is about equal to the weapon damage...

MarkovChain
03-03-2012, 08:30 AM
Old data of monster VS player, chigoes hitting me with counterstance+berserk full time on me. They have 66 DMG for the record and unknown attack.
counterstance_40:=[[86,6],[87,10],[88,20],[89,22],[90,41],[91,34],[92,32],[93,38],[94,26],[95,30],[96,35],[97,35],[98,42],[99,21],[100,31],[101,39],[102,29],[103,28],[104,24],[105,28],[106,38],[107,36],[108,48],[109,28],[110,34],[111,30],[112,24],[113,32],[114,32],[115,47],[116,39],[117,25],[118,40],[119,35],[120,38],[121,34],[122,31],[123,30],[124,28],[125,29],[126,32],[127,28],[128,37],[129,29],[130,38],[131,25],[132,27],[133,33],[134,33],[135,42],[136,40],[137,36],[138,34],[139,46],[140,41],[141,38],[142,33],[143,35],[144,18],[145,20],[146,10],[147,16],[148,10],[149,5],[150,2]]
Since I had such low def I assume that their pdif is well above 1.5 (clearly no spike at their wpn DMG=66 since it is out of the range). Here is a graphical comparison of the theoretical distribution (green) and the data (red).
http://i189.photobucket.com/albums/z315/pchann/pdif_counterstance.jpg

measured average value between a*1.05=90 and b=143 (upper floor) : 1.135
expected value : 20*ln(1.05)/(b/66-a/66) with a=86 and b=143 => 1.129

Not too bad : 0.4% variation <3.
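The expected value quoted above is a one-liner to recompute. In this Python sketch the function name is made up; it evaluates 20*ln(1.05)/(b/DMG-a/DMG), i.e. the height of the flat part of the model's distribution when damage is expressed in units of the weapon DMG and there is no spike (q=0).

```python
import math

def flat_density(a, b, dmg):
    """Height of the flat part of the distribution, in units of 1/pdif.

    Assumes q = 0 (no spike), with a and b given as raw damage values.
    """
    return 20 * math.log(1.05) / ((b - a) / dmg)

expected = flat_density(86, 143, 66)
print(round(expected, 3))
```

This evaluates to about 1.13, within roughly half a percent of the measured 1.135.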

Motenten
03-03-2012, 04:05 PM
Knowing A and B are useful for figuring out cRatio, and thus figuring out things like mob attack, but you're right, it's insufficient to really tell how average damage is related to cRatio.

However I don't think your:

> U:=rand(0..1): V:=rand(a..b): W:=rand(1..1.05):
> if U<0.30 then d:=DMG*W
> else d:=V*W

sufficiently describes the process. When you're right on the edge of creating a spike (eg: cRatio at about 1.5), the average spike frequency is about 3x the average non-spike frequency. When you're near 1.0, with the spike right in the middle, the average spike frequency is about 5x the average non-spike frequency.

Just a rough review from Masa's data..

What I'm looking at:

Find the average frequency of all values aside from the 1.0 spike and any tail ends (artifacts of the 1.05 randomizer).

Subtract that average frequency from the frequency values of the spike values.

Sum up the remaining spike values.

Compare that sum with the grand total of all damage value frequencies to see what percentage of all generated damage values are encompassed in the artificial spike.
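The steps above can be sketched in Python as follows. This is only an illustration: the function name and its arguments are made up, the spike and the tail-free "core" range have to be supplied by hand, and the example runs on the H2H404_ATT467_STR91_D0 data set posted earlier in the thread rather than on Masa's spreadsheet.

```python
def spike_pool_fraction(hist, spike, core):
    """Fraction of all hits that sit in the spike above the flat baseline.

    hist  : {damage value: number of hits}
    spike : (low, high) inclusive range of spike values
    core  : (low, high) inclusive range with the randomizer tails cut off
    """
    flat = [n for v, n in hist.items()
            if core[0] <= v <= core[1] and not spike[0] <= v <= spike[1]]
    baseline = sum(flat) / len(flat)          # average non-spike frequency
    excess = sum(hist[v] - baseline           # spike counts above baseline
                 for v in range(spike[0], spike[1] + 1))
    return excess / sum(hist.values())

# Counts for damage values 34..64 from H2H404_ATT467_STR91_D0.
counts = [9, 28, 38, 29, 40, 31, 37, 22, 41, 33, 42, 41, 32, 205, 190, 158,
          35, 30, 33, 28, 26, 34, 30, 31, 26, 37, 33, 29, 16, 13, 9]
hist = {34 + i: n for i, n in enumerate(counts)}

pool = spike_pool_fraction(hist, spike=(47, 49), core=(36, 61))
print(round(pool, 3))
```

For this data set the pool comes out at roughly a third of all hits.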



What I found:

Below 1.0 cRatio, down to ~0.8 (where max damage was still above 1.0), ~35% of all values came from the spike pool.
As cRatio dropped towards 0.75 (where max damage was capped at 1.0), the pool size dropped to about 25% of all values.
Not enough data (in this spreadsheet) below 0.75 to see how the trend continued.

Above 1.0 cRatio the pool size fluctuated from ~30% to ~37%. Either 33% or 35% would be believable.

As cRatio increases above 1.25, such that the min damage hits and sticks to 1.0, the percentage of the overall damage frequencies that are made up by the spike pool drop in a fairly linear fashion, reaching 12% at 1.4 cRatio.


As such you can't simply say that 30% (or whatever) of the time you generate a spike value, and the rest of the time you generate a value in the min::max range.


One might posit, then, that the chance of generating a frequency spike value is dependent on the total min:max range available.

Formulation:

(max - min)/baseDamage + fraction of total values in the spike

where max and min are corrected to generally exclude the 1.05 spread tails.

That formulation consistently resulted in a final value of just about 0.9 (with rounding, between 0.89 and 0.91).


So, your threshold for U would be: 1 - ((max - min)/baseDamage) - 10%


Not sure why there would be the extra -10% there, but could probably be explained based on expectations for the min::max range compared to base damage.


And this finally creates a formula that I'm comfortable with in terms of how the spike is generated. Still needs more work to calculate an average damage value based only on cRatio, though, and you can't completely ignore the min and max values (the a and b you mention in your post).
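As a rough Python sketch of that threshold (the function name is made up, and clamping the result at zero is an added assumption so that very wide damage ranges give no spike rather than a negative chance):

```python
def spike_chance(dmg_min, dmg_max, base_damage):
    """Threshold for U: 1 - (max - min)/baseDamage - 10%, clamped at 0."""
    return max(0.0, 0.9 - (dmg_max - dmg_min) / base_damage)

# Example values taken from one parse in this thread (a=34, b=61, DMG=47).
print(round(spike_chance(34, 61, 47), 3))
```

For that example the threshold is about 0.33.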

Motenten
03-03-2012, 04:29 PM
Note regarding Masa's data: the above work was from examining the behavior of 18 different cRatio tests that fell within the range of 0.75 to 1.5. The overall conclusion is still just a rough approximation, though.

Masamunai
03-03-2012, 06:04 PM
I "could" actually tell you the average dmg just from regressions from all my parses:

Avg DMG =~ BaseDMG x (0.8 x cRatio + 0.2), in the range 0.7 <~ cRatio <= 1.5
and =~ cRatio at 1.5 <= cRatio < 2

... But since it comes directly from data.... I would rather get this average from a global dmg formula. The problem is that you have to define this formula properly, with ALL its parameters exactly defined, so certainly not from approximate guesses and, even worse, ignored min/max. That's why I don't use/post this average formula on wikis...

Also, you're saying signatures mean jackshit? Then don't ask for them with:

> a process to get an accurate value of the bernouilli parameter 'q' based on the data, I'm interested.

You can be thankful to Motenten, who just provided it; and note it's still a ROUGH approximation, i.e. not as reliable as you make it sound from just 6 parses...

MarkovChain
03-03-2012, 09:16 PM
I didn't use a dumb approximation for the frequency, dude; I explained in the OP that you can calculate the frequency spike. It IS
q+(1-q)*ln(1.05)*DMG/(b-a)

The problem comes from reversing the formula at certain values where you divide by small quantities, aka r~1 for instance.

MarkovChain
03-03-2012, 09:41 PM
> When you're right on the edge of creating a spike (eg: cRatio at about 1.5), the average spike frequency is about 3x the average non-spike frequency.

Let's see. The average spike frequency is the probability that the damage lies in [D,D*1.05]. You can compute it; it is
(20*ln(1.05)*(1-q)/(b-a)+20*q)*0.05=ln(1.05)*(1-q)/(b-a)+q
The average non-spike frequency is
(20*ln(1.05)*(1-q)/(b-a))*(b-1.05*a-D*0.05)
The ratio between the two is
( ln(1.05)*(1-q)/(b-a)+q )
____________________________________
(20*ln(1.05)*(1-q)/(b-a))*(b-1.05*a-D*0.05)




> When you're near 1.0, with the spike right in the middle, the average spike frequency is about 5x the average non-spike frequency.
>
> Just a rough review from Masa's data..
>
> What I'm looking at:
>
> Find the average frequency of all values aside from the 1.0 spike and any tail ends (artifacts of the 1.05 randomizer).


(20*ln(1.05)*(1-q)/(b-a))*(b-1.05*a-D*0.05)



> Subtract that average frequency from the frequency values of the spike values.
>
> Sum up the remaining spike values.




ln(1.05)*(1-q)/(b-a)+q -
(20*ln(1.05)*(1-q)/(b-a))*(b-1.05*a-D*0.05)

not simple




> Compare that sum with the grand total of all damage value frequencies to see what percentage of all generated damage values are encompassed in the artificial spike.



[ ln(1.05)*(1-q)/(b-a)+q -
(20*ln(1.05)*(1-q)/(b-a))*(b-1.05*a-D*0.05) ]
________________________________________________
ln(1.05)*(1-q)/(b-a)+q
and ..? Is that supposed to be q?



> What I found:
>
> Below 1.0 cRatio, down to ~0.8 (where max damage was still above 1.0), ~35% of all values came from the spike pool.
> As cRatio dropped towards 0.75 (where max damage was capped at 1.0), the pool size dropped to about 25% of all values.
> Not enough data (in this spreadsheet) below 0.75 to see how the trend continued.
>
> Above 1.0 cRatio the pool size fluctuated from ~30% to ~37%. Either 33% or 35% would be believable.
>
> As cRatio increases above 1.25, such that the min damage hits and sticks to 1.0, the percentage of the overall damage frequencies that are made up by the spike pool drop in a fairly linear fashion, reaching 12% at 1.4 cRatio.

As seen in the formula above, this has little to no relation to the parameter q which governs the spike.



> As such you can't simply say that 30% (or whatever) of the time you generate a spike value, and the rest of the time you generate a value in the min::max range.

Hmm, yeah, I did not say that q doesn't vary with r. I said that between 0.75 and 1.25 the spike frequency IS 30% (or nearly, within the confidence intervals) and outside of this interval it decreases (probably linearly) to 0.



> One might posit, then, that the chance of generating a frequency spike value is dependent on the total min:max range available.
>
> Formulation:
>
> (max - min)/baseDamage + fraction of total values in the spike
>
> where max and min are corrected to generally exclude the 1.05 spread tails.
>
> That formulation consistently resulted in a final value of just about 0.9 (with rounding, between 0.89 and 0.91).
>
> So, your threshold for U would be: 1 - ((max - min)/baseDamage) - 10%
>
> Not sure why there would be the extra -10% there, but could probably be explained based on expectations for the min::max range compared to base damage.
>
> And this finally creates a formula that I'm comfortable with in terms of how the spike is generated. Still needs more work to calculate an average damage value based only on cRatio, though, and you can't completely ignore the min and max values (the a and b you mention in your post).

average=1.025*( (1-q)*(b+a)/2 + q*DMG)
q(r)= 0.3 if 0.75<r<1.25
q(r)=0.6 - 1.2(r-1) if 1.25<r<1.5
q(r)=0.6+1.2(r-1) if 0.5<r<0.75
q(r)=0 otherwise.
Those formula are on the wiki.
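For reference, a small Python sketch of the piecewise q(r) above. The function name is made up, and treating the end points 0.75 and 1.25 as part of the flat 0.3 region is an assumption (the formulas are continuous there, so it makes no difference to the values).

```python
def q_of_r(r):
    """Spike parameter q as a piecewise-linear function of the ratio r."""
    if 0.75 <= r <= 1.25:
        return 0.3
    if 1.25 < r < 1.5:
        return 0.6 - 1.2 * (r - 1)
    if 0.5 < r < 0.75:
        return 0.6 + 1.2 * (r - 1)
    return 0.0

for r in (0.5, 0.6, 0.75, 1.0, 1.25, 1.4, 1.5):
    print(r, round(q_of_r(r), 3))
```

The two sloped pieces meet the flat piece at 0.3 and reach 0 at r=0.5 and r=1.5, so q is continuous over the whole range.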

The average is basically a second-order polynomial outside of 0.75 and 1.25 and linear inside, assuming a & b are actually piecewise linear.

r=1, 1.25, and 0.75 have a sample size of nearly 2k in my tests; they are very precise and q is about 0.3 (0.29 to 0.34).
For r=1.5 I verified that there is no spike, so q=0.
I need data at r=0.5 to prove that there is no spike either.

So no, masa, I don't use rough approximations. I know my shit and my model is correct, modulo a more accurate method to determine q, which I've yet to see.

MarkovChain
03-03-2012, 10:43 PM
My next plans are to show that the model matches the data, just like in the 2 graphs I posted so far, for all the data I have, and then to show that the frequency spike, the average and the variance coincide for each. Since I'm getting the value of 'q' by reversing the frequency spike expression, that's two verifications for each data set, and I can also compare n-th order moments too; shouldn't be too hard.

MarkovChain
03-04-2012, 01:56 AM
First let's look at the counterstance data (me getting hit by chigoes)
counterstance_40:=[[86,6],[87,10],[88,20],[89,22],[90,41],[91,34],[92,32],[93,38],[94,26],[95,30],[96,35],[97,35],[98,42],[99,21],[100,31],[101,39],[102,29],[103,28],[104,24],[105,28],[106,38],[107,36],[108,48],[109,28],[110,34],[111,30],[112,24],[113,32],[114,32],[115,47],[116,39],[117,25],[118,40],[119,35],[120,38],[121,34],[122,31],[123,30],[124,28],[125,29],[126,32],[127,28],[128,37],[129,29],[130,38],[131,25],[132,27],[133,33],[134,33],[135,42],[136,40],[137,36],[138,34],[139,46],[140,41],[141,38],[142,33],[143,35],[144,18],[145,20],[146,10],[147,16],[148,10],[149,5],[150,2]]
http://i189.photobucket.com/albums/z315/pchann/pdif_counterstance.jpg
green : model , red : data

I compute reduced moments : m1=E(d),sigma=standard dev, m2=E(d^2)^(1/2), m3=E(d^3)^(1/3) in each case.

DATA : m1=117.2447624,sigma=17.09798864,m2 :=118.4849169,m3=119.6849566
MODEL :m1=117.3622996,sigma=16.94823656,m2=118.5798168,m3=120.9018511

I win. (The model works with q=0 in this case, and we are @ very high, though unknown, pdif, probably @ cap.)

MarkovChain
03-04-2012, 03:57 AM
Using H2H404_ATT467_STR91_D0
a=34,b=61,DMG=47

data : m := 48.10173160,sigma := 6.742996786,m2 := 48.57205564,m3 := 49.03565994
model : m:=48.49419660,sigma=6.723842029,m2=48.97687781,m3=57.38601253

At least the average and the variance match. m3 doesn't match, though the sample is only about 1300 hits. Using q=0.3 for this one.
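The data-side moments can be recomputed from the histogram with a few lines of Python. The function name is made up, and the standard deviation is taken as the population value, sqrt(E(d^2)-E(d)^2), which is an assumption about how the figures above were produced.

```python
def reduced_moments(hist):
    """Return (m1, sigma, m2, m3) for a {damage value: count} histogram.

    m1 = E(d), sigma = population std dev,
    m2 = E(d^2)^(1/2), m3 = E(d^3)^(1/3)
    """
    n = sum(hist.values())

    def raw(k):
        return sum(c * v ** k for v, c in hist.items()) / n

    m1 = raw(1)
    sigma = (raw(2) - m1 ** 2) ** 0.5
    return m1, sigma, raw(2) ** 0.5, raw(3) ** (1 / 3)

# Counts for damage values 34..64 from H2H404_ATT467_STR91_D0.
counts = [9, 28, 38, 29, 40, 31, 37, 22, 41, 33, 42, 41, 32, 205, 190, 158,
          35, 30, 33, 28, 26, 34, 30, 31, 26, 37, 33, 29, 16, 13, 9]
hist = {34 + i: n for i, n in enumerate(counts)}

m1, sigma, m2, m3 = reduced_moments(hist)
print(m1, sigma, m2, m3)
```

The same function can be applied to simulated hits from the model to get the model-side moments for comparison.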

H2H427_ATT429_STR176_D42 (the graph that I put earlier in the thread)
a=63,b=123,
data: m= 100.3947368,sigma := 14.79139228,m2 := 101.4785123,m3 := 102.4726554
model : m=100.2753381,sigma=14.56868085,m2=101.2691851,m3=121.3597252

It's interesting to see that in those 2 models, any value of q between 0.25 and 0.35 approximately matches for the average and standard deviation. It's starting to be different outside of those values. This is why I said It's hard to have a good approximation of q. The best I found is the expression of the spike frequency.

Masamunai
03-04-2012, 06:08 AM
From my data, i kinda "felt" ( so i can be wrong and just placebo effect ) that the spike intensity depends on BaseDMG:
The higher BaseDMG is, the more values the distribution will have, the less "intense" the spike will be;
If using DMG1 weapon, distribution will have few values and spike immediately gets to very high %.

For determining q, why wouldn't Motenten's method work? You don't have enough parses? (Note I've been asking this question since my 1st post, with no answer.) I'm not sure I understood the problem correctly there...

And finally, I never said your model approach is not "correct" in the sense of mimicking the observed frequency parses; I said this approach is deterministic, i.e. according to you, SE put this spike there on purpose...
A non-deterministic way is to use the definition of a pseudo RNG defined as Y=START+INT(RANGE*Generator). Strangely, its main signature is a spike at the start; from there you can guess how it can be adapted to our problem, and this without using any parameter q or whatever. Think of this spike as an "artifact" occurring only when pDIF=1 is generated. An example of it is what Raellia posted at the Mathy BG forum.
Saying this does not in any way make your approach "not correct", ok?

MarkovChain
03-04-2012, 07:56 AM
For determining q, why would motenten's method not work?

Because the proportion of the data supposedly attributable to the spike is not a simple function of q.



You don't have enough parses? (Note I've asked this question since my 1st post and got no answer.) I'm not sure I understood the problem correctly there...


The only thing that can remove the uncertainty about q is testing a range of pdif and studying how q varies across it, because increasing the amount of data doesn't seem to improve the precision on it very much.



Y=START+INT(RANGE*Generator)

What's that? Is this a uniform variable on a range of integers? There is no spike (assuming the generator is uniform).
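A quick way to look at this claim is to tally Y=START+INT(RANGE*Generator) directly. This is only a sketch under the assumption stated above (Generator uniform on [0,1)); the values START=63 and RANGE=61 are illustrative choices, and the game's actual generator is unknown.

```python
import random
from collections import Counter

def tally(start, span, n, rng):
    """Counts of Y = start + int(span * g) for n draws g uniform on [0, 1)."""
    return Counter(start + int(span * rng.random()) for _ in range(n))

# Illustrative values only: 61 integer outcomes from 63 to 123.
counts = tally(start=63, span=61, n=610_000, rng=random.Random(0))
expected = 610_000 / 61

# Largest relative deviation of any bin from the flat expectation.
worst = max(abs(c - expected) / expected for c in counts.values())
print(len(counts), round(worst, 3))
```

If a spike at START shows up in a spreadsheet version of this, it would have to come from something other than the formula itself, for example a generator that is not uniform or a rounding step added around it.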

MarkovChain
03-04-2012, 10:43 AM
So, your threshold for U would be: 1 - ((max - min)/baseDamage) - 10%


Ok, I actually agree with this. It's also interesting because it puts q=0 at high pdif (1.5) or low pdif (<0.5), and it roughly matches the rest of the data. I arrived at this form after analyzing the behavior of the distribution for 1.25<r<1.5. In this range the frequency of the spike on the lower bound decreases when the upper bound increases; so I made the hypothesis that they decrease/increase at the same rate and that the average, which is
m=1.025*( (1-q)*(a+b)/2+q*DMG) in this range, would then vary linearly (since a is constant). This necessarily implies that q+(b-a)/DMG is a constant. In this case it's around 0.9.

So in other words, the renormalized average damage (divided by DMG, with a and b expressed as multiples of DMG) would be

m=1.025*( (1-(0.9-(b-a)))*(a+b)/2+(0.9-(b-a)) )


So the damage would be generated by the following program (a and b being multiples of DMG, as in the renormalized formula above)

U:=rand(0..1)
V:=rand(a..b)
W:=rand(1..1.05)
if U<0.9-(b-a) then DMG*W
else DMG*V*W
end if;
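The program above can be sketched in Python as follows. This is a minimal sketch, not the game's code: the names `sample_damage` and `mean_damage` are made up here, a and b are assumed to be pDIF bounds expressed as multiples of DMG, and flooring the spike probability at 0 is an added assumption.

```python
import random

def sample_damage(dmg, a, b, rng=random):
    """One hit under the model above; a and b are multiples of dmg."""
    u = rng.random()             # U := rand(0..1)
    v = rng.uniform(a, b)        # V := rand(a..b)
    w = rng.uniform(1.0, 1.05)   # W := rand(1..1.05), the 5% randomizer
    q = max(0.9 - (b - a), 0.0)  # spike probability, floored at 0 (assumption)
    base = dmg if u < q else v * dmg
    return base * w

def mean_damage(dmg, a, b):
    """Closed-form average from the post: 1.025*((1-q)*(a+b)/2 + q), times dmg."""
    q = max(0.9 - (b - a), 0.0)
    return 1.025 * dmg * ((1 - q) * (a + b) / 2 + q)
```

Averaging many calls to `sample_damage` should land close to `mean_damage` for the same inputs, which gives a quick consistency check between the program and the formula for m.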

So I suppose this is where you guys who studied the expression of a & b with respect to r can explain what the value of b-a is at r>1.5 or r<0.5. The wiki formula at low pdif indicates b-a=0.9, so it matches, and that should also be the case at high pdif.

Masamunai
03-04-2012, 08:45 PM
Because the proportion of the data supposedly attributable to the spike is not a simple function of q.

... and BaseDMG, which is what I implied in my previous post, further confirmed by Motenten's formula U=1-(b-a)/BaseDMG -> the higher BaseDMG is, the higher U is, i.e. the frequency spike is less "potent" compared to other values.




What's that? Is this a uniform variable on a range of integers? There is no spike (assuming the generator is uniform).

At first glance, yeah... until you plot it and discover it actually has an artifact: a spike whose position within the frequency chart is static at Y=START. Here (https://docs.google.com/spreadsheet/ccc?key=0Ai1bYhrlM-J5dHZRV0hyNTVUVnpsZUVCU0xGRGZWTkE) is an example (it does not match the parses though, it's just to picture the thing).

And now, from your last post, does proving that spike intensity decreases linearly from cRatio=1 to 1.5 and to 0.5 lead to a linear average dmg formula in the range 0.5<=cRatio<=1.5? I'm hoping the demonstration will end up at the regression formula I posted earlier; at least it would be proven mathematically :)

MarkovChain
03-04-2012, 09:35 PM
(b-a)/DMG is independent of the weapon DMG, so nope. The Bernoulli variable doesn't depend on DMG. It is only a function of r and is about equal to the expression I gave earlier.

q(r)= 0.3 if 0.75<r<1.25
q(r)=0.6 - 1.2(r-1) if 1.25<r<1.5
q(r)=0.6+1.2(r-1) if 0.5<r<0.75
q(r)=0 otherwise.

I cannot be much more precise than this, or than q=0.9-(b-a), until b and a are explicitly expressed.
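The piecewise expression for q(r) can be written as a small function. A minimal sketch: the name `q_of_r` is made up here, and the boundary points (which the post leaves open with strict inequalities) are assigned to the middle branch, which changes nothing since the pieces join continuously.

```python
def q_of_r(r):
    """Approximate spike probability as a function of r = attack / defense."""
    if 0.75 <= r <= 1.25:
        return 0.3                    # flat part around r = 1
    if 1.25 < r < 1.5:
        return 0.6 - 1.2 * (r - 1)    # falls from 0.3 to 0
    if 0.5 < r < 0.75:
        return 0.6 + 1.2 * (r - 1)    # rises from 0 to 0.3
    return 0.0                        # no spike outside (0.5, 1.5)

# Sample a few ratios across the whole range.
for r in (0.4, 0.625, 1.0, 1.375, 2.0):
    print(r, q_of_r(r))
```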

Also

m=1.025*( (1-(0.9-(b-a)))*(a+b)/2+(0.9-(b-a)) )

It's linear when b-a is constant; otherwise it doesn't seem to be...

Masamunai
03-05-2012, 04:49 AM
Huh? Isn't BaseDMG = WeaponDMG + fSTR?

It looks to me like you are misunderstanding what I'm talking about; I'm just following what motenten said:

So, your threshold for U would be: 1 - ((max - min)/baseDamage) - 10%
... and yes, Max and Min depend on r.

MarkovChain
03-05-2012, 05:36 AM
And??? This expression is only a function of r: max and min are (an expression of r) times DMG => q doesn't depend on DMG in any way, shape or form.

MarkovChain
03-26-2012, 11:33 PM
So as expected, the model presented in the thread implies the shape and values of max and min pdif. I'm going to stipulate that the previous observations on 'q' (the parameter of the spike) are accurate, namely that it is 0 for r>=1.5 and r<=0.5, and that it equals a constant minus b-a:
q=K-(b-a)
K is therefore the value of (b-a) for r>1.5. The data suggests that b-a = 0.83... at 2 data points around r=1.5. Another test I conducted using brews shows that b-a=0.79... at cap pdif. Going through the different tests I presented here, we get the following data for a, b and c=b-a (notation: each list holds pairs [r, value])


a=[[0.5973741794, 0.2432432432], [0.6301969365, 0.2702702703],[0.7658643326, 0.4255319149], [0.8730853392, 0.5673076923],[0.9387308534, 0.6442307692],[1.021881838, 0.7234042553],[1.251641138, 1.000000000], [1.498905908,1.000000000],[1.564551422, 1.076923077], [2.0, 1.572815534]]
b=[[0.5973741794, 1.000000000], [0.6301969365, 1.000000000], [0.7658643326, 1.000000000], [0.8730853392, 1.134615385],[0.9387308534, 1.192307692], [1.021881838, 1.297872340],[1.251641138, 1.567307692], [1.498905908, 1.825242718],[1.564551422, 1.913461538], [2.0, 2.368932039]]
c=[[0.5973741794, 0.7567567568], [0.6301969365, 0.7297297297], [0.7658643326, 0.5744680851], [0.8730853392, 0.5673076923],[0.9387308534, 0.5480769231], [1.021881838, 0.5744680851],[1.251641138, 0.5673076923], [1.498905908, 0.8252427184],[1.564551422, 0.8365384615], [2.0, 0.7961165049]]
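The third list can be cross-checked against the first two. A minimal sketch, with the rows copied from the lists above as (r, a, b, c) and assuming c is meant to be b-a; the helper names `max_gap_error` and `plateau` are made up here.

```python
# (r, a, b, c) rows copied from the three lists above.
rows = [
    (0.5973741794, 0.2432432432, 1.000000000, 0.7567567568),
    (0.6301969365, 0.2702702703, 1.000000000, 0.7297297297),
    (0.7658643326, 0.4255319149, 1.000000000, 0.5744680851),
    (0.8730853392, 0.5673076923, 1.134615385, 0.5673076923),
    (0.9387308534, 0.6442307692, 1.192307692, 0.5480769231),
    (1.021881838, 0.7234042553, 1.297872340, 0.5744680851),
    (1.251641138, 1.000000000, 1.567307692, 0.5673076923),
    (1.498905908, 1.000000000, 1.825242718, 0.8252427184),
    (1.564551422, 1.076923077, 1.913461538, 0.8365384615),
    (2.0, 1.572815534, 2.368932039, 0.7961165049),
]

def max_gap_error(rows):
    """Largest difference between the listed c and the recomputed b - a."""
    return max(abs((b - a) - c) for _, a, b, c in rows)

def plateau(rows, lo, hi):
    """Mean of c over the rows whose r lies in [lo, hi]."""
    vals = [c for r, _, _, c in rows if lo <= r <= hi]
    return sum(vals) / len(vals)

print(max_gap_error(rows), plateau(rows, 0.75, 1.25))
```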

Graphical expression below (b=green,a=red,c=brown)
http://i189.photobucket.com/albums/z315/pchann/pdif_abc.jpg

From the graph we can expect all slopes to be the same (note that b is not the observed value while a is). Let P(x)=max(x,0) and N(x)=max(-x,0) be the positive and negative parts of x; then for a certain slope L,
a(r)=1+L( -N(r-1.25)+P(r-1.5))
b(r)=1+L( -N(r-0.5)+P(r-0.75))

For r=1.5 we have b-a=~0.83. Substituting in the formula gives
a=1; b=1+L*0.75 => L=0.83*4/3=3.32/3=1.1066666~10/9
Note: at this point it's unclear whether L=1.1 or 10/9. In the latter case the observed 0.83 gap between a and b would be (3/4)x(10/9)=5/6=0.833333
Anyway, this gives the following values for a, b
r<=1.25 => a=1+(10/9)x(r-1.25);
1.25<=r<=1.5 => a=1;
r>=1.5 => a=1+(10/9)x(r-1.5);
r<=0.5 => b=1+(10/9)x(r-0.5);
0.5<=r<=0.75 => b=1;
r>=0.75 => b=1+(10/9)x(r-0.75);

Note that b is probably capped somewhere near r=1.95 because with brew-pdif we find b-a=0.79 instead of 0.83... I updated the wiki with those formulas.
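The piecewise values of a and b can be written with the P and N helpers defined earlier. A minimal sketch assuming L=10/9 (the post notes L=1.1 is also compatible with the data) and ignoring the possible cap on b near r=1.95; the function names are made up here.

```python
L = 10 / 9  # slope; assumed, the data cannot separate it from 1.1

def pos(x):
    """P(x) = max(x, 0), the positive part."""
    return max(x, 0.0)

def neg(x):
    """N(x) = max(-x, 0), the negative part."""
    return max(-x, 0.0)

def a_of_r(r):
    """Minimum pDIF: flat at 1 on [1.25, 1.5], slope L on either side."""
    return 1 + L * (-neg(r - 1.25) + pos(r - 1.5))

def b_of_r(r):
    """Maximum pDIF: flat at 1 on [0.5, 0.75], slope L on either side."""
    return 1 + L * (-neg(r - 0.5) + pos(r - 0.75))

# Print the bounds and their gap at a few ratios.
for r in (0.5, 1.0, 1.5, 2.0):
    print(r, a_of_r(r), b_of_r(r), b_of_r(r) - a_of_r(r))
```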

Incidentally, we can deduce the expression of the parameter q:
q=0 if r<=0.5 or r>=1.5
q is linear on [0.5,0.75] and [1.25,1.5] with a slope equal to L=10/9 and -10/9 respectively, and is constant on [0.75,1.25]. The value of the constant is obtained by setting r=1 in the above equations, for instance, and we find q=5/18=0.2777=~0.28... This also matches the observed frequencies of the spike (we found around 30%, with a sizeable error that we were unable to narrow down).
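The value q=5/18 at r=1 can be checked with exact fractions. A minimal sketch assuming L=10/9 and K=5/6 as derived above; the names `spread` and `q` are made up here.

```python
from fractions import Fraction

L = Fraction(10, 9)  # assumed slope
K = Fraction(5, 6)   # value of b - a outside [0.5, 1.5]

def spread(r):
    """b(r) - a(r) from the piecewise formulas above, in exact arithmetic."""
    r = Fraction(r)
    a = 1 + L * (max(r - Fraction(3, 2), 0) - max(Fraction(5, 4) - r, 0))
    b = 1 + L * (max(r - Fraction(3, 4), 0) - max(Fraction(1, 2) - r, 0))
    return b - a

def q(r):
    """Spike probability q = K - (b - a)."""
    return K - spread(r)

# Ratios outside, on the ramps, and on the flat part.
for r in (Fraction(1, 4), Fraction(5, 8), 1, Fraction(11, 8), 2):
    print(r, q(r))
```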

MarkovChain
03-27-2012, 05:15 AM
Also, those formulas are truncated at low pdif. From mob data it appears that around r=0.5 the minimum damage is capped. In this case the damage is not uniform on [a,b] and has a lower spike: it therefore appears that the damage numbers are still generated uniformly on [a,b] from the above formula but that d, rather than a, is truncated (at ~0.22 of the DMG value). Also, I just realised that those formulas imply that b-a=5/6 at r=0.5 exactly, so that would be where a caps out (a=1/6).


Attempt at a global formula

a:=1+L*( max(max(r,0.5)-1.5,0) - max(1.25-max(r,0.5),0) );
b:=1+L*( max(r-0.75,0) - max(0.5-r,0) );

Here is the representation of those functions

http://i189.photobucket.com/albums/z315/pchann/pdif_ab.jpg

The final model will not account for the lower cap on a when generating numbers but will instead cap the generated value:

a*:=1+L*( max(r-1.5,0) - max(1.25-r,0) );
b*:=1+L*( max(r-0.75,0) - max(0.5-r,0) );

d=eta*((1-U)*max(V,DMG/6)+DMG x U)
with
DMG= damage value including fstr
eta= uniform on [1,1.05]
V=uniform on [a* x DMG,b* x DMG]
U= Bernoulli of parameter q=P(U=1)=5/6 - (b*-a*)
U,V,eta are independently generated.
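The final model can be sketched as a sampler. This is a minimal sketch under stated assumptions: L=10/9, no rounding to integer damage (the post does not say where the game truncates), q floored at 0 to absorb floating-point round-off, and the names `a_star`, `b_star` and `sample_hit` made up here.

```python
import random

L = 10 / 9  # assumed slope

def a_star(r):
    """Uncapped lower pDIF bound a*."""
    return 1 + L * (max(r - 1.5, 0.0) - max(1.25 - r, 0.0))

def b_star(r):
    """Upper pDIF bound b*."""
    return 1 + L * (max(r - 0.75, 0.0) - max(0.5 - r, 0.0))

def sample_hit(dmg, r, rng=random):
    """One draw of d = eta*((1-U)*max(V, DMG/6) + DMG*U)."""
    lo, hi = a_star(r), b_star(r)
    q = max(5 / 6 - (hi - lo), 0.0)      # P(U=1), floored at 0 (assumption)
    u = 1 if rng.random() < q else 0     # U: Bernoulli(q)
    v = rng.uniform(lo * dmg, hi * dmg)  # V: uniform on [a* x DMG, b* x DMG]
    eta = rng.uniform(1.0, 1.05)         # eta: the 5% randomizer
    return eta * ((1 - u) * max(v, dmg / 6) + dmg * u)
```

Capping the generated value with `max(v, dmg / 6)` rather than capping a* is what produces the lower spike at low r: every draw of V below DMG/6 is moved onto the same value.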

wiki updated.