Illya is mostly correct here. I included block, Shield Oath, Rampart, Sentinel, and Hallowed Ground. I did not include Rage of Halone due to funky mechanics. I could try to account for it as well, though I would need better endgame data -- level 50 mob damage parses, which I cannot provide because I have no level 50 PLD. Any volunteers to provide that data vs. level 50 opponents? You could test on some sad, lonely FATE boss nobody ever touches, I guess. If I had that, I would probably fork into magical and melee cases, since magic also cannot be blocked. Flash's blind effect was not included due to immunity on powerful opponents as well as lack of data. I tried to shy away from guesswork and maybes as much as possible. For WAR, I included several cases:
Wrath Only -- Sit on 15% from Wrath stacks. Exists for comparison with other cases and with PLD. This effect is 60% of Shield Oath alone.
WAR-low -- Average 850 heal (before Maim and Storm's Eye) from Inner Beast. This is a pure-VIT build with ilvl70 gear.
WAR-high -- Average 1200 heal from Inner Beast. This is a pure-STR build with ilvl90 gear. It is slightly above what Kunkka accomplished.
WAR-hold -- This is the WAR-high case with WAR holding onto Wrath stacks except when Infuriate is up. This is the better way of playing the Wrath Only setup, since you get all of the Wrath bonus plus 1 Inner Beast per minute.
All WAR cases include Bloodbath, Thrill of Battle, Berserk, and Second Wind. Neglected were Mercy Stroke, Featherfoot, Bulwark, Awareness, Foresight, Convalesce, Inner Release, and passive regen. Inner Release sucks horrendously, and the rest are either cross-class or have close analogues. They aren't pushing the dial much one way or the other.
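To make the four WAR cases above easier to compare, here is a minimal Python sketch that writes them down as data and converts Inner Beast healing into healing per second. The heal values and the 1 Inner Beast per minute for WAR-hold come from the case descriptions; the `WarCase` class, the `inner_beast_hps` helper, and the `uses_per_minute` parameter are hypothetical names for illustration only. Treating Wrath Only as 0 healing from Inner Beast is an assumption that follows from never spending the stacks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarCase:
    name: str
    avg_inner_beast_heal: float  # average heal per Inner Beast, before Maim and Storm's Eye
    note: str


# Values taken from the case list above; "Wrath Only" never spends stacks,
# so its Inner Beast healing is assumed to be 0.
CASES = {
    "Wrath Only": WarCase("Wrath Only", 0.0, "sit on Wrath stacks"),
    "WAR-low": WarCase("WAR-low", 850.0, "pure-VIT build, ilvl70 gear"),
    "WAR-high": WarCase("WAR-high", 1200.0, "pure-STR build, ilvl90 gear"),
    "WAR-hold": WarCase("WAR-hold", 1200.0, "WAR-high, stacks spent only with Infuriate"),
}


def inner_beast_hps(case: WarCase, uses_per_minute: float) -> float:
    """Healing per second from Inner Beast alone (hypothetical helper).

    uses_per_minute is an input you must supply; only WAR-hold's
    1 use per minute is stated in the post.
    """
    return case.avg_inner_beast_heal * uses_per_minute / 60.0


# WAR-hold: one Inner Beast per minute, as described above.
war_hold_hps = inner_beast_hps(CASES["WAR-hold"], uses_per_minute=1.0)
print(f"WAR-hold Inner Beast healing: {war_hold_hps:.1f} HP/s")
```

This deliberately ignores Bloodbath, Thrill of Battle, Berserk, and Second Wind, so it is only the Inner Beast slice of each case, not the full model.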
Sadly, they are not. Kunkka has 423 strength, 262 Determination and a Bravura +1 and hit around 512 after Maim and Storm's Eye (posted this image verifying it).
Unfortunately, it is not feasible to create such an analysis. You will have to include a dozen cases varying by immunities and damage types, plus specific mechanics. At this time, I find it wholly unnecessary to even attempt -- the goal is to create a reasonable, representative sample case and balance toward it. There will be places where WAR actually has some relative advantages. For example, magical opponents which cannot be blocked but can be dodged offer WAR a relative advantage: WAR can activate Featherfoot to reduce damage while PLD's equivalent, Bulwark, would be wholly ineffective. Designing to this individual case is a fool's errand.
Use of Stoneskin is dubious at best. You will not use it at all except when you have free time or when you desperately need more healing. These are not frequent cases, nor are they of particular impact. I believe Stoneskin is best represented a bit differently: as an extension of PLD's burst eHP pool.
//EDIT:
Here are some of the cases you would need to consider:
-Single target / 2 target / 3 target / 4 target / 5+ target
-Physical damage vs. Magical damage vs. 50/50
-Vulnerable to Stun, Blind, Silence, Slow, Pacification
-Average opportunities for Stoneskin per minute
-Defense passthrough abilities and debuffs
-Ground-target frequency and feasibility of dodging
I count at least 5760 charts needed for that. Better to just keep one or two realistic cases, don't you think?
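A rough sketch of how a count of that size comes together: multiply the number of options on each axis. The first three factors follow directly from the list (5 target counts, 3 damage mixes, and 5 statuses that are each vulnerable or immune, so 2^5 combinations). The post does not say how the last three axes were split, so the level counts below (3, 2, 2) are assumptions picked for illustration; they happen to reproduce 5760, but other splits are possible.

```python
from math import prod

# Number of options per axis. Entries marked "assumed" are NOT stated in the
# post; they are illustrative guesses for how those axes might be bucketed.
factors = {
    "target count (1, 2, 3, 4, 5+)": 5,
    "damage type (physical, magical, 50/50)": 3,
    "status vulnerabilities (5 statuses, yes/no each)": 2 ** 5,
    "Stoneskin opportunities per minute": 3,   # assumed: e.g. none / few / many
    "defense passthrough present": 2,          # assumed: yes / no
    "ground-target dodging feasible": 2,       # assumed: yes / no
}

total_charts = prod(factors.values())

for axis, options in factors.items():
    print(f"{options:>3}  {axis}")
print(f"total charts: {total_charts}")
```

Every extra axis multiplies the total, which is why the count explodes so quickly and why one or two representative cases are far more practical.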