So! Tech talk time. There are two main ways to do animation in video games, "root motion" and "in-place", and the main difference, at a high conceptual level, is that in root motion the animation controls the movement, while in-place generally has the movement control the animation. To elaborate...
"Root motion" means that the animation includes the movement data; if you loaded the animation file into an animation viewer, you would see the character literally move away from the origin point and walk across the screen. When using root motion, the character's animation will perfectly match their movements, because the movement was literally baked in by the animator. Nowadays, you can also do a bit of slope deformation and foot-placement IK so that the character still moves properly over (and stands correctly on) uneven terrain. Obviously, root motion is great for fidelity, and very well-suited to single-player games...
But it is terrible for any multiplayer/online system. The last thing you want is to trust every client to take all movement from the animation, because if you cut an animation short... you'd have no way of knowing whether or not you stopped correctly on every system unless you tried to sync up animation by frame numbers, and that's just... it's a potential nightmare to keep player position and movement in sync across all clients in that scenario.
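To make that concrete, here's a rough sketch of what "taking movement from the animation" amounts to. Everything in it is made up for illustration (the `WALK_ROOT` track, the 4 fps rate, the function names); it isn't any particular engine's API, just the basic idea of sampling the root track and applying the difference.

```python
def sample_root(positions, fps, t):
    """Root position at clip time t (seconds), linearly interpolated and clamped."""
    last = len(positions) - 1          # assumes at least two keyframes
    frame = min(max(t * fps, 0.0), float(last))
    i = min(int(frame), last - 1)      # keyframe at or before t
    a = frame - i                      # blend factor toward keyframe i + 1
    (x0, y0), (x1, y1) = positions[i], positions[i + 1]
    return (x0 + (x1 - x0) * a, y0 + (y1 - y0) * a)


def root_motion_delta(positions, fps, t_prev, t_now):
    """How far the animation says the character moved between two clip times."""
    x0, y0 = sample_root(positions, fps, t_prev)
    x1, y1 = sample_root(positions, fps, t_now)
    return (x1 - x0, y1 - y0)


# Hypothetical 1-second walk clip: 5 root keyframes at 4 fps, moving along +x.
WALK_ROOT = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.5, 0.0), (2.0, 0.0)]

# Two clients cut the same clip short at slightly different times...
client_a = root_motion_delta(WALK_ROOT, 4.0, 0.0, 0.50)
client_b = root_motion_delta(WALK_ROOT, 4.0, 0.0, 0.75)
# ...and now they disagree about where the character ended up.
```

The position is only as trustworthy as everyone's agreement on exactly when the clip stopped, which is the sync problem in a nutshell.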
Thus, most multiplayer systems that use root motion animation at all will use it only for what Unreal Engine calls "montages". For my purposes here, these are animations (or sequences of animations) that you can count on playing in their entirety: a finishing move, a specific animation sequence played when opening a secret door, etc. In essence, I'm using "montage" to mean an animation sequence which will never be cancelled or cut short.
Conversely, "in place" animations are exactly what they sound like: the animation stays entirely in one place, and if you loaded it into an animation viewer you would see the character basically running in place as if on a treadmill. Usually you'll include metadata tracks (such as the speed at which the character is moving) calculated from the animation, but rather than using those to determine how far a character has moved, you will generally use them to slightly change the animation's playback speed when it does not exactly match the character's speed. This means that instead of the animation being the authority on where you are, the server can be (very important in networked gaming!), and changing the animation speed to match your character's velocity can get you "close enough" to have minimal foot sliding.
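The speed-matching trick is simple enough to sketch. The names here are mine, and the 0.8 to 1.2 clamp is an assumption (some limit is usually wanted so the animation doesn't look sped-up or slow-motion), not a number from any particular game.

```python
def play_rate(character_speed, authored_speed, min_rate=0.8, max_rate=1.2):
    """Playback-rate multiplier so the feet roughly keep up with actual movement.

    authored_speed is the ground speed the clip was animated for (the kind of
    value you'd read from the clip's metadata track).
    """
    if authored_speed <= 0.0:
        return 1.0                      # idle or bad data: play at normal speed
    rate = character_speed / authored_speed
    return max(min_rate, min(max_rate, rate))


def residual_slide(character_speed, authored_speed, min_rate=0.8, max_rate=1.2):
    """Speed mismatch left over after rate scaling; this is the visible foot slide."""
    rate = play_rate(character_speed, authored_speed, min_rate, max_rate)
    return character_speed - rate * authored_speed
```

When the server-decided speed is within the clamp range, the residual is zero and the feet track the ground; outside it, whatever is left over shows up as sliding.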
If "close enough" isn't good enough, you can engage in what's called IK locking. Which leads me to a brief digression...
The two ways you can control a character's pose are Forward Kinematics and Inverse Kinematics. Forward Kinematics is where you say "the shoulder is rotated this way, then the elbow is rotated this way, etc., and all of that means the hand is here"... basically, you move forward along the chain of bones/joints in a character rig. Most animation is stored as forward kinematics. Inverse Kinematics, unsurprisingly, goes the opposite direction, traveling backwards up the chain (from hand to wrist to elbow, etc.); instead of the hand's location being determined by all the previous bones, you say "the hand is here", and trust an IK solver to figure out how the wrist, elbow, shoulder, etc., need to be rotated to put it there.
Obviously, this is more computationally expensive than Forward Kinematics, but not so much as to be prohibitive on modern systems.
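For the curious, here is about the smallest FK/IK pair I can write: a two-bone arm in 2D, shoulder at the origin. This is a textbook analytic solve (law of cosines), not what any given engine ships; real solvers work in 3D and handle joint limits, pole vectors, and longer chains. The function names and default bone lengths are just for illustration.

```python
import math


def fk(shoulder, elbow, upper=1.0, lower=1.0):
    """Forward: joint angles (radians) in, hand position out."""
    ex = upper * math.cos(shoulder)
    ey = upper * math.sin(shoulder)
    # The elbow angle is relative to the upper arm, so the angles accumulate.
    hx = ex + lower * math.cos(shoulder + elbow)
    hy = ey + lower * math.sin(shoulder + elbow)
    return (hx, hy)


def ik(target, upper=1.0, lower=1.0):
    """Inverse: desired hand position in, (shoulder, elbow) angles out."""
    x, y = target
    # Clamp the reach so an out-of-range target just fully extends (or folds) the arm.
    d = min(max(math.hypot(x, y), abs(upper - lower)), upper + lower)
    cos_elbow = (d * d - upper * upper - lower * lower) / (2.0 * upper * lower)
    elbow = math.acos(min(1.0, max(-1.0, cos_elbow)))
    shoulder = math.atan2(y, x) - math.atan2(
        lower * math.sin(elbow), upper + lower * math.cos(elbow)
    )
    return (shoulder, elbow)
```

Feeding the output of `ik` back into `fk` should land the hand on the target whenever the target is within reach. Note that FK is a handful of sines and cosines, while IK needs the extra inverse trig and clamping even in this toy case.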
So "IK locking" is when you say "the foot touched the ground here, so it will remain in this spot until I say otherwise". Then, for every frame of animation, you calculate the character pose, move the foot back to that spot, and let the IK solver resolve how that changes the rest of the character's pose; this means the feet will never slide, because you're altering the animation on the fly, frame by frame, to ensure they don't. Similarly, "slope deformation" is when, for every frame of animation, you shift each foot to the height of the terrain beneath it, and let the IK solver recalculate the character's pose/balance. That is how a character can stand on uneven terrain in many games.
(IK locking is generally not needed or used much for root motion systems, but slope deformation still is.)
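The locking half of IK locking is almost embarrassingly small; it's the solver that does the real work. A sketch, assuming the animation gives you a per-frame "this foot is planted" flag (a hypothetical metadata track) and that some IK solver consumes the returned target:

```python
class FootLock:
    """Pins a foot's IK target to where it first touched down."""

    def __init__(self):
        self.locked_at = None           # world position of the current plant, if any

    def update(self, animated_foot_world, planted):
        """Return the position the IK solver should put the foot at this frame."""
        if not planted:
            self.locked_at = None       # foot is in the air: follow the animation
            return animated_foot_world
        if self.locked_at is None:
            self.locked_at = animated_foot_world   # first frame of contact
        return self.locked_at           # keep it pinned while planted


# Hypothetical frames: where the raw animation would put the foot as the body moves.
animated = [(0.5, 0.0), (1.0, 0.0), (1.5, 0.0), (3.2, 0.1)]
planted = [True, True, True, False]

lock = FootLock()
targets = [lock.update(pos, flag) for pos, flag in zip(animated, planted)]
```

Here the raw animation would have dragged the foot from 0.5 to 1.5 while it was supposedly planted; the lock holds the target at 0.5 until the foot lifts.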
Now, IK locking is easily done for a couple of characters... but if you have many, many characters on screen all at once, that many IK solvers would start to get a bit much for a lot of systems, especially on the lower end. Thus, many MMOs do little to no IK -- generally just slope deformation, which you can kind of cheese to do very cheaply (computationally speaking). Adjusting animation speed helps to keep the sliding feel to a minimum. (In the case of FFXIV, I'm not actually sure they have IK data at all; so far as I can tell, this game works solely on Forward Kinematics, hence why we do not even have slope deformation and our character will stand with feet level even on uneven terrain.)
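I don't know exactly which shortcut any particular MMO takes for slope deformation, so treat this as one guess at a cheap version: one terrain height lookup per foot, drop the hips to the lower foot, and lift the other foot by the difference. `ground_height` stands in for whatever terrain query the engine provides, and everything is in 1D to keep it short.

```python
def slope_adjust(feet_x, ground_height):
    """Cheap slope deformation: one terrain sample per foot, no full-body solve.

    feet_x        -- horizontal position of each foot
    ground_height -- function mapping a horizontal position to terrain height
    Returns (pelvis_offset, foot_lifts): how far to shift the hips vertically,
    and how far each foot must then be raised relative to the hips.
    """
    heights = [ground_height(x) for x in feet_x]
    pelvis_offset = min(heights)        # hips follow the lowest foot so it can reach
    foot_lifts = [h - pelvis_offset for h in heights]
    return pelvis_offset, foot_lifts


def ramp(x):
    """Hypothetical terrain: a constant upward slope."""
    return 0.5 * x


pelvis_offset, foot_lifts = slope_adjust([-0.2, 0.2], ramp)
```

Each raised foot would still go through a (two-bone) leg IK solve, but that's far cheaper than re-solving the whole body for every character on screen.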
Conversely, while I never poked around in the game engine back in 1.x, all the behavior I remember suggests to me that in the name of animation fidelity they used root motion animation. Which is a thought that strikes me as actually insane with regard to any sort of networked movement code, much less an MMO. And the only way that would be reasonably viable is if you treated all animations like montages... i.e., you did not blend between animations, and instead had many, many small fragmentary animations, every one of which had to play out in full before the next one could start.
If you divided the animations up into much smaller chunks than most games do, you could minimize this to some extent, but—for instance—even if you had a "running, left foot steps forward" and a "running, right foot steps forward", if someone tried to stop as the left foot had just begun to lift from the ground, you'd still have to play out the entire sequence until it was down again. (And at a run, this little bit of extra movement can be enough to carry you over the boundary you were trying to stop at.)
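To put rough numbers on that overshoot: if a chunk has to finish before you can stop, the extra distance is just speed times whatever is left of the chunk. The figures in the comments are invented for illustration, not measured from 1.x or any other game.

```python
def overshoot_distance(speed, chunk_duration, time_into_chunk):
    """Distance still covered after a stop request if the chunk must play out in full."""
    remaining = max(0.0, chunk_duration - time_into_chunk)
    return speed * remaining


# Hypothetical: running at 6 units/s, each single-step chunk lasts 0.3 s,
# and the player lets go of the stick 0.05 s into the step.
extra = overshoot_distance(6.0, 0.3, 0.05)
```

With those made-up numbers you'd travel roughly another 1.5 units before the character could actually stop, which is plenty to carry you over a ledge or past a boundary.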
Which, anything else aside, precisely matches my recollection of what movement in 1.x was like. So while I don't know that they used root motion, I sure suspect that was the case.
(Here ends the lecture; I'd apologize for my verbosity, but... I mean, we all know I'm going to do it again soon enough. My signature on these forums even admits as much.)