Honestly, this is nothing new in the gaming industry when it comes to hand-animated vs. pre-baked animation and lip sync in cutscene-heavy games.

Remember the PS2 days? Back in FFX and Kingdom Hearts 1 and 2 (as well as other games), they swapped models constantly during cutscenes to save resources and time. Basically, scenes with higher plot priority and presentation got the high-poly models with full lip sync, and everything else got the lower-poly models, where the mouths are just textures that flap when the characters talk, along with pre-baked animations.

You see this all over the place once you start noticing the technique: Red Dead Redemption 2, Horizon Zero Dawn, Resident Evil, etc.

I think the issue is that in those older games, the stylization of the models helped keep the quality somewhat consistent. In the modern era, with much higher graphical fidelity, the cost-saving method is a lot more jarring because the dip in model and animation quality between cutscenes is so much bigger.

Also, we were probably all dumb kids who didn't give two shits about such things, so we didn't complain about it as much back then.