Many cutscenes in FFXIV struggle with poor pacing, especially those that depict little more than characters speaking unvoiced dialog.
Part of this is because poor cinematography has become the accepted standard, making the cutscenes visually and artistically uninteresting. But I'd like to focus instead on a factor that may be easier to solve, which is the rigidity of dialog bubbles.
Dialog bubbles can currently only be displayed one at a time in the bottom center of the screen. This leads to characters present in the scene having to "wait their turn" before they can react, whether that means emoting, interrupting, moving, or simply responding to the speaker.
It's turn-based dialog. When the dialog is engaging it's not a problem. When it starts to feel like boilerplate, it's a slog.
Instead, we could improve the reaction time of participants in the scene by having floating dialog bubbles, much like chat bubbles in the overworld but styled closer to the existing cutscene dialog bubbles.
With floating bubbles, characters can be timed to respond to one another before the user has finished reading the initial dialog. This allows for more seamless back-and-forth conversation, more natural reaction timing, and so on.
As an example, take the first cutscene in Vagrant Story. Note how, even though this is an exposition-heavy scene consisting exclusively of people talking, the use of floating dialog bubbles allows for a smoother flow of conversation and a feeling of faster pacing.
https://youtu.be/SDThRX_SKs8?si=Y0s6aQOBMZMSQIlJ
As Vagrant Story demonstrates, floating dialog bubbles allow the devs to play further with the visuals of a scene, staging shots like a panel from a comic, and using positive and negative space in effective ways.
Floating dialog bubbles would help improve the overall quality of unvoiced cutscenes in FFXIV and repair their tarnished reputation.