The problem with instances is limited resources. You need to be able to "spin up" and "spin down" instances. When you use instances for something like housing, you create permanent resource losses, because now you can't spin down the map when nobody is in it; it has to run in perpetuity. The correct thing to do in a different game would be to "stretch" the map, e.g. by adding new cells between point A and point B. Throw in a convenient excuse like a volcano or an earthquake, and you're done.
But to go back a bit, this only works when there isn't a Minecraft-like building component. Overcoming that in such a game actually requires overlapping protection cells.
E.g. the cells occupied by housing A and housing B overlap at point C, which is a protection buffer. Building anything in the protection buffer cells requires an agreement between the A and B cells. Should A or B be permanently removed, the protection cell will still exist; it only goes away if both are removed. If both are spun down and then spun back up at a later date (say a single person walks into A, through point C, to B), it would be seamless, as if they had never been spun down.
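To make the buffer rule concrete, here is a rough Python sketch. Everything in it (the BufferCell class, the can_build and remove_owner names) is made up for illustration and not taken from any real game. It also assumes that once one owner is permanently removed, the remaining owner's approval alone is enough to build.

```python
from dataclasses import dataclass, field


@dataclass
class BufferCell:
    """Hypothetical protection buffer C shared by overlapping housing cells."""
    owners: set = field(default_factory=set)  # housing cells that still exist

    def can_build(self, approvals):
        # Building requires agreement from every owner that still exists.
        # Assumption: a removed owner no longer has a say.
        return bool(self.owners) and self.owners <= set(approvals)

    def remove_owner(self, owner):
        # Permanently remove one housing cell.
        # Returns True while the buffer still has at least one owner.
        self.owners.discard(owner)
        return bool(self.owners)


buffer_c = BufferCell(owners={"A", "B"})
only_a_agrees = buffer_c.can_build({"A"})        # B has not agreed
both_agree = buffer_c.can_build({"A", "B"})      # A and B both agreed
exists_after_a = buffer_c.remove_owner("A")      # B still holds the buffer
exists_after_b = buffer_c.remove_owner("B")      # nobody left, buffer can go
```

Spinning A or B down temporarily would not touch owners at all; only a permanent removal does, which is why walking back through C later looks seamless.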
At any rate, real-world technology has existed for decades for solving "how do you put everyone in the world on the same network", because that's exactly how the GSM system works (albeit in a much more complicated way than I'm describing), and it's also how the current LTE cellular system works. Every device gets a unique IMEI (that's your "client device") and every subscriber gets a unique IMSI (that's stored on the subscriber identity module, the SIM). Your permission to use any cell in the world is determined by the IMSI being authenticated with your carrier, which also gives permission for you to be charged, while the network can check the IMEI to decide whether the device itself is allowed on. Your "home" register is where your phone number lives (e.g. NPA-NXX) and you're registered as a "visitor" whenever you go somewhere else. They also do "instancing" by having multiple frequencies, and CDMA works by everyone speaking at the same time but in different languages. So capacity only declines when there aren't enough small cells, and the fix is to add more small cells instead of trying to build large ones.
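Translated into game terms, the home/visitor idea might look something like the sketch below. It is a toy model, not how GSM actually implements it, and the Registry class, its methods, and the region names are all invented for the example.

```python
class Registry:
    """Toy home/visitor registry, loosely modelled on the idea above."""

    def __init__(self):
        self.home = {}      # subscriber id -> home region
        self.visiting = {}  # subscriber id -> region currently being visited

    def enroll(self, subscriber, home_region):
        # The home region is the one place where the subscriber's record lives.
        self.home[subscriber] = home_region

    def attach(self, subscriber, region):
        # Unknown subscribers are refused: nobody vouches for them.
        if subscriber not in self.home:
            raise KeyError(f"unknown subscriber: {subscriber}")
        if region == self.home[subscriber]:
            self.visiting.pop(subscriber, None)  # back home, drop visitor entry
            return "home"
        self.visiting[subscriber] = region       # temporary visitor entry
        return "visitor"


registry = Registry()
registry.enroll("player-1", "region-east")
first = registry.attach("player-1", "region-west")   # roaming somewhere else
second = registry.attach("player-1", "region-east")  # returned home
```

The point is that the visited region only keeps a temporary entry, so any cell can accept any player without holding everyone's permanent record.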
Therefore, you get better use of hardware by being able to spin up and spin down small cells rather than large ones. A cell could be as small as a room or as large as a cow pasture; it just depends on how many people are going to be consistently present. This is where "virtualization" would actually work as promised. In practice, game companies have been making awful use of virtualization, choosing instead to put MORE instances on the same equipment to save costs rather than spinning up smaller instances on demand.
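As a minimal sketch of on-demand cells, assuming a cell only needs to run while at least one player is inside it, something like this would do. The CellManager class and the cell and player names are hypothetical, and a real server would also have to save and restore cell state, which is left out here.

```python
class CellManager:
    """Hypothetical manager that only keeps occupied cells running."""

    def __init__(self):
        # cell id -> set of player ids; a cell appears here only while running
        self.occupants = {}

    def enter(self, player, cell):
        # First player in spins the cell up; later players just join it.
        self.occupants.setdefault(cell, set()).add(player)

    def leave(self, player, cell):
        players = self.occupants.get(cell)
        if players is None:
            return  # cell is not running, nothing to do
        players.discard(player)
        if not players:
            del self.occupants[cell]  # last player out spins the cell down

    def running(self):
        return sorted(self.occupants)


manager = CellManager()
manager.enter("p1", "room")
manager.enter("p2", "pasture")
manager.leave("p1", "room")          # room is now empty
cells_running = manager.running()    # only the pasture is left running
```

With small cells, an empty room costs nothing, whereas one huge cell has to stay up as long as a single person is anywhere inside it.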
Anyway, to go back to the OP's question: FFXIV's V1.0 had super-large cells (basically all of Ul'dah was one), while V2.0's maps were all cut up into about 5 pieces each. Considering how copy-pasta V1.0's landscape was, this was actually for the better, as it removed a bunch of nonsense from the Shroud and from Limsa. It also allowed them to actually create unique towns where there were once just two tents and a Crystal.