So this fix "helped" to get in the game at around 6pm BST but if you dare to get a 90k or anything else that would disconnect you after that time, have fun in a queue of 2000 people -_-

So this fix "helped" to get in the game at around 6pm BST but if you dare to get a 90k or anything else that would disconnect you after that time, have fun in a queue of 2000 people -_-


I've never had problems with queues, but even I look at that situation with little sympathy. Your 2 hour window of play time is mildly interrupted, so you can't do party content? As I understand it, for a great many players, getting online at all in a two hour window was nearly impossible for a while there.
After the relatively short roulette queue I just had, I still feel that this has been a net benefit for everyone.

The lockout is right in the middle of my prime time (8pm). You have to stop queueing for instances about 20 minutes before the lockout so you don't get caught inside, so for those of us with limited play time, 30 minutes is not trivial. I would see no problem if this were a one-off, but with no permanent solution on the horizon it could last a while.
I still don't understand why they blanket-lock out all the servers instead of just the specific ones with the problem.
I'm glad my lost time and inconvenience are helping others log in on their own servers... but I chose my server specifically to avoid these problems, so you'll excuse me if I feel a "little bit" annoyed by all this.
Quite frankly, if this continues too long I will have no other option than to transfer to another data centre for a more convenient lockout time.
The first bit there... that is actually a legitimate complaint! But here's the problem Square is likely facing:
Each server cluster probably runs off of the SAME code. Each shared data center resource has the SAME code. The beauty of that is that when you do a new code deploy, you can have an automated service go out and push the SAME code to every node, and the update takes minutes instead of hours.
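Just to illustrate what I mean (a purely hypothetical Python sketch, with made-up node names, paths, and commands, not anything Square has actually published), a "one build, push it to every node" deploy is basically just a loop:

```python
#!/usr/bin/env python3
"""Hypothetical sketch of a 'same code everywhere' deploy push.
The node names, paths, and build location are all invented for illustration."""

import subprocess
import sys

BUILD_DIR = "/srv/builds/current/"   # one build artifact, the SAME bits for everyone
NODES = [                            # every world server in the data centre
    "world-01", "world-02",
    "world-03", "world-04",
]

def push(node: str) -> bool:
    """Copy the build to a single node with rsync; return True on success."""
    result = subprocess.run(
        ["rsync", "-a", "--delete", BUILD_DIR, f"{node}:/opt/game/"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        print(f"[FAIL] {node}: {result.stderr.strip()}", file=sys.stderr)
        return False
    print(f"[ OK ] {node}")
    return True

if __name__ == "__main__":
    failures = [node for node in NODES if not push(node)]
    # One code path, one artifact, zero per-server exceptions:
    # either every node is on the new build, or the whole deploy is flagged as failed.
    sys.exit(1 if failures else 0)
```

The whole point is that there is no per-server branch in there for anyone to get wrong.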
If they were to "make a code exception" for server-specific restarts... that would clutter the process, and invite a WHOLE LOT of human error into a process that is probably completely automated. Trust me, human error and servers are about as healthy of a mix as electricity and water.
Bottom line: they probably don't want their code to have exceptions, they want to keep it all standardized for future development cycles, and they want to keep the disruptions consistent across the population. Besides, everyone is treated fairly, read as: the same. Isn't that what everyone is about these days? Fair treatment and equal access (to getting unceremoniously booted every day).
It is rough, but fighting the same tired argument without any context as to why they do what they do, or why they didn't do it differently... well, that just sounds pointless. And I know none of their engineers/sysadmins are about to come out here and explain to us WHY.

Thank you ... at last I can understand some of the decisions.

We know this isn't entirely true. They have taken down individual servers before, more than once. And you have a pretty poor system if you design it in such a way that you can't take problem pieces down to fix them and have to take the whole thing down every time there is a problem with one area. Which it doesn't appear that they do, since, as I mentioned, they've taken just the problem pieces down before.
So tell me again how this is supposed to help the login queues? Kicking everyone out has created a login queue of over 1,000 people on my server, and it's only gonna get worse as peak time begins to roll around.
I love this game but holy cow, how many bad decisions are SE gonna continue to make in this frankly shambolic launch period?
So that specific and targeted server fix... that was likely done for a specific reason, by a specific person or team. That is a manual process. It was probably done for a very specific reason that was unplanned (like a failing SAN, or they lost a blade on a cluster).
I know I'm getting a little deep here, but the point is: emergency maintenance is likely reserved for exactly that kind of event, and this would not fall into "emergency" maintenance. They're not going to sit and measure server performance vs load, and then reboot a server when the performance does not match what is expected for the load (aka people are idle and not consuming system resources). This would be what is known as a process. A temporary process, but a process. And you want processes to be divorced from human intervention as much as possible. Humans are slower, less efficient, and more accident-prone than scripts and code. If possible, you want code to handle the event.
Likely, what I would do is set up a crontab job (task scheduler) to send out notices and then actually kick off the reboot job. Your team then monitors the condition of the job, because this is pretty high profile and you'll want eyes on it, to make sure everything executes properly and to respond quickly in the event things do not go well.
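Something along these lines, purely as my own back-of-the-napkin sketch (the crontab timing and the broadcast/restart commands are all made up, this is not SE's actual tooling): cron fires the script, the script warns players a few times, then kicks off the restarts, and the team just watches the output.

```python
#!/usr/bin/env python3
"""Hypothetical cron-driven restart job. Hosts, commands, and timings are
invented for illustration only.

Example crontab entry, firing 30 minutes before a daily lockout:
    30 19 * * *  /opt/ops/nightly_restart.py
"""

import subprocess
import time

WORLDS = ["world-a", "world-b", "world-c"]   # made-up world names
WARNING_MINUTES = (30, 10, 5)                # notices at T-30, T-10, T-5

def broadcast(world: str, message: str) -> None:
    # Stand-in for whatever in-game notice tool actually exists.
    subprocess.run(["/opt/ops/broadcast", world, message], check=False)

def restart(world: str) -> int:
    # Stand-in for the real restart command; the exit code tells the
    # on-call team which worlds need hands-on attention.
    return subprocess.run(["/opt/ops/restart-world", world], check=False).returncode

if __name__ == "__main__":
    deadline = time.time() + WARNING_MINUTES[0] * 60
    for minutes in WARNING_MINUTES:
        # Sleep until the next warning point, then notify every world.
        time.sleep(max(0, deadline - minutes * 60 - time.time()))
        for world in WORLDS:
            broadcast(world, f"Servers restart in {minutes} minutes.")
    time.sleep(max(0, deadline - time.time()))
    for world in WORLDS:
        code = restart(world)
        status = "restarted" if code == 0 else f"FAILED (exit {code})"
        print(f"{world}: {status}")
```

The humans aren't in the loop for the routine part; they're just watching for the FAILED lines.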
I would also venture to say that Square's team is meagre in size when compared to their server footprint.
Guess one can never truly leave Balmung. You can take the person out of Balmung, but not Balmung out of the person. The game will be forever haunted by it.
This is for all the years that people hated and bashed Balmung to death lol.
Balmung strikes back! xP
The true test of AFK will be the weekend, since the game was super congested then.