I get far more RMT spam in FF14 than in other MMOs I've played in recent history. The current anti-spam measures are obviously insufficient. Based on my experience (10+ years of mail system administration), here are the key deficiencies I've identified and some realistic technical solutions that would massively curtail RMT spam:
Reporting procedure improvements:
- Currently, reporting players for RMT spam is a tedious, manual process. Players need to be able to right-click on a message and select "report as RMT spam". Doing so should automatically generate a report to the Special Task Force with all the relevant information (reporting and reported character name, server, the text of the reported message, etc.)
- Along similar lines, incentivize making accurate spam reports. Players are apathetic towards reporting spam because it requires effort (however trivial) and because they lack confidence in its efficacy. Offer minor rewards to players based on the number of valid spam reports they file. A currency/point based system would work well here.
- Players should have a (possibly negative) spam reporter score based on the number of correct and incorrect spam reports they've filed. Each report filed for a message/character should nudge a review priority metric by an amount determined by the spam reporter score of the reporting character. This at once discourages making false reports (players with sufficiently negative reporting scores should themselves be subject to review) while improving response time for reports that have a high probability of being accurate.
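
To make the reporter score idea concrete, here is a minimal sketch of how it might work. Everything in it is an assumption for illustration: the `ReportTracker` class, the smoothed score formula, and the -0.5 review threshold are hypothetical and not anything the game actually uses. The point is only that a report from a reliable reporter moves the review priority more than one from an unreliable reporter.

```python
from collections import defaultdict


class ReportTracker:
    """Hypothetical sketch: weight each spam report by the reporter's track record."""

    def __init__(self):
        self.valid = defaultdict(int)       # reporter -> reports confirmed as spam
        self.invalid = defaultdict(int)     # reporter -> reports rejected on review
        self.priority = defaultdict(float)  # reported character -> review priority

    def reporter_score(self, reporter):
        """Score in (-1, 1); 0 for a reporter with no history (assumed formula)."""
        v = self.valid.get(reporter, 0)
        i = self.invalid.get(reporter, 0)
        return (v - i) / (v + i + 2)

    def file_report(self, reporter, target):
        """Nudge the target's review priority by a weight between 0 and 2."""
        self.priority[target] += 1.0 + self.reporter_score(reporter)
        return self.priority[target]

    def resolve(self, reporter, was_spam):
        """Record a reviewer's verdict on one of the reporter's reports."""
        if was_spam:
            self.valid[reporter] += 1
        else:
            self.invalid[reporter] += 1

    def review_queue(self):
        """Reported characters, highest review priority first."""
        return sorted(self.priority, key=self.priority.get, reverse=True)

    def reporters_needing_review(self, threshold=-0.5):
        """Reporters whose score is low enough that they should be reviewed too."""
        seen = set(self.valid) | set(self.invalid)
        return sorted(r for r in seen if self.reporter_score(r) < threshold)
```

With a formula like this, a brand new reporter contributes a weight of 1.0, a reporter with a long record of confirmed reports approaches 2.0, and a chronic false reporter approaches 0 while also showing up in `reporters_needing_review()`.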
Filtering improvements:
- Subject every /tell, /say, /yell, and /shout to a Bayesian and/or Hidden Markov Model classifier. Preferably both. Messages with a sufficiently high score (say, 99.9% probability) should be automatically blocked with a system message to the sender along the lines of "This message has been automatically classified as spam. If you believe this is in error, blah, blah, blah". With adequate training, it is vanishingly unlikely that legitimate players will ever see this message, while RMT spammers will see it virtually 100% of the time.
- You will obviously need large, accurate, auto-aging samples of HAM (good) and SPAM messages to facilitate this. The SPAM sample can trivially be compiled from verified spam reports. The HAM sample will have to come from the playerbase itself to be statistically valid. I recommend an incentivized opt-in system similar to the authenticator initiative. Privacy assurances must be made and kept. Players who choose to opt in need to be assured that their messages will not be retained and will be used solely for training the spam filter, not for any other purpose.
- Bulk message autoblocking. Intelligently thumbprint every message sent and use this to maintain a count of how often each message has been seen. This bulk counter should be used as part of your heuristic for preemptively blocking messages. I frequently see RMT spam with random letters appended to the end of each message. This tells me that you're probably already doing this, but it also tells me that you aren't doing it very smartly, since a handful of random characters is enough to make each copy look unique. Look into the thumbprinting systems used by (for instance) the Razor and Pyzor networks for inspiration on how to thumbprint messages intelligently.
- Account aging and behavior need to figure into the spam filtering metrics. The vast majority of RMT spam comes from newly created burner accounts. New accounts need to be subjected to a higher level of scrutiny and a lower threshold for preemptive blocking. This needs to be done intelligently. Raw account age and played time are insufficient. Spammers are smart. They can simply age their accounts before using them to spam. This needs to be measured by some meaningful unit of activity that is frequent for legitimate players but infrequent for spammers and gil farmers. A pinch of statistical analysis is worth its weight in gold. Sometimes the answers are unexpected. For instance, how frequently do normal players use the dye action vs. gil farmers and spammers? I'd be willing to bet there's a statistically significant difference...
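
To illustrate the bulk counter point above, here is a rough sketch of a thumbprint that survives random letters tacked onto the end of a message. This is not how Razor or Pyzor actually work; it is a simpler stand-in that breaks each normalized message into overlapping 5-character chunks and treats two messages as "the same" when most of their chunks overlap (Jaccard similarity). The names `shingles`, `jaccard`, `BulkCounter`, and the 0.7 threshold are all made up for the example.

```python
import re


def shingles(message, k=5):
    """Lowercase, collapse punctuation/whitespace, and return the set of k-char chunks."""
    text = re.sub(r"[^a-z0-9]+", " ", message.lower()).strip()
    if len(text) <= k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a, b):
    """Fraction of chunks shared between two chunk sets (0.0 to 1.0)."""
    return len(a & b) / len(a | b)


class BulkCounter:
    """Hypothetical sketch: count near-duplicate messages instead of exact copies."""

    def __init__(self, similarity=0.7):
        self.similarity = similarity  # assumed threshold; would need tuning on real data
        self.clusters = []            # each entry is [chunk_set, times_seen]

    def observe(self, message):
        """Return how many times this message (or a near-duplicate) has been seen."""
        chunks = shingles(message)
        for cluster in self.clusters:
            if jaccard(chunks, cluster[0]) >= self.similarity:
                cluster[1] += 1
                return cluster[1]
        self.clusters.append([chunks, 1])
        return 1
```

An exact hash of the full message changes completely when five random letters are appended, but only a handful of the 5-character chunks change, so the count keeps climbing. The linear scan over clusters is only for readability; a real system would need an index to do this at chat volume.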