Erfwiki talk:SpamWars

So You're Not An Ad-Men and You Want To Fight Spam

Just a few things for all the new spam-fighters: first, we already have a tag to slap on the spam-bots and their pages. Just hit 'em with [[category:spammer]], and they'll show up on a page we've already made for it.

Second, we've been having a few waves of what I've been calling "Cheerleaders". They'll hit random pages and either insert meaningless fluff like "OMG ur so smatr" or replace content with it. Rather than mark the page for deletion, if you could just tag the spammer on their user talk page and undo the edits, that'd be great.

Third... thank you so much. The more people who help, the easier this becomes. --No one in particular 15:53, 1 May 2011 (UTC)

Note, keeping an eye on Uncategorized helps; most of the new spam pages are uncategorized at creation. Abb3w 03:17, 3 May 2011 (UTC)
The Black Hand of Nod has been patrolling the Recent Changes page, in an attempt to catch new spam before it propagates further. Already seven spam-related entries on that page have been fed to our Purifying Flames. --The Black Hand 17:41, 5 May 2011 (UTC)
One more thing: While the Spammers just create new pages, the Cheerleaders will actually replace content. Therefore, we ask that instead of just deleting their changes, you hit the rollback button. It's a one-click action that is helpful when they've deleted huge sections of plot summary, or wiki templates. Additionally, blanking pages keeps them from showing up in the dead-end list, making it a little harder to find them. Again, if you could tag them with [[category:spammer]], it'll make it a lot easier to process them. Thank you all, and I'm sure it'll all be over by Christmas. -- No one in particular 20:51, 4 May 2011 (UTC)
Further note, {{spammer}} takes less typing than [[category:spammer]]. However, it looks like the addition of KittenAuth means round one is now over. Abb3w 21:07, 9 May 2011 (UTC)

Captchas & Filters

  • STrRedWolf here from Stalag '99. You should add the ReCaptcha plugin to MediaWiki, which cuts down on 99% of the spam. I'm using it on my Canmephian Library Wiki.
  • Archaic here, webmaster of Bulbagarden / Bulbapedia, founder of the Nintendo Independent Wiki Alliance, and a big fan of the comic. Heard you're having some spam issues, so I checked what some of our network partners are doing in dealing with similar issues. I see you've already got ConfirmEdit and SpamBlacklist, but you might want to consider using QuestyCaptcha for your ConfirmEdit; it makes it a lot stronger than the math questions. Just base the questions off the comic and you should be good to go.

Also consider getting the following extensions: Abuse Filter, AntiSpoof, Title Blacklist, TorBlock.

The AbuseFilter extension seems most useful for the spam you guys are getting. You could prevent non-autoconfirmed accounts from creating pages that consist of a title followed by an external link, which seems to be the form of all the recent spam (spotchecked in recent changes). Someone'd hafta' brush up on their regex, though. Pages that contain the regex ==<center>\[[A-Za-z0-9/_:\.\?+] <big>'''<u>[A-Za-z][A-Za-z\s]*</u>'''</big></center>== might work, you'd need to double-check; it's a first shot.
Other items that seem like they'd help are building some tags users can use to notify admins when things need to be deleted. Something simple like {{delete}} would work best, which categorizes pages into, say, Category:Pages to be deleted. Anything besides the category is somewhat optional, but a box at the top where they can put in a reason for deletion would also help. I'd suggest ripping code out of similar Wikipedia templates for the boxes. Cheers, everyone. 14:38, 1 May 2011 (UTC)
Me and Rpeh both wrote up the template at the same time. I saved mine over his, even though it's much uglier and crappier, 'cuz it seemed like he just stole the code from Wikipedia's {{db-meta}}, which wouldn't work all that well without the other templates it builds off of. So, template and category are working, use them as you all like. Cheers. Lifebaka 15:21, 1 May 2011 (UTC)
I went back to the previous version. I took the code from here, not directly from Wikipedia. The version I copied doesn't use anything from WP so there's no problem with it. rpeh •TCE 15:48, 1 May 2011 (UTC)
  • I don't have the webmastery credentials that these other guys do, but I just wanted to point you here for info direct from MediaWiki on fighting spam, in case you haven't seen it yet. Also, I strongly suggest not permitting anonymous editing (such as I'm doing here). --Dachannien
  • Solutions we use over at - We locked down editing to logged-in users only; that was the first step. The second step was to make it so you had to request a user account from an admin. Spammers were getting to be too annoying to allow open editing or even open registration. We got tired of killing new accounts.
  • My experience in combating spam on topic-specific sites is that, as a rule, Captchas don't work: spammers have solved them by using cheap slave labor to answer the Captchas, or to post the spam, for them. Emails don't work: there are many free email providers out there, and you can't really block them all without blocking real users. Restricting user accounts blocks out good users. What does work is a domain-specific mini-quiz. Ask a question or two about Erf as your Captcha replacement. There's nobody editing a wiki who doesn't know something about the subject at hand, or, at least, nobody who SHOULD be editing it. A real user should be able to answer such questions from memory. A spammer won't have the slightest clue what the answer is, and a generalized solution that can be applied to defeating other popular Captcha systems will be useless, because it's domain-specific. Someone would actually have to read the comic to answer that question... which all real editors do, right?
This is excellent advice. But you should add how this could be implemented by someone who doesn't have that much experience with MediaWiki.-- 12:46, 6 May 2011 (UTC)
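One way to set up such a quiz is the QuestyCaptcha module of ConfirmEdit, mentioned earlier on this page. A minimal LocalSettings.php sketch follows. It assumes a ConfirmEdit release from around MediaWiki 1.16 that is loaded with require_once and reads its questions from $wgCaptchaQuestions. The questions and answers are placeholders, not real Erfworld trivia, and the exact setup may differ on other versions.

```php
# LocalSettings.php fragment (sketch; extension paths and version are assumptions)
require_once( "$IP/extensions/ConfirmEdit/ConfirmEdit.php" );
require_once( "$IP/extensions/ConfirmEdit/QuestyCaptcha.php" );
$wgCaptchaClass = 'QuestyCaptcha';

# PLACEHOLDER question/answer pairs: replace them with questions that a
# reader of the comic can answer from memory.
$quiz = array(
    'PLACEHOLDER question about the comic?' => 'placeholder answer',
    'Another PLACEHOLDER question?'         => 'another answer',
);
foreach ( $quiz as $question => $answer ) {
    $wgCaptchaQuestions[] = array( 'question' => $question, 'answer' => $answer );
}

# Ask a question on account creation, on page creation, and on edits
# that add new external links.
$wgCaptchaTriggers['createaccount'] = true;
$wgCaptchaTriggers['create']        = true;
$wgCaptchaTriggers['addurl']        = true;
```

A fixed list can be learned by a determined spammer, so it probably helps to keep several questions in the list and swap them out now and then.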

This is for MediaWiki 1.6:

# Prevent new user registrations by anyone
$wgGroupPermissions['*']['createaccount'] = false;

# Restrict anonymous editing / tools showing
$wgShowIPinHeader = false;

# Stop anonymous editing
$wgGroupPermissions['*']['edit'] = false;

# Anonymous users can't create talk pages
$wgGroupPermissions['*']['createtalk'] = false;

# Anonymous users can't create pages
$wgGroupPermissions['*']['createpage'] = false;

That would appear to be a huge overreaction. If people can't even create accounts you're going to lose 99% of your potential editors. Rpeh 14:36, 1 May 2011 (UTC)
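A less drastic variant, in line with the earlier suggestion about non-autoconfirmed accounts, might leave registration open and only hold back page creation until an account has some history. This is a sketch only: the thresholds are arbitrary example values, and $wgAutoConfirmAge and $wgAutoConfirmCount should be checked against the installed MediaWiki version.

```php
# Accounts become "autoconfirmed" after 4 days and 10 edits
# (arbitrary example values, not a recommendation).
$wgAutoConfirmAge   = 4 * 24 * 3600;  # in seconds
$wgAutoConfirmCount = 10;

# Page creation is withheld from anonymous and brand-new users;
# all other rights are left at their defaults.
$wgGroupPermissions['*']['createpage']             = false;
$wgGroupPermissions['user']['createpage']          = false;
$wgGroupPermissions['autoconfirmed']['createpage'] = true;
$wgGroupPermissions['sysop']['createpage']         = true;
```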

Spam Solution

We had exactly the same spam problem on the Oblivion Mod Wiki and the FancyCaptcha module for ConfirmEdit stopped it dead. Try that one. Rpeh 14:35, 1 May 2011 (UTC)

Ever thought of making registration mandatory (with some captcha or similar) for editing? --An unregistered user
Wouldn't help, because most of the current spam is coming from named accounts.
To be honest, if you aren't going to implement some of the ideas on this page, there's no point in asking for help because there's virtually nothing non-admins can do. We can't delete spam pages, we can't block spammers so it's up to the existing admins to sort it out. rpeh •TCE 09:59, 3 May 2011 (UTC)

Suggestions from MediaWiki's page on combating spam

MediaWiki has a page on combating spam. Some of the good suggestions from that page:

  • Add a spam regex using wgSpamRegex, including any content that should not appear in a valid page; this includes markup used only by spammers (such as any attempt to hide links, or many types of HTML formatting), and keywords and phrases used only by spammers.
  • Use the ConfirmEdit extension, to force a CAPTCHA on any users attempting to add a new external link, as well as users attempting to register a new account. It looks like this wiki may already do that, using SimpleCaptcha, but you might want to use one of the stronger CAPTCHA modules, as I strongly suspect spammers have automated ways to solve SimpleCaptcha. Try MathCaptcha to make it use images, or ReCaptcha to use that external service. I'd recommend against using QuestyCaptcha, as it only seems to support a fixed list of questions, which won't stop spammers for long.
  • Use an IP address blacklist of known spam IPs, which tracks spammers from other wikis and forums.
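As a rough illustration of the first suggestion, a $wgSpamRegex entry in LocalSettings.php might look like the sketch below. The pattern is an assumption for demonstration (inline styles that hide elements, plus two placeholder phrases), not a list tuned to this wiki's spam. Any edit that matches is rejected, so it should be tested against real pages before use.

```php
# Reject any edit whose text matches this pattern (case-insensitive).
# First alternative: inline styles that hide an element.
# Remaining alternatives: PLACEHOLDER phrases; substitute phrases that
# only ever appear in spam.
$wgSpamRegex = '/style\s*=\s*["\'][^"\']*(display\s*:\s*none|visibility\s*:\s*hidden)'
    . '|placeholder spam phrase one'
    . '|placeholder spam phrase two/i';
```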

--JoshTriplett 02:33, 2 May 2011 (UTC)

can give an indication of whether an IP address has a track record of such behaviour as comment spamming. It is dynamic, not a static blacklist, which is better practice. Feeding data back to such projects can also be a deterrent to spammers.

A couple of other ideas

Consider whether your page content actually needs to contain URLs at all. Intra-wiki links can be done with wiki tags, so if external URLs are rare in your content then blocking page edits which create pages containing them may be more useful than annoying. Humans are good at reading a partial URL and putting it back together, but it defeats any benefit for a link-farmer if their bots can't create URLs that are accessible to a search-engine spider.

Use your robots.txt file to make it harder for link-farmers to benefit. Putting the NOFOLLOW meta tag on your wiki pages will also keep Google from giving link credit for them. Most of the spam is badly behaved robots trying to impress Google's well-behaved one - you can't control the badly behaved bots directly, but by keeping better control of the well-behaved ones you can remove their prize.

N.B. The current robots.txt set-up means the spammer still wins even when you "delete" a spam page. You're allowing search engines to index your page history pages, including the old revision with the spammer's full text. Similar issue for page excerpts on "recent changes" page.
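On the MediaWiki side, part of this can be expressed in LocalSettings.php rather than in robots.txt. The sketch below assumes a version that supports $wgNoFollowLinks and $wgNamespaceRobotPolicies. The choice of namespaces is only an example, and history or diff URLs would still need their own robots.txt rules, which are not shown here.

```php
# Add rel="nofollow" to external links so they earn no search-engine credit.
# This is normally the default; it is stated here to make it explicit.
$wgNoFollowLinks = true;

# Ask well-behaved crawlers not to index user and user-talk pages
# (example choice of namespaces, adjust as needed).
$wgNamespaceRobotPolicies = array(
    NS_USER      => 'noindex,nofollow',
    NS_USER_TALK => 'noindex,nofollow',
);
```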

Note, however, that spammers will continue spamming regardless of whether their spam has any benefit. --JoshTriplett 06:55, 2 May 2011 (UTC)

Unlikely to persist forever. In my experience there are two rewards - search engine rank and feeling superior to the site owner. When both dry up, there are plenty of other targets out there. Resources are not infinite for the spammers any more than they are for the defenders. They can certainly afford to waste some and to use some on spite, but they're not really any less concerned with utility vs futility of their actions than the defenders are.

All your current spam contains this string: MjE3fHwxMzA0MTQ1NDIzfHwxOTUyfHwoRU5HSU5FKSBNZWRpYVdpa2k

This may be the tag that (when a spider reaches the site that is the object of the spam) confirms the origin of the successful link-spam. Such mechanisms are used to help automate the process - sites that allow spam to generate hits can be automatically re-used for the next spamming cycle and sites that generate no hits drop from the list.

After a few hours of rapid reversal of these spams, the link farming has been replaced with a human vandal. This is not uncommon as while the spammer is investigating the reasons behind a fall in productivity they often meddle to test the environment, for amusement or spite and to muddy the waters and confuse the defenders. This phase does not generally last very long before some automated actions resume, or the spammer decides not to bother with this target further.

Signs of automation of vandalism: change descriptions set to a random alphabetic sequence involving capital letters. Clearly distinguishable from patterns formed by keyboard-bashing, which tend to cluster around the home keys and have a high incidence of repetition. This is often the final phase of the encounter, but without technical or policy changes the site will eventually be found again.

Some of the latest addresses are interesting: (AS41822) seems to belong to honest-to-badness criminals. It's running an open proxy on port 80, also.

The phrases used in the vandalism are now beginning to repeat. Blocking those distinctive phrases would reduce the success rate during the vandalism phase drastically.

First change of signature string for the money-maker spams, now: OTN8fDEzMDQxMzc0NDJ8fDE5MDZ8fChFTkdJTkUpIE1lZGlhV2lraQ

And then straight back to the earlier signature again. The named spamming accounts each post at least one spam and there seems to be a backlog of quite a number of such accounts which haven't yet posted. It would be wise to keep them muzzled - you're unlikely to see the end of this if the link-farming machinery keeps scoring hits.
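Since both signature strings quoted above are fixed text, one hedged option is to reject any edit containing them via $wgSpamRegex. The sketch assumes that no other value has been assigned to $wgSpamRegex (otherwise these alternatives would need merging into the existing pattern) and that the strings never occur in legitimate content.

```php
# Block edits containing either signature string seen in the link spam so far.
$wgSpamRegex = '/MjE3fHwxMzA0MTQ1NDIzfHwxOTUyfHwoRU5HSU5FKSBNZWRpYVdpa2k'
    . '|OTN8fDEzMDQxMzc0NDJ8fDE5MDZ8fChFTkdJTkUpIE1lZGlhV2lraQ/';
```

Both strings appear to be Base64. Decoded, they read "217||1304145423||1952||(ENGINE) MediaWiki" and "93||1304137442||1906||(ENGINE) MediaWiki", so the leading fields probably vary between runs and a literal match may need updating when a new signature shows up.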

Check Spambots

The Check Spambots extension (see here) seems very promising to me. As a coadmin on a little phpBB forum somewhere, I vet all newly registered users through StopForumSpam, BotScout, Project Honeypot and a few other helpful databases. A plugin that does that job automatically seems like a very good idea to me. :)

Oh, and something else that can be useful: FlaggedRevs will make it so that even after an edit, a page will still display the last "validated" version rather than the very latest version. That means you have to have a few trusted patrollers who'll check recent changes regularly to validate or revert changes made by anonymous and regular users, but on the upside spambots and vandals can do their worst without getting their search-engine optimization rewards.

Some Advice

Hi there.

  1. SpamBlacklist is already installed. Good. Make sure you include the MetaWiki-Blacklist (instructions).
  2. Use MediaWiki:Spam-blacklist (you need to know some basics about regular expressions; have a look at the Metawiki-Blacklist). You may need to use MediaWiki:Spam-whitelist too, in case you decide to block (for example) all Russian sites (\.ru) or blogspot\.com.
  3. Consider Extension:SimpleAntiSpam.
  4. Upgrade MediaWiki! This wiki uses 1.14.0; 1.16.5 is the current version. I don't know if this will help fight the spam, but it will make your wiki safer, as there have been some security fixes.
  5. $wgAccountCreationThrottle = 1; Nobody will be able to create more than one account per day.
  6. Have a look at Extension:Wiki-httpbl. It blocks spammers using "Project Honey Pot".
  7. $wgGroupPermissions['sysop']['deleterevision'] = true; sysops will be able to delete spammed revisions.
  8. Consider blocking spam IPs for more than 2 weeks; a month or so should be good.
  9. Think about getting some additional administrators.
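For reference, items 1, 5 and 7 above could sit together in LocalSettings.php roughly as follows. This is a sketch under assumptions: the extension path and the raw-blacklist URL follow the SpamBlacklist documentation as I understand it, and both should be verified against the installed version.

```php
# Item 1: SpamBlacklist fed from the shared Wikimedia (Meta) blacklist.
# The URL below is an assumption taken from the extension's documentation.
require_once( "$IP/extensions/SpamBlacklist/SpamBlacklist.php" );
$wgSpamBlacklistFiles = array(
    "http://meta.wikimedia.org/w/index.php?title=Spam_blacklist&action=raw&sb_ver=1",
);

# Item 5: at most one new account per IP address per day.
$wgAccountCreationThrottle = 1;

# Item 7: let sysops delete spammed revisions.
$wgGroupPermissions['sysop']['deleterevision'] = true;
```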

That's about all I can think of. We had a lot of spam in 2006, but after installing the blacklist, the simple anti-spam and the Wiki-httpbl, and some blocking, reverting, deleting and blacklisting, the spam disappeared. We haven't had any spam for quite some time now.-- 21:36, 5 May 2011 (UTC)

"Your edit includes new external links. To help protect against automated spam, please solve the simple sum below and enter the answer in the box (more info):" - why does that function need cookies enabled?! And why doesn't it say so? You need to change that: "Your edit includes new external links. To help protect against automated spam, please solve the simple sum below and enter the answer in the box (needs cookies enabled; more info):"-- 21:39, 5 May 2011 (UTC)
One more thing: Think about using pywikipediabot. I would use this bot to mass-delete all pages marked as spam by wikiusers. If you have any questions, feel free to contact me.--Baumgeist 13:08, 6 May 2011 (UTC)
    • Something else to consider - find some way of keeping the {{delete}} template from tripping that captcha. We nuked a few spam pages a little while ago, and replacing the spam with that template tripped the "new external links" filter, even though the links in question weren't technically external links (they were links to the history page and the delete page function). --The Black Hand of Nod (We're in control here!) 17:27, 6 May 2011 (UTC)
  • Would it be appropriate to update the robots.txt file to disallow indexing of the wiki? My understanding is that a large part of the reason why spambots bother with these sites has as much to do with SEO as anything. So even though a robots.txt won't be respected by spambots, the fact that Google does respect the robots.txt might reduce the incentive to vandalize the pages. --Techieman 06:06, 7 May 2011 (UTC)

Fix your security

Okay, your problem appears to stem from allowing anon page editing, for a start.

So, in order:

1) Change your wiki config - do not allow anonymous editing

Here's the info on how to do this:

2) Require all editors to have a registered account.

3) Create a 'trusted' pool of editors. Manually add people to this pool.

This wiki doesn't need to be open slather for the world; yes, you will lose some contributors, but that is better than spammers and script kiddies.

4) Disable all 'anonymous' editing

5) I suggest having a page to record the IPs and other info of the attackers. You might like to make this page visible to registered users only.

Do this, for a start.
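Steps 1 to 4 could be sketched in LocalSettings.php as below. The group name 'trusted' is hypothetical, and the sketch assumes that membership is then granted by hand through Special:UserRights.

```php
# Steps 1, 2 and 4: no editing without an account.
$wgGroupPermissions['*']['edit'] = false;

# Step 3: only hand-picked editors (and sysops) may edit.
# 'trusted' is a made-up group name; giving it a right here is what
# makes the group exist.
$wgGroupPermissions['user']['edit']    = false;
$wgGroupPermissions['trusted']['edit'] = true;
$wgGroupPermissions['sysop']['edit']   = true;
```

With these settings a freshly registered account can log in but cannot edit until someone adds it to the trusted group, which addresses the point that most of the spam comes from named accounts.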

It's not the anonymous bots that are causing the major trouble, it's the ones making accounts and new pages. (Although I do agree that anonymous posting should be seriously restricted.) --ChroniclerC 06:33, 15 May 2011 (UTC)