Sophos UTM sets ambitious goals; and fails to score

Okay, I’m being a bit unfair in singling out Sophos here, but they’re a current source of irritation. Like all security vendors, they’re selling products that don’t work. Actually, Sophos is one of the few larger players that will talk about this honestly, which is why they have been my first choice recommendation for a long time.

The problem is that if you have companies selling “total security” products, which are nothing of the sort, the public are likely to believe such a thing is possible. If you describe your product realistically the idiots will look elsewhere, purchasing based on the most outrageous claims. A look at the Sophos customer base suggests they’re not selling to idiots.

So what’s my problem with Sophos at the moment? Well, I’m falling foul of their UTM Web Defender at educational establishments. Some of my information sites are unclassified on their list of web sites, and so they’re blocked. They contain educational material that I use when teaching. Not helpful.

Okay, this isn’t default behaviour and the establishments in question have made a decision to block anything that Sophos hasn’t classified yet. Some of these sites have been there since 1992, so presumably there’s a long backlog. And this illustrates the problem very nicely; there are over 300,000,000 domain names registered, with 1,000,000 being added every month. Web filtering companies have to look at all these web sites, and sub-domain web sites, and classify them all. It’s an impossible task. I know Sophos does this manually: heroic, but doomed to failure.

The World Wide Web was created to allow the sharing of knowledge; particularly academic and research information. Unfortunately this is just the kind of web site that’s likely to remain unclassified by content filters; obscure links to non-commercial servers giving the information needed for research.

There is a solution. A few years ago I decided to write my own web search engine for a laugh. I then modified it to try and figure out what the web sites were about. Google has built an empire on doing this extremely well, but my quick heuristic solution did a pretty good job.

So here’s what Sophos et al. should do. When their web defender appliance hits an unclassified site it should automatically submit it to them for evaluation. An automated system using heuristics can then figure out the likely classification, with a probability threshold below which a human checks the result.

This doesn’t have to be instant to be a hell of a lot better than their current system. To get past a Sophos filter (for example) you have to manually submit every site to them by filling in a form, and then they’ll go and classify it within a week. Possibly. And in reality, who’s going to submit such a request to access a web site they can’t actually view because it’s blocked as “unclassified”? There’s a hole in their bucket!
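To make the proposal concrete, here is an added example (my own sketch, not Sophos’s code or anything from their appliance) of the two pieces it needs: a keyword heuristic that turns page text into a category with a confidence score, and a submission service that a filter would call when it meets an unclassified host. Results above the threshold are accepted automatically; anything below it lands in a queue for a human. The category list, the keyword weights and the sample pages are all constructed for illustration.

```python
"""Heuristic site categorisation with a human-review threshold.

Added example: not Sophos's classifier. Categories, keyword weights and
sample page text are constructed for illustration only.
"""
import re
from collections import Counter
from urllib.parse import urlsplit

# Constructed keyword weights per category (assumed values, not a vendor list).
CATEGORY_KEYWORDS = {
    "education": {"lecture": 3, "syllabus": 3, "university": 2, "course": 2,
                  "research": 2, "students": 1, "tutorial": 2},
    "shopping": {"checkout": 3, "basket": 3, "price": 2, "buy": 2,
                 "discount": 2, "shipping": 2},
    "gambling": {"casino": 3, "bet": 3, "odds": 2, "jackpot": 3, "poker": 2},
}

WORD_RE = re.compile(r"[a-z]+")


def tokenize(text):
    """Lower-case word counts; punctuation and digits are ignored."""
    return Counter(WORD_RE.findall(text.lower()))


def score_categories(text):
    """Return {category: share of total keyword weight}.

    With no keyword hits at all the distribution is uniform, which keeps
    the confidence low and forces a human review.
    """
    counts = tokenize(text)
    raw = {cat: sum(w * counts[k] for k, w in kws.items())
           for cat, kws in CATEGORY_KEYWORDS.items()}
    total = sum(raw.values())
    if total == 0:
        return {cat: 1.0 / len(raw) for cat in raw}
    return {cat: s / total for cat, s in raw.items()}


def classify(text, threshold=0.8):
    """Pick the best category and route by confidence against the threshold."""
    probs = score_categories(text)
    best = max(probs, key=probs.get)
    confidence = probs[best]
    route = "auto" if confidence >= threshold else "human_review"
    return {"category": best, "confidence": confidence, "route": route}


class ClassificationService:
    """What the vendor side would run when an appliance submits a site."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.verdicts = {}   # host -> category decided automatically
        self.pending = {}    # host -> result waiting for a human

    def submit(self, url, page_text):
        """Submit one unclassified URL; each host is evaluated once."""
        host = urlsplit(url).hostname
        if host in self.verdicts or host in self.pending:
            return "already_submitted"
        result = classify(page_text, self.threshold)
        if result["route"] == "auto":
            self.verdicts[host] = result["category"]
        else:
            self.pending[host] = result
        return result["route"]


# Constructed sample pages.
EDU_PAGE = ("Lecture notes and syllabus for the university course on network "
            "research. Students may download each tutorial.")
MIXED_PAGE = "Students can buy the lecture notes here."

if __name__ == "__main__":
    svc = ClassificationService()
    print(svc.submit("http://edu.example/notes", EDU_PAGE), svc.verdicts)
    print(svc.submit("http://mixed.example/", MIXED_PAGE), svc.pending)
    print(svc.submit("http://edu.example/other", ""))
```

The point of the design is the route, not the scoring: a crude keyword heuristic is fine for the clear cases, and the threshold decides how much of the backlog the humans actually see. A blank or unfamiliar page scores uniformly and therefore always goes to a person, so the heuristic can only shrink the review queue, never silently misfile a site.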

The spammed malware attack continues, but Microsoft SE has been getting it wrong

Kudos to Microsoft Security Essentials for picking up the nasty attachment being pumped out like crazy by the clean-skin botnet recently, while most of the other scanners failed to detect it. However, it was wrong about the identity of the malware. It’s not Peals.F!plock, as I originally reported with skepticism. It’s now detected as a variation of something known as Troj/DocDl-YU (to use the name given by Sophos). Read about it here:

https://www.sophos.com/en-us/threat-center/threat-analyses/viruses-and-spyware/Troj~DocDl-YU/detailed-analysis.aspx

This uses Microsoft’s Office macro language to download further malware from the Internet and install it on the victim’s PC, so if anyone activates it there’ll be more than just this Trojan downloader to worry about. As it’s a Microsoft Word document, people tend to open it. If the government really wants to spend money telling the public how to avoid falling victim to cybercrime, they should start by warning about sending documents by email, instead of the current nonsense. Microsoft might get the hump, though, and as I understand it, they’re acting as advisors.

If people have macros disabled in Word, they’re probably okay as long as they don’t get tricked into enabling them. I’m not hopeful in this regard.
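Since the danger here is a Word attachment that carries a macro, the useful place to catch it is before a user ever sees the “enable content” prompt. The following is an added example of mine (not from the original post, and not how Sophos or Microsoft detect this family) showing a mail-gateway style check that decides from the attachment’s contents, not its extension, whether it carries VBA code. Modern Office files are zip containers and store macros in a `vbaProject.bin` part; legacy binary `.doc` files use the OLE container, which this check flags for a deeper scan rather than trying to parse. The sample attachments are constructed in memory.

```python
"""Flag Office attachments that carry VBA macros.

Added example, not vendor code. Sample attachments are constructed here;
the macro part is a placeholder blob, not real VBA.
"""
import io
import zipfile

# Magic bytes of the legacy OLE compound file used by binary .doc/.xls/.ppt.
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"
# Zip entry names that hold the VBA project in macro-enabled OOXML files.
MACRO_PARTS = ("word/vbaproject.bin", "xl/vbaproject.bin", "ppt/vbaproject.bin")


def inspect_attachment(data):
    """Classify raw attachment bytes.

    Returns one of: 'macro' (OOXML with a VBA project), 'clean' (OOXML
    without one), 'legacy_ole' (binary Office file, macros possible),
    'malformed' (claims to be a zip but is not), or 'not_office'.
    """
    if data.startswith(OLE_MAGIC):
        # Binary .doc can hold macros too; needs an OLE parser, so hold it.
        return "legacy_ole"
    if not data.startswith(b"PK"):
        return "not_office"
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            names = {name.lower() for name in zf.namelist()}
    except zipfile.BadZipFile:
        return "malformed"
    if any(part in names for part in MACRO_PARTS):
        return "macro"
    return "clean"


def should_quarantine(filename, data):
    """Policy: hold anything with macros, anything we cannot fully inspect,
    and anything whose extension disagrees with its contents."""
    verdict = inspect_attachment(data)
    if verdict in ("macro", "legacy_ole", "malformed"):
        return True
    claims_macros = filename.lower().endswith((".docm", ".xlsm", ".pptm"))
    # A file named .docm but without a VBA part is odd enough to look at.
    return claims_macros and verdict == "clean"


def build_sample_docx(with_macro):
    """Construct a minimal OOXML-shaped zip for testing (not a real document)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("word/document.xml", "<w:document/>")
        if with_macro:
            zf.writestr("word/vbaProject.bin", b"constructed placeholder")
    return buf.getvalue()


if __name__ == "__main__":
    for name, blob in [("Invoice.docm", build_sample_docx(True)),
                       ("Invoice.docx", build_sample_docx(False)),
                       ("Invoice.doc", OLE_MAGIC + b"\x00" * 16),
                       ("Invoice.pdf", b"%PDF-1.4\n")]:
        print(name, inspect_attachment(blob), should_quarantine(name, blob))
```

Note that the check deliberately ignores what the filename says: a macro-bearing file renamed to `.docx` still contains the `vbaProject.bin` part and is still flagged. The legacy OLE branch is the honest gap; a proper gateway would hand those files to an OLE-aware parser rather than pass them on.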

Meanwhile, those behind it are changing the message and tweaking the payload to avoid detection – quite successfully! The latest incarnation reads:

From: UUSCOTLAND@example.com

Subject: Water Services Invoice

Good Morning,

I hope you are well.

Please find attached the water services invoice summary for the billing period of 22 September 2015 to 22 October 2015.


If you would like any more help, or information, please contact me on 0345 #######. Our office is open between 9.00am and 5.00pm Monday to Friday. I will be happy to help you. Alternatively you can email me at UUSCOTLAND@example.com

Kind regards

Melissa

Melissa Lears

Billing Specialist

Business Retail

United Utilities Scotland

T: 0345 ####### (#####)

They appear to be updating it every morning at around 0800Z. Let’s see what we get tomorrow.