Technology – Page 15 – Frank Leonhardt's Blog

Infosec 2015 – first thoughts.

This is my customary personal blog post on the Infosec Europe show. Specific articles may, or may not, appear here later.

This year the show has moved to Olympia from the defunct Earls Court, which is probably the best choice available. It’s made me nostalgic for the old Personal Computer World shows of the 1980’s. Except there’s not a lot of interesting technology here. The theme of the show seems to be governance and the IT Security industry – governance and compliance rather than solutions to real problems. It’s been the way things have been moving over the last few years, with the modern IT professional being hard pressed to know which end of a soldering iron to hold.

There were a few interesting new(-ish) ideas, and the bleedin’ obvious stuff being packed with a GUI and monetised.

Libra Esva is a good case in point. They’ve taken Linux, spamassassin, Clam-AV (and optional commercial AV products), together with extra filtering and firewall functionality of the kind an old-style UNIX admin would customise their rigs with, and created a virtual appliance with a good looking and easy-to-use front end for users to deploy on VMware and so on. Sure, it takes the fun out of it but it looked good.

ActiveDefence were on hand, offering to launch a DDoS attack on your infrastructure to see how good it was. What, how do you launch a realistic DDoS attack without a botnet? “We have our own,” they said. And they were serious. The service may not be unique, but it’s very rare (unless you hire a bunch of crims, of course – I’ll have to see how prices compare).

KnowBe4’s PR has been bombarding me with their name for a few weeks now; I had to see why. They’re a company after my own heart – they’re launching cyber-security awareness training and consultancy in the UK, at a level appropriate to users and at a price point where SMEs really have no excuse for not doing something about what I (and KnowBe4, obviously) regard as one of the greatest threats. Call it spear phishing or human engineering attacks – the weak link is employees being duped. And the criminals are very sophisticated, so awareness is about the only defence.

I’m off to see some more people who seem to have re-invented the obvious, and put it on the market. They’re using honey-pots to capture IP addresses to dynamically configure firewalls, it appears. Quite what their angle is remains to be seen, but it’s presumably a better honey-pot than we’ve all been writing for years now.

2-June-15 (updated 24-February-21)

The Future of Nominet (AGM report)

Nominet, the not-for-profit company that manages most of the .uk domain space, has been worrying me of late. It replaced a naming committee in 1996 as the volunteers that ran it started to become overwhelmed by the workload, and was set up to be self-sustaining by charging for domain name registrations. Based in Oxford, it now employs 140 people.

They’re worried. Anyone who wants a domain name pretty much has it – or it’s being sat on by a cybersquatter. Either way, Nominet’s getting the residual income from having registered it in the first place, and this is now fixed. Or worse, as the enthusiasm for registering names in the hope of making money from it later wanes, their income may fall as people unload their speculative “investments”.

As well as Nominet employees no longer being kept in the manner to which they’re accustomed, this presents a problem for those dealing in domain names commercially. Call them cybersquatters, domainers or parasites as you wish – domain dealers are making money out of buying and selling domain names. Their portfolio losing value as the bubble bursts could be problematic for them. With new top-level-domains now available, and the importance of a particular domain name falling, this is inevitable.

So, unsurprisingly, Nominet has been talking about expanding in other ways. At today’s AGM, new CEO Russell Haworth was talking about expanding in to adjacent markets. What could this mean? As well as providing domain names, the obvious answer is hosting or other Internet services. Nominet members are going to have a problem with that: Nominet has a monopoly position issuing domain names, a big pile of cash and no way would it be good for anyone if they started competing with UK Internet businesses.

I pushed Russell Haworth on his choice of words. “I have no intention of competing with the channel”, was the emphatic reply. He explicitly rejected the idea of hosting: “It’s not our core business and never will be. The margins are very tight anyway.” This will be a relief to the hosting companies, who know all about tough margins. He continued “I’d like to see us add value to the channel. For example, we sit on a lot of data. We can aggregate that data. There is an opportunity to look at big data. [and derive value from it]”.

Basically, the plan seems to be to analyse domain registration data and DNS traffic, and use it to target areas such as SMEs with a view to selling them something. Quite what they were selling wasn’t spelled out exactly, but domain name registrations seemed to be the only example.

It seems that the current thinking is to sell DNS products, which won’t compete with anyone much (apart from anyone selling DNS products). Why anyone should pay for DNS products is beyond me; but if you can’t manage your own DNS I suppose it’s possible for companies to outsource it. But I really don’t see this replacing the revenue stream, as new domain name registration income stops rising.

Rob Golding from Astutium asked what many of us were thinking – what’s so wrong with the status quo? Why not stick with one revenue stream? Nominet isn’t supposed to be a business and has no need to expand; it’d be okay to contract. Unsurprisingly, Nominet’s view was similar to that of turkeys towards Christmas. “It would be foolish not to look at opportunities to diversify”. Speaking about the saturation of the .uk namespace and future projections, Haworth continued “It’s Darwinian – we’re not going to sit and watch things fall apart. If we see domains trending downwards, Nominet can add value to adjacent markets.”

This is an interesting situation, especially when you see who controls Nominet. Things are voted on, ultimately, by its members. This is weighted to the number of names they have registered. It’s pretty obvious that the large domain name registration businesses are going to have a far greater say than the majority of small members; those that represent the general Internet industry and general public. The big domain dealers will have millions of votes; a normal small ISP might have a few dozen. To counteract this, Nominet limits the votes of any one member to 3%, and has mechanisms in place to stop the big companies simply joining once and splitting their domain portfolios to get multiple 3% blocks. However, one still suspects that, although there appears to be no evidence that the domain dealers collude in their voting, they’re all going to have the same interests and will naturally vote together – this effectively tending to steer Nominet towards policies that support their business model.

Unfortunately there’s no easy way around this. Even if it was one-member-one-vote, large organisations could swamp the membership with their friends. So what keeps Nominet working in the public interest? Ultimately, scrutiny. If it went too far, an outcry could get the Government involved.

It’s also hard to see what Nominet can do in other fields. Their charter requires them to engage only in worthy projects. But according to Haworth, “This doesn’t mean you can’t be commercial.” However, given that Nominet has a huge, secure revenue stream for investment, it clearly does have a commercial advantage over anyone else who has to raise funding through normal channels. We’ve heard this before – Bill Gates famously said that Microsoft was about making the world a better place. Whether that’s his personal philosophy or not, from a corporate perspective it has a hollow ring.

In the mean time, Nominet is intent on expanding its revenue streams. The supposed block votes of the domain dealers (all those 3%s added together) are going to limit Nominet’s ability to compete with them. 123-Reg is never going to allow Nominet to start hosting web sites and damage their own business. So what next? I, for one, will be keeping a close eye on it. I was very much heartened to see that was the general consensus of those present, including Trustees and the board.

31-May-15

Kids can review Kindle books in their parents’ names

Occasionally I write the odd review on Amazon products directly on Amazon. This is normally information I wish I had when I was looking for an item or book. Then, today, I was clicking about and came upon a list of things I’d written about:

Now I don’t remember reviewing E. Nesbit’s classic, and I prefer her Bastable series anyway (although I doubt it’d be PC enough to make in to a film, so its merits are less widely appreciated).

So what’s going on here? And I certainly don’t remember reading “The Ugly Duckling”, illustrated or otherwise.

And then I realised – this was my daughter using a Kindle attached to my account. It appears that it’s possible to rate books from it directly, and this she has obviously done. In my name.

Her pronouncements as to their literary merit may be valid, especially for someone her age, but this needs to be made clear.

I’ve sent some pointed feedback to Amazon on this point, and will wait to see what happens.

10-May-15 (updated 14-January-16)

Microsoft’s Windows 10 Security Update Plan

The headlines on luser news media are all about Windows 10 being the last ever release of Windows. Apparently Microsoft’s plan is to issue incremental updates thereafter. As those in the know, know, this has always been the way. Microsoft only releases a new version when it wants to flog it to the punters as the next great thing, and it does this by giving the latest snapshot of the code a new name (e.g. Windows 7, Windows Vista). Okay, there have been major step-ups; for example Windows 2000 was the marketing name for Windows NT 5.0 (ditching some of the disastrous code in Windows NT 4.x), then came 5.1 – sold to the public as XP. Windows Vista was the next re-write; technically it was Windows 6.0. Confusingly to the punters, 6.1 was flogged as 7 and Windows 8.0 and 8.1 were 6.2 and 6.3 respectively. The reality is that OEM versions of Windows appear frequently, to track the new hardware as it turns up in production machines. It’s only the retail customers that believe in these retail versions. So what is Microsoft really doing?

Well, one effect of having a retail version of Windows is that every three years the punters stop buying new PCs, waiting for the next “version”. As Microsoft actually makes a lot more of its revenue from selling OEM licenses (bundled with PCs) than the retail versions, keeping the hardware manufacturers happy by killing off the boom/bust cycle is probably A Good Thing.

Is Microsoft getting a bit humble, acknowledging that hardware makers have a choice and Windows isn’t the only game in town? I don’t believe they do; the punters want Windows on their desktop PCs, and that’s that. So what is in it for Microsoft?

The clue is in what Terry Myerson was saying at Ignite 2015 in Chicago last week. The new version of Windows will feature greatly enhanced on-line update capabilities, with peer-to-peer patch distribution and a lot more. Patch Tuesday is to be abolished, with updates rolled out on a continuous basis. And all in the name of security.

Let’s play devil’s advocate here, and pretend that Microsoft has other reasons. First off, Patch Tuesday, the monthly release of non-critical Windows updates in an ordered manner, will become obsolete. The policy was originally formulated to avoid patches coming out willy-nilly at odd times in the month and catching IT departments off-guard; and now they’re going back to the old chaotic system. A broken update can knock your IT systems out at any time of the day or night. If this sounds like a recipe for disaster, don’t despair – according to Terry Myerson, patches will be rolled out to the lucky home users first, which means that it can be pulled and business won’t be affected if an update screws up. Enterprise customers will still be given the choice as to which updates they install; it would have been a hard sell to knowledgeable IT people otherwise.

Is this actually going to improve Windows security? Peer-to-peer patch distribution? 24/7 patches coming from Redmond as soon as they’re presumed ready? What could possibly go wrong?

Rather than looking at this as a security fix, I think the policy should be taken in to consideration alongside Microsoft’s move towards licensing, rather than selling, software. They want a continual revenue stream and they don’t like their software pirated. Who does? By moving to an OS model that requires the host to be Internet connected and constantly patching itself, it becomes much harder for cracked versions of the OS or applications to exist. (Microsoft’s own applications, that is). Peer-to-peer updates will make updates harder to block. If a crack turns up in the wild, the next day a patch to kill it can appear from Redmond. And if you stop paying the license fee, your copy of Windows stops working. This last aspect isn’t being talked about openly. I’m just guessing here. But considering Microsoft’s penchant for licensed/rented software of recent years, Windows 10 being released with a mechanism that appears ideal for licence enforcement should they ever decide to move to the rental business model, I think it’s a good guess.

Or it could simply be that Microsoft is panicking over the less-than-warm reception the world gave Windows 8/8.1 and has decided that releasing new retail versions frightens the horses.

2-April-15

Obama to end cyber-attacks

American president Barack Obama is so hacked off with cyber-attacks on US companies (and other interests) that he’s taken a step sure to send the perpetrators running for cover. In an executive order on the 1st of April, he created a new sanctions authority to have a go at anyone attacking the USA. In the statement announcing it he is quoted as saying “Cyber threats pose one of the most serious economic and national security challenges to the United States, and my administration is pursuing a comprehensive strategy to confront them”, describing it as a “national emergency”.

Basically it gives the US Treasury Department the power to freeze the assets of any hackers suspected of attacking the US, in much the same way as it brings peace to places like the Middle East and Ukraine. The criminals behind these attacks are no doubt quaking in their sneakers.

The decision to blame North Korea for the Sony attack told the world that the administration was getting tough, never mind the facts. And the Chinese, of course, deny state-sponsored naughtiness on an apparently daily basis.

The problem is, of course, that it’s somewhat difficult to actually figure out who’s behind an attack. Working out where an attack comes from is possible, and it’s usually from some hijacked computers used to obfuscate the origin. China and various other countries have a higher installed base of pirated software, which often comes with a built-in botnet, so of course attacks come from these places.

Initial opinion in the USA is divided between the law-makers, politicians and the non-technical cyber-security industry heralding it as the beginning of the end for international espionage gangs, and those of us who know how it works wondering if this is an April Fool.

One point I find intriguing, however, is whether this will have an effect on patent disputes. Apparently they’re worried about, and plan to apply these powers to, intellectual property theft. It seems to me that if some technology turned up in a competitor’s product and the American company went crying to the authorities they could have sanctions imposed on the foreign company, without any reasonable way of proving that any theft had taken place – or even who had it first. It could get messy.

22-March-15 (updated 23-March-15)

Security certificates broken on Google Chrome 41

Don’t install the latest release of Google Chrome (41), released on Thursday (Friday UK time). They’ve messed up. Twice.

Broken SSL when talking to routers etc.

The first problem comes when accessing the web interface on a device such as a router over SSL (encrypted). Unfortunately, because the software in these is embedded, the security certificate it uses isn’t going to match the name of the device you use to access it. This would be impossible – when it leaves the factory it hasn’t had its IP address assigned on your site; never mind the DNS entry. Previously browsers have allowed you to ignore this mis-match; the encryption works as long as you’re comfortable, using some other check, that you’re really talking to what you think you are, and once the exception has been stored, this should be the end of the matter.

But not with Chrome release 41. Now it will show you the screen below:

If you ask for more details it doesn’t really give you much:

A secure connection cannot be established because this site uses an unsupported protocol.

Error code: ERR_SSL_VERSION_OR_CIPHER_MISMATCH

This comes from a DrayTek 2820 modem/router, but the problem seems to exist on other networking kit.
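Since the error message points at the protocol or cipher rather than the certificate name, one way to investigate is to ask the device what it will actually negotiate. The following is a minimal Python sketch, not anything Chrome does internally. The function names and the 192.168.1.1 address in the usage note are illustrative assumptions. Certificate checks are switched off on purpose, because an embedded certificate will never match, so this is only sensible against kit on your own network.

```python
import socket
import ssl


def make_lenient_context():
    """Build a TLS context that ignores certificate name and trust errors."""
    ctx = ssl.create_default_context()
    # check_hostname has to be turned off before verify_mode can be relaxed.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def probe(host, port=443, timeout=5.0):
    """Return (protocol version, cipher name) negotiated with the device."""
    ctx = make_lenient_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

Calling probe("192.168.1.1") against a router's admin interface would return something like a protocol name and a cipher name if the handshake succeeds. If it raises ssl.SSLError instead, that suggests the device only offers protocol versions or ciphers that a current TLS library refuses, which would fit the error code above.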

More adverts too – and a malware backdoor

(Please see update below – there may be an innocent explanation for this)

As an extra surprise, those nice people seem to have found a way of blocking URL keyword filters used to keep adverts out from objectionable sources, circumventing methods of blocking Google’s syndicated advertising. I’m still researching this, but the way they appear to have done it means that embedded content from other sources than the site you’re looking at is extremely difficult to block.

It appears Google has done this to protect its revenue stream from adverts, with little regard for the site policies that may exist for reasons Google may not realise. But that’s not the worst of it: how long will it be before this feature of Chrome is used for drive-by downloads? If your firewall isn’t able to cross-check the source of the content on a page, it can be coming from anywhere.

Unfortunately there is no way of rolling back a bad version of Chrome. They really don’t like you doing that, however dangerous a release might be.

I have, of course, made urgent representations to the Chrome project but we will have to wait and see. In the mean time, all I can suggest is that you prevent Chrome from updating beyond version 40.

Update 2015-03-23
On further investigation, the updated Chrome isn’t doing a DNS lookup to find the Google ad-server. I’m unsure whether this is because it somehow cached the DNS results internally or whether it’s hard-wired. It certainly wasn’t using the system cache, but I know Chrome has kept its own cache in the past. If it is from an internal cache, the mechanism used to get the IP address in there in the first place is a mystery, however Google’s ad servers change from time to time and it’s not impossible that the perimeter firewall simply hadn’t kept up and allowed some through.

My next research will be looking more closely at the DNS traffic.

20-March-15 (updated 24-October-21)

FreeBSD hr utility – human readable number filter (man page)

Several years ago I wrote a utility to convert numeric output into human readable format – you know the kind of thing – 12345678 becomes 12M and so on. Although it was very clever in the way it dealt with really big numbers (zettabytes), and in spite of ZFS having really big numbers as a possibility, no really big numbers have actually come my way.

It was always a dilemma as to whether I should use the same humanize_number() function as most of the FreeBSD utilities, which is limited to 64-bit numbers as its input, or stick with my own rolling conversion. In this release, actually written a couple of years ago, I’ve decided to go for standardisation.

You can download it from here. I’ve moved it (24-10-2021) and it’s not on a prettified page yet, but the file you’re looking for is “hr.tar”.

This should work on most current BSD releases, and quite a few Linux distributions. If you want binaries, leave a note in comments and I’ll see what I can do. Otherwise just download, extract and run make && make install

Extracted from the man page:

NAME

hr — Format numbers in human-readable form

SYNOPSIS

hr [-b] [-p] [-ffield] [-sbits] [-wwidth] [file ...]

DESCRIPTION
The hr utility formats numbers taken from the input stream and sends them
to stdout in a format that’s human readable. Specifically, it scales the
number and adds an appropriate suffix (e.g. 1073741824 becomes 1.0G).

The options are as follows:

-b Put a ‘B’ suffix on a number that hasn’t been scaled (for Bytes).

-p Attempt to deal with input fields that have been padded with spaces for formatting purposes.

-wwidth Set the field width to width characters. The default is four (three digits and a suffix). Widths less than four are not normally useful.

-sbits Shift the number being processed left by bits bits, i.e. multiply by 2^bits. This is useful if the number has already been scaled in to units. For example, if the number is in 512-byte blocks then -s9 will multiply the number by 512 before scaling it. If the number was already in Kb use -s10 and so on. In addition to specifying the number of bits to shift as a number you may also use one of the SI suffixes B, K, M, G, T, P, E (upper or lower case).

-ffield Process the number in the numbered field, with fields being numbered from 0 upwards and separated by whitespace.

The hr utility currently uses the humanize_number() function in System Utilities Library (libutil, -lutil) to format the numbers. This will repeatedly divide the input number by 1024 until it fits in to a width of three digits (plus suffix), unless the width is modified by the -w option. Depending on the number of divisions required it will append a k, M, G, T, P or E suffix as appropriate. If the -b option is specified it will append a ‘B’ if no division is required.

If no file names are specified, hr will get its input from stdin. If ‘-’ is specified as one of the file names hr will read from stdin at this point.

If you wish to convert more than one field, simply pipe the output from one hr command into another.

By default the first field (i.e. field 0) is converted, if possible, and the output will be four characters wide including the suffix.

If the field being converted contains non-numeral characters they will be passed through unchanged.

Command line options may appear at any point in the line, and will only take effect from that point onwards. This allows different options to apply to different input files. You may cancel an option by following it with a ‘-’. For consistency, you can also set an option explicitly with a ‘+’. Options may also be combined in a string. For example:

hr -b file1 -b- file2

Will add a ‘B’ suffix when processing file1 but cancel it for file2.

hr -bw5f4p file1

Will set the B suffix option, set the output width to 5 characters, process field 4 and remove excess padding from in front of the original digits.

EXAMPLES
To format the output of an ls -l command’s file size use:

ls -l | hr -p -b -f4

This output will be very similar to the output of “ls -lh” using these options. However the -h option isn’t available with the -ls option on the “find” command. You can use this to achieve it:

find . -ls | hr -p -f6

Finally, if you wish to produce a sorted list of directories by size in human format, try:

du -d1 | sort -n | hr -s10

This assumes that the output of du is the disk usage in kilobytes, hence the need for the -s10

DIAGNOSTICS
The hr utility exits 0 on success, and >0 if an error occurs.
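For anyone curious about the scaling the man page describes (divide by 1024 until the number fits, then add a suffix), here is a rough Python sketch of the idea. It is not the libutil humanize_number() code and won't match its output in every corner case. The function name humanize_sketch and its parameters are made up for illustration, loosely mirroring the -w, -s and -b options.

```python
SUFFIXES = ["", "k", "M", "G", "T", "P", "E"]


def humanize_sketch(value, width=4, shift_bits=0, byte_suffix=False):
    """Scale an integer by 1024 until it fits in (width - 1) digits."""
    # Equivalent of -s: 9 for 512-byte blocks, 10 for kilobytes, and so on.
    value <<= shift_bits
    limit = 10 ** (width - 1)  # one column is reserved for the suffix
    scaled = float(value)
    index = 0
    while round(scaled) >= limit and index < len(SUFFIXES) - 1:
        scaled /= 1024
        index += 1
    if index == 0:
        # No division was needed; optionally mark the number as bytes (-b).
        return f"{value}{'B' if byte_suffix else ''}"
    # Small results keep one decimal place, e.g. "1.0G"; others are rounded.
    if round(scaled, 1) < 10:
        text = f"{scaled:.1f}"
    else:
        text = str(round(scaled))
    return text + SUFFIXES[index]
```

With this sketch, humanize_sketch(1073741824) gives "1.0G" and humanize_sketch(100, shift_bits=10) gives "100k", the latter being the same trick as piping du output through hr -s10.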

16-March-15

Yahoo plans to give up passwords

The latest scheme from Yahoo’s Crazy Ideas Department is to dispense with login passwords. Are they going to replace them with a certificate login or something more secure? Nope! The security-gaffe-prone outfit from Sunnyvale, California has had the genius idea of sending a four-character one-time password to your mobile phone, according to an announcement they made at SXSW yesterday (or possibly today if you’re reading this in the USA).

According to Chris ~~Stoned~~ Stoner, their Product Development Director, the bright idea is to avoid the need to memorise difficult passwords by simply sending a new one, each time, to your registered mobile phone.

At first glance, this sounds a bit like the sensible two-factor authentication you find already: Log in using your password and an additional verification code is sent to your mobile. However, Yahoo has dispensed with the first part – logging in with your normal password. This means that anyone that has physical control of your mobile phone can now hijack your Yahoo account too. If your phone is locked, no matter – just retrieve the SMS using the SIM alone. No need to pwn Yahoo accounts the traditional way.

With an estimated 800,000 mobile phones nicked per year in the UK alone (Source inferred from ONS report) and about 6M handsets a year going AWOL in the USA, you’ve got to wonder what Yahoo was thinking.

Apart from the security risk, what are the chances of being locked out of your email simply because you’re out of mobile range (or if your phone has gone missing)? Double whammy!

4-March-15

The Artificial Intelligence Conspiracy

The Truth about Artificial Intelligence

Last year I was asked, at short notice, to teach an undergraduate Artificial Intelligence module. I haven’t done any serious work in the field since the 1980’s, when it was all the rage. Its proponents were anticipating that it would be a part of life within ten years; as this claim had been made in the early 1970’s I was always a bit dubious, but computer power was increasing exponentially and so I kept an eye on the field. LISP was the thing back then, although I could never see quite how a language that processed lists easily, but was awkward for anything much else, was going to lead to the breakthrough.

So, having had the AI module dumped on me, I did the obvious thing and ran to the library to get out every textbook on the subject. What was the latest? I was surprised to see how far the field had come in the intervening years. It had got nowhere. The textbooks on AI covered pretty much the same as any good book on applied algorithms. The current state-of-the-art in AI is, in fact, applied algorithms with a different name on the cover; no doubt to make it sound more exciting and to make its proponents sound more interesting than mere programmers.

Since then, of course, AI has been in the news. Dr Stephen Hawking came out with a statement that he was worried about AI machines displacing mankind once they got going. Heavy stuff – it’d make a good plot for a sci-fi movie. It was also splashed all over the news media a week before the release of his latest book. The man’s no fool.

With universities having had departments of artificial intelligence for decades now, and consumer products claiming to have embedded AI (from mobile telephones to fuzzy logic thermostats) you may be forgiven for thinking that a breakthrough is imminent. Not from where I’m sitting.

Teaching artificial intelligence is like teaching warp drive technology. If you’ve never seen Star Trek, this is the method by which the Starship Enterprise travels faster than the speed of light by using a warp engine to bend the space around it such that a small movement inside the warp field translates to a much larger movement through “flat” space. Great idea, except that warp generators only exist in science fiction. And so does AI. You can realistically teach quantum physics, but trying to teach warp technology is only for the lunatic fringe. The same is true of AI, although I’m certain those with a career, and research grants, based on the name will beg to differ.

So where are we actually at? How does artificial intelligence as we know it work, and is it going in the right direction? In the absence of the real thing, the term AI is now being used to describe a class of algorithm. A proper algorithm takes input values and produces THE correct answer. For example, a sort algorithm will take as its input an unordered list and produce as output a sorted list. If the algorithm is correct, the output will always be correct, and furthermore it is possible to say how long it will take (worst case) to get the answer, because there is a worst-case number of steps the program will have to take. These are known as “P Problems”, to those who like to talk about how difficult things are to work out in terms of letters rather than plain old English.

Other problems are NP, which basically means that, although you might be able to produce an algorithm to solve them, the universe may have ended before you get the result. In some cases the computation may last an infinite amount of time. For example, one tricky problem would be working out the shortest route from London to Carlisle. Your satnav can work this out for you, of course, but how can you be sure it’s found the one correct answer; the absolute shortest route? In practice, you probably don’t care. You just want a route that works and is reasonably short. To know for sure that there was no shorter route possible you would have to examine every possible turn-after-turn in the complete road network. You can’t prove it’s not shorter to go via Penzance unless you try it. However, realistically, we use heuristics to prune off crazy paths and concentrate on the promising ones and get a result that’s “good enough”. There are a lot of problems like this.
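To make the “good enough” point concrete, here is a small Python sketch of one such heuristic, a greedy best-first search, which always explores whichever town looks closest to the destination. Everything in it is an assumption for illustration: the road links, the distances and the straight-line estimates are made-up figures rather than real mileages, and real satnavs will use something more refined.

```python
import heapq

# Illustrative road links with made-up distances (not real mileages).
ROADS = {
    "London": {"Birmingham": 120, "Leeds": 230, "Penzance": 300},
    "Birmingham": {"London": 120, "Manchester": 90},
    "Manchester": {"Birmingham": 90, "Leeds": 45, "Carlisle": 120},
    "Leeds": {"London": 230, "Manchester": 45, "Carlisle": 115},
    "Penzance": {"London": 300},
    "Carlisle": {"Manchester": 120, "Leeds": 115},
}

# Made-up straight-line guesses of how far each town is from Carlisle.
ESTIMATE = {
    "London": 260, "Birmingham": 170, "Manchester": 100,
    "Leeds": 95, "Penzance": 400, "Carlisle": 0,
}


def greedy_route(start, goal):
    """Follow the most promising-looking town first; stop at the first hit."""
    frontier = [(ESTIMATE[start], start, [start], 0)]
    visited = set()
    while frontier:
        _, town, path, travelled = heapq.heappop(frontier)
        if town == goal:
            return path, travelled
        if town in visited:
            continue
        visited.add(town)
        for neighbour, length in ROADS[town].items():
            if neighbour not in visited:
                heapq.heappush(
                    frontier,
                    (ESTIMATE[neighbour], neighbour,
                     path + [neighbour], travelled + length),
                )
    return None
```

With these invented figures, greedy_route("London", "Carlisle") goes via Leeds for a total of 345 and never looks at Penzance at all. The route via Birmingham and Manchester adds up to 330, so the heuristic has produced a workable answer quickly without any guarantee that it is the shortest.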

A heuristic algorithm sounds better to some people if it’s called an AI algorithm, and with no actual working AI, people like to have something to point to; to justify their job titles. But where does this leave genuine AI?

In the 1970’s the world was seen as lists, or relations (structured data of some kind). If we played about with databases and array (list) processing languages, we’d ignite the spark. If it wasn’t working it was just our failure to classify the world in to relations properly.

When nothing caught fire, Object Oriented Programming became fashionable. Minsky’s idea was that if a computer language could map on to the real world, using code/data (or methods and attributes) to define real-world objects, AI would follow. I remember the debate (around 1989) well. When the “proper” version of C++ appeared, the one with the holy grail of multiple inheritance, the paradigm would take off. Until then C++ was just a syntactical nicety to hide the pointer to the context in a library of functions acting on the same structure layout. We’ve had multiple inheritance for 25 years now, but any conceivable use I’ve seen made of it has been somewhat contrived. I always thought it was a bad idea except for classes inheriting multiple interfaces, which I will concede, but this is hardly the same as inheriting methods and attributes – the stuff that was supposed to map the way the world worked.

The current hope seems to be “whole brain” emulation. If we can just build a large enough neural network, it will come to life. I have to admit that the only tangible reason why I don’t see this working is decades of disappointment. Am I right to be sceptical? Looking at it another way, medical science has progressed by leaps and bounds, but we’re no closer to creating life than when Mary Shelley first wrote about it. However clever we think we are with modern medicine, I don’t think we’re remotely close to reanimating even a single dead cell, never mind creating one.

Perhaps a better place to start is looking at the nature of AI, and how we know we’ve got it. One early test was along the lines of “I’ll be impressed if that thinking machine can play chess!”. This has fallen by the wayside, with Deep Blue finally beating Garry Kasparov in 1997 and settling that question once and for all. But no one is now claiming that Deep Blue was intelligent; it was simply able to calculate more possible outcomes in less time than its human opponent. One interesting point about it was the size of the machine required to do even this.

Another famous measure of AI success is Alan Turing’s test. A smart man, was Mr Turing. Unfortunately his test wasn’t valid (in my humble opinion). Basically, he reckoned that if you were communicating with a computer and couldn’t tell the difference between it and a human correspondent, then you had AI. No you don’t. We’ve all spoken to humans at call centres that do a pretty good impression of a machine; getting a machine to do a good impression of a human isn’t so hard. And it’s not intelligence.

In the late 1970s and early 1980s, computer conversation programs were everywhere (e.g. ELIZA). It’s no surprise; the input/output was basically a Teletype or later a video terminal (glass Teletype), so what else could you write? The pages of publications such as Creative Computing inspired me to write a few such programs myself, which I had running at the local library for the public to have a go at. Many had trouble believing the responses came from the computer rather than me behind a screen (this was in the early days, remember – most had never seen a computer). I called this simulated intelligence, and subsequently wrote about it in my PCW column. And that’s all it was – a simulation of intelligence. And all I’ve seen since has been a simulation; however good the simulation, it’s not the same as the real thing.
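To show how little is needed for this kind of simulation, here is a minimal sketch of a conversation program in the general style of ELIZA: a handful of patterns, each with a canned reply that echoes part of what the user typed. It is not a reconstruction of ELIZA itself or of the library programs mentioned above; the rules, replies and function name are invented for illustration.

```python
import re

# Hypothetical rules: (pattern to look for, reply template echoing the captured text).
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE), "Tell me more about your {0}."),
]

# Used when nothing matches, which keeps the "conversation" moving.
DEFAULT_REPLY = "Please go on."


def respond(line):
    """Return a canned reply based on the first rule that matches the input."""
    text = line.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1))
    return DEFAULT_REPLY


reply = respond("I am tired of computers.")
```

There is no understanding anywhere in this: the program never knows what a computer or a printer is, it only shuffles the user's own words back at them.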

Science fiction writers have defined AI as a machine being aware of itself. I think this is possibly true, but it pushes the problem on to defining self-awareness. I think there’s still merit in the idea anyway; it’s one feature of intelligent life that machines currently lack. A house fly is moderately intelligent; as may be an amoeba. What about a bacterium? Bear in mind that we’ve not created an artificial or simulated intelligence that can do as much as a house fly yet, if you’re thinking of AI as having human-like characteristics. (There is currently research into simulating a fly brain; see Arena, P.; Patane, L.; Termini, P.S.; “An insect brain computational model inspired by Drosophila melanogaster: Simulation results” in The 2010 International Joint Conference on Neural Networks – IJCNN.)

Other AI definitions talk about a machine being able to learn; to take the results of previous decisions to alter subsequent decisions in the pursuance of a goal. This was achieved, at high speed and with infinite resolution, many years ago. It’s called an analogue feedback loop. There’s a lot of bluster about AI systems being more complex and being able to cope with a far wider range of input types than previous systems, but a feedback loop isn’t intelligent, however complex it is.
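For anyone who hasn't met one, here is a minimal sketch of a feedback loop. A real analogue loop is continuous; this is a stepped digital approximation, and the function name, gain and number of steps are arbitrary choices for the example. Each pass measures how far the output is from the goal and uses that result to adjust the next output.

```python
def run_feedback_loop(setpoint, start, gain=0.5, steps=20):
    """Simulate a simple proportional feedback loop.

    On every step the error (goal minus current value) is measured and a
    fraction of it, set by gain, is fed back into the value.
    Returns the list of values, starting with the initial one.
    """
    value = start
    history = [value]
    for _ in range(steps):
        error = setpoint - value   # result of the previous "decision"
        value += gain * error      # alters the next one
        history.append(value)
    return history


history = run_feedback_loop(setpoint=20.0, start=0.0)
```

With these settings the error halves on every step, so the value closes in on 20 very quickly. It "learns" from each previous result in exactly the sense of the definition above, and nobody would call it intelligent.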

So what have we actually got under the heading of AI? A load of heuristic algorithms that can produce answers to problems that can’t be computed for certain; systems that can interact with humans in a natural language; and with enough processing power you can build a complex enough heuristic system to drive a car. Impress your granny by calling this kind of thing AI if you like, and self-awareness doesn’t really matter if the machines do what we want of them. This is just as well, as AI is just as elusive as it was in the 1970s. All we have now is a longer list of examples that aren’t it.

The only viable route I can see to AI is in Whole Brain Emulation, as alluded to above. We are getting to the point now where it is possible to build a neural network complex enough to match a brain. How, exactly, we could kick-start such a machine into thinking is an intriguing problem. Those talking loudest about this kind of technology are thinking in terms of uploading the contents of an existing brain, somehow. Personally, I see a few practical problems that will need solving before this will work, but if we could build such a complex neural network and if we could find a way to teach it, we may just achieve a real artificial intelligence. There are two ifs and a may in there. Worrying too much about where AI technology may lead, however, is like worrying about the effects on human physiology of prolonged exposure to the warp coils on a starship.

2-March-15 / 16-March-15

More comment spammer email analysis

Since my earlier post, I decided to see what change there had been in the email addresses used by comment spammers to register. Here are the results:

Freemail Service	%
hotmail.com	22%
yahoo.com	20%
outlook.com	14%
mailnesia.com	8%
gmail.com	6%
laposte.net	6%
o2.pl	3%
mail.ru	2%
nokiamail.com	2%
emailgratis.info	1%
bk.ru	1%
gmx.com	1%
poczta.pl	1%
yandex.com	1%
list.ru	1%
mail.bg	1%
aol.com	1%
solar.emailind.com	1%
inbox.ru	1%
rediffmail.com	1%
live.com	1%
more-infos-about.com	1%
dispostable.com	<1%
go2.pl	<1%
rubbergrassmats-uk.co.uk	<1%
abv.bg	<1%
fdressesw.com	<1%
freemail.hu	<1%
katomcoupon.com	<1%
tlen.pl	<1%
yahoo.co.uk	<1%
acity.pl	<1%
atrais-kredits24.com	<1%
conventionoftheleft.org	<1%
iidiscounts.org	<1%
interia.pl	<1%
ovi.com	<1%
se.vot.pl	<1%
trolling-google.waw.pl	<1%

As before, domains with <1% are still significant; it’s a huge sample. I’ve only excluded domains with <10 actual attempts.

The differences from 18 months ago are interesting. Firstly, mailnesia.com has dropped from 19% to 6% – however this is because the spam system has decided to block it! Hotmail is also slightly down, while Gmail and AOL are about the same. The big riser is Yahoo, followed by laposte.net (which had the highest percentage rise of them all). O2 in Poland is still strangely popular.

If you want to know how to extract the statistics for yourself, see my earlier post.
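For readers who just want the gist, here is a rough sketch of one way to produce a table like the one above. It is not the method from the earlier post; the function name is hypothetical, and it assumes you already have the registration addresses as a plain list of strings. It also assumes percentages are taken over all valid addresses, including those from domains that fall under the cut-off.

```python
from collections import Counter


def domain_shares(addresses, min_attempts=10):
    """Tally addresses by domain.

    Returns a list of (domain, attempts, percentage) tuples, most common
    first, leaving out domains with fewer than min_attempts attempts.
    Strings without an '@' are ignored.
    """
    domains = [
        address.rsplit("@", 1)[1].strip().lower()
        for address in addresses
        if "@" in address
    ]
    counts = Counter(domains)
    total = sum(counts.values())
    return [
        (domain, attempts, 100.0 * attempts / total)
        for domain, attempts in counts.most_common()
        if attempts >= min_attempts
    ]


# Made-up sample data using reserved .test domains, purely to exercise the function.
sample = ["a@x.test"] * 30 + ["b@Y.test"] * 10 + ["c@z.test"] * 5 + ["not-an-address"]
shares = domain_shares(sample)
```

With the made-up sample, x.test and y.test are reported and z.test is dropped for having fewer than 10 attempts.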