March 2015 – Frank Leonhardt's Blog

Security certificates broken on Google Chrome 41

Don’t install the latest release of Google Chrome (41), released on Thursday (Friday UK time). They’ve messed up. Twice.

Broken SSL when talking to routers etc.

The first problem comes when accessing the web interface on a device such as a router over SSL (encrypted). Unfortunately, because the software in these is embedded, the security certificate it uses isn’t going to match the name of the device you use to access it. This would be impossible – when it leaves the factory it hasn’t had its IP address assigned on your site; never mind the DNS entry. Previously browsers have allowed you to ignore this mismatch; the encryption works as long as you’re comfortable, using some other check, that you’re really talking to what you think you are, and once the exception has been stored, this should be the end of the matter.

But not with Chrome release 41. Now it will show you the screen below:

If you ask for more details it doesn’t really give you much:

A secure connection cannot be established because this site uses an unsupported protocol.

Error code: ERR_SSL_VERSION_OR_CIPHER_MISMATCH

This comes from a DrayTek 2820 modem/router, but the problem seems to exist on other networking kit.
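One way to see what a device is actually offering is to connect to it outside the browser. The following is a minimal sketch, not a definitive diagnostic: the function names are made up for illustration, it deliberately ignores the certificate name mismatch described above, and a modern Python/OpenSSL build may itself refuse very old protocols or ciphers, in which case the connection attempt will simply raise an error.

```python
import socket
import ssl


def make_probe_context() -> ssl.SSLContext:
    """Build a client context that skips certificate and hostname checks.

    This is only suitable for inspecting a device you already trust
    by some other means.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False       # must be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE  # accept the mismatched certificate
    return ctx


def probe(host: str, port: int = 443, timeout: float = 5.0):
    """Return (protocol version, cipher name) negotiated with the device."""
    ctx = make_probe_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

Calling `probe()` with the router's address (whatever it is on your network) either returns the negotiated protocol and cipher or raises `ssl.SSLError`, which at least shows whether the handshake fails outside Chrome as well.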

More adverts too – and a malware backdoor

(Please see update below – there may be an innocent explanation for this)

As an extra surprise, those nice people seem to have found a way of blocking URL keyword filters used to keep adverts out from objectionable sources, circumventing methods of blocking Google’s syndicated advertising. I’m still researching this, but the way they appear to have done it means that embedded content from other sources than the site you’re looking at is extremely difficult to block.

It appears Google has done this to protect its revenue stream from adverts, with little regard for the site policies that may exist for reasons Google may not realise. But that’s not the worst of it: how long will it be before this feature of Chrome is used for drive-by downloads? If your firewall isn’t able to cross-check the source of the content on a page, it can be coming from anywhere.

Unfortunately there is no way of rolling back a bad version of Chrome. They really don’t like you doing that, however dangerous a release might be.

I have, of course, made urgent representations to the Chrome project but we will have to wait and see. In the meantime, all I can suggest is that you prevent Chrome from updating beyond version 40.

Update 2015-03-23
On further investigation, the updated Chrome isn’t doing a DNS lookup to find the Google ad-server. I’m unsure whether this is because it somehow cached the DNS results internally or whether it’s hard-wired. It certainly wasn’t using the system cache, but I know Chrome has kept its own cache in the past. If it is from an internal cache, the mechanism used to get the IP address in there in the first place is a mystery; however, Google’s ad servers change from time to time and it’s not impossible that the perimeter firewall simply hadn’t kept up and allowed some through.

My next research will be looking more closely at the DNS traffic.

20-March-15 / 24-October-21

FreeBSD hr utility – human readable number filter (man page)

Several years ago I wrote a utility to convert numeric output into human readable format – you know the kind of thing – 12345678 becomes 12M and so on. Although it was very clever in the way it dealt with really big numbers (Zettabytes), and in spite of ZFS having really big numbers as a possibility, no really big numbers have actually come my way.

It was always a dilemma as to whether I should use the same humanize_number() function as most of the FreeBSD utilities, which is limited to 64-bit numbers as its input, or stick with my own rolling conversion. In this release, actually written a couple of years ago, I’ve decided to go for standardisation.

You can download it from here. I’ve moved it (24-10-2021) and it’s not on a prettified page yet, but the file you’re looking for is “hr.tar”.

This should work on most current BSD releases, and quite a few Linux distributions. If you want binaries, leave a note in comments and I’ll see what I can do. Otherwise just download, extract and run make && make install

Extracted from the man page:

NAME

hr — Format numbers in human-readable form

SYNOPSIS

hr [-b] [-p] [-ffield] [-sbits] [-wwidth] [file ...]

DESCRIPTION
The hr utility formats numbers taken from the input stream and sends them
to stdout in a format that’s human readable. Specifically, it scales the
number and adds an appropriate suffix (e.g. 1073741824 becomes 1.0G).

The options are as follows:

-b Put a ‘B’ suffix on a number that hasn’t been scaled (for Bytes).

-p Attempt to deal with input fields that have been padded with spaces for formatting purposes.

-wwidth Set the field width to width characters. The default is four
(three digits and a suffix). Widths less than four are not normally useful.

-sbits Shift the number being processed left by bits bits, i.e. multiply
by 2^bits. This is useful if the number has already been scaled in to units. For example, if the number is in 512-byte
blocks then -s9 will multiply the input number by 512 before scaling it. If the number was already in Kb use -s10 and so on.
In addition to specifying the number of bits to shift as a number you may also use one of the SI suffixes B, K, M, G, T, P, E
(upper or lower case).

-ffield Process the number in the numbered field, with fields being numbered from 0 upwards and separated by whitespace.

The hr utility currently uses the humanize_number() function in System Utilities Library (libutil, -lutil) to format the numbers. This will repeatedly divide the input number by 1024 until it fits in to a width of three digits (plus suffix), unless the width is modified by the -w option. Depending on the number of divisions required it will append a k, M, G, T, P or E suffix as appropriate. If the -b option is specified it will append a ‘B’ if no division is required.
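The scaling described above can be sketched in a few lines. This is an illustrative Python approximation, not the libutil code: the function name `humanize` and its parameters are made up here, it only handles non-negative integers, and the exact rounding of the real humanize_number() may differ.

```python
SUFFIXES = ["", "k", "M", "G", "T", "P", "E"]


def humanize(n: int, width: int = 4, byte_suffix: bool = False) -> str:
    """Scale n by 1024 until it fits in width-1 digits, then add a suffix."""
    limit = 10 ** (width - 1)  # first value too big for the digits available
    if n < limit:
        # No division needed; optionally mark the value as bytes (like -b).
        return f"{n}B" if byte_suffix else str(n)
    value = float(n)
    step = 0
    while round(value) >= limit and step < len(SUFFIXES) - 1:
        value /= 1024
        step += 1
    if value < 9.95:
        # Small mantissa: keep one decimal place, e.g. 1.0G
        return f"{value:.1f}{SUFFIXES[step]}"
    return f"{round(value)}{SUFFIXES[step]}"


print(humanize(1073741824))  # 1.0G
print(humanize(12345678))    # 12M
```

With the default width of four, 1073741824 is divided by 1024 three times to give 1.0G, and 12345678 is divided twice to give roughly 11.8, shown as 12M.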

If no file names are specified, hr will get its input from stdin. If ‘-’ is specified as one of the file names hr will read from stdin at this point.

If you wish to convert more than one field, simply pipe the output from one hr command into another.

By default the first field (i.e. field 0) is converted, if possible, and the output will be four characters wide including the suffix.

If the field being converted contains non-numeral characters they will be passed through unchanged.
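As a rough illustration of the field handling, here is a Python sketch that converts one whitespace-separated field and leaves everything else alone. The name `convert_field` and the `convert` callback are assumptions for the example, and it takes "passed through unchanged" to mean the whole field is left as it was; it does not attempt the -p or -w behaviour.

```python
import re


def convert_field(line: str, field: int = 0, convert=str) -> str:
    """Apply convert() to the numbered field if it is purely numeric."""
    # Split on whitespace but keep the whitespace runs, so the original
    # spacing survives when the line is joined back together.
    parts = re.split(r"(\s+)", line)
    index = -1
    for i, part in enumerate(parts):
        if part == "" or part.isspace():
            continue
        index += 1
        if index == field:
            if part.isdigit():
                parts[i] = convert(int(part))
            break  # non-numeric fields fall through untouched
    return "".join(parts)


print(convert_field("x 2048 y", 1, lambda n: f"{n // 1024}k"))  # x 2k y
```

Converting several fields is then just a matter of applying the function more than once, which mirrors piping one hr command into another.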

Command line options may appear at any point in the line, and will only take effect from that point onwards. This allows different options to apply to different input files. You may cancel an option by following it with a ‘-’. For consistency, you can also set an option explicitly with a ‘+’. Options may also be combined in a string. For example:

hr -b file1 -b- file2

Will add a ‘B’ suffix when processing file1 but cancel it for file2.

hr -bw5f4p file1

Will set the B suffix option, set the output width to 5 characters, process field 4 and remove excess padding from in front of the original digits.

EXAMPLES
To format the output of an ls -l command’s file size use:

ls -l | hr -p -b -f4

This output will be very similar to the output of “ls -lh” using these options. However the -h option isn’t available with the -ls option on the “find” command. You can use this to achieve it:

find . -ls | hr -p -f6

Finally, if you wish to produce a sorted list of directories by size in human format, try:

du -d1 | sort -n | hr -s10

This assumes that the output of du is the disk usage in kilobytes, hence the need for the -s10.
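The arithmetic behind -s is just a multiplication by a power of two before the number is scaled. A small Python sketch follows; the function names are invented for illustration, and the mapping of suffix letters to bit counts (K as 10, M as 20 and so on) is an assumption based on the description rather than taken from the hr source.

```python
# Assumed mapping of suffix letters to shift counts.
SHIFT_BY_SUFFIX = {"B": 0, "K": 10, "M": 20, "G": 30, "T": 40, "P": 50, "E": 60}


def parse_shift(arg: str) -> int:
    """Accept either a bit count ('9') or a suffix letter ('k', 'M')."""
    if arg.isdigit():
        return int(arg)
    return SHIFT_BY_SUFFIX[arg.upper()]


def apply_shift(number: int, arg: str) -> int:
    """Multiply number by 2^bits, i.e. convert it back to bytes."""
    return number << parse_shift(arg)


print(apply_shift(3, "9"))   # 3 blocks of 512 bytes -> 1536
print(apply_shift(4, "10"))  # 4 kilobytes -> 4096
```

So a du figure of 4 (kilobytes) becomes 4096 bytes before hr works out which suffix to show.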

DIAGNOSTICS
The hr utility exits 0 on success, and >0 if an error occurs.

16-March-15

Why Jeremy Clarkson Matters

Jeremy Clarkson must feature in the worst nightmares of the trendy liberals that run the BBC. He’s intelligent, articulate and hugely popular, but not politically correct. Whether he’s right or wrong in what he says doesn’t matter. From what I’ve heard of his TV appearances, he comes across as shallow and missing the point 75% of the time. He’s written books and a column in the Sun “Newspaper”, which may turn out to pander less to the need to entertain; I don’t know because I can’t be bothered to read them.

I hear more about Mr Clarkson from the news media, where there appears to be a vendetta against him based on the notion that he says things which, while part of English society for over a century, are no longer politically correct. They’re lambasting him for treading on cracks in the pavement.

The latest row seems to be about him losing his temper after a stressful day’s filming. This isn’t a good thing, but it’s part of life. If he was a celebrity chef, such behaviour would be encouraged.

We should really be sharing a thought for the poor producer on the receiving end of the self-important star’s bad mood and abuse: Oisin Tymon. He appears to have taken the matter professionally, in his stride. He’s working in an industry containing celebrities with large egos placed in stressful situations, and from what little information there is in the public domain, it appears he’s taken the incident on the chin (literally, by some accounts) and just got on with it.

Unfortunately, it’s given Danny Cohen, the BBC Director of Television, the perfect excuse to over-react. Or so he seems to think. It’s clearly being used as an opportunity to silence a voice that doesn’t fit with their left-wing, liberal agenda. I’ve no problem with a left-wing agenda, as long as it’s balanced. The BBC is paid for by society as a whole, and has no business censoring someone who reflects the views of that society, whether they reflect their views or not.

Whether Mr Cohen is pandering to the views of his colleagues is something I can’t tell. There are calls for the wonder-boy of British Television to go instead of Clarkson. One thing’s for sure; there’s always Noreena sitting over the breakfast table to keep him on the one true path. Her published works leave no doubt as to her political and philosophical leanings.

As I believe in hearing all views from our “uniquely funded” state broadcaster, I have no choice but to take a stand in defence of the oaf. Guido Fawkes started a petition, and I notice it has almost reached a million supporters. Sign it here.

16-March-15

Yahoo plans to give up passwords

The latest scheme from Yahoo’s Crazy Ideas Department is to dispense with login passwords. Are they going to replace them with a certificate login or something more secure? Nope! The security-gaffe-prone outfit from Sunnyvale, California has had the genius idea of sending a four-character one-time password to your mobile phone, according to an announcement they made at SXSW yesterday (or possibly today if you’re reading this in the USA).

According to Chris ~~Stoned~~ Stoner, their Product Development Director, the bright idea is to avoid the need to memorise difficult passwords by simply sending a new one, each time, to your registered mobile phone.

At first glance, this sounds a bit like the sensible two-factor authentication you find already: Log in using your password and an additional verification code is sent to your mobile. However, Yahoo has dispensed with the first part – logging in with your normal password. This means that anyone that has physical control of your mobile phone can now hijack your Yahoo account too. If your phone is locked, no matter – just retrieve the SMS using the SIM alone. No need to pwn Yahoo accounts the traditional way.

With an estimated 800,000 mobile phones nicked per year in the UK alone (Source inferred from ONS report) and about 6M handsets a year going AWOL in the USA, you’ve got to wonder what Yahoo was thinking.

Apart from the security risk, what are the chances of being locked out of your email simply because you’re out of mobile range (or if your phone has gone missing)? Double whammy!

4-March-15

The Artificial Intelligence Conspiracy

The Truth about Artificial Intelligence

Last year I was asked, at short notice, to teach an undergraduate Artificial Intelligence module. I haven’t done any serious work in the field since the 1980s, when it was all the rage. Its proponents were anticipating that it would be a part of life within ten years; as this claim had been made in the early 1970s I was always a bit dubious, but computer power was increasing exponentially and so I kept an eye on the field. LISP was the thing back then, although I could never see quite how a language that processed lists easily, but was awkward for anything much else, was going to lead to the breakthrough.

So, having had the AI module dumped on me, I did the obvious thing and ran to the library to get out every textbook on the subject. What was the latest? I was surprised to see how far the field had come in the intervening years. It had got nowhere. The textbooks on AI covered pretty much the same as any good book on applied algorithms. The current state-of-the-art in AI is, in fact, applied algorithms with a different name on the cover; no doubt to make it sound more exciting and to make its proponents sound more interesting than mere programmers.

Since then, of course, AI has been in the news. Dr Stephen Hawking came out with a statement that he was worried about AI machines displacing mankind once they got going. Heavy stuff – it’d make a good plot for a sci-fi movie. It was also splashed all over the news media a week before the release of his latest book. The man’s no fool.

With universities having had departments of artificial intelligence for decades now, and consumer products claiming to have embedded AI (from mobile telephones to fuzzy logic thermostats) you may be forgiven for thinking that a breakthrough is imminent. Not from where I’m sitting.

Teaching artificial intelligence is like teaching warp drive technology. If you’ve never seen Star Trek, this is the method by which the Starship Enterprise travels faster than the speed of light by using a warp engine to bend the space around it such that a small movement inside the warp field translates to a much larger movement through “flat” space. Great idea, except that warp generators only exist in science fiction. And so does AI. You can realistically teach quantum physics, but trying to teach warp technology is only for the lunatic fringe. The same is true of AI, although I’m certain those with a career, and research grants, based on the name will beg to differ.

So where are we actually at? How does artificial intelligence as we know it work, and is it going in the right direction? In the absence of the real thing, the term AI is now being used to describe a class of algorithm. A proper algorithm takes input values and produces THE correct answer. For example, a sort algorithm will take as its input an unordered list and produce as output a sorted list. If the algorithm is correct, the output will always be correct, and furthermore it is possible to say how long it will take (worst case) to get the answer, because there is a worst-case number of steps the program will have to take. These are known as “P Problems”, to those who like to talk about how difficult things are to work out in terms of letters rather than plain old English.

Other problems are NP, which basically means that, although you might be able to produce an algorithm to solve them, the universe may have ended before you get the result. In some cases the computation may last an infinite amount of time. For example, one tricky problem would be working out the shortest route from London to Carlisle. Your satnav can work this out for you, of course, but how can you be sure it’s found the one correct answer; the absolute shortest route? In practice, you probably don’t care. You just want a route that works and is reasonably short. To know for sure that there was no shorter route possible you would have to examine every possible turn-after-turn in the complete road network. You can’t prove it’s not shorter to go via Penzance unless you try it. However, realistically, we use heuristics to prune off crazy paths and concentrate on the promising ones and get a result that’s “good enough”. There are a lot of problems like this.
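To make the "good enough" idea concrete, here is a small Python sketch of one simple heuristic, greedy best-first search, next to a brute-force check of every route. It is only an illustration: the road network is a toy, every distance and straight-line estimate is invented for the example rather than measured, and nothing here claims to be what a satnav actually does.

```python
import heapq

# Toy road network. All mileages are made up for illustration only.
ROADS = {
    "London": {"Birmingham": 120, "Leeds": 230, "Penzance": 300},
    "Birmingham": {"London": 120, "Manchester": 90},
    "Manchester": {"Birmingham": 90, "Carlisle": 120},
    "Leeds": {"London": 230, "Carlisle": 120},
    "Penzance": {"London": 300},
    "Carlisle": {"Manchester": 120, "Leeds": 120},
}

# Invented straight-line estimates to Carlisle: the heuristic.
ESTIMATE = {"London": 260, "Birmingham": 170, "Manchester": 105,
            "Leeds": 95, "Penzance": 380, "Carlisle": 0}


def greedy_route(start, goal):
    """Always expand the town that looks closest to the goal."""
    frontier = [(ESTIMATE[start], start, [start], 0)]
    visited = set()
    while frontier:
        _, town, path, miles = heapq.heappop(frontier)
        if town == goal:
            return path, miles
        if town in visited:
            continue
        visited.add(town)
        for nxt, dist in ROADS[town].items():
            if nxt not in visited:
                heapq.heappush(
                    frontier, (ESTIMATE[nxt], nxt, path + [nxt], miles + dist))
    return None


def all_routes(town, goal, path=None, miles=0):
    """Enumerate every route without repeated towns (the exhaustive way)."""
    path = (path or []) + [town]
    if town == goal:
        yield path, miles
        return
    for nxt, dist in ROADS[town].items():
        if nxt not in path:
            yield from all_routes(nxt, goal, path, miles + dist)


print(greedy_route("London", "Carlisle"))
print(min(all_routes("London", "Carlisle"), key=lambda r: r[1]))
```

On this toy map the heuristic heads for Leeds because it looks closest, never considers Penzance, and returns a 350-mile route, while checking every route finds one of 330 miles via Birmingham and Manchester. The heuristic answer is reasonable and quick, but it is not guaranteed to be the shortest.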

A heuristic algorithm sounds better to some people if it’s called an AI algorithm, and with no actual working AI, people like to have something to point to; to justify their job titles. But where does this leave genuine AI?

In the 1970s the world was seen as lists, or relations (structured data of some kind). If we played about with databases and array (list) processing languages, we’d ignite the spark. If it wasn’t working it was just our failure to classify the world in to relations properly.

When nothing caught fire, Object Oriented Programming became fashionable. Minsky’s idea was that if a computer language could map on to the real world, using code/data (or methods and attributes) to define real-world objects, AI would follow. I remember the debate (around 1989) well. When the “proper” version of C++ appeared, the one with the holy grail of multiple inheritance, the paradigm would take off. Until then C++ was just a syntactical nicety to hide the pointer to the context in a library of functions acting on the same structure layout. We’ve had multiple inheritance for 25 years now, but any conceivable use I’ve seen made of it has been somewhat contrived. I always thought it was a bad idea except for classes inheriting multiple interfaces, which I will concede, but this is hardly the same as inheriting methods and attributes – the stuff that was supposed to map the way the world worked.

The current hope seems to be “whole brain” emulation. If we can just build a large enough neural network, it will come to life. I have to admit that the only tangible reason why I don’t see this working is decades of disappointment. Am I right to be sceptical? Looking at it another way, medical science has progressed by leaps and bounds, but we’re no closer to creating life than when Mary Shelley first wrote about it. However clever we think we are with modern medicine, I don’t think we’re remotely close to reanimating even a single dead cell, never mind creating one.

Perhaps a better place to start is looking at the nature of AI, and how we know we’ve got it. One early test was along the lines of “I’ll be impressed if that thinking machine can play chess!”. This has fallen by the wayside, with Deep Blue finally beating Garry Kasparov in 1997 and settling that question once and for all. But no one is now claiming that Deep Blue was intelligent; it was simply able to calculate more possible outcomes in less time than its human opponent. One interesting point about it was the size of the machine required to do even this.

Another famous measure of AI success is Alan Turing’s test. A smart man, was Mr Turing. Unfortunately his test wasn’t valid (in my humble opinion). Basically, he reckoned that if you were communicating with a computer and couldn’t tell the difference between it and a human correspondent, then you had AI. No you don’t. We’ve all spoken to humans at call centres that do a pretty good impression of a machine; getting a machine to do a good impression of a human isn’t so hard. And it’s not intelligence.

In the late 1970s and early 1980s, computer conversation programs were everywhere (e.g. ELIZA). It’s no surprise; the input/output was basically a Teletype or later a video terminal (glass Teletype), so what else could you write? The pages of publications such as Creative Computing inspired me to write a few such programs myself, which I had running at the local library for the public to have a go at. Many had trouble believing the responses came from the computer rather than me behind a screen (this was in the early days, remember – most had never seen a computer). I called this simulated intelligence, and subsequently wrote about it in my PCW column. And that’s all it was – a simulation of intelligence. And all I’ve seen since has been a simulation; however good the simulation, it’s not the same as the real thing.
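For anyone who hasn’t seen how little is needed to simulate a conversation, here is a minimal Python sketch in the ELIZA style. The rules and replies are invented for this example; they are not the original ELIZA script, nor the programs mentioned above.

```python
import re

# Each rule pairs a pattern with a reply template. These rules are
# invented for illustration and are far cruder than the real ELIZA.
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bbecause\b", re.IGNORECASE), "Is that the real reason?"),
]
DEFAULT_REPLY = "Please tell me more."


def reply(text: str) -> str:
    """Return the reply for the first rule that matches the input."""
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Echo the user's own words back, minus trailing punctuation.
            captured = [g.rstrip(".!?") for g in match.groups()]
            return template.format(*captured)
    return DEFAULT_REPLY


print(reply("I am tired."))  # Why do you say you are tired?
print(reply("Hello"))        # Please tell me more.
```

There is no understanding anywhere in this: it is pattern matching and string substitution, which is the point being made about simulated intelligence.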

Science fiction writers have defined AI as a machine being aware of itself. I think this is possibly true, but it pushes the problem on to defining self-awareness. I think there’s still merit in the idea anyway; it’s one feature of intelligent life that machines currently lack. A house fly is moderately intelligent; as may be an amoeba. What about a bacterium? Bear in mind that we’ve not created an artificial or simulated intelligence that can do as much as a house fly yet, if you’re thinking of AI as having human-like characteristics. (There is currently research into simulating a fly brain; see Arena, P.; Patane, L.; Termini, P.S.; “An insect brain computational model inspired by Drosophila melanogaster: Simulation results” in The 2010 International Joint Conference on Neural Networks – IJCNN.)

Other AI definitions talk about a machine being able to learn; to take the results of previous decisions to alter subsequent decisions in the pursuance of a goal. This has been achieved, at high speed and with infinite resolution, many years ago. It’s called an analogue feedback loop. There’s a lot of bluster about AI systems being more complex and being able to cope with a far wider range of input types than previous systems, but a feedback loop isn’t intelligent, however complex it is.
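A feedback loop of this kind is easy to sketch. The Python below is a discrete simulation of a simple proportional controller, which is an assumption chosen for illustration: an analogue circuit does the same thing continuously, and the function name, gain and step count are all made up.

```python
def run_feedback(target: float, start: float,
                 gain: float = 0.5, steps: int = 20) -> list:
    """Repeatedly correct a value using the error from the last step."""
    value = start
    history = [value]
    for _ in range(steps):
        error = target - value  # result of the previous "decision"
        value += gain * error   # alters the next one in pursuit of the goal
        history.append(value)
    return history


history = run_feedback(target=20.0, start=10.0)
print(history[1])   # 15.0
print(history[-1])  # very close to 20.0
```

Each step uses the outcome of the last one to adjust the next, which fits the "learning" definition above, yet it is three lines of arithmetic.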

So what have we actually got under the heading of AI? A load of heuristic algorithms that can produce answers to problems that can’t be computed for certain; systems that can interact with humans in a natural language; and with enough processing power you can build a complex enough heuristic system to drive a car. Impress your granny by calling this kind of thing AI if you like, and self-awareness doesn’t really matter if the machines do what we want of them. This is just as well, as AI is just as elusive as it was in the 1970s. All we have now is a longer list of examples that aren’t it.

The only viable route I can see to AI is in Whole Brain Emulation, as alluded to above. We are getting to the point now where it is possible to build a neural network complex enough to match a brain. How, exactly, we could kick-start such a machine in to thinking is an intriguing problem. Those talking loudest about this kind of technology are thinking in terms of uploading the contents of an existing brain, somehow. Personally, I see a few practical problems that will need solving before this will work, but if we could build such a complex neural network and if we could find a way to teach it, we may just achieve a real artificial intelligence. There are two ifs and a may in there. Worrying too much about where AI technology may lead, however, is like worrying about the effects on human physiology of prolonged exposure to the warp coils on a starship.

2-March-15 / 16-March-15

More comment spammer email analysis

Since my earlier post, I decided to see what change there had been in the email addresses used by comment spammers to register. Here are the results:

Freemail Service	%
hotmail.com	22%
yahoo.com	20%
outlook.com	14%
mailnesia.com	8%
gmail.com	6%
laposte.net	6%
o2.pl	3%
mail.ru	2%
nokiamail.com	2%
emailgratis.info	1%
bk.ru	1%
gmx.com	1%
poczta.pl	1%
yandex.com	1%
list.ru	1%
mail.bg	1%
aol.com	1%
solar.emailind.com	1%
inbox.ru	1%
rediffmail.com	1%
live.com	1%
more-infos-about.com	1%
dispostable.com	<1%
go2.pl	<1%
rubbergrassmats-uk.co.uk	<1%
abv.bg	<1%
fdressesw.com	<1%
freemail.hu	<1%
katomcoupon.com	<1%
tlen.pl	<1%
yahoo.co.uk	<1%
acity.pl	<1%
atrais-kredits24.com	<1%
conventionoftheleft.org	<1%
iidiscounts.org	<1%
interia.pl	<1%
ovi.com	<1%
se.vot.pl	<1%
trolling-google.waw.pl	<1%

As before, domains with <1% are still significant; it’s a huge sample. I’ve only excluded domains with <10 actual attempts.

The differences from 18 months ago are interesting. Firstly, mailnesia.com has dropped from 19% to 8% – however this is because the spam system has decided to block it! Hotmail is also slightly less and Gmail and AOL are about the same. The big riser is Yahoo, followed by laposte.net (which had the highest percentage rise of them all). O2 in Poland is still strangely popular.

If you want to know how to extract the statistics for yourself, see my earlier post.
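The earlier post has the method actually used. Purely as a rough sketch of the idea, the Python below counts domains in a list of addresses and turns them into percentages. It assumes the addresses are already available as a list of strings, that percentages are taken against all attempts (including domains later excluded), and that the cut-off works like the "fewer than 10 attempts" rule above; the function name is made up.

```python
from collections import Counter


def domain_percentages(addresses, min_attempts=10):
    """Return (domain, percent) pairs, most common first.

    Domains with fewer than min_attempts registrations are left out.
    """
    counts = Counter(
        a.rsplit("@", 1)[1].lower() for a in addresses if "@" in a
    )
    total = sum(counts.values())
    rows = []
    for domain, n in counts.most_common():
        if n < min_attempts:
            continue
        rows.append((domain, 100 * n / total))
    return rows


sample = ["a@x.com"] * 3 + ["b@Y.com"]
print(domain_percentages(sample, min_attempts=1))
# [('x.com', 75.0), ('y.com', 25.0)]
```

The domain is lower-cased before counting so that the same provider isn’t split across several rows.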