FreeBSD hr utility – human readable number filter (man page)

Several years ago I wrote a utility to convert numeric output into human readable format – you know the kind of thing – 12345678 becomes 12M and so on. Although it was very clever in the way it dealt with really big numbers (Zettabytes), and in spite of ZFS having really big numbers as a possibility, no really big numbers have actually come my way.

It was always a dilemma as to whether I should use the same humanize_number() function as most of the FreeBSD utilities, which is limited to 64-bit numbers as its input, or stick with my own rolling conversion. In this release, actually written a couple of years ago, I’ve decided to go for standardisation.

You can download it from its new permanent home here

This should work on most current BSD releases, and quite a few Linux distributions. If you want binaries, leave a note in comments and I’ll see what I can do. Otherwise just download, extract and run make && make install


 

Extracted from the man page:

NAME

hr — Format numbers in human-readable form

SYNOPSIS

hr [-b] [-p] [-ffield] [-sbits] [-wwidth] [file ...]

DESCRIPTION
The hr utility formats numbers taken from the input stream and sends them
to stdout in a format that’s human readable. Specifically, it scales the
number and adds an appropriate suffix (e.g. 1073741824 becomes 1.0G).

The options are as follows:

-b      Put a ‘B’ suffix on a number that hasn’t been scaled (for Bytes).

-p     Attempt to deal with input fields that have been padded with spaces for formatting purposes.

-wwidth      Set the field width to width characters. The default is four
(three digits and a suffix). Widths less than four are not normally useful.

-sbits  Shift the number being processed left by bits bits, i.e. multiply by 2^bits. This is useful if the number has already been scaled into units. For example, if the number is in 512-byte blocks then -s9 will multiply the number by 512 before scaling it. If the number was already in kilobytes use -s10 and so on. In addition to specifying the number of bits to shift as a number you may also use one of the SI suffixes B, K, M, G, T, P, E (upper or lower case).

-ffield      Process the number in the numbered field, with fields being numbered from 0 upwards and separated by whitespace.
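
As a rough illustration of the -s arithmetic described above, the following Python sketch shifts a value left by a number of bits. The function name and the suffix-to-bits mapping (B = 0, K = 10, M = 20 and so on) are assumptions made for this example; they are not taken from the hr source.

```python
# Hypothetical mapping from suffix letters to bit shifts (an assumption, not from hr itself).
SHIFT_BITS = {"B": 0, "K": 10, "M": 20, "G": 30, "T": 40, "P": 50, "E": 60}


def apply_shift(value, shift):
    """Multiply value by 2^bits, where shift is a bit count or a suffix letter."""
    if shift.isdigit():
        bits = int(shift)
    else:
        bits = SHIFT_BITS[shift.upper()]
    return value << bits


print(apply_shift(8, "9"))  # 8 blocks of 512 bytes -> 4096
print(apply_shift(4, "k"))  # 4 kilobytes -> 4096
```

So a count of 512-byte blocks with -s9, or a count of kilobytes with -s10, ends up as a byte count before any scaling is applied.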

The hr utility currently uses the humanize_number() function in the System Utilities Library (libutil, -lutil) to format the numbers. This will repeatedly divide the input number by 1024 until it fits into a width of three digits (plus suffix), unless the width is modified by the -w option. Depending on the number of divisions required it will append a k, M, G, T, P or E suffix as appropriate. If the -b option is specified it will append a ‘B’ if no division is required.
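
The repeated division can be sketched in a few lines of Python. This is only an approximation of the behaviour described above, not the libutil implementation: the function name, the rounding rule and the choice to show one decimal place for scaled values below ten are all assumptions for illustration.

```python
SUFFIXES = ["", "k", "M", "G", "T", "P", "E"]


def humanize(n, width=4, b_suffix=False):
    """Scale n by 1024 until it fits in (width - 1) digits, then add a suffix."""
    digits = width - 1
    limit = 10 ** digits
    value = float(n)
    idx = 0
    # Keep dividing while the rounded value needs more digits than allowed.
    while round(value) >= limit and idx < len(SUFFIXES) - 1:
        value /= 1024
        idx += 1
    if idx == 0:
        # No division was needed; optionally mark the number as bytes.
        return f"{n}B" if b_suffix else str(n)
    if value < 9.95 and digits >= 3:
        # Assumed convention: one decimal place for small scaled values.
        return f"{value:.1f}{SUFFIXES[idx]}"
    return f"{round(value)}{SUFFIXES[idx]}"


print(humanize(1073741824))           # 1.0G
print(humanize(12345678))             # 12M
print(humanize(512, b_suffix=True))   # 512B
```

The real humanize_number() may round or pad differently, so treat the output here as indicative only.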


If no file names are specified, hr will get its input from stdin. If ‘-’ is specified as one of the file names, hr will read from stdin at that point.

If you wish to convert more than one field, simply pipe the output from one hr command into another.

By default the first field (i.e. field 0) is converted, if possible, and the output will be four characters wide including the suffix.

If the field being converted contains non-numeral characters they will be passed through unchanged.
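
A minimal Python sketch of the field handling described above follows. It assumes whitespace-separated fields numbered from 0 and leaves any field containing a non-digit character untouched. The function name and the callable passed in as convert are hypothetical, and hr itself may treat spacing differently.

```python
import re


def convert_field(line, field, convert):
    """Apply convert() to the numbered field if it is purely numeric."""
    # Split into alternating runs of whitespace and non-whitespace so that
    # the original spacing is kept when the line is joined back together.
    tokens = re.findall(r"\s+|\S+", line)
    index = -1
    for i, tok in enumerate(tokens):
        if tok.isspace():
            continue
        index += 1
        if index == field:
            if tok.isascii() and tok.isdigit():
                tokens[i] = convert(int(tok))
            break
    return "".join(tokens)


to_k = lambda n: f"{n // 1024}k"  # stand-in for the real scaling step
print(convert_field("4096 foo", 0, to_k))   # 4k foo
print(convert_field("abc 2048", 0, to_k))   # abc 2048 (field 0 is not numeric)
```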

Command line options may appear at any point in the line, and will only take effect from that point onwards. This allows different options to apply to different input files. You may cancel an option by following it with a ‘-’. For consistency, you can also set an option explicitly with a ‘+’. Options may also be combined in a string. For example:

hr -b file1 -b- file2

Will add a ‘B’ suffix when processing file1 but cancel it for file2.

hr -bw5f4p file1

Will set the B suffix option, set the output width to 5 characters, process field 4 and remove excess padding from in front of the original digits.

EXAMPLES
To format the output of an ls -l command’s file size use:

ls -l | hr -p -b -f4

This output will be very similar to the output of “ls -lh” using these options. However the -h option isn’t available with the -ls option on the “find” command. You can use this to achieve it:

find . -ls | hr -p -f6

Finally, if you wish to produce a sorted list of directories by size in human format, try:

du -d1 | sort -n | hr -s10

This assumes that the output of du is the disk usage in kilobytes, hence the need for the -s10

DIAGNOSTICS
The hr utility exits 0 on success, and >0 if an error occurs.

2 Replies to “FreeBSD hr utility – human readable number filter (man page)”

    1. Good question!

      First off, that would be 1.1G, as G is the suffix for 1000 Million. A billion is a bit of a problem as a unit. In English it means a million million, but to Americans it means just a thousand million. This is thanks to a mistake in the first American mathematics book written by Isaac Greenwood in the first half of the 18th Century. To avoid confusion, internationally the suffixes G and T are used for the two possible billions – never B. Americans will insist Greenwood was correct, when there was plenty published before him that proves otherwise. Likewise they’ll insist that the “English speaking world” uses the Greenwood definition, but it’s not what I was taught at school (in England).

      But the reason for your question probably relates to why 1,073,741,824 doesn’t round to 1.1G, right? Well in decimal it wouldn’t either, as the result is rounded, not pushed up in increments.

      It’d have been a lot easier to write if decimal rounding was used – just read the first few digits and round! But we (the BSD world and computer science in general) use suffixes based on 2^10 = 1K, 2^20 = 1M, 2^30 = 1G and so on, NOT 10^3, 10^6, 10^9 as people are used to in decimal. Therefore 1K is 1024, and NOT 1000. 1073741824 is 1G precisely, not a bit higher or lower. In binary it’s a one with 30 zeros following.

      Computer marketing people, of course, discovered that a real unit was larger than the decimal version, so started flogging 1TB drives that were 92GB short of a proper TB by using the decimal 10^12 definition of 1T rather than the binary 2^40. It might fool the Windoze users! Actually, that’s not fair on Microsoft – they use the correct binary definition too. RAM manufacturers have never followed suit either.
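
      The arithmetic behind those figures is easy to check. Here is a small Python sketch (the variable names are just for illustration):

```python
binary_g = 2 ** 30     # 1G in the binary sense
binary_t = 2 ** 40     # 1T in the binary sense
decimal_t = 10 ** 12   # 1T as printed on the drive label

shortfall = binary_t - decimal_t
print(binary_g)               # 1073741824
print(shortfall)              # 99511627776 bytes missing
print(shortfall / binary_g)   # between 92 and 93 binary G
```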

      Incidentally, version 0.2 now uses the standard BSD humanize_number() to make sure it’s consistent with other BSD output. I’m not 100% convinced of this, but it does guarantee that all utilities convert the same number the same way.
