Archive for June 25, 2011

CPU Scaling and distributed.net

Greg’s tip of the day!

I like to run distributed.net on my computers. I want the CPU to stay pinned at its lowest frequency while distributed.net is running, but I also want the computer to boost its CPU speed when I’m using it for something else. On Linux with the ondemand cpufreq governor, this is easy to accomplish. First, make sure that distributed.net is running at a low ‘nice’ priority level (it does this by default). Second, add this to your /etc/rc.local:

echo 1 > /sys/devices/system/cpu/cpufreq/ondemand/ignore_nice_load

By doing this, you’re telling Linux to use the minimum CPU speed possible for programs with a ‘nice’ (i.e. low) priority level, while still letting it boost the CPU speed when other software running on the computer needs it. Lower CPU speeds require less electricity, and that leads to cheaper electric bills – both good things :)

One final caveat… after all that work, your computer will probably still use a few more watts than it would if it were completely idle at the lowest clock speed. This is because it is doing work that requires the CPU to use more circuits than it would in a normal idle loop. Caveat aside, you’ll still be using a lot less power than if your CPU were allowed to run at the top clock speed, and only a small amount more than it would normally idle!
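The rc.local line above only works when the ondemand governor is active, because the tunable does not exist otherwise. Here is a slightly more defensive sketch of the same step. The function name enable_ignore_nice_load and its optional directory argument are my own inventions for illustration; the default path is the one used above.

```shell
# Hypothetical helper: turn on ignore_nice_load only if the tunable is present.
# $1 (optional) overrides the directory holding the ondemand tunables.
enable_ignore_nice_load() {
    tunable="${1:-/sys/devices/system/cpu/cpufreq/ondemand}/ignore_nice_load"
    if [ -w "$tunable" ]; then
        echo 1 > "$tunable"
        echo "ignore_nice_load is now $(cat "$tunable")"
    else
        # Most likely a different governor is active, or we are not root.
        echo "tunable not found or not writable: $tunable" >&2
        return 1
    fi
}
```

Calling enable_ignore_nice_load with no arguments from /etc/rc.local should behave like the one-liner, except that it complains instead of failing silently when the file is missing.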

Moving Files Using netcat and tar

Time for an upgrade. I’ve got this (really old) box running OpenBSD 4.9:
real mem = 234561536 (223MB)
avail mem = 226095104 (215MB)
mainbus0 at root: SUNW,SPARCstation-5
cpu0 at mainbus0: MB86904 @ 85 MHz, on-chip FPU

And a new one running Debian that is a lot faster…

CPU0: Intel(R) Pentium(R) M processor 1.60GHz stepping 06
503MB LOWMEM available.

I need to copy around 100GB of data from the SPARCstation to the new server. The SPARCstation can push around 300KB/sec, tops, unencrypted over the network. Normally I’d use rsync over ssh, but in this case the encryption would slow things down so much that it would literally take over a week to copy this data! The fastest way I’ve found is to use netcat and tar together. This combination frees up the CPU on the SPARCstation to do more important things, like reading data off of the disks and pushing it over the network!
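For a sense of scale, here is a rough back-of-the-envelope calculation of the best-case transfer time. The helper name transfer_days is made up for this example, and I’m assuming decimal units (1GB = 1,000,000KB).

```shell
# Hypothetical helper: days needed to move $1 GB at $2 KB/sec.
# Assumes decimal units: 1 GB = 1000 * 1000 KB.
transfer_days() {
    awk -v gb="$1" -v kbps="$2" \
        'BEGIN { printf "%.1f\n", gb * 1000 * 1000 / kbps / 86400 }'
}

transfer_days 100 300   # prints 3.9
```

So even at full unencrypted speed the copy takes almost four days, which is why every bit of CPU saved on the SPARCstation matters.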

Notice that I’m not using compression with tar… once again, sparky just isn’t fast enough :)


#Sender (SPARCstation)
tar cf - * | netcat receiver_host_name_or_ip 7000


#Receiver (new computer); start this before the sender
netcat -l -p 7000 | tar xvf -
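If you want to try the tar half of this pipeline before involving the network, the same pipe works between two local directories. This is only a sketch: copy_tree is a name I made up, and the destination directory is assumed to exist already. Note that it archives . rather than *, so dotfiles at the top level are included too.

```shell
# Hypothetical helper: copy a directory tree through a tar pipe.
# $1 = source directory, $2 = existing destination directory.
copy_tree() {
    (cd "$1" && tar cf - .) | (cd "$2" && tar xf -)
}
```

Swapping the pipe for the two netcat commands above gives the network version: the left side runs on the sender and the right side on the receiver.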

Mailing Lists and Procmail

I like having procmail sort my mail for me. In the case of mailing lists, the header of choice is the List-ID field. But there’s a problem… notice how each example below is slightly different. I want to pull out the first component of the list ID (linux-kernel, kernelnewbies, and cocci in these examples) and use that as the folder name:

List-ID: <linux-kernel.vger.kernel.org>
List-Id: Learn about the Linux kernel <kernelnewbies.kernelnewbies.org>
List-Id: cocci.diku.dk

To get started, let’s state in English what we want to find: Dear procmail, please find the word that immediately precedes the first period in a line that begins with “List-Id:”.

Finding these headers is easy with a regular expression… IF… you’re allowed to use look ahead: ^List-Id:.*?( (?!.*<)|<)([^.]*)

BUT, procmail doesn’t do look ahead :(

So let’s try with procmail’s regular expressions. Aside from the lack of look ahead/behind, there are two other major differences between procmail’s regular expressions and the rest of the world’s. First, procmail uses \/ to mark the start of the portion of the expression that will be copied into $MATCH. Second, the part of the regular expression to the left of the \/ uses non-greedy matching, so when you write .* procmail treats it like .*? instead. This is the feature that makes matching the three list headers I want to grab quite difficult.

With this in mind:
Matches the linux-kernel list:
^List-Id: *<\/[^.]*

Matches linux-kernel and kernelnewbies:
^List-Id: .*<\/[^<]?[^.]*

Notice the extra [^<]? which tells procmail that we want $MATCH to start after the < character. This is what allows the rule to find kernelnewbies without pulling < into $MATCH. This is necessary because procmail isn’t being greedy when it matches to the left side of \/.

Now, our remaining problem is the cocci mailing list. This one really makes life difficult. I decided that doing it all with a single regular expression just isn’t possible, so we’ll need two: one to grab the cocci mailing list and one to grab everything else. Here’s the completed procmail rule (note: I use Maildir and not mbox on my mailserver).

:0
* ^List-Id: \/[^.]+
{
        #list with <>
        #e.g. List-Id: Learn about the Linux kernel <kernelnewbies.kernelnewbies.org>
        #e.g. List-Id: <linux-kernel.vger.kernel.org>
        :0
        * $MATCH ?? ^.*<\/[^<]+
        .MailingLists.$MATCH/

        #list without <>
        #e.g. List-Id: cocci.diku.dk
        :0
        .MailingLists.$MATCH/
}
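To sanity-check the logic without sending yourself test mail, the two steps of the rule can be approximated with sed. This is not procmail’s regex engine, just a sketch that mirrors the same idea: take everything between “List-Id: ” and the first period, then throw away anything up to and including a <. The function name list_folder_name is my own.

```shell
# Hypothetical helper: print the folder name the rule above would pick
# for a single List-Id header line passed as $1.
list_folder_name() {
    printf '%s\n' "$1" | sed -n '/^[Ll]ist-[Ii][Dd]:/{
        s/^[^:]*: *//
        s/\..*$//
        s/^.*<//
        p
    }'
}

list_folder_name 'List-ID: <linux-kernel.vger.kernel.org>'
list_folder_name 'List-Id: Learn about the Linux kernel <kernelnewbies.kernelnewbies.org>'
list_folder_name 'List-Id: cocci.diku.dk'
```

The three calls should print linux-kernel, kernelnewbies and cocci. Like the procmail rule, this gets confused if the description in front of the < contains a period.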