09 September 2009

Firefox automation

I have access to a web page that produces interesting pipe-delimited data, given a number. There are several hundred numbers for which I want data. The website is behind a gnarly security system to which I have access, but so far only via a browser. I don't really want to figure out how to get at it from a scripting environment like Python: the login process involves SSL, cookies, a bunch of redirects, and other things that would probably take ages to work out.

There are a bunch of Firefox plugins that claim to handle browser automation. My task is pretty simple, so this should be a nice test:
  1. Pop a number x from a predefined array of numbers. If the array is empty, stop.
  2. Plug x into the form.
  3. Submit the form.
  4. The web site will send back a pipe-delimited text file. Save that under the name x.txt.
  5. Go back to 1.

Here's what I tried:

CoScripter by IBM Research. Includes a record function! Nice use of natural language. But it can't interact with browser chrome. Plonk. (Also, deleting a command is counterintuitive: you have to backspace over the whole thing, rather than right-clicking and picking "delete" or similar.)

Chickenfoot by Michael Bolin of MIT CSAIL. No record function, but it has a tempting write() function. Unfortunately, the server sends the file with these headers:

Content-Disposition: attachment; filename="filename.txt"
Content-Type: application/octet-stream

...which forces Firefox to pop up the "You have chosen to open" dialog box no matter what and grays out the "Do this automatically from now on" check box, so you have to sit there and click for every file. Not cool. This is a Firefox problem, not a Chickenfoot problem.

Eh. I fired up Wireshark, copy-pasted the cookie, and used Python to automate the fetch. That worked.
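
For the curious, here is roughly what that kind of script looks like. This is a minimal sketch, not the exact code I ran: FORM_URL, FIELD_NAME, and the COOKIE value are placeholders, and it assumes the form is submitted as a POST and that the copied session cookie is all the server checks.

```python
import urllib.parse
import urllib.request
from pathlib import Path

# Placeholders (assumptions): substitute the real form URL, the real form
# field name, and the Cookie header value copied out of the browser session.
FORM_URL = "https://example.com/lookup"
FIELD_NAME = "number"
COOKIE = "SESSIONID=paste-the-copied-value-here"


def build_request(x, cookie=COOKIE):
    """Build the form submission for x, carrying the copied session cookie."""
    # Assumption: the form is sent as a URL-encoded POST body.
    body = urllib.parse.urlencode({FIELD_NAME: x}).encode("ascii")
    return urllib.request.Request(FORM_URL, data=body, headers={"Cookie": cookie})


def fetch(request):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def fetch_all(numbers, out_dir=".", fetcher=fetch):
    """Pop numbers until none are left, saving each response as x.txt."""
    out_dir = Path(out_dir)
    pending = list(numbers)  # work on a copy so the caller's list is untouched
    while pending:
        x = pending.pop()
        data = fetcher(build_request(x))
        # The attachment headers don't matter here: a script just gets bytes.
        (out_dir / f"{x}.txt").write_bytes(data)
```

Calling fetch_all with the list of numbers writes one x.txt per number into the current directory. The session cookie will eventually expire, at which point you log in through the browser again and paste in a fresh value.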


About Me

blog at barillari dot org Older posts at http://barillari.org/blog