Diskless Windows 7 with iSCSI and gPXE

One of the things that makes virtual machines so great for development and testing is the ability to quickly and easily take snapshots and revert to earlier snapshots. With physical systems, this is more cumbersome. Sure, one could image the hard drive from a LiveCD, but that takes at least tens of minutes with today’s giant disk drives, and physically moving disk drives around requires human intervention. Solutions like Intel’s AMT might be viable if all the machines are homogeneous, but that’s rarely the case. My goal is to set up automated regression testing of a hypervisor with multiple guest operating systems on hardware platforms from multiple vendors. To me, all of this screamed Network Boot / Disk-less systems. I’ve used PXE boot in the past with Linux systems, especially to launch FAI and automatically configure Linux. Until now, however, the only thing I’d ever done with Windows that resembles automation was to “slipstream” a Windows XP install CD with BartPE.

Other people have already figured out most of the hard stuff. Many thanks to all of them. Zorinaq’s Diskless Windows 7 iSCSI boot from OpenSolaris 2009.06 ZFS Server contained enough detail to make me confident enough to give all of this a try. Since 2009, however, the “Sun and Oracle thingy” happened. I didn’t want to have to depend on anything from Oracle, and though web searches led me to learn about the existence of OpenIndiana, I ran into a viable Linux option: Using iSCSI On Debian Lenny (Initiator And Target). Though the instructions were somewhat dated, I had no trouble setting up Debian Squeeze as the Target (server). I connected from an Ubuntu 10.04 LTS Initiator (client) just to confirm that everything worked as expected before getting involved with Windows.

So here we are: Debian Squeeze iSCSI Target system successfully configured without any major snafus. Note that I did opt to use LVM to manage 1.9TB of unused space on the 2TB disk inside the fileserver I’m using.

Aside: I’ve only used LVM a few times in the past, so I may update this with more information about some useful observations. One shortcoming I’ve observed so far (maybe? I’m still not fully convinced that this is correct) is that an LVM snapshot is not expected to live forever. Every howto I’ve read suggests, after taking a snapshot, that the contents of the snapshot should be copied elsewhere, and the snapshot deleted. This means it’s not possible to have the same workflow as I commonly use with VMware Fusion/Workstation. With VMware, I frequently take snapshots at interesting configuration points and keep them around indefinitely. I can easily roll back whenever I like, without incurring the overhead of a full filesystem copy.

gPXE (part of the EtherBoot project) “implements PXE, the industry standard network booting specification, and extends it with a number of new protocols and features.” It can be chain-loaded from regular PXE. I did not bother trying to flash any NIC ROMs, again because I want my solution to work on as many different hardware platforms as possible. The gPXE feature of interest to us here is its ability to mount and boot from an iSCSI root filesystem.

Here are two more references that I found useful:

gPXE can get its configuration file from a webserver, instead of TFTP. I’m assuming, dear reader, that to even have an interest in diskless iSCSI boot, you can set up a LAMP stack or other webserver in your sleep.

I did not find a concrete example of a “gPXE” boot script (mine lives in /var/www/gpxe_boot_script.html), or a clear explanation in any documentation I could find that it’s a “shebang”-style file, so I post mine here for reference:

#!gpxe
dhcp net0
set keep-san 1
sanboot iscsi:
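
For anyone who wants to generate such a script per machine, here is a minimal Python sketch. It assumes the usual iSCSI root-path layout, iscsi:<server>:<protocol>:<port>:<LUN>:<targetname>, where empty fields fall back to their defaults. The server address and target IQN are hypothetical placeholders, not the values from my setup, and the function names are my own invention.

```python
def iscsi_root_path(server, target_iqn, protocol="", port="", lun=""):
    """Build iscsi:<server>:<protocol>:<port>:<LUN>:<targetname>.

    Empty fields are left blank so the initiator uses its defaults.
    """
    return "iscsi:{}:{}:{}:{}:{}".format(server, protocol, port, lun, target_iqn)


def gpxe_script(root_path, nic="net0"):
    """Return the text of a gPXE boot script that SAN-boots from root_path."""
    lines = [
        "#!gpxe",                 # marks the file as a gPXE script
        "dhcp {}".format(nic),    # bring up the NIC via DHCP
        "set keep-san 1",         # keep the SAN drive registered
        "sanboot {}".format(root_path),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical server address and target name, for illustration only.
    path = iscsi_root_path("192.0.2.10", "iqn.2011-01.com.example:win7")
    print(gpxe_script(path), end="")
```

The output could be written to the file the webserver hands out, one file per diskless client if each client has its own target.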

Here is my (working with diskless Windows 7 booting via iSCSI) /etc/dhcp/dhcpd.conf:

ddns-update-style none;
option domain-name "example.com";
option domain-name-servers;
default-lease-time 600;
max-lease-time 7200;
log-facility local7;
option space gpxe;
option gpxe-encap-opts code 175 = encapsulate gpxe;
option gpxe.priority code 1 = signed integer 8;
option gpxe.keep-san code 8 = unsigned integer 8;
option gpxe.no-pxedhcp code 176 = unsigned integer 8;
option gpxe.bus-id code 177 = string;
option gpxe.bios-drive code 189 = unsigned integer 8;
option gpxe.username code 190 = string;
option gpxe.password code 191 = string;
option gpxe.reverse-username code 192 = string;
option gpxe.reverse-password code 193 = string;
option gpxe.version code 235 = string;
option iscsi-initiator-iqn code 203 = string;
subnet netmask {
  option routers;
  allow booting;
  allow bootp;
  next-server; # this server
  if exists user-class and option user-class = "gPXE" {
    filename "";
  } else {
    filename "undionly.kpxe";
  }
  option root-path "iscsi:";
  option gpxe.keep-san 1;
}
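
The if/else on user-class is what keeps the chainload from looping forever: the plain PXE ROM is handed gPXE itself, and gPXE (which identifies itself with the user class "gPXE") is handed the boot script instead. A rough Python sketch of that decision follows. It is only a model of the logic, not anything dhcpd runs, and the script URL is a hypothetical placeholder.

```python
def choose_filename(user_class, script_url, pxe_loader="undionly.kpxe"):
    """Mirror the dhcpd.conf conditional.

    A client announcing user class "gPXE" gets the boot script;
    anything else (the NIC's PXE ROM) gets the gPXE loader.
    """
    if user_class == "gPXE":
        return script_url
    return pxe_loader


if __name__ == "__main__":
    # Hypothetical URL; substitute wherever the boot script is served from.
    url = "http://192.0.2.10/gpxe_boot_script.html"
    for stage, user_class in [("PXE ROM", None), ("gPXE", "gPXE")]:
        print(stage, "->", choose_filename(user_class, url))
```

Without the conditional, gPXE would be told to load undionly.kpxe again on its own DHCP request and never reach the script.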

I have included the configuration above because I wish I had found such examples before starting. Below are the problems I encountered and how I overcame them.

Problem 1: The network card in my diskless client host is unsupported by Windows 7 out-of-the-box, preventing it from communicating via iSCSI. I worked around this by downloading the relevant driver from the HP/Intel website, and extracting the files onto a USB stick. When the installer asks where you would like to install Windows, there is an option to load a driver. It happily loads network drivers in addition to storage drivers.

Problem 2: The Windows 7 installer does not see the iSCSI target as a viable disk on which to install Windows, if I chainload gPXE from PXE. My diskless client host (an HP Elite 8100) has three options for network booting (Disabled, PXE, iSCSI). I was initially trying with it set to PXE, because my goal is to have the most general, cross-platform solution that I can. However, the Windows installer takes forever to load up to the screen where it looks for a disk onto which to install, and I didn’t have the patience to try too many different things. Plus, when I set the NIC to iSCSI, it just worked. Note that in iSCSI mode the BIOS settings for the NIC request the name of the iSCSI target. I entered it.

Configured thusly, at boot time the diskless system starts by issuing a DHCP request to connect the iSCSI volume, and then proceeds through the normal boot sequence. I put the DVD drive above the NIC in the boot order, and the Windows 7 installer launched without issue. I was then able to successfully install Windows onto the iSCSI target (after loading the NIC driver from the USB stick), as it magically appeared in the list of possible disks onto which to install Windows.

Now, that worked to install Windows, which is/was the hard part. Once Windows is installed on an iSCSI volume, it seems to work great. I switched my BIOS settings back to regular PXE chainloading gPXE and Windows boots just fine.

As the purpose of this exercise is to set up multiple machines for regression testing, I will be going through these steps again before too long. I hope to iteratively refine this post into very detailed, effective instructions.

Some buzz-words that may help in web searches on this topic: iSCSI, gPXE, LVM, DHCP, TFTP, Apache2, Intel NIC Driver, Windows 7 32-bit CD.


How to (try to, anyways) nip a bad patent in the bud

Many patent applications become public about 18 months after they are initially filed. As it is not unusual for a patent to take 3~5 years before an examiner takes a look, this means that many pending patent applications are freely available for anybody with a web browser (and who can solve a captcha) to read. What seems like a viable grassroots movement is for willing individuals to peruse newly published patent applications, and make sure that they include references to sufficient prior art. If they do not (e.g., many applications that I have seen consider only other patents, and neglect a large volume of freely available scientific literature), then they risk becoming a “bad patent” because the examiner may not realize that there is little or no novelty.

There is a legal process for a third party to submit information on a pending patent application, within two months of that application’s having gone public. It is defined in “37 CFR 1.99: Third-party submission in published application” (see 37 CFR 1.99 on the USPTO website or bitlaw). Submitting such information generally costs between $100~$200, and one must first prove that one has also informed the applicant. Viable methods of proof include some kind of acknowledgement from the applicant or their attorneys. Certified First Class US Mail, Return Receipt Requested, is one such method; a return receipt is a postcard that gets mailed back to you bearing the name and original signature of the recipient. It is not nearly as expensive as it sounds: I’ve mailed 64 pages for less than $9, including the envelope I put them in. More on how one might actually get in touch with the relevant attorneys below.
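
Because the window is so short, it helps to work out the dates up front. Here is a small Python sketch of the arithmetic, assuming publication lands exactly 18 months after filing and the window closes exactly two calendar months after publication. Both are simplifications on my part; the date on the actual Notice of Publication is what counts, and the function names are my own.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift d by a number of calendar months, clamping to the month's last day."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def estimated_publication(filing_date):
    """Rough estimate: applications go public about 18 months after filing."""
    return add_months(filing_date, 18)


def submission_deadline(publication_date):
    """Assumed end of the two-month third-party submission window."""
    return add_months(publication_date, 2)


if __name__ == "__main__":
    filed = date(2010, 6, 30)  # hypothetical filing date
    published = estimated_publication(filed)
    print("estimated publication:", published)
    print("submit by:", submission_deadline(published))
```

For the hypothetical filing date above, this gives an estimated publication of 2011-12-30 and a cutoff of 2012-02-29.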

Now, $100~$200 is a lot for a volunteer to shell out for a single pending patent which may never directly impact their life, and even $9 for postage and printing might be more than folks want to spend. Let us first see how we might avoid the first expense. Anyone who has ever submitted a patent application may recognize “1.56 Duty to disclose information material to patentability. – Appendix R Patent Rules” (USPTO, Wikipedia, bitlaw); they will have had to make detailed lists of related work (“prior art”) while preparing their own patent submissions. This is more informally known as the “Duty of Candor” or “Disclosure” as it pertains to a patent filing.

Some lawyers with whom I am acquainted (and this is consistent with what the attorneys for our own patent applications have said) emphasize that the duty of candor is taken seriously by attorneys. Thus, if we can find out who the responsible law firm is, then maybe we can send them a certified letter containing the relevant prior art, and they will do the right thing to limit their own liability (the right thing in this case is for them to send the additional art to the USPTO; I can’t remember the official name of the form right this minute). So, if even the $9 for certified mail is overkill, perhaps an email can suffice.

So, how do we find out the contact information for a given patent application? First, we need either its Application Number (generally written as xx/yyy,zzz) or its Publication Number (YYYYnnnnnnn, sometimes written YYYY/nnnnnnn). Armed with one of these numbers…

Warning: Enter a mindset appropriate for websites designed before “search” was a largely solved problem.

Goal: find the right attorneys.

To access the exhaustive history of a public application, use the Patent Application Information Retrieval (PAIR) website, and search using the Application Number or Publication Number, taking care to select the right radio button and include / omit relevant punctuation.
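
To keep the two kinds of numbers straight, here is a small Python sketch that tells them apart by shape and strips the punctuation. It assumes an Application Number is always eight digits (xx/yyy,zzz) and a Publication Number always eleven (YYYYnnnnnnn). Whether the search form wants the punctuation kept or removed is something I have not pinned down, so treat the digits-only output as an assumption.

```python
import re

# xx/yyy,zzz with the slash and comma optional
APPLICATION_RE = re.compile(r"^(\d{2})/?(\d{3}),?(\d{3})$")
# YYYYnnnnnnn or YYYY/nnnnnnn
PUBLICATION_RE = re.compile(r"^(\d{4})/?(\d{7})$")


def classify(identifier):
    """Return (kind, digits) where kind is 'application' or 'publication'."""
    text = identifier.strip()
    for kind, pattern in (("application", APPLICATION_RE),
                          ("publication", PUBLICATION_RE)):
        match = pattern.match(text)
        if match:
            return kind, "".join(match.groups())
    raise ValueError("unrecognized patent identifier: {!r}".format(identifier))


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    for number in ("12/345,678", "2011/0123456", "20110123456"):
        print(number, "->", classify(number))
```

The kind tells you which radio button to select, and the digits string is a starting point if the form rejects the punctuated version.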

This brings up all of the paperwork and correspondence relevant to that application (cool, eh? I previously was not aware that all of this was accessible). Select the “Image File Wrapper” tab for PDF versions of stuff. The far righthand column has a provision for something like “select all”, and it becomes possible to grab everything in a single large PDF.

I have had good luck learning the original filing attorney from the “Transmittal Letter”, and the firm where that attorney was employed at the time on the “Filing Receipt”. It is unwise to assume that this attorney still works for the listed firm, as it appears that doing these kinds of filings is often the work of junior / inexperienced folks, who stand a good chance of having moved on to other areas of focus by the time 18 months have passed. I generally address such letters to “Name of Law Firm” / “Intellectual Property Department” / Street / City / State / Zip.

Be sure to resist the temptation to include any explanatory material in your letter. Somewhere in 37 CFR 1.99 it explains that the patent examiner has no obligation to read any explanation provided, and (although it’s not very clear) it looks like he might be within his rights to ignore the whole thing if it is too bogged down with explanation. (Does anybody know for sure?)

For completeness and convenience, here is a letter template:

(right justified) FROM: your name & address
(right justified) Date
(left justified from now on)
TO: Name of Law Firm
Intellectual Property Department
Attorney or Agent for NAME OF INVENTOR (US published patent application YYYY/nnnnnnn)

Dear sir or madam:

I am writing in regard to pending United States patent Application Number xx/yyy,zzz (Publication Number YYYY/nnnnnnn), entitled TITLE OF PATENT APPLICATION. The Notice of Publication on file with the USPTO is dated DATE.

Supporting documentation identifies NAME OF LAW FIRM as having been granted Power of Attorney to Prosecute Applications before the USPTO. Attorney Docket Number is listed as DOCKET NUMBER.

I hereby submit the following published scientific literature for your consideration, in the expectation that it will be evaluated in light of your responsibilities under 37 CFR 1.56 (“Duty of Candor”). Please find below citations for the literature (including the dates when they were made public). I also include full copies of these publications with this letter.

  • Thorough citation of Publication 1
  • Thorough citation of Publication 2


Google+ changed my Instant Upload setting without asking me

With a one-year-old and two pets, I use my Nexus One’s camera almost daily. I created my Google+ account many months ago. I almost never use the Google+ App on my phone. However, photos I took starting on December 2 began to show up in my Google+ “notification area” when signed into Gmail. Now, they’re not shared by default, so there’s no “real” harm done in the near term.

Photos from November 27 were not uploaded. I conclude that something happened between those two dates to re-enable Instant Upload.

Quick fix: Google+ App : Menu softkey : Settings : Uncheck the Instant Upload checkbox.

It is entirely possible that a regular App update tried to warn me about this. However, to me, this is a significant change that should have warranted an additional effort to notify me about it.

The lack of secure storage options on mobile devices

App developers who want to protect data, can’t. Here is a nice writeup from the Pidgin developers about why nothing out there improves upon cleartext password storage: PlainTextPasswords

A lot of developers need to understand this. Look at the comments to this stackoverflow question to get a taste of how widespread the misunderstanding is among developers.

I feel inspired to write a response on par with this incredible rant about how people need to stop trying to parse HTML with regex, but I don’t think I’ve got the time (or the skills? ;-).