Adding CAPTCHA Support to the Setup Wizard

The new setup wizard is coming along slowly. Here are a couple of work-in-progress screenshots.

Step 1: select a device to mimic

The device selection screen will require a tremendous amount of additional work. It should give the choice between uploading a predefined and a fully custom device profile. In case you are curious, here’s what a device config file looks like in JSON format (yes, all of that is uploaded when you register a device).

The current solution just gives you the choice between a hardcoded device profile and entering the GSF ID and user agent string of an existing device, which is neither flexible nor user-friendly.

Note that this part has nothing to do with the CAPTCHA problem directly, but it’s one required step in the sequence leading to it. Unfortunately, the setup wizard has next/back navigation as well as optional steps, so there are several alternative sequences for passing through it. Adding new screens (the CAPTCHA prompt in this case) creates more paths (captcha required; captcha correct; captcha wrong; contacting server) in the underlying state machine, and once you do that, the whole thing needs to be rewritten.

Step 2: enter credentials

Turns out that you (may) need some info from the device selection screen in order to log in. You definitely need a device in order to register a GSF ID. Switching the order of the two prompts simplified things considerably.

Step 3: optionally configure a proxy

Needless to say, proxy support is as unfinished as it looks. Optional steps are a major pain in the butt.

Step 4: contacting Google Play

The “Please wait” screen looks innocent enough. Just a progress bar and no interactive elements. However, it is anything but. There are four different entry points (credentials entered; wrong credentials; captcha required; wrong captcha) and three different exit points (prompt credentials; prompt captcha; finish setup). Remember what I wrote above about different ways to pass through the setup wizard? This is where complexity hits the fan. You can’t just add another screen to a sequential dialog; you also have to define how to get there and where to go from there (and under which conditions).
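To make the exit side of that concrete, here is a minimal sketch of how the three exit points could be written down as a lookup table. The outcome names and the `next_screen` helper are made up for illustration; they are not taken from the actual wizard code or from the Google Play protocol.

```python
from enum import Enum, auto


class Screen(Enum):
    # The three exit points of the "Please wait" screen
    CREDENTIALS = auto()
    CAPTCHA = auto()
    FINISH = auto()


class Outcome(Enum):
    # Hypothetical results of talking to the server (names are assumptions)
    SESSION_TOKEN = auto()
    BAD_CREDENTIALS = auto()
    CAPTCHA_REQUIRED = auto()
    BAD_CAPTCHA = auto()


# Every outcome must map to exactly one follow-up screen
EXITS = {
    Outcome.SESSION_TOKEN: Screen.FINISH,
    Outcome.BAD_CREDENTIALS: Screen.CREDENTIALS,
    Outcome.CAPTCHA_REQUIRED: Screen.CAPTCHA,
    Outcome.BAD_CAPTCHA: Screen.CAPTCHA,
}


def next_screen(outcome):
    """Return the screen to show once the progress bar is done."""
    return EXITS[outcome]
```

Even in this toy version, adding one new screen means adding new outcomes and checking every existing row of the table again.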

Step 5: solve the captcha

The CAPTCHA prompt is new. In addition to rewriting the setup wizard, it also required rewriting the login handler from scratch. And this finally gets us to the crux of the matter. CAPTCHAs are an optional step in the login protocol, not a required one. You are only challenged to solve one when necessary to prove that you are a real human being on a real Android device.

Currently, I’m busy figuring out what the new “secret handshake” is. As shown in the screenshot, I have no problem downloading the image, but sending the solution just results in a new challenge, not the desired session token.
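Because the challenge is optional, the login handler ends up being a loop rather than a single request. The following is only a rough sketch of that control flow under my own assumptions: `Reply`, `send` and `solve` are hypothetical stand-ins, and nothing here reflects the real wire format or the still unknown handshake.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Reply:
    # Hypothetical server reply: either a token, a challenge, or neither
    token: Optional[str] = None
    captcha_image: Optional[bytes] = None


def login(send: Callable[[Optional[str]], Reply],
          solve: Callable[[bytes], str],
          max_attempts: int = 3) -> Optional[str]:
    """Try to log in, answering CAPTCHA challenges only if the server asks."""
    reply = send(None)  # first attempt carries no CAPTCHA answer
    for _ in range(max_attempts):
        if reply.token is not None:
            return reply.token  # logged in, possibly without any challenge
        if reply.captcha_image is None:
            return None  # rejected outright, e.g. wrong credentials
        # Challenged: ask the user for a solution and try again
        reply = send(solve(reply.captcha_image))
    return reply.token  # None if the last answer produced yet another challenge
```

The situation described above, where every solution is answered with a fresh challenge, corresponds to this loop running out of attempts and returning nothing.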

State diagram of the setup wizard screens

The screen transition diagram doesn’t really do justice to the complexity of the setup wizard. There is quite a bit of decision-making logic under the hood to figure out what the user can and, more importantly, cannot do next (e.g. there is obviously no way to go “back” from the first screen).
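As a rough illustration of that kind of logic, the sketch below computes which navigation actions are available on a given screen. The screen names, the "skip" action and the `allowed_actions` helper are assumptions made up for this example; the real wizard has more conditions than this.

```python
# Hypothetical screen identifiers (not the names used in the real code)
OPTIONAL = {"proxy"}              # steps the user may skip
NON_INTERACTIVE = {"contacting"}  # the "Please wait" screen has no buttons


def allowed_actions(current, history):
    """Return the navigation actions available on the current screen.

    `history` is the list of screens visited before `current`.
    """
    if current in NON_INTERACTIVE:
        return set()  # progress bar only, nothing to click
    actions = {"next"}
    if history:
        actions.add("back")  # no way back from the very first screen
    if current in OPTIONAL:
        actions.add("skip")
    return actions
```

For example, `allowed_actions("device", [])` yields only "next", while `allowed_actions("proxy", ["device", "credentials"])` also offers "back" and "skip".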