Serverless WebRTC using QR codes

For those not familiar, WebRTC is a technology for peer-to-peer communication between browsers. But despite being peer to peer, WebRTC still requires setting up servers for exchanging session descriptions before the browsers can talk to each other (this is called “signaling”).

If you don’t want to manage your own servers, you can use some external service for signaling (and a public STUN server). But let’s say just for fun we don’t want to use any external services either. Then one interesting approach by Chris Ball is to manually exchange the session description by pasting them into IM or email.

I thought it would be cool to extend his demo by transmitting the data using QR codes instead. Then you will be able to establish a WebRTC connection between a phone and a desktop (or two phones with each other) by just pointing their cameras at each other.

After gluing some QR code libraries together, this was the result:

These two are just plain static web pages running javascript (using jsqrcode for scanning QR codes, and jquery-qrcode for generating). On the left in the video is a remote debugger showing what the Android Chrome browser is seeing, and on the right is a regular desktop Chrome browser. The signaling process only requires one back and forth to generate an offer and reply with an answer. So both cameras are active the whole time looking for the other’s offer/answer while displaying QR codes of their own. Once the exchange is done they can open a data channel which in this case is used for a simple chat application.

The interesting part was actually how to encode into QR codes. The session description is about ~2000 characters which is hitting the limit of what a single QR code was designed for:

This is too large to reliably scan especially since the image processing is done using javascript on a mobile device. Compressing can cut down the size by about 50% but this is still too large to read quickly. So you have to break up the data into smaller pieces and encode those into QR codes individually.

But then you will need to scan multiple QR codes in series and if you want to avoid requiring user interaction, you need to automatically know when to display the next QR code. At this stage, the two devices can’t communicate yet so it is hard to ‘ACK’ that you are done scanning a code unless both devices are simultaneously in view of each other’s camera which is very awkward to position physically. (Not that any of this is practical or serious in any way, but trying to build a full duplex connection out of QR code is getting way deeper into IP over avian carriers territory.)

So I ended up just flashing each QR code for a brief moment over and over again. This doesn’t work super well since you can’t flip the codes too quickly or there will be tearing artifacts but flipping too slowly means long waits to get another chance to try again if it misses a frame. It did work consistently enough for a demo so I was okay with declaring this experiment done!

Here’s a link to the demo you can try. Some warnings:

Requires latest Chrome on both desktop and Android (I used some Chrome specific stuff so Firefox isn’t supported). Tested with a MacBook Air and a Nexus 5.
Requires camera permissions. If you don’t complete the flow, make sure to exit the page so it can release the camera.
The exchange must complete within 30 seconds or it will timeout. It will probably take a couple of tries to learn the sweet spots for the scanner to do this quickly enough.
Will probably heat up your CPU since it is naively checking for a QR code for every frame captured.

Code is on github.

Serverless WebRTC using QR codes

Image diffing using CSS

6DOF Positional Tracking with the Wiimote