tag:blogger.com,1999:blog-29485999513875592032024-02-19T08:12:47.261-08:00schwink.netStephen Schwinkhttp://www.blogger.com/profile/08946760049354816705noreply@blogger.comBlogger1125tag:blogger.com,1999:blog-2948599951387559203.post-86647151503234250312011-01-13T22:29:00.000-08:002011-01-13T22:41:27.029-08:00HTTP chunks and onreadystatechangeOne of the features of HTTP 1.1 is "chunked transfer encoding". Rather than send a <span style="font-family: 'Courier New', Courier, monospace;">Content-Length</span> header followed by the entire document, it is possible to transmit the body as a series of chunks, each with their own content length declaration. This lets you start sending the beginning of the document before you know how long it's going to be.<br />
<br />
It also makes Comet "streaming" possible, letting you trickle down data without the overhead of a full HTTP request for each message. This depends on your browser telling you when new chunks arrive. As you might guess, this isn't supported by Internet Explorer. But all other major browsers that I've tried (Firefox, Chrome, Safari) will fire multiple <span style="font-family: 'Courier New', Courier, monospace;">XMLHttpRequest</span> <span style="font-family: 'Courier New', Courier, monospace;">onreadystatechange</span> events (<span style="font-family: 'Courier New', Courier, monospace;">readyState</span> == 3) as additional parts of the document are received.<br />
<br />
Here's <a href="https://github.com/mochi/mochiweb/blob/master/src/mochiweb_response.erl">MochiWeb's implementation</a> of chunked transfer encoding, which is pretty straightforward:<br />
<span class="Apple-style-span" style="font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 11px; line-height: 14px;"></span><br />
<pre style="font-family: 'Bitstream Vera Sans Mono', Courier, monospace; font-size: 12px; font: normal normal normal 12px/normal Monaco, 'Courier New', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', monospace; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"><div class="line" id="LC46" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 1em; padding-right: 0px; padding-top: 0px;"><span class="c" style="color: #999988; font-style: italic; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">%% @spec write_chunk(iodata()) -> ok</span></div><div class="line" id="LC47" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 1em; padding-right: 0px; padding-top: 0px;"><span class="c" style="color: #999988; font-style: italic; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">%% @doc Write a chunk of a HTTP chunked response. If Data is zero length,</span></div><div class="line" id="LC48" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 1em; padding-right: 0px; padding-top: 0px;"><span class="c" style="color: #999988; font-style: italic; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">%% then the chunked response will be finished.</span></div><div class="line" id="LC49" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 1em; padding-right: 0px; padding-top: 0px;"><span class="nf" style="color: #990000; font-weight: bold; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">write_chunk</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">(</span><span class="nv" style="color: teal; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">Data</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">)</span> <span class="o" style="font-weight: bold; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">-></span></div><div class="line" id="LC50" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 1em; padding-right: 0px; padding-top: 0px;"> <span class="k" style="font-weight: bold; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">case</span> <span class="nv" style="color: teal; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">Request</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">:</span><span class="nb" style="color: #0086b3; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">get</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">(</span><span class="n" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">version</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">)</span> <span class="k" style="font-weight: bold; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">of</span></div><div class="line" id="LC51" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 1em; padding-right: 0px; padding-top: 0px;"> <span class="nv" style="color: teal; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">Version</span> <span class="k" style="font-weight: bold; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">when</span> <span class="nv" style="color: teal; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">Version</span> <span class="o" style="font-weight: bold; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">>=</span> <span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">{</span><span class="mi" style="color: #009999; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">1</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">,</span> <span class="mi" style="color: #009999; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">1</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">}</span> <span class="o" style="font-weight: bold; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">-></span></div><div class="line" id="LC52" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 1em; padding-right: 0px; padding-top: 0px;"> <span class="nv" style="color: teal; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">Length</span> <span class="o" style="font-weight: bold; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">=</span> <span class="nb" style="color: #0086b3; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">iolist_size</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">(</span><span class="nv" style="color: teal; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">Data</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">),</span></div><div class="line" id="LC53" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 1em; padding-right: 0px; padding-top: 0px;"> <span class="nb" style="color: #0086b3; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">send</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">([</span><span class="nn" style="color: #555555; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">io_lib</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">:</span><span class="n" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">format</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">(</span><span class="s" style="color: #dd1144; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">"</span><span class="si" style="color: #dd1144; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">~.16b</span><span class="se" style="color: #dd1144; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">\r\n</span><span class="s" style="color: #dd1144; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">"</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">,</span> <span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">[</span><span class="nv" style="color: teal; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">Length</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">]),</span> <span class="nv" style="color: teal; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">Data</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">,</span> <span class="o" style="font-weight: bold; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"><<</span><span class="s" style="color: #dd1144; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">"</span><span class="se" style="color: #dd1144; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">\r\n</span><span class="s" style="color: #dd1144; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">"</span><span class="o" style="font-weight: bold; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">>></span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">]);</span></div><div class="line" id="LC54" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 1em; padding-right: 0px; padding-top: 0px;"> <span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">_</span> <span class="o" style="font-weight: bold; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">-></span></div><div class="line" id="LC55" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 1em; padding-right: 0px; padding-top: 0px;"> <span class="nb" style="color: #0086b3; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">send</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">(</span><span class="nv" style="color: teal; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">Data</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">)</span></div><div class="line" id="LC56" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 1em; padding-right: 0px; padding-top: 0px;"> <span class="k" style="font-weight: bold; line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">end</span><span class="p" style="line-height: 1.4em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">.</span></div></pre><br />
For each chunk, you send an integer size, followed by a newline, followed by that number of bytes of data, followed by another newline. On the client, the web browser stitches each segment together, appending the data to <span style="font-family: 'Courier New', Courier, monospace;">responseText</span>.<br />
<br />
When designing <a href="http://schwink.net/kanaloa/">Kanaloa's</a> streaming protocol, I initially took it for granted that each chunk would have its own <span style="font-family: 'Courier New', Courier, monospace;">onreadystatechange</span> event. This made parsing the chunks simple; in my case, I just sent down a valid JSON array in each chunk, kept track of how much <span style="font-family: 'Courier New', Courier, monospace;">responseText</span> I'd already seen on the client, and called <span style="font-family: 'Courier New', Courier, monospace;">JSON.parse</span> on the difference.<br />
<br />
The first thing I noticed was that sometimes single chunks would be split across multiple events. I theorized that this resulted from them being put into multiple TCP packets, and indeed limiting the chunk size to the typical TCP segment size seemed to fix this problem.<br />
<br />
The next thing I noticed was that sometimes multiple chunks would be concatenated into the same event. This was also a problem, as <span style="font-family: 'Courier New', Courier, monospace;">JSON.parse</span> needs a valid expression, and <span style="font-family: 'Courier New', Courier, monospace;">'["foo"]["bar"]'</span> wasn't cutting it.<br />
<br />
You can see both of these cases demonstrated here:<br />
<a href="http://schwink.net/blog/chunker/client/">http://schwink.net/blog/chunker/client/</a><br />
The small chunks are often concatenated, and the large chunks are split.<br />
<br />
I took a look at the TCP packets in Wireshark, and am struct by two things. First, the small messages do in fact arrive as their own separate TCP packets. So the browser is stitching them together into the same event in some cases. They arrive at roughly equal intervals in the case I examined.<br />
<br />
Secondly, in the cases where a chunk is split across multiple events, the event boundaries do correspond with the packet boundaries.<br />
<br />
So I think we can conclude that the browser simply reads in incoming packets into its <span style="font-family: 'Courier New', Courier, monospace;">responseText</span> buffer, and fires <span style="font-family: 'Courier New', Courier, monospace;">onreadystatechange</span> for each. If your script is still running from the previous event, it just makes the <span style="font-family: 'Courier New', Courier, monospace;">responseText</span> available to you when you get that field, rather than wait to send another event later.<br />
<br />
This sort of begs the question whether we could construct a scenario where additional text gets appended to <span style="font-family: 'Courier New', Courier, monospace;">responseText</span> without you being notified, or where it changes between multiple reads to that field by the same JavaScript thread. But I've just about had my fill of this topic for now : )<br />
<br />
In the end I had to do what I'd been hoping to avoid from the start, and write my own logic to split the response, rather than trust the events to delineate them. The result may be the world's simplest and least featureful JSON parser, whose only job is to split a string into substrings that encode JSON arrays, which can in turn be properly deserialized. But it seems to work, and because it leaves unterminated arrays untouched, I can now also receive messages of arbitrary size that span multiple chunks.Stephen Schwinkhttp://www.blogger.com/profile/08946760049354816705noreply@blogger.com6