Monthly Archives: June 2011

YQL and Comet-based Streaming

Filed under feature

Summary

The latest YQL release adds support for Comet-based streaming with Downstream Polling (“CDP”), which allows YQL clients to receive updates to their queries in real time.

Motivation

In traditional YQL, a client must poll the YQL server for updates, by sending the same YQL query over and over again. Each time, the YQL server parses the query into a Pipe object, executes it, sends the results to the client, and closes the response.

Traditional way of invoking Pipe

Traditional way of invoking a Pipe

This approach is inefficient and does not scale well for updates: First of all, the YQL server has to parse the same YQL statements into corresponding Pipe objects over and over again, where each Pipe object is used to produce only a single response, after which it will be garbage-collected.  Secondly, there is no guarantee that the response data will have changed, resulting in unnecessary network traffic and wasted Pipe constructions and executions.

Long polling is not a solution either: It is impossible for the YQL server to know for how long to keep open a response, because it has no way of telling when new data has become available. As with busy polling, Pipe objects are not reused.

CDP attempts to address these deficiencies: In this mode, the client opens a single persistent connection to the server and sends the YQL query in the initial request. The YQL engine on the YQL server parses the query into a Pipe object, but instead of discarding the Pipe after a single execution and closing the response, it holds on to both the Pipe (turning it into a Standing Pipe), which allows it to execute the same query repeatedly over a period of time, and the Comet-enabled response, which allows it to send updated results to the client asynchronously and in real time.

Periodic invocation of Standing Pipe

Periodic invocation of Standing Pipe

Polling Frequency

In order to enable a table for CDP, its developer must specify the frequency (in seconds) that is appropriate for polling the table’s downstream web service for updates, using the new pollingFrequencySeconds table attribute.

If the YQL query is mapped to a single table, then the frequency with which the Standing Pipe will be executed is equal to the table’s pollingFrequencySeconds. If the YQL query is mapped to multiple tables, then the execution frequency of the Standing Pipe is set to the largest polling frequency of the tables involved, to increase the likelihood that each Standing Pipe execution will yield updated results.

Check out the YQL documentation for an example of how to enable a table for CDP.

Future Enhancements

A future enhancement will have the YQL engine participate in a truly event-driven, publish-subscribe (Bayeux) style notification system, where a table’s downstream service will be a named source of events, to which the YQL engine will subscribe through the appropriate event channel.

Implementation Status and Limitations

The current implementation of CDP is considered experimental and is made available on separate YQL web service endpoints, which are named after the traditional YQL web service endpoints, with streaming inserted into their URI paths. Therefore, YQL’s streaming-enabled endpoint for public tables is accessible through this URL:

http://query.yahooapis.com/v1/public/streaming/yql?[query_params]

whereas the streaming-enabled endpoint for OAuth-protected tables can be accessed at this URL:

http://query.yahooapis.com/v1/streaming/yql?[query_params]

The number of concurrent Comet connections has been throttled at the YQL engine: When the maximum number of concurrent Comet connections has been reached,  any requests that would normally have been put into CDP mode are served in the traditional way.

The version of the Comet implementation that CDP builds upon does not support a configurable timeout for Comet connections, with the effect that a Comet connection will remain open for only 20 seconds. This limitation will be lifted in a future YQL release.

Support for round trip lossless JSON processing

Filed under feature

There have been many reports (and complaints) in the past about the YQL engine changing the structure of a JSON response from a downstream webservice as part of its processing before returning JSON output to the client.

This corruption of JSON response content would manifest itself in a number of ways: JSON numbers in the downstream response would be delivered as JSON strings to the client, and single-element JSON arrays converted to JSON objects, among others.

For example, "myint":[5] in the downstream response would be returned as "results":{"json":{"myint":"5"}} to the client.

Recent YQL releases have fixed these issues and now provide support for round trip lossless JSON processing. As we want to give developers sufficient time to take advantage of and adapt to this change, we have not yet enabled this feature by default. To enable it, simply append jsonCompat=new to your YQL query.

Please start experimenting with this new query parameter and give us feedback. Eventually, this feature will be enabled by default, but there will an announcement before we make the switch.

1000th Community Table!

Filed under news

We started datatables.org with our first github entry back in 2009-02-05. Since then we’ve had quality contributions made by the public. We amazingly hit our 1000th table and the award goes to Carson McDonald (@casron) for submitting soundcloud.playlists.xml!! This means on average we get just over 1 new table a day. Thousands of developers benefit from using these tables and we’re impressed and thankful for all your contributions. Keep them coming!!

Yahoo! Pipes V2 engine timeline.

Filed under news

Yahoo! Pipes V2 engine will be using the YQL engine to process Pipes. Here is the timeline for the new changes, soon to take effect.

http://blog.pipes.yahoo.net/2011/06/10/pipes-v2-engine-timeline/

Changelog for build 17991

Filed under changelog

New Feature Highlights

  • Streaming support (experimental) – http://developer.yahoo.com/yql/guide/yql-odt-streaming.html
  • Making Asynchronous Calls with JavaScript Execute – http://developer.yahoo.com/yql/guide/yql-execute-bestpractices.html#yql-execute-asynchronous_calls
  • Query parameter added to support jsonCompat=new
  • Single-element JSON arrays are corrupted into JSON objects
  • Enable round trip lossless JSON processing
  • Added y.rest.head() method support
  • Added support for “NOT IN” comparison operator
  • Improved CSV handling to cope with quotes according to RFC http://tools.ietf.org/html/rfc4180
  • Add decompress(true) on y.rest – y.rest(“http://www.apache.org”).header(“Accept-Encoding”,”gzip”).decompress(true).get();
  • Added meta data from sub queries for yql.query.multi tables
  • Added .forceCharset(“foo”) on rest object – y.rest(‘http://example-domain-page.com’).forceCharset(“ISO-8859-1″).get()
  • Added .fallbackCharset(“foo1”, “foo2”) on rest object – y.rest(url).fallbackCharset(‘shift_jis, ISO-8859-1′).get().response;
  • Added native support json encode/decode – y.parseJson(“str”)

New core tables

  • feednormalizer

Other features

  • Support formatted output in y.log of native arrays

Changes

  • Return the html body in case of error (HTTP status like 400, 404 etc..)

Bug fixes, including

  • Table names that contain the word ‘matches’ fail
  • Setting no value on required key through function call should return error
  • Calling a non existing function should return error
  • Validate field in sort function – sort on valid field only
  • Matrix parameters in multiple paths are now properly handled
  • JSON numbers are corrupted into JSON strings
  • y.exit() in included JS causes uncaught ExitException
  • Charset parameter in html table not working