Author Archives: yqlteam

YQL Table Health and YQL Lint

Filed under feature, news

YQL has attracted a large number of OpenData tables thanks to the efforts of the community. But some of these tables don’t end up working properly due to many factors, like recent changes made to the underlying API. Therefore we’ve created two new tools, YQL Table Health and YQL Lint, to help developers see and understand which tables actually work and which ones don’t.

YQL Table Health is intended to provide a quick general overview of how “healthy” the community OpenData tables are:

[Screenshot: YQL Table Health]

When you first arrive at the page, you will see a list of all the tables that can be used by YQL. Clicking on an entry expands it to show where the table’s XML source file is located, what kind of table it is, its sample query, and any errors that were encountered. You can use the controls on the left-hand side to filter, sort, and search through all this data. If you see a table that doesn’t work, you can contact the table’s author via GitHub and ask for a fix. You may also fork yql-tables on GitHub and fix or enhance the table yourself.

YQL Table Health uses a server-side script to iterate through all the tables. Each table’s XML file is loaded into YQL cloud storage before a series of checks are run against that XML file. The test results are then cached both in a database and in memory so that the data can be served as fast as possible. Cache updates are triggered by users visiting the page and only fire if the data is older than thirty minutes; update requests are also synchronized to prevent a race condition in which two or more updates run simultaneously. Last but not least, the data is served to the user through a user interface built using HTML5 and the YUI Library.
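The caching behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual YQL Table Health code: the class name, the `run_checks` callable, and the in-memory-only storage are all assumptions (the real tool also persists results in a database).

```python
import threading
import time

CACHE_TTL_SECONDS = 30 * 60  # refresh only if the data is older than thirty minutes


class TableHealthCache:
    """Caches check results and refreshes them at most once per TTL."""

    def __init__(self, run_checks, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._run_checks = run_checks  # hypothetical callable that tests every table
        self._ttl = ttl
        self._clock = clock
        self._lock = threading.Lock()
        self._results = None
        self._fetched_at = None

    def _is_stale(self):
        if self._fetched_at is None:
            return True
        return self._clock() - self._fetched_at > self._ttl

    def get(self):
        if self._is_stale():
            with self._lock:
                # Re-check after acquiring the lock: a concurrent request
                # may already have refreshed the data while we waited.
                if self._is_stale():
                    self._results = self._run_checks()
                    self._fetched_at = self._clock()
        return self._results
```

A page handler would call `cache.get()` on every visit; only the first visitor after the thirty-minute window pays for the refresh, and simultaneous visitors wait on the lock instead of starting a second run.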

The next tool we’re going to introduce to you is YQL Lint.

[Screenshot: YQL Lint]

YQL Lint is essentially an XML debugger for individual YQL tables. You can enter either the URL of an XML file or the contents of an XML file, and it will validate the file against our schema for syntax flaws. Once the schema check has passed, we use YQL to get a description of your YQL table and check whether it contains a sample query that returns a valid result. YQL Lint relies on the same core backend as YQL Table Health.
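As a rough local approximation of the first steps, the sketch below checks that a table definition is well-formed XML and that it declares a sample query. It is not YQL Lint’s implementation: it does not validate against the real schema and does not execute the sample query. The `lint_table` function is hypothetical, and the namespace URI and the `sampleQuery` element name are assumptions about the Open Data Table format.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for Open Data Table definitions
TABLE_NS = "http://query.yahooapis.com/v1/schema/table.xsd"


def lint_table(xml_text):
    """Return a list of problems found in a table definition (empty if none)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{TABLE_NS}}}table":
        problems.append("root element is not <table> in the expected namespace")

    # Collect non-empty sample queries anywhere in the document
    samples = [
        el.text.strip()
        for el in root.iter(f"{{{TABLE_NS}}}sampleQuery")
        if el.text and el.text.strip()
    ]
    if not samples:
        problems.append("no <sampleQuery> found in the table description")
    return problems
```

An empty list means the definition passed these two checks only; whether the sample query returns a valid result still has to be tested against YQL itself.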

Please experiment with these tools and send us any questions or suggestions you might have.

Daniel Park – YQL/Pipes Intern

Search tables and BOSS v1

Filed under news

We’ve removed all search tables that relied on the BOSS v1 API (search.web, search.image, and search.news) because BOSS v1 no longer exists as of today (http://www.ysearchblog.com/2011/06/30/you-asked-for-this-boss-v2-updates/).

If you relied on those tables, please consider using the community BOSS v2 table (https://github.com/yql/yql-tables/blob/master/boss/boss.search.xml).

Thanks -YQL Team

YQL and Comet-based Streaming

Filed under feature

Summary

The latest YQL release adds support for Comet-based streaming with Downstream Polling (“CDP”), which allows YQL clients to receive updates to their queries in real time.

Motivation

In traditional YQL, a client must poll the YQL server for updates, by sending the same YQL query over and over again. Each time, the YQL server parses the query into a Pipe object, executes it, sends the results to the client, and closes the response.

Traditional way of invoking a Pipe

This approach is inefficient and does not scale well for updates. First, the YQL server has to parse the same YQL statements into corresponding Pipe objects over and over again, and each Pipe object is used to produce only a single response, after which it is garbage-collected. Second, there is no guarantee that the response data will have changed, resulting in unnecessary network traffic and wasted Pipe constructions and executions.

Long polling is not a solution either: the YQL server cannot know how long to keep a response open, because it has no way of telling when new data has become available. As with busy polling, Pipe objects are not reused.

CDP attempts to address these deficiencies: In this mode, the client opens a single persistent connection to the server and sends the YQL query in the initial request. The YQL engine on the YQL server parses the query into a Pipe object, but instead of discarding the Pipe after a single execution and closing the response, it holds on to both the Pipe (turning it into a Standing Pipe), which allows it to execute the same query repeatedly over a period of time, and the Comet-enabled response, which allows it to send updated results to the client asynchronously and in real time.

Periodic invocation of Standing Pipe

Polling Frequency

In order to enable a table for CDP, its developer must specify the frequency (in seconds) that is appropriate for polling the table’s downstream web service for updates, using the new pollingFrequencySeconds table attribute.

If the YQL query is mapped to a single table, then the Standing Pipe is executed at an interval equal to that table’s pollingFrequencySeconds. If the YQL query is mapped to multiple tables, then the execution interval of the Standing Pipe is set to the largest pollingFrequencySeconds value among the tables involved (that is, the Standing Pipe runs only as often as the least frequently polled table), to increase the likelihood that each Standing Pipe execution will yield updated results.
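The rule for choosing the execution interval can be written as a one-line computation. The sketch below is only an illustration of that rule; the function name and the table names in the usage note are hypothetical.

```python
def standing_pipe_interval(polling_seconds_by_table):
    """Return the execution interval (seconds) for a query over the given tables.

    polling_seconds_by_table maps each table name to its
    pollingFrequencySeconds value.
    """
    if not polling_seconds_by_table:
        raise ValueError("query must map to at least one CDP-enabled table")
    # Single table: its own value. Multiple tables: the largest value.
    return max(polling_seconds_by_table.values())
```

For instance, a query joining a table polled every 30 seconds with one polled every 120 seconds, `standing_pipe_interval({"fast.table": 30, "slow.table": 120})`, yields 120.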

Check out the YQL documentation for an example of how to enable a table for CDP.

Future Enhancements

A future enhancement will have the YQL engine participate in a truly event-driven, publish-subscribe (Bayeux) style notification system, where a table’s downstream service will be a named source of events, to which the YQL engine will subscribe through the appropriate event channel.

Implementation Status and Limitations

The current implementation of CDP is considered experimental and is made available on separate YQL web service endpoints, which are named after the traditional YQL web service endpoints, with streaming inserted into their URI paths. Therefore, YQL’s streaming-enabled endpoint for public tables is accessible through this URL:

http://query.yahooapis.com/v1/public/streaming/yql?[query_params]

whereas the streaming-enabled endpoint for OAuth-protected tables can be accessed at this URL:

http://query.yahooapis.com/v1/streaming/yql?[query_params]
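A small helper for building these URLs might look like the sketch below. It only constructs the URL and sends no request. The `q` and `format` parameter names are assumed to match those of the traditional endpoints, and the function name and the table name in the usage note are hypothetical.

```python
from urllib.parse import urlencode

PUBLIC_STREAMING = "http://query.yahooapis.com/v1/public/streaming/yql"
OAUTH_STREAMING = "http://query.yahooapis.com/v1/streaming/yql"


def streaming_url(query, public=True, **params):
    """Build a streaming-endpoint URL for a YQL query.

    Extra keyword arguments are appended as additional query parameters.
    """
    base = PUBLIC_STREAMING if public else OAUTH_STREAMING
    # urlencode takes care of escaping spaces, quotes, and '*'
    return base + "?" + urlencode({"q": query, **params})
```

Calling `streaming_url("select * from mytable", format="json")` returns a URL on the public streaming endpoint; passing `public=False` targets the OAuth-protected one, which additionally requires a signed request.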

The number of concurrent Comet connections is throttled at the YQL engine: when the maximum number of concurrent Comet connections has been reached, any requests that would normally have been put into CDP mode are served in the traditional way.

The version of the Comet implementation that CDP builds upon does not support a configurable timeout for Comet connections, with the effect that a Comet connection will remain open for only 20 seconds. This limitation will be lifted in a future YQL release.

Support for round trip lossless JSON processing

Filed under feature

There have been many reports (and complaints) in the past about the YQL engine changing the structure of a JSON response from a downstream webservice as part of its processing before returning JSON output to the client.

This corruption of JSON response content would manifest itself in a number of ways: JSON numbers in the downstream response would be delivered as JSON strings to the client, and single-element JSON arrays converted to JSON objects, among others.

For example, "myint":[5] in the downstream response would be returned as "results":{"json":{"myint":"5"}} to the client.
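To make the two symptoms concrete, the following sketch mimics them on a parsed JSON value. It reproduces only the behavior described above (numbers turned into strings, single-element arrays unwrapped) and is not the YQL engine’s actual conversion logic; the function name is hypothetical.

```python
import json


def legacy_lossy(value):
    """Mimic the two corruptions described above (illustration only)."""
    if isinstance(value, dict):
        return {k: legacy_lossy(v) for k, v in value.items()}
    if isinstance(value, list):
        converted = [legacy_lossy(v) for v in value]
        # A single-element array loses its array wrapper
        return converted[0] if len(converted) == 1 else converted
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)  # JSON numbers become JSON strings
    return value


downstream = json.loads('{"myint": [5]}')
print(json.dumps(legacy_lossy(downstream)))  # {"myint": "5"}
print(json.dumps(downstream))                # {"myint": [5]}
```

The first line printed corresponds to the lossy output; the second is what a lossless round trip preserves.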

Recent YQL releases have fixed these issues and now provide support for round trip lossless JSON processing. As we want to give developers sufficient time to take advantage of and adapt to this change, we have not yet enabled this feature by default. To enable it, simply append jsonCompat=new to your YQL query.

Please start experimenting with this new query parameter and give us feedback. Eventually, this feature will be enabled by default, but there will be an announcement before we make the switch.

1000th Community Table!

Filed under news

We started datatables.org with our first GitHub entry on 2009-02-05. Since then we’ve had quality contributions made by the public. Amazingly, we have hit our 1000th table, and the award goes to Carson McDonald (@casron) for submitting soundcloud.playlists.xml!! This means that on average we get just over 1 new table a day. Thousands of developers benefit from using these tables and we’re impressed and thankful for all your contributions. Keep them coming!!

Yahoo! Pipes V2 engine timeline.

Filed under news

Yahoo! Pipes V2 engine will be using the YQL engine to process Pipes. Here is the timeline for the new changes, soon to take effect.

http://blog.pipes.yahoo.net/2011/06/10/pipes-v2-engine-timeline/

Changelog for build 17991

Filed under changelog

New Feature Highlights

  • Streaming support (experimental) – http://developer.yahoo.com/yql/guide/yql-odt-streaming.html
  • Making Asynchronous Calls with JavaScript Execute – http://developer.yahoo.com/yql/guide/yql-execute-bestpractices.html#yql-execute-asynchronous_calls
  • Query parameter added to support jsonCompat=new
  • Fixed: single-element JSON arrays were corrupted into JSON objects
  • Enable round trip lossless JSON processing
  • Added y.rest.head() method support
  • Added support for “NOT IN” comparison operator
  • Improved CSV handling to cope with quotes according to RFC http://tools.ietf.org/html/rfc4180
  • Add decompress(true) on y.rest – y.rest("http://www.apache.org").header("Accept-Encoding","gzip").decompress(true).get();
  • Added meta data from sub queries for yql.query.multi tables
  • Added .forceCharset("foo") on rest object – y.rest('http://example-domain-page.com').forceCharset("ISO-8859-1").get()
  • Added .fallbackCharset("foo1", "foo2") on rest object – y.rest(url).fallbackCharset('shift_jis, ISO-8859-1').get().response;
  • Added native support for JSON encode/decode – y.parseJson("str")

New core tables

  • feednormalizer

Other features

  • Support formatted output in y.log of native arrays

Changes

  • Return the html body in case of error (HTTP status like 400, 404 etc..)

Bug fixes, including

  • Table names that contain the word ‘matches’ fail
  • Setting no value on required key through function call should return error
  • Calling a non existing function should return error
  • Validate field in sort function – sort on valid field only
  • Matrix parameters in multiple paths are now properly handled
  • JSON numbers are corrupted into JSON strings
  • y.exit() in included JS causes uncaught ExitException
  • Charset parameter in html table not working

Flickr Table change.

Filed under news

In the next few days we will change the internal Flickr tables. Once that change is live, we recommend that you use your own Flickr API key to access the flickr.* YQL tables.

If you do not provide an API key, the request will still work, but it is subject to rate limiting, which may be triggered by heavy use of these tables.

Please sign up for a Flickr API key if you haven’t already and use that in your Flickr YQL calls going forward.

For example: SELECT * FROM flickr.places WHERE query="north beach" and api_key="your key here"

Thanks -YQL Team

YQL + YUI: Building End-to-End Applications

Filed under tutorial

The third team talk was presented by Paul Donnelly and Nagesh Susarla. They go over how to start your query out in the YQL console, access YQL data via the various endpoints, and go through YQL’s various authentication layers.

Code examples here. And slides here.

Here is the link to the original yuiblog post and below an embedded video.

Building YQL Open Data Tables with YQL Execute

Filed under tutorial

The second YQL talk at YUIConf was presented by Nagesh Susarla. Nagesh goes over how to use YQL execute in the open data tables. Here is the link to the original yuiblog post and below an embedded video.