Category Archives: feature

The powerful YQL feednormalizer table

Filed under feature, tutorial

YQL’s feednormalizer table is used to convert an input feed in one format into an output feed of another format.

Optionally, a prexsl transformation can be applied to the input feed before format conversion, and a postxsl transformation can be applied to the output feed after format conversion. A prexsl transformation can be used to correct badly formed feeds, whereas a postxsl transformation can be used to rearrange, filter, or format the final output.

Input feeds can be of any character encoding; however, the output is always transcoded into UTF-8. Illegal characters found in the feed during transcoding are removed.

Syntax
The input feed can either be specified as a url or as an xml string. When the desired output format of the feed is specified, the input feed will be converted into that format. Optionally, one can supply xsl transforms to pre-process the input feed or post-process the output feed.

SELECT * FROM feednormalizer
 WHERE (url= | xml=)
  [AND output=('rss_0.91N'|'rss_0.93'|'rss_0.92'|'rss_1.0'|'rss_0.94'|'rss_2.0'|'rss_0.91U'|'rss_0.9'|'atom_1.0'|'atom_0.3')]
  [AND prexslurl=]
  [AND postxslurl=]
  [AND timeout=]

or

SELECT * FROM feednormalizer
 WHERE (url= | xml=)
  [AND output=('rss_0.91N'|'rss_0.93'|'rss_0.92'|'rss_1.0'|'rss_0.94'|'rss_2.0'|'rss_0.91U'|'rss_0.9'|'atom_1.0'|'atom_0.3')]
  [AND prexsl=]
  [AND postxsl=]
  [AND timeout=]

If a timeout value (in milliseconds) is specified, the feednormalizer table expects the url to respond within that time; otherwise, an error message is returned.
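
The syntax above can also be assembled programmatically. The Python sketch below is only an illustration: the helper name `build_feednormalizer_query` is hypothetical, it covers just the `url` and `output` keys, it treats `rss_0.91U` and `rss_0.9` as two separate formats (an assumption), and it rejects values containing a single quote rather than guessing at YQL's escaping rules.

```python
# Output formats named in the syntax above (rss_0.91U and rss_0.9 are
# assumed to be two separate formats).
OUTPUT_FORMATS = {
    "rss_0.9", "rss_0.91N", "rss_0.91U", "rss_0.92", "rss_0.93",
    "rss_0.94", "rss_1.0", "rss_2.0", "atom_0.3", "atom_1.0",
}


def build_feednormalizer_query(url, output=None):
    """Build a feednormalizer SELECT statement (hypothetical helper)."""
    if "'" in url:
        # Escaping is out of scope for this sketch, so refuse instead of guessing.
        raise ValueError("url must not contain a single quote")
    clauses = [f"url='{url}'"]
    if output is not None:
        if output not in OUTPUT_FORMATS:
            raise ValueError(f"unsupported output format: {output}")
        clauses.append(f"output='{output}'")
    return "SELECT * FROM feednormalizer WHERE " + " AND ".join(clauses)


print(build_feednormalizer_query(
    "http://rss.news.yahoo.com/rss/topstories", output="atom_1.0"))
```

Validating the output format before sending the query catches typos locally instead of waiting for an error from the service.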

Example 1: Querying a valid input feed

SELECT * FROM feednormalizer
 WHERE url='http://rss.news.yahoo.com/rss/topstories'

Try above example in YQL Console. Try above example as REST request.
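
As a rough sketch, the REST form of Example 1 carries the statement in a query parameter of the public endpoint. The endpoint URL used here is inferred as the traditional counterpart of the streaming endpoint listed later on this page, and the `q` and `format` parameter names are assumptions that do not come from this post; the function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed public endpoint (see the lead-in above).
YQL_PUBLIC_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def yql_rest_url(statement, fmt="xml"):
    """Return a GET URL that carries the YQL statement in the q parameter."""
    return YQL_PUBLIC_ENDPOINT + "?" + urlencode({"q": statement, "format": fmt})


statement = (
    "SELECT * FROM feednormalizer "
    "WHERE url='http://rss.news.yahoo.com/rss/topstories'"
)
print(yql_rest_url(statement))
```

`urlencode` percent-encodes the quotes, colons, and slashes in the statement, so the feed URL inside the query does not clash with the request URL itself.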


Example 2: Simple conversion of valid input feed

SELECT * FROM feednormalizer
 WHERE url='http://rss.news.yahoo.com/rss/topstories'
   AND output='atom_1.0'

Try above example in YQL Console. Try above example as REST request.


Example 3: Converting invalid input feed produces error

Invalid feeds (such as http://www.yqlblog.net/blog/wp-content/uploads/tmp/example_feed.xml) produce errors when being transformed. For example, the statement:

SELECT * FROM feednormalizer
 WHERE url='http://www.yqlblog.net/blog/wp-content/uploads/tmp/example_feed.xml'
   AND output='rss_2.0'

produces the following error during execution:

Could not parse feed data. Invalid rss_2.0 feed, missing image title

Try above example in YQL Console. Try above example as REST request.


Example 4: Successfully converting invalid feeds using an XSL transform

In Example 3, we saw that the invalid feed could not be converted because its image is missing a title. We can work around this by removing the <image> tag, which turns the invalid input feed into a valid one. This can be done with an XSL transform, as shown below:

SELECT * FROM feednormalizer
 WHERE url='http://www.yqlblog.net/blog/wp-content/uploads/tmp/example_feed.xml'
   AND output='rss_2.0'
   AND prexsl='<?xml version="1.0" encoding="ISO-8859-1"?>
               <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
                <xsl:output omit-xml-declaration="yes"/>
                   <xsl:template match="node()|@*">
                     <xsl:copy>
                        <xsl:apply-templates select="node()|@*"/>
                     </xsl:copy>
                   </xsl:template>
                   <xsl:template match="image"/>
               </xsl:stylesheet>'

Try above example in YQL Console. Try above example as REST request.


Example 5: HTML generation

A postxsl transformation can be applied to Example 4 to convert the corrected feed into HTML:

SELECT * FROM feednormalizer 
 WHERE url='http://www.yqlblog.net/blog/wp-content/uploads/tmp/example_feed.xml'
   AND output='rss_2.0'
   AND prexsl='<?xml version="1.0" encoding="ISO-8859-1"?>
               <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
                <xsl:output omit-xml-declaration="yes"/>
                   <xsl:template match="node()|@*">
                     <xsl:copy>
                        <xsl:apply-templates select="node()|@*"/>
                     </xsl:copy>
                   </xsl:template>
                   <xsl:template match="image"/>
               </xsl:stylesheet>'
   AND postxsl='<?xml version="1.0" encoding="ISO-8859-1"?>
                <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
                  <xsl:template match="/">
                    <html>
                      <body>
                        <h2>My News</h2>
                        <table border="1">
                          <tr bgcolor="#9acd32">
                            <th>Title</th>
                            <th>Description</th>
                          </tr>
                          <xsl:for-each select="rss/channel/item">
                            <tr>
                              <td><a><xsl:attribute name="href"><xsl:value-of select="link"/></xsl:attribute><xsl:value-of select="title"/></a></td>
                              <td><xsl:value-of select="description"/></td>
                            </tr>
                          </xsl:for-each>
                        </table>
                      </body>
                    </html>
                  </xsl:template>
                </xsl:stylesheet>'

Try above example in YQL Console. Try above example as REST request.


Example 6: HTML generation, 2nd example

For valid feeds that are already in the desired format, postxsl can be applied directly for HTML conversion:

SELECT * FROM feednormalizer
 WHERE url='http://rss.news.yahoo.com/rss/topstories'
   AND postxsl='<?xml version="1.0" encoding="ISO-8859-1"?>
                <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
                  <xsl:template match="/">
                    <html>
                      <body>
                        <h2>My News</h2>
                        <table border="1">
                          <tr bgcolor="#9acd32">
                            <th>Title</th>
                            <th>Description</th>
                          </tr>
                          <xsl:for-each select="rss/channel/item">
                            <tr>
                              <td><a><xsl:attribute name="href"><xsl:value-of select="link"/></xsl:attribute><xsl:value-of select="title"/></a></td>
                              <td><xsl:value-of select="description"/></td>
                            </tr>
                          </xsl:for-each>
                        </table>
                      </body>
                    </html>
                  </xsl:template>
                </xsl:stylesheet>'

Try above example in YQL Console. Try above example as REST request.


Example 7: Transcoding input feeds into UTF-8

Input feed documents may be in any encoding, but the output is always in UTF-8. During transcoding, illegal characters encountered in the input feed are removed. When that happens, a diagnostic message such as the following appears in the output:

removed 14 badly encoded characters from feed.

SELECT * FROM feednormalizer
 WHERE url='http://cn.wsj.com/big5/rssbch.xml'
   AND output='atom_1.0'

Try above example in YQL Console. Try above example as REST request.


Example 8: Selecting and filtering input feeds using YQL

SELECT entry.title FROM feednormalizer
 WHERE url='http://rss.news.yahoo.com/rss/topstories'
   AND output='atom_1.0' | sort('entry.title')

Try above example in YQL Console. Try above example as REST request.

Publishing to GitHub

Filed under feature, news

A lot of hassle is involved in submitting your own table to YQL.

First, you would have to open a GitHub account. Then you would visit the YQL repository on GitHub and create a fork. Once you’ve done that, you’d have to pull the fork onto your local machine. After the fork has been pulled, you would have to sift through directories to get a feel for the table naming convention before adding your table in. Finally you’d have to make a pull request to YQL and wait until your table has been approved.

Today, we’re releasing a new feature in the Table Editor that allows you to submit tables directly to GitHub without even touching the command line. All you have to do is press a simple button to start.

Clicking the “Publish to GitHub” button will show you a brief disclaimer that explains how your table will be submitted.

Once you’ve signed in and authenticated the Table Editor with GitHub, we’ll go ahead and publish your table. In the future, you won’t have to authenticate with GitHub in order to publish a table. And if you ever wish to disassociate your GitHub account from the Table Editor, you can visit your GitHub application settings to revoke access.

Currently we will only push tables to GitHub. Environment and JS files will not be pushed at this time.

In addition to this new feature, we’ve ported all client-side code over to using the YUI 3 App Framework for better flexibility in the future.

Please try out this new feature and send us any questions or suggestions you might have.

Run Yahoo! Pipes from YQL Execute

Filed under feature, news

We recently added a new YQL execute method: y.pipe(pipeid,params)

This method will allow you to run a Yahoo Pipe within a YQL execute statement.

Accepted Parameters:
pipeid (required string)
params (optional json object)

Returns:
A response object that contains a result instance or an error object

Why would you want to do this?

1. It lets you use all the benefits that a YQL execute statement gives you.
2. Higher rate limits! You benefit from the YQL rate limits instead of the Pipes limits (which are much lower).
3. Mashups with other YQL tables.
4. You can extend your Pipe with server-side JavaScript.

Example:

This is the Pipe that I want to run in YQL: http://pipes.yahoo.com/pipes/pipe.info?_id=990bf4c00040ad06ba83de9aadd6293b

Here is a simple sample YQL table that uses y.pipe. You can create one too using the YQL Table Editor.

And this is the YQL query I would run to access my example: use "store://HuTaxj5021R7THBitCNIcJ" as ypipe_example; select * from ypipe_example

(You could always extend the sample table to accept the pipeid or pipe params as a parameter via the input keys)

Currently, YQL does not produce the output formats that Pipes does (RSS, KML, ICAL, CSV).

YQL currently produces only XML, JSON, JSONP, and JSONP-X output.

YQL Rate Limit Increase

Filed under feature, news

Effective immediately the YQL Rate Limits are now increased:

Public endpoint to 2k/ip/hr (previously 1k)
OAuth endpoint to 20k/ip/hr (previously 10k)

The per-application limit (identified by your Access Key) remains the same: 100,000 calls per day.

With these limits, a single application can make up to 3 million (30-day month) to 3.1 million (31-day month) requests per month under the 100k daily cap.
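
The monthly figure follows directly from the daily per-application cap. A quick Python check of the arithmetic (the names are illustrative only):

```python
DAILY_CAP = 100_000  # per-application calls per day, as stated above


def monthly_cap(days_in_month):
    """Maximum calls in a month when the daily cap is used up every day."""
    return DAILY_CAP * days_in_month


for days in (30, 31):
    print(f"{days} days: {monthly_cap(days):,} calls")
```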

Thank you to the YQL community – we hope you enjoy this increase.

New enhancements to the YQL console and editor

Filed under changelog, feature

Some new enhancements were recently made to the YQL console and editor.

Console:

  • New debug checkbox.
    • Checking this will simply add debug=true to your console URL. When debug=true is set, it enables open table debugging and viewing of YQL network calls.
  • New expand checkbox.
    • Checking this will expand the results section to full height.
  • Renamed “My Tables” to “My YQL”.
    • With this release, you have the ability to create YQL tables, environments and js files via the editor. The My YQL section lists those files based on those types. This is the section where you can launch the editor to edit the files and the area where you can delete your files.

      Some actions when clicking on the file names:

      • Clicking on a table name will put the store execute key into the yql statement area and desc the table.
      • Clicking on an environment name will load that environment by adding env=store://(your store execute key here) to the console.
      • JS files are only editable via the YQL editor.

      Only you will know what YQL files you have created, since you need to be logged in to create and view them. But you can share your YQL file store execute keys, as they can be run by anyone after sharing. Only share them if you want others to run that file.

Editor:

  • The ability to create different YQL file types.
    • You now can create Tables, YQL environments and JS files. Simply select the “Save as” drop down to save the type of file needed.
    • Regarding JS files, these are files you can y.include() into your execute statement in a table.
  • Dragging a YQL file from the sidebar onto the editor will produce a contextual code snippet.

    • Saving file as: Table or JS
      • Dragging a Table file will produce: y.use("store://execute key here","namespace here");
      • Dragging an Environment file will produce: y.env("store://execute key here");
      • Dragging a JS file will produce: y.include("store://select key here");
    • Saving file as: Environment
      • Dragging a Table file will produce: use "store://execute key here" as name_space;
      • Dragging an Environment file will produce: env "store://execute key here";
      • Dragging a JS file will produce:
        set change_var_name="store://select key here" on change_to_table;
  • Full screen layout
  • Sidebar which contains Sample templates, keys of a table, and a list of your files by type.
  • Changed file access FROM: tableid=id_here TO: id=id_here

Recent Enhancement to the HTML table

Filed under feature, news

The HTML table has recently been enhanced to support HTML5. The HTML table uses a parser in the backend that autocorrects malformed tags. To support HTML5, we are using a different parser than the one used previously. Because of this change, the output might be slightly different than before. To ensure backward compatibility, both parsers are supported, with the older one being the default. The new parser can be used by appending compat="html5" to the query.

For example: select * from html where url="http://finance.yahoo.com/q?s=yhoo" and compat="html5"

Please start using this feature and give us your feedback! Eventually the new parser, which supports HTML5, will be made the default, but only after an announcement. Even after the new parser becomes the default, the old one can still be used by setting compat to "html4".

YQL Editor

Filed under feature

(updated)

The YQL Editor is a simple and easy way to create your table in Yahoo’s cloud. The editor makes use of the yql.storage table to store your table with Yahoo’s cloud instead of hosting it on your own server.

Simply access it from the YQL console on the upper right hand column under “My Tables”.

Some quick notes: You must be logged in to view, create, and edit your tables. You cannot view other people’s tables. Tables that you stored directly with the yql.storage table, whether earlier or from now on, will not show up in “My Tables”. “My Tables” makes it easy to track tables you created via the YQL Editor while logged in.

Click the “new” link to launch the YQL Editor. This will open up the editor in a separate page. By default new tables are named “untitled_table”. You can rename the table by simply clicking on the name.

The Tables dropdown provides sample templates to construct your table. It also will show your tables if you have any.

When in the “My Tables” section in the console, clicking on the table name will put the store execute key into the yql statement area and desc the table.

To query your table store, put the YQL query statement after the “use” declaration. For example: use "store://Tdr13p0ubxczYZ78ia0Sph" as zillow; select * from zillow where address = "1835 73rd Ave NE" and citystatezip = "98039" and zwsid = "X1-ZWz1cse68iatcb_13bwv"

Quick note: You can share your table’s store execute key, and anyone who has it can run the table. The key is known only to you unless you choose to share it.

You can also make your endpoint (which will be really long) into a query alias. Click on the “Create Query Alias” link on the top right hand side of the YQL statement box to customize your endpoint.

Currently, it can take up to 30 seconds to see changes made to your table after editing. This will be fixed in a future YQL release. By adding debug=true to your query (or console), you can see real time edits after saving.

We plan to add new features to the YQL Editor as time goes on. Future releases will include the ability to manage and create your own YQL environments and hosted Javascript files. Please let us know of any features you’d like to see at yql-questions (at) yahoo-inc.com.

YQL Table Health and YQL Lint

Filed under feature, news

YQL has attracted a large number of OpenData tables thanks to the efforts of the community. But some of these tables don’t end up working properly due to many factors, like recent changes made to the underlying API. Therefore we’ve created two new tools, YQL Table Health and YQL Lint, to help developers see and understand which tables actually work and which ones don’t.

YQL Table Health is intended to provide a quick general overview of how “healthy” the community OpenData tables are.

When you first arrive at the page, you will see a list of all the tables that can be used by YQL. Clicking on one of the entries in the list will cause it to expand and show additional information: where the source of the XML file is, what kind of table it is, sample query information, and any errors that were encountered. You can use the controls on the left-hand side to further filter, sort, and search through all this data. If you see a table that doesn’t work, you can contact the author of the table via GitHub to get it fixed. You may also fork yql-tables on GitHub and fix or enhance the table yourself.

YQL Table Health uses a server-side script to iterate through all the tables. Each table’s XML file is loaded into YQL cloud storage before a series of checks are run against that XML file. The test results are then cached in a database as well as in memory to serve this data as fast as possible. Updates to the caches are triggered by users visiting the page and only fire if the data is older than thirty minutes; requests for an update are also synchronized to prevent a race condition in which two or more requests are made simultaneously. Last but not least, the data is served to the user through a user interface built using HTML5 and the YUI Library.
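
The refresh policy described above (refresh on a visit, only when the data is older than thirty minutes, with concurrent refreshes serialized) can be sketched in Python. This is not the actual YQL Table Health code: the class name, the `loader` callback, and the injectable clock are all assumptions made for illustration, and the database layer is left out.

```python
import threading
import time


class VisitRefreshedCache:
    """In-memory cache that is refreshed on access once its data is too old."""

    def __init__(self, loader, max_age_seconds=30 * 60, clock=time.monotonic):
        self._loader = loader            # hypothetical: runs the table checks
        self._max_age = max_age_seconds  # thirty minutes by default
        self._clock = clock
        self._lock = threading.Lock()    # serializes refreshes across threads
        self._value = None
        self._loaded_at = None

    def get(self):
        # Holding the lock for the whole check-and-refresh step means two
        # simultaneous visits cannot both trigger a refresh.
        with self._lock:
            now = self._clock()
            stale = (self._loaded_at is None
                     or now - self._loaded_at > self._max_age)
            if stale:
                self._value = self._loader()
                self._loaded_at = now
            return self._value
```

A page handler would call `cache.get()` on every visit; only the first visit after the thirty-minute window pays the cost of rerunning the checks.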

The next tool we’re going to introduce to you is YQL Lint.

YQL Lint is essentially an XML debugger for individual YQL tables. You can enter either a URL to an XML file or the contents of an XML file, and it will validate the input against our schema for syntax flaws. Once the schema check has passed, we use YQL to get a description of your YQL table and check whether it contains a sample query that returns a valid result. YQL Lint relies on the same core backend as YQL Table Health.

Please experiment with these tools and send us any questions or suggestions you might have.

Daniel Park – YQL/Pipes Intern

YQL and Comet-based Streaming

Filed under feature

Summary

The latest YQL release adds support for Comet-based streaming with Downstream Polling (“CDP”), which allows YQL clients to receive updates to their queries in real time.

Motivation

In traditional YQL, a client must poll the YQL server for updates, by sending the same YQL query over and over again. Each time, the YQL server parses the query into a Pipe object, executes it, sends the results to the client, and closes the response.

Traditional way of invoking a Pipe

This approach is inefficient and does not scale well for updates: First of all, the YQL server has to parse the same YQL statements into corresponding Pipe objects over and over again, where each Pipe object is used to produce only a single response, after which it will be garbage-collected.  Secondly, there is no guarantee that the response data will have changed, resulting in unnecessary network traffic and wasted Pipe constructions and executions.

Long polling is not a solution either: It is impossible for the YQL server to know for how long to keep open a response, because it has no way of telling when new data has become available. As with busy polling, Pipe objects are not reused.

CDP attempts to address these deficiencies: In this mode, the client opens a single persistent connection to the server and sends the YQL query in the initial request. The YQL engine on the YQL server parses the query into a Pipe object, but instead of discarding the Pipe after a single execution and closing the response, it holds on to both the Pipe (turning it into a Standing Pipe), which allows it to execute the same query repeatedly over a period of time, and the Comet-enabled response, which allows it to send updated results to the client asynchronously and in real time.

Periodic invocation of Standing Pipe

Polling Frequency

In order to enable a table for CDP, its developer must specify the frequency (in seconds) that is appropriate for polling the table’s downstream web service for updates, using the new pollingFrequencySeconds table attribute.

If the YQL query is mapped to a single table, then the frequency with which the Standing Pipe will be executed is equal to the table’s pollingFrequencySeconds. If the YQL query is mapped to multiple tables, then the execution frequency of the Standing Pipe is set to the largest polling frequency of the tables involved, to increase the likelihood that each Standing Pipe execution will yield updated results.
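
The rule for choosing the execution frequency can be sketched as follows. The function name is hypothetical, and the sketch reads "largest polling frequency" as the largest `pollingFrequencySeconds` value among the tables involved, which is an interpretation of the text rather than confirmed engine behavior.

```python
def standing_pipe_interval(polling_frequency_seconds):
    """Pick the execution interval (in seconds) for a Standing Pipe.

    polling_frequency_seconds: one pollingFrequencySeconds value for each
    table that the query maps to.
    """
    values = list(polling_frequency_seconds)
    if not values:
        raise ValueError("at least one table is required")
    # A single table yields its own value; several tables yield the largest.
    return max(values)


print(standing_pipe_interval([30]))
print(standing_pipe_interval([30, 120, 60]))
```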

Check out the YQL documentation for an example of how to enable a table for CDP.

Future Enhancements

A future enhancement will have the YQL engine participate in a truly event-driven, publish-subscribe (Bayeux) style notification system, where a table’s downstream service will be a named source of events, to which the YQL engine will subscribe through the appropriate event channel.

Implementation Status and Limitations

The current implementation of CDP is considered experimental and is made available on separate YQL web service endpoints, which are named after the traditional YQL web service endpoints, with streaming inserted into their URI paths. Therefore, YQL’s streaming-enabled endpoint for public tables is accessible through this URL:

http://query.yahooapis.com/v1/public/streaming/yql?[query_params]

whereas the streaming-enabled endpoint for OAuth-protected tables can be accessed at this URL:

http://query.yahooapis.com/v1/streaming/yql?[query_params]
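
The naming rule (the traditional endpoint with streaming inserted into its URI path) can be expressed as a small Python sketch. The helper name is hypothetical, and the traditional endpoint URLs passed to it are inferred from the two streaming URLs above rather than quoted from this post.

```python
from urllib.parse import urlsplit, urlunsplit


def streaming_endpoint(traditional_url):
    """Insert 'streaming' before the final 'yql' path segment."""
    parts = urlsplit(traditional_url)
    segments = parts.path.split("/")
    if segments[-1] != "yql":
        raise ValueError("expected a path ending in /yql")
    segments.insert(len(segments) - 1, "streaming")
    return urlunsplit(parts._replace(path="/".join(segments)))


# Inferred traditional endpoints for public and OAuth-protected tables.
print(streaming_endpoint("http://query.yahooapis.com/v1/public/yql"))
print(streaming_endpoint("http://query.yahooapis.com/v1/yql"))
```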

The number of concurrent Comet connections has been throttled at the YQL engine: When the maximum number of concurrent Comet connections has been reached,  any requests that would normally have been put into CDP mode are served in the traditional way.

The version of the Comet implementation that CDP builds upon does not support a configurable timeout for Comet connections, with the effect that a Comet connection will remain open for only 20 seconds. This limitation will be lifted in a future YQL release.

Support for round trip lossless JSON processing

Filed under feature

There have been many reports (and complaints) in the past about the YQL engine changing the structure of a JSON response from a downstream webservice as part of its processing before returning JSON output to the client.

This corruption of JSON response content would manifest itself in a number of ways: JSON numbers in the downstream response would be delivered as JSON strings to the client, and single-element JSON arrays converted to JSON objects, among others.

For example, "myint":[5] in the downstream response would be returned as "results":{"json":{"myint":"5"}} to the client.

Recent YQL releases have fixed these issues and now provide support for round trip lossless JSON processing. As we want to give developers sufficient time to take advantage of and adapt to this change, we have not yet enabled this feature by default. To enable it, simply append jsonCompat=new to your YQL query.
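
To make the difference concrete, the Python sketch below mimics the two symptoms described above on the `myint` example. The `legacy_convert` function is a hypothetical imitation of the old lossy output written for this illustration; it is not YQL's implementation and covers only the behaviors named in this post.

```python
import json


def legacy_convert(value):
    """Imitate the lossy behavior: numbers become strings and
    single-element arrays collapse into their only element."""
    if isinstance(value, bool) or value is None:
        return value
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, list):
        converted = [legacy_convert(item) for item in value]
        return converted[0] if len(converted) == 1 else converted
    if isinstance(value, dict):
        return {key: legacy_convert(item) for key, item in value.items()}
    return value


downstream = json.loads('{"myint": [5]}')
print(json.dumps(legacy_convert(downstream)))  # lossy: {"myint": "5"}
print(json.dumps(downstream))                  # lossless: {"myint": [5]}
```

A client that expects an array of numbers would break on the first form, which is why lossless processing matters for round trips.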

Please start experimenting with this new query parameter and give us feedback. Eventually, this feature will be enabled by default, but there will be an announcement before we make the switch.