Getting stock information with YQL and open data tables

Filed under feature, tutorial

One question that the YQL and Pipes teams get asked is “how can I get stock quotes? There isn’t an API for it on developer.yahoo.com”. Interestingly, while there isn’t a more traditional web service API, Yahoo finance does provide a very nice way to get a lot of well structured information on a given company. For example, here’s the Yahoo finance page on YHOO:

You’ll notice that there’s a little “download data” link to the right. If you click the link, it generates a CSV file dynamically with almost all the pricing information on the page. The problem is how to understand what fields the “f” parameter actually produce in the CSV file. Luckily someone has already done that hard work. So now we have a link with a bunch of configurable parameters to get lots of lovely stock information for multiple stock symbols. It is an API of sorts, but that CSV file is still a hard to work with, somewhat cryptic to use, and the data in it is a bit messy.

Enter YQL open data tables. If you don’t want to know “how” this works, and just want a really cool open data table and API to give you stock quotes, then give this a go in the YQL console. Here’s a second example of pulling out only a few fields and sorting the quotes by who has the biggest gain.

You’ll see the query:

select * from yahoo.finance.quotes where symbol in ("YHOO","AAPL","GOOG","MSFT")

And the results (trimmed for this post – there’s a lot of data in the results):

<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="4" yahoo:created="2009-06-01T10:40:52Z" yahoo:lang="en-US" yahoo:updated="2009-06-01T10:40:52Z" yahoo:uri="http://query.yahooapis.com/v1/yql?q=select+*+from+yahoo.finance.quotes+where+symbol+in+%28%22YHOO%22%2C%22AAPL%22%2C%22GOOG%22%2C%22MSFT%22%29">
  <diagnostics>
    <publiclyCallable>true</publiclyCallable>
    <url execution-time="2"><![CDATA[http://datatables.org/alltables.env]]></url>
    <url execution-time="55"><![CDATA[http://www.datatables.org/yahoo/finance/yahoo.finance.quotes.xml]]></url>
    <url execution-time="5"><![CDATA[http://download.finance.yahoo.com/d/quotes.csv?s=YHOO,AAPL,GOOG,MSFT&f=aa2bb2b3b4cc1c3c6c8dd1d2ee1e7e8e9ghjkg1g3g4g5g6ii5j1j3j4j5j6k1k2k4k5ll1l2l3mm2m3m4m5m6m7m8nn4opp1p2p5p6qrr1r2r5r6r7ss1s7t1t7t8vv1v7ww1w4xy]]></url>
    <url execution-time="13"><![CDATA[select * from csv where url=@url and columns='Ask,AverageDailyVolume,Bid,AskRealtime,BidRealtime,BookValue,Change&PercentChange,Change,Commission,ChangeRealtime,AfterHoursChangeRealtime,DividendShare,LastTradeDate,TradeDate,EarningsShare,ErrorIndicationreturnedforsymbolchangedinvalid,EPSEstimateCurrentYear,EPSEstimateNextYear,EPSEstimateNextQuarter,DaysLow,DaysHigh,YearLow,YearHigh,HoldingsGainPercent,AnnualizedGain,HoldingsGain,HoldingsGainPercentRealtime,HoldingsGainRealtime,MoreInfo,OrderBookRealtime,MarketCapitalization,MarketCapRealtime,EBITDA,ChangeFromYearLow,PercentChangeFromYearLow,LastTradeRealtimeWithTime,ChangePercentRealtime,ChangeFromYearHigh,PercebtChangeFromYearHigh,LastTradeWithTime,LastTradePriceOnly,HighLimit,LowLimit,DaysRange,DaysRangeRealtime,FiftydayMovingAverage,TwoHundreddayMovingAverage,ChangeFromTwoHundreddayMovingAverage,PercentChangeFromTwoHundreddayMovingAverage,ChangeFromFiftydayMovingAverage,PercentChangeFromFiftydayMovingAverage,Name,Notes,Open,PreviousClose,PricePaid,ChangeinPercent,PriceSales,PriceBook,ExDividendDate,PERatio,DividendPayDate,PERatioRealtime,PEGRatio,PriceEPSEstimateCurrentYear,PriceEPSEstimateNextYear,Symbol,SharesOwned,ShortRatio,LastTradeTime,TickerTrend,OneyrTargetPrice,Volume,HoldingsValue,HoldingsValueRealtime,YearRange,DaysValueChange,DaysValueChangeRealtime,StockExchange,DividendYield']]></url>
    <javascript instructions-used="279387"/>
    <user-time>313</user-time>
    <service-time>75</service-time>
    <build-version>1678</build-version>
  </diagnostics>
  <results>
    <quote symbol="YHOO">
      <Ask>16.60</Ask>
      <AverageDailyVolume>22083900</AverageDailyVolume>
      <Bid>16.55</Bid>
      <AskRealtime>16.60</AskRealtime>
      <BidRealtime>16.55</BidRealtime>
      <BookValue>8.30</BookValue>
      <Change_PercentChange>+0.74 - +4.67%</Change_PercentChange>
      <Change>+0.74</Change>
      <Commission/>
      <ChangeRealtime>+0.74</ChangeRealtime>
      <AfterHoursChangeRealtime>N/A - N/A</AfterHoursChangeRealtime>
      <DividendShare>0.00</DividendShare>
      <LastTradeDate>6/1/2009</LastTradeDate>
      <TradeDate/>
      <EarningsShare>0.011</EarningsShare>
      <ErrorIndicationreturnedforsymbolchangedinvalid>N/A</ErrorIndicationreturnedforsymbolchangedinvalid>
      <EPSEstimateCurrentYear>0.36</EPSEstimateCurrentYear>
      <EPSEstimateNextYear>0.42</EPSEstimateNextYear>
      <EPSEstimateNextQuarter>0.08</EPSEstimateNextQuarter>
      <DaysLow>16.13</DaysLow>
      <DaysHigh>16.65</DaysHigh>
      <YearLow>8.94</YearLow>
      <YearHigh>27.10</YearHigh>
...
      <MarketCapitalization>23.140B</MarketCapitalization>
      <MarketCapRealtime/>
      <EBITDA>1.278B</EBITDA>
      <ChangeFromYearLow>+7.64</ChangeFromYearLow>
      <PercentChangeFromYearLow>+85.46%</PercentChangeFromYearLow>
      <LastTradeRealtimeWithTime>N/A - &lt;b&gt;16.58&lt;/b&gt;</LastTradeRealtimeWithTime>
      <ChangePercentRealtime>N/A - +4.67%</ChangePercentRealtime>
      <ChangeFromYearHigh>-10.52</ChangeFromYearHigh>
      <PercebtChangeFromYearHigh>-38.82%</PercebtChangeFromYearHigh>
      <LastTradeWithTime>4:00pm - &lt;b&gt;16.58&lt;/b&gt;</LastTradeWithTime>
      <LastTradePriceOnly>16.58</LastTradePriceOnly>
      <HighLimit/>
      <LowLimit/>
      <DaysRange>16.13 - 16.65</DaysRange>
      <DaysRangeRealtime>N/A - N/A</DaysRangeRealtime>
      <FiftydayMovingAverage>14.6126</FiftydayMovingAverage>
      <TwoHundreddayMovingAverage>12.9096</TwoHundreddayMovingAverage>
      <ChangeFromTwoHundreddayMovingAverage>+3.6704</ChangeFromTwoHundreddayMovingAverage>
      <PercentChangeFromTwoHundreddayMovingAverage>+28.43%</PercentChangeFromTwoHundreddayMovingAverage>
      <ChangeFromFiftydayMovingAverage>+1.9674</ChangeFromFiftydayMovingAverage>
      <PercentChangeFromFiftydayMovingAverage>+13.46%</PercentChangeFromFiftydayMovingAverage>
      <Name>Yahoo! Inc.</Name>
      <Notes>-</Notes>
      <Open>16.18</Open>
      <PreviousClose>15.84</PreviousClose>
...
      <Symbol>YHOO</Symbol>
      <SharesOwned/>
      <ShortRatio>2.10</ShortRatio>
      <LastTradeTime>4:00pm</LastTradeTime>
      <TickerTrend>&amp;nbsp;======&amp;nbsp;</TickerTrend>
      <OneyrTargetPrice>15.27</OneyrTargetPrice>
      <Volume>27926064</Volume>
      <HoldingsValue/>
      <HoldingsValueRealtime/>
      <YearRange>8.94 - 27.10</YearRange>
      <DaysValueChange>- - +4.67%</DaysValueChange>
      <DaysValueChangeRealtime>N/A - N/A</DaysValueChangeRealtime>
      <StockExchange>NasdaqNM</StockExchange>
...
    </quote>
    <quote symbol="AAPL">
...
  </results>
</query>

So how did we go from that ugly looking CSV file to the lovely XML? The answer is the yahoo.finance.quotes open data table:

<?xml version="1.0" encoding="UTF-8" ?>
<table xmlns="http://query.yahooapis.com/v1/schema/table.xsd">
  <meta>
    <sampleQuery>
      select * from {table} where symbol in ("YHOO","AAPL","GOOG","MSFT")
    </sampleQuery>
  </meta>
  <bindings>
    <select itemPath="quotes.quote" produces="XML">
      <urls><url>http://download.finance.yahoo.com/d/quotes.csv?s={-listjoin|,|symbol}</url></urls>
      <inputs>
        <key id='f' type='xs:string' const='true' default='aa2bb2b3b4cc1c3c6c8dd1d2ee1e7e8e9ghjkg1g3g4g5g6ii5j1j3j4j5j6k1k2k4k5ll1l2l3mm2m3m4m5m6m7m8nn4opp1p2p5p6qrr1r2r5r6r7ss1s7t1t7t8vv1v7ww1w4xy' paramType='query' />
        <key id='symbol' type='xs:string' batchable='true' maxBatchItems='20' paramType='path' required='true'/>
      </inputs>
      <execute><![CDATA[
        var results = y.query("select * from csv where url=@url and columns='Ask,AverageDailyVolume,Bid,AskRealtime,BidRealtime,BookValue,Change&PercentChange,Change,Commission,ChangeRealtime,AfterHoursChangeRealtime,DividendShare,LastTradeDate,TradeDate,EarningsShare,ErrorIndicationreturnedforsymbolchangedinvalid,EPSEstimateCurrentYear,EPSEstimateNextYear,EPSEstimateNextQuarter,DaysLow,DaysHigh,YearLow,YearHigh,HoldingsGainPercent,AnnualizedGain,HoldingsGain,HoldingsGainPercentRealtime,HoldingsGainRealtime,MoreInfo,OrderBookRealtime,MarketCapitalization,MarketCapRealtime,EBITDA,ChangeFromYearLow,PercentChangeFromYearLow,LastTradeRealtimeWithTime,ChangePercentRealtime,ChangeFromYearHigh,PercebtChangeFromYearHigh,LastTradeWithTime,LastTradePriceOnly,HighLimit,LowLimit,DaysRange,DaysRangeRealtime,FiftydayMovingAverage,TwoHundreddayMovingAverage,ChangeFromTwoHundreddayMovingAverage,PercentChangeFromTwoHundreddayMovingAverage,ChangeFromFiftydayMovingAverage,PercentChangeFromFiftydayMovingAverage,Name,Notes,Open,PreviousClose,PricePaid,ChangeinPercent,PriceSales,PriceBook,ExDividendDate,PERatio,DividendPayDate,PERatioRealtime,PEGRatio,PriceEPSEstimateCurrentYear,PriceEPSEstimateNextYear,Symbol,SharesOwned,ShortRatio,LastTradeTime,TickerTrend,OneyrTargetPrice,Volume,HoldingsValue,HoldingsValueRealtime,YearRange,DaysValueChange,DaysValueChangeRealtime,StockExchange,DividendYield'",{url:request.url});
        var quotes = <quotes/>;
        var rows=results.results.row;
        for each (var row in rows) {
          for each (var item in row.*) {
            var elname = item.localName();
            var txt = item.text().toString();
            if (txt=="N/A") txt=""; else if (txt=="-") txt=""; else {
              txt = txt.replace(/"/g, '');
            }
            row[elname]=txt;
          }
            quotes.quote += <quote symbol={row.Symbol.text().toString()}>{row.*}</quote>;
        }
        response.object = quotes;
           ]]></execute>
    </select>
  </bindings>
</table>

Let’s step through the main parts of the open data table definition. First the URL that YQL builds to get the data looks like this:

<url>http://download.finance.yahoo.com/d/quotes.csv?s={-listjoin|,|symbol}</url>

The listjoin is a URI template instruction that creates a “comma” separated list of values for each item in the “symbol” value. Symbol itself looks like this in the inputs section:

<key id='f' type='xs:string' const='true' default='aa2bb2b3b4cc1c3c6c8dd1d2ee1e7e8e9ghjkg1g3g4g5g6ii5j1j3j4j5j6k1k2k4k5ll1l2l3mm2m3m4m5m6m7m8nn4opp1p2p5p6qrr1r2r5r6r7ss1s7t1t7t8vv1v7ww1w4xy' paramType='query' />
<key id='symbol' type='xs:string' batchable='true' maxBatchItems='20' paramType='path' required='true'/>

Note the batchable attribute of symbol. This tells YQL that this parameter can accept a set of values that can be sent to the remote data provider in a single request. In this case, the finance CSV API can take a comma separated list of stock symbols and return all that information for each entry.

That cryptic looking f input key is a constant – we’re going to get all the fields we can every time, and this value holds all the short field names that the API understands.

The YQL execute section actually dispatches the request and processes the return data.

var results = y.query("select * from csv where url=@url and columns='Ask,AverageDailyVolume,Bid,AskRealtime,BidRealtime,BookValue,Change&PercentChange,Change,Commission,ChangeRealtime,AfterHoursChangeRealtime,DividendShare,LastTradeDate,TradeDate,EarningsShare,ErrorIndicationreturnedforsymbolchangedinvalid,EPSEstimateCurrentYear,EPSEstimateNextYear,EPSEstimateNextQuarter,DaysLow,DaysHigh,YearLow,YearHigh,HoldingsGainPercent,AnnualizedGain,HoldingsGain,HoldingsGainPercentRealtime,HoldingsGainRealtime,MoreInfo,OrderBookRealtime,MarketCapitalization,MarketCapRealtime,EBITDA,ChangeFromYearLow,PercentChangeFromYearLow,LastTradeRealtimeWithTime,ChangePercentRealtime,ChangeFromYearHigh,PercebtChangeFromYearHigh,LastTradeWithTime,LastTradePriceOnly,HighLimit,LowLimit,DaysRange,DaysRangeRealtime,FiftydayMovingAverage,TwoHundreddayMovingAverage,ChangeFromTwoHundreddayMovingAverage,PercentChangeFromTwoHundreddayMovingAverage,ChangeFromFiftydayMovingAverage,PercentChangeFromFiftydayMovingAverage,Name,Notes,Open,PreviousClose,PricePaid,ChangeinPercent,PriceSales,PriceBook,ExDividendDate,PERatio,DividendPayDate,PERatioRealtime,PEGRatio,PriceEPSEstimateCurrentYear,PriceEPSEstimateNextYear,Symbol,SharesOwned,ShortRatio,LastTradeTime,TickerTrend,OneyrTargetPrice,Volume,HoldingsValue,HoldingsValueRealtime,YearRange,DaysValueChange,DaysValueChangeRealtime,StockExchange,DividendYield'",{url:request.url});

This runs another YQL select statement when the table gets invoked. It’s fetching a CSV data source from a URL and setting up the column names for each “row” that comes back. The URL that YQL would originally have fetched for this data is already created in the request object, containing the list of symbols expanded into the s parameter, so we just used the @ substitution syntax to add that into the YQL statement.

var quotes = <quotes/>;
var rows=results.results.row;

The next few lines create a new XML object called quotes which will hold our final XML document and gets an XML list to each of the rows that came back from the CSV query.

for each (var row in rows) {

Now we’ll loop over each of those rows (one row per stock symbol).

   for each (var item in row.*) {
    var elname = item.localName();
    var txt = item.text().toString();
    if (txt=="N/A") txt=""; else if (txt=="-") txt=""; else {
      txt = txt.replace(/"/g, '');
    }
    row[elname]=txt;
  }

For each element in that row (note the E4X syntax to get all the elements row.*) we’re going to clean up the XML somewhat. We’ll get rid of “N/A” and “-” text elements and use empty elements instead, as well as remove any quotes in the text.

   quotes.quote += <quote symbol={row.Symbol.text().toString()}>{row.*}</quote>;
}

Finally in the main loop we’ll append a new quote element to our root XML quotes element that contains the reformatted XML elements, and an attribute called symbol

response.object = quotes;

Last of all we set the response object to the document we’ve just created in the loop.

We’ve already added the table to the github open data table repository, so you can try this table out in the YQL console just by including the community tables. And now its in YQL, you can sort, filter, project and join on any of this data that comes back! For example, you can pull out only a few fields and sort the quotes by who has the biggest gain.

Changelog for build 1678

Filed under changelog

Bug fixes, including

  • URI Template Handler now supports multiple -neg variables.
  • Escaped values in literals such as ‘\\d’ are now correctly passed to YQL execute as ‘\d’
  • Fixed uritemplate table example
  • Better integration of Local Search: Added search parameters to local.seach namely listing_id, omit_category, minimum_rating, route and sort

Enhancements

  • YQL Execute: y.rest(uri) now parses HTML documents and returns an E4X object if the accept header is set to “text/html”.
    
    y.rest("http://finance.yahoo.com/q?s=yhoo").accept('text/html').get().response
    
  • YQL now supports the ‘Access-Control-Allow-Origin’ header for new Cross Domain XHR requests.
  • Introduced new URL based paging model to support services which return a link to the next page.

Do it your way with execute and open data tables

Filed under feature

Today we’ve released a significant new feature for YQL: <execute>.

Previously our open data tables enabled any developer to bring the power of YQL and the table model available to new APIs or data sources that we hadn’t covered. Many of these are hosted on github, and you can read (and use) about them on datatables.org. These give you access to friendfeed, google reader, guardian, lastfm, nyt, twitter, weather.com, whitepages, yelp, zillow to name just a few.

YQL <execute> extends the open data table capabilities through server-side transformation and processing using javascript. This dramatically increases the reach and power of open data tables, for example:

  • join data across services
    e.g. grab New York Times article tags and incorporate associated flickr photos
  • combine multiple searches into a single result
    twitter, web, news and image…
  • augment data
    e.g. Add city/state information to APIs that just return zip code
  • create an API from a web page
    e.g. celebrity birthdays scraped from imdb
  • access APIs that require authentication
    e.g. Netflix OAuth, FlickrAuth, Google AuthSub …
  • data transformation
    e.g. convert the result from xml to Google’s kml format
  • move business logic for your application’s data to the cloud
  • leverage the work of others to make access easier by sharing tables

Read more about this on our YDN blog post, including links to several examples to wet your appetite.

Changelog for build 1432

Filed under changelog

Bug fixes, including

  • JSON and JSONP mimetypes now different/fixed.
  • fixed robots.txt evaluation bug
  • “NaN” value sorting bug fixed

Open Data Table schema changes

  • private open data table keys don’t have to be “const” and key name (not value) is displayed in “desc”
  • new “https=true” attribute on open data tables
  • new “variable” paramType for open data table input keys
  • new “execute” element: server-side javascript support

Core table changes

  • flickr.photos.search now has sort, is_commons and geo_context  parameters
  • search.web now has lang, region, site and abstractkey keys

New core tables

  • 16 new music APIs added
    • music.artist.id
    • music.artist.popular
    • music.artist.search
    • music.artist.similar
    • music.release.artist
    • music.release.id
    • music.release.popular
    • music.release.search
    • music.track.id
    • music.track.popular
    • music.track.search
    • music.video.category
    • music.video.id
    • music.video.popular
    • music.video.search
    • music.video.similar
  • term extractor
  • search.siteexplorer.pages
  • search.siteexplorer.inlinks
  • search.suggest
  • search.spelling
  • uritemplate (for creating URLs)

Misc

  • sub-selects now support “content” as projection field selector
  • console and web service now accessible over HTTPS

YQL environment files

Filed under feature, news

We’ve introduced a new env query parameter to both the YQL console and the web service. The env parameter expects a URL value that points to a plain text file anywhere on the web (including subversion repositories, github etc). The file should contain a list of use statements that specify all the open data tables that you’d like to be available to your YQL queries for that request. For example, if I want all of the New York Times open data tables hosted at javarants I would create a file like this:

use 'http://www.javarants.com/nyt/nyt.article.search.xml';
use 'http://www.javarants.com/nyt/nyt.bestsellers.history.xml';
use 'http://www.javarants.com/nyt/nyt.bestsellers.search.xml';
use 'http://www.javarants.com/nyt/nyt.bestsellers.xml';
use 'http://www.javarants.com/nyt/nyt.movies.critics.xml';
use 'http://www.javarants.com/nyt/nyt.movies.picks.xml';
use 'http://www.javarants.com/nyt/nyt.movies.reviews.xml';

…and serve it from somewhere on the web. If you want to try this new feature on some useful open data tables in the console, you can use a few of the ones currently in the github open data tables project like this:

http://developer.yahoo.com/yql/console/?env=http://github.com/spullara/yql-tables/raw/ef685688d649a7514ebd27722366b2918d966573/alltables.env

Try it in the console!. If you look to the “table” list on the right of the console, you’ll see an extra 9 APIs and 30 tables to explore!

Changelog for build 911

Filed under changelog
  • Console and web service now support an “env” query parameter. This is a url that points to an environment/shell file that YQL will load. Currently this enables multiple tables to be used at once.
  • “Use” table no longer requires the “as” clause. When “as” is absent it will default to the final path segment filename minus the extension. e.g. use 'http://somewhere/service.api.xml; select * from service.api;
  • Show tables now shows where the open data definition was defined
  • Desc table format tweaked
  • Open data table paging fixes, including better support for none paging but flexible result sized services and fixed page models. Moved “matrix” to the paging parent element.
  • Offset bug fixed in table(offset,number) processing

Changelog for build 822

Filed under changelog
  • NOT LIKE added
  • MATCHES regex matching added
  • New USE verb for open data table importing
  • Support for open data table definitions

Open data tables added to YQL

Filed under news

We’re pleased to announce that our last update now enables developers can create, use, and share table definitions over for any public API on the web, in addition to the Yahoo! APIs we support today.

For example, one of the requests we had when using our weather table was how to get an ID that would work for an international location. We pointed people to the weather.com API, but that needed a URL to be formatted up in a specific way. Now you can make that a table and use it:

use 'http://www.javarants.com/weather/weather.search.xml' as ws;
select * from weather.forecast where location in (select id from ws(1) where query='bonn, germany')

Try it.

While we believe this is a great start, it’s work in progress. We’re already adding more capabilities and adjusting the open data table XML syntax as we get feedback and go forward. You can read more about how to go about creating open data tables in our documentation.

Changelog for 2009.01.12.16:11

Filed under changelog

New tables

  • upcoming.user
  • microformats
  • flickr.photos.sizes

Table updates

  • local.search “location” key is now optional
  • mybloglog user by service
  • fixed typo in mybloglog.members.find
  • “charset” is now an optional key in html table, to enable developers to force a particular charset to be used when parsing the web page

Bug fixes, including:

  • more html and xhtml and i18n issues fixed for html table
  • subselect fix for optional keys

Other:

  • publiclyCallable element in diagnostics is set to true if the query can work across the public entrypoint
  • show tables response format changed

Note: future releases will more strictly apply robots.txt to html table fetches