Getting stock information with YQL and open data tables

One question that the YQL and Pipes teams get asked is: “How can I get stock quotes? There isn’t an API for it on developer.yahoo.com.” Interestingly, while there isn’t a traditional web service API, Yahoo Finance does provide a very nice way to get a lot of well-structured information on a given company. For example, here’s the Yahoo Finance page on YHOO:

You’ll notice that there’s a little “download data” link to the right. If you click the link, it dynamically generates a CSV file with almost all the pricing information on the page. The problem is working out which fields the “f” parameter actually produces in the CSV file. Luckily, someone has already done that hard work. So now we have a link with a bunch of configurable parameters to get lots of lovely stock information for multiple stock symbols. It is an API of sorts, but that CSV file is still hard to work with and somewhat cryptic, and the data in it is a bit messy.

Enter YQL open data tables. If you don’t want to know “how” this works, and just want a really cool open data table and API to give you stock quotes, then give this a go in the YQL console. Here’s a second example of pulling out only a few fields and sorting the quotes by who has the biggest gain.

You’ll see the query:

select * from yahoo.finance.quotes where symbol in ("YHOO","AAPL","GOOG","MSFT")

And the results (trimmed for this post – there’s a lot of data in the results):

<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" yahoo:count="4" yahoo:created="2009-06-01T10:40:52Z" yahoo:lang="en-US" yahoo:updated="2009-06-01T10:40:52Z" yahoo:uri="http://query.yahooapis.com/v1/yql?q=select+*+from+yahoo.finance.quotes+where+symbol+in+%28%22YHOO%22%2C%22AAPL%22%2C%22GOOG%22%2C%22MSFT%22%29">
  <diagnostics>
    <publiclyCallable>true</publiclyCallable>
    <url execution-time="2"><![CDATA[http://datatables.org/alltables.env]]></url>
    <url execution-time="55"><![CDATA[http://www.datatables.org/yahoo/finance/yahoo.finance.quotes.xml]]></url>
    <url execution-time="5"><![CDATA[http://download.finance.yahoo.com/d/quotes.csv?s=YHOO,AAPL,GOOG,MSFT&f=aa2bb2b3b4cc1c3c6c8dd1d2ee1e7e8e9ghjkg1g3g4g5g6ii5j1j3j4j5j6k1k2k4k5ll1l2l3mm2m3m4m5m6m7m8nn4opp1p2p5p6qrr1r2r5r6r7ss1s7t1t7t8vv1v7ww1w4xy]]></url>
    <url execution-time="13"><![CDATA[select * from csv where url=@url and columns='Ask,AverageDailyVolume,Bid,AskRealtime,BidRealtime,BookValue,Change&PercentChange,Change,Commission,ChangeRealtime,AfterHoursChangeRealtime,DividendShare,LastTradeDate,TradeDate,EarningsShare,ErrorIndicationreturnedforsymbolchangedinvalid,EPSEstimateCurrentYear,EPSEstimateNextYear,EPSEstimateNextQuarter,DaysLow,DaysHigh,YearLow,YearHigh,HoldingsGainPercent,AnnualizedGain,HoldingsGain,HoldingsGainPercentRealtime,HoldingsGainRealtime,MoreInfo,OrderBookRealtime,MarketCapitalization,MarketCapRealtime,EBITDA,ChangeFromYearLow,PercentChangeFromYearLow,LastTradeRealtimeWithTime,ChangePercentRealtime,ChangeFromYearHigh,PercebtChangeFromYearHigh,LastTradeWithTime,LastTradePriceOnly,HighLimit,LowLimit,DaysRange,DaysRangeRealtime,FiftydayMovingAverage,TwoHundreddayMovingAverage,ChangeFromTwoHundreddayMovingAverage,PercentChangeFromTwoHundreddayMovingAverage,ChangeFromFiftydayMovingAverage,PercentChangeFromFiftydayMovingAverage,Name,Notes,Open,PreviousClose,PricePaid,ChangeinPercent,PriceSales,PriceBook,ExDividendDate,PERatio,DividendPayDate,PERatioRealtime,PEGRatio,PriceEPSEstimateCurrentYear,PriceEPSEstimateNextYear,Symbol,SharesOwned,ShortRatio,LastTradeTime,TickerTrend,OneyrTargetPrice,Volume,HoldingsValue,HoldingsValueRealtime,YearRange,DaysValueChange,DaysValueChangeRealtime,StockExchange,DividendYield']]></url>
    <javascript instructions-used="279387"/>
    <user-time>313</user-time>
    <service-time>75</service-time>
    <build-version>1678</build-version>
  </diagnostics>
  <results>
    <quote symbol="YHOO">
      <Ask>16.60</Ask>
      <AverageDailyVolume>22083900</AverageDailyVolume>
      <Bid>16.55</Bid>
      <AskRealtime>16.60</AskRealtime>
      <BidRealtime>16.55</BidRealtime>
      <BookValue>8.30</BookValue>
      <Change_PercentChange>+0.74 - +4.67%</Change_PercentChange>
      <Change>+0.74</Change>
      <Commission/>
      <ChangeRealtime>+0.74</ChangeRealtime>
      <AfterHoursChangeRealtime>N/A - N/A</AfterHoursChangeRealtime>
      <DividendShare>0.00</DividendShare>
      <LastTradeDate>6/1/2009</LastTradeDate>
      <TradeDate/>
      <EarningsShare>0.011</EarningsShare>
      <ErrorIndicationreturnedforsymbolchangedinvalid>N/A</ErrorIndicationreturnedforsymbolchangedinvalid>
      <EPSEstimateCurrentYear>0.36</EPSEstimateCurrentYear>
      <EPSEstimateNextYear>0.42</EPSEstimateNextYear>
      <EPSEstimateNextQuarter>0.08</EPSEstimateNextQuarter>
      <DaysLow>16.13</DaysLow>
      <DaysHigh>16.65</DaysHigh>
      <YearLow>8.94</YearLow>
      <YearHigh>27.10</YearHigh>
...
      <MarketCapitalization>23.140B</MarketCapitalization>
      <MarketCapRealtime/>
      <EBITDA>1.278B</EBITDA>
      <ChangeFromYearLow>+7.64</ChangeFromYearLow>
      <PercentChangeFromYearLow>+85.46%</PercentChangeFromYearLow>
      <LastTradeRealtimeWithTime>N/A - &lt;b&gt;16.58&lt;/b&gt;</LastTradeRealtimeWithTime>
      <ChangePercentRealtime>N/A - +4.67%</ChangePercentRealtime>
      <ChangeFromYearHigh>-10.52</ChangeFromYearHigh>
      <PercebtChangeFromYearHigh>-38.82%</PercebtChangeFromYearHigh>
      <LastTradeWithTime>4:00pm - &lt;b&gt;16.58&lt;/b&gt;</LastTradeWithTime>
      <LastTradePriceOnly>16.58</LastTradePriceOnly>
      <HighLimit/>
      <LowLimit/>
      <DaysRange>16.13 - 16.65</DaysRange>
      <DaysRangeRealtime>N/A - N/A</DaysRangeRealtime>
      <FiftydayMovingAverage>14.6126</FiftydayMovingAverage>
      <TwoHundreddayMovingAverage>12.9096</TwoHundreddayMovingAverage>
      <ChangeFromTwoHundreddayMovingAverage>+3.6704</ChangeFromTwoHundreddayMovingAverage>
      <PercentChangeFromTwoHundreddayMovingAverage>+28.43%</PercentChangeFromTwoHundreddayMovingAverage>
      <ChangeFromFiftydayMovingAverage>+1.9674</ChangeFromFiftydayMovingAverage>
      <PercentChangeFromFiftydayMovingAverage>+13.46%</PercentChangeFromFiftydayMovingAverage>
      <Name>Yahoo! Inc.</Name>
      <Notes>-</Notes>
      <Open>16.18</Open>
      <PreviousClose>15.84</PreviousClose>
...
      <Symbol>YHOO</Symbol>
      <SharesOwned/>
      <ShortRatio>2.10</ShortRatio>
      <LastTradeTime>4:00pm</LastTradeTime>
      <TickerTrend>&amp;nbsp;======&amp;nbsp;</TickerTrend>
      <OneyrTargetPrice>15.27</OneyrTargetPrice>
      <Volume>27926064</Volume>
      <HoldingsValue/>
      <HoldingsValueRealtime/>
      <YearRange>8.94 - 27.10</YearRange>
      <DaysValueChange>- - +4.67%</DaysValueChange>
      <DaysValueChangeRealtime>N/A - N/A</DaysValueChangeRealtime>
      <StockExchange>NasdaqNM</StockExchange>
...
    </quote>
    <quote symbol="AAPL">
...
  </results>
</query>
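
The yahoo:uri attribute in the response shows how the statement travels as a URL-encoded q parameter. As a minimal sketch (in Python rather than the console, and without making any network call), here is how that URL could be rebuilt. The helper name build_yql_url is ours and purely illustrative; the endpoint is copied from the response above.

```python
from urllib.parse import urlencode

# Endpoint copied from the yahoo:uri attribute in the response above.
YQL_ENDPOINT = "http://query.yahooapis.com/v1/yql"


def build_yql_url(statement):
    """Return the endpoint URL with the YQL statement as the q parameter.

    Hypothetical helper for illustration; it only builds the string.
    """
    # safe="*" leaves the asterisk unescaped, matching the yahoo:uri value.
    return YQL_ENDPOINT + "?" + urlencode({"q": statement}, safe="*")


statement = (
    'select * from yahoo.finance.quotes '
    'where symbol in ("YHOO","AAPL","GOOG","MSFT")'
)
url = build_yql_url(statement)
```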

So how did we go from that ugly-looking CSV file to the lovely XML? The answer is the yahoo.finance.quotes open data table:

<?xml version="1.0" encoding="UTF-8" ?>
<table xmlns="http://query.yahooapis.com/v1/schema/table.xsd">
  <meta>
    <sampleQuery>
      select * from {table} where symbol in ("YHOO","AAPL","GOOG","MSFT")
    </sampleQuery>
  </meta>
  <bindings>
    <select itemPath="quotes.quote" produces="XML">
      <urls><url>http://download.finance.yahoo.com/d/quotes.csv?s={-listjoin|,|symbol}</url></urls>
      <inputs>
        <key id='f' type='xs:string' const='true' default='aa2bb2b3b4cc1c3c6c8dd1d2ee1e7e8e9ghjkg1g3g4g5g6ii5j1j3j4j5j6k1k2k4k5ll1l2l3mm2m3m4m5m6m7m8nn4opp1p2p5p6qrr1r2r5r6r7ss1s7t1t7t8vv1v7ww1w4xy' paramType='query' />
        <key id='symbol' type='xs:string' batchable='true' maxBatchItems='20' paramType='path' required='true'/>
      </inputs>
      <execute><![CDATA[
        var results = y.query("select * from csv where url=@url and columns='Ask,AverageDailyVolume,Bid,AskRealtime,BidRealtime,BookValue,Change&PercentChange,Change,Commission,ChangeRealtime,AfterHoursChangeRealtime,DividendShare,LastTradeDate,TradeDate,EarningsShare,ErrorIndicationreturnedforsymbolchangedinvalid,EPSEstimateCurrentYear,EPSEstimateNextYear,EPSEstimateNextQuarter,DaysLow,DaysHigh,YearLow,YearHigh,HoldingsGainPercent,AnnualizedGain,HoldingsGain,HoldingsGainPercentRealtime,HoldingsGainRealtime,MoreInfo,OrderBookRealtime,MarketCapitalization,MarketCapRealtime,EBITDA,ChangeFromYearLow,PercentChangeFromYearLow,LastTradeRealtimeWithTime,ChangePercentRealtime,ChangeFromYearHigh,PercebtChangeFromYearHigh,LastTradeWithTime,LastTradePriceOnly,HighLimit,LowLimit,DaysRange,DaysRangeRealtime,FiftydayMovingAverage,TwoHundreddayMovingAverage,ChangeFromTwoHundreddayMovingAverage,PercentChangeFromTwoHundreddayMovingAverage,ChangeFromFiftydayMovingAverage,PercentChangeFromFiftydayMovingAverage,Name,Notes,Open,PreviousClose,PricePaid,ChangeinPercent,PriceSales,PriceBook,ExDividendDate,PERatio,DividendPayDate,PERatioRealtime,PEGRatio,PriceEPSEstimateCurrentYear,PriceEPSEstimateNextYear,Symbol,SharesOwned,ShortRatio,LastTradeTime,TickerTrend,OneyrTargetPrice,Volume,HoldingsValue,HoldingsValueRealtime,YearRange,DaysValueChange,DaysValueChangeRealtime,StockExchange,DividendYield'",{url:request.url});
        var quotes = <quotes/>;
        var rows=results.results.row;
        for each (var row in rows) {
          for each (var item in row.*) {
            var elname = item.localName();
            var txt = item.text().toString();
            if (txt=="N/A") txt=""; else if (txt=="-") txt=""; else {
              txt = txt.replace(/"/g, '');
            }
            row[elname]=txt;
          }
          quotes.quote += <quote symbol={row.Symbol.text().toString()}>{row.*}</quote>;
        }
        response.object = quotes;
           ]]></execute>
    </select>
  </bindings>
</table>

Let’s step through the main parts of the open data table definition. First, the URL that YQL builds to get the data looks like this:

<url>http://download.finance.yahoo.com/d/quotes.csv?s={-listjoin|,|symbol}</url>

The listjoin is a URI template instruction that joins all the values supplied for “symbol” into a single comma-separated list. The symbol key sits alongside the f key in the inputs section:

<key id='f' type='xs:string' const='true' default='aa2bb2b3b4cc1c3c6c8dd1d2ee1e7e8e9ghjkg1g3g4g5g6ii5j1j3j4j5j6k1k2k4k5ll1l2l3mm2m3m4m5m6m7m8nn4opp1p2p5p6qrr1r2r5r6r7ss1s7t1t7t8vv1v7ww1w4xy' paramType='query' />
<key id='symbol' type='xs:string' batchable='true' maxBatchItems='20' paramType='path' required='true'/>

Note the batchable attribute of symbol. This tells YQL that this parameter can accept a set of values that can be sent to the remote data provider in a single request. In this case, the finance CSV API can take a comma-separated list of stock symbols and return the full set of fields for each one.
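
To make the batching idea concrete, here is a rough Python sketch of what the template expansion amounts to. It assumes that maxBatchItems caps the number of symbols per request and that a longer list is split into several requests; that is our reading of the attribute, not something taken from YQL's source. The function names are hypothetical, and the constant f parameter is left off the URL for brevity.

```python
BASE_URL = "http://download.finance.yahoo.com/d/quotes.csv"


def listjoin(values, separator=","):
    """Mimic {-listjoin|,|symbol}: join the values with the separator."""
    return separator.join(values)


def batches(values, max_batch_items=20):
    """Yield consecutive slices of at most max_batch_items values."""
    for start in range(0, len(values), max_batch_items):
        yield values[start:start + max_batch_items]


def quote_urls(symbols, max_batch_items=20):
    """Build one CSV URL per batch of symbols (assumed batching behaviour)."""
    return [
        BASE_URL + "?s=" + listjoin(batch)
        for batch in batches(symbols, max_batch_items)
    ]


urls = quote_urls(["YHOO", "AAPL", "GOOG", "MSFT"])
```

With four symbols this yields a single URL ending in ?s=YHOO,AAPL,GOOG,MSFT, which matches the s parameter visible in the diagnostics section of the results.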

That cryptic looking f input key is a constant – we’re going to get all the fields we can every time, and this value holds all the short field names that the API understands.

The YQL execute section actually dispatches the request and processes the return data.

var results = y.query("select * from csv where url=@url and columns='Ask,AverageDailyVolume,Bid,AskRealtime,BidRealtime,BookValue,Change&PercentChange,Change,Commission,ChangeRealtime,AfterHoursChangeRealtime,DividendShare,LastTradeDate,TradeDate,EarningsShare,ErrorIndicationreturnedforsymbolchangedinvalid,EPSEstimateCurrentYear,EPSEstimateNextYear,EPSEstimateNextQuarter,DaysLow,DaysHigh,YearLow,YearHigh,HoldingsGainPercent,AnnualizedGain,HoldingsGain,HoldingsGainPercentRealtime,HoldingsGainRealtime,MoreInfo,OrderBookRealtime,MarketCapitalization,MarketCapRealtime,EBITDA,ChangeFromYearLow,PercentChangeFromYearLow,LastTradeRealtimeWithTime,ChangePercentRealtime,ChangeFromYearHigh,PercebtChangeFromYearHigh,LastTradeWithTime,LastTradePriceOnly,HighLimit,LowLimit,DaysRange,DaysRangeRealtime,FiftydayMovingAverage,TwoHundreddayMovingAverage,ChangeFromTwoHundreddayMovingAverage,PercentChangeFromTwoHundreddayMovingAverage,ChangeFromFiftydayMovingAverage,PercentChangeFromFiftydayMovingAverage,Name,Notes,Open,PreviousClose,PricePaid,ChangeinPercent,PriceSales,PriceBook,ExDividendDate,PERatio,DividendPayDate,PERatioRealtime,PEGRatio,PriceEPSEstimateCurrentYear,PriceEPSEstimateNextYear,Symbol,SharesOwned,ShortRatio,LastTradeTime,TickerTrend,OneyrTargetPrice,Volume,HoldingsValue,HoldingsValueRealtime,YearRange,DaysValueChange,DaysValueChangeRealtime,StockExchange,DividendYield'",{url:request.url});

This runs another YQL select statement when the table gets invoked. It fetches a CSV data source from a URL and sets up the column names for each “row” that comes back. The URL that YQL would originally have fetched for this data is already available in the request object, with the list of symbols expanded into the s parameter, so we just use the @ substitution syntax to add it into the YQL statement.
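
If it helps to picture what the csv table does with the columns list, here is a small Python sketch of the same idea: each CSV record is paired positionally with the supplied column names. Only the first three columns are used, and the sample line is reconstructed by us from the YHOO values in the XML above; it is not captured CSV output.

```python
import csv
import io


def rows_from_csv(text, columns):
    """Pair each CSV record with the given column names, in order."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(columns, record)) for record in reader]


# First three names from the columns list in the y.query call.
columns = ["Ask", "AverageDailyVolume", "Bid"]

# Illustrative line rebuilt from the YHOO result shown earlier.
sample = "16.60,22083900,16.55\n"

rows = rows_from_csv(sample, columns)
```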

var quotes = <quotes/>;
var rows=results.results.row;

The next few lines create a new XML object called quotes, which will hold our final XML document, and get an XML list of the rows that came back from the CSV query.

for each (var row in rows) {

Now we’ll loop over each of those rows (one row per stock symbol).

  for each (var item in row.*) {
    var elname = item.localName();
    var txt = item.text().toString();
    if (txt=="N/A") txt=""; else if (txt=="-") txt=""; else {
      txt = txt.replace(/"/g, '');
    }
    row[elname]=txt;
  }

For each element in that row (note the E4X syntax row.* to get all the child elements) we’re going to clean up the XML somewhat. We’ll replace “N/A” and “-” text values with empty elements, and strip any double quotes from the remaining text.
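
The same cleanup rule is easy to express outside E4X. This Python sketch mirrors the logic of the loop on a plain dictionary; the names clean and clean_row are ours. Note that only values that are exactly “N/A” or “-” are blanked, so a compound value such as “N/A - N/A” passes through untouched.

```python
def clean(text):
    """Blank out exact "N/A" and "-" values; strip double quotes otherwise."""
    if text in ("N/A", "-"):
        return ""
    return text.replace('"', "")


def clean_row(row):
    """Apply clean() to every value in a row dictionary."""
    return {name: clean(value) for name, value in row.items()}


cleaned = clean_row({
    "Name": '"Yahoo! Inc."',
    "Notes": "-",
    "DaysRangeRealtime": "N/A - N/A",
})
```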

  quotes.quote += <quote symbol={row.Symbol.text().toString()}>{row.*}</quote>;
}

Finally, in the main loop, we’ll append a new quote element to our root quotes element. It contains the reformatted XML elements and carries an attribute called symbol.

response.object = quotes;

Last of all we set the response object to the document we’ve just created in the loop.
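
For readers who don’t have E4X to hand, here is a rough equivalent of the document-building step using Python’s standard ElementTree module. It is a sketch only: it assumes every dictionary key is already a valid XML element name, and the build_quotes name and the three-field sample row are ours.

```python
import xml.etree.ElementTree as ET


def build_quotes(rows):
    """Wrap each row in a <quote symbol="..."> element under <quotes>."""
    quotes = ET.Element("quotes")
    for row in rows:
        # The symbol attribute duplicates the Symbol child, as in the table.
        quote = ET.SubElement(quotes, "quote", symbol=row["Symbol"])
        for name, value in row.items():
            ET.SubElement(quote, name).text = value
    return quotes


# Illustrative row using values from the YHOO result shown earlier.
rows = [{"Symbol": "YHOO", "Ask": "16.60", "Commission": ""}]
root = build_quotes(rows)
xml_text = ET.tostring(root, encoding="unicode")
```

An empty string value serializes as an empty element, which is the same shape as the <Commission/> entry in the results.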

We’ve already added the table to the GitHub open data table repository, so you can try this table out in the YQL console just by including the community tables. And now that it’s in YQL, you can sort, filter, project and join on any of the data that comes back! For example, you can pull out only a few fields and sort the quotes by who has the biggest gain.