Three Techniques for Visualizing Data From Google BigQuery

Guest Author, August 29th, 2012

This guest post comes from Ryan Boyd, developer advocate for BigQuery, helping businesses and developers get value out of their big data using Google’s cloud data services (such as BigQuery). He has been at Google since 2006.

Google BigQuery is a REST-based API for SQL-like analysis of billions of rows of data in just a few seconds.  Customers are using it to analyze advertising campaigns, web server logs, inventory availability and more.   While BigQuery makes it easy for customers to gain insights into massive datasets, it can be challenging to make sense of the results without the power of visualization.  The Google BigQuery browser tool doesn’t have visualization built in, but the API enables you to integrate with other tools in just a few dozen lines of code.

Let’s explore the US Birth Statistics dataset, which Google has made available as sample data in BigQuery.  It’s only 137 million records, but provides insights into things like baby weights, age of mother and more for children born between 1969 and 2008.  I’ve also mashed up this data with 2008 election results so we can look at some of the data in light of red state/blue state status.

Using BigQuery with Google Apps Script and Google Spreadsheets

Google Spreadsheets and the included charting capabilities is a really powerful way to visualize data from BigQuery, using a tool that even non-developers understand.  In this case, I wanted to understand whether there was a visible difference between the age of mothers when bearing their first child in red states (Republican-leaning) and blue states (Democrat-leaning).

In about 50 lines of code in Google Apps Script (server-side JavaScript), I’m able to send an authorized query off to BigQuery and pull the results into the spreadsheet above.  I then created a pivot table and a graph based on that data.  If I load more data (maybe another year) and want to query again, I can simply hit the ‘Run Query, Run!’ button above and the sheet, pivot table and graph automatically update.

Here’s the operative piece of code:

function populateQueryResults(projectId, sql, sheet, firstRow) {
  var queryResults;

   // Create new query, passing in the project ID and SQL code
  queryResults = BigQuery.Jobs.query(projectId, sql);

  // Queries are asynchronous, so loop until successful job completion
  while (queryResults.getJobComplete() == false) {
      queryResults = BigQuery.Jobs.getQueryResults(projectId, queryResults.getJobReference().getJobId());
    Utilities.sleep(500);
  }

  // Populate the spreadsheet with the results
  var tableRows = queryResults.getRows();
  for (var i = 0; i < tableRows.length; i++) {
    var rowString = '';
    var cols = tableRows[i].getF();
    for (var j = 0; j < cols.length; j++) {
      sheet.getRange(firstRow + i, j+1).setValue(cols[j].getV());
    }
  }
}

Seems to be a pretty big difference in the age of mothers when giving birth to their first child in these states I’m not a statistician, so look towards the CDC for more official analyses.

If you’re a Microsoft Excel user, our new Google BigQuery Connector for Excel will allow you to run queries directly from within Excel.

Using BigQuery in JavaScript with the Google Visualization API

BigQuery is a REST API and can be called from server-side and client-side code.  Google also provides a Google Visualization API which is fantastic at rendering many different types of charts from within JavaScript.   For this dataset, I wanted to create a quick map of the average baby weight by state.   The entire code-- from OAuth authorization, to building up a DataTable, to rendering the GeoChart was about 60 lines.

The query, across 137 million rows of data, took only 2.9 seconds.   You can try the query yourself against this open public data set:

SELECT state, AVG(mother_age) AS avg_age
FROM [publicdata:samples.natality]
WHERE year=2000 AND ever_born=1
GROUP BY state
ORDER BY avg_age DESC;

Using BigQuery with Commercial Visualization Tools

Are you reading Programmable Web, but don’t like to write code? There are still some visualization solutions available.  Several commercial visualization apps have integrated with BigQuery to make it easy for analysts and execs to stay up to date. Bime and Qlikview have both built live demos using the US Birth Statistics data.

The simple REST API BigQuery provides to analyze massive datasets makes it easy to build many different types of visualizations.  Try it out with public datasets without getting out your credit card, and drop me a note on Google+ if you build something exciting!

Both comments and pings are currently closed.

2 Responses to “Three Techniques for Visualizing Data From Google BigQuery”

September 3rd, 2012
at 9:06 pm
Comment by: Link Roundup – September 4, 2012 | Enterprise Information Management in the 21st Century

[...] Three Techniques for Visualizing Data from Google BigQuery (ProgrammableWeb) – OK, so you have access to Google BigQuery – what are you going to do with it? [...]

November 2nd, 2012
at 9:00 pm
Comment by: mark

Could you share your 60 lines of code for using BigQuery in JavaScript with the Google Visualization API? I am new at this and would love to see it.

Follow the PW team on Twitter

ProgrammableWeb
APIs, mashups and code. Because the world's your programmable oyster.

John Musser
Founder, ProgrammableWeb

Adam DuVander
Executive Editor, ProgrammableWeb. Author, Map Scripting 101. Lover, APIs.