Monday, June 10, 2013

Playing with new toys...

I'm a fan of Google Apps.  I was a fan of OpenOffice back in the day, but then ended up working at a couple of Microsoft-centric companies and fell back into my Microsoft Office ways.  And to be fair, Microsoft Office is a simply amazing suite of office productivity applications.  To a package, Microsoft still probably has at least 50% more features/capabilities than the corresponding Google App.  However, I typically only need a simply functional suite of office applications and rarely do I need amazing.

What's cool, though, is the ability to extend Google Apps -- relatively easily -- with Google Apps Script.

So today, I was working on compiling some survey responses from a SurveyMonkey poll I put together.  It was important to us to see not only the IP address of the respondent, but to learn what we could about them from that information.  I Googled around and quickly came to the conclusion that the protected "sandbox" that Google allows you to build your scripts/functions/formulae in is simply too constrained to be able to do what I needed.  So, I figured I'd need to reach out to another system to do the lower-level work for me.

SurveyMonkey gives you the option to download your survey responses in a spreadsheet-friendly format.  I downloaded mine in Excel (.xls) format and imported them into a Google Spreadsheet.

Amongst the fields that SurveyMonkey shares is a standard one:  "IP Address".  What I needed was a quick way to do a reverse DNS lookup (PTR record) on the IP and stuff that into a cell.  Unfortunately, Google spreadsheet doesn't provide a gethostbyaddr() function by default, so I had to build one.

This amounted to two key things:
  • Get familiar with Google Apps Script
  • Use my freshly honed PHP-fu to finish the job
Google makes it pretty easy to get familiar with Google Apps Script (courtesy their tutorials) and using their training wheels (templates and sample code), I was able to knock out a quick and dirty solution in nothing flat.

Following their "Add a Custom Function to a Spreadsheet" guidance, I threw together a function (in Javascript) that looks like this (with the URL hostname changed to protect the innocent):

function getHostByAddr(addr) {
  var addr = addr;
  var baseUrl = "http://foo.com/utils/ptr.php?ip=";
  var url = baseUrl + addr;
  var result = UrlFetchApp.fetch(url);
  var content = result.getContentText();

  return content;
}

Now by simply adding a function into a spreadsheet cell, a la:

     =getHostByAddr(F2)

I end up grabbing the response HTML document from the URL request and stuffing it into the cell containing the function call/"formula".  Note:  In the formula example above, "F2" refers to the cell containing the IP address.

Now, I just need something to do the actual reverse DNS lookup for me.  Here's where the quickest and dirtiest PHP script ever came into play.  At the URL listed above in the function code, I laid down ptr.php, which contains nothing more than:

<?php

/* This grabs the value of "ip" from the URL parms, does a reverse DNS lookup, then prints it and exits */

$hostname = gethostbyaddr($_GET["ip"]);
echo $hostname;

?>

Now I said this was quick and dirty.  The anal-compulsive amongst you might notice that the output of this script will actually be a well-formed HTML document, complete with , , and tags.  Fortunately, for my current purposes, this is OK, since the spreadsheet basically strips those tags and displays just the content, which is just what I wanted.  If I wanted to invest more time in this, I could parse the returning HTML and grab the data I want, or get even more complicated and return JSON, but for my needs for now, this works just fine.

And by now, I've already spent more than twice as long writing this blog post than I did the two scripts combined (including ramp-up time).