Making Clean, Updatable Online Tables

I tried a cool tool from the folks at ProPublica called TableSetter, which makes really nice, dynamic tables from .csv (comma-separated values) files or Google Docs spreadsheets:

 

 

Nice thing is that if the data changes, you just update the .csv file without having to remake the page. I managed to install it on my computer, make it work, deploy it on Heroku and make it work there, too.
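To show the general idea of a table that follows its data file, here's a rough Ruby sketch that turns CSV text into an HTML table. This is not TableSetter's own code or configuration; the method name csv_to_html_table and the sample data are made up for illustration.

```ruby
require "csv"
require "cgi"

# Build an HTML table from CSV text. The first row is treated as the header.
def csv_to_html_table(csv_text)
  header, *body = CSV.parse(csv_text)
  html = +"<table>\n  <thead><tr>"
  header.each { |cell| html << "<th>#{CGI.escapeHTML(cell.to_s)}</th>" }
  html << "</tr></thead>\n  <tbody>\n"
  body.each do |row|
    html << "    <tr>"
    row.each { |cell| html << "<td>#{CGI.escapeHTML(cell.to_s)}</td>" }
    html << "</tr>\n"
  end
  html << "  </tbody>\n</table>"
end

# Made-up sample data standing in for a real .csv file.
SAMPLE_CSV = "borough,trips\nBrooklyn,120\nQueens,95\n"
puts csv_to_html_table(SAMPLE_CSV)
```

Because the table is rebuilt from the CSV text each time, changing a number in the data changes the table without touching the code.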

How to Wrangle Huge Amounts of Data

I used MongoDB, Ruby and Google Fusion Tables to make a map of long, early-morning taxi trips.

It shows the start points of 5533 NYC taxi trips that were at least 10 miles long and began between 4 a.m. and 6 a.m. one week in March 2009. Data from the NYC Taxi & Limousine Commission.

I imported the TLC's data into MongoDB using mongoimport and then installed the Mongo Ruby gem. Using these three pages, I ran a bunch of "finds" on the data to see what I could get, just using "puts" to print them to the screen.

Even figured out some regular expressions to pull only the records where the hour was 04 or 05, and also to get rid of the commas and extra spaces in the address fields.
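For a feel of what those two regular-expression jobs can look like, here's a small sketch in plain Ruby, with no database involved. It is not the original script: the field names pickup_time and pickup_address, the timestamp format and the sample records are all assumptions.

```ruby
# Hypothetical records; the real TLC field names and timestamp format may differ.
TRIPS = [
  { "pickup_time" => "2009-03-02 04:15:00", "pickup_address" => " 1  Main St ,  Brooklyn " },
  { "pickup_time" => "2009-03-02 05:40:00", "pickup_address" => "5th Ave,Manhattan" },
  { "pickup_time" => "2009-03-02 14:05:00", "pickup_address" => "Queens Blvd, Queens" }
].freeze

# Matches an hour of 04 or 05, assuming timestamps look like "YYYY-MM-DD HH:MM:SS".
EARLY_HOUR = /\s0[45]:/

# Turn commas into spaces, collapse runs of whitespace, trim both ends.
def clean_address(address)
  address.tr(",", " ").gsub(/\s+/, " ").strip
end

# Keep only the trips whose pickup time falls in the 04 or 05 hour.
def early_trips(trips)
  trips.select { |trip| trip["pickup_time"] =~ EARLY_HOUR }
end

early_trips(TRIPS).each do |trip|
  puts "#{trip["pickup_time"]},#{clean_address(trip["pickup_address"])}"
end
```

Stripping the commas matters here because the output is meant to be comma-separated: a stray comma inside an address would otherwise look like an extra column.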

Here's the code for where I ended up.

Instead of writing to a file, I just copy-pasted the output from my terminal as a .csv file and uploaded it to Google Fusion Tables.
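If copy-pasting ever gets tedious, Ruby's csv library can build the same kind of output and handle the quoting. This is only a sketch: the column names, the rows and the method name trips_to_csv are placeholders, not the TLC's data or the original script.

```ruby
require "csv"

# Hypothetical rows; the columns are placeholders for illustration.
ROWS = [
  ["2009-03-02 04:15:00", "1 Main St Brooklyn", 12.4],
  ["2009-03-02 05:40:00", "5th Ave, Manhattan", 10.1]
].freeze

# Build CSV text with a header row. Fields that contain commas get quoted.
def trips_to_csv(rows)
  CSV.generate do |csv|
    csv << ["pickup_time", "pickup_address", "miles"]
    rows.each { |row| csv << row }
  end
end

puts trips_to_csv(ROWS)
```

Swapping the last line for File.write with a file name of your choosing would save the result straight to disk, ready to upload.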

And here's a bigger version of the map.

I can make maps with text

I learned from a Flowing Data article about making themed maps that I can download .svg files from Wikimedia Commons and that those are just text files you can tinker with ... like shade the counties according to data by changing the fill text! And when you're done, you can embed them into HTML, for browsers that support 'em!

Here's what it looks like:

Using embed code like this:

<object width="640" height="480" type="image/svg+xml" data="http://www.prototypecloud.com/linked/Map_of_New_York_County_Outlines.svg">

<span>&nbsp;</span>

</object>

Cool.
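Since an .svg is just text, the shading step can be scripted too. Here's a hedged Ruby sketch that swaps fill colors with a regular expression. The tiny SVG, the ids, the colors and the method name shade are all invented for illustration; real Wikimedia files are much larger and may name things differently or keep the color inside a style attribute.

```ruby
# A tiny stand-in for a county outline map.
SVG = <<~XML
  <svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">
    <path id="Kings" fill="#d0d0d0" d="M10 10 H90 V90 H10 Z"/>
    <path id="Queens" fill="#d0d0d0" d="M110 10 H190 V90 H110 Z"/>
  </svg>
XML

# Made-up colors to shade by; in practice these would come from your data.
COLORS = { "Kings" => "#08519c", "Queens" => "#9ecae1" }.freeze

# Swap the fill color on each element whose id appears in `colors`.
# Assumes the id attribute comes before fill inside the same tag.
def shade(svg, colors)
  colors.reduce(svg) do |text, (id, color)|
    text.sub(/(id="#{Regexp.escape(id)}"[^>]*?fill=")[^"]*(")/) { "#{$1}#{color}#{$2}" }
  end
end

puts shade(SVG, COLORS)
```

For anything beyond a quick tinker, an XML parser would be sturdier than a regular expression, but the text-replacement version shows how little is going on.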

Mining Nested Hashes in Sinatra

I've been tinkering with Sinatra (a web framework for Ruby) lately, but was having trouble figuring out how to get information out of a huge nested hash. So I tweeted the below to my accidental mentor, Al Shaw, of Talking Points Memo:

@A_L Say @foo={"people"=>{ "nabe"=>"dumbo", "kids"=>[{"name"=>"ann", "age"=>"8"}, {"name"=>"joe", "age"=>"10"}]} How do I return "joe"?

He replied that my example was missing a } (yup) and that the answer was:

@foo = {"people"=>{"nabe"=>"dumbo", "kids"=>[{"name"=>"ann", "age"=>"8"}, {"name"=>"joe", "age"=>"10"}]}}

@foo["people"]["kids"][1]["name"]  #=>  "joe"

Now I get it!
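To see why that chain of brackets works, it can help to walk the hash one level at a time. The sketch below uses a constant named FOO in place of @foo so it runs on its own outside Sinatra; it also shows Hash#dig (available since Ruby 2.3) and a lookup by value as alternatives.

```ruby
FOO = {
  "people" => {
    "nabe" => "dumbo",
    "kids" => [
      { "name" => "ann", "age" => "8" },
      { "name" => "joe", "age" => "10" }
    ]
  }
}.freeze

# Step by step, the same walk as chaining the brackets.
kids = FOO["people"]["kids"]   # the array of two hashes
second_kid = kids[1]           # arrays count from 0, so 1 is the second item
puts second_kid["name"]        # prints "joe"

# Hash#dig does the same walk and returns nil if a key is missing along the way.
puts FOO.dig("people", "kids", 1, "name")   # prints "joe"

# Look a kid up by a value instead of by position.
ten_year_old = kids.find { |kid| kid["age"] == "10" }
puts ten_year_old["name"]      # prints "joe"
```

The rule of thumb: a string key in brackets steps into a hash, and a number in brackets picks an item out of an array.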