Some Huck Hacking

I used to work on a big search product at AOL and still love search, even though that’s not what I do anymore. So, when I saw that IndexTank and Heroku were having a contest to build a cool app with IndexTank’s search-in-a-box, I couldn’t resist. I knew I had to keep it simple since I don’t have a lot of time for hacking outside of work, but I knew I had to do something.

I had two ideas, and went with the simpler one: What would happen if you broke a book down into individual sentences and made it searchable? Would it be useful at all? I decided to try Huckleberry Finn by Mark Twain, since it’s not too long, is public domain, is quotable and full of vernacular that can screw up indexers, and I knew it was available from Project Gutenberg.

I grabbed the text file, cut and pasted each chapter into individual text files and then wrote a Ruby parser to split it up into paragraphs and sentences, which were then written to javascript files. After that was done, I wrapped it in a simple Rails app to display each chapter and paragraph, and then fired all the sentences at IndexTank.
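
For the curious, here is a minimal sketch of that kind of splitter. It is not the actual Huck Smash parser: the method names `split_chapter` and `chapter_to_json` are made up for illustration, JSON is assumed as the output format, and the sentence rule is deliberately naive.

```ruby
require 'json'

# Split raw chapter text into paragraphs (separated by blank lines), then
# split each paragraph into sentences. Returns an array of arrays of strings.
def split_chapter(text)
  paragraphs = text.split(/\n\s*\n/).map { |p| p.gsub(/\s+/, ' ').strip }
  paragraphs.reject(&:empty?).map do |paragraph|
    # Naive rule: a sentence ends at . ! or ? (optionally followed by a
    # closing quote) when whitespace comes next.
    paragraph.split(/(?<=[.!?]["'])\s+|(?<=[.!?])\s+/)
  end
end

# Serialize one chapter so another program can read it back later.
def chapter_to_json(number, text)
  JSON.generate(chapter: number, paragraphs: split_chapter(text))
end
```

A rule this simple will trip on abbreviations like "Mr." and on quoted dialogue, which is exactly the kind of thing a vernacular-heavy book throws at you.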

I call the result… Huck Smash, and I think it’s pretty cool.

It was a lot of fun to write an app without a database or ORM, just a bunch of javascript files that Ruby can read and an extremely limited scope. I know it probably won’t win, but it only took a few hours to put together. Writing the text parser was a blast, and figuring out how to navigate the book and build out the HTML so you can link to an individual sentence was cool.
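
One way to make every sentence linkable is to give each one its own element id, so a URL fragment can point straight at it. The sketch below assumes a made-up id scheme (`c1-p2-s3` for chapter 1, paragraph 2, sentence 3) and hypothetical method names. It is an illustration, not the markup the app actually produces.

```ruby
# Characters that must be escaped before text is placed inside HTML.
HTML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

def escape_html(text)
  text.gsub(/[&<>"]/, HTML_ESCAPES)
end

# Hypothetical id scheme: chapter, paragraph and sentence numbers, all 1-based.
def sentence_id(chapter, paragraph, sentence)
  "c#{chapter}-p#{paragraph}-s#{sentence}"
end

# Render a chapter (an array of paragraphs, each an array of sentences)
# as HTML in which every sentence is a <span> with its own id.
def chapter_html(chapter, paragraphs)
  paragraphs.map.with_index(1) do |sentences, p_number|
    spans = sentences.map.with_index(1) do |sentence, s_number|
      id = sentence_id(chapter, p_number, s_number)
      %(<span id="#{id}">#{escape_html(sentence)}</span>)
    end
    "<p>#{spans.join(' ')}</p>"
  end.join("\n")
end
```

With ids like these, a link ending in `#c1-p1-s2` would jump to the second sentence of the first paragraph of chapter 1.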

I’m going to try to spend more time outside of work playing with single-purpose sites and fixing Ficly up. I need to keep things constrained so I don’t bite off more than I can chew or over-commit, but this was so much fun I want to do it again.

I’d love to hear what you think of Huck and any ideas you have for improvements.

Thank You, Open Source!

“Thank You” by Darwin Bell

I read Zed Shaw’s blog post on the decline of open source participation last week and it got me thinking about just how much open source software we use at work and how we’re (mostly not) giving back to those communities. So, here’s the first step in me becoming more involved and giving something back, even if it’s just a huge “thank you”. I am trying to be more involved, especially in the MongoDB and MongoMapper communities. I’m probably not going to be contributing code to either, but I’m fairly active on the mailing lists, have reported bugs and am committed to helping with the MongoMapper documentation project.

Excuses aside, here’s a list of the big open source things we use on a daily basis and why we love them:

  • Apache – The webserver that holds everything together. It’s used by most of the web, and we use it too.
  • Ruby on Rails – Rails lets us do more faster. We also use a bunch of gems that I’ll list later on.
  • Sinatra – When you don’t need everything that Rails has (a simple API, for example), then Sinatra is perfect.
  • Passenger – Deploying Rails apps used to be a pain. Not anymore! Thank you, Passenger!
  • MySQL – Need an RDBMS? Well, we use this one. And it works pretty darned well.
  • Memcached – We cache everything we can, and memcached helps us do that.
  • MongoDB – We use it because it’s web scale! (if you get that joke, then you’re in the club!) Seriously, we first started using MongoDB just to collect our stats because we’d maxed our poor MySQL instance. Then, I looked deeper and realized it’s perfect for the big top secret thing I’m working on now. Atomic updates and super-fast inserts make it perfect for collecting a lot of data quickly. And it’s no slouch on the query side either. There’s also a great community behind MongoDB. The updates and improvements are frequent and the community is always willing to jump in and help.
    It’s a nice hybrid between the new school document store databases and a traditional RDBMS.
  • Beanstalkd – A super-fast queue server. It just works, which is why I love it. We queue everything we can. Why? It’s a great way to meter load. If you can only handle 3 jobs running at once, then you only run 3 workers. If you can handle more, you run more. It’s great!
  • And of course, all of our servers are Linux and run hundreds of open source packages that I don’t even worry about.

Since our strength lies in Ruby, we try to do everything in Ruby that makes sense. I’m not going to list all the gems we use, but here are a few of my favorites – the ones that make life easier and make programming all day more fun.

  • MongoMapper – Makes working with MongoDB even more fun. I’m on the MongoMapper mailing list, and it’s one of the most supportive and helpful communities I’ve been a part of.
  • memcache-client & beanstalk-client – They’re how we talk to memcached and beanstalkd.
  • hashie – Allows you to very easily create classes built around hashes. Great for wrapping around APIs.
  • typhoeus – My favorite of the many HTTP clients for Ruby.
  • will_paginate – Now I don’t need to do all the horrible gymnastics needed to add “previous” and “next” links to things! THANK YOU!
  • hpricot – My favorite way to parse HTML - with CSS selectors!
  • aws-s3 – A great interface to Amazon S3 (where we store a bunch of stuff)
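
The beanstalkd point above, that the number of workers is the throttle, can be sketched in plain Ruby. This uses Ruby’s built-in in-process `Queue` and threads as a stand-in. It does not talk to beanstalkd or use the beanstalk-client API, and `run_jobs` is a hypothetical name.

```ruby
# Run every job through a fixed pool of workers. The pool size is the
# throttle: at most `worker_count` jobs are ever in flight at once.
def run_jobs(jobs, worker_count)
  queue = Queue.new
  jobs.each { |job| queue << job }
  # One stop marker per worker so each thread knows when to exit.
  worker_count.times { queue << :stop }

  results = Queue.new
  workers = Array.new(worker_count) do
    Thread.new do
      while (job = queue.pop) != :stop
        results << job.call
      end
    end
  end
  workers.each(&:join)

  # Drain the results; they arrive in completion order, not submission order.
  Array.new(results.size) { results.pop }
end
```

If the box can only handle 3 jobs at once, call it with `worker_count` set to 3. If it can handle more, raise the number. Everything else just waits its turn in the queue.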

There you go. That’s pretty much our entire stack. I left out a bunch of gems, mostly ones we don’t use directly or that just provide one or two things.

So, thank you to all of the creators and contributors to open source projects out there, especially the ones we use to make our work easier. The web would be a much smaller place if there weren’t dedicated geniuses out there making this stuff, and the world would be a poorer place for it. I promise to be a better member of the community and contribute where I can!

Murray Wilson Is Awesome

My pal Murray Wilson does great things – he and AWOL take kids the system doesn’t want and teach them to take apart, clean and refurbish computers the system doesn’t want – computers that would otherwise go to the landfill.

They then put Linux on them and put them out into the community with families that need them. He’s one of my absolute favorite people in Savannah (nay, the world) and I’m proud to know him.

The computers will, of course, end up in the landfill eventually, but the “Goon Squad” gives them easily another 2-5 years of life, and the kids learn useful and marketable skills. It’s a win-win, and an amazing program that Murray and AWOL built from the ground up.

If you can spare it, AWOL can always use some help. Every little bit helps, and every kid they help is one that’s not in the juvenile justice system or out on the street by themselves.

Murray is awesome in the best sense of the word.

My Slides from Future of Social Media

I finished speaking at “The Future of Social Media”… it was fun telling them not to join twitter if they’re just going to be marketers and not actually be human. I told them other stuff too, I think.
I didn’t have a lot of time to talk about the future, so I didn’t get to talk about identity vs. persona and my three categories of social networks… maybe next time. I’m pretty sure I scared the hell out of them when I talked about reputation stuff.
Some things I mentioned that either I didn’t put the URL for in the slides or didn’t have in the slides at all:
* The quote from Jeremy Tanner about twitter spammers comes from his fantastic blog post – read the whole thing.
* You can read The Cluetrain Manifesto online for free.
* I talked a little bit about Seth Godin. His blog may be a little pat, but I’ve learned a lot about marketing and product development from his books.
* My interest in reputation started with Cory Doctorow’s Down and Out in The Magic Kingdom. It introduces the idea of “whuffie” which captured my imagination. I hope to some day implement a real whuffie system online. I came really close once.
I think that’s it… hopefully the people who saw it enjoyed it and got something out of it. It was a lot of fun preparing it.

Epic iFail: AT&T and the Circular Phone Tree from Hell

When I left AOL, I gave them back my Blackberry, and have since been either without a phone completely or using Jen’s bright pink Razr. I want a new iPhone, but waited a little while to get past the initial rush. Well, this afternoon was supposed to be the day. I called up the local AT&T store to see if they had any, and got a menu. Here’s what happened:
* I pressed 2 to order new wireless service or hardware. Waited six minutes to talk to someone.
* Asked the guy who picked up if the Savannah AT&T store had any in stock. He asked me my zip code and then asked me to hold. I spent three minutes on hold (there’s a timer on the phone… handy).
* He told me he could transfer me to the store. I then spent 8 minutes on hold.
* I ended up at the original phone menu and pressed 2 again just for fun. I waited for 5 minutes before my head exploded and I hung up.
That was twenty-three minutes to go basically in a big circle. AT&T, you suck. I mean, you’re a phone company and I can’t dial a local number and talk to the local AT&T store? How stupid is that?
Thank you, AT&T, for wasting almost half an hour of my life. You’re the balls.

Nerdy Songs

Jason posted a tweet about writing songs this afternoon and I must have been in a particularly suggestible post-nap state and instantly came up with several extremely nerdy song titles. I think almost all of these fall into Nerd Country n’ Western, but whatever. Here they are:
* I’m Semantic, But Wow, You’re Well-Formed
* Since You Left, I’ve Been in Plain Old Semantic Hell
* Why Do Our Tags Have to Branch?
* If You Won’t Mock My Markup, I Won’t Jeer Your Scripts
* What’s in a DOCTYPE?
* I Sold My Soul to the W3C, and All I Got Was a Long-Sleeved Tee
* Baby, It’s Not Really a Microformat!
* Let’s Go Home and Append Some Child Nodes to Your DOM
* If You Leave, All I’ll Have is Twitter
I’m sorry. I really am, but you’re welcome to add to the nerdy nonsense in the comments…

Social Networking Mashups

I’m speaking today at The Social Networking Conference about social networking mashups. I decided to turn it on its head a little bit and do an introduction on portable social networking instead, because what is it but a big ol’ mashup of identity, relationships and content?
If you’re at the conference, I’ll see you at 1:30, and I’ve updated the slides a bit since I had to turn them in for the CD, so the presentation is a wee bit bigger and more complete than the one you got in your packet. I’ve uploaded the “final” version here, and you’re welcome to download it.
I certainly couldn’t cover everything, because I only have about 35 minutes (and 36 slides). I don’t touch on the Data Portability working group, or several other relevant things, because there just isn’t time. It’s very much an introduction to why it makes sense for social networks to support “good things” like OpenID, Creative Commons, microformats, providing feeds for everything, etc. Hopefully, it will lead to more technical discussions and some good questions.
Feedback is, of course, welcome!

Happy New Year and Twitter Stats

We had a lovely time in Mississippi eating way too much fried/barbecued/fatty food (I only gained one pound, and have promptly lost three, so no worries), playing with the dog, hanging out with Jen’s parents, fishing and watching the boys ride around in the trailer behind Grandpa Brian’s lawn tractor. I’ll try to upload pictures tonight.
I’m back at work, and having a hard time getting back into the work groove. So, I got my twitter stats instead… yeah, productive, I know (I also cleared out my inbox, remembered my kerberos password, did annual review feedback for folks, updated SVN and set up a meeting for this afternoon).
What I found funny is the tweets per hour. Since I got the blackberry over the summer, I twitter more at night while watching TV than I do while I’m at work. I also seem to post a lot around 11AM, which is usually when I take my first break of the day. All in all, I post a lot, which doesn’t really bother me, or seem to affect my work. I love the “noise” twitter generates. After working for almost 13 years with constant interruption, if I don’t get interrupted every ten minutes or so, it feels like something’s wrong.

[Chart: number of tweets per hour. It peaks around 8:30PM.]

Also, March was my heaviest month o’ tweets, which isn’t surprising since SxSW was right smack in the middle, and that’s where I really “got it”. I’m not sure what happened in May, or why December was so high – especially considering I was at home for almost three weeks and without “real” bandwidth for a week.

[Chart: number of tweets per month. March was the highest, with December a close second. May was the lowest, and I have no idea why.]

I think I’ve reached a sort of twitter equilibrium. I follow about 200 people, with only about 50 sent to my phone, which keeps the noise on my phone when I’m not near the computer down to a dull roar.
(I generated the stats with the very handy script written by Damon Cortesi)
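
The per-hour numbers boil down to bucketing timestamps by hour of day. Here is a minimal sketch of that idea. It is not Damon Cortesi’s script: the method names are hypothetical, and it assumes each timestamp is a plain local-time string such as `2007-12-30 20:31:00`.

```ruby
require 'time'

# Count tweets for each hour of the day (index 0 through 23).
def tweets_per_hour(timestamps)
  counts = Array.new(24, 0)
  timestamps.each { |stamp| counts[Time.parse(stamp).hour] += 1 }
  counts
end

# The first hour of the day with the highest count.
def peak_hour(counts)
  counts.index(counts.max)
end
```

Swapping `.hour` for `.month` (and a 13-slot array, since months start at 1) would give the per-month version.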

Insomnia-Fueled E-Mail Management Musings

I just saw Khoi Vinh’s post on managing e-mail and since I can’t sleep, I figured I’d tell you how I manage e-mail. I use OS X’s Mail for work mail and Thunderbird for my personal stuff (I like keeping them separate). I don’t get a ton of personal e-mail, but I get between one hundred and three hundred e-mails a day for work (during the week, 50-70 on weekends) between projects, internal listservs and CSS Working Group stuff. That number’s been as high as four hundred. During the AIM Pages crunch last year, I was getting more than five hundred a day.
I’ve managed that load for more than five years, and have found a couple things that keep me sane.
# I have a smart folder called Unread Messages that has only messages I haven’t read in it. Instead of peering at threads and an inbox thousands of messages long, I see only the stuff I haven’t read. I have another smart folder that has messages received in the last 36 hours. I almost never go into the Inbox view, because there’s just too much stuff there.
# Respond right away. If you can’t respond in a couple minutes, open the message in a new window and get to it after you’ve filtered the rest.
# Do your e-mail first thing. I spend the first half-hour of the day filtering and responding to e-mail, and then get to work. I’ll check back every hour or so and filter again, depending on what I’m working on. If I’m coding and in the zone, then I might only check at the end of the day, but if I’m in meetings, it’s more often.
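
The two smart folders are really just saved filters over the inbox. As a rough sketch of the logic (not how any mail client implements it; the `Message` struct and method names are made up for illustration):

```ruby
# A stand-in for a mail message: subject, time received, and a read flag.
Message = Struct.new(:subject, :received_at, :read)

# "Unread Messages": everything that has not been read yet.
def unread_messages(inbox)
  inbox.reject(&:read)
end

# "Last 36 hours": everything received since the cutoff, read or not.
def recent_messages(inbox, now: Time.now, hours: 36)
  cutoff = now - hours * 3600
  inbox.select { |message| message.received_at >= cutoff }
end
```
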
That’s pretty much it. My work day is an exercise in interruption management. Between e-mails and IMs from co-workers, I deal with hundreds of interruptions a day. It’s funny, but when I really have to get something done and don’t log in to AIM or open my e-mail, I miss the interruptions. I don’t know what to do with myself.
Sad, isn’t it?

Hello, Ficlets

It’s been a very long day, and it’s not over yet, but I couldn’t let the day be done until I posted about this. Today, we took the covers off of the project that I’ve been working on for the past three months: ficlets. It started as this little thing I was going to do all by myself to learn Rails, and ended up as what you can see over on the site.
I don’t even know what to say about it, really. Cindy, Jason and I have been dancing around it so long on twitter, calling it Ape Shirt, that talking about it now in the open feels kind of weird. But, here we are. There’s more information about what it all means on the ficlets blog.
Ficlets is very much an experiment (we like to call it “a prototype we just happened to launch”), and this is our very first release (we’re the first product in AOL to roll out on Rails, so we’ve still got stuff to learn about it…). So, things may go weird and wonky from time to time. Just give it a minute, and then reload.
I am truly fortunate to work at a company where I can get away with stuff like this. This started as my own little thing to do on the side. When I realized that it was actually a pretty cool idea and that I didn’t have the time or talent to do it all myself, I presented it at a meeting, and the next thing I know, I’m working on it full time with a small team of amazingly talented people. It was a pirate project in the best sense of the word. We didn’t really do a project plan or start with a big committee. It was four people in a room, working towards something we were all geeked about. From the beginning, we treated it like we were in a startup, very few rules, no defined roles (except that I got two votes, and Kerry got three). It worked so well, and we had too much fun designing and building it.
I never imagined it would look so good, or be so much fun. For that, I have to thank the designers who worked most closely on it: Cindy Li, Ari Kushimoto, Jenna Marino, and Jason Garber, who did 99% of the markup (all the good stuff), the CSS and most of the javascript (I worked on it some, I swear). We make such a great team, and I’m so proud of the work we did. We had lots of other help too, from folks who helped design the stickers, buttons and shirts for SxSW: Shadia Ahmed and Jayna Wallace, to the folks who played around with concepts early on: Elisa Nader, Elsa Kawai, Tom Osborne and Justin Kirk.
There are tons of people to thank, and a lot of people helped out. We had tons of support and “air cover” from Kerry and text and language help from John, Amy, Suzie, Nancie and Erin. My pal Tony was an immense help figuring out how to deliver everything in working order to the Greatest Ops Guy in the World, Dan, and Kelly helped us bend a few rules to get all the other opsy bits in order at the last minute. We had legal help from Holly and Regina. And my bosses let me steal Jason, and go work on it, so big thanks to Alan and Bert too.
This has been so much fun, I think we should do it again. I have big plans for our little story site…
One last thing… if you’re going to be at SxSW Interactive this weekend, come find me. We’ve got some lovely stickers and buttons to hand out (while supplies last). I should be pretty easy to spot. I’ll be the big fat guy with the ficlets shirt on (well, for two days… ).
Now I have to go finish packing!