Sibling Squabble over Regular Expressions
I was working in Java yesterday to parse a URL that had a somewhat flexible format, so I set up my unit tests, wrote the code by hand using indexOf() and substring(), and got to greenbar. Then I looked at the code and shuddered at its inelegance. Time to switch to regular expressions.
Just in case someone hasn’t heard the Jamie Zawinski quote on regular expressions:
Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.
(Further discussion of the quote and its source: http://regex.info/blog/2006-09-15/247)
I spent the next 30 minutes or so writing and debugging a regular expression to do what my indexOf() and substring() code did. When I finally got it right, I tweeted:
Pattern.compile(".+/(.[^/\\.]+).*") got me to greenbar. #regex
(http://twitter.com/hoop33/status/25226605454)
Almost immediately, my older brother Ben (@BenWarner) DM’ed me:
Worst. Status. Update. EVAR!!11!!!
Despite being a quoted Twitter expert (http://www.usatoday.com/tech/news/2010-05-25-1Atwitter25_CV_N.htm), Ben doesn’t know a regular expression from prune juice, nor does he know the feeling of victory at getting a regular expression to work. I stand by my status!
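Sadly, however, my regular expression had a bug: an extra period at the start of the capture group, which let the group's first character be anything at all, even a slash or a dot. I had to add some more tests to catch it. Here’s the final solution: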
Pattern.compile(".+/([^/\\.]+).*")
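To see what that extra period changes, here is a minimal sketch that runs both patterns over a few URLs. The sample URLs, the class name ExtraPeriodDemo, the capture() helper, and the use of Matcher.matches() are all assumptions made for illustration; they are not the original tests or parsing code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtraPeriodDemo {
    // The two patterns from the post: the tweeted one and the corrected one.
    static final Pattern BUGGY = Pattern.compile(".+/(.[^/\\.]+).*");
    static final Pattern FIXED = Pattern.compile(".+/([^/\\.]+).*");

    // Hypothetical helper: returns capture group 1 if the whole input
    // matches the pattern, otherwise null.
    static String capture(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up sample URLs, not taken from the original unit tests.
        String[] urls = {
            "http://example.com/docs/report.html",
            "http://example.com/docs/a",
            "http://example.com/docs/.hidden"
        };
        for (String url : urls) {
            System.out.println(url
                + " buggy=" + capture(BUGGY, url)
                + " fixed=" + capture(FIXED, url));
        }
    }
}
```

Both patterns capture "report" from the first URL, which is why a test like that would stay green. They part ways on the other two: for ".../docs/a" the buggy pattern needs at least two characters in the group, so it backtracks and captures "docs" while the fixed one captures "a"; for ".../docs/.hidden" the buggy pattern captures ".hidden" while the fixed one captures "docs".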