Wednesday, December 5, 2012

New Article in Pragmatic Programmer

Just a quick note to say that I had another article published at the Pragmatic Programmers':

Wednesday, October 24, 2012

Picking the next toys just got harder

My Verizon Wireless contract is up in three weeks, my Kindle just died, and Apple just introduced the iPad Mini, so I've got some deciding to do!

I currently travel with my 13" MacBook Air, a two-year-old Android phone, and often my iPad 2 as well.  The MacBook Air is my primary laptop and serves as my coding and general content creation platform.  The Android phone is going into the crusher as soon as possible, to be replaced with an iPhone.  Apple isn't perfect, but they support their devices for longer than 20 minutes, so it's "never again" for Android.

So now we've dealt with phone calls, navigation, quick email checks, quick web browsing, coding and content creation.  What's left?

The iPad has been getting pretty lonely since I got the Air.  They're both big enough to need a carry bag so if I have the iPad I almost certainly have the Air as well.  This leaves time on the train commuting to work as about the only time I'd use the iPad because there's really not room in our crowded train for the Air.  So I use the iPad for games and some reading.

Do I really need either a Kindle or an iPad Mini?  (And when I say "need" I mean it in the geeky sense, as in: can I develop a rationale for it?)

The only thing I can think of using an iPad Mini for rather than my existing iPad is reading; let's face it, the iPad gets heavy after a while.  But I could get a Kindle Paperwhite for about a third the cost, or a base Kindle for less than a quarter of the cost of the Mini.  I'd rather get the Paperwhite, but they're back-ordered for 4-6 weeks so that's not an option.

Put another way: I can get an iPhone and a base Kindle for less than the cost of the Mini.

So, for folks who don't already own an iPad, the iPad Mini might be a great idea, but I don't see many people getting both.

Saturday, July 28, 2012

Finding your voice

In the past six weeks I've attended a week-long Writers Conference at Wesleyan University, a week-long Native American and World Flute Conference at the University of Wisconsin, and changed jobs (going from a 40,000-person company to a 6-person startup).

I like to look for common threads and the thread here is finding one's voice.

Voice is sometimes thought of as the same thing as "style" which may help explain the idea of programmers having a voice but I think it goes far beyond that.

At the writers conference we did close readings ("inspections") of various works, looking for themes, patterns and voice.  Voice here meant a way of saying things that was engaging, illuminating, understandable and, among other things, predictable.  We talked about the various contracts a writer can implicitly establish with a reader: mentioning a shotgun in a foreshadowing paragraph, for example, implies that the gun will be used in some later scene.

At the flute festival we attended Master Classes on topics such as the use of various scales (major, minor, mixolydian), how to add embellishments or ornaments to one's playing, and how to tell a story through tempo, volume and the like.  While the standard orchestral flute has about 20 keys, the various native flutes have six or four finger holes...and the didgeridoo has none.  This can be seen as limiting, but a smaller set of notes to choose from just means one has to use other ways to produce a full and interesting story.  Thus voice.

In my new job I am the second programmer on a medium-sized system...meaning that the system was entirely written by a single person.  It definitely has style, because the person who wrote it has strong feelings about the myriad design choices he's made along the way.  He's also frank about the choices that were made without a strong sense of one option being clearly better than another.

His voice is of course different from mine.  We have very different backgrounds and life experiences, but the point is that I can see his voice in the code and I quite like it.  I'm sure that over time our two voices will influence each other, but I suspect that they will remain distinct.

In the past I've been a strong advocate of static analysis tools such as CheckStyle. I don't think I'll be lobbying so much for that tool here.  I'm beginning to think that such tools are most applicable for shops where the engineers have not yet found their voice.

Friday, June 22, 2012

Thoughts about e-Books after attending a Writer's Conference

I just returned from attending the 2012 Wesleyan Writers Conference in Middletown, Conn.  It was a week jam-packed with close readings of poetry, thematic explorations of fiction, explorations of dialog in non-fiction and, of course, speculation on the future of books, e-books and publishing.

A couple of ideas came to me during the week that I'd like to share.  Let me preface by saying that for me reading means reading books or PDFs on my Kindle, or via a Kindle Reading app on my iPad, MacBook Air or HP laptop.  Last year I read 42 books and only two of them were of the dead-tree variety.

One of the great features of Kindle reading (and presumably also of Nook reading, though I can't say from experience) is the ability to trivially get the dictionary definition of words in the text via a single click.  I like to pride myself on an extensive vocabulary, but there are lots of words that I "mostly know what they mean but could not give you a coherent definition of".  I make it a point to click on these words and get a real definition.  That's all well and good, but we should be able to do better.

Many books with extensive sets of characters contain a dramatis personae, or list of characters, at the beginning of the book.  This can be especially important when reading Science Fiction, where some authors show how alien their characters are by giving them unpronounceable names.  Imagine being able to click on a name in your book and get a quick description of who that person is.

A further enhancement to this idea would be to generate what we know about the person at that point in the story.

Some people might view these not as enhancements but as crutches that remove the need to immerse oneself in the book.  I would respectfully disagree and say they provide a way to become more immersed in the book, but it doesn't matter.  They are mostly just examples of simple things we could do to play with the book-reading experience.

Sadly, while Amazon provides an API to create applications like games on the Kindle, they do not (as far as I've been able to tell) provide a way to modify or extend the actual book-reading experience.  I don't blame them, as most "enhancements" would likely be misguided, but it's still a shame that we can't currently play with the idea.

Thursday, May 3, 2012

Scala Static Analysis

I am finally getting around to working on my next pet project which is a Static Analysis tool for Scala.

Java has a number of quite good Static Analysis tools such as:
  checkstyle - enforces a large set of user-configurable rules about things like
               max line length, max parameters, max if-nesting,
               max cyclomatic complexity, etc.
  findbugs   - looks for common bug patterns

Scala, for all of its power, currently lacks such tools.  There is even debate within the community as to whether such tools matter for functional languages.  Still, it seems like a very interesting problem to me so I decided to give it a go.

In talking with people at Boston Scala Days 2012 the consensus was that a compiler plugin was the best way to approach this problem.  So, I started thinking about a two pronged approach:

  1 - develop a compiler plugin to get access to the Abstract Syntax Tree of a program

  2 - start a discussion about what kind of rules or metrics might make sense for a language like Scala.

The plug-in itself splits into at least two parts: the skeleton that hooks itself into the compilation process and the part that accesses the AST itself.

There is a great "how to" article on the compiler plug-in skeleton that shows how to build a trivial plug-in that looks for divide-by-zero errors.  The guts are in the "apply" method...and that's where things got spooky for me!

      def apply(unit: CompilationUnit) {
        for (tree @ Apply(Select(rcvr, nme.DIV), List(Literal(Constant(0)))) <- unit.body;
             if rcvr.tpe <:< definitions.IntClass.tpe)
          unit.error(tree.pos, "definitely division by zero")
      }

I know that this method iterates over the unit.body tree looking for items that are a division of an integer by a literal zero.  That doesn't mean, however, that I fully grok the Tree hierarchy and its various flavors of Apply and Select nodes!

I will say I'm getting some very good help on this via a question I asked on StackOverflow:

More as I get smarter about Abstract Syntax Trees!  If anyone wants to help on this project please contact me as I clearly could use help.  I'll put the project up on GitHub soon.

Monday, April 30, 2012

Hard because it's hard or because it's easy?

Some problems are hard because they're actually hard.
Some problems, however, are only hard because the answer is so easy that no one bothers to tell you about it.

I recently encountered a problem of the latter type while playing around with Scala's Option type.  If you are not familiar with it, Option is a type that can either hold a thing or hold nothing.  So, an Option[String] either holds a String or it holds nothing (actually None in Scala).  For reasons that you probably either already know or don't care to know, this is very useful when coding in Scala.

You can say things like:
   foo match {
     case None    => // handle the case where foo is None
     case Some(x) => // handle the case where there is something
   }

(Of course we Scala folks like tricky things, so we'd tend to say "case _ =>" in the second case, but for now you can dismiss that as showing off.)

So, imagine you are in the second case and actually have a Something.  How do you get at it?

You might think you'd say foo.Some, or Some(foo), or some variation on that theme.  Failing that, you might check the documentation on Option or read up on it in one of the several fine books on the topic.  Unless you looked someplace I didn't, you won't find much.  foo.Some doesn't exist, and most variations of Some(foo) give truly unexpected answers.  For example, try this:

val bla = 4
val foo : Option[Long] = Some(3L)
println(bla > Some(foo))

This results in the pretty confusing error message:

:10: error: overloaded method value > with alternatives:
  (x: Double)Boolean
  (x: Float)Boolean
  (x: Long)Boolean
  (x: Int)Boolean
  (x: Char)Boolean
  (x: Short)Boolean
  (x: Byte)Boolean
 cannot be applied to (Some[Option[Long]])
              bla > Some(foo)

So, how do you access the "some" part of an Option?  The "obvious after the fact" answer is with "get".

You can say println(bla > foo.get)
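Putting the pieces together, here's a tiny runnable sketch (variable names follow the post):

```scala
object OptionGetDemo {
  def main(args: Array[String]): Unit = {
    val bla = 4
    val foo: Option[Long] = Some(3L)

    // get unwraps the value; it throws NoSuchElementException on None
    println(bla > foo.get)  // true

    // Pattern matching binds the wrapped value without calling get
    foo match {
      case None    => println("nothing there")
      case Some(x) => println(s"got $x")  // prints: got 3
    }
  }
}
```

A safer alternative when foo might be None is foo.getOrElse(someDefault), which avoids the exception entirely.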

I bet that most people learning Scala have struggled with this or perhaps are still struggling with it.  However, it seems like such a dead simple case that most people won't admit to having been confused by it.

That's my definition of hard because it's "too simple to mention".

Wednesday, April 18, 2012

Retrospective Velocity - part 3 of 3

While it may take a huge effort to get your organization to move to Scrum, you can as an individual compute your personal velocity right now.  Any bug tracking system out there will let you gather data about the number of bugs assigned to you and/or fixed by you per unit time.  If it were a database query you'd do something like "select count(*) from bugs where assignee = 'me' group by month(assignedDate)".  Now, you may object, saying "but I do lots of other things besides fix bugs".  You would likely be correct, and it still doesn't matter.  If a third of your time historically gets absorbed by mind-numbing meetings, what makes you think next month will be any different?

All discussions of Black Swans aside, whatever last month looked like is a good starting point for guessing what next month might look like.  By all means measure each month and look for trends.  Perhaps you and/or your organization is getting better and your velocity is increasing.  Now we have more confirmation of that fact.  Or perhaps in response to missed deadlines you’re attending still more mind-numbing meetings to discuss such gems as why things take too long and so your velocity is decreasing.  Cold comfort perhaps but now you have data to back up that dark realization.

As an individual you might pick from several data sources to establish a velocity.  The selection may depend on your job function (sustaining engineer, development engineer, etc) and/or on the systems you are using for bug tracking, source code management and the like.

You might track the number of bugs fixed by you, the number of source code check-ins you do, the number of code inspections you participated in, or some combination of these and others.  One interesting strategy is to gather the data from as many interesting sources as possible and then graph the results for the longest time period you have data for.  If we assume that your actual underlying velocity has been relatively constant, you can select a data source or sources that produce a flat graph.  Of course, if you have switched languages or processes or problem domains, your velocity likely has not been constant, so your mileage may vary.

Tuesday, April 17, 2012

Retrospective Velocity - Part 2 of 3

One of the key takeaway messages from the experts in the field of estimation is not to estimate at all.  To quote Steve McConnell: "If you can't count the answer directly, you should count something else and then compute the answer by using some sort of calibration data".  So, if we can't count our velocity directly, we can count the number of bugs in our last project, how long they stayed open, how many requirements were present and so on.  Our calibration data is that all of those requirements and bugs took place within the time interval of the project.  It's not especially fine grained, but in a waterfall model we are not looking for fine-grained data.

Some may argue that bug reports and requirements are not precisely defined and standardized.  What passes as a single requirement for one team might be 3-5 separate requirements for another team.  Some teams have bugs like “it doesn’t work” while others might enter a dozen or more particular bugs to cover the same underlying defect.  Here is the strange thing though: it doesn’t matter.

Just like story points, these counts are not standardized; all that matters is that within your team you tend to be consistent.  If you're doing Scrum, whatever your team's definition of a story point is, it's likely to represent about the same quantum of work next month that it represented last month.  In the waterfall world, the level of granularity you bring to your bug reports is likely to be fairly constant.  The point to keep in mind is that we're not looking to equate points or bug counts against anything but other points and other bug counts.  So, if your last several projects each had 6 requirements, generated 60 bugs and took 6 months, you have a velocity of 1 requirement and/or 10 bugs per month.  If your next project arrives with 15 requirements and a deadline of three months from now, we can safely conclude that you are in trouble!
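That arithmetic is trivial to script; a throwaway sketch using the numbers from the example above:

```scala
object VelocityCheck {
  // Historical data from the example: 6 requirements, 60 bugs, 6 months
  val reqPerMonth: Double  = 6.0 / 6   // 1 requirement per month
  val bugsPerMonth: Double = 60.0 / 6  // 10 bugs per month

  // The new project: 15 requirements against a 3-month deadline
  val monthsNeeded: Double = 15 / reqPerMonth

  def main(args: Array[String]): Unit = {
    println(s"15 requirements need about $monthsNeeded months")
    println(s"Deadline feasible? ${monthsNeeded <= 3}")  // false: trouble!
  }
}
```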

Keep in mind the distinction between accuracy and precision.  In the preceding case we can say with high confidence that you are probably hosed.

Monday, April 16, 2012

Retrospective Velocity - Part 1 of 3

So how long is this going to take?  This is one of the most common questions to ask or be asked in a traditional software development environment, and one of the most difficult to answer.  The agile methodologies, and Scrum in particular, address this problem with the notion of velocity.  Whatever your team accomplished in the last time period (sprint) is likely similar to what they'll be able to accomplish in the next one.  Scrum's use of burn-down charts, standard definitions of "done" and retrospectives allows for the discovery of a team's average velocity.  One can argue about hours versus story points and many similar details, but almost any implementation of Scrum allows for tracking how much work was done per unit time.

Sadly, many projects are still using "traditional" methods (which seems to be one of the new names for waterfall).  One of the drawbacks of this approach is that the time scales are quite large compared to Scrum, which also means there are fewer time periods to measure.  There is simply less data available if you have two six-month buckets versus twelve one-month buckets.  This is one reason that "traditional" projects do not typically end up producing a team velocity.  This in turn makes it substantially more challenging to estimate how long the next project will take.

There is, however, other data that can be mined to uncover the more fine-grained time measurements we are looking for.  The data in question lives in your requirement tracking system, bug database and source code control system.  While these systems are often, shall we say, imprecise, they do contain useful data.

Wednesday, March 7, 2012

Using QuickSort to Explore Scala Collection Methods

In my continuing effort to play with my new favorite language, Scala, I decided to look at one of the standard algorithms in the field: quick sort.  Quick sort is a divide-and-conquer algorithm that picks a pivot point in a list and then calls itself to sort the two halves.  A very simple Scala quick sort is shown below.  Let's examine it as a learning vehicle.  The complex-looking first line defines a function that takes a List of things T, where T is any class that supports "Ordered".  It also returns such a list.
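Here is a sketch of the shape being described (reconstructed, not the exact original listing; I've used an implicit Ordering, the modern equivalent of the Ordered bound, and the cdr/cons names follow the discussion):

```scala
object SimpleQuickSort {
  // Quick sort over a List[T], for any T that has an Ordering.
  // The head of the list serves as the pivot.
  def quickSort[T](list: List[T])(implicit ord: Ordering[T]): List[T] = {
    import ord._  // brings the < operator for T into scope
    list match {
      case Nil => Nil  // the empty list: recursion bottoms out here
      case cdr :: cons =>
        // split the rest of the list around the first element
        val (before, after) = cons partition (_ < cdr)
        quickSort(before) ::: cdr :: quickSort(after)
    }
  }

  def main(args: Array[String]): Unit =
    println(quickSort(List(3, 1, 4, 1, 5)))  // List(1, 1, 3, 4, 5)
}
```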

As with almost all recursive algorithms, quick sort has two cases: the empty list and the non-empty list.  Scala's "list match" sets up those two cases.  In the case of the Nil list we return Nil, which is how the recursion terminates.  Otherwise we match against "cdr::cons".  That expression means "the first item in the list followed by the rest of the list".  (In Lisp the standard names are actually car for the first item and cdr for the rest; the variable names here use that vocabulary loosely.)  This may seem like an odd way to view a list, but in the functional world it is a very common idiom.

So, given a first element and a rest-of-the-list, what do we do?  We use another of Scala's powerful collection functions: partition.  "cons partition (_ < cdr)" says "create two lists: a list of the items that match the condition and another list of the items that do not".  These two lists are returned into the tuple "val (before, after)".  This shows how tuples are first-class citizens in Scala, allowing us to return multiple values from a function call.

Searching the web for Java implementations of quick sort shows what is quite typical for the partition portion of the algorithm: a loop full of index arithmetic and element swaps.

I don't know about you, but that kind of code looks prime for off-by-one errors and is nothing I'd expect to get right the first time.  Compare it to: val (before, after) = cons partition (_ < cdr).  The beauty of that line is that it's close to the English description: create lists of the items before and after the specified item by partitioning the list based on a test.  I'm beginning to feel that Scala seems hard not because it is hard, but because it's so different from Java; in fact, it's easy!

Lastly we have the recursive part of the function.  We call ourselves on the before list and the after list, and build a new list from the results of those two calls plus the cdr value (because it's not in either list).  This implementation works and has the advantage of being a tiny bit of code.  The drawback is that it always picks the first item in each list as the pivot point, a choice known to be sub-optimal, especially when the function is called on an already sorted list.

This next version of the program picks a better pivot point, in this case the middle entry of the list.  We accomplish this by changing the "case" to "theList : List", which just means any list.  We manually find the pivot point and then perform two for-comprehensions to find the list items larger and smaller than the pivot entry.  This gains performance via the better pivot point, but trades that off against having to scan the main list twice for the two for-comprehensions.  Further testing revealed that this implementation also had a defect: if an item was present in the list more than once, it would appear only once in the sorted output.  Another argument for robust testing!
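A sketch of what that middle-pivot version might look like; note how the strict < and > comparisons make duplicate copies of the pivot collapse, reproducing the defect just described:

```scala
object MidPivotSort {
  // Middle-element pivot chosen manually, then two for-comprehensions.
  // Reconstructed from the description, defect included: duplicates of
  // the pivot value are silently dropped.
  def sort[T](theList: List[T])(implicit ord: Ordering[T]): List[T] = {
    import ord._
    theList match {
      case Nil => Nil
      case list =>
        val pivot   = list(list.length / 2)
        val smaller = for (x <- list if x < pivot) yield x
        val larger  = for (x <- list if x > pivot) yield x
        sort(smaller) ::: pivot :: sort(larger)
    }
  }

  def main(args: Array[String]): Unit =
    // The duplicate 1 disappears from the output:
    println(sort(List(3, 1, 2, 1)))  // List(1, 2, 3)
}
```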

Our last version (for now) tackles the duplicate issue and is also a bit more efficient.  We go back to using partition to generate our two lists, but now we post-process the second list.  We call partition again on the second list (which contains the items not less than the pivot), splitting it into the items equal to the pivot and the items greater than it.
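A sketch of that final double-partition version (reconstructed; I've used the head as pivot since the post doesn't say which pivot choice this version keeps):

```scala
object FinalQuickSort {
  // Partition twice: first around the pivot, then split the not-less side
  // into items equal to the pivot and items greater, so duplicates survive.
  def sort[T](list: List[T])(implicit ord: Ordering[T]): List[T] = {
    import ord._
    list match {
      case Nil => Nil
      case pivot :: rest =>
        val (before, notLess) = rest partition (_ < pivot)
        val (equal, after)    = notLess partition (_ == pivot)
        sort(before) ::: pivot :: equal ::: sort(after)
    }
  }

  def main(args: Array[String]): Unit =
    println(sort(List(3, 1, 2, 1, 3)))  // List(1, 1, 2, 3, 3)
}
```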

Summary: this article is not meant to create the best possible Scala implementation of quick sort, but to give you a vehicle for playing with Scala's list manipulation functions.

Friday, January 27, 2012

Scala version of Log4JFugue, plus revamped website

Over the recent holidays I decided to create a Scala version of my Log4JFugue open source music project. I was able to reduce the 2500 lines of Java to just over 250 lines of Scala!

Check out the new website as well as the Scala sources.

I'll be giving a talk on the experience of converting a small Java project to Scala at the March meeting of New England Java User's Group.