Tuesday, March 26, 2013

Is It Coding in Scala or Coding With Courage And Humility That Matters?


Scala is all the rage these days, replacing its legacy cousin Java in the hearts and minds of all the cool kids. Admitting that you still code in Java is the geek’s equivalent of saying you drive a minivan and don’t have a Twitter handle. We’ve all heard the proclamations of how much more powerful and succinct Scala is, and I’ll admit to having made some of them myself.

While not dismissing those statements,  because I do in fact find Scala to be a better language,  I think there is another cluster of factors at work here as well.  Those factors are courage and humility.

There is a correlation between Java and large, often distributed teams. This may be due in large part to the relative age of the language. It is established and “mature”, to use a kind word. If you have a 50-100 person development shop with offices in the US as well as in some subset of China, India, and Russia, you are very likely to be a Java shop. If you have maintenance or sustaining teams as well as a development team, you are likely to be a Java shop. If you spend a fair bit of your time on “process”, you are likely to be a Java shop.

Notice that I said “likely”. There are lots of counterexamples. My own company uses Java and we have only three programmers. Nonetheless, I think the preceding statements are generally true.

So what? Well, my assertion is that along with large, process-oriented, and distributed teams comes the notion of lowest-common-denominator coding. As the size and geographic distribution of your team grow, so does concern about “those other programmers” being able to understand and maintain your code. That leads us to want to standardize and simplify the code. We want to make the code clear to that possibly junior coder who may be new to the project, who may never have absorbed the designs, and who may not be familiar with the code base.

This leads us to write code like:

public long calculateTheValue(long someInput, long someOtherInput) {
    long intermediateOne = someInput * getSomethingElse();
    intermediateOne += someOtherInput;
    intermediateOne = someMethod(intermediateOne);
    ….
    return intermediateOne;
}

There is nothing wrong with this code, and to a newbie it’s certainly more accessible than:

public long calculateTheValue(long someInput, long someOtherInput) {
    return somethingElse(someInput, someMethod(someOtherInput));
}

The problem is that 10 lines of code versus 3 for every method in your system results in a sea of code where you literally can’t see the forest for the trees. I can tell what each individual line of code does, but I have no idea why, because I can never see more than 0.01% of the code on my screen at a time.

Some readers might protest at this point that this is all just formatting, and that Eclipse or Emacs could in principle convert between these two representations. To which I say: not so much.

The functional approach is all about the composition of a method from a collection of existing functions. In this approach it is clear that the new method is “just” using the existing methods. The new method has no logic per se other than using the output of other, presumably well-named and tested, functions.
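
To make that concrete, here is a minimal sketch of composition in plain Java. Every name in it (Pricing, subtotal, applyDiscount, total) is hypothetical and invented for illustration; the point is only that the composite method has no logic of its own and no temporary variables.

```java
// Hypothetical example: all names here are invented for illustration.
final class Pricing {
    private Pricing() {}

    // Each helper does exactly one thing and has no side effects.
    static long subtotal(long unitPriceCents, long quantity) {
        return unitPriceCents * quantity;
    }

    static long applyDiscount(long amountCents, long percentOff) {
        return amountCents - amountCents * percentOff / 100;
    }

    // The composite method only wires the helpers together.
    static long total(long unitPriceCents, long quantity, long percentOff) {
        return applyDiscount(subtotal(unitPriceCents, quantity), percentOff);
    }
}
```

Calling Pricing.total(250, 4, 10) takes 10% off a subtotal of 1000 and returns 900. If that result ever looks wrong, there are only two small, separately testable helpers to suspect.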

In the more familiar Java approach, each method is a new creation made out of whole cloth. It might do any old thing it wants. In this case freedom and creativity are to be considered bad things. Each method must be examined line by line to see what it might be doing. Let’s look at a bit of open source code I’m currently debugging:

int to = readTimeout - clientCnxnSocket.getIdleRecv();
int timeToNextPing = (readTimeout / 2) - clientCnxnSocket.getIdleSend();
if (timeToNextPing <= 0) {
    sendPing();
    clientCnxnSocket.updateLastSend();
} else {
    if (timeToNextPing < to)
        to = timeToNextPing;
}
clientCnxnSocket.doTransport(to);

After some period of study we can see that this code has two variables related to timeouts: “to” and “readTimeout”. Based on the results of two “getIdle” calls we might send a ping; we then mutate “to” in a couple of possible ways and use it as a parameter to a socket call. Further investigation reveals that “to” is the length of time the socket method may spend in a blocking “select” call. Thus, “to” is related to how long we can block before sending another ping.

I’ve spent the last couple of hours trying to track down a bug in this system, and the problem is that every single line is ontologically at the same level. By that I mean that any of them could have or be a side effect, any could do something other than what’s expected, and the gestalt of what this code fragment intends can only be gleaned by close study.

I’ll assert that the following code does not suffer from those flaws:

if (timeToPing()) sendPing();
clientCnxnSocket.doTransport(safeTimeToWaitForRead());

And that brings us to the humility side of the equation. It’s OK to write little one-line functions that just do the one thing that their name implies. A function like timeToPing is not one you will put on your resume. You will not proudly show it to your coworkers. You will not tell your husband/wife/partner about the amazing bit of code you wrote today. This one-line function will sit there quietly, unnoticed…working.

If we have the humility to write simple functions and the courage to combine them into composite functions without extraneous scaffolding and temporary mutable variables, then we have a chance to achieve greatness…even in a legacy language.
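
For what it’s worth, here is one way those humble little functions might look. This is a sketch under assumptions, not the project’s actual code: PingTiming is a hypothetical class that takes a snapshot of readTimeout and the two idle values, and the helper names other than timeToPing and safeTimeToWaitForRead are invented here.

```java
// Hypothetical sketch: PingTiming and its field names are invented for
// illustration and are not part of the code being debugged.
final class PingTiming {
    private final int readTimeout;
    private final int idleRecv; // the value getIdleRecv() returned
    private final int idleSend; // the value getIdleSend() returned

    PingTiming(int readTimeout, int idleRecv, int idleSend) {
        this.readTimeout = readTimeout;
        this.idleRecv = idleRecv;
        this.idleSend = idleSend;
    }

    // Time remaining before the read timeout expires ("to" in the original).
    int timeUntilReadTimeout() {
        return readTimeout - idleRecv;
    }

    // Time remaining before the next ping is due ("timeToNextPing" in the original).
    int timeUntilNextPing() {
        return (readTimeout / 2) - idleSend;
    }

    boolean timeToPing() {
        return timeUntilNextPing() <= 0;
    }

    // Mirrors the original branches: when a ping is due the wait stays at "to";
    // otherwise it is capped at the time until the next ping.
    int safeTimeToWaitForRead() {
        return timeToPing()
                ? timeUntilReadTimeout()
                : Math.min(timeUntilNextPing(), timeUntilReadTimeout());
    }
}
```

With a readTimeout of 10000, an idleRecv of 1000 and an idleSend of 2000, no ping is due and the safe wait is 3000. One caveat: the sketch works from a single snapshot of the idle values. If sendPing() resets the send idle time before the wait is computed, the result can differ from the original fragment, so the order of the two calls still deserves a moment’s thought.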

To be sure, there are things that are trivial in Scala that simply cannot be done in Java. I know of no way to annotate a method to indicate that it does or doesn’t ever return null. We recently changed such a method to never return null. There is, however, no way to find all the code that’s now unnecessary. Or, assuming we had made the opposite change, to find the code that was now an NPE time bomb. [1]

In Scala, of course, if your function might return a Foo but might return nothing, you return an Option[Foo]…and the function’s callers must deal with the Option[Foo] or they will not compile. This isn’t fixed in Java 7, nor will it be fixed in Java 8, 9 or 10. Null is just baked into the language.
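
To illustrate the idea for readers who haven’t used Scala, here is a rough approximation in Java. Maybe is a hypothetical, hand-rolled type loosely modelled on Scala’s Option, and UserDirectory is invented for the example; the sketch also assumes a Java version with lambda expressions. Callers cannot reach the value without saying what happens when it is absent, but nothing stops a Maybe reference itself from being null, which is exactly the problem with null being baked in.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical Maybe<T>, loosely modelled on Scala's Option.
abstract class Maybe<T> {
    private Maybe() {}

    // The only way to reach the value is to handle both cases.
    abstract <R> R fold(Supplier<R> ifEmpty, Function<T, R> ifPresent);

    static <T> Maybe<T> some(final T value) {
        return new Maybe<T>() {
            @Override
            <R> R fold(Supplier<R> ifEmpty, Function<T, R> ifPresent) {
                return ifPresent.apply(value);
            }
        };
    }

    static <T> Maybe<T> none() {
        return new Maybe<T>() {
            @Override
            <R> R fold(Supplier<R> ifEmpty, Function<T, R> ifPresent) {
                return ifEmpty.get();
            }
        };
    }
}

// Hypothetical caller: the return type announces that the email may be missing.
final class UserDirectory {
    private UserDirectory() {}

    static Maybe<String> emailFor(String user) {
        return "alice".equals(user)
                ? Maybe.some("alice@example.com")
                : Maybe.<String>none();
    }

    static String describe(String user) {
        return emailFor(user).fold(
                () -> "no email on file",
                email -> "email: " + email);
    }
}
```

UserDirectory.describe("alice") returns "email: alice@example.com", and describe("bob") returns "no email on file". Changing emailFor to return a plain String would force every caller of fold to be revisited by the compiler, which is the kind of help the null-returning version can’t give you.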

Java isn’t going to be “replaced” by any of the newer languages. It will continue to lose market share but will command a large segment of the market for the foreseeable future. It’s also clear that Java will continue to evolve and will over time gain missing features such as lambda expressions and better package structure. Other things like null and the generics system are likely to be with us to the bitter end. For good or bad, type erasure and generics are part of the language now and forever. Java 7 sprinkles a bit of syntactic sugar allowing the second repeat of the type to be omitted, as in:
HashMap<String, Integer> myHashMap = new HashMap<>();
but that’s a fairly trivial improvement in this age of languages with modern type inference.

What this means is that engineers working with Java need to do the best they can with a 15-year-old language. Courage and humility can help with that.



[1] Yes, we could use PMD but that just points out that there is no support for such things in the language itself.