Before I begin, I must make a clarification. This is an article about why I am not a Java programmer, one person's reasons and thoughts. It's not about why you are not a Java programmer and it's not about why Perl is better than Java. It is also not about why Java is the way it is; I (mostly) understand Sun's design decisions in doing things the way they've done them. But knowing why something is the way it is doesn't make it any less irritating. [1]
Please keep this firmly in mind while reading.
Recently, I moved back to Pittsburgh after a stay in New York City for a few years. [2] In NYC, Perl jobs are plentiful and Perl is fairly well respected as a language to get things done. In Pittsburgh, ``Perl'' is a four-letter word while ``Java'' is not. I'm reminded of this all the time. Since I've been gone, most of my programming friends have switched to Java, even the people who taught me Perl. ``When are you going to learn a real language?'' I'm asked, and it's long past being funny. I've gotten tired of parrying the same old critiques.
All this crystallized for me when a friend of mine, during yet another Java vs Perl argument, asked me ``What is Java missing?'' This was not really my thinking at the time; I'm not as concerned with what a language has as with how it goes about it. Few modern languages are ``missing'' any really critical features; it's more a matter of how easily they can be expressed. However, the question did get my brain to start organizing my objections.
I'll try to avoid incidental things. I won't complain about the way Java looks or refer to the mythical ``typical Java programmer'', just as I'd expect not to hear about Perl's ``line noise'' and stupid things your CGI programmer friend did. I'm not going to get into closed source vs free software, or which one is faster, or whether one supports some protocol that the other doesn't... all this could change tomorrow.
Instead, I'm going to stick to purely technical arguments about design decisions in the language. I feel these are fundamental design decisions which aren't going to change anytime soon in Java, that I can't easily work around and would really get in my way. [3]
Without further ado, some reasons why I program in Perl instead of Java.
Yes, I dredged up this old argument. Before your eyes glaze over, hear me out. This isn't the cheesy, old argument about which one is shorter.
Here's ``Hello World'' in Perl.
print "Hello, world\n";
One line, one function, one string. What does it do? It prints ``Hello, World''. Explaining it is almost as simple as writing it. The trickiest part is possibly the newline.
And here in Java.
public class HelloWorld {
    public static void main (String[] args) {
        System.out.println("Hello, world!");
    }
}
This most basic of all programs drags in the concepts of class, privacy, types, methods, the magic ``main'' method, the String class, arrays, class methods vs object methods and chained method calls. And on top of all that, it has to be saved in a file called ``HelloWorld.java''. One must know all this (or cut & paste it) just to get moving.
This makes it difficult to teach Java to a first-time programmer. Just to get off the ground, you've got to get past the whole OO philosophy. What's a class? What's a method? What's privacy? Why does it need to be called main? What's String[] mean? What's void? Why is it System.out.println? All this incidental crap is exposed to the student, someone who's still struggling with the idea of a text editor (``What do you mean I can't program in Word??'') You can, of course, hand-wave all this, but that doesn't settle well with me at all. I prefer to use as little magic as possible when teaching.
I recently had the pleasure (?) of teaching Perl to a 14-year-old. I started with ``Hello, World'' ('This is how you print something and run a program'), moved onto conditional logic ('This is how you print something if something else happens') and then to loops ('This is how you print something a bunch of times'), etc. Each lesson contained only one or two new concepts. Each concept produced a concrete result. Each new lesson built off the last. All with a minimum of hand-waving.
Nitpicking? Possibly, but it is a symptom of deeper problems. To do anything in Java, even simple things, you've got to roll out all these conceptual cannons. You can't do one-liners in Java (assuming a sufficiently short line). The upshot is that if you know Java, you'll have to learn another language for quick tasks. For some, this isn't a problem, but I'm lazy. I like having one language which handles the vast majority of my daily tasks.
OO is currently in vogue. Someday it will not be. Can Java adapt? I know Perl can; it demonstrated its flexibility when it made the leap from procedural to object-oriented programming six years ago.
Because Java is selling you a philosophy of programming, anything which does not fit into that philosophy rapidly becomes awkward. This is typified by ``Hello, World''. It is not a data-centric task, therefore the OO solution is clunky.
There's nothing you can't do with objects. There's nothing you can't do with any Turing Complete language. But that's not the point. The point is how easily, elegantly and maintainably can you do it? When you try to shoehorn every programming task into one style, things get ugly. I'd rather have a tool-box full of lots of different tools than one with 57 kinds of hammers.
And what of other styles? Procedural programming has not breathed its last, and functional programming is just coming into its own, just to mention two alternatives. I've recently started playing with object-inheritance in Perl and find it useful. Can strict, class-based OO languages be adapted to new styles or will they become the COBOLers of the future? I know Perl can adapt.
CPAN is half the power of Perl (along with five or six other things). It is a single repository of Free [4], high-quality, well-documented and well-tested software (for the most part) with a common test and installation suite. Needing code to do X, finding it in one place, knowing it's probably going to be something more than Joe Hacker's little one-off (not to say CPAN doesn't have its share of those) and not having to worry about license fees can slash development times drastically.
Java, like almost every other language, lacks this vital, centralized resource. [5] There will typically be several equally popular repositories, varying in quality and openness (e.g. Microsoft might provide a lot of Java classes, but what's the license like?) supplemented by smaller repositories and individuals. What you need is out there, but how hard is it to find it, know you've found it when you see it and use it once you've got it? And how much will you have to pay for it?
Not having function pointers in an all-OO language isn't in itself so bad, since a similar effect can be had by passing in the name of a method along with a class or object. The real problem is that without function pointers you cannot have closures. No closures makes lazy evaluation, and other parts of functional programming, difficult.
Some of what closures do can be simulated with inner and anonymous classes, but their conciseness is lost.
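As a rough illustration of how closures make lazy evaluation cheap to express, here is a minimal Perl sketch. The helper name lazy() and the sample computation are invented for this example; the point is only that the work is wrapped in a subroutine reference and runs, at most once, when someone finally asks for the value.

```perl
use strict;
use warnings;

# lazy() is a hypothetical helper: it wraps a computation in a closure
# and runs that computation at most once, on first use.
sub lazy {
    my ($compute) = @_;
    my ($done, $value);
    return sub {
        unless ($done) {
            $value = $compute->();
            $done  = 1;
        }
        return $value;
    };
}

my $calls = 0;
my $total = lazy(sub {
    $calls++;                      # count how often the work really runs
    my $sum = 0;
    $sum += $_ for 1 .. 1000;
    return $sum;
});

# Nothing has been computed yet; the first call below does the work,
# the second call reuses the remembered result.
print "calls before use: $calls\n";
print "total: ", $total->(), "\n";
print "total again: ", $total->(), "\n";
print "calls after use: $calls\n";
```

The closure remembers both the pending computation and its cached result, so no class or extra data structure is needed.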
A closure can be thought of as OO inside out. Instead of data with code attached, it's code with data attached. Is this OO? Sure! In fact, several of Perl's OO systems have been implemented using closures [6]. It's a different approach to the same problem, and it's good to have lots of tools in your tool-box.
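A small sketch of ``code with data attached'', assuming nothing beyond core Perl. The name make_counter() is made up for this example; each subroutine it returns carries its own private copy of $count, much as an object carries its own fields.

```perl
use strict;
use warnings;

# make_counter() is a hypothetical name. Every call returns a new
# subroutine with its own private $count attached to it.
sub make_counter {
    my ($start) = @_;
    my $count = $start;
    return sub { return $count++ };
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

# The two counters advance independently of each other.
my @seen = ($from_one->(), $from_one->(), $from_ten->(), $from_one->());
print "@seen\n";
```

Nothing outside the returned subroutine can reach $count, which is the same trick the privacy footnote alludes to.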
99% of the time, mucking with the symbol table is wrong. And most programmers will get by just fine without ever even being aware of its existence. The Devil is in that last 1%. When, damn-it, you just really need to generate a whole bunch of variables and to do it any other way would require hours more work.
This is what I call a 1% feature. 99% of the time, it's not needed by 99% of the people, but for that last 1% it can really make life a lot easier.
Often, symbol table spelunking allows very surprising code libraries with interesting and unexpected effects. Language designers can't anticipate everything, and it's often good to be able to take a language in a totally unexpected direction. Symbol table manipulation is one feature that allows this sort of hackery.
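For readers who have never seen it, this is roughly what symbol table manipulation looks like in Perl. It is a minimal sketch; the tag names are arbitrary examples and the generated functions exist only for illustration. A closure is assigned to a typeglob, which installs it as a named function in the current package.

```perl
use strict;
use warnings;

# Generate one tag-wrapping function per name by assigning a closure
# to a typeglob. The tag names are arbitrary examples.
for my $tag (qw(b i code)) {
    no strict 'refs';    # symbolic references are required for *{"..."}
    *{ __PACKAGE__ . "::$tag" } = sub { return "<$tag>@_</$tag>" };
}

# b(), i() and code() now exist as ordinary functions.
my $html = join " ", b("bold"), i("italic"), code("print 42");
print "$html\n";
```

Adding a fourth function means adding one word to the list, not writing another subroutine.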
Of course, this feature is very often abused; most of the time things are much better accomplished by better data structures. However, abuse by novices does not mean it should be denied to everyone. The impact of such odd features can be lessened by properly structuring documentation and books to deemphasize such 1% features. A Perl tutorial shouldn't even bring up symbol table hacking (alas, many do).
Is symbol table hacking nasty? Yes. Is it messy? Yes. Should it be avoided when possible? Yes. But remember, any sufficiently encapsulated hack is no longer a hack. [7]
Much more tragic for an OO language is that a closed symbol table means...
Every OO programmer reaches the point in their understanding when they grok encapsulation. I mean truly realize the benefits of wrapping everything up in methods. The class is thy sword, and the accessor method is thy shield and micro-optimizations be damned!
Roughly five minutes later you never want to see another accessor again.
Computers are basically well-trained monkeys, and your typical accessor is monkey code. getName() returns a name. setName() sets the name. getAddress() returns an address. setAddress() sets an address. Repeat until carpal tunnel. Let the computer handle this busy-work for you! Not only is it lazier, but it's easier to maintain. Basic laziness tells us that if you have twenty methods which all basically do the same thing, you should consolidate them into one method. A single point of change.
Perl has lots and lots of ways to do this. Class::Struct (distributed with Perl) and Class::Class to name two. This is so important that I wrote one myself, Class::Accessor. 60 lines of code, 5 methods. Fairly simple both inside and out.
All rely on one or more of function references, eval and symbol table hacking to generate methods on the fly. Alas, there's no easy way to do any of this in Java.
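To show the general technique (not the actual source of Class::Accessor or any other module), here is a minimal sketch of generating accessors on the fly with closures and the symbol table. The Person class and this stand-in mk_accessors() are written purely for this example.

```perl
use strict;
use warnings;

package Person;

sub new {
    my ($class, %fields) = @_;
    my $self = { %fields };
    return bless $self, $class;
}

# mk_accessors() here is a stand-in written for this sketch. Each field
# gets a combined getter/setter installed into the calling class.
sub mk_accessors {
    my ($class, @fields) = @_;
    for my $field (@fields) {
        no strict 'refs';    # needed to write into the symbol table
        *{"${class}::$field"} = sub {
            my $self = shift;
            $self->{$field} = shift if @_;
            return $self->{$field};
        };
    }
}

# One line declares every accessor; there is one place to change them.
Person->mk_accessors(qw(name address));

package main;

my $p = Person->new(name => 'Ada');
$p->address('32 Yarrow Way');
print $p->name, " lives at ", $p->address, "\n";
```

Because each generated accessor is a real method, a subclass can still override just name() or just address() in the usual way.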
Ah ha! What about writing generic get() and set() methods?
obj.get("Name");
obj.set("Address", "32 Yarrow Way");
The problem with this approach is three-fold.
First, inheritance is defeated. If I wish to alter the way obj.getName() works, I can simply override it. To change how obj.get("Name") behaves, get() must be overridden, and with that the behavior of every other parameter must be taken into account. Encapsulation is blown.
Second, you tend to wind up with a big hairy case statement. Polymorphism helps break it down somewhat, but you still wind up handling multiple fields in a single method.
Third, there's the issue of how to get the types right. You wind up abusing polymorphism: each group of types must be handled by a separate polymorphic method, scattering the accessor code around and defeating the purpose of writing ``single'' get() and set() methods.
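The first objection is language-independent, so it can be sketched in Perl as well. The Record and ShoutingRecord classes below are invented for illustration. Overriding a dedicated accessor touches one field; overriding the generic getter means special-casing that field by hand and passing everything else through.

```perl
use strict;
use warnings;

package Record;

sub new {
    my ($class, %fields) = @_;
    my $self = { %fields };
    return bless $self, $class;
}

# One generic getter for every field.
sub get { my ($self, $field) = @_; return $self->{$field} }

# One dedicated accessor for a single field.
sub name { my ($self) = @_; return $self->{Name} }

package ShoutingRecord;
our @ISA = ('Record');

# Overriding the dedicated accessor affects exactly one field.
sub name { my ($self) = @_; return uc( $self->SUPER::name() ) }

# Overriding the generic getter requires checking which field was asked
# for and remembering to leave all the others alone.
sub get {
    my ($self, $field) = @_;
    my $value = $self->SUPER::get($field);
    return $field eq 'Name' ? uc($value) : $value;
}

package main;

my $r = ShoutingRecord->new(Name => 'Ada', Address => '32 Yarrow Way');
print $r->name, "\n";
print $r->get('Name'), " / ", $r->get('Address'), "\n";
```

With two fields the conditional is tolerable; with twenty, every subclass ends up carrying its own copy of the dispatch logic.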
What about using editor macros to generate the accessor code? Problem there is the next person to maintain your code won't have your editor or your macros. Count on it. [8] More damning is that editor macros are, at best, cut and paste, paste, paste coding and have all the maintenance problems of that style. The guy after you is left dealing with 31 repetitive methods, and what if #13 is different? It will be missed.
Preprocessors. C uses a preprocessor to get its work done, why not Java? When C came about, preprocessors were a good idea. Then again, so were paper tape readers and pet rocks. The problems with preprocessors are so well documented that it's probably redundant to go over them here, but history is often forgotten. They make debugging a pain in the ass [9], since your source code doesn't match the running code. Since preprocessors don't grok the language, you have to be very careful about how they are written to avoid strange behavior. Anyone familiar with C remembers having to wrap macros in protective layers of parentheses.
eval is another one of those 1% features. The ability to add new code to a program while it's running is a colossal amount of power. I actually find eval to be a messy solution and can usually find a better way, but the better ways usually involve other techniques Java can't use (closures and symbol table mucking).
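A minimal sketch of what ``adding new code while the program is running'' means in practice, using Perl's string eval. The operator table and the names in it are made up for this example; the source text of each function is assembled as a string at run time and then compiled.

```perl
use strict;
use warnings;

# A made-up table mapping function names to the operator each should use.
my %op_for = (add => '+', multiply => '*');

my %function;
for my $name (sort keys %op_for) {
    # Build the source code of an anonymous subroutine as plain text...
    my $source = "sub { my (\$x, \$y) = \@_; return \$x $op_for{$name} \$y }";

    # ...and compile it while the program is running.
    $function{$name} = eval $source
        or die "could not compile '$name': $@";
}

print "add: ", $function{add}->(3, 4), "\n";
print "multiply: ", $function{multiply}->(3, 4), "\n";
```

This particular case could be handled with closures alone, which is the ``better way'' alluded to above; string eval earns its keep when the code itself is not known until run time.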
Unfortunately, a free form eval in Java would violate all sorts of compile-time checks, so it's not going to happen. You can create and load new classes on the fly (simply write a source file to disk and call javac) but it's not quite the same as being able to alter an existing class to fit. And what are the odds the user of your program has a Java compiler installed? Pretty slim.
It's odd that an interpreted language (yes, Java is interpreted. A virtual machine is just a fancy name for an interpreter) would fail to have some way of eval'ing new code. [10] With no eval, the whole idea of dynamically-generated and self-altering code becomes very difficult, if not impossible.
A lot of people knock multiple inheritance, but there's really nothing wrong with it unless you're the one who has to design a language around it. Personally, I use MI extensively in my class hierarchies, fully aware of what I'm getting myself into. If you don't like MI, don't use it. But don't tell me I can't use it.
Java's answer to MI is interfaces. Interfaces are not multiple inheritance; an interface is really just a very strict virtual class. They address the first reason for inheritance, common interface, but completely miss the second, code reuse. Interfaces force you to reimplement the interface, for every class! What a bunch of busy work!
Aggregation and delegation are other workarounds, but they both involve writing wrapper methods. A situation further exacerbated by Java's lack of dynamic method generation.
And, let's not forget, multiple inheritance still allows single inheritance if that's the way you like to do things.
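For comparison, a minimal sketch of multiple inheritance used for code reuse in Perl. The Logger, Serializer and Account classes are hypothetical. Account writes neither log_line() nor to_string(); it inherits working implementations of both.

```perl
use strict;
use warnings;

package Logger;

sub log_line {
    my ($self, $msg) = @_;
    return ref($self) . ": $msg";
}

package Serializer;

sub to_string {
    my ($self) = @_;
    my @pairs;
    push @pairs, "$_=$self->{$_}" for sort keys %$self;
    return join ", ", @pairs;
}

package Account;
our @ISA = ('Logger', 'Serializer');    # two parents, both contribute code

sub new {
    my ($class, %fields) = @_;
    my $self = { %fields };
    return bless $self, $class;
}

package main;

my $acct = Account->new(owner => 'Ada', balance => 10);
print $acct->log_line("opened"), "\n";
print $acct->to_string, "\n";
```

With interfaces alone, Account would have to supply its own bodies for both methods or write wrappers that delegate to helper objects.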
Sometimes you need to have big blocks of text in your code. This comes up very often with HTML, XML and SQL:
$sql = <<'SQL';
INSERT
INTO Stuff
 (This, That, Other_Thing)
VALUES (?, ?, ? )
SQL
It's clear, it's formatted, it's unencumbered by quotes, appending characters, etc. It means I can just paste text into my program. Contrast this with how it's done in Java: [13]
String sql =
    "INSERT\n" +
    "INTO Stuff\n" +
    " (This, That, Other_Thing)\n" +
    "VALUES (?, ?, ? )\n";
I think that speaks for itself. Wow. [14]
I started programming by learning C++. I got up to the part about public, private, protected, friend, etc... got very annoyed by the unnecessary bureaucracy of it all and dropped the language, switched to Perl. If I can't trust the programmers around me not to muck around in my guts without good reason, I can't trust them at all. And if they're willing to perform that bad practice, they'll probably do more anyway. It's not worth worrying about.
This is not to say there aren't times I'd like to enforce privacy, just don't make me have to do it all the time. [15] To paraphrase Doug Gwyn on Unix, Perl was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.
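As a minimal sketch of opt-in privacy, here is one common closure-based idiom for a private method. The Vault class is invented for this example, and only the method is hidden; the object's hash data is still reachable by anyone determined enough.

```perl
use strict;
use warnings;

{
    package Vault;

    # A lexical code reference exists only inside this block, so no
    # code outside it can call this as a method.
    my $checksum = sub {
        my ($self) = @_;
        return length $self->{secret};
    };

    sub new {
        my ($class, $secret) = @_;
        my $self = { secret => $secret };
        return bless $self, $class;
    }

    # Public method that uses the private one.
    sub report {
        my ($self) = @_;
        return "checksum " . $self->$checksum();
    }
}

my $vault = Vault->new('swordfish');
print $vault->report, "\n";
print "public checksum method? ", (Vault->can('checksum') ? "yes" : "no"), "\n";
```

Privacy is applied to the one routine that needs it, and nothing else in the class pays for it.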
Right up there with the privacy rules comes the compulsion to enforce strict type checking. Don't get me wrong, types are nice if you're into that sort of thing, but most times I find it just another set of unnecessary hoops to jump through. Sometimes types are appropriate, sometimes they're not. Let me choose.
David Nicol, in a recent perl6-language@perl.org thread, put it nicely, ``The draw of the PMAW [Perl's Magical Autoconverting Wondervariable] is why we're all here''. We sling data all over the place in Perl programs hardly worrying about it, it's wonderful! Many people cringe at this sort of free-wheeling approach to data integrity, but in my experience it's not a problem.
People complain about the dizzying variety of interfaces to Perl modules and functions, but we are saved from one thing: type casting. You don't have to worry about what special String subclass it requires, or that instead of taking a normal list it requires you use a specially crafted ParameterList object. No matter how weird the interface, it ultimately boils down to scalars, arrays and hashes. [16]
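A tiny sketch of the autoconverting scalar at work, with made-up values. The same variable is treated as a string or a number depending on how it is used, with no casts or wrapper classes in sight.

```perl
use strict;
use warnings;

my $from_input = "42";                # a string, as if read from a file
my $sum        = $from_input + 8;     # used as a number
my $label      = $sum . " items";     # used as a string again

# A list can mix numeric and string forms of numbers freely.
my @mixed = (1, "2", 3.0);
my $total = 0;
$total += $_ for @mixed;

print "$label, total $total\n";
```

The conversions follow the operator being applied, so + always means numeric addition and . always means string concatenation.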
If you like types, use them. I'll even admit to wanting a strong typing system in Perl. But don't make me use them for everything.
Perl is a concise language, designed to make it quick and easy to turn thought into code. Java is a syntactically simple and consistent language, designed to encourage good style [17] and be easy to embed. Each has its strengths and weaknesses, but people rarely agree on which is which. People are funny that way. Some want to save you from yourself by restricting you. Some want to let you be yourself by removing as many restrictions as possible. Both are fraught with peril. I happen to like the latter peril better.
Bjarne Stroustrup has this to say:
"The connection between the language in which we think/program and the problems and solutions we can imagine is very close. For this reason restricting language features with the intent of eliminating programmer errors is at best dangerous."
But I think Larry Wall sums it up best: [18]
"The very fact that it's possible to write messy programs in Perl is also what makes it possible to write programs that are cleaner in Perl than they could ever be in a language that attempts to enforce cleanliness. The potential for greater good goes right along with the potential for greater evil."
Thanks to the folks on comp.lang.java.help for enduring my heresy and putting my head straight about a few things.
[1] I've found the first reaction most people have to this article is immediately defending Java's design. While it's interesting to know why, it doesn't make it any easier to write code.
[2] Since originally writing this I've bounced back to New York.
[3] Along the same path, JWZ has written a rant about Java <http://www.jwz.org/doc/java.html> picking apart annoying details that could only come from someone intimately familiar with the language.
[4] That's no-cost, open source, and unencumbered. Not this Community License ``we'll grant you the privilege to peek at our code'' nonsense.
[5] Java sort of gets around this by distributing heaps more classes in its JDK than Perl does (maybe not for long at the rate Jarkko is going), but you can only push this so far. As Adam Turoff recently lamented on advocacy@perl.org, the Java2 SDK tarball is pushing 20 megs, making Perl 5.7.1 look positively anemic at 6.7.
[6] Class::Accessor and Class::Data::Inheritable to name just two. Damian Conway goes into gut-blowing contortions about closures and OO in his book ``Object-Oriented Perl''.
[7] From the user's perspective, anyway. But that's what encapsulation is all about, perspective.
[8] I've had more than a few ``code building IDEs'' suggested to solve this, effectively glorified editor macros. One, StructureBuilder <http://www.webgain.com/products/structure_builder>, costs a mere $1000 and works only on Windows.
[9] Trust me, I know. The perl source code is mostly macros.
[10] There are rumors of a Java library in the works which gives you access to the compiler, but it has all the problems of calling javac.
[13] Two individuals showed me examples of how to do multi-line strings in Java. Both forgot the newlines. Perhaps anecdotal evidence that this sort of unnecessary complexity leads to bugs?
[14] In all fairness, you can use preprocessors to get a here-doc-ish behavior. One such widget is MLS, http://www.ddj.com/ftp/2001/2001_05/webapp.zip/code/wap/html/mls.html. But preprocessors, like filters in Perl, have their own set of problems.
[15] Contrary to popular belief, you can enforce privacy in Perl. Closures are a simple way to get private methods, and several modules provide privacy; Class::Contract in particular is air-tight!
[16] Yes, I'm oversimplifying.
[17] I'm being a gracious host here. ;)
[18] From ``Perl, The First Postmodern Computer Language''