Monthly Archives: September 2010

Another Look at Roo

OK, so I attended the Roo session today at JavaOne.  I got frustrated with it before and I’m more cynical than I started out, but that Ben Alex is just so darned earnest you can’t help but root for the guy.  Not to mention that it’s worth rooting for the tool to be successful just to make my life easier.

I was pretty harsh on Roo before.  I conceded that it still had promise, but I was very disappointed in the experience past the initial euphoria that comes with such rapid development.  The great thing about the tool (and others like it) is that it basically builds an application for you with very little effort.  The problem comes right after that.  So far in my experience, it happens without fail.

The application that it builds for you is not quite the application you want – just most of it.  That’s great, right? If it builds 80% of the application you only need to build the other 20%.  Unfortunately, it doesn’t quite work that way.  The 80% that’s built is full of stuff you don’t understand (because it was auto-generated). When I say hard to understand, I don’t just mean that the code is hard to read (it often is) or that it uses a bunch of techniques that are unfamiliar to mortal developers (that’s usually true, too).  It’s also that it is unclear how to extend it.

So Roo generates all this persistence stuff that I like, and I like the controller scaffolding, but not the jspx files.  Can I delete them? Will the tool complain? If the tool doesn’t complain, will something else complain when I try to start the server and they aren’t there? What files are safe to edit, what files should not be touched and what files can be edited with care? What about all those AspectJ files? It doesn’t look like the autogenerated test does anything, but I can see that it’s running 9 tests. What is it doing, exactly? I dunno – can’t see the code.

If you can’t figure that stuff out, you’re stuck with 80% of an application that can’t be finished so you’re pretty close to 0% of the way to a finished application.

I left with more hope than when I entered.  I got some clarification on some of those questions in my session today.  It is also perfectly valid to use Roo to build 80% of your application and then remove Roo.  Now you can edit whatever you like – it’s just Java.  You can’t remove Roo automatically, but you can do it manually without too much effort.  Now I’ve got a huge leg up on building my application.

It’s too late for my current project, but my next project will get going soon and I’m going to give Roo another day in court (the GWT stuff still might not be ready for prime time).  It’s cheap to try.


Groovy: To Infinity and Beyond

Attended a JavaOne session on Groovy today.  I was looking for an intro to Groovy but it was really more aimed at experienced Groovy developers – what’s new with 1.7, what’s coming in 1.8, etc.

Groovy is pretty cool, but it also has some of the same problems that all dynamic languages have.  It looks to me as if the whole point of Groovy is brevity.  This has great potential benefits to both efficiency and clarity.  However, the benefit to clarity is based on eliminating boilerplate stuff from Java that doesn’t actually add any clarity – removing clutter.  There are a lot of things in Groovy (it seems to me) that are really an enemy of clarity.

e.g.

def divide = {a, b -> a/b }
def halver = divide.rcurry(2)
assert halver(8) == 4

Clearly.


def plus2 = { it + 2 }
def times3 = { it * 3 }
def composed1 = plus2 << times3
assert composed1(3) == 11
assert composed1(4) == plus2(times3(4))

The result of these asserts should be intuitively obvious to the most casual observer.
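
For anyone who would rather check the arithmetic than take it on faith, here is a rough sketch of what those two snippets appear to do, spelled out in plain Java.  It assumes Java 8 or later (for java.util.function), and the class and constant names are mine, not anything from the session.  The two things worth noticing are that rcurry(2) pins the right-hand argument of divide, and that plus2 << times3 runs times3 first and plus2 second.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical class name; a hand-written equivalent of the Groovy closures above.
public class ClosureSketch {
    // def divide = { a, b -> a / b }
    // (integer division here, which is fine for 8 / 2)
    static final BiFunction<Integer, Integer, Integer> DIVIDE = (a, b) -> a / b;

    // def halver = divide.rcurry(2): fix the RIGHT-most argument to 2
    static final Function<Integer, Integer> HALVER = a -> DIVIDE.apply(a, 2);

    // def plus2 = { it + 2 } and def times3 = { it * 3 }
    static final Function<Integer, Integer> PLUS2 = x -> x + 2;
    static final Function<Integer, Integer> TIMES3 = x -> x * 3;

    // def composed1 = plus2 << times3: apply times3 first, then plus2
    static final Function<Integer, Integer> COMPOSED1 = PLUS2.compose(TIMES3);

    public static void main(String[] args) {
        System.out.println(HALVER.apply(8));     // 8 / 2 = 4
        System.out.println(COMPOSED1.apply(3));  // (3 * 3) + 2 = 11
        System.out.println(COMPOSED1.apply(4));  // (4 * 3) + 2 = 14
    }
}
```

It is a lot more typing than the Groovy, which is the trade-off in a nutshell: the long version says which argument is fixed and which function runs first, and the short version expects you to already know.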

Java in the Cloud

The first breakout session I attended at the JavaOne conference (or Unconference for this part) was a discussion on cloud computing.  Now this was just an interest group, not a speaker-led presentation.  The idea is that in the group of interested people, some would have experience in the topic and could learn from others with experience while those with little or no experience could benefit from the discussion by listening in or asking questions of the “experts”.

Unfortunately, there wasn’t much expertise on hand.  It seems that no one is doing much Java in the cloud – at least not among the 30 or so people dedicated enough to show up on Sunday for the discussion.  So was it fruitful anyway? Yes, and here’s why.

First was a discussion of what it even means.  We didn’t arrive at a formal consensus, but as the discussion went on, it was clear that we meant something like Amazon EC2 more than we meant something like SaaS.  There was a lot of talk on “private clouds.”  I have no problem with the concept of private clouds, but I think you are really talking about elastic virtual computing in general, and not necessarily some nebulous resources without boundaries out in the world.  It turns out the distinction is important to most people.  It may not be important whether you call it a “cloud” or not, but it matters a lot to people where these virtual computing resources live.  Curiously, our discussion of what constituted a cloud aligned very well with Ellison’s.  (This discussion was before Ellison’s welcome keynote and his sales pitch for the Exalogic “cloud in a box”.)

The main fruit of this fruitful conversation, though, was a discussion of why no one seemed to be doing the thing that everyone was talking about.  The answer was data.  There are two issues with data – performance and security/privacy.

Performance

Several people suggested that their DBAs would never put their databases on a VM at all, no matter where it lived.  These guys want total control over which piece of such-and-such table lives on which disk to tweak performance.  That’s nonsense if you have a VM with a virtual disk.  What’s the point of putting temporary tables on their own disk when all the drive space is shared anyway?

Personally, I suspect that this is an overblown concern.  Not all storage arrays are the same, but if you are using a high performance SAN, you are already abstracting away the disk.  A large SAN (with more than just a few disks) is not a viable solution for most non-shared infrastructure folks because it’s very expensive.  However, for a shared infrastructure (including a virtual one) the case for a SAN is obvious.  It provides great I/O and everything is, in a way, already on its own disk, so there’s no point in segmenting your data that way.

So, I suspect DBAs can get crusty and cling to what they know just like everybody else, and they are saying a VM would never work because they won’t be able to sleep at night without that total tweaking control.  I suspect that, but I’m not a DBA, so I have to admit that they know a lot more about it than I do.  I can suspect prejudice, but I am not really qualified to prove it.

The other issue is that I’ve used a shared SAN before.  It worked great for us in terms of I/O, but it was also supposed to provide super high availability because it was so massively redundant.  However, the particular shared infrastructure vendor I was using had multiple outages, each lasting several hours, that took down the whole SAN.  Again, I suspect incompetence, but I’m not qualified to say whether it was incompetence or whether SANs are inherently unstable.  Another possibility is that it was incompetence of a kind, but that SANs are so complicated that only super geniuses can implement them correctly.  That would be a weakness of SANs closely related to “inherently unstable” in my book.

Security/Privacy

The other main concern was handling of sensitive data.  There seemed to be some specific concerns about things like Social Security Numbers – that you couldn’t really store them in the cloud and be SOX compliant.  I can’t speak to that, but I don’t know why that concern would be different if you used a non-virtual database but had it provisioned by RackSpace or some other data-center service provider.

Tying it all Together

So the VM issue is an issue whether it’s public or private and the security issue is an issue whether it’s virtual or not.  Whether the objections are valid or paranoid, these are the perceptions people have.  Well, if you can’t put your data in the cloud, what’s the point?  There are very few interesting applications that don’t require a database.  There are some (a performance lab is a clear use case, as is a marketing web site without much “application” there, or some super-processing that is done in periodic batches) but until you can confidently put your data in the cloud, of course you’re not going to see a lot of serious applications in the cloud.

But somehow Amazon and Google do it.  There are a couple of important things that make them special.  EC2 can be thought of as a public cloud, but it’s private to Amazon.  Amazon can put its sensitive data somewhere else and no one but Amazon employees can put their hands on it.  Amazon and Google both have tons of data that is not the least bit sensitive.  (What would it even mean to “steal” Google’s data?)  However, they aren’t using RDBMSs for that stuff.  They are using various flavors of NoSQL.  That proves that it can work, but it’s useful to remind yourself: You’re not Google.