Tuesday, May 26, 2009

Offline Browsing in Firefox with Scrapbook

Got a long plane flight or car trip? Have several articles you've been wanting to read, or maybe need some online documentation that's a subset of a big site? The Scrapbook plugin for Firefox lets you access the latter while on the former. It is nearly perfect for assembling an offline reading list or a library of things you want to keep indefinitely. It has many options for capturing various parts of a page (selection, frame, etc.), which is handy if you just want to save the main content and not things like ads or navigation menus. I also like the markup features available from a toolbar that appears at the bottom of the window when viewing captured pages.

The online docs cover the full feature set, so I won't repeat those. I do have a few tips for specific use cases that I came across though:
  1. I wanted to capture an article that spanned multiple pages. In theory, ScrapBook has a filter capability that should be available in the In-depth Capture section of the Capture Detail dialog. I couldn't access that capability, so instead I would highlight the page links for the article (you know... the "1 2 3 Next" links that are usually at the top/bottom of a page) and then use the URL Detector in the Capture Multiple URLs option to capture all the links in the selection. I would then save those links to a new folder with the name of the article. It's not perfect since the pages are independent of each other (you can't navigate between them), but it's better than having to manually walk through all the pages and capture them.
  2. While reading captured pages when I didn't have network access, I would often come across links that I hadn't captured, but wanted to mark to grab later when I was back on the net. You can do this with the Bookmark with Scrapbook option and then use Capture Again to pull the content when you are back online. (Oops, correction: the Bookmark with Scrapbook option is not available from the context menu when you right-click on a link. I'll have to add that to the list of feature suggestions I posted for the author.)
On a related note - while looking for something that provided the capabilities of Scrapbook, I found some other interesting plugins to view the page cache (CacheViewer) and index pages you have visited so you can easily find where you read a certain factoid that you stumbled across, but can't remember where (Breadcrumbs). I'll see how much I end up using those before reviewing them.

Sunday, May 3, 2009

Useful Links for Java Memory Issues

Every once in a while I have some problem with memory usage in Java (usually related to either Eclipse or JBoss running out) and I have to remember various settings or tools for dealing with the problem. Here are some pages that I return to on occasion:

Heap memory and MaxPermSize interaction
JVMStat tool
Open Source Profilers
Random Info
Interesting thread I came across when researching the Groovy eclipse-plugin memory leak

If you want a good setting for JBoss (and you are limited to 32-bit Java), try:
-Xms128m -Xmx1536m -XX:MaxPermSize=128m
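
To check what limits a running JVM actually picked up from flags like these, a small diagnostic can print them. This is a minimal sketch, not something from the pages linked above: the class name MemorySettingsCheck and its helper methods are made up for illustration, and it uses only the standard java.lang.management API. Pool names vary by JVM vendor and version (a permanent generation pool only shows up on older HotSpot JVMs), so the sketch simply lists whatever pools are reported. The max heap value reported by the runtime is usually close to, but not always exactly, the -Xmx value.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class for illustration only.
class MemorySettingsCheck {

    // Convert a byte count to whole megabytes.
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // MemoryUsage.getMax() returns -1 when a pool has no defined maximum.
    static String describeLimit(long maxBytes) {
        return maxBytes < 0 ? "no limit reported" : toMegabytes(maxBytes) + " MB";
    }

    public static void main(String[] args) {
        // Approximately the value passed with -Xmx.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + describeLimit(maxHeap));

        // List every memory pool the JVM exposes, heap and non-heap.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            System.out.println(pool.getName() + " (" + pool.getType() + "): "
                    + describeLimit(usage.getMax()));
        }
    }
}
```

Running it with the same options as the server (for example, java -Xms128m -Xmx1536m -XX:MaxPermSize=128m MemorySettingsCheck on a JVM that still accepts MaxPermSize) shows whether the limits took effect.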

Eclipse Groovy Plugin - decent for scripts, PITA for classes

I have been using the Groovy plugin for Eclipse for about 18 months now. First let me say thanks to the team who has put it together. I can't really complain about any problems since it is a volunteer effort and I haven't volunteered. And over the time I have used the plugin it has gotten some pretty nice improvements, especially for code refactoring and auto-completion. Its debugging capability has always been good (with a few quirks). The fact that I can debug scripts as well as regular classes kept me from switching to NetBeans when the major upgrade of that product came out in November without that feature (hmmm... I see there are some upgrades for NB. Yet another thing to add to my to-do list to check out). Overall I find the plugin perfectly capable when writing scripts, especially in small projects. For writing classes, however, especially in a large project or when mixed with Java classes, it's a PITA.

For those that cannot or do not want to move from Eclipse to IDEA or NetBeans, here are a few tips for getting by with the plugin for now:

1. The largest issue is a memory leak. I posted to the Groovy user list about it over a year ago. (My explanation of the problem in that post was slightly incorrect. The problem will occur in any project with a lot of classes on the class path, not just one that includes another project.) I think the issue has gotten better in some circumstances, but it still makes it impossible for me to easily integrate groovy classes into my major projects.
Work Around: I have a small "GroovySandbox" project that has the class directory and key required libraries from my main project on its classpath. (Do not include the main project as a dependent project because you will have the exact same memory problem.) I write new scripts and classes in this project and then when they are ready, I copy them into the main project. The memory issue still has an effect, but I can work a full day on groovy files without having to restart Eclipse rather than having to restart every 20-30 minutes.

2. Whenever I modify the class path of a project that includes groovy classes (thus triggering a clean build by Eclipse), the groovy class files are wiped with everything else and are not regenerated. I have to go in and touch each groovy class to get it to compile.
Work Around: I now have enough groovy classes that this isn't feasible, so I created an ant target that compiles only the groovy classes and I trigger it from within Eclipse when necessary. Eclipse detects the newly generated class files and the remaining java classes I have that are dependent on the groovy classes finish compiling. (Why do I have to mix groovy and java so much? See #1, above and #3, below. Unless I want to take advantage of groovy syntax features a lot, I stick with Java.)

3. When I open a groovy class in the editor, it often doesn't populate the Outline view. Makes it hard to navigate the file.
Work Around: Touch the file, or add-and-delete a space. This was a pain when using CVS for our SCM since the files were tagged as modified, but since switching to Subversion there hasn't been a problem. I guess Subversion uses something beyond file dates to track mods.

4. When debugging, if I try to step into a class (java or groovy) that is included from another project (which I do frequently since nearly all classes are in the main project due to #1, above), I get an error window saying that the class is part of the Groovy Libraries plugin path and that the path is not modifiable. The screen doesn't give you a button to modify the source lookup path for the directory/jar. (This one is relatively new. It started happening in some update around the beginning of the year.)
Work Around: Go into the project properties->Java Build Path->Libraries and manually set the "Source Attachment" property for the offending directory/jar. The next time you start the debugger, it will let you step in without a problem.

Wednesday, April 29, 2009

Groovy Categories vs ExpandoMetaClass

I first saw Groovy in Nov 2007 and promptly decided to dive in over my head and write a DSL in it. One thing that I never quite got was why one would want to use a Category versus ExpandoMetaClass. Having to create a class with a bunch of static methods and then surround other classes in it didn't seem very elegant. Directly adding behavior to a class with an ExpandoMetaClass is much more straightforward. I thought that the Category approach must be a holdover from Groovy releases prior to 1.5 when ExpandoMetaClass was added. The only use case I could come up with for them was if you wanted to add the same functionality to a set of classes - sort of an AOP view of things.

But then Groovy 1.6 came out and I saw that the team had added some very cool annotations to make categories even more powerful and easier to use. (See InfoQ for a great write up of the 1.6 features.) Okay - you don't improve a dead feature, so what was I missing?

This past weekend I got a chance to ask the person from whom I first learned about Groovy, Scott Davis. His Blue Pill and Red Pill talks at NFJS were what first got my mind churning about the possibilities that Groovy opened up. Before sitting in on another of his talks at the most recent NFJS conference this past weekend, I asked him why one would use categories in a DSL rather than modifying meta classes. His answer told me that I was asking the wrong question. The strength of categories is in scoping - modifying behavior for a limited period of time. The primary example he gave was for unit tests. A category can short circuit certain behavior (like retrieving info from a DB or from a file) to make it easier to test certain behavior in isolation. This makes a lot of sense. Mocks/stubs have their place; they are useful when you need to work with an object in a very limited sense - just a few calls. Then there are times when you want a real object doing 95% of what it is supposed to do, but you want to take over in just a few situations to be able to insert custom behavior, e.g. in a unit test when you want to short circuit network or file-based activity or you want to purposefully force an error condition.
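
The "real object doing 95% of its job" idea can be sketched in plain Java as well. To be clear, this is not a Groovy category: Java has no equivalent of a category block, so the sketch swaps in a different technique (a test-only anonymous subclass that overrides one method) purely to illustrate the goal of short-circuiting the network step while the rest of the object runs for real. The names ContentFetcher, download, fetchTitle and ContentFetcherDemo are hypothetical and do not come from my actual code.

```java
// Hypothetical class for illustration; not the real GetHttpContent component.
class ContentFetcher {

    // The network-bound step. The real call is omitted from this sketch.
    protected String download(String url) {
        throw new UnsupportedOperationException("real network call omitted from this sketch");
    }

    // The logic we actually want to exercise in a unit test.
    String fetchTitle(String url) {
        String html = download(url);
        int start = html.indexOf("<title>");
        int end = html.indexOf("</title>");
        if (start < 0 || end < start) {
            return ""; // no well-formed title element found
        }
        return html.substring(start + "<title>".length(), end).trim();
    }
}

class ContentFetcherDemo {

    // Build a fetcher whose download step returns canned HTML instead of
    // touching the network; everything else is the real implementation.
    static ContentFetcher stubbed(final String cannedHtml) {
        return new ContentFetcher() {
            @Override
            protected String download(String url) {
                return cannedHtml;
            }
        };
    }

    public static void main(String[] args) {
        ContentFetcher fetcher = stubbed("<html><title> Hello </title></html>");
        System.out.println(fetcher.fetchTitle("http://example.invalid/"));
    }
}
```

The override only affects the one instance created for the test, which is loosely similar to the scoping benefit Scott described, though a category applies to existing classes for the duration of a block rather than to a subclass instance.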

It so happens that I really need to improve the unit tests for some network-based components in the ETL system that I work with. The primary one is called GetHttpContent, so you can imagine that unit testing it can be a pain. I'll give Categories a closer look to see if they can help me out. Unfortunately, I have to wait until we finish shifting to our new project structure because right now there are so many dependencies in the primary project that if I work on a groovy file in that project in Eclipse I lose ~20M of memory every time I edit it. But the limitations (and benefits) of the Eclipse groovy plugin can wait for another post.

Originality through Ignorance

You have to admire the restraint some people show when faced with somebody else expounding on the obvious. I was sitting with my friend and coworker, Jeff Erikson, at the recent No Fluff Just Stuff conference explaining that I had decided to come out of my "I have way too much stuff to do" shell and start blogging. I had three reasons:
1. Most of the stuff I learn and then write up for our various projects comes from publicly available sources like other blogs and articles, plus a little bit of my own work gluing it together. So it seemed only fair to publish the results back where it could help others.
2. It was a good way to find out when I really didn't know what I was talking about since any readers would gladly point out (in the nicest of terms) that I was off base and should perhaps find another occupation.
3. It was a good way to create a public record of what I have worked on in case such a thing should ever come in handy for unnamed reasons.

Jeff was nice enough to listen politely without smacking me upside the head and saying, "Read the first entry in my blog." That would be the article where he talks about the keynote speech by Jared Richardson at the Rich Web Experience conference last fall which outlines much the same reasons for starting a blog.

That piece of humble pie tasted really good. Especially with a nice chaser of irony since anyone who works with me knows that I hate reinventing the wheel. Hopefully, my new resolution to do more regular reading (rather than cramming on new techs when I need to use them) will help me avoid thinking I am original when, in fact, I am just ignorant.