laughingmeme pointed at my post on Classifier4J's
text summary API today, and did a nice comparison with the OS X and Open Text summarizers.
Unfortunately, the author couldn't run Classifier4J, so I've made a web app available to test it.
It's ugly, it's nasty, but it mostly works. Try playing with the number-of-sentences parameter, because if you stick
with 1 sentence you tend to get the first sentence most of the time. Enjoy, and let me know your comments.
Example (from the java.util.Collection javadocs):
The root interface in the collection hierarchy. A collection represents a group of objects, known as its elements. Some collections allow duplicate elements and others do not. Some are ordered and others unordered. The SDK does not provide any direct implementations of this interface: it provides implementations of more specific subinterfaces like Set and List. This interface is typically used to pass collections around and manipulate them where maximum generality is desired.
Bags or multisets (unordered collections that may contain duplicate elements) should implement this interface directly.
All general-purpose Collection implementation classes (which typically implement Collection indirectly through one of its subinterfaces) should provide two “standard” constructors: a void (no arguments) constructor, which creates an empty collection, and a constructor with a single argument of type Collection, which creates a new collection with the same elements as its argument. In effect, the latter constructor allows the user to copy any collection, producing an equivalent collection of the desired implementation type. There is no way to enforce this convention (as interfaces cannot contain constructors) but all of the general-purpose Collection implementations in the SDK comply.
A three-sentence summary gives:
The root interface in the collection hierarchy. A collection represents a group of objects, known as its elements. All general-purpose Collection implementation classes (which typically implement Collection indirectly through one of its subinterfaces) should provide two “standard” constructors: a void (no arguments) constructor, which creates an empty collection, and a constructor with a single argument of type Collection, which creates a new collection with the same elements as its argument.
which I think is rather good.
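For anyone curious how a summariser of this general kind can work, here is a minimal Java sketch of a frequency-based extractive approach: count the words, then for each of the most frequent words take the first sentence that contains it, and print the chosen sentences in their original order. This is an illustration only, not Classifier4J's actual code. The class name SketchSummariser, the tiny stop-word list, the sentence-splitting regex and the tie-breaking rule are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a frequency-based extractive summariser.
// Not taken from Classifier4J; every name here is illustrative.
class SketchSummariser {

    // Tiny stop-word list, assumed for the example only.
    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "an", "and", "are", "as", "in", "is", "it", "its",
            "of", "the", "this", "to", "which"));

    // Naive sentence splitter: breaks after '.', '!' or '?' followed by whitespace.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String s : text.trim().split("(?<=[.!?])\\s+")) {
            if (!s.isEmpty()) {
                sentences.add(s);
            }
        }
        return sentences;
    }

    // Lower-cased words with punctuation and stop words removed.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        for (String w : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!w.isEmpty() && !STOP_WORDS.contains(w)) {
                result.add(w);
            }
        }
        return result;
    }

    static String summarise(String text, int numSentences) {
        List<String> sentences = splitSentences(text);

        // Count word frequencies, remembering first-appearance order.
        Map<String, Integer> freq = new LinkedHashMap<>();
        for (String w : words(text)) {
            freq.merge(w, 1, Integer::sum);
        }

        // Most frequent words first; the stable sort keeps ties in first-appearance order.
        List<String> ranked = new ArrayList<>(freq.keySet());
        ranked.sort((x, y) -> Integer.compare(freq.get(y), freq.get(x)));

        // For each ranked word, select the first sentence that contains it.
        TreeSet<Integer> chosen = new TreeSet<>();
        for (String w : ranked) {
            if (chosen.size() >= numSentences) {
                break;
            }
            for (int i = 0; i < sentences.size(); i++) {
                if (words(sentences.get(i)).contains(w)) {
                    chosen.add(i);
                    break;
                }
            }
        }

        // Emit the selected sentences in their original order.
        StringBuilder summary = new StringBuilder();
        for (int i : chosen) {
            if (summary.length() > 0) {
                summary.append(' ');
            }
            summary.append(sentences.get(i));
        }
        return summary.toString();
    }
}
```

With this sketch, calling SketchSummariser.summarise(text, 3) returns at most three sentences from the input. In a scheme like this, the most frequent words of a document are often introduced in its opening sentence, which may be one reason a one-sentence summary so often comes back as the first sentence.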