Free Enterprise Search Engine

At Mecklenburg County, our Information Services & Technology department has a Creativity and Innovation team. Being that I'm neither of those things, and also being that I'm not a part of Information Services & Technology, naturally I'm a member of this group.*

During one meeting we were discussing ways to better share information. I had been carefully and steadfastly studying ceiling tiles (127.5 if you piece together the half-tiles near the walls), when my mouth, which often acts independently of the rest of my body, spasmed out "You know, if you employed some enterprise search technology, people could search for these documents on their various file and web servers and information sharing would be much easier and more efficient."

I plastered a smile on my face to keep my lips from continuing their disobedient movements, but the damage had already been done. "That's a great idea! Why don't you research that and report back to the group?" To paraphrase Douglas Adams, the smile stayed on my face, but my eyes frowned a little, and I promised my lips a rendezvous with the hottest cup of McDonald's coffee I could find.

The first thing I looked at was your standard Google search appliances. My report on those didn't get much traction for two reasons: (1) they cost money, and the group in question has no budget, and (2) if it doesn't run on Windows, it is persona non grata at Mecklenburg County.**

Then I ran into a project from IBM and Yahoo! called the IBM OmniFind Yahoo! Edition. Rather than a hardware/software appliance like Google's offering, it is strictly a software solution that you install on your own hardware. It runs on both Linux and Windows (Server 2003 SP1) and is based on the Apache Software Foundation's Lucene, a high-performance text search engine written in Java.

A single server can handle up to 500,000 documents from both file shares and web sites, and you can stack multiple servers to handle more if need be. The search interface is fully customizable, and search capabilities are accessible as web services using REST. Over 200 document types are supported, and since it's based on open source and open standards, it shouldn't be too difficult to integrate with your overall enterprise architecture.
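
If you want to guess how many boxes you'd need before pitching this to your own committee, the arithmetic is simple enough to sketch in a few lines of Python. This is only a back-of-the-envelope estimate: it assumes the 500,000-document figure is a hard per-server ceiling and that documents spread evenly across servers, neither of which I've verified. The function name and the sample document counts are made up for illustration.

```python
# Per-server limit quoted above; treated here as a hard ceiling (an assumption).
DOCS_PER_SERVER = 500_000


def servers_needed(total_documents: int, docs_per_server: int = DOCS_PER_SERVER) -> int:
    """Return the minimum number of servers needed to index total_documents."""
    if total_documents < 0:
        raise ValueError("total_documents must not be negative")
    if docs_per_server <= 0:
        raise ValueError("docs_per_server must be positive")
    # Ceiling division with integers only, so large counts never hit float rounding.
    return -(-total_documents // docs_per_server)


if __name__ == "__main__":
    # Hypothetical document counts, not measurements from any real file share.
    for count in (120_000, 500_000, 1_750_000):
        print(f"{count:,} documents -> {servers_needed(count)} server(s)")
```

Anything up to the limit fits on one server, and each additional 500,000 documents (or fraction thereof) adds another.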

The kicker is that IBM OmniFind Yahoo! Edition is completely free. You can pay for 24x7 support from IBM if you want to, but it isn't required. It advertises 3-click installation and 1-click starting (of course, they all say that), so it could easily be tossed on a VMware Server or an old box for testing with no licensing hassles.

If you're interested you can read more about it at

*I'm on a couple of these types of things. I've been having a running debate over whether this falls under Marx's Law of Opposites or Murphy's Law. My vote is for Mr. Murphy.

**Mecklenburg County didn't simply drink the Microsoft Kool-Aid; it dove headfirst into the punch bowl.