I've been experimenting with WordNet (the SQL version) and Aspell to see about adding spelling and synonym suggestions to the output from the III XML server. One of the things I love about Google is the way it offers suggestions and lets you easily fix your search. Many failed searches can be fixed this way, and there have been times when I didn't know I had misspelled a word until Google told me. I really don't think there is a need to call the reference desk for help if it's just a simple mistake like that.

My trials are nowhere near ready to show right now, but lo and behold, Eric Morgan has created a simple SRU client that uses the same tools I am. His version searches an OAI repository instead, but the features are applicable to any search. You can read his overview of it in an XML4LIB posting.

What I really like about his implementation is his controlled dictionary. Instead of letting the spell checker always suggest something, it only offers a correction when the corrected word is actually present in the data. It won't fix the spelling if no record contains the word. This would prevent two failed searches in a row and not give the person the idea that the search failed due to spelling alone. I do agree with his reasoning:

What is really great about this technique is that the spell checker will only recommend words that are in the dictionary, and the dictionary is only built from words in your index. Consequently, every single suggested word should have at least one record associated with it.
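To make the idea concrete, here is a minimal sketch of a controlled dictionary in Python. It is not Eric's code, and it does not use Aspell: the standard library's difflib stands in for the spell checker, and the names build_dictionary and suggest, the sample records, and the 0.75 similarity cutoff are all my own assumptions for illustration.

```python
import re
from difflib import get_close_matches


def build_dictionary(records):
    """Collect every distinct lowercase word that appears in the indexed records."""
    words = set()
    for text in records:
        words.update(re.findall(r"[a-z]+", text.lower()))
    return words


def suggest(term, dictionary, limit=3):
    """Suggest respellings of term, drawn only from words found in the index.

    difflib is a stand-in for a real spell checker such as Aspell.
    Returns an empty list when the term is already in the dictionary
    or when nothing in the index is close enough (cutoff is an assumption).
    """
    term = term.lower()
    if term in dictionary:
        return []
    return get_close_matches(term, sorted(dictionary), n=limit, cutoff=0.75)


if __name__ == "__main__":
    # Hypothetical record titles standing in for the real index.
    records = [
        "Introduction to library cataloging",
        "Metadata harvesting with OAI",
    ]
    dictionary = build_dictionary(records)
    print(suggest("libary", dictionary))
```

Because the candidate list is built only from words in the records, any suggestion that comes back is guaranteed to match at least one record, which is the whole point of the technique.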

On the other hand, I still like the idea of always fixing it. Yes, it provokes another failed search, but the person now knows that their query was misspelled and can be fairly certain that it is now correct. I would also think it affects the thesaurus. If the person repeats the query with the correct spelling, I think the thesaurus could then give alternates that might be useful. With an incorrectly spelled word, I don't think the thesaurus would help.
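Here is a rough sketch of why the order matters, again in Python. Everything in it is hypothetical: the tiny word list and thesaurus are stand-ins for Aspell's dictionary and WordNet, and difflib does the spelling correction in place of Aspell.

```python
from difflib import get_close_matches

# Hypothetical stand-ins for a general spelling dictionary and a thesaurus.
GENERAL_WORDS = ["car", "library", "journal", "history"]
THESAURUS = {
    "car": ["automobile", "auto"],
    "journal": ["periodical", "serial"],
}


def correct(term):
    """Always return the closest known spelling, or the term itself if none is close."""
    term = term.lower()
    if term in GENERAL_WORDS:
        return term
    matches = get_close_matches(term, GENERAL_WORDS, n=1, cutoff=0.75)
    return matches[0] if matches else term


def synonyms(term):
    """Look up a term in the thesaurus; a misspelled term finds nothing."""
    return THESAURUS.get(term.lower(), [])


def expand(term):
    """Correct the spelling first, then look up synonyms for the corrected word."""
    fixed = correct(term)
    return fixed, synonyms(fixed)


if __name__ == "__main__":
    print(synonyms("jurnal"))  # thesaurus lookup on the raw, misspelled term
    print(expand("jurnal"))    # spelling fixed first, then the thesaurus lookup
```

The raw lookup comes back empty because the thesaurus has no entry for the misspelling, while the corrected term picks up its synonyms. That's the case I have in mind when I say an always-on correction helps the thesaurus even if the corrected word has no records.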

My other caveat is that the alternate spellings and synonyms aren't clickable, so you can't rerun the search with those terms directly. Making them clickable can be problematic with more complex queries, so I'm not surprised, but it's something I'd love to see in a future version.

What I think this illustrates, though, is how little things can make a large difference in helping people find what they want. Something as simple as this can go a long way in helping people recover from failed searches. I'm sure it will spawn another debate over dumbing down the interface, and someone will yell that college students should know how to spell and how to use a thesaurus, and that putting it in the OPAC is just letting them be stupid. I'll leave that debate for another day.
