Unit 10 Blog Assignment
Federated searching has been a moving target for libraries for a generation. The problem of isolated or fragmented silos of information seems to grow as digital publishing becomes more prevalent. In a related problem, libraries must work out how to promote the value they add to the holdings in their collections and how to expose those efforts to their users. A lot of effort has centered on joining the library ILS to the library's subscription databases and other full-text sources of digital information. The University of Arizona Libraries OPAC, which successfully blurs the distinction between searching full text and searching metadata, is one instance of this effort. This was not always the case, as the library OPAC traditionally searched only professionally constructed cataloging metadata. The old model often frustrated library patrons, who expected search results to retrieve full-text resources rather than just the cataloging metadata representing them.
The problem is that most federated search solutions offer patrons a sometimes too generic interface that severely limits power users, who may find searching the “native” database directly a lot more satisfying, because search options embedded in the “native” database are not available from the federated solution. Interoperability is the key concept that makes federated searching feasible. The Dublin Core vs. MARC debate that has raged for years is, in a sense, a debate about interoperability. By homogenizing metadata down to fifteen common core elements and requiring that every repository in an Open Archives collection be able to supply its records in that form, the protocol lets researchers search across hundreds of OAI-compliant collections at once. However, while crosswalks and mapping solutions may help, sometimes discovery is limited by squeezing metadata into a DC straitjacket.
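To make the straitjacket concrete, here is a minimal Python sketch of a crosswalk. The mapping table is a simplified illustration of my own, not an official MARC to Dublin Core crosswalk, and the sample record is hypothetical. The point is only to show where detail gets lost: subfields are flattened into a single string, and fields with no Dublin Core counterpart in the table are dropped.

```python
# Simplified, illustrative mapping of MARC tags to unqualified Dublin Core
# elements. This is an assumption for demonstration, not an official crosswalk.
MARC_TO_DC = {
    "100": "creator",
    "245": "title",
    "260": "publisher",
    "520": "description",
    "650": "subject",
}


def crosswalk(marc_fields):
    """Map a list of (tag, {subfield_code: value}) pairs to Dublin Core.

    Returns the Dublin Core record and the list of MARC tags that had
    no place to go.
    """
    dc = {}
    unmapped = []
    for tag, subfields in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            # No Dublin Core home for this field in our table: it is lost.
            unmapped.append(tag)
            continue
        # The subfield structure collapses into one flat string.
        value = " -- ".join(subfields.values())
        dc.setdefault(element, []).append(value)
    return dc, unmapped


# Hypothetical sheet music record, invented for this example.
record = [
    ("245", {"a": "Saratoga waltz"}),
    ("650", {"a": "Waltzes", "z": "New York (State)", "y": "19th century"}),
    ("382", {"a": "piano"}),  # medium of performance
]

dc, unmapped = crosswalk(record)
print(dc)
print(unmapped)
```

In this sketch the subject heading survives only as one undifferentiated string, and the medium of performance field never reaches the Dublin Core record at all, which is exactly the kind of detail a power user would miss in a federated search.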
Now let's take a closer look at how these concepts work in some real-world OAI federated repositories.
The Openarchives.eu federated repository is an awesome federated search solution providing researchers with access to OAI-PMH compliant digital repositories from a European perspective.
A list of the repositories included is displayed just by clicking the OK button next to the search box on the home page. The Openarchives.eu site indexes 2235 repositories!
The basic search indexes the descriptions of each OAI-PMH repository rather than accessing the individual digital objects within collections. I think this is a neat solution because it quickly and easily brings researchers face to face with a repository likely to contain information relevant to their topic. The researcher can then search the native digital repository, with all its search features, for a specific digital object. This is a more sustainable solution than harvesting all the metadata for every digital object in every repository. For example, I searched for the word “chemistry” in the Openarchives.eu search box. The search generated a list of eleven OAI-PMH chemistry repositories. A researcher can then open any of the eleven and search each repository individually.
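The difference between describing a repository and harvesting its contents maps onto two standard OAI-PMH verbs: Identify, which returns a description of the repository itself, and ListRecords, which returns the metadata records it holds. Here is a rough Python sketch that only builds the two request URLs. The base URL and the helper name oai_request are made up for illustration, and I am not claiming this is how Openarchives.eu actually gathers its descriptions.

```python
from urllib.parse import urlencode

# Hypothetical OAI-PMH endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"


def oai_request(base_url, verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


# One small request per repository: ask it to describe itself.
identify_url = oai_request(BASE_URL, "Identify")

# The heavier alternative: ask for the repository's records in
# unqualified Dublin Core (the oai_dc metadata format).
list_records_url = oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

print(identify_url)
print(list_records_url)
```

A service that works from repository descriptions needs only something like the first request for each repository, while a full harvest has to page through every record with the second, which is why the lighter approach is easier to sustain across thousands of repositories.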
Another great federated repository is the “Sheet Music Consortium” (SMR), which harvests metadata from seven sheet music repositories. Sheet music is an excellent barometer of popular taste and attitudes and reveals many things about the concerns and issues of the day. While international contributions are encouraged, metadata must be in English. The seven repositories contain metadata descriptions in unqualified Dublin Core for almost 100,000 digital objects. To get an appreciation for the usefulness of the SMR, I performed a simple search using the word “Saratoga”, which returned 68 digital objects, or pieces of sheet music, with the word Saratoga in the title. While there are many duplicates in the results list, all of the items were relevant to my digital Saratoga Springs, NY collection.
The specificity of the SMR federated repository is attractive, since it speeds up discovery and returns relevant results to the researcher of a particular topic. The Google approach of indexing everything, on the other hand, is very popular because the searcher will almost always find something on point, but it has a significant downside. The downside is the 2 million hits syndrome, where so much is returned from a search that the user just browses a few pages and ignores the rest. The Google universality approach is one that I think drives the Oaister federated repository. As with Google, universality is both a strength and a weakness of the Oaister.org federated repository.
The universal scope of the Oaister.org federated repository makes it similar in ambition to Google, but the lack of specificity makes searching a hit-or-miss effort. I wasn't nearly as successful searching Oaister.org for sheet music associated with Saratoga Springs, NY as I was when I performed the same search in the Sheet Music Consortium federated repository. This underscores how in federated repository architecture, as in other things in life, sometimes “less is more.”
Sunday, October 31, 2010