I finished up the Harry Potter books on Saturday, just in time for tomorrow’s release of Order of the Phoenix on PlayStation, so I can finally take a minute to stop neglecting this blog.
Federated search has been on my mind again recently, due to a number of factors. One, at work, federated search came up again as something we should look into. Two, I met with a rep of a federated search product. Three, the excellent report/article by Thomas Mann called “The Peloponnesian War and the Future of Reference, Cataloging, and Scholarship in Research Libraries” (PDF) came to my attention today via David Weinberger’s blog.
I don’t think I’ve been entirely secretive about my dislike for federated searching, but reading Mann’s report pretty much sealed it for me. For one thing, anyone using an ancient history example to illustrate a point has my attention from the outset. Mainly, though, Mann’s piece is an articulate look at not only why librarians are necessary, but why using precoordination, subject headings, and classification is so critical.
Mann correctly focuses his piece on the information needs of researchers–grad students, faculty, and anyone needing information beyond what is quick and dirty. This right here is the distinction I see for medical libraries. In academic medical libraries, at least, what we are dealing with is a population that needs research-quality information, not just a quick fact or quote. I got into librarianship because of my love of the Classics and ancient history, so perhaps I just have a bias towards realizing that doing real research is a long-term, iterative, complicated process, requiring access to materials in multiple languages, obscure materials, and books–lots and lots of books. But I think any physician or other health sciences professional who is doing research likewise has the same bias. Sure, if you need a quick answer for patient care or to answer a simple question, doing a search in UpToDate (ugh) or a real evidence-based medicine database is the answer. Doing *research*, on the other hand, requires time and effort, and is made much easier by the presence of good, well-indexed, and well-cataloged tools.
I’m not going to take you through Mann’s paper step by step (you should read it yourself), but suffice it to say that Mann clearly points out that librarians who have knowledge of the types of resources available and how to use them to their best advantage are critical to success in finding information. He points out that even the most experienced researchers often aren’t as familiar with the information resources in their discipline as librarians, and that instead of trying to “simplify” search interfaces, what librarians really need to do is teach, teach, teach.
Here are a couple of snippets:
“Theorists who assert that simply “digitizing everything” eliminates the need for cataloging evidently have minimal experience with the actual results produced by implementing their theory. Full-text searching is indeed extremely valuable in many situations; but if a researcher wishes to get an overview of the important works on a topic, that kind of searching is positively counterproductive–it cannot segregate whole books from fragments of books, nor can it separate substantial treatments from trivial.” (pg 10)
“We all need to be very skeptical of the phrase “relevance ranking”–“term weighting” would be more accurate–because it radically changes the very meaning of the word relevance. It entirely divorces its definition from the notion of conceptual appropriateness, across both variant expressions and variant languages, and from the notion of substantial (rather than tangential) appropriateness.” (pg 10-11)
“The point here needs emphasis: a research library can provide not only a vast amount of content that is not on the open Internet; it can also provide multiple different search techniques that are usually much more efficient than “relevance ranked” and “more like this” Web searching. And most of these search techniques themselves are not available to offsite users who confine their searches to the open Internet.” (pg 14)
“This is one of the main reasons that we subsidize research libraries through taxes and endowments that shield them from market forces of supply and demand–so that they can provide free access to works not currently in general demand, and which profit-seeking bookstores would readily discard. (Second-hand bookstores that have some of the out-of-print sources do not make them freely available any more than the in-print stores do.) No one denies that research libraries need to be fiscally prudent; but there is a big difference between being fiscally responsible vs. allowing business concerns to determine the very goals of the library (e.g. “increasing market share” over “promoting scholarship”).” (pg 17-18)
I’m going to stop here and talk about something else that this last quote mentions. I sometimes worry that the research library is losing its focus. I can’t find it at the moment, but somewhere in Mann’s piece he talks about the obligation of the research library to collect foreign-language and other less common works. The obligation. It is not the obligation of the research library to provide great coffee or have the customer always be right. It is the obligation of the research library to preserve knowledge, and not just knowledge from popular presses, English-language sources, and books that the library thinks there will be a market for. It is the obligation of the research library to promote scholarship, not conform to business principles and economic or other whims. Nothing annoys me more than seeing the research libraries around me cancel foreign language journals and rare or foreign society periodicals. If the research libraries aren’t going to preserve access to the Journal of the Formosan Medical Association, for a particular example that has annoyed me for almost 6 years, who is? There are more important things at stake than preserving online access to popular titles, even if it does create difficulty for shrinking budgets.
(pausing to watch the “Perhaps, Perhaps, Perhaps” scene in Strictly Ballroom…)
So, I am ranting a bit. 🙂
But getting to medical libraries and federated searching, David Rothman had a good post responding to Rachel Walden’s experience a while back about a PubMed instructor who was “over MeSH.” Apparently, because PubMed does such a stellar job of mapping keywords to headings, using MeSH purposefully is rendered obsolete. Of course, as David and others show, PubMed’s mapping is not exactly, I don’t know, remotely reliable. That’s not the entire story, though. Even if PubMed’s mapping was peachy keen and always perfect, there are a couple of small flaws in how searches are translated, in my opinion. First of all, there is that whole searching keywords simultaneously thing–as if you don’t get too many results regardless. Then, there is that lack of major headings thing. And, finally, there is that thing with having those little precoordinated subheadings. I don’t know about YOU, but if I am doing a search in MEDLINE for treatment of myocardial infarction, I find leaving out keyword searching, using a major topic, and having that precoordinated therapy subheading pretty dang nice (well, completely critical to searching and at the same time remaining sane, actually).
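For the curious, here is roughly what that myocardial infarction search looks like when you spell it out yourself instead of trusting the mapping. This is only a minimal sketch: the helper name mesh_query is made up for illustration, and the one thing it leans on is PubMed's field-tag syntax, where [mh] searches a MeSH heading, [majr] restricts to major topic, a slash attaches a subheading, and :noexp turns off explosion.

```python
def mesh_query(heading, subheading=None, major=False, explode=True):
    """Build a PubMed-style MeSH search string (hypothetical helper)."""
    # A subheading is attached to its heading with a slash: heading/subheading
    term = heading if subheading is None else f"{heading}/{subheading}"
    # [majr] limits results to records where the heading is a major topic;
    # [mh] matches the heading whether or not it is a major topic
    tag = "majr" if major else "mh"
    # PubMed explodes headings by default; :noexp switches that off
    if not explode:
        tag += ":noexp"
    return f'"{term}"[{tag}]'


# Treatment of myocardial infarction: no keywords, major topic, therapy subheading
query = mesh_query("myocardial infarction", subheading="therapy", major=True)
print(query)  # "myocardial infarction/therapy"[majr]
```

Paste the resulting string into the PubMed search box and you skip the keyword half of the automatic translation entirely.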
(I don’t care about winning the Pan-Pacific Grand Prix!)
I had a student in today looking for help with a search on the prevalence of headache or head and neck pain over 3 months in duration. What’s the first thing I did? I’ll tell you, since you ask. Figure out what the correct MeSH heading is. Then with a little major-topicking (don’t you love the verbing of America?) and subheading use (epidemiology, anyone?), voila, he had a bunch of targeted articles right there in a matter of seconds. I did throw in a little keyword action for the duration thing. What the NLM was thinking when they were looking at ditching all those subheadings is frankly beyond me.
One thing that I find really useful in searching Ovid, once in a while, is free-floating the subheading. But I don’t want to do that all the time! And that is what PubMed asks us to do by keyword searching.
As a total aside, I find it weird that PubMed explodes subheadings and Ovid doesn’t. How weird.
So, anyway, read Mann’s piece.
I was really glad to see LibraryThing mentioned positively, even in a piece slagging on tagging and federated search:
“Folksonomy lists of related sources, based on assemblages of democratically tagged results (as in LibraryThing) are also desirable supplements but terrible substitutes for the retrievals brought about by controlled vocabularies. How many of the “Peloponnesian” books (in multiple languages, in and out of print) listed above under the LC heading would have been found in folksonomy lists derived from uncontrolled tags?
Folksonomies do not adequately show the contexts and webs of relationships that scholarship requires–which linkages can be and are provided by professional catalogers who maintain the controlled vocabulary of the LC system. And let’s not forget–as many seem to have done–that beyond the standardization of terms for individual subjects, vocabulary control also entails the maintenance of scope notes, cross-references, and browse displays (like that for “Finance, public” above) which explain and exhibit the conceptual connections among the many related search terms that have not been applied to the book in hand, but which, once brought to the searchers’ attention, are often of equal or even greater interest in expanding their horizons. Subject headings show not just books in the same category, but also whole webs of other, different (but related) categories.
…While folksonomies have severe limitations and cannot replace conventional cataloging, they also offer real advantages that can supplement cataloging. Perhaps financial arrangements with LibraryThing (or other such operations) might be worked out in such a way that LC/OCLC catalog records for books would provide clickable links to LibraryThing records for the same works. In this way researchers could take advantage of that supplemental network of connections without losing the primary network created by professional librarians.”
Might I say, LibraryThing for Libraries?