Research Guide Recommendations

Note: This is a collaborative project with Doug Shuga, soon-to-be graduate of The University of Texas at Austin School of Information.  You need to hire him.

This excellent article by William B. Lund and Chad Hansen at Brigham Young University demonstrates how one can embed subject-relevant research guides into search results.  The beauty of this approach is that it uses third-party search algorithms and result sets to identify the broad subject area of a search, regardless of what the user types into the search box.

A user can type anything into a search box.  Any word, any phrase, misspellings, colloquialisms.  Because of the open-ended nature of the search box, it’s not easy to use search terms to do anything useful outside of re-running a search somewhere else.  If a user types in “evolution and creationism”, I have no easy way of mapping those terms to our biology or religion subject guides.  We can’t possibly map the infinite words users might type in.

Enter Mr. Lund and Mr. Hansen’s brilliant idea.  Let a third party do the hard part by providing a search algorithm and relevant search results.  Use the search results to figure out useful information about the search. Lund and Hansen used LibraryThing.  I used Ebsco Discovery Service (EDS), but really, this could be done in any interface that produces search results with some form of call numbers.

This project is about placing subject-relevant research guides in a user’s search results in EDS.

Overview

Call numbers provide a way to identify the subject area of a search. Relevant research guides can then be presented.

A user types in some search terms and hits “Search.”  Ebsco Discovery Service brings back relevant search results, many of which are books from our library’s catalog.  In the search results screen, under each result that was pulled from our library catalog, you’ll see call numbers.

Our JavaScript screen-scrapes all the call numbers that appear on the search results page.  It then maps those call numbers to subjects.  This is easy enough to do because LC call number ranges map to specific subjects (e.g., call numbers beginning with “N” are Fine Arts, and call numbers beginning with “HV” are social pathology within the social sciences).  Finally, it ranks the subjects by how often they appear and provides links to the associated research guides.

Here’s a video demonstrating what it does.

Here’s a document that maps our research guides to call number ranges.

Details

Before you begin to code, you first need to identify what subject-based research guides you have.  Ignore any course-specific research guides, as there won’t be any way to tell from the search results if the student is researching for a particular course.

With a list of subject-based research guides, identify the call number ranges associated with those subjects using the Library of Congress call number ranges.  You could do this with Dewey, in theory, too.  This will identify which call number ranges you should be looking for in your search results.  You may have gaps; for example, you may not have a research guide for Engineering if your school does not have an engineering program.  This is fine – you only need to identify the call number ranges for which you have subject guides.
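One way to record this mapping before writing any scraping code is as a small data table. The sketch below is only an illustration: the guide names, the example.edu URLs, the numeric bounds, and the validateGuides helper are all hypothetical placeholders, not the contents of our actual mapping document.

```javascript
// Hypothetical guide table. Subject names, URLs, and ranges are placeholders;
// substitute your own guides and the LC ranges you assign to them.
const subjectGuides = [
  // A range with only letters means "any call number in this class".
  { subject: "Art and Art History", url: "https://example.edu/guides/art",
    ranges: [{ letters: "N" }] },
  { subject: "Biology", url: "https://example.edu/guides/biology",
    ranges: [{ letters: "QH" }] },
  // A range with from/to restricts the numeric part (illustrative bounds).
  { subject: "Philosophy", url: "https://example.edu/guides/philosophy",
    ranges: [{ letters: "BJ", from: 1, to: 1725 }] },
];

// Returns a list of human-readable problems; an empty list means the table
// is at least structurally sound.
function validateGuides(guides) {
  const problems = [];
  for (const guide of guides) {
    if (!guide.url) problems.push(`${guide.subject}: missing url`);
    for (const range of guide.ranges) {
      if (!/^[A-Z]+$/.test(range.letters || "")) {
        problems.push(`${guide.subject}: bad class letters`);
      }
      const bounded = range.from !== undefined || range.to !== undefined;
      const boundsOk = Number.isInteger(range.from) &&
        Number.isInteger(range.to) && range.from <= range.to;
      if (bounded && !boundsOk) {
        problems.push(`${guide.subject}: bad numeric bounds for ${range.letters}`);
      }
    }
  }
  return problems;
}
```

Keeping the table in one place like this would also make it easier to see which subjects have no guide at all.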

The first step in coding is to use JavaScript’s window.top.document.body.innerHTML property to grab all of the HTML code from the page.  At St. Edward’s, all of the call numbers on the page can be identified by the preceding string, “Call No. “.  The code looks for the first instance of this string and grabs the characters that follow it until it hits a space.  This grabs only the first portion of the call number, not the entire call number string, which is all we need to identify the subject.  The code repeats this process until no more instances of “Call No. ” are found.
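The scraping loop described above can be sketched roughly as follows. This is not the widget's actual code: the function name extractCallNumberPrefixes and the sample HTML are made up for illustration, and stopping at a “<” as well as at whitespace is my own assumption, added so a closing tag never ends up inside a prefix.

```javascript
// Collect the first chunk of every call number that follows the marker text.
function extractCallNumberPrefixes(html, marker = "Call No. ") {
  const prefixes = [];
  let pos = html.indexOf(marker);
  while (pos !== -1) {
    const start = pos + marker.length;
    let end = start;
    // Read until whitespace or the start of an HTML tag (assumption).
    while (end < html.length && !/[\s<]/.test(html[end])) {
      end++;
    }
    if (end > start) {
      prefixes.push(html.slice(start, end));
    }
    // Continue searching after the text we just consumed.
    pos = html.indexOf(marker, end);
  }
  return prefixes;
}

// Hypothetical sample standing in for the results page HTML.
const sampleHtml =
  "<div>Call No. N 6537 .K3</div>" +
  "<div>Call No. HV7332 .A5 2010</div>" +
  "<p>Call No. QH 375 .R5</p>";
console.log(extractCallNumberPrefixes(sampleHtml)); // logs the prefixes N, HV7332, QH
```

In the browser, the argument would be window.top.document.body.innerHTML rather than a sample string.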

What results is an Array filled with the first portions of the call numbers on the page (e.g., “N”, “HV7332”, or “QH”).  The next step is to map the call number prefix to specific subjects and tally how many hits we have from each subject area.  We iterate through the entire array, matching each item with a predefined set of call number prefixes.  If we have a match (e.g., “N” was in the array and we have identified “N” as Art and Art History), we increment a counter for the relevant subject area.
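A minimal sketch of that tallying step, assuming a lookup keyed by LC class letters, might look like this. The names classToSubject, subjectForPrefix, and tallySubjects are hypothetical, as are the “Social Work” and “Biology” guide names; the sketch also falls back from longer to shorter letter prefixes (so “NA” can match an “N” entry), which is one possible reading of “call numbers beginning with N”.

```javascript
// Hypothetical mapping from LC class letters to guide names.
const classToSubject = new Map([
  ["N", "Art and Art History"],
  ["HV", "Social Work"],
  ["QH", "Biology"],
]);

// Find the subject for one scraped prefix such as "HV7332", or null if none.
function subjectForPrefix(prefix, letterMap) {
  const match = /^[A-Za-z]+/.exec(prefix);
  if (!match) return null;
  const letters = match[0].toUpperCase();
  // Try the most specific class first, then shorter ones ("NA" then "N").
  for (let len = letters.length; len > 0; len--) {
    const subject = letterMap.get(letters.slice(0, len));
    if (subject !== undefined) return subject;
  }
  return null;
}

// Count how many scraped prefixes fall under each subject.
function tallySubjects(prefixes, letterMap) {
  const counts = new Map();
  for (const prefix of prefixes) {
    const subject = subjectForPrefix(prefix, letterMap);
    if (subject === null) continue; // no guide for this class
    counts.set(subject, (counts.get(subject) || 0) + 1);
  }
  return counts;
}
```

Prefixes with no matching guide are simply skipped, which matches the idea that gaps in the guide list are fine.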

Because the script compares call numbers as strings, it orders them character by character, more like Dewey than LC.  For example, as a string BJ2200 is greater than BJ1 but less than BJ23.  To avoid false positives, we need an extra step in each comparison: strip away the letters and use parseInt() to convert the rest to an integer.  Compared this way, JavaScript knows that 2200 is greater than 23.  Right now this is implemented only for Philosophy, both because the problem is rare outside that subject area and because of a character limit in our code, which I’ve noted below.
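To make the difference concrete, here is a small sketch of the numeric comparison. The helper names parseCallNumber and inRange are hypothetical, and the BJ1 to BJ23 range is only the illustrative range from the example above, not a real guide boundary.

```javascript
// Split a prefix such as "BJ2200" into class letters and an integer.
function parseCallNumber(prefix) {
  const match = /^([A-Za-z]+)\s*(\d+)?/.exec(prefix);
  if (!match) return null;
  return {
    letters: match[1].toUpperCase(),
    // parseInt with radix 10 turns "2200" into the number 2200.
    number: match[2] === undefined ? null : parseInt(match[2], 10),
  };
}

// True when the prefix has the given letters and its number is in [from, to].
function inRange(prefix, letters, from, to) {
  const parsed = parseCallNumber(prefix);
  return parsed !== null &&
    parsed.letters === letters &&
    parsed.number !== null &&
    parsed.number >= from &&
    parsed.number <= to;
}

// String comparison wrongly places BJ2200 between BJ1 and BJ23...
console.log("BJ2200" > "BJ1" && "BJ2200" < "BJ23"); // true
// ...while the numeric comparison does not.
console.log(inRange("BJ2200", "BJ", 1, 23)); // false
console.log(inRange("BJ19", "BJ", 1, 23)); // true
```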

After counting up how many hits there were in our defined subject areas, we sort from highest to lowest, excluding all subjects that had no hits.  We then display URLs associated with the subjects in an Ebsco Discovery Service widget on the right side of the results screen.
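The ranking and display step could be sketched along these lines. The function names rankGuides and renderLinks, the shape of the counts Map, and the example.edu URLs are assumptions for illustration; how the resulting HTML is placed into the EDS widget is not shown here.

```javascript
// Turn a Map of subject -> hit count into a list sorted from most to fewest
// hits, dropping subjects with no hits or no known guide URL.
function rankGuides(counts, guideUrls) {
  return Array.from(counts.entries())
    .filter(([subject, hits]) => hits > 0 && guideUrls.has(subject))
    .sort((a, b) => b[1] - a[1])
    .map(([subject, hits]) => ({ subject, hits, url: guideUrls.get(subject) }));
}

// Build simple link markup. Subject names and URLs come from our own table,
// so this sketch does no HTML escaping.
function renderLinks(ranked) {
  return ranked
    .map((guide) => `<a href="${guide.url}">${guide.subject}</a>`)
    .join("<br>");
}

// Hypothetical inputs.
const exampleCounts = new Map([
  ["Biology", 1],
  ["Art and Art History", 3],
  ["History", 0],
  ["Social Work", 2],
]);
const exampleUrls = new Map([
  ["Biology", "https://example.edu/guides/biology"],
  ["Art and Art History", "https://example.edu/guides/art"],
  ["History", "https://example.edu/guides/history"],
  ["Social Work", "https://example.edu/guides/social-work"],
]);
console.log(renderLinks(rankGuides(exampleCounts, exampleUrls)));
```

With these sample inputs, Art and Art History is listed first, then Social Work, then Biology; History is left out because it had no hits.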

Here’s the code in .txt format.  (You may need to right-click and do a ‘Save as…’ to see the code.)

In the code, you’ll notice a built-in delay of about 2 seconds.  This gives EDS enough time to do the real-time availability check and display the call numbers before the script runs.
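A delay like this is typically done with setTimeout. The sketch below is only an assumption about how it might be wired up: runAfterDelay and its injectable schedule parameter are hypothetical names, and the 2000 ms default simply mirrors the “about 2 seconds” mentioned above.

```javascript
// Run a task once after a delay. The scheduler is a parameter so the
// function can be exercised without actually waiting.
function runAfterDelay(task, delayMs = 2000, schedule = setTimeout) {
  return schedule(task, delayMs);
}

// In the widget this might be used as (scrapeAndShowGuides is hypothetical):
//   runAfterDelay(scrapeAndShowGuides);
```

A fixed delay is a guess about how long the availability check takes; on a slow connection the call numbers may still be missing when the script runs.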

Reflection

We hit a character limit with Ebsco Discovery Service’s widgets.  The full code, including all of our subject guides, exceeds the allotted 8000 characters.  When we tried hosting the code on another server and bringing it in with an iframe, we found that most modern browsers prohibit the kind of screen scraping we’re doing unless the code doing the scraping is on the same domain as the page being scraped.  For now, we’ve asked Ebsco to raise the character limit so we can finalize this project.

This only works if the search results have books with call numbers, and will change from page to page of results depending on the results on that page.  We tried to identify other sources of finite, subject-rich information on the page, but only subject headings provide that – and even then, there are too many subject headings from the variety of databases we have to make it worthwhile to attempt to map them to subject guides.

This is a bit more work intensive than other projects – if a new guide is produced, it will require identifying relevant call number ranges and adding code in three different places in the widget.  Same if a guide is deleted.  The work isn’t difficult, but there is maintenance to be done.

The links and what they do should probably be more obvious; we haven’t done any user testing to see if users notice them, or if they’d be useful.

Our Research Guides need to be standardized.  I mentioned this in my Blackboard / Research Guides post, so this is just raising the urgency for improving our guides.

This entry was posted in Ebsco Discovery Service, LibGuides.
