LIS 7610/CSC 7481: Homework 4

Please turn in this homework by linking it from your course Web page DURING CLASS on the due date. Please do not link it before that time.

Assignment adapted from James Allan's CMPSCI 646 course (Fall, 2004) at U. Mass. and Doug Oard's LBSC 796/INFM 718R (Fall 2007) at UMD.

Last updated: August 24, 2009.


Part 2: Evaluating Systems

This Excel spreadsheet contains three separate sets of judgments for the hits examined in Homework 2. For one of the topics, you will:

  1. analyze agreement on the relevance judgments
  2. adjudicate the judgments
  3. use the adjudicated set to evaluate both Yahoo and Google

1. Agreement on Relevance Judgments

The above spreadsheet should contain three sets of judgments for every document (Web page). The first question you'll answer is: How often do judges agree on relevance? There are four possibilities:

For your chosen topic, determine how often each case occurs, as both a count and a percentage, and turn this information in. Then pick three hits on which the judges disagree and briefly speculate about why, drawing on the concepts of relevance discussed in lecture. Turn this in as well.
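The tally described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: judgments are taken to be binary (1 = relevant, 0 = not relevant), the four cases are taken to be 3, 2, 1, or 0 of the three judges marking a hit relevant, and the document ids and votes are made up rather than taken from the spreadsheet.

```python
from collections import Counter

# Hypothetical judgments for one topic: document id -> three binary votes
# (1 = relevant, 0 = not relevant). Replace with the spreadsheet values.
judgments = {
    "doc01": (1, 1, 1),
    "doc02": (1, 1, 0),
    "doc03": (0, 1, 0),
    "doc04": (0, 0, 0),
    "doc05": (1, 0, 1),
}

def agreement_table(judgments):
    """Map each case (3, 2, 1, 0 relevant votes) to (count, percentage)."""
    counts = Counter(sum(votes) for votes in judgments.values())
    total = len(judgments)
    if total == 0:
        return {k: (0, 0.0) for k in (3, 2, 1, 0)}
    # Report all four cases, even when a case never occurs.
    return {k: (counts.get(k, 0), 100.0 * counts.get(k, 0) / total)
            for k in (3, 2, 1, 0)}

table = agreement_table(judgments)
for k, (n, pct) in table.items():
    print(f"{k} of 3 judges say relevant: {n} documents ({pct:.1f}%)")
```

Hits that fall into the 2-of-3 or 1-of-3 cases are the non-uniform ones, so they are the candidates for the three disagreements you discuss.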

2. Adjudication

Adjudication is simply the process of reconciling inconsistent judgments. Do this by simple majority voting. You do not need to turn anything in for this, but you will need the results for the third question.
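As a minimal sketch of majority voting, again assuming binary judgments and using made-up document ids, a hit can be marked relevant when more than half of its votes say so. The function name `adjudicate` is hypothetical and not part of the assignment materials.

```python
def adjudicate(votes):
    """Return 1 if a strict majority of votes are 1 (relevant), else 0."""
    return 1 if 2 * sum(votes) > len(votes) else 0

# Hypothetical judgments: document id -> three binary votes.
judgments = {
    "doc01": (1, 1, 1),
    "doc02": (1, 1, 0),
    "doc03": (0, 1, 0),
    "doc04": (0, 0, 0),
}

# One adjudicated label per document.
adjudicated = {doc: adjudicate(votes) for doc, votes in judgments.items()}
print(adjudicated)
```

With three judges a tie cannot occur, so no tie-breaking rule is needed here.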

3. Evaluation of Yahoo and Google

Now, evaluate Yahoo and Google using the adjudicated relevance judgments you just created (for the topic you chose). Make sure you are pooling judgments from both systems! Turn in the following information for both search engines:

In addition, answer the following questions:
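
Returning to the evaluation in section 3, the pooling step can be sketched as follows. This is an illustration only: the ranked lists and labels are invented, and precision at k and recall against the pool are assumed here as example measures, since they are common choices and not necessarily the ones required for this assignment.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top k results that are in the relevant set."""
    if k <= 0:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / k

def pooled_recall(ranked, relevant):
    """Fraction of the pooled relevant documents that this engine returned."""
    if not relevant:
        return 0.0
    return len(set(ranked) & relevant) / len(relevant)

# Hypothetical ranked results for one topic and adjudicated labels.
yahoo = ["d1", "d2", "d3", "d4"]
google = ["d3", "d5", "d1", "d6"]
adjudicated = {"d1": 1, "d2": 0, "d3": 1, "d4": 0, "d5": 1, "d6": 0}

# Pool: every document returned by either engine. The relevant set is
# drawn from the whole pool, not from one engine's results alone.
pool = set(yahoo) | set(google)
relevant = {doc for doc in pool if adjudicated.get(doc, 0) == 1}

for name, ranked in (("Yahoo", yahoo), ("Google", google)):
    p = precision_at_k(ranked, relevant, 4)
    r = pooled_recall(ranked, relevant)
    print(f"{name}: P@4 = {p:.2f}, recall against pool = {r:.2f}")
```

Building the relevant set from the pool matters for recall: a document found only by Google still counts against Yahoo's recall, and vice versa.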