LIS 7610/CSC 7481 Readings
Adapted from Doug Oard's LBSC 796/INFM 718R
Fall 2007.
Last update: June 4 2009.
The principal text for this course (referred to below as "MRS" for the
authors' initials) is Christopher D. Manning, Prabhakar Raghavan and
Heinrich Schuetze, An
Introduction to Information Retrieval, 2008. This book is also available on the Web.
Some other books on information retrieval:
- Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern
Information Retrieval, Addison Wesley, 1999.
- Ian H. Witten, Alistair Moffat, and Timothy C. Bell,
Managing Gigabytes, Morgan Kaufmann, Second Edition,
1999.
- David A. Grossman and Ophir Frieder, Information Retrieval:
Algorithms and Heuristics, Kluwer Academic, 2004.
- William B. Frakes and Ricardo Baeza-Yates (ed.), Information
Retrieval: Data Structures and Algorithms, Prentice-Hall,
1992.
- Tomek Strzalkowski (ed.), Natural Language Information
Retrieval, Kluwer, 1999.
- Christopher D. Manning and Heinrich Schuetze, Foundations of
Statistical Natural Language Processing, MIT Press, 1999.
- Karen Sparck Jones and Peter Willett (eds.), Readings in
Information Retrieval, Morgan-Kaufmann, 1997.
- David C. Blair, Language and Representation in
Information Retrieval, Elsevier Science, 1990.
Downloading readings from the Web may require Microsoft Word or
Acrobat Reader, depending on the format.
Readings for Session 1 (Overview)
Required Readings:
- MRS Chapter 1: IR Using the Boolean Model
Recommended Readings:
- Tefko Saracevic, (1999) Information
Science. Journal of the American Society for Information
Science, 50(12)1051-1063.
- David C. Blair, Language and Representation in
Information Retrieval, Elsevier Science, 1990. Chapter 1,
pages 1-10.
Readings for Session 2 (Evidence from Content)
Required Readings:
- MRS Chapter 2: The dictionary and postings lists
- MRS Chapter 3: Tolerant retrieval
Recommended Readings:
- Christopher Manning and Heinrich Schuetze, Foundations of
Statistical Natural Language Processing, Chapter 5
(Collocations), MIT Press, 1999. Available from the
book's Web site.
- George A. Miller. (1995) WordNet: A
lexical database for English. Communications of the ACM, 38(11), 39-41. Also available from
the ACM Digital Library.
- Prager, John, Eric Brown, Anni Coden and Dragomir Radev.
"Question-Answering by Predictive Annotation," in
Proceedings of the 23rd Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval
July 24-28, 2000, Athens, Greece, pp. 184-191. Available on
campus from the
ACM Digital Library.
- Donna Harman "Inverted Files," in William B. Frakes and
Ricardo Baeza-Yates, Information Retrieval: Data Structures and
Algorithms, Prentice Hall, 1992, Chapter 3. Available at LSU Middleton Library.
Readings for Session 3 (Ranked Retrieval)
Required Readings:
- MRS Chapter 6. Scoring and Term Weighting
- MRS Chapter 7. Vector Space Retrieval
- Djoerd Hiemstra and Arjen P. de Vries, "Relating the New
Language Models of Information Retrieval to the Traditional Retrieval
Models," Technical Report TR-CTIT-00-09. Available from CiteSeer.
Recommended Readings:
- S.E. Robertson et al, "Okapi at TREC-3," Proceedings of the
Third Text Retrieval Conference, 1994. Available on the TREC
Web site.
- Donna Harman "Ranking Algorithms," in William B. Frakes and
Ricardo Baeza-Yates, Information Retrieval: Data Structures and
Algorithms, Prentice Hall, 1992, Chapter 14. Available at LSU Middleton Library.
- Amit Singhal, "Pivoted Document Length Normalization," SIGIR
1996. Available through the ACM
Digital Library.
- James Allan, ed. "Challenges in Information Retrieval and
Language Modeling", SIGIR Forum, 37(1)31-47, Spring, 2003.
Available from SIGIR or ACM Digital Library.
- David R. H. Miller, Tim Leek, and Richard M. Schwartz,
"A Hidden Markov Model Information Retrieval System,"
SIGIR 99. Available from the ACM
Digital Library.
- W. B. Croft and J. Lafferty, ed., Language Modeling for
Information Retrieval, Kluwer, 2003.
Readings for Session 4 (Interaction)
Required Readings:
Recommended Readings:
- Peter Pirolli and Stuart Card, "Information Foraging,"
Psychological Review. 106(4), 643-675, 1999.
Available through the LSU Libraries database search PsycInfo.
- Robert S. Taylor, "The Process of Asking Questions,"
American Documentation, 13(4)391-396, 1962. (Available from LSU Libraries, search for this journal in "EJOURNALS.")
- Efthimis N. Efthimiadis and Stephen E. Robertson. (1989)
Feedback and Interaction in Information Retrieval. In
Charles Oppenheim, ed., Perspectives in Information
Management. London: Butterworth.
Readings for Session 5 (Evaluation)
Required Readings:
- MRS Chapter 8: Evaluation in Information Retrieval
- Ellen M. Voorhees, "Variations in Relevance Judgments and the
Measurement of Retrieval Effectiveness," Information
Processing and Management, 36(5)697-716. Available on
campus from Science Direct.
Recommended Readings:
- Chris Buckley and Ellen M. Voorhees, "Evaluating Evaluation
Measure Stability", SIGIR 2000. Available through the ACM
Digital Library.
- Ellen M. Voorhees and Chris Buckley, "The Effect of Topic Set
Size on Retrieval Experiment Error," SIGIR 2002, Available through the ACM
Digital Library.
- R. Manmatha, Ao Feng and James Allan, "A Critical Evaluation of
TDT's Cost Function," SIGIR 2002. Available from the ACM
Digital Library.
- Stefano Mizzaro. (1999) How Many Relevances in Information
Retrieval? Interacting With Computers, 10(3)305-322.
- Andrew H. Turpin and William Hersh, "Why Batch and User
Evaluations Do Not Give the Same Results," SIGIR 2001.
Available from the ACM
Digital Library.
Readings for Session 6 (Web Search)
Required Readings:
- MRS Chapter 19. Web Search Basics
- MRS Chapter 20. Web Crawling and Indexing
Recommended Readings:
- Eric Brill, Jimmy Lin, Michele Banko, Susan Dumais, and Andrew
Ng. Data-Intensive Question Answering. Proceedings of the Tenth
Text REtrieval Conference (TREC 2001).
Readings for Session 7 (Evidence from Behavior)
Required Readings:
Recommended Readings:
- Larry Page, Sergey Brin, Rajeev Motwani and Terry Winograd, "The
PageRank Citation Ranking: Bringing Order to the Web," Stanford Digital Library Working Paper
SIDL-WP-1999-0120, 1998. Available from CiteSeer.
- Diane Kelly and Jaime Teevan, "Implicit Feedback for Inferring
User Preference: A Bibliography," SIGIR Forum, 37(2)18-28, Fall
2003. Available from the SIGIR
Forum Web site.
- Jon M. Kleinberg, "Authoritative Sources in a Hyperlinked
Environment," Journal of the ACM, 46(5)604-632. Available from the ACM
Digital Library.
- Douglas W. Oard and Jinmook Kim, "Modeling Information Content
Using Observable Behavior," in Proceedings of the 2001
Annual Meeting of the American Society for Information Science and
Technology, Washington, November, 2001. Available from Doug Oard's Web
site.
Readings for Session 8 (Scanned Documents)
Required Readings:
- David Doermann, "The Indexing and Retrieval of Document Images:
A Survey",
Computer Vision and Image Understanding, 70(3), 287-298,
1998. Available on campus from Science Direct.
- Toni M. Rath, R. Manmatha, and Victor Lavrenko, "A Search
Engine for Historical Manuscripts," SIGIR 2004. Available from
CIIR
Recommended Readings:
- Tseng, Y.-H. and Oard, D. W., Document Image Retrieval
Techniques for Chinese. In Proceedings of the 2001 Symposium
on Document Image Understanding Technology, Columbia, MD, 2001.
Available from Doug Oard's
Web site
Readings for Session 9 (Evidence from Metadata)
Required Readings:
- Nigel Shadbolt, Wendy Hall and Tim Berners-Lee, "The Semantic
Web Revisited," IEEE Intelligent Systems, 21(3)96-101, 2006.
- Diane Hillman, "National Science Digital Library (NSDL)
Metadata Primer," Web publication, 2003. Available
from the Open Archives
Initiative Web site.
Recommended Readings:
- Carl Lagoze and Herbert Van de Sompel, "The Open Archives
Initiative: Building a Low-Barrier Interoperability Framework,"
Proceedings of the First ACM/IEEE-CS Joint Conference on Digital
Libraries, Roanoke, VA, June 2001, pp. 54-62. Available
from the ACM
Digital Library.
Readings for Session 10 (Filtering)
Required Readings:
- MRS Chapter 9: Relevance Feedback and Query Expansion
- Joshua Goodman, Gordon V. Cormack and David Heckerman, Spam and
the Ongoing Battle for the Inbox, Communications of the ACM,
50(2)24-33, 2007. (available on campus from the ACM Digital
Library)
Recommended Readings:
- Douglas W. Oard, "The State of the Art in Text Filtering," User
Modeling and User-Adapted Interaction, 1997.
Readings for Session 11 (Audio)
Required Readings:
- William Byrne et al, "Automatic Recognition of Spontaneous
Speech for Access to Multilingual Oral History Archives," IEEE
Transactions on Audio and Speech Processing, 2004. Available on
campus from IEEE Xplore.
- Elias Pampalk, Simon Dixon and Gerhard Widmer, "Exploring Music
Collections by Browsing Different Views," in International
Conference on Music Information Retrieval, 2003. Available on
the ISMIR 2003
Web site.
Recommended Readings:
- Jonathan Foote, "An Overview of Audio Information Retrieval,"
ACM-Springer Multimedia Systems, 7(1)2-10,
1999. Available from CiteSeer
- John S. Garofolo, Cedric G. P. Auzanne and Ellen M. Voorhees,
"The TREC Spoken Document Retrieval Track: A success story,"
in Proceedings of the Eighth Text Retrieval Conference,
1999, pp. 107-130. Available from the TREC
Web site
- Rodger J. McNab, Lloyd A. Smith, Ian H. Witten, and Clare
L. Henderson, "Tune Retrieval in the Multimedia Library,"
Multimedia Tools and Applications, 10(2-3)113-132, 2000.
Available from the New Zealand Digital Library Web site.
Session 12: No Class
Readings for Session 13 (CLIR)
Required Readings:
- Lecture Notes: Cross-Language IR
- Gina-Anne Levow, Douglas W. Oard, Philip Resnik,
"Dictionary-Based Techniques for Cross-Language Information
Retrieval," Information Processing and Management, 41(3), 2005.
Available from Levow (.ps) or available on campus from LSU Libraries Electronic Journals Service.
Recommended Readings:
- Daqing He, et al., "Making MIRACLEs: Interactive Translingual
Search for Cebuano and Hindi," ACM Transactions on Asian
Language Information Processing, 2(2-3). Available from the
ACM Digital Library.
Readings for Session 14 (Images and Video)
Required Readings:
- Chad Carson, Serge Belongie, Hayit Greenspan and Jitendra
Malik, "Blobworld: Image Segmentation Using
Expectation-Maximization and Its Application to Image Querying,"
IEEE Transactions on Pattern Analysis and Machine Intelligence,
24(8)1026-1038, 2002. Available on campus from IEEE
Xplore.
- Alan Smeaton, Wessel Kraaij and Paul Over, "TRECVID-2003 Video
Retrieval Evaluation Overview," Powerpoint slides, 2003.
Available from the TRECVID
Web site.
Recommended Readings:
- Venkat N. Gudivada and Vijay V. Raghavan, "Modeling and
Retrieving Images by Content," Information Processing and
Management, 33(4)427-452, 1997. Available on campus from
Science Direct.
- Howard Wactlar et al., "Complementary Audio and Video Analysis
for Broadcast News Archives," Communications of the
ACM, 43(2)42-47, 2000. Available on campus from the ACM
Digital Library.