LIS 7008 - Information Technologies
Spring 2012 - Section 01
Assignment 2


This homework is due on your course Web site before the beginning of the next class session. Partial credit may be awarded. Please test your Web account with FileZilla as early as possible. Do not wait until the last minute to upload your homework file, in case there is an issue.

As a reference librarian or information user, sometimes you want to find authoritative information on the Web for users or for yourself. The purpose of this exercise is to learn the techniques for identifying who is responsible for the content of a Web page.

I have encountered an interesting Web page at http://www.flyonthewall.tv/casestudies.php?site=5, and would like to know more about the people or organization responsible for the content on this site. We know at least five ways to find out who really runs a site, so let's give them a try:

  1. By following links from that Web page, see if you can find a page on the same site that makes a claim of organizational or individual responsibility for the content on the site.
  2. Sometimes no appropriate links are provided. In such cases, URL trimming sometimes offers a way of finding a page on which a claim of responsibility is made. The idea is to remove parts of the URL starting at the right until you get to a page where such a claim is made. For example, the Web page for section 01 of this course is http://www.csc.lsu.edu/~wuyj/Teaching/7008/sp12/. URL trimming would eventually get you back to http://www.csc.lsu.edu/~wuyj, where you would be redirected to my home page. Overtrimming to http://www.csc.lsu.edu would be less useful in this case, since the CSC server hosts unrelated information from many people.
  3. Sometimes it is not possible to find anything that resembles a claim of responsibility, and sometimes that claim may be misleading (for example, if you found a Web page from the "Committee to Re-Elect the President," you might want to know something more about that organization). One way to do that is to look at the domain name registry to see to whom the domain name is registered. Sometimes you will find the full domain name registered; other times you may find that only a part of the name is registered. In this case, you want to trim the URL from the right until you get to the domain name, and then trim the domain name from the left ("www.umiacs.umd.edu" would become "umiacs.umd.edu" and then "umd.edu"). A useful site for looking up domain names is http://www.networksolutions.com/whois/index.jsp
  4. Some top-level domain names are assigned to organizations (the U.S. government owns ".gov," for example) or to countries (the United Kingdom owns ".uk"). So in this case it would be useful to know who owns ".tv". If you do much of this, you will learn to recognize some of the more common top-level domain names. Several sites provide this sort of information; one can be found (with some poking around) at http://www.iana.org/.
  5. Ultimately, the packets that you send to a host have to know how to get there. You can follow that path using a "traceroute" service. One such service is available at http://whatismyipaddress.com/traceroute-tool. You need to type in an IP address in the search box. You can find the IP address of a Website using this tool. The traceroute service will provide quite a lot of detail on how packets get from the server that hosts whatismyipaddress.com to any site you specify.

The homework assignment is to use all of these techniques to determine who is responsible for the content that you see on the site given above. Describe what you find using each of the five techniques in an HTML file named FirstName_LastName_hw2.html (for example, John_Smith_hw2.html). Also discuss possible causes for any inconsistencies that you discover.
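
The trimming described in techniques 2 and 3 can be sketched in a few lines of Python. This is only an illustration, not a required part of the homework: the function names trim_url and trim_domain are made up for this example, and the sketch only lists candidate URLs and domain names. You still have to visit each candidate or look it up yourself.

```python
from urllib.parse import urlsplit, urlunsplit


def trim_url(url):
    """List shorter URLs made by removing pieces of the URL from the right."""
    parts = urlsplit(url)
    candidates = []
    # The first trim drops the query string (everything after "?"), if any.
    if parts.query:
        candidates.append(
            urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        )
    # Then remove one path segment at a time, starting at the right.
    segments = [s for s in parts.path.split("/") if s]
    while segments:
        segments.pop()
        path = "/" + "/".join(segments)
        if segments:
            path += "/"
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates


def trim_domain(hostname):
    """List domain names made by removing labels from the left."""
    labels = hostname.split(".")
    # Stop before the bare top-level domain (for example "edu").
    return [".".join(labels[i:]) for i in range(len(labels) - 1)]


for candidate in trim_url("http://www.csc.lsu.edu/~wuyj/Teaching/7008/sp12/"):
    print(candidate)
print(trim_domain("www.umiacs.umd.edu"))
```

For the course page, the candidates run from http://www.csc.lsu.edu/~wuyj/Teaching/7008/ down to http://www.csc.lsu.edu/, which is the overtrimmed case mentioned in technique 2. Deciding where to stop is still a judgment you make by reading each page.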

To help me read your solution, please use the following structure to craft your report:

In other words, please use at least 6 paragraphs in your report.

Post the HTML file on your Web site, then email the instructor the URL for accessing your HTML file. Do not revise that file for 3 days after the due date. Reminder: your URL is in this format: http://classes.slis.lsu.edu/wu/7008/sp12/your_folder/FirstName_LastName_hw2.html where your_folder is your first initial followed by your last name, all in lower case, such as jsmith for John Smith. We use your official first and last name in the class roster to create your folder on the SLIS Web server.
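
If the URL format is confusing, the following Python sketch shows how the pieces fit together. The function name submission_url is made up for this example, and the sketch assumes a simple first and last name with no spaces or hyphens; your actual folder name follows the class roster.

```python
BASE_URL = "http://classes.slis.lsu.edu/wu/7008/sp12/"


def submission_url(first_name, last_name):
    """Build the homework URL from a student's first and last name."""
    # Folder: first initial followed by the last name, all in lower case.
    folder = (first_name[0] + last_name).lower()
    # Filename keeps the capitalization of the name, as in John_Smith_hw2.html.
    filename = f"{first_name}_{last_name}_hw2.html"
    return f"{BASE_URL}{folder}/{filename}"


print(submission_url("John", "Smith"))
# http://classes.slis.lsu.edu/wu/7008/sp12/jsmith/John_Smith_hw2.html
```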

Hope everything is clear. Some students might still have no clue what they are supposed to do. Again, two tasks need to be finished:

  1. Suppose you are an FBI agent and you are given an assignment to investigate who is responsible for the content of that Website. You can use multiple techniques (discussed above AND in the slides) to find out the results, which may or may not be consistent with each other. However, you can make a cogent story from those results. Librarians do this "information resource authority check" for information users often, too. You need to read the slides carefully. Specific techniques are discussed in the slides.
  2. You are supposed to write an HTML file to record your work, then upload that HTML file onto our class Web server, then email me the URL for accessing your HTML file. In other words, do NOT submit a .txt (or .pdf, .doc) file. The URL here starts with http://classes.slis.lsu.edu/wu/7008/sp12/...; do NOT submit the location of a file on your local computer starting with C://...

Some students had some difficulty using FileZilla to upload the html file to the class Web server, or using a browser to render the html file. Here are some common problems and solutions.
  1. Problem: FileZilla cannot be clicked and initiated.
    Solution: Make sure you have installed the right version of FileZilla for your operating system (Windows or Mac). FileZilla is available from Tigerware and the Web.

  2. Problem: unable to connect to the class Web server using FileZilla.
    Solution: Make sure that you have entered the host, username, and password correctly. FileZilla reports whether a connection is successful or not.

  3. Problem: unable to drag my file from my local computer to the class Web server.
    Solution: Make sure that a successful connection is established (see (2)). Without a successful connection, you cannot drag a file from your local computer to the server. If you are sure that a successful connection is already established but the problem remains, or FileZilla reports "open for write: permission denied," it is very likely that the server is busy, so please try again a couple of minutes later. If the problem persists, please email me, and I will take a look at your account.

  4. Problem: frustrated or panicking.
    Solution: If you can make screenshots of your steps (at least the final step), send them to me and I will troubleshoot for you; or come to see me, the TA, anybody in the SLIS Lab, or your classmates.

  5. Problem: unable to view an HTML file using a browser.
    Solution: Make sure you have used the correct URL. Check your folder name (which is your first initial followed by your last name, all in lower case) and your filename. Our server is a Linux machine, so your folder name and filenames are all case-sensitive.

  6. Problem: HTML tags are shown on my Web page.
    Solution: Make sure you have saved your file as .html (rather than .html.txt), and make sure you have closed the tags that are supposed to be closed.

  7. Problem: I cannot find Notepad on my computer.
    Solution: If you use a Windows machine, Notepad is under "All Programs" --> "Accessories." You can also download Notepad++ from the Web. If you use a Mac machine, try to find TextEdit.

  8. Problem: When I use TextEdit on my Mac and put the tags in (following the steps in the Huddleston text), they always show up when I bring up the Web page in my browser.
    Solution: There can be multiple reasons. Try the following: (1) turn on TextEdit's "plain text" mode; (2) close the tags that are supposed to be closed, such as title and head; (3) save the file as .html (or .htm) rather than .txt.
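
For problems 6 and 8, one way to look for tags that were never closed is a short Python script built on the standard html.parser module. This is a rough sketch, not a validator: the function name find_unclosed_tags and the short list of tags that never need a closing tag are choices made for this example, and the check is stricter than browsers, which tolerate a missing closing tag on some elements such as p and li.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag (a partial list, chosen for this sketch).
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}


class TagChecker(HTMLParser):
    """Track opening tags and record those that are never closed."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # tags opened and not yet closed
        self.unclosed = []   # tags found to be missing a closing tag

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/> needs no closing tag.
        pass

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag; ignored in this sketch
        # Everything opened after the matching tag was never closed.
        while self.open_tags:
            top = self.open_tags.pop()
            if top == tag:
                break
            self.unclosed.append(top)


def find_unclosed_tags(html_text):
    """Return the names of tags that were opened but never closed."""
    checker = TagChecker()
    checker.feed(html_text)
    checker.close()
    return checker.unclosed + checker.open_tags


page = "<html><body><h1>Report<p>Hi</p></body></html>"
print(find_unclosed_tags(page))  # the h1 tag is never closed
```

To check your own file, read it into a string first (for example with open("John_Smith_hw2.html").read()) and pass that string to find_unclosed_tags. An empty list means every tag the sketch tracks was closed.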

Grading rubric:


Acknowledgment to Doug Oard, revised by Yejun Wu.