Sunday, July 09, 2006

Cataloging, indexing Web sites - Librarian's Perspectives

Given below are some samples showing
a) how a Web site is catalogued (as input),
b) which descriptors are used (for throughput), and
c) in what format the data is available (as output).

  • Middle East Virtual Library: Islamic Libraries and Libraries with Islamic collections
  • Conscious.Be: Islamic Libraries and Information Centres
  • Middle East Virtual Library. Book in the Islamic Civilization

    NB. If you need any clarification and/or similar information on cataloging Web sites, please email me at mt2222 at

    See also:

  • Multimedia Metadata Standards
    Metadata is an important aspect of the creation and management of digital images (and other multimedia files). Metadata standards for digital imaging can include information about:

    the technical format of the image file
    the process by which the image was created
    the content of the image

  • Metadata extraction and harvesting: A comparison of two automatic metadata generation applications, Jane Greenberg, Journal of Internet Cataloging, 2003, vol. 6, no. 4, pp. 59-82
    This research explores the capabilities of two Dublin Core automatic metadata generation applications, Klarity and DC-dot. The top-level Web page for each resource, from a sample of 29 resources obtained from the National Institute of Environmental Health Sciences (NIEHS), was submitted to both generators. Results indicate that extraction processing algorithms can contribute to useful automatic metadata generation. Results also indicate that harvesting metadata from META tags created by humans can have a positive impact on automatic metadata generation. The study identifies several ways in which automatic metadata generation applications can be improved and highlights several important areas of research. The conclusion is that integrating extraction and harvesting methods will be the best approach to creating optimal metadata, and more research is needed to identify when to apply which method.

  • Automatic Metadata Generation for Web Pages Using a Text Mining Approach
    Hsin-Chang Yang, Chung-Hong Lee, Chang Jung University
  • Metadata Schema Used in OCLC Sampled Web Pages, Fei Yu, Journal of Educational Media & Library Sciences, 2005, 43: 2, 129-152
    The tremendous growth of Web resources has made information organization and retrieval more and more difficult. As one approach to this problem, metadata schemas have been developed to characterize Web resources. However, many questions have been raised about the use of metadata schemas, such as: Which metadata schemas have been used on the Web? How did they describe Web-accessible information? What is the distribution of these metadata schemas among Web pages? Do certain schemas dominate the others? To address these issues, this study analyzed 16,383 Web pages with meta tags, extracted from 200,000 OCLC-sampled Web pages in 2000. It found that only 8.19% of Web pages used meta tags; description tags, keyword tags, and Dublin Core tags were the only three schemas used in the Web pages. This article revealed the use of meta tags in terms of their function distribution, syntax characteristics, granularity of the Web pages, and the length distribution and word number distribution of both description and keywords tags.
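
The harvesting approach mentioned in the abstracts above (reading META tags that a human has already put on a page) can be illustrated with a small Python sketch. This is not how Klarity, DC-dot, or the OCLC study worked; it is only a minimal example, using the standard library, of collecting META tags and sorting them into the three groups the OCLC study reports (description, keywords, Dublin Core). The sample page, the class name MetaTagHarvester, and the functions classify and harvest are all made up for illustration, and treating any name that starts with "DC." as Dublin Core is an assumption.

```python
from html.parser import HTMLParser


class MetaTagHarvester(HTMLParser):
    """Collect (name, content) pairs from the <meta> tags of an HTML page."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Skip tags without both a name and a content value (e.g. http-equiv).
        if name and content:
            self.tags.append((name.strip(), content.strip()))


def classify(name):
    """Assign a meta tag name to a schema group (assumed naming rules)."""
    lowered = name.lower()
    if lowered.startswith("dc."):
        return "dublin_core"
    if lowered == "description":
        return "description"
    if lowered == "keywords":
        return "keywords"
    return "other"


def harvest(html):
    """Return the page's meta tags grouped by schema."""
    parser = MetaTagHarvester()
    parser.feed(html)
    parser.close()
    grouped = {}
    for name, content in parser.tags:
        grouped.setdefault(classify(name), []).append((name, content))
    return grouped


# Hypothetical page, invented for this example.
SAMPLE_PAGE = """
<html><head>
<title>Islamic Libraries</title>
<meta name="description" content="A directory of libraries with Islamic collections">
<meta name="keywords" content="libraries, Islamic collections, directories">
<meta name="DC.title" content="Islamic Libraries and Information Centres">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head><body></body></html>
"""

if __name__ == "__main__":
    for schema, pairs in harvest(SAMPLE_PAGE).items():
        for name, content in pairs:
            print(f"{schema}: {name} = {content}")
```

A cataloger would still need to review the harvested values, since the tags are only as good as whoever wrote them; extraction from the page text itself, the other method discussed above, is not attempted here.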