Given below are some samples of
a) how a Web site is catalogued (as input),
b) what descriptors are used (for thruput) and
c) what format is the data available (as output)
NB. If you need any clarification, and / or similar info. on cataloging Web sites, please email me at mt2222 at yahoo.com
Metadata is an important aspect of the creation and management of digital images (and other multimedia files). Metadata standards for digital imaging can include information about:
the technical format of the image file
the process by which the image was created
the content of the image
This research explores the capabilities of two Dublin Core automatic metadata generation applications, Klarity and DC-dot. The top level Web page for each resource, from a sample of 29 resources obtained from National Institute of Environmental Health Sciences (NIEHS), was submitted to both generators. Results indicate that extraction processing algorithms can contribute to useful automatic metadata generation. Results also indicate that harvesting metadata from META tags created by humans can have a positive impact on automatic metadata generation. The study identifies several ways in which automatic metadata generation applications can be improved and highlights several important areas of research. The conclusion is that integrating extraction of harvesting methods will be the best approach to creating optimal metadata, and more research is needed to identify when to apply which method.
Hsin-Chang Yang, Chung-Hong Lee, Chang Jung University
The tremendous growth of Web resources has made information organization and retrieval more and more difficult. As one approach to this problem, metadata schemas have been developed to characterize Web resources. However, many questions have been raised about the use of metadata schemas such as which metadata schemas have been used on the Web? How did they describe Web accessible information? What is the distribution of these metadata schemas among Web pages? Do certain schemas dominate the others? To address these issues, this study analyzed 16,383 Web pages with meta tags extracted from 200,000 OCLC sampled Web pages in 2000. It found that only 8.19% Web pages used meta tags; description tags, keyword tags, and Dublin Core tags were the only three schemas used in the Web pages. This article revealed the use of meta tags in terms of their function distribution, syntax characteristics, granularity of the Web pages, and the length distribution and word number distribution of both description and keywords tags.