Scope
- The Hidden, Invisible, or Deep Internet/Web consists of upwards of
1 trillion documents, other files, and even Web pages that are not
directly accessible by conventional search tools.
- The "hidden" content typically consists of files and documents
in:
- Databases that can only be
reached through a query (database search-and-retrieval)
interface. Accessing most of these databases requires some
kind of license and/or permission and, consequently, a login
and a password.
- Virtual corporate networks,
each of which may have upwards of 100,000 Web pages and
other files intended only for employee, intra-company, or
inter-company access.
- Formats that cannot be
readily read or accessed across the Internet.
- Formats whose contents cannot
be easily indexed by most Search Engines.
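To see why query-only databases stay hidden, consider this minimal Python sketch. It is a toy simulation, not a real crawler: the `PAGES` link graph, the `DATABASE` records, and the `search_database` function are all hypothetical stand-ins. A link-following crawler reaches every linked page but never the records that exist only as answers to a query.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
PAGES = {
    "/": ["/about", "/search"],
    "/about": ["/"],
    "/search": [],  # a query form: no links to individual records
}

# Hypothetical records that can only be reached by submitting a query.
DATABASE = {
    "whales": "Record 17: Whale migration survey",
    "tides": "Record 42: Tide tables",
}


def crawl(start):
    """Follow links breadth-first, as a conventional search tool would."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in PAGES.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


def search_database(term):
    """Stand-in for a database search-and-retrieval interface."""
    return DATABASE.get(term)


# The crawler finds the search page itself, but none of the records.
print(sorted(crawl("/")))
# A person (or tool) that submits a query can reach a record directly.
print(search_database("tides"))
```

The crawler's index contains only the three linked pages; the two records stay invisible to it because no link leads to them.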
Ways of Accessing
Hidden Internet Files -- Get library cards everywhere you
can
- Different libraries (academic and
public) provide access to varying combinations of common and
not-so-common databases.
- Consequently, it is worthwhile to
get a card at any and every library that you can.
- You can often get a "community"
card at many academic libraries, particularly community college
ones. It usually does not cost anything and allows you to check
books out of that library.
- 4-year colleges, particularly
private ones, are usually more restrictive or will not allow
non-students to have a library card. For example, San Jose State
University charges $100/year for a community card, though this
may change with the integration of the San Jose Public's King
Library with SJSU's Clark Library.
Ways of Accessing
Hidden Internet Files -- Use specialized tools for searching the "deep"
web
- Direct Search
- Lycos: Searchable Databases
- InfoMine
- The Invisible Web
Directory
- Invisible Web
- Provides links to thousands of
sites that have databases or other resources not readily
accessible by Search Engines
- http://www.invisibleweb.com/
- CompletePlanet
- Google
- This Search Engine now indexes
the contents of PDF (Acrobat) and DOC (Microsoft Word)
files, something that no Search Engine did previously.
- http://www.google.com
- Bright Planet & Deep Query
Manager (client-side application)
- Claims that it provides links
to over 35,000 databases and specialized search
engines.
- Deep Query Manager is a
client-side application for searching invisible web
resources.
- http://www.brightplanet.com
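As noted in the Google entry above, that Search Engine indexes PDF and DOC files. Its `filetype:` search operator restricts results to one such format. The Python sketch below only builds a search URL and does not contact any server; the helper name `filetype_query_url` and the sample query are illustrative assumptions, and the `/search?q=...` layout reflects how Google's public search page commonly works rather than a documented API.

```python
from urllib.parse import urlencode


def filetype_query_url(terms, extension):
    """Build a Google search URL limited to one file format.

    Hypothetical helper: it appends the filetype: operator to the
    search terms and URL-encodes the result.
    """
    query = f"{terms} filetype:{extension}"
    return "http://www.google.com/search?" + urlencode({"q": query})


# Example: look for PDF documents about tide tables.
print(filetype_query_url("tide tables", "pdf"))
```

Pasting the printed URL into a browser should run the search; swapping `"pdf"` for `"doc"` targets Microsoft Word files instead.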