About Me

I believe that information is a means to an end in meeting man's insatiable needs.

Wednesday, 2 September 2015

WHAT'S IN THE DEEP WEB

 
The importance of information gathering on the Web, the central and unquestioned role of search engines, and the frustrations users express about the adequacy of these engines make them an obvious focus of investigation.

Search Engines: Dragging a Net Across the Web's Surface

Search engines generally create an index of data by finding information that's stored on Web sites and other online resources. Most search engines build their indices with programs called spiders or crawlers, which locate domains and then follow hyperlinks to other domains, like an arachnid following the silky tendrils of a web, in a sense creating a sprawling map of the Web. For any content to be discovered this way, the page must be static and linked from other pages.
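To make the crawling idea concrete, here is a minimal sketch in Python. It does not reflect how any real search engine is built. The toy link graph, the page names, and the crawl function are all made up for illustration. It only shows that a crawler which follows hyperlinks from a starting page never reaches a page that nothing links to.

```python
from collections import deque

# Hypothetical toy "web": each page maps to the pages it links to.
LINKS = {
    "home": ["news", "about"],
    "news": ["article-1", "home"],
    "about": [],
    "article-1": ["news"],
    "orphan": ["home"],  # links out, but no other page links to it
}

def crawl(seed, links):
    """Follow hyperlinks breadth-first from a seed page and return every page reached."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

indexed = crawl("home", LINKS)
print(sorted(indexed))
print("orphan" in indexed)
```

In this toy graph the page named "orphan" stays out of the index because no crawled page points to it, which is the same reason unlinked content stays invisible to a link-following crawler.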

The deep Web is qualitatively different from the surface Web. Deep Web sources store their content in searchable databases that only produce results dynamically in response to a direct request.

Traditional search engines cannot "see" or retrieve content in the deep Web, because those pages do not exist until they are created dynamically as the result of a specific search. Because traditional search engine crawlers cannot probe beneath the surface, the deep Web has heretofore been hidden.
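The idea of a page that "does not exist" until someone asks for it can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular deep Web site: the records table, its sample rows, and the render_results function are invented for the example, and an in-memory SQLite database stands in for the site's searchable database.

```python
import sqlite3

# Hypothetical searchable database behind a deep Web site.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (title TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?)",
    [("Patent filing 1",), ("Court ruling",), ("Patent filing 2",)],
)

def render_results(term):
    """Build a results page on demand; nothing is stored as a static page."""
    rows = conn.execute(
        "SELECT title FROM records WHERE title LIKE ? ORDER BY title",
        (f"%{term}%",),
    ).fetchall()
    return "\n".join(title for (title,) in rows)

# The page text is produced only in response to this direct request.
page = render_results("Patent")
print(page)
```

A crawler that only follows hyperlinks has no link to follow here. The result text comes into being only when a query term is supplied.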
 
If the most coveted commodity of the Information Age is indeed information, then the value of deep Web content is immeasurable. With this in mind, BrightPlanet has quantified the size and relevancy of the deep Web in a study based on data collected between March 13 and 30, 2000. Its key findings include:
  • Public information on the deep Web is currently 400 to 550 times larger than the commonly defined World Wide Web.
  • The deep Web contains 7,500 terabytes of information compared to nineteen terabytes of information in the surface Web.
  • The deep Web contains nearly 550 billion individual documents compared to the one billion of the surface Web.
  • More than 200,000 deep Web sites presently exist.
  • Sixty of the largest deep-Web sites collectively contain about 750 terabytes of information — sufficient by themselves to exceed the size of the surface Web forty times.
  • On average, deep Web sites receive fifty per cent greater monthly traffic than surface sites and are more highly linked to than surface sites; however, the typical (median) deep Web site is not well known to the Internet-searching public.
  • The deep Web is the fastest-growing category of new information on the Internet.
  • Deep Web sites tend to be narrower, with deeper content, than conventional surface sites.
  • Total quality content of the deep Web is 1,000 to 2,000 times greater than that of the surface Web.
  • Deep Web content is highly relevant to every information need, market, and domain.
  • More than half of the deep Web content resides in topic-specific databases.
  • A full ninety-five per cent of the deep Web is publicly accessible information — not subject to fees or subscriptions.
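As a quick sanity check, the size ratios in the findings above can be recomputed from the figures quoted in the list. The variable names below are chosen for illustration, and the only inputs are the numbers stated in the findings. This is a rough consistency check, not a reproduction of the study's method.

```python
# Figures as quoted in the findings above.
deep_tb, surface_tb = 7_500, 19                            # terabytes
deep_docs, surface_docs = 550_000_000_000, 1_000_000_000   # documents
top60_tb = 750                                             # terabytes in the sixty largest sites

size_ratio = deep_tb / surface_tb       # deep vs. surface, by volume
doc_ratio = deep_docs / surface_docs    # deep vs. surface, by document count
top60_ratio = top60_tb / surface_tb     # sixty largest sites vs. surface

print(f"size ratio:  {size_ratio:.0f}x")
print(f"doc ratio:   {doc_ratio:.0f}x")
print(f"top-60 ratio: {top60_ratio:.1f}x")
```

The volume ratio comes out a little under 400 and the document ratio at 550, which appears to line up with the "400 to 550 times" range. The sixty largest sites work out to a little under forty times the surface Web.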
According to a study at the NEC Research Institute, most people searching the surface Web are searching only 0.03%, or about one in 3,000, of the pages available to them today. As for the rest of it? Well, a lot of it is buried in what's called the deep Web, sometimes also called the invisible or underneath Web.
On the dark Web, users really do bury data intentionally. Often, these parts of the Web are accessible only if you use special browser software that helps to peel away the onion-like layers of the dark Web.
This software maintains the privacy of both the source and the destination of data and the people who access it. For political dissidents and criminals alike, this kind of anonymity shows the immense power of the dark Web, enabling transfers of information, goods and services, legally or illegally, to the chagrin of the powers-that-be all over the world.

The deep Web is about 500 times larger than the surface Web, with, on average, about three times higher quality on a per-document basis according to BrightPlanet's document scoring methods. On an absolute basis, total deep Web quality exceeds that of the surface Web by thousands of times. The total number of deep Web sites likely exceeds 200,000 today and is growing rapidly. Content on the deep Web has meaning and importance for every information seeker and market. More than 95% of deep Web information is publicly available without restriction. The deep Web also appears to be the fastest-growing information component of the Web.

Serious information seekers can no longer ignore the importance or quality of deep Web information. But deep Web information is only one component of the total information available. Searching must evolve to encompass the complete Web. There is an obvious split within the Internet information search market: search directories that offer hand-picked information chosen from the surface Web to meet popular search needs; search engines for more robust surface-level searches; and server-side content-aggregation vertical "infohubs" for deep Web information, which provide answers where comprehensiveness and quality are imperative.

Our Web has really become tangled.

Credits to: The Journal of Electronic Publishing and Nathan Chandler
