Username: Save?
Password:
Home Forum Links Search Login Register*
    News: Welcome to the TechnoWorldInc! Community!
Recent Updates
[September 09, 2024, 12:27:25 PM]

[September 09, 2024, 12:27:25 PM]

[September 09, 2024, 12:27:25 PM]

[September 09, 2024, 12:27:25 PM]

[August 10, 2024, 12:34:30 PM]

[August 10, 2024, 12:34:30 PM]

[August 10, 2024, 12:34:30 PM]

[August 10, 2024, 12:34:30 PM]

[July 05, 2024, 02:11:09 PM]

[July 05, 2024, 02:11:09 PM]

[July 05, 2024, 02:11:09 PM]

[June 21, 2024, 01:43:48 PM]

[June 21, 2024, 01:43:48 PM]
Subscriptions
Get Latest Tech Updates For Free!
Resources
   Travelikers
   Funistan
   PrettyGalz
   Techlap
   FreeThemes
   Videsta
   Glamistan
   BachatMela
   GlamGalz
   Techzug
   Vidsage
   Funzug
   WorldHostInc
   Funfani
   FilmyMama
   Uploaded.Tech
   MegaPixelShop
   Netens
   Funotic
   FreeJobsInc
   FilesPark
Participate in the fastest growing Technical Encyclopedia! This website is 100% Free. Please register or login using the login box above if you have already registered. You will need to be logged in to reply, make new topics and to access all the areas. Registration is free! Click Here To Register.
+ Techno World Inc - The Best Technical Encyclopedia Online! » Forum » THE TECHNO CLUB [ TECHNOWORLDINC.COM ] » Techno Articles » Website Promotion
 How Search Engines Find Documents
Pages: [1]   Go Down
  Print  
Author Topic: How Search Engines Find Documents  (Read 542 times)
Shawn Tracer
TWI Hero
**********


Karma: 2
Offline Offline

Posts: 16072


View Profile
How Search Engines Find Documents
« Posted: March 04, 2008, 11:49:51 AM »


How Search Engines Find Documents
 by: Kamlesh Patel

Every document on the Web is associated with a URL (Uniform Resource Locator). Inthis context, we will use the terms “document” and “URL” interchangeably. This is an oversimplification, as some URLs return different documents to the user depending on such factors as their location, browser type, form input etc., but this terminology suits our purposes for now.

To find every document on the Web would mean more than finding every URL on the Web. For this reason, search engines do not currently attempt to locate every possible unique document, although research is always underway in this area. Instead, crawling search engines focus their attention on unique URLs; although some dynamic sites may display different content at the same URL (via form inputs or other dynamic variables), search engines will see that URL as a single page.

The typical crawling search engine uses three main resources to build a list of URLs to crawl. Not all search engines use all of these:

Hyperlinks on existing Web pages

The bulk of the URLs found in the databases of most crawling search engines consists of links found on Web pages that the spider has already crawled. Finding a link to a document on one page implies that someone found that link important enough to add it to their page.

Submitted URLs

All the crawling search engines have some sort of process that allows users or Website owners to submit URLs to be crawled. In the past, all search engines offered a free manual submission process, but now, many accept only paid submissions. Google is a notable exception, with no apparent plans to stop accepting free submissions, although there is great doubt as to whether submitting actually does anything.

XML data feeds

Paid inclusion programs, such as the Yahoo! Site Match system, include trusted feed programs that allow sites to submit XML-based content summaries for crawling and inclusion. As the Semantic Web begins to emerge, and more sites begin to offer RSS (RDF Site Summary) news feed files, some search engines have begun to read these files in order to find fresh content.

Search engines run multiple crawler programs, and each crawler program (or spider) receives instructions from the scheduler about which URL (or set of URLs) to fetch next. We will see how search engines manage the scheduling process shortly, but first, let’s take a look at how the search engine’s crawler program works.

Source: http://www.elitedatasolution.com

About The Author

Kamlesh Patel

I'm freelancer Search engine optimization expert from India. We provide Search engine optimization services including link building, meta tags etc.

[email protected]

Logged

Pages: [1]   Go Up
  Print  
 
Jump to:  

Copyright © 2006-2023 TechnoWorldInc.com. All Rights Reserved. Privacy Policy | Disclaimer
Page created in 0.208 seconds with 24 queries.