By: Ayisha Sumayya K

How Search Engines Index Web Pages?

Odoo 16

Website indexing is the process by which search engines analyze and categorize web pages for inclusion in their search results. The search engine's bots, also known as crawlers, visit web pages, scan their content, and add them to the search engine's index. This blog explains what meta tags are, how search engines index web pages, and how the index and follow directives can significantly influence a website's search engine optimization (SEO).

Meta Tags

Meta tags are HTML tags that give search engines and other clients more information about a page. Clients process the meta tags they understand and ignore those they don't support. Meta tags are placed in the head section of an HTML page and look like this:
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <meta name="description" content="Website Index">
  <meta name="keywords" content="SEO, Index">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>
</html>
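To see these tags the way a client might, here is a minimal Python sketch that collects the name and content pairs from a page using the standard library's html.parser module. The class name MetaTagCollector and the function collect_meta_tags are hypothetical names chosen for this illustration, and real crawlers are far more elaborate than this.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Tags without a name (for example <meta charset="UTF-8">) are skipped.
        if name and content is not None:
            self.meta[name.lower()] = content.strip()


def collect_meta_tags(html):
    """Return a dict mapping each meta tag name to its content value."""
    collector = MetaTagCollector()
    collector.feed(html)
    return collector.meta


PAGE = """<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <meta name="description" content="Website Index">
  <meta name="keywords" content="SEO, Index">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>
</html>"""

if __name__ == "__main__":
    for name, content in collect_meta_tags(PAGE).items():
        print(f"{name}: {content}")
```

The charset tag is skipped here because it has no name attribute. A client that cares about the character encoding would read it separately.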

Robots Meta Tag

The robots meta tag lets you control, page by page, how an individual page should be indexed and presented to users in Google search results. Place the robots meta tag in the page's head section:
<!DOCTYPE html>
<html>
<head>
  <meta name="robots" content="noindex,nofollow">
</head>
</html>
The example above uses the robots meta tag to instruct search engines not to show the page in search results and not to follow its links. The name value robots means the rule applies to all crawlers, which are also referred to as user agents because they identify themselves with a user agent when they access pages. Google's default web crawler is named Googlebot. To prevent only Google from indexing your page, update the tag as follows:
<meta name="googlebot" content="noindex"/>
If you want the page to appear in Google's web search results but not in Google News, use the following version instead:
<meta name="googlebot-news" content="noindex"/>
Use multiple robots meta tags to give different crawlers different rules:
<meta name="googlebot" content="noindex"/>
<meta name="googlebot-news" content="nosnippet"/>
The content value 'nosnippet' means that no text snippet or video preview should be displayed in the search results for the page. A static thumbnail (if available) may still be displayed when it improves the user experience. This applies to all forms of search results, including Google's web search, Google Images, and Discover. If you don't specify this rule, Google may generate a text snippet and video preview based on the information found on the page.
The rules are case-insensitive: lowercase and uppercase letters are treated the same. The commonly used rules for controlling indexing and snippet serving with the robots meta tag are as follows:
* all: There are no restrictions on indexing or serving. This is the default value and has no effect if explicitly listed.
* index: Show the page and its resources in search results.
* noindex: The opposite of 'index'. The page and its resources should not appear in search results. If you don't specify this rule, the page may be indexed and shown in search results.
* follow: Follow the links on this page.
* nofollow: The opposite of 'follow'. Do not follow the links on this page. If you don't specify this rule, the search engine may use the links on the page to discover the linked pages.
* none: Equivalent to noindex, nofollow.
* nosnippet: Don't display a text snippet or video preview in the search results for this page. A static image thumbnail, if available, may still be displayed when it improves the user experience. This applies to all forms of search results at Google, including Google Images, web search, and Discover.
* noarchive: Don't display a cached link in search results. If you don't specify this rule, Google might generate a cached page that visitors can reach through the search results.
* notranslate: Don't offer a translation of this page in search results. If you don't specify this rule, Google may translate the title link and snippet of a search result that isn't in the language of the search query. If the user clicks the translated title link, all further interaction with the page goes through Google Translate, which also translates any links that are followed.
* max-snippet: Use at most the specified number of characters as the text snippet for this search result. Image and video previews are not affected. This applies to all forms of search results. If no parseable number is supplied, the rule is ignored. If you don't specify this rule, Google chooses the length of the snippet.
<meta name="robots" content="max-snippet:0">
A value of 0 means that no snippet is shown, which is equivalent to the nosnippet rule.
<meta name="robots" content="max-snippet:-1">
A value of -1 means that there is no limit on the number of characters shown in the snippet.
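The rules above can be sketched in a few lines of Python. This is only an illustration of the behavior described in this section, not how any search engine actually implements it. The function names parse_robots_content and snippet_limit are hypothetical, and treating every negative number like -1 is an assumption made to keep the example short.

```python
def parse_robots_content(content):
    """Split a robots content value into a dict of rule name -> value (or None)."""
    rules = {}
    for part in content.split(","):
        part = part.strip().lower()  # rules are case-insensitive
        if not part:
            continue
        name, separator, value = part.partition(":")
        rules[name.strip()] = value.strip() if separator else None
    return rules


def snippet_limit(rules):
    """Return 0 for no snippet, None for no limit, or the maximum character count."""
    if "nosnippet" in rules:
        return 0
    raw_value = rules.get("max-snippet")
    try:
        limit = int(raw_value)
    except (TypeError, ValueError):
        # Missing rule or no parseable number: the rule is ignored.
        return None
    # Assumption: any negative value is treated like -1 (no limit).
    return None if limit < 0 else limit


if __name__ == "__main__":
    for content in ("max-snippet:0", "max-snippet:-1", "max-snippet:20", "NOSNIPPET"):
        print(content, "->", snippet_limit(parse_robots_content(content)))
```

Lowercasing each rule before storing it mirrors the case-insensitivity described above, so NOSNIPPET and nosnippet give the same result.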

Combine multiple rules in the same meta tag

Multiple rules can be combined in a single robots meta tag by separating them with commas, or they can be spread across several meta tags.
Here is a robots meta tag that tells web crawlers not to index the page and not to crawl any of its links:
<meta name="robots" content="noindex, nofollow">
The robots meta tag below tells web crawlers to index the page and crawl its links:
<meta name="robots" content="index, follow">
When several crawlers are specified with different rules, the search engine applies the sum of the negative rules. In the example below, Googlebot treats the page as both noindex and nofollow:
<meta name="robots" content="nofollow">
<meta name="googlebot" content="noindex">
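As a rough sketch of that "sum of the negative rules" behavior, the Python below gathers the rules addressed to all crawlers (robots) together with those addressed to one named crawler and keeps only the restrictive ones. The names RobotsMetaCollector, effective_negative_rules, and NEGATIVE_RULES are hypothetical, the set of negative rules is limited to those covered in this blog, and real crawlers may resolve conflicts differently.

```python
from html.parser import HTMLParser

# Restrictive rules covered in this blog (an illustrative, not exhaustive, set).
NEGATIVE_RULES = {"noindex", "nofollow", "nosnippet", "noarchive", "notranslate"}


class RobotsMetaCollector(HTMLParser):
    """Groups the rules found in meta tags by the crawler name they address."""

    def __init__(self):
        super().__init__()
        self.rules_by_crawler = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").strip().lower()
        content = attributes.get("content") or ""
        if not name:
            return
        rules = {rule.strip().lower() for rule in content.split(",") if rule.strip()}
        if "none" in rules:
            # 'none' is equivalent to noindex, nofollow.
            rules |= {"noindex", "nofollow"}
        self.rules_by_crawler.setdefault(name, set()).update(rules)


def effective_negative_rules(html, crawler):
    """Return the negative rules that apply to the given crawler name."""
    collector = RobotsMetaCollector()
    collector.feed(html)
    combined = set()
    for name in ("robots", crawler.lower()):
        combined |= collector.rules_by_crawler.get(name, set())
    return combined & NEGATIVE_RULES


PAGE = """<meta name="robots" content="nofollow">
<meta name="googlebot" content="noindex">"""

if __name__ == "__main__":
    print(sorted(effective_negative_rules(PAGE, "googlebot")))
    print(sorted(effective_negative_rules(PAGE, "otherbot")))
```

For the two tags above, a crawler named googlebot ends up with both noindex and nofollow, while any other crawler only picks up the nofollow rule addressed to robots.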

Conclusion

Website indexing is the process by which search engines analyze and categorize web pages for inclusion in search results. The 'index' and 'noindex' rules determine whether a page is included in or excluded from search results, and the 'follow' and 'nofollow' rules determine whether search engine bots crawl the links on a page. By understanding and applying these rules, website owners can improve their pages' visibility and search engine optimization.

