Development Book V18: Managing Sitemaps for the Odoo Website

A sitemap is an essential tool that contributes significantly to a website’s usability and search engine visibility. It acts as a structured file that outlines important information about the website’s pages and associated resources. Search engines use this data to efficiently crawl, index, and display relevant pages in search results, enhancing user access and overall SEO performance.

This blog explores how to customize an existing sitemap in an Odoo website. The process involves using the sitemap_xml_index method, which allows developers to modify and extend the default sitemap structure according to specific business or technical requirements.

@http.route('/sitemap.xml', type='http', auth="public", website=True, multilang=False, sitemap=False)
def sitemap_xml_index(self, **kwargs):
   current_website = request.website
   Attachment = request.env['ir.attachment'].sudo()
   View = request.env['ir.ui.view'].sudo()
   mimetype = 'application/xml;charset=utf-8'
   content = None
   url_root = request.httprequest.url_root
   # For a same website, each domain has its own sitemap (cache)
   hashed_url_root = md5(url_root.encode()).hexdigest()[:8]
   sitemap_base_url = '/sitemap-%d-%s' % (current_website.id, hashed_url_root)


   def create_sitemap(url, content):
       return Attachment.create({
           'raw': content.encode(),
           'mimetype': mimetype,
           'type': 'binary',
           'name': url,
           'url': url,
       })
   dom = [('url', '=', '%s.xml' % sitemap_base_url), ('type', '=', 'binary')]
   sitemap = Attachment.search(dom, limit=1)
   if sitemap:
       # Check if stored version is still valid
       create_date = fields.Datetime.from_string(sitemap.create_date)
       delta = datetime.datetime.now() - create_date
       if delta < SITEMAP_CACHE_TIME:
           content = base64.b64decode(sitemap.datas)


   if not content:
       # Remove all sitemaps in ir.attachments as we're going to regenerate them
       dom = [('type', '=', 'binary'), '|', ('url', '=like', '%s-%%.xml' % sitemap_base_url),
              ('url', '=', '%s.xml' % sitemap_base_url)]
       sitemaps = Attachment.search(dom)
       sitemaps.unlink()


       pages = 0
       locs = request.website.with_user(request.website.user_id)._enumerate_pages()
       while True:
           values = {
               'locs': islice(locs, 0, LOC_PER_SITEMAP),
               'url_root': url_root[:-1],
           }
           urls = View._render_template('website.sitemap_locs', values)
           if urls.strip():
               content = View._render_template('website.sitemap_xml', {'content': urls})
               pages += 1
               last_sitemap = create_sitemap('%s-%d.xml' % (sitemap_base_url, pages), content)
           else:
               break


       if not pages:
           return request.not_found()
       elif pages == 1:
           # rename the -id-page.xml => -id.xml
           last_sitemap.write({
               'url': "%s.xml" % sitemap_base_url,
               'name': "%s.xml" % sitemap_base_url,
           })
       else:
           # TODO: in master/saas-15, move current_website_id in template directly
           pages_with_website = ["%d-%s-%d" % (current_website.id, hashed_url_root, p) for p in range(1, pages + 1)]


           # Sitemaps must be split in several smaller files with a sitemap index
           content = View._render_template('website.sitemap_index_xml', {
               'pages': pages_with_website,
               # URLs inside the sitemap index have to be on the same
               # domain as the sitemap index itself
               'url_root': url_root,
           })
           create_sitemap('%s.xml' % sitemap_base_url, content)


   return request.make_response(content, [('Content-Type', mimetype)])
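To make the caching and splitting logic of this controller easier to follow, here is a minimal standalone sketch that runs outside Odoo. It mirrors two ideas from the method: the per-domain base name built from an MD5 hash of the URL root, and the way islice consumes a single generator chunk by chunk. The helper names, the example URL, and the tiny chunk size are assumptions made for illustration only; Odoo's own LOC_PER_SITEMAP constant is far larger.

```python
from hashlib import md5
from itertools import islice


def sitemap_base_url(website_id, url_root):
    """Build the per-website, per-domain base name used for cached sitemaps."""
    hashed_url_root = md5(url_root.encode()).hexdigest()[:8]
    return '/sitemap-%d-%s' % (website_id, hashed_url_root)


def split_into_pages(locs, per_page):
    """Consume one iterator chunk by chunk, like the controller's while loop."""
    locs = iter(locs)
    pages = []
    while True:
        chunk = list(islice(locs, 0, per_page))
        if not chunk:
            # Nothing left to render: stop, as the controller does on empty output.
            break
        pages.append(chunk)
    return pages


# Hypothetical data: five URLs, two per sitemap file.
urls = ({'loc': '/page-%d' % i} for i in range(1, 6))
pages = split_into_pages(urls, per_page=2)
base = sitemap_base_url(1, 'https://example.com/')
```

With five URLs and two per file, the sketch produces three chunks. In the controller, that is the multi-file case in which a sitemap index is written as well.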

When a sitemap file is generated in Odoo, it includes a list of URLs representing the pages available on the website. This is typically done using the following line of code:

locs = request.website.with_user(request.website.user_id)._enumerate_pages()

This line calls the _enumerate_pages() method on the website object, executed under the context of the website's public user (request.website.user_id). The method returns a collection of URLs (locations) that are used to build the sitemap. These URLs represent the different accessible pages on the website, which are then indexed for search engines through the sitemap.

This is a key step in dynamically creating a comprehensive and accurate sitemap for better SEO and page discoverability.

def _enumerate_pages(self, query_string=None, force=False):
   """ Available pages in the website/CMS. This is mostly used for links
       generation and can be overridden by modules setting up new HTML
       controllers for dynamic pages (e.g. blog).
       By default, returns template views marked as pages.
       :param str query_string: a (user-provided) string, fetches pages
                                matching the string
       :returns: a list of mappings with two keys: ``name`` is the displayable
                 name of the resource (page), ``url`` is the absolute URL
                 of the same.
       :rtype: list({name: str, url: str})
   """
   # ==== WEBSITE.PAGES ====
   # '/' already has a http.route & is in the routing_map so it will already have an entry in the xml
   domain = [('url', '!=', '/')]
   if not force:
       domain += [('website_indexed', '=', True), ('visibility', '=', False)]
       # is_visible
       domain += [
           ('website_published', '=', True), ('visibility', '=', False),
           '|', ('date_publish', '=', False), ('date_publish', '<=', fields.Datetime.now())
       ]
   if query_string:
       domain += [('url', 'like', query_string)]
   pages = self._get_website_pages(domain)
   for page in pages:
       record = {'loc': page['url'], 'id': page['id'], 'name': page['name']}
       if page.view_id and page.view_id.priority != 16:
           record['priority'] = min(round(page.view_id.priority / 32.0, 1), 1)
       if page['write_date']:
           record['lastmod'] = page['write_date'].date()
       yield record
   # ==== CONTROLLERS ====
   router = self.env['ir.http'].routing_map()
   url_set = set()
   sitemap_endpoint_done = set()
   for rule in router.iter_rules():
       if 'sitemap' in rule.endpoint.routing and rule.endpoint.routing['sitemap'] is not True:
           if rule.endpoint.func in sitemap_endpoint_done:
               continue
           sitemap_endpoint_done.add(rule.endpoint.func)


           func = rule.endpoint.routing['sitemap']
           if func is False:
               continue
           for loc in func(self.with_context(lang=self.default_lang_id.code).env, rule, query_string):
               yield loc
           continue
       if not self.rule_is_enumerable(rule):
           continue
       if 'sitemap' not in rule.endpoint.routing:
           logger.warning('No Sitemap value provided for controller %s (%s)' %
                          (rule.endpoint.original_endpoint, ','.join(rule.endpoint.routing['routes'])))
       converters = rule._converters or {}
       if query_string and not converters and (query_string not in rule.build({}, append_unknown=False)[1]):
           continue
       values = [{}]
       # converters with a domain are processed after the other ones
       convitems = sorted(
           converters.items(),
           key=lambda x: (hasattr(x[1], 'domain') and (x[1].domain != '[]'), rule._trace.index((True, x[0]))))
       for (i, (name, converter)) in enumerate(convitems):
           if 'website_id' in self.env[converter.model]._fields and (not converter.domain or converter.domain == '[]'):
               converter.domain = "[('website_id', 'in', (False, current_website_id))]"
           newval = []
           for val in values:
               query = i == len(convitems) - 1 and query_string
               if query:
                   r = "".join([x[1] for x in rule._trace[1:] if not x[0]])  # remove model converter from route
                   query = sitemap_qs2dom(query, r, self.env[converter.model]._rec_name)
                   if query == FALSE_DOMAIN:
                       continue
               for rec in converter.generate(self.env, args=val, dom=query):
                   newval.append(val.copy())
                   newval[-1].update({name: rec.with_context(lang=self.default_lang_id.code)})
           values = newval
       for value in values:
           domain_part, url = rule.build(value, append_unknown=False)
           pattern = query_string and '*%s*' % "*".join(query_string.split('/'))
           if not query_string or fnmatch.fnmatch(url.lower(), pattern):
               page = {'loc': url}
               if url in url_set:
                   continue
               url_set.add(url)
               yield page

This function enumerates every URL that ends up in the sitemap: first the published, indexed website pages, then the URLs produced by controller routes. Overriding it, or attaching a sitemap function to a route, lets you include additional URLs so that the sitemap stays complete and up to date.
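The way each page entry is assembled can be sketched outside Odoo as follows. This is only an illustration under stated assumptions: a plain dict stands in for a website.page record, and the view_priority key and the function name are hypothetical. The priority formula and the special case for the value 16 are taken from the method shown earlier.

```python
import datetime


def build_page_record(page):
    """Build one sitemap entry from a dict that stands in for a website.page record."""
    record = {'loc': page['url'], 'id': page['id'], 'name': page['name']}
    priority = page.get('view_priority')
    # As in the original method, a priority of 16 adds no 'priority' key.
    if priority is not None and priority != 16:
        # Scale the view priority to the 0..1 range and cap it at 1.
        record['priority'] = min(round(priority / 32.0, 1), 1)
    if page.get('write_date'):
        record['lastmod'] = page['write_date'].date()
    return record


# Hypothetical page data.
about = build_page_record({
    'url': '/about', 'id': 7, 'name': 'About',
    'view_priority': 10,
    'write_date': datetime.datetime(2024, 5, 1, 12, 0),
})
default_page = build_page_record({
    'url': '/contact', 'id': 8, 'name': 'Contact',
    'view_priority': 16, 'write_date': None,
})
```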

Adding a Records Page to Your Sitemap

To include record pages in the sitemap in Odoo 18, you first need to import the necessary sitemap method. Here's how you can begin:

from odoo.addons.website.models.ir_http import sitemap_qs2dom

The slug function generates clean, user-friendly URL segments for records. It works in conjunction with sitemap_qs2dom, a helper function that builds a search domain from the route path and the query string.

You can now define a new method to implement this functionality.

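from odoo import http
from odoo.http import request


class Main(http.Controller):
    # Plain function (no self): it is passed to the route decorator in this class body.
    def sitemap_records(env, rule, qs):
        slug = request.env['ir.http']._slug
        # 'your.model' is a placeholder for the model whose records you want listed
        records = env['your.model']
        # Build a search domain from the query string and the route prefix
        dom = sitemap_qs2dom(qs, '/url', records._rec_name)
        for r in records.search(dom):
            loc = '/url/%s' % slug(r)
            if not qs or qs.lower() in loc:
                yield {'loc': loc}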

sitemap_records is a Python generator function that is called during sitemap generation. Inside this function, a domain is created using sitemap_qs2dom and then used to search for records. The slug() helper builds the user-friendly URL segment for each record, and the resulting path is yielded as the entry's location.
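The filtering behaviour of such a generator can be tried outside Odoo with a small sketch. Everything here is a stand-in: simple_slug is a simplified, hypothetical replacement for Odoo's slug helper, and the records are plain dicts instead of a recordset. Only the loop and the query-string check follow the function described in this section.

```python
import re


def simple_slug(name, record_id):
    """Simplified stand-in for a slug helper: lowercased name plus the record id."""
    text = re.sub(r'[^a-z0-9]+', '-', name.lower()).strip('-')
    return '%s-%d' % (text, record_id) if text else str(record_id)


def sitemap_records_sketch(records, qs=None):
    """Yield one sitemap entry per record, filtered by the optional query string."""
    for rec in records:
        loc = '/url/%s' % simple_slug(rec['name'], rec['id'])
        # Without a query string every record is listed; otherwise match on the URL.
        if not qs or qs.lower() in loc:
            yield {'loc': loc}


# Hypothetical records.
records = [{'id': 1, 'name': 'First Record'}, {'id': 2, 'name': 'Second Record'}]
all_locs = list(sitemap_records_sketch(records))
filtered = list(sitemap_records_sketch(records, qs='Second'))
```

Because the function is a generator, entries are produced one at a time, which keeps memory use low when a model has many records.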

Next, reference the sitemap_records function in the route of the record detail page.

    @http.route('/url/<model("your.model"):record>', type='http', auth="user", website=True, sitemap=sitemap_records)
    def records_detail(self, record):
        # Render and return the record's detail page here
        ...

Here, the sitemap_records() function is linked to the route through the sitemap keyword.

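This allows record pages to be included in sitemap.xml. If no filtering is needed and you want to include all records, you can replace the function reference with True. In that case Odoo's default enumeration applies, which only covers routes it considers enumerable; as far as the website module's rule_is_enumerable check goes, that means website GET routes with auth set to "public" or "none", so a route that requires login would not be listed this way.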
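The three possible kinds of sitemap value on a route (False, a function, or True) can be summarised with a small standalone sketch. It is a simplification, not Odoo's implementation: locs_for_route, only_featured, and the sample URLs are hypothetical, and the extra conditions Odoo applies before its default enumeration are not modelled.

```python
def locs_for_route(sitemap_value, default_locs, env=None, rule=None, qs=None):
    """Pick the sitemap entries for one route based on its `sitemap` value."""
    if sitemap_value is False:
        # sitemap=False: the route is left out entirely.
        return []
    if callable(sitemap_value):
        # sitemap=<function>: the function decides which entries to yield.
        return list(sitemap_value(env, rule, qs))
    # sitemap=True (or not set): fall back to the default enumeration.
    return list(default_locs)


def only_featured(env, rule, qs):
    # Hypothetical custom function that lists a single URL.
    yield {'loc': '/url/featured-1'}


default = [{'loc': '/url/a-1'}, {'loc': '/url/b-2'}]
excluded = locs_for_route(False, default)
custom = locs_for_route(only_featured, default)
everything = locs_for_route(True, default)
```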

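The generated sitemap is cached as an attachment and reused for 12 hours (SITEMAP_CACHE_TIME). To see changes immediately, open the attachments list, delete the stored sitemap attachments (their names follow the /sitemap-<website id>-<hash>.xml pattern built in the controller), and then open /sitemap.xml in your browser to regenerate the file.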
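The freshness check behind this behaviour is small enough to sketch on its own. The function name is_cache_valid and the dates are assumptions for illustration; the 12-hour value mirrors the refresh interval mentioned above, and the comparison follows the controller's delta < SITEMAP_CACHE_TIME test.

```python
import datetime

# Assumed to match the 12-hour refresh interval described in the text.
SITEMAP_CACHE_TIME = datetime.timedelta(hours=12)


def is_cache_valid(create_date, now):
    """Return True while the stored sitemap is younger than the cache time."""
    return now - create_date < SITEMAP_CACHE_TIME


# Hypothetical timestamps.
created = datetime.datetime(2024, 1, 1, 0, 0)
fresh = is_cache_valid(created, created + datetime.timedelta(hours=11))
stale = is_cache_valid(created, created + datetime.timedelta(hours=13))
```

Deleting the attachment has the same effect as an expired cache: no stored sitemap is found, so the next request to /sitemap.xml rebuilds it.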