Duplicate sitemaps #737

Open
sroussey opened this issue Nov 20, 2023 · 4 comments

@sroussey

Describe the bug
Sitemap indexes show in both robots.txt and in the root sitemap index

To Reproduce
With this config:

/** @type {import('next-sitemap').IConfig} */
module.exports = {
  siteUrl: 'https://embarc.com',
  changefreq: 'daily',
  priority: 0.7,
  sitemapSize: 2000,
  generateRobotsTxt: true,
  autoLastmod: true,
  exclude: [
    '*/sitemap.xml',
    '/dashboard/*',
    '/pricing',
    '/signin',
    '/legal/*',
  ],
  robotsTxtOptions: {
    includeNonIndexSitemaps: false,
    additionalSitemaps: [
      'https://embarc.com/capital/leadership/sitemap.xml',
      'https://embarc.com/capital/spac/sitemap.xml',
      'https://embarc.com/capital/spac-sponsor/sitemap.xml',
      'https://embarc.com/company/crowdfunding/sitemap.xml',
      'https://embarc.com/portal/crowdfunding/sitemap.xml',
      'https://embarc.com/capital/underwriter/sitemap.xml',
    ],
  },
};

Expected behavior
No duplicate sitemaps: each additional sitemap should be listed in only one place.

Example

See https://embarc.com/robots.txt:

# *
User-agent: *
Allow: /

# Host
Host: https://embarc.com

# Sitemaps
Sitemap: https://embarc.com/sitemap.xml
Sitemap: https://embarc.com/capital/leadership/sitemap.xml
Sitemap: https://embarc.com/capital/spac/sitemap.xml
Sitemap: https://embarc.com/capital/spac-sponsor/sitemap.xml
Sitemap: https://embarc.com/company/crowdfunding/sitemap.xml
Sitemap: https://embarc.com/portal/crowdfunding/sitemap.xml
Sitemap: https://embarc.com/capital/underwriter/sitemap.xml

And see https://embarc.com/sitemap.xml:


<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://embarc.com/sitemap-0.xml</loc>
</sitemap>
<sitemap>
<loc>https://embarc.com/capital/leadership/sitemap.xml</loc>
</sitemap>
<sitemap>
<loc>https://embarc.com/capital/spac/sitemap.xml</loc>
</sitemap>
<sitemap>
<loc>https://embarc.com/capital/spac-sponsor/sitemap.xml</loc>
</sitemap>
<sitemap>
<loc>https://embarc.com/company/crowdfunding/sitemap.xml</loc>
</sitemap>
<sitemap>
<loc>https://embarc.com/portal/crowdfunding/sitemap.xml</loc>
</sitemap>
<sitemap>
<loc>https://embarc.com/capital/underwriter/sitemap.xml</loc>
</sitemap>
</sitemapindex>

My preference would be to have them listed only in the sitemap index and not in robots.txt. How can that be done?
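
One possible workaround, offered as a sketch rather than a confirmed next-sitemap feature, is to post-process the generated robots.txt and drop every Sitemap line except the root one. The helper name `keepOnlyRootSitemap` and the `sample` string below are hypothetical and only illustrate the idea:

```javascript
// Hypothetical helper (not part of next-sitemap): keep only the root
// "Sitemap:" line in a generated robots.txt and drop all the others.
function keepOnlyRootSitemap(robotsTxt, rootSitemapUrl) {
  return robotsTxt
    .split('\n')
    .filter((line) => {
      const match = /^\s*sitemap:\s*(\S+)/i.exec(line);
      // Keep every non-Sitemap line; keep a Sitemap line only for the root.
      return !match || match[1] === rootSitemapUrl;
    })
    .join('\n');
}

// Shortened sample input, modeled on the robots.txt shown above.
const sample = [
  '# Sitemaps',
  'Sitemap: https://embarc.com/sitemap.xml',
  'Sitemap: https://embarc.com/capital/spac/sitemap.xml',
].join('\n');

console.log(keepOnlyRootSitemap(sample, 'https://embarc.com/sitemap.xml'));
```

If the installed next-sitemap version exposes a `robotsTxtOptions.transformRobotsTxt` hook (worth checking in the README for your version), the helper could be called from there. Otherwise it could run as a small post-build script over the generated file.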

sroussey added the bug (Something isn't working) label Nov 20, 2023

Closing this issue due to inactivity.

@sroussey
Author

would rather not

@peti446

peti446 commented Feb 28, 2024

I can confirm this. Looking at the code, the exclude list is not applied to the sitemaps in robotsTxtOptions.additionalSitemaps; they are simply added as-is. The documentation states that excluding them is possible, but at the moment it is not.
If you add sitemap indexes to the list above, those indexes get added to the main sitemap index, which Google does not allow:

Incorrect sitemap index format: Nested sitemap indexes
One or more entries in your sitemap index file uses its own URL or the URL of another sitemap index file. A sitemap index file can't list other sitemap index files, only sitemap files.

Remove any entries pointing to sitemap index files, then resubmit your sitemap.

https://support.google.com/webmasters/answer/7451001#errors&zippy=%2Csitemap-parsing-errors
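
To check a setup for this problem before submitting it, a rough sketch like the following can flag entries that are themselves sitemap indexes. It assumes the XML documents have already been fetched. The helper names and the example.com documents are made up for illustration, and the root-element check is a simple regex, not a full XML parser:

```javascript
// Hypothetical helper: true when the XML document's root element is
// <sitemapindex>, i.e. the file is a sitemap index rather than a sitemap.
function isSitemapIndex(xml) {
  const withoutProlog = xml
    .replace(/<\?xml[\s\S]*?\?>/, '') // drop the XML declaration
    .replace(/<!--[\s\S]*?-->/g, ''); // drop comments
  return /^\s*<sitemapindex[\s>]/.test(withoutProlog);
}

// Given already-fetched documents ({ url: xmlString }), list the URLs that
// are themselves indexes and so should not appear inside another index.
function findNestedIndexes(documentsByUrl) {
  return Object.entries(documentsByUrl)
    .filter(([, xml]) => isSitemapIndex(xml))
    .map(([url]) => url);
}

// Made-up documents for illustration only.
const docs = {
  'https://example.com/section/sitemap.xml':
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' +
    '<sitemap><loc>https://example.com/section/sitemap-0.xml</loc></sitemap>' +
    '</sitemapindex>',
  'https://example.com/sitemap-0.xml':
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' +
    '<url><loc>https://example.com/</loc></url>' +
    '</urlset>',
};

console.log(findNestedIndexes(docs));
```

Any URL this returns would need to be replaced in the parent index by the plain sitemap files it points to.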

@kevinrobert3

This package is not actively maintained. It auto-closes issues and PRs after a set period of time.
