Disable search engine indexing

This video features an old UI. Updated version coming soon!

You can tell search engines which pages to crawl by writing a robots.txt file. You can also prevent search engines from crawling and indexing specific pages, folders, your entire site, or your webflow.io subdomain. This is useful for hiding pages like your site’s 404 page from being indexed and listed in search results.

Important: Content from your site may still be indexed, even if it hasn’t been crawled. That happens when a search engine knows about your content either because it was published previously, or there’s a link to that content from other content online. To ensure that a previously indexed page is not indexed, don’t add it in the robots.txt. Instead, use the noindex meta code to remove that content from Google’s index.

In this lesson:

How to disable indexing of the Webflow subdomain

You can prevent Google and other search engines from indexing your site’s webflow.io subdomain by disabling indexing from your Site settings.

Go to Site settings > SEO tab > Indexing section
Set Disable Webflow subdomain indexing to “Yes”
Click Save changes and publish your site

This will publish a unique robots.txt only on the subdomain, telling search engines to ignore this domain.

Note: You’ll need a Site plan or paid Workspace to disable search engine indexing of the Webflow subdomain. Learn more about Site and Workspace plans.

Disable Webflow subdomain indexing is set to YES. This section and the “Save changes” button are highlighted.

How to generate a robots.txt file

The robots.txt is usually used to list the URLs on a site that you don't want search engines to crawl. You can also include the sitemap of your site in your robots.txt file to tell search engine crawlers which content they should crawl.

Just like a sitemap, the robots.txt file lives in the top-level directory of your domain. Webflow will generate the /robots.txt file for your site once you create it in your Site settings.

To create a robots.txt file:

Go to Site settings > SEO tab > Indexing section
Add the robots.txt rule(s) you want
Click Save changes and publish your site

‍Important: Content from your site may still be indexed, even if it hasn’t been crawled. That happens when a search engine knows about your content either because it was published previously, or there’s a link to that content from other content online. To ensure that a previously indexed page is not indexed, don’t add it in the robots.txt. Instead, use the noindex meta code to remove that content from Google’s index.

A robots.txt rule “User-agent:*”, line break, “Disallow: /” in the robots.txt field is highlighted, along with the “Save changes” button.

Robots.txt rules

You can use any of these rules to populate the robots.txt file.

User-agent: * means this section applies to all robots.
Disallow: tells the robot to not visit the site, page, or folder.

To hide your entire site

User-agent: *

Disallow: /

To hide individual pages

User-agent: *

Disallow: /page-name

To hide an entire folder of pages

User-agent: *

Disallow: /folder-name/

To include a sitemap

Sitemap: https://your-site.com/sitemap.xml

Helpful resources

Check out more useful robots.txt rules.

Note: Anyone can access your site’s robots.txt file, so they may be able to identify and access your private content.

Best practices for privacy

If you’d like to prevent the discovery of a particular page or URL on your site, don’t use the robots.txt to disallow the URL from being crawled. Instead, use either of the following options:

Use the noindex meta code to disallow search engines from indexing your content and remove content from search engines’ index.
Save pages with sensitive content as draft and don’t publish them. Password protect pages that you need to publish.

FAQ and troubleshooting tips

Can I use a robots.txt file to prevent my Webflow site assets from being indexed?

It’s not possible to use a robots.txt file to prevent Webflow site assets from being indexed because a robots.txt file must live on the same domain as the content it applies to (in this case, where the assets are served). Webflow serves assets from our global CDN, rather than from the custom domain where the robots.txt file lives.

I removed the robots.txt file from my Site settings, but it still shows up on my published site. How can I fix this?

Once the robots.txt has been made, it can’t be completely removed. However, you can replace it with new rules to allow the site to be crawled, e.g.:

User-agent: *

Disallow:

Make sure to save your changes and republish your site. If the issue persists and you still see the old robots.txt rules on your published site, please contact customer support.

Prevent search engines from indexing pages, folders, your entire site, or just your webflow.io subdomain.

How to disable indexing of the Webflow subdomain

Note: You’ll need a Site plan or paid Workspace to disable search engine indexing of the Webflow subdomain. Learn more about Site and Workspace plans.

How to generate a robots.txt file

Robots.txt rules

To hide your entire site

To hide individual pages

To hide an entire folder of pages

To include a sitemap

Helpful resources

Note: Anyone can access your site’s robots.txt file, so they may be able to identify and access your private content.

Best practices for privacy

FAQ and troubleshooting tips

Product

Company

Learn & Get help

Social