At 6amTech, our top priority is to ensure that each and every product we offer is up-to-date and equipped with the latest features. Our ultimate goal is to help our clients succeed in their respective eCommerce industries.
With this in mind, we’ve included the robots.txt file in our systems to ensure that search engine crawlers can crawl our clients’ websites effortlessly. This allows search engines to index the site’s content and make it more discoverable to users.
At the same time, it keeps the site from being overwhelmed with requests, preserving a smooth user experience. We believe this small but important feature can go a long way in improving our clients’ online presence and success.
What is Robots.txt?
A robots.txt file is a guide for web robots such as search engine crawlers. It tells them how to crawl and index pages on your website.
Using a robots.txt file, you can instruct these robots on which pages to crawl and which to avoid, thus preventing them from overloading your site with too many requests.
You can use the robots.txt file for web pages, media files, and resource files to manage crawler traffic and keep a file off search engines if needed.
The robots.txt file works by providing crawling instructions that reputable web crawlers obey.
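For instance, a minimal robots.txt that keeps all crawlers out of a media directory and keeps a single resource file off search engines might look like this (the paths here are hypothetical):

User-agent: *
# Keep crawlers out of a media directory (hypothetical path)
Disallow: /downloads/
# Keep a single resource file out of search results (hypothetical path)
Disallow: /assets/price-list.pdf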
Where to Find The Robots.txt File?
The Robots.txt file is typically located in the root directory of your website.
You can access it by navigating to your website’s domain followed by “/robots.txt” in a web browser.
For example – https://www.yourstore.com/robots.txt. Just copy and paste that address into your web browser, replacing “yourstore.com” with your own domain, to see if your store has a robots.txt file. Once done, the file’s contents will be displayed, similar to the example below:
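(The rules below are only an illustration; every store’s file will look different.)

User-agent: *
Disallow: /cart/
Disallow: /checkout/
Sitemap: https://www.yourstore.com/sitemap.xml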
Why is The Robots.txt File Important?
The Robots.txt file is crucial for managing how search engines interact with your website. Here’s why it matters:
Crawl Budget: Search engines limit the number of pages they will crawl and index on your website within a given timeframe. A robots.txt file guides crawlers to your important pages and tells them to skip areas that don’t need to be checked, helping them find what matters most quickly.
Block Unwanted Pages: You can use it to tell search engines not to crawl certain pages, like pages with duplicate content, internal-only pages, or private pages you don’t want everyone to see. This keeps your site organized and helps you avoid search engine penalties.
Control Access to Sensitive Areas: It’s handy for blocking access to parts of your site that you want to keep private. This is particularly important for sensitive information, such as user data or internal documents. Keeping these areas out of search results reduces their exposure; just remember that robots.txt is a publicly readable file that only discourages crawling, so truly sensitive data should also be protected with proper access controls.
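As a rough sketch, rules like the following ask all crawlers to stay out of such areas (the paths here are hypothetical; use the ones that match your site):

User-agent: *
# Hypothetical private areas of a store
Disallow: /account/
Disallow: /internal-docs/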
Ways to Edit Your Robots.txt File
There are two ways you can edit your robots.txt file – Manual Editing and Editing Through The Product’s Admin Panel.
Manual Editing
This involves accessing your store’s server files and directly editing the robots.txt code. While it offers more control, it’s recommended for users comfortable with server management.
To manually edit your Robots.txt file, follow these simple steps:
Download Your Robots.txt File:
- Go to your website’s robots.txt file (like – https://yourstore.com/robots.txt). Copy the content and paste it into a new text file. You can use any text editor.
- You can also use a tool called cURL to download your robots.txt file directly:
curl https://yourstore.com/robots.txt -o robots.txt
- Alternatively, you can copy the contents of the robots.txt report in Google Search Console and save them as a file.
Edit Your Robots.txt File:
- Open the downloaded Robots.txt file using a text editor on your computer.
- Make any changes you need to the rules. Please make sure to follow the right format and guidelines.
- Save the file with UTF-8 encoding to keep special characters intact.
Upload Your Robots.txt File:
- After making your edits, upload the updated robots.txt file to the root directory of your website.
- Make sure the file is named “robots.txt” and saved as a text file.
- How you upload the file depends on your website’s platform and server setup. Check your platform’s instructions or ask for help if you’re unsure how to do it; one common approach is sketched below.
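For example, if your server is reachable over SSH, you could upload the file with scp. This is only a sketch; the username, host, and web-root path below are assumptions you would replace with your own values:

# Upload the edited file to the web root (user, host, and path are placeholders)
scp robots.txt user@yourstore.com:/var/www/html/robots.txt
# Verify that the live file was updated
curl https://yourstore.com/robots.txt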
Editing Through The Product’s Admin Panel
All 6amTech products make it possible to edit the robots.txt file from the admin panel.
In general, the steps are the same for all 6amTech products. As an example, the section below shows the method in 6Valley’s admin panel.
Step 1: Log in to your 6Valley Admin Panel
Image: Admin Login of 6Valley
Step 2: Scroll to “Basic SEO” under “System Settings”
Path: Admin Panel > System Settings > Business Setup > Basic SEO
Step 3: Click on the “Robots.txt” tab to view the options
Path: Admin Panel > System Settings > Business Setup > Basic SEO > Robots.txt
Step 4: After clicking, a text editor will appear where you can edit your robots.txt file
Path: Admin Panel > System Settings > Business Setup > Basic SEO > Robots.txt
Step 5: Paste or type the necessary changes into the editor. Click on the “Submit” button to save the changes and update your robots.txt file
Step 6: You can view the updated file in real time by clicking on the “View URL” button
Please note: This feature is disabled on the demo site for security and performance optimization reasons. On the paid version of the product or your live site, a robots.txt file will be generated automatically.
Rules of The Robots.txt File
Certain rules and guidelines need to be followed to use the robots.txt file correctly.
- Comments in a Robots.txt file start with the # symbol.
- Using correct capitalization is a must, as the paths you specify are case-sensitive: “/File.html” and “/file.html” are two different things.
- A Robots.txt file contains groups of rules that apply to different parts of your website.
- Your Robots.txt file must be saved as a UTF-8 encoded text file (UTF-8 includes the ASCII characters).
- Each website should have only one Robots.txt file.
- Make sure your Robots.txt file is named exactly “robots.txt”.
- Put your Robots.txt file in the root of your website’s hosting server.
- Each group of rules in the Robots.txt file starts with a user agent (like a search engine bot) and includes one or more instructions about what to allow or disallow.
Anatomy of a robots.txt file, explained with an example:

# Rules for all crawlers
User-agent: *
Disallow: /admin/
Allow: /images/*

# Rules for Googlebot only
User-agent: Googlebot
Allow: /
Crawl-delay: 5

Sitemap: https://www.yoursite.com/sitemap.xml
Explanation:
- The “User-agent: *” line applies the first group of rules to all crawlers
- “Disallow: /admin/” prevents those crawlers from accessing the /admin/ directory
- “Allow: /images/*” lets crawlers access the images
- “User-agent: Googlebot” starts a second group of rules that apply only to Googlebot
- “Allow: /” lets Googlebot access everything on the site
- “Crawl-delay: 5” is an optional directive asking the crawler to wait 5 seconds between requests (note that some crawlers, including Googlebot, ignore it)
- The “Sitemap” line shares the sitemap location so crawlers can find everything on your site
A quick guide to Robots.txt directives:
| Directive | Explanation |
|---|---|
| User-agent | Tells the robots.txt file which search engine crawler (for example, “Googlebot” or “Bingbot”) the following rules apply to. |
| Disallow | Instructs the crawler not to access a specific path (like a directory or file) on your website. |
| Allow | Tells the crawler that it can access a specific path (like “/images/” for your product images). |
| Sitemap | Points the crawler to the location of your website’s XML sitemap file, which helps it find and index your content more efficiently. |
| Crawl-delay | An optional directive that sets a minimum delay (in seconds) between the crawler’s requests to your website. |
Conclusion
That’s a wrap! We hope you got some helpful insights and knowledge about the robots.txt file. While it might seem like a small technical detail, your robots.txt file plays a crucial role in optimizing your website’s search engine visibility.
If you are still confused or need help setting up rules for your robots.txt file, you can always contact 6amTech’s expert team, who are eager to help you out.