Logfile analysis is an SEO and GEO goldmine that is too often left untouched

Pieter Serraris

Pieter Serraris discusses the importance of log file analysis in SEO, emphasizing its potential to uncover valuable insights about website performance and AI bot interactions.

 
Pieter Serraris, 2025 Additional Insights podcast cover

Pieter says: “My additional insight is to go much deeper into log file analysis.

I know, for quite a lot of SEOs, log file analysis is already a thing they're doing, but I don't think they're doing much more than scratching the surface – especially if I look around at agencies in my area, in Europe.

I think log file analysis is somewhat of a hidden gem: a gold mine that's not touched enough. I know that most SEOs associate log file analysis with being a very technical tool, which, of course, it is. It is invaluable, especially for bigger websites like e-commerce and big, international/multinational websites.

It's invaluable for seeing what's happening to your crawl budget: whether bots are hitting walls somewhere, or whether there are a lot of 404s that you cannot find in the traditional SEO tools. You must do that, especially for bigger websites.

But I think there's much more that can be found in these log files. Just dig a bit deeper and learn the names of the bots/the user agents that they are using, and you can find out a lot of information – both for SEO purposes, but now also for generative engine optimization and finding out more about what's happening in ChatGPT, Perplexity, etc.

Also, your log files will give you a lot more information. You'll be able to see if ChatGPT visited your website and, by seeing which of ChatGPT's bots it was (because they actually have at least three that we know of), what the purpose of its visit was.

You have the GPT bot, which is just the learning bot that’s gathering information to add to the large language model to make it smarter. But there's also, for instance, the ChatGPT-User bot, and whenever you see a hit from that bot, you know that your site was being used as a source, because that bot handles real-time visits. Whenever a prompt is given for which your website is used as a source, and ChatGPT doesn't have enough information in its large language model to answer it, the ChatGPT-User bot will come and visit your website to gather more information and context in order to answer the question.
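The user-agent tokens Pieter describes can be picked out of raw access-log lines with simple substring checks. Here is a minimal sketch; the log line is illustrative, and the purpose labels paraphrase OpenAI's crawler documentation (which names GPTBot for training, ChatGPT-User for real-time retrieval, and OAI-SearchBot for search):

```python
# Map OpenAI crawler user-agent tokens to the purpose of the visit.
# The labels are paraphrased from OpenAI's published crawler docs.
BOT_PURPOSES = {
    "OAI-SearchBot": "search indexing",
    "ChatGPT-User": "real-time answer retrieval",
    "GPTBot": "model training",
}

def classify_hit(log_line):
    """Return the visit purpose if an OpenAI crawler appears in the line."""
    for token, purpose in BOT_PURPOSES.items():
        if token in log_line:
            return purpose
    return None

# Hypothetical access-log line, not real traffic:
line = '66.249.1.1 - - [01/Feb/2026:10:00:00 +0000] "GET /seo/ HTTP/1.1" 200 512 "-" "Mozilla/5.0; ChatGPT-User/1.0"'
print(classify_hit(line))  # -> real-time answer retrieval
```

Run over a whole log file, this split alone tells you whether a hit was the model learning from you or your page being fetched as a live source.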

That's really invaluable information that you cannot gather elsewhere, for the moment, because ChatGPT doesn't have an analytics tool that you can use as an SEO or other marketing expert. It's very, very important and you can find a lot of information. Seeing what these bots are doing is very interesting.

Also, again, the technical part: gather information on whether they have trouble, if they hit 404 pages, or if they're hallucinating/they just make up URLs (that also happens). It's also very interesting if that happens because you can learn from that as well. But also, for your content strategy, which URLs from your website are being visited by which bot?

You can learn so much from it. You can use this in your content strategy, in your interlinking strategy, etc. There’s really a lot you can learn from the log files.”

Okay, so log file analysis is hyper-relevant in the age of AI. It's been relevant for years, to give that additional context in terms of what users do on your site and how you can use that to actually augment your SEO strategy.

But, how do you find your log files, how do you have conversations with IT departments, how do you ensure that the right information is being retained for the right length of time?

“From my experience, that’s always a bit of a struggle. It’s a short struggle. As soon as it's set up, it's okay, but some IT departments have some difficulties with finding the right log files.

First of all, it's important that you have the right log files, because you have things like ADAL log files on your site that are not relevant for SEO work. What you need are the access log files, and they're on the server of your website. That's why they're so interesting: they're being gathered on a server level.

Who you need to talk to is the person in charge of your server. Most of the time, that's just your IT department, who will help guide the way. If it's a different person, then just talk to them. It always involves some back-and-forth emailing, because the way you can gather these log files differs depending on which kind of server you're using, so there's not just one easy email that I can always send to my clients’ IT departments. But we always get there.

It's also a question of what you want to do with the log files. For me, if you're able to set up an API with your log files and a third-party tool to gather the log files, that's the easiest way. At our agency, we use JetOctopus, which is a very cool tool that sets up the API for you and then links it to Google Search Console data, Google Analytics data, etc.

That's the ideal situation, where you have an API, because then, whenever there's an issue appearing in your log files, you can immediately get an alert. That's cool.

What you can also ask for is a one-time export of, let's say, at least three weeks of data. That's a very big file you will get, but there are other tools that you can use to analyse the export. For instance, Screaming Frog also offers (besides their well-known crawler) the Screaming Frog Log File Analyser, which you can import log file data into, so that's also very interesting. Then it's just a one-time export.
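If you'd rather poke at a one-time export yourself before buying a tool, a short script gets you surprisingly far. This hedged sketch counts which HTTP status codes each AI bot received, e.g. to spot 404s or made-up URLs; it assumes the combined log format, where the status code follows the quoted request string, so adjust the parsing to your server's actual format:

```python
# Tally (bot, status code) pairs across raw access-log lines from an export.
# Assumes combined log format: ... "GET /page HTTP/1.1" 404 0 ...
from collections import Counter

AI_BOTS = ("GPTBot", "ChatGPT-User", "PerplexityBot")

def bot_status_counts(lines):
    """Count how often each AI bot hit each status code."""
    counts = Counter()
    for line in lines:
        for bot in AI_BOTS:
            if bot in line:
                parts = line.split('" ')
                if len(parts) > 1:
                    status = parts[1].split()[0]
                    counts[(bot, status)] += 1
                break
    return counts

# Illustrative lines, not real log data:
sample = [
    '1.2.3.4 - - [01/Feb/2026:10:00:00 +0000] "GET /seo/ HTTP/1.1" 200 512 "-" "GPTBot/1.0"',
    '1.2.3.4 - - [01/Feb/2026:10:01:00 +0000] "GET /search-engine-optimization/ HTTP/1.1" 404 0 "-" "ChatGPT-User/1.0"',
]
print(bot_status_counts(sample))
```

A cluster of 404s under one bot is exactly the kind of signal Pieter acts on later in this interview.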

It can be a good idea for getting to know the log files and convincing your clients, if you’re on the agency side. It might be a good first step. Depending on the type of website you are on (for instance, if you’re on a WordPress site), some plugins will just do the work for you.

We sometimes set up Dark Visitors, which is very easy to use. It's not just a WordPress plugin; you can use it in other content management systems as well, but for WordPress it has a plug-and-play setup. Just set it up, and you will see the bot visits coming in immediately.

You can't learn as much from it as you would from having the entire thing coming into a tool like JetOctopus, but what we have learned from it is quite a lot. We have it set up for our own website, for instance. Then, we see which AI bots are coming in, what Google Bot is doing, and which bots are spamming our websites unnecessarily. It's a very easy tool to get to know log files.

To summarise this long answer, the setup really depends on what you intend to do with it. The ideal situation, for me, is that you set up a connection to your log files where they're stored. It gets very technical: you need some sort of ‘bucket’, as it's called, from which your third-party tool can gather them. That's the ideal situation, but you can also ask for an export or just set up a WordPress plugin, for instance.”

You talked about AI bots visiting sites and being able to view that, and log files.

What, specifically, are you looking for within your log files to analyse that, and how can you piece that together with how your website is performing on the various AI search engines?

“What we're mostly looking at, first thing, is which bots are visiting.

As I mentioned, if we take the use case of ChatGPT, it's very interesting to see, for the GPT bot (the ‘LLM bot’), which pages it is using and which it isn't. This already clarifies quite a lot in terms of what information ChatGPT finds interesting and which information is simply not found. A very interesting use case there is to help this bot find more information.

Obviously, if you're a publisher, the situation is different. Then you probably will not be so happy that the GPT bot is visiting your website, because it's basically stealing/borrowing your information in order to get smarter. Then, a good use case would be just to use the robots.txt file to block this specific bot.
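The blocking Pieter mentions is a standard robots.txt rule. GPTBot is the user-agent token OpenAI documents for its training crawler, so a publisher who wants to opt out of training (while leaving real-time retrieval untouched) would add something like:

```
User-agent: GPTBot
Disallow: /
```

Note this only blocks the learning bot; the ChatGPT-User bot has its own token and would need its own rule if you wanted to block it too.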

Another thing we do, for instance, is look at the pages that the ChatGPT-User bot is visiting, because this shows that these pages are actually being mentioned in ChatGPT. We think about it, and we see this as relevant content. We think about what the ChatGPT-User bot would do when it visits that page. Why is this page mentioned more than others? Is it, for instance, a long read? Is it a very short FAQ page?

Again, learning about which type of content is being shown in ChatGPT and being used as a source can also help you out very much because it gives you a lot of information for your content strategy.”

What about tying this together with your actual performance in the AI search engines?

“For me, learning about your performance in AI engines is like a puzzle where you have to gather the pieces yourself, because we don't have a lot of information in one tool – like in Google Ads, for instance, where you see everything. You see your impressions, you see your clicks, and you see what people have done on your website. You don't have that for ChatGPT.

What we do have is a few pieces of the puzzle. We have the log files, which show your impressions – in part; not all of them, obviously. You only see that you're being used as a source when the LLM doesn't have enough information about your website for a specific prompt. That's one piece of the puzzle.

You have brand trackers for AI tools where you can see, if you automatically push prompts into ChatGPT, etc., whether you are mentioned, how you are mentioned, and maybe which competitors of yours do better or worse than you. That's another piece of the puzzle. Then you have your Google Analytics data, where you actually see the visitors coming in. These are the three pieces of the puzzle that we link to each other.

You can do that in a Looker Studio report, where you see that these pages, for instance, get good impressions because they are mentioned a bit more, because we rank a bit higher than our competitors there, on average, so they give us more clicks eventually.

Learning more about the impressions is just so valuable because the clicks are decreasing, and fewer people are going to visit your website. Simply learning about how you are being shown and how your website is being used in a very black box tool like ChatGPT or Perplexity is already a lot of information that you can really use to help steer your content strategy, but also help these tools fix hallucinations.

We've had clients where these tools used the wrong version of the homepage, putting the wrong thing behind the URL. It helped to just add a redirect from the hallucinated homepage to the right homepage. This got us more visibility and more clicks to the right homepage.

As I mentioned, it's a very cool way of learning a lot, but the deeper you go, the more you learn, and the more you go into the details. Sometimes it's going into a one-time visit of a bot and following it around to see what it does. It's very interesting and you can really learn a lot.

You can sort of communicate with these bots for the first time. For instance, you might see a lot of visits to a page on your website that doesn’t exist. We had that happen. We have an ‘SEO’ page on our agency website and, instead of visiting ‘SEO’ in the directory, the ChatGPT bot kept visiting a page for ‘Search Engine Optimization’, which was a 404 page.

It visited that page over 1,000 times in one week, so we put a redirect from that page to the right ‘SEO’ page. It helped us to be found more often and get more visits by sort of communicating to the bot. It also helped us understand that the full term that ChatGPT was using was ‘Search Engine Optimization’, and it was maybe a keyword or part of a prompt that we needed to mention more, instead of only using ‘SEO’ as an abbreviation.
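A fix like the one Pieter describes is typically a one-line server rule. As a sketch in nginx (the paths come from his anecdote and would differ on your site; other servers have equivalent rewrite/redirect directives):

```
# Redirect the URL the bot keeps inventing to the real page,
# so its 1,000+ weekly 404s become visits to the right content.
location = /search-engine-optimization/ {
    return 301 /seo/;
}
```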

There are a lot of cool use cases to discover.”

Pieter, what's the key takeaway from the tip you shared today?

“The key takeaway for me is, if you're not yet using log files, if you're not yet analysing them, make sure you start doing it.

Just start; it's not that complicated. It looks like a lot, and it looks like something only technical SEOs can do, but it's really not. Maybe you'll need some help setting it up the first time, but there are a lot of cool tools out there that will easily help you. There are also a lot of YouTube tutorials to be found.

As soon as it’s set up, you will be able to start digging – and it’s kind of addictive, actually, when you start to get to know what’s happening on your website, start learning from it, start playing with it, and start testing. That's really my key takeaway.

Sometimes it will do nothing, whatever you test, but sometimes you really get results out of it. That's what SEO is about, and I don't think you can do SEO without seeing data. This is really a way of gathering data that's untouched, so far, for a lot of websites.”

Pieter Serraris is the SEO lead at OMcollective, and you can find him over at OMcollective.com.

