SEO Pulse: Bing AI Citation Tracking, Hidden HTTP Homepages, and Googlebot’s Crawl Limits


This week’s updates cover tracking AI visibility, troubleshooting a hidden HTTP homepage that can break your site name in search results, and what new crawl data reveals about Googlebot’s file size limits.


Bing Webmaster Tools Adds AI Citation Dashboard

Microsoft introduced an AI Performance dashboard in Bing Webmaster Tools, now in public preview, giving publishers insight into how often their content is cited in Copilot and other AI-generated answers.

Key Facts:

  • Tracks total citations, page-level activity, average cited pages per day, and grounding queries (the queries the AI used to retrieve your content).

Why It Matters:
Google Search Console reports AI Overviews data, but it doesn’t break out page-level citations or grounding queries. Bing’s dashboard shows which pages are cited, how often, and why, though not whether those citations drive clicks. Combining it with your own analytics, as sketched below, helps measure the impact of AI citations on business outcomes.
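One way to do that combining: export the dashboard data and join it against an analytics export. A minimal sketch in Python follows; the file names and column names (url, citations, grounding_query, sessions) are assumptions for illustration, not the dashboard’s actual export format.

```python
# Hypothetical sketch: join Bing AI citation data with your own analytics
# to see whether frequently cited pages also gain traffic. File names and
# column names are assumptions, not the dashboard's actual export format.
import pandas as pd

citations = pd.read_csv("bing_ai_citations.csv")  # url, citations, grounding_query
analytics = pd.read_csv("site_analytics.csv")     # url, sessions

# Aggregate citation counts per page and keep each page's most common
# grounding query for context.
per_page = (
    citations.groupby("url")
    .agg(
        total_citations=("citations", "sum"),
        top_query=("grounding_query", lambda q: q.mode().iat[0]),
    )
    .reset_index()
)

# Put citation frequency next to traffic so you can spot cited pages that
# do (or don't) translate into business outcomes.
merged = per_page.merge(analytics, on="url", how="left")
print(merged.sort_values("total_citations", ascending=False).head(10))
```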

Industry Reaction:

  • Wil Reynolds: Excited about grounding queries, seeing new opportunities to analyze AI citations.

  • Koray Tuğberk GÜBÜR: Highlights Bing’s transparency advantage over Google Search Console.

  • Fabrice Canel (Microsoft Bing): Frames the tool as bridging traditional SEO and AI-driven optimization.

The takeaway: AI citation visibility is finally measurable, but for now only on Bing.

[Full coverage: Bing Webmaster Tools Adds AI Citation Performance Data]


Hidden HTTP Homepage Can Break Your Site Name in Google

John Mueller shared a case where a leftover HTTP homepage caused site-name and favicon issues in search results, even though the site used HTTPS. Chrome automatically upgrades HTTP requests to HTTPS, hiding the issue during normal browsing, but Googlebot still fetches the HTTP page.

Key Facts:

  • HTTP homepage accessible to Googlebot but invisible in Chrome.

  • Mismatched content caused Google to display the wrong site name or favicon.

  • Solution: run curl from the command line or use Search Console’s URL Inspection Live Test to see what Google retrieves; a rough scripted version of the check follows below.
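Here is a sketch of that diagnostic in Python: request the HTTP homepage without letting the client auto-upgrade or follow redirects, then compare it with the HTTPS version. The example.com URL is a placeholder, and identifying as Googlebot via user agent only approximates what the real crawler receives.

```python
# Sketch of the diagnostic: request the HTTP homepage the way a crawler
# would (no automatic HTTPS upgrade, no redirect following) and compare it
# with the HTTPS page. example.com is a placeholder for your own domain.
import requests

# Sending Googlebot's UA string is only an approximation; real Googlebot
# traffic is verified by reverse DNS, and some servers treat it differently.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"}

http_resp = requests.get("http://example.com/", headers=HEADERS,
                         allow_redirects=False, timeout=10)
https_resp = requests.get("https://example.com/", headers=HEADERS,
                          allow_redirects=False, timeout=10)

print("HTTP :", http_resp.status_code, len(http_resp.content), "bytes")
print("HTTPS:", https_resp.status_code, len(https_resp.content), "bytes")

# A healthy setup 301-redirects HTTP to HTTPS. A 200 on the HTTP URL with
# different markup is the leftover "ghost" page described above.
if http_resp.status_code == 200 and http_resp.content != https_resp.content:
    print("HTTP serves its own page; it may influence site name and favicon.")
```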

Why It Matters:
Browser behavior often masks the pages crawlers see. Invisible issues like this can break search result presentation even when your HTTPS homepage looks fine.

Mueller:

“Chrome automatically upgrades HTTP to HTTPS so you don’t see the HTTP page. However, Googlebot sees and uses it to influence the sitename & favicon selection.”


New Data Shows Most Pages Fit Well Within Googlebot’s Crawl Limit

Analysis of HTTP Archive data shows most webpages fall well below Googlebot’s 2 MB HTML fetch limit (the limit for PDFs is 64 MB).

Why It Matters:
The analysis clarifies the real-world risk of Googlebot’s size limits. For almost all sites, page size isn’t an SEO concern, even for content-heavy pages. Issues arise only with extremely bloated HTML, large inline scripts, or embedded data. A quick way to check your own pages is sketched below.
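As a rough self-check, the sketch below fetches a page and compares its raw HTML payload to the 2 MB figure. The URL is a placeholder, and whether the limit counts compressed or decompressed bytes isn’t specified here, so treat the percentage as an approximation.

```python
# Rough self-check: compare a page's HTML payload to the 2 MB limit
# discussed above. The URL is a placeholder for your own page.
import requests

LIMIT = 2 * 1024 * 1024  # 2 MB HTML fetch limit described in the article

resp = requests.get("https://example.com/", timeout=10)
size = len(resp.content)  # decompressed body bytes as requests delivers them

print(f"{size:,} bytes ({size / LIMIT:.1%} of the 2 MB limit)")
if size > LIMIT:
    print("Content past the cutoff may not be fetched or indexed.")
```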

Industry Perspective:

  • Dave Smart (Tame the Bots): 2 MB is still huge; real-world risk is minimal.

  • John Mueller: Shared tools to test HTML size against the 2 MB cutoff.

  • Roger Montti (SEJ): Most pages are far below the limit; HTML size can generally be removed from SEO worry lists.

[Full coverage: Googlebot’s 2 MB Crawl Limit Is Enough]


Theme of the Week: The Diagnostic Gap

Each story highlights hidden issues or measurement gaps:

  1. Bing AI dashboard: Fills a gap in AI citation visibility.

  2. Ghost HTTP homepage: Reveals pages that standard audits and browsers can’t see.

  3. Googlebot crawl limits: Confirms documentation with real-world data.

Takeaway: Tools and data for understanding search engines are becoming more precise, but knowing where to look remains the key challenge.