First Impressions Count – Googlebot Only Cares About Your First 15MB Of Content

If you're an avid reader of the District Maven blog, then you've read about the importance of sequencing in search engine optimization. How content is ordered on a website dramatically impacts how Google's indexing robots read and understand it. Generally, from both a macro and micro level, the search engines will put the most emphasis on what information is listed first on a website and its pages.

This is true regarding a site's navigation architecture and the META information attached to each landing page. As far as the latter is concerned, length also matters: Google typically displays only about the first 70 characters of a title tag and about 160 characters of a META description in its search results, making this precious digital real estate.
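For readers who want to check these lengths on their own pages, here is a minimal Python sketch. It assumes the 70 and 160 character figures quoted above, which are rules of thumb rather than official constants, and the names HeadLengthParser and check_head_lengths are made up for this illustration.

```python
from html.parser import HTMLParser

# Figures quoted in this post; treat them as rules of thumb, not official limits.
TITLE_LIMIT = 70
DESCRIPTION_LIMIT = 160


class HeadLengthParser(HTMLParser):
    """Collects the <title> text and the META description from an HTML string."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def check_head_lengths(html):
    """Report the title and description lengths and whether each fits its limit."""
    parser = HeadLengthParser()
    parser.feed(html)
    title = parser.title.strip()
    description = parser.description.strip()
    return {
        "title_length": len(title),
        "title_ok": len(title) <= TITLE_LIMIT,
        "description_length": len(description),
        "description_ok": len(description) <= DESCRIPTION_LIMIT,
    }
```

Pass it the HTML source of a landing page as a string and it returns a small report you can scan for anything over the limit.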

Most recently, the search engine giant announced that it was taking this a step further. In a subtle, too-cool-for-school update to its Googlebot help documents, Google now states that Googlebot crawls and indexes only the first 15MB of an HTML file or supported text-based file.

"Googlebot can crawl the first 15MB of content in an HTML file or supported text-based file. After the first 15MB of the file, Googlebot stops crawling and only considers the first 15MB of content for indexing."
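To put that number in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes "15MB" means 15 x 1024 x 1024 bytes (the quote above does not say which definition is used), and crawl_budget_report is a hypothetical helper name.

```python
# Assumption: 15MB is read as 15 * 1024 * 1024 bytes; the quote does not specify.
CRAWL_LIMIT_BYTES = 15 * 1024 * 1024


def crawl_budget_report(body: bytes, limit: int = CRAWL_LIMIT_BYTES) -> dict:
    """Compare the size of a page's raw bytes against the crawl limit."""
    size = len(body)
    return {
        "size_bytes": size,
        "percent_of_limit": round(100 * size / limit, 2),
        "bytes_past_limit": max(0, size - limit),  # content that would be ignored
        "truncated": size > limit,
    }
```

Feed it the raw bytes of a saved HTML file, for example the result of open("page.html", "rb").read(), to see how much of the limit a page uses.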

At face value, this announcement doesn't seem all that groundbreaking. It isn't a secret that sequencing is one of the ways Google determines what a website is trying to emphasize. However, the update reinforces how much sequencing matters and reminds web admins to keep their pages on the lighter side for both human users and indexing robots. This is something to keep in mind as stylistic and functional web design techniques continue to change and evolve. (You can get the full scoop from Google here.)

How To Test This For Yourself: One way to see what is being indexed for your website and its subpages is by using the URL Inspection Tool within Google Search Console. This will show you what content Google has indexed and what has been excluded. As a rule of thumb, the most important information and CTAs should always be placed at the top of any page. This makes it more likely that a human visitor will see them and that Google will crawl and render them.
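As a rough local complement to the URL Inspection Tool, the Python sketch below reports how far into a page's raw HTML each key phrase first appears. It is only a proxy: byte position in the source is not the same as what Google renders, the 15 x 1024 x 1024 byte limit is an assumed reading of "15MB", and phrase_positions is a hypothetical helper name.

```python
def phrase_positions(body: bytes, phrases, limit: int = 15 * 1024 * 1024) -> dict:
    """Find where each phrase first appears in the raw HTML bytes.

    Returns None for a phrase that is not found; otherwise its byte offset,
    how far into the page that is, and whether it ends before the limit.
    """
    total = max(len(body), 1)  # avoid dividing by zero on an empty page
    results = {}
    for phrase in phrases:
        encoded = phrase.encode("utf-8")
        offset = body.find(encoded)
        if offset == -1:
            results[phrase] = None
        else:
            results[phrase] = {
                "byte_offset": offset,
                "percent_into_page": round(100 * offset / total, 1),
                "within_limit": offset + len(encoded) <= limit,
            }
    return results
```

A CTA that only shows up 90 percent of the way into the source is worth moving up, whether or not the page is anywhere near the size limit.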
