Google quietly announced, in an update to Googlebot's help document, that it will crawl only the first 15 MB of a webpage. Anything after that point will be excluded from ranking calculations.
This implies that a website's code should be structured so that SEO-relevant information falls within the first 15 MB of an HTML or supported text-based file. Whenever possible, images and videos should also be compressed rather than encoded directly into the HTML.
Some in the SEO community wondered whether this meant Googlebot would completely ignore text that falls below images at the cutoff in an HTML file.
“It’s specific to the HTML file itself, like it’s written,” John Mueller, Google Search Advocate, clarified via Twitter. “Embedded resources/content pulled in with IMG tags is not a part of the HTML file.”
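To make that distinction concrete, here is a minimal Python sketch of how a site owner might check an HTML file against the limit. It is only an illustration, not Google's method: the function name `html_size_report`, the sample pages, and the reading of "15 MB" as 15 × 1024 × 1024 bytes are all assumptions. It measures the bytes of the HTML itself and, separately, how many of those bytes come from media inlined as `data:` URIs, which do count toward the file's size, unlike images referenced by URL.

```python
import re

# Assumption: "15 MB" is read here as 15 * 1024 * 1024 bytes. The exact
# byte count Googlebot uses is not stated in the article.
LIMIT_BYTES = 15 * 1024 * 1024

# Matches inline data: URIs in src attributes,
# e.g. src="data:image/png;base64,...."
DATA_URI_RE = re.compile(rb'src\s*=\s*["\'](data:[^"\']*)["\']', re.IGNORECASE)


def html_size_report(html: bytes, limit: int = LIMIT_BYTES) -> dict:
    """Summarize how an HTML payload relates to a byte limit (hypothetical helper)."""
    # Bytes taken up by media encoded directly into the HTML.
    inline_bytes = sum(len(m.group(1)) for m in DATA_URI_RE.finditer(html))
    return {
        "total_bytes": len(html),
        "inline_data_bytes": inline_bytes,
        "within_limit": len(html) <= limit,
        "bytes_past_limit": max(0, len(html) - limit),
    }


# Hypothetical pages: one references an image file, one inlines it as base64.
linked = b'<html><body><img src="/hero.png"><p>Key text</p></body></html>'
inlined = (
    b'<html><body><img src="data:image/png;base64,'
    + b"A" * 1000
    + b'"><p>Key text</p></body></html>'
)

print(html_size_report(linked))
print(html_size_report(inlined))
```

In the linked version the image adds only the length of its URL to the HTML, while the inlined version carries the whole encoded image inside the file, pushing any text that follows it closer to the cutoff.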