Indexing
Session 7.2 · ~5 min read
A page that Google crawls is not necessarily a page that Google indexes. Crawling means Google visited the URL and downloaded the content. Indexing means Google processed that content and stored it in a form that can appear in search results. The gap between these two stages is where many entity signals get lost.
If your About page is crawled but not indexed, Google has seen your entity information but decided not to include it in search results. Your schema markup, your entity description, your sameAs links, all of it becomes invisible to searchers. Understanding and fixing indexation issues is essential for entity authority.
Crawled vs. Indexed
[Diagram: after a page is crawled, Google runs an assessment with four outcomes. High quality: the page is indexed and appears in search results. Low quality: the page is crawled but not indexed, and its entity signals are lost. Duplicate: the page is crawled but a canonical is chosen instead, and signals may transfer to the canonical. noindex: the page is crawled but deliberately excluded, and its entity signals are deliberately blocked.]
Google makes a quality and relevance decision for every URL it crawls. Pages that are thin (very little content), duplicative (substantially similar to another page), or technically blocked (noindex tag) will not be indexed. Pages that pass the quality assessment enter the index and can appear in search results.
Key concept: For entity authority, you need to confirm that every entity-critical page on your site is actually indexed, not just crawled. The site: operator and Google Search Console are your primary tools for this verification.
Checking Indexation Status
There are two reliable methods for checking whether a specific page is indexed.
Method 1: The site: operator. Type site:yourdomain.com/about/ into Google's search bar. If the page appears in results, it is indexed. If it does not appear, it is either not indexed or not crawled.
Method 2: Google Search Console URL Inspection. Enter the full URL into the URL Inspection Tool. Google will tell you whether the page is indexed, when it was last crawled, and if there are any issues preventing indexation.
Method 2 is more authoritative because it comes directly from Google's systems. The site: operator is useful for quick checks but can occasionally show inconsistent results.
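Method 2 can also be scripted through the Search Console URL Inspection API. The sketch below is a minimal illustration, not a complete client: it only builds the request body and summarizes a response, and it leaves out the authenticated HTTP call. The endpoint and field names (`inspectionUrl`, `siteUrl`, `indexStatusResult`, `coverageState`, `verdict`) reflect the public API reference as best understood here and should be verified against it; the helper names are hypothetical.

```python
from typing import Any, Dict

# Endpoint as documented for the URL Inspection API (verify before use).
INSPECT_ENDPOINT = (
    "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect"
)


def build_inspection_request(page_url: str, property_url: str) -> Dict[str, str]:
    """Return the JSON body for one inspection request.

    page_url is the page to check; property_url is the Search Console
    property that contains it.
    """
    return {"inspectionUrl": page_url, "siteUrl": property_url}


def summarize_inspection(response: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the indexation fields out of an inspection response dict."""
    status = response.get("inspectionResult", {}).get("indexStatusResult", {})
    verdict = status.get("verdict", "VERDICT_UNSPECIFIED")
    return {
        "verdict": verdict,
        "coverage_state": status.get("coverageState", "unknown"),
        "last_crawl_time": status.get("lastCrawlTime"),
        "google_canonical": status.get("googleCanonical"),
        # Assumption: a PASS verdict is treated here as "indexed".
        "indexed": verdict == "PASS",
    }
```

Sending the request requires an OAuth-authorized session for a verified property, which is deliberately not shown. The summary keeps the same three facts the URL Inspection Tool reports in the interface: whether the page is indexed, when it was last crawled, and which canonical Google selected.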
Common Indexation Issues and Fixes
| Issue | Symptom | Cause | Fix |
|---|---|---|---|
| Discovered, not indexed | GSC shows "Discovered - currently not indexed" | Google found the URL but has not crawled it yet, often due to low priority | Improve internal linking to the page. Submit URL via GSC. Add to XML sitemap. |
| Crawled, not indexed | GSC shows "Crawled - currently not indexed" | Google crawled the page but judged it low quality or not useful enough to index | Add unique, valuable content to the page. Improve internal linking. Remove thin content. |
| Excluded by noindex | GSC shows "Excluded by 'noindex' tag" | Page has a noindex meta tag or X-Robots-Tag header | Remove the noindex directive if the page should be indexed. |
| Duplicate without canonical | GSC shows "Duplicate without user-selected canonical" | Google found duplicate content and chose its own canonical | Set a proper canonical tag on the preferred version. |
| Duplicate, submitted URL not selected | GSC reports that the URL you submitted is a duplicate and that another URL was selected as canonical | Google considers another URL the canonical version | Consolidate duplicate pages. Redirect non-canonical to canonical. |
| Page removed by request | GSC shows "Blocked by page removal tool" or "Blocked by robots.txt" | URL removal request or robots.txt block | If unintentional, cancel the removal request or update robots.txt. |
| Soft 404 | GSC shows "Soft 404" | Page returns a 200 status code but looks like an error page to Google | Add real content to the page, or return a proper 404 status code. |
| Server error (5xx) | GSC shows "Server error (5xx)" | Server returned a 5xx-level error when Googlebot requested the page | Fix server errors. Re-request indexing after fix. |
| Redirect error | GSC shows "Redirect error" | Redirect loop or excessively long redirect chain | Fix redirect chains. Ensure each URL redirects directly to the final destination. |
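The "Excluded by noindex" row is one of the easier issues to check for yourself, because the directive lives either in the page's HTML or in its response headers. The following is a minimal sketch using only the Python standard library; the function name `has_noindex` is hypothetical, and the check is simplified: it does not handle header values prefixed with a user agent (such as `googlebot: noindex`) and it does not fetch the page for you.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class _RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html: str, headers: Optional[Dict[str, str]] = None) -> bool:
    """Return True if the HTML or an X-Robots-Tag header carries noindex."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    directives = set(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            directives.update(d.strip().lower() for d in value.split(","))
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in directives or "none" in directives
```

For example, `has_noindex('<meta name="robots" content="noindex, follow">')` returns `True`, while a page with `content="index, follow"` and no `X-Robots-Tag` header returns `False`.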
The Entity Page Indexation Audit
For entity authority, not every page on your site needs to be indexed. But certain pages are critical. These are the pages that carry your strongest entity signals, and if any of them are not indexed, your entity infrastructure has a gap.
| Entity-Critical Page | Why It Must Be Indexed | Check Method |
|---|---|---|
| Homepage | Carries Organization/Person schema, sameAs links, primary entity description | site:yourdomain.com |
| About Page | Detailed entity description, team info, history | site:yourdomain.com/about/ |
| Contact Page | NAP information, ContactPoint schema | site:yourdomain.com/contact/ |
| Service/Product Pages | Service/Product schema, entity capabilities | site:yourdomain.com/services/ |
| Team/Founder Page | Person schema, author entity signals | site:yourdomain.com/team/ |
A typical website sees about 62% of its pages fully indexed. The remaining 38% falls into various non-indexed categories. For large sites with thousands of pages, this ratio is normal. But for entity-critical pages, the target is 100% indexation. Every page that carries entity signals must be in the index.
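The audit above can be tracked with a few lines of code. This is a small sketch, assuming the five page types and example paths from the table; the paths, labels, and function names are illustrative and should be adjusted to your own site structure. The indexation statuses are recorded by hand from the site: operator or the URL Inspection Tool.

```python
from typing import Dict, List, Tuple

# Example paths taken from the audit table; adjust to your own site.
ENTITY_CRITICAL_PATHS: Dict[str, str] = {
    "Homepage": "",
    "About": "/about/",
    "Contact": "/contact/",
    "Services": "/services/",
    "Team": "/team/",
}


def site_queries(domain: str) -> Dict[str, str]:
    """Build the site: query to paste into Google for each critical page."""
    return {
        label: f"site:{domain}{path}"
        for label, path in ENTITY_CRITICAL_PATHS.items()
    }


def audit(statuses: Dict[str, bool]) -> Tuple[float, List[str]]:
    """Return (percent of critical pages indexed, labels not indexed).

    statuses maps a page label to True if you confirmed it is indexed.
    A page with no recorded status is counted as not indexed.
    """
    missing = [
        label for label in ENTITY_CRITICAL_PATHS
        if not statuses.get(label, False)
    ]
    total = len(ENTITY_CRITICAL_PATHS)
    percent = 100.0 * (total - len(missing)) / total
    return percent, missing
```

If four of the five pages are confirmed indexed and the Contact page is not, `audit` returns `(80.0, ["Contact"])`, which falls short of the 100% target and names the gap to fix.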
Requesting Indexation
If an entity-critical page is not indexed, you can request indexation through Google Search Console. Open the URL Inspection Tool, enter the page URL, and click "Request Indexing." Google will prioritize crawling and processing that URL.
This is not a guarantee of indexation. Google may still decide the page does not meet its quality standards. But it does signal that the page should be crawled soon, rather than leaving it to wait its turn in the crawl queue.
Do not overuse the Request Indexing feature. It is designed for individual pages that need attention, not for bulk submissions. For bulk indexation, use XML sitemaps (covered in Session 7.6).
Fixing "Crawled, Not Indexed"
The most frustrating indexation issue is "Crawled - currently not indexed." Google visited your page, looked at it, and decided not to include it. This typically means Google judged the page as not valuable enough to warrant an index entry.
Fixes for this issue:
- Add more unique content. Thin pages with only a few sentences are often not indexed. Entity pages should have substantial, unique content.
- Improve internal linking. If the page is orphaned (no other pages link to it), Google assigns it low importance. Link to it from your homepage, navigation, and related pages.
- Add structured data. Schema markup does not guarantee indexation, but it adds a signal of page purpose and quality.
- Remove near-duplicate content. If the page is too similar to another page on your site, Google may choose to index only one version.
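Two of the fixes above, thin content and orphaned pages, can be screened for before you start rewriting. The sketch below assumes you already have the visible text of each page and a map of internal links (for example, from a crawl you ran yourself). The 300-word threshold is an arbitrary illustration, not a figure published by Google, and both function names are hypothetical.

```python
from typing import Dict, Iterable, List


def find_orphans(links: Dict[str, List[str]], all_pages: Iterable[str]) -> List[str]:
    """Return pages that no other page links to.

    links maps a source path to the internal paths it links to.
    Self-links are ignored, since they do not help discovery.
    """
    linked = {
        target
        for source, targets in links.items()
        for target in targets
        if target != source
    }
    return sorted(page for page in all_pages if page not in linked)


def is_thin(text: str, min_words: int = 300) -> bool:
    """Flag a page whose visible text falls below min_words.

    min_words is an assumed cutoff for triage only.
    """
    return len(text.split()) < min_words
```

For instance, with `links = {"/": ["/about/", "/contact/"], "/about/": ["/"], "/team/": ["/"]}` and pages `["/", "/about/", "/contact/", "/team/"]`, `find_orphans` returns `["/team/"]`: the team page links out but nothing links to it. Word count is only a rough proxy for substance, so treat a flagged page as a prompt to review it rather than as a verdict.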
Further Reading
- Google. "How Google Search Indexing Works." Google Search Central. developers.google.com/search/docs/fundamentals/how-search-works
- Google. "Page Indexing Report." Google Search Central. support.google.com/webmasters/answer/7440203
- Google. "URL Inspection Tool." Google Search Central. support.google.com/webmasters/answer/9012289
- Mueller, John. "Google on Crawled Not Indexed." Google Search Central YouTube. youtube.com/googlewebmasters
Assignment
- Open Google Search Console and navigate to the Pages report (Indexing > Pages). Record the total number of indexed pages and the total number of pages that are not indexed.
- For each entity-critical page (homepage, about, contact, services, team), use the URL Inspection Tool to confirm indexation status. Record the status of each.
- If any entity-critical page shows "Crawled - currently not indexed" or "Discovered - currently not indexed," investigate using the fix strategies above. Request indexing after making improvements.
- Use the site: operator to spot-check 5 important pages on your site. Do the results match what Search Console reports?
- Review the Pages report for any "Excluded by noindex" pages. Verify that each excluded page is intentionally excluded, not accidentally blocked.