Session 7.2: Indexing

Course → Module 7: Technical SEO Baseline

Session 2 of 7

A page that Google crawls is not necessarily a page that Google indexes. Crawling means Google visited the URL and downloaded the content. Indexing means Google processed that content and stored it in a form that can appear in search results. The gap between these two stages is where many entity signals get lost.

If your About page is crawled but not indexed, Google has seen your entity information but decided not to include it in search results. Your schema markup, your entity description, your sameAs links, all of it becomes invisible to searchers. Understanding and fixing indexation issues is essential for entity authority.

Crawled vs. Indexed

flowchart TD
    A["Google Discovers URL"] --> B["Googlebot Crawls Page"]
    B --> C{"Page Quality Assessment"}
    C -->|High Quality| D["Page Indexed"]
    C -->|Low Quality| E["Page Crawled but NOT Indexed"]
    C -->|Duplicate| F["Page Crawled, Canonical Chosen Instead"]
    C -->|noindex| G["Page Crawled, Deliberately Excluded"]
    D --> H["Appears in Search Results"]
    E --> I["Entity Signals Lost"]
    F --> J["Signals May Transfer to Canonical"]
    G --> K["Entity Signals Deliberately Blocked"]
    style A fill:#222221,stroke:#c8a882,color:#ede9e3
    style D fill:#222221,stroke:#6b8f71,color:#ede9e3
    style H fill:#222221,stroke:#6b8f71,color:#ede9e3
    style E fill:#222221,stroke:#c47a5a,color:#ede9e3
    style I fill:#222221,stroke:#c47a5a,color:#ede9e3

Google makes a quality and relevance decision for every URL it crawls. Pages that are thin (very little content), duplicative (substantially similar to another page), or technically blocked (noindex tag) will not be indexed. Pages that pass the quality assessment enter the index and can appear in search results.
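The four outcomes described above can be sketched as a small decision function. This is an illustration of the flow only: Google's real quality assessment is not public, and the `CrawledPage` fields and the `THIN_WORD_THRESHOLD` cutoff are hypothetical stand-ins invented for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff for "thin" content; Google publishes no such number.
THIN_WORD_THRESHOLD = 150


@dataclass
class CrawledPage:
    """Illustrative record of what a crawler might know about one URL."""
    url: str
    word_count: int
    has_noindex: bool = False
    duplicate_of: Optional[str] = None  # URL of a substantially similar page


def index_outcome(page: CrawledPage) -> str:
    """Map a crawled page to one of the four outcomes in the flow above."""
    if page.has_noindex:
        return "crawled, deliberately excluded (noindex)"
    if page.duplicate_of is not None:
        return "crawled, canonical chosen instead: " + page.duplicate_of
    if page.word_count < THIN_WORD_THRESHOLD:
        return "crawled but not indexed (low quality)"
    return "indexed"


if __name__ == "__main__":
    about = CrawledPage(url="https://yourdomain.com/about/", word_count=40)
    print(index_outcome(about))
```

The order of the checks mirrors the text: an explicit noindex wins over everything else, duplication is considered next, and only then does content depth matter.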

Key concept: For entity authority, you need to confirm that every entity-critical page on your site is actually indexed, not just crawled. The site: operator and Google Search Console are your primary tools for this verification.

Checking Indexation Status

There are two reliable methods for checking whether a specific page is indexed.

Method 1: The site: operator. Type site:yourdomain.com/about/ into Google's search bar. If the page appears in results, it is indexed. If it does not appear, it is either not indexed or not crawled.

Method 2: Google Search Console URL Inspection. Enter the full URL into the URL Inspection Tool. Google will tell you whether the page is indexed, when it was last crawled, and if there are any issues preventing indexation.

Method 2 is more authoritative because it comes directly from Google's systems. The site: operator is useful for quick checks but can occasionally show inconsistent results.
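Method 2 can also be scripted through the Search Console URL Inspection API. The sketch below only builds the request body and summarizes a response; it makes no network call, because a real request needs an OAuth-authorized POST for a verified property. The field names (`inspectionUrl`, `siteUrl`, `indexStatusResult`, `verdict`, `coverageState`, `lastCrawlTime`, `googleCanonical`) reflect the API as I understand it and should be checked against the current reference. The sample response is invented for illustration.

```python
# Endpoint for the URL Inspection API (requests must be OAuth-authorized).
INSPECT_ENDPOINT = (
    "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect"
)


def build_inspection_request(page_url: str, property_url: str) -> dict:
    """Build the JSON body for one inspection request."""
    return {"inspectionUrl": page_url, "siteUrl": property_url}


def summarize_inspection(response: dict) -> dict:
    """Pull the indexation facts out of an inspection response."""
    status = response.get("inspectionResult", {}).get("indexStatusResult", {})
    return {
        # Assumption: a verdict of "PASS" means the URL is indexed.
        "indexed": status.get("verdict") == "PASS",
        "coverage_state": status.get("coverageState", "unknown"),
        "last_crawl": status.get("lastCrawlTime"),
        "google_canonical": status.get("googleCanonical"),
    }


if __name__ == "__main__":
    # Invented sample response, shaped like the API's JSON.
    sample = {
        "inspectionResult": {
            "indexStatusResult": {
                "verdict": "NEUTRAL",
                "coverageState": "Crawled - currently not indexed",
                "lastCrawlTime": "2024-01-15T08:30:00Z",
            }
        }
    }
    print(build_inspection_request(
        "https://yourdomain.com/about/", "https://yourdomain.com/"))
    print(summarize_inspection(sample))
```

The `coverage_state` string is the same label shown in the Search Console interface, so it can be matched directly against the issue table in the next section.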

Common Indexation Issues and Fixes

Issue	Symptom	Cause	Fix
Discovered, not indexed	GSC shows "Discovered - currently not indexed"	Google found the URL but has not crawled it yet, often due to low priority	Improve internal linking to the page. Submit URL via GSC. Add to XML sitemap.
Crawled, not indexed	GSC shows "Crawled - currently not indexed"	Google crawled the page but judged it low quality or not useful enough to index	Add unique, valuable content to the page. Improve internal linking. Remove thin content.
Excluded by noindex	GSC shows "Excluded by 'noindex' tag"	Page has a noindex meta tag or X-Robots-Tag header	Remove the noindex directive if the page should be indexed.
Duplicate without canonical	GSC shows "Duplicate without user-selected canonical"	Google found duplicate content and chose its own canonical	Set a proper canonical tag on the preferred version.
Duplicate, submitted URL not selected	GSC shows page is duplicate of another URL you submitted	Google considers another URL the canonical version	Consolidate duplicate pages. Redirect non-canonical to canonical.
Page removed by request	GSC shows "Blocked by page removal tool" or "Blocked by robots.txt"	An active URL removal request, or a robots.txt rule that stops Googlebot from crawling the URL	If unintentional, cancel the removal request or update robots.txt.
Soft 404	GSC shows "Soft 404"	Page returns a 200 status code but looks like an error page to Google	Add real content to the page, or return a proper 404 status code.
Server error (5xx)	GSC shows "Server error (5xx)"	Server returned a 5xx error (such as 500 or 503) when Googlebot crawled	Fix server errors. Re-request indexing after fix.
Redirect error	GSC shows "Redirect error"	Redirect loop or excessively long redirect chain	Fix redirect chains. Ensure each URL redirects directly to the final destination.
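The "Excluded by noindex" row is the easiest to verify yourself, because the directive lives either in the page's HTML or in its HTTP response headers. The following is a minimal sketch using only the Python standard library. It takes HTML and headers you have already fetched; the function and class names are made up for this example. It treats `none` as equivalent to `noindex`, and it does not handle X-Robots-Tag values that carry a user-agent prefix such as `googlebot: noindex`.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def find_noindex(html: str, headers: dict) -> list:
    """Return the places where a noindex directive was found."""
    sources = []
    parser = RobotsMetaParser()
    parser.feed(html)
    if "noindex" in parser.directives or "none" in parser.directives:
        sources.append("meta tag")
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            tokens = [t.strip().lower() for t in value.split(",")]
            if "noindex" in tokens or "none" in tokens:
                sources.append("X-Robots-Tag header")
    return sources


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(find_noindex(page, {"Content-Type": "text/html"}))
```

An empty list means no directive was found in either place. Checking both locations matters because a header set at the server or CDN level will not appear anywhere in the page source.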

The Entity Page Indexation Audit

For entity authority, not every page on your site needs to be indexed. But certain pages are critical. These are the pages that carry your strongest entity signals, and if any of them are not indexed, your entity infrastructure has a gap.

Entity-Critical Page	Why It Must Be Indexed	Check Method
Homepage	Carries Organization/Person schema, sameAs links, primary entity description	site:yourdomain.com
About Page	Detailed entity description, team info, history	site:yourdomain.com/about/
Contact Page	NAP information, ContactPoint schema	site:yourdomain.com/contact/
Service/Product Pages	Service/Product schema, entity capabilities	site:yourdomain.com/services/
Team/Founder Page	Person schema, author entity signals	site:yourdomain.com/team/

A typical website sees about 62% of its pages fully indexed. The remaining 38% falls into various non-indexed categories. For large sites with thousands of pages, this ratio is normal. But for entity-critical pages, the target is 100% indexation. Every page that carries entity signals must be in the index.
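To keep the two targets separate, it can help to compute them side by side: the site-wide ratio, where a partial figure is acceptable, and the entity-critical ratio, where anything under 100% needs action. The sketch below assumes the five paths from the table above and takes indexation statuses you have already collected by hand or from Search Console. All names and sample numbers are illustrative.

```python
# Paths taken from the entity-critical table above; adjust to your own site.
ENTITY_CRITICAL_PATHS = ["/", "/about/", "/contact/", "/services/", "/team/"]


def site_query(domain: str, path: str) -> str:
    """Build the site: query used as the check method for one page."""
    return f"site:{domain}" if path == "/" else f"site:{domain}{path}"


def audit(statuses: dict, total_pages: int, indexed_pages: int) -> dict:
    """Compare site-wide indexation with entity-critical indexation.

    statuses maps each entity-critical path to True (indexed) or False.
    Paths missing from statuses are treated as not indexed.
    """
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    missing = [p for p in ENTITY_CRITICAL_PATHS if not statuses.get(p, False)]
    critical_indexed = len(ENTITY_CRITICAL_PATHS) - len(missing)
    return {
        "site_indexed_pct": round(100 * indexed_pages / total_pages, 1),
        "entity_critical_pct": round(
            100 * critical_indexed / len(ENTITY_CRITICAL_PATHS), 1),
        "missing": missing,
    }


if __name__ == "__main__":
    for path in ENTITY_CRITICAL_PATHS:
        print(site_query("yourdomain.com", path))
    observed = {"/": True, "/about/": True, "/contact/": False,
                "/services/": True, "/team/": True}
    print(audit(observed, total_pages=1000, indexed_pages=620))
```

With the sample numbers, the site-wide figure matches the 62% described above, while the entity-critical figure is 80% with `/contact/` flagged, which is the gap that needs fixing.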

Requesting Indexation

If an entity-critical page is not indexed, you can request indexation through Google Search Console. Open the URL Inspection Tool, enter the page URL, and click "Request Indexing." Google will prioritize crawling and processing that URL.

This is not a guarantee of indexation. Google may still decide the page does not meet its quality standards. But it does move the page up the crawl queue, so it is revisited sooner than it would be on Google's own schedule.

Do not overuse the Request Indexing feature. It is designed for individual pages that need attention, not for bulk submissions. For bulk indexation, use XML sitemaps (covered in Session 7.6).
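As a preview of the bulk route, a sitemap is just an XML file listing URLs in the sitemaps.org format. Here is a minimal sketch that builds one with the Python standard library. The URLs are placeholders, and optional elements such as `lastmod` are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list) -> str:
    """Return a sitemap XML document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    # Serialize as UTF-8 bytes so the XML declaration names the encoding.
    data = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    return data.decode("utf-8")


if __name__ == "__main__":
    pages = [
        "https://yourdomain.com/",
        "https://yourdomain.com/about/",
        "https://yourdomain.com/contact/",
    ]
    print(build_sitemap(pages))
```

The output would typically be saved as `sitemap.xml` at the site root and submitted in Search Console.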

Fixing "Crawled, Not Indexed"

The most frustrating indexation issue is "Crawled - currently not indexed." Google visited your page, looked at it, and decided not to include it. This typically means Google judged the page as not valuable enough to warrant an index entry.

Fixes for this issue:

Add more unique content. Thin pages with only a few sentences are often not indexed. Entity pages should have substantial, unique content.
Improve internal linking. If the page is orphaned (no other pages link to it), Google assigns it low importance. Link to it from your homepage, navigation, and related pages.
Add structured data. Schema markup does not guarantee indexation, but it adds a signal of page purpose and quality.
Remove near-duplicate content. If the page is too similar to another page on your site, Google may choose to index only one version.
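Two of the fixes above, orphaned pages and near-duplicate content, can be checked mechanically once you have a crawl of your own site. The sketch below assumes you already have an internal link graph (page to list of pages it links to) and the visible text of each page. Word-shingle Jaccard similarity is one common way to estimate text overlap; it is used here as an illustration, not as a description of how Google detects duplicates.

```python
def find_orphans(link_graph: dict, start: str) -> list:
    """Return pages that no other page links to, excluding the start page.

    link_graph maps each page path to the list of paths it links to.
    """
    linked = {
        target
        for source, targets in link_graph.items()
        for target in targets
        if target != source  # a self-link does not count as an inbound link
    }
    return sorted(p for p in link_graph if p not in linked and p != start)


def shingles(text: str, k: int = 3) -> set:
    """Split text into overlapping k-word sequences."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def jaccard(text_a: str, text_b: str) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    a, b = shingles(text_a), shingles(text_b)
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    graph = {
        "/": ["/about/", "/services/"],
        "/about/": ["/"],
        "/services/": ["/"],
        "/team/": ["/"],  # links out, but nothing links in
    }
    print(find_orphans(graph, start="/"))
    print(jaccard("we build custom software for clinics",
                  "we build custom software for schools"))
```

Pages returned by `find_orphans` are candidates for new links from the homepage or navigation. Page pairs with a high Jaccard score are candidates for consolidation or for a rewrite that makes each page distinct.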

Assignment

Open Google Search Console and navigate to the Pages report (Indexing > Pages). Record the total number of indexed pages and the total number of pages that are not indexed.
For each entity-critical page (homepage, about, contact, services, team), use the URL Inspection Tool to confirm indexation status. Record the status of each.
If any entity-critical page shows "Crawled - currently not indexed" or "Discovered - currently not indexed," investigate using the fix strategies above. Request indexing after making improvements.
Use the site: operator to spot-check 5 important pages on your site. Do the results match what Search Console reports?
Review the Pages report for any "Excluded by noindex" pages. Verify that each excluded page is intentionally excluded, not accidentally blocked.
