When people talk about making their business visible to AI, they usually mean their website. Maybe their LinkedIn. Maybe their Google Business Profile. Almost nobody mentions the data source that carries more entity authority than all of those combined: government business registries.

Government registries are not glamorous. They have terrible UIs. They load slowly. Their search functions are often broken. But they have one property that no other data source can match: institutional authority. When a government registry says a company exists, was founded on a specific date, and operates at a specific address, machines treat that as ground truth. Everything else is secondary.

This is not opinion. It is how knowledge graphs are built. Google's own documentation on Knowledge Panel sources explicitly lists official government data as a primary source of entity information [1]. AI training data pipelines prioritize government and institutional sources for entity facts because they have the lowest error rates and the highest verification standards.

The problem is that most businesses treat their government registrations as a compliance checkbox. File the paperwork, get the license, put the certificate in a drawer, forget about it. They never consider that the data in those registries is actively being read by the systems that determine whether their company exists as a recognized entity in the digital world.

Why Government Registries Carry Extreme Entity Weight

To understand why these registries matter so much, you need to understand how AI systems and knowledge graphs evaluate source authority.

Every piece of information a machine encounters has an implicit trust score. Self-published content on your website has the lowest trust score because you control it completely. Social media profiles sit slightly higher because the platform provides a thin layer of verification. Industry databases are higher still. But government registries sit at or near the top of the trust hierarchy for entity facts.
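One way to picture this hierarchy is as an ordered list of source types, where a conflicting fact is resolved in favor of the highest-ranked source that reported it. The sketch below is illustrative only: the tier names, the `resolve_fact` helper, and the sample dates are hypothetical, and real systems do not publish how they weigh sources.

```python
# Illustrative only: source tiers in the order described above, lowest
# trust first. Real knowledge graph pipelines do not publish their
# weighting, so treat this ordering as an assumption.
TRUST_ORDER = [
    "own_website",
    "social_profile",
    "industry_database",
    "government_registry",
]


def resolve_fact(claims: dict[str, str]) -> tuple[str, str]:
    """Return (source, value) from the highest-ranked source that made a claim."""
    known = {src: val for src, val in claims.items() if src in TRUST_ORDER}
    if not known:
        raise ValueError("no claim from a recognised source type")
    best = max(known, key=TRUST_ORDER.index)
    return best, known[best]


# Made-up founding dates reported by three different sources.
founding_date_claims = {
    "own_website": "2011",
    "social_profile": "2012-01-01",
    "government_registry": "2011-06-14",
}
source, value = resolve_fact(founding_date_claims)
print(source, value)
```

The point of the ordering is that the registry value wins whenever it is present, no matter how many lower-tier sources disagree with it.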

Three properties make government data special.

Verification at registration. In most jurisdictions you cannot register a company without submitting formal documentation. In Indonesia, that means notarized incorporation documents, founder identity verification, and address confirmation. The data enters the registry already checked, which few other digital sources can claim.

Legal enforceability. The data in a government registry has legal standing. If the registry says your company is active and headquartered in Bogor, that is the company's status of record. Filing false information can carry civil or criminal penalties in many jurisdictions. This legal weight is why machines treat it as ground truth.

Structural consistency. Government registries store data in structured formats with standardized fields: company name, registration number, date of incorporation, registered address, directors, share capital. Most of these map directly to schema.org Organization properties such as legalName, identifier, foundingDate, and address, which makes them well suited to knowledge graph ingestion.
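As a rough illustration of that mapping, the sketch below renames a registry-style record into schema.org Organization property names. The input field names and sample values are hypothetical, since every registry labels its fields differently. The target names (`legalName`, `identifier`, `foundingDate`, `address`) are real schema.org properties.

```python
# Hypothetical registry field names on the left; schema.org Organization
# property names on the right. Fields without a clear schema.org
# counterpart (for example share capital) are left out.
REGISTRY_TO_SCHEMA = {
    "company_name": "legalName",
    "registration_number": "identifier",
    "date_of_incorporation": "foundingDate",
    "registered_address": "address",
}


def to_schema_properties(record: dict[str, str]) -> dict[str, str]:
    """Rename known registry fields to schema.org names and drop the rest."""
    return {
        REGISTRY_TO_SCHEMA[field]: value
        for field, value in record.items()
        if field in REGISTRY_TO_SCHEMA
    }


sample = {  # placeholder values, not a real registration
    "company_name": "PT Contoh Usaha",
    "registration_number": "0000000000000",
    "date_of_incorporation": "2015-03-01",
    "registered_address": "Bogor, Indonesia",
    "share_capital": "IDR 1,000,000,000",
}
print(to_schema_properties(sample))
```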

B2B Mention's research on entity SEO confirms this: brands that ensure their entity data is consistent across authoritative sources, including government registries, see significantly stronger knowledge graph presence [2].

The Registries That Matter

Not all government registries are equally visible to AI systems. Some are crawlable and structured. Others are locked behind authentication walls or stored in formats that machines cannot parse. Here is a practical assessment of the registries that actually feed into entity verification.

| Country | Registry | Authority Level | AI Visibility | Notes |
|---|---|---|---|---|
| Indonesia | AHU (Administrasi Hukum Umum) | Highest (Ministry of Law) | Medium. Data is structured but behind authentication | Primary legal registration. Contains incorporation data, directors, amendments |
| Indonesia | OSS (Online Single Submission) | High (BKPM/Investment Ministry) | Medium-High. Publicly searchable, structured output | Business licensing. NIB (Nomor Induk Berusaha) is the key identifier |
| Indonesia | NPWP (Tax Registry) | High (Directorate General of Taxes) | Low. Not publicly searchable | Tax ID confirms entity existence but data is not accessible to crawlers |
| UK | Companies House | Highest | Very High. Full API, open data, machine-readable | Gold standard for registry accessibility. Free API, bulk downloads, linked data |
| US | SEC EDGAR | Highest (for public companies) | Very High. Full-text search, XBRL structured data | Public company filings. XBRL format is directly machine-readable |
| US | State SOS (Secretary of State) | High | Medium-High. Varies by state, most are searchable | Business formation records. Each state maintains its own registry |
| Singapore | ACRA (Accounting and Corporate Regulatory Authority) | Highest | High. Structured search, BizFile portal | UEN (Unique Entity Number) is the key identifier. Clean structured data |
| EU | BRIS (Business Registers Interconnection System) | Highest (cross-border) | High. Interconnects 30+ national registries | Allows cross-border entity verification. EU-wide standard format |
| India | MCA (Ministry of Corporate Affairs) | Highest | Medium. Searchable but sometimes slow | CIN (Corporate Identification Number) is the key identifier |
| Australia | ASIC (Australian Securities and Investments Commission) | Highest | High. ABN Lookup is fast, structured, and free | ABN (Australian Business Number) provides instant entity verification |

The pattern is clear. Registries that offer public APIs, structured data formats, and open access score highest on AI visibility. Registries behind authentication walls or in unstructured formats score lower, even if their authority level is equally high.

For Indonesian businesses, this creates a specific challenge. Our primary registries (AHU and OSS) have high authority but medium AI visibility compared to the UK's Companies House or Australia's ABN Lookup. This gap means Indonesian companies need to compensate with stronger entity infrastructure elsewhere, which is exactly what I described in the Trust Chain Methodology. Bridging this gap systematically is what entity infrastructure work looks like in practice.

How These Registries Feed Into Knowledge Graphs

The path from government registry to knowledge graph is not direct for most countries. It works through intermediaries.

Wikidata. This is the primary bridge. When you create a Wikidata entry for your company, you include properties like "official website" (P856), "inception" (P571), "headquarters location" (P159), and critically, registry identifiers like "Companies House company ID" (P2622), "Australian Business Number" (P3548), or, where Wikidata has one for your jurisdiction, an equivalent identifier for Indonesian legal entities. These properties allow machines to cross-reference your Wikidata entry against the government registry, creating a verified link. I covered the mechanics of this in my Wikidata guide for business owners.
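To see what that cross-referencing involves, here is a minimal sketch that pulls statement values out of an entity document in the JSON shape Wikidata returns (for example from `Special:EntityData/<QID>.json`). The entity ID `Q0`, the sample values, and the `claim_values` helper are placeholders for illustration. Only the property IDs come from Wikidata.

```python
def claim_values(entity: dict, prop: str) -> list:
    """Return the raw values of all statements for one property."""
    values = []
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        # Skip "novalue" / "somevalue" snaks, which carry no datavalue.
        if snak.get("snaktype") == "value":
            values.append(snak["datavalue"]["value"])
    return values


# Trimmed, made-up entity document in Wikidata's JSON shape.
doc = {
    "entities": {
        "Q0": {
            "claims": {
                "P856": [{"mainsnak": {
                    "snaktype": "value", "property": "P856",
                    "datavalue": {"value": "https://example.com", "type": "string"},
                }}],
                "P571": [{"mainsnak": {
                    "snaktype": "value", "property": "P571",
                    "datavalue": {
                        "value": {"time": "+2015-03-01T00:00:00Z", "precision": 11},
                        "type": "time",
                    },
                }}],
            }
        }
    }
}

entity = doc["entities"]["Q0"]
print(claim_values(entity, "P856"))             # official website
print(claim_values(entity, "P571")[0]["time"])  # inception
print(claim_values(entity, "P159"))             # no headquarters statement here
```

A verifier working this way would take the identifier it finds in the entry and look it up in the matching registry, which is why a missing identifier statement leaves nothing to check against.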

Knowledge Graph ingestion. Google's Knowledge Graph pulls from Wikidata, government open data portals, and direct crawling of publicly accessible registries. When your company appears in a government registry AND in Wikidata AND on your domain with consistent data, the Knowledge Graph recognizes this as a verified entity. Search Engine Land's analysis confirms that multi-source corroboration, especially from institutional sources, is the strongest signal for Knowledge Panel generation [3].

AI training data. Large language models are trained on web-scale data that includes government open data dumps, Wikidata exports, and crawled registry pages. Companies that appear in these high-authority sources during training become part of the model's entity knowledge. This is one reason AI systems can often answer questions about companies listed in Companies House but draw blanks on companies that only exist on their own websites.

The Indonesia-Specific Problem

Indonesian businesses face a compounding disadvantage. Our government registries are authoritative but not optimally accessible to machines. AHU data is behind a login wall. OSS is searchable but the output is not in a standard machine-readable format like JSON or XML. NPWP data is not publicly available at all.

Compare this to a UK company. Companies House data is available via a free API, in JSON format, with every company's full history of filings, directors, and registered addresses. A machine can verify a UK company's existence and basic facts in milliseconds. For an Indonesian company, the same verification requires navigating authentication, parsing unstructured HTML, and sometimes manually checking PDF documents.
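For a sense of how little work the UK lookup takes, the sketch below prepares a request to the Companies House public data API, which uses HTTP Basic authentication with the API key as the username and an empty password. The key and company number are placeholders, and the request is built but not sent. The field names read in `summarise` reflect my understanding of the company profile response, so check them against the current API documentation.

```python
import base64
import urllib.request

API_ROOT = "https://api.company-information.service.gov.uk"


def build_profile_request(company_number: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a company profile request."""
    # Basic auth: API key as username, empty password.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return urllib.request.Request(
        f"{API_ROOT}/company/{company_number}",
        headers={"Authorization": f"Basic {token}"},
    )


def summarise(profile: dict) -> dict:
    """Pick out the basic entity facts from a profile response."""
    return {
        "name": profile.get("company_name"),
        "status": profile.get("company_status"),
        "incorporated": profile.get("date_of_creation"),
    }


req = build_profile_request("00000000", "YOUR_API_KEY")  # placeholders
print(req.full_url)
# To send it: json.load(urllib.request.urlopen(req)), then pass the
# resulting dict to summarise().
```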

This does not mean Indonesian companies are invisible. It means they need to work harder to bridge the gap. The bridge is Wikidata. If your Indonesian company has a Wikidata entry with accurate data and references to your OSS/NIB registration, you have created a machine-readable proxy for data that the registry itself does not expose well.

The other bridge is your own domain. JSON-LD Organization schema with your legal name, registration numbers, founding date, and address gives machines the structured data that Indonesian registries do not expose directly. Combined with a proper understanding of how Knowledge Graphs work, this becomes your primary path to entity recognition.
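A minimal sketch of that markup, generated from Python so the values can come from one place. The company name, date, and NIB below are placeholders, and the `organization_jsonld` helper is hypothetical. The property names (`legalName`, `foundingDate`, `address`, `identifier` with `PropertyValue`) are schema.org vocabulary.

```python
import json


def organization_jsonld(legal_name: str, founding_date: str, locality: str,
                        country: str, identifiers: dict[str, str]) -> str:
    """Serialise basic registry facts as a schema.org Organization in JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "legalName": legal_name,
        "foundingDate": founding_date,  # ISO 8601 date
        "address": {
            "@type": "PostalAddress",
            "addressLocality": locality,
            "addressCountry": country,
        },
        # One PropertyValue per registry identifier; propertyID names the scheme.
        "identifier": [
            {"@type": "PropertyValue", "propertyID": scheme, "value": value}
            for scheme, value in identifiers.items()
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


markup = organization_jsonld(
    legal_name="PT Contoh Usaha",          # placeholder company
    founding_date="2015-03-01",            # placeholder date
    locality="Bogor",
    country="ID",
    identifiers={"NIB": "0000000000000"},  # placeholder number
)
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

The printed `<script>` element is what goes in the page's HTML. Generating it from the same record you file with the registry is one way to keep the two from drifting apart.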

What to Do With This Information

Knowing which registries matter is only useful if you act on it. Here is the practical sequence.

Verify your registry data is current. Log into AHU and OSS. Check that your company name, address, directors, and business activities are accurate and current. Outdated registry data is worse than missing data because it creates inconsistencies when machines cross-reference.

Add registry identifiers to your schema markup. Your Organization JSON-LD should include your NIB (from OSS), your company registration number (from AHU), and any other official identifiers. These become the anchors that verification systems use to match your domain to your registry records.

Create or update your Wikidata entry. Include registry-specific properties. For Indonesian companies, add "legal form" (P1454) set to "Perseroan Terbatas," "country" (P17) set to "Indonesia," and any available registry identifiers. Reference your registry records as sources.

Ensure cross-source consistency. Your company name on your domain, in your schema markup, on LinkedIn, in Wikidata, and in government registries must be identical or explicitly declared as alternative names. "PT Arsindo Integrasi Pompa" in the registry and "Arsindo" on LinkedIn without a declared alias creates a matching failure.

Monitor registry updates. When you change directors, amend your articles of association, or update your business activities in the registry, update your schema markup and Wikidata entry immediately. Stale data creates entity drift that accumulates over time. The Entity Authority course covers this maintenance cycle and how to keep your registry data in sync across all surfaces.
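The consistency check in the sequence above can be sketched in a few lines. This is a simplified illustration, not how any search engine actually matches names: the `find_mismatches` helper and the source labels are hypothetical, and the comparison only ignores case and extra whitespace.

```python
def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not count."""
    return " ".join(name.lower().split())


def find_mismatches(legal_name: str, aliases: set[str],
                    observed: dict[str, str]) -> dict[str, str]:
    """Return sources whose name is neither the legal name nor a declared alias."""
    accepted = {normalise(legal_name)} | {normalise(a) for a in aliases}
    return {
        source: name
        for source, name in observed.items()
        if normalise(name) not in accepted
    }


observed = {
    "registry": "PT Arsindo Integrasi Pompa",
    "schema_markup": "PT Arsindo Integrasi Pompa",
    "linkedin": "Arsindo",
}

# Without a declared alias, the LinkedIn name is flagged.
print(find_mismatches("PT Arsindo Integrasi Pompa", set(), observed))
# → {'linkedin': 'Arsindo'}

# Once "Arsindo" is declared as an alternative name, nothing is flagged.
print(find_mismatches("PT Arsindo Integrasi Pompa", {"Arsindo"}, observed))
# → {}
```

Running a check like this whenever the registry record changes is a cheap way to catch stale names before they accumulate.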

Key concept: Government registries are the highest-authority entity data sources that exist. The gap between what these registries know about your company and what machines can access about your company is the gap you need to bridge with structured data and Wikidata.

Frequently Asked Questions

Can AI systems access Indonesian government registries directly?

Not easily. AHU requires authentication, and its data is not exposed via API or machine-readable format. OSS is publicly searchable but outputs HTML that requires parsing. NPWP data is not publicly available. This is a significant difference from registries like UK Companies House, which offers a free JSON API. For Indonesian companies, the practical solution is to ensure your registry data is accurately reflected in Wikidata and in your domain's JSON-LD schema. These serve as machine-readable proxies for the registry data. Over time, as Indonesian government digital infrastructure improves, direct access may become available. Until then, the proxy approach is your best path to making registry-level entity data visible to AI systems.

Do I need to register on international registries if I only operate in Indonesia?

No, you do not need to register in foreign jurisdictions. But you should ensure your Indonesian registration data is represented in international systems that AI can read. The most important is Wikidata, which is jurisdiction-agnostic and feeds directly into Google's Knowledge Graph and AI training data. You should also ensure your company appears in international business databases like Crunchbase or industry-specific directories, with data that matches your Indonesian registry records. The goal is not to register internationally but to make your domestic registration data visible in the systems that AI actually queries.

How do I find my company's registration identifiers for schema markup?

In Indonesia, your NIB (Nomor Induk Berusaha) is available through the OSS portal at oss.go.id. Your company registration number (nomor akta pendirian) is in your incorporation deed and can be verified through AHU at ahu.go.id. Your NPWP (tax ID) is issued by the tax office. For schema markup, use the "taxID" property for your NPWP, the "legalName" property for your full registered name, and "identifier" for your NIB, expressed as a PropertyValue whose "propertyID" names the identifier scheme (for example "NIB"). Include "foundingDate" matching your incorporation date from AHU. Every identifier you include gives verification systems another data point to match against registry records.

References

  1. Google. "How Knowledge Panels in Google Search are created." Google Support, 2024.
  2. B2B Mention. "Why Brands Can't Ignore SEO Entities." B2B Mention Blog, 2024.
  3. Search Engine Land. "Google Knowledge Panel: What It Is & How to Get Featured." Search Engine Land, 2025.
