CompositeDocIndexingInfo
Composite Doc, Indexing
GoogleApi.ContentWarehouse.V1.Model.CompositeDocIndexingInfo
SEO Analysis
AI Generated
Part of Google's Composite Document system, which brings together all known information about a URL into a single unified document representation. This includes content, links, quality signals, and metadata from multiple sources. The composite document is the complete picture Google has of a page and serves as the input for ranking algorithms.
Actionable Insights for SEOs
- Monitor for changes in rankings that may correlate with updates to this system
- Consider how your content strategy relates to the indexing state recorded here, such as crawl status, error-page classification, and canonical status
- Optimize crawl budget by fixing broken links and reducing redirect chains
- Use robots.txt and sitemap.xml effectively to guide crawling
- Monitor Google Search Console for crawl errors and indexing issues
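As a rough illustration of the robots.txt point above, the sketch below checks whether a URL would be blocked for a given user agent using Python's standard-library parser. The robots.txt content and the example.com URLs are hypothetical. Python's parser does not implement every matching rule Google uses (for example wildcards), so treat this as a quick sanity check rather than a statement of how Googlebot decides.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/blog/post",
                "https://example.com/private/page"):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

A page blocked this way is the kind of document the crawlStatus attribute below would mark as ROBOTED.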
Attributes (23)
cdocBuildInfo (type: IndexingDocjoinerCDocBuildInfo, default: nil; full type: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerCDocBuildInfo.t): Holds extra info for building a final cdoc from the raw cdoc and goldmine annotations.
contentProtected (type: boolean, default: nil): Whether the current page is under content protection, i.e. the page has been crawled as an error page, but its last known good content is preserved and its crawl_status is kept as converter.CrawlStatus::CONTENT.
convertToRobotedReason (type: integer, default: nil): If set, indicates that the crawl status was converted to ROBOTED for the reason specified by the enum value in converter.RobotedReasons.ConvertToRobotedReasons. See indexing/converter/proto/converter.proto for details. If unset, the document was not converted to roboted; if the document crawl status is nevertheless ROBOTED, the document is disallowed (at least to Google) in robots.txt.
crawlStatus (type: integer, default: nil): One of the enum values in converter.CrawlStatus.State (see indexing/converter/proto/converter.proto for details). Default is converter.CrawlStatus::CONTENT. The document is roboted if the value is converter.CrawlStatus::ROBOTED.
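The interplay between crawlStatus and convertToRobotedReason can be sketched as a small decision function. This is only an illustration of the rules stated in the two descriptions above. The numeric enum values are placeholders, because the real values live in converter.proto and are not given here, and the function name and return labels are hypothetical.

```python
from typing import Optional

# ASSUMPTION: placeholder enum values. The real numbers are defined in
# indexing/converter/proto/converter.proto and are not shown in this page.
CRAWL_STATUS_CONTENT = 0
CRAWL_STATUS_ROBOTED = 1


def describe_roboted_state(crawl_status: Optional[int],
                           convert_to_roboted_reason: Optional[int]) -> str:
    """Classify a document using the rules from the field descriptions."""
    # An unset crawl status defaults to CONTENT.
    status = CRAWL_STATUS_CONTENT if crawl_status is None else crawl_status
    if convert_to_roboted_reason is not None:
        # The status was converted to ROBOTED for the given reason.
        return "converted_to_roboted"
    if status == CRAWL_STATUS_ROBOTED:
        # Not converted, yet roboted: disallowed in robots.txt.
        return "disallowed_in_robots_txt"
    return "not_roboted"


if __name__ == "__main__":
    print(describe_roboted_state(None, None))
    print(describe_roboted_state(CRAWL_STATUS_ROBOTED, None))
    print(describe_roboted_state(CRAWL_STATUS_ROBOTED, 3))
```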
demotionTags (type: list(String.t), default: nil)
errorType (type: integer, default: nil): One of the enum values in converter.ErrorPageType (see indexing/converter/proto/error-page-detector-enum.proto for details). Default is converter::ERROR_PAGE_NONE.
freshdocsCorpora (type: list(String.t), default: nil)
hostid (type: String.t, default: nil): The host id of the document. Used chiefly to determine whether the document is part of a parked domain.
ieIdentifier (type: String.t, default: nil): A short descriptive string to help identify the IE application or setup where this CDoc is generated. For example: websearch_m3. This field is for debuggability purposes.
imageIndexingInfo (type: ImageSearchImageIndexingInfo, default: nil; full type: GoogleApi.ContentWarehouse.V1.Model.ImageSearchImageIndexingInfo.t): Indexing info about images (i.e. image links missing image data, etc).
indexingTs (type: String.t, default: nil): The timestamp (the time since the Epoch, in microseconds) when the docjoin is exported from indexing. The main purpose of this field is to identify different versions of the same document.
noLongerCanonicalTimestamp (type: String.t, default: nil): If set, the timestamp in microseconds when the URL stopped being canonical. This should never be set for exported canonical documents. This field is used by dups during a canonical flip, and by webmain when doc selection switches between desktop and mobile. Union respects this timestamp to prevent the old doc from being deleted until the new doc is picked up.
normalizedClickScore (type: number, default: nil): This score is calculated by re-mapping the back onto the partition's score distribution, such that the score represents the score of the equivalently ranked organically-selected document.
primaryVertical (type: String.t, default: nil): Vertical membership of the document. primary_vertical is the vertical that initiated indexing of this document (or empty if the vertical was websearch). verticals is the full list of verticals that contained this document (excluding websearch) at indexing time. primary_vertical may or may not be an element of verticals because of vertical membership skew between ingestion time and indexing time. See go/one-indexing-for-web for more background.
rawNavboost (type: integer, default: nil): The raw navboost count for the canonical url, without aggregating the navboost from dup urls. This field is used when building the forwarding map.
rowTimestamp (type: String.t, default: nil): The timestamp (the time since the Epoch, in microseconds) that represents the doc version used in downstream processing after Raffia. If it is not set, indexing_ts is used as row_timestamp. The timestamp is generally set by reprocessing to a value slightly newer than indexing_ts, so that the reprocessed version overwrites old data in storage.
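The fallback from rowTimestamp to indexingTs, and the microsecond encoding both fields share, can be sketched as follows. Both helper names are hypothetical, and the sketch assumes the string values hold plain decimal integers, which is how 64-bit integers are typically serialized in JSON APIs but is not stated in this page.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def effective_row_timestamp(indexing_ts: Optional[str],
                            row_timestamp: Optional[str]) -> Optional[int]:
    """Return row_timestamp in microseconds, falling back to indexing_ts."""
    raw = row_timestamp if row_timestamp is not None else indexing_ts
    # ASSUMPTION: values are decimal integer strings.
    return int(raw) if raw is not None else None


def micros_to_datetime(micros: int) -> datetime:
    """Convert microseconds since the Epoch to an aware UTC datetime."""
    # timedelta arithmetic keeps microsecond precision exactly.
    return EPOCH + timedelta(microseconds=micros)


if __name__ == "__main__":
    ts = effective_row_timestamp("1700000000000000", None)
    print(ts, micros_to_datetime(ts).isoformat())
```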
selectionTierRank (type: number, default: nil): Selection tier rank is a language-normalized score ranging from 0 to 1 over the serving tier (Base, Zeppelins, Landfills) for this document.
tracingId (type: list(String.t), default: nil): The tracing ids label the version of a url for url status tracking. This repeated field carries at most 10 tracing ids. See more details in go/rich-tracing-design. Fewer than 2% of base+uz cdocs carry this field. The major sources of tracing ids include Indexing API pushed urls and Index Metrics sampling urls. The tracing ids are written into cdocs by Webmain Ramifier. The consumer of the tracing ids is the Union serving notification collector; see more at go/serving-notification-from-union.
urlChangerate (type: CrawlerChangerateUrlChangerate, default: nil; full type: GoogleApi.ContentWarehouse.V1.Model.CrawlerChangerateUrlChangerate.t): Changerate information for this doc (see crawler/changerate/changerate.proto for details).
urlHistory (type: CrawlerChangerateUrlHistory, default: nil; full type: GoogleApi.ContentWarehouse.V1.Model.CrawlerChangerateUrlHistory.t): Url change history for this doc (see crawler/changerate/changerate.proto for details). Note that if a doc has more than 20 changes, only the last 20 are kept here to avoid adding too much data to its docjoin.
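The "keep only the last 20 changes" rule describes a bounded history. One common way to get that behaviour is a fixed-length deque, sketched below. This is not the actual docjoin code; the function name and the use of plain integers as change records are assumptions for illustration.

```python
from collections import deque
from typing import Iterable, List, TypeVar

T = TypeVar("T")

# The limit of 20 comes from the urlHistory description above.
MAX_CHANGES = 20


def keep_recent_changes(changes: Iterable[T]) -> List[T]:
    """Return at most the last MAX_CHANGES items, oldest first."""
    # A deque with maxlen silently drops the oldest entries as it fills.
    return list(deque(changes, maxlen=MAX_CHANGES))


if __name__ == "__main__":
    history = keep_recent_changes(range(25))
    print(len(history), history[0], history[-1])
```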
urlPatternSignals (type: IndexingSignalAggregatorUrlPatternSignals, default: nil; full type: GoogleApi.ContentWarehouse.V1.Model.IndexingSignalAggregatorUrlPatternSignals.t): UrlPatternSignals for this doc, used to compute the document score in LTG (see indexing/signal_aggregator/proto/signal-aggregator.proto for details).
verticals (type: list(String.t), default: nil)
videoIndexingInfo (type: ImageRepositoryVideoIndexingInfo, default: nil; full type: GoogleApi.ContentWarehouse.V1.Model.ImageRepositoryVideoIndexingInfo.t): Indexing info about videos.