CompositeDocAdditionalChecksums
Composite Doc, Document Processing
GoogleApi.ContentWarehouse.V1.Model.CompositeDocAdditionalChecksums
SEO Analysis
AI Generated
This model is part of Google's Composite Document system, which brings together all known information about a URL into a single unified document representation, including content, links, quality signals, and metadata from multiple sources. The composite document is the complete picture Google has of a page and serves as the input for ranking algorithms.
Actionable Insights for SEOs
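- Understanding this model helps SEOs grasp Google's internal data architecture: it holds per-page content fingerprints (a checksum computed without transient boilerplate, plus simhash values and a confidence score for them)
- The deprecated simhash-v1 came from ApproxDupsParseHandler, which suggests these fingerprints relate to approximate-duplicate detection; consider how duplicated or heavily templated content might interact with other ranking signals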
Attributes
- NoTransientChecksum96 (type: string, default: nil, full type: String.t): Same as ContentChecksum96 but without transient boilerplate.
- SimHash (type: string, default: nil, full type: String.t): Deprecated. Use simhash_v2 and simhash_v2_significance instead.
- SimHashIsTrusted (type: boolean, default: nil): No description given.
- simhashV2 (type: string, default: nil, full type: String.t): Simhash-v2 is generated by SimHashParseHandler and is designed as a complete replacement for simhash-v1 (the original SimHash above) from ApproxDupsParseHandler. Simhash-v2 uses a revised algorithm that is expected to work better than simhash-v1 in most cases. The two coexist during the current transition period, after which simhash-v1 will be retired.
- simhashV2Significance (type: float, default: nil): Simhash-v2-significance describes the confidence in the corresponding simhash-v2 value. It is defined as the average absolute difference from zero of all internal state components when finalizing a simhash-v2 value in HashMultiSetDotCauchy. The significance used to be compared against a pre-defined threshold (default: 20) to get a boolean value "trusted_simhash_v2". However, this field may be missing while "simhash_v2" is present. In that case: (1) use "SimHashIsTrusted" instead if it is present, and/or (2) assume "simhash_v2" is trusted if its value is non-zero.
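The significance definition and the fallback rules above can be sketched in Python. This is an illustrative reading of the field description, not Google's implementation. The function names, the `>=` direction of the threshold comparison, the order in which the two fallbacks are tried, and the assumption that the string-typed simhash_v2 has already been parsed into an integer are all hypothetical.

```python
from typing import Optional, Sequence

# "default: 20" comes from the field description; how it is applied is assumed.
DEFAULT_SIGNIFICANCE_THRESHOLD = 20.0


def simhash_significance(state: Sequence[float]) -> float:
    """Average absolute difference from zero of the internal state components."""
    if not state:
        raise ValueError("state must be non-empty")
    return sum(abs(x) for x in state) / len(state)


def is_simhash_v2_trusted(
    simhash_v2: Optional[int],
    significance: Optional[float] = None,
    simhash_is_trusted: Optional[bool] = None,
    threshold: float = DEFAULT_SIGNIFICANCE_THRESHOLD,
) -> Optional[bool]:
    """Resolve a trust flag for simhash_v2 (hypothetical helper)."""
    if simhash_v2 is None:
        return None  # no simhash_v2 value, so there is nothing to trust
    if significance is not None:
        # Assumption: "trusted" means significance at or above the threshold.
        return significance >= threshold
    if simhash_is_trusted is not None:
        # Fallback (1): use SimHashIsTrusted when it is present.
        return simhash_is_trusted
    # Fallback (2): treat a non-zero simhash_v2 as trusted.
    return simhash_v2 != 0
```

For example, `is_simhash_v2_trusted(123, significance=25.0)` resolves through the threshold comparison, while `is_simhash_v2_trusted(123)` falls through to the non-zero check. Because the description says "AND/OR", trying SimHashIsTrusted before the non-zero check is only one plausible ordering.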