NlpSaftToken

NLP Infrastructure

GoogleApi.ContentWarehouse.V1.Model.NlpSaftToken

SEO Impact: 3 out of 10 (Low)
A document token marks a span of bytes in the document text as a token or word. Next available index: 16.

SEO Analysis

AI Generated

This model (NlpSaftToken) is backend infrastructure with indirect SEO impact. It defines 15 attributes, including a flag that records whether the break skipped over non-tag text (excluding script/style).

Actionable Insights for SEOs

  • Understanding this model helps SEOs grasp Google's internal data architecture
  • Consider how this system might interact with other ranking signals

Attributes (15)

breakLevel (string)
Default: nil. Full type: String.t

breakSkippedText (boolean)
Default: nil

Whether the break skipped over non-tag text (excluding script/style).

category (string)
Default: nil. Full type: String.t

Coarse-grained word category for token. See README.categories for category inventory.

end (integer)
Default: nil

head (integer)
Default: nil

Head of this token in the dependency tree: the id of the token which has an arc going to this one. If it is the root token of a sentence, then it is set to -1.
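
The head encoding described above can be turned into a child lookup. The following is a minimal sketch, not Google's implementation. It assumes that token ids are simply positions in a sentence's token list, and the helper name `build_children` is hypothetical.

```python
from collections import defaultdict


def build_children(heads):
    """Group token ids by their head.

    `heads[i]` is the `head` value of token i. A value of -1 marks the
    root token of a sentence. Assumption: token ids are list positions.
    """
    children = defaultdict(list)
    roots = []
    for token_id, head in enumerate(heads):
        if head == -1:
            roots.append(token_id)           # root of a sentence
        else:
            children[head].append(token_id)  # arc goes from head to token
    return roots, dict(children)


# "The cat sleeps": "sleeps" (id 2) is the root, "cat" (id 1) depends on
# "sleeps", and "The" (id 0) depends on "cat".
roots, children = build_children([1, 2, -1])
```

Here `roots` holds the id of "sleeps" and `children` maps each head id to the ids of its dependents.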

Default: nil. Full type: GoogleApi.ContentWarehouse.V1.Model.Proto2BridgeMessageSet.t

Annotation for this token.

label (string)
Default: nil. Full type: String.t

Label for dependency relation between this token and its head. See README.labels for label inventory.

lemma (string)
Default: nil. Full type: String.t

Word lemma. This is only filled if the lemma is different from the word form.
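
Because the lemma is only filled when it differs from the word form, a consumer needs a fallback. The sketch below assumes tokens have been decoded into plain dicts keyed by the attribute names on this page; the helper name `effective_lemma` is hypothetical.

```python
def effective_lemma(token):
    """Return the lemma, falling back to the word form when lemma is unset.

    Assumption: `token` is a dict with optional "lemma" and "word" keys.
    """
    lemma = token.get("lemma")
    return lemma if lemma else token.get("word")


# "ran" carries an explicit lemma; "cat" does not, so its word form is used.
print(effective_lemma({"word": "ran", "lemma": "run"}))
print(effective_lemma({"word": "cat"}))
```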

Default: nil. Full type: GoogleApi.ContentWarehouse.V1.Model.NlpSaftMorphology.t

Morphology information.

scriptCode (string)
Default: nil. Full type: String.t

A string representation (typically four letters, sometimes longer) of the token's Unicode script code, based on BCP 47/CLDR, capitalized according to ISO 15924. See i18n/identifiers/scriptcode.h for details.

start (integer)
Default: nil

[start, end] describe the inclusive byte range of the UTF-8 encoded token in document.text. End gives the index of the last byte, which may be a UTF-8 continuation byte, and the length in bytes is end - start + 1. begin/end options are for goldmine AnnotationsFinder to locate the offsets of saft tokens. Start is inclusive by default and end is marked.
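
The inclusive byte range is easy to get wrong by one, so a small worked example may help. This is a sketch under the assumption that the document text is available as a Python string; the helper names `token_byte_length` and `token_text` are hypothetical.

```python
def token_byte_length(start, end):
    """Length in bytes of an inclusive [start, end] range."""
    return end - start + 1


def token_text(document_text, start, end):
    """Slice a token out of the UTF-8 encoding of the document text.

    `end` is inclusive, so the Python slice must stop at end + 1.
    """
    data = document_text.encode("utf-8")
    return data[start:end + 1].decode("utf-8")


text = "café au lait"
# "é" takes two bytes in UTF-8, so "café" spans bytes 0..4 (five bytes),
# and byte 4 is a continuation byte.
print(token_byte_length(0, 4))
print(token_text(text, 0, 4))
print(token_text(text, 6, 7))
```

Slicing by bytes rather than by characters matters here: indexing the string directly with the same offsets would return the wrong span as soon as the text contains a multi-byte character.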

tag (string)
Default: nil. Full type: String.t

Part-of-speech tag for token. See README.tags for tag inventory.

tagConfidence (number)
Default: nil

Confidence score for the tag prediction -- should be interpreted as a probability estimate that the tag is correct.
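
Since the score is meant to be read as a probability, one plausible use is flagging tokens whose tag is uncertain. The sketch below is illustrative only: the 0.8 threshold, the dict representation, and the helper name `uncertain_tags` are all assumptions, not part of the documented model.

```python
def uncertain_tags(tokens, threshold=0.8):
    """Return (word, tag, confidence) for tokens tagged below `threshold`.

    Tokens without a tagConfidence value are skipped, since the
    attribute defaults to nil.
    """
    flagged = []
    for token in tokens:
        confidence = token.get("tagConfidence")
        if confidence is not None and confidence < threshold:
            flagged.append((token.get("word"), token.get("tag"), confidence))
    return flagged


tokens = [
    {"word": "book", "tag": "VERB", "tagConfidence": 0.55},
    {"word": "a", "tag": "DET", "tagConfidence": 0.99},
    {"word": "flight", "tag": "NOUN"},
]
print(uncertain_tags(tokens))
```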

textProperties (integer)
Default: nil

word (string)
Default: nil. Full type: String.t

Token word form. This may not be identical to the original. For example, in goldmine annotation we do UTF-8 normalization and punctuation normalization. The punctuation normalization includes inferring the directionality of straight doublequotes -- that is, we map " to open quote (``) or close quote (''), and sometimes we get it wrong. SAFT processing in other contexts (such as queries in qrewrite) involves different normalizations.