ResearchScienceSearchReconciledMetadata

Research, Search Infrastructure

GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchReconciledMetadata

SEO Impact: 7 out of 10 (High)
A proto for storing inferred and reconciled metadata for Science Search. Next available tag: 74

SEO Analysis

AI Generated

Part of the core search serving infrastructure. These systems process and serve search results rather than act as direct ranking signals. This model (ResearchScienceSearchReconciledMetadata) contains SEO-relevant attributes, including datasetClassificationScore. Among other fields, it stores an identifier as provided by the dataset itself.

Actionable Insights for SEOs

  • Monitor for changes in rankings that may correlate with updates to this system
  • Consider how your content strategy aligns with what this signal evaluates

Attributes

60
identifierFromSource: string
Default: nil. Full type: list(String.t)

An identifier as provided by the dataset itself.

name: string
Default: nil. Full type: list(String.t)

The names of the dataset.

doi: string
Default: nil. Full type: String.t

The DOI for the dataset. We assume that there is only one.

Default: nil. Full type: GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchDate.t

Most recent of the three dates (published, created, modified)
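
As a rough illustration of how a "most recent of three dates" value could be derived, the Python sketch below returns the latest date among those that are present. The function name and the use of `datetime.date` are assumptions made for this example; the actual field uses the ResearchScienceSearchDate type, whose structure is not shown here.

```python
from datetime import date
from typing import Optional


def most_recent_date(
    published: Optional[date],
    created: Optional[date],
    modified: Optional[date],
) -> Optional[date]:
    """Return the latest of the dates that are present, or None if all are missing.

    Hypothetical helper: the name and signature are illustrative only.
    """
    present = [d for d in (published, created, modified) if d is not None]
    return max(present) if present else None
```

Missing dates are skipped rather than treated as errors, since any of the three may be absent in the raw metadata.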

Default: nil. Full type: GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchDate.t

The date when the dataset was published.

alternateName: string
Default: nil. Full type: list(String.t)

Alternate names and acronyms for the dataset.

locationReconciledForName: boolean()
Default: nil

Indicates if the location has been reconciled for the dataset name. This is used by LocationExtender to avoid re-annotating the dataset name.

Default: nil. Full type: list(GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchFieldOfStudyInfo.t)

Field of study: a general, high-level classification of the dataset. This is populated only at indexing time, and only if the classification_source is KNOWLEDGE_GRAPH or it is above the inference threshold.
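
The population rule described above can be sketched in Python as follows. This is only an illustration: the `FieldOfStudyCandidate` class, the `score` attribute, and the threshold value are hypothetical. Only `classification_source` and the `KNOWLEDGE_GRAPH` value come from the description.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the real inference threshold is not documented here.
INFERENCE_THRESHOLD = 0.5


@dataclass
class FieldOfStudyCandidate:
    label: str
    # "KNOWLEDGE_GRAPH" is named in the description; any other value is assumed.
    classification_source: str
    # Assumed confidence score in [0, 1] for inferred classifications.
    score: float


def should_populate(
    candidate: FieldOfStudyCandidate,
    threshold: float = INFERENCE_THRESHOLD,
) -> bool:
    """Keep Knowledge Graph classifications; keep others only above the threshold."""
    if candidate.classification_source == "KNOWLEDGE_GRAPH":
        return True
    return candidate.score > threshold
```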

sameAs: string
Default: nil. Full type: list(String.t)

Ids for other instances (not different versions) of this dataset.

Default: nil. Full type: list(GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchLicense.t)

License for the dataset.

versionsSimhash: string
Default: nil. Full type: String.t

A simhash value of the fields used for identifying versions of a dataset. This will be used by the VersionClusterInfoWriter.
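
For readers unfamiliar with simhash, the Python sketch below shows the general technique: each token votes on every bit of the fingerprint, so similar inputs yield fingerprints that differ in only a few bits. This is a generic textbook version, not the implementation behind this field. The token hashing (SHA-256 truncated to 64 bits), the function names, and the hex encoding are all assumptions.

```python
import hashlib


def simhash(tokens, bits=64):
    """Compute a simhash fingerprint (as an int) over an iterable of string tokens."""
    weights = [0] * bits
    for token in tokens:
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Each token votes +1 for a set bit and -1 for an unset bit.
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def simhash_hex(tokens, bits=64):
    """Encode the fingerprint as a fixed-width hex string (an assumed encoding)."""
    return format(simhash(tokens, bits), "0{}x".format(bits // 4))
```

Two records can then be treated as candidate versions of the same dataset when the Hamming distance between their fingerprints is small. The cutoff for "small" would have to be tuned and is not specified in this document.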

description: string
Default: nil. Full type: list(String.t)

Description of the dataset.

Default: nil. Full type: GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchDate.t

isAccessibleForFree: string
Default: nil. Full type: String.t

Indicates whether the dataset is available for free or is behind a paywall. See http://schema.org/isAccessibleForFree

Default: nil. Full type: GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchDate.t

The start and end date that the dataset covers. If the dataset covers a single timepoint, then start and end dates are the same. Use the ISO 8601 format for dates (e.g., 2006-05-23).
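
A minimal Python sketch of the coverage rule above, assuming the coverage arrives as a string holding either a single ISO 8601 date or two dates separated by a slash (the ISO 8601 interval notation). The string input and the function name are assumptions; the field itself uses the ResearchScienceSearchDate type.

```python
from datetime import date
from typing import Tuple


def parse_coverage(value: str) -> Tuple[date, date]:
    """Parse 'YYYY-MM-DD' or 'YYYY-MM-DD/YYYY-MM-DD' into a (start, end) pair.

    A single timepoint yields identical start and end dates.
    """
    start_text, separator, end_text = value.strip().partition("/")
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text) if separator else start
    if end < start:
        raise ValueError("coverage end date precedes start date: %r" % value)
    return start, end
```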

versionEmbeddingVector: list(number())
Default: nil

An embedding for the dataset to be used by the VersionAggregator.
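
How the VersionAggregator compares embeddings is not described here. One common way to compare two embedding vectors is cosine similarity, sketched below in Python purely as an assumption about how such a vector might be used.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Similarity is undefined for a zero vector; report 0.0 by convention.
        return 0.0
    return dot / (norm_a * norm_b)
```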

authorList: string
Default: nil. Full type: String.t

A string representation of the authors of the dataset, collected from author and creator in raw metadata. The exact format (e.g., comma-separated, etc.) is up to the extender that populates this field. The assumption is that this string may appear in the UI "as is".

Default: nil. Full type: GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchDate.t

The date when the dataset was created.

topSalientTermLabel: string
Default: nil. Full type: list(String.t)

Top salient term labels that describe the dataset document body.

keyword: string
Default: nil. Full type: list(String.t)

Keywords describing the dataset.

hasCroissantFormat: boolean()
Default: nil

Indicates whether the dataset is in Croissant format (https://github.com/mlcommons/croissant). The field is optional so that explicitly setting it to false passes the value along to the KG, instead of being indistinguishable from unset and thus not set in the KG.
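
The difference between "explicitly false" and "unset" can be illustrated with a tri-state value. In this Python sketch, `None` stands for unset, and only an explicit `True` or `False` is written to the outgoing record. The function name and the dictionary payload are hypothetical stand-ins for whatever is actually sent to the KG.

```python
from typing import Dict, Optional


def to_kg_payload(has_croissant_format: Optional[bool]) -> Dict[str, bool]:
    """Include the flag only when it was explicitly set, even if it is False."""
    payload: Dict[str, bool] = {}
    # Test against None, not truthiness, so an explicit False is preserved.
    if has_croissant_format is not None:
        payload["hasCroissantFormat"] = has_croissant_format
    return payload
```

Checking truthiness instead (`if has_croissant_format:`) would silently drop an explicit false, which is the situation the optional field is meant to avoid.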

denylistStatus: string
Default: nil. Full type: list(String.t)

datasetClassificationScore: float()
Default: nil

Probability that the entity is in fact a dataset (as opposed to spam, or a website labelled as a dataset that does not describe a dataset).

languageCode: string
Default: nil. Full type: String.t

The 2-letter language code for the source page for the dataset. Same as the language code in source_url_docjoin_info. Populated only when generating output for indexing.

Default: nil. Full type: GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchSourceUrlDocjoinInfo.t

All the information extracted from docjoin, for the source_url of this dataset, aka DatasetMetadata.source_url.

compactIdentifierFromCitation: string
Default: nil. Full type: list(String.t)

Compact Identifier(s) extracted from the citation field. As with DOI(s) extracted from citations, these identify the articles related to the dataset rather than the dataset itself.

mentionedUrls: string
Default: nil. Full type: list(String.t)

Mentioned URLs in the description.

Default: nil. Full type: GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchDate.t

The date when the dataset was modified.

Default: nil. Full type: list(GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchOrganization.t)

Funder of the dataset.

variable: string
Default: nil. Full type: list(String.t)

Variables that the data in the dataset captures (e.g., pressure, salinity, temperature). For now, these are just strings.

numberOfDatasetsAtSourceUrl: integer()
Default: nil

The number of datasets at the same source url as this dataset.

Default: nil. Full type: list(GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchLocation.t)

Locations that describe spatial coverage of the data. If the data covers multiple locations then each value corresponds to one such location, describing its coordinates, mid, etc.

Default: nil. Full type: list(GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchOrganization.t)

Source of the dataset: unifies provider, creator, author, publisher etc.

doiFromCitation: string
Default: nil. Full type: list(String.t)

DOI(s) extracted from the citation field. In contrast to the "doi" field these DOIs identify the articles related to the dataset rather than the dataset itself.
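
As an illustration of pulling DOIs out of free-text citations, the Python sketch below uses a regular expression modelled on the commonly used `10.<registrant>/<suffix>` shape. It is an approximation under stated assumptions, not the extractor behind this field: the pattern will miss unusual DOIs, the function name is hypothetical, and the DOIs in the usage example are made up.

```python
import re
from typing import List

# Approximate DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def extract_dois(citation: str) -> List[str]:
    """Return the distinct DOIs found in a citation string, in order of appearance."""
    found: List[str] = []
    for match in DOI_PATTERN.finditer(citation):
        # Drop sentence punctuation that the suffix pattern may have swallowed.
        doi = match.group(0).rstrip(".,;:)")
        if doi not in found:
            found.append(doi)
    return found
```

For example, `extract_dois("Doe, J. (2020). Example study. https://doi.org/10.1234/abc.5678.")` yields a single DOI with the trailing period removed.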

indexInCluster: integer()
Default: nil

Index of this dataset in its cluster of replicas.

Default: nil. Full type: list(GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchDataDownload.t)

The dataset in downloadable form. There can be multiple data download entries for different file types.

scholarQuery: string
Default: nil. Full type: String.t

Query string to send to Scholar to obtain the best approximation of citations to the dataset.

Default: nil. Full type: list(GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchCitation.t)

Default: nil. Full type: GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchCatalog.t

Catalog that this dataset is a part of.

isBasedOn: string
Default: nil. Full type: list(String.t)

A resource (most likely another dataset) from which this dataset is derived or from which it is a modification or adaption. http://schema.org/isBasedOn

Default: nil. Full type: GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchVersionClusterInfo.t

Information on the version cluster that the dataset is a part of. This field is populated during the indexing time; the field is populated only if the dataset is part of a version cluster.

url: string
Default: nil. Full type: list(String.t)

URLs for the dataset, including the DOI.

Default: nil. Full type: list(GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchReplica.t)

The info of replicas of this dataset.

isInferred: boolean()
Default: nil

Indicates whether the metadata was inferred using an ML model rather than taken from the schema.org fields. The field is optional so that explicitly setting it to false passes the value along to the KG, instead of being indistinguishable from unset and thus not set in the KG. This field was originally non-optional; changing it to optional is backwards compatible, but protos created before the change won't have has_is_inferred() (go/proto-proposals/proto3-presence#wire-format-semantic-changes).

metadataType: string
Default: nil. Full type: String.t

Default: nil. Full type: GoogleApi.ContentWarehouse.V1.Model.ResearchScienceSearchScholarlyArticle.t

For tables and figures, contains all of the metadata for a scholarly article that was the source of this table or figure. This field is populated only if metadata_type is 'TABLE' or 'FIGURE'.

relatedArticleUrl: string
Default: nil. Full type: String.t

The url for the article that (likely) describes this dataset.

basicFieldsHash: string
Default: nil. Full type: String.t

A hash of the fields copied by BasicMetadataExtender and the importers. See cs/research/science_search/backend/extender/basic_metadata_extender.h for the list of fields.
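
A hash over a fixed set of fields is a cheap way to detect whether any of them changed between runs. The Python sketch below shows one way to compute such a hash deterministically. The field list, the function name, and the choice of SHA-256 over canonical JSON are all assumptions; the real field list lives in the header file referenced above and is not reproduced here.

```python
import hashlib
import json
from typing import Any, Iterable, Mapping

# Hypothetical field list, for illustration only.
HASHED_FIELDS = ("name", "description", "url")


def fields_hash(
    metadata: Mapping[str, Any],
    fields: Iterable[str] = HASHED_FIELDS,
) -> str:
    """Hash only the selected fields, independent of key order in the input."""
    selected = {field: metadata.get(field) for field in fields}
    canonical = json.dumps(
        selected, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because only the listed fields feed the hash, edits to other fields leave it unchanged, which lets downstream steps skip work when the relevant inputs are the same.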

compactIdentifier: string
Default: nil. Full type: list(String.t)

Compact Identifiers (for example "RRID:SCR_002088") that can be resolved by Identifiers.org or N2T.net meta-resolvers.
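
A compact identifier has the form `prefix:accession`. The Python sketch below splits one into its parts and builds a resolver URL by appending the identifier to the resolver's base URL, which is the pattern the Identifiers.org and N2T.net meta-resolvers are understood to accept. The function names are hypothetical, and the URL pattern should be checked against the resolvers' own documentation.

```python
from typing import Tuple


def parse_compact_identifier(value: str) -> Tuple[str, str]:
    """Split 'prefix:accession' at the first colon; the accession may contain colons."""
    prefix, separator, accession = value.strip().partition(":")
    if not separator or not prefix or not accession:
        raise ValueError("not a compact identifier: %r" % value)
    return prefix, accession


def resolver_url(value: str, resolver: str = "https://identifiers.org") -> str:
    """Build a meta-resolver URL for a compact identifier (assumed URL pattern)."""
    prefix, accession = parse_compact_identifier(value)
    return "{}/{}:{}".format(resolver.rstrip("/"), prefix, accession)
```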

imageUrl: string
Default: nil. Full type: list(String.t)

The image urls provided by the dataset (e.g., for thumbnail images).

licenseDeprecated: string
Default: nil. Full type: list(String.t)

License for the dataset. DEPRECATED

versionEmbeddingFieldsHash: string
Default: nil. Full type: String.t

A hash of the raw metadata fields used by the VersionEmbeddingExtender.

hasTableSummaries: boolean()
Default: nil

Indicates if the dataset has table summaries. This field is only populated during indexing time.

numberOfScholarCitations: integer()
Default: nil

The number of articles that reference this dataset.

id: string
Default: nil. Full type: String.t

A unique id for the dataset. For data from Spore, this is the Spore id, for example "http://accession.nodc.noaa.gov/8500223#__sid=js0". REQUIRED.

measurementTechnique: string
Default: nil. Full type: list(String.t)

A technique or technology used in a Dataset corresponding to the method used for measuring the corresponding variable(s) (described using variableMeasured). http://schema.org/measurementTechnique

sourceUrl: string
Default: nil. Full type: String.t

Source URL from which the metadata was gathered.

fingerprint: string
Default: nil. Full type: String.t

The fingerprint of basic fields from DatasetMetadata, including name and description. DEPRECATED.

descriptionInHtml: string
Default: nil. Full type: list(String.t)

Description of the dataset converted to HTML.

datasetClassificationFieldsHash: string
Default: nil. Full type: String.t

A hash of the raw metadata fields used by the QualityExtender.