TrawlerTrawlerPrivateFetchReplyData

Trawler | Crawling

GoogleApi.ContentWarehouse.V1.Model.TrawlerTrawlerPrivateFetchReplyData

SEO Impact: Critical

This is an optional container of arbitrary data that can be added to a FetchReplyData. This data is meant to be logged, but not sent back in a fetch reply (it should be added after the reply is prepared). Use FetchResponsePreparatorImpl::AddTrawlerPrivateDataToFetchReplyData to add. See also the comment in fetch_response_preparator_impl.cc. Next Tag: 49

SEO Analysis

AI Generated

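Part of Google's web crawling infrastructure (Trawler is Google's internal name for its web crawler). As the description above states, this model is a log-only container attached to a fetch reply, so it records how a fetch was carried out rather than controlling it. Its fields still show what the crawler tracks per fetch, such as robots.txt status, cache hits, and the user agent sent. Crawl management directly impacts how quickly new content is discovered and how often existing content is refreshed in the index.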

Actionable Insights for SEOs

Monitor for changes in rankings that may correlate with updates to this system
Consider how your content strategy aligns with what this signal evaluates
Optimize crawl budget by fixing broken links and reducing redirect chains
Use robots.txt and sitemap.xml effectively to guide crawling
Monitor Google Search Console for crawl errors and indexing issues

Attributes

PostDataSize (string)

Default: nil | Full type: String.t

Size of the POST data in bytes, if this was a POST request.

numDroppedReplies (string)

Default: nil | Full type: String.t

Number of times the content of a stream reply or the final reply was dropped. Currently this can only be caused by REJECTED_NO_RPC_BUFFERS.

HintIPAddress (string)

Default: nil | Full type: String.t

If we do not have Endpoints in FetchReplyData (e.g., url rejected due to hostload limit), do we have a guess of the server IPAddress (e.g., from robots fetch)? This helps us classify URLs based on country code, etc. The field is filled with IPAddress::ToPackedString().
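
The description says the field holds the output of IPAddress::ToPackedString(). The sketch below illustrates one common meaning of a "packed" address, assuming (the source does not confirm this) that it is the raw network-byte-order bytes: 4 bytes for IPv4 and 16 for IPv6. The helper names to_packed and from_packed are hypothetical.

```python
import ipaddress


def to_packed(ip_text: str) -> bytes:
    """Return the raw network-byte-order bytes of an IP address.

    Assumption: this mirrors what a "packed string" means; the real
    IPAddress::ToPackedString() encoding is not shown in the source.
    """
    return ipaddress.ip_address(ip_text).packed


def from_packed(packed: bytes) -> str:
    """Recover the textual address from 4 (IPv4) or 16 (IPv6) packed bytes."""
    return str(ipaddress.ip_address(packed))


packed_v4 = to_packed("192.0.2.10")
packed_v6 = to_packed("2001:db8::1")
```

A consumer that wants to classify URLs by country would first unpack the bytes back into an address and then look the address up in whatever geolocation data it has.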

RpcStartDeadlineLeftMs (integer)

Default: nil

RPC deadline left at the start of the URL control flow. Useful for debugging RPC deadline-exceeded errors received by clients. This field is only recorded if RpcEndDeadlineLeftMs is small enough.

largeStoreHitLocation (string)

Default: nil | Full type: String.t

Set to the hit location (CNS filename) if cache comes from large store.

isDedicatedHostload (boolean)

Default: nil

dependentFetchType (string)

Default: nil | Full type: String.t

Dependent fetch type

isVpcTraffic (boolean)

Default: nil

Set if the fetch goes through the virtual private cloud path so we can track the VPC traffic.

httpVersion (string)

Default: nil | Full type: String.t

Stores the HTTP version we used in the last hop.

BotGroupName (string)

Default: nil | Full type: String.t

If we fetched using BotFetchAgent, what is the BotGroupName?

isBidiStreamingFetch (boolean)

Default: nil

Whether this is a bidirectional streaming fetch.

authenticationInfo (string)

Default: nil | Full type: String.t

Stores the OAuth authentication method.

RequestUserName (string)

Default: nil | Full type: String.t

Logs the LOAS username in Trawler private data to help with debugging. The username is stored here so that clients do not see it in the FetchReply. To reduce disk usage, the LOAS username is only logged if the requestorid being used does not have ClientUsernameRestrictions.

cacheHitType (string)

Default: nil | Full type: String.t

Only set if the fetch uses cache content (is_cache_fetch is true).

originalClientParams (TrawlerOriginalClientParams)

Default: nil | Full type: GoogleApi.ContentWarehouse.V1.Model.TrawlerOriginalClientParams.t

Store the original client information.

IsRobotsFetch (boolean)

Default: nil

Was this an internally-initiated robots.txt fetch?

resourceBucket (string)

Default: nil | Full type: String.t

If the requestor shares resource bucket with other requestorids, we will store the resource bucket name in these fields.

cacheAcceptableAge (integer)

Default: nil

Corresponds to AcceptableAge field in FetchParams.

Producer (string)

Default: nil | Full type: String.t

Note: TrawlerPrivateFetchReplyData is never sent back to clients. This field is only for Trawler and Multiverse internal tracking, and clients should not look at it at all.

ProxyInstance (string)

Default: nil | Full type: String.t

If set, this fetch was done through a proxy (e.g., fetchproxy).

cdnProvider (string)

Default: nil | Full type: String.t

concurrentStreamNum (string)

Default: nil | Full type: String.t

How many concurrent streams are on the connection when the request finishes (including this request). Export this value to monitor the stream multiplexing for HTTP/2.

cacheAcceptableAfterDate (integer)

Default: nil

Corresponds to AcceptableAfterDate field in FetchParams.

credentialId (string)

Default: nil | Full type: String.t

Log the credential id

ResponseBytes (string)

Default: nil | Full type: String.t

The number of bytes we sent back to the client.

downloadFileName (string)

Default: nil | Full type: String.t

If the response contains the Content-Disposition header 'attachment; filename="google.zip"', the download_file_name would be "google.zip".
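
The extraction described above can be sketched with Python's standard library. This is an illustration only, not Trawler's implementation: the function name download_file_name is borrowed from the field for readability, and the parsing is delegated to email.message.Message, which understands header parameters such as filename.

```python
from email.message import Message
from typing import Optional


def download_file_name(content_disposition: str) -> Optional[str]:
    """Return the filename parameter of a Content-Disposition value, if any."""
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    # get_filename() reads the "filename" parameter of Content-Disposition
    # and returns None when no filename is present.
    return msg.get_filename()


name = download_file_name('attachment; filename="google.zip"')
```

A header without a filename parameter, such as a bare "inline", yields None, which matches the field's nil default.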

isFloonetFetch (boolean)

Default: nil

Whether or not this is a Floonet fetch request. Floonet requests have inherently lower availability (due to HOPE rejections when HOPE is in degraded mode, and other Floonet-specific reasons). Therefore, it is important for debugging and for our availability SLO to know whether or not it is a Floonet fetch. IMPORTANT NOTE: This field is currently only set for traffic that explicitly requires Floonet and cannot fail over to use Googlebot (i.e. "transparent" or "implicit" Floonet fetches).

multiverseClientIdentifier (TrawlerMultiverseClientIdentifier)

Default: nil | Full type: GoogleApi.ContentWarehouse.V1.Model.TrawlerMultiverseClientIdentifier.t

Multiverse client information

TrawlerInstance (string)

Default: nil | Full type: String.t

Which Trawler cell was this response fetched in? (e.g. "HR" or "YQ")

HSTSHeaderValue (string)

Default: nil | Full type: String.t

HTTP Strict-Transport-Security (RFC6797) header value. We log this so we can generate a list of hosts that prefer HTTPS over HTTP.
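
To make the "hosts that prefer HTTPS" idea concrete, here is a minimal sketch of reading a Strict-Transport-Security value under RFC 6797 rules (directives separated by semicolons, case-insensitive names, a required max-age). The names HstsPolicy, parse_hsts, and prefers_https are hypothetical, and how Google actually builds its host list from this field is not described in the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HstsPolicy:
    max_age: int
    include_subdomains: bool


def parse_hsts(header_value: str) -> Optional[HstsPolicy]:
    """Parse an HSTS header value; return None if max-age is missing or invalid."""
    max_age = None
    include_subdomains = False
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.strip().lower()
        value = value.strip().strip('"')  # max-age may be a quoted string
        if name == "max-age" and value.isascii() and value.isdigit():
            max_age = int(value)
        elif name == "includesubdomains":
            include_subdomains = True
    if max_age is None:
        return None
    return HstsPolicy(max_age, include_subdomains)


def prefers_https(header_value: str) -> bool:
    """A max-age of 0 tells clients to forget the policy, so it does not count."""
    policy = parse_hsts(header_value)
    return policy is not None and policy.max_age > 0
```

Treating max-age=0 as "no preference" follows RFC 6797, where a zero max-age instructs the client to stop treating the host as an HSTS host.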

tier (string)

Default: nil | Full type: String.t

Service tier info, used in the traffic grapher for plotting per-tier graphs.

Is5xxHostId (boolean)

Default: nil

Indicates whether the HostId belongs to the HostId set in the 5xx URL patterns. It can work as a tag when emitting the requestor minute summary, which helps aggregate traffic affected by 5xx patterns and test whether there are any fetching changes.

UserAgentSent (string)

Default: nil | Full type: String.t

The useragent string sent to the remote webserver. It corresponds to UserAgentToSend field in FetchParams.

googleExtendedObeyWildcardRobotsStatus (integer)

Default: nil

Whether Google-Extended is allowed to crawl this URL when wildcard rules are obeyed. This is for internal analysis. See RobotsTxtClient::RobotsStatus for the meaning of the number.

RobotsBody (string)

Default: nil | Full type: String.t

If this was a robots.txt fetch (IsRobotsFetch above), this may contain the robots.txt body. It may not; for instance, 404s are omitted (current policy is URL_CRAWLED + partially crawled). This includes HTTP headers + body.

UserAgentSentFp (string)

Default: nil | Full type: String.t

The fp2011 fingerprint of the useragent sent to the remote webserver. It corresponds to the UserAgentToSend field in FetchParams.

prodRegion (string)

Default: nil | Full type: String.t

Log the prod region (only for regional harpoon requestor ids)

RpcEndDeadlineLeftMs (integer)

Default: nil

RPC deadline left at the end of the URL control flow. Useful for debugging RPC deadline-exceeded errors received by clients. This field is only recorded if it is small enough.
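
The "only recorded if small enough" rule shared by RpcStartDeadlineLeftMs and RpcEndDeadlineLeftMs can be pictured as a simple gate on the end-of-flow value. The sketch below is hypothetical throughout: the source gives neither the threshold nor the comparison, so SMALL_DEADLINE_MS, the inclusive comparison, and the deadline_fields helper are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical cutoff; the real threshold is not given in the source.
SMALL_DEADLINE_MS = 1000


def deadline_fields(
    start_left_ms: int,
    end_left_ms: int,
    threshold_ms: int = SMALL_DEADLINE_MS,
) -> Dict[str, int]:
    """Return the deadline fields to log, or nothing if plenty of time was left."""
    if end_left_ms > threshold_ms:
        # Enough deadline remained at the end, so neither field is recorded.
        return {}
    return {
        "RpcStartDeadlineLeftMs": start_left_ms,
        "RpcEndDeadlineLeftMs": end_left_ms,
    }
```

Gating both fields on the end value keeps log volume down while still capturing the fetches most likely to have surfaced as deadline-exceeded errors on the client side.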

isFromGrpcProxy (boolean)

Default: nil

Whether or not this response is sent from gRPC proxy service.

ServerSignature (string)

Default: nil | Full type: String.t

An arbitrary string signature identifying the remote server type/version. In the case of HTTP, this would be the contents of the "Server:" header.

googleExtendedRobotsStatus (integer)

Default: nil

Whether Google-Extended is allowed to crawl this URL when wildcard rules are not obeyed. This is for internal analysis. See RobotsTxtClient::RobotsStatus for the meaning of the number.
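
The difference between googleExtendedRobotsStatus and googleExtendedObeyWildcardRobotsStatus can be sketched with Python's urllib.robotparser. This rests on an assumption the source does not spell out: that "wildcard rules" means "User-agent: *" groups. The function google_extended_allowed is hypothetical, and it returns a plain boolean rather than a RobotsTxtClient::RobotsStatus number, whose values are not given here.

```python
import re
from urllib.robotparser import RobotFileParser

# Matches a "User-agent: *" line, keeping any trailing comment.
_WILDCARD_AGENT = re.compile(r"^(\s*user-agent\s*:\s*)\*(\s*(#.*)?)$", re.IGNORECASE)


def google_extended_allowed(robots_txt: str, url: str, obey_wildcard: bool) -> bool:
    """Check whether the Google-Extended token may fetch url.

    With obey_wildcard=False, "User-agent: *" lines are renamed to a
    placeholder agent that never matches, so only groups that name
    Google-Extended explicitly can restrict the fetch.
    """
    lines = robots_txt.splitlines()
    if not obey_wildcard:
        lines = [_WILDCARD_AGENT.sub(r"\1wildcard-ignored\2", line) for line in lines]
    parser = RobotFileParser()
    parser.parse(lines)
    return parser.can_fetch("Google-Extended", url)


robots = "User-agent: *\nDisallow: /private/\n"
url = "https://example.com/private/page"
with_wildcard = google_extended_allowed(robots, url, obey_wildcard=True)
without_wildcard = google_extended_allowed(robots, url, obey_wildcard=False)
```

For a site whose robots.txt only has a catch-all group, the two checks can disagree, which is presumably why both results are logged for analysis. When a group names Google-Extended explicitly, that group decides the outcome in both modes.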

BotHostname (string)

Default: nil | Full type: String.t

This is the HOPE server that we sent the url to. We log the HOPE backend cell and hope server shard number (e.g., 'qf:6'). This allows us to understand how we are balancing our load to the HOPE servers.

subResourceBucket (string)

Default: nil | Full type: String.t

vpcDestination (TrawlerLoggedVPCDestination)

Default: nil | Full type: GoogleApi.ContentWarehouse.V1.Model.TrawlerLoggedVPCDestination.t

VPC information that is only set if is_vpc_traffic is true.

bypassedHostOverfull (boolean)

Default: nil

Set when a cache hit for this URL bypassed a host_overfull error.

CacheRequestorID (string)

Default: nil | Full type: String.t

Present if the reply is from the trawler cache. This is the requestorid of the trawler client that populated the cache with the data we are reusing.

HadInMemCacheHit (boolean)

Default: nil

FetcherTaskNumber (integer)

Default: nil

Which Trawler fetcher task fetched this URL.