Crawl Summary

This document describes the crawl summary data structure, which is stored in the summary database and is part of the entity data tree.

A crawl summary has the following fields:

crawl_type: Which kind of crawl was performed. (Same as in the crawl log)
crawl_uuid: UUID that identifies a crawl across databases, taken from the crawl log.
crawl_time: Taken from time_started in the crawl log.
agent_uuid: The UUID identifying the agent that did the crawling.
exit_code: The crawl exit code, which summarizes the outcome of the crawl.
server_last_modified: When the resource was last modified according to the server (UTC timestamp)
request_duration_ms: How long the request took in milliseconds
was_robotstxt_approved: Whether the request was approved by robots.txt.
http: An optional HTTP summary.

The URL is not part of this data structure and is assumed to match the entity generation URL.
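
As an illustration, the fields listed above could be modelled roughly as follows. This is only a sketch: the Python types, the class name CrawlSummary, the choice of an integer exit code, the optionality of server_last_modified, and the sample values are all assumptions, since this document does not specify how the fields are encoded. The HTTP summary is represented as a plain dictionary placeholder here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Optional
from uuid import UUID


@dataclass(frozen=True)
class CrawlSummary:
    """Hypothetical in-memory model of a crawl summary (types are assumed)."""

    crawl_type: str                            # same value as in the crawl log
    crawl_uuid: UUID                           # identifies the crawl across databases
    crawl_time: datetime                       # time_started from the crawl log
    agent_uuid: UUID                           # agent that did the crawling
    exit_code: int                             # assumed to be an integer code
    server_last_modified: Optional[datetime]   # UTC; assumed absent if the server gave none
    request_duration_ms: int                   # request duration in milliseconds
    was_robotstxt_approved: bool               # approved by robots.txt?
    http: Optional[Dict[str, Any]] = None      # optional HTTP summary (placeholder type)
    # Note: no URL field; the URL is taken from the entity generation URL.


# Sample values below are made up for illustration only.
example = CrawlSummary(
    crawl_type="example-type",
    crawl_uuid=UUID("12345678-1234-5678-1234-567812345678"),
    crawl_time=datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    agent_uuid=UUID("87654321-4321-8765-4321-876543218765"),
    exit_code=0,
    server_last_modified=None,
    request_duration_ms=250,
    was_robotstxt_approved=True,
)
```

Making the record immutable (frozen=True) is a design choice of this sketch, reflecting that a summary describes a crawl that has already finished.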

HTTP Summary

The HTTP summary fields are: