Response

response

Custom data models for the Scrape.do API's HTTP response.

Encapsulates the httpx.Response object to provide a strongly-typed interface for the response data sent back by the Scrape.do API. It parses nested JSON payloads, extracts proxy telemetry, and attempts to determine whether non-2xx responses come from the target website or from Scrape.do gateway failures.

ScrapeDoNetworkRequest

Bases: BaseModel

Represents an intercepted HTTP network request made by the headless browser.

When rendering JavaScript, the browser makes subsequent requests to fetch CSS, images, and background API data. Scrape.do returns these in the networkRequests field when returnJSON=true.

Attributes:

Name Type Description
url HttpUrl

The absolute URL of the requested resource.

method str

The HTTP method used (e.g., GET, POST).

status int

The HTTP status code returned by the resource server.

request_headers Dict[str, str]

The headers sent by the headless browser.

request_body Optional[str]

The payload sent with the request, if any.

response_body Optional[str]

The payload returned by the server, if captured.

response_headers Dict[str, str]

The headers returned by the resource server.

ScrapeDoWebSocketFrame

Bases: BaseModel

Represents the underlying payload of an intercepted WebSocket message.

Attributes:

Name Type Description
opcode int

The WebSocket frame operation code (1 for text, 2 for binary).

mask bool

Indicates if the payload data is masked.

payload_data str

The actual message content transferred over the socket.

ScrapeDoWebSocketEvent

Bases: BaseModel

Represents the Chrome DevTools Protocol (CDP) event metadata for a WebSocket.

Attributes:

Name Type Description
request_id str

The unique identifier for this specific WebSocket connection.

timestamp float

The exact epoch timestamp when the event occurred.

response ScrapeDoWebSocketFrame

The underlying frame containing the payload.

ScrapeDoWebsocketRequest

Bases: BaseModel

Represents a complete WebSocket message intercepted during rendering.

Attributes:

Name Type Description
type str

The direction of the traffic (e.g., "sent" or "received").

event ScrapeDoWebSocketEvent

The raw DevTools Protocol event data.

is_text property

Determines if the WebSocket payload is readable text.

Returns:

Type Description
bool

True if the underlying frame opcode is 1 (Text).
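
As a rough illustration of the opcode check, the sketch below uses a plain dataclass as a hypothetical stand-in for the frame model. In the SDK the property lives on ScrapeDoWebsocketRequest and reads the opcode of the nested frame; the `Frame` class and the sample payloads here are assumptions made for the example.

```python
from dataclasses import dataclass

# WebSocket opcodes for data frames (RFC 6455).
OPCODE_TEXT = 1
OPCODE_BINARY = 2


@dataclass
class Frame:
    """Hypothetical stand-in for ScrapeDoWebSocketFrame."""

    opcode: int
    mask: bool
    payload_data: str

    @property
    def is_text(self) -> bool:
        # Text frames carry opcode 1; anything else is treated as non-text.
        return self.opcode == OPCODE_TEXT


frames = [
    Frame(opcode=OPCODE_TEXT, mask=False, payload_data='{"price": 42}'),
    Frame(opcode=OPCODE_BINARY, mask=False, payload_data="AAEC"),
]

# Keep only the payloads that are safe to treat as readable text.
text_payloads = [frame.payload_data for frame in frames if frame.is_text]
```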

ScrapeDoActionResult

Bases: BaseModel

Represents the execution outcome of a specific programmatic browser action.

Attributes:

Name Type Description
action str

The name of the action executed (e.g., "Click", "Wait").

index int

The sequence index of this action in the original request array.

success bool

Indicates whether the action completed without throwing an error.

error Optional[str]

The error message if the action failed.

response Optional[Union[Dict[str, Any], str]]

Data returned by the action, typically populated when using ExecuteAction to run custom JavaScript.

ScrapeDoScreenshot

Bases: BaseModel

Represents a captured screenshot generated during the scraping process.

Attributes:

Name Type Description
screenshot_type str

The configuration used (e.g., "FullScreenShot").

b64_image Optional[str]

The Base64 encoded string of the PNG image data.

error Optional[str]

The failure reason if the screenshot could not be captured.

to_bytes()

Convenience method that decodes the b64_image string into a bytes object using Python's standard base64 library.

Raises:

Type Description
ValueError

If the instance's b64_image attribute is empty

Returns:

Type Description
bytes

bytes object returned by base64.b64decode(b64_image)
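
The decoding behaviour described above can be sketched with the standard library alone. The function name `b64_image_to_bytes` and the error message are hypothetical; only the empty-input check and the `base64.b64decode` call follow the description.

```python
import base64
from typing import Optional


def b64_image_to_bytes(b64_image: Optional[str]) -> bytes:
    """Decode a Base64 string, rejecting empty input as described above."""
    if not b64_image:
        raise ValueError("b64_image is empty; nothing to decode")
    return base64.b64decode(b64_image)


# The first eight bytes of every PNG file are this fixed signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Round-trip the signature to show that decoding restores the original bytes.
encoded = base64.b64encode(PNG_SIGNATURE).decode("ascii")
decoded = b64_image_to_bytes(encoded)
```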

to_file(path)

Convenience method to save the base64-encoded screenshot

File Type

Scrape.do returns base64-encoded .png image data, so path should end in /file_name.png

Parameters:

Name Type Description Default
path Union[str, PathLike]

Image file will be saved to this path

required

Returns:

Type Description
Path

resolved pathlib.Path object of the path parameter
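
A minimal sketch of what saving a screenshot might look like, assuming the method simply decodes the Base64 data, writes it to disk, and returns the resolved path. The helper name `save_b64_png` and the placeholder image bytes are assumptions for illustration.

```python
import base64
import tempfile
from os import PathLike
from pathlib import Path
from typing import Union


def save_b64_png(b64_image: str, path: Union[str, PathLike]) -> Path:
    """Decode Base64 image data, write it to path, and return the resolved path."""
    target = Path(path)
    target.write_bytes(base64.b64decode(b64_image))
    return target.resolve()


# Placeholder data: only the 8-byte PNG signature, not a viewable image.
sample = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")

with tempfile.TemporaryDirectory() as tmp:
    saved = save_b64_png(sample, Path(tmp) / "page.png")
    saved_name = saved.name
    saved_size = saved.stat().st_size
```

Returning the resolved path lets callers log or reuse the absolute location without repeating the path handling.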

ScrapeDoFrame

Bases: BaseModel

Represents an isolated, cross-origin iframe discovered on the target webpage.

Attributes:

Name Type Description
url HttpUrl

The absolute source URL of the iframe.

content Optional[str]

The rendered HTML content inside the iframe.

ScrapeDoResponse

A unified data model for all HTTP responses returned by the Scrape.do API.

This model encapsulates the underlying HTTPX network response to provide a flexible, strongly-typed interface.

Different Response Types

Because Scrape.do alters its response format based on the request parameters, this model attempts to route property access to the correct underlying data source.

Additional Information

The following are some of the parameters that change the format of the HTTP response returned by Scrape.do.

  • return_json=True : Returns a JSON string containing information about the request instead of the target website's raw HTML

  • transparent_response=True : Causes the HTTP response returned by Scrape.do to mirror the exact status code of the HTTP response it got from the target website

  • pure_cookies=True : Tells Scrape.do to return the original Set-Cookie headers it got from the target website instead of bundling them into its scrape.do-cookies response header

Attributes:

Name Type Description
request PreparedScrapeDoRequest

The original, validated request configuration.

httpx_response Response

The unmutated network response object.

target_status_code Optional[int]

The status code returned by the destination server.

text str

The primary payload of the target website (HTML or inner JSON string).

target_headers Headers

The target's headers, without proxy telemetry headers.

cookies Optional[Cookies]

Extracted cookies returned by the target.

resolved_url Optional[str]

The final destination URL after all redirects.

target_url Optional[str]

The original destination URL requested.

scrape_do_status_code Optional[int]

The status code of the Scrape.do gateway.

request_cost Optional[float]

API billing credits consumed by this specific execution.

remaining_credits Optional[float]

Total API billing credits remaining on your account.

rid Optional[str]

The specific proxy node Routing ID utilized

rate Optional[str]

Current rate limit metrics for the provided API token.

request_id Optional[str]

Unique UUID assigned to this request by the gateway.

auth Optional[int]

Authentication status against the Scrape.do gateway.

initial_status_code Optional[int]

Target's status extracted strictly from proxy headers.

scrape_do_headers Headers

Filtered headers containing only Scrape.do telemetry.

frames Optional[List[ScrapeDoFrame]]

Isolated cross-origin iframes discovered on the page.

network_requests Optional[List[ScrapeDoNetworkRequest]]

Background HTTP calls made by the browser.

websocket_requests Optional[List[ScrapeDoWebsocketRequest]]

Intercepted bidirectional WebSocket traffic.

action_results Optional[List[ScrapeDoActionResult]]

Execution outcomes of programmatic DOM actions.

screenshots Optional[List[ScrapeDoScreenshot]]

Captured Base64 screenshots.

is_proxy_error cached property

Heuristic to determine whether a non-2xx status code error is coming directly from the target website, or whether it's coming from the Scrape.do gateway

Additional Information

Scrape.do usually sends JSON error messages when there's an infrastructure error, so we try to parse the response's payload as JSON regardless of whether or not return_json=True.

  • IF Payload Is Parsable JSON :

    • Check if the returned JSON contains one of the standard error keys (message, Error, detail, Message, or errorMessage). If it does, then the error is coming from Scrape.do, so return True

    • Otherwise, check if the returned JSON contains the statusCode key. If it does, and its value matches the status code returned by the original httpx response, then the error is probably coming from the target website, so return False.

    • If the value doesn't match or the statusCode key is missing, fall back to the Payload Is Not Parsable JSON logic.

  • IF Payload Is Not Parsable JSON :

    • Scrape.do sends telemetry headers when a request is successfully completed, so if the response has the scrape.do-initial-status-code header and its value is not empty, the error is probably coming from the target website, so return False. Otherwise, it's probably a Scrape.do error, so return True.

transparent_response=True

When transparent_response=True, Scrape.do can still send its own error status codes when there's an infrastructure failure, so we can't rely on the scrape_do_status_code to determine where the error is coming from. With this in mind, this property analyses the response's structure as a whole to decide where the error originated.

Returns:

Type Description
bool

True if it's a Scrape.do error, or False if it's a target website error
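
The decision steps listed above can be sketched as a standalone function over plain values. This is an approximation of the described heuristic, not the SDK's code: the function name, its signature, and the header-name constant are assumptions, and the constant should match whatever header the gateway actually sends.

```python
import json
from typing import Mapping

# Error keys listed in the description above.
ERROR_KEYS = ("message", "Error", "detail", "Message", "errorMessage")
# Assumed spelling of the telemetry header that carries the target's status.
INITIAL_STATUS_HEADER = "scrape.do-initial-status-code"


def looks_like_proxy_error(status_code: int, body: str, headers: Mapping[str, str]) -> bool:
    """Return True if a non-2xx response appears to come from the gateway."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None  # Not parsable JSON; use the header fallback below.

    if isinstance(payload, dict):
        # A standard error key indicates a gateway-generated error.
        if any(key in payload for key in ERROR_KEYS):
            return True
        # A matching statusCode suggests the target produced the error.
        if payload.get("statusCode") == status_code:
            return False

    # Fallback: a non-empty telemetry header means the target was reached.
    lowered = {name.lower(): value for name, value in headers.items()}
    return not lowered.get(INITIAL_STATUS_HEADER)
```

For example, a 502 with a body of `{"message": "..."}` is classified as a gateway error, while a 404 HTML page that arrives with a populated initial-status header is attributed to the target.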

httpx_response property

Exposes the raw, underlying HTTPX response.

Intended Usage

Accessing this bypasses all SDK normalization. It's provided as an escape hatch for specific use cases where the original response object is needed.

Returns:

Type Description
Response

The raw httpx response object.

status_code property

Convenience accessor for the underlying HTTPX response status code.

Equivalent to response.httpx_response.status_code. Distinct from target_status_code and scrape_do_status_code, which interpret the Scrape.do response envelope.

Returns:

Type Description
int

The HTTP status code of the response received from api.scrape.do.

request property

Exposes the original, validated request configuration.

Returns:

Type Description
PreparedScrapeDoRequest

The PreparedScrapeDoRequest configuration that generated this response.

scrape_do_status_code property

The HTTP status code returned by the Scrape.do gateway infrastructure.

Transparent Response

If transparent_response=True was used, the gateway hides its own status code, and this property will return None.

Returns:

Type Description
Optional[int]

The proxy gateway status code (e.g., 200, 429, 502).

target_status_code property

The HTTP status code returned by the destination website.

Additional Information

  • If self.is_proxy_error=True, the target website was never reached, so return None

  • If transparent_response=True, the original status code from the httpx response is returned

  • If return_json=True, the statusCode field from the response's JSON is returned

  • If it's not a proxy error, and both parameters are set to false, the ScrapeDoResponse.initial_status_code property value is returned

Returns:

Type Description
Optional[int]

The target website's status code (e.g., 200, 403, 404).
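
The routing rules above can be sketched as a pure function. The function and parameter names are hypothetical, and evaluating the rules in the order they are listed (so that transparent_response takes precedence when both flags are set) is an assumption.

```python
from typing import Optional


def resolve_target_status(
    *,
    is_proxy_error: bool,
    transparent_response: bool,
    return_json: bool,
    http_status: int,
    json_status: Optional[int],
    initial_status: Optional[int],
) -> Optional[int]:
    """Pick the source of the target's status code, in the documented order."""
    if is_proxy_error:
        return None  # The target website was never reached.
    if transparent_response:
        return http_status  # The gateway mirrors the target's status code.
    if return_json:
        return json_status  # Taken from the statusCode field of the JSON body.
    return initial_status  # Taken from the telemetry header.


blocked = resolve_target_status(
    is_proxy_error=False,
    transparent_response=False,
    return_json=True,
    http_status=200,
    json_status=403,
    initial_status=None,
)
```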

text property

The primary textual payload of the target website.

Additional Information

Depending on the request parameters, this will return either the raw HTML text or the extracted content string from within Scrape.do's JSON wrapper.

Returns:

Type Description
str

The HTML or JSON string payload from the target.

target_headers property

The HTTP headers returned by the destination server.

Additional Information

This property automatically filters all internal scrape.do- proxy telemetry headers, providing a clean representation of the target's response.

Returns:

Type Description
Headers

The filtered headers from the target website.

scrape_do_headers property

Filters the response headers to isolate Scrape.do's specific infrastructure telemetry.

Returns:

Type Description
Optional[Headers]

Only headers prefixed with scrape.do-, or None if no scrape.do- headers are found
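
The split between target headers and telemetry headers can be pictured as a prefix filter. The sketch below works on plain dictionaries rather than httpx.Headers, and the `split_headers` helper is hypothetical. Unlike the property described above, it returns an empty dictionary rather than None when no telemetry headers are present.

```python
from typing import Dict, Mapping, Tuple

TELEMETRY_PREFIX = "scrape.do-"


def split_headers(headers: Mapping[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Partition headers into (target, telemetry) by the scrape.do- prefix."""
    target: Dict[str, str] = {}
    telemetry: Dict[str, str] = {}
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        bucket = telemetry if name.lower().startswith(TELEMETRY_PREFIX) else target
        bucket[name] = value
    return target, telemetry


target, telemetry = split_headers(
    {
        "Content-Type": "text/html",
        "Scrape.do-Rid": "abc",
        "scrape.do-request-cost": "1",
    }
)
```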

request_cost property

The amount of API billing credits consumed by this specific execution.

Returns:

Type Description
Optional[float]

The value of the scrape.do-request-cost header cast to a float, or None if the header is missing

initial_status_code property

The target website's HTTP status code, extracted directly from the proxy headers.

Returns:

Type Description
Optional[int]

The status code cast to an int, or None if the scrape.do-initial-status-code header is missing.
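
Several telemetry properties follow the same pattern: read a header, cast it, and return None when it is missing. A minimal sketch of that pattern is shown below. The `header_as` helper and the sample header values are hypothetical, and a plain dictionary is case-sensitive, whereas httpx.Headers lookups are not.

```python
from typing import Callable, Mapping, Optional, TypeVar

T = TypeVar("T")


def header_as(headers: Mapping[str, str], name: str, cast: Callable[[str], T]) -> Optional[T]:
    """Return the header value converted with cast, or None if absent or empty."""
    raw = headers.get(name)
    if raw is None or raw == "":
        return None
    return cast(raw)


# Sample values invented for the example.
headers = {
    "scrape.do-request-cost": "5",
    "scrape.do-initial-status-code": "200",
}

cost = header_as(headers, "scrape.do-request-cost", float)
status = header_as(headers, "scrape.do-initial-status-code", int)
missing = header_as(headers, "scrape.do-remaining-credits", float)
```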

request_id property

The unique UUID assigned to this request by the Scrape.do gateway.

Returns:

Type Description
Optional[str]

The internal tracking ID, or None if the scrape.do-request-id header is missing

resolved_url property

The final destination URL after all server-side and client-side redirects.

Returns:

Type Description
Optional[str]

The absolute URL where the browser ultimately landed, or None if the scrape.do-resolved-url header is missing

target_url property

The original destination URL requested by the SDK.

Returns:

Type Description
Optional[str]

The initial target URL, or None if the scrape.do-target-url header is missing

auth property

Indicates the authentication status against the Scrape.do gateway.

Returns:

Type Description
Optional[int]

The authentication flag value cast to an int, or None if the scrape.do-auth header is missing

rate property

The current rate limit metrics for the provided API token.

Returns:

Type Description
Optional[str]

A string representing current concurrency thresholds, or None if the scrape.do-rate header is missing

remaining_credits property

The total number of API billing credits remaining on your account.

Returns:

Type Description
Optional[float]

The remaining account balance cast to a float, or None if the scrape.do-remaining-credits header is missing

rid property

The specific proxy node Routing ID utilized for this connection.

Session ID

If session_id was provided in the parameters, this Routing ID is used by the ScrapeDoClient to verify that sticky sessions are maintaining the same node.

Returns:

Type Description
Optional[str]

The internal routing identifier, or None if the scrape.do-rid header is missing

cookies property

Extracts and parses cookies returned by the target server.

Additional Information

If pure_cookies=True is active, it returns the httpx response's cookies attribute. Otherwise, it decodes the custom scrape.do-cookies string into a httpx.Cookies object

Returns:

Type Description
Optional[Cookies]

A httpx.Cookies object containing all cookies.

frames property

Extracts isolated cross-origin iframes discovered during page rendering.

Prerequisites

Requires render=True, return_json=True, and show_frames=True

Returns:

Type Description
Optional[List[ScrapeDoFrame]]

A list of typed Pydantic models representing frames.

network_requests property

Intercepts background network traffic triggered by the headless browser.

Prerequisites

Requires render=True and return_json=True.

Returns:

Type Description
Optional[List[ScrapeDoNetworkRequest]]

A list of typed models detailing HTTP calls.

websocket_requests property

Intercepts bidirectional WebSocket traffic initiated by the target website.

Prerequisites

Requires render=True, return_json=True, and show_websocket_requests=True

Returns:

Type Description
Optional[List[ScrapeDoWebsocketRequest]]

A list of typed models detailing socket events.

action_results property

Details the success or failure of programmatic DOM interactions.

Returns:

Type Description
Optional[List[ScrapeDoActionResult]]

A list of typed models mapping sequentially to the actions defined in the play_with_browser array.

screenshots property

Extracts generated Base64 screenshots from the JSON payload.

Prerequisites

Requires render=True, return_json=True, and a valid screenshot parameter (e.g., full_screenshot=True).

Returns:

Type Description
Optional[List[ScrapeDoScreenshot]]

A list of typed models containing the image data.

raise_for_status()

Evaluates the response and raises a mapped exception if the request failed.

Additional Information

Utilizes the is_proxy_error heuristic to determine if the failure originated from the Scrape.do proxy infrastructure or from the target website.

Returns:

Type Description
Self

The current ScrapeDoResponse instance, allowing for method chaining.

Raises:

Type Description
TargetError

If the proxy succeeded, but the target website returned an error code (e.g., a 403 Cloudflare block or a 404 Not Found).

BadRequestError

If the request was malformed (HTTP 400 from Scrape.do).

AuthenticationError

If your Scrape.do API token is invalid (HTTP 401).

AuthenticationThrottleError

If your specific token has been temporarily locked by the Scrape.do authentication server to prevent abuse (HTTP 401).

RateLimitError

If you exceed your account's concurrent request limit (HTTP 429).

ServerError

If the Scrape.do gateway experiences an issue (HTTP 502/510).

APIResponseError

A generic fallback for unmapped Scrape.do proxy errors.
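
To make the mapping above concrete, the sketch below selects an exception class from a status code and the is_proxy_error result. The exception classes are local stand-ins that only reuse the documented names, their inheritance is an assumption, and `pick_exception` is a hypothetical helper. AuthenticationThrottleError is left out because the text does not say how it is distinguished from AuthenticationError, which shares the same status code.

```python
from typing import Dict, Optional, Type


class TargetError(Exception):
    """Stand-in: the target website returned an error."""


class APIResponseError(Exception):
    """Stand-in: generic fallback for gateway errors."""


class BadRequestError(APIResponseError):
    pass


class AuthenticationError(APIResponseError):
    pass


class RateLimitError(APIResponseError):
    pass


class ServerError(APIResponseError):
    pass


# Gateway status codes named in the list above.
GATEWAY_ERRORS: Dict[int, Type[Exception]] = {
    400: BadRequestError,
    401: AuthenticationError,
    429: RateLimitError,
    502: ServerError,
    510: ServerError,
}


def pick_exception(status_code: int, is_proxy_error: bool) -> Optional[Type[Exception]]:
    """Return the exception class to raise, or None for a 2xx response."""
    if 200 <= status_code < 300:
        return None
    if not is_proxy_error:
        return TargetError  # The proxy succeeded; the target returned the error.
    return GATEWAY_ERRORS.get(status_code, APIResponseError)
```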