Response
response
Custom data models for the Scrape.do API's HTTP response
Encapsulates the httpx.Response object to provide a strongly-typed interface for the response data sent back by the Scrape.do API. It parses nested JSON payloads, extracts proxy telemetry, and attempts to determine whether non-2xx responses come from the target website or from Scrape.do gateway failures.
ScrapeDoNetworkRequest
Bases: BaseModel
Represents an intercepted HTTP network request made by the headless browser.
When rendering JavaScript, the browser makes subsequent requests to fetch
CSS, images, and background API data, which Scrape.do returns in the
networkRequests field when returnJSON=true.
Attributes:
| Name | Type | Description |
|---|---|---|
| url | HttpUrl | The absolute URL of the requested resource. |
| method | str | The HTTP method used (e.g., GET, POST). |
| status | int | The HTTP status code returned by the resource server. |
| request_headers | Dict[str, str] | The headers sent by the headless browser. |
| request_body | Optional[str] | The payload sent with the request, if any. |
| response_body | Optional[str] | The payload returned by the server, if captured. |
| response_headers | Dict[str, str] | The headers returned by the resource server. |
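As an illustration of how these fields might be consumed, the sketch below filters intercepted requests down to the ones that failed. It does not import the SDK: `NetworkRequest` is a hypothetical dataclass stand-in that only mirrors the attribute names documented above, and `failed_requests` is a helper invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NetworkRequest:
    """Hypothetical stand-in mirroring the documented ScrapeDoNetworkRequest fields."""
    url: str
    method: str
    status: int
    request_headers: Dict[str, str] = field(default_factory=dict)
    request_body: Optional[str] = None
    response_body: Optional[str] = None
    response_headers: Dict[str, str] = field(default_factory=dict)


def failed_requests(requests: List[NetworkRequest]) -> List[NetworkRequest]:
    """Return the intercepted requests whose status code is 400 or higher."""
    return [r for r in requests if r.status >= 400]


# Sample data made up for the example.
captured = [
    NetworkRequest(url="https://example.com/app.css", method="GET", status=200),
    NetworkRequest(url="https://example.com/api/items", method="POST", status=500),
]
failed = failed_requests(captured)
```

With the real models, the same list comprehension should work on the network_requests list, since it only reads the status attribute.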
ScrapeDoWebSocketFrame
Bases: BaseModel
Represents the underlying payload of an intercepted WebSocket message.
Attributes:
| Name | Type | Description |
|---|---|---|
| opcode | int | The WebSocket frame operation code (1 for text, 2 for binary). |
| mask | bool | Indicates if the payload data is masked. |
| payload_data | str | The actual message content transferred over the socket. |
ScrapeDoWebSocketEvent
Bases: BaseModel
Represents the Chrome DevTools Protocol (CDP) event metadata for a WebSocket.
Attributes:
| Name | Type | Description |
|---|---|---|
| request_id | str | The unique identifier for this specific WebSocket connection. |
| timestamp | float | The exact epoch timestamp when the event occurred. |
| response | ScrapeDoWebSocketFrame | The underlying frame containing the payload. |
ScrapeDoWebsocketRequest
Bases: BaseModel
Represents a complete WebSocket message intercepted during rendering.
Attributes:
| Name | Type | Description |
|---|---|---|
| type | str | The direction of the traffic (e.g., "sent" or "received"). |
| event | ScrapeDoWebSocketEvent | The raw DevTools Protocol event data. |
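The three WebSocket models nest as request → event → frame. The sketch below shows one way to walk that nesting and collect received text payloads. The `Frame`, `Event`, and `WsMessage` dataclasses are hypothetical stand-ins that copy only the documented attribute names, and the `"received"` value is taken from the example given in the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """Stand-in for ScrapeDoWebSocketFrame."""
    opcode: int
    mask: bool
    payload_data: str


@dataclass
class Event:
    """Stand-in for ScrapeDoWebSocketEvent."""
    request_id: str
    timestamp: float
    response: Frame


@dataclass
class WsMessage:
    """Stand-in for ScrapeDoWebsocketRequest."""
    type: str
    event: Event


TEXT_OPCODE = 1  # documented above: 1 for text, 2 for binary


def received_text_payloads(messages: List[WsMessage]) -> List[str]:
    """Collect payloads of text frames that were received from the server."""
    return [
        m.event.response.payload_data
        for m in messages
        if m.type == "received" and m.event.response.opcode == TEXT_OPCODE
    ]


# Sample traffic made up for the example.
messages = [
    WsMessage("sent", Event("1", 1700000000.0, Frame(1, True, "ping"))),
    WsMessage("received", Event("1", 1700000000.5, Frame(1, False, "pong"))),
    WsMessage("received", Event("1", 1700000001.0, Frame(2, False, "AAEC"))),
]
payloads = received_text_payloads(messages)
```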
ScrapeDoActionResult
Bases: BaseModel
Represents the execution outcome of a specific programmatic browser action.
Attributes:
| Name | Type | Description |
|---|---|---|
| action | str | The name of the action executed (e.g., "Click", "Wait"). |
| index | int | The sequence index of this action in the original request array. |
| success | bool | Indicates whether the action completed without throwing an error. |
| error | Optional[str] | The error message if the action failed. |
| response | Optional[Union[Dict[str, Any], str]] | Data returned by the action, typically populated when using the |
ScrapeDoScreenshot
Bases: BaseModel
Represents a captured screenshot generated during the scraping process.
Attributes:
| Name | Type | Description |
|---|---|---|
| screenshot_type | str | The configuration used (e.g., "FullScreenShot"). |
| b64_image | Optional[str] | The Base64 encoded string of the PNG image data. |
| error | Optional[str] | The failure reason if the screenshot could not be captured. |
to_bytes()
Convenience method that converts the b64_image string into a bytes
object using Python's standard base64 library.
Raises:
| Type | Description |
|---|---|
| ValueError | If the instance's |
Returns:
| Type | Description |
|---|---|
| bytes | bytes object returned by |
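The conversion described here can be sketched with the standard library alone. This is not the SDK's implementation: `b64_image_to_bytes` is a hypothetical free function, and treating a missing or empty b64_image as the ValueError case is an assumption, since the exact condition is not spelled out above.

```python
import base64
from typing import Optional


def b64_image_to_bytes(b64_image: Optional[str]) -> bytes:
    """Decode a Base64 string into raw bytes.

    Assumption: an absent or empty value is the case that raises ValueError.
    """
    if not b64_image:
        raise ValueError("no Base64 image data to decode")
    return base64.b64decode(b64_image)


# Round-trip the 8-byte PNG signature as sample data.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
encoded = base64.b64encode(PNG_SIGNATURE).decode("ascii")
decoded = b64_image_to_bytes(encoded)
```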
to_file(path)
Convenience method to save the base64-encoded screenshot
File Type
Scrape.do returns base64-encoded .png image data, so path
should end in /file_name.png
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Union[str, PathLike] | Image file will be saved to this path | required |
Returns:
| Type | Description |
|---|---|
| Path | resolved |
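A minimal sketch of the save step, assuming it amounts to decoding the Base64 data and writing it to the given path. `save_screenshot` is a hypothetical helper, not the SDK method; it returns a resolved pathlib.Path to match the return type listed above.

```python
import base64
import tempfile
from os import PathLike
from pathlib import Path
from typing import Union


def save_screenshot(b64_image: str, path: Union[str, PathLike]) -> Path:
    """Decode the Base64 image data, write it to path, and return the resolved path."""
    target = Path(path).resolve()
    target.write_bytes(base64.b64decode(b64_image))
    return target


# Sample data: the PNG signature stands in for a real screenshot.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
encoded = base64.b64encode(PNG_SIGNATURE).decode("ascii")

with tempfile.TemporaryDirectory() as tmp:
    saved = save_screenshot(encoded, Path(tmp) / "page.png")
    round_trip = saved.read_bytes()
```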
ScrapeDoFrame
ScrapeDoResponse
A unified data model for all HTTP responses returned by the Scrape.do API.
This model encapsulates the underlying HTTPX network response to provide a flexible, strongly-typed interface.
Different Response Types
Because Scrape.do alters its response format based on the request parameters, this model attempts to route property access to the correct underlying data source.
Additional Information
The following are some of the parameters that change the format of the HTTP response returned by Scrape.do.
- return_json=True: Returns a JSON string containing information about the request instead of the target website's raw HTML
- transparent_response=True: Causes the HTTP response returned by Scrape.do to mirror the exact status code of the HTTP response it got from the target website
- pure_cookies=True: Tells Scrape.do to return the original Set-Cookie headers it got from the target website instead of bundling them into its scrape.do-cookies response header
Attributes:
| Name | Type | Description |
|---|---|---|
| request | PreparedScrapeDoRequest | The original, validated request configuration. |
| httpx_response | Response | The unmutated network response object. |
| target_status_code | Optional[int] | The status code returned by the destination server. |
| text | str | The primary payload of the target website (HTML or inner JSON string). |
| target_headers | Headers | The target's headers, without proxy telemetry headers. |
| cookies | Optional[Cookies] | Extracted cookies returned by the target. |
| resolved_url | Optional[str] | The final destination URL after all redirects. |
| target_url | Optional[str] | The original destination URL requested. |
| scrape_do_status_code | Optional[int] | The status code of the Scrape.do gateway. |
| request_cost | Optional[float] | API billing credits consumed by this specific execution. |
| remaining_credits | Optional[float] | Total API billing credits remaining on your account. |
| rid | Optional[str] | The specific proxy node Routing ID utilized. |
| rate | Optional[str] | Current rate limit metrics for the provided API token. |
| request_id | Optional[str] | Unique UUID assigned to this request by the gateway. |
| auth | Optional[int] | Authentication status against the Scrape.do gateway. |
| initial_status_code | Optional[int] | Target's status extracted strictly from proxy headers. |
| scrape_do_headers | Headers | Filtered headers containing only Scrape.do telemetry. |
| frames | Optional[List[ScrapeDoFrame]] | Isolated cross-origin iframes discovered on the page. |
| network_requests | Optional[List[ScrapeDoNetworkRequest]] | Background HTTP calls made by the browser. |
| websocket_requests | Optional[List[ScrapeDoWebsocketRequest]] | Intercepted bidirectional WebSocket traffic. |
| action_results | Optional[List[ScrapeDoActionResult]] | Execution outcomes of programmatic DOM actions. |
| screenshots | Optional[List[ScrapeDoScreenshot]] | Captured Base64 screenshots. |
is_proxy_error
cached
property
Heuristic to determine whether a non-2xx status code comes directly from the target website or from the Scrape.do gateway.
Additional Information
Scrape.do usually sends JSON error messages when there's an
infrastructure error, so we try to parse the response's payload
as JSON regardless of whether return_json=True.
- If the payload is parsable JSON:
  - Check whether the returned JSON contains one of the standard error keys (message, Error, detail, Message, or errorMessage). If it does, the error is coming from Scrape.do, so return True.
  - Otherwise, check whether the returned JSON contains the statusCode key. If it does, and its value matches the status code returned by the original httpx response, the error is probably coming from the target website, so return False.
  - If the value doesn't match or the statusCode key is missing, fall back to the "payload is not parsable JSON" logic.
- If the payload is not parsable JSON:
  - Scrape.do sends telemetry headers when a request is successfully completed, so if the response has the scrape.do-initial-status-code header and its value is not empty, the error is probably coming from the target website, so return False. Otherwise, it's probably a Scrape.do error, so return True.
transparent_response=True
When transparent_response=True, Scrape.do can still send its
own error status codes when there's an infrastructure failure, so
we can't rely on the scrape_do_status_code to determine where
the error is coming from. With this in mind, this method aims
to provide a solution by analysing the response's structure as a
whole.
Returns:
| Type | Description |
|---|---|
| bool | |
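The decision steps listed for this heuristic can be sketched as a standalone function. This is an illustration only, not the SDK's code: `looks_like_proxy_error` and its three parameters are hypothetical, the header name is assumed to be spelled scrape.do-initial-status-code, and the statusCode comparison assumes the JSON value is an integer.

```python
import json
from typing import Mapping

ERROR_KEYS = ("message", "Error", "detail", "Message", "errorMessage")
INITIAL_STATUS_HEADER = "scrape.do-initial-status-code"  # assumed spelling


def looks_like_proxy_error(
    status_code: int, body: str, headers: Mapping[str, str]
) -> bool:
    """Guess whether a failed response came from the gateway (True) or the target (False)."""
    try:
        payload = json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        payload = None

    if isinstance(payload, dict):
        # A standard error key means the gateway produced the body.
        if any(key in payload for key in ERROR_KEYS):
            return True
        # A matching statusCode suggests the JSON wraps the target's response.
        if payload.get("statusCode") == status_code:
            return False

    # Fallback: a non-empty telemetry header implies the target was reached.
    lowered = {name.lower(): value for name, value in headers.items()}
    return not lowered.get(INITIAL_STATUS_HEADER)


gateway_json = looks_like_proxy_error(401, '{"Message": "Token invalid"}', {})
target_json = looks_like_proxy_error(404, '{"statusCode": 404, "content": ""}', {})
target_html = looks_like_proxy_error(
    403, "<html>blocked</html>", {"Scrape.do-Initial-Status-Code": "403"}
)
gateway_plain = looks_like_proxy_error(502, "Bad Gateway", {})
```

Lower-casing the header names keeps the lookup case-insensitive when a plain dict is passed; an httpx.Headers object already behaves that way.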
httpx_response
property
Exposes the raw, underlying HTTPX response.
Intended Usage
Accessing this bypasses all SDK normalization. It's provided as an escape hatch for specific use cases where the original response object is needed.
Returns:
| Type | Description |
|---|---|
| Response | The raw httpx response object. |
status_code
property
Convenience accessor for the underlying HTTPX response status code.
Equivalent to response.httpx_response.status_code. Distinct from
target_status_code and scrape_do_status_code, which interpret the
Scrape.do response envelope.
Returns:
| Type | Description |
|---|---|
| int | The HTTP status code of the response received from |
request
property
Exposes the original, validated request configuration.
Returns:
| Type | Description |
|---|---|
| PreparedScrapeDoRequest | The |
scrape_do_status_code
property
target_status_code
property
The HTTP status code returned by the destination website.
Additional Information
- If self.is_proxy_error is True, the target website was never reached, so None is returned
- If transparent_response=True, the original status code from the httpx response is returned
- If return_json=True, the statusCode field from the response's JSON is returned
- If it's not a proxy error, and both parameters are set to false, the ScrapeDoResponse.initial_status_code property value is returned
Returns:
| Type | Description |
|---|---|
| Optional[int] | The target website's status code (e.g., 200, 403, 404). |
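The four cases above can be read as a resolution order. The sketch below encodes them in that order, which is an assumption about precedence when several conditions hold at once. `resolve_target_status` and its keyword arguments are hypothetical names for the inputs the property would read from the response.

```python
from typing import Optional


def resolve_target_status(
    *,
    is_proxy_error: bool,
    transparent_response: bool,
    return_json: bool,
    http_status: int,
    json_status_code: Optional[int],
    initial_status_code: Optional[int],
) -> Optional[int]:
    """Pick the target's status code from whichever source applies."""
    if is_proxy_error:
        return None  # the target website was never reached
    if transparent_response:
        return http_status  # the gateway mirrored the target's status
    if return_json:
        return json_status_code  # statusCode field of the JSON payload
    return initial_status_code  # value taken from the telemetry header


# Sample inputs made up for the example.
common = dict(http_status=200, json_status_code=403, initial_status_code=404)
from_proxy_error = resolve_target_status(
    is_proxy_error=True, transparent_response=True, return_json=True, **common
)
from_transparent = resolve_target_status(
    is_proxy_error=False, transparent_response=True, return_json=False, **common
)
from_json = resolve_target_status(
    is_proxy_error=False, transparent_response=False, return_json=True, **common
)
from_header = resolve_target_status(
    is_proxy_error=False, transparent_response=False, return_json=False, **common
)
```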
text
property
The primary textual payload of the target website.
Additional Information
Depending on the request parameters, this will return
either the raw HTML byte stream or the extracted content string
from within Scrape.do's JSON wrapper.
Returns:
| Type | Description |
|---|---|
| str | The HTML or JSON string payload from the target. |
target_headers
property
The HTTP headers returned by the destination server.
Additional Information
This property automatically filters all internal scrape.do- proxy
telemetry headers, providing a clean representation of
the target's response.
Returns:
| Type | Description |
|---|---|
| Headers | The filtered headers from the target website. |
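Prefix-based filtering of this kind can be sketched as follows. `split_headers` is a hypothetical helper that works on a plain mapping rather than an httpx.Headers object, and the sample header names are illustrative only. It returns both halves, so the second dict corresponds to what scrape_do_headers is described as holding.

```python
from typing import Dict, Mapping, Tuple

TELEMETRY_PREFIX = "scrape.do-"


def split_headers(
    headers: Mapping[str, str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split headers into (target headers, Scrape.do telemetry headers)."""
    target: Dict[str, str] = {}
    telemetry: Dict[str, str] = {}
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        is_telemetry = name.lower().startswith(TELEMETRY_PREFIX)
        (telemetry if is_telemetry else target)[name] = value
    return target, telemetry


# Sample headers made up for the example.
raw = {
    "Content-Type": "text/html",
    "Scrape.do-Request-Cost": "1",
    "scrape.do-rid": "node-7",
}
target_only, telemetry_only = split_headers(raw)
```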
scrape_do_headers
property
request_cost
property
initial_status_code
property
request_id
property
resolved_url
property
target_url
property
auth
property
rate
property
remaining_credits
property
rid
property
The specific proxy node Routing ID utilized for this connection.
Session ID
If session_id was provided in the parameters,
this Routing ID is used by the ScrapeDoClient to verify that
sticky sessions are maintaining the same node.
Returns:
| Type | Description |
|---|---|
| Optional[str] | The internal routing identifier, or None if the |
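A sticky-session check of the kind mentioned above could compare the Routing IDs seen across several responses. The sketch below is an assumption about what such a check might look like, not the ScrapeDoClient's logic. `session_is_sticky` is a hypothetical helper that takes the rid values directly.

```python
from typing import Iterable, Optional


def session_is_sticky(rids: Iterable[Optional[str]]) -> bool:
    """Return True when every response reports the same non-empty Routing ID."""
    seen = set(rids)
    return len(seen) == 1 and None not in seen and "" not in seen


# Sample Routing IDs made up for the example.
same_node = session_is_sticky(["node-7", "node-7", "node-7"])
switched_node = session_is_sticky(["node-7", "node-9"])
missing_rid = session_is_sticky(["node-7", None])
```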
cookies
property
Extracts and parses cookies returned by the target server.
Additional Information
If pure_cookies=True is active, it returns the httpx response's
cookies attribute. Otherwise, it decodes the custom
scrape.do-cookies string into an httpx.Cookies object.
Returns:
| Type | Description |
|---|---|
| Optional[Cookies] | A |
frames
property
Extracts isolated cross-origin iframes discovered during page rendering.
Prerequisites
Requires render=True, return_json=True, and show_frames=True.
Returns:
| Type | Description |
|---|---|
| Optional[List[ScrapeDoFrame]] | A list of typed Pydantic models representing frames. |
network_requests
property
Intercepts background network traffic triggered by the headless browser.
Prerequisites
Requires render=True and return_json=True.
Returns:
| Type | Description |
|---|---|
| Optional[List[ScrapeDoNetworkRequest]] | A list of typed models detailing HTTP calls. |
websocket_requests
property
Intercepts bidirectional WebSocket traffic initiated by the target website.
Prerequisites
Requires render=True, return_json=True, and
show_websocket_requests=True.
Returns:
| Type | Description |
|---|---|
| Optional[List[ScrapeDoWebsocketRequest]] | A list of typed models detailing socket events. |
action_results
property
Details the success or failure of programmatic DOM interactions.
Returns:
| Type | Description |
|---|---|
| Optional[List[ScrapeDoActionResult]] | A list of typed models mapping sequentially to the actions defined in the |
screenshots
property
Extracts generated Base64 screenshots from the JSON payload.
Prerequisites
Requires render=True, return_json=True, and a valid screenshot
parameter (e.g., full_screenshot=True).
Returns:
| Type | Description |
|---|---|
| Optional[List[ScrapeDoScreenshot]] | A list of typed models containing the image data. |
raise_for_status()
Evaluates the response and raises a mapped exception if the request failed.
Additional Information
Utilizes the is_proxy_error heuristic to determine if
the failure originated from the Scrape.do proxy infrastructure or
from the target website.
Returns:
| Type | Description |
|---|---|
| Self | The current |
Raises:
| Type | Description |
|---|---|
| TargetError | If the proxy succeeded, but the target website returned an error code (e.g., a 403 Cloudflare block or a 404 Not Found). |
| BadRequestError | If the request was malformed (HTTP 400 from Scrape.do). |
| AuthenticationError | If your Scrape.do API token is invalid (HTTP 401). |
| AuthenticationThrottleError | If your specific token has been temporarily locked by the Scrape.do authentication server to prevent abuse (HTTP 401). |
| RateLimitError | If you exceed your account's concurrent request limit (HTTP 429). |
| ServerError | If the Scrape.do gateway experiences an issue (HTTP 502/510). |
| APIResponseError | A generic fallback for unmapped Scrape.do proxy errors. |
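The mapping in the Raises table can be sketched as a lookup keyed on the status code. The exception classes below are local stand-ins that reuse the documented names; the `ScrapeDoError` base class and the `pick_exception` helper are hypothetical. AuthenticationThrottleError is left out because the table does not say how it is told apart from a regular 401, and treating any 2xx code as success is an assumption.

```python
from typing import Dict, Optional, Type


class ScrapeDoError(Exception):
    """Hypothetical common base class for this sketch."""


class TargetError(ScrapeDoError): ...
class BadRequestError(ScrapeDoError): ...
class AuthenticationError(ScrapeDoError): ...
class RateLimitError(ScrapeDoError): ...
class ServerError(ScrapeDoError): ...
class APIResponseError(ScrapeDoError): ...


# Status codes taken from the Raises table above.
PROXY_ERRORS: Dict[int, Type[ScrapeDoError]] = {
    400: BadRequestError,
    401: AuthenticationError,
    429: RateLimitError,
    502: ServerError,
    510: ServerError,
}


def pick_exception(
    status_code: int, is_proxy_error: bool
) -> Optional[Type[ScrapeDoError]]:
    """Return the exception class to raise, or None for a 2xx response."""
    if 200 <= status_code < 300:
        return None
    if not is_proxy_error:
        return TargetError  # the proxy worked; the target returned the error
    return PROXY_ERRORS.get(status_code, APIResponseError)


blocked_by_target = pick_exception(403, is_proxy_error=False)
rate_limited = pick_exception(429, is_proxy_error=True)
unmapped = pick_exception(418, is_proxy_error=True)
```

Because all stand-ins share one base class, a caller could catch `ScrapeDoError` to handle every failure in one place, or catch `TargetError` separately to retry with different request parameters.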