Parameters
Core validation engine and configuration contracts.
Validates request data before it reaches the network layer, so that invalid configurations are caught locally instead of wasting network requests. Pydantic V2 models enforce Scrape.do's parameter dependencies and interactions.
RequestParametersDict
Bases: TypedDict
Provides strict IDE autocomplete and static type checking for **kwargs dictionaries meant for the RequestParameters model.
Attributes:
| Name | Description |
|---|---|
| super | Activates Residential/Mobile IP proxies. |
| render | Executes the request using a headless browser. |
| device | Specify the device type (desktop, mobile, tablet). |
| session_id | Use the same IP address continuously with a session. |
| geo_code | ISO 3166-1 alpha-2 country code for IP targeting. |
| regional_geo_code | Targets a broader geographical region. Requires super=True. |
| postal_code | Targets a specific zip code. Requires super=True and a supported geo_code. |
| wait_until | Control when the browser considers the page loaded. |
| custom_wait | Set how long the browser waits on the target web page after the content has loaded. |
| wait_selector | CSS selector to wait for in the target web page. |
| width | Custom viewport width. |
| height | Custom viewport height. |
| return_json | Returns response body as base64-encoded JSON instead of raw HTML. |
| block_resources | Block CSS, images, and fonts on your target web page. |
| screenshot | Captures the visible viewport. |
| full_screenshot | Captures the entire scrollable page. |
| particular_screenshot | Captures a specific DOM element by selector. |
| play_with_browser | A sequence of automated interactions to perform. |
| show_frames | Returns all iframe content from the target webpage. Requires render=True and return_json=True. |
| show_websocket_requests | Captures WebSocket network traffic. Requires render=True and return_json=True. |
| custom_headers | Replaces Scrape.do's default headers with your provided headers. |
| extra_headers | Appends your provided headers to Scrape.do's default headers. |
| forward_headers | Forwards all headers exactly as sent by your client. |
| set_cookies | Injects specific cookies into the request. |
| disable_redirection | Prevents the proxy from following 3xx HTTP redirects. |
| timeout | Total API connection timeout in milliseconds. |
| retry_timeout | Internal proxy retry duration in milliseconds. Cannot be used with render=True. |
| disable_retry | Fails immediately on target error without rotating IPs. |
| output | Output format parser. |
| transparent_response | Returns the pure response from the target web page without Scrape.do processing. |
| pure_cookies | Returns the original Set-Cookie headers from the target website. |
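To illustrate how a TypedDict helps with **kwargs dictionaries, here is a minimal sketch. `ParamsSketch` and `build_request` are hypothetical stand-ins, not part of the library, and `total=False` (all keys optional) is an assumption about how such a dictionary would be declared.

```python
from typing import TypedDict


class ParamsSketch(TypedDict, total=False):
    """Hypothetical stand-in for RequestParametersDict with three of its keys."""

    render: bool
    geo_code: str
    wait_selector: str


def build_request(url: str, **kwargs: object) -> dict:
    """Hypothetical consumer that merges the URL with the keyword options."""
    return {"url": url, **kwargs}


# A static type checker can flag misspelled keys or wrong value types
# in this literal before the code ever runs.
options: ParamsSketch = {"render": True, "geo_code": "us"}
request = build_request("https://example.com", **options)
```

At runtime a TypedDict is an ordinary dict, so the benefit is entirely in the editor and the type checker; runtime validation is left to the RequestParameters model.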
RequestParameters
Bases: BaseModel
The strict data contract for the request parameters accepted by Scrape.do's API.
This model enforces all parameter dependencies, mutual-exclusion rules, and geographical targeting constraints locally, before a network request is generated.
Attributes:
| Name | Type | Description |
|---|---|---|
| url | HttpUrl | The absolute destination URL you wish to scrape. |
| super | Optional[bool] | Activates Residential/Mobile IP proxies. |
| render | Optional[bool] | Executes the request using a headless browser. |
| device | Optional[DeviceType] | Specify the device type (desktop, mobile, tablet). |
| session_id | Optional[int] | Use the same IP address continuously with a session. |
| geo_code | Optional[str] | ISO 3166-1 alpha-2 country code for IP targeting. |
| regional_geo_code | Optional[RegionCodeType] | Targets a broader geographical region. Requires super=True. |
| postal_code | Optional[str] | Targets a specific zip code. Requires super=True and a supported geo_code. |
| wait_until | Optional[WaitUntilType] | Control when the browser considers the page loaded. |
| custom_wait | Optional[int] | Set how long the browser waits on the target web page after the content has loaded. |
| wait_selector | Optional[str] | CSS selector to wait for in the target web page. |
| width | Optional[int] | Custom viewport width. |
| height | Optional[int] | Custom viewport height. |
| return_json | Optional[bool] | Returns response body as base64-encoded JSON instead of raw HTML. |
| block_resources | Optional[bool] | Block CSS, images, and fonts on your target web page. |
| screenshot | Optional[bool] | Captures the visible viewport. |
| full_screenshot | Optional[bool] | Captures the entire scrollable page. |
| particular_screenshot | Optional[str] | Captures a specific DOM element by selector. |
| play_with_browser | Optional[List[BrowserAction]] | A sequence of automated interactions to perform. |
| show_frames | Optional[bool] | Returns all iframe content from the target webpage. Requires render=True and return_json=True. |
| show_websocket_requests | Optional[bool] | Captures WebSocket network traffic. Requires render=True and return_json=True. |
| custom_headers | Optional[bool] | Replaces Scrape.do's default headers with your provided headers. |
| extra_headers | Optional[bool] | Appends your provided headers to Scrape.do's default headers. |
| forward_headers | Optional[bool] | Forwards all headers exactly as sent by your client. |
| set_cookies | Optional[str] | Injects specific cookies into the request. |
| disable_redirection | Optional[bool] | Prevents the proxy from following 3xx HTTP redirects. |
| timeout | Optional[int] | Total API connection timeout in milliseconds. |
| retry_timeout | Optional[int] | Internal proxy retry duration in milliseconds. Cannot be used with render=True. |
| disable_retry | Optional[bool] | Fails immediately on target error without rotating IPs. |
| output | Optional[OutputType] | Output format parser. |
| transparent_response | Optional[bool] | Returns the pure response from the target web page without Scrape.do processing. |
| pure_cookies | Optional[bool] | Returns the original Set-Cookie headers from the target website. |
validate_compatibility()
Cross-validates parameter dependencies to prevent invalid API requests locally.
Headless Browser Dependencies (render=True)
wait_until, wait_selector, custom_wait, width, height, return_json, block_resources, screenshot, full_screenshot, particular_screenshot, play_with_browser, show_frames, show_websocket_requests
ReturnJSON Dependencies (render=True + return_json=True)
screenshot, full_screenshot, particular_screenshot, show_frames, show_websocket_requests
Super Proxy Dependencies (super=True)
regional_geo_code
Screenshot Parameters
- Only one of the screenshot parameters can be set at a time.
- In addition to render=True and return_json=True, all screenshot parameters require block_resources to be set to False.
Header Parameters
- Only one of the header parameters can be set at a time.
- None of the header parameters can be set to True when using the set_cookies parameter.
Mutually Exclusive Parameters
- The play_with_browser and particular_screenshot parameters cannot be used simultaneously.
- The retry_timeout and render parameters cannot be used simultaneously.
- The regional_geo_code and geo_code parameters cannot be used simultaneously.
Returns:
| Type | Description |
|---|---|
| Self | The validated instance from which the method was called. |
Raises:
| Type | Description |
|---|---|
| ValueError | If mutually exclusive parameters are combined or if dependent parameters are provided without their required prerequisites. |
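The rules documented for validate_compatibility() can be approximated in plain Python as follows. This is a sketch over a plain dictionary rather than the library's Pydantic validator; `check_compatibility` and the constant names are hypothetical, only a subset of the render-dependent parameters is covered, and treating an unset block_resources as a violation for screenshots is an assumption.

```python
from typing import Any, Dict

SCREENSHOTS = ("screenshot", "full_screenshot", "particular_screenshot")
JSON_ONLY = SCREENSHOTS + ("show_frames", "show_websocket_requests")
# Subset of the render-dependent parameters, kept short for illustration.
RENDER_ONLY = JSON_ONLY + ("wait_until", "wait_selector", "custom_wait",
                           "return_json", "play_with_browser")
HEADERS = ("custom_headers", "extra_headers", "forward_headers")


def check_compatibility(p: Dict[str, Any]) -> Dict[str, Any]:
    """Return p unchanged, or raise ValueError on a documented conflict."""
    for name in RENDER_ONLY:
        if p.get(name) and not p.get("render"):
            raise ValueError(f"{name} requires render=True")
    for name in JSON_ONLY:
        if p.get(name) and not p.get("return_json"):
            raise ValueError(f"{name} requires return_json=True")

    shots = [name for name in SCREENSHOTS if p.get(name)]
    if len(shots) > 1:
        raise ValueError(f"only one screenshot parameter may be set: {shots}")
    if shots and p.get("block_resources") is not False:
        raise ValueError("screenshots require block_resources=False")

    headers = [name for name in HEADERS if p.get(name)]
    if len(headers) > 1:
        raise ValueError(f"only one header parameter may be set: {headers}")
    if headers and p.get("set_cookies"):
        raise ValueError("header parameters cannot be combined with set_cookies")

    if p.get("play_with_browser") and p.get("particular_screenshot"):
        raise ValueError("play_with_browser excludes particular_screenshot")
    if p.get("retry_timeout") is not None and p.get("render"):
        raise ValueError("retry_timeout cannot be used with render=True")
    if p.get("regional_geo_code"):
        if p.get("geo_code"):
            raise ValueError("regional_geo_code excludes geo_code")
        if not p.get("super"):
            raise ValueError("regional_geo_code requires super=True")
    return p
```

Running every rule in one function after all fields are known is what makes cross-field checks possible; a per-field validator only sees the fields validated before it.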
validate_geo_code(v, info)
classmethod
Validates the country code against the allowed proxy pools.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| v | Optional[str] | The | required |
| info | ValidationInfo | The data already validated for the model so far | required |
Returns:
| Type | Description |
|---|---|
| Optional[str] | The validated |
Raises:
| Type | Description |
|---|---|
| ValueError | If the country code is not supported by the selected proxy tier. |
validate_postal_code(v, info)
classmethod
Validates postal codes based on specific regional formats.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| v | Optional[str] | The | required |
| info | ValidationInfo | The data already validated for the model so far | required |
Returns:
| Type | Description |
|---|---|
| Optional[str] | The validated |
Raises:
| Type | Description |
|---|---|
| ValueError | If dependencies are missing or the format does not match the regional regex. |
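The general shape of such a check can be sketched as below. Everything in it is illustrative: `check_postal_code` and `POSTAL_PATTERNS` are hypothetical names, and the single five-digit pattern for "us" is an assumed placeholder, since the formats and countries the library actually supports are not listed in this reference.

```python
import re
from typing import Optional

# Hypothetical format table; the real supported countries and patterns
# are not documented here.
POSTAL_PATTERNS = {"us": re.compile(r"\d{5}")}


def check_postal_code(
    postal_code: Optional[str],
    geo_code: Optional[str],
    super_proxy: bool,
) -> Optional[str]:
    """Return the postal code if it passes the sketched checks."""
    if postal_code is None:
        return None  # nothing to validate
    if not super_proxy:
        raise ValueError("postal_code requires super=True")
    pattern = POSTAL_PATTERNS.get((geo_code or "").lower())
    if pattern is None:
        raise ValueError("postal_code requires a supported geo_code")
    if not pattern.fullmatch(postal_code):
        raise ValueError(f"postal_code {postal_code!r} does not match the format for {geo_code!r}")
    return postal_code
```

In the real validator the other field values would presumably come from the `info` argument rather than from explicit parameters.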
to_api_params()
Serializes the model into a dictionary formatted for httpx query parameters.
This method automatically drops unassigned fields, maps snake_case variables to their camelCase API equivalents, and stringifies nested JSON objects as required by Scrape.do.
Returns:
| Type | Description |
|---|---|
| Dict[str, Any] | A sanitized dictionary ready to be passed to httpx. |
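The three steps described for to_api_params() (drop unassigned fields, rename to the API's names, stringify nested JSON) might look roughly like this. `to_query_params` and `ALIASES` are hypothetical; the alias table only contains API names that appear elsewhere in this reference, and the browser action entry is an illustrative value.

```python
import json
from typing import Any, Dict

# API names mentioned in this reference; the real model may map more
# fields or use a different aliasing mechanism.
ALIASES = {
    "return_json": "returnJSON",
    "block_resources": "blockResources",
    "set_cookies": "setCookies",
    "play_with_browser": "playWithBrowser",
    "retry_timeout": "retryTimeout",
}


def to_query_params(fields: Dict[str, Any]) -> Dict[str, Any]:
    """Build a query-parameter dictionary from snake_case fields."""
    params: Dict[str, Any] = {}
    for name, value in fields.items():
        if value is None:
            continue  # drop unassigned fields
        if isinstance(value, (list, dict)):
            value = json.dumps(value)  # nested JSON travels as a string
        params[ALIASES.get(name, name)] = value
    return params


params = to_query_params({
    "url": "https://example.com",
    "render": True,
    "return_json": True,
    "geo_code": None,
    "play_with_browser": [{"Action": "Wait", "Timeout": 1000}],
})
```

An explicit alias table is used instead of a generic snake_case to camelCase converter because names such as returnJSON do not follow plain camelCase.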
from_url(api_url)
classmethod
Instantiates a RequestParameters instance by parsing a raw Scrape.do API URL string.
Accepted URLs
This method accepts both raw and encoded URLs, using the urllib.parse.parse_qs and urllib.parse.unquote_plus functions to normalize encoded URLs.
Browser Actions (playWithBrowser)
When providing a URL containing the playWithBrowser parameter, make sure to use the json.dumps function to stringify the list of dictionaries containing the entries. Both the raw and encoded URLs can be passed to this method afterwards.
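As a rough illustration of that advice, the sketch below stringifies an action list with json.dumps, places it in both a raw and a percent-encoded URL, and reads it back with the standard library. The api.scrape.do endpoint, the shape of the action entry, and the `read_actions` helper are assumptions made for the example; the library's own parsing is not shown.

```python
import json
from urllib.parse import parse_qs, quote, urlsplit

# Illustrative action entry; compact separators keep spaces out of the URL.
actions = [{"Action": "Wait", "Timeout": 1000}]
actions_json = json.dumps(actions, separators=(",", ":"))

base = "https://api.scrape.do/?url=https://example.com&render=true"  # assumed endpoint
raw_url = f"{base}&playWithBrowser={actions_json}"
encoded_url = f"{base}&playWithBrowser={quote(actions_json)}"


def read_actions(api_url: str) -> list:
    """Hypothetical helper: recover the action list from either URL form."""
    query = parse_qs(urlsplit(api_url).query)
    return json.loads(query["playWithBrowser"][0])
```

Because json.dumps emits double-quoted, valid JSON, the value survives percent-encoding and decoding unchanged, which a Python repr of the list (single quotes) would not.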
API Token
This method ignores the &token= parameter containing the Scrape.do API key, since the ScrapeDoClient is responsible for inserting it, using either an initialization parameter or the SCRAPE_DO_API_KEY environment variable.
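A minimal sketch of the normalization and token handling described above, using only the standard library. `parse_api_url` is a hypothetical helper and the api.scrape.do endpoint is an assumption; the sketch relies on parse_qs alone, which already percent-decodes values, whereas the method is documented as also using unquote_plus.

```python
from urllib.parse import parse_qs, urlsplit


def parse_api_url(api_url: str) -> dict:
    """Hypothetical helper: extract query fields and discard the API token."""
    query = parse_qs(urlsplit(api_url).query)
    # parse_qs returns a list per key; keep the first value of each.
    fields = {name: values[0] for name, values in query.items()}
    fields.pop("token", None)  # the client is expected to supply the key
    return fields


raw = "https://api.scrape.do/?token=SECRET&url=https://example.com/a&render=true"
encoded = "https://api.scrape.do/?token=SECRET&url=https%3A%2F%2Fexample.com%2Fa&render=true"

assert parse_api_url(raw) == parse_api_url(encoded)
```

Note that a raw target URL with its own query string containing `&` would be split into separate parameters by this approach, so encoding the target URL is the safer form.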
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| api_url | str | The full Scrape.do endpoint ( | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If the value found in the |
Returns:
| Type | Description |
|---|---|
| RequestParameters | The |