Validators
validators
Reusable validation helpers for Scrape.do parameter contracts.
This module exposes the cross-field and field-level rules enforced by `RequestParameters` as pure functions, so that other parameter models can reuse them without inheriting from `RequestParameters` itself.
Design Notes
- Every helper is a free function.
- Callers pass in exactly the values each rule needs.
- Helpers either return the validated value on success or raise `ValueError` with a message on rule violation.
check_geo_code(value, *, super_set)
Normalizes and validates an ISO 3166-1 alpha-2 country code against the set of countries that Scrape.do allows for the selected proxy tier (`super=True` or `super=False`).
Lowercases the input and checks it against `_SUPER_SUPPORTED_COUNTRIES` or `_DATACENTER_SUPPORTED_COUNTRIES`, depending on the value of `super_set`.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `value` | `Optional[str]` | Raw country code to validate | required |
| `super_set` | `bool` | Whether `super` is enabled for the request | required |

Returns:

| Type | Description |
|---|---|
| `Optional[str]` | The lowercased country code |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If the country code is not supported by the selected proxy tier. The message distinguishes between the `super=True` and `super=False` cases |

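The lookup described above can be sketched as follows. This is a stand-in with the same signature, not the module's source: the two country sets are tiny made-up samples, the error wording is invented, and passing `None` through unchanged is an assumption based on the `Optional[str]` return type.

```python
from typing import Optional

# Hypothetical, abbreviated sets; the module's real lists are not shown here.
_SUPER_SUPPORTED_COUNTRIES = frozenset({"us", "gb", "de", "tr"})
_DATACENTER_SUPPORTED_COUNTRIES = frozenset({"us", "gb"})


def check_geo_code(value: Optional[str], *, super_set: bool) -> Optional[str]:
    """Lowercase a country code and check it against the active tier's set."""
    if value is None:
        return None  # assumption: an unset code is passed through unchanged
    code = value.lower()
    allowed = (
        _SUPER_SUPPORTED_COUNTRIES if super_set else _DATACENTER_SUPPORTED_COUNTRIES
    )
    if code not in allowed:
        tier = "super=True" if super_set else "super=False"
        raise ValueError(f"geo_code {code!r} is not supported when {tier}")
    return code
```

Because `super_set` is keyword-only, a call site always reads as `check_geo_code("US", super_set=False)`, which makes the tier being checked explicit.
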
check_postal_code(value, *, super_set, geo_code)
Validates a postal code against the list of countries that Scrape.do supports for this parameter, then uses a regular expression to check that the value matches that country's postal code format.
Validation Logic
- Postal-code targeting requires both `super=True` AND a previously validated `geo_code` that belongs to `_ZIPCODE_FORMATS`.
- The value is stripped of surrounding whitespace before format matching.
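Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `value` | `Optional[str]` | Raw postal code to validate | required |
| `super_set` | `bool` | Whether `super` is enabled for the request | required |
| `geo_code` | `Optional[str]` | The (already-validated) ISO 3166-1 alpha-2 country code accompanying this request | required |

Returns:

| Type | Description |
|---|---|
| `Optional[str]` | The stripped postal code |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If `super` is not enabled, if `geo_code` does not belong to `_ZIPCODE_FORMATS`, or if the value does not match the country's postal code format |
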
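A minimal sketch of the two-stage check described above, assuming `_ZIPCODE_FORMATS` is a mapping from lowercase country code to a compiled pattern. The two patterns, the error wording, and the `None` pass-through are illustrative assumptions, not the module's actual data or behavior.

```python
import re
from typing import Optional

# Hypothetical, abbreviated format table keyed by lowercase country code.
_ZIPCODE_FORMATS = {
    "us": re.compile(r"\d{5}(-\d{4})?"),
    "de": re.compile(r"\d{5}"),
}


def check_postal_code(
    value: Optional[str], *, super_set: bool, geo_code: Optional[str]
) -> Optional[str]:
    """Strip a postal code and match it against its country's format."""
    if value is None:
        return None  # assumption: an unset postal code is passed through
    # Stage 1: the parameter is only available with super=True and a
    # country that has a known postal-code format.
    if not super_set or geo_code not in _ZIPCODE_FORMATS:
        raise ValueError(
            "postal_code requires super=True and a geo_code with a known format"
        )
    # Stage 2: strip surrounding whitespace, then require a full match.
    stripped = value.strip()
    if _ZIPCODE_FORMATS[geo_code].fullmatch(stripped) is None:
        raise ValueError(f"{stripped!r} is not a valid postal code for {geo_code!r}")
    return stripped
```

Using `fullmatch` rather than `match` keeps trailing garbage such as `"10001abc"` from slipping through.
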
check_geo_exclusion(geo_code, regional_geo_code)
Enforces mutual exclusivity between `geo_code` and `regional_geo_code`.
Logic Behind Validation
Scrape.do's gateway rejects requests that specify both a country code and a regional code at the same time.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `geo_code` | `Optional[str]` | The ISO 3166-1 alpha-2 country code on this request, or `None` | required |
| `regional_geo_code` | `Optional[RegionCodeType]` | The regional code on this request, or `None` | required |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If both arguments are non-`None` |

check_regional_requires_super(super_set, regional_geo_code)
Enforces the dependency of `regional_geo_code` on `super=True`.
Logic Behind Validation
Scrape.do only routes regional codes through the premium proxy pool, so `super` must be explicitly enabled when a regional code is supplied.
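Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `super_set` | `bool` | Whether `super` is enabled for the request | required |
| `regional_geo_code` | `Optional[Any]` | The regional code on this request, or `None` | required |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If `regional_geo_code` is supplied while `super` is not enabled |
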
check_screenshot_mutual_exclusion(screenshot, full_screenshot, particular_screenshot)
Enforces that at most one of the three screenshot parameters is set per request.
Logic Behind Validation
Scrape.do doesn't allow setting more than one screenshot parameter per request.
Multiple Screenshots
To capture multiple screenshots in a single request, use the `play_with_browser` parameter to provide a list containing `ScreenShotAction` objects.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `screenshot` | `Optional[bool]` | Visible-viewport screenshot flag | required |
| `full_screenshot` | `Optional[bool]` | Full-page screenshot flag | required |
| `particular_screenshot` | `Optional[str]` | CSS selector targeting a specific element to screenshot | required |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If more than one of the three is truthy |

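The "at most one truthy" rule can be sketched by collecting the names of the active parameters and counting them. The function below mirrors the documented signature, but its body and error wording are assumptions rather than the module's implementation.

```python
from typing import Optional


def check_screenshot_mutual_exclusion(
    screenshot: Optional[bool],
    full_screenshot: Optional[bool],
    particular_screenshot: Optional[str],
) -> None:
    """Raise if more than one screenshot parameter is truthy."""
    candidates = {
        "screenshot": screenshot,
        "full_screenshot": full_screenshot,
        "particular_screenshot": particular_screenshot,
    }
    # Keep only the names whose values are truthy (True or a non-empty selector).
    active = [name for name, value in candidates.items() if value]
    if len(active) > 1:
        raise ValueError(
            "Only one screenshot parameter may be set per request, got: "
            + ", ".join(active)
        )
```

Keeping the values keyed by name lets the error message list exactly which parameters collided.
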
check_screenshot_blocks_resources(screenshot, full_screenshot, particular_screenshot, block_resources)
Enforces that screenshots run with `block_resources=False`.
Logic Behind Validation
- Any active screenshot parameter requires resource blocking to be disabled, so that images, CSS, and fonts are actually loaded before capture.
- Combining a screenshot with `block_resources=True` might yield an empty or partially rendered image, so Scrape.do automatically sets it to `False` regardless of the value sent for the parameter.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `screenshot` | `Optional[bool]` | Visible-viewport screenshot flag | required |
| `full_screenshot` | `Optional[bool]` | Full-page screenshot flag | required |
| `particular_screenshot` | `Optional[str]` | Element-selector screenshot | required |
| `block_resources` | `Optional[bool]` | Whether resource blocking is enabled | required |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If any screenshot parameter is truthy while `block_resources=True` |

check_return_json_dependencies(screenshot, full_screenshot, particular_screenshot, show_frames, show_websocket_requests, return_json)
Enforces the `return_json=True` dependency for response-only artifacts.
Logic Behind Validation
Screenshots, iframe content, and websocket traces are delivered inside the structured JSON envelope, so they require both `render=True` AND `return_json=True`.
Render Dependencies
- The render-side requirement is enforced separately by `check_render_dependencies`.
- This helper only enforces the JSON-envelope side.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `screenshot` | `Optional[bool]` | Visible-viewport screenshot flag | required |
| `full_screenshot` | `Optional[bool]` | Full-page screenshot flag | required |
| `particular_screenshot` | `Optional[str]` | Element-selector screenshot | required |
| `show_frames` | `Optional[bool]` | Include all iframe content in the response | required |
| `show_websocket_requests` | `Optional[bool]` | Capture websocket traffic in the response | required |
| `return_json` | `Optional[bool]` | Whether `return_json` is enabled | required |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If any JSON-envelope-dependent field is truthy while `return_json` is not enabled |

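As a rough illustration of the JSON-envelope side of the rule, the sketch below raises when any of the five dependent fields is truthy and `return_json` is not. It deliberately ignores `render`, which the text above assigns to `check_render_dependencies`. The body and error wording are assumptions.

```python
from typing import Optional


def check_return_json_dependencies(
    screenshot: Optional[bool],
    full_screenshot: Optional[bool],
    particular_screenshot: Optional[str],
    show_frames: Optional[bool],
    show_websocket_requests: Optional[bool],
    return_json: Optional[bool],
) -> None:
    """Raise if a JSON-envelope-dependent field is set without return_json."""
    dependents = {
        "screenshot": screenshot,
        "full_screenshot": full_screenshot,
        "particular_screenshot": particular_screenshot,
        "show_frames": show_frames,
        "show_websocket_requests": show_websocket_requests,
    }
    offending = [name for name, value in dependents.items() if value]
    # None and False are both treated as "return_json not enabled" here.
    if offending and not return_json:
        raise ValueError(", ".join(offending) + " require return_json=True")
```
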
check_play_with_browser_vs_particular_screenshot(play_with_browser, particular_screenshot)
Enforces mutual exclusivity between `play_with_browser` and `particular_screenshot`.
Logic Behind Validation
Element-selector screenshots are taken after navigation and cannot coexist with a scripted browser-action sequence in the same request.
Use Particular Screenshot With playWithBrowser
To capture a particular element while using the `playWithBrowser` parameter, use a `ScreenShotAction` object with the `particular_screenshot` field set to the CSS selector you want to capture.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `play_with_browser` | `Optional[Sequence[BrowserAction]]` | Sequence of `BrowserAction` objects, or `None` | required |
| `particular_screenshot` | `Optional[str]` | CSS selector targeting a specific element to screenshot, or `None` | required |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If both arguments are non-`None` |

check_render_dependencies(render, dependent_fields)
Enforces the `render=True` dependency for headless-browser parameters.
Logic Behind Validation
The `wait_until`, `custom_wait`, `wait_selector`, `width`, `height`, `return_json`, `block_resources`, `screenshot`, `full_screenshot`, `particular_screenshot`, `play_with_browser`, `show_frames`, and `show_websocket_requests` parameters require the headless-browser pipeline to be active.
Usage
The caller must pass the actual field values keyed by name so that the error message can name the offending fields verbatim.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `render` | `Optional[bool]` | Whether `render` is enabled | required |
| `dependent_fields` | `Dict[str, Any]` | Map of render-dependent field names → their current values. Any field whose value is non-`None` counts as set | required |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If any of the dependent fields is non-`None` while `render` is not enabled |

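The name-keyed calling convention described under Usage might look like the sketch below. The function body is an assumption (in particular, treating any non-`None` value as "set" and any falsy `render` as "not enabled"), and the error wording is invented.

```python
from typing import Any, Dict, Optional


def check_render_dependencies(
    render: Optional[bool], dependent_fields: Dict[str, Any]
) -> None:
    """Raise if a render-dependent field is set while render is not enabled."""
    if render:
        return
    # A field counts as "set" when its value is not None (assumption).
    offending = [name for name, value in dependent_fields.items() if value is not None]
    if offending:
        raise ValueError(
            "These parameters require render=True: " + ", ".join(offending)
        )


# Hypothetical call site: the caller builds the mapping from its own fields.
fields = {"width": 1280, "height": None, "wait_selector": "#app"}
try:
    check_render_dependencies(False, fields)
except ValueError as exc:
    message = str(exc)  # names "width" and "wait_selector", but not "height"
```

Passing a dictionary instead of thirteen positional arguments keeps the helper independent of any particular model and lets new render-dependent fields be added without changing its signature.
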
check_retry_timeout_vs_render(render, retry_timeout)
Enforces mutual exclusivity between `retry_timeout` and `render=True`.
Logic Behind Validation
According to the official Scrape.do documentation, the `retry_timeout` parameter does not work when set together with `render=true`.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `render` | `Optional[bool]` | Whether `render` is enabled | required |
| `retry_timeout` | `Optional[int]` | Internal proxy retry duration in milliseconds, or `None` | required |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If `retry_timeout` is set while `render=True` |

check_header_mutual_exclusion(custom_headers, extra_headers, forward_headers)
Enforces that at most one of the three header-control parameters is set.
Logic Behind Validation
- `custom_headers`, `extra_headers`, and `forward_headers` are mutually exclusive header-handling modes.
- Scrape.do's gateway will reject combinations.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `custom_headers` | `Optional[bool]` | Replace Scrape.do's default headers with the user-provided ones | required |
| `extra_headers` | `Optional[bool]` | Append the user-provided headers to Scrape.do's defaults | required |
| `forward_headers` | `Optional[bool]` | Forward all headers exactly as sent by the client | required |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If more than one of the three is truthy |

check_headers_vs_set_cookies(used_header_fields, set_cookies)
Enforces incompatibility between any header-control mode and `set_cookies`.
Logic Behind Validation
Scrape.do does not accept `setCookies` alongside any of the `customHeaders` / `extraHeaders` / `forwardHeaders` flags.
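Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `used_header_fields` | `List[str]` | Names of header-control fields that are currently truthy. Pass an empty list when none is set | required |
| `set_cookies` | `Optional[str]` | The `set_cookies` value on this request, or `None` | required |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If `set_cookies` is set while `used_header_fields` is non-empty |
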
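A minimal sketch of how a caller might derive `used_header_fields` and hand it to this helper. The helper body, the error wording, and the `collect_used_header_fields` function are hypothetical and only illustrate the documented contract.

```python
from typing import List, Optional


def check_headers_vs_set_cookies(
    used_header_fields: List[str], set_cookies: Optional[str]
) -> None:
    """Raise if set_cookies is combined with any header-control mode."""
    if set_cookies and used_header_fields:
        raise ValueError(
            "set_cookies cannot be combined with: " + ", ".join(used_header_fields)
        )


def collect_used_header_fields(
    custom_headers: Optional[bool],
    extra_headers: Optional[bool],
    forward_headers: Optional[bool],
) -> List[str]:
    """Hypothetical caller-side helper: list the truthy header-control flags."""
    flags = {
        "custom_headers": custom_headers,
        "extra_headers": extra_headers,
        "forward_headers": forward_headers,
    }
    return [name for name, value in flags.items() if value]


# Hypothetical call site: only extra_headers is enabled, no cookies are set.
used = collect_used_header_fields(None, True, None)
check_headers_vs_set_cookies(used, set_cookies=None)  # passes silently
```

Accepting a list of names rather than the three flags keeps the error message specific while leaving the truthiness test to the caller.
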