Async Client

`async_client` ¶

Asynchronous HTTP client for the Scrape.do API.

Defines AsyncScrapeDoClient, the asyncio-native version of ScrapeDoClient. Mirrors the sync client's surface — smart routing, retry strategy, session validation, and event hooks — via await-based methods backed by httpx.AsyncClient.

Hooks and session validators on this client are async-only. Their type aliases (AsyncClientEventHooks and AsyncSessionValidator) type the callable as returning Awaitable[None] / Awaitable[bool] so hooks can perform I/O while the request executes.

`AsyncScrapeDoClient` ¶

Asynchronous HTTP client for executing Scrape.do API requests.

asyncio-native version of ScrapeDoClient, backed by httpx.AsyncClient. Mirrors the sync client's surface — smart routing, retry strategy, session validation, and event hooks — but every IO-bound method is async/await.

Features

Local API parameter validation via the RequestParameters Pydantic model.
Status code error parsing and customisable retry intervals for rate-limited requests.
Strongly-typed interface for responses via the ScrapeDoResponse Pydantic model.

Concurrency Limit and Server Errors

This client intercepts and manages Scrape.do's specific gateway errors (429, 502, 510), automatically applying a customisable retry strategy before the error can reach the application. The sleep between retries is non-blocking — await asyncio.sleep(...) rather than the sync client's time.sleep(...).
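The retry behaviour described above can be pictured with a minimal sketch. It is not the SDK's implementation: `send_with_retries`, its `send` callable, and the fixed `delay` are hypothetical names chosen for illustration, and only the status codes (429, 502, 510) and the use of `await asyncio.sleep(...)` come from the text.

```python
import asyncio
from typing import Awaitable, Callable

# Gateway errors the text above names as retryable.
RETRYABLE_STATUSES = {429, 502, 510}


async def send_with_retries(
    send: Callable[[], Awaitable[int]],  # hypothetical: performs one attempt, returns a status code
    max_retries: int = 3,
    delay: float = 1.0,
) -> int:
    """Await `send` until it returns a non-retryable status or retries run out."""
    status = await send()
    for _ in range(max_retries):
        if status not in RETRYABLE_STATUSES:
            break
        # Non-blocking wait: other tasks on the event loop keep running meanwhile.
        await asyncio.sleep(delay)
        status = await send()
    return status
```

Because the wait is awaited rather than blocking, many such coroutines can back off concurrently on one event loop, which is the practical difference from a `time.sleep(...)` based loop.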

SDK Event Hooks (event_hooks)

This client implements SDK-specific async event hooks. See AsyncClientEventHooks for available lifecycle hooks and their required signatures. Hooks must be async-callable (returning Awaitable[None]).

Additional httpx.AsyncClient Configuration

The following httpx.AsyncClient parameters can be provided as keyword arguments and will be passed directly to the underlying object.

verify
cert
http1
http2
timeout
limits
transport
default_encoding

Additionally, the following httpx.AsyncClient.request parameters can be provided as keyword arguments during request execution.

timeout (r_timeout)
extensions

For more information on their behaviour and default values, please consult the official httpx documentation.

Unsupported HTTPX Client Arguments

The underlying httpx.AsyncClient object is strictly managed by the instance to prevent invalid configurations from being sent to the Scrape.do API. For this reason, arguments not listed in the previous section are intentionally blocked and shouldn't be changed.

Parameters:

Name	Type	Description	Default
`api_token`	`Optional[str]`	The Scrape.do API key. If omitted, the client will attempt to load it from the 'SCRAPE_DO_API_KEY' environment variable.	`None`
`max_retries`	`int`	The maximum number of retry attempts for retryable Scrape.do gateway errors (HTTP 429, 502, and 510).	`3`
`retry_backoff`	`Union[float, Callable[[int], float]]`	The strategy used to calculate the delay between retries. Can be a static `float` (seconds) or a callable that accepts the current attempt number (0-indexed) and returns a float. Defaults to a jittered exponential backoff when set to `None`. (Shared with the sync client)	`None`
`event_hooks`	`Optional[AsyncClientEventHooks]`	A dictionary of SDK-native async hooks to execute during different points of the request lifecycle.	`None`
`verify`	`Union[SSLContext, str, bool]`	Configures SSL certificate verification. Defaults to True (secure).	`True`
`cert`	`Optional[CertTypes]`	Client-side certificates for mutual TLS authentication.	`None`
`http1`	`bool`	Enable HTTP/1.1 support.	`True`
`http2`	`bool`	Enable HTTP/2 multiplexing for higher concurrency.	`False`
`timeout`	`TimeoutTypes`	The default timeout (in seconds) applied to all network phases. Defaults to 60s, raised from httpx's 5s default to accommodate Scrape.do proxy round-trips (browser rendering, geo-routing, fingerprinting).	`60.0`
`limits`	`Limits`	Configuration for maximum connection pool sizes.	`DEFAULT_LIMITS`
`transport`	`Optional[AsyncBaseTransport]`	A completely custom async transport engine.	`None`
`default_encoding`	`Union[str, Callable[[bytes], str]]`	The fallback text encoding used if a target website omits a charset header.	`'utf-8'`
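As an illustration of the callable form of `retry_backoff` described in the table above, here is a small sketch of a jittered exponential delay. The function name, the `base` and `cap` values, and the "full jitter" formula are assumptions for the example; the SDK's own default strategy may be computed differently.

```python
import random


def jittered_backoff(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Return a delay in seconds for a 0-indexed retry attempt.

    `base` and `cap` are illustrative values, not SDK defaults.
    """
    # Exponential ceiling: base, 2*base, 4*base, ... limited to `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly between 0 and the ceiling so that
    # concurrent clients do not all retry at the same instant.
    return random.uniform(0.0, ceiling)
```

Since `base` and `cap` have defaults, the function can be called with the attempt number alone, which matches the single-argument signature the parameter expects.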

`aclose()` `async` ¶

Closes the underlying HTTPX async connection pool.

It is recommended to use the client as an async context manager to ensure resources are released automatically.

`__aenter__()` `async` ¶

Async context manager entry.

Returns:

Type	Description
`Self`	The client instance, with an open HTTPX async connection pool.

`__aexit__(exc_type, exc_val, exc_tb)` `async` ¶

Calls aclose to close the underlying HTTPX async connection pool without swallowing any exceptions.

Parameters:

Name	Type	Description	Default
`exc_type`	`Optional[type[BaseException]]`	The type of the exception.	required
`exc_val`	`Optional[BaseException]`	The instance of the exception.	required
`exc_tb`	`Optional[TracebackType]`	The traceback information.	required

Returns:

Type	Description
`Literal[False]`	`False`, since no exceptions are swallowed.
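The context manager contract above (close on exit, return `False`, never swallow exceptions) can be sketched with a stand-in class. `PoolOwner` and `demo` are hypothetical and only mimic the protocol; they do not use the real client or httpx.

```python
import asyncio
from types import TracebackType
from typing import Literal, Optional, Type


class PoolOwner:
    """Hypothetical stand-in for a client that owns a connection pool."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True

    async def __aenter__(self) -> "PoolOwner":
        return self

    async def __aexit__(
        self,
        exc_type: Optional[Type[BaseException]],
        exc_val: Optional[BaseException],
        exc_tb: Optional[TracebackType],
    ) -> Literal[False]:
        await self.aclose()
        return False  # a falsy return lets any exception propagate


async def demo() -> bool:
    owner = PoolOwner()
    try:
        async with owner:
            raise RuntimeError("boom")
    except RuntimeError:
        pass  # the error reached the caller, so it was not swallowed
    return owner.closed  # resources were still released
```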

`execute(request, session_validator=None, *, r_timeout=USE_CLIENT_DEFAULT, extensions=None)` `async` ¶

Executes a fully prepared and validated Scrape.do request asynchronously.

Async counterpart of ScrapeDoClient.execute. Acts as the core execution funnel, applying the retry backoff logic, evaluating gateway errors and sessions, and isolating cookies between sequential executions. Sleeps between retries are non-blocking (await asyncio.sleep(...)).

Intended Usage

Use this method if you have manually constructed a PreparedScrapeDoRequest object for bulk routing, custom configurations, or task reusability.

Sessions (sessionId)

If you configure a request with a session_id, Scrape.do will attempt to route your traffic through the same proxy address. However, it can still silently rotate this address for various reasons. If it rotates during a multi-step scraping task, any target-specific WAF state or cookies accumulated will be lost, which may cause the task to fail.

Validating Sessions (session_validator)

To prevent unexpected errors caused by dropped sessions, you can pass a custom async function to the session_validator argument of the client's execute method.
This function will be await-ed internally by the client after each stateful request (sessionId is not None) to determine whether or not a RotatedSessionError exception should be raised to signal that this session is no longer valid.
The function should take the current request's ScrapeDoResponse object as its only argument and return Awaitable[bool].
If the awaited value is True, this method will raise the RotatedSessionError instead of returning the response object. Otherwise, no additional action is taken.
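A validator matching the contract above might look like the following sketch. `FakeResponse` is a hypothetical stand-in for `ScrapeDoResponse`, its `text` attribute is an assumption about the response model, and the "Sign in" marker is an invented heuristic for a target that shows a login prompt once session state is lost.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for ScrapeDoResponse; `text` is an assumed attribute."""

    text: str


async def session_was_rotated(response: FakeResponse) -> bool:
    """Return True when the page looks logged out, i.e. the session was lost."""
    # Assumed marker: the target renders a sign-in prompt when cookies are gone.
    return "Sign in" in response.text
```

Returning `True` is what signals the rotated session; per the description above, the client would then raise `RotatedSessionError` instead of returning the response.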

Parameters:

Name	Type	Description	Default
`request`	`PreparedScrapeDoRequest`	The validated request payload.	required
`session_validator`	`Optional[AsyncSessionValidator]`	A custom async function to be called in order to determine whether or not to raise a `RotatedSessionError` exception.	`None`
`r_timeout`	`Union[TimeoutTypes, UseClientDefault]`	A request-specific timeout override.	`USE_CLIENT_DEFAULT`
`extensions`	`Optional[RequestExtensions]`	Advanced HTTPX extensions for this specific request.	`None`

Returns:

Type	Description
`ScrapeDoResponse`	The `ScrapeDoResponse` object containing the target's data.

Raises:

Type	Description
`APIConnectionError`	If the underlying network transport drops entirely (e.g., DNS failure).
`RotatedSessionError`	If a `session_validator` is provided, the request was made with a `session_id` argument, and the awaited `session_validator` returned `True`.

`execute_from_url(method, full_url, headers=None, body=None, payload_type='json', session_validator=None, *, r_timeout=USE_CLIENT_DEFAULT, extensions=None)` `async` ¶

Executes an async request using a raw, pre-configured api.scrape.do URL.

Async counterpart of ScrapeDoClient.execute_from_url.

Intended Usage

This method is designed for scenarios where you have generated a Scrape.do URL elsewhere and simply need to execute it. It parses the URL to extract and validate the parameters, and then passes the PreparedScrapeDoRequest to the execute method.

URL Format

The api.scrape.do URL can be either url-encoded or not. Both will have their parameters extracted and be properly re-encoded before the request is sent.
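The extract-and-re-encode step can be approximated with the standard library. This is a sketch, not the SDK's parser: `normalise_api_url` is a hypothetical helper, and the `token` and `url` query parameter names in the usage below are assumptions about the endpoint format.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalise_api_url(full_url: str) -> str:
    """Decode the query parameters, then re-encode them consistently."""
    parts = urlsplit(full_url)
    # parse_qsl percent-decodes values, so encoded and raw inputs converge.
    params = parse_qsl(parts.query, keep_blank_values=True)
    # urlencode percent-encodes every value again in one consistent style.
    return urlunsplit(parts._replace(query=urlencode(params)))
```

For example, `https://api.scrape.do/?token=T&url=https://example.com/a?b=1` and its percent-encoded equivalent both normalise to the same string. A naive approach like this cannot disambiguate an unencoded target URL that itself contains `&`, so a real parser needs extra rules for that case.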

Parameters:

Name	Type	Description	Default
`method`	`HttpMethod`	The HTTP method to forward to the target website.	required
`full_url`	`str`	The complete, pre-formatted `api.scrape.do` endpoint.	required
`headers`	`Optional[Dict[str, str]]`	Custom HTTP headers to forward to the target.	`None`
`body`	`Optional[Union[Dict[str, Any], str, bytes]]`	The payload to send to the target website.	`None`
`payload_type`	`PayloadType`	Dictates how the client encodes the `body` (e.g., 'json', 'data').	`'json'`
`session_validator`	`Optional[AsyncSessionValidator]`	A custom async function to be called in order to determine whether or not to raise a `RotatedSessionError` exception. (See `AsyncScrapeDoClient.execute` docstring for more information.)	`None`
`r_timeout`	`Union[TimeoutTypes, UseClientDefault]`	A request-specific timeout override.	`USE_CLIENT_DEFAULT`
`extensions`	`Optional[RequestExtensions]`	Advanced HTTPX extensions.	`None`

Raises:

Type	Description
`APIConnectionError`	If the underlying network transport drops entirely (e.g., DNS failure).
`RotatedSessionError`	If a `session_validator` is provided, the request was made with a `session_id` argument, and the awaited `session_validator` returned `True`.

Returns:

Type	Description
`ScrapeDoResponse`	The `ScrapeDoResponse` object containing the target's data.

`request(method, target_url, params=None, session_validator=None, *, headers=None, body=None, payload_type='json', r_timeout=USE_CLIENT_DEFAULT, extensions=None, **api_kwargs)` `async` ¶

Async interface for building and executing a Scrape.do request.

Async counterpart of ScrapeDoClient.request. Depending on the parameter configuration it either constructs a PreparedScrapeDoRequest object and passes it to the execute method, or calls the execute_from_url method on the target_url.

Parameter Configuration

This method provides smart routing based on the arguments provided. You can configure the request in three distinct ways:

Keyword Arguments (Default): Pass the target URL and Scrape.do parameters directly as **api_kwargs (render=True, geoCode="us").
Pre-built Parameters: Pass a fully validated RequestParameters object via the params argument.
Raw Scrape.do URL: Pass a full api.scrape.do URL as the target_url.

Parameter Restrictions

To prevent silent overwrites and routing ambiguity, the client enforces that only one of the parameter configurations can be used at a time.

When using the default Keyword Arguments (**api_kwargs) configuration, passing a value to the params argument, or an api.scrape.do URL to the target_url argument, will raise a ValueError.
When using the Pre-built Parameters (params) configuration, passing any **api_kwargs argument, or an api.scrape.do URL to the target_url argument, will raise a ValueError.
When using the Raw Scrape.do URL configuration, passing any **api_kwargs argument, or a value to the params argument, will raise a ValueError.
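The mutual-exclusion rule above amounts to "at most one configuration may be active". A minimal sketch of such a check follows; `pick_configuration` is a hypothetical function, and detecting a raw Scrape.do URL by comparing the hostname is an assumption, not necessarily how the client does it.

```python
from typing import Any, Mapping, Optional
from urllib.parse import urlsplit


def pick_configuration(
    target_url: str,
    params: Optional[object] = None,
    api_kwargs: Optional[Mapping[str, Any]] = None,
) -> str:
    """Return the single active configuration, or raise if several are mixed."""
    # Assumed detection rule: a raw Scrape.do URL points at the API host.
    is_raw_url = urlsplit(target_url).hostname == "api.scrape.do"

    used = []
    if api_kwargs:
        used.append("api_kwargs")
    if params is not None:
        used.append("params")
    if is_raw_url:
        used.append("raw_url")

    if len(used) > 1:
        raise ValueError(f"Conflicting parameter configurations: {', '.join(used)}")
    # With nothing extra supplied, fall back to the default keyword-argument mode.
    return used[0] if used else "api_kwargs"
```

Failing fast with a `ValueError` avoids the alternative of silently letting one source of parameters overwrite another.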

Pre-built Parameters Configuration

When passing an already constructed RequestParameters instance to the params argument, its url attribute will be ignored and replaced by the provided target_url.

Parameters:

Name	Type	Description	Default
`method`	`HttpMethod`	The HTTP method to forward to the target website.	required
`target_url`	`str`	The destination website URL (or a raw Scrape.do endpoint).	required
`params`	`Optional[RequestParameters]`	A pre-validated parameter object.	`None`
`session_validator`	`Optional[AsyncSessionValidator]`	A custom async function to be called in order to determine whether or not to raise a `RotatedSessionError` exception. (See `AsyncScrapeDoClient.execute` docstring for more information.)	`None`
`headers`	`Optional[Dict[str, str]]`	Custom HTTP headers to forward to the target.	`None`
`body`	`Optional[Union[Dict[str, Any], str, bytes]]`	The payload to send to the target website.	`None`
`payload_type`	`PayloadType`	Dictates how the client encodes the `body`.	`'json'`
`r_timeout`	`Union[TimeoutTypes, UseClientDefault]`	Request-specific timeout override.	`USE_CLIENT_DEFAULT`
`extensions`	`Optional[RequestExtensions]`	Advanced HTTPX extensions.	`None`
`**api_kwargs`	`Unpack[RequestParametersDict]`	Scrape.do API configuration parameters (e.g., `render=True`).	`{}`

Returns:

Type	Description
`ScrapeDoResponse`	The `ScrapeDoResponse` object containing the target's data.

Raises:

Type	Description
`ValueError`	If configuration constraints are violated.
`APIConnectionError`	If the underlying network transport drops entirely (e.g., DNS failure).
`RotatedSessionError`	If a `session_validator` is provided, the request was made with a `session_id` argument, and the awaited `session_validator` returned `True`.

`get(url, params=None, session_validator=None, *, headers=None, r_timeout=USE_CLIENT_DEFAULT, extensions=None, **api_kwargs)` `async` ¶

Async wrapper for executing a GET request.

Inherits the smart routing logic, parameter validation, and execution constraints of the base request method.

Parameters:

Name	Type	Description	Default
`url`	`str`	The target website URL (or raw Scrape.do URL).	required
`params`	`Optional[RequestParameters]`	A pre-validated parameter object.	`None`
`session_validator`	`Optional[AsyncSessionValidator]`	A custom async function to be called in order to determine whether or not to raise a `RotatedSessionError` exception. (See `AsyncScrapeDoClient.execute` docstring for more information.)	`None`
`headers`	`Optional[Dict[str, str]]`	Custom HTTP headers to forward.	`None`
`r_timeout`	`Union[TimeoutTypes, UseClientDefault]`	Request-specific timeout override.	`USE_CLIENT_DEFAULT`
`extensions`	`Optional[RequestExtensions]`	Advanced HTTPX extensions.	`None`
`**api_kwargs`	`Unpack[RequestParametersDict]`	Scrape.do API configuration parameters.	`{}`

Raises:

Type	Description
`ValueError`	If configuration constraints are violated.
`APIConnectionError`	If the underlying network transport drops entirely (e.g., DNS failure).
`RotatedSessionError`	If a `session_validator` is provided, the request was made with a `session_id` argument, and the awaited `session_validator` returned `True`.

Returns:

Type	Description
`ScrapeDoResponse`	The `ScrapeDoResponse` object containing the target's data.

`post(url, params=None, session_validator=None, *, body=None, headers=None, payload_type='json', r_timeout=USE_CLIENT_DEFAULT, extensions=None, **api_kwargs)` `async` ¶

Async wrapper for executing a POST request.

Inherits the smart routing logic, parameter validation, and execution constraints of the base request method.

Parameters:

Name	Type	Description	Default
`url`	`str`	The target website URL (or raw Scrape.do URL).	required
`params`	`Optional[RequestParameters]`	A pre-validated parameter object.	`None`
`session_validator`	`Optional[AsyncSessionValidator]`	A custom async function to be called in order to determine whether or not to raise a `RotatedSessionError` exception. (See `AsyncScrapeDoClient.execute` docstring for more information.)	`None`
`body`	`Optional[Union[Dict[str, Any], str, bytes]]`	The payload to send to the target website.	`None`
`headers`	`Optional[Dict[str, str]]`	Custom HTTP headers to forward.	`None`
`payload_type`	`PayloadType`	Dictates how the client encodes the `body`.	`'json'`
`r_timeout`	`Union[TimeoutTypes, UseClientDefault]`	Request-specific timeout override.	`USE_CLIENT_DEFAULT`
`extensions`	`Optional[RequestExtensions]`	Advanced HTTPX extensions.	`None`
`**api_kwargs`	`Unpack[RequestParametersDict]`	Scrape.do API configuration parameters.	`{}`

Raises:

Type	Description
`ValueError`	If configuration constraints are violated.
`APIConnectionError`	If the underlying network transport drops entirely (e.g., DNS failure).
`RotatedSessionError`	If a `session_validator` is provided, the request was made with a `session_id` argument, and the awaited `session_validator` returned `True`.

Returns:

Type	Description
`ScrapeDoResponse`	The `ScrapeDoResponse` object containing the target's data.

AsyncClientEventHooks ¶

Bases: TypedDict

Configuration dictionary for async-native lifecycle hooks.

The async counterpart of SyncClientEventHooks. Each hook must be an async-callable returning Awaitable[None] so it can perform I/O (logging to an async sink, posting telemetry, awaiting locks) while the request executes.

request `instance-attribute` ¶

request: List[
    Callable[[PreparedScrapeDoRequest], Awaitable[None]]
]

Fires exactly once per logical execution, immediately before the retry loop begins. Receives the PreparedScrapeDoRequest object that will be used to execute the request. Useful for logging the request being executed.

response `instance-attribute` ¶

response: List[
    Callable[[ScrapeDoResponse], Awaitable[None]]
]

Fires exactly once per logical execution, immediately after the proxy returns a response and the session_validator (if any) passes. Receives the request's ScrapeDoResponse object. Useful for logging only the final response after all retries, which can be a successful response, a non-retryable error, or a final retryable error after max_retries has been exhausted.

retry `instance-attribute` ¶

retry: List[
    Callable[
        [
            int,
            PreparedScrapeDoRequest,
            Optional[ScrapeDoResponse],
            Optional[Exception],
        ],
        Awaitable[None],
    ]
]

Fires inside the execution loop ONLY when a proxy gateway error (or an httpx.RequestError) occurs and the SDK decides to retry. Receives the current attempt number, the prepared request, and either the failed response (if it exists) or the httpx.RequestError that caused the retry. Useful for tracking proxy instability or manually raising an exception to abort the retry loop.
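Putting the three hook signatures together, a hooks dictionary could be assembled as in the sketch below. The hook bodies and the in-memory `log` list are illustrative only, and the arguments are typed as `Any` here because the SDK's own classes are not imported in this example.

```python
import asyncio
from typing import Any, List, Optional

log: List[str] = []  # illustrative in-memory sink; a real hook might await an async logger


async def log_request(request: Any) -> None:
    log.append(f"request: {request!r}")


async def log_response(response: Any) -> None:
    log.append(f"response: {response!r}")


async def log_retry(
    attempt: int,
    request: Any,
    response: Optional[Any],
    error: Optional[Exception],
) -> None:
    # Either the failed response or the transport error is present.
    cause = error if error is not None else response
    log.append(f"retry {attempt}: {cause!r}")


# Each key maps to a list of async callables, matching the TypedDict fields above.
event_hooks = {
    "request": [log_request],
    "response": [log_response],
    "retry": [log_retry],
}
```

A dictionary shaped like this is what the `event_hooks` constructor argument is documented to accept; raising from inside `log_retry` would be the way to abort the retry loop, as noted above.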

AsyncSessionValidator `module-attribute` ¶

AsyncSessionValidator = Callable[
    [ScrapeDoResponse], Awaitable[bool]
]

Defines the expected signature of the custom async function meant to be passed to the AsyncScrapeDoClient.execute method's session_validator argument.

Mirrors SyncSessionValidator but the callable must return Awaitable[bool] so the validator can perform I/O (e.g., a follow-up request to confirm session liveness) before deciding whether to raise RotatedSessionError.
