Response

`response` ¶

Inbound response shapes for the Scrape.do Async API

Defines the pydantic models the ScrapeDoAsyncAPIClient parses from q.scrape.do JSON responses

Plugin Content

TaskDetails.content is an opaque string for now
Per-plugin structured response models ship with the plugin clients themselves in 0.4 / 0.5

`CancelJobResponse = JobDetails` `module-attribute` ¶

Alias for JobDetails

DELETE /api/v1/jobs/{jobID} returns the same shape as the corresponding GET with Canceled=true

`WebhookPayload = TaskDetails` `module-attribute` ¶

Alias for TaskDetails

The Scrape.do Async API posts a TaskDetails-shaped JSON body to the configured webhook URL when each task reaches a terminal status

`JobCreationResponse` ¶

Bases: BaseModel

Response body returned by POST /api/v1/jobs

Attributes:

Name	Type	Description
`job_id`	`str`	Server-assigned UUID for the newly created job
`task_ids`	`List[str]`	UUIDs for each task spawned from this job (one per `Targets[]` entry or one per `Plugin.Params[]` entry)
`message`	`Optional[str]`	Human-readable acknowledgment message

`UndetailedTaskResponse` ¶

Bases: BaseModel

Per-task summary nested inside JobDetails.tasks

Attributes:

Name	Type	Description
`task_id`	`str`	UUID of the task
`url`	`str`	The target URL for this task
`status`	`JobStatus`	Current lifecycle status of the task

`JobDetails` ¶

Bases: BaseModel

Response body returned by GET /api/v1/jobs/{jobID} and DELETE /api/v1/jobs/{jobID}

Attributes:

Name	Type	Description
`job_id`	`str`	UUID of the job
`task_ids`	`List[str]`	UUIDs of every task in this job
`status`	`JobStatus`	Current lifecycle status
`start_time`	`Optional[datetime]`	When the job started executing (RFC3339)
`end_time`	`Optional[datetime]`	When the job reached a terminal status (RFC3339)
`acquired_concurrency`	`int`	Number of concurrent requests currently in use by this job (per Scrape.do's Async API response schema)
`limit_concurrency`	`int`	Maximum number of concurrent requests allowed for this job. `0` indicates no per-job ceiling beyond the account-wide async pool
`canceled`	`bool`	`True` if the job was explicitly canceled
`tasks`	`List[UndetailedTaskResponse]`	Per-task summary entries

`is_terminal` `property` ¶

Whether the job has reached any terminal status

Terminal Statuses

success
error
canceled

Returns:

Type	Description
`bool`	`True` if `status` is one of the documented terminal statuses, `False` otherwise

`is_success` `property` ¶

Whether the job completed successfully

Returns:

Type	Description
`bool`	`True` if `status == "success"`, `False` otherwise

`is_terminal_failure` `property` ¶

Whether the job has reached a terminal failure state

Status Routing

error / canceled → True
success → False
Non-Terminal Statuses → False

Returns:

Type	Description
`bool`	`True` if `status` is `error` or `canceled`, `False` otherwise
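The status routing behind `is_terminal`, `is_success`, and `is_terminal_failure` can be sketched as plain functions over the status string. This is a minimal illustration, not the library's implementation: the function names mirror the properties above, and `"running"` is a hypothetical non-terminal status used only for the demo, since this page documents only the terminal values.

```python
# Terminal statuses as listed in this section; anything else is treated as non-terminal.
TERMINAL_STATUSES = frozenset({"success", "error", "canceled"})
FAILURE_STATUSES = frozenset({"error", "canceled"})


def is_terminal(status: str) -> bool:
    """True once the status can no longer change."""
    return status in TERMINAL_STATUSES


def is_success(status: str) -> bool:
    """True only for the single successful terminal status."""
    return status == "success"


def is_terminal_failure(status: str) -> bool:
    """True for terminal statuses other than success."""
    return status in FAILURE_STATUSES


# "running" is an assumed non-terminal status, used here for illustration only.
for status in ("success", "error", "canceled", "running"):
    print(status, is_terminal(status), is_success(status), is_terminal_failure(status))
```

Note that a non-terminal status returns `False` from all three checks, so `not is_success` alone does not imply failure.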

`duration` `property` ¶

Wall-clock duration the job spent running

Availability

Both start_time and end_time are populated only once the job reaches a terminal status
Before then, this property returns None

Returns:

Type	Description
`Optional[timedelta]`	`end_time - start_time` when both are set, `None` otherwise
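A small sketch of the `duration` rule described above, written as a standalone function rather than a model property. The function signature and the sample timestamps are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def duration(start_time: Optional[datetime], end_time: Optional[datetime]) -> Optional[timedelta]:
    """Return end_time - start_time, or None while either timestamp is missing."""
    if start_time is None or end_time is None:
        return None
    return end_time - start_time


# Made-up timestamps, 42 seconds apart.
started = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
finished = datetime(2024, 1, 1, 12, 0, 42, tzinfo=timezone.utc)

print(duration(started, finished))  # 0:00:42
print(duration(started, None))      # None
```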

`raise_for_status()` ¶

Raises the appropriate terminal-state exception for the current status

Exception Mapping

status == "error" → raises JobFailedError
status == "canceled" → raises JobCanceledError
Success / Non-Terminal → no-op

Intended Usage

Polling helpers wait_for_job and submit_and_wait already call this internally
Use it directly when working with get_job outputs

Raises:

Type	Description
`JobFailedError`	When `status == "error"`
`JobCanceledError`	When `status == "canceled"`
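The exception mapping above can be illustrated with a short sketch. The two exception classes defined here are local stand-ins that only borrow the names `JobFailedError` and `JobCanceledError`; the real classes ship with the library, and their constructors and messages may differ. The `describe` helper is hypothetical.

```python
class JobFailedError(Exception):
    """Local stand-in for the library's exception of the same name."""


class JobCanceledError(Exception):
    """Local stand-in for the library's exception of the same name."""


def raise_for_status(status: str) -> None:
    """Raise for terminal failures; do nothing for success or non-terminal statuses."""
    if status == "error":
        raise JobFailedError("job finished with status 'error'")
    if status == "canceled":
        raise JobCanceledError("job was canceled")


def describe(status: str) -> str:
    """Hypothetical caller that turns the exceptions into a short label."""
    try:
        raise_for_status(status)
    except JobFailedError:
        return "failed"
    except JobCanceledError:
        return "canceled"
    return "ok"


print(describe("success"), describe("error"), describe("canceled"))
```

Because success and non-terminal statuses are both no-ops, a caller that needs to tell them apart should still check `is_terminal` or `is_success` after the call returns.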

`TaskDetails` ¶

Bases: BaseModel

Response body returned by GET /api/v1/jobs/{jobID}/{taskID}

Attributes:

Name	Type	Description
`task_id`	`str`	UUID of the task
`job_id`	`str`	UUID of the parent job
`url`	`str`	The target URL the task was fetching
`status`	`JobStatus`	Current lifecycle status of the task. Always terminal (`success`, `error`, or `canceled`) when the payload is delivered via webhook
`start_time`	`Optional[datetime]`	When the task started
`end_time`	`Optional[datetime]`	When the task reached its terminal status
`update_time`	`Optional[datetime]`	Last update timestamp
`expires_at`	`Optional[datetime]`	When the task's content expires server-side and becomes unretrievable
`base64_encoded_content`	`bool`	Whether `content` is base64-encoded
`status_code`	`int`	HTTP status code returned by the target
`response_headers`	`Dict[str, str]`	Headers returned by the target
`scrape_do_metadata`	`Dict[str, str]`	Scrape.do's telemetry block. Exposed under the literal `Scrape.do` key on the wire
`content`	`Optional[str]`	Response body. Base64-decoded via `decoded_content()` when `base64_encoded_content=True`
`error_message`	`Optional[str]`	Task-level error message on failure

`is_terminal` `property` ¶

Whether the task has reached any terminal status

Terminal Statuses

success
error
canceled

Returns:

Type	Description
`bool`	`True` if `status` is one of the documented terminal statuses, `False` otherwise

`is_success` `property` ¶

Whether the task completed successfully

Lifecycle Status vs Target Response

This checks the task's lifecycle status, not the target's HTTP Status Code
A task with status == "success" and status_code == 404 means Scrape.do successfully fetched the target, which itself returned 404

Returns:

Type	Description
`bool`	`True` if `status == "success"`, `False` otherwise
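To make the distinction between lifecycle status and target response concrete, the sketch below checks both. `TaskSnapshot` is a hypothetical stand-in that carries only two of the `TaskDetails` fields, and `target_responded_ok` is an illustrative helper, not part of the documented API; treating 2xx as "ok" is an assumption a caller may want to adjust.

```python
from dataclasses import dataclass


@dataclass
class TaskSnapshot:
    """Hypothetical stand-in holding just the two fields discussed above."""

    status: str       # task lifecycle status, e.g. "success"
    status_code: int  # HTTP status code returned by the target

    @property
    def is_success(self) -> bool:
        # Mirrors the documented rule: lifecycle status only.
        return self.status == "success"


def target_responded_ok(task: TaskSnapshot) -> bool:
    """True only when the fetch succeeded AND the target answered with a 2xx code."""
    return task.is_success and 200 <= task.status_code < 300


not_found = TaskSnapshot(status="success", status_code=404)
print(not_found.is_success)            # True
print(target_responded_ok(not_found))  # False
```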

`is_terminal_failure` `property` ¶

Whether the task has reached a terminal failure state

Status Routing

error / canceled → True
success → False
Non-Terminal Statuses → False

Returns:

Type	Description
`bool`	`True` if `status` is `error` or `canceled`, `False` otherwise

`is_expired` `property` ¶

Whether the task's content has expired server-side

Definition

Returns True only when expires_at is populated and strictly before datetime.now(timezone.utc)
Returns False only when expires_at is populated and strictly after datetime.now(timezone.utc)
Returns None when expires_at is not set

Returns:

Type	Description
`Optional[bool]`	`True` if `expires_at` is set and in the past, `False` if `expires_at` is set and in the future, `None` if `expires_at` is not set
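The three-way result of `is_expired` can be sketched as follows. The optional `now` parameter is an assumption added so the example is deterministic; the documented property takes no arguments. The definition above does not say what happens when `expires_at` equals the current time exactly, so this sketch treats that case as not expired, which is also an assumption. `expires_at` is assumed to be timezone-aware, since comparing naive and aware datetimes raises `TypeError`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_expired(expires_at: Optional[datetime], now: Optional[datetime] = None) -> Optional[bool]:
    """Tri-state expiry check: None when unknown, otherwise a strict 'in the past' test."""
    if expires_at is None:
        return None
    current = now if now is not None else datetime.now(timezone.utc)
    return expires_at < current


# Fixed reference time so the output does not depend on the wall clock.
reference = datetime(2024, 6, 1, tzinfo=timezone.utc)

print(is_expired(reference - timedelta(hours=1), now=reference))  # True
print(is_expired(reference + timedelta(hours=1), now=reference))  # False
print(is_expired(None, now=reference))                            # None
```

Since `None` is falsy, `if task.is_expired:` treats "unknown" the same as "not expired"; compare against `None` explicitly when the difference matters.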

`duration` `property` ¶

Wall-clock duration the task spent running

Availability

Both start_time and end_time are populated only once the task reaches a terminal status
Before then, this property returns None

Returns:

Type	Description
`Optional[timedelta]`	`end_time - start_time` when both are set, `None` otherwise

`decoded_content()` ¶

Returns content as bytes, decoding base64 when applicable

Behavior

When base64_encoded_content=True, decodes content via base64.b64decode
When base64_encoded_content=False, returns content.encode("utf-8") so callers always get bytes
Returns None when content is None

Returns:

Type	Description
`Optional[bytes]`	The decoded bytes, or `None` if there is no `content`

Raises:

Type	Description
`ValueError`	If `base64_encoded_content=True` but `content` is not valid base64
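The decoding behavior above can be sketched as a standalone function. The signature is an assumption made for illustration; the documented method reads both values from the model. With the default arguments of `base64.b64decode`, malformed input such as incorrect padding raises `binascii.Error`, which is a subclass of `ValueError`.

```python
import base64
from typing import Optional


def decoded_content(content: Optional[str], base64_encoded_content: bool) -> Optional[bytes]:
    """Always return bytes (or None), whatever the wire encoding was."""
    if content is None:
        return None
    if base64_encoded_content:
        # Raises binascii.Error (a ValueError subclass) on malformed input.
        return base64.b64decode(content)
    return content.encode("utf-8")


print(decoded_content("aGVsbG8=", True))   # b'hello'
print(decoded_content("hello", False))     # b'hello'
print(decoded_content(None, True))         # None
```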

`raise_for_status()` ¶

Raises the appropriate terminal-state exception for the current status

Exception Mapping

status == "error" → raises TaskFailedError
status == "canceled" → raises TaskCanceledError
Success / Non-Terminal → no-op

Mirrors JobDetails.raise_for_status

Equivalent to JobDetails.raise_for_status on the parent job, but for the task's own lifecycle status

Raises:

Type	Description
`TaskFailedError`	When `status == "error"`. The task's `error_message` (when set) gives a more specific reason
`TaskCanceledError`	When `status == "canceled"`

`UserInformation` ¶

Bases: BaseModel

Response body returned by GET /api/v1/me

AvaliableCredits

The credit-balance field is spelled AvaliableCredits (not AvailableCredits) both in Scrape.do's official documentation and in live server responses
The spelling was verified against /api/v1/me and is locked here as the field alias

Attributes:

Name	Type	Description
`total_concurrency`	`int`	Total concurrency limit on the account
`free_concurrency`	`int`	Currently available concurrency slots
`active_jobs`	`int`	Number of jobs currently running
`available_credits`	`int`	Remaining account credits. Mapped from the `AvaliableCredits` server field
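A minimal sketch of the alias mapping, reading the misspelled wire key into a correctly spelled Python name. It uses only the standard library rather than the pydantic model, `parse_available_credits` is a hypothetical helper, and the sample body with its value is made up.

```python
import json


def parse_available_credits(body: str) -> int:
    """Read the credit balance from the server's 'AvaliableCredits' key (server spelling)."""
    payload = json.loads(body)
    return int(payload["AvaliableCredits"])


# Made-up response fragment; a real /api/v1/me body carries more fields.
sample = '{"AvaliableCredits": 1500}'
available_credits = parse_available_credits(sample)
print(available_credits)  # 1500
```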

`JobsListResponse` ¶

Bases: BaseModel

Response body returned by GET /api/v1/jobs

Per-Job Entries

Each entry in jobs is parsed as a JobDetails model
Unlike GET /api/v1/jobs/{jobID}, the listing payload omits the AcquiredConcurrency, LimitConcurrency, Canceled, and Tasks attributes, so those fall back to their model defaults (0, 0, False, and [], respectively)
The actual values can always be fetched via get_job

Attributes:

Name	Type	Description
`jobs`	`List[JobDetails]`	Per-job entries on this page
`total_count`	`int`	Total jobs matching the query (across all pages)
`page_size`	`int`	Page size used to compute the response
`page_number`	`int`	1-indexed page number of the current response
`total_pages`	`int`	Total pages available for the query
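The paging fields above are enough to walk every page of a listing. The sketch below assumes a hypothetical `fetch_page(page_number)` callable that returns a dict with `jobs` and `total_pages` keys; it is not a documented client method. The in-memory `fake_fetch_page` exists only to make the example runnable, and its ceiling-division rule for `total_pages` is an assumption about how the server computes that value.

```python
from typing import Callable, Dict, Iterator, List

PAGE_SIZE = 2
FAKE_JOBS: List[str] = [f"job-{i}" for i in range(5)]  # made-up job IDs


def fake_fetch_page(page_number: int) -> Dict:
    """In-memory stand-in for a listing call; page_number is 1-indexed."""
    start = (page_number - 1) * PAGE_SIZE
    return {
        "jobs": FAKE_JOBS[start:start + PAGE_SIZE],
        "total_count": len(FAKE_JOBS),
        "page_size": PAGE_SIZE,
        "page_number": page_number,
        "total_pages": -(-len(FAKE_JOBS) // PAGE_SIZE),  # ceiling division (assumed rule)
    }


def iter_all_jobs(fetch_page: Callable[[int], Dict]) -> Iterator[str]:
    """Yield entries from page 1 through the last page reported by the server."""
    page_number = 1
    while True:
        page = fetch_page(page_number)
        yield from page["jobs"]
        if page_number >= page["total_pages"]:
            break
        page_number += 1


print(list(iter_all_jobs(fake_fetch_page)))
```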

`JobResult` ¶

Bases: BaseModel

Bundle of the terminal JobDetails snapshot and the fetched per-task details

Returned by ScrapeDoAsyncAPIClient.submit_and_wait after a job reaches a terminal status and every task's details have been fetched

Attributes:

Name	Type	Description
`job`	`JobDetails`	The terminal `JobDetails` snapshot
`tasks`	`List[TaskDetails]`	The fully-fetched `TaskDetails` for each `task_id` in `job.task_ids`, in input order
