Response
Inbound response shapes for the Scrape.do Async API
Defines the pydantic models the ScrapeDoAsyncAPIClient parses from
q.scrape.do JSON responses
Plugin Content
- `TaskDetails.content` is an opaque string for now
- Per-plugin structured response models ship with the plugin clients themselves in 0.4/0.5
CancelJobResponse = JobDetails
module-attribute
Alias for `JobDetails`
`DELETE /api/v1/jobs/{jobID}` returns the same shape as the corresponding `GET`, with `Canceled=true`
WebhookPayload = TaskDetails
module-attribute
Alias for `TaskDetails`
The Scrape.do Async API posts a `TaskDetails`-shaped JSON body to the configured webhook URL when each task reaches a terminal status
JobCreationResponse
Bases: BaseModel
Response body returned by `POST /api/v1/jobs`
Attributes:
| Name | Type | Description |
|---|---|---|
| `job_id` | `str` | Server-assigned UUID for the newly created job |
| `task_ids` | `List[str]` | UUIDs for each task spawned from this job (one per …) |
| `message` | `Optional[str]` | Human-readable acknowledgment message |
UndetailedTaskResponse
Bases: BaseModel
Per-task summary nested inside `JobDetails.tasks`
Attributes:
| Name | Type | Description |
|---|---|---|
| `task_id` | `str` | UUID of the task |
| `url` | `str` | The target URL for this task |
| `status` | `JobStatus` | Current lifecycle status of the task |
JobDetails
Bases: BaseModel
Response body returned by `GET /api/v1/jobs/{jobID}` and `DELETE /api/v1/jobs/{jobID}`
Attributes:
| Name | Type | Description |
|---|---|---|
| `job_id` | `str` | UUID of the job |
| `task_ids` | `List[str]` | UUIDs of every task in this job |
| `status` | `JobStatus` | Current lifecycle status |
| `start_time` | `Optional[datetime]` | When the job started executing (RFC3339) |
| `end_time` | `Optional[datetime]` | When the job reached a terminal status (RFC3339) |
| `acquired_concurrency` | `int` | Number of concurrent requests currently in use by this job (per …) |
| `limit_concurrency` | `int` | Maximum number of concurrent requests allowed for this job |
| `canceled` | `bool` | |
| `tasks` | `List[UndetailedTaskResponse]` | Per-task summary entries |
is_terminal
property
Whether the job has reached any terminal status
Terminal Statuses
- `success`
- `error`
- `canceled`
Returns:
| Type | Description |
|---|---|
| `bool` | |
is_success
property
Whether the job completed successfully
Returns:
| Type | Description |
|---|---|
| `bool` | |
is_terminal_failure
property
Whether the job has reached a terminal failure state
Status Routing
- `error` / `canceled` → `True`
- `success` → `False`
- Non-Terminal Statuses → `False`
Returns:
| Type | Description |
|---|---|
| `bool` | |
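The three status properties above can be pictured with a small stand-in. This is a sketch, not the library's implementation: it assumes the lifecycle statuses compare equal to the plain strings `success`, `error`, and `canceled` named in this section, and `StatusSketch` is a hypothetical class introduced only for illustration.

```python
from dataclasses import dataclass

# Terminal statuses named in this section.
TERMINAL_STATUSES = {"success", "error", "canceled"}


@dataclass
class StatusSketch:
    """Hypothetical stand-in for the status-related part of JobDetails."""

    status: str

    @property
    def is_terminal(self) -> bool:
        # Any of the three terminal statuses counts.
        return self.status in TERMINAL_STATUSES

    @property
    def is_success(self) -> bool:
        return self.status == "success"

    @property
    def is_terminal_failure(self) -> bool:
        # error / canceled -> True; success and non-terminal -> False
        return self.status in {"error", "canceled"}
```

With this routing, `is_terminal` is true exactly when either `is_success` or `is_terminal_failure` is true.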
duration
property
raise_for_status()
Raises the appropriate terminal-state exception for the current status
Exception Mapping
- `status == "error"` → raises `JobFailedError`
- `status == "canceled"` → raises `JobCanceledError`
- Success / Non-Terminal → no-op
Intended Usage
- Polling helpers `wait_for_job` and `submit_and_wait` already call this internally
- Use it directly when working with `get_job` outputs
Raises:
| Type | Description |
|---|---|
| `JobFailedError` | When `status == "error"` |
| `JobCanceledError` | When `status == "canceled"` |
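The exception mapping above could be exercised roughly as follows. The two exception classes here are local stand-ins with assumed constructors (the real ones ship with the client library), and `raise_for_status` and `describe` are hypothetical free functions that take a plain status string rather than a `JobDetails` instance.

```python
class JobFailedError(Exception):
    """Local stand-in; the real exception ships with the client library."""


class JobCanceledError(Exception):
    """Local stand-in; the real exception ships with the client library."""


def raise_for_status(status: str) -> None:
    """Mirror the Exception Mapping above for a plain status string."""
    if status == "error":
        raise JobFailedError("job finished with status 'error'")
    if status == "canceled":
        raise JobCanceledError("job was canceled")
    # Success and non-terminal statuses fall through as a no-op.


def describe(status: str) -> str:
    """Show how a caller holding a get_job result might react."""
    try:
        raise_for_status(status)
    except JobFailedError:
        return "failed"
    except JobCanceledError:
        return "canceled"
    return "ok"
```

Because non-terminal statuses are a no-op, a caller polling by hand still needs `is_terminal` to decide when to stop.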
TaskDetails
Bases: BaseModel
Response body returned by `GET /api/v1/jobs/{jobID}/{taskID}`
Attributes:
| Name | Type | Description |
|---|---|---|
| `task_id` | `str` | UUID of the task |
| `job_id` | `str` | UUID of the parent job |
| `url` | `str` | The target URL the task was fetching |
| `status` | `JobStatus` | Terminal status (…) |
| `start_time` | `Optional[datetime]` | When the task started |
| `end_time` | `Optional[datetime]` | When the task reached its terminal status |
| `update_time` | `Optional[datetime]` | Last update timestamp |
| `expires_at` | `Optional[datetime]` | When the task's content expires server-side and becomes unretrievable |
| `base64_encoded_content` | `bool` | Whether … |
| `status_code` | `int` | HTTP status code returned by the target |
| `response_headers` | `Dict[str, str]` | Headers returned by the target |
| `scrape_do_metadata` | `Dict[str, str]` | Scrape.do's telemetry block. Exposed under the literal … |
| `content` | `Optional[str]` | Response body. Base64-decoded via … |
| `error_message` | `Optional[str]` | Task-level error message on failure |
is_terminal
property
Whether the task has reached any terminal status
Terminal Statuses
- `success`
- `error`
- `canceled`
Returns:
| Type | Description |
|---|---|
| `bool` | |
is_success
property
Whether the task completed successfully
Lifecycle Status vs Target Response
- This checks the task's lifecycle status, not the target's HTTP status code
- A task with `status == "success"` and `status_code == 404` means Scrape.do successfully fetched the target, which itself returned `404`
Returns:
| Type | Description |
|---|---|
| `bool` | |
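Because lifecycle status and target response are independent, callers that care about the target's answer usually need to check both. A minimal sketch, assuming a hypothetical `TaskOutcome` stand-in that carries only the two fields discussed above and treats any 2xx code as an acceptable target response:

```python
from dataclasses import dataclass


@dataclass
class TaskOutcome:
    """Hypothetical stand-in carrying only the two fields discussed above."""

    status: str       # lifecycle status reported by Scrape.do
    status_code: int  # HTTP status code returned by the target


def target_ok(task: TaskOutcome) -> bool:
    """True only if the task succeeded AND the target answered with 2xx."""
    return task.status == "success" and 200 <= task.status_code < 300
```

The 2xx range is an illustrative policy choice; some callers may want to accept redirects or handle `404` as a valid "not found" result.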
is_terminal_failure
property
Whether the task has reached a terminal failure state
Status Routing
- `error` / `canceled` → `True`
- `success` → `False`
- Non-Terminal Statuses → `False`
Returns:
| Type | Description |
|---|---|
| `bool` | |
is_expired
property
Whether the task's content has expired server-side
Definition
- Returns `True` only when `expires_at` is populated and strictly before `datetime.now(timezone.utc)`
- Returns `False` only when `expires_at` is populated and strictly after `datetime.now(timezone.utc)`
- Returns `None` when `expires_at` is not set
Returns:
| Type | Description |
|---|---|
| `Optional[bool]` | |
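The tri-state definition above can be sketched as a free function. This is an illustration, not the library's code. It assumes `expires_at` is timezone-aware (comparing a naive datetime against `datetime.now(timezone.utc)` raises `TypeError`), and it returns `False` in the exact-equality case, which the definition above does not spell out.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_expired(expires_at: Optional[datetime]) -> Optional[bool]:
    """Tri-state expiry check following the Definition above."""
    if expires_at is None:
        # No expiry information: neither expired nor known to be valid.
        return None
    # Assumes a timezone-aware datetime; strictly-before means expired.
    return expires_at < datetime.now(timezone.utc)


# Usage: an hour-old timestamp is expired, a future one is not.
now = datetime.now(timezone.utc)
past = now - timedelta(hours=1)
future = now + timedelta(hours=1)
```

Since `None` is falsy, `if task.is_expired:` treats "unknown" the same as "not expired"; compare against `None` explicitly when the distinction matters.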
duration
property
decoded_content()
Returns `content` as bytes, decoding base64 when applicable
Behavior
- When `base64_encoded_content=True`, decodes `content` via `base64.b64decode`
- When `base64_encoded_content=False`, returns `content.encode("utf-8")` so callers always get bytes
- Returns `None` when `content is None`
Returns:
| Type | Description |
|---|---|
| `Optional[bytes]` | The decoded bytes, or `None` when `content is None` |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If … |
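The three behaviors listed above can be sketched with the standard library alone. The free-function form and its parameter names are assumptions made for illustration; on the model these values are fields of `TaskDetails`.

```python
import base64
from typing import Optional


def decoded_content(content: Optional[str], base64_encoded_content: bool) -> Optional[bytes]:
    """Return the body as bytes, following the Behavior list above."""
    if content is None:
        return None
    if base64_encoded_content:
        # Binary payloads arrive base64-encoded inside the JSON string.
        return base64.b64decode(content)
    # Text payloads are re-encoded so callers always receive bytes.
    return content.encode("utf-8")
```

In the standard library, `base64.b64decode` signals malformed input such as incorrect padding with `binascii.Error`, which is a subclass of `ValueError`.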
raise_for_status()
Raises the appropriate terminal-state exception for the current status
Exception Mapping
- `status == "error"` → raises `TaskFailedError`
- `status == "canceled"` → raises `TaskCanceledError`
- Success / Non-Terminal → no-op
Mirrors JobDetails.raise_for_status
Equivalent to `JobDetails.raise_for_status` on the parent job, but for the task's own lifecycle status
Raises:
| Type | Description |
|---|---|
| `TaskFailedError` | When `status == "error"` |
| `TaskCanceledError` | When `status == "canceled"` |
UserInformation
Bases: BaseModel
Response body returned by `GET /api/v1/me`
AvaliableCredits
- The credit-balance field is spelled `AvaliableCredits` in Scrape.do's official documentation, and that is also the live server spelling
- Verified against `/api/v1/me` and locked here as the alias
Attributes:
| Name | Type | Description |
|---|---|---|
| `total_concurrency` | `int` | Total concurrency limit on the account |
| `free_concurrency` | `int` | Currently available concurrency slots |
| `active_jobs` | `int` | Number of jobs currently running |
| `available_credits` | `int` | Remaining account credits. Mapped from the … |
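When reading a raw `/api/v1/me` payload without the model, the server-side spelling has to be used as the key. A minimal sketch, where `read_available_credits` is a hypothetical helper and the sample payload value is purely illustrative:

```python
from typing import Any, Mapping

# Server-side spelling documented above (note the missing "i").
CREDITS_KEY = "AvaliableCredits"


def read_available_credits(payload: Mapping[str, Any]) -> int:
    """Read the credit balance from a raw /api/v1/me payload."""
    return int(payload[CREDITS_KEY])


# Illustrative payload fragment; the number is made up.
sample = {"AvaliableCredits": 1200}
credits = read_available_credits(sample)
```

Looking up the correctly spelled `AvailableCredits` in such a payload would raise `KeyError`, which is the failure the locked alias guards against.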
JobsListResponse
Bases: BaseModel
Response body returned by `GET /api/v1/jobs`
Per-Job Entries
- Each entry in `jobs` is parsed as a `JobDetails` model
- Unlike `GET /api/v1/jobs/{jobID}`, the listing payload omits the `AcquiredConcurrency`, `LimitConcurrency`, `Canceled`, and `Tasks` attributes, so those fall back to their model defaults (`0`, `0`, `False`, and `[]`, respectively)
- The actual values can always be fetched via `get_job`
Attributes:
| Name | Type | Description |
|---|---|---|
| `jobs` | `List[JobDetails]` | Per-job entries on this page |
| `total_count` | `int` | Total jobs matching the query (across all pages) |
| `page_size` | `int` | Page size used to compute the response |
| `page_number` | `int` | 1-indexed page number of the current response |
| `total_pages` | `int` | Total pages available for the query |
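The 1-indexed `page_number` and `total_pages` fields are enough to walk every page of a listing. The sketch below assumes a hypothetical `fetch_page` callable that returns one page per call; `PageSketch` is a stand-in for `JobsListResponse`, with plain strings in place of `JobDetails` entries.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass
class PageSketch:
    """Hypothetical stand-in for the paging fields of JobsListResponse."""

    jobs: List[str]   # stand-in for List[JobDetails]
    page_number: int  # 1-indexed
    total_pages: int


def iter_all_jobs(fetch_page: Callable[[int], PageSketch]) -> Iterator[str]:
    """Yield every job across all pages, starting from page 1."""
    page_number = 1
    while True:
        page = fetch_page(page_number)
        yield from page.jobs
        # Stop on the last page (also covers an empty listing).
        if page.page_number >= page.total_pages:
            break
        page_number += 1


# Usage with an in-memory fake in place of a real client call.
fake_pages = {
    1: PageSketch(jobs=["job-a", "job-b"], page_number=1, total_pages=2),
    2: PageSketch(jobs=["job-c"], page_number=2, total_pages=2),
}
all_jobs = list(iter_all_jobs(lambda n: fake_pages[n]))
```

Entries collected this way carry the listing defaults described above, so per-job concurrency and task data still require a follow-up `get_job` call.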
JobResult
Bases: BaseModel
Bundle of the terminal `JobDetails` and the fetched per-task details
Returned by `ScrapeDoAsyncAPIClient.submit_and_wait` after a job reaches a terminal status and every task's details have been fetched
Attributes:
| Name | Type | Description |
|---|---|---|
| `job` | `JobDetails` | The terminal … |
| `tasks` | `List[TaskDetails]` | The fully-fetched … |
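A common follow-up to receiving a `JobResult` is splitting its tasks by outcome. The sketch below uses a hypothetical `FetchedTask` stand-in with only `task_id` and `status`, and assumes statuses compare equal to the plain strings used elsewhere in this section; on the real model the `is_success` and `is_terminal_failure` properties would serve the same purpose.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FetchedTask:
    """Hypothetical stand-in for the fields of TaskDetails used here."""

    task_id: str
    status: str


def split_tasks(tasks: List[FetchedTask]) -> Tuple[List[FetchedTask], List[FetchedTask]]:
    """Partition tasks into (succeeded, failed-or-canceled)."""
    succeeded = [t for t in tasks if t.status == "success"]
    failed = [t for t in tasks if t.status in {"error", "canceled"}]
    return succeeded, failed


# Usage with made-up task IDs.
ok, bad = split_tasks(
    [
        FetchedTask("t1", "success"),
        FetchedTask("t2", "error"),
        FetchedTask("t3", "canceled"),
    ]
)
```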