Roadmap
This Document Might Change
Items may be reordered or rescoped based on user feedback and design discoveries.
0.2 — Async + Proxy Mode
Status → Planned
- Next Minor
- No Change To The Existing Sync API Surface
AsyncScrapeDoClient
- Very similar to the current synchronous client, backed by httpx.AsyncClient
- Same smart-routing, validator, and event-hook semantics, with async/await
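One way the async client could stay close to the sync one is to keep request building in a shared, I/O-free function and put only the network call behind await. The sketch below illustrates that idea under stated assumptions: build_url, SyncShell, AsyncShell, and the injected send callables are hypothetical names, not part of the library, and the query-parameter layout on api.scrape.do is assumed. A real implementation would presumably hand the request to httpx.Client or httpx.AsyncClient instead of the stand-in senders used here.

```python
import asyncio
from typing import Awaitable, Callable
from urllib.parse import urlencode

API_BASE = "https://api.scrape.do/"  # assumption: API-mode endpoint


def build_url(token: str, target: str, **params: str) -> str:
    """Shared, I/O-free request building used by both shells (hypothetical)."""
    if not target.startswith(("http://", "https://")):
        raise ValueError(f"target must be an absolute URL: {target!r}")
    query = {"token": token, "url": target, **params}
    return f"{API_BASE}?{urlencode(query)}"


class SyncShell:
    """Hypothetical stand-in for the existing synchronous client."""

    def __init__(self, token: str, send: Callable[[str], str]) -> None:
        self._token = token
        self._send = send  # a real client would call httpx.Client here

    def get(self, target: str, **params: str) -> str:
        return self._send(build_url(self._token, target, **params))


class AsyncShell:
    """Hypothetical async twin: same logic, awaitable I/O."""

    def __init__(self, token: str, send: Callable[[str], Awaitable[str]]) -> None:
        self._token = token
        self._send = send  # a real client would await httpx.AsyncClient here

    async def get(self, target: str, **params: str) -> str:
        return await self._send(build_url(self._token, target, **params))


# Stand-in senders that echo the URL instead of touching the network.
def fake_send(url: str) -> str:
    return url


async def fake_send_async(url: str) -> str:
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return url


sync_result = SyncShell("TOKEN", fake_send).get("https://example.com")
async_result = asyncio.run(
    AsyncShell("TOKEN", fake_send_async).get("https://example.com")
)
```

Because both shells call the same build_url, validation and routing decisions made there apply identically to sync and async callers, which is the property the bullets above describe.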
ScrapeDoProxyClient + AsyncScrapeDoProxyClient
- Wraps Scrape.do's Proxy Mode at proxy.scrape.do, instead of the current API mode.
- Reuses the existing RequestParameters data models.
- Differs in URL construction and client-level network handling.
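The difference in URL construction could look roughly like the sketch below. It assumes API mode takes the token and target URL as query parameters on api.scrape.do, and that Proxy Mode carries the token as the proxy username and the remaining parameters as the proxy password on proxy.scrape.do. The port 8080, the Params dataclass (a stand-in for RequestParameters), and both function names are assumptions for illustration and should be checked against Scrape.do's own documentation.

```python
from dataclasses import dataclass
from urllib.parse import quote, urlencode


@dataclass(frozen=True)
class Params:
    """Hypothetical stand-in for the library's RequestParameters model."""

    render: bool = False
    geo_code: str = ""

    def to_query(self) -> dict[str, str]:
        # Only emit parameters that differ from their defaults.
        query: dict[str, str] = {}
        if self.render:
            query["render"] = "true"
        if self.geo_code:
            query["geoCode"] = self.geo_code
        return query


def api_mode_url(token: str, target: str, params: Params) -> str:
    """API mode (assumed): the target URL travels inside the query string."""
    query = {"token": token, "url": target, **params.to_query()}
    return f"https://api.scrape.do/?{urlencode(query)}"


def proxy_mode_url(token: str, params: Params, port: int = 8080) -> str:
    """Proxy mode (assumed): token as username, parameters as password."""
    userinfo = f"{quote(token, safe='')}:{urlencode(params.to_query())}"
    return f"http://{userinfo}@proxy.scrape.do:{port}"


params = Params(render=True, geo_code="us")
api_url = api_mode_url("TOKEN", "https://example.com", params)
proxy_url = proxy_mode_url("TOKEN", params)
```

In this sketch the same Params object feeds both modes. Only the API-mode URL embeds the target; in proxy mode the target would be requested through the proxy URL by the HTTP client, which is where the client-level network handling differs.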
0.3 — Scrape.do Async API
Status → Planned
- Sub-package wrapping the Scrape.do Async API at q.scrape.do
- New data models for job_id, polling state, and result fetching
- Unlike the in-process async client of 0.2, this targets Scrape.do's server-side async job queue
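To make the job-queue idea concrete, here is a minimal sketch of what data models for job_id, polling state, and result fetching might look like, together with a polling loop. Everything in it is hypothetical: the JobState values, the JobStatus fields, and wait_for_job are illustrative names, and the status fetcher is injected so the example runs without contacting q.scrape.do.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class JobState(Enum):
    """Hypothetical polling states; the real API may name these differently."""

    QUEUED = "queued"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


@dataclass(frozen=True)
class JobStatus:
    job_id: str
    state: JobState
    result: Optional[str] = None  # populated only once the job is done


def wait_for_job(
    job_id: str,
    fetch_status: Callable[[str], JobStatus],
    interval: float = 1.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> JobStatus:
    """Poll until the job reaches a terminal state or the poll budget runs out."""
    for _ in range(max_polls):
        status = fetch_status(job_id)
        if status.state in (JobState.DONE, JobState.FAILED):
            return status
        sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


# Canned responses stand in for the server-side queue.
_canned = iter(
    [
        JobStatus("job-1", JobState.QUEUED),
        JobStatus("job-1", JobState.RUNNING),
        JobStatus("job-1", JobState.DONE, result="<html>...</html>"),
    ]
)
final = wait_for_job("job-1", lambda _id: next(_canned), sleep=lambda _s: None)
```

Here the waiting happens against a remote queue rather than inside the local event loop, which is the distinction from the 0.2 async client drawn above.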
0.4 — Google Plugin
Status → Planned
- Sub-package wrapping Scrape.do's Google Scraper API with new data models specific to search/results.
0.5 — Amazon Plugin
Status → Planned
- Sub-package wrapping Scrape.do's Amazon Scraper API with new data models specific to product/listing data.
1.0 — Surface Freeze
Stability Commitment
- Stabilize the public API across sync, async, proxy, async-API, and plugin namespaces
- Post-1.0, breaking changes follow strict Semantic Versioning
Planned Package Layout
Speculative
- A starting point, not a commitment
- Each milestone may surface design constraints that justify deviation
- Version slots above are firmer than the file paths below
src/scrape_do/
│
├─ __init__.py # (1)!
├─ py.typed # (2)!
├─ exceptions.py # (3)!
├─ constants.py
├─ abc.py
│
│ # (4)!
├─ client.py # (5)!
├─ async_client.py # (6)!
├─ proxy_client.py # (7)!
├─ async_proxy_client.py # (8)!
├─ models/ # (9)!
│
├─ async_api/ # (10)!
│ │
│ ├─ __init__.py
│ ├─ client.py
│ ├─ async_client.py
│ ├─ models/ # (11)!
│ └─ exceptions.py # (12)!
│
│
└─ plugins/ # (13)!
   │
   ├─ __init__.py
   ├─ google/ # (14)!
   │  │
   │  ├─ __init__.py
   │  ├─ client.py
   │  ├─ async_client.py
   │  └─ models/ # (15)!
   │
   └─ amazon/
      │
      ├─ __init__.py
      ├─ client.py
      ├─ async_client.py
      └─ models/ # (16)!
- (1) Curated Public Re-Exports
- (2) PEP 561 Marker
- (3) Base Hierarchy (sub-packages may extend)
- (4) 0.1+0.2 - api.scrape.do + proxy.scrape.do
- (5) ScrapeDoClient (sync, api.scrape.do): 0.1
- (6) AsyncScrapeDoClient: 0.2
- (7) ScrapeDoProxyClient (proxy.scrape.do): 0.2
- (8) AsyncScrapeDoProxyClient: 0.2
- (9) Request / Response models for the four above
- (10) 0.3 - q.scrape.do: different API surface (server-side job queue)
- (11) job_id, polling, results, ...
- (12) Queue-Specific (if needed)
- (13) 0.4+0.5 - Each plugin is a sub-package
- (14) 0.4
- (15) Search-Specific
- (16) Product/Listing-Specific
Suggestions Are Welcome
Influence The Roadmap
- If a feature you need isn't here, open a Feature Request
- The roadmap reorders based on what real users need