ever onwards
@@ -219,6 +219,8 @@ If the CMS is unknown, the tool prints structural diagnostics (card selectors, f
## Important Notes

Don't treat detail pages as optional; we always want all the info they contain.

### Status Mapping
Brokers use different status strings. Always map to one of:
- `"beschikbaar"` — Available for sale
@@ -270,6 +272,7 @@ The database stores this as JSON in the `extra` column.
- Nominatim (geocoding) has a 1 req/s limiter built into `huizenbot.py`
- Never spawn parallel requests without the human's approval
- Always use the `USER_AGENT` header (includes contact info for respectful scraping)
- Don't keep curling the same endpoint. Fetch it once, save the response to `<name makelaar>.dump`, and `rg` through that file to find what you need. You can also pipe the dump through `bsprettify.py` first and `rg` the prettified output.
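
The rules above (one request per second, no parallel requests, always send `USER_AGENT`, fetch once and search the dump) could be sketched roughly as follows. This is only an illustration, not the code in `huizenbot.py`: the `RateLimiter` class, the `http_get` and `fetch_to_dump` helpers, and the placeholder `USER_AGENT` value are all assumptions made for this example.

```python
import time
import urllib.request
from pathlib import Path

# Placeholder only: the project's real USER_AGENT (with contact info) is not shown here.
USER_AGENT = "huizenbot-example/0.0 (contact: you@example.invalid)"


class RateLimiter:
    """Blocking limiter: at most one call per `min_interval` seconds."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def http_get(url, headers):
    """Plain sequential GET; no threads, no parallel requests."""
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_to_dump(url, dump_path, fetch=http_get):
    """Fetch `url` once and cache the body in `dump_path`.

    Later calls read the dump file instead of hitting the endpoint again,
    so the file can be searched repeatedly (for example with rg).
    """
    path = Path(dump_path)
    if path.exists():
        return path.read_text(encoding="utf-8")
    body = fetch(url, {"User-Agent": USER_AGENT})
    path.write_text(body, encoding="utf-8")
    return body


# Usage sketch (hypothetical URL and file name):
#   limiter = RateLimiter(1.0)
#   limiter.wait()
#   html = fetch_to_dump("https://example.invalid/aanbod", "example.dump")
#   then search example.dump with rg instead of re-requesting the page
```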
---