13 Commits

Author SHA1 Message Date
c6328cee46 fix: extract postcode for vanoord and vanherk scrapers
Van Oord: postcode is in the first .elementor-heading-title on detail pages.
Van Herk: postcode extracted via regex from <title> tag; also pick up kamers
and energielabel from the features list which were previously ignored.
Test output now includes woonoppervlak and energielabel fields.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 23:53:17 +02:00
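
Note on c6328cee46: the Van Herk change reads the postcode from the page title with a regex. A minimal sketch of that idea follows. The function name `postcode_from_title`, the exact pattern, and the sample title are assumptions for illustration and are not taken from the repository.

```python
import re
from typing import Optional

# Dutch postcodes are four digits (first digit never 0) followed by two
# letters, optionally separated by a single space, e.g. "2611 AB".
POSTCODE_RE = re.compile(r"\b([1-9]\d{3})\s?([A-Za-z]{2})\b")


def postcode_from_title(title: str) -> Optional[str]:
    """Return the first postcode found in `title` without the space, or None."""
    match = POSTCODE_RE.search(title)
    if match is None:
        return None
    # Normalise to the compact form, e.g. "2611 AB" -> "2611AB".
    return match.group(1) + match.group(2).upper()


if __name__ == "__main__":
    # Hypothetical title text; the real site's title format may differ.
    print(postcode_from_title("Kerkstraat 12, 2611 AB Delft - Van Herk"))
```

The trailing word boundary keeps a house number followed by a place name (such as "2611 Delft") from being mistaken for a postcode.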
f74e9bcfb0 refactor: split ssr.py into package, enrich OG Online detail pages, fix travel upsert
- Split src/adapters/ssr.py (2160 LOC) into ssr/ package grouped by CMS:
  realworks.py, sure.py, schiedam.py, denhaag.py, overige.py
- Add _og_detail() to api.py; all OG Online scrapers now fall back to
  detail page fetch when energielabel/bouwjaar are missing from the API
- Fix run() to recalculate travel times for existing listings where
  fiets_mark IS NULL; upsert() now writes travel cols on existing rows too
- Update tests/cache.py to patch fetch_soup in every ssr submodule
- Update docs to reflect new package structure and mark API enrichment TODO done

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 23:39:35 +02:00
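
Note on f74e9bcfb0: the travel fix means that rows already in the database get their travel columns filled in, and that listings with `fiets_mark IS NULL` are picked up again for recalculation. A rough sketch of that behaviour with SQLite is below. Only the `fiets_mark` column name comes from the commit message; the table name `listings`, the `url` and `prijs` columns, and the helper names are hypothetical, and the project's real `upsert()` may work differently.

```python
import sqlite3
from typing import List, Optional


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with an assumed, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE listings ("
        "url TEXT PRIMARY KEY, prijs INTEGER, fiets_mark INTEGER)"
    )
    return conn


def upsert(conn: sqlite3.Connection, url: str, prijs: int,
           fiets_mark: Optional[int]) -> None:
    """Insert a listing, or update it (travel column included) if it exists."""
    conn.execute(
        """
        INSERT INTO listings (url, prijs, fiets_mark)
        VALUES (?, ?, ?)
        ON CONFLICT(url) DO UPDATE SET
            prijs = excluded.prijs,
            -- keep the stored travel time when the new value is NULL
            fiets_mark = COALESCE(excluded.fiets_mark, listings.fiets_mark)
        """,
        (url, prijs, fiets_mark),
    )


def missing_travel(conn: sqlite3.Connection) -> List[str]:
    """Return URLs of listings whose travel time still needs calculating."""
    rows = conn.execute(
        "SELECT url FROM listings WHERE fiets_mark IS NULL ORDER BY url"
    )
    return [row[0] for row in rows]
```

Using `COALESCE` here is one possible design choice: a later scrape that has no travel time to offer does not wipe out a value that was already computed.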
6beae1133b add scrapers: Olsthoorn (SURE), Post Makelaardij, Morris (Realworks) for Delft
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 23:07:27 +02:00
bfd69e3542 add scrapers: V&W, ZO Makelaars (Realworks), Roepman (JSON-LD) for Delft
- fetch_vwmakelaars, fetch_zomakelaars: one-liner Realworks wrappers
- fetch_roepman: custom JSON-LD scraper (Realworks CMS uses div.aanbodEntry
  instead of li.aanbodEntry; price from potentialAction priceSpecification)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 21:43:43 +02:00
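
Note on bfd69e3542: the Roepman scraper takes the price from `potentialAction` / `priceSpecification` in the page's JSON-LD. The sketch below shows one way to read that path. The function name `price_from_jsonld` and the sample document are hypothetical, and it assumes `potentialAction` is a single object rather than a list.

```python
import json
from typing import Optional


def price_from_jsonld(raw: str) -> Optional[int]:
    """Read potentialAction.priceSpecification.price from a JSON-LD string."""
    data = json.loads(raw)
    # Fall back to empty dicts so a missing level yields None, not an error.
    action = data.get("potentialAction") or {}
    spec = action.get("priceSpecification") or {}
    price = spec.get("price")
    if price is None:
        return None
    # Prices may arrive as strings or numbers; go through float to accept both.
    return int(float(price))


if __name__ == "__main__":
    # Hypothetical JSON-LD payload, not copied from the real site.
    sample = json.dumps({
        "@type": "Residence",
        "potentialAction": {
            "@type": "BuyAction",
            "priceSpecification": {"price": "425000", "priceCurrency": "EUR"},
        },
    })
    print(price_from_jsonld(sample))
```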
d310a7a560 add scrapers: Van Daal (API), Van Silfhout (SSR) for Delft
- fetch_vandaal: OG Online API, covers Delft/Rijswijk/Den Haag area,
  includes is_bought→verkocht status mapping
- fetch_vansilfhout: HTML scraper, all listings on single page,
  extracts postcode from embedded JS variable (objectZipcode)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 21:39:02 +02:00
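
Note on d310a7a560: two details from this commit can be sketched briefly, namely reading the postcode from the embedded `objectZipcode` JS variable and mapping `is_bought` to the `verkocht` status. The exact form of the script assignment, the helper names, and the `"beschikbaar"` fallback status are assumptions made for this example.

```python
import re
from typing import Optional

# Assumed shape of the embedded script: objectZipcode = "2611 AB";
# (an object-literal form, objectZipcode: "2611 AB", is accepted too).
ZIPCODE_VAR_RE = re.compile(r"objectZipcode\s*[=:]\s*[\"']([^\"']+)[\"']")


def postcode_from_script(html: str) -> Optional[str]:
    """Return the objectZipcode value without spaces, or None if absent."""
    match = ZIPCODE_VAR_RE.search(html)
    if match is None:
        return None
    return match.group(1).replace(" ", "").upper()


def status_from_api(item: dict) -> str:
    """Map the API's is_bought flag to a listing status."""
    # "beschikbaar" is a placeholder for whatever the non-sold status is.
    return "verkocht" if item.get("is_bought") else "beschikbaar"


if __name__ == "__main__":
    print(postcode_from_script('<script>var objectZipcode = "2611 AB";</script>'))
    print(status_from_api({"is_bought": True}))
```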
c92ddb5812 add scrapers: Moerman & De Jong (API) and Schieland Borsboom (SSR)
- fetch_moerman: OG Online realtime-listings API (same platform as bjornd),
  includes bouwjaar from dateOfConstruction, energielabel, strips postcode space
- fetch_schielandborsboom: paginated HTML scraper filtered to Schiedam,
  fetches #kenmerken detail page for full specs (bouwjaar, kamers, etc.)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 21:34:58 +02:00
f1748214ce drop email support 2026-04-04 14:11:07 +02:00
8450c33887 HA webhook works, also more makelaars 2026-04-04 01:35:29 +02:00
b35025b9cb ever onwards 2026-04-03 16:58:57 +02:00
18c01139c2 give in to the vibe 2026-04-03 16:32:00 +02:00
4f37a1dd37 improve logging 2026-04-03 16:15:29 +02:00
17b35d1997 add some more makelaars, and some more infra 2026-04-03 15:49:42 +02:00
26d9d936f4 first setup, travel works, bjornd api works 2026-04-03 13:53:39 +02:00