Legal / public records
Swiss federal legal database pipeline (242k laws, 4 languages, 26 cantons)
End-to-end pipeline ingesting every Swiss federal law and court decision across all 26 cantons in 4 languages (DE/FR/IT/RM). PDFs are scraped with proxy rotation, text is extracted with PyMuPDF, and the results are persisted to PostgreSQL (Neon) and AWS S3 by a cron job that runs four times a week. Automated validation jobs flag missing records.
- Laws indexed: 242,101
- PDFs in S3: 98.6%
- Cantons: 26 / 26
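
As an illustration of the validation step described above, here is a minimal sketch of a job that flags records missing from either store. The function name, the report fields and the `sr-…` identifiers are hypothetical; the real job would read its ID sets from PostgreSQL and S3 rather than from literals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationReport:
    missing_in_db: list[str]
    missing_in_s3: list[str]
    pdf_coverage: float  # share of indexed IDs that have a PDF in storage


def validate(indexed_ids: set[str], db_ids: set[str], s3_ids: set[str]) -> ValidationReport:
    """Compare the scraped index against what actually landed in each store."""
    missing_in_db = sorted(indexed_ids - db_ids)
    missing_in_s3 = sorted(indexed_ids - s3_ids)
    coverage = 1.0 if not indexed_ids else 1 - len(missing_in_s3) / len(indexed_ids)
    return ValidationReport(missing_in_db, missing_in_s3, coverage)


# Hypothetical IDs, for illustration only.
report = validate(
    indexed_ids={"sr-101", "sr-210", "sr-220", "sr-311"},
    db_ids={"sr-101", "sr-210", "sr-220", "sr-311"},
    s3_ids={"sr-101", "sr-210", "sr-311"},
)
print(report.missing_in_s3)          # ['sr-220']
print(f"{report.pdf_coverage:.1%}")  # 75.0%
```

A coverage figure such as the PDFs-in-S3 percentage can be derived from the same set difference.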
Hospitality data
73,000-venue Australian restaurant database
Scraped, deduplicated and verified every food-service venue in Australia from public sources. Now sold as our flagship dataset.
- Records: 73,000
- Sources: 12+
- Refresh: Quarterly
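
One way to deduplicate venues that appear in several sources is to compare them on a normalised key. The sketch below assumes a name-plus-postcode key and hypothetical field names (`name`, `postcode`, `source`); the matching rules used for the actual dataset are not described here and may well be stricter.

```python
import re
import unicodedata


def dedup_key(name: str, postcode: str) -> tuple[str, str]:
    """Build a comparison key: ASCII-folded, lowercased, punctuation-free name plus postcode."""
    folded = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"[^a-z0-9 ]", "", folded.lower())
    return " ".join(cleaned.split()), postcode.strip()


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first record per key and collect every source that reported it."""
    merged: dict[tuple[str, str], dict] = {}
    for rec in records:
        key = dedup_key(rec["name"], rec["postcode"])
        if key not in merged:
            merged[key] = {**rec, "sources": [rec["source"]]}
        elif rec["source"] not in merged[key]["sources"]:
            merged[key]["sources"].append(rec["source"])
    return list(merged.values())


# Made-up sample records, for illustration only.
venues = deduplicate([
    {"name": "Café Luna", "postcode": "3000", "source": "source_a"},
    {"name": "CAFE LUNA.", "postcode": "3000 ", "source": "source_b"},
    {"name": "Cafe Luna", "postcode": "2000", "source": "source_a"},
])
print(len(venues))           # 2
print(venues[0]["sources"])  # ['source_a', 'source_b']
```

Tracking which sources reported each venue also gives a cheap verification signal: a record seen in only one source is a natural candidate for a manual check.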
Real estate
Real-time real-estate price monitoring (US)
Daily Realtor.com price-change scrape for four metro areas, normalised and pushed to the client's BigQuery warehouse.
- Listings tracked: 180k
- Frequency: Daily
- Latency: < 6 hours
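
Price-change detection of this kind usually comes down to diffing two daily snapshots. The sketch below assumes each snapshot is a mapping from a listing ID to a positive whole-dollar price; the function name, row fields and sample IDs are hypothetical, and the warehouse load is left out.

```python
def price_changes(previous: dict[str, int], current: dict[str, int]) -> list[dict]:
    """Return one row per listing whose price differs between two snapshots."""
    rows = []
    for listing_id, new_price in current.items():
        old_price = previous.get(listing_id)
        if old_price is None or old_price == new_price:
            continue  # new listing or unchanged price: nothing to report
        rows.append({
            "listing_id": listing_id,
            "old_price": old_price,
            "new_price": new_price,
            "pct_change": round((new_price - old_price) / old_price * 100, 2),
        })
    return rows


# Made-up snapshots, for illustration only.
yesterday = {"A1": 500_000, "B2": 750_000, "C3": 320_000}
today = {"A1": 480_000, "B2": 750_000, "D4": 410_000}
changes = price_changes(yesterday, today)
print(changes)
```

Here only `A1` is reported: `B2` is unchanged, `C3` has left the market, and `D4` is new and so has no earlier price to compare against.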
Hospitality reputation
TripAdvisor review aggregation for hotel chain
Pulled 12 years of TripAdvisor reviews across 47 properties, sentiment-tagged and loaded into the chain's reporting dashboard.
- Reviews: 1.2M
- Properties: 47
- Languages: 9
Local services
Angi list dataset extraction (US trades)
Built a complete Angi.com index of US trades businesses — name, contact, ratings, services offered — for a SaaS lead-gen client.
- Records: 2.4M
- Trades: 85+
- Match rate: 92%
Sales enablement
Google Maps email enrichment for SaaS outbound
Enriched 50,000 SMB Google Maps records with publicly listed business emails for a SaaS vendor's cold outbound campaign.
- Records enriched: 50,000
- Email match rate: 38%
- Reply rate: ~14%
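
For illustration, an email match rate like the one above can be computed by looking for an address in each business's public page text and dividing the hits by the record count. The regular expression below is a deliberately simple assumption, not a full address validator, and the record fields and sample text are made up.

```python
import re
from typing import Optional

# Simple pattern for common address shapes; an assumption, not a complete validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def first_email(page_text: str) -> Optional[str]:
    """Return the first email-like string in the text, lowercased, or None."""
    match = EMAIL_RE.search(page_text)
    return match.group(0).lower() if match else None


def enrich(records: list[dict]) -> tuple[list[dict], float]:
    """Attach an 'email' field to each record and return the overall match rate."""
    enriched = [{**rec, "email": first_email(rec["page_text"])} for rec in records]
    matched = sum(1 for rec in enriched if rec["email"] is not None)
    rate = matched / len(enriched) if enriched else 0.0
    return enriched, rate


# Made-up sample records, for illustration only.
enriched, match_rate = enrich([
    {"name": "Acme Plumbing", "page_text": "Contact: Info@acme-plumbing.example"},
    {"name": "Bright Dental", "page_text": "Call us on 555 0100"},
])
print(enriched[0]["email"])  # info@acme-plumbing.example
print(f"{match_rate:.0%}")   # 50%
```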
E-commerce automation
Custom Shopify catalogue migration tool
Built a one-click migration bot that lifts a competitor's Shopify catalogue into a client's Shopify store — preserving variants, images and SEO meta.
- SKUs migrated: 8,400
- Time saved: ~120 hrs
- Error rate: 0.3%