Bongbetic
Scrappy
A practical lead generation tool that gathers prospects into a clean, usable structure for real outreach teams.
Lead Generation
Sales Enablement
Workflow Design
Problem
Lead lists were scattered across sources and too messy for quick sales activation.
Outcome
Scrappy standardized lead capture into an actionable format so teams could move from list-building to outreach faster.
Technical Spec Sheet
Scrappy engineering blueprint
Platform
Desktop app with local FastAPI backend + React frontend
Architecture
- Launcher process (`app.py`) boots a dynamic localhost server and opens a native desktop window via `pywebview`.
- API/WebSocket layer (`server.py`) coordinates licensing, scraping lifecycle, real-time event streaming, and Excel export endpoints.
- Scraping engine (`scraper_engine.py`) runs a two-phase Playwright pipeline: collect place URLs, then visit each place for detail extraction.
- Frontend (`frontend/src`) is a Vite + React interface with WebSocket-driven live result updates and settings controls.
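The launcher's "dynamic localhost server" step usually comes down to the standard bind-to-port-zero trick: ask the OS for a free ephemeral port, then hand it to the server and the desktop window. A generic sketch of that pattern (illustrative, not code from `app.py`):

```python
import socket

def pick_free_port() -> int:
    """Bind to port 0 so the OS assigns an unused localhost port,
    then read the assignment back. The launcher can pass this port
    to both the local server and the webview URL."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]
```

Because the socket is closed before the server starts, there is a small race window; binding early and reusing the socket is the stricter variant, but the simple form above is the common launcher pattern.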
Stack
- Python 3 + FastAPI + Uvicorn
- Playwright (Chromium automation with stealth init scripts)
- React 19 + TypeScript + Tailwind + Radix UI
- openpyxl for styled workbook export
- requests + BeautifulSoup + lxml for optional website email extraction
Core Modules
- `scraper_engine.py`: URL discovery, DOM extraction, rating/review parsing, stop/resume-safe runtime state
- `server.py`: `/ws` command channel (`start`, `stop`, `clear`, `ping`) + `/export-open` + `/results`
- `license.py` + `gen_license.py`: trial limits, machine binding, activation flow
- `email_extractor.py`: optional contact discovery via website crawl
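Machine binding, as mentioned for the license modules, is commonly done by hashing stable host identifiers into a fingerprint that a license key is tied to. A minimal sketch of that idea; the identifiers and format here are assumptions, not the actual `license.py` scheme:

```python
import hashlib
import platform
import uuid

def machine_fingerprint() -> str:
    """Derive a stable, non-reversible machine fingerprint by hashing
    host identifiers (MAC-derived node ID, OS name, architecture).
    Illustrative only: real schemes often mix in more durable IDs."""
    raw = f"{uuid.getnode()}-{platform.system()}-{platform.machine()}"
    return hashlib.sha256(raw.encode()).hexdigest()[:16]
```

An activation flow would then send this fingerprint alongside the purchased key, so the key cannot be reused on a different machine.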
Data Flow
- User submits multiple queries from React sidebar.
- WebSocket `start` initializes `ScraperEngine` with runtime options (headless/background, phone-only, email extraction, dedup).
- Engine emits status/progress/result callbacks; backend relays events to UI in near real-time.
- Result store is persisted in-memory server-side and exported to multi-sheet XLSX with summary metadata.
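The `start`/`stop`/`clear`/`ping` command channel implies a small dispatch layer behind the WebSocket. A framework-free sketch of that routing (the JSON message shape with an `"action"` field is an assumption about the wire format):

```python
import json
from typing import Callable

class CommandRouter:
    """Map WebSocket command names to handler functions, the way a
    `/ws` endpoint would dispatch start/stop/clear/ping messages."""

    def __init__(self) -> None:
        self.handlers: dict[str, Callable[[dict], dict]] = {}

    def on(self, action: str):
        def register(fn: Callable[[dict], dict]):
            self.handlers[action] = fn
            return fn
        return register

    def dispatch(self, message: str) -> dict:
        msg = json.loads(message)
        handler = self.handlers.get(msg.get("action"))
        if handler is None:
            return {"type": "error", "detail": "unknown action"}
        return handler(msg)

router = CommandRouter()

@router.on("ping")
def ping(msg: dict) -> dict:
    # Keep-alive reply; real handlers would carry scraper state.
    return {"type": "pong"}
```

In the real server each handler would touch the `ScraperEngine` lifecycle; the router itself stays this small.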
Integrations
- Google Maps web interface (no API key mode)
- Gumroad-linked license fulfillment flow
- OS-level file opening after export (Windows/macOS/Linux)
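OS-level file opening after export typically reduces to one command per platform. A sketch that builds the command rather than running it (the actual app may instead use `os.startfile` on Windows):

```python
import platform

def open_file_command(path: str) -> list[str]:
    """Return the argv that opens a file with the OS default handler:
    `start` via cmd on Windows, `open` on macOS, `xdg-open` elsewhere."""
    system = platform.system()
    if system == "Windows":
        # The empty string is the window title `start` expects first.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["open", path]
    return ["xdg-open", path]  # Linux / BSD desktops
```

The caller would pass the result to `subprocess.Popen` right after the XLSX export completes.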
Packaging and Delivery
- PyInstaller-style frozen desktop distribution with bundled Chromium/WebView handling
- Installer scripts for Windows/macOS/Linux targets
- Runtime checks for missing browser binaries + auto-install path for WebView2 on Windows
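The runtime check for missing browser binaries can be as simple as scanning the bundled browser directory at startup. A sketch under assumed filenames and layout (not Scrappy's actual paths):

```python
import tempfile
from pathlib import Path

def missing_browser(bundle_dir: Path) -> bool:
    """Return True when no Chromium-style executable is found under
    the bundled browser directory, so the app can trigger its
    auto-install / fallback path. Filenames here are illustrative."""
    names = {"chrome", "chrome.exe", "headless_shell"}
    return not any(p.name in names for p in bundle_dir.rglob("*") if p.is_file())

# An empty directory has no browser binary, so the check fires:
assert missing_browser(Path(tempfile.mkdtemp()))
```

In a frozen build the search root would be derived from the executable's location (e.g. `sys._MEIPASS` under PyInstaller) rather than a hard-coded path.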
Quality and Reliability Measures
- Dedup strategy for repeated leads across query batches
- Thread-safe server result locking for large scrape sessions
- Graceful fallback paths when browser/runtime dependencies are missing
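The dedup strategy and thread-safe result locking above naturally combine into one store. A minimal sketch, assuming leads are keyed by place URL (the actual dedup key is not specified in the spec):

```python
import threading

class ResultStore:
    """Thread-safe result store that drops duplicate leads across
    query batches. Callbacks from concurrent scrape tasks can call
    add() without corrupting the shared list."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._seen: set[str] = set()
        self.results: list[dict] = []

    def add(self, lead: dict) -> bool:
        # Keying on place URL is an assumption for this sketch.
        key = lead.get("url") or lead.get("name", "")
        with self._lock:
            if key in self._seen:
                return False  # duplicate across query batches
            self._seen.add(key)
            self.results.append(lead)
            return True
```

Holding one lock around both the seen-set check and the append keeps the check-then-act sequence atomic, which is the part that matters for large concurrent sessions.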
Available now
Get Scrappy on Gumroad
Purchase the app directly and start using it in your workflow.