Introducing Housefly: A Playground for Web Scraping

Learn Web Scraping with Hands-On Challenges and Real Websites
Web scraping is an essential skill for developers, but learning it can be tricky. That’s why I created Housefly, a hands-on project designed to teach web scraping through interactive exercises. Inspired by Google Gruyere, Housefly provides a series of small tutorials with dedicated companion websites built to be scraped. The goal? To give you a safe, structured environment to practice and refine your scraping skills.
Why Did I Make This?
I’ve seen countless tutorials that explain web scraping in theory, but very few offer real, controlled environments to experiment in. Housefly solves that by providing self-contained challenges where you scrape provided websites and verify your solutions against expected outputs. It’s built for hands-on learners who want to do rather than just read.
How to Get Started
The first few chapters are live, and you can try them out today!
- Clone the GitHub repo:
    git clone https://github.com/jonaylor89/housefly.git
    cd housefly
- Navigate to Chapter 1 and explore the provided website.
- Write your scraper in apps/solution1/.
- Run the checker to validate your solution:
    npm run ca 1
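If you are wondering what a first scraper might look like, here is a minimal sketch. It assumes the solution is written in TypeScript and run on Node 18+. The function names, the choice of `<h1>` as the target, and the output shape are all hypothetical, so check the chapter's instructions for what the checker actually expects.

```typescript
// Minimal scraper sketch. Everything here is illustrative: the helper names,
// the <h1> target, and the output shape are assumptions, not Housefly's spec.

/** Decode the handful of HTML entities likely to appear in simple text. */
function decodeEntities(text: string): string {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, "&"); // last, so "&amp;lt;" becomes "&lt;", not "<"
}

/**
 * Return the trimmed text of every <tag>...</tag> element in the HTML.
 * `tag` must be a plain tag name such as "h1"; it is not regex-escaped.
 */
function extractText(html: string, tag: string): string[] {
  const pattern = new RegExp(`<${tag}\\b[^>]*>([\\s\\S]*?)</${tag}>`, "gi");
  const results: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const inner = match[1].replace(/<[^>]+>/g, ""); // drop nested tags
    results.push(decodeEntities(inner).replace(/\s+/g, " ").trim());
  }
  return results;
}

/** Fetch a page and pull out its <h1> text (uses Node's built-in fetch). */
async function scrape(url: string): Promise<string[]> {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return extractText(await response.text(), "h1");
}

// Offline demonstration on an inline document, so no server is needed.
const sampleHtml = `
  <html><body>
    <h1 class="title">Hello, <em>Housefly</em>!</h1>
    <p>First &amp; only paragraph.</p>
  </body></html>`;

console.log(extractText(sampleHtml, "h1")); // [ 'Hello, Housefly!' ]
console.log(extractText(sampleHtml, "p")); // [ 'First & only paragraph.' ]
```

A regular expression is enough for a small, well-formed page like this one, but it is not a general HTML parser. Once the markup gets messier, a proper parsing library is the safer choice.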
More chapters are on the way, but for now, dive into the first challenge and start scraping!
👉 Try Housefly on GitHub: https://github.com/jonaylor89/housefly