Recently, I’ve had the pleasure of accumulating over 1 million thumbnail and preview images on my drive. I like having as many files on my disk as the next guy, but since I’m using MongoDB anyway, I thought: why not use GridFS to store the images?
The trick was to fetch an image at a URL, save it into GridFS, then be able to serve it back out – all without temp files or creating individual files for every image. Here’s a recipe that does it all!
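As a rough idea of the shape such a recipe can take, here is a minimal sketch rather than the full recipe itself. It assumes PyMongo's `gridfs` package and a MongoDB server on localhost; the helper names (`open_gridfs`, `store_image`, `load_image`), the database name, and the choice to key each file by its URL are all made up for illustration.

```python
import urllib.request


def open_gridfs(db_name="images", uri="mongodb://localhost:27017"):
    """Return a GridFS handle. The database name and URI are placeholders."""
    # Imported lazily so the helpers below can be tried without a server.
    import gridfs
    from pymongo import MongoClient
    return gridfs.GridFS(MongoClient(uri)[db_name])


def store_image(fs, url):
    """Fetch the bytes at `url` and write them straight into GridFS."""
    with urllib.request.urlopen(url) as response:
        data = response.read()  # held in memory only, no temp file
        content_type = response.headers.get_content_type()
    # put() takes the bytes plus keyword arguments stored with the file.
    return fs.put(data, filename=url, content_type=content_type)


def load_image(fs, url):
    """Return (bytes, content_type) for the newest stored copy of `url`."""
    grid_out = fs.get_last_version(filename=url)
    return grid_out.read(), grid_out.content_type
```

To serve an image back out, a web handler would call `load_image(fs, url)` and write the bytes to the response with the stored content type as the `Content-Type` header.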
Whoever said scraping web pages can’t be fun never tried it using Python decorators and generators! We’ll use this mini-framework to fetch all the upcoming comic book releases from one of my favorite online comic book stores.
A couple of interesting things are going on here programmatically: 1) using decorators that are static methods of a class that maintains some state for the decorated operations and 2) decorating class methods that are generators – the typical decorator can’t handle a yielded result…
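To make those two points concrete, here is a small sketch of the pattern, not the mini-framework itself. `Scraper`, `page`, `ComicStore`, and the `/upcoming` path are hypothetical names, and the "page" is just a string, so nothing is fetched.

```python
import functools


class Scraper:
    """Hypothetical holder of state shared by every decorated operation."""

    routes = {}   # path -> name of the method that handles it
    yielded = {}  # method name -> number of items it has produced

    @staticmethod
    def page(path):
        """Decorator factory for generator methods that parse one page."""
        def decorate(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                # A plain decorator would only see the generator object here.
                # Re-yielding item by item lets us inspect each result.
                for item in func(*args, **kwargs):
                    count = Scraper.yielded.get(func.__name__, 0)
                    Scraper.yielded[func.__name__] = count + 1
                    yield item
            Scraper.routes[path] = func.__name__
            return wrapper
        return decorate


class ComicStore:
    @Scraper.page("/upcoming")
    def titles(self, html):
        """Yield every non-empty line of the (pretend) page."""
        for line in html.splitlines():
            if line.strip():
                yield line.strip()


store = ComicStore()
print(list(store.titles("Issue A\n\n  Issue B  \n")))
print(Scraper.routes, Scraper.yielded)
```

The wrapper is itself a generator, so the decorated method stays lazy: nothing runs, and nothing is counted, until the caller starts iterating.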
Here’s some handy code for generating combinations of strings in Python. Provide a string with substitution variables and a list of iterables, and poof!
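One way to get that effect, sketched under the assumption that the substitution variables are `str.format` placeholders (the original may well use a different syntax), is to lean on `itertools.product`. The function name `string_combinations` is invented for this example.

```python
from itertools import product


def string_combinations(template, iterables):
    """Yield `template` filled in with every combination of values.

    `template` uses positional str.format placeholders ({} or {0}, {1}, ...),
    one per iterable in `iterables`.
    """
    for values in product(*iterables):
        yield template.format(*values)


for name in string_combinations("{}-{}.png", [["thumb", "preview"], [1, 2]]):
    print(name)
```

Because it is a generator, the combinations are produced one at a time, with the last iterable varying fastest.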