A world of pain

There are quite a lot of libraries and tools for taking screenshots of web pages, some even for capturing specific DOM elements within a page. Turns out the vast majority of them are based on PhantomJS, which is no longer maintained. So let the thing rest in peace and get to know Puppeteer.

Enter Puppeteer

Puppeteer is developed by the Chrome team. Chrome 59 came with a headless mode, opening up a whole lot of possibilities: it allows us to render web pages in the browser without the need for GUI windows, which is ideal for automating stuff or running tests. Puppeteer provides a JavaScript API to remotely control the browser, no matter whether its head is attached or not. Since Puppeteer is an official Chrome project, I suppose we can rely on it being around for a while.

You said something about screenshots

Yup, let’s cut to the chase. Puppeteer (I hate typing it already) brings API support for exactly that. So why not build a microservice (I heard it’s en vogue right now) that takes a screenshot of an arbitrary DOM element, and nothing more?
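To give a rough idea of what that API support looks like, here is a minimal sketch of an element screenshot with Puppeteer. It is not Hotshot's actual code: the function name `screenshotElement` and the error handling are made up for illustration, and the function expects an already launched browser so that a service could reuse one Chrome instance across requests.

```javascript
// Sketch: capture a single DOM element as PNG image data.
// `browser` is a Puppeteer Browser instance, e.g. from `puppeteer.launch()`.
// The function name and the error handling are illustrative assumptions.
async function screenshotElement(browser, url, selector) {
  const page = await browser.newPage();
  try {
    await page.goto(url);
    // page.$ resolves to null if nothing matches the selector
    const element = await page.$(selector);
    if (element === null) {
      throw new Error(`No element matches selector: ${selector}`);
    }
    // Without a `path` option, screenshot() resolves to the image data
    return await element.screenshot({ type: 'png' });
  } finally {
    // Always close the page, even if navigation or the lookup failed
    await page.close();
  }
}

// Usage (requires `npm install puppeteer`):
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch();
//   const png = await screenshotElement(browser, 'https://example.com', 'h1');
//   await browser.close();
```

Passing the browser in rather than launching it inside the function keeps the expensive Chrome startup out of the per-request path.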

For that I created Hotshot. You can call Hotshot via HTTP and pass it a relative URL path and a CSS selector:

curl -G "https://hotshot.innoq.io/shoot?path=/de/podcast/042-twwwr/cover&selector=.podcast-teaser"

You’ll get back a screenshot as image/png:

Screenshot of a podcast episode cover

As you probably noticed, you can only pass relative paths to /shoot for security reasons. If you want to try it for a site other than innoq.com, you’ll need to spin up your own instance and configure it to target your desired website.
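One way such a restriction could be enforced is sketched below. This is an assumption about the general technique, not a description of how Hotshot does it: the names `TARGET_ORIGIN` and `resolveTarget` are hypothetical, and the origin value is only a placeholder for whatever site an instance is configured for.

```javascript
// Sketch: only accept site-relative paths and resolve them against one
// configured site. TARGET_ORIGIN is an assumed value for illustration.
const TARGET_ORIGIN = 'https://www.innoq.com';

function resolveTarget(path, origin = TARGET_ORIGIN) {
  if (typeof path !== 'string' || !path.startsWith('/')) {
    throw new Error('path must be a relative path starting with "/"');
  }
  // Inputs like "//evil.example/x" start with "/" but resolve to another
  // host, so compare the origin after resolving.
  const url = new URL(path, origin);
  if (url.origin !== new URL(origin).origin) {
    throw new Error('path must stay on the configured site');
  }
  return url.href;
}
```

Checking the origin after resolution, instead of only inspecting the raw string, means the service relies on the URL parser's interpretation rather than on guessing every way an absolute URL can be disguised.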

Hotshot stands on the shoulders of giants. Since I wanted to use Docker, I looked for a base image I could use and found alekzonder/docker-puppeteer. This does a lot of the heavy lifting involved in installing Puppeteer on a Node base image.

Now that we have our microservice taking screenshots of DOM elements, how can we get those neat images into our actual site? This is something we’re still working on for our website. There are a couple of possibilities:

  • Store the image in the cloud
  • Just pass through every call to Hotshot, embracing HTTP caching
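For the second option, the service would have to send response headers that let browsers and proxies cache the image. A minimal sketch of what that might look like follows; the helper names `cachingHeaders` and `isNotModified` and the one-hour lifetime are assumptions for illustration, not something Hotshot is known to do.

```javascript
const crypto = require('crypto');

// Sketch: response headers for a screenshot so browsers and proxies can
// cache it. The one-hour max-age default is an arbitrary assumption.
function cachingHeaders(png, maxAgeSeconds = 3600) {
  const hash = crypto.createHash('sha1').update(png).digest('hex');
  return {
    'Content-Type': 'image/png',
    'Cache-Control': `public, max-age=${maxAgeSeconds}`,
    ETag: `"${hash}"`,
  };
}

// True if the client's cached copy is still current (answer with 304).
// Simplified: ignores "*" and lists of several ETags in If-None-Match.
// Node lowercases incoming header names, hence 'if-none-match'.
function isNotModified(requestHeaders, responseHeaders) {
  return requestHeaders['if-none-match'] === responseHeaders.ETag;
}
```

With headers like these, repeated requests for the same screenshot can be answered from a cache, or with an empty 304 response, instead of sending the image again.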

Hotshot is still under active development, but I wanted to share my findings so far. Let’s see what other kinds of cool stuff we can use Puppeteer for.
