How I Dropped Pocket to Go Offline

by Ploum on 2022-01-13

For browsing the web, Pocket has always been one of my most important tools. The one I could not live without. I even kept a Kobo only because it could synchronise with Pocket. You may use an alternative like Wallabag or Instapaper. Same stuff.

Unfortunately, there was no way to access Pocket content while #offline (yeah, I'm trying to make this hashtag popular, as there was no recorded instance of it on the new fancy hashtag indexer).

While preparing for my #offline year and looking for alternatives, I followed a convoluted process to try to replace Pocket synchronisation with something-to-be-determined-reMarkable synchronisation. Then it hit me in the face. Ctrl+p. That’s it, everything I was looking for was in plain sight, a keystroke away: printing the webpages I wanted to read offline!

No… That’s too simple. There must be something.

PDFs are awesome

Erm, no. In fact, even while online, webpages printed as PDF render better than in Pocket. No more half-parsed articles. No more trying to find something in their archive. No more having to pay for features which basically amount to grepping through PDFs. Last but not least, I keep a permanent local copy of really important articles. I can send them by email and share them even if the original link has long been dead. It also removes most ads and annoyances by default.

As I’m implementing offline-web reading in #offpunk (let’s try to create yet another hashtag I care about), I realise that some webpages are not rendered fully. That’s expected, since I use a readability library to try to find the article. Some pages don’t play well with it. Some are also too long to read in my terminal. Some rely heavily on pictures.

PDF solves all those issues. It also adds a kind of positive barrier to consuming content. I must really want to read a page to care about making a PDF. I believe that removing too many barriers is one of the main problems we face with today’s informational orgy. We cannot filter out what we really want from what looks not completely uninteresting because everything has the same availability.

Trying to automate stuff…

My offline-year rules allow me to connect twice a week to accomplish predefined actions. After 12 days, I can attest that most of those actions are "get pdf version of a given webpage" (and also "apt install list_of_packages"). This translates into: 1) opening Firefox, 2) copy/pasting the URL, 3) waiting for the page to load, 4) pressing Ctrl+p, 5) clicking save. This is a bit dumb. This could be automated and done by my script during my daily synchronisation.

So I guess it’s time to explore if there’s a way to easily retrieve a webpage as a PDF. Suggestions on the subject are highly appreciated. Searching with apt, I found wkhtmltopdf, which looks promising. I added the apt install command to my list of "tasks to do online". Another thing I need to make automatic in my script: a list of apt packages to install (I’ve seen there’s an apt-offline tool but haven’t managed to understand it so far).
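
A minimal sketch of what that automation could look like, assuming wkhtmltopdf is installed and called in its basic form (`wkhtmltopdf <url> <output.pdf>`). The queue file, the output folder and all the function names are hypothetical, invented for this illustration. This is not how the synchronisation script actually works.

```python
"""Sketch: turn a queue of URLs into PDFs with wkhtmltopdf."""
import hashlib
import re
import shutil
import subprocess
from pathlib import Path
from urllib.parse import urlparse


def pdf_name_for(url: str) -> str:
    """Build a readable file name from a URL (hypothetical naming scheme)."""
    parsed = urlparse(url)
    # Keep host and path, replace every run of unsafe characters with a dash.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", parsed.netloc + parsed.path).strip("-")
    # A short hash keeps apart two URLs that differ only in their query string.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:8]
    return f"{slug[:80]}-{digest}.pdf"


def build_command(url: str, out_dir: Path) -> list[str]:
    """Return the wkhtmltopdf invocation for one URL, without running it."""
    return ["wkhtmltopdf", url, str(out_dir / pdf_name_for(url))]


def read_queue(path: Path) -> list[str]:
    """Return the queued URLs, skipping blank lines and '#' comments."""
    if not path.exists():
        return []
    lines = (line.strip() for line in path.read_text().splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def fetch_all(queue: Path, out_dir: Path) -> list[str]:
    """Render every queued URL to PDF and return the URLs that failed."""
    if shutil.which("wkhtmltopdf") is None:
        raise RuntimeError("wkhtmltopdf is not installed")
    out_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for url in read_queue(queue):
        # A non-zero exit code is treated as a failure to retry next time.
        result = subprocess.run(build_command(url, out_dir), check=False)
        if result.returncode != 0:
            failed.append(url)
    return failed
```

While offline, URLs would be appended to a text file, one per line. During a connection, calling something like `fetch_all(Path("to_fetch.txt"), Path("pdfs"))` would produce the PDFs and return whatever still needs a retry. Writing a URL into the queue by hand keeps a small part of the positive barrier.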

Or not?

But what if I’m just trying to remove the positive barrier I described previously?

As I’m now using Offpunk to browse the web asynchronously, the frontier with Gemini-space is blurred. Gemini starts to look less simple, less Zen. It is now contaminated by the web. The code itself is becoming increasingly complex to handle all the oddities found on the web.

Is this really what I want?

As a writer and an engineer, I like to explore how technology impacts society. You can subscribe by email or by RSS. I value privacy and never share your address.

If you read French, you can support me by buying/sharing/reading my books and subscribing to my newsletter in French or RSS. I also develop Free Software.