Skip to main content

External Blogs

Sous-Chef (Meals-on wheels) Final Sprint!

Montreal Python - sam, 08/13/2016 - 23:00

Join us on August 25th and contribute to the final sprint of Sous-chef, the new management platform of Santropol Roulant. They need our help to get the best tools to manage and deliver food to people with disabilities and difficulties.

This platform will be at the heart of their service and it's entirely based on Django. The project needs volunteers who would like give a hand by contributing to their open platform and pursuing it's development.

It's an opportunity to complete some tasks of the project (, and to get to know what the people of Santropol are doing. Although knowledge of Django is necessary to participate, you don't have to be an expert! Don't be afraid and join us.

Food will be prepared and served by the people from Santropol to the volunteers of the event!

This free event is co-organized by Santropol roulant, Montreal Python, Savoir-Faire Linux, PyLadies Montréal, Montréal Django, .

The event will be bilingual.


Thursday, Auguest 25th, doors are opening at 6pm (Everything start 6:30pm)


Santropol Roulant


Please grab a ticket on our meetup at the following url:

Catégories: External Blogs


Montreal Python - mar, 07/19/2016 - 23:00

Montreal Python developers, this event is for YOU! No matter your favourite programming language, come and grab a burger and a beer in excellent company!

This year, for our traditional inter-community barbecue, we think big! Thanks to the financial and logistic support of our sponsors, we offer you a networking activity you won't soon forget.

What's in store
  • Huge 225 m² tent with a capacity of 150+ seated guests
  • Deck terrace overlooking the Île Notre Dame Lake
  • Access to the beach, as well as a private volley-ball court
  • 100% beef and vegetarian hamburgers
  • Microbrewery beer, GURU and soft drinks
  • View of the closing fireworks of L'International des Feux Loto-Québec
  • Free parking included

Come one, come all, to an event you won't want to miss!

For more information and tickets, visit the Meetup page.

Catégories: External Blogs

(Still) working too much on the computer

Anarcat - mer, 06/01/2016 - 10:39

I have been using Workrave to try to force me to step away from the computer regularly to work around Repetitive Strain Injury (RSI) issues that have plagued my life on the computer intermittently in the last decade.

Workrave itself is only marginally efficient at getting me away from the machine: as any warning systems, it suffers from alarm fatigue as you frenetically click the dismiss button every time a Workrave warning pops up. However, it has other uses.

Analyzing data input

In the past, I have used Workrave to document how I work too much on the computer, but never went through more serious processing of the vast data store that Workrave accumulates about mouse movements and keystrokes. Interested in knowing how much my leave from Koumbit affected time spent on the computer, I decided to look into this again.

It turns out I am working as much, if not more, on the computer since I took that "time off":

We can see here that I type a lot on the computer. Normal days range from 10 000 to 60 000 keystrokes, with extremes at around 100 000 keystrokes per day. The average seem to fluctuate around 30 to 40 000 keystrokes per day, but rises sharply around the end of the second quarter of this very year. For those unfamiliar with the underlying technology, one keystroke is roughly one byte I put on the computer. So the average of 40 000 keystrokes is 40 kilobyte (KB) per day on the computer. That means 15 MB over a year or about 150MB (or 100 MiB if you want to be picky about it) over the course of the last decade.

That is a lot of typing.

I originally thought this could have been only because I type more now, as opposed to use more the mouse previously. Unfortunately, Workrave also tracks general "active time" which we can also examine:

Here we see that I work around 4 hours a day continuously on the computer. That is active time: not just login, logout time. In other words, the time where i look away from the computer and think for a while, jot down notes in my paper agenda or otherwise step away from the computer for small breaks is not counted here. Notice how some days go up to 12 hours and how recently the average went up to 7 hours of continuous activity.

So we can clearly see that I basically work more on the computer now than I ever did in the last 7 years. This is a problem - one of the reasons of this time off was to step away from the computer, and it seems I have failed.

Update: it turns out the graph was skewed towards the last samples. I went more easy on the keyboard in the last few days and things have significantly improved:

Another interesting thing we can see is when I switched from using my laptop to using the server as my main workstation, around early 2011, which is about the time marcos was built. Now that marcos has been turned into a home cinema downstairs, I went back to using my laptop as my main computing device, in late 2015. We can also clearly see when I stopped using Koumbit machines near the end of 2015 as well.

Further improvements and struggle for meaning

The details of how the graph was produced are explained at the end of this article.

This is all quite clunky: it doesn't help that the Workrave data structure is not easily parsable and so easily corruptible. It would be best if each data point was on its own separate line, which would be long, granted, but so easier to parse.

Furthermore, I do not like the perl/awk/gnuplot data processing pipeline much. It doesn't allow me to do interesting analysis like averages, means and linear regressions easily. It could be interesting to rewrite the tools in Python to allow better graphs and easier data analysis, using the tools I learned in 2015-09-28-fun-with-batteries.

Finally, this looks only at keystrokes and non-idle activity. It could be more interesting to look at idle/active times and generally the amount of time spent on the computer each day. And while it is interesting to know that I basically write a small book on the computer every day (according to Wikipedia, 120KB is about the size of a small pocket book), it is mostly meaningless if all that stuff is machine-readable code.

Where is, after all, the meaning in all those shell commands and programs we constantly input on our keyboards, in the grand scheme of human existence? Most of those bytes are bound to be destroyed by garbage collection (my shell's history) or catastrophic backup failures.

While the creative works from the 16th century can still be accessed and used by others, the data in some software programs from the 1990s is already inaccessible. - Lawrence Lessig

But is my shell history relevant? Looking back at old posts on this blog, one has to wonder if the battery life of the Thinkpad 380z laptop or how much e-waste I threw away in 2005 will be of any historical value in 20 years, assuming the data survives that long.

How this graph was made

I was happy to find that Workrave has some contrib scripts to do such processing. Unfortunately, those scripts are not shipped with the Debian package, so I requested that to be fixed (#825982). There were also some fixes necessary to make the script work at all: first, there was a syntax error in the Perl script. But then since my data is so old, there was bound to be some data corruption in there: incomplete entries or just plain broken data. I had lines that were all NULL characters, typical of power failures or disk corruptions. So I have made a patch to fix that script (#826021).

But this wasn't enough: while this processes data on the current machine fine, it doesn't deal with multiple machines very well. In the last 7 years of data I could find, I was using 3 different machines: this server (marcos), my laptop (angela) and Koumbit's office servers (koumbit). I ended up modifying the contrib scripts to be able to collate that data meaningfully. First, I copied over the data from Koumbit in a local fake-koumbit directory. Second, I mounted marcos home directory locally with SSHFS:

sshfs marcos

I also made this script to sum up datasets:

#!/usr/bin/perl -w use List::MoreUtils 'pairwise'; $| = 1; my %data = (); while (<>) { my @fields = split; my $date = shift @fields; if (defined($data{$date})) { my @t = pairwise { $a + $b } @{$data{$date}}, @fields; $data{$date} = \@t; } else { $data{$date} = \@fields; } } foreach my $d ( sort keys %data ) { print "$d @{$data{$d}}\n"; }

Then I made a modified version of the Gnuplot script that processes all those files together:

#!/usr/bin/gnuplot set title "Workrave" set ylabel "Keystrokes per day" set timefmt "%Y%m%d%H%M" #set xrange [450000000:*] set format x "%Y-%m-%d" set xtics rotate set xdata time set terminal svg set output "workrave.svg" plot "workrave-angela.dat" using 1:28 title "angela", \ "workrave-marcos.dat" using 1:28 title "marcos", \ "workrave-koumbit.dat" using 1:28 title "koumbit", \ "workrave-sum.dat" using 1:2 smooth sbezier linewidth 3 title "average" #plot "workrave-angela.dat" using 1:28 smooth sbezier title "angela", \ # "workrave-marcos.dat" using 1:28 smooth sbezier title "marcos", \ # "workrave-koumbit.dat" using 1:28 smooth sbezier title "koumbit"

And finally, I made a small shell script to glue this all together:

#!/bin/sh perl workrave-dump > workrave-$(hostname).dat HOME=$HOME/marcos perl workrave-dump > workrave-marcos.dat HOME=$PWD/fake-koumbit perl workrave-dump > workrave-koumbit.dat # remove idle days as they skew the average sed -i '/ 0$/d' workrave-*.dat # per-day granularity sed -i 's/^\(........\)....\? /\1 /' workrave-*.dat # sum up all graphs cat workrave-*.dat | sort | perl > workrave.dat ./gnuplot-workrave-anarcat

I used a different gnuplot script to generate the activity graph:

#!/usr/bin/gnuplot set title "Workrave" set ylabel "Active hours per day" set timefmt "%Y%m%d%H%M" #set xrange [450000000:*] set format x "%Y-%m-%d" set xtics rotate set xdata time set terminal svg set output "workrave.svg" plot "workrave-angela.dat" using 1:($23/3600) title "angela", \ "workrave-marcos.dat" using 1:($23/3600) title "marcos", \ "workrave-koumbit.dat" using 1:($23/3600) title "koumbit", \ "workrave.dat" using 1:($23/3600) title "average" smooth sbezier linewidth 3 #plot "workrave-angela.dat" using 1:28 smooth sbezier title "angela", \ # "workrave-marcos.dat" using 1:28 smooth sbezier title "marcos", \ # "workrave-koumbit.dat" using 1:28 smooth sbezier title "koumbit"
Catégories: External Blogs

My free software activities, May 2016

Anarcat - jeu, 05/19/2016 - 17:49
Debian Long Term Support (LTS)

This is my 6th month working on Debian LTS, started by Raphael Hertzog at Freexian. This is my largest month so far, for which I had requested 20 hours of work.

Xen work

I spent the largest amount of time working on the Xen packages. We had to re-roll the patches because it turned out we originally just imported the package from Ubuntu as-is. This was a mistake because that package forked off the Debian packaging a while ago and included regressions in the packaging itself, not just security fixes.

So I went ahead and rerolled the whole patchset and tested it on Koumbit's test server. Brian May then completed the uploaded, which included about 40 new patches, mostly from Ubuntu.

Frontdesk duties

Next up was the frontdesk duties I had taken this week. This was mostly uneventful, although I had forgotten how to do some of the work and thus ended up doing extensive work on the contributor's documentation. This is especially important since new contributors joined the team! I also did a lot of Debian documentation work in my non-sponsored work below.

The triage work involved chasing around missing DLAs, triaging away OpenJDK-6 (for which, let me remind you, security support has ended in LTS), raised the question of Mediawiki maintenance.

Other LTS work

I also did a bunch of smaller stuff. Of importance, I can note that I uploaded two advisories that were pending from April: NSS and phpMyAdmin. I also reviewed the patches for the ICU update, since I built the one for squeeze (but didn't have time to upload before squeeze hit end-of-life).

I have tried to contribute to the NTP security support but that was way too confusing to me, and I have left it to the package maintainer which seemed to be on top of things, even if things mean complete chaos and confusion in the world of NTP. I somehow thought that situation had improved with the recent investments in ntpsec and ntimed, but unfortunately Debian has not switched to the ntpsec codebase, so it seems that the NTP efforts have diverged in three different projects instead of closing into a single, better codebase.

Future LTS work

This is likely to be my last month of work on LTS until September. I will try to contribute a few hours in June, but July and August will be very busy for me outside of Debian, so it's unlikely that I contribute much to the project during the summer. My backlog included those packages which might be of interest to other LTS contributors:

  • libxml2: no upstream fix, but needs fixing!
  • tiff{,3}: same mess
  • libgd2: maintainer contacted
  • samba regression: mailed bug #821811 to try to revive the effort
  • policykit-1: to be investigated
  • p7zip: same
Other free software work Debian documentation

I wrote an detailed short guide to Debian package development, something I felt was missing from the existing corpus, which seems to be too focus in covering all alternatives. My guide is opinionated: I believe there is a right and wrong way of doing things, or at least, there are best practices, especially when just patching packages. I ended up retroactively publishing that as a blog post - now I can simply tag an item with blog and it shows up in the blog.

(Of course, because of a mis-configuration on my side, I have suffered from long delays publishing to Debian planet, so all the posts dates are off in the Planet RSS feed. This will hopefully be resolved around the time this post is published, but this allowed me to get more familiar with the Planet Venus software, as detailed in that other article.)

Apart from the guide, I have also done extensive research to collate information that allowed me to create workflow graphs of the various Debian repositories, which I have published in the Debian Release section of the Debian wiki. Here is the graph:

It helps me understand how packages flow between different suites and who uploads what where. This emerged after I realized I didn't really understand how "proposed updates" worked. Since we are looking at implementing a similar process for the security queue, I figured it was useful to show what changes would happen, graphically.

I have also published a graph that describes the relations between different software that make up the Debian archive. The idea behind this is also to provide an overview of what happens when you upload a package in the Debian archive, but it is more aimed at Debian developers trying to figure out why things are not working as expected.

The graphs were done with Graphviz, which allowed me to link to various components in the graph easily, which is neat. I also prefered Graphviz over Dia or other tools because it is easier to version and I don't have to bother (too much) about the layout and tweaking the looks. The downside is, of course, that when Graphviz makes the wrong decision, it's actually pretty hard to make it do the right thing, but there are various workarounds that I have found that made the graphs look pretty good.

The source is of course available in git but I feel all this documentation (including the guide) should go in a more official document somewhere. I couldn't quite figure out where. Advice on this would be of course welcome.


I have made yet another plugin for Ikiwiki, called irker, which enables wikis to send notifications to IRC channels, thanks to the simple irker bot. I had trouble with Irker in the past, since it was not quite reliable: it would disappear from channels and not return when we'd send it a notification. Unfortunately, the alternative, the KGB bot is much heavier: each repository needs a server-side, centralized configuration to operate properly.

Irker's design is simpler and more adapted to a simple plugin like this. Let's hope it will work reliably enough for my needs.

I have also suggested improvements to the footnotes styles, since they looked like hell in my Debian guide. It turns out this was an issue with the multimarkdown plugin that doesn't use proper semantic markup to identify footnotes. The proper fix is to enable footnotes in the default Discount plugin, which will require another, separate patch.

Finally, I have done some improvements (I hope!) on the layout of this theme. I made the top header much lighter and transparent to work around an issue where followed anchors would be hidden under the top header. I have also removed the top menu made out of the sidebar plugin because it was cluttering the display too much. Those links are all on the frontpage anyways and I suspect people were not using them so much.

The code is, as before, available in this git repository although you may want to start from the new ikistrap theme that is based on Bootstrap 4 and that may eventually be merged in ikiwiki directly.

DNS diagnostics

Through this interesting overview of various *ping tools, I got found out about the dnsdiag tool which currently allows users to do DNS traces, tampering detection and ping over DNS. In the hope of packaging it into Debian, I have requested clarifications regarding a modification to the DNSpython library the tool uses.

But I went even further and boldly opened a discussion about replacing DNSstuff, the venerable DNS diagnostic tools that is now commercial. It is somewhat surprising that there is no software that has even been publicly released that does those sanity checks for DNS, given how old DNS is.

Incidentally, I have also requested smtpping to be packaged in Debian as well but httping is already packaged.

Link checking

In the process of writing this article, I suddenly remembered that I constantly make mistakes in the various links I post on my site. So I started looking at a link checker, another tool that should be well established but that, surprisingly, is not quite there yet.

I have found this neat software written in Python called LinkChecker. Unfortunately, it is basically broken in Debian, so I had to do a non-maintainer upload to fix that old bug. I managed to force myself to not take over maintainership of this orphaned package but I may end up doing just that if no one steps up the next time I find issues in the package.

One of the problems I had checking links in my blog is that I constantly refer to sites that are hostile to bots, like the Debian bugtracker and MoinMoin wikis. So I published a patch that adds a --no-robots flag to be able to crawl those sites effectively.

I know there is the W3C tool but it's written in Perl, and there's probably zero chance for me to convince those guys to bypass robots exclusion rules, so I am sticking to Linkchecker.

Other Debian packaging work

At my request, Drush has finally been removed from Debian. Hopefully someone else will pick up that work, but since it basically needs to be redone from scratch, there was no sense in keeping it in the next release of Debian. Similarly, Semanticscuttle was removed from Debian as well.

I have uploaded new versions of tuptime, sopel and smokeping. I have also file a Request For Help for Smokeping. I am happy to report there was a quick response and people will be stepping up to help with the maintenance of that venerable monitoring software.

Background radiation

Finally, here's the generic background noise of me running around like a chicken with his head cut off:

Finally, I should mention that I will be less active in the coming months, as I will be heading outside as the summer finally came! I somewhat feel uncomfortable documenting publicly my summer here, as I am more protective of my privacy than I was before on this blog. But we'll see how it goes, maybe you'll hear non-technical articles here again soon!

Catégories: External Blogs

Meals-on-wheel Hacknight

Montreal Python - mar, 05/17/2016 - 23:00

Join us on May 26th and contribute to new management platform of Santropol Roulant. They need our help to get a better tools to manage and deliver food to people with disabilities and difficulties.

This platform will be at the heart of their service and it's entirely based on Django. The project needs volunteers who would like give a hand and to contribute their open platform and pursue it's development.

It's your opportunity to achieves some tasks of the project (, and to get to know what people of Santropol are doing. Please note that the event is targeting people who are used to Django but not necessarily experts with it, don't be afraid and join us :)

People from Santropol will prepare food that will be serve to the volunteers of the event!

This Free software project was also built by Savoir-Faire Linux and volunteers from la Maison du logiciel Libre.

Please note that the event will be bilingual.


Please confirm your participation on our meetup page at


Thursday, May 26th, starting at 6pm


Santropol Roulant 111 Roy East (Sherbrooke metro)

Catégories: External Blogs

Long delays posting Debian Planet Venus

Anarcat - sam, 05/14/2016 - 14:47

For the last few months, it seems that my posts haven't been reaching the Planet Debian aggregator correctly. I timed the last two posts and they both arrived roughly 10 days late in the feed.

SNI issues

At first, I suspected I was a victim of the SNI bug in Planet Venus: since it is still running in Python 2.7 and uses httplib2 (as opposed to, say, Requests), it has trouble with sites running under SNI. In January, there were 9 blogs with that problem on Planet. When this was discussed elsewhere in February, there were now 18, and then 21 reported in March. With everyone enabling (like me) Let's Encrypt on their website, this number is bound to grow.

I was able to reproduce the Debian Planet setup locally to do further tests and ended up sending two (unrelated) patches to the Debian bug tracker against Planet Venus, the software running Debian planet. In my local tests, I found 22 hosts with SNI problems. I also posted some pointers on how the code could be ported over to the more modern Requests and Cachecontrol modules.

Expiry issues

However, some of those feeds were working fine on philp, the host I found was running as the Planet Master. Even more strange, my own website was working fine!

INFO:planet.runner:Feed unchanged

Now that was strange: why was my feed fetched, but noted as unchanged? For that, I found that there was a FAQ question buried down in the PlanetDebian wikipage which explicitly said that Planet obeys Expires headers diligently and will not get new content again if the headers say they did. Skeptical, I looked my own headers and, ta-da! they were way off:

$ curl -v 2>&1 | egrep '< (Expires|Date)' < Date: Sat, 14 May 2016 19:59:28 GMT < Expires: Sat, 28 May 2016 19:59:28 GMT

So I lowered the expires timeout on my RSS feeds to 3 hours:

root@marcos:/etc/apache2# git diff diff --git a/apache2/conf-available/expires.conf b/apache2/conf-available/expires.conf index 214f3dd..a983738 100644 --- a/apache2/conf-available/expires.conf +++ b/apache2/conf-available/expires.conf @@ -3,8 +3,18 @@ # Enable expirations. ExpiresActive On - # Cache all files for 2 weeks after access (A). - ExpiresDefault A1209600 + # Cache all files 12 hours after access + ExpiresDefault "access plus 12 hours" + + # RSS feeds should refresh more often + <FilesMatch \.(rss)$> + ExpiresDefault "modification plus 4 hours" + </FilesMatch> + + # images are *less* likely to change + <FilesMatch "\.(gif|jpg|png|js|css)$"> + ExpiresDefault "access plus 1 month" + </FilesMatch> <FilesMatch \.(php|cgi)$> # Do not allow scripts to be cached unless they explicitly send cache

I also lowered the general cache expiry, except for images, Javascript and CSS.

Planet Venus maintenance

A small last word about all this: I'm surprised to see that Planet Debian is running a 6 year old software that hasn't seen a single official release yet, with local patches on top. It seems that Venus is well designed, I must give them that, but it's a little worrisome to see great software just rotting around like this.

A good "planet" site seems like a resource a lot of FLOSS communities would need: is there another "Planet-like" aggregator out there that is well maintained and more reliable? In Python, preferably.

PlanetPlanet, which Venus was forked from, is out of the question: it is even less maintained than the new fork, which itself seems to have died in 2011.

There is a discussion about the state of Venus on Github which reflects some of the concerns expressed here, as well as on the mailing list. The general consensus seems to be that everyone should switch over to Planet Pluto, which is written in Ruby.

I am not sure which planet Debian sits on - Pluto? Venus? Besides, Pluto is not even a planet anymore...

Mike check!

So this is also a test to see if my posts reach Debian Planet correctly. I suspect no one will ever see this on the top of their feeds, since the posts do get there, but with a 10 days delay and with the original date, so they are "sunk" down. The above expiration fixes won't take effect until the 10 days delay is over... But if you did see this as noise, retroactive apologies in advance for the trouble.

If you are reading this from somewhere else and wish to say hi, don't hesitate, it's always nice to hear from my readers.

Catégories: External Blogs

Notmuch, offlineimap and Sieve setup

Anarcat - jeu, 05/12/2016 - 18:29

I've been using Notmuch since about 2011, switching away from Mutt to deal with the monstrous amount of emails I was, and still am dealing with on the computer. I have contributed a few patches and configs on the Notmuch mailing list, but basically, I have given up on merging patches, and instead have a custom config in Emacs that extend it the way I want. In the last 5 years, Notmuch has progressed significantly, so I haven't found the need to patch it or make sweeping changes.

The huge INBOX of death

The one thing that is problematic with my use of Notmuch is that I end up with a ridiculously large INBOX folder. Before the cleanup I did this morning, I had over 10k emails in there, out of about 200k emails overall.

Since I mostly work from my laptop these days, the Notmuch tags are only on the laptop, and not propagated to the server. This makes accessing the mail spool directly, from webmail or simply through a local client (say Mutt) on the server, really inconvenient, because it has to load a very large spool of mail, which is very slow in Mutt. Even worse, a bunch of mail that was archived in Notmuch shows up in the spool because it's just removed tags in Notmuch: the mails are still in the inbox, even though they are marked as read.

So I was hoping that Notmuch would help me deal with the giant inbox of death problem, but in fact, when I don't use Notmuch, it actually makes the problem worse. Today, I did a bunch of improvements to my setup to fix that.

The first thing I did was to kill procmail, which I was surprised to discover has been dead for over a decade. I switched over to Sieve for filtering, having already switched to Dovecot a while back on the server. I tried to use the conversion tool but it didn't work very well, so I basically rewrote the whole file. Since I was mostly using Notmuch for filtering, there wasn't much left to convert.

Sieve filtering

But this is where things got interesting: Sieve is so simpler to use and more intuitive that I started doing more interesting stuff in bridging the filtering system (Sieve) with the tagging system (Notmuch). Basically, I use Sieve to split large chunks of emails off my main inbox, to try to remove as much spam, bulk email, notifications and mailing lists as possible from the larger flow of emails. Then Notmuch comes in and does some fine-tuning, assigning tags to specific mailing lists or topics, and being generally the awesome search engine that I use on a daily basis.

Dovecot and Postfix configs

For all of this to work, I had to tweak my mail servers to talk sieve. First, I enabled sieve in Dovecot:

--- a/dovecot/conf.d/15-lda.conf +++ b/dovecot/conf.d/15-lda.conf @@ -44,5 +44,5 @@ protocol lda { # Space separated list of plugins to load (default is global mail_plugins). - #mail_plugins = $mail_plugins + mail_plugins = $mail_plugins sieve }

Then I had to switch from procmail to dovecot for local delivery, that was easy, in Postfix's perennial

#mailbox_command = /usr/bin/procmail -a "$EXTENSION" mailbox_command = /usr/lib/dovecot/dovecot-lda -a "$RECIPIENT"

Note that dovecot takes the full recipient as an argument, not just the extension. That's normal. It's clever, it knows that kind of stuff.

One last tweak I did was to enable automatic mailbox creation and subscription, so that the automatic extension filtering (below) can create mailboxes on the fly:

--- a/dovecot/conf.d/15-lda.conf +++ b/dovecot/conf.d/15-lda.conf @@ -37,10 +37,10 @@ #lda_original_recipient_header = # Should saving a mail to a nonexistent mailbox automatically create it? -#lda_mailbox_autocreate = no +lda_mailbox_autocreate = yes # Should automatically created mailboxes be also automatically subscribed? -#lda_mailbox_autosubscribe = no +lda_mailbox_autosubscribe = yes protocol lda { # Space separated list of plugins to load (default is global mail_plugins). Sieve rules

Then I had to create a Sieve ruleset. That thing lives in ~/.dovecot.sieve, since I'm running Dovecot. Your provider may accept an arbitrary ruleset like this, or you may need to go through a web interface, or who knows. I'm assuming you're running Dovecot and have a shell from now on.

The first part of the file is simply to enable a bunch of extensions, as needed:

# Sieve Filters # # require "fileinto"; require "envelope"; require "variables"; require "subaddress"; require "regex"; require "vacation"; require "vnd.dovecot.debug";

Some of those are not used yet, for example I haven't tested the vacation module yet, but I have good hopes that I can use it as a way to announce a special "urgent" mailbox while I'm traveling. The rationale is to have a distinct mailbox for urgent messages that is announced in the autoreply, that hopefully won't be parsable by bots.

Spam filtering

Then I filter spam using this fairly standard expression:

######################################################################## # spam # possible improvement, server-side: # if header :contains "X-Spam-Flag" "YES" { fileinto "junk"; stop; } elsif header :contains "X-Spam-Level" "***" { fileinto "greyspam"; stop; }

This puts stuff into the junk or greyspam folder, based on the severity. I am very aggressive with spam: stuff often ends up in the greyspam folder, which I need to check from time to time, but it beats having too much spam in my inbox.

Mailing lists

Mailing lists are generally put into a lists folder, with some mailing lists getting their own folder:

######################################################################## # lists # converted from procmail if header :contains "subject" "FreshPorts" { fileinto "freshports"; } elsif header :contains "List-Id" "" { fileinto "alternc"; } elsif header :contains "List-Id" "" { fileinto "koumbit"; } elsif header :contains ["to", "cc"] ["", ""] { fileinto "debian"; # Debian BTS } elsif exists "X-Debian-PR-Message" { fileinto "debian"; # default lists fallback } elsif exists "List-Id" { fileinto "lists"; }

The idea here is that I can safely subscribe to lists without polluting my mailbox by default. Further processing is done in Notmuch.

Extension matching

I also use the magic +extension tag on emails. If you send email to, say, then the emails end up in the foo folder. This is done with the help of the following recipe:

######################################################################## # wildcard +extension # if envelope :matches :detail "to" "*" { # Save name in ${name} in all lowercase except for the first letter. # Joe, joe, jOe thus all become 'Joe'. set :lower "name" "${1}"; fileinto "${name}"; #debug_log "filed into mailbox ${name} because of extension"; stop; }

This is actually very effective: any time I register to a service, I try as much as possible to add a +extension that describe the service. Of course, spammers and marketers (it's the same really) are free to drop the extension and I suspect a lot of them do, but it helps with honest providers and this actually sorts a lot of stuff out of my inbox into topically-defined folders.

It is also a security issue: someone could flood my filesystem with tons of mail folders, which would cripple the IMAP server and eat all the inodes, 4 times faster than just sending emails. But I guess I'll cross that bridge when I get there: anyone can flood my address and I have other mechanisms to deal with this.

The trick is to then assign tags to all folders so that they appear in the Notmuch-emacs welcome view:

echo tagging folders for folder in $(ls -ad $HOME/Maildir/${PREFIX}*/ | egrep -v "Maildir/${PREFIX}(feeds.*|Sent.*|INBOX/|INBOX/Sent)\$"); do tag=$(echo $folder | sed 's#/$##;s#^.*/##') notmuch tag +$tag -inbox tag:inbox and not tag:$tag and folder:${PREFIX}$tag done

This is part of my notmuch-tag script that includes a lot more fine-tuned filtering, detailed below.

Automated reports filtering

Another thing I get a lot of is machine-generated "spam". Well, it's not commercial spam, but it's a bunch of Nagios, cron jobs, and god knows what software thinks it's important to send me emails every day. I get a lot less of those these days since I'm off work at Koumbit, but still, those can be useful for others as well:

if anyof (exists "X-Cron-Env", header :contains ["subject"] ["security run output", "monthly run output", "daily run output", "weekly run output", "Debian Package Updates", "Debian package update", "daily mail stats", "Anacron job", "nagios", "changes report", "run output", "[Systraq]", "Undelivered mail", "Postfix SMTP server: errors from", "backupninja", "DenyHosts report", "Debian security status", "apt-listchanges" ], header :contains "Auto-Submitted" "auto-generated", envelope :contains "from" ["nagios@", "logcheck@"]) { fileinto "rapports"; } # imported from procmail elsif header :comparator "i;octet" :contains "Subject" "Cron" { if header :regex :comparator "i;octet" "From" ".*root@" { fileinto "rapports"; } } elsif header :comparator "i;octet" :contains "To" "root@" { if header :regex :comparator "i;octet" "Subject" "\\*\\*\\* SECURITY" { fileinto "rapports"; } } elsif header :contains "Precedence" "bulk" { fileinto "bulk"; } Refiltering emails

Of course, after all this I still had thousands of emails in my inbox, because the sieve filters apply only on new emails. The beauty of Sieve support in Dovecot is that there is a neat sieve-filter command that can reprocess an existing mailbox. That was a lifesaver. To run a specific sieve filter on a mailbox, I simply run:

sieve-filter .dovecot.sieve INBOX 2>&1 | less

Well, this doesn't do anything. To really execute the filters, you need the -e flags, and to write to the INBOX for real, you need the -w flag as well, so the real run looks something more like this:

sieve-filter -e -W -v .dovecot.sieve INBOX > refilter.log 2>&1

The funky output redirects are necessary because this outputs a lot of crap. Also note that, unfortunately, the fake run output differs from the real run and is actually more verbose, which makes it really less useful than it could be.


I also usually archive my mails every year, rotating my mailbox into an Archive.YYYY directory. For example, now all mails from 2015 are archived in a Archive.2015 directory. I used to do this with Mutt tagging and it was a little slow and error-prone. Now, i simply have this Sieve script:

require ["variables","date","fileinto","mailbox", "relational"]; # Extract date info if currentdate :matches "year" "*" { set "year" "${1}"; } if date :value "lt" :originalzone "date" "year" "${year}" { if date :matches "received" "year" "*" { # Archive Dovecot mailing list items by year and month. # Create folder when it does not exist. fileinto :create "Archive.${1}"; } }

I went from 15613 to 1040 emails in my real inbox with this process (including refiltering with the default filters as well).

Notmuch configuration

My Notmuch configuration is a in three parts: I have small settings in ~/.notmuch-config. The gist of it is:

[new] tags=unread;inbox; ignore= #[maildir] # synchronize_flags=true # tentative patch that was refused upstream # #reckless_trash=true [search] exclude_tags=deleted;spam;

I omitted the fairly trivial [user] section for privacy reasons and [database] for declutter.

Then I have a notmuch-tag script symlinked into ~/Maildir/.notmuch/hooks/post-new. It does way too much stuff to describe in details here, but here are a few snippets:

if hostname | grep angela > /dev/null; then PREFIX=Anarcat/ else PREFIX=. fi

This sets a variable that makes the script work on my laptop (angela), where mailboxes are in Maildir/Anarcat/foo or the server, where mailboxes are in Maildir/.foo.

I also have special rules to tag my RSS feeds, which are generated by feed2imap, which is documented shortly below:

echo tagging feeds ( cd $HOME/Maildir/ && for feed in ${PREFIX}feeds.*; do name=$(echo $feed | sed "s#${PREFIX}feeds\\.##") notmuch tag +feeds +$name -inbox folder:$feed and not tag:feeds done )

Another example that would be useful is how to tag mailing lists, for example, this removes the inbox tag and adds the notmuch tags to emails from the notmuch mailing list.

notmuch tag +lists +notmuch -inbox tag:inbox and ""

Finally, I have a bunch of special keybindings in ~/.emacs.d/notmuch-config.el:

;; autocompletion (eval-after-load "notmuch-address" '(progn (notmuch-address-message-insinuate))) ; use fortune for signature, config is in custom (add-hook 'message-setup-hook 'fortune-to-signature) ; don't remember what that is (add-hook 'notmuch-show-hook 'visual-line-mode) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;;; keymappings ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; (define-key notmuch-show-mode-map "S" (lambda () "mark message as spam and advance" (interactive) (notmuch-show-tag '("+spam" "-unread")) (notmuch-show-next-open-message-or-pop))) (define-key notmuch-search-mode-map "S" (lambda (&optional beg end) "mark message as spam and advance" (interactive (notmuch-search-interactive-region)) (notmuch-search-tag (list "+spam" "-unread") beg end) (anarcat/notmuch-search-next-message))) (define-key notmuch-show-mode-map "H" (lambda () "mark message as spam and advance" (interactive) (notmuch-show-tag '("-spam")) (notmuch-show-next-open-message-or-pop))) (define-key notmuch-search-mode-map "H" (lambda (&optional beg end) "mark message as spam and advance" (interactive (notmuch-search-interactive-region)) (notmuch-search-tag (list "-spam") beg end) (anarcat/notmuch-search-next-message))) (define-key notmuch-search-mode-map "l" (lambda (&optional beg end) "undelete and advance" (interactive (notmuch-search-interactive-region)) (notmuch-search-tag (list "-unread") beg end) (anarcat/notmuch-search-next-message))) (define-key notmuch-search-mode-map "u" (lambda (&optional beg end) "undelete and advance" (interactive (notmuch-search-interactive-region)) (notmuch-search-tag (list "-deleted") beg end) (anarcat/notmuch-search-next-message))) (define-key notmuch-search-mode-map "d" (lambda (&optional beg end) "delete and advance" (interactive (notmuch-search-interactive-region)) (notmuch-search-tag (list "+deleted" "-unread") beg end) (anarcat/notmuch-search-next-message))) (define-key notmuch-show-mode-map "d" (lambda () "delete current message and advance" (interactive) (notmuch-show-tag '("+deleted" "-unread")) (notmuch-show-next-open-message-or-pop))) ;; (define-key notmuch-show-mode-map "b" (lambda (&optional address) "Bounce the current message." (interactive "sBounce To: ") (notmuch-show-view-raw-message) (message-resend address) (kill-buffer))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;;; my custom notmuch functions ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; (defun anarcat/notmuch-search-next-thread () "Skip to next message from region or point This is necessary because notmuch-search-next-thread just starts from point, whereas it seems to me more logical to start from the end of the region." ;; move line before the end of region if there is one (unless (= beg end) (goto-char (- end 1))) (notmuch-search-next-thread)) ;; Linking to notmuch messages from org-mode ;; (require 'org-notmuch nil t) (message "anarcat's custom notmuch config loaded")

This is way too long: in my opinion, a bunch of that stuff should be factored in upstream, but some features have been hard to get in. For example, Notmuch is really hesitant in marking emails as deleted. The community is also very strict about having unit tests for everything, which makes writing new patches a significant challenge for a newcomer, which will often need to be familiar with both Elisp and C. So for now I just have those configs that I carry around.

Emails marked as deleted or spam are processed with the following script named notmuch-purge which I symlink to ~/Maildir/.notmuch/hooks/pre-new:

#!/bin/sh if hostname | grep angela > /dev/null; then PREFIX=Anarcat/ else PREFIX=. fi echo moving tagged spam to the junk folder notmuch search --output=files tag:spam \ and not folder:${PREFIX}junk \ and not folder:${PREFIX}greyspam \ and not folder:Koumbit/INBOX \ and not path:Koumbit/** \ | while read file; do mv "$file" "$HOME/Maildir/${PREFIX}junk/cur" done echo unconditionnally deleting deleted mails notmuch search --output=files tag:deleted | xargs -r rm

Oh, and there's also customization for Notmuch:

;; -*- mode: emacs-lisp; auto-recompile: t; -*- (custom-set-variables ;; from '(fortune-file "/home/anarcat/.mutt/sigs.fortune") '(message-send-hook (quote (notmuch-message-mark-replied))) '(notmuch-address-command "notmuch-address") '(notmuch-always-prompt-for-sender t) '(notmuch-crypto-process-mime t) '(notmuch-fcc-dirs (quote ((".*" . "Koumbit/INBOX.Sent") (".*" . "Anarcat/Sent")))) '(notmuch-hello-tag-list-make-query "tag:unread") '(notmuch-message-headers (quote ("Subject" "To" "Cc" "Bcc" "Date" "Reply-To"))) '(notmuch-saved-searches (quote ((:name "inbox" :query "tag:inbox and not tag:koumbit and not tag:rt") (:name "unread inbox" :query "tag:inbox and tag:unread") (:name "unread" :query "tag:unred") (:name "freshports" :query "tag:freshports and tag:unread") (:name "rapports" :query "tag:rapports and tag:unread") (:name "sent" :query "tag:sent") (:name "drafts" :query "tag:draft")))) '(notmuch-search-line-faces (quote (("deleted" :foreground "red") ("unread" :weight bold) ("flagged" :foreground "blue"))))/ '(notmuch-search-oldest-first nil) '(notmuch-show-all-multipart/alternative-parts nil) '(notmuch-show-all-tags-list t) '(notmuch-show-insert-text/plain-hook (quote (notmuch-wash-convert-inline-patch-to-part notmuch-wash-tidy-citations notmuch-wash-elide-blank-lines notmuch-wash-excerpt-citations))) )

I think that covers it.


So of course the above works well on the server directly, but how do run Notmuch on a remote machine that doesn't have access to the mail spool directly? This is where OfflineIMAP comes in. It allows me to incrementally synchronize a local Maildir folder hierarchy with a a remote IMAP server. I am assuming you already have an IMAP server configured, since you already configured Sieve above.

Note that other synchronization tools exist. The other popular one is isync but I had trouble migrating to it (see courriels for details) so for now I am sticking with OfflineIMAP.

The configuration is fairly simple:

[general] accounts = Anarcat ui = Blinkenlights maxsyncaccounts = 3 [Account Anarcat] localrepository = LocalAnarcat remoterepository = RemoteAnarcat # refresh all mailboxes every 10 minutes autorefresh = 10 # run notmuch after refresh postsynchook = notmuch new # sync only mailboxes that changed quick = -1 ## possible optimisation: ignore mails older than a year #maxage = 365 # local mailbox location [Repository LocalAnarcat] type = Maildir localfolders = ~/Maildir/Anarcat/ # remote IMAP server [Repository RemoteAnarcat] type = IMAP remoteuser = anarcat remotehost = ssl = yes # without this, the cert is not verified (!) sslcacertfile = /etc/ssl/certs/DST_Root_CA_X3.pem # do not sync archives folderfilter = lambda foldername: not'(Sent\.20[01][0-9]\..*)', foldername) and not'(Archive.*)', foldername) # and only subscribed folders subscribedonly = yes # don't reconnect all the time holdconnectionopen = yes # get mails from INBOX immediately, doesn't trigger postsynchook idlefolders = ['INBOX']

Critical parts are:

  • postsynchook: obviously, we want to run notmuch after fetching mail
  • idlefolders: receives emails immediately without waiting for the longer autorefresh delay, which means that most mailboxes don't see new emails until 10 minutes in the worst case. unfortunately, doesn't run the postsynchook so I need to hit G in Emacs to see new mail
  • quick=-1, subscribedonly, holdconnectionopen: makes most runs much, much faster as it skips unchanged or unsubscribed folders and keeps the connection to the server

The other settings should be self-explanatory.

RSS feeds

I gave up on RSS readers, or more precisely, I merged RSS feeds and email. The first time I heard of this, it sounded like a horrible idea, because it means yet more emails! But with proper filtering, it's actually a really nice way to process emails, since it leverages the distributed nature of email.

For this I use a fairly standard feed2imap, although I do not deliver to an IMAP server, but straight to a local Maildir. The configuration looks like this:

--- include-images: true target-refix: &target "maildir:///home/anarcat/Maildir/.feeds." feeds: - name: Planet Debian url: target: [ *target, 'debian-planet' ]

I have obviously more feeds, the above is just and example. This will deliver the feeds as emails in one mailbox per feed, in ~/Maildir/.feeds.debian-planet, in the above example.


You will fail at writing the sieve filters correctly, and mail will (hopefully?) fall through to your regular mailbox. Syslog will tell you things fail, as expected, and details are in your .dovecot.sieve.log file in your home directory.

I also enabled debugging on the Sieve module

--- a/dovecot/conf.d/90-sieve.conf +++ b/dovecot/conf.d/90-sieve.conf @@ -51,6 +51,7 @@ plugin { # deprecated imapflags extension in addition to all extensions were already # enabled by default. #sieve_extensions = +notify +imapflags + sieve_extensions = +vnd.dovecot.debug # Which Sieve language extensions are ONLY available in global scripts. This # can be used to restrict the use of certain Sieve extensions to administrator

This allowed me to use debug_log function in the rulesets to output stuff directly to the logfile.

Further improvements

Of course, this is all done on the commandline, but that is somewhat expected if you are already running Notmuch. Of course, it would be much easier to edit those filters through a GUI. Roundcube has a nice Sieve plugin, and Thunderbird also has such a plugin as well. Since Sieve is a standard, there's a bunch of clients available. All those need you to setup some sort of thing on the server, which I didn't bother doing yet.

And of course, a key improvement would be to have Notmuch synchronize its state better with the mailboxes directly, instead of having the notmuch-purge hack above. Dovecot and Maildir formats support up to 26 flags, and there were discussions about using those flags to synchronize with notmuch tags so that multiple notmuch clients can see the same tags on different machines transparently.

This, however, won't make Notmuch work on my phone or webmail or any other more generic client: for that, Sieve rules are still very useful.

I still don't have webmail setup at all: so to read email, I need an actual client, which is currently my phone, which means I need to have Wifi access to read email. "Internet Cafés" or "this guy's computer" won't work as well, although I can always use ssh to login straight to the server and read mails with Mutt.

I am also considering using X509 client certificates to authenticate to the mail server without a passphrase. This involves configuring Postfix, which seems simple enough. Dovecot's configuration seems a little more involved and less well documented. It seems that both [OfflimeIMAP][] and K-9 mail support client-side certs. OfflineIMAP prompts me for the password so it doesn't get leaked anywhere. I am a little concerned about building yet another CA, but I guess it would not be so hard...

The server side of things needs more documenting, particularly the spam filters. This is currently spread around this wiki, mostly in configuration.

Security considerations

The whole purpose of this was to make it easier to read my mail on other devices. This introduces a new vulnerability: someone may steal that device or compromise it to read my mail, impersonate me on different services and even get a shell on the remote server.

Thanks to the two-factor authentication I setup on the server, I feel a little more confident that just getting the passphrase to the mail account isn't sufficient anymore in leveraging shell access. It also allows me to login with ssh on the server without trusting the machine too much, although that only goes so far... Of course, sudo is then out of the question and I must assume that everything I see is also seen by the attacker, which can also inject keystrokes and do all sorts of nasty things.

Since I also connected my email account on my phone, someone could steal the phone and start impersonating me. The mitigation here is that there is a PIN for the screen lock, and the phone is encrypted. Encryption isn't so great when the passphrase is a PIN, but I'm working on having a better key that is required on reboot, and the phone shuts down after 5 failed attempts. This is documented in my phone setup.

Client-side X509 certificates further mitigates those kind of compromises, as the X509 certificate won't give shell access.

Basically, if the phone is lost, all hell breaks loose: I need to change the email password (or revoke the certificate), as I assume the account is about to be compromised. I do not trust Android security to give me protection indefinitely. In fact, one could argue that the phone is already compromised and putting the password there already enabled a possible state-sponsored attacker to hijack my email address. This is why I have an OpenPGP key on my laptop to authenticate myself for critical operations like code signatures.

The risk of identity theft from the state is, after all, a tautology: the state is the primary owner of identities, some could say by definition. So if a state-sponsored attacker would like to masquerade as me, they could simply issue a passport under my name and join a OpenPGP key signing party, and we'd have other problems to deal with, namely, proper infiltration counter-measures and counter-snitching.

Catégories: External Blogs

Epic Lameness

Eric Dorland - lun, 09/01/2008 - 17:26 now supports OpenID. Hooray! I'd like to make a comment on a thread about the RTL8187se chip I've got in my new MSI Wind. So I go to sign in with OpenID and instead of signing me in it prompts me to create an account with a name, username and password for the account. Huh? I just want to post to their forum, I don't want to create an account (at least not explicitly, if they want to do it behind the scenes fine). Isn't the point of OpenID to not have to create accounts and particularly not have to create new usernames and passwords to access websites? I'm not impressed.
Catégories: External Blogs

Sentiment Sharing

Eric Dorland - lun, 08/11/2008 - 23:28
Biella, I am from there and I do agree. If I was still living there I would try to form a team and make a bid. Simon even made noises about organizing a bid at DebConfs past. I wish he would :)

But a DebConf in New York would be almost as good.
Catégories: External Blogs
Syndiquer le contenu