External Blogs

January 2018 report: LTS

Anarcat - Fri, 02/02/2018 - 17:25

I have already published a yearly report which covers all of 2017 but also some of my January 2018 work, so I'll try to keep this short.

Debian Long Term Support (LTS)

This is my monthly Debian LTS report. I was happy to switch to the new Git repository for the security tracker this month. It feels like some operations (namely pull / push) are a little slower, but others, like commits or log inspection, are much faster. So I think it is a net win.

jQuery

I did some work on trying to clean up a situation with the jQuery package, which I explained in more detail in a long post. It turns out there are multiple databases out there that track security issues in web development environments (like JavaScript, Ruby or Python) but do not follow the normal CVE tracking system. This means that Debian had a few vulnerabilities in its jQuery packages that were not tracked by the security team, in particular three that were only on Snyk.io (CVE-2012-6708, CVE-2015-9251 and CVE-2016-10707). The resulting discussion was interesting and is worth reading in full.

A more worrying aspect is that this problem is not limited to flashy new web frameworks. Ben Hutchings estimated that almost half of the Linux kernel vulnerabilities are not tracked by a CVE. The consensus seems to be that we want to try to follow the CVE process, and MITRE has been helping to distribute the load by letting other entities, called CVE Numbering Authorities (CNAs), issue their own CVEs. After contacting Snyk, it turns out they have started the process of becoming a CNA and are trying to make this part of their workflow, so that's a good sign.

LTS meta-work

I've done some documentation and architecture work on the LTS project itself, mostly around updating the wiki with current practices.

OpenSSH DLA

I've done a quick security update of OpenSSH for LTS, which resulted in DLA-1257-1. Unfortunately, after a discussion with the security researcher who published that CVE, it turned out that this was only a "self-DoS", i.e. the NEWKEYS attack would only make the SSH client terminate its own connection, and therefore not impact the rest of the server. One has to wonder, in that case, why this was issued a CVE at all: presumably the vulnerability could be leveraged somehow, but I did not look deeply enough into it to figure that out.

Hopefully the patch won't introduce a regression: I tested it summarily and it didn't seem to cause issues at first glance.

Hardlink attacks

An interesting attack (CVE-2017-18078) was discovered against systemd, where the "tmpfiles" feature could be abused to bypass filesystem access restrictions through hardlinks. The trick is that the attack is possible only if kernel hardening (specifically fs.protected_hardlinks) is turned off. That feature has been available in the Linux kernel since the 3.6 release, but it was actually turned off by default in 3.7. In the commit message, Linus Torvalds explained the change was breaking some userland applications, which is a huge taboo in Linux, and recommended that distributions configure this at boot instead. Debian took the reverse approach: Hutchings issued a patch that reverts to the more secure default, but this means users of custom kernels are still vulnerable to this issue.

But, more importantly, this affects more than systemd. The vulnerability also happens when using plain old chown with hardening turned off, when running a simple command like this:

chown -R non-root /path/owned/by/non-root

I didn't realize this, but hardlinks share permissions: since all hardlinks to a file point at the same inode, changing the permissions through one link changes them for every other link as well. This is especially nasty if users can hardlink to critical files like /etc/passwd or suid binaries, which is why the hardening was introduced in the first place.
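
To see that inode sharing in action, here is a minimal sketch (my own illustration, not from the original advisory) that creates a hardlink and shows a mode change propagating back to the original file:

    import os, stat, tempfile

    # A hardlink is just another name for the same inode, so metadata
    # like the mode and owner is shared between all names pointing to it.
    d = tempfile.mkdtemp()
    target = os.path.join(d, "target")
    link = os.path.join(d, "link")

    with open(target, "w") as f:
        f.write("secret\n")
    os.chmod(target, 0o600)

    os.link(target, link)        # create the hardlink
    os.chmod(link, 0o666)        # loosen permissions through the link...

    # ...and the original file is now world-writable too
    print(oct(stat.S_IMODE(os.stat(target).st_mode)))        # 0o666
    print(os.stat(target).st_ino == os.stat(link).st_ino)    # True: same inode

With fs.protected_hardlinks enabled, unprivileged users can no longer hardlink to files they do not own (or cannot read and write), which is what blocks this class of attack.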

In Debian, this is especially an issue in maintainer scripts, which often call chown -R on arbitrary, non-root directories. Daniel Kahn Gillmor had reported this to the Debian security team all the way back in 2011, but it didn't get traction back then. He now opened Debian bug #889066 to at least enable a warning in lintian and an issue was also opened on colord Debian bug #889060, as an example, but many more packages are vulnerable. Again, this is only if hardening is somewhat turned off.

Normally, systemd is supposed to turn that hardening on, which should protect custom kernels, but this was turned off in Debian. Anyway, Debian still supports non-systemd init systems (although most of those users have probably migrated to Devuan by now), so the fix wouldn't be complete. I have therefore filed Debian bug #889098 against procps (which owns /etc/sysctl.conf and related files) to try and fix the issue more broadly there.

And to be fair, this was very explicitly mentioned in the jessie release notes, so people running without the protection kind of get what they deserve here...

p7zip

Lastly, I did a fairly trivial update of the p7zip package, which resulted in DLA-1268-1. The patch was sent upstream and went through a few iterations, including coordination with the security researcher.

Unfortunately, the latter wasn't willing to share the proof of concept (PoC) so that we could test the patch. We are therefore trusting the researcher that the fix works, which is too bad because they do not trust us with the PoC...

Other free software work

I probably did more stuff in January that wasn't documented in the previous report. But I don't know if it's worth my time going through all this. Expect a report in February instead!

Have a happy new year and all that stuff.

Montréal-Python 69 - Yogic Zumba

Montreal Python - Wed, 01/31/2018 - 00:00

It may be the middle of winter, but that's no reason to stay at home. Take out your warmest toque and your snowshoes and join us at Notman House for the first event of the season.

Schedule
  • 6:00PM - Doors open
  • 6:30PM - Presentations start
  • 7:30PM - Break
  • 9:00PM - End of the event
  • 9:15PM - Benelux
Presentations

How to use PGP with Git and why you should care - Konstantin Ryabitsev

Git supports PGP signatures for the purposes of tracking code provenance and to ensure the integrity of repository clones across multiple mirrors. In this talk, we will discuss how your project can benefit from PGP-signing your tags and your commits, and will look at available resources that will help you get started with minimal pain.

Gestion de configuration (configuration management) - Michel Rondeau

Configuration management with the Python standard library works well as long as the "ini" format is enough. We wanted to reproduce it while supporting any file format. The solution we chose works through events and plug-ins, so that features can be added or removed dynamically. It can now handle Excel, YAML, JSON and other formats. We will see how to build an elegant event queue in Python, as used in this project.

morpheOm - François Robert

Since Xerox's Palo Alto lab invented the desktop, user interfaces have changed very little. Thankfully, phones and tablets have helped, but are we ripe for a bigger change?

How to Aggregate User’s Interest Data using PySpark – A Short Tutorial - Abbas Taher

Spark is one of the most popular Big Data frameworks and PySpark is the API to use Spark from Python. PySpark is a great choice when you need to scale up your jobs to work with large files. In this short tutorial we shall present 3 methods to aggregate data. First we shall use Python dictionaries, then we shall present the two methods “GroupBy” and “ReduceBy” to do the same aggregation work. The three code snippets will be presented and explained using a Jupyter Notebook.
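
To give a rough idea of what that looks like, here is a small sketch (my own, not the presenter's code) that aggregates hypothetical (user, interest) pairs three ways: with a plain dictionary, with groupByKey() and with reduceByKey():

    from collections import defaultdict
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "aggregate-interests")

    # hypothetical (user, interest) pairs
    pairs = [("alice", "python"), ("bob", "spark"), ("alice", "golf")]

    # 1. plain Python dictionary: fine for small data on a single machine
    by_user = defaultdict(list)
    for user, interest in pairs:
        by_user[user].append(interest)

    rdd = sc.parallelize(pairs)

    # 2. groupByKey: ships every value across the cluster before grouping
    grouped = rdd.groupByKey().mapValues(list).collect()

    # 3. reduceByKey: combines values on each node first, usually cheaper
    reduced = rdd.mapValues(lambda i: [i]).reduceByKey(lambda a, b: a + b).collect()

    print(dict(by_user), grouped, reduced, sep="\n")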

When

February 5th at 6:00PM

Where

Notman House, 51 Sherbrooke West, Montréal, QC H2X 1X2

4TB+ large disk price review

Anarcat - Sat, 01/27/2018 - 20:39

For my personal backups, I am now looking at 4TB+ single-disk long-term storage. I currently have 3.5TB of offline storage, split across two disks: this is rather inconvenient, as I need to plug both into a toaster-like SATA enclosure which gathers dust and performs like crap. Now I'm looking at hosting offline backups at a friend's place, so I need to store everything on a single drive to save space.

This means I need at least 4TB of storage, and those needs are going to continuously expand in the future. Since this is going to be offsite, swapping the drive isn't really convenient (especially because syncing all that data takes a long time), so I figured I would also look at more than 4 TB.

So I built those neat little tables. I took the prices from Newegg.ca, or Newegg.com as a fallback when the item wasn't available in Canada. I used to order from NCIX because it was "more" local, but they unfortunately went bankrupt, and in the worst possible way: the website is still up and you can order stuff, but those orders never ship. Sad to see a 20-year-old institution go out like that; I blame Jeff Bezos.

I also used failure rate figures from the latest Backblaze review, although those should always be taken with a grain of salt. For example, the apparently stellar 0.00% failure rates are all on sample sizes too small to be statistically significant (<100 drives).

All prices are in CAD, sometimes after conversion from USD for items that are not on newegg.ca, as of today.

8TB

    Brand    Model            Price  $/TB  fail%  Notes
    HGST     0S04012          280$   35$   N/A
    Seagate  ST8000NM0055     320$   40$   1.04%
    WD       WD80EZFX         364$   46$   N/A
    Seagate  ST8000DM002      380$   48$   0.72%
    HGST     HUH728080ALE600  791$   99$   0.00%

6TB

    Brand    Model            Price  $/TB  fail%  Notes
    HGST     0S04007          220$   37$   N/A
    Seagate  ST6000DX000      ~222$  56$   0.42%  not on .ca, refurbished
    Seagate  ST6000AS0002     230$   38$   N/A
    WD       WD60EFRX         280$   47$   1.80%
    Seagate  STBD6000100      343$   58$   N/A

4TB

    Brand    Model            Price  $/TB  fail%  Notes
    Seagate  ST4000DM004      125$   31$   N/A
    Seagate  ST4000DM000      150$   38$   3.28%
    WD       WD40EFRX         155$   39$   0.00%
    HGST     HMS5C4040BLE640  ~242$  61$   0.36%  not on .ca
    Toshiba  MB04ABA400V      ~300$  74$   0.00%  not on .ca

Conclusion

The cheapest per-TB costs seem to be in the 4TB range, but the 8TB HGST comes really close. Reliability for this drive could be an issue, however; I can't explain why it is so cheap compared to other devices... But I guess we'll see where it goes, as I'll just order the darn thing and try it out.

A summary of my 2017 work

Anarcat - Sat, 01/27/2018 - 11:54

New years are strange things: for mostly arbitrary reasons, around January 1st we reset a bunch of stuff, change calendars, and forget about work for a while. This is also when I forget to do my monthly report and then procrastinate, until I figure out I might as well do a yearly report while I'm at it, and then do nothing at all for a while.

So this is my humble attempt at fixing this, about a month late. I'll try to cover December as well, but since not much has happened then, I figured I could also review the last year and think back on the trends there. Oh, and you'll get chocolate cookies of course. Hang on to your eyeballs, this won't hurt a bit.

Debian Long Term Support (LTS)

Those of you used to reading those reports might be tempted to skip this part, but wait! I actually don't have much to report here and instead you will find an incredibly insightful and relevant rant.

So I didn't actually do any LTS work in December. I actually reduced my available hours to focus on writing (more on that later). Overall, I ended up working about 11 hours per month on LTS in 2017. That is less than the 16-20 hours I was available during that time. Part of that is me regularly procrastinating, but another part is that finding work to do is sometimes difficult. The "easy" tasks often get picked and dispatched quickly, so the stuff that remains, when you're not constantly looking, is often very difficult packages.

I especially remember the pain of working on libreoffice, the KRACK update, more tiff, GraphicsMagick and ImageMagick vulnerabilities than I care to remember, and, ugh, Ruby... Masochists (also known as "security researchers") can find the details of those excruciating experiments in debian-lts for the monthly reports.

I don't want to sound like an old idiot, but I must admit, after working on LTS for two years, that working on patching old software for security bugs is hard work, and not particularly pleasant on top of it. You're basically always dealing with other people's garbage: badly written code that hasn't been touched in years, sometimes decades, that no one wants to take care of.

Yet someone needs to take care of it. A large part of the technical community considers Linux distributions in general, and LTS releases in particular, as "too old to care for". As if our elders, once they pass a certain age, should just be rolled out to the nearest dumpster or left rotting on the curb. I suspect most people don't realize that Debian "stable" (stretch) was released less than a year ago, and "oldstable" (jessie) is a little over two years old. LTS (wheezy), our oldest supported release, is only four years old now, and will become unsupported this summer, on its fifth anniversary. Five years may seem like a long time in computing, but really, there's a whole universe out there and five years is absolutely nothing in the range of changes I'm interested in: politics, society and the environment reach far beyond that shortsightedness.

To put things in perspective, some people I know still run their office on an Apple II, which celebrated its 40th anniversary this year. That is "old". And the fact that the damn thing still works should command respect and admiration, more than contempt. In comparison, the phone I have, an LG G3, is running an unpatched, vulnerable version of Android because it cannot be updated, because it's locked out of the telcos' networks, because it was found in a taxi and reported "lost or stolen" (same thing, right?). And DRM protections in the bootloader keep me from doing the right thing and unbricking this device.

We should build devices that last decades. Instead we fill junkyards with tons and tons of precious computing devices that have more precious metals than most people carry as jewelry. We are wasting generations of programmers, hardware engineers, human robots and precious, rare metals on speculative, useless devices that are destroying our society. Working on supporting LTS is a small part in trying to fix the problem, but right now I can't help but think we have a problem upstream, in the way we build those tools in the first place. It's just depressing to be at the receiving end of the billions of lines of code that get created every year. Hopefully, the death of Moore's law could change that, but I'm afraid it's going to take another generation before programmers figure out how far away from their roots they have strayed. Maybe too long to keep ourselves from a civilization collapse.

LWN publications

With that gloomy conclusion, let's switch gears and talk about something happier. So as I mentioned, in December, I reduced my LTS hours and focused instead on finishing my coverage of KubeCon Austin for LWN.net. Three articles have already been published on the blog here:

... and two more articles, about Prometheus, are currently published as exclusives by LWN:

I was surprised to see that the container runtimes article got such traction. It wasn't the most important debate in the whole conference, but there were some amazingly juicy bits, some of which we didn't even cover because they were... uh... rather controversial and we want the community to stay sane. Or saner, if that word can be applied at all to the container community at this point.

I ended up publishing 16 articles at LWN this year. I'm really happy about that: I just love writing and even if it's in English (my native language is French), it's still better than rambling on my own like I do here. My editors allow me to publish well-polished articles, and I am hugely grateful for the privilege. Each article takes about 13 hours to write, on average. I'm less happy about that: I wish delivery was more streamlined, and I'll spare you the miserable story of the last-minute major changes I sent in some recent articles, for which I again apologize profusely to my editors.

I'm often at a loss when I need to explain to friends and family what I write about. I often give the example of the password series: I wrote a whole article about just how to pick a passphrase, then a review of two geeky password managers, and then a review of something that's not quite a password manager and that you shouldn't be using. And on top of that, I even wrote a history of those, but by that time my editors were sick and tired of passwords and understandably made me go away. At this point, neophytes are just scratching their heads and I remind them of the TL;DR:

  1. choose a proper password with a bunch of words picked at random (really random, check out Diceware!); see the sketch after this list

  2. use a password manager so you have to remember only one good password

  3. watch out where you type those damn things
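
For the first point, here is a minimal sketch (my own illustration, not from the articles) of a Diceware-style passphrase generator using Python's secrets module; the word list path is just an example:

    import secrets

    # Any large word list will do; the EFF publishes lists designed for this.
    WORDLIST = "/usr/share/dict/words"   # example path

    with open(WORDLIST) as f:
        words = [w.strip() for w in f if w.strip().isalpha()]

    # secrets uses the operating system's CSPRNG: that's the "really random" part
    passphrase = " ".join(secrets.choice(words) for _ in range(6))
    print(passphrase)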

I covered two other conferences this year as well: one was the NetDev conference, for which I wrote 4 articles (1, 2, 3, 4). It turned out I couldn't cover NetDev in Korea even though I wanted to, but hopefully that is just "partie remise", as we say in French... I also covered DebConf in Montreal, but that ended up being much harder than I thought: I got involved in networking and volunteered all over the place. By the time the conference started, I was too exhausted to actually write anything, even though I took notes like crazy and ran around trying to attend everything. I found it's harder to write about topics that are close to home: nothing is new, so you don't get as excited. I still enjoyed writing about the supposed decline of copyleft, which was based on a talk by FSF executive director John Sullivan, and I ended up writing about offline PGP key storage strategies and cryptographic keycards, after buying a token from friendly gniibe at DebConf.

I also wrote about Alioth moving to Pagure, unknowingly joining a long tradition of failed predictions at LWN: a few months later, the tide turned and Debian launched the Alioth replacement as a beta running... GitLab. Go figure; maybe this is a version of the quantum observer effect applied to journalism?

Two articles seemed to have been less successful. The GitHub TOS update was less controversial than I expected it would be and didn't seem to have a significant impact, although GitHub did rephrase some bits of their TOS eventually. The ROCA review didn't seem to bring excited crowds either, maybe because no one actually understood anything I was saying (including myself).

Still, 2017 has been a great ride in LWN-land: I'm hoping to publish even more during the next year and encourage people to subscribe to the magazine, as it helps us publish new articles, if you like what you're reading here of course.

Free software work

Last but not least is my free software work. This was just nuts.

New programs

I have written a bunch of completely new programs:

  • Stressant - a small wrapper script to stress-test new machines. No idea if anyone's actually using the darn thing, but I have found it useful from time to time.

  • Wallabako - a bridge between Wallabag and my e-reader. This is probably one of my most popular programs ever. I get random strangers asking me about it in random places, which is quite nice. Also my first Golang program, something I am quite excited about and wish I was doing more of.

  • Ecdysis - a pile of (Python) code snippets, documentation and standard community practices I reuse across projects. Ended up being really useful when bootstrapping new projects, but probably just for me.

  • numpy-stats - a dumb command-line tool to extract stats from streams. I didn't really reuse it, so maybe it's not so useful. Interestingly, I found similar tools called multitime and hyperfine that will be useful for future benchmarks.

  • feed2exec - a new feed reader (just that) which I have been using ever since for many different purposes. I have now replaced feed2imap and feed2tweet with that simple tool, and have added support for storing my articles on https://archive.org/, checking for dead links with linkchecker (below), and pushing to the growing Mastodon federation.

  • undertime - a simple tool to show possible meeting times across different timezones. A must if you are working with people internationally!

If I count this right (and I'm omitting a bunch of smaller, less general purpose programs), that is six new software projects, just this year. This seems crazy, but that's what the numbers say. I guess I like programming too, which is arguably a form of writing. Talk about contributing to the pile of lines of code...

New maintainerships

I also got more or less deeply involved in various communities:

And those are just the major ones... I have about 100 repositories active on GitHub, most of which are forks of existing repositories, so actual contributions to existing free software projects. Hard numbers for this are annoyingly hard to come by as well, especially in terms of issues vs commits and so on. GitHub says I have made about 600 contributions in the last year, which is an interesting figure as well.

Debian contributions

I also did a bunch of things in the Debian project, apart from my LTS involvement:

  • Gave up on debmans, a tool I had written to rebuild https://manpages.debian.org, in the face of the overwhelming superiority of the Golang alternative. This is one of the things which led me to finally try the language and write Wallabako. So: net win.

  • Proposed standard procedures for third-party repositories, which don't seem to have caught on significantly in the real world yet. Hopefully just a matter of time...

  • Co-hosted a bug squashing party for the Debian stretch release, also as a way to have the DebConf team meet up.

  • That led to a two-hour workshop at the Montreal DebConf which was packed and really appreciated. I'm thinking of organizing this at every DebConf I attend, in a (so far) secret plot to standardize packaging practices by evangelizing new package maintainers to my peculiar ways. I hope to teach again in Taiwan this year, but I'm not sure I'll make it that far across the globe...

  • And of course, I did a lot of regular package maintenance. I don't have good numbers on the exact activity stats here (any way to pull that out easily?) but I now directly maintain 34 Debian packages, a somewhat manageable number.

What's next?

This year, I'll need to figure out what to do with legacy projects. Gameclock and Monkeysign both need to be ported away from GTK2, which is deprecated. I will probably abandon the GUI in Monkeysign, but gameclock will probably need a rewrite of its GUI. This raises the question of how we can maintain software in the long term if even the graphical interface (even Xorg is going away!) is swept away under our feet all the time. Without this change, both programs could have kept going for another decade without trouble. But now, I need to spend time just to keep those tools from failing to build at all.

Wallabako seems to be doing well on its own, but I'd like to fix the refresh issues that sometimes make the reader unstable: maybe I can write directly to the SQLite database? I tried statically linking SQLite to do some tests on that, but that's apparently impossible, so the attempt failed.

Feed2exec just works for me. I'm not very proud of the design, but it does its job well. I'll fix bugs and maybe push out a 1.0 release when a long enough delay goes by without any critical issues coming up. So try it out and report back!

As for the other projects, I'm not sure how it's going to go. It's possible that my involvement in paid work means I cannot commit as much to general free software work, but I can't help but just doing those drive-by contributions all the time. There's just too much stuff broken out there to sit by and watch the dumpster fire burn down the whole city.

I'll try to keep doing those reports, of which you can find an archive in monthly-report. Your comments, encouragements, and support make this worth it, so keep those coming!

Happy new year everyone: may it be better than the last, shouldn't be too hard...

PS: Here is the promised chocolate cookie:

Changes in Prometheus 2.0

Anarcat - Wed, 01/24/2018 - 19:00

This is one part of my coverage of KubeCon Austin 2017. Other articles include:

2017 was a big year for the Prometheus project, as it published its 2.0 release in November. The new release ships numerous bug fixes, new features and, notably, a new storage engine that brings major performance improvements. This comes at the cost of incompatible changes to the storage and configuration-file formats. An overview of Prometheus and its new release was presented to the Kubernetes community in a talk held during KubeCon + CloudNativeCon. This article covers what changed in this new release and what is brewing next in the Prometheus community; it is a companion to this article, which provided a general introduction to monitoring with Prometheus.

What changed

Orchestration systems like Kubernetes regularly replace entire fleets of containers for deployments, which means rapid changes in parameters (or "labels" in Prometheus-talk) like hostnames or IP addresses. This was creating significant performance problems in Prometheus 1.0, which wasn't designed for such changes. To correct this, Prometheus ships a new storage engine that was specifically designed to handle continuously changing labels. This was tested by monitoring a Kubernetes cluster where 50% of the pods would be swapped every 10 minutes; the new design was proven to be much more effective. The new engine boasts a hundred-fold I/O performance improvement, a three-fold improvement in CPU, five-fold in memory usage, and increased space efficiency. This impacts container deployments, but it also means improvements for any configuration as well. Anecdotally, there was no noticeable extra load on the servers where I deployed Prometheus, at least nothing that the previous monitoring tool (Munin) could detect.

Prometheus 2.0 also brings new features like snapshot backups. The project has a longstanding design wart regarding data volatility: backups are deemed to be unnecessary in Prometheus because metrics data is considered disposable. According to Goutham Veeramanchaneni, one of the presenters at KubeCon, "this approach apparently doesn't work for the enterprise". Backups were possible in 1.x, but they involved using filesystem snapshots and stopping the server to get a consistent view of the on-disk storage. This implied downtime, which was unacceptable for certain production deployments. Thanks again to the new storage engine, Prometheus can now perform fast and consistent backups, triggered through the web API.

Another improvement is a fix to the longstanding staleness handling bug where it would take up to five minutes for Prometheus to notice when a target disappeared. In that case, when polling for new values (or "scraping" as it's called in Prometheus jargon) a failure would make Prometheus reuse the older, stale value, which meant that downtime would go undetected for too long and fail to trigger alerts properly. This would also cause problems with double-counting of some metrics when labels vary in the same measurement.

Another limitation related to staleness is that Prometheus wouldn't work well with scrape intervals above two minutes (instead of the default 15 seconds). Unfortunately, that is still not fixed in Prometheus 2.0 as the problem is more complicated than originally thought, which means there's still a hard limit to how slowly you can fetch metrics from targets. This, in turn, means that Prometheus is not well suited for devices that cannot support sub-minute refresh rates, which, to be fair, is rather uncommon. For slower devices or statistics, a solution might be the node exporter "textfile support", which we mentioned in the previous article, and the pushgateway daemon, which allows pushing results from the targets instead of having the collector pull samples from targets.
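
As an illustration of that push model, here is a minimal sketch using the Python client library (the gateway address and job name are made up): a short-lived batch job pushes its metrics to the pushgateway, which Prometheus then scrapes like any other target.

    import time
    from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

    registry = CollectorRegistry()
    duration = Gauge('batch_job_duration_seconds',
                     'Duration of the last batch job run',
                     registry=registry)

    with duration.time():   # times the block and stores the result in the gauge
        time.sleep(2)       # stand-in for the actual batch job

    # Prometheus scrapes the gateway instead of the short-lived job itself
    push_to_gateway('pushgateway.example.com:9091', job='backup', registry=registry)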

The migration path

One downside of this new release is that the upgrade path from the previous version is bumpy: since the storage format changed, Prometheus 2.0 cannot use the previous 1.x data files directly. In his presentation, Veeramanchaneni justified this change by saying this was consistent with the project's API stability promises: the major release was the time to "break everything we wanted to break". For those who can't afford to discard historical data, a possible workaround is to replicate the older 1.8 server to a new 2.0 replica, as the network protocols are still compatible. The older server can then be decommissioned when the retention window (which defaults to fifteen days) closes. While there is some work in progress to provide a way to convert 1.8 data storage to 2.0, new deployments should probably use the 2.0 release directly to avoid this peculiar migration pain.

Another key point in the migration guide is a change in the rules-file format. While 1.x used a custom file format, 2.0 uses YAML, matching the other Prometheus configuration files. Thankfully the promtool command handles this migration automatically. The new format also introduces rule groups, which improve control over the rules execution order. In 1.x, alerting rules were run sequentially but, in 2.0, the groups are executed sequentially and each group can have its own interval. This fixes the longstanding race conditions between dependent rules that create inconsistent results when rules would reuse the same queries. The problem should be fixed between groups, but rule authors still need to be careful of that limitation within a rule group.

Remaining limitations and future

As we saw in the introductory article, Prometheus may not be suitable for all workflows because of its limited default dashboards and alerts, but also because of the lack of data-retention policies. There are, however, discussions about variable per-series retention in Prometheus and native down-sampling support in the storage engine, although this is a feature some developers are not really comfortable with. When asked on IRC, Brian Brazil, one of the lead Prometheus developers, stated that "downsampling is a very hard problem, I don't believe it should be handled in Prometheus".

Besides, it is already possible to selectively delete an old series using the new 2.0 API. But Veeramanchaneni warned that this approach "puts extra pressure on Prometheus and unless you know what you are doing, its likely that you'll end up shooting yourself in the foot". A more common approach to native archival facilities is to use recording rules to aggregate samples and collect the results in a second server with a slower sampling rate and different retention policy. And of course, the new release features external storage engines that can better support archival features. Those solutions are obviously not suitable for smaller deployments, which therefore need to make hard choices about discarding older samples or getting more disk space.

As part of the staleness improvements, Brazil also started working on "isolation" (the "I" in the ACID acronym) so that queries wouldn't see "partial scrapes". This hasn't made the cut for the 2.0 release, and is still work in progress, with some performance impacts (about 5% CPU and 10% RAM). This work would also be useful when heavy contention occurs in certain scenarios where Prometheus gets stuck on locking. Some of the performance impact could therefore be offset under heavy load.

Another performance improvement mentioned during the talk is an eventual query-engine rewrite. The current query engine can sometimes cause excessive load for certain expensive queries, according to the Prometheus security guide. The goal would be to optimize the current engine so that those expensive queries wouldn't harm performance.

Finally, another issue I discovered is that 32-bit support is limited in Prometheus 2.0. The Debian package maintainers found that the test suite fails on i386, which led Debian to remove the package from the i386 architecture. It is currently unclear if this is a bug in Prometheus: indeed, it is strange that the Debian tests pass on other 32-bit architectures like armel. Brazil, in the bug report, argued that "Prometheus isn't going to be very useful on a 32bit machine". The position of the project is currently that "'if it runs, it runs' but no guarantees or effort beyond that from our side".

I had the privilege to meet the Prometheus team at the conference in Austin and was happy to see different consultants and organizations working together on the project. It reminded me of my golden days in the Drupal community: different companies cooperating on the same project in a harmonious environment. If Prometheus can keep that spirit together, it will be a welcome change from the drama that affected certain monitoring software. This new Prometheus release could light a bright path for the future of monitoring in the free software world.

This article first appeared in the Linux Weekly News.

Montréal-Python 69 - Call For Proposals

Montreal Python - Wed, 01/17/2018 - 00:00

First of all, the Montreal-Python team would like to wish you a Happy New Year!

With every new year come resolutions. If presenting at a tech event is on your list of resolutions, here's your chance to cross it off early: Montreal Python is opening the call for presentations for our 2018 events. (Feel free to submit your proposal whether you have resolutions or not ;))

Send us your proposal at team@montrealpython.org. We have spots for lightning talks (5 min) and regular talks (15 to 30 min).

When

February 5th, 2018 at 6:00PM

Where

TBD

Monitoring with Prometheus 2.0

Anarcat - Tue, 01/16/2018 - 19:00

This is one part of my coverage of KubeCon Austin 2017. Other articles include:

Prometheus is a monitoring tool built from scratch by SoundCloud in 2012. It works by pulling metrics from monitored services and storing them in a time series database (TSDB). It has a powerful query language to inspect that database, create alerts, and plot basic graphs. Those graphs can then be used to detect anomalies or trends for (possibly automated) resource provisioning. Prometheus also has extensive service discovery features and supports high availability configurations. That's what the brochure says, anyway; let's see how it works in the hands of an old grumpy system administrator. I'll be drawing comparisons with Munin and Nagios frequently because those are the tools I have used for over a decade in monitoring Unix clusters.

Monitoring with Prometheus and Grafana

What distinguishes Prometheus from other solutions is the relative simplicity of its design: for one, metrics are exposed over HTTP using a special URL (/metrics) and a simple text format. Here is, as an example, some network metrics for a test machine:

    $ curl -s http://curie:9100/metrics | grep node_network_.*_bytes
    # HELP node_network_receive_bytes Network device statistic receive_bytes.
    # TYPE node_network_receive_bytes gauge
    node_network_receive_bytes{device="eth0"} 2.720630123e+09
    # HELP node_network_transmit_bytes Network device statistic transmit_bytes.
    # TYPE node_network_transmit_bytes gauge
    node_network_transmit_bytes{device="eth0"} 4.03286677e+08

In the above example, the metrics are named node_network_receive_bytes and node_network_transmit_bytes. They have a single label/value pair (device="eth0") attached to them, along with the value of the metric itself. These are just two of the couple of hundred metrics (CPU, memory, and disk usage, temperature, and so on) exposed by the "node exporter", a basic stats collector running on monitored hosts. Metrics can be counters (e.g. per-interface packet counts), gauges (e.g. temperature or fan sensors), or histograms. The latter allow, for example, 95th-percentile analysis, something that has been missing from Munin forever and is essential for billing networking customers. Another popular use for histograms is maintaining an Apdex score, to make sure that N requests are answered in X time. The various metric types are carefully analyzed before being stored to correctly handle conditions like overflows (which occur surprisingly often on gigabit network interfaces) or resets (when a device restarts).

Those metrics are fetched from "targets", which are simply HTTP endpoints, added to the Prometheus configuration file. Targets can also be automatically added through various discovery mechanisms, like DNS, that allow having a single A or SRV record that lists all the hosts to monitor; or Kubernetes or cloud-provider APIs that list all containers or virtual machines to monitor. Discovery works in real time, so it will correctly pick up changes in DNS, for example. It can also add metadata (e.g. IP address found or server state), which is useful for dynamic environments such as Kubernetes or containers orchestration in general.

Once collected, metrics can be queried through the web interface, using a custom language called PromQL. For example, a query showing the average bandwidth over the last minute for interface eth0 would look like:

rate(node_network_receive_bytes{device="eth0"}[1m])

Notice the "device" label, which we use to restrict the search to a single interface. This query can also be plotted into a simple graph on the web interface:

What is interesting here is not really the node exporter metrics themselves, as those are fairly standard in any monitoring solution. But in Prometheus, any (web) application can easily expose its own internal metrics to the monitoring server through regular HTTP, whereas other systems would require special plugins, on both the monitoring server and the application side. Note that Munin follows a similar pattern, but uses its own text protocol on top of TCP, which means it is harder to implement for web apps and diagnose with a web browser.
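
To give an idea of how simple that exposition is, here is a minimal sketch using the official Python client library (the metric names and port are made up): the application serves its own /metrics endpoint that Prometheus can scrape.

    import random, time
    from prometheus_client import start_http_server, Counter, Histogram

    # hypothetical application metrics
    REQUESTS = Counter('myapp_requests_total', 'Total requests handled')
    LATENCY = Histogram('myapp_request_latency_seconds', 'Request latency')

    # serves the /metrics endpoint on port 8000 in a background thread
    start_http_server(8000)

    while True:
        with LATENCY.time():                       # observe how long the "request" took
            time.sleep(random.uniform(0.01, 0.1))  # stand-in for real work
        REQUESTS.inc()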

However, coming from the world of Munin, where all sorts of graphics just magically appear out of the box, this first experience can be a bit of a disappointment: everything is built by hand and ephemeral. While there are ways to add custom graphs to the Prometheus web interface using Go-based console templates, most Prometheus deployments generally use Grafana to render the results using custom-built dashboards. This gives much better results, and allows graphing multiple machines separately, using the Node Exporter Server Metrics dashboard:

All this work took roughly an hour of configuration, which is pretty good for a first try. Things get tougher when extending those basic metrics: because of the system's modularity, it is difficult to add new metrics to existing dashboards. For example, web or mail servers are not monitored by the node exporter. So monitoring a web server involves installing an Apache-specific exporter that needs to be added to the Prometheus configuration. But it won't show up automatically in the above dashboard, because that's a "node exporter" dashboard, not an Apache dashboard. So you need a separate dashboard for that. This is all work that's done automatically in Munin without any hand-holding.

Even then, Apache is a relatively easy one; monitoring some arbitrary server not supported by a custom exporter will require installing a program like mtail, which parses the server's logfiles to expose some metrics to Prometheus. There doesn't seem to be a way to write quick "run this command to count files" plugins that would allow administrators to write quick hacks. The options available are writing a new exporter using client libraries, which seems to be a rather large undertaking for non-programmers. You can also use the node exporter textfile option, which reads arbitrary metrics from plain text files in a directory. It's not as direct as running a shell command, but may be good enough for some use cases. Besides, there are a large number of exporters already available, including ones that can tap into existing Nagios and Munin servers to allow for a smooth transition.
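
As a sketch of that textfile approach (my own example; the directory depends on how the node exporter was started, typically with its --collector.textfile.directory option), a cron job could count files and drop the result where the exporter will pick it up:

    import glob, os, tempfile

    TEXTFILE_DIR = "/var/lib/prometheus/node-exporter"   # example path
    count = len(glob.glob("/var/spool/mail/*"))          # the "count files" quick hack

    # write atomically so the exporter never reads a half-written file
    fd, tmp = tempfile.mkstemp(dir=TEXTFILE_DIR)
    with os.fdopen(fd, "w") as f:
        f.write("# HELP mail_spool_files Number of files in the mail spool.\n")
        f.write("# TYPE mail_spool_files gauge\n")
        f.write("mail_spool_files %d\n" % count)
    os.rename(tmp, os.path.join(TEXTFILE_DIR, "mail_spool.prom"))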

Unfortunately, those exporters will only give you metrics, not graphs. To graph metrics from a third-party Postfix exporter, a graph must be created by hand in Grafana, with a magic PromQL formula. This may involve too much clicking around in a web browser for grumpy old administrators. There are tools like Grafanalib to programmatically create dashboards, but those also involve a lot of boilerplate. When building a custom application, however, creating graphs may actually be a fun and distracting task that some may enjoy. The Grafana/Prometheus design is certainly enticing and enables powerful abstractions that are not readily available with other monitoring systems.

Alerting and high availability

So far, we've worked only with a single server, and did only graphing. But Prometheus also supports sending alarms when things go bad. After working over a decade as a system administrator, I have mixed feelings about "paging" or "alerting" as it's called in Prometheus. Regardless of how well the system is tweaked, I have come to believe it is basically impossible to design a system that will respect workers and not torture on-call personnel through sleep-deprivation. It seems it's a feature people want regardless, especially in the enterprise, so let's look at how it works here.

In Prometheus, you design alerting rules using PromQL. For example, to warn operators when a network interface is close to saturation, we could set the following rule:

    alert: HighBandwidthUsage
    expr: rate(node_network_transmit_bytes{device="eth0"}[1m]) > 0.95*1e+09
    for: 5m
    labels:
      severity: critical
    annotations:
      description: 'Unusually high bandwidth on interface {{ $labels.device }}'
      summary: 'High bandwidth on {{ $labels.instance }}'

Those rules are regularly checked and matching rules are fired to an alertmanager daemon that can receive alerts from multiple Prometheus servers. The alertmanager then deduplicates multiple alerts, regroups them (so a single notification is sent even if multiple alerts are received), and sends the actual notifications through various services like email, PagerDuty, Slack or an arbitrary webhook.

The Alertmanager has a "gossip protocol" to enable multiple instances to coordinate notifications. This design allows you to run multiple Prometheus servers in a federation model, all simultaneously collecting metrics, and sending alerts to redundant Alertmanager instances to create a highly available monitoring system. Those who have struggled with such setups in Nagios will surely appreciate the simplicity of this design.

The downside is that Prometheus doesn't ship a set of default alerts and exporters do not define default alerting thresholds that could be used to create rules automatically. The Prometheus documentation also lacks examples that the community could use, so alerting is harder to deploy than in classic monitoring systems.

Issues and limitations

Prometheus is already well-established: Cloudflare, Canonical and (of course) SoundCloud are all (still) using it in production. It is a common monitoring tool used in Kubernetes deployments because of its discovery features. Prometheus is, however, not a silver bullet and may not be the best tool for all workloads.

In particular, Prometheus is not designed for long-term storage. By default, it keeps samples for only two weeks, which seems rather small to old system administrators who are used to RRDtool databases that efficiently store samples for years. As a comparison, my test Prometheus instance is taking up as much space for five days of samples as Munin, which has samples for the last year. Of course, Munin only collects metrics every five minutes while Prometheus samples all targets every 15 seconds by default. Even so, this difference in sizes shows that Prometheus's disk requirements are much larger than traditional RRDtool implementations because it lacks native down-sampling facilities. Therefore, retaining samples for more than a year (which is a Munin limitation I was hoping to overcome) will be difficult without some serious hacking to selectively purge samples or adding extra disk space.

The project documentation recognizes this and suggests using alternatives:

Prometheus's local storage is limited in its scalability and durability. Instead of trying to solve long-term storage in Prometheus itself, Prometheus has a set of interfaces that allow integrating with remote long-term storage systems.

Prometheus in itself delivers good performance: a single instance can support over 100,000 samples per second. When a single server is not enough, servers can federate to cover different parts of the infrastructure. And when that is not enough, sharding is possible. In general, performance is dependent on avoiding variable data in labels, which keeps the cardinality of the dataset under control, but the dataset size will grow with time regardless. So long-term storage is not Prometheus's strongest suit. But starting with 2.0, Prometheus can finally write to (and read from) external storage engines that can be more efficient than Prometheus. InfluxDB, for example, can be used as a backend and supports time-based down-sampling that makes long-term storage manageable. This deployment, however, is not for the faint of heart.

Also, security freaks can't help but notice that all this is happening over a clear-text HTTP protocol. Indeed, that is by design, "Prometheus and its components do not provide any server-side authentication, authorisation, or encryption. If you require this, it is recommended to use a reverse proxy." The issue is punted to a layer above, which is fine for the web interface: it is, after all, just a few Prometheus instances that need to be protected. But for monitoring endpoints, this is potentially hundreds of services that are available publicly without any protection. It would be nice to have at least IP-level blocking in the node exporter, although this could also be accomplished through a simple firewall rule.

There is a large empty space for Prometheus dashboards and alert templates. Whereas tools like Munin or Nagios had years to come up with lots of plugins and alerts, and to converge on best practices like "70% disk usage is a warning but 90% is critical", those things all need to be configured manually in Prometheus. Prometheus should aim at shipping standard sets of dashboards and alerts for built-in metrics, but the project currently lacks the time to implement those.

The Grafana list of Prometheus dashboards shows one aspect of the problem: there are many different dashboards, sometimes multiple ones for the same task, and it's unclear which one is the best. There is therefore space for a curated list of dashboards and a definite need for expanding those to feature more extensive coverage.

As a replacement for traditional monitoring tools, Prometheus may not be quite there yet, but it will get there and I would certainly advise administrators to keep an eye on the project. Besides, Munin and Nagios feature-parity is just a requirement from an old grumpy system administrator. For hip young application developers smoking weird stuff in containers, Prometheus is the bomb. Just take for example how GitLab started integrating Prometheus, not only to monitor GitLab.com itself, but also to monitor the continuous-integration and deployment workflow. By integrating monitoring into development workflows, developers are immediately made aware of the performance impacts of proposed changes. Performance regressions can therefore be trivially identified quickly, which is a powerful tool for any application.

Whereas system administrators may want to wait a bit before converting existing monitoring systems to Prometheus, application developers should certainly consider deploying Prometheus to instrument their applications; it will serve them well.

This article first appeared in the Linux Weekly News.

Demystifying container runtimes

Anarcat - Wed, 12/20/2017 - 12:00

This is one part of my coverage of KubeCon Austin 2017. Other articles include:

As we briefly mentioned in our overview article about KubeCon + CloudNativeCon, there are multiple container "runtimes", which are programs that can create and execute containers that are typically fetched from online images. That space is slowly reaching maturity both in terms of standards and implementation: Docker's containerd 1.0 was released during KubeCon, CRI-O 1.0 was released a few months ago, and rkt is also still in the game. With all of those runtimes, it may be a confusing time for those looking at deploying their own container-based system or Kubernetes cluster from scratch. This article will try to explain what container runtimes are, what they do, how they compare with each other, and how to choose the right one. It also provides a primer on container specifications and standards.

What is a container?

Before we go further in looking at specific runtimes, let's see what containers actually are. Here is basically what happens when a container is launched:

  1. A container is created from a container image. Images are tarballs with a JSON configuration file attached. Images are often nested: for example this Libresonic image is built on top of a Tomcat image that depends (eventually) on a base Debian image. This allows for content deduplication because that Debian image (or any intermediate step) may be the basis for other containers. A container image is typically created with a command like docker build.
  2. If necessary, the runtime downloads the image from somewhere, usually some "container registry" that exposes the metadata and the files for download over a simple HTTP-based protocol. It used to be only Docker Hub, but now everyone has their own registry: for example, Red Hat has one for its OpenShift project, Microsoft has one for Azure, and GitLab has one for its continuous integration platform. A registry is the server that docker pull or push talks with, for example.
  3. The runtime extracts that layered image onto a copy-on-write (CoW) filesystem. This is usually done using an overlay filesystem, where all the container layers overlay each other to create a merged filesystem. This step is not generally directly accessible from the command line but happens behind the scenes when the runtime creates a container.
  4. Finally, the runtime actually executes the container, which means telling the kernel to assign resource limits, create isolation layers (for processes, networking, and filesystems), and so on, using a cocktail of mechanisms like control groups (cgroups), namespaces, capabilities, seccomp, AppArmor, SELinux, and whatnot. For Docker, docker run is what creates and runs the container, but underneath it actually calls the runc command.

Those concepts were first elaborated in Docker's Standard Container manifesto which was eventually removed from Docker, but other standardization efforts followed. The Open Container Initiative (OCI) now specifies most of the above under a few specifications:

Implementation of those standards varies among the different projects. For example, Docker is generally compatible with the standards except for the image format. Docker has its own image format that predates standardization and it has promised to convert to the new specification soon. Implementation of the runtime interface also differs as not everything Docker does is standardized, as we shall see.

The Docker and rkt story

Since Docker was the first to popularize containers, it seems fair to start there. Originally, Docker used LXC but its isolation layers were incomplete, so Docker wrote libcontainer, which eventually became runc. Container popularity exploded and Docker became the de facto standard to deploy containers. When it came out in 2014, Kubernetes naturally used Docker, as Docker was the only runtime available at the time. But Docker is an ambitious company and kept on developing new features on its own. Docker Compose, for example, reached 1.0 at the same time as Kubernetes and there is some overlap between the two projects. While there are ways to make the two tools interoperate using tools such as Kompose, Docker is often seen as a big project doing too many things. This situation led CoreOS to release a simpler, standalone runtime in the form of rkt, that was explained this way:

Docker now is building tools for launching cloud servers, systems for clustering, and a wide range of functions: building images, running images, uploading, downloading, and eventually even overlay networking, all compiled into one monolithic binary running primarily as root on your server. The standard container manifesto was removed. We should stop talking about Docker containers, and start talking about the Docker Platform. It is not becoming the simple composable building block we had envisioned.

One of the innovations of rkt was to standardize image formats through the appc specification, something we covered back in 2015. CoreOS doesn't yet have a fully standard implementation of the runtime interfaces: at the time of writing, rkt's Kubernetes compatibility layer (rktlet), doesn't pass all of Kubernetes integration tests and is still under development. Indeed, according to Brandon Philips, CTO of CoreOS, in an email exchange:

rkt has initial support for OCI image-spec, but it is incomplete in places. OCI support is less important at the moment as the support for OCI is still emerging in container registries and is notably absent from Kubernetes. OCI runtime-spec is not used, consumed, nor handled by rkt. This is because rkt execution is based on pod semantics, while runtime-spec only covers single container execution.

However, according to Dan Walsh, head of the container team at Red Hat, in an email interview, CoreOS's efforts were vital to the standardization of the container space within the Kubernetes ecosystem: "Without CoreOS we probably would not have CNI, and CRI and would be still fighting about OCI. The CoreOS effort is under-appreciated in the market." Indeed, according to Philips, the "CNI project and specifications originated from rkt, and were later spun off and moved to CNCF. CNI is still widely used by rkt today, both internally and for user-provided configuration." At this point, however, CoreOS seems to be gearing up toward building its Kubernetes platform (Tectonic) and image distribution service (Quay) rather than competing in the runtime layer.

CRI-O: the minimal runtime

Seeing those new standards, some Red Hat folks figured they could make a simpler runtime that would only do what Kubernetes needed. That "skunkworks" project was eventually called CRI-O and implements a minimal CRI interface. During a talk at KubeCon Austin 2017, Walsh explained that "CRI-O is designed to be simpler than other solutions, following the Unix philosophy of doing one thing and doing it well, with reusable components."

Started in late 2016 by Red Hat for its OpenShift platform, the project also benefits from support by Intel and SUSE, according to Mrunal Patel, lead CRI-O developer at Red Hat who hosted the talk. CRI-O is compatible with the CRI (runtime) specification and the OCI and Docker image formats. It can also verify image GPG signatures. It uses the CNI package for networking and supports CNI plugins, which OpenShift uses for its software-defined networking layer. It supports multiple CoW filesystems, like the commonly used overlay and aufs, but also the less common Btrfs.

One of CRI-O's most notable features, however, is that it supports mixed workloads between "trusted" and "untrusted" containers. For example, CRI-O can use Clear Containers for stronger isolation promises, which is useful in multi-tenant configurations or to run untrusted code. It is currently unclear how that functionality will trickle up into Kubernetes, which currently considers all backends to be the same.

CRI-O has an interesting architecture (see the diagram below from the talk slides [PDF]). It reuses basic components like runc to start containers, and software libraries like containers/image and containers/storage, created for the skopeo project, to pull container images and create container filesystems. A separate library called oci-runtime-tool prepares the container configuration. CRI-O introduces a new daemon to handle containers called conmon. According to Patel, the program was "written in C for stability and performance" and takes care of monitoring, logging, TTY allocation, and miscellaneous hazards like out-of-memory conditions.

The conmon daemon is needed here to do all of the things that systemd doesn't (want to) do. But even though CRI-O doesn't use systemd directly to manage containers, it assigns containers to systemd-compatible cgroups, so that regular systemd tools like systemctl have visibility into the container resources. Since conmon (and not the CRI daemon) is the parent process of the container, it also allows parts of CRI-O to be restarted without stopping containers, which promises smoother upgrades. This is a problem for Docker deployments right now, where a Docker upgrade requires restarting all of the containers. This is usually not much trouble for Kubernetes clusters, however, because it is easy to roll out upgrades progressively by moving containers around.
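
That visibility can be checked with stock systemd tooling. Here is a minimal sketch, assuming CRI-O is configured to use the systemd cgroup manager; the scope name shown is hypothetical and depends on that configuration:

    # walk the cgroup tree, where CRI-O containers show up as ordinary scopes
    systemd-cgls
    # resource accounting for a single container scope (unit name is hypothetical)
    systemctl status crio-<container-id>.scope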

CRI-O is the first implementation of the OCI standards suite that passes all Kubernetes integration tests (apart from Docker itself). Patel demonstrated those capabilities by showing a Kubernetes cluster backed by CRI-O, in what seemed to be a routine demonstration of cluster functionalities. Dan Walsh explained CRI-O's approach in a blog post that explains how CRI-O interacts with Kubernetes:

Our number 1 goal for CRI-O, unlike other container runtimes, is to never break Kubernetes. Providing a stable and rock-solid container runtime for Kubernetes is CRI-O's only mission in life.

According to Patel, performance is comparable to a normal Docker-based deployment, but the team is working on optimizing performance to go beyond that. Debian and RPM packages are available and deployment tools like minikube or kubeadm also support switching to the CRI-O runtime. On existing clusters, switching runtimes is straightforward: just a single environment variable changes the runtime socket, which is what Kubernetes uses to communicate with the runtime.
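
As an illustration of how small that switch is, here is a sketch of the kubelet settings involved; the flag names are from the Kubernetes 1.7-1.9 era and the socket path assumes a stock CRI-O installation, so both may differ on other setups:

    # point the kubelet at CRI-O's CRI socket instead of the built-in Docker shim
    kubelet --container-runtime=remote \
        --container-runtime-endpoint=unix:///var/run/crio/crio.sock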

CRI-O 1.0 was released in October 2017 with support for Kubernetes 1.7. Since then, CRI-O 1.8 and 1.9 were released to follow the Kubernetes 1.8 and 1.9 releases (and sync version numbers). Patel considers CRI-O to be production-ready and it is already available in beta in OpenShift 3.7, released in November 2017. Red Hat will mark it as stable in the upcoming OpenShift 3.9 release and is looking at using it by default with OpenShift 3.10, while keeping Docker as a fallback. Next steps include integrating the new Kata Containers virtual-machine-based runtime, kube-spawn support, and more storage backends like NFS or GlusterFS. The team also discussed how it could support casync or libtorrent to optimize synchronization of images between nodes.

containerd: Docker's runtime gets an API

While Red Hat was working on its implementation of OCI, Docker was also working on the standard, which led to the creation of another runtime, called containerd. The new daemon is a refactoring of internal Docker components to combine the OCI-specific bits like execution, storage, and network interface management. It was already featured in the 1.12 Docker release, but wasn't completed until the containerd 1.0 release announced at KubeCon, which will be part of the upcoming Docker 17.12 (Docker has moved to version numbers based on year and month). And while we call containerd a "runtime", it doesn't directly implement the CRI interface, which is covered by another daemon called cri-containerd. So containerd needs more daemons than CRI-O for Kubernetes (five, versus three for CRI-O). Also, at the time of writing, the cri-containerd component is marked as beta but containerd itself is already used in numerous production environments through Docker, of course.

During the Node special interest group (SIG) meeting at KubeCon, Stephen Day described [Speaker Deck] containerd as "designed as a tight core of decoupled components". Unlike CRI-O, however, containerd supports workloads outside the Kubernetes ecosystem through a Go API. The API is not considered stable yet, although containerd defines a clear release process for making changes to the API and command-line tools. Like CRI-O, containerd is feature complete and passes all Kubernetes tests, but it does not interoperate with systemd's cgroups.

The next step for the project is to develop more tests and to improve performance in areas like memory usage and latency. Containerd developers are also working hard on improving stability. They want to provide Debian and RPM packages for easier installation, and integrate it with minikube and kops as well. There are also plans to integrate Kata Containers more smoothly: runc can already be replaced by Kata for basic integration but cri-containerd integration is not implemented yet.

Interoperability and defaults

All of those options are causing a certain amount of confusion in the community. At KubeCon, which runtime to use was a recurring question to speakers. Kubernetes will likely change from Docker to a different runtime, because it doesn't need all the features Docker provides, and there are concerns that the switch could cause compatibility issues because the new runtimes do not implement exactly the same interface as Docker. Log files, for example, are different in the CRI standard. Some programs also monitor the Docker socket directly, which has some non-standard behaviors that the new runtimes may implement differently, or not at all. All of this could cause some breakage when switching to a different runtime.

The question of which runtime Kubernetes will switch to (if it changes) is also open, which leads to some competition between the runtimes. There was a slight controversy related to that question at KubeCon because CRI-O wasn't mentioned during the CNCF keynote, something Vincent Batts, a senior engineer at Red Hat, mentioned on Twitter:

It is bonkers that CRI implementations containerd and rktlet are covered in KubeCon keynote, but zero mention of CRI-O, which is a Kubernetes project that's been 1.0 and actually used in production.

When I prompted him for details about this at KubeCon, Batts explained that:

Healthy competition is good, the problem is unhealthy competition. The CNCF should be better stewards of the projects under their umbrella and shouldn't fold under pressure to promote certain projects over others.

Batts explained further that Red Hat "may be at a tipping point where some apps could start being deployed as containers instead of RPMs" citing "security concerns (namely with [security] advisory tracking, which is lacking in containers as a packaging format) as the main blocker for such transitions". With Project Atomic, Red Hat seems to be pivoting toward containers, so the stakes are high for the Linux distribution.

When I talked to CNCF's COO Chris Aniszczyk at KubeCon, he explained that the CNCF's "current policy is to prioritize the marketing of top-level projects":

Projects like CRIO and Helm are part of Kubernetes in some fashion and therefore part of CNCF, we just don't market them as heavily as our top level projects which have passed the CNCF TOC [Technical Oversight Committee] approval bar.

Aniszczyk added that "we want to help, we've heard the feedback and plan to address things in 2018", suggesting that one solution could be that CRI-O applies to graduate to a top-level project in CNCF.

During a container runtimes press meeting, Philips explained that the community would decide, through consensus, which runtime Kubernetes would run by default. He compared runtimes to web browsers and suggested that OCI specifications for containers be compared to the HTML5 and Javascript standards: those are evolving standards that get pushed by the different implementations. He argued that this competition is healthy and means more innovation.

A lot of the people involved in writing those new runtimes were originally Docker contributors: Patel was the initial maintainer of the original OCI implementation that led to runc, while Philips was also a core Docker contributor before starting the rkt project. Those people actively collaborate, along with Docker developers, on standardizing those interfaces and all want to see Kubernetes stabilize and improve. The goal, according to Patrick Chazenon from Docker Inc., is to "make container runtimes boring and stable to have people innovate on top of it". The developers present at the press meeting were happy and proud of what they have accomplished: they have managed to create a single, interoperable specification for containers, and that specification is expanding.

Consolidation and standardization will continue in 2018

The current big topic in container standardization is not runtimes as much as image distribution (i.e. container registries), which is likely to standardize, again, in a specification built around Docker's distribution system. There is also work needed to follow the Linux kernel changes, for example cgroups v2.

The reality is that each runtime has its own advantages: containerd has an API so it can be used to build customized platforms, while CRI-O is a simpler runtime specifically targeting Kubernetes. Docker and rkt are on another level, providing more than simply the runtime: they also provide ways of building containers or pushing to registries, for example.

Right now, most public cloud infrastructure still uses Docker as a runtime. In fact, even CoreOS uses Docker instead of rkt in its Tectonic platform. According to Philips, this is because "there is a substantial integration ecosystem around Docker Engine that our customers rely on and it is the most well-tested option across all existing Kubernetes products." Philips says that CoreOS may consider supporting alternate runtimes for Tectonic, "if alternative container runtimes provide significant improvements to Kubernetes users":

At this point containerd and CRI-O are both very young projects due to the significant amount of new code each project developed this year. Further, they need to reach the maturity of third-party integrations across the ecosystem from logging, monitoring, security, and much more.

Philips further explained CoreOS's position in this blog post:

So far the primary benefits of the CRI for the Kubernetes community have been better code organization and improved code coverage in the Kubelet itself, resulting in a code base that's both higher quality and more thoroughly tested than ever before. For nearly all deployments, however, we expect the Kubernetes community will continue to use Docker Engine in the near term.

During the discussion in the press meeting, Patel likewise said that "we don't want Kubernetes users to know what the runtime is". Indeed, as long as it works, users shouldn't care. Besides, OpenShift, Tectonic, and other platforms abstract runtime decisions away and pick their own best default matching their users' requirements. The question of which runtime Kubernetes chooses by default is therefore not really a concern for those developers, who prefer working on building consensus on standard specifications. In a world of conflict, seeing those developers working together cordially was definitely a breath of fresh air.

This article first appeared in the Linux Weekly News.

Containers without Docker at Red Hat

Anarcat - mer, 12/20/2017 - 12:00

This is one part of my coverage of KubeCon Austin 2017. Other articles include:

The Docker (now Moby) project has done a lot to popularize containers in recent years. Along the way, though, it has generated concerns about its concentration of functionality into a single, monolithic system under the control of a single daemon running with root privileges: dockerd. Those concerns were reflected in a talk by Dan Walsh, head of the container team at Red Hat, at KubeCon + CloudNativeCon. Walsh spoke about the work the container team is doing to replace Docker with a set of smaller, interoperable components. His rallying cry is "no big fat daemons" as he finds them to be contrary to the venerated Unix philosophy.

The quest to modularize Docker

As we saw in an earlier article, the basic set of container operations is not that complicated: you need to pull a container image, create a container from the image, and start it. On top of that, you need to be able to build images and push them to a registry. Most people still use Docker for all of those steps but, as it turns out, Docker isn't the only name in town anymore: an early alternative was rkt, which led to the creation of various standards like CRI (runtime), OCI (image), and CNI (networking) that allow backends like CRI-O or Docker to interoperate with, for example, Kubernetes.

These standards led Red Hat to create a set of "core utils" like the CRI-O runtime that implements the parts of the standards that Kubernetes needs. But Red Hat's OpenShift project needs more than what Kubernetes provides. Developers will want to be able to build containers and push them to the registry. Those operations need a whole different bag of tricks.

It turns out that there are multiple tools to build containers right now. Apart from Docker itself, a session from Michael Ducy of Sysdig reviewed eight image builders, and that's probably not all of them. Ducy identified the ideal build tool as one that would create a minimal image in a reproducible way. A minimal image is one where there is no operating system, only the application and its essential dependencies. Ducy identified Distroless, Smith, and Source-to-Image as good tools to build minimal images, which he called "micro-containers".

A reproducible container is one that you can build multiple times and always get the same result. For that, Ducy said you have to use a "declarative" approach (as opposed to "imperative"), which is understandable given that he comes from the Chef configuration-management world. He gave the examples of Ansible Container, Habitat, nixos-container, and Smith (yes, again) as being good approaches, provided you were familiar with their domain-specific languages. He added that Habitat ships its own supervisor in its containers, which may be superfluous if you already have an external one, like systemd, Docker, or Kubernetes. To complete the list, we should mention the new BuildKit from Docker and Buildah, which is part of Red Hat's Project Atomic.

Building containers with Buildah

Buildah's name apparently comes from Walsh's colorful Boston accent; the Boston theme permeates the branding of the tool: the logo, for example, is a Boston terrier dog (seen at right). This project takes a different approach from Ducy's decree: instead of enforcing a declarative configuration-management approach to containers, why not build simple tools that can be used by your favorite configuration-management tool? If you want to use regular command-line commands like cp (instead of Docker's custom COPY directive, for example), you can. But you can also use Ansible or Puppet, OS-specific or language-specific installers like APT or pip, or whatever other system to provision the content of your containers. This is what building a container looks like with regular shell commands and simply using make to install a binary inside the container:

    # pull a base image, equivalent to a Dockerfile's FROM command
    buildah from redhat
    # mount the base image to work on it
    crt=$(buildah mount)
    cp foo $crt
    make install DESTDIR=$crt
    # then make a snapshot
    buildah commit

An interesting thing with this approach is that, since you reuse normal build tools from the host environment, you can build really minimal images because you don't need to install all the dependencies in the image. Usually, when building a container image, the target application build dependencies need to be installed within the container. For example, building from source usually requires a compiler toolchain in the container, because it is not meant to access the host environment. A lot of containers will also ship basic Unix tools like ps or bash which are not actually necessary in a micro-container. Developers often forget to (or simply can't) remove some dependencies from the built containers; that common practice creates unnecessary overhead and attack surface.

The modular approach of Buildah means you can run at least parts of the build as non-root: the mount command still needs the CAP_SYS_ADMIN capability, but there is an issue open to resolve this. However, Buildah shares the same limitation as Docker in that it can't build containers inside containers. For Docker, you need to run the container in "privileged" mode, which is not possible in certain environments (like GitLab Continuous Integration, for example) and, even when it is possible, the configuration is messy at best.

The manual commit step allows fine-grained control over when to create container snapshots. While in a Dockerfile every line creates a new snapshot, with Buildah commit checkpoints are explicitly chosen, which reduces unnecessary snapshots and saves disk space. This is useful to isolate sensitive material like private keys or passwords which sometimes mistakenly end up in public images as well.

While Docker builds non-standard, Docker-specific images, Buildah produces standard OCI images among other output formats. For backward compatibility, it has a command called build-using-dockerfile or buildah bud that parses normal Dockerfiles. Buildah has an enter command to inspect images from the inside directly and a run command to start containers on the fly. It does all the work without any "fat daemon" running in the background and uses standard tools like runc.
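
For example, the Dockerfile-compatibility path looks roughly like this. This is only a sketch: the image and container names are placeholders and the exact flag spellings may have changed since the beta I tested:

    # build an OCI image from an existing Dockerfile in the current directory
    buildah bud -t myapp .
    # create a working container from that image and open a shell inside it
    ctr=$(buildah from myapp)
    buildah run $ctr -- /bin/sh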

Ducy's criticism of Buildah was that it was not declarative, which made it less reproducible. When allowing shell commands, anything can happen: for example, a shell script might download arbitrary binaries, without any way of subsequently retracing where those come from. Shell command effects may vary according to the environment. In contrast to shell-based tools, configuration-management systems like Puppet or Chef are designed to "converge" over a final configuration that is more reliable, at least in theory: in practice you can call shell commands from configuration-management systems. Walsh, however, argued that existing configuration management can be used on top of Buildah, but it doesn't force users down that path. This fits well with the classic "separation" principle of the Unix philosophy ("mechanism not policy").

At this point, Buildah is in beta and Red Hat is working on integrating it into OpenShift. I have tested Buildah while writing this article and, short of some documentation issues, it generally works reliably. It could use some polishing in error handling, but it is definitely a great asset to add to your container toolbox.

Replacing the rest of the Docker command-line

Walsh continued his presentation by giving an overview of another project that Red Hat is working on, tentatively called libpod. The name derives from a "pod" in Kubernetes, which is a way to group containers inside a host, to share namespaces, for example.

Libpod includes the kpod command to inspect and manipulate container storage directly. Walsh explained this can be useful if, for example, dockerd hangs or if a Kubernetes cluster crashes. kpod is basically an independent re-implementation of the docker command-line tool. There is a command to list running containers (kpod ps) or images (kpod images). In fact, there is a translation cheat sheet documenting all Docker commands with a kpod equivalent.

One of the nice things with the modular approach is that when you run a container with kpod run, the container is directly started as a subprocess of the current shell, instead of a subprocess of dockerd. In theory, this allows running containers directly from systemd, removing the duplicate work dockerd is doing. It enables things like socket-activated containers, which is something that is not straightforward to do with Docker, or even with Kubernetes right now. In my experiments, however, I have found that containers started with kpod lack some fundamental functionality, namely networking (!), although there is an issue in progress to complete that implementation.
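
To give a flavor of that command-line compatibility, here is a sketch; the image name is a placeholder, and since kpod was later renamed, the binary may be called something else on current systems:

    # list running containers and local images, mirroring docker ps / docker images
    kpod ps
    kpod images
    # run a container as a direct child of the current shell, not of dockerd
    kpod run docker.io/library/alpine echo "hello from a container"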

A final command we haven't covered is push. While the above commands provide a good process for working with local containers, they don't cover remote registries, which allow developers to actively collaborate on application packaging. Registries are also an essential part of a continuous-deployment framework. This is where the skopeo project comes in. Skopeo is another Atomic project that "performs various operations on container images and image repositories", according to the README file. It was originally designed to inspect the contents of container registries without actually downloading the sometimes voluminous images as docker pull does. Docker refused patches to support inspection, suggesting the creation of a separate tool, which led to Skopeo. After pull, push was the logical next step and Skopeo can now do a bunch of other things like copying and converting images between registries without having to store a copy locally. Because this functionality was useful to other projects as well, a lot of the Skopeo code now lives in a reusable library called containers/image. That library is in turn used by Pivotal, Google's container-diff, kpod push, and buildah push.
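
The two operations described above look roughly like this; the image references are placeholders and option spellings may vary between Skopeo releases:

    # inspect a remote image's metadata without pulling its layers
    skopeo inspect docker://docker.io/library/debian:stretch
    # copy an image between registries without keeping a local copy
    skopeo copy docker://docker.io/library/debian:stretch \
        docker://registry.example.com/mirror/debian:stretch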

kpod is not directly tied to Kubernetes, so the name might change in the future — especially since Red Hat legal has not cleared the name yet. (In fact, just as this article was going to "press", the name was changed to podman.) The team wants to implement more "pod-level" commands which would allow operations on multiple containers, a bit like what docker-compose might do. But at that level, a better tool might be Kompose, which can convert Compose YAML files into Kubernetes resources and deploy them to a cluster. Some Docker commands (like swarm) will never be implemented, on purpose, as they are best left for Kubernetes itself to handle.

It seems that the effort to modularize Docker that started a few years ago is finally bearing fruit. While, at this point, kpod is under heavy development and probably should not be used in production, the design of those different tools is certainly interesting; a lot of it is ready for development environments. Right now, the only way to install libpod is to compile it from source, but we should expect packages coming out for your favorite distribution eventually.

This article first appeared in the Linux Weekly News.

An overview of KubeCon + CloudNativeCon

Anarcat - mer, 12/13/2017 - 12:00

This is one part of my coverage of KubeCon Austin 2017. Other articles include:

The Cloud Native Computing Foundation (CNCF) held its conference, KubeCon + CloudNativeCon, in December 2017. There were 4000 attendees at this gathering in Austin, Texas, more than all the previous KubeCons combined, which shows the rapid growth of the community building around the tool that was announced by Google in 2014. Large corporations are also taking a larger part in the community, with major players in the industry joining the CNCF, which is a project of the Linux Foundation. The CNCF now features three of the largest cloud hosting businesses (Amazon, Google, and Microsoft), but also emerging companies from Asia like Baidu and Alibaba.

In addition, KubeCon saw an impressive number of diversity scholarships, which "include free admission to KubeCon and a travel stipend of up to $1,500, aimed at supporting those from traditionally underrepresented and/or marginalized groups in the technology and/or open source communities", according to Neil McAllister of CoreOS. The diversity team raised an impressive $250,000 to bring 103 attendees to Austin from all over the world.

We have looked into Kubernetes in the past but, considering the speed at which things are moving, it seems time to make an update on the projects surrounding this newly formed ecosystem.

The CNCF and its projects

The CNCF was founded, in part, to manage the Kubernetes software project, which was donated to it by Google in 2015. From there, the number of projects managed under the CNCF umbrella has grown quickly. It first added the Prometheus monitoring and alerting system, and then quickly went up from four projects in the first year, to 14 projects at the time of this writing, with more expected to join shortly. The CNCF's latest additions to its roster are Notary and The Update Framework (TUF, which we previously covered), both projects aimed at providing software verification. Those add to the already existing projects which are, bear with me, OpenTracing (a tracing API), Fluentd (a logging system), Linkerd (a "service mesh", which we previously covered), gRPC (a "universal RPC framework" used to communicate between pods), CoreDNS (DNS and service discovery), rkt (a container runtime), containerd (another container runtime), Jaeger (a tracing system), Envoy (another "service mesh"), and Container Network Interface (CNI, a networking API).

This is an incredible diversity, if not fragmentation, in the community. The CNCF made this large diagram depicting Kubernetes-related projects—so large that you will have a hard time finding a monitor that will display the whole graph without scaling it (seen below, click through for larger version). The diagram shows hundreds of projects, and it is hard to comprehend what all those components do and if they are all necessary or how they overlap. For example, Envoy and Linkerd are similar tools yet both are under the CNCF umbrella—and I'm ignoring two more such projects presented at KubeCon (Istio and Conduit). You could argue that all tools have different focus and functionality, but it still means you need to learn about all those tools to pick the right one, which may discourage and confuse new users.

You may notice that containerd and rkt are both projects of the CNCF, even though they overlap in functionality. There is also a third Kubernetes runtime called CRI-O built by Red Hat. This kind of fragmentation leads to significant confusion within the community as to which runtime they should use, or if they should even care. We'll run a separate article about CRI-O and the other runtimes to try to clarify this shortly.

Regardless of this complexity, it does seem the space is maturing. In his keynote, Dan Kohn, executive director of the CNCF, announced "1.0" releases for 4 projects: CoreDNS, containerd, Fluentd and Jaeger. Prometheus also had a major 2.0 release, which we will cover in a separate article.

There were significant announcements at KubeCon for projects that are not directly under the CNCF umbrella. Most notable for operators concerned about security is the introduction of Kata Containers, which is basically a merge of runV from Hyper.sh and Intel's Clear Containers projects. Kata Containers, introduced during a keynote by Intel's VP of the software and services group, Imad Sousou, are virtual-machine-based containers, or, in other words, containers that run in a hypervisor instead of under the supervision of the Linux kernel. The rationale here is that containers are convenient but all run on the same kernel, so the compromise of a single container can leak into all containers on the same host. This may be unacceptable in certain environments, for example for multi-tenant clusters where containers cannot trust each other.

Kata Containers promises the "best of both worlds" by providing the speed of containers and the isolation of VMs. It does this by using minimal custom kernel builds, to speed up boot time, and parallelizing container image builds and VM startup. It also uses tricks like same-page memory sharing across VMs to deduplicate memory across virtual machines. It currently works only on x86 and KVM, but it integrates with Kubernetes, Docker, and OpenStack. There was a talk explaining the technical details; that page should eventually feature video and slide links.

Industry adoption

As hinted earlier, large cloud providers like Amazon Web Services (AWS) and Microsoft Azure are adopting the Kubernetes platform, or at least its API. The keynotes featured AWS prominently; Adrian Cockcroft (AWS vice president of cloud architecture strategy) announced the Fargate service, which introduces containers as "first class citizens" in the Amazon infrastructure. Fargate should run alongside, and potentially replace, the existing Amazon EC2 Container Service (ECS), which is currently the way developers would deploy containers on Amazon by using EC2 (Elastic Compute Cloud) VMs to run containers with Docker.

This move by Amazon has been met with skepticism in the community. The concern here is that Amazon could pull the plug on Kubernetes when it hinders the bottom line, like it did with the Chromecast products on Amazon. This seems to be part of a changing strategy by the corporate sector in adoption of free-software tools. While historically companies like Microsoft or Oracle have been hostile to free software, they are now not only using free software but also releasing free software. Oracle, for example, released what it called "Kubernetes Tools for Serverless Deployment and Intelligent Multi-Cloud Management", named Fn. Large cloud providers are getting certified by the CNCF for compliance with the Kubernetes API and other standards.

One theory to explain this adoption is that free-software projects are becoming on-ramps to proprietary products. In this strategy, as explained by InfoWorld, open-source tools like Kubernetes are merely used to bring consumers over to proprietary platforms. Sure, the client and the API are open, but the underlying software can be proprietary. The data and some magic interfaces, especially, remain proprietary. Key examples of this include the "serverless" services, which are currently not standardized at all: each provider has its own incompatible framework that could be a deliberate lock-in strategy. Indeed, a common definition of serverless, from Martin Fowler, goes as follows:

Serverless architectures refer to applications that significantly depend on third-party services (known as Backend as a Service or "BaaS") or on custom code that's run in ephemeral containers (Function as a Service or "FaaS").

By designing services that explicitly require proprietary, provider-specific APIs, providers ensure customer lock-in at the core of the software architecture. One of the upcoming battles in the community will be exactly how to standardize this emerging architecture.

And, of course, Kubernetes can still be run on bare metal in a colocation facility, but those costs are getting less and less affordable. In an enlightening talk, Dmytro Dyachuk explained that unless cloud costs hit $100,000 per month, users may be better off staying in the cloud. Indeed, that is where a lot of applications end up. During an industry roundtable, Hong Tang, chief architect at Alibaba Cloud, posited that the "majority of computing will be in the public cloud, just like electricity is produced by big power plants".

The question, then, is how to split that market between the large providers. And, indeed, according to a CNCF survey of 550 conference attendees: "Amazon (EC2/ECS) continues to grow as the leading container deployment environment (69%)". CNCF also notes that on-premise deployment decreased for the first time in the five surveys it has run, to 51%, "but still remains a leading deployment". On premise, which is a colocation facility or data center, is the target for these cloud companies. By getting users to run Kubernetes, the industry's bet is that it makes applications and content more portable, thus easier to migrate into the proprietary cloud.

Next steps

As the Kubernetes tools and ecosystem stabilize, major challenges emerge: monitoring is a key issue as people realize it may be more difficult to diagnose problems in a distributed system compared to the previous monolithic model, which people at the conference often referred to as "legacy" or the "old OS paradigm". Scalability is another challenge: while Kubernetes can easily manage thousands of pods and containers, you still need to figure out how to organize all of them and make sure they can talk to each other in a meaningful way.

Security is a particularly sensitive issue as deployments struggle to isolate TLS certificates or application credentials from applications. Kubernetes makes big promises in that regard and it is true that isolating software in microservices can limit the scope of compromises. The solution emerging for this problem is the "service mesh" concept pioneered by Linkerd, which consists of deploying tools to coordinate, route, and monitor clusters of interconnected containers. Tools like Istio and Conduit are designed to apply cluster-wide policies to determine who can talk to what and how. Istio, for example, can progressively deploy containers across the cluster to send only a certain percentage of traffic to newly deployed code, which allows detection of regressions. There is also work being done to ensure standard end-to-end encryption and authentication of containers in the SPIFFE project, which is useful in environments with untrusted networks.

Another issue is that Kubernetes is just a set of nuts and bolts to manage containers: users get all the parts and it's not always clear what to do with them to get a platform matching their requirements. It will be interesting to see how the community moves forward in building higher-level abstractions on top of it. Several tools competing in that space were featured at the conference: OpenShift, Tectonic, Rancher, and Kasten, though there are many more out there.

The 1.9 Kubernetes release should be coming out in early 2018; it will stabilize the Workloads API that was introduced in 1.8 and add Windows containers (for those who like .NET) in beta. There will also be three KubeCon conferences in 2018 (in Copenhagen, Shanghai, and Seattle). Stay tuned for more articles from KubeCon Austin 2017 ...

This article first appeared in the Linux Weekly News.

November 2017 report: LTS, standard disclosure, Monkeysphere in Python, flash fraud and Goodbye Drupal

Anarcat - jeu, 11/30/2017 - 17:33
Debian Long Term Support (LTS)

This is my monthly Debian LTS report. I didn't do as much as I wanted this month so a bunch of hours are reported to next month. I got frustrated by two difficult packages: exiv2 and libreoffice.

Exiv2

For Exiv2 I first reported the issues upstream as requested in the original CVE assignment. Then I went to see if I could reproduce the issue. Valgrind didn't find anything, so I went on to test the new ASAN instructions that tell us how to build for ASAN in LTS. Turns out that I couldn't make that work either. Fortunately, Roberto was able to build the package properly and confirmed the wheezy version as non-vulnerable, so I marked the three CVEs as not-affected and moved on.

Libreoffice

Next up was LibreOffice. I started backporting the patches to wheezy, which was a little difficult because any error in the backport takes hours to find, LibreOffice being so big. The monster takes about 4 hours to build on my i3-6100U processor - I can't imagine how long that would take on a slower machine. Still, I managed to get patches that mostly build. I say mostly because while most of the code builds, the tests fail to build. Not only do they fail to build, but they even segfault the linker. At that point, I had already spent so many hours in this frustrating "work / build-wait / crash" loop that I gave up.

I also worked on reproducing a supposed regression associated with the last security update. Somehow, I couldn't reproduce it either - the description of the regression was very limited and all suggested approaches failed to trigger the problems described.

OptiPNG

Finally, a little candy: an easy backport of a simple 2-line patch for OptiPNG, a simple program that, ironically, had a vulnerability (CVE-2017-16938) in GIF parsing. I could do hundreds of those breezy updates, they are fun and simple, and easy to test. This resulted in the trivial DLA-1196-1.

Miscellaneous

LibreOffice stretched the limits of my development environment. I had to figure out how to deal with out-of-space conditions in the build tree (/build), something that is really not obvious in sbuild. I ended up documenting that in a new troubleshooting section in the wiki.

Other free software work

feed2exec

I pushed forward with the development of my programmable feed reader, feed2exec. Last month I mentioned I released the 0.6.0 beta: since then 4 more releases were published, and we are now at the 0.10.0 beta. This added a bunch of new features:

  • wayback plugin to save feed items to the Wayback Machine on archive.org
  • archive plugin to save feed items to the local filesystem
  • transmission plugin to save RSS Torrent feeds to the Transmission torrent client
  • vast expansion of the documentation, now hosted on ReadTheDocs. The design was documented with a tour of the source code, and plugin-writing instructions were added to the documentation, also shipped as a feed2exec-plugins manpage.
  • major cleanup and refactoring of the codebase, including standard locations for the configuration files, which moved

The documentation deserves special mention. If you compare version 0.6 with the latest version, you can see 4 new sections:

  • Plugins - extensive documentation on plugin use, the design of the plugin system, and a full tutorial on how to write new plugins. The tutorial was written while writing the archive plugin, which was created as an example just for that purpose and should be readable by novice programmers
  • Support - a handy guide on how to get technical support for the project, copied over from the Monkeysign project.
  • Code of conduct - was originally part of the contribution guide. The idea then was to force people to read the Code when they wanted to contribute, but it wasn't a good idea. The contribution page was overloaded and critical parts were hidden down in the page, after what is essentially boilerplate text. Conversely, the Code was itself hidden in the contribution guide. Now it is clearly visible from the top and trolls will see this is an ethical community.
  • Contact - another idea from the Monkeysign project. It became essential when the security contact was added (see below).

All those changes were backported in the ecdysis template documentation and I hope to backport them back into my other projects eventually. As part of my documentation work, I also drifted into the Sphinx project itself and submitted a patch to make manpage references clickable as well.

I now use feed2exec to archive new posts on my website to the internet archive, which means I have an ad-hoc offsite backup of all content I post online. I think that's pretty neat. I also leverage the Linkchecker program to look for dead links in new articles published on the site. This is possible thanks to an Ikiwiki-specific filter to extract links to changed files from the Recent Changes RSS feed.

I'm considering making the parse step of the program pluggable. This is an idea I had in mind for a while, but it didn't make sense until recently. I described the project and someone said "oh that's like IFTTT", a tool I wasn't really aware of, which connects various "feeds" (Twitter, Facebook, RSS) to each other, using triggers. The key concept here is that feed2exec could be made to read from Twitter or other feeds, like IFTTT does, and not just write to them. This could allow users to bridge social networks by writing only to a single one and broadcasting to the other ones.

Unfortunately, this means adding a lot of interface code and I do not have a strong use case for this just yet. Furthermore, it may mean switching from a "cron" design to a more interactive, interrupt-driven design that would run continuously and wake up on certain triggers.

Maybe that could come in a 2.0 release. For now, I'll see to it that the current codebase is solid. I'll release a 0.11 release candidate shortly, which has seen a major refactoring since 0.10. I again welcome beta testers and users to report their issues. I am happy to report I got and fixed my first bug report on this project this month.

Towards standard security disclosure guidelines

When reading the excellent State of Opensource Security report, some metrics caught my eye:

  • 75% of vulnerabilities are not discovered by the maintainer

  • 79.5% of maintainers said that they had no public-facing disclosure policy in place

  • 21% of maintainers who do not have a public disclosure policy have been notified privately about a vulnerability

  • 73% of maintainers who do have a public disclosure policy have been notified privately about a vulnerability

In other words, having a public disclosure policy more than triples your chances of being notified of a security vulnerability. I was also surprised to find that 4 out of 5 projects do not have such a page. Then I realized that none of my projects had such a page, so I decided to fix that and fix my documentation templates (the infamously named ecdysis project) to specifically include a section on security issues.

I found that the HackerOne disclosure guidelines were pretty good, except they require having a bounty for disclosure. I understand it's part of their business model, but I have no such money to give out - in fact, I don't even pay myself for the work of developing the project, so I don't quite see why I would pay for disclosures.

I also found that many projects include OpenPGP key fingerprints in their contact information. I find that's a little silly: project documentation is no place to offer OpenPGP key discovery. If security researchers cannot find and verify OpenPGP key fingerprints, I would be worried about their capabilities. Adding a fingerprint or key material is just bound to create outdated documentation when maintainers rotate. Instead, I encourage people to use proper key discovery mechanisms like the Web of trust, WKD, or obviously TOFU, which is basically what publishing a fingerprint does anyway.
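
As an example, WKD discovery needs nothing more than an email address. This is a sketch, assuming GnuPG 2.1.12 or later and a placeholder address:

    # fetch a key through the Web Key Directory of the address's domain
    gpg --auto-key-locate clear,wkd --locate-keys security@example.org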

Git-Mediawiki

After being granted access to the Git-Mediawiki project last month, I got to work. I fought hard with Travis, Git, Perl, and MediaWiki to add continuous integration in the repository. It turns out the MediaWiki remote doesn't support newer versions of MediaWiki because the authentication system changed radically. Even though there is supposed to be backwards-compatibility, it doesn't seem to really work in my cases, which means that any test case that requires logging into the wiki fails. This also required using an external test suite instead of the one provided by Git, which insists on building Git before being used at all. I ended up using the Sharness project and submitted a few tests that were missing.

I also:

I also opened a discussion regarding the versioning scheme and started tracking issues that would be part of the next release. I encourage people to get involved in this project if they are interested: the door is wide open for contributors, of course.

Rewriting Monkeysphere in Python?

After being told one too many times that Monkeysphere doesn't support elliptic curve algorithms, I finally documented the issue and looked into solutions. It turns out that a key part of Monkeysphere is a Perl program written using a library that doesn't support ECC itself, which makes this problem very hard to solve. So I looked into the PGPy project to see if it could be useful and it turns out that ECC support already exists - the only problem is that the specific curve GnuPG uses, ed25519, was missing. The author fixed that but the fix requires a development version of OpenSSL, because even the 1.1 release doesn't support that curve.

I nevertheless went ahead to see how hard it would be to re-implement that key component of Monkeysphere in Python, and ended up with monkeypy, an analysis of the Monkeysphere design and first prototype of a Python-based OpenPGP / SSH conversion tool. This led to a flurry of issues on the PGPy project to fix UTF-8 support, easier revocation checks and a few more. I also reviewed 9 pull requests that were pending in the repository. A key piece missing is keyserver interaction and I made a first prototype of this as well.

It was interesting to review Monkeysphere's design. I hope to write more about this in the future, especially about how it could be improved, and find the time to do this re-implementation which could mean wider adoption of this excellent project.

Goodbye, Drupal

I finally abandoned all my Drupal projects, except the Aegir-related ones, which I somewhat still feel attached to, even though I do not do any Drupal development at all anymore. I do not use Drupal anywhere (except maybe as a visitor on some websites), I do not manage Drupal or Aegir sites anymore, and in fact, I have completely lost interest in writing anything remotely close to PHP.

So it was time to go. I have been involved with Drupal for a long time: I registered on Drupal.org in March 2001, almost 17 years ago. I was active for about 12 years on the site: I made my first post in 2003, which showed I was already "hacking core", which back then was the way to go. My first commit, to the mailman_api project, was also in 2003, and I kept contributing until my last commit in 2015, on the Aegir project of course.

That is probably the longest time I have spent on any software project, with the notable exception of the Debian project. I had a lot of fun working with Drupal: I met amazing people, traveled all over the world and made powerful websites that shook the world. Unfortunately, the Drupal project has decided to go in a direction I cannot support: Drupal 8 is a monster that is beyond anything needed for me, my friends or the organizations I support. It may be all very well for startups and corporations creating large sites, but it seems it completely lost touch with its roots. Drupal used to be a simple tool for "community plumbing" (ref), it is now a behemoth to "Launch, manage, and scale ambitious digital experiences—with the flexibility to build great websites or push beyond the browser" (ref).

When I heard a friend had his small artist website made in Drupal, my first questions were "which version?" and "who takes care of the upgrades?" Those questions still don't seem to be resolved in the community: a whole part of the project is still stuck in the old Drupal 7 world, years after the first D8 alpha releases. Drupal 8 doesn't address the fundamental issue of upgradability and cost: Drupal sites are bigger than they ever were. The Drupal community seems to be following a very different way of thinking (notice the Amazon Echo device there) than what I hope the future of the Internet will be.

Goodbye, Drupal - it's been a great decade, but I'm already gone and I'm not missing you at all.

Fixing fraudulent memory cards

I finally got around to testing the interesting f3 project with a real use case, which allowed me to update the stressant documentation with actual real-world results. A friend told me the memory card he put in his phone had trouble with some pictures he was taking with the phone: they would show up as "gray" when he would open them, yet the thumbnail looked good.

It turns out his memory card wasn't really 32GB as advertised, but only 16GB. Any write sent to the card above that 16GB barrier was just discarded and reads were returned as garbage. But thanks to the f3 project it was possible to fix that card so he could use it normally.
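
For the record, the f3 workflow on such a card looks roughly like this. It is only a sketch: the device name and sector number are placeholders, and f3probe's destructive mode overwrites data on the card, so double-check the target first:

    # probe the card to measure how much of it is real storage (overwrites data)
    f3probe --destructive --time-ops /dev/sdX
    # re-partition the card so only the real capacity is exposed,
    # using the last usable sector reported by f3probe
    f3fix --last-sec=30802943 /dev/sdX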

In passing, I worked on the f3 project to merge the website with the main documentation, again using the Sphinx project to generate more complete documentation. The old website is still up, but hopefully a redirection will be installed soon.

Miscellaneous
  • quick linkchecker patch, still haven't found the time to make a real release

  • tested the goaccess program to inspect web server logs and get some visibility on the traffic to this blog. found two limitations: some graphs look weird and would benefit from a logarithmic scale, and it would be very useful to have plain text reports. otherwise this makes for a nice replacement for the aging analog program I am currently using, especially as a sysadmin quick diagnostic tool. for now I use this recipe to send an HTML report by email every week (a rough sketch appears after this list)

  • found out about the mpd/subsonic bridge and sent a few issues about problems I found. turns out the project is abandoned and this prompted the author to archive the repository. too bad...

  • uploaded a new version of etckeeper into Debian, and started an upstream discussion about what to do with the pile of patches waiting there

  • uploaded a new version of the also-aging smokeping in Debian to fix a silly RC bug about the rename dependency

  • GitHub also says I "created 7 repositories", "54 commits in 8 repositories", "24 pull requests in 11 repositories", "reviewed 19 pull requests in 9 repositories", and "opened 23 issues in 14 repositories".

  • blocked a bunch of dumb SMTP bots attacking my server using this technique. this gets rid of those messages I was getting many times per second in the worst times:

    lost connection after AUTH from unknown[180.121.131.155]
  • happily noticed that fail2ban gained the possibility of incremental ban times in the upcoming version 0.11, which I documented on Serverfault.com. unfortunately, the Debian package is out of date for now...

  • tested Firefox Quantum with great success: all my extensions are supported and it is much, much faster than it was. I do miss the tab list drop-down which somehow disappeared and I also miss a Debian package. I am also concerned about the maintainability of Rust in Debian, but we'll see how that goes.
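
The goaccess recipe mentioned earlier in this list boils down to something like the following sketch; the log path, log format, recipient, and the use of mutt for HTML mail are all assumptions that depend on the web server and mailer at hand:

    # generate a standalone HTML report from the web server logs
    goaccess /var/log/apache2/access.log --log-format=COMBINED -o /tmp/report.html
    # mail the report; mutt can be coaxed into sending it as HTML
    mutt -e "set content_type=text/html" -s "weekly web stats" me@example.com < /tmp/report.html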

I spent only 30% of this month on paid work.

Montréal-Python 68: Wysiwyg Xylophone

Montreal Python - mar, 11/14/2017 - 00:00

Please RSVP on our meetup event

When

November 20th at 6:00PM

Where

Google Montréal 1253 McGill College #150 Montréal, QC

We thank Google Montreal for sponsoring MP68

Schedule
  • 6:00PM - Doors open
  • 6:30PM - Talks
  • 7:30PM - Break
  • 7:45PM - Talk
  • 8:30-9:00PM - End of event
Presentations

Va debugger ton Python! - Stéphane Wirtel

This presentation covers the basics of Pdb as well as GDB, to make debugging your Python scripts easier.

Writing a Python interpreter in Python from scratch - Zhentao Li

I will show a prototype of a Python interpreter written entirely in Python itself (that isn't Pypy).

The goal is to have simpler internals to allow experimenting with changes to the language more easily. This interpreter has a small core with much of the library modifiable at run time for quickly testing changes. This differs from Pypy that aimed for full Python compatibility and speed (from JIT compilation). I will show some of the interesting things that you can do with this interpreter.

This interpreter has two parts: a parser to transform source to an Abstract Syntax Tree and a runner for traversing this tree. I will give an overview of how both parts work and discuss some challenges encountered and their solutions.

This interpreter makes use of very few libraries, and only those included with CPython.

This project is looking for members to discuss ways of simplifying parts of the interpreter (among other things).

The talk would be about Rasa, an open-source chatbot platform - Nathan Zylbersztejn

Most chatbots rely on external APIs for the cool stuff such as natural language understanding (NLU) and disappoint because if and else conditionals fail at delivering good representations of our non-linear human way to converse. Wouldn’t it be great if we could 1) take control of NLU and tweak it to better fit our needs and 2) really apply machine learning, extract patterns from real conversations, and handle dialogue in a decent manner? Well, we can, thanks to Rasa.ai. It’s open-source, it’s in Python, and it works.

About Nathan Zylbersztejn:

Nathan is the founder of Mr. Bot, a dialogue consulting agency in Montreal with clients in the media, banking and aerospace industries. He holds a master's degree in economics, a graduate diploma in computer science, and a machine learning nanodegree.

ROCA: Return Of the Coppersmith Attack

Anarcat - lun, 11/13/2017 - 19:00

On October 30, 2017, a group of Czech researchers from Masaryk University presented the ROCA paper at the ACM CCS Conference, where it earned the Real-World Impact Award. We briefly mentioned ROCA when it was first reported but haven't dug into details of the vulnerability yet. Because of its far-ranging impact, it seems important to review the vulnerability in light of the new results published recently.

ROCA and its impacts

As we all probably know, most modern cryptography is based on the fundamental assumption that finding the prime factors of a large number is much harder than generating that number from two large primes. For this assumption to hold, however, the prime numbers need to be randomly chosen, otherwise it becomes possible to guess those numbers more easily. The prime generation process can take time, especially on embedded devices like cryptographic tokens.

The ROCA vulnerability occurred because Infineon, a popular smartcard manufacturer, developed its own proprietary algorithm based on the fast prime technique. If used correctly, fast prime allows the creation of randomly distributed primes faster than traditional methods, which generally consist of generating random numbers and testing for primality. The ROCA paper shows that Infineon goofed on the implementation and the resulting primes were not evenly distributed. This opened the possibility of recovering private keys from public key material in a reasonable time frame.

Private key recovery is one of the worst security vulnerabilities (short of direct cleartext recovery) that can happen in asymmetric crypto systems. It turns out that Infineon is one of only three secure chip manufacturers and therefore its chips are everywhere: in cryptographic keycards (e.g. the popular Yubikey 4) but also Trusted Platform Modules (TPMs) that live in most modern computers; even some official identity cards, which are used in everything from banking to voting, have Infineon chips. So the impact of this vulnerability is broad: medical records, voting, OpenPGP signatures and encryption, Digital Rights Management (DRM), full disk encryption, and secure boot; all become vulnerable if the wrong keycard generated the private key material.
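
For those wondering about their own keys, the researchers published detection tools that recognize the statistical fingerprint of the flawed generator. As a sketch, and assuming the Python package and command are both named roca-detect (my recollection, not something stated in this article), a check looks like this:

    # install the detector and test a public key for the Infineon fingerprint
    pip install roca-detect
    roca-detect ~/.ssh/id_rsa.pub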

Hacking the Estonian elections

Let's take an extreme example of identity theft to illustrate the severity of the ROCA vulnerability. Estonia used Infineon chips in its state-issued identity cards. Because those cards are used in its electronic voting system, it was speculated that votes could be forged in an election. Indeed, there was a parliamentary election in 2015, at a time when vulnerable cards were in the wild.

The Estonian government claims that leveraging the attack to commit electoral fraud in Estonia is "complicated and not cheap". This seems to rely on an unnamed expert quoted as saying it would "cost approximately €60 billion" to mount such an attack, a number found in this article published by the Estonian state media (ERR). Indeed, estimates show that cracking a single key would cost €80,000 in cloud computing costs. Since then, however, the vulnerability was also reviewed by cryptographers Daniel J. Bernstein and Tanja Lange, who found that it was possible to improve the performance of the attack: they stopped after a 25% improvement, but suspect even further speed-ups are possible. So let's see what effect those numbers would have in an election now.

There are 750,000 vulnerable cards, according to ERR, out of the 1.3 million cards currently in circulation, according to the National Electoral Committee. There were around 900,000 eligible voters in the 2015 parliamentary elections, about 600,000 of those voted, and about 30% of voters cast their votes electronically. So, assuming the distribution of compromised cards is uniform among voters, roughly 60% of the electronic votes (the share of compromised cards in circulation) could have been cast with vulnerable cards, which works out to about 17% of the total votes.
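As a quick back-of-the-envelope check of that estimate (a rough sketch using only the round figures quoted above):

# rough sanity check of the "17% of votes" estimate, using the figures above
vulnerable_cards = 750_000
cards_in_circulation = 1_300_000
votes_cast = 600_000
evote_share = 0.30                                            # electronic votes

compromised_share = vulnerable_cards / cards_in_circulation   # ~58%
at_risk_votes = votes_cast * evote_share * compromised_share  # ~104,000 votes
print(f"{at_risk_votes / votes_cast:.0%} of all votes")       # ~17%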

The 2015 election was pretty close: the Estonian Reform Party beat its closest rival, the Estonian Centre Party, by only 5%. So, it could have been possible to affect the result of that election, even without compromising all the cards. Bernstein and Lange were actually generous when they said that "'large-scale vote fraud' does not require breaking all of the ID cards; 10% already buys massive influence".

In fact, according to their numbers, the €80,000 required to break one card can be reduced by a factor of four thanks to various performance improvements and by another factor of ten using dedicated hardware, which means we can expect targeted attacks to be 40 times cheaper than the original paper's estimate, bringing the cost of one vote down to around €2,000. An Ars Technica article quoted another researcher as saying: "I'm not sure whether someone can slash the cost of one key below $1,000 as of today, but I certainly see it as a possibility."

So, being generous with the exchange rates for the sake of convenience, we could use a price per vote range of about €1,000-2,000. This means that buying the 5% of the votes necessary to win that 2015 election (or around 30,000 votes) would cost €30-60 million, which is a much lower number than originally announced and definitely affordable for a hostile foreign-state actor.
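The same napkin math can be applied to the cost of buying that winning margin; again, this is only a rough sketch built from the figures quoted above:

# cost of buying the 2015 winning margin, using the per-key figures above:
# EUR 80,000 originally, divided by 4 (algorithmic improvements) and then
# by 10 (dedicated hardware)
cost_per_vote = 80_000 / 4 / 10         # ~EUR 2,000 per compromised vote
margin_votes = int(600_000 * 0.05)      # ~30,000 votes to cover the 5% gap
total = cost_per_vote * margin_votes
print(f"EUR {cost_per_vote:,.0f} per vote, EUR {total / 1e6:.0f} million total")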

Now, I am not saying that this happened during the 2015 Estonian election. I am merely pointing out that the resources required to buy votes (one way or another) are much cheaper than previously thought. And while the Electoral Committee released the server-side source code in 2013, that wasn't where the ROCA problem lies. The vulnerability, this time, is client-side: the identity cards based on proprietary hardware. Fortunately, the Estonian authorities took action to block the Infineon cards on November 3 and issue new certificates for the vulnerable cards in a state-wide replacement program. So far, there is no evidence of foul play or identity theft, according to state officials.

The attack and disclosure

ROCA is an acronym for "Return Of the Coppersmith Attack" which, in turn, refers to a class of attacks on RSA that exploit partial knowledge of the secret key material to recover the key in less than brute-force time. The actual mathematical details are available in the full paper [PDF] and go beyond my modest mathematical knowledge, but it certainly seems like Infineon took shortcuts in designing its random number generator. What the Czech researchers found is that the primes generated by the Infineon algorithm had a specific structure that made them easier to guess than randomly distributed primes. Indeed, all primes generated by RSALib, the library Infineon created for those chips, followed this pattern:

p = k ∗ M + (65537^a mod M)

Here p is one of the secret primes used to construct the public key. The secrets in the equation are k and a, while M varies depending on the chosen key size. The problem is that M is disproportionately large: close enough to the size of p that k and a end up small. And this is where the Coppersmith method comes in: it relies on small integers being part of the equation generating the prime. It is also where I get lost in the math; interested readers can read the papers to find out more.
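Still, the structural weakness itself is easy to illustrate. Here is a toy sketch (with a tiny M, nothing like the real RSALib parameters) showing that, whatever k and a are, a prime of this form is confined, modulo M, to the small subgroup generated by 65537; the same holds for the public modulus N = p*q, which is essentially what the ROCA fingerprint test looks for:

# toy illustration of the RSALib structure p = k*M + (65537^a mod M); the M
# used here is only the product of the first twelve primes, the real one is a
# much larger primorial, but the structural point is the same
from functools import reduce
from operator import mul
from random import randrange

small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
M = reduce(mul, small_primes)           # a (tiny) primorial

# enumerate the multiplicative subgroup generated by 65537 modulo M
subgroup = {1}
x = 65537 % M
while x != 1:
    subgroup.add(x)
    x = (x * 65537) % M
print(f"M = {M}, subgroup size = {len(subgroup)}")   # ~7.4e12 vs only 27,720

# whatever k and a are, a candidate of the RSALib form lands in that small
# subgroup modulo M (RSALib would retry k until the candidate is prime)
a = randrange(len(subgroup))
k = randrange(1, 2**20)
candidate = k * M + pow(65537, a, M)
assert candidate % M in subgroup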

Something interesting here is that Bernstein and Lange were able to reproduce the results before the paper was publicly released, using only information disclosed as part of the public advisory. As such, they raise the interesting question of whether delaying publication of the actual paper helped in containing the security vulnerability, since they were able to construct an attack within about a week of part-time work. They noted that publicizing the keyword "Coppersmith" in the title of the research paper (which was public at the time of disclosure) was especially useful in confirming they were on the right path during their research. In that sense, Bernstein and Lange imply that delaying the publication of the paper had no benefit to the security community, and may have actually benefited attackers in a critical period, while impeding the work of honest security researchers.

They also questioned the unusually long delay given to Infineon (eight months) before disclosure. Usually, "responsible disclosure" ranges from a few days to several months. For example, Google's Project Zero team gives projects 90 days to fix problems before it publicizes the results. The long delay was presumably given to provide ample time for companies and governments to organize mitigation measures. Yet, when the paper was finally released, some companies (e.g. Gemalto) still weren't clear if their products were affected. According to Ars Technica, Gemalto's customers, including Microsoft, which uses the products for two-factor authentication, are still left wondering whether their products are secure. Similarly, Estonia recently scrambled to suspend all affected cards, earlier than originally planned, even though it was notified of the vulnerability back in August. It seems the long delay didn't work: stakeholders did not make those changes promptly and thousands if not millions of devices are still vulnerable at the time of writing.

Part of the problem is that a lot of those devices simply cannot be upgraded in the field. For example, Yubico forbids firmware changes to the Yubikey 4, so the company had to set up a free replacement program for the popular keycards. TPMs, fortunately, are often flashable, but those updates need to come from the manufacturer, which may be hard to arrange. Most often, fixes are just not deployed at all, because supply chains make it difficult to find the original manufacturer that can provide them. In fact, the ROCA paper described the impact evaluation as "far from straightforward", partly because secure hardware is often designed as an opaque system that is deliberately hard to evaluate. In particular, the researchers stated that "prevalence of the RSALib in the area of programmable smartcards is notoriously difficult to estimate". The researchers further explained that we may deal with the fallout of ROCA for a long time:

We expect to see the cards for a rather long time (several years) before all the vulnerable cards are eventually sold out, especially when dealing with low volume markets. The buyers should check the cards for the presence of fingerprinted keys and/or opt for longer key lengths if they are supported by the card hardware.

The ROCA paper also noted that medical records identification cards and electronic payment cards (EMV) may be vulnerable to this issue as well, although they were not aware of publicly available sources for the public keys for those devices—a necessary condition to mount an attack.

Yet the researchers are confident their disclosure led to good results. In an email interview, Petr Svenda said:

The main driver for us to publicly announce the vulnerability including details is to force companies to really act. This hopefully happened (new firmware for TPMs, updates to Bitlocker, revocation of eID keys...)

Svenda also believes the security vulnerability is "more likely to be unintentional" because the prime structure leak was "so obvious" when compared to other cryptographic vulnerabilities like Dual_EC_DRBG. He pointed to some of the recommendations from the paper:

  1. Don't keep the design secret and have verifiable implementation

  2. Use some kind of secure multiparty computation to remove single points of failure

The latter idea is novel for me, but was proposed in academic literature back in 1997. The idea is to not rely solely on one algorithm or device to generate primes but collaboratively generate them among multiple parties. This makes vulnerabilities like ROCA much less likely because the same mistake would need to happen among all the components for the attack to be effective.

Obviously, you can also generate private keys on another computer and move them to the keycard later. But, as the Debian OpenSSL debacle showed, desktop computers are not immune to this type of vulnerability either.

The ROCA paper highlights the "dangers of keeping the design secret and the implementation closed-source, even if both are thoroughly analyzed and certified by experts". They were also critical of the certification process which "counter-intuitively 'rewards' the secrecy of design by additional certification 'points' when an implementation is difficult for potential attackers to obtain - thus favoring security by obscurity."

The paper concludes by stating that "relevant certification bodies might want to reconsider such an approach in favor of open implementations and specifications." This is a similar conclusion to the one I reached in my comparison of cryptographic tokens: we should be able to design secure, yet open, hardware and hopefully these kinds of vulnerabilities will serve as a lesson for the future.

This article first appeared in the Linux Weekly News.

Categories: External Blogs

Montréal-Python 68: Call For Speakers

Montreal Python - mer, 11/08/2017 - 00:00

We are looking for speakers that want to give a regular presentation (20 to 25 minutes) or a lightning talk (5 minutes).

Submit your proposal at team@montrealpython.org

When

November 20th, 2017
6PM to 9PM

Where

Google Montréal
1253 McGill College Ave #150
H3B 2Y5
https://goo.gl/maps/oNruyD2oVbq

Categories: External Blogs


October 2017 report: LTS, feed2exec beta, pandoc filters, git mediawiki

Anarcat - jeu, 11/02/2017 - 11:12
Debian Long Term Support (LTS)

This is my monthly Debian LTS report. This time I worked on the famous KRACK attack, git-annex, golang and the continuous stream of GraphicsMagick security issues.

WPA & KRACK update

I spent most of my time this month on the Linux WPA code, to backport it to the old (~2012) wpa_supplicant release. I first published a patchset based on the patches shipped after the embargo for the oldstable/jessie release. After feedback from the list, I also built packages for i386 and ARM.

I have also reviewed the WPA protocol to make sure I understood the implications of the changes required to backport the patches. For example, I removed the patches touching the WNM sleep mode code as that was introduced only in the 2.0 release. Chunks of code regarding state tracking were also not backported as they are part of the state tracking code introduced later, in 3ff3323. Finally, I still have concerns about the nonce setup in patch #5. In the last chunk, you'll notice peer->tk is reset in order to negotiate a new TK. The other approach I considered was to backport 1380fcbd9f ("TDLS: Do not modify RNonce for an TPK M1 frame with same INonce") but I figured I would play it safe and not introduce further variations.

I should note that I share Matthew Green's observations regarding the opacity of the protocol. Normally, network protocols are freely available and security researchers like me can easily review them. In this case, I would have needed to read the opaque 802.11i-2004 pdf which is behind a TOS wall at the IEEE. I ended up reading up on the IEEE_802.11i-2004 Wikipedia article which gives a simpler view of the protocol. But it's a real problem to see such critical protocols developed behind closed doors like this.

At Guido's suggestion, I sent the final patch upstream explaining the concerns I had with the patch. I have not, at the time of writing, received any response from upstream about this, unfortunately. I uploaded the fixed packages as DLA 1150-1 on October 31st.

Git-annex

The next big chunk on my list was completing the work on git-annex (CVE-2017-12976) that I started in August. It turns out doing the backport was simpler than I expected, even with my rusty experience with Haskell. Type-checking really helps in doing the right thing, especially considering how Joey Hess implemented the fix: by introducing a new type.

So I backported the patch from upstream and notified the security team that the jessie and stretch updates would be similarly easy. I shipped the backport to LTS as DLA-1144-1. I also shared the updated packages for jessie (which required a similar backport) and stretch (which didn't), which Sebastien Delafond published as DSA 4010-1.

Graphicsmagick

Up next was yet another security vulnerability in the Graphicsmagick stack. This involved the usual deep dive into intricate and sometimes just unreasonable C code to try and fit a round tree in a square sinkhole. I'm always unsure about those patches, but the test suite passes, smoke tests show the vulnerability as fixed, and that's pretty much as good as it gets.

The announcement (DLA 1154-1) turned out to be a little special because I had previously noticed that the penultimate announcement (DLA 1130-1) was never sent out. So I made a merged announcement to cover both instead of re-sending the original 3 weeks late, which may have been confusing for our users.

Triage & misc

We always do a bit of triage even when not on frontdesk duty, so I:

I also did smaller bits of work on:

The latter reminded me of the concerns I have about the long-term maintainability of the golang ecosystem: because everything is statically linked, an update to a core library (say the SMTP library as in CVE-2017-15042, thankfully not affecting LTS) requires a full rebuild of all packages that include the library, in all distributions. What would be a simple update in a shared library system becomes an explosion of work on statically linked infrastructures. This is a lot of work which can definitely be error-prone: as I've seen in other updates, some packages (for example the Ruby interpreter) just bit-rot on their own and eventually fail to build from source. We would also have to investigate all packages to see which ones include the library, something we are not well equipped for at this point.

Wheezy was the first release shipping golang packages but at least it's shipping only one... Stretch has shipped with two golang versions (1.7 and 1.8) which will make maintenance ever harder in the long term.

We build our computers the way we build our cities--over time, without a plan, on top of ruins. - Ellen Ullman

Other free software work

This month again, I was busy doing some serious yak shaving operations all over the internet, on top of publishing two of my largest LWN articles to date (2017-10-16-strategies-offline-pgp-key-storage and 2017-10-26-comparison-cryptographic-keycards).

feed2exec beta

Since I announced this new project last month, I have released it as a beta and it entered Debian. I have also written useful plugins like the wayback plugin that saves pages on the Wayback Machine for eternal archival. The archive plugin can similarly save pages to the local filesystem. I also added bash completion, expanded unit tests and documentation, fixed default file paths and a bunch of bugs, and refactored the code. Finally, I started using two external Python libraries instead of rolling my own code: pyxdg and requests-file, the latter of which I packaged in Debian (and fixed a bug in its test suite).

The program is working pretty well for me. The only thing I feel is really missing now is a retry/fail mechanism. Right now, it's a little brittle: any network hiccup will yield an error email, which is readable to me but could be confusing to a new user. Strangely enough, I am having particular trouble with (local!) DNS resolution that I need to look into, but that is probably unrelated to the software itself. Thankfully, the user can silence those WARNINGs with --loglevel=ERROR.

Furthermore, some plugins still have rough edges. For example, the Transmission integration would probably work better as a distinct plugin instead of a simple exec call, because when it adds new torrents, the output is totally cryptic. That plugin could also leverage more feed parameters to save different files in different locations depending on the feed titles, something that would be hard to do safely with the exec plugin as it stands.

I am keeping a steady flow of releases. I wish there was a way to see how effective I am at reaching out with this project, but unfortunately GitLab doesn't provide usage statistics... And I have received only a few comments on IRC about the project, so maybe I need to reach out more like it says in the fine manual. Always feels strange to have to promote your project like it's some new bubbly soap...

The next steps for the project are a final review of the API and a production-ready 1.0.0 release. I am also thinking of making a small screencast to show the basic capabilities of the software, maybe with asciinema's upcoming audio support?

Pandoc filters

As I mentioned earlier, I dove into Haskell programming again when working on the git-annex security update. But I also have a small Haskell program of my own - a Pandoc filter that I use to convert the HTML articles I publish on LWN.net into an Ikiwiki-compatible markdown version. It turns out the script was still missing a bunch of stuff: image sizes, proper table formatting, etc. I also worked hard on automating more bits of the publishing workflow by extracting the time from the article, which allowed me to extract the full article into an almost final copy just by specifying the article ID. The only thing left is to add tags, and the article is complete.

In the process, I learned about new weird Haskell constructs. Take this code, for example:

-- remove needless blockquote wrapper around some tables
--
-- haskell newbie tips:
--
-- @ is the "at-pattern", allows us to define both a name for the
-- construct and inspect the contents at once
--
-- {} is the "empty record pattern": it basically means "match the
-- arguments but ignore the args"
cleanBlock (BlockQuote t@[Table {}]) = t

Here the idea is to remove <blockquote> elements needlessly wrapping a <table>. I can't specify the Table type on its own, because then I couldn't address the table as a whole, only its parts. I could reconstruct the whole table bit by bit, but it wasn't as clean.

The other pattern was how to, at last, address multiple string elements, which was difficult because Pandoc treats spaces specially:

cleanBlock (Plain (Strong (Str "Notifications":Space:Str "for":Space:Str "all":Space:Str "responses":_):_)) = []

The last bit that drove me crazy was the date parsing:

-- the "GAByline" div has a date, use it to generate the ikiwiki dates
--
-- this is distinct from cleanBlock because we do not want to have to
-- deal with time there: it is only here we need it, and we need to
-- pass it in here because we do not want to mess with IO (time is I/O
-- in haskell) all across the function hierarchy
cleanDates :: ZonedTime -> Block -> [Block]
-- this mouthful is just the way the data comes in from
-- LWN/Pandoc. there could be a cleaner way to represent this,
-- possibly with a record, but this is complicated and obscure enough.
cleanDates time (Div (_, [cls], _)
                 [Para [Str month, Space, Str day, Space, Str year], Para _])
  | cls == "GAByline" =
      ikiwikiRawInline (ikiwikiMetaField "date"
        (iso8601Format (parseTimeOrError True defaultTimeLocale "%Y-%B-%e,"
          (year ++ "-" ++ month ++ "-" ++ day) :: ZonedTime)))
      ++ ikiwikiRawInline (ikiwikiMetaField "updated" (iso8601Format time))
      ++ [Para []]
-- other elements just pass through
cleanDates time x = [x]

Now that seems just dirty, but it was even worse before. One thing I find difficult in adapting to coding in Haskell is that you need to get into the habit of writing smaller functions. The language is really not well adapted to long discourse: it's more about getting small things connected together. Other languages (e.g. Python) discourage this because there's some overhead in calling functions (10 nanoseconds in my tests, but still), whereas functions are a fundamental and important construct in Haskell and are much more heavily optimized. So I constantly need to remind myself to split things up early, otherwise I can't do anything in Haskell.
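For what it's worth, here is a minimal sketch of how such a call-overhead measurement can be done with the standard timeit module (not the exact benchmark I ran; numbers vary by machine and interpreter):

# estimate Python function call overhead by comparing a no-op statement
# against a call to a no-op function
import timeit

def f():
    pass

N = 10_000_000
inline = timeit.timeit("pass", number=N)
called = timeit.timeit("f()", globals={"f": f}, number=N)
print(f"~{(called - inline) / N * 1e9:.0f} ns per call")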

Other languages are more lenient, which does mean my code can be dirtier, but I feel I get things done faster that way. The oddity of Haskell makes it frustrating to work with. It's like doing construction work but not being allowed to get the floor dirty. When I build stuff, I don't mind things being dirty: I can clean up afterwards. This is especially critical when you don't actually know how to make things clean in the first place, as Haskell will simply not let you do that at all.

And obviously, I fought with Monads, or, more specifically, "I/O" or IO in this case. It turns out that getting the current time is IO in Haskell: indeed, it's not a "pure" function that will always return the same thing. But this means that I would have had to change the signature of all the functions that touched time to include IO. I eventually moved the time initialization up into main so that I had only one IO function, and passed that timestamp downwards as a simple argument. That way I could keep the rest of the code clean, which seems to be an acceptable pattern.

I would of course be happy to get feedback from my Haskell readers (if any) to see how to improve that code. I am always eager to learn.

Git remote MediaWiki

Few people know that there is a MediaWiki remote for Git which allows you to mirror a MediaWiki site as a Git repository. As a disaster recovery mechanism, I have been keeping such a historical backup of the Amateur radio wiki for a while now. This originally started as a homegrown Python script that also converted the contents to Markdown. My theory then was to see if we could switch from MediaWiki to Ikiwiki, but it took so long to implement that I never completed the work.

When someone had the weird idea of renaming a page to some impossibly long name on the wiki, my script broke. I tried to look at fixing it and then remembered I also had a mirror running using the Git remote. It turns out it also broke on the same issue, and that got me looking at the remote again. I got lost in a zillion issues, including fixing that specific one, but I especially looked at the possibility of fetching all namespaces because I realized that the remote fetches only a part of the wiki by default. That drove me to submit namespace support as a patch to the git mailing list. Eventually, the discussion came back to how to actually maintain that contrib: in git core or outside? It looks like I'll be doing some maintenance on that project outside of git, as I was granted access to the GitHub organisation...

Galore Yak Shaving

Then there's the usual hodgepodge of fixes and random things I did over the month.

There is no [web extension] only XUL! - Inside joke

Categories: External Blogs

Epic Lameness

Eric Dorland - lun, 09/01/2008 - 17:26
SF.net now supports OpenID. Hooray! I'd like to make a comment on a thread about the RTL8187se chip I've got in my new MSI Wind. So I go to sign in with OpenID and instead of signing me in it prompts me to create an account with a name, username and password for the account. Huh? I just want to post to their forum, I don't want to create an account (at least not explicitly, if they want to do it behind the scenes fine). Isn't the point of OpenID to not have to create accounts and particularly not have to create new usernames and passwords to access websites? I'm not impressed.
Categories: External Blogs

Sentiment Sharing

Eric Dorland - lun, 08/11/2008 - 23:28
Biella, I am from there and I do agree. If I was still living there I would try to form a team and make a bid. Simon even made noises about organizing a bid at DebConfs past. I wish he would :)

But a DebConf in New York would be almost as good.
Categories: External Blogs