Skip to main content

Feed aggregator

A report from Netconf: Day 1

Anarcat - Tue, 04/11/2017 - 12:00

As is becoming traditional, two times a year the kernel networking community meets in a two-stage conference: an invite-only, informal, two-day plenary session called Netconf, held in Toronto this year, and a more conventional one-track conference open to the public called Netdev. I was invited to cover both conferences this year, given that Netdev was in Montreal (my hometown), and was happy to meet the crew of developers that maintain the network stack of the Linux kernel.

This article covers the first day of the conference which consisted of around 25 Linux developers meeting under the direction of David Miller, the kernel's networking subsystem maintainer. Netconf has no formal sessions; although some people presented slides, interruptions are frequent (indeed, encouraged) and the focus is on hashing out issues that are blocked on the mailing list and getting suggestions, ideas, solutions, and feedback from their peers.

Removing ndo_select_queue()

One of the first discussions that elicited a significant debate was the ndo_select_queue() function, a key component of the Linux polling system that determines when and how to send packets on a network interface (see netdev_pick_tx and friends). The general question was whether the use of ndo_select_queue() in drivers is a good idea. Alexander Duyck explained that Intel people were considering using ndo_select_queue() for receive/transmit queue matching. Intel drivers do not currently use the hook provided by the Linux kernel and it turns out no one is happy with ndo_select_queue(): the heuristics it uses don't really please anyone. The consensus (including from Duyck himself) seemed to be that it should just not be used anymore, or at least not used for that specific purpose.

The discussion turned toward the wireless network stack, which uses it extensively, but for other purposes. Johannes Berg explained that the wireless stack uses ndo_select_queue() for traffic classification, for example to get voice traffic through even if the best-effort queue is backed up. The wireless stack could stop using it by doing flow control completely inside the wireless stack, which already uses the fq_codel flow-control mechanism for other purposes, so porting away from ndo_select_queue() seems possible there.

The problem then becomes how to update all the drivers to change that behavior, which would be a lot of work. Still, it seems people are moving away from a generic ndo_select_queue() interface to stack-specific or even driver-specific (in the case of Intel) queue management interfaces.

refcount_t followup

There was a followup discussion on the integration of the refcount_t type into the network stack, which we covered recently. This type is meant to be an in-kernel defense against exploits based on overflowing or underflowing an object's reference count.

The consensus seems to be that having refcount_t used for debugging is acceptable, but it cannot be enabled by default. An issue that was identified is that the networking developers are fairly sure that introducing refcount_t would have a severe impact on performance, but they do not have benchmarks to prove it, something Miller identified as a problem that needs to be worked on. Miller then expressed some openness to the idea of having it as a kernel configuration option.

A similar discussion happened, on the second day, regarding the KASan memory error detector which was covered when it was introduced in 2014. Eric Dumazet warned that there could be a lot of issues that cannot be detected by KASan because of the way the network stack often bypasses regular memory-allocation routines for performance reasons. He also noted that this can sometimes mean the stack may go over the regular 10% memory limit (the tcp_mem parameter, described in the tcp(7) man page) for certain operations, especially when rebuilding out of order packets with lots of parallel TCP connections.

Therefore it was proposed that these special memory recycling tricks could be optionally disabled, at run or compile-time, to instrument proper memory tracking. Dumazet argued this was a situation similar to refcount_t in that we need a way to disable high performance to make the network stack easier to debug with KAsan.

The problem with optional parameters is that they are often disabled in production or even by default, which, in turn, means that critical bugs cannot actually be found because the code paths are not tested. When I asked Dumazet about this, he explained that Google performs integration testing of new kernels before putting them in production, and those toggles could be enabled there to find and fix those bugs. But he agreed that certain code paths are then not tested until the code gets deployed in production.

So it seems the status quo remains: security folks wants to improve the reliability of the kernel, but the network folks can't afford the performance cost. Yet it was clear in the discussions that the team cares about security issues and wants those issues to be fixed; the impact of some of the solutions is just too big.

Lightweight wireless management packet access

Berg explained that some users need to have high-performance access to certain management frames in the wireless stack and wondered how to best expose those to user space. The wireless stack already allows users to clone a network interface in "monitor" mode, but this has a big performance cost, as the radiotap header needs to be constructed from scratch and the packet header needs to be copied. As wireless improves and the bandwidth rises to gigabit levels, this can become significant bottleneck for packet sniffers or reporting software that need to know precisely what's going on over the air outside of the regular access point client operation.

It seems the proper way to do this is with an eBPF program. As Miller summarized, just add another API call that allows loading a BPF program into the kernel and then those users can use a BPF filtering point to get the statistics they need. This will require an extra hook in the wireless stack, but it seems like this is the way that will be taken to implement this feature.

VLAN 0 inconsistencies

Hannes Frederic Sowa brought up the seemingly innocuous question of "how do we handle VLAN 0?" In theory, VLAN 0 means "no VLAN". But the Linux kernel currently handles this differently depending on whether the VLAN module is loaded and whether a VLAN 0 interface was created. Sometimes the VLAN tag is stripped, sometimes not.

It turns out the semantics of this were accidentally changed last time there was a change here and this was originally working but is now broken. Sowa therefore got the go-ahead to fix this to make the behavior consistent again.

Loopy fun

Then it came the turn of Jamal Hadi Salim, the maintainer of the kernel's traffic-control (tc) subsystem. The first issue he brought up is a problem in the tc REDIRECT action that can create infinite loops within the kernel. The problem can be easily alleviated when loops are created on the same interface: checks can be added that just drop packets coming from the same device and rate-limit logging to avoid a denial-of-service (DoS) condition.

The more serious problem occurs when a packet is forwarded from (say) interface eth0 to eth1 which then promptly redirects it from eth1 back to eth0. Obviously, this kind of problem can only be created by a user with root access so, at first glance, those issues don't seem that serious: admins can shoot themselves in the foot, so what?

But things become a little more serious when you consider the container case, where an untrusted user has root access inside a container and should have constrained resource limitations. Such a loop could allow this user to deploy an effective DoS attack against a whole group of containers running on the same machine. Even worse, this endless loop could possibly turn into a deadlock in certain scenarios, as the kernel could try to transmit the packet on the same device it originated from and block, progressively filling the queues and eventually completely breaking network access. Florian Westphal argued that a container can already create DoS conditions, for example by doing a ping flood.

According to Salim, this whole problem was created when two bits used for tracking such packets were reclaimed from the skb structure used to represent packets in the kernel. Those bits were a simple TTL (time to live) field that was incremented on each loop and dropped after a pre-determined limit was reached, breaking infinite loops. Salim asked everyone if this should be fixed or if we should just forget about this issue and move on.

Miller proposed to keep a one-behind state for the packet, fixing the simplest case (two interfaces). The general case, however, would requite a bitmap of all the interfaces to be scanned, which would impose a large overhead. Miller said an attempt to fix this should somehow be made. The root of the problem is that the network maintainers are trying to reduce the size of the skb structure, because it's used in many critical paths of the network stack. Salim's position is that, without the TTL fields, there is no way to fix the general case here, and this constitutes a security issue. So either the bits need to be brought back, or we need to live with the inherent DoS threat.

Dumping large statistics sets

Another issue Salim brought up was the question of how to export large statistics sets from the kernel. It turns out that some use cases may end up dumping a lot of data. Salim mentioned a real-world tc use case that calls for reading six-million entries. The current netlink-based API provides a way to get only 20 entries at a time, which means it takes forever to dump the state of all those policy actions. Salim has a patch that changes the dump size be eight times the NLMSG_GOOD_SIZE, which improves performance by an order of magnitude already, although there are issues with checking the user-space buffer size there.

But a more complete solution is needed. What Salim proposed was a way to ask only for the states that changed since the last dump was requested. He has a patch to add a last_access field to the netlink_callback structure used by netlink_dump() to output data; that raised the question of how to actually use that field. Since Salim fetches that data every five seconds, he figured he could just tell the kernel to return all the nodes that changed in that period. But then if the dump takes more than five seconds to complete, the next dump may be missing states that changed during the extra delay. An alternative mechanism would be for the user-space utility to keep the time stamp it requested and use that as a delta for the next dump.

It turns out this is a larger problem than just tc. Dumazet mentioned this was an issue with fq_codel classes: he would even like to be able to dump those statistics faster than every five seconds. Roopa Prabhu mentioned that Cumulus also has similar problems dumping stats from bridges, so clearly a more generic solution is needed here. There is, however, a fundamental problem with dumping large statistics sets from the kernel: those statistics are constantly changing while the dump is created and unless versioning or locking mechanisms are used — which would slow things down — the data returned is bound to be only an approximation of reality. Salim promised to send a set of RFC patches to further discussions regarding this issue, but during the following Netdev conference, Berg published a patch to fix this ten-year-old issue, which brought cheers from the audience.

The author would like to thank the Netconf and Netdev organizers for travel to, and hosting assistance in, Toronto. Many thanks to Berg, Dumazet, Salim, and Sowa for their time taken for a technical review of this article.

Note: this article first appeared in the Linux Weekly News.

Categories: External Blogs

A report from Netconf: Day 2

Anarcat - Tue, 04/11/2017 - 12:00

This article covers the second day of the informal Netconf discussions, held on on April 4, 2017. Topics discussed this day included the binding of sockets in VRF, identification of eBPF programs, inconsistencies between IPv4 and IPv6, changes to data-center hardware, and more. (See this article for coverage from the first day of discussions).

How to bind to specific sockets in VRF

One of the first presentations was from David Ahern of Cumulus, who presented a few interesting questions for the audience. His first was the problem of binding sockets to a given interface. Right now, there are four different ways this can be done:

  • the old SO_BINDTODEVICE generic socket option (see socket(7))
  • the IP_PKTINFO, IP-specific socket option (see ip(7)), introduced in Linux 2.2
  • the IP_UNICAST_IF flag, introduced in Linux 3.3 for WINE
  • the IPv6 scope ID suffix, part of the IPv6 addressing standard

So there's a problem of having too many ways of doing the same thing, something that cannot really be fixed without breaking ABI compatibility. But even worse, conflicts between those options are not reported by the kernel so it's possible for a user to set up socket flags in a way that certain flags override others and there are no checks made or errors reported. It was agreed that the user should get some notification of conflicting changes here, at least.

Furthermore, binding sockets to a specific VRF (Virtual Routing and Forwarding) device is not currently possible, so Ahern asked what the best way to do this would be, considering the many options available. A use case example is a UDP multicast socket that could be bound to a specific interface within a VRF.

This is an old problem: Tom Herbert explained that there were previous discussions about making the bind() system call more programmable so that, for example, you could bind() a UDP socket to a discrete list of IP addresses or a subnet. So he identified this issue as a broader problem that should be addressed by making the interfaces more generic.

Ahern explained that it is currently possible to bind sockets to the slave device of a VRF even though that should not be allowed. He also raised the question of how the kernel should tell which socket should be selected for incoming packets. Right now, there is a scoring mechanism for UDP sockets, but that cannot be used directly in this more general case.

David Miller said that there are already different ways of specifying scope: there is the VRF layer and the namespace ("netns") layer. A long time ago, Miller reluctantly accepted the addition of netns keys everywhere, swallowing the performance cost to gain flexibility. He argued that a new key should not be added and instead existing infrastructure should be reused. Herbert argued this was exactly the reason why this should be simplified: "if we don't answer the question, people will keep on trying this". For example, one can use a VRF to limit listening addresses, but it gets complicated if we need a device for every address. It seems the consensus evolved towards using, IP_UNICAST_IF, added back in 2012, which is accessible for non-root users. It is currently limited to UDP and RAW sockets, but it could be extended for TCP.

XDP and eBPF program identification

Ahern then turned to the problem of extracting BPF programs from the kernel. He gave the example of a simple cBPF (classic BPF) filter that checks for ARP packets. If the filter is read back from the kernel, the user gets a blob of binary data, which is hard to interpret. There is an kernel verifier that can show C-like output, but that is also difficult to interpret. Ahern then added annotations to his slide that showed what the original program actually does, which was a good demonstration of why such a feature is needed.

Ahern explained that, at least for cBPF, it should be possible to recover the original plaintext, or at least something close to the original program. A first step would be to replace known constants (like 0x806 for ARP). Even with eBPF, it should be possible to improve the output. Alexei Starovoitov, the BPF maintainer, explained that it might make sense to start by returning information about the maps used by an eBPF program. Then more complex data structures could be inspected once we know their type.

The first priority is to get simple debugging tools working but, in the long term, the goal is a full decompiler that can reconstruct instructions into a human-readable program. The question that remains is how to return this data. Ahern explained that right now the bpf() system call copies the data to a different file descriptor, but it could just fill in a buffer. Starovoitov argued for a file descriptor; that would allow the kernel to stream everything through the same descriptor instead of having many attach points. Netlink cannot be used for this because of its asynchronous nature.

A similar issue regarding the way we identify express data path (XDP) programs (which are also written in BPF) was raised by Daniel Borkmann from Covalent. Miller explained that users will want ways to figure out which XDP program was installed, so XDP needs an introspection mechanism. We currently have SHA-1 identifiers that can be internally used to tell which binary is currently loaded but those are not exposed to user space. Starovoitov mentioned it is now just a boolean that shows if a program is loaded or not.

A use case for this, on top of just trying to figure out which BPF program is loaded, is to actually fetch the source code of a BPF program that was deployed in the field for which the source was lost. It is still uncertain that it will be possible to extract an exact copy that could then be recompiled into the same program. Starovoitov added that he needed this in production to do proper reporting.

IPv4/IPv6 equivalency

The last issue — or set of issues — that Ahern brought up was the question of inconsistencies between IPv4 and IPv6. It turns out that, because both protocols were (naturally) implemented separately, there are inconsistencies in how they are handled in the Linux kernel, which affect, among other things, the VRF framework. The first example he gave was the fact that IPv6 addresses added on the loopback interface generate unreachable routes in the main routing table, yet this doesn't happen with IPv4 addresses. Hannes Frederic Sowa explained this was part of the IPv6 specification: there are stronger restrictions on loopback interfaces in IPv6 than IPv4. Ahern explained that VRF loopback interfaces do not implement these restrictions and wanted to know if this was a problem.

Another issue is that anycast routes are added to the wrong interface. This is apparently not specific to VRF: this was done "just because Java", and has been there from day one. It seems that the Java Virtual Machine builds its own routing table and assumes this behavior, so changing this would break every JVM out there, which is obviously not acceptable.

Finally, Martin Kafai Lau asked if work should be done to merge the IPv4 and IPv6 FIB (forwarding information base) trees. The FIB tree is the data structure that represents routing tables in the Linux kernel. Miller explained that the two trees are not semantically equivalent: while IPv6 does source-address lookup and routing, IPv4 does not. We can't remove the source lookups from IPv6, because "people probably use that". According to Alexander Duyck, adding source tables to IPv4 would degrade performance to the level of IPv6 performance, which was jokingly referred to as an incentive to switch to IPv6.

More seriously, Sowa argued that using the same compressed tree IPv4 uses in IPv6 could make sense. People may want to have source routing in IPv4 as well. Miller argued that the kernel is optimized for 32-bit addresses in IPv4, and conceded that it could be scaled to 64-bit subnets, but 128-bit addresses would be much harder. Sowa suggested that they could be limited to 64 bits, as global routes that are announced over BGP usually have such a limit, and more specific routes are usually at discrete prefixes like /65, /127 (for interconnect links) or /128 for (for point-to-point links). He expressed concerns over the reliability of such an implementation so, at this point, it is unlikely that the data structures could be merged. What is more likely is that the code path could be merged and simplified, while keeping the data structures separate.

Modules options substitutions

The next issue that was raised was from Jiří Pírko, who asked how to pass configuration options to a driver before the driver is initialized. Some chips require that some settings be sent before the firmware is loaded, which leads to a weird situation where there is a need to address a device before it's actually recognized by the kernel. The question then can be summarized as to how to pass information to a device that doesn't exist yet.

The answer seems to be that devlink could do this, as it has access to the full device tree and, therefore, to devices that can be addressed by (say) PCI identifiers. Then a possible devlink command could look something like:

devlink dev pci/0000:03:00.0 option set foo bar

This idea raised a bunch of extra questions: some devices don't have a one-to-one mapping with the PCI bridge identifiers, for example, meaning that those identifiers cannot be used to access such devices. Another issue is that you may want to send multiple settings in a single transaction, which doesn't fit well in the devlink model. Miller then proposed to let the driver initialize itself to some state and wait for configuration to be sent when necessary. Another way would be to unregister the driver and re-register with the given configuration. Shrijeet Mukherjee explained that right now, Cumulus is doing this using horrible startup script magic by retrying and re-registering, but it would be nice to have a more standard way to do this.

Control over UAPI patches

Another issue that came up was the problem of changes in the user-space API (UAPI) which break backward compatibility. Pírko said that "we have to be more careful about those changes". The problem is that reviewers are not always available to make detailed reviews of such changes and may not notice API-breaking changes. Pírko proposed creating a bot to check if a given patch introduces UAPI changes, changes in structs, or in netlink enums. Miller said he could block merges until discussions happen and that patchwork, which Miller uses to process patches from the mailing list, does some of this. He also pointed out there aren't enough test cases in the first place.

Starovoitov argued UAPI isn't special, there are other ways of breaking backward compatibility. He expressed concerns that such a bot could create a false sense that everything is fine while a patch could break compatibility and not be detected. Miller countered that UAPI is special in that "we're stuck with it forever". He then went on to propose that, since there's a maintainer (or more) for each module, he can make sure that each maintainer explicitly approves changes to those modules.

Data-center hardware changes

Starovoitov brought up the issue of a new type of hardware that is currently being deployed in data centers called a "multi-host NIC" (network interface card). It's a single NIC that is connected to multiple servers. Facebook, for example, uses this in its Yosemite platform that shoves twelve servers into a 2U rack mount, in three modules. Each module is made of four servers connected to the traditional switch fabric with a single NIC through PCI-Express. Mellanox and and Broadcom also have similar devices.

One question is how to manage those devices. Since they are connected through a PCI-Express bus, Linux will see them as a NIC, yet they are also a little like switches, in that they interconnect multiple servers. Furthermore, the kernel security model assumes that a NIC is trusted, and gladly opens its own memory to NICs through DMA; this can become a huge security issue when the NIC is under the control of another server. This can especially become problematic if we consider that there could be TLS hardware offloading in the future with the introduction of in-kernel TLS stacks.

The other problem is the question of reliability: since those devices are currently "dumb", they need to be managed just like a regular NIC. If the host managing the card crashes, it could disable a whole set of servers that rely on the same NIC. There could be an election process among the servers, but that complicates significantly what used to be a simple PCI connection.

Mukherjee pointed out that the model Cisco uses for this is that the "smart NIC" is a "slave" of the main switch fabric. It's a daughter card, which makes it easier to manage from a network perspective. It is clear that Linux will need a way to represent those devices, probably through the newly introduced switchdev or DSA (distributed switch architecture), but it will be something to keep an eye on as density increases in the data center.

There were many more discussions during Netconf, too many to cover here, but in the end, Miller thanked everyone for all the interesting topics as the participants dispersed for a day off to travel to Montreal to attend the following Netdev conference.

The author would like to thank the Netconf and Netdev organizers for travel to, and hosting assistance in, Toronto. Many thanks to Alexei Starovoitov for his time taken for a technical review of this article.

Note: this article first appeared in the Linux Weekly News.

Categories: External Blogs

Simple Server Hardening, Part II

Linux Journal - Tue, 04/11/2017 - 07:17

In my last article, I talked about the classic, complicated approach to server hardening you typically will find in many hardening documents and countered it with some specific, simple hardening steps that are much more effective and take a only few minutes. more>>

Categories: Linux News


Linux Journal - Mon, 04/10/2017 - 06:00

The new production release of Mender 1.0, an open-source tool for updating embedded devices safely and reliably, is now available. more>>

Categories: Linux News

Contribute your skills to Debian in Montreal, April 14 2017

Anarcat - Sun, 04/09/2017 - 10:06

Join us in Montreal, on April 14 2017, and we will find a way in which you can help Debian with your current set of skills! You might even learn one or two things in passing (but you don't have to).

Debian is a free operating system for your computer. An operating system is the set of basic programs and utilities that make your computer run. Debian comes with dozens of thousands of packages, precompiled software bundled up for easy installation on your machine. A number of other operating systems, such as Ubuntu and Tails, are based on Debian.

The upcoming version of Debian, called Stretch, will be released later this year. We need you to help us make it awesome

Whether you're a computer user, a graphics designer, or a bug triager, there are many ways you can contribute to this effort. We also welcome experience in consensus decision-making, anti-harassment teams, and package maintenance. No effort is too small and whatever you bring to this community will be appreciated.

Here's what we will be doing:

  • We will triage bug reports that are blocking the release of the upcoming version of Debian.

  • Debian package maintainers will fix some of these bugs.

Goals and principles

This is a work in progress, and a statement of intent. Not everything is organized and confirmed yet.

We want to bring together a heterogeneous group of people. This goal will guide our handling of sponsorship requests, and will help us make decisions if more people want to attend than we can welcome properly. In other words: if you're part of a group that is currently under-represented in computer communities, we would like you to be able to attend.

We are committed to providing a friendly, safe and welcoming environment for all, regardless of level of experience, gender, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, religion, nationality, or other similar personal characteristic. Attending this event requires reading and respecting the Debian Code of Conduct, that sets the standards in terms of behaviour for the whole event, including communication (public and private) before, while and after.

The space where this event will take place is unfortunately not accessible to wheelchairs. Food (including vegetarian options) should be provided for lunch. If you have any specific needs regarding food, please let us know when registering, and we will do our best.

What we will be doing

This will be an informal session to confirm and fix bugs in Debian. If you have never worked with Debian packages, this is a good opportunity to learn about packaging and bugtracker usage.

Bugs flagged as Release Critical are blocking the release of the upcoming version of Debian. To fix them, it helps to make sure the bug report documents the up-to-date status of the bug, and of its resolution. One does not need to be a programmer to do this work! For example, you can try and reproduce bugs in software you use... or in software you will discover. This helps package maintainers better focus their work.

We will also try to actually fix bugs by testing patches and uploading fixes into Debian itself. Antoine Beaupré, a seasoned Debian developer, will be available to sponsor uploads and teach people about basic Debian packaging skills.

Where? When? How to register?

See for the exact address and time.

Categories: External Blogs

VMKings' VPS Hosting Solution

Linux Journal - Fri, 04/07/2017 - 11:44

The management team of cloud provider VMKing, as developers themselves, found standard virtual servers not to be well tailored to the developer community—too much or too little space, insufficient security and no support for their preferred Linux OS(!). more>>

Categories: Linux News

Sunshine in a Room with No Windows

Linux Journal - Thu, 04/06/2017 - 10:24

I'm a bit of a weather nut. It might be because I'm getting older, but for some reason, the weather fascinates me. I'm not quite to the point that I watch The Weather Channel on a regular basis, but I do check the forecast often. more>>

Categories: Linux News

On-the-Fly Web Server

Linux Journal - Wed, 04/05/2017 - 09:11

Most of you have a web server installed on your network somewhere. In fact, most of you probably have several. In a pinch, however, getting to the web directory can be difficult. more>>

Categories: Linux News

Best of Hack and /

Linux Journal - Wed, 04/05/2017 - 06:40
Secure Server Deployments in Hostile Territory; Preseeding Full Disk Encryption; Own Your Own DNS; Learn How-to Secure Desktops with Qubes; What's New In 3D Printing


Categories: Linux News

Five Reasons to Switch to Flash Storage

Linux Journal - Tue, 04/04/2017 - 16:12
By now you have heard your peers raving about flash storage. But perhaps you have not made the switch from your enterprise HDD storage solution yet, because of nagging questions you may have, about the cost of flash storage or its technical capabilities. more>>
Categories: Linux News

Flat File Encryption with OpenSSL and GPG

Linux Journal - Tue, 04/04/2017 - 04:46

The Pretty Good Privacy (PGP) application, which has long been known as a primary tool for file encryption, commonly focused on email. It has management tools for exchanging credentials with peers and creating secure communication channels over untrusted networks. more>>

Categories: Linux News

My free software activities, February and March 2017

Anarcat - Sat, 04/01/2017 - 17:51
Looking into self-financing

Before I begin, I should mention that I started tracking my time working on free software more systematically. I spend a lot of time on the computer, as regular readers of this blog might remember so I wanted to know exactly how much time was paid vs free work. I was already using org-mode's time clock system to keep track of my work hours, so I just extended this to my regular free software contributions, which also helps in writing those reports.

It turns out that over 60% of my computer time is spent working on free software. That's huge! I was expecting something more along the range of 20 to 40% of my time. So I started thinking about ways of financing this work. I created a Patreon page but I'm hesitant into launching such a campaign: the only thing worse than "no patreon page" is "a patreon page with failed goals and no one financing it". So before starting such an effort, I'd like to get a feeling of what other people's experience with it are. I know that joeyh is close to achieving his goals, but I can't compare with the guy that invented git-annex or debhelper, so I'm concerned I wouldn't be able to raise the same level of funding.

So any advice you have, feel free to contact me in private or in the comments. If you would be ready to fund my work, I'd love to know about it, obviously, but I guess I wouldn't get real numbers until I actually open up such a page...

Now, onto the regular report.


I spent a good chunk of time completing most of the things I had in mind for Wallabako, which I mentioned quickly in the previous report. Wallabako is now much easier to installed, with clearer instructions, an easier to use configuration file, more reliable synchronization and read status propagation. As usual the Wallabako README file has all the details.

I've also looked at better integration with Koreader, the free software e-reader that forms the basis of the okreader free software distribution which has been able to port Debian to the Kobo e-readers, a project I am really excited about. This project has the potential of supporting Kobo readers beyond the lifetime that upstream grants it and removes a lot of proprietary software and spyware that ships with the Kobo readers. So I have made a few contributions to okreader and also on koreader, the ebook reader okreader is based on.


I rewrote stressant, my simple burn-in and stress-testing tool. After struggling in turn with Debirf, live-build, vmdebootstrap and even FAI, I just figured maybe it wasn't the best idea to try and reinvent that particular wheel: instead of reinventing how to build yet another Debian system build tool, maybe I should just reuse what's already there.

It turns out there's a well known, succesful and fairly complete recovery system called Grml. It is a Debian Derivative, so all I needed to do was to stop procrastinating and actually write the actual stressant tool instead of just creating a distribution with a bunch of random tools shipped in. This allowed me to focus on which tools were the best to stress test different components. This selection ended up being:

fio can also be used to overwrite disk drives with the proper options (--overwrite and --size=100%), although grml also ships with nwipe for wiping old spinning disks and hdparm to do a secure erase of SSD disks (whatever that's worth).

Stressant still needs to be shipped with grml for this transition to be complete. In the meantime, I was able to configure the excellent public Gitlab CI service to provide ISO images with Stressant built-in as a stopgap measure. I also need to figure out a way to automate starting stressant from a boot menu to automate deployments on a larger scale, although because I have little need for the feature at this moment in time, this will likely wait for a sponsor to show up for this to be implemented.

Still, stressant has useful features like the capability of sending logs by email using a fresh new implementation of the Python SMTPHandler (BufferedSMTPHandler) which waits for logging to complete before sending a single email. Another interesting piece of code in there is the NegateAction argparse handler that enables the use of "toggle flags" (e.g. --flag / --no-flag). I'm so happy with the code that I figure I could just share it here directly:

class NegateAction(argparse.Action): '''add a toggle flag to argparse this is similar to 'store_true' or 'store_false', but allows arguments prefixed with --no to disable the default. the default is set depending on the first argument - if it starts with the negative form (define by default as '--no'), the default is False, otherwise True. ''' negative = '--no' def __init__(self, option_strings, *args, **kwargs): '''set default depending on the first argument''' default = not option_strings[0].startswith(self.negative) super(NegateAction, self).__init__(option_strings, *args, default=default, nargs=0, **kwargs) def __call__(self, parser, ns, values, option): '''set the truth value depending on whether it starts with the negative form''' setattr(ns, self.dest, not option.startswith(self.negative))

Short and sweet. I wonder why stuff like this is not in the standard library yet - maybe just because no one bothered yet? It'd be great to get feedback of more experienced Pythonistas on this one.

I hope that my work on Stressant is complete. I get zero funding for this work, and have little use for it myself: I manage only a few machines and such a tool really shines when you regularly put new hardware online, which is (fortunately?) not my case anymore. I'd be happy, of course, to accompany organisations and people that wish to further develop and use such a tool.

A short demo of stressant as well as detailed description of how it works is of course available in its README file.

Standard third party repositories

After looking at improvements for the grml repository instructions, I realized there was no real "best practices" document on how to configure an Apt repository. Sure, there are tools like reprepro and others, but those hardly qualify as policy: they are very flexible and there are lots of ways to create insecure repositories or curl | sh style instructions, which we of course generally want to avoid.

While the larger problem of Unstrusted Debian packages remain generally unsolved (e.g. when you install any .deb file, it can get root on your system), it seemed to me one critical part of this problem was how to add a random third-party repository to your machine while limiting, as much as possible, what possible attackers could do with such a repository. In other words, to solve the more general problem of insecure .deb files, we also need to solve the distribution problem, otherwise fixing the .deb files themselves will be useless.

This lead to the creation of standardized repository instructions that define:

  1. how to distribute the repository's public signing key (ie. over HTTPS)
  2. how to name suites and components (e.g. use stable and main unless you have a good reason, and explain yourself)
  3. recommend a healthy does of apt preferences pinning
  4. how to distribute keys (e.g. with a derive-archive-keyring package)

I've seen so many third party repositories get this wrong. For example, a lot of repositories recommend this type of command to intialize the OpenPGP trust path:

curl | apt-key add -

This has the following problems:

  • the key is transfered in plaintext and can easily be manipulated by an active attacker (e.g. a router on your path to the server or a neighbor in a Wifi cafe)
  • the key is added to the main trust root, which allows the key to authentify as the real Debian archive, therefore giving it all rights over all packages
  • since it's part of the global archive, it's difficult for a package to remove/add the key when a key rollover is necessary (and repositories generally don't provide a deriv-archive-keyring to do that process anyways)

An example of this are the Docker install instructions that, at least, manage to do this over HTTPS. Some other repositories don't even bother teaching people about the proper way of adding those keys. We settled for:

wget -O /usr/share/keyrings/deriv-archive-keyring.gpg

That location was explicitly chosen to be out of the main trust directory, so that it needs to be explicitly added to the sources.list as well:

deb [signed-by=/usr/share/keyrings/deriv-archive-keyring.gpg] stable main

Similarly, we highly recommend users setup "apt pinning" to restrict what a given repository can do. Since pinning is so confusing, most people don't actually bother even configuring it and I have yet to see a single repo advise its users to configure those preferences, which are essential to limit what a repository can do. To keep configuration simple, we recommend this:

Package: * Pin: origin Pin-Priority: 100

Obviously, for a single-package repository, the actual package name should be listed, e.g.:

Package: foo Pin: origin Pin-Priority: 100

And the priority should probably be set to 1 unless you want to allow automatic upgrades.

It is my hope that this design will get more traction in the years to come and become a de-facto standard that will be a key part in safely adding third party repositories. There is obviously much more work to be done to improve security when installing untrusted .deb files, and I encourage Debian developers to consider contributing to the UntrustedDebs discussions and particularly to the Teams/Dpkg/Spec/DeclarativePackaging work.

Signal R&D

I spent a significant amount of time this month struggling with the Signal project on my phone. I'm still ambivalent on Signal: it's a centralized designed, too dependent on phone numbers, but I must admit they get a lot of things right and it's the only free-software platform that allows for easy-to-use, multi-platform videoconferencing that my family can use.

I've been following Signal for a while: up until now, I had been using the LibreSignal rebuild of the official client, as it is distributed on a F-Droid repository. Because I try to avoid Google (proprietary) software on my phone, it's basically the only way I could even install Signal. Unfortunately, the repository is out of date and introduces another point of trust in the distribution model: now you not only need to trust the Signal authors to do the right thing, you also need to trust that F-Droid repo not to inject nasty code on your phone. I've therefore started a discussion about how Signal could be distributed outside of the Google Play Store. I'd like to think it's one of the things that led the Signal people to distribute an official copy of Signal outside of the playstore.

After much struggling, I was able to upgrade to this official client and will be able to upgrade easily by just downloading the APK. (Do note that I ended up reinstalling and re-registering Signal, which unfortunately changed my secret keys.) I do hope Signal enters F-Droid one day, but it could take a while because it still doesn't work without Google services and barely works with MicroG, the free software alternative to the Google services clients. Moxie also set a list of requirements like crash reporting and statistics that need to be implemented on F-Droid's side before he agrees to the deployment, so this could take a while.

I've also participated in the, ahem, discussion on the JWZ blog regarding a supposed vulnerability in Signal where it would leak previously unknown phone numbers to third parties. I reviewed the way the phone number is uploaded and, while it's possible to create a rainbow table of phone numbers (which are hashed with a truncated SHA-1 checksum), I couldn't verify the claims of other participants in the thread. For me, Signal still does the right thing with contacts, although I do question the way "read status" notifications get transmitted, but that belong in another bug report / blog post.

Debian Long Term Support (LTS)

It's been more than a year working on Debian LTS, started by Raphael Hertzog at Freexian. I didn't work much in February so I had a lot of hours to catchup with, and was unfortunately unable to do so, partly because I was busy with other projects, and partly because my colleagues are doing a great job at resolving the most important issues.

So one my concerns this month was finding work. It seemed that all the hard packages were either taken (e.g. my usual favorites, tiff and imagemagick, we done by others) or just too challenging (e.g. I don't feel quite comfortable tackling the LTS branch of the Linux kernel yet).

I spent quite a bit of time trying to figure out what was wrong with pcre3, only to realise the "32" in the report was not about the architecture, but about the character width. Because of thise, I marked 4 CVEs (CVE-2017-7186, CVE-2017-7244, CVE-2017-7245, CVE-2017-7246) as "not-affected", since the 32-bith character support wasn't enabled in wheezy (or jessie, for that matter). I still spent some time trying to reproduce the issues, which require a compiler with an AddressSanitizer, something that was introduced in both Clang and GCC after Wheezy was released, which makes reproducing this fairly complicated...

This allowed me to experiment more with Vagrant, however, and I have provided the Debian cloud team with a 32-bit Vagrant box that was merged in shortly after, although it doesn't show up yet in the official list of Debian images.

Then I looked at the apparmor situation (CVE-2017-6507), Debian bug #858768). That was one tricky bug as well, since it's not a security issue in apparmor per se, but more an issue with things that assume a certain behavior from apparmor. I have concluded that Wheezy was not affected because there are no assumptions of proper isolation there - which are provided only starting from LXC 1.0 - and Docker is not in Wheezy. I also couldn't reproduce the issue on Jessie, but, as it turns out, the issue was sysvinit-specific, which is why I couldn't reproduce it under the default systemd configuration shipped with Jessie.

I also looked at the various binutils security issues: as I reported on the mailing list, I didn't see anything serious enough in there to warrant a a security release and followed the lead of both the stable and Red Hat security teams by marking this "no-dsa". I similiarly reviewed the mp3splt security issues (specifically CVE-2017-5666) and was fairly puzzled by that issue, which seems to be triggered only the same address sanitization extensions than PCRE, although there was some pretty wild interplay with debugging flags in there. All in all, it seems we can't reproduce that issue in wheezy, but I do not feel confident enough in the results to push that issue aside for now.

I finally uploaded the pending graphicsmagick issue (DLA-547-2), a regression update to fix a crash that was introduced in the previous release (DLA-547-1, mistakenly named DLA-574-1). Hopefully that release should clear up some of the confusion and fix the regression.

I also released DLA-879-1 for the CVE-2017-6369 in firebird2.5 which was an interesting experiment: I couldn't reproduce the issue in a local VM. After following the Ubuntu setup tutorial, as I wasn't too familiar with the Firebird database until now (hint: the default username and password is sysdba/masterkey), I ended up assuming we were vulnerable and just backporting the patch after seeing the jessie folks push out a release just in case.

I also looked at updating the ca-certificates package to deal with the pending WoSign/Startcom removal: I made an explicit list of the CAs that need to be removed after reviewing the Mozilla list. I also sent a patch for an unrelated issue where ca-certificates is writing to /usr/local (!!) in Debian bug #843722.

I have also done some "meta" work in starting a discussion about fixing the missing DLA links in the tracker, as you will notice all of the above links lead to nowhere. Thanks to pabs, there are now some links but unfortunately there are about 500 DLAs missing from the website. We also discussed ways to Debian bug #859123, something which is currently a manual process. This is now in the hands of the excellent webmaster team.

I have also filed a few missing security bugs (Debian bug #859135, Debian bug #859136), partly because I wanted to help the security team. But it turned out that I felt the script needed some improvements, so I submitted a patch to improve the script so it is easier to run.

Other projects

As usual, there's the usual mixed bags of chaos:

More stuff on Github...

Categories: External Blogs

Linux Journal April 2017

Linux Journal - Sat, 04/01/2017 - 12:31
If There Was a Problem? Yo, I'll Solve It...

I'm not sure what problem Vanilla Ice solved with his DJ's sick hook, but thankfully in the Linux world, we solve problems all the time. more>>

Categories: Linux News

Briggs and Stratton 8,000 Watt Elite Series Portable Generator with StatStation Wireless

Linux Journal - Fri, 03/31/2017 - 11:29
Although Linux Journal readers might not equate Milwaukee with tech, a new Briggs & Stratton product portends the bright future of smartened "legacy" devices from the industrial heartland. more>>
Categories: Linux News

Five Reasons to Love SAP HANA

Linux Journal - Fri, 03/31/2017 - 09:51
If you are reading a blog on SAP HANA, you probably already know that it is an in-memory data platform built to handle massive amounts of data in real time. You probably already know that it can be deployed as an on-premises appliance or purchased as a hybrid or cloud service. more>>
Categories: Linux News

NTPsec: a Secure, Hardened NTP Implementation

Linux Journal - Thu, 03/30/2017 - 05:57

Note: This article was first published in the October 2016 issue of Linux Journal. more>>

Categories: Linux News

SUSE Linux Enterprise High Availability Extension

Linux Journal - Wed, 03/29/2017 - 10:25

Historically, data replication has been available only piecemeal through proprietary vendors. In a quest to remediate history, SUSE and partner LINBIT announced a solution that promises to change the economics of data replication. more>>

Categories: Linux News

Hybrid Cloud Storage Delivers Performance and Value

Linux Journal - Wed, 03/29/2017 - 06:23
Just in the past two years, technology has created a veritable ocean of data. And like an ocean, that data, created by social technology, mobile technology and IoT, is vast, bountiful and dynamic. And also like an ocean, the climate, or temperature of data, constantly shifts and changes. more>>
Categories: Linux News

smbclient Security for Windows Printing and File Transfer

Linux Journal - Tue, 03/28/2017 - 07:53

Microsoft Windows is usually a presence in most computing environments, and UNIX administrators likely will be forced to use resources in Windows networks from time to time. Although many are familiar with the Samba server software, the matching smbclient utility often escapes notice. more>>

Categories: Linux News

How to Calculate Flash Storage TCO

Linux Journal - Mon, 03/27/2017 - 20:17
Across a wide range of consumer devices, from cameras to smartphones to laptops, flash storage has become the de facto standard for digital data storage. more>>
Categories: Linux News
Syndicate content