

August 2018 report: LTS, Debian, Upgrades

Anarcat - Fri, 08/31/2018 - 19:19
Debian Long Term Support (LTS)

This is my monthly Debian LTS report.

twitter-bootstrap

I researched some of the security issues in the Twitter Bootstrap framework, which is clearly showing its age in Debian. Of the three vulnerabilities, I couldn't reproduce two (CVE-2018-14041 and CVE-2018-14042), so I marked them as "not affecting" jessie. I also found that CVE-2018-14040 was relevant only for Bootstrap 3 (because yes, we still have Bootstrap 2, in all suites, which will hopefully be fixed in buster).

The patch for the latter was a little tricky to figure out, but ended up being simple. I tested the patch with a private copy of the code, which works here, and published the result as DLA-1479-1.

What's concerning with this set of vulnerabilities is that they show a broader problem than the one identified in those specific instances. Brian May found at least one other similar issue, although I wasn't able to exploit it in a quick attempt. Besides, I'm not sure we want to audit the entire Bootstrap codebase: upstream fixed this issue more widely in the v4 series, and Debian should follow suit, at least in future releases, and remove older releases from the archive.

tiff

A classic. I tried and failed to reproduce CVE-2018-15209 in the tiff package. I'm a bit worried by Brian May's results that the proof of concept eats up all memory in his tests. Since I could not reproduce, I marked the package as N/A in jessie and moved on.

Ruby 2.1

Another classic source of vulnerabilities... The patches were easy to backport, tests passed, so I just uploaded and published DLA-1480-1.

GDM 3

I reviewed Markus Koschany's work on CVE-2018-14424. His patches seemed to work in my tests as I couldn't see any segfault in jessie, either in the kernel messages or through a debugger.

True, the screen still "flashes", so one might think there is still a crash, but this is actually expected behavior. Indeed, this is the first D-Bus command being run:

dbus-send --system --dest=org.gnome.DisplayManager --type=method_call --print-reply=literal /org/gnome/DisplayManager/LocalDisplayFactory org.gnome.DisplayManager.LocalDisplayFactory.CreateTransientDisplay

Or, in short, CreateTransientDisplay, which is also known as fast user switching, brings you back to the login screen. If you enter the same username and password, you get your session back. So no crash. After talking with Koschany, we'll wait a little longer for feedback from the reporter but otherwise I expect to publish the fixed package shortly.

git-annex

This is a bigger one I took from Koschany. The patch was large, and in a rather uncommon language (Haskell).

The first patch was tricky as function names had changed and some functionality (the P2P layer, the setkey command and content verification) was completely missing. On advice from upstream, the content verification functionality was backported, as it was critical for the second tricky patch, which required more Haskell gymnastics.

This time again, Haskell was nice to work with: by changing type configurations and APIs, the compiler makes sure that everything works out and there are no inconsistencies. This logic is somewhat backwards to what we are used to: normally, in security updates, we avoid breaking APIs at all costs. But in Haskell, it's a fundamental way to make sure the system is still coherent.

More details, including embarrassing fixes to the version numbering scheme, are best explained in the email thread. An update for this will come out shortly, after giving more time for upstream to review the final patchset.

Fighting phishing

After mistyping the address of the security tracker, I ended up on this weird page:

Some phishing site masquerading as a Teksavvy customer survey.

Confused and alarmed, I thought I was being intercepted by my ISP, but after looking on their forums, I found out they actually get phished like this all the time. As it turns out, the domain name debain.org (notice the typo) is actually registered to some scammers. So I implemented a series of browser quick searches as a security measure and shared those with the community. Only after feedback from a friend did I realize that surfraw (SR) has been doing this all along. The problem with SR is that it's mostly implemented with messy shell scripts and those cannot easily be translated back into browser shortcuts, which are still useful on their own. That, and the SR plugins (called "elvi", singular "elvis") are horribly outdated.
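As an illustration (not necessarily the exact set I shared), such a quick search is really just a keyword bookmark with a %s placeholder. In Firefox, for example, a bookmark along these lines lets you type "debsec CVE-2018-14040" in the address bar and land on the right tracker page:

    Name:     Debian security tracker
    Keyword:  debsec
    Location: https://security-tracker.debian.org/tracker/%s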

Ideally, trivial "elvis" would simply be "bookmarks" (which are really just one link per line) that can then easily be translated back into browser bookmarks. But that would require converting a bunch of those plugins, something I don't currently have the energy (or time) for. All this reminds me a lot of the interwiki links from the wiki world and looks like an awful duplication of information. Even in this wiki I have similar shortcuts, which are yet another database of such redirections. Surely there is a better way than maintaining all of this separately?

Who claimed all the packages?

After struggling again to find some (easy, I admit) work, I worked on a patch to show per-user package claims. Now, if --verbose is specified, the review-update-needed script will also show a list of users who claimed packages and how many are claimed. This can help us figure out who's overloaded and might need some help.

Post-build notifications in sbuild

I sent a patch to sbuild to make sure we can hook into failed builds on completion as well as successful builds. Upstream argued this is best accomplished with a wrapper, but I believe that's insufficient: a wrapper has no knowledge of the sbuild internals and won't be able to send meaningful notifications. It is, after all, why there is a post-build hook right now, which runs only on successful builds.
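For comparison, the wrapper approach suggested upstream would look something like this hypothetical sketch, which also illustrates its limitation: all it sees is the exit status, not anything sbuild knows internally (build log location, .changes file, and so on):

    #!/bin/sh
    # hypothetical wrapper: notify on failed sbuild runs only
    sbuild "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        notify-send "sbuild failed" "$*"
    fi
    exit "$status"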

GnuTLS and other reviews

I reviewed questions from Ola Lundqvist regarding the pending GnuTLS security vulnerabilities designated CVE-2018-10844, CVE-2018-10845 and CVE-2018-10846. Those came from a paper called Pseudo Constant Time Implementations of TLS Are Only Pseudo Secure. I am still unsure of the results: after reviewing the paper in detail, I am worried the upstream fixes are incomplete. Hopefully Lundqvist will figure it out, but in any case I am available to review this work again next week.

I also provided advice on a squirrelmail bugfix backport suggestion.

Other free software work

I was actually on vacation this month so this is a surprising amount of activity for what was basically a week of work.

Buster upgrade

I upgraded my main workstation to buster, in order to install various Node.js programs through npm for that Dat article (which will be public here shortly). It's part of my routine: when enough backports pile up or I need too much stuff from unstable, it's time to make the switch. This makes development on Debian easier and helps with testing the next stable release before it comes out. I do this only on my busiest machine, where I can fix things quickly when they break: my laptop and server remain on stable so I don't have to worry about them too much.

It was a bumpy ride: font rendering changed because of the new rendering engine in FreeType. Someone ended up finding a workaround in Debian bug #866685 which allowed me to keep the older rendering engine but I am worried it might be removed in the future. Hopefully that bug will trickle upstream and Debian users won't see a regression when they upgrade to buster.
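If I remember correctly, the workaround amounts to forcing the old TrueType interpreter through an environment variable, along these lines (where exactly to set it varies; /etc/environment is one option):

    # keep the pre-2.7 FreeType TrueType interpreter (v35) instead of the new v40 one
    FREETYPE_PROPERTIES="truetype:interpreter-version=35"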

A major issue was a tiny bug in the python-sh library which caused my entire LWN workflow to collapse. Thankfully, it turned out upstream had already released a fix and all I had to do was to update the package and NMU the result. As it turns out, I was already part of the Python team, and that should have been marked as a team upload, but I didn't know. Strange how memory works sometimes.

Other problems were similar: dictd, for example, failed to upgrade (Debian bug #906420, fixed). There are about 15 different packages that are missing from stretch: many have FTBFS problems, others have real critical bugs. Others are simply missing from the archive: I particularly pushed on wireguard (Debian bug #849308), taffybar (Debian bug #895264), and hub (Debian bug #807866).

I won't duplicate the whole upgrade documentation here, the details are in buster.

Debian 25th anniversary updates

The Debian project turned 25 this year so it was a good occasion to look back at history and present. I became a Debian Developer in 2010, a Debian maintainer in 2009, and my first contributions to the project go all the way back to 2003, when I started filing bugs. So this is anywhere between my 8th and 15th birthday in the project.

I didn't celebrate this in any special way, although I did make sure to keep my packages up to date when I returned from vacation. That meant a few uploads.

Work on smokeping and charybdis happened as part of our now regular Debian & Stuff, along with LeLutin, who is helping out and learning a few packaging tricks along the way.

Other software upgrades

During the above buster upgrade, Prometheus broke because the node exporter metrics labels changed. More accurately, Grafana would fail to display some of the nodes. As it turns out, all that was needed was to update a few Grafana dashboards (as those don't update automatically, of course). But it brought to my attention that a bunch of packages I had installed were not being upgraded as automatically as the rest of my Debian infrastructure. There were a few reasons for that:

  1. packages were installed from a third-party repository

  2. packages were installed from unstable

  3. there were no packages: software was installed in a container

  4. there were no packages: software was installed by hand

I'm not sure which is worse between 3 and 4. As it turns out, containers were harder to deal with because they also involved upgrading docker.io, which was more difficult.

For each forgotten program, I tried to make sure it wouldn't stay stale any longer. In the case of 1 or 2, a proper apt preference (or "pin") was added to automate upgrades. For 3 and 4, I added the release feeds of the program to feed2exec so I get an email when upstream makes a new release.

Those are the programs I had to deal with:

  • rainloop: the upgrade guide is trivial. Make a backup:

    (umask 0077; tar cfz rainloop.tgz /var/lib/rainloop)

    Then decompress the new archive on top. It keeps old data in rainloop/v/1.11.0.203/, which should probably be removed on the next upgrade. The upgrade presumably runs when visiting the site, which worked flawlessly afterwards.

  • grafana: the upgrade guide says to back up the database in /var/lib/grafana/grafana.db, but I backed up the whole thing:

    (umask 0077; tar cfz grafana.tgz /var/lib/grafana)

    The upgrade from 4.x to 5.2.x was trivial and automated. There is, unfortunately, still no official package. A visit to the Grafana instance shows some style changes and improvements and that things generally just work.

  • the toot Mastodon client has entered Debian, so I was able to remove yet another third-party repository. This involved adding a pin to follow the buster sources for this package:

    Package: toot
    Pin: release n=buster
    Pin-Priority: 500
  • Docker is in a miserable state in stretch. There is a really old binary in jessie-backports (1.6) and, for some reason, I had a random version from unstable running (1.13.1~ds1-2). I upgraded to the sid version, which installs fine in stretch because Go programs are statically compiled. But the containers did not restart automatically. Starting them by hand gave this error:

    root@marcos:/etc/apt/sources.list.d# ~anarcat/bin/subsonic-start
    e4216f435be477dacd129ed8c2b23b2b317e9ef9a61906f3ba0e33265c97608e
    docker: Error response from daemon: OCI runtime create failed: json: cannot unmarshal object into Go value of type []string: unknown.

    Strangely, the container was started, but was not reachable over the network. The problem was that runc needed to be upgraded as well, so that was promptly fixed.

    The magic pin to follow buster is like this:

    Package: docker.io runc
    Pin: release n=buster
    Pin-Priority: 500
  • airsonic upgrades are a little trickier because I run it inside a Docker container. The first step is to fix the Dockerfile and rebuild the container image:

    sed -i s/10.1.1/10.1.2/ Dockerfile
    sudo docker build -t anarcat/airsonic .

    Then the image is ready to go. The previous container needs to be stopped and the new one started:

    docker ps
    docker stop 78385cb29cd5
    ~anarcat/bin/subsonic-start

    The latter is a script I wrote because I couldn't remember the magic startup sequence, which is silly: you'd think the Dockerfile would know stuff like that. A visit to the radio site showed that everything seemed to be in order but no deeper test was performed.
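    Such a start script is typically just a docker run invocation. A hypothetical sketch of what it might contain (the image name matches the build above, but the port and paths are guesses, not the real script):

    #!/bin/sh
    # hypothetical start script: run the image built above, publishing the
    # web port and mounting the data directory from the host
    exec docker run --detach --name airsonic \
        --publish 4040:4040 \
        --volume /srv/airsonic:/var/airsonic \
        anarcat/airsonic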

All of this assumes that updates to unstable will not be too disruptive or that, if they do, the NEWS.Debian file will warn me so I can take action. That is probably a little naive of me, but it beats having outdated infrastructure running exposed on the network.

Other work

Then there's the usual.


Sharing and archiving data sets with Dat

Anarcat - Sun, 08/26/2018 - 19:00

Dat is a new peer-to-peer protocol that uses some of the concepts of BitTorrent and Git. Dat primarily targets researchers and open-data activists as it is a great tool for sharing, archiving, and cataloging large data sets. But it can also be used to implement decentralized web applications in a novel way.

Dat quick primer

Dat is written in JavaScript, so it can be installed with npm, but there are standalone binary builds and a desktop application (as an AppImage). An online viewer can be used to inspect data for those who do not want to install arbitrary binaries on their computers.

The command-line application allows basic operations like downloading existing data sets and sharing your own. Dat uses a 32-byte hex string that is an ed25519 public key, which is used to discover and find content on the net. For example, this will download some sample data:

$ dat clone \
    dat://778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639 \
    ~/Downloads/dat-demo

Similarly, the share command is used to share content. It indexes the files in a given directory and creates a new unique address like the one above. The share command starts a server that uses multiple discovery mechanisms (currently, the Mainline Distributed Hash Table (DHT), a custom DNS server, and multicast DNS) to announce the content to its peers. This is how another user, armed with that public key, can download that content with dat clone or mirror the files continuously with dat sync.
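The publishing side is just as simple; a minimal sketch (the directory path is made up):

    $ cd ~/datasets/example
    $ dat share

The share command keeps running in the foreground and prints a dat:// address that peers can then pass to dat clone or dat sync.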

So far, this looks a lot like BitTorrent magnet links updated with 21st century cryptography. But Dat adds revisions on top of that, so modifications are automatically shared through the swarm. That is important for public data sets as those are often dynamic in nature. Revisions also make it possible to use Dat as a backup system by saving the data incrementally using an archiver.

While Dat is designed to work on larger data sets, processing them for sharing may take a while. For example, sharing the Linux kernel source code required about five minutes as Dat worked on indexing all of the files. This is comparable to the performance offered by IPFS and BitTorrent. Data sets with more or larger files may take quite a bit more time.

One advantage that Dat has over IPFS is that it doesn't duplicate the data. When IPFS imports new data, it duplicates the files into ~/.ipfs. For collections of small files like the kernel, this is not a huge problem, but for larger files like videos or music, it's a significant limitation. IPFS eventually implemented a solution to this problem in the form of the experimental filestore feature, but it's not enabled by default. Even with that feature enabled, though, changes to data sets are not automatically tracked. In comparison, Dat operation on dynamic data feels much lighter. The downside is that each set needs its own dat share process.

Like any peer-to-peer system, Dat needs at least one peer to stay online to offer the content, which is impractical for mobile devices. Hosting providers like Hashbase (which is a pinning service in Dat jargon) can help users keep content online without running their own server. The closest parallel in the traditional web ecosystem would probably be content distribution networks (CDN) although pinning services are not necessarily geographically distributed and a CDN does not necessarily retain a complete copy of a website.

The Photos application loading a test gallery in the Beaker browser

A web browser called Beaker, based on the Electron framework, can access Dat content natively without going through a pinning service. Furthermore, Beaker is essential to get any of the Dat applications working, as they fundamentally rely on dat:// URLs to do their magic. This means that Dat applications won't work for most users unless they install that special web browser. There is a Firefox extension called "dat-fox" for people who don't want to install yet another browser, but it requires installing a helper program. The extension will be able to load dat:// URLs but many applications will still not work. For example, the photo gallery application completely fails with dat-fox.

Dat-based applications look promising from a privacy point of view. Because of its peer-to-peer nature, users regain control over where their data is stored: either on their own computer, an online server, or by a trusted third party. But considering the protocol is not well established in current web browsers, I foresee difficulties in adoption of that aspect of the Dat ecosystem. Beyond that, it is rather disappointing that Dat applications cannot run natively in a web browser given that JavaScript is designed exactly for that.

Dat privacy

An advantage Dat has over other peer-to-peer protocols like BitTorrent is end-to-end encryption. I was originally concerned by the encryption design when reading the academic paper [PDF]:

It is up to client programs to make design decisions around which discovery networks they trust. For example if a Dat client decides to use the BitTorrent DHT to discover peers, and they are searching for a publicly shared Dat key (e.g. a key cited publicly in a published scientific paper) with known contents, then because of the privacy design of the BitTorrent DHT it becomes public knowledge what key that client is searching for.

So in other words, to share a secret file with another user, the public key is transmitted over a secure side-channel, only to then leak during the discovery process. Fortunately, the public Dat key is not directly used during discovery as it is hashed with BLAKE2B. Still, the security model of Dat assumes the public key is private, which is a rather counterintuitive concept that might upset cryptographers and confuse users who are frequently encouraged to type such strings in address bars and search engines as part of the Dat experience. There is a security & privacy FAQ in the Dat documentation warning about this problem:

One of the key elements of Dat privacy is that the public key is never used in any discovery network. The public key is hashed, creating the discovery key. Whenever peers attempt to connect to each other, they use the discovery key.

Data is encrypted using the public key, so it is important that this key stays secure.

There are other privacy issues outlined in the document; it states that "Dat faces similar privacy risks as BitTorrent":

When you download a dataset, your IP address is exposed to the users sharing that dataset. This may lead to honeypot servers collecting IP addresses, as we've seen in Bittorrent. However, with dataset sharing we can create a web of trust model where specific institutions are trusted as primary sources for datasets, diminishing the sharing of IP addresses.

A Dat blog post refers to this issue as reader privacy and it is, indeed, a sensitive issue in peer-to-peer networks. It is how BitTorrent users are discovered and served scary verbiage from lawyers, after all. But Dat makes this a little better because, to join a swarm, you must know what you are looking for already, which means peers who can look at swarm activity only include users who know the secret public key. This works well for secret content, but for larger, public data sets, it is a real problem; it is why the Dat project has avoided creating a Wikipedia mirror so far.

I found another privacy issue that is not documented in the security FAQ during my review of the protocol. As mentioned earlier, the Dat discovery protocol routinely phones home to DNS servers operated by the Dat project. This implies that the default discovery servers (and an attacker watching over their traffic) know who is publishing or seeking content, in essence discovering the "social network" behind Dat. This discovery mechanism can be disabled in clients, but a similar privacy issue applies to the DHT as well, although that is distributed so it doesn't require trust of the Dat project itself.

Considering those aspects of the protocol, privacy-conscious users will probably want to use Tor or other anonymization techniques to work around those concerns.

The future of Dat

Dat 2.0 was released in June 2017 with performance improvements and protocol changes. Dat Enhancement Proposals (DEPs) guide the project's future development; most work is currently geared toward implementing the draft "multi-writer proposal" in HyperDB. Without multi-writer support, only the original publisher of a Dat can modify it. According to Joe Hand, co-executive-director of Code for Science & Society (CSS) and Dat core developer, in an IRC chat, "supporting multiwriter is a big requirement for lots of folks". For example, while Dat might allow Alice to share her research results with Bob, he cannot modify or contribute back to those results. The multi-writer extension allows for Alice to assign trust to Bob so he can have write access to the data.

Unfortunately, the current proposal doesn't solve the "hard problems" of "conflict merges and secure key distribution". The former will be worked out through user interface tweaks, but the latter is a classic problem that security projects typically have trouble solving, and Dat is no exception. How will Alice securely trust Bob? The OpenPGP web of trust? Hexadecimal fingerprints read over the phone? Dat doesn't provide a magic solution to this problem.

Another thing limiting adoption is that Dat is not packaged in any distribution that I could find (although I requested it in Debian) and, considering the speed of change of the JavaScript ecosystem, this is unlikely to change any time soon. A Rust implementation of the Dat protocol has started, however, which might be easier to package than the multitude of Node.js modules. In terms of mobile device support, there is an experimental Android web browser with Dat support called Bunsen, which somehow doesn't run on my phone. Some adventurous users have successfully run Dat in Termux. I haven't found an app running on iOS at this point.

Even beyond platform support, distributed protocols like Dat have a tough slope to climb against the virtual monopoly of more centralized protocols, so it remains to be seen how popular those tools will be. Hand says Dat is supported by multiple non-profit organizations. Beyond CSS, Blue Link Labs is working on the Beaker Browser as a self-funded startup and a grass-roots organization, Digital Democracy, has contributed to the project. The Internet Archive has announced a collaboration between itself, CSS, and the California Digital Library to launch a pilot project to see "how members of a cooperative, decentralized network can leverage shared services to ensure data preservation while reducing storage costs and increasing replication counts".

Hand said adoption in academia has been "slow but steady" and that the Dat in the Lab project has helped identify areas that could help researchers adopt the project. Unfortunately, as is the case with many free-software projects, he said that "our team is definitely a bit limited on bandwidth to push for bigger adoption". Hand said that the project received a grant from Mozilla Open Source Support to improve its documentation, which will be a big help.

Ultimately, Dat suffers from a problem common to all peer-to-peer applications, which is naming. Dat addresses are not exactly intuitive: humans do not remember strings of 64 hexadecimal characters well. For this, Dat took a similar approach to IPFS by using DNS TXT records and /.well-known URL paths to bridge existing, human-readable names with Dat hashes. So this sacrifices a part of the decentralized nature of the project in favor of usability.
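As a hypothetical illustration of how such a mapping can be published (the domain is a placeholder and the record format is from memory, so treat this as a sketch rather than a reference):

    $ dig +short TXT example.org
    "datkey=778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639"
    $ curl https://example.org/.well-known/dat
    dat://778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639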

I have tested a lot of distributed protocols like Dat in the past and I am not sure Dat is a clear winner. It certainly has advantages over IPFS in terms of usability and resource usage, but the lack of packages on most platforms is a big limit to adoption for most people. This means it will be difficult to share content with my friends and family with Dat anytime soon, which would probably be my primary use case for the project. Until the protocol reaches the wider adoption that BitTorrent has seen in terms of platform support, I will probably wait before switching everything over to this promising project.

This article first appeared in the Linux Weekly News.


Concerns with Signal receipt notifications

Anarcat - Fri, 07/27/2018 - 16:18

During some experiments with a custom Signal client, a friend of mine (let's call him Bob) was very surprised when we had a conversation that went a little like this:

A> hey Bob! welcome home!
B> what?
B> wait, how did you know I got home?
B> what the heck man? did you hack my machine? OMGWTFSTHUBERTBBQ?!

I'm paraphrasing as I lost my copy of the original chat, but it was striking how he had absolutely no clue how I figured out he had just come home and sat down in front of his laptop. He was quite worried I had hacked into his system to spy on his webcam or some other "hack". As it turns out, I just made simple assertions based on data Signal provides to other peers when you send messages. Using those messages, I could establish when my friend opened his laptop and the Signal Desktop app got back online.

How this works

This is possible because the receipt notifications in Signal are per-device. This means that the "double-checkmark" you see when a message is delivered actually only tells you that the first device received the message. Behind the scenes, Signal sends a notification for each device, with a unique, per-device identifier. Those identifiers are visible with signal-cli. For example, this is a normal notification the Signal app will send when confirming reception of a message, as seen from signal-cli:

Envelope from: “Bob” +15555555555 (device: 1)
Timestamp: 1532279834422 (2018-07-22T17:17:14.422Z)
Got receipt.

That's Bob's phone telling me it received the message. On my side, the Signal app shows a second checkmark to tell me the message was transmitted. (There are also "blue checkmarks" now that tell the user the other person has seen the message, but I haven't looked into those in detail.) Then another notification comes in:

Envelope from: “Bob” +15555555555 (device: 2)
Timestamp: 1532279901951 (2018-07-22T17:18:21.951Z)
Got receipt.

Notice the device number there? It changed from 1 to 2. This tells me this is a different device than the first one. Device 1 will most likely be the phone app and device 2 will most likely be Signal Desktop. (In my case, I tried so many different configurations that I have device numbers up to 8, but my phone is still device 1.)

An attacker can use those notifications to tell when my phone goes online. It is also possible to make reasonable assertions about the identity of each device: any device number above one is most likely a Signal Desktop client. This can be used to assert physical presence on different machines: the desktop at home, laptop in the office, etc. It might not seem like much, but it sure felt creepy to Bob.

While writing this article, I figured I would reproduce those results, so I wrote Bob again to ask for help. Here's how the (redacted and reformatted) conversation went:

A-1> hey you there?
 * B-1 message received
A-1> i want to see if i can freak you out with signal again
 * B-1 message received
A-1> i'm going to write about the issue, and i want to reproduce the results
 * B-1 message received
B-1> he's driving
B-1> sure, I'll be your guinea pig he says
A-1> all he needs to do is open his laptop and start signal-desktop :p
 * B-1 message received
B-1> we'll be home in 1h30
A-1> i'll know, don't worry :p
 * B-1 message received

After an hour or two, Bob gets home, opens his laptop, and you can see the key message that reveals it:

 * B-2 message received
A-1> welcome home, sucker! ;)
B-2> dang dog.

This attack can be carried out by anyone who knows Bob's phone number. Because Signal is an open network, you are free to send messages to anyone without their consent. An attacker only has to send spam messages to a victim to figure out when they're online and how many devices they own. There's no way for Bob to protect himself from this attack, other than trying to keep his phone number private.
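A rough sketch of what that looks like with signal-cli (both phone numbers here are made up):

    signal-cli -u +15550001111 send -m "hello" +15555555555
    signal-cli -u +15550001111 receive

Every "Got receipt." envelope that comes back carries a (device: N) field, one per device that received the message.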

Why Signal works that way

When I shared an earlier draft of this article with the Signal security team, they stated this was a necessary trade-off, as each device carries a unique cryptographic key anyway, and that:

Signal encrypts messages individually to each recipient device. Thus as long as there is a "delivery receipt" feature, it will be possible to learn which recipient devices are online, for example by sending an encrypted message to a subset of the recipient devices, and seeing whether a delivery receipt is received or not.

The alternative seems to be to either disable receipt notifications or share the same private key among different devices, which introduces other problems:

Having all recipient devices share the same encryption keys would render the Diffie-Hellman ratcheting which is part of the Signal protocol ineffective, since all devices (including offline ones) would have to use synchronized DH ratchet key pairs, preventing these values from adding fresh randomness. In addition, it would add massive protocol complexity and fragility to try to keep recipient devices synchronized, while trying to achieve the (probably-infeasible) goal of eliminating all ways to distinguish recipient devices.

I am not certain those tradeoffs are that clear-cut, however. I am not a cryptographer, and specifically not very familiar with the "ratcheting" algorithm behind the "Signal protocol" (or is it called Noise now?), but it seems to me there should be a way to provide multi-device, multi-key encryption, without revealing per-device identifiers to other clients. In particular, I do not understand what purpose those integers serve: maybe they are automatically generated by signal-cli and are just a side-effect of a fundamental property of the protocol, in which case I would understand why they would be unavoidable. To be fair, other cryptographic systems also share similar problems: an encrypted OpenPGP email usually embeds metadata about source and destination addresses, as email headers are not encrypted. Even a normal OpenPGP encrypted blob includes OpenPGP key data by default, although there are ways to turn that off and make sure an encrypted blob is just an undecipherable blob. The problem with this, of course, is that many critics of OpenPGP present it as an old technology that should be replaced by more modern alternatives like Signal, so it's a bit disappointing to see it suffers from similar metadata exposure problems as older protocols.
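For what it's worth, the OpenPGP mitigation mentioned above looks something like this (the recipient address is a placeholder):

    gpg --encrypt --hidden-recipient alice@example.com message.txt

The --hidden-recipient option (like the related --throw-keyids) replaces the recipient key ID in the encrypted message with zeroes, so the resulting blob no longer advertises who can decrypt it.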

But apart from cryptographic properties, there are certain user expectations regarding Signal, and my experience with this specific issue is that this property certainly breaks some privacy expectations for users. I'm not sure people would choose to have delivery notifications if they were given the choice.

Other metadata issues

There are other metadata issues in Signal, of course. Like receipt notifications, they are tradeoffs between usability and privacy. The most notable one is probably how Signal shares your contact list. The user-visible effect is the "Bob is on Signal!" message that pops up when the server figures that out. The Signal people have done extensive research to make this work securely while at the same time leveraging the contacts on your phone, but it's still a surprising phenomenon to new users who don't know about the specifics of how this is implemented.

Another one is how groups are opt-out only: anyone can add you to a group without your consent, which shares your phone number with the other members of the group, a bit like how carbon copies in email reveal a social network.

Compared with groups and new-user notifications, the receipt notification issue is a little more pernicious: the leak is not visible at all to users unless they run signal-cli... While people clearly see each other's presence in a group, they definitely will not know that those little checkmarks disclose more information than they seem to.

The bottom line is that crypto and security are hard to implement, but also hard to make visible to users. Signal does a great job at making a solid communication application that provides decent security, but it can have surprising properties even for skilled engineers who thought they knew about the security properties of the system, so I am worried about my fellow non-technical friends and their expectations of privacy...


My free software activities, July 2018

Anarcat - Fri, 07/27/2018 - 13:37
Debian Long Term Support (LTS)

This is my monthly Debian LTS report.

Most of my hours this month were spent updating the Mercurial package in jessie to catch up with all the work we had done in wheezy that was never forward-ported (DLA-1414-1, fixing CVE-2017-9462, CVE-2017-17458, CVE-2018-1000132, OVE-20180430-0001, OVE-20180430-0002, and OVE-20180430-0004). Unfortunately, work was impeded by how upstream now refuses to get CVE identifiers for new issues they discover in the process, which meant that I actually missed three more patches which were required to fix the subrepo vulnerability (CVE-2017-17458). For other issues, upstream at least tried to get identifiers through the OVE system, which is not as well integrated in our security tracker but does allow some cross-distro collaboration at least. The regression advisory was published as DLA-1414-2.

Overall, the updates of the Mercurial package were quite difficult, as the test suite would fail because the order of one test would vary between builds (and not runs!), which was quite confusing. I originally tried fixing this by piping the output of the test suite through sort to get consistent output but, after vetting the idea with one of the upstream maintainers (durin42), I ended up sorting the dictionary in the code directly.

I have also uploaded fixes for cups (DLA-1412-1, fixing CVE-2017-18190 and CVE-2017-18248) and dokuwiki (DLA-1413-1, fixing CVE-2017-18123).

Other activities

This month was fairly quiet otherwise, as I was on vacation.

I still managed to push a few projects forward. The pull request to add nopaste to ELPA was met with skepticism considering there is already another paste tool in ELPA called webpaste.el which takes the different (and unfortunate) approach of reimplementing all pastebins natively, instead of reusing the existing paste programs. I have, incidentally, discovered similar functionality in my terminal emulator, in the form of urxvt-selection-pastebin although I have yet to try (and probably patch) that approach.

We have also been dealing with a vast attack on IRC servers, primarily aimed at hurting the reputation of Freenode operators, but affecting all IRC networks. On top of implementing custom measures to deal with the problem on our networks, I have contributed some documentation to help users and improvements to an IRC service to help with the attack.

I've also had a great conversation with the author of croc, a derivative of magic-wormhole because of flaws I felt were present in the croc implementation. It seems I was able to convince the author to do the right thing and future versions of the program might be fully compatible with wormhole, which is great news.


My free software activities, June 2018

Anarcat - Thu, 06/28/2018 - 12:55

It's been a while since I've done a report here! Since I need to do one for LTS, I figured I would also catch you up with the work I've done in the last three months. Maybe I'll make that my new process: quarterly reports would reduce the overhead on my side with little loss for you, my precious (few? many?) readers.

Debian Long Term Support (LTS)

This is my monthly Debian LTS report.

I omitted doing a report in May because I didn't spend a significant number of hours, so this also covers a handful of hours of work in May.

May and June were strange months to work on LTS, as we made the transition between wheezy and jessie. I have worked on all three LTS releases now, and I must have been absent during the last transition because I felt this one was a little confusing to go through. Maybe it's because I was on frontdesk duty during that time...

For a week or two it was unclear if we should have worked on wheezy, jessie, or both, or even how to work on either. I documented which packages needed an update from wheezy to jessie and proposed a process for the transition period. This generated a good discussion, but I am not sure we resolved the problems we had this time around in the long term. I also sent patches to the security team in the hope they would land in jessie before it turns into LTS, but most of those ended up being postponed to LTS.

Most of my work this month was spent actually porting the Mercurial fixes from wheezy to jessie. Technically, the patches were ported from upstream 4.3 and led to some pretty interesting results in the test suite, which made the package fail to build from source non-reproducibly. Because I couldn't figure out how to fix this in the allotted time, I uploaded the package to my usual test location in the hope someone else picks it up. The test package fixes 6 issues (CVE-2018-1000132, CVE-2017-9462, CVE-2017-17458 and three issues without a CVE).

I also worked on cups in a similar way, sending a test package to the security team for 2 issues (CVE-2017-18190, CVE-2017-18248). Same for Dokuwiki, where I sent a patch for a single issue (CVE-2017-18123). Those have yet to be published, however, and I will hopefully wrap that up in July.

Because I was looking for work, I ended up doing meta-work as well. I made a prototype that would use the embedded-code-copies file to populate data/CVE/list with related packages, as a way to address a problem we have in LTS triage, where packages that were renamed between suites do not get correctly added to the tracker. It ended up being rejected because the changes were too invasive, but it led to Brian May suggesting another approach; we'll see where that goes.

I've also looked at splitting up that dreaded data/CVE/list, but my results were negative: it looks like git already handles that huge file efficiently enough. While a split-up list might be easier on editors, it would be a massive change, and the idea was eventually refused by the security team.

Other free software work

With my last report dating back to February, this will naturally be a little imprecise, as three months have passed. But let's see...

LWN

I wrote eight articles in the last three months, for an average of nearly three articles a month. I was aiming for an average of one or two a week, so I didn't reach my goal. My last article about KubeCon generated a lot of feedback, probably the best I have ever received. It seems I struck a chord for a lot of people, so that certainly feels nice.

Linkchecker

Usual maintenance work, but we finally got access to the Linkchecker organization on GitHub, which meant a bit of reorganizing. The only bit missing now is the PyPI namespace, but that should also come soon. The code of conduct and contribution guides were finally merged after we clarified project membership. This gives us issue templates which should help us deal with the constant flow of issues that come in every day.

The biggest concern I have with the project now is the C parser and the outdated Windows executable. The latter has been removed from the website so hopefully Windows users won't report old bugs (although that means we won't gain new Windows users at all) and the former might be fixed by a port to BeautifulSoup.

Email over SSH

I did a lot of work to switch away from SMTP and IMAP to synchronise my workstation and laptops with my mailserver. Having the privilege of running my own server has its perks: I have SSH access to my mail spool, which brings the opportunity for interesting optimizations.

The first is called rsendmail. Inspired by work from Don Armstrong and David Bremner, rsendmail is a Python program I wrote from scratch to deliver email over a pipe, securely. I do not trust the sendmail command: its behavior can vary a lot between platforms (e.g. allowing flushing the mail queue or printing it) and I wanted to reduce the attack surface. It works with another program I wrote called sshsendmail, which connects to it over a pipe. It integrates well into "dumb" MTAs like nullmailer, but I also use it with the popular Postfix, without problems.
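Conceptually, the setup looks something like the sketch below; the actual rsendmail and sshsendmail interfaces are not documented in this post, so the server-side command name and arguments here are placeholders:

    # on the workstation: pipe the message over SSH to a single restricted
    # command on the server, which performs the actual delivery
    ssh -T mailserver deliver-message recipient@example.com < message.eml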

The second is to switch from OfflineIMAP to Syncmaildir (SMD). The latter allows synchronization over SSH only. The migration was a little difficult but I very much like the results: SMD is faster than OfflineIMAP and works transparently in the background.

I really like to use SSH for email. I used to have my email password stored all over the place: in my Postfix config, in my email clients' memory, it was a mess. With the new configuration, things just work unattended and email feels like a solved problem, at least the synchronization aspects of it.

Emacs

As often happens, I've done some work on my Emacs configuration. I switched to a new Solarized theme, the bbatsov version which has support for a light and dark mode and generally better colors. I had problems with the cursor which are unfortunately unfixed.

I learned about and used the Emacs iPython Notebook project (EIN) and filed a feature request to replicate the "restart and run" behavior of the web interface. Otherwise it's really nice to have a decent editor to work on Python notebooks, and I have used this to work on the terminal emulators series and the related source code.

I have also tried to complete my conversion to Magit, a pretty nice wrapper around git for Emacs. Some of my usual git shortcuts have good replacements, but not all. For example, those are equivalent:

  • vc-annotate (C-x C-v g): magit-blame
  • vc-diff (C-x C-v =): magit-diff-buffer-file

Those do not have a direct equivalent:

  • vc-next-action (C-x C-q, or F6): anarcat/magit-commit-buffer, see below
  • vc-git-grep (F8): no replacement

I wrote my own replacement for "diff and commit this file" as the following function:

(defun anarcat/magit-commit-buffer ()
  "commit the changes in the current buffer on the fly

This is different than `magit-commit' because it calls `git commit'
without going through the staging area AKA index first. This is a
replacement for `vc-next-action'.

Tip: setting the git configuration parameter `commit.verbose' to 2
will show the diff in the changelog buffer for review. See
`git-config(1)' for more information.

An alternative implementation was attempted with `magit-commit':

  (let ((magit-commit-ask-to-stage nil))
    (magit-commit (list \"commit\" \"--\"
                        (file-relative-name buffer-file-name))))

But it seems `magit-commit' asserts that we want to stage content
and will fail with: `(user-error \"Nothing staged\")'. This is why
this function calls `magit-run-git-with-editor' directly instead."
  (interactive)
  (magit-run-git-with-editor (list "commit" "--" (file-relative-name buffer-file-name))))

It's not very pretty, but it works... Mostly. Sometimes the magit-diff buffer becomes out of sync, but the --verbose output in the commitlog buffer still works.

I've also looked at git-annex integration. The magit-annex package did not work well for me: the file listing is really too slow. So I found the git-annex.el package, but did not try it out yet.

While working on all of this, I fell into a different rabbit hole: I found it inconvenient to "pastebin" stuff from Emacs, as it would involve selecting a region, piping it to pastebinit and copy-pasting the URL found in the *Messages* buffer. So I wrote this first prototype:

(defun pastebinit (begin end)
  "pass the region to pastebinit and add output to killring

TODO: prompt for possible pastebins (pastebinit -l) with prefix arg

Note that there's a `nopaste.el' project which already does this,
which we should use instead."
  (interactive "r")
  (message "use nopaste.el instead")
  (let ((proc (make-process :filter #'pastebinit--handle
                            :command '("pastebinit")
                            :connection-type 'pipe
                            :buffer nil
                            :name "pastebinit")))
    (process-send-region proc begin end)
    (process-send-eof proc)))

(defun pastebinit--handle (proc string)
  "handle output from pastebinit asynchronously"
  (let ((url (car (split-string string))))
    (kill-new url)
    (message "paste uploaded and URL added to kill ring: %s" url)))

It was my first foray into asynchronous process operations in Emacs: difficult and confusing, but it mostly worked. Those who know me know what's coming next, however: I found not only one, but two libraries for pastebins in Emacs: nopaste and (after patching nopaste to add asynchronous support and customize support, of course) debpaste.el. I'm not sure where that will go: there is a proposal to add nopaste to Debian that was discussed a while back, and I made a detailed report there.

Monkeysign

I made a minor release of Monkeysign to cover for CVE-2018-12020 and its GPG sigspoof vulnerability. I am not sure where to take this project anymore, and I opened a discussion to possibly retire the project completely. Feedback welcome.

ikiwiki

I wrote a new ikiwiki plugin called bootstrap to fix table markup to match what the Bootstrap theme expects. This was particularly important for the previous blog post which uses tables a lot. This was surprisingly easy and might be useful to tweak other stuff in the theme.

Random stuff
  • I wrote up a review of security of APT packages when compared with the TUF project, in TufDerivedImprovements
  • contributed to about 20 different repositories on GitHub, too numerous to list here

Historical inventory of collaborative editors

Anarcat - Tue, 06/26/2018 - 13:19

A quick inventory of major collaborative editor efforts, in chronological order.

As with any such list, it must start with an honorable mention of the mother of all demos, during which Doug Engelbart presented what is basically an exhaustive list of all possible software written since 1968. This includes not only a collaborative editor, but also graphics, programming and math editors.

Everything else after that demo is just a slower implementation to compensate for the acceleration of hardware.

Software gets slower faster than hardware gets faster. - Wirth's law

So without further ado, here is the list of notable collaborative editors that I could find. By "notable" I mean that they introduce a notable feature or implementation detail.

  • SubEthaEdit (2003-2015?; Mac-only): first collaborative, real-time, multi-cursor editor I could find. A reverse-engineering attempt in Emacs failed to produce anything.
  • DocSynch (2004-2007; ?): built on top of IRC!
  • Gobby (2005-now; C, multi-platform): first open, solid and reliable implementation, and still around! The protocol ("libinfinoted") is notoriously hard to port to other editors (e.g. Rudel failed to implement this in Emacs). The 0.7 release in January 2017 adds possible Python bindings that might improve this. Interesting plugins: autosave to disk.
  • Ethercalc (2005-now; Web, JavaScript): first spreadsheet, along with Google Docs.
  • moonedit (2005-2008?; ?): original website died. Other users' cursors were visible and emulated keystroke noises were played. Included a calculator and music sequencer!
  • synchroedit (2006-2007; ?): first web app.
  • Inkscape (2007-2011; C++): first graphics editor with collaborative features, backed by the "whiteboard" plugin built on top of Jabber, now defunct.
  • Abiword (2008-now; C++): first word processor.
  • Etherpad (2008-now; Web): first solid web app. Originally developed as a heavy Java app in 2008, acquired and open-sourced by Google in 2009, then rewritten in Node.js in 2011. Widely used.
  • Wave (2009-2010; Web, Java): failed attempt at a grand protocol unification.
  • CRDT (2011; specification): standard for replicating a document's data structure among different computers reliably.
  • Operational transform (2013; specification): similar to CRDT, yet, well, different.
  • Floobits (2013-now; ?): commercial, but with open-source plugins for different editors.
  • LibreOffice Online (2015-now; Web): free Google Docs equivalent, now integrated in Nextcloud.
  • HackMD (2015-now; ?): commercial but open source. Inspired by Hackpad, which was bought up by Dropbox.
  • Cryptpad (2016-now; web?): spin-off of XWiki. Encrypted, "zero-knowledge" on the server.
  • Prosemirror (2016-now; Web, Node.js): "Tries to bridge the gap between Markdown text editing and classical WYSIWYG editors." Not really an editor, but something that can be used to build one.
  • Quill (2013-now; Web, Node.js): rich text editor, also JavaScript. Not sure it is really collaborative.
  • Teletype (2017-now; WebRTC, Node.js): for GitHub's Atom editor; introduces a "portal" idea that makes guests follow what the host is doing across multiple documents. Peer-to-peer with WebRTC after a visit to an introduction server; CRDT-based.
  • Tandem (2018-now; Node.js?): plugins for Atom, Vim, Neovim, Sublime... Uses a relay to set up peer-to-peer connections; CRDT-based. Dubious license issues were resolved thanks to the involvement of Debian developers, which makes it a promising standard to follow in the future.

Other lists

Montréal-Python 72 - Carroty Xenophon

Montreal Python - Sun, 06/03/2018 - 23:00

Let’s meet one last time before our Summer break! Thanks to Notman House for sponsoring this event.

Presentations

Socket - Éric Lafontaine

Most of our everyday jobs include making requests over the internet or hosting a web solution for our company. Each connection we make uses the socket API in some way that is not always evident. I hope, by giving this talk, to elucidate some of the magic contained in the socket API. I'm also going to give away some tricks that I've been using since understanding that API.

Probabilistic Programming and Bayesian Modeling with PyMC3 - Christopher Fonnesbeck

Bayesian statistics offers powerful, flexible methods for data analysis that, because they are based on full probability models, confer several benefits to analysts including scalability, straightforward quantification of uncertainty, and improved interpretability relative to classical methods. The advent of probabilistic programming has served to abstract the complexity associated with fitting Bayesian models, making such methods more widely available. PyMC3 is software for probabilistic programming in Python that implements several modern, computationally-intensive statistical algorithms for fitting Bayesian models. PyMC3's intuitive syntax is helpful for new users, and its reliance on the Theano library for fast computation has allowed developers to keep the code base simple, making it easy to extend and expand the software to meet analytic needs. Importantly, PyMC3 implements several next-generation Bayesian computational methods, allowing for more efficient sampling for small models and better approximations to larger models with larger associated datasets. I will demonstrate how to construct, fit and check models in PyMC3, using a selection of applied problems as motivation.

When

Monday June 11th, 2018 at 6PM

Where

Notman House

51 Sherbrooke West

Montréal, QC

H2X 1X2

Schedule
  • 6:00PM - Doors open
  • 6:30PM - Presentations
  • 8:00PM - End of the event
  • 8:15PM - Benelux

Diversity, education, privilege and ethics in technology

Anarcat - Sat, 05/26/2018 - 11:48

This article is part of a series on KubeCon Europe 2018.

This is a rant I wrote while attending KubeCon Europe 2018. I do not know how else to frame this deep discomfort I have with the way one of the most cutting-edge projects in my community is moving. I see it as a symptom of so many things wrong in society at large and figured it was as good a way as any to open the discussion regarding how free software communities seem to naturally evolve into corporate money-making machines with questionable ethics.

A white man groomed by a white woman

Diversity and education

There is often a great point made of diversity at KubeCon, and that is something I truly appreciate. It's one of the places where I have seen the largest efforts towards that goal; I was impressed by the efforts made in Austin, and mentioned it in my overview of that conference back then. Yet it is still one of the less diverse places I've ever participated in: in comparison, Pycon "feels" more diverse, for example. And then, of course, there's real life out there, where women constitute basically half the population. This says something about the actual effectiveness of diversity efforts in our communities.

4000 white men

The truth is that contrary to programmer communities, "operations" knowledge (sysadmin, SRE, DevOps, whatever it's called these days) comes not from institutional education, but from self-learning. Even though I have years of university training, the day-to-day knowledge I need in my work as a sysadmin comes not from the university, but from late-night experiments on my personal computer network. This was first on the Macintosh, then on the FreeBSD source code passed down as a magic word from an uncle, and finally through Debian, consecrated as the leftist's true computing way. Sure, my programming skills were useful there, but I acquired those before going to university: even there, teachers expected students to learn programming languages (such as C!) in-between sessions.

Diversity program

The real solutions to the lack of diversity in our communities come not only from a change in culture, but also from real investments in society at large. The mega-corporations subsidizing events like KubeCon make sure they get a lot of good press from those diversity programs. However, the money they spend on those is nothing compared to the tax evasion in their home states. As an example, Amazon recently put 7000 jobs on hold because of a tax the city of Seattle wanted to impose on corporations to help the homeless population. Google, Facebook, Microsoft, and Apple all evade taxes like gangsters. This is important because society changes partly through education, and that costs money. Education is how more traditional STEM sectors like engineering and medicine have changed: women, minorities, and poorer populations were finally allowed into schools after the epic social struggles of the 1970s finally yielded more accessible education. The same way that culture changes are seeing a backlash, the tide is turning there as well and the trend is reversing towards more costly, less accessible education. But not everywhere. The impacts of education changes are long-lasting. By evading taxes, those companies are keeping the state from revenues that could level the playing field through affordable education.

Hell, any education in the field would help. There is basically no sysadmin education curriculum right now. Sure, you can follow Cisco CCNA or MCSE private training. But anyone who's been seriously involved in running any computing infrastructure knows those are a scam: they will tie you down to a proprietary universe (Cisco and Microsoft, respectively) and probably just to "remote hands monkey" positions, and rarely to executive positions.

Velocity

Besides, providing an education curriculum would require the field to slow down so that knowledge would settle down and trickle into a curriculum. Configuration management is pretty old, but because the changes in tooling are fast, any curriculum built in the last decade (or even less) quickly becomes irrelevant. Puppet publishes a new release every 6 month, Kubernetes is barely 4 years old now, and is changing rapidly with a ~3 month release schedule.

Here at KubeCon, Mark Zuckerberg's mantra of "move fast and break things" is everywhere. We call it "velocity": where you are going does not matter as much as how fast you're going there. At one of the many keynotes, Abby Kearns from the Cloud Foundry Foundation boasted about how Home Depot, in trying to sell more hammers than Amazon, is now deploying code to production multiple times a day. I am still unclear as to whether this made Home Depot actually sell more hammers, or whether it's something we should even care about in the first place. Shouldn't we converge on selling fewer hammers? Making them more solid and reliable, so that they are passed down through generations instead of breaking and having to be replaced all the time?

Home Depot ecstasy

We're solving a problem that wasn't there, in some new absurd faith that code deployments will naturally make people happier, by making sure Home Depot sells more hammers. And that's after telling us that Cloud Foundry helped the USAF save 600M$ by moving their databases to the cloud. No one seems bothered by the idea that the most powerful military in existence would move state secrets into a private cloud, out of the control of any government. It's the name of the game at KubeCon.

USAF saves (money)

In his keynote, Alexis Richardson, CEO of Weaveworks, presented the toaster project as an example of what not to do. "He did not use any sourced components, everything was built from scratch, by hand", obviously missing the fact that toasters are deliberately not built from reusable parts, as part of their planned-obsolescence design. The goal of the toaster experiment is also to show how fragile our civilization has become precisely because we depend on layers upon layers of parts. In this totalitarian view of the world, people are also "reusable" or, in this case, "disposable" components. Not just the white dudes in California, but also the workers outsourced out of the USA decades ago; it depends on precious metals and the miners of Africa, on the specialized labour and intricate knowledge of the factory workers in Asia, and on the flooded forests of the first nations powering this terrifying surveillance machine.

Privilege

"Left to his own devices he couldn’t build a toaster. He could just about make a sandwich and that was it." -- Mostly Harmless, Douglas Adams, 1992

Staying in a hotel room for a week, all expenses paid, certainly puts things in perspective. Rarely have I felt more privileged in my entire life: someone else makes my food, makes my bed, and magically cleans up the toilet when I'm gone. For me, this is extraordinary, but for many people at KubeCon, it's routine: traveling is part of the rock-star agenda of this community. People get used to being served, both directly in their day-to-day lives and through the complex supply chain of the modern technology that is destroying the planet.

Nothing is like corporate nothing.

The nice little boxes and containers we call the cloud abstract all of this away from us, and those dependencies are actively encouraged in the community. We like containers here, and their image is ubiquitous. We acknowledge that a single person cannot run a Kube shop, because the knowledge is too broad for any one person to handle. While there are interesting collaborative and social ideas in that approach, I am deeply skeptical of its impact on civilization in the long run. We have already created systems so complex that we don't truly know who hacked the Trump election or how. Many feel it was hacked, but it's really just a hunch: there were bots, maybe they were Russian, or maybe from Cambridge? The DNC emails, was that really WikiLeaks? Who knows! Never mind failing closed or open: the system has become so complex that we don't even know how we fail when we do. Even those in the highest positions of power seem unable to protect themselves; politics seems to have become a game of Russian roulette: we cock the bot, roll the secret algorithm, and see which dictator will shoot out.

Ethics

All this is to build a new Skynet; not this one or that one, those already exist. I was able to joke pleasantly about the AI takeover over breakfast with a random stranger without raising so much as an eyebrow: we know it will happen, oh well. I skipped that track in my attendance, but multiple talks at KubeCon are about AI, TensorFlow (it's open source!), self-driving cars, and removing humans from the equation as much as possible, as a general principle. Kubernetes is often shortened to "Kube", which I always hear as a reference to the Borg's almighty ship in Star Trek, the "cube". This might actually make sense given that Kubernetes is an open-source version of Google's internal software incidentally called... Borg. Making such fleeting, tongue-in-cheek references to a totalitarian civilization is not harmless: it makes the notion that AI domination is inescapable, and that resistance truly is futile, more acceptable; the ultimate neo-colonial scheme.

"We are the Borg. Your biological and technological distinctiveness will be added to our own. Resistance is futile."

The "hackers" of our age are building this machine with conscious knowledge of the social and ethical implications of their work. At best, people admit to not knowing what they really are. In the worse case scenario, the AI apocalypse will bring massive unemployment and a collapse of the industrial civilization, to which Silicon Valley executives are responding by buying bunkers to survive the eventual roaming gangs of revolted (and now armed) teachers and young students coming for revenge.

Only the most privileged people in society could imagine such a scenario and actually opt out of society as a whole. Even the robber barons of the 20th century knew they couldn't survive the coming revolution: Andrew Carnegie built libraries after creating the steel empire that drove much of US industrialization near the end of the century, and John D. Rockefeller subsidized education, research, and science. This is not because they were humanists: you do not become an oil tycoon by tending to the poor. Rockefeller said that "the growth of a large business is merely a survival of the fittest", a social Darwinist approach he gladly applied to society as a whole.

But the '70s rebel beat offspring, the children of the cult of Jobs, do not seem to have the depth of analysis to understand what's coming for them. They want to "hack the system", not for everyone, but for themselves. Early on, we learned to be selfish and self-driven: repressed as nerds and rejected at school, we swore vengeance on the bullies of the world, and boy are we getting our revenge. The bullied have become the bullies, and it's not small boys in schoolyards we're bullying, it is entire states, with which companies now negotiate as equals.

The fraud

...but what are you creating exactly?

And that is the ultimate fraud: to make the world believe we are harmless little boys, so repressed that we can't communicate properly. We're so sorry we're awkward; it's because we're all somewhat on the autism spectrum. Isn't that, after all, a convenient affliction for people who would not dare confront the oppression they are creating? It's too easy to hide behind such a real and serious condition, one that does affect people in our community, including truly autistic people who simply cannot make it in the fast-moving world the magical rain man is creating. But the real con is hacking power and political control away from traditional institutions, seen as too slow-moving to really accomplish the "change" that is "needed". We are creating an inextricable technocracy that no one will understand, not even us "experts". Instead of serving the people, the machine is at the mercy of markets and powerful oligarchs.

A recurring ritual at Kubernetes conferences is the KubeCon chant, in which Kelsey Hightower reluctantly engages the crowd:

When I say 'Kube!', you say 'Con!'

'Kube!' 'Con!' 'Kube!' 'Con!' 'Kube!' 'Con!'

Cube Con indeed...

I wish I had some wise parting thoughts about where to go from here or how to change this. The tide seems so strong that all I can do is observe and tell stories. My hope is that the people who need to hear this will take it the right way, but I somehow doubt it. With luck, it might just become irrelevant and everything will fix itself, but somehow I fear things will get worse before they get better.

Catégories: External Blogs

Easier container security with entitlements

Anarcat - lun, 05/21/2018 - 19:00

This article is part of a series on KubeCon Europe 2018.

During KubeCon + CloudNativeCon Europe 2018, Justin Cormack and Nassim Eddequiouaq presented a proposal to simplify the setting of security parameters for containerized applications. Containers depend on a large set of intricate security primitives that can have weird interactions. Because they are so hard to use, people often just turn the whole thing off. The goal of the proposal is to make those controls easier to understand and use; it is partly inspired by mobile apps on iOS and Android platforms, an idea that trickled back into Microsoft and Apple desktops. The time seems ripe to improve the field of container security, which is in desperate need of simpler controls.

The problem with container security

Cormack first stated that container security is too complicated. His slides stated bluntly that "unusable security is not security" and he pleaded for simpler container security mechanisms with clear guarantees for users.

"Container security" is a catchphrase that actually includes all sorts of measures, some of which we have previously covered. Cormack presented an overview of those mechanisms, including capabilities, seccomp, AppArmor, SELinux, namespaces, control groups — the list goes on. He showed how docker run --help has a "ridiculously large number of options"; there are around one hundred on my machine, with about fifteen just for security mechanisms. He said that "most developers don't know how to actually apply those mechanisms to make sure their containers are secure". In the best-case scenario, some people may know what the options are, but in most cases people don't actually understand each mechanism in detail.

He gave the example of capabilities; there are about forty possible values that can be provided for the --cap-drop option, each with its own meaning. He described some capabilities as "understandable", but said that others end up in overly broad boxes. The kernel's data structure limits the system to a maximum of 64 capabilities, so a bunch of functionality was lumped together into CAP_SYS_ADMIN, he said.

Cormack also talked about namespaces and seccomp. While there are fewer namespaces than capabilities, he said that "it's very unclear for a general user what their security properties are". For example, "some combinations of capabilities and namespaces will let you escape from a container, and other ones don't". He also described seccomp as a "long JSON file" as that's the way Kubernetes configures it. Even though he said those files could "usefully be even more complicated" and said that the files are "very difficult to write".

Cormack stopped his enumeration there, but the same applies to the other mechanisms. He said that while developers could sit down and write those policies for their application by hand, it's a real mess and makes their heads explode. So instead developers run their containers in --privileged mode. It works, but it disables all the nice security mechanisms that the container abstraction provides. This is why "containers do not contain", as Dan Walsh famously quipped.

Introducing entitlements

There must be a better way. Eddequiouaq proposed this simple idea: "provide something humans can actually understand without diving into code or possibly even without reading documentation". The solution proposed by the Docker security team is "entitlements": the ability for users to choose simple permissions on the command line. Eddequiouaq said that application users and developers alike don't need to understand the low-level security mechanisms or how they interact within the kernel; "people don't care about that, they want to make sure their app is secure."

Entitlements divide resources into meaningful domains like "network", "security", or "host resources" (like devices). Behind the scenes, Docker translates those into whatever security mechanisms are available; the actual mechanism deployed will therefore vary between runtimes, depending on the implementation. For example, a "confined" network access might mean a seccomp filter blocking all networking-related system calls except socket(AF_UNIX|AF_LOCAL), along with dropping network-related capabilities. AppArmor would deny network access on some platforms while SELinux would do similar enforcement on others.
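As a hedged sketch of what such a translation might look like if done by hand on a Linux host today (an illustrative mapping only, not the proposal's actual output; network-confined.json stands in for a hypothetical seccomp profile denying socket() except for AF_UNIX/AF_LOCAL):

# Roughly what a "confined" network entitlement could expand to: drop
# network-related capabilities and apply a restrictive seccomp profile.
docker run \
  --cap-drop NET_ADMIN \
  --cap-drop NET_RAW \
  --security-opt seccomp=./network-confined.json \
  myorg/webapp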

Eddequiouaq said the complexity of implementing those mechanisms is the responsibility of platform developers. Image developers can ship entitlement lists along with container images created with a regular docker build, and sign the whole bundle with docker trust. Because entitlements do not specify explicit low-level mechanisms, the resulting image is portable to different runtimes without change. Such portability helps Kubernetes on non-Linux platforms do its job.

Entitlements shift the responsibility for configuring sandboxing environments to image developers, but also empower them to deliver security mechanisms directly to end users. Developers are the ones with the best knowledge of what their applications should or should not be doing. Image end users, in turn, benefit from verifiable security properties delivered by the bundles and from the expertise of image developers when they docker pull and run those images.

Eddequiouaq gave a demo of the community's nemesis: Docker inside Docker (DinD). He picked that use case because it requires a lot of privileges, which usually means using the dreaded --privileged flag. With the entitlements patch, he was able to run DinD with network.admin, security.admin, and host.devices.admin, which looks like --privileged, but actually means some protections are still in place. According to Eddequiouaq, "everything works and we didn't have to disable all the seccomp and AppArmor profiles". He also gave a demo of how to build an image and demonstrated how docker inspect shows the entitlements bundled inside the image. With such an image, docker run starts a DinD image without any special flags. That requires a way to trust the content publisher because suddenly images can elevate their own privileges without the caller specifying anything on the Docker command line.
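The exact command-line syntax was not part of the talk, but the contrast between the two approaches might look something like the following sketch, where --entitlement is a hypothetical flag name standing in for whatever interface the final implementation exposes:

# Today: all protections off.
docker run --privileged docker:dind

# With entitlements (hypothetical syntax): broad but still bounded grants.
docker run \
  --entitlement network.admin \
  --entitlement security.admin \
  --entitlement host.devices.admin \
  docker:dind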

Goals and future

The specification aims to provide the best user experience possible, so that people actually start using the security mechanisms provided by the platforms instead of opting out of security configurations when they get a "permission denied" error. Eddequiouaq said that Docker eventually wants to "ditch the --privileged flag because it is really a bad habit". Instead, applications should run with the least privileges they need. He said that "this is not the case; currently, everyone works with defaults that work with 95% of the applications out there." Those Docker defaults, he said, provide a "way too big attack surface".

Eddequiouaq opened the door for developers to define custom entitlements because "it's hard to come up with a set that will cover all needs". One way the team thought of dealing with that uncertainty is to have versions of the specification but it is unclear how that would work in practice. Would the version be in the entitlement labels (e.g. network-v1.admin), or out of band?

Another proposed feature is control over API access and service-to-service communication in the security profile. This is something that's already available on phones, where an app can only talk with a specific set of services. But it is also relevant to containers in Kubernetes clusters, as administrators often need to restrict network access with more granularity than the "open/filter/close" options. An example of such a policy could allow the "web" container to talk with the "database" container, although it might be difficult to specify such high-level policies in practice.
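For comparison, Kubernetes already has a lower-level way to express that kind of rule through its NetworkPolicy API; a minimal sketch, assuming the pods are labeled app=web and app=database and that a network plugin enforcing NetworkPolicy is installed:

kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-database
spec:
  podSelector:
    matchLabels:
      app: database        # the policy applies to the database pods
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: web         # only the web pods may connect to them
EOF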

While entitlements are now implemented in Docker as a proof of concept, Kubernetes has the same usability issues as Docker so the ultimate goal is to get entitlements working in Kubernetes runtimes directly. Indeed, its PodSecurityPolicy maps (almost) one-to-one with the Docker security flags. But as we have previously reported, another challenge in Kubernetes security is that the security models of Kubernetes and Docker are not exactly identical.

Eddequiouaq said that entitlements could help share best security policies for a pod in Kubernetes. He proposed that such configuration would happen through the SecurityContext object. Another way would be an admission controller that would avoid conflicts between the entitlements in the image and existing SecurityContext profiles already configured in the cluster. There are two possible approaches in that case: the rules from the entitlements could expand the existing configuration or restrict it where the existing configuration becomes a default. The problem here is that the pod's SecurityContext already provides a widely deployed way to configure security mechanisms, even if it's not portable or easy to share, so the proposal shouldn't break existing configurations. There is work in progress in Docker to allow inheriting entitlements within a Dockerfile. Eddequiouaq proposed that Kubernetes should implement a simple mechanism to inherit entitlements from images in the admission controller.
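For reference, this is roughly what the existing, non-portable way of expressing such restrictions looks like in a pod's SecurityContext today; a minimal sketch with placeholder names:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hardened-example      # placeholder pod name
spec:
  containers:
  - name: app
    image: myorg/webapp       # placeholder image
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
        add: ["NET_BIND_SERVICE"]
EOF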

The Docker security team wants to create a "widely adopted standard" supported by Docker swarm, Kubernetes, or any container scheduler. But it's still unclear how deep into the Kubernetes stack entitlements belong. In the team's current implementation, Docker translates entitlements into the security mechanisms right before calling its runtime (containerd), but it might be possible to push the entitlements concept straight into the runtime itself, as it knows best how the platform operates.

Some readers might also notice fundamental similarities between this and other mechanisms such as OpenBSD's pledge(), which made me wonder if entitlements belong in user space in the first place. Cormack observed that seccomp was such a "pain to work with to do complicated policies". He said that having eBPF seccomp filters would make it easier to deal with conflicts between policies and also mentioned the work done on the Checmate and Landlock security modules as interesting avenues to explore. It seems that none of those kernel mechanisms are ready for prime time, at least not to the point that Docker can use them in production. Eddequiouaq said that the proposal was open to changes and discussion so this is all work in progress at this stage. The next steps are to make a proposal to the Kubernetes community before working on an actual implementation outside of Docker.

I find the core idea of protecting users from all the complicated stuff in container security interesting. It is a recurring theme in container security; we have previously discussed proposals to add container identifiers directly in the kernel, for example. Everyone knows security is sensitive and important in Kubernetes, yet doing it correctly is hard. This is a recipe for disaster, one that has struck in high-profile cases recently. Hopefully, having such easier and cleaner mechanisms will help users, developers, and administrators alike.

A YouTube video and slides [PDF] of the talk are available.

This article first appeared in the Linux Weekly News.

Catégories: External Blogs

Epic Lameness

Eric Dorland - lun, 09/01/2008 - 17:26
SF.net now supports OpenID. Hooray! I'd like to make a comment on a thread about the RTL8187se chip I've got in my new MSI Wind. So I go to sign in with OpenID and instead of signing me in it prompts me to create an account with a name, username and password for the account. Huh? I just want to post to their forum, I don't want to create an account (at least not explicitly, if they want to do it behind the scenes fine). Isn't the point of OpenID to not have to create accounts and particularly not have to create new usernames and passwords to access websites? I'm not impressed.
Catégories: External Blogs

Sentiment Sharing

Eric Dorland - lun, 08/11/2008 - 23:28
Biella, I am from there and I do agree. If I was still living there I would try to form a team and make a bid. Simon even made noises about organizing a bid at DebConfs past. I wish he would :)

But a DebConf in New York would be almost as good.
Catégories: External Blogs