Skip to main content

Feed aggregator

Caldwell Partners' Cyber Advisory Board Service

Linux Journal - Fri, 09/15/2017 - 11:04

For many enterprises, cyber risk is the top business risk. Meanwhile, there is simply not a sufficiently large talent pool of cyber-risk professionals to satisfy the ever-growing demand. more>>

Categories: Linux News

Solving Physics Problems on Linux

Linux Journal - Thu, 09/14/2017 - 08:24

Several years ago, I wrote an article on using Elmer to solve complicated physics problems. Elmer has progressed quite a bit since then, so I thought it would be worth taking a fresh look at this simulation software. more>>

Categories: Linux News

I'll Gladly Pay You Tuesday for a Hamburger Today

Linux Journal - Wed, 09/13/2017 - 07:09

My day job pays me on the 15th and last day of every month, unless those days land on a weekend, in which case I get paid the Friday before. With those rules, creating a Google Calendar event is shockingly difficult. In fact, it's not possible to create a recurring event with those rules using Google's GUI scheduling tool. more>>

Categories: Linux News

Watermarking Images--from the Command Line

Linux Journal - Tue, 09/12/2017 - 08:52

Us geeks mostly think of the command line as the best place for text manipulation. It's a natural with cat, grep and shell scripts. But although you can't necessarily view your results from within a typical terminal window, it turns out to be pretty darn easy to analyze and manipulate images from within a shell script. more>>

Categories: Linux News

Chasing Carrots' Pressure Overdrive

Linux Journal - Mon, 09/11/2017 - 08:52

A "funky, four-wheeled shoot 'em up" is how independent game-developer Chasing Carrots describes its newest game release Pressure Overdrive for Linux, Mac OS, Windows and Xbox. more>>

Categories: Linux News

Non-Linux FOSS: Mac2Imgur

Linux Journal - Fri, 09/08/2017 - 15:51

I love to share images with people quickly. They could be cat photos or screenshots. Usually I post those silly images to Twitter and Facebook using Buffer, but occasionally, I just want to send a quick image to a single person. (This is usually when I'm trying to show my computer via screenshot.) more>>

Categories: Linux News

Heirloom Software: the Past as Adventure

Linux Journal - Thu, 09/07/2017 - 07:17

Through the years, I've spent what might seem to some people an inordinate amount of time cleaning up and preserving ancient software. My Retrocomputing Museum page archives any number of computer languages and games that might seem utterly obsolete. more>>

Categories: Linux News

SUSE Linux Enterprise Server for SAP Applications

Linux Journal - Wed, 09/06/2017 - 09:46

Saving customers time, effort and budget as they implement SAP landscapes, including on-premises and now on-demand, are the core selling points for SUSE Linux Enterprise Server for SAP Applications.

The latest release of the SAP-focused SUSE Linux server is also now available as the operating system for SAP solutions on Google Cloud Platform (GCP). more>>

Categories: Linux News

Classifying Text

Linux Journal - Tue, 09/05/2017 - 09:35

In my last few articles, I've looked at several ways one can apply machine learning, both supervised and unsupervised. This time, I want to bring your attention to a surprisingly simple—but powerful and widespread—use of machine learning, namely document classification. more>>

Categories: Linux News

My free software activities, August 2017

Anarcat - Sat, 09/02/2017 - 15:16
Debian Long Term Support (LTS)

This is my monthly Debian LTS report. This month I worked on a few major packages that took a long time instead of multiple smaller issues. Affected packages were Mercurial, libdbd-mysql-perl and Ruby.

Mercurial updates

Mercurial was vulnerable to two CVEs: CVE-2017-1000116 (command injection on clients through malicious ssh URLs) and CVE-2017-1000115 (path traversal via symlink). The former is an issue that actually affects many other similar software like Git (CVE-2017-1000117), Subversion (CVE-2017-9800) and even CVS (CVE-2017-12836). The latter symlink issue is a distinct issue that came up during an internal audit.

The fix, shipped as DLA-1072-1, involved a rather difficult backport, especially because the Mercurial test suite takes a long time to complete. This reminded me of the virtues of DEB_BUILD_OPTIONS=parallel=4, which sped up the builds considerably. I also discovered that the Wheezy build chain doesn't support sbuild's --source-only-changes flag which I had hardcoded in my sbuild.conf file. This seems to be simply because sbuild passes --build=source to dpkg-buildpackage, an option that is supported only in jessie or later.

libdbd-mysql-perl

I have worked on fixing two issues with the libdbd-mysql-perl package, CVE-2017-10788 and CVE-2017-10789, which resulted in the DLA-1079-1 upload. Behind this mysteriously named package sits a critical piece of infrastructure, namely the mysql commandline client which is probably used and abused by hundreds if not thousands of home-made scripts, but also all of Perl's MySQL support, which is probably used by even a larger base of software.

Through the Debian bug reports (Debian bug #866818 and Debian bug #866821), I have learned that the patches existed in the upstream tracker but were either ignored or even reverted in the latest 4.043 upstream release. It turns out that there are talks of forking that library because of maintainership issue. It blows my mind that such an important part of MySQL is basically unmaintained.

I ended up backporting the upstream patches, which was also somewhat difficult because of the long-standing issues with SSL support in MySQL. The backport there was particularly hard to test, as you need to run that test suite by hand, twice: once with a server configured with a (valid!) SSL certificate and one without (!). I'm wondering how much time it is really worth spending on trying to fix SSL in MySQL, however. It has been badly broken forever, and while the patch is an improvement, I would actually still never trust SSL transports in MySQL over an untrusted network. The few people that I know use such transports wrap their connections around a simpler stunnel instead.

The other issue was easier to fix so I submitted a pull request upstream to make sure that work isn't lost, although it is not clear what the future of that patch (or project!) will be at this point.

Rubygems

I also worked on the rubygems issues, which, thanks to the "vendoring" practice of the Ruby community, also affects the ruby1.9 package. 4 distinct CVEs were triaged here (CVE-2017-0899, CVE-2017-0900, CVE-2017-0901 and CVE-2017-0902) and I determined the latter issue didn't affect wheezy as rubygems doesn't do its own DNS resolution there (later versions lookup SRV records).

This is another package where the test suite takes a long time to run. Worse, the packages in Wheezy actually fails to build from source: the test suites just fail in various steps, particularly because of dh key too small errors for Rubygems, but also other errors for Ruby. I also had trouble backporting one test which I had to simply skip for Rubygems. I uploaded and announced test packages and hopefully I'll be able to complete this work soon, although I would certainly appreciate any help on this...

Triage

I took a look at the sox, libvorbis and exiv2 issues. None had fixes available. sox and exiv2 were basically a list of fuzzing issues, which are often minor or at least of unknown severity. Those would have required a significant amount of work and I figured I would prioritize other work first.

I also triaged CVE-2017-7506, which doesn't seem to affect the spice package in wheezy, after doing a fairly thorough audit of the code. The vulnerability is specifically bound to the reds_on_main_agent_monitors_config function, which is simply not present in our older version. A hostile message would fall through the code and not provoke memory allocation or out of bounds access, so I simply marked the wheezy version as not-affected, something which usually happens during the original triage but can also happen during the actual patching work, as in this case.

Other free software work

This describes the volunteer work I do on various free software projects. This month, again, my internal reports show that I spent about the same time on volunteer and paid time, but this is probably a wrong estimate because I spent a lot of time at Debconf which I didn't clock in...

Debconf

So I participated in the 17th Debian Conference in Montreal. It was great to see (and make!) so many friends from all over the world in person again, and I was happy to work on specific issues together with other Debian developers. I am especially thankful to David Bremner for fixing the syncing of the flagged tag when added to new messages (patch series). This allows me to easily sync the one tag (inbox) that is not statically assigned during notmuch new, by using flagged as a synchronization tool. This allows me to use notmuch more easily across multiple machines without having to sync all tags with dump/restore or using muchsync which wasn't working for me (although a new release came out which may fix my issues). The magic incantation looks something like this:

notmuch tag -inbox tag:inbox and not tag:flagged notmuch tag +inbox not tag:inbox and tag:flagged

However, most of my time in the first week (Debcamp) was spent trying to complete the networking setup: configure switches, setup wiring and so on. I also configured an apt-cacher-ng proxy to serve packages to attendees during the conference. I configured it with Avahi to configure clients automatically, which led me to discover (and fix) issue Debian bug #870321) although there are more issues with the autodiscovery mechanism... I spent extra time to document the (somewhat simple) configuration of such a server in the Debian wiki because it was not the first time I had research that procedure...

I somehow thought this was a great time to upgrade my laptop to stretch. Normally, I keep that device running stable because I don't use it often and I don't want to have major traumatizing upgrades every time I leave with it on a trip. But this time was special: there were literally hundreds of Debian developers to help me out if there was trouble. And there was, of course, trouble as it turns out! I had problems with the fonts on my display, because, well, I had suspended (twice) my laptop during the install. The fix was simply to flush the fontconfig cache, and I tried to document this in the fonts wiki page and my upgrades page.

I also gave a short training called Debian packaging 101 which was pretty successful. Like the short presentation I made at the last Montreal BSP, the workshop was based on my quick debian development guide. I'm thinking of expanding this to a larger audience with a "102" course that would discuss more complex packaging problems. But my secret plan (well, secret until now I guess) is to make packaging procedures more uniform in Debian by training new Debian packagers using that same training for the next 2 decades. But I will probably start by just trying to do this again at the next Debconf, if I can attend.

Debian uploads

I also sponsored two packages during Debconf: one was a "scratch an itch" upload (elpa-ivy) which I requested (Debian bug #863216) as part of a larger effort to ship the Emacs elisp packages as Debian packages. The other was an upload of diceware to build the documentation in a separate package and fix other issues I have found in the package during a review.

I also uploaded a bunch of other fixes to the Debian archive:

Signing keys rotation

I also started the process of moving my main OpenPGP certification key by adding a signing subkey. The subkey is stored in a cryptographic token so I can sign things on more than one machine without storing that critical key on all those devices physically.

Unfortunately, this meant that I need to do some shenanigans when I want to sign content in my Debian work, because the new subkey takes time to propagate to the Debian archive. For example, I have to specify the primary key with a "bang" when signing packages (debsign -k '792152527B75921E!' ...) or use inline signatures in email sent for security announcement (since that trick doesn't work in Mutt or Notmuch). I tried to figure out how to better coordinate this next time by reading up documentation on keyring.debian.org, but there is no fixed date for key changes on the rsync interface. There are "monthly changes" so one's best bet is to look for the last change in their git repository.

GitLab.com and LFS migration

I finally turned off my src.anarc.at git repository service by moving the remaining repos to GitLab. Unfortunately, GitLab removed support for git-annex recently, so I had to migrate my repositories to Git-LFS, which was an interesting experience. LFS is pretty easy to use, definitely simpler than git-annex. It also seems to be a good match for the use-case at hand, which is to store large files (videos, namely) as part of slides for presentations.

It turns out that their migration guide could have been made much simpler. I tried to submit those changes to the documentation but couldn't fork the GitLab EE project to make a patch, so I just documented the issue in the original MR for now. While I was there I filed a feature request to add a new reference shortcut (GL-NNN) after noticing a similar token used on GitHub. This would be a useful addition because I often have numbering conflicts between Debian BTS bug numbers and GitLab issues in packages I maintain there. In particular, I have problems using GitLab issue numbers in Monkeysign, because commit logs end up in Debian changelogs and will be detected by the Debian infrastructure even though those are GitLab bug numbers. Using such a shortcut would avoid detection and such a conflict.

Numpy-stats

I wrote a small tool to extract numeric statistics from a given file. I often do ad-hoc benchmarks where I store a bunch of numbers in a file and then try to make averages and so on. As an exercise in learning NumPy, I figured I would write such a simple tool, called numpy-stats, which probably sounds naive to seasoned Python scientists.

My incentive was that I was trying to figure out what was the distribution of password length in a given password generator scheme. So I wrote this simple script:

for i in seq 10000 ; do shuf -n4 /usr/share/dict/words | tr -d '\n' done > length

And then feed that data in the tool:

$ numpy-stats lengths { "max": 60, "mean": 33.883293722913464, "median": 34.0, "min": 14, "size": 143060, "std": 5.101490225062775 }

I am surprised that there isn't such a tool already: hopefully I am wrong and will just be pointed towards the better alternative in the comments here!

Safe Eyes

I added screensaver support to the new SafeEyes project, which I am considering as a replacement to the workrave project I have been using for years. I really like how the interruptions basically block the whole screen: way more effective than only blocking the keyboard, because all potential distractions go away.

One feature that is missing is keystrokes and mouse movement counting and of course an official Debian package, although the latter would be easy to fix because upstream already has an unofficial build. I am thinking of writing my own little tool to count keystrokes, since the overlap between SafeEyes and such a counter isn't absolutely necessary. This is something that workrave does, but there are "idle time" extensions in Xorg that do not need to count keystrokes. There are already certain tools to count input events, but none seem to do what I want (most of them are basically keyloggers). It would be an interesting test to see if it's possible to write something that would work both for Xorg and Wayland at the same time. Unfortunately, preliminary research show that:

  1. in Xorg, the only way to implement this is to sniff all events, ie. to implement a keylogger

  2. in Wayland, this is completely unsupported. it seems some compositors could implement such a counter, but then it means that this is compositor specific, or, in other words, unportable

So there is little hope here, which brings to my mind "painmeter" as an appropriate name for this future programming nightmare.

Ansible

I sent my first contribution to the ansible project with a small documentation fix. I had an eye opener recently when I discovered a GitLab ansible prototype that would manipulate GitLab settings. When I first discovered Ansible, I was frustrated by the YAML/Jinja DSL: it felt silly to write all this code in YAML when you are a Python developer. It was great to see reasonably well-written Python code that would do things and delegate the metadata storage (and only that!) to YAML, as opposed to using YAML as a DSL.

So I figured I would look at the Ansible documentation on how this works, but unfortunately, the Ansible documentation is severly lacking in this area. There are broken links (I only fixed one page) and missing pieces. For example, the developing plugins page doesn't explain how to program a plugin at all.

I was told on IRC that: "documentation around developing plugins is sparse in general. the code is the best documentation that exists (right now)". I didn't get a reply when asking which code in particular could provide good examples either. In comparison, Puppet has excellent documentation on how to create custom types, functions and facts. That is definitely a turn-off for a new contributor, but at least my pull request was merged in and I can only hope that seasoned Ansible contributors expand on this critical piece of documentation eventually.

Misc

As you can see, I'm all over the place, as usual. GitHub tells me I "Opened 13 other pull requests in 11 repositories" (emphasis mine), which I guess means on top of the "9 commits in 5 repositories" mentioned earlier. My profile probably tells a more detailed story that what would be useful to mention here. I should also mention how difficult it is to write those reports: I basically do a combination of looking into my GitHub and GitLab profiles, the last 30 days of emails and filesystem changes (!!). En vrac, a list of changes which may be of interest:

  • font-large (and its alias, font-small): shortcut to send the right escape sequence to rxvt so it changes its font
  • fix-acer: short script to hardcode the modeline (you remember those?!) for my screen which has a broken EDID pin (so autodetection fails, yay Xorg log files...)
  • ikiwiki-pandoc-quickie: fake ikiwiki renderer that (ab)uses pandoc to generate a HTML file with the right stylesheet to preview Markdown as it may look in this blog (the basic template is missing still)
  • git-annex-transfer: a command I've often been missing in git-annex, which is a way to transfer files between remotes without having to copy them locally (upstream feature request)
  • I linked the graphics of the Debian archive software architecture in the Debian wiki in the hope more people notice it.
  • I did some tweaks on my Taffybar to introduce a battery meter and hoping to have temperature sensors, which mostly failed. there's a pending pull request that may bring some sense into this, hopefully.
  • I made two small patches in Monkeysign to fix gpg.conf handling and multiple email output, a dumb bug I cannot believe anyone noticed or reported just yet. Thanks Valerie for the bug report! The upload of this in Debian is pending a review from the release team.
Categories: External Blogs

My free software activities, August 2017

Anarcat - Sat, 09/02/2017 - 15:16
Debian Long Term Support (LTS)

This is my monthly Debian LTS report. This month I worked on a few major packages that took a long time instead of multiple smaller issues. Affected packages were Mercurial, libdbd-mysql-perl and Ruby.

Mercurial updates

Mercurial was vulnerable to two CVEs: CVE-2017-1000116 (command injection on clients through malicious ssh URLs) and CVE-2017-1000115 (path traversal via symlink). The former is an issue that actually affects many other similar software like Git (CVE-2017-1000117), Subversion (CVE-2017-9800) and even CVS (CVE-2017-12836). The latter symlink issue is a distinct issue that came up during an internal audit.

The fix, shipped as DLA-1072-1, involved a rather difficult backport, especially because the Mercurial test suite takes a long time to complete. This reminded me of the virtues of DEB_BUILD_OPTIONS=parallel=4, which sped up the builds considerably. I also discovered that the Wheezy build chain doesn't support sbuild's --source-only-changes flag which I had hardcoded in my sbuild.conf file. This seems to be simply because sbuild passes --build=source to dpkg-buildpackage, an option that is supported only in jessie or later.

libdbd-mysql-perl

I have worked on fixing two issues with the libdbd-mysql-perl package, CVE-2017-10788 and CVE-2017-10789, which resulted in the DLA-1079-1 upload. Behind this mysteriously named package sits a critical piece of infrastructure, namely the mysql commandline client which is probably used and abused by hundreds if not thousands of home-made scripts, but also all of Perl's MySQL support, which is probably used by even a larger base of software.

Through the Debian bug reports (Debian bug #866818 and Debian bug #866821), I have learned that the patches existed in the upstream tracker but were either ignored or even reverted in the latest 4.043 upstream release. It turns out that there are talks of forking that library because of maintainership issue. It blows my mind that such an important part of MySQL is basically unmaintained.

I ended up backporting the upstream patches, which was also somewhat difficult because of the long-standing issues with SSL support in MySQL. The backport there was particularly hard to test, as you need to run that test suite by hand, twice: once with a server configured with a (valid!) SSL certificate and one without (!). I'm wondering how much time it is really worth spending on trying to fix SSL in MySQL, however. It has been badly broken forever, and while the patch is an improvement, I would actually still never trust SSL transports in MySQL over an untrusted network. The few people that I know use such transports wrap their connections around a simpler stunnel instead.

The other issue was easier to fix so I submitted a pull request upstream to make sure that work isn't lost, although it is not clear what the future of that patch (or project!) will be at this point.

Rubygems

I also worked on the rubygems issues, which, thanks to the "vendoring" practice of the Ruby community, also affects the ruby1.9 package. 4 distinct CVEs were triaged here (CVE-2017-0899, CVE-2017-0900, CVE-2017-0901 and CVE-2017-0902) and I determined the latter issue didn't affect wheezy as rubygems doesn't do its own DNS resolution there (later versions lookup SRV records).

This is another package where the test suite takes a long time to run. Worse, the packages in Wheezy actually fails to build from source: the test suites just fail in various steps, particularly because of dh key too small errors for Rubygems, but also other errors for Ruby. I also had trouble backporting one test which I had to simply skip for Rubygems. I uploaded and announced test packages and hopefully I'll be able to complete this work soon, although I would certainly appreciate any help on this...

Triage

I took a look at the sox, libvorbis and exiv2 issues. None had fixes available. sox and exiv2 were basically a list of fuzzing issues, which are often minor or at least of unknown severity. Those would have required a significant amount of work and I figured I would prioritize other work first.

I also triaged CVE-2017-7506, which doesn't seem to affect the spice package in wheezy, after doing a fairly thorough audit of the code. The vulnerability is specifically bound to the reds_on_main_agent_monitors_config function, which is simply not present in our older version. A hostile message would fall through the code and not provoke memory allocation or out of bounds access, so I simply marked the wheezy version as not-affected, something which usually happens during the original triage but can also happen during the actual patching work, as in this case.

Other free software work

This describes the volunteer work I do on various free software projects. This month, again, my internal reports show that I spent about the same time on volunteer and paid time, but this is probably a wrong estimate because I spent a lot of time at Debconf which I didn't clock in...

Debconf

So I participated in the 17th Debian Conference in Montreal. It was great to see (and make!) so many friends from all over the world in person again, and I was happy to work on specific issues together with other Debian developers. I am especially thankful to David Bremner for fixing the syncing of the flagged tag when added to new messages (patch series). This allows me to easily sync the one tag (inbox) that is not statically assigned during notmuch new, by using flagged as a synchronization tool. This allows me to use notmuch more easily across multiple machines without having to sync all tags with dump/restore or using muchsync which wasn't working for me (although a new release came out which may fix my issues). The magic incantation looks something like this:

notmuch tag -inbox tag:inbox and not tag:flagged notmuch tag +inbox not tag:inbox and tag:flagged

However, most of my time in the first week (Debcamp) was spent trying to complete the networking setup: configure switches, setup wiring and so on. I also configured an apt-cacher-ng proxy to serve packages to attendees during the conference. I configured it with Avahi to configure clients automatically, which led me to discover (and fix) issue Debian bug #870321) although there are more issues with the autodiscovery mechanism... I spent extra time to document the (somewhat simple) configuration of such a server in the Debian wiki because it was not the first time I had research that procedure...

I somehow thought this was a great time to upgrade my laptop to stretch. Normally, I keep that device running stable because I don't use it often and I don't want to have major traumatizing upgrades every time I leave with it on a trip. But this time was special: there were literally hundreds of Debian developers to help me out if there was trouble. And there was, of course, trouble as it turns out! I had problems with the fonts on my display, because, well, I had suspended (twice) my laptop during the install. The fix was simply to flush the fontconfig cache, and I tried to document this in the fonts wiki page and my upgrades page.

I also gave a short training called Debian packaging 101 which was pretty successful. Like the short presentation I made at the last Montreal BSP, the workshop was based on my quick debian development guide. I'm thinking of expanding this to a larger audience with a "102" course that would discuss more complex packaging problems. But my secret plan (well, secret until now I guess) is to make packaging procedures more uniform in Debian by training new Debian packagers using that same training for the next 2 decades. But I will probably start by just trying to do this again at the next Debconf, if I can attend.

Debian uploads

I also sponsored two packages during Debconf: one was a "scratch an itch" upload (elpa-ivy) which I requested (Debian bug #863216) as part of a larger effort to ship the Emacs elisp packages as Debian packages. The other was an upload of diceware to build the documentation in a separate package and fix other issues I have found in the package during a review.

I also uploaded a bunch of other fixes to the Debian archive:

Signing keys rotation

I also started the process of moving my main OpenPGP certification key by adding a signing subkey. The subkey is stored in a cryptographic token so I can sign things on more than one machine without storing that critical key on all those devices physically.

Unfortunately, this meant that I need to do some shenanigans when I want to sign content in my Debian work, because the new subkey takes time to propagate to the Debian archive. For example, I have to specify the primary key with a "bang" when signing packages (debsign -k '792152527B75921E!' ...) or use inline signatures in email sent for security announcement (since that trick doesn't work in Mutt or Notmuch). I tried to figure out how to better coordinate this next time by reading up documentation on keyring.debian.org, but there is no fixed date for key changes on the rsync interface. There are "monthly changes" so one's best bet is to look for the last change in their git repository.

GitLab.com and LFS migration

I finally turned off my src.anarc.at git repository service by moving the remaining repos to GitLab. Unfortunately, GitLab removed support for git-annex recently, so I had to migrate my repositories to Git-LFS, which was an interesting experience. LFS is pretty easy to use, definitely simpler than git-annex. It also seems to be a good match for the use-case at hand, which is to store large files (videos, namely) as part of slides for presentations.

It turns out that their migration guide could have been made much simpler. I tried to submit those changes to the documentation but couldn't fork the GitLab EE project to make a patch, so I just documented the issue in the original MR for now. While I was there I filed a feature request to add a new reference shortcut (GL-NNN) after noticing a similar token used on GitHub. This would be a useful addition because I often have numbering conflicts between Debian BTS bug numbers and GitLab issues in packages I maintain there. In particular, I have problems using GitLab issue numbers in Monkeysign, because commit logs end up in Debian changelogs and will be detected by the Debian infrastructure even though those are GitLab bug numbers. Using such a shortcut would avoid detection and such a conflict.

Numpy-stats

I wrote a small tool to extract numeric statistics from a given file. I often do ad-hoc benchmarks where I store a bunch of numbers in a file and then try to make averages and so on. As an exercise in learning NumPy, I figured I would write such a simple tool, called numpy-stats, which probably sounds naive to seasoned Python scientists.

My incentive was that I was trying to figure out what was the distribution of password length in a given password generator scheme. So I wrote this simple script:

for i in seq 10000 ; do shuf -n4 /usr/share/dict/words | tr -d '\n' done > length

And then feed that data in the tool:

$ numpy-stats lengths { "max": 60, "mean": 33.883293722913464, "median": 34.0, "min": 14, "size": 143060, "std": 5.101490225062775 }

I am surprised that there isn't such a tool already: hopefully I am wrong and will just be pointed towards the better alternative in the comments here!

Safe Eyes

I added screensaver support to the new SafeEyes project, which I am considering as a replacement to the workrave project I have been using for years. I really like how the interruptions basically block the whole screen: way more effective than only blocking the keyboard, because all potential distractions go away.

One feature that is missing is keystrokes and mouse movement counting and of course an official Debian package, although the latter would be easy to fix because upstream already has an unofficial build. I am thinking of writing my own little tool to count keystrokes, since the overlap between SafeEyes and such a counter isn't absolutely necessary. This is something that workrave does, but there are "idle time" extensions in Xorg that do not need to count keystrokes. There are already certain tools to count input events, but none seem to do what I want (most of them are basically keyloggers). It would be an interesting test to see if it's possible to write something that would work both for Xorg and Wayland at the same time. Unfortunately, preliminary research show that:

  1. in Xorg, the only way to implement this is to sniff all events, ie. to implement a keylogger

  2. in Wayland, this is completely unsupported. it seems some compositors could implement such a counter, but then it means that this is compositor specific, or, in other words, unportable

So there is little hope here, which brings to my mind "painmeter" as an appropriate name for this future programming nightmare.

Ansible

I sent my first contribution to the ansible project with a small documentation fix. I had an eye opener recently when I discovered a GitLab ansible prototype that would manipulate GitLab settings. When I first discovered Ansible, I was frustrated by the YAML/Jinja DSL: it felt silly to write all this code in YAML when you are a Python developer. It was great to see reasonably well-written Python code that would do things and delegate the metadata storage (and only that!) to YAML, as opposed to using YAML as a DSL.

So I figured I would look at the Ansible documentation on how this works, but unfortunately, the Ansible documentation is severly lacking in this area. There are broken links (I only fixed one page) and missing pieces. For example, the developing plugins page doesn't explain how to program a plugin at all.

I was told on IRC that: "documentation around developing plugins is sparse in general. the code is the best documentation that exists (right now)". I didn't get a reply when asking which code in particular could provide good examples either. In comparison, Puppet has excellent documentation on how to create custom types, functions and facts. That is definitely a turn-off for a new contributor, but at least my pull request was merged in and I can only hope that seasoned Ansible contributors expand on this critical piece of documentation eventually.

Misc

As you can see, I'm all over the place, as usual. GitHub tells me I "Opened 13 other pull requests in 11 repositories" (emphasis mine), which I guess means on top of the "9 commits in 5 repositories" mentioned earlier. My profile probably tells a more detailed story that what would be useful to mention here. I should also mention how difficult it is to write those reports: I basically do a combination of looking into my GitHub and GitLab profiles, the last 30 days of emails and filesystem changes (!!). En vrac, a list of changes which may be of interest:

  • font-large (and its alias, font-small): shortcut to send the right escape sequence to rxvt so it changes its font
  • fix-acer: short script to hardcode the modeline (you remember those?!) for my screen which has a broken EDID pin (so autodetection fails, yay Xorg log files...)
  • ikiwiki-pandoc-quickie: fake ikiwiki renderer that (ab)uses pandoc to generate a HTML file with the right stylesheet to preview Markdown as it may look in this blog (the basic template is missing still)
  • git-annex-transfer: a command I've often been missing in git-annex, which is a way to transfer files between remotes without having to copy them locally (upstream feature request)
  • I linked the graphics of the Debian archive software architecture in the Debian wiki in the hope more people notice it.
  • I did some tweaks on my Taffybar to introduce a battery meter and hoping to have temperature sensors, which mostly failed. there's a pending pull request that may bring some sense into this, hopefully.
  • I made two small patches in Monkeysign to fix gpg.conf handling and multiple email output, a dumb bug I cannot believe anyone noticed or reported just yet. Thanks Valerie for the bug report! The upload of this in Debian is pending a review from the release team.
Categories: External Blogs

Linux Journal September 2017

Linux Journal - Fri, 09/01/2017 - 14:40
Soup to Nuts

One of my favorite things about Linux is that it has become not only the platform of choice for many projects, but it also tends to inspire an entire ecosystem of open-sourc more>>

Categories: Linux News

If Not This Then Stringify

Linux Journal - Thu, 08/31/2017 - 10:50

I love IFTTT (If This Then That), but although it usually works well, it's more and more common for triggers to fail. Sometimes they don't fail, but take several minutes to activate. When you want a light to turn on as you enter a room, several minutes of delay clearly can be a deal-breaker. more>>

Categories: Linux News

Ocado Technology's Kubermesh

Linux Journal - Wed, 08/30/2017 - 07:34

Instead of relying on servers concentrated in one large data center, the new Kubermesh is designed to simplify data-center architectures for smart factories by elegantly and cost effectively leveraging a distributed network of computing nodes spread across the enterprise. more>>

Categories: Linux News

Sysadmin 101: Ticketing

Linux Journal - Tue, 08/29/2017 - 09:42

This is the third in a series of articles on system administrator fundamentals where I focus on some lessons I've learned through the years that might be obvious to longtime sysadmins, but news to someone just coming into this position. more>>

Categories: Linux News

BeBop Sensors, Inc.'s Marcel Modular Data Gloves

Linux Journal - Mon, 08/28/2017 - 08:09

In the fabric-embedded sensors space, all controllers need to be accurate and fast. "If latency is more than 6–8 milliseconds, you are out of the band", cautions BeBop Sensors Inc., maker of the new Marcel Modular Data Glove solution for virtual and augmented reality OEMs. more>>

Categories: Linux News

Zed A. Shaw's Learn Python 3 the Hard Way

Linux Journal - Fri, 08/25/2017 - 12:51

Author Zed A. Shaw makes a simple promise in his Hard Way series of books from publisher Addison-Wesley Professional: "It'll be hard at first. more>>

Categories: Linux News

Jmol: Viewing Molecules with Java

Linux Journal - Thu, 08/24/2017 - 08:48

Let's dig back into some chemistry software to see what kind of work you can do on your Linux machine. Specifically, let's look at Jmol, a Java application that is available as both a desktop application and a web-based applet. more>>

Categories: Linux News

The supposed decline of copyleft

Anarcat - Wed, 08/23/2017 - 12:00

At DebConf17, John Sullivan, the executive director of the FSF, gave a talk on the supposed decline of the use of copyleft licenses use free-software projects. In his presentation, Sullivan questioned the notion that permissive licenses, like the BSD or MIT licenses, are gaining ground at the expense of the traditionally dominant copyleft licenses from the FSF. While there does seem to be a rise in the use of permissive licenses, in general, there are several possible explanations for the phenomenon.

When the rumor mill starts

Sullivan gave a recent example of the claim of the decline of copyleft in an article on Opensource.com by Jono Bacon from February 2017 that showed a histogram of license usage between 2010 and 2017 (seen below).

From that, Bacon elaborates possible reasons for the apparent decline of the GPL. The graphic used in the article was actually generated by Stephen O'Grady in a January article, The State Of Open Source Licensing, which said:

In Black Duck's sample, the most popular variant of the GPL – version 2 – is less than half as popular as it was (46% to 19%). Over the same span, the permissive MIT has gone from 8% share to 29%, while its permissive cousin the Apache License 2.0 jumped from 5% to 15%.

Sullivan, however, argued that the methodology used to create both articles was problematic. Neither contains original research: the graphs actually come from the Black Duck Software "KnowledgeBase" data, which was partly created from the old Ohloh web site now known as Open Hub.

To show one problem with the data, Sullivan mentioned two free-software projects, GNU Bash and GNU Emacs, that had been showcased on the front page of Ohloh.net in 2012. On the site, Bash was (and still is) listed as GPLv2+, whereas it changed to GPLv3 in 2011. He also claimed that "Emacs was listed as licensed under GPLv3-only, which is a license Emacs has never had in its history", although I wasn't able to verify that information from the Internet archive. Basically, according to Sullivan, "the two projects featured on the front page of a site that was using [the Black Duck] data set were wrong". This, in turn, seriously brings into question the quality of the data:

I reported this problem and we'll continue to do that but when someone is not sharing the data set that they're using for other people to evaluate it and we see glimpses of it which are incorrect, that should give us a lot of hesitation about accepting any conclusion that comes out of it.

Reproducible observations are necessary to the establishment of solid theories in science. Sullivan didn't try to contact Black Duck to get access to the database, because he assumed (rightly, as it turned out) that he would need to "pay for the data under terms that forbid you to share that information with anybody else". So I wrote Black Duck myself to confirm this information. In an email interview, Patrick Carey from Black Duck confirmed its data set is proprietary. He believes, however, that through a "combination of human and automated techniques", Black Duck is "highly confident at the accuracy and completeness of the data in the KnowledgeBase". He did point out, however, that "the way we track the data may not necessarily be optimal for answering the question on license use trend" as "that would entail examination of new open source projects coming into existence each year and the licenses used by them".

In other words, even according to Black Duck, its database may not be useful to establish the conclusions drawn by those articles. Carey did agree with those conclusions intuitively, however, saying that "there seems to be a shift toward Apache and MIT licenses in new projects, though I don't have data to back that up". He suggested that "an effective way to answer the trend question would be to analyze the new projects on GitHub over the last 5-10 years." Carey also suggested that "GitHub has become so dominant over the recent years that just looking at projects on GitHub would give you a reasonable sampling from which to draw conclusions".

Indeed, GitHub published a report in 2015 that also seems to confirm MIT's popularity (45%), surpassing copyleft licenses (24%). The data is, however, not without its own limitations. For example, in the above graph going back to the inception of GitHub in 2008, we see a rather abnormal spike in 2013, which seems to correlate with the launch of the choosealicense.com site, described by GitHub as "our first pass at making open source licensing on GitHub easier".

In his talk, Sullivan was critical of the initial version of the site which he described as biased toward permissive licenses. Because the GitHub project creation page links to the site, Sullivan explained that the site's bias could have actually influenced GitHub users' license choices. Following a talk from Sullivan at FOSDEM 2016, GitHub addressed the problem later that year by rewording parts of the front page to be more accurate, but that any change in license choice obviously doesn't show in the report produced in 2015 and won't affect choices users have already made. Therefore, there can be reasonable doubts that GitHub's subset of software projects may not actually be that representative of the larger free-software community.

In search of solid evidence

So it seems we are missing good, reproducible results to confirm or dispel these claims. Sullivan explained that it is a difficult problem, if only in the way you select which projects to analyze: the impact of a MIT-licensed personal wiki will obviously be vastly different from, say, a GPL-licensed C compiler or kernel. We may want to distinguish between active and inactive projects. Then there is the problem of code duplication, both across publication platforms (a project may be published on GitHub and SourceForge for example) but also across projects (code may be copy-pasted between projects). We should think about how to evaluate the license of a given project: different files in the same code base regularly have different licenses—often none at all. This is why having a clear, documented and publicly available data set and methodology is critical. Without this, the assumptions made are not clear and it is unreasonable to draw certain conclusions from the results.

It turns out that some researchers did that kind of open research in 2016 in a paper called "The Debsources Dataset: Two Decades of Free and Open Source Software" [PDF] by Matthieu Caneill, Daniel M. Germán, and Stefano Zacchiroli. The Debsources data set is the complete Debian source code that covers a large history of the Debian project and therefore includes thousands of free-software projects of different origins. According to the paper:

The long history of Debian creates a perfect subject to evaluate how FOSS licenses use has evolved over time, and the popularity of licenses currently in use.

Sullivan argued that the Debsources data set is interesting because of its quality: every package in Debian has been reviewed by multiple humans, including the original packager, but also by the FTP masters to ensure that the distribution can legally redistribute the software. The existence of a package in Debian provides a minimal "proof of use": unmaintained packages get removed from Debian on a regular basis and the mere fact that a piece of software gets packaged in Debian means at least some users found it important enough to work on packaging it. Debian packagers make specific efforts to avoid code duplication between packages in order to ease security maintenance. The data set covers a period longer than Black Duck's or GitHub's, as it goes all the way back to the Hamm 2.0 release in 1998. The data and how to reproduce it are freely available under a CC BY-SA 4.0 license.

Sullivan presented the above graph from the research paper that showed the evolution of software license use in the Debian archive. Whereas previous graphs showed statistics in percentages, this one showed actual absolute numbers, where we can't actually distinguish a decline in copyleft licenses. To quote the paper again:

The top license is, once again, GPL-2.0+, followed by: Artistic-1.0/GPL dual-licensing (the licensing choice of Perl and most Perl libraries), GPL-3.0+, and Apache-2.0.

Indeed, looking at the graph, at most do we see a rise of the Apache and MIT licenses and no decline of the GPL per se, although its adoption does seem to slow down in recent years. We should also mention the possibility that Debian's data set has the opposite bias: toward GPL software. The Debian project is culturally quite different from the GitHub community and even the larger free-software ecosystem, naturally, which could explain the disparity in the results. We can only hope a similar analysis can be performed on the much larger Software Heritage data set eventually, which may give more representative results. The paper acknowledges this problem:

Debian is likely representative of enterprise use of FOSS as a base operating system, where stable, long-term and seldomly updated software products are desirable. Conversely Debian is unlikely representative of more dynamic FOSS environments (e.g., modern Web-development with micro libraries) where users, who are usually developers themselves, expect to receive library updates on a daily basis.

The Debsources research also shares methodology limitations with Black Duck: while Debian packages are reviewed before uploading and we can rely on the copyright information provided by Debian maintainers, the research also relies on automated tools (specifically FOSSology) to retrieve license information.

Sullivan also warned against "ascribing reason to numbers": people may have different reasons for choosing a particular license. Developers may choose the MIT license because it has fewer words, for compatibility reasons, or simply because "their lawyers told them to". It may not imply an actual deliberate philosophical or ideological choice.

Finally, he brought up the theory that the rise of non-copyleft licenses isn't necessarily at the detriment of the GPL. He explained that, even if there is an actual decline, it may not be much of a problem if there is an overall growth of free software to the detriment of proprietary software. He reminded the audience that non-copyleft licenses are still free software, according to the FSF and the Debian Free Software Guidelines, so their rise is still a positive outcome. Even if the GPL is a better tool to accomplish the goal of a free-software world, we can all acknowledge that the conversion of proprietary software to more permissive—and certainly simpler—licenses is definitely heading in the right direction.

[I would like to thank the DebConf organizers for providing meals for me during the conference.]

Note: this article first appeared in the Linux Weekly News.

Categories: External Blogs

The supposed decline of copyleft

Anarcat - Wed, 08/23/2017 - 12:00

At DebConf17, John Sullivan, the executive director of the FSF, gave a talk on the supposed decline of the use of copyleft licenses use free-software projects. In his presentation, Sullivan questioned the notion that permissive licenses, like the BSD or MIT licenses, are gaining ground at the expense of the traditionally dominant copyleft licenses from the FSF. While there does seem to be a rise in the use of permissive licenses, in general, there are several possible explanations for the phenomenon.

When the rumor mill starts

Sullivan gave a recent example of the claim of the decline of copyleft in an article on Opensource.com by Jono Bacon from February 2017 that showed a histogram of license usage between 2010 and 2017 (seen below).

From that, Bacon elaborates possible reasons for the apparent decline of the GPL. The graphic used in the article was actually generated by Stephen O'Grady in a January article, The State Of Open Source Licensing, which said:

In Black Duck's sample, the most popular variant of the GPL – version 2 – is less than half as popular as it was (46% to 19%). Over the same span, the permissive MIT has gone from 8% share to 29%, while its permissive cousin the Apache License 2.0 jumped from 5% to 15%.

Sullivan, however, argued that the methodology used to create both articles was problematic. Neither contains original research: the graphs actually come from the Black Duck Software "KnowledgeBase" data, which was partly created from the old Ohloh web site now known as Open Hub.

To show one problem with the data, Sullivan mentioned two free-software projects, GNU Bash and GNU Emacs, that had been showcased on the front page of Ohloh.net in 2012. On the site, Bash was (and still is) listed as GPLv2+, whereas it changed to GPLv3 in 2011. He also claimed that "Emacs was listed as licensed under GPLv3-only, which is a license Emacs has never had in its history", although I wasn't able to verify that information from the Internet archive. Basically, according to Sullivan, "the two projects featured on the front page of a site that was using [the Black Duck] data set were wrong". This, in turn, seriously brings into question the quality of the data:

I reported this problem and we'll continue to do that but when someone is not sharing the data set that they're using for other people to evaluate it and we see glimpses of it which are incorrect, that should give us a lot of hesitation about accepting any conclusion that comes out of it.

Reproducible observations are necessary to the establishment of solid theories in science. Sullivan didn't try to contact Black Duck to get access to the database, because he assumed (rightly, as it turned out) that he would need to "pay for the data under terms that forbid you to share that information with anybody else". So I wrote Black Duck myself to confirm this information. In an email interview, Patrick Carey from Black Duck confirmed its data set is proprietary. He believes, however, that through a "combination of human and automated techniques", Black Duck is "highly confident at the accuracy and completeness of the data in the KnowledgeBase". He did point out, however, that "the way we track the data may not necessarily be optimal for answering the question on license use trend" as "that would entail examination of new open source projects coming into existence each year and the licenses used by them".

In other words, even according to Black Duck, its database may not be useful to establish the conclusions drawn by those articles. Carey did agree with those conclusions intuitively, however, saying that "there seems to be a shift toward Apache and MIT licenses in new projects, though I don't have data to back that up". He suggested that "an effective way to answer the trend question would be to analyze the new projects on GitHub over the last 5-10 years." Carey also suggested that "GitHub has become so dominant over the recent years that just looking at projects on GitHub would give you a reasonable sampling from which to draw conclusions".

Indeed, GitHub published a report in 2015 that also seems to confirm MIT's popularity (45%), surpassing copyleft licenses (24%). The data is, however, not without its own limitations. For example, in the above graph going back to the inception of GitHub in 2008, we see a rather abnormal spike in 2013, which seems to correlate with the launch of the choosealicense.com site, described by GitHub as "our first pass at making open source licensing on GitHub easier".

In his talk, Sullivan was critical of the initial version of the site which he described as biased toward permissive licenses. Because the GitHub project creation page links to the site, Sullivan explained that the site's bias could have actually influenced GitHub users' license choices. Following a talk from Sullivan at FOSDEM 2016, GitHub addressed the problem later that year by rewording parts of the front page to be more accurate, but that any change in license choice obviously doesn't show in the report produced in 2015 and won't affect choices users have already made. Therefore, there can be reasonable doubts that GitHub's subset of software projects may not actually be that representative of the larger free-software community.

In search of solid evidence

So it seems we are missing good, reproducible results to confirm or dispel these claims. Sullivan explained that it is a difficult problem, if only in the way you select which projects to analyze: the impact of a MIT-licensed personal wiki will obviously be vastly different from, say, a GPL-licensed C compiler or kernel. We may want to distinguish between active and inactive projects. Then there is the problem of code duplication, both across publication platforms (a project may be published on GitHub and SourceForge for example) but also across projects (code may be copy-pasted between projects). We should think about how to evaluate the license of a given project: different files in the same code base regularly have different licenses—often none at all. This is why having a clear, documented and publicly available data set and methodology is critical. Without this, the assumptions made are not clear and it is unreasonable to draw certain conclusions from the results.

It turns out that some researchers did that kind of open research in 2016 in a paper called "The Debsources Dataset: Two Decades of Free and Open Source Software" [PDF] by Matthieu Caneill, Daniel M. Germán, and Stefano Zacchiroli. The Debsources data set is the complete Debian source code that covers a large history of the Debian project and therefore includes thousands of free-software projects of different origins. According to the paper:

The long history of Debian creates a perfect subject to evaluate how FOSS licenses use has evolved over time, and the popularity of licenses currently in use.

Sullivan argued that the Debsources data set is interesting because of its quality: every package in Debian has been reviewed by multiple humans, including the original packager, but also by the FTP masters to ensure that the distribution can legally redistribute the software. The existence of a package in Debian provides a minimal "proof of use": unmaintained packages get removed from Debian on a regular basis and the mere fact that a piece of software gets packaged in Debian means at least some users found it important enough to work on packaging it. Debian packagers make specific efforts to avoid code duplication between packages in order to ease security maintenance. The data set covers a period longer than Black Duck's or GitHub's, as it goes all the way back to the Hamm 2.0 release in 1998. The data and how to reproduce it are freely available under a CC BY-SA 4.0 license.

Sullivan presented the above graph from the research paper that showed the evolution of software license use in the Debian archive. Whereas previous graphs showed statistics in percentages, this one showed actual absolute numbers, where we can't actually distinguish a decline in copyleft licenses. To quote the paper again:

The top license is, once again, GPL-2.0+, followed by: Artistic-1.0/GPL dual-licensing (the licensing choice of Perl and most Perl libraries), GPL-3.0+, and Apache-2.0.

Indeed, looking at the graph, at most do we see a rise of the Apache and MIT licenses and no decline of the GPL per se, although its adoption does seem to slow down in recent years. We should also mention the possibility that Debian's data set has the opposite bias: toward GPL software. The Debian project is culturally quite different from the GitHub community and even the larger free-software ecosystem, naturally, which could explain the disparity in the results. We can only hope a similar analysis can be performed on the much larger Software Heritage data set eventually, which may give more representative results. The paper acknowledges this problem:

Debian is likely representative of enterprise use of FOSS as a base operating system, where stable, long-term and seldomly updated software products are desirable. Conversely Debian is unlikely representative of more dynamic FOSS environments (e.g., modern Web-development with micro libraries) where users, who are usually developers themselves, expect to receive library updates on a daily basis.

The Debsources research also shares methodology limitations with Black Duck: while Debian packages are reviewed before uploading and we can rely on the copyright information provided by Debian maintainers, the research also relies on automated tools (specifically FOSSology) to retrieve license information.

Sullivan also warned against "ascribing reason to numbers": people may have different reasons for choosing a particular license. Developers may choose the MIT license because it has fewer words, for compatibility reasons, or simply because "their lawyers told them to". It may not imply an actual deliberate philosophical or ideological choice.

Finally, he brought up the theory that the rise of non-copyleft licenses isn't necessarily at the detriment of the GPL. He explained that, even if there is an actual decline, it may not be much of a problem if there is an overall growth of free software to the detriment of proprietary software. He reminded the audience that non-copyleft licenses are still free software, according to the FSF and the Debian Free Software Guidelines, so their rise is still a positive outcome. Even if the GPL is a better tool to accomplish the goal of a free-software world, we can all acknowledge that the conversion of proprietary software to more permissive—and certainly simpler—licenses is definitely heading in the right direction.

[I would like to thank the DebConf organizers for providing meals for me during the conference.]

Note: this article first appeared in the Linux Weekly News.

Categories: External Blogs
Syndicate content