Alert readers will know that I'm working on a major revision to my popular Wicked Cool Shell Scripts book to come out later this year. Although most of the scripts in this now ten-year-old book still are current and valuable, a few definitely are obsolete or have been supplanted by new technology or utilities. No worries: that's why I'm doing the update.
Like most families these days, our family is extremely busy. We have four boys who have activities and appointments. My wife and I both have our own businesses as well as outside activities. For years, we've been using eGroupware to help coordinate our schedules and manage contacts. The eGroupware system has served us well for a long time. However, it is starting to show its age.
- backup removal
and seemed to magically appear out of nowhere and basically do everything i need, with an inline manual on top of it.
disclaimer
Note: this is not a real benchmark! i would probably need to port bup and attic to liw's seivot software to report on this properly (and that would be amazing and really interesting, but it's late now). even worse, this was done on a production server with other stuff going on so take results with a grain of salt.
procedure and results
Here's what I did. I set up backups of my ridiculously huge ~/src directory on the external hard drive where I usually make my backups. I ran a clean backup with attic, then redid it, then I ran a similar backup with bup, then redid it. Here are the results:
    anarcat@marcos:~$ sudo apt-get install attic # this installed 0.13 on debian jessie amd64
    [...]
    anarcat@marcos:~$ attic init /mnt/attic-test:
    Initializing repository at "/media/anarcat/calyx/attic-test"
    Encryption NOT enabled.
    Use the "--encryption=passphrase|keyfile" to enable encryption.
    anarcat@marcos:~$ time attic create --stats /mnt/attic-test::src ~/src/
    Initializing cache...
    ------------------------------------------------------------------------------
    Archive name: src
    Archive fingerprint: 7bdcea8a101dc233d7c122e3f69e67e5b03dbb62596d0b70f5b0759d446d9ed0
    Start time: Tue Nov 18 00:42:52 2014
    End time: Tue Nov 18 00:54:00 2014
    Duration: 11 minutes 8.26 seconds
    Number of files: 283910
                    Original size    Compressed size    Deduplicated size
    This archive:         6.74 GB            4.27 GB              2.99 GB
    All archives:         6.74 GB            4.27 GB              2.99 GB
    ------------------------------------------------------------------------------
    311.60user 68.28system 11:08.49elapsed 56%CPU (0avgtext+0avgdata 122824maxresident)k
    15279400inputs+6788816outputs (0major+3258848minor)pagefaults 0swaps
    anarcat@marcos:~$ time attic create --stats /mnt/attic-test::src-2014-11-18 ~/src/
    ------------------------------------------------------------------------------
    Archive name: src-2014-11-18
    Archive fingerprint: be840f1a49b1deb76aea1cb667d812511943cfb7fee67f0dddc57368bd61c4bf
    Start time: Tue Nov 18 00:05:57 2014
    End time: Tue Nov 18 00:06:35 2014
    Duration: 38.15 seconds
    Number of files: 283910
                    Original size    Compressed size    Deduplicated size
    This archive:         6.74 GB            4.27 GB            116.63 kB
    All archives:        13.47 GB            8.54 GB              3.00 GB
    ------------------------------------------------------------------------------
    30.60user 4.66system 0:38.38elapsed 91%CPU (0avgtext+0avgdata 104688maxresident)k
    18264inputs+258696outputs (0major+36892minor)pagefaults 0swaps
    anarcat@marcos:~$ sudo apt-get install bup # this installed bup 0.25
    anarcat@marcos:~$ free && sync && echo 3 | sudo tee /proc/sys/vm/drop_caches && free # flush caches
    anarcat@marcos:~$ export BUP_DIR=/mnt/bup-test
    anarcat@marcos:~$ bup init
    Dépôt Git vide initialisé dans /mnt/bup-test/
    anarcat@marcos:~$ time bup index ~/src
    Indexing: 345249, done.
    56.57user 14.37system 1:45.29elapsed 67%CPU (0avgtext+0avgdata 85236maxresident)k
    699920inputs+104624outputs (4major+25970minor)pagefaults 0swaps
    anarcat@marcos:~$ time bup save -n src ~/src
    Reading index: 345249, done.
    bloom: creating from 1 file (200000 objects).
    bloom: adding 1 file (200000 objects).
    bloom: creating from 3 files (600000 objects).
    Saving: 100.00% (6749592/6749592k, 345249/345249 files), done.
    bloom: adding 1 file (126005 objects).
    383.08user 61.37system 10:52.68elapsed 68%CPU (0avgtext+0avgdata 194256maxresident)k
    14638104inputs+5944384outputs (50major+299868minor)pagefaults 0swaps
    anarcat@marcos:attic$ time bup index ~/src
    Indexing: 345249, done.
    56.13user 13.08system 1:38.65elapsed 70%CPU (0avgtext+0avgdata 133848maxresident)k
    806144inputs+104824outputs (137major+38463minor)pagefaults 0swaps
    anarcat@marcos:attic$ time bup save -n src2 ~/src
    Reading index: 1, done.
    Saving: 100.00% (0/0k, 1/1 files), done.
    bloom: adding 1 file (1 object).
    0.22user 0.05system 0:00.66elapsed 42%CPU (0avgtext+0avgdata 17088maxresident)k
    10088inputs+88outputs (39major+15194minor)pagefaults 0swaps
Disk usage is comparable:
    anarcat@marcos:attic$ du -sc /mnt/*attic*
    2943532K /mnt/attic-test
    2969544K /mnt/bup-test
People are encouraged to try and reproduce those results, which should be fairly trivial.
Observations
Here are interesting things I noted while working with both tools:
- attic is Python3: i could compile it, with dependencies, by doing apt-get build-dep attic and running setup.py - i could also install it with pip if i needed to (but i didn't)
- bup is Python 2, and has a scary makefile
- both have an init command that basically does almost nothing and takes little enough time that i'm ignoring it in the benchmarks
- attic backups are a single command, bup requires me to know that i first want to index and then save, which is a little confusing
- bup has nice progress information, especially during save (because when it loaded the index, it knew how much was remaining) - just because of that, bup "feels" faster
- bup, however, lets me know about its deep internals (like now i know it uses a bloom filter) which is probably barely understandable by most people
- on the contrary, attic gives me useful information about the size of my backups, including the size of the current increment
- it is not possible to get that information from bup, even after the fact - you need to du before and after the backup
- attic modifies the files' access times when backing up, while bup is more careful (there's a pull request to fix this in attic, which is how i found out about this)
- both backup systems seem to produce roughly the same data size from the same input
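The point above about needing to run du before and after a bup backup can be sketched in Python. This is only an illustration, not something either tool provides: `dir_size` and `measure_growth` are hypothetical helper names, and `run_backup` stands in for whatever callable performs the backup. It sums apparent file sizes, so the numbers will not match `du` exactly.

```python
import os


def dir_size(path):
    """Return the total size in bytes of the regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so a link's target is not counted twice.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def measure_growth(repo_path, run_backup):
    """Run a backup callable and return how many bytes the repository grew by."""
    before = dir_size(repo_path)
    run_backup()
    after = dir_size(repo_path)
    return after - before
```

With the setup from the transcript, `repo_path` would be the directory named in `BUP_DIR`, and `run_backup` could wrap a `subprocess.run` call to `bup save`.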
attic and bup are about equally fast. bup took about 16 seconds less than attic to save the files, but that's not counting the 1m45s it took indexing them, so on the total run time, bup was actually slower. attic is also more than two times faster on the second run. but this could be within the margin of error of this very quick experiment, so my provisional verdict for now would be that they are about equally fast.
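To make that comparison concrete, the elapsed times printed by `time` in the transcript can be added up. The sketch below is plain arithmetic over those reported figures; `to_seconds` is a hypothetical helper name introduced for illustration.

```python
def to_seconds(elapsed):
    """Convert an elapsed time such as '11:08.49' (minutes:seconds) to seconds."""
    minutes, seconds = elapsed.split(":")
    return int(minutes) * 60 + float(seconds)


# Elapsed wall-clock times copied from the `time` output in the transcript.
attic_first = to_seconds("11:08.49")
attic_second = to_seconds("0:38.38")
bup_first = to_seconds("1:45.29") + to_seconds("10:52.68")   # index + save
bup_second = to_seconds("1:38.65") + to_seconds("0:00.66")   # index + save

print(f"first run:  attic {attic_first:.2f}s, bup {bup_first:.2f}s")
print(f"second run: attic {attic_second:.2f}s, bup {bup_second:.2f}s")
```

On these figures, bup's first run is longer than attic's once indexing is included, which is the point made above. A single run on a busy server is still a very small sample.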
bup may be more robust (for example it doesn't modify the atimes), but this has not been extensively tested and is based more on my familiarity with the "conservatism" of the bup team than on actual tests.
considering all the features promised by attic, it makes for a really serious contender to the already amazing bup.
Next steps
To properly do this, we would need to:
- include other software (thinking of Zbackup, Burp, ddar, obnam, rdiff-backup and duplicity)
- bench attic with the noatime patch
- bench dev attic vs dev bup
- bench data removal
- bench encryption
- test data recovery
- run multiple backup runs, on different datasets, on a cleaner environment
- ideally, extend seivot to do all of that
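The "multiple backup runs" item could start from a small timing harness like the one below. It is a sketch under assumptions: `time_command` and `summarize` are hypothetical names, it measures wall-clock time only, and it does not flush caches between runs the way the transcript above does.

```python
import statistics
import subprocess
import time


def time_command(argv, runs=3):
    """Run argv `runs` times and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean, standard deviation) for a list of durations."""
    mean = statistics.mean(durations)
    # stdev needs at least two samples; report 0.0 for a single run.
    spread = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return mean, spread
```

For example, `summarize(time_command(["bup", "index", "/path/to/src"]))` would give a mean and spread for the indexing step, with the path being a placeholder. Reporting the spread makes it easier to tell whether a difference between two tools is larger than the run-to-run noise.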
Note that the Burp author already did an impressive comparative benchmark of a bunch of those tools for the burp2 design paper, but it unfortunately doesn't include attic or clear ways to reproduce the results.
For new Linux users, the command line is arguably the most intimidating thing. For crusty veterans like me, green text on a black background is as cozy as fuzzy slippers by a fireplace, but I still see CLI Companion as a pretty cool application.
For the 50th edition of our monthly meetups, we wanted to offer you an amazing evening with an amazing lineup. That's why, this time, we have the chance to be welcomed at Notman House by our friends from Real Ventures.
We would like to thank all the sponsors and Datacratic for supporting us for this special evening.
In the meantime, don't forget to join us on Tuesday for our Python Night XIII: http://montrealpython.org/en/2014/11/python-night-xiii/
And also don't forget, if you have anything you would like to present or announce, just show up and tell us; we are always happy to give anyone a chance!
Flash Presentations (5-10 minutes each):
George Peristerakis: How OpenStack automated the software development process.
The OpenStack system is a collection of open source services that help you set up a cloud infrastructure. It is also one of the largest and most active code bases. I will talk about how OpenStack automated most of its development process.
David Taylor: Top 10 Python idioms it took me way too long to figure out
Since I adopted Python after learning programming in Visual Basic, I gradually discovered features that, had I known them before, would have saved me from having to reinvent the wheel, like collections.Counter, dict.get(), enumerate(), etc.
Main Presentations:
Nicolas Kruchten: Data Science and Machine Learning with PyData
Nicolas will walk through a demonstration of ways to explore, visualize and extract insights from data using PyData tools like IPython Notebook, numpy, pandas and scikit-learn.
Stéphane Guidoin: CKAN, an open data platform in Python: strengths and challenges
Launched by the Open Knowledge Foundation and developed in Python, CKAN is intended to serve as an open data portal, particularly for governments. CKAN has achieved a dominant position, especially among national governments. Despite this position, development of the tool is not as fast as many would like, and platforms such as Socrata remain very present, particularly among municipalities, despite their significantly higher price. The presentation will cover CKAN's technology architecture, what allowed it to become a dominant solution, and also the growth challenges this project faces.
Cameron Davidson-Pilon: What is PySpark and When Should I Use it?
Spark is being called the next generation of Hadoop: it's faster, more accessible and has a large community behind it. PySpark is a Python interface to Spark. In this talk, we'll discuss what MapReduce is, PySpark's API, and when to use PySpark versus another tool.
When:
Monday, December 1st, 2014
Where:
Notman House, 51 Rue Sherbrooke West, Montréal, Québec H2X 1X2 https://goo.gl/maps/iga3r
How:
It's free, just join us!
Schedule:
- 6:00pm — Doors open
- 6:30pm — Presentations start
- 7:30pm — Break
- 7:45pm — Second round of presentations
- 9:00pm — One free beer offered at Bénélux just across the street
- Real Ventures
- Savoir-Faire Linux
I really stink at video games. I write about gaming occasionally, but the truth of the matter is, I'm just not very good. If we play Quake, you'll frag me just about as often as I respawn. I don't have great reflexes, and my coordination is horrible.
OpenGL is a well-known standard for generating 3-D as well as 2-D graphics that is extremely powerful and has many capabilities. OpenGL is defined and released by the OpenGL Architecture Review Board (ARB).
This article is a gentle introduction to OpenGL that will help you understand drawing using OpenGL.
Welcome to Python Night XIII (hosted by Caravan)
After a long absence, Python Night is back, and we are inviting everyone to join us for a night of coding, laughs and an all-out good time.
What? You don't already know what a Python Night is?
Well, it's a friendly evening where we code, chat about projects (not too loud so people won't code bugs :) ) and help each other. You come, with or without your project, and we'll try to do something with it.
Already announced projects:
- Kaggle Competitions: https://www.kaggle.com/competitions
- Agenda du libre: https://github.com/mlhamel/agendadulibre
- Introduction to Python
- [Add your own project here]
WHEN November 18, 2014 @6pm
WHERE Caravan, 5334 de Gaspé, office #1204 (Montreal)
HOW Just grab your free ticket on: http://python-project-night-13.eventbrite.ca
Bring your computer and your smile, we'll provide beer and pizza!
Hardware errors are tough to code for. In some cases, they're impossible to code for. A particular brand of hardware error is the Machine-Check Exception (MCE), which means a CPU has a problem. On Windows systems, it's one of the causes of the Blue Screen of Death.
On Monday, November 17th, the NAD Centre will partner up with our friends at Savoir-Faire Linux to organize a "Python and VFX" meetup. Python is ubiquitous in the world of special effects: from Maya to Softimage and 3D Studio Max, not to mention Blender of course.
The presentations will be held at NAD Centre, 405 rue Ogilvy, 3rd floor, at 6PM.
As you probably know, we've already had an event about VFX and Python during MP31: http://montrealpython.org/fr/2012/09/mp31/.
And this one promises to be very interesting if you are using any 3D software or if you would like to know more about real-world case studies of useful ways to use Python.
Program for the evening:
- 6:00PM – Welcome
- 6:30PM – Eric Thivierge, Rigging R&D at Hybride Technologie “Python: Bridging Technologies”
- 7:00PM – Dave Lajoie, R&D Director at Digital District “Python in a vfx/animation pipeline”
- 7:45PM – Jordi Riera, Python Software Developer at Savoir-Faire Linux “How to train your Python or how to improve python codes”
- 8:15PM – Networking
- 9:00PM – Wrap up
Don't forget to register your place at: http://elite.nad.ca/product/meetup-python-2/
See you there!
I don't listen to music very often, but when I do, my tastes tend to be across the board. That's one of the reasons I really like Pandora, because the music selection is incredible (in fact, I can't recommend the Pithos client for Pandora enough; I've written about it in past issues). Unfortunately, with Pandora, you don't get to pick specific songs.
Obsession with Big Data has gotten out of hand. Here's how.
For the last 7 years, Montreal-Python has been and still is the most awesome Python community in Montreal, if not the world. We are proud to celebrate our 50th monthly meetup in a couple of weeks.
For this momentous occasion, we are looking for the best speakers in town for a 5-, 10-, 20-, 30- or 45-minute presentation. If you are cooking up a crazy idea powered by Python or if you have been working on something awesome, this is the perfect opportunity to present it to all of Montreal!
When
Monday, December 1st, 2014 at 6:30pm
Where
To be defined
How
Just send us an email at firstname.lastname@example.org with the title of your talk, the length, and a short one-sentence description.
Every time I write a Bash script or schedule a cron job, I worry about the day I'll star in my very own IT version of a Folger's commercial. Instead of "secretly replacing coffee with Folger's Instant Crystals", however, I worry I'll be replaced by an automation framework and a few crafty FOR loops.
It's that time of the year again: the organisation team for the Montréal-Python community is going to proceed to the election of its board. People elected to the board are the legal representatives of the association, but they're also people who contribute their energy and ideas to drive the community.
Anyone who is subscribed to the mailing list can apply to one of these positions:
Please note that other positions could be added if needed.
We'd like to stress that it is not necessary to be a member of the board (or even a member of the permanent organisation team) to organise events under the banner of Montréal-Python; the board is the official organ, but it is you, the members of the community, who do great things with Python in Montreal.
When:
Monday, November 3rd, 8pm
Where:
Ajah offices, 1124 Marie-Anne Est https://goo.gl/maps/xN1ck
How:
Show of hands
Come see us on Monday! :)
UPDATE The new board has been elected. Please welcome the new Montréal-Python board for 2014-2015:
- President: Mathieu Leduc-Hamel
- Vice-president: Jean-Philippe Caissy
- Secretary: George Peristerakis
- Treasurer: Rory Geoghegan
Rock on!
I've been researching OpenStack deployment methods lately, so when I got an email from Canonical inviting me to check out how they deploy OpenStack using their Metal as a Service (MaaS) software on their fantastic Orange Box demo platform, I jumped at the opportunity.
It's Halloween week, and the big names in Linux are determined not to disappoint the trick-or-treaters. No fewer than three mainline distributions have released new versions this week, led by perennially-loved-and-hated crowd favourite Ubuntu.
Let's start with some homework. Go to Google (or Bing) and search for "privacy is dead, get over it". I first heard this from Bill Joy, cofounder of Sun Microsystems, but it's attributed to a number of tech folk, and there's an element of truth to it. Put something on-line and it's in the wild, however much you'd prefer to keep it under control.