Friday, September 28, 2012

Good practices - A plea to open source projects

I compile, configure and test the software used on mainstream Linux distributions. I do this on a private distribution (for personal use) and at the moment it consists of slightly more than 1000 (no, this is not a typo) build scripts, which I have written myself. I maintain the init scripts and all package builds in the entire Linux distribution software stack - everything from the kernel, glibc, to the compiler toolchain all the way up to Xorg and KDE. I even wrote a scripted package manager to aid me in my task and automate the process; I haven't switched to it yet but plan to eventually. Needless to say, I know my way around Linux fairly well. Because I package so much of the Linux software stack I also know many of the dirty little secrets we hide from end users behind our crafted packages. Users are often oblivious to how much trouble package maintainers have to go through just to get their favorite application compiled, packaged and ready to be executed.

About 90% of the software projects I encounter are well behaved, well structured and well designed. It's the last 10% that I spend 90% of my time on, trying to get them up and running for whatever reason. That is what this post is all about - saving time and frustration.

My beef with the current state of open source/free software is the departure from traditional UNIX software engineering principles. Software projects today seem to ignore things like the FHS (Filesystem Hierarchy Standard) and the basics of the Unix Philosophy. One rule is: write simple parts connected by clean interfaces. I see this and other principles violated left and right, and blatant violations of the FHS are not uncommon. Read the FHS - trust me, it won't take long. It will help you understand the beauty of the UNIX filesystem layout; there is practically a directory assigned for everything you might think of, and once you understand it you'll appreciate how elegant it is. People with a background in Windows rarely understand or appreciate why the directory layout looks the way it does on UNIX-like operating systems, but once you do I'm sure you'll grow to love the logic behind it. At first glance it looks cryptic, but there's a reason for that - and the benefits are still there today in spite of its long history. It isn't as complex as you might think. For example, if you've written a library then it belongs in /lib or /usr/lib (depending on what kind of library it is: a core OS library belongs in /lib, while a less crucial library belongs in /usr/lib). Does your server program need to write data somewhere? Have a look at /var. Is your application a desktop application that needs to write data somewhere? Do it in a dot directory in the user's $HOME path. Do you have static data that should be available system-wide? Stick it in /usr/share. That's what the FHS is for - it describes the filesystem and what all these directories are meant for.
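The placement rules above can be sketched as install commands. This is a hypothetical layout for an imaginary package called "foo" (every name and path below is invented for illustration), staged under a scratch root so it's safe to run without touching the real filesystem:

```shell
# Stage an FHS-style layout for an invented package "foo" under a scratch root.
root=$(mktemp -d)
prog=$(mktemp)
printf '#!/bin/sh\necho foo\n' > "$prog"

install -D -m755 "$prog" "$root/usr/bin/foo"   # user-facing command
install -d "$root/lib"              # core OS libraries would go here
install -d "$root/usr/lib"          # less crucial shared libraries
install -d "$root/usr/share/foo"    # static, system-wide shareable data
install -d "$root/etc"              # host-specific configuration
install -d "$root/var/lib/foo"      # state the program writes at runtime

find "$root" -mindepth 1 | sort     # show the resulting tree
```

Drop the scratch `$root` prefix and you have exactly the layout the FHS describes.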

There are brilliant programmers out there who write excellent programs, but some of them haven't taken the time to understand these basics, and thus their programs, brilliant though they might be, behave in undesirable ways. It isn't exactly fun to dig through someone's code and patch it to conform to the FHS. To the people who package software, this makes your application high-maintenance. A piece of advice to software engineers - take a long hard look at the build scripts and .spec files used by distributions to build and package your program. If they're riddled with hacks just to get it packaged, then you should probably consider simplifying and correcting your build framework.

If you're using the GNU autotools framework to build your software, make sure it respects all the standard configure switches and, more importantly, make really sure that DESTDIR works as it's supposed to. You'd be surprised how many software projects neglect this particular feature, but it's essential for packaging the program. For aesthetic (and some practical) reasons, avoid CamelCase in program names and don't use version numbers in paths - and if you must, then opt for /usr/include/app/$VERSION rather than /usr/include/app-$VERSION. We have pkg-config for a reason: it's handy for locating package-specific directories, and it simplifies managing multiple versions of a program or library on the same system - no need to clutter the top-level /usr/include directory.
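To make the DESTDIR point concrete, here is a minimal sketch of what a packager actually does with it. The package "foo" and its one-rule Makefile are invented; the important part is that the install rule prefixes every destination path with $(DESTDIR), so the build lands in a staging directory instead of the live filesystem:

```shell
# A DESTDIR-aware install rule and a staged install (invented package "foo").
src=$(mktemp -d)
stage=$(mktemp -d)
printf '#!/bin/sh\necho foo\n' > "$src/foo"
# The Makefile's install rule prefixes the target path with $(DESTDIR):
printf 'PREFIX ?= /usr\ninstall:\n\tinstall -D -m755 foo $(DESTDIR)$(PREFIX)/bin/foo\n' > "$src/Makefile"

# The packager stages into $stage rather than writing to / directly:
make -C "$src" DESTDIR="$stage" install
find "$stage" -type f    # everything ended up under the staging root
```

From that staging root the package manager can archive the files, record them in its database, and install the result atomically - none of which works if the install rule hardcodes absolute paths.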

Use relative rather than absolute symlinks for "make install". You don't want symlinks pointing to the wrong files, which is what will probably happen if your application resides on a mounted filesystem - and if it doesn't, the symlink will simply be broken. Relative symlinks also make chroot jails easier, although I wouldn't expect anyone to design their application with that in mind. In the end it's just cleaner and better to always use relative symlinks.

Do NOT change the API unless you also change the major version number. Be patient, and if you desperately want to make changes to your API, wait until you are ready to release the next major version. Why? Because that's what we (downstream) are expecting - API changes come with major version changes. Let me clarify why. Let's assume your library is widely used by other projects, but you decide to push a major API change in a point release. Oblivious to this change, downstream updates to the new point release thinking it's API-compatible with the previous point release you made. It takes time before we realize a dozen other applications no longer compile due to the change, and now we have to downgrade and recompile once more. I'd love to tell you that we always read ChangeLogs, but sadly some of us just don't have the time.
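This expectation is often encoded directly in downstream build systems, typically as a pkg-config version constraint that accepts any point release within a major version. A sketch, using an invented library "libfoo" with a fake .pc file so it runs anywhere:

```shell
# Downstream's assumption as a pkg-config constraint: any 2.x release of
# the invented "libfoo" is treated as API-compatible.
pcdir=$(mktemp -d)
cat > "$pcdir/libfoo.pc" <<'EOF'
Name: libfoo
Description: invented example library
Version: 2.3.1
Libs: -lfoo
Cflags:
EOF

# A 2.3.1 point release satisfies the ">= 2.0, < 3.0" window:
PKG_CONFIG_PATH=$pcdir pkg-config --exists 'libfoo >= 2.0 libfoo < 3.0' \
  && echo "2.3.1 accepted - point releases within 2.x are assumed compatible"
```

If you break the API in 2.3.1 anyway, every project carrying a constraint like this will happily pick it up and then fail to compile.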

Do NOT depend on unstable, in-development libraries and core components. This causes endless misery due to constant breakage. Wait until the API and design have stabilized, and please, please don't make dependencies on svn/git/cvs versions. Unless stable tarballs have been released, don't hook your code into its API. Gnome has always been a real PITA because of this, and I've occasionally had to pull code from CVS just to get "just the right version" to make things work. I get it - you want the latest and greatest features - but please, do not torment downstream with dependencies on unreleased code.

Digitally sign your code, ALWAYS. Make this a habit. While most people who download your code probably won't verify the signature you'll be glad you signed it if your code distribution site is hacked. It happens, and when it does it's one hell of a task to figure out what was tampered with.
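For maintainers who haven't made signing a habit yet, the whole workflow is two commands plus one for downstream. The sketch below uses a throwaway keyring and an invented maintainer identity so it can run anywhere without touching your real keys (a real release would of course use your long-lived key):

```shell
# Sign a release tarball and verify it, using a throwaway GnuPG keyring
# and an invented maintainer identity.
cd "$(mktemp -d)"
export GNUPGHOME=$(mktemp -d)
chmod 700 "$GNUPGHOME"
gpg --batch --pinentry-mode loopback --passphrase '' \
    --quick-generate-key 'Example Maintainer <maint@example.org>'

echo "pretend this is a release tarball" > foo-1.0.tar.gz

# The maintainer signs the release, producing foo-1.0.tar.gz.asc:
gpg --batch --armor --detach-sign foo-1.0.tar.gz

# What downstream should run before building your release:
gpg --verify foo-1.0.tar.gz.asc foo-1.0.tar.gz
```

The detached .asc file is published next to the tarball; if the distribution site is ever compromised, anyone holding your public key can tell the genuine release from a tampered one.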

On a more personal note, I want to encourage projects to stick to a single compression format for releases rather than making three or four identical tarballs compressed with different algorithms that need to be signed independently. My suggestion? Stick to gzip. It decompresses faster than any other algorithm, and while it's very inefficient compared to LZMA2 (.xz), you'll only need to make the one tarball and just about everyone will be able to decompress it. A couple of years from now xz might be widespread enough to justify the switch.

ALWAYS include a license, preferably as a file named COPYING or LICENSE in the top-level directory. A tarball without one is like getting a wrapped present that makes a suspicious ticking noise. Sure, it could just be a harmless clock, but it could also be a bomb. By including a license you make your intentions clear, so we don't have to worry about getting sued for packaging and distributing your program. Use a common and widely used license - legalese isn't always easy to understand, so by using a common (and scrutinized) license we can be sure there aren't any hidden surprises behind the wording.

Describe your software and include an official URL for the project. As strange as this might seem, a lot of projects neglect these simple details. A short paragraph in a README file is more than enough, and it allows us to figure out what your program is for, because it isn't always obvious - it's hard to identify the purpose of a program or library from a name like libsnx. By including an official project URL you'll also save us a lot of time searching for the upstream distribution website when it's time to update, and we can be sure we've got the right project - there are often a lot of alternative programs that perform a given task.

Modularity is great. It's encouraged by standard UNIX practices, but as a project grows and becomes increasingly complex, modularity tends to show its limitations. It isn't always practical to keep 100 modules in sync at the code level, especially if you start making API changes.

Off the top of my head, that's about it for now.

Excellent Games

I'm not what you would call a hardcore gamer, but I do play and I strongly favor the console - mostly because it's more comfortable to lie in bed with the controller rather than sitting in a chair in front of the computer. Also, console games are more streamlined, with less tweaking needed to get the best experience; specifically, the game controls are far more consistent on consoles than on the PC. To be sure, there are a few things you lose out on, such as better graphics and game mods, but that's a sacrifice I gladly make. This is my list of games that I highly recommend for those who favor story-driven titles. I'm a Playstation 3 owner so unfortunately I'll only be listing games available for that console, but most of these are also available for the Xbox 360.

Mass Effect


The Mass Effect games are among the best RPGs ever made. These are without question the cream of the crop among modern console titles. They're extremely polished and beautiful, have an engaging story and have great replay value. (Update: The first Mass Effect was recently released for the PS3 as a PSN download, making the whole series available on the Playstation 3 console. Previously the first Mass Effect game was an Xbox 360 exclusive title.)

Fallout


Fallout 3 and Fallout: New Vegas are great open-world games. The only real complaint I have about these games is how unstable they are. Fallout: New Vegas made vast improvements in stability, but Fallout 3 was horrendously unstable and buggy. That being said, these games are so good that I managed to overlook these faults. You can spend hours upon hours playing these titles.

Metal Gear


While I'm reluctant to include previous-gen console titles in this list, the fairly recent release of the Metal Gear Solid HD Collection makes the entire game series available on the PS3, including the first PSOne (PSX) Metal Gear Solid game as a PSN download. The graphics of the first game obviously don't live up to the standard of current titles, not by a long shot, but the story alone makes it one of the best I've ever played. The HD Collection also makes the Playstation 2 games available in HD; while still not as visually impressive as Metal Gear Solid 4, the story alone makes them excellent value. These games are so good they rival Fallout and Mass Effect without breaking a sweat. I have my doubts about the upcoming Metal Gear Solid Rising because Solid Snake has been replaced by Raiden as the protagonist, but I hope it'll be a good game.

Skyrim


Skyrim is simply an awesome game. Much like the Fallout titles it's an open-world game that allows you to go anywhere you feel like going and do the missions you feel like doing. This game is all about exploring and adventure, but unlike Fallout the world has a roughly 12th-century setting with swordplay, magic, dragons and a GREAT story. I don't recommend Oblivion though, because the leveling system is, in my opinion, fundamentally broken. I didn't even finish the game because of this. A lot of people would disagree, but I would ask these people to consider all the hoops you have to jump through to get a properly leveled character. It sort of ruins the experience when you have to stand and take a beating from a mudcrab for an hour just to level up your blocking skill. Oblivion is still a good game, with plenty of exploration and adventure, but it doesn't hold a candle to Skyrim.

Dead Space


I really enjoy survival horror games and I'm a big fan of the Resident Evil series. As such I am morally obligated to recommend the Dead Space trilogy. When you play these games you turn off the lights, sit back and just enjoy the horror. These are seriously creepy games with a great story line.