Sunday, December 11, 2016

What Git got Right and Wrong

Having used various vcs solutions for a while, I think it is worth noting that my least favorite from a user experience perspective is git.  To be sure, git handles whitespace-only merge conflicts better than any other vcs I have used, and it has a few other major features the others lack.

And the data structure and model are solid.

But even understanding what is going on behind the scenes, I find git to be an unintuitive mess at the CLI level.

Let me start by saying things that git got right:


  1. The DAG structure is nice
  2. Recursive merges are good
  3. The data model and what goes on behind the scenes are solid.
  4. It is the most full-featured vcs I have worked with
However, I still have real complaints with the software.  These include fundamentally different concepts merged under the same label, and the fact that commands may do many different things depending on how you call them.  Because the concepts themselves are unclear, this is worse than a learning curve issue: one cannot have a good grasp of what git is doing behind the scenes, because this is not always clear.  In particular:

  1. In what world is a fast-forward a kind of merge?
  2. Is there any command you can explain (besides clone) in three sentences or less?
  3. What does git checkout do?  Why does it depend on what you check out?  Can you expect an intermediate user to understand what to expect if you have staged changes on a file, when you try to check out a copy of that file from another revision?  (See the example after this list.)
  4. What does git reset do from a user perspective?  Is there any way a beginner can understand that from reading the documentation?
  5. In other words, git commands try to do too much at once, and this is often very confusing.
  6. Submodules are an afterthought and not well supported across the entire tool chain (why does git-archive not have an option to recurse into submodules?)
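
To make the checkout complaint concrete, consider two invocations (the branch and file names are made up for illustration):

git checkout feature-branch        # move HEAD to another branch and update the working tree
git checkout HEAD~3 -- src/app.c   # overwrite one file with its contents from an older revision

The first changes what you are working on; the second quietly rewrites a file in your working copy.  Two fundamentally different operations share one name.
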
These specific complaints come from what I see as a lack of clarity regarding what the concepts mean, and they indicate that those who wrote the tools did so at a time when the concepts were still not entirely clear in their own minds.  In essence, it is not that things are named in unclear ways but that the concepts have unclear boundaries.

Some of this could be fixed in git.  A fast-forward could be referred to as a shortcut that avoids a merge rather than as a kind of merge.  Some of it could be fixed with better documentation (we could describe git reset by its user-facing changes rather than its internal changes).  Some of it would require a very different command layout.

Thursday, August 18, 2016

PostgreSQL vs Hadoop

So one of the clients I work with is moving a large database from PostgreSQL to Hadoop.  The reasons are sound: volume and velocity are major issues for them, and while PostgreSQL is not going away in their data center, their industry has a lot more Hadoop usage and tooling for life science analytics than PostgreSQL tooling (Hadoop is likely to replace both PostgreSQL and, hopefully, a massive amount of data on NFS).  This has provided an opportunity to think about big data problems, their solutions, and the implications.  At the same time, I have seen as many people moving from Hadoop to PostgreSQL as the other way around.  And no, LedgerSMB will likely never use Hadoop as a backend.  It is definitely not the right solution to any of our problems.

Big data problems tend to fall into three categories, namely managing ever increasing volume of data, managing increasing velocity of data, and dealing with greater variety of data structure.  It's worth noting that these are categories of problems, not specific problems themselves, and the problems within the categories are sufficiently varied that there is no solution for everyone.  Moreover these solutions are hardly without their own significant costs.  All too often I have seen programs like Hadoop pushed as a general solution without attention to these costs and the result is usually something that is overly complex and hard to maintain, may be slow, and doesn't work very well.

So the first point worth noting is that big data solutions are specialist solutions, while relational database solutions for OLTP and analytics are generalist solutions.  Usually those who are smart start with the generalist solutions and move to the specialist solutions unless they know out of the box that the specialist solutions address a specific problem they know they have.  No, Hadoop does not make a great general ETL platform.....

One of the key things to note is that Hadoop is built to solve all three problems simultaneously.  This means that you effectively buy into a lot of other costs if you are trying to solve only one of the V problems with it.

The single largest cost comes from the solutions to the variety of data issues.  PostgreSQL and other relational data solutions provide very good guarantees on the data because they enforce a lack of variety: you force a schema on write, and if it is violated, you throw an error.  Hadoop enforces a schema on read, so you can store data, then try to read it, and get a lot of null answers back because the data didn't fit your expectations.  Ouch.  On the other hand, that flexibility is very helpful when trying to make sense of a lot of non-structured data.
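
Here is a minimal sketch of the schema-on-write contrast in SQL (table and column names are invented for illustration):

CREATE TABLE readings (
    sample_id int     NOT NULL,
    reading   numeric NOT NULL
);

INSERT INTO readings VALUES (1, NULL);  -- rejected immediately: NOT NULL violation

A schema-on-read system would accept the write and hand back the NULL (or a parse failure) at query time instead.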

Now, solutions to check out first if you are faced with volume and velocity problems include Postgres-XL and similar shard/clustering solutions, but these really require good data partitioning criteria.  If your data set is highly interrelated, they may not be a good fit because cross-node joins are expensive.  Nor would you use them for smallish data sets, certainly not under a TB, since the complexity cost of these solutions is not lightly undertaken.

Premature optimization is the root of all evil, and big data solutions have their place.  However, don't use them just because they are cool, new, or resume-building.  They are specialist tools, and overuse creates more problems than underuse.

Sunday, August 14, 2016

Forthcoming new scalable job queue extension

So, for those of you who don't know, I now spend most of my time doing more general PostgreSQL consulting and still a fair bit of time on LedgerSMB.  One of my major projects lately has been a large scientific computing platform currently run on PostgreSQL but, due to the volume and velocity of its data, being moved to Hadoop (the client maintains other fairly large PostgreSQL instances with no intention of moving them, btw).

With this client's permission, I have decided to take a lot of the work I have done in optimizing their job queue system and create a PostgreSQL extension from it.  The job queue currently runs tens of millions of jobs per day (meaning twice that number of write queries, and a fair number of read queries too) and is one of the most heavily optimized parts of the system, so this will be based on a large number of lessons learned about what is a surprisingly hard problem.

It is worth contrasting this with pg_message_queue, of which I am also the author.  pg_message_queue is intended as a light-weight, easy-to-use message queue extension that one can plug into other programs to solve common problems where notification and message transfer are the main concerns.  This project will be an industrial-scale job queuing system aimed at massive concurrency.  As a result, simplicity and ease of use take second place to raw power and performance under load.  In other words, here I am not afraid to assume that the dba and programming teams know what they are doing and have the expertise to read the manual and implement appropriately.

The first version (1.x) will support all supported versions of PostgreSQL and make the following guarantees:


  1. Massively parallel, non-blocking performance (we currently use it with 600+ connections to PostgreSQL by worker processes).
  2. Partitioning, coalescing, and cancelling of jobs, similar in some ways to TheSchwartz
  3. Exponential pushback based on the number of times a job has failed
  4. Jobs may be issued again after deletion, but this can always be detected and bad jobs pruned
  5. Optional job table partitioning.
The first client written will rely on hand-coded SQL along with DBIx::Class's schema objects.  This client will guarantee that:

  1. Work done by worker modules always succeeds or fails in a transaction
  2. A job notifier class will be shown
  3. Pruning of completed jobs will be provided via the Perl module and a second query.
The history here is that this came from a major client's use of TheSchwartz, which they outgrew for scalability reasons.  While the basic approach is thus compatible, the following changes are made:

  1. Job arguments are in JSON format rather than in Storable format in bytea columns
  2. Highly optimized performance on PostgreSQL
  3. Coalescing is replaced by a single integer cancellation column
  4. Jobs may be requested in batches of various sizes
2.x will support PostgreSQL 9.5+ and dispense with the need for both advisory locks and rechecking.  I would like to support some sort of graph management as well (i.e. a graph link that goes from one job type to another which specifies "for each x create a job for y" semantics).  That is still all in design.
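
To illustrate why 9.5 changes things: FOR UPDATE SKIP LOCKED lets competing workers claim batches in a single query, with no advisory locks and no recheck.  A minimal sketch against a hypothetical job table (not the extension's actual schema):

SELECT job_id, payload
  FROM job_queue
 WHERE run_after <= now()
   AND NOT completed
 ORDER BY job_id
 LIMIT 10
   FOR UPDATE SKIP LOCKED;  -- rows already claimed by other workers are simply skipped

On 9.4 and earlier, the same effect requires taking advisory locks and rechecking rows after locking, which is what 1.x will do.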

Friday, August 5, 2016

use lib '.' considered harmful (exploits discussed)

With the discussion of CVE-2016-1238, a quick and easy fix that has been suggested for broken code is to add the following line to the top of broken Perl scripts.  Note this applies to Perl as run anywhere, whether pl/perlU, plain perl, or something else:

use lib '.';

In some corners, this has become the goto solution for the problem (pun quite deliberate).  It works, gets the job done, and introduces subtle, hidden, and extremely dangerous problems in the process.

For those interested in the concerns specific to PostgreSQL, these will be discussed near the end of this article.

Security and the Garden Path


I am borrowing an idea here from linguistics, the idea of the garden path, which I think highlights a lot of subtle security problems.  Consider the newspaper headline "Number of Lothian patients made ill by drinking rockets."  It starts off simply enough, and by the end you realize you must have misread it (and you probably did: the number of patients made ill by drinking rocketed upwards, not that some people got sick because they drank hydrazine).  The obvious reading and the logical reading diverge, and this leads to a lot of fun in linguistic circles.

The same basic problem occurs with regard to security.  Security problems usually arise in one of two ways: either people do something obviously insecure (plain text authentication for ftp users where it matters), or they do something that looks secure on the surface but behind the scenes does something unexpected.

Perl has a few surprises here because parsing and running a script is a multi-pass process, but we tend to read it as a single pass.  Normally these don't cause real problems, but in certain cases there are very subtle dangers lurking.  Here, with use lib '.', it is possible to inject code into a running program as long as an attacker can get a file placed in the current working directory of the program.

Relative paths, including the current working directory, do have legitimate use cases, but the problems and pitfalls must be understood before selecting this method.

What the lib pragma does


Perl looks for files to require or include based on a globally defined array of paths called @INC.  The lib pragma stores a copy of the original @INC at first use, and then ensures that the directory specified occurs at the start of the search order.  So directories specified with use lib are searched before the default library directories.  This becomes important as we look at how Perl programs get executed.
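
A trivial way to watch the pragma work (the path is an arbitrary example; lib does not check that it exists):

#!/usr/bin/perl
use lib '/opt/myapp/lib';   # prepended to the module search path
print "$_\n" for @INC;      # /opt/myapp/lib is printed first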

How Perl runs a program


Perl runs a program in two passes.  First it creates the parse tree, then it runs the program.  This is a recursive process, and because of how it works, it is possible for malicious code that gets accidentally run in this process to transparently inject code into this (and other portions) of the Perl process.

Keep in mind that this makes Perl a very dynamic language which, on one hand, has serious garden path issues, but on the other ensures that it is an amazingly flexible language.

During the parse stage, Perl systematically works through the file, generating a parse tree and running any BEGIN blocks, use statements, and no statements.  This means that injected code can be run even if later syntax errors appear to prevent the bulk of the program from running at all, or if earlier errors cause a run-time exception.

After this process finishes, Perl executes the parse tree that results.  This means that Perl code can rewrite the parse tree before your code is ever really run, and that code can be inserted into that part of the process.
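
A small demonstration of the parse-time behavior (the warning text is mine):

#!/usr/bin/perl
BEGIN { warn "parse-time code has already run" }
my $x = ;   # syntax error: the run phase is never reached

The warning is printed before Perl reports the syntax error and aborts; the run phase never happens, but code has already executed.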

Transparent code injection during 'use'


Consider a simple Perl script:

#!/usr/bin/perl

use lib '.';
use Cwd;
use 5.010;
use strict;
use warnings;

say getcwd();

Looks straight-forward.  And in most cases it will do exactly what it looks like it does.  It loads the standard Cwd module and prints out the current working directory.

However, suppose I run it in a different directory, where I add two additional files:

Cwd.pm contains:

package Cwd;

use Injected;

1;


Hmmm, that doesn't look good.  What does Injected.pm do?


package Injected;
use strict;

sub import {
   local @INC = @INC;
   my ($module) = caller;            # the module that loaded us (here, Cwd)
   warn $module;
   delete $INC{'Injected.pm'};       # forget that we were ever loaded
   delete $INC{"$module.pm"};        # ...and that our caller was loaded
   @INC = grep { $_ ne '.' } @INC;   # take the cwd out of the search path
   eval "require $module";           # reload the real module from the system path
   warn "Got you!, via $module";     # arbitrary attacker code could run here

}

1;


So when Cwd imports Injected, Injected deletes itself from the record of loaded modules, deletes its caller too, reloads the correct caller (not from the current working directory), and then executes some code (here a harmless warning).

Cwd.pm then returns success.
test.pl then runs Cwd->import(), which is now the correct one, but we have already run unintended code that could, in theory, do anything.

Any program capable of being run in arbitrary directories, written in Perl, which has this line in it (use lib '.') is subject to arbitrary code injection attacks using any module in the dependency tree, required or not.

Instead, do the opposite (where you can)


As a standard part of the boilerplate in any secure Perl program, I strongly recommend adding the following line to the top of any script.  As long as modules don't add it back in behind your back (it would be extremely rare that they would), this does the job:

no lib '.';

Note that this strips out the current working directory even if it is supplied as a command-line argument, so it may not be possible in all cases.  Use common sense, do some testing, and document this as desired behavior.  Note one can still invoke with perl -I./. in most cases, so it is possible to turn this safety off.....  Additionally, if you put that at the start of your module, something you include could possibly put it back in.

Safer Alternatives


In a case where you need a path relative to the script being executed, FindBin is the ideal solution.  It gives you a fixed path relative to the script being run, which is usually sufficient for an application installed on a system as a third party.  So instead you would do:

use FindBin;
use lib $FindBin::Bin;

Then the script's directory will be in the include path.

PL/PerlU notes:


I always add the explicit rejection of cwd in my plperlu functions.  However, if someone has a program that is broken by CVE-2016-1238-related fixes, it is possible that someone would add a use lib '.' to a Perl module, which is a bad idea.  As discussed in the previous post, careful code review is required to be absolutely safe.  Additionally, it is a very good idea to periodically check the PostgreSQL data directory for Perl modules, which would indicate a compromised system.

Tuesday, August 2, 2016

PostgreSQL, PL/Perl, and CVE-2016-1238

This post is about the dangers of writing user defined functions in untrusted languages, but it is also specifically about how to avoid CVE-2016-1238-related problems when writing PL/PerlU functions.  The fix is simple and straight-forward, and it is important for it to be in pl/perlU stored procedures and user defined functions for reasons I will discuss.  This post discusses actual exploits, and the severity of being able to inject arbitrary Perl code into the running database backend is a good reason to be disciplined and careful about the use of this language.

It is worth saying at the outset that I have been impressed by how well sensible design choices in PostgreSQL generally mitigate problems like this.  In essence you have to be working in an environment where a significant number of security measures have been bypassed either intentionally or not.  This speaks volumes on the security of PostgreSQL's design since it is highly unlikely that this particular attack vector was an explicit concern.  In other words these decisions make many attacks even against untrusted languages far more difficult than they would be otherwise.

The potential consequences are severe enough, however, that secure coding is particularly important in this environment even with the help of the secure design.  And in any language it is easy to code yourself into corners where you aren't aware you are introducing security problems until they bite you.

The current implementation, we will see, already has a fair bit of real depth of defense behind it.  PostgreSQL is not, by default, vulnerable to the problems in the CVE noted in the title.  However, with a little recklessness, it is possible to open the door to the possibility of real problems and it is possible for these problems to be hidden from the code reviewer by error rather than actual malice.  Given the seriousness of what can happen if you can run arbitrary code in the PostgreSQL back-end, my view is that all applicable means should be employed to prevent problems.

PL/PerlU can be used in vulnerable ways, but PL/Perl (without the U) is by design safe.  Even with PL/PerlU it is worth noting that multiple safety measures have to be bypassed before vulnerability becomes a concern.  This is not about any vulnerabilities in PostgreSQL, but what vulnerabilities can be added through carelessly writing stored procedures or user-defined functions.

There are two lessons I would like people to take away from this.  The first is how much care has been taken with regard to security in PostgreSQL's design.  The second is how easily one can accidentally code oneself into a corner.  PL/PerlU often is the right solution for many problem domains and it can be used safely, but some very basic rules need to be followed to stay out of trouble.

Extending SQL in PostgreSQL using Arbitrary Languages


PostgreSQL has a language handler system that allows user defined functions and stored procedures to be written in many different languages.  Out of the box, Python, TCL, C, and Perl are supported.  Perl and TCL come in trusted and untrusted variants (see below), while Python and C are always untrusted.

There are a large number of external language handlers as well, and it is possible to write one's own.  Consequently, PostgreSQL effectively allows SQL to be extended by plugins written in any language.  The focus here will be on untrusted Perl, or PL/PerlU.  Untrusted languages have certain inherent risks in their usage, and these become important, as here, when there is concern about specific attack vectors.

Trusted vs Untrusted Languages, and what PL/Perl and PL/PerlU can do


PostgreSQL allows languages to be marked as 'trusted' or 'untrusted' by the RDBMS.  Trusted languages are made available for anyone to write functions in, while untrusted languages are restricted to database superusers.  Trusted languages are certified by the developers not to interact with file handles and not to engage in any activity other than manipulating data in the database.

There is no way to 'use' or 'require' a Perl module in pl/perl (trusted), and therefore it is largely irrelevant for our discussion.  However, PL/PerlU can access anything else on the system, and it does so with the same permissions as the database manager itself.  Untrusted languages like PL/PerlU make it possible to extend SQL in arbitrary ways by injecting (with the database superuser's permission!) code into SQL queries written in whatever languages one wants.  Untrusted languages make PostgreSQL one of the most programmable relational databases in the world, but they also complicate database security in important ways which are well beyond the responsibility of PostgreSQL as a project.

Like test harnesses, untrusted languages are insecure by design (i.e. they allow arbitrary code injection into the database backend) and this issue is as heavily mitigated as possible, making PostgreSQL one of the most security-aware databases in the market.

To define a function in an untrusted language, one must be a database superuser, so the PostgreSQL community places primary trust in the database administration team not to do anything stupid, which is generally a good policy.  This article is largely in order to help that policy be effective.
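
As a quick illustration of the distinction, here are two deliberately trivial placeholder functions; the first can be created by any user, the second only by a superuser:

CREATE FUNCTION add_one(i int) RETURNS int LANGUAGE plperl AS
$$
  return $_[0] + 1;   # pure data manipulation, no system access possible
$$;

CREATE FUNCTION backend_os_user() RETURNS text LANGUAGE plperlu AS
$$
  return scalar getpwuid($<);   # OS-level access: the user the backend runs as
$$;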

Paths, Permissions, and CVE-2016-1238


The CVE referenced in the title of this article and section is one which allows attackers to inject code into a running Perl routine by placing it in the current working directory of a Perl process.  If an optional dependency of a dependency is placed in the current working directory, it may be included in the current Perl interpreter without the understanding of the user.  This is a problem primarily because Perl programs are sufficiently complex that people rarely understand the full dependency tree of what they are running.

If the current working directory ends up being one which includes user-writeable data of arbitrary forms, the problem exists.  If not, then one is safe.

The problem however is that changing directories changes the include path.  You can demonstrate this by doing as follows:

Create a file ./test.pl that contains:

use test;
use test2;

./test.pm is:

chdir 'test';
1;

test/test2.pm is:

warn 'haxored again';

What happens is that the use test statement includes ./test.pm, which changes directory, so when you use test2, you are including test/test2.pm.  This means that if at any point during the Perl interpreter's life you are in a world-writable directory and including or requiring files, you are asking for trouble.


Usual Cases for PL/PerlU


The usual use case of PL/PerlU is where you need a CPAN module to process or handle information.  For example, you may want to return JSON using JSON.pm, or you may want to write to a (non-transactional) log.

For example, the most recent PL/PerlU function I wrote used basic regular expressions to parse a public document record stored in a database, extracting information relating to patents awarded on protein sequences.  It was trivial, but it was easier to use JSON.pm than to write JSON serialization inline (and yes, it performed well enough given the data sizes operated on).

Are usual cases safe?  Deceptively so!


Here are a few pl/perlU functions which show the relevant environment that PL/PerlU functions run in by default:

postgres=# create or replace function show_me_inc() returns setof text language plperlu as
$$
return \@INC;
$$;
CREATE FUNCTION
postgres=# select show_me_inc();
         show_me_inc          
------------------------------
 /usr/local/lib64/perl5
 /usr/local/share/perl5
 /usr/lib64/perl5/vendor_perl
 /usr/share/perl5/vendor_perl
 /usr/lib64/perl5
 /usr/share/perl5
 .
(7 rows)

postgres=# create or replace function get_cwd() returns text language plperlu as
postgres-# $$
postgres$# use Cwd;
postgres$# return getcwd();
postgres$# $$;
CREATE FUNCTION
postgres=# select get_cwd();
       get_cwd       
---------------------
 /var/lib/pgsql/data
(1 row)


Wow.  I did not expect such a sensible, secure implementation.  PostgreSQL usually refuses to start if non-superusers of the system have write access to the data directory.  So this is, by default, a very secure configuration right up until the first chdir() operation......

Now, in the normal use case of a user defined function using PL/PerlU, you are going to have no problems.  The reason is that most of the time, if you are doing sane things, you are going to want to write immutable functions which have no side effects and maybe use helpers like JSON.pm to format data.  Whether or not there are vulnerabilities exploitable via JSON.pm, they cannot be exploited in this manner.

However, sometimes people do the wrong things in the database and here, particularly, trouble can occur.

What gets you into trouble?


To have an exploit in pl/perlU code, several things have to happen:


  1. Either a previous exploit must exist which has written files to the PostgreSQL data directory, or one must change directories
  2. After changing directories, a vulnerable module must be loaded while in a directory the attacker has write access to.
It is possible, though unlikely, for the first to occur behind your back.  But people ask me all the time how to send email from the database backend and so you cannot always guarantee people are thinking through the consequences of their actions.

So vulnerabilities can occur when people are thinking tactically and coding in undisciplined ways.  This is true everywhere but here the problems are especially subtle and they are as dangerous as they are rare.


An attack scenario.


So suppose company A receives a text file via an anonymous ftp drop box in X12 format (the specific format is not relevant, just that it comes in with contents and a file name that are specified by the external user).  A complex format like X12 means they are highly unlikely to do the parsing themselves so they implement a module loader in PostgreSQL.  The module loader operates on a file handle as such:

On import, the module loader changes directories to the incoming directory.  In the actual call to get_handle, it opens a file handle, and creates an iterator based on that, and returns it.  Nobody notices that the directory is not changed back because this is then loaded into the db and no other file access is done here.  I have seen design problems of this magnitude go undetected for an extended period, so coding defensively means assuming they will exist.

Now, next, this is re-implemented in the database as such:

CREATE OR REPLACE FUNCTION load_incoming_file(filename text) 
RETURNS int
language plperlu as
$$
use CompanyFileLoader '/var/incoming'; # oops, we left the data directory
use CompanyConfig 'our_format';        # oops, an optional dependency that falls back to .
                                       # in terms of exploit, the rest is irrelevant
                                       # for a proof of concept you could just return 1 here
use strict;
use warnings;

my $filename = shift;
my $records = CompanyFileLoader->get_handle($filename);
while (my $r = $records->next){
    # logic to insert into the db omitted
}

return $records->count;

$$;


The relevant part of CompanyFileLoader.pm is (the rest could be replaced with stubs for a proof of concept):

package CompanyFileLoader;
use strict;
use warnings;

my @dirstack;
sub import {
    my $dir = pop;        # the directory passed to 'use CompanyFileLoader'
    push @dirstack, $dir;
    chdir $dir;           # a dangling chdir: nothing ever changes back
}

Now, in a real-world published module this would cause problems, but in a company's internal operations it might not be discovered in a timely fashion.

The relevant part of CompanyConfig is:

package CompanyConfig;
use strict;
use warnings;

eval { require SupplementalConfig };

Now, if a SupplementalConfig.pm is placed in the same directory as the text files, it will get loaded and run as part of the pl/perlU function.

Now, to exploit this, someone with knowledge of the system has to place the file in that directory.  It could be someone internal, due to failure to secure the inbound web service properly.  It could be someone external who has inside knowledge (a former employee, for example).  Or a more standard exploit could be tried on the basis that some other shared module might have a shared dependency.

The level of inside knowledge required to pull that off is large but the consequences are actually pretty dire.  When loaded, the perl module interacts with the database with the same permissions as the rest of the function, but it also has filesystem access as the database server.  This means it could do any of the following:


  1. Write Perl modules to the Pg data directory to exploit this vulnerability in other contexts
  2. Delete or corrupt database files
  3. Possibly alter log files, depending on setup
  4. Many other really bad things.

These risks are inherent in the use of untrusted languages: you can write vulnerable code and introduce security problems into your database.  This is one example of that, and I think the PostgreSQL team has done an extremely good job of making the platform secure.



Disciplined coding to prevent problems


The danger can be effectively prevented by following some basic rules:

All user defined functions and stored procedures in PL/PerlU should include the line:

no lib '.';

It is possible that modules could add this back in behind your back, but for published modules this is extremely unlikely.  So local development projects should likewise avoid use lib '.' in order to prevent this.

Secondly, never use chdir in a pl/perl function.  Remember, you can always do file operations with absolute paths.  Without chdir, no initial exploit against the current working directory is possible through pl/perlu.  Use of chdir circumvents important safety protections in PostgreSQL.
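
For instance, a sketch of the pattern (the directory and helper name are hypothetical):

use strict;
use warnings;
use File::Spec;

sub incoming_handle {
    my ($filename) = @_;
    # build an absolute path; the backend's cwd (the data directory) never changes
    my $path = File::Spec->catfile('/var/incoming', $filename);
    open my $fh, '<', $path or die "cannot open $path: $!";
    return $fh;
}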

Thirdly, it is important to stick to well-maintained modules.  Dangling chdirs in a module's load logic are far more likely to be found and fixed when lots of other people are using the module, and a dangling chdir is a near requirement for accidental vulnerability.  Internal modules need to be reviewed both for optional dependency usage and for dangling chdirs in the module load logic.

Monday, August 1, 2016

CVE-2016-1238 and the hazards of testing frameworks

Because I have lost confidence in the approach taken by those in charge of fixing this problem, I have decided to do a full disclosure series on CVE-2016-1238.  As mentioned in a previous post, the proposed fixes for the minor versions don't even remotely fix the problem and at most can provide a false sense of security.  For this reason it is extremely important that sysadmins understand how these exploits work and how to secure their systems.

A lot of the information here is not specific to this CVE but concerns security regarding running tests generally.  For this reason while this CVE is used as an example, the same basic concepts apply to PostgreSQL db testing, and much more.

As I mentioned in a previous post, prove cannot currently be run safely in any directory that is world-writeable, and the current approach of making this a module problem makes this far worse, not better.  Moreover, this is not something which can be patched in prove without breaking the basic guarantee that it tests the application as it would run on the current system's Perl interpreter, and if you break that, all bets are off.

All the exploits I cover in this series are exploitable on fully patched systems.  However they can be prevented by good system administration.  In every case, we will look at how system administration best practices can prevent the problem.

One key problem with the current approach is that it fails to differentiate between unknowing inclusion and user errors brought on simply by not understanding the issues.  The latter has been totally disregarded by the people attempting stop-gap patches.

Test harnesses generally are insecure by design.  The whole point is to execute arbitrary code and see what happens and with this comes a series of inherent risks.  We accept those risks because they are important to the guarantees we usually want to make that our software will perform in the real world as expected.  Moreover even the security risks inherent in test harnesses are good because it is better to find problems in an environment that is somewhat sandboxed than it is in your production environment.

So testing frameworks should be considered to be code injection frameworks and they can be vectors by which code can be injected into a tested application with system security implications.

Understood this way, testing frameworks are not only insecure by design but this is desirable since you don't want a testing framework to hide a problem from you that could bite you in production.

This is a generic problem, of course.  A database testing framework would (and should) allow SQL injection that could happen in the tested code so that it can be tested against.  So whether you are working with Perl, Python, PostgreSQL, or anything else, the testing framework itself has to be properly secured.  And on each platform this means specific things.

In PostgreSQL, this means you need to pay close attention to certain things, such as the fact that the database tests should probably not run as a superuser, since a superuser can do things like create functions with system access.

In Perl, one of the most important things to consider is the inclusion of modules from the current working directory and the possibility that someone could do bad things there.

What Prove Guarantees


Prove guarantees that your perl code will run, in its current configuration, in accordance with your test cases.  This necessarily requires arbitrary code execution and arbitrary dependency requirements resolved in the way Perl would resolve them on your system.

Prove guarantees that the guarantees you specify in your test cases are met by your current Perl configuration.  It therefore cannot safely do any extra sandboxing for you.

How Prove Works


The basic architecture of prove is that it wraps a test harness which runs a specified program (via the shebang line), parses its output assuming it to be in the Test Anything Protocol, and generates a report from the result.  For example, if you create a file test.t:

#!/bin/bash

echo 'ok 1';
echo 'ok 2';
echo '1..2';

and run prove test.t

You will get a report like the following:

$ prove test.t 
test.t .. ok   
All tests successful.

What prove has done is invoke /bin/bash, run the file on it, parse the output, check that 2 tests were run, and that both printed ok (it is a little more complex than this but....), and let you know it worked.

Of course, usually we run perl, not bash.

An Attack Scenario


The most obvious attack scenarios occur with automated test environments that are poorly secured.  In this case, if prove runs from a directory containing material more than one user might submit, user A may be able to inject code into user B's test runs.

Suppose user A has a hook for customization as follows:

eval { require 'custom.pl' };

Now, this is intended to run a custom.pl file, at most once, when the code is run, and it checks the current @INC paths.  If the file doesn't exist in any of those, Perl falls back to the current working directory, i.e. the directory the shell was working in when prove was run.  If this directory shares information between users, user B can write a custom.pl file into that directory which will run when user A's scheduled tests are run.

The code would then run with the permissions of the test run, and any misbehavior would tie back to user A's test run (it is harder to tie the behavior to user B).  In the event that user A's test run operates with more system permissions than user B's, the situation is quite a bit worse.  Or maybe user B doesn't even have tests run anymore for some reason.

Now, it has been proposed that prove prune its own @INC but that doesn't address this attack because prove has to run perl separately.  Additionally separation of concerns dictates that this is not prove's problem.

Behavior Considered Reasonably Safe


As we have seen, there are inherent risks to test frameworks and these have to be properly secured against the very real fact that one is often running untrusted code on them.  However, there are a few things that really should be considered safe.  These include:


  • Running tests during installation of trusted software as root.  If you don't trust the software, you should not be installing it.
  • Running tests from directories which store only information from a single user (subdirectories of a user's home directory for example).


Recommendations for Test Systems


Several basic rules for test systems apply:

  1. Understand that test systems run arbitrary code and avoid running test cases for automated build and test systems as privileged users.
  2. Properly secure all working directories to prevent users from unknowingly sharing data or test logic
  3. Running as root is only permitted as part of a process installing trusted software.

Saturday, July 30, 2016

Notes on Security, Separation of Concerns and CVE-2016-1238 (Full Disclosure)

A cardinal rule of software security is that, when faced with a problem, you should make sure you fully understand it before implementing a fix.  A cardinal rule of general library design is that heuristic approaches to deciding whether something is a problem should be disfavored.  These lessons were driven home when I ended up spending a lot of time debugging problems caused by a recent Debian fix for the CVE in the title.  Sure, everyone messes up sometimes, so this isn't really a condemnation of Debian as a project, but their handling of this particular CVE is a pretty good example of what not to do.

In this article I am going to discuss actual exploits.  Full disclosure has been a part of the LedgerSMB security culture since we started, and discussion of exploits in this case provides administrators with real chances to secure their systems, as well as helping distro maintainers, Perl developers, etc. write more secure software.  Recommendations will be given at the end regarding improving the security of Perl as a programming language.

Part of the problem in this case is that the CVE is poorly scoped, but CVEs are often poorly scoped, and it is important for developers to work with security researchers to understand a problem, the implications of different approaches to fixing it, and so forth.  It is very easy to fall into the view that "this must be fixed right now," but all too often (as here) a shallow fix does not completely resolve an issue and causes more problems than it resolves.

The Problem (with exploits)


Perl's system of module inclusion (and other operations) looks for Perl modules in the current directory after exhausting other directories.  Technically this is optional, but most UNIX and Linux distributions have this behavior.  On the whole it is bad practice as well, which is why searching the current directory is not the default behavior of shells like bash.  But a lot of software depends on this (with some legitimate use cases), and so changing it is problematic.

Perl programs are also often complex and have optional dependencies which may or may not exist on a system.  If those do not exist in the system Perl directories but do exist in the current working directory, then they may be loaded from the current working directory.  Note that this is not actually limited to the current working directory; Perl could load the files from all kinds of places, user-specified or not.

So when one is running a Perl program in a world-writeable location, there is the opportunity for another user to put code there that may be picked up by the Perl interpreter and executed.  While the CVE is limited to implicit inclusion of the current working directory, the problem is actually quite a bit broader than that.  Include paths can be specified on the command line and if any of them are world-writeable, then variations of the same attacks are possible.

Some programs, of course, are intended to run arbitrary Perl code.  The test harness programs are good examples of this.  Special attention will be given to the ways test harness programs can be exploited here.

These features come together to create opportunities for exploits in multi-user systems which administrators need to be aware of and take immediate steps to prevent.  In my view there are a few important and needed features in Perl as well.

A simple exploit:

Create the following files in a safe directory:

t/01-security.t, contents:

use Test::More;
require bar;
eval { require foo };

plan skip_all => 'nothing to do';


lib/bar.pm contents:

use 5.010;
warn "this is ok";


./foo.pm, contents:

use 5.010;
warn "Haxored!";


now run:

prove -Ilib t/01-security.t

Now, what happens here is that the optional requirement of foo.pm in the test script gets resolved to the one that happens to be in your current working directory.  If that directory were world writeable, then someone could add that file and it would be run when you run your test cases.

Now, it turns out that this is not a vulnerability with prove.  Because prove runs Perl in a separate process and parses the output, eliminating the resolution inside prove itself has no effect.  What this means is that the directory where you run something like prove can really matter and if you happen to be in a world writeable directory when you run it (or other Perl programs) you run the risk of including unintended code supplied by other users.  Not good.  None of the proposed fixes address the full scope of this problem either.  Note that if any directory in @INC is world-writeable, a security problem exists.  And because these can be specified in the Perl command line, this is far more of a root problem than the mere inclusion of the current working directory.

Security Considerations for System Administrators and Software Developers


All exploits of this sort can be prevented even without the recommendations being followed in the proper fixes section.  System administrators should:


  1. Make sure that all directories in @INC other than the current working directory are properly secured against write access (this is a no brainer, but is worth repeating)
  2. Programs such as test harnesses which execute arbitrary Perl code should ONLY be run in properly secured directories, and the only time prove should ever be run as root is when installing (as root) modules from cpan.
  3. Scripts intended to be run in untrusted directories should be audited and one should ensure (and add if it is missing) the following line:  no lib '.';

Software developers should:

  1. Think carefully about where a script might be executed.  If it is intended to be run on directories of user-supplied files, then include no lib '.';  (This does not apply to test harnesses and other programs which execute arbitrary Perl programs.)
  2. Module maintainers should probably avoid using optional config modules to do configuration.  These optional configuration modules provide standard points of attack.  Use non-executable configuration files instead or modules which contain interfaces for a programmer to change the configuration.


What is wrong with the proposed fixes


The approach recommended by the reporter of the problem is to make modules exclude an implicit current working directory when loading optional dependencies.  This, as I will show, raises very serious separation of concerns problems and in Debian's case includes one serious bug which is not obvious from outside.  Moreover it doesn't address the problems caused by running test harnesses and the like in untrusted directories.  So one gets a *serious* problem with very little real security benefit.

If you are reading through the diff linked to above, you will note it is basically boilerplate that localizes @INC and removes the last entry if it is equal to a single dot.  This breaks base.pm badly, because inheritance in Perl no longer follows @INC the way use does.  Without this patch, the following are almost equivalent, with the exception that the latter also runs the module's import() routine:

use base 'Myclass';

and

use Myclass;
use base 'Myclass';

But with this patch, the latter works and the former does not, unless you specify -MMyclass on the Perl command line.  This occurs because someone is thinking technically about an issue without comprehending that this isn't an optional dependency in base, and therefore the problem doesn't apply there.  But the problem is not quickly evident in the diff, nor is it evident when trying to fix this in this way on the module level.  It breaks the software contract badly, and does so for no benefit.

As a general rule, modules should be expected to act sanely when it comes to their own internal structure, but playing with @INC violates basic separation of concerns and a system that cannot be readily understood cannot be readily secured (which is why this is a real issue in the first place -- nobody understands all the optional dependencies of all the dependencies of every script on their system).

Recommendations for proper fixes


There are several things that Perl and Linux distros can do to provide proper fixes to this sort of problem.  These approaches do not violate the issues of separation of concerns.  The first and most important is to provide an option to globally remove '.' from @INC on a per-system basis.  This is one of the things that Debian did right in their reaction to this.  Providing tools for administrators to secure their systems is a good thing.

A second thing is that Perl already has a number of enhanced modes for dealing with security concerns and adding another would be a good idea.  In fact, this could probably be done as a pragma which would:

  1. On load, check that the directories in @INC are not world-writeable -- if they are, remove them from @INC and warn, and
  2. when using lib, check whether the directory is world-writeable, and if it is, die hard.
But making it a module's responsibility to care what @INC says?  That's a recipe for problems, not solutions, security-related and otherwise.
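
To make the first idea concrete, here is a minimal sketch of what such a pragma might look like (the package name is invented; this is an illustration, not a tested module):

package Secure::INC;
use strict;
use warnings;

sub import {
    # drop world-writeable directories from @INC, warning as we go
    @INC = grep {
        ref($_)                        # leave code-ref hooks alone
          or !-d $_                    # leave non-directory entries alone
          or !((stat _)[2] & 0002)     # keep directories that are not world-writeable
          or do { warn "removing world-writeable $_ from \@INC"; 0 }
    } @INC;
}

1;

A script would then say use Secure::INC; before loading anything else.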

Friday, July 29, 2016

What is coming in LedgerSMB 1.5

LedgerSMB 1.5 rc1 is around the corner.   I figure it is time for a very short list of major improvements:


  1. A single-page app design which provides better responsiveness and testability
  2. Speaking of testability, we now have Selenium-based BDD tests as well (and test coverage starting to approach reasonable -- remember, we started with no tests).
  3. We have spun off our database access framework as PGObject on CPAN.  This is an anti-ORM, basically a service locator framework for stored procedures.
  4. Quantity discounts
  5. Many stored procedures moved from PL/PGSQL to plain SQL for better type checking
  6. Full support for templates transparently stored in the PostgreSQL database



Sunday, March 20, 2016

When PostgreSQL Doesn't Scale Well Enough

The largest database I have ever worked on will eventually, it looks like, be moved off PostgreSQL.  The reason is that PostgreSQL doesn't scale well enough for this case.  I am writing here, however, because the limits involved are so extreme that the case ought to give plenty of ammunition against the claim that databases don't scale.

The current database size is 10TB and doubling every year.  The main portions of the application have no natural partition criteria.  The largest table currently is 5TB and the fastest growing portion of the application.

10TB is quite manageable.  20TB will still be manageable.  By 40TB we will need a bigger server.  But in 5 years we will be at 320 TB and so the future does not look very good for staying with PostgreSQL.

I looked at Postgres-XL and that would be useful if we had good partitioning criteria but that is not the case here.

But how many cases are there like this?  Not too many.

EDIT:  It seems I was misunderstood.  This is not a complaint that PostgreSQL doesn't scale well.  It is about a case that is outside all reasonable limits.

Part of the reason for writing this is that I hear people complain that the RDBMS model breaks down at 1TB, which is hogwash.  We are facing problems as we look towards 100TB.  Additionally, I think that PostgreSQL would handle 100TB fine in many other cases, but not in ours.  PostgreSQL at 10, 20, or 50TB is quite usable even in cases where big tables have no adequate partitioning criteria (needed to avoid running out of page counters), and at 100TB in most other cases I would expect it to be a great database system.  But the sorts of problems we will hit by 100TB will be compounded by the exponential growth of the data (figure that within 8 years we expect to be at 1.3PB).  So the only solution really is to move to a big data platform.

Sunday, February 21, 2016

A couple annoyances (and solutions) regarding partitioned tables

In one of my projects we had an issue where a large table that was under huge transactional load was having trouble with autovacuum not keeping up.  The problem was that the table sometimes held over half a billion records, added and deleted millions of records a day, and that since most of these occurred at the heads of various indexes, autovacuum was just not fast enough.

So we decided to partition the table into around 50 pieces in order to allow autovacuum to achieve a bit better parallelism in managing the data.  This helped to some extent.  But partitioning is a rare solution for rare problems and comes with unexpected costs.  Interestingly, most of our problems have been ORM-related.  Here are some we ran into and their solutions (spoiler: in the end, we effectively stopped using an ORM on these tables).  At the end of the day, throughput on these tables increased around 10-fold, and db load was cut by about 90%.

Annoyance 1:  Redirection and ORM transparency


The first problem we had was getting DBIx::Class to work with the partitioned table.  The solution was to add another view in between which did the redirection of inserts, updates, and deletes.  This also allowed us to go through the ORM for inserts (we still do) without the cross-locking issues below being a problem.
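
A sketch of the shape of that redirection (the names are invented and the real partition routing logic is omitted):

CREATE VIEW queue_rw AS SELECT * FROM queue_parent;

CREATE FUNCTION queue_rw_insert() RETURNS trigger LANGUAGE plpgsql AS
$$
BEGIN
    -- route the row onward; real logic would pick the right partition
    INSERT INTO queue_parent VALUES (NEW.*);
    RETURN NEW;
END;
$$;

CREATE TRIGGER queue_rw_insert
    INSTEAD OF INSERT ON queue_rw
    FOR EACH ROW EXECUTE PROCEDURE queue_rw_insert();

The ORM is then pointed at queue_rw rather than at the partitioned table, with similar INSTEAD OF triggers for UPDATE and DELETE.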

Annoyance 2:  Cross-locking and exclusion constraints


A second major problem is that autovacuum can only free up space when it gets an exclusive lock, and if any queries are going through the parent table, then constraint exclusion comes into play.  The problem here is that constraint exclusion takes out a relatively non-invasive lock on every partition at planning time, which means you cannot even plan to select a row from one partition if another partition is exclusively locked, if you are going through the parent table.

The obvious solution here is not to go through the parent table, but the ORM doesn't support that so we had to drop to SQL.  It also took us about 6 months to find and fix.

Annoyance 3:  Constraint exclusion doesn't always do what you expect it to!


One day we had a very slow-running, straightforward query that should have been able to resolve quickly with an index scan on one of the partitions.  However, because the constraint criteria were being brought in via a subquery, they were not available at plan time, so the planner fell back on a sequential scan through another large partition.  Ouch......  We found the query and fixed it.
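
A sketch of the difference (names invented):

-- Slow: the partition key arrives via a subquery, so it is unknown at plan time
-- and constraint exclusion cannot prune; other partitions may be scanned.
SELECT * FROM big_table
 WHERE part_key = (SELECT part_key FROM control WHERE id = 1);

-- Fast: a literal key is available at plan time, so other partitions are excluded.
SELECT * FROM big_table WHERE part_key = 42;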

Annoyance 4:  Solving some performance problems puts more stress on the next bottleneck


The result of the initial success was increased db concurrency, which was great until it became clear that our selection of rows to process and delete was leaving huge numbers of dead tuples at the heads of many indexes.  This meant that selecting rows actually became slower than before.  So we had to go back and engineer a new selection algorithm to avoid this problem....

Unrelated Annoyance:  Long running transactions causing autovacuum headaches


An interesting unrelated issue we had was that, at the time, we had transactions that would sometimes remain open for a week.  While the partitions directly affected were small, the problem is that autovacuum cannot clear tuples that were invalidated after the oldest running transaction started, so higher-throughput partitions were adversely affected.  After significant effort, we got the worst offenders corrected, and now the longest-running transactions take just over a day.  This is usually sufficient depending on the load of the system (but sometimes the duration spikes to 18 hours).

Was the partitioning worth it?  Definitely!  However, it was a bit of a long road to get there.

Wednesday, February 10, 2016

Why Commons Should Not Have Ideological Litmus Tests

This will likely be my last post on this topic.  I would like to revive this blog on technical rather than ideological issues, but there seems to be a real effort to force ideology in some cases.  I don't address this in terms of specific rights, but in terms of community function, and I have a few more things to say on this topic before I return to purely technical questions.

I am also going to say at the outset that LedgerSMB adopted the Ubuntu Code of Conduct very early (thanks to the suggestion of Joshua Drake), and this was a very good choice for our community.  The code of conduct provides a reminder for contributors, users, participants, and leadership alike to be civil and responsible in our dealings around the commons we create.  Our experience is that we have had a very good and civil community, with contributions from every walk of life and a wide range of political and cultural viewpoints.  I see this as an unqualified success.

Lately I have seen an increasing effort to codify a sort of political orthodoxy around open source participation.  The rationale is usually about trying to make people feel safe in a community, but these are usually culture war issues so invariably the goal is to exclude those with specific political viewpoints (most of the world) from full participation, or at least silence them in public life.  I see this as extremely dangerous.

On the Economic Nature of Open Source


Open source software is economically very different from the sorts of software developed by large software houses.  The dynamics are different in terms of the sort of investment taken on, and the returns are different.  This is particularly true for community projects like PostgreSQL and LedgerSMB, but it is true to a lesser extent even for corporate projects like MySQL.  The economic implications thus are very different.

With proprietary software, the software houses build the software and absorb the costs of doing so, and then later find ways to monetize that effort.  In open source, that is one strategy among many, but the software is built as a community and in some sense collectively owned (see more on the question of ownership below).

So with proprietary software, you may have limited ownership over the software, and this will be particularly limited when it comes to the use in economic production (software licenses, particularly for server software, are often written to demand additional fees for many connections etc).

Like the fields and pastures before enclosure, open source software is an economic commons we can all use in economic production.  We can all take the common software and apply it to our communities, creating value in those areas we value.  And we don't all have to share the same values to do it.  But it often feeds our families and more.

But acting as a community has certain requirements.  We have to treat each other with humanity generally.  That doesn't mean we have to agree on everything, but it does mean that some degree of civility must be maintained and cultivated by those who have taken on that power in open source projects.

On the Nature of Economic Production, Ownership and Power (Functionally Defined)


I am going to start by defining some terms here because I am using these terms in functional rather than formal ways.

Economic Production:  Like all organisms we survive by transforming our environment and making it more conducive to our ability to live and thrive.  In the interpersonal setting, we would call this economic production.  Note that understood in this way, this is a very broad definition and includes everything from cooking dinner for one's family to helping people work together.  Some of this may be difficult to value but it can (what is the difference between eating out and eating at home?  How much does a manager contribute to success through coordination?).

Ownership:  Defining ownership in functional rather than formal terms is interesting.  It basically means the right to use something and to direct its usage.  Seen in this way, ownership is rarely all or nothing.  Economic ownership is the right to utilize a resource in economic production.  The more one is restricted in using a piece of software for economic production, the less one owns it, so CAL requirements in commercial software and anti-Tivoization clauses in the GPL v3 are both restrictions on functional ownership.

Economic Power:  Economic power is the power to direct or restrict economic production.  Since economic production is required for human life, economic power is power over life itself.  In an economic order dominated by corporations, corporations control every aspect of our lives.  In places where the state has taken over from the corporations, the state takes over this as well.  But such power is rarely complete because not all economic production can be centrally controlled.

I will come back to these below, because my hesitation about kicking people out of a community over ideological disagreements (no matter how wrong one side may seem) has to do with this fear of abuse of economic power.


On Meritocracy (and what should replace it)


Meritocracy is an idea popularized by Eric Raymond: that power in a community should be allocated according to technical merit.  In short, one should judge the code, not the person.  The idea has obvious appeal and is, on the surface, hyper-inclusive.  We don't have to care about anything regarding each other except the quality of the code.  There is room for everyone.

More recently there has been push-back in some corners against the idea of meritocracy.  This push-back comes from a number of places, but what they have in common is questioning how inclusive it really is.

The most popular concern is that meritocracy suggests we should tolerate people who actively make the community less welcoming, particularly for underrepresented groups, and that meritocracy therefore becomes a cover for excluding the same groups who are excluded along other social dimensions: the means of exclusion differs, but who is excluded might not.

There is something to be said for this concern, but its advocates have often suggested that any nexus between a community and hostile ideas is sufficient to raise a problem.  So when an Italian Catholic expresses a view of gender based on his religion on Twitter, people not even involved in the project seek his removal from it on the grounds that the ideas are toxic.  For reasons that will become clear, that is vast overreach, and a legitimate complaint is thus made toxic by the actions of those who promote it.  Similarly toxic are the efforts by some to use social category to insist that their code be included just to demonstrate a welcoming atmosphere.

A larger problem with meritocracy, though, is the way it sets open source communities up to be unbalanced: ruled by technical merit and thus unable to attract the other sorts of contributions needed to make most software successful.  In a community where technical merit is the measure by which we are judged, non-technical contributions are systematically devalued and undervalued.  How many open source communities produce software that is poorly documented and shows little attention to user interface?  If you devalue the efforts at documentation and UI design, how will you produce software that really meets people's needs?  If you don't value business analysts and technical writers, how will you create economic opportunities for them in your community?  If you don't value them, how will you leverage their presence to deliver value to your own customers?  You can't if your values are skewed.

The successor to meritocracy should be economic communitarianism, i.e. the recognition that what is good for the community is economically good for all its members.  Rather than technical merit, the measure of a contribution and a contributor ought to be the value that the contribution brings to the community.  Some of those contributions will be highly technical, but some will not.  Sometimes a very ordinary contribution that anyone could offer will turn the tide, because only one person was brave enough to make it, or had the vision to see it as necessary.  Just because such contributions are not technical does not mean they are not valuable or should not be deeply respected.  I would argue that in many ways the most successful open source communities are the ones which have effectively interpreted meritocracy loosely, as economic communitarianism.

On Feeling Safe in the Community


Let's face it: people need to feel safe and secure in the community regarding their physical safety and economic interests.  Is there any disagreement on this point?  If there is, please comment below.  But the community cannot be responsible for how someone feels, only for making sure that people are objectively physically and economically secure within it.  If someone feels unsafe attending conferences, community members can help address security concerns, and if someone severely misbehaves in community space, then that has to be dealt with for the good of everyone.

I don't think the proponents of ideological safety measures have really thought things through.  The world is a big place, and it doesn't afford people ideological safety unless they avoid going out and working with people they disagree with.  As soon as you cross an international border, disagreements spring up everywhere, and if you aren't comfortable with this, then interacting on global projects is probably not for you.

Worse, when it comes to conduct outside of community circles, those in power in the community cannot usually act constructively.  We don't have intimate knowledge, and even if we do, our viewpoints have to be larger than the current conflict.

On "Cultural Relativism:" A welcoming community for all?


One of the points I have heard over and over in discussions of community codes of conduct is that welcoming people regardless of viewpoint (particularly on issues like abortion, sexuality, etc.) is cultural relativism and thus not acceptable.  I guess the question is: not acceptable to whom?  And do we really want an ideological orthodoxy on every culture-war topic to be part of an open source project?  Most people I have met do not.

But the overall question I have for people who push culture war codes of conduct is "when you say a welcoming community for all, do you really mean it?  Or do you just mean for everyone you agree with?  What if the majority changes their minds?"

In the end, as I will show below, trying to enforce an ideological orthodoxy in this way does not bring marginal groups into the community but necessarily forces a choice of which marginal groups to further exclude.  I don't think that is a good choice, and I will go on record and say it is a choice I will steadfastly refuse to make.

A Hypothetical


Ideology is shaped by culture, and the ideology of sexuality is shaped by family structures; consequently, where family structures differ, views on sexuality will differ as well.

So suppose someone on a community email list includes a pro-same-sex marriage email signature, something like:

"Marriage is an institution for the benefit of the spouses, not [to] bind parents to their children" -- Ted Olson, arguing for a right to same-sex marraige before the United States Supreme Court.

So a socially conservative software developer from southern India complains to the core committee that this is an attack on his culture, implying that traditional Indian marriages are not real marriages.  Now, I assume most people would agree that it would be best for the core committee not to insist that the email signature be changed as a condition of continued participation.  Given such a decision, suppose the complainant changes his own signature instead to read:

"If mutual consent makes a sexual act moral, whether within marriage or without, and, by parity of reasoning, even between members of the same sex, the whole basis of sexual morality is gone and nothing but misery and defect awaits the youth of the country… " -- Mohandas Gandhi

Now the first person decries the signature as homophobic and demands that the Indian fellow be thrown off the email list.  And the community, if it has decided to pursue ideological safety, has to resolve the issue.  Which group to exclude?  The sexual minority?  Or the group marginalized through a history of being on the business end of colonialism?  And if one chooses the latter, what does that say about the state of the world?  Should Indians, Malaysians, Catholics, etc. band together to fork a competing project?  Is that worth the cost?  Doesn't that hurt everyone?

On Excluding People from the Commons


In my experience, excluding people from the commons carries massive costs, and this is a good thing because it keeps economic power from being abused.  I have watched the impact first hand.  LedgerSMB would not even exist if this hadn't been an issue with SQL-Ledger.  That we are now the only real living fork of SQL-Ledger, and far more active than the project we forked from, is a testament to that cost.

Of course in that case the issue was economic competition and a developer who did not want to leverage community development to build his own business.  I was periodically excluded from SQL-Ledger mailing lists and the like for building community documentation (he sold documentation).  Finally the fork happened because he wouldn't take security reports seriously.  And this is one of the reasons why I push for an inclusive community.

But I also experienced the economic ramifications of being excluded.  It was harder to find customers (again, the reason for exclusion was economic competition, so that was the point).  In essence, I am deeply aware of the implications of kicking people out.

I have seen, on email lists and tracker tickets, the goal of excluding people with problematic ideologies compared to McCarthyism.  The goal of McCarthyism was indeed similar: to make sure that if you had the wrong ideas, you would be unable to continue a professional career.  I had relatives who suffered because they defended the legal rights of the Communist Party during that time, and I am aware of cases where the government tried (unsuccessfully) to take away their professional careers.

Management of a community is political, and the cost of excluding someone is also political.  We already exist, in some ways, on the margins of the software industry.  Exclude too many people and you create your own nemesis.  That's what happened to SQL-Ledger, and it is why LedgerSMB is successful today.

Notes on former FreeBSDGirl


One blog entry from the other side of this issue is Randi Harper's piece on why she will no longer go to FreeBSD conferences or participate on IRC channels.  I am not familiar with the facts surrounding her complaints, and frankly I don't have time to become so; whatever the nature of her harassment complaint, I will not be its judge.

There is, however, another side to the issue, one outside what she evidently has experience with: the role of software maintainers in addressing the sorts of complaints she made.  Consequently I want to address that side and then discuss her main points at the bottom.

One thing to remember is that when people make accusations of bullying, harassment, and the like, the people in charge are also the people with the least actual knowledge of what is going on.  Expecting justice from those in power in cases like this will lead, far more often than not, to feelings of betrayal.  This is not because of bad intentions but because of lack of knowledge.  It was one thing I learned navigating schoolyard bullies growing up, and as project maintainers we know even less than school administrators do.  Bullies, furthermore, are usually experts at navigating the system and take advantage of those who are not as politically adept, so the more enforcement you throw at the problem, the worse it gets.

So there is an idea that those in charge will stop people from treating each other badly.  That idea has to go, because it isn't really possible (as reasonable as it sounds).  What we can do is keep the peace in community settings, and that is about it.  One needs bottom-up solutions, not top-down ones.

So if someone came to me as a maintainer of a project alleging harassment on Twitter and demanding that an active committer be removed, that demand would probably go nowhere.  If political statements were mentioned, the response would be: do we want a political orthodoxy?  Yet LedgerSMB has avoided these problems largely because, I think, we are a community of small businesses, and therefore used to working through disagreements, and perhaps because we are used to seeing these sorts of things as political.

Her main points, though, are worth reading and pondering.  In some areas she is perfectly right, and in others dangerously wrong.

Randi is most right in noting that personal friction cannot be handled like a technical problem.  It is a political problem and needs to be handled as such.  I don't think official processes are the primary protection here, and planning doesn't get you very far, but things do need to be handled delicately.

Secondly, there is a difference between telling someone to stay quiet and telling someone not to shout publicly.  If mediation is going to work, one cannot have people trying to undermine it in public, but people do need friends and family for support, so it is important to avoid the impression that one is insisting on total confidentiality.

Randi is also correct that how one deals with conflict is a key gauge of how healthy an open source community is.  Insisting that people be banished for politically offensive viewpoints, however, does not strike me as healthy or constructive.  Insisting that people behave themselves in community spaces does.  In very rare cases it may be necessary to mediate conflicts involving behavior beyond that, but insisting on strict enforcement of some codified policy will not bring peace or prosperity.

More controversially, I will point out a point that Randi makes implicitly which is worth making explicit here: there is a general tone-deafness to women's actual experiences in open source.  I think this is very valid.  I can remember a former colleague in LedgerSMB making a number of complaints about how women were treated in open source.  Her complaints included both unwanted sexual attention ("desperate geeks") and, more actionably, the fact that she was repeatedly asked how to attract more women to open source (she responded once on an IRC channel with "do you know how annoying that is?").  She ultimately moved on to other projects after a change in employment moved LedgerSMB outside the scope of her duties, but one obvious lesson those of us in open source can take from this is simply to listen to complaints.  Many of these are not problems that policies can solve (do you really want a policy telling people not to ask what needs to be done to attract more women to open source?), but if we listen, we can learn something.

One serious danger in the current push for more expansive codes of conduct is that it gives those with the least knowledge the greatest responsibility.  My view is that expansive codes of conduct, vesting maintainers with greater power over areas of political advocacy outside community fora, will lead to more conflict, not less.  So I am not keen on her proposed remedies.

How Codes of Conduct Should Be Used


The final point I want to bring up is how codes of conduct should be used.  They should not be seen as pseudo-legal or process-oriented documents.  If you go that way, people will abuse the system.  It is better, in my experience, to vest maintainers with responsibility for keeping the peace, not dispensing justice, and to aim codes of conduct at the former, not the latter.  Justice is a thorny issue, one philosophers around the world have argued about for millennia with no clear resolution.

A major problem is the simple fact that perception and reality don't always coincide.  I was reminded of this controversy while reading an article in The Local about the New Year's Eve sexual assaults, covering work by a feminist scholar in Sweden pointing out that men are actually more at risk of bodily violence than women, and that men suffer disproportionately from crime yet are the least likely to modify their behavior to avoid being victimized.  The article is worth reading in light of the current issues.

So I think if one expects justice from a code of conduct, one expects too much.  If one expects fairness from a code of conduct, one expects too much.  If one expects peace and prosperity for all, then that may be attainable but that is not compatible with the idea that one has a right not to be confronted by people with dangerous ideologies.

Codes of conduct, used right, provide software maintainers with a valuable tool for keeping the peace.  Used wrong, they lead open source projects into ruin.  In the end, we have to be careful to be ideologically and culturally inclusive, and that means no one can be guaranteed safety from ideas they find threatening.

Tuesday, January 26, 2016

On Contributor Codes of Conduct and Social Justice


The PostgreSQL, Ruby, and PHP communities have all been considering codes of conduct for contributors.  The LedgerSMB community already uses the Ubuntu Code of Conduct.  Because this touches many projects, I am syndicating this post more widely than the communities where issues are currently live.  This is not a technical post, and it covers a wide range of very divisive issues for a very diverse audience.  I can only hope that the nuance I am trying to communicate comes across.

Brief History


A proximal cause seems to be an event referred to as "Opalgate," where an Italian individual who claimed to be part of the Opal project made some unrelated tweets in an exchange about the politics of education and how gender should be presented, and some people took offense and demanded his resignation (at least, that is my reading of the Twitter exchange, though I have been outside the US long enough to lose the context in which an American in the US would likely read it).  The details are linked below, but the core question -- how much must major contributors to projects refrain from saying anything at all about divisive issues -- is a recurring one.  Moreover it is a legitimate one.

Like some of my blog posts, this goes into touchy territory.  I am discussing things which require a great deal of nuance.  Chances are, regardless of where you sit on some of these issues, you will be offended by things I say, but there are worse things than to be offended (one of them is never to be challenged by different viewpoints).

I write here as someone who has lived in a number of very different cultures and who can see perspectives on many of these issues that are not present in American political discourse.  For this reason, I think it is important for me to share the concerns I see, because open source software maintainers often lack any perspective from outside Western countries, or even outside the US.

Of course, as open source software maintainers we want everyone to feel safe and valued as members of the community.  But cultural tensions and differences in ways of life do crop up, and taking a position on these as a community will always do more harm than good.

Background Reading regarding Opalgate and the question of so-called "social justice warriors" in open source


It may seem strange to put a list of links for background reading near the start of an article, but I want to make sure such material is available up front.  People can read about Opalgate here, along with the ongoing debate between the various parties.  It is important background reading but somewhat peripheral to the overall problems involved.  It may or may not be the best example of the difficulties of running cross-cultural projects, but it does highlight the difficulties of addressing diverse community bases, which may have deep philosophical disagreements about things people take very personally.

In the interest of full disclosure, I too worry that there is too much eagerness to liberate children from concepts of gender and too little thought about how this can and will be abused, and what the life costs for the children will actually be.  I believe that we must be human and humane to all, but I am concerned that the US is going down a path that strikes me as anything but that in the long run.  That doesn't mean the concerns of the trans community in the US should be ignored, but neither does it mean they should be paramount.  As communities we need to come together to solve problems, not fight culture wars.

Twitter is not a medium conducive to thoughtful exchange, so I also have to cut some slack; it is probably not the wisest medium for discussing controversial topics.  But people around the world have deep differences in views on major controversies.  My wife, for example, is far more opposed to abortion than I am, and having come to a deeper understanding of her culture, I don't disagree that in her cultural context it is more harmful.  But that brings me to another problem: many issues are contextual, and we cannot see how others are really impacted by such changes, particularly when they are forced from the outside.

But my view doesn't matter everywhere.  It matters in my family, in my discussions with people I know, and so forth.  But most of the world is not my responsibility, nor should it be.  These issues are not entirely easy, and there should be room for disagreement.

Is Open Source Political?


Coraline Ada Ehmke's basic argument is that open source is inherently political: it seeks a positive change in the world, and therefore it should ally itself with others who share the drive to make the world a better place.  I think this viewpoint is misguided, but only because it is half-wrong.

Aristotle noted that all human relationships are necessarily political.  The three he chose as primary in the Politics are illustrative: master and slave (we could update this to boss and worker); husband and wife; and king and subject.  To Aristotle, the human being alone is incomplete.  We are our relationships, and our politics follows from them.  While there has been an effort in modern times to separate the personal and the political, feminist historians have kept this tradition alive and well.  A notion of the political grounded in humans as social animals is fundamentally more conducive to justice than cold, mechanical, highly engineered social machinery.  Moreover, Aristotle notes that all communities are built on some concept of the good, and that humans only want things that seem good to them; we can therefore assume that all groups seek a better world, but we don't always know which ones deliver, and that is the problem.

Open source begins not with an ideology but with a conviction.  Not everybody shares the same conviction.  Not everyone participates in open source for the same reason.  But everyone has a reason, some conviction that what they are doing is good.  There is enough commonality for us all to work together, but that commonality is not as strong as one might think.

In a previous post on this blog I argued for a very different understanding of software freedom than Richard Stallman supposes, for example.  While he holds a liberal enumerated liberties view, I hold a traditionalist work-ownership view.  Naturally that leads to different things we look for in an ideal license.

And the diversity in viewpoint does not stop there.  Some come to open source because they believe that open source is a better way of writing software.  Some because they believe that open source software delivers benefits in use.  But regardless of our disagreement we share the understanding that open source software brings community and individual benefits.

In two ways then is open source software political:
  1. Communities require governance and this is inherently political, and
  2. To the extent there is a goal to transform the software industry into one of open source, that is political.
The first, as we will see, is a major problem.  Open source communities are diverse in a way few Americans can fully comprehend (we like to think everyone is like us and that there is one right way, the American Way, whether in industry -- the right -- or in formulations of rights -- the left).  Thus most discussions end up being Western-normative (and in particular American-normative) and disregard perspectives from places like India, Malaysia, Indonesia, and so forth.

However, it is worth coming back to the point that what brings us together is an economic vision.  Yes, that is intrinsically and highly political, but it also has consequences for other causes, and therefore it is worth being skeptical of alliances with groups pulling in other directions.  What would an open source-based economy look like?  What would the businesses look like?  Would they be the corporations of today or the perpetual family businesses and trades of yesteryear?  And if the latter, what is the implication for the family?  Many of these questions (just like questions of same-sex marriage) depend in large part on the current social institutions in a culture -- the implications of an industrial, corporate, weak-family society adopting something like same-sex marriage are very different than in an agrarian or family-business, strong-family society.  My view is that these questions will likely have different answers in different places.

Thus when an open source community takes a position on, for example, gay rights in the name of providing a welcoming community, it makes the community openly hostile to a very large portion of the world, and I do not think that is what we want.  Moreover, such a decision is usually a product of white, Western privilege and effectively marginalizes those in so-called developing countries who want to see their countries develop economically in a very different direction than the US has.  Worse, this is not an unintended side effect but the whole point.

A brief detour into the argument over white privilege


A discussion of so-called white privilege is needed, I think, for three groups reading this:
  • Non-Americans, who will have trouble understanding the idea as it applies to American society (like all social ideas, it does not apply to all societies, and even where it does apply, it may not apply in the same way).
  • White Americans, who seem to have trouble understanding what people of color in the US mean when they use the term.
  • Activists, who want to use the idea as a political weapon to enforce a sort of orthodoxy.

I mentioned Western-normative above.  It is worth pointing out that this forms part of a larger set of structures that define what is normal or central in a culture, and what is abnormal, marginalized (or perhaps liminal).  It is further worth noting that these models are more apparent to those who are not treated as the paragons of normality.  In the US, the paragon of normality is the white, straight male.  But unspoken here is that it is the white, straight, urban, wealthy American male (or maybe European; they are white too).  Everyone else (women, people of color, Africans, Asians, etc.) should strive to be like these paragons of success.  (I myself, having lived most of my life either outside the US or in its rural parts, am most certainly not included in this model of normality, but it nevertheless took years of marriage to someone from a different culture and race to begin to partially see a different perspective.)

Now, it doesn't follow that white straight males live up to this image (which is one reason why white privilege theory has proven controversial among the arguably privileged), even where there is wealth, an upbringing in a nice city neighborhood, and so on.  But that isn't really the point.  The point is that society holds these things to be *normal* and everything else to be normal only to the extent that it resembles this model.  It would be better and more accurate to call this a model of normality rather than privilege, and to state at the outset that we cannot really walk a mile in the shoes of people from across many social borders (culture included).

White privilege is real, as is male privilege (in some areas, particularly employment), urban privilege, American privilege, Western privilege, even female privilege (in some areas, particularly family law).

Issues exist in a sticky web of culture, and no culture is perfect


These issues of privilege aren't necessarily wrong in context: it seems unlikely that the workplace can be made less male-normative without men sharing equally in the duties and rights of childrearing, but enforcing that cuts against the goal of some feminists of liberating women from men (and also exists in tension with things like same-sex marriage and gender non-essentialism).  Insisting that men get the same amount of parental leave as women cuts one direction, but insisting that single women get free IVF cuts the other (both are either already the case in Sweden or the subject of efforts to make them so).  In other words, addressing male privilege requires transforming the economic and family order together, in such a way that having children becomes an economic investment rather than an economic burden.  But that has implications for the idea of gay rights as we understand it in the West, because if having and raising children becomes normative, then one is providing a sort of parental privilege, and gender equality becomes based on heteronormativity.

But the ultimate white privilege is to deny it is a factor while using one's own perception that other cultures are homophobic or transphobic to justify one's own racist paternalism.  No need to understand why.  We are white.  We know what is right.  We just need to educate them so they can join the ranks of the culturally white enlightened liberal elite as well.  Most of the world, however, disagrees, and as maintainers of open source projects we have to somehow keep the peace.  (Note that I use the term liberal as it is used in the history of ideas -- in the West it is no less prevalent on the mainstream right than on the mainstream left, though the application may differ.)

Since many of these issues necessarily exist in tension with each other, there is no such thing as a perfect culture.  It isn't even clear that the West does better than Southeast Asia on the whole (in fact I would say SE Asia does better).  But every culture is an effort at these tradeoffs, and it is not the job of open source communities to push Western changes on the rest of the world.

What is Social Justice?  Two Theories and a Problem


If open source is inherently political, then social justice must in some way matter to open source.  Naturally we must understand what social justice is and how it applies.  Certainly a sense of being treated fairly by the community is essential for contributors from all walks of life.  The cult of meritocracy is an effort at social justice within the community.  As some argue, it is not entirely without problems (see below), but as a technical community it is a start.

Western concepts of justice today tend to stress individuality, responsibility, and autonomy.  The idea is that justice is something that exists between individuals, and maybe between individuals and the state.  And while contemporary Western social justice theorists on the left try to relate the parts to the whole of society, it isn't clear that their theories leave room for any parts other than the isolated individual and the state.  If one starts with the view that humans are born free but are everywhere in chains (Rousseau), then the job of the state is to liberate people from each other, and that leaves no room for any other parts.

The individualist view of justice, when seen as primary, breaks down in a number of important ways.  Most importantly, it provides no real way of understanding parts and how they relate to the whole.  Thus the state grows stronger while isolating people from each other, and predictability becomes more important than human judgement.  Separatism cannot be tolerated, and assimilationism becomes the rallying cry for how the central model of normality should deal with those outside it.  In other words, the only way this approach can deal with those on the margins is to destroy their culture and assimilate the individuals who remain.  Resistance must be made futile (Opalgate can be seen as such an effort).  For this reason, this view of justice is incompatible with real cultural pluralism.  This is not a question of the political spectrum in the US or Europe; it is a fundamental cultural assumption in much of the West.  Interestingly, the insistence that the personal is political means that intellectual feminism already exists in tension with this cold, mechanical view of justice.

Another view of justice can be found in Thomas Aquinas's position that, in addition to justice between individuals, there is a need to recognize that just as individuals are parts in relation to the whole, so are other organs within society.  In other words, justice is a function of power, and is in part about the just design and proper distribution of power and responsibility.  In this regard Aquinas built on the thought experiments of Plato's Republic and Aristotle's Politics.  Key questions of social justice thus include the structure of an open source community, the relationship between the parts of the community (how users and developers interact and share power and responsibility), the relationships between open source projects, and so forth.

In the end, though, there was a reason why Socrates eventually rejected every formulation of justice he pondered.  Justice itself is complex, and to formulate it is to remove a critical component of it: human judgement when weighing harms which are not directly comparable.  I think it is therefore quite necessary to remain humble about the topic and to realize that nobody sees all the pieces, and that we as humans learn more from disagreement than from agreement.  Every one of us is ignorant to some extent about the nature of justice, and so disagreements are healthy.

Open Source Projects and So-Called Social Justice Warriors


Coraline Ehmke, in her post on the Opalgate ticket, asked:

"Is this what the other maintainers want to be reflected in the project?  Will any transgender developers feel comfortable contributing?"

This is a good question, but another question needs to be asked as well.  Given that a lot of people live in societies with very different family and social structures, should people feel comfortable using software if the maintainers of the project have come out as openly hostile to the traditional family structures of their culture?  Does not a community that is welcoming of all need to avoid the impulse to delegitimize social institutions in other cultures, where one necessarily lacks insight into how those institutions play into questions of economic support and power?  If open source is already political, do we want to ally ourselves with groups that could alienate important portions of our user base by insisting that they change their way of life?

It is important that we maintain a community that is welcoming to all, but that means we have to work with people we disagree with.  A mere difference of opinion should never be sufficient to trigger a code of conduct problem, and expressing an opinion outside community resources should never be sufficient to consider the community unduly unwelcoming.  A key measure of a community is whether people can work together when they disagree; forcing agreement, or even silencing opposition, is the opposite of social justice in a far-reaching global project.

Should open source communities eject social justice warriors, as ESR suggests?  Not if they are willing to work comfortably with people despite disagreements on hot-button issues.  Should we welcome them?  Yes, if they are willing to work comfortably with people despite disagreements on hot-button issues.  Should we require civility?  Yes.  Should we as communities take stances on hot-button issues internationally?  Absolutely not.  What about as individuals?  Don't we have a civic duty to engage in our own communities as we see best?  And if both of those are true, must we not tolerate a wide range of differences in opinion, even those we find deeply and horribly wrong?