Copyright (c) 2001-2003 V. Alex Brennen (VAB).
This document is hereby placed in the public domain.
This document lives at http://cryptnet.net/fdp/crypto/strong_distro/en/strong_distro.html (DocBook XML)
This document is currently only available in the following languages:
If you know of a translation or would like to translate it to another language please let me know so that I can distribute or link to the translated versions.
V. Alex Brennen (Principal Author)
Greg Sabino Mullane (Proofreading of version 1.0.0 - Simplification Suggestions, Technical Suggestions)
Edward C. Bailey (RPM signature sections were derived from Ed's book Maximum RPM)
Werner Koch (GnuPG option usage correction: fix for the --not-dash-escaped requirement)
A strong distribution model is a model of software distribution which is cryptographically strong. In such a distribution model, software archives and source code are protected against alteration, damage and replacement through the science of cryptography. Specifically, a strong distribution model is a distribution model which makes use of public key cryptographic technology to make attack, fraudulent presentation, compromise and alteration theoretically hard problems.
Let's start by either gaining an understanding of how modern cryptography works, or by running through a quick review of that subject. Cryptography is the mathematical science of creating and using cryptosystems. A cryptosystem is a system for protecting information. Encryption is the action of protecting, encoding or scrambling data through cryptography. Decryption is the action of decoding, or unscrambling, that data so that it can be read. Encryption and decryption are done with numeric values which are called keys. There are two types of cryptography which we'll discuss here: symmetric key cryptography and public key cryptography. After defining symmetric key cryptography, we'll shift our focus to public key cryptography, the technology upon which the strong distribution model is based.
Symmetric key cryptography is a type of cryptosystem which uses a single key, or secret, for both the encryption and the decryption of data. Typically, it is very easy to decode the secret message and reveal the hidden information if you are in possession of the secret key. If you do not have the secret key, the hope is that it is very difficult, almost impossible, to decrypt the secret message and gain access to the data.
Public key cryptography is a different type of cryptosystem which uses two complementary keys, interrelated in a special way, to protect data. We know from our discussion of symmetric cryptography that a key can be used to encrypt or decrypt a message. In a public key cryptosystem, one of the interrelated keys can be used to encrypt information and, due to their interrelation, the other key may subsequently be used to decrypt that information. In some instances of public key cryptography, both keys may be used for encryption and decryption. The interrelated cryptographic keys in a public key cryptosystem are called a "key pair". The two keys of a key pair are separated into a private key, which is kept secret, and a public key, which is broadcast. The basis of the public key cryptography used in the strong distribution model is that data encrypted with the private (secret) key can be decrypted with the public key, and data encrypted with the public key can be decrypted with the secret key.
A digital signature is the electronic equivalent of a traditional paper based signature with some additional benefits and features. A digital signature is a combination of hashing technology and public key encryption. A digital signature can be used to certify and timestamp a document, or in our case a software archive. A solid understanding of hash technology is critical in establishing an understanding of digital signatures and therefore, the strong distribution model.
A hash algorithm is a function which can be run on a variable length piece of data in order to produce a fixed length representation of that data. This fixed length representation is called a digest. It is computationally infeasible to reproduce a message from its digest. Hash technology relies on being reproducible: given the same input data, the same digest will always be produced. Data integrity can therefore be established by comparison of hash digests. Hash algorithms are designed to exhibit a phenomenon called the avalanche effect. The avalanche effect is the chaining of subsequent computations to the computations which preceded them in such a way that a minor change in any piece of data will avalanche, or produce an extreme effect in all following calculations. Hash algorithms are also designed in a way that makes collisions extremely unlikely. A collision occurs when two different pieces of data produce the same digest when hashed.
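The reproducibility and avalanche properties described above can be observed with any hash utility. The following is a minimal sketch, assuming the sha256sum tool from GNU coreutils is installed. SHA-256 and the helper name digest are chosen here only for illustration and are not prescribed by this document.

```shell
#!/bin/sh
# Print only the hexadecimal digest of the string given as $1.
digest() {
    printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

first=$(digest "release 1.0.0")
again=$(digest "release 1.0.0")     # same input as above
changed=$(digest "release 1.0.1")   # a single character differs

echo "original: $first"
echo "repeated: $again"
echo "changed:  $changed"
```

The first two digests are identical, while the third bears no visible resemblance to them even though only one character of the input changed.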
In the openPGP based strong distribution model, a digital signature is created by creating a hash of the data and then encrypting the digest with a given private key. Software digitally signed with your secret key can be verified by anyone who acquires your public key and performs a verification operation with openPGP software. Verification takes place when a hash is recomputed on a software archive, or a piece of data, and the digest matches the digest encrypted in the digital signature. Since the signature is in part made up of hashing technology (most often SHA-1 or MD5), even a small change to the software archive or source code will, due to the avalanche effect, be immediately apparent when it causes verification to fail. In the encryption portion of the digital signature, the tight integration of the key pair means that a digital signature decrypted by a public key which is not the public key linked with the secret key of the signer will produce a value which does not match the recalculated digest. Through this feature, verification of the digital signature not only verifies the contents of the archive through hashing technology, but also verifies the identity of the author through public key technology. That summarizes the basis of the strong distribution model.
Many developers currently make use of MD5 hash technology in the distribution of their software. Most of us are familiar with the use of MD5 to protect against the alteration of computer files through the use of programs like tripwire, which create a database of MD5 hashes for later verification. Using a cryptographic signature to protect data is very similar to using a hash like MD5. The difference, again, is that encrypting the resulting digest value with a key in a known key pair allows identity to be asserted and proven to the extent that the key pair is trusted.
The possession of an openPGP key pair also provides a software project with the ability to post digitally signed messages or to send digitally signed email. The maintainers of a software project can digitally sign release announcements, security advisories and even support email. In addition, it is also possible to use an openPGP key pair to send encrypted email to the maintainers of a software package. All of these abilities can contribute greatly to the security of a free software project.
Since the identity of the author and the contents of the archives are protected, the strong distribution model protects you from having your software replaced on your server with a trojan horse. It also protects you from having your software infected with a virus while on your server and it even protects you from bit rot (data corruption) and data transmission errors.
In the case of a free software project not using a strong distribution model, the worst possible scenario is the compromise of the server the project uses for distribution. However, in the case of a free software project which is using a strong distribution model, the worst possible case is that the key pair is compromised. Theoretically, if the secret key is not stored on the server which is used for distribution and that server is compromised, it should have only mildly significant security consequences for the project. The integrity of the software being distributed would be preserved. On the other hand, compromise of the secret key has dire consequences which are detailed in section 1.6.
The best thing about embracing a strong distribution model is that it gives you security that is independent of the security of your server. Since the security of a strong distribution model rests in the security of the key pair rather than the security of the server, it's no longer necessary to rely on a centralized model of distribution. This means that the free software community can begin to take advantage of peer to peer (p2p) technology to distribute its software. The use of p2p technology can drastically reduce the resources needed for a large software project. The technology can therefore reduce the barrier to entry in areas such as Linux distributions or heavily used and frequently downloaded applications.
The user benefits not only from the protection afforded by cryptography against crackers, but also from the protections afforded against unscrupulous distributors. Cryptographic signatures provide some degree of nonrepudiation, or the inability to deny later that the distributor authored the contents of an archive. If a software distributor using a strong distribution model distributed a software archive which was infected with a virus or which contained questionable code, it would be impossible for the distributor to claim that infection took place on the client system or that the questionable code was not official code without also claiming that their key pair was compromised.
In the event of a compromise of a client system which your software is running on, the user must reinstall all the software on the system from read only media, or re-download the software from the internet. In the case of free software, the user may not have read only media. If you are using the strong distribution model, the user may be spared the need to re-download all the software if they lack read only media copies. The user would just be able to reinstall their openPGP package and basic operating system, then install your software from archives which were cryptographically protected from modification.
The compromise of a strong distribution system takes place when the key pair used by the distributor is compromised. Most importantly, if a cracker gains access to your secret key, he can digitally sign software archives and messages which will appear to have come from you directly. Since digital signatures made by the cracker will be indistinguishable from the digital signatures made by the people with legitimate access to your project's secret key, it will be impossible to tell which archives are valid and which are false. Secondarily, the cracker will also be able to decrypt all messages encrypted with the public key of the key pair, including both past and present messages.
The best way to protect yourself from such compromises is to use great caution in the protection of your secret key. The best practice is to store your secret key on a floppy disk, to store a backup of your secret key and revocation certificate in a secure place, to choose an excellent passphrase, and to allow only one individual access to the secret key. This individual can function as the project's release master. To some extent, you can also limit the damage from such a compromise by using key pairs with reasonable expiration times and by using a signature only key. The use of short expiration times limits the number of releases that a key pair compromise would involve. Generating a key pair without an encryption subkey ensures that a compromise of the release signature key pair does not also compromise encrypted communications.
Minor attacks which do not result in full compromises are also possible. These attacks would most likely take the form of false key pair introduction usually with the end goal of the release of trojan horses or the interception of encrypted communications. For example, an attacker could create a fake public key for your project and use that fake key to sign a trojan horse which could then be distributed. The same type of attack could be made against developer keys.
The best way to protect yourself from this type of attack is to have your key pair deeply integrated into the web of trust. Web of trust integration is discussed in section 2.3.
One final type of attack worthy of mention would be the circulation of a compromised revocation certificate. This would have the effect of disabling your key pair. Recovery from this type of attack would involve the generation of a new key pair and the integration of that key pair into the web of trust.
An interesting ramification of being able to certify that an archive, or piece of software, has certain contents is that you can make that certification and subsequently make assertions about those contents. For example, it now becomes possible for a distributor to make the claim that a given package of software has been reviewed and is valid and safe to run. The distributor can sign the archive with their openPGP key to produce a new version of the archive which differs from the original only in that it has been signed with the distributor's key. An unlimited number of signatures can be made on a single software archive. Therefore, the distributor can use digital signatures to identify software packages as officially being part of their distribution, and perhaps as supported versions of software.
As a real world example of where third party testimonials would be useful, take the case of the Free Software P2P Network. The FSPN can digitally sign software archives which are distributed through the network. These digital signatures can provide significant additional security for end users of the FSPN and can even allow the achievement of the level of security necessary to make the FSPN possible in the absence of the adoption of the Strong Distribution Model by a large number of free software projects.
As another example of where third party testimonials would be useful, take a value added model of software distribution. Company X wants to make money by distributing GPL'd software in a Linux distribution. Company X can perform a security review on the software and, if it finds no security flaws in the archive, digitally sign the archive with an openPGP key which is used to assert that a given software archive is free from security flaws. Company X can then provide some form of warranty on their signed archives. Company X now has a value added Linux distribution.
With third party testimonials, companies can charge an audit fee for their auditing and subsequent signature of a software package. This model in effect allows scarcity to be reintroduced into the market place of free software. The identity can only be asserted, and the assertion about the software archive can only be made, by the holders of the given key pair. Therefore, the reputation of the key pair holder can be developed and can be made valuable. While a diff could be run in the case of GPL'd software, this may not necessarily be the case with alternative licensing models.
This chapter covers the steps necessary to establish a strong distribution model for your own free software projects. In this section, we make use of the GnuPG openPGP implementation on the GNU/Linux platform. These steps should be easily transferable to other platforms.
The first step in the implementation of a strong distribution model is the generation of a key pair. If you're currently working with a key pair which was generated before openPGP version 4, you should generate a new openPGP key pair with openPGP software that uses the version 4 format. The version 4 format has many improvements in design which make it more resistant to attacks and more expandable for long term use. In the case of GnuPG, you should be using at least version 1.0.6 in order to have protection from all known security flaws. I strongly suggest that you revoke any version 2 or version 3 public keys and replace them with version 4 keys unless you have good reason not to do so.
Please note that GnuPG does allow you to generate a key pair in which the public key of the pair contains only a public key without the standard additional encryption subkey. This type of key is, in effect, a signature only key. Some developers choose to generate signature only keys because, while the compromise of a signature key may compromise the integrity of the web of trust and the validity of signatures made with the key, it will only indirectly result in the compromise of the secrecy of data. On the other hand, upon the compromise of an encryption subkey, the secrecy of the data encrypted with that subkey is compromised, as well as any future data protected by that subkey. If you have an interest in understanding this further, I suggest you read the openPGP standards document, RFC2440. For most users, the generation of a key pair with an associated subkey is adequate and not cause for concern.
Due to the way in which keys can be linked in a web of trust, it is possible to generate multiple key pairs for a project. This makes it possible to generate a signature only key pair and an encryption key pair for the same project and link them together through digital key signatures. This also allows the generation of additional key pairs for sub projects.
With GnuPG, a key pair can be generated with the following command.
bash$ gpg --gen-key
I'd suggest that you generate at least a 1024 bit key pair. Depending on the scope and direction of your project you may want to set an expiration date on the key of between one and ten years. I personally use three year expiration dates on my software publication keys. The user id on project keys is usually that of a security officer or a certificate authority (for example: email@example.com or firstname.lastname@example.org). A step by step guide to the generation of a key pair with gpg, which includes a more in depth explanation of the steps, is available in section 3.5 of the GnuPG Keysigning Party HOWTO.
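For projects that prefer a repeatable setup over the interactive prompts, GnuPG also supports unattended key generation from a parameter file. The following is a minimal sketch that only writes such a file for a signature only key. The helper name write_keyparams, the user id, the 2048 bit length and the three year expiry are assumptions made for illustration; adjust them to your project.

```shell
#!/bin/sh
# write_keyparams FILE NAME EMAIL
# Write a GnuPG unattended key generation parameter file describing
# a signature only key (no Subkey-Type line, so no encryption subkey).
write_keyparams() {
    cat > "$1" <<EOF
Key-Type: RSA
Key-Length: 2048
Key-Usage: sign
Name-Real: $2
Name-Email: $3
Expire-Date: 3y
%commit
EOF
}

params=$(mktemp)
write_keyparams "$params" "Example Project Release Key" "security@example.org"
cat "$params"
rm -f "$params"

# To create the key from a parameter file written this way:
# gpg --batch --gen-key FILE
```

How the passphrase is requested in batch mode differs between GnuPG versions, so check the documentation of your installed version before running the final command.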
In the event of a compromise of your openPGP key pair by an attacker, you'll want to revoke, or invalidate, the key pair to let the public know that it is no longer trustworthy. The mechanism that allows this in openPGP is called a revocation certificate. It is recommended that after you generate your key pair, you generate a generic revocation certificate for that key pair. The generation and storage of a revocation certificate will allow you to revoke your public key even in the event that you lose access to your private key due to compromise, seizure, forgotten passphrase, or media failure. Store the revocation certificate in a secure and safe place. You should also print out a copy of your revocation certificate so that it can be entered and used in the event of the failure of the media the revocation certificate is stored on.
If your revocation certificate is compromised, the individual who compromises it will be able to circulate the certificate, thereby disabling your key. However, that individual will not be able to compromise your secret key through access to your revocation certificate. Therefore, they will not be able to generate fake signatures, decrypt messages encrypted with your key pair, or otherwise misrepresent themselves as the owner of your key pair. Since the only negative outcome possible from the compromise of a revocation certificate is the disabling of your key pair, generating one in advance is generally a safe and good thing to do. For more information on revocation certificates, and instructions on how to generate one, see section 2.8 of this document.
If you have chosen to assign an expiration date to your key pair, you should generate a replacement key pair before your key pair expires. You should then use your older key pair to form a trust link with your new key pair before it expires to provide a sense of continuity to your keys.
It is very important to make your public key available and readily accessible. If a user cannot gain access to your public key, they cannot verify your signatures. If the end user does not verify your signatures, the benefits of the strong distribution model are diminished significantly.
There are three steps that I recommend that you take in order to circulate your public key. First, you should post your public key on the website where the software is distributed from. You should place the ASCII armored public key in a conspicuous place where people can easily find and download it. Second, you should post your public key to the various networks of public keyservers. A list of keyserver networks, and keyservers, can be found at the end of this document in section 4.1. Finally, when the signatures on your key change, be sure to update the copies of your key on your web site and the keyserver networks. Keyservers within the same network (CryptNET, keyserver.net, etc.) will automatically synchronize the various copies of your key. You'll only need to send the updated copy of your key to one node on each network. Work on achieving cross synchronization between keyserver networks is being done.
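The first two steps can be scripted. The following is a minimal sketch, assuming GnuPG is installed and the key is already in your keyring. The function name publish_key and the output file name are hypothetical, and the keyserver host is whatever node you choose from your preferred network.

```shell
#!/bin/sh
# publish_key KEY_ID KEYSERVER
# Export an ASCII armored copy of the public key for the project web
# site, then upload the same key to one node of a keyserver network.
publish_key() {
    if [ "$#" -ne 2 ]; then
        echo "usage: publish_key KEY_ID KEYSERVER" >&2
        return 64
    fi
    # Armored copy, written to KEY_ID.asc in the current directory.
    gpg --armor --output "$1.asc" --export "$1" || return 1
    # Upload to the chosen keyserver.
    gpg --keyserver "$2" --send-keys "$1"
}
```

Running the function again after new signatures have been added to the key refreshes both copies, which covers the third step. GnuPG asks before overwriting an existing output file.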
You may choose to distribute your keys by linking to them on the keyserver. This will allow your signature information to be displayed by the keyserver and alleviate the need to re-upload your key to your web site when you have additional signatures added to the key. As an example of this, here are the PGP keys for CryptNET and myself.
I do not recommend that you include your public key inside your software archive. While there are no technical security problems with this, it does encourage the end user to accept the public key on the basis of trust in the location of the key rather than the integrity imparted upon the key by signatures. Encouraging such habits in end users will make them more susceptible to trojan horse attacks against the Strong Distribution Model in which fake archives and fake keys are distributed.
If your public key is not integrated into the web of trust it is of significantly less security value than a public key which is integrated into the web. The openPGP keys can be used to verify that the archive has not been modified since the owner of the private key signed it. But, what assurance is there that the owner of the private key is actually the primary developer of the software package? The web of trust provides an answer to this question. In the scope of the key pair used to sign the software archives in a strong distribution model, the web of trust is a collection of statements about the validity of the key pair the maintainer of the software package provides. Most often, these statements about validity are made by the developers of the software package.
Often software distributors see fit to rely on central distribution to guarantee the integrity of their published public key. This is a mistake, not only because it allows easier compromise of their distribution model, but also because it ties the assurance of the validity of the key pair to the existence of a centralized distribution point for the public key of that pair. If software is distributed over a p2p network, such a centralized point may not exist.
In the case of a software distribution key, practicality must be taken into consideration. Unlike a personal key on which it is in most cases advantageous to collect as many signatures as possible, a distribution key only really needs signatures from web of trust integrated developer keys. When presented with a public key for a software distribution, the question which should be asked is "Is this the key the developers have issued to secure their releases?". This question can be answered by looking for the presence of developer signatures on the key. Obviously, the question of the validity of the developer signatures, should they exist, is the next logical step.
Aside from the cryptographic validity of the developer signature, the validity of a developer signature must be determined in two ways. In looking at the validity of the developer signature, what we are really looking at is the relationship between the developer key and the project key, and the strength of the developer key. The relationship between the project key and the developer key should be reciprocal: the project key should be signed by the developer, and the developer key should be signed by the project key. The strength of the developer key can be judged by the presence of other signatures on that key. The key for CryptNET can be examined as an example of what the web of trust for a Strong Distribution Project key should look like. The cryptnet.net key has reciprocal links with the keys of its developers. The developer's key in the case of CryptNET is strongly integrated into the web of trust, as you can see from this picture of the reciprocal trust links with that key and the overall web of trust integration of that key.
In order to link the keys of the developers on the project into the web of trust you must first sign them with the project's key. After you have performed that signature operation to complete the reciprocal trust path between the project key and the key of the developer, the developer's key should be linked into the normal web of trust. In order to do this, a keysigning party should be held. Instructions on how to conduct a keysigning party are provided in the Keysigning Party HOWTO. Additional information about the web of trust can be found in that document.
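The signature operation on a developer's key can be sketched as follows, assuming the project's secret key and the developer's public key are both in the local keyring. The function name sign_with_project_key is hypothetical. Verify the developer's key fingerprint with the developer in person before signing.

```shell
#!/bin/sh
# sign_with_project_key PROJECT_KEY_ID DEVELOPER_KEY_ID
# Certify the developer's public key using the project's secret key.
sign_with_project_key() {
    if [ "$#" -ne 2 ]; then
        echo "usage: sign_with_project_key PROJECT_KEY_ID DEVELOPER_KEY_ID" >&2
        return 64
    fi
    # Show the fingerprint so it can be compared with the one the
    # developer supplied before any signature is made.
    gpg --fingerprint "$2" || return 1
    # --default-key selects the secret key that makes the signature;
    # --sign-key asks for confirmation before signing the named key.
    gpg --default-key "$1" --sign-key "$2"
}
```

The developer completes the reciprocal link by running the equivalent command with their own key as the signing key and the project key as the key being signed.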
This section consists of some naming and operating conventions I suggest to you. These are not yet standardized and there is no imperative need to follow them, but I highly suggest that you do. If we can agree on common standards for naming and distribution, we can make things much easier for distributors, packagers, maintainers, programmers, and especially the end users. These standards were taken primarily from the practices of the GnuPG developers and the Linux Kernel Developers. Therefore, they have already been deployed and are somewhat established. Due to their use by the Linux Kernel Developers in the Linux Kernel Archive and by the GnuPG developers, these methods should be somewhat familiar to at least some of your product's user base.
The best way to sign a binary file, be it a tarball, an archive, or an iso image, is to generate a detached ASCII armored signature. The use of a detached ASCII armored signature keeps the main binary file from having anything appended to it. The use of ASCII armoring for the detached signature makes it easy for the user to look at the file and determine whether the transmission was most likely complete and unflawed. ASCII armoring also protects the signature from damage during transmission through limited channels which are not safe for raw binary data.
Assuming we're working with archives which conform to the GNU standards, generating a detached ASCII signature would result in an archive, for example, with the name gnu_software-1.0.0.tar.gz and an ASCII armored signature file with the name gnu_software-1.0.0.tar.gz.sig. Both files would then be distributed separately through traditional means such as ftp, http, or p2p transmission. Patch files would be signed in the same way with detached ASCII signatures. The signature of patch files would produce files with the names gnu_software-1.0.0-1.0.1.diff.gz and gnu_software-1.0.0-1.0.1.diff.gz.sig.
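The naming convention above is simple enough to encode in a release script so that signature and patch names never drift from it. A minimal sketch follows; the helper names sig_name and patch_name are hypothetical.

```shell
#!/bin/sh
# Name of the detached signature that accompanies a file.
sig_name() {
    printf '%s.sig\n' "$1"
}

# Name of the patch that takes package $1 from version $2 to version $3.
patch_name() {
    printf '%s-%s-%s.diff.gz\n' "$1" "$2" "$3"
}

archive="gnu_software-1.0.0.tar.gz"
patch=$(patch_name gnu_software 1.0.0 1.0.1)

# Print the signature file names for the archive and for the patch.
sig_name "$archive"
sig_name "$patch"
```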
In order to generate an armored signature as mentioned above, the following command can be used:
bash$ gpg -a -o gnu_software-1.0.0.tar.gz.sig --detach-sig gnu_software-1.0.0.tar.gz
In this command, the -a argument directs ASCII armoring (radix encoding), the -o argument specifies an output file (--output may be used instead), and the --detach-sig argument specifies that the signature should be stored in the output file rather than appended to the archive.
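When a release consists of several files, the same command can be wrapped in a small loop. The following is a minimal sketch using only the options explained above; the function name sign_release and its return codes are hypothetical.

```shell
#!/bin/sh
# sign_release FILE...
# Create FILE.sig, a detached ASCII armored signature, for each FILE.
sign_release() {
    for file in "$@"; do
        if [ ! -f "$file" ]; then
            echo "no such file: $file" >&2
            return 2
        fi
        # Stop rather than replace a signature that already exists.
        if [ -e "$file.sig" ]; then
            echo "refusing to overwrite $file.sig" >&2
            return 3
        fi
        gpg -a -o "$file.sig" --detach-sig "$file" || return 1
    done
}

# Example use:
# sign_release gnu_software-1.0.0.tar.gz gnu_software-1.0.0-1.0.1.diff.gz
```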
In order to verify a signature, you must be able to access the public key of the signer. You can either download the key and import it into your keyring before you attempt verification, or you can rely on your PGP software to download the key when it is needed for verification. There is great advantage to downloading the key by hand from a keyserver which lists signatures on the key because it makes your personal verification and estimation of the validity of the key slightly easier.
The command to verify a detached signature with gpg is:
bash$ gpg --verify gnu_software-1.0.0.tar.gz.sig gnu_software-1.0.0.tar.gz
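Because gpg reports the result of verification through its exit status, the check can be made a precondition for unpacking or installing. The following is a minimal sketch, assuming the signature follows the .sig naming convention described earlier; the function name verify_release is hypothetical.

```shell
#!/bin/sh
# verify_release ARCHIVE
# Succeeds only if ARCHIVE.sig exists and gpg accepts it as a
# signature on ARCHIVE.
verify_release() {
    archive=$1
    sig="$archive.sig"
    if [ ! -f "$archive" ] || [ ! -f "$sig" ]; then
        echo "missing $archive or $sig" >&2
        return 2
    fi
    # gpg --verify exits non-zero when the signature is bad or when
    # the signer's public key is not available.
    gpg --verify "$sig" "$archive"
}

# Example use:
# verify_release gnu_software-1.0.0.tar.gz && tar xzf gnu_software-1.0.0.tar.gz
```

A successful exit status only shows that the signature was made by the key in your keyring. Whether that key really belongs to the project is a separate question, answered by the signatures on the key itself.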
The patch program makes use of the dash ('-') character to represent lines of source code which are to be deleted. This can be problematic for PGP users because openPGP and other PGP implementations make use of the dash ('-') as a meta character. GnuPG provides a special feature for the signature of patches, either inline in an email message or inline in a file. If you use a detached signature to sign your patch file, you do not need to use the --not-dash-escaped feature of GnuPG.
In order to sign a message with an inline patch and take advantage of GnuPG's special feature, the following command can be used:
bash$ gpg --clearsign --not-dash-escaped
The following command can be used for verification:
bash$ gpg --verify
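The commands above read from standard input. When the patch is already in a file, the file name can be passed as an argument, in which case GnuPG writes the signed copy next to it with an .asc suffix. A minimal sketch follows; the function name clearsign_patch is hypothetical.

```shell
#!/bin/sh
# clearsign_patch FILE
# Write FILE.asc, a cleartext signed copy of FILE in which lines
# beginning with a dash are left exactly as they are.
clearsign_patch() {
    if [ ! -f "$1" ]; then
        echo "no such file: $1" >&2
        return 2
    fi
    gpg --clearsign --not-dash-escaped "$1"
}

# The signed copy can then be checked with:
# gpg --verify FILE.asc
```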
The worst thing that can possibly happen in a strong distribution model is the compromise of the private key used to sign software releases. If a cracker gets control of your private key, the cracker is then able to issue software releases which can be verified as authentic with your public key. If a compromise does happen, here's how to deal with it: revoke your public key.
In the event that you suspect that your secret key has been compromised, you should revoke your public key immediately. Key revocation takes place by the addition of a Revocation Signature to a public key. The revocation of a key suggests that a key is no longer valid (secure) and should not be used. Once a revocation certificate is issued, it cannot be withdrawn.
Since your openPGP key is circulated to people rather than fetched from a central point every time it is accessed, you must circulate your revocation certificate in the same manner that you distributed your public key. This usually means uploading the revocation certificate to the keyserver networks and posting it in a conspicuous place on your website. You should continue to make your revoked public key available from your web site long after you have stopped using it. This ensures that no one mistakenly believes that your secret key is still valid and still producing valid signatures. Best practice would be to keep links to all of the public keys your project has ever used on your website, including expired and revoked public keys.
In review, the gpg command to generate a revocation certificate is:
bash$ gpg --output revcert.asc --gen-revoke <key_id>
If you have an idea about how or when your key was compromised and you generated a revocation certificate during key generation, you will still likely want to generate a new revocation certificate to revoke your key pair. This is the case because the openPGP standard allows you to specify the reason why you are revoking your key pair and even provide some free text comments about the reason for revocation. The circulation of a revocation certificate with this type of information is likely advantageous and preferable to the circulation of a generic certificate created during key generation.
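Once a revocation certificate exists, circulating it amounts to importing it into your own keyring and then publishing the revoked key. The following is a minimal sketch; the function name circulate_revocation and the output file name are hypothetical, and the keyserver is whichever node you used to publish the key originally. Importing the certificate takes effect immediately, so run this only when you really intend to revoke the key.

```shell
#!/bin/sh
# circulate_revocation CERT_FILE KEY_ID KEYSERVER
circulate_revocation() {
    if [ "$#" -ne 3 ] || [ ! -f "$1" ]; then
        echo "usage: circulate_revocation CERT_FILE KEY_ID KEYSERVER" >&2
        return 64
    fi
    # Importing the certificate marks the key as revoked locally.
    gpg --import "$1" || return 1
    # Sending the key afterwards publishes the revocation with it.
    gpg --keyserver "$3" --send-keys "$2" || return 1
    # Export an armored copy of the revoked key for the web site.
    gpg --armor --output "$2.revoked.asc" --export "$2"
}

# Example use:
# circulate_revocation revcert.asc <key_id> <keyserver>
```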
This chapter provides information related to openPGP signature integration with the various package formats. If you would like to submit information about using PGP with a packaging format not listed here, please submit it to the maintainer of this HOWTO for inclusion.
If you have difficulty working with the integrated openPGP signatures of a packaging format, or if the packaging format of your choice does not support openPGP signature integration, you can still use the strong distribution model for your project. Simply perform the openPGP signature and verification operations as if your packaged software were a tarball.
The following entries need to be added to the rpmrc file:
signature
pgp_name
pgp_path
To sign a package at build time:
bash$ rpm --sign
To replace the signature on an already signed package:
bash$ rpm --resign
To sign an already built package:
bash$ rpm --addsign
For more information, the RPM Documentation Project rpmbook has a section on the PGP signing of RPM packages.
To verify the signature on a package:
bash$ rpm -K package.rpm
For more information, the RPM Documentation Project rpmbook has a section on the verification of RPM packages.
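When many packages have been downloaded at once, the same check can be run over all of them. The following is a minimal sketch built on the rpm -K command shown above; the function name check_packages is hypothetical.

```shell
#!/bin/sh
# check_packages FILE...
# Run rpm -K on each package and report the ones that fail.
# Returns success only if every package passed.
check_packages() {
    failed=0
    for pkg in "$@"; do
        if ! rpm -K "$pkg" >/dev/null 2>&1; then
            echo "FAILED: $pkg" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}

# Example use:
# check_packages *.rpm && echo "all packages passed"
```

Depending on the RPM version, a package that carries no signature at all may still pass on its digests alone, so it is worth reading the output of rpm -K for at least one package to confirm that a signature is actually being checked.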
Applied Cryptography Second Edition, Bruce Schneier [ISBN: 0-471-11709-9]
Key: One or more bits of data used in the encryption or decryption process.
Key Fingerprint: In PGP, a value used to identify a key which is generated by performing a hash of key material.
Key Pair: In public key cryptography, a pair of keys consisting of a public and a private, or secret, key which interrelate.
Keyring: A collection of keys. Most often this term is used in relation to PGP, where a keyring consists of a collection of one or more key packets.
Keyserver: A system which stores key material in a database. These servers may be queried by users who wish to acquire the public key of a recipient they have not had prior contact with.
Keysigning Party: A get-together of people who use the PGP encryption system with the purpose of allowing those people to sign each other's public keys. Keysigning parties serve to extend the web of trust.
openPGP: An open standard which defines a version of the PGP security system.
PGP (Pretty Good Privacy): Privacy software developed by Phil Zimmermann, which includes public key cryptography, a standard packet and key format, and symmetric encryption as well.
Public Key: In public key cryptography, the key of a key pair which is shared.
Public Keyring: A keyring consisting of public keys. This term is most often used in relation to PGP.
Secret Key: In public key cryptography, the key of a key pair which is kept secure.
Secret Keyring: A collection of secret keys. Most often this term is used in relation to PGP, where it defines a collection of secret key packets.
Trust Path: A route by which trust is extended from one entity to another. In PGP, this is a link of trust between two public keys.
Web of Trust: The collection of signatures upon keys and resultant trust paths in a user centric trust model which provide for authentication. Collectively, the trust relationships between a group of keys.