head 1.1; branch 1.1.1; access ; symbols MAXIMUM_RPM_1_0:1.1.1.1 VENDOR:1.1.1; locks ; strict; comment @# @; 1.1 date 2001.08.28.12.07.09; author rse; state Exp; branches 1.1.1.1; next ; 1.1.1.1 date 2001.08.28.12.07.09; author rse; state Exp; branches ; next ; desc @@ 1.1 log @Initial revision @ text @

15. Adding Dependency Information to a Package

Ever since the very first version of RPM hit the streets, one side effect of RPM's ease of use has been that it makes it easier for people to break things. Because RPM made it so simple to erase packages, it became common for people to joyfully erase packages until something broke.

Usually this only bit people once, but even once was too much of a hassle if it could be prevented. With this in mind, the RPM developers gave RPM the ability to:

Build packages that contain information on the capabilities they require.
Build packages that contain information on the capabilities they provide.
Store this ``provides'' and ``requires'' information in the RPM database.

In addition, they made sure RPM was able to display dependency information, as well as to warn users if they were attempting to do something that would break a package's dependency requirements.

With these features in place, it became more difficult for someone to unknowingly erase a package and wreak havoc on their system.

15.1 An Overview of Dependencies

We've already alluded to the underlying concept for RPM's dependency processing. It is based on two key factors:

Packages advertise what capabilities they provide.
Packages advertise what capabilities they require.

By simply checking these two types of information, many possible problems can be avoided. For example, if a package requires a capability that is not provided by any already-installed package, that package cannot be installed and expected to work properly.

On the other hand, if a package is to be erased, but its capabilities are required by other installed packages, then it cannot be erased without causing other packages to fail.
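
The two checks just described can be sketched in a few lines of shell. This is only an illustration of the idea, not RPM's implementation: the helper names unmet_requires and still_needed are hypothetical, and capabilities are assumed to be plain strings, one per line.

```shell
#!/bin/sh
# Illustration only: a capability is just a string, one per line.
# unmet_requires and still_needed are hypothetical helper names.

# Install check: print each required capability that nothing provides.
#   $1 = capabilities the new package requires
#   $2 = capabilities provided by already-installed packages
unmet_requires() {
    printf '%s\n' "$1" | while IFS= read -r cap; do
        [ -n "$cap" ] || continue
        printf '%s\n' "$2" | grep -Fqx -- "$cap" || printf '%s\n' "$cap"
    done
}

# Erase check: print each capability of the package being erased
# that another installed package still requires.
#   $1 = capabilities the package to erase provides
#   $2 = capabilities required by the other installed packages
still_needed() {
    printf '%s\n' "$1" | while IFS= read -r cap; do
        [ -n "$cap" ] || continue
        if printf '%s\n' "$2" | grep -Fqx -- "$cap"; then
            printf '%s\n' "$cap"
        fi
    done
}

provided='libc.so.5
libm.so.5'

unmet_requires 'libc.so.5
lda' "$provided"                      # prints: lda
still_needed "$provided" 'libc.so.5'  # prints: libc.so.5
```

An empty result from either helper means the operation would not break any dependency under this simplified model.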

As you might imagine, it's not quite that simple. But adding dependency information can be easy. In fact, in most cases, it's automatic!

15.2 Automatic Dependencies

When a package is built by RPM, if any file in the package's %files list is a shared library, the library's ``soname'' is automatically added to the list of capabilities the package provides. The soname is the name used to determine compatibility between different versions of a library.

Note that this is not a filename. In fact, no aspect of RPM's dependency processing is based on filenames. People new to RPM often assume that a failed dependency represents a missing file. This is not the case.

Remember that RPM's dependency processing is based on knowing what capabilities are provided by a package and what capabilities a package requires. We've seen how RPM automatically determines what shared library resources a package provides. But does it automatically determine what shared libraries a package requires?

Yes! RPM does this by running ldd on every executable program in a package's %files list. Since ldd provides a list of the shared libraries each program requires, both halves of the equation are complete -- that is, the packages that make shared libraries available, and the packages that require those shared libraries, are tracked by RPM. RPM can then take that information into account when packages are installed, upgraded, or erased.

15.2.1 The Automatic Dependency Scripts

RPM uses two scripts to handle automatic dependency processing. They reside in /usr/bin and are called find-requires and find-provides. We'll take a look at them in a minute, but first let's look at why there are scripts to do this sort of thing. Wouldn't it be better to have this built into RPM itself?

Actually, creating scripts for this sort of thing is a better idea. The reason? RPM has already been ported to a variety of different operating systems. Determining what shared libraries an executable requires, and the soname of shared libraries, is simple, but the exact steps required vary widely from one operating system to another. Putting this part of RPM into a script makes it easier to port RPM.

Let's take a look at the scripts that are used by RPM under the Linux operating system.

15.2.1.1 `find-requires` -- Automatically Determine Shared Library Requirements

The find-requires script for Linux is quite simple:

#!/bin/sh

# note this works for both a.out and ELF executables

ulimit -c 0

filelist=`xargs -r file | fgrep executable | cut -d: -f1 `

for f in $filelist; do
    ldd $f | awk '/=>/ { print $1 }'
done | sort -u | xargs -r -n 1 basename | sort -u

This script first creates a list of executable files. Then, for each file in the list, ldd determines the file's shared library requirements, producing a list of sonames. Finally, the list of sonames is sanitized by removing duplicates, and removing any paths.
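
To see what the filtering stages of the script do, the same pipeline can be run on canned text instead of live ldd output. The sample lines below are invented for this illustration (real ldd output varies from system to system), and the function names are hypothetical; only the awk, sort, and basename steps are taken from the script.

```shell
#!/bin/sh
# Canned text in the style of ldd output; invented for this illustration.
sample_ldd_output() {
    cat <<'EOF'
    libc.so.5 => /lib/libc.so.5.3.12
    libm.so.5 => /lib/libm.so.5.0.6
    libc.so.5 => /lib/libc.so.5.3.12
EOF
}

# Same filtering as find-requires: keep the name to the left of "=>",
# strip any directory part, and remove duplicates.
extract_sonames() {
    awk '/=>/ { print $1 }' | sort -u | xargs -r -n 1 basename | sort -u
}

sample_ldd_output | extract_sonames
```

The duplicate libc.so.5 line collapses to a single entry, so the output lists each soname once.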

15.2.1.2 `find-provides` -- Automatically Determine Shared Library Sonames

The find-provides script for Linux is a bit more complex, but still pretty straightforward:

#!/bin/bash

# This script reads filenames from STDIN and outputs any relevant
# provides information that needs to be included in the package.

filelist=$(grep "\\.so" | grep -v "^/lib/ld.so" | 
xargs file -L 2>/dev/null | grep "ELF.*shared object" | cut -d: -f1)

for f in $filelist; do
    soname=$(objdump -p $f | awk '/SONAME/ {print $2}')

    if [ "$soname" != "" ]; then
        if [ ! -L $f ]; then
            echo $soname
        fi
    else
        echo ${f##*/}
    fi
done | sort -u

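First, a list of shared libraries is created. Then, for each file on the list, the soname is extracted. A file with no soname is listed under its own filename (with the path removed), and a symbolic link to a library that has a soname is skipped. Finally, duplicates are removed.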
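
The per-file logic of the script can be tried out without a real library by feeding it canned text. In this sketch the sample text only imitates the style of objdump -p output, the paths are made up, and provided_name is a hypothetical helper that mirrors the script's if/else decision.

```shell
#!/bin/sh
# Canned text in the style of "objdump -p"; invented for this illustration.
sample_objdump_output() {
    cat <<'EOF'
Dynamic Section:
  NEEDED      libc.so.5
  SONAME      libm.so.5
EOF
}

# The extraction used by find-provides: second field of the SONAME line.
extract_soname() {
    awk '/SONAME/ { print $2 }'
}

# The script's decision for one file (provided_name is a hypothetical helper):
#   $1 = path, $2 = soname or empty, $3 = "link" when $1 is a symbolic link
provided_name() {
    if [ -n "$2" ]; then
        if [ "${3:-}" != "link" ]; then
            echo "$2"
        fi
    else
        echo "${1##*/}"
    fi
}

soname=$(sample_objdump_output | extract_soname)
provided_name /lib/libm.so.5.0.6 "$soname"   # prints: libm.so.5
provided_name /usr/lib/libplain.so ""        # prints: libplain.so
```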

15.2.2 Automatic Dependencies: An Example

Let's take a widely used program, ls, the directory lister, as an example. On a Red Hat Linux system, ls is part of the fileutils package and is installed in /bin. Let's play the part of RPM during fileutils' package build and run find-requires on /bin/ls. Here's what we'll see:

# find-requires
/bin/ls
<ctrl-d>
libc.so.5
#

The find-requires script returned libc.so.5. Therefore, RPM should add a requirement for libc.so.5 when the fileutils package is built. We can verify that RPM did add ls' requirement for libc.so.5 by using RPM's --requires option to display fileutils' requirements:

# rpm -q --requires fileutils
libc.so.5
#

OK, that's the first half of the equation -- RPM automatically detecting a package's shared library requirements. Now let's look at the second half of the equation -- RPM detecting packages that provide shared libraries. Since the libc package includes, among others, the shared library /lib/libc.so.5.3.12, RPM would obtain its soname. We can simulate this by using find-provides to print out the library's soname:

# find-provides
/lib/libc.so.5.3.12
<ctrl-d>
libc.so.5
#

OK, so /lib/libc.so.5.3.12's soname is libc.so.5. Let's see if the libc package really does ``provide'' the libc.so.5 soname:

# rpm -q --provides libc
libm.so.5
libc.so.5
#

Yes, there it is, along with the soname of another library contained in the package. In this way, RPM can ensure that any package requiring libc.so.5 will have a compatible library available as long as the libc package, which provides libc.so.5, is installed.

In most cases, automatic dependencies are enough to fill the bill. However, there are circumstances when the package builder has to manually add dependency information to a package. Fortunately, RPM's approach to manual dependencies is both simple and flexible.

15.2.3 The `autoreqprov` Tag -- Disable Automatic Dependency Processing

There may be times when RPM's automatic dependency processing is not desired. In these cases, the autoreqprov tag may be used to disable it. This tag takes a yes/no or 0/1 value. For example, to disable automatic dependency processing, the following line may be used:

AutoReqProv: no

15.3 Manual Dependencies

You might have noticed that we've been using the words ``requires'' and ``provides'' to describe the dependency relationships between packages. As it turns out, these are the exact words used in spec files to manually add dependency information. Let's look at the first tag: requires.

15.3.1 The `requires` Tag

We've been deliberately vague when discussing exactly what it is that a package requires. Although we've used the word ``capabilities'', in fact, manual dependency requirements are always represented in terms of packages. For example, if package foo requires that package bar is installed, it's only necessary to add the following line to foo's spec file:

requires: bar

Later, when the foo package is being installed, RPM will consider foo's dependency requirements met if any version of package bar is already installed.

If more than one package is required, they can be added to the requires tag, one after another, separated by commas and/or spaces. So if package foo requires packages bar and baz, the following line will do the trick:

requires: bar, baz

As long as any version of bar and baz is installed, foo's dependencies will be met.

15.3.1.1 Adding Version Requirements

When a package has slightly more stringent needs, it's possible to require certain versions of a package. All that's necessary is to add the desired version number, preceded by one of the following comparison operators:

<: Requires package with a version less than the specified version.
<=: Requires package with a version less than or equal to the specified version.
=: Requires package with a version equal to the specified version.
>=: Requires package with a version equal to or greater than the specified version.
>: Requires package with a version greater than the specified version.

Continuing with our example, let's suppose that the required version of package bar actually needs to be at least 2.7, and that the baz package must be version 2.1 -- no other version will do. Here's what the requires tag line would look like:

requires: bar >= 2.7, baz = 2.1

We can get even more specific and require a particular release of a package:

requires: bar >= 2.7-4, baz = 2.1-1
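
How such a requirement is evaluated can be sketched as follows. This is a simplified model, not RPM's own comparison code: version_satisfies is a hypothetical helper, versions and releases are assumed to be dot-separated integers, and the release is compared only when the requirement names one.

```shell
#!/bin/sh
# version_satisfies HAVE OP WANT  (hypothetical helper, not part of RPM)
# Succeeds when installed version HAVE meets the requirement "OP WANT".
# Simplification: versions and releases are dot-separated integers only.
version_satisfies() {
    awk -v have="$1" -v op="$2" -v want="$3" '
    # Compare two dotted numbers field by field; missing fields count as 0.
    function cmp(a, b,    x, y, na, nb, n, i, p, q) {
        na = split(a, x, /\./)
        nb = split(b, y, /\./)
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            p = (i <= na) ? x[i] + 0 : 0
            q = (i <= nb) ? y[i] + 0 : 0
            if (p < q) return -1
            if (p > q) return 1
        }
        return 0
    }
    BEGIN {
        split(have, h, "-")
        nw = split(want, w, "-")
        c = cmp(h[1], w[1])
        # The release is compared only when the requirement names one.
        if (c == 0 && nw > 1) c = cmp(h[2], w[2])
        ok = (op == "<"  && c <  0) || (op == "<=" && c <= 0) ||
             (op == "="  && c == 0) || (op == ">=" && c >= 0) ||
             (op == ">"  && c >  0)
        if (ok) exit 0
        exit 1
    }'
}

version_satisfies 2.8-1 '>=' 2.7 && echo "bar 2.8-1 satisfies bar >= 2.7"
version_satisfies 2.1-3 '='  2.1-1 || echo "baz 2.1-3 does not satisfy baz = 2.1-1"
```

The helper reports its answer through its exit status, so it can be used directly in shell conditionals.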

15.3.1.2 When Version Numbers Aren't Enough

You might think that with all these features, RPM's dependency processing can handle every conceivable situation. You'd be right, except for the problem of version numbers. RPM needs to be able to determine which version numbers are more recent than others, in order to perform its version comparisons.

It's pretty simple to determine that version 1.5 is older than version 1.6. But what about 2.01 and 2.1? Or 7.6a and 7.6? There's no way for RPM to keep up with all the different version-numbering schemes in use. But there is a solution; two, in fact...
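
The ambiguity is easy to demonstrate. The sketch below compares the same pair of version strings under two plausible readings; neither function is RPM's algorithm, and the names as_fields and as_decimal are hypothetical. Both handle only the two-field versions used in the text.

```shell
#!/bin/sh
# Reading 1: each dot-separated field is an integer, so "01" equals "1".
as_fields() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, /\./); split(b, y, /\./)
        for (i = 1; i <= 2; i++) {
            if (x[i] + 0 < y[i] + 0) { print "older"; exit }
            if (x[i] + 0 > y[i] + 0) { print "newer"; exit }
        }
        print "same"
    }'
}

# Reading 2: the whole string is one decimal number, so 2.01 < 2.1.
as_decimal() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (a + 0 < b + 0) print "older"
        else if (a + 0 > b + 0) print "newer"
        else print "same"
    }'
}

as_fields  2.01 2.1    # prints: same
as_decimal 2.01 2.1    # prints: older
as_fields  7.6a 7.6    # prints: same (the letter is silently ignored)
```

Two reasonable rules give two different answers for 2.01 and 2.1, and neither notices the trailing letter in 7.6a.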

15.3.1.2.1 Solution Number 1: Serial numbers

When RPM can't decipher a package's version number, it's time to pull out the serial tag. This tag is used to help RPM determine version number ordering. Here's a sample serial tag line:

Serial: 42

This line indicates that the package has a serial number of 42. What does the 42 mean? Only that this version of the package is newer than the same package with a serial number of 41, but older than the same package with a serial number of 43. If you think of serial numbers as being nothing more than very simple version numbers, you'll be on the mark.

In order to direct RPM to look at the serial number instead of the version number when doing dependency checking, it's necessary to append an ``S'' to the end of the conditional operator in the requires tag line. So if a package requires package foo to have a serial number equal to 42, the following tag line would be used:

Requires: foo =S 42

If the foo package needs to have a serial number greater than or equal to 42, this line would work:

Requires: foo >=S 42
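
Because serial numbers are plain integers, checking them reduces to an integer comparison. A minimal sketch, assuming a hypothetical helper named serial_satisfies and assuming that every comparison operator can take the trailing ``S'' (the text shows only =S and >=S):

```shell
#!/bin/sh
# serial_satisfies HAVE OP WANT  (hypothetical helper, not part of RPM)
# HAVE and WANT are integer serial numbers; OP is a comparison
# operator with the trailing "S", for example ">=S".
serial_satisfies() {
    case "$2" in
        '<S')  [ "$1" -lt "$3" ] ;;
        '<=S') [ "$1" -le "$3" ] ;;
        '=S')  [ "$1" -eq "$3" ] ;;
        '>=S') [ "$1" -ge "$3" ] ;;
        '>S')  [ "$1" -gt "$3" ] ;;
        *)     echo "unknown operator: $2" >&2; return 2 ;;
    esac
}

serial_satisfies 43 '>=S' 42 && echo "serial 43 satisfies foo >=S 42"
serial_satisfies 41 '=S'  42 || echo "serial 41 does not satisfy foo =S 42"
```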

It might seem that using serial numbers is a lot of extra trouble, and you're right. But there is an alternative:

15.3.1.2.2 Solution Number 2: Just Say No!

If you have the choice between changing the software's version-numbering scheme and using serial numbers in RPM, please consider changing the version-numbering scheme. Chances are, if RPM can't figure it out, most of the people using your software can't, either. But in case you aren't the author of the software you're packaging, and its version-numbering scheme is giving RPM fits, the serial tag can help you out.

15.3.2 The `conflicts` Tag

The conflicts tag is the logical complement to the requires tag. It is used to specify which packages conflict with the current package. RPM will not permit conflicting packages to be installed unless overridden with the --nodeps option.

The conflicts tag has the same format as requires. It accepts a real or virtual package name and can optionally include version and release specifications or a serial number.
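
The conflict check can be pictured as the mirror image of the requires check. The following is a rough sketch by name only (no versions or serial numbers); check_conflicts and its "nodeps" argument are hypothetical stand-ins for RPM's behavior, not its interface.

```shell
#!/bin/sh
# check_conflicts INSTALLED CONFLICTS [nodeps]   (hypothetical helper)
#   INSTALLED = space-separated names of installed packages
#   CONFLICTS = space-separated names from the new package's conflicts tag
# Prints each conflict found and fails, unless "nodeps" is given.
check_conflicts() {
    if [ "${3:-}" = "nodeps" ]; then
        return 0
    fi
    found=0
    for name in $2; do
        for pkg in $1; do
            if [ "$pkg" = "$name" ]; then
                echo "conflicts with installed package $name"
                found=1
            fi
        done
    done
    return "$found"
}

check_conflicts "foo bar" "bar" || echo "install refused"
check_conflicts "foo bar" "bar" nodeps && echo "install allowed"
```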

15.3.3 The `provides` Tag

Now that you've seen how it's possible to require a package using the requires tag, you're probably expecting that you'll need to use the provides tag in every single package. After all, RPM has to get those package names from somewhere, right?

While it is true that RPM needs to have the package names available, the provides tag is normally not required. It would actually be redundant, because the RPM database already contains the name of every package installed. There's no need to duplicate that information.

But wait -- we said earlier that manual dependency requirements are always represented in terms of packages. If RPM doesn't require the package builder to use the provides tag to provide the package name, then what is the provides tag used for?

15.3.3.1 Virtual Packages

Enter the virtual package. A virtual package is nothing more than a name specified with the provides tag. Virtual packages are handy when a package requires a certain capability, and that capability can be provided by any one of several packages. Here's an example:

In order to work properly, sendmail needs a local delivery agent to handle mail delivery. There are a number of different local delivery agents available -- sendmail will work just fine with any of them.

In this case, it doesn't make sense to force the use of a particular local delivery agent; as long as one's installed, sendmail's requirements will have been satisfied. So sendmail's package builder adds the following line to sendmail's spec file:

requires: lda

There is no package with that name available, so sendmail's requirements must be met with a virtual package. The creators of the various local delivery agents indicate that their packages satisfy the requirements of the lda virtual package by adding the following line to their packages' spec files:

provides: lda

(Note that virtual packages may not have version numbers.) Now, when sendmail is installed, as long as there is a package installed that provides the lda virtual package, there will be no problem.
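
The lookup can be sketched as a search over a table of installed packages. Everything in the table below is assumed for illustration: agent-a and agent-b are hypothetical local delivery agents, and the one-line-per-package format is not how the RPM database is stored. The libc line reuses the provides shown earlier.

```shell
#!/bin/sh
# Hypothetical installed-package table: "package: capability capability ..."
installed_db() {
    cat <<'EOF'
agent-a: lda
agent-b: lda
libc: libc.so.5 libm.so.5
EOF
}

# Print the installed packages that satisfy a requirement, either because
# the package has that name or because it lists the name as a capability.
who_satisfies() {
    installed_db | awk -v want="$1" '{
        name = $1; sub(/:$/, "", name)
        if (name == want) { print name; next }
        for (i = 2; i <= NF; i++) if ($i == want) { print name; next }
    }'
}

who_satisfies lda     # prints: agent-a, then agent-b
who_satisfies libc    # prints: libc
```

Either agent alone is enough to satisfy sendmail's requirement; an empty result would mean the requirement is unmet.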

15.4 To Summarize...

RPM's dependency processing is based on tracking the capabilities a package provides, and the capabilities a package requires. A package's requirements can come from two places:

Shared library requirements, automatically determined by RPM.
The requires tag line, manually added to the package's spec file.

These requirements can be viewed by using RPM's --requires query option. The packages that require a specific capability can be listed by using the --whatrequires query option. Both options are fully described in chapter [*].

The capabilities a package provides can come from three places:

Shared library sonames, automatically determined by RPM.
The provides tag line, manually added to the package's spec file.
The package's name (and optionally, version/serial number).

The first two types of information can be viewed by using RPM's --provides query option. The packages that provide a specific capability can be listed by using the --whatprovides query option. Both options are fully described in chapter [*].

The package name and version are not considered capabilities that are explicitly provided. Therefore, if a search using --provides or --whatprovides comes up dry, try simply looking for a package by that name.

As you've probably gathered by now, using manual dependencies requires some level of synchronization between packages. This can be tricky, particularly if you're not responsible for both packages. But RPM's dependency processing can make life easier for your users.

Ralf S. Engelschall 2000-12-15

@ 1.1.1.1 log @Import book 'Maximum RPM' by Ed Bailey, version 1.0 @ text @@