Automated Build System

From Second Life Wiki

Overview

This document describes our automated build system and the services it provides. The intended audience is:

  • Software developers who wish to understand how to use the system and analyze build failures
  • Software developers who wish to extend the system to perform new tasks
  • Software QA people who wish to use the system for automated testing
  • Release Managers who wish to track changes and authoritative builds

Build system scripts

SharedVsLocalBuildscripts.png

Our automated build system operates on a set of shell scripts which basically fall into two categories:

  • Project specific build scripts which define the particular build procedure used for the project;
  • Shared common code which defines our policy and environment and provides services to the project specific build scripts.

Files and Repositories

The project specific build scripts are expected to be in three files, all stored at the top of the Mercurial repository to be built:

  • build.sh contains the project specific build instructions;
  • finalize.sh contains the project specific post-build instructions, to be run when all platform builds have completed;
  • BuildParams contains build parameters or properties, to be set depending on the build name or the repository name.

The shared build scripts and utilities are stored in an hg repository at http://bitbucket.org/cg_linden/buildscripts.

The main portion of the shared code resides in hg/bin/build.sh

The defaults for all the build parameters and properties are stored in hg/BuildParams

Build Tasks

Build tasks follow a naming convention:

PREFIX-role_basename

or

PREFIX-role_basename_variant

Build Types

The PREFIX classifies the build as one of the following types:

Prefix  Description
A-      TeamCity trigger task to kick off L-, M- and W- build tasks
L-      Linux platform builds. This prefix is to be used when building multi-platform items.
M-      Mac platform builds. This prefix is to be used when building multi-platform items.
W-      Windows platform builds. This prefix is to be used when building multi-platform items.

As we move to TeamCity, we will be adding more prefixes and deprecating the use of the variant portion of the name.

Build Roles

The role name usually maps to a user name. That user should usually be the owner of the source repository associated with the build.

Policies governing other uses of role are being devised. One idea is to use the role to designate authoritative component builds. The build scripts can then update the appropriate Product Dashboard entry to ensure that the correct links to the correct source code repositories are being displayed there.

Basename

The basename portion of the build name currently maps to a repository name, but can be made to map to anything with TeamCity.

Variant Name

We currently use the following variants:

Name       Description
_debug     Builds only Debug and RelWithDebInfo
_coverity  Runs Coverity builds

The prefixing model in the BuildParams file makes it easy to define the behavior of additional variants.
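
The naming convention described in this section can be taken apart with plain shell parameter expansion. The following is a minimal sketch, not the code used by the shared scripts: the function name split_task_name and the task name L-someuser_myproject_debug are made up for illustration, and the sketch assumes that the role contains no underscore.

```shell
#!/bin/sh
# Hypothetical helper: split a build task name of the form
# PREFIX-role_basename[_variant] into its components.
split_task_name() {
  task=$1
  prefix=${task%%-*}        # text before the first "-", e.g. "L"
  rest=${task#*-}           # text after the first "-"
  role=${rest%%_*}          # text before the first "_"
  name=${rest#*_}           # basename, possibly followed by _variant
  basename=${name%%_*}      # text before the first "_" in the name
  case "$name" in
    *_*) variant=${name#*_} ;;   # text after the first "_" in the name
    *)   variant= ;;             # no variant present
  esac
}

# Made-up task name, for illustration only.
split_task_name "L-someuser_myproject_debug"
echo "prefix=$prefix role=$role basename=$basename variant=$variant"
```

Because the basename is cut at the first underscore, a repository name containing an underscore would be split into a basename and a variant.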

Build Steps

Developer Entry Point

Developers can run the automated build procedure by hand, provided that two conditions are fulfilled:

  • The repository for the shared build scripts (http://bitbucket.org/cg_linden/buildscripts) is checked out next to the repository being built;
  • The build.sh script at the top of the repository being built starts with the following preamble:

  #!/bin/sh
  
  set -x
  umask 022
  
  # Check to see if we were invoked from the wrapper, if not, re-exec ourselves from there
  if [ "x$arch" = x ]
  then
    top=`hg root`
    if [ -x "$top/../buildscripts/hg/bin/build.sh" ]
    then
      exec "$top/../buildscripts/hg/bin/build.sh" "$top"
    else
      cat <<EOF
  This script, if called in a development environment, requires that the branch
  independent build script repository be checked out next to this repository.
  This repository is located at http://bitbucket.org/cg_linden/buildscripts
  EOF
      exit 1
    fi
  fi

Continuous Build Entry Point

The entry point for the continuous build system is the hg/bin/build.sh from a local checkout on the build agent proper. The expectation is that the hg repository to be built lives in a checkout or clone inside the current working directory, in a subdirectory named latest.

Build Output and Log Location

All the build output and build logs generated by the facilities provided by the shared hg/bin/build.sh end up in a build_log/build-name/ subdirectory next to the directory containing the Mercurial checkout being built. This results in the following filesystem layouts:

Layout for Developer Builds

working dir/
   |_buildscripts/      # clone of http://bitbucket.org/cg_linden/buildscripts
   |_projectA/          # clone of your project A
   |_projectB/          # clone of your project B
   |_projectC/          # clone of your project C
   |_build_log/         # build output
      |_userid@hostname_projectA/
      |  |_logs etc...
      |_userid@hostname_projectB/
      |  |_logs etc...
      |_userid@hostname_projectC/
         |_logs etc...

Layout for Automated Builds

checkout dir/
   |_build task A/
   |  |_latest          # clone of project A
   |  |_build_log/      # build output
   |     |_build task A/
   |        |_logs etc...
   |_build task B/
   |  |_latest          # clone of project B
   |  |_build_log/      # build output
   |     |_build task B/
   |        |_logs etc...
   |_build task C/
      |_latest          # clone of project C
      |_build_log/      # build output
         |_build task C/
            |_logs etc...
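
Both layouts follow the same rule: the log directory is build_log/build-name/ next to the checkout. A minimal sketch of that rule, assuming a hypothetical helper log_dir_for and made-up paths and names:

```shell
#!/bin/sh
# Hypothetical helper: print the log directory for a checkout.
# $1 = path to the checkout being built, $2 = build name
log_dir_for() {
  parent=$(dirname "$1")    # directory that contains the checkout
  printf '%s/build_log/%s\n' "$parent" "$2"
}

# Developer build: the build name has the form userid@hostname_project.
log_dir_for "/work/projectA" "someuser@somehost_projectA"

# Automated build: the checkout is the "latest" subdirectory of the task.
log_dir_for "/checkout/L-someuser_projectA/latest" "L-someuser_projectA"
```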

Initialization

Paths

All paths are set to ensure that the correct version of our version control and build tools can be used by simple unqualified invocation:

  • perl
  • python
  • hg
  • svn
  • gcc/g++/distcc

Important: Always invoke python and perl scripts using "python script" or "perl script". This avoids any problems related to misrepresenting execute permissions within Cygwin or some dumb version control systems.

Version Control

Currently, three version control systems are supported for determining the change set ID and change logs:

  • Mercurial
  • Git
  • Subversion

The version control system used is detected automatically.
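
How the detection is implemented is not described here. One plausible approach, shown purely as an assumption, is to look for the metadata directory at the top of the checkout and set the using_hg, using_git, using_svn and vc_system variables listed under Environment Variables. The function name detect_vc_system is hypothetical.

```shell
#!/bin/sh
# Sketch only: guess the version control system from the metadata entry
# found at the top of a checkout. The detection rule is an assumption,
# not necessarily what the shared build.sh does.
detect_vc_system() {
  using_hg=false
  using_git=false
  using_svn=false
  vc_system=None
  if [ -e "$1/.hg" ]; then
    using_hg=true
    vc_system=hg
  elif [ -e "$1/.git" ]; then
    using_git=true
    vc_system=git
  elif [ -e "$1/.svn" ]; then
    using_svn=true
    vc_system=svn
  fi
}

# Inspect the directory given as first argument, or the current one.
detect_vc_system "${1:-.}"
echo "vc_system=$vc_system"
```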

Code Ticket

The Code Ticket is an integer mapped to a changeset ID with the following properties:

  • It maps one-to-one to a changeset ID, and a service exists to translate one into the other.
  • The integer increases monotonically over time. A higher code ticket number implies that the request for allocation was made later. It is generally safe to assume that higher implies newer.

BuildParams

The BuildParams file contains a set of prefixed name value pairs. The prefix is generally the repository or project name, sometimes decorated with a user name or a variant name.

The generic format of a prefix is:

 [username_]project[_variant]

The prefix components are determined as follows:

  • username is the developer's user id in a developer build, or the username portion of the automated build task name.
  • project is the name of the directory or the repository portion of the automated build task name.
  • variant is defined if project contains the underscore character. In that case project is the part before the first underscore, variant is the part following the first underscore.

The order of preference for selecting the parameter value is:

  • Fully qualified prefixes (user + project + variant)
  • Prefix with a user portion
  • Prefix with a variant portion
  • Any prefix
  • unqualified parameter

Build Parameters end up as exported environment variables and can be referenced as such in build scripts and utilities.
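
The order of preference above can be sketched as a lookup that tries the most specific prefix first. This is an illustration rather than the shared implementation. In particular, the file format used here (one "prefix.name = value" pair per line, or "name = value" for an unqualified parameter) is an assumption, and the function name lookup_param, the parameter name email and all values are made up.

```shell
#!/bin/sh
# Sketch of the documented lookup order. The "prefix.name = value" file
# format is an assumption made for this example.
# Usage: lookup_param FILE USER PROJECT VARIANT NAME
lookup_param() {
  file=$1 user=$2 project=$3 variant=$4 name=$5
  for prefix in \
    "${user}_${project}_${variant}" \
    "${user}_${project}" \
    "${project}_${variant}" \
    "${project}" \
    ""
  do
    key=${prefix:+$prefix.}$name
    # Exact string comparison on the key avoids regex surprises.
    if value=$(awk -F' *= *' -v k="$key" \
         '$1 == k { print $2; found = 1; exit } END { exit !found }' "$file")
    then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1
}

# Made-up parameter file, for illustration only.
params=$(mktemp)
cat > "$params" <<'EOF'
email = nobody@example.com
myproject.email = team@example.com
someuser_myproject.email = someuser@example.com
EOF

lookup_param "$params" someuser  myproject debug email   # user + project entry
lookup_param "$params" otheruser myproject debug email   # project entry
lookup_param "$params" otheruser other     ""    email   # unqualified entry
rm -f "$params"
```

The function returns a non-zero status when no entry matches, so a caller can fall back to a default of its own.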

Environment Variables

Process Control

The following variables are used during the build process:

Name Type Description
all_builds_succeeded bool Set in the finalization step, when all platform builds have finished. It is "true" if all platform builds succeeded, "false" if one or more failed.
arch string Architecture (output of uname - Linux, Darwin or CYGWIN)
branch string Core part of the directory or repository name, usually maps to a branch name concept ("trunk")
build_id string Unique identifier known to the invoking build automation. Used to store a file with a redirect meta tag on S3 that can be linked to by the build automation.
build_log path Path to the main build log file. This file will be scanned for common error strings and uploaded to S3 upon completion of the build.
build_log_dir path Path to the location where build output files should be stored. It is OK to create subdirectories there.
changeset string Changeset id of the current checkout.
here path Relative location of the shared build.sh script.
helpers path Absolute path to the location of the shared build.sh script.
hostname string Fully qualified domain name of the build host.
invoked_by_parabuild bool "true" if the automated build system is parabuild; "false" otherwise (for example if invoked by a developer).
invoked_by_teamcity bool "true" if the automated build system is teamcity; "false" otherwise (for example if invoked by a developer).
repo string Repository of build results on S3. Used to map fairly well to the mercurial repo name, but has drifted away from that usage.
repo_url url Location of the version control source or root of the clone or checkout. Use this as the authoritative source, do not use repo.
revision int Code ticket number retrieved from the code ticket server, given the current change set id in the checkout.
root path Absolute path to the root of the checked out source tree.
succeeded bool Set initially to "true"; should be set to "false" as soon as something in the build fails.
suffix string Variant portion of the build task name. Used to select build parameters and perform custom actions - for example Coverity builds.
using_git bool "true" if the version control system is git; "false" otherwise.
using_hg bool "true" if the version control system is mercurial; "false" otherwise.
using_svn bool "true" if the version control system is subversion; "false" otherwise.
vc_system string Version control system being used. One of: "None", "git", "hg" or "svn".

Build Policies

The following variables control Build Policies. These will set a large set of other build parameters and can be used instead of setting them all individually. The policy variables all default to false.

Name Description (what happens if set to "true")
build_coverity Disable all platforms except CYGWIN and enable coverity for any build whose suffix is _coverity
build_debug_release_separately Disable debug builds for any build whose suffix is not _debug. Disable release builds for any build whose suffix is _debug
public_build Sets S3 uploads to a public S3 location.

BuildParams

Please inspect the source for the description and default values of build parameters.

Service Functions

Name Parameters Description
begin_section name Creates a teamcity service message indicating the beginning of an execution block named name. This makes the logs much more readable and structured, and also allows the progress indicator to use name to indicate the current build step.
end_section name Creates a teamcity service message indicating the end of the execution block named name. This makes the logs much more readable and structured, and also allows the progress indicator to use name to indicate the current build step. Note that this also ends all pending enclosed execution blocks.
export_arch [arch] Maps the given architecture name to a configurable name. If no parameter is passed, $arch is used. This function is mainly used to map our internal platform names ("Linux", "Darwin", "CYGWIN") to different ones.
get_item type path encoding [arch] Retrieve an uploaded build result file from S3 and store it into the file named by path. See the S3 Layout section for a description of all available result values for type.
mangle_changelog changelog Modifies a debian/changelog file to include the last 200 checkin comments and a package version number that includes the code ticket number. Also echoes the names of all the packages built from the debian build dir associated with the debian/changelog file.
record_failure error message Creates a teamcity service message to record a build failure. The error message will be visible in red on the summary log, and the succeeded environment variable will be set to "false".
upload_item type path encoding [asset_urls] [asset_name] Store file at specified path on S3. If asset_urls is specified, append the url to the S3 item in the file specified by asset_urls. If in addition asset_name is specified, prefix the url with asset_name and append the resulting line to the file specified by asset_urls. See the S3 Layout section for a description of all available result values for type.
upload_stub_installers build_dir Locate and upload stub installer files from specified build_dir.
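
A project build.sh might combine these functions roughly as follows. This is a sketch under assumptions: the stand-in definitions at the top exist only so that the example runs outside the build system (the real functions come from the shared build.sh), the section names and file name are made up, and the value passed as the encoding argument is a guess.

```shell
#!/bin/sh
# Stand-ins so the sketch runs outside the build system. They are only
# defined if the real service functions are not already available.
type begin_section  >/dev/null 2>&1 || begin_section()  { echo "BEGIN $1"; }
type end_section    >/dev/null 2>&1 || end_section()    { echo "END $1"; }
type record_failure >/dev/null 2>&1 || record_failure() { echo "FAILURE: $*"; succeeded=false; }
type upload_item    >/dev/null 2>&1 || upload_item()    { echo "UPLOAD $1 $2 ($3)"; }

succeeded=${succeeded:-true}
build_log=${build_log:-$(mktemp)}

begin_section "Compile"
# Placeholder for the project's real build command.
if ! sh -c 'echo compiling' >> "$build_log" 2>&1
then
  record_failure "Compile step failed"
fi
end_section "Compile"

if [ "$succeeded" = "true" ]
then
  begin_section "Upload"
  # Arguments: type, path, encoding. The encoding value is an assumption.
  upload_item installer "myproject-installer.tar.bz2" binary/octet-stream
  end_section "Upload"
fi
```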

Build Execution Proper

Project Build

The build.sh script at the top of the source tree is invoked in a subshell. You can exit it at any time. A non-zero exit code will result in build failure.

Output file descriptors are slightly redefined:

FD Redirect Usage
1 stdout Appears in main log as information message
2 script trace Will get inserted into the set -x script trace
3 stderr Designates important messages that will appear in the TeamCity short log

Any lengthy operations with lots of output should redirect the output to ${build_log}. This log will get uploaded to S3 and linked into the build result summary page. In the case of a failed build, this file will also be scanned for common error messages that can appear on the TeamCity "important messages" log.

The project build is expected to upload any binaries to S3 using the provided upload_item function.
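
A minimal sketch of how a project build.sh might use these descriptors. The command that produces the output is a placeholder, and the first two lines exist only so that the example also runs outside the build system, where descriptor 3 and ${build_log} are normally not set up.

```shell
#!/bin/sh
# Outside the build system fd 3 is normally closed: point it at stderr
# so the sketch still runs. The test happens in a subshell so that a
# failed redirection cannot terminate the script.
( : >&3 ) 2>/dev/null || exec 3>&2
build_log=${build_log:-$(mktemp)}

# fd 1: appears in the main log as an information message.
echo "Starting compile"

# Lengthy, noisy step: send all of its output to the build log.
# The command is a placeholder for the real build command.
sh -c 'echo "lots of compiler output"' >> "$build_log" 2>&1

# fd 3: important message for the short log.
echo "Compile finished, details are in the build log" >&3
```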

Build Log Structure

The build logs end up structured into three sections:

Progress
Minimal output to indicate the current position in the build process.
Log Analysis
Check for stack traces, perform a log scan for common errors and dump log on failure. The log is expected to be the file referenced by ${build_log}
Shell Trace
All shell commands are logged, and the dump is always included at the end in order to help analyze undetected failures.

Finalization

The finalization step is run when the last platform build working on a specific change completes. It is important to realize that this could be on any build agent, regardless of platform. Therefore, any code placed in the finalize.sh portion needs to be able to run correctly on all platforms used by the build.

The following diagram illustrates the process for multi-platform builds:

Concurrent.png
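
As an illustration, a finalize.sh could branch on the all_builds_succeeded variable described under Environment Variables. The sketch below keeps to plain POSIX sh because the step may run on any platform's build agent. The function name finalize_summary and the messages are made up.

```shell
#!/bin/sh
# Sketch of a finalize.sh body. finalize_summary is a hypothetical name.
finalize_summary() {
  if [ "${all_builds_succeeded:-false}" = "true" ]
  then
    echo "All platform builds succeeded for revision ${revision:-unknown}"
  else
    echo "At least one platform build failed for revision ${revision:-unknown}"
    return 1
  fi
}

# Report the outcome without aborting the rest of the finalization.
finalize_summary || true
```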

Uploading Build Results to S3

To upload build results, use the upload_item function described under Service Functions. This function accepts an item type parameter. The following types are available:

Name Description
changes Any kind of file relating to changes, diffs, changesets etc...
debian Debian packages - they will be bundled into a trivial Debian repository on S3 at the end of the build.
docs Documentation, will end up in a platform independent location.
installer Any item intended for download by a user.
log Log files of any kind.
open_source Uploads specified files to a public location on S3.
symbolfile Uploads files to a location accessible by the crash reporter.
tests Any files required for use by automated testing.

The result layout on S3 is as follows:

${S3PUBLIC_URL}/
  |_${S3PREFIX}/
      |_redirect/
      |  |_${build_id}
      |      |_partial.html      <- redirect to the platform specific result file
      |      |_result.html       <- redirect to the result summary index.html file.
      |_binaries/${revision}/
      |  |_symbol files
      |  |_...
      |_repo/${basename}/        <- build task basename
         |_status                <- file containing the current build status
         |                          where good=TRUE broken=FALSE
         |_good/
         |  |_rev                <- token file named after the revision of a good overall build
         |_arch/${arch}/
         |  |_good/rev           <- token file named after the revision of a good platform build
         |_rev/${revision}       <- revision of current build
            |_index.html         <- result summary page with links to all the other items
            |_docs/...           <- automatically generated documentation
            |_bundle/...         <- file produced by hg bundle
            |_arch/${arch}       <- architecture specific results
               |_index.html      <- result summary file
               |_urls            <- token file containing a text list of all publishable result urls
               |_changes/        <- files related to "changes since last release".
               |_installer/...   <- installables
               |_buildparams/... <- shell script to set BuildParams env vars
               |_debian_repo/    <- Debian repository
               |  |_Packages.gz
               |  |_Sources.gz
               |  |_Release
               |  |_package.deb
               | ...
               |_log/...         <- build logs
               |_status/...      <- token file named true/false to indicate build success
               |_stub/...        <- stub installers

Note how the layout is in the form key/value/key/value/key/value/.... This avoids possible name collisions.

The variables in the tree above mean:

Name Description
${S3PUBLIC_URL} S3 base url including the S3 bucket name
${S3PREFIX} prefix to be used inside the bucket: hg/ = source based builds; img/ = package based image builds; pgi/ = image based cluster builds.
${build_id} unique id for the build, used so that the results page can be reached via a UI not aware of this layout.
${basename} build task basename as defined above.
${revision} Code Ticket delivered by the Ticket Server.
${arch} Platform string as returned by "uname".
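
Following the tree above, the URL of a platform specific result item can be composed from these variables. This is a sketch only: the helper name result_item_url and all values are placeholders, and whether upload_item produces exactly this URL is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: compose the URL of a platform specific result item
# by following the documented key/value layout.
# $1 = item directory (installer, log, ...), $2 = file name
result_item_url() {
  # ${VAR%/} drops one trailing slash, so "hg/" and "hg" both work.
  printf '%s/%s/repo/%s/rev/%s/arch/%s/%s/%s\n' \
    "${S3PUBLIC_URL%/}" "${S3PREFIX%/}" \
    "$basename" "$revision" "$arch" "$1" "$2"
}

# Placeholder values, for illustration only.
S3PUBLIC_URL="http://s3.example.com/some-bucket"
S3PREFIX="hg/"
basename="myproject"
revision=12345
arch=Linux

result_item_url installer myproject-installer.tar.bz2
```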

Design Notes / FAQ

This is just a random collection of thoughts and ideas. Hopefully these can explain why the builds are designed the way they are...

Implementation FAQ

Why bash?
Mainly because it is easiest for gluing together the execution of various tools and doing all the pipe plumbing. Also, it doesn't depend on the existence of specific libraries or interpreter versions. It just is there and rarely changes.
Why Cygwin?
Linux build tasks outnumber Windows ones by a huge margin, and Windows .BAT files suck horribly. Yes, writing bash scripts that work well on Cygwin can be a nuisance, but not nearly as bad as dealing with random interpreter versions and library versions, or having to maintain .BAT files.
Why "set -x"?
Errors in the build framework can be hard to track down. I'd rather log too much than too little. The redirect for stderr into a file does clean up the log output immensely, making this bearable.
Why the special file descriptors?
Mainly because I run the build scripts with "set -x", and I don't want to pollute the build log. Since stderr gets used by TeamCity for "important" messages, I needed to restore the capability to output directly onto stderr, hence the file descriptor "3". Simply say echo Important Message >&3.
Why not use TeamCity dependencies for the finalizer?
Mainly legacy. Our older continuous build system made it very complicated to achieve this goal cleanly, so this solution was adopted. It works well.
Why the complication of having the "shared library" code (central build.sh) call the "client code" (branch specific build.sh)?
This is to allow the elimination of the "client code" altogether. As we build more and more projects, the central build.sh can examine the "shape" of the source tree and run a default build instead of requiring a boiler plate build.sh in every branch.

BuildParams FAQ

What do BuildParams contain?
They contain environment variable settings, classified via a prefixing convention such that a build task will pick out the settings useful for that task.
Why is that part of the source tree? Why not simply set environment variables directly?
The build environment is as much part of the source as is the code itself. If you change the build environment, you change the build result, and such a change should be documented, preferably using the same mechanism we use for documenting source code changes.
Why not use a shell script?
Because I want to enforce a declarative style, not a procedural style. If it was a shell script, we'd immediately get people putting in build logic. I'm trying to evolve a data driven build system, and using a non-shell-script file helps accomplish that goal.
Why not use an xml file?
Maybe I should... won't make it look any better, though.
Why not use a simple name value property file?
Different users will want different settings. If everyone edits the same file to change the settings to their preferences, this file will be a constant merge conflict.
Why not use many separate files?
This is actually a good idea and I considered it for a long time. I may still be convinced, but I ended up preferring the prefix based system because it allows for a better defaulting and inheritance model. Yes, one could implement the defaulting and inheritance model by sourcing in many files from various places inside a BuildParams tree in some defined order, but I ended up rejecting this because it would require a developer to look at many files to fully understand the settings. The current model allows maintainers to group and order the settings in a logical manner and highlight the defaulting and inheritance behavior.
Why not use better defaults?
I try, but usage changes faster, and we are prone to experimenting with different methods. That's all good, but we then have to live with some messiness right here. The good news is that it actually documents our experiments and the evolution of our processes.
Why not use the global defaults?
There is a global defaults file. Use it carefully. If you were to put build settings into the global defaults, you risk introducing changes to build results that are not matched up with changes in the source code. These make errors non-reproducible and are therefore to be avoided. The global defaults should only be used for settings that do not actually affect build outcomes, but instead describe those features of our build environment that make builds possible in the first place. This includes things like the list of DISTCC hosts, S3 locations, S3 keys, service urls and such. It should not include changes that actually affect specific builds. Sometimes it's a judgment call - when in doubt, change it in the source tree's BuildParams first.