NeilB: Learning from other/experienced speakers

For this year's London Perl Workshop (LPW) I've suggested that one of the themes could be "first-time speakers", which means we need to do things to encourage and help new people to speak.

I've some ideas for how we might do that, and am hoping others will have ideas too. This post is about one of those ideas.

If you go to conferences or watch the videos of talks, you'll naturally find yourself evaluating the speakers. Some you'll be impressed by, and engaged with their topic, and others, not so much. We're all different, so our sets of speakers will be different, but probably overlapping.

Maybe we can all learn something from those speakers...?

So I've compiled a list of questions to ask good speakers, and have started emailing them out. Once I've got a good number of replies, I'll collate them in some way into one or more blog posts.

These are my questions so far:

  • How long did you spend preparing for the last talk you created/gave?
  • How long have you been giving presentations at conferences / speaking in public?
  • Were you a natural speaker? What were your earliest talks like?
  • How do you come up with topics to talk on?
  • How do you go about preparing a talk?
  • What advice would you give to someone who’s never given a talk at a conference, but would like to give one?
  • What would you suggest for a first talk: lightning talk, 20-minute, or 50-minute?
  • Do you get nervous before speaking?
  • What other speakers do you enjoy listening to, and why?
  • What else should I have asked you?

What other questions should I ask? Who should I ask? Either reply in comments here, or email me: neil at bowers dot com.

Sawyer X: Perl 5 Porters Mailing List Summary: June 22nd-29th

Hey everyone,

Following is the p5p (Perl 5 Porters) mailing list summary for the past week. Enjoy!

June 22nd-29th

News and updates

Steve Hay has created a ticket to monitor all the release blockers for perl 5.22.3, Perl #128491.

Issues

New issues

Resolved issues

Discussion

Yves Orton has changed the sort order of the MANIFEST file.

Yves Orton also introduced his change of the return signature of scalar(%hash) to match 0+keys(%hash), as a continuation of Perl #114576. In short, it didn't work the way you thought it did, and it provided details that weren't helpful to begin with. :)

David Farrell asked about the large number of stat calls he was seeing from perl, and wondered whether they were all necessary or whether it was possible to reduce them. Some options raised: compiling Perl without .pmc support, and reducing the number of directories in @INC. Zefram explained the purpose of continuing to iterate through directories even after not finding a module in them, and added that with the Linux dentry cache, the actual stat calls would be faster than one might expect.

However, at the end, Dave Mitchell showed that the difference in stat-call cost between 2 @INC entries and 8 was indistinguishable.

Unicode provides a Script_Extensions property (scx), which Karl Williamson suggests using as the underlying property for the single-value synonym in Perl (instead of the currently-used Script property, sc).

Darren Duncan asked if it were possible to release perl 5.24.1 with List::Util 1.45, since it now carries a stable implementation of uniq. This will likely not happen because maintenance releases should not include new features, and James E. Keenan quotes the paragraph from perlpolicy.

John Siracusa reports some confusion around eval and $@, which seems to be a bug.

Marc Lehmann has been banned from p5p for violating the Standards for Conduct.

NEILB: Thinking of giving your first ever Perl talk?

If you've never given a talk at a Perl conference or workshop, but would like to, the YAPC::NA videos are a good source of inspiration, and of examples of how (not) to give a talk. Watch a bunch of videos and make brief notes; distil these down into guidance for yourself.

Sawyer X: Perl 5 Porters Mailing List Summary: June 16th-21st

Hey everyone,

Following is the p5p (Perl 5 Porters) mailing list summary for the past week. Enjoy!

June 16th-21st

News and updates

If you like Coro and are wondering about its support for current perl, you can view the following patches by Dave Mitchell (mentioned below as well) that allow Coro to work correctly on perl. Whether they will be applied in Coro is a different question.

Karl Williamson updates that Unicode 9.0 Emoji is now available for adoption.

Grant reports

Issues

New issues

Resolved issues

Ivan Pozdeev's patch on removing superfluous -Ilib was merged.

Salvador Fandino's patch to fix a test of PerlIO::encoding was merged.

While Dave Mitchell fixed one warning produced by GCC 6.1, there is a disagreement on another (with an explanation) and comments on others.

Proposed patches

Jim Cromie provides revised patches for Perl #127885 (enhance bench.pl to test same perl under different options/args).

Dan Collins provided patches to test the problem presented in Perl #128420 (changes in regex recursion in 5.25).

Dave Mitchell provided revised patches for Coro which can be merged to restore compatibility with perl.

Discussion

Dave Mitchell comments on how Perl #127774 (Segfault in caller()) was properly fixed as part of the context stack system rework - which, if you're interested in the context stack, you might find interesting as well.

Dave Mitchell updates that, based on the responses in the thread discussing listing build options in perl -V, he made a change to list one option per line.

On the topic of inconsistencies in memory size types in the code, Dave Mitchell notes his preference for standardizing on Size_t and SSize_t. Father Chrysostomos adds that he prefers STRLEN and MEM_SIZE because he views them as clearer. To continue this, Dagfinn Ilmari Mannsåker posted a branch that standardizes some of this: all STRLEN and MEM_SIZE are converted to Size_t, and ssize_t to SSize_t.

Aristotle Pagaltzis provides a good break-down on the comments made in Perl #127684 (operators ||| and &&&) and suggests the ticket be closed.

Dave Mitchell provides comments on Perl #128083 (silent encoding of filenames with UTF8 flag set) and how system works.

Tony Cook provides additional comments on Perl #127663 (safety for -i option).

Andreas Koenig reports some verbose test results due to the new unescaped literal left-brace warning. Karl Williamson delved into the presented cases and provided a pull request to the respective author.

Dave Mitchell reverted the commit that updated Time::HiRes to 1.9735, because it caused tests to hang; the revert stays in place until the issue is resolved.

In relation to Perl #128226 (remove the requirement for null termination on PVs), Dave Mitchell had done a survey of SvPV* in the perl source and of SvPV_* and SvPVX*.

Dave's Free Press: Journal: Module pre-requisites analyser

Dave's Free Press: Journal: CPANdeps

Dave's Free Press: Journal: Perl isn't dieing

Perlgeek.de : Introducing Go Continuous Delivery

Go Continuous Delivery (short GoCD or simply Go) is an open source tool that controls an automated build or deployment process.

It consists of a server component that holds the pipeline configuration, polls source code repositories for changes, schedules and distributes work, collects artifacts, and presents a web interface to visualize and control it all, and offers a mechanism for manual approval of steps. One or more agents can connect to the server, and carry out the actual jobs in the build pipeline.

Pipeline Organization

Every build, deployment or test job that GoCD executes must be part of a pipeline. A pipeline consists of one or more linearly arranged stages. Within a stage, jobs run potentially in parallel, and are individually distributed to agents. Tasks are again executed linearly within a job. The most general task is the execution of an external program. Other tasks include the retrieval of artifacts, or specialized things such as running a Maven build.

Matching of Jobs to Agents

When an agent is idle, it polls the server for work. If the server has jobs to run, it uses two criteria to decide if the agent is fit for carrying out the job: environments and resources.

Each job is part of a pipeline, and a pipeline is part of an environment. On the other hand, each agent is configured to be part of one or more environments. An agent only accepts jobs from pipelines from one of its environments.

Resources are user-defined labels that describe what an agent has to offer, and inside a pipeline configuration, you can specify what resources a job needs. For example, you can define that a job requires the phantomjs resource to test a web application; then only agents to which you assign this resource will execute that job. It is also a good idea to add the operating system and version as resources. In the example above, the agent might have the phantomjs, debian and debian-jessie resources, offering the author of the job some choice of granularity for specifying the required operating system.

Installing the Go Server on Debian

To install the Go server on a Debian or Debian-based operating system, first you have to make sure you can download Debian packages via HTTPS:

$ apt-get install -y apt-transport-https

Then you need to configure the package sources:

$ echo 'deb http://dl.bintray.com/gocd/gocd-deb/ /' > /etc/apt/sources.list.d/gocd.list
$ curl https://bintray.com/user/downloadSubjectPublicKey?username=gocd | apt-key add -

And finally install it:

$ apt-get update && apt-get install -y go-server

When you now point your browser at port 8154 of the go server for HTTPS (ignore the SSL security warnings) or port 8153 for HTTP, you should see the go server's web interface:

To prevent unauthenticated access, create a password file (you need to have the apache2-utils package installed to have the htpasswd command available) on the command line:

$ htpasswd -c -s /etc/go-server-passwd go-admin
New password:
Re-type new password:
Adding password for user go-admin
$ chown go: /etc/go-server-passwd
$ chmod 600 /etc/go-server-passwd

In the go web interface, click on the Admin menu and then "Server Configuration". In the "User Management" section, enter the path /etc/go-server-passwd in the field "Password File Path" and click on "Save" at the bottom of the form.

Immediately afterwards, the go server asks you for a username and password.

You can also use LDAP or Active Directory for authentication.

Installing a Go Worker on Debian

On one or more servers where you want to execute the automated build and deployment steps, you need to install a go agent, which will connect to the server and poll it for work. On each of these servers, you need to do the same first three steps as when installing the server, to ensure that you can install packages from the go package repository. And then, of course, install the go agent:

$ apt-get install -y apt-transport-https
$ echo 'deb http://dl.bintray.com/gocd/gocd-deb/ /' > /etc/apt/sources.list.d/gocd.list
$ curl https://bintray.com/user/downloadSubjectPublicKey?username=gocd | apt-key add -
$ apt-get update && apt-get install -y go-agent

Then edit the file /etc/default/go-agent. The first line should read

GO_SERVER=127.0.0.1

Change the right-hand side to the hostname or IP address of your go server, and then start the agent:

$ service go-agent start

After a few seconds, the agent has contacted the server, and when you click on the "Agents" menu in the server's web frontend, you should see the agent:

("lara" is the host name of the agent here).

A Word on Environments

Go makes it possible to run agents in specific environments, for example running a go agent on each testing and each production machine, and using the matching of pipelines to agent environments to ensure that, say, an installation step happens on the right machine in the right environment. If you go with this model, you can also use Go to copy the build artifacts to the machines where they are needed.

I chose not to do this, because I didn't want to have to install a go agent on each machine that I want to deploy to. Instead I use Ansible, executed on a Go worker, to control all machines in an environment. This requires managing the SSH keys that Ansible uses, and distributing packages through a Debian repository. But since Debian seems to require a repository anyway to be able to resolve dependencies, this is not much of an extra hurdle.
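
To make that concrete, an ad-hoc Ansible run from the Go worker might look roughly like this (the inventory name, host group and package are just examples; the same apt module arguments appear in the deployment pipelines elsewhere in this series):

$ ansible --sudo --inventory-file=testing web -m apt \
      -a 'name=package-info state=latest update_cache=yes'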

So don't be surprised when the example project here only uses a single environment in Go, which I call Control.

First Contact with Go's XML Configuration

There are two ways to configure your Go server: through the web interface, and through a configuration file in XML. You can also edit the XML config through the web interface.

While the web interface is a good way to explore go's capabilities, it quickly becomes annoying to use due to too much clicking. Using an editor with good XML support gets things done much faster, and it lends itself better to compact explanation, so that's the route I'm going here.

In the Admin menu, the "Config XML" item lets you see and edit the server config. This is what a pristine XML config looks like, with one agent already registered:

<?xml version="1.0" encoding="utf-8"?>
<cruise xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="cruise-config.xsd" schemaVersion="77">
<server artifactsdir="artifacts" commandRepositoryLocation="default" serverId="b2ce4653-b333-4b74-8ee6-8670be479df9">
    <security>
    <passwordFile path="/etc/go-server-passwd" />
    </security>
</server>
<agents>
    <agent hostname="lara" ipaddress="192.168.2.43" uuid="19e70088-927f-49cc-980f-2b1002048e09" />
</agents>
</cruise>

The serverId and the agent data will differ in your installation, even if you followed the same steps.

To create an environment and put the agent in, add the following section somewhere within <cruise>...</cruise>:

<environments>
    <environment name="Control">
    <agents>
        <physical uuid="19e70088-927f-49cc-980f-2b1002048e09" />
    </agents>
    </environment>
</environments>

(The agent UUID must be that of your agent, not of mine).

To give the agent some resources, you can change the <agent .../> tag in the <agents> section to read:

<agent hostname="lara" ipaddress="192.168.2.43" uuid="19e70088-927f-49cc-980f-2b1002048e09">
  <resources>
    <resource>debian-jessie</resource>
    <resource>build</resource>
    <resource>debian-repository</resource>
  </resources>
</agent>

Creating an SSH key

It is convenient for Go to have an SSH key without a passphrase, to be able to clone git repositories via SSH, for example.

To create one, run the following commands on the server:

$ su - go
$ ssh-keygen -t rsa -b 2048 -N '' -f ~/.ssh/id_rsa

And either copy the resulting .ssh directory and the files therein onto each agent into the /var/go directory (and remember to set owner and permissions as they were created originally), or create a new key pair on each agent.
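
A minimal sketch of the copy-to-agent variant, assuming the go user's home directory is /var/go on the agent as well (the hostname and exact commands are placeholders; adjust to your setup):

# run on each agent
$ sudo scp -rp go@go-server:.ssh /var/go/
$ sudo chown -R go: /var/go/.ssh
$ sudo chmod 700 /var/go/.ssh
$ sudo chmod 600 /var/go/.ssh/id_rsa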

Ready to Go

Now that the server and an agent have some basic configuration, they are ready for their first pipeline configuration. Which we'll get to soon :-).


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Dave's Free Press: Journal: YAPC::Europe 2007 report: day 3

Perlgeek.de : Managing State in a Continuous Delivery Pipeline

Continuous Delivery is all nice and fluffy for a stateless application. Installing a new version is a simple task, which just requires installing the new binaries (or sources, in the case of a language that's not compiled), stopping the old instance, and starting a new instance. Bonus points for reversing the order of the last two steps, to avoid downtimes.

But as soon as there is persistent state to consider, things become more complicated.

Here I will consider traditional, relational databases with schemas. You can avoid some of the problems by using a schemaless "noSQL" database, but you don't always have that luxury, and it doesn't solve all of the problems anyway.

Along with the schema changes you have to consider data migrations, but they aren't generally harder to manage than schema changes, so I'm not going to consider them in detail.

Synchronization Between Code and Database Versions

State management is hard because code is usually tied to a version of the database schema. There are several cases where this can cause problems:

  • Database changes are often slower than application updates. If version 1 of your application can only deal with version 1 of the schema, and version 2 of the application can only deal with version 2 of the schema, you have to stop the application in version 1, do the database upgrade, and start up the application only after the database migration has finished.
  • Rollbacks become painful. Typically either a database change or its rollback can lose data, so you cannot easily do an automated release and rollback across these boundaries.

To elaborate on the last point, consider the case where a column is added to a table in the database. In this case the rollback of the change (deleting the column again) loses data. On the other hand, if the original change is to delete a column, that step usually cannot be reversed: you can recreate a column of the same type, but the data is lost. Even if you archive the deleted column data, new rows might have been added to the table in the meantime, and there is no data to restore for these new rows.

Do It In Multiple Steps

There is no tooling that can solve these problems for you. The only practical approach is to collaborate with the application developers, and break up the changes into multiple steps (where necessary).

Suppose your desired change is to drop a column that has a NOT NULL constraint. Simply dropping the column in one step comes with the problems outlined above. In a simple scenario, you might be able to do the following steps instead:

  • Deploy a database change that makes the column nullable (or give it a default value)
  • Wait until you're sure you don't want to roll back to a version where this column is NOT NULL
  • Deploy a new version of the application that doesn't use the column anymore
  • Wait until you're sure you don't want to roll back to a version of your application that uses this column
  • Deploy a database change that drops the column entirely.

In a more complicated scenario, you might first need to deploy a version of your application that can deal with reading NULL values from this column, even if no code writes NULL values yet.

Adding a column to a table works in a similar way:

  • Deploy a database change that adds the new column with a default value (or NULLable)
  • Deploy a version of the application that writes to the new column
  • optionally run some migrations that fill the column for old rows
  • optionally deploy a database change that adds constraints (like NOT NULL) that weren't possible at the start

... with the appropriate waits between the steps.

Prerequisites

If you deploy a single logical database change in several steps, you need to do maybe three or four separate deployments, instead of one big deployment that introduces both code and schema changes at once. That's only practical if the deployments are (at least mostly) automated, and if the organization offers enough continuity that you can actually finish the change process.

If the developers are constantly putting out fires, chances are they never get around to adding that final, desired NOT NULL constraint, and some undiscovered bug will lead to missing information later down the road.

Tooling

Unfortunately, I know of no tooling that supports the inter-related database and application release cycle that I outlined above.

But there are tools that manage schema changes in general. For example sqitch is a rather general framework for managing database changes and rollbacks.
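
For orientation, sqitch's workflow looks roughly like this (project name, change name and connection string are made up; see the sqitch documentation for the real syntax):

$ sqitch init myapp --engine pg
$ sqitch add add_users_table -n 'Add the users table'
# fill in the generated deploy/revert/verify scripts, then:
$ sqitch deploy db:pg://user@dbhost/myapp
$ sqitch revert db:pg://user@dbhost/myapp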

On the lower level, there are tools like pgdiff that compare the old and new schema, and use that to generate DDL statements that bring you from one version to the next. Such automatically generated DDLs can form the basis of the upgrade scripts that sqitch then manages.

Some ORMs also come with frameworks that promise to manage schema migrations for you. Carefully evaluate whether they allow rollbacks without losing data.

No Silver Bullet

There is no single solution that manages all your data migrations automatically for you during your deployments. You have to carefully engineer the application and database changes to decouple them a bit. This is typically more work on the application development side, but it buys you the ability to deploy and roll back without being blocked by database changes.

Tooling is available for some pieces, but typically not for the big picture. Somebody has to keep track of the application and schema versions, or automate that.


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Dave's Free Press: Journal: Devel::CheckLib can now check libraries' contents

Ocean of Awareness: Top-down parsing is guessing

Top-down parsing is guessing. Literally. Bottom-up parsing is looking.

The way you'll often hear that phrased is that top-down parsing is looking, starting at the top, and bottom-up parsing is looking, starting at the bottom. But that is misleading, because the input is at the bottom -- at the top there is nothing to look at. A usable top-down parser must have a bottom-up component, even if that component is just lookahead.

A more generous, but still accurate, way to describe the top-down component of parsers is "prediction". And prediction is, indeed, a very useful component of a parser, when used in combination with other techniques.

Of course, if a parser does nothing but predict, it can predict only one input. Top-down parsing must always be combined with a bottom-up component. This bottom-up component may be as modest as lookahead, but it must be there or else top-down parsing is really not parsing at all.

So why is top-down parsing used so much?

Top-down parsing may be unusable in its pure form, but from one point of view that is irrelevant. Top-down parsing's biggest advantage is that it is highly flexible -- there's no reason to stick to its "pure" form.

A top-down parser can be written as a series of subroutine calls -- a technique called recursive descent. Recursive descent allows you to hook in custom-written bottom-up logic at every top-down choice point, and it is a technique which is completely understandable to programmers with little or no training in parsing theory. When dealing with recursive descent parsers, it is more useful to be a seasoned, far-thinking programmer than it is to be a mathematician. This makes recursive descent very appealing to seasoned, far-thinking programmers, and they are the audience that counts.

Switching techniques

You can even use the flexibility of top-down to switch away from top-down parsing. For example, you could claim that a top-down parser could do anything my own parser (Marpa) could do, because a recursive descent parser can call a Marpa parser.

A less dramatic switchoff, and one that still leaves the parser with a good claim to be basically top-down, is very common. Arithmetic expressions are essential for a computer language. But they are also among the many things top-down parsing cannot handle, even with ordinary lookahead. Even so, most computer languages these days are parsed top-down -- by recursive descent. These recursive descent parsers deal with expressions by temporarily handing control over to a bottom-up operator precedence parser. Neither of these parsers is extremely smart about the hand-over and hand-back -- it is up to the programmer to make sure the two play together nicely. But used with caution, this approach works.

Top-down parsing and language-oriented programming

But what about taking top-down methods into the future of language-oriented programming, extensible languages, and grammars which write grammars? Here we are forced to confront the reality -- that the effectiveness of top-down parsing comes entirely from the foreign elements that are added to it. Starting from a basis of top-down parsing is literally starting with nothing. As I have shown in more detail elsewhere, top-down techniques simply do not have enough horsepower to deal with grammar-driven programming.

Perl 6 grammars are top-down -- PEG with lots of extensions. These extensions include backtracking, backtracking control, a new style of tie-breaking and lots of opportunity for the programmer to intervene and customize everything. But behind it all is a top-down parse engine.

One aspect of Perl 6 grammars might be seen as breaking out of the top-down trap. That trick of switching over to a bottom-up operator precedence parser for expressions, which I mentioned above, is built into Perl 6 and semi-automated. (I say semi-automated because making sure the two parsers "play nice" with each other is not automated -- that's still up to the programmer.)

As far as I know, this semi-automation of expression handling is new with Perl 6 grammars, and it may prove handy for duplicating what is done in recursive descent parsers. But it adds no new technique to those already in use. And features like

  • multiple types of expression, which can be told apart based on their context,
  • n-ary expressions for arbitrary n, and
  • the autogeneration of multiple rules, each allowing a different precedence scheme, for expressions of arbitrary arity and associativity,

all of which are available and in current use in Marpa, are impossible for the technology behind Perl 6 grammars.

I am a fan of the Perl 6 effort. Obviously, I have doubts about one specific set of hopes for Perl 6 grammars. But these hopes have not been central to the Perl 6 effort, and I will be an eager student of the Perl 6 team's work over the coming months.

Comments

To learn more about Marpa, there's the official web site maintained by Ron Savage. I also have a Marpa web site. Comments on this post can be made in Marpa's Google group, or on our IRC channel: #marpa at freenode.net.

Dave's Free Press: Journal: I Love Github

Dave's Free Press: Journal: Palm Treo call db module

Perlgeek.de : Automating Deployments: Building in the Pipeline

The first step of an automated deployment system is always the build. (For software that doesn't need a build step to be tested, the test might come first, but stay with me nonetheless.)

At this point, I assume that there is already a build system in place that produces packages in the desired format, here .deb files. Here I will talk about integrating this build step into a pipeline that automatically polls new versions from a git repository, runs the build, and records the resulting .deb package as a build artifact.

A GoCD Build Pipeline

As mentioned earlier, my tool of choice for controlling the pipeline is Go Continuous Delivery. Once you have it installed, configured, and an agent connected, you can start to create a pipeline.

GoCD lets you build pipelines in its web interface, which is great for exploring the available options. But for a blog entry, it's easier to look at the resulting XML configuration, which you can also enter directly ("Admin" → "Config XML").

So without further ado, here's the first draft:


  <pipelines group="deployment">
    <pipeline name="package-info">
      <materials>
        <git url="https://github.com/moritz/package-info.git" dest="package-info" />
      </materials>
      <stage name="build" cleanWorkingDir="true">
        <jobs>
          <job name="build-deb" timeout="5">
            <tasks>
              <exec command="/bin/bash" workingdir="package-info">
                <arg>-c</arg>
                <arg>debuild -b -us -uc</arg>
              </exec>
            </tasks>
            <artifacts>
              <artifact src="package-info*_*" dest="package-info/" />
            </artifacts>
          </job>
        </jobs>
      </stage>
    </pipeline>
  </pipelines>

The outer-most group is a pipeline group, which has a name. It can be used to make it easier to get an overview of available pipelines, and also to manage permissions. Not very interesting for now.

The second level is the <pipeline> with a name, and it contains a list of materials and one or more stages.

Materials

A material is anything that can trigger a pipeline, and/or provide files that commands in a pipeline can work with. Here the only material is a git repository, which GoCD happily polls for us. When it detects a new commit, it triggers the first stage in the pipeline.

Directory Layout

Each time a job within a stage is run, the go agent (think worker) which runs it prepares a directory in which it makes the materials available. On Linux, this directory defaults to /var/lib/go-agent/pipelines/$pipeline_name. Paths in the GoCD configuration are typically relative to this path.

For example the material definition above contains the attribute dest="package-info", so the absolute path to this git repository is /var/lib/go-agent/pipelines/package-info/package-info. Leaving out the dest="..." works, and gives one less level of directory, but only works for a single material. It is a rather shaky assumption that you won't need a second material, so don't do that.
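
For the pipeline above, that works out to roughly this layout on the agent (a sketch derived from the paths just mentioned):

/var/lib/go-agent/pipelines/package-info/               # working directory of the pipeline run
/var/lib/go-agent/pipelines/package-info/package-info/  # the git material (dest="package-info")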

See the config references for a list of available material types and options. Plugins are available that add further material types.

Stages

All the stages in a pipeline run serially, and each one runs only if the previous stage succeeded. A stage has a name, which is used both in the front end and for fetching artifacts.

In the example above, I gave the stage the attribute cleanWorkingDir="true", which makes GoCD delete files created during the previous build, and discard changes to files under version control. This tends to be a good option to use; otherwise you might unknowingly slide into a situation where a previous build affects the current build, which can be really painful to debug.

Jobs, Tasks and Artifacts

Jobs are potentially executed in parallel within a stage, and have names for the same reasons that stages do.

Inside a job there can be one or more tasks. Tasks are executed serially within a job. I tend to mostly use <exec> tasks (and <fetchartifact>, which I will cover in a later blog post), which invoke system commands. They follow the UNIX convention of treating an exit status of zero as success, and everything else as a failure.

For more complex commands, I create shell or Perl scripts inside a git repository, and add that repository as a material to the pipeline, which makes them available during the build process with no extra effort.

The <exec> task in our example invokes /bin/bash -c 'debuild -b -us -uc'. Which is a case of Cargo Cult Programming, because invoking debuild directly works just as well. Ah well, will revise later.

debuild -b -us -uc builds the Debian package, and is executed inside the git checkout of the source. It produces a .deb file, a .changes file and possibly a few other files with meta data. They are created one level above the git checkout, so in the root directory of the pipeline run.

These are the files that we want to work with later on, so we let GoCD store them in an internal database. That's what the <artifact> element instructs GoCD to do.

Since the name of the generated files depends on the version number of the built Debian package (which comes from the debian/changelog file in the git repo), it's not easy to reference them by name later on. That's where the dest="package-info/" attribute comes into play: it makes GoCD store the artifacts in a directory with a fixed name. Later stages can then retrieve all artifact files from this directory by that fixed name.

The Pipeline in Action

If nothing goes wrong (and nothing ever does, right?), this is roughly what the web interface looks like after running the new pipeline:

So, whenever there is a new commit in the git repository, GoCD happily builds a Debian package and stores it for further use. Automated builds, yay!

But there is a slight snag: It recycles version numbers, which other Debian tools are very unhappy about. In the next blog post, I'll discuss a way to deal with that.


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Dave's Free Press: Journal: Graphing tool

Dave's Free Press: Journal: Travelling in time: the CP2000AN

Dave's Free Press: Journal: XML::Tiny released

Dave's Free Press: Journal: YAPC::Europe 2007 report: day 1

Perlgeek.de : Automating Deployments: Pipeline Templates in GoCD

In the last few blog posts, you've seen the development of a GoCD pipeline for building a package, uploading it into a repository for a testing environment, installing it in that environment, and then repeating the upload and installation cycle for a production environment.

To recap, this is the XML config for GoCD so far:

<pipeline name="package-info">
  <materials>
    <git url="https://github.com/moritz/package-info.git" dest="package-info" materialName="package-info" />
    <git url="https://github.com/moritz/deployment-utils.git" dest="deployment-utils" materialName="deployment-utils" />
  </materials>
  <stage name="build" cleanWorkingDir="true">
    <jobs>
      <job name="build-deb" timeout="5">
        <tasks>
          <exec command="../deployment-utils/debian-autobuild" workingdir="#{package}" />
        </tasks>
        <artifacts>
          <artifact src="version" />
          <artifact src="package-info*_*" dest="package-info/" />
        </artifacts>
      </job>
    </jobs>
  </stage>
  <stage name="upload-testing">
    <jobs>
      <job name="upload-testing">
        <tasks>
          <fetchartifact pipeline="" stage="build" job="build-deb" srcdir="package-info">
            <runif status="passed" />
          </fetchartifact>
          <exec command="/bin/bash">
            <arg>-c</arg>
            <arg>deployment-utils/add-package testing jessie package-info_*.deb</arg>
          </exec>
        </tasks>
        <resources>
          <resource>aptly</resource>
        </resources>
      </job>
    </jobs>
  </stage>
  <stage name="deploy-testing">
    <jobs>
      <job name="deploy-testing">
        <tasks>
          <exec command="ansible" workingdir="deployment-utils/ansible/">
            <arg>--sudo</arg>
            <arg>--inventory-file=testing</arg>
            <arg>web</arg>
            <arg>-m</arg>
            <arg>apt</arg>
            <arg>-a</arg>
            <arg>name=package-info state=latest update_cache=yes</arg>
            <runif status="passed" />
          </exec>
        </tasks>
      </job>
    </jobs>
  </stage>
  <stage name="upload-production">
    <approval type="manual" />
    <jobs>
      <job name="upload-production">
        <tasks>
          <fetchartifact pipeline="" stage="build" job="build-deb" srcdir="package-info">
            <runif status="passed" />
          </fetchartifact>
          <exec command="/bin/bash">
            <arg>-c</arg>
            <arg>deployment-utils/add-package production jessie package-info_*.deb</arg>
          </exec>
        </tasks>
        <resources>
          <resource>aptly</resource>
        </resources>
      </job>
    </jobs>
  </stage>
  <stage name="deploy-production">
    <jobs>
      <job name="deploy-production">
        <tasks>
          <exec command="ansible" workingdir="deployment-utils/ansible/">
            <arg>--sudo</arg>
            <arg>--inventory-file=production</arg>
            <arg>web</arg>
            <arg>-m</arg>
            <arg>apt</arg>
            <arg>-a</arg>
            <arg>name=package-info state=latest update_cache=yes</arg>
            <runif status="passed" />
          </exec>
        </tasks>
      </job>
    </jobs>
  </stage>
</pipeline>

The interesting thing here is that the pipeline isn't very specific to this project. Apart from the package name, the Debian distribution and the group of hosts to deploy to, everything in here can be reused for any software that's Debian packaged.

To make the pipeline more generic, we can define parameters, params for short:

  <params>
    <param name="distribution">jessie</param>
    <param name="package">package-info</param>
    <param name="target">web</param>
  </params>

And then replace all the occurrences of package-info inside the stages definition with #{package}, and so on:

  <stage name="build" cleanWorkingDir="true">
    <jobs>
      <job name="build-deb" timeout="5">
        <tasks>
          <exec command="../deployment-utils/debian-autobuild" workingdir="#{package}" />
        </tasks>
        <artifacts>
          <artifact src="version" />
          <artifact src="#{package}*_*" dest="#{package}/" />
        </artifacts>
      </job>
    </jobs>
  </stage>
  <stage name="upload-testing">
    <jobs>
      <job name="upload-testing">
        <tasks>
          <fetchartifact pipeline="" stage="build" job="build-deb" srcdir="#{package}">
            <runif status="passed" />
          </fetchartifact>
          <exec command="/bin/bash">
            <arg>-c</arg>
            <arg>deployment-utils/add-package testing #{distribution} #{package}_*.deb</arg>
          </exec>
        </tasks>
        <resources>
          <resource>aptly</resource>
        </resources>
      </job>
    </jobs>
  </stage>
  <stage name="deploy-testing">
    <jobs>
      <job name="deploy-testing">
        <tasks>
          <exec command="ansible" workingdir="deployment-utils/ansible/">
            <arg>--sudo</arg>
            <arg>--inventory-file=testing</arg>
            <arg>#{target}</arg>
            <arg>-m</arg>
            <arg>apt</arg>
            <arg>-a</arg>
            <arg>name=#{package} state=latest update_cache=yes</arg>
            <runif status="passed" />
          </exec>
        </tasks>
      </job>
    </jobs>
  </stage>
  <stage name="upload-production">
    <approval type="manual" />
    <jobs>
      <job name="upload-production">
        <tasks>
          <fetchartifact pipeline="" stage="build" job="build-deb" srcdir="#{package}">
            <runif status="passed" />
          </fetchartifact>
          <exec command="/bin/bash">
            <arg>-c</arg>
            <arg>deployment-utils/add-package production #{distribution} #{package}_*.deb</arg>
          </exec>
        </tasks>
        <resources>
          <resource>aptly</resource>
        </resources>
      </job>
    </jobs>
  </stage>
  <stage name="deploy-production">
    <jobs>
      <job name="deploy-production">
        <tasks>
          <exec command="ansible" workingdir="deployment-utils/ansible/">
            <arg>--sudo</arg>
            <arg>--inventory-file=production</arg>
            <arg>#{target}</arg>
            <arg>-m</arg>
            <arg>apt</arg>
            <arg>-a</arg>
            <arg>name=#{package} state=latest update_cache=yes</arg>
            <runif status="passed" />
          </exec>
        </tasks>
      </job>
    </jobs>
  </stage>

The next step towards generalization is to move the stages to a template. This can either be done again by editing the XML config, or in the web frontend with Admin → Pipelines and then clicking the Extract Template link next to the pipeline called package-info.

Either way, the result in the XML looks like this:

<pipelines group="deployment">
  <pipeline name="package-info" template="debian-base">
    <params>
      <param name="distribution">jessie</param>
      <param name="package">package-info</param>
      <param name="target">web</param>
    </params>
    <materials>
      <git url="https://github.com/moritz/package-info.git" dest="package-info" materialName="package-info" />
      <git url="https://github.com/moritz/deployment-utils.git" dest="deployment-utils" materialName="deployment-utils" />
    </materials>
  </pipeline>
</pipelines>
<templates>
  <pipeline name="debian-base">
      <!-- stages definitions go here -->
  </pipeline>
</templates>

Everything that's specific to this one piece of software is now in the pipeline definition, and the reusable parts are in the template. The sole exception is the deployment-utils repo, which must be added as a material to each pipeline that deploys software automatically, since GoCD has no way to move a material into a template.

Adding a deployment pipeline for another piece of software is now just a matter of specifying the URL, package name, target (that is, name of a group in the Ansible inventory file) and distribution. So about a minute of work once you're used to the tooling.


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Dave's Free Press: Journal: Thanks, Yahoo!

Dave's Free Press: Journal: YAPC::Europe 2007 report: day 2

Ocean of Awareness: What are the reasonable computer languages?

"You see things; and you say 'Why?' But I dream things that never were; and I say 'Why not?'" -- George Bernard Shaw

In the 1960's and 1970's computer languages were evolving rapidly. It was not clear which way they were headed. Would most programming be done with general-purpose languages? Or would programmers create a language for every task domain? Or even for every project? And, if lots of languages were going to be created, what kinds of languages would be needed?

It was in that context that Čulik and Cohen, in a 1973 paper, outlined what they thought programmers would want and should have. In keeping with the spirit of the time, it was quite a lot:

  • Programmers would want to extend their grammars with new syntax, including new kinds of expressions.
  • Programmers would also want to use tools that automatically generated new syntax.
  • Programmers would not want to, and especially in the case of auto-generated syntax would usually not be able to, massage the syntax into very restricted forms. Instead, programmers would create grammars and languages which required unlimited lookahead to disambiguate, and they would require parsers which could handle these grammars.
  • Finally, programmers would need to be able to rely on all of this parsing being done in linear time.

Today, we think we know that Čulik and Cohen's vision was naive, because we think we know that parsing technology cannot support it. We think we know that parsing is much harder than they thought.

The eyeball grammars

As a thought problem, consider the "eyeball" class of grammars. The "eyeball" class of grammars contains all the grammars that a human can parse at a glance. If a grammar is in the eyeball class, but a computer cannot parse it, it presents an interesting choice. Either,

  • your computer is not using the strongest practical algorithm; or
  • your mind is using some power which cannot be reduced to a machine computation.

There are some people out there (I am one of them) who don't believe that everything the mind can do reduces to a machine computation. But even those people will tend to go for the choice in this case: There must be some practical computer parsing algorithm which can do at least as well at parsing as a human can do by "eyeball". In other words, the class of "reasonable grammars" should contain the eyeball class.

Čulik and Cohen's candidate for the class of "reasonable grammars" were the grammars that a deterministic parse engine could parse if it had a lookahead that was infinite, but restricted to distinguishing between regular expressions. They called these the LR-regular, or LRR, grammars. And the LRR grammars do in fact seem to be a good first approximation to the eyeball class. They do not allow lookahead that contains things that you have to count, like palindromes. And, while I'd be hard put to eyeball every possible string for every possible regular expression, intuitively the concept of scanning for a regular expression does seem close to capturing the idea of glancing through a text looking for a telltale pattern.

So what happened?

Alas, the algorithm in the Čulik and Cohen paper turned out to be impractical. But in 1991, Joop Leo discovered a way to adapt Earley's algorithm to parse the LRR grammars in linear time, without doing the lookahead. And Leo's algorithm does have a practical implementation: Marpa.

References, comments, etc.

To learn more about Marpa, there's the official web site maintained by Ron Savage. I also have a Marpa web site. Comments on this post can be made in Marpa's Google group, or on our IRC channel: #marpa at freenode.net.

Perlgeek.de : Technology for automating deployments: the agony of choice

As an interlude I'd like to look at alternative technology stacks that you could use in your deployment project. I'm using a certain stack because it makes sense in the context of the bigger environment.

This is a mostly Debian-based infrastructure with its own operations team, and software written in various (mostly dynamic) programming languages.

If your organization writes only Java code (or code in programming languages that are based on the JVM), and your operations folks are used to that, it might be a good idea to ship .jar files instead. Then you need a tool that can deploy them, and a repository to store them.

Package format

I'm a big fan of operating system packages, for three reasons: the operators are familiar with them, they are language agnostic, and configuration management software typically supports them out of the box.

If you develop applications in several different programming languages, say perl, python and ruby, it doesn't make sense to build a deployment pipeline around three different software stacks and educate everybody involved about how to use and debug each of the language-specific package managers. It is much more economical to have the maintainer of each application build a system package, and then use one toolchain to deploy that.

That doesn't necessarily imply building a system package for each upstream package. Fat-packaging is a valid way to avoid an explosion of packaging tasks, and also to avoid clashes when dependencies on conflicting versions of the same package exist. dh-virtualenv works well for bundling python software and all its python dependencies into a single Debian package; only the python interpreter itself needs to be installed on the target machine.

If you need to deploy to multiple operating system families and want to build only one package, nix is an interesting approach, with the additional benefit of allowing parallel installation of several versions of the same package. That can be useful for running two versions in parallel, and only switching over to the new one for good when you're convinced that there are no regressions.

Repository

The choice of package format dictates the repository format. Debian packages are stored in a different structure than Pypi packages, for example. For each repository format there is tooling available to help you create and update the repository.

For Debian, aptly is my go-to solution for repository management. Reprepro seems to be a decent alternative.

Pulp is a rather general and scalable repository management software that was originally written for RPM packages, but now also supports Debian packages, Python (pypi) packages and more. Compared to the other solutions mentioned so far (which are just command line programs that you run when you need something, and that use the file system as storage), it comes with some administrative overhead, because there's at least a MongoDB database and a RabbitMQ message broker required to run it. But when you need such a solution, it's worth it.

A smaller repository management tool for Python is pip2pi. In its simplest form you just copy a few .tar.gz files into a directory, run dir2pi . in that directory, and make it accessible through a web server.
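
A minimal sketch of that workflow (the directory and tarball names are arbitrary):

$ mkdir /srv/pypi
$ cp My-Package-0.1.tar.gz /srv/pypi/
$ cd /srv/pypi && dir2pi .
# then serve /srv/pypi through any web server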

CPAN-Mini is a good and simple tool for providing a CPAN mirror, and CPAN-Mini-Inject lets you add your own Perl modules, in the simplest case through the mcpani script.
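
A rough sketch of how that can look (module name, author ID and file names are placeholders, and mcpani additionally needs a small config file pointing at the local and remote mirrors):

$ minicpan -l /srv/minicpan -r http://www.cpan.org/
$ mcpani --add --module My::Module --authorid LOCAL \
         --modversion 0.01 --file My-Module-0.01.tar.gz
$ mcpani --inject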

Installation

Installing a package and its dependencies often looks easy on the surface, something like apt-get update && apt-get install $package. But that is deceptive, because many installers are interactive by nature, or require special flags to force installation of an older version, or other potential pitfalls.
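
For example, a non-interactive installation of a specific package version with apt might need something along these lines (the package version is a placeholder):

$ DEBIAN_FRONTEND=noninteractive apt-get update
$ DEBIAN_FRONTEND=noninteractive apt-get install -y package-info=0.1-1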

Ansible provides modules for installing .deb packages, python modules, perl, RPM through yum, nix packages and many others. It also requires little up-front configuration on the destination system and is very friendly for beginners, but still offers enough power for more complex deployment tasks. It can also handle configuration management.

An alternative is Rex, with which I have no practical experience.

Not all configuration management systems are good fits for managing deployments. For example Puppet doesn't seem to have a good way to provide an order for package upgrades ("first update the backend on servers bck01 and bck02, and then frontend on www01, and the rest of the backend servers").


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Perlgeek.de : Automating Deployments: New Website, Community

No IT project would be complete without its own website, so my project of writing a book on automated deployments now has one too. Please visit deploybook.com and tell me what you think about it. The plan is to gradually add some content, and maybe also a bit of color. When the book finally comes out (don't hold your breath here, it'll take some more months), it'll also be for sale there.

Quite a few readers have emailed me, sharing their own thoughts on the topic of building and deploying software, and often asking for feedback. There seems to be a need for a place to share these things, so I created one. The deployment community is a discourse forum dedicated to discussing such things. I expect to see a low volume of posts, but valuable ones to those involved.

To conclude the news roundup for this week, I'd like to mention that next week I'll be giving a talk on continuous delivery at the German Perl Workshop 2016, where I'm also one of the organizers. After the workshop is over, slides will be available online, and I'll likely have more time again for blogging. So stay tuned!


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Dave's Free Press: Journal: YAPC::Europe 2007 travel plans

Dave's Free Press: Journal: Wikipedia handheld proxy

Perlgeek.de : Automating Deployments: Stage 2: Uploading

Once you have the pipeline for building a package, it's time to distribute the freshly built package to the machines it's going to be installed on.

I've previously explained the nuts and bolts of getting a Debian package into a repository managed by aptly, so it's time to automate that.

Some Assumptions

We are going to need a separate repository for each environment we want to deploy to (or maybe group of environments; it might be OK and even desirable to share a repository between various testing environments that can be used in parallel, for example for security, performance and functional testing).

At some point in the future, when a new version of the operating system is released, we'll also need to build packages for another major version, so for example for Debian stretch instead of jessie. So it's best to plan for that case. Based on these assumptions, the path to each repository will be $HOME/aptly/$environment/$distribution.
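
With the go user's home directory at /var/go, that results in paths like these (the same ones that appear in the web server configuration further down):

/var/go/aptly/testing/jessie/
/var/go/aptly/production/jessie/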

For the sake of simplicity, I'm going to assume a single host on which both testing and production repositories will be hosted, in separate directories. If you need those repos on separate servers, it's easy to reverse that decision (or make a different one in the first place).

To ease the transport and management of the repository, a GoCD agent should be running on the repo server. It can copy the packages from the GoCD server's artifact repository with built-in commands.

Scripting the Repository Management

It would be possible to manually initialize each repository, and only automate the process of adding a package. But since it's not hard to do, taking the opposite route of creating repositories automatically on the fly is more reliable. The next time you need a new environment or need to support a new distribution, you will benefit from this decision.

So here is a small Perl program that, given an environment, distribution and a package file name, creates the aptly repo if it doesn't exist yet, writes the config file for the repo, and adds the package.

#!/usr/bin/perl
use strict;
use warnings;
use 5.014;
use JSON qw(encode_json);
use File::Path qw(mkpath);
use autodie;

unless ( @ARGV == 3) {
    die "Usage: $0 <environment> <distribution> <.deb file>\n";
}
my ( $env, $distribution, $package ) = @ARGV;

my $base_path   = "$ENV{HOME}/aptly";
my $repo_path   = "$base_path/$env/$distribution";
my $config_file = "$base_path/$env-$distribution.conf";
my @aptly_cmd   = ("aptly", "-config=$config_file");

init_config();
init_repo();
add_package();


sub init_config {
    mkpath $base_path;
    open my $CONF, '>:encoding(UTF-8)', $config_file;
    say $CONF encode_json({
        rootDir       => $repo_path,
        architectures => [qw( i386 amd64 all )],
    });
    close $CONF;
}

sub init_repo {
    return if -d "$repo_path/db";
    mkpath $repo_path;
    system @aptly_cmd, "repo", "create", "-distribution=$distribution", "myrepo";
    system @aptly_cmd, "publish", "repo", "myrepo";
}

sub add_package {
    system @aptly_cmd,  "repo", "add", "myrepo", $package;
    system @aptly_cmd,  "publish", "update", $distribution;
}

As always, I've developed and tested this script interactively, and only started to plug it into the automated pipeline once I was confident that it did what I wanted.

And as all software, it's meant to be under version control, so it's now part of the deployment-utils git repo.
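
The pipeline will later invoke it like this (the same command line appears in the stage definition below):

$ deployment-utils/add-package testing jessie package-info_*.deb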

More Preparations: GPG Key

Before GoCD can upload the Debian packages into a repository, the go agent needs to have a GPG key that's not protected by a passphrase. You can either log into the go system user account and create it there with gpg --gen-key, or copy an existing .gnupg directory over to ~go (don't forget to adjust the ownership of the directory and the files in there).
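
A short sketch of both variants (gpg prompts interactively, so leave the passphrase empty; the source path of the existing keyring is a placeholder):

# variant 1: create a fresh key as the go user
$ sudo su - go
$ gpg --gen-key

# variant 2: copy an existing keyring and fix ownership
$ sudo cp -r ~/.gnupg /var/go/
$ sudo chown -R go: /var/go/.gnupg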

Integrating the Upload into the Pipeline

The first stage of the pipeline builds the Debian package, and records the resulting file as an artifact. The upload step needs to retrieve this artifact with a fetchartifact task. This is the config for the second stage, to be inserted directly after the first one:

  <stage name="upload-testing">
    <jobs>
      <job name="upload-testing">
        <tasks>
          <fetchartifact pipeline="" stage="build" job="build-deb" srcdir="package-info">
            <runif status="passed" />
          </fetchartifact>
          <exec command="/bin/bash">
            <arg>-c</arg>
            <arg>deployment-utils/add-package testing jessie package-info_*.deb</arg>
          </exec>
        </tasks>
        <resources>
          <resource>aptly</resource>
        </resources>
      </job>
    </jobs>
  </stage>

Note that testing here refers to the name of the environment (which you can choose freely, as long as you are consistent), not the testing distribution of the Debian project.

There is an aptly resource, which you must assign to the agent running on the repo server. If you want separate servers for testing and production repositories, you'd come up with a more specific resource name here (for example aptly-testing) and a separate one for the production repository.

Make the Repository Available through HTTP

To make the repository reachable from other servers, it needs to be exposed to the network. The most convenient way is over HTTP. Since only static files need to be served (and a directory index), pretty much any web server will do.

An example config for lighttpd:

dir-listing.encoding = "utf-8"
server.dir-listing   = "enable"
alias.url = ( 
    "/debian/testing/jessie/"    => "/var/go/aptly/testing/jessie/public/",
    "/debian/production/jessie/" => "/var/go/aptly/production/jessie/public/",
    # more repos here
)

And for the Apache HTTP server, once you've configured a virtual host:

Options +Indexes
Alias /debian/testing/jessie/     /var/go/aptly/testing/jessie/public/
Alias /debian/production/jessie/  /var/go/aptly/production/jessie/public/
# more repos here
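On the consuming side, a client machine then needs an apt source entry that points at the published repository, plus the public part of the GPG key the packages are signed with. A sketch of the manual steps, where repo.example.com is a placeholder for your repo server and go-agent.pub.key a placeholder for the exported public key (aptly publishes into the component main by default):

apt-key add go-agent.pub.key
echo 'deb http://repo.example.com/debian/testing/jessie jessie main' \
    > /etc/apt/sources.list.d/testing-jessie.list
apt-get update
apt-get install package-info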

Achievement Unlocked: Automatic Build and Distribution

With these steps done, automatic building and uploading of packages is in place. Since client machines can pull from that repository at will, we can tick off the distribution of packages to the client machines.


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Dave's Free Press: Journal: Bryar security hole

Dave's Free Press: Journal: POD includes

Dave's Free Press: Journal: cgit syntax highlighting

Perlgeek.de : Automating Deployments: Installation in the Pipeline

As mentioned before (perlgeek.de/blog-en/automating-deployments/2016-007-installing-packages.html), my tool of choice for automating package installation is ansible (https://deploybook.com/resources).

The first step is to create an inventory file for ansible. In a real deployment setting, this would contain the hostnames to deploy to. For the sake of this project I just have a test setup consisting of virtual machines managed by vagrant, which leads to a somewhat unusual ansible configuration.

That's the ansible.cfg:

[defaults]
remote_user = vagrant
host_key_checking = False

And the inventory file called testing for the testing environment:

[web]
testserver ansible_ssh_host=127.0.0.1 ansible_ssh_port=2345 

(The host is localhost here, because I run a vagrant setup to test the pipeline; in a real setting, it would just be the hostname of your test machine).

All code and configuration goes into version control, so I created an ansible directory in the deployment-utils repo and put the files there.

Finally I copied the ssh private key (from vagrant ssh-config) to /var/go/.ssh/id_rsa, adjusted the owner to user go, and was ready to go.
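Before handing this over to GoCD, it's worth checking interactively that ansible can actually reach the test machine. Run from the ansible directory (so that ansible.cfg is picked up) and as the go user (so that the right SSH key is used), a minimal sanity check could look like this:

sudo -u go -H ansible --inventory-file=testing web -m ping

If that reports a successful ping for testserver, the SSH key, user and port setup is fine; if not, it is much easier to debug here than from within a pipeline run.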

Plugging it into GoCD

Automatically installing a newly built package through GoCD in the testing environment is just another stage away:

  <stage name="deploy-testing">
    <jobs>
      <job name="deploy-testing">
        <tasks>
          <exec command="ansible" workingdir="deployment-utils/ansible/">
            <arg>--sudo</arg>
            <arg>--inventory-file=testing</arg>
            <arg>web</arg>
            <arg>-m</arg>
            <arg>apt</arg>
            <arg>-a</arg>
            <arg>name=package-info state=latest update_cache=yes</arg>
            <runif status="passed" />
          </exec>
        </tasks>
      </job>
    </jobs>
  </stage>

The central part is an invocation of ansible in the newly created ansible directory of the deployment-utils repository.

Results

To run the new stage, either trigger a complete run of the pipeline by hitting the "play" triangle in the pipeline overview in the web frontend, or manually trigger just that one stage in the pipeline history view.

You can log in on the target machine to check if the package was successfully installed:

vagrant@debian-jessie:~$ dpkg -l package-info
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-============-=================================
ii  package-info   0.1-0.7.1    all          Web service for getting a list of

and verify that the service is running:

vagrant@debian-jessie:~$ systemctl status package-info
● package-info.service - Package installation information via http
   Loaded: loaded (/lib/systemd/system/package-info.service; static)
   Active: active (running) since Sun 2016-03-27 13:15:41 GMT; 4h 6min ago
  Process: 4439 ExecStop=/usr/bin/hypnotoad -s /usr/lib/package-info/package-info (code=exited, status=0/SUCCESS)
 Main PID: 4442 (/usr/lib/packag)
   CGroup: /system.slice/package-info.service
           ├─4442 /usr/lib/package-info/package-info
           ├─4445 /usr/lib/package-info/package-info
           ├─4446 /usr/lib/package-info/package-info
           ├─4447 /usr/lib/package-info/package-info
           └─4448 /usr/lib/package-info/package-info

and check that it responds on port 8080, as it's supposed to:

    vagrant@debian-jessie:~$ curl http://127.0.0.1:8080/|head -n 7
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
      0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0Desired=Unknown/Install/Remove/Purge/Hold
    | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
    |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
    ||/ Name                           Version                     Architecture Description
    +++-==============================-===========================-============-===============================================================================
    ii  acl                            2.2.52-2                    amd64        Access control list utilities
    ii  acpi                           1.7-1                       amd64        displays information on ACPI devices
    curl: (23) Failed writing body (2877 != 16384)

The last line is simply curl complaining that it can't write the full output, due to the pipe to head exiting too early to receive all the contents. We can safely ignore that.

Going All the Way to Production

Uploading and deploying to production works the same as with the testing environment. So all that's needed is to duplicate the configuration of the last two stages, replace every occurrence of testing with production, and add a manual approval button, so that production deployment remains a conscious decision:

  <stage name="upload-production">
    <approval type="manual" />
    <jobs>
      <job name="upload-production">
        <tasks>
          <fetchartifact pipeline="" stage="build" job="build-deb" srcdir="package-info">
            <runif status="passed" />
          </fetchartifact>
          <exec command="/bin/bash">
            <arg>-c</arg>
            <arg>deployment-utils/add-package production jessie package-info_*.deb</arg>
          </exec>
        </tasks>
        <resources>
          <resource>aptly</resource>
        </resources>
      </job>
    </jobs>
  </stage>
  <stage name="deploy-production">
    <jobs>
      <job name="deploy-production">
        <tasks>
          <exec command="ansible" workingdir="deployment-utils/ansible/">
            <arg>--sudo</arg>
            <arg>--inventory-file=production</arg>
            <arg>web</arg>
            <arg>-m</arg>
            <arg>apt</arg>
            <arg>-a</arg>
            <arg>name=package-info state=latest update_cache=yes</arg>
            <runif status="passed" />
          </exec>
        </tasks>
      </job>
    </jobs>
  </stage>

The only real news here is the second line:

    <approval type="manual" />

which makes GoCD only proceed to this stage when somebody clicks the approval arrow in the web interface.

You also need to fill out the inventory file called production with the list of your server or servers.

Achievement Unlocked: Basic Continuous Delivery

Let's recap: the pipeline

  • is triggered automatically from commits in the source code
  • automatically builds a Debian package from each commit
  • uploads it to a repository for the testing environment
  • automatically installs it in the testing environment
  • upon manual approval, uploads it to a repository for the production environment
  • ... and automatically installs the new version in production.

So the basic framework for Continuous Delivery is in place.

Wow, that escalated quickly.

Missing Pieces

Of course, there's lots to be done before we can call this a fully-fledged Continuous Delivery pipeline:

  • Automatic testing
  • Generalization to other software
  • Version pinning (always installing the correct version, not the newest one)
  • Rollbacks
  • Data migration

But even as it is, the pipeline can provide considerable time savings and shorter feedback cycles. The manual approval before production deployment is a good hook for manual tasks, such as manual tests.


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Ocean of Awareness: Grammar reuse

Every year the Perl 6 community creates an "Advent" series of posts. I always follow these, but one in particular caught my attention this year. It presents a vision of a future where programming is language-driven. A vision that I share. The post went on to encourage its readers to follow up on this vision, and suggested an approach. But I do not think the particular approach suggested would be fruitful. In this post I'll explain why.

Reuse

The focus of the Advent post was language-driven programming, and that is the aspect that excites me most. But the points that I wish to make are more easily understood if I root them in a narrower, but more familiar issue -- grammar reuse.

Most programmers will be very familiar with grammar reuse from regular expressions. In the regular expression ("RE") world, programming by cutting and pasting is very practical and often practiced.

For this post I will consider grammar reusability to be the ability to join two grammars and create a third. This is also sometimes called grammar composition. For this purpose, I will widen the term "grammar" to include RE's and PEG parser specifications. Ideally, when you compose two grammars, what you get is

  • a language you can reasonably predict, and
  • if each of the two original grammars can be parsed in reasonable time, a language that can be parsed in reasonable time.

Not all language representations are reusable. RE's are, and BNF is. PEG looks like a combination of BNF and RE's, but PEG, in fact, is its own very special form of parser specification. And PEG parser specifications are one of the least reusable language representations ever invented.

Reuse and regular expressions

RE's are as well-behaved under reuse as a language representation can get. The combination of two RE's is always another RE, and you can reasonably determine what language the combined RE recognizes by examining it. Further, every RE is parseable in linear time.

The one downside, often mentioned by critics, is that RE's do not scale in terms of readability. Here, however, the problem is not really one of reusability. The problem is that RE's are quite limited in their capabilities, and programmers often exploit the excellent behavior of RE's under reuse to push them into applications for which RE's just do not have the power.

Reuse and PEG

When programmers first look at PEG syntax, they often think they've encountered paradise. They see both BNF and RE's, and imagine they'll have the best of each. But the convenient behavior of RE's depends on their unambiguity. You simply cannot write an ambiguous RE -- it's impossible.

More powerful and more flexible, BNF allows you to describe many more grammars -- including ambiguous ones. How does PEG resolve this? With a Gordian knot approach. Whenever it encounters an ambiguity, it throws all but one of the choices away. The author of the PEG specification gets some control over what is thrown away -- he specifies an order of preference for the choices. But the degree of control is less than it seems, and in practice PEG is the nitroglycerin of parsing -- marvelous when it works, but tricky and dangerous.

Consider these 3 PEG specifications:

	("a"|"aa")"a"
	("aa"|"a")"a"
	A = "a"A"a"/"aa"

All three clearly accept only strings which are repetitions of the letter "a". But which strings? For the answers, suggestions for dealing with PEG if you are committed to it, and more, look at my previous post on PEG.

When getting an RE or a BNF grammar to work, you can go back to the grammar and ask yourself "Does my grammar look like my intended language?". With PEG, this is not really possible. With practice, you might get used to figuring out single-line PEG specs like the first two above. (If you can get the last one, you're amazing.) But tracing these through the multiple rule layers required by useful grammars is, in practice, not really possible.

In real life, PEG specifications are written by hacking them until the test suite works. And, once you get a PEG specification to pass the test suite for a practical-sized grammar, you are very happy to leave it alone. Trying to compose two PEG specifications is rolling the dice with the odds against you.

Reuse and the native Perl 6 parser

The native Perl 6 parser is an extended PEG parser. The extensions are very interesting from the PEG point of view. The PEG "tie breaking" has been changed, and backtracking can be used. These features mean the Perl 6 parser can be extended to languages well beyond what ordinary PEG parsers can handle. But, if you use the extra features, reuse will be even trickier than if you stuck with vanilla PEG.

Reuse and general BNF parsing

As mentioned, general BNF is reusable, and so general BNF parsers like Marpa are as reusable as regular expressions, with two caveats. First, if the two grammars are not doing their own lexing, their lexers will have to be compatible.

Second, with regular expressions you had the advantage that every regular expression parses in linear time, so that speed was guaranteed to be acceptable. Marpa users reuse grammars and pieces of grammars all the time. The result is always the language specified by the merged BNF, and I've never heard anyone complain that performance deteriorated.

But, while it may not happen often, it is possible to combine two Marpa grammars that run in linear time and end up with one that does not. You can guarantee that your merged Marpa grammar will stay linear if you follow two rules:

  • keep the grammar unambiguous;
  • don't use an unmarked middle recursion.

Unmarked middle recursions are not things you're likely to need a lot: they are those palindromes where you have to count to find the middle: grammars like "A ::= a | a A a". If you use a middle recursion at all, it is almost certainly going to be marked, like "A ::= b | a A a", which generates strings like "aabaa". With Marpa, as with RE's, reuse is easy and practical. And, as I hope to show in a future post, unlike RE's, Marpa opens the road to language-driven programming.

Perl 6

I'm a fan of the Perl 6 effort. I certainly should be a supporter, after the many favors they've done for me and the Marpa community over the years. The considerations of this post will disappoint some of the hopes for applications of the native Perl 6 parser. But these applications have not been central to the Perl 6 effort, of which I will be an eager student over the coming months.

Comments

To learn more about Marpa, there's the official web site maintained by Ron Savage. I also have a Marpa web site. Comments on this post can be made in Marpa's Google group, or on our IRC channel: #marpa at freenode.net.

Perlgeek.de : Automating Deployments and Configuration Management

New software versions often need new configuration as well. How do you make sure that the necessary configuration arrives on a target machine at the same time (or before) the software release that introduces them?

The obvious approach is to put the configuration in version control too and deploy it alongside the software.

Taking configuration from a source repository and applying it to running machines is what configuration management software does.

Since Ansible has been used for deployment in the examples so far -- and it's a good configuration management system as well -- it is an obvious choice to use here.

Benefits of Configuration Management

When your infrastructure scales to many machines, and you don't want your time and effort to scale linearly with them, you need to automate things. Keeping configuration consistent across different machines is a requirement, and configuration management software helps you achieve that.

Furthermore, once the configuration comes from a canonical source with version control, tracking and rolling back configuration changes becomes trivial. If there is an outage, you don't need to ask all potentially relevant colleagues whether they changed anything -- your version control system can easily tell you. And if you suspect that a recent change caused the outage, reverting it to see if the revert works is a matter of seconds or minutes.

Once configuration and deployment are automated, building new environments, for example for penetration testing, becomes a much more manageable task.

Capabilities of a Configuration Management System

Typical tasks and capabilities of configuration management software include things like connecting to the remote host, copying files to the host (and often adjusting parameters and filling out templates in the process), ensuring that operating system packages are installed or absent, creating users and groups, controlling services, and even executing arbitrary commands on the remote host.

With Ansible, the connection to the remote host is provided by the core, and the actual steps to be executed are provided by modules. For example the apt_repository module can be used to manage repository configuration (i.e. files in /etc/apt/sources.list.d/), the apt module installs, upgrades, downgrades or removes packages, and the template module typically generates configuration files from variables that the user defined, and from facts that Ansible itself gathered.
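As a small illustration of what such tasks look like in a playbook (the repository URL is a placeholder, and the inline key=value notation matches the playbooks used elsewhere in this series):

- apt_repository: repo='deb http://repo.example.com/debian/testing/jessie jessie main' state=present
- apt: name=package-info state=present update_cache=yes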

There are also higher-level Ansible modules available, for example for managing Docker images, or load balancers from the Amazon cloud.

A complete introduction to Ansible is out of scope here, but I can recommend the online documentation, as well as the excellent book Ansible: Up and Running by Lorin Hochstein.

To get a feeling for what you can do with Ansible, see the ansible-examples git repository.

Assuming that you will find your way around configuration management with Ansible through other resources, I want to talk about how you can integrate it into the deployment pipeline instead.

Integrating Configuration Management with Continuous Delivery

The previous approach of writing one deployment playbook for each application can serve as a starting point for configuration management. You can simply add more tasks to the playbook, for example for creating the configuration files that the application needs. Then each deployment automatically ensures the correct configuration.
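For example, a hypothetical task for generating such a configuration file from a template could be added to the playbook like this (the template and destination paths are made up for illustration):

- template: src=templates/package-info.conf.j2 dest=/etc/package-info.conf owner=root group=root mode=0644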

Since most modules in Ansible are idempotent, that is, repeated execution doesn't change the state of the system after the first time, adding additional tasks to the deployment playbook only becomes problematic when performance suffers. If that happens, you could start to extract some slow steps out into a separate playbook that doesn't run on each deployment.

If you provision and configure a new machine, you typically don't want to manually trigger the deploy step of each application, but rather have a single command that deploys and configures all of the relevant applications for that machine. So it makes sense to also have a playbook for deploying all relevant applications. This can be as simple as a list of include statements that pull in the individual application's playbooks.

You can add another pipeline that applies this "global" configuration to the testing environment, and after manual approval, in the production environment as well.

Stampedes and Synchronization

In the scenario outlined above, the configuration for all related applications lives in the same git repository, and is used as a material in the build and deployment pipelines for all these applications.

A new commit in the configuration repository then triggers a rebuild of all the applications. For a small number of applications, that's usually not a problem, but if you have a dozen or a few dozen applications, this starts to suck up resources unnecessarily, and also means no build workers are available for some time to build changes triggered by actual code changes.

To avoid these build stampedes, a pragmatic approach is to use ignore filters in the git materials. Ignore filters are typically used to avoid rebuilds when only documentation changes, but they can also be used to prevent any changes in a repository from triggering a rebuild.

If, in the <materials> section of your GoCD pipeline, you replace

<git url="https://github.com/moritz/deployment-utils.git" dest="deployment-utils" materialName="deployment-utils" />

with

<git url="https://github.com/moritz/deployment-utils.git" dest="deployment-utils" materialName="deployment-utils">
  <filter>
    <ignore pattern="**/*" />
    <ignore pattern="*" />
  </filter>
</git>

then a newly pushed commit to the deployment-utils repo won't trigger this pipeline. A new build, triggered either manually or from a new commit in the application's git repository, still picks up the newest version of the deployment-utils repository.

In the pipeline that deploys all of the configuration, you wouldn't add such a filter.

Now if you change some playbooks, the pipeline for the global configuration runs and rolls out these changes, and you promote the newest version to production. When you then deploy one of your applications to production, and the build happened before the changes to the playbook, it actually uses an older version of the playbook.

This sounds like a very unfortunate state of affairs, but it turns out not to be so bad. The combination of playbook version and application version worked in testing, so it should work in production as well.

To avoid using an older playbook, you can trigger a rebuild of the application, which automatically uses the newest playbook version.

Finally, in practice it is a good idea to bring most changes to production pretty quickly anyway. If you don't do that, you lose overview of what changed, which leads to growing uncertainty about whether a production release is safe. If you follow this ideal of going quickly to production, the version mismatches between the configuration and application pipelines should never become big enough to worry about.

Conclusion

The deployment playbooks that you write for your applications can be extended to do full configuration management for these applications. You can create a "global" Ansible playbook that includes those deployment playbooks, and possibly other configuration, such as basic configuration of the system.


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Perlgeek.de : Automatically Deploying Specific Versions

Versions. Always talking about versions. So, more talk about versions.

The installation pipeline from a previous installment always installs the newest version available. In a normal, simple, linear development flow, this is fine, and even in other workflows, it's a good first step.

But we really want the pipeline to deploy the exact version that was built inside the same instance of the pipeline. The obvious benefit is that it allows you to rerun older versions of the pipeline to install older versions, effectively giving you a rollback.

Or you can build a second pipeline for hotfixes, based on the same git repository but a different branch, and when you do want a hotfix, you simply pause the regular pipeline, and trigger the hotfix pipeline. In this scenario, if you always installed the newest version, finding a proper version string for the hotfix is nearly impossible, because it needs to be higher than the currently installed one, but also lower than the next regular build. Oh, and all of that automatically please.

A less obvious benefit of installing a very specific version is that it detects errors in the package source configuration of the target machines. If the deployment script just installs the newest version that's available, and through an error the repository isn't configured on the target machine, the installation process becomes a silent no-op if the package is already installed in an older version.

Implementation

There are two things to do: figure out which version of the package to install, and then do it.

The latter step is fairly easy, because the ansible "apt" module that I use for installation supports specifying a version, and even has an example in the documentation:

# Install the version '1.00' of package "foo"
- apt: name=foo=1.00 state=present

Experimenting with this feature shows that in case this is a downgrade, you also need to add force=yes.
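So a downgrade task, in the same notation as the documentation example above, would look something like this (the version number is made up):

# Downgrade to version '0.99' of package "foo"
- apt: name=foo=0.99 state=present force=yes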

Knowing the version number to install also has a simple, though maybe not obvious solution: write the version number to a file, collect this file as an artifact in GoCD, and then when it's time to install, fetch the artifact, and read the version number from it.

When I last talked about the build step, I silently introduced configuration that collects the version file that the debian-autobuild script writes:

  <job name="build-deb" timeout="5">
    <tasks>
      <exec command="../deployment-utils/debian-autobuild" workingdir="#{package}" />
    </tasks>
    <artifacts>
      <artifact src="version" />
      <artifact src="package-info*_*" dest="package-info/" />
    </artifacts>
  </job>

So only the actual installation step needs adjusting. This is what the configuration looked like:

  <job name="deploy-testing">
    <tasks>
      <exec command="ansible" workingdir="deployment-utils/ansible/">
        <arg>--sudo</arg>
        <arg>--inventory-file=testing</arg>
        <arg>web</arg>
        <arg>-m</arg>
        <arg>apt</arg>
        <arg>-a</arg>
        <arg>name=package-info state=latest update_cache=yes</arg>
        <runif status="passed" />
      </exec>
    </tasks>
  </job>

So, first fetch the version file:

      <job name="deploy-testing">
        <tasks>
          <fetchartifact pipeline="" stage="build" job="build-deb" srcfile="version" />
          ...

Then, how do you get the version from the file to ansible? One could either use ansible's lookup('file', path) function, or write a small script. I decided on the latter, since I was originally more aware of bash's capabilities than of ansible's, and it's only a one-liner anyway:

          ...
          <exec command="/bin/bash" workingdir="deployment-utils/ansible/">
            <arg>-c</arg>
            <arg>ansible --sudo --inventory-file=testing #{target} -m apt -a "name=#{package}=$(&lt; ../../version) state=present update_cache=yes force=yes"</arg>
          </exec>
        </tasks>
      </job>

Bash's $(...) opens a sub-process (which again is a bash instance), and inserts the output from that sub-process into the command line. < ../../version is a short way of reading the file. And, this being XML, the less-than sign needs to be escaped.
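For readability, here is the same command without the XML escaping, with the #{target} and #{package} parameters filled in (presumably web and package-info in this example), and with the file read split into its own step; the two forms are equivalent:

version=$(< ../../version)
ansible --sudo --inventory-file=testing web -m apt \
    -a "name=package-info=$version state=present update_cache=yes force=yes"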

The production deployment configuration looks pretty much the same, just with --inventory-file=production.

Try it!

To test the version-specific package installation, you need to have at least two runs of the pipeline that captured the version artifact. If you don't have that yet, you can push commits to the source repository, and GoCD picks them up automatically.

You can query the installed version on the target machine with dpkg -l package-info. After the last run, the version built in that pipeline instance should be installed.

Then you can rerun the deployment stage from a previous pipeline, for example in the history view of the pipeline by hovering with the mouse over the stage, and then clicking on the circle with the arrow on it that triggers the rerun.

After the stage rerun has completed, checking the installed version again should yield the version built in the pipeline instance that you selected.

Conclusions

Once you know how to do it, setting up your pipeline to deploy exactly the version that was built in the same pipeline instance is fairly easy.

Once you've done that, you can easily deploy older versions of your software as a rollback, and use the same mechanism to automatically build and deploy hotfixes.


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Perlgeek.de : Automating Deployments: Version Recycling Considered Harmful

In the previous installment we saw a GoCD configuration that automatically built a Debian package from a git repository whenever somebody pushes a new commit to the git repo.

The version of the generated Debian package comes from the debian/changelog file of the git repository. Which means that whenever somebody pushes code or doc changes without a new changelog entry, the resulting Debian package has the same version number as the previous one.

The problem with this version recycling is that most Debian tooling assumes that the tuple of package name, version and architecture uniquely identifies a revision of a package. So stuffing a new version of a package with an old version number into a repository is bound to cause trouble; most repository management software simply refuses to accept that. On the target machine, upgrading the package won't do anything if the version number stays the same.

So, it's a good idea to put a bit more thought into the version string of the automatically built Debian package.

Constructing Unique Version Numbers

There are several sources that you can tap to generate unique version numbers:

  • Randomness (for example in the form of UUIDs)
  • The current date and time
  • The git repository itself
  • GoCD exposes several environment variables that can be of use

The last of these is quite promising: GO_PIPELINE_COUNTER is a monotonic counter that increases each time GoCD runs the pipeline, so it is a good source for a version number. GoCD allows manual re-running of stages, so it's best to combine it with GO_STAGE_COUNTER. In terms of shell scripting, using $GO_PIPELINE_COUNTER.$GO_STAGE_COUNTER as a version string sounds like a decent approach.

But, there's more. GoCD allows you to trigger a pipeline with a specific version of a material, so you can have a new pipeline run to build an old version of the software. If you do that, using GO_PIPELINE_COUNTER as the first part of the version string doesn't reflect the use of an old code base.

To construct a version string that primarily reflects the version of the git repository, and only secondarily the build iteration, the first part of the version string has to come from git. As a distributed version control system, git doesn't supply a single, numeric version counter. But if you limit yourself to a single repository and branch, you can simply count commits.

git describe is an established way to count commits. By default it prints the last tag in the repo, and if HEAD does not resolve to the same commit as the tag, it adds the number of commits since that tag, and the abbreviated sha1 hash prefixed by g, so for example 2016.04-32-g4232204 for the commit 4232204, which is 32 commits after the tag 2016.04. The option --long forces it to always print the number of commits and the hash, even when HEAD points to a tag.
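To make that concrete with the example above, piping the output through the sed expression used in the script below strips the hash part and leaves a usable version prefix:

$ git describe --long
2016.04-32-g4232204
$ git describe --long | sed 's/-g[A-Fa-f0-9]*$//'
2016.04-32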

We don't need the commit hash for the version number, so a shell script to construct a good version number looks like this:

#!/bin/bash

set -e
set -o pipefail
version=$(git describe --long |sed 's/-g[A-Fa-f0-9]*$//')
version="$version.${GO_PIPELINE_COUNTER:-0}.${GO_STAGE_COUNTER:-0}"

Bash's ${VARIABLE:-default} syntax is a good way to make the script work outside a GoCD agent environment.

This script requires a tag to be set in the git repository. If there is none, it fails with this message from git describe:

fatal: No names found, cannot describe anything.

Other Bits and Pieces Around the Build

Now that we have a version string, we need to instruct the build system to use this version string. This works by writing a new entry in debian/changelog with the desired version number. The debchange tool automates this for us. A few options are necessary to make it work reliably:

export DEBFULLNAME='Go Debian Build Agent'
export DEBEMAIL='go-noreply@example.com'
debchange --newversion=$version  --force-distribution -b  \
    --distribution="${DISTRIBUTION:-jessie}" 'New Version'

When we want to reference this version number in later stages in the pipeline (yes, there will be more), it's handy to have it available in a file. It is also handy to have it in the output, so two more lines to the script:

echo $version
echo $version > ../version

And of course, trigger the actual build:

debuild -b -us -uc

Plugging It Into GoCD

To make the script accessible to GoCD, and also have it under version control, I put it into a git repository under the name debian-autobuild and added the repo as a material to the pipeline:

<pipeline name="package-info">
  <materials>
    <git url="https://github.com/moritz/package-info.git" dest="package-info" />
    <git url="https://github.com/moritz/deployment-utils.git" dest="deployment-utils" materialName="deployment-utils" />
  </materials>
  <stage name="build" cleanWorkingDir="true">
    <jobs>
      <job name="build-deb" timeout="5">
        <tasks>
          <exec command="../deployment-utils/debian-autobuild" workingdir="#{package}" />
        </tasks>
        <artifacts>
          <artifact src="version" />
          <artifact src="package-info*_*" dest="package-info/" />
        </artifacts>
      </job>
    </jobs>
  </stage>
</pipeline>

Now GoCD automatically builds Debian packages on each commit to the git repository, and gives each a distinct version string.

The next step is to add it to a repository, so that it can be installed on a target machine with a simple apt-get command.


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Dave's Free Press: Journal: CPAN Testers' CPAN author FAQ

Ocean of Awareness: What parser do birds use?

"Here we provide, to our knowledge, the first unambiguous experimental evidence for compositional syntax in a non-human vocal system." -- "Experimental evidence for compositional syntax in bird calls", Toshitaka N. Suzuki, David Wheatcroft & Michael Griesser Nature Communications 7, Article number: 10986

In this post I look at a subset of the language of the Japanese great tit, also known as Parus major. The above cited article presents evidence that bird brains can parse this language. What about standard modern computer parsing methods? Here is the subset -- probably a tiny one -- of the language actually used by Parus major.

      S ::= ABC
      S ::= D
      S ::= ABC D
      S ::= D ABC
    

Classifying the Parus major grammar

Grammophone is a very handy new tool for classifying grammars. Its own parser is somewhat limited, so that it requires a period to mark the end of a rule. The above grammar is in Marpa's SLIF format, which is smart enough to use the "::=" operator to spot the beginning and end of rules, just as the human eye does. Here's the same grammar converted into a form acceptable to Grammophone:

      S -> ABC .
      S -> D .
      S -> ABC D .
      S -> D ABC .
    

Grammophone tells us that the Parus major grammar is not LL(1), but that it is LALR(1).

What does this mean?

LL(1) is the class of grammar parseable by top-down methods: it's the best class for characterizing most parsers in current use, including recursive descent, PEG, and Perl 6 grammars. All of these parsers fall short of dealing with the Parus major language.

LALR(1) is probably most well-known from its implementations in bison and yacc. While able to handle this subset of Parus's language, LALR(1) has its own, very strict, limits. Whether LALR(1) could handle the full complexity of Parus language is a serious question. But it's a question that in practice would probably not arise. LALR(1) has horrible error handling properties.

When the input is correct and within its limits, an LALR-driven parser is fast and works well. But if the input is not perfectly correct, LALR parsers produce no useful analysis of what went wrong. If Parus hears "d abc d", a parser like Marpa, on the other hand, can produce something like this:

# * String before error: abc d\s
# * The error was at line 1, column 7, and at character 0x0064 'd', ...
# * here: d
    

Parus uses its language in predatory contexts, and one can assume that a Parus with a preference for parsers whose error handling is on an LALR(1) level will not be keeping its alleles in the gene pool for very long.

References, comments, etc.

Those readers content with sub-Parus parsing methods may stop reading here. Those with greater parsing ambitions, however, may wish to learn more about Marpa. A Marpa test script for parsing the Parus subset is in a Github gist. Marpa has a semi-official web site, maintained by Ron Savage. The official, but more limited, Marpa website is my personal one. Comments on this post can be made in Marpa's Google group, or on our IRC channel: #marpa at freenode.net.

Dave's Free Press: Journal: Thankyou, Anonymous Benefactor!

Perlgeek.de : Continuous Delivery for Libraries?

Last Thursday I gave a talk on Continuous Delivery (slides) at the German Perl Workshop 2016 (video recordings have been made, but aren't available yet). One of the questions from the audience was something along the lines of: would I use Continuous Delivery for a software library?

My take on this is that you typically develop a library driven by the needs of one or more applications, not just for the sake of developing a library. So you have some kind of pilot application which makes use of the new library features.

You can integrate the library into the application's build pipeline. Automatically build and unit test the library, and once it succeeds, upload the library into a repository. The build pipeline for the application can then download the newest version of the library, and include it in its build result (fat-packaging). The application build step now has two triggers: commits from its own version control repository, and library uploads.

Then the rest of the delivery pipeline for the application serves as quality gating for the library as well. If the pipeline includes integration tests and functional tests for the whole software stack, it will catch errors of the library, and deploy the library along with the application.


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Dave's Free Press: Journal: Number::Phone release

Dave's Free Press: Journal: Ill

Dave's Free Press: Journal: CPANdeps upgrade

Perlgeek.de : Automating Deployments: Smoke Testing and Rolling Upgrades

In the last installment I talked about unit testing that covers the logic of your application. Unit testing is a good and efficient way to ensure the quality of the business logic, however unit tests tend to test components in isolation.

You should also check that several components work together well, which can be done with integration tests or smoke tests. The distinction between these two is a bit murky at times, but typically integration tests are still done somewhat in isolation, whereas smoke tests are run against an installed copy of the software in a complete environment, with all external services available.

A smoke test thus goes through the whole software stack. For a web application, that typically entails a web server, an application server, a database, and possibly integration points with other services such as single sign-on (SSO) or external data sources.

When to Smoke?

Smoke tests cover a lot of ground at once. A single test might require a working network, correctly configured firewall, web server, application server, database, and so on to work. This is an advantage, because it means that it can detect a big class of errors, but it is also a disadvantage, because it means the diagnostic capabilities are low. When it fails, you don't know which component is to blame, and have to investigate each failure anew.

Smoke tests are also much more expensive than unit tests; they tend to take more time to write, take longer to execute, and are more fragile in the face of configuration or data changes.

So typical advice is to have a low number of smoke tests, maybe one to 20, or maybe around one percent of the unit tests you have.

As an example, if you were to develop a flight search and recommendation engine for the web, your unit tests would cover different scenarios that the user might encounter, and that the engine produces the best possible suggestions. In smoke tests, you would just check that you can enter the starting point, destination and date of travel, and that you get a list of flight suggestions at all. If there is a membership area on that website, you would test that you cannot access it without credentials, and that you can access it after logging in. So, three smoke tests, give or take.

White Box Smoke Testing

The examples mentioned above are basically black-box smoke testing, in that they don't care about the internals of the application, and approach the application just like a user. This is very valuable, because ultimately you care about your user's experience.

But sometimes some aspects of the application aren't easy to smoke test, yet break often enough to warrant automated smoke tests. A practical solution is to offer some kind of self diagnosis, for example a web page where the application tests its own configuration for consistency, checks that all the necessary database tables exist, and that external services are reachable.

Then a single smoke test can call the status page, and throw an error whenever either the status page is not reachable, or reports an error. This is a white box smoke test.
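Such a smoke test can stay as small as the black box variant. Here is a sketch, assuming a hypothetical application that serves a self-diagnosis page at /status and prints OK when everything checks out (the endpoint and its output format are made up for illustration):

#!/bin/bash
# white box smoke test: fails if the status page is unreachable,
# returns an HTTP error, or does not report OK
curl --silent --fail "http://$1/status" | grep -q '^OK$'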

Status pages for white box smoke tests can be reused in monitoring checks, but it is still a good idea to explicitly check it as part of the deployment process.

White box smoke testing should not replace black box smoke testing, but rather complement it.

An Example Smoke Test

The matheval application from the previous blog post offers a simple HTTP endpoint, so any HTTP client will do for smoke testing.

Using the curl command line HTTP client, a possible request looks like this:

$ curl  --silent -H "Accept: application/json" --data '["+", 37, 5]' -XPOST  http://127.0.0.1:8800/
42

An easy way to check that the output matches expectations is by piping it through grep:

$ curl  --silent -H "Accept: application/json" --data '["+", 37, 5]' -XPOST  http://127.0.0.1:8800/ | grep ^42$
42

The output is the same as before, but the exit status is non-zero if the output deviates from the expectation.

Integrating the Smoke Tests Into the Pipeline

One could add a smoke test stage after each deployment stage (that is, one after the test deployment, one after the production deployment).

This setup would prevent a version of your application from reaching the production environment if it failed smoke tests in the testing environment. Since the smoke test is just a shell command that indicates failure with a non-zero exit status, adding it as a command in your deployment system should be trivial.
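In GoCD terms, that could be one more exec task after the deployment task, calling a smoke test script (such as the one shown further below) with the test machine's hostname as its argument; the hostname here is a placeholder:

          <exec command="deployment-utils/smoke-tests/python-matheval">
            <arg>testserver.example.com</arg>
            <runif status="passed" />
          </exec>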

If you have just one instance of your application running, this is the best you can do. But if you have a farm of servers, and several instances of the application running behind some kind of load balancer, it is possible to smoke test each instance separately during an upgrade, and abort the upgrade if too many instances fail the smoke test.

All big, successful tech companies protect their production systems with such check-guarded partial upgrades, or even more elaborate versions thereof.

A simple approach to such a rolling upgrade is to write an ansible playbook for the deployment of each package, and have it run the smoke tests for each machine before moving to the next:

# file smoke-tests/python-matheval
#!/bin/bash
curl  --silent -H "Accept: application/json" --data '["+", 37, 5]' -XPOST  http://$1:8800/ | grep ^42$


# file ansible/deploy-python-matheval.yml
---
- hosts: web
  serial: 1
  max_fail_percentage: 1
  tasks:
    - apt: update_cache=yes package=python-matheval={{package_version}} state=present force=yes
    - local_action: command ../smoke-tests/python-matheval "{{ansible_host}}"
      changed_when: False

As the smoke tests grow over time, it is not practical to cram them all into the ansible playbook, and doing that also limits reusability. Instead, here they are in a separate file in the deployment-utils repository. Another option would be to build a package from the smoke tests and install them on the machine that ansible runs on.

While it would be easy to execute the smoke tests command on the machine on which the service is installed, running it as a local action (that is, on the control host where the ansible playbook is started) also tests the network and firewall part, and thus more realistically mimics the actual usage scenario.

GoCD Configuration

To run the new deployment playbook from within the GoCD pipeline, change the testing deployment job in the template to:

        <tasks>
          <fetchartifact pipeline="" stage="build" job="build-deb" srcfile="version" />
          <exec command="/bin/bash" workingdir="deployment-utils/ansible/">
            <arg>-c</arg>
            <arg>ansible-playbook --inventory-file=testing --extra-vars=package_version=$(&lt; ../../version) #{deploy_playbook}</arg>
          </exec>
        </tasks>

And the same for production, except that it uses the production inventory file. This change to the template also changes the parameters that need to be defined in the pipeline definition. In the python-matheval example it becomes

  <params>
    <param name="distribution">jessie</param>
    <param name="package">python-matheval</param>
    <param name="deploy_playbook">deploy-python-matheval.yml</param>
  </params>

Since there are two pipelines that share the same template, the second pipeline (for package package-info) also needs a deployment playbook. It is very similar to the one for python-matheval, it just lacks the smoke test for now.

Conclusion

Writing a small amount of smoke tests is very beneficial for the stability of your applications.

Rolling updates with integrated smoke tests for each system involved are pretty easy to do with ansible, and can be integrated into the GoCD pipeline with little effort. They mitigate the damage of deploying a bad version or a bad configuration by limiting it to one system, or a small number of systems in a bigger cluster.

With this addition, the deployment pipeline is likely to be at least as robust as most manual deployment processes, but requires much less effort, scales more easily to more packages, and gives more insight into the timeline of deployments and installed versions.


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Perlgeek.de : Story Time: Rollbacks Saved the Day

At work we develop, among other things, an internally used web application based on AngularJS. Last Friday, we received a rather urgent bug report that, in a co-worker's browser, an important page wouldn't load at all and showed three empty error notifications.

Only one of our two frontend developers was present, and she didn't immediately know what was wrong or how to fix it. And what's worse, she couldn't reproduce the problem in her own browser.

A quick look into our GoCD instance showed that the previous production deployment of this web application had happened the day before, and we had no reports of similar errors prior to that, so we decided to roll back to the previous version.

I recently blogged about deploying specific versions, which allows you to do rollbacks easily. And in fact I had implemented that technique just two weeks before this incident. To make the rollback happen, I just had to click on the "rerun" button of the installation stage of the last good deployment, and wait for a short while (about a minute).

Since I had a browser where I could reproduce the problem, I could verify that the rollback had indeed solved the problem, and our co-worker was able to continue his work. This took the stress off the frontend developers, who didn't have to hurry to fix the bug in a haste. Instead, they just had to fix it before the next production deployment. Which, in this case, meant it was possible to wait to the next working day, when the second developer was present again.

For me, this episode was a good validation of the rollback mechanism in the deployment pipeline.

Postskriptum: The Bug

In case you wonder, the actual bug that triggered the incident was related to caching. The web application just introduced caching of data in the browser's local storage. In Firefox on Debian Jessie and some Mint versions, writing to the local storage raised an exception, which the web application failed to catch. Firefox on Ubuntu and Mac OS X didn't produce the same problem.

Curiously, Firefox was configured to allow the usage of local storage for this domain, the default quota of around 5 MB was unchanged, and both the existing local storage and the data about to be written were in the low kilobyte range. A website that experimentally determines the local storage size confirmed it to be 5200 kB. So I suspect that there is a Firefox bug involved on these platforms as well.


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.


Ocean of Awareness: Introduction to Marpa Book in progress

What follows is a summary of the features of the Marpa algorithm, followed by a discussion of potential applications. It refers to itself as a "monograph", because it is a draft of part of the introduction to a technical monograph on the Marpa algorithm. I hope the entire monograph will appear in a few weeks.

The Marpa project

The Marpa project was intended to create a practical and highly available tool to generate and use general context-free parsers. Tools of this kind had long existed for LALR and regular expressions. But, despite an encouraging academic literature, no such tool had existed for context-free parsing. The first stable version of Marpa was uploaded to a public archive on Solstice Day 2011. This monograph describes the algorithm used in the most recent version of Marpa, Marpa::R2. It is a simplification of the algorithm presented in my earlier paper.

A proven algorithm

While the presentation in this monograph is theoretical, the approach is practical. The Marpa::R2 implementation has been widely available for some time, and has seen considerable use, including in production environments. Many of the ideas in the parsing literature satisfy theoretical criteria, but in practice turn out to face significant obstacles. An algorithm may be as fast as reported, but may turn out not to allow adequate error reporting. Or a modification may speed up the recognizer, but require additional processing at evaluation time, leaving no advantage to compensate for the additional complexity.

In this monograph, I describe the Marpa algorithm as it was implemented for Marpa::R2. In many cases, I believe there are better approaches than those I have described. But I treat these techniques, however solid their theory, as conjectures. Whenever I mention a technique that was not actually implemented in Marpa::R2, I will always explicitly state that that technique is not in Marpa as implemented.

Features

General context-free parsing

As implemented, Marpa parses all "proper" context-free grammars. The proper context-free grammars are those which are free of cycles, unproductive symbols, and inaccessible symbols. Worst case time bounds are never worse than those of Earley's algorithm, and therefore never worse than O(n**3).

Linear time for practical grammars

Currently, the grammars suitable for practical use are thought to be a subset of the deterministic context-free grammars. Using a technique discovered by Joop Leo, Marpa parses all of these in linear time. Leo's modification of Earley's algorithm is O(n) for LR-regular grammars. Leo's modification also parses many ambiguous grammars in linear time.

Left-eidetic

The original Earley algorithm kept full information about the parse --- including partial and fully recognized rule instances --- in its tables. At every parse location, before any symbols are scanned, Marpa's parse engine makes available its information about the state of the parse so far. This information is in useful form, and can be accessed efficiently.

Recoverable from read errors

When Marpa reads a token which it cannot accept, the error is fully recoverable. An application can try to read another token. The application can do this repeatedly as long as none of the tokens are accepted. Once the application provides a token that is accepted by the parser, parsing will continue as if the unsuccessful read attempts had never been made.

Ambiguous tokens

Marpa allows ambiguous tokens. These are often useful in natural language processing where, for example, the same word might be a verb or a noun. Use of ambiguous tokens can be combined with recovery from rejected tokens so that, for example, an application could react to the rejection of a token by reading two others.

Using the features

Error reporting

An obvious application of left-eideticism is error reporting. Marpa's abilities in this respect are ground-breaking. For example, users typically regard an ambiguity as an error in the grammar. Marpa, as currently implemented, can detect an ambiguity and report specifically where it occurred and what the alternatives were.

Event driven parsing

As implemented, Marpa::R2 allows the user to define "events". Events can be defined that trigger when a specified rule is complete, when a specified rule is predicted, when a specified symbol is nulled, when a user-specified lexeme has been scanned, or when a user-specified lexeme is about to be scanned. A mid-rule event can be defined by adding a nulling symbol at the desired point in the rule, and defining an event which triggers when the symbol is nulled.

Ruby slippers parsing

Left-eideticism, efficient error recovery, and the event mechanism can be combined to allow the application to change the input in response to feedback from the parser. In traditional parser practice, error detection is an act of desperation. In contrast, Marpa's error detection is so painless that it can be used as the foundation of new parsing techniques.

For example, if a token is rejected, the lexer is free to create a new token in the light of the parser's expectations. This approach can be seen as making the parser's "wishes" come true, and I have called it "Ruby Slippers Parsing".

One use of the Ruby Slippers technique is to parse with a clean but oversimplified grammar, programming the lexical analyzer to make up for the grammar's shortcomings on the fly. As part of Marpa::R2, I have implemented an HTML parser, based on a grammar that assumes that all start and end tags are present. Such an HTML grammar is too simple even to describe perfectly standards-conformant HTML, but the lexical analyzer is programmed to supply start and end tags as requested by the parser. The result is a simple and cleanly designed parser that parses very liberal HTML and accepts all input files, in the worst case treating them as highly defective HTML.

Ambiguity as a language design technique

In current practice, ambiguity is avoided in language design. This is very different from the practice in the languages humans use when communicating with each other. Humans exploit ambiguity to create highly flexible, powerfully expressive languages. For example, the language of this monograph, English, is notoriously ambiguous.

Ambiguity of course can present a problem. A sentence in an ambiguous language may have undesired meanings. But note that this is not a reason to ban potential ambiguity --- it is only a problem with actual ambiguity.

Syntax errors, for example, are undesired, but nobody tries to design languages to make syntax errors impossible. A language in which every input was well-formed and meaningful would be cumbersome and even dangerous: all typos in such a language would be meaningful, and the parser would never warn the user about errors, because there would be no such thing as an error.

With Marpa, ambiguity can be dealt with in the same way that syntax errors are dealt with in current practice. The language can be designed to be ambiguous, but any actual ambiguity can be detected and reported at parse time. This exploits Marpa's ability to report exactly where and what the ambiguity is. Marpa::R2's own parser description language, the SLIF, uses ambiguity in this way.

Auto-generated languages

In 1973, Čulik and Cohen pointed out that the ability to efficiently parse LR-regular languages opens the way to auto-generated languages. In particular, Čulik and Cohen note that a parser which can parse any LR-regular language will be able to parse a language generated using syntax macros.

Second order languages

In the literature, the term "second order language" is usually used to describe languages with features which are useful for second-order programming. True second-order languages --- languages which are auto-generated from other languages --- have not been seen as practical, since there was no guarantee that the auto-generated language could be efficiently parsed.

With Marpa, this barrier is removed. As an example, Marpa::R2's own parser description language, the SLIF, allows "precedenced rules". Precedenced rules are specified in an extended BNF. The BNF extensions allow precedence and associativity to be specified for each RHS.

Marpa::R2's precedenced rules are implemented as a true second order language. The SLIF representation of the precedenced rule is parsed to create a BNF grammar which is equivalent, and which has the desired precedence. Essentially, the SLIF does a standard textbook transformation. The transformation starts with a set of rules, each of which has a precedence and an associativity specified. The result of the transformation is a set of rules in pure BNF. The SLIF's advantage is that it is powered by Marpa, and therefore the SLIF can be certain that the grammar that it auto-generates will parse in linear time.
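
To make the rewriting concrete, here is a rough Python sketch of that textbook transformation, restricted for simplicity to binary infix operators. It is an illustration of the idea only, not the SLIF's actual implementation, and the rule names (Expr_0, Expr_1, ...) are invented for the example:

# Operator levels listed from loosest to tightest precedence.
precedence_levels = [
    {'ops': ['+', '-'], 'assoc': 'left'},
    {'ops': ['*', '/'], 'assoc': 'left'},
]

def precedenced_to_bnf(levels, name='Expr'):
    rules = []
    for i, level in enumerate(levels):
        this, tighter = f'{name}_{i}', f'{name}_{i + 1}'
        # Left association: only the left operand may recurse at the same
        # precedence level; the right operand must bind more tightly.
        left = this if level['assoc'] == 'left' else tighter
        right = tighter if level['assoc'] == 'left' else this
        for op in level['ops']:
            rules.append(f"{this} ::= {left} '{op}' {right}")
        rules.append(f'{this} ::= {tighter}')
    # The tightest level contains the primary expressions.
    bottom = f'{name}_{len(levels)}'
    rules.append(f"{bottom} ::= '(' {name}_0 ')'")
    rules.append(f'{bottom} ::= Number')
    return rules

for rule in precedenced_to_bnf(precedence_levels):
    print(rule)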

Notationally, Marpa's precedenced rules are an improvement over similar features in LALR-based parser generators like yacc or bison. In the SLIF, there are two important differences. First, in the SLIF's precedenced rules, precedence is generalized, so that it does not depend on the operators: there is no need to identify operators, much less class them as binary, unary, etc. This more powerful and flexible precedence notation allows the definition of multiple ternary operators, and multiple operators with arity above three.

Second, and more important, a SLIF user is guaranteed to get exactly the language that the precedenced rule specifies. The user of the yacc equivalent must hope their syntax falls within the limits of LALR.

References, comments, etc.

Marpa has a semi-official web site, maintained by Ron Savage. The official, but more limited, Marpa website is my personal one. Comments on this post can be made in Marpa's Google group, or on our IRC channel: #marpa at freenode.net.

Dave's Free Press: Journal: YAPC::Europe 2006 report: day 3

Perlgeek.de : Automated Deployments: Unit Testing

Automated testing is absolutely essential for automated deployments. When you automate deployments, you automatically do them more often than before, which means that manual testing requires more effort, becomes more annoying, and is usually skipped sooner or later.

So to maintain a high degree of confidence that a deployment won't break the application, automated tests are the way to go.

And yet, I've written twenty blog posts about automating deployments, and this is the first about testing. Why did I drag my feet like this?

For one, testing is hard to generalize. But more importantly, the example project used so far doesn't play well with my usual approach to testing.

Of course one can still test it, but it's not an idiomatic approach that scales to real applications.

The easy way out is to consider a second example project. This also provides a good excuse to test the GoCD configuration template, and explore another way to build Debian packages.

Meet python-matheval

python-matheval is a stupid little web service that accepts a tree of mathematical expressions encoded in JSON format, evaluates it, and returns the result in the response. And as the name implies, it's written in Python; Python 3, to be precise.

The actual evaluation logic is quite compact:

# file src/matheval/evaluator.py
from functools import reduce
import operator

ops = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.truediv,
}

def math_eval(tree):
    if not isinstance(tree, list):
        return tree
    op = ops[tree.pop(0)]
    return reduce(op, map(math_eval, tree))
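
For instance, feeding it a nested tree gives the following (a small usage sketch, assuming the matheval package from src/ is on the import path; the results follow from the operator table above):

from matheval.evaluator import math_eval

print(math_eval(5))                       # a plain number evaluates to itself: 5
print(math_eval(['+', 5, ['*', 2, 3]]))   # 5 + (2 * 3) = 11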

Exposing it to the web isn't much effort either, using the Flask library:

# file src/matheval/frontend.py
#!/usr/bin/python3

from flask import Flask, request

from matheval.evaluator import math_eval

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def index():
    tree = request.get_json(force=True)
    result = math_eval(tree)
    return str(result) + "\n"

if __name__ == '__main__':
    app.run(debug=True)
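
A quick way to exercise the service is a plain HTTP request (a sketch assuming the Flask development server is running on its default address, http://127.0.0.1:5000/, for example via python3 src/matheval/frontend.py):

import json
import urllib.request

# POST an example expression tree to the running service.
tree = ['+', 5, ['*', 2, 3]]
req = urllib.request.Request(
    'http://127.0.0.1:5000/',
    data=json.dumps(tree).encode('utf-8'),
    headers={'Content-Type': 'application/json'},
)
with urllib.request.urlopen(req) as response:
    print(response.read().decode('utf-8'))   # prints: 11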

The rest of the code is part of the build system. As a Python package, it should have a setup.py in the root directory:

# file setup.py
#!/usr/bin/env python

from setuptools import setup

setup(name='matheval',
      version='1.0',
      description='Evaluation of expression trees',
      author='Moritz Lenz',
      author_email='moritz.lenz@gmail.com',
      url='https://deploybook.com/',
      package_dir={'': 'src'},
      requires=['flask', 'gunicorn'],
      packages=['matheval']
     )

Once a working setup script is in place, the tool dh-virtualenv can be used to create a Debian package containing the project itself and all of the Python-level dependencies.

This creates rather large Debian packages (in this case, around 4 MB for less than a kilobyte of actual application code), but on the upside it allows several applications on the same machine to depend on different versions of the same Python library. The ease of using the resulting Debian packages makes the trade-off worthwhile in many use cases.

Using dh-virtualenv is quite easy:

# file debian/rules
#!/usr/bin/make -f
export DH_VIRTUALENV_INSTALL_ROOT=/usr/share/python-custom

%:
    dh $@ --with python-virtualenv --with systemd

override_dh_virtualenv:
    dh_virtualenv --python=/usr/bin/python3

See the github repository for all the other boring details, like the systemd service files and the control file.

The integration into the GoCD pipeline is easy, using the previously developed configuration template:

<pipeline name="python-matheval" template="debian-base">
  <params>
    <param name="distribution">jessie</param>
    <param name="package">python-matheval</param>
    <param name="target">web</param>
  </params>
  <materials>
    <git url="https://github.com/moritz/python-matheval.git" dest="python-matheval" materialName="python-matheval" />
    <git url="https://github.com/moritz/deployment-utils.git" dest="deployment-utils" materialName="deployment-utils" />
  </materials>
</pipeline>

Getting Started with Testing, Finally

It is good practice and a good idea to cover business logic with unit tests.

Because the evaluation logic is split out into a separate function, it is easy to test that function in isolation. A typical approach is to feed some example inputs into the function and check that the return values are as expected.

# file test/test-evaluator.py
import unittest
from matheval.evaluator import math_eval

class EvaluatorTest(unittest.TestCase):
    def _check(self, tree, expected):
        self.assertEqual(math_eval(tree), expected)

    def test_basic(self):
        self._check(5, 5)
        self._check(['+', 5], 5)
        self._check(['+', 5, 7], 12)
        self._check(['*', ['+', 5, 4], 2], 18)

if __name__ == '__main__':
    unittest.main()

One can execute the test suite (here just one test file so far) with the nosetests command from the nose Python package:

$ nosetests
.
----------------------------------------------------------------------
Ran 1 test in 0.004s

OK

The Python way of exposing the test suite is to implement the test command in setup.py, which can be done with the line

test_suite='nose.collector',

in the setup() call in setup.py. And of course one needs to add nose to the list passed to the requires argument.
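
Put together, the setup() call might then look like this (a sketch based on the version shown above, with the two additions marked in comments):

from setuptools import setup

setup(name='matheval',
      version='1.0',
      description='Evaluation of expression trees',
      author='Moritz Lenz',
      author_email='moritz.lenz@gmail.com',
      url='https://deploybook.com/',
      package_dir={'': 'src'},
      requires=['flask', 'gunicorn', 'nose'],   # nose added here
      test_suite='nose.collector',              # exposes the test command
      packages=['matheval']
     )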

With these measures in place, the debhelper and dh-virtualenv tooling takes care of executing the test suite as part of the Debian package build. If any of the tests fail, so does the build.

Running the test suite in this way is advantageous, because it runs the tests with exactly the same versions of all involved Python libraries that end up in the Debian package and thus make up the runtime environment of the application. It is possible to achieve this through other means, but other approaches usually take much more work.

Conclusions

You should have enough unit tests to make you confident that the core logic of your application works correctly. It is a very easy and pragmatic solution to run the unit tests as part of the package build, ensuring that only "good" versions of your software are ever packaged and installed.

In future blog posts, other forms of testing will be explored.


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.

Header image by Tambako the Jaguar. Some rights reserved.