Packaging Ruby gems for Guix

22 April 2025

I want to replace rubygems, rbenv/rvm and bundler with guix. The reasons are relatively simple:

The problem of course is that the official guix repository does not (and probably cannot) package every gem in rubygems to their satisfaction. However, since the rubygems packaging system has a programmatic interface, it should be possible to interact with it from guix. Indeed, the guix import command does just this: given a rubygems package, it will attempt to generate a guile expression that enables guix to package it.

I took a very simple ruby gem, hola. The output of guix import gem hola is below.

(define-public ruby-hola
  (package
    (name "ruby-hola")
    (version "0.1.3")
    (source
     (origin
       (method url-fetch)
       (uri (rubygems-uri "hola" version))
       (sha256
        (base32 "0jknwmn92gdq2fa6nvl373g77x0p6z2mk742i1v43q7as78lpmqd"))))
    (build-system ruby-build-system)
    (synopsis "A simple hello world gem")
    (description "This package provides a simple hello world gem.")
    (home-page "http://rubygems.org/gems/hola")
    (license #f)))

The package form evaluates to a guix <package> object, and the various fields declared are documented in the guix manual, under Defining Packages.

There are symbols referred to here that do not merely name fields on the package: method, url-fetch, uri, sha256 and so on. Some of them are described in the linked manual page. I’m sure it is possible to figure out where they all come from, but the linked example shows the packaging of (gnu packages hello), and it’s possible to just figure out this case by analogy:

(define-module (gnu packages hello)
  #:use-module (guix packages)
  #:use-module (guix download)
  #:use-module (guix build-system gnu)
  #:use-module (guix licenses)
  #:use-module (gnu packages gawk))

(define-public hello
  (package
    (name "hello")
    (version "2.10")
    (source (origin
              (method url-fetch)
              (uri (string-append "mirror://gnu/hello/hello-" version
                                  ".tar.gz"))
              (sha256
               (base32
                "0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i"))))
    (build-system gnu-build-system)
    (arguments '(#:configure-flags '("--enable-silent-rules")))
    (inputs (list gawk))
    (synopsis "Hello, GNU world: An example GNU package")
    (description "Guess what GNU Hello prints!")
    (home-page "https://www.gnu.org/software/hello/")
    (license gpl3+)))

Probably gpl3+ comes from (guix licenses), probably package itself comes from (guix packages), probably the symbols used by the source field come from (guix download) and probably gnu-build-system comes from (guix build-system gnu). By simple analogy we arrive at the following recipe for ruby-hola, which happens to work:

(define-module (gnu packages ruby-hola)
  #:use-module (guix packages)
  #:use-module (guix download)
  #:use-module (guix build-system ruby)
  #:use-module (guix licenses))

(define-public ruby-hola
  (package
    (name "ruby-hola")
    (version "0.1.3")
    (source
     (origin
       (method url-fetch)
       (uri (rubygems-uri "hola" version))
       (sha256
        (base32 "0jknwmn92gdq2fa6nvl373g77x0p6z2mk742i1v43q7as78lpmqd"))))
    (build-system ruby-build-system)
    (synopsis "A simple hello world gem")
    (description "This package provides a simple hello world gem.")
    (home-page "http://rubygems.org/gems/hola")
    (license #f)))

ruby-hola

Neat! Well, does it work? We can find out by running guix build -f ruby-hola.scm. If you packaged the same version of hola as me, you should get an identical store path of /gnu/store/9rch71ivivd97c12a7xm9k2zng1hf4zz-ruby-hola-0.1.3. Unless something is wrong, the version of hola I built should be bit-for-bit identical to the one you built. Perhaps this is not so interesting – there are not many dependencies after all – but it should be true for even very complex packages too.

You can inspect this store path yourself, if you like. It contains all the code packaged in hola, which is not much.

/gnu/store/9rch71ivivd97c12a7xm9k2zng1hf4zz-ruby-hola-0.1.3
└── lib
    └── ruby
        └── vendor_ruby
            ├── build_info
            ├── cache
            ├── doc
            │   └── hola-0.1.3
            │       └── ri
            │           ├── cache.ri
            │           └── Hola
            │               ├── cdesc-Hola.ri
            │               ├── hi-c.ri
            │               └── Translator
            │                   ├── cdesc-Translator.ri
            │                   ├── hi-i.ri
            │                   └── new-c.ri
            ├── extensions
            ├── gems
            │   └── hola-0.1.3
            │       ├── bin
            │       │   └── hola
            │       ├── lib
            │       │   ├── hola
            │       │   │   └── translator.rb
            │       │   └── hola.rb
            │       ├── Rakefile
            │       └── test
            │           └── test_hola.rb
            ├── plugins
            └── specifications
                └── hola-0.1.3.gemspec

Cool, so we can see a hola.rb file and a hola/translator.rb file. A quick check shows that Hola.hi('english') should print hello world. How can we execute this code? Calling that in an irb session will certainly not work, because ruby does not know where to find the code. Normally we would lean on something like bundler to do some magic, then we can call require 'hola' and start to use all the constants defined inside. Here is a simple script which executes hola:

require 'hola'

puts Hola.hi('english')

The Environment, Purity, and Profiles

It turns out that there is very little magic happening at all. Just like the PATH environment variable on Linux systems declares where executables are installed, so you can run ruby rather than /usr/bin/ruby (or, rather, /home/user/.rbenv/shims/ruby, or something else spooky), there is a GEM_PATH environment variable which tells the ruby interpreter where to search for ruby code. To execute this code, therefore, we just need to set our GEM_PATH to an appropriate directory. I will save you the trouble of finding the correct directory:

GEM_PATH=/gnu/store/9rch71ivivd97c12a7xm9k2zng1hf4zz-ruby-hola-0.1.3/lib/ruby/vendor_ruby ruby hola.rb
hello world
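The same value can be assembled for several store items at once. Here is a minimal sketch, assuming every gem output follows the lib/ruby/vendor_ruby layout shown above; the helper name gem_path_for is made up for illustration and is not part of guix or rubygems.

```ruby
# Hypothetical helper: build a GEM_PATH value from a list of store paths,
# assuming each one has the lib/ruby/vendor_ruby layout shown above.
def gem_path_for(store_paths)
  store_paths
    .map { |path| File.join(path, 'lib', 'ruby', 'vendor_ruby') }
    .join(File::PATH_SEPARATOR)
end

hola = '/gnu/store/9rch71ivivd97c12a7xm9k2zng1hf4zz-ruby-hola-0.1.3'
puts gem_path_for([hola])
```

With more than one store path the entries are joined with the platform's path separator, which is where the length of the variable starts to grow.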

I think guix shell --search-paths is supposed to tell you what this is automatically, but I can’t see it. I also think you are going to hit env var size limits if you are not careful.

I have assumed, by the way, that you have a ruby interpreter available. I don’t think it really matters what version you have – anything from the last 10 years should be ok. I am running guix on a foreign distribution (debian), although this is not essential. We will come back to the problem of packaging ruby itself later.

It seems sort of problematic that I have to explicitly specify in GEM_PATH all the packages I have installed. In fact, for very big projects you would probably bump up against hard limits on the size of environment variables. What I really want is for all the gems to have their vendor_ruby directories unioned together in one nice directory which I can call my GEM_PATH. It turns out this is a guix feature already: what I want is a profile.

Profiles are how guix achieves “side-effects” on your running system. I like to imagine that a reproducible environment like this is derived from, essentially, one giant file that I control (maybe split up into modules). If it is really like this, then you can’t do convenient things, like apt install X to immediately install a package and have it available. If the system is really a pure function of its configuration, then any side effects must be captured in the configuration before they can be applied.

But guix allows you to install packages interactively, using guix package -i. So what’s happening? The packages you install this way are linked to your “current profile”, which is a directory in your $HOME, usually called $HOME/.guix-profile. This profile also contains an /etc/profile file which sets up environment variables - this allows it to put installed packages in your $PATH, as well as modify your GEM_PATH to point to $HOME/.guix-profile/lib/ruby/vendor_ruby. After installing a couple of gems, here is how my profile looks:

$ tree ~/.guix-profile
/home/daniel/.guix-profile
├── bin -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/bin
├── etc
│   └── profile
├── lib
│   └── ruby
│       └── vendor_ruby
│           ├── build_info
│           │   └── nokogiri-1.15.2.info -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/lib/ruby/vendor_ruby/build_info/nokogiri-1.15.2.info
│           ├── cache
│           ├── doc
│           │   ├── hola-0.1.2 -> /gnu/store/fh2l5bjxhsphrc99mh3649dw7l2asbhn-ruby-hola-0.1.2/lib/ruby/vendor_ruby/doc/hola-0.1.2
│           │   ├── mini_portile2-2.8.2 -> /gnu/store/7jlnvz74z8m6jsznfiv7cjmwxx9vg2jq-ruby-mini-portile-2.8.2/lib/ruby/vendor_ruby/doc/mini_portile2-2.8.2
│           │   ├── nokogiri-1.15.2 -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/lib/ruby/vendor_ruby/doc/nokogiri-1.15.2
│           │   └── pkg-config-1.2.5 -> /gnu/store/vx4xk41i6xsrjjmwis904620w6ds6qj1-ruby-pkg-config-1.2.5/lib/ruby/vendor_ruby/doc/pkg-config-1.2.5
│           ├── extensions
│           │   └── x86_64-linux -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/lib/ruby/vendor_ruby/extensions/x86_64-linux
│           ├── gems
│           │   ├── hola-0.1.2 -> /gnu/store/fh2l5bjxhsphrc99mh3649dw7l2asbhn-ruby-hola-0.1.2/lib/ruby/vendor_ruby/gems/hola-0.1.2
│           │   ├── mini_portile2-2.8.2 -> /gnu/store/7jlnvz74z8m6jsznfiv7cjmwxx9vg2jq-ruby-mini-portile-2.8.2/lib/ruby/vendor_ruby/gems/mini_portile2-2.8.2
│           │   ├── nokogiri-1.15.2 -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/lib/ruby/vendor_ruby/gems/nokogiri-1.15.2
│           │   └── pkg-config-1.2.5 -> /gnu/store/vx4xk41i6xsrjjmwis904620w6ds6qj1-ruby-pkg-config-1.2.5/lib/ruby/vendor_ruby/gems/pkg-config-1.2.5
│           ├── plugins
│           └── specifications
│               ├── hola-0.1.2.gemspec -> /gnu/store/fh2l5bjxhsphrc99mh3649dw7l2asbhn-ruby-hola-0.1.2/lib/ruby/vendor_ruby/specifications/hola-0.1.2.gemspec
│               ├── mini_portile2-2.8.2.gemspec -> /gnu/store/7jlnvz74z8m6jsznfiv7cjmwxx9vg2jq-ruby-mini-portile-2.8.2/lib/ruby/vendor_ruby/specifications/mini_portile2-2.8.2.gemspec
│               ├── nokogiri-1.15.2.gemspec -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/lib/ruby/vendor_ruby/specifications/nokogiri-1.15.2.gemspec
│               └── pkg-config-1.2.5.gemspec -> /gnu/store/vx4xk41i6xsrjjmwis904620w6ds6qj1-ruby-pkg-config-1.2.5/lib/ruby/vendor_ruby/specifications/pkg-config-1.2.5.gemspec
├── manifest
└── share
    ├── doc
    │   ├── ruby-mini-portile-2.8.2 -> /gnu/store/7jlnvz74z8m6jsznfiv7cjmwxx9vg2jq-ruby-mini-portile-2.8.2/share/doc/ruby-mini-portile-2.8.2
    │   └── ruby-nokogiri-1.15.2 -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/share/doc/ruby-nokogiri-1.15.2
    ├── emacs -> /gnu/store/1h8hi5pz77pprq8a7k22npzwyj8jfs8s-emacs-subdirs/share/emacs
    └── info -> /gnu/store/2rpcj1sgmpbq58wbz7qcls892ws0f3y7-info-dir/share/info

28 directories, 7 files

I am now a bit puzzled about how profiles, which you create interactively with guix package -i, relate to shells, which you create with guix shell. Here’s the manual for each:

So what do we learn?

The purpose of guix shell is to make it easy to create one-off software environments, without changing one’s profile. It is typically used to create development environments; it is also a convenient way to run applications without “polluting” your profile.

This doesn’t tell me much. I don’t see why I shouldn’t use a profile for a development environment, or a shell for my entire machine.

Both profiles and shells can be declaratively specified by a manifest file,

--manifest=file; -m file

Create an environment for the packages contained in the manifest object returned by the Scheme code in file. This option can be repeated several times, in which case the manifests are concatenated.

--manifest=file; -m file

Create a new generation of the profile from the manifest object returned by the Scheme code in file. This option can be repeated several times, in which case the manifests are concatenated.

This allows you to declare the profile’s contents rather than constructing it through a sequence of --install and similar commands. The advantage is that file can be put under version control, copied to different machines to reproduce the same profile, and so on.

They both also support generating a manifest file from an imperative command, like

# for shells
guix shell --export-manifest -D guile git emacs emacs-geiser emacs-geiser-guile
# for profiles
guix package --export-manifest -i guile git emacs emacs-geiser emacs-geiser-guile

It turns out guix package --search-paths will only print the environment variables your profile needs in order to function.

Profiles are persistent. You can --list-profiles; --roll-back one version or --switch-generation to any previous version; --delete-generations (giving either a specific generation to delete, or none to clear out all old ones). If a profile exists, then packages installed in it will never be garbage-collected. By contrast, a shell is ephemeral – it can only be persisted via a manifest file, and can be garbage-collected at any time.

On reflection, I guess a shell and a profile are basically just two different pieces of globally referenced state, with profiles being slightly more “global” than shells. With a guix shell, every command within the shell refers to the state of all installed packages. Other shells do not see it, so if you have multiple shells then you’re going to have to set them up every time.

For something like a development environment I really don’t think there is much difference which one you pick. I think for services (say, a database server), running them in a shell (or better, guix shell --container) is very sensible. But most programming work probably is better suited to a profile.

Other versions

Ok - we managed to hack together one version of hola. I immediately wonder whether we can get every version of it (since, who knows, maybe my project requires an older version?). guix import is robust to this; you just have to write guix import gem hola@0.1.2 and the rest of the previous section applies verbatim. But how can you detect what versions are available? Well, I don’t know if guix can do this yet. The ruby gem command is able to do it, and the results could probably be parsed by a little script:

$ gem list ^hola$ --remote --all

*** REMOTE GEMS ***

hola (0.1.3, 0.1.2, 0.1.1, 0.1.0, 0.0.33, 0.0.31, 0.0.30, 0.0.1, 0.0.0)

Version numbers do not include commas, so this should be pretty easy to parse actually.

#!/usr/bin/env ruby

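ARGV.each do |gemname|
  # the last line of output looks like "hola (0.1.3, 0.1.2, ...)"
  versions = %x{gem list ^#{gemname}$ --remote --all}
    .lines
    .last
    .match(/\(.*\)/)[0]
    .slice(1..-2)
    .split(', ')
  # print one version per line
  puts versions
end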

Calling the above script with gem names on the command line will print all versions of the listed gems, each on its own line.

What if we don’t want to use gem? After all, we are planning to have it uninstalled in the long run. Well, you can just do the same network requests that gem is doing under the hood. I can’t really tell what gem is doing, but guix import refers to a rubygems-uri – if we can find out what that is then it’s probably easy to understand. guix is much more introspective, so we can just check in the REPL:

scheme@(guile-user)> (use-modules (guix build-system ruby))
scheme@(guile-user)> rubygems-uri
$1 = #<procedure rubygems-uri (name version)>
scheme@(guile-user)> (rubygems-uri "foobar" "0.1.2")
$2 = "https://rubygems.org/downloads/foobar-0.1.2.gem"
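Judging by that REPL output, the URL is just the gem name and version joined onto a fixed prefix. A Ruby counterpart might look like the sketch below; the function name rubygems_uri is borrowed for illustration and the URL pattern is inferred from the single example above, not from the guix source.

```ruby
# Hypothetical Ruby counterpart of guix's rubygems-uri, with the URL pattern
# inferred from the REPL output above.
def rubygems_uri(name, version)
  "https://rubygems.org/downloads/#{name}-#{version}.gem"
end

puts rubygems_uri('hola', '0.1.3')
```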

Devastatingly, if you visit a shortened form of that URL (say /downloads/hola, or just /downloads), the server does not show you any nice interface to traverse. You could write a little robot that scrapes the page, but this is going beyond what I can be bothered to do.

In fact if you plan on doing a lot of this, you can mirror rubygems locally using the rubygems-mirror tool. If you mirror only the latest gems then it takes about 50GB of storage to host the entire mirror. I think if you mirror every version it will probably be an order of magnitude larger. If you have a mirror locally, then it is a matter of simple directory traversal to enumerate all the possible versions.

It turns out however that there is an API call for precisely this request, which returns a JSON response. There are two relevant requests,

# for listing versions
$ curl https://rubygems.org/api/v1/versions/hola.json | ruby -r json -e "puts JSON.parse(STDIN.read).map { |x| x['number'] }"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4764  100  4764    0     0    409      0  0:00:11  0:00:11 --:--:--  1077
0.1.3
0.1.2
0.1.1
0.1.0
0.0.33
0.0.31
0.0.30
0.0.1
0.0.0

# for fetching the changes to rubygems between two timestamps
$ curl 'https://rubygems.org/api/v1/timeframe_versions.json?from=2025-04-01T00:00:00Z&to=2025-04-03T00:00:00Z' | jq | head
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 61206  100 61206    0     0  24426      0  0:00:02  0:00:02 --:--:-- 24433
[
  {
    "name": "diagram",
    "downloads": 4407,
    "version": "0.3.2",
    "version_created_at": "2025-04-01T00:15:43.711Z",
    "version_downloads": 80,
    "platform": "ruby",
    "authors": "Abdelkader Boudih",
    "info": "Work with diagrams in Ruby",
    ...

It’s easy to see how these two could be combined to build an archival tool.
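As a rough sketch of that combination: seed a table of known versions from the first endpoint, then fold in whatever the second endpoint reports. The sample bodies below are trimmed stand-ins for real responses (only the fields used are included), the function names are made up, and no network request is made.

```ruby
require 'json'

# Version numbers from a /api/v1/versions/<gem>.json response body.
def version_numbers(versions_json)
  JSON.parse(versions_json).map { |entry| entry['number'] }
end

# [name, version] pairs from a /api/v1/timeframe_versions.json response body.
def released_pairs(timeframe_json)
  JSON.parse(timeframe_json).map { |entry| [entry['name'], entry['version']] }
end

# Trimmed sample bodies standing in for real responses (an assumption).
versions_body  = '[{"number": "0.1.3"}, {"number": "0.1.2"}]'
timeframe_body = '[{"name": "diagram", "version": "0.3.2"}]'

# Seed the table for one gem, then add every release seen in the time frame.
known = { 'hola' => version_numbers(versions_body) }
released_pairs(timeframe_body).each do |name, version|
  (known[name] ||= []) << version
end

p known
```

A real tool would persist the table and remember the last timestamp it asked for, so that each run only requests the changes since the previous one.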

Version pinning, lockfiles, and building Ruby

The major pieces of software that I would like to do away with are rvm/rbenv and bundler. Ideally I would be able to manage my entire development environment using guix. So far we’ve seen how to package a specific version of a gem from rubygems, how to load it into the environment, and how to query for all available versions of a gem from the rubygems repository. This is probably enough to be able to create a guix repo which mirrors Rubygems.

When you use something like rvm to manage Ruby, you specify a version to build and then the tool ensures that the given version of Ruby is what’s available in your environment. Guix doesn’t package every version of Ruby, so we will have to build it ourselves. But that doesn’t mean that it has to be a big job: assuming most versions of Ruby are built more-or-less the same, we should be able to re-use the existing package definitions available in guix.

A quick search for guix packages ruby gives a reference, from packages.guix.gnu.org, to gnu/packages/ruby.scm. In fact, you can hack the guix edit command to print the location of your current ruby like so:

$ EDITOR="echo" guix edit ruby
+281 /gnu/store/2xyr94xw06qrvcmfp5krh5kcic6wsvpc-guix-module-union/share/guile/site/3.0/gnu/packages/ruby.scm

The ruby module makes extensive use of guix’s inheritance feature (see https://www.futurile.net/2024/01/12/modifying-guix-packages-using-inheritance/), which we can also copy in our own definition. I chose to build ruby-3.3.0, since it isn’t in the upstream repository, and I work on a project that uses it.

(define-module (gnu packages my-ruby)
               #:use-module (guix packages)
               #:use-module (guix download)
               #:use-module (guix utils)
               #:use-module (gnu packages ruby)
               #:use-module (gnu packages serialization))

(define-public ruby-3.3.0
  (package
    (inherit ruby-2.7)
    (version "3.3.0")
    (source
     (origin
       (method url-fetch)
       (uri (string-append "http://cache.ruby-lang.org/pub/ruby/"
                           (version-major+minor version)
                           "/ruby-" version ".tar.xz"))
       (sha256
        (base32
         "0nwpgf27i43yd8ccsk838n86n9xki68hayxmhbwr0zk3dsinasv7"))))
    (inputs (modify-inputs (package-inputs ruby-2.7)
              (append libyaml)))))

ruby-3.3.0

To build and run this package definition, run respectively guix build -f my-ruby.scm or guix shell -f my-ruby.scm. You don’t have to run build before shell - it will be done automatically if necessary (and the results will be cached in /gnu/store).

$ ruby -v
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux-gnu]
$ guix shell -f my-ruby.scm
...
$ ruby -v
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-linux]

The only parts I had to change (that weren’t obvious) were:

When a build fails, guix build will print the path of a .drv.gz file. This is a gzipped plain-text file which logs whatever the build printed up to its failure. In my case, it was very clear that psych could not be built, and that the error had something to do with libyaml. In general, builds can fail for all kinds of interesting reasons, but in this case this is a well-known cause of Ruby build failures, which was easy to search for – and nothing to do with guix. Incidentally, if I had inherited from one of the Ruby 3 major versions, I probably would not have encountered this issue.

On reflection, I don’t think inheritance is the right way to solve this problem. If we’re going to replicate rubygems, which contains something like 200,000 gems, any changes we make had better be completely automated. That doesn’t imply that we should have a lot of code duplication, writing one huge program that does guix import gem a million times. But it sounds tricky to write code that correctly detects when inheritance works, when I could just write code that removes the duplication directly.

I think the ideal would be something like:

Guix doesn’t support dynamic channels that generate packages on demand, as far as I know. It would be possible to write a guile script that does this, but it seems nicer for users if they can just refer to a channel. It’s probably worthwhile writing a local database cache, consisting of the scraped JSON from rubygems. This should make it much quicker to iterate on the channel locally.

It seems like the data we would need in order to generate the repository is basically one list: the set of all gems and gem versions currently existing in rubygems. The simplest (although certainly not the most efficient) way to get a seed value for this is to run the rubygems-mirror; you can load the specs.4.8 file like so:

irb(main):010:0> data = Marshal.load(File.binread('specs.4.8')); nil
irb(main):011:0> data.first(100)
=> 
[["_", Gem::Version.new("1.0"), "ruby"],              
 ["_", Gem::Version.new("1.1"), "ruby"],              
 ["_", Gem::Version.new("1.2"), "ruby"],              
 ["_", Gem::Version.new("1.3"), "ruby"],              
 ["_", Gem::Version.new("1.4"), "ruby"],              
 ["-", Gem::Version.new("1"), "ruby"],                
 ["0mq", Gem::Version.new("0.1.0"), "ruby"],          
 ["0mq", Gem::Version.new("0.1.1"), "ruby"],          
 ["0mq", Gem::Version.new("0.1.2"), "ruby"],          
 ["0mq", Gem::Version.new("0.2.0"), "ruby"],          
 ["0mq", Gem::Version.new("0.2.1"), "ruby"],          
 ["0mq", Gem::Version.new("0.3.0"), "ruby"],          
 ["0mq", Gem::Version.new("0.4.0"), "ruby"],          
 ["0mq", Gem::Version.new("0.4.1"), "ruby"],          
 ["0mq", Gem::Version.new("0.5.0"), "ruby"],
...

Hang on… where did it get that file? A quick search shows it is publicly available at https://rubygems.org/latest_specs.4.8.gz. I have absolutely no idea what that version string means.
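To turn those triples into something easier to work with, they can be grouped by gem name. This is a sketch under one assumption: that the .gz download is simply the gzip-compressed form of the Marshal dump loaded above. It runs on a tiny in-memory index rather than a real download, and versions_by_gem is a made-up name.

```ruby
require 'rubygems'
require 'zlib'

# Decode a gzipped specs index into { gem name => sorted versions }.
# Marshal.load should only ever be run on data from a source you trust.
def versions_by_gem(gzipped_index)
  triples = Marshal.load(Zlib.gunzip(gzipped_index))
  triples
    .group_by { |name, _version, _platform| name }
    .transform_values { |rows| rows.map { |_, version, _| version }.uniq.sort }
end

# Tiny in-memory index standing in for the real file.
sample = [
  ['0mq', Gem::Version.new('0.1.1'), 'ruby'],
  ['0mq', Gem::Version.new('0.1.0'), 'ruby'],
  ['_',   Gem::Version.new('1.0'),   'ruby'],
]

index = versions_by_gem(Zlib.gzip(Marshal.dump(sample)))
puts index['0mq'].map(&:to_s).join(', ')
```

The platform column is ignored here; a gem with native extensions may list the same version once per platform, which is why the versions are de-duplicated.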

Given a specfile like this, the only remaining thing we need is a script which identifies all the gem versions, scrapes the JSON from the rubygems API, converts them into guix package definitions, and organises them into modules that go into my custom channel. The right thing to do is probably just to modify the guix import gem script (either by hacking on guix directly or finding a way to sub it out for my own script).

The task I want to write is not exactly like guix import gem for two reasons.

There are also some deficiencies in guix import gem I found, like it always tries to run tests (even if there are none). It also does not reliably detect the gem license. I am starting to think I may need all the information in the gem in order to build the package definition – so maybe rubygems-mirror really is the simplest way to go.

$ guix import gem base64 | grep license
    (license (list #f #f))))

The correct answer is (license (list license:ruby license:bsd-2))).

The ruby .gem archive format

I should explain a little about what data are available to us. Clearly in order to build any kind of package specification, I need the following for each version of the given gem:

When you download a gem, rubygems gives you a file with the extension .gem. This might seem intimidating, but if you open it you will find that it is just a tarball (https://en.wikipedia.org/wiki/tar_(computing)) containing three files, each of which is gzip-compressed: metadata.gz, data.tar.gz and checksums.yaml.gz. Decompressed, the metadata starts like this:

--- !ruby/object:Gem::Specification 
name: 0mq
version: !ruby/object:Gem::Version 
  version: 0.5.3
...

This looks like YAML, but there is a special tag annotation (starting with !ruby/object:) that denotes a Ruby class. I happen to recognise this – I think you can serialize pretty much arbitrary (tree-like) objects to YAML in this way. I suppose it’s a bit like s-expression syntax in that way, but I never thought of it like that before.
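To see the mechanism in isolation, here is a small round trip with a throwaway class. The Translator class below is hypothetical and has nothing to do with the one inside hola; it only exists to show the tag being written and read back.

```ruby
require 'yaml'

# Hypothetical class, only here to show the tagging mechanism.
class Translator
  attr_accessor :language, :greeting
end

t = Translator.new
t.language = 'english'
t.greeting = 'hello world'

# Dumping an ordinary object writes a !ruby/object:<ClassName> tag.
text = YAML.dump(t)
puts text

# Loading it back requires the class to be explicitly permitted.
copy = YAML.safe_load(text, permitted_classes: [Translator])
puts copy.greeting
```

The explicit permitted_classes list is the same idea as the class list a gemspec needs when its metadata is loaded.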

Anyway, we seem to need each of these files in order to build a really robust package definition. And it also seems like it should be pretty easy to write some Ruby code that takes the .gem archive as input, and outputs a package definition. So let’s try!

The script I came up with looks like the following. All it does is, given a list of gem archives on the command line (you are assumed to have the gem locally), parse each of the files into a nice format and feed them into an instance of GuixPackage.

I originally thought the checksums file would be needed, but it turned out to not always be present (and therefore not particularly helpful). I just computed the hash locally in that case - although that is pointless unless rubygems-mirror computes checksums itself (and I don’t think it does).

#!/usr/bin/env ruby

require 'rubygems/package'
require 'zlib'
require 'stringio'
require 'yaml'
require 'digest'

GEMSPEC_YAML_CLASSES = [::Gem::Specification, ::Gem::Version, Time, ::Gem::Dependency, ::Gem::Requirement, Symbol ]

ARGV.each do |gem_path|
  File.open gem_path do |f|
    # copied from https://weblog.jamisbuck.org/2015/7/23/tar-gz-in-ruby.html
    Gem::Package::TarReader.new(f) do |tar|
      pkg = GuixPackage.new
      # this is stupid unless rubygems-mirror checksums while downloading.
      # verify that it does!
      # base64digest, because the template emits the hash inside a (base64 ...) form
      pkg.hash = Digest::SHA512.base64digest(File.binread(gem_path))
      tar.each do |listing|
        case listing.full_name
        when 'checksums.yaml.gz'
          pkg.checksums = YAML.load(Zlib::GzipReader.new(listing).read)
        when 'data.tar.gz'
          pkg.data = Gem::Package::TarReader.new(Zlib::GzipReader.wrap(listing)).map { |l| [l.header.name, l.read] }.to_h
        when 'metadata.gz'
          # c.f. https://github.com/jordansissel/fpm/pull/1898
          yaml_text = Zlib::GzipReader.new(listing).read
          if ::Gem::Version.new(RUBY_VERSION) >= ::Gem::Version.new("3.1.0")
            # Ruby 3.1.0 switched to a Psych/YAML version that defaults to "safe" loading
            # and unfortunately `gem specification --yaml` emits YAML that requires
            # class loaders to process correctly
            pkg.metadata = YAML.load(yaml_text, :permitted_classes => GEMSPEC_YAML_CLASSES, :aliases => true)
          else
            # Older versions of ruby call this method YAML.safe_load
            # I haven't actually tested this -- it may require aliases: true also.
            # In any case unsafe_load ought to work.
            # pkg.metadata = YAML.safe_load(yaml_text, GEMSPEC_YAML_CLASSES)
          end
        else
          raise "Unrecognized file #{listing.full_name}"
        end
      end
      puts pkg.template
    end
  end
end

The GuixPackage class is responsible for converting the package data and metadata into an actual package form. As you can see, most of the information is derived from the ruby package’s “metadata”, although for some of it we have to peek at the files it contains. I don’t really believe the tests line works with guix – particularly since I don’t see how development dependencies factor into guix’s packaging. We could make them into build dependencies, but that’s not really what they are.

class GuixPackage
  attr_writer :hash, :checksums, :data, :metadata

  def template
    <<-LISP
      (package
        (name "ruby-#{gem_name}")
        (version "#{version}")
        (source
         (origin
           (method url-fetch)
           (uri (rubygems-uri "#{gem_name}" version))
           (hash (content-hash
            (base64 "#{@hash}") sha512))))
        (build-system ruby-build-system)
        (arguments (list #:tests? #{tests_included? ? "#t" : "#f"}))
        (propagated-inputs (list #{runtime_dependencies.map { |d| "ruby-#{d.name}" }.join(' ')}))
        (synopsis "#{@metadata.description}")
        (description "#{@metadata.description}")
        (home-page "#{@metadata.homepage}")
        (license #{license_symbols}))
    LISP
  end

  def gem_name
    @metadata.name
  end

  def version
    @metadata.version
  end

  def tests_included?
    return false
    # something with test_files?
    @data.any? { |name, contents| name == 'Rakefile' }
  end

  # not sure where to put dependencies of type :development, or even if there are such deps.
  def runtime_dependencies
    @metadata.dependencies.select { |d| d.type == :runtime }
  end

  def license_symbols
    case @metadata.licenses.count
    when 0
      "#f"
    when 1
      lookup_license_symbol(@metadata.licenses[0])
    else
      "(list #{@metadata.licenses.map { |l| lookup_license_symbol(l) }.join(' ')})"
    end
  end

  private

  def lookup_license_symbol(ruby_name)
    'license:' + {
      'MIT' => 'expat',
      'Ruby' => 'ruby',
      'BSD-2-Clause' => 'bsd-2',
      'BSD-3-Clause' => 'bsd-3',
    }.fetch(ruby_name)
  end
end

The major error here is in the license detection. Rubygems appears to allow arbitrary text in the license metadata, and while some people use appropriate SPDX identifiers, there is a lot of junk like You want it? It's yours and nobody should use this thing right now. There are even some really hard to parse examples like Ruby's or GPLv2 or later. You can find the list of licenses guix recognises by cloning the source code and looking in the file guix/licenses.scm.
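One way to stop junk values from aborting a whole run is to fall back to #f and warn, rather than raising as fetch does. The sketch below reuses the four mappings from the class above; the name tolerant_license_symbol and the choice to emit #f are assumptions of mine, not something guix import gem does.

```ruby
# Known license names and their guix symbols, copied from the table above.
KNOWN_LICENSES = {
  'MIT' => 'expat',
  'Ruby' => 'ruby',
  'BSD-2-Clause' => 'bsd-2',
  'BSD-3-Clause' => 'bsd-3',
}.freeze

# Hypothetical tolerant variant of lookup_license_symbol: returns "#f"
# (and warns on stderr) instead of raising on an unknown name.
def tolerant_license_symbol(ruby_name)
  symbol = KNOWN_LICENSES[ruby_name.to_s.strip]
  return "license:#{symbol}" if symbol

  warn "unrecognised license #{ruby_name.inspect}, emitting #f"
  '#f'
end

puts tolerant_license_symbol('MIT')
puts tolerant_license_symbol("You want it? It's yours")
```

The warnings double as a to-do list: every name that shows up there is either a mapping still to be added or a gem that needs a human to look at it.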

There are some other hacks. I couldn’t get tests to work, since rake isn’t pulled in as a dependency. We could bring test dependencies in, but it’s hard to distinguish them from development dependencies (and I think Gemfile.lock won’t pin them anyway).
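For what it is worth, the gemspec does record a type for each dependency, so splitting them is mechanical even if deciding what to do with them is not. A sketch, assuming development dependencies were to be mapped onto native-inputs (which, as noted, is not really what they are); partition_dependencies is a made-up helper.

```ruby
require 'rubygems'

# Split a gemspec's dependencies by type. Mapping :development onto
# native-inputs is an assumption, not something the script above does.
def partition_dependencies(dependencies)
  runtime, development = dependencies.partition { |d| d.type == :runtime }
  {
    'propagated-inputs' => runtime.map { |d| "ruby-#{d.name}" },
    'native-inputs' => development.map { |d| "ruby-#{d.name}" },
  }
end

deps = [
  Gem::Dependency.new('rack', '>= 3.0', :runtime),
  Gem::Dependency.new('rake', '>= 0', :development),
]

p partition_dependencies(deps)
```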

But it does work, and for me is more customisable than what guix import gem will produce. For instance we can do ./parse_gem ~/.cache/rubygems-mirror/gems/sinatra-4.1.1.gem and get this nice output:

  (package
    (name "ruby-sinatra")
    (version "4.1.1")
    (source
     (origin
       (method url-fetch)
       (uri (rubygems-uri "sinatra" version))
       (hash (content-hash
        (base64 "v2x1ovXjjPuEQvSBxJHvsTLb83oRTo7jWk0k3W87iC+WejxULLr9yvliQrOfXXi3cnGtxWKi7gArh5CnPJmn6Q==") sha512))))
    (build-system ruby-build-system)
    (arguments (list #:tests? #f))
    (propagated-inputs (list ruby-logger ruby-mustermann ruby-rack ruby-rack-protection ruby-rack-session ruby-tilt))
    (synopsis "Sinatra is a DSL for quickly creating web applications in Ruby with minimal effort.")
    (description "Sinatra is a DSL for quickly creating web applications in Ruby with minimal effort.")
    (home-page "http://sinatrarb.com/")
    (license license:expat))

Building a channel

Now that we can build a single GuixPackage using ruby code, we can start to write proper channel definitions. Recall that the major problem we’re trying to solve is that our channel definitions are going to have to declare their imports, and in order to do that we need to be able to call methods on the GuixPackage object. We can actually drop that template from earlier – it’s served its purpose.

With a couple of tiny changes to our script, it now generates a list of guix packages (which we can then group together). We could call it with e.g. ./parse_gem.rb $(find ~/.cache/rubygems-mirror/gems/ -regextype sed -regex ".*/sinatra-[0-9.]*.gem").

# was ARGV.each do |gem_path|
guix_packages = ARGV.map do |gem_path|
  pkg = GuixPackage.new
  ...
  File.open gem_path do |f|
    Gem::Package::TarReader.new(f) do |tar|
      ...
      # was puts pkg.template
    end
  end
  pkg
end

group = GuixPackageGroup.new(guix_packages.compact)
puts group.template
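The grouping step itself isn't shown here. Judging by the module names in the manifests later on, a natural grouping is one module per gem name, holding every version of that gem. A minimal sketch of that idea, working only from file names; parse_gem_filename and group_by_gem_name are hypothetical helpers, and platform-specific gems (with suffixes like -java) are deliberately skipped:

```ruby
require 'rubygems'

# Hypothetical helper: split "name-version.gem" into [name, Gem::Version].
# Returns nil for anything that does not look like a plain gem file.
def parse_gem_filename(path)
  match = File.basename(path).match(/\A(.+)-(\d[0-9A-Za-z.]*)\.gem\z/)
  match && [match[1], Gem::Version.new(match[2])]
end

# Groups gem files by gem name, with each gem's versions sorted ascending.
def group_by_gem_name(paths)
  paths.filter_map { |p| parse_gem_filename(p) }
       .group_by(&:first)
       .transform_values { |pairs| pairs.map(&:last).sort }
end

paths = %w[
  gems/sinatra-4.1.1.gem
  gems/sinatra-4.0.0.gem
  gems/rack-protection-4.1.1.gem
  gems/notes.txt
]

group_by_gem_name(paths).each do |name, versions|
  puts "#{name}: #{versions.join(', ')}"
end
```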

Parsing lockfiles, and building Sinatra

The script above may well be enough to generate a rubygems channel. Probably not - but if we assume it is, and try to build something nontrivial, we should be able to see where it falls apart. A natural place to start is by taking a test case - something simple enough to work, but complicated enough to be interesting.

I had a false start with building a rails new project. So, to keep things relatively easy, I decided to build a sinatra project instead. The initial Gemfile and Gemfile.lock are quite small:

# frozen_string_literal: true

source "https://rubygems.org"

gem "sinatra"
gem "rackup"
gem "puma"

GEM
  remote: https://rubygems.org/
  specs:
    base64 (0.2.0)
    logger (1.6.5)
    mustermann (3.0.3)
      ruby2_keywords (~> 0.0.1)
    nio4r (2.7.4)
    puma (6.5.0)
      nio4r (~> 2.0)
    rack (3.1.8)
    rack-protection (4.1.1)
      base64 (>= 0.1.0)
      logger (>= 1.6.0)
      rack (>= 3.0.0, < 4)
    rack-session (2.1.0)
      base64 (>= 0.1.0)
      rack (>= 3.0.0)
    rackup (2.2.1)
      rack (>= 3)
    ruby2_keywords (0.0.5)
    sinatra (4.1.1)
      logger (>= 1.6.0)
      mustermann (~> 3.0)
      rack (>= 3.0.0, < 4)
      rack-protection (= 4.1.1)
      rack-session (>= 2.0.0, < 3)
      tilt (~> 2.0)
    tilt (2.6.0)

PLATFORMS
  ruby
  x86_64-linux

DEPENDENCIES
  puma
  rackup
  sinatra

BUNDLED WITH
   2.5.9

Luckily, we don’t actually have to care about the syntax of this file. Bundler is bundled (ha) with some code that will parse this file (after all, it must do, in order to install the gems). Here is the magic spell for parsing it:

#!/usr/bin/env ruby

require 'bundler'
# thank you https://stackoverflow.com/a/40098825/4681998
lockfile = Bundler::LockfileParser.new(Bundler.read_file(ARGV[0]))

all_specs = lockfile.specs.map { |s| [s.name, s.version] }
direct_specs = lockfile.dependencies.map(&:first)

puts all_specs.map { |name, version| "#{name}-#{version}" }
puts direct_specs

$ ./parse_lockfile.rb Gemfile.lock 
base64-0.2.0
logger-1.6.5
mustermann-3.0.3
nio4r-2.7.4
puma-6.5.0
rack-3.1.8
rack-protection-4.1.1
rack-session-2.1.0
rackup-2.2.1
ruby2_keywords-0.0.5
sinatra-4.1.1
tilt-2.6.0
puma
rackup
sinatra

You can use this script to call parse_gem.rb, and build up a huge self-contained manifest file. That would look something like this:

puts "(use-modules ((guix packages)) ((guix download)) ((guix build-system ruby)) ((guix licenses) #:prefix license:) (gnu packages ruby))"
puts "(use-modules (guix transformations))"
all_specs.each do |name,version|
  puts "(define ruby-#{name}"
  puts %x{../parse_gem/parse_gem.rb /home/daniel/.cache/rubygems-mirror/gems/#{name}-#{version}.gem}
  puts ")"
end

puts "(define transform1
  (options->transformation
    '((with-input . \"ruby=ruby@3.0.6\"))))"

puts "(packages->manifest
  (map transform1 (list ruby #{direct_specs.map { |n| "ruby-#{n}" }.join(' ')})))"

However, with a proper channel defined we should be able to just build a manifest by loading the right modules:

#!/usr/bin/env ruby

require 'bundler'
# thank you https://stackoverflow.com/a/40098825/4681998
lockfile = Bundler::LockfileParser.new(Bundler.read_file(ARGV[0]))

all_specs = lockfile.specs.map { |s| [s.name, s.version] }
direct_specs = lockfile.dependencies.map(&:first)
ruby_version = "3.3.3" # maybe could be read from the lockfile

# we have to only #:select ruby because (gnu packages ruby) defines some gems, and we don't want them to shadow us
puts <<-LISP
(use-modules #{direct_specs.map { |s| "(guix-rubygems #{s})"}.join(' ')})
(use-modules ((gnu packages ruby) #:select (ruby)))

(use-modules (guix transformations))

(define transform1
  (options->transformation
    `((with-input . \"ruby=ruby@#{ruby_version}\")
      #{all_specs.map { |name, version| %Q|(with-input . "ruby-#{name}=ruby-#{name}@#{version}")|}.join("\n")})))

(packages->manifest
  (map transform1 (list ruby #{direct_specs.map { |n| "ruby-#{n}" }.join(' ')})))
LISP
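On the hard-coded ruby_version: a lockfile only records the Ruby version when the Gemfile declares one, in an optional RUBY VERSION section. The sinatra lockfile above has no such section, so a fallback is still needed. A sketch of reading it with a plain regex; lockfile_ruby_version is a hypothetical helper and the sample lockfile text is invented. Bundler's parser may expose the same value directly, but this sketch assumes nothing beyond the file format.

```ruby
# Hypothetical helper: pull the version out of a lockfile's optional
# "RUBY VERSION" section, returning nil when the section is absent.
def lockfile_ruby_version(lockfile_text)
  match = lockfile_text.match(/^RUBY VERSION\n\s+ruby (\d+\.\d+\.\d+)/)
  match && match[1]
end

# Invented lockfile fragments, just to exercise both branches.
with_section = <<~LOCK
  DEPENDENCIES
    sinatra

  RUBY VERSION
     ruby 3.3.3p89

  BUNDLED WITH
     2.5.9
LOCK

without_section = <<~LOCK
  DEPENDENCIES
    sinatra

  BUNDLED WITH
     2.5.9
LOCK

puts lockfile_ruby_version(with_section) || '3.3.3'
puts lockfile_ruby_version(without_section) || '3.3.3 (fallback)'
```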

Running the script on the sinatra lockfile from earlier dutifully prints out the following manifest:

$ ./parse_lockfile_for_channel.rb Gemfile.lock 
(use-modules (guix-rubygems puma) (guix-rubygems rackup) (guix-rubygems sinatra))
(use-modules ((gnu packages ruby) #:select (ruby)))

(use-modules (guix transformations))

(define transform1
  (options->transformation
    `((with-input . "ruby=ruby@3.3.3")
      (with-input . "ruby-base64=ruby-base64@0.2.0")
(with-input . "ruby-logger=ruby-logger@1.6.5")
(with-input . "ruby-mustermann=ruby-mustermann@3.0.3")
(with-input . "ruby-nio4r=ruby-nio4r@2.7.4")
(with-input . "ruby-puma=ruby-puma@6.5.0")
(with-input . "ruby-rack=ruby-rack@3.1.8")
(with-input . "ruby-rack-protection=ruby-rack-protection@4.1.1")
(with-input . "ruby-rack-session=ruby-rack-session@2.1.0")
(with-input . "ruby-rackup=ruby-rackup@2.2.1")
(with-input . "ruby-ruby2_keywords=ruby-ruby2_keywords@0.0.5")
(with-input . "ruby-sinatra=ruby-sinatra@4.1.1")
(with-input . "ruby-tilt=ruby-tilt@2.6.0"))))

(packages->manifest
  (map transform1 (list ruby ruby-puma ruby-rackup ruby-sinatra)))
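The uneven indentation in that output comes from joining the interpolated lines with a bare newline. Guile doesn't care, but if you want the manifest to be readable, one option is to build the option list separately and join with the indent included. A sketch, with transformation_options as a hypothetical helper name:

```ruby
# Sketch: build every (with-input . "...") pair first, then join them with a
# newline plus the indent, so each continuation line lines up with the first.
def transformation_options(ruby_version, all_specs, indent: 6)
  options = [%Q|(with-input . "ruby=ruby@#{ruby_version}")|]
  options += all_specs.map do |name, version|
    %Q|(with-input . "ruby-#{name}=ruby-#{name}@#{version}")|
  end
  options.join("\n" + ' ' * indent)
end

specs = [['base64', '0.2.0'], ['tilt', '2.6.0']]
# Four spaces, a backtick and a paren put the first option at column six.
puts "    `(#{transformation_options('3.3.3', specs)})"
```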

Now we just load this manifest and…

$ guix shell --pure -L ~/guix-rubygems -m manifest.scm -- ruby app.rb
guix shell: warning: failed to load '(guix-rubygems gem_plugin)':
no code for module (guix-rubygems rake)
== Sinatra (v4.1.1) has taken the stage on 4567 for development with backup from Puma
Puma starting in single mode...
* Puma version: 6.5.0 ("Sky's Version")
* Ruby version: ruby 3.3.3 (2024-06-12 revision f1c7b6f435) [x86_64-linux]
*  Min threads: 0
*  Max threads: 5
*  Environment: development
*          PID: 3474245
* Listening on http://127.0.0.1:4567
* Listening on http://[::1]:4567
Use Ctrl-C to stop
::1 - - [22/Apr/2025:19:25:36 +0100] "GET / HTTP/1.1" 200 12 0.0070
^C- Gracefully stopping, waiting for requests to finish
=== puma shutdown: 2025-04-22 19:25:39 +0100 ===
- Goodbye!
== Sinatra has ended his set (crowd applauds)

It’s alive! What if we change the Gemfile?

# frozen_string_literal: true

source "https://rubygems.org"

gem "sinatra", '<= 4.0.0'
gem "rackup"
gem "puma", '<= 6.3.0'

After running bundle install, several of the locked versions have changed:

GEM
  remote: https://rubygems.org/
  specs:
    base64 (0.2.0)
    mustermann (3.0.3)
      ruby2_keywords (~> 0.0.1)
    nio4r (2.7.4)
    puma (6.0.0)
      nio4r (~> 2.0)
    rack (3.1.12)
    rack-protection (4.0.0)
      base64 (>= 0.1.0)
      rack (>= 3.0.0, < 4)
    rack-session (2.1.0)
      base64 (>= 0.1.0)
      rack (>= 3.0.0)
    rackup (2.2.1)
      rack (>= 3)
    ruby2_keywords (0.0.5)
    sinatra (4.0.0)
      mustermann (~> 3.0)
      rack (>= 3.0.0, < 4)
      rack-protection (= 4.0.0)
      rack-session (>= 2.0.0, < 3)
      tilt (~> 2.0)
    tilt (2.6.0)

PLATFORMS
  ruby
  x86_64-linux

DEPENDENCIES
  puma (<= 6.3.0)
  rackup
  sinatra (<= 4.0.0)

BUNDLED WITH
   2.5.9

The manifest is changed too:

$ ~/guix/parse_lockfile/parse_lockfile_for_channel.rb Gemfile.lock 
(use-modules (guix-rubygems puma) (guix-rubygems rackup) (guix-rubygems sinatra))
(use-modules ((gnu packages ruby) #:select (ruby)))

(use-modules (guix transformations))

(define transform1
  (options->transformation
    `((with-input . "ruby=ruby@3.3.3")
      (with-input . "ruby-base64=ruby-base64@0.2.0")
(with-input . "ruby-mustermann=ruby-mustermann@3.0.3")
(with-input . "ruby-nio4r=ruby-nio4r@2.7.4")
(with-input . "ruby-puma=ruby-puma@6.0.0")
(with-input . "ruby-rack=ruby-rack@3.1.12")
(with-input . "ruby-rack-protection=ruby-rack-protection@4.0.0")
(with-input . "ruby-rack-session=ruby-rack-session@2.1.0")
(with-input . "ruby-rackup=ruby-rackup@2.2.1")
(with-input . "ruby-ruby2_keywords=ruby-ruby2_keywords@0.0.5")
(with-input . "ruby-sinatra=ruby-sinatra@4.0.0")
(with-input . "ruby-tilt=ruby-tilt@2.6.0"))))

(packages->manifest
  (map transform1 (list ruby ruby-puma ruby-rackup ruby-sinatra)))

… and it still works! Yay!

I skipped over a pain point. The first time I tried this out, I had puma <= 6.0.0 in the Gemfile. This doesn’t work! The manifest is generated, but you get a weird error:

/gnu/store/x52r0wbx8hgvb5bajhrdmyzmxfrqahlg-ruby-rackup-2.2.1/lib/ruby/vendor_ruby/gems/rackup-2.2.1/lib/rackup/handler.rb:81:in `pick': Couldn't find handler for: puma, falcon, thin, HTTP, webrick. (LoadError)
        from /gnu/store/zb1vxr9wpa49wipd1z1wzxmzjk9bhgzc-ruby-sinatra-4.0.0/lib/ruby/vendor_ruby/gems/sinatra-4.0.0/lib/sinatra/base.rb:1624:in `run!'
        from /gnu/store/zb1vxr9wpa49wipd1z1wzxmzjk9bhgzc-ruby-sinatra-4.0.0/lib/ruby/vendor_ruby/gems/sinatra-4.0.0/lib/sinatra/main.rb:47:in `block in <module:Sinatra>'

What do you mean there’s no handler for puma! It’s right there! I went looking all the way through the dependencies, literally checking out the source code and reading through old commits. I found out that rackup was only split out from mainline rack relatively recently, and began to suspect that there was a mistake in the packaging in rubygems. So I tried bundle exec ruby app.rb without any guix shell at all. And quelle surprise!

$ bundle exec ruby app.rb 
/home/daniel/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/rackup-2.2.1/lib/rackup/handler.rb:81:in `pick': Couldn't find handler for: puma, falcon, thin, HTTP, webrick. (LoadError)
        from /home/daniel/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/sinatra-4.0.0/lib/sinatra/base.rb:1624:in `run!'
        from /home/daniel/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/sinatra-4.0.0/lib/sinatra/main.rb:47:in `block in <module:Sinatra>'

So what is going on? This combination of gems is actually incompatible, and bundler doesn’t help! I wonder if running tests would have caught this at the packaging level. Anyway: this is a good point to take a breath of air - we have a system that seems to work, at least for ruby dependencies. The next objective is something like Rails, which generally includes system dependencies as well as pure ruby code.