Packaging
Ruby gems for Guix
22 April 2025
I want to replace rubygems, rbenv/rvm and bundler with guix. The reasons are relatively simple:
- Consistency. I don’t like having 3 package managers living on my system, that operate in different ways.
- Redundancy. Gems installed via gem and via bundler live in different locations, which is wasteful.
- Control. Since guix has a scheme interface, it should be possible to have fine-grained control over installed packages.
- Power. I can implement new features easily, by writing my own guile code.
- Flexibility. Guix can manage arbitrary software, rather than just ruby libraries.
The problem of course is that the official guix repository does not (and probably cannot) package every gem in rubygems to their satisfaction. However, since the rubygems packaging system has a programmatic interface, it should be possible to interact with it from guix. Indeed, the guix import command does just this: given a rubygems package, it will attempt to generate a guile expression that enables guix to package it.
I took a very simple ruby gem, hola. The output of guix import gem hola is below.
(define-public ruby-hola
  (package
    (name "ruby-hola")
    (version "0.1.3")
    (source
     (origin
       (method url-fetch)
       (uri (rubygems-uri "hola" version))
       (sha256
        (base32 "0jknwmn92gdq2fa6nvl373g77x0p6z2mk742i1v43q7as78lpmqd"))))
    (build-system ruby-build-system)
    (synopsis "A simple hello world gem")
    (description "This package provides a simple hello world gem.")
    (home-page "http://rubygems.org/gems/hola")
    (license #f)))
The package form evaluates to a guix <package> object, and the various fields declared are documented in the guix manual, under Defining Packages.
There are symbols referred to here that do not merely name fields on the package: method, url-fetch, uri, sha256 and so on. Some of them are described in the linked manual page. I’m sure it is possible to figure out where they all come from, but the linked example shows the packaging of (gnu packages hello), and it’s possible to just figure out this case by analogy:
(define-module (gnu packages hello)
#:use-module (guix packages)
#:use-module (guix download)
#:use-module (guix build-system gnu)
#:use-module (guix licenses)
#:use-module (gnu packages gawk))
(define-public hello
(package
(name "hello")
(version "2.10")
(source (origin
(method url-fetch)
(uri (string-append "mirror://gnu/hello/hello-" version
".tar.gz"))
(sha256
(base32
"0ssi1wpaf7plaswqqjwigppsg5fyh99vdlb9kzl7c9lng89ndq1i"))))
(build-system gnu-build-system)
(arguments '(#:configure-flags '("--enable-silent-rules")))
(inputs (list gawk))
(synopsis "Hello, GNU world: An example GNU package")
(description "Guess what GNU Hello prints!")
(home-page "https://www.gnu.org/software/hello/")
(license gpl3+)))
Probably gpl3+ comes from (guix licenses), probably package itself comes from (guix packages), probably the symbols used by the source field come from (guix download) and probably gnu-build-system comes from (guix build-system gnu). By simple analogy we arrive at the following recipe for ruby-hola, which happens to work:
(define-module (gnu packages ruby-hola)
  #:use-module (guix packages)
  #:use-module (guix download)
  #:use-module (guix build-system ruby)
  #:use-module (guix licenses))

(define-public ruby-hola
  (package
    (name "ruby-hola")
    (version "0.1.3")
    (source
     (origin
       (method url-fetch)
       (uri (rubygems-uri "hola" version))
       (sha256
        (base32 "0jknwmn92gdq2fa6nvl373g77x0p6z2mk742i1v43q7as78lpmqd"))))
    (build-system ruby-build-system)
    (synopsis "A simple hello world gem")
    (description "This package provides a simple hello world gem.")
    (home-page "http://rubygems.org/gems/hola")
    (license #f)))

ruby-hola
Neat! Well, does it work? We can find out by running guix build -f ruby-hola.scm. If you packaged the same version of hola as me, you should get an identical store path of /gnu/store/9rch71ivivd97c12a7xm9k2zng1hf4zz-ruby-hola-0.1.3. Unless something is wrong, my build of hola should be bit-for-bit identical to the one you built. Perhaps this is not so interesting – there are not many dependencies after all – but it should be true for even very complex packages too.
You can inspect this store path yourself, if you like. It contains all the code packaged in hola, which is not much.
/gnu/store/9rch71ivivd97c12a7xm9k2zng1hf4zz-ruby-hola-0.1.3
└── lib
    └── ruby
        └── vendor_ruby
            ├── build_info
            ├── cache
            ├── doc
            │   └── hola-0.1.3
            │       └── ri
            │           ├── cache.ri
            │           └── Hola
            │               ├── cdesc-Hola.ri
            │               ├── hi-c.ri
            │               └── Translator
            │                   ├── cdesc-Translator.ri
            │                   ├── hi-i.ri
            │                   └── new-c.ri
            ├── extensions
            ├── gems
            │   └── hola-0.1.3
            │       ├── bin
            │       │   └── hola
            │       ├── lib
            │       │   ├── hola
            │       │   │   └── translator.rb
            │       │   └── hola.rb
            │       ├── Rakefile
            │       └── test
            │           └── test_hola.rb
            ├── plugins
            └── specifications
                └── hola-0.1.3.gemspec
Cool, so we can see a hola.rb file and a hola/translator.rb file. A quick check shows that Hola.hi('english') should print hello world. How can we execute this code? Calling that in an irb session will certainly not work, because ruby does not know where to find the code. Normally we would lean on something like bundler to do some magic, then we can call require 'hola' and start to use all the constants defined inside. Here is a simple script which executes hola:
require 'hola'
puts Hola.hi('english')
The Environment, Purity, and Profiles
It turns out that there is very little magic happening at all. Just like the PATH environment variable on Linux systems declares where executables are installed, so you can run ruby rather than /usr/bin/ruby (or, rather, /home/user/.rbenv/shims/ruby, or something else spooky), there is a GEM_PATH environment variable which tells the ruby interpreter where to search for ruby code. To execute this code, therefore, we just need to set our GEM_PATH to an appropriate directory. I will save you the trouble of finding the correct directory:
GEM_PATH=/gnu/store/9rch71ivivd97c12a7xm9k2zng1hf4zz-ruby-hola-0.1.3/lib/ruby/vendor_ruby ruby hola.rb
hello world
I think guix shell --search-paths is supposed to tell you what this is automatically, but I can’t see it. I also think you are going to hit env var size limits if you are not careful.
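If you do end up assembling GEM_PATH by hand for more than one store item, the string is just a colon-separated list of vendor_ruby directories, in the same way PATH is. Here is a minimal sketch of that. The helper name gem_path_for is my own invention, and it assumes every gem output uses the lib/ruby/vendor_ruby layout seen in the ruby-hola store item.

```ruby
# Build a GEM_PATH value from a list of guix store paths.
# Assumption: each gem output keeps its files under lib/ruby/vendor_ruby,
# as the ruby-hola store item does.
VENDOR_SUBDIR = File.join("lib", "ruby", "vendor_ruby")

# gem_path_for is a hypothetical helper, not part of guix or rubygems.
def gem_path_for(store_paths)
  store_paths
    .map { |path| File.join(path, VENDOR_SUBDIR) }
    .join(File::PATH_SEPARATOR)
end

paths = [
  "/gnu/store/9rch71ivivd97c12a7xm9k2zng1hf4zz-ruby-hola-0.1.3",
]
puts gem_path_for(paths)
```

Every extra gem adds one more entry to the string, which is exactly why the size limit becomes a worry.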
I have assumed, by the way, that you have a ruby interpreter available. I don’t think it really matters what version you have – anything from the last 10 years should be ok. I am running guix on a foreign distribution (debian), although this is not essential. We will come back to the problem of packaging ruby itself later.
It seems sort of problematic that I have to explicitly specify in GEM_PATH all the packages I have installed. In fact, for very big projects you would probably bump up against hard limits on the size of environment variables. What I really want is for all the gems to have their vendor_ruby directories unioned together in one nice directory which I can call my GEM_PATH. It turns out this is a guix feature already: what I want is a profile.
Profiles are how guix achieves “side-effects” on your running
system. I like to imagine that a reproducible environment like
this is derived from, essentially, one giant file that I control
(maybe split up into modules). If it is really like this, then
you can’t do convenient things, like apt install X
to immediately install a package and have it available. If the
system is really a pure function of its configuration, then any
side effects must be captured in the configuration before they
can be applied.
But guix allows you to install packages interactively, using guix package -i. So what’s happening? The packages you install this way are linked to your “current profile”, which is a directory in your $HOME, usually called $HOME/.guix-profile. This profile also contains an etc/profile file which sets up environment variables - this allows it to put installed packages in your $PATH, as well as modify your GEM_PATH to point to $HOME/.guix-profile/lib/ruby/vendor_ruby. After installing a couple of gems, here is how my profile looks:
$ tree ~/.guix-profile
/home/daniel/.guix-profile
├── bin -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/bin
├── etc
│ └── profile
├── lib
│ └── ruby
│ └── vendor_ruby
│ ├── build_info
│ │ └── nokogiri-1.15.2.info -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/lib/ruby/vendor_ruby/build_info/nokogiri-1.15.2.info
│ ├── cache
│ ├── doc
│ │ ├── hola-0.1.2 -> /gnu/store/fh2l5bjxhsphrc99mh3649dw7l2asbhn-ruby-hola-0.1.2/lib/ruby/vendor_ruby/doc/hola-0.1.2
│ │ ├── mini_portile2-2.8.2 -> /gnu/store/7jlnvz74z8m6jsznfiv7cjmwxx9vg2jq-ruby-mini-portile-2.8.2/lib/ruby/vendor_ruby/doc/mini_portile2-2.8.2
│ │ ├── nokogiri-1.15.2 -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/lib/ruby/vendor_ruby/doc/nokogiri-1.15.2
│ │ └── pkg-config-1.2.5 -> /gnu/store/vx4xk41i6xsrjjmwis904620w6ds6qj1-ruby-pkg-config-1.2.5/lib/ruby/vendor_ruby/doc/pkg-config-1.2.5
│ ├── extensions
│ │ └── x86_64-linux -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/lib/ruby/vendor_ruby/extensions/x86_64-linux
│ ├── gems
│ │ ├── hola-0.1.2 -> /gnu/store/fh2l5bjxhsphrc99mh3649dw7l2asbhn-ruby-hola-0.1.2/lib/ruby/vendor_ruby/gems/hola-0.1.2
│ │ ├── mini_portile2-2.8.2 -> /gnu/store/7jlnvz74z8m6jsznfiv7cjmwxx9vg2jq-ruby-mini-portile-2.8.2/lib/ruby/vendor_ruby/gems/mini_portile2-2.8.2
│ │ ├── nokogiri-1.15.2 -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/lib/ruby/vendor_ruby/gems/nokogiri-1.15.2
│ │ └── pkg-config-1.2.5 -> /gnu/store/vx4xk41i6xsrjjmwis904620w6ds6qj1-ruby-pkg-config-1.2.5/lib/ruby/vendor_ruby/gems/pkg-config-1.2.5
│ ├── plugins
│ └── specifications
│ ├── hola-0.1.2.gemspec -> /gnu/store/fh2l5bjxhsphrc99mh3649dw7l2asbhn-ruby-hola-0.1.2/lib/ruby/vendor_ruby/specifications/hola-0.1.2.gemspec
│ ├── mini_portile2-2.8.2.gemspec -> /gnu/store/7jlnvz74z8m6jsznfiv7cjmwxx9vg2jq-ruby-mini-portile-2.8.2/lib/ruby/vendor_ruby/specifications/mini_portile2-2.8.2.gemspec
│ ├── nokogiri-1.15.2.gemspec -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/lib/ruby/vendor_ruby/specifications/nokogiri-1.15.2.gemspec
│ └── pkg-config-1.2.5.gemspec -> /gnu/store/vx4xk41i6xsrjjmwis904620w6ds6qj1-ruby-pkg-config-1.2.5/lib/ruby/vendor_ruby/specifications/pkg-config-1.2.5.gemspec
├── manifest
└── share
├── doc
│ ├── ruby-mini-portile-2.8.2 -> /gnu/store/7jlnvz74z8m6jsznfiv7cjmwxx9vg2jq-ruby-mini-portile-2.8.2/share/doc/ruby-mini-portile-2.8.2
│ └── ruby-nokogiri-1.15.2 -> /gnu/store/ycrhbjrwwc101k7zwx2wd9a4wycbj053-ruby-nokogiri-1.15.2/share/doc/ruby-nokogiri-1.15.2
├── emacs -> /gnu/store/1h8hi5pz77pprq8a7k22npzwyj8jfs8s-emacs-subdirs/share/emacs
└── info -> /gnu/store/2rpcj1sgmpbq58wbz7qcls892ws0f3y7-info-dir/share/info
28 directories, 7 files
I am now a bit puzzled about how profiles, which you create interactively with guix package -i, relate to shells, which you create with guix shell. Here’s the manual for each:
So what do we learn?
The purpose of guix shell is to make it easy to create one-off software environments, without changing one’s profile. It is typically used to create development environments; it is also a convenient way to run applications without “polluting” your profile.
This doesn’t tell me much. I don’t see why I shouldn’t use a profile for a development environment, or a shell for my entire machine.
Both profiles and shells can be declaratively specified by a manifest file. For guix shell:
--manifest=file; -m file
Create an environment for the packages contained in the manifest object returned by the Scheme code in file. This option can be repeated several times, in which case the manifests are concatenated.
And for guix package:
--manifest=file; -m file
Create a new generation of the profile from the manifest object returned by the Scheme code in file. This option can be repeated several times, in which case the manifests are concatenated.
This allows you to declare the profile’s contents rather than constructing it through a sequence of --install and similar commands. The advantage is that file can be put under version control, copied to different machines to reproduce the same profile, and so on.
They both also support generating a manifest file from an imperative command, like
# for shells
guix shell --export-manifest -D guile git emacs emacs-geiser emacs-geiser-guile
# for profiles
guix package --export-manifest -i guile git emacs emacs-geiser emacs-geiser-guile
It turns out guix package --search-paths will only print the environment variables your profile needs in order to function.
Profiles are persistent. You can --list-profiles; --roll-back one version or --switch-generation to any previous version; --delete-generations (giving either a specific generation to delete, or none to clear out all old ones). If a profile exists, then packages installed in it will never be garbage-collected. By contrast, a shell is ephemeral – it can only be persisted via a manifest file, and can be garbage-collected at any time.
On reflection, I guess a shell and a profile are basically just two different pieces of globally referenced state, with profiles being slightly more “global” than shells. With a guix shell, every command within the shell refers to the state of all installed packages. Other shells do not see it, so if you have multiple shells then you’re going to have to set them up every time.
For something like a development environment I really don’t think there is much difference which one you pick. I think for services (say, a database server), running them in a shell (or better, guix shell --container) is very sensible. But most programming work probably is better suited to a profile.
Other versions
Ok - we managed to hack together one version of hola. I immediately wonder whether we can get every version of it (since, who knows, maybe my project requires an older version?). guix import is robust to this; you just have to write guix import gem hola@0.1.2 and the rest of the previous section applies verbatim. But how can you detect what versions are available? Well, I don’t know if guix can do this yet. The ruby gem command is able to do it, and the results could probably be parsed by a little script:
$ gem list ^hola$ --remote --all
*** REMOTE GEMS ***
hola (0.1.3, 0.1.2, 0.1.1, 0.1.0, 0.0.33, 0.0.31, 0.0.30, 0.0.1, 0.0.0)
Version numbers do not include commas, so this should be pretty easy to parse actually.
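#!/usr/bin/env ruby
# Print every published version of each gem named on the command line.
ARGV.each do |gemname|
  versions = %x{gem list ^#{gemname}$ --remote --all}
    .lines
    .last
    .match(/\(.*\)/)[0]
    .slice(1..-2)
    .split(', ')
  # puts with an array prints one element per line.
  puts versions
end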
Calling the above script with gem names on the command line will print all versions of the listed gems, each on its own line.
What if we don’t want to use gem? After all, we are planning to have it uninstalled in the long run. Well, you can just do the same network requests that gem is doing under the hood. I can’t really tell what gem is doing, but guix import refers to a rubygems-uri – if we can find out what that is then it’s probably easy to understand. guix is much more introspective, so we can just check in the REPL:
scheme@(guile-user)> (use-modules (guix build-system ruby))
scheme@(guile-user)> rubygems-uri
$1 = #<procedure rubygems-uri (name version)>
scheme@(guile-user)> (rubygems-uri "foobar" "0.1.2")
$2 = "https://rubygems.org/downloads/foobar-0.1.2.gem"
Devastatingly, if you visit that URL (either with, say, /downloads/hola or just /downloads) then the server does not show you any nice interface to traverse. You could write a little robot that scrapes the page, but this is going beyond what I can be bothered to do.
In fact if you plan on doing a lot of this, you can mirror
rubygems locally using the rubygems-mirror
tool. If you mirror only the latest gems then it takes about
50GB of storage to host the entire mirror. I think if you mirror
every version it will probably be an order of magnitude larger.
If you have a mirror locally, then it is a matter of simple
directory traversal to enumerate all the possible versions.
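As a rough sketch of that directory traversal: the function below lists a mirror’s gems directory and groups the version strings by gem name. The name versions_in_mirror and the filename regex are my own guesses, not anything rubygems-mirror provides. It assumes files are named name-version.gem or name-version-platform.gem, and that the version is the first hyphen-separated part starting with a digit, which will misfire on unusual gem names.

```ruby
# Heuristic split of "name-version[-platform].gem".
# Assumption: the version is the first hyphen-separated chunk that starts
# with a digit; gem names containing such a chunk will be split wrongly.
GEM_FILE = /\A(?<name>.+?)-(?<version>\d[^-]*)(?:-(?<platform>.+))?\.gem\z/

# versions_in_mirror is a hypothetical helper for this post.
def versions_in_mirror(gems_dir)
  versions = Hash.new { |hash, name| hash[name] = [] }
  Dir.children(gems_dir).sort.each do |filename|
    match = GEM_FILE.match(filename)
    next unless match # skip anything that is not a .gem archive
    versions[match[:name]] << match[:version]
  end
  # A version can appear once per platform, so de-duplicate.
  versions.transform_values(&:uniq)
end

# Example call (the path is wherever your mirror keeps its gems):
#   versions_in_mirror(File.expand_path("~/.cache/rubygems-mirror/gems"))
```

The versions come back in filename order, not in version order, so sort them with Gem::Version if the ordering matters.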
It turns out however that there is an API call for precisely this request, which returns a JSON response. There are two relevant requests,
# for listing versions
$ curl https://rubygems.org/api/v1/versions/hola.json | ruby -r json -e "puts JSON.parse(STDIN.read).map { |x| x['number'] }"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 4764 100 4764 0 0 409 0 0:00:11 0:00:11 --:--:-- 1077
0.1.3
0.1.2
0.1.1
0.1.0
0.0.33
0.0.31
0.0.30
0.0.1
0.0.0
# for fetching the changes to rubygems between two timestamps
$ curl 'https://rubygems.org/api/v1/timeframe_versions.json?from=2025-04-01T00:00:00Z&to=2025-04-03T00:00:00Z' | jq | head
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 61206 100 61206 0 0 24426 0 0:00:02 0:00:02 --:--:-- 24433
[
{
"name": "diagram",
"downloads": 4407,
"version": "0.3.2",
"version_created_at": "2025-04-01T00:15:43.711Z",
"version_downloads": 80,
"platform": "ruby",
"authors": "Abdelkader Boudih",
"info": "Work with diagrams in Ruby",
...
It’s easy to see how these two could be combined to build an archival tool.
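Here is a minimal sketch of the first half of such a tool, without shelling out to curl. It uses the versions endpoint shown above and the same download URL pattern that rubygems-uri produced in the REPL. The helper names (versions_url, download_url, version_numbers, remote_versions) are mine, and there is no error handling or rate limiting, which a real archival tool would need.

```ruby
require "json"
require "net/http"
require "uri"

RUBYGEMS = "https://rubygems.org"

# URL of the JSON listing of every version of a gem.
def versions_url(gem_name)
  "#{RUBYGEMS}/api/v1/versions/#{gem_name}.json"
end

# Same shape of URL that guix's rubygems-uri returned in the REPL.
def download_url(gem_name, version)
  "#{RUBYGEMS}/downloads/#{gem_name}-#{version}.gem"
end

# Pull the "number" field out of the JSON array the API returns.
def version_numbers(json_text)
  JSON.parse(json_text).map { |entry| entry["number"] }
end

# Performs a network request; kept separate so the parsing can be
# exercised offline.
def remote_versions(gem_name)
  version_numbers(Net::HTTP.get(URI(versions_url(gem_name))))
end

# Example (needs network access):
#   remote_versions("hola").each { |v| puts download_url("hola", v) }
```

The timeframe endpoint would then only be needed to keep an existing archive up to date.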
Version pinning, lockfiles, and building Ruby
The major pieces of software that I would like to do away with are rvm/rbenv and bundler. Ideally I would be able to manage my entire development environment using guix. So far we’ve seen how to package a specific version of a gem from rubygems, how to load it into the environment, and how to query for all available versions of a gem from the rubygems repository. This is probably enough to be able to create a guix repo which mirrors Rubygems.
When you use something like rvm to manage Ruby, you specify a version to build and then the tool ensures that the given version of Ruby is what’s available in your environment. Guix doesn’t package every version of Ruby, so we will have to build it ourselves. But that doesn’t mean that it has to be a big job: assuming most versions of Ruby are built more-or-less the same, we should be able to re-use the existing package definitions available in guix.
A quick search for guix packages ruby gives a reference, from packages.guix.gnu.org, to gnu/packages/ruby.scm. In fact, you can hack the guix edit command to print the location of your current ruby like so:
$ EDITOR="echo" guix edit ruby
+281 /gnu/store/2xyr94xw06qrvcmfp5krh5kcic6wsvpc-guix-module-union/share/guile/site/3.0/gnu/packages/ruby.scm
The ruby module makes extensive use of guix’s inheritance feature (https://www.futurile.net/2024/01/12/modifying-guix-packages-using-inheritance/), which we can also copy in our own definition. I chose to build ruby-3.3.0, since it isn’t in the upstream repository, and I work on a project that uses it.
(define-module (gnu packages my-ruby)
#:use-module (guix packages)
#:use-module (guix download)
#:use-module (guix utils)
#:use-module (gnu packages ruby)
#:use-module (gnu packages serialization))
(define-public ruby-3.3.0
(package
(inherit ruby-2.7)
(version "3.3.0")
(source
(origin
(method url-fetch)
(uri (string-append "http://cache.ruby-lang.org/pub/ruby/"
(version-major+minor version)
"/ruby-" version ".tar.xz"))
(sha256
(base32
"0nwpgf27i43yd8ccsk838n86n9xki68hayxmhbwr0zk3dsinasv7"))))
(inputs (modify-inputs (package-inputs ruby-2.7)
(append libyaml)))))
ruby-3.3.0
To build and run this package definition, run respectively guix build -f my-ruby.scm or guix shell -f my-ruby.scm. You don’t have to run build before shell - it will be done automatically if necessary (and the results will be cached in /gnu/store).
$ ruby -v
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux-gnu]
$ guix shell -f my-ruby.scm
...
$ ruby -v
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-linux]
The only parts I had to change (that weren’t obvious) were:
- The hash value. I just let the file get downloaded by guix, and then fail the build due to the incorrect hash. The ruby.tar.gz file path (within /gnu/store) was printed on the command line, and I could run guix hash /gnu/store/xxxxx-ruby.tar.gz to get the real hash.
- I got a build error initially, due to issues with psych. This is the purpose of the libyaml input (which comes from the (gnu packages serialization) module).
When a build fails, guix build will print the path of a .drv.gz file. This is a gzipped plain-text file which logs whatever the build printed up to its failure. In my case, it was very clear that psych could not be built, and that the error had something to do with libyaml. In general, builds can fail for all kinds of interesting reasons, but in this case this is a well-known cause of Ruby build failures, which was easy to search for – and nothing to do with guix. Incidentally, if I had inherited from one of the Ruby 3 major versions, I probably would not have encountered this issue.
On reflection, I don’t think inheritance is the right way to solve this problem. If we’re going to replicate rubygems, which contains something like 200,000 gems, any changes we make had better be completely automated. That doesn’t imply that we should have a lot of code duplication, writing one huge program that does guix import gem a million times. But it sounds tricky to write code that correctly detects when inheritance works, when I could just write code that removes the duplication directly.
I think the ideal would be something like:
- One guile module for each gem.
- Every version goes in a single file, with the imports required for all modules declared at the top.
- Each module exports one symbol for each version of the gem, called ruby-gemname-x.y.z, plus one “placeholder” which is named just ruby-gemname.
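To make the naming scheme in that list concrete, here is a small sketch that produces the exported symbol names for one gem. The helpers guix_base_name and guix_symbols are hypothetical, and turning underscores into hyphens is my assumption about what the guix naming convention wants, not something checked against every gem.

```ruby
# Assumption: guix package names use hyphens where gem names use underscores.
def guix_base_name(gem_name)
  "ruby-#{gem_name.tr('_', '-')}"
end

# One symbol per version, plus the unversioned "placeholder" symbol last.
def guix_symbols(gem_name, versions)
  base = guix_base_name(gem_name)
  versions.map { |version| "#{base}-#{version}" } + [base]
end

puts guix_symbols("hola", ["0.1.3", "0.1.2"])
```

Which version the placeholder should point at (presumably the newest) is a separate decision that this sketch does not make.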
Guix doesn’t support dynamic channels that generate packages on demand, as far as I know. It would be possible to write a guile script that does this, but it seems nicer for users if they can just refer to a channel. It’s probably worthwhile writing a local database cache, consisting of the scraped JSON from rubygems. This should make it much quicker to iterate on the channel locally.
It seems like the data we would need in order to generate the repository is basically one list: the set of all gems and gem versions currently existing in rubygems. The simplest (although certainly not the most efficient) way to get a seed value for this is to run the rubygems-mirror; you can load the specs.4.8 file like so:
irb(main):010:0> data = Marshal.load(File.binread('specs.4.8')); nil
irb(main):011:0> data.first(100)
=>
[["_", Gem::Version.new("1.0"), "ruby"],
["_", Gem::Version.new("1.1"), "ruby"],
["_", Gem::Version.new("1.2"), "ruby"],
["_", Gem::Version.new("1.3"), "ruby"],
["_", Gem::Version.new("1.4"), "ruby"],
["-", Gem::Version.new("1"), "ruby"],
["0mq", Gem::Version.new("0.1.0"), "ruby"],
["0mq", Gem::Version.new("0.1.1"), "ruby"],
["0mq", Gem::Version.new("0.1.2"), "ruby"],
["0mq", Gem::Version.new("0.2.0"), "ruby"],
["0mq", Gem::Version.new("0.2.1"), "ruby"],
["0mq", Gem::Version.new("0.3.0"), "ruby"],
["0mq", Gem::Version.new("0.4.0"), "ruby"],
["0mq", Gem::Version.new("0.4.1"), "ruby"],
["0mq", Gem::Version.new("0.5.0"), "ruby"],
...
Hang on… where did it get that file? A quick google determines it is publicly available at https://rubygems.org/latest_specs.4.8.gz. I have absolutely no idea what that version string means.
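Since each entry in that list is a [name, version, platform] triple, turning it into “all versions of each gem” is a single group_by. A minimal sketch follows, with a hand-written sample standing in for the real unmarshalled data; versions_by_gem is a name I made up.

```ruby
require "rubygems"

# Group [name, Gem::Version, platform] triples into name => [version strings].
def versions_by_gem(specs)
  specs
    .group_by { |name, _version, _platform| name }
    .transform_values do |rows|
      # The same version can be listed once per platform, hence uniq.
      rows.map { |_name, version, _platform| version.to_s }.uniq
    end
end

# Sample rows in the same shape as the irb output; with the real file this
# would be Marshal.load(File.binread('specs.4.8')).
sample = [
  ["0mq", Gem::Version.new("0.1.0"), "ruby"],
  ["0mq", Gem::Version.new("0.1.1"), "ruby"],
  ["-", Gem::Version.new("1"), "ruby"],
]
p versions_by_gem(sample)
```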
Given a specfile like this, the only remaining thing we need is a script which identifies all the gem versions, scrapes the JSON from the rubygems API, converts them into guix package definitions, and organises them into modules that go into my custom channel. The right thing to do is probably just to modify the guix import gem script (either by hacking on guix directly or finding a way to sub it out for my own script).
The task I want to write is not exactly like guix import gem for two reasons.
- I want to aggregate the package definitions, so that there is no redundancy in the modules.
- Since I have to write guix module imports as well as specify package dependencies, I’m probably going to have to duplicate what guix import gem does to fetch dependencies.
There are also some deficiencies I found in guix import gem: for example, it always tries to run tests (even if there are none). It also does not reliably detect the gem license. I am starting to think I may need all the information in the gem in order to build the package definition – so maybe rubygems-mirror really is the simplest way to go.
$ guix import gem base64 | grep license
(license (list #f #f))))
The correct answer is (license (list license:ruby license:bsd-2))).
The ruby .gem archive format
I should explain a little about what data are available to us. Clearly in order to build any kind of package specification, I need the following for each version of the given gem:
- Whether there are any tests.
- The URL and hash of the source code.
- The dependencies of the gem.
When you download a gem, rubygems gives you a file whose extension is literally .gem. This might seem intimidating, but if you just click on it it’ll be revealed that this is just a tarball (https://en.wikipedia.org/wiki/tar_(computing)) containing three files, each of which is gzip-compressed:
- data.tar.gz, the source code of the gem.
- checksums.yaml.gz, a YAML file containing the checksums of data.tar.gz.
- metadata.gz, a file with a mysterious format that looks like this:
--- !ruby/object:Gem::Specification
name: 0mq
version: !ruby/object:Gem::Version
version: 0.5.3
...
This looks like YAML, but there is a special comment annotation that denotes a Ruby class. I happen to recognise this – I think you can serialize pretty much arbitrary (tree-like) objects to YAML in this way. I suppose it’s a bit like s-expression syntax in that way, but I never thought of it like that before.
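To see that this really is ordinary Ruby object serialization, here is a small round trip with a made-up class. The !ruby/object:… marker is what YAML calls a tag. Point is purely illustrative and has nothing to do with rubygems; safe_load with permitted_classes is used so that only the class we name is allowed to be revived.

```ruby
require "yaml"

# Hypothetical class, only here to show the tag syntax.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

# Dumping an arbitrary object tags the document with its class name and
# writes one mapping entry per instance variable.
text = YAML.dump(Point.new(1, 2))
puts text

# Loading it back needs the class to be explicitly permitted.
restored = YAML.safe_load(text, permitted_classes: [Point])
puts restored.x + restored.y
```

The gemspec metadata works the same way, just with Gem::Specification and friends as the permitted classes.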
Anyway, we seem to need each of these files in order to build a really robust package definition. And it also seems like it should be pretty easy to write some Ruby code that takes the .gem archive as input, and outputs a package definition. So let’s try!
The script I came up with looks like the following. All it does is, given a list of gem archives on the command line (you are assumed to have the gem locally), parse each of the files into a nice format and feed them into an instance of GuixPackage.
I originally thought the checksums file would be needed, but it turned out to not always be present (and therefore not particularly helpful). I just computed the hash locally in that case - although that is pointless unless rubygems-mirror computes checksums itself (and I don’t think it does).
#!/usr/bin/env ruby
require 'rubygems/package'
require 'zlib'
require 'stringio'
require 'yaml'
require 'digest'

GEMSPEC_YAML_CLASSES = [::Gem::Specification, ::Gem::Version, Time, ::Gem::Dependency, ::Gem::Requirement, Symbol]

ARGV.each do |gem_path|
  File.open gem_path do |f|
    # copied from https://weblog.jamisbuck.org/2015/7/23/tar-gz-in-ruby.html
    Gem::Package::TarReader.new(f) do |tar|
      pkg = GuixPackage.new
      # this is stupid unless rubygems-mirror checksums while downloading.
      # verify that it does!
      # base64, because the template emits the hash inside a (base64 ...) form
      pkg.hash = Digest::SHA512.base64digest(File.binread(gem_path))
      tar.each do |listing|
        case listing.full_name
        when 'checksums.yaml.gz'
          pkg.checksums = YAML.load(Zlib::GzipReader.new(listing).read)
        when 'data.tar.gz'
          pkg.data = Gem::Package::TarReader.new(Zlib::GzipReader.wrap(listing)).map { |l| [l.header.name, l.read] }.to_h
        when 'metadata.gz'
          # c.f. https://github.com/jordansissel/fpm/pull/1898
          yaml_text = Zlib::GzipReader.new(listing).read
          if ::Gem::Version.new(RUBY_VERSION) >= ::Gem::Version.new("3.1.0")
            # Ruby 3.1.0 switched to a Psych/YAML version that defaults to "safe" loading
            # and unfortunately `gem specification --yaml` emits YAML that requires
            # class loaders to process correctly
            pkg.metadata = YAML.load(yaml_text, :permitted_classes => GEMSPEC_YAML_CLASSES, :aliases => true)
          else
            # Older versions of ruby call this method YAML.safe_load
            # I haven't actually tested this -- it may require aliases: true also.
            # In any case unsafe_load ought to work.
            # pkg.metadata = YAML.safe_load(yaml_text, GEMSPEC_YAML_CLASSES)
          end
        else
          raise "Unrecognized file #{listing.full_name}"
        end
      end
      puts pkg.template
    end
  end
end
The GuixPackage class is responsible for converting the package data and metadata into an actual package form. As you can see, most of the information is derived from the ruby package’s “metadata”, although for some of it we have to peek at the files it contains. I don’t really believe the tests line works with guix – particularly since I don’t see how development dependencies factor into guix’s packaging. We could make them into build dependencies, but that’s not really what they are.
class GuixPackage
attr_writer :checksums, :data, :metadata, :hash
def template
<<-LISP
(package
(name "ruby-#{gem_name}")
(version "#{version}")
(source
(origin
(method url-fetch)
(uri (rubygems-uri "#{gem_name}" version))
(hash (content-hash
(base64 "#{@hash}") sha512))))
(build-system ruby-build-system)
(arguments (list #:tests? #{tests_included? ? "#t" : "#f"}))
(propagated-inputs (list #{runtime_dependencies.map { |d| "ruby-#{d.name}" }.join(' ')}))
(synopsis "#{@metadata.description}")
(description "#{@metadata.description}")
(home-page "#{@metadata.homepage}")
(license #{license_symbols}))
LISP
end
def gem_name
@metadata.name
end
def version
@metadata.version
end
def tests_included?
return false
# something with test_files?
@data.any? { |name, contents| name == 'Rakefile' }
end
# not sure where to put dependencies of type :development, or even if there are such deps.
def runtime_dependencies
@metadata.dependencies.select { |d| d.type == :runtime }
end
def license_symbols
case @metadata.licenses.count
when 0
"#f"
when 1
lookup_license_symbol(@metadata.licenses[0])
else
"(list #{@metadata.licenses.map { |l| lookup_license_symbol(l) }.join(' ')})"
end
end
private
def lookup_license_symbol(ruby_name)
'license:' + {
'MIT' => 'expat',
'Ruby' => 'ruby',
'BSD-2-Clause' => 'bsd-2',
'BSD-3-Clause' => 'bsd-3',
}.fetch(ruby_name)
end
end
The major error here is in the license detection. Rubygems appears to allow arbitrary text in the license metadata, and while some people use appropriate SPDX identifiers, there is a lot of junk like You want it? It's yours and nobody should use this thing right now. There are even some really hard to parse examples like Ruby's or GPLv2 or later. You can find the list of licenses guix recognises by cloning the source code and looking in the file guix/licenses.scm.
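One way to stop the junk values from crashing the script is to fall back to #f whenever a license string is not in the table, instead of letting fetch raise. This is only a sketch of that idea: the method names lenient_license_symbol and lenient_license_field are hypothetical, the table is the same four entries as above, and silently emitting #f is a choice you may not want for a real channel.

```ruby
# Same mapping as in GuixPackage; anything else is treated as unknown.
GUIX_LICENSES = {
  "MIT" => "expat",
  "Ruby" => "ruby",
  "BSD-2-Clause" => "bsd-2",
  "BSD-3-Clause" => "bsd-3",
}.freeze

# Return a guix license symbol, or "#f" for free-text junk.
def lenient_license_symbol(ruby_name)
  symbol = GUIX_LICENSES[ruby_name.to_s.strip]
  symbol ? "license:#{symbol}" : "#f"
end

# Build the value of the (license ...) field for a list of gem licenses.
def lenient_license_field(licenses)
  symbols = licenses.map { |name| lenient_license_symbol(name) }
  case symbols.length
  when 0 then "#f"
  when 1 then symbols.first
  else "(list #{symbols.join(' ')})"
  end
end

puts lenient_license_field(["Ruby", "BSD-2-Clause"])
puts lenient_license_field(["You want it? It's yours"])
```

Compound strings like Ruby's or GPLv2 or later still come out as #f; handling those would need real parsing.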
There are some other hacks. I couldn’t get tests to work, since rake isn’t pulled in as a dependency. We could bring test dependencies in, but it’s hard to distinguish them from development dependencies (and I think Gemfile.lock won’t pin them anyway).
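For reference, this is all the gemspec itself knows about the distinction: a dependency is either :runtime or :development, and there is no separate “test” type. A small sketch with a made-up spec (the gem name demo and its dependencies are invented for illustration):

```ruby
require "rubygems"

# Hypothetical gemspec with one dependency of each type.
spec = Gem::Specification.new do |s|
  s.name = "demo"
  s.version = "0.1.0"
  s.add_dependency "rack", ">= 3.0.0", "< 4" # runtime by default
  s.add_development_dependency "rake"
end

runtime = spec.runtime_dependencies.map(&:name)
development = spec.development_dependencies.map(&:name)

puts "runtime: #{runtime.join(', ')}"
puts "development: #{development.join(', ')}"
```

So rake would show up only in the development list, mixed in with linters, debuggers and anything else the author happened to declare there.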
But it does work, and for me is more customisable than what guix import gem will produce. For instance we can do ./parse_gem ~/.cache/rubygems-mirror/gems/sinatra-4.1.1.gem and get this nice output:
(package
(name "ruby-sinatra")
(version "4.1.1")
(source
(origin
(method url-fetch)
(uri (rubygems-uri "sinatra" version))
(hash (content-hash
(base64 "v2x1ovXjjPuEQvSBxJHvsTLb83oRTo7jWk0k3W87iC+WejxULLr9yvliQrOfXXi3cnGtxWKi7gArh5CnPJmn6Q==") sha512))))
(build-system ruby-build-system)
(arguments (list #:tests? #f))
(propagated-inputs (list ruby-logger ruby-mustermann ruby-rack ruby-rack-protection ruby-rack-session ruby-tilt))
(synopsis "Sinatra is a DSL for quickly creating web applications in Ruby with minimal effort.")
(description "Sinatra is a DSL for quickly creating web applications in Ruby with minimal effort.")
(home-page "http://sinatrarb.com/")
(license license:expat))
Building a channel
Now that we can build a single GuixPackage using ruby code, we can start to write proper channel definitions. Recall that the major problem we’re trying to solve is that our channel definitions are going to have to declare their imports, and in order to do that we need to be able to call methods on the GuixPackage object. We can actually drop that template from earlier – it’s served its purpose.
With a couple of tiny changes to our script, it now generates a list of guix packages (which we can then group together). We could call it with e.g.
./parse_gem.rb $(find ~/.cache/rubygems-mirror/gems/ -regextype sed -regex ".*/sinatra-[0-9.]*.gem")
# was ARGV.each do |gem_path|
guix_packages = ARGV.map do |gem_path|
  pkg = GuixPackage.new
  ...
  File.open gem_path do |f|
    Gem::Package::TarReader.new(f) do |tar|
      ...
      # was puts pkg.template
    end
  end
  pkg
end
group = GuixPackageGroup.new(guix_packages.compact)
puts group.template
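The find invocation above picks every version of one gem out of the mirror. The same selection can be sketched in Ruby, which makes the matching rule explicit. The helper name gem_files_for is hypothetical, and the pattern is slightly stricter than the find regex: it escapes the dot before gem and requires a non-empty version.

```ruby
# Hypothetical helper mirroring the find regex above: keep only files named
# <name>-<digits and dots>.gem, so "sinatra" does not also match "sinatra-contrib".
def gem_files_for(name, paths)
  pattern = /\A#{Regexp.escape(name)}-[0-9.]+\.gem\z/
  paths.select { |path| File.basename(path).match?(pattern) }
end

# Usage: pass candidate paths as arguments, e.g. the contents of the mirror.
puts gem_files_for("sinatra", ARGV)
```

Like the find regex, this skips prerelease versions and platform-specific gems, whose file names contain more than digits and dots after the gem name.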
Parsing lockfiles, and building Sinatra
The script above may well be enough to generate a rubygems channel. Probably not - but if we assume it is, and try to build something nontrivial, we should be able to see where it falls apart. A natural place to start is by taking a test case - something simple enough to work, but complicated enough to be interesting.
I had a false start with building a rails new project. So, to keep things relatively easy, I decided to build a sinatra project instead.
The initial Gemfile and Gemfile.lock are quite small:
# frozen_string_literal: true
source "https://rubygems.org"

gem "sinatra"
gem "rackup"
gem "puma"
GEM
  remote: https://rubygems.org/
  specs:
    base64 (0.2.0)
    logger (1.6.5)
    mustermann (3.0.3)
      ruby2_keywords (~> 0.0.1)
    nio4r (2.7.4)
    puma (6.5.0)
      nio4r (~> 2.0)
    rack (3.1.8)
    rack-protection (4.1.1)
      base64 (>= 0.1.0)
      logger (>= 1.6.0)
      rack (>= 3.0.0, < 4)
    rack-session (2.1.0)
      base64 (>= 0.1.0)
      rack (>= 3.0.0)
    rackup (2.2.1)
      rack (>= 3)
    ruby2_keywords (0.0.5)
    sinatra (4.1.1)
      logger (>= 1.6.0)
      mustermann (~> 3.0)
      rack (>= 3.0.0, < 4)
      rack-protection (= 4.1.1)
      rack-session (>= 2.0.0, < 3)
      tilt (~> 2.0)
    tilt (2.6.0)

PLATFORMS
  ruby
  x86_64-linux

DEPENDENCIES
  puma
  rackup
  sinatra

BUNDLED WITH
   2.5.9
Luckily, we don’t actually have to care about the syntax of this file. Bundler is bundled (ha) with some code that will parse this file (after all, it must do, in order to install the gems). Here is the magic spell for parsing it:
#!/usr/bin/env ruby
require 'bundler'
# thank you https://stackoverflow.com/a/40098825/4681998
lockfile = Bundler::LockfileParser.new(Bundler.read_file(ARGV[0]))
all_specs = lockfile.specs.map { |s| [s.name, s.version] }
direct_specs = lockfile.dependencies.map(&:first)
puts all_specs.map { |name, version| "#{name}-#{version}" }
puts direct_specs
$ ./parse_lockfile.rb Gemfile.lock
base64-0.2.0
logger-1.6.5
mustermann-3.0.3
nio4r-2.7.4
puma-6.5.0
rack-3.1.8
rack-protection-4.1.1
rack-session-2.1.0
rackup-2.2.1
ruby2_keywords-0.0.5
sinatra-4.1.1
tilt-2.6.0
puma
rackup
sinatra
You can use this script to call parse_gem.rb, and build up a huge self-contained manifest file. That would look something like this:
puts "(use-modules ((guix packages)) ((guix download)) ((guix build-system ruby)) ((guix licenses) #:prefix license:) (gnu packages ruby))"
puts "(use-modules (guix transformations))"
all_specs.each do |name,version|
puts "(define ruby-#{name}"
puts %x{../parse_gem/parse_gem.rb /home/daniel/.cache/rubygems-mirror/gems/#{name}-#{version}.gem}
puts ")"
end
puts "(define transform1
(options->transformation
'((with-input . \"ruby=ruby@3.0.6\"))))"
puts "(packages->manifest
(map transform1 (list ruby #{direct_specs.map { |n| "ruby-#{n}" }.join(' ')})))"
However, with a proper channel defined we should be able to just build a manifest by loading the right modules:
#!/usr/bin/env ruby
require 'bundler'
# thank you https://stackoverflow.com/a/40098825/4681998
lockfile = Bundler::LockfileParser.new(Bundler.read_file(ARGV[0]))

all_specs = lockfile.specs.map { |s| [s.name, s.version] }
direct_specs = lockfile.dependencies.map(&:first)
ruby_version = "3.3.3" # maybe could be read from the lockfile
# we have to only #:select ruby because (gnu packages ruby) defines some gems, and we don't want them to shadow us
puts <<-LISP
(use-modules #{direct_specs.map { |s| "(guix-rubygems #{s})"}.join(' ')})
(use-modules ((gnu packages ruby) #:select (ruby)))
(use-modules (guix transformations))
(define transform1
(options->transformation
`((with-input . \"ruby=ruby@#{ruby_version}\")
#{all_specs.map { |name, version| %Q|(with-input . "ruby-#{name}=ruby-#{name}@#{version}")|}.join("\n")})))
(packages->manifest
(map transform1 (list ruby #{direct_specs.map { |n| "ruby-#{n}" }.join(' ')})))
LISP
This dutifully prints out the following manifest, given the sinatra lockfile from earlier:
$ ./parse_lockfile_for_channel.rb Gemfile.lock
(use-modules (guix-rubygems puma) (guix-rubygems rackup) (guix-rubygems sinatra))
(use-modules ((gnu packages ruby) #:select (ruby)))
(use-modules (guix transformations))
(define transform1
(options->transformation
`((with-input . "ruby=ruby@3.3.3")
(with-input . "ruby-base64=ruby-base64@0.2.0")
(with-input . "ruby-logger=ruby-logger@1.6.5")
(with-input . "ruby-mustermann=ruby-mustermann@3.0.3")
(with-input . "ruby-nio4r=ruby-nio4r@2.7.4")
(with-input . "ruby-puma=ruby-puma@6.5.0")
(with-input . "ruby-rack=ruby-rack@3.1.8")
(with-input . "ruby-rack-protection=ruby-rack-protection@4.1.1")
(with-input . "ruby-rack-session=ruby-rack-session@2.1.0")
(with-input . "ruby-rackup=ruby-rackup@2.2.1")
(with-input . "ruby-ruby2_keywords=ruby-ruby2_keywords@0.0.5")
(with-input . "ruby-sinatra=ruby-sinatra@4.1.1")
(with-input . "ruby-tilt=ruby-tilt@2.6.0"))))
(packages->manifest
(map transform1 (list ruby ruby-puma ruby-rackup ruby-sinatra)))
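The Ruby version in this manifest is hard-coded in the script. Bundler writes a RUBY VERSION section into the lockfile when the Gemfile has a ruby directive, so the version could be read from there with a fallback. This is a minimal sketch with a hypothetical helper name; the sample lockfile text is made up for illustration, and the sinatra lockfile above has no such section, so it would get the default.

```ruby
# Hypothetical helper: read the Ruby version from a lockfile's RUBY VERSION
# section, falling back to a default when the section is absent.
def ruby_version_from_lockfile(text, default: "3.3.3")
  match = text.match(/^RUBY VERSION\n\s+ruby (\d+\.\d+\.\d+)/)
  match ? match[1] : default
end

# Sample data only: a lockfile fragment that pins a Ruby version.
lockfile_text = <<~LOCK
  DEPENDENCIES
    sinatra

  RUBY VERSION
     ruby 3.2.2p53

  BUNDLED WITH
     2.5.9
LOCK

puts ruby_version_from_lockfile(lockfile_text)
puts ruby_version_from_lockfile("BUNDLED WITH\n   2.5.9\n")
```

The patch-level suffix is dropped because the with-input transformation in the manifest only uses the three-part version.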
Now we just load this manifest and…
$ guix shell --pure -L ~/guix-rubygems -m manifest.scm -- ruby app.rb
guix shell: warning: failed to load '(guix-rubygems gem_plugin)':
no code for module (guix-rubygems rake)
== Sinatra (v4.1.1) has taken the stage on 4567 for development with backup from Puma
Puma starting in single mode...
* Puma version: 6.5.0 ("Sky's Version")
* Ruby version: ruby 3.3.3 (2024-06-12 revision f1c7b6f435) [x86_64-linux]
* Min threads: 0
* Max threads: 5
* Environment: development
* PID: 3474245
* Listening on http://127.0.0.1:4567
* Listening on http://[::1]:4567
Use Ctrl-C to stop
::1 - - [22/Apr/2025:19:25:36 +0100] "GET / HTTP/1.1" 200 12 0.0070
^C- Gracefully stopping, waiting for requests to finish
=== puma shutdown: 2025-04-22 19:25:39 +0100 ===
- Goodbye!
== Sinatra has ended his set (crowd applauds)
It’s alive! What if we change the Gemfile?
# frozen_string_literal: true
source "https://rubygems.org"

gem "sinatra", '<= 4.0.0'
gem "rackup"
gem "puma", '<= 6.3.0'
Running bundle install means everything has its versions slightly changed:
GEM
  remote: https://rubygems.org/
  specs:
    base64 (0.2.0)
    mustermann (3.0.3)
      ruby2_keywords (~> 0.0.1)
    nio4r (2.7.4)
    puma (6.0.0)
      nio4r (~> 2.0)
    rack (3.1.12)
    rack-protection (4.0.0)
      base64 (>= 0.1.0)
      rack (>= 3.0.0, < 4)
    rack-session (2.1.0)
      base64 (>= 0.1.0)
      rack (>= 3.0.0)
    rackup (2.2.1)
      rack (>= 3)
    ruby2_keywords (0.0.5)
    sinatra (4.0.0)
      mustermann (~> 3.0)
      rack (>= 3.0.0, < 4)
      rack-protection (= 4.0.0)
      rack-session (>= 2.0.0, < 3)
      tilt (~> 2.0)
    tilt (2.6.0)

PLATFORMS
  ruby
  x86_64-linux

DEPENDENCIES
  puma (<= 6.3.0)
  rackup
  sinatra (<= 4.0.0)

BUNDLED WITH
   2.5.9
The manifest is changed too:
$ ~/guix/parse_lockfile/parse_lockfile_for_channel.rb Gemfile.lock
(use-modules (guix-rubygems puma) (guix-rubygems rackup) (guix-rubygems sinatra))
(use-modules ((gnu packages ruby) #:select (ruby)))
(use-modules (guix transformations))
(define transform1
(options->transformation
`((with-input . "ruby=ruby@3.3.3")
(with-input . "ruby-base64=ruby-base64@0.2.0")
(with-input . "ruby-mustermann=ruby-mustermann@3.0.3")
(with-input . "ruby-nio4r=ruby-nio4r@2.7.4")
(with-input . "ruby-puma=ruby-puma@6.0.0")
(with-input . "ruby-rack=ruby-rack@3.1.12")
(with-input . "ruby-rack-protection=ruby-rack-protection@4.0.0")
(with-input . "ruby-rack-session=ruby-rack-session@2.1.0")
(with-input . "ruby-rackup=ruby-rackup@2.2.1")
(with-input . "ruby-ruby2_keywords=ruby-ruby2_keywords@0.0.5")
(with-input . "ruby-sinatra=ruby-sinatra@4.0.0")
(with-input . "ruby-tilt=ruby-tilt@2.6.0"))))
(packages->manifest
(map transform1 (list ruby ruby-puma ruby-rackup ruby-sinatra)))
… and it still works! Yay!
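To see exactly which gems moved between the two lockfiles, the [name, version] pairs that parse_lockfile.rb builds can be compared directly. This is a minimal sketch; the helper name spec_changes is hypothetical, versions are plain strings here, and the sample data is a subset of the two lockfiles shown above.

```ruby
# Hypothetical helper: compare two lists of [name, version] pairs and return
# [name, old_version, new_version] for every gem that changed, appeared or vanished.
def spec_changes(before, after)
  old_versions = before.to_h
  new_versions = after.to_h
  (old_versions.keys | new_versions.keys).sort.filter_map do |name|
    next if old_versions[name] == new_versions[name]

    [name, old_versions[name], new_versions[name]]
  end
end

# A subset of the specs from the two lockfiles above.
before = [["logger", "1.6.5"], ["puma", "6.5.0"], ["rack", "3.1.8"],
          ["rack-protection", "4.1.1"], ["sinatra", "4.1.1"], ["tilt", "2.6.0"]]
after  = [["puma", "6.0.0"], ["rack", "3.1.12"],
          ["rack-protection", "4.0.0"], ["sinatra", "4.0.0"], ["tilt", "2.6.0"]]

spec_changes(before, after).each do |name, from, to|
  puts "#{name}: #{from || '(absent)'} -> #{to || '(absent)'}"
end
```

A nil on either side marks a gem that exists in only one lockfile, which is how logger shows up once sinatra 4.0.0 no longer depends on it.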
I skipped over a pain point. The first time I tried this out, I had puma <= 6.0.0 in the Gemfile. This doesn’t work! The manifest is generated, but you get a weird error:
/gnu/store/x52r0wbx8hgvb5bajhrdmyzmxfrqahlg-ruby-rackup-2.2.1/lib/ruby/vendor_ruby/gems/rackup-2.2.1/lib/rackup/handler.rb:81:in `pick': Couldn't find handler for: puma, falcon, thin, HTTP, webrick. (LoadError)
from /gnu/store/zb1vxr9wpa49wipd1z1wzxmzjk9bhgzc-ruby-sinatra-4.0.0/lib/ruby/vendor_ruby/gems/sinatra-4.0.0/lib/sinatra/base.rb:1624:in `run!'
from /gnu/store/zb1vxr9wpa49wipd1z1wzxmzjk9bhgzc-ruby-sinatra-4.0.0/lib/ruby/vendor_ruby/gems/sinatra-4.0.0/lib/sinatra/main.rb:47:in `block in <module:Sinatra>'
What do you mean there’s no handler for puma! It’s right there! I went looking all the way through the dependencies, literally checking out the source code and reading through old commits. I found out that rackup was only split out from mainline rack relatively recently, and began to suspect that there was a mistake in the packaging in rubygems. So I tried bundle exec ruby app.rb without any guix shell at all. And quelle surprise!
$ bundle exec ruby app.rb
/home/daniel/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/rackup-2.2.1/lib/rackup/handler.rb:81:in `pick': Couldn't find handler for: puma, falcon, thin, HTTP, webrick. (LoadError)
from /home/daniel/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/sinatra-4.0.0/lib/sinatra/base.rb:1624:in `run!'
from /home/daniel/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/sinatra-4.0.0/lib/sinatra/main.rb:47:in `block in <module:Sinatra>'
So what is going on? This combination of gems is actually incompatible, and bundler doesn’t help! I wonder if running tests would have caught this at the packaging level. Anyway: this is a good point to take a breath of air - we have a system that seems to work, at least for ruby dependencies. The next objective is something like Rails, which generally includes system dependencies as well as pure ruby code.
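Short of running each gem's test suite, a cheap check at the packaging level would be to try loading a few features inside the built environment. This is a minimal sketch with a hypothetical helper name. The feature names to check are placeholders that would have to be chosen per gem, and a require-level check only catches files that fail to load: whether it would have caught the failure above depends on checking the feature that rackup actually tries to load, which is not established here.

```ruby
# Hypothetical smoke check: try to require each feature and return the ones
# that cannot be loaded.
def unloadable_features(features)
  features.reject do |feature|
    begin
      require feature
      true
    rescue LoadError
      false
    end
  end
end

# Usage: pass feature names as arguments; exit non-zero if any fail to load.
missing = unloadable_features(ARGV)
abort "cannot load: #{missing.join(', ')}" unless missing.empty?
```

It could be run as, say, guix shell --pure -L ~/guix-rubygems -m manifest.scm -- ruby smoke.rb sinatra puma, where smoke.rb is a placeholder name for the script.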