Note: if you are not familiar with PEP 345, you might want to read
it to understand this entry. For instance, it adds "Requires-Dist",
which is similar to setuptools' install_requires and provides a
standard way to describe dependencies.
The GSOC has started and we are already working on a lot of packaging
tasks. The main difficulty is to make sure each student works
without overlapping with the others and never gets blocked. That's why
we will have weekly meetings with (almost) everyone. In parallel, the
nice posse from the Montreal user group is organizing Distutils sprints
quite often now. That means we now have a lot of manpower for
Distutils, and things are starting to speed up.
There's one controversial topic, though, that we need to straighten out:
do we want to add an installer to Distutils2? And since Distutils2's
goal is to be back in the stdlib for Python 3.2, that means: do we want
to add an installer to the stdlib?
My answer so far is Yes. And that's what I'll be working on unless
someone is able to change my mind :)
What is Distutils2?
Let me first explain what the Distutils2 project is, and what we want
it to provide. Like its predecessor, Distutils2 wants to provide two
things:
1. a toolbox for third-party packaging tools, whether they are simple
installers or full-featured package managers (PyPM, Pip, Enthought
Installer, etc.). This toolbox will include (if it does not already)
reference implementations of PEP 345, PEP 376 and PEP 386. In other
words, if you want to create the next killer packaging system, you
can use modules like distutils2.version (PEP 386) or
distutils2.metadata (PEP 345) to build it, without depending on
the "everything is a command" philosophy of Distutils.
2. a standalone tool that can be used to install or remove
distributions. That's what Distutils is and that's what we want to
provide in the future in Distutils2: the ability to install projects
(and therefore their dependencies, since dependencies are a new
metadata field we added in PEP 345).
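To make the "dumb metadata" idea concrete, here is a minimal sketch of how a
tool could read the "Requires-Dist" fields of a PEP 345 metadata file. It
does not use distutils2.metadata; it only relies on the fact that the
metadata file uses an email-header-like format, so the stdlib email parser
can read it. The metadata content and the project names in it are made up
for illustration.

```python
from email.parser import Parser

# Hypothetical metadata content; the project names are made up.
PKG_INFO = """\
Metadata-Version: 1.2
Name: example-project
Version: 0.1
Requires-Dist: foo (>=1.0)
Requires-Dist: bar
"""


def read_requirements(pkg_info_text):
    """Return the Requires-Dist entries found in a metadata file."""
    message = Parser().parsestr(pkg_info_text)
    # get_all() returns None when the field is absent, hence the fallback.
    return message.get_all("Requires-Dist") or []


requirements = read_requirements(PKG_INFO)
print(requirements)
```

The point is that an installer can discover dependencies by reading static
metadata, without running any code shipped by the project.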
The controversy is about point 2. Providing a script in Distutils2 that
installs dependencies from PyPI is controversial because some projects,
like Pip, already provide this feature.
Our current packaging ecosystem explained
A few years ago, before Setuptools added the ability to install
dependencies via easy_install, installing a distribution of a
given project was as simple as running python setup.py install.
This installed the distribution on the target system, in the proper
locations defined by the install command. That's it.
Setuptools grew organically on top of Distutils to provide new
metadata like the "install_requires" field, which lists dependencies.
Setuptools provided two things:
- A new install command that triggers the installation of
dependencies, by reading the setuptools-specific "install_requires"
metadata, fetching the dependencies from PyPI and installing them
recursively.
- An easy_install script that can be used to install a
distribution located at PyPI. That's just a bootstrap on top of
the new install command. In other words, it grabs the archive from
PyPI, unpacks it, and runs "python setup.py install" on it.
This means that your Python project's setup.py is the installer
itself: when you use setuptools, it calls its specific install
command and triggers the installation chain.
That's when the mess started: people who didn't have setuptools
installed couldn't install projects that were using it, of course. The
solution that was provided was an ez_setup.py script
that you have to include in your project and run when setup.py is
used, to be able to run your installation. In other words, your setup.py
is bootstrapping the installation and use of setuptools. And that
turned out to be really messy, since Setuptools has its own way of
installing things. I hope I don't sound harsh here: Setuptools is the
best thing that has happened to packaging in years. And a lot of our
current work is to bring its features back into the "mainstream".
The result is that you, as an end user, do not control which installer is
going to be used, and you end up with a site-packages that contains
projects installed in different ways, by different installers.
I am strongly against this behavior because of the mess it creates.
In my opinion a python source distribution should not embed an
installer and force its usage like this. We need to separate concerns:
a python source project should be a dumb container with the code, and
with some metadata.
Then Pip showed up.
Pip is an installer script that grabs the project you want to install
and runs "python setup.py install" on it. That's all it does when the
project is a plain Distutils one. When it encounters a Setuptools
project, it blocks the installation of the project's dependencies that I
described earlier, and installs it like a simple Distutils project.
Then, it analyzes its dependencies and installs each one of them
separately.
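The behavior described above can be sketched as a loop that is driven by the
installer rather than by each project's setup.py. This is only an
illustration of the idea, not Pip's actual code: the index is a hypothetical
in-memory dictionary standing in for PyPI, and install_one is a hypothetical
callable that installs a single project with dependency handling turned off.

```python
from collections import deque

# Hypothetical index: project name -> names of its declared dependencies.
FAKE_INDEX = {
    "webapp": ["framework", "templating"],
    "framework": ["templating"],
    "templating": [],
}


def install_all(project, index, install_one):
    """Install a project, then each of its dependencies, one at a time."""
    installed = []
    pending = deque([project])
    while pending:
        name = pending.popleft()
        if name in installed:
            continue  # already handled, do not install it twice
        install_one(name)  # installs only this project, not its dependencies
        installed.append(name)
        # The installer, not the project, decides what gets installed next.
        pending.extend(index[name])
    return installed


calls = []
order = install_all("webapp", FAKE_INDEX, calls.append)
print(order)
```

Each project is installed exactly once, and no project ever triggers the
installation of another one by itself.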
That's really the way to go because it breaks what setuptools is
enforcing: projects are not installing other projects in the process
anymore. And in the long term, it will allow us to get rid of setup.py
(but that's another blog post). And I hope Pip will soon be able to
install Distutils2 projects because Distutils2 provides unified metadata
(distutils+setuptools -> PEP 345).
Distutils2 vs Pip
So, as I said before: it's controversial to provide a script in
Distutils2 that installs dependencies from PyPI, because Pip already
provides this feature.
But one goal of Distutils2 (like Distutils) is to provide a command that
installs a Distribution on your system so that it works. And the concept
of "Distribution" has evolved, thanks to PEP 345. This means that it
needs to install dependencies now, exactly the way Pip does.
We could just tell people to install Pip on top of the stdlib. But
the goal is to provide in the stdlib a working packaging environment
that offers a minimum set of features. The goal is to have something
that works when you install Python 3.2, like what was provided when
distutils was brought in (i.e. batteries included).
Mac OS X includes easy_install, and I don't see any good reason not to
include a package installer in the Python stdlib itself. At least we
will have control over which script gets installed by default
with Python.
That's why I have proposed to include Pip in Distutils2, but Ian and
Carl seem a bit reluctant for various reasons. One of them is that
having Pip included in the stdlib will slow down their work. I don't
think this is true as long as it's included carefully. If Distutils2
allows its installer to be replaced by another one through
configuration, then Pip can have new releases independently of the
version included in the stdlib, and people can upgrade their system
without having to wait for the next Python release.
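Here is a minimal sketch of what "replaceable through configuration" could
look like. Nothing here is an existing Distutils2 feature: the [install]
section, the installer option and the two installer functions are all
hypothetical names chosen for the example.

```python
import configparser


def builtin_installer(project):
    # Stand-in for the installer shipped with the stdlib.
    return "builtin installer would install %s" % project


def alternative_installer(project):
    # Stand-in for a newer, separately released installer.
    return "alternative installer would install %s" % project


# Hypothetical registry of the installers available on the system.
INSTALLERS = {
    "builtin": builtin_installer,
    "alternative": alternative_installer,
}


def load_installer(config_text):
    """Pick the installer named in the configuration, or the builtin one."""
    config = configparser.ConfigParser()
    config.read_string(config_text)
    # The fallback applies when the section or the option is missing.
    name = config.get("install", "installer", fallback="builtin")
    return INSTALLERS[name]


default = load_installer("")
replaced = load_installer("[install]\ninstaller = alternative\n")
print(default("foo"))
print(replaced("foo"))
```

With something along these lines, the stdlib version would only be a
default, and users could switch to a newer release without waiting for the
next Python.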
In any case, we are working on the various bits that make up an
installer in Distutils2 during GSOC, since one of the goals of the
project, as I said earlier, is to provide a toolbox. So if the merge does
not occur, it's likely that we will start an installer/uninstaller
script in Distutils2, and it will look a lot like Pip, I guess.
EDIT: to make things clearer, when I am saying that both projects should merge, I am only referring to the raw "install with dependencies" features in Pip, and not all the other features.