23rd of July, 2020: After writing this up, I ran across Pipenv. This is a newer, and perhaps better, way to maintain Python environments. Unlike virtualenv, where we generate a requirements.txt file as an afterthought when needed, Pipenv uses a Pipfile (a list of packages, like a requirements.txt file) to generate the environment, which means you have a portable descriptor of your environment at all times. The separation between first-order and higher-order dependencies is also much more clearly delineated in Pipenv, as is the separation between development and production dependencies. Read more about it at the Pipenv home page. Thank you Ken Reitz and all the contributors.
Inspired by my friend's solution to my problem. Check out his blog post for a more detailed implementation of the idea for Mac/Linux users using virtualenv.
The Context
Let's start with some context. I do my work on two computers: my laptop and my desktop at work. I keep things synchronised by using GitHub as an intermediary. I also use Python virtual environments (using venv from the standard library) to keep the different sub-projects in my PhD isolated and to keep me out of dependency hell. Instead of committing gigabytes of virtual environments to GitHub, I just commit the requirements.txt file with package details and use pip to generate or update the virtual environments if and when I need them.
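To make that workflow concrete, here is a minimal sketch of the two commands involved in rebuilding an environment from a committed requirements.txt. The helper names (venv_python, rebuild_commands) are made up for illustration, and the sketch assumes the .venv folder sits next to requirements.txt in the project directory. It only builds the command lists; running them is left to you.

```python
import sys
from pathlib import Path


def venv_python(venv_dir, platform=sys.platform):
    """Return the interpreter path inside a venv for the given platform."""
    venv_dir = Path(venv_dir)
    if platform.startswith("win"):
        # Windows venvs keep their executables under Scripts\
        return venv_dir / "Scripts" / "python.exe"
    # Mac/Linux venvs keep them under bin/
    return venv_dir / "bin" / "python"


def rebuild_commands(project_dir, platform=sys.platform):
    """Build the commands that recreate .venv from requirements.txt."""
    project_dir = Path(project_dir)
    venv_dir = project_dir / ".venv"
    requirements = project_dir / "requirements.txt"
    return [
        # 1. Create (or refresh) the virtual environment.
        [sys.executable, "-m", "venv", str(venv_dir)],
        # 2. Install the pinned packages using the venv's own interpreter.
        [str(venv_python(venv_dir, platform)), "-m", "pip",
         "install", "-r", str(requirements)],
    ]


if __name__ == "__main__":
    for command in rebuild_commands("."):
        print(" ".join(command))
```

Each list can be handed to subprocess.run as-is once you are happy with what it prints.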
One of the things that kept this setup from being seamless was that I had to manually regenerate the requirements.txt file before each commit to make sure it was up to date with whatever changes I'd made to the local virtual environment. I was talking to my friend Janith about this and he came up with the following setup to automatically generate the requirements.txt file whenever pip is run. Check his blog post for a more comprehensive breakdown of the inner workings of pip and virtual environments. His post is primarily aimed at Mac/Linux users using virtualenv, whereas this one uses venv on Windows.
I am using Python version 3.6 and pip version 20. The root path for the virtual environment I'll be using is /.venv, created using python -m venv .venv on Windows.
The Fix
For those unaware, venv creates a copy of your system's Python installation. This includes its own copy of pip, the Python package manager. On Windows, when you run pip it invokes .venv\Scripts\pip.exe, so we have to dig a bit deeper to get to a place where we can inject our code. pip.exe seems to run .venv\Lib\site-packages\pip\_internal\cli\main.py. It looks something like this.
"""Primary application entrypoint.
"""
from __future__ import absolute_import

import locale
import logging
import os
import sys

from pip._internal.cli.autocompletion import autocomplete
from pip._internal.cli.main_parser import parse_command
from pip._internal.commands import create_command
from pip._internal.exceptions import PipError
from pip._internal.utils import deprecation
from pip._internal.utils.typing import MYPY_CHECK_RUNNING

if MYPY_CHECK_RUNNING:
    from typing import List, Optional

logger = logging.getLogger(__name__)


def main(args=None):
    # type: (Optional[List[str]]) -> int
    if args is None:
        args = sys.argv[1:]

    # Configure our deprecation warnings to be sent through loggers
    deprecation.install_warning_logger()

    autocomplete()

    try:
        cmd_name, cmd_args = parse_command(args)
    except PipError as exc:
        sys.stderr.write("ERROR: {}".format(exc))
        sys.stderr.write(os.linesep)
        sys.exit(1)

    # Needed for locale.getpreferredencoding(False) to work
    # in pip._internal.utils.encoding.auto_decode
    try:
        locale.setlocale(locale.LC_ALL, '')
    except locale.Error as e:
        # setlocale can apparently crash if locale are uninitialized
        logger.debug("Ignoring error %s when setting locale", e)
    command = create_command(cmd_name, isolated=("--isolated" in cmd_args))

    return command.main(cmd_args)
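If you would rather not hunt for that file by hand, a small sketch like the one below can ask Python where the module lives. It assumes the module path pip._internal.cli.main, which matches the pip 20 layout described here but is an internal detail that other pip versions may not share, so the helper (a made-up name, find_pip_cli_main) returns None when it cannot find it. Run it with the virtual environment's interpreter to get the path inside .venv.

```python
import importlib.util


def find_pip_cli_main():
    """Return the path of pip's cli/main.py, or None if it cannot be located."""
    try:
        # Looking up a dotted name imports the parent packages first.
        spec = importlib.util.find_spec("pip._internal.cli.main")
    except ModuleNotFoundError:
        # pip (or this internal package layout) is not available here.
        return None
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print(find_pip_cli_main())
```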
Let's try messing with it by changing the end of the file to look like this…
    ...
    ...
    print("Look ma! I'm in pip!")
    return command.main(cmd_args)
… and then running pip install badpackage prints our message before pip's usual output.
Now that we know the flow of the code, let's get to work. We are going to inject our own bit of code into the file to generate requirements.txt when pip finishes its other work successfully.
We are going to replace the last line of that file, return command.main(cmd_args), with the following code.
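    #1 Run the requested command, but hold on to its exit status
    main_cmd_status = command.main(cmd_args)
    #2 Only continue after a successful install or uninstall
    if (main_cmd_status == 0) and (cmd_name in ['install', 'uninstall']):
        #3 Walk up from pip\_internal\cli\main.py to the folder that contains .venv
        req_file_name = "requirements.txt"
        req_file_path = os.path.dirname(os.path.dirname(__file__))
        for i in range(5):
            req_file_path = os.path.split(req_file_path)[0]
        req_file_path = os.path.join(req_file_path, req_file_name)
        #4 Capture everything printed to stdout in an in-memory buffer
        req_file_data = io.StringIO()
        with redirect_stdout(req_file_data):
            #5 Run "pip freeze" through pip's internal API
            freeze_command = create_command('freeze')
            freeze_cmd_status = freeze_command.main([])
        #6 Write the captured package list to requirements.txt
        with open(req_file_path, 'w') as req_file:
            req_file.write(req_file_data.getvalue())
    #7 Hand the original exit status back to the caller
    return main_cmd_status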
We also need the following imports, so add these lines to the top of the file.
import io
from contextlib import redirect_stdout
Now when you run pip install or pip uninstall, it will automatically generate the requirements.txt file. Let's go through how it works. I've numbered the code so you can refer back to the exact lines that do a particular thing.
- #1: We stop command.main from returning immediately by changing return command.main(cmd_args) to main_cmd_status = command.main(cmd_args).
- #2: Then we check whether pip was called to install or uninstall something and whether that process ended successfully. That way we only run on successful pip install or pip uninstall calls and not for something like pip list.
- #3: If the above condition is true, we set the requirements.txt file path to be the parent of .venv (i.e. the directory where you ran python -m venv .venv).
- #4: We redirect stdout so we can save the output from pip freeze to a file without using a subprocess call.
- #5: Then we invoke the pip freeze command using the internal API.
- #6: We take the captured output from pip freeze and write it to the requirements file.
- #7: Finally, we return the result of the original command so we don't break anything upstream.
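Steps #3 to #6 can be tried outside pip with a small stand-alone sketch. Everything in it is illustrative: requirements_path and capture_stdout are hypothetical helper names, fake_freeze is a stand-in for the real freeze command, and the C:\proj path and package line are made-up examples. It uses PureWindowsPath so the path arithmetic behaves the same on any operating system.

```python
import io
from contextlib import redirect_stdout
from pathlib import PureWindowsPath


def requirements_path(main_py):
    """Step #3: go from ...\\.venv\\Lib\\site-packages\\pip\\_internal\\cli\\main.py
    to the folder that contains .venv and append requirements.txt."""
    # parents[0] is cli, parents[5] is .venv, parents[6] is the project folder.
    return PureWindowsPath(main_py).parents[6] / "requirements.txt"


def capture_stdout(func, *args):
    """Steps #4 and #5: run func and collect whatever it prints."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        status = func(*args)
    return status, buffer.getvalue()


def fake_freeze():
    """Stand-in for the freeze command: prints a package list, returns 0."""
    print("requests==2.24.0")
    return 0


if __name__ == "__main__":
    target = requirements_path(
        r"C:\proj\.venv\Lib\site-packages\pip\_internal\cli\main.py"
    )
    status, text = capture_stdout(fake_freeze)
    # Step #6 would write `text` to `target`; here we only show both.
    print(target)
    print(status, repr(text))
```

Note that redirect_stdout only sees text written through Python's sys.stdout, which is why it works for a command running in the same process.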
And that's it! When you run pip install successfully, pip will create an updated requirements.txt file in the same directory as the virtual environment.
However, on Windows, at the moment, you have to run the pip install command twice for this to happen. Because the exact functionality of pip is hidden in pip.exe, I don't know what it does after running the user's command without disassembling it. It seems that whatever happens afterwards is responsible for updating the package list. This means that until .venv\Lib\site-packages\pip\_internal\cli\main.py exits, changes made to the environment from inside the script won't propagate out. For example, if I run pip list from inside that file after successfully installing a package, it shows the old package list. I'm not entirely sure how to get around this at the moment. However, if you have already installed a package, the second invocation of pip install should detect that and only update the requirements file.
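One thing that might be worth trying is running the freeze in a fresh Python process, so that it reads the environment from scratch. This is an untested guess: it assumes the stale list comes from state held inside the running pip process, which nothing above confirms, and it gives up the "no subprocess" approach used in the patch. The helper names below are made up for illustration.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command in a new process and return its standard output as text."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode; this spelling works on Python 3.6
        check=True,               # raise if the command exits with an error
    )
    return completed.stdout


def freeze_in_fresh_process():
    """Ask a brand-new interpreter for the package list (assumption: this
    avoids the stale list seen when freezing inside the same process)."""
    return run_and_capture([sys.executable, "-m", "pip", "freeze"])


if __name__ == "__main__":
    # Harmless demonstration that does not touch pip at all.
    print(run_and_capture([sys.executable, "-c", "print('demo==1.0')"]))
```

Since freeze is neither install nor uninstall, calling it this way from the patched main.py would not trigger the patch again.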
You can check Janith's blog post for something that works on Mac/Linux without needing to run the command twice.
As you might expect, this will stop working the next time you update pip, since the update overwrites all of pip's files.
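A quick way to notice when that has happened is to check whether the patched lines are still in the file. The sketch below is only a suggestion: the helper names are hypothetical, and the marker is simply one line from the patch above, so adjust it if your copy differs.

```python
from pathlib import Path

# One distinctive line from the patch; any unique line from it would do.
PATCH_MARKER = "freeze_command = create_command('freeze')"


def contains_patch(source_text, marker=PATCH_MARKER):
    """Return True if the given source code still contains the patch."""
    return marker in source_text


def is_patched(main_py_path, marker=PATCH_MARKER):
    """Read pip's cli/main.py from disk and look for the patch marker."""
    source_text = Path(main_py_path).read_text(encoding="utf-8")
    return contains_patch(source_text, marker)
```

If is_patched returns False after an upgrade, the edit has to be applied again.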