With pyenv, we solved one problem - how to easily switch between different Python versions. But even if you are using the same version of Python all the time, you still have another problem - managing Python dependencies. The issue is that you can’t have multiple versions of the same package installed on your computer. Every package that you install goes to some_directory/site-packages/package_name/ - notice that there is no version in the folder name! You can’t have Django 2 and Django 3 installed at the same time - they would end up in the same directory!
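You can see this for yourself by asking a package where it lives (flask here is just an example, and the path will look different on your machine):
$ python -c "import flask; print(flask.__file__)"
/usr/local/lib/python3.7/site-packages/flask/__init__.py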
If you come from the JavaScript world, you know the node_modules folder, where you can install packages locally. When you run a node command, it first looks in the node_modules folder of the current directory. Python has no concept of local packages (yet!). There is a proposal (PEP 582 – Python local packages directory), but it’s hard to say when (or “if”) it will be implemented.
So each time you run pip install <some_package>, pip will check if that package is already installed on your computer. If it’s not, it will install the latest version. You can also ask for a specific version of a package - for example, Django 2.2, because your application won’t work with either a higher or lower version. In that case, pip will first check if you already have Django 2.2.x installed on your system:
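A quick sketch of what that could look like (the ~=2.2.0 specifier is just one way to say “any 2.2.x release” - pin whatever your project actually needs):
# Install the latest 2.2.x release of Django; if a matching version is
# already installed, pip will leave it alone
$ pip install "django~=2.2.0"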
Great, but if you now switch to that other project that was using Django 3.0, it will stop working - there is no Django 3.0 on your computer! You have to run pip install django==3.0, which will uninstall version 2.2 and install version 3.0. And the more projects you have, the more annoying this can become.
Since you can’t have two versions of the same package installed in the global site-packages directory (that’s the directory where pip installs packages), the idea of “virtual environments” was born.
A virtual environment prepends a folder with some binaries to the $PATH variable (the same way pyenv does). Each time you run the python command, you will automatically use the Python interpreter and site-packages from that virtualenv folder. And the pip command will install packages inside the virtual environment, not in the global site-packages.
With virtual environments, the main idea is that you use a separate virtualenv for each of your Python projects. That way, you won’t mix their dependencies.
The good news is - you don’t have to install anything! Since Python 3.3, there is a built-in module called venv that you can use to create virtual environments.
To create a new virtual environment run:
$ python -m venv my-virtualenv
This will create a my-virtualenv directory containing a copy of (or a symlink to) the Python interpreter, pip, the activation scripts, and an empty site-packages folder.
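On Linux and macOS, the layout looks roughly like this (on Windows, the scripts live in a Scripts folder instead of bin):
$ ls my-virtualenv
bin  include  lib  pyvenv.cfg
$ ls my-virtualenv/bin
activate  activate.csh  activate.fish  pip  pip3  python  python3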
To start using this virtual environment, you first need to activate it:
$ source ./my-virtualenv/bin/activate
On Windows run my-virtualenv\Scripts\activate.bat
This will prepend the my-virtualenv/bin folder to the beginning of the $PATH variable. That way, python and pip will use the versions from the virtual environment instead of the global ones installed on your computer.
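You can check this yourself - the paths below are made up, but the pattern will be the same on your machine:
$ which python
/usr/bin/python
$ source ./my-virtualenv/bin/activate
(my-virtualenv) $ which python
/Users/test/my-virtualenv/bin/python
(my-virtualenv) $ echo $PATH
/Users/test/my-virtualenv/bin:/usr/local/bin:/usr/bin:/bin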
That’s it - now you are inside an isolated environment. You can install packages with pip, and they won’t collide with packages installed in other environments or with the global ones:
$ source ./my-virtualenv/bin/activate
(my-virtualenv) $ pip freeze
(my-virtualenv) $ pip install flask
Collecting flask
....
Installing collected packages: MarkupSafe, Jinja2, Werkzeug, itsdangerous, click, flask
Successfully installed Jinja2-2.11.1 MarkupSafe-1.1.1 Werkzeug-1.0.0 click-7.0 flask-1.1.1 itsdangerous-1.1.0
(my-virtualenv) $ pip freeze
Click==7.0
Flask==1.1.1
itsdangerous==1.1.0
Jinja2==2.11.1
MarkupSafe==1.1.1
Werkzeug==1.0.0
(my-virtualenv) $ deactivate
$ pip freeze
# Either an empty list or some other packages that you installed globally
If you want to stop using this virtual environment, run:
$ deactivate
Remember that when you activate a virtual environment, it works only for the current shell session! If you open a new terminal, you need to run the activate command again.
It’s a good idea to display the currently used virtualenv somewhere in your terminal. Many custom shell prompts have this feature built in (including the default macOS shell). If yours doesn’t and you want to enable this feature, print out the $VIRTUAL_ENV variable - venv sets it when you activate the virtual environment and unsets it when you deactivate it.
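For example, here is a minimal sketch for bash (assuming you build your PS1 yourself in ~/.bashrc and your prompt doesn’t already show this):
# Print the name of the active virtualenv (if any) in front of the prompt
venv_prompt() {
    if [ -n "$VIRTUAL_ENV" ]; then
        printf '(%s) ' "$(basename "$VIRTUAL_ENV")"
    fi
}
PS1='$(venv_prompt)\u@\h:\w\$ '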
That’s pretty much all you need to know to use virtual environments. However, there are some variations on this topic - their main purpose is to make using venvs easier in one way or another.
If you used the pyenv installer in the previous section (by running curl https://pyenv.run | bash), then you have also installed the pyenv-virtualenv plugin. You can use it to create and manage virtual environments:
# Create a virtualenv named "my-project" that uses "3.7.4" version of Python:
$ pyenv virtualenv 3.7.4 my-project
# You can skip the first parameter "3.7.4" to use the current version of Python
$ pyenv virtualenv my-project
# List available virtual environments
$ pyenv virtualenvs
my-other-project (created from /Users/test/.pyenv/versions/3.8.1)
my-project (created from /Users/test/.pyenv/versions/3.7.4)
# Activate one of the virtual envs
$ pyenv activate my-project
# Deactivate
$ pyenv deactivate
virtualenvwrapper is an old but still very popular virtual environment management tool. The Installation guide has instructions on how to install it on Windows/Linux/macOS (it works with bash, ksh, and zsh). You will need to add some lines to your shell configuration, similar to what we did with pyenv.
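For reference, those lines usually look something like this for bash or zsh (the exact path to virtualenvwrapper.sh depends on how and where you installed it):
# Where virtualenvwrapper keeps the virtual environments
export WORKON_HOME=$HOME/.virtualenvs
# Which Python to use when creating new environments
export VIRTUALENVWRAPPER_PYTHON=$(which python3)
# Load the virtualenvwrapper commands - adjust this path for your system
source /usr/local/bin/virtualenvwrapper.sh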
Once you install it, it will give you access to a similar set of commands as pyenv-virtualenv did:
# Create a virtualenv named "my-project"
$ mkvirtualenv my-project
Using base prefix '/Users/YOUR_USERNAME/.pyenv/versions/3.7.4'
New python executable in /Users/YOUR_USERNAME/.virtualenvs/my-project/bin/python
Installing setuptools, pip, wheel...
done.
# List available virtual environments
$ lsvirtualenv
my-project
my-other-project
...
# Activate one of the virtual envs
$ workon my-project
# Deactivate
$ deactivate
On Windows you can try to use virtualenvwrapper-win. However, it might not work together with pyenv-win (see this and this GitHub issue).
Since virtualenvwrapper doesn’t work with the fish shell, I’m using virtualfish instead. It has the same set of functionality as virtualenvwrapper, but the commands are even shorter (vf new, vf ls, vf rm, vf activate).
If you are already using pipenv, I’m not going to tell you to change it now. But if you are looking to start using it, I can’t recommend this tool. It’s poorly maintained (the last release was on 2018.11.26, and there are 300+ open issues), and that’s causing a lot of controversy around the project (https://github.com/pypa/pipenv/issues/4058). There are plenty of other good, actively maintained projects - if you are looking for something beyond a combination of virtualenvs and pip, I suggest you check out poetry instead.
Poetry is a whole new way to manage Python projects. It generates a scaffolding for your project, manages its dependencies, prepares the project to be published on PyPI, etc. If you are looking for one comprehensive way to manage your Python projects, definitely check it out!
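To give you a taste, a typical poetry workflow looks more or less like this (commands may change between versions, and my_script.py is just a placeholder):
# Create a new project with a standard layout and a pyproject.toml file
$ poetry new my-project
# Add a dependency (this updates pyproject.toml and the lock file)
$ poetry add flask
# Install all dependencies inside a virtual environment managed by poetry
$ poetry install
# Run something inside that virtual environment
$ poetry run python my_script.py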
What about Anaconda or Miniconda? If you come from the data-science part of the Python community, you are probably familiar with Anaconda. It makes installing new Python versions and packages much easier. It automatically creates virtual environments and installs dependencies inside.
When you use Anaconda, you often don’t have to worry about missing dependencies - they will usually be bundled together with the packages.
The main difference between using pip and Anaconda (which uses the conda package manager) is that the latter doesn’t install packages directly from PyPI. It installs binaries, which means that someone has to build a binary and publish it to Anaconda’s repository before you can use it. Most of the time, that won’t be a problem - if you stick to popular packages, most of them are available. But if you want to install a brand-new package or a package that you just created - well, you can’t. As explained in their documentation: “Occasionally a package is needed which is not available as a conda package but is available on PyPI and can be installed with pip.”
I would recommend that you learn how to manage your dependencies with virtual environments and pip. It’s more difficult than just using conda, but that’s what most Python developers are doing, and it will give you much more flexibility in the long run.
However, if you are struggling with dependencies management and you are mostly doing data science or machine learning (so you use the same popular packages all the time), conda might be just right for you!
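If you decide to go that way, the basic conda workflow looks like this (assuming you have Anaconda or Miniconda installed; the Python version and package are just examples):
# Create a new environment with a specific Python version
$ conda create --name my-project python=3.7
# Activate it
$ conda activate my-project
# Install a package from Anaconda’s repositories (not from PyPI)
$ conda install pandas
# Deactivate when you are done
$ conda deactivate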
You might be wondering how you should organize your virtual environments. There are two popular methods, and both of them have their pros and cons:
Some people like to put all of them in one place - for example, in the ~/.virtualenvs folder (most tools do that by default). The main advantage of this approach is that you can use tools that help you manage virtual environments: you can easily list, activate, or remove them, and it doesn’t matter which folder you are currently in. That’s my preferred way too - it’s convenient to just run workon my-project without searching for the activate script.
Other people like to create the virtualenv in the same folder where their project is located (this is what happens if you run python -m venv inside the project folder). This approach has a different benefit - most IDEs (like VS Code and PyCharm) can detect that you are using a virtual environment and automatically activate it for you. And when you remove a project that you no longer use, you automatically clean up its virtual environment.
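If you go this route, a common convention is to call the folder .venv - many editors (including VS Code) will pick it up automatically (my-project below is just an example):
$ cd my-project
$ python -m venv .venv
$ source .venv/bin/activate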