Notes on seeking wisdom and crafting software

Notes on vim python interop

I stumbled upon an interesting problem with the Jedi language service for python. On digging further, I realized that my understanding of how vim runs python had been wrong all along. So let’s dedicate this post to correcting it.

Update: there’s a behavior change since vim 8.0.1451. See setting python home post for details.

VIM and Jedi

Similar to OmniSharp (a language service for .NET), Jedi is a language support service for python. It works with several editors. There are two parts to it: the editor plugin (vim-jedi here) and the jedi python package.

vim-jedi loads a python layer which finds the jedi python package and asks it for source completions.

Defining the problem

I had been using jedi in this way:

  1. Create a virtual env and activate it
  2. Install jedi in the virtual env
  3. When vim runs in the virtual env, it uses jedi package for auto completion

This scenario recently broke. There are at least two ways to create a virtual environment in python 3.

Method 1: python venv module

# create a python virtual env with venv in directory myenv
> python -m venv myenv
> source myenv/bin/activate

# install jedi package and open vim
> pip install jedi
> vim foo.py

# run `:python3 import jedi` within vim

Method 2: pipenv

# create a python virtual env with pipenv tool
> pipenv --python 3
> pipenv shell

# install jedi package and open vim
> pipenv install --dev jedi
> vim foo.py

# run `:python3 import jedi` within vim

Problem statement: with method 1 jedi works, with method 2 it doesn’t!

Let’s make it more complicated :) Here’s a command that works in both environments:

> python -m jedi
# no error, which means jedi is available to the python interpreter
# in both environments

If it works outside vim, what’s going on within vim?
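One way to start answering that is to ask each interpreter the same questions. Here is a minimal sketch that reports the prefix and whether a package can be found. The helper name describe_interpreter is made up for this post. Run it with python from the shell; inside vim, :py3file should run the same file in whatever interpreter vim uses, so the two outputs can be compared.

```python
import importlib.util
import sys


def describe_interpreter(package="jedi"):
    """Collect the facts that decide whether `package` can be imported."""
    spec = importlib.util.find_spec(package)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        "package_found": spec is not None,
        "package_origin": spec.origin if spec else None,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the prefix differs between the shell and vim, the two are not looking at the same site-packages directory.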

On python interop

Whether vim can interop with python is decided by compile-time flags. To work with both python2 and python3, the python libraries must be loaded dynamically, i.e. :version in vim should list the +python/dyn and +python3/dyn flags.
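As a rough illustration, the flags can be picked out of the :version text with a few lines of python. This is only a sketch: the function name and the sample line are made up, and it assumes the features appear as whitespace-separated tokens such as +python3/dyn or -python.

```python
import re

# Matches tokens such as "+python3/dyn", "-python" or "+python3".
FEATURE = re.compile(r"(?<!\S)([+-])(python3?)(/dyn\S*)?(?!\S)")


def python_interop(version_text):
    """Map 'python' / 'python3' to 'dynamic', 'static' or 'absent'."""
    result = {}
    for sign, name, dyn in FEATURE.findall(version_text):
        if sign == "-":
            result[name] = "absent"
        else:
            result[name] = "dynamic" if dyn else "static"
    return result


# Made-up excerpt of a feature list, for illustration only.
sample = "+mouse_xterm +python/dyn +python3/dyn -ruby +wildmenu"
print(python_interop(sample))
# prints {'python': 'dynamic', 'python3': 'dynamic'}
```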

Checking :version in our vim showed the correct compilation flags. Next, we tried the following command to see where the python interpreter actually runs.

> vim

# type the following ex command in vim
# :python3 import time; time.sleep(300000)

> htop
# the process tree view doesn't show python as a child process of vim !?

Alright, so vim doesn’t invoke the python interpreter as a separate process. So there must be a python interpreter embedded within the vim process. Let’s check it out!

> gdb vim
(gdb) start
(gdb) c

# execute a python command in vim
# :python3 import sys; print(sys.path)
# suspend vim process: Ctrl+Z
(gdb) info sharedlibrary

# this shows /usr/lib/python3.6/
0x00007fffe938c220  0x00007fffe9556e3d  Yes (*)     /usr/lib/libpython3.6m.so.1.0
...
0x00007fffe86edb20  0x00007fffe86ef467  Yes (*)     /usr/lib/python3.6/lib-dynload/_heapq.cpython-36m-x86_64-linux-gnu.so
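The same check can be done without gdb on Linux, since /proc/<pid>/maps lists every file mapped into a process. Below is a small sketch under that assumption; the function names are made up and the sample text is a hand-written imitation of a maps file, not real output.

```python
import re
from pathlib import Path


def libpython_paths(maps_text):
    """Return the libpython files mentioned in the text of a maps file."""
    paths = set()
    for line in maps_text.splitlines():
        # Columns: address, perms, offset, dev, inode, then an optional path.
        fields = line.split(None, 5)
        if len(fields) == 6 and re.search(r"/libpython\d", fields[5]):
            paths.add(fields[5].strip())
    return sorted(paths)


def libpython_in_process(pid):
    """Linux only: inspect a running process, e.g. the pid of vim."""
    return libpython_paths(Path(f"/proc/{pid}/maps").read_text())


# Hand-written sample in the maps format, for illustration only.
sample_maps = """\
7fffe938c000-7fffe9556000 r-xp 00000000 08:01 1234 /usr/lib/libpython3.6m.so.1.0
7fffe9556000-7fffe9560000 rw-p 001ca000 08:01 1234 /usr/lib/libpython3.6m.so.1.0
7ffff7dd0000-7ffff7df0000 r-xp 00000000 08:01 5678 /usr/lib/ld-2.26.so
7ffff7ff0000-7ffff7ff1000 rw-p 00000000 00:00 0
"""
print(libpython_paths(sample_maps))
```

An empty list before the first :python3 command and a libpython entry afterwards would match what gdb shows above.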

Point noted. And the vim source code does imply that. Check out if_python3.c in the vim source code:

  • Cross-platform library load routines are defined here
  • load_dll loads the library, symbol_from_dll loads the symbols from it
  • And there’s an interpreter here

So the fact that python -m jedi works outside vim tells us little: clearly that’s not how vim executes python code!

Specifics of vim + python

We’re quite close. There are two other details we need to understand:

  1. An embedded python interpreter gets its PYTHONHOME via the Py_SetPythonHome function call. This sets the prefix and exec-prefix for the interpreter.
  2. When a virtual environment is created the python -m venv way, a $VIRTUAL_ENV/pyvenv.cfg file is created. See PEP 405 for details. Relevant parts of the document:

    In this case, prefix-finding continues as normal using the value of the home key as the effective Python binary location, which finds the prefix of the base installation. sys.base_prefix is set to this value, while sys.prefix is set to the directory containing pyvenv.cfg.

    (If pyvenv.cfg is not found or does not contain the home key, prefix-finding continues normally, and sys.prefix will be equal to sys.base_prefix.)
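The rule quoted above can be sketched in a few lines of python. This is a simplified illustration of the PEP text, not CPython's actual implementation; the function names are made up for this post. Per PEP 405, pyvenv.cfg is looked for next to the executable or one directory up.

```python
import os


def find_pyvenv_cfg(executable):
    """Look for pyvenv.cfg beside the executable, or one directory up."""
    exe_dir = os.path.dirname(os.path.abspath(executable))
    for candidate_dir in (exe_dir, os.path.dirname(exe_dir)):
        candidate = os.path.join(candidate_dir, "pyvenv.cfg")
        if os.path.isfile(candidate):
            return candidate
    return None


def read_home(cfg_path):
    """Return the value of the `home` key, or None if it is missing."""
    with open(cfg_path, encoding="utf-8") as cfg:
        for line in cfg:
            key, sep, value = line.partition("=")
            if sep and key.strip().lower() == "home":
                return value.strip()
    return None


def effective_prefix(executable, base_prefix):
    """sys.prefix as described in the quote: the directory holding
    pyvenv.cfg when it has a `home` key, else the base prefix."""
    cfg = find_pyvenv_cfg(executable)
    if cfg is None or read_home(cfg) is None:
        return base_prefix
    return os.path.dirname(cfg)
```

For example, effective_prefix("/tmp/myenv/bin/python3", "/usr") gives /tmp/myenv when /tmp/myenv/pyvenv.cfg exists with a home key, and /usr otherwise.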

So what happens in vim?

  1. vim calls Py_SetPythonHome with /usr as the prefix before starting the embedded interpreter. This sets sys.prefix to /usr, and thus the site-packages directory to /usr/lib/python3.6/site-packages.
  2. Now python -m venv creates a pyvenv.cfg file in the virtual env it creates. This changes the sys.prefix set by vim to the /tmp/myenv directory, and thus site-packages becomes /tmp/myenv/lib/python3.6/site-packages.

In the pipenv environment, sys.prefix stays at /usr, so the embedded interpreter looks in /usr/lib/python3.6/site-packages. It can’t find the jedi package there, since jedi is only installed within the virtual environment.

In the python -m venv environment, site-packages has been switched to the virtual environment directory, so the interpreter does find the jedi package and everything works great!
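To make the two outcomes concrete, here is a tiny sketch of how the site-packages directory follows from the prefix. It assumes the lib/pythonX.Y/site-packages layout seen in the paths above; the helper name is made up.

```python
import posixpath


def site_packages_for(prefix, version="3.6"):
    """Build the site-packages path that belongs to a given prefix."""
    return posixpath.join(prefix, "lib", f"python{version}", "site-packages")


# The two prefixes discussed in this post.
scenarios = {
    "pipenv (prefix left at /usr)": "/usr",
    "python -m venv (prefix moved by pyvenv.cfg)": "/tmp/myenv",
}

for label, prefix in scenarios.items():
    print(f"{label}: {site_packages_for(prefix)}")
```

Only the second path contains the jedi package installed in the virtual environment.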

Lessons

  1. vim is not aware of the virtual environment. It just loads the shared library and tries to embed a python interpreter.
  2. vim always sets PYTHONHOME to /usr.
  3. python -m venv places a pyvenv.cfg at $VIRTUAL_ENV, which makes python natively aware of the virtual environment. sys.prefix changes accordingly, and with it the site-packages directory.

Thus we need not install jedi in every python project with pipenv. We can install the package system-wide using pacman -S python2-jedi python-jedi.

jedi will then be loaded from the /usr/lib/python3.6/site-packages/jedi location. It will still be aware of the virtualenv, and will accordingly show completion and refactoring in the editor (only for packages within the VIRTUAL_ENV site-packages).

Appendix

How do we prove vim indeed sets PYTHONHOME?
We can do it by setting a function breakpoint in gdb and examining the argument. Here’s how:

> gdb vim
(gdb) break Py_SetPythonHome
Function "Py_SetPythonHome" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (Py_SetPythonHome) pending.

(gdb) start
Temporary breakpoint 2 at 0x66ad0
Starting program: /usr/bin/vim
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Temporary breakpoint 2, 0x00005555555baad0 in main ()

(gdb) c
Continuing.

Breakpoint 1, 0x00007fffed0a176e in Py_SetPythonHome () from /usr/lib/libpython2.7.so.1.0
# skip the python 2.7 load

(gdb) c
Continuing.
# vim starts now, input following command
:python3 import sys
Breakpoint 1, 0x00007fffe93aa44d in Py_SetPythonHome () from /usr/lib/libpython3.6m.so.1.0

# we don't have symbols, so let's read the argument from the registers
# %rdi holds the first arg, %rsi the second arg etc. see https://cons.mit.edu/sp17/x86-64-architecture-guide.html
# the argument is a wide (wchar_t) string, so each character is followed by NUL padding

(gdb) x/10s $rdi
0x5555557f5ed0: "/"
0x5555557f5ed2: ""
0x5555557f5ed3: ""
0x5555557f5ed4: "u"
0x5555557f5ed6: ""
0x5555557f5ed7: ""
0x5555557f5ed8: "s"
0x5555557f5eda: ""
0x5555557f5edb: ""
0x5555557f5edc: "r"

Point noted :)
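Why does the dump show one letter at a time? In python 3, Py_SetPythonHome takes a wchar_t string, and gdb's x/s reads plain NUL-terminated byte strings. The sketch below reproduces the dump, assuming a 4-byte little-endian wchar_t (the usual case on x86-64 Linux). The helper name is made up.

```python
def c_strings(raw, count):
    """Split raw bytes the way x/s does: one NUL-terminated string at a time."""
    found, pos = [], 0
    while len(found) < count and pos < len(raw):
        end = raw.index(b"\x00", pos)
        found.append(raw[pos:end].decode("ascii"))
        pos = end + 1
    return found


# "/usr" as a wide string, followed by a wide NUL terminator.
raw = "/usr".encode("utf-32-le") + b"\x00" * 4

print(c_strings(raw, 10))
# prints ['/', '', '', 'u', '', '', 's', '', '', 'r']

print(raw.decode("utf-32-le").rstrip("\x00"))
# prints /usr
```

The ten strings match the gdb output above, which confirms that the argument is /usr.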

Can I play around with the embedded interpreter?
You can create one for yourself, mimicking vim in the context of this post. This gist has the source code listing.

And here’s how the output looks across python -m venv and pipenv virtual environments.

The shell on the left is the python -m venv environment, and on the right we have the pipenv environment. Note the sys.prefix output (marked red) and the site-packages listing (marked green).

Hope you enjoyed this post. Namaste!