Build tools: Python
The Python programming language comes with a package manager called pip. Find the package that provides it and install it (hint: how did we find a missing library in the C exercise?).
We are going to practice installing the mistletoe module, which renders markdown into HTML.
- In python, try the line
import mistletoe
and notice that you get ModuleNotFoundError: No module named 'mistletoe'.
- Quit python again (Control-D) and try
pip3 install --user mistletoe
You should get a success message (and possibly a warning, explained below).
- Open python again and repeat
import mistletoe
This produces no output, so the module was loaded.
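The import check in the steps above can also be done from a script without triggering an error. The following is a minimal sketch using the standard library's importlib.util.find_spec; the helper name is_installed is made up for this illustration and is not part of pip or mistletoe.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can find a top-level module with this name."""
    # find_spec returns None when no such top-level module can be found,
    # instead of raising ModuleNotFoundError like a failed import would.
    return importlib.util.find_spec(module_name) is not None


# json ships with Python; mistletoe is only found after installing it.
for name in ("json", "mistletoe"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

Running this before and after the pip3 install step should show mistletoe change from missing to available.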
Create a small sample markdown file as follows, called hello.md for example:
# Markdown Example
Markdown is a *markup* language.
Open python again and type the following. You need to indent the last line (four spaces is usual) and press ENTER twice at the end.
import mistletoe
with open('hello.md', 'r') as file:
    mistletoe.markdown(file)
This should print the markdown rendered to HTML, e.g.
<h1>Markdown Example</h1>\n<p>Markdown is a <em>markup</em> language.</p>
Python version 3 came out in 2008 and has some syntax changes compared to Python 2 (print "hello world" became print("hello world")). Version 2 is now deprecated (it reached end of life in January 2020), but the transition was long and extremely painful: changing the syntax of something as common as the print statement broke an awful lot of code, and an awful lot of people preferred to keep an old version of Python installed rather than fix their code.
So whilst we were dealing with this it was typical for a system to have multiple versions of Python installed: python2 for the old one and python3 for the newer one (and even then these were often symlinks to specific subversions like python2.6), with python being a symlink to whatever your OS considered to be the "supported" version.
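To see how this looks on your own machine, the sketch below prints the interpreter that is currently running and where each of the usual command names ends up once symlinks are followed. It only uses the standard library; which of the three names exist at all depends on your system.

```python
import os
import shutil
import sys

# The interpreter running this script, and its version.
print("running:", sys.executable)
print("version:", ".".join(str(part) for part in sys.version_info[:3]))

# For each common command name, find it on the PATH and follow any
# symlinks to the real file (None if the command is not installed).
for command in ("python", "python2", "python3"):
    found = shutil.which(command)
    target = os.path.realpath(found) if found else None
    print(command, "->", target)
```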
Different OSs shipped different versions of Python (macOS was particularly egregious for staying with Python 2 for far longer than necessary), so a solution was needed: things were breaking while OS designers bickered.
The solution is that for most dependencies (except for compiled libraries) we generally use a programming language's own package manager and ignore what the OS provides. For Python that means pip (occasionally called pip3 or pip2).
Sometimes you'll see things telling you to install a package with sudo pip install but don't do that! It will break things horribly eventually. You can use pip without sudo by passing the --user option, which installs packages into a folder in your home directory (~/.local) instead of in /usr, which normally requires root permissions.
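If you want to check where a --user install will land on your machine, the standard library's site module can report it. This is a small sketch; on a typical Linux setup the user base is ~/.local, but the exact paths vary between systems and can be changed with the PYTHONUSERBASE environment variable.

```python
import site
import sys

# Base directory for per-user installs (what --user writes under).
print("user base:         ", site.getuserbase())

# Directory under the user base where the modules themselves are placed.
print("user site-packages:", site.getusersitepackages())

# Prefix of the Python install itself, for comparison.
print("install prefix:    ", sys.prefix)
```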
Sometimes you'll still need to install a package through the OS's package manager (numpy and scipy are common because they depend on an awful lot of C code and so are a pain to install with pip, as you have to fix the library paths and dependencies manually) but in general try to avoid it.
Python used to manage your OS should be run by the system designers; Python used for your dev work should be managed by you. And never the twain shall meet.
Scipy
We often use scipy for statistics, so you may as well install that too. Unfortunately, pip will not help you here because scipy depends on a C library for fast linear algebra. You could go and install all the dependencies (and you might have to do this if you need a specific version of it), but it turns out Debian has it all packaged up as a system package: try searching for it with apt search scipy.
The following commands show if it is correctly installed, by sampling 5 times from a Normal distribution with mean 200 and standard deviation 10:
from scipy.stats import norm
norm(loc=200, scale=10).rvs(5)
This should print an array of five values that are not too far off 200 (to be precise, with about 95% confidence they will be between 180 and 220 - more on this in Maths B later on).
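The 95% figure can be checked without scipy. The sketch below uses statistics.NormalDist from the standard library (a different tool from the scipy call above, swapped in only so that it runs before scipy is installed) to compute the probability that a single value falls between 180 and 220, and the chance that all five do.

```python
from statistics import NormalDist

# Same distribution as above: mean 200, standard deviation 10.
dist = NormalDist(mu=200, sigma=10)

# Probability that one sample lies within two standard deviations.
inside = dist.cdf(220) - dist.cdf(180)
print(f"one value in [180, 220]:   {inside:.4f}")

# Probability that five independent samples all lie in that range.
print(f"all five in [180, 220]:    {inside ** 5:.4f}")

# Draw five samples, analogous to rvs(5) in scipy.
samples = dist.samples(5)
print(samples)
```

The first probability is about 0.9545, so the 95% applies to each value separately; seeing one of the five values land outside the range now and then is not a sign that anything is broken.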
Avoiding sudo
If you need to install libraries you might be tempted to install them for all users by using sudo pip but this can lead to pain! If you alter the system libraries and something in the system depends on a specific version of a library then it can lead to horrible breakage and things not working (in particular on OSs like macOS which tend to update libraries less often).
Python comes with a mechanism called venv which lets you create a virtual Python install that is owned by a user: you can alter the libraries in that without sudo and without fear of mucking up your host system. Read the docs and get used to using it; it'll save you a world of pain later!
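A virtual environment is normally created with python3 -m venv followed by a directory name of your choice, and switched on by sourcing the bin/activate script inside that directory. Once you start using them it is easy to lose track of which Python you are in, so here is a minimal sketch of a check you can run; the function name in_virtual_environment is made up for this example.

```python
import sys


def in_virtual_environment():
    """Return True if this interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the Python install it was created from.
    return sys.prefix != sys.base_prefix


if in_virtual_environment():
    print("virtual environment at", sys.prefix)
else:
    print("not in a virtual environment; using", sys.prefix)
```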
pip freeze | tee requirements.txt
will list all the packages you're using, with their versions, and save them in a file called requirements.txt.
pip install -r requirements.txt
will install them again!
This makes it super easy to ensure that someone looking at your code has all the right dependencies without having to reel off a list of "go install these libraries" (and will make whoever has to mark your code happy and more inclined to give you marks).
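To get a feel for what ends up in requirements.txt, the sketch below builds similar name==version lines using importlib.metadata from the standard library. It is only an approximation of what pip freeze reports (pip handles some special cases differently), and the function name installed_requirements is made up for this illustration.

```python
from importlib import metadata


def installed_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that lack a name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


for line in installed_requirements():
    print(line)
```

Run inside a fresh virtual environment this list should be very short, which is part of why a venv makes for a clean requirements.txt.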