Part 4 -- Tooling and Good Practices
See Part 1.
Use conda.
Python comes with venv, plus there are many others.
Conda is particularly good at building large scientific packages.
The conda command works consistently on POSIX (Linux and macOS) as well as Windows.
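For example, creating and activating a fresh environment (python4hr is the environment name used in this course's examples):
conda create -n python4hr python=3.9
conda activate python4hr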
Python comes with venv.
If you don't need conda, venv will do nicely.
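A minimal venv workflow looks like this (a sketch, assuming a POSIX shell; on Windows the activation script is .venv\Scripts\activate):
python -m venv .venv
source .venv/bin/activate
python -m pip install pytest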
One environment is never enough.
Open source software in general is in a constant state of flux.
You'll always need to test against newer versions of packages, libraries, frameworks.
Python changes on a regular cadence. See https://www.python.org/dev/peps/pep-0619/
I don't know how many folks intend to use Docker images, but I can recommend this path.
You can often find docker images with Python installed.
Examples: ubuntu:latest, debian:latest, python:3.9.
docker pull python
docker run -it --rm -v "$PWD":/usr/src/app -w /usr/src/app python:3 python your-script.py
Use a virtual environment manager.
Ideally: use conda to install Python for you.
Otherwise: download a Python binary and use whatever virtual environment manager makes you happiest.
Use Jupyter Lab
Install it in your current virtual environment.
conda install jupyterlab
Start the lab server.
jupyter lab
Create a notebook.
It's really cool and friendly.
It's very low overhead.
(python4hr) ODSC-Live-4hr % python
Python 3.9.6 (default, Aug 18 2021, 12:38:10)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 355/113
3.1415929203539825
>>>
It's a bare Read-Eval-Print Loop (REPL).
Consider your use cases.
Analysis? Decision-Support? Science?
Notebook. Users can tweak assumptions and do "what-if" analysis.
Automation? Web Server? IoT Application? Mobile Application?
Deployed App. Admins can change configuration.
Python is designed to be used interactively.
Use Python interactively to explore algorithms and data.
I've been writing Python code for 20 years.
I keep a REPL prompt open in a terminal window (or the IDE) at all times.
Use it as a desk calculator.
I did a weekly in-house webcast for years from a Jupyter notebook.
Outside of talks like these, real work is always more complex.
You'll often be creating your own modules and libraries.
You can provide users with a notebook that has imports all set up and ready to roll.
You can give them super-handy libraries of ready-to-use functions.
JupyterLab is very handy for this.
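For example, a starter cell might look like this (using the collatz module we'll build later in this part):
# Imports all set up and ready to roll
from collatz import hotpo, iterate_from

# Users can tweak the starting value for "what-if" analysis
list(iterate_from(6))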
When you're going to be writing Python apps, libraries, modules, frameworks, scripts, etc., any text editor will do. Any.
There are -- maybe -- 20 more choices. All good. All.
Python is simple.
No compile/build/archive tool overhead.
Minimal debugging and packaging complexity.
A lot of complexity in Java, C, C++, etc., comes from the compiler.
And the archiver to make JARs (or .a files or whatever).
And the linker to make an executable app from .o and .a files (or .class and .jar files).
None of this in Python.
I used python -m pip install whatever.
It downloaded a "wheel", whatever.whl.
Isn't that tooling overhead? Just like a JAR file?
Much Python software is available as a "wheel" or "egg".
This is not required.
You never need to make these.
This tooling is outside the language and only required if you package things for PyPI.
The pdb Python debugger is part of the distribution.
Feel free to use it.
(I add print() functions. I don't often use the debugger.)
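If you do want it, it's one command away (or a breakpoint() call in your code):
% python -m pdb src/collatz.py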
When you've got a great library/package/framework/tool/whatever...
You'll want to put it on PyPI for others to share.
Then you'll get involved in "distutils" and "twine".
Don't start there, though. First, get stuff to work.
src/
    collatz.py
tests/
    test_collatz.py
docs/
    ... created by sphinx
README
requirements.txt
pyproject.toml
Make some directories
Create a collatz.py file in src.
We'll put some code in it later.
Flat is better than nested.
Don't create one-file-per-class. This is a silly approach designed to help the compiler.
Don't create lots and lots of nested directories. They don't help much.
You don't need much. A single myscript.py is acceptable. Really.
Any text editor will be good.
There's not much that's required.
You want syntax coloring and ready access to a Python prompt and the command line.
Don't sweat PyPI packaging and distribution.
Unit testing and integration testing are really important.
Really important.
Software features that can't be demonstrated by automated tests simply don't exist.
Extreme Programming Explained, Kent Beck
TDD -- Test-Driven Development is your friend.
To the extent possible, write test cases first.
Fill in working code later to make the tests pass.
The doctest tool scans a file, looking for >>> examples.
It runs the >>> line(s) of code.
It compares the output with the lines that follow.
def hotpo(n: int) -> int:
    """
    >>> hotpo(10)
    5
    >>> hotpo(5)
    16
    """
    if n % 2 == 0:
        return n // 2
    else:
        return 3 * n + 1
% python -m doctest src/collatz.py
No output? No failures.
Want details? Add -v.
% python -m doctest -v src/collatz.py
First, install it: conda install pytest.
The pytest tool looks for a tests directory.
Inside that directory, it looks for files with names matching test_*.py.
Within those files, it looks for test cases.
import pytest

from collatz import hotpo

def test_hotpo():
    assert 5 == hotpo(10)
    assert 16 == hotpo(5)
% PYTHONPATH=src python -m pytest
===================== test session starts =====================
platform darwin -- Python 3.9.6, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /Users/slott/Documents/Writing/Python/ODSC-Live-4hr
plugins: anyio-2.2.0
collected 1 item

tests/test_collatz.py .                                  [100%]

====================== 1 passed in 0.02s ======================
Code is in a src directory.
To make code in src visible:
Package and install. Ugh.
Put the src directory into the PYTHONPATH environment variable:
PYTHONPATH=src python -m pytest
import unittest

from collatz import hotpo

class TestHotpo(unittest.TestCase):
    def test(self):
        self.assertEqual(5, hotpo(10))
        self.assertEqual(16, hotpo(5))
We create mock collaborators to test a class in isolation.
It's a stand-in that has just enough behavior to not crash the test.
Usually filled with known, fake answers.
>>> from unittest.mock import Mock
>>> transform = Mock(return_value=42)
>>> transform(1)
42
from typing import Iterator

def iterate_from(n: int) -> Iterator[int]:
    yield n
    while n != 1:
        n = hotpo(n)
        yield n
Depends on hotpo(). Isolation requires a Mock.
import pytest
from unittest.mock import Mock, call

import collatz

@pytest.fixture
def mock_hotpo(monkeypatch):
    m = Mock(name='mock hotpo', side_effect=[4, 2, 1])
    monkeypatch.setattr(collatz, 'hotpo', m)
    return m
def test_iterate_from(mock_hotpo):
    results = list(collatz.iterate_from(42))
    assert results == [42, 4, 2, 1]
    assert mock_hotpo.mock_calls == [
        call(42), call(4), call(2)
    ]
Helps to follow the SOLID design principles.
The dependency between iterate_from() and hotpo() is a design problem.
Requires monkeypatching the module for a test.
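One way to remove that dependency (a sketch, not the course's actual code) is to inject the step function as a parameter:
from typing import Callable, Iterator

def iterate_from(n: int, step: Callable[[int], int] = hotpo) -> Iterator[int]:
    # step is now an explicit collaborator; a test can pass a Mock directly.
    yield n
    while n != 1:
        n = step(n)
        yield n

A test can then call iterate_from(42, step=mock_hotpo) with no monkeypatching at all.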
The topic of test case coverage is huge.
Strive for 100% code coverage.
conda install pytest-cov
PYTHONPATH=src python -m pytest --cov=src \
    --cov-report=term-missing
---------- coverage: platform darwin, python 3.9.6-final-0 -----------
Name             Stmts   Miss  Cover   Missing
----------------------------------------------
src/collatz.py      10      0   100%
----------------------------------------------
TOTAL               10      0   100%
We'll always have multiple versions of packages on which we depend.
We'll need to test various versions of those packages with our code.
How do we do this? tox
The tox tool builds virtual environments.
And runs commands in each environment.
This isn't available through conda.
python -m pip install tox
There are two paths for configuring tox.
We'll focus on a simple tox.ini.
Some generic overhead.
[tox]
minversion = 3.20.0
skipsdist = True
envlist = json-3-2-0,json-4-0-0
[testenv]
deps =
    pytest==6.2.4
    pytest-cov==2.12.0
    mypy==0.910
setenv =
    PYTHONPATH = {toxinidir}/src
commands =
    python -m doctest --option ELLIPSIS src/collatz.py
    python -m pytest --cov=src --cov-report=term-missing
    mypy --strict --show-error-codes src
[testenv:json-3-2-0]
deps =
    {[testenv]deps}
    jsonschema==3.2.0

[testenv:json-4-0-0]
deps =
    {[testenv]deps}
    jsonschema==4.0.0a6
% tox
Output from each command...
________________________ summary ________________________
  json-3-2-0: commands succeeded
  json-4-0-0: commands succeeded
  congratulations :)
The first run after a change populates a cache. After that, it's fast.
You're going to have multiple test commands: doctest, pytest, mypy, pylint, etc.
Don't write a shell script.
Use tox to run the suite of commands.
Later, when you have multiple environments, tox can manage those, too.
Easy.
Use Sphinx.
You can write in Markdown or ReStructured Text (RST).
You can organize the files any way that makes sense.
You can use the .. automodule:: directive to generate API reference documentation from your source code.
Use Sphinx.
It's how Python's internal documentation is produced.
Documentation comes from the source.
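A minimal sketch of a docs page that pulls the API reference from our collatz module (this assumes the sphinx.ext.autodoc extension is enabled in docs/conf.py and that src is on the path):
collatz
=======

.. automodule:: collatz
   :members: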
Are we there yet?
How do we deploy the app after we've written, tested, and documented it?
This can be hellishly complex, depending on what your package contains.
In the most trivial cases (pure Python), it's straightforward.
Beyond that, there are a lot of details to get right.
Continuous Integration / Continuous Deployment
You really want to automate this as much as possible.
You don't want to type commands manually to test, build, and deploy your code.
For "simple" Enterprise cases, git clone is your friend.
For world-wide distribution via PyPI:
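A sketch of the usual sequence (assuming a pyproject.toml like the one above; the build and twine packages do the work):
python -m pip install build twine
python -m build
python -m twine upload dist/*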