Part 1 -- Getting Started: Python setup, Jupyter Lab
Your OS may have Python.
If it does, ignore it.
Never use the OS Python.
Always install your own fresh, new Python.
Don't use https://www.python.org/downloads/
Why not?
Many large data science packages are very difficult to build.
You'll need extra help.
Extra help in the form of conda.
The best path forward is to install conda first.
Then use conda to install Python and other packages.
Two variants: the full Anaconda distribution and the minimal Miniconda. We'll use Miniconda.
https://docs.conda.io/en/latest/miniconda.html
Find a Miniconda installer for your OS. Start downloading it now.
Use the downloaded installer to install conda.
Once you have conda you can manage virtual environments.
Which raises the question: "What is a virtual environment?"
In the olden days we used to install one copy of Python on one server.
This was tolerable when computers were large, expensive, and rare.
It's inappropriate when everyone has their own laptop.
Python's environment has a (very) few things.
One environment, however, is rarely suitable.
When a new version of a package is released...
Problem:
You need to test.
And.
You don't want to break what you have.
Solution:
You need multiple environments.
When you install conda, it gives you a new conda command.
(You will need to stop and restart any terminal windows.)
conda sets a new prompt prefix: (base)
Terminal prompts will look like this:
(base) slott@MacBookPro-SLott ODSC-Live-4hr %
The base conda virtual environment is active.
Once you have the conda command, you can build Python and any Python package.
You do this by creating a new virtual environment at the terminal prompt:
conda create -n demo1 python=3.9
You'll get a display of what needs to be downloaded and installed.
Then...
Proceed ([y]/n)?
Once Conda has built the environment, you need to activate it. (I forget to do this all the time.)
conda activate demo1
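If you forget whether an environment is active, you can ask Python itself. This is a minimal sketch, not part of the course notebooks. The helper name describe_environment is made up for illustration. It relies on CONDA_DEFAULT_ENV, an environment variable conda sets when you activate an environment.

```python
import os
import sys


def describe_environment() -> dict[str, str]:
    """Collect a few facts that identify the Python that is running."""
    return {
        # Full path to the interpreter; it should sit inside your conda environment.
        "executable": sys.executable,
        # Root directory of the active environment.
        "prefix": sys.prefix,
        # Version number only, e.g. "3.9.x".
        "version": sys.version.split()[0],
        # conda sets CONDA_DEFAULT_ENV on activation; absent outside conda.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV", "(no conda environment active)"),
    }


for name, value in describe_environment().items():
    print(f"{name:>10}: {value}")
```

If the executable path points at the OS Python, you probably skipped the activate step.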
You'll often have many environments. What are they called?
conda info --envs
Create a YAML file with the list of packages to install.
conda env export >environment.yml
Create a new environment from someone's export.
conda env create -n team-env --file environment.yml
Download and Install Miniconda. https://docs.conda.io/en/latest/miniconda.html
Create a new virtual environment.
conda create -n name python=3.9
Activate the virtual environment.
conda activate name
You can manage this environment without too much brain scrambling.
Yes. This is all done at the Terminal prompt.
That's not always obvious.
We get used to IDEs and other wrappers around our OS tools.
conda is one of many solutions.
It's the best for scientists because it builds the BIG packages like scikit-learn.
It's fine for everyone else because it's consistent across all platforms.
venv is built-in, but conda is better for a lot of applications.
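To see what "an environment" amounts to, here is a rough sketch using the built-in venv module rather than conda. The function name make_environment and the scratch directory are assumptions for illustration. It skips pip (with_pip=False) so nothing is downloaded.

```python
import os
import tempfile
import venv
from pathlib import Path


def make_environment(parent: Path, name: str) -> Path:
    """Create a bare virtual environment under `parent` and return its directory."""
    env_dir = parent / name
    # with_pip=False keeps this fast and offline; symlinks are avoided on Windows.
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))
    return env_dir


with tempfile.TemporaryDirectory() as scratch:
    env_dir = make_environment(Path(scratch), "demo1")
    # pyvenv.cfg records which base Python this environment was built from.
    config_text = (env_dir / "pyvenv.cfg").read_text()
    print(config_text)
```

An environment is just a directory with its own configuration and its own place to install packages. conda does the same job, and also installs Python itself and the hard-to-build packages.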
Do Not Uninstall the OS Python. It's part of the OS.
Do not attempt to upgrade or install anything new into the OS Python. This usually requires elevated privileges.
You will (eventually) have a lot of old Python versions. You can safely ignore them.
The Homebrew approach to installing Python leads to working with elevated privileges.
It doesn't help (much) with complex AI/Machine Learning/Data Science packages.
If you started with this, you can safely ignore it and use conda.
Don't try to uninstall Homebrew or a Homebrew-created Python.
There are a lot of integrated development environments (IDEs) for Python.
Jupyter Lab provides a controlled execution environment.
It's ideal for science and acceptable for many other things.
Since you have conda you can do this:
conda install jupyterlab
This will add Jupyter Lab to your current virtual environment.
Do this now. (Be sure to answer "y" to the proceed? question.)
You have a spreadsheet-like environment.
Cells with data.
Cells with expressions based on the data.
A chain of dependencies among the cells.
Change a cell, evaluate all the cells that follow: update the notebook.
Start the Lab Server.
(python4hr) server dir % jupyter lab
When the browser opens, build your notebook or module or whatever.
I'm going to launch the lab now: Untitled.ipynb
Enter an expression like 355/113 in a cell.
Hit the ► button to run the computation.
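A cell is ordinary Python, so the same expression works anywhere. A small sketch of what that cell computes, with a comparison against math.pi added for illustration (the comparison is not in the original notebook):

```python
import math

# The / operator is true division: two ints produce a float.
approx = 355 / 113

# How far is this classic approximation from the library's value of pi?
error = abs(approx - math.pi)

print(approx)
print(error)
```

In a notebook you would not need the print calls; the value of the last expression in a cell is displayed automatically.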
We'll come back to this in the next sections of the course.
Install Jupyter Lab.
conda install jupyterlab
Start the lab server.
jupyter lab
Do your programming in the browser. Mostly by creating notebooks.
You can also interact with IPython directly via a console. (We'll come back to this later.)
Python is a programming language.
Jupyter Lab is not a simple code editor. It's a run-time environment.
There are three kinds of cells in a notebook: code, Markdown, and raw.
The default notebook mode is biased toward analysis and exploration.
Think of a lab notebook where you record your experiments.
(You can create libraries and apps, but we won't focus on this yet.)
You have the full Python language available to you.
You can do anything.
Let's look at some calculator-like features.
We want to compute the volume of an irregular tetrahedron.
Sides vary: 48, 36, and 51 inches. We need to take an average.
Also. 231 cubic inches = 1 gallon (Sorry. American Units make no sense.)
Put the stuff we know into cells:
Add computations in cells.
Here's the notebook.
Change the measurement values. Recompute. This is fun.
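The actual notebook isn't reproduced here, so this is one guess at how the cells might look. The big assumption: we treat the shape as a regular tetrahedron whose edge is the average of the three measurements, and use V = a³ / (6√2). The variable names are invented for illustration.

```python
import math

# Cell 1: the stuff we know (inches).
sides = [48, 36, 51]
CUBIC_INCHES_PER_GALLON = 231

# Cell 2: average the measurements to get a single edge length.
edge = sum(sides) / len(sides)

# Cell 3: volume of a regular tetrahedron with that edge (an approximation).
volume_cubic_inches = edge ** 3 / (6 * math.sqrt(2))

# Cell 4: convert to gallons.
volume_gallons = volume_cubic_inches / CUBIC_INCHES_PER_GALLON

print(f"edge = {edge:.1f} in")
print(f"volume = {volume_cubic_inches:.0f} cubic inches")
print(f"volume = {volume_gallons:.1f} gallons")
```

Under these assumptions the answer comes out at about 46.5 gallons. Change a value in the first cell, re-run the cells that follow, and the result updates.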
Also, a document: compute_1.html
I'll build a notebook to explore the Collatz Conjecture.
We have the HOTPO (Half Or Triple Plus One) function, h(n).
If we apply this iteratively, h(h(h(...h(n)))), is the result always 1?
def h(n: int) -> int:
    if n % 2 == 0:
        return n // 2
    else:
        return 3 * n + 1

def iterate_from(n: int) -> None:
    print(n)
    while n != 1:
        n = h(n)
        print(n)
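Printing is fine for exploring. For counting steps or checking results, a variant that returns the values is handier. This is a sketch, not part of the original notebook. The name collatz_sequence is made up here, and h is repeated so the block stands alone.

```python
def h(n: int) -> int:
    """Half Or Triple Plus One."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def collatz_sequence(n: int) -> list[int]:
    """Return every value visited from n down to 1, inclusive."""
    if n < 1:
        # The iteration is only defined for positive integers.
        raise ValueError("n must be a positive integer")
    sequence = [n]
    while n != 1:
        n = h(n)
        sequence.append(n)
    return sequence


print(collatz_sequence(6))  # [6, 3, 10, 5, 16, 8, 4, 2, 1]
```

With a list in hand, len(collatz_sequence(n)) - 1 gives the number of steps and max(collatz_sequence(n)) gives the peak value.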
The notebook is fun, but what if you have other platforms in mind?
Before looking at other deployment options...
The notebook is very useful, even for "production."
Especially in business analysis (also called decision support).
It makes assumptions visible and lets users tweak them.
The resulting document is pure text and can be archived in a git repository.
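"Pure text" here means JSON. The sketch below hand-builds a tiny notebook-shaped document to show the idea. It follows the general layout of notebook format 4, but it is an illustration only and has not been validated with the nbformat package.

```python
import json

# A hand-written, minimal notebook-like structure (illustrative only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Volume estimate"],
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["sides = [48, 36, 51]"],
        },
    ],
}

# Serialize to text, which is what lands in the .ipynb file and in git.
text = json.dumps(notebook, indent=1)

# Read it back and list each cell's kind and content.
loaded = json.loads(text)
for cell in loaded["cells"]:
    print(cell["cell_type"], "->", "".join(cell["source"]))
```

Because it's text, git can store and diff it, though saved outputs can make the diffs noisy.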
Big apps like Instagram use Python for their web servers.
For a web application, you'll want a framework (like Flask).
Or, if you're deploying to AWS Lambda, you can use Zappa to package a Flask app for serverless hosting.
You'll be developing in an IDE like PyCharm instead of a notebook.
For a desktop application, you'll use a GUI widget framework like PyQt or wxPython.
You'll be developing in an IDE like PyCharm in addition to a notebook.
You'll have to work out a deployment strategy to distribute the app.
For a mobile app, you'll want a mobile-based framework like BeeWare or Pythonista.
You'll be developing in an IDE like PyCharm instead of a notebook.
You'll have to work out a deployment strategy to distribute the app.
For embedded work, you'll be using a Raspberry Pi or a Circuit Playground Express.
You won't need a very sophisticated IDE, but it can help.
I'm a fan of creating mock test libraries and using PyCharm.
You'll be downloading the software onto the boards for testing and deployment.
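The mock-library idea might look something like this. The blink function and the led object with on() and off() methods are hypothetical stand-ins, not a real board API. The standard library's unittest.mock supplies the fake hardware, so the logic can be tested on a laptop before anything is downloaded to a board.

```python
from unittest.mock import Mock, call


def blink(led, times: int) -> None:
    """Turn a (hypothetical) LED object on and off `times` times."""
    for _ in range(times):
        led.on()
        led.off()


# A Mock accepts any method call and records it; no hardware is needed.
fake_led = Mock()
blink(fake_led, 3)

# Check the recorded calls: three on/off pairs, in order.
assert fake_led.on.call_count == 3
assert fake_led.mock_calls == [call.on(), call.off()] * 3
print("blink logic verified against a mock LED")
```

On the real board you would pass in the real LED object; the function under test does not change.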
Not very many common themes. These are wildly distinct environments.
I suggest starting with a notebook to understand the problem and the data.
Then work on your target deployment once you have core algorithms and data structures working.
Make a firm distinction
We'll start at the top of the hour with Part 2, Built-in data structures: list, set, dictionary, tuple
BTW. The next two sections will be done entirely in Jupyter Lab.