Three terms that seem to cause confusion: Idempotence, Hysteresis and
Deterministic. The subject came up during my webcast on the Five Kinds
of Python Functions. We can use
all three terms to describe a function. Two of them are relevant to
common design questions in software. The third is a higher-order
architectural consideration, and not really part of writing a function
definition in Python.
Idempotent -- narrowly defined -- means that f(f(x)) =
f(x). In computer science, the meaning is stretched so that we can
distinguish functions like random number generators from other
functions. A random number generator (Python's random.random(), for
example) is not idempotent. It returns a different result each time
it's called. Many other functions are idempotent because they always
return the same result given the same arguments. The functions in
Python's os module are not idempotent. The results change based on
external events.
Hysteresis is memory of previous events. A random number generator
may have some hidden hysteresis so that it can return the next value in
it's random sequence. Note that os.random() is explicitly based on
/dev/random; it involves hysteresis where entropy is accumulated. A
function that has an internal memoization cache has hysteresis; it will
compute subsequent results more quickly after having memorized previous
results.
Most functions are simply idempotent and don't generally involve any
hysteresis. The result value is a fixed mapping from the argument
values.
A memoization cache preserves idempotence while adding hysteresis. The
functools.lru_cache decorator, for example, adds hysteresis to an
otherwise idempotent function. The result value still reflects a fixed
mapping.
A random number generator cannot have idempotence; it will leverage
hysteresis. We should think of a random number generator as an iterator
through a sequence of numbers. Given a seed, the sequence is entirely
fixed. Between any two numbers, it's very difficult to predict one from
the other without also knowing the hidden seed value.
Unrelated Concept
We use idempotence and hysteresis to describe programming language
features. These features are entirely deterministic. There's no
room for arbitrary, undefined, unpredictable behavior. Any
non-determinism needs to be very tightly boxed in by otherwise
deterministic programming constructs.
When writing a Python function, we assume that the language is
deterministic. Without this assumption, it's hard to conceive of what
the language would be like. What if statements were executed out of the
order we wrote them?
External events -- interrupts and the like -- are non-deterministic.
They can happen at any time, with no relationship to what's going on
inside the software world. Inside the software world, however, we expect
that everything is deterministic. This means that a server must always
cope with non-deterministic request ordering. Once request processing
starts, however, we generally rely on essential non-deterministic
software to process the results perfectly consistently.
An important example of bounded non-determinism is in Dijksta's
hypothetical programming language described in A Discipline of
Programming.
Here there is explicit non-determinism among the "elif" and "eldo"
clauses. The selection among true alternatives was specifically
non-deterministic. Indeed, an evil demon would always strive to select
the worst possible choice. There was no "first one that's true" kind of
silliness that would allow certain kind of logic errors to survive.
A multiprocessing application leverages the OS to keep all processes
separate. Each process can then operate deterministically. Any
non-determistic OS events are concealed from the process by the normal
OS libraries that generally queue up events into buffers.
A multithreaded application, however, has to use some kind of explicit
synchronization to handle the inherent non-determinism of thread
scheduling. Thread libraries make no guarantees about the exact sequence
of operations between threads; the execution is non-deterministic
between threads.
For real fun, read about the non-deterministic memory write orders. The
Data
Race
article from Intel is illuminating. Google "non-deterministic memory
write order" for interesting reading on how processors have gotten --
perhaps -- too complex to be completely trustworthy.
This is different, also, from "arbitrary." A word that describes how the
C language deals with constructs like a[i]= i++;. There are two
unrelated state changes that happen in this statement. The order of
those two things is best described as "arbitrary." It's
deterministic. But it's not well defined by the language. Depending
on the compiler and optimization settings, it will be entirely fixed. A
function that uses these constructs could be idempotent. However,
the outcome can vary from compiler to compiler and optimization setting
to optimization setting. This kind of thing is devoutly to be avoided.
It's presence is a less-than-ideal language design choice; writing
software around a bad language feature is simply inexcusable.