For those who for some reason (cost?) don't want to use Wolfram Mathematica instead, which also has support for symbolic algebra and so much more (and can call Python anyway).
Perhaps the most famous contributed Python package is NumPy, which offers support for numerical and scientific computing: mathematical functions, improved arrays, vectors, matrices, and linear algebra. Under the hood, most of NumPy is implemented in optimised, compiled C.
There are few Python packages that have as many online guides as NumPy! Apart from the NumPy docs themselves, some good starting points are W3Schools and GeeksforGeeks: Python Lists VS Numpy Arrays
There's also a comprehensive online book Learning Scientific Programming with Python (2nd edition) by Christian Hill with a quiz for each section.
By popular convention it is imported as:
import numpy as np
a = np.array([1, 2, 3]) # creates an np.ndarray object
print(a)
The Python array.array and the np.ndarray class are more memory efficient for numerical data than list, and are more efficient for numerical computing.
Unlike core Python lists, the core Python array.array requires all array elements to be of the same type, specified on creation.
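A rough sketch of both points: it compares the storage needed for one thousand integers and shows the type restriction of array.array. The variable names are illustrative only, and the list overhead depends on the Python build, so no byte counts are quoted for it.

```python
import sys
from array import array

import numpy as np

n = 1000
as_list = list(range(n))
as_array = array("q", range(n))            # "q": signed 8-byte integers
as_ndarray = np.arange(n, dtype=np.int64)  # 8 bytes per element

# A list stores pointers to separate int objects; the other two store raw values.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(i) for i in as_list)
array_bytes = as_array.itemsize * len(as_array)
ndarray_bytes = as_ndarray.nbytes

print(list_bytes, array_bytes, ndarray_bytes)

# array.array rejects values that do not match its type code.
try:
    as_array.append(1.5)
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```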
np.ndarray is a NumPy class, whereas np.array() is a function for constructing np.ndarray objects from its arguments. The 'nd' in ndarray stands for N-dimensional.
NumPy arrays are also homogeneous (they hold elements of a single data type), but the np.array() function accepts heterogeneous elements: it converts them to the most suitable common data type, provided no explicit and incompatible dtype was given.
It is sometimes incorrectly stated that a NumPy array can be heterogeneous, because you can pass a heterogeneous list as an argument to the np.array() function. However, the resulting np.ndarray object is always homogeneous. Try this:
a_np_mixed = np.array(["mixed", 1, 2, 3])
print(a_np_mixed)
['mixed' '1' '2' '3']
print(type(a_np_mixed))
<class 'numpy.ndarray'>
print(a_np_mixed.dtype.type)
<class 'numpy.str_'>
Note how the above has created an ndarray that is indeed now homogeneous w.r.t. numpy.str_ (after conversion).
Compare with:
a_np_i = np.array([1, 2, 3])
print(a_np_i)
[1 2 3]
print(type(a_np_i))
<class 'numpy.ndarray'>
print(a_np_i.dtype.type)
<class 'numpy.int64'>
Note how the above has created an ndarray that is homogeneous w.r.t. numpy.int64 (the default integer type on most 64-bit platforms) and indeed only contains integers.
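The conversion rules can be checked with a few more cases. This is a minimal sketch with made-up variable names: mixed integers and floats are promoted to a float type, a compatible explicit dtype is honoured, and an incompatible one raises an error.

```python
import numpy as np

# Integers and a float are promoted to a common floating-point type.
promoted = np.array([1, 2, 3.5])
print(promoted.dtype)

# An explicit, compatible dtype is honoured.
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float.dtype)

# An explicit dtype that cannot represent an element raises an error.
try:
    np.array(["mixed", 1, 2, 3], dtype=np.int64)
    failed = False
except ValueError:
    failed = True
print(failed)
```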
NumPy arrays should be created via np.array(..) or one of the other array creation routines, not by calling the np.ndarray constructor directly. There are in fact ways to HACK creating a NumPy array using np.ndarray with some tricky arguments, but this is not advised.
NumPy arrays can be N-dimensional. NumPy arrays are stored contiguously in memory, which means that all rows of 2D arrays must have the same number of column slots (and similarly for 3D etc.).
The order for 2D arrays is rows then columns:
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)
[[1 2 3]
 [4 5 6]]
print(arr.shape)
(2, 3)
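To make the rows-then-columns order concrete, here is a small sketch (names are illustrative) that unpacks the shape, indexes by [row, column], and shows that with the default layout the rows sit one after another in memory.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

rows, cols = arr.shape      # (2, 3): two rows, three columns
print(arr.ndim, rows, cols)

# Index as [row, column].
print(arr[1, 2])            # last element of the second row

# With the default (C) layout, rows are laid out one after another in memory.
print(arr.ravel().tolist())
print(arr.flags["C_CONTIGUOUS"])
```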
Resizing and adding elements
The core Python array.array can be "resized" using the insert() or append() methods, so it can be created empty and then progressively populated.
The official NumPy docs state that:
Most NumPy arrays have some restrictions ... Once created, the total size of the array can’t change.
For example, the docs for the numpy.ndarray.resize method state:
Change shape and size of array in-place.
An np.ndarray has a fixed size on creation, but one can change the dimensions using either the ndarray.resize() method (which modifies the array in place and pads any extra slots with zeros) or the np.resize() function, which returns a new object and pads with repeated copies of the values of the source ndarray object (not zeros).
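The difference in padding is easiest to see side by side. In this sketch refcheck=False is passed to ndarray.resize() because the default reference check can fail in interactive sessions; skipping it is only reasonable here because nothing else refers to the array's data.

```python
import numpy as np

a = np.array([1, 2, 3])

# np.resize returns a new array, repeating the source values to fill it.
b = np.resize(a, 5)
print(b)                      # [1 2 3 1 2]

# ndarray.resize changes `a` itself, filling new slots with zeros.
# refcheck=False skips the reference check, which is only safe here
# because nothing else refers to this array's data.
a.resize((5,), refcheck=False)
print(a)                      # [1 2 3 0 0]
```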
Both the np.append() and np.insert() functions create new arrays, which may be a performance consideration with large arrays.
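A quick sketch showing that the source array is left untouched. The preallocation loop at the end is just one possible alternative when many values need adding, not a recommendation from the NumPy docs.

```python
import numpy as np

a = np.array([1, 2, 3])

appended = np.append(a, 4)        # new array; `a` is untouched
inserted = np.insert(a, 1, 99)    # new array with 99 placed before index 1

print(a, appended, inserted)

# For many additions, one alternative is to allocate once and fill in place.
n = 5
filled = np.empty(n, dtype=np.int64)
for i in range(n):
    filled[i] = i * i
print(filled)
```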
Some methods such as ndarray.reshape appear to change the dimensions but in fact (where possible) just create "views" of an underlying array, which is available as ndarray.base. The base of an array that owns its memory is None.
Taking a slice also just creates a "view":
>>> x = np.array([1,2,3,4])
>>> x.base is None
True
>>> y = x[2:]
>>> y.base is x
True
Some other ways to create views that appear to "change" the dimensions of an ndarray include the np.expand_dims function and use of the np.newaxis alias for None.
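A short sketch using np.newaxis and np.expand_dims. Both results share memory with the original, so a write through one of them shows up in the other; the names are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

col = x[:, np.newaxis]            # shape (4, 1)
row = np.expand_dims(x, axis=0)   # shape (1, 4)
print(col.shape, row.shape)
print(np.newaxis is None)

# Both results share memory with x, so a write through one shows up in x.
col[0, 0] = 100
print(x[0], row[0, 0])
```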
Copy array (shallow and deep)
To create a full copy of a purely numerical array use the numpy.ndarray.copy method or the numpy.copy function (beware that they have different defaults for the order argument).
The above only make a shallow copy of an array containing objects (dtype=object), so the "copy" still sees changes made to the underlying objects. To make a deep copy use the standard library function copy.deepcopy.
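A minimal sketch of the difference, using a hypothetical object array of dicts: the shallow copy shares the dict objects with the original, while the deep copy gets its own.

```python
import copy

import numpy as np

# An object array holding two mutable dicts.
a = np.array([{"k": 1}, {"k": 2}], dtype=object)

shallow = a.copy()          # new array, but the same dict objects
deep = copy.deepcopy(a)     # new array and new dict objects

a[0]["k"] = 99              # mutate a dict through the original array

print(shallow[0]["k"])      # 99: the shallow copy sees the change
print(deep[0]["k"])         # 1: the deep copy does not
```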
Which methods create views and which create copies?
It's not always obvious without checking the docs which methods create copies and which just create views (although there's probably an implementation rhyme to the reason). For example, numpy.ndarray.swapaxes just creates a view, and numpy.ndarray.flatten returns a copy of the array collapsed into one dimension. One can achieve a similar result with ndarray.reshape, which creates a view whenever the memory layout allows it.
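When in doubt, np.shares_memory offers a way to check. The sketch below tests the methods mentioned above, plus one case (reshaping a transposed array) where reshape has to fall back to a copy.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

swapped = a.swapaxes(0, 1)     # view
flat_copy = a.flatten()        # always a copy
flat_view = a.reshape(-1)      # view here, because `a` is contiguous
forced = a.T.reshape(-1)       # copy: the transposed layout cannot be viewed as 1D

for name, arr in [("swapaxes", swapped), ("flatten", flat_copy),
                  ("reshape", flat_view), ("T.reshape", forced)]:
    print(name, np.shares_memory(a, arr))
```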
Creating specific kinds of arrays
There are heaps of different Array creation routines for creating various kinds of commonly used arrays easily, including empty, zeros, ones, full (all one value), identity etc. And one can create arrays "like" another array, that is, with the same dimensions, but populated with different values (such as used for masking).
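A few of these routines in action, as a sketch with illustrative names. The mask at the end uses a "like" routine to get an array of the same shape as a template but with a different dtype.

```python
import numpy as np

zeros = np.zeros((2, 3))            # all 0.0
ones = np.ones((2, 3))              # all 1.0
sevens = np.full((2, 3), 7)         # all one value
eye = np.identity(3)                # 3x3 identity matrix
scratch = np.empty((2, 3))          # uninitialised: contents are arbitrary

# "Like" routines copy the shape (and by default the dtype) of an existing array.
template = np.array([[1, 2, 3], [4, 5, 6]])
mask = np.zeros_like(template, dtype=bool)   # all False, same shape
mask[template > 3] = True
print(mask)
```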
Replacing values
There are many different approaches. Some basic 1D cases:
>>> a1 = np.array([1, 2, 3, 4, 5])
[1 2 3 4 5]
>>> a1[2] = 7 # single index
[1 2 7 4 5]
>>> a1[a1==4] = 6 # conditional
[1 2 7 6 5]
>>> a1[[0,1]] = [9,8] # explicit range
[9 8 7 6 5]
>>> a1[range(2)] = [11,10] # generate range
[11 10 7 6 5]
>>> a2 = np.arange(5)
[0 1 2 3 4]
>>> np.put(a2, [0, 2], [-44, -55])
[-44 1 -55 3 4]
Some 2D cases:
>>> a3 = np.array([[1, 2, 3], [4, 5, 6]])
[[1 2 3]
[4 5 6]]
>>> a3[0][2] = 7
[[1 2 7]
[4 5 6]]
>>> a3[0][range(2)] = [8,9]
[[8 9 7]
[4 5 6]]
The following offers a good quick start for conditional value replacement: How to Replace Values in a NumPy Array?.
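As a sketch of conditional replacement in 2D (not taken from the linked guide): np.where returns a new array, whereas assignment through a boolean mask or a [row, column] index changes the array in place.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# np.where builds a new array: -1 where the condition holds, else the original value.
evens_replaced = np.where(a % 2 == 0, -1, a)
print(evens_replaced)

# A boolean mask assigns in place.
a[a > 4] = 0
print(a)

# A single [row, column] index does the same job as a[0][2] = 7.
a[0, 2] = 7
print(a)
```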
NumPy matrix is apparently no more (sort of)
NumPy has a specialised 2D-array numpy.matrix with special operators, such as * (matrix multiplication) and ** (matrix power), however:
It is no longer recommended to use this class, even for linear algebra. Instead use regular arrays. The class may be removed in the future.
(What impact this may have on SciPy and scipy.linalg is not clear.)
The numpy.matrix has the following convenient short-cuts:
numpy.matrix.T for transpose() (non conjugated).
numpy.matrix.H for the (complex) conjugate transpose of self.
numpy.matrix.I for the (multiplicative) inverse of invertible self.
numpy.matrix.A for self as an ndarray object.
Note that of those a regular ndarray only directly has:
ndarray.T for transpose().
The operations for those other cases can all be achieved with ndarray, but you can HACK in the same shortcuts.
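One way those operations might be written with regular arrays, as a sketch: the @ operator for matrix multiplication, np.linalg.matrix_power for powers, .conj().T for the conjugate transpose, and np.linalg.inv for the inverse. The example matrices are made up.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
C = np.array([[1 + 2j, 0], [0, 1 - 3j]])

product = A @ A                           # matrix multiplication (matrix `*`)
squared = np.linalg.matrix_power(A, 2)    # matrix power (matrix `**`)
conj_transpose = C.conj().T               # equivalent of matrix.H
inverse = np.linalg.inv(A)                # equivalent of matrix.I

print(product)
print(np.allclose(product, squared))
print(np.allclose(inverse @ A, np.eye(2)))
```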
NumPy handles complex numbers
The engineering friendly j has the special meaning of indicating the imaginary part:
>>> com=np.array([1 + 2j, 1 - 3j])
[1.+2.j 1.-3.j]
>>> type(com)
<class 'numpy.ndarray'>
>>> com.dtype.type
<class 'numpy.complex128'>
And there's support for most basic complex operations such as addition, subtraction, multiplication, division etc.
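A few of those operations on the same two values, as a quick sketch. The comments give the arithmetic by hand; the printed formatting may differ slightly between NumPy versions.

```python
import numpy as np

com = np.array([1 + 2j, 1 - 3j])

print(com + com)            # element-wise addition
print(com[0] * com[1])      # (1+2j)(1-3j) = 7-1j
print(com[0] / com[1])      # (1+2j)/(1-3j) = -0.5+0.5j
print(np.abs(com))          # magnitudes: sqrt(5) and sqrt(10)
```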
Access the real and imaginary parts using np.real() and np.imag():
>>> np.real(com[1])
1.0
>>> np.imag(com[1])
-3.0
Conjugate with np.conj():
>>> np.conj(com[1])
(1+3j)