## Python Workshop: NumPy

Based on the Git repository by Zhiya Zuo.

NumPy is the fundamental package for scientific computing with Python. It contains, among other things:

• A powerful N-dimensional array object.
• Useful linear algebra, Fourier transform, and random number capabilities.
• And much more.

## NumPy installation

In the command prompt, run:

pip install numpy
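
To confirm that the installation worked, one option is to import the package and print its version. This is a minimal sketch; the variable name `version` is only illustrative, and the printed value depends on what pip installed.

```python
import numpy as np

# __version__ is a string such as "1.26.4"; the exact value depends on your installation
version = np.__version__
print(version)
```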


## Arrays

In [1]:
# After you install numpy, load it
import numpy as np  # the alias np lets you write np.<name> instead of numpy.<name>

In [2]:
x = np.array([1, 2, 3])  # create a numpy array object
print(type(x))

<class 'numpy.ndarray'>


We can use the shape attribute of the numpy.ndarray class to check the dimensions of an array:

In [3]:
x.shape  # comparable to len() for a list, but reports the length along every dimension

Out[3]:
(3,)
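
As a small illustration of how shape relates to len(), the sketch below compares them on a 1D and a 2D array. The array `m` is a made-up example that does not appear elsewhere in this workshop.

```python
import numpy as np

x = np.array([1, 2, 3])
m = np.array([[1, 2, 3], [4, 5, 6]])  # illustrative 2D array

# shape is an attribute (no parentheses); it holds one length per dimension
print(x.shape, m.shape)   # (3,) (2, 3)
# len() only reports the length of the first dimension
print(len(x), len(m))     # 3 2
# ndim is the number of dimensions, size is the total number of elements
print(m.ndim, m.size)     # 2 6
```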

Unlike a list, an array uses a single data type for all of its elements:

In [4]:
y = np.array([1, 'yes'])  # automatic type conversion from int to str
y

Out[4]:
array(['1', 'yes'], dtype='<U11')
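
The same conversion happens with numbers, and the dtype attribute shows which type NumPy picked. The following is a sketch with illustrative variable names (`ints`, `mixed`, `as_float`); the exact integer width is platform dependent, so only the kind of type is checked.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5])                   # the int is upcast to float
as_float = np.array([1, 2, 3], dtype=float)  # request a type explicitly

print(ints.dtype.kind)   # i  (an integer type; the width depends on the platform)
print(mixed.dtype)       # float64
print(as_float)          # [1. 2. 3.]
```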

### Multidimensional arrays

In [5]:
arr = np.array([[1, 2, 3, 8]])
arr.shape

Out[5]:
(1, 4)
In [6]:
arr

Out[6]:
array([[1, 2, 3, 8]])
In [7]:
arr = np.array([[1, 2, 3, 8], [3, 2, 3, 2], [4, 5, 0, 8]])
arr.shape

Out[7]:
(3, 4)
In [8]:
arr

Out[8]:
array([[1, 2, 3, 8],
[3, 2, 3, 2],
[4, 5, 0, 8]])

### Special arrays

NumPy provides many functions for creating special arrays:

In [9]:
np.zeros([3, 5], dtype=int)  # dtype sets the element type of the array

Out[9]:
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
In [10]:
np.ones([3, 5])

Out[10]:
array([[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]])
In [11]:
np.eye(3)

Out[11]:
array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])
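
A few other constructors follow the same pattern. The sketch below shows np.full, np.arange, and np.linspace, which are not used elsewhere in this workshop; the variable names are illustrative.

```python
import numpy as np

filled = np.full((2, 3), 7)        # 2x3 array where every element is 7
steps = np.arange(0, 10, 2)        # like range(): start, stop (exclusive), step
grid = np.linspace(0.0, 1.0, 5)    # 5 evenly spaced values, both ends included

print(filled)
# [[7 7 7]
#  [7 7 7]]
print(steps)   # [0 2 4 6 8]
print(grid)    # [0.   0.25 0.5  0.75 1.  ]
```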

## Operations

Arithmetic operators on arrays generally work element-wise, much like vectors in R or the dotted operators (such as .*) in MATLAB:

In [12]:
arr

Out[12]:
array([[1, 2, 3, 8],
[3, 2, 3, 2],
[4, 5, 0, 8]])
In [13]:
arr - 5

Out[13]:
array([[-4, -3, -2,  3],
[-2, -3, -2, -3],
[-1,  0, -5,  3]])
In [14]:
arr * 6  # multiply every element by a scalar

Out[14]:
array([[ 6, 12, 18, 48],
[18, 12, 18, 12],
[24, 30,  0, 48]])
In [15]:
arr * arr  # element-wise multiplication of two arrays

Out[15]:
array([[ 1,  4,  9, 64],
[ 9,  4,  9,  4],
[16, 25,  0, 64]])
In [16]:
np.exp(arr)

Out[16]:
array([[2.71828183e+00, 7.38905610e+00, 2.00855369e+01, 2.98095799e+03],
[2.00855369e+01, 7.38905610e+00, 2.00855369e+01, 7.38905610e+00],
[5.45981500e+01, 1.48413159e+02, 1.00000000e+00, 2.98095799e+03]])

More examples:

In [17]:
arr_2 = np.array([[1], [3], [2], [0]])
arr_2

Out[17]:
array([[1],
[3],
[2],
[0]])
In [18]:
arr_2_T = arr_2.T  # transpose
arr_2_T

Out[18]:
array([[1, 3, 2, 0]])
In [19]:
arr @ arr_2  # matrix multiplication

Out[19]:
array([[13],
[15],
[19]])
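
To make the difference between * and @ concrete, the following sketch checks the shape rule for matrix multiplication: an (n, k) array times a (k, m) array gives an (n, m) array. The names `product` and `mismatch_raised` are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3, 8], [3, 2, 3, 2], [4, 5, 0, 8]])   # shape (3, 4)
arr_2 = np.array([[1], [3], [2], [0]])                       # shape (4, 1)

product = arr @ arr_2
print(product.shape)    # (3, 1)
# first entry by hand: 1*1 + 2*3 + 3*2 + 8*0 = 13
print(product[0, 0])    # 13
# np.matmul is the function form of the @ operator
print(np.array_equal(product, np.matmul(arr, arr_2)))   # True

# arr * arr works element-wise, but arr @ arr fails:
# the inner dimensions of (3, 4) @ (3, 4) do not match
mismatch_raised = False
try:
    arr @ arr
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```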
In [20]:
arr

Out[20]:
array([[1, 2, 3, 8],
[3, 2, 3, 2],
[4, 5, 0, 8]])
In [21]:
arr.max()

Out[21]:
8
In [22]:
arr.cumsum()

Out[22]:
array([ 1,  3,  6, 14, 17, 19, 22, 24, 28, 33, 33, 41], dtype=int32)

Note: without an axis argument, cumsum() runs over the flattened array, which NumPy traverses row by row, unlike MATLAB (column by column).

Many array methods can compute statistics of the array along a chosen axis:

• axis=1 works along each row (across the columns)
• axis=0 works down each column (across the rows)
In [23]:
arr.cumsum(axis=1)

Out[23]:
array([[ 1,  3,  6, 14],
[ 3,  5,  8, 10],
[ 4,  9,  9, 17]], dtype=int32)
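
The axis argument works the same way for other methods. As a sketch, here are sum and mean on the same array; `col_sums`, `row_sums`, and `row_means` are illustrative names.

```python
import numpy as np

arr = np.array([[1, 2, 3, 8], [3, 2, 3, 2], [4, 5, 0, 8]])

col_sums = arr.sum(axis=0)    # one sum per column, shape (4,)
row_sums = arr.sum(axis=1)    # one sum per row, shape (3,)
row_means = arr.mean(axis=1)  # values: 3.5, 2.5, 4.25

print(col_sums)   # [ 8  9  6 18]
print(row_sums)   # [14 10 17]
```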

A 1D array is neither a column vector nor strictly a row vector, so it should be treated carefully in vector/matrix manipulation:

In [24]:
a = np.array([1, 2, 3])
a, a.shape

Out[24]:
(array([1, 2, 3]), (3,))
In [25]:
c = np.array([[1, 2, 3]])
c, c.shape  # notice the difference in shape

Out[25]:
(array([[1, 2, 3]]), (1, 3))
In [26]:
# a 1D array can be multiplied like a row vector
b = np.array([[1, 2], [3, 4], [5, 6]])
b

Out[26]:
array([[1, 2],
[3, 4],
[5, 6]])
In [27]:
a @ b

Out[27]:
array([22, 28])
In [28]:
# a 1D array cannot be transposed: .T returns it unchanged
a.T, a.T.shape

Out[28]:
(array([1, 2, 3]), (3,))

A trick to turn a 1D array into a 2D row vector:

In [29]:
a_2d = a.reshape((1, -1))  # -1 tells NumPy to infer that dimension from the number of elements
print(a_2d)
print(a_2d.T)

[[1 2 3]]
[[1]
[2]
[3]]
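
A column vector can also be built directly, without going through the transpose. The sketch below uses reshape(-1, 1) and np.newaxis, which this workshop does not otherwise cover; `col`, `row`, and `col_alt` are illustrative names.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([[1, 2], [3, 4], [5, 6]])

col = a.reshape(-1, 1)       # shape (3, 1): column vector
row = a[np.newaxis, :]       # shape (1, 3): row vector
col_alt = a[:, np.newaxis]   # shape (3, 1): same result as col

print(col.shape, row.shape, col_alt.shape)   # (3, 1) (1, 3) (3, 1)
# with an explicit column vector the result keeps its 2D shape
print(b.T @ col)
# [[22]
#  [28]]
```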


## Indexing and slicing

The most important part is how to index and slice an np.array. It works much like indexing a list, except that we may need more than one index, because most real-life datasets have more than one dimension.

### 1 dimensional case

In [30]:
a1 = np.array([1, 2, 8, 100])
a1

Out[30]:
array([  1,   2,   8, 100])
In [31]:
a1[0]

Out[31]:
1
In [32]:
a1[-2]

Out[32]:
8
In [33]:
a1[[0, 1, 3]]

Out[33]:
array([  1,   2, 100])
In [34]:
a1[1:4]

Out[34]:
array([  2,   8, 100])

We can also index with an array of boolean values:

• True means we want this element
In [35]:
a1 > 3

Out[35]:
array([False, False,  True,  True])

Replacing values of an array with other values according to a boolean mask:

In [36]:
# use the mask to select the elements where it is True
a1[a1 > 3]

Out[36]:
array([  8, 100])
In [37]:
# assign through the same mask to replace those elements
a1[a1 > 3] = 100
a1

Out[37]:
array([  1,   2, 100, 100])
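
Masks can be combined, and np.where offers a way to replace values without modifying the original array. This is a sketch starting again from the original values of a1; `in_range` and `replaced` are illustrative names, and np.where is not used elsewhere in this workshop.

```python
import numpy as np

a1 = np.array([1, 2, 8, 100])

# combine conditions with & (and) or | (or); the parentheses are required
in_range = (a1 > 1) & (a1 < 100)
print(in_range)       # [False  True  True False]
print(a1[in_range])   # [2 8]

# np.where(condition, x, y) takes x where the condition is True, else y
replaced = np.where(a1 > 3, 0, a1)
print(replaced)       # [1 2 0 0]
print(a1)             # [  1   2   8 100]  (unchanged)
```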

### 2 dimensional case

In [38]:
arr

Out[38]:
array([[1, 2, 3, 8],
[3, 2, 3, 2],
[4, 5, 0, 8]])

Indexing with a single number returns a subset of the original multidimensional array, which is itself an array:

In [39]:
arr[0]

Out[39]:
array([1, 2, 3, 8])
In [40]:
type(arr[0])

Out[40]:
numpy.ndarray

Since we now have 2 dimensions, we can use 2 indices, one for each dimension:

In [41]:
arr[0, 0]

Out[41]:
1

We can use : to select everything along an axis:

In [42]:
arr[1]

Out[42]:
array([3, 2, 3, 2])
In [43]:
arr[1, :]

Out[43]:
array([3, 2, 3, 2])
In [44]:
arr[:, 1]  # watch out! this returns a 1D array, not a column vector as you might expect

Out[44]:
array([2, 2, 5])
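
If a 2D column is needed, one option is to index with a slice instead of a single number, since slices keep the dimension they select from. A short sketch, with illustrative names `column` and `block`:

```python
import numpy as np

arr = np.array([[1, 2, 3, 8], [3, 2, 3, 2], [4, 5, 0, 8]])

column = arr[:, 1:2]     # the slice 1:2 keeps the second axis
print(column.shape)      # (3, 1)
print(column)
# [[2]
#  [2]
#  [5]]

block = arr[:2, 1:3]     # first two rows, columns 1 and 2
print(block)
# [[2 3]
#  [2 3]]
```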
In [45]:
# 2D masking: replace every element greater than 3 with 55
arr[arr > 3] = 55
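
The cell above prints nothing, so here is a sketch of what the 2D mask does, applied to a copy so that the original array stays untouched. The names `masked` and `mask` are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3, 8], [3, 2, 3, 2], [4, 5, 0, 8]])

masked = arr.copy()    # work on a copy; the assignment below modifies in place
mask = masked > 3      # boolean array with the same shape as masked
print(mask.sum())      # 4  (number of elements greater than 3)

masked[mask] = 55
print(masked)
# [[ 1  2  3 55]
#  [ 3  2  3  2]
#  [55 55  0 55]]
```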


### 3 dimensional case

As a final example, we look at a 3d array:

In [46]:
np.random.seed(1234)
arr_3 = np.random.randint(low=0, high=100, size=24)
arr_3

Out[46]:
array([47, 83, 38, 53, 76, 24, 15, 49, 23, 26, 30, 43, 30, 26, 58, 92, 69,
80, 73, 47, 50, 76, 37, 34])

We can use reshape to change the shape of an array:

In [47]:
arr_3 = arr_3.reshape(3, 4, 2)
arr_3

Out[47]:
array([[[47, 83],
[38, 53],
[76, 24],
[15, 49]],

[[23, 26],
[30, 43],
[30, 26],
[58, 92]],

[[69, 80],
[73, 47],
[50, 76],
[37, 34]]])

Note: Is the printed array not what you thought it would be? Does the shape look mixed up? It is not: a shape of (3, 4, 2) means 3 blocks, each with 4 rows and 2 columns.

In [48]:
arr_3[0]

Out[48]:
array([[47, 83],
[38, 53],
[76, 24],
[15, 49]])
In [49]:
arr_3[:, 3, 1]

Out[49]:
array([49, 92, 34])
In [50]:
arr_3[2, 3, 1]

Out[50]:
34
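
To see how the three indices map back to the flat array that was reshaped, the sketch below uses np.arange so that every value equals its position in the flat array. The names `cube` and `flat_position` are illustrative, and the formula assumes NumPy's default row-major (C) order.

```python
import numpy as np

flat = np.arange(24)            # 0, 1, ..., 23: each value equals its flat position
cube = flat.reshape(3, 4, 2)    # 3 blocks, each with 4 rows and 2 columns

i, j, k = 2, 3, 1
# row-major order: the last index changes fastest,
# so one step in i skips 4*2 elements and one step in j skips 2
flat_position = i * (4 * 2) + j * 2 + k
print(cube[i, j, k], flat_position)   # 23 23

print(cube[:, 3, 1])    # [ 7 15 23]
print(cube[0].shape)    # (4, 2)
```

This matches the example above: arr_3[2, 3, 1] is the element at flat position 23, the last of the 24 random values.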