Introduction to Numpy

by Long Nguyen

This Python tutorial is minimal, covering only the absolute basics of the powerful Numpy library necessary for the Introduction to Neural Networks course. For an excellent and more comprehensive introduction to Numpy, see Ryan Soklaski's Python Like You Mean It.

In this tutorial, we will cover:

  • Numpy: Working with Arrays, Matrix operations, Broadcasting.

Numpy

Numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays.

To use Numpy, we first need to import the numpy package:

In [2]:
import numpy as np

Arrays

A numpy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.

We can initialize numpy arrays from nested Python lists. We can create a 1D (rank 1) array as follows.

In [30]:
a = np.array([1,2,3,4]) # rank 1 array (1D array)
print(a.shape)  
print(a)
(4,)
[1 2 3 4]
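Since all values in an array share one type, it can help to inspect that type alongside the rank and shape. The following is a minimal sketch; the array `f` is an illustrative name that is not used elsewhere in this tutorial.

```python
import numpy as np

a = np.array([1, 2, 3, 4])      # integers only
f = np.array([1, 2.5, 3, 4])    # a single float makes every entry a float

print(a.ndim, a.shape)   # rank and shape: 1 (4,)
print(a.dtype)           # an integer type; the exact width depends on the platform
print(f.dtype)           # float64
```

The `dtype` attribute reports the common type of the entries, and `ndim` reports the rank.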

We'll work exclusively with 2D (rank 2) arrays in this course.

In [7]:
b = np.array([[1,2,3],[4,5,6]]) # Create a rank 2 array
                                # b has shape (2,3), two rows, three cols
print(b)  # notice the difference between printing b vs outputting b
b 
[[1 2 3]
 [4 5 6]]
Out[7]:
array([[1, 2, 3],
       [4, 5, 6]])

Numpy provides a way to generate a random array of any shape whose values are drawn from the standard normal distribution (the bell curve with mean $0$ and standard deviation $1$).

In [14]:
d = np.random.randn(2,3)  # standard normal distribution
print(d)
[[-0.34529725  0.8569278   0.49064438]
 [-0.08365688  0.64257269 -1.53733287]]
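As a rough check that these values really come from a distribution with mean $0$ and standard deviation $1$, we can draw a large sample and look at its statistics. This is a sketch only; the seed value and the name `sample` are arbitrary choices made for illustration.

```python
import numpy as np

np.random.seed(0)                     # fix the seed so the run is repeatable
sample = np.random.randn(1000, 100)   # 100,000 draws in a (1000,100) array

print(sample.shape)    # (1000, 100)
print(sample.mean())   # close to 0
print(sample.std())    # close to 1
```

With a small array such as the (2,3) example above, the sample mean and standard deviation can be far from $0$ and $1$; the agreement improves as the number of draws grows.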
In [16]:
e = np.arange(16) # rank 1 array with values from 0 to 15.
print(e.shape)
e
(16,)
Out[16]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

The reshape function allows an array to be given a new shape without changing its data. The new shape, however, needs to be compatible with the original shape.

In [17]:
e = e.reshape(4,4)
e   # note the number of brackets, where they are and the commas.
Out[17]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
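"Compatible" here means that the new shape must hold exactly the same number of entries as the old one. The sketch below, which uses the illustrative names `wide`, `inferred` and `failed`, shows two shapes that work for 16 entries and one that does not.

```python
import numpy as np

e = np.arange(16)

wide = e.reshape(2, 8)        # fine: 2 * 8 = 16
inferred = e.reshape(4, -1)   # -1 asks Numpy to work out the missing size
print(wide.shape)             # (2, 8)
print(inferred.shape)         # (4, 4)

failed = False
try:
    e.reshape(3, 5)           # 3 * 5 = 15, not 16
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```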

Multi-dimensional arrays

Let's start with a rank 3 array (3D array).

In [8]:
a = np.arange(12)
a = a.reshape(3,2,2) # three sheets, two rows and two columns
print(a.ndim) # number of dimensions
a  # try writing out a before running this cell.
3
Out[8]:
array([[[ 0,  1],
        [ 2,  3]],

       [[ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11]]])

The 3D array a consists of three "sheets", each with two rows and two columns.

In [19]:
a[0] # sheet 0 (2D array of 2 rows and 2 columns)
Out[19]:
array([[0, 1],
       [2, 3]])
In [20]:
a[0,1] # sheet 0, row 1 (1D array)
Out[20]:
array([2, 3])
In [21]:
a[0,1,1] # sheet 0, row 1, column 1
Out[21]:
3

Array math

We'll deal exclusively with greyscale images (2D arrays, or matrices) in this course. We begin by reviewing some basic matrices and their operations.

An $m\times n$ matrix $A$ is a rectangular array of real numbers arranged in $m$ rows and $n$ columns and has the form:

$$\left[ \begin{array}{cccc} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{array} \right] $$

We will denote an $m\times n$ matrix $A$ as $[a_{ij}]$.

In Numpy, such a 2D array has shape $(m,n)$.

Given two matrices $A=[a_{ij}]$ and $B=[b_{ij}]$ of dimension $m\times n$, the sum $A+B$ and difference $A-B$ are calculated elementwise.

For example, $$\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]+\left[\begin{array}{rr} 5 & 6 \\ 7 & 8 \end{array}\right]=\left[\begin{array}{rr} 6 & 8 \\ 10 & 12 \end{array}\right]$$

In [22]:
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])
print(x + y) # elementwise sum
[[ 6  8]
 [10 12]]
In [23]:
# Elementwise difference
print(x - y)
[[-4 -4]
 [-4 -4]]

The scalar product $cA$ of a real number (scalar) $c$ and a matrix $A$ is computed by multiplying every entry of $A$ by $c$. Thus $cA=[ca_{ij}].$

In [24]:
# scalar product 
print(3*x)
[[ 3  6]
 [ 9 12]]

The elementwise product of two matrices $A=[a_{ij}]$ and $B=[b_{ij}]$ of dimension $m\times n$, also known as the Hadamard product, is $A*B=[a_{ij}*b_{ij}]$.

For example, $$\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]*\left[\begin{array}{rr} 5 & 6 \\ 7 & 8 \end{array}\right]=\left[\begin{array}{rr} 5 & 12 \\ 21 & 32 \end{array}\right]$$

In [69]:
# Elementwise product
print(x * y)
[[ 5 12]
 [21 32]]

Note that * is elementwise multiplication, not matrix multiplication. We instead use the dot function to multiply matrices. Matrix multiplication is VERY important in machine learning. You need to be very comfortable with it. Please review it, if necessary.

We'll review matrix multiplication through some examples.

$$\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]\left[\begin{array}{r} 1 \\ 2 \end{array}\right]=\left[\begin{array}{r} 5 \\ 11 \end{array}\right]$$

In [31]:
x = np.array([[1,2],[3,4]]) # shape (2,2)
v = np.array([[1],[2]])   # shape (2,1)

print(np.dot(x, v)) #(2,2) matrix times (2,1) matrix yields a (2,1) matrix
[[ 5]
 [11]]

$$\left[\begin{array}{rrr} 1 & 0 & 3 \\ 4 & -1 & 2 \end{array}\right]\left[\begin{array}{rr} 0 & 1 \\ 2 & -1 \\ 3 & 0 \end{array}\right]=\left[\begin{array}{rr} 9 & 1 \\ 4 & 5 \end{array}\right]$$

Note that a (2,3) array times (3,2) array yields a (2,2) array.

In [34]:
x = np.array([[1,0,3],[4,-1,2]]) 
y = np.array([[0,1],[2,-1],[3,0]])

print(np.dot(x, y))
[[9 1]
 [4 5]]
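To make the difference between `*` and `dot` concrete, the sketch below applies both to the same pair of square matrices. The names `p` and `q` are illustrative. The `@` operator shown in the last line is Python's built-in matrix multiplication operator, which gives the same result as `np.dot` for 2D arrays.

```python
import numpy as np

p = np.array([[1, 2], [3, 4]])
q = np.array([[5, 6], [7, 8]])

elementwise = p * q     # Hadamard product: multiply matching entries
matrix = np.dot(p, q)   # matrix product: rows of p times columns of q

print(elementwise)      # [[ 5 12]
                        #  [21 32]]
print(matrix)           # [[19 22]
                        #  [43 50]]
print(np.array_equal(matrix, p @ q))   # True
```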

The following matrix multiplication is undefined since the dimensions are not aligned correctly.

$$\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]\left[\begin{array}{rr} 0 & 1 \end{array}\right]$$

In [87]:
x = np.array([[1,2],[3,4]]) 
y = np.array([[0,1]])

print(np.dot(x, y))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-87-be1e09034b93> in <module>()
      2 y = np.array([[0,1]])
      3 
----> 4 print(np.dot(x, y))

ValueError: shapes (2,2) and (1,2) not aligned: 2 (dim 1) != 1 (dim 0)

Given a matrix $A$ of dimension $m\times n$, define the transpose of $A$, denoted by $A^T$, as the matrix whose rows are the columns of $A$. Thus the dimension of $A^T$ is $n\times m$.

In Numpy, to transpose a matrix, simply use the T attribute of an array object:

In [91]:
print(x)
print(x.T)
[[1 2]
 [3 4]]
[[1 3]
 [2 4]]
In [92]:
v = np.array([[1,2,3]])
print(v) # shape (1,3)
print(v.T) # shape (3,1)
[[1 2 3]]
[[1]
 [2]
 [3]]
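The transpose is often what is needed to make the shapes of a matrix product line up. As a sketch, here are the (2,2) and (1,2) arrays from the undefined product shown earlier, multiplied in two ways that are defined.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])   # shape (2,2)
y = np.array([[0, 1]])           # shape (1,2)

print(y.T.shape)        # (2, 1)
print(np.dot(x, y.T))   # (2,2) times (2,1) yields a (2,1) array
                        # [[2]
                        #  [4]]
print(np.dot(y, x))     # (1,2) times (2,2) yields a (1,2) array
                        # [[3 4]]
```

In both cases the number of columns of the first array equals the number of rows of the second, which is exactly the condition that `np.dot(x, y)` violated.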

Numpy provides many useful functions for performing computations on arrays; one of the most useful is sum:

In [78]:
# np.sum(x, axis=n) collapses dimension n:
# each entry of the result is the sum of the values along that dimension.
x = np.array([[1,2],[3,4]])

print(np.sum(x))  # sum of all elements
print(np.sum(x, axis=0))  # the rows (axis=0) are collapsed, giving the sum of each column
print(np.sum(x, axis=1))  # the columns (axis=1) are collapsed, giving the sum of each row
10
[4 6]
[3 7]
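The effect of `axis` is easier to see on a non-square array, where the two results have different lengths. The sketch below uses an illustrative (2,3) array `m`; the `keepdims=True` argument is an optional extra that keeps the collapsed dimension with size 1.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2,3)

col_sums = np.sum(m, axis=0)                  # collapse the 2 rows
row_sums = np.sum(m, axis=1)                  # collapse the 3 columns
row_sums_2d = np.sum(m, axis=1, keepdims=True)

print(col_sums, col_sums.shape)   # [5 7 9] (3,)
print(row_sums, row_sums.shape)   # [ 6 15] (2,)
print(row_sums_2d.shape)          # (2, 1)
```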

Another useful numpy function that we will use is argmax, which returns the index of the largest value.

In [12]:
x = np.array([1,2,5,8,4,5])
np.argmax(x) # returns 3 since 8 is the largest and it is at index 3. 
Out[12]:
3
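Like sum, argmax accepts an `axis` argument for 2D arrays. The sketch below uses a made-up array `scores`, where each row could be thought of as the scores assigned to three classes for one example; that interpretation is an assumption added here for illustration.

```python
import numpy as np

scores = np.array([[0.1, 0.7, 0.2],
                   [0.6, 0.3, 0.1]])

per_row = np.argmax(scores, axis=1)   # index of the largest value in each row
per_col = np.argmax(scores, axis=0)   # index of the largest value in each column
flat = np.argmax(scores)              # index into the flattened array

print(per_row)   # [1 0]
print(per_col)   # [1 0 0]
print(flat)      # 1
```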

You can find the full list of mathematical functions provided by numpy in the documentation.

Broadcasting

Broadcasting allows elementwise operations on arrays whose shapes differ, as long as Numpy can stretch one shape to match the other. This notebook's treatment of broadcasting is VERY minimal. We only cover what is necessary for the rest of the course.

You can find more information about broadcasting in the documentation.

Consider the following matrices $A = \left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]$ and $B=\left[\begin{array}{rr} 0 & 1 \end{array}\right]$. Note that, as a matrix sum, $A+B$ is undefined because the shapes (2,2) and (1,2) are not the same.

Numpy, however, will broadcast the shape of the second matrix to shape (2,2) to be compatible with the first so that the addition can be done. It does this without making copies of the data and wasting memory.

$$\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]+\left[\begin{array}{rr} 0 & 1\end{array}\right]\rightarrow\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]+\left[\begin{array}{rr} 0 & 1 \\ 0 & 1 \end{array}\right]=\left[\begin{array}{rr} 1 & 3 \\ 3 & 5 \end{array}\right]$$

Similarly,

$$\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]+\left[\begin{array}{rr} 0 \\ 1 \end{array}\right]\rightarrow\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]+\left[\begin{array}{rr} 0 & 0 \\ 1 & 1 \end{array}\right]=\left[\begin{array}{rr} 1 & 2 \\ 4 & 5 \end{array}\right]$$

In [11]:
A = np.array([[1,2],[3,4]])
B = np.array([[0,1]])
A+B
Out[11]:
array([[1, 3],
       [3, 5]])
In [10]:
A = np.array([[1,2],[3,4]])
B = np.array([[0],[1]])
A+B
Out[10]:
array([[1, 2],
       [4, 5]])
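Broadcasting does not work for every pair of shapes. As a sketch, `np.broadcast_to` shows the stretched version of B explicitly, and a (1,3) array (the illustrative name `C`) cannot be stretched to match a (2,2) array, so the addition raises an error.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])         # shape (2,2)
B = np.array([[0, 1]])                 # shape (1,2)

stretched = np.broadcast_to(B, (2, 2)) # what B looks like after broadcasting
print(stretched)                       # [[0 1]
                                       #  [0 1]]

C = np.array([[0, 1, 2]])              # shape (1,3)
failed = False
try:
    A + C                              # 2 columns vs 3 columns: no way to stretch
except ValueError as err:
    failed = True
    print("cannot broadcast:", err)
```

In the cases covered in this notebook, a dimension of size 1 is stretched to match the other array, while two dimensions of different sizes, neither of which is 1, cannot be matched.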

Homework

Do the following problems.

Manually create a numpy array of rank 1 with numbers from 1 to 4.

Create a numpy array of rank 2 with numbers from 1 to 4.

Create a 1D array with the arange function from 0 to 29. Then reshape it into 1) an array of rank 2 of the appropriate size, and 2) an array of rank 3 of the appropriate size.

Create a numpy array of rank 2 with shape (4,5) of random values from the standard normal distribution.

Compute the following using numpy arrays. Use rank 2 arrays for both the matrix and the column vector.

$$\left[\begin{array}{rrr} -2 & 1 & 1 \\ 3 & 2 & 3 \end{array}\right]\left[\begin{array}{r} 0 \\ 2 \\ 3 \end{array}\right]$$

Let $X=\left[\begin{array}{rrr}-2 & 1 & 1 \\3 & 2 & 3\end{array}\right]$. Compute $X^T$, the transpose of $X$.

Let $A=\left[\begin{array}{rr} 3 & -1 \\ -2 & 1 \end{array}\right]$ and $B=\left[\begin{array}{rr} 1 & 2 \\ -4 & 3 \end{array}\right]$. Compute $3A-2B$.

Let $A$ and $B$ be as in the previous problem. Compute the matrix product $AB$ and the elementwise Hadamard product $A*B$.

Let $A=\left[\begin{array}{rrr} 3 & -1 & 2 \\ -2 & 1 & 4 \\ 0 & 5 & 1 \end{array}\right]$. Find the sum of all of the rows. Your answer should be the 1D array $$\left[\begin{array}{rrr} 1 & 5 & 7\end{array}\right].$$

Let $A$ be the array as in the previous problem. Find the sum by collapsing all of the columns. Your answer should be the 1D array $$\left[\begin{array}{rrr} 4 & 3 & 6\end{array}\right].$$

Let A be the array as in the previous problem. Consider the function below which accepts a number and returns its square.

In [6]:
def sq(x):
    return x*x
In [7]:
sq(3)
Out[7]:
9

Compute sq(A). What do you notice? In general, if $A$ is an $m\times n$ matrix $[a_{ij}]$ and $f:\mathbb{R}\rightarrow\mathbb{R}$ is a function, then $f(A)$ is the $m\times n$ matrix given by $[f(a_{ij})]$. In other words, $f$ is applied elementwise to each entry of $A$. This is an important operation in machine learning.
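As a separate illustration of an elementwise function (not a solution to the problem above), the sketch below defines a hypothetical function `relu` that replaces negative numbers with $0$ and applies it to a small made-up array `M`.

```python
import numpy as np

def relu(x):
    # np.maximum compares 0 with every entry of x, one entry at a time
    return np.maximum(0, x)

M = np.array([[-1.0, 2.0],
              [0.5, -3.0]])

result = relu(M)
print(result.shape)   # (2, 2): same shape as M
print(result)         # [[0.  2. ]
                      #  [0.5 0. ]]
```

The result has the same shape as the input, and each entry depends only on the corresponding entry of `M`.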

Let $A=\left[\begin{array}{rrr} 4 & 5 & 0 \\ -1 & 3 & 2 \\ -3 & 6 & 3\end{array}\right]$ and $B=\left[\begin{array}{rrr} 1 & -2 & 1 \end{array}\right]$. Use broadcasting to compute $A+B$. Do it by hand before verifying with code.

Let $A=\left[\begin{array}{rr} 4 & 5 \\ -1 & 3 \end{array}\right]$ and $B=\left[\begin{array}{rr} 1 \end{array}\right]$. Use broadcasting to compute $A+B$. Do it by hand before verifying with code.

Let $A=\left[\begin{array}{rr} 4 & 5 \\ -1 & 3 \end{array}\right]$ and $X=\left[\begin{array}{rr} 1 & 1 \\ 2 & -1 \end{array}\right]$ and $b=\left[\begin{array}{rr} 1 \\ 2 \end{array}\right]$. Use broadcasting to compute $AX+b$. Do it by hand before verifying with code.