In this tutorial, we will cover:

- Numpy: Working with Arrays, Matrix operations, Broadcasting.

Numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays.

To use Numpy, we first need to import the `numpy` package:

In [2]:

```
import numpy as np
```

A numpy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.
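The "all of the same type" requirement can be checked through an array's `dtype` attribute. The following is a minimal sketch (the variable names `ints` and `mixed` are illustrative, not part of the tutorial) showing that mixing integers and floats yields a single common float type:

```python
import numpy as np

ints = np.array([1, 2, 3])       # all integers
mixed = np.array([1, 2.5, 3])    # one float forces a common float type

print(ints.dtype)    # an integer type (the exact width is platform dependent)
print(mixed.dtype)   # float64

# ndim is the number of dimensions (the rank); shape gives the size along each one
print(ints.ndim, ints.shape)   # 1 (3,)
```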

We can initialize numpy arrays from (possibly nested) Python lists. We can create a 1D (rank 1) array as follows.

In [30]:

```
a = np.array([1,2,3,4]) # rank 1 array (1D array)
print(a.shape)
print(a)
```

We'll work exclusively with 2D (rank 2) arrays in this course.

In [7]:

```
b = np.array([[1,2,3],[4,5,6]]) # Create a rank 2 array
# b has shape (2,3), two rows, three cols
print(b) # notice the difference between printing b vs outputting b
b
```

Out[7]:

Numpy provides a way to generate a random array of any shape whose values are drawn from the standard normal distribution (the bell curve with mean $0$ and standard deviation $1$).

In [14]:

```
d = np.random.randn(2,3) # standard normal distribution
print(d)
```
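As a rough check of the "mean $0$, standard deviation $1$" claim, the sketch below draws a larger sample and inspects its statistics. The seed value and sample size are arbitrary choices made here for repeatability.

```python
import numpy as np

np.random.seed(0)                 # fix the seed so the draw is repeatable
sample = np.random.randn(10000)   # 10,000 draws from the standard normal

print(sample.shape)   # (10000,)
print(sample.mean())  # close to 0
print(sample.std())   # close to 1
```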

In [16]:

```
e = np.arange(16) # rank 1 array with values from 0 to 15.
print(e.shape)
e
```

Out[16]:

The reshape function allows an array to be given a new shape without changing its data. The new shape, however, needs to be compatible with the original shape.

In [17]:

```
e = e.reshape(4,4)
e # note the number of brackets, where they are and the commas.
```

Out[17]:
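To make "compatible" concrete: a new shape works when the product of its dimensions equals the number of elements in the array. The sketch below shows one shape that works, the `-1` placeholder that lets numpy infer a dimension, and one shape that fails.

```python
import numpy as np

e = np.arange(16)               # 16 values

print(e.reshape(2, 8).shape)    # (2, 8): 2 * 8 == 16
print(e.reshape(4, -1).shape)   # (4, 4): -1 asks numpy to infer the size

try:
    e.reshape(3, 5)             # 3 * 5 == 15, which does not match 16
except ValueError as err:
    print("reshape failed:", err)
```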

Let's start with a rank 3 array (3D array).

In [8]:

```
a = np.arange(12)
a = a.reshape(3,2,2) # three sheets, two rows and two columns
print(a.ndim) # number of dimensions
a # try writing out a before running this cell.
```

Out[8]:

The 3D array `a` consists of three "sheets", each with two rows and two columns.

In [19]:

```
a[0] # sheet 0 (2D array of 2 rows and 2 columns)
```

Out[19]:

In [20]:

```
a[0,1] # sheet 0, row 1 (1D array)
```

Out[20]:

In [21]:

```
a[0,1,1] # sheet 0, row 1, column 1
```

Out[21]:
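Each additional index removes one dimension from the result. A short sketch, rebuilding the same array so that the block is self-contained:

```python
import numpy as np

a = np.arange(12).reshape(3, 2, 2)   # three sheets, two rows, two columns

print(a[0].shape)      # (2, 2)  one index: a 2D sheet
print(a[0, 1].shape)   # (2,)    two indices: a 1D row
print(a[0, 1, 1])      # 3       three indices: a single value
```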

We'll deal exclusively with greyscale images (2D arrays, or matrices) in this course. We begin by reviewing some basic matrices and their operations.

An $m\times n$ matrix $A$ is a rectangular array of real numbers arranged in $m$ rows and $n$ columns and has the form:

$$\left[ \begin{array}{cccc} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{array} \right] $$

We will denote an $m\times n$ matrix $A$ as $[a_{ij}]$.

In Numpy, such a 2D array has shape $(m,n)$.
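One detail worth keeping in mind is that the mathematical notation $a_{ij}$ counts rows and columns from 1, while numpy indices start at 0. The sketch below uses an illustrative $2\times 3$ matrix to show the correspondence.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # m = 2 rows, n = 3 columns

m, n = A.shape
print(m, n)                 # 2 3

# a_{12} (row 1, column 2 in math notation) is A[0, 1] in numpy
print(A[0, 1])              # 2

# a_{mn}, the bottom-right entry, is A[m - 1, n - 1]
print(A[m - 1, n - 1])      # 6
```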

Given two matrices $A=[a_{ij}]$ and $B=[b_{ij}]$ of dimension $m\times n$, the sum $A+B$ and difference $A-B$ are calculated elementwise.

For example, $$\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]+\left[\begin{array}{rr} 5 & 6 \\ 7 & 8 \end{array}\right]=\left[\begin{array}{rr} 6 & 8 \\ 10 & 12 \end{array}\right]$$

In [22]:

```
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])
print(x + y) # elementwise sum
```

In [23]:

```
# Elementwise difference
print(x - y)
```

The scalar product $cA$ of a real number (scalar) $c$ and a matrix $A$ is computed by multiplying every entry of $A$ by $c$. Thus $cA=[ca_{ij}].$

In [24]:

```
# scalar product
print(3*x)
```

The elementwise product of two matrices $A=[a_{ij}]$ and $B=[b_{ij}]$ of dimension $m\times n$, also known as the **Hadamard** product, is $A*B=[a_{ij}*b_{ij}]$.

For example, $$\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]*\left[\begin{array}{rr} 5 & 6 \\ 7 & 8 \end{array}\right]=\left[\begin{array}{rr} 5 & 12 \\ 21 & 32 \end{array}\right]$$

In [69]:

```
# Elementwise product
print(x * y)
```

Note that `*` is elementwise multiplication, not matrix multiplication. We instead use the `dot` function to multiply matrices. Matrix multiplication is VERY important in machine learning. You need to be very comfortable with it. Please review it, if necessary.

We'll review matrix multiplication through some examples.

$$\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]\left[\begin{array}{r} 1 \\ 2 \end{array}\right]=\left[\begin{array}{r} 5 \\ 11 \end{array}\right]$$

In [31]:

```
x = np.array([[1,2],[3,4]]) # shape (2,2)
v = np.array([[1],[2]]) # shape (2,1)
print(np.dot(x, v)) #(2,2) matrix times (2,1) matrix yields a (2,1) matrix
```

$$\left[\begin{array}{rrr} 1 & 0 & 3 \\ 4 & -1 & 2 \end{array}\right]\left[\begin{array}{rr} 0 & 1 \\ 2 & -1 \\ 3 & 0 \end{array}\right]=\left[\begin{array}{rr} 9 & 1 \\ 4 & 5 \end{array}\right]$$

Note that a `(2,3)` array times a `(3,2)` array yields a `(2,2)` array.

In [34]:

```
x = np.array([[1,0,3],[4,-1,2]])
y = np.array([[0,1],[2,-1],[3,0]])
print(np.dot(x, y))
```
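For readers reviewing matrix multiplication, the row-times-column rule can be written out with explicit loops. The helper `matmul_loops` below is a hypothetical name introduced only for illustration; in practice `np.dot` (or the equivalent `@` operator for 2D arrays) should be used.

```python
import numpy as np

def matmul_loops(A, B):
    """Multiply A of shape (m,n) by B of shape (n,p) from the definition."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner dimensions must match"
    C = np.zeros((m, p), dtype=A.dtype)
    for i in range(m):          # each row of A
        for j in range(p):      # each column of B
            for k in range(n):  # sum of products along the shared dimension
                C[i, j] += A[i, k] * B[k, j]
    return C

x = np.array([[1, 0, 3], [4, -1, 2]])
y = np.array([[0, 1], [2, -1], [3, 0]])

print(matmul_loops(x, y))   # same values as np.dot(x, y)
print(x @ y)                # the @ operator also performs matrix multiplication
```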

The following matrix multiplication is undefined because the number of columns of the first matrix (2) does not equal the number of rows of the second (1).

$$\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]\left[\begin{array}{rr} 0 & 1 \end{array}\right]$$

In [87]:

```
x = np.array([[1,2],[3,4]])
y = np.array([[0,1]])
print(np.dot(x, y)) # raises ValueError: shapes (2,2) and (1,2) are not aligned
```
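The sketch below catches the error instead of letting it stop the notebook, and then shows two products involving the same arrays that are defined because their inner dimensions match.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])   # shape (2,2)
y = np.array([[0, 1]])           # shape (1,2)

try:
    np.dot(x, y)                 # inner dimensions 2 and 1 do not match
except ValueError as err:
    print("np.dot failed:", err)

print(np.dot(y, x))                 # (1,2) times (2,2) is defined: [[3 4]]
print(np.dot(x, y.reshape(2, 1)))   # (2,2) times (2,1) is defined: a (2,1) column
```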

Given a matrix $A$ of dimension $m\times n$, the **transpose** of $A$, denoted by $A^T$, is the matrix whose rows are the columns of $A$. Thus the dimension of $A^T$ is $n\times m$.

In Numpy, to transpose a matrix, simply use the `T` attribute of an array object:

In [91]:

```
print(x)
print(x.T)
```

In [92]:

```
v = np.array([[1,2,3]])
print(v) # shape (1,3)
print(v.T) # shape (3,1)
```
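One caveat, shown here as a small sketch with illustrative names `v1` and `v2`: transposing a rank 1 array leaves its shape unchanged, because it has only one dimension. A column needs an explicit second dimension.

```python
import numpy as np

v1 = np.array([1, 2, 3])      # rank 1, shape (3,)
v2 = np.array([[1, 2, 3]])    # rank 2, shape (1,3)

print(v1.T.shape)   # (3,)   unchanged: there is only one dimension
print(v2.T.shape)   # (3, 1) the row becomes a column

# reshape turns the rank 1 array into an explicit column
print(v1.reshape(3, 1).shape)   # (3, 1)
```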

Numpy provides many useful functions for performing computations on arrays; one of the most useful is `sum`:

In [78]:

```
# np.sum(x, axis=n) collapses dimension n:
# each entry of the result is the sum of the values that were collapsed into it
x = np.array([[1,2],[3,4]])
print(np.sum(x)) # sum of all elements
print(np.sum(x, axis=0)) # the rows (axis=0) are collapsed, giving the sum of each column
print(np.sum(x, axis=1)) # the columns (axis=1) are collapsed, giving the sum of each row
```
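With a square matrix it is easy to lose track of which dimension was collapsed. A non-square example (chosen here for illustration) makes the effect on the shape visible:

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])         # shape (2,3)

col_sums = np.sum(x, axis=0)      # collapse the 2 rows
row_sums = np.sum(x, axis=1)      # collapse the 3 columns

print(col_sums, col_sums.shape)   # [5 7 9] (3,)
print(row_sums, row_sums.shape)   # [ 6 15] (2,)
print(np.sum(x))                  # 21
```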

Another useful numpy function that we will use is `argmax`, which returns the index of the largest value.

In [12]:

```
x = np.array([1,2,5,8,4,5])
np.argmax(x) # returns 3 since 8 is the largest and it is at index 3.
```

Out[12]:
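Like `sum`, `argmax` accepts an `axis` argument. The sketch below uses a made-up `scores` array to find the position of the largest value in each row or column, and also shows that ties resolve to the first occurrence.

```python
import numpy as np

scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2]])   # hypothetical values: 2 rows, 3 columns

print(np.argmax(scores, axis=1))   # [1 0]: index of the largest value in each row
print(np.argmax(scores, axis=0))   # [1 0 0]: index of the largest value in each column

ties = np.array([5, 8, 8])
print(np.argmax(ties))             # 1: the first occurrence wins
```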

You can find the full list of mathematical functions provided by numpy in the documentation.

Broadcasting allows elementwise operations on arrays whose shapes differ, provided the shapes satisfy certain compatibility rules. This notebook's treatment of broadcasting is VERY minimal. We only cover what is necessary for the rest of the course.

You can find more information about broadcasting in the documentation.

Consider the following matrices $A = \left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]$ and $B=\left[\begin{array}{rr} 0 & 1 \end{array}\right]$. Note that the matrix sum $A+B$ is undefined because the shapes `(2,2)` and `(1,2)` are not the same.

Numpy, however, will broadcast the second matrix to shape `(2,2)` to be compatible with the first so that the addition can be done. It does this without actually copying the data, so no memory is wasted.

$$\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]+\left[\begin{array}{rr} 0 & 1\end{array}\right]\rightarrow\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]+\left[\begin{array}{rr} 0 & 1 \\ 0 & 1 \end{array}\right]=\left[\begin{array}{rr} 1 & 3 \\ 3 & 5 \end{array}\right]$$

Similarly,

$$\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]+\left[\begin{array}{r} 0 \\ 1 \end{array}\right]\rightarrow\left[\begin{array}{rr} 1 & 2 \\ 3 & 4 \end{array}\right]+\left[\begin{array}{rr} 0 & 0 \\ 1 & 1 \end{array}\right]=\left[\begin{array}{rr} 1 & 2 \\ 4 & 5 \end{array}\right]$$

In [11]:

```
A = np.array([[1,2],[3,4]])
B = np.array([[0,1]])
A+B
```

Out[11]:

In [10]:

```
A = np.array([[1,2],[3,4]])
B = np.array([[0],[1]])
A+B
```

Out[10]:
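The stretching step can be made visible with `np.broadcast_to`, which returns the broadcast array as a read-only view. The sketch below (the name `stretched` is illustrative) confirms that the view shares memory with `B`, and shows a pair of shapes that cannot be broadcast together.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])   # shape (2,2)
B = np.array([[0, 1]])           # shape (1,2)

stretched = np.broadcast_to(B, A.shape)       # read-only view with shape (2,2)
print(stretched)                              # both rows are [0 1]
print(np.shares_memory(B, stretched))         # True: no data was copied
print(np.array_equal(A + B, A + stretched))   # True

try:
    A + np.array([[1, 2, 3]])    # shapes (2,2) and (1,3) cannot be broadcast
except ValueError as err:
    print("broadcast failed:", err)
```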

Create an array of shape `(4,5)` of random values from the normal distribution.

Compute the following matrix product:

$$\left[\begin{array}{rrr} -2 & 1 & 1 \\ 3 & 2 & 3 \end{array}\right]\left[\begin{array}{r} 0 \\ 2 \\ 3 \end{array}\right]$$

Let `A` be the array as in the previous problem. Consider the function below, which accepts a number and returns its square.

In [6]:

```
def sq(x):
    return x*x
```

In [7]:

```
sq(3)
```

Out[7]:

Compute `sq(A)`. What do you notice? In general, if `A` is an $m\times n$ matrix $[a_{ij}]$ and $f:\mathbb{R}\rightarrow\mathbb{R}$ is a function, then $f(A)$ is the $m\times n$ matrix given by $[f(a_{ij})]$. In other words, $f$ is applied elementwise to each entry of `A`. This is an important operation in machine learning.
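The definition $f(A)=[f(a_{ij})]$ can be checked on a different function than the one in the exercise. In the sketch below, `f` and `M` are illustrative choices made here; the function is built from arithmetic operators, which numpy applies elementwise.

```python
import numpy as np

def f(x):
    # illustrative function; works on a number or, elementwise, on an array
    return 2 * x + 1

M = np.array([[1, 2], [3, 4]])

print(f(3))   # 7
print(f(M))   # a 2x2 array with rows [3 5] and [7 9]

# the same result built entry by entry from the definition [f(a_ij)]
by_hand = np.array([[f(M[i, j]) for j in range(M.shape[1])]
                    for i in range(M.shape[0])])
print(np.array_equal(f(M), by_hand))   # True
```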