ML Zoomcamp 2023 – Introduction to Machine Learning – Part 8

Overview:

  1. Introduction to NumPy part 3/3
    1. Randomly generated arrays
    2. Element-wise operations
    3. Comparison operations
    4. Summarizing operations

Introduction to NumPy part 3/3

Randomly generated arrays

In addition to creating and manipulating multi-dimensional arrays, NumPy also provides the capability to generate arrays filled with random values. This can be useful in various scientific and mathematical applications, as well as for testing and simulation purposes.

To create a randomly generated array, you can use the numpy.random module, which provides a range of functions for generating random values. Here are a few examples (no seed is set, so the printed values will differ from run to run):

import numpy as np

# Generate a 1-dimensional array of 5 random integers between 0 and 9
random_integers = np.random.randint(10, size=5)
print(random_integers)
# Output: [2 5 7 1 8]

# Generate a 2-dimensional array of shape (3, 4) 
# with random floats between 0 and 1
random_floats = np.random.random((3, 4))
print(random_floats)
# Output:
# [[0.0863851  0.83574087 0.79192621 0.85248822]
#  [0.14040051 0.5714931  0.7586195  0.48544792]
#  [0.4304003  0.76688989 0.68447497 0.54361942]]

# Generate a 3-dimensional array of shape (2, 3, 2) 
# with random values from a standard normal distribution
random_normal = np.random.normal(size=(2, 3, 2))
print(random_normal)
# Output:
# [[[-0.27013393  0.54416022]
#   [ 0.02537238 -0.78380969]
#   [-1.25909646 -1.04630766]]
#
#  [[ 0.34262622 -0.66770036]
#   [-0.35426751  0.00569635]
#   [-0.05665257  0.02068191]]]

By using the appropriate functions from the numpy.random module, you can generate arrays with random values from the distribution you need. The following examples show rand, randn, and randint, and how setting a seed makes the results reproducible.

# generates a 2-dimensional array of size 5 rows and 2 columns
# with random numbers between 0 and 1 
# rand samples from standard uniform distribution
np.random.rand(5, 2)
# Output:
# array([[0.83575882, 0.03277884],
#       [0.78785763, 0.34340225],
#       [0.79212789, 0.75564912],
#       [0.78937584, 0.4326158 ],
#       [0.90909093, 0.82098053]])

# when you set the random seed it's possible to reproduce the 
# same "random" values
np.random.seed(2)
np.random.rand(5, 2)
# Output:
# array([[0.4359949 , 0.02592623],
#       [0.54966248, 0.43532239],
#       [0.4203678 , 0.33033482],
#       [0.20464863, 0.61927097],
#       [0.29965467, 0.26682728]])

# randn samples from standard normal distribution
np.random.seed(2)
np.random.randn(5, 2)
# Output:
# array([[-0.41675785, -0.05626683],
#       [-2.1361961 ,  1.64027081],
#       [-1.79343559, -0.84174737],
#       [ 0.50288142, -1.24528809],
#       [-1.05795222, -0.90900761]])

# creates random numbers between 0 and 100
np.random.seed(2)
100 * np.random.rand(5, 2)
# Output:
# array([[43.59949021,  2.59262318],
#       [54.96624779, 43.53223926],
#       [42.03678021, 33.0334821 ],
#       [20.4648634 , 61.92709664],
#       [29.96546737, 26.68272751]])

# creates an array of random integers from 0 (inclusive) to 100 (exclusive)
np.random.seed(2)
np.random.randint(low=0, high=100, size=(5, 2))
# Output:
# array([[40, 15],
#       [72, 22],
#       [43, 82],
#       [75,  7],
#       [34, 49]])
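
Newer NumPy code often uses a Generator object instead of the global state set by np.random.seed. The following sketch is not part of the original notes. It assumes NumPy 1.17 or later, and the seed value 2 is simply reused from the examples above. The generated numbers will not match the seeded outputs shown earlier, because the Generator uses a different underlying algorithm.

```python
import numpy as np

# Assumption: NumPy >= 1.17, where np.random.default_rng is available
rng = np.random.default_rng(2)

uniform = rng.random((5, 2))                   # floats in [0, 1)
normal = rng.standard_normal((5, 2))           # standard normal samples
integers = rng.integers(0, 100, size=(5, 2))   # integers in [0, 100)

# A second generator created with the same seed reproduces the same values
rng_again = np.random.default_rng(2)
uniform_again = rng_again.random((5, 2))
print(np.array_equal(uniform, uniform_again))
# Output: True
```

Keeping the generator in a variable makes it explicit which part of the code depends on which seed.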

Element-wise operations

NumPy supports element-wise operations on arrays, which means you can perform mathematical operations or apply functions to each element of an array individually. This makes it easy to perform calculations on entire arrays without the need for explicit loops.

For example, you can add or subtract arrays element-wise using the + and - operators, respectively. Here’s an example:

import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Element-wise addition
result = arr1 + arr2
print(result)
# Output: [5 7 9]

# Element-wise subtraction
result = arr1 - arr2
print(result)
# Output: [-3 -3 -3]

Similarly, you can perform element-wise multiplication or division using the * and / operators, respectively. Here’s an example:

import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Element-wise multiplication
result = arr1 * arr2
print(result)
# Output: [ 4 10 18]

# Element-wise division
result = arr1 / arr2
print(result)
# Output: [0.25 0.4  0.5]

In addition to basic arithmetic operations, you can also apply various mathematical functions to arrays element-wise. NumPy provides a comprehensive collection of built-in functions such as np.sin(), np.cos(), np.exp(), and np.log(), among others. Here’s an example:

import numpy as np

arr = np.array([0, np.pi/2, np.pi])

# Calculate the sine of each element
result = np.sin(arr)
print(result)
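# Output: [0.0000000e+00 1.0000000e+00 1.2246468e-16]
# (the last value is effectively 0; the tiny remainder is floating-point rounding)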

# Calculate the exponential of each element
result = np.exp(arr)
print(result)
# Output: [ 1.          4.81047738 23.14069263]

By leveraging element-wise operations and built-in functions, you can perform complex calculations on arrays efficiently in NumPy. The same operators also work between an array and a single number, which is then applied to every element:

a = np.arange(5)
a
# Output:
# array([0, 1, 2, 3, 4])

# adds 1 to every element in the array
# note: this does not work with a plain Python list (list + 1 raises a TypeError)
a + 1
# Output:
# array([1, 2, 3, 4, 5])

a * 2
# Output:
# array([0, 2, 4, 6, 8])

a * 100
# Output:
# array([  0, 100, 200, 300, 400])

a / 100
# Output:
# array([0.  , 0.01, 0.02, 0.03, 0.04])

(10 + (a * 2)) ** 2
# Output:
# array([100, 144, 196, 256, 324])

b = (10 + (a * 2)) ** 2 / 100
b
# Output:
# array([1.  , 1.44, 1.96, 2.56, 3.24])

# adds element-wise both arrays
a + b
# Output:
# array([1.  , 2.44, 3.96, 5.56, 7.24])
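
To make the difference from plain Python lists concrete, here is a small sketch (not from the original notes; the variable names are illustrative only). It shows that the same operators behave differently on a list and on an array.

```python
import numpy as np

values = [0, 1, 2, 3, 4]
arr = np.array(values)

# With a list, * repeats the sequence instead of multiplying each element
list_times_two = values * 2
# With an array, * multiplies every element
array_times_two = arr * 2

# Adding a number to a list is not supported at all
try:
    values + 1
    list_add_failed = False
except TypeError:
    list_add_failed = True

print(list_times_two)
# Output: [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
print(array_times_two)
# Output: [0 2 4 6 8]
print(list_add_failed)
# Output: True
```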

Comparison operations

Comparison operations in NumPy allow you to compare elements of arrays and obtain boolean results. This is particularly useful for tasks such as filtering or conditional assignment.

For example, you can use the comparison operator > to determine which elements of an array are greater than a specified value. Here’s an example:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Compare elements with 3
result = arr > 3
print(result)
# Output: [False False False  True  True]

In this example, the comparison arr > 3 compares each element of the array arr with the value 3. The result is a boolean array where True marks elements that are greater than 3 and False marks elements that are less than or equal to 3.

Comparison operations can also be used with multiple arrays of the same shape. The result is an element-wise comparison between corresponding elements of the arrays. Here’s an example:

import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([3, 2, 1])

# Compare elements of arr1 and arr2
result = arr1 == arr2
print(result)
# Output: [False  True False]

In this example, the comparison arr1 == arr2 compares each element of arr1 with the corresponding element of arr2. The resulting boolean array has True values for elements that are equal in both arrays and False values otherwise.

Comparison operations can also be combined with the element-wise operators & (AND) and | (OR) to build more complex conditions. Each comparison must be wrapped in parentheses, because & and | bind more tightly than the comparison operators. Here’s an example:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Compare elements that are greater than 2 and less than 5
result = (arr > 2) & (arr < 5)
print(result)
# Output: [False False  True  True False]

In this example, the comparison (arr > 2) & (arr < 5) combines two comparisons using the logical AND operator. The resulting boolean array contains True values only for elements that are both greater than 2 and less than 5.
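
The other logical operators follow the same pattern. The sketch below is an addition to the original notes. It shows | for OR, ~ for NOT, the equivalent function np.logical_and, and what happens when the parentheses are left out.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Element-wise OR: less than 2 or greater than 4
either = (arr < 2) | (arr > 4)
print(either)
# Output: [ True False False False  True]

# Element-wise NOT: invert a boolean array
not_greater = ~(arr > 3)
print(not_greater)
# Output: [ True  True  True False False]

# np.logical_and gives the same result as & on boolean arrays
both = np.logical_and(arr > 2, arr < 5)
print(both)
# Output: [False False  True  True False]

# Without parentheses the expression fails, because & is evaluated
# before the comparisons
try:
    arr > 2 & arr < 5
    raised = False
except ValueError:
    raised = True
print(raised)
# Output: True
```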

By leveraging comparison operations in NumPy, you can perform element-wise comparisons and obtain boolean arrays representing the results. These boolean arrays can then be used directly as an index to filter an array:

a = np.arange(5)
b = (10 + (a * 2)) ** 2 / 100
# compare numbers element-wise
a >= 2
# Output:
# array([False, False,  True,  True,  True])

a > b
# Output:
# array([False, False,  True,  True,  True])

# checks which elements of a are greater than the corresponding elements of b
# a > b returns a boolean array
# a[a > b] returns the elements of a where the boolean array is True,
# here the elements 2, 3, and 4
a[a > b]
# Output:
# array([2, 3, 4])

# the same elements accessed by index
a[2], a[3], a[4]
# Output:
# (2, 3, 4)
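
The text above mentions conditional assignment, but only filtering is shown. As a minimal sketch (an addition, not from the original notes), np.where and assignment through a boolean mask are two common ways to do it. The replacement values 2 and -1 are arbitrary choices for illustration.

```python
import numpy as np

a = np.arange(5)

# np.where picks from the second argument where the condition is True
# and from the third argument otherwise
clipped = np.where(a > 2, 2, a)
print(clipped)
# Output: [0 1 2 2 2]

# Assigning through a boolean mask changes the selected elements in place;
# work on a copy so the original array stays untouched
masked = a.copy()
masked[masked < 2] = -1
print(masked)
# Output: [-1 -1  2  3  4]
```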

Summarizing operations

# some operations return a single number instead of a new array
# e.g. min() returns the smallest element
a.min()
# Output: 0

a.max()
# Output: 4

a.sum()
# Output: 10

a.mean()
# Output: 2.0

# standard deviation
a.std()
# Output: 1.4142135623730951

n = np.array([
    [12, 13,  0],
    [ 4,  5,  1],
    [ 7,  8,  2]
])

# this also works for 2-dimensional arrays
n.sum()
# Output: 52

n.min()
# Output: 0
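
For 2-dimensional arrays, the summarizing methods also accept an axis argument that summarizes along rows or columns instead of over the whole array. This is a short sketch added to the notes, reusing the array n from above.

```python
import numpy as np

n = np.array([
    [12, 13, 0],
    [4, 5, 1],
    [7, 8, 2],
])

# axis=0 collapses the rows, giving one value per column
column_sums = n.sum(axis=0)
print(column_sums)
# Output: [23 26  3]

# axis=1 collapses the columns, giving one value per row
row_sums = n.sum(axis=1)
print(row_sums)
# Output: [25 10 17]

row_mins = n.min(axis=1)
print(row_mins)
# Output: [0 1 2]
```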

For more information about NumPy functions, check these links:

https://www.datacamp.com/cheat-sheet/numpy-cheat-sheet-data-analysis-in-python

https://mlbookcamp.com/article/numpy

https://gist.github.com/ziritrion/9b80e47956adc0f20ecce209d494cd0a#numpy
