Math/Statistics Computation

Transposing Arrays

import numpy as np

arr = np.arange(15).reshape((3, 5))
arr
# array([[ 0,  1,  2,  3,  4],
#        [ 5,  6,  7,  8,  9],
#        [10, 11, 12, 13, 14]])

arr.T
# array([[ 0,  5, 10],
#        [ 1,  6, 11],
#        [ 2,  7, 12],
#        [ 3,  8, 13],
#        [ 4,  9, 14]])

Inner product

arr = np.random.randn(6, 3)
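np.dot(arr.T, arr)  # (3, 6) times (6, 3) gives a (3, 3) matrix
# Example output (values differ on every run because arr is random):
# array([[ 2.584 ,  1.8753,  0.8888],
#        [ 1.8753,  6.6636,  0.3884],
#        [ 0.8888,  0.3884,  3.9781]])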
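
Because the random example above cannot be reproduced exactly, here is a minimal sketch with a small fixed array. The values in `x` and the names `gram` and `same` are illustrative assumptions, not part of the original notes. It checks that `np.dot(x.T, x)` matches the `@` operator and that the result is symmetric.

```python
import numpy as np

# Small fixed array chosen for illustration (hypothetical values).
x = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])

gram = np.dot(x.T, x)   # shape (3, 2) times (2, 3) gives (3, 3)
same = x.T @ x          # the @ operator computes the same matrix product

print(gram)
# [[1. 2. 0.]
#  [2. 5. 3.]
#  [0. 3. 9.]]
print(np.allclose(gram, same))    # True
print(np.allclose(gram, gram.T))  # True: the product of x.T and x is symmetric
```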

Mean

a = np.array([1,2,3,4])
np.mean(a)
# Output = 2.5

Because True counts as 1 and False counts as 0, np.mean() can also be used to calculate the fraction of array elements that satisfy a certain condition.

np.mean(a > 2)
# a > 2 evaluates to array([False, False,  True,  True])
# True = 1.0, False = 0.0
# Output = 0.5
# 50% of array elements are greater than 2
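
The same boolean mask can give a count as well as a fraction. The sketch below is a minimal illustration; the names `mask`, `count`, and `fraction` are introduced here and do not come from the original notes.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
mask = a > 2               # boolean array, True where the condition holds

count = np.sum(mask)       # number of elements greater than 2
fraction = np.mean(mask)   # same value as count / a.size

print(count)      # 2
print(fraction)   # 0.5
```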

Row/column mean in 2-D arrays

In a two-dimensional array, you may want the mean of each row or of each column rather than the mean of the whole array. The axis parameter of np.mean() controls this. To find the mean of each row, set axis to 1 (one value per row). To find the mean of each column, set axis to 0 (one value per column).

ring_toss = np.array([[1, 0, 0], 
                      [0, 0, 1], 
                      [1, 0, 1]])

np.mean(ring_toss, axis = 1)
# Output: array([ 0.33333333,  0.33333333,  0.66666667])
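
For comparison, the column means of the same array can be sketched as follows. The variable names `col_means`, `row_means`, and `overall` are illustrative only.

```python
import numpy as np

ring_toss = np.array([[1, 0, 0],
                      [0, 0, 1],
                      [1, 0, 1]])

col_means = np.mean(ring_toss, axis=0)   # one mean per column
row_means = np.mean(ring_toss, axis=1)   # one mean per row
overall = np.mean(ring_toss)             # mean of all nine values (no axis given)

print(col_means)   # approximately [0.667, 0.0, 0.667]
print(row_means)   # approximately [0.333, 0.333, 0.667]
print(overall)     # approximately 0.444
```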

Sorting

The NumPy np.sort() function takes an array and returns a new array containing the same values in ascending order. The original array is left unchanged.

heights = np.array([49.7, 46.9, 62, 47.2, 47, 48.3, 48.7])
np.sort(heights)
# Output: array([46.9, 47. , 47.2, 48.3, 48.7, 49.7, 62. ])
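
To make the copy behavior concrete, the sketch below contrasts np.sort() with the array's own .sort() method, which sorts in place. The names `sorted_copy` and `descending` are illustrative, and reversing with `[::-1]` is just one common way to get descending order.

```python
import numpy as np

heights = np.array([49.7, 46.9, 62, 47.2, 47, 48.3, 48.7])

sorted_copy = np.sort(heights)   # returns a new, sorted array
print(heights[0])                # 49.7, the original order is untouched

descending = sorted_copy[::-1]   # reversed view gives descending order
print(descending[0])             # 62.0

heights.sort()                   # the method sorts in place and returns None
print(heights[0])                # 46.9
```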

Percentiles and Quartiles

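The NumPy np.percentile() function accepts an array and a percentile value between 0 and 100. It returns the value below which that percentage of the data falls. When the requested percentile lands between two elements, NumPy linearly interpolates between them by default, so the result is not always an element of the array.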

d = np.array([1, 2, 3, 4, 4, 4, 6, 6, 7,  8, 8])
np.percentile(d, 40)
# Output: 4.0
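
The default linear interpolation can be reproduced by hand. The helper `linear_percentile` below is a hypothetical function written for this illustration; it is not part of NumPy and assumes the input is already sorted.

```python
import numpy as np

d = np.array([1, 2, 3, 4, 4, 4, 6, 6, 7, 8, 8])

def linear_percentile(sorted_values, q):
    """Hypothetical helper: percentile by linear interpolation on sorted data."""
    pos = (q / 100) * (len(sorted_values) - 1)   # fractional index into the data
    lo = int(np.floor(pos))                      # neighbor at or below pos
    hi = int(np.ceil(pos))                       # neighbor at or above pos
    frac = pos - lo                              # how far pos lies past lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

# Each pair should agree: the manual result and NumPy's default result.
print(linear_percentile(d, 40), np.percentile(d, 40))
print(linear_percentile(d, 25), np.percentile(d, 25))
```

For the 40th percentile the fractional index is 4, which lands exactly on an element. For the 25th percentile it is 2.5, halfway between the elements 3 and 4.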

The np.percentile() function can also calculate the first, second, and third quartiles of an array. These three quartiles are simply the values at the 25th, 50th, and 75th percentiles, so you pass 25, 50, or 75 as the percentile argument, just as with any other percentile.

d = [1, 2, 3, 4, 4, 4, 6, 6, 7, 8, 8]
np.percentile(d, 25)
# Output: 3.5
np.percentile(d, 75)
# Output: 6.5
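
np.percentile() also accepts a sequence of percentiles, so all three quartiles can be computed in one call. The sketch below is a minimal illustration; the names `q1`, `q2`, `q3`, and `iqr` are introduced here, and the interquartile range is included only as a common follow-up calculation.

```python
import numpy as np

d = [1, 2, 3, 4, 4, 4, 6, 6, 7, 8, 8]

# Passing a list of percentiles returns one value per entry.
q1, q2, q3 = np.percentile(d, [25, 50, 75])
iqr = q3 - q1   # interquartile range: spread of the middle half of the data

print(q1, q2, q3)           # 3.5 4.0 6.5
print(iqr)                  # 3.0
print(q2 == np.median(d))   # True: the second quartile is the median
```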
