Tricks in numpy: solution

Some tricky operations that NumPy provides

In [1]:
import numpy as np

Array concatenation

Here we have the separate R, G and B channels of an image. Combine them into a single array, once with np.concatenate and once with np.stack.

In [2]:
h, w = 224, 224
R = np.random.randint(0, 256, size=(h, w))
G = np.random.randint(0, 256, size=(h, w))
B = np.random.randint(0, 256, size=(h, w))
In [3]:
# your code here #
img1 = np.concatenate(
        [R[None, ...],
         G[None, ...],
         B[None, ...]], axis=0)
assert img1.shape == (3, h, w)

img2 = np.stack([R, G, B], axis=0)
assert img2.shape == (3, h, w)
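
As a minimal sketch of how the two calls relate: np.stack inserts a new axis and joins along it, while np.concatenate joins along an axis that already exists. The small 4 x 5 size and the names chw, hwc and tall are illustrative assumptions, not part of the exercise.

```python
import numpy as np

h, w = 4, 5  # small illustrative size (assumption; the exercise uses 224 x 224)
rng = np.random.default_rng(0)
R, G, B = (rng.integers(0, 256, size=(h, w)) for _ in range(3))

# np.stack adds a new axis and joins along it.
chw = np.stack([R, G, B], axis=0)    # channels-first, shape (3, h, w)
hwc = np.stack([R, G, B], axis=-1)   # channels-last, shape (h, w, 3)

# np.concatenate gives the same result only if the new axis is added by hand.
chw_cat = np.concatenate([R[None], G[None], B[None]], axis=0)
hwc_cat = np.concatenate([R[..., None], G[..., None], B[..., None]], axis=-1)

# Concatenating the raw 2D channels joins them along an existing axis instead.
tall = np.concatenate([R, G, B], axis=0)  # shape (3 * h, w)
```

Passing axis=-1 is one way to get the channels-last layout that many image libraries expect.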

Array creation

p is a 1D array holding the predicted probability that each sample is a dog (y=1). Suppose every sample really is a dog, and we need a ground-truth array for computing the loss (don't worry :) you don't need to implement the loss function here). Assume you can't use any shape information explicitly.

In [4]:
num_samples = 100
p = np.random.rand(num_samples)
In [5]:
# your code here
y_gt = np.ones_like(p)
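
A short sketch of what np.ones_like carries over from its argument: the result takes both the shape and the dtype of p unless dtype is overridden. The names y_int and y_full are illustrative assumptions.

```python
import numpy as np

p = np.random.rand(100)

y_gt = np.ones_like(p)               # same shape and dtype as p
y_int = np.ones_like(p, dtype=int)   # same shape, dtype overridden
y_full = np.full_like(p, 1.0)        # same idea for an arbitrary fill value
```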

Data stats

salary is a 2D array holding the salaries of six employees over 100 months. Suppose the manager asks you to analyze the employees by answering the following questions (np.argmax, np.max, np.argwhere, np.median and np.percentile could be helpful :) )

In [6]:
salary = np.random.randint(5, 150, size=(6, 100))
  1. for the 2nd employee, the highest salary earned in a single month, and in which month
In [7]:
# your code here
print(salary[1].max(), salary[1].argmax())
147 80
  2. find the highest salary earned, by whom and when (there may be several employees and several months)
In [8]:
# your code here
print(salary.max())
np.argwhere(salary == salary.max())
149
Out[8]:
array([[ 0, 35],
       [ 3, 52]])
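
To read the np.argwhere result above: each row is one (employee, month) pair, both zero-based. The sketch below uses a small hand-made table (an assumption, chosen so the maximum appears twice) and also shows np.nonzero, which returns the same indices as separate arrays.

```python
import numpy as np

# Small hand-made table (assumption) in which the maximum appears twice.
salary = np.array([[10, 40, 20],
                   [40, 15, 30]])

top = salary.max()
hits = np.argwhere(salary == top)              # one (employee, month) row per match
employees, months = np.nonzero(salary == top)  # same indices as two separate arrays

for e, m in hits:
    # Convert the zero-based indices to one-based labels for display.
    print(f"employee {e + 1} earned {top} in month {m + 1}")
```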
  3. per employee, the median salary they earned
In [9]:
# your code here
np.median(salary, axis=1)
Out[9]:
array([77.5, 69. , 96.5, 77.5, 64. , 70.5])
  4. per employee, the 75th percentile of their salary (roughly their 25th-highest monthly salary)
In [10]:
# your code here
np.percentile(salary, 75, axis=1)
Out[10]:
array([111.75, 109.25, 122.75, 109.  ,  98.25, 114.25])
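
Note that the 75th percentile is only an approximation of the 25th-highest value: with 100 months and the default linear interpolation, it falls between the 26th-highest and the 25th-highest salary. A hedged sketch of the exact alternative, assuming "25th max" means the 25th-largest monthly value (kth_largest and p75 are illustrative names):

```python
import numpy as np

rng = np.random.default_rng(0)
salary = rng.integers(5, 150, size=(6, 100))

# Exact 25th-largest value per employee: sort ascending, take the 25th from the end.
kth_largest = np.sort(salary, axis=1)[:, -25]

# 75th percentile with the default linear interpolation.
p75 = np.percentile(salary, 75, axis=1)
```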

Indexing (including boolean) and slicing

salary holds the salaries of six employees over one year. Implement code for the following parts.

In [11]:
salary = np.random.randint(5, 100, size=(6, 12))
  1. show the salary of all employees in the second month
In [12]:
# your code here
salary[:, 1]
Out[12]:
array([60, 57, 79, 13, 61, 82])
  2. show the salaries of the 1st and 3rd employees in even months
In [13]:
# your code here
salary[[0, 2], 1::2]
Out[13]:
array([[60, 12, 83, 78, 45, 63],
       [79, 92, 68, 37, 81, 77]])
  3. show the salary of the 1st employee in the 5th month, the 2nd in the 11th, and the 3rd in the 7th
In [14]:
# your code here
salary[np.arange(0, 3), [4, 10, 6]]
Out[14]:
array([64, 89, 78])
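
Two index arrays are paired element by element, which is why the result above has three values rather than a 3 x 3 block. As a minimal sketch of the difference, using an illustrative table whose values encode their own position (an assumption) and np.ix_ for the all-combinations case:

```python
import numpy as np

# Illustrative table (assumption): value = 12 * row + column.
salary = np.arange(6 * 12).reshape(6, 12)

rows = np.arange(3)
cols = [4, 10, 6]

paired = salary[rows, cols]         # elements (0, 4), (1, 10), (2, 6): shape (3,)
outer = salary[np.ix_(rows, cols)]  # every row/column combination: shape (3, 3)
```

The paired result is the diagonal of the outer one.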
  4. one day the boss comes in and seems angry :) telling you that the most anyone can earn is 90, and asks you to cap every salary above that at 90.
In [15]:
# your code here
salary[salary > 90] = 90
# The random data may contain no value of 90 or more, so check the cap rather than equality.
assert salary.max() <= 90
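
The same cap can be written without boolean assignment. A hedged sketch of two alternatives that return a new array and leave the input untouched (the names capped_mask, capped_min and capped_clip are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
salary = rng.integers(5, 100, size=(6, 12))

capped_mask = salary.copy()
capped_mask[capped_mask > 90] = 90        # boolean assignment, done on a copy

capped_min = np.minimum(salary, 90)       # element-wise minimum with the cap
capped_clip = np.clip(salary, None, 90)   # clip with an upper bound only
```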

Other utilities

  1. create an array of 1000 evenly spaced samples between 0 and 1 using np.linspace and check that the difference between every two consecutive elements is close to 0.001
In [16]:
# your code here
arr = np.linspace(0, 1, 1000)
assert np.allclose(np.diff(arr), 0.001, atol=1e-4)
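
The tolerance in the check above matters because 1000 points that include both endpoints leave 999 gaps, so the actual step is 1/999 rather than 0.001. A minimal sketch using the retstep and endpoint parameters of np.linspace (arr_open is an illustrative name):

```python
import numpy as np

# Both endpoints included: 999 gaps, so the step is 1/999.
arr, step = np.linspace(0, 1, 1000, retstep=True)

# Endpoint excluded: 1000 gaps of 1/1000 each, and the last sample is below 1.
arr_open = np.linspace(0, 1, 1000, endpoint=False)
```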
  2. increase the value of arr by one at each position given in indices (a repeated index should count every time it appears)
In [17]:
arr = np.zeros(5)
indices = np.random.randint(0, 5, size=(1000))
In [18]:
# your code here
np.add.at(arr, indices, 1)
assert np.sum(arr) == 1000
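
np.add.at is needed here because plain fancy-index assignment applies each distinct index only once, so repeats are lost. A small sketch with a hand-picked index array (an assumption) that makes the difference visible, plus np.bincount as an alternative when only counts are needed:

```python
import numpy as np

indices = np.array([0, 0, 0, 2, 2, 4])  # hand-picked example with repeats

buffered = np.zeros(5)
buffered[indices] += 1                  # each distinct index is incremented once

unbuffered = np.zeros(5)
np.add.at(unbuffered, indices, 1)       # every occurrence is accumulated

counts = np.bincount(indices, minlength=5)  # occurrence counts per index
```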
  3. for arr, select the elements at indices 4 and 5 along the last dimension, once with explicit slices for every axis and once with an ellipsis, and check that the two results are equal using np.all
In [19]:
arr = np.random.rand(10, 10, 10, 10, 10, 8)
In [20]:
# your code here
res1 = arr[..., [4, 5]]
res2 = arr[:, :, :, :, :, [4, 5]]
assert np.all(res1 == res2)
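
The ellipsis stands for however many full slices are needed, so it can fill leading, trailing or middle axes. A brief sketch on a smaller array (the shape and the names last, first and both are illustrative assumptions):

```python
import numpy as np

arr = np.random.rand(2, 3, 4, 8)

last = arr[..., [4, 5]]   # ... fills all leading axes: shape (2, 3, 4, 2)
first = arr[0, ...]       # ... fills all trailing axes: shape (3, 4, 8)
both = arr[0, ..., -1]    # ... fills the middle axes: shape (3, 4)
```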