Code
import numpy as np
# declare a vector using a list as the argument
= np.array([1, 2.0, 3, 4])
v v
array([1., 2., 3., 4.])
numpy
NumPy
is the fundamental package for scientific computing with Python. It contains among other things:
Besides its obvious scientific uses, NumPy
can also be used as an efficient multi-dimensional container for general data. Arbitrary data-types can be defined. This allows NumPy
to seamlessly and speedily integrate with a wide variety of databases.
Library documentation: http://numpy.org/
numpy.array
objectimport numpy as np
# declare a vector using a list as the argument
= np.array([1, 2.0, 3, 4])
v v
array([1., 2., 3., 4.])
list([1, 2.0, 3, 4])
[1, 2.0, 3, 4]
type(v)
numpy.ndarray
v.shape
(4,)
v.ndim
1
is float v.dtype
False
v.dtype
dtype('float64')
is int np.uint8
False
Use copilot explain to understand the chunks:
The np.uint8 is a data type in NumPy, representing an unsigned 8-bit integer, which can store values from 0 to 255. The int type is the built-in integer type in Python, which can represent any integer value without a fixed size limit.
2**120, 2**40], dtype=np.int64) np.array([
--------------------------------------------------------------------------- OverflowError Traceback (most recent call last) Cell In[9], line 1 ----> 1 np.array([2**120, 2**40], dtype=np.int64) OverflowError: Python int too large to convert to C long
is int np.uint16
False
is int np.uint32
False
= np.array([1.3, 2, 3, 4], dtype=np.int64)
w w
array([1, 2, 3, 4])
w.dtype
dtype('int64')
= np.arange(100) a
type(a)
numpy.ndarray
range(100)) np.array(
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])
a
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])
a.dtype
dtype('int64')
-3 * a ** 2
array([ 0, -3, -12, -27, -48, -75, -108, -147,
-192, -243, -300, -363, -432, -507, -588, -675,
-768, -867, -972, -1083, -1200, -1323, -1452, -1587,
-1728, -1875, -2028, -2187, -2352, -2523, -2700, -2883,
-3072, -3267, -3468, -3675, -3888, -4107, -4332, -4563,
-4800, -5043, -5292, -5547, -5808, -6075, -6348, -6627,
-6912, -7203, -7500, -7803, -8112, -8427, -8748, -9075,
-9408, -9747, -10092, -10443, -10800, -11163, -11532, -11907,
-12288, -12675, -13068, -13467, -13872, -14283, -14700, -15123,
-15552, -15987, -16428, -16875, -17328, -17787, -18252, -18723,
-19200, -19683, -20172, -20667, -21168, -21675, -22188, -22707,
-23232, -23763, -24300, -24843, -25392, -25947, -26508, -27075,
-27648, -28227, -28812, -29403])
42] = 13 a[
42] = 1025 a[
np.info(np.int16)
int16()
Signed integer type, compatible with C ``short``.
:Character code: ``'h'``
:Canonical name: `numpy.short`
:Alias on this platform (Linux x86_64): `numpy.int16`: 16-bit signed integer (``-32_768`` to ``32_767``).
Methods:
all -- Scalar method identical to the corresponding array attribute.
any -- Scalar method identical to the corresponding array attribute.
argmax -- Scalar method identical to the corresponding array attribute.
argmin -- Scalar method identical to the corresponding array attribute.
argsort -- Scalar method identical to the corresponding array attribute.
astype -- Scalar method identical to the corresponding array attribute.
bit_count -- int16.bit_count() -> int
byteswap -- Scalar method identical to the corresponding array attribute.
choose -- Scalar method identical to the corresponding array attribute.
clip -- Scalar method identical to the corresponding array attribute.
compress -- Scalar method identical to the corresponding array attribute.
conj -- None
conjugate -- Scalar method identical to the corresponding array attribute.
copy -- Scalar method identical to the corresponding array attribute.
cumprod -- Scalar method identical to the corresponding array attribute.
cumsum -- Scalar method identical to the corresponding array attribute.
diagonal -- Scalar method identical to the corresponding array attribute.
dump -- Scalar method identical to the corresponding array attribute.
dumps -- Scalar method identical to the corresponding array attribute.
fill -- Scalar method identical to the corresponding array attribute.
flatten -- Scalar method identical to the corresponding array attribute.
getfield -- Scalar method identical to the corresponding array attribute.
is_integer -- integer.is_integer() -> bool
item -- Scalar method identical to the corresponding array attribute.
max -- Scalar method identical to the corresponding array attribute.
mean -- Scalar method identical to the corresponding array attribute.
min -- Scalar method identical to the corresponding array attribute.
nonzero -- Scalar method identical to the corresponding array attribute.
prod -- Scalar method identical to the corresponding array attribute.
put -- Scalar method identical to the corresponding array attribute.
ravel -- Scalar method identical to the corresponding array attribute.
repeat -- Scalar method identical to the corresponding array attribute.
reshape -- Scalar method identical to the corresponding array attribute.
resize -- Scalar method identical to the corresponding array attribute.
round -- Scalar method identical to the corresponding array attribute.
searchsorted -- Scalar method identical to the corresponding array attribute.
setfield -- Scalar method identical to the corresponding array attribute.
setflags -- Scalar method identical to the corresponding array attribute.
sort -- Scalar method identical to the corresponding array attribute.
squeeze -- Scalar method identical to the corresponding array attribute.
std -- Scalar method identical to the corresponding array attribute.
sum -- Scalar method identical to the corresponding array attribute.
swapaxes -- Scalar method identical to the corresponding array attribute.
take -- Scalar method identical to the corresponding array attribute.
to_device -- None
tobytes -- None
tofile -- Scalar method identical to the corresponding array attribute.
tolist -- Scalar method identical to the corresponding array attribute.
tostring -- Scalar method identical to the corresponding array attribute.
trace -- Scalar method identical to the corresponding array attribute.
transpose -- Scalar method identical to the corresponding array attribute.
var -- Scalar method identical to the corresponding array attribute.
view -- Scalar method identical to the corresponding array attribute.
np.int16
numpy.int16
dict(enumerate(a))
{0: np.int64(0),
1: np.int64(1),
2: np.int64(2),
3: np.int64(3),
4: np.int64(4),
5: np.int64(5),
6: np.int64(6),
7: np.int64(7),
8: np.int64(8),
9: np.int64(9),
10: np.int64(10),
11: np.int64(11),
12: np.int64(12),
13: np.int64(13),
14: np.int64(14),
15: np.int64(15),
16: np.int64(16),
17: np.int64(17),
18: np.int64(18),
19: np.int64(19),
20: np.int64(20),
21: np.int64(21),
22: np.int64(22),
23: np.int64(23),
24: np.int64(24),
25: np.int64(25),
26: np.int64(26),
27: np.int64(27),
28: np.int64(28),
29: np.int64(29),
30: np.int64(30),
31: np.int64(31),
32: np.int64(32),
33: np.int64(33),
34: np.int64(34),
35: np.int64(35),
36: np.int64(36),
37: np.int64(37),
38: np.int64(38),
39: np.int64(39),
40: np.int64(40),
41: np.int64(41),
42: np.int64(1025),
43: np.int64(43),
44: np.int64(44),
45: np.int64(45),
46: np.int64(46),
47: np.int64(47),
48: np.int64(48),
49: np.int64(49),
50: np.int64(50),
51: np.int64(51),
52: np.int64(52),
53: np.int64(53),
54: np.int64(54),
55: np.int64(55),
56: np.int64(56),
57: np.int64(57),
58: np.int64(58),
59: np.int64(59),
60: np.int64(60),
61: np.int64(61),
62: np.int64(62),
63: np.int64(63),
64: np.int64(64),
65: np.int64(65),
66: np.int64(66),
67: np.int64(67),
68: np.int64(68),
69: np.int64(69),
70: np.int64(70),
71: np.int64(71),
72: np.int64(72),
73: np.int64(73),
74: np.int64(74),
75: np.int64(75),
76: np.int64(76),
77: np.int64(77),
78: np.int64(78),
79: np.int64(79),
80: np.int64(80),
81: np.int64(81),
82: np.int64(82),
83: np.int64(83),
84: np.int64(84),
85: np.int64(85),
86: np.int64(86),
87: np.int64(87),
88: np.int64(88),
89: np.int64(89),
90: np.int64(90),
91: np.int64(91),
92: np.int64(92),
93: np.int64(93),
94: np.int64(94),
95: np.int64(95),
96: np.int64(96),
97: np.int64(97),
98: np.int64(98),
99: np.int64(99)}
+ 1 a
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 1026, 44,
45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99,
100])
= a + 1
b b
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 1026, 44,
45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99,
100])
is b a
False
= id(a)
f += 1
a id(a) f,
(129865480838416, 129865480838416)
a
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 1026, 44,
45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99,
100])
b
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 1026, 44,
45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99,
100])
Beware of the dimensions: a 1D array is not the same as a 2D array with 1 column
= np.array([1, 2, 3])
a1 print(a1, a1.shape, a1.ndim)
[1 2 3] (3,) 1
= np.array([1, 2, 3])
a2 print(a2, a2.shape, a2.ndim)
[1 2 3] (3,) 1
List the attributes and methods of class numpy.ndarray
. You may use function dir()
and filter the result using methods for objects of class string
.
# inner product a2.dot(a1)
np.int64(14)
(
np.array([a2])# column vector
.transpose()
.dot(np.array([a1]))# column vector multiplied by row vector )
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
(
np.array([a2])#.shape
.transpose() )
array([[1],
[2],
[3]])
(3,1) # all explicit
a2.reshape(1, 3))
.dot(a1.reshape( )
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
# Declare a 2D array using a nested list as the constructor argument
= np.array([[1,2],
M 3,4],
[3.14, -9.17]])
[ M
array([[ 1. , 2. ],
[ 3. , 4. ],
[ 3.14, -9.17]])
M.shape, M.size
((3, 2), 6)
M.ravel(), M.ndim, M.ravel().shape
(array([ 1. , 2. , 3. , 4. , 3.14, -9.17]), 2, (6,))
# arguments: start, stop, step
= (
x 12)
np.arange(4, 3)
.reshape(
) x
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
= np.arange(3).reshape(3,1)
y
y
array([[0],
[1],
[2]])
@ y, x.dot(y) x
(array([[ 5],
[14],
[23],
[32]]),
array([[ 5],
[14],
[23],
[32]]))
0, 10, 51) # meaning of the 3 positional parameters ? np.linspace(
array([ 0. , 0.2, 0.4, 0.6, 0.8, 1. , 1.2, 1.4, 1.6, 1.8, 2. ,
2.2, 2.4, 2.6, 2.8, 3. , 3.2, 3.4, 3.6, 3.8, 4. , 4.2,
4.4, 4.6, 4.8, 5. , 5.2, 5.4, 5.6, 5.8, 6. , 6.2, 6.4,
6.6, 6.8, 7. , 7.2, 7.4, 7.6, 7.8, 8. , 8.2, 8.4, 8.6,
8.8, 9. , 9.2, 9.4, 9.6, 9.8, 10. ])
0, 10, 11, base=np.e), np.e**(np.arange(11)) np.logspace(
(array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
2.98095799e+03, 8.10308393e+03, 2.20264658e+04]),
array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
2.98095799e+03, 8.10308393e+03, 2.20264658e+04]))
import matplotlib.pyplot as plt
# Random standard Gaussian numbers
= plt.figure(figsize=(8, 4))
fig = np.random.randn(1000)
wn = wn.cumsum()
bm
=3) plt.plot(bm, lw
10)) np.diag(np.arange(
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 3, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 4, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 5, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 6, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 7, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 8, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 9]])
= np.zeros((10, 10), dtype=np.float32)
zozo zozo
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)
zozo.shape
(10, 10)
print(M)
[[ 1. 2. ]
[ 3. 4. ]
[ 3.14 -9.17]]
1, 1] M[
np.float64(4.0)
# assign new value
0, 0] = 7
M[0] = 42
M[:, M
array([[42. , 2. ],
[42. , 4. ],
[42. , -9.17]])
M
array([[42. , 2. ],
[42. , 4. ],
[42. , -9.17]])
# Warning: the next m is a **view** on M.
# One again, no copies unless you ask for one!
= M[0, :]
m m
array([42., 2.])
= 3.14
m[:] M
array([[ 3.14, 3.14],
[42. , 4. ],
[42. , -9.17]])
= 7
m[:] M
array([[ 7. , 7. ],
[42. , 4. ],
[42. , -9.17]])
# slicing works just like with anything else (lists, etc.)
= np.array([1, 2, 3, 4, 5])
A print(A)
print(A[::-1])
print(A[::2])
print(A[:-1:2])
[1 2 3 4 5]
[5 4 3 2 1]
[1 3 5]
[1 3]
+ m * 10 for n in range(5)] for m in range(5)] [[n
[[0, 1, 2, 3, 4],
[10, 11, 12, 13, 14],
[20, 21, 22, 23, 24],
[30, 31, 32, 33, 34],
[40, 41, 42, 43, 44]]
= np.array([[n + m * 10 for n in range(5)] for m in range(5)])
A A
array([[ 0, 1, 2, 3, 4],
[10, 11, 12, 13, 14],
[20, 21, 22, 23, 24],
[30, 31, 32, 33, 34],
[40, 41, 42, 43, 44]])
print(A[1:4])
[[10 11 12 13 14]
[20 21 22 23 24]
[30 31 32 33 34]]
= A[:, 1:4] m
1, 1] = 123 m[
A
array([[ 0, 1, 2, 3, 4],
[ 10, 11, 123, 13, 14],
[ 20, 21, 22, 23, 24],
[ 30, 31, 32, 33, 34],
[ 40, 41, 42, 43, 44]])
1] A[
array([ 10, 11, 123, 13, 14])
1] A[:,
array([ 1, 11, 21, 31, 41])
-1] A[:, ::
array([[ 4, 3, 2, 1, 0],
[ 14, 13, 123, 11, 10],
[ 24, 23, 22, 21, 20],
[ 34, 33, 32, 31, 30],
[ 44, 43, 42, 41, 40]])
print(A)
[[ 0 1 2 3 4]
[ 10 11 123 13 14]
[ 20 21 22 23 24]
[ 30 31 32 33 34]
[ 40 41 42 43 44]]
= np.array([1, 2, 4])
row_indices print(A[row_indices])
[[ 10 11 123 13 14]
[ 20 21 22 23 24]
[ 40 41 42 43 44]]
A[:, row_indices]
array([[ 1, 2, 4],
[ 11, 123, 14],
[ 21, 22, 24],
[ 31, 32, 34],
[ 41, 42, 44]])
Another way is through masking with an array of bool
s
# index masking
= np.arange(5)
B = np.array([True, False, True, False, False])
row_mask print(B)
print(B[row_mask])
[0 1 2 3 4]
[0 2]
A, A[row_mask] , A[:,row_mask]
(array([[ 0, 1, 2, 3, 4],
[ 10, 11, 123, 13, 14],
[ 20, 21, 22, 23, 24],
[ 30, 31, 32, 33, 34],
[ 40, 41, 42, 43, 44]]),
array([[ 0, 1, 2, 3, 4],
[20, 21, 22, 23, 24]]),
array([[ 0, 2],
[ 10, 123],
[ 20, 22],
[ 30, 32],
[ 40, 42]]))
Don’t forget that python
does not make copies unless told to do so (same as with any mutable type)
If you are not careful enough, this typically leads to a lot of errors and to being fired !!
= x = np.arange(6)
y 2] = 123
x[ y
array([ 0, 1, 123, 3, 4, 5])
is y x
True
# A real copy
= x.copy()
y is y x
False
# Or equivalently (but the one above is better...)
= np.copy(x) y
0] = -12
x[print(x, y, x is y)
[-12 1 123 3 4 5] [ 0 1 123 3 4 5] False
To put values of x in y (copy values into an existing array) use
= np.random.randn(10)
x id(x) x,
(array([-1.15645361, -0.46676704, 1.2741621 , 0.23024147, 1.02990933,
1.27859827, -0.14988412, -2.40625665, -0.02091037, -1.18902373]),
129865467957328)
2.78) # in place.
x.fill(id(x) x,
(array([2.78, 2.78, 2.78, 2.78, 2.78, 2.78, 2.78, 2.78, 2.78, 2.78]),
129865467957328)
= 3.14 # x.fill(3.14) can. be chained ...
x[:] id(x) x,
(array([3.14, 3.14, 3.14, 3.14, 3.14, 3.14, 3.14, 3.14, 3.14, 3.14]),
129865467957328)
= np.random.randn(x.shape[0])
x[:] id(x) x,
(array([-1.4397113 , -0.31737624, -0.42271118, 0.31185467, 1.34160539,
0.77372732, 1.22223205, 1.22049248, -0.67462385, -0.86580839]),
129865467957328)
= np.empty(x.shape) # how does empty() work ?
y id(y) y,
(array([1.4397113 , 0.31737624, 0.42271118, 0.31185467, 1.34160539,
0.77372732, 1.22223205, 1.22049248, 0.67462385, 0.86580839]),
129865467955504)
= x
y id(y), id(x), y is x y,
(array([-1.4397113 , -0.31737624, -0.42271118, 0.31185467, 1.34160539,
0.77372732, 1.22223205, 1.22049248, -0.67462385, -0.86580839]),
129865467957328,
129865467957328,
True)
In the next line you copy the values of x
into an existing array y
(of same size…)
= np.zeros(x.shape)
y = x
y[:] is x, np.all(y==x) y, y
(array([-1.4397113 , -0.31737624, -0.42271118, 0.31185467, 1.34160539,
0.77372732, 1.22223205, 1.22049248, -0.67462385, -0.86580839]),
False,
np.True_)
While in the next line, you are aliasing, you are giving a new name y
to the object named x
(you should never, ever write something like this)
= x
y is x y
True
A numpy
array can contain other things than numeric types
= np.array(['Labore', 'neque', 'ipsum', 'ut', 'non', 'quiquia', 'dolore.'])
arr arr, arr.shape, arr.dtype
(array(['Labore', 'neque', 'ipsum', 'ut', 'non', 'quiquia', 'dolore.'],
dtype='<U7'),
(7,),
dtype('<U7'))
# arr.sum()
"_".join(arr)
'Labore_neque_ipsum_ut_non_quiquia_dolore.'
arr.dtype
dtype('<U7')
numpy
So far, we have only used array
or ndarray
objects
The is another type: the matrix
type
In words: don’t use it (IMhO) and stick with arrays
# Matrix VS array objects in numpy
= np.matrix(np.arange(3))
m1 = np.matrix(np.arange(3))
m2 m1, m2
(matrix([[0, 1, 2]]), matrix([[0, 1, 2]]))
@ m2, m1.shape, m1.transpose() * m2 m1.transpose()
(matrix([[0, 0, 0],
[0, 1, 2],
[0, 2, 4]]),
(1, 3),
matrix([[0, 0, 0],
[0, 1, 2],
[0, 2, 4]]))
= np.arange(3)
a1 = np.arange(3)
a2 a1, a2
(array([0, 1, 2]), array([0, 1, 2]))
* m2.T, m1.dot(m2.T) m1
(matrix([[5]]), matrix([[5]]))
* a2 a1
array([0, 1, 4])
a1.dot(a2)
np.int64(5)
np.outer(a1, a2)
array([[0, 0, 0],
[0, 1, 2],
[0, 2, 4]])
from scipy.sparse import csc_matrix, csr_matrix, coo_matrix
= np.full(fill_value=1/4, shape=(4,))
probs probs
array([0.25, 0.25, 0.25, 0.25])
= np.random.multinomial(n=2, pvals=probs, size=4) # check you understand what is going on
X X
array([[0, 1, 0, 1],
[0, 1, 1, 0],
[0, 1, 0, 1],
[0, 0, 1, 1]])
probs
array([0.25, 0.25, 0.25, 0.25])
= coo_matrix(X) ## coordinate format X_coo
print(X_coo)
X_coo
<COOrdinate sparse matrix of dtype 'int64'
with 8 stored elements and shape (4, 4)>
Coords Values
(0, 1) 1
(0, 3) 1
(1, 1) 1
(1, 2) 1
(2, 1) 1
(2, 3) 1
(3, 2) 1
(3, 3) 1
<COOrdinate sparse matrix of dtype 'int64'
with 8 stored elements and shape (4, 4)>
# number pf non-zero coordinates X_coo.nnz
8
print(X, end='\n----\n')
print(X_coo.data, end='\n----\n')
print(X_coo.row, end='\n----\n')
print(X_coo.col, end='\n----\n')
[[0 1 0 1]
[0 1 1 0]
[0 1 0 1]
[0 0 1 1]]
----
[1 1 1 1 1 1 1 1]
----
[0 0 1 1 2 2 3 3]
----
[1 3 1 2 1 3 2 3]
----
There is also
csr_matrix
: sparse rows formatcsc_matrix
: sparse columns formatSparse rows is often used for machine learning: sparse features vectors
But sparse column format useful as well (e.g. coordinate gradient descent)
= np.random.randn(5, 5)
X X
array([[ 0.45802975, -1.0672361 , 0.53982095, -0.67564452, -0.78428308],
[ 1.35974408, -1.37158669, 0.50998743, -0.39321359, -1.16649141],
[ 0.5058729 , -2.41902123, 0.47720719, -0.01959006, 1.56713431],
[-0.94532274, 0.50039371, -0.59383875, 1.97355345, 0.74729636],
[ 1.68264783, 0.95700477, -0.39774282, 0.80307844, 1.04727189]])
# All number displayed by numpy (in the current kernel) are with 3 decimals max
=3)
np.set_printoptions(precisionprint(X)
=8) np.set_printoptions(precision
[[ 0.458 -1.067 0.54 -0.676 -0.784]
[ 1.36 -1.372 0.51 -0.393 -1.166]
[ 0.506 -2.419 0.477 -0.02 1.567]
[-0.945 0.5 -0.594 1.974 0.747]
[ 1.683 0.957 -0.398 0.803 1.047]]
numpy
arrays can have any number of dimension (hence the name ndarray
)
= np.arange(18).reshape(3, 2, 3)
X X
array([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17]]])
X.shape
(3, 2, 3)
X.ndim
3
Visit https://numpy.org/doc/stable/reference/arrays.ndarray.html#arrays-ndarray
= np.arange(42).reshape(7, 6)
A A
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41]])
sum(), 42 * 41 //2 A.
(np.int64(861), 861)
3].mean(), np.mean (3 + np.arange(0, 42, 6)) A[:,
(np.float64(21.0), np.float64(21.0))
=0) A.mean(axis
array([18., 19., 20., 21., 22., 23.])
=1) A.mean(axis
array([ 2.5, 8.5, 14.5, 20.5, 26.5, 32.5, 38.5])
3].std(), A[:,3].var() A[:,
(np.float64(12.0), np.float64(144.0))
3].min(), A[:,3].max() A[:,
(np.int64(3), np.int64(39))
=0) A.cumsum(axis
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 8, 10, 12, 14, 16],
[ 18, 21, 24, 27, 30, 33],
[ 36, 40, 44, 48, 52, 56],
[ 60, 65, 70, 75, 80, 85],
[ 90, 96, 102, 108, 114, 120],
[126, 133, 140, 147, 154, 161]])
A
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41]])
# sum of diagonal
A.trace()
np.int64(105)
= np.arange(30).reshape(6, 5)
A = np.arange(0, 5)
v1 = np.arange(5, 10) v2
A
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29]])
v1, v2
(array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9]))
* v2 v1
array([ 0, 6, 14, 24, 36])
sum(v1* v2) v1.dot(v2), np.
(np.int64(80), np.int64(80))
5,1) @ v2.reshape(1,5) v1.reshape(
array([[ 0, 0, 0, 0, 0],
[ 5, 6, 7, 8, 9],
[10, 12, 14, 16, 18],
[15, 18, 21, 24, 27],
[20, 24, 28, 32, 36]])
# Inner product between vectors
print(v1.dot(v2))
# You can use also (but first solution is better)
print(np.dot(v1, v2))
80
80
A, v1
(array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29]]),
array([0, 1, 2, 3, 4]))
A.shape, v1.shape
((6, 5), (5,))
# Matrix-vector inner product
A.dot(v1)
array([ 30, 80, 130, 180, 230, 280])
# Transpose
A.T
array([[ 0, 5, 10, 15, 20, 25],
[ 1, 6, 11, 16, 21, 26],
[ 2, 7, 12, 17, 22, 27],
[ 3, 8, 13, 18, 23, 28],
[ 4, 9, 14, 19, 24, 29]])
print(v1)
# Inline operations (same for *=, /=, -=)
+= 2 v1
[0 1 2 3 4]
= np.array([[42,2,3], [4,5,6], [7,8,9]])
A = np.array([1,2,3])
b print(A, b, sep=2 * '\n')
[[42 2 3]
[ 4 5 6]
[ 7 8 9]]
[1 2 3]
# solve a system of linear equations
= np.linalg.solve(A, b)
x x
array([2.18366847e-18, 2.31698718e-16, 3.33333333e-01])
A.dot(x)
array([1., 2., 3.])
= np.random.rand(3,3)
A = np.random.rand(3,3)
B
= np.linalg.eig(A)
evals, evecs evals
array([ 1.83113271+0.j , -0.00953135+0.06414649j,
-0.00953135-0.06414649j])
evecs
array([[ 0.58526062+0.j , 0.13973059-0.13199103j,
0.13973059+0.13199103j],
[ 0.72312711+0.j , -0.49869018+0.08244374j,
-0.49869018-0.08244374j],
[ 0.36682037+0.j , 0.84116875+0.j ,
0.84116875-0.j ]])
Decomposes any matrix \(A \in \mathbb R^{m \times n}\) as follows: \[ A = U \times S \times V^\top \] where - \(U\) and \(V\) are orthonormal matrices (meaning that \(U^\top \times U = I\) and \(V^\top \times V = I\)) - \(S\) is a diagonal matrix that contains the singular values in non-increasing order
print(A)
= np.linalg.svd(A) U, S, V
[[0.46889987 0.8746796 0.44914819]
[0.8365175 0.94170622 0.41869991]
[0.02935054 0.7014727 0.40146393]]
U.dot(np.diag(S)).dot(V)
array([[0.46889987, 0.8746796 , 0.44914819],
[0.8365175 , 0.94170622, 0.41869991],
[0.02935054, 0.7014727 , 0.40146393]])
- U @ np.diag(S) @ V A
array([[-3.33066907e-16, 4.44089210e-16, -2.22044605e-16],
[ 0.00000000e+00, -2.22044605e-16, -1.66533454e-16],
[-1.21430643e-16, 4.44089210e-16, 1.11022302e-16]])
# U and V are indeed orthonormal
=2)
np.set_printoptions(precisionprint(U.T.dot(U), V.T.dot(V), sep=2 * '\n')
=8) np.set_printoptions(precision
[[1.00e+00 1.22e-16 1.98e-16]
[1.22e-16 1.00e+00 1.84e-17]
[1.98e-16 1.84e-17 1.00e+00]]
[[ 1.00e+00 5.64e-18 -2.88e-17]
[ 5.64e-18 1.00e+00 -4.99e-17]
[-2.88e-17 -4.99e-17 1.00e+00]]
scipy.misc.face()
compute_approx(X, r)
!pip3 install pooch
Requirement already satisfied: pooch in /home/boucheron/Documents/IFEBY310/.venv/lib/python3.12/site-packages (1.8.2)
Requirement already satisfied: platformdirs>=2.5.0 in /home/boucheron/Documents/IFEBY310/.venv/lib/python3.12/site-packages (from pooch) (4.3.6)
Requirement already satisfied: packaging>=20.0 in /home/boucheron/Documents/IFEBY310/.venv/lib/python3.12/site-packages (from pooch) (24.2)
Requirement already satisfied: requests>=2.19.0 in /home/boucheron/Documents/IFEBY310/.venv/lib/python3.12/site-packages (from pooch) (2.32.3)
Requirement already satisfied: charset-normalizer<4,>=2 in /home/boucheron/Documents/IFEBY310/.venv/lib/python3.12/site-packages (from requests>=2.19.0->pooch) (3.4.1)
Requirement already satisfied: idna<4,>=2.5 in /home/boucheron/Documents/IFEBY310/.venv/lib/python3.12/site-packages (from requests>=2.19.0->pooch) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /home/boucheron/Documents/IFEBY310/.venv/lib/python3.12/site-packages (from requests>=2.19.0->pooch) (2.3.0)
Requirement already satisfied: certifi>=2017.4.17 in /home/boucheron/Documents/IFEBY310/.venv/lib/python3.12/site-packages (from requests>=2.19.0->pooch) (2024.12.14)
import numpy as np
from scipy.datasets import face
import matplotlib.pyplot as plt
%matplotlib inline
= face() X
type(X)
numpy.ndarray
plt.imshow(X)= plt.axis('off') _
= X.shape
n_rows, n_cols, n_channels = X.reshape(n_rows, n_cols * n_channels)
X_reshaped = np.linalg.svd(X_reshaped, full_matrices=False) U, S, V
X_reshaped.shape
(768, 3072)
X.shape
(768, 1024, 3)
**2) ## a kind of screeplot
plt.plot(S"log") plt.yscale(
def compute_approx(X: np.ndarray, r: int):
"""Computes the best rank-r approximation of X using SVD.
We expect X to the 3D array corresponding to a color image, that we
reduce to a 2D one to apply SVD (no broadcasting).
Parameters
----------
X : `np.ndarray`, shape=(n_rows, n_cols, 3)
The input 3D ndarray
r : `int`
The desired rank
Return
------
output : `np.ndarray`, shape=(n_rows, n_cols, 3)
The best rank-r approximation of X
"""
= X.shape
n_rows, n_cols, n_channels # Reshape X to a 2D array
= X.reshape(n_rows, n_cols * n_channels)
X_reshape # Compute SVD
= np.linalg.svd(X_reshape, full_matrices=False)
U, S, V # Keep only the top r first singular values
= 0
S[r:] # Compute the approximation
= U.dot(np.diag(S)).dot(V)
X_reshape_r # Put it between 0 and 255 again and cast to integer type
return X_reshape_r.clip(min=0, max=255).astype('int')\
.reshape(n_rows, n_cols, n_channels)
= [100, 70, 50, 30, 10, 5]
ranks = len(ranks)
n_ranks for i, r in enumerate(ranks):
= compute_approx(X, r)
X_r # plt.subplot(n_ranks, 1, i + 1)
=(5, 5))
plt.figure(figsize
plt.imshow(X_r)= plt.axis('off')
_ # plt.title(f'Rank {r} approximation of the racoon' % r, fontsize=16)
plt.tight_layout()
In the code above, we recompute the SVD of X
for every element in list rank
.
In the next chunk, we compute the SVD once, and define a generator to generate the low rank approximations of matrix X
. We take advantage of the fact that the SVD defines an orthonormal basis for the space of matrices. In this adapted orthonormal basis the optimal low rank approximations of \(X\) have a sparse expansion.
def gen_rank_k_approx(X):
"""Generator for low rank
approximation of a matrix X using truncated SVD.
Args:
X (numpy.ndarray): a numerical matrix
Yields:
(int,numpy.ndarray): rank k and best rank-k approximation of X using truncated SVD(according to Eckart-Young theorem).
"""
= np.linalg.svd(X, full_matrices=False)
U, S, V = 0
r = np.zeros_like(X, dtype='float64')
Y while (r<len(S)):
= Y + S[r] * (U[:,r,np.newaxis] @ V[r,:, np.newaxis].T)
Y += 1
r yield r, Y
= gen_rank_k_approx(X_reshaped) g
for i in range(100):
= next(g)
_, Xr if i % 10 ==0:
=(5, 5))
plt.figure(figsize
plt.imshow(
Xrmin=0, max=255)
.clip('int')
.astype(
.reshape(n_rows, n_cols, n_channels)
)= plt.axis('off')
_ plt.tight_layout()
Visit https://numpy.org/numpy-tutorials/content/tutorial-svd.html