multiply and *: element-wise product
The two are equivalent: both multiply the operands element by element.
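As a small illustration of the equivalence, the sketch below multiplies two arrays with both spellings. The array values are arbitrary and chosen only for the example; the last part also shows that operands of different shapes are broadcast before the element-wise product is taken.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

# Element-wise product: each entry is a[i, j] * b[i, j].
elementwise = np.multiply(a, b)
same_with_operator = a * b

# Operands with different shapes are broadcast first: the 1-D row
# is stretched across both rows of a.
row = np.array([10, 100])
broadcasted = a * row

print(elementwise)
print(np.array_equal(elementwise, same_with_operator))
print(broadcasted)
```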
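matmul and @: matrix multiplication
Both perform matrix multiplication. The NumPy documentation describes the rules as follows.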
The behavior depends on the arguments in the following way.
- If both arguments are 2-D they are multiplied like conventional matrices.
- If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly. For example, `matmul(a, b)[0, 1, 1]` equals `sum(a[0, 1, :] * b[0, :, 1])`.
- If the first argument is 1-D, it is promoted to a matrix by prepending a 1 to its dimensions. After matrix multiplication the prepended 1 is removed.
- If the second argument is 1-D, it is promoted to a matrix by appending a 1 to its dimensions. After matrix multiplication the appended 1 is removed.

dot: dot product, also called the scalar product or inner product
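In the mathematical sense, this is a binary operation that takes two vectors over the real numbers and returns a real-valued scalar.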
Dot product of two arrays. Specifically,
- If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).
- If both a and b are 2-D arrays, it is matrix multiplication, but using matmul or a @ b is preferred.
- If either a or b is 0-D (scalar), it is equivalent to multiply and using numpy.multiply(a, b) or a * b is preferred.
- If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b. In other words, a is treated as a stack of 1-D arrays and the inner product of each one with b is computed.
- If a is an N-D array and b is an M-D array (where M>=2), it is a sum product over the last axis of a and the second-to-last axis of b: `dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])`

Comparing the last rule of dot with the second rule of @
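When b is a single matrix (2-D)

    import numpy as np

    low = 0
    high = 100
    a_shape = 2, 3, 4
    b_shape = 4, 5
    # Define a 3-D array of shape (2, 3, 4)
    a = np.random.randint(low, high, size=a_shape)
    # Define a 2-D array of shape (4, 5)
    b = np.random.randint(low, high, size=b_shape)
    r1, r2 = np.dot(a, b), np.matmul(a, b)
    print(r1.shape, r2.shape)
    print(np.array_equal(r1, r2))

Result (dot and matmul agree here, but this is a special case):

    (2, 3, 5) (2, 3, 5)
    True

When b is a stack of matrices (N-D)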

    import numpy as np

    low = 0
    high = 100
    a_shape = 2, 2, 4
    b_shape = 2, 4, 2
    a = np.random.randint(low, high, size=a_shape)
    b = np.random.randint(low, high, size=b_shape)

    r1, r2 = np.dot(a, b), np.matmul(a, b)
    # For matmul, the matrices at matching positions are multiplied
    print(np.array_equal(a[0] @ b[0], r2[0]), np.array_equal(a[1] @ b[1], r2[1]))
    # Check whether the matmul and dot results are equal
    print(r1.shape, r2.shape)
    print((np.array_equal(r1, r2), np.array_equal(r1[0][0], r2[0])))

Output:

    True True
    (2, 2, 2, 2) (2, 2, 2)
    (False, False)

Analysis of the results
dot: `dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])` (expression from the official NumPy documentation)
matmul: `matmul(a, b)[i,j,m] = sum(a[i,j,:] * b[i,:,m])` (derived by hand)
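One way to check both index expressions is to write them out with `np.einsum` and compare against the library functions. This is only a sketch: the `np.arange` inputs are an assumption made so the run is deterministic, and they simply reuse the shapes (2, 2, 4) and (2, 4, 2) from the experiment above.

```python
import numpy as np

# Deterministic inputs with the same shapes as the experiment above.
a = np.arange(2 * 2 * 4).reshape(2, 2, 4)
b = np.arange(2 * 4 * 2).reshape(2, 4, 2)

# dot: the matrix index i (for a) and k (for b) vary independently.
dot_by_index = np.einsum("ijl,klm->ijkm", a, b)
# matmul: both operands share the same matrix index i.
matmul_by_index = np.einsum("ijl,ilm->ijm", a, b)

print(np.array_equal(dot_by_index, np.dot(a, b)))
print(np.array_equal(matmul_by_index, np.matmul(a, b)))
```

The only difference between the two subscript strings is whether b is indexed by a fresh letter `k` or by the shared letter `i`, which is exactly the difference between the two formulas.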
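Let a[i,j,:] denote a row, of which there are 4, and b[k,:,m] a column, of which there are also 4.

For matmul, i and k are the same, so the matrices at matching positions are multiplied and the result consists of two matrices:

    a[0] @ b[0]
    a[1] @ b[1]

For dot, i and k are independent, so each matrix in a is multiplied with both matrices in b, which gives four matrices:

    a[0] @ b[0]    a[0] @ b[1]
    a[1] @ b[0]    a[1] @ b[1]

i selects a matrix from a and k selects a matrix from b, so setting i equal to k picks out of the dot result the same matrix that matmul produces: r1[i, :, i] equals r2[i].

    i = 0
    np.array_equal(r1[i, :, i], r2[i])  # True

In particular, when b contains only one matrix, the dot result is the same as the matmul result, as the first experiment showed.

Summary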
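multiply and * perform element-wise multiplication.
matmul and @ perform matrix multiplication. For higher-dimensional inputs the last two dimensions are treated as matrices and all leading dimensions act as batch dimensions; batch dimensions that do not match are broadcast, and after broadcasting the matrices at the same position in A and B are multiplied.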
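The batch behavior of matmul can be sketched as follows. The shapes and the `np.arange` inputs are assumptions picked for illustration; the point is only that a missing or size-1 batch dimension is broadcast, and that a 1-D operand is promoted and then squeezed back.

```python
import numpy as np

# A batch of 2 matrices, each 3x4, and a single 4x5 matrix.
a = np.arange(2 * 3 * 4).reshape(2, 3, 4)
b = np.arange(4 * 5).reshape(4, 5)

# b has no batch dimension, so it is broadcast against both matrices in a.
batched = a @ b
print(batched.shape)  # (2, 3, 5)
print(np.array_equal(batched[0], a[0] @ b), np.array_equal(batched[1], a[1] @ b))

# A batch dimension of size 1 is broadcast the same way: (1, 4, 5) against (2, 3, 4).
print(np.array_equal(a @ b[np.newaxis], batched))

# A 1-D operand is promoted to a matrix and the added axis is dropped again.
v = np.arange(4)
print((a @ v).shape)  # (2, 3)
```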
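dot is more complicated:
- Both 1-D: returns the inner product.
- Both 2-D: matrix multiplication, the same as @ or matmul.
- Either operand is a scalar: element-wise multiplication, the same as multiply.
- A is N-D and B is 1-D: A is treated as a stack of 1-D arrays, and the inner product of each one with B is computed.
- A is N-D and B is M-D (M >= 2): this is still matrix multiplication, but every matrix in A is multiplied with every matrix in B rather than only with the matrix at the matching position.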
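The five cases can be exercised in one short script. This is a minimal sketch with made-up inputs (the names `v`, `w`, `m`, `a`, `b` and their values are assumptions for the example), not an exhaustive test of `np.dot`.

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
m = np.array([[1, 2], [3, 4]])
a = np.arange(2 * 2 * 3).reshape(2, 2, 3)  # N-D, last axis of length 3
b = np.arange(2 * 3 * 2).reshape(2, 3, 2)  # M-D, second-to-last axis of length 3

# 1-D with 1-D: inner product.
print(np.dot(v, w))  # 32
# 2-D with 2-D: same as matrix multiplication.
print(np.array_equal(np.dot(m, m), m @ m))
# Scalar operand: same as element-wise multiplication.
print(np.array_equal(np.dot(2, m), 2 * m))
# N-D with 1-D: sum product over the last axis of a.
print(np.dot(a, v).shape)  # (2, 2)
# N-D with M-D: every matrix in a meets every matrix in b.
cross = np.dot(a, b)
print(cross.shape)  # (2, 2, 2, 2)
print(np.array_equal(cross[0, :, 1], a[0] @ b[1]))
```

The last line picks the block of the result that pairs the first matrix of `a` with the second matrix of `b`, a combination that matmul never computes for these shapes.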