How transpose actually works

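In short, transpose changes the order in which memory is traversed. Take 16 elements with shape (2, 2, 4): the strides, counted in elements, are (8, 4, 1). Call the axes i, j, k; three nested loops traverse the whole array. transpose only needs to permute the strides (the shape entries are permuted too, but both are 2 here). Swapping axes 0 and 1, for example, turns the strides into (4, 8, 1).

After the swap, each step along the k axis advances 1 element in memory, each step along the j axis advances 8 elements, and each step along the i axis advances 4 elements.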

import numpy as np

memory = np.arange(16)
shape = (2, 2, 4)
# stride = (8, 4, 1)  # original layout
stride = (4, 8, 1)  # after transposing axes 0 and 1
i_idx = j_idx = k_idx = 0
for i in range(shape[0]):  # 2
    i_idx = i * stride[0]  # 4
    for j in range(shape[1]):  # 2
        j_idx = i_idx + j * stride[1]  # 8
        for k in range(shape[2]):  # 4
            k_idx = j_idx + k * stride[2]  # 1
            print(memory[k_idx], end=' ')
        print()
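To check the hand-written loop against NumPy itself, here is a minimal sketch. It assumes the default C-contiguous layout; `walk` is a hypothetical helper name of my own that just repeats the triple loop above. Note that NumPy reports `strides` in bytes, so they are divided by `itemsize` to get element counts.

```python
import numpy as np

a = np.arange(16).reshape(2, 2, 4)
t = a.transpose(1, 0, 2)  # swap axes 0 and 1

# NumPy reports strides in bytes; divide by itemsize to count elements.
a_strides = tuple(s // a.itemsize for s in a.strides)
t_strides = tuple(s // t.itemsize for s in t.strides)
print(a_strides, t_strides)


def walk(memory, shape, strides):
    """Hypothetical helper: the triple loop from above, collecting values."""
    out = []
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                offset = i * strides[0] + j * strides[1] + k * strides[2]
                out.append(int(memory[offset]))
    return out


manual = walk(np.arange(16), t.shape, t_strides)
same_order = manual == t.ravel().tolist()
no_copy = np.shares_memory(a, t)  # transpose returns a view of the same buffer
print(same_order, no_copy)
```

The transposed array shares its buffer with the original, which is consistent with the idea that only the strides (and shape) change.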

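Another way to look at it is as a change of indices. With axes (x, y, z), the transpose produces (y, x, z). Ignore z, so that each item is an array of length 4, and the operation reduces to transposing a 2D matrix, which is easier to picture. Transposing a 2D matrix amounts to swapping the two nested for loops over i and j (if the data is stored as a 2D matrix, i and j are exactly its indices). In the code above, that swap corresponds to swapping the strides.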
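The index view can be sketched as follows: treat each length-4 row as one item and copy item (i, j) to position (j, i). This is only an illustration of the idea (it copies data, which the real transpose does not); the variable names are my own.

```python
import numpy as np

a = np.arange(16).reshape(2, 2, 4)

# Treat a as a 2x2 "matrix" whose items are length-4 rows,
# then transpose that matrix by swapping the two indices.
t = np.empty_like(a)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        t[j, i] = a[i, j]  # the whole length-4 row moves as one item

matches = np.array_equal(t, a.transpose(1, 0, 2))
print(matches)
```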

References

The answer below has a detailed explanation with diagrams and is recommended reading (it is clearer than what I wrote).

python - How does NumPy’s transpose() method permute the axes of an array? - Stack Overflow

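Practical application

Suppose there are two images stored as (N, C, H*W) = (2, 3, 4), and let a = torch.arange(1, 25).reshape(2, 3, 4) (the NumPy equivalent is np.arange(1, 25).reshape(2, 3, 4)).

To join the two images side by side, i.e. into shape (3, 4*2), the desired result is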

array([[ 1,  2,  3,  4, 13, 14, 15, 16],
       [ 5,  6,  7,  8, 17, 18, 19, 20],
       [ 9, 10, 11, 12, 21, 22, 23, 24]])

Calling reshape or view directly gives

tensor([[ 1,  2,  3,  4,  5,  6,  7,  8],
        [ 9, 10, 11, 12, 13, 14, 15, 16],
        [17, 18, 19, 20, 21, 22, 23, 24]])

This happens because the data is stored contiguously in memory, and reshape and view on a contiguous tensor keep that 1 -> 24 order.
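A small NumPy sketch of this point (NumPy is used here as a stand-in for the tensor case, which is my own substitution): a direct reshape leaves the flat order untouched, so it cannot produce the side-by-side layout, which is written out with `np.concatenate` for comparison.

```python
import numpy as np

a = np.arange(1, 25).reshape(2, 3, 4)

direct = a.reshape(3, 8)
# The layout we actually want: image 0 and image 1 side by side.
wanted = np.concatenate([a[0], a[1]], axis=1)

keeps_flat_order = np.array_equal(direct.ravel(), a.ravel())
is_wanted = np.array_equal(direct, wanted)
print(keeps_flat_order, is_wanted)
```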

The fix is to transpose first and then change the shape. If view is used after the transpose, however, it raises the following error:

view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces).
# At least one dimension spans two contiguous subspaces, so view does not work

We can use reshape instead, which avoids the problem with view.

Returns a tensor with the same data and number of elements as input, but with the specified shape. When possible, the returned tensor will be a view of input. Otherwise, it will be a copy. Contiguous inputs and inputs with compatible strides can be reshaped without copying, but you should not depend on the copying vs. viewing behavior.

torch.reshape — PyTorch 2.5 documentation

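In effect, when a view is not possible, reshape behaves as if contiguous() were called first to get a tensor that is contiguous in memory (a copy), and the new shape is then applied to that copy.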
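NumPy's reshape follows the same "view when possible, copy otherwise" rule, so the behaviour can be sketched without PyTorch. This is an analogy under that assumption, not PyTorch code; `np.ascontiguousarray` plays the role of `contiguous()` here.

```python
import numpy as np

a = np.arange(1, 25).reshape(2, 3, 4)
b = a.transpose(1, 0, 2)  # same buffer, permuted strides

a_contig = bool(a.flags['C_CONTIGUOUS'])
b_contig = bool(b.flags['C_CONTIGUOUS'])

view_case = a.reshape(3, 8)  # contiguous input: no copy needed
copy_case = b.reshape(3, 8)  # non-contiguous input: data must be copied

view_shares = bool(np.shares_memory(a, view_case))
copy_shares = bool(np.shares_memory(a, copy_case))

# Making b contiguous first and then reshaping gives the same values.
explicit = np.ascontiguousarray(b).reshape(3, 8)
same_values = np.array_equal(copy_case, explicit)
print(a_contig, b_contig, view_shares, copy_shares, same_values)
```

As the quoted documentation says, code should not rely on whether a copy or a view comes back.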

Final code

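import torch
a = torch.arange(1, 25).reshape(2, 3, 4)  # two images, (N, C, H*W)
b = torch.transpose(a, 1, 0)  # swap N and C: shape (3, 2, 4)
print(b.reshape((3, 8)))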

tensor([[ 1,  2,  3,  4, 13, 14, 15, 16],
        [ 5,  6,  7,  8, 17, 18, 19, 20],
        [ 9, 10, 11, 12, 21, 22, 23, 24]])
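For reference, the same steps in NumPy, as a sketch. `np.transpose(a, (1, 0, 2))` is my assumed counterpart of swapping dimensions 0 and 1, and the result is compared against an explicit side-by-side concatenation.

```python
import numpy as np

a = np.arange(1, 25).reshape(2, 3, 4)  # two images, (N, C, H*W)

b = np.transpose(a, (1, 0, 2))  # swap N and C: shape (3, 2, 4)
result = b.reshape(3, 8)        # merge (N, H*W) into one axis of length 8

side_by_side = np.concatenate([a[0], a[1]], axis=1)
ok = np.array_equal(result, side_by_side)
print(result)
print(ok)
```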

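How sum works

Each dimension can be thought of as one level of brackets. Take an array A with shape (2, 2, 4), for example:

axis=0: there are 2 matrices of shape (2, 4)
axis=1: each (2, 4) matrix holds 2 arrays of length 4
axis=2: each array holds 4 numbers

Summing along an axis means adding up the elements that lie along that axis.

Note that sum does not keep the reduced dimension by default (keepdims=False), so the result has one axis fewer unless keepdims=True is passed.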

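np.sum(A, axis=0) -> matrix (2, 4) + matrix (2, 4) -> shape (2, 4), or (1, 2, 4) with keepdims=True
np.sum(A, axis=1) -> within each (2, 4) matrix, array + array -> shape (2, 4), or (2, 1, 4) with keepdims=True
np.sum(A, axis=2) -> the four elements of each array are summed -> shape (2, 2), or (2, 2, 1) with keepdims=True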
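The shapes can be checked directly. A short sketch, using `np.arange(16)` as example data of my own choosing:

```python
import numpy as np

A = np.arange(16).reshape(2, 2, 4)

# Shape without and with keepdims for each single axis.
shapes = {
    axis: (np.sum(A, axis=axis).shape,
           np.sum(A, axis=axis, keepdims=True).shape)
    for axis in range(A.ndim)
}
print(shapes)

# axis=0 adds the two (2, 4) matrices element by element.
axis0_is_matrix_sum = np.array_equal(np.sum(A, axis=0), A[0] + A[1])
# axis=1 adds the two length-4 arrays inside each matrix.
axis1_is_row_sum = np.array_equal(np.sum(A, axis=1)[0], A[0][0] + A[0][1])
print(axis0_is_matrix_sum, axis1_is_row_sum)
```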

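More generally, when summing over several axes at once, e.g. axis=(1, 2), the shape goes from (2, 2, 4) to (2,), or to (2, 1, 1) with keepdims=True: all eight elements of each (2, 4) matrix are added into a single number. This is the same as performing operations 2 and 3 above (the axis=1 and axis=2 sums), in either order.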

If axis is a tuple of ints, a sum is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.
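A quick check that summing over a tuple of axes matches the two single-axis sums in either order. This is a sketch with example data of my own; `keepdims=True` is used in the step-by-step versions so the axis numbers stay the same between steps.

```python
import numpy as np

A = np.arange(16).reshape(2, 2, 4)

both = np.sum(A, axis=(1, 2))
both_kept = np.sum(A, axis=(1, 2), keepdims=True)

# axis=1 first, then axis=2, and the other way round.
one_then_two = np.sum(np.sum(A, axis=1, keepdims=True), axis=2, keepdims=True)
two_then_one = np.sum(np.sum(A, axis=2, keepdims=True), axis=1, keepdims=True)

order_irrelevant = (np.array_equal(both_kept, one_then_two)
                    and np.array_equal(both_kept, two_then_one))
print(both, both.shape, both_kept.shape, order_irrelevant)
```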

References

numpy.sum — NumPy v2.0 Manual

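Other references

Python · numpy · axis - Zhihu

关于python 高维数组transpose的实现原理以及pytorch view等的思考_transpose(1, 2)-CSDN博客 (Thoughts on how transpose of high-dimensional arrays is implemented in Python, and on PyTorch view and related functions, CSDN blog)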
