Title: On implementing a softmax that behaves exactly like PyTorch's.

Many people take it for granted that a single line of code will do the job:

[np.exp(i)/np.sum(np.exp(i)) for i in a]

But do you really think that is enough? Let's compare:

import torch
import numpy as np


# label = torch.Tensor([
#     [1, 2, 3],
#     [0, 2, 1],
# ])


label = torch.Tensor([0, 2, 1]).long()


fc_out = torch.Tensor([
    [245, 13., 3.34],
    [45., 43., 37.],
    [1.22, 35.05, 1.23]
])

def one_hot(a, n):
    # turn an integer label vector of length b into a (b, n) one-hot matrix
    b = a.shape[0]
    c = np.zeros([b, n])
    for i in range(b):
        c[i][a[i]] = 1
    return np.array(c)

def softmax(a):
    # naive row-wise softmax: normalizes each row, with no max-subtraction
    # for numerical stability
    return [np.exp(i)/np.sum(np.exp(i)) for i in a]


def cross_entropy_loss(out, label):
    # convert out to softmax probabilities with the naive row-wise NumPy version
    out_list = out.numpy().tolist()
    out1 = softmax(out_list)
    print(out1)

    # note: dim=0 normalizes down each column, so this is not even the
    # same operation as the row-wise NumPy softmax printed above
    out2 = torch.softmax(out, 0)
    print(out2)

    # [0, 2, 1] -> [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
    # one-hot the label and rotate
    label_onehot = one_hot(label, 3)
    # this sums the picked probabilities directly; real cross entropy takes
    # -log of them and averages over the batch
    loss = np.sum(out1 * label_onehot.T)
    print(loss)
    return loss


# PyTorch's reference value
loss = torch.nn.CrossEntropyLoss()
lv = loss(fc_out, label)
print(lv)

# value from the naive NumPy attempt above
lv = cross_entropy_loss(fc_out, label)
print(lv)

This is a piece of code that tries to reproduce PyTorch's CrossEntropy, but once you run it you find you cannot even reproduce the softmax: the one-line NumPy version normalizes each row, while torch.softmax(out, 0) normalizes each column, and the naive version also skips the max-subtraction trick that PyTorch uses to keep exp() from overflowing on large logits.
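
If you actually want the NumPy side to agree with PyTorch, here is a minimal sketch (my own code, not part of the snippet above): pick the same axis as torch.softmax's dim and subtract the per-row maximum before exponentiating. The helper name np_softmax and its axis parameter are made up for illustration; the two printouts should agree up to float32 rounding.

import numpy as np
import torch

def np_softmax(x, axis=-1):
    # subtract the max along the chosen axis so exp() cannot overflow;
    # this does not change the result because softmax is shift-invariant
    x = np.asarray(x, dtype=np.float64)
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

fc_out = torch.Tensor([
    [245, 13., 3.34],
    [45., 43., 37.],
    [1.22, 35.05, 1.23]
])

# compare along the same dimension (rows), not dim 0
print(np_softmax(fc_out.numpy(), axis=1))
print(torch.softmax(fc_out, 1))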

Comments and discussion are welcome; there are a lot of knowledge points packed in here, and quite a few huge pitfalls.
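
For the curious, here is one way the full loss could be rebuilt, again as a rough sketch rather than the definitive answer: torch.nn.CrossEntropyLoss with default settings amounts to log_softmax along dim 1 followed by the mean of the negative log-probabilities of the target classes. The helpers np_log_softmax and np_cross_entropy below are hypothetical names of my own.

import numpy as np
import torch

def np_log_softmax(x, axis=-1):
    # log-softmax via the log-sum-exp trick, which is more stable
    # than computing log(softmax(x)) in two steps
    x = np.asarray(x, dtype=np.float64)
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def np_cross_entropy(logits, labels):
    # pick the log-probability of the correct class for each sample,
    # negate it, and average over the batch (the default 'mean' reduction)
    logp = np_log_softmax(logits, axis=1)
    return -logp[np.arange(len(labels)), labels].mean()

fc_out = torch.Tensor([
    [245, 13., 3.34],
    [45., 43., 37.],
    [1.22, 35.05, 1.23]
])
label = torch.Tensor([0, 2, 1]).long()

# the two values should agree up to float32 rounding
print(torch.nn.CrossEntropyLoss()(fc_out, label))
print(np_cross_entropy(fc_out.numpy(), label.numpy()))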