MLP Gaussian policy

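This appears to be a Gaussian policy whose mean and standard deviation are both outputs of a neural network, here a multilayer perceptron (MLP). Given an observation, the network produces the parameters of a Gaussian distribution over actions, and the action is sampled from that distribution.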
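A minimal NumPy sketch of the idea, not taken from any particular library. The names (init_mlp, gaussian_policy, and so on), the layer sizes, the tanh activation, the log-std parametrization, and the clipping range are all assumptions made for illustration.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Create (weight, bias) pairs for a fully connected network."""
    return [(rng.normal(0.0, 0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """Apply tanh hidden layers followed by a linear output layer."""
    for w, b in params[:-1]:
        x = np.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b

def gaussian_policy(params, obs, act_dim):
    """Split the network output into a mean and a std per action dimension."""
    out = mlp_forward(params, obs)
    mean, log_std = out[..., :act_dim], out[..., act_dim:]
    # Assumption: the network predicts log(std); exp keeps std positive.
    # The clipping range is an arbitrary choice for numerical stability.
    std = np.exp(np.clip(log_std, -5.0, 2.0))
    return mean, std

def sample_action(mean, std, rng):
    """Draw an action from N(mean, std^2), independently per dimension."""
    return mean + std * rng.standard_normal(mean.shape)

def log_prob(action, mean, std):
    """Log-density of a diagonal Gaussian, summed over action dimensions."""
    z = (action - mean) / std
    return np.sum(-0.5 * z**2 - np.log(std) - 0.5 * np.log(2 * np.pi), axis=-1)

rng = np.random.default_rng(0)
obs_dim, act_dim = 4, 2
# The output layer has 2 * act_dim units: one half for the mean, one for log-std.
params = init_mlp([obs_dim, 32, 2 * act_dim], rng)
obs = rng.normal(size=obs_dim)
mean, std = gaussian_policy(params, obs, act_dim)
action = sample_action(mean, std, rng)
print(mean.shape, std.shape, action.shape, log_prob(action, mean, std))
```

Predicting the log of the standard deviation is one common way to guarantee a positive std. Another option is to keep the std as a separate learned parameter that does not depend on the observation, but that would not match the description above, where both quantities come from the network.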

References