A close reading of the seminal Science autoencoder paper reveals that the middle hidden layers that produced the impressive clustered visualization results here:
were linear or "identity" - basically unconstrained.
"For continuous data, the hidden units of the
first-level RBM remain binary, but the visible
units are replaced by linear units with Gaussian
"The six units in the code layer were linear and all the
other units were logistic."
"Again, all units were logistic except for the 30 linear
units in the code layer"
It's mentioned five times ...
Since we don't have this option (to my knowledge), the output from our middle hidden layers is compressed, and I often don't see the clusters and outliers that should be there.
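
To make the difference concrete, here is a minimal NumPy sketch of an encoder whose units are all logistic except for a linear code layer, as the quotes above describe. It is only an illustration under stated assumptions: the layer sizes, the random untrained weights, and the names `logistic` and `encode` are made up for this example and are not taken from the paper or from any particular library.

```python
import numpy as np

def logistic(z):
    """Logistic (sigmoid) activation; output is always inside (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def encode(x, weights, biases, linear_code=True):
    """Forward pass through the encoder; the last layer is the code layer.

    All layers use logistic units, except that the code layer is left
    linear (identity) when linear_code is True.
    """
    h = x
    last = len(weights) - 1
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = h @ W + b
        h = z if (i == last and linear_code) else logistic(z)
    return h

# Hypothetical sizes: 20 inputs -> 16 -> 8 -> 2-unit code layer.
rng = np.random.default_rng(0)
sizes = [20, 16, 8, 2]
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

x = rng.normal(size=(100, sizes[0]))  # 100 made-up continuous samples

code_linear = encode(x, weights, biases, linear_code=True)
code_logistic = encode(x, weights, biases, linear_code=False)

# The logistic code is confined to (0, 1); the linear code is not bounded.
print("linear code range:  ", code_linear.min(), code_linear.max())
print("logistic code range:", code_logistic.min(), code_logistic.max())
```

With the same weights, the logistic code is just the linear code passed through the sigmoid, so every point is squeezed into the unit square. Points that sit far apart in the linear code can end up crowded together near 0 or 1, which may be one reason clusters and outliers are harder to see.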