A scaled down version of problem we just received at MOSEK is

min s^2

st. Ax - b = s (P)

x >= 0

Assume for simplicity A is a matrix with 1 row. Now in this case the elements in A is very large i.e. of the order of magnitude 10^7. This implies the problem (P) is badly scaled because the row contains a lot of elements of the size 10^7 and one of the size 1.

Now an equally good model is

min s^2

st. Ax - b = (||A||_inf) s (P')

x >= 0

which is nicely scaled. An optimal solution to (P') is clearly also optimal to (P)!

Note this line of thinking could be relevant anywhere were penalty variables is employed. Many optimization problems contains penalty variables.