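FYI, Pinsker's inequality lower-bounds the KL divergence in terms of the total variation distance (TVD): KL(P||Q) >= 2 * TVD(P, Q)^2. TVD is like an infinity norm on the difference between the probability distributions (a maximum over events), and it equals half the sum of absolute differences, which is the L1 norm. The L1 and L_infty norms are also related: for vectors of length n, L_infty <= L1 <= n * L_infty.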
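If you want to see these relations on actual numbers, here is a minimal sketch in plain Python. The two distributions `p` and `q` and the helper names are made up for illustration, and KL is measured in nats (natural log), which is the convention under which the constant 2 in Pinsker's inequality holds.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions on the same support."""
    # Terms with p_i == 0 contribute 0 by convention; assumes q_i > 0 wherever p_i > 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def l1_distance(p, q):
    """Sum of absolute differences (L1 norm of p - q)."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

def linf_distance(p, q):
    """Largest absolute difference (L_infty norm of p - q)."""
    return max(abs(pi - qi) for pi, qi in zip(p, q))

def total_variation(p, q):
    """Total variation distance, computed as half the L1 distance."""
    return 0.5 * l1_distance(p, q)

# Hypothetical example distributions, chosen only for illustration.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

kl = kl_divergence(p, q)
tv = total_variation(p, q)
l1 = l1_distance(p, q)
linf = linf_distance(p, q)
n = len(p)

print(f"KL = {kl:.5f}, 2*TV^2 = {2 * tv**2:.5f}")
print(f"L_inf = {linf:.5f}, L1 = {l1:.5f}, n*L_inf = {n * linf:.5f}")
```

The first printed line compares KL with the Pinsker lower bound 2*TV^2, and the second shows the sandwich L_infty <= L1 <= n * L_infty for this particular pair.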
tl;dr: don't worry about the mathematical details; if it works, it works.
As someone who knows that not enough people care about the math: please ignore this advice and actually learn the math. You might come up with a better representation in the process. In any case, you'll learn not just that it works, but how and why. And if your goal is to apply this method in other places, you will have gained a good idea of how.