Papers that formulate machine learning algorithms as maximum entropy or minimum relative entropy solutions or, more broadly, are written in the general framework of information geometry.

Note that some papers could fall under more than one heading.


Topsoe manuscripts

Shalizi link

See also Fuchun Peng's Maximum Entropy Models list, link