📌 The model was trained at 512x512 resolution.
Sizes close to 512 give the most stable results; larger sizes can look better but are less stable.
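
As a rough illustration of picking a size near the training resolution, the sketch below scales a requested width and height so that the shorter side lands at about 512 while keeping the aspect ratio. The helper name `fit_to_training_size` and the rounding to multiples of 8 are assumptions made for this example; they are not part of the model's documented interface.

```python
def fit_to_training_size(width, height, target=512, multiple=8):
    """Scale (width, height) so the shorter side is close to `target`.

    Hypothetical helper for illustration only. `multiple` is an assumed
    constraint (side lengths divisible by 8), not something the source states.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    # Scale factor that brings the shorter side to the target resolution.
    scale = target / min(width, height)

    # Round each side to the nearest multiple, never going below one multiple.
    new_w = max(multiple, round(width * scale / multiple) * multiple)
    new_h = max(multiple, round(height * scale / multiple) * multiple)
    return new_w, new_h


# Example: a 1024x768 request is scaled down toward the training resolution.
print(fit_to_training_size(1024, 768))  # (680, 512)
```

To trade some stability for visual quality, pass a larger `target` (for example 768), keeping in mind that results may become less stable as the size moves away from 512.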