Nvidia released a paper about a 100KB text-to-image model that trained for only 4 minutes but is claimed to be better than bigger models

They also claim that it takes only about 8 seconds to generate a variety of good images.