When you open this website, you probably see the name Mann-E at the top. If you click on the link, you'll be redirected to a page with some information about the product, but it is fairly general and not very detailed. So in this blog post we'll have a look at the latest release of the model, which is the first release with a code name!
A little backstory
Although it has been described on that page, we need to get into the details before diving into the release notes. First, I will introduce myself a bit so you'll have an idea of who created this model. I am Muhammadreza Haghiri and I was born on May 30th, 1996. At the time I'm writing this article, I'm in the last months of the 27th year of my life and I still can't say I'm 27. I have had a passion for creating and discovering since I was very young. I started coding at the age of 12, and by the age of 15 I could write programs that had real-life applications.
In late 2019 and early 2020, when the pandemic started, most of us preferred to stay home and, as you might have guessed, I was super bored. In those dark and unfortunate days for humanity I started learning AI, and in a short amount of time I could make my very own image classifier and smart assistant. Both were inspired by movies and series I had watched: the image classifier came from the 4th season of HBO's show Silicon Valley, and the smart assistant from Marvel's movie Iron Man.
In early 2022 I discovered generative AI and, as you may guess, it was all about art. Before getting into the details of Mann-E, you should know that I am an artist (a musician, actually) and that I love the arts very much. So it was my thing! A combination of arts and computers!
Early versions of Mann-E
As said before, it was all a hobby project of mine and was based on VQGAN models. But the release of Stable Diffusion was a game changer. While version 1 was a custom VQGAN, version 2 was basically Stable Diffusion with a little bit of modification and training, and the big changes happened in version 3. Version 3 of Mann-E is where I chose to train custom text encoders as well, and that helped improve the quality of generations a lot!
Mann-E 3 Revision 3 “Prompt Muse” Release Notes
Inference and weights
To access the weights, checkpoints and inference tools of the new release, you can visit the model's page on Hugging Face.
- The dataset was larger: 200 images, mostly digital art and illustrations (the last revision used only 50 fantasy illustrations).
- With the new custom text encoder, the model now understands Midjourney-style prompts (of course it doesn't understand the --ar, --q or --v flags), but crafting prompts the Stable Diffusion way (e.g. using Magic Prompt to craft your prompts) will result in much better images.
- The model is still based on weights from Runway ML's Stable Diffusion version 1.5, as this version has a better understanding of styles and artist names.
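Since the model does not understand Midjourney's flags, it can help to strip them from a copied prompt before passing it on. Here is a minimal sketch of that idea, assuming the flags are written with two hyphens and followed by a single value. The helper name `strip_midjourney_flags` and the regular expression are hypothetical illustrations, not part of Mann-E or its tooling.

```python
import re

# Hypothetical helper for illustration only; not part of Mann-E.
# Matches the three flags named in the notes above (--ar, --q, --v),
# each followed by one value such as "16:9" or "5".
_MJ_FLAG = re.compile(r"\s*--(?:ar|q|v)\s+\S+")


def strip_midjourney_flags(prompt: str) -> str:
    """Remove --ar/--q/--v flags and their values from a prompt."""
    cleaned = _MJ_FLAG.sub("", prompt)
    # Collapse any leftover runs of whitespace into single spaces.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


if __name__ == "__main__":
    raw = "a persian miniature of a garden, digital art --ar 16:9 --v 5 --q 2"
    print(strip_midjourney_flags(raw))
    # prints: a persian miniature of a garden, digital art
```

The cleaned prompt can then be extended the Stable Diffusion way, for example with style keywords or artist names, before generating an image.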
Mann-E is still taking his baby steps (yes, the name is a common Persian male name, and this is why I used “his” for the pronoun 😁) and needs more work. But at this point it's usable, and it has become a pretty good hobby for a lot of my friends and family members as well. It can be used to study mixing different ideas, or for ethnic art and other things that are not completely covered by other AI art generators.