Train your own custom image captioning model for Chroma (Perchance T2I model) on a Google Colab T4 in 2 hours

Link to the image-to-prompt notebook: https://huggingface.co/codeShare/flux_chroma_image_captioner/blob/main/gemma_image_captioner.ipynb
Writing prompts for Chroma is hard and JoyCaption is inaccurate, so I assembled what training data I could find for the model, picked 400 image-text pairs at random, and trained a Google Gemma 3 LoRA as an image-to-prompt tool that can run on Google Colab.
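
If you want to try the adapter outside the notebook, a minimal inference sketch in the style of the Unsloth vision notebooks might look like the code below. The adapter path, instruction text, and generation settings are assumptions for illustration, not exact values from the repo; check the linked notebook for the real ones.

```python
# Minimal sketch, assuming the Unsloth FastVisionModel API; the adapter
# repo name and prompt text here are placeholders, see the notebook.
from unsloth import FastVisionModel
from PIL import Image

model, tokenizer = FastVisionModel.from_pretrained(
    "codeShare/flux_chroma_image_captioner",  # assumed adapter location
    load_in_4bit=True,                        # fits on a free Colab T4
)
FastVisionModel.for_inference(model)

image = Image.open("example.jpg")  # any local image
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image as a Chroma prompt."},
    ],
}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(image, prompt, add_special_tokens=False,
                   return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=256, use_cache=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```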
It's a proof of concept. Feel free to train your own LoRA captioning models for use on Perchance. The workflow for converting JSON and .parquet files into a dataset can be found in this notebook in the repo: https://huggingface.co/codeShare/flux_chroma_image_captioner/blob/main/train_on_parquet.ipynb
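
The gist of that conversion step, sketched with the Hugging Face datasets library (the file name and the "image"/"text" column names are placeholders; the real layout is in train_on_parquet.ipynb):

```python
# Sketch only: data file and column names are assumptions,
# see train_on_parquet.ipynb for the actual dataset layout.
from datasets import load_dataset

ds = load_dataset("parquet", data_files="pairs.parquet", split="train")
ds = ds.shuffle(seed=42).select(range(400))  # 400 random image-text pairs

def to_conversation(sample):
    # Chat-style format like the Unsloth vision notebooks use for training.
    return {"messages": [
        {"role": "user", "content": [
            {"type": "image", "image": sample["image"]},
            {"type": "text", "text": "Caption this image."},
        ]},
        {"role": "assistant", "content": [
            {"type": "text", "text": sample["text"]},
        ]},
    ]}

converted = [to_conversation(row) for row in ds]
```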
For the original unsloth notebook visit: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3_(4B)-Vision.ipynb
Other unsloth models: https://docs.unsloth.ai/get-started/unsloth-notebooks
Cheers!