Data release for the ImageInWords (IIW) paper.
The first public Vietnamese visual-linguistic foundation model(s).
Image Captioning With MobileNet-LLaMA 3