Llama 3.2 11B Vision Instruct
Efficient vision model for image understanding and reasoning.
Context Window
128K
Input Price
N/A
per 1M tokens
Output Price
N/A
per 1M tokens
Family
Llama 3.2
Capabilities
Vision
Function Calling
JSON Mode
Streaming
Reasoning
Input Modalities
text
image
Output Modalities
text
Model Information
Status
active
Access Type
open weights
License
Llama 3.2 Community License
Last Updated
12/30/2025