VIEW2SPACE-4B: Multi-View Visual Reasoning
A 4.4B-parameter vision-language model based on Qwen3-VL that reasons about scenes from sparse multi-view observations. Upload one or more images and ask a question — the model integrates information across views to answer.
Model card | Paper | Code
Examples