The code release will be delayed. Thanks for your patience! We evaluate the model's multi-modal capabilities on five major categories of tasks: Referring Expression Comprehension, ...