similar to OmniParser
#6, opened by yangzehan
Does this mean it can operate a computer, similar to OmniParser?
It uses OmniParser to generate set-of-mark (SoM) prompts, and then decides which button to click. Basically, you can think of this model as a mini GPT-4o that understands SoM prompts well and knows how to take actions.
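To make the flow concrete, here is a rough sketch of the idea, not the actual OmniParser or Magma code. It assumes the parser returns numbered marks with pixel bounding boxes, and that the model answers with a mark number. The names `Mark`, `build_som_prompt`, and `click_point` are made up for illustration, and the model call is replaced by a hard-coded answer.

```python
from dataclasses import dataclass


@dataclass
class Mark:
    """One numbered UI element (hypothetical structure, not OmniParser's output format)."""
    mark_id: int
    label: str
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels


def build_som_prompt(marks, task):
    """List the numbered marks in a text prompt so the model can refer to them by number."""
    lines = [f"Task: {task}", "Marked elements:"]
    lines += [f"[{m.mark_id}] {m.label}" for m in marks]
    lines.append("Answer with the number of the mark to click.")
    return "\n".join(lines)


def click_point(marks, chosen_id):
    """Turn the chosen mark number into a click position at the center of its box."""
    by_id = {m.mark_id: m for m in marks}
    if chosen_id not in by_id:
        raise ValueError(f"unknown mark id: {chosen_id}")
    x_min, y_min, x_max, y_max = by_id[chosen_id].box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


# Example marks, as a screen parser might produce them (values are invented).
marks = [
    Mark(1, "Search box", (10, 10, 210, 40)),
    Mark(2, "Submit button", (220, 10, 300, 40)),
]
prompt = build_som_prompt(marks, "Submit the form")

chosen = 2  # stand-in for the model's answer; no real model is called here
print(click_point(marks, chosen))  # (260.0, 25.0)
```

In a real setup the marks would also be drawn onto the screenshot, and the chosen number would come from the model's reply rather than a constant.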
I understand. So we could add support for this model in the OmniParser demo?
Yes, you are correct. Please see this demo: https://huggingface.co/spaces/microsoft/Magma-UI
We have already integrated it for you :)