[FEATURE] Tools

#470
by victor HF staff - opened
Hugging Chat org
•
edited Jun 17, 2024

Tools on HuggingChat

Learn more about the available tools in this YouTube video: https://www.youtube.com/watch?v=jRcheebdU5U

Today, we are excited to announce the beta release of Tools on HuggingChat! Tools open up a wide range of new possibilities, allowing the model to determine when a tool is needed, which tool to use, and what arguments to pass (via function calling).

  • For now, tools are only available on the default HuggingChat model, Cohere Command R+, because it is optimized for tool use and has performed well in our tests.
  • Tools use ZeroGPU spaces as endpoints, making it super convenient to add and test new tools!
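To make the function-calling idea above concrete, here is a minimal sketch of a dispatch loop. It assumes the model emits a tool call as a JSON object with `name` and `arguments` keys; that wire format, the `TOOLS` registry, and both toy tools are hypothetical illustrations, not HuggingChat's actual implementation.

```python
import json

# Hypothetical tools; names and signatures are illustrative only.
def calculator(expression: str) -> str:
    # Toy stand-in that only understands "a + b".
    left, right = expression.split("+")
    return str(float(left) + float(right))

def fetch_url(url: str) -> str:
    # Placeholder: a real tool would perform an HTTP request here.
    return f"<contents of {url}>"

TOOLS = {"calculator": calculator, "fetch_url": fetch_url}

def dispatch(model_output: str) -> str:
    """Run the tool call the model emitted as JSON, or return plain text as-is."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output  # the model decided no tool was needed
    if not isinstance(call, dict) or call.get("name") not in TOOLS:
        return model_output  # valid JSON, but not a known tool call
    return TOOLS[call["name"]](**call.get("arguments", {}))
```

The model picks the tool and its arguments; the host application only validates the name and runs the matching function.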

Available tools

| Tool name | Description | Host |
|---|---|---|
| Web Search | Query the web and do some RAG on retrieved content against the user query | HuggingChat internal tool |
| URL Fetcher | Fetch text content from a given URL | HuggingChat internal tool |
| Document Parser | Parse content from PDF, text, csv, json and more | ZeroGPU Space |
| Image Generation | Generate images based on a given text prompt | ZeroGPU Space |
| Image Editing | Edit images based on a given text prompt | ZeroGPU Space |
| Calculator | A simple calculator for evaluating mathematical expressions | HuggingChat internal tool |
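As an illustration of what a calculator tool might look like, the sketch below evaluates arithmetic by walking Python's syntax tree instead of calling `eval`, so model-supplied input cannot run arbitrary code. This is an assumption about one reasonable design, not the code HuggingChat uses; the `evaluate` name and the set of supported operators are chosen for the example.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}

def evaluate(expression: str) -> float:
    """Evaluate +, -, *, / and parentheses over numeric literals only."""
    def _eval(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)
```

Function calls, names, and attribute access all fall through to the `ValueError`, which keeps the tool limited to plain arithmetic.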

How we choose tools

  • A tool must be a ZeroGPU Space that comes by default with exposed API endpoints.
  • Tools need to be fast (~25 seconds max) to ensure a good user experience.
  • In general, we prefer simple and fun tools (like a new model) over complex workflows that are harder to test and more likely to fail.
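The speed criterion above can be sketched as a time budget around each tool call. The example assumes a tool is an ordinary Python callable; `call_with_budget` and `slow_tool` are hypothetical names, and the 25-second default simply mirrors the figure quoted in the list.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def call_with_budget(tool, *args, budget_s: float = 25.0):
    """Return the tool's result, or None if it exceeds the time budget."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(tool, *args)
    try:
        return future.result(timeout=budget_s)
    except FutureTimeout:
        return None  # caller can tell the user the tool timed out
    finally:
        # Do not block on the worker; note the thread itself keeps running.
        pool.shutdown(wait=False)

def slow_tool():
    # Stand-in for a tool that takes too long.
    time.sleep(0.3)
    return "late"
```

A thread cannot be killed from outside, so this only stops waiting for the result. A tool that calls a remote endpoint would also want a request-level timeout.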

Do you have an idea for a tool to add or update directly on HuggingChat? Share your thoughts in this 👥 community discussion.

Next Steps

  • Use previously generated files with tools (probably)
  • Add tools to Community Assistants: Making it possible for users to add their own ZeroGPU Spaces as tools in their Assistants.
  • Add more official tools on a regular basis.
  • Improve existing tools.
  • Support more models (maybe starting with Llama-3)
  • Add multi-step Tool Use (aka Agents)
  • Add ability to reference previous files from the conversation.
  • Add extra tools at runtime via OpenAPI specification.
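For the last item, one way an OpenAPI document could be turned into tool definitions is sketched below. It is only a guess at the general shape: the output format (a `name`, a `description`, and a JSON-Schema-style `parameters` object) is a common function-calling convention assumed for this example, and `openapi_to_tools` is a hypothetical helper, not part of chat-ui.

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete"}

def openapi_to_tools(spec: dict) -> list[dict]:
    """Map each OpenAPI operation to a function-calling style tool definition."""
    tools = []
    for path, operations in spec.get("paths", {}).items():
        for method, op in operations.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            params = op.get("parameters", [])
            fallback_name = f"{method}_{path.strip('/').replace('/', '_')}"
            tools.append({
                "name": op.get("operationId", fallback_name),
                "description": op.get("summary", ""),
                "parameters": {
                    "type": "object",
                    "properties": {p["name"]: p.get("schema", {}) for p in params},
                    "required": [p["name"] for p in params if p.get("required")],
                },
            })
    return tools
```

This sketch ignores request bodies and `$ref` resolution, which a real converter would need to handle.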
victor changed discussion title from [FEATURE Tools] to [FEATURE] Tools

chat ui pauses

julien-c pinned discussion

chat ui pauses

https://huggingface.co/chat/ access it from here

Tried to do a Web search many times but I'm stuck with the loading icon and other tools seem to have different problems
Screenshot_2024-05-28-22-02-53-401_com.android.chrome.jpg

Screenshot_2024-05-28-22-07-29-152_com.android.chrome.jpg

Hugging Chat org

@Stefan171 Thanks for the report! Both issues should be fixed now, thanks to your screenshots!

@nsarrazin Pleasure. It's working now. Thanks for developing these tools.

deleted

How do we use the PDF parser?

deleted

Figured out how to use it, but PDF upload fails with error 413

What's the image

I am developing a Telegram bot that uses the Hugging Face API to provide global responses for an interactive game. I need to know whether the API has access to the "Tools Beta" feature, as this is critical to the functionality of our game. Alternatively, is there open-source code that would let me implement this directly on my own computer?

How do the tools work internally with prompting? I'd like to create something similar with an LLM assistant.

@handfuloftitty wth did you just send 😳

Hugging Chat org

How do the tools work internally with prompting? I'd like to create something similar with an LLM assistant.

Everything is open source: https://github.com/huggingface/chat-ui

Everything is open source: https://github.com/huggingface/chat-ui

I tried looking around but it's hard to find. Do you know where in the codebase the prompts are located?

I really enjoy using QwQ for some complex JSON content generation. However, I really do not like how it is handled in HuggingChat, which is why I still use Qwen's HF Space. There they treat QwQ just like any other LLM, simply showing the output without any fancy content handling.
In HuggingChat, this cool lil "reasoning" element shows up and tries to summarize what the model is currently writing, similar to what we see with... closed-source models. We do have the ability to click and see what it's writing, but at the end of the generation the content is seemingly summarized by some other AI. This is a really bad way to handle QwQ. It would have been different with Marco-o1, which separates thought from final output. QwQ does not do that.
I would MUCH prefer it to be handled like DeepSeek's R1 interface, where the thought output is shown as smaller text. OR! Just generate it as usual! Just let us see what the LLM writes, like with any other model, without any fancy UI to cover up its output!

As it stands, when trying to interact with QwQ like with o1, it simply does not work as expected. Saying "hi" generates a short response by QwQ, but the summarisation LLM doesn't know what to summarize and gets confused:
image.png
It starts generating some tips on how to improve writing.
This is not a good way to handle a reasoning model with the style of QwQ.
If you REALLY want to keep using this o1-like interface, sure, go ahead, but please consider applying some fancy prompting to make it respond with a "final message". This way, you can parse the output, and the moment it types # Final Answer you can start displaying the rest as the final message, as a replacement for the current summarisation.
Here is a custom system prompt I like to use on QwQ to make it respond that way:

You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step.
You must provide your final answer under the "# Final Answer" header. So your response MUST look like this:
'''
<Your thought process here>
# Final Answer
<Your final answer here>
'''

This is just the one I use, I'm sure y'all are better at prompting than I am.
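The parsing step suggested above could look roughly like this. It is a sketch that assumes the model follows the prompt and emits the `# Final Answer` header exactly once; `split_reasoning` is a hypothetical helper name, and treating header-less output as the final answer is a design choice made for the example.

```python
FINAL_HEADER = "# Final Answer"

def split_reasoning(output: str) -> tuple[str, str]:
    """Split model output into (thoughts, final_answer) at the header."""
    thoughts, found, answer = output.partition(FINAL_HEADER)
    if not found:
        # No header: show everything as the answer, like any other LLM.
        return "", output.strip()
    return thoughts.strip(), answer.strip()
```

With this fallback, a short reply such as "hi" is displayed as-is instead of being passed to a summarizer that has nothing to summarize.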

The best way to handle QwQ I think is to just handle it like any other LLM on the platform. No fancy handling, no "realtime thought process interpretation" for the fun little titles in the UI, just plain text output.

Also, I really like the new way tool calls are activated and deactivated now. Way easier to use than before, thank you for that one!
