Kaizhao Liang PRO

kz919

https://kyleliang919.github.io/

AI & ML interests

Search = AGI?

Recent Activity

liked a model 3 days ago

EleutherAI/pythia-1b-deduped

updated a model 5 days ago

sambanovasystems/QwQ-0.5B-SFT-Draft

liked a model 5 days ago

sambanovasystems/QwQ-0.5B-SFT-Draft

kz919's activity

commented a paper 2 months ago

Cautious Optimizers: Improving Training with One Line of Code

Paper • 2411.16085 • Published Nov 25, 2024 • 15

New activity in Qwen/Qwen2.5-Coder-0.5B-Instruct 2 months ago

Why does this one have different tokenizer from the 32B model?

#1 opened 2 months ago by

New activity in meta-llama/Llama-3.2-11B-Vision-Instruct 4 months ago

<|begin_of_text|> is added twice by the preprocessor

#44 opened 4 months ago by

New activity in sambanovasystems/Llama3.1-Instruct-O1 4 months ago

Added history and Better UI

#7 opened 4 months ago by

Added history and Better UI

#6 opened 4 months ago by

use gr chatbot

#5 opened 4 months ago by

downgrade openai version

#4 opened 4 months ago by

fix gradio demo issue and not use chatbot component

#3 opened 4 months ago by

update for gradio

#2 opened 4 months ago by

use gradio

#1 opened 4 months ago by

New activity in kz919/Persona-AI 5 months ago

The persona and quote boxes are editable. Please bring your own favorite waifu!

#2 opened 5 months ago by

New activity in mattshumer/Reflection-Llama-3.1-70B 5 months ago

Please, 8B version

#8 opened 5 months ago by

New activity in kz919/Persona-AI 5 months ago

Huggingface Space Infra broken

#1 opened 5 months ago by

New activity in xianbao/SambaNova-fast 5 months ago

Referral link for API key access! It's available now!

#1 opened 5 months ago by

New activity in nvidia/Llama-3.1-Minitron-4B-Depth-Base 5 months ago

Is the instruction tuned version going to be released?

#1 opened 5 months ago by

commented 2 papers 5 months ago

Memory-Efficient LLM Training with Online Subspace Descent

Paper • 2408.12857 • Published Aug 23, 2024 • 13

New activity in stabilityai/stable-video-diffusion-img2vid-xt 11 months ago

Text conditioning

#57 opened 11 months ago by

New activity in CultriX/NeuralTrix-7B-NO-INST 12 months ago

The INST problem seems to be replaced by source:

#1 opened 12 months ago by

New activity in CultriX/NeuralTrix-7B-dpo 12 months ago

The model is not performing well.

#1 opened 12 months ago by