Supervised Finetuning demonstration

Models are finetuned on generated conversation curated from the Open Assistant.

Mixing reward model with sampling

We can use reward model to rank the best answer using this example code:

import torch
from transformers import AutoModelForSequenceClassification
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-1.3b-base-finetuned/checkpoint-1000")
model = AutoModelForCausalLM.from_pretrained("facebook/galactica-1.3b-base-finetuned/checkpoint-1000").eval().half().cuda()

reward_name = "theblackcat102/electra-large-reward-model"
rank_model, rank_tokenizer = AutoModelForSequenceClassification.from_pretrained(reward_name), AutoTokenizer.from_pretrained(reward_name)
rank_model = rank_model.eval().half().cuda()

questions = ["<question>How do I make a resume?<answer>"]
for question in questions:
    inputs = tokenizer(question, return_tensors="pt", padding=True).to(0)
    if 'token_type_ids' in inputs:
        inputs.pop('token_type_ids')
    outputs = model.generate(**inputs, do_sample=True,
        top_k=60,
        max_length=220,
        num_return_sequences=80, 
        early_stopping=True
    )
    print(question)

    results = []
    for i, beam_output in enumerate(outputs):
        output = tokenizer.decode(beam_output, truncate_before_pattern=[r"\n\n^#", "^'''", "\n\n\n"])
        question, answer = output.split('<answer>', maxsplit=1)
        answer = answer.split('<question>')[0].replace('<|endoftext|>', '').lstrip().split('<answer>')[0]
        rank_inputs = rank_tokenizer(question, answer, return_tensors="pt", padding=True, max_length=512, truncation=True).to(1)
        score = rank_model(**rank_inputs).logits[0].cpu().detach()
        results.append((answer, score, output))
    full_results[question] = results
    sorted_result = sorted(results, key=lambda x:x[1], reverse=True)
    total_scores += sorted_result[0][1].item()
    print('score',sorted_result[0][1].item())
    print('-----Best rank-----')
    print(sorted_result[0][0])
    print('-------------------')

Checkout weights and biases report for training detail.

Thanks to BASIC lab for compute resource. BASIC Lab is an academic research lab which focuses in multi-modality learning and data mining domain.

Downloads last month
10
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Space using theblackcat102/galactica-1.3b-conversation-finetuned 1