xwm committed on
Commit cffcfa3 · verified · 1 Parent(s): 8fffc3c

Update README.md

Files changed (1)
  1. README.md +13 -3
README.md CHANGED
@@ -1,6 +1,16 @@
- # sciworld_workflow_compact_preference
+ ---
+ license: apache-2.0
+ tags:
+ - nlp
+ - agent
+ language:
+ - en
+ pipeline_tag: text-generation
+ ---
 
+ # SciWorld-MPO
+
- This model is a fine-tuned version of [LLaMA-Factory/saves/llama3.1-8b/sciworld_workflow_compact](https://huggingface.co/LLaMA-Factory/saves/llama3.1-8b/sciworld_workflow_compact) on the sciworld_workflow_compact_preference dataset.
+ This model is a fine-tuned version of Llama-3.1-8B-Instruct on the [sciworld-metaplan-preference-pairs](https://huggingface.co/datasets/xwm/Meta_Plan_Optimization/blob/main/sciworld_metaplan_preference_pairs.json) dataset.
  It achieves the following results on the evaluation set:
  - Loss: 1.5017
  - Rewards/chosen: -3.8774
@@ -52,4 +62,4 @@ The following hyperparameters were used during training:
  - Transformers 4.46.1
  - Pytorch 2.5.1+cu124
  - Datasets 3.1.0
- - Tokenizers 0.20.3
+ - Tokenizers 0.20.3