This is my reproduction of the Microsoft team's work, WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models. The training data was constructed entirely with open-source models, and the model was trained with supervised fine-tuning (SFT). I also reproduced the experimental results reported in the paper. In the first table below, my reproduction reaches an overall score of 41.7, compared with the 38.1 published in the paper, which supports the paper's claim that 'learning from expert battles' has great potential.

Original paper link: https://arxiv.org/pdf/2412.17395

I have also published the training data constructed during my reproduction in another repository, and everyone is welcome to use it: https://huggingface.co/datasets/HuggingMicah/warrior_reproduce

| Models | Matplotlib (155) | NumPy (220) | Pandas (291) | PyTorch (68) | SciPy (106) | Sklearn (115) | TensorFlow (45) | Overall (1000) |
|---|---|---|---|---|---|---|---|---|
| INCODER (6.7B) | 28.3 | 4.4 | 3.1 | 4.4 | 2.8 | 2.8 | 3.8 | 7.4 |
| CodeGen-Mono (16B) | 31.7 | 10.9 | 3.4 | 7.0 | 9.0 | 10.8 | 15.2 | 11.7 |
| Code-Cushman-001 | 40.7 | 21.8 | 7.9 | 12.4 | 11.3 | 18.0 | 12.2 | 18.1 |
| StarCoder (15B) | 51.7 | 29.7 | 11.4 | 21.4 | 20.2 | 29.5 | 24.5 | 26.0 |
| WizardCoder-SC (15B) | 55.2 | 33.6 | 16.7 | 26.2 | 24.2 | 24.9 | 26.7 | 29.2 |
| CodeLlama-Python (6.7B) | 55.3 | 34.5 | 16.4 | 19.9 | 22.3 | 17.6 | 28.5 | 28.0 |
| WizardCoder-CL (6.7B) | 53.5 | 34.4 | 15.2 | 25.7 | 21.0 | 24.5 | 28.9 | 28.4 |
| Magicoder-CL (6.7B) | 54.6 | 34.8 | 19.0 | 24.7 | 25.0 | 22.6 | 28.9 | 29.9 |
| MagicoderS-CL (6.7B) | 55.9 | 40.6 | 28.4 | 40.4 | 28.8 | 35.8 | 37.6 | 37.5 |
| WarriorCoder_published_in_paper (6.7B) | 55.5 | 41.8 | 26.1 | 41.2 | 33.0 | 39.1 | 42.2 | 38.1 |
| WarriorCoder_my_reproduce (6.7B) | 56.1 | 45.0 | 32.0 | 38.2 | 36.8 | 44.3 | 48.9 | 41.7 |
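
The Overall column in the table above appears to be the mean of the per-library scores weighted by the number of problems in each library (the counts in parentheses in the header), rather than a plain average of the seven columns. This is an assumption inferred from the numbers, not something the source states; the sketch below checks it against the two WarriorCoder rows. The function name `weighted_overall` is hypothetical.

```python
# Problem counts per library, copied from the table header (they sum to 1000).
counts = {
    "Matplotlib": 155, "NumPy": 220, "Pandas": 291, "PyTorch": 68,
    "SciPy": 106, "Sklearn": 115, "TensorFlow": 45,
}

# Per-library scores copied from the two WarriorCoder rows of the table.
published = {
    "Matplotlib": 55.5, "NumPy": 41.8, "Pandas": 26.1, "PyTorch": 41.2,
    "SciPy": 33.0, "Sklearn": 39.1, "TensorFlow": 42.2,
}
reproduced = {
    "Matplotlib": 56.1, "NumPy": 45.0, "Pandas": 32.0, "PyTorch": 38.2,
    "SciPy": 36.8, "Sklearn": 44.3, "TensorFlow": 48.9,
}


def weighted_overall(scores, counts):
    """Mean score weighted by the number of problems per library (assumed definition)."""
    total = sum(counts.values())
    return sum(scores[lib] * counts[lib] for lib in counts) / total


print(round(weighted_overall(published, counts), 1))   # 38.1
print(round(weighted_overall(reproduced, counts), 1))  # 41.7
```

Both values agree with the Overall column, so libraries with many problems (Pandas, NumPy) move the overall score more than small ones such as TensorFlow.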

| Models | HumanEval | HumanEval+ | MBPP | MBPP+ |
|---|---|---|---|---|
| WizardCoder-CL (6.7B) | 48.7 | 40.5 | 56.4 | 47.0 |
| WizardCoder-SC (15B) | 51.4 | 45.3 | 61.6 | 50.7 |
| Magicoder-CL (6.7B) | 60.4 | 55.7 | 64.2 | 52.5 |
| MagicoderS-CL (6.7B) | 70.7 | 66.4 | 68.3 | 56.4 |
| WarriorCoder (6.7B) | 79.9 | 75.4 | 75.8 | 64.5 |