nguyenbh committed (verified) · Commit 4927696 · 1 Parent(s): 1ea5d67

Update Readme

Files changed (1): README.md (+6 -1)
README.md CHANGED

@@ -409,7 +409,7 @@ model = AutoModelForCausalLM.from_pretrained(
     device_map="cuda",
     torch_dtype="auto",
     trust_remote_code=True,
-    # if you do not Ampere or later GPUs, change attention to "eager"
+    # if you do not use Ampere or later GPUs, change attention to "eager"
     _attn_implementation='flash_attention_2',
 ).cuda()

@@ -573,8 +573,12 @@ The model is licensed under the [MIT license](./LICENSE).
 ## Trademarks
 This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow [Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks). Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

+
 ## Appendix A: Benchmark Methodology

+<details>
+<summary>Click to view detail descriptions</summary>
+
 We include a brief word on methodology here - and in particular, how we think about optimizing prompts.
 In an ideal world, we would never change any prompts in our benchmarks to ensure it is always an apples-to-apples comparison when comparing different models. Indeed, this is our default approach, and is the case in the vast majority of models we have run to date.
 There are, however, some exceptions to this. In some cases, we see a model that performs worse than expected on a given eval due to a failure to respect the output format. For example:

@@ -650,3 +654,4 @@ The model was evaluated across a breadth of public and internal benchmarks to un
 + Toxigen: Toxigen is adversarial and hate speech detection
 + Red Team:
   + Responses to prompts provided by AI Red Team at Microsoft
+</details>
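The comment corrected in the first hunk notes that Flash Attention 2 requires an Ampere-or-later GPU, with `"eager"` as the fallback. As a minimal sketch (the helper name `pick_attn_implementation` is illustrative, not part of the repository), that choice could be derived from the CUDA compute capability reported by `torch.cuda.get_device_capability()`, since Flash Attention 2 needs SM 8.0 (Ampere) or newer:

```python
def pick_attn_implementation(compute_capability):
    """Choose a value for from_pretrained(..., _attn_implementation=...).

    compute_capability: a (major, minor) tuple, e.g. as returned by
    torch.cuda.get_device_capability(). Flash Attention 2 requires
    Ampere (SM 8.0) or newer; older GPUs fall back to "eager".
    """
    major, _minor = compute_capability
    return "flash_attention_2" if major >= 8 else "eager"


# Turing (7, 5) predates Ampere, so it falls back; Ampere (8, 0) qualifies.
print(pick_attn_implementation((7, 5)))  # -> eager
print(pick_attn_implementation((8, 0)))  # -> flash_attention_2
```

On a CUDA machine the tuple would come from `torch.cuda.get_device_capability()`; the helper keeps the GPU check separate so the rest of the loading code stays unchanged.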