fix colon
rishibommasani committed Apr 9, 2024
1 parent e5700cd commit dfea0f7
Showing 1 changed file with 61 additions and 26 deletions.
assets/bigcode.yaml: 87 changes (61 additions, 26 deletions)
@@ -13,16 +13,18 @@
size: 15.5B parameters (dense)
dependencies: [The Stack]
training_emissions: 16.68 tons of CO2eq
training_time: StarCoderBase: 320,256 GPU hours
training_time: 320,256 GPU hours
training_hardware: 512 A100 80GB GPUs distributed across 64 nodes
quality_control: No specific quality control is mentioned for model training, though
details on data processing and how the tokenizer was trained are provided in
the paper.
access: open
license: BigCode Open RAIL-M v1.0
intended_uses: As a foundation model to fine-tune and create more specialized models that support use cases such as code completion, fill-in-the-middle, and text summarization. Can also be used with the Tech Assistant prompt, but it is not an instruction model given training limitations.
prohibited_uses: 'See BigCode Open RAIL-M license and FAQ'
intended_uses: As a foundation model to fine-tune and create more specialized
models that support use cases such as code completion, fill-in-the-middle, and
text summarization. Can also be used with the Tech Assistant prompt, but it is
not an instruction model given training limitations.
prohibited_uses: See BigCode Open RAIL-M license and FAQ
monitoring: ''
feedback: https://huggingface.co/bigcode/starcoder/discussions
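
The commit title "fix colon" refers to the training_time change above: the old value "StarCoderBase: 320,256 GPU hours" is a bare scalar containing a second colon, which a YAML parser rejects as an illegal nested mapping. A minimal sketch of the failure and the fix using PyYAML (the snippet is illustrative, not part of the repository):

import yaml

broken = "training_time: StarCoderBase: 320,256 GPU hours\n"
fixed = "training_time: 320,256 GPU hours\n"

try:
    yaml.safe_load(broken)
except yaml.YAMLError as err:
    # PyYAML rejects the bare nested colon (a "mapping values are not allowed..." error).
    print("broken entry:", err)

print(yaml.safe_load(fixed))  # {'training_time': '320,256 GPU hours'}

Quoting the whole value ('StarCoderBase: 320,256 GPU hours') would also parse; the commit instead drops the StarCoderBase prefix.
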
- type: model
@@ -32,31 +34,39 @@
analysis on GitHub stars' association with data quality.
created_date: 2023-02-24
url: https://arxiv.org/pdf/2301.03988.pdf
model_card: 'https://huggingface.co/bigcode/santacoder'
model_card: https://huggingface.co/bigcode/santacoder
modality: code; code
analysis: Evaluated on MultiPL-E benchmarks.
size: 1.1B parameters (dense)
dependencies: [The Stack, BigCode Dataset]
training_emissions: '124 kg of CO2eq'
training_emissions: 124 kg of CO2eq
training_time: 14,284 GPU hours
training_hardware: 96 NVIDIA Tesla V100 GPUs
quality_control: ''
access: open
license: BigCode Open RAIL-M v1
intended_uses: 'The model was trained on GitHub code. As such it is not an instruction model and commands like "Write a function that computes the square root." do not work well. You should phrase commands like they occur in source code such as comments (e.g. # the following function computes the sqrt) or write a function signature and docstring and let the model complete the function body.'
prohibited_uses: 'See BigCode Open RAIL-M license and FAQ'
intended_uses: 'The model was trained on GitHub code. As such it is not an instruction
model and commands like "Write a function that computes the square root." do
not work well. You should phrase commands like they occur in source code such
as comments (e.g. # the following function computes the sqrt) or write a function
signature and docstring and let the model complete the function body.'
prohibited_uses: See BigCode Open RAIL-M license and FAQ
monitoring: ''
feedback: 'https://huggingface.co/bigcode/santacoder/discussions'
feedback: https://huggingface.co/bigcode/santacoder/discussions
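
The SantaCoder intended_uses note above says to phrase requests as source code (a comment, or a signature plus docstring) rather than as natural-language instructions. A rough sketch of that usage with Hugging Face transformers; the prompt and generation settings are illustrative, and the trust_remote_code flag follows the model card since the checkpoint ships custom modeling code:

from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/santacoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

# Phrase the request as code, not as a natural-language instruction.
prompt = (
    "def sqrt_newton(x: float) -> float:\n"
    "    \"\"\"Compute the square root of x with Newton's method.\"\"\"\n"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The model completes the function body from the signature and docstring; as the entry notes, an imperative prompt such as "Write a function that computes the square root." tends to work poorly.
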
- type: dataset
name: The Stack
organization: BigCode
description: The Stack contains over 6TB of permissively-licensed source code files covering 358 programming languages. The Stack serves as a pre-training dataset for Code LLMs, i.e., code-generating AI systems which enable the synthesis of programs from natural language descriptions as well as from other code snippets.
description: The Stack contains over 6TB of permissively-licensed source code
files covering 358 programming languages. The Stack serves as a pre-training
dataset for Code LLMs, i.e., code-generating AI systems which enable the synthesis
of programs from natural language descriptions as well as from other code snippets.
created_date: 2022-11-20
url: https://arxiv.org/pdf/2211.15533.pdf
datasheet: https://huggingface.co/datasets/bigcode/the-stack
modality: code
size: 6 TB
sample: [https://huggingface.co/datasets/bigcode/the-stack/viewer/default/train]
sample:
- https://huggingface.co/datasets/bigcode/the-stack/viewer/default/train
analysis: Evaluated models trained on The Stack on HumanEval and MBPP and compared
against similarly-sized models.
dependencies: [GitHub]
@@ -65,16 +75,20 @@
quality_control: allowed users whose data were included in The Stack to opt out
access: open
license: The Stack is a collection of source code from repositories with various licenses. Any use of all or part of the code gathered in The Stack must abide by the terms of the original licenses, including attribution clauses when relevant. Provenance information is provided for each data point.
license: The Stack is a collection of source code from repositories with various
licenses. Any use of all or part of the code gathered in The Stack must abide
by the terms of the original licenses, including attribution clauses when relevant.
Provenance information is provided for each data point.
intended_uses: creating code LLMs
prohibited_uses: 'See https://huggingface.co/datasets/bigcode/the-stack'
prohibited_uses: See https://huggingface.co/datasets/bigcode/the-stack
monitoring: ''
feedback: 'https://huggingface.co/datasets/bigcode/the-stack/discussions'
feedback: https://huggingface.co/datasets/bigcode/the-stack/discussions
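
Because The Stack ships per-file license and provenance metadata (see the license and quality_control fields above), downstream users can stream a single language subset and keep that metadata for attribution. A rough sketch with the datasets library; the data_dir layout follows the dataset card, access may require accepting the dataset's terms on the Hub, and the metadata column names are assumptions to verify against the datasheet:

from datasets import load_dataset

# Stream one language subset rather than downloading the full ~6 TB.
ds = load_dataset(
    "bigcode/the-stack",
    data_dir="data/python",
    split="train",
    streaming=True,
)

for example in ds.take(3):
    # Column names are assumptions; check the datasheet for the exact schema.
    print(example.get("max_stars_repo_name"), example.get("max_stars_repo_licenses"))
    print(example.get("content", "")[:120])
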
- type: model
name: StarCoder2-15B
organization: BigCode
description: StarCoder2-15B is a 15B parameter model trained on 600+ programming languages from The Stack v2, with opt-out requests excluded. The training was carried out using the Fill-in-the-Middle objective on 4+ trillion tokens.
description: StarCoder2-15B is a 15B parameter model trained on 600+ programming
languages from The Stack v2, with opt-out requests excluded. The training was
carried out using the Fill-in-the-Middle objective on 4+ trillion tokens.
created_date: 2024-02-28
url: https://www.servicenow.com/company/media/press-room/huggingface-nvidia-launch-starcoder2.html
model_card: https://huggingface.co/bigcode/starcoder2-15b
@@ -90,15 +104,22 @@
came from to apply the proper attribution.
access: open
license: BigCode OpenRail-M
intended_uses: The model was trained on GitHub code as well as additional selected data sources such as arXiv and Wikipedia. As such it is not an instruction model and commands like "Write a function that computes the square root." do not work well. Intended to generate code snippets from a given context, but not for writing actual functional code directly.
prohibited_uses: 'See BigCode Open RAIL-M license and FAQ'
intended_uses: The model was trained on GitHub code as well as additional selected
data sources such as arXiv and Wikipedia. As such it is not an instruction model
and commands like "Write a function that computes the square root." do not work
well. Intended to generate code snippets from a given context, but not for writing
actual functional code directly.
prohibited_uses: See BigCode Open RAIL-M license and FAQ
monitoring: unknown
feedback: https://huggingface.co/bigcode/starcoder2-15b/discussions
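
The StarCoder2-15B description above notes training with the Fill-in-the-Middle objective. In the StarCoder family, infilling is driven by FIM sentinel tokens in the prompt rather than a separate API; the rough sketch below assumes the usual <fim_prefix>/<fim_suffix>/<fim_middle> token names, which should be checked against the model's tokenizer. Loading a 15B checkpoint also needs substantial GPU memory, and device_map="auto" requires the accelerate package:

from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-15b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")

prefix = "def mean(xs: list[float]) -> float:\n    return "
suffix = " / len(xs)\n"
# Fill-in-the-middle: the model generates the missing span after <fim_middle>.
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
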
- type: model
name: StarCoder2-7B
organization: BigCode
description: StarCoder2-7B is a 7B parameter model trained on 17 programming languages from The Stack v2, with opt-out requests excluded. The model uses Grouped Query Attention, a context window of 16,384 tokens with sliding window attention of 4,096 tokens, and was trained using the Fill-in-the-Middle objective on 3.5+ trillion tokens.
description: StarCoder2-7B is a 7B parameter model trained on 17 programming
languages from The Stack v2, with opt-out requests excluded. The model uses
Grouped Query Attention, a context window of 16,384 tokens with sliding window
attention of 4,096 tokens, and was trained using the Fill-in-the-Middle objective
on 3.5+ trillion tokens.
created_date: 2024-02-28
url: https://www.servicenow.com/company/media/press-room/huggingface-nvidia-launch-starcoder2.html
model_card: https://huggingface.co/bigcode/starcoder2-7b
@@ -115,22 +136,31 @@
access: open
license: BigCode OpenRail-M
intended_uses: Intended to generate code snippets from a given context, but not
for writing actual functional code directly. The model has been trained on source code from 17 programming languages. The predominant language in source code is English, although other languages are also present. As such the model is capable of generating code snippets provided some context, but the generated code is not guaranteed to work as intended. It can be inefficient and contain bugs or exploits. See the paper for an in-depth discussion of the model's limitations.
prohibited_uses: 'See BigCode Open RAIL-M license and FAQ'
for writing actual functional code directly. The model has been trained on source
code from 17 programming languages. The predominant language in source code is
English, although other languages are also present. As such the model is capable
of generating code snippets provided some context, but the generated code is not
guaranteed to work as intended. It can be inefficient and contain bugs or exploits.
See the paper for an in-depth discussion of the model's limitations.
prohibited_uses: See BigCode Open RAIL-M license and FAQ
monitoring: unknown
feedback: https://huggingface.co/bigcode/starcoder2-7b/discussions
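
The StarCoder2-7B description above cites Grouped Query Attention and a 16,384-token context with a 4,096-token sliding window. Those choices should be visible in the published config; the attribute names below are assumptions based on the Starcoder2 configuration class in recent transformers releases and are worth verifying:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("bigcode/starcoder2-7b")

# Attribute names are assumptions based on the Starcoder2 config class.
print("attention heads:", config.num_attention_heads)
print("key/value heads:", config.num_key_value_heads)      # fewer than attention heads => grouped-query attention
print("sliding window:", config.sliding_window)             # expected 4096 per the description
print("context length:", config.max_position_embeddings)    # expected 16384 per the description
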
- type: model
name: StarCoder2-3B
organization: BigCode
description: StarCoder2-3B is a 3B parameter model trained on 17 programming languages from The Stack v2, with opt-out requests excluded. The model uses Grouped Query Attention, a context window of 16,384 tokens with sliding window attention of 4,096 tokens, and was trained using the Fill-in-the-Middle objective on 3+ trillion tokens.
description: StarCoder2-3B is a 3B parameter model trained on 17 programming
languages from The Stack v2, with opt-out requests excluded. The model uses
Grouped Query Attention, a context window of 16,384 tokens with sliding window
attention of 4,096 tokens, and was trained using the Fill-in-the-Middle objective
on 3+ trillion tokens.
created_date: 2024-02-28
url: https://www.servicenow.com/company/media/press-room/huggingface-nvidia-launch-starcoder2.html
model_card: https://huggingface.co/bigcode/starcoder2-3b
modality: code; text
analysis: See https://arxiv.org/pdf/2402.19173.pdf
size: 3B parameters (dense)
dependencies: [The Stack v2]
training_emissions: 16,107.01 kgCO2eq
training_time: 97,120 hours (cumulative)
training_hardware: 160 A100 GPUs
quality_control: The model was filtered for permissive licenses and code with
@@ -139,7 +169,12 @@
access: open
license: BigCode OpenRail-M
intended_uses: Intended to generate code snippets from a given context, but not
for writing actual functional code directly. The model has been trained on source code from 17 programming languages. The predominant language in source code is English, although other languages are also present. As such the model is capable of generating code snippets provided some context, but the generated code is not guaranteed to work as intended. It can be inefficient and contain bugs or exploits. See the paper for an in-depth discussion of the model's limitations.
prohibited_uses: 'See BigCode Open RAIL-M license and FAQ'
for writing actual functional code directly. The model has been trained on source
code from 17 programming languages. The predominant language in source code is
English, although other languages are also present. As such the model is capable
of generating code snippets provided some context, but the generated code is not
guaranteed to work as intended. It can be inefficient and contain bugs or exploits.
See the paper for an in-depth discussion of the model's limitations.
prohibited_uses: See BigCode Open RAIL-M license and FAQ
monitoring: unknown
feedback: https://huggingface.co/bigcode/starcoder2-3b/discussions
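
As a rough sanity check on the StarCoder2-3B training figures above, reading the 97,120 cumulative hours as GPU-hours (an assumption) across the listed 160 A100 GPUs gives the approximate wall-clock training time:

# Assumes "97,120 hours (cumulative)" means GPU-hours summed over all devices.
gpu_hours = 97_120
num_gpus = 160
wall_clock_hours = gpu_hours / num_gpus  # 607.0 hours
print(wall_clock_hours, "hours ~", round(wall_clock_hours / 24, 1), "days")  # ~25.3 days
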
