clean up changes
jxue16 committed Sep 28, 2024
1 parent 2195c62 commit 3517dc0
Showing 22 changed files with 83 additions and 172 deletions.
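Most of the edits in this commit are mechanical cleanups of the asset metadata: the `access` field is normalized to lowercase (`Open` → `open`), placeholder `model_card`, `license`, and `size` values are replaced with concrete ones, and benchmark names that had been listed under `dependencies` are dropped. As a rough illustration, a minimal sketch of the lowercase-normalization step is shown below; the `assets/` glob and the script itself are assumptions for illustration and are not part of this commit.

```python
# Illustrative sketch (not part of this commit): normalize the `access` field
# to lowercase across the asset YAML files, mirroring the `Open` -> `open`
# edits in this diff.
import pathlib
import re

ASSETS_DIR = pathlib.Path("assets")  # assumed repository layout
ACCESS_RE = re.compile(r"^(\s*access:\s*)(\S+)\s*$")

for path in sorted(ASSETS_DIR.glob("*.yaml")):
    lines = path.read_text(encoding="utf-8").splitlines()
    changed = False
    for i, line in enumerate(lines):
        match = ACCESS_RE.match(line)
        if match and match.group(2) != match.group(2).lower():
            lines[i] = match.group(1) + match.group(2).lower()
            changed = True
    if changed:
        path.write_text("\n".join(lines) + "\n", encoding="utf-8")
        print(f"normalized access field in {path}")
```

A line-based rewrite is sketched here rather than a YAML load/dump round trip so that the hand-written layout of the asset files would be preserved; fields that need judgment, such as licenses, model cards, and parameter counts, appear as individual edits in the diff below.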
4 changes: 2 additions & 2 deletions assets/ai21.yaml
@@ -319,7 +319,7 @@
monitoring: ''
feedback: https://huggingface.co/ai21labs/Jamba-v0.1/discussions
- type: model
-name: Jamba 1.5 Open Model Family (Jamba 1.5 Mini, Jamba 1.5 Large)
+name: Jamba 1.5
organization: AI21
description: A family of models that demonstrate superior long context handling,
speed, and quality. Built on a novel SSM-Transformer architecture, they surpass
@@ -342,7 +342,7 @@
and Jamba 1.5 Large used 8xA100 80GB GPUs.
quality_control: The models were evaluated on the Arena Hard benchmark. For maintaining
long context performance, they were tested on the RULER benchmark.
-access: Open
+access: open
license: Jamba Open Model License
intended_uses: The models are built for enterprise scale AI applications. They
are purpose-built for efficiency, speed, and ability to solve critical tasks
2 changes: 1 addition & 1 deletion assets/aleph_alpha.yaml
@@ -127,7 +127,7 @@
quality_control: The model comes with additional safety guardrails via alignment
methods to ensure safe usage. Training data is carefully curated to ensure compliance
with EU and national regulations.
-access: Open
+access: open
license: Aleph Open
intended_uses: The model is intended for use in domain-specific applications,
particularly in the automotive and engineering industries. It can also be tailored
7 changes: 3 additions & 4 deletions assets/anthropic.yaml
@@ -625,16 +625,15 @@
quality_control: The model underwent a red-teaming assessment, and has been tested
and refined by external experts. It was also provided to the UK's AI Safety
Institute (UK AISI) for a pre-deployment safety evaluation.
-access: Open
+access: open
license: unknown
intended_uses: The model is intended for complex tasks such as context-sensitive
customer support, orchestrating multi-step workflows, interpreting charts and
graphs, transcribing text from images, as well as writing, editing, and executing
code.
prohibited_uses: Misuse of the model is discouraged though specific use cases
are not mentioned.
-monitoring: Unknown
-of misuse, and policy feedback from external experts has been integrated to
-ensure robustness of evaluations.
+monitoring: Unknown of misuse, and policy feedback from external experts has been
+integrated to ensure robustness of evaluations.
feedback: Feedback on Claude 3.5 Sonnet can be submitted directly in-product to
inform the development roadmap and improve user experience.
3 changes: 1 addition & 2 deletions assets/aspia_space,_institu.yaml
@@ -14,12 +14,11 @@
url: https://arxiv.org/pdf/2405.14930v1
model_card: unknown
modality: image; image
-provide insights)
analysis: The models’ performance on downstream tasks was evaluated by linear
probing. The models follow a similar saturating log-log scaling law to textual
models, their performance improves with the increase in model size up to the
saturation point of parameters.
-size: Ranges from 1 million to 2.1 billion parameters.
+size: 2.1B parameters
dependencies: [DESI Legacy Survey DR8]
training_emissions: Unknown
training_time: Unknown
12 changes: 5 additions & 7 deletions assets/evolutionaryscale.yaml
@@ -12,25 +12,23 @@
created_date: 2024-06-25
url: https://www.evolutionaryscale.ai/blog/esm3-release
model_card: unknown
-modality: text; text
-textual descriptions of proteins as outputs)
+modality: text; image, text
analysis: The model was tested in the generation of a new green fluorescent protein.
Its effectiveness was compared to natural evolutionary processes, and it was
deemed to simulate over 500 million years of evolution.
size: 98B parameters (Dense)
-dependencies: [ESM2(base model), largest dataset of proteins]
+dependencies: []
training_emissions: Unknown
training_time: Unknown
-training_hardware: One of the highest throughput GPU clusters in the world.
+training_hardware: unknown
quality_control: The creators have put in place a responsible development framework
to ensure transparency and accountability from the start. ESM3 was tested in
the generation of a new protein, ensuring its quality and effectiveness.
-access: Open
+access: open
license: Unknown
intended_uses: To engineer biology from first principles. It functions as a tool
for scientists to create proteins for various applications, including medicine,
biology research, and clean energy.
prohibited_uses: Unknown
-monitoring: Unknown
-though specific measures are not specified.
+monitoring: Unknown though specific measures are not specified.
feedback: Unknown
8 changes: 4 additions & 4 deletions assets/team_glm,_zhipu_ai,_.yaml → assets/glm.yaml
@@ -12,22 +12,22 @@
which tool(s) to use.
created_date: 2023-07-02
url: https://arxiv.org/pdf/2406.12793
-model_card: unknown
+model_card: https://huggingface.co/THUDM/glm-4-9b
modality: text; text
analysis: Evaluations show that GLM-4, 1) closely rivals or outperforms GPT-4
in terms of general metrics such as MMLU, GSM8K, MATH, BBH, GPQA, and HumanEval,
2) gets close to GPT-4-Turbo in instruction following as measured by IFEval,
3) matches GPT-4 Turbo (128K) and Claude 3 for long context tasks, and 4) outperforms
GPT-4 in Chinese alignments as measured by AlignBench.
-size: From 6.2 billion parameters to 9 billion parameters and 130 billion parameters.
-dependencies: [GPT models, GLM-10B, GLM-130B]
+size: 9B parameters
+dependencies: []
training_emissions: Unknown
training_time: Unknown
training_hardware: Unknown
quality_control: High-quality alignment is achieved via a multi-stage post-training
process, which involves supervised fine-tuning and learning from human feedback.
access: Open
-license: Unknown
+license: Apache 2.0
intended_uses: General language modeling, complex tasks like accessing online
information via web browsing and solving math problems using Python interpreter.
prohibited_uses: Unknown
2 changes: 1 addition & 1 deletion assets/google.yaml
@@ -1870,7 +1870,7 @@
2. The model has improvements in safety and efficiency over the first generation.
The deployment of Gemma 2 on Vertex AI, scheduled for the next month, will offer
effortless management of the model.
-access: Open
+access: open
license: Gemma (commercially-friendly license given by Google DeepMind)
intended_uses: Gemma 2 is designed for developers and researchers for various
AI tasks. It can be used via the integrations it offers with other AI tools/platforms
5 changes: 2 additions & 3 deletions assets/laion_e.v..yaml
@@ -25,15 +25,14 @@
quality_control: The model utilized lists of link and image hashes provided by
partner organizations. These were used to remove inappropriate links from the
original LAION-5B dataset to create Re-LAION-5B.
-access: Open
+access: open
license: Apache 2.0
intended_uses: Re-LAION-5B is designed for research on language-vision learning.
It can also be used by third parties to clean existing derivatives of LAION-5B
by generating diffs and removing all matched content from their versions.
prohibited_uses: The dataset should not be utilized for purposes that breach legal
parameters or ethical standards, such as dealing with illegal content.
-monitoring: This version is a response to continuous scrutiny & safety revisions.
-It's also meant to allow inspection and validation by a broad community.
+monitoring: unknown
feedback: Problems with the dataset should be reported to the LAION organization.
They have open lines for communication with their partners and the broader research
community.
9 changes: 2 additions & 7 deletions assets/lg_ai_research.yaml
@@ -16,19 +16,14 @@
EXAONE 3.0 was competitive in English and excellent in Korean compared to other
large language models of a similar size.
size: 7.8B parameters (dense)
-dependencies:
-- GQA
-- SwiGLU
-- Rotary Position Embeddings
-- MeCab
-- BBPE
+dependencies: [MeCab]
training_emissions: Unknown
training_time: Unknown
training_hardware: Unknown
quality_control: Extensive pre-training on a diverse dataset, and advanced post-training
techniques were employed to enhance instruction-following capabilities. The
model was also trained to fully comply with data handling standards.
-access: Open
+access: open
license: Unknown
intended_uses: The model was intended for non-commercial and research purposes.
The capabilities of the model allow for use cases that involve advanced AI and
4 changes: 2 additions & 2 deletions assets/meta.yaml
@@ -865,8 +865,8 @@
text summarization, multilingual conversational agents, and coding assistants.
It is the largest and most capable openly available foundation model.
created_date: 2024-07-23
-url: https://ai.meta.com/blog/meta-llama-3-1/?utm_source=twitter&utm_medium=organic_social&utm_content=video&utm_campaign=llama31
-model_card: unknown
+url: https://ai.meta.com/blog/meta-llama-3-1/
+model_card: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md
modality: text; text
analysis: The model was evaluated on over 150 benchmark datasets that span a wide
range of languages. An experimental evaluation suggests that the model is competitive
10 changes: 5 additions & 5 deletions assets/microsoft.yaml
@@ -1003,23 +1003,23 @@
created_date: 2024-09-08
url: https://huggingface.co/microsoft/Phi-3.5-MoE-instruct
model_card: https://huggingface.co/microsoft/Phi-3.5-MoE-instruct
-modality: Unknown
+modality: text; text
analysis: The model was evaluated across a variety of public benchmarks, comparing
with a set of models including Mistral-Nemo-12B-instruct-2407, Llama-3.1-8B-instruct,
Gemma-2-9b-It, Gemini-1.5-Flash, and GPT-4o-mini-2024-07-18. It achieved a similar
level of language understanding and math as much larger models. It also displayed
superior performance in reasoning capability, even with only 6.6B active parameters.
It was also evaluated for multilingual tasks.
-size: 6.6B active parameters
-dependencies: [Phi-3]
+size: 61B parameters (sparse); 6.6B active parameters
+dependencies: [Phi-3 dataset]
training_emissions: Unknown
training_time: Unknown
training_hardware: Unknown
quality_control: The model was enhanced through supervised fine-tuning, proximal
policy optimization, and direct preference optimization processes for safety
measures.
-access: Open
-license: Unknown
+access: open
+license: MIT
intended_uses: The model is intended for commercial and research use in multiple
languages. It is designed to accelerate research on language and multimodal
models, and for use as a building block for generative AI powered features.
6 changes: 3 additions & 3 deletions assets/mistral.yaml
@@ -122,7 +122,7 @@
quality_control: The model underwent an advanced fine-tuning and alignment phase.
Various measures such as accuracy comparisons with other models and instruction-tuning
were implemented to ensure its quality.
-access: Open
+access: open
license: Apache 2.0
intended_uses: The model can be used for multilingual applications, understanding
and generating natural language as well as source code, handling multi-turn
@@ -153,7 +153,7 @@
training_time: Unknown
training_hardware: Unknown
quality_control: Unknown
-access: Open
+access: open
license: Apache 2.0
intended_uses: The model is intended for code generation and can be utilized as
a local code assistant.
@@ -172,7 +172,7 @@
created_date: 2024-07-16
url: https://mistral.ai/news/mathstral/
model_card: unknown
-modality: Text-to-text (presumed based on description)
+modality: text; text
analysis: The model's performance has been evaluated on the MATH and MMLU industry-standard
benchmarks. It scored notably higher on both these tests than the base model
Mistral 7B.
20 changes: 4 additions & 16 deletions assets/qwen_team.yaml
@@ -9,32 +9,20 @@
in terms of mathematical capabilities.
created_date: 2024-08-08
url: https://qwenlm.github.io/blog/qwen2-math/
-model_card: unknown
+model_card: https://huggingface.co/Qwen/Qwen2-Math-72B
modality: text; text
analysis: Models have been evaluated on a series of math benchmarks, demonstrating
outperformance of the state-of-the-art models in both the English and Chinese
language.
-size: The size of the largest model in the Qwen2-Math series is 72B parameters.
-dependencies:
-- GSM8K
-- Math
-- MMLU-STEM
-- CMATH
-- GaoKao Math Cloze
-- GaoKao Math QA
-- OlympiadBench
-- CollegeMath
-- GaoKao
-- AIME2024
-- AMC2023
-- CN Middle School 24
+size: 72B parameters
+dependencies: []
training_emissions: Unknown
training_time: Unknown
training_hardware: Unknown
quality_control: The models were tested with few-shot chain-of-thought prompting
and evaluated across mathematical benchmarks in both English and Chinese.
access: open
-license: Unknown
+license: Tongyi Qianwen
intended_uses: These models are intended for solving complex mathematical problems.
prohibited_uses: Uses that go against the ethical usage policies of Qwen Team.
monitoring: Unknown
35 changes: 0 additions & 35 deletions assets/roblox.yaml

This file was deleted.

27 changes: 0 additions & 27 deletions assets/samba.yaml
@@ -57,30 +57,3 @@
prohibited_uses: ''
monitoring: unknown
feedback: none
-- type: model
-name: sarvam-2b
-organization: sarvamAI
-description: This is an early checkpoint of sarvam-2b, a small, yet powerful language
-model pre-trained from scratch on 2 trillion tokens. It is designed to be proficient
-in 10 Indic languages (Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi,
-Oriya, Punjabi, Tamil, and Telugu) + English.
-created_date: 2024-08-15
-url: https://huggingface.co/sarvamai/sarvam-2b-v0.5
-model_card: https://huggingface.co/sarvamai/sarvam-2b-v0.5
-modality: text; text
-analysis: Analysis for the model is not yet provided; however, it has been reported
-that more technical details like evaluations and benchmarking will be posted
-soon.
-size: Unknown
-dependencies: []
-training_emissions: Unknown
-training_time: Unknown
-training_hardware: NVIDIA NeMo™ Framework, Yotta Shakti Cloud, HGX H100 systems.
-quality_control: Unknown
-access: Open
-license: Unknown
-intended_uses: The model can be used for text completion and supervised fine-tuning,
-particularly in the languages it was trained on.
-prohibited_uses: Unknown
-monitoring: Unknown
-feedback: Unknown
13 changes: 6 additions & 7 deletions assets/stability_ai.yaml
@@ -18,14 +18,13 @@
dependencies: []
training_emissions: Unknown
training_time: Unknown
-training_hardware: NVIDIA RTX GPUs, TensorRT, AMD’s APUs, consumer GPUs and MI-300X
-Enterprise GPUs
+training_hardware: unknown
quality_control: They have conducted extensive internal and external testing of
this model and have implemented numerous safeguards to prevent harms. Safety
measures were implemented from the start of training the model and continued
throughout testing, evaluation, and deployment.
access: open
-license: Stability Non-Commercial Research Community License
+license: Stability Community License
intended_uses: The model can be used by professional artists, designers, developers,
and AI enthusiasts for creating high-quality image outputs from text inputs.
prohibited_uses: Large-scale commercial use requires contacting the organization
@@ -63,7 +62,7 @@
working to refine and optimize the model beyond the current synthetic datasets
it has been trained on.
access: open
-license: Stable AI License
+license: Stability Community License
intended_uses: This model can be used for creating dynamic multi-angle videos,
with applications in game development, video editing, and virtual reality. It
allows professionals in these fields to visualize objects from multiple angles,
@@ -88,22 +87,22 @@
professions.
created_date: 2024-08-01
url: https://stability.ai/news/introducing-stable-fast-3d
-model_card: unknown
+model_card: https://huggingface.co/stabilityai/stable-fast-3d
modality: image; 3D
analysis: The model was evaluated on its ability to quickly and accurately transform
a single image into a detailed 3D asset. This evaluation highlighted the model's
unprecedented speed and quality, marking it as a valuable tool for rapid prototyping
in 3D work. Compared to the previous SV3D model, Stable Fast 3D offers significantly
reduced inference times--0.5 seconds versus 10 minutes--while maintaining high-quality
output.
-size: Unknown
+size: unknown
dependencies: [TripoSR]
training_emissions: Unknown
training_time: Unknown
training_hardware: unknown
quality_control: Unknown
access: open
-license: Stability AI Community
+license: Stability Community License
intended_uses: The model is intended for use in game development, virtual reality,
retail, architecture, design and other graphically intense professions. It allows
for rapid prototyping in 3D work, assisting both enterprises and indie developers.
2 changes: 1 addition & 1 deletion assets/stanford.yaml
@@ -156,7 +156,7 @@
created_date: 2024-09-08
url: https://arxiv.org/pdf/2406.06512
model_card: unknown
-modality: Image; text
+modality: image; text
analysis: Merlin has been comprehensively evaluated on 6 task types and 752 individual
tasks. The non-adapted (off-the-shelf) tasks include zero-shot findings classification,
phenotype classification, and zero-shot cross-modal retrieval, while model adapted