weekly update #143

Merged · 3 commits · Nov 28, 2023
42 changes: 42 additions & 0 deletions assets/microsoft.yaml
@@ -763,3 +763,45 @@
prohibited_uses: ''
monitoring: ''
feedback: https://huggingface.co/WizardLM/WizardCoder-Python-34B-V1.0/discussions
- type: model
name: Florence-2
organization: Azure AI, Microsoft
description: Florence-2 is a vision foundation model that uses a unified, prompt-based representation to handle a variety of vision and vision-language tasks, such as captioning, object detection, and segmentation.
created_date: 2023-11-10
url: https://arxiv.org/pdf/2311.06242.pdf
model_card: none
modality: image, text; text
analysis: Evaluated on standard image processing benchmarks
size: 771M parameters (dense)
dependencies: [FLD-5B]
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: ''
access: closed
license: unknown
intended_uses: ''
prohibited_uses: ''
monitoring: ''
feedback: none
- type: dataset
name: FLD-5B
organization: Microsoft
description: FLD-5B is the large-scale image annotation dataset used to train Florence-2.
created_date: 2023-11-10
url: https://arxiv.org/pdf/2311.06242.pdf
datasheet: ''
modality: image, text
size: 1.3B image-text annotations
sample: []
analysis: Compared to datasets that power other large-scale image models.
dependencies: []
included: ''
excluded: ''
quality_control: ''
access: closed
license: unknown
intended_uses: ''
prohibited_uses: ''
monitoring: ''
feedback: ''
26 changes: 26 additions & 0 deletions assets/openai.yaml
@@ -932,6 +932,32 @@
prohibited_uses: ''
monitoring: ''
feedback: ''
- type: model
name: GPT-4 Turbo
organization: OpenAI
description: GPT-4 Turbo is a more capable version of GPT-4 and has knowledge
of world events up to April 2023. It has a 128k context window so it can fit
the equivalent of more than 300 pages of text in a single prompt.
created_date: 2023-11-06
url: https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
model_card: none
modality: text; text
analysis: none
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: ''
access: limited
license:
explanation: Per the Terms of Use, a limited license is provided to the users
during their use of the API [[Section 2]](https://openai.com/api/policies/terms/).
value: custom
intended_uses: ''
prohibited_uses: ''
monitoring: unknown
feedback: none
- type: dataset
name: gpt-3.5-turbo dataset
organization: OpenAI
22 changes: 22 additions & 0 deletions assets/peking.yaml
@@ -0,0 +1,22 @@
---
- type: model
name: JARVIS-1
organization: Peking University Institute for Artificial Intelligence
description: JARVIS-1 is an open-world agent that can perceive multimodal input (visual observations and human instructions), generate sophisticated plans, and perform embodied control, all within the popular yet challenging open-world Minecraft universe.
created_date: 2023-11-10
url: https://arxiv.org/pdf/2311.05997.pdf
model_card: none
modality: text; in-game actions
analysis: Compared with other multi-task, instruction-following agents.
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: ''
access: open
license: unknown
intended_uses: ''
prohibited_uses: ''
monitoring: none
feedback: none
1 change: 1 addition & 0 deletions assets/perplexity.yaml
@@ -44,3 +44,4 @@
monthly_active_users: ''
user_distribution: ''
failures: ''

1 change: 1 addition & 0 deletions assets/stability.yaml
@@ -94,3 +94,4 @@
monthly_active_users: ''
user_distribution: ''
failures: ''

1 change: 1 addition & 0 deletions assets/together.yaml
@@ -151,3 +151,4 @@
prohibited_uses: ''
monitoring: ''
feedback: https://huggingface.co/togethercomputer/Llama-2-7B-32K-Instruct/discussions

32 changes: 27 additions & 5 deletions assets/tsinghua.yaml
@@ -1,7 +1,7 @@
---
- type: model
name: CodeGeeX
organization: Tsinghua
organization: Tsinghua University
description: CodeGeeX is an autoregressive language model trained on code
created_date: 2022-09-20
url: https://github.com/THUDM/CodeGeeX
@@ -26,7 +26,7 @@
feedback: none
- type: model
name: CogView
organization: Tsinghua
organization: Tsinghua University
description: CogView is a transformer model for text-to-image generation
created_date:
explanation: The date the model paper was released
@@ -53,7 +53,7 @@
feedback: ''
- type: model
name: CogView 2
organization: Tsinghua
organization: Tsinghua University
description: CogView 2 is a hierarchical transformer for text-to-image generation
created_date:
explanation: The date the model paper was released
@@ -80,7 +80,7 @@
feedback: ''
- type: model
name: CogVideo
organization: Tsinghua
organization: Tsinghua University
description: CogVideo is a transformer model for text-to-video generation
created_date:
explanation: The date the model paper was released
@@ -107,7 +107,7 @@
feedback: ''
- type: model
name: GLM-130B
organization: Tsinghua
organization: Tsinghua University
description: GLM-130B is a bidirectional language model trained on English and
Chinese
created_date:
@@ -137,3 +137,25 @@
prohibited_uses: ''
monitoring: ''
feedback: ''
- type: model
name: CogVLM
organization: Zhipu AI, Tsinghua University
description: CogVLM is a powerful open-source visual language foundation model
created_date: 2023-11-06
url: https://arxiv.org/pdf/2311.03079.pdf
model_card: none
modality: image, text; text
analysis: Evaluated on image captioning and visual question answering benchmarks.
size: 17B parameters (dense)
dependencies: [Vicuna, CLIP]
training_emissions: unknown
training_time: 4096 A100 days
training_hardware: unknown
quality_control: none
access: open
license: Apache 2.0
intended_uses: Future multimodal research
prohibited_uses: none
monitoring: none
feedback: none

22 changes: 22 additions & 0 deletions assets/xai.yaml
@@ -0,0 +1,22 @@
---
- type: model
name: Grok-1
organization: xAI
description: Grok is an AI modeled after the Hitchhiker’s Guide to the Galaxy, intended to answer almost anything and even suggest what questions to ask.
created_date: 2023-11-04
url: https://x.ai/
model_card: https://x.ai/model-card/
modality: text; text
analysis: Grok-1 was evaluated on a range of reasoning benchmark tasks and on curated foreign mathematics examination questions.
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: none
access: closed
license: unknown
intended_uses: Grok-1 is intended to be used as the engine behind Grok for natural language processing tasks including question answering, information retrieval, creative writing and coding assistance.
prohibited_uses: none
monitoring: unknown
feedback: none
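The YAML entries added in this PR all follow the same asset schema (`type`, `name`, `organization`, `modality`, and so on). As a minimal sketch, assuming the field list inferred from the entries in this diff — the `missing_fields` helper below is a hypothetical illustration, not part of any existing tooling — a completeness check over a parsed entry might look like:

```python
# Hypothetical completeness check for a "model" asset entry. The required
# field names are inferred from the YAML entries in this diff; the helper
# itself is an assumption, not an existing API.
REQUIRED_MODEL_FIELDS = frozenset({
    "type", "name", "organization", "description", "created_date", "url",
    "model_card", "modality", "analysis", "size", "dependencies",
    "training_emissions", "training_time", "training_hardware",
    "quality_control", "access", "license", "intended_uses",
    "prohibited_uses", "monitoring", "feedback",
})

def missing_fields(entry: dict) -> set:
    """Return the required model fields absent from a parsed asset entry."""
    return set(REQUIRED_MODEL_FIELDS - entry.keys())

# A parsed entry, equivalent to the Grok-1 block above.
grok_entry = {
    "type": "model",
    "name": "Grok-1",
    "organization": "xAI",
    "description": "Grok is an AI modeled after the Hitchhiker's Guide.",
    "created_date": "2023-11-04",
    "url": "https://x.ai/",
    "model_card": "https://x.ai/model-card/",
    "modality": "text; text",
    "analysis": "Evaluated on reasoning benchmarks.",
    "size": "unknown",
    "dependencies": [],
    "training_emissions": "unknown",
    "training_time": "unknown",
    "training_hardware": "unknown",
    "quality_control": "none",
    "access": "closed",
    "license": "unknown",
    "intended_uses": "question answering, coding assistance",
    "prohibited_uses": "none",
    "monitoring": "unknown",
    "feedback": "none",
}

print(missing_fields(grok_entry))  # set() -> the entry is complete
print(missing_fields({"type": "model", "name": "Grok-1"}))
```

Run against an incomplete entry, the second call reports every schema field still to fill in, which is a cheap way to catch copy-paste omissions before review.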