Remove roadmap doc and fix embedder model deployment issue #506

Merged: 1 commit, Jan 6, 2024
66 changes: 1 addition & 65 deletions ROADMAP.md
Original file line number Diff line number Diff line change
@@ -1,67 +1,3 @@
## Roadmap of KubeAGI

### v0.1.0 - 2023 Q4 Released

* Dataset Management - manage data, including local files, integration with object storage (S3), data editing, version control, and file download
* Data Processing - data cleaning, text splitting (e.g., text segmentation, QA splitting), file labeling
* Knowledge Base - data embedding
* Model Management - manage the lifecycle of models.
* Model Serving
- Support CPU & GPU Model Serving
- Support both remote and local model inference services, and associate with the knowledge base
- Support local embedding service (bge, m3e)
- Support vLLM inference engine
* LLM Applications - prompt engineering, initial implementation of LLM application orchestration capabilities. Manage and orchestrate Prompt, LLM/Retriever Chain nodes, and provide relevant example applications (based on streamlit)
* Guided walkthroughs and example scenarios - help users get started building LLM applications quickly, and add more built-in chat example applications

### v0.2.0 - 2024 Feb. (Ongoing)
* Support evaluation of Prompts under different LLMs and generate test reports.
* RAG evaluation and RAG Question Generation
- Optimize question generation, analyze question quality, filter out low-similarity questions
- Evaluation metrics: retrieval evaluation - Hit Rate, MRR; answer evaluation - fairness, relevance, consistency, etc
- Other evaluation capabilities

* Data lineage - Understand the origin and flow of data, e.g., support mapping between answers and original documents
* Perform similarity analysis on QA pairs generated by large models, allowing manual processing (deletion, merging, etc.) by users
* Playground for datasets, knowledge base, model services, etc., based on streamlit.
* LLM applications support Get/Post API Chain, enabling typical LLM application development (non-workflow mode)
* Visualization of various data types, based on streamlit
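The retrieval metrics named in this release (Hit Rate, MRR) are standard ranking measures. As a rough illustration of what they compute (not KubeAGI's actual implementation), assume each query records the 1-based rank at which its relevant document was retrieved, or 0 when it was missed within the cutoff:

```go
package main

import "fmt"

// hitRateAndMRR computes two retrieval-evaluation metrics over a set of
// queries. rankings[i] is the 1-based rank at which the relevant document
// was retrieved for query i, or 0 if it was not retrieved at all.
func hitRateAndMRR(rankings []int) (hitRate, mrr float64) {
	if len(rankings) == 0 {
		return 0, 0
	}
	var hits, rrSum float64
	for _, r := range rankings {
		if r > 0 {
			hits++                  // query counts as a hit
			rrSum += 1.0 / float64(r) // reciprocal rank contribution
		}
	}
	n := float64(len(rankings))
	return hits / n, rrSum / n
}

func main() {
	// Four queries: relevant doc found at ranks 1, 2, not found, and 4.
	hr, mrr := hitRateAndMRR([]int{1, 2, 0, 4})
	fmt.Printf("hit rate=%.2f mrr=%.4f\n", hr, mrr) // hit rate=0.75 mrr=0.4375
}
```

The same ranking data feeds both metrics, which is why they are typically collected together during retrieval evaluation.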

### v0.3.0 - 2024 Mar.
* Data Processing - Introduce text annotation (automated + manual) to improve data quality and assist fine-tuning

* Data Security - Support data anonymization (e.g., masking sensitive information like ID numbers, phone numbers, and bank account numbers)

* Enhanced Data Integration - Increase the capability to integrate with various data sources (databases, APIs, etc.) and support data synchronization strategies (automatic synchronization)

* Support manual evaluation to ensure quality control before deploying to production. Additionally, incorporate manual feedback into the monitoring system

* Enable user feedback on the question-answering system to facilitate optimization of LLM applications (data processing, prompt optimization, etc.)

* Integration of GPU management, scheduling, and resource monitoring capabilities for containerized environments

* Integration of API gateway to govern model service APIs, including monitoring, analysis, and security measures, and construct AI gateway


### v0.5.0 - 2024 Apr.
* Support low-resource fine-tuning of large models, including RLHF (Reinforcement Learning from Human Feedback) and SFT (Supervised Fine-Tuning) with parameter-efficient techniques such as Adapter, P-tuning, and LoRA. This improves model quality while reducing the performance requirements of model serving (e.g., lowering inference costs and latency caused by long prompts or slow inference)
* Model compression techniques
* Conduct testing and evaluation of model services and embeddings (QA evaluation, metric collection)

* Implement "scale to zero" capability (integrating with Arbiter) for cold start scenarios, enabling models and applications to evolve towards a Serverless architecture

* Support orchestration of additional node types such as Agent, Cache, etc.

* Add more best practices for prompt engineering
- Few-shot learning techniques
- Chain-of-Thought (CoT) approach
- Mind-mapping techniques

### v1.0 - 2024 Jun.
* Automatically construct prompt templates based on data annotations
* Enhance LLMOps monitoring, covering the pipeline from dataset and feature data to model inference, with call-chain tracing based on langchain-go
* Implement a pipeline from data source -> dataset -> data processing -> data versioning -> knowledge base -> model service
* Strengthen the Python SDK to handle basic capabilities such as dataset manipulation, data processing, and vectorization. These operations can be performed in a notebook environment.
- Refer to Databricks to enhance the developer experience
* Implement gray release for LLM applications based on AI gateway

Refer to our [online documentation](http://kubeagi.k8s.com.cn/docs/Release&Plan/roadmap)
61 changes: 0 additions & 61 deletions ROADMAP_cn.md

This file was deleted.

138 changes: 138 additions & 0 deletions deploy/charts/arcadia/crds/arcadia.kubeagi.k8s.com.cn_workers.yaml
@@ -42,6 +42,115 @@ spec:
spec:
description: WorkerSpec defines the desired state of Worker
properties:
additionalEnvs:
description: Additional env to use
items:
description: EnvVar represents an environment variable present in
a Container.
properties:
name:
description: Name of the environment variable. Must be a C_IDENTIFIER.
type: string
value:
description: 'Variable references $(VAR_NAME) are expanded using
the previously defined environment variables in the container
and any service environment variables. If a variable cannot
be resolved, the reference in the input string will be unchanged.
Double $$ are reduced to a single $, which allows for escaping
the $(VAR_NAME) syntax: i.e. "$$(VAR_NAME)" will produce the
string literal "$(VAR_NAME)". Escaped references will never
be expanded, regardless of whether the variable exists or
not. Defaults to "".'
type: string
valueFrom:
description: Source for the environment variable's value. Cannot
be used if value is not empty.
properties:
configMapKeyRef:
description: Selects a key of a ConfigMap.
properties:
key:
description: The key to select.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the ConfigMap or its key
must be defined
type: boolean
required:
- key
type: object
x-kubernetes-map-type: atomic
fieldRef:
description: 'Selects a field of the pod: supports metadata.name,
metadata.namespace, `metadata.labels[''<KEY>'']`, `metadata.annotations[''<KEY>'']`,
spec.nodeName, spec.serviceAccountName, status.hostIP,
status.podIP, status.podIPs.'
properties:
apiVersion:
description: Version of the schema the FieldPath is
written in terms of, defaults to "v1".
type: string
fieldPath:
description: Path of the field to select in the specified
API version.
type: string
required:
- fieldPath
type: object
x-kubernetes-map-type: atomic
resourceFieldRef:
description: 'Selects a resource of the container: only
resources limits and requests (limits.cpu, limits.memory,
limits.ephemeral-storage, requests.cpu, requests.memory
and requests.ephemeral-storage) are currently supported.'
properties:
containerName:
description: 'Container name: required for volumes,
optional for env vars'
type: string
divisor:
anyOf:
- type: integer
- type: string
description: Specifies the output format of the exposed
resources, defaults to "1"
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
resource:
description: 'Required: resource to select'
type: string
required:
- resource
type: object
x-kubernetes-map-type: atomic
secretKeyRef:
description: Selects a key of a secret in the pod's namespace
properties:
key:
description: The key of the secret to select from. Must
be a valid secret key.
type: string
name:
description: 'Name of the referent. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
TODO: Add other useful fields. apiVersion, kind, uid?'
type: string
optional:
description: Specify whether the Secret or its key must
be defined
type: boolean
required:
- key
type: object
x-kubernetes-map-type: atomic
type: object
required:
- name
type: object
type: array
creator:
description: Creator defines datasource creator (AUTO-FILLED by webhook)
type: string
@@ -51,6 +160,35 @@ spec:
displayName:
description: DisplayName defines datasource display name
type: string
matchExpressions:
description: NodeSelectorRequirement to schedule this worker
items:
description: A node selector requirement is a selector that contains
values, a key, and an operator that relates the key and values.
properties:
key:
description: The label key that the selector applies to.
type: string
operator:
description: Represents a key's relationship to a set of values.
Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and
Lt.
type: string
values:
description: An array of string values. If the operator is In
or NotIn, the values array must be non-empty. If the operator
is Exists or DoesNotExist, the values array must be empty.
If the operator is Gt or Lt, the values array must have a
single element, which will be interpreted as an integer. This
array is replaced during a strategic merge patch.
items:
type: string
type: array
required:
- key
- operator
type: object
type: array
model:
description: Model this worker wants to use
properties:
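Taken together, the new `additionalEnvs` and `matchExpressions` fields let a Worker inject environment variables into its pod and constrain which nodes it schedules onto. A hypothetical manifest sketch (the API group comes from the CRD filename and the `v1alpha1` version from `arcadiav1alpha1` in the Go code; the metadata, env var, label key, and model name are made-up placeholders):

```yaml
apiVersion: arcadia.kubeagi.k8s.com.cn/v1alpha1
kind: Worker
metadata:
  name: bge-embedder-worker   # placeholder name
spec:
  displayName: bge embedder
  additionalEnvs:             # extra env vars passed to the worker container
    - name: HF_ENDPOINT       # illustrative variable, not mandated by the CRD
      value: "https://hf-mirror.com"
  matchExpressions:           # NodeSelectorRequirements for scheduling
    - key: nvidia.com/gpu.present   # illustrative label key
      operator: In
      values:
        - "true"
  model:
    name: bge-large-zh        # placeholder model reference
```

The `matchExpressions` entries follow the standard Kubernetes NodeSelectorRequirement shape (`key`, `operator`, `values`), so familiar node-affinity idioms carry over directly.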
4 changes: 2 additions & 2 deletions pkg/worker/devices.go
@@ -43,8 +43,8 @@ const (

// DeviceBasedOnResource returns the device type based on the resource list
func DeviceBasedOnResource(resource corev1.ResourceList) Device {
_, ok := resource[ResourceNvidiaGPU]
if ok {
value, ok := resource[ResourceNvidiaGPU]
if ok && value.Value() > 0 {
return CUDA
}
return CPU
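The one-line fix above matters because a resource list can contain the `nvidia.com/gpu` key with a quantity of zero; the old code treated mere key presence as "GPU available" and selected CUDA, which broke embedder deployment on CPU-only configurations. A dependency-free sketch of the corrected logic (a plain map stands in for `corev1.ResourceList`, so the types here are illustrative):

```go
package main

import "fmt"

type Device string

const (
	CPU  Device = "cpu"
	CUDA Device = "cuda"
)

const resourceNvidiaGPU = "nvidia.com/gpu"

// deviceBasedOnResource mirrors the fixed behavior: CUDA is selected only
// when the GPU resource key is present AND its quantity is greater than zero.
func deviceBasedOnResource(resources map[string]int64) Device {
	value, ok := resources[resourceNvidiaGPU]
	if ok && value > 0 {
		return CUDA
	}
	return CPU
}

func main() {
	// Pre-fix, "nvidia.com/gpu: 0" still selected CUDA because only the
	// key's presence was checked, not its value.
	fmt.Println(deviceBasedOnResource(map[string]int64{resourceNvidiaGPU: 0})) // cpu
	fmt.Println(deviceBasedOnResource(map[string]int64{resourceNvidiaGPU: 1})) // cuda
	fmt.Println(deviceBasedOnResource(map[string]int64{}))                     // cpu
}
```

In the real code the quantity is a `resource.Quantity`, whose `Value()` method returns the amount as an `int64`, which is why the fix reads `value.Value() > 0`.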
2 changes: 1 addition & 1 deletion pkg/worker/worker.go
@@ -257,7 +257,7 @@ func (podWorker *PodWorker) Model() *arcadiav1alpha1.Model {
}

// BeforeStart will create resources which are related to this Worker
// Now we have a pvc(if configured),service,LLM(if a llm model),Embedder(if a embedding model)
// Now we have a pvc(if configured), service, LLM(if a llm model), Embedder(if a embedding model)
func (podWorker *PodWorker) BeforeStart(ctx context.Context) error {
var err error
// If the local directory is mounted, there is no need to create the pvc