Migrate to the task processing API #13115

Open
julien-nc opened this issue Aug 23, 2024 · 7 comments

@julien-nc
Member

julien-nc commented Aug 23, 2024

The Translation and SpeechToText backend APIs are deprecated. Those features are now included in the task processing API (since Nextcloud 30).

The old APIs will stay for a few more NC major versions. The old SpeechToText API can now use the new providers (for TaskProcessing), so there is no rush to migrate.

The providers for the Translate API and the translation providers for the TaskProcessing API can be installed side by side, so there is no rush to migrate there either.

Translation

You can use the assistant's UI to run translation tasks. If the assistant app is enabled, the OCA.Assistant.openAssistantForm function should be available.

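// Optional chaining avoids a TypeError when the assistant app is disabled and OCA.Assistant is undefined
if (OCA.Assistant?.openAssistantForm) {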
	OCA.Assistant.openAssistantForm({
		appId: 'spreed',
		customId: 'message translation',
		taskType: 'core:text2text:translate',
		inputs: {
			input: 'the content of the message',
		},
		closeOnResult: false,
	}).then(task => {
		if (task.status === 'STATUS_SUCCESSFUL') {
			console.debug('assistant result task output', task.output.output)
		} else {
			console.debug('assistant result task', task)
		}
	})
}

The promise resolves whether the task succeeds, fails, or is scheduled for later by the user. The promise result is the task object.
The closeOnResult parameter of OCA.Assistant.openAssistantForm decides whether the assistant is closed when the task succeeds or fails. It can be set to false to stay close to the current behaviour of the translate modal in Talk: the user sees the result in the assistant, where there is a "copy" button, and can then close the assistant modal.
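
To make the three outcomes explicit, here is a minimal sketch of a wrapper around the call above. The function name translateWithAssistant, its oca parameter (passed in so the function can be exercised without a browser) and the returned state labels are hypothetical; only OCA.Assistant.openAssistantForm, its options and the 'STATUS_SUCCESSFUL' value come from this issue.

```javascript
// Sketch only: translateWithAssistant, `oca` and the `state` labels are
// illustrative names, not part of the assistant app.
async function translateWithAssistant(oca, message) {
	// Assistant app disabled: the namespace or the function is missing.
	if (!oca?.Assistant?.openAssistantForm) {
		return { state: 'unavailable', text: null }
	}
	const task = await oca.Assistant.openAssistantForm({
		appId: 'spreed',
		customId: 'message translation',
		taskType: 'core:text2text:translate',
		inputs: { input: message },
		closeOnResult: false,
	})
	if (task.status === 'STATUS_SUCCESSFUL') {
		return { state: 'done', text: task.output.output }
	}
	// The task failed or was scheduled for later by the user:
	// there is no translation to show yet, so hand back the task object.
	return { state: 'not-done', text: null, task }
}
```

In Talk this could be called as translateWithAssistant(window.OCA, messageText), falling back to the existing translate modal when the state is 'unavailable'.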

SpeechToText

Transcription can be done with the core:audio2text task type of the taskProcessing API. More details on how to run such a task in the backend are in the Transcribe section of nextcloud/assistant#114.

@nickvergessen
Member

@julien-nc we have a bit of a problem here.

Translating chat messages

We need OCS APIs because our mobile and desktop clients call them, and the request should "respond" with the translation directly rather than be delegated to a background job (no one will wait 5 minutes for the translation of a chat message).

Transcription of call recordings

This can be done in a background job and should be fine (we already do that now, as far as I know).

@nickvergessen
Member

@julien-nc
Member Author

No one will wait 5 minutes on the translation of a chat message

If the provider is an exApp, it will process tasks ASAP, with no delay. If the provider is a PHP app and occ background-job:worker "OC\TaskProcessing\SynchronousBackgroundJob" is running, there is no delay either.

Once the task is scheduled, the clients can poll it with ocs/v2.php/taskprocessing/task/TASK_ID. That's what the assistant does in the frontend. There are no more blocking requests, because a blocking request could take too long and be killed, and it also ties up a PHP worker while it waits.
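
A rough sketch of what such client-side polling could look like, with a capped exponential backoff so that waiting clients do not hammer the server. Everything except the endpoint path and 'STATUS_SUCCESSFUL' is an assumption: the function names, the other status strings, the { ocs: { data: { task } } } response shape and the delay values are illustrative and should be checked against the actual OCS API.

```javascript
// Assumption: only 'STATUS_SUCCESSFUL' is named in this issue; the other
// two final statuses are guesses and need checking against the real API.
const FINAL_STATUSES = ['STATUS_SUCCESSFUL', 'STATUS_FAILED', 'STATUS_CANCELLED']

// Delay before the next poll: doubles on every attempt, capped at maxMs.
function nextDelay(attempt, baseMs = 1000, maxMs = 15000) {
	return Math.min(baseMs * 2 ** attempt, maxMs)
}

// Default fetcher. Assumes the usual OCS JSON envelope around the task.
async function fetchTask(taskId) {
	const response = await fetch(`/ocs/v2.php/taskprocessing/task/${taskId}`, {
		headers: { 'OCS-APIRequest': 'true', Accept: 'application/json' },
	})
	if (!response.ok) {
		throw new Error(`HTTP ${response.status}`)
	}
	const body = await response.json()
	return body.ocs.data.task
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Polls until the task reaches a final status or maxAttempts is used up.
// getTask and sleep are injectable so the loop can run without a server.
async function pollTask(taskId, { getTask = fetchTask, sleep = defaultSleep, maxAttempts = 10 } = {}) {
	for (let attempt = 0; attempt < maxAttempts; attempt++) {
		const task = await getTask(taskId)
		if (FINAL_STATUSES.includes(task.status)) {
			return task
		}
		await sleep(nextDelay(attempt))
	}
	throw new Error(`Task ${taskId} not finished after ${maxAttempts} polls`)
}
```

With these example values a client makes at most 10 requests per task, and the gaps grow from 1 second up to 15 seconds. The right numbers depend on how quickly the provider usually answers.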

@nickvergessen
Member

So instead of getting a string returned, the clients shall DoS the server.
The feature still breaks for existing clients.

@julien-nc
Member Author

We can also keep the providers for the old APIs in integration_openai, so the features in Talk are not broken.

@nickvergessen
Member

nickvergessen commented Aug 23, 2024

I will check with Andy next week what to do.

@julien-nc
Member Author

Two things should make it more convenient:

  • The TextProcessing and SpeechToText APIs are now forward compatible with providers. New TaskProcessing providers can be used by the TextProcessing API (for FreePromptTaskType, HeadlineTaskType, SummaryTaskType and TopicsTaskType because they have exact matches in the new API) and the SpeechToText API. This means you will benefit from new providers while using the old APIs.
  • The TaskProcessing manager now has a runTask method to run a task synchronously. This should make the migration easier.

All this is in stable30 already.

The providers for the Translate API and the TaskProcessing API are implemented in different apps, so you can keep using the Translate API as long as you want.
