Potential minor Improvements #105

Open
dipamsen opened this issue Aug 28, 2024 · 3 comments
Comments

@dipamsen
Member

dipamsen commented Aug 28, 2024

  • Could we somehow tell ElevenLabs that certain "." characters are not periods? (Maybe we replace the "." with "dot" or something.) Such dots can probably be identified by having no whitespace on either side (e.g. p5.js, this.position).
  • Collapse multiple consecutive word updates into one? Currently the diffing algorithm works word by word. That looks good when a single token in a line changes: only that part is removed and then typed. But when an entire line changes, each word gets selected and typed, one by one. Maybe multiple consecutive word diffs could be collapsed into one update (up to a newline).
@dipamsen
Member Author

dipamsen commented Aug 28, 2024

also maybe:

  • Speak comments? Even when instructed not to write comments, it (gpt-4o) mostly does so anyway, since it seems to think that whatever it writes as a comment is being spoken.

@dipamsen changed the title from "potential qol improvement ideas" to "Potential minor Improvements" on Aug 28, 2024
@shiffman
Member

> Could we somehow tell ElevenLabs that certain "." characters are not periods? (Maybe we replace the "." with "dot" or something.) Such dots can probably be identified by having no whitespace on either side (e.g. p5.js, this.position).

Yes, I think we should manually replace the "." with "dot"! We could make a list of known terms, but I think the no-whitespace pattern will work well!
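
A minimal sketch of that no-whitespace pattern, in case it helps. The helper name `speakableDots` is hypothetical and not from the project. As an assumption, it is slightly stricter than "no whitespace": it requires a word character on both sides, so a sentence-ending period followed by a closing parenthesis or quote is left alone.

```javascript
// Hypothetical helper, not part of the project's code.
// Replaces "." with " dot " only when a word character sits directly
// on both sides of it, so sentence-ending periods are untouched.
function speakableDots(text) {
  return text.replace(/(?<=\w)\.(?=\w)/g, " dot ");
}

console.log(speakableDots("Load p5.js and update this.position."));
// prints "Load p5 dot js and update this dot position."
```

Known gaps with this rule: decimals such as 3.14 and abbreviations such as "e.g." also match, so a short exception list may still be needed before the text goes to the speech service.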

> Collapse multiple consecutive word updates into one? Currently the diffing algorithm works word by word. That looks good when a single token in a line changes: only that part is removed and then typed. But when an entire line changes, each word gets selected and typed, one by one. Maybe multiple consecutive word diffs could be collapsed into one update (up to a newline).

Game for this! Maybe lower priority!
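
One way the collapsing could work, as a rough sketch. It assumes the word diff arrives as an array of parts shaped like `{ value, added, removed }`; that shape, the function name `collapseWordDiffs`, and the `update` / `keep` output records are all assumptions for illustration, not the project's actual data structures. Changed parts that are separated only by unchanged same-line whitespace get merged into a single update.

```javascript
// Hypothetical sketch: merge runs of changed words into one update.
// Input parts are assumed to look like { value, added?, removed? }.
function collapseWordDiffs(parts) {
  const out = [];
  let run = null; // pending merged update: { removed, added }

  const flush = () => {
    if (run) out.push({ type: "update", removed: run.removed, added: run.added });
    run = null;
  };

  for (let i = 0; i < parts.length; i++) {
    const part = parts[i];
    if (part.added || part.removed) {
      run = run || { removed: "", added: "" };
      if (part.removed) run.removed += part.value;
      else run.added += part.value;
      continue;
    }
    // Unchanged part: absorb it into the run only if it is whitespace
    // without a newline and another changed part follows immediately.
    const next = parts[i + 1];
    const bridges =
      run && next && (next.added || next.removed) && /^[^\S\n]+$/.test(part.value);
    if (bridges) {
      run.removed += part.value;
      run.added += part.value;
    } else {
      flush();
      out.push({ type: "keep", value: part.value });
    }
  }
  flush();
  return out;
}
```

With this, a fully rewritten line becomes one select-and-type step, while a newline between changes still splits them into separate updates. Only the unchanged gap is checked for newlines here; a changed part that itself spans several lines would need extra handling.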

> Speak comments? Even when instructed not to write comments, it (gpt-4o) mostly does so anyway, since it seems to think that whatever it writes as a comment is being spoken.

I like this idea! Do we think we could have it speak and type at the same time?

@dipamsen
Member Author

dipamsen commented Aug 29, 2024

My evaluation of how easy each one is to implement:

  1. Should be easy enough, though the issue is that the model may stream a chunk like p5., and our parsing logic would categorise that chunk as the end of a sentence. I don't know how to handle that elegantly.
  2. I think this is pretty doable.
  3. This seems hard, because the parsing logic would have to be modified to switch into speech mode on receiving a // token and back to code mode on receiving a \n.
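
For the chunk problem in item 1, one possible approach is to treat a period as a sentence end only once whitespace has actually arrived after it, and to hold back a trailing period until the next chunk shows what follows. The sketch below assumes that idea; the class name `SentenceBuffer` and its `push` / `flush` methods are hypothetical and not the project's parsing logic.

```javascript
// Hypothetical sketch: buffer streamed text and only emit a sentence
// once its end punctuation is followed by whitespace.
class SentenceBuffer {
  constructor() {
    this.buffer = "";
  }

  // Add a streamed chunk and return any sentences that are now complete.
  push(chunk) {
    this.buffer += chunk;
    const sentences = [];
    const boundary = /[.!?]\s+/g;
    let start = 0;
    let match;
    while ((match = boundary.exec(this.buffer)) !== null) {
      const end = match.index + 1; // keep the punctuation mark
      sentences.push(this.buffer.slice(start, end).trim());
      start = boundary.lastIndex;
    }
    // Whatever is left (including a trailing "p5.") waits for more input.
    this.buffer = this.buffer.slice(start);
    return sentences;
  }

  // Call once the stream ends to emit the remaining text.
  flush() {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest ? [rest] : [];
  }
}

const sb = new SentenceBuffer();
console.log(sb.push("We will use p5."));     // prints []
console.log(sb.push("js for this. Next"));   // prints [ 'We will use p5.js for this.' ]
console.log(sb.flush());                     // prints [ 'Next' ]
```

The cost is a little latency: the last sentence of a response is only spoken after `flush()`, since nothing follows its period.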

> Do we think we could have it speak and type at the same time?

We'd probably have to change a lot to support this, but it's possible.
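
For reference, the mode switch from item 3 (speech mode on //, back to code on \n) could look roughly like the sketch below. Everything here is an assumption for illustration: the class name `CommentSplitter`, the `{ type, text }` segment shape, and the choice to route comment text to speech instead of typing it. It works character by character so that a // split across two streamed chunks is still detected.

```javascript
// Hypothetical sketch: split streamed output into "code" and "speech"
// segments, treating everything from "//" to the next newline as speech.
class CommentSplitter {
  constructor() {
    this.mode = "code";
    this.pendingSlash = false; // saw one "/" and are waiting for the next char
  }

  push(chunk) {
    const segments = [];
    const emit = (type, text) => {
      const last = segments[segments.length - 1];
      if (last && last.type === type) last.text += text;
      else segments.push({ type, text });
    };

    for (const ch of chunk) {
      if (this.mode === "code") {
        if (this.pendingSlash) {
          this.pendingSlash = false;
          if (ch === "/") {
            this.mode = "speech"; // "//" seen: start speaking
            continue;
          }
          emit("code", "/"); // lone slash, e.g. division
        }
        if (ch === "/") this.pendingSlash = true;
        else emit("code", ch);
      } else if (ch === "\n") {
        this.mode = "code"; // newline ends the spoken comment
        emit("code", "\n");
      } else {
        emit("speech", ch);
      }
    }
    return segments;
  }

  // Call at end of stream to release a held-back lone "/".
  flush() {
    const held = this.pendingSlash;
    this.pendingSlash = false;
    return held ? [{ type: "code", text: "/" }] : [];
  }
}

const splitter = new CommentSplitter();
console.log(splitter.push("let x = 1; /"));
console.log(splitter.push("/ set x to one\nx++;"));
```

This does not yet understand string literals, so a // inside something like "https://..." would wrongly switch to speech mode. Speaking and typing at the same time would be a separate change on top of this.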
