Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Basic Text to Speech #1143

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open

Implement Basic Text to Speech #1143

wants to merge 14 commits into from

Conversation

ShaopengLin
Copy link
Collaborator

@ShaopengLin ShaopengLin commented Jul 18, 2024

Fix #44

Changes:

  • Reading the entire or selected text on a web page in different languages.
  • The reading can be triggered in the Edit Menu, Shortcut, and right-click on a page.
  • TTS UI to select languages and voices, and stop the reading.
  • Voice configurations are remembered between sessions.
  • Language determined by Zim's Locale.

@ShaopengLin
Copy link
Collaborator Author

I wasn't able to implement PAUSE, RESUME, and volume.

PAUSE/RESUME: Qt relies on the underlying system's audio engines to perform these operations and Qt has quite some buggy behavior on those engines. I tried for a long time to find solutions but none is compatible with all engines and Qt versions. This leads to the compromise of simply having Speak & STOP.

Volume: From what I tested, the Linux "speech" engines' audio doesn't change during the reading, which in my opinion makes it useless.

Considering we are shipping cross-platform, it is unreliable to have features that work on one but not on another. If we can find a way to package Qt with a working speech engine, we might be able to implement PAUSE/RESUME. However, I have no idea how to do that. @veloman-yunkan @mgautierfr Do you by any chance have an idea of how hard it would be to package an external library like Flite with Kiwix-Desktop? If it's a huge hurdle and we only get the PAUSE/RESUME, then I wouldn't say the return on the effort is great.

The priority for this PR would be to have Kiwix-Desktop be compatible with Qt6.4+, CI builds, and a satisfactory UI/UX given the existing feature.

@kelson42
Copy link
Collaborator

@ShaopengLin Thank you very much for your PR. For the moment I'm not able to compile it on my Ubuntu 22.04 but should move soon to 24.04 and hope this will allow me to have Qt 6.4. In the meantime we need to secure the backward compatibility with older version of Qt (both version 5 and 6). Please fix the compilation script and the code to secure that this feature is only activated if Qt6.4 or higher is available, otherwise just deactive the feature in a way both compilation and run are fine.

@ShaopengLin ShaopengLin force-pushed the Issue#44-tts branch 8 times, most recently from e06256d to ba921f6 Compare August 5, 2024 01:46
@ShaopengLin
Copy link
Collaborator Author

@kelson42 The compilation is now for Qt 6.4 +.

@kelson42
Copy link
Collaborator

kelson42 commented Sep 8, 2024

@ShaopengLin I have moved to 24.04. Would you be able please to rebase on main so I can finally move forward with the testing?

@ShaopengLin
Copy link
Collaborator Author

ShaopengLin commented Sep 9, 2024

@kelson42 Done.

From my own testing, TTS from Qt is rather unstable and sometimes crashes the application when used for no apparent reason/pattern at all. I have been trying my best to find out if there is a solution to the core problem, though little information really floats around the internet on this topic. I honestly hope it's just my own computer's issue... Please try it on your machine and see if any crashes or misbehaviors exist. If I cannot figure out why those things occur, rather than relying on Qt's flawed implementation, the goal should be improving screen reader compatibilities.

When you test, the dfki packages(if exists) are likely not going to work so select the other ones.
Screenshot from 2024-09-09 02-43-01

@ShaopengLin
Copy link
Collaborator Author

ShaopengLin commented Sep 11, 2024

@kelson42 I have found the root cause to the crashes. Will put up a fixup commit soon...

@ShaopengLin ShaopengLin force-pushed the Issue#44-tts branch 2 times, most recently from 6937c65 to 1b9265c Compare September 13, 2024 19:57
@ShaopengLin
Copy link
Collaborator Author

@kelson42 Should be ready for testing now.

@kelson42
Copy link
Collaborator

@ShaopengLin I see the menu entry but nothing happens if I click on it: no sound not bottom bar visible. I use Ubuntu 24.04 wieth Qt 5.15.13.

@ShaopengLin
Copy link
Collaborator Author

ShaopengLin commented Sep 14, 2024

@kelson42 As mentioned in this comment we moved compilation to Qt6.4+. I will make a change so those menu entries are invisible in Qt 6.4 and below.

@kelson42
Copy link
Collaborator

@kelson42 As mentioned in this comment we moved compilation to Qt6.4+. I will make a change so those menu entries are invisible in Qt 6.4 and below.

@ShaopengLin Then the option should not be available (or grayed)

@kelson42
Copy link
Collaborator

@ShaopengLin For me, does not compile with Qt 6.4.2:

g++ -c -pipe -std=c++17 -Werror -I/usr/local/include/ -I/usr/local/include -I/usr/include/x86_64-linux-gnu -I/usr/include/p11-kit-1 -O2 -std=gnu++1z -Wall -Wextra -fPIC -D_REENTRANT -DVERSION=2.3.1 -DQT_DEPRECATED_WARNINGS -DQT_DISABLE_DEPRECATED_BEFORE=0x060000 -DQT_NO_DEBUG -DQT_WEBENGINEWIDGETS_LIB -DQT_PRINTSUPPORT_LIB -DQT_WIDGETS_LIB -DQT_WEBENGINECORE_LIB -DQT_QUICK_LIB -DQT_OPENGL_LIB -DQT_GUI_LIB -DQT_QMLMODELS_LIB -DQT_WEBCHANNEL_LIB -DQT_QML_LIB -DQT_NETWORK_LIB -DQT_QMLINTEGRATION_LIB -DQT_POSITIONING_LIB -DQT_CORE_LIB -I. -Isubprojects/QtSingleApplication/src -I/usr/include/x86_64-linux-gnu/qt6 -I/usr/include/x86_64-linux-gnu/qt6/QtWebEngineWidgets -I/usr/include/x86_64-linux-gnu/qt6/QtPrintSupport -I/usr/include/x86_64-linux-gnu/qt6/QtWidgets -I/usr/include/x86_64-linux-gnu/qt6/QtWebEngineCore -I/usr/include/x86_64-linux-gnu/qt6/QtQuick -I/usr/include/x86_64-linux-gnu/qt6/QtOpenGL -I/usr/include/x86_64-linux-gnu/qt6/QtGui -I/usr/include/x86_64-linux-gnu/qt6/QtQmlModels -I/usr/include/x86_64-linux-gnu/qt6/QtWebChannel -I/usr/include/x86_64-linux-gnu/qt6/QtQml -I/usr/include/x86_64-linux-gnu/qt6/QtNetwork -I/usr/include/x86_64-linux-gnu/qt6/QtQmlIntegration -I/usr/include/x86_64-linux-gnu/qt6/QtPositioning -I/usr/include/x86_64-linux-gnu/qt6/QtCore -I. -I. -I/usr/lib/x86_64-linux-gnu/qt6/mkspecs/linux-g++ -o contenttypefilter.o src/contenttypefilter.cpp
src/contenttypefilter.cpp: In constructor ‘ContentTypeFilter::ContentTypeFilter(QString, QWidget*)’:
src/contenttypefilter.cpp:16:35: error: ‘checkStateChanged’ is not a member of ‘QCheckBox’
   16 |         connect(this, &QCheckBox::checkStateChanged, this, &ContentTypeFilter::onStateChanged);
      |                                   ^~~~~~~~~~~~~~~~~
make: *** [Makefile:1437: contenttypefilter.o] Error 1

@ShaopengLin
Copy link
Collaborator Author

ShaopengLin commented Sep 14, 2024

@kelson42 That is not my change. I believe this is from #1205 where I believe @sgourdas should've made the deprecation for only Qt 6.7+, and not Qt 6+. Source

@kelson42
Copy link
Collaborator

@sgourdas Can you please fix compilation regression we have now with Qt6? This is blocking review of this PR.

@sgourdas
Copy link
Collaborator

@sgourdas Can you please fix compilation regression we have now with Qt6? This is blocking review of this PR.

@kelson42 was this not fixed with #1212 by @ShaopengLin?

@kelson42
Copy link
Collaborator

Actually I don't have Qt6.7+ on my Ubuntu 24.04. How should i test this? @ShaopengLin @sgourdas Do you use a PPA to get the most recent version?

@sgourdas
Copy link
Collaborator

Actually I don't have Qt6.7+ on my Ubuntu 24.04. How should i test this? @ShaopengLin @sgourdas Do you use a PPA to get the most recent version?

If I remember correctly I had built it from the git source.

Would you like me to test something?

Copy link
Collaborator

@kelson42 kelson42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from a user perspective

@ShaopengLin
Copy link
Collaborator Author

@veloman-yunkan will notify you when code is ready. We can focus on the bigger ones first.

@ShaopengLin ShaopengLin force-pushed the Issue#44-tts branch 2 times, most recently from 6fb2227 to fb1ee94 Compare November 13, 2024 08:49
@ShaopengLin
Copy link
Collaborator Author

@veloman-yunkan Ready for review.

@ShaopengLin
Copy link
Collaborator Author

@veloman-yunkan Also, Qt at CI / Linux (focal) does not have texttospeech. Will we be adding the package or will I have to disable TTS if the package does not exist?

@ShaopengLin ShaopengLin force-pushed the Issue#44-tts branch 2 times, most recently from 52fac5b to c62418d Compare November 13, 2024 18:26
@kelson42
Copy link
Collaborator

@veloman-yunkan Also, Qt at CI / Linux (focal) does not have texttospeech. Will we be adding the package or will I have to disable TTS if the package does not exist?

What does that technically means "does not have texttospeech"? It means no boice is available?

@ShaopengLin
Copy link
Collaborator Author

@kelson42 The Qt TextToSpeech module is missing in that CI.

@kelson42
Copy link
Collaborator

@ShaopengLin The problem is not really the CI but what will happen if for some reason we have Qt5 without all the necessary. My best guess is that we should be prepared to such a scenario and hide or desactivate the feature if this lack is detected at start. Or is that depending to a certain version of Qt5?

@ShaopengLin
Copy link
Collaborator Author

@kelson42 Don't believe it should depend on a certain version of Qt5, but I suppose having safeguards against this would be ideal. I will change this shortly.

Basic Reading of Articles or Selected Texts.
Added the bar at bottom of webpage. Intoduced the stop speaking button.
Bar is default closed and shown on any speech activation.
The line edit should not be edit/highlight/selectable.
Rather than only the arrow, clicking any place on the combobox should popup the selector
Tabs are by default Zim's Locale
Keep track of the voices so user do not need to reconfigure.
Some systems or QT versions might not have or the module is not available.
Copy link
Collaborator

@veloman-yunkan veloman-yunkan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering TTS as an accessibility feature for people with vision problems, it should be primarily accessible via keyboard. I know that kiwix-desktop's accessibility in that regard is quite poor, however introducing TTS without a way of controlling it via keyboard doesn't seem wise.

@@ -544,6 +544,11 @@ void KiwixApp::saveWindowState()
mp_session->setValue("windowState", getMainWindow()->saveState());
}

void KiwixApp::saveVoiceName(const QString& langName, const QString& voiceName)
{
mp_session->setValue(langName + "/voice", voiceName);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way of generating the setting name from variable data introduces a possibility of conflict with another setting with a name generated in a different way. I think it is better to at least start setting names with a fixed known prefix (either move "voice" to the beginning, or prepend "tts:" or "text2speech:")

Comment on lines +93 to +104
/* We either stay with default voices or matches with the saved voice. */
int voiceIndex = 0;
for (int i = 0; i < ui->voiceComboBox->count(); i++)
{
if (m_voices[i].name() == currentVoice.name())
voiceIndex = i;
if (m_voices[i].name() == savedVoiceName)
{
voiceIndex = i;
break;
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please introduce a function int TextToSpeechBar::getVoiceIndex(const QString& voiceName) const and use it by trying to first obtain the saved voice, or the current voice as a fallback.

Comment on lines +144 to +145
if (m_speech.state() == QTextToSpeech::Speaking)
speak(m_text);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, that won't be good UX. If possible (given the limitation of pause/resuming functionality of QTextToSpeech) it would be better to pause the speech, change the voice and resume the speech.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first thought as well, unfortunately, QTextToSpeech's pause and resume barely work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement TTS
4 participants