Initial native message iface (#93)

* Initial native message iface
  I will develop a more complete native message interface, with json communication that will accept commands from the browser. For now though it is a simple stdin listener...
* Add an easy tester
  The python tester allows us to quickly test some changes. Extra work to be done. Pending.
* Producer consumer queue, test program and json. Translation is broken
  The old translation method will not work and I am not sure why. We need to create a new service anyways, but this is a problem for the Nick of tomorrow.
* One producer thread consumes stdin and enqueues a bunch of translation requests. Lock on stdout writing via callback
  Early versions. Several issues. Number one is termination. Is there a way of knowing when all service work is done so that we can safely die, before we have any queued translations pending?
* More progress
  Removed service shared pointer and put it inside the thread. Clean shutdown. Removed synchronisation with C style iostream. Modified the json
* Firefox always calls translateLocally with two arguments: path-to-manifest.json extension-id
  No way to change that. So easy hack for now: if you see the extension-id as an argument, switch to nativeMessaging.
* Add the length header to the output
  And flush, otherwise the message won't make it directly to Firefox
* Add toggle for HTML
  I need HTML in the extension because there is no non-HTML mode anymore
* Add support for langid tags and some code cleanup
  Wrapped one string in tr, removed unneeded headers and forward defines and used constexpr if template instead of overloaded functions
* Implement initial version of model swapping and pivoting
  What isn't implemented is downloading a model that is not already downloaded. Pretty much eventLoop is used wrongly
* On the way to the api proposed by @JelmervD
  Lots of refactoring, but now we have more reliable parsing of the incoming messages. Outgoing API is not implemented. I have made some modifications of what I expect and I will post my proposed edits tomorrow morning. We are also facing an issue with working in a separate thread. Any Qt object's connections must be made in the same thread as the one in which the object was constructed, as opposed to in a child thread. This means that we need to rethink the design so that FetchRemoteModels and DownloadModels are both called from the main thread.
* Move everything to the main thread.
  Listing models works. Fetching models is implemented but some extra details are needed to make it useful
* `return` at the wrong level?
* Fix response message structure
* Somewhat more proper functionality
* No termination condition, but everything is working
  Finalise is never called, because it will not complete any slots that are in progress. The way to do this is to probably have a second thread that keeps track of the termination condition and calls finalise() there...
* First fully functional prototype.
  Not very well tested
* Do not set `success:true` on download progress messages
  This makes handling TranslateLocally's responses much easier since we can keep the 1 request 1 response (with success = true or false) model in the extension. `update:true` messages can later be handled by something similar to progress reporting QFuture-like promises in the extension.
* Send single object in `data`
  As opposed to an array of objects
* Fix bug where remote model was selected when local model was available
* Add a todo note
* Move model matching code to ModelManager
  Temporarily removed the map that speeds up finding models but needs to be kept in sync with the model list. Will bring it back at some point but keep it hidden inside ModelManager's internals. I've tasted std::variant and it is now my favourite food.
* (hopefully) fix compilation issue for macOS 10.15
* Revert "(hopefully) fix compilation issue for macOS 10.15"
  This reverts commit 9512113. It is essentially a "compiler isn't told properly that we're using C++17" issue. See https://stackoverflow.com/a/43631668
* Rewrite native_client.py example to use asyncio
  Necessary preparations for load testing later on, where I want to test concurrent requests like there would be from a browser extension trying to keep up with tabs loading & translating.
* Simplify DownloadModel message handling
* Update for changes to master
* Upload timing test code as well
* Catch nullptr returned from downloadFile
  The whole `operations_--` counter thing is very easy to mess up :(
* Make progress messages more useful for a client
  Passing bytes instead of 0..1 so a client can show download speed and size if they want to.
* Implement & test single shot connection type for signal handling
* Implement model-specific translations
  It falls back too often to `QString id` when it already has access to a `Model`, but it has fewer separate code paths.
* Implement isSameModel through Model::id()
* When includeRemote is false, don't include remote models.
* Defensive work-around in test for broken model_info.json in eng-fin model.
* Fill in srcTags and trgTag based on shortName if they're missing from model_info.json
* Graceful shutdown of native client
* Debug stuff
  It's a debug program. No use not having the debug print statements in there…
* Make connectSingleShot also accept lambdas
* Move respond logic to write* methods
  Also always pass along the request because it makes debugging a lot easier since you can see which request you're responding to. Moved a lot of the logic to single shot connections since ModelManager does not guarantee that every request that accepts `extradata` will also produce a signal. This also helps group all the logic for handling one request type in one place, which is nice.
* Possibly fix undeclared `Args` in gcc
* Attempt to fix compilation issue where QStringList (which extends QList<QString>) does not seem to have a constructor that accepts begin and end iterators.
* Base connectSingleShot template on the signal it connects to instead of the slot
  Maybe this works? Problem was in the Qt5 bit, not the Qt6 bit.
* Fix macOS compilation issue by replacing `char[]` with `std::vector<char>`
* Add async shutdown test
  In which I write requests, close stdin, and then wait to see if all responses do end up on stdout.
* Simplify iothread & operation count locking a bit
  Basically I wanted a semaphore that blocked till it was 0. But until then this will do.
* Add code to register native messaging client with Firefox on launch
  Maybe we should first ask though
* Revert to using atomic
  In my limited testing, it seems to work okay even when the condition_variable does not share the same mutex as the atomic.
* Fix concurrent downloads
* Add comments to native messaging launch detection
* Print errors to stderr
* Change modelID -> id in download update message to match format in ListModels and DownloadModel success messages
* Document each of the requests
* Remove `die_`
* Replace `shared_ptr<vector<char>>` with QByteArray
* Fix windows registry path
  … I think
* Switch stdin & stdout to binary mode on Windows
* Include the right headers on Windows
* Use the other slash
  Cause Qt [is funny like that](https://github.com/qt/qtbase/blob/a0a2bf2d95d4fcd468b6ce3c2e728d95425dd760/src/corelib/io/qsettings_win.cpp#L103-L115)
* Fix call to `std::isspace`
  Documentation is explicit about only calling it with unsigned char, and Windows' C++ runtime is checking for that.
* Move extension ids to a single place
* Update README with info about native messaging
* Remove superfluous `)`
* Mention limitations up front
  So nobody gets hurt

Co-authored-by: Jelmer van der Linde <[email protected]>
XapaJIaMnu · Apr 13, 2022 · 50db857
1 parent 8197c98
commit 50db857
Showing 19 changed files with 1,587 additions and 93 deletions.
diff --git a/3rd_party/bergamot-translator b/3rd_party/bergamot-translator
diff --git a/CMakeLists.txt b/CMakeLists.txt
@@ -128,6 +128,7 @@ include_directories(logo)
 add_subdirectory(src)
 # This doesn't quite work if placed in the subdirectory's CMakeLists.txt
 set(PROJECT_SOURCES
+        src/constants.h
         src/main.cpp
         src/mainwindow.cpp
         src/mainwindow.h
@@ -145,9 +146,11 @@ set(PROJECT_SOURCES
         src/Translation.h
         src/Translation.cpp
         src/types.h
-        src/CommandLineIface.h
-        src/CommandLineIface.cpp
-        src/CLIParsing.h
+        src/cli/CLIParsing.h
+        src/cli/CommandLineIface.cpp
+        src/cli/CommandLineIface.h
+        src/cli/NativeMsgIface.cpp
+        src/cli/NativeMsgIface.h
         src/inventory/ModelManager.cpp
         src/inventory/ModelManager.h
         src/inventory/RepoManager.cpp

diff --git a/README.md b/README.md
@@ -127,6 +127,29 @@ sacrebleu -t wmt13 -l en-es --echo ref > /tmp/es.in
 cat /tmp/es.in | ./translateLocally -m es-en-tiny | ./translateLocally -m en-de-tiny -o /tmp/de.out
 ```
 
+# NativeMessaging interface
+translateLocally can integrate with other applications and browser extensions using [native messaging](https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Native_messaging). This is similar to using pipes on the command line, except that messages are JSON, which lets you specify options per input fragment, and translated fragments are returned as soon as they become available rather than in input order.
+
+## Limitations
+Incoming messages are currently limited to 10MB, which matches the limitations of Firefox. Responses are limited to about 4GB because the native messaging format stores the message length in a 32-bit integer.
+
+## Using NativeMessaging from Python
+Start translateLocally in a subprocess with the `-p` option, and pass it messages [formatted as described here](https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Native_messaging#app_side) to its stdin. All supported messages are described in the [NativeMsgIface.h](src/cli/NativeMsgIface.h) file.
+
+There is an example, [native_client.py](scripts/native_client.py), that demonstrates how to use translateLocally as an async Python API.
+
+## Using NativeMessaging from browser extensions
+Right now, the functionality is only automatically available to Firefox.
+
+translateLocally automatically registers itself with Firefox when you start translateLocally in GUI mode. Then you can install the [Firefox translation addon](https://github.com/jelmervdl/firefox-translations/releases). After installation of the addon, go into the addon settings and pick "translateLocally" as translation provider.
+
+### Developing your own browser extension
+Due to the way Firefox and Chrome call translateLocally, you will need to add your browser extension id to the translateLocally source code before it is able to accept native messages.
+
+Add your extension id to [constants.h](src/constants.h) and rebuild translateLocally from source. Once you start it in GUI mode, it will re-register itself with support for your extension.
+
+If you want your extension id added to translateLocally permanently, please open an issue or send us a pull request!
+
 # Importing custom models
 translateLocally supports importing custom models. translateLocally uses the [Bergamot](https://github.com/browsermt/marian-dev) fork of [marian](https://github.com/marian-nmt/marian-dev). As such, it supports the vast majority of marian models out of the box. You can just train your marian model and place it in a directory.
 ## Basic model import
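
The message framing that the new README section points to can be sketched in a few lines of Python. Each message is UTF-8 JSON preceded by a 32-bit length in native byte order, which is also what `scripts/native_client.py` writes with `struct.pack("@I", ...)`. The helper names `encode_message` and `decode_message` are hypothetical, and reading the README's "10MB" as 10 * 1024 * 1024 bytes is an assumption; the `command`/`id`/`data` fields mirror the ones the test client sends.

```python
import json
import struct

# Assumption: the README's "10MB" limit is read here as 10 MiB.
MAX_INCOMING_BYTES = 10 * 1024 * 1024
HEADER_SIZE = struct.calcsize("@I")  # size of the native unsigned int length prefix


def encode_message(command, message_id, data):
    """Serialise one request as length prefix + JSON payload (hypothetical helper)."""
    payload = json.dumps({"command": command, "id": message_id, "data": data}).encode("utf-8")
    if len(payload) > MAX_INCOMING_BYTES:
        raise ValueError("message exceeds the incoming message limit")
    return struct.pack("@I", len(payload)) + payload


def decode_message(frame):
    """Parse one complete frame back into a Python object (hypothetical helper)."""
    (length,) = struct.unpack("@I", frame[:HEADER_SIZE])
    payload = frame[HEADER_SIZE:HEADER_SIZE + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return json.loads(payload)


if __name__ == "__main__":
    frame = encode_message("ListModels", 1, {"includeRemote": False})
    print(decode_message(frame))
```

Because the length prefix is an unsigned 32-bit integer, no single message can exceed roughly 4GB, which is where the response limit mentioned in the README comes from.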

diff --git a/scripts/native_client.py b/scripts/native_client.py
@@ -0,0 +1,342 @@
+#!/usr/bin/env python3
+'''A native client simulating the plugin to use for testing the server'''
+import asyncio
+import itertools
+import struct
+import json
+import time
+import sys
+import csv
+from pathlib import Path
+from pprint import pprint
+from tqdm import tqdm
+
+
+class Timer:
+    """Little helper class to measure the runtime of async function calls and
+    dump all of those to a CSV.
+    """
+    def __init__(self):
+        self.measurements = []
+
+    async def measure(self, coro, *details):
+        start = time.perf_counter()
+        result = await coro
+        end = time.perf_counter()
+        self.measurements.append([end - start, *details])
+        return result
+
+    def dump(self, fh):
+        # TODO stats? For now I just export to Excel or something
+        writer = csv.writer(fh)
+        writer.writerows(self.measurements)
+
+
+class Client:
+    """asyncio based native messaging client. Main interface is just calling
+    `request()` with the right parameters and awaiting the future it returns.
+    """
+    def __init__(self, *args):
+        self.serial = itertools.count(1)
+        self.futures = {}
+        self.args = args
+
+    async def __aenter__(self):
+        self.proc = await asyncio.create_subprocess_exec(*self.args, stdin=asyncio.subprocess.PIPE, stdout=asyncio.subprocess.PIPE)
+        self.read_task = asyncio.create_task(self.reader())
+        return self
+
+    async def __aexit__(self, *args):
+        self.proc.stdin.close()
+        await self.proc.wait()
+
+    def request(self, command, data, *, update=lambda data: None):
+        message_id = next(self.serial)
+        message = json.dumps({"command": command, "id": message_id, "data": data}).encode()
+        # print(f"Sending: {message}", file=sys.stderr)
+        future = asyncio.get_running_loop().create_future()
+        self.futures[message_id] = future, update
+        self.proc.stdin.write(struct.pack("@I", len(message)))
+        self.proc.stdin.write(message)
+        return future
+
+    async def reader(self):
+        while True:
+            try:
+                raw_length = await self.proc.stdout.readexactly(4)
+                length = struct.unpack("@I", raw_length)[0]
+                raw_message = await self.proc.stdout.readexactly(length)
+
+                # print(f"Receiving: {raw_message.decode()}", file=sys.stderr)
+                message = json.loads(raw_message)
+
+                # Not cool if there is no response message "id" here
+                if "id" not in message:
+                    continue
+
+                # print(f"Receiving response to {message['id']}", file=sys.stderr)
+                future, update = self.futures[message["id"]]
+
+                if "success" in message:
+                    del self.futures[message["id"]]
+                    if message["success"]:
+                        future.set_result(message["data"])
+                    else:
+                        future.set_exception(Exception(message["error"]))
+                elif "update" in message:
+                    update(message["data"])
+            except asyncio.IncompleteReadError:
+                break # Stop read loop if EOF is reached
+            except asyncio.CancelledError:
+                break # Also stop reading if we're cancelled
+
+
+class TranslateLocally(Client):
+    """TranslateLocally wrapper around Client that translates
+    our defined messages into functions with arguments.
+    """
+    async def list_models(self, *, include_remote=False):
+        return await self.request("ListModels", {"includeRemote": bool(include_remote)})
+
+    async def translate(self, text, src=None, trg=None, *, model=None, pivot=None, html=False):
+        if src and trg:
+            if model or pivot:
+                raise ValueError("Cannot combine src + trg and model + pivot arguments")
+            spec = {"src": str(src), "trg": str(trg)}
+        elif model:
+            if pivot:
+                spec = {"model": str(model), "pivot": str(pivot)}
+            else:
+                spec = {"model": str(model)}
+        else:
+            raise ValueError("Missing src + trg or model argument")
+
+        result = await self.request("Translate", {**spec, "text": str(text), "html": bool(html)})
+        return result["target"]["text"]
+
+    async def download_model(self, model_id, *, update=lambda data: None):
+        return await self.request("DownloadModel", {"modelID": str(model_id)}, update=update)
+
+
+def first(iterable, *default):
+    """Returns the first value of anything iterable, or raises StopIteration
+    if it is empty. Or, if you specify a default argument, it will return that.
+    """
+    return next(iter(iterable), *default) # passing as rest argument so it can be nothing and trigger StopIteration exception
+
+
+def get_build():
+    """Instantiate an asyncio TranslateLocally client that connects to
+    translateLocally in your local build directory.
+    """
+    return TranslateLocally(Path(__file__).resolve().parent / Path("../build/translateLocally"), "-p")
+
+
+async def download_with_progress(tl, model, position):
+    """tl.download_model but with a tqdm powered progress bar."""
+    with tqdm(position=position, desc=model["modelName"], unit="b", unit_scale=True, leave=False) as bar:
+        def update(data):
+            assert data["read"] <= data["size"]
+            bar.total = data["size"]
+            diff = data["read"] - bar.n
+            bar.update(diff)
+        return await tl.download_model(model["id"], update=update)
+
+
+async def test():
+    """Test TranslateLocally functionality."""
+    async with get_build() as tl:
+        models = await tl.list_models(include_remote=True)
+        pprint(models)
+
+        # Models necessary for tests, both direct & pivot
+        necessary_models = {("en", "de"), ("en", "es"), ("es", "en")}
+
+        # From all models available, pick one for every necessary language pair
+        # (preferring tiny ones) so we can make sure these are downloaded.
+        selected_models = {
+            (src,trg): first(sorted(
+                (
+                    model
+                    for model in models
+                    if src in model["srcTags"] and trg == model["trgTag"]
+                ),
+                key=lambda model: 0 if model["type"] == "tiny" else 1
+            ))
+            for src, trg in necessary_models
+        }
+
+        pprint(selected_models)
+
+        # Download them. Even if they're already model['local'] == True, to test
+        # that in that case this is a no-op.
+        await asyncio.gather(*(
+            download_with_progress(tl, model, position)
+            for position, model in enumerate(selected_models.values())
+        ))
+        print() # tqdm messes a lot with the print position, this makes it less bad
+
+        # Test whether the model list has been updated to reflect that the
+        # downloaded models are now local.
+        models = await tl.list_models(include_remote=True)
+        assert all(
+            model["local"]
+            for selected_model in selected_models.values()
+            for model in models
+            if model["id"] == selected_model["id"]
+        )
+
+        # Perform some translations, switching between the models
+        translations = await asyncio.gather(
+            tl.translate("Hello world!", "en", "de"),
+            tl.translate("Let's translate another sentence to German.", "en", "de"),
+            tl.translate("Sticks and stones may break my bones but words WILL NEVER HURT ME!", "en", "es"),
+            tl.translate("I <i>like</i> to drive my car. But I don't have one.", "en", "de", html=True),
+            tl.translate("¿Por qué no funciona bien?", "es", "de"),
+            tl.translate("This will be the last sentence of the day.", "en", "de"),
+        )
+
+        pprint(translations)
+
+        assert translations == [
+            "Hallo Welt!",
+            "Übersetzen wir einen weiteren Satz mit Deutsch.",
+            "Palos y piedras pueden romper mis huesos, pero las palabras NUNCA HURT ME.",
+            "Ich <i>fahre gerne</i> mein Auto. Aber ich habe keine.", #<i>fahre</i>???
+            "Warum funktioniert es nicht gut?",
+            "Dies wird der letzte Satz des Tages sein.",
+        ]
+
+        # Test bad input
+        try:
+            await tl.translate("This is impossible to translate", "en", "xx")
+            assert False, "How are we able to translate to 'xx'???"
+        except Exception as e:
+            assert "Could not find the necessary translation models" in str(e)
+
+    print("Fin")
+
+
+async def test_third_party():
+    """Test whether TranslateLocally can switch between different types of
+    models. This test assumes you have the OPUS repository in your list:
+    https://object.pouta.csc.fi/OPUS-MT-models/app/models.json
+    """
+    async with get_build() as tl:
+        models_to_try = [
+            'en-de-tiny',
+            'en-de-base',
+            'eng-fin-tiny', # model has broken model_info.json so won't work anyway :(
+            'eng-ukr-tiny',
+        ]
+
+        models = await tl.list_models(include_remote=True)
+
+        # Select a model from the model list for each of models_to_try, but
+        # leave it out if there is no model available.
+        selected_models = {
+            shortname: model
+            for shortname in models_to_try
+            if (model := first((model for model in models if model["shortname"] == shortname), None))
+        }
+
+        await asyncio.gather(*(
+            download_with_progress(tl, model, position)
+            for position, model in enumerate(selected_models.values())
+        ))
+
+        # TODO: Temporary filter to figure out 'failed' downloads. eng-fin-tiny
+        # has a broken JSON file so it will download correctly, but still not
+        # be available or show up in this list. We should probably make the
+        # download fail in that scenario.
+        models = await tl.list_models(include_remote=False)
+        for shortname in list(selected_models.keys()):
+            if not any(True for model in models if model["shortname"] == shortname):
+                print(f"Skipping {shortname} because it didn't show up in model list after downloading", file=sys.stderr)
+                del selected_models[shortname]
+
+        translations = await asyncio.gather(*[
+            tl.translate("This is a very simple test sentence", model=model["id"])
+            for model in selected_models.values()
+        ])
+
+        pprint(list(zip(selected_models.keys(), translations)))
+
+
+async def test_latency():
+    timer = Timer()
+
+    # Our line generator: just read Crime & Punishment from stdin :D
+    lines = (line.strip() for line in sys.stdin)
+
+    async with get_build() as tl:
+        for epoch in range(100):
+            print(f"Epoch {epoch}...", file=sys.stderr)
+            for batch_size in [1, 5, 10, 20, 50, 100]:
+                await asyncio.gather(*(
+                    timer.measure(
+                        tl.translate(line, "en", "de"),
+                        epoch,
+                        batch_size,
+                        len(line.split(' ')))
+                    for n, line in zip(range(batch_size), lines)
+                ))
+
+    timer.dump(sys.stdout)
+
+
+async def test_concurrency():
+    async with get_build() as tl:
+        fetch_one = tl.list_models(include_remote=True)
+        fetch_two = tl.list_models(include_remote=False)
+        fetch_three = tl.list_models(include_remote=True)
+        await asyncio.gather(fetch_one, fetch_two, fetch_three)
+
+
+async def test_shutdown():
+    tasks = []
+    async with get_build() as tl:
+        for n in range(10):
+            print(f"Requesting translation {n}")
+            tasks.append(tl.request("Translate", {
+                "src": "en",
+                "trg": "de",
+                "text": f"This is simple sentence number {n}!",
+                "html": False
+            }))
+        print("Shutting down")
+    print("Shutdown complete")
+    for translation in asyncio.as_completed(tasks):
+        print(await translation)
+    print("Fin.")
+
+
+async def test_concurrent_download():
+    """Test parallel downloads."""
+    async with get_build() as tl:
+        models = await tl.list_models(include_remote=True)
+        remote = [model for model in models if not model["local"]]
+        downloads = [
+            tl.download_model(model["id"])
+            for model, _ in zip(remote, range(3))
+        ]
+        await asyncio.gather(*downloads)
+
+
+def main():
+    tests = {
+        "test": test,
+        "third-party": test_third_party,
+        "latency": test_latency,
+        "concurrency": test_concurrency,
+        "shutdown": test_shutdown,
+        "concurrent-downloads": test_concurrent_download
+    }
+
+    if len(sys.argv) == 1 or sys.argv[1] not in tests:
+        print(f"Usage: {sys.argv[0]} {' | '.join(tests.keys())}", file=sys.stderr)
+    else:
+        asyncio.run(tests[sys.argv[1]]())
+
+
+main()
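
The bookkeeping in `Client.reader()` above, where every request id maps to a future plus an update callback, can be looked at in isolation. The sketch below is an illustration under assumptions and not part of the commit: `PendingRequests` and `demo` are hypothetical names, and the message dictionaries are made-up samples that only follow the `id` / `success` / `update` / `data` / `error` shape the script reads. It shows why progress messages carry `update` without `success`: the future resolves exactly once, on the single final response.

```python
import asyncio


class PendingRequests:
    """Routes incoming messages to the request that is waiting for them."""

    def __init__(self):
        self.futures = {}  # message id -> (future, update callback)

    def register(self, message_id, update=lambda data: None):
        future = asyncio.get_running_loop().create_future()
        self.futures[message_id] = (future, update)
        return future

    def dispatch(self, message):
        """Handle one decoded message; returns False if nobody is waiting for it."""
        entry = self.futures.get(message.get("id"))
        if entry is None:
            return False
        future, update = entry
        if "success" in message:
            # Final response: forget the request and resolve its future.
            del self.futures[message["id"]]
            if message["success"]:
                future.set_result(message["data"])
            else:
                future.set_exception(RuntimeError(message["error"]))
        elif "update" in message:
            # Progress message: the request stays pending.
            update(message["data"])
        return True


async def demo():
    pending = PendingRequests()
    progress = []
    future = pending.register(1, update=progress.append)
    # Made-up sample messages, shaped like the ones the reader loop handles.
    pending.dispatch({"id": 1, "update": True, "data": {"read": 10, "size": 100}})
    pending.dispatch({"id": 1, "success": True, "data": {"id": "example-model"}})
    return progress, await future


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Unlike the script, this version ignores messages with an unknown id instead of raising `KeyError`, which is a design choice of the sketch rather than behaviour of the commit.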