-
-
Notifications
You must be signed in to change notification settings - Fork 39.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
XAP (née QMK API) discovery and scoping #11567
Comments
QMK Live ConfigPrior art:
Thoughts:
|
Re: WebUSB, and the associated PR. It's probably not completely in line with what we'd want, but two big things in it: the only command that works without user intervention is the firmware version command. Otherwise, everything else REQUIRES that the keyboard be sent a pairing request, and for a button to be pushed on the keyboard to complete pairing. Also, no matter what, I think that we absolutely need to have a "prefix" for commands over raw HID, so that multiple functions can use it, in parallel. Or some such solution, to properly share the endpoint. This is a MUST, IMO. |
Thinking about this further, I think I'd like some sort of formal API specification that is able to be used as a "master list" -- produce docs, compile-time generation of kb-side parser code, as well as potentially generating multi-language clients so that people can grab an SDK and hook in directly. Clearly a long-term initiative, but direction-wise that's where I'd like to head. |
Another one: |
Opening this issue for comment... RFC from @Wilba6582 @xyzz @Kasper24 @Drugantibus -- thoughts would be great. |
I agree with the proposed solution for multiplexing different functionalities on raw HID by using a "prefix" to dispatch incoming HID packets to the appropriate function, similarly to how I do it in my repo. IMO it's easy to understand, implement, and scale for future requirements. Using different usage pages/IDs should be possible (and probably "more correct" from a HID point of view), but adds unecessary complexity, clunkiness and potential compatibility issues. I for one wouldn't want a dozen hidraw devices when connecting my keyboard :) |
I like the idea of a prefix to send the packet to the correct functionality. Perhaps have byte 0 of any incoming raw packet be parsed as the function code. I agree that there should be a meta function that always exists. This function should be able to provide the essentials - keyboard manufacturer, name, QMK version, serial number (if exists), description, and then a list of supported functions (VIA, OpenRGB, etc). Possibly also common commands like reboot into bootloader. As for RGB control, I think it would make sense to have one function for RGB matrix. This endpoint should provide the matrix size (width and height) along with the LED-to-keycode mapping and a mapping table to map LEDs into a grid. It should provide a list of enabled RGB animations and their parameters. It should be able to set the selected animation/parameters. It should also provide a means to directly control the LEDs from PC-side software. This is already implemented in @Kasper24's OpenRGB protocol. It should also provide the ability to have more than one matrix, as some keyboards could have an underglow zone with multiple rows. RGB lights not part of a matrix would be another function. It would define number of LEDs and possibly some form of zone mapping. Same as the matrix implementation, it would provide direct PC-side control as well as control over built-in animations. Non-RGB functions should also be considered. Maybe a way to upload keymaps without re-flashing the firmware? A macro editor? Polling rate? |
One possible reason to create separate HID interfaces is that various USB APIs may prevent concurrent usage of the same HID interface by different apps, therefore running an app which performs RGB control might prevent other apps (e.g., keymap configuration like VIA) from communicating with the device if the same HID interface is used by both apps. However, this can be hindered by the endpoint shortage issues (e.g., the OTG controller in F401/F411 can support at most 3 HID interfaces, and normally 2 of them are already occupied by the boot keyboard and shared, therefore adding anything more than a single rawhid interface is not possible without requiring It could be possible to create a completely custom USB interface without any endpoints and communicate using just control transfers, but this introduces Windows driver issues (although it could be possible to implement Microsoft OS descriptors to install the WinUSB driver automatically; these descriptors would be needed for WebUSB too). Also, in theory the USB HID Usage Table 1.21 defines the Lighting And Illumination Page (0x59), which looks like an attempt to make some universal mechanism for RGB LED control (although the |
Adding https://remap-keys.app/ from @yoichiro |
Gripes raised during today's discussion:
|
I think is a good idea reflect the keycodes used on the firmware like shortname, longname and group.
The same goes for some key codes. If you don't have a backlight, it doesn't make much sense, see a key code to turn on the backlight. Or MIDI |
Yep, which is why the subsystem query commands must exist -- the "group" you mentioned allows mapping back to what should and shouldn't be shown. Very much a good idea. I think I'd also prefer to see these definitions as part of a JSON document in QMK's repo, with build-time codegen much like what's happening with the data-driven builds. We should be able to get this sort of keycode definition in early, without requiring API work to be complete... and things like VIA/vial/remap.keys et.al. can leverage the definitions directly, instead of maintaining their own mapping. |
@sigprof I assume that most functions wouldn't keep using the interface continuously though? Or at least not frequently enough that this issue couldn't be resolved by some scheduling/collision avoidance algorithm... But tbh you seem much more knowledgeable than me so I would like to hear your thoughts on this :) I guess that at this point my question is how we should approach modularity vs standardization. Is the plan to offer the community the possibility of implementing their own functions on the API or will it be a "core QMK thing" which is fully documented on the docs? Or somewhere in between? |
The plan is to allow users to work within their own "namespace" without colliding with normal QMK functionality. Ideally, if we've got the ability to mark certain APIs as "secure" (i.e. requiring some sort of handshake at runtime -- think entering bootloader or clearing EEPROM) then we could also provide the ability to user-mode APIs too. |
For clarity, I mean that the interface for communicating with QMK will be common between the QMK functionality and the user-mode functionality -- "entering" user-mode will likely require a known prefix of bytes which then any payload of your choosing can be specified. |
Regarding the lock vs no lock options, I can see the rationale for both for very good reasons. Without the lock, I could make a malicious small USB host device made to do one job: randomize key layouts of QMK-compatible keyboards, made for pranking co-workers. I would only need brief physical access but it could be done. With the lock, it adds some steps, either resetting and reloading firmware or figuring out the unlock pattern, etc. I could also envision malicious software that just does something like that too. Why such things would exist, I don't know, but it is possible in theory. So the question becomes "is the risk worth it? What is the likelihood and severity of having no lock?" Currently I would characterize it as low likelihood, with the potential for increased likelihood as QMK usage increases. Severity is moderate; it's annoying (best case) and a showstopper in a worst case scenario (a user that doesn't know how to fix, so keeb is dead until help arrives) However, with the lock, it could be very frustrating to use and test keyboard PCBs during development, depending on which features require unlock. If my kepmap for a new keyboard has some problems I need to debug, the last thing I need is an unlocking key combo standing in my way, especially if I'm using tweezers to test. Might I propose the following: Make the default to have the keyboard lock enabled, but allow a simple #define to disable the lock. Also, adopt a policy that any keyboard configuration submitted in a pull request must have the lock enabled. If someone really wants to make and ship keyboards without the lock enabled, they can certainly do it, but the hex file generated by QMK source (and QMK configurator) will have locking capability. Thus, if a user knows what they're doing and doesn't want the lock enabled, more power to them. |
Also, for the locking feature, when and/or how to re-lock after an unlock should be considered. Should it re-lock automatically after a no-remapping-activity timeout (and if so, how long), or stay unlocked indefinitely unless the user re-locks it manually (through the same key combo? a different key combo? through the API?)? |
I don't know how QMK will be deal with analog keys, joystick and potentiometers but I think is a good idea take this in care for the new API. Specially for MIDI, Mouse and Joystick keycode parameters. |
I implemented this HID usage on my firmware, but unfortunately Windows API (UWP) "eats" the device and prevents normal USB HID enumeration. You can access it through the UWP APIs but I've not been able to get it to work unless the app has focus no matter what I try, and only being able to control lighting from the foreground is not very useful. I do have it working with a different usage code, so that Windows doesn't know what it is, and with a basic implementation I built for the host side of the HID spec, it works very well - I've been running it for months actively driving my LEDs. Also, hilariously and painfully, if the device fails to confirm to the spec in any way - unexpected data, no data, etc - the entire windows UI freezes until it's killed. Not the app - the windows UI process. Had a load of fun debugging it. I filed a bug with Microsoft but I haven't gotten a response and don't expect one - UWP is all but abandoned in favor of cross-platform forms in .NET 6 so we'll probably have to wait for yet another new WinAPI for it to get fixed. Since the work in #10961 was happening concurrently I left my POC alone and figured I'd regroup later - taking the commonalities between my interface and the OpenRGB PR would make a good starting point for a RGB API interface. Happy to see that's begun - I've laid out some useful notes and thoughts below to start but I'll be back with more. I documented some of my journey in #692, It started as an extension of RAW HID, but I moved on and quickly came to the same conclusion that a mess was brewing and a consistent API was needed - and by the end of my POC I was sure. Unless you want to go read all the the gory details, my last comment summarizes my feelings on suitability of HID after finishing the POC . I won't bother repeating why we need an API; we know. For context, I was replying to a comment noting that virtsir was likely a better option than RAW_HID.
My description of how it works may not be fully accurate, this was my first foray into the protocol so I've picked it up as I went along. But I did note as you said that it seems possible to implement it without consuming an interface. Not knowing this at all at the onset anyway, I created a replacement for the RAW_HID interface that moves RAW_HID to its own usage page and gives it a report number, as others have mentioned is an option. I kept RAW_HID itself in place as they can both run on different interfaces if both are needed. Regarding the InputBindings, I didn't really solve it, just worked around the gap by generating a keymap at boot to respond to the HID requests with the correct info (proper usage of the protocol dictates this is mandatory). However, my thought at solving for dynamic keymaps was to break protocol slightly by responding to update reports with the keycode map for all keys present in the report; this would ensure that at least the host-cached keymap get updated whenever that LED changes as it would be able to see the discrepancy and use the new code. The physical characteristics in the reports is another gap, but they only matter if the host uses them. The WinAPI code is probably the only other host code in existence and as noted, if we support that nothing else can use our HID implementation on top of it not working well for many use cases. They can be optional/static to be filled in per-keyboard if desired/needed, and they don't have to be right. I have a functional but incomplete/unpolished local PR palp#1. It's specific to drop alt, some of the code still lives in my keymap where it began - solving the abstraction and graceful degradation of features was one of the things that I felt an API was needed for. I implemented the HID-based client inside the Aurora RGB lighting control app along with support for the layout, which let me test some pretty intense animations and quick reactivity, without doing my own mapping and framebuffers and all that noise. This is also in a PR that's even less polished, but it works well enough. It's very much worth noting that my biggest finding from driving the LEDs themselves was that I needed to completely disengage the RGB matrix loop to be absolutely sure nothing was double-writing my PWM buffers for the LEDs - at least I think that was the problem, and doing so fixed a recurring crash. But I think that's a hardware-specific implementation to the drop keyboards. It should serve as a reminder, though, that safeguards need to be in place to protect against bad behavior from the USB host crashing the keyboard. For example, I ignore the HID LampArray command to flush the update; my code does so when it knows it should and never at the behest of the host. I had a bug in my client that was requesting updates far too often, and since prior to that change I would blindly obey, it was tripping over itself in some way. It's also entirely possible you can't access the PWM from an interrupt and that was another source of my problems - embedded systems is not my field so I'm guessing at the root cause. I know enough to understand that building for this protocol is different than most prior art in the codebase because you're processing data as it's received on the control channel, not polling and handling it on your own time. |
To follow up with and summarize some thoughts on future direction:
|
So Microsoft first pushed the Lighting and Illumination Page into the HID specification, and then made sure that it cannot be used in a useful way 😡 The fact that the API works only for the foreground app is actually documented:
As for virtser, one more disadvantage of it is that it requires two input endpoints (one bulk endpoint for received data, and one interrupt endpoint for status notifications), and the OTG USB controller in chips like STM32F4x1 can support only 3 input endpoints, therefore you can at most fit one HID interface and one virtser interface there, and then you are out of input endpoints. The HID spec also says that a HID interface must have an input endpoint, even when no input reports are defined. So at least on some devices an option to use a completely custom interface which uses only control transfers might be needed; this may work fine for controlling the keyboard state from the host, but sending asynchronous notifications from the keyboard to the host would be impossible. Multiplexing the console stream over the API may look good, but would probably introduce the same issue that one process “eats” the device and makes it unavailable for other processes; then we would need some kind of “demux” process on the host (and some kind of protocol to connect to it, with access control on top…), so this should probably be considered as a really last resort solution.
Yes, that could definitely be a problem; in general, most QMK code is not even thread safe, and the interrupt safety is even more questionable (and what happens inside the atsam code might be even worse). From the look at your LampArray code, seems that |
I would like to play around with this, is this in a state where I can try it out? |
While it's not finished yet, I've been working on HID-IO for Keystone which is somewhat similar but is more of a full RPC mechanism: https://github.com/hid-io/hid-io-core I do have C implementations but I've been moving everything to rust. |
I dont think updating the firmware should be tied to using a browser with webusb, firefox does not support it, and electron doesn't work for cli apps either. Also in general, we really shouldnt trust the OS to handle something. Microsoft probably wont care and third party stuff is usually buggy, Apple will probably actively avoid it, and it will probably end up as a resource hogging daemon on linux, it is probably better to keep things in qmk. Also, the host should not be trusted, because WebUSB lets random websites talk to the keyboard, and users are not very smart. |
How far is XAP from being usable? Should I just use rawhid for now? and has development for this stopped? |
QMK API, aka VIA+OpenRGB+other rawhid all bundled into one
Feature Request Type
Description
The text was updated successfully, but these errors were encountered: