
Firmware Design Docs

Joshua Williams edited this page Oct 29, 2021 · 1 revision

tbs

purpose / use cases

A node to control the throttle/brake/steering actuators on the car.

<<I don't think this should be just one node. The way it is currently built, each actuator has one node controlling it. This way, if something changes about the actuators to, say, the brake and gas, the steering isn't affected. -Joshua>>

Avery: imo, making the key actuator control distributed amplifies the set of issues brought about by patchwork node availability. I'm open to splitting this up, but I think it vastly complicates shutdown in error states. I do agree that separating them into individual nodes gives us the freedom to change how one actuator works with certainty that the others aren't affected and we don't have to throw away older tests, but it's a tradeoff that I'm not 100% sold on yet. We should discuss this after the meeting.

design

tbs should be pretty dumb from a controls perspective.

That is to say, tbs only performs the actuations it is asked to perform. tbs is not primarily responsible for checking that the given inputs are safe to perform.

The car has 3 major actuators:

  • EPAS
  • Throttle control
  • Brake control

For EPAS, tbs should be given a specific absolute steering wheel 'angle'. Note that 0° ≠ 360° ≠ -360°: the angle counts full turns of the wheel and does not wrap around. tbs will probably use a simple PID controller to achieve these angles. As such, producers of steering wheel 'angles' should generate them from some sort of motion profile so that steering is smooth.
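A minimal sketch of the kind of PID loop described above, not a tuned controller. The class name, the gains, and the normalized effort limit are all placeholder assumptions; the real EPAS command format is not specified here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SteeringPid:
    """Minimal PID loop. Gains and the effort limit are placeholder assumptions."""
    kp: float
    ki: float
    kd: float
    effort_limit: float = 1.0  # hypothetical normalized EPAS effort bound
    _integral: float = field(default=0.0, init=False)
    _prev_error: Optional[float] = field(default=None, init=False)

    def update(self, target_deg: float, measured_deg: float, dt: float) -> float:
        if dt <= 0.0:
            raise ValueError("dt must be positive")
        # Absolute angles: no wrap-around, so 360 deg is one full turn from 0 deg.
        error = target_deg - measured_deg
        self._integral += error * dt
        if self._prev_error is None:
            derivative = 0.0  # no previous sample on the first call
        else:
            derivative = (error - self._prev_error) / dt
        self._prev_error = error
        effort = self.kp * error + self.ki * self._integral + self.kd * derivative
        # Clamp so a large step in the target cannot request unbounded effort.
        return max(-self.effort_limit, min(self.effort_limit, effort))
```

Because the output is clamped, a producer that jumps the target by a full turn would saturate the loop, which is one reason to feed it a motion profile instead of steps.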

As of now, Throttle control and Brake control should be given as percent actuation or position. 0% means no brake or no throttle. 50% means the throttle or brake will be pushed 50% of the way, etc. Some modeling and measurements may allow for tbs to consume a single acceleration/deceleration figure in SI units, if desired in the future.
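As a rough illustration of percent actuation, the sketch below maps a commanded fraction onto linear actuator travel. The function name and the `full_stroke_mm` parameter are hypothetical; the real pedal travel would have to be measured.

```python
def fraction_to_stroke_mm(fraction: float, full_stroke_mm: float) -> float:
    """Map an actuation fraction in [0, 1] onto actuator travel in millimetres.

    full_stroke_mm is an assumed, measured travel for a fully pressed pedal.
    """
    # The chained comparison is False for NaN, so NaN is rejected as well.
    if not 0.0 <= fraction <= 1.0:
        raise ValueError(f"actuation fraction out of range: {fraction}")
    return fraction * full_stroke_mm
```

For example, with an assumed 50 mm of travel, a 50% command would ask for 25 mm of stroke.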

assumptions

As tbs should be light on logic, we should expect it to behave predictably when everything is online, but care should be taken to test device loss in simulation and reality if possible.

There should never be more than one tbs.

inputs/outputs

inputs

tbs_action

  • float32 steering_angle
  • float32 throttle_position [0,1]
  • float32 brake_position [0,1]
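A sketch of how the tbs_action fields might be mirrored and range-checked on receipt. The field names come from the message above; the `TbsAction` class, the `validate` method, and the choice of degrees for the steering angle are assumptions. This only checks that a command is well-formed, not that it is safe to perform.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TbsAction:
    steering_angle: float     # absolute steering wheel angle (degrees assumed)
    throttle_position: float  # 0.0 = released, 1.0 = fully pressed
    brake_position: float     # 0.0 = released, 1.0 = fully pressed

    def validate(self) -> None:
        """Raise ValueError if any field is non-finite or out of range."""
        for name in ("steering_angle", "throttle_position", "brake_position"):
            if not math.isfinite(getattr(self, name)):
                raise ValueError(f"{name} is not finite")
        for name in ("throttle_position", "brake_position"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} outside [0, 1]")
```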

outputs

tbs_health

  • bool epas_good
  • bool throttle_good
  • bool brake_good

for debugging and tuning: tbs_stat

  • float32 steering_current
  • float32 steering_target

error detection and handling

philosophy

The buck stops with tbs. Because it is ultimately responsible for changing the state of the car in the real world, it must be designed to fail in a way that protects, in descending order of priority:

  • pedestrians
  • passengers
  • campus
  • the car

The safety of tbs only extends to the implementation of what it is told to perform. tbs will run over pedestrians if told to do so.

CAN Bus fault / device loss

In the event of a loss of EPAS/Throttle/Brake devices on the CAN bus, tbs should try to fail as safely as possible. The run should obviously be aborted, and tbs should run through a procedure dictated by the safety manager, which may involve bus reset, bus commands, or a total shutdown. In the event that the safety manager dies, tbs must engage the safe shutdown sequence:

  • if Throttle is online, issue a 0% throttle command.
  • if Brake is online, issue a high but safe brake command.
    • experimental testing is necessary to determine the value of brake actuation. For maximum safety, we should brake to minimize stopping distance.
    • in a situation with other cars on the road, we have to get a braking strategy from the safety manager.
  • if EPAS is online, issue a 0% effort command to allow the driver to take control.
  • signal failure state to the safety manager for further decision making
  • visibly (and audibly?) alert the driver of the situation (safety manager)
  • stop accepting further messages
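The safe shutdown sequence above could be sketched roughly as follows. The `send` and `notify_safety_manager` callbacks, the device names, and `SAFE_BRAKE_FRACTION` are all hypothetical; in particular the brake value is a placeholder until experimental testing settles it. Alerting the driver is left to the safety manager, as in the list.

```python
from dataclasses import dataclass
from typing import Callable

SAFE_BRAKE_FRACTION = 0.6  # placeholder: the real value needs experimental testing


@dataclass
class SafeShutdown:
    send: Callable[[str, float], None]            # hypothetical CAN command sender
    notify_safety_manager: Callable[[str], None]  # hypothetical failure signal
    accepting_commands: bool = True

    def run(self, throttle_online: bool, brake_online: bool, epas_online: bool) -> None:
        # Only command devices that are still reachable on the bus.
        if throttle_online:
            self.send("throttle", 0.0)
        if brake_online:
            self.send("brake", SAFE_BRAKE_FRACTION)
        if epas_online:
            self.send("epas_effort", 0.0)  # release steering to the driver
        self.notify_safety_manager("tbs_safe_shutdown")
        # From here on, incoming tbs_action messages should be ignored.
        self.accepting_commands = False
```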

If the safety manager deems this a transient fault, an alternative to aborting the run is to try to restart the bus, either through a node command or a full remount of the CAN bus.

device failure

EPAS non-actuation should be fairly easy to detect via position error, and should trigger the safe shutdown sequence.
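One way to detect non-actuation from position error is to flag a fault only when the error stays above a tolerance for some minimum time, so that normal tracking lag during a steering move is not mistaken for a failure. A sketch, where the class name, tolerance, and duration are assumptions to be tuned on the car:

```python
from typing import Optional


class PositionErrorMonitor:
    """Flags a fault when position error stays out of tolerance for too long."""

    def __init__(self, tolerance_deg: float, max_duration_s: float) -> None:
        self.tolerance_deg = tolerance_deg
        self.max_duration_s = max_duration_s
        self._error_since: Optional[float] = None

    def update(self, target_deg: float, measured_deg: float, now_s: float) -> bool:
        """Return True once the error has persisted for max_duration_s."""
        if abs(target_deg - measured_deg) <= self.tolerance_deg:
            self._error_since = None  # back in tolerance, reset the timer
            return False
        if self._error_since is None:
            self._error_since = now_s  # first out-of-tolerance sample
        return now_s - self._error_since >= self.max_duration_s
```

A True result would be the trigger for the safe shutdown sequence.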

Throttle and Brake non-actuation may be detectable if we can read back the linear actuator potentiometers (pending info). The safe shutdown sequence should be initiated, with an urgent prompt to brake if the Brake linear actuator has failed.

Upstream node stall/crash

The alternative failure case is the tbs command going stale. If a node up the stack dies, or gets stuck, tbs will continue to operate based on old data long past the validity of that data. A suitable way to handle this case is to introduce a timer in tbs, where old commands are only held onto for a fixed period of time, and tbs resorts to a stop when no command is active. This should never happen in a healthy system, and so it may also be prudent to initiate safe shutdown.
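The timer idea could look something like the sketch below: a command is only returned while it is younger than a fixed timeout. The class name and the timeout value are assumptions; timestamps are passed in explicitly so the logic can be tested, and in the node they would likely come from a monotonic clock such as `time.monotonic()`.

```python
from typing import Any, Optional


class CommandHolder:
    """Holds the most recent command and expires it after timeout_s seconds."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self._command: Optional[Any] = None
        self._received_at: Optional[float] = None

    def store(self, command: Any, now_s: float) -> None:
        self._command = command
        self._received_at = now_s

    def active(self, now_s: float) -> Optional[Any]:
        """Return the latest command if it is still fresh, otherwise None."""
        if self._command is None or self._received_at is None:
            return None
        if now_s - self._received_at > self.timeout_s:
            return None  # stale: the caller should stop or begin safe shutdown
        return self._command
```

When `active` returns None after a command had previously been stored, tbs would fall back to a stop and could also start the safe shutdown sequence.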

tbs stall/crash, total system loss

A tbs hang, crash, or total system lockup is also a dangerous situation. While it is easy to say this is unacceptable, it is still unavoidably possible. Orthogonal to tbs, this would seem to only be solved by a hardware watchdog on the CAN bus, which should detect the lack of EPAS/Throttle/Brake commands and initiate a forceful stop sequence. The design of such a hardware watchdog is outside the scope of this document.

logical error

tbs receives and executes an unsafe motion in an otherwise healthy system. The e-stop button must be properly connected and operational to handle this case. The e-stop system must not in any way require the rest of the system to be responsive, truthful, or alive. Therefore, tbs should not receive any information on the status of the e-stop switch, and instead be either killed or rendered impotent.

human braking?

If we detect the human driver actuating the brake, this should also signal a tbs fault. Driver intervention might be detected from EPAS external torque (for steering input), and perhaps from the vehicle CAN bus.


gps_interface

purpose / use cases

A node to control the NEO-M8U GNSS receiver and convert the NMEA sentences it outputs into ROS messages.

design

On startup, gps_interface will do the initial configuration of the receiver over I2C.

Then, at a regular time interval, gps_interface will poll the device for the latest NMEA sentences, also over I2C. Each sentence read out will be parsed, and a message will be published for each of a few supported sentence types.

NMEA sentences start with a $, followed by a short sequence of characters describing the origin (GPS, GLONASS, etc.) and type of the sentence. The most important sentence type this node will handle is $xxGNS, which, among other things, yields a latitude and longitude on the WGS-84 datum, along with an altitude above mean sea level and the geoid separation. For each received GNS sentence, a NavSatFix message will be published.
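A sketch of parsing a GNS sentence, assuming the standard NMEA layout: an XOR checksum after the `*`, latitude and longitude packed as degrees and decimal minutes, altitude above mean sea level in field 9, and geoid separation in field 10. The function and class names are hypothetical, and empty fields (no fix yet) are not handled here; they would raise a ValueError.

```python
from dataclasses import dataclass
from functools import reduce


def nmea_checksum(body: str) -> int:
    """XOR of every character between '$' and '*'."""
    return reduce(lambda acc, ch: acc ^ ord(ch), body, 0)


def _to_degrees(value: str, hemisphere: str) -> float:
    # NMEA packs angles as (d)ddmm.mmmm: whole degrees, then decimal minutes.
    raw = float(value)
    degrees = int(raw // 100)
    minutes = raw - degrees * 100
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


@dataclass
class GnsFix:
    latitude: float            # degrees, north positive
    longitude: float           # degrees, east positive
    altitude_msl_m: float      # altitude above mean sea level
    geoid_separation_m: float  # geoid height above the ellipsoid


def parse_gns(sentence: str) -> GnsFix:
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        raise ValueError("not an NMEA sentence")
    body, _, checksum = sentence[1:].partition("*")
    if int(checksum, 16) != nmea_checksum(body):
        raise ValueError("checksum mismatch")
    fields = body.split(",")
    # fields[0] is the two-character talker ID followed by the sentence type.
    if fields[0][2:] != "GNS":
        raise ValueError("not a GNS sentence")
    return GnsFix(
        latitude=_to_degrees(fields[2], fields[3]),
        longitude=_to_degrees(fields[4], fields[5]),
        altitude_msl_m=float(fields[9]),
        geoid_separation_m=float(fields[10]),
    )
```

Since NavSatFix defines altitude relative to the WGS-84 ellipsoid, the published value would probably be `altitude_msl_m + geoid_separation_m` rather than the raw altitude field.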

assumptions

The I2C kernel module is assumed to play nicely with multiple processes accessing different I2C devices. Also, the effects of an interrupt in the core communication loop should probably be studied, and may require writing our own kernel module to prevent interrupted reads of multi-byte sentence data.

inputs/outputs

inputs

None.

outputs

sensor_msgs/NavSatFix

a proper position if conversion is necessary for the consumer(s)

raw IMU data and/or dead-reckoning output

gps_interface_health - bool gps_good

perhaps some diagnostic messages

error detection and handling

gps_interface should bail if the receiver is not reachable during startup.

Errors at the communication level should be recoverable in most cases, but loss of connection mid-operation is unrecoverable.
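One possible reading of this policy is a bounded retry around each read: a few communication-level failures are retried, and a receiver that stays silent is treated as lost. In the sketch below, `read_once`, `ReceiverLost`, and the attempt count are hypothetical, and it is an assumption that bus-level errors surface as `OSError`.

```python
from typing import Callable, Optional


class ReceiverLost(RuntimeError):
    """Raised when the receiver stays unreachable; treated as unrecoverable."""


def read_with_retry(read_once: Callable[[], bytes], max_attempts: int = 3) -> bytes:
    """Call read_once until it succeeds, giving up after max_attempts failures."""
    last_error: Optional[OSError] = None
    for _ in range(max_attempts):
        try:
            return read_once()
        except OSError as exc:  # assumption: bus-level errors surface as OSError
            last_error = exc
    raise ReceiverLost(f"no response after {max_attempts} attempts") from last_error
```

On `ReceiverLost`, gps_interface would presumably report gps_good as false and exit.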

camera interface

vendors are great

The vendor of the two stereo cameras we're using on the car includes a ROS 2 node in their SDK!

We can integrate it into our codebase and let consumers decide how to use it.

Some highlights of topics it publishes:

  • ~/(left,right)/image_rect_color - a calibrated image from the left or right sensor in color
  • ~/point_cloud/cloud_registered - a point cloud generated from the stereo image
  • ~/pose - SLAM output if object detection onboard is used
  • ~/obj_det/objects - objects detected in each frame

Images use the nifty image_transport package that compresses images before transport, keeping bandwidth low.

downsides

There are many parameters to configure for each set of functions provided by the SDK. Not everything they produce will be useful to us, so to save resources where we can, we should prune the output of unused nodes. Luckily, these seem to be among the many configuration options.

Conversions might be necessary for ease of use with other pre-existing pieces of the stack.

Also, the SDK documentation is very unclear on resource usage, and suggests some operations are offloaded to the host GPU. A thorough review of nodes that publish topics of use to us will be required to assess suitability.
