laplans

Hacking My Chevy Volt to Auto-Switch Driving Modes for Efficiency

Let’s talk about my car. I have a Chevy Volt, which is a Plug-in Hybrid Electric Vehicle (PHEV). On electric alone, it gets about 50 miles of range in ideal conditions, and then it has a gas tank to take you the rest of the way on long trips (about 350 miles from gas). The car also has a handful of modes (as do most cars). I won’t bother going into them all here since I just want to focus on two: “normal mode” and “hold mode”. In normal mode, the car uses the electric range first and then switches to gas. In hold mode, the car switches to the gas engine and “holds” the battery where it is. This lets you save your remaining electric range for later, which is useful on long trips since the electric motor is more efficient below 50 MPH.

So what’s the problem? When I take long trips that exceed the electric range of my car, I want to use the electric range as efficiently as possible. To do that, I try to end the trip with 0 miles of electric range, using the electric motor anytime I’m traveling under 50 MPH and the gas engine anytime I’m over 50 MPH. This is easy enough to do by hand, but it also seems like an extremely easy thing for the car to do automatically. Chevy could have added another mode called “trip mode” that would do exactly that, but they didn’t, and that’s frustrating. Also, after owning the car for almost 5 years, I find I’m much more forgetful on long trips when it comes to switching modes, and I’ll regularly end up on the highway burning through my electric range without noticing.

Designing “Trip Mode”

So, in this blog post, I’m going to be designing and implementing “trip mode” for my Chevy Volt. The final code, setup, and usage instructions will be available in the GitHub repository for this blog post. First, let’s talk about what we need the mode to do at a minimum:

  1. When the car starts, something needs to prompt: “Are you taking a long trip?”.
  2. In trip mode, when traveling under 50 MPH, the car should be in normal mode.
  3. When traveling over 50 MPH, the car should be in hold mode.
  4. There should be a delay or cooldown to prevent mode switching too frequently.
  5. The speed threshold should be configurable.

I have a Raspberry Pi lying around along with a compatible 7-inch touchscreen, so that will work great as an interface to prompt for enabling “trip mode”. To talk to and control the car, I will use a Gray Panda from Comma AI. Comma AI doesn’t sell the Gray Panda anymore, but any of the Panda colors will work fine. For this project the $99 White Panda would be enough. There are probably other OBD-II devices that could be used (I’m honestly not sure).

Hardware List

Here is the complete hardware list:

  1. Raspberry Pi 4B 4GB or equivalent.
  2. Any Panda should work.
  3. A male-to-male USB type-A cable to connect the Pi to the Panda (just Google and pick one).
  4. A touchscreen for the Raspberry Pi to turn trip mode on and off (I also have the case for the touchscreen).
  5. Some way to mount it somewhere in your car if you want.

I didn’t have a male-to-male USB type-A cable lying around, but a USB type-A to type-C cable will work if you have a male type-C to male type-A adapter like I do:

Makeshift USB type-A male-to-male cable

Once I had the hardware, I connected it up like this:

Back wire connection
Front wire connections
Panda connected to OBD-II

The image on the left shows the back of the Raspberry Pi screen case all loaded up with the Pi and everything. The Panda is just plugged into one of the Raspberry Pi USB ports and the white cable provides power. The white cable can be plugged into either USB port available in the center dash on the Volt. My Raspberry Pi complained about being underpowered sometimes, but it didn’t seem to affect anything. The image on the right shows the Panda plugged into the OBD-II port to the left of the steering wheel, down towards the floor. My cables were all long enough that I could rest the Pi above the Volt’s built-in touchscreen. I haven’t decided where or how I’m going to mount this thing yet. There is also probably a smaller screen you could get for the Pi, or you could just wire up a single button. This was the hardware I had lying around, so everything I chose was chosen to avoid having to buy anything new.

Interfacing with the Panda from a Raspberry Pi

It’s possible to interface with the Panda via USB from a computer as well, but since the Pi was going to need to talk to it eventually, I just used the Pi as the main interface. The Panda also appears to have Wi-Fi but I didn’t explore that at all. My workflow and coding setup was this:

  1. Pre-configure the Raspberry Pi to connect to my home Wi-Fi.
  2. Get in my car, turn it on, and plug everything in.
  3. SSH to the Pi from my laptop and do all the work from there, transferring files to and from the Pi via SCP as needed.

This is not the best, most comfortable, or most efficient setup, but I was happy with it and it never annoyed me so much that I felt compelled to make it easier.

Preparing the Raspberry Pi

The Panda requires Python 3.8 to build and flash the firmware, which we will need to do later, so let’s sort that out now. As of this writing, Python 3.8 is not the default Python on Raspbian. So you can either flash the Pi with Ubuntu (I didn’t try this) or you can follow the instructions here for getting Python 3.8 on the Pi. Once done, type python --version at the prompt to ensure the default Python is Python 3.8. Also make sure pip --version says it’s using Python 3.8.

Reading data from the CAN bus

We’ll start by writing a simple Python script that will read data from the CAN bus and print it out as a messy jumble of unreadable nonsense (exactly like the example provided in the README for the Panda on GitHub).

First, we have to set up some udev rules so that, when the Panda is detected, the device node is given permissions 0666 instead of the default 0660 (i.e. allow everyone read/write, not just the owner and group).

sudo tee /etc/udev/rules.d/11-panda.rules <<EOF
SUBSYSTEM=="usb", ATTRS{idVendor}=="bbaa", ATTRS{idProduct}=="ddcc", MODE="0666"
SUBSYSTEM=="usb", ATTRS{idVendor}=="bbaa", ATTRS{idProduct}=="ddee", MODE="0666"
EOF
sudo udevadm control --reload-rules && sudo udevadm trigger

It’s fine if the Panda was already connected; that’s what the udevadm trigger is for. If you’re curious, you can check dmesg when you plug in the Panda and you should see something like this:

[ 1645.769508] usb 1-1.3: new full-speed USB device number 5 using dwc_otg
[ 1645.906006] usb 1-1.3: not running at top speed; connect to a high speed hub
[ 1645.916768] usb 1-1.3: New USB device found, idVendor=bbaa, idProduct=ddcc, bcdDevice= 2.00
[ 1645.916796] usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 1645.916812] usb 1-1.3: Product: panda
[ 1645.916826] usb 1-1.3: Manufacturer: comma.ai                                        
[ 1645.916841] usb 1-1.3: SerialNumber: XXXXXXXXXXXXXXXXXXXXXXXX

You can also run lsusb to see the bus and device number given to the Panda (it will be the one with ID bbaa:ddcc or bbaa:ddee).

Bus 001 Device 005: ID bbaa:ddcc

This tells us the device path for the Panda is /dev/bus/usb/001/005, and sure enough if I do an ls -l on /dev/bus/usb/001 I see device 005 has the rw permission for owner, group, and other, which is exactly what the udev rule above was supposed to do:

crw-rw-rw- 1 root root 189, 4 Nov  8 17:00 005
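The same check can be scripted. Here is a minimal sketch, assuming the device path from the lsusb output above; the helper name is_world_read_write is made up for illustration and is not part of the Panda tooling. It reads the permission bits with os.stat and confirms that “other” users have both read and write access.

```python
import os
import stat


def is_world_read_write(path: str) -> bool:
    """Return True if 'other' users can both read and write the node at path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)  # keep only the permission bits
    wanted = stat.S_IROTH | stat.S_IWOTH        # o+r and o+w
    return (mode & wanted) == wanted


if __name__ == "__main__":
    # Hypothetical path: substitute the bus/device numbers that lsusb reported.
    device = "/dev/bus/usb/001/005"
    if os.path.exists(device):
        print(device, "world read/write:", is_world_read_write(device))
```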

With all of that unnecessary verification out of the way, we should now be able to interface with the Panda. So let’s first install the required Python dependency.

pip install pandacan

Then, from within a Python shell, you should be able to run the following commands and get a wall of nonsense printed back after the can_recv step:

>>> from panda import Panda
>>> panda = Panda()
>>> panda.can_recv()

Reading The Car’s Current Speed

In order to switch modes based on the car’s current speed, we will need to monitor how fast the car is going. Figuring out which messages on the CAN bus contain the car’s speed, and how to parse it out, would require a lot of trial and error and reverse engineering. Thankfully, much of that has already been done. Comma AI’s openpilot is already a fairly mature system that can add auto-drive features to a surprising number of cars, so they’ve already figured a lot of this stuff out. There are also plenty of resources from the car hacking community that break down several different message types for different cars. A lot of good reversing info on the Volt (and other cars) can be found here and here. However, the best resource I found, once I learned how to decipher it, was the opendbc repository from Comma AI.

Let’s see if we can find any information about how to parse the vehicle speed from the GM DBC files in opendbc. Looks like the file we want is gm_global_a_powertrain.dbc since it contains an ECMVehicleSpeed message definition with a VehicleSpeed signal definition in miles per hour. Sounds like exactly what we want:

BO_ 1001 ECMVehicleSpeed: 8 K20_ECM
 SG_ VehicleSpeed : 7|16@0+ (0.01,0) [0|0] "mph"  NEO

Now, these definitions are supposedly “human readable”, but that’s only true if you have a translator. This DBC_Specification.md on GitHub was extremely helpful with that. The first line that begins with BO_ is the message definition. The CAN ID is 1001 (0x3e9 hex), the total message length is 8 bytes and the sender is K20_ECM (I don’t know what that is besides an engine control module).

The line that begins with SG_ is the signal definition. This message has one signal, called VehicleSpeed, which starts at bit 7, is 16 bits long (2 bytes), is big-endian and unsigned, has a factor of 0.01 (meaning we’ll need to divide the raw value by 100), and is in units of miles per hour.
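To make that reading of the SG_ line concrete, here is a rough sketch that pulls the fields out with a regular expression and applies them to a payload. The regex, the helper names (parse_signal, decode_speed), and the sample payload are all made up for illustration; this is not a full DBC parser, and the decoder only handles this one signal’s layout (first two bytes, big-endian, unsigned).

```python
import re
import struct

SIGNAL_LINE = ' SG_ VehicleSpeed : 7|16@0+ (0.01,0) [0|0] "mph"  NEO'

# One DBC signal line: name, start|length, byte order, sign,
# (factor,offset), [min|max], unit, receiver.
SIGNAL_RE = re.compile(
    r'SG_\s+(?P<name>\w+)\s*:\s*(?P<start>\d+)\|(?P<length>\d+)'
    r'@(?P<order>[01])(?P<sign>[+-])\s*'
    r'\((?P<factor>[^,]+),(?P<offset>[^)]+)\)\s*'
    r'\[(?P<min>[^|]*)\|(?P<max>[^\]]*)\]\s*'
    r'"(?P<unit>[^"]*)"\s+(?P<receiver>\S+)'
)


def parse_signal(line: str) -> dict:
    """Split a DBC SG_ line into named fields."""
    match = SIGNAL_RE.search(line)
    if match is None:
        raise ValueError(f"not a signal line: {line!r}")
    fields = match.groupdict()
    return {
        "name": fields["name"],
        "start_bit": int(fields["start"]),
        "length": int(fields["length"]),
        "big_endian": fields["order"] == "0",  # 0 = big-endian, 1 = little-endian
        "signed": fields["sign"] == "-",       # '+' means unsigned
        "factor": float(fields["factor"]),
        "offset": float(fields["offset"]),
        "unit": fields["unit"],
    }


def decode_speed(payload: bytes, signal: dict) -> float:
    """Decode VehicleSpeed: first two bytes, big-endian, unsigned, then scale."""
    raw = struct.unpack("!H", payload[:2])[0]
    return raw * signal["factor"] + signal["offset"]


signal = parse_signal(SIGNAL_LINE)
sample = bytes.fromhex("1388000000000000")  # made-up payload: 0x1388 is 5000
print(signal["name"], round(decode_speed(sample, signal), 2), signal["unit"])
```

A raw value of 5000 scaled by 0.01 comes out at 50 MPH, which is the same arithmetic as dividing by 100.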

Let’s write a Python script to monitor the vehicle’s speed, print it to the terminal, and then go for a drive to test it out.

import sys
import struct

from panda import Panda

CURRENT_SPEED = 0.0


def read_vehicle_speed(p: Panda) -> None:
    """Read the CAN messages and parse out the vehicle speed."""
    global CURRENT_SPEED

    for addr, _, dat, _src in p.can_recv():
        if addr == 0x3e9:  # Speed is 1001 (0x3e9 hex)
            # '!' for big-endian. H for unsigned short (since it's 16 bits or 2 bytes)
            # divide by 100.0 b/c factor is 0.01
            CURRENT_SPEED = struct.unpack("!H", dat[:2])[0] / 100.0

        # Just keep updating the same line
        sys.stdout.write(f"\rSpeed: {int(CURRENT_SPEED):03d}")
        sys.stdout.flush()


def main() -> None:
    """Entry Point. Monitor Chevy volt speed."""
    try:
        p = Panda()
    except Exception as exc:
        print(f"Failed to connect to Panda! {exc}")
        return

    try:
        while True:
            read_vehicle_speed(p)
    except KeyboardInterrupt:
        print("Exiting...")
    finally:
        p.close()


if __name__ == "__main__":
    main()

That worked great! The vehicle’s speed is accurate to within 1 MPH of what’s displaying on the dashboard (which makes sense since the Pi is slow and the dashboard is also probably intentionally slow at updating).

Close enough!

Detecting Driving Mode Button Presses

Now that we’re monitoring the car’s speed, we can look into the mode button. To identify an unknown button using the Panda, there are some helpful scripts you can use: can_logger.py and can_unique.py. Instructions for how to use them to identify unknown button presses and other behavior can be found in the can_unique.md file. The basic procedure is this:

  1. Capture CAN bus traffic while not pressing the button and save that to a file
  2. Repeat step one but press the button a couple times
  3. Use can_unique.py to compare the files.
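The idea behind that comparison can be sketched in a few lines. This is a simplified illustration with made-up captures, not how can_unique.py is actually implemented (the real script may well compare at a finer level than whole payloads): anything that shows up in the “button pressed” capture but never in the background capture is a candidate.

```python
from typing import Iterable, List, Set, Tuple

Frame = Tuple[int, bytes]  # (CAN address, payload)


def unique_frames(background: Iterable[Frame], pressed: Iterable[Frame]) -> List[Frame]:
    """Return frames seen in the 'pressed' capture but never in the background."""
    seen: Set[Frame] = set(background)
    result = []
    for frame in pressed:
        if frame not in seen:
            seen.add(frame)  # report each new frame only once
            result.append(frame)
    return result


# Made-up captures for illustration only.
background = [
    (0x3E9, bytes.fromhex("0000")),
    (0x1E1, bytes.fromhex("00000000000000")),
]
pressed = background + [(0x1E1, bytes.fromhex("00000000800000"))]

for addr, data in unique_frames(background, pressed):
    print(hex(addr), data.hex())
```

The longer the background capture, the more “normal” frames end up in the seen set, which is why logging for long enough matters.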

It’s important to log for long enough that you get rid of the bulk of the background noise. Thankfully though, looking at the DBC file from before, I found DriveModeButton which looks like what we need. The full message definition is below:

BO_ 481 ASCMSteeringButton: 7 K124_ASCM
 SG_ DistanceButton : 22|1@0+ (1,0) [0|0] ""  NEO
 SG_ LKAButton : 23|1@0+ (1,0) [0|0] ""  NEO
 SG_ ACCButtons : 46|3@0+ (1,0) [0|0] ""  NEO
 SG_ DriveModeButton : 39|1@0+ (1,0) [0|1] "" XXX

That’s a little confusing since the button is definitely not on the steering wheel, but it’s the right one. So parsing that out, the only signal we care about is the DriveModeButton which looks to be at bit offset 39, it’s one bit long, and is either a 1 or a 0. In Python we get the message data as a byte array, so bit 39 will be the 7th bit in the 4th byte of the message (I’m a programmer so all my numbers are indexed at 0). We can make a quick modification to monitor.py above and we’ll have it monitoring for engine mode button presses too:

CURRENT_SPEED = 0.0
BUTTON = 0
PRESS_COUNT = 0


def read_vehicle_speed(p: Panda) -> None:
    """Read the CAN messages and parse out the vehicle speed."""
    global CURRENT_SPEED
    global BUTTON
    global PRESS_COUNT

    for addr, _, dat, _src in p.can_recv():
        if addr == 0x3e9:  # Speed is 1001 (0x3e9 hex)
            # '!' for big-endian. H for unsigned short (since it's 16 bits or 2 bytes)
            # divide by 100.0 b/c factor is 0.01
            CURRENT_SPEED = struct.unpack("!H", dat[:2])[0] / 100.0
        elif addr == 0x1e1:  # ASCMSteeringButton
            # Check if the 7th bit of byte 4 is a 1
            if (dat[4] >> 7) & 1:
                BUTTON = 1
            elif BUTTON == 1:
                # Increment the press count on button release
                PRESS_COUNT += 1
                BUTTON = 0
            else:
                BUTTON = 0

        # Just keep updating the same line
        sys.stdout.write(f"\rSpeed: {int(CURRENT_SPEED):03d} Button: {BUTTON}")
        sys.stdout.flush()
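The press-counting logic above is easier to reason about away from the car. The sketch below uses hypothetical helper names (button_pressed, count_presses) and a made-up sequence of payloads; it pulls out the bit test and counts a press on each release, the same way the script does.

```python
from typing import Iterable

DRIVE_MODE_BIT = 39  # bit offset of DriveModeButton from the DBC definition


def button_pressed(payload: bytes, bit: int = DRIVE_MODE_BIT) -> bool:
    """Return True if the given bit is set (bit 39 is byte 4, bit 7)."""
    byte_index, bit_index = divmod(bit, 8)
    return bool((payload[byte_index] >> bit_index) & 1)


def count_presses(payloads: Iterable[bytes]) -> int:
    """Count completed presses: one per 1 -> 0 transition (button release)."""
    presses = 0
    was_pressed = False
    for payload in payloads:
        is_pressed = button_pressed(payload)
        if was_pressed and not is_pressed:
            presses += 1
        was_pressed = is_pressed
    return presses


idle = bytes(7)                          # seven zero bytes, button not pressed
held = bytes.fromhex("00000000800000")   # bit 7 of byte 4 set
# Made-up sequence: two presses, the first held across three frames.
frames = [idle, held, held, held, idle, idle, held, idle]
print(count_presses(frames))
```

Holding the button across several frames still counts as a single press, because only the release is counted.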

Sending CAN Messages

Now that we can see when the mode button is pressed, we need to figure out how to actually send the button press ourselves. The first thing we need to do before we can get started is flash the Panda with a debug build of its firmware. In a release build, the Panda won’t send anything, even if you tell it to, unless you enable one of the supported safety models. They have a safety model for each supported car, but those safety models only allow sending messages that they require for auto-driving, and the drive mode button is not one of them. Once we flash it with a debug build, we’ll be able to use the SAFETY_ALLOUTPUT safety model (more like a danger model am I right?).

Flashing the Panda from a Raspberry Pi

To flash the Panda from my Raspberry Pi, the only thing I got hung up on was the Python 3.8 issue (noted above). Once you have Python 3.8, flashing the panda should be pretty easy:

  1. Clone the Panda GitHub repository
  2. Change into the board folder and run get_sdk.sh
  3. Once complete, run scons -u to compile
  4. Unplug the Panda from the OBD-II port on the car
  5. Power-cycle it by unplugging it from the Pi and plugging it back in (probably not required but for good measure).
  6. Once you see a slowly flashing blue light you can proceed with the final step
  7. Finally, with the Panda connected to the Pi but NOT connected to the OBD-II port, run flash.sh

Once flashed, verify the version by running the following in a Python shell:

>>> from panda import Panda
>>> p = Panda()
opening device XXXXXXXXXXXXXXXXXXXXXXXX 0xddcc
connected
>>> p.get_version()
'DEV-4d57e48f-DEBUG'
>>> p.close()

The hex digits between DEV and DEBUG should match the beginning of the hash of the git commit you have checked out. We’re in DEBUG mode now, training wheels are off. Be careful and don’t drive yourself off a cliff. Once we sort out what we actually need to write out in order to change the driving modes, we can modify the Panda firmware to add a new safety model that only allows sending that driving mode button press.

Sending Driving Mode Button Presses

We have the DBC message definition for the drive mode button and we can see when it’s pressed. All we should have to do is enable the bus, set the safety model to allow everything, and blast out a message with a 1 in the correct place. According to the definition of ASCMSteeringButton above, the full message is 7 bytes. So we can start with 7 bytes of 0x00:

message = bytearray(b'\x00\x00\x00\x00\x00\x00\x00')

Now we need a 1 in the 7th bit of the 4th byte. You can figure that out however you want. One easy way is to just type it into a programming calculator.

A 1 in the 7th bit makes 128 in decimal or 0x80 in hex. So we can just put 0x80 in the 4th byte (indexed at 0) and that should do it:

message = bytearray(b'\x00\x00\x00\x00\x80\x00\x00')
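If you would rather not reach for a calculator, the same bytes can be derived from the bit offset in the DBC definition. This is just a sketch of the arithmetic; the constant names are made up.

```python
MESSAGE_LENGTH = 7   # ASCMSteeringButton is 7 bytes long
DRIVE_MODE_BIT = 39  # DriveModeButton bit offset from the DBC definition

byte_index, bit_index = divmod(DRIVE_MODE_BIT, 8)  # byte 4, bit 7

message = bytearray(MESSAGE_LENGTH)    # seven zero bytes
message[byte_index] |= 1 << bit_index  # set only the button bit

print(1 << bit_index, hex(1 << bit_index), message.hex())
```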

That’s our “drive mode button press” message. Now we have to:

  1. Set the safety mode: p.set_safety_mode(Panda.SAFETY_ALLOUTPUT)
  2. Enable output on CAN bus 0 which is the powertrain (ref): p.set_can_enable(0, True)
  3. Flush the Panda’s buffers for good measure: p.can_clear(0xFFFF)
  4. Call p.can_send with the message ID (0x1e1), message, and a bus ID of 0 (for powertrain)

Putting it all together we have a prototype send.py to send the button press every second. We’ll use the press_count later to get us to the mode we want. For now, because of the while True we’ll just end up sending a button press every second.

import time

from panda import Panda


def send_button_press(p: Panda, press_count: int = 2) -> None:
    """Send the ASCMSteering DriveModeButton signal."""
    msg_id = 0x1e1  # 481 decimal
    bus_id = 0
    message = bytearray(b'\x00\x00\x00\x00\x80\x00\x00')  # 0x80 is a 1 in the 7th bit

    for press in range(press_count):
        p.can_send(msg_id, message, bus_id)
        print(f"Sent press {press + 1}")
        time.sleep(1)


def main() -> None:
    """Entry Point. Send drive mode button presses."""
    try:
        p = Panda()
    except Exception as exc:
        print(f"Failed to connect to Panda! {exc}")
        return

    try:
        p.set_safety_mode(Panda.SAFETY_ALLOUTPUT)  # Turn off all safety preventing sends
        p.set_can_enable(0, True)  # Enable bus 0 for output
        p.can_clear(0xFFFF)  # Flush the panda CAN buffers
        while True:
            send_button_press(p)
    except KeyboardInterrupt:
        print("Exiting...")
    finally:
        p.close()


if __name__ == "__main__":
    main()

SUCCESS!!!

That’s not me!

Implementing Trip Mode

All the hard parts are done. Now we just have to create a script that will change the mode based on a set of rules. First, it’s important to talk about the behavior of the mode button to make sense of why I wrote the code the way I did.

  1. The mode button press is not registered until it is released. You can hold it down as long as you want and nothing happens in the car until the button is no longer being pressed.
  2. When pressed, the mode selection screen comes up and you have about 3 seconds where repeated button presses change the selected mode. Wait longer than 3 seconds and you need 1 button press just to re-activate the mode selection process.
  3. When activated, the highlighted mode is always “normal” regardless of which mode the car is currently in (this is a good thing for us).

Since the highlighted mode is always “normal” after the initial button press, we don’t have to monitor for button presses by the user. If the user changes the mode on us, our code will still work, since the highlighted mode always starts with “normal”. This also means we only need 1 button press to switch back to normal mode at any time. In fact, if we store our modes in a list, the number of button presses needed to get to any mode will be its index plus 1.
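As a quick illustration of the “index plus 1” rule, here is a tiny sketch. The mode list and its order are an assumption made for this example; check the order on your own car’s mode selection screen before relying on it.

```python
# Assumed order of the modes on the selection screen (illustrative only).
DRIVE_MODES = ["NORMAL", "SPORT", "MOUNTAIN", "HOLD"]


def presses_needed(mode: str) -> int:
    """One press opens the selection screen on NORMAL, then one press per step."""
    return 1 + DRIVE_MODES.index(mode)


for mode in DRIVE_MODES:
    print(mode, presses_needed(mode))
```

Because the selection always starts on “normal”, the count never depends on which mode the car was in beforehand.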

Now, since button presses aren’t registered until release, it may seem like we need to follow each “press” signal with a “release” signal, but we actually don’t need to. Some other module in the car (not us) is still sending the ASCMSteeringButton message with the DriveModeButton signal set to 0 (when not pressed). When we blast out our message with DriveModeButton set to 1, we’re drowning out those 0’s in a way (I’m not an expert on CAN bus stuff). Once we stop sending that “1”, the module that’s supposed to send the message will continue sending “0”, since the button isn’t being pressed, and that will work as our button release.

Alright now we can dive into the code. First we have to create our Panda object and disable the safety and all that:

try:
    p = Panda()
    p.set_safety_mode(Panda.SAFETY_ALLOUTPUT)  # Turn off all safety preventing sends
    p.set_can_enable(0, True)  # Enable bus 0 for output
    p.can_clear(0xFFFF)  # Flush the panda CAN buffers
except Exception as exc:
    logging.error(f"Failed to connect to Panda! {exc}")
    return

Next, I created a CarState (not the best name) class to handle the rest. This class will take the Panda object and it will have an update method that we have to call in a forever loop. The update method will read CAN messages, set the speed, check if we need to change modes, and send button presses to switch modes if needed.

def update(self):
    """Update the state from CAN messages."""
    for addr, _, dat, _src in self.panda.can_recv():
        if addr == 0x3e9:  # Speed is 1001 (0x3e9 hex)
            # '!' for big-endian. H for unsigned short (since it's 16 bits or 2 bytes)
            # divide by 100.0 b/c factor is 0.01
            self._set_speed(struct.unpack("!H", dat[:2])[0] / 100.0)

        now = time.perf_counter()
        if self.pending_sends and now > self.allow_button_press_after:
            self.allow_button_press_after = now + self.BUTTON_PRESS_COOLDOWN
            send = self.pending_sends.pop(0)
            self.panda.can_send_many(send)

Similar to our monitor.py script, we loop over the messages and update the speed for the class. Then we check if there’s anything to send and if we’ve waited long enough before sending the next button press. If we send them too quickly they will be registered as if the button is being held down. You may also notice that we’re using can_send_many instead of can_send now. This is because, while driving, there’s a bunch more happening on the CAN bus than when sitting in my driveway. I noticed in testing that sometimes button presses were being missed when I was just sending a single message. So now I group 50 “press” messages and call can_send_many which blasts them out. This has the effect of making it look like the button is being held down for a moment before release (like a real human).

When the speed is set, we check if we’ve crossed the threshold and need to switch modes.

def _set_speed(self, speed):
    """Set the current speed and trigger mode changes."""
    speed = int(speed)
    if self.speed > self.speed_threshold and speed < 1:
        # HACK: Speed jumps to 0 b/w valid values.
        # This hack should handle it.
        return

    self.speed = speed
    if self.speed > self.speed_threshold and self.mode == "NORMAL":
        logging.debug(f"Speed trigger (attempt HOLD): {self.speed}")
        self._switch_modes("HOLD")
    elif self.speed <= self.speed_threshold and self.mode == "HOLD":
        logging.debug(f"Speed trigger (attempt NORMAL): {self.speed}")
        self._switch_modes("NORMAL")

There is a HACK in there that I don’t love, but it works. If the previous speed is over the threshold (which will always be some high speed like 50 MPH or more), we check if the speed being set is less than 1 (and ignore it if it is). This happens pretty constantly. It must be that the current speed from the ECMVehicleSpeed message isn’t always valid. I’m not sure how to tell when it’s valid or not, but the only way this hack will cause an issue is if you go from 50 MPH to 0 MPH within a fraction of a second. If that happens, I don’t think we need to worry about switching the engine mode.
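To see what the hack does to a stream of readings, here is a standalone sketch of the same rule applied to a made-up list of speeds. The function name filter_speed and the sample data are illustrative only.

```python
from typing import Iterable, List


def filter_speed(readings: Iterable[float], threshold: int = 50) -> List[int]:
    """Ignore sub-1 MPH readings while the last accepted speed is above threshold."""
    accepted = 0
    history = []
    for reading in readings:
        speed = int(reading)
        if accepted > threshold and speed < 1:
            # Treat as a glitch: keep the previously accepted speed.
            history.append(accepted)
            continue
        accepted = speed
        history.append(accepted)
    return history


# Made-up readings: highway speed with zero glitches, then a real slowdown.
readings = [62.4, 0.0, 63.1, 0.0, 48.7, 20.2, 0.0]
print(filter_speed(readings))
```

The zeros that arrive at highway speed are dropped, while the final zero is accepted because the last accepted speed was already below the threshold.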

Finally, the _switch_modes method takes the requested mode, checks if we’re past the cooldown, and then updates our list of pending_sends with the button presses that need to be sent by the update method.

def _switch_modes(self, new_mode: str) -> None:
    """Send the messages needed to switch modes if past our cooldown."""
    now = time.perf_counter()
    if now <= self.allow_mode_switch_after:
        return

    logging.info(f"Switch to {new_mode} mode. Speed: {self.speed}")

    # Update our cooldown and mode
    self.allow_mode_switch_after = now + self.MODE_SWITCH_COOLDOWN
    self.mode = new_mode

    # Required presses starts at 1 (to activate the screen) and
    # mode selection always starts on NORMAL.
    required_presses = 1 + self.DRIVE_MODES.index(new_mode)
    logging.debug(f"Needs {required_presses} presses")
    for _ in range(required_presses):
        cluster = []
        for _inner in range(self.SEND_CLUSTER_SIZE):
            cluster.append([self.MSG_ID, None, self.PRESS_MSG, 0])
        self.pending_sends.append(cluster)

If we already switched modes recently, this method just returns. Since it doesn’t update the current mode, the _set_speed method will trigger it over and over until the mode switches or the car begins traveling a valid speed for the mode it’s already in. When the mode does switch, we compute the number of presses by the list index of the mode plus 1 (as discussed earlier) and then we update pending_sends. The clusters are groups of 50 “press” messages that are sent all at once with can_send_many so that we make sure our button press is long enough to be registered by the car.
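The cooldown check is the same “not before this timestamp” pattern used for the button presses. Here it is pulled out into a small standalone class with an injectable clock so it can be exercised without waiting. The class name, the 10 second value, and the fake clock are assumptions for the example, not values from tripmode.py.

```python
import time


class Cooldown:
    """Allow an action at most once per `seconds`, using an injectable clock."""

    def __init__(self, seconds: float, clock=time.perf_counter) -> None:
        self.seconds = seconds
        self.clock = clock
        self.allow_after = float("-inf")  # the first attempt always succeeds

    def try_acquire(self) -> bool:
        """Return True and restart the cooldown if enough time has passed."""
        now = self.clock()
        if now <= self.allow_after:
            return False
        self.allow_after = now + self.seconds
        return True


# Fake clock: each call returns the next made-up timestamp (in seconds).
times = iter([100.0, 101.0, 105.0, 111.0])
cooldown = Cooldown(10.0, clock=lambda: next(times))
results = [cooldown.try_acquire() for _ in range(4)]
print(results)
```

The attempts at 101 and 105 seconds are rejected and simply retried later, which mirrors how _set_speed keeps calling _switch_modes until the cooldown has passed.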

That covers all the logic of the CarState class; the rest is just initialization. The full tripmode.py file can be found here: https://github.com/vix597/chevy-volt-trip-mode

Wrap it Up!

This blog post is probably too long but we’re almost done. What’s left?

  1. Create a GUI with an “on” and “off” button
  2. Install it on the pi and have it start automatically at boot

For the GUI I’m going to try PySimpleGUI which seems to be exactly what I need. The “simple” in PySimpleGUI is no joke either, I’ve barely added any code and I have a working GUI.

def main() -> None:
    """Entry Point. Run the trip mode GUI."""
    trip_mode_enabled = False
    car_state = None

    # Theme and layout for the window
    sg.theme('DarkAmber')
    layout = [
        [sg.Text('TRIP MODE')],
        [sg.Button('ON'), sg.Button('OFF')]
    ]

    # Create the Window (800x480 is Raspberry Pi touchscreen resolution)
    window = sg.Window(
        'Trip Mode', layout, finalize=True,
        keep_on_top=True, no_titlebar=True,
        location=(0, 0), size=(800, 480))
    window.maximize()  # Make it fullscreen

    # Event Loop to process "events" and get the "values" of the inputs
    while True:
        event, _values = window.read(timeout=0)  # Return immediately
        if event == sg.WIN_CLOSED:
            if car_state:
                car_state.close()
            break

        if event == "ON" and not trip_mode_enabled:
            trip_mode_enabled = True
            car_state = enable()
        elif event == "OFF" and trip_mode_enabled:
            trip_mode_enabled = False
            car_state.close()
            car_state = None

        if car_state:
            car_state.update()

    window.close()

I broke out the code for creating the Panda connection and CarState object into an enable function which returns the CarState. Then I added a close method to CarState to close the Panda connection and that was it. The PySimpleGUI code is copy-pasted from the main example on their home page with the text input box removed and the title and button text changed.

I’m going to save “start at boot” along with other improvements for later. I think this blog post is plenty long and covers everything I wanted. Check out the finished project over on GitHub and try it out for yourself if you also have a Chevy Volt. Or don’t. #GreatJob

References

  1. https://vehicle-reverse-engineering.fandom.com/wiki/GM_Volt
  2. https://github.com/openvehicles/Open-Vehicle-Monitoring-System/blob/master/vehicle/Car%20Module/VoltAmpera/voltampera_canbusnotes.txt
  3. https://itheo.tech/install-python-3-8-on-a-raspberry-pi/
  4. https://github.com/commaai/panda
  5. https://github.com/commaai/openpilot
  6. https://pysimplegui.readthedocs.io/en/latest/
Posted by laplans in Things I've Done
Creating an Icon From a Song

My adventure into the world of blogging has been going for a week and a half now. I have a half-written how-to walking through how I set up this blog and now I’m working on this post (which will come out first). It will not be long before I’m a blogging expert! The one roadblock remaining is that this site has no favicon (at least prior to publishing this). This post aims to solve that problem. Once I have a favicon, I’ll really be a force in the blogging community. Unfortunately for me, however, I’m not an artist and my wife is busy making a blanket. I’m okay at Python, so I’ll generate a favicon using that.

My idea is to generate my site’s favicon programmatically using a song as input. I will need to be able to read and parse an MP3 file and write an image one pixel at a time. I can use Pillow to generate the image, but I’ll have to search around for something to parse an MP3 file in Python. It would be pretty easy to just open and read the song file’s bytes and generate an image with some logic from that, but I’d like to actually parse the song so that I can generate something from the music. Depending on what library I find, maybe I’ll do something with beat detection. When this is all said and done, you’ll be able to see the finished product on GitHub. First, a few questions:

How big is the icon supposed to be?

Looks like when I try to add one to the site, WordPress tells me it should be at least 512×512 pixels.

Can I use Pillow to make an .ico file?

Yes, but that doesn’t matter because it looks like WordPress doesn’t use .ico files “for security reasons” that I didn’t bother looking into. I’ll be generating a .png instead.

Can I read/process .mp3 files in Python?

Of course! With librosa it seems.

Generating a .png file in Python

With all my questions from above answered, I can get right into the code. Let’s start with something simple: generating a red square one pixel at a time. We will need this logic because when we generate an image from a song, we’re going to want to make per-pixel decisions based on the song.

import numpy as np
from PIL import Image
from typing import Tuple

SQUARE_COLOR = (255, 0, 0, 255)  # Let's make a red square
ICON_SIZE = (512, 512)  # The recommended minimum size from WordPress


def generate_pixels(resolution: Tuple[int, int]) -> np.ndarray:
    """Generate pixels of an image with the provided resolution."""
    pixels = []

    # Eventually I'll extend this to generate an image one pixel at a time
    # based on an input song.
    for _row in range(resolution[1]):
        cur_row = []
        for _col in range(resolution[0]):
            cur_row.append(SQUARE_COLOR)
        pixels.append(cur_row)

    return np.array(pixels, dtype=np.uint8)


def main():
    """Entry point."""

    # For now, just make a solid color square, one pixel at a time,
    # for each resolution of our image.
    img_pixels = generate_pixels(ICON_SIZE)

    # Create the image from our multi-dimensional array of pixels
    img = Image.fromarray(img_pixels)
    # "sizes" is an ICO option; a PNG is saved at the size of the pixel array
    img.save('favicon.png')


if __name__ == "__main__":
    main()

It worked! We have a red square!

A red square

Analyzing an MP3 file in Python

Since we’ll be generating the image one pixel at a time, we need to process the audio file and then be able to check some values in the song for each pixel. In other words, each pixel in the generated image will represent some small time slice of the song. To determine what the color and transparency should be for each pixel, we’ll need to decide what features of the song we want to use. For now, let’s use beats and amplitude. For that, we’ll need to write a Python script that:

  1. Processes an MP3 file from a user-provided path.
  2. Estimates the tempo.
  3. Determines for each pixel, whether it falls on a beat in the song.
  4. Determines for each pixel, the average amplitude of the waveform at that pixel’s “time slice”.

Sounds like a lot, but librosa is going to do all the heavy lifting. First I’ll explain the different parts of the script, then I’ll include the whole file.

librosa makes it really easy to read and parse an MP3. The following will read in and parse an MP3 into the time series data and sample rate.

>>> import librosa
>>> time_series, sample_rate = librosa.load("brass_monkey.mp3") 
>>> time_series
array([ 5.8377805e-07, -8.7419551e-07,  1.3259771e-06, ...,
       -2.1545576e-01, -2.3902495e-01, -2.3631646e-01], dtype=float32)
>>> sample_rate
22050

I chose Brass Monkey by the Beastie Boys because I like it and it’s easy to look up the BPM online for a well-known song. According to the internet, the song is 116 BPM. Let’s see what librosa says in the next code block, where I show how to get the tempo of a song.

>>> onset_env = librosa.onset.onset_strength(y=time_series, sr=sample_rate)
>>> tempo = librosa.beat.tempo(onset_envelope=onset_env, sr=sample_rate)
>>> tempo[0]
117.45383523

Pretty spot-on! No need to test any other songs; librosa is going to work perfectly for what I need.

I know the size of the image is 512×512 which is 262,144 pixels in total, so we just need the song’s duration and then it’s simple division to get the amount of time each pixel will represent.

>>> duration = librosa.get_duration(filename="brass_monkey.mp3") 
>>> duration
158.6
>>> pixel_time = duration / (512 * 512)
>>> pixel_time
0.000605010986328125

So, the song is 158.6 seconds long and each pixel in a 512×512 image will account for about 0.0006 seconds of song. Note: It would have also been possible to get the song duration by dividing the length of the time series data by the sample rate:

>>> len(time_series) / sample_rate
158.58666666666667

Either is fine. The division above is more efficient since the song file doesn’t need to be opened a second time. I chose to go with the helper function for readability.

Now, for each pixel we need to:

  1. Determine if that pixel is on a beat or not
  2. Get the average amplitude for all samples that happen within that pixel’s time slice

We’re only missing 2 variables to achieve those goals: beats per second and samples per pixel. To get the beats per second we just divide the tempo by 60. To get the whole samples per pixel we round down the result of the number of samples divided by the number of pixels.

>>> import math 
>>> bps = tempo[0] / 60.0 
>>> bps
1.9575639204545456
>>> samples_per_pixel = math.floor(len(time_series) / (512 * 512))
>>> samples_per_pixel
13

So, we have about 1.96 beats per second and each pixel in the image represents 13 samples. So we’ll be taking the average of 13 samples for each pixel to get an amplitude at that pixel. I could have also chosen to use the max, min, median, or really anything; the average is just what I decided to use.
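To make that choice concrete, here is a minimal sketch of how the per-pixel aggregation could be vectorized and swapped between mean, max, or median. It runs on a made-up nine-sample signal rather than a real song, and `aggregate_per_pixel` is a hypothetical helper for illustration, not something from librosa or from the scripts in this post.

```python
import numpy as np


def aggregate_per_pixel(time_series, num_pixels, reducer=np.mean):
    """Reduce each pixel's slice of samples to a single value.

    Hypothetical helper: trailing samples that don't fill a whole
    pixel are dropped, mirroring the round-down described above.
    """
    samples_per_pixel = len(time_series) // num_pixels
    if samples_per_pixel == 0:
        raise ValueError("Not enough samples for that many pixels")
    usable = np.asarray(time_series[:samples_per_pixel * num_pixels])
    # One row per pixel, one column per sample in that pixel's slice
    slices = usable.reshape(num_pixels, samples_per_pixel)
    return reducer(slices, axis=1)


# Toy signal standing in for librosa's time series (values are made up)
toy = np.array([0.0, 0.5, -0.5, 1.0, 0.25, 0.75, -1.0, 1.0, 0.1])
means = aggregate_per_pixel(toy, 4)
peaks = aggregate_per_pixel(toy, 4, reducer=np.max)
```

Passing `np.median` or `np.min` as the reducer works the same way, so trying a different statistic is a one-argument change.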

I’m relying on the song being long enough that it has enough samples so that samples_per_pixel is at least 1. If it’s 0 we’ll need to print an error and quit. That would mean the song doesn’t have enough data to make the image. Now we have everything we need to loop over each pixel and check if it’s a “beat pixel” and get the average amplitude of the waveform for the pixel’s time slice.

>>> import numpy as np
>>> beats = 0
>>> num_pixels = 512 * 512
>>> avg_amps = []
>>> for pixel_idx in range(num_pixels):
...     song_time = pixel_idx * pixel_time
...     song_time = math.floor(song_time)
...     if song_time and math.ceil(bps) % song_time == 0:
...         beats += 1
...     sample_idx = pixel_idx * samples_per_pixel
...     samps = time_series[sample_idx:sample_idx + samples_per_pixel]
...     avg_amplitude = np.array(samps).mean()
...     avg_amps.append(avg_amplitude)
...
>>> print(f"Found {beats} pixels that land on a beat")
Found 3306 pixels that land on a beat
>>>
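That count deserves a quick sanity check. With bps around 1.96, math.ceil(bps) is 2, and 2 % song_time is only 0 when the floored song_time is 1 or 2, so the loop flags the pixels covering seconds 1 through 3 of the song rather than pixels spaced at the tempo. The sketch below reproduces the 3306 with plain arithmetic (no audio needed) and then shows one possible alternative: treat beats as a fixed grid starting at time 0 and flag a pixel when a grid point lands inside its time slice. `pixel_has_beat` and the fixed-grid assumption are hypothetical and only for illustration; real beats rarely start exactly at 0.

```python
import math

NUM_PIXELS = 512 * 512
DURATION = 158.6               # seconds, from the session above
PIXEL_TIME = DURATION / NUM_PIXELS
BPS = 117.45383523 / 60.0      # tempo estimate from the session above

# Reproduce the loop's beat test without needing the audio
flagged = 0
for pixel_idx in range(NUM_PIXELS):
    song_time = math.floor(pixel_idx * PIXEL_TIME)
    if song_time and math.ceil(BPS) % song_time == 0:
        flagged += 1


def pixel_has_beat(pixel_idx, pixel_time=PIXEL_TIME, bps=BPS):
    """Hypothetical alternative: True when a beat falls in this pixel's slice.

    Assumes beats sit on a fixed grid (0, 1/bps, 2/bps, ...).
    """
    start = pixel_idx * pixel_time
    end = (pixel_idx + 1) * pixel_time
    # ceil(t * bps) counts the grid points strictly before time t, so a
    # difference between the two edges means a beat lies in [start, end)
    return math.ceil(end * bps) > math.ceil(start * bps)


grid_beats = sum(pixel_has_beat(i) for i in range(NUM_PIXELS))
print(flagged, grid_beats)
```

With the grid version, the number of flagged pixels tracks the number of beats in the song instead of the number of pixels in a two-second window.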

The full script with comments, error handling, a command line argument to specify the song file, and a plot to make sure we did the average amplitude correctly is below:

import os
import sys
import math
import argparse
import librosa
import numpy as np
import matplotlib.pyplot as plt


def main():
    """Entry Point."""
    parser = argparse.ArgumentParser("Analyze an MP3")
    parser.add_argument(
        "-f", "--filename", action="store",
        help="Path to an .mp3 file", required=True)
    args = parser.parse_args()

    # Input validation
    if not os.path.exists(args.filename) or \
       not os.path.isfile(args.filename) or \
       not args.filename.endswith(".mp3"):
        print("An .mp3 file is required.")
        sys.exit(1)

    # Get the song duration
    duration = librosa.get_duration(filename=args.filename)

    # Get the estimated tempo of the song
    time_series, sample_rate = librosa.load(args.filename)
    onset_env = librosa.onset.onset_strength(y=time_series, sr=sample_rate)
    tempo = librosa.beat.tempo(onset_envelope=onset_env, sr=sample_rate)
    bps = tempo[0] / 60.0  # beats per second (tempo is a one-element array)

    # The image I'm generating is going to be 512x512 (or 262,144) pixels.
    # So let's break the duration down so that each pixel represents some
    # amount of song time.
    num_pixels = 512 * 512
    pixel_time = duration / num_pixels
    samples_per_pixel = math.floor(len(time_series) / num_pixels)
    print(f"Each pixel represents {pixel_time} seconds of song")
    print(f"Each pixel represents {samples_per_pixel} samples of song")

    # Now I just need 2 more things
    # 1. a way to get "beat" or "no beat" for a given pixel
    # 2. a way to get the amplitude of the waveform for a given pixel
    beats = 0
    avg_amps = []
    for pixel_idx in range(num_pixels):
        song_time = pixel_idx * pixel_time

        # To figure out if it's a beat, let's just round and
        # see if it's evenly divisible
        song_time = math.floor(song_time)
        if song_time and math.ceil(bps) % song_time == 0:
            beats += 1

        # Now let's figure out the average amplitude of the
        # waveform for this pixel's time
        sample_idx = pixel_idx * samples_per_pixel
        samps = time_series[sample_idx:sample_idx + samples_per_pixel]
        avg_amplitude = np.array(samps).mean()
        avg_amps.append(avg_amplitude)

    print(f"Found {beats} pixels that land on a beat")

    # Plot the average amplitudes and make sure it still looks
    # somewhat song-like
    xaxis = np.arange(0, num_pixels, 1)
    plt.plot(xaxis, np.array(avg_amps))
    plt.xlabel("Pixel index")
    plt.ylabel("Average Pixel Amplitude")
    plt.title(args.filename)
    plt.show()


if __name__ == "__main__":
    main()

First Attempt

Right now we have 2 prototype Python files complete. They can be found in the prototypes folder of the GitHub repository for this project (or above). Now we have to merge those two files together and write some logic for deciding what color a pixel should be based on the song position for that pixel. We’ll be able to basically throw away the prints and graph plot from mp3_analyzer.py and just keep the math and pixel loop, which we can turn into a helper method and then jam into our red_square.py script. We will also want to do some more error handling and command line options.

We’ll start with the pixel loop from mp3_analyzer.py. Let’s convert it into a SongImage class that takes a song path and image resolution, and does all the math to store the constants we need (beats per second, samples per pixel, etc.). The SongImage class will have a helper function that takes a pixel index as input and returns a tuple with 3 items. The first item is a boolean for whether the provided index falls on a beat. The second item is the average amplitude of the song for that index. The third item is the timestamp for that pixel.

class SongImage:
    """An object to hold all the song info."""

    def __init__(self, filename: str, resolution: Tuple[int, int]):
        self.filename = filename
        self.resolution = resolution
        #: Total song length in seconds
        self.duration = librosa.get_duration(filename=self.filename)
        #: The time series data (amplitudes of the waveform) and the sample rate
        self.time_series, self.sample_rate = librosa.load(self.filename)
        #: An onset envelope is used to measure BPM
        onset_env = librosa.onset.onset_strength(y=self.time_series, sr=self.sample_rate)
        #: Measure the tempo (BPM)
        self.tempo = librosa.beat.tempo(onset_envelope=onset_env, sr=self.sample_rate)
        #: Convert to beats per second (tempo is a one-element array)
        self.bps = self.tempo[0] / 60.0
        #: Get the total number of pixels for the image
        self.num_pixels = self.resolution[0] * self.resolution[1]
        #: Get the amount of time each pixel will represent in seconds
        self.pixel_time = self.duration / self.num_pixels
        #: Get the number of whole samples each pixel represents
        self.samples_per_pixel = math.floor(len(self.time_series) / self.num_pixels)

        if not self.samples_per_pixel:
            raise NotEnoughSong(
                "Not enough song data to make an image "
                f"with resolution {self.resolution[0]}x{self.resolution[1]}")

    def get_info_at_pixel(self, pixel_idx: int) -> Tuple[bool, float, float]:
        """Get song info for the pixel at the provided pixel index."""
        beat = False
        song_time = pixel_idx * self.pixel_time

        # To figure out if it's a beat, let's just round and
        # see if it's evenly divisible
        song_time = math.floor(song_time)
        if song_time and math.ceil(self.bps) % song_time == 0:
            beat = True

        # Now let's figure out the average amplitude of the
        # waveform for this pixel's time
        sample_idx = pixel_idx * self.samples_per_pixel
        samps = self.time_series[sample_idx:sample_idx + self.samples_per_pixel]
        avg_amplitude = np.array(samps).mean()
        return (beat, avg_amplitude, song_time)

Now we have a useful object. We can give it a song file and image resolution and then we can ask it (for each pixel) if that pixel is on a beat and what the average amplitude of the waveform is for that pixel. Now we have to apply that information to some algorithm that will generate an image. Spoiler alert, my first attempt didn’t go well. I will leave it here as a lesson in what not to do.

def generate_pixels(resolution: Tuple[int, int], song: SongImage) -> np.ndarray:
    """Generate pixels of an image with the provided resolution."""
    pixels = []

    pixel_idx = 0
    for _row in range(resolution[1]):
        cur_row = []
        for _col in range(resolution[0]):
            # This is where we pick our color information for the pixel
            beat, amp, _timestamp = song.get_info_at_pixel(pixel_idx)
            r = g = b = a = 0
            if beat and amp > 0:
                a = 255
            elif amp > 0:
                a = 125

            amp = abs(int(amp))

            # Randomly pick a primary color
            choice = random.choice([0, 1, 2])
            if choice == 0:
                r = amp
            elif choice == 1:
                g = amp
            else:
                b = amp

            cur_row.append((r, g, b, a))
            pixel_idx += 1

        pixels.append(cur_row)

    return np.array(pixels, dtype=np.uint8)

I used the function above to generate the image by choosing the pixel transparency on each beat and then used the amplitude for the pixel color. The result? Garbage!

Trash

Well, that didn’t go well. The image looks terrible, but it does at least make sense. If I zoom in, there’s quite a bit of repetition because we tied transparency to the BPM. It’s also not colorful because we used the amplitude without scaling it up at all, so we ended up with RGB values that are all very low. We could scale the amplitude up to make it more colorful. We could also shrink the image resolution to see if a smaller image is more interesting, then scale it up to 512×512 to use as an icon. Another nitpick I have about this whole thing is that I still ended up using random, which kind of defeats the purpose of generating an image from a song. Ideally a song produces mostly the same image every time.
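As a rough sketch of those first fixes, the amplitude could be mapped from librosa's floating-point range (roughly -1.0 to 1.0) onto 0-255 before it's used as a color value, and the color channel could be picked from the pixel index instead of random. `amp_to_channel`, `pick_channel`, and `color_for` are hypothetical helpers, and the index-modulo-3 rule is just one deterministic choice among many.

```python
def amp_to_channel(amp: float) -> int:
    """Map an amplitude in [-1.0, 1.0] to a 0-255 color value."""
    magnitude = min(abs(amp), 1.0)  # clamp in case of slight overshoot
    return round(magnitude * 255)


def pick_channel(pixel_idx: int) -> int:
    """Deterministically pick 0 (red), 1 (green), or 2 (blue)."""
    return pixel_idx % 3


def color_for(pixel_idx: int, amp: float, beat: bool):
    """Build an (r, g, b, a) tuple using the same alpha rules as above."""
    rgb = [0, 0, 0]
    rgb[pick_channel(pixel_idx)] = amp_to_channel(amp)
    alpha = 0
    if amp > 0:
        alpha = 255 if beat else 125
    return (rgb[0], rgb[1], rgb[2], alpha)
```

Because nothing here depends on random, the same song and resolution would always produce the same colors.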

Another option: we could throw it away and try something different. I’m not going to completely throw it away, but I had an idea I’d like to try to make a more interesting image. Right now we’re iterating over the image one pixel at a time and then choosing a color and transparency value. Instead, let’s move a position around the image, flipping its direction based on the song input. This will be somewhat like how 2D levels in video games are procedurally generated with a “random walker” (like this).

Side Tracked! “Random Walk” Image Generation

Let’s make something simple to start. We can modify the red_square.py script to generate an image with red lines randomly placed by a “random walker” (a position that moves in a direction and the direction randomly changes after a number of pixels).

def walk_pixels(pixels: np.ndarray):
    """Walk the image"""
    pos = Point(0, 0)
    direction = Point(1, 0)  # Start left-to-right

    for idx in range(WIDTH * HEIGHT):
        if idx % 50 == 0:
            # Choose a random direction
            direction = random.choice([
                Point(1, 0),   # Left-to-right
                Point(0, 1),   # Top-to-bottom
                Point(-1, 0),  # Right-to-left
                Point(0, -1),  # Bottom-to-top
                Point(1, 1),   # Left-to-right diagonal
                Point(-1, -1)  # Right-to-left diagonal
            ])

        pixels[pos.y][pos.x] = NEW_COLOR  # rows are y, columns are x

        check_pos = Point(pos.x, pos.y)
        check_pos.x += direction.x
        check_pos.y += direction.y

        # Reflect if we hit a wall
        if check_pos.x >= WIDTH or check_pos.x < 0:
            direction.x *= -1
        if check_pos.y >= HEIGHT or check_pos.y < 0:
            direction.y *= -1

        pos.x += direction.x
        pos.y += direction.y
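The snippet above leans on a few names defined elsewhere in the prototype. What follows is only an assumed, minimal version of that setup for illustration (a mutable `Point` dataclass, the constants, and a hypothetical `blank_pixels` helper); the real definitions live in the prototype on GitHub.

```python
from dataclasses import dataclass

import numpy as np

WIDTH, HEIGHT = 64, 64
NEW_COLOR = (255, 0, 0, 255)  # opaque red, as in the red square


@dataclass
class Point:
    """A mutable 2D point; the walker edits x and y in place."""
    x: int
    y: int


def blank_pixels(width: int = WIDTH, height: int = HEIGHT) -> np.ndarray:
    """Return a fully transparent RGBA image as a (height, width, 4) array."""
    return np.zeros((height, width, 4), dtype=np.uint8)


pixels = blank_pixels()
pos = Point(0, 0)
pixels[pos.y][pos.x] = NEW_COLOR  # rows are y, columns are x
```

The point needs to be mutable because the walker reflects by flipping `direction.x` and `direction.y` in place.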

It works! We have red lines!

Red lines (512×512)

That’s a bit noisy though. Let’s see what we get from 32×32 and 64×64 by changing WIDTH and HEIGHT in the code above.

32×32
64×64

As we zoom in what we get is a bit more interesting and “logo-like”. Listen, I know it’s not gonna be a good logo, but I’ve written this much blog, I’m not about to admit this was a dumb idea. Instead I’m going to double down! One of the images that is produced by the end of this post will be the logo for this blog forever. I will never change it. We’re in it now! Buckle up! The full prototype “random walker” script can be found in the prototypes folder for the project on GitHub.

Back to business (wrap it up!)

To finish this up I just need to decide how the different features I’m pulling out of the song will drive the pixel walking code. Here’s what I’m thinking:

  1. The BPM will determine when the walker changes direction.
  2. The amplitude will have some effect on which direction we choose.
  3. We’ll use a solid pixel color for “no beat” and a different color for “beat”.
  4. We’ll loop to the other side of the image (like Asteroids) when we hit a wall.
  5. We’ll iterate as many pixels as there are in the image.

That seems simple enough and doesn’t require any randomness. We’ve basically already written all the code for this, it just needs to be combined, tweaked, and overengineered with a plugin system and about 100 different combinations of command line arguments to customize the resulting image. I’ll spare you all the overengineering, you’re free to browse the final project source code for that. For now I’ll just go over the key parts of what I’m calling the basic algorithm (my favorite one). NOTE: The code snippets in this section only contain the bits I felt were important to show and they do not have all the variable initializations and other code needed to run. See the finished project for full source.

Using the SongImage class from above, we provide a path to a song (from the command line) and a desired resulting image resolution (default 512×512):

# Process the song
song = SongImage(song_path, resolution)

Next, we modified our generate_pixels method from the red_square.py prototype to create a numpy.ndarray of transparent pixels (instead of red). As our walker walks around the image, the pixels will be changed from transparent to a color based on whether the pixel falls on a beat or not.

Finally, we implemented a basic algorithm loosely based on the rules above. In a loop from 0 to num_pixels we check the beat to set the color:

beat, amp, timestamp = song.get_info_at_pixel(idx)
if not beat and pixel == Color.transparent():
    pixels[pos.y][pos.x] = args.off_beat_color.as_tuple()
elif pixel == Color.transparent() or pixel == args.off_beat_color:
    pixels[pos.y][pos.x] = args.beat_color.as_tuple()

Then we turn 45 degrees clockwise if the amplitude (amp) is positive and 45 degrees counterclockwise if it’s negative (or 0). I added a little extra logic where, if the amplitude is more than the average amplitude for the entire song, the walker turns 90 additional degrees clockwise (or counterclockwise).

# Directions in order in 45 degree increments
directions = (
    Point(1, 0), Point(1, 1), Point(0, 1),
    Point(-1, 1), Point(-1, 0), Point(-1, -1),
    Point(0, -1), Point(1, -1)
)

# Try to choose a direction
if amp > 0:
    turn_amnt = 1
else:
    turn_amnt = -1

direction_idx += turn_amnt

# Turn more if it's above average
if amp > song.overall_avg_amplitude:
    direction_idx += 2
elif amp < (song.overall_avg_amplitude * -1):
    direction_idx -= 2

direction_idx = direction_idx % len(directions)

# Update the current direction
direction = directions[direction_idx]
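Pulled into a standalone function, the turning rule looks something like the sketch below. `next_direction_idx` is a hypothetical name, and `overall_avg` stands in for song.overall_avg_amplitude, assumed here to be a non-negative average for the whole song. Directions are plain tuples so the block runs on its own.

```python
DIRECTIONS = (
    (1, 0), (1, 1), (0, 1), (-1, 1),
    (-1, 0), (-1, -1), (0, -1), (1, -1),
)


def next_direction_idx(direction_idx: int, amp: float, overall_avg: float) -> int:
    """Return the new index into DIRECTIONS after applying the turn rules."""
    # Base turn: one 45 degree step, sign chosen by the amplitude
    direction_idx += 1 if amp > 0 else -1

    # Louder than average in either direction: two more steps (90 degrees)
    if amp > overall_avg:
        direction_idx += 2
    elif amp < -overall_avg:
        direction_idx -= 2

    # Python's % returns a non-negative result for a positive modulus,
    # so negative indices wrap around to the end of the tuple
    return direction_idx % len(DIRECTIONS)
```

For example, starting at index 1 with a strongly negative amplitude moves three steps back and wraps around to index 6.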

Then we update the position of the walker to the next pixel in that direction. If we hit the edge of the image, we loop back around to the other side like the game Asteroids.

# Create a temporary copy of the current position to change
check_pos = Point(pos.x, pos.y)
check_pos.x += direction.x
check_pos.y += direction.y

# Wrap if we’re outside the bounds, otherwise take the step
if check_pos.x >= resolution.x:
    pos.x = 0
elif check_pos.x < 0:
    pos.x = resolution.x - 1
else:
    pos.x = check_pos.x

if check_pos.y >= resolution.y:
    pos.y = 0
elif check_pos.y < 0:
    pos.y = resolution.y - 1
else:
    pos.y = check_pos.y
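The same wrap-around can also be written with the modulo operator, which handles both edges and the in-bounds case in one expression. This is a sketch of an equivalent for single-pixel steps rather than the code the project uses; `step_wrapped` is a hypothetical helper working on plain (x, y) tuples.

```python
def step_wrapped(pos, direction, resolution):
    """Move one step and wrap around the image edges, Asteroids-style.

    pos, direction, and resolution are (x, y) tuples.
    """
    x = (pos[0] + direction[0]) % resolution[0]
    y = (pos[1] + direction[1]) % resolution[1]
    return (x, y)
```

Stepping right and up from the top-right corner of a 512×512 image, `step_wrapped((511, 0), (1, -1), (512, 512))`, lands on `(0, 511)`.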

If you run all of that against “Brass Monkey” by the Beastie Boys you end up with the following images (I found higher resolutions looked better):

Brass Monkey 512×512
Brass Monkey 1024×1024
Brass Monkey 1920×1080

Gallery

Here’s a gallery of some of my favorites from some other songs. I changed the colors around to experiment with what looks best. I landed on matrix colors (green/black) because I’m a nerd and I don’t know how to art. The exception is Symphony of Destruction by Megadeth; for that one I used colors from the album artwork.

For the site icon, I decided to crop what looks like some sort of yelling monster with its arms waving in the air out of the 1920×1080 image generated from The Double Helix of Extinction by Between the Buried and Me.

Extra Credit – Make a live visualizer out of it

While looking at all these images I couldn’t help but wonder what part of different songs caused the walker to go a certain direction. So I decided to take a stab at writing a visualizer that would play the song while drawing the image in real time. My first attempt at it was to use the builtin Python Tk interface, tkinter, but that quickly got messy since I’m trying to set individual pixel values and I want to do it as quickly as possible. There are certainly ways I could have done this even better and more efficiently, but I decided to use Processing to get the job done. The final sketch can be found in the GitHub repository for this project under the mp3toimage_visualizer directory.

To start, I needed a way to get the image generation information over to Processing. To do that, I made a modification to the Python image generator to save off information for each pixel we set in the image. I had to save the position of the pixel, the color, and the timestamp in the song for when the pixel was changed:

pixel_changed = False
if not beat and pixel == Color.transparent():
    pixels[pos.y][pos.x] = args.off_beat_color.as_tuple()
    pixel_changed = True
elif pixel == Color.transparent() or pixel == args.off_beat_color:
    pixels[pos.y][pos.x] = args.beat_color.as_tuple()
    pixel_changed = True

if pixel_changed and pb_list is not None:
    pb_list.append(PlaybackItem(
        pos,
        Color.from_tuple(pixels[pos.y][pos.x]),
        timestamp,
        song.pixel_time))

One important optimization was skipping any pixel we already set. The walker does tons of backtracking over pixels that have already been set, so much that the code I wrote in Processing couldn’t keep up with the song. Skipping pixels already set was the easiest way to optimize.

Once I had the list of pixels and timestamps, I wrote them to a CSV file along with the original song file path and image resolution. After that, the Processing sketch was pretty simple. The most complicated parts, excluded here, were dealing with allowing user selection of the input file. The sketch reads in the CSV produced by the Python script and then draws the pixels based on the song’s playback position. The following snippet is from the draw method which is called once per frame.

if (pbItems.size() == 0) {
    return;
}

// Get the song's current playback position
float songPos = soundFile.position();

int count = 0;

// Draw every item whose timestamp is less than or equal
// to the song's playback position (this is more "close
// enough" than exact). Peek at the head of the list on
// each pass so only items that are due get drawn.
while (pbItems.size() > 0 && pbItems.get(0).should_pop(songPos)) {
    PlaybackItem item = pbItems.get(0);
    item.display();  // Display it
    pbItems.remove(0);  // Remove it from the list
    count++;
}
if (count >= 1000) {  // Over 1000 per frame and we fall way behind
    println("TOO MUCH DATA. Drawing is likely to fall behind.");
}
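In Python terms, the export-then-drain flow might look like the sketch below. The column layout (x, y, RGBA, timestamp) and the helper names `write_playback`, `read_playback`, and `drain_due` are assumptions for illustration; the real file format is whatever the project's exporter and the Processing sketch agree on.

```python
import csv
import io
from collections import deque


def write_playback(rows, out):
    """Write (x, y, r, g, b, a, timestamp) rows as CSV."""
    writer = csv.writer(out)
    writer.writerow(["x", "y", "r", "g", "b", "a", "timestamp"])
    writer.writerows(rows)


def read_playback(src):
    """Read rows back into a queue of (x, y, timestamp); colors omitted."""
    reader = csv.DictReader(src)
    return deque(
        (int(r["x"]), int(r["y"]), float(r["timestamp"])) for r in reader
    )


def drain_due(items, song_pos):
    """Pop and return every queued item whose timestamp has passed."""
    due = []
    # Peek at the head each time so only items that are due get popped
    while items and items[0][2] <= song_pos:
        due.append(items.popleft())
    return due


buffer = io.StringIO()  # stands in for the CSV file on disk
write_playback(
    [(0, 0, 0, 255, 0, 255, 0.0), (1, 1, 0, 255, 0, 255, 0.5),
     (2, 2, 0, 255, 0, 255, 2.0)],
    buffer,
)
buffer.seek(0)
queue = read_playback(buffer)
first_frame = drain_due(queue, song_pos=1.0)
```

A deque keeps removal from the front cheap, which matters when thousands of pixels come due in a single frame.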

I think that about wraps it up. There’s a whole README and all that good stuff over in the GitHub repository for the project, so if you’re curious what one of your favorite songs looks like as an image, or if you want to mess around with the different algorithms and command line options that I didn’t go into here, go run it for yourself and let me know how it goes. Or don’t. #GreatJob!

Posted by laplans in Things I've Done