#561 Grub hacker help wanted

Open
opened 1 year ago by swiftgeek · 4 comments

In this meta bug I will list the bugs/issues we currently face with GRUB in various configurations. We need help even with identifying these bugs properly:

Grub baremetal:

  • Some USB mass storage devices freeze GRUB (for minutes, or forever). This even includes USB floppy drives.
  • Sometimes it even happens with just any USB device, like a HID keyboard/mouse.
  • Native AHCI starts looping after the first access to a drive (e.g. after ls).
  • cryptomount takes at least an order of magnitude longer on baremetal than with GRUB i386-pc.
  • A USB device plugged in at power-on (so far I have reproduced this only on the X200 and related machines) produces error: EHCI grub_ehci_pci_iter: EHCI halt timeout. It works perfectly fine when hotplugged in GRUB, and in every case under Linux.
  • AHCI/ATA devices cannot be distinguished from ATAPI ones, which causes unnecessary delays from media detection and media spin-up time.
      • "For example, any removable media device needs a "media eject" command, and a way for the host to determine whether the media is present, and these were not provided in the ATA protocol." (https://en.wikipedia.org/wiki/ATA_Packet_Interface#Background)
  • Unknown key 0xff detected with particular USB HID devices / configurations (sometimes a device alone won't cause the issue, only a combination of devices).
  • Lack of caching on reads from drives.
  • Media in ODDs with invalid types, or types other than iso9660, may cause hangs.

SeaGRUB (grub i386-pc loaded via SeaBIOS from floppy/disk image)

  • Making it recognize the filesystem of the image GRUB was loaded from. The same image dd-ed onto a pendrive doesn't seem to have issues and is recognized properly.
  • Automatic module loading doesn't work when using CBFS (prefix doesn't seem to work as intended).

Some links on debug:

  • Mostly pay attention to the debug variable; it makes GRUB modules print more debug info on screen. To enable debug in every module use set debug=all
  • https://serverfault.com/questions/869559/grub-hangs-before-menu-after-a-hdd-upgrade-how-to-debug
strcpy commented 1 year ago

Some research into the cryptomount performance:

Running "set debug=luks" reveals that the slowdown is between the debug logs "Trying keyslot 0" and "PBKDF2 done" in disk/luks.c. The only thing of consequence between those lines is a grub_crypto_pbkdf2 call on the passphrase. However, later in the same function, between "candidate key recovered" and "Slot 0 opened", there is another call to grub_crypto_pbkdf2 for the candidate key, and there is no delay between these two logs.

Additionally, manually creating a LUKS partition on a USB drive with cryptsetup --iter-time 1 luksFormat makes cryptomount finish almost instantly, so the cost of the passphrase key derivation is a factor. Diffing the grub_crypto_pbkdf2 implementation in the latest Libreboot release against the latest GRUB release shows nothing has changed there. Perhaps it is just a slower implementation than the one the Linux kernel has, but I don't see how that would make it slow only on baremetal, as opposed to i386-pc.
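To illustrate why lowering --iter-time makes such a difference, here is a minimal Python sketch (not GRUB or cryptsetup code) built on hashlib.pbkdf2_hmac from the standard library. The hash (SHA-1), passphrase, salt and iteration counts are assumptions picked for illustration only. PBKDF2 cost grows roughly linearly with the iteration count. As far as I know, a LUKS1 header stores one iteration count per keyslot and a separate one for the master key digest, which might explain why the second grub_crypto_pbkdf2 call shows no visible delay, but that is not verified here.

```python
import hashlib
import time


def derive(passphrase: bytes, salt: bytes, iterations: int, dklen: int = 32) -> bytes:
    """Derive a key with PBKDF2-HMAC-SHA1 (hash choice is an assumption)."""
    return hashlib.pbkdf2_hmac("sha1", passphrase, salt, iterations, dklen)


def time_derive(iterations: int) -> float:
    """Return the wall-clock seconds one derivation takes on this machine."""
    start = time.perf_counter()
    derive(b"example passphrase", b"0123456789abcdef", iterations)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical iteration counts; real values come from the LUKS header.
    for iterations in (1_000, 10_000, 100_000):
        print(f"{iterations:>7} iterations: {time_derive(iterations):.4f} s")
```

Running the same loop on the host and comparing it with the delay seen in GRUB gives a rough idea of how much slower the firmware environment is for the same iteration count.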

Edit: The pbkdf2 implementation in cryptsetup is here (https://gitlab.com/cryptsetup/cryptsetup/blob/master/lib/crypto_backend/pbkdf2_generic.c), and to my untrained eye it looks identical to GRUB's.
strcpy commented 1 year ago

Cryptomount is also slow on coreboot 4.9, with GRUB2 master, also seemingly in the grub_crypto_pbkdf2 function.

@swiftgeek - When you say performance is better on i386-pc, is that through the stock BIOS or SeaBIOS? Thanks.

Edit: I think the issue is with GRUB's implementation of memmove (grub_memmove). It doesn't perform as well as glibc's. However, on my desktop, building the GRUB implementation with -O3 gives a ~5x performance increase over -O2, so simply building GRUB with -O3 might be enough to fix this. I'll try to confirm this at some point.
Swift Geek commented 1 year ago
Collaborator

Both vendor BIOS (Phoenix) and SeaBIOS, when not using GRUB native drivers.
ashm commented 1 year ago

example details for USB HID issue:

x200s & logitech unifying receiver

Errors written over the grub menu:

Unknown key 0xff detected
Unknown key 0xff detected
Unknown key 0xb0 detected
Unknown key 0x81 detected
Unknown key 0x31 detected
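
For readers unfamiliar with where such key values come from: a USB HID boot-protocol keyboard report is 8 bytes long (one modifier byte, one reserved byte, six keycode slots). The Python sketch below only illustrates that layout. The set of "known" keycodes and the sample report are made up for the example and are not taken from GRUB's key tables, and nothing here establishes what the receiver actually sent.

```python
from typing import List, Set, Tuple

REPORT_LEN = 8                    # boot-protocol keyboard report size
NO_EVENT = 0x00                   # empty keycode slot
ERROR_CODES = {0x01, 0x02, 0x03}  # ErrorRollOver, POSTFail, ErrorUndefined


def parse_boot_report(report: bytes) -> Tuple[int, List[int]]:
    """Split a boot keyboard report into (modifier bits, pressed keycodes)."""
    if len(report) != REPORT_LEN:
        raise ValueError(f"expected {REPORT_LEN} bytes, got {len(report)}")
    modifiers = report[0]         # byte 1 is reserved and ignored
    keycodes = [k for k in report[2:8] if k != NO_EVENT]
    return modifiers, keycodes


def unknown_keycodes(report: bytes, known: Set[int]) -> List[int]:
    """Return keycodes that are neither in `known` nor HID error codes."""
    _, keycodes = parse_boot_report(report)
    return [k for k in keycodes if k not in known and k not in ERROR_CODES]


if __name__ == "__main__":
    known = set(range(0x04, 0x1E))  # hypothetical table: letters a-z only
    sample = bytes([0x00, 0x00, 0x04, 0xFF, 0x00, 0x00, 0x00, 0x00])  # made up
    for key in unknown_keycodes(sample, known):
        print(f"Unknown key 0x{key:02x} detected")
```
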

lsusb -v:

Bus 006 Device 002: ID 046d:c52b Logitech, Inc. Unifying Receiver
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0         8
  idVendor           0x046d Logitech, Inc.
  idProduct          0xc52b Unifying Receiver
  bcdDevice           12.03
  iManufacturer           1 
  iProduct                2 
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           84
    bNumInterfaces          3
    bConfigurationValue     1
    iConfiguration          4 
    bmAttributes         0xa0
      (Bus Powered)
      Remote Wakeup
    MaxPower               98mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         3 Human Interface Device
      bInterfaceSubClass      1 Boot Interface Subclass
      bInterfaceProtocol      1 Keyboard
      iInterface              0 
    HID Device Descriptor:
      bLength                 9
      bDescriptorType        33
      bcdHID               1.11
      bCountryCode            0 Not supported
      bNumDescriptors         1
      bDescriptorType        34 Report
      wDescriptorLength      59
     Report Descriptors: 
       ** UNAVAILABLE **
      Endpoint Descriptor:
    bLength                 7
    bDescriptorType         5
    bEndpointAddress     0x81  EP 1 IN
    bmAttributes            3
      Transfer Type            Interrupt
      Synch Type               None
      Usage Type               Data
    wMaxPacketSize     0x0008  1x 8 bytes
    bInterval               8
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        1
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         3 Human Interface Device
      bInterfaceSubClass      1 Boot Interface Subclass
      bInterfaceProtocol      2 Mouse
      iInterface              0 
    HID Device Descriptor:
      bLength                 9
      bDescriptorType        33
      bcdHID               1.11
      bCountryCode            0 Not supported
      bNumDescriptors         1
      bDescriptorType        34 Report
      wDescriptorLength     148
     Report Descriptors: 
       ** UNAVAILABLE **
      Endpoint Descriptor:
    bLength                 7
    bDescriptorType         5
    bEndpointAddress     0x82  EP 2 IN
    bmAttributes            3
      Transfer Type            Interrupt
      Synch Type               None
      Usage Type               Data
    wMaxPacketSize     0x0008  1x 8 bytes
    bInterval               2
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        2
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         3 Human Interface Device
      bInterfaceSubClass      0 No Subclass
      bInterfaceProtocol      0 None
      iInterface              0 
    HID Device Descriptor:
      bLength                 9
      bDescriptorType        33
      bcdHID               1.11
      bCountryCode            0 Not supported
      bNumDescriptors         1
      bDescriptorType        34 Report
      wDescriptorLength      93
     Report Descriptors: 
       ** UNAVAILABLE **
      Endpoint Descriptor:
    bLength                 7
    bDescriptorType         5
    bEndpointAddress     0x83  EP 3 IN
    bmAttributes            3
      Transfer Type            Interrupt
      Synch Type               None
      Usage Type               Data
    wMaxPacketSize     0x0020  1x 32 bytes
    bInterval               2
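
A few of the raw fields in the listing above can be decoded by hand. The Python sketch below does so for bmAttributes, bEndpointAddress and the interface class triple, using the bit layouts from the USB 2.0 and HID specifications. The function names are my own. It shows that interfaces 0 and 1 advertise the boot protocol while interface 2 does not; whether that is related to the errors is not established.

```python
from typing import Dict, Tuple


def decode_bm_attributes(value: int) -> Dict[str, bool]:
    """Decode the configuration descriptor's bmAttributes byte."""
    return {
        "self_powered": bool(value & 0x40),   # bit 6; clear means bus powered
        "remote_wakeup": bool(value & 0x20),  # bit 5
    }


def decode_endpoint_address(value: int) -> Tuple[int, str]:
    """Return (endpoint number, direction) from bEndpointAddress."""
    return value & 0x0F, "IN" if value & 0x80 else "OUT"


def describe_interface(cls: int, subclass: int, protocol: int) -> str:
    """Describe an interface from its class/subclass/protocol triple."""
    if cls != 3:
        return "not HID"
    if subclass != 1:
        return "HID, no boot protocol"
    names = {1: "keyboard", 2: "mouse"}
    return f"HID boot {names.get(protocol, 'unknown')}"


if __name__ == "__main__":
    print(decode_bm_attributes(0xA0))
    for number, triple in enumerate([(3, 1, 1), (3, 1, 2), (3, 0, 0)]):
        print(number, describe_interface(*triple))
```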