sd8686 linux system hang when associating to access point
Wood, Brian J
brian.j.wood at intel.com
Tue Jun 9 11:59:04 EDT 2009
>-----Original Message-----
>From: Dan Williams [mailto:dcbw at redhat.com]
>Sent: Tuesday, June 09, 2009 8:01 AM
>To: Wood, Brian J
>Cc: libertas-dev at lists.infradead.org
>Subject: RE: sd8686 linux system hang when associating to access point
>
>On Mon, 2009-06-08 at 15:02 -0700, Wood, Brian J wrote:
>> >-----Original Message-----
>> >From: Dan Williams [mailto:dcbw at redhat.com]
>> >Sent: Friday, June 05, 2009 12:35 PM
>> >To: Wood, Brian J
>> >Cc: libertas-dev at lists.infradead.org
>> >Subject: Re: sd8686 linux system hang when associating to access point
>> >
>> >On Fri, 2009-06-05 at 12:04 -0700, Wood, Brian J wrote:
>> >> Hello all,
>> >>
>> >> I have an issue with my development platform that's using Marvell's 88w8686 wifi component. In
>> >the past I have been able to use the version 9 firmware when connecting to our corporate wifi
>> >testing network. Recently when trying to associate an essid with the wifi network the entire
>system
>> >will hang (like the SDIO controller has crashed).
>> >>
>> >> I'd like to help debug this issue, but don't know where to start gathering the information
>needed
>> >by the members of the mailing list. I have lbsdebug installed on the target, the kernel source
>> >recompiled to have debugging turned on (MMC, Libertas, etc...). My kernel source version is
>> >2.6.29.2 and is from the Moblin v2 project. The development platform is Intel Menlow based (like
>> >used in many of the current Netbooks).
>> >
>> >You'll first need to figure out if the Moblin kernel still uses a
>> >non-upstream vendor driver. If so, then you need to contact Marvell or
>> >Intel for help with that driver. Historically, the Moblin kernels
>> >(2.6.22 and 2.6.24) used a completely different driver than was present
>> >in the upstream kernel, because Intel apparently didn't feel like they
>> >needed to work with the upstream community to enhance the existing one
>> >to meet their needs. Oh well.
>> >
>> >To figure that out, run 'lsmod' when the card is inserted and we'll see
>> >what driver module is there. If you're actually using the standard
>> >upstream driver, then we get to go on to the following:
>>
>> Hi Dan, thanks for the help by the way. :)
>>
>> Yes, I'm using the standard upstream libertas/libertas_sdio driver (just to make sure I pulled
>down the vanilla 2.6.29.4 kernel and rebuilt with the libertas debugging turned on, so were using a
>known upstream version. The libertas I'm using is the libertas_sdio for the sd8686.
>>
>> Here's my lsmod output:
>> sh-3.2# lsmod
>> Module Size Used by
>> libertas_sdio 11616 0
>> libertas 134196 1 libertas_sdio
>> lib80211 4848 1 libertas
>> mmc_block 8552 0
>> sdhci_pci 6352 0
>> sdhci 14392 1 sdhci_pci
>> mmc_core 41356 3 libertas_sdio,mmc_block,sdhci
>> pl2303 14440 1
>> usbserial 26212 3 pl2303
>> sh-3.2#
>
>Yup, looks good.
>
>> >
>> >1) what SDIO controller are you using? Is its source upstream in the
>> >kernel? The *largest* source of issues with the Libertas driver is
>> >crappy SD controllers. I test the driver mainly on Ricoh laptop SD
>> >controllers, and it works very well there. The quality of your
>> >controller has a huge impact on how well the driver will work.
>>
>> The development platform I'm using is a Compal Jax10 (Aigo P8860 is another version). Is there a
>Linux command to find out the SDIO controller its using? I tried using dmidecode and it didn't seem
>to list a specific HW name for the SD controller. I also tried using lspci, but I don't know if
>this PCI table info is helpful:
>>
>> 00:1e.0 SD Host controller: Intel Corporation System Controller Hub (SCH Poulsbo
>> ) SDIO Controller #1 (rev 06)
>> 00:1e.1 SD Host controller: Intel Corporation System Controller Hub (SCH Poulsbo
>> ) SDIO Controller #2 (rev 06)
>
>Ugh. That driver isn't upstream yet, and last I looked at the patch,
>had some serious issues with mergability. That doesn't say anything
>about it's actual *operation*; ie it might work fine or it might not,
>but since it's not yet upstream and hasn't passed kernel review, it's
>somewhat suspect.
>
>>
>> >
>> >2) You may wish to rebuild the libertas modules with debugging enabled.
>> >Set CONFIG_LIBERTAS_DEBUG=y in your kernel config and rebuild. Then,
>> >modprobe the libertas driver like so:
>> >
>> >modprobe libertas.ko libertas_debug=0x5863a7
>> >modprobe libertas_sdio.ko
>> >
>> >(see drivers/net/wireless/libertas/defs.h LBS_DEB_* for the debugging
>> >constants)
>> >
>> >then try to reproduce the problem. This will spew out a lot of good
>> >debugging output, which you'll want to capture over a serial port or
>> >something so that when the machine does go south, we can figure out why.
>>
>> Here's some output I thought might be helpful:
>>
>> sh-3.2# dmesg | grep -i libertas
>> [ 9.977267] libertas: 00:1d:19:f0:ca:92, fw 9.70.3p36, cap 0x00000303
>> [ 9.978091] eth0 (libertas_sdio): not using net_device_ops yet
>> [ 9.983572] libertas: PREP_CMD: command 0x00a3 failed: 2
>> [ 9.984501] libertas: PREP_CMD: command 0x00a3 failed: 2
>> [ 9.984646] libertas: eth0: Marvell WLAN 802.11 adapter
>> sh-3.2#
>>
>> I saw on a mailing list that the PREP_CMD failure was something to do with the
>CONFIG_MAC80211_MESH being set, so I tried rebuilding the kernel with this enabled/disabled and
>still see that message in dmesg. Didn't know if this helps.
>
>That failure is expected on non-USB8388 devices, which have the mesh
>firmware. Doesn't have anything to do with CONFIG_MAC80211_MESH
>actually, as the 8388 mesh capability is firmware-based. The command
>failures are actually the indicator that a mesh command failed, and thus
>mesh isn't supported on this device. We should probably clean that up,
>but it's a low priority.
>
>> Here's the rest of my data:
>>
>>
>> sh-3.2# echo 8 > /proc/sys/kernel/printk
>> sh-3.2# echo 5792679 > /sys/module/libertas/parameters/libertas_debug
>>
>> sh-3.2# ./lbsdebug
>> 0x005863a7: enter,leave,main,wext,scan,assoc,join,host,cmd,fw,thread,sdio
>>
>> Details:
>> enter + 0x00000001 - function entrance
>> leave + 0x00000002 - function exit
>> main + 0x00000004 - main libertas library code
>> net 0x00000008 - interaction with network subsystem
>> mesh 0x00000010 - wireless mesh network
>> wext + 0x00000020 - interaction with wireless extensions
>> ioctl 0x00000040 - misc IOCTLs
>> scan + 0x00000080 - scanning for APs
>> assoc + 0x00000100 - associating ton an AP
>> join + 0x00000200 - joining an IBSS?
>> 11d 0x00000400 - 802.11d country settings
>> debugfs 0x00000800 - interaction with the debugfs subsystem
>> ethtool 0x00001000 - interaction with ethtool subsystem
>> host + 0x00002000 - communication between host and wlan chip
>> cmd + 0x00004000 - command and response processing
>> rx 0x00008000 - packet reception
>> tx 0x00010000 - packet transmission
>> usb 0x00020000 - interaction with USB subsystem
>> cs 0x00040000 - interaction with card services subsystem
>> fw + 0x00080000 - firmware downloading
>> thread + 0x00100000 - main libertas worker thread
>> hex 0x00200000 - turn on detailed hex dumps
>> sdio + 0x00400000 - interaction with SDIO subsystem
>> sh-3.2#
>>
>> sh-3.2# iwconfig eth0
>> [ 66.612880] libertas enter: lbs_get_name()
>> [ 66.613052] libertas leave: lbs_get_name()
>> [ 66.613725] libertas enter: lbs_get_freq()
>> [ 66.613826] libertas wext: freq 241200000
>> [ 66.613937] libertas leave: lbs_get_freq()
>> [ 66.614588] libertas enter: lbs_get_encode()
>> [ 66.614691] libertas wext: flags 0x0, index -1, length 64, wep_tx_keyidx 0
>> [ 66.614790] libertas wext: key: 00:00:00:00:00:00, keylen 64
>> [ 66.614880] libertas wext: return flags 0xa000
>> [ 66.614963] libertas leave: lbs_get_encode()
>> [ 66.615841] libertas enter: lbs_get_essid()
>> [ 66.615928] libertas leave: lbs_get_essid()
>> [ 66.616359] libertas enter: lbs_get_mode()
>> [ 66.616445] libertas leave: lbs_get_mode()
>> [ 66.616544] libertas enter: lbs_get_range()
>> [ 66.616629] libertas enter: copy_active_data_rates()
>> [ 66.616715] libertas leave: copy_active_data_rates()
>> [ 66.616803] libertas wext: IW_MAX_BITRATES 32, num_bitrates 12
>> [ 66.616894] libertas wext: IW_MAX_FREQUENCIES 32, num_frequency 11
>> [ 66.616986] libertas leave: lbs_get_range()
>> [ 66.618195] libertas enter: lbs_get_wap()
>> [ 66.618362] libertas leave: lbs_get_wap()
>> [ 66.618532] libertas enter: lbs_get_rate()
>> [ 66.618694] libertas leave: lbs_get_rate()
>> [ 66.618863] libertas enter: lbs_get_power()
>> [ 66.619050] libertas leave: lbs_get_power()
>> [ 66.620022] liberta[ 66.621107] libertas enter: __lbs_cmd()
>> [ 66.621273] liber[ 66.626819] libertas thread: sleeping, connect_status 1,0
>> [ [ 66.643861] libertas thread: 3: currenttxskb (null), dnld_sen[ 66.6494)
>> [ 66.649574][ 66.656430] libertas enter: if_sdio_host_to_card_worker()
>> [ [ 66.660600] libertas enter: if_sdio_interrupt()
>> [ 66.66076[ 66.666272] libertas leave: if_sdio_card_to_host(), ret 0
>> [ [ 66.671932] libertas thread: 4: currenttxskb (null), dnld_sen[ 66.677602)
>> [ 66.6[ 66.683313] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00[ 66.0
>> [ 66.694742[ 66.700321] libertas enter: if_sdio_host_to_card_worker()
>> [ [ 66.705700] mmc1: req done (CMD52): 0: 00001092 00000000 0000[ 66.7114910
>> [ 66.711658eth0 IEEE 802.11b/g ESSID:""
>> Mode:Managed Frequency:2.412 GHz Access Point: Not-Associated
>> Bit Rate:0 kb/s Tx-Power=15 dBm
>> Retry short limit:8 RTS thr=2347 B Fragment thr=2346 B
>> Encryption key:off
>> Power Management:off
>> Link Quality:0 Signal level:0 Noise level:0
>> Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
>> Tx excessive retries:0 Invalid misc:0 Missed beacon:0
>>
>> sh-3.2#
>>
>> sh-3.2# iwconfig eth0 essid Guest
>> [ 305.621045] libertas enter: lbs_get_wireless_stats()
>> [ 305.621266] libertas leave: lbs_get_wireless_stats()
>> [ 305.621689] libertas enter: lbs_set_essid()
>> [ 305.621866] libertas wext: requested SSID 'Guest'
>> [ 305.622320] libertas enter: lbs_get_association_request()
>> [ 305.622502] libertas leave: lbs_get_association_request()
>> [ 305.622677] libertas leave: lbs_set_essid(), ret 0
>> [ 306.122305] libertas enter: lbs_association_worker()
>> [ 306.122543] libertas assoc: Association Request:
>> [ 306.122549] flags: 0x00000002
>> [ 306.122553] SSID: 'Guest'
>> [ 306.122556] chann: 1
>> [ 306.122560] band: 0
>> [ 306.122563] mode: 2
>> [ 306.122566] BSSID: 00:00:00:00:00:00
>> [ 306.122570] secinfo:
>> [ 306.122574] auth_mode: 1
>> [ 306.124280] libertas enter: assoc_helper_associate()
>> [ 306.124484] libertas enter: assoc_helper_essid()
>> [ 306.124669] libertas assoc: SSID 'Guest' requested
>> [ 306.124838] libertas enter: lbs_send_specific_ssid_scan(SSID 'Guest'
>> [ 306.124843] )
>> [ 306.125183] libertas enter: lbs_scan_networks(full_scan 1)
>> [ 306.125356] libertas scan: numchannels 4, bsstype 3
>> [ 306.125529] libertas scan: chan_count 11, scan_channel 0
>> [ 306.125700] libertas scan: scanning 4 of 11 channels
>> [ 306.125868] libertas enter: lbs_do_scan(bsstype 3, chanlist[].chan 1, chan_c)
>> [ 306.126147] libertas enter: __lbs_cmd()
>> [ 306.126310] libertas enter: __lbs_cmd_async()
>> [ 306.126476] libertas enter: lbs_get_cmd_ctrl_node()
>> [ 306.126643] libertas leave: lbs_get_cmd_ctrl_node()
>> [ 306.126810] libertas host: PREP_CMD: command 0x0006
>> [ 306.126976] libertas enter: lbs_queue_cmd()
>> [ 306.127156] libertas host: QUEUE_CMD: [ 306.127810] libertas thread: 2: cur0
>> [ 306.592084] mmc1: starting CMD52 arg 00000a00 flags 00000195
>> [ 306.592497] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 306.592720] mmc1: req done (CMD52): 0: 00001002 00000000 00000000 00000000
>> [ 306.592729] libertas enter: if_sdio_interrupt()
>> [ 306.593188] mmc1: starting CMD52 arg 10000a00 flags 00000195
>> [ 306.593378] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 306.593562] mmc1: req done (CMD52): 0: 00001001 00000000 00000000 00000000
>> [ 306.593748] libertas sdio: interrupt: 0x1
>> [ 306.593912] mmc1: starting CMD52 arg 90000afe flags 00000195
>> [ 306.594117] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 306.594299] mmc1: req done (CMD52): 0: 000010fe 00000000 00000000 00000000
>> [ 306.594489] libertas enter: if_sdio_card_to_host()
>> [ 306.594658] mmc1: starting CMD52 arg 10006800 flags 00000195
>> [ 306.594845] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 306.595044] mmc1: req done (CMD52): 0: 0000104a 00000000 00000000 00000000
>> [ 306.595233] mmc1: starting CMD52 arg 10006a00 flags 00000195
>> [ 306.595420] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 306.595598] mmc1: req done (CMD52): 0: 00001002 00000000 00000000 00000000
>> [ 306.595785] mmc1: starting CMD52 arg 1000400[ 306.598899] libertas thread: 0
>> [ 306.6102[ 306.615742] libertas thread: 1: currenttxskb (null), dnld_sen[ 35
>> 0
>> [ 307.081616] mmc1: starting CMD52 arg 00000a00 flags 00000195
>> [ 307.081861] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 307.082162] mmc1: req done (CMD52): 0: 00001002 00000000 00000000 00000000
>> [ 307.082383] libertas enter: if_sdio_interrupt()
>> [ 307.082575] mmc1: starting CMD52 arg 10000a00 flags 00000195
>> [ 307.082766] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 307.082948] mmc1: req done (CMD52): 0: 00001001 00000000 00000000 00000000
>> [ 307.083160] libertas sdio: interrupt: 0x1
>> [ 307.083327] mmc1: starting CMD52 arg 90000afe flags 00000195
>> [ 307.083511] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 307.083692] mmc1: req done (CMD52): 0: 000010fe 00000000 00000000 00000000
>> [ 307.083880] libertas enter: if_sdio_card_to_host()
>> [ 307.084068] mmc1: starting CMD52 arg 10006800 flags 00000195
>> [ 307.084254] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 307.084439] mmc1: req done (CMD52): 0: 000010bf 00000000 00000000 00000000
>> [ 307.084625] mmc1: starting CMD52 arg 10006a00 flags 00000195
>> [ 307.084812] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 307.084994] mmc1: req done (CMD52): 0: 00001002 00000000 00000000 00000000
>> [ 307.085198] mmc1: starting CMD52 arg 100040[ 307.088157] libertas thread: 21
>> [ 307.09948[ 307.104808] libertas scan: got SSID IE: 'Guest', len 5
>> [ 30[ 307.110688] libertas leave: lbs_queue_cmd()
>> [ 307.110858] l[ 307.116137] mmc1: starting CMD52 arg 10000a00 flags 00000195
>> 0
>> [ 307.461844] mmc1: starting CMD52 arg 00000a00 flags 00000195
>> [ 307.462288] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 307.462510] mmc1: req done (CMD52): 0: 00001002 00000000 00000000 00000000
>> [ 307.462728] libertas enter: if_sdio_interrupt()
>> [ 307.462921] mmc1: starting CMD52 arg 10000a00 flags 00000195
>> [ 307.463131] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 307.463315] mmc1: req done (CMD52): 0: 00001001 00000000 00000000 00000000
>> [ 307.463504] libertas sdio: interrupt: 0x1
>> [ 307.463672] mmc1: starting CMD52 arg 90000afe flags 00000195
>> [ 307.463857] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 307.464059] mmc1: req done (CMD52): 0: 000010fe 00000000 00000000 00000000
>> [ 307.464245] libertas enter: if_sdio_card_to_host()
>> [ 307.464415] mmc1: starting CMD52 arg 10006800 flags 00000195
>> [ 307.464602] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 307.464784] mmc1: req done (CMD52): 0: 0000109c 00000000 00000000 00000000
>> [ 307.464970] mmc1: starting CMD52 arg 10006a00 flags 00000195
>> [ 307.465174] sdhci [sdhci_irq()]: *** mmc1 got interrupt: 0x00000001
>> [ 307.465356] mmc1: req done (CMD52): 0: 00001001 00000000 00000000 00000000
>> [ 307.465542] mmc1: starting CMD52 arg 1000400[ 307.469629] libertas thread: 1
>> [ 307.4742[ 307.479790] libertas scan: process_bss: IE len 106
>> [ 307.47[
>>
>>
>> At this point the device hangs and the display goes garbled, I'm capturing this output using a
>USB to serial adapter on the target since it doesn't have a serial port. (I'm assuming I'm dropping
>some of the messages that I would otherwise being seeing if I had a serial connector setup, I hope
>this gives us enough information.)
>
>Hmm, yeah, that's probably not where it left off, but that's likely the
>best you can do for now. Does a normal scan (iwlist eth0 scan) make
>things hang? Or is it just association?
>
>Dan
A regular scan will work ok (no hang), its just when I try and associate with an accesspoint. I'm going to try a suggestion from Philip about the align_sdio function about aligning with multiples of 4 bytes. I saw in the if_sdio.c file a mention about the 4 byte padding for certain controllers, but it looks like in the functions if_sdio_host_to_card() and if_sdio_card_to_host() they're being padded with the "+4" and "-4" to the card->buffer and chunk respectively, this must be something similar?
Brian
>
>> >
>> >Let me know if you have any questions!
>> >
>> >Dan
>> >
>> >> Since the system completely hangs trying to look for kernel oops or messages on the running
>> >system in the logfiles/dmesg isn't an option (until a reboot). I've been adding lots of printk's
>to
>> >the drivers files just so I could try and see how this is all unfolding up to the point where it
>> >dies. I just discovered the lbsdebug tool this morning and would like to gather the information
>> >most helpful to resolve the issue. :-)
>> >>
>> >> I'm excited to help with this, so any advice of what to gather first would be greatly
>> >appreciated.
>> >>
>> >> Thank you,
>> >>
>> >> Brian Wood
>> >> Intel Corporation
>> >> UMG Platform Software Group (UPSG)
>> >> brian.j.wood at intel.com
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> libertas-dev mailing list
>> >> libertas-dev at lists.infradead.org
>> >> http://lists.infradead.org/mailman/listinfo/libertas-dev
>>
>>
>> _______________________________________________
>> libertas-dev mailing list
>> libertas-dev at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/libertas-dev
More information about the libertas-dev
mailing list