Opened 6 weeks ago

Closed 11 days ago

#710 closed bug (cannot reproduce)

Spontaneous reboot of the 3U unit

Reported by: AlexLir Owned by: alx
Priority: low Milestone: Phase 1
Component: sw Keywords:
Cc:

Description (last modified by AlexLir)

swd-r2410
I expected the unit not to reboot: there were no network storms at the time and no other conditions that would trigger a reboot.
Nothing foreshadowed trouble. Vlad and I logged in to the unit's web interface, and then suddenly, contrary to my expectations, a watchdog reboot occurred.

Sep 24 05:32:59 sw01 daemon.err ntpd[247]: i/o error on routing socket No buffer space available - disabling
Oct  2 01:12:01 sw01 daemon.info swd[254]: user admin from [::ffff:192.168.0.7] authenticated
Oct  2 01:15:06 sw01 daemon.info swd[254]: user admin from [::ffff:192.168.0.7] authenticated
Oct  2 01:16:03 sw01 daemon.info swd[254]: user admin from [::ffff:192.168.0.7] authenticated
Oct  2 01:16:37 sw01 daemon.info swd[254]: user admin from [::ffff:192.168.0.83] authenticated
Oct  2 01:16:48 sw01 user.warn kernel: Alignment trap: MHD-connection (7362) PC=0xb681e3b4 Instr=0xe510500c Address=0x0000003f FSR 0x001
Oct  2 01:17:25 sw01 syslog.info syslogd started: BusyBox v1.18.5
Oct  2 01:17:25 sw01 user.notice kernel: klogd started: BusyBox v1.18.5 (2015-09-18 17:21:03 YEKT)
Oct  2 01:17:25 sw01 user.info kernel: Booting Linux on physical CPU 0
Oct  2 01:17:25 sw01 user.notice kernel: Linux version 3.6.9 (alx@ubuntu) (gcc version 4.5.3 20110311 (prerelease) (GCC) ) #1 Wed Jan 25 19:26:02 +05 2017
Oct  2 01:17:25 sw01 user.warn kernel: CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=00053177
Oct  2 01:17:25 sw01 user.warn kernel: CPU: VIVT data cache, VIVT instruction cache
Oct  2 01:17:25 sw01 user.warn kernel: Machine: Atmel AT91SAM9G20-EK
Oct  2 01:17:25 sw01 user.warn kernel: Memory policy: ECC disabled, Data cache writeback
Oct  2 01:17:25 sw01 user.info kernel: AT91: Detected soc type: at91sam9g20
Oct  2 01:17:25 sw01 user.info kernel: AT91: Detected soc subtype: Unknown
Oct  2 01:17:25 sw01 user.info kernel: AT91: sram at 0x2fc000 of 0x8000 mapped at 0xfef70000
Oct  2 01:17:25 sw01 user.debug kernel: On node 0 totalpages: 16384
Oct  2 01:17:25 sw01 user.debug kernel: free_area_init_node: node 0, pgdat c03d39a8, node_mem_map c03e7000
Oct  2 01:17:25 sw01 user.debug kernel:   Normal zone: 128 pages used for memmap
Oct  2 01:17:26 sw01 user.debug kernel:   Normal zone: 0 pages reserved
Oct  2 01:17:26 sw01 user.debug kernel:   Normal zone: 16256 pages, LIFO batch:3
Oct  2 01:17:26 sw01 user.warn kernel: Clocks: CPU 396 MHz, master 132 MHz, main 18.432 MHz
Oct  2 01:17:26 sw01 user.debug kernel: pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
Oct  2 01:17:26 sw01 user.debug kernel: pcpu-alloc: [0] 0 
Oct  2 01:17:26 sw01 user.warn kernel: Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 16256
Oct  2 01:17:26 sw01 user.notice kernel: Kernel command line: console=ttyS0,115200 root=/dev/mtdblock5 mtdparts=atmel_nand:128k(bootstrap)ro,256k(uboot)ro,128k(env1)ro,128k(env2)ro,3M(linux),-(root) rw rootfstype=jffs2
Oct  2 01:17:26 sw01 user.info kernel: PID hash table entries: 256 (order: -2, 1024 bytes)
Oct  2 01:17:26 sw01 user.info kernel: Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Oct  2 01:17:26 sw01 user.info kernel: Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Oct  2 01:17:26 sw01 user.info kernel: Memory: 64MB = 64MB total
Oct  2 01:17:26 sw01 user.notice kernel: Memory: 60932k/60932k available, 4604k reserved, 0K highmem
Oct  2 01:17:26 sw01 user.notice kernel: Virtual kernel memory layout:
Oct  2 01:17:26 sw01 user.notice kernel:     vector  : 0xffff0000 - 0xffff1000   (   4 kB)
Oct  2 01:17:26 sw01 user.notice kernel:     fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
Oct  2 01:17:26 sw01 user.notice kernel:     vmalloc : 0xc4800000 - 0xff000000   ( 936 MB)
Oct  2 01:17:26 sw01 user.notice kernel:     lowmem  : 0xc0000000 - 0xc4000000   (  64 MB)
Oct  2 01:17:26 sw01 user.notice kernel:     modules : 0xbf000000 - 0xc0000000   (  16 MB)
Oct  2 01:17:26 sw01 user.notice kernel:       .text : 0xc0008000 - 0xc038b134   (3597 kB)
Oct  2 01:17:26 sw01 user.notice kernel:       .init : 0xc038c000 - 0xc03abcdc   ( 128 kB)
Oct  2 01:17:26 sw01 user.notice kernel:       .data : 0xc03ac000 - 0xc03d4200   ( 161 kB)
Oct  2 01:17:26 sw01 user.notice kernel:        .bss : 0xc03d4224 - 0xc03e6cd8   (  75 kB)
Oct  2 01:17:26 sw01 user.info kernel: NR_IRQS:16 nr_irqs:16 16
Oct  2 01:17:26 sw01 user.info kernel: AT91: 96 gpio irqs in 3 banks
Oct  2 01:17:26 sw01 user.info kernel: sched_clock: 32 bits at 1kHz, resolution 1000000ns, wraps every 4294967295ms
Oct  2 01:17:26 sw01 user.info kernel: Console: colour dummy device 80x30
Oct  2 01:17:26 sw01 user.info kernel: Calibrating delay loop... 196.35 BogoMIPS (lpj=98176)
Oct  2 01:17:26 sw01 user.info kernel: pid_max: default: 32768 minimum: 301
Oct  2 01:17:26 sw01 user.info kernel: Mount-cache hash table entries: 512
Oct  2 01:17:26 sw01 user.info kernel: CPU: Testing write buffer coherency: ok
Oct  2 01:17:26 sw01 user.info kernel: Setting up static identity map for 0x202aed60 - 0x202aedb8
Oct  2 01:17:26 sw01 user.info kernel: devtmpfs: initialized
Oct  2 01:17:26 sw01 user.info kernel: NET: Registered protocol family 16
Oct  2 01:17:26 sw01 user.info kernel: DMA: preallocated 256 KiB pool for atomic coherent allocations
Oct  2 01:17:26 sw01 user.info kernel: AT91: Power Management
Oct  2 01:17:26 sw01 user.info kernel: AT91: Starting after watchdog reset
Oct  2 01:17:26 sw01 user.debug kernel: tcb_clksrc: tc0 at 16.012 MHz
Oct  2 01:17:26 sw01 user.info kernel: bio: create slab <bio-0> at 0
Oct  2 01:17:26 sw01 user.notice kernel: SCSI subsystem initialized
Oct  2 01:17:26 sw01 user.info kernel: usbcore: registered new interface driver usbfs
Oct  2 01:17:26 sw01 user.info kernel: usbcore: registered new interface driver hub
Oct  2 01:17:26 sw01 user.info kernel: usbcore: registered new device driver usb
Oct  2 01:17:26 sw01 user.info kernel: i2c-gpio i2c-gpio.0: using pins 23 (SDA) and 24 (SCL)
Oct  2 01:17:26 sw01 user.info kernel: i2c-gpio i2c-gpio.1: using pins 65 (SDA) and 64 (SCL)
Oct  2 01:17:26 sw01 user.info kernel: cfg80211: Calling CRDA to update world regulatory domain
Oct  2 01:17:26 sw01 user.info kernel: Switching to clocksource tcb_clksrc
Oct  2 01:17:26 sw01 user.info kernel: NET: Registered protocol family 2
Oct  2 01:17:26 sw01 user.info kernel: TCP established hash table entries: 2048 (order: 2, 16384 bytes)
Oct  2 01:17:26 sw01 user.info kernel: TCP bind hash table entries: 2048 (order: 1, 8192 bytes)
Oct  2 01:17:26 sw01 user.info kernel: TCP: Hash tables configured (established 2048 bind 2048)
Oct  2 01:17:26 sw01 user.info kernel: TCP: reno registered
Oct  2 01:17:26 sw01 user.info kernel: UDP hash table entries: 256 (order: 0, 4096 bytes)
Oct  2 01:17:26 sw01 user.info kernel: UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
Oct  2 01:17:26 sw01 user.info kernel: NET: Registered protocol family 1
Oct  2 01:17:26 sw01 user.info kernel: RPC: Registered named UNIX socket transport module.
Oct  2 01:17:26 sw01 user.info kernel: RPC: Registered udp transport module.
Oct  2 01:17:26 sw01 user.info kernel: RPC: Registered tcp transport module.
Oct  2 01:17:26 sw01 user.info kernel: RPC: Registered tcp NFSv4.1 backchannel transport module.
Oct  2 01:17:26 sw01 user.info kernel: jffs2: version 2.2. (NAND) (SUMMARY)  © 2001-2006 Red Hat, Inc.
Oct  2 01:17:26 sw01 user.info kernel: msgmni has been set to 119
Oct  2 01:17:26 sw01 user.info kernel: Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
Oct  2 01:17:26 sw01 user.info kernel: io scheduler noop registered (default)
Oct  2 01:17:26 sw01 user.info kernel: atmel_usart.0: ttyS0 at MMIO 0xfffff200 (irq = 17) is a ATMEL_SERIAL
Oct  2 01:17:26 sw01 user.info kernel: console [ttyS0] enabled
Oct  2 01:17:26 sw01 user.info kernel: atmel_usart.1: ttyS1 at MMIO 0xfffb0000 (irq = 22) is a ATMEL_SERIAL
Oct  2 01:17:26 sw01 user.info kernel: brd: module loaded
Oct  2 01:17:26 sw01 user.info kernel: loop: module loaded
Oct  2 01:17:26 sw01 user.info kernel: atmel_nand: Use On Flash BBT
Oct  2 01:17:26 sw01 user.info kernel: atmel_nand atmel_nand: No DMA support for NAND access.
Oct  2 01:17:26 sw01 user.info kernel: NAND device: Manufacturer ID: 0xec, Chip ID: 0xda (Samsung NAND 256MiB 3,3V 8-bit), page size: 2048, OOB size: 64
Oct  2 01:17:26 sw01 user.info kernel: Bad block table found at page 131008, version 0x01
Oct  2 01:17:26 sw01 user.info kernel: Bad block table found at page 130944, version 0x01
Oct  2 01:17:26 sw01 user.info kernel: nand_read_bbt: bad block at 0x00000b400000
Oct  2 01:17:26 sw01 user.notice kernel: 6 cmdlinepart partitions found on MTD device atmel_nand
Oct  2 01:17:26 sw01 user.notice kernel: Creating 6 MTD partitions on "atmel_nand":
Oct  2 01:17:26 sw01 user.notice kernel: 0x000000000000-0x000000020000 : "bootstrap"
Oct  2 01:17:26 sw01 user.notice kernel: 0x000000020000-0x000000060000 : "uboot"
Oct  2 01:17:26 sw01 user.notice kernel: 0x000000060000-0x000000080000 : "env1"
Oct  2 01:17:26 sw01 user.notice kernel: 0x000000080000-0x0000000a0000 : "env2"
Oct  2 01:17:26 sw01 user.notice kernel: 0x0000000a0000-0x0000003a0000 : "linux"
Oct  2 01:17:26 sw01 user.notice kernel: 0x0000003a0000-0x000010000000 : "root"
Oct  2 01:17:26 sw01 user.info kernel: atmel_spi atmel_spi.0: Atmel SPI Controller at 0xfffc8000 (irq 28)
Oct  2 01:17:26 sw01 user.info kernel: atmel_spi atmel_spi.0: master is unqueued, this is deprecated
Oct  2 01:17:26 sw01 user.info kernel: atmel_spi atmel_spi.1: Atmel SPI Controller at 0xfffcc000 (irq 29)
Oct  2 01:17:26 sw01 user.info kernel: atmel_spi atmel_spi.1: master is unqueued, this is deprecated
Oct  2 01:17:26 sw01 user.info kernel: libphy: MACB_mii_bus: probed
Oct  2 01:17:26 sw01 user.info kernel: macb macb: eth0: Cadence MACB at 0xfffc4000 irq 37 (02:ad:c2:00:01:de)
Oct  2 01:17:26 sw01 user.info kernel: macb macb: eth0: attached PHY driver [Marvell DX107 PHY] (mii_bus:phy_addr=macb-ffffffff:00, irq=-1)
Oct  2 01:17:26 sw01 user.info kernel: PPP generic driver version 2.4.2
Oct  2 01:17:26 sw01 user.info kernel: ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
Oct  2 01:17:26 sw01 user.info kernel: at91_ohci at91_ohci: AT91 OHCI
Oct  2 01:17:26 sw01 user.info kernel: at91_ohci at91_ohci: new USB bus registered, assigned bus number 1
Oct  2 01:17:26 sw01 user.info kernel: at91_ohci at91_ohci: irq 36, io mem 0x00500000
Oct  2 01:17:26 sw01 user.info kernel: hub 1-0:1.0: USB hub found
Oct  2 01:17:26 sw01 user.info kernel: hub 1-0:1.0: 2 ports detected
Oct  2 01:17:26 sw01 user.info kernel: Initializing USB Mass Storage driver...
Oct  2 01:17:26 sw01 user.info kernel: usbcore: registered new interface driver usb-storage
Oct  2 01:17:26 sw01 user.info kernel: USB Mass Storage support registered.
Oct  2 01:17:26 sw01 user.info kernel: udc: at91_udc version 3 May 2006
Oct  2 01:17:26 sw01 user.info kernel: mousedev: PS/2 mouse device common for all mice
Oct  2 01:17:26 sw01 user.info kernel: rtc-m41t80 0-0068: chip found, driver version 0.05
Oct  2 01:17:26 sw01 user.info kernel: rtc-m41t80 0-0068: rtc core: registered m41t80 as rtc0
Oct  2 01:17:26 sw01 user.info kernel: i2c /dev entries driver
Oct  2 01:17:26 sw01 user.info kernel: at91sam9_wdt: enabled (heartbeat=15 sec, nowayout=0)
Oct  2 01:17:26 sw01 user.debug kernel: Registered led device: alr
Oct  2 01:17:26 sw01 user.debug kernel: Registered led device: mem
Oct  2 01:17:26 sw01 user.debug kernel: Registered led device: ok
Oct  2 01:17:26 sw01 user.debug kernel: Registered led device: link9
Oct  2 01:17:26 sw01 user.debug kernel: Registered led device: buzzer
Oct  2 01:17:26 sw01 user.info kernel: TCP: cubic registered
Oct  2 01:17:26 sw01 user.info kernel: NET: Registered protocol family 10
Oct  2 01:17:26 sw01 user.info kernel: sit: IPv6 over IPv4 tunneling driver
Oct  2 01:17:26 sw01 user.info kernel: NET: Registered protocol family 17
Oct  2 01:17:26 sw01 user.info kernel: lib80211: common routines for IEEE802.11 drivers
Oct  2 01:17:26 sw01 user.debug kernel: lib80211_crypt: registered algorithm 'NULL'
Oct  2 01:17:26 sw01 user.info kernel: input: gpio-keys as /devices/platform/gpio-keys/input/input0
Oct  2 01:17:26 sw01 user.info kernel: rtc-m41t80 0-0068: setting system clock to 2024-10-02 01:17:11 UTC (1727831831)
Oct  2 01:17:26 sw01 user.warn kernel: jffs2: Empty flash at 0x023779d0 ends at 0x02378000
Oct  2 01:17:26 sw01 user.warn kernel: jffs2: Empty flash at 0x03cfb9a0 ends at 0x03cfc000
Oct  2 01:17:26 sw01 user.warn kernel: jffs2: Empty flash at 0x046bd9bc ends at 0x046be000
Oct  2 01:17:26 sw01 user.warn kernel: jffs2: Empty flash at 0x0471e414 ends at 0x0471e800
Oct  2 01:17:26 sw01 user.warn kernel: jffs2: Empty flash at 0x04dfb044 ends at 0x04dfb800
Oct  2 01:17:26 sw01 user.warn kernel: jffs2: Empty flash at 0x075973b4 ends at 0x07597800
Oct  2 01:17:26 sw01 user.warn kernel: jffs2: Empty flash at 0x093bc0b0 ends at 0x093bc800
Oct  2 01:17:26 sw01 user.warn kernel: jffs2: Empty flash at 0x0c1fa200 ends at 0x0c1fa800
Oct  2 01:17:26 sw01 user.warn kernel: jffs2: Empty flash at 0x0ed19f70 ends at 0x0ed1a000
Oct  2 01:17:26 sw01 user.info kernel: VFS: Mounted root (jffs2 filesystem) on device 31:5.
Oct  2 01:17:26 sw01 user.info kernel: devtmpfs: mounted
Oct  2 01:17:26 sw01 user.info kernel: Freeing init memory: 124K
Oct  2 01:17:26 sw01 user.info kernel:  gadget: Gadget Serial v2.4
Oct  2 01:17:26 sw01 user.info kernel:  gadget: g_serial ready
Oct  2 01:17:26 sw01 user.info kernel: IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Oct  2 01:17:26 sw01 user.info kernel: macb macb: eth0: link up (100/Full)
Oct  2 01:17:26 sw01 user.info kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Oct  2 01:17:27 sw01 daemon.info swd[254]: starting swd-r2410

Change History (4)

comment:1 by AlexLir, 6 weeks ago

Description: modified (diff)

comment:2 by alx, 6 weeks ago

To identify the cause of the problem, please do the following:

  1. Update all packages to their current versions.
  2. Perform a full reboot of the board.
  3. Obtain a core dump. To do this, run the command ulimit -c unlimited, then restart swd (it must be in the same session!) with the command /etc/init.d/swd.sh restart. Then reproduce the problem. After that, a core file should appear in the root directory.

Please send the resulting core file to me (but do not attach it to the ticket, because it may contain keys and passwords). I will then try to find out where the trap occurred.

comment:3 by alx, 6 weeks ago

I almost forgot the most important thing: the Alignment trap: ... line from the log must also be included, since it contains the address of the instruction on which the trap occurred.
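
For reference, the fields of that line can be pulled out mechanically. The following is only a minimal sketch, assuming the message keeps the format seen in the log in the description; the helper name parse_alignment_trap and the regular expression are illustrative and not part of swd or the kernel.

```python
import re

# Assumed format, copied from the single "Alignment trap" line in this ticket's log.
TRAP_RE = re.compile(
    r"Alignment trap: (?P<comm>\S+) \((?P<pid>\d+)\) "
    r"PC=0x(?P<pc>[0-9a-fA-F]+) "
    r"Instr=0x(?P<instr>[0-9a-fA-F]+) "
    r"Address=0x(?P<addr>[0-9a-fA-F]+)"
)


def parse_alignment_trap(line):
    """Return the trap fields as a dict, or None if the line is not a trap message."""
    m = TRAP_RE.search(line)
    if m is None:
        return None
    return {
        "comm": m.group("comm"),               # thread/process name reported by the kernel
        "pid": int(m.group("pid")),
        "pc": int(m.group("pc"), 16),          # address of the faulting instruction
        "instr": int(m.group("instr"), 16),    # raw instruction word
        "address": int(m.group("addr"), 16),   # misaligned data address being accessed
    }


# The line from this ticket's log.
line = (
    "Oct  2 01:16:48 sw01 user.warn kernel: Alignment trap: MHD-connection (7362) "
    "PC=0xb681e3b4 Instr=0xe510500c Address=0x0000003f FSR 0x001"
)
trap = parse_alignment_trap(line)
print(trap["comm"], hex(trap["pc"]), hex(trap["address"]))
```

The PC value is the one to compare against the core file; on its own it only identifies an address in the process's memory map, not a source line.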

comment:4 by alx, 11 days ago

Resolution: cannot reproduce
Status: new → closed