09-11-2022 07:27 AM
Hello,
We are seeing high operating temperatures on the BR-MLX-10GX20-X2 LP modules, especially the one installed in slot 2 of the MLXe-4 chassis. For instance, Sensor 2 of the LP in slot 2 reads 65.250C.
We have also noticed that chassis airflow is substantially better in slots 3 and 4 than in slots 1 and 2, where it is almost nonexistent.
For context, this chassis is installed at a data center that maintains an average temperature of 71 F and 45% relative humidity. The HVAC perforated floor tile is literally in front of the unit.
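To put the room temperature and the sensor reading on the same scale, here is a minimal sketch that converts the 71 F ambient figure to Celsius and computes how far the hottest LP2 reading sits above it. The function and variable names are illustrative only; the two input values come from this post.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a Fahrenheit temperature to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Values taken from the post: room average and the LP2 Sensor 2 reading.
ambient_c = fahrenheit_to_celsius(71.0)
hottest_c = 65.250

# Temperature rise of the hottest sensor over the room average.
rise_c = hottest_c - ambient_c

print(f"ambient: {ambient_c:.1f} C, rise above ambient: {rise_c:.1f} C")
```

This only compares the sensor against the room average; the actual inlet temperature at the chassis may differ.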
Questions:
1. What are the normal operating temperature ranges, and why are we seeing these high temperatures when CPU utilization averages 1%?
2. Is 65.250C something to be concerned about?
3. What SNMP OIDs can we use to set the low, warning, and high temperature thresholds for each sensor on the LP and MP modules?
Any suggestions and/or instructions will be highly appreciated.
show chassis output:
>show chassis
*** MLXe 4-slot Chassis ***
---POWERS ---
Power 1: ( 32006000 115286124100 - AC 1200W): Installed (OK)
Power 2: ( 32006000 112186119559 - AC 1200W): Installed (OK)
Power 3: not present
Power 4: not present
Total power budget for chassis = 2400 W
Total power used by system core = 450 W
Total power used by LPs = 880 W
Total power available = 1070 W
Slot Power-On Priority and Power Usage:
Slot1 pri=1 module type=BR-MLX-10Gx20 20-port 1/10GbE Module power usage=440W
Slot2 pri=1 module type=BR-MLX-10Gx20 20-port 1/10GbE Module power usage=440W
Slot3 pri=1 module type=BR-MLX-10Gx8-X 8-port 10GbE (X) Module power usage=270W
Slot4 pri=1 module type=BR-MLX-10Gx8-X 8-port 10GbE (X) Module power usage=270W
--- FANS ---
Back fan A-1: Status = OK, Speed = LOW (50%)
Back fan A-2: Status = OK, Speed = LOW (50%)
Back fan B-1: Status = OK, Speed = LOW (50%)
Back fan B-2: Status = OK, Speed = LOW (50%)
Back fan C-1: Status = OK, Speed = LOW (50%)
Back fan C-2: Status = OK, Speed = LOW (50%)
Back fan D-1: Status = OK, Speed = LOW (50%)
Back fan D-2: Status = OK, Speed = LOW (50%)
--- TEMPERATURE READINGS ---
Active Mgmt Module: 31.750C 42.750C
Standby Mgmt Module: 32.250C 42.875C
SFM1: FE1:35.625C
SFM2: FE1:38.0C
SFM3: FE1:37.500C
LP1 Sensor1: 37.500C
LP1 Sensor2: 56.500C
LP1 Sensor3: 54.750C
LP1 Sensor4: 51.750C
LP1 Sensor5: 43.750C
LP1 Sensor6: 58.0C
LP1 Sensor7: 49.125C
LP1 Sensor8: 38.500C
LP1 Sensor9: 50.375C
LP2 Sensor1: 52.0C
LP2 Sensor2: 65.125C
LP2 Sensor3: 65.250C
LP2 Sensor4: 64.500C
LP2 Sensor5: 60.375C
LP2 Sensor6: 63.375C
LP2 Sensor7: 61.875C
LP2 Sensor8: 45.750C
LP2 Sensor9: 55.375C
Fans are in auto mode (current speed is LOW (50%)). Temperature monitoring poll period is 60 seconds.
--- MISC INFO ---
Backplane EEPROM MAC Address: 0024.3896.aa00
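For anyone who wants to compare the two line cards sensor by sensor, the LP lines from the temperature readings above can be parsed with a short script. This is a rough sketch, assuming the `LPn SensorM: xx.xxxC` line format shown in this output; the regex and the function name `parse_lp_temps` are my own and not part of any vendor tooling.

```python
import re
from collections import defaultdict

# LP temperature lines copied from the "show chassis" output in this post.
SAMPLE = """\
LP1 Sensor1: 37.500C
LP1 Sensor2: 56.500C
LP1 Sensor3: 54.750C
LP1 Sensor4: 51.750C
LP1 Sensor5: 43.750C
LP1 Sensor6: 58.0C
LP1 Sensor7: 49.125C
LP1 Sensor8: 38.500C
LP1 Sensor9: 50.375C
LP2 Sensor1: 52.0C
LP2 Sensor2: 65.125C
LP2 Sensor3: 65.250C
LP2 Sensor4: 64.500C
LP2 Sensor5: 60.375C
LP2 Sensor6: 63.375C
LP2 Sensor7: 61.875C
LP2 Sensor8: 45.750C
LP2 Sensor9: 55.375C
"""

# Assumed line format: "LP<slot> Sensor<id>: <value>C"
LINE_RE = re.compile(r"^LP(\d+) Sensor(\d+): (\d+(?:\.\d+)?)C$")


def parse_lp_temps(text: str) -> dict[int, dict[int, float]]:
    """Return {slot: {sensor_id: temperature_in_celsius}} for LP lines."""
    temps: dict[int, dict[int, float]] = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            slot, sensor = int(match.group(1)), int(match.group(2))
            temps[slot][sensor] = float(match.group(3))
    return dict(temps)


temps = parse_lp_temps(SAMPLE)

# Per-slot summary: hottest sensor and mean across all sensors.
for slot, sensors in sorted(temps.items()):
    hottest = max(sensors, key=sensors.get)
    mean = sum(sensors.values()) / len(sensors)
    print(f"LP{slot}: hottest Sensor{hottest} = {sensors[hottest]:.3f}C, mean = {mean:.3f}C")

# Per-sensor difference between slot 2 and slot 1.
deltas = {s: temps[2][s] - temps[1][s] for s in temps[1] if s in temps[2]}
mean_delta = sum(deltas.values()) / len(deltas)
print(f"LP2 runs {mean_delta:.3f}C warmer than LP1 on average")
```

On these readings, every LP2 sensor is warmer than its LP1 counterpart, which is consistent with the weaker airflow we feel in that part of the chassis.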
show temperature output:
>show temperature
SLOT #: CARD TYPE: SENSOR # TEMPERATURE (C):
33 ACTIVE MG 1 32.0C
33 ACTIVE MG 2 42.750C
=======================================================
34 STANDBY MG 1 32.250C
34 STANDBY MG 2 42.750C
=======================================================
1 LP 1 37.500C
1 LP 2 56.500C
1 LP 3 UNUSED
1 LP 4 51.875C
1 LP 5 43.875C
1 LP 6 UNUSED
1 LP 7 UNUSED
1 LP 8 UNUSED
1 LP 9 UNUSED
2 LP 1 52.0C
2 LP 2 65.250C
2 LP 3 UNUSED
2 LP 4 64.625C
2 LP 5 60.500C
2 LP 6 UNUSED
2 LP 7 UNUSED
2 LP 8 UNUSED
2 LP 9 UNUSED
SFM # FE # TEMPERATURE (C):
1 1 35.625C
2 1 37.875C
3 1 37.625C
CPU utilization holds steady at an average of about 1%.
show cpu-utilization output:
> show cpu-utilization
10:02:26 Eastern Sun Sep 11 2022
... Usage average for all tasks in the last 1 seconds ...
==========================================================
Name us/sec %
idle 954115 99
con 14 0
mon 159 0
flash 90 0
dbg 23 0
boot 80 0
main 0 0
itc 0 0
tmr 3577 0
ip_rx 15790 1
sfm_mgr 2849 0
scp 17 0
lpagent 0 0
console 110 0
vlan 0 0
mac_mgr 252 0
mrp 210 0
vsrp 0 0
erp 194 0
mxrp 76 0
snms 0 0
rtm 551 0
rtm6 433 0
ip_tx 2795 0
rip 0 0
l2vpn 0 0
mpls 0 0
nht 0 0
mpls_glue 0 0
pcep 0 0
bgp 5390 0
bgp_io 6365 0
ospf 208 0
ospf_r_calc 0 0
isis 131 0
isis_spf 0 0
mcast 241 0
msdp 19 0
vrrp 0 0
ripng 0 0
ospf6 210 0
ospf6_rt 0 0
mcast6 269 0
vrrp6 0 0
bfd 0 0
ipsec 60 0
l4 0 0
stp 0 0
gvrp_mgr 0 0
snmp 0 0
rmon 44 0
web 621 0
lacp 2904 0
dot1x 0 0
dot1ag 93 0
loop_detect 78 0
ccp 11 0
cluster_mgr 98 0
hqos 0 0
statistics 0 0
hw_access 34 0
sfm_mon 31 0
ntp 15 0
openflow_ofm 10 0
openflow_opm 22 0
dhcp6 0 0
fid_mgr 69 0
sysmon 90 0
pki 54 0
ospf_msg_task 0 0
ssl 0 0
http_client 0 0
ssh_0 1598 0
show version output:
SL 1: BR-MLX-10Gx20 20-port 1/10GbE Module (Serial #: REDACTED, Part #: 60-1002946-14)
License: 20x10GbE-X2-Scaling-UPG (LID: eydIHHLrFFI)
Boot : Version 5.9.0T175 Copyright (c) 2017-2019 Extreme Networks, Inc.
Compiled on Mar 19 2015 at 03:17:00 labeled as xmlprm05900
(449576 bytes) from boot flash
Monitor : Version 6.2.0T175 Copyright (c) 2017-2019 Extreme Networks, Inc.
Compiled on Feb 11 2021 at 08:11:50 labeled as xmlb06200
(574086 bytes) from code flash
IronWare : Version 6.3.0fT177 Copyright (c) 2017-2019 Extreme Networks, Inc.
Compiled on Jul 14 2022 at 21:15:32 labeled as xmlp06300f
(9588227 bytes) from Primary
FPGA versions:
Valid PBIF Version = 2.11, Build Time = 8/19/2016 14:54:00
Valid XPP Version = 0.01, Build Time = 3/29/2021 7:08:00
MACXPP100G 0
MACXPP100G 1
1199 MHz MPC P2010 (version 8021/1051) 599 MHz bus
512 KB Boot Flash (MX29LV040C), 66846720 Bytes (~64 MB) Code Flash (MT28F256J3)
3072 MB DRAM, 8 KB SRAM
LP Slot 1 uptime is 11 days 11 hours 39 minutes 0 seconds
==========================================================================
SL 2: BR-MLX-10Gx20 20-port 1/10GbE Module (Serial #: REDACTED, Part #: 60-1002946-12)
License: 20x10GbE-X2-Scaling-UPG (LID: eydIHJLoFGv)
Boot : Version 5.9.0T175 Copyright (c) 2017-2019 Extreme Networks, Inc.
Compiled on Mar 19 2015 at 03:17:00 labeled as xmlprm05900
(449576 bytes) from boot flash
Monitor : Version 6.2.0T175 Copyright (c) 2017-2019 Extreme Networks, Inc.
Compiled on Feb 11 2021 at 08:11:50 labeled as xmlb06200
(574086 bytes) from code flash
IronWare : Version 6.3.0fT177 Copyright (c) 2017-2019 Extreme Networks, Inc.
Compiled on Jul 14 2022 at 21:15:32 labeled as xmlp06300f
(9588227 bytes) from Primary
FPGA versions:
Valid PBIF Version = 2.11, Build Time = 8/19/2016 14:54:00
Valid XPP Version = 0.01, Build Time = 3/29/2021 7:08:00
MACXPP100G 0
MACXPP100G 1
1199 MHz MPC P2010 (version 8021/1051) 599 MHz bus
512 KB Boot Flash (MX29LV040C), 66846720 Bytes (~64 MB) Code Flash (MT28F256J3)
3072 MB DRAM, 8 KB SRAM
LP Slot 2 uptime is 22 hours 45 minutes 53 seconds
09-12-2022 07:35 AM
Hello,
I find that the 100G CFP and the 10Gx20 modules tend to run on the warmer side. If you have slots available, it is best to install them next to lower-capacity cards or empty slots. Also make sure every empty slot has an insert installed to keep air flowing across the cards.
Your questions about thresholds and "normal" temps are covered in the Hardware Install guide:
MLX Install Guide (check page 210, Managing the Cooling System)
Supported OIDs/MIBs are found in the MIB Guide
NetIron 6.3.x MIB reference guide (check page 201, System Temperature and thresholds)
Thanks,
09-13-2022 06:29 AM
Hi Michael,
I appreciate your input. It is good to know that we are well within normal parameters with these temps. We will move the module in slot 2 to slot 3 or 4, where we have no active interface modules and feel much better airflow. This will require a maintenance window and configuration changes, but I think it is worth the effort to improve cooling.
It would be nice to have a graphical illustration of the MLXe-4 chassis so that we can make informed decisions about LP slot placement.
Regards,
Roger