Power management in alert events, e.g. NonCritical

Dear CERN-IPMC support,

We are now developing ATCA blades using CERN-IPMC.

The sensors have threshold values of UNR (Upper Non-Recoverable), UC, UNC, LNC, LC, and LNR. How can I understand the responses to these events?
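
To make the six levels concrete, here is a minimal sketch of how a reading maps onto the threshold bands. The threshold values for a 12 V rail, the `Thresholds` class, and the `classify` function are all hypothetical illustrations, not part of the CERN-IPMC or the Shelf Manager, and the inclusive comparisons are an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """The six IPMI-style threshold levels, ordered from lowest to highest."""
    lnr: float  # Lower Non-Recoverable
    lc: float   # Lower Critical
    lnc: float  # Lower Non-Critical
    unc: float  # Upper Non-Critical
    uc: float   # Upper Critical
    unr: float  # Upper Non-Recoverable


def classify(value: float, t: Thresholds) -> str:
    """Return the most severe threshold the reading has crossed, or "OK".

    Assumption: upper thresholds are crossed when value >= threshold and
    lower thresholds when value <= threshold.
    """
    # Check the most severe level first on each side.
    if value >= t.unr:
        return "UNR"
    if value >= t.uc:
        return "UC"
    if value >= t.unc:
        return "UNC"
    if value <= t.lnr:
        return "LNR"
    if value <= t.lc:
        return "LC"
    if value <= t.lnc:
        return "LNC"
    return "OK"


# Hypothetical thresholds for a 12 V rail (illustrative values only).
RAIL_12V = Thresholds(lnr=10.0, lc=10.8, lnc=11.4, unc=12.6, uc=13.2, unr=14.0)

if __name__ == "__main__":
    for reading in (12.0, 12.7, 10.5, 14.0):
        print(reading, classify(reading, RAIL_12V))
```

What the system does once a given level is reported is a separate matter from this classification.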

In the case of a temperature sensor, the alert is propagated to the shelf manager's cooling algorithm. I found the details of the cooling algorithm in the Pigeon Point Shelf Manager User Guide, and the behavior is consistent with the document.
What happens if the alerts are asserted on the voltage sensors? I observed a shutdown and power-on of the 12 VDC supply controlled by the CERN-IPMC.

Best,
Shota

Hi Shota,

Thanks for your message. The default cooling algorithm of the PigeonPoint ShelfManager is indeed described in the User Guide and works on thermal sensors and their corresponding thresholds.
I guess that adding actions on other sensors and their thresholds, e.g. voltages and currents, requires defining Platform Event Filters (PEF) - but I am not an expert in this at all. Unfortunately, the experts are on holiday this week. Do you think your question could wait until next week?

Cheers,
Ralf.

Hi Ralf,

Thanks for the prompt reply. I understand the case of the cooling and the situation for other sensors. Since the question is not urgent, I can wait. I will look into the PEF and keep watching this thread.

Thanks,
Shota

Hi Shota,

With my colleague Markus, we have been looking at the Platform Event Filter mechanism, which indeed provides the possibility to take action (like shutting down) when a measured sensor value goes beyond one of the thresholds. The mechanism needs some getting used to and we have been in contact with PigeonPoint. I think that Markus will soon be able to provide you with some instructions on how to set up the PEF in a practical case.

However, the PEF runs in software on the shelf manager and can take up to two seconds before shutting down the board. So, if you are concerned about the safety of your board, there may be another, faster mechanism using monitoring code in the IPMC firmware, which can deactivate the board on power failures with HS cause 0Ah (Surprise State Change due to Power Failure). We are gathering information on this mechanism, which might take some time.

Cheers,
Ralf

Hi Ralf,
Thank you very much for the follow-up information. I understand the situation and will wait for further details.

I agree with your point about the speed of the PEF response. With the safety mechanism in mind, I will avoid relying too heavily on the PEF.

Best regards,
Shota

Dear Shota,

Thanks for your reply.

I think that protection of equipment should not rely on a single measure, but rather follow a layered approach. In that sense, the PEF mechanism still makes sense and should be used in addition to other mechanisms with a faster reaction time.

We are currently working on a) instructions for setting up a PEF in case of overcurrent and/or overvoltage, and b) instructions on how to use the “surprise change of state” in the IPMC. We will update you when we have them ready.

Cheers,
Ralf