Management of non-intelligent RTM

YES IT WORKS NOW !!!

Hello Sylvain,

finally! Really glad that it works.

So if you update the firmware you get the correct compilation date? Do you also manage to bring up the RTM?

cheers,

Stefan

Yes I get the expected compilation date :slight_smile: so my code is definitely running in situ now, real work can begin !

When changing some of the DevKit AMC 8 states I now see corresponding logs on the serial console. So this is still an ongoing process on my side but I get reactions to my stimuli.

Will keep you updated along the way!

Thank you again for the tedious debugging…

Hello Sylvain,

that’s good to hear. BTW, I made a small python script that emulates the RTM insertion/extraction on the IPMC devkit in order to be able to test the IPMC firmware with RTM support:

https://cernbox.cern.ch/s/tIQ3RQzL2rDsAjn

This could be easily extended. But maybe you already have the real RTM now?

cheers,

Stefan

Hello Stefan,

Thank you, especially as the same procedure does not work with my devkit_ctrl-based script (it systematically times out).

Best regards,
Sylvain

Dear Stefan,

I am testing RTM handling with the DevKit, and I consistently run into this error in the IPMC log, probably because my regression bash script is not fast enough to set MP_good:

RTM: ps=1 mp_en=0 mp_good=0 state=0 dead=0
RTM: ps=1 mp_en=1 mp_good=0 state=0 dead=0
<E>: RTM timeout waiting for MP good
RTM: ps=1 mp_en=0 mp_good=0 state=0 dead=1

So I tried to change this timeout in the XML configuration file, to no avail:

<NonIntelligentRTM>
    <AMCPort>8</AMCPort>
    <Power>10.0</Power>
    <HandleSwitch>USER_IO_2_ACTL</HandleSwitch>
    <BlueLed>USER_IO_27_ACTH</BlueLed>
    <PowerGoodTimeout>3000</PowerGoodTimeout>
</NonIntelligentRTM>

How am I supposed to address this please ?

Best regards,

Sylvain

Hello Sylvain,

the RTM management power-good timeout value cannot be changed via the XML configuration. Can you please try adding the following value to your user_defs.h file:

#define CFG_RTM_MP_GOOD_TIMEOUT 1000

This will increase the timeout to 1 second (the value is in milliseconds); the default value is 150.

cheers,

Stefan
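
The failure sequence in the log above (mp_en goes to 1, mp_good stays 0, then dead=1) is what a bounded wait would produce. The following is a minimal model in plain C, not the IPMC core code: `wait_mp_good`, `mp_good_at_ms` and the polling step are hypothetical names introduced for illustration. Only `CFG_RTM_MP_GOOD_TIMEOUT` and the values 150 and 1000 come from this thread, and the millisecond unit is inferred from "1000 = 1 second".

```c
#include <stdbool.h>

/* Value suggested in this thread; the firmware default is 150. */
#ifndef CFG_RTM_MP_GOOD_TIMEOUT
#define CFG_RTM_MP_GOOD_TIMEOUT 1000
#endif

/* Hypothetical model, not the IPMC implementation: after MP enable is
 * asserted, poll every step_ms until MP good is seen or timeout_ms elapses.
 * mp_good_at_ms is the simulated instant at which the test script (or the
 * hardware) raises MP good. Returns true if MP good arrived in time.
 * step_ms must be greater than zero. */
static bool wait_mp_good(unsigned timeout_ms, unsigned step_ms,
                         unsigned mp_good_at_ms)
{
    unsigned elapsed;

    for (elapsed = 0; elapsed <= timeout_ms; elapsed += step_ms) {
        if (elapsed >= mp_good_at_ms)
            return true;    /* MP good seen: activation can proceed */
    }
    return false;           /* timeout: the RTM would be flagged dead=1 */
}
```

In this model a script that needs 400 ms to set MP good fails against a 150 ms limit and passes against 1000 ms, which matches the behavior reported in the thread.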

Thank you Stefan it worked.

Best regards,

Sylvain

Hi Stefan,

To further debug our prototyping board (suspicions of bad contact on the RTM presence connector), I would like to show the value of all AMC8 pins every 100ms. I already do this for some USER_IO:

/* TIMER_CALLBACK(time, fctname) is called every time the timer exceeds the time value */
signal_t userio_read_sigs[] = { USER_IO_2, USER_IO_27, USER_IO_28, USER_IO_29 };
TIMER_CALLBACK(100ms, usermain_timercback) {
    unsigned int i;
    debug_printf("USER_IO = [");
    /* Read and print each listed pin in turn */
    for (i = 0; i < (sizeof(userio_read_sigs) / sizeof(signal_t)); i++) {
        signal_t userio_read_sig = userio_read_sigs[i];
        const char pinValue = signal_read(&userio_read_sig);
        debug_printf(" %d ", pinValue);
    }
    debug_printf("].\n\r");
}

What are the MACRO names of those signal_t pins please ?

Or how can I access them ?

Best regards,

Sylvain

Hello Sylvain,

you can use the macros below in the same way you use the USER_IO_xx signals:

CFG_RTM_PRESENCE_SIGNAL
CFG_RTM_MP_ENABLE_SIGNAL
CFG_RTM_MP_GOOD_SIGNAL
CFG_RTM_I2C_BUFFER_ENABLE_SIGNAL
CFG_RTM_PWR_ENABLE_SIGNAL
CFG_RTM_PWR_GOOD_SIGNAL

cheers,

Stefan

Hello Stefan,

Thank you for your answer. But as soon as I try to output any one of those pin values, the firmware goes wild and becomes unusable:

RTM: ps=1 mp_en=0 mp_good=1 state=2 dead=0
/HS=1 BL=0 RL=1 GL=1 : I2C dev read error, I2C address: 0046
: cannot read signal 0B i2c_addr 0046
MPg=0
: I2C dev read error, I2C address: 0046
: cannot read signal 0B i2c_addr 0046
RTM: ps=1 mp_en=0 mp_good=0 state=2 dead=0
: MP good failed
<_>: FRU 2 state: M2->M0, cause = A
soft_i2c_io: I2C bus 3 is stuck
: EEPROM read error, I2C bus: 03, I2C address: A2, data address: 02CA
RTM: ps=1 mp_en=0 mp_good=1 state=0 dead=1
<_>: RTM is inserted
<_>: FRU 2 state: M0->M1, cause = 0
RTM: ps=1 mp_en=0 mp_good=1 state=1 dead=1
/HS=1 BL=0 RL=1 GL=1 MPg=1
<_>: FRU 2 state: M1->M2, cause = 2
RTM: ps=1 mp_en=0 mp_good=1 state=2 dead=1
/HS=1 BL=0 RL=1 GL=1 MPg=1
/HS=1 BL=0 RL=1 GL=1 MPg=1
/HS=1 BL=0 RL=1 GL=1 MPg=1
: I2C dev read error, I2C address: 0046
: cannot read signal 06 i2c_addr 0046
RTM: ps=0 mp_en=0 mp_good=1 state=2 dead=1
<_>: FRU 2 state: M2->M0, cause = A
RTM: ps=1 mp_en=0 mp_good=1 state=0 dead=0
<_>: RTM is inserted
<_>: FRU 2 state: M0->M1, cause = 0
RTM: ps=1 mp_en=0 mp_good=1 state=1 dead=0
/HS=1 BL=0 RL=1 GL=1 MPg=1
: I2C dev read error, I2C address: 0046
: cannot read signal 06 i2c_addr 0046
RTM: ps=0 mp_en=0 mp_good=1 state=1 dead=0
<_>: FRU 2 state: M1->M0, cause = A
RTM: ps=1 mp_en=0 mp_good=1 state=0 dead=0
<_>: RTM is inserted
<_>: FRU 2 state: M0->M1, cause = 0
RTM: ps=1 mp_en=0 mp_good=1 state=1 dead=0
/HS=1 BL=0 RL=1 GL=1 MPg=1
<_>: FRU 2 state: M1->M2, cause = 2
RTM: ps=1 mp_en=0 mp_good=1 state=2 dead=0

What do you think ?

Best regards,

Sylvain

Hello Stefan,

Furthermore, we looked at the “3V3 enable” signal with a scope on the real board, and this signal remains high and never triggers our HotSwap controller.

Whereas both the firmware log and the shelf-manager assert that FRU 2 is in M4 state as expected:

<_>: FRU 2 state: M3->M4, cause = 0

RTM: ps=1 mp_en=0 mp_good=1 state=4 dead=0

RTM Hot Swap | 01h | ok | 192.96 | Transition to M4

All those signals (MPen, MPgood, PPen, PPgood) should be active low for our design.

Can you please confirm that this is indeed the case in the PigeonPoint code?

Best regards,

Sylvain

Hello Sylvain,

actually those signals are active-high by default; please check the hardware guide, page 7:

https://cernbox.cern.ch/pdf-viewer/eos/project/c/cern-ipmc-support/public/CERN-IPMC%20-%20hardware%20guide.pdf

Only the presence detect (PS1#) and Enable# (for AMCs) signals are active low. For the AMCs and the iRTM, the polarity can be changed through the XML, see here:

https://gitlab.cern.ch/ep-ese-be-xtca/ipmc-project/-/blob/master/README.md#amc-slots

I can add the same for the non-intelligent RTM, that way you can choose the signal polarity you need.

cheers,

Stefan

Hello Sylvain,

I have added the option to invert those pins (making them active low). Could you please add the following lines to your XML in the NonIntelligentRTM section:

<Invert>
  <Pin>MPEN</Pin>
  <Pin>MPGOOD</Pin>
  <Pin>PWREN</Pin>
  <Pin>PWRGOOD</Pin>
</Invert>

Please let me know if this helps.

cheers,

Stefan
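
The polarity question above comes down to how a logical "asserted" state maps to an electrical level. A minimal sketch in plain C, assuming nothing about the IPMC internals: `pin_level` and `pin_asserted` are hypothetical helpers written for illustration, and `inverted` stands for a pin being listed under `<Invert>`.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of the IPMC API: convert a logical
 * "asserted" state into the electrical pin level.
 * inverted == false: active high (the default described above).
 * inverted == true : active low (pin listed under <Invert>). */
static bool pin_level(bool asserted, bool inverted)
{
    return asserted != inverted;    /* logical XOR */
}

/* Reverse direction: interpret a sampled pin level as a logical state. */
static bool pin_asserted(bool level, bool inverted)
{
    return level != inverted;
}
```

For an active-low design, `inverted == true` means that asserting an enable drives the pin low, and a low level on a "good" input is read as asserted.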

Thank you very much it works now !

Thank you again for all your help !!

There is one remaining issue with the RTM LEDs though: the blue LED is controlled as expected (light on M<4), but the RED and GREEN ones are always OFF.

In user_defs.h we have, as recommended:

// RTM LEDs
#define CFG_RTM_FRU_LED_COUNT 2 /* excluding the blue LED */
#define CFG_RTM_FRU_LED_COLORS LED_CAP_RED, LED_CAP_GREEN
#define CFG_RTM_FRU_LED_DEFAULT_COLORS LED_LC_RED | LED_OVR_RED | LED_LC_GREEN | LED_OVR_GREEN
#define CFG_RTM_FRU_LED_SIGNALS { \
{ USER_IO_28_ACTH, USER_IO_29_ACTH } \
}

I also tried USER_IO_28_ACTL and USER_IO_29_ACTL, which gives me an always ‘on’ GREEN LED and an always ‘off’ RED LED.

I also verified that they otherwise both work fine by successfully controlling each one in a dedicated TIMER_CALLBACK(1s, usermain_timercback) debugging code.

Would you have any advice on this please ?

Best regards,

Sylvain

Hello Sylvain,

I believe that having the LEDs off by default is the expected behavior; only the blue LED is automatically used to indicate the hot-swap state. The other LEDs can be controlled by IPMI commands or locally, see also the following thread:

https://cern-ipmc-forum.web.cern.ch/t/cern-ipmc-front-panel-leds-allow-for-led-control/111/11

However mainfru should be replaced with rtmfru here. Please also note that I have not tested this, given that we do not have a board with an RTM. You could also use the additional LEDs in the RTM algorithmic power sequencing as you already do for the front-board.

Two of the LED-related defines are wrong, however. Can you please use the following instead:

#define CFG_RTM_FRU_LED_DEFAULT_COLORS \
LED_LC_RED | LED_OVR_RED, \
LED_LC_GREEN | LED_OVR_GREEN
#define CFG_RTM_FRU_LED_SIGNALS \
{ { [LED_SIGNAL_RED] = USER_IO_28_ACTL, } }, \
{ { [LED_SIGNAL_GREEN] = USER_IO_29_ACTL, } }

However I’m not sure if this will make any difference to the behavior.

cheers,

Stefan
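
One way to see the difference between the two versions of `CFG_RTM_FRU_LED_DEFAULT_COLORS`: a single OR-ed expression yields one value, whereas a comma-separated list yields one value per LED. The sketch below is plain C with placeholder bit values; the real `LED_*` constants come from the firmware headers, and the arrays `colors_one` and `colors_per_led` are hypothetical consumers, on the assumption that the firmware places the macro in an array initializer in a similar way.

```c
#include <stddef.h>

/* Placeholder bit values for illustration only; the real LED_* constants
 * are defined by the IPMC firmware headers. */
#define LED_LC_RED     0x01
#define LED_OVR_RED    0x02
#define LED_LC_GREEN   0x04
#define LED_OVR_GREEN  0x08

/* Single OR-ed expression: expands to ONE array element. */
#define COLORS_ONE_ENTRY \
    LED_LC_RED | LED_OVR_RED | LED_LC_GREEN | LED_OVR_GREEN

/* Comma-separated list: expands to TWO elements, one per LED. */
#define COLORS_PER_LED \
    LED_LC_RED | LED_OVR_RED, \
    LED_LC_GREEN | LED_OVR_GREEN

/* Hypothetical consumers of the two macros. */
static const unsigned char colors_one[] = { COLORS_ONE_ENTRY };
static const unsigned char colors_per_led[] = { COLORS_PER_LED };

static size_t count_one(void)
{
    return sizeof colors_one / sizeof colors_one[0];
}

static size_t count_per_led(void)
{
    return sizeof colors_per_led / sizeof colors_per_led[0];
}
```

With `CFG_RTM_FRU_LED_COUNT` set to 2, only the comma-separated form supplies an entry for each of the two LEDs.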

Dear Stefan,

Thank you, I can now properly control my 3 RTM LEDs remotely:

ipmitool picmg led set …

But what I am looking for is:

  • the RTM GREEN LED turns ON (respectively OFF) when the RTM FRU is in M4 (respectively M1);
  • the RTM RED LED turns ON (respectively OFF) when there is an RTM FRU powering failure (respectively success).

By writing “RTM algorithmic power sequencing as you already do for the front-board” in your previous message, do you mean there is a possibility to describe this sequence in XML as for the main FRU ? What are the hooks that could get this working please ?

Best regards,

Sylvain

Hello Sylvain,

I have now added support in the XML for custom power-on/off sequences for the RTM. This works in the same way as for the front board, you just need to define the and entries under the key. Please let me know if this works for you.

Concerning the green LED, this may have to be implemented in the IPMC core code. Alternatively, it may be possible to do something with callback functions in the user code, but I need to look into this.

cheers,

Stefan
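
The LED behavior requested earlier in the thread can be written down as a small decision function, independent of where it ends up being called from (core code or a user callback). This is a sketch only: `fru_state_t`, `rtm_leds_t` and `rtm_led_policy` are hypothetical names, not IPMC firmware types or hooks.

```c
#include <stdbool.h>

/* Hypothetical types: the IPMC firmware has its own hot-swap state and LED
 * definitions; these only illustrate the requested policy. */
typedef enum {
    FRU_M0, FRU_M1, FRU_M2, FRU_M3, FRU_M4, FRU_M5, FRU_M6, FRU_M7
} fru_state_t;

typedef struct {
    bool green_on;
    bool red_on;
} rtm_leds_t;

/* Green follows "RTM FRU is in M4"; red follows "RTM power-up failed". */
static rtm_leds_t rtm_led_policy(fru_state_t state, bool power_failed)
{
    rtm_leds_t leds;

    leds.green_on = (state == FRU_M4);
    leds.red_on = power_failed;
    return leds;
}
```

Whatever hook eventually reports hot-swap state changes and power failures would then only need to apply the returned values to the two LED signals.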

Hello Stefan,

I tried to implement this, but it is not clear enough to me (some words are missing in your previous answer). Moreover, whether I use correct or obviously wrong syntax, my project always compiles, which is definitely not what is expected.

Could you please give a complete, working example in Files · master · ep-ese-be-xtca / ipmc-project · GitLab ?

Best regards,

Sylvain