Thanks for getting back to us. I was expecting this suggestion. That’s why I was so reluctant in the past to fork the repository when I was instructed to. A quick look at the situation now:
git ls-files | wc -l tells me that I have 633 files in my repository
git diff --name-only HEAD..upstream/master | wc -l tells me that 602 files differ between my fork and ipmc-dev.
I certainly did not change so many files. What happened to the upstream repository? Was it fully replaced?
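One way to narrow this down: the two-dot form used above compares the two branch tips directly, so it counts everything that differs on either side. The three-dot form only lists what changed on one side since the branches diverged. Below is a minimal sketch, assuming the remote is called upstream and the branch master as in the command above; the helper names are made up for illustration.

```python
import subprocess


def diff_names_cmd(base, tip, since_merge_base=True):
    """Build a `git diff --name-only` command line.

    since_merge_base=True uses the three-dot form (base...tip), which lists
    files changed on `tip` since it diverged from `base`.
    since_merge_base=False uses the two-dot form (base..tip), which simply
    compares the two tips.
    """
    sep = "..." if since_merge_base else ".."
    return ["git", "diff", "--name-only", f"{base}{sep}{tip}"]


def have_common_ancestor(a, b):
    """Return True if `git merge-base a b` finds a shared commit."""
    result = subprocess.run(
        ["git", "merge-base", a, b], capture_output=True, text=True
    )
    return result.returncode == 0


def count_changed(base, tip, since_merge_base=True):
    """Count the files reported by the diff command built above."""
    out = subprocess.run(
        diff_names_cmd(base, tip, since_merge_base),
        capture_output=True, text=True, check=True,
    ).stdout
    return len(out.splitlines())
```

Run inside the clone, count_changed("upstream/master", "HEAD") would count only the files changed on the fork side. If have_common_ancestor("HEAD", "upstream/master") returns False, the two histories share no commit, which would be consistent with the upstream repository having been replaced; in that case the three-dot diff fails because there is no merge base.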
I am having trouble getting started with the current IPMC code. I tried many different approaches without luck. In the end, I ran the simplest test I could think of, but even that seems to crash for me.
Here is what I did:
got a fresh clone of the ipmc-project repository
removed the file ipmc-user/user_mainfile.c
edited the config.xml file to be similar to what I will need (very basic changes; I left all the details and sensors out, please see attached)
executed compile.py, which compiled without errors and produced some files, including hpm1all.img, which I use to program the IPMC
programmed our IPMC with that file and activated it
the IPMC immediately reverted to the firmware that was previously in use, which suggests to me that the code is segfaulting or otherwise crashing.
I am probably missing something obvious. I would appreciate some feedback.
Thanks
PS: Well, it seems I cannot attach anything here. I will just add the config.xml file below.
<?xml version="1.0" encoding="UTF-8"?>
<IPMC>
<GeneralConfig>
<DeviceID>0x12</DeviceID>
<DeviceRevision>0x00</DeviceRevision>
<ManufacturerID>0x000060</ManufacturerID>
<ProductID>0x1236</ProductID>
<ManufacturingDate>06/01/2017</ManufacturingDate>
<BoardManuf>Cirly/Addax</BoardManuf>
<BoardName>TEST_FRUFROLLBACK</BoardName>
<BoardSN>00001</BoardSN>
<BoardPN>P580050995</BoardPN>
<ProductManuf>CERN</ProductManuf>
<ProductName>IPMC-TestPAD</ProductName>
<ProductPN>PN00001</ProductPN>
<ProductSN>0000001</ProductSN>
<ProductVersion type="major">1</ProductVersion>
<ProductVersion type="minor">20</ProductVersion>
<MaxCurrent>30.0</MaxCurrent>
<MaxInternalCurrent>2.0</MaxInternalCurrent>
<!-- Hardware -->
<HandleSwitch active="LOW" inactive="HIGH" />
<!-- <ResetOnWrongHAEn /> -->
<!-- <PowerMonitoringEn /> -->
<!-- <AlertMonitoringEn />-->
<!-- Shutdown timeout in tens of ms (optional - if not defined: 10s) -->
<shutdownTimeout>0</shutdownTimeout>
<nonVolatileParams forced="false" />
</GeneralConfig>
<SerialInterfaces>
<!--
This part allows connecting the UART port to interfaces.
The ports 0 to 2 are linked to the hardware:
port 0: Edge connector (Tx: 57 / Rx: 60)
port 1: Edge connector (Tx: 58 / Rx: 61)
port 2: Optional UART (Tx: 75 / Rx: 76)
Warning: Enabling port 2 will automatically set the GPIOs in UART mode!
For each port, the following names can be used:
"SOL": Serial Over LAN
"SDI": Serial Debug Interface
"PI": Payload Interface
The baudrate can be set using the baudrate param. By default,
it is configured to 115200b/s.
-->
<Connect port="0" name="SDI" baudrate="115200"/>
<!-- <Connect port="1" name="PI" baudrate="115200"/> -->
<!-- <Connect port="2" name="SOL" baudrate="115200" extended="true" /> -->
<RedirectSDItoSOL/>
</SerialInterfaces>
<PowerManagement>
<PowerONSeq>
<step>PSQ_ENABLE_SIGNAL(CFG_PAYLOAD_DCDC_EN_SIGNAL)</step>
<step>PSQ_END</step>
</PowerONSeq>
<PowerOFFSeq>
<step>PSQ_DISABLE_SIGNAL(CFG_PAYLOAD_DCDC_EN_SIGNAL)</step>
<step>PSQ_END</step>
</PowerOFFSeq>
</PowerManagement>
<LANConfig>
<MACAddr>0A:0A:0A:0A:0A:86</MACAddr>
<NetMask>255.255.255.0</NetMask>
<GatewayIP>192.138.1.3</GatewayIP>
<UseFlashedMAC />
<EnableDHCP />
<IPAddrList> <!-- Default IP Addresses (used if DHCP is not active) -->
<IPAddr slot_addr="default">192.168.1.34</IPAddr>
<IPAddr slot_addr="0x41">192.168.1.20</IPAddr>
<IPAddr slot_addr="0x42">192.168.1.21</IPAddr>
<IPAddr slot_addr="0x43">192.168.1.22</IPAddr>
<IPAddr slot_addr="0x44">192.168.1.23</IPAddr>
<IPAddr slot_addr="0x45">192.168.1.24</IPAddr>
<IPAddr slot_addr="0x46">192.168.1.25</IPAddr>
<IPAddr slot_addr="0x47">192.168.1.26</IPAddr>
<IPAddr slot_addr="0x48">192.168.1.27</IPAddr>
<IPAddr slot_addr="0x49">192.168.1.28</IPAddr>
<IPAddr slot_addr="0x4a">192.168.1.29</IPAddr>
<IPAddr slot_addr="0x4b">192.168.1.30</IPAddr>
<IPAddr slot_addr="0x4c">192.168.1.31</IPAddr>
<IPAddr slot_addr="0x4d">192.168.1.32</IPAddr>
<IPAddr slot_addr="0x4e">192.168.1.33</IPAddr>
<IPAddr slot_addr="0x4f">192.168.1.34</IPAddr>
<IPAddr slot_addr="0x50">192.168.1.35</IPAddr>
</IPAddrList>
</LANConfig>
<AMCSlots>
<AMC site="1">
<PhysicalPort>1</PhysicalPort>
<MaxCurrent>6.0</MaxCurrent>
<PowerGoodTimeout>300</PowerGoodTimeout>
<DCDCEfficiency>85</DCDCEfficiency>
</AMC>
<AMC site="2">
<PhysicalPort>2</PhysicalPort>
<MaxCurrent>6.0</MaxCurrent>
<PowerGoodTimeout>300</PowerGoodTimeout>
<DCDCEfficiency>85</DCDCEfficiency>
</AMC>
<AMC site="3">
<PhysicalPort>3</PhysicalPort>
<MaxCurrent>6.0</MaxCurrent>
<PowerGoodTimeout>300</PowerGoodTimeout>
<DCDCEfficiency>85</DCDCEfficiency>
</AMC>
</AMCSlots>
<SensorList>
<Sensors type="raw" global_define="CFG_SENSOR_MCP9801" function_name="SENSOR_MCP9801" rawType="MCP9801">
<Sensor>
<Name>Internal temp.</Name>
<Type>Temperature</Type>
<Units>degrees C</Units>
<NominalReading>25</NominalReading>
<NormalMaximum>60</NormalMaximum>
<NormalMinimum>-10</NormalMinimum>
<Point id="0" x="0" y="0" />
<Point id="1" x="5" y="5" />
<Thresholds>
<UpperNonRecovery>80</UpperNonRecovery>
<UpperCritical>60</UpperCritical>
<UpperNonCritical>40</UpperNonCritical>
<LowerNonRecovery>-20</LowerNonRecovery>
<LowerCritical>-10</LowerCritical>
<LowerNonCritical>0</LowerNonCritical>
</Thresholds>
<Params>
<p type="record_id"></p> <!-- mandatory -->
<p type="user">0x090</p>
<p type="user">UCGH | LCGL</p>
</Params>
<AssertEvMask>0x0A80</AssertEvMask>
<DeassertEvMask>0x7A80</DeassertEvMask>
<DiscreteRdMask>0x3838</DiscreteRdMask>
<AnalogDataFmt>2S_COMPL</AnalogDataFmt>
<PosHysteresis>2</PosHysteresis>
<NegHysteresis>2</NegHysteresis>
<MaxReading>127</MaxReading>
<MinReading>-128</MinReading>
</Sensor>
</Sensors>
<!-- Example for GPIO sensors:
<Sensors type="raw" global_define="CFG_SENSOR_GPIO " function_name="SENSOR_GPIO" rawType="GPIOSENSOR">
<Sensor>
<Name>GPIOSens Ex.</Name>
<Type>Processor</Type>
<Params>
<p type="record_id"></p>
<p type="user">0x1</p>
<p type="user">POWER_GOOD_12V</p>
</Params>
<DiscreteRdMask>0x000F</DiscreteRdMask>
</Sensor>
</Sensors>
-->
<!-- Example for payload sensors:
<Sensors type="raw" global_define="CFG_SENSOR_PAYLOAD_THRESHOLD" function_name="SENSOR_PAYLOAD_THRESHOLD" rawType="PAYLOADSENSOR_THRESH">
<Sensor>
<Name>PayloadSens Ex.</Name>
<Type>Temperature</Type>
<Units>degrees C</Units>
<NominalReading>25</NominalReading>
<NormalMaximum>60</NormalMaximum>
<NormalMinimum>-10</NormalMinimum>
<Point id="0" x="0" y="0" />
<Point id="1" x="5" y="5" />
<Thresholds>
<UpperNonRecovery>80</UpperNonRecovery>
<UpperCritical>60</UpperCritical>
<UpperNonCritical>40</UpperNonCritical>
<LowerNonRecovery>-20</LowerNonRecovery>
<LowerCritical>-10</LowerCritical>
<LowerNonCritical>0</LowerNonCritical>
</Thresholds>
<Params>
<p type="record_id"></p>
</Params>
<AssertEvMask>0x0A80</AssertEvMask>
<DeassertEvMask>0x7A80</DeassertEvMask>
<DiscreteRdMask>0x3838</DiscreteRdMask>
<AnalogDataFmt>2S_COMPL</AnalogDataFmt>
<PosHysteresis>2</PosHysteresis>
<NegHysteresis>2</NegHysteresis>
<MaxReading>127</MaxReading>
<MinReading>-128</MinReading>
</Sensor>
</Sensors>
-->
</SensorList>
</IPMC>
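A config like this can be sanity-checked before compiling. The sketch below is not part of the IPMC tooling; it only uses the Python standard library and the element names visible in the file above (LANConfig, NetMask, GatewayIP, IPAddr), and the function name is made up. It assumes NetMask and GatewayIP are both present, and reports every default IP address whose subnet does not contain the gateway.

```python
import ipaddress
import xml.etree.ElementTree as ET


def check_lan_config(xml_text):
    """Return a list of warnings for the <LANConfig> block of a config.xml."""
    root = ET.fromstring(xml_text)
    lan = root.find("LANConfig")
    if lan is None:
        return ["no <LANConfig> element found"]

    netmask = lan.findtext("NetMask", default="").strip()
    gateway = ipaddress.ip_address(lan.findtext("GatewayIP", default="").strip())

    warnings = []
    for node in lan.iter("IPAddr"):
        addr = node.text.strip()
        # strict=False lets us pass a host address together with the netmask.
        network = ipaddress.ip_network(f"{addr}/{netmask}", strict=False)
        if gateway not in network:
            warnings.append(
                f"slot {node.get('slot_addr')}: gateway {gateway} is outside {network}"
            )
    return warnings


# Reduced sample using values from the config above.
sample = """
<IPMC>
  <LANConfig>
    <NetMask>255.255.255.0</NetMask>
    <GatewayIP>192.138.1.3</GatewayIP>
    <IPAddrList>
      <IPAddr slot_addr="default">192.168.1.34</IPAddr>
    </IPAddrList>
  </LANConfig>
</IPMC>
"""

for warning in check_lan_config(sample):
    print(warning)
```

With the values above, the check flags the gateway 192.138.1.3 as outside 192.168.1.0/24. That may or may not be intentional, and it is not necessarily related to the rollback.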
Sorry for the delay, but I am just coming back from vacation. As Ralf told you, some problems were found on the I2C bus with version 1.2 and are fixed in version 1.3. As the command might send a lot of data over the I2C bus, it could be the source of your issue.
Concerning your problem with going to 1.3, it is really weird, I’ve just tried again to compile and force and it looks ok. However, it could be because of a version checking issue that the rollback is automatically performed. Could you check what is printed on the serial debug interface when you activate the new version?
Thanks for answering promptly after your return. I hope you enjoyed your time off.
“Concerning your problem with going to 1.3, it is really weird, I’ve just tried again to compile and force and it looks ok.”
Just to confirm, have you used the same config.xml I sent above?
“However, it could be because of a version checking issue that the rollback is automatically performed. Could you check what is printed on the serial debug interface when you activate the new version?”
I am afraid I cannot get this. The IPMC serial is connected to a Zynq device on the same blade, and I would only have access to the serial messages after the Zynq itself finishes booting. Can you explain a bit more about this version checking issue?
Also, these IPMCs are an old revision. Could this be the problem? Can you try with a similar one as well?
The XML looks good and you should not have any problem using your binary file. When you move from 1.2 to 1.3, the firmware tries to recover the “non-volatile parameters”. However, the parameters changed between 1.2 and 1.3, so at the first boot the activate function cannot convert all of them and automatically performs a rollback. The upgrade can nevertheless be forced by adding the “norollback” instruction when you activate the new firmware.
Connecting to the debug interface gives additional details that would confirm that the rollback is triggered by the non-volatile parameter issue. But if you flash with norollback and there really is a problem with the firmware, you could end up having to flash your IPMC again via JTAG.
Do you have a JTAG cable that you could use, or even a Raspberry Pi, which we now support for resetting the system? If you have a way to flash the module back, you can try adding the “norollback” instruction to the activate command.
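To make the rollback behaviour described above concrete, here is a purely illustrative model in Python. It is not the firmware code: the function, its arguments and the result type are all hypothetical, and it only encodes what is stated in this post (a failed parameter conversion triggers a rollback unless norollback is given).

```python
from dataclasses import dataclass


@dataclass
class ActivationResult:
    running_version: str
    rolled_back: bool


def activate(old_version, new_version, params_convertible, norollback=False):
    """Hypothetical model of the first boot after activating new firmware.

    params_convertible: whether the stored non-volatile parameters can be
    converted to the layout expected by the new version.
    norollback: whether the activation was forced.
    """
    if params_convertible or norollback:
        # The new firmware keeps running. With norollback there is no
        # automatic fallback if this firmware turns out to be broken.
        return ActivationResult(new_version, rolled_back=False)
    # Conversion failed and nothing forces the upgrade: revert.
    return ActivationResult(old_version, rolled_back=True)


print(activate("1.2", "1.3", params_convertible=False))
print(activate("1.2", "1.3", params_convertible=False, norollback=True))
```

The second call is the case where a recovery path (JTAG or the Raspberry Pi setup) matters, because the module no longer falls back to 1.2 on its own.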
I bought the JTAG cable some time ago because I thought we could end up in this kind of situation, but I never actually used it. Do you have instructions on how to use it in the CERN IPMC context? (A Linux solution, please; I don't have any computers with other operating systems around.)
Unfortunately, Pigeon Point only provides a solution for Windows, and I am not aware of an equivalent for Linux. But maybe, using the STAPL file, there is a way to run it on Linux.
Thanks, Julian. Can you add this somewhere in the documentation? This way people are able to get the correct items from the start. I will try to get my hands on an RPi then, but it will stall the debug process for some more days.
That is not what I meant. I think it would be helpful for developers to know what they need in order to recover an IPMC. In my case, I just learned that I need an RPi for that. If I had had this information earlier, I would have obtained one beforehand in case of unexpected results.
I was able to boot the RPI with the image you provided, and I connected the RPI to the IPMC development bed following the pin mapping suggested. However, I am getting the following error:
BTW, the development kit powered up as it should. I will try to debug the setup further, but I was wondering if you have any suggestions since we are in a rush.
One more question: I assume that it will flash the IPMC with some default firmware, right? So in case of trouble, the procedure is: remove the IPMC from the blade, install it in the development kit, flash this default firmware, and then return it to the blade to reprogram it as usual. Does this sound correct?
I was able to get the recovery setup to work (informing here so you don’t need to spend time on this). I was also able to recover an IPMC that was bricked more than 2 years ago by a norollback that failed (no idea why).
Now that I have a way to recover the cards in case of problems, I will proceed with using the norollback option you suggested above for the current issue.
I’m also from the NSW and I’m trying to look at the issue since we’re getting closer and closer to running and Thiago is quite busy with other work, too.
However, I’m quite new to IPMC. How should we update the IPMC software? When I run this command, I get:
ipmitool -H 192.168.0.2 -P "" -m 0x20 -t 0x72 -T 0x82 -b 7 -B 0 mc info
Device ID                 : 1
Device Revision           : 0
Firmware Revision         : 1.07
IPMI Version              : 1.5
Manufacturer ID           : 28688
Manufacturer Name         : Unknown (0x7010)
Product ID                : 528 (0x0210)
Product Name              : Unknown (0x210)
Device Available          : yes
Provides Device SDRs      : yes
Additional Device Support :
    Sensor Device
    FRU Inventory Device
    IPMB Event Generator
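For checking which firmware revision a board reports before and after an update, output like the above can be turned into a dictionary. This is only a sketch: the function name is made up, and it assumes the "key : value" layout shown here, with indented lines after an empty value treated as list items.

```python
def parse_mc_info(text):
    """Parse `ipmitool ... mc info` output into a dict.

    Lines of the form "Key : value" become string entries. A key with an
    empty value starts a list that collects the following plain lines.
    """
    info = {}
    current_list = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ":" in line:
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                info[key] = value
                current_list = None
            else:
                current_list = info.setdefault(key, [])
        elif current_list is not None:
            current_list.append(line)
    return info


SAMPLE = """\
Device ID : 1
Device Revision : 0
Firmware Revision : 1.07
IPMI Version : 1.5
Manufacturer ID : 28688
Manufacturer Name : Unknown (0x7010)
Product ID : 528 (0x0210)
Product Name : Unknown (0x210)
Device Available : yes
Provides Device SDRs : yes
Additional Device Support :
Sensor Device
FRU Inventory Device
IPMB Event Generator
"""

info = parse_mc_info(SAMPLE)
print(info["Firmware Revision"])
print(info["Additional Device Support"])
```

The same function could be fed the captured standard output of the ipmitool command above, for example to compare the reported revision before and after an upgrade.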