testConnectivity OH Masking Issue #292

bregnery · 2020-01-23T10:00:01Z

It seems that the OH masks are not working properly in testConnectivity.py.

Say, for example, that we want to run testConnectivity with a GBT phase scan on (1,11,5), so we run testConnectivity with OH mask 0x20. While it runs, the current values of all the other OHs on the same CTP7 start jumping around. Afterwards, testConnectivity fails on all of the other chambers on that CTP7, and to test another one, that chamber must be power cycled in order to recover it.

Types of issue

Bug report (report an issue with the code)
Feature request (request for change which adds functionality)

Expected Behavior

We should be able to run testConnectivity one chamber at a time without affecting current/communication with masked chambers.

Current Behavior

Running testConnectivity on one chamber disrupts communication with an unprogrammed chamber.

Steps to Reproduce (for bugs)

Power on all chambers on a CTP7
Run testConnectivity with a GBT phase scan for one chamber (for example OH5, mask 0x20)
Observe the current value of other OHs on that CTP7
Run testConnectivity on another chamber on that same CTP7 (for example OH1); this fails, although it should not.
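
For reference, the OH mask used in the steps above is a bitmask in which bit N selects OH N, so 0x20 selects OH5 and 0x2 selects OH1. A minimal sketch of that mapping follows; the helper names and the limit of 12 OHs per CTP7 are assumptions made for illustration, not part of gempython_vfatqc:

```python
# Illustrative helpers only; these names are hypothetical and do not
# exist in gempython_vfatqc. MAX_OHS = 12 is an assumed per-AMC limit.
MAX_OHS = 12


def ohs_in_mask(oh_mask, max_ohs=MAX_OHS):
    """Return the OH indices whose bit is set in oh_mask (bit N <-> OH N)."""
    return [oh for oh in range(max_ohs) if (oh_mask >> oh) & 0x1]


def mask_for_ohs(ohs):
    """Build an OH mask from an iterable of OH indices."""
    mask = 0
    for oh in ohs:
        mask |= 1 << oh
    return mask


if __name__ == "__main__":
    print(ohs_in_mask(0x20))           # [5]
    print(ohs_in_mask(0x2))            # [1]
    print(hex(mask_for_ohs([1, 5])))   # 0x22
```

With this convention, any OH that is absent from the list returned for a given mask is one that the run is expected to leave untouched.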

Possible Solution (for bugs)

Context (for feature requests)

Your Environment

Version used:

Name : gempython_vfatqc
Arch : x86_64
Version : 2.7.7

Name : gempython_vfatqc
Arch : noarch
Version : 1.0.5

Shell used: Bash


mexanick · 2020-01-23T10:08:04Z

This is known behavior. The current values jump because all the OHs receive a hard reset and get reprogrammed. However, this should not cause any issues unless you're doing something on them in parallel. The communication to the frontend should remain stable (the GBT configuration should not be affected). Could you please elaborate on the subsequent communication failures on other chambers?

bregnery · 2020-01-23T10:27:03Z

After we did testConnectivity.py with GBT phase scan for 0x20 and it finished, we tried testConnectivity.py for 0x2 and got the following error:

[gempro@kvm-s3562-1-ip151-74 ~]$ testConnectivity.py 1 11 0x2 --skipScurve --skipDACScan
Open pickled address table if available  /opt/cmsgemos/etc/maps//amc_address_table_top.pickle...
Initializing AMC gem-shelf01-amc11
====================
Step 1: Checking GBT Communication
====================
Checking GBT Communication (Before Programming GBTs)
GBT Communication was not established successfully
        Try checking:
                1. Fibers from GE1/1 patch-panel to OH have correct jacket color ordering
                2. Fibers from GE1/1 patch-panel to OH are fully inserted
                3. OH3 screw is properly screwed into standoff
                4. OH3 standoff on the GEB is not broken
                5. Voltage on OH3 standoff is within range [1.47,1.59] Volts
Connectivity Testing Failed
If Vmon = 8.0V then Imon must be 1.71 +/- 0.01A; if not the GBT's are not locking to the fiber link
Goodbye

This happens with any of the other OHs on shelf01-amc11 at P5. The only way to recover one of the other chambers is to power cycle it.

bregnery · 2020-01-23T10:33:33Z

Okay, so something strange is happening.

I first noticed this last week at Point 5 on a different AMC, and I was able to reproduce it twice at P5 today. But just now, after we ran recover.sh on the CTP7, I tried to reproduce the behavior again and it no longer occurs.

mexanick · 2020-01-23T12:10:49Z

Hmmm... this is really interesting. @evka85, please take a look. Under normal operations we are supposed to get resyncs from time to time as well as hard resets, which will be forwarded to the front end, so it is important to restore the system properly...

lpetre-ulb · 2020-01-23T12:58:35Z

@bregnery, since you could reproduce the issue, did you read some registers and/or make some dumps? For example, reading the GTH transceiver or the GBT link statuses (before and after a link reset for the latter)? It is rather difficult to do a post-mortem analysis with very little information.

We have been using testConnectivity.py for electronics tests on 6 chambers at ULB for months and have never experienced such behavior...

Also, the IPBus issue is not related, since it does not concern the same link and there are no IPBus-related error messages in the current issue.
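
The before/after comparison asked for above could be recorded with something like the following sketch. It only compares two snapshots that have already been read out; the register names and values are placeholders chosen for illustration, not real address-table entries, and how the registers are read is deliberately left out:

```python
def diff_snapshots(before, after):
    """Return {register: (old, new)} for every register whose value differs.

    A register that is present in only one snapshot is reported with
    None for the side where it is missing.
    """
    changed = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed


if __name__ == "__main__":
    # Placeholder register names and values, for illustration only.
    before = {"LINK_A.READY": 1, "LINK_A.ERR_CNT": 0}
    after = {"LINK_A.READY": 0, "LINK_A.ERR_CNT": 0}
    for name, (old, new) in diff_snapshots(before, after).items():
        print("{0}: {1} -> {2}".format(name, old, new))
```

Attaching such a diff, taken once before the masked run and once after it, would give the post-mortem analysis something concrete to work from.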

mexanick assigned jsturdy, cgalloni and mexanick Jan 23, 2020

mexanick added the Status: Pending label Jan 23, 2020

bregnery closed this as completed Jan 23, 2020

mexanick reopened this Jan 23, 2020
