When installing the Connectivity Agent, several command-line arguments are mandatory, including a valid ICS username (-u=[username]) and password (-p=[password]). These arguments are used to verify connectivity with ICS during installation and are also stored in the agent credentials store for later use by the agent server. Storing them allows the running agent to make a heartbeat call to ICS, which the ICS console uses to report the state of the Connectivity Agent. This blog details some heartbeat-related behaviors that cause confusion when the ICS console contradicts what is observed on the Connectivity Agent machine.
Confusing Behaviors/Observations
The following is a real-world series of events that occurred for an ICS subscriber. Their agent had been started and running for quite a while. The ICS console was used to monitor the health of the agent (i.e., the green icon which indicates the agent is running). Then out of the blue, the console suddenly showed the agent was down (i.e., the red icon):
The obvious next step was to check the agent machine to make sure the agent was running. The captured standard output showed that the agent was in fact still running:
Further investigation showed that the agent server logs did not indicate any problems. In an attempt to resolve this strange scenario, the agent server was bounced … but it failed to start with the following:
Although the -u and -p command-line parameters contained the correct credentials, the startAgent.sh indicated an error code of 401 (i.e., unauthorized). This error was very perplexing since the agent had been started earlier with the same command-line arguments. After leaving the agent server down for a while, another start was kicked off to demonstrate the 401 problem. Interestingly enough, this time the agent started successfully and went to a running state. However, the ICS console was still showing that the agent was down with no indication of problems on the Connectivity Agent machine. Another attempt was made to bounce the agent server and it again failed to start with a 401.
At this point, the diagnostic logs were downloaded from the ICS console to see if there was any indication of problems on the ICS side. When analyzing the AdminServer-diagnostic.log, it showed many HTTP authentication/authorization failure messages:
It was eventually determined that the password for the ICS user associated with the Connectivity Agent had been changed without notifying the person responsible for managing the agent server. The odd behaviors were all tied to the heartbeat. When the ICS user password was changed, the running agent still had the old password in its credentials store. The repeated heartbeat calls with invalid credentials caused the user account to be locked out in ICS. When a user account is locked, it is not accessible for approximately 30 minutes.
This account locking scenario explained why the agent server could be started successfully and then fail with the 401 within a short period of time. When the account was not locked, the startAgent.sh script would successfully call ICS using the credentials from the command-line. Then the server would start and use the incorrect credentials from the credentials store for the heartbeat, thus locking the user account which caused the problem to repeat itself.
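The loop described above can be easier to follow as a small simulation. The following is only a toy model and not how ICS is actually implemented: the `Account` class, the failure threshold of three attempts, and the one-heartbeat-per-minute interval are all assumptions made for illustration. Only the roughly 30 minute lockout window comes from the observations above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

LOCKOUT_MINUTES = 30   # approximate lockout window observed in this post
MAX_FAILURES = 3       # ASSUMPTION: the real ICS threshold is not known here


@dataclass
class Account:
    """Hypothetical stand-in for an ICS user account with lockout."""
    password: str
    failures: int = 0
    locked_until: Optional[int] = None  # minute at which the lock expires

    def authenticate(self, supplied: str, now: int) -> int:
        """Return an HTTP-like status: 200 on success, 401 otherwise."""
        if self.locked_until is not None:
            if now < self.locked_until:
                return 401              # locked: even the right password fails
            self.locked_until = None    # lock has expired
            self.failures = 0
        if supplied == self.password:
            self.failures = 0
            return 200
        self.failures += 1
        if self.failures >= MAX_FAILURES:
            self.locked_until = now + LOCKOUT_MINUTES
        return 401


def simulate() -> List[Tuple[str, int, int]]:
    """Replay the sequence of events as (caller, minute, status) tuples."""
    account = Account(password="new-secret")
    events = []
    # The start script uses the correct command-line credentials.
    events.append(("start", 0, account.authenticate("new-secret", now=0)))
    # The heartbeat uses the stale password from the credentials store
    # (ASSUMPTION: one heartbeat per minute).
    for minute in range(1, 4):
        status = account.authenticate("old-secret", now=minute)
        events.append(("heartbeat", minute, status))
    # Bouncing the agent now fails even with the correct credentials.
    events.append(("restart", 5, account.authenticate("new-secret", now=5)))
    # Once the lock has expired, the start script succeeds again.
    events.append(("restart", 40, account.authenticate("new-secret", now=40)))
    return events


if __name__ == "__main__":
    for caller, minute, status in simulate():
        print(f"minute {minute:>2}: {caller:<9} -> {status}")
```

In this model the start succeeds, the stale heartbeat locks the account, the restart during the lock fails with a 401, and a restart after the window has passed succeeds again, which matches the pattern seen on the agent machine.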
The Fix
To fix this issue, a WLST script (updateicscredentials.py) has been provided that will update the Connectivity Agent credentials store. The details on running it can be found in the comments at the top of the script:
When executing this script, it is important to make sure the agent server is running. Once the script is done you should see something like the following:
At this point, stop the agent server and wait 30 minutes to allow the user account to be unlocked before restarting the server. Everything should now be back to normal:
Possible Option For Avoiding The 30 Minute Waiting Period
Although I have not yet had an opportunity to test the following out, in theory it should work. To avoid the 30 minute lockout period on ICS due to the Connectivity Agent heartbeat:
1. Change the credentials on the Connectivity Agent server.
2. Shut down the Connectivity Agent server.
3. Access the Oracle Cloud My Services console and Reset Password / Unlock Account with the password just used for the agent.
4. Verify that the user can log in to the ICS console (i.e., that the account is unlocked).
5. Start the Connectivity Agent and allow the server to get to running state.
6. Verify that “all is green” in the ICS console.
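Step 4 above could also be checked from a script before restarting the agent. The sketch below is untested against ICS and makes several assumptions: that the URL passed in is some ICS endpoint that accepts HTTP Basic authentication, and that a 401 response means the credentials were rejected or the account is still locked. The function name `check_credentials` and its parameters are hypothetical and are not part of any Oracle tooling. Keep in mind that every failed call may count toward the lockout, so this should be run once rather than in a loop.

```python
import base64
import urllib.error
import urllib.request


def check_credentials(url, username, password,
                      opener=urllib.request.urlopen, timeout=10):
    """Send one authenticated GET to `url` and return the HTTP status code.

    `url` is supplied by the caller (ASSUMPTION: an ICS endpoint that
    accepts HTTP Basic authentication). `opener` is injectable so the
    function can be exercised without touching the network.
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"}
    )
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 401 here would mean rejected credentials or a locked account.
        return err.code
```

A return value of 200 would suggest it is safe to start the agent, while a 401 would suggest the account is still locked or the password does not match.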