A particular failure you may have encountered with the v1.4 release of the Linux VDA occurs with a dialog entitled “HDX Session Validation Failure” appearing immediately after session launch. This forces a logout of your session after 30 seconds.
This KB explains why this failure occurs and what you can do to rectify it.
A fundamental part of session launch in a XenDesktop deployment is the process called Brokering. This is performed by the Broker, which is responsible for negotiating launch requests with worker machines. The Broker communicates with the Broker Agent on workers using the Citrix Brokering Protocol (CBP). It selects workers to handle the incoming ICA connections for desktop and application launches based on the worker’s readiness to fulfil the launch request. To ensure the launch is secure, CBP requires the Broker to associate users with their specific sessions. This user-to-session association is verified by having the Broker Agent impersonate each user that is logged on.
For the Linux VDA, the Broker Agent needs access to the Kerberos credentials of the user for impersonation during session validation. This requires the system environment be configured to cache Kerberos credentials in a simple flat file format. If the Broker Agent is unable to access the cache file for the user of a session, the above dialog is immediately shown after session launch, and then the session is terminated 30 seconds later.
There are a few diagnostic checks you can perform to pinpoint the cause of session validation failure. The checks outlined below are for a RHEL7 machine joined to AD with Winbind. You will find that the checks to perform for Centrify DirectControl or Quest are similar.
We recommend running the following commands in an HDX session that has failed session validation, before it is forcibly logged out.
- List the cached Kerberos tickets for your user account by running:
klist
- Search /tmp for possible credential cache files belonging to your user account by running:
ls -ls /tmp/krb* |grep `id -u`
- Check if the KRB5CCNAME environment variable has been set by running:
echo $KRB5CCNAME
If the timeout proves to be too quick, you can run the same commands for the same user account with Secure Shell (SSH) or another remote login program. With the exception of the KRB5CCNAME environment variable, the results should be the same as for HDX.
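Because the session is logged out after 30 seconds, it can help to capture all three checks in one pass. The following is a minimal sketch, not part of the Linux VDA: the function names and the report path are hypothetical, and the file name match (krb5cc_<uid> or krb5cc_<uid>_<suffix>) is an assumption that is narrower than the grep shown above.

```shell
#!/bin/bash
# Sketch only: helper names and the report location are hypothetical.

# Print credential cache files in a directory that belong to a UID.
# Assumes names of the form krb5cc_<uid> or krb5cc_<uid>_<suffix>.
# Usage: find_ccache_files <uid> [dir]   (dir defaults to /tmp)
find_ccache_files() {
    local uid="$1" dir="${2:-/tmp}"
    local f
    for f in "$dir"/krb5cc_*; do
        [ -e "$f" ] || continue        # the glob matched nothing
        case "${f##*/}" in
            krb5cc_"$uid"|krb5cc_"$uid"_*) printf '%s\n' "$f" ;;
        esac
    done
}

# Run the three checks and save the results before the session ends.
# Usage: collect_ccache_report [report-file]
collect_ccache_report() {
    local report="${1:-$HOME/hdx-krb-check.txt}"
    {
        echo "== klist =="
        klist 2>&1
        echo "== cache files for UID $(id -u) =="
        find_ccache_files "$(id -u)"
        echo "== KRB5CCNAME =="
        echo "${KRB5CCNAME:-<not set>}"
    } > "$report"
}
```

Calling `collect_ccache_report` inside the failing session writes the results to a file in your home directory, which you can read at leisure after the logout.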
Let’s start by looking at the output from “klist”. If “klist” successfully displays the ticket cache, check the expiry time of your ticket against the current date. Here is the output from my working RHEL7 machine:
rhel:~ # klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: ctxadmin@CITRIXLAB.LOCAL
Valid starting     Expires            Service principal
11/23/16 04:59:10  11/23/16 14:59:10  krbtgt/CITRIXLAB.LOCAL@CITRIXLAB.LOCAL
        renew until 11/24/16 04:59:03
11/23/16 07:47:09  11/23/16 14:59:10  host/sles6.citrixlab.local@CITRIXLAB.LOCAL
        renew until 11/24/16 04:59:03
rhel:~ # date
Tue Nov 29 07:25:12 EST 2016
If you find that your ticket has expired, Winbind has failed to update your ticket cache. In this case:
- Check that /etc/samba/smb.conf has:
[global]
kerberos method = secrets and keytab
winbind refresh tickets = true
- Manually renew the cached Kerberos ticket for your user account using the command below:
rhel:~ # kinit -R
Often this is sufficient to fix the problem. The cache file will be refreshed.
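The expiry check and the renewal can be combined. The sketch below relies on “klist -s”, which prints nothing and exits with a non-zero status when the cache cannot be read or has expired. The function name is hypothetical. Note that “kinit -R” can only succeed while the ticket is still within its “renew until” time.

```shell
#!/bin/bash
# Sketch only: renew the TGT when klist reports the cache as missing or expired.
renew_if_expired() {
    if klist -s; then
        echo "ticket cache is valid"
    elif kinit -R; then
        echo "ticket renewed"
    else
        # Renewal is refused once the renewable lifetime has passed.
        echo "renewal failed; a fresh kinit or login is needed" >&2
        return 1
    fi
}
```

Run `renew_if_expired` as the affected user. If it reports that renewal failed, obtain a new ticket with a plain `kinit` or by logging in again.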
When session validation fails, “klist” often reports an error saying that the cache file could not be found:
rhel:~ # klist
klist: Credentials cache file '/tmp/krb5cc_10001' not found
This error is generally caused by incorrect Kerberos or AD integration settings on your machine. Refer to the discussion of correcting misconfiguration later in this KB.
The KRB5CCNAME environment variable is usually set by a Pluggable Authentication Modules (PAM) “aware” program on successful login by a user. This environment variable should refer to the ticket cache file of the user:
/etc/krb5.conf
[libdefaults]
default_ccache_name = FILE:/tmp/krb5cc_%{uid}
/etc/security/pam_winbind.conf
[Global]
krb5_auth = yes
krb5_ccache_type = FILE
$ echo $KRB5CCNAME
FILE:/tmp/krb5cc_16777216
The Linux VDA is PAM “aware” and retrieves all PAM related environment variables on a successful login. This includes the KRB5CCNAME environment variable. However, there are cases where PAM will fail to set the KRB5CCNAME environment variable after the user has been successfully authenticated. This occurs when the clock skew between the Linux VDA machine and the AD server becomes excessive.
Here is an excerpt from pam_winbind.conf which mentions the clock skew in the description of the “krb5_auth” option:
pam_winbind can authenticate using Kerberos when winbindd is talking to an Active Directory domain controller. Kerberos authentication must be enabled with this parameter. When Kerberos authentication cannot succeed (e.g. due to clock skew), winbindd will fallback to samlogon authentication over MSRPC. When this parameter is used in conjunction with winbind refresh tickets, winbind will keep your Ticket Granting Ticket (TGT) uptodate by refreshing it whenever necessary. Defaults to “no”.
With the fallback to “samlogon” authentication, the Kerberos credential cache file becomes irrelevant and therefore there is no need to set the KRB5CCNAME environment variable. The Broker Agent relies on this variable to find the ticket cache file. Without the variable the Broker Agent will resort to opening the cache file assuming it has a well-known filename format. This is not always successful.
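To make the lookup order concrete, here is a minimal sketch of resolving a cache file path: use KRB5CCNAME when it is set, otherwise fall back to a conventional file name. This is an illustration only, not the Broker Agent’s actual logic. The function name and the fallback pattern /tmp/krb5cc_<uid> are assumptions.

```shell
#!/bin/bash
# Sketch only: hypothetical helper, not the Broker Agent's implementation.
# Usage: resolve_ccache_path <KRB5CCNAME value or empty> <uid>
resolve_ccache_path() {
    local ccname="$1" uid="$2"
    case "$ccname" in
        FILE:*) printf '%s\n' "${ccname#FILE:}" ;;      # strip the type prefix
        "")     printf '/tmp/krb5cc_%s\n' "$uid" ;;     # assumed fallback name
        /*)     printf '%s\n' "$ccname" ;;              # bare path implies FILE
        *)      echo "unsupported cache type: ${ccname%%:*}" >&2
                return 1 ;;
    esac
}
```

For example, `resolve_ccache_path "$KRB5CCNAME" "$(id -u)"` prints the file a FILE-type lookup would open. If that file does not exist, or the variable names a non-FILE type, the lookup fails in the way described above.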
Generally, if your machine suffers from clock skew, the Broker Agent will not register with the Broker. If you suspect you have encountered this, you can confirm that Winbind is failing Kerberos authentication by enabling debug logging for PAM. Modify the /etc/pam.d/password-auth file (which is included by /etc/pam.d/ctxhdx) by appending “debug” to the “auth” line for the “pam_winbind.so” module:
/etc/pam.d/password-auth
auth sufficient pam_winbind.so use_first_pass debug
When PAM fails to set the KRB5CCNAME environment variable due to clock skew, you will see a message as follows in either /var/log/secure or /var/log/messages:
winbindd[1640]: gss_init_sec_context failed with [ Miscellaneous failure (see text): Clock skew too great]
Refer to the following section to resolve this clock skew issue.
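The log check above can be scripted, along with a simple comparison of two clocks. This is a sketch under assumptions: both function names are hypothetical, and obtaining the domain controller’s time is left to you. The 300-second default matches the usual Kerberos clock skew tolerance.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of the Linux VDA.

# Print each readable log file that contains the clock skew message.
# Returns 0 if at least one file matched, 1 otherwise.
find_clock_skew_logs() {
    local f rc=1
    for f in "$@"; do
        [ -r "$f" ] || continue
        if grep -q 'Clock skew too great' "$f"; then
            printf '%s\n' "$f"
            rc=0
        fi
    done
    return "$rc"
}

# Succeed when two epoch timestamps differ by more than the limit in seconds.
# Usage: clock_skew_exceeds <local-epoch> <remote-epoch> [limit, default 300]
clock_skew_exceeds() {
    local local_ts="$1" remote_ts="$2" limit="${3:-300}"
    local diff=$(( local_ts - remote_ts ))
    [ "$diff" -lt 0 ] && diff=$(( 0 - diff ))
    [ "$diff" -gt "$limit" ]
}
```

Run `find_clock_skew_logs /var/log/secure /var/log/messages` as root, since those logs are not normally readable by ordinary users. `clock_skew_exceeds "$(date +%s)" "$dc_epoch"` succeeds when the local clock is more than 300 seconds away from a domain controller time that you have obtained separately and stored in `dc_epoch`.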
Earlier this KB mentioned that missing ticket cache files are often the result of incorrect Kerberos or AD integration settings on the Linux VDA machine. Besides reviewing the install guide instructions, we recommend running Linux XDPing (CTX202015) to identify the misconfigured settings. The culprit is usually a misconfigured KRB5CCNAME type setting. This will be highlighted by XDPing as shown below:
[root@rhel linux-xdping]# sudo xdping -T kerberos
Root User -------------------------------------------------------------
User: root
EUID: 0
Verify user is root [Pass]
Kerberos --------------------------------------------------------------
Kerberos version: 5
Verify Kerberos available [Pass]
Verify Kerberos version 5 [Pass]
KRB5CCNAME: [Not set]
Distro default FILE:/tmp/krb5cc_%{uid}
KRB5CCNAME type: [Supported]
KRB5CCNAME format: [Default]
Verify KRB5CCNAME cache type [Pass]
Verify KRB5CCNAME format [Pass]
Configuration file: /etc/krb5.conf [Exists]
Verify Kerberos configuration file found [Pass]
Keytab file: /etc/krb5.keytab [Exists]
Default realm: CITRIXLAB.LOCAL
Default realm KDCs: CTXAD2.CITRIXLAB.LOCAL
CTXAD2.CITRIXLAB.LOCAL
Default realm domains: citrixlab.local
.citrixlab.local
DNS lookup realm: [Enabled]
DNS lookup KDC: [Enabled]
Weak crypto: [Disabled]
Clock skew limit: 300 s
Verify system keytab file exists [Pass]
Verify default realm set [Pass]
Verify default realm in upper-case [Pass]
Verify default realm not EXAMPLE.COM [Pass]
Verify default realm domain mappings [Pass]
Verify default realm master KDC configured [Pass]
Verify default realm slave KDC configured [Pass]
Verify Kerberos weak crypto disabled [Pass]
Verify Kerberos clock skew setting [Pass]
Default ccache: FILE:/tmp/krb5cc_%{uid}
Default ccache type: [Supported]
Default ccache format: [Default]
Verify default credential cache cache type [Pass]
Verify default credential cache format [Pass]
UPN system key [RHELS4$@CITRIXLAB.LOCAL]: [Exists]
SPN system key [host/rhels4.citrixlab.local]: [Exists]
Verify Kerberos system keys for UPN exist [Pass]
Verify Kerberos system keys for SPN exist [Pass]
Kerberos login: Success (TGT received)
Verify KDC authentication [Pass]
TGT cached: Success
Verify TGT cached [Pass]
Summary ---------------------------------------------------------------
All tests passed
If an error occurs, you will need to check a few settings, especially the KRB5CCNAME type for Kerberos and the AD integration tool configured on your Linux VDA machine. The details are as follows:
- Ensure that Kerberos tickets are verified using both the secrets TDB file and system keytab. This involves the “kerberos method” configuration setting in .
[global]
kerberos method = secrets and keytab
winbind refresh tickets = yes
As shown above, the “authconfig” tool generated this section of the file. You will find that this includes the “kerberos method” setting. I recommend moving “kerberos method” out of the generated section as I have done to avoid inadvertently changing it when running “authconfig” again.
- Ensure the “krb5_auth” and “krb5_ccache_type” configuration settings are correct in /etc/security/pam_winbind.conf.
[global]
krb5_auth = yes
krb5_ccache_type = FILE
- Ensure the “default_ccache_name” configuration setting in /etc/krb5.conf does not specify a conflicting KRB5CCNAME type to that in /etc/security/pam_winbind.conf. I recommend deleting this setting if it is specified or making it the same as for pam_winbind.conf.
- Finally, remember to restart the Winbind service if you have made any configuration changes: service winbind restart
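The comparison between the two cache type settings can be automated. The following is a minimal sketch with hypothetical function names. It reads simple “key = value” lines only, and treats a “default_ccache_name” with no type prefix as FILE, which is how Kerberos interprets it.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical.

# Print the value of a "key = value" line (last match wins).
# Commented-out lines do not match because the key must start the line.
ini_value() {
    local key="$1" file="$2"
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" \
        | tail -n 1 | sed 's/[[:space:]]*$//'
}

# Compare the cache type in krb5.conf with the one in pam_winbind.conf.
check_ccache_types() {
    local krb5_conf="${1:-/etc/krb5.conf}"
    local pam_conf="${2:-/etc/security/pam_winbind.conf}"
    local name krb_type pam_type
    name=$(ini_value default_ccache_name "$krb5_conf")
    pam_type=$(ini_value krb5_ccache_type "$pam_conf")
    if [ -z "$name" ]; then
        echo "default_ccache_name not set; pam_winbind type is ${pam_type:-unset}"
        return 0
    fi
    case "$name" in
        *:*) krb_type="${name%%:*}" ;;   # text before the first colon
        *)   krb_type="FILE" ;;          # no prefix means FILE
    esac
    if [ "$krb_type" = "$pam_type" ]; then
        echo "consistent: $krb_type"
    else
        echo "conflict: krb5.conf=$krb_type pam_winbind.conf=$pam_type"
        return 1
    fi
}
```

Running `check_ccache_types` with no arguments inspects the two default paths. A “conflict” line points at the setting to delete or align, as recommended above.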
That’s all for troubleshooting session validation failures in the Linux VDA. You should now have some insight into resolving validation failures if you ever encounter them in the wild.