Redundancy - Top 5 Single Points of Failure in Most Citrix Implementations
- WI/CSG
- XML
- STA
- TS licensing server
- License server and data store
When we talk to Citrix administrators and first ask about their Citrix implementation, they may tell us they have anywhere from two to five servers. With the exception of one machine running the Web Interface, the rest of the Citrix servers are assumed to be pretty much equal, serving apps.
But the truth is that several components in a Citrix farm are single points of failure, with varying levels of tolerance for disconnection, so not all Citrix servers are equal. When some application servers go down, the only effect may be additional load on the remaining servers. Other application servers may also perform unique and critical functions, such as acting as the XML server or the STA for the Web Interface.
1. WI/CSG
If the Web Interface machine, which runs on IIS, goes down, there may be no other method of external access to the applications on the LAN. By default, the Citrix Web Interface is not fault tolerant. It takes only minutes to “create a site” when first configuring the Citrix servers, but by itself that site is a single point of failure.
The first thing that could be done is to create a second site on a second Web Interface machine. By itself this would not provide smooth failover; users would lose connectivity to the first site, then have to enter a different URL or IP address to reach the second site before reconnecting to their ICA/CGP sessions.
Originally the only thing Citrix said we could do about failover was get a hardware load balancer, but eventually Microsoft Clustering was supported. The same is true of the Citrix Secure Gateway - the SSL software that comes with the Web Interface for free, to secure the ICA data stream via SSL/TLS certificates.
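The manual failover described above, where a user tries the second site only after the first stops responding, can be sketched in a few lines of Python. This is only an illustration of the idea, not a substitute for a load balancer or clustering; the function names and the example URLs are hypothetical, and the probe simply treats any HTTP success or redirect as "up".

```python
import urllib.request


def http_probe(url, timeout=5.0):
    """Return True if the URL answers with an HTTP success status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers refused connections, timeouts, DNS failures and HTTP errors.
        return False


def first_reachable(site_urls, probe=http_probe):
    """Return the first site URL that the probe reports as up, else None."""
    for url in site_urls:
        if probe(url):
            return url
    return None
```

Called as `first_reachable(["https://wi1.example.com/", "https://wi2.example.com/"])` with made-up host names, it returns the first site that answers, which is roughly what a user does by hand when the first URL fails.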
2. XML
The Web Interface needs to talk to at least one “designated XML server” for each farm that it supplies credentials to. In the configuration utility for the site there is an option to add more servers to the list of designated XML servers, and to decide whether requests are spread across all of the servers or sent to one main server with the others as backups. Either way, more than one - strategically chosen - Presentation Server should be configured. The only requirement is that the chosen servers all use the same port for XML.
Two websites on two Web Interface machines that share the same configuration, with the same single XML server, are functional - and common - but that one XML server is still a single point of failure.
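To make the shared XML server problem concrete, here is a small Python sketch that audits a hand-written mapping of Web Interface sites to their designated XML servers. The mapping, the site and server names, and the `audit_xml_servers` helper are all assumptions made for the example; nothing here reads a real Web Interface configuration.

```python
def audit_xml_servers(site_xml_servers):
    """Find XML single points of failure in a site -> server-list mapping.

    Returns (weak_sites, shared_spofs):
      weak_sites   - sites listing fewer than two distinct XML servers
      shared_spofs - servers whose loss would leave every site with no
                     XML server at all
    """
    weak_sites = sorted(
        site for site, servers in site_xml_servers.items()
        if len(set(servers)) < 2
    )

    all_servers = set()
    for servers in site_xml_servers.values():
        all_servers.update(servers)

    # A server is a farm-wide SPOF when no site lists any other server.
    shared_spofs = sorted(
        server for server in all_servers
        if all(set(servers) <= {server}
               for servers in site_xml_servers.values())
    )
    return weak_sites, shared_spofs
```

For two sites that both point only at a hypothetical `ps01`, the audit reports both sites as weak and `ps01` as the shared single point of failure; adding a second server to each list clears both results.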
3. STA
If you’re using the Web Interface, you’re probably using the Citrix Secure Gateway software, or the Citrix Access Gateway hardware network appliance, to secure the ICA/CGP traffic via SSL over untrusted networks like the internet. Either way, your security box is using one of your Presentation Servers to both issue and authenticate “1-time-only tickets”, which are passed from the Web Interface to the client device and back to the Presentation Servers. It’s part of how the single-sign-on effect works within the Web Interface, without exposing credentials to untrusted networks. Logistically, the same server that issues a ticket has to authenticate it.
Since a single DLL does the job, the work is not demanding, and Citrix tells us we’ll never need more than one STA for performance, but this is also a very dangerous single point of failure. If the one Citrix server that happens to have been chosen as the STA - Secure Ticket Authority - goes down, nobody can get in from outside.
Solving the single STA problem is about as easy as solving the single XML server problem; wherever the STA is configured, there is an option to add more than one and to use the list for “load balancing” or failover. This has to be done carefully, however, because more than one interface has to be configured with identical information: one tells the system where tickets are issued, and the other tells it where they are authenticated. If the two do not always match, the mismatch is itself a serious point of failure. Without an STA, everyone using the Web Interface just gets a red error.
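The requirement that both STA lists match can be checked mechanically. The Python sketch below assumes the STA entries have already been copied out by hand from the Web Interface site and from the Secure Gateway (or Access Gateway) into two plain lists; the `compare_sta_lists` helper and the case-insensitive comparison are assumptions for illustration, and real entries may need more careful normalizing.

```python
def compare_sta_lists(web_interface_stas, gateway_stas):
    """Compare the STA lists configured on the two sides.

    Entries are compared case-insensitively after trimming whitespace.
    """
    wi = {entry.strip().lower() for entry in web_interface_stas}
    gw = {entry.strip().lower() for entry in gateway_stas}
    return {
        # Tickets issued here could not be authenticated by the gateway.
        "only_on_web_interface": sorted(wi - gw),
        # Listed on the gateway but never used to issue tickets.
        "only_on_gateway": sorted(gw - wi),
        # True when there is still just one STA overall (or none).
        "single_sta": len(wi | gw) < 2,
    }
```

An empty `only_on_web_interface` and `only_on_gateway`, with `single_sta` false, is the state we are aiming for.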
4. TS licensing server
Logging in successfully to the application server requires not only a Citrix License Server and a concurrent user Citrix License for the version of Citrix being accessed, but also a Terminal Services CAL, which has to be stored on a Terminal Services Licensing Server. There is usually at least one TS Licensing server, possibly the Domain Controller, for the whole Citrix Farm.
There are two different methods of licensing available for TS CALs - per user and per seat (labeled Per Device in Windows Server 2003). Per-user licensing is preferable here, because compliance is on the honor system, and if some disconnect occurs between the Citrix servers and the TS License server, there is no technical issue.
But per-seat licensing is another story. When the Terminal Services licensing mode on a Citrix server is set to per-seat, being unable to get a TS CAL can, after a set amount of time, stop a user from getting into a Citrix session. If this is the case, the TS Licensing server is another dangerous single point of failure.
5. License server and data store
These are actually two separate issues, but they have a lot in common; both are 30-day fault-tolerant single points of failure. Anyone diligent enough to check the Event Viewer on the Citrix servers at least once a week will see the red “X” and the number of grace hours remaining until catastrophic failure, logged once an hour for the whole 30-day countdown.
The Citrix License server has to be up, and the license file has to be available, over TCP port 27000 by default, to all Citrix Presentation Servers. The license file has the case-sensitive name of the License server hard-coded and digitally signed, so if licensing is installed somewhere else and the farm is pointed to the new license server, there is still a 30-day countdown. Citrix can re-allocate the license for you on MyCitrix.com, but it only takes two reboots to rename a server after the old license server, assuming that renaming it will not cause problems for anything else hosted on that server. Citrix supports the use of Microsoft Clustering for the Citrix License server.
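A quick way to confirm that a Presentation Server can at least reach the license server is a plain TCP connection test against the default port mentioned above. The Python sketch below is a minimal example; the `port_is_open` helper is hypothetical, and a successful connection only shows that something is listening on the port, not that licenses are actually being checked out.

```python
import socket


def port_is_open(host, port=27000, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False
```

Run from each Citrix server against the license server's name, a `False` result is an early warning, well before the grace period countdown becomes the only clue.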
The IMA data store is an older and more complicated story. Holding all the settings for all the published apps, policies, and servers in the farm, this heart of the Citrix implementation may reside in an Access database on any one of your Citrix Presentation Servers - by default, on the first one in the farm. More often, though, the IMA Data Store resides on a separate SQL server, with the outside chance it is sitting on Oracle or DB2.
If it’s sitting on an external box, there are by default two single points of failure for Data Store connectivity alone - there’s the SQL server itself, of course, but then there’s the one server with the DSN file pointing to that database server. By default, IMA configures the first server in the farm to connect to the data store, and the rest of the servers go through that first server, in what Citrix calls an “indirect” IMA connection. The alternative, and a Citrix best practice, is to add DSN files manually to several other Citrix servers, in order to maintain connection to the Data Store under all circumstances.
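The dependency created by indirect connections is easy to picture with a small model. The Python sketch below assumes a hand-maintained dictionary in which each farm server maps either to `None` (it has its own DSN and connects directly) or to the name of the server it goes through; the dictionary format and the `indirect_dependencies` helper are invented for this example and do not query IMA.

```python
def indirect_dependencies(connections):
    """Group servers by the intermediary they rely on for the data store.

    connections: dict of server name -> None for a direct connection,
                 or the name of the server it connects through.
    Returns a dict of intermediary -> sorted list of dependent servers.
    """
    dependents = {}
    for server, via in connections.items():
        if via is not None:
            dependents.setdefault(via, []).append(server)
    return {via: sorted(servers) for via, servers in dependents.items()}
```

In a default farm of three hypothetical servers, `{"ps01": None, "ps02": "ps01", "ps03": "ps01"}` shows `ps01` carrying both other servers; once every server has its own DSN, the result is empty.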
Backing up the data store itself comes back to a basic idea: whether it is on Access or on SQL, and whether or not professional daily backups are running, there should be a separate, “last known good” backup of the data store, on a flash drive, in the possession of the lead administrator or integrator, along with a process for getting back to that Data Store state in the event of Data Store failure.
CM, Citrix Training Instructor
Unitek Citrix Training

