For some time now we have been facing issues with users getting stuck on Identity Server and not being returned to our Sitecore website.
Our Setup
In our setup, we have four front-end content delivery (CD) instances behind a load balancer, running our student portal. The portal is fully locked down, so access to any page in the front end requires the user to be logged in. Users log in via Identity Server, which in turn links through to our on-premises ADFS as an external identity provider. We were running Duende Identity Server on two virtual machines, behind a load balancer.
The Problem
The issue we were facing was that periodically (once every two or three weeks) some of the CD instances would stop allowing users to log in. When accessing the site directly on a problematic virtual machine, the user is redirected to the login screen, then gets stuck instead of being redirected back to the Sitecore site.
We were able to identify exceptions in the logs relating to the issue:
2024-04-05T15:38:39.5466013+01:00 [FTL] (Sitecore Identity/SC-NP-P-XC01) Unhandled exception: "The key {769d42de-2cf0-40ae-a655-991d57b67a08} was not found in the key ring. For more information go to http://aka.ms/dataprotectionwarning"
System.Security.Cryptography.CryptographicException: The key {769d42de-2cf0-40ae-a655-991d57b67a08} was not found in the key ring. For more information go to http://aka.ms/dataprotectionwarning
at Microsoft.AspNetCore.DataProtection.KeyManagement.KeyRingBasedDataProtector.UnprotectCore(Byte[] protectedData, Boolean allowOperationsOnRevokedKeys, UnprotectStatus& status)
at Microsoft.AspNetCore.DataProtection.KeyManagement.KeyRingBasedDataProtector.Unprotect(Byte[] protectedData)
at Microsoft.AspNetCore.DataProtection.DataProtectionCommonExtensions.Unprotect(IDataProtector protector, String protectedData)
at Duende.IdentityServer.Stores.Serialization.PersistentGrantSerializer.Deserialize[T](String json) in /_/src/Storage/Stores/Serialization/PersistentGrantSerializer.cs:line 108
at Sitecore.Plugin.IdentityServer.Storage.ExternalUsers.ExternalUserRepository.FindAsync(String providerName, String userId)
at Sitecore.Plugin.IdentityServer.Services.ExternalProfileService.GetProfileDataAsync(ProfileDataRequestContext context)
at Sitecore.Plugin.IdentityServer.Services.CombinedProfileService.GetProfileDataAsync(ProfileDataRequestContext context)
at Duende.IdentityServer.ResponseHandling.UserInfoResponseGenerator.ProcessAsync(UserInfoRequestValidationResult validationResult) in /_/src/IdentityServer/ResponseHandling/Default/UserInfoResponseGenerator.cs:line 108
at Duende.IdentityServer.Endpoints.UserInfoEndpoint.ProcessUserInfoRequestAsync(HttpContext context) in /_/src/IdentityServer/Endpoints/UserInfoEndpoint.cs:line 93
at Duende.IdentityServer.Endpoints.UserInfoEndpoint.ProcessAsync(HttpContext context) in /_/src/IdentityServer/Endpoints/UserInfoEndpoint.cs:line 61
at Duende.IdentityServer.Hosting.IdentityServerMiddleware.Invoke(HttpContext context, IEndpointRouter router, IUserSession session, IEventService events, IIssuerNameService issuerNameService, IBackChannelLogoutService backChannelLogoutService) in /_/src/IdentityServer/Hosting/IdentityServerMiddleware.cs:line 103
As usual, when all else fails, we logged a ticket with Sitecore Support, who were able to help identify the issue.
The Solution
Artem from Sitecore Support came back with the following advice:
Sitecore Identity Server (SIS) is built on top of .NET Core and uses the .NET Core cryptography API, including the key ring. SIS uses the key ring to store and manage cryptographic keys that are used for encrypting and decrypting tokens, signing and verifying tokens, and other security-related operations.
Each Sitecore instance that uses SIS will have its own instance of the key ring, which is created and managed independently by the .NET Core runtime within the memory of the process that hosts the instance. This means that the key ring is not shared between multiple Sitecore instances.
I would like to share with you that the reported behavior could take place if two IS instances run behind a Load Balancer at the same time. Please be advised that it is not allowed to have more than one IS instance behind the Load Balancer. You may refer to the following article to fetch more details on this topic: https://doc.sitecore.com/xp/en/developers/103/platform-administration-and-architecture/scaling-and-configuring-sitecore-identity-server.html
You cannot set up multiple instances of the SIS role behind a load balancer. An encrypted cookie can only be decrypted by the specific instance of the SIS role that originally issued it, which cannot be guaranteed in a load balanced setup.
I would like to emphasize that multiple IS instances are not allowed in all versions.
We have submitted a feature request to our development team for a scaling SIS. This functionality could be implemented in future Sitecore versions. To track the future status of this report, please use reference number 378961. More information about public reference numbers can be found here: https://support.sitecore.com/kb?id=kb_article_view&sysparm_article=KB0853187
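To make the behaviour described in the support response more concrete, here is a minimal sketch in Python of two instances that each hold their own in-memory key ring. This is a toy model, not the ASP.NET Core Data Protection implementation: the `ToyKeyRing` class, its methods, and the payload layout are all assumptions made for illustration, and it only signs the data rather than encrypting it. It shows why a cookie issued by one instance fails on the other with a "key not found" error similar to the one in our logs.

```python
import hashlib
import hmac
import secrets
import uuid


class KeyNotFoundError(Exception):
    """Raised when a payload references a key id this key ring does not hold."""


class ToyKeyRing:
    """Toy stand-in for a per-process key ring (hypothetical, for illustration only)."""

    KEY_ID_LEN = 36  # length of a UUID in its canonical string form
    TAG_LEN = 32     # length of an HMAC-SHA256 digest in bytes

    def __init__(self):
        # Each instance generates its own key at start-up and shares it with nobody.
        self.key_id = str(uuid.uuid4())
        self.keys = {self.key_id: secrets.token_bytes(32)}

    def protect(self, data: bytes) -> bytes:
        # Payload layout: key id, then HMAC tag, then the data itself.
        tag = hmac.new(self.keys[self.key_id], data, hashlib.sha256).digest()
        return self.key_id.encode("ascii") + tag + data

    def unprotect(self, payload: bytes) -> bytes:
        key_id = payload[: self.KEY_ID_LEN].decode("ascii")
        tag = payload[self.KEY_ID_LEN : self.KEY_ID_LEN + self.TAG_LEN]
        data = payload[self.KEY_ID_LEN + self.TAG_LEN :]
        key = self.keys.get(key_id)
        if key is None:
            # The payload was protected by a key this instance has never seen.
            raise KeyNotFoundError(
                f"The key {{{key_id}}} was not found in the key ring."
            )
        expected = hmac.new(key, data, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("payload failed the integrity check")
        return data


if __name__ == "__main__":
    instance_a = ToyKeyRing()  # first Identity Server VM
    instance_b = ToyKeyRing()  # second Identity Server VM

    cookie = instance_a.protect(b"example-session-data")

    # The issuing instance can read its own cookie.
    print(instance_a.unprotect(cookie))

    # If the load balancer routes the follow-up request to the other
    # instance, the key lookup fails.
    try:
        instance_b.unprotect(cookie)
    except KeyNotFoundError as exc:
        print("instance B failed:", exc)
```

In this sketch the failure depends entirely on which instance the load balancer happens to pick for the follow-up request, which would fit the intermittent nature of the problem we saw.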
Summary
In short, our problem was that we were running two instances of Identity Server behind a load balancer. The solution was to drop down to a single instance.