I have a problem with a memory leak in a .NET Core 3.1 API. The application is hosted in an Azure App Service.
It is clearly visible on a graph that under constant load the memory grows very slowly, and it only goes down after an app restart.
I created two memory dumps, one with high memory and one after a restart, and it looks like the cause is the app trying to load XmlSerialization.dll multiple times.
We have multiple other APIs that use almost identical serialization code, and I'm not exactly sure why the problem occurs only in this one; possibly because this one receives much higher traffic.
I've read some articles about the XmlSerializer class having memory issues, but those were reported for constructors we are not using. The only place that uses XmlSerializer directly in code uses the XmlSerializer(Type) constructor:
private static async Task<T> ParseResponseContentAsync<T>(HttpResponseMessage response, Accept accept)
{
    try
    {
        using (Stream contentStream = await response.Content.ReadAsStreamAsync())
        {
            using (StreamReader reader = new StreamReader(contentStream, Encoding.UTF8))
            {
                switch (accept)
                {
                    case Accept.Xml:
                        XmlSerializer serializer = new XmlSerializer(typeof(T));
                        return (T)serializer.Deserialize(reader);
                    case Accept.Json:
                        string stringContent = await reader.ReadToEndAsync();
                        return JsonConvert.DeserializeObject<T>(stringContent);
                    default:
                        throw new CustomHttpResponseException(HttpStatusCode.NotImplemented, $"Unsupported Accept type '{accept}'");
                }
            }
        }
    }
    catch (Exception ex)
    {
        throw new InvalidOperationException($"Response content could not be deserialized as {accept} to {typeof(T)}", ex);
    }
}
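For context, the documented XmlSerializer leak applies to constructor overloads other than XmlSerializer(Type) and XmlSerializer(Type, String): those two reuse an internally cached, dynamically generated serialization assembly, while the remaining overloads emit a new assembly on every call unless the caller caches the instance. A quick illustration of the distinction (the namespace string is a placeholder, and ResponseDto is just the DTO from this question):

using System.Xml.Serialization;

public static class SerializerConstructorNotes
{
    public static void Illustrate()
    {
        // These two overloads reuse an internally cached, generated assembly:
        var cachedByType = new XmlSerializer(typeof(ResponseDto));
        var cachedByTypeAndNamespace = new XmlSerializer(typeof(ResponseDto), "urn:example");

        // Overloads like these generate (and never unload) a fresh assembly per call,
        // which is the documented leak unless the caller caches the serializer:
        var leakyWithRoot = new XmlSerializer(typeof(ResponseDto), new XmlRootAttribute("SomeName"));
        var leakyWithOverrides = new XmlSerializer(typeof(ResponseDto), new XmlAttributeOverrides());
    }
}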
But I'm fairly sure this method isn't even used in this API anyway.
Another potentially problematic place could be the serialization of controller responses.
Startup.cs registration:
services
    .AddControllers(options =>
    {
        options.OutputFormatters.Add(new XmlSerializerOutputFormatter(
            new XmlWriterSettings
            {
                OmitXmlDeclaration = false
            }));
        options.Filters.Add<CustomHttpResponseExceptionFilter>();
    })
    .AddNewtonsoftJson(options => options.SerializerSettings.Converters.Add(
        new StringEnumConverter(typeof(CamelCaseNamingStrategy))))
    .AddXmlSerializerFormatters();
Example of an endpoint:
[Produces(MimeType.ApplicationXml, MimeType.TextXml, MimeType.ApplicationJson, MimeType.TextJson)]
[ProducesResponseType(StatusCodes.Status200OK)]
[ProducesResponseType(StatusCodes.Status404NotFound)]
[ProducesResponseType(StatusCodes.Status401Unauthorized)]
[HttpGet("EndpointName")]
[Authorize]
public async Task<ActionResult<ResponseDto>> Get([FromModel] InputModel inputModel)
{
    // some code
    return responseDto;
}
Dto returned from the API:
[XmlRoot(ElementName = "SomeName")]
public class ResponseDto
{
    [XmlElement(ElementName = "Result")]
    public Result Result { get; set; }

    [XmlAttribute(AttributeName = "Status")]
    public string Status { get; set; }

    [XmlAttribute(AttributeName = "DoneSoFar")]
    public int DoneSoFar { get; set; }

    [XmlAttribute(AttributeName = "OfTotal")]
    public int OfTotal { get; set; }
}
Now, I haven't been able to find any documented cases of .AddXmlSerializerFormatters()
causing these kinds of issues, and I'm not sure what the solution or workaround should be. Any help would be greatly appreciated.
EDIT: I've run some additional tests as @dbc suggested.
It now seems that we are not even hitting the new XmlSerializer(typeof(T)) line
in our scenarios, since nothing was logged after the logging code was added. We do, however, use the default XML serialization for some of our API endpoints. One thing I noticed that might be causing this behavior is that the paths in the memory dump logs don't match the files that actually exist in the root folder.
The paths visible in the memory dumps are *.Progress.Lib.XmlSerializers.dll
or *.Domain.Lib.XmlSerializers.dll.
I wonder whether this is the issue documented here - link, since I can't see those files in the wwwroot directory.
If it is, I'm not sure whether the solution would be to somehow reference the .dlls directly?
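If the missing pre-generated *.XmlSerializers.dll assemblies turn out to be relevant, one way to check at startup whether they can be loaded at all is a probe along these lines (a rough sketch; the assembly names are placeholders for the real prefixes hidden behind the * above, and ILogger stands in for whatever logging the app already uses):

using System.IO;
using System.Reflection;
using Microsoft.Extensions.Logging;

public static class SerializerAssemblyProbe
{
    // Placeholder names -- replace with the real prefixes from the dump paths.
    private static readonly string[] SerializerAssemblyNames =
    {
        "MyCompany.Progress.Lib.XmlSerializers",
        "MyCompany.Domain.Lib.XmlSerializers"
    };

    public static void LogAvailability(ILogger logger)
    {
        foreach (string name in SerializerAssemblyNames)
        {
            try
            {
                Assembly assembly = Assembly.Load(new AssemblyName(name));
                logger.LogInformation("Pre-generated serializers loaded from {Location}", assembly.Location);
            }
            catch (FileNotFoundException)
            {
                // XmlSerializer falls back to generating serializers at run time when this
                // load fails, so a missing assembly is not fatal by itself -- but repeated
                // failed load attempts would match the dump paths mentioned above.
                logger.LogWarning("Pre-generated serializer assembly '{Name}' was not found", name);
            }
        }
    }
}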
Edit2: Adding a screenshot of how the memory looks after deploying the cached serializer suggested by @dbc. There is no constant growth anymore, but it seems that after a few hours the memory rises and doesn't go back down. It is possible that the main problem is resolved, but since it takes a lot of time to notice big differences, we will keep monitoring this for now. Nothing shows up in the large object heap, and no large amount of memory is allocated on the managed heap. However, this API runs at around 250 MB when first deployed and after one day is now at 850 MB. When we turned off the load-test tool, the memory didn't really go down much.
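For reference, the cached serializer mentioned above is roughly along these lines (a minimal sketch of the idea, one serializer per root type; not necessarily the exact code that was deployed):

using System;
using System.Collections.Concurrent;
using System.Xml.Serialization;

public static class XmlSerializerCache
{
    private static readonly ConcurrentDictionary<Type, XmlSerializer> Cache =
        new ConcurrentDictionary<Type, XmlSerializer>();

    // Reuses a single XmlSerializer per type for the lifetime of the process
    // instead of constructing a new instance on every request.
    public static XmlSerializer Get(Type type) =>
        Cache.GetOrAdd(type, t => new XmlSerializer(t));
}

With that in place, the XML branch in ParseResponseContentAsync calls XmlSerializerCache.Get(typeof(T)) instead of new XmlSerializer(typeof(T)).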
Edit3: We looked closer at some historical data, and it seems that the behavior in the last screenshot is normal: the memory never grows beyond a certain point. I'm not sure why that happens, but it is acceptable.