I have an application written in C#. I need to process a large text file. For each line, I need to parse 3-10 alphanumeric values and recalculate them. Regex is well suited for that, but C# regex is not fast enough. I tried the ctre (compile-time regular expressions) library by Hana Dusikova (C++17), and it performed very well. I need to run the regex on each line as I read the file. If I call into a C++ library through P/Invoke, how much overhead will that add? I want to build a C++ library and call it from C# whenever I need to parse a line. Is this a good idea?
-
Have you tried to use [RegexOptions.Compiled](https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regexoptions?view=netframework-4.8)? Not sure if that will have a big performance impact tho. – Longoon12000 Jan 09 '20 at 15:56
-
And, are you compiling the c# code to a binary? – Mike Robinson Jan 09 '20 at 16:01
-
Why don't you write a sample program in C++, measure the time it takes and compare to the C# program? – luxun Jan 09 '20 at 16:01
-
@Longoon12000 Of course. With ctre regex it takes about 9 seconds for a 220 MB file, while C# regex takes about 35 seconds. C# needs 388% of the ctre time, which is a very good result for ctre. – Noisy88 Jan 09 '20 at 16:02
-
note that the RegexOptions.Compiled that I linked is c# regex, not an external library like the ctre that you linked – Longoon12000 Jan 09 '20 at 16:05
-
@luxun In a simple test, ctre regex gives good results. But that test did not use P/Invoke (calling the C++ code from the C# application). A test with P/Invoke is expensive to set up because the program's objects have a complex structure. Full testing would take a lot of time, and the code is not easy to write. – Noisy88 Jan 09 '20 at 16:08
-
@Longoon12000 With RegexOptions.Compiled it was faster, by about 10% in real-life testing... Not that much. – Noisy88 Jan 09 '20 at 16:53
-
For me, any decision around a P/Invoke based tool needs to incorporate marshalling time. I've had transit costs eat up more than a 388% performance boost because the interface was too fine-grained. – parktomatomi Jan 09 '20 at 20:55
-
@parktomatomi I found the best solution: a precompiled regex in C#. I took a 220 MB test file and compared interpreted, compiled and precompiled regex in C#, with these results: 1) Precompiled regex: 18453 ms; 2) Compiled regex: 18713 ms; 3) Interpreted regex: 25928 ms. – Noisy88 Jan 10 '20 at 05:00
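
The concern raised in the comments about an interface that is too fine-grained can be illustrated with a rough sketch of a coarser native entry point: instead of one P/Invoke call per line, the managed side would hand over a whole buffer of lines and get one result per line back. Everything here is an assumption for illustration only: the function name `parse_batch`, its signature, and the "sum the integers on each line" recalculation are hypothetical, and `std::regex` stands in for ctre so the example needs nothing beyond the standard library.

```cpp
#include <cstdint>
#include <regex>
#include <string>

// Hypothetical batch entry point (name and signature are illustrative only).
// `data` holds `length` bytes of '\n'-separated lines. For each line, the sum
// of the decimal integers found in it is written to `out_sums`. Returns the
// number of lines processed, which never exceeds `out_capacity`.
extern "C" int32_t parse_batch(const char* data, int32_t length,
                               int64_t* out_sums, int32_t out_capacity) {
    // std::regex is a stand-in for ctre. The {1,9} bound keeps every match
    // small enough that std::stoll cannot throw in this sketch.
    static const std::regex number_re("[0-9]{1,9}");

    const char* cursor = data;
    const char* const end = data + length;
    int32_t lines = 0;

    while (cursor < end && lines < out_capacity) {
        // Locate the end of the current line.
        const char* line_end = cursor;
        while (line_end < end && *line_end != '\n') ++line_end;

        // Extract every number on the line and accumulate it.
        int64_t sum = 0;
        for (std::cregex_iterator it(cursor, line_end, number_re), last;
             it != last; ++it) {
            sum += std::stoll(it->str());
        }
        out_sums[lines++] = sum;

        // Skip the newline, if there is one.
        cursor = (line_end < end) ? line_end + 1 : end;
    }
    return lines;
}
```

On the C# side, such a function would presumably be declared with `[DllImport]` and called once per block of text read from the file, passing a `byte[]` and a `long[]`. Whether this beats a compiled .NET regex would still have to be measured on the real data, since the buffer preparation and the managed/native transition both have a cost.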