I'm a beginner programmer and I've been banging my head on my desk for a while because of this problem. Basically, I'm trying to write some code for an application I'm making that can quickly read the ratings of several thousand files inside a specified folder.
I was actually able to write something that works; the problem is the performance. Here is the code in its entirety. I will explain why it is problematic in more detail below:
    using System;
    using System.Collections.Generic;
    using System.Windows.Forms;
    using Microsoft.WindowsAPICodePack.Shell;
    using System.Diagnostics;

    namespace Tests
    {
        public partial class Form1 : Form
        {
            // Paths of the files whose rating was read as 0.
            List<string> Files = new List<string>();

            public Form1()
            {
                InitializeComponent();
            }

            private void event_Form1_Shown(object sender, EventArgs e)
            {
                string File = @"D:\Downloads\1.png";
                int NumberOfLoops = 5000;

                Stopwatch sw = new Stopwatch();
                sw.Start();
                for (int i = 0; i < NumberOfLoops; i++)
                {
                    // Read the rating of the same file on every iteration.
                    var file = ShellFile.FromFilePath(File);
                    int Rating = Convert.ToInt32(file.Properties.System.Rating.Value);
                    if (Rating == 0)
                    {
                        Files.Add(File);
                    }
                }
                sw.Stop();

                MessageBox.Show("Time: " + sw.ElapsedMilliseconds.ToString() + "ms (" + NumberOfLoops.ToString() + "x)");
            }
        }
    }
On my system, reading the rating of this one file 5000 times takes around 6200ms on a local hard drive and 21500ms if the file is on a network share.
The problem is, as I alluded to before, that this code will be used to read the ratings of far more than 5000 files (sometimes hundreds of thousands), and the performance is absolutely abysmal. I have also learned that Windows uses some form of caching to read this kind of metadata more rapidly once a file has been read before, so reading a specific file's metadata over and over is the absolute best-case scenario in terms of performance.
Even though it might not be accurate, it is still a useful test, because it gives me a benchmark for comparing different methods of reading extended file attributes and seeing which one takes the least time to complete. In real-world use, the app will have to read the ratings of a gigantic pool of different files, which slows things down by a factor of around 25 in my testing (the 21500ms operation takes 578000ms, for example, which is around 10 minutes, so you can see why this is becoming a problem).
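To compare methods on equal footing, the timing loop could be pulled out into a small helper along these lines. This is only a sketch: `RatingBenchmark`, `Measure`, and the `readRating` delegate are hypothetical names introduced for illustration, and the delegate stands in for whichever rating-reading method is being tested.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class RatingBenchmark
{
    // Runs readRating (a placeholder for the method under test) on every
    // path, collects the paths whose rating is 0 into unrated, and returns
    // the elapsed time in milliseconds.
    public static long Measure(
        IEnumerable<string> paths,
        Func<string, int> readRating,
        List<string> unrated)
    {
        Stopwatch sw = Stopwatch.StartNew();
        foreach (string path in paths)
        {
            if (readRating(path) == 0)
            {
                unrated.Add(path);
            }
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

Passing the same path 5000 times would reproduce the best-case test described above, while passing a list of distinct paths would measure the more realistic case.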
Since I know I'm a beginner and that my code is probably super inefficient, I started looking around for other methods of doing the same thing. Using this solution from a thread on a similar problem, I came up with this code:
    using System;
    using System.Collections.Generic;
    using System.Windows.Forms;
    using System.Diagnostics;

    namespace Tests
    {
        public partial class Form1 : Form
        {
            // Paths of the files reported as "Unrated".
            List<string> Files = new List<string>();
            Shell32.Shell app = new Shell32.Shell();

            public Form1()
            {
                InitializeComponent();
            }

            private void event_Form1_Shown(object sender, EventArgs e)
            {
                string Folder = @"D:\Downloads\";
                string File = "1.png";
                int NumberOfLoops = 5000;

                Stopwatch sw = new Stopwatch();
                sw.Start();
                for (int i = 0; i < NumberOfLoops; i++)
                {
                    var folderObj = app.NameSpace(Folder);
                    var filesObj = folderObj.Items();

                    // Map each column header name (e.g. "Rating") to its index.
                    var headers = new Dictionary<string, int>();
                    for (int j = 0; j < short.MaxValue; j++)
                    {
                        // Passing null as the item returns the header of column j.
                        string header = folderObj.GetDetailsOf(null, j);
                        if (String.IsNullOrEmpty(header))
                            break;
                        if (!headers.ContainsKey(header)) headers.Add(header, j);
                    }

                    var testFile = filesObj.Item(File);
                    if (folderObj.GetDetailsOf(testFile, headers["Rating"]) == "Unrated")
                    {
                        Files.Add(Folder + File);
                    }
                }
                sw.Stop();

                MessageBox.Show("Time: " + sw.ElapsedMilliseconds.ToString() + "ms (" + NumberOfLoops.ToString() + "x)");
            }
        }
    }
Unfortunately, this method is even slower than the one before, clocking in at around 6700ms on my local hard drive and 23000ms on a network share. I also found these other solutions, which seemed to be doing something similar to what I want, but I couldn't get them to work for various reasons:
https://stackoverflow.com/a/65349545: the StorageFile.GetFileFromPathAsync call gives me an error even though I added Microsoft.Windows.SDK.Contracts to the project's NuGet packages.
https://stackoverflow.com/a/48096438: this one uses the popular TagLib-Sharp library. I was able to compile the code from this solution, but I was not able to read the rating from a file (I was able to read the tags, which are similar but not quite what I was looking for).
https://stackoverflow.com/a/29308647/19518435: this solution looked promising, but as another commenter mentioned, I have no idea what FolderItem2 is supposed to be referencing. EDIT: Got this solution to work with some help, but unfortunately it is not really up to par in terms of performance; see EDIT1 below for more details.
Ideally, I would like to find a way for this "benchmark" I've made to take around 1000ms or less (so in the realm of 6-7 times faster than the first two methods).
I am really motivated to get this to work, but frankly I am out of ideas. It's a frustrating situation because I know my code is probably very unoptimised, or there might be a much more obvious way to do what I'm trying to achieve, but since I am very inexperienced I don't really know what else to try. That's why I'm turning to you. Any help would be greatly appreciated!
EDIT1: I was able to make two more methods work with some help, but unfortunately neither is very good in terms of performance. I compiled all 4 in this GitHub repo if anyone wants to take a second look at them, because I feel like there's a good chance my bad implementation is affecting performance: https://github.com/user-727/FileRatingReader