Some software has the option to collect anonymous usage data. How does that work? How is it collected and sent? I'd like to write a small test to try this myself, but I'm not sure where to begin.

Rob Kennedy
DRokie
  • It works by collecting data and sending it. How did *you* think it works? Which part are you having trouble with? Be more specific about what you want to know, and you'll get more specific answers. – Rob Kennedy Jun 17 '11 at 03:20
  • See also (with Delphi code): http://stackoverflow.com/questions/1115048/is-there-a-tool-to-gather-win32-application-usage-statistics-closed - or http://stackoverflow.com/questions/2441420/usage-tracking-for-windows-desktop-applications – mjn Jun 17 '11 at 05:46
  • DRokie - you could have a look at DeskMetrics.com. I'm not affiliated with them in any way, other than that I've used it and it worked for me. (Basically it's a DLL, and they have a Delphi interface unit.) – Stuart Jun 18 '11 at 12:51

1 Answer

Collecting data: You can accumulate whatever data you want. For example, we wanted to know which forms our users were using (we have a lot of forms). So, in every form's FormCreate, we call code that appends Self.Name to a text file. That records every form creation and the order in which our users visit the forms.
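A minimal sketch of that logging step, written in Python rather than Delphi to keep it short. The file name `usage.log` and the helpers `record_event` and `read_events` are hypothetical names chosen for illustration; in a Delphi application the call would sit in each form's FormCreate and pass Self.Name.

```python
from pathlib import Path

# Hypothetical log location; a real application would use a per-user data directory.
USAGE_LOG = Path("usage.log")


def record_event(name: str, log_path: Path = USAGE_LOG) -> None:
    """Append one event name per line, so the file preserves the order of events."""
    with log_path.open("a", encoding="utf-8") as log:
        log.write(name + "\n")


def read_events(log_path: Path = USAGE_LOG) -> list:
    """Return the recorded event names in the order they were written."""
    if not log_path.exists():
        return []
    return log_path.read_text(encoding="utf-8").splitlines()
```

Calling `record_event("TForm3")` whenever a form is created is enough to reproduce the form-tracking example.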

The same approach works for other data. For example, if you wanted to know how many times a user got a certain error message, append the name of the error message to a file whenever you show it. If you wanted to know how long a user spent on a certain screen, you could note Now() when the form is opened and Now() when the form is closed, and then write the difference to a data file.
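The elapsed-time idea can be sketched like this, again in Python for brevity. `timed_screen` and its `sink` list are hypothetical names. Note one deliberate difference from the Now() approach: the sketch reads a monotonic clock, so a change to the system clock while the form is open does not distort the measurement.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_screen(name, sink, clock=time.monotonic):
    """Append (name, seconds the body took) to sink when the block exits."""
    opened = clock()
    try:
        yield
    finally:
        # Runs even if the body raises, so the visit is still recorded.
        sink.append((name, clock() - opened))
```

Wrapping the lifetime of a form in `with timed_screen("Form4", durations):` leaves one `(name, seconds)` pair per visit in `durations`, which can then be written to the data file.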

Sending the data: You need to transfer either the raw data you collected or a pre-processed version of it to your server. In our example above, we would just send the text file. You can use any internet library to upload a file to your server; we use Indy FTP, since it comes with Delphi. Upload the file under a unique name (maybe a GUID, if your server accepts that format). Choosing a file name that no other user's copy of the application has already uploaded is one of the challenges you'll have. Be sure that you don't include anything in the file or file name that could be used to identify the user unless you have permission to do so and understand any legal ramifications. Decide how often to upload, maybe once a week, or once a day at a random time so that all users aren't uploading at the same time. You might also want to pre-process the file before uploading it, collapsing the data in some way to make the file smaller.
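The pieces of the upload step might look like this in Python, with the standard `ftplib` module standing in for Indy FTP. All function names, the `usage-….txt` naming scheme, and the one-day maximum delay are assumptions made for illustration. The name comes from `uuid.uuid4()`, which is generated from random numbers rather than from anything on the machine. Collapsing the log into counts makes the file smaller but discards the order of events, so only do that if you don't need the order.

```python
import random
import uuid
from collections import Counter
from ftplib import FTP
from pathlib import Path


def unique_remote_name() -> str:
    """Build a file name from a random (version 4) UUID."""
    return f"usage-{uuid.uuid4().hex}.txt"


def summarize(lines):
    """Collapse raw event lines into sorted 'name<TAB>count' lines."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return [f"{name}\t{count}" for name, count in sorted(counts.items())]


def seconds_until_upload(max_delay: float = 24 * 60 * 60) -> float:
    """Pick a random delay so that clients do not all upload at the same moment."""
    return random.uniform(0, max_delay)


def upload(local_path: Path, host: str, user: str, password: str) -> str:
    """Upload the file over FTP under a fresh unique name and return that name."""
    remote_name = unique_remote_name()
    with FTP(host) as ftp, local_path.open("rb") as data:
        ftp.login(user, password)
        ftp.storbinary(f"STOR {remote_name}", data)
    return remote_name
```

Plain FTP sends the login and the file unencrypted, which is one more thing to weigh when choosing a protocol.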

Be sure your data collection file doesn't grow too big. You should probably delete it after uploading it. Also, if the file is big, uploading it will cause a noticeable delay or freeze in your application unless you take steps to upload in the background.
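One way to keep the upload off the main thread and delete the file only after a successful transfer, sketched in Python. `upload_in_background` is a hypothetical helper, and `send` is whatever upload function you already have. The sketch assumes nothing appends to the file while it is being sent; a real application would first rename the log and start a fresh one.

```python
import threading
from pathlib import Path
from typing import Callable


def upload_in_background(log_path: Path, send: Callable[[Path], None]) -> threading.Thread:
    """Run send(log_path) on a worker thread; delete the file only if it succeeds."""

    def worker() -> None:
        try:
            send(log_path)
        except Exception:
            return  # keep the file so a later attempt can retry
        log_path.unlink(missing_ok=True)

    # daemon=True so a stalled upload never keeps the application from exiting.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The caller gets the thread back and can `join()` it at shutdown if it wants to wait for the upload to finish.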

Note that if your users have very strict firewalls or security software, sending a file like this might be prohibited and could even cause your software to be flagged as malware. You'll need to carefully consider this issue and evaluate the various ways that data can be sent over the web in a way that is safe, unobtrusive, ANONYMOUS, and allowed by various security applications. For example, you will need to understand whatever protocol you use to upload and how much information it might give your server about the user's identity (like the IP address, which might be traced to a person with the right tools or a search warrant).

Then, at your server, over weeks (or whatever time frame you choose to upload files) you will have a lot of files that your software uploaded from users' machines. These files contain the names of the forms your users loaded, or the names of the error messages they got, or elapsed times, or whatever data you collected and uploaded. You would then decide how to process that data into meaningful reports. Examining all the files, you might learn something like: 50% of our users never opened form X. Or: most users never saw error message #17, or only got error message #22 on form TForm3, or users spent an average of 45 seconds with Form4 visible.
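As a sketch of the reporting side, this Python function computes a "never opened form X" figure from a directory of uploaded logs. The function name, the `*.txt` pattern, and the one-event-per-line format are assumptions. Because the files are anonymous, the result is a share of uploads, which only equals a share of users if each user contributes exactly one file.

```python
from pathlib import Path


def share_of_uploads_without(upload_dir: Path, event: str) -> float:
    """Fraction of uploaded files in which `event` never appears on any line."""
    files = sorted(upload_dir.glob("*.txt"))
    if not files:
        raise ValueError("no uploads found")
    missing = sum(
        1
        for f in files
        if event not in f.read_text(encoding="utf-8").splitlines()
    )
    return missing / len(files)
```

A result of `0.5` for `"FormX"` would correspond to the "50% never opened form X" finding, subject to the caveat above.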

I've simplified almost everything above. For example, there are much better ways to save the collected data than appending to a text file, which might grow too big and too slow. There might also be legal or ethical issues you'll need to consider.
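One possible alternative to the growing text file, offered only as an illustration and not as what we use: keep a running count per event in a small SQLite database, so the storage stays proportional to the number of distinct events. The table name `counts` and the helper names are hypothetical, and this design drops the order of events.

```python
import sqlite3


def open_store(path: str = "usage.db") -> sqlite3.Connection:
    """Open (or create) the counter database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS counts (name TEXT PRIMARY KEY, n INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def bump(conn: sqlite3.Connection, name: str) -> None:
    """Increase the counter for `name` by one, creating the row if needed."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT OR IGNORE INTO counts (name, n) VALUES (?, 0)", (name,))
        conn.execute("UPDATE counts SET n = n + 1 WHERE name = ?", (name,))


def totals(conn: sqlite3.Connection) -> dict:
    """Return all counters as a {name: count} dictionary."""
    return dict(conn.execute("SELECT name, n FROM counts"))
```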

But this is the general idea.

This is not a casual project to put into an application that others will use unless you fully understand all the issues and design and code it well. (That, I suppose, can be said about any coding!) But, as I noted above, I've written an overview of how you might do it for, say, a homework assignment or just to explore.

RobertFrank
  • I doubt that GUID is safely anonymous. Part of their uniqueness is due to the MAC address of the machine. I suspect that GUIDs could be correlated. – Chris Thornton Jun 17 '11 at 02:09
  • Sorry. This isn't an answer; it's a lot of text, though. Congrats on the effort, but I'm afraid I have to downvote it for the actual content. – Ken White Jun 17 '11 at 02:13
  • @Chris Thornton, CoCreateGuid no longer copies Ethernet physical address into uuid. – Premature Optimization Jun 17 '11 at 02:16
  • Ken: I appreciate your not down-voting me anonymously, but your comment above is not a useful comment since it does not help me understand WHY my posting is not an answer. Yes, my answer is text. How does it not help the user? Are there guidelines at SO that state I have to post code for an answer to be valid? IMO, this answer will be helpful to the user and qualifies as an answer. It presents a general design, suggests some implementation strategies, discusses some pitfalls, and IMO, speaks to the level of the question in conceptual manner. I hope you will explain how you feel it is not. – RobertFrank Jun 17 '11 at 02:17
  • +1 to compensate a downvote. Since OP is not clear, this answer shows a good insight. – Premature Optimization Jun 17 '11 at 02:22
  • @Robert: No, you don't have to post actual code. But "you could collect any data you wanted" and "you need to transfer the data" and "Be sure your data collection file doesn't get too big" and "examining the files you might learn something" aren't really answers to the question. Despite @Downvoter's trying to punish me by opposing everything I say, I still don't think you've answered the question asked - you've just explained what the things are they might want to think about when they try and come up with a solution. "I want to improve my car. How?" "You might think about new paint." – Ken White Jun 17 '11 at 02:35
  • @Ken White, ah that's you again, Mr. Badwordfinder :) No, you aren't THAT important. This answer will remain good at least until OP comes back and makes his question specific. Plus the answer itself is quite educational. – Premature Optimization Jun 17 '11 at 03:16
  • @Ken, this question asked how to send data; Robert said he uses Indy's FTP component. The question asked how to collect data; Robert said that when an event occurs that he's interested in (giving several examples) he records information about the event in a file. How does this not answer the question? Besides, a new paint job seems like a perfect suggestion for the task of improving a car. So does replacing the air filter, or cleaning the crumbs out of the backseat. Vague questions get varied answers. – Rob Kennedy Jun 17 '11 at 03:17
  • +1 It is not Robert's fault that the question is so vague. So the answer is perfectly valid. Not a great one, but it can't be, because the question isn't either. – Runner Jun 17 '11 at 04:52
  • @Rob: If the question was too vague to be able to answer, as this one was, the poster should have been asked to make it an actual question, and if that wasn't done it should have been closed. More than a screenful of text (again, while Robert's effort was commendable) venturing guesses as to what the OP may have been really wanting to know is simply not an answer. If I posted 20 paragraphs of 'ipso lorem' in response to "where can I get test data?", with no more information about what type of data or how it would be used, would you upvote that as well? – Ken White Jun 17 '11 at 11:09
  • This isn't 20 paragraphs of lorem ipsum, @Ken. It's nine paragraphs providing a good overview of how to collect usage data. Since it's the accepted answer, it must have been what DRokie wanted. You'll notice I've edited the question, and although it doesn't really look like the original anymore, I don't think I changed the overall sense of the question at all. My biggest change was to explicitly say "I'm not sure where to begin," indicating that an overview is being requested, not specific details for a specific problem. I think it was implied before. – Rob Kennedy Jun 17 '11 at 14:55
  • @Rob: Of course I meant 'Lorem ipsum'; guess the coffee hadn't kicked in. I'm not sure I agree that it's a good answer because of the vagueness of the question asked, but I'm removing my downvote because as I've said from the beginning, it's not a *wrong* answer (not changing to an upvote, however). – Ken White Jun 18 '11 at 00:01