I need some artificial datasets, namely "Two-Spiral", "Cluster-inside-Cluster", "Half-Kernel", "Crescent-Full-Moon", and "Outlier", for machine learning purposes.

Artificial Data

Is there a guide, package, or source code for generating them in MATLAB?

Brian Tompsett - 汤莱恩
BlueBit
  • Do you have full-sized pictures of your data samples? If so, you can `imread` each image, loop over its `x` and `y` coordinates, and separate the points into clusters by color threshold. – Larry Foobar Apr 22 '13 at 13:10
  • If you can define some of those shapes with a function (pretty easy for the circles, etc.), then you can use the Monte Carlo approach from this solution: http://stackoverflow.com/questions/16098209/point-cloud-generation/16098613#16098613 – Dan Apr 22 '13 at 13:37
  • How can I separate points into different clusters with `imread`? – BlueBit Apr 22 '13 at 15:33
  • Also, for some of the datasets, such as (c), (d), and (f), I do not have a picture. – BlueBit Apr 22 '13 at 15:35

1 Answer

Because I thought it would be useful to have this kind of dataset available, and because it would be a fun exercise, I wrote some functions that generate random datasets very similar to the ones shown in your picture. There are a number of options to control the number of instances, the amount of noise, etc. The output of each function is an Nx3 matrix, where each row contains the X and Y coordinates and the class of an instance.

This is what the output looks like:

Example of generated datasets

I wrote six scripts of 30 to 40 lines each. I uploaded them to the MATLAB File Exchange, but the submission hasn't been reviewed yet. For now, you can get the files here. There are barely any comments in this first version, but I hope the code is self-explanatory. There is also a demo script (`datasetsdemo.m`) that runs all the scripts and produces the image shown above.
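As a rough illustration of the Nx3 output format described above, here is a minimal two-spiral sketch. It is not one of the uploaded scripts: the variable names (`nPerClass`, `turns`, `noise`) and their values are assumptions chosen for this example, and the spiral shape (Archimedean, second arm rotated by 180 degrees) is just one plausible choice.

```matlab
% Hypothetical two-spiral generator (illustrative sketch, not the uploaded code).
% Result: data is an N-by-3 matrix with columns [X, Y, class], class in {0, 1}.

nPerClass = 500;   % points per spiral arm (assumed value)
turns     = 1.5;   % number of full turns per arm (assumed value)
noise     = 0.3;   % standard deviation of the Gaussian jitter (assumed value)

% Angle along the arm. Taking sqrt of a uniform sample spreads the points
% roughly evenly along the arm instead of crowding them near the center.
t = sqrt(rand(nPerClass, 1)) * turns * 2 * pi;

% Archimedean spiral: the radius grows linearly with the angle.
r   = t;
arm = [r .* cos(t), r .* sin(t)];

% The first arm is the spiral itself; the second is the same spiral rotated
% by 180 degrees (negated coordinates). Each arm gets independent jitter.
spiralA =  arm + noise * randn(nPerClass, 2);
spiralB = -arm + noise * randn(nPerClass, 2);

% Stack both arms and append the class label as the third column.
data = [spiralA, zeros(nPerClass, 1); ...
        spiralB, ones(nPerClass, 1)];
```

To inspect the result, something like `scatter(data(:,1), data(:,2), 10, data(:,3), 'filled'); axis equal` colors the points by class. Raising `noise` makes the two arms overlap more and the classification problem harder.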

Junuxx