I am using Orange (in Python) for some data mining tasks, specifically clustering. Although I have gone through the tutorial and read most of the documentation, I still have a problem. All the examples in the docs and tutorials assume that I already have a tab-delimited file with data in it. However, nothing explains how to create a new table from scratch. For example, I want to create a table of word frequencies across different documents.

Maybe I am missing something so if anyone has any insight it'd be appreciated.

Thanks George

EDIT:

This is how I create my table

import Orange

# First construct the domain object (the table's top row).
# `variables` holds the column names and `matrix` the rows of values.
features = []
for var in variables:
    features.append(Orange.data.variable.Continuous(str(var)))
# The second argument indicates that the last attribute must not be treated as a class.
classed = False
domain = Orange.data.Domain(features, classed)
# Add the data rows, assuming we already have a matrix.
t = Orange.data.Table(domain, matrix)
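
For anyone wondering where `variables` and `matrix` come from in the word-frequency case, here is a minimal sketch using only the standard library. The helper name `word_frequency_matrix` and the lowercase, whitespace-based tokenization are assumptions made for illustration, not part of Orange or of my actual code.

```python
from collections import Counter


def word_frequency_matrix(documents):
    """Return (variables, matrix) for a list of document strings.

    variables: sorted list of every distinct word (the column names)
    matrix:    one row of counts per document, ordered like variables
    """
    # Count words per document; splitting on whitespace is a simplification.
    counts = [Counter(text.lower().split()) for text in documents]
    # The vocabulary is the union of the words seen in any document.
    variables = sorted(set().union(*counts))
    # A Counter returns 0 for words a document does not contain.
    matrix = [[c[word] for word in variables] for c in counts]
    return variables, matrix


documents = ["the cat sat", "the dog sat on the cat"]
variables, matrix = word_frequency_matrix(documents)
```

The resulting `variables` and `matrix` can then be passed to the domain and table construction above.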
tswei
George Eracleous
  • hear, hear! best stuff I found was at: http://orange.biolab.si/doc/reference/Orange.data.table/ . Part of why I switched to ``pandas -> R`` (ugh!) as my workflow. – Gregg Lind Jan 24 '12 at 19:54
  • Orange document is very good now. See https://orange3.readthedocs.io/projects/orange-data-mining-library/en/latest/tutorial/data.html – H.C.Chen Aug 25 '21 at 02:59
  • You can use `Orange.data.Table.from_numpy`, which would be suitable here. – vijolica Aug 26 '21 at 07:10
  • To add a quick helper: if you go to the "Data" tab on the widgets sidebar and select "Python Script", the default (on 3.30 at least) is already filled out with a Python script that creates a table from numpy. – stellarpower Nov 14 '21 at 23:24

2 Answers

This took me hours to figure out. In Python, do this:

import Orange
List, Of, Column, Variables = [Orange.feature.Discrete(x) for x in ['What','Theyre','Called','AsStrings']]
Domain = Orange.data.Domain([List, Of, Column, Variables])
Table = Orange.data.Table(Domain)
Table.save('NewTable.tab')

I'd tell you what each bit of code does, but as of now I'm not really sure. It's funny that such a powerful toolkit should have such hard-to-understand documentation, but I suspect it's because its entire user base has doctorates.

N. McA.
  • Well, I don't have a doctorate, yet I use Orange. But I agree with you. The documentation is amazingly complicated. The problem is that sometimes Orange makes life so much harder than it needs to be :). I actually managed to find the solution myself but forgot to post it here. I am going to do that now, but I'll select your answer :) – George Eracleous Jul 05 '12 at 10:01
  • We'd appreciate suggestions (or even better, pull-requests with edits) to make it better! – vijolica Aug 26 '21 at 07:09

The documentation is indeed insufficient, if you ask me. This may not be the answer to the question, but it could be helpful to someone else. I tried for hours to create a Table using constructors and Domains and whatnot, just for an association rule mining task, and finally found out that the easiest way to create a table is simply to write your data to a file with the extension .tab or .basket and create a table from that.

Orange.data.Table("yourFile.basket")

Of course the structure of the file needs to be correct. See the provided example files located in the Orange package directory inside datasets/
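
As a rough sketch of the .tab route for word frequencies, the snippet below writes such a file with plain Python. It assumes the three-row header layout as I understand it from the bundled example files (row 1: column names, row 2: types, row 3: flags such as `meta` or `class`); please check it against the files in datasets/ for your Orange version. The helper name `write_word_freq_tab`, the `document` column, and the sample data are made up for illustration.

```python
import os
import tempfile


def write_word_freq_tab(path, doc_names, words, matrix):
    """Write document names and word counts as a tab-delimited .tab file."""
    header = [
        ["document"] + list(words),                # row 1: column names
        ["string"] + ["continuous"] * len(words),  # row 2: column types
        ["meta"] + [""] * len(words),              # row 3: flags (blank = plain feature)
    ]
    body = [
        [name] + [str(value) for value in row]
        for name, row in zip(doc_names, matrix)
    ]
    with open(path, "w") as f:
        for row in header + body:
            f.write("\t".join(row) + "\n")


# Hypothetical sample data: two documents, two words.
path = os.path.join(tempfile.gettempdir(), "word_freq.tab")
write_word_freq_tab(path, ["doc1", "doc2"], ["cat", "dog"], [[1, 0], [2, 3]])
```

The file at `path` can then be handed to `Orange.data.Table(...)` in the same way as the .basket file above.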