How can I access the data as a dataframe using Python in IBM SPSS Modeler?

So far, the only thing I have found is a piece of code that is not very clear.

Example: SPSS modeler Extension Transform - Python

As you probably know, R in IBM SPSS Modeler works completely differently and is easier to use.

If you want to perform some analytics on a dataframe, all you have to do is assign it to a variable called "modelerData".

What does this look like in Python? Is there an easier way to work with data inside IBM SPSS Modeler?

My common scenario is:

First node (Source node): Database node, with the data imported using SQL.

Second node: Transform node, where I perform some data manipulation, etc.

Do I have to use the block of code I've attached above every time?

I would be very grateful for any help!

Philipp
Filip

1 Answer

You can use the 'Extension Transform' node within 'Record Operations' and then select 'Python for Spark' in the Syntax tab.

[Screenshot of Extension Transform node]
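As a rough sketch of what could go into that Syntax tab: the `spss.pyspark.runtime` context calls below (`getContext`, `isComputeDataModelOnly`, `getSparkInputSchema`, `setSparkOutputSchema`, `getSparkInputData`, `setSparkOutputData`) are the ones I know from the Python for Spark scripting interface, while `apply_transform` and `my_transform` are hypothetical helper names made up for this example. It assumes the transform does not add or remove fields; if yours does, you would need to build the output schema yourself.

```python
def apply_transform(ctx, transform):
    """Read the node's input from ctx, apply transform, pass the result on.

    ctx is expected to be the object returned by
    spss.pyspark.runtime.getContext(); transform is any function that
    takes a Spark DataFrame and returns one with the same fields.
    """
    if ctx.isComputeDataModelOnly():
        # Modeler is only asking which fields the node outputs, not for data.
        # Assumption: the transform leaves the field list unchanged.
        ctx.setSparkOutputSchema(ctx.getSparkInputSchema())
        return None
    df = ctx.getSparkInputData()      # Spark DataFrame from the upstream node
    result = transform(df)
    ctx.setSparkOutputData(result)    # hand the result to the downstream node
    return result


def my_transform(df):
    # Hypothetical manipulation: drop every row that contains a null value.
    return df.dropna()


try:
    # This module is only importable inside SPSS Modeler.
    import spss.pyspark.runtime
except ImportError:
    pass
else:
    apply_transform(spss.pyspark.runtime.getContext(), my_transform)
```

So the boilerplate does not go away entirely, but it can be reduced to one small wrapper, and only `my_transform` changes from node to node. The data arrives as a Spark DataFrame; if you prefer pandas, `df.toPandas()` is a standard PySpark method, but the result has to be converted back to a Spark DataFrame before it is handed to `setSparkOutputData`.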

Other ways to work with data in SPSS Modeler (after importing the dataset using the nodes on the Sources tab) are the various nodes available on the Record Ops and Field Ops tabs. These include nodes that filter, sort, merge, append, derive/calculate new fields, bin, etc.

[Screenshot: example of Record Ops nodes]

Siddhi