0

Original question

In nearly all the examples I've seen, people first create an instance of a class, ss = StandardScaler(), and only then call methods on that instance, ss.fit_transform(df), rather than calling the method on the class itself: StandardScaler().fit_transform(df).

Is this because of:

  1. There are cases that would throw an error otherwise.
  2. There are cases that don't throw an error but produce different results (scary!)
  3. Prevents repetition of code (but it's ok if it's used only once.)
  4. It's better to do just one thing on one line of code.
  5. Aesthetics & opinion.
  6. Some other reason, please let me know!

Some answers thus far

Thank you for the answers that raised many clarifying points, here's some as I understand them. Please correct me if I'm mistaken.

Potential reasons I suggested for making an instance first:

There are cases, which would throw an error otherwise.

  • Thomas Weller's answer below states that there shouldn't be, since calling the method on the class creates a temporary instance - it just doesn't get stored in a variable.

There are cases, which don't throw an error, but produce different results (scary!)

  • Thomas Weller's answer below states that there shouldn't be, since calling the method on the class creates a temporary instance.

It's ok to call on the class itself, if it's used only once.

  • This seems to be true, as there is no reason to store the instance in a variable and repetition is not a problem.

It's better to do just one thing on one line of code.

  • Readability is more important than doing just one thing per line. In my opinion, both versions are just as clear and readable.

Aesthetics & opinion

  • There's some of these involved as well.

Some other reason, please let me know!

  • Of course object oriented programming is useful in many ways, but my question concerned only the isolated use of a class and a method someone else has already programmed for me.
  • My question wasn't about whether or not you can pass parameters to the class or the method - my example actually does this: np.random.default_rng(0).integers(10, size=(4,5))

Code Example

import numpy as np
from sklearn.preprocessing import StandardScaler

# Here I'm using .integers() without storing the generator in a variable first
int_array1 = np.random.default_rng(0).integers(10, size=(4,5))

# Here I'm using .fit_transform() without storing the scaler in a variable first
int_array2 = StandardScaler().fit_transform(int_array1)

# This time instantiating before using for comparison
rng = np.random.default_rng(0)
int_array3 = rng.integers(10, size=(4,5))
ss = StandardScaler()
int_array4 = ss.fit_transform(int_array3)

print(int_array1)
print(int_array2)
print(int_array3)
print(int_array4)

Output has the same results regardless of instantiation.

[[8 6 5 2 3]
 [0 0 0 1 8]
 [6 9 5 6 9]
 [7 6 5 5 9]]
[[ 0.88354126  0.22941573  0.57735027 -0.72760688 -1.70856429]
 [-1.68676059 -1.60591014 -1.73205081 -1.21267813  0.30151134]
 [ 0.2409658   1.14707867  0.57735027  1.21267813  0.70352647]
 [ 0.56225353  0.22941573  0.57735027  0.72760688  0.70352647]]
[[8 6 5 2 3]
 [0 0 0 1 8]
 [6 9 5 6 9]
 [7 6 5 5 9]]
[[ 0.88354126  0.22941573  0.57735027 -0.72760688 -1.70856429]
 [-1.68676059 -1.60591014 -1.73205081 -1.21267813  0.30151134]
 [ 0.2409658   1.14707867  0.57735027  1.21267813  0.70352647]
 [ 0.56225353  0.22941573  0.57735027  0.72760688  0.70352647]]
Mikko Haavisto

3 Answers

0

This answer is more of a general perspective, since this is more of a generic question. I don't see how your question relates to the code you posted.

Think about the case where you have a function definition that needs to be used for different purposes: without a class you can't achieve that. A class gives you initialization (__init__) and a scope for variables.

Above all, it is a matter of good design, and it also helps with debugging.

Sreevathsabr
0

and only after that using its methods: ss.fit_transform(df).

There are two main types of methods: class methods and (normal) instance methods.

Only class methods (and static methods) can be called without an instance (an object of that class).

Even if this - ss = StandardScaler() - looks empty to you, it's not, it just uses default values for the parameters. But it's customisable and you can have multiple Scalers that differ a bit in what they do.

With objects, some things can be stored between calls, and that's the case here as well: e.g. the scaler stores the number of samples it has seen, and partial_fit returns self (probably to allow method chaining). See the StandardScaler.partial_fit source code.
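To make the point about stored state concrete, here is a minimal sketch. `RunningMean` is a hypothetical toy class, not scikit-learn code; it only mimics the idea of an object that counts the samples it has seen and returns self from `partial_fit`.

```python
class RunningMean:
    """Toy stand-in for a stateful estimator (hypothetical, not scikit-learn)."""

    def __init__(self):
        self.n_samples_seen = 0
        self.mean = 0.0

    def partial_fit(self, values):
        # Update the stored mean incrementally with a new batch of values.
        for v in values:
            self.n_samples_seen += 1
            self.mean += (v - self.mean) / self.n_samples_seen
        return self  # returning self allows method chaining

    def transform(self, values):
        # Center the values using the mean learned so far.
        return [v - self.mean for v in values]


# A kept instance accumulates state across calls.
kept = RunningMean()
kept.partial_fit([1.0, 2.0, 3.0]).partial_fit([4.0, 5.0, 6.0])
print(kept.n_samples_seen, kept.mean)

# A fresh instance only knows about the batch it was given.
fresh = RunningMean().partial_fit([4.0, 5.0, 6.0])
print(fresh.n_samples_seen, fresh.mean)
```

If the instance is thrown away after a single chained call, whatever it learned is lost with it, which is why a fitted object is usually kept in a variable when it will be reused.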

h4z3
-1

You have a misunderstanding. Both versions of your code create an instance:

ss = StandardScaler()
ss.fit_transform(df)

as well as

StandardScaler().fit_transform(df)

The variable name (ss) has nothing to do with creating an instance. The parentheses (()) after the class name are responsible for creating the instance.

Code without creation of an instance would look like

StandardScaler.fit_transform(df)
#             ^^ note the missing parentheses

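Methods that can be called like this, without an instance, are static methods (or class methods). fit_transform is an instance method, so this particular call would fail.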
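The difference between calling a method on the class and calling it on an instance can be sketched with a toy class. `Demo` and its method names are hypothetical and chosen only for illustration.

```python
class Demo:
    def instance_method(self):
        return "instance"

    @classmethod
    def class_method(cls):
        return cls.__name__

    @staticmethod
    def static_method():
        return "static"


# These work on the class itself, no instance needed.
print(Demo.class_method())
print(Demo.static_method())

# The parentheses after Demo create a temporary instance.
print(Demo().instance_method())

# Without parentheses there is no instance to pass as self.
caught = None
try:
    Demo.instance_method()
except TypeError as exc:
    caught = exc
    print("TypeError:", exc)
```

Only the last call fails, because an instance method expects an instance as its first argument.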

Some other reason, please let me know!

You want an object to live longer if it holds state, i.e. contents inside the object that change over time.

The example you posted is perfect for demonstrating this, you just didn't do it right:

import numpy as np
# No variable assignment
print(np.random.default_rng(0).integers(10, size=(1, 5)))
print(np.random.default_rng(0).integers(10, size=(1, 5)))
print("-"*10)
# Variable assignment
rng = np.random.default_rng(0)
print(rng.integers(10, size=(1, 5)))
print(rng.integers(10, size=(1, 5)))

That way you can demonstrate that the random number generator has state. And that state ensures that it generates new random numbers on the next call.

Possible output:

[[8 6 5 2 3]]
[[8 6 5 2 3]]
----------
[[8 6 5 2 3]]
[[0 0 0 1 8]]

There are cases, which would throw an error otherwise.

This should not happen. The code

ClassName().method()

also creates an instance, it just has no variable name assigned. It's like doing

temp = ClassName()
temp.method()
del temp            # the variable is gone here

There are cases, which don't throw an error, but produce different results (scary!)

This should not happen for the same reason as before.

Prevents repetition of code (but it's ok, if its used only once.)

As you say: the constructor runs once when you assign the instance to a variable. If you need the object more than once and don't assign it to a variable, the constructor may end up running multiple times.

When calling the same constructor over and over, DRY (the clean code principle "don't repeat yourself") becomes a reason, yes.
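As a rough illustration of the repeated-constructor point, the sketch below counts how often a constructor runs. `Expensive` is a hypothetical class made up for this example.

```python
class Expensive:
    constructions = 0  # class-level counter shared by all instances

    def __init__(self):
        Expensive.constructions += 1

    def work(self, x):
        return x * 2


# Without a variable: a new instance is constructed for every use.
results_a = [Expensive().work(i) for i in range(3)]
calls_without_variable = Expensive.constructions

# With a variable: the constructor runs once and the instance is reused.
Expensive.constructions = 0
worker = Expensive()
results_b = [worker.work(i) for i in range(3)]
calls_with_variable = Expensive.constructions

print(calls_without_variable, calls_with_variable)
print(results_a == results_b)
```

The results are identical here because `work` does not depend on any state; only the number of constructor calls differs.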

It's better to do just one thing on one line of code.

I don't think there's such a rule. List comprehensions are often the opposite.

Aesthetics & opinion.

Always :-)

Thomas Weller
  • "There are cases, which would throw an error otherwise." "this should not happen" - What? If you call an instance method without an instance, you WILL get an error, that's a fact. (Because of Python's duck typing you might pass something behaving like the instance, but that's exception and it's not really recommended to pass non-instance as `self`.) – h4z3 Mar 17 '21 at 17:27
  • @h4z3 the code `ClassName().method()` does create an instance and thus it's possible to call the instance method. It just does not assign it to a variable. – Thomas Weller Mar 17 '21 at 17:32
  • Ah. The OP mentioned "calling the method on the class itself", so I assumed no instance. – h4z3 Mar 18 '21 at 10:09