47

I am studying this snippet of python code. What does X = X[:, 1] mean in the last line?

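# Imports implied by the names used below
import statsmodels.api as sm
from statsmodels import regression

def linreg(X,Y):
    # Running the linear regression
    X = sm.add_constant(X)  # prepend a column of ones for the intercept
    model = regression.linear_model.OLS(Y, X).fit()
    a = model.params[0]  # intercept
    b = model.params[1]  # slope
    X = X[:, 1]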
Dave Mackey
Taewan

5 Answers

74
x = np.random.rand(3,2)

x
Out[37]: 
array([[ 0.03196827,  0.50048646],
       [ 0.85928802,  0.50081615],
       [ 0.11140678,  0.88828011]])

x = x[:,1]

x
Out[39]: array([ 0.50048646,  0.50081615,  0.88828011])

So that line slices the array: it takes all rows (:) but keeps only the second column (index 1). The result is a 1-D array containing that column's values.
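A quick way to check this is to compare shapes. The sketch below uses a small made-up array (the values are arbitrary and not taken from the question) and also shows that writing the column as a slice, 1:2, keeps the result 2-D instead of flattening it.

```python
import numpy as np

# Arbitrary 3x2 array, used only for illustration
x = np.arange(6).reshape(3, 2)   # [[0, 1], [2, 3], [4, 5]]

col = x[:, 1]        # an integer index drops the column axis
col_2d = x[:, 1:2]   # a slice keeps the column axis

print(col.shape)     # (3,)
print(col_2d.shape)  # (3, 1)
print(col)           # [1 3 5]
```

In the question's function, X[:, 1] therefore turns the two-column array (constant plus data) back into a flat array of the original data.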

Leb
  • No problem, glad I helped. – Leb Nov 03 '15 at 05:12
  • Your link is incorrect (it points to the Python 2.3 docs). Use [this one](http://docs.scipy.org/doc/numpy-1.10.1/reference/arrays.indexing.html#basic-slicing-and-indexing) from the `numpy` docs instead. – MattDMo Nov 03 '15 at 05:18
  • @MattDMo That one is more up to date; I was trying to find something related to Python. – Leb Nov 03 '15 at 05:22
  • The Python docs are [here](https://docs.python.org/3/library/functions.html#slice), and are linked from the numpy ones. Both are necessary to understand what's going on in the example, as the numpy syntax is different from standard Python. – MattDMo Nov 03 '15 at 05:24
  • So the important takeaway is that this is a numpy extension to Python, not standard Python (2 or 3), right? – audiodude Mar 24 '17 at 02:06
  • For readers who find this later, even more up-to-date docs are here: https://numpy.org/doc/stable/reference/arrays.indexing.html – Robi Sen Oct 06 '20 at 19:48
13

Something you should know

The term you need to search for is "slice". x[start:end:step] is the full form. Any of the three values can be omitted, in which case a default is used (for a positive step):

  • start defaults to 0,
  • end defaults to the length of the list,
  • and step defaults to 1.

Hence x[:] means the same as x[0:len(x):1]. In X[:, 1], the bare : before the comma is exactly this kind of slice, applied to the rows.
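The defaults can be checked on a plain Python list. This is a minimal sketch with a made-up list; `slice.indices` is the built-in method that reports the start, end, and step a slice resolves to for a given length.

```python
x = [10, 20, 30, 40, 50]  # arbitrary example list

full = x[:]
explicit = x[0:len(x):1]
print(full == explicit)               # True

# A bare ":" is the slice object slice(None, None, None);
# .indices(n) shows the defaults filled in for a sequence of length n.
print(slice(None).indices(len(x)))    # (0, 5, 1)

# Omitting only start and end, with a step of 2
print(x[::2])                         # [10, 30, 50]
```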

Dave Mackey
Adiraamruta
6

The meaning of X = X[:, 1] in Python is:

  • X is a dataset or an array.
  • Say X has n rows and m columns.
  • Then x = x[:, 1] takes, from every row of x, the element in the column at index 1 (the second column).

For example:

x = array([[0.69859393, 0.1042432 ],
           [0.55138493, 0.18639614],
           [0.27338772, 0.80351282]])

x[:,1] evaluates to array([0.1042432 , 0.18639614, 0.80351282])
4

The comma separates the axes: the part before it selects rows and the part after it selects columns. Both are counted from 0, so column 1 is the second column.

The syntax is x[row_index, column_index]

You can also give a range of rows in row_index. For example, 0:13 extracts the first 13 rows (1:13 would skip the first row and return only 12), restricted to whatever is specified for the columns.
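Because indices start at 0 and the end of a range is exclusive, it is worth checking how many rows a range actually returns. A small sketch, assuming a made-up 20x2 array:

```python
import numpy as np

# Arbitrary 20x2 array: row i is [2*i, 2*i + 1]
x = np.arange(40).reshape(20, 2)

first_13 = x[0:13, 1]    # rows 0..12, column 1
skip_first = x[1:13, 1]  # rows 1..12, column 1

print(first_13.shape)    # (13,)
print(skip_first.shape)  # (12,)
print(first_13[0], skip_first[0])  # 1 3
```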

Dave Mackey
  • Since indexing starts at 0, 1:13 extracts only 12 rows, starting at the 2nd element. To extract the first 13 rows, you should use the expression 0:13. – Manuel Romeiro Nov 03 '19 at 23:27
3

x[:,1] is 2-D slicing of the form x[row_index, column_index]: the : selects every row and the 1 selects the column at index 1.
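Under the hood, the comma builds a tuple that is passed to the array as a single index. NumPy arrays accept such a tuple, while built-in lists do not. The sketch below uses a made-up array to show both cases, plus a list comprehension as one possible plain-Python equivalent.

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])  # arbitrary example data

# x[:, 1] hands NumPy the tuple (slice(None), 1) as the index
same = x[(slice(None), 1)]
print(np.array_equal(x[:, 1], same))    # True

# A nested list does not support tuple indices
nested = x.tolist()
try:
    nested[:, 1]
except TypeError as exc:
    print("list indexing failed:", exc)

# One plain-Python way to pull out the same column
col = [row[1] for row in nested]
print(col)                              # [2, 4, 6]
```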