
I am sure this is not hard, but I can't figure it out!

I want to create a dataframe that starts at 1 for the first row and ends at 100,000 in increments of 1, 2, 4, 5, or whatever. I could do this in my sleep in Excel, but is there a slick way to do this without importing a .csv or .txt file?

I have needed to do this in variations many times and just settled on importing a .csv, but I am tired of that.

Example in Excel

Clay
  • What are you having trouble with? Creating a `DataFrame()`? Assigning to some `df['column'] = ` of it? Coming up with a way to `range()` over some numbers? Setting the step size for such a range? Putting those three things together? Please share some information about what you found and tried, with an example of code you expected to work and a brief description what the problem was, or the error message that stumped you. – Grismar Feb 19 '22 at 22:38
  • Use `list(range(1, 100001))` to create a list, then convert that list to a dataframe. – Barmar Feb 19 '22 at 23:14
  • I didn't know about the range in numpy or that I could assign the range to a df. Amirhossein Kiani showed me exactly what I wanted to know that I didn't know. – Clay Feb 20 '22 at 23:29

1 Answer

Generating numbers

Generating numbers is not specific to pandas; the numpy module or the built-in range function (as mentioned by @Grismar) can do the trick. Let's say you want to generate a series of numbers and assign them to a dataframe. There are multiple approaches, two of which I personally prefer.

  • range function

Take range(1,1000,1) as an example. The function accepts up to three arguments, of which only the end value is mandatory: the first argument is the start, the second is the end, and the third is the step size. The example above therefore produces the numbers 1 to 999 (note that the range is a half-open interval: closed at the start and open at the end).

  • numpy.arange function

To get the same result as the previous example, use numpy.arange(1,1000,1). The arguments have the same meaning as the arguments of range.
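As a quick check of the half-open behaviour described above, here is a minimal sketch comparing the two functions. The variable names are only illustrative.

```python
import numpy as np

# Both calls describe the half-open interval [1, 1000): 1 is included, 1000 is not.
r = range(1, 1000, 1)
a = np.arange(1, 1000, 1)

print(r[0], r[-1], len(r))  # 1 999 999
print(a[0], a[-1], len(a))  # 1 999 999

# With a step of 5 starting at 1, the values are 1, 6, 11, ...
stepped = list(range(1, 21, 5))
print(stepped)  # [1, 6, 11, 16]
```

To include the end value itself, pass end + 1 as the second argument, for example range(1, 1001) for 1 to 1000.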

Assigning to dataframe

Now, if you want to assign these numbers to a dataframe, you can easily do this with the pandas module. The code below is an example of how to generate a dataframe:

import numpy as np
import pandas as pd
myRange = np.arange(1,1001,1) # 1 to 1000 inclusive; range(1,1001,1) would work as well
df = pd.DataFrame({"numbers": myRange})
df.head(5)

which results in a dataframe like this (note that only the first five rows are shown):

numbers
0 1
1 2
2 3
3 4
4 5
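The same pattern should cover the case from the question (1 to 100,000 with a chosen increment). The following is a sketch under the assumption that a step of 5 is wanted; `step`, `numbers`, and `df_steps` are illustrative names, not part of the original code.

```python
import numpy as np
import pandas as pd

step = 5  # assumed increment; use 1, 2, 4, or whatever is needed

# The end value is exclusive, so 100_001 allows 100,000 to be included
# whenever the step happens to land on it.
numbers = np.arange(1, 100_001, step)
df_steps = pd.DataFrame({"numbers": numbers})

print(df_steps.head(3))
print(len(df_steps))                   # 20000
print(df_steps["numbers"].iloc[-1])    # 99996
```

Note that with a step of 5 starting at 1, the last value is 99,996 rather than 100,000, because the sequence only contains values of the form 1 + 5k.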

Difference of numpy.arange and range

To keep this answer short, I'd rather refer to this answer by @hpaulj
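For a rough idea of the practical differences, here is a small sketch. It is not taken from the linked answer and only illustrates a few well-known points: range is a lazy sequence of Python integers, while numpy.arange returns an array that supports element-wise arithmetic and non-integer steps.

```python
import numpy as np

r = range(1, 6)
a = np.arange(1, 6)

print(type(r).__name__)  # range
print(type(a).__name__)  # ndarray

# Arithmetic on the array is applied element-wise.
doubled = a * 2
print(doubled.tolist())  # [2, 4, 6, 8, 10]

# A range needs an explicit loop or comprehension for the same result.
doubled_list = [x * 2 for x in r]
print(doubled_list)      # [2, 4, 6, 8, 10]

# numpy.arange accepts a float step; range only accepts integers.
half_steps = np.arange(0, 2, 0.5)
print(half_steps.tolist())  # [0.0, 0.5, 1.0, 1.5]
```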

TheFaultInOurStars