Hi, I am starting to learn pandas to work with text files. So far I have been using NumPy's `loadtxt`, but I am having trouble with some text files generated by a very old program (which I cannot replace):

[/home/Desktop/Flux_Calibration_Steps/stdWolf_wide_cr_f_a_bg] 1 4091 300.01 1.195  4240.778  7791.743 wolf wide
 4330.00   1.3731E-13   20.000       88809.
 4350.00   1.3480E-13   20.000      117447.
 4370.00   1.5911E-13   20.000      162742.
 4390.00   1.6972E-13   20.000      183740.
 4510.00   1.8634E-13   20.000      863055.
 4530.00   1.8641E-13   20.000     1056961.
 4550.00   1.8308E-13   20.000     1215476.
 4570.00   1.7654E-13   20.000     1352265.
[/home/Desktop/Flux_Calibration_Steps/stdF34_wide_cr_f_a_bg_] 1 4091 300.01 1.037  4241.941  7793.365 F34 broad
 4400.00   2.8298E-13   50.000     1244259.
 4450.00   2.6912E-13   50.000     1978971.
 4500.00   2.5837E-13   50.000     3862673.
 4550.00   2.4811E-13   50.000     5843749.
 4600.00   2.3832E-13   50.000     7363710.

Here the data is in the form of a 4-column table. As you can see, there are 2 longer rows. Each of these marks the start of data from a different source, which the program stacks one after another.

I would like to extract the data in the first and second columns for each of the different sources. I would also like to get the data from the header rows, if possible.

However, I do not know the Pythonic way to do this. Could anyone offer advice on how to identify the index of each "header" row without writing an explicit loop?

A few warnings:

1) The number of columns is constant in both header and data rows, but the elements may differ.
2) The number of rows may be different for the several data sources.

Thanks for any advice.

JohnE
Delosari
  • Please edit your data into your code, it's not that many lines, plus post desired output – EdChum May 20 '15 at 09:18
  • @EdChum Thanks for the comment. The data for this example is small; in a real case I could have several thousand lines. I cannot edit this data since it is generated by a program which has... very old formatting capabilities... (IRAF, if you need to know). Maybe I was not clear: I do not need working sample code, just some advice: how would you get data from a text file, such as this, where not all the columns have the same format without doing a line loop? – Delosari May 20 '15 at 09:44
  • Well `np.loadtxt` and `pd.read_table` should be able to handle this fine – EdChum May 20 '15 at 09:48
  • By the question "how would you get data from a text file, such as this, where not all the columns have the same format without doing a line loop" I'd say to take a look into the `converters` parameter of `read_csv`, but after reading your post I'm not sure that's the easiest way of getting what you want. – vmg May 20 '15 at 10:23
  • Thank you @vmg I will take a look at this function and provide a reply with the best I can do :) – Delosari May 20 '15 at 11:00
  • I deleted my answer since your comment said it's not what you want. I think I'm starting to understand the question but you really need to explicitly show the desired output so there is no confusion. – JohnE May 20 '15 at 15:17
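
For reference, here is a minimal sketch of one way the header rows could be located without an explicit Python loop. It assumes that every header row starts with `[` and that data rows never do, which matches the sample above but is not guaranteed by anything in the question. The column names (`wavelength`, `flux`, `width`, `counts`), the `block` label, and the shortened `sample` string are hypothetical and only there for illustration; for a real file the lines could be read with `open(path).read().splitlines()` instead.

```python
import pandas as pd

# Hypothetical, shortened copy of the file shown in the question.
sample = """[/home/Desktop/Flux_Calibration_Steps/stdWolf_wide_cr_f_a_bg] 1 4091 300.01 1.195  4240.778  7791.743 wolf wide
 4330.00   1.3731E-13   20.000       88809.
 4350.00   1.3480E-13   20.000      117447.
[/home/Desktop/Flux_Calibration_Steps/stdF34_wide_cr_f_a_bg_] 1 4091 300.01 1.037  4241.941  7793.365 F34 broad
 4400.00   2.8298E-13   50.000     1244259.
 4450.00   2.6912E-13   50.000     1978971.
 4500.00   2.5837E-13   50.000     3862673.
"""

# One string per line, with blank lines dropped and the index renumbered.
lines = pd.Series(sample.splitlines())
lines = lines[lines.str.strip() != ""].reset_index(drop=True)

# Assumption: a header row is any row that starts with "[".
is_header = lines.str.startswith("[")
header_index = lines.index[is_header.to_numpy()].tolist()

# The running count of headers seen so far labels each row with its source.
block_id = is_header.cumsum()

# Split the data rows on whitespace and convert them to floats.
data = lines[~is_header].str.split(expand=True).astype(float)
data.columns = ["wavelength", "flux", "width", "counts"]  # hypothetical names
data["block"] = block_id[~is_header]

# Header rows are kept as strings, one whitespace-separated token per column.
headers = lines[is_header].str.split(expand=True)

# First and second columns for each source, as NumPy arrays keyed by block.
per_source = {
    key: group[["wavelength", "flux"]].to_numpy()
    for key, group in data.groupby("block")
}
```

With this layout, `header_index` holds the row positions of the headers, `headers` holds their tokens, and `per_source[1]`, `per_source[2]`, and so on hold the first two columns of each source. The string operations still iterate internally, so this avoids a hand-written loop rather than the iteration itself.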

0 Answers