DataFrame.to_string() uses one extra white space between columns for sign alignment. How to remove it?

INFORMATION

  • I have a fixed-width file, adb.dat, in which every row is 93 characters long. Contents of adb.dat:
20230731;                                                                                     
7914785201501236AAA    TEST TEST                     94481873                                
7914785201502341AAA    TEST TEST                     94481873                                
7914785201503455AAA    TEST TEST                     94481873                                
6736705201501232AAA    TEST TEST                     94481873                                
6736705201502347AAA    TEST TEST                     94481873                                
00000005; 

But when I call DataFrame.to_string() and write the resulting string to a new file, each row is 94 characters long.

First try

According to pandas-dev issue #571 (https://github.com/pandas-dev/pandas/issues/571), this is a justification problem. I followed that discussion and tried the approach below, but I did not get the result I expected.

Here is my code

    df = pd.read_fwf(
        py_file_fullnm,
        skiprows=1,
        skipfooter=1,
        header=None,
        names=['content1', 'content2'],
        colspecs=[(0, 16), (16, 93)],
        delimiter="\0",
        skipinitialspace=True,
        dtype='string',
    )

    data=df.to_string(index=False,header=None,justify='right')
    print(data)
    with open('testing.dat','w') as editor:
        editor.write("20230731;  ")
        editor.write("\n"+data)
        editor.write("\n00000007;   ")

Second try

According to the question "Remove the automatic two spaces between columns that Pandas DataFrame.to_string inserts", we can use

    df.iloc[0].fillna('').str.strip().str.cat(sep=' ')

to remove the space between columns. It works, but each row becomes shorter than in the first try: the row length is 61.

Here is my code

    df = pd.read_fwf(
        py_file_fullnm,
        skiprows=1,
        skipfooter=1,
        header=None,
        names=['content1', 'content2'],
        colspecs=[(0, 16), (16, 93)],
        delimiter="\0",
        skipinitialspace=True,
        dtype='string',
    )

    with open('testing.dat','w') as editor:
        editor.write("20230731;  ")
        for i in range (0,5):
            data=df.iloc[i].fillna('').str.strip().str.cat(sep='')
            editor.write("\n"+data)
        editor.write("\n00000005;   ")

Is there any method I can use to remove the extra space between columns while keeping the same row length? Thank you.

First try Result

20230731;                                                                                     
7914785201501236 AAA    TEST TEST                     94481873                                
7914785201502341 AAA    TEST TEST                     94481873                                
7914785201503455 AAA    TEST TEST                     94481873                                
6736705201501232 AAA    TEST TEST                     94481873                                
6736705201502347 AAA    TEST TEST                     94481873                                
00000005; 

Second try Result

20230731;  
7914785201501236AAA    TEST TEST                     94481873
7914785201502341AAA    TEST TEST                     94481873
7914785201503455AAA    TEST TEST                     94481873
6736705201501232AAA    TEST TEST                     94481873
6736705201502347AAA    TEST TEST                     94481873
00000005;   

My expected result

20230731;                                                                                     
7914785201501236AAA    TEST TEST                     94481873                                
7914785201502341AAA    TEST TEST                     94481873                                
7914785201503455AAA    TEST TEST                     94481873                                
6736705201501232AAA    TEST TEST                     94481873                                
6736705201502347AAA    TEST TEST                     94481873                                
00000005; 
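
To compare the three results above, the row lengths can be checked with a small sketch like the following. The sample row is rebuilt from the data shown in the question; the helper name `row_lengths` and the three variable names are hypothetical and used only for illustration.

```python
def row_lengths(text):
    """Return the length of every line in text, newline excluded."""
    return [len(line) for line in text.split("\n")]


# One data row from the question, split at the colspecs (0, 16) and (16, 93).
content1 = "7914785201501236"
content2 = "AAA" + " " * 4 + "TEST TEST" + " " * 21 + "94481873"

# Expected result: second field padded to 77 characters, no separator.
expected_row = content1 + content2.ljust(77)
# First try: same fields, but with one separator space between the columns.
first_try_row = content1 + " " + content2.ljust(77)
# Second try: fields stripped, so the trailing padding is lost.
second_try_row = content1 + content2.strip()

lengths = row_lengths("\n".join([expected_row, first_try_row, second_try_row]))
print(lengths)
```

The three lengths correspond to the 93, 94 and 61 characters described in the question.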
  • Is there a difference between your second attempt and the final result? – mozway Dec 05 '22 at 15:49
  • Yes. In my expected result each row looks like '7914785201501236AAA TEST TEST 94481873' followed by around 30 spaces, but in the second result it is '7914785201501236AAA TEST TEST 94481873' with no trailing spaces, so their lengths are different. – John Ng Dec 05 '22 at 17:13
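
One possible way to get rows of exactly 93 characters, sketched here without DataFrame.to_string(), is to pad each field to the width given by its colspec and join the fields with an empty separator. This is a minimal sketch under assumptions, not a confirmed fix: the helper `pad_row` and the constant `WIDTHS` are hypothetical names, and the widths 16 and 77 are derived from the colspecs (0, 16) and (16, 93) in the question.

```python
# Hypothetical field widths, derived from the colspecs in the question:
# (0, 16) -> 16 characters, (16, 93) -> 77 characters.
WIDTHS = (16, 77)


def pad_row(fields, widths=WIDTHS):
    """Left-justify each field to its width and join with no separator.

    Fields longer than their width are truncated so the row length stays fixed.
    """
    return "".join(
        str(field).ljust(width)[:width] for field, width in zip(fields, widths)
    )


# Two sample rows from the question, with the second field already stripped.
content2 = "AAA" + " " * 4 + "TEST TEST" + " " * 21 + "94481873"
rows = [
    ("7914785201501236", content2),
    ("6736705201501232", content2),
]

lines = [pad_row(row) for row in rows]
for line in lines:
    print(repr(line), len(line))
```

With the DataFrame from the question, the same helper could presumably be applied row by row, for example `df.fillna('').apply(pad_row, axis=1)`, and the resulting strings written to the file one per line. Because the padding is restored after the fact, it should not matter whether the fields were stripped earlier.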
