0

first of all, thanks to read my post. I hope you guys can help me, I'm really new in Python, sorry maybe the answer is really easy.

I read several posts to add [CR][LN] in all lines but the main issue I have in my script ( I don't create that ), is the need to integrate [CR][LN] in all the lines.

At the moment the script only adds [LN] but not the [CR]. The script goes to SQL to extract some tables, convert the information to CSV ( at this moment the information maintains [CR][LN] ), and after that convert to JSON ( in this step only give me the [LN].

import pyodbc
import fileinput
import csv
import pandas as pd
import json
import os
import sys

conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=TEST;'
                      'UID=test;'
                      'PWD=12345;'
                      'Database=TEST;'
                      'Trusted_Connection=no;')
cursor = conn.cursor()

query = "SELECT * FROM placeholder"


with open(r"D:\Test.txt") as file:
    lines = file.readlines()
    print(lines)


for user_input in lines:

    result = query.replace("placeholder", user_input)
    print(result)
    sql_query = pd.read_sql(result,conn)
    df = pd.DataFrame(sql_query)
    user_inputs =  user_input.strip("\n")
    filename = os.path.join('D:\\', user_inputs + '.csv')
    df.to_csv (filename, index = False, encoding='utf-8', sep = '~', quotechar = "`", quoting=csv.QUOTE_ALL)
    print(filename)
    filename_json = os.path.join('D:\\', user_inputs + '.jsonl')
    csvFilePath = (filename)
    jsonFilePath = (filename_json)
    print(filename_json)
    df_o = df.astype(str)
    df_o = df_o.applymap(lambda x: x.strip() if isinstance(x, str) else x)
    df_o.to_json(filename_json, orient = "records",  lines = bool, date_format = "iso", double_precision = 15, force_ascii = False, date_unit = 'ms', default_handler = str)

dir_name = "D:\\"
test = os.listdir(dir_name)

for item in test:
    if item.endswith(".csv"):
        os.remove(os.path.join(dir_name, item)) 

cursor.close()
conn.close()

So, I don't know where I need to add this instruction.

Again thanks so much for all you guys always helping me !!!

Kind regards.

Carlos
  • 9
  • 1
  • "carriage-return" is Python literal `'\r'`, "line-feed" is the well-known `'\n'`. – Joffan Apr 22 '22 at 17:43
  • Not sure what `lines=bool` will evaluate to in your call to to_json. You might try `lines=True` and see if that gives what you want. – bfris Apr 22 '22 at 17:58
  • Is this only an issue with the CSV being written via `df.to_csv(...)`? – tdelaney Apr 22 '22 at 17:59
  • I try with True, but the results are the same only LN ... – Carlos Apr 22 '22 at 18:07
  • Can you post a test script that demonstrates the problem that we can run? If there is a point where `\r\n` is maintained but then there is a problem (you say CSV is okay, but JSON is not) then write a test script hardcoding the good stuff and just the code where everything goes wrong. – tdelaney Apr 22 '22 at 18:10
  • Thanks for your reply @tdelaney, for do that I need to create a new script because as you can see all are concatenated. The problem is in the line: df_o.to_json(filename_json, orient = "records", lines = bool, date_format = "iso", double_precision = 15, force_ascii = False, date_unit = 'ms', default_handler = str) I think I need use the replace method ??? – Carlos Apr 22 '22 at 18:20
  • Okay, so just guessing, but maybe you are referring to the newlines between records in the json output. You can open the output file yourself, giving it a newline ending, and then you'll get the `\r\n`: `df_o.to_json(open("filename_json", "w", newline="\r\n"), ...)` – tdelaney Apr 22 '22 at 18:21
  • @Carlos - yes, you would need to write a new script, but that's a good thing. A test script that just focuses on your problem that we can run. It shouldn't be very long. I am guessing that you are referring to the newlines between records in the json file. So, sql, csv, etc are not part of the problem. – tdelaney Apr 22 '22 at 18:23
  • To be clear - Is your problem with the newline separators in the JSON file? – tdelaney Apr 22 '22 at 18:24
  • @tdelaney Amazing my friend !!! Really thanks so much !!! You save my life !!! You are a guru !!! – Carlos Apr 22 '22 at 18:37

1 Answers1

0

pandas.DataFrame.to_json uses the newline rules of the underlying file object when writing records. If you pass in a file name, pandas will open the file in the default "\n" newline mode. Alternately, you could open the file yourself, choosing the newline policy you want.

import pandas as pd

df = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]])
df.to_json(open("test.json", "w", newline="\r\n"), orient="records", lines=True)
print(open("test.json", "rb").read())

Output

b'{"0":1,"1":2,"2":3}\r\n{"0":4,"1":5,"2":6}\r\n{"0":7,"1":8,"2":9}'

(Note also that lines should be True or False, not bool - which happens to be "truthy" so it works, but not correct).

tdelaney
  • 73,364
  • 6
  • 83
  • 116