How to write this new file completing with information inside a string in the correct positions?

Question

I have this string, which contains several lines separated by newlines.

data_string = """
te gustan los animales?
puede que algunos me gusten pero no se mucho de eso

animales
puede que algunos me gusten pero no se mucho de eso

te gustan las plantas?
no se mucho sobre ellas

carnivoro
segun mis registros son aquellos que se alimentan de carne
"""

# Split the string into individual lines
lines = data_string.splitlines() 

# Open the check.py file for writing
with open("check.py", "w") as f:
    # Write the lines in the file here...
    for line in lines:
        pass  # How do I write the code using the info in these lines?

Using Python, how can I create a new file whose content is built from the lines inside the data_string variable?

This is how the resulting file should look. Note that the lines "te gustan los animales?", "animales", "te gustan las plantas?" and "carnivoro" are used as the third argument of SequenceMatcher(None, str1, "here").ratio(), and the lines "puede que algunos me gusten pero no se mucho de eso" (twice), "no se mucho sobre ellas" and "segun mis registros son aquellos que se alimentan de carne" are assigned as response_text = "here".

Output file called check.py :

from difflib import SequenceMatcher

def check_function(str1):
    
    similarity_ratio = 0.0
    response_text = "no coincide con nada"
    threshold = 0.0

    similarity_ratio_to_compare = SequenceMatcher(None, str1, "te gustan los animales?").ratio()
    if similarity_ratio_to_compare > similarity_ratio and similarity_ratio_to_compare > threshold:
        response_text = "puede que algunos me gusten pero no se mucho de eso"
        similarity_ratio = similarity_ratio_to_compare

    similarity_ratio_to_compare = SequenceMatcher(None, str1, "animales").ratio()
    if similarity_ratio_to_compare > similarity_ratio and similarity_ratio_to_compare > threshold:
        response_text = "puede que algunos me gusten pero no se mucho de eso"
        similarity_ratio = similarity_ratio_to_compare

    similarity_ratio_to_compare = SequenceMatcher(None, str1, "te gustan las plantas?").ratio()
    if similarity_ratio_to_compare > similarity_ratio and similarity_ratio_to_compare > threshold:
        response_text = "no se mucho sobre ellas"
        similarity_ratio = similarity_ratio_to_compare

    similarity_ratio_to_compare = SequenceMatcher(None, str1, "carnivoro").ratio()
    if similarity_ratio_to_compare > similarity_ratio and similarity_ratio_to_compare > threshold:
        response_text = "segun mis registros son aquellos que se alimentan de carne"
        similarity_ratio = similarity_ratio_to_compare

    return response_text

input_text = "te gusta saltar la soga bien piolon???"
text = check_function(input_text)
print(text)

In the end, this file must be saved with the name check.py. Keep in mind that the number of lines inside the data_string variable is not known in advance (this example has only 4 questions with 4 answers to keep the question short).

I am still not clear on what the expected output is. I would appreciate it if you explained a little more — mpx, Mar 10 '23 at 06:32
@mpx I was trying to get the first code to generate a text file containing the second code shown in the question. That file is saved as a .py instead of a .txt. The problem is that, to generate that code, you must concatenate parts of the lines of the string in the first code. The second code is only the output, showing how the resulting file should look — Matt095, Mar 10 '23 at 06:35
I am still having difficulty understanding what you intend to do. Perhaps you could create a simple example that achieves the same thing you intend to do. — mpx, Mar 10 '23 at 06:40
@Manvi this is the first code `data_string = """ te gustan los animales? puede que algunos me gusten pero no se mucho de eso` and this is the second code (the correct output file, that I need obtain with the first code) `from difflib import SequenceMatcher def check_function(str1): similarity_ratio = 0.0 ...` — Matt095, Mar 10 '23 at 06:42
```te gustan los animales? puede que algunos me gusten pero no se mucho de eso animales puede que algunos me gusten pero no se mucho de eso te gustan las plantas? no se mucho sobre ellas carnivoro segun mis registros son aquellos que se alimentan de carne``` You want this stuff in .txt file??? — Manvi, Mar 10 '23 at 06:46
There is a string with several lines, which follow an order: question, answer, empty line, question, answer, empty line, ... The first code tries to create a file and write into it the code that appears later. The problem is that for this I need to concatenate some lines from the `data_string` variable. Also keep in mind that the exact number of lines in the `data_string` variable is assumed to be unknown, so I should create a loop. — Matt095, Mar 10 '23 at 06:47
@Manvi I want to write all this code `from difflib import SequenceMatcher def check_function(str1): similarity_ratio = 0.0 response_text = "no coincide con nada" threshold = 0.0` to a `.txt` or `.py` file — Matt095, Mar 10 '23 at 06:48
@Manvi Note that this code in these parts maintains a repeating structure but concatenating each of the strings extracted from the lines `similarity_ratio_to_compare = SequenceMatcher(None, str1, "te gustan los animales?").ratio() if similarity_ratio_to_compare > similarity_ratio and similarity_ratio_to_compare > threshold: response_text = "puede que algunos me gusten pero no se mucho de eso" similarity_ratio = similarity_ratio_to_compare` — Matt095, Mar 10 '23 at 06:50
In this case, concatenate `"te gustan los animales?"` and `"puede que algunos me gusten pero no se mucho de eso"`, then repeat this with `"animales"` and `"puede que algunos me gusten pero no se mucho de eso"`, and so on for each of the question-answer pairs, until the second code shown in the question is formed — Matt095, Mar 10 '23 at 06:53
As I understand it, you have given the output file ```check.py```, which includes the ```check_function()``` function, and you will use the first file to generate the second file? Is that what you want? — Manvi, Mar 10 '23 at 06:56
@Manvi The data_string variable although here only has 4 question and answer pairs, the real one has more than 10000 question and answer pairs, so instead of manually writing 10000 or more times the structure — Matt095, Mar 10 '23 at 07:01
`similarity_ratio_to_compare = SequenceMatcher(None, str1, " Do you like animals?").ratio() if similarity_ratio_to_compare > similarity_ratio and similarity_ratio_to_compare > threshold: response_text = "I might like some but I don't know much about it" similarity_ratio = similarity_ratio_to_compare` , try to create the first code that using a loop can do this second code in an automated way, since manually writing the same sequence 10000 times is not practical — Matt095, Mar 10 '23 at 07:02
You have to write it manually once. And you only want to return the answer to the question if it is available in the string, right? — Manvi, Mar 10 '23 at 07:22
@Manvi Exactly, that's the goal of the second code (which is the code that the first code should write). In this reduced example, there are only 4 question-answer pairs, but writing 10000 or more question-answer pairs would be writing the same block of code too many times but changing the concatenated strings. This program only makes sense when the `data_string` variable has many lines inside, in this case there are only 4 pairs (question-answer). And it is possible to write the second code by hand, but if the question-answer pairs increase, obviously I need a loop that builds that second code — Matt095, Mar 10 '23 at 07:37
"This is how the resulting file should look like" - **What is "this"**? What are you referring to here? It seems like you are asking about taking an existing file `check.py` and replacing some of its contents. For the given example, **exactly** what is/should be in `check.py` **both before and after** the code runs? — Karl Knechtel, Mar 10 '23 at 07:56
@KarlKnechtel "THIS" is all the second code `from difflib import SequenceMatcher def check_function(str1): similarity_ratio = 0.0 response_text = "no coincide con nada" threshold = 0.0 similarity_ratio_to_compare = SequenceMatcher(None, str1, "te gustan los animales?").ratio() if similarity_ratio_to_compare > similarity_ratio and similarity_ratio_to_compare > threshold: response_text = "puede que algunos me gusten pero no se mucho de eso" similarity_ratio = similarity_ratio_to_compare` — Matt095, Mar 10 '23 at 07:58
@KarlKnechtel The second code is the expected output, not the code that produces it. — Matt095, Mar 10 '23 at 07:59
Please [edit] the question to clarify that. We need all necessary information to understand the question, **in the question itself**. Comments can be deleted at any time, and are not appropriate for showing multi-line code. — Karl Knechtel, Mar 10 '23 at 08:00
@KarlKnechtel I will edit that, but in the question literally I put "Output file called check.py :" is that... And I add the result file content — Matt095, Mar 10 '23 at 08:02

score 1 · Accepted Answer · answered Mar 10 '23 at 07:54

You can save your question-answer pairs in a file as key-value pairs in JSON format. Code to create the .json file containing the question-answer pairs:

jsonData.py


import json

data_string = {
"te gustan los animales?":
"puede que algunos me gusten pero no se mucho de eso",

"animales":
"puede que algunos me gusten pero no se mucho de eso",

"te gustan las plantas?":
"no se mucho sobre ellas",

"carnivoro":
"segun mis registros son aquellos que se alimentan de carne"
}

with open('result.json', 'w') as fp:
    json.dump(data_string, fp)

With the above code you can store all your question-answer pairs in the JSON file. First store all your pairs using the jsonData.py file, then use the code below to match a question to its answer, if it exists.
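If the pairs start out as the multi-line string from the question rather than as a hand-written dictionary, the dictionary can be built in a loop. The following is a minimal sketch, not part of the original answer. It assumes every question is followed by exactly one answer line and that questions are unique (a repeated question would overwrite the earlier entry). `parse_pairs` is a hypothetical helper name.

```python
import json

# Sample input copied from the question: question, answer, blank line, repeated.
data_string = """
te gustan los animales?
puede que algunos me gusten pero no se mucho de eso

animales
puede que algunos me gusten pero no se mucho de eso

te gustan las plantas?
no se mucho sobre ellas

carnivoro
segun mis registros son aquellos que se alimentan de carne
"""


def parse_pairs(text):
    """Return a {question: answer} dict (hypothetical helper, assumes strict pairing)."""
    # Drop blank lines so only question and answer lines remain, in order.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # Even positions hold questions, odd positions hold answers.
    return dict(zip(lines[0::2], lines[1::2]))


pairs = parse_pairs(data_string)

# Same file name as in jsonData.py, so check.py can load it unchanged.
with open("result.json", "w", encoding="utf-8") as fp:
    json.dump(pairs, fp, ensure_ascii=False, indent=2)
```

With this, the 10000 pairs never need to be retyped with `:` and `,` separators; `json.dump` produces that syntax from the parsed dictionary.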

The code below matches an input against the question-answer pairs:

check.py

from difflib import SequenceMatcher
import json

with open('result.json', 'r') as f_in:
    data = json.load(f_in)

def check_function(str1):
    similarity_ratio = 0.0
    response_text = "no coincide con nada"
    threshold = 0.0

    # Compare the input against every stored question and keep the
    # answer of the most similar one (a fuzzy match, not an exact lookup).
    for question, answer in data.items():
        similarity_ratio_to_compare = SequenceMatcher(None, str1, question).ratio()
        if similarity_ratio_to_compare > similarity_ratio and similarity_ratio_to_compare > threshold:
            response_text = answer
            similarity_ratio = similarity_ratio_to_compare
    return response_text

input_text = input('Enter text to get answer:')
text = check_function(input_text)
print(text)

The problem with that is that I don't know how to convert 10000 lines into this JSON format `"te gustan los animales?": "puede que algunos me gusten pero no se mucho de eso"`, with `:` in the middle — Matt095, Mar 10 '23 at 08:10
You just have to add your question-answer pairs to the ```data_string``` variable in the ```jsonData.py``` file. Every pair must be separated by ```,```, for example ```'que1':'ans1', 'que2':'ans2'``` and so on. Then run the jsonData file after adding all the pairs. It will create the ```result.json``` file, which is opened in the ```check.py``` file as shown above — Manvi, Mar 10 '23 at 09:01
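
For the question as literally asked (generating check.py itself from `data_string`), one possible approach is to fill a text template once per pair. This is a minimal sketch under the same assumption that each question is followed by exactly one answer line. `HEADER`, `BLOCK`, `FOOTER` and `build_source` are hypothetical names, and `json.dumps` is used here only to emit a double-quoted, escaped string literal for each line.

```python
import json

# Two pairs from the question; the real string may hold any number of pairs.
data_string = """
te gustan los animales?
puede que algunos me gusten pero no se mucho de eso

carnivoro
segun mis registros son aquellos que se alimentan de carne
"""

HEADER = '''from difflib import SequenceMatcher

def check_function(str1):

    similarity_ratio = 0.0
    response_text = "no coincide con nada"
    threshold = 0.0
'''

# One copy of this block is emitted per question-answer pair.
BLOCK = '''
    similarity_ratio_to_compare = SequenceMatcher(None, str1, {question}).ratio()
    if similarity_ratio_to_compare > similarity_ratio and similarity_ratio_to_compare > threshold:
        response_text = {answer}
        similarity_ratio = similarity_ratio_to_compare
'''

FOOTER = '''
    return response_text

input_text = "te gusta saltar la soga bien piolon???"
text = check_function(input_text)
print(text)
'''


def build_source(text):
    """Return the source code of check.py as one string (hypothetical helper)."""
    # Drop blank lines; what remains alternates question, answer, question, ...
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    parts = [HEADER]
    for question, answer in zip(lines[0::2], lines[1::2]):
        # json.dumps adds the surrounding double quotes and escapes any
        # quotes or backslashes inside the line.
        parts.append(BLOCK.format(
            question=json.dumps(question, ensure_ascii=False),
            answer=json.dumps(answer, ensure_ascii=False),
        ))
    parts.append(FOOTER)
    return "".join(parts)


with open("check.py", "w", encoding="utf-8") as f:
    f.write(build_source(data_string))
```

The generated file grows by five lines per pair, so for very large inputs the JSON approach in the accepted answer keeps check.py much smaller.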
