I am trying to load a zip file and save it in the virtual file system for further processing with PyScript. In this example, I just want to open it and list its contents.
Here is how far I have got:
See the self-contained HTML code below, adapted from tutorials (with thanks to the author, by the way).
It loads PyScript, lets the user select a file, and loads that file (although apparently not in the right format). On startup it also creates a dummy zip file, saves it to the virtual file system, and lists its contents. All of that works, and if I point the process_file function at that dummy zip file, it opens and lists it as expected.
The part that is NOT working is selecting a valid zip file from the local file system via the button/file selector. When the file is loaded into the variable data,
it arrives as text (UTF-8), and I get this error:
File "/lib/python3.10/zipfile.py", line 1353, in _RealGetContents
raise BadZipFile("Bad magic number for central directory")
zipfile.BadZipFile: Bad magic number for central directory
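For what it is worth, the round trip through text can be reproduced outside the browser. The following is a minimal sketch in plain Python, assuming that `f.text()` decodes the file as UTF-8 and substitutes a replacement character for invalid byte sequences (which `errors="replace"` imitates here). The helper names are mine, for illustration only.

```python
import io
import zipfile


def make_zip_bytes() -> bytes:
    """Build a small in-memory ZIP, like the dummy archive in main()."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("file1.txt", b"hi")
    return buf.getvalue()


def text_round_trip(raw: bytes) -> bytes:
    """Imitate `bytes(await f.text(), 'utf-8')` (assumed behaviour)."""
    # Bytes that are not valid UTF-8 become U+FFFD, which re-encodes
    # to three different bytes, so the result no longer matches `raw`.
    return raw.decode("utf-8", errors="replace").encode("utf-8")


def can_open(data: bytes) -> bool:
    """Report whether zipfile accepts `data` as an archive."""
    try:
        with zipfile.ZipFile(io.BytesIO(data), "r") as zf:
            zf.namelist()
        return True
    except zipfile.BadZipFile:
        return False


raw = make_zip_bytes()
damaged = text_round_trip(raw)
# Prints whether the original opens, whether the round-tripped copy opens,
# and whether the two byte strings are still identical.
print(can_open(raw), can_open(damaged), raw == damaged)
```

If the round-tripped copy no longer opens, that would suggest the file has to reach Python as raw bytes rather than as text.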
I have tried saving the data to a file and loading it instead of using BytesIO, and I have tried variations using an ArrayBuffer or a Stream. I have also tried creating a FileReader and calling readAsBinaryString() or readAsText(), followed by various transformations, always with the same result: either zipfile fails to recognise the "magic number" or I get "not a zip file". When I feed it a stream or an ArrayBuffer directly, I get variations of:
TypeError: a bytes-like object is required, not 'pyodide.JsProxy'
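Regarding the `TypeError`: `io.BytesIO` wants real bytes-like data, so a JavaScript buffer would presumably have to be converted to a Python object first. Here is a sketch of that idea, not verified in the browser. `list_zip_names` is a hypothetical helper, and the commented-out lines assume that Pyodide's `to_py()` turns the `ArrayBuffer` returned by `f.arrayBuffer()` into a `memoryview`.

```python
import io
import zipfile


def list_zip_names(raw) -> list:
    """Return the member names of a ZIP given as any bytes-like object."""
    # bytes() copies a memoryview or bytearray into an immutable bytes
    # object, which io.BytesIO accepts as its initial content.
    with zipfile.ZipFile(io.BytesIO(bytes(raw)), "r") as zf:
        return zf.namelist()


# Inside process_file() this might look as follows (assumption, untested):
#     buf = await f.arrayBuffer()          # JavaScript ArrayBuffer
#     names = list_zip_names(buf.to_py())  # assumed to yield a memoryview
```

The helper itself only depends on the standard library, so it can be tried with a `memoryview` over any zip file's bytes before wiring it into the page.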
At this point I suspect there is something embarrassingly obvious that I am simply unable to see, so a fresh pair of eyes and any advice on the best or simplest way to load a file would be much appreciated :) Many thanks in advance.
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<link rel="stylesheet" href="https://pyscript.net/alpha/pyscript.css" />
<script defer src="https://pyscript.net/alpha/pyscript.js"></script>
<title>Example</title>
</head>
<body>
<p>Example</p>
<br />
<label for="myfile">Select a file:</label>
<input type="file" id="myfile" name="myfile">
<br />
<br />
<div id="print_output"></div>
<br />
<p>File Content:</p>
<div style="border:2px inset #AAA;cursor:text;height:120px;overflow:auto;width:600px; resize:both">
<div id="content">
</div>
</div>
<py-script output="print_output">
import asyncio
import zipfile
from js import document, FileReader
from pyodide import create_proxy
import io

async def process_file(event):
    fileList = event.target.files.to_py()
    for f in fileList:
        data = await f.text()
        mf = io.BytesIO(bytes(data, 'utf-8'))
        with zipfile.ZipFile(mf, "r") as zf:
            nl = zf.namelist()
            nlf = " _ ".join(nl)
            document.getElementById("content").innerHTML = nlf

def main():
    # Create a Python proxy for the callback function
    # process_file() is your function to process events from FileReader
    file_event = create_proxy(process_file)
    # Set the listener to the callback
    e = document.getElementById("myfile")
    e.addEventListener("change", file_event, False)
    mf = io.BytesIO()
    with zipfile.ZipFile(mf, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr('file1.txt', b"hi")
        zf.writestr('file2.txt', str.encode("hi"))
        zf.writestr('file3.txt', str.encode("hi", 'utf-8'))
    with open("a.txt.zip", "wb") as f:  # use `wb` mode
        f.write(mf.getvalue())
    with zipfile.ZipFile("a.txt.zip", "r") as zf:
        nl = zf.namelist()
        nlf = " ".join(nl)
        document.getElementById("content").innerHTML = nlf

main()
</py-script>
</body>
</html>