# Unzip the dataset (if we haven't already)
if not os.path.exists('./cola_public/'):
    !unzip cola_public_1.1.zip
The above code unzips a file in a Jupyter notebook. How would I do this in a similar fashion if the file were a .gz file?
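The zipfile package only reads .zip archives and raises BadZipFile on a gzip stream. For gzip, the standard library's gzip module fills the same role:
import gzip
file = gzip.open("/path/to/file/YOUR_FILE.gzip", "rb")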
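Assuming your file is a .tar.gz archive that contains several files, you can use tarfile. The loop below writes every file flat into a TEST folder, which is created first if it does not exist (or point it at another folder of your choice):
import os
import tarfile

os.makedirs('TEST', exist_ok=True)  # the output folder must exist
with tarfile.open('TEST.tar.gz', 'r:gz') as _tar:
    for member in _tar:
        if member.isdir():  # here write your own code to make folders
            continue
        fname = os.path.basename(member.name)  # also works for names without '/'
        _tar.makefile(member, os.path.join('TEST', fname))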
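If the archive's folder layout should be kept rather than flattened, tarfile's extractall is a shorter route. This is only a sketch: the function name extract_tar_gz is made up for illustration, and the filter argument is passed only on Python versions that support extraction filters.

```python
import os
import tarfile


def extract_tar_gz(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir, keeping its folder layout."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Newer Python versions support extraction filters; "data" rejects
        # unsafe members such as absolute paths or links leaving dest_dir.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)


# Example (placeholder names):
# extract_tar_gz("TEST.tar.gz", "TEST")
```

Only extract archives you trust on older Python versions, since without a filter a crafted archive can write outside the destination folder.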
Or, if your .gz file is not a tar archive and contains a single file, you can use the gzip module. Reference: https://docs.python.org/2/library/gzip.html#examples-of-usage
import gzip
import shutil

def gunzip(file_path, output_path):
    # the with block closes both files, so no explicit close() is needed
    with gzip.open(file_path, "rb") as f_in, open(output_path, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)

f = 'TEST.txt.gz'
gunzip(f, f.replace(".gz", ""))
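To mirror the "if we haven't already" check from the question, the decompression can be skipped when the output file is already there. A minimal sketch, where gunzip_if_missing is a made-up helper name and the file names are placeholders:

```python
import gzip
import os
import shutil


def gunzip_if_missing(gz_path, output_path):
    """Decompress gz_path to output_path unless output_path already exists.

    Returns True if the file was decompressed, False if it was skipped.
    """
    if os.path.exists(output_path):
        return False
    with gzip.open(gz_path, "rb") as f_in, open(output_path, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return True


# Example (placeholder names):
# gunzip_if_missing("TEST.txt.gz", "TEST.txt")
```

In a notebook cell, a shell escape such as `!gunzip -k TEST.txt.gz` should do the same job, assuming a gunzip binary that supports `-k` is on the PATH; the Python version avoids that dependency.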