How can I translate the following MATLAB script into Python? I am stuck in particular on the body of the for loop and on the final writematrix line. The program copies the second row (all columns) of every Excel file in a folder and creates an output Excel file in the same folder that combines those rows from every file into a single matrix. Thank you, and happy new year.

clc
clearvars
fileDir = cd;
outfile = 'OUT.xlsx'; %Output file name
fileNames = dir(fullfile(fileDir,'*.CSV'));
fileNames_sorted = natsortfiles({fileNames.name});
M= length (fileNames_sorted);
second_col= [];

for f = 1:M
    raw = importdata( fullfile(fileDir, fileNames_sorted{f}));
    second_col= [second_col raw(:,2)];  % extract the second column
end
writematrix(second_col,fullfile(cd,outfile))
  • This question seems to be a duplicate of your previous two: [Translating excel file writer matlab script to python](https://stackoverflow.com/questions/70422266/translating-excel-file-writer-matlab-script-to-python) and https://stackoverflow.com/questions/70507643/translating-short-for-loop-matlab-script-to-python. Please do not duplicate questions. – beaker Jan 01 '22 at 00:31

1 Answer

You have to install the packages first:

pip install pandas numpy

from pathlib import Path
import pandas as pd 
import numpy as np

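# Attempted one-line equivalent of MATLAB's clearvars, left commented out:
# `del` is a statement, so using it inside a list comprehension is a SyntaxError.
# [del globals()[name] for name in dir() if not name.startswith('_')]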

fileDir = Path.cwd()
outfile = Path(fileDir, 'OUT.csv')
fileNames = sorted(fileDir.glob('*.csv'))

second_col= []

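for file in fileNames:
    # read_csv treats the first row as a header; pass header=None if the files have none
    df = pd.read_csv(file)
    second_col.append(df.iloc[:, 1])  # second column (index 1), matching MATLAB's raw(:,2)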
    
matrix = np.array(second_col)
np.savetxt(outfile, matrix, delimiter=",")

In case you need to transpose the matrix save like this

np.savetxt(outfile, matrix.T, delimiter=",")
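
Whether the transpose is needed depends on the layout you want. MATLAB's `[second_col raw(:,2)]` appends each file's column as a new column, whereas `np.array` on a list of columns turns each one into a row. A minimal sketch with made-up numbers (`col_a` and `col_b` are illustrative stand-ins for the data read from two files, and the columns are assumed to have equal length):

```python
import numpy as np

# Two made-up "second columns", standing in for data read from two files.
col_a = [1.0, 2.0, 3.0]
col_b = [10.0, 20.0, 30.0]

rows = np.array([col_a, col_b])         # shape (2, 3): one row per file
cols = np.column_stack([col_a, col_b])  # shape (3, 2): one column per file, as in MATLAB

print(rows.shape, cols.shape)
print(np.array_equal(rows.T, cols))     # the transpose reproduces the MATLAB layout
```

So `matrix.T` (or `np.column_stack(second_col)`) should give the same orientation as the MATLAB script.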

Other ways of saving

Make sure the body of the for loop is indented consistently (four spaces per level is the usual convention; do not mix tabs and spaces). Have fun
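
One difference from the MATLAB original is worth noting: `natsortfiles` sorts file names naturally (2 before 10), while Python's `sorted()` compares them character by character. If the file names contain numbers, a natural sort key may be needed. A minimal sketch using only the standard library; `natural_key` is a hypothetical helper, not part of pathlib or pandas, and the file names are made up:

```python
import re
from pathlib import Path

def natural_key(path):
    """Split a file name into text and number chunks so that 2 sorts before 10."""
    parts = re.split(r'([0-9]+)', Path(path).name.lower())
    # With a capturing group, the odd-indexed chunks are the digit runs.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

names = ['run10.csv', 'run2.csv', 'run1.csv']  # made-up file names

print(sorted(names))                   # character-by-character order
print(sorted(names, key=natural_key))  # natural order
```

With the code above, this would be used as `fileNames = sorted(fileDir.glob('*.csv'), key=natural_key)`.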

EDIT

Running dir() within a list comprehension won't work; for an explanation, see "dir inside function". Here is an alternative.


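from pathlib import Path
from types import ModuleType
import pandas as pd
import numpy as np

def clearvars():
    # Rough equivalent of MATLAB's clearvars: remove user-defined names from the namespace.
    for el in sorted(globals()):
        obj = globals()[el]
        # Keep Python's own '__' names, imported modules, and functions/classes (e.g. Path).
        if '__' not in el and not isinstance(obj, ModuleType) and not callable(obj):
            print(f'deleted: {el}')
            del globals()[el]  # `del el` would only unbind the loop variable

clearvars()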

fileDir = Path.cwd()
outfile = Path(fileDir, 'OUT.csv')
fileNames = sorted(fileDir.glob('*.csv'))

second_col= []

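for file in fileNames:
    # read_csv treats the first row as a header; pass header=None if the files have none
    df = pd.read_csv(file)
    second_col.append(df.iloc[:, 1])  # second column (index 1), matching MATLAB's raw(:,2)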

matrix = np.array(second_col)
np.savetxt(outfile, matrix, delimiter=",")

You can delete this line print(f'deleted: {el}') if you like. It's just for clarification.

globals() returns a dictionary of all the names (the namespace) that Python uses to run the code; dir() returns the same names as a list. Python's own variables start and end with '__'. You are basically looping through the names in the namespace and deleting all the names that you created in your script.
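
The looping-and-deleting idea can be tried safely on an ordinary dictionary first. A minimal sketch, assuming a made-up dictionary `fake_globals` stands in for the real `globals()` and using a hypothetical helper `clear_user_names`:

```python
def clear_user_names(namespace):
    """Delete every name without a double underscore; return the deleted names."""
    removed = []
    for name in sorted(namespace):  # sorted() copies the keys, so deleting is safe
        if '__' not in name:
            del namespace[name]     # removes the entry from the dictionary itself
            removed.append(name)
    return removed

# A made-up namespace: two Python-internal names and two user variables.
fake_globals = {'__name__': '__main__', '__doc__': None,
                'fileDir': '.', 'second_col': []}

removed = clear_user_names(fake_globals)
print(removed)               # the user-defined names
print(sorted(fake_globals))  # only the '__' names remain
```

Calling this on the real `globals()` would also remove imported modules such as `pd` and `np`, since their names contain no double underscore, so a real script would need to filter those out as well.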

RSale
  • Hi @RSale. I was wondering what the line: [del globals()[name] for name in dir() if not name.startswith('_')] does? Also how can I choose the directory where my program works? thanks – Jonas Freiheit Jan 03 '22 at 13:09
  • If you are new to Python and you are using VS Code, check out interactive windows. This will blow your mind ;) – RSale Jan 03 '22 at 14:01
  • Did the rest of the code work? I don't have MATLAB installed right now, so I couldn't verify. – RSale Jan 03 '22 at 14:02
  • Hi @RSale, Thanks for continuing with your help, unfortunately when I run the code through Spyder it runs the directory through my downloads file outputting an empty excel file. - when I select the directory to be in a specific file directory it returns to the downloads folder unfortunately. – Jonas Freiheit Jan 04 '22 at 13:24
  • Sorry, that was my mistake: ```rglob``` should be ```glob```. rglob is the recursive version. You can also try adding the path as a string directly: ```fileDir = "filepath"```. – RSale Jan 04 '22 at 15:53
  • Hi @RSale, using ```glob``` instead gives me a syntax error. And I changed fileDir to "C:/users/John/Desktop/File". Is that correct? Thanks – Jonas Freiheit Jan 05 '22 at 22:02
  • In your original code I assumed that ```fileDir = cd``` was a mistake and you meant ```fileDir = pwd```. Is this what you meant, or did you do something like ```cd = 'filepath'```? ```Path.cwd()``` gets your current working directory. Try ```fileDir = Path.cwd("C:/users/John/Desktop/File")```. The glob part works only if fileDir was generated with Path. Sorry, this was a mistake on my part. – RSale Jan 06 '22 at 00:45
  • Also try ```sorted(fileDir.glob('*.csv'))``` instead of .CSV – RSale Jan 06 '22 at 00:51
  • Jonas did it work? – RSale Jan 06 '22 at 18:11
  • Hi @RSale, I've recoded by replacing that line with fileDir = Path.cwd("C:/Users/Jonas/Desktop/Test") , I got an error saying cwd() takes 1 positional argument but 2 were given. Not sure where 2 were given. – Jonas Freiheit Jan 07 '22 at 13:10
  • I shouldn't write you responses at night. It's totally my fault. The Path.cwd is wrong. Try ```fileDir = Path("C:/users/John/Desktop/File")``` . Python's ```Path.cwd()``` is the same as Matlab's ```pwd```. – RSale Jan 07 '22 at 14:03
  • Hi @RSale, Thanks for the help, the program works now! How do I give you a thumbs up – Jonas Freiheit Jan 08 '22 at 13:15
  • I'm glad it works now. You can click the arrow up next to my answer and mark your answer as answered next to your question. Hope you stay on the python track – RSale Jan 08 '22 at 17:57
  • looks like I can't upvote this since my reputation isn't high enough yet – Jonas Freiheit Jan 12 '22 at 02:55
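
The directory fix worked out in the comments above can be checked in isolation. A minimal sketch, assuming a throwaway temporary folder with made-up files stands in for the real data folder (in the real script the folder would be something like `Path("C:/users/John/Desktop/File")`):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)  # an explicit folder; Path(...) takes the path, Path.cwd() takes no argument

    # Made-up files standing in for the real data.
    (data_dir / 'a.csv').write_text('1,2\n3,4\n')
    (data_dir / 'b.csv').write_text('5,6\n7,8\n')
    (data_dir / 'notes.txt').write_text('not a csv\n')

    # glob is a method of Path objects, so it fails if data_dir is a plain string.
    found = sorted(p.name for p in data_dir.glob('*.csv'))
    print(found)

    print(Path.cwd())  # the process's working directory, independent of data_dir
```

Only the two CSV files are picked up, and `Path.cwd()` keeps pointing at wherever the interpreter was started, which is why the script kept running in the downloads folder.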