Is there any way to embed, or otherwise process, each row of data so that it turns into a vector or array of shape (1,)?

My intention is to embed each row's information into a single representative input feature, so that each 1x3 row of the DataFrame becomes one value. For example,

row [1,6,5]  become =====> [2.178] //sample array number only, not necessary 2.178
row [1,10,3] become =====> [3.415] //sample array number only
row [1,12,5] become =====> [0.888] //sample array number only

Is there any library available for this? Or does my request even make sense? Thank you in advance for your views.

To generate the sample data:

# Import pandas library 
import pandas as pd 
  
# initialize list of lists 
sample_dat = [[1, 10, 5], [1, 10, 3], [1, 12, 5]] 
  
# Create the pandas DataFrame 
sample_df = pd.DataFrame(sample_dat, columns = ['userId', 'movieId', 'rating'])
Yeo Keat
  • What is the formula for `row [1,6,5] become =====> [2.178]`? How is `2.178` computed? – jezrael Feb 19 '21 at 05:32
  • Hi, there is no particular formula; 2.178 is just a sample output I made up to show a one-dimensional array. – Yeo Keat Feb 19 '21 at 05:37
  • So you need `sample_df['mean'] = sample_df.mean(axis=1); print(sample_df[['mean']].to_numpy())`? – jezrael Feb 19 '21 at 05:41
  • I see, you are suggesting using the mean value to represent each row. I tried it just now, but I am afraid this representation is not suitable for my case. – Yeo Keat Feb 19 '21 at 05:45
  • That was the reason I asked what the formula is, because you need some function that processes each row, like `.sum(axis=1)`. – jezrael Feb 19 '21 at 05:46
  • I understand what you mean. Unfortunately, there is no formula for this. It is something like squeezing the entire row into a single representative value. I am trying flatten. – Yeo Keat Feb 19 '21 at 05:50
  • OK, but what is the rule for the `single value as representative`? What is the conversion? – jezrael Feb 19 '21 at 05:52
  • Thank you @jezrael, I think I should think further about it, as I am currently clueless. – Yeo Keat Feb 19 '21 at 06:26
  • If my understanding is correct, you are trying to hash an array, right? So that each different row gets a different number (one that does not really mean anything, but you can be confident that no other array has it)? You could just do `[hash(tuple(sample)) for sample in sample_dat]` – pregenRobot Feb 19 '21 at 10:01
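
The two ideas raised in the comments (a per-row aggregate such as the mean, and a per-row hash) can be sketched as follows. This is only an illustration under assumptions, not a recommended embedding: the names `row_means`, `row_hashes` and `row_ids` are hypothetical, and neither value carries any learned meaning. Python lists are unhashable, so each row is converted to a tuple before hashing; `pd.util.hash_pandas_object` is shown as a pandas-native alternative.

```python
import pandas as pd

sample_dat = [[1, 10, 5], [1, 10, 3], [1, 12, 5]]
sample_df = pd.DataFrame(sample_dat, columns=['userId', 'movieId', 'rating'])

# Option 1 (aggregate): collapse each row to its mean, giving one value per row.
# reshape(-1, 1) yields an array of shape (n_rows, 1), so each row has shape (1,).
row_means = sample_df.mean(axis=1).to_numpy().reshape(-1, 1)

# Option 2 (identifier): hash each row. Lists cannot be hashed directly,
# so every row is turned into a tuple first.
row_hashes = [hash(tuple(row)) for row in sample_dat]

# Option 3 (identifier, pandas-native): one unsigned 64-bit hash per row,
# ignoring the index so that only the column values contribute.
row_ids = pd.util.hash_pandas_object(sample_df, index=False).to_numpy()

print(row_means.shape)  # one single-element vector per row
print(row_hashes)
print(row_ids)
```

Note the trade-off: the mean is lossy (different rows can share the same mean), while a hash only identifies a row and says nothing about how similar two rows are.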
