How to extract features from lists?

Question

How can I extract features in Python from a dataset like this:

I found two ways to solve this problem. 1) One is:

But this is not a good way.

2) Another is:

Search the C and D columns to find the top-K items, and keep only those. But this leads to information loss.
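A minimal sketch of the top-K idea, assuming column C holds lists of item labels. The rows and the helper name `keep_top_k` are hypothetical, since the original example data is not shown:

```python
from collections import Counter

# Hypothetical rows: each entry of column C is a list of item labels.
rows_c = [["c1", "c2"], ["c1", "c3"], ["c1", "c2", "c4"]]

def keep_top_k(rows, k):
    """Keep only the k most frequent items across all rows."""
    counts = Counter(item for row in rows for item in row)
    top = {item for item, _ in counts.most_common(k)}
    # Items outside the top k are dropped, which is where information is lost.
    return [[item for item in row if item in top] for row in rows]

filtered = keep_top_k(rows_c, k=2)
```

With these rows, `c3` and `c4` disappear from the second and third rows, which illustrates the information loss.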

Is there a better way to solve this problem?

If you want to access values nested inside a cell (e.g. a list or dict), use sub-indexing: if it is a list, use columnname[list_index][element_index]; if it is a dictionary, use columnname[dict_key] or something like that. – SRG, Nov 25 '19 at 07:23
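A small sketch of that kind of sub-indexing, using plain Python containers as a stand-in for the columns. The table contents below are assumptions, not the asker's data:

```python
# Hypothetical data: column "C" holds lists, column "D" holds dicts.
table = {
    "C": [["c1", "c2"], ["c3"]],
    "D": [{"d1": 0.5}, {"d2": 0.7}],
}

# List cell: pick the row first, then the element inside the list.
second_item = table["C"][0][1]

# Dict cell: pick the row first, then the key inside the dict.
d2_value = table["D"][1]["d2"]
```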

score 0 · Answer 1 · answered Nov 25 '19 at 07:51


I guess I understand your question. I am listing an approach that you can follow without any sparsity or information loss.

1. Let's say your column C varies from c1 to c4 and you create a binary vector over c1 to c4, as you already did.
2. Then convert the binary vector to decimal and use it as a feature (e.g. 1,1,0,0 --> 0*2^0 + 0*2^1 + 1*2^2 + 1*2^3 = 12).
3. Apply the same approach to D, but I would suggest creating two features: one as in step 2, ignoring the values of D, and another that uses the values of D in the decimal conversion. Then decide which to retain based on the correlation between the two features.
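Steps 1 and 2 could look roughly like the sketch below. The vocabulary `VOCAB` and both helper names are assumptions made for illustration; the first vocabulary entry is treated as the most significant bit, matching the 1,1,0,0 example:

```python
VOCAB = ["c1", "c2", "c3", "c4"]  # assumed full set of values in column C

def to_binary_vector(items, vocab=VOCAB):
    """Return 1 for each vocabulary entry present in items, else 0."""
    present = set(items)
    return [1 if v in present else 0 for v in vocab]

def binary_to_decimal(bits):
    """Read the bits as a base-2 number, first bit most significant."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value

bits = to_binary_vector(["c1", "c2"])   # [1, 1, 0, 0]
code = binary_to_decimal(bits)          # 12
```

Python integers have arbitrary precision, so the conversion itself does not overflow in pure Python. A fixed-width column type (for example a signed 64-bit integer) would cap the vocabulary at 63 items.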


Dr Sudeep Ghosh

Thank you. Converting the binary vector to decimal is a problem if the result is bigger than max(decimal) or max(hex). – Anna Nov 25 '19 at 08:14
Transforming D into D-key, D-value, and D-cor requires the values to be unique. But the sub-values of list D may be repeated, such as [{'d1':'d1_value'},{'d2':'d1_value'}]. – Anna Nov 25 '19 at 08:31
