I tried to run the pre-trained Python script HMM_TAGGER provided at http://www.henrikleopold.com/downloads/ under the section "Label Parsing Technique".

The HMM_TAGGER script is written in Python 2, and when I run it I get a "TypeError: coercing to Unicode: need string or buffer, NoneType found" message:

0%| | 0/246 [00:00<?, ?it/s]gen_style is None
0%| | 0/246 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "HMM_TAGGER.py", line 204, in <module>
    parse_data()
  File "HMM_TAGGER.py", line 189, in parse_data
    result.append(process_label(df.iloc[i]))
  File "HMM_TAGGER.py", line 170, in process_label
    store = word + indicator + 'misc-' + gen_style + ', '
TypeError: coercing to Unicode: need string or buffer, NoneType found

So the script is trying to concatenate a None value to a string. After reviewing the input dataset, it is clear that this error occurs because the column "Style" is empty. My guess is that the "generalize_style" function therefore has nothing to work with and returns None:

def generalize_style(style):
    AN  = ['AN_NP', 'AN_ING', 'AN_OF', 'AN_IRR', 'AN']
    VOS = ['VOS_IRR', 'VOS', 'VO']
    DES = ['DES', 'PS', 'DES / EVENT', 'GATEWAY']
    NA = ['NA']
    
    if style in AN:
        return 'AN'
    elif style in VOS:
        return 'VOS'
    elif style in DES:
        return 'DES'
    elif style in NA:
        return 'NA'
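
To illustrate the guess above: a function written this way has no final `else`, so any value outside the four lists falls through and returns None. The sketch below reproduces that and the resulting failed concatenation. It assumes an empty Excel cell reaches the function as NaN (which is how pandas normally reads empty cells); the `generalize_style_safe` wrapper, its `'NA'` default and the `'<>'` separator are hypothetical and not part of HMM_TAGGER.py.

```python
def generalize_style(style):
    AN = ['AN_NP', 'AN_ING', 'AN_OF', 'AN_IRR', 'AN']
    VOS = ['VOS_IRR', 'VOS', 'VO']
    DES = ['DES', 'PS', 'DES / EVENT', 'GATEWAY']
    NA = ['NA']

    if style in AN:
        return 'AN'
    elif style in VOS:
        return 'VOS'
    elif style in DES:
        return 'DES'
    elif style in NA:
        return 'NA'
    # No final else: every other value falls through and returns None.


def generalize_style_safe(style, default='NA'):
    """Hypothetical wrapper: substitute a default instead of returning None."""
    result = generalize_style(style)
    return default if result is None else result


empty_cell = float('nan')  # assumption: how an empty "Style" cell arrives

print(generalize_style('VO'))
print(generalize_style(empty_cell))

# Same expression shape as line 170 in the traceback; '<>' is a placeholder.
try:
    store = 'Check' + '<>' + 'misc-' + generalize_style(empty_cell) + ', '
    failed = False
except TypeError as exc:
    failed = True
    print('TypeError:', exc)

# With the hypothetical fallback the concatenation goes through.
store = 'Check' + '<>' + 'misc-' + generalize_style_safe(empty_cell) + ', '
print(repr(store))
```

The concatenation raises a TypeError on Python 3 as well, only with a different message, which suggests the problem lies in the empty "Style" values and not in the Python version.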

Example of the input dataset (Excel file):

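| Label | Split | Style | Action | Business Object | Tags |
|---|---|---|---|---|---|
| Check Invoice | | | | | |
| Confirmation of Booking | | | | | |
| Pay Bill | | | | | |
| repair the costume | | | | | |
| Ship product | | | | | |
| Retrieve product from warehouse | | | | | |
| Complete details | | | | | |
| Process Order | | | | | |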

I actually thought that this script does POS tagging and determines the style of the labels (for example, "Check Invoice" is in Verb Object Style (VOS)), but it seems that it already needs a style as input. Or am I wrong?

So I have the following questions:

Is there a problem with the input dataset? Or is the script not working properly because it is written in Python 2, so that I need to convert it to Python 3?
