I am new to GATE NLP. I have started going through the GATE tutorials to understand the basics, and I tried some text formats in the GATE Developer tool.
Steps I followed:
- Downloaded the GATE Developer tool from this site: https://gate.ac.uk/download/
- Navigated to downloaded folder.
- Ran the command ./gate.sh
- Went through some YouTube videos to understand the basics.
- Loaded the ANNIE application.
- Removed the ANNIE OrthoMatcher and the ANNIE NE Transducer.
- Added a new GATE document.
- Created a new corpus with the document.
- Under Processing Resources, added a new JAPE Transducer and pointed it at a sample JAPE file.
I want to add some new rules to the JAPE file to detect the following phone number formats in Word documents.
Format 1: +14168878659
Format 2: +1-4036187846
Format 3: +1.647.400.3581
Format 4: +1 647 400 3581
I wrote this JAPE file for processing +14168878659:
Phase: Address
Input: Token Lookup
Options: control = appelt

Macro: PHONE_COUNTRYCODE1
(
  {Token.string == "+1"}
  {Token.kind == number, Token.length == "2"} |
  {Token.kind == number, Token.length == "11"}
)

Rule: PhoneReg1
Priority: 20
(
  (PHONE_COUNTRYCODE1)
):phoneNumber1
-->
:phoneNumber1.Phone = {kind = "phoneNumber1", rule = "PhoneReg1"}
Can someone tell me how to write JAPE rules for these formats? +1-4036187846, +1.647.400.3581, +1 647 400 3581
I know these may not be valid formats. I am doing this for learning purposes.
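To make the question concrete, here is a rough sketch of the direction I think the rules might take. It is untested and rests on an assumption: that the default ANNIE tokeniser emits "+", "-" and "." as separate Token annotations and treats each run of digits as one Token of kind number, so that "+1" is never a single Token. The phase, macro and rule names (Phone, COUNTRY_CODE, PhonePlain, PhoneDash, PhoneGrouped) are made up for illustration.

```jape
Phase: Phone
Input: Token
Options: control = appelt

// Assumption: "+" and "1" arrive as two separate Tokens.
Macro: COUNTRY_CODE
(
  {Token.string == "+"}
  {Token.string == "1"}
)

// Format 1: +14168878659 -> "+" followed by one 11-digit number Token
Rule: PhonePlain
(
  {Token.string == "+"}
  {Token.kind == number, Token.length == "11"}
):phone
-->
:phone.Phone = {kind = "phoneNumber1", rule = "PhonePlain"}

// Format 2: +1-4036187846 -> "+", "1", "-", then a 10-digit number Token
Rule: PhoneDash
(
  (COUNTRY_CODE)
  {Token.string == "-"}
  {Token.kind == number, Token.length == "10"}
):phone
-->
:phone.Phone = {kind = "phoneNumber2", rule = "PhoneDash"}

// Formats 3 and 4: +1.647.400.3581 and +1 647 400 3581
// SpaceToken is not listed in Input, so spaces between Tokens are skipped;
// the "." separators are therefore made optional.
Rule: PhoneGrouped
(
  (COUNTRY_CODE)
  ({Token.string == "."})?
  {Token.kind == number, Token.length == "3"}
  ({Token.string == "."})?
  {Token.kind == number, Token.length == "3"}
  ({Token.string == "."})?
  {Token.kind == number, Token.length == "4"}
):phone
-->
:phone.Phone = {kind = "phoneNumber3", rule = "PhoneGrouped"}
```

Because SpaceToken is left out of Input, this sketch would also accept mixed input such as "+1 647.400 3581". If that matters, SpaceToken could presumably be added to Input and matched explicitly.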