I am having trouble performing a particular operation on a list.
I have a list of tokens, where a token represents a word, and I want to use a single predicate to recognize whether some contiguous tokens in this list represent a date, which may take one of the following two forms:
1) FIRST FORM: 1° Marzo 1995
2) SECOND FORM: 1° Marzo of 1995
Here is a concrete example of what I mean by a token list.
This is the token list for the first form:
[t(1,[49]),t(-1,[176]),t(-1,[32]),t(2,[77,97,114,122,111]),t(-1,[32]),t(3,[49,57,57,53]),t(-1,[10])]
and this is the token list for the second form:
[t(1,[49]),t(-1,[176]),t(-1,[32]),t(2,[77,97,114,122,111]),t(-1,[32]),t(3,[111,102]),t(-1,[32]),t(4,[49,57,57,53]),t(-1,[10])]
As you can see, a general token has the functor t and two arguments: a number (not necessarily progressive; if the token does not represent a word but something other than alphanumeric characters, its value is -1) and a string that represents a word or a single character (such as a white space or "°").
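To make the structure concrete, here is a small sketch that rebuilds the original text from a token list. It assumes SWI-Prolog (for append/2 from library(lists)), and tokens_text/2 is a hypothetical helper name used only for illustration; it is not part of my program.

```prolog
% tokens_text(+Tokens, -Atom): hypothetical helper, for illustration only.
% Collects the code list of every t/2 token, joins the lists in order
% and converts the result to an atom.
tokens_text(Tokens, Atom) :-
    findall(Codes, member(t(_, Codes), Tokens), CodeLists),
    append(CodeLists, AllCodes),        % append/2: concatenate a list of lists
    atom_codes(Atom, AllCodes).
```

For example, the query tokens_text([t(1,[49]),t(-1,[176]),t(-1,[32]),t(2,[77,97,114,122,111]),t(-1,[32]),t(3,[49,57,57,53])], A) should bind A to '1° Marzo 1995', since 49 is "1", 176 is "°", 32 is a space and so on.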
I have implemented a working predicate that recognizes when six contiguous general tokens in a token list represent a date (like 1° Marzo 1995) and then builds a single specialized date token by concatenation, like this:
% Assumes double-quoted text denotes a list of character codes.
tagga([t(Number1, Day), t(-1, "°"), t(-1, Space1), t(Number4, Month), t(-1, Space2), t(Number6, Year)|ListaToken], [d(Number1, CompositeDateTag)|ListaTokenTaggati]) :-
    % The day is one or two characters long and starts with a digit.
    length(Day, LnDay),
    (LnDay =:= 1 ; LnDay =:= 2),
    Day = [DayHead|_],
    char_type(DayHead, digit),
    %number_codes(DayNumber, Day),
    %DayNumber =< 31,
    Space1 == " ",
    member(Month, ["gennaio", "febbraio", "marzo", "aprile", "maggio", "giugno", "luglio", "agosto", "settembre", "ottobre", "novembre",
                   "dicembre", "Gennaio", "Febbraio", "Marzo", "Aprile", "Maggio", "Giugno", "Luglio", "Agosto", "Settembre", "Ottobre", "Novembre", "Dicembre"]),
    Space2 == " ",
    % The year is four characters long and starts with a digit.
    length(Year, LnYear),
    LnYear =:= 4,
    Year = [YearHead|_],
    char_type(YearHead, digit),
    % Concatenate the pieces into the text of the date token.
    append(Day, "°", UntilSt),
    append(UntilSt, Space1, UntilSpace1),
    append(UntilSpace1, Month, UntilMonth),
    append(UntilMonth, Space2, UntilSpace2),
    append(UntilSpace2, Year, CompositeDateTag),
    tagga(ListaToken, ListaTokenTaggati).
Now I could implement an additional predicate that recognizes when eight contiguous general tokens in the token list represent a date in the second form (something like: 1° Marzo of 1995), but I would like to know whether I can modify my previous tagga/2 predicate so that it recognizes dates in both the first and the second form.
My reasoning is the following. I have a token list that could be something like:
[t(Number1, Day), t(-1, "°"), t(-1, Space1), t(Number4, Month), t(-1, Space2), t(Number6, Year)|ListaToken] in the case of the FIRST DATE FORM
Or something like:
[t(Number1, Day), t(-1, "°"), t(-1, Space1), t(Number4, Month), t(-1, Space2), t(Number7, "of"), t(-1, Space3), t(Number9, Year)|ListaToken] in the case of the SECOND DATE FORM
So I would like to know whether I can somehow combine the two forms into a single general form that matches both, and then operate on it.
Something like this:
[t(Number1, Day), t(-1, "°"), t(-1, Space1), t(Number4, Month) | SUBLIST |t(Number9, Year)|ListaToken]
where SUBLIST could be either:
1) The single token: t(-1, Space2)
or
2) Something like this token list: t(-1, Space2), t(Number7, "of"), t(-1, Space3)
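A rough sketch of what I have in mind is below. The names date_separator/2, digit_code/1, month_codes/1 and date_tokens/4 are hypothetical, the text is written as explicit code lists (176 is "°", 32 is a space, [111,102] is "of") so it does not depend on how double quotes are interpreted, and append/2 assumes SWI-Prolog. It is only meant to show the idea of treating the part between month and year as a sublist with two alternatives.

```prolog
% date_separator(?Tokens, ?Codes): the tokens allowed between month and
% year, together with the text they contribute to the date.
date_separator([t(-1, [32])], [32]).                          % " "
date_separator([t(-1, [32]), t(_, [111,102]), t(-1, [32])],   % " of "
               [32,111,102,32]).

% digit_code(+Code): Code is the character code of a decimal digit.
digit_code(Code) :- Code >= 0'0, Code =< 0'9.

% month_codes(+Codes): Codes spell one of the accepted month names.
month_codes(Codes) :-
    atom_codes(Month, Codes),
    memberchk(Month, [gennaio, febbraio, marzo, aprile, maggio, giugno,
                      luglio, agosto, settembre, ottobre, novembre, dicembre,
                      'Gennaio', 'Febbraio', 'Marzo', 'Aprile', 'Maggio',
                      'Giugno', 'Luglio', 'Agosto', 'Settembre', 'Ottobre',
                      'Novembre', 'Dicembre']).

% date_tokens(+Tokens, -Rest, -Number, -DateCodes): Tokens start with a
% date in either form; Rest is the token list that follows the date.
date_tokens([t(Number, Day), t(-1, [176]), t(-1, [32]), t(_, Month)|AfterMonth],
            Rest, Number, DateCodes) :-
    % Same day check as before: one or two codes, the first a digit.
    length(Day, LnDay), LnDay >= 1, LnDay =< 2,
    Day = [DayHead|_],
    digit_code(DayHead),
    month_codes(Month),
    % Try each allowed separator, then expect the year token.
    date_separator(Separator, SeparatorCodes),
    append(Separator, [t(_, Year)|Rest], AfterMonth),
    % Same year check as before: four codes, the first a digit.
    length(Year, 4),
    Year = [YearHead|_],
    digit_code(YearHead),
    % Join the pieces into the text of the date token.
    append([Day, [176, 32], Month, SeparatorCodes, Year], DateCodes).
```

With this, tagga/2 would need only one clause for dates, calling date_tokens(Tokens, Rest, Number1, CompositeDateTag) and then recursing on Rest. For the first token list above I would expect Rest = [t(-1,[10])] and CompositeDateTag to be the codes of "1° Marzo 1995"; for the second list the first separator fails on the year check (the candidate year is "of"), so the second separator is tried and the result should be the codes of "1° Marzo of 1995".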
Is this a good idea? Could someone help me implement a predicate like this? (I have been trying, but so far I have not gotten it to work.)