xidel https://www.url.com/folder -e "<button class="btn" type="BUTTON" onclick="self.location='https://www.url.com/folder/2'">Next &gt;</button>"

I am trying to extract what's in between the single quotes with this xidel template and I am getting nowhere fast.

<button class="btn" type="BUTTON" onclick="self.location='{.}'">Next &gt;</button>

Do I have to escape some characters? The syntax is confusing. I am using the latest version on the Windows command line.

Lid

1 Answer


At first I tried:

xidel -s https://www.fanfiction.net/s/12963528/1/Forced-Return -e "<button>{@onClick}</button>*"

but that returned five buttons with an onClick attribute, so I needed to be more specific:

xidel -s https://www.fanfiction.net/s/12963528/1/Forced-Return -e "<div style='clear:both;text-align:right;'><button>{@onClick}</button></div>"

which will output: self.location='/s/12963528/2/Forced-Return'

Now we need to strip the self.location= prefix and the single quotes. A regular expression handles that:

xidel -s https://www.fanfiction.net/s/12963528/1/Forced-Return -e "<div style='clear:both;text-align:right;'><button>{extract(@onClick,'=.(.*).',1)}</button></div>"

This will output what you wanted: /s/12963528/2/Forced-Return
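To see what the pattern =.(.*). does without fetching the page, here is a minimal sketch that applies the same regular expression to the attribute value locally. It is an illustration only: sed stands in for xidel's extract(), the function name extract_location is hypothetical, and the leading .* and trailing $ are added because sed substitutes over the whole line rather than returning a capture group.

```shell
#!/bin/sh
# Hypothetical helper (not part of xidel): applies the answer's pattern
# to an onClick value, using sed -E as a stand-in for extract().
extract_location() {
    # '=' matches literally, the first '.' consumes the opening quote,
    # '(.*)' greedily captures the path, and the last '.' consumes the
    # closing quote. '.*' in front and '$' at the end discard the rest.
    printf '%s\n' "$1" | sed -E 's/.*=.(.*).$/\1/'
}

extract_location "self.location='/s/12963528/2/Forced-Return'"
```

Run on the value shown above, this prints /s/12963528/2/Forced-Return, the same result as the xidel command.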

MatrixView
  • Thanks for helping out. I tried the above, but it grabs the whole HTML source. I just need the URL inside the single quotes. – Lid Jun 27 '18 at 22:52
  • If you can, post the real URL and the text/attribute you need extracted... – MatrixView Jun 28 '18 at 18:49
  • `xidel https://www.fanfiction.net/s/12963528/1/Forced-Return -e ""` I want this to be the output /s/12963528/2/Forced-Return – Lid Jun 29 '18 at 03:52
  • Thanks so much, that helped. So XPath and XQuery can't handle escaped single or double quotes properly without help from regular expressions. Good to know. – Lid Jun 29 '18 at 13:25
  • In XQuery/XPath, quotes can be substituted by &quot; (double) or &apos; (single), BUT I've already used both single and double quotes in the xidel line, AND xidel's parsing seems to substitute them back to ' or ", for example in substring-before() and substring-after(), so RegEx was the easy way to go for me. Users BeniBela or Reino (Xidel experts) might know how to escape or circumvent these quotes in different ways. – MatrixView Jun 29 '18 at 14:55
  • I actually tried using &quot; or &apos; on the command line in xidel and they don't work. I'll try to contact the author of xidel and see what he says. I couldn't find anything in the readme or manual on how to escape quotes properly. – Lid Jun 29 '18 at 21:42
  • `extract(@onClick,'&apos;(.+)&apos;',1)` on Windows and `extract(@onClick,"&apos;(.+)&apos;",1)` on Linux work for me. Without xquery `extract(@onClick,'''(.+)''',1)` on Windows and `extract(@onClick,"'\''(.+)'\''",1)` work for me. – Reino Jul 01 '18 at 20:00
  • As expected... Reino knows other ways... two of them actually. Thx man! – MatrixView Jul 01 '18 at 20:10
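
The '\'' sequence in Reino's last variant is a shell idiom, not xidel syntax. A minimal sketch, assuming a POSIX shell on Linux, of what the expression looks like once the shell has processed the quoting. The variable name xpath_expr is hypothetical, and the script only prints the expression; it does not call xidel.

```shell
#!/bin/sh
# Inside a single-quoted shell string, '\'' closes the string, adds an
# escaped literal single quote, and reopens the string.
xpath_expr='extract(@onClick,"'\''(.+)'\''",1)'

# Print the expression text exactly as a program would receive it.
printf '%s\n' "$xpath_expr"
```

This prints extract(@onClick,"'(.+)'",1), so the regular expression that reaches xidel is '(.+)' with literal single quotes around the capture group.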