0

I'm trying to create a script that searches a user's gmail for emails with "soundcloud.com" present and newer_than:1 day. I am then trying to pull those soundcloud urls into a spreadsheet with appendRow. I use a regex to snag the url and anything after the 'soundcloud.com' until whitespace. I also use a function to parse html.

I keep getting "We're sorry, a server error occurred" after a few seconds of running the script. Here's the code:

function getTextFromHtml(body) {
  return getTextFromNode(Xml.parse(body, true).getElement());
}


function getTextFromNode(x) {
  switch(x.toString()) {
    case 'XmlText': return x.toXmlString();
    case 'XmlElement': return x.getNodes().map(getTextFromNode).join('');
    default: return '';
  }
}

function searchy() {
  var search = GmailApp.search('"soundcloud.com" newer_than:1d');
  for (var i=0; i < search.length; i++) {
    var msgs = search[i].getMessages();
    for (var j=0; j < msgs.length; j++) {
      var body = msgs[j].getPlainBody();
      var printthis = getTextFromHtml(body);
      var regex = /https?:\/\/soundcloud\.com\/\S*/gi;
      var scURL = regex.exec(printthis);
      var sheet = SpreadsheetApp.getActiveSheet();
      sheet.appendRow(scURL);
      }
    }
  }

And here's the execution transcript:

[15-03-05 11:16:59:009 CST] Starting execution
[15-03-05 11:16:59:539 CST] GmailApp.search(["soundcloud.com" newer_than:1d]) [0.512 seconds]
[15-03-05 11:16:59:724 CST] GmailThread.getMessages() [0.184 seconds]
[15-03-05 11:16:59:929 CST] GmailMessage.getPlainBody() [0.205 seconds]
[15-03-05 11:16:59:971 CST] Xml.parse([https://soundcloud.com/josef-salvat/josef-salvat-every-night-1

-- 
(EMAIL SIGNATURE)
, true]) [0.042 seconds]
[15-03-05 11:16:59:973 CST] XmlDocument.getElement() [0 seconds]
[15-03-05 11:16:59:977 CST] XmlText.toXmlString() [0 seconds]
[15-03-05 11:16:59:977 CST] SpreadsheetApp.getActiveSheet() [0 seconds]
[15-03-05 11:17:00:100 CST] Sheet.appendRow([[https://soundcloud.com/josef-salvat/josef-salvat-every-night-1]]) [0.122 seconds]
[15-03-05 11:17:00:280 CST] GmailThread.getMessages() [0.179 seconds]
[15-03-05 11:17:04:388 CST] GmailMessage.getPlainBody() [4.108 seconds]
[15-03-05 11:17:04:428 CST] Xml.parse([https://soundcloud.com/denniskruissen/de-hofnar-dennis-kruissen-ayo-technology-bootleg

-- 
(EMAIL SIGNATURE)
, true]) [0.039 seconds]
[15-03-05 11:17:04:429 CST] XmlDocument.getElement() [0 seconds]
[15-03-05 11:17:04:431 CST] XmlText.toXmlString() [0 seconds]
[15-03-05 11:17:04:431 CST] SpreadsheetApp.getActiveSheet() [0 seconds]
[15-03-05 11:17:04:432 CST] Sheet.appendRow([null]) [0 seconds]
[15-03-05 11:17:04:434 CST] Execution failed: We're sorry, a server error occurred. Please wait a bit and try again. (line 24, file "Copy of search") [5.406 seconds total runtime]

Can anyone help??? Also, is there a better way to do this? I'm a novice.

Thanks! Neal

EDIT I figured it out. Had to drop the g in the regex function. See Henrique's answer here: Google Apps Script Regex exec() returning null Thank you very much Vincent for your help!

Community
  • 1
  • 1
cdwhiteiv
  • 87
  • 1
  • 7

1 Answers1

0

Seems like regex.exec() is returning null because there is no match.

One way to fix the function to prevent error:

function searchy() {
  var search = GmailApp.search('"soundcloud.com" newer_than:1d');
  for (var i=0; i < search.length; i++) {
    var msgs = search[i].getMessages();
    for (var j=0; j < msgs.length; j++) {
      var body = msgs[j].getPlainBody();
      var printthis = getTextFromHtml(body);
      var regex = /https?:\/\/soundcloud\.com\/\S*/gi;
      var scURL = regex.exec(printthis);

      if (scURL !== null) {
          var sheet = SpreadsheetApp.getActiveSheet();
          sheet.appendRow(scURL);
          }
      }
    }
  }

Source: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec

user700390
  • 2,287
  • 1
  • 19
  • 26
  • Thank you for the help, but now it just ignores some soundcloud urls. Is that because the regex isn't properly formatted? Here is a couple of soundcloud examples that it matches: https://soundcloud.com/thump/saint-wknd-guest-mix, https://soundcloud.com/josef-salvat/josef-salvat-every-night-1 and here is one that it doesnt: https://soundcloud.com/goldroom/ekkah-last-chance-to-dance-goldroom-remix – cdwhiteiv Mar 05 '15 at 18:26
  • The regex looks ok, you can test it at http://regexpal.com/ if you like. I am getting matches for both of those strings. Are you certain of the input, and that the getTextFromNode() function is working as intended? I don't know if you can debug interactively, but you should at least be able to add some additional logging output to verify that input string is as expected. – user700390 Mar 05 '15 at 18:57
  • I used the functions from this thread: http://stackoverflow.com/questions/20644301/copy-specific-text-from-email-message-to-spreadsheet/20645141#20645141 As I said I'm pretty new to javascript and coding in general so I tried using these functions and changing the var's to match my own. – cdwhiteiv Mar 05 '15 at 20:27
  • Since I figured that out, I now need to make the 'newer_than:' function go to minutes instead of days,weeks, months, or years. Any advice for this? I tried doing newer_than:10m, newer_than:.007d, etc and that didn't work. Should I make this a new thread? – cdwhiteiv Mar 05 '15 at 21:07