How to read html content of a specific URL using Firefox addon?

Question

I want to create an addon that loads the HTML content of a specific URL, saves a specific line of that page, and then navigates to that URL. I have read a lot on mozilla.org about the content of a web page, but I don't understand how to read the HTML content.

Must it be an addon that others can install, or is it enough to get this working on your own machine? I'm thinking about using a Greasemonkey script. — Kwebble, Aug 03 '14 at 22:28

Noitidart · Answer 1 · 2014-08-04T06:44:00.340

Here's a simple snippet that makes an XHR request WITHOUT cookies. Don't worry about cross-origin restrictions: you are running from a privileged scope, meaning this code runs as a Firefox addon rather than in a website.

// Components exposes `utils`, `classes` and `interfaces`; alias them to the usual short names.
var {utils: Cu, classes: Cc, interfaces: Ci} = Components;
Cu.import('resource://gre/modules/Services.jsm');
// Fetch `url` anonymously (no cookies, no cache) and pass the response text to `cb`.
function xhr(url, cb) {
    let xhr = Cc["@mozilla.org/xmlextras/xmlhttprequest;1"].createInstance(Ci.nsIXMLHttpRequest);

    // Run `f` once for each event type we listen to.
    let evf = f => ['load', 'error', 'abort'].forEach(f);

    let handler = ev => {
        // Whichever event fired, detach all listeners first.
        evf(m => xhr.removeEventListener(m, handler, false));
        switch (ev.type) {
            case 'load':
                if (xhr.status == 200) {
                    cb(xhr.response);
                    break;
                }
                // A load with a non-200 status falls through to the error alert.
            default:
                Services.prompt.alert(null, 'XHR Error', 'Error fetching page: ' + xhr.statusText + ' [' + ev.type + ':' + xhr.status + ']');
                break;
        }
    };

    evf(m => xhr.addEventListener(m, handler, false));

    xhr.mozBackgroundRequest = true;
    xhr.open('GET', url, true);
    xhr.channel.loadFlags |= Ci.nsIRequest.LOAD_ANONYMOUS | Ci.nsIRequest.LOAD_BYPASS_CACHE | Ci.nsIRequest.INHIBIT_PERSISTENT_CACHING;
    // Leave responseType unset so the response arrives as a string.
    // Set it to "arraybuffer" only when downloading binary data such as a zip file.
    xhr.send(null);
}

Example usage of this snippet:

var href = 'http://www.bing.com/';
xhr(href, data => {
    Services.prompt.alert(null, 'XHR Success', data);
});
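
To keep only one line of the fetched page, as the question asks, the `data` string passed to the callback can be split on line breaks and searched. A minimal sketch, assuming the wanted line can be identified by a substring it contains; `findLine` and its `needle` parameter are hypothetical names, not part of any Firefox API:

```javascript
// Hypothetical helper: return the first line of `html` that contains `needle`,
// trimmed of surrounding whitespace, or null when no line matches.
function findLine(html, needle) {
    var lines = html.split(/\r?\n/);
    for (var i = 0; i < lines.length; i++) {
        if (lines[i].indexOf(needle) !== -1) {
            return lines[i].trim();
        }
    }
    return null;
}

// Possible use inside the callback from the snippet above:
// xhr(href, data => {
//     var line = findLine(data, '<title>');
//     Services.prompt.alert(null, 'Line found', String(line));
// });
```

This only works when the page's markup actually has line breaks where you expect them; for minified pages a regular expression on the whole string is probably a better fit.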

edited Aug 04 '14 at 06:44

answered Aug 04 '14 at 06:38

Okay, thanks for the detailed answer. Let me try this! – Ali Mohyudin Aug 04 '14 at 06:43
My pleasure. It's copy and paste. You can copy paste it to scratchpad, set "Environment" menu to "Browser" and then run. Keep in mind this DOES NOT use the cookies of the user. – Noitidart Aug 04 '14 at 06:44
Can you please tell me where to write the above code? in main.js of my addon or in a data folder with script.js? – Ali Mohyudin Aug 04 '14 at 06:45
Oh crap, are you using [tag:firefox-addon-sdk]? If so, you can paste this in `main.js`, but change the first line (the one that destructures `Components`) to `var {Cu, Cc, Ci} = require('chrome');` – Noitidart Aug 04 '14 at 06:47
On second thought, the SDK has a built-in module for XHR, called the request module. If you're using the SDK, then you should do it this way: https://developer.mozilla.org/en-US/Add-ons/SDK/High-Level_APIs/request – Noitidart Aug 04 '14 at 06:48
You mean I should replace the line `xhr = Cc["@mozilla.org/xmlextras/xmlhttprequest;1"]` with `xhr = Cc["developer.mozilla.org/en-US/Add-ons/SDK/High-Level_APIs/request;1"]`? – Ali Mohyudin Aug 04 '14 at 07:13
no no no lol if you want to use the code i pasted above follow my comment above that. if you want to go with request module copy paste the example code from that site into main.js – Noitidart Aug 04 '14 at 07:14
I copied and pasted the request example code, but I'm confused about where to place the icon code; the request should run after I click on the icon. – Ali Mohyudin Aug 04 '14 at 07:23
oh do you have a main.js file? just copy paste the code into there. im not an sdk guy so i wont be able to help too much forgive me man :( – Noitidart Aug 04 '14 at 08:02
Yes, I have main.js and I copy-pasted the code, but nothing showed up in Firefox. And it's okay bro! :) – Ali Mohyudin Aug 04 '14 at 08:25
I'm still looking for a way to get this done. So far no success, but that's okay, I'm not going to give up! :D – Ali Mohyudin Aug 04 '14 at 08:26
Paging @canuckistani, addon-sdk expert needed please. – Noitidart Aug 04 '14 at 08:31
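
For the Add-on SDK route mentioned in the comments, a minimal sketch of the request module might look like this. `Request`, its `url` and `onComplete` options, `get()`, and `response.status` / `response.text` come from the SDK documentation linked above; the wrapper `fetchPage` is a hypothetical name, and the constructor is passed in as a parameter only so the sketch does not depend on where `require` is available:

```javascript
// Hypothetical wrapper around the SDK's Request: fetch `url` and hand the
// response body to `onText` when the server answers with status 200.
function fetchPage(Request, url, onText) {
    Request({
        url: url,
        onComplete: function (response) {
            if (response.status === 200) {
                onText(response.text);
            }
        }
    }).get();
}

// Possible use in main.js (assumes the SDK's module loader):
// var Request = require('sdk/request').Request;
// fetchPage(Request, 'http://www.bing.com/', function (html) {
//     console.log(html.length);
// });
```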

Kwebble · Answer 2 · 2014-08-04T13:12:22.900

Without knowing the page and URL to find on it I can't create a complete solution, but here's an example Greasemonkey script I wrote that does something similar.

This script is for Java articles on DZone. When an article has a link to the source, it redirects to this source page:

// ==UserScript==
// @name        DZone source
// @namespace   com.kwebble
// @description Directly go to the source of a DZone article.
// @include     http://java.dzone.com/*
// @version     1
// @grant       none
// ==/UserScript==

var node = document.querySelector('a[target="_blank"]');

if (node !== null) {
    document.location = node.getAttribute('href');
}

Usage:

1. Install Greasemonkey if you haven't yet.
2. Create the script, similar to mine. Set the value for @include to the page that contains the URL to find.
3. Determine what identifies the part of the page with the destination URL and change the script to find that URL. For my script it's a link with a target of "_blank".

After saving the script visit the page with the link. Greasemonkey should execute your script and redirect the browser.

[edit] This searches script tags for text like you described and redirects.

// ==UserScript==
// @name        Test
// @namespace   com.kwebble
// @include     your_page
// @version     1
// @grant       none
// ==/UserScript==

var nodes = document.getElementsByTagName('script'),
    i, matches;

for (i = 0; i < nodes.length; i++) {
    if (nodes.item(i).innerHTML !== '') {
        // Capture the full URL, including the ".php" ending, so the redirect target is complete.
        matches = nodes.item(i).innerHTML.match(/windows\.location = "(.*?\.php)";/);

        if (matches !== null){
            document.location = matches[1];
        }
    }
}

The regular expression to find the URL might need some tweaking to match the exact page content.
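
One possible tweak, as a sketch: tolerate optional whitespace and either quote style, and capture the whole quoted value rather than stopping at a fixed extension. The function name `extractTarget` is hypothetical, and accepting both `window.location` and `windows.location` is an assumption about what the page really contains:

```javascript
// Hypothetical helper: return the URL assigned to window(s).location in
// `scriptText`, or null when no such assignment is found.
function extractTarget(scriptText) {
    // (["']) remembers the opening quote; \1 requires the same quote to close.
    var match = /windows?\.location\s*=\s*(["'])(.*?)\1/.exec(scriptText);
    return match === null ? null : match[2];
}

// Possible use inside the loop above:
// var target = extractTarget(nodes.item(i).innerHTML);
// if (target !== null) { document.location = target; }
```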

I want to get a url from the header of a page. The url is written in the javascript as `windows.location = "http://www.url.com/blah_blah.php";` — Ali Mohyudin, Aug 04 '14 at 12:19
I added a second version searching script contents in the page. — Kwebble, Aug 04 '14 at 13:13
okay that's very helpful! thanks! I need a little more help, there's only one windows.location but with different url every time. can I copy that url without any match? and finally I want to move that link. what to do for that? — Ali Mohyudin, Aug 04 '14 at 15:49
The different URL is OK, the code (.*?) means any value in that spot is considered the URL. I don't understand what you mean by 'copy that url without any match' and moving the link. If the regular expression matches then the value of matches[1] contains the URL, do with it what you want. — Kwebble, Aug 04 '14 at 21:54

erosman · Answer 3 · 2014-08-04T10:44:13.667

An addon and a Greasemonkey script take a similar approach, but an addon can use native Firefox APIs (though it is a lot more complicated than a script).

Basically, this is the process (without knowing your exact requirements):

1. Get the content of a remote URL with XMLHttpRequest()
2. Get the data that you need with RegEx or DOMParser()
3. Change the current URL to that target with location.replace()
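
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `fetchAndRedirect` is a hypothetical name, the extraction step is supplied by the caller as `extract` (a RegEx or DOMParser lookup that returns a URL or null), and the XMLHttpRequest constructor and location object are passed in as parameters so the flow can also be exercised outside a browser:

```javascript
// Hypothetical helper: fetch `url`, run `extract` on the returned markup,
// and navigate to whatever URL it returns.
function fetchAndRedirect(url, extract, XhrCtor, loc) {
    var req = new XhrCtor();
    req.open('GET', url, true);
    req.onload = function () {
        if (req.status !== 200) {
            return; // leave the current page alone on HTTP errors
        }
        var target = extract(req.responseText);
        if (target !== null) {
            loc.replace(target); // replace() keeps the current page out of history
        }
    };
    req.send(null);
}

// Possible use in a page or script (the URL and pattern are placeholders):
// fetchAndRedirect('http://example.com/page', function (html) {
//     var m = /href="([^"]+\.php)"/.exec(html);
//     return m ? m[1] : null;
// }, XMLHttpRequest, window.location);
```

In a Greasemonkey script a plain XMLHttpRequest is bound by the same-origin policy, so a cross-origin fetch would likely need GM_xmlhttpRequest instead.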

edited Aug 04 '14 at 10:44

answered Aug 04 '14 at 04:31
