Questions tagged [jaunt-api]

Jaunt is a free Java web-scraping and automation library.

Jaunt Beta is a new, free Java web-scraping/automation library. The API presents a lightweight, headless browser for interfacing with websites, web-apps, and web services. Jaunt makes it easy to parse, traverse, search, extract and filter HTML & XML data. It provides three levels of abstraction: DOM-level, component-level, and browser-level. It is an ideal API for web automation where JavaScript is not required, including: filling out and submitting forms; creating web-bots or web-scraping programs; creating REST clients for XML services; interfacing with web-based APIs or web-apps; and automated testing.

37 questions
1
vote
1 answer

Running Jaunt (web-scraper) on Google App Engine: Java

I'm trying to submit a form through App Engine and Jaunt. I get this error: java.lang.NullPointerException at com.jaunt.UserAgent.a(SourceFile:2337) at com.jaunt.UserAgent.send(SourceFile:887) at…
opowell
  • 568
  • 4
  • 20
0
votes
2 answers

.contains() problem: looks like it doesn't work

I'm working on a web scraper and can't solve a problem I've been having for the second day in a row. The problem with this method arises when the bot is supposed to visit the website, harvest all URLs, and add the ones it hasn't already visited to List<…
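
For the "add only URLs not yet visited" step described in this excerpt, here is a minimal sketch using only the JDK. The class `VisitedTracker` and its method are hypothetical names, not part of Jaunt or the asker's code. It keeps visited URLs in a `HashSet` rather than a `List`, so the membership check and the insert are a single `Set.add` call. Both `contains()` and `add()` compare strings with `equals`, so URLs that differ only in case or a trailing slash count as different entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of Jaunt): remembers which URLs were already seen.
final class VisitedTracker {
    private final Set<String> visited = new HashSet<>();

    /** Returns the harvested URLs that were not seen before, and marks them as seen. */
    List<String> keepUnvisited(List<String> harvested) {
        List<String> fresh = new ArrayList<>();
        for (String url : harvested) {
            // Set.add returns false when the element is already present,
            // so duplicates within one page and across pages are both skipped.
            if (visited.add(url)) {
                fresh.add(url);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        VisitedTracker tracker = new VisitedTracker();
        System.out.println(tracker.keepUnvisited(
                Arrays.asList("http://example.com/a", "http://example.com/b")));
        System.out.println(tracker.keepUnvisited(
                Arrays.asList("http://example.com/b", "http://example.com/c")));
    }
}
```

The second call in `main` yields only `http://example.com/c`, because `http://example.com/b` was recorded by the first call.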
0
votes
0 answers

Is there a way to use Google Reverse Geocoding with HtmlUnit or Selenium?

I tried jsoup, HtmlUnit, and Selenium, but I don't see a script form for reverse geocoding. How is this possible? Can I use this script? https://developers.google.com/maps/documentation/geocoding/intro
nyavuzcan
  • 1
  • 4
0
votes
0 answers

Proxy does not work when I set it programmatically in Java using the Jaunt library

I have a simple scraping program based on the Jaunt library that needs to set and change the proxy programmatically. private static void setProxy(){ System.setProperty("https.proxyHost", ProxyProducer.getProxy()); // this class just returns host…
Anton M
  • 15
  • 1
  • 7
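
The excerpt above sets the standard JDK networking property `https.proxyHost`. As a hedged sketch, the JDK also defines `https.proxyPort` and the matching `http.*` pair for plain HTTP requests, and the helper below sets all four together. `ProxySettings`, the host `proxy.example.com`, and port `8080` are hypothetical placeholders. Whether Jaunt's `UserAgent` honors these system properties is not confirmed by the excerpt.

```java
// Hypothetical helper: sets the standard JDK proxy system properties.
final class ProxySettings {
    /** Applies the same proxy host and port to both HTTP and HTTPS traffic. */
    static void apply(String host, int port) {
        String portText = Integer.toString(port);
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", portText);
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", portText);
    }

    public static void main(String[] args) {
        // Placeholder values; substitute a real proxy host and port.
        apply("proxy.example.com", 8080);
        System.out.println(System.getProperty("https.proxyHost")
                + ":" + System.getProperty("https.proxyPort"));
    }
}
```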
0
votes
1 answer

How to get data from the Java web scraping API?

I am trying to get table data from the following URL: Get Data from this URL, and I wrote this code with the help of the Jaunt API: package org.open.browser; import com.jaunt.Element; import com.jaunt.Elements; import com.jaunt.JauntException; import…
Subodh Joshi
  • 12,717
  • 29
  • 108
  • 202
0
votes
1 answer

Jaunt hyperlink replacing values with %3F and %3D

I'm currently retrieving hyperlinks from websites with the Jaunt API provided for Java. The code is as follows: for (Element link : UA.doc.findEvery("").findEvery("")) { String temp = link.getAt("href"); …

Michael
  • 87
  • 1
  • 2
  • 11
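
For context on the title above: `%3F` and `%3D` are the percent-encoded forms of `?` and `=`. A minimal JDK-only sketch of turning them back into literal characters follows. `HrefDecoder` and the sample URL are hypothetical, and whether decoding is the right fix depends on where the escaping is introduced, which the truncated excerpt does not show.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Jaunt): decodes percent-escapes in an href value.
final class HrefDecoder {
    /** Decodes escapes such as %3F ('?') and %3D ('='). Needs Java 10 or later. */
    static String decode(String href) {
        // Caveat: URLDecoder follows form-encoding rules, so a literal '+' becomes a space.
        return URLDecoder.decode(href, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Prints http://example.com/page?id=5
        System.out.println(decode("http://example.com/page%3Fid%3D5"));
    }
}
```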
0
votes
1 answer

How to get the InnerHTML of dynamic loading webpage?

I am new to Java and am using the Jaunt 1.3.8 library for web scraping. I am trying to get the innerHTML of the webpage https://www.justdial.com/Pune/Cake-Shops/nct-10070075. The site does not show the full list of search results. When we reach the…
Tyson
  • 17
  • 8
0
votes
0 answers

ClassNotFoundException and NoClassDefFoundError while using Jaunt Library?

I am working on a web scraper using the Jaunt library. I am currently getting the following runtime error on the Linux terminal: Error: A JNI error has occurred, please check your installation and try again Exception in thread "main"…
0
votes
1 answer

ResponseException in Jaunt

Here is the error message: UserAgent.sendGET; response error requestUrl: https://www.linkedin.com/directory/topics-c/ response: requestURL: https://www.linkedin.com/directory/topics-c/ status: 999 Here is my code: try { Document doc =…
user2252882
0
votes
1 answer

Error clicking button in form with Jaunt

So I am trying to submit a form using Jaunt. There are two submit buttons, a check and an apply. I am trying to click the check button but am having some trouble because it can't find the button with the identifier "check". I am basically copying…
k9b
  • 1,457
  • 4
  • 24
  • 54
0
votes
1 answer

Cannot get form from webpage

I am trying to get the login form from https://www.etoro.com/login. When I inspect the page in Chrome I can see the element; however, when I use the Jaunt API in Java I cannot get the form. userAgent = new…
Dwest
  • 3
  • 2
0
votes
0 answers

Finding Jaunt Element not working?

I'm trying to get a specific element off YouTube (the title of the video). HTML: http://pastebin.com/cjr2SgNd Important HTML part: I…
ruyili
  • 694
  • 10
  • 25
0
votes
1 answer

Jaunt web crawler API doesn't treat relative URLs correctly

I'm implementing a crawler that does something like: repeat: visit each page and get all links that have not been visited; until there are no new links. The page it is crawling is https://www.mercadoribeirao.com.br. I'm getting all links like:
alexpfx
  • 6,412
  • 12
  • 52
  • 88
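
As a reference point for the relative-URL question above, this is a minimal sketch of resolving an `href` against the page it was found on, using only `java.net.URI` from the JDK. `LinkResolver` is a hypothetical helper, and the paths under the site's host are made up for illustration. It does not show how Jaunt itself resolves links.

```java
import java.net.URI;

// Hypothetical helper (not part of Jaunt): resolves an href against its page URL.
final class LinkResolver {
    /**
     * Returns the absolute form of href relative to pageUrl.
     * Absolute hrefs come back unchanged. URI.create throws
     * IllegalArgumentException for strings that are not valid URIs.
     */
    static String resolve(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Illustrative page path; only the host comes from the question.
        String page = "https://www.mercadoribeirao.com.br/produtos/lista.html";
        System.out.println(resolve(page, "detalhe.html")); // sibling document
        System.out.println(resolve(page, "../contato"));   // parent directory
        System.out.println(resolve(page, "/ofertas"));     // root-relative
    }
}
```

One design note: the base URL should include a path (at least a trailing `/`), since `URI.resolve` against a host-only base with a relative path can produce a malformed result.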
0
votes
1 answer

Jaunt-api cookie issue

I am trying to log in to Yahoo Mail with jaunt-api but I am getting a "cookie not enabled issue". I am new to jaunt-api, so please help me. I am using the following code. try { UserAgent userAgent = new UserAgent(); …
Ravi Kumar
  • 432
  • 4
  • 14
0
votes
1 answer

Problems logging in to a webpage using Java and jaunt-api

So I'm trying to log in to a webpage using Jaunt. The first thing to mention is that the webpage is .aspx and the submit button has an attribute onclick="javascript:WebForm_DoP..." and, as far as I know, Jaunt doesn't support JavaScript, right? In case…
lpares12
  • 3,504
  • 4
  • 24
  • 45