
I am trying to extract the images from a web page, but not everything I get back is an image, and some of the images are too small to see.

def webService(url: String) = Action.async { implicit request =>
  WS.url("http://" + url).get().map { response =>
    val s = " src=\"" // src also appears on the jQuery script path
    var all = response.body
    var state = true
    var set = Set[String]()
    while (state) {
      val ix = all.indexOf(s) + s.length
      val imageUrl = all.substring(ix, all.indexOf("\"", ix + 1))
      all = all.substring(ix, all.length)
      state = all.contains("<img") || all.contains("< img")
      if (imageUrl.contains(".jpg")) {
        set = set + (imageUrl.split(".jpg")(0) + ".jpg")
      } else if (imageUrl.contains(".jpeg")) {
        set = set + (imageUrl.split(".jpeg")(0) + ".jpeg")
      } else if (imageUrl.contains(".png")) {
        set = set + (imageUrl.split(".png")(0) + ".png")
      }
    }
    Ok(views.html.webService("", url, set))
  }
}

I want all images with their dimensions, so I can remove unwanted images.

  • I am using Play Framework 2.2 with Scala.
Govind Singh
  • You'll need to clarify what you want to achieve with this code. Also, what's with all the `split`ing and putting-back-together? I would recommend using a library such as [JSoup](http://jsoup.org/) when working with/scraping HTML - there's even a [Scala-pimped version](https://github.com/filosganga/ssoup) – millhouse Nov 11 '14 at 06:22

1 Answer


To parse HTML I would use an HTML Parser - something like jsoup.

import java.net.URL
import org.jsoup.Jsoup
import scala.collection.JavaConversions._

val timeout = 1000
val url = new URL("http://www.stackoverflow.com")

val imgTags = Jsoup.parse(url, timeout).select("img[src]").toList
val imgSources = imgTags.map(_.attributes.get("src"))
// imgSources: List[String] = List(https://i.stack.imgur.com/tKsDb.png, ...
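One caveat: `src` values are often relative (for example `/img/logo.png`), and `new URL(...)` rejects those when called without a base. Below is a minimal sketch of resolving them against the page URL using only `java.net.URL`. The helper name `resolveSrc` is hypothetical, not part of jsoup or Play. jsoup can also do this itself via `absUrl("src")` on the element.

```scala
import java.net.URL
import scala.util.Try

// Hypothetical helper: resolve a possibly relative src against the page URL.
// An absolute src is returned unchanged; a value that cannot be parsed
// as a URL yields None instead of throwing MalformedURLException.
def resolveSrc(pageUrl: URL, src: String): Option[URL] =
  Try(new URL(pageUrl, src.trim)).toOption
```

With this, something like `imgSources.flatMap(resolveSrc(url, _))` would give a list of absolute URLs that can be passed on for further checks.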

To get the dimensions of an image, you can use javax.imageio.ImageIO:

case class ImageDimension(height: Int, width: Int)

def imageDimension(imageUrl: String): ImageDimension = {
  import javax.imageio.ImageIO
  import java.net.URL
  val img = ImageIO.read(new URL(imageUrl))
  ImageDimension(img.getHeight, img.getWidth)
}

scala> imageDimension("https://www.google.de/images/srpr/logo11w.png")
res0: ImageDimension = ImageDimension(190,538)
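Since the goal is to drop images that are too small, the dimension check can feed a filter. `ImageIO.read` returns `null` when no registered reader can decode the input (which is also a way to spot things that are not images), so the sketch below wraps it in `Option`. The names `ImageSize`, `readSize`, and `isLargeEnough`, and the 100-pixel default threshold, are assumptions chosen for illustration. It reads from a byte array so it can be tried without network access.

```scala
import java.io.ByteArrayInputStream
import javax.imageio.ImageIO
import scala.util.Try

// Hypothetical type, kept separate from ImageDimension above.
case class ImageSize(height: Int, width: Int)

// Decode the bytes and return the size, or None if decoding fails
// or no ImageIO reader recognises the data (ImageIO.read returns null).
def readSize(bytes: Array[Byte]): Option[ImageSize] =
  Try(ImageIO.read(new ByteArrayInputStream(bytes)))
    .toOption
    .flatMap(Option(_))
    .map(img => ImageSize(img.getHeight, img.getWidth))

// Keep an image only if both sides reach the (assumed) minimum.
def isLargeEnough(size: ImageSize, minSide: Int = 100): Boolean =
  size.height >= minSide && size.width >= minSide
```

The same `Try` and `Option` wrapping should work around `ImageIO.read(new URL(...))`, so that unreachable or non-image URLs are skipped instead of failing the whole request.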
j-keck