-2

I am going to buy a new smartphone that will run LineageOS. Because my choice of phone is always guided by its price, I would like to generate a price list covering all devices supported by LineageOS.

To make it clear: this question is not asking for exact step-by-step instructions. It is about finding the right technologies/tools to get the work done.

Which tools can I use to list prices for all devices that have LineageOS ROMs?


The only tools I know for generating such a list are `grep` and `wget` in a `bash` environment. This is not the most efficient way to get the work done, and I hope someone can point out more suitable tools. Nevertheless, here is my recipe for generating the list:

  1. Using wget to download the devices page or the LineageOS statistics web service

  2. Using awk and/or grep to filter a plain list of all devices out of the page's source code

  3. Using a bash for-loop that calls wget for every device string against the RESTful API (is this really the right name for that technology?) of idealo or amazon. This may look like this:

    for device in $DEVICES; do wget -O "$device.html" "https://www.idealo.de/preisvergleich/MainSearchProductCategory.html?q=$device"; done

  4. Using grep to find the line with the device's price by filtering for the first search result. This may be as ugly as this:

    grep -A999999 pageContent-wrapper device.html | grep -m1 -A2 ">price-prefix" | grep "€"

  5. Using cut to extract the price itself from the line

  6. Output a well formatted list by using something like this:

    echo "$device $price"
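Steps 2 and 4 to 6 can be tried offline on saved markup before hitting any website. The following is only a rough sketch: the helper names `list_devices` and `first_price` are made up for illustration, the sample HTML strings are invented, and it assumes that device links look like `/model/<codename>` and that prices are written in the German style with a plain space before the euro sign (for example `128,22 €`). The real pages may well differ.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read HTML on stdin and print one device codename per line.
# Assumes links of the form href="/model/<codename>".
list_devices() {
  grep -oE 'href="/model/[^"]+"' | cut -d/ -f3 | tr -d '"'
}

# Hypothetical helper: read HTML on stdin and print the first euro price found.
# Assumes German formatting such as "128,22 €" or "1.299,00 €".
first_price() {
  grep -oE '[0-9.]+,[0-9]{2} ?€' | head -n 1
}

# Invented sample markup, for illustration only.
stats_html='<a href="/model/m8">m8</a> <a href="/model/bacon">bacon</a>'
shop_html='<div class="price"> 128,22 € </div><div class="price">1.299,00 €</div>'

printf '%s\n' "$stats_html" | list_devices

price=$(printf '%s\n' "$shop_html" | first_price)
printf '%s %s\n' "m8" "$price"
```

Regular expressions like these break as soon as the markup changes, so this is best seen as a quick check of the idea rather than a robust scraper.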

eDeviser
  • 1
    I don't see a shortcut around getting each device's data and that seems to be the bottleneck for time. Your general idea seems like it will work. If you need this soon, go ahead and do that, so at least you're getting something. If this is the only thing you have to work on, then you can use your first 10+ devices to see if there are any optimizations you can do and then recode to take advantage. Good luck. – shellter Dec 16 '19 at 17:35
  • 1
    To parse html, use a proper parser like in my answer – Gilles Quénot Dec 16 '19 at 17:54

1 Answer

2

 Here we go, using a proper HTML parser and XPath:

# The process substitution at the bottom lists the device codenames;
# the loop then looks up the first listed price for each of them.
while read -r device; do
  printf '%s %s\n' "$device" $(
  saxon-lint --html --xpath '(//div[@class="offerList-item-priceMin"])[1]/text()' \
  "https://www.idealo.de/preisvergleich/MainSearchProductCategory.html?q=$device"
)
done < <(
  saxon-lint --html --xpath '//a[starts-with(@href, "/model")]/text()' \
  https://stats.lineageos.org/
)

Check saxon-lint

 Output

m8 128,22 €
bacon 1,89 €
riva 224,99 €
cancro 8,35 €
klte 
j7eltexx 
t0lte 
wt88047 
i9300 35,58 €
mido 558,00 €
...
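Devices without a search hit (klte, j7eltexx, ...) end up with an empty price column. As a small sketch, a hypothetical helper `print_row` (not part of saxon-lint or any other tool) could make such gaps explicit by printing a placeholder:

```shell
#!/usr/bin/env bash

# Hypothetical helper: print "device price", or "device n/a" if the price is empty.
# $1 = device codename, $2 = price text (may be empty or missing)
print_row() {
  printf '%s %s\n' "$1" "${2:-n/a}"
}

print_row m8 "128,22 €"
print_row klte ""
```

The `${2:-n/a}` expansion falls back to `n/a` whenever the second argument is unset or empty, which keeps the list easy to sort or filter afterwards.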

 Note

Testing a5y17lte (for example) via https://www.idealo.de/preisvergleich/MainSearchProductCategory.html?q=a5y17lte returns no result.

The site is not reliable. Another example: bacon at 1,89 € is not a phone :D

Another working tool is xidel:
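One way to catch hits like the bacon one is to flag prices that are implausibly low for a phone. The sketch below assumes German-formatted prices and an arbitrary threshold of 20 euros; both the threshold and the helper names `to_number` and `plausible` are assumptions made for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: turn "1.299,00 €" into "1299.00".
to_number() {
  printf '%s\n' "$1" | sed -e 's/[^0-9,]//g' -e 's/,/./'
}

# Hypothetical helper: succeed if price ($1) is at least the minimum ($2, in euros).
# An empty price counts as 0 and therefore fails the check.
plausible() {
  awk -v p="$(to_number "$1")" -v min="$2" 'BEGIN { exit !(p + 0 >= min + 0) }'
}

# Two rows taken from the output above, in "device|price" form.
for row in "m8|128,22 €" "bacon|1,89 €"; do
  device=${row%%|*}
  price=${row#*|}
  if plausible "$price" 20; then
    echo "$device $price"
  else
    echo "$device $price (suspicious, check manually)"
  fi
done
```

This only filters out obviously wrong matches; a price that passes the check can still belong to an accessory or a different product.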

xidel -e '//a[starts-with(@href, "/model")]/text()' https://stats.lineageos.org/
Gilles Quénot