1

I'm trying to print the information about each cell of a table on a single line, and I need to figure out how to print the header of that cell's column alongside it.

td, table {
  border: 2px black solid;
}
<table>
  <tr>
    <td>a1</td>
    <td>a2</td>
    <td>a3</td>
    <td>a4</td>
  </tr>
  <tr>
    <td>b1</td>
    <td>b2</td>
    <td>b3</td>
    <td>b4</td>
  </tr>
  <tr>
    <td>c1</td>
    <td>c2</td>
    <td>c3</td>
    <td>c4</td>
  </tr>
  <tr>
    <td>d1</td>
    <td>d2</td>
    <td>d3</td>
    <td>d4</td>
  </tr>
</table>
Table 1
+----+----+----+----+
| a1 | a2 | a3 | a4 |
+----+----+----+----+
| b1 | b2 | b3 | b4 |
+----+----+----+----+
| c1 | c2 | c3 | c4 |
+----+----+----+----+
| d1 | d2 | d3 | d4 |
+----+----+----+----+

Table 2
+----+----+----+----+
| e1 | e2 | e3 | e4 |
+----+----+----+----+
| f1 | f2 | f3 | f4 |
+----+----+----+----+
| g1 | g2 | g3 | g4 |
+----+----+----+----+
| h1 | h2 | h3 | h4 |
+----+----+----+----+

And Other Tables ...

I want each cell printed together with the cell at the top of its column (i.e. the one in tr[1]).

The output shouldn't include the first row.

The first output should be:

The cell b1 has the header a1

..

The cell g2 has the header e2

and so on ..

I'm using xidel:

xidel $site -e "//tr[position()>1]/td/concat('The cell ', ., $codeX)"

What should the value of $codeX be?

Thanks,

user37421

3 Answers

2

Xidel supports XQuery 3.0, so to structure the task I would suggest something like:

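(: handle each table separately so a cell is paired with the first row of its own table :)
for $table in //table
let $rows := $table//tr,
    $header-cells := $rows[1]/td
for $data-row in $rows[position() gt 1]
for $cell at $pos in $data-row/td
return $cell!('cell ' || . || ' has header ' || $header-cells[$pos])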

I'm not sure how well that works from the command line, but it does the job.
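For anyone who wants to check the pairing logic outside xidel, here is a minimal sketch in Python (standard library only). It is not the xidel solution itself, just the same idea: take the first row of each table as the headers and match every later cell to the header at the same position. The sample markup, the `<root>` wrapper and the function name `describe_cells` are made up for illustration, and the input is assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: two small tables wrapped in one root element.
HTML = """
<root>
  <table>
    <tr><td>a1</td><td>a2</td></tr>
    <tr><td>b1</td><td>b2</td></tr>
  </table>
  <table>
    <tr><td>e1</td><td>e2</td></tr>
    <tr><td>f1</td><td>f2</td></tr>
    <tr><td>g1</td><td>g2</td></tr>
  </table>
</root>
"""


def describe_cells(markup):
    """Return one sentence per data cell, naming the header cell above it."""
    root = ET.fromstring(markup)
    lines = []
    for table in root.iter("table"):
        rows = table.findall("tr")
        if not rows:
            continue
        # The first row of each table supplies the column headers.
        headers = [td.text for td in rows[0].findall("td")]
        for row in rows[1:]:
            for pos, td in enumerate(row.findall("td")):
                # Skip cells that have no header at the same position.
                if pos < len(headers):
                    lines.append(
                        f"The cell {td.text} has the header {headers[pos]}"
                    )
    return lines


if __name__ == "__main__":
    print("\n".join(describe_cells(HTML)))
```

Because the headers are looked up per table, cells of the second table are paired with e1/e2 rather than with the first table's a1/a2.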

Martin Honnen
  • This can be done simpler: `for $row in //tr[position()>1] for $cell at $i in $row/td return concat('cell ',$cell,' has header ',//tr[1]/td[$i])`. – Reino Jan 07 '19 at 01:07
1

You can get it with XPath alone:

//table//tr[1]/td[count(//table//td[text()='${cellValue}']/preceding-sibling::*) + number(boolean(//table//td[text()='${cellValue}']/preceding-sibling::*))]

Note: for a cell value that exists in the table (e.g. 'b3'), this returns the matching header cell ('a3'). For a value that does not occur in the table, it correctly returns nothing, because the computed position is 0 and no header cell has that position. The same happens for a cell in the first column (e.g. 'b1'): it has no preceding siblings, so the position is also 0 and no header is returned.
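To make the position arithmetic easier to follow, here is a small Python sketch (standard library only) of the same lookup: find the cell with the given text, use the number of cells before it in its row as the column index, and read the first-row cell at that index. The sample table and the name `header_for` are hypothetical, and unlike the XPath above this version also handles first-column cells.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample table, assumed to be well-formed XML.
TABLE = """
<table>
  <tr><td>a1</td><td>a2</td><td>a3</td></tr>
  <tr><td>b1</td><td>b2</td><td>b3</td></tr>
</table>
"""


def header_for(markup, cell_value):
    """Return the first-row text above the cell with this text, or None."""
    table = ET.fromstring(markup)
    rows = table.findall("tr")
    if not rows:
        return None
    headers = rows[0].findall("td")
    for row in rows[1:]:
        # index equals the number of preceding sibling cells in the row.
        for index, td in enumerate(row.findall("td")):
            if td.text == cell_value:
                return headers[index].text if index < len(headers) else None
    return None


if __name__ == "__main__":
    for value in ("b3", "b1", "zz"):
        print(value, "->", header_for(TABLE, value))
```

A value that is not in the table yields `None`, which mirrors the empty result of the XPath.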

fpsthirty
0

To get the table header text, select the cells of the first tr: //tr[1]/td, or //tr[1]/th if th elements are used for the header (as would be expected).

To get the header for a column from the text of one of its cells, try this XPath on the table at https://www.w3schools.com/css/tryit.asp?filename=trycss_table_border

//th[count((//tr/td[text()='Griffin'])[1]/preceding-sibling::td) + 1]

The logic is: take the first td with the given text, count the td cells that precede it in its row to get its column position, and then select the th at that position.

Vitaliy Moskalyuk