
I want to generate Java bindings for the /usr/include/yara.h header file using the https://github.com/openjdk/jextract tool.

From readme:

jextract is a tool which mechanically generates Java bindings from native library headers. This tool leverages the clang C API in order to parse the headers associated with a given native library, and the generated Java bindings build upon the Foreign Function & Memory API. The jextract tool was originally developed in the context of Project Panama (and then made available in the Project Panama Early Access binaries).

I am able to build and test the project but I'm having some issues with the type definition of intmax_t when running the command:

build/jextract/bin/jextract --source --output java-yara --target-package com.virustotal.yara /usr/include/yara.h
/usr/include/inttypes.h:290:8: error: unknown type name 'intmax_t'

I checked the /usr/include/inttypes.h file, and it does include the <stdint.h> header:

/*
 *  ISO C99: 7.8 Format conversion of integer types <inttypes.h>
 */

#ifndef _INTTYPES_H
#define _INTTYPES_H 1

#include <features.h>
/* Get the type definitions.  */
#include <stdint.h>

My fork at https://github.com/MaurizioCasciano/jextract/tree/unknown_type_name_intmax_t includes a simple Taskfile to build the project and reproduce the error. If you prefer, run these instead of the full commands:

task build
task yara-extract

I also tried adding the relevant include paths, but that didn't help:

-I /usr/include
-I /usr/include/yara
-I /usr/lib/llvm-15/lib/clang/15.0.6/include

What else am I missing to make jextract resolve the typedef for 'intmax_t'?

Below you can see the tools I'm using.

Environment

$ hostnamectl
Operating System: Ubuntu 22.04.1 LTS              
          Kernel: Linux 5.15.0-52-generic
    Architecture: x86-64

$ java --version
openjdk 19 2022-09-20
OpenJDK Runtime Environment (build 19+36-2238)
OpenJDK 64-Bit Server VM (build 19+36-2238, mixed mode, sharing)

$ gradle --version

------------------------------------------------------------
Gradle 7.6
------------------------------------------------------------

Build time:   2022-11-25 13:35:10 UTC
Revision:     daece9dbc5b79370cc8e4fd6fe4b2cd400e150a8

Kotlin:       1.7.10
Groovy:       3.0.13
Ant:          Apache Ant(TM) version 1.10.11 compiled on July 10 2021
JVM:          19 (Oracle Corporation 19+36-2238)
OS:           Linux 5.15.0-52-generic amd64

$ llvm-config-15 --version
15.0.6

$ clang-15 --version
Ubuntu clang version 15.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

I am also able to parse the /usr/include/yara.h header file with clang directly and generate the corresponding Abstract Syntax Tree (AST) representation of the code:

clang-15 -fsyntax-only -Xclang -ast-dump /usr/include/yara.h | grep intmax_t

and the reference is correctly returned:


|-TypedefDecl 0x559220f80400 <line:72:1, col:18> col:18 referenced __intmax_t 'long'
|-TypedefDecl 0x559220f80470 <line:73:1, col:27> col:27 referenced __uintmax_t 'unsigned long'
|-TypedefDecl 0x55922102e2f0 <line:101:1, col:21> col:21 referenced intmax_t '__intmax_t':'long'
| `-TypedefType 0x55922102e2c0 '__intmax_t' sugar
|   |-Typedef 0x559220f80400 '__intmax_t'
|-TypedefDecl 0x55922102e380 <line:102:1, col:22> col:22 referenced uintmax_t '__uintmax_t':'unsigned long'
| `-TypedefType 0x55922102e350 '__uintmax_t' sugar
|   |-Typedef 0x559220f80470 '__uintmax_t'
|-FunctionDecl 0x55922102e788 <line:290:1, col:74> col:17 imaxabs 'intmax_t (intmax_t)' extern
| |-ParmVarDecl 0x55922102e6c0 <col:26, col:35> col:35 __n 'intmax_t':'long'
|-FunctionDecl 0x55922102ea68 <line:293:1, line:294:41> line:293:18 imaxdiv 'imaxdiv_t (intmax_t, intmax_t)' extern
| |-ParmVarDecl 0x55922102e8e0 <col:27, col:36> col:36 __numer 'intmax_t':'long'
| |-ParmVarDecl 0x55922102e958 <col:45, col:54> col:54 __denom 'intmax_t':'long'
|-FunctionDecl 0x559221040998 <line:297:1, /usr/include/x86_64-linux-gnu/sys/cdefs.h:79:54> /usr/include/inttypes.h:297:17 strtoimax 'intmax_t (const char *restrict, char **restrict, int)' extern
|-FunctionDecl 0x559221040cc8 </usr/include/inttypes.h:301:1, /usr/include/x86_64-linux-gnu/sys/cdefs.h:79:54> /usr/include/inttypes.h:301:18 strtoumax 'uintmax_t (const char *restrict, char **restrict, int)' extern
|-FunctionDecl 0x5592210410e8 </usr/include/inttypes.h:305:1, /usr/include/x86_64-linux-gnu/sys/cdefs.h:79:54> /usr/include/inttypes.h:305:17 wcstoimax 'intmax_t (const __gwchar_t *restrict, __gwchar_t **restrict, int)' extern
|-FunctionDecl 0x559221041428 </usr/include/inttypes.h:310:1, /usr/include/x86_64-linux-gnu/sys/cdefs.h:79:54> /usr/include/inttypes.h:310:18 wcstoumax 'uintmax_t (const __gwchar_t *restrict, __gwchar_t **restrict, int)' extern
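
Since clang itself resolves the typedef, it may also help to see which include directories clang searches by default, so they can be compared with what jextract is given. This is a minimal sketch, assuming `clang-15` is on the PATH as in the environment above; the helper name `include_search_list` is made up for illustration and simply filters clang's verbose output:

```shell
#!/bin/sh
# Print only the include search list from clang's verbose preprocessor output.
# The verbose output is read on stdin, so the filter is independent of the clang version.
include_search_list() {
  sed -n '/search starts here:/,/^End of search list/p'
}

# Usage (assumes clang-15 is installed):
#   clang-15 -E -x c -v /dev/null 2>&1 | include_search_list
```

The filter keeps everything between the first "search starts here:" line and the "End of search list." line, which is where clang lists its quote and system include directories.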
1Z10
  • `-I /usr/lib/llvm-15/lib/clang/15.0.6/include` this include looks strange to me. The standard header files should already be bundled with jextract in the conf/jextract directory of the built runtime image, and they should be included automatically. – Jorn Vernee Dec 27 '22 at 17:14
  • You might want to try with a prebuilt jextract as well, as you might have a broken jextract build: https://jdk.java.net/jextract/ (the build scripts are not super good at validating the input clang package, and last time we checked the system llvm package on Ubuntu didn't work. You might want to get a pre-built package from github instead if you want to build yourself: https://github.com/llvm/llvm-project/releases/tag/llvmorg-13.0.0). – Jorn Vernee Dec 27 '22 at 17:22
  • This `LLVM` include is where I found other `inttypes.h` and `stdint.h` files, apart from the standard `/usr/include`. However, it did not help. I'll try with the LLVM `13.0.0` instead of the `15.0.6`. – 1Z10 Dec 27 '22 at 18:15
  • Right, but `inttypes.h` and `stdint.h` are already bundled with jextract (should be in `conf/jextract`), so you shouldn't need to have a `-I` flag for it. – Jorn Vernee Dec 27 '22 at 18:32
  • I checked: `inttypes.h` and `stdint.h` are correctly placed under `build/jextract/conf/jextract/`. And all the tests pass without errors, even with `LLVM 15.0.6` – 1Z10 Dec 27 '22 at 18:35
  • If you would like to check, I did also push all the yara headers under the `samples/yara` dir. https://github.com/MaurizioCasciano/jextract/tree/yara/samples/yara – 1Z10 Dec 27 '22 at 18:39
  • I gave it a go here, and it seems to work with your repo on the `yara` branch. I built jextract with the LLVM 13.0.0 package I linked to, and then ran `./build/jextract/bin/jextract --source --output java-yara --target-package com.virustotal.yara -I ./samples/yara ./samples/yara/yara.h` (There are warnings about `long double` but that is expected) – Jorn Vernee Dec 27 '22 at 18:53
  • If you still can't make it work, I suggest sending this question to the mailing list at `jextract-dev@openjdk.org`. Maurizio (from Panama) is more familiar with Linux/Ubuntu so he might have more ideas (but he's currently on vacation, so it might be next week before he replies). – Jorn Vernee Dec 27 '22 at 19:00
  • Ah, looks like you already did that :) – Jorn Vernee Dec 27 '22 at 19:01
  • yes, I did :) I'll check and post some updates. Thanks – 1Z10 Dec 27 '22 at 19:07

1 Answer

As suggested by @Jorn Vernee, this issue was caused by the LLVM and Clang installation on Ubuntu, which used the APT packages available at https://apt.llvm.org/

I tested it without any problem using the latest release archive available on GitHub.

The only warnings are about `long double` native types being skipped.

This is the Taskfile.yaml that downloads and extracts the LLVM+Clang archive and generates the YARA Java bindings inside a simple Maven module structure:

version: 3

dotenv: ['.env']

env:
  LLVM_HOME: "./libs/clang_llvm"
  JTREG_HOME: "/usr/lib/jtreg"

vars:
  PROJECT_ROOT:
    sh: echo $PWD
  YARA_JAVA_ROOT: "samples/yara/yara-java"
  YARA_JAVA_SRC: "{{.YARA_JAVA_ROOT}}/src/main/java"
  YARA_JAVA_RESOURCES: "{{.YARA_JAVA_ROOT}}/src/main/resources"
  YARA_JAVA_HEADERS: "{{.YARA_JAVA_RESOURCES}}/headers"

tasks:
  setup:
    cmds:
      - mkdir -p libs/clang_llvm
      - wget -nc -O libs/clang_llvm.tar.xz --show-progress https://github.com/llvm/llvm-project/releases/download/llvmorg-15.0.6/clang+llvm-15.0.6-x86_64-linux-gnu-ubuntu-18.04.tar.xz
      - tar --strip-components=1 -xvf libs/clang_llvm.tar.xz -C libs/clang_llvm
  build:
    cmds:
      - gradle -Pjdk19_home="${JAVA_HOME}" -Pllvm_home="${LLVM_HOME}" clean verify
  test:
    cmds:
      - gradle -Pjdk19_home="${JAVA_HOME}" -Pllvm_home="${LLVM_HOME}" -Pjtreg_home="${JTREG_HOME}" jtreg
  yara-headers:
    cmds:
      - mkdir -p {{.YARA_JAVA_HEADERS}}
      - cp /usr/include/yara.h {{.YARA_JAVA_HEADERS}}
      - cp -r /usr/include/yara {{.YARA_JAVA_HEADERS}}
  yara-java:
    cmds:
      - task: yara-java-includes
      - sh -c "(
                build/jextract/bin/jextract
                -l yara
                --source
                --output {{.YARA_JAVA_SRC}}
                --target-package com.virustotal.yara
                {{.YARA_JAVA_HEADERS}}/yara.h
              )"
  yara-java-includes:
    cmds:
      - mkdir -p {{.YARA_JAVA_RESOURCES}}
      - sh -c "(
                build/jextract/bin/jextract
                --dump-includes {{.YARA_JAVA_RESOURCES}}/includes.txt
                {{.YARA_JAVA_HEADERS}}/yara.h
              )"
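
The tasks above are meant to be run in sequence. As a rough sketch, assuming the `task` binary is installed and the Taskfile sits in the repository root, they could be chained like this. The function name `run_yara_pipeline` and the `DRY_RUN` switch are hypothetical helpers for illustration, not part of the Taskfile:

```shell
#!/bin/sh
# Run the Taskfile targets in the order they build on each other:
# fetch LLVM, build jextract, copy the YARA headers, generate the bindings.
# With DRY_RUN=1 the commands are only printed, not executed.
run_yara_pipeline() {
  for target in setup build yara-headers yara-java; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "task $target"
    else
      # Stop at the first failing task so later steps do not run on a broken build.
      task "$target" || return 1
    fi
  done
}

# Usage:
#   DRY_RUN=1 run_yara_pipeline   # preview the commands
#   run_yara_pipeline             # actually run them
```

The `test` task is left out of the sequence because it additionally needs jtreg at `JTREG_HOME`.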
1Z10