
We are trying to avoid duplicate code in a project where Java and Python are used together. The majority of the code base is in Java, with Python now being added because of its prevalence in machine learning.

In a green-field scenario, we'd start with something like Swagger or protobuf and derive the models from the generated code. But that is not an option now.

The Java classes carry some annotations and target Java 8.

While researching, I found the following possible route to turn the structure (without the methods) of the classes into Python class structures:

  1. Generate XML schemas from the Java classes
  2. Generate Python classes from the XML schema files

An added benefit: the two languages actually communicate via XML in our project, so the schema files are helpful for other use cases. We're using Maven to build the Java code, so it would be nice to include the schema generation in the Maven build.

I included this in the pom.xml:

<!-- https://mvnrepository.com/artifact/org.codehaus.mojo/jaxb2-maven-plugin -->
<dependency>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>jaxb2-maven-plugin</artifactId>
    <version>2.3.1</version>
</dependency>

as well as the default plugin configuration

<plugin>
            <groupId>org.codehaus.mojo</groupId>
            <artifactId>jaxb2-maven-plugin</artifactId>
            <executions>
                <execution>
                    <id>schemagen</id>
                    <goals>
                        <goal>schemagen</goal>
                    </goals>
                </execution>
            </executions>
            <!--
                Use default configuration, implying that sources are read
                from the directory src/main/java below the project basedir.

                (i.e. getProject().getCompileSourceRoots() in Maven-speak).
            -->
        </plugin>

But I get an error:

[ERROR] Failed to execute goal org.codehaus.mojo:jaxb2-maven-plugin:2.3.1:schemagen (default-cli) on project common: JAXB errors arose while SchemaGen compiled sources to XML. -> [Help 1]       
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.codehaus.mojo:jaxb2-maven-plugin:2.3.1:schemagen (default-cli) on project common: JAXB errors arose while SchemaGen compiled sources to XML.  

I then looked into JSON Schemas as an intermediary but that also doesn't really cut it because it's not easily possible to create Python class source code from JSON schemas.

So is there any way to generate simple "POJO" Python classes from Java code? No methods, no complex cross-compilation, just a simple structural conversion. I can generate UML diagrams from the Java files in IntelliJ, so all the information is there. I just need a tool that helps with the conversion.

pascalwhoop
  • I've used the python library generateDS for generating model files from XSDs with some success. Here's a link: http://www.davekuhlman.org/generateDS.html You can give it an XSD and it spits out a generated model for you. – wholevinski Apr 03 '18 at 13:16
  • Yes I am aware of that library. But generating XSD files from Java was no success for me as noted above. – pascalwhoop Apr 03 '18 at 18:55
  • Ok, I thought you were going down that road still, my mistake. It might be helpful to see the stack trace on that JAXB error as well if you're not opposed to keep going with that. – wholevinski Apr 03 '18 at 18:58
  • See if this helps you with the JSON schema approach? https://stackoverflow.com/questions/12465588/convert-a-json-schema-to-a-python-class – Tarun Lalwani Apr 06 '18 at 10:43
  • @TarunLalwani, no, because it doesn't generate class source code. It only works at runtime. I want class code so that PyCharm, vim etc. can provide IDE support during development. – pascalwhoop Apr 06 '18 at 12:34
  • Could you give a minimal example: a Java class and the expected output in Python? Will you generate the Python code every time you compile the Maven project (and overwrite all potential modifications) or is it a "single shot"? Do you need to translate the whole project structure? – jferard Apr 08 '18 at 12:10
  • @pascalwhoop, feedback on the new answer? – Tarun Lalwani Apr 10 '18 at 07:39
  • In my experience, I've not found many code generators that produce anything useful. Have you considered moving the python code into something like jython and just natively calling into the existing classes? – mdadm Apr 10 '18 at 15:20
  • Given the right JSON structure it is entirely possible to generate Python classes from JSON code: you just need to dynamically write the code. My PyPI package 'importjson' does exactly this; allows importation of json and transformation into json. With a bit of work, if you needed it, I could write the Python to a source code file, i.e. make it a JSON -> Python compiler. Let me know if you need this. – Tony Suffolk 66 Apr 12 '18 at 00:54

3 Answers


So is there any way to generate simple "Pojo" Python classes from Java code?

I had a go at it, and my solution is below.

Consider the following simplistic Pojo.java:

public class Pojo {
    private String string = "default";
    public int integer = 1;
    public String getString(){
        return string;
    }
}

The solution needs three phases:

1. Java Pojo to JSON Schema

I found the following options:

  1. FasterXML/jackson-module-jsonSchema: This is the base, which the libraries below also use internally.
  2. mbknor/mbknor-jackson-jsonSchema: Officially cited by the above as supporting v4 of the JSON schema.
  3. reinert/JJSchema

The relevant code for option 1 is below (also go through the project site):

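import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.module.jsonSchema.JsonSchema;
import com.fasterxml.jackson.module.jsonSchema.JsonSchemaGenerator;

ObjectMapper MAPPER = new ObjectMapper();
JsonSchemaGenerator generator = new JsonSchemaGenerator(MAPPER);
JsonSchema jsonSchema = generator.generateSchema(Pojo.class);
System.out.println(MAPPER.writeValueAsString(jsonSchema));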

This prints the following JSON schema string:

{"type":"object","id":"urn:jsonschema:Pojo","properties":{"string":{"type":"string"},"integer":{"type":"integer"}}}

2. JSON Schema post-process

This phase is required mainly because I found that, at least for this simplistic use case, Step 3 below needs a JSON schema that contains a definitions property. I guess this is because the schema specification at http://json-schema.org/ is still evolving. We can also include a title property to specify the name of the Python class that the next step will generate.

We can easily accomplish both in the Java program of Step 1 above as a post-processing step. We need a JSON schema string of the following form:

{"definitions": {}, "title": "Pojo", "type":"object","id":"urn:jsonschema:Pojo","properties":{"string":{"type":"string"},"integer":{"type":"integer"}}}

Notice that the only addition is "definitions": {}, "title": "Pojo".
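If it is more convenient to do this post-processing outside the Java program, a minimal sketch in Python could look like the following. The function name add_required_keys is made up for illustration, and the input is the schema string printed in Step 1.

```python
import json


def add_required_keys(schema_text, class_name):
    """Return the schema with an empty "definitions" object and a "title" added."""
    schema = json.loads(schema_text)
    schema.setdefault("definitions", {})  # keep existing definitions if there are any
    schema["title"] = class_name          # name intended for the generated Python class
    return json.dumps(schema)


raw = (
    '{"type":"object","id":"urn:jsonschema:Pojo",'
    '"properties":{"string":{"type":"string"},"integer":{"type":"integer"}}}'
)
patched = add_required_keys(raw, "Pojo")
print(patched)
```

The patched string can then be written to the .json file that Step 3 reads.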

3. JSON schema to Python class

frx08/jsonschema2popo seems to do this job quite nicely.

pip install jsonschema2popo
jsonschema2popo -o /path/to/output_file.py /path/to/json_schema.json
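To make clear what kind of structural conversion this step performs, here is a hand-rolled sketch using only the standard library. It is not jsonschema2popo's implementation and does not reproduce its output; the TYPE_MAP table and the schema_to_class_source function are assumptions for illustration, and only flat schemas with simple types are handled.

```python
import json

# Assumed mapping from JSON schema types to Python type names (simple cases only).
TYPE_MAP = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}


def schema_to_class_source(schema):
    """Render a flat JSON schema object as the source code of a plain Python class."""
    props = schema.get("properties", {})
    args = "".join(f", {name}=None" for name in props)
    lines = [f"class {schema['title']}:", f"    def __init__(self{args}):"]
    for name, spec in props.items():
        py_type = TYPE_MAP.get(spec.get("type"), "object")
        lines.append(f"        self.{name} = {name}  # {py_type}")
    if not props:
        lines.append("        pass")  # a class without properties still needs a body
    return "\n".join(lines) + "\n"


schema = json.loads(
    '{"definitions": {}, "title": "Pojo", "type":"object","id":"urn:jsonschema:Pojo",'
    '"properties":{"string":{"type":"string"},"integer":{"type":"integer"}}}'
)
source = schema_to_class_source(schema)
print(source)
```

Because the result is ordinary source text, it can be saved to a .py file, which is what gives an IDE something to index.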

Some more points

  1. The Java-to-JSON schema generators will only include properties in the output that are either public or have a public getter.
  2. I assume that annotating the Java classes will be a pain for a mass migration. If it is feasible for you, though, all of the above Java libraries provide rich annotations with which you can specify whether a property is mandatory, and much more.
sujit
  • I will investigate how well this works for more complex types and report back. But the answer looks promising and intuitive – pascalwhoop Apr 11 '18 at 14:52
1

TechWalla has detailed instructions at https://www.techwalla.com/articles/how-to-convert-java-to-python. See if it helps you.

Pasting the instructions here:

Step 1: Download and extract java2python. The file you download is a gzip file that contains a tarball; both can be unpacked with 7zip, an open-source program.

Step 2: Place the contents of the java2python folder on the root of your C:\ drive.

Step 3: Open a command prompt and navigate to "C:\java2python\" before typing in "python setup.py install" without quotes. This will tell the Python interpreter to run the setup script and prepare your computer. Change directories to "C:\java2python\bin\" and keep the window open.

Step 4: Copy the Java file to be converted into your bin subfolder, under java2python. In the command line, run "j2py -i input_file.java -o output_file.py", replacing input_file and output_file with your filenames.

Step 5: Open the new Python file and read the code. It probably won't be perfect, so you'll need to go over it to make sure it makes sense from a Python point of view. Even with the time spent on manual checking, however, you will have saved a large amount of time compared with converting by hand.

Sunil
  • You can also give https://seelio.com/w/20mn/java-to-python-converter-wip a try. It has a GitHub project that can be used to convert Java programs to Python. – Sunil Apr 09 '18 at 08:44
1
  1. JAX-WS 2.2 requires JAXB 2.2 for data binding. Make sure everything is up to date.
  2. Schemagen has a compile-scope dependency.
  3. Best practices for generating the XML file:

    Provide a package-info.java file with the XmlSchema:

    schemagen example.Address example\package-info.java

    Use the namespace attribute of the @XmlType annotation to specify a namespace:

    @XmlType(namespace="http://myNameSpace")

After resolving the issue with creating the XML file, try lxml's objectify sub-package or the minidom module to iterate through the XML elements and create Python classes.
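A rough sketch of the minidom route, assuming the schema declares named complexType elements with nested element children. The sample XSD below is written by hand for illustration and is not actual schemagen output; complex_types and render_class are hypothetical helper names.

```python
from xml.dom import minidom

XS_NS = "http://www.w3.org/2001/XMLSchema"

# Hand-written sample schema (an assumption, not real schemagen output).
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="pojo">
    <xs:sequence>
      <xs:element name="string" type="xs:string" minOccurs="0"/>
      <xs:element name="integer" type="xs:int"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def complex_types(xsd_text):
    """Map each named complexType to a list of (element name, XSD type) pairs."""
    doc = minidom.parseString(xsd_text)
    result = {}
    for ctype in doc.getElementsByTagNameNS(XS_NS, "complexType"):
        name = ctype.getAttribute("name")
        if not name:
            continue  # anonymous types are skipped in this sketch
        result[name] = [
            (el.getAttribute("name"), el.getAttribute("type"))
            for el in ctype.getElementsByTagNameNS(XS_NS, "element")
        ]
    return result


def render_class(name, fields):
    """Render one complexType as Python class source; attributes default to None."""
    class_name = name[:1].upper() + name[1:]
    args = "".join(f", {field}=None" for field, _ in fields)
    lines = [f"class {class_name}:", f"    def __init__(self{args}):"]
    lines += [f"        self.{field} = {field}  # {xsd_type}" for field, xsd_type in fields]
    if not fields:
        lines.append("        pass")
    return "\n".join(lines) + "\n"


for type_name, fields in complex_types(XSD).items():
    print(render_class(type_name, fields))
```

Nested or referenced types, attributes and lists (maxOccurs) are not handled here; a tool such as generateDS, mentioned in the comments on the question, covers those cases.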