
I want to parse a .proto file. Is there a Java library that can parse .proto files directly? Because of my requirements, I cannot use the descriptor parseFrom method or the protoc command. Please suggest an alternative. Thanks in advance.

$ protoc --include_imports --descriptor_set_out temp *.proto  # I don't want to do this manual step
or
DescriptorProtos.FileDescriptorProto descriptorProto = DescriptorProtos.FileDescriptorProto.parseFrom(proto.getBytes());

I would appreciate any suggestions.

Maana
  • Could you clarify *why* you don't want to run the tool that is designed to give you exactly what you're looking for? (I can think of various *potential* reasons, but your *specific* reason would affect possible solutions.) – Jon Skeet Jan 26 '22 at 15:28
  • @Jon Skeet- It is because I want to run this step in jenkins and we are not allowed to make any configuration changes there. – Maana Jan 31 '22 at 15:39
  • I would talk to the people who set those rules then. Requiring that no tools are available other than Java seems overly prohibitive to me. (Would you expect that "something equivalent to protoc" is available in *every* language just in case there's someone else with that requirement but for PHP, Ruby etc?) – Jon Skeet Jan 31 '22 at 16:30
  • @JonSkeet with multiple Jenkins workers, it can get cumbersome to have to install several tools. You could use tools like Puppet to do so, but if one tool is allowed, then others will ask for tools as well, and you end up with a system full of tools used for just one or two jobs. Whenever I want to do something in Jenkins that requires extra tools I try to find (or create) a Docker container, then run that. That way, only one tool (Docker) is needed. A quick Google search showed several candidates. – Rob Spoor Feb 01 '22 at 17:51
  • @RobSpoor: Putting together a Docker container of useful tools or finding an existing one seems reasonable. Requiring that every possible tool is written in every language is not, IMO :) – Jon Skeet Feb 01 '22 at 17:52
  • @JonSkeet I agree. – Rob Spoor Feb 01 '22 at 17:55
  • When you say "parse a proto file", what do you mean - parse it into what exactly? Parsers typically produce an AST representation of the input - is that what you want? – jon hanson Feb 01 '22 at 20:39
  • @Maana, what build system do you use? Maven, Gradle, etc.? There are the corresponding build system plugins (`protobuf-maven-plugin`, `protobuf-gradle-plugin`) that generate the code as a part of the build process. Would using such a build system plugin be an acceptable solution for you? – Sergey Vyacheslavovich Brunov Feb 01 '22 at 21:21
  • @jon-hanson I am writing a custom SonarQube plugin in Java that will scan any application (Java, Go, etc.), look for .proto files, and parse them. I want the parser to return details about all messages, field types, etc. – Maana Feb 01 '22 at 23:32
  • @Sergey Vyacheslavovich Brunov - I am writing a custom SonarQube proto parser, so I should be able to scan any Java or Go application and read the .proto file. The problem with generated code is that it is not consistent across languages: in Java we get .class files, whereas in Go we get pb.go files. – Maana Feb 01 '22 at 23:37

1 Answer


Possible solution: io.protostuff:protostuff-parser library

The example below uses version 3.1.38 of the io.protostuff:protostuff-parser library, treated here as the current version.

Example program

Treat the example program below as a draft to get started with the library.

Input file

Let's assume the /some/directory/data/test.proto file exists with the following content:

message SearchRequest {
  string query = 1;
  int32 page_number = 2;
  int32 result_per_page = 3;
  enum ContentType {
    WEB = 1;
    IMAGES = 2;
    VIDEO = 3;
  }
  ContentType content_type = 4;
}

pom.xml: Dependencies

<project>
    <dependencies>
        <dependency>
            <groupId>io.protostuff</groupId>
            <artifactId>protostuff-parser</artifactId>
            <version>3.1.38</version>
        </dependency>
        <dependency>
            <groupId>com.google.inject</groupId>
            <artifactId>guice</artifactId>
            <version>5.1.0</version>
        </dependency>
    </dependencies>
</project>

Program

import com.google.inject.Guice;
import com.google.inject.Injector;
import io.protostuff.compiler.ParserModule;
import io.protostuff.compiler.model.Enum; // without this import, Enum resolves to java.lang.Enum
import io.protostuff.compiler.model.Message;
import io.protostuff.compiler.model.Proto;
import io.protostuff.compiler.parser.Importer;
import io.protostuff.compiler.parser.LocalFileReader;
import io.protostuff.compiler.parser.ProtoContext;
import java.nio.file.Path;
import java.util.List;

public final class Program {
    public static void main(final String[] args) {
        final Injector injector = Guice.createInjector(new ParserModule());
        final Importer importer = injector.getInstance(Importer.class);
        final ProtoContext protoContext = importer.importFile(
            new LocalFileReader(Path.of("/some/directory/data")),
            "test.proto"
        );

        final Proto proto = protoContext.getProto();

        final List<Message> messages = proto.getMessages();
        System.out.println(String.format("Messages: %s", messages));

        final Message searchRequestMessage = proto.getMessage("SearchRequest");
        System.out.println(String.format("SearchRequest message: %s", searchRequestMessage));

        final List<Enum> searchRequestMessageEnums = searchRequestMessage.getEnums();
        System.out.println(String.format("SearchRequest message enums: %s", searchRequestMessageEnums));
    }
}

The program output:

Messages: [Message{name=SearchRequest, fullyQualifiedName=..SearchRequest, fields=[Field{name=query, typeName=string, tag=1, options=DynamicMessage{fields={}}}, Field{name=page_number, typeName=int32, tag=2, options=DynamicMessage{fields={}}}, Field{name=result_per_page, typeName=int32, tag=3, options=DynamicMessage{fields={}}}, Field{name=content_type, typeName=ContentType, tag=4, options=DynamicMessage{fields={}}}], enums=[Enum{name=ContentType, fullyQualifiedName=..SearchRequest.ContentType, constants=[EnumConstant{name=WEB, value=1, options=DynamicMessage{fields={}}}, EnumConstant{name=IMAGES, value=2, options=DynamicMessage{fields={}}}, EnumConstant{name=VIDEO, value=3, options=DynamicMessage{fields={}}}], options=DynamicMessage{fields={}}}], options=DynamicMessage{fields={}}}]
SearchRequest message: Message{name=SearchRequest, fullyQualifiedName=..SearchRequest, fields=[Field{name=query, typeName=string, tag=1, options=DynamicMessage{fields={}}}, Field{name=page_number, typeName=int32, tag=2, options=DynamicMessage{fields={}}}, Field{name=result_per_page, typeName=int32, tag=3, options=DynamicMessage{fields={}}}, Field{name=content_type, typeName=ContentType, tag=4, options=DynamicMessage{fields={}}}], enums=[Enum{name=ContentType, fullyQualifiedName=..SearchRequest.ContentType, constants=[EnumConstant{name=WEB, value=1, options=DynamicMessage{fields={}}}, EnumConstant{name=IMAGES, value=2, options=DynamicMessage{fields={}}}, EnumConstant{name=VIDEO, value=3, options=DynamicMessage{fields={}}}], options=DynamicMessage{fields={}}}], options=DynamicMessage{fields={}}}
SearchRequest message enums: [Enum{name=ContentType, fullyQualifiedName=..SearchRequest.ContentType, constants=[EnumConstant{name=WEB, value=1, options=DynamicMessage{fields={}}}, EnumConstant{name=IMAGES, value=2, options=DynamicMessage{fields={}}}, EnumConstant{name=VIDEO, value=3, options=DynamicMessage{fields={}}}], options=DynamicMessage{fields={}}}]
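
The comments on the question mention that the plugin has to scan an application for .proto files before parsing them. A minimal sketch of that discovery step, using only the JDK, might look like the following. The class name `ProtoFileFinder` and the method `findProtoFiles` are hypothetical names introduced for illustration; they are not part of the protostuff library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of protostuff): collects .proto files under a root directory.
final class ProtoFileFinder {
    private ProtoFileFinder() {
    }

    // Returns every regular file under root whose name ends with ".proto", in sorted order.
    static List<Path> findProtoFiles(final Path root) throws IOException {
        // Files.walk holds open directory handles, so the stream is closed via try-with-resources.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(path -> path.getFileName().toString().endsWith(".proto"))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(final String[] args) throws IOException {
        // Scan the directory given as the first argument, or the working directory by default.
        final Path root = Path.of(args.length > 0 ? args[0] : ".");
        for (final Path protoFile : findProtoFiles(root)) {
            // Print the containing directory and the file name of each match.
            System.out.println(protoFile.getParent() + " -> " + protoFile.getFileName());
        }
    }
}
```

For each path found, the parent directory and the file name correspond to the two values the example program passes to `new LocalFileReader(...)` and `importer.importFile(...)`, so the results could be fed into the parser one file at a time.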
  • Thank you Sergey Vyacheslavovich Brunov, this is what I was looking for :) – Maana Feb 02 '22 at 14:19
  • Sergey Vyacheslavovich Brunov: Any idea how to resolve imports? If an import is not found, I get an error like: Can not load proto: google.protobuf.descriptor.proto not found. – Maana Feb 02 '22 at 20:28
  • @Maana, that is another question. Please consider asking it as a separate question (not a comment) with a corresponding minimal reproducible example. – Sergey Vyacheslavovich Brunov Feb 03 '22 at 01:13