- Compiler Invocation. The protocol buffer compiler produces Java output when invoked with the --java_out= command-line flag. The parameter to the --java_out= option is the directory where you want the compiler to write your Java output. The compiler creates a single .java file for each .proto input file.
- The protobuf API in Java is used to serialize and deserialize Java objects, so you don't need to worry about the encoding and decoding details.
Straight out of Effective Java, Third Edition, we tackle the flaws in Java serialization and how to counter them using Google's Protocol Buffers as an example.
Effective Java, Third Edition was recently released, and I have been interested in identifying the updates to this classic Java development book, whose last edition only covered through Java 6. There are obviously completely new items in this edition that are closely related to Java 7, Java 8, and Java 9, such as Items 42 through 48 in Chapter 7 ('Lambdas and Streams'), Item 9 ('Prefer try-with-resources to try-finally'), and Item 55 ('Return optionals judiciously'). I was (very slightly) surprised to realize that the third edition of Effective Java had a new item not specifically driven by the new versions of Java, but instead driven by developments in the software development world independent of the versions of Java. That item, Item 85 ('Prefer alternatives to Java Serialization'), is what motivated me to write this introductory post on using Google's Protocol Buffers with Java.
In Item 85 of Effective Java, Third Edition, Josh Bloch emphasizes in bold text the following two assertions related to Java serialization:
- 'The best way to avoid serialization exploits is to never deserialize anything.'
- 'There is no reason to use Java serialization in any new system you write.'
After outlining the dangers of Java deserialization and making these bold statements, Bloch recommends that Java developers employ what he calls (to avoid confusion associated with the term 'serialization' when discussing Java) 'cross-platform structured-data representations.' Bloch states that the leading offerings in this category are JSON (JavaScript Object Notation) and Protocol Buffers (protobuf). I found this mention of Protocol Buffers interesting because I've been reading about and playing with Protocol Buffers a bit lately. The use of JSON (even with Java) is exhaustively covered online. I suspect that Java developers are less aware of Protocol Buffers than of JSON, so a post on using Protocol Buffers with Java seems warranted.
Google's Protocol Buffers is described on its project page as 'a language-neutral, platform-neutral extensible mechanism for serializing structured data.' That page adds, 'think XML, but smaller, faster, and simpler.' Although one of the advantages of Protocol Buffers is that they support representing data in a way that can be used by multiple programming languages, the focus of this post is exclusively on using Protocol Buffers with Java.
There are several useful online resources related to Protocol Buffers including the main project page, the GitHub protobuf project page, the proto3 Language Guide (proto2 Language Guide is also available), the Protocol Buffer Basics: Java tutorial, the Java Generated Code Guide, the Java API (Javadoc) Documentation, the Protocol Buffers release page, and the Maven Repository page. The examples in this post are based on Protocol Buffers 3.5.1.
The Protocol Buffer Basics: Java tutorial outlines the process for using Protocol Buffers with Java. It covers many more possibilities and considerations than I will cover here. The first step is to define the language-independent Protocol Buffers format. This is done in a text file with the .proto extension. For my example, I've described my protocol format in the file album.proto, which is shown in the next code listing.
album.proto
Although the above definition of a protocol format is simple, there's a lot covered. The first line explicitly states that I'm using proto3 instead of proto2, which is currently assumed when no syntax is specified. The two lines beginning with option are only of interest when using this protocol format to generate Java code; they indicate the name of the outermost class and the package of that outermost class that will be generated for use by Java applications to work with this protocol format.
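A definition matching this description could look like the following minimal sketch. The package name, outer class name, message name, and fields are illustrative assumptions rather than the contents of the original album.proto. Only the proto3 syntax line and the two Java-specific option lines are implied by the description above.

```protobuf
// Sketch only: every name below is a hypothetical placeholder.
syntax = "proto3";  // without this line, the compiler assumes proto2

// Java-specific options; they have no effect on code generated for other languages.
option java_outer_classname = "AlbumProtos";   // assumed outer class name
option java_package = "com.example.protobuf";  // assumed Java package

// Assumed message shape for an album; the numbers are field tags, not values.
message Album {
  string title = 1;
  repeated string artist = 2;
  int32 release_year = 3;
  repeated string song_title = 4;
}
```

With a file like this, running protoc with --java_out= pointed at a source directory would be expected to produce a single AlbumProtos.java file under com/example/protobuf, with Album generated as a nested class inside AlbumProtos.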