Go and Protocol Buffers, acceleration

This is a follow-up of sorts to the article "Go and Protocol Buffers: a bit of practice" (or a quick start, for those who are not familiar). In Go, encoding to and decoding from many formats leans heavily on reflection. And as you and I know, dear reader, reflection is slow. This article is about the ways to fight that. I suspect seasoned readers will find little new in it.


Root



The article mentioned above covered the packages github.com/golang/protobuf/{proto,protoc-gen-go}. What is wrong with them? Precisely that they use reflection. Suppose you have a project that works with a fixed set of structures, and those structures are constantly encoded to Protocol Buffers and back. If the types were different and unpredictable every time, reflection would be fine. But if the set is known in advance, there is no need for reflection at all. The customary approach is an interface that takes responsibility for the encoding. Here is an example from encoding/json:
type Marshaler interface {
    MarshalJSON() ([]byte, error)
}
type Unmarshaler interface {
    UnmarshalJSON([]byte) error
}

Reference: Marshaler, Unmarshaler.
If the encoder encounters a type that implements one of these interfaces, all the work is handed over to that type's methods.
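A simple JSON example:
import "fmt"

type X struct {
    Name  string
    Value int
}

// MarshalJSON builds the JSON by hand, so no reflection is needed for X.
// Caveat: %q applies Go escaping rules, which coincide with JSON for plain
// printable text but not for every possible input.
func (x *X) MarshalJSON() ([]byte, error) {
    return []byte(fmt.Sprintf(`{"name": %q, "value": %d}`, x.Name, x.Value)), nil
}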
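
To make the hand-off concrete, here is a minimal runnable sketch showing that the standard encoder really does call the custom method instead of walking the struct with reflection. The helper name encodeX is hypothetical and exists only for this illustration. The lowercase keys in the result come from the hand-written method, not from the default encoding of the exported fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// X mirrors the struct from the example above.
type X struct {
	Name  string
	Value int
}

// MarshalJSON makes *X satisfy json.Marshaler, so encoding/json calls it
// instead of encoding the fields via reflection.
func (x *X) MarshalJSON() ([]byte, error) {
	return []byte(fmt.Sprintf(`{"name":%q,"value":%d}`, x.Name, x.Value)), nil
}

// encodeX is a hypothetical helper: it runs the standard encoder on a *X
// and returns the result as a string.
func encodeX(x *X) (string, error) {
	b, err := json.Marshal(x)
	return string(b), err
}

func main() {
	s, err := encodeX(&X{Name: "answer", Value: 42})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Note that the method has a pointer receiver, so the value has to be passed to json.Marshal as a pointer for the method to be picked up.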

(Un)Marshaler implementations do not always look so rosy. For example, there is something to read about yaml (in English), and about this topic in general.


Key



The solution, as always, is simple. Use another package:
go get github.com/gogo/protobuf/{proto,protoc-gen-gogo,gogoproto,protoc-gen-gofast}

These packages add convenience and speed. Details and benchmarks are linked from the project page.

According to those benchmarks, the speedup starts at 1.10x and goes up from there. You can use just the set of extensions, without the speedup. You can also just get the speedup. I settled on this command:
protoc \
  --proto_path=$GOPATH/src:$GOPATH/src/github.com/gogo/protobuf/protobuf:. \
  --gogofast_out=. *.proto
This gives you both the extensions (if you use any) and the speedup.
Example
I am not urging you to use the extensions; this is just for reference.
syntax="proto3";
package some;
//protoc \
//  --proto_path=$GOPATH/src:$GOPATH/src/github.com/gogo/protobuf/protobuf:. \
//  --gogofast_out=. *.proto
import "github.com/gogo/protobuf/gogoproto/gogo.proto";
// for tests: generates an Equal method to check equality
option (gogoproto.equal_all)            = true;
option (gogoproto.goproto_stringer_all) = false;
// Stringer for all messages (the tests need this extension)
option (gogoproto.stringer_all)         = true;
// for tests: populate with random values
option (gogoproto.populate_all)         = true;
// generate a test suite
option (gogoproto.testgen_all)          = true;
// generate a set of benchmarks
option (gogoproto.benchgen_all)         = true;
// required
option (gogoproto.marshaler_all)        = true;
// message size
option (gogoproto.sizer_all)            = true;
// required
option (gogoproto.unmarshaler_all)      = true;
// enums; not important, purely cosmetic
option (gogoproto.goproto_enum_prefix_all) = false;
enum Bool {
	Yes      = 0;
	No       = 1;
	DontCare = 2;
}
message Some {
	option (gogoproto.goproto_unrecognized ) = false;
	option (gogoproto.goproto_getters)       = false;
	Bool  Waht  = 1;
	int64 Count = 2;
	bytes Hash  = 3;
}

This produces:
/*
    most of it is cut out;
    there are many more methods (Size, String, etc.) and a separate file with tests
*/
type Bool int32
const (
	Yes      Bool = 0
	No       Bool = 1
	DontCare Bool = 2
)
// ...
type Some struct {
	Waht  Bool   `protobuf:"varint,1,opt,name=Waht,proto3,enum=some.Bool" json:"Waht,omitempty"`
	Count int64  `protobuf:"varint,2,opt,name=Count,proto3" json:"Count,omitempty"`
	Hash  []byte `protobuf:"bytes,3,opt,name=Hash,proto3" json:"Hash,omitempty"`
}
// implements the proto.Message interface (github.com/golang/protobuf/proto)
func (m *Some) Reset()      { *m = Some{} }
func (*Some) ProtoMessage() {}
// and here is the point of it all
func (m *Some) Marshal() (data []byte, err error) {
	// ...
}
// and this one too
func (m *Some) Unmarshal(data []byte) error {
	// ...
}


Note that some extensions have beta status, and the same caveat applies to proto3. Don't let that worry you: the package is used successfully by many (see the home page). Still, that does not excuse you from writing tests. If you are not interested in the extensions and the rest, then (as the project README notes) this command is enough:
protoc --gofast_out=. myproto.proto


Stones



A fly in the ointment

If you did not look inside the previous spoiler, I would like to draw attention to one fragment of it:
func (m *Some) Reset()      { *m = Some{} } // very crude

The thing is that gogo lets you generate "fast" structures. Moreover, you can use them with the "old" github.com/golang/protobuf/proto. In that case the Marshal and Unmarshal methods are used, so there is no problem there. But what if you use the same instance of a structure many times? If the structure is large (no, huge), it would make sense to use a pool: put the used structures away and later take them back out for reuse.
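
The pool idea can be sketched with sync.Pool from the standard library. This is only an illustration under stated assumptions: Some here is a hand-written stand-in for the generated struct, and getSome, putSome and the 1024-byte initial capacity are hypothetical choices, not part of any protobuf package.

```go
package main

import (
	"fmt"
	"sync"
)

// Some is a hand-written stand-in for the generated struct (assumption:
// only Count and Hash matter for this illustration).
type Some struct {
	Count int64
	Hash  []byte
}

// somePool hands out *Some values; New runs only when the pool is empty.
var somePool = sync.Pool{
	New: func() interface{} { return &Some{Hash: make([]byte, 0, 1024)} },
}

// getSome takes a struct from the pool.
func getSome() *Some { return somePool.Get().(*Some) }

// putSome clears the fields but keeps the backing array of Hash,
// then returns the struct to the pool for later reuse.
func putSome(m *Some) {
	m.Count = 0
	m.Hash = m.Hash[:0]
	somePool.Put(m)
}

func main() {
	m := getSome()
	m.Count = 7
	m.Hash = append(m.Hash, "payload"...)
	fmt.Println(m.Count, len(m.Hash), cap(m.Hash))
	putSome(m)
}
```

The pool only pays off if whatever fills the struct afterwards writes into the retained memory rather than replacing it.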

The approach in github.com/golang/protobuf/proto (reference):
func Unmarshal(buf []byte, pb Message) error {
	pb.Reset() // note this call
	return UnmarshalMerge(buf, pb)
}

Reset is called, and therefore so is *m = Some{}: the old structure is thrown away and a new one is created. This structure is small, so who cares, but I would like to keep Hash []byte (I mean the allocated memory) in case you use a really big hash.

The approach in github.com/gogo/protobuf/proto is the same: copy-paste. No relief there.

Fine. You could try calling the Unmarshal method directly, or UnmarshalMerge, and just add your own MyReset method that cuts the length of the slice but leaves the capacity. No! Here is a line from the generated Unmarshal:
m.Hash = append([]byte{}, data[iNdEx:postIndex]...)

A new slice is created, and the old one goes into the GC furnace. In practice, if your structures are small (both the individual fields and everything taken together), the easiest thing is not to worry about it. For large ones, look for workarounds (read: rewrite the generated code). With the current implementation, using a pool makes no sense.
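
To show what such a rewrite would change, here is a small sketch comparing the two ways of copying bytes into the field. Everything in it is an assumption for illustration: Some is a hand-written stand-in for the generated struct, MyReset is the hypothetical reset method discussed above, and copyFresh and copyReuse are made-up names. copyFresh mirrors the generated line; copyReuse is the hand-edited alternative.

```go
package main

import "fmt"

// Some is a hand-written stand-in for the generated struct.
type Some struct {
	Count int64
	Hash  []byte
}

// MyReset is the hypothetical reset: it zeroes the fields but only
// truncates Hash to length 0, so the capacity is kept.
func (m *Some) MyReset() {
	m.Count = 0
	m.Hash = m.Hash[:0]
}

// copyFresh mirrors the generated line: it always allocates a new slice.
func copyFresh(src []byte) []byte { return append([]byte{}, src...) }

// copyReuse writes into dst's backing array when its capacity allows,
// and allocates only when src does not fit.
func copyReuse(dst, src []byte) []byte { return append(dst[:0], src...) }

func main() {
	m := &Some{Hash: make([]byte, 0, 64)}
	m.Hash = copyReuse(m.Hash, []byte("first"))
	before := &m.Hash[0]

	m.MyReset()
	m.Hash = copyReuse(m.Hash, []byte("second"))
	fmt.Println("reuse keeps the backing array:", before == &m.Hash[0])

	m.Hash = copyFresh([]byte("third"))
	fmt.Println("fresh copy keeps the backing array:", before == &m.Hash[0])
}
```

With the generated code left as it is, MyReset alone gains nothing, because the decoder replaces the slice anyway; both halves have to change together.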

Bonus



The library is convenient for streaming: writing messages to an io.Writer and reading them back from an io.Reader. That wheel has already been invented.

While on the subject of JSON: github.com/pquerna/ffjson does the same thing for JSON. It is not just a generator but a Swiss Army knife for JSON + Go.

While on the subject of speed and pools: github.com/valyala/fasthttp is a "fast" replacement for net/http. The speedup comes from memory reuse, and it offers extra features as well.
