Understanding PGO in Go 1.20

  sonic0002        2023-02-28 04:27:46

Background

Go 1.20 was officially released in February 2023 and introduced preview support for PGO (Profile-Guided Optimization).

The basic principle of PGO can be divided into the following two steps:

  1. First, profile the running program to collect data about its runtime behavior and generate a profile file.
  2. When compiling the program, enable the PGO option, and the compiler will optimize the program's performance based on the contents of that profile.

When compiling a program, the compiler performs many optimizations. Well-known ones such as inlining, escape analysis, and constant propagation can be carried out just by analyzing the program's source code.

However, some optimizations cannot be implemented by analyzing the source code alone. For example, if a function has many if/else branches, we may want the compiler to reorder the branch checks so that the most common case is tested first.

The compiler cannot know from the source which branches are taken frequently and which are taken rarely, because that depends on the program's input.

This is the problem that PGO (Profile-Guided Optimization) was designed to solve.

The principle of PGO is simple: first run the program and collect data about its runtime behavior. The compiler then analyzes that data and makes targeted performance optimizations.

For example, if the profile shows which conditional branches are taken most often, the compiler can place the check for the hottest branch first, reducing the time spent on conditional checks and improving program performance.
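To make the branch example concrete, here is a minimal sketch. The classify function and the workload are hypothetical and exist only for illustration. The sketch counts how often each branch is taken, which is the kind of information a profile provides and the source code does not. Note that this only illustrates the idea: according to the Go 1.20 release notes, the compiler currently uses the profile to inline more aggressively at hot call sites, not to reorder branches.

```go
package main

import "fmt"

// classify is a hypothetical function with a chain of branches. The order
// of the checks is fixed in the source, whatever the real inputs look like.
func classify(n int) string {
	if n < 0 {
		return "negative"
	} else if n == 0 {
		return "zero"
	} else if n < 100 {
		return "small"
	}
	return "large"
}

// branchCounts tallies how often each branch of classify is taken for a
// given workload. A profile gives the compiler similar frequency data.
func branchCounts(inputs []int) map[string]int {
	counts := make(map[string]int)
	for _, n := range inputs {
		counts[classify(n)]++
	}
	return counts
}

func main() {
	// A made-up workload in which most values are large.
	inputs := []int{500, 1000, 250, 7, 100, 0, 9000, -3, 4096, 640}
	fmt.Println(branchCounts(inputs))
}
```

With this workload, 7 of the 10 calls fall through all three checks before reaching the last branch, so an optimizer that knew the frequencies would prefer to test for the large case first.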

So how does Go use PGO to optimize program performance? Let's take a look at a specific example.

Example

We will implement a web API called /render. The API accepts the raw contents of a markdown file in the request body and returns the document converted to HTML.

We are using the gitlab.com/golang-commonmark/markdown project to implement this API.

Environment setup

$ go mod init example.com/markdown

Let's create main.go with the code below:

package main

import (
    "bytes"
    "io"
    "log"
    "net/http"
    _ "net/http/pprof"

    "gitlab.com/golang-commonmark/markdown"
)

func render(w http.ResponseWriter, r *http.Request) {
    if r.Method != "POST" {
        http.Error(w, "Only POST allowed", http.StatusMethodNotAllowed)
        return
    }

    src, err := io.ReadAll(r.Body)
    if err != nil {
        log.Printf("error reading body: %v", err)
        http.Error(w, "Internal Server Error", http.StatusInternalServerError)
        return
    }

    md := markdown.New(
        markdown.XHTMLOutput(true),
        markdown.Typographer(true),
        markdown.Linkify(true),
        markdown.Tables(true),
    )

    var buf bytes.Buffer
    if err := md.Render(&buf, src); err != nil {
        log.Printf("error converting markdown: %v", err)
        http.Error(w, "Malformed markdown", http.StatusBadRequest)
        return
    }

    if _, err := io.Copy(w, &buf); err != nil {
        log.Printf("error writing response: %v", err)
        http.Error(w, "Internal Server Error", http.StatusInternalServerError)
        return
    }
}

func main() {
    http.HandleFunc("/render", render)
    log.Printf("Serving on port 8080...")
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Compile and run the program

$ go mod tidy
$ go build -o markdown.nopgo
$ ./markdown.nopgo
2023/02/25 22:30:51 Serving on port 8080...

Create a new file called input.md in the project's root directory. Its contents can be anything, as long as it is valid markdown.

Send the raw contents of the markdown file to the /render API using the curl command.

$ curl --data-binary @input.md http://localhost:8080/render

This outputs the HTML format of the contents in input.md.

Profiling

Next, we will perform profiling on the main.go program to obtain runtime data, and then use PGO to optimize performance.

main.go imports net/http/pprof for its side effects. The import registers profiling endpoints such as /debug/pprof/profile on the same server that serves /render, and we can request that endpoint to obtain the program's runtime data.
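The following sketch shows why the blank import is enough. net/http/pprof registers its handlers on http.DefaultServeMux, which is the mux that http.ListenAndServe uses when its handler argument is nil, as in main.go. The statusFor helper is a hypothetical name, and the sketch uses httptest so that no port has to be opened.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // imported only for its side effect of registering handlers
)

// statusFor reports the status code that http.DefaultServeMux returns for a
// GET request to path, without opening a network port.
func statusFor(path string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	rec := httptest.NewRecorder()
	http.DefaultServeMux.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	// Registered by the blank import above.
	fmt.Println(statusFor("/debug/pprof/cmdline"))
	// Nothing is registered for this path.
	fmt.Println(statusFor("/no-such-path"))
}
```

The first request is answered by a pprof handler with status 200, while the unregistered path yields 404. The sketch queries /debug/pprof/cmdline instead of /debug/pprof/profile only because the latter blocks while it samples the CPU.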

Create a subdirectory called load in the project's root directory, and create a main.go file in it. load/main.go continuously sends requests to the /render API of the server started by ./markdown.nopgo, simulating the program's real workload.

$ go run example.com/markdown/load
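The contents of load/main.go are not shown in this article. A minimal sketch of such a load generator could look like the following. The postMarkdown helper and both flag names are assumptions made for this sketch, not the original file.

```go
package main

import (
	"bytes"
	"flag"
	"io"
	"log"
	"net/http"
	"os"
	"time"
)

// postMarkdown sends src to the render endpoint once and returns the HTTP
// status code of the response.
func postMarkdown(url string, src []byte) (int, error) {
	resp, err := http.Post(url, "text/markdown", bytes.NewReader(src))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	// Drain the body so the underlying connection can be reused.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return resp.StatusCode, err
	}
	return resp.StatusCode, nil
}

func main() {
	// Both flags are hypothetical names chosen for this sketch.
	source := flag.String("source", "input.md", "markdown file to send")
	url := flag.String("url", "http://localhost:8080/render", "render endpoint")
	flag.Parse()

	src, err := os.ReadFile(*source)
	if err != nil {
		log.Fatalf("error reading %s: %v", *source, err)
	}

	// Keep the server busy until the process is interrupted (Ctrl-C).
	for {
		code, err := postMarkdown(*url, src)
		if err != nil {
			log.Printf("request failed: %v", err)
			time.Sleep(time.Second) // back off while the server is unreachable
			continue
		}
		if code != http.StatusOK {
			log.Printf("unexpected status: %d", code)
		}
	}
}
```

Because the default source path is relative, this sketch expects to be started from the project's root directory, where input.md lives.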

Request the profiling API to obtain the runtime data of the program.

$ curl -o cpu.pprof "http://localhost:8080/debug/pprof/profile?seconds=30"

After 30 seconds the curl command exits, and the cpu.pprof file is written to the project's root directory.
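The HTTP endpoint is convenient for a server. For a program that does not serve HTTP, a CPU profile in the same pprof format can be produced with the standard runtime/pprof package. The sketch below is an alternative under that assumption, not a step of this example. captureCPUProfile and busyWork are hypothetical names.

```go
package main

import (
	"bytes"
	"log"
	"os"
	"runtime/pprof"
)

// captureCPUProfile runs work while CPU profiling is active and returns the
// collected profile in pprof format.
func captureCPUProfile(work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	work()
	pprof.StopCPUProfile() // flushes the remaining profile data into buf
	return buf.Bytes(), nil
}

// busyWork is a stand-in for the program's real workload.
func busyWork() {
	sum := 0
	for i := 0; i < 50_000_000; i++ {
		sum += i % 7
	}
	_ = sum
}

func main() {
	data, err := captureCPUProfile(busyWork)
	if err != nil {
		log.Fatalf("error profiling: %v", err)
	}
	// Write the profile under the same name the curl command uses.
	if err := os.WriteFile("cpu.pprof", data, 0o644); err != nil {
		log.Fatalf("error writing profile: %v", err)
	}
}
```

As with the HTTP endpoint, the profile is only useful for PGO if the profiled workload resembles what the program does in real use.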

Note: Go 1.20 or later is required to compile and run the program.

PGO

$ mv cpu.pprof default.pgo
$ go build -pgo=auto -o markdown.withpgo

When compiling the program with go build, enable the -pgo option.

The -pgo option accepts either the path to a profile file or auto. In auto mode, go build looks for a file named default.pgo in the main package's directory.

The Go team recommends using auto mode and keeping default.pgo in the main package's directory, so that every developer who builds the program benefits from the same profile.

In Go 1.20, the default value of the -pgo option is off, so we must add -pgo=auto to enable PGO optimization.

In future Go versions, the official plan is to set the default value of the -pgo option to auto.

Performance comparison

In the load subdirectory, add a bench_test.go file that uses the benchmark support in Go's testing package to put the server under load and measure the time per request.
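The contents of bench_test.go are not shown in this article. A minimal sketch could look like the following. The benchmark name Load and the -source flag are taken from the commands and output in this section. The -url flag and the renderOnce helper are assumptions made for this sketch.

```go
package main

import (
	"bytes"
	"flag"
	"fmt"
	"io"
	"net/http"
	"os"
	"testing"
)

var (
	// source matches the -source flag passed on the go test command line.
	source = flag.String("source", "../input.md", "markdown file to send")
	// target is a hypothetical flag added for this sketch.
	target = flag.String("url", "http://localhost:8080/render", "render endpoint")
)

// renderOnce posts src to url and returns an error on any failure,
// including a response status other than 200.
func renderOnce(url string, src []byte) error {
	resp, err := http.Post(url, "text/markdown", bytes.NewReader(src))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	// Read the full response so the whole request is measured.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return err
	}
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return nil
}

// BenchmarkLoad measures the time of one request to an already running
// server. It appears as "Load" in the benchmark output.
func BenchmarkLoad(b *testing.B) {
	src, err := os.ReadFile(*source)
	if err != nil {
		b.Fatalf("error reading %s: %v", *source, err)
	}
	b.ResetTimer() // exclude reading the file from the measurement
	for i := 0; i < b.N; i++ {
		if err := renderOnce(*target, src); err != nil {
			b.Fatal(err)
		}
	}
}
```

The benchmark does not start the server itself. It measures whichever binary is currently listening on port 8080, which is why each run below starts a different server binary first.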

Not enabling PGO

Start the server program without PGO enabled.

$ ./markdown.nopgo

Run the benchmark

$ go test example.com/markdown/load -bench=. -count=20 -source ../input.md > nopgo.txt

Enabling PGO

Start the server program with PGO enabled.

$ ./markdown.withpgo

Run the benchmark

$ go test example.com/markdown/load -bench=. -count=20 -source ../input.md > withpgo.txt

Compare the two result files with benchstat:

$ go install golang.org/x/perf/cmd/benchstat@latest
$ benchstat nopgo.txt withpgo.txt
goos: darwin
goarch: amd64
pkg: example.com/markdown/load
cpu: Intel(R) Core(TM) i5-5250U CPU @ 1.60GHz
       │  nopgo.txt  │             withpgo.txt             │
       │   sec/op    │   sec/op     vs base                │
Load-4   447.3µ ± 7%   401.3µ ± 1%  -10.29% (p=0.000 n=20)

As we can see, with PGO the time per operation dropped by 10.29%, which is a significant improvement.
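The percentage can be checked by hand from the two sec/op values in the table. The sketch below does the arithmetic. The percentChange helper is a hypothetical name.

```go
package main

import "fmt"

// percentChange returns the relative change from base to current, in percent.
// A negative result means current is smaller (faster) than base.
func percentChange(base, current float64) float64 {
	return (current - base) / base * 100
}

func main() {
	// Times per operation in microseconds, as displayed in the benchstat table.
	nopgo, withpgo := 447.3, 401.3

	fmt.Printf("change in sec/op: %.2f%%\n", percentChange(nopgo, withpgo))
	fmt.Printf("speedup factor:   %.3fx\n", nopgo/withpgo)
}
```

From the displayed values the change comes out at roughly -10.28%, a hair off the -10.29% that benchstat reports, presumably because benchstat computes the delta from the unrounded measurements. Expressed as a speedup, the PGO build handles a request about 1.11 times as fast.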

In Go 1.20, using PGO usually results in a performance improvement of around 2%-4%.

In future versions, the compiler will continue to optimize the PGO mechanism to further improve program performance.

Summary

Go 1.20 introduces PGO to allow the compiler to optimize program performance. PGO has two steps:

  1. Obtain a profiling file.
  2. Use the PGO option when compiling with go build to guide the compiler to optimize the program's performance based on the profiling file.

In production environments, we can collect profiling data over a period of time and use PGO to optimize the program to improve system performance.

Using PGO can result in significant performance gains, with a typical increase of 2%-4% in Go 1.20. In the future, the compiler will continue to optimize the PGO mechanism to further improve program performance.

Reference: 一文读懂Go 1.20引入的PGO性能优化 (Understanding the PGO Performance Optimization Introduced in Go 1.20)
