Today, let's talk about how to use errGroup, a frequently used package, and the scenarios it fits. Why is it useful? A function started as a goroutine cannot return a value to its caller, so if you want to pass out the result of a goroutine you usually have to use a channel. The errGroup package is a good fit when you want to know whether any of the goroutines you started hit an error during execution and stopped working, and you need that error value.
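For contrast, here is a minimal sketch of the channel-based approach mentioned above, using only the standard library. The names runAll, okTask, and failingTask are hypothetical and exist only for this illustration; this is roughly the bookkeeping that errGroup saves you from writing.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// okTask and failingTask are hypothetical stand-ins for real work.
func okTask() error      { return nil }
func failingTask() error { return errors.New("task failed") }

// runAll starts every task in its own goroutine, waits for all of them,
// and returns one of the reported errors, or nil if every task succeeded.
func runAll(tasks ...func() error) error {
	errCh := make(chan error, len(tasks)) // buffered so no sender ever blocks
	var wg sync.WaitGroup
	for _, task := range tasks {
		wg.Add(1)
		go func(t func() error) {
			defer wg.Done()
			if err := t(); err != nil {
				errCh <- err
			}
		}(task)
	}
	wg.Wait()
	close(errCh)
	// Receiving from a closed, empty channel yields the zero value, nil.
	return <-errCh
}

func main() {
	fmt.Println(runAll(okTask, failingTask)) // prints: task failed
}
```

Even in this small form you have to size the channel, close it, and decide which error to keep.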
errGroup usage
The package needs to be downloaded and installed first.
go get -u golang.org/x/sync
Here is a usage example:
package main

import (
	"fmt"
	"log"
	"net/http"

	"golang.org/x/sync/errgroup"
)

func main() {
	eg := errgroup.Group{}

	// Each call to Go runs the given function in its own goroutine.
	eg.Go(func() error {
		return getPage("https://blog.kennycoder.io")
	})
	eg.Go(func() error {
		return getPage("https://google.com")
	})

	// Wait blocks until both goroutines finish and returns the first error.
	if err := eg.Wait(); err != nil {
		log.Fatalf("get error: %v", err)
	}
}

func getPage(url string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	// Close the body so the underlying connection can be reused.
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("fail to get page: %s, wrong statusCode: %d", url, resp.StatusCode)
	}
	log.Printf("success get page %s", url)
	return nil
}
- First, create a Group struct. It has no fields that need to be set, so an empty struct is enough.
- In the version used in this post, the Group struct provides only two methods, Go and Wait, which makes it quite similar to sync.WaitGroup.
- Go accepts a function of type func() error, which is the function you want to run in a goroutine. In the example above, getPage is called inside Go. The implementation of getPage is very simple: it calls http.Get(url) and returns an error if the request fails or the status code is not 200. Otherwise, it writes a log line to indicate that the request succeeded.
- Finally, calling Wait blocks, like the Wait method of sync.WaitGroup, until all the goroutines you started have finished. One difference is that Wait returns an error, which comes from the error returned by one of your goroutines.
In the example above, Wait() does not return an error because both web pages can be accessed normally. If you change one of the URLs to a non-existent URL, you will see the effect:
2021/10/03 11:52:23 success get page https://google.com
2021/10/03 11:52:23 get error: Get "https://kenny.example.com": dial tcp: lookup kenny.example.com: no such host
exit status 1
If both of your goroutines try to access a non-existent URL:
2021/10/03 11:53:52 get error: Get "https://kenny.example.com": dial tcp: lookup kenny.example.com: no such host
exit status 1
You will find that only one error is printed at the end, because errGroup stores the error from only one goroutine: whichever one returns an error first. Errors from goroutines that fail later are not stored.
You may feel that what you really want is every error, and that getting the error from just one goroutine is not much help. For the example above that is a fair objection, because you cannot tell which URLs were unreachable. So personally, I think errGroup fits best when the tasks you run are the same or of the same nature, for example several goroutines that call the same service to obtain different information. When one of them fails, the most likely reason is a network issue, and the other goroutines calling that service would probably fail as well. errGroup is also very suitable when a single failure makes the whole result useless, even if the other goroutines completed successfully.
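If you do need every error, one alternative is to skip errGroup for that case and collect the errors yourself. The following is only a sketch using the standard library: collectAll and fakeCheck are hypothetical names, fakeCheck stands in for a real request so the example runs without a network, and errors.Join assumes Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
)

// fakeCheck is a hypothetical stand-in for a real HTTP request:
// any URL ending in ".invalid" is treated as unreachable.
func fakeCheck(url string) error {
	if strings.HasSuffix(url, ".invalid") {
		return errors.New("no such host")
	}
	return nil
}

// collectAll runs check once per URL concurrently and joins every
// failure into a single error, so no failure is lost.
func collectAll(urls []string, check func(url string) error) error {
	errs := make([]error, len(urls)) // one slot per goroutine, so no mutex is needed
	var wg sync.WaitGroup
	for i, u := range urls {
		wg.Add(1)
		go func(i int, u string) {
			defer wg.Done()
			if err := check(u); err != nil {
				errs[i] = fmt.Errorf("%s: %w", u, err)
			}
		}(i, u)
	}
	wg.Wait()
	// errors.Join discards nil entries and returns nil if all are nil.
	return errors.Join(errs...)
}

func main() {
	urls := []string{"https://a.invalid", "https://ok.example", "https://b.invalid"}
	fmt.Println(collectAll(urls, fakeCheck))
}
```

Because each goroutine writes to its own slice index, the joined message lists the failures in the same order as the input URLs.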
However, a plain empty Group struct is not always enough.
With an empty Group, even if one of your goroutines encounters an error, the other goroutines are not canceled. They cannot be stopped in time and keep doing work whose result is no longer needed.
errGroup has considered this situation and uses a context to cancel the other goroutines.
For example:
// Uses getPage from the previous example; the import block also needs "context" and "time".
func main() {
	eg, ctx := errgroup.WithContext(context.Background())

	eg.Go(func() error {
		for i := 0; i < 10; i++ {
			select {
			case <-ctx.Done():
				// Another goroutine failed, so stop early.
				log.Printf("goroutine should cancel")
				return nil
			default:
				if err := getPage("https://blog.kennycoder.io"); err != nil {
					return err
				}
				time.Sleep(1 * time.Second)
			}
		}
		return nil
	})

	eg.Go(func() error {
		for i := 0; i < 10; i++ {
			select {
			case <-ctx.Done():
				log.Printf("goroutine should cancel")
				return nil
			default:
				if err := getPage("https://google.com"); err != nil {
					return err
				}
				time.Sleep(1 * time.Second)
			}
		}
		return nil
	})

	if err := eg.Wait(); err != nil {
		log.Fatalf("get error: %v", err)
	}
}
- errGroup provides WithContext, which takes your parent context and returns a Group and a context. The returned context is a cancel context derived from the parent.
- Once you have the cancel context, you can use it inside the functions passed to Go, with a select on <-ctx.Done() to learn whether the work has been canceled and the goroutine should end. In this scenario, each goroutine accesses its URL ten times. If any request fails, it returns err, and if it receives from ctx.Done(), it returns nil to end the goroutine. I gave one of the goroutines a broken URL to see the effect:
2021/10/03 12:11:40 success get page https://google.com
2021/10/03 12:11:41 goroutine should cancel
2021/10/03 12:11:41 get error: Get "https:kenny.example.com": http: no Host in request URL
exit status 1
As you can see, the second goroutine accessed its URL successfully once, but before it could access it a second time it received from ctx.Done(), so it printed the log line and ended. Finally, the error returned by Wait and printed is the one from the first goroutine.
This approach ensures that when one goroutine encounters an error, it also notifies other goroutines to end their work and ensures that the goroutines can exit normally.
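One detail of the example is that time.Sleep cannot be interrupted, so a goroutine may notice the cancellation up to one second late. A possible refinement, sketched here with only the standard library, is to wait on a timer and on ctx.Done() in the same select. The helpers sleepCtx and cancelAfter are hypothetical names for this illustration; cancelAfter simulates another goroutine in the group failing after a delay.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sleepCtx waits for d or until ctx is canceled, whichever comes first.
// It returns nil after a full sleep and ctx.Err() when interrupted.
func sleepCtx(ctx context.Context, d time.Duration) error {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-timer.C:
		return nil
	}
}

// cancelAfter returns a context that is canceled after delay. It stands in
// for the context that a failing goroutine in the group would cancel.
func cancelAfter(delay time.Duration) context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	time.AfterFunc(delay, cancel)
	return ctx
}

func main() {
	start := time.Now()
	err := sleepCtx(cancelAfter(50*time.Millisecond), 10*time.Second)
	// The ten-second sleep is cut short by the cancellation.
	fmt.Println(err, time.Since(start) < time.Second) // prints: context canceled true
}
```

Inside the loops above, a call like sleepCtx(ctx, time.Second) could replace time.Sleep, returning early as soon as the group's context is canceled.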
How does errGroup work internally
Let's take a look at its structure.
type Group struct {
	cancel  func()
	wg      sync.WaitGroup
	errOnce sync.Once
	err     error
}
This is the Group struct in the version examined here. The cancel func() field is used by the WithContext mentioned earlier. The WaitGroup field shows that errGroup is built on top of sync.WaitGroup, and the Once is there so that only one error is accepted. err is the error value that Wait returns at the end.
Below is the Go method:
func (g *Group) Go(f func() error) {
	g.wg.Add(1)

	go func() {
		defer g.wg.Done()

		if err := f(); err != nil {
			g.errOnce.Do(func() {
				g.err = err
				if g.cancel != nil {
					g.cancel()
				}
			})
		}
	}()
}
After calling wg.Add(1), Go starts a goroutine that runs the function you passed in and checks its return value. If there is an error, errOnce.Do is used to store it. A sync.Once runs the function passed to Do at most once, no matter how many times Do is called, so even if multiple goroutines return errors, g.err = err is executed only once and the stored value never changes afterward.
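The first-error-wins behavior of sync.Once can be tried in isolation. This is a minimal sketch; the firstErr type and recordInOrder helper are hypothetical names, and the errors are recorded sequentially so the outcome is deterministic.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// firstErr keeps only the first error passed to record,
// using the same sync.Once pattern as the Go method above.
type firstErr struct {
	once sync.Once
	err  error
}

// record stores err only if nothing has been stored before.
func (f *firstErr) record(err error) {
	f.once.Do(func() { f.err = err })
}

// recordInOrder records one error per message, in order,
// and returns whatever ended up being stored.
func recordInOrder(msgs ...string) error {
	var f firstErr
	for _, m := range msgs {
		f.record(errors.New(m))
	}
	return f.err
}

func main() {
	fmt.Println(recordInOrder("first", "second", "third")) // prints: first
}
```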
Inside the same Do call, it also checks whether cancel is nil. What does a non-nil cancel mean? It means the Group was created with WithContext. In that case g.cancel() is called to cancel the context for you, which closes the channel returned by ctx.Done() so that the other goroutines can notice and stop.
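What cancellation looks like from the receiving side can be checked with a few lines of standard-library code. In this sketch, isDone and doneBeforeAndAfter are hypothetical helper names: before cancel is called, a receive from ctx.Done() would block, and afterward the channel is closed and ctx.Err() reports why.

```go
package main

import (
	"context"
	"fmt"
)

// isDone reports, without blocking, whether ctx has been canceled.
func isDone(ctx context.Context) bool {
	select {
	case <-ctx.Done():
		return true
	default:
		return false
	}
}

// doneBeforeAndAfter checks the Done channel before and after calling cancel.
func doneBeforeAndAfter() (before, after bool, err error) {
	ctx, cancel := context.WithCancel(context.Background())
	before = isDone(ctx)
	cancel() // closes the Done channel
	after = isDone(ctx)
	return before, after, ctx.Err()
}

func main() {
	before, after, err := doneBeforeAndAfter()
	fmt.Println(before, after, err) // prints: false true context canceled
}
```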
And because it is built on WaitGroup, g.wg.Done() must be called when each goroutine finishes its work to decrement the counter, which the defer takes care of.
Below is the Wait method:
func (g *Group) Wait() error {
	g.wg.Wait()
	if g.cancel != nil {
		g.cancel()
	}
	return g.err
}
Similarly, g.wg.Wait() is what causes the blocking. After all the goroutines have finished, Wait checks whether cancel is nil and, if it is not, calls it. At this point there are no goroutines left to stop, so the purpose is to cancel the context itself: it was created for this errGroup, and its resources should be released once all tasks are complete. Finally, here is WithContext:
func WithContext(ctx context.Context) (*Group, context.Context) {
	ctx, cancel := context.WithCancel(ctx)
	return &Group{cancel: cancel}, ctx
}
As you can see, a cancel context is indeed created. The ctx is returned to the caller for use, and the cancel func is stored in the Group.
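To experiment with how these pieces interact without pulling in the dependency, the listings above can be assembled into one standard-library file. This is only a sketch: miniGroup, withContext, and run are hypothetical names that mirror the code shown in this post, not the package's current source.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// miniGroup mirrors the Group struct from the listing above.
type miniGroup struct {
	cancel  func()
	wg      sync.WaitGroup
	errOnce sync.Once
	err     error
}

func withContext(ctx context.Context) (*miniGroup, context.Context) {
	ctx, cancel := context.WithCancel(ctx)
	return &miniGroup{cancel: cancel}, ctx
}

func (g *miniGroup) Go(f func() error) {
	g.wg.Add(1)
	go func() {
		defer g.wg.Done()
		if err := f(); err != nil {
			g.errOnce.Do(func() {
				g.err = err // only the first error is stored
				if g.cancel != nil {
					g.cancel() // tell the other goroutines to stop
				}
			})
		}
	}()
}

func (g *miniGroup) Wait() error {
	g.wg.Wait()
	if g.cancel != nil {
		g.cancel() // release the context once everything is done
	}
	return g.err
}

// run starts one failing task and one task that blocks until it is canceled.
func run() (waitErr, ctxErr error) {
	g, ctx := withContext(context.Background())
	g.Go(func() error { return errors.New("boom") })
	g.Go(func() error {
		<-ctx.Done() // unblocks only because the failing task triggered cancel
		return nil
	})
	waitErr = g.Wait()
	return waitErr, ctx.Err()
}

func main() {
	waitErr, ctxErr := run()
	fmt.Println(waitErr, ctxErr) // prints: boom context canceled
}
```

The second task would block forever without the cancel call inside errOnce.Do, which is why Wait can return at all here.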
In summary, the design of errGroup is very simple. It cleverly combines WaitGroup and Context to wait for all goroutines and obtain an error, and it uses the Context to let the caller decide how each goroutine should exit.