x/debug/cmd/viewcore: panic when loading Go 1.23 core #71182

Open
nsrip-dd opened this issue Jan 8, 2025 · 7 comments
Labels
compiler/runtime: Issues related to the Go compiler and/or runtime.
NeedsInvestigation: Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@nsrip-dd
Contributor

nsrip-dd commented Jan 8, 2025

Go version

go version go1.24rc1 darwin/arm64

Output of go env in your module/workspace:

AR='ar'
CC='clang'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='clang++'
GCCGO='gccgo'
GO111MODULE=''
GOARCH='arm64'
GOARM64='v8.0'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/Users/nick.ripley/Library/Caches/go-build'
GODEBUG=''
GOENV='/Users/nick.ripley/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/f3/g91d13pd6kd3vdxts_gsgd1r0000gn/T/go-build3774044177=/tmp/go-build -gno-record-gcc-switches -fno-common'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMOD='/Users/nick.ripley/repos/go-debug/go.mod'
GOMODCACHE='/Users/nick.ripley/go/pkg/mod'
GONOPROXY='redacted'
GONOSUMDB='redacted'
GOOS='darwin'
GOPATH='/Users/nick.ripley/go'
GOPRIVATE='redacted'
GOPROXY='redacted'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/Users/nick.ripley/Library/Application Support/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.24rc1'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

Built cmd/viewcore from the latest commit (https://go.googlesource.com/debug/+/b341049684da5bace4625c231b50de1ef2e6a453) and tried to open a Go 1.23 core:

$ file core.6
core.6: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from '/usr/local/bin/redacted'
$ go version ./redacted
redacted: go1.23.0
$ ~/repos/go-debug/viewcore core.6 --exe redacted overview

What did you see happen?

The program ran for a bit, then panicked:

panic: address 8 is not mapped in the core file

goroutine 1 [running]:
golang.org/x/debug/internal/core.(*Process).ReadUint64(0x140149531a8?, 0x1052385cc?)
	/Users/nick.ripley/repos/go-debug/internal/core/read.go:81 +0x198
golang.org/x/debug/internal/core.(*Process).ReadUintptr(0x140149531a8?, 0x105237e8c?)
	/Users/nick.ripley/repos/go-debug/internal/core/read.go:120 +0x3c
golang.org/x/debug/internal/gocore.region.Uintptr({0x140025fe000?, 0x1400ee94030?, 0x14000ac10e0?})
	/Users/nick.ripley/repos/go-debug/internal/gocore/region.go:40 +0x40
golang.org/x/debug/internal/gocore.readHeap0(0x140024fe000, {0x140025fe000?, 0x14000c1d140?, 0x14000c1c420?}, {0x14005369080, 0x3, 0x4e000?}, 0x800000000000)
	/Users/nick.ripley/repos/go-debug/internal/gocore/process.go:526 +0x32ac
golang.org/x/debug/internal/gocore.readHeap(0x140024fe000)
	/Users/nick.ripley/repos/go-debug/internal/gocore/process.go:278 +0x2cc
golang.org/x/debug/internal/gocore.Core(0x140025fe000)
	/Users/nick.ripley/repos/go-debug/internal/gocore/process.go:142 +0x5a4
main.readCore()
	/Users/nick.ripley/repos/go-debug/cmd/viewcore/main.go:277 +0x7c
main.runOverview(0x1400014cb00?, {0x140000788a0?, 0x4?, 0x10530b3d4?})
	/Users/nick.ripley/repos/go-debug/cmd/viewcore/main.go:389 +0x1c
github.com/spf13/cobra.(*Command).execute(0x1057a6020, {0x14000078880, 0x2, 0x2})
	/Users/nick.ripley/go/pkg/mod/github.com/spf13/[email protected]/command.go:944 +0x630
github.com/spf13/cobra.(*Command).ExecuteC(0x1057a5d40)
	/Users/nick.ripley/go/pkg/mod/github.com/spf13/[email protected]/command.go:1068 +0x320
github.com/spf13/cobra.(*Command).Execute(...)
	/Users/nick.ripley/go/pkg/mod/github.com/spf13/[email protected]/command.go:992
main.main()
	/Users/nick.ripley/repos/go-debug/cmd/viewcore/main.go:244 +0x148

What did you expect to see?

No panic. FWIW Delve v1.23.1 seems to handle the core fine. I can open the core with the debugger and get goroutine stacks, etc.

The binary and core dump are from an internal production service so I'm hesitant to share them. But I'm happy to collect any info that would help, test out patches, etc.

mauri870 added the NeedsInvestigation and compiler/runtime labels Jan 8, 2025
@mauri870
Member

mauri870 commented Jan 8, 2025

cc @golang/runtime

@mknyszek
Contributor

mknyszek commented Jan 8, 2025

I think this crash can happen if there was a goroutine actively allocating a new large object. Are any of the goroutines in the core in the allocator?

(If so, the fix is straightforward: tolerate that address being nil. It's only used for finding pointers, and in this case there aren't any relevant pointers, so that is the correct fix.)

@gopherbot
Contributor

Change https://go.dev/cl/641515 mentions this issue: gocore: skip nil largeType when discovering pointers

seankhliao changed the title from "golang.org/x/debug/cmd/viewcore: panic when loading Go 1.23 core" to "x/debug/cmd/viewcore: panic when loading Go 1.23 core" Jan 8, 2025
gopherbot added this to the Unreleased milestone Jan 8, 2025
@nsrip-dd
Contributor Author

nsrip-dd commented Jan 8, 2025

Hmm... I don't see any goroutines doing allocation. Here are the "running" goroutines in the core, according to delve:

(dlv) goroutines -with running
  Goroutine 129 - User: /root/.gimme/versions/go1.23.0.linux.amd64/src/runtime/sigqueue.go:152 os/signal.signal_recv (0x48e809) (thread 9)
* Goroutine 132 - User: /root/.gimme/versions/go1.23.0.linux.amd64/src/runtime/trace.go:318 runtime/trace.Stop (0x9dfca5) (thread 14) [coroutine]
  Goroutine 2679 - User: /tmp/go-build/cgo-gcc-prolog:115 C._cgo_52142f90988d_Cfunc_fdb_run_network (0x2f6317a) (thread 12)

Goroutine 129 is receiving signals. Goroutine 132 is crashing while stopping the tracer (this is from my investigation of #69085), and goroutine 2679 is doing a cgo call. I dumped the rest of the stacks, grepped for mallocgc, and didn't get any matches. Is there anything else I should look for that would help confirm or rule out allocation?

I can also try poking around with a debugger and see if I can pinpoint where this failure is happening.

@mknyszek
Contributor

mknyszek commented Jan 8, 2025

Does delve show things running on the system stack? I'm not familiar enough with delve to say for sure.

The patch, even if not the right fix, should allow viewcore to make progress on your core file, at the very least. But it would be helpful if you could print some information out of the mspan for the large object, just before the point of failure.

mknyszek self-assigned this Jan 8, 2025
mknyszek moved this to In Progress in Go Compiler / Runtime Jan 8, 2025
@nsrip-dd
Contributor Author

nsrip-dd commented Jan 8, 2025

Patch set 2 of the linked CL gets viewcore to run successfully for me. I still haven't tracked down in delve whether the program was allocating when the core was taken. Happy to keep looking and collecting info if you want to diagnose this further.
