首页 Go行指令
文章
取消

Go行指令

Go 编译器接受注释形式的指令,为了与非指令的注释区分,这种指令要求 // 与指令名之间不能有空格。然而,由于它们也是注释,对于支持不了指令转换或不识别特定指令的工具来说会把它们当做普通注释一样忽略处理。

以下示例代码环境为 MacOS,go 1.19.2

//line形式行指令

line directives,行指令(笔者暂未找到相应的中文术语,暂直译为”行指令“)有如下几种形式:

1
2
3
4
5
6
7
8
//line :line
//line :line:col
//line filename:line
//line filename:line:col
/*line :line*/
/*line :line:col*/
/*line filename:line*/
/*line filename:line:col*/

主要出现在机器生成的代码中,以便编译器和调试器将原始输入中的位置报告给生成器。

//go形式行指令

形式://go:name,后面介绍如下几种常见行指令

//go:linkname

形式为:

1
//go:linkname localname importpath.name

该指令指示编译器使用 importpath.name 作为源代码中声明为 localname 的变量或函数的目标文件符号名称。

如:

1
2
3
4
5
6
7
// src/runtime/timestub.go

//go:linkname time_now time.now
func time_now() (sec int64, nsec int32, mono int64) {
	sec, nsec = walltime()
	return sec, nsec, nanotime()
}

//go:noescape

形式为:

1
//go:noescape

该指令后面跟一个有声明但没有主体(实现可能不是 Go)的函数,不允许编译器对其做逃逸分析。它指定了函数不允许指针传参并从栈上逃逸到堆上或者到函数返回值上的行为。

一般情况下,该指令用于内存分配优化。 因为编译器默认会进行逃逸分析,会通过规则判定一个变量是分配到堆上还是栈上,特殊情况下,逃逸分析某函数其是存放在堆上,但我们想分配到栈上,就可以使用该指令。

如:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// src/runtime/stubs.go

// memmove copies n bytes from "from" to "to".
//
// memmove ensures that any pointer in "from" is written to "to" with
// an indivisible write, so that racy reads cannot observe a
// half-written pointer. This is necessary to prevent the garbage
// collector from observing invalid pointers, and differs from memmove
// in unmanaged languages. However, memmove is only required to do
// this if "from" and "to" may contain pointers, which can only be the
// case if "from", "to", and "n" are all be word-aligned.
//
// Implementations are in memmove_*.s.
//
//go:noescape
func memmove(to, from unsafe.Pointer, n uintptr)

这个案例满足了该指令的常见特性:

  • memmove_*.s 只有声明,主体是由底层汇编实现的
  • memmove 函数功能在栈上处理性能会更好

//go:nosplit

形式:

1
//go:nosplit

该指令后需跟一个函数定义,它明确了函数必须忽略堆栈溢出检查。

npsplit 中的 split 指的是栈分裂(stack-split)。goroutine 的初始栈大小比较小,为2K,但随着工作的执行需要,就需要增加栈空间,runtime 会创建一个比原来栈大小大的新栈(2倍),并将原来栈的上下文拷贝到新栈上,这个过程就是栈分裂。为了实现 goroutine 能够动态调整大小,就需要一个检测机制来保证动态扩容 goroutine 栈。这个指令就是为了逃避这个检测机制,可以提高性能,但如果使用不当可能会发生 stack overflow。

如:

1
2
3
4
5
6
7
8
9
10
11
12
13
// src/runtime/lock_futex.go

// Possible lock states are mutex_unlocked, mutex_locked and mutex_sleeping.
// mutex_sleeping means that there is presumably at least one sleeping thread.
// Note that there can be spinning threads during all states - they do not
// affect mutex's state.

// We use the uintptr mutex.key and note.key as a uint32.
//
//go:nosplit
func key32(p *uintptr) *uint32 {
	return (*uint32)(unsafe.Pointer(p))
}

//go:nowritebarrierrec

形式:

1
//go:nowritebarrierrec

该指令表示编译器遇到写屏障时会产生一个错误,且允许递归。这个函数调用的其他函数如果有写屏障也会报错。简而言之。针对写屏障的处理,防止死循环。

如:

1
2
3
4
5
6
7
8
// src/runtime/mgcwork.go

// empty reports whether w has no mark work available.
//
//go:nowritebarrierrec
func (w *gcWork) empty() bool {
	return w.wbuf1 == nil || (w.wbuf1.nobj == 0 && w.wbuf2.nobj == 0)
}

//go:yeswritebarrierrec

形式:

1
//go:yeswritebarrierrec

该指令与 //go:nowritebarrierrec 相对,当编译器遇到该指令时会停止。

如:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
// src/runtime/proc.go

// Schedules gp to run on the current M.
// If inheritTime is true, gp inherits the remaining time in the
// current time slice. Otherwise, it starts a new time slice.
// Never returns.
//
// Write barriers are allowed because this is called immediately after
// acquiring a P in several places.
//
//go:yeswritebarrierrec
func execute(gp *g, inheritTime bool) {
	...
}

//go:noinline

形式:

1
//go:noinline

该指令后面需跟一个函数定义,表示禁止编译器对函数进行内联优化。

仅用于特殊 runtime 函数或者调试编译器。

如:

1
2
3
4
//go:noinline
func unexportedPanicForTesting(b []byte, i int) byte {
	return b[i]
}

//go:norace

形式:

1
//go:norace

该指令后面需跟一个函数定义,表示禁止进行竟态检测。

如:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// src/syscall/exec_bsd.go

// Fork, dup fd onto 0..len(fd), and exec(argv0, argvv, envv) in child.
// If a dup or exec fails, write the errno error to pipe.
// (Pipe is close-on-exec so if exec succeeds, it will be closed.)
// In the child, this function must not acquire any locks, because
// they might have been locked at the time of the fork. This means
// no rescheduling, no malloc calls, and no new stack segments.
// For the same reason compiler does not race instrument it.
// The calls to RawSyscall are okay because they are assembly
// functions that do not grow the stack.
//
//go:norace
func forkAndExecInChild(argv0 *byte, argv, envv []*byte, chroot, dir *byte, attr *ProcAttr, sys *SysProcAttr, pipe int) (pid int, err Errno) {
	...
}

//go:notinheap

形式:

1
//go:notinheap

该类型常用于类型声明,表示这个类型不允许从 GC 堆上进行申请内存。在运行时常用来做较低层次的内部结构,避免调度器和内存分配中的写屏障,能够提高性能。

1
2
3
4
5
6
7
8
9
10
11
12
13
// src/runtime/malloc.go

// notInHeap is off-heap memory allocated by a lower-level allocator
// like sysAlloc or persistentAlloc.
//
// In general, it's better to use real types marked as go:notinheap,
// but this serves as a generic type for situations where that isn't
// possible (like in the allocators).
//
// TODO: Use this as the return type of sysAlloc, persistentAlloc, etc?
//
//go:notinheap
type notInHeap struct{}

//go:systemstack

形式:

1
//go:systemstack

表示函数必须在系统栈上运行。

如:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
// src/runtime/headdump.go

//go:systemstack
func dumpmemstats(m *MemStats) {
	assertWorldStopped()

	// These ints should be identical to the exported
	// MemStats structure and should be ordered the same
	// way too.
	dumpint(tagMemStats)
	dumpint(m.Alloc)
	dumpint(m.TotalAlloc)
	dumpint(m.Sys)
	dumpint(m.Lookups)
	dumpint(m.Mallocs)
	dumpint(m.Frees)
	dumpint(m.HeapAlloc)
	dumpint(m.HeapSys)
	dumpint(m.HeapIdle)
	dumpint(m.HeapInuse)
	dumpint(m.HeapReleased)
	dumpint(m.HeapObjects)
	dumpint(m.StackInuse)
	dumpint(m.StackSys)
	dumpint(m.MSpanInuse)
	dumpint(m.MSpanSys)
	dumpint(m.MCacheInuse)
	dumpint(m.MCacheSys)
	dumpint(m.BuckHashSys)
	dumpint(m.GCSys)
	dumpint(m.OtherSys)
	dumpint(m.NextGC)
	dumpint(m.LastGC)
	dumpint(m.PauseTotalNs)
	for i := 0; i < 256; i++ {
		dumpint(m.PauseNs[i])
	}
	dumpint(uint64(m.NumGC))
}

参考资料:

本文由作者按照 CC BY 4.0 进行授权