go install golang.org/x/vuln/cmd/govulncheck@latest
go install golang.org/x/tools/cmd/deadcode@latest


By default, Temporal SDKs set a Worker Identity to
${process.pid}@${os.hostname}, which combines the
Worker's process ID (process.pid) and the hostname of
the machine running the Worker (os.hostname).

When running Workers inside Docker containers, the
process ID is always 1, as each container typically
runs a single process. This makes the process
identifier meaningless for identification purposes.

Include relevant context: Incorporate information that
helps establish the context of the Worker, such as the
deployment environment (staging or production), region,
or any other relevant details.

Ensure uniqueness: Make sure that the Worker Identity is unique
within your system to avoid ambiguity when debugging issues.

Keep it concise: While including relevant information is important,
try to keep the Worker Identity concise and easily readable to
facilitate quick identification and troubleshooting.
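
For example, with the Go SDK the default can be overridden through the
worker's Identity option. A minimal sketch (assuming worker.Options'
Identity field; the environment, region, service, and task-queue names
are made-up values):

package main

import (
  "fmt"
  "log"
  "os"

  "go.temporal.io/sdk/client"
  "go.temporal.io/sdk/worker"
)

func main() {
  c, err := client.Dial(client.Options{})
  if err != nil {
    log.Fatal(err)
  }
  defer c.Close()

  host, _ := os.Hostname()
  // Encode deployment context (environment, region, service) plus the
  // hostname, instead of the meaningless "<pid>@<hostname>" default.
  identity := fmt.Sprintf("prod-us-east-1-payments-%s", host)

  w := worker.New(c, "payments-task-queue", worker.Options{
    Identity: identity,
  })
  if err := w.Run(worker.InterruptCh()); err != nil {
    log.Fatal(err)
  }
}
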
The Temporal Service (including the Temporal Cloud) doesn't execute
any of your code (Workflow and Activity Definitions) on Temporal Service
machines. The Temporal Service is solely responsible for orchestrating
State Transitions and providing Tasks to the next available Worker Entity.

A Worker Process can be both a Workflow Worker Process and an
Activity Worker Process. Many SDKs support the ability to have
multiple Worker Entities in a single Worker Process.
(Worker Entity creation and management differ between SDKs.)

A single Worker Entity can listen to only a single Task Queue.
But if a Worker Process has multiple Worker Entities, the
Worker Process could be listening to multiple Task Queues.
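
Continuing the Go SDK sketch above (task-queue names are illustrative),
a single Worker Process can host two Worker Entities, each polling its
own Task Queue:

// Two Worker Entities in one Worker Process, reusing the client `c`.
w1 := worker.New(c, "orders-task-queue", worker.Options{})
w2 := worker.New(c, "billing-task-queue", worker.Options{})

// Start the first Worker asynchronously; run the second until interrupted.
if err := w1.Start(); err != nil {
  log.Fatal(err)
}
defer w1.Stop()

if err := w2.Run(worker.InterruptCh()); err != nil {
  log.Fatal(err)
}
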

There are two types of Task Queues,
Activity Task Queues and Workflow Task Queues.

Task Queues do not require explicit registration but instead
are created on demand when a Workflow Execution or Activity
spawns or when a Worker Process subscribes to it.

When a Task Queue is created, both a Workflow Task Queue and
an Activity Task Queue are created under the same name.

A Sticky Execution is when a Worker Entity caches the Workflow
in memory and creates a dedicated Task Queue to listen on.
A Sticky Execution occurs after a Worker Entity completes the
first Workflow Task in the chain of Workflow Tasks
for the Workflow Execution.

Some SDKs provide a Session API that provides a straightforward
way to ensure that Activity Tasks are executed with the same
Worker without requiring you to manually specify Task Queue names.
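
In the Go SDK this is the Session API. A rough sketch of a Workflow that
pins two Activities to the same Worker (DownloadFile and ProcessFile are
hypothetical Activities; it needs the "time" and
"go.temporal.io/sdk/workflow" imports, and the Worker must run with
worker.Options{EnableSessionWorker: true}):

func ProcessFileWorkflow(ctx workflow.Context, path string) error {
  sessCtx, err := workflow.CreateSession(ctx, &workflow.SessionOptions{
    CreationTimeout:  time.Minute,
    ExecutionTimeout: 10 * time.Minute,
  })
  if err != nil {
    return err
  }
  defer workflow.CompleteSession(sessCtx)

  sessCtx = workflow.WithActivityOptions(sessCtx, workflow.ActivityOptions{
    StartToCloseTimeout: 5 * time.Minute,
  })

  // Both Activity Tasks are routed to the Worker that created the session.
  if err := workflow.ExecuteActivity(sessCtx, DownloadFile, path).Get(sessCtx, nil); err != nil {
    return err
  }
  return workflow.ExecuteActivity(sessCtx, ProcessFile, path).Get(sessCtx, nil)
}
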

2024-12-02: Thought of a fun question.

Name a specific mathematical theorem (or, more broadly, a topic)
that you think does the most to bridge different parts of mathematics, physics, and computer science.

What I actually want to say is:
quite a few projects started out in Scala/Java and later grew a Rust version,
but this is one of the very few where the Rust version's activity has overtaken the Scala/Java one.

Compare:
https://github.com/apache/iceberg-rust
https://github.com/apache/iceberg

https://github.com/apache/hudi
https://github.com/apache/hudi-rs

As of 2024-12, it looks like Hudi is the first one out~ and Databricks itself is giving up on Delta Lake and going all in on Iceberg~


I have to complain: the author went almost two full years (2023 and 2024) without updating on manning.com! Many readers (myself included) asked for updates in the forum, to no avail~

Even if it does get updated later, I won't read it~ Thumbs down!


The Move ecosystem: barely alive~



The prover key embeds all the information necessary to
generate a proof, in a zero-knowledge-preserving fashion,
for that specific circuit. Similarly, the verifier key
embeds all the information required to verify that a
proof is indeed correct. These aren't private keys but
information that can and should be publicly distributed:
any party that needs to generate or verify a proof
should have access to them.

Neither "From Zero" nor "to Hero"; the article is a bit thin~




We've now seen two different approaches (Push/Pull)
to looping over all the elements of a set.
Different Go packages use these approaches and several others.
That means that when you start using a new Go container package
you may have to learn a new looping mechanism.
It also means that we can't write one function that
works with several different types of containers,
as the container types will handle looping differently.

We want to improve the Go ecosystem by developing
standard approaches for looping over containers.
As of Go 1.23, the language supports ranging over functions that
take a single argument. That argument must itself be a
function that takes zero to two arguments and returns a bool;
by convention, we call it the yield function.
func(yield func() bool)

func(yield func(V) bool)

func(yield func(K, V) bool)
When we speak of an iterator in Go, we mean a function
with one of these three types. As we'll discuss below,
there is another kind of iterator in the
standard library: a pull iterator.
When it is necessary to distinguish between
standard iterators and pull iterators,
we call the standard iterators push iterators.
As a matter of convention, we encourage all container types
to provide an All method that returns an iterator,
so that programmers don't have to remember whether to range
over the container directly or whether to call All
to get a value they can range over.
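
As a small sketch of that convention, here is a toy container type with
an All method returning an iter.Seq (the List type is made up purely for
illustration):

package main

import (
  "fmt"
  "iter"
)

// List is a toy singly linked list.
type List[E any] struct {
  head *node[E]
}

type node[E any] struct {
  val  E
  next *node[E]
}

func (l *List[E]) Push(v E) {
  l.head = &node[E]{val: v, next: l.head}
}

// All returns a push iterator over the list's elements.
func (l *List[E]) All() iter.Seq[E] {
  return func(yield func(E) bool) {
    for n := l.head; n != nil; n = n.next {
      if !yield(n.val) {
        return // the caller broke out of its loop early
      }
    }
  }
}

func main() {
  var l List[int]
  l.Push(3)
  l.Push(2)
  l.Push(1)
  for v := range l.All() {
    fmt.Println(v) // 1, 2, 3
  }
}
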
A pull iterator works the other way around:
it is a function that is written such that each time
you call it, it returns the next value in the sequence.

We'll repeat the difference between the two types
of iterators to help you remember:

A push iterator pushes each value in a sequence to
a yield function. Push iterators are standard iterators
in the Go standard library, and are supported
directly by the for/range statement.

A pull iterator works the other way around. Each time you
call a pull iterator, it pulls another value from a sequence
and returns it. Pull iterators are not supported directly by
the for/range statement; however, it's straightforward to write
an ordinary for statement that loops through a pull iterator.
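
Such a loop might look like this (a minimal sketch, assuming some
iter.Seq[int] value named seq):

// Convert a push iterator into a pull iterator and drain it manually.
next, stop := iter.Pull(seq)
defer stop() // clean up if we stop before the end of the sequence

for {
  v, ok := next()
  if !ok {
    break // end of the sequence
  }
  fmt.Println(v)
}
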
The first function returned by iter.Pull, the pull iterator,
returns a value and a boolean that reports
whether that value is valid.
The boolean will be false at the end of the sequence.
iter.Pull returns a stop function in case we don't read
through the sequence to the end. In the general case the
push iterator, the argument to iter.Pull, may
start goroutines, or build new data structures that need
to be cleaned up when iteration is complete.

The push iterator will do any cleanup when the yield
function returns false, meaning that no more values
are required. When used with a for/range statement,
the for/range statement will ensure that if the loop
exits early, through a break statement or for any
other reason, then the yield function will return false.

With a pull iterator, on the other hand, there is no way
to force the yield function to return false,
so the stop function is needed.
// EqSeq reports whether two iterators contain the same
// elements in the same order.
func EqSeq[E comparable](s1, s2 iter.Seq[E]) bool {
  next1, stop1 := iter.Pull(s1)
  defer stop1()
  next2, stop2 := iter.Pull(s2)
  defer stop2()
  for {
    v1, ok1 := next1()
    v2, ok2 := next2()
    if !ok1 {
      return !ok2
    }
    if ok1 != ok2 || v1 != v2 {
      return false
    }
  }
}

Ever since the GitOps idea took hold, at least for long-running workloads, the convenience it brings has been beyond doubt~
I previously implemented a project based on:

https://github.com/apache/flink-kubernetes-operator

to build our company's internal scheduling of Flink jobs, and it paid off nicely~
But for batch jobs (not scheduled ones), is GitOps still a good fit? For example:

https://github.com/kubeflow/spark-operator
https://github.com/apache/spark-kubernetes-operator

I don't think so! (at least not for unscheduled jobs)

Put very crudely, GitOps amounts to declaring long-running resources~
Making everything GitOps is clearly inappropriate. Perhaps a simple rule of thumb is:
GitOps suits resource definitions that are submitted manually (including via tools and CI/CD),
and such resource definitions are generally relatively slow to change.
"Slow to change" here is relative; roughly, they should not change more often than (micro)services are released.


At the end of July I tried some web development with Axum (Rust).
Honestly, speaking purely for myself, it is much nicer than Hertz (Go)!

But I quickly realized it doesn't matter much, because Rust's ecosystem advantages currently lie in three areas:
1. blockchain / cryptography / homomorphic encryption (privacy-preserving computation)
2. the data ecosystem around Arrow(-rs) and DataFusion
3. the Rust-powered Python ecosystem

The vast majority of pure web developers will most likely never accept Rust's learning curve.
So unless you are an individual developer or an (independent) open-source project,
pure web development in Rust is of limited value~
(The same goes for writing K8s operators in Rust, and so on.)

Well, what if most of the team members already know Rust? Say, a Data or ML team~
Even for the most seasoned Rust users, whatever edge they gain in web development does not offset the overhead.
(For example: compile times, working through compiler errors, and so on.)

To use a rough analogy: Rust mostly lives in the Data Plane (data processing, compute-intensive work).
(The opposite of the Data Plane is the Control Plane.)

There is one special case: when many of the data-processing components are written in Rust,
writing the K8s operator in Rust as well is reasonable: a single repository, a single language.
Otherwise, I really don't see any benefit to writing K8s operators in Rust~

// The problem was that those references might be
// self-references, meaning they point to
// other fields of the same object.
async fn foo<'a>(z: &'a mut i32) {
  // ...
}

async fn bar(x: i32, y: i32) -> i32 {
  let mut z = x + y;
  foo(&mut z).await;
  z
}
// Let's ask ourselves, what would the internal
// states of `Bar` be? Something like this:

enum Bar {
  // When it starts, it contains only its arguments
  Start { x: i32, y: i32 },

  // At the first await, it must contain `z` and
  // the `Foo` future that references `z`
  FirstAwait { z: i32, foo: Foo<'?> },

  // When it's finished it needs no data
  Complete,
}
// The `Foo` object instead borrows the `z` field of `Bar`,
// which is stored alongside it in the same struct.
// This is why these future types are said to be
// "self-referential": they contain fields that
// reference other fields of the same struct.


Ran a quick benchmark

(1_000_000 iterations):
sha2 (256): 6666.56 ms
md5:        1252.75 ms
blake3:     417.502 ms

(10_000_000 iterations):
sha2 (256): 66.40 s
md5:        12.74 s
blake3:     3.47  s

The passage on why they didn't go with WASM is very pragmatic.

But there is a workaround: use the C ABI.
Unlike Rust, C does have a stable ABI on
every major OS and processor architecture.
So if we can constrain our plugin interface
to only use C-compatible data structures and
functions we can safely link against plugins
compiled by any Rust compiler.

Even better: as the C ABI is the lingua franca
in the systems world, many other languages are
able to emit it, opening the door to supporting
UDFs in a variety of compiled languages.

Starting in 2018, I decided to settle down and work on something I both wanted to do and was good at.
Life is short, and learning how to manage lots of people doing things was not the direction I hoped to grow in.
Especially as I gradually became part of the open-source community, I found that
much of the world's software infrastructure is carried by just one or two people.
As early as 2011 I had suspected that the notion that software projects need many people to get done might be a myth.
So, when you find yourself in a stable environment and have the ability,
such opportunities being rare, you should try to do something.

The typical approach in Luminal for supporting new backends would be:

1. Swap out each primitive operation with a backend-specific operation.
2. Add in operations to copy to device and copy from device
   before and after Function operations.
3. Pattern-match to swap out chunks of
   operations with specialized variants.
4. All other optimizations.

One more note: the core of Luminal has no idea about any of this!
GPUs are a foreign concept to it, which is necessary since we
want to add backends for TPUs, Groq chips, and whatever else
may come in the future without changing anything in the core.

Let's wait and see!


Our new generator, which we unimaginatively named ChaCha8Rand
for specification purposes and implemented as
math/rand/v2's rand.ChaCha8,
is a lightly modified version of the ChaCha stream cipher.
ChaCha is widely used in a 20-round form called ChaCha20,
including in TLS and SSH.
We used ChaCha8 as the core of ChaCha8Rand.
Most stream ciphers, including ChaCha8, work by defining a
function that is given a key and a block number and produces a
fixed-size block of apparently random data.
The cryptographic standard these aim for (and usually meet) is
for this output to be indistinguishable from actual random data in
the absence of some kind of exponentially costly brute-force search.
A message is encrypted or decrypted by XOR'ing successive blocks of
input data with successive randomly generated blocks.
To use ChaCha8 as a rand.Source, we use the generated blocks directly
instead of XOR'ing them with input data
(this is equivalent to encrypting or decrypting all zeros).
We changed a few details to make ChaCha8Rand more
suitable for generating random numbers.
The Go runtime now maintains a per-core ChaCha8Rand state
(300 bytes), seeded with operating system-supplied
cryptographic randomness, so that random numbers can be
generated quickly without any lock contention.
Dedicating 300 bytes per core may sound expensive,
but on a 16-core system, it is about the same as storing
a single shared Go 1 generator state (4,872 bytes).
The speed is worth the memory.
Overall, ChaCha8Rand is slower than the Go 1 generator,
but it is never more than twice as slow,
and on typical servers, the difference is never more than 3ns.
Very few programs will be bottlenecked by this difference,
and many programs will enjoy the improved security.
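
For programs that want an explicitly seeded instance, the generator is
also exposed directly; a small sketch using math/rand/v2:

package main

import (
  crand "crypto/rand"
  "fmt"
  "math/rand/v2"
)

func main() {
  // Seed with 32 bytes of OS-supplied cryptographic randomness, the same
  // kind of seed the runtime uses for its per-core generator states.
  var seed [32]byte
  if _, err := crand.Read(seed[:]); err != nil {
    panic(err)
  }

  r := rand.New(rand.NewChaCha8(seed))
  fmt.Println(r.Uint64(), r.IntN(100))
}
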

The GQL standard does not specify how the
returned data is displayed to the user.
MATCH ((a)-[r]->(b)){1, 5}
RETURN a, r, b

-- This example will find paths where one node
-- knows another node, up to five hops long.
Nodes are enclosed in parentheses, while
edges are enclosed in square brackets.
INSERT (:Person {
  firstname: 'Avery',
  lastname: 'Stare',
  joined: date("2022-08-23")
})
- [:LivesIn {
  since: date("2022-07-15")
}]
-> (:City {
  name: 'Granville',
  state: 'OH',
  country: 'USA'
})
MATCH (a {
  firstname: 'Avery'
}), (d {
  name: 'Unique'
})
INSERT (a) - [:HasPet] -> (d)
-- GQL data is deleted by identifying nodes,
-- detaching them to delete relationships,
-- then deleting the nodes.
MATCH (a {firstname: 'Avery'}) - [b] -> (c)
DETACH DELETE a, c
A schema-free graph will accept any data that is inserted.
This allows for quick startup but leaves the control of
the data with the application developer(s) and/or users.


This milestone represents a key transition in
Ethereum's long-term roadmap:

blobs are the moment where Ethereum scaling ceased to be
a "zero-to-one" problem, and became a "one-to-N" problem.
The next stage is likely to be a simplified version of
DAS called PeerDAS. In PeerDAS, each node stores a significant
fraction (e.g. 1/8) of all blob data, and nodes maintain
connections to many peers in the p2p network.
When a node needs to sample for a particular piece of data,
it asks one of the peers that it knows is
responsible for storing that piece.
// rustc 1.77.0
alignment of i128: 16
func (r *TheController) SetupWithManager(mgr ctrl.Manager) error {
  return ctrl.NewControllerManagedBy(mgr).
    For(&TheObject{}).
    // This is useful because we don't want to
    // reconcile again when the generation is not changed.
    // The generation is changed when the spec is updated.
    // The generation is not changed when the status is updated.
    WithEventFilter(predicate.GenerationChangedPredicate{}).
    Complete(r)
}
For those readers familiar with transformers
and eager for the punchline, here it is:

Each transformer block (containing a multi-head self-attention
layer and feed-forward network) learns weights that associate a
given prompt with a class of strings found in the training corpus.
The distribution of tokens that follow those strings in the
training corpus is, approximately, what the block outputs as
its predictions for the next token.

Each block may associate the same prompt with a different
class of training corpus strings, resulting in a different
distribution of the next tokens and thus different predictions.
The final transformer output is a linear combination of
each block's predictions.
The takeaway is that simplifying the transformation performed
by the blocks to just the contributions of the feed-forward
networks results in an output vector that is shorter (has a smaller
norm) than the original output but points in roughly the same direction.

And the difference in norms would have no impact on the
transformer's final output, because of the LayerNorm operation
after the stack of blocks. That LayerNorm step will adjust the
norm of any input vector to a similar value regardless of its
initial magnitude; the final linear layer that follows it will
always see inputs of approximately the same norm.
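
A quick way to see this (ignoring the small epsilon term): LayerNorm is
invariant to rescaling its input, so two vectors that differ only in norm,
not in direction, normalize to the same output.

LN(x) = \gamma \odot \frac{x - \mu(x)}{\sigma(x)} + \beta
\mu(cx) = c\,\mu(x), \quad \sigma(cx) = c\,\sigma(x) \quad (c > 0)
LN(cx) = \gamma \odot \frac{cx - c\,\mu(x)}{c\,\sigma(x)} + \beta = LN(x)
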
I think the model has learned a complex, non-linear embedding
subspace corresponding to each token. Any embedding within that
subspace results in an output distribution that assigns that
token roughly the same probability.
Each embedding I was able to learn is probably a point in
the embedding subspace for the corresponding token.
Within a block, adding the feed-forward network output
vector to the input produces an output embedding that
better aligns with the embedding subspaces of specific tokens.

And those tokens are the same ones predicted in the approximation:
they're the tokens that follow the strings in the training
corpus that yield similar feed-forward network
outputs to the current prompt.

Haha, the author is quite amusing~ The conclusion is so-so, but the process deserves respect~

Resource savings are nice to have, but the real power of
Flink Autotuning is the reduced time to production.

With Flink Autoscaling and Flink Autotuning, all users
need to do is set a max memory size for the TaskManagers,
just like they would normally configure TaskManager memory.
Flink Autotuning then automatically adjusts the various
memory pools and brings down the total container memory size.
It does that by observing the actual max memory usage on
the TaskManagers or by calculating the exact number of
network buffers required for the job topology.
The adjustments are made together with Flink Autoscaling,
so there is no extra downtime involved.

A very practical feature; how well it works in practice remains to be seen!

Arroyo 0.10 ships as a single, compact binary that
can be deployed in a variety of ways.
Our first decision was to adopt Apache Arrow as our in-memory
data representation, replacing the static Struct types.
Arrow is a columnar, in-memory format designed for
analytical computations. The coolest thing about Arrow is that
it's a cross-language standard; it supports sharing data
directly between engines and even different languages without
copying or serialization overhead.
For example, Pandas programs written in Python could
operate directly on data generated by Arroyo.
The takeaway: we only have to pay the high overhead of small
batch sizes when our data volume is very low.
But if we're only handling 10 or 100 events per second,
the overall cost of processing will be very small in any case.
And at high data volumes (tens of thousands to millions of
events per second) we can have our cake and eat it too: achieve
high throughput with batching and columnar data while
still maintaining low absolute latency.
Now that Arroyo compiles down to a single binary,
we're working to remove the other external dependencies,
including Postgres and Prometheus;
future releases of Arroyo will have the option of running
their control plane on an embedded SQLite database.
first, second, third, fourth := 11, 22, 33, 44
s := []*int{&first, &second, &third, &fourth}

if len(s) >= 4 {
  s = slices.Delete(s, 2, 3)
  fmt.Println("New length is", len(s))
}

for _, v := range s {
  fmt.Println(*v)
}

// New length is 3
// 11
// 22
// 44
first, second, third, fourth := 11, 22, 33, 44
s := []*int{&first, &second, &third, &fourth}

if len(s) >= 4 {
  // Note the `:=`: this declares a new `s` scoped to the if block,
  // so the outer slice keeps its original length of 4.
  s := slices.Delete(s, 2, 3)
  fmt.Println("New length is", len(s))
}

// Since Go 1.22, slices.Delete zeroes the tail elements it frees,
// so the outer slice's last element is now a nil pointer.
for _, v := range s {
  fmt.Println(*v)
}

// New length is 3
// 11
// 22
// 44
// panic: runtime error: invalid memory address or nil pointer dereference

type Item struct {
  Name   string
  Amount int
}

items := []*Item{
  {Name: "Car", Amount: 1},
  {Name: "Car", Amount: 1},
}

// The first call keeps one element (both Names are "Car"). Since Go 1.22,
// CompactFunc also zeroes the freed slot, so the underlying array's
// second pointer becomes nil.
l1 := len(slices.CompactFunc(items, func(a *Item, b *Item) bool {
  return a.Name == b.Name
}))

// `items` still has length 2, but its second element is now nil,
// so this comparison dereferences a nil *Item on Go 1.22.
l2 := len(slices.CompactFunc(items, func(a *Item, b *Item) bool {
  return a.Amount == b.Amount
}))

fmt.Println(l1, l2)

// Go 1.21:
// 1 1
// Go 1.22:
// panic: runtime error: invalid memory address or nil pointer dereference

Because I use ent, this case ([]*ent.Entity) comes up easily for me, so I switched to lo.UniqBy. Partly because slices cannot yet replace lo; partly because lo is simply nicer to use.


pub async fn get_city(ip: String) -> Option<String> {
  let body: serde_json::Value =
    reqwest::get(format!("http://geoip-service:8000/{ip}"))
      .await
      .ok()?
      .json()
      .await
      .ok()?;

  body.pointer("/names/en")
    .and_then(|t| t.as_str())
    .map(|t| t.to_string())
}
create view cities as
select get_city(logs.ip) as city
from logs;

SELECT * FROM (
  SELECT *, ROW_NUMBER() OVER (
    PARTITION BY window
    ORDER BY count DESC
  ) as row_num
  FROM (SELECT count(*) as count,
    city,
    hop(interval '5 seconds', interval '15 minutes') as window
      FROM cities
      WHERE city IS NOT NULL
      group by city, window)
) WHERE row_num <= 5;

values := []int{1, 2, 3, 4, 5}
for _, v := range values {
  go func() {
    // go <= 1.21:
    // vet: loop variable v captured by func literal
    // output: 5 5 5 5 5
    // go >= 1.22 (per-iteration loop variable):
    // output: 2 1 4 5 3 (in random order)
    fmt.Printf("%d ", v)
  }()
}
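
For Go 1.21 and earlier, the usual fix is to give each iteration its own
copy of the variable (which is effectively what Go 1.22 now does for you),
for example:

for _, v := range values {
  v := v // shadow the loop variable so each goroutine captures its own copy
  go func() {
    fmt.Printf("%d ", v)
  }()
}
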