Posts, Events

Iceberg operates at two of the layers in the stack:
the table format and the catalog spec.
It typically uses Parquet as the underlying file format.

Lance spans three layers of the stack, because it's
simultaneously a file format, table format and a catalog spec.
Lance's file format makes it more convenient to maintain
multimodal data natively as blobs inside the columns,
with no external lookups (the multimodal data is co-located
with metadata and embeddings), thus simplifying governance
and management of data that's multimodal in nature.

It's also significantly more performant, because at the
table level, Lance can pack multiple smaller rows together
while storing very large rows (e.g., image or audio blobs)
in a dedicated file thanks to its fragment-based design,
thus balancing performance with storage size.
In Iceberg, data evolution comes with a non-trivial cost:
adding data to a new column requires a full table rewrite
since Parquet stores entire row groups together.
This means that for very large tables, it's common to see
multiple new feature columns being added in parallel by
multiple teams in an organization, which would require a
table lock as new columns are being added, bottlenecking
the feature engineering process.

In Lance, adding a new column is essentially a
zero-copy operation. Lance's fragment design allows
independent column files per fragment
(though multiple columns can share a data file),
meaning that adding or updating a column simply appends
new column files without touching existing data.
let b = document.body
let t = document.createElement('table')
b.appendChild(t)
let table = [
  ['one','two','three'],
  ['four','five','six']
]
let c
let r
table.forEach((row,ri) => {
  let r = t.insertRow(ri)
  row.forEach((l,i) => {
    c = r.insertCell(i)
    c.innerText = l
  })
})
if let Channel::Stable(v) = release_info()
  && let Semver { major, minor, .. } = v
  && major == 1
  && minor == 88
{
  println!("`let_chains` was stabilized in this version");
}
name = "World"
tmpl = t"Hello {name}"
assert tmpl.strings[0] == "Hello "
assert tmpl.interpolations[0].value == "World"
assert tmpl.interpolations[0].expression == "name"

tmpl = t"Hello {name!r}"
assert tmpl.interpolations[0].conversion == "r"

val = 42
tmpl = t"Value: {val:.2f}"
assert tmpl.interpolations[0].format_spec == ".2f"
let mut vec: Vec<String> = vec![];

let closure = async || {
  vec.push(ready(String::from("")).await);
};

视频