Commit 0d6716d9 authored by Austin Clements

Use the zero-page allocator to allocate radix nodes

Previously, radix_node's constructor would zero all of the pointers. Now we use zalloc.

This fixes physical sharing between the mmap that first allocates a radix node and every other operation that touches that node. (I argue this is not just sweeping the problem under the rug: mmap obviously does not commute with address space creation, which could in principle pre-populate the radix tree. Since that would consume too many resources, we do it lazily. The resulting physical sharing is exactly the same; it just happens at a different time.)
Parent 6997e5ae
......@@ -163,10 +163,18 @@ class radix_elem : public rcu_freed {
 struct radix_node {
   radix_ptr child[1 << bits_per_level];

-  radix_node() { }
+  // We need to customize not only allocation but initialization, so
+  // radix_node has no constructors. Instead, use create.
+  radix_node() = delete;
+  radix_node(const radix_node &o) = delete;
+  static radix_node *create();
   ~radix_node();

-  NEW_DELETE_OPS(radix_node)
+  // Since we use custom allocation for radix_node's, we must also
+  // custom delete them. Note that callers may alternatively use
+  // zfree when freeing a radix_node that's known to be empty (for
+  // example, after failed optimistic concurrency).
+  static void operator delete(void *p);
 };

 // Assert we have enough spare bits for all flags.
......@@ -196,7 +204,7 @@ struct radix {
   radix_ptr root_;
   u32 shift_;

-  radix(u32 shift) : root_(radix_entry(new radix_node())), shift_(shift) {
+  radix(u32 shift) : root_(radix_entry(radix_node::create())), shift_(shift) {
   }
   ~radix();
   radix_elem* search(u64 key);
......
......@@ -39,7 +39,7 @@ push_down(radix_entry cur, radix_ptr *ptr)
     radix_elem *elem = cur.elem();
     // FIXME: This might throw. Might need to unlock the things you've
     // already hit.
-    radix_node *new_rn = new radix_node();
+    radix_node *new_rn = radix_node::create();
     if (elem != nullptr) {
       for (int i = 0; i < (1<<bits_per_level); i++) {
         new_rn->child[i].store(radix_entry(elem));
......@@ -58,14 +58,20 @@ push_down(radix_entry cur, radix_ptr *ptr)
       // reallocating new_rn if elem doesn't change.
       // Avoid bouncing on the refcount 1<<bits_per_level times.
-      if (elem != nullptr) {
-        for (int i = 0; i < (1<<bits_per_level); i++) {
-          new_rn->child[i].store(radix_entry(nullptr));
-        }
+      if (elem != nullptr)
         elem->decref(1<<bits_per_level);
-      }
-      delete new_rn;
+      // XXX(austin) This happens for nearly 50% of radix_node
+      // allocations. Is the compare exchange actually right?
+      if (elem == nullptr)
+        // We know the page is still zeroed
+        zfree(new_rn);
+      else
+        // We already did a batch decref above. We could zero all of
+        // the entries and call the destructor (which will scan the
+        // node again). Instead, we skip the whole thing and free
+        // directly.
+        kfree(new_rn);
     }
   }
   return cur;
......@@ -131,6 +137,14 @@ radix_entry::release()
   }
 }

+radix_node*
+radix_node::create()
+{
+  static_assert(sizeof(radix_node) == PGSIZE,
+                "radix_node must be exactly one page");
+  return (radix_node*)zalloc("radix_node");
+}
+
 radix_node::~radix_node()
 {
   for (int i = 0; i < (1<<bits_per_level); i++) {
......@@ -138,6 +152,12 @@ radix_node::~radix_node()
   }
 }

+void
+radix_node::operator delete(void *p)
+{
+  kfree(p);
+}
+
 radix::~radix()
 {
   root_.load().release();
......