Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve IR node handles #107

Open
6 tasks
fabianschuiki opened this issue Apr 12, 2020 · 0 comments
Open
6 tasks

Improve IR node handles #107

fabianschuiki opened this issue Apr 12, 2020 · 0 comments
Labels
A-ir Area: Intermediate representation. C-enhancement Category: Adding or improving on features. P-long-term Priority: Long-term endeavour.

Comments

@fabianschuiki
Copy link
Owner

The graph-ir branch contains a quick trial on how the IR could be revamped. Part of that is a new builder/modifier scheme where the user does not directly interact with modules, functions, processes, entities, blocks, and instructions, but rather uses builder instances. These lock the module or unit in place using a lifetime and are the only means to modify these structs -- they don't provide such functionality themselves.

llhd/examples/gir.rs

Lines 47 to 85 in 75aeda7

impl Module {
/// Create a new module.
pub fn new() -> Self {
Self {
present_units: Default::default(),
units: Default::default(),
}
}
/// Modify the module.
pub fn modify(&mut self) -> ModuleBuilder {
ModuleBuilder(UnsafeCell::new(self))
}
fn alloc_unit(&self, data: UnitData) -> UnitId {
// Safe because we only add elements. This may cause the vector to grow,
// which moves the boxes around. But since they're boxes, the referred
// to data will not move.
let mut units = unsafe { &mut *self.units.get() };
for (id, slot) in units.iter_mut().enumerate() {
if slot.is_none() {
*slot = Some(Box::new(data));
return UnitId(id as u32);
}
}
let id = UnitId(units.len() as u32);
units.push(Some(Box::new(data)));
id
}
fn dealloc_unit(&mut self, unit: UnitId) -> UnitData {
// Safe because the function takes &mut, ensuring that no references to
// any of the units exist.
unsafe {
assert!(!(*self.present_units.get()).contains(unit.0));
*(*self.units.get())[unit.0 as usize].take().unwrap()
}
}
}

llhd/examples/gir.rs

Lines 107 to 155 in 75aeda7

impl<'m> ModuleBuilder<'m> {
pub fn new_function<'u>(&'u self, name: UnitName) -> UnitBuilder<'m, 'u> {
self.new_unit(UnitData::new(UnitKind::Function, name))
}
pub fn new_process<'u>(&'u self, name: UnitName) -> UnitBuilder<'m, 'u> {
self.new_unit(UnitData::new(UnitKind::Process, name))
}
pub fn new_entity<'u>(&'u self, name: UnitName) -> UnitBuilder<'m, 'u> {
self.new_unit(UnitData::new(UnitKind::Entity, name))
}
fn new_unit<'u>(&'u self, data: UnitData) -> UnitBuilder<'m, 'u> {
// Safe because we hand out pointers on the heap.
let id = self.alloc_unit(data);
let data = unsafe {
(*self.units.get())[id.0 as usize]
.as_mut()
.unwrap()
.as_mut()
};
UnitBuilder(Unit(id, data as *const _, PhantomData), data, PhantomData)
}
pub fn add_unit(&self, unit: Unit<'m>) {}
pub fn remove_unit(&self, unit: Unit<'m>) {}
/// Modify a unit in the module.
///
/// Only one module can be modified at the time through this function, since
/// we cannot guarantee that the user requests modification of the same unit
/// twice. If you want to modify units in parallel, use `modify_units`.
pub fn modify_unit<'u>(&'u mut self, unit: Unit<'m>) -> UnitBuilder<'m, 'u> {
// Safe because the function is &mut, enforcing a single reference.
let data = unsafe {
(*self.units.get())[(unit.0).0 as usize]
.as_mut()
.unwrap()
.as_mut() as *mut _
};
UnitBuilder(unit, data, PhantomData)
}
/// Modify all units in the module in parallel.
pub fn modify_units<'u>(&'u mut self) -> Vec<UnitBuilder<'m, 'u>> {
vec![]
}
}

This provides a single point of modification for the IR, and allows the code to perform a lot of bookkeeping on the side, such as pred/succ and usage tables, as well as dominator tree invalidation. The current IR would greatly benefit from such a scheme as it streamlines interaction with LLHD and provides more opportunities to make things fast on the inside.

An interesting improvement would be to make the changes applied to these builders deferred, such that the user can accumulate modifications which then only become visible upon committing them. This would allow the user to interact with analyses such as DTs and TRGs without them being invalidated by modification, with the express mental model being that the changes are delayed and as such the analyses remain valid.

An important side-effect of this is that modification through a single structure allows that structure to maintain a lot of accelerating lookup data structures, such as predecessor/successor lists, terminator maps, phi node lists, cached dominator tables, etc.

Todo

  • Use an arena to allocate values, instructions, blocks in PrimaryTable2
  • Split handles such as Value into ValueId (the actual index) and Value (a wrapper around the index, and other pointers).
    • Block
    • Value
    • Inst
  • Distinguish between removing a node (e.g. remove_value) and deallocating it (e.g. dealloc_value). The former does not invalidate existing references, but marks node for deletion.
@fabianschuiki fabianschuiki added A-ir Area: Intermediate representation. P-long-term Priority: Long-term endeavour. labels Apr 12, 2020
@fabianschuiki fabianschuiki added the C-enhancement Category: Adding or improving on features. label Apr 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ir Area: Intermediate representation. C-enhancement Category: Adding or improving on features. P-long-term Priority: Long-term endeavour.
Projects
None yet
Development

No branches or pull requests

1 participant