Improve IR node handles #107

fabianschuiki · 2020-04-12T07:08:22Z

The graph-ir branch contains a quick trial on how the IR could be revamped. Part of that is a new builder/modifier scheme where the user does not directly interact with modules, functions, processes, entities, blocks, and instructions, but rather uses builder instances. These lock the module or unit in place using a lifetime and are the only means to modify these structs -- they don't provide such functionality themselves.

llhd/examples/gir.rs

Lines 47 to 85 in 75aeda7

 impl Module { 

 /// Create a new module. 

 pub fn new() -> Self { 

 Self { 

 present_units: Default::default(), 

 units: Default::default(), 

 } 

 } 

 /// Modify the module. 

 pub fn modify(&mut self) -> ModuleBuilder { 

 ModuleBuilder(UnsafeCell::new(self)) 

 } 

 fn alloc_unit(&self, data: UnitData) -> UnitId { 

 // Safe because we only add elements. This may cause the vector to grow, 

 // which moves the boxes around. But since they're boxes, the referred 

 // to data will not move. 

 let mut units = unsafe { &mut *self.units.get() }; 

 for (id, slot) in units.iter_mut().enumerate() { 

 if slot.is_none() { 

 *slot = Some(Box::new(data)); 

 return UnitId(id as u32); 

 } 

 } 

 let id = UnitId(units.len() as u32); 

 units.push(Some(Box::new(data))); 

 id 

 } 

 fn dealloc_unit(&mut self, unit: UnitId) -> UnitData { 

 // Safe because the function takes &mut, ensuring that no references to 

 // any of the units exist. 

 unsafe { 

 assert!(!(*self.present_units.get()).contains(unit.0)); 

 *(*self.units.get())[unit.0 as usize].take().unwrap() 

 } 

 } 

 }

llhd/examples/gir.rs

Lines 107 to 155 in 75aeda7

 impl<'m> ModuleBuilder<'m> { 

 pub fn new_function<'u>(&'u self, name: UnitName) -> UnitBuilder<'m, 'u> { 

 self.new_unit(UnitData::new(UnitKind::Function, name)) 

 } 

 pub fn new_process<'u>(&'u self, name: UnitName) -> UnitBuilder<'m, 'u> { 

 self.new_unit(UnitData::new(UnitKind::Process, name)) 

 } 

 pub fn new_entity<'u>(&'u self, name: UnitName) -> UnitBuilder<'m, 'u> { 

 self.new_unit(UnitData::new(UnitKind::Entity, name)) 

 } 

 fn new_unit<'u>(&'u self, data: UnitData) -> UnitBuilder<'m, 'u> { 

 // Safe because we hand out pointers on the heap. 

 let id = self.alloc_unit(data); 

 let data = unsafe { 

 (*self.units.get())[id.0 as usize] 

 .as_mut() 

 .unwrap() 

 .as_mut() 

 }; 

 UnitBuilder(Unit(id, data as *const _, PhantomData), data, PhantomData) 

 } 

 pub fn add_unit(&self, unit: Unit<'m>) {} 

 pub fn remove_unit(&self, unit: Unit<'m>) {} 

 /// Modify a unit in the module. 

 /// 

 /// Only one module can be modified at the time through this function, since 

 /// we cannot guarantee that the user requests modification of the same unit 

 /// twice. If you want to modify units in parallel, use `modify_units`. 

 pub fn modify_unit<'u>(&'u mut self, unit: Unit<'m>) -> UnitBuilder<'m, 'u> { 

 // Safe because the function is &mut, enforcing a single reference. 

 let data = unsafe { 

 (*self.units.get())[(unit.0).0 as usize] 

 .as_mut() 

 .unwrap() 

 .as_mut() as *mut _ 

 }; 

 UnitBuilder(unit, data, PhantomData) 

 } 

 /// Modify all units in the module in parallel. 

 pub fn modify_units<'u>(&'u mut self) -> Vec<UnitBuilder<'m, 'u>> { 

 vec![] 

 } 

 }

This provides a single point of modification for the IR, and allows the code to perform a lot of bookkeeping on the side, such as pred/succ and usage tables, as well as dominator tree invalidation. The current IR would greatly benefit from such a scheme as it streamlines interaction with LLHD and provides more opportunities to make things fast on the inside.

An interesting improvement would be to make the changes applied to these builders deferred, such that the user can accumulate modifications which then only become visible upon committing them. This would allow the user to interact with analyses such as DTs and TRGs without them being invalidated by modification, with the express mental model being that the changes are delayed and as such the analyses remain valid.

An important side-effect of this is that modification through a single structure allows that structure to maintain a lot of accelerating lookup data structures, such as predecessor/successor lists, terminator maps, phi node lists, cached dominator tables, etc.

Todo

Use an arena to allocate values, instructions, blocks in PrimaryTable2
Split handles such as Value into ValueId (the actual index) and Value (a wrapper around the index, and other pointers).
- Block
- Value
- Inst
Distinguish between removing a node (e.g. remove_value) and deallocating it (e.g. dealloc_value). The former does not invalidate existing references, but marks node for deletion.

The text was updated successfully, but these errors were encountered:

fabianschuiki added A-ir Area: Intermediate representation. P-long-term Priority: Long-term endeavour. labels Apr 12, 2020

fabianschuiki mentioned this issue Apr 12, 2020

Simplify IR data structures #82

Closed

13 tasks

fabianschuiki added the C-enhancement Category: Adding or improving on features. label Apr 12, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve IR node handles #107

Improve IR node handles #107

fabianschuiki commented Apr 12, 2020

Improve IR node handles #107

Improve IR node handles #107

Comments

fabianschuiki commented Apr 12, 2020

Todo