Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor: Move from iptables to nft as iptables is deprecated #250

Merged
merged 21 commits into from
Nov 21, 2022

Conversation

tomribbens
Copy link
Contributor

Since iptables is deprecated, a move to the replacement nftables was implemented in this change. This is in line with various network improvements that are ongoing as discussed in Issue #237. The use of nftables will facilitate some future enhancements, especially related to IPv6 support.

At the same time, this commit refactors all the networking and firewall code into its own package "network", as multiple packages were depending on making network changes. The new package is now an interface for all networking related tasks.

Since iptables is deprecated, a move to the replacement nftables was implemented in this change. This is in line with various network improvements that are ongoing as discussed in Issue aleph-im#237. The use of nftables will facilitate some future enhancements, especially related to IPv6 support.

At the same time, this commit refactors all the networking and firewall code into its own package "network", as multiple packages were depending on making network changes. The new package is now an interface for all networking related tasks.
Copy link
Member

@hoh hoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall code looks good, however there are some some structural changes to make:

  • This is not Java, it is okay to have standalone functions in a module instead of a class namespace. See the Zen of Python for the philosophical approach. Using static and class methods here does not make much sense.

  • The name of the network package is too generic and can be confused with system libraries or one of the following https://pypi.org/search/?q=network

firecracker/microvm.py Outdated Show resolved Hide resolved
network/firewall.py Outdated Show resolved Hide resolved
network/firewall.py Outdated Show resolved Hide resolved
network/firewall.py Outdated Show resolved Hide resolved
network/firewall.py Outdated Show resolved Hide resolved
network/firewall.py Outdated Show resolved Hide resolved
@tomribbens
Copy link
Contributor Author

I don't agree that classmethods don't make sense. There's state information that needs to be kept, while these methods are needed in very different parts of the codebase, so passing around that state information would litter the code too much. This way makes that much easier, and much clearer in the code in my opinion. I don't really understand why you linked to the Zen of Python for this, as I don't see any principle really against what I wrote, but multiple in favor of what I wrote. As indicated, yes, there are some things that might be improved upon, but you've not convinced me why it would need to be restructured.

Some more functions maybe should be @staticmethod instead of @classmethod, but moving them out of the class would make it more confusing to read later on.

True, the name network is generic. But it does cover what it does. I'm not opposed to changing it, but I'm not sure to what. Do you have suggestions?

firecracker/microvm.py Outdated Show resolved Hide resolved
vm_supervisor/ip.py Outdated Show resolved Hide resolved
Copy link
Member

@hoh hoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general rule, I try to limit to 30 lines per function and 300 lines per file. Splitting the ip.py file in multiple modules within the same directory/package would be clearer.

vm_supervisor/ip.py Outdated Show resolved Hide resolved
vm_supervisor/ip.py Outdated Show resolved Hide resolved
@hoh
Copy link
Member

hoh commented Nov 10, 2022

The AlephFirecrackerVM class is responsible of managing VM related resources that are specific to Aleph. I feel like it would make sense to move/call much of the firewall management code from there. This removes some indirections as discussed.

170e1a4

@tomribbens
Copy link
Contributor Author

As a general rule, I try to limit to 30 lines per function and 300 lines per file. Splitting the ip.py file in multiple modules within the same directory/package would be clearer.

After you split them out, I had some issues with them being too tightly coupled and having circular dependencies, so I joined them together again to resolve that. However, those dependencies seem to no longer be a problem, so I split them out again.

@tomribbens
Copy link
Contributor Author

The AlephFirecrackerVM class is responsible of managing VM related resources that are specific to Aleph. I feel like it would make sense to move/call much of the firewall management code from there. This removes some indirections as discussed.

170e1a4

So, the main reason it is the way it currently is, is because the code is not assuming nftables is set up in the default way, but actually will inspect the existing ruleset to determine some settings. This information is then stored in the instance of the Firewall class. Currently there is some other information stored in there as well, but I've been thinking about it, and I think that can be removed as those things are simply based of the vm_id, and so can be recalculated.

We can re-inspect the existing ruleset every time we do a change to the firewall rules, and if we do, the whole Firewall class may be removed, and make all its methods just normal functions. I'm a bit torn on it, as I don't like the idea of recalculating this info every time since it comes from an external system, but on the other hand, it would eliminate passing the reference down as the code is currently doing.

@tomribbens
Copy link
Contributor Author

I removed the Firewall class. I did it in a seperate branch as I wanted to keep it seperate until certain. See the diff here: tomribbens/aleph-vm@tomribbens-migrate-to-nftables...tomribbens:aleph-vm:tomribbens-migrate-to-nftables-no-firewall-object

This doesn't put anything in AlephFirecrackerVM though. Not sure if that's actually something that's actually needed. This seems fine as is to me.

@hoh
Copy link
Member

hoh commented Nov 15, 2022

Can you confirm that the object 'Nftables' returned by get_customized_nftables() and can be instantiated multiple times without any issue ? If that is the case, removing the class indeed makes the code cleaner.
Caching that function by decorating it with lru_cache()[https://docs.python.org/3/library/functools.html#functools.lru_cache] could remove the performance issue when calling the same function multiple times.

Regarding the # TODO: gracefully exit, the common approach in Python is to raise an exception and catch it where it should be handled. Can you explain in which situations this could happen ?

@tomribbens
Copy link
Contributor Author

Can you confirm that the object 'Nftables' returned by get_customized_nftables() and can be instantiated multiple times without any issue ? If that is the case, removing the class indeed makes the code cleaner.

I don't see why not. It's just loading and unloading the interface lib file each time.

Caching that function by decorating it with lru_cache()[https://docs.python.org/3/library/functools.html#functools.lru_cache] could remove the performance issue when calling the same function multiple times.

That's a possibility. get_customized_nftables() takes about 1ms to execute according to my tests. In the objectless way, it gets run basically every time a firewall rule is added or removed, which would be at startup and teardown of a vm. From testing, I get that it is about 9KiB of data in memory. Both are small numbers, so I'm not sure what side of the tradeoff is best.

Regarding the # TODO: gracefully exit, the common approach in Python is to raise an exception and catch it where it should be handled. Can you explain in which situations this could happen ?

2 different chains could register with the kernel in the same hook. If they do, both chains would be called. But if they do, and don't agree on what to do with a certain packet, weird stuff would happen. From what I understand, it's a situation that isn't really that useful, just technically possible in the way the netfilter architecture is setup. In practice, I don't think we would encounter the situation ever.

@hoh hoh merged commit df99175 into aleph-im:main Nov 21, 2022
@tomribbens tomribbens deleted the tomribbens-migrate-to-nftables branch November 24, 2022 13:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants