`IpAddr::from_str` is unnecessarily slow for IPv6 addresses #94825

jyn514 · 2022-03-10T22:18:47Z

from_str is implemented like so:

Lines 226 to 229 in 5f4e067

    /// Read an IP Address, either IPv4 or IPv6.
    fn read_ip_addr(&mut self) -> Option<IpAddr> {
        self.read_ipv4_addr().map(IpAddr::V4).or_else(move || self.read_ipv6_addr().map(IpAddr::V6))
    }

This unnecessarily penalizes IPv6 addresses, because the parser will always do a full linear scan of the input as an IPv4 address before it even starts to parse it as IPv6. Instead, it could use the knowledge that IPv4 addresses are at most 15 bytes long to skip directly to read_ipv6_addr for longer strings.
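A minimal sketch of the length check described above, written against the public `std::net` API rather than the private parser. The function name `parse_ip_with_length_hint` and the constant `MAX_IPV4_LEN` are hypothetical and not part of the standard library; an actual change would live inside `read_ip_addr` itself.

```rust
use std::net::{IpAddr, Ipv6Addr};

/// Length of the longest textual IPv4 address, "255.255.255.255".
/// Hypothetical constant for this sketch.
const MAX_IPV4_LEN: usize = 15;

/// Hypothetical wrapper illustrating the proposal; not the std implementation.
fn parse_ip_with_length_hint(s: &str) -> Option<IpAddr> {
    if s.len() > MAX_IPV4_LEN {
        // Too long to be an IPv4 address, so skip the IPv4 attempt entirely.
        s.parse::<Ipv6Addr>().ok().map(IpAddr::V6)
    } else {
        // Short inputs may belong to either family; defer to the standard parser.
        s.parse::<IpAddr>().ok()
    }
}
```

With this sketch, a fully expanded address such as `2001:0db8:0000:0000:0000:0000:0000:0001` goes straight to the IPv6 parser, while `192.168.0.1` and `::1` still take the ordinary path.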

@rustbot label: +I-slow +T-libs +C-enhancement


jyn514 · 2022-03-10T22:31:54Z

> IPv4 addresses are at most 15 bytes long

This is true because octal addresses were banned in #83652.

equt · 2022-03-10T23:20:31Z

@rustbot claim

equt · 2022-03-12T04:44:27Z

I agree that this could benefit read_ip_addr, and maybe read_socket_addr as well. But I don't believe the following is true:

> the parser will always do a full linear scan of the input

If I understand correctly, the current parser will realize the input is an IPv6 address after parsing at most four bytes. And if we're going to use the input size to guess the address family, a short input like 0000::1 will still suffer from the parser's backtracking.
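To make the objection concrete, here is a small sketch showing that a length-only check does not cover short IPv6 inputs. Both helper names are hypothetical, and the 15-byte threshold is the one quoted earlier in the thread.

```rust
use std::net::IpAddr;

/// Hypothetical predicate: would a length-only check skip the IPv4 attempt?
fn length_check_skips_ipv4(s: &str) -> bool {
    s.len() > 15
}

/// True for valid IPv6 inputs that are short enough to still go through
/// the IPv4 attempt first, i.e. the case the length check cannot help with.
fn is_short_ipv6(s: &str) -> bool {
    let is_v6 = matches!(s.parse::<IpAddr>(), Ok(IpAddr::V6(_)));
    is_v6 && !length_check_skips_ipv4(s)
}
```

For example, `0000::1` is only 7 bytes long, so `is_short_ipv6("0000::1")` holds and the parser would still try IPv4 first for it.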

mbartlett21 · 2022-03-13T22:54:56Z

> the current parser will realize the input is an IPv6 after parsing at most four bytes

That seems correct:

rust/library/std/src/net/parser.rs

Lines 145 to 148 in 21b0325

    let mut groups = [0; 4];
    for (i, slot) in groups.iter_mut().enumerate() {
        *slot = p.read_separator('.', i, |p| {

Noratrieb · 2024-02-20T21:24:49Z

There are probably still further performance improvements that could be made, but unless someone has a real-world use case for parsing that many IPv6 addresses, this seems unnecessary.

saethlin · 2024-02-20T22:48:28Z

> unless someone has a real-world use case for parsing that many IPv6 addresses

My employer maintains an ETL process that mostly chomps through netflow data, spending a nontrivial amount of cycles parsing IP addresses depending on configuration.

I don't particularly see any reason to close this; if we can have super-fast IP address parsing, we should.

rustbot added the C-enhancement, I-slow, and T-libs labels Mar 10, 2022

rustbot assigned equt Mar 10, 2022

equt removed their assignment Oct 30, 2022
