tcpreplay appears to hang using loop and netmap options #424

Closed
fklassen opened this issue Nov 2, 2017 · 15 comments
Comments

@fklassen
Member

fklassen commented Nov 2, 2017

From tcpreplay-users:

Version information:

SYS-E300-8D:~$ /usr/local/bin/tcpreplay -V
Warning: May need to run as root to get access to all network interfaces.
tcpreplay version: 4.2.5 (build git:v4.2.5)
Copyright 2013-2017 by Fred Klassen <tcpreplay at appneta dot com> - AppNeta
Copyright 2000-2012 by Aaron Turner <aturner at synfin dot net>
The entire Tcpreplay Suite is licensed under the GPLv3
Cache file supported: 04
Not compiled with libdnet.
Compiled against libpcap: 1.5.3
64 bit packet counters: enabled
Verbose printing via tcpdump: enabled
Packet editing: disabled
Fragroute engine: disabled
Default injection method: PF_PACKET send()
Optional injection method: netmap

Command line used:

sudo /usr/local/bin/tcpreplay --intf1=eth7 -tK --loop 2 --unique-ip --stats=0 --netmap ./tcpreplay-4.2.5/test/test.pcap

Platform:

Distribution: Ubuntu 14.04 (trusty)

Kernel version: Linux version 4.4.0-31-generic (buildd@lgw01-43) (gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3) ) #50~14.04.1-Ubuntu SMP Wed Jul 13 01:07:32 UTC 2016

Make & model of the network card:

Intel D-1500 SoC

Description of problem:

Using netmap mode and the included test.pcap file, tcpreplay appears to hang when --loop argument is greater than 1:

    SYS-E300-8D:~$ sudo /usr/local/bin/tcpreplay --intf1=eth7 -tK --loop 2 --unique-ip --stats=0 --netmap ./tcpreplay-4.2.5/test/test.pcap
    Switching network driver for eth7 to netmap bypass mode... done!
    File Cache is enabled
    Test start: 2017-11-02 11:05:40.699585 ...
    Loop 1 of 2...

Running "top" shows tcpreplay using 100% of the cpu.

Attempting to use Ctrl C to stop it fails:

^C User interrupt...
sendpacket_abort
^C User interrupt...
sendpacket_abort


Running nearly the same command above with a --loop value of 1 works fine:

    SYS-E300-8D:~$ sudo /usr/local/bin/tcpreplay --intf1=eth7 -tK --loop 1 --netmap ./tcpreplay-4.2.5/test/test.pcap 
    Switching network driver for eth7 to netmap bypass mode... done!
    File Cache is enabled
    Actual: 141 packets (62704 bytes) sent in 0.000154 seconds
    Rated: 407168831.1 Bps, 3257.35 Mbps, 915584.41 pps
    Flows: 37 flows, 240259.74 fps, 140 flow packets, 1 non-flow
    Statistics for network device: eth7
        Successful packets:        141
        Failed packets:            0
        Truncated packets:         0
        Retried packets (ENOBUFS): 0
        Retried packets (EAGAIN):  0
    Switching network driver for eth7 to normal mode... done!

Thanks.



@fklassen fklassen added the bug label Nov 2, 2017
@ken-adey

I had to reinstall the OS due to a hardware change (new SSD), which means I had to rebuild/reinstall tcpreplay and netmap. Now the loop functionality is working with the --netmap option. Go figure!

@fklassen fklassen removed the bug label Nov 14, 2017
@fklassen
Member Author

Interesting. Many Netmap issues are not due to my code. I suspect something in Netmap. I removed the "bug" label for now, but will keep open for a while until I can confirm it works for me.

@fklassen
Member Author

I was unable to reproduce with my current version. It may have been fixed somewhere along the way. I'll release 4.3 beta1 soon so you can try it out.

root@m50-aaeon-v0:/home/admin# ./tcpreplay -Kt --netmap -i eth1 iht2.pcap iht.pcap 
Switching network driver for eth1 to netmap bypass mode... done!
File Cache is enabled
Actual: 3554 packets (2294767 bytes) sent in 0.017811 seconds
Rated: 128839874.2 Bps, 1030.71 Mbps, 199539.61 pps
Statistics for network device: eth1
	Successful packets:        3554
	Failed packets:            0
	Truncated packets:         0
	Retried packets (ENOBUFS): 0
	Retried packets (EAGAIN):  3997
Switching network driver for eth1 to normal mode... done!
root@m50-aaeon-v0:/home/admin# ./tcpreplay -Kt --netmap -i eth1 -l2 iht2.pcap iht.pcap 
Switching network driver for eth1 to netmap bypass mode... done!
File Cache is enabled
Actual: 7108 packets (4589534 bytes) sent in 0.036975 seconds
Rated: 124125327.9 Bps, 993.00 Mbps, 192237.99 pps
Statistics for network device: eth1
	Successful packets:        7108
	Failed packets:            0
	Truncated packets:         0
	Retried packets (ENOBUFS): 0
	Retried packets (EAGAIN):  7751
Switching network driver for eth1 to normal mode... done!

@EaseTheWorld

I can reproduce this issue. Redhat 7.0, tcpreplay 4.2.6, latest netmap, ixgbe-5.3.3.
Without loop it works fine but with '--loop=2' it hangs.
If you need more info, please tell me.

@EaseTheWorld

It seems that netmap_tx_queues_empty() keeps returning false after a file has been sent.
I'll investigate further.

@fklassen
Member Author

Reopening for further investigation.

@fklassen fklassen reopened this May 10, 2018
@EaseTheWorld

NETMAP_TX_RING_EMPTY() keeps returning false.
I tried nm_tx_pending() instead, but it keeps returning true.
Both happen because txring->tail does not change even after the NIOCTXSYNC ioctl.
This looks like a netmap issue.

One question, though: why is netmap_tx_queues_empty() checked only when loop == 1?
I think the queue-empty check should run for both looped and one-shot playback.

@fklassen
Member Author

What version of netmap are you using? We may want to open an issue with netmap to see whether there have been any changes to this macro.

@EaseTheWorld

NETMAP_TX_RING_EMPTY returning false means the same thing as nm_tx_pending() returning true,
so changing NETMAP_TX_RING_EMPTY to !nm_tx_pending() doesn't solve the issue.
As I said, the issue is that txring->tail does not change even after the NIOCTXSYNC ioctl.
I'm not familiar with netmap, so I don't know where to look next...

netmap api 12 (commit 53caf0a) with redhat 7.0
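
The equivalence claimed in this comment can be illustrated with a small stand-alone sketch. It does not include the netmap headers: `struct mock_ring` and the `mock_*` helpers are hypothetical stand-ins that mirror only the ring fields and the two checks named in this thread, and the sketch assumes `cur` is kept equal to `head`. Under those assumptions both checks agree, and a `tail` that never advances leaves both of them reporting a busy ring forever.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a netmap TX ring header. */
struct mock_ring {
    uint32_t num_slots;
    uint32_t head;  /* next slot the application will fill */
    uint32_t cur;   /* kept equal to head in this sketch (assumption) */
    uint32_t tail;  /* first slot not available to the application */
};

/* Index of the slot after i, wrapping at the end of the ring. */
static inline uint32_t mock_ring_next(const struct mock_ring *r, uint32_t i)
{
    return (i + 1 == r->num_slots) ? 0 : i + 1;
}

/* Number of free slots between cur and tail. */
static inline uint32_t mock_ring_space(const struct mock_ring *r)
{
    int ret = (int)r->tail - (int)r->cur;
    if (ret < 0)
        ret += (int)r->num_slots;
    return (uint32_t)ret;
}

/* The ring-empty expression quoted in this thread. */
static inline int mock_tx_ring_empty(const struct mock_ring *r)
{
    return mock_ring_space(r) >= r->num_slots - 1;
}

/* Pending check: slots remain between tail and head. */
static inline int mock_tx_pending(const struct mock_ring *r)
{
    return mock_ring_next(r, r->tail) != r->head;
}

/* Returns 1 when "ring empty" and "nothing pending" give the same answer. */
static inline int mock_checks_agree(const struct mock_ring *r)
{
    return mock_tx_ring_empty(r) == !mock_tx_pending(r);
}
```

With an 8-slot ring, three queued packets (`head = cur = 3`) and a stale `tail` of 7, both checks say the ring is busy. They only flip once `tail` reaches 2, which is the update that was reported as never happening.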

@EaseTheWorld

I looked at netmap/apps/pkt-gen.c (which does not hang) and found that
slot->flags |= NS_REPORT is the key.
pkt-gen sets the NS_REPORT flag on every slot, but tcpreplay does not.
If I remove the condition if (avail <= 1) in sendpacket_send_netmap() so that NS_REPORT is always set,
the issue is resolved.
I'm not 100% sure this is the right fix, though.

@fklassen
Member Author

fklassen commented Jun 1, 2018

That appears to be new behaviour in pkt-gen.c. It's not the first time that netmap updates have broken backwards compatibility.

I've had little time to work on this project, but will attempt to free up a day or two next week. Hopefully I can come up with a patch that is backwards compatible with versions of netmap that do not have NS_REPORT defined.

@EaseTheWorld

EaseTheWorld commented Jun 4, 2018

I looked at the history of src/common/netmap.c, and the if (avail <= 1) slot->flags |= NS_REPORT code has been there since the first version. Maybe the meaning or usage of NS_REPORT changed?
As far as I understand, when netmap sends a slot with the NS_REPORT flag set, tail is updated; otherwise, tail is never (or only lazily) updated.
So, to check whether the TX queues are empty, one of the following is needed:

  • set the NS_REPORT flag on the slot containing the last packet, if the last packet is known, or
  • set NS_REPORT on every slot (I don't know how expensive this is) so that tail is updated in real time, or
  • set NS_REPORT on the last packet of every burst (netmap/apps/pkt-gen seems to use this style).
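
The options above can be compared in a toy model. This is only a sketch of the behaviour as described in this comment, not netmap's implementation: `struct toy_ring`, `MOCK_NS_REPORT`, `toy_txsync()` and the policy names are all hypothetical, and the model assumes that a sync publishes a new `tail` only if a transmitted slot carried the report flag.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 8u
#define MOCK_NS_REPORT 0x1u /* stand-in for NS_REPORT, not the real value */

enum report_policy {
    REPORT_WHEN_NEARLY_FULL, /* the `avail <= 1` condition quoted above */
    REPORT_EVERY_SLOT,
    REPORT_LAST_OF_BURST
};

struct toy_ring {
    uint32_t head;    /* next slot the sender fills */
    uint32_t tail;    /* last boundary published to the sender */
    uint32_t hw_done; /* next slot the toy kernel will transmit */
    uint16_t flags[RING_SLOTS];
};

static inline uint32_t toy_next(uint32_t i)
{
    return (i + 1 == RING_SLOTS) ? 0 : i + 1;
}

static inline void toy_ring_init(struct toy_ring *r)
{
    memset(r, 0, sizeof(*r));
    r->tail = RING_SLOTS - 1; /* empty ring: every usable slot is free */
}

static inline uint32_t toy_space(const struct toy_ring *r)
{
    return (r->tail + RING_SLOTS - r->head) % RING_SLOTS;
}

/* Queue `count` packets as one burst; assumes count <= toy_space(r). */
static inline void toy_queue_burst(struct toy_ring *r, uint32_t count,
                                   enum report_policy policy)
{
    for (uint32_t i = 0; i < count; i++) {
        uint32_t avail = toy_space(r);
        int last = (i + 1 == count);
        int report = (policy == REPORT_EVERY_SLOT) ||
                     (policy == REPORT_LAST_OF_BURST && last) ||
                     (policy == REPORT_WHEN_NEARLY_FULL && avail <= 1);
        r->flags[r->head] = report ? MOCK_NS_REPORT : 0;
        r->head = toy_next(r->head);
    }
}

/* Toy sync: transmit everything queued, but publish the new tail only
 * if at least one transmitted slot carried the report flag. */
static inline void toy_txsync(struct toy_ring *r)
{
    int reported = 0;
    while (r->hw_done != r->head) {
        if (r->flags[r->hw_done] & MOCK_NS_REPORT)
            reported = 1;
        r->hw_done = toy_next(r->hw_done);
    }
    if (reported)
        r->tail = (r->hw_done + RING_SLOTS - 1) % RING_SLOTS;
}

/* Returns 1 if the sender sees an empty ring after one burst and one sync. */
static inline int toy_drains_after_burst(enum report_policy policy,
                                         uint32_t count)
{
    struct toy_ring r;
    toy_ring_init(&r);
    toy_queue_burst(&r, count, policy);
    toy_txsync(&r);
    return toy_next(r.tail) == r.head;
}
```

In this model a 3-packet burst never appears drained under `REPORT_WHEN_NEARLY_FULL`, because no slot was flagged, while the other two policies see the ring empty after one sync. A burst that fills the ring (7 packets here) does get flagged under the `avail <= 1` condition, so it drains in the model as well.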

fklassen added a commit that referenced this issue Oct 30, 2018
As of netmap commit 7b16969f (version 10), should use nm_tx_pending()
to determine whether the TX buffers still contain data.

Fix is to use nm_tx_pending() rather than NETMAP_TX_RING_EMPTY,
unless using an older version of netmap.
fklassen added a commit that referenced this issue Oct 31, 2018
Bug #424 Fix compatibility for netmap version 10 and higher
@fklassen
Member Author

Support for newer versions of netmap.

It appears that for version 10 and higher, we should use nm_tx_pending() instead of (nm_ring_space(ring) >= (ring)->num_slots - 1).

Fixed in PR #522
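
A minimal sketch of how such a version-dependent check might be selected at compile time is shown below. It is not the code from the PR: `SKETCH_NETMAP_API`, `struct sketch_ring` and the `sketch_*` helpers are hypothetical stand-ins so that the example compiles without the netmap headers, and the threshold of 10 is taken from the comment above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the netmap API version the build targets. */
#ifndef SKETCH_NETMAP_API
#define SKETCH_NETMAP_API 12
#endif

/* Hypothetical, simplified stand-in for a netmap TX ring header. */
struct sketch_ring {
    uint32_t num_slots;
    uint32_t head;
    uint32_t cur;
    uint32_t tail;
};

static inline uint32_t sketch_ring_next(const struct sketch_ring *r, uint32_t i)
{
    return (i + 1 == r->num_slots) ? 0 : i + 1;
}

static inline uint32_t sketch_ring_space(const struct sketch_ring *r)
{
    int ret = (int)r->tail - (int)r->cur;
    if (ret < 0)
        ret += (int)r->num_slots;
    return (uint32_t)ret;
}

static inline int sketch_tx_pending(const struct sketch_ring *r)
{
    return sketch_ring_next(r, r->tail) != r->head;
}

/* Returns nonzero when nothing is left to transmit on this ring. */
static inline int sketch_tx_queue_drained(const struct sketch_ring *r)
{
#if SKETCH_NETMAP_API >= 10
    /* Newer API: ask whether any slots are still pending. */
    return !sketch_tx_pending(r);
#else
    /* Older API: fall back to the free-space comparison. */
    return sketch_ring_space(r) >= r->num_slots - 1;
#endif
}
```

Building with `-DSKETCH_NETMAP_API=9` selects the older branch. Keeping the choice inside one helper means callers such as a queue-empty check do not need their own version tests.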

@fklassen
Member Author

fklassen commented Oct 31, 2018

Found that the fix works with the --loop option but fails with the --top-speed option. Reopening.

@fklassen fklassen reopened this Oct 31, 2018
fklassen added a commit that referenced this issue Oct 31, 2018
fklassen added a commit that referenced this issue Nov 1, 2018
@fklassen
Member Author

fklassen commented Nov 1, 2018

Fixed in PR #424
