Add option to enable exact (but inefficient) seeking into variable bitrate MP3s #6787

Closed
natidykstein opened this issue Dec 19, 2019 · 14 comments

@natidykstein

natidykstein commented Dec 19, 2019

Issue description

When seeking to 1603000 ms in the MP3 file, playback starts approximately 3 seconds earlier than requested.
Seeking in the same file with Audacity (or VLC) lands at the expected position (you can hear the expected audio).
I'm not sure whether it's relevant, but the MP3 was created by extracting the AAC stream of an MP4 video file using ffmpeg. (ExoPlayer seeks correctly in the original MP4 but not in the MP3.)

Reproduction steps

  1. Using the ExoPlayer demo app, seek to 1603000 ms. (I changed the code manually to do this.)
  2. The audio at that position should say "I'll a..." but instead it says "are you asking me..."
    This happens consistently across devices.

Link to test content

A link to the mp3 file was emailed to [email protected].

A full bug report captured from the device

A full bug report was emailed to [email protected].

Version of ExoPlayer being used

2.11.0

Device(s) and version(s) of Android being used

Reproduced on:
OnePlus 6 running Android 10
Google Pixel 3 running Android 10
Virtual device Google Pixel 2 running Android 9
Virtual device Nexus 5X running Android 7.1.1

@ojw28
Contributor

ojw28 commented Dec 19, 2019

I get "You are not authorized to download this file." when I try and download the file. Please could you make it available?

@natidykstein
Author

I've sent a new link

@ojw28
Contributor

ojw28 commented Dec 20, 2019

Thanks! Unless you're going to use a constant bitrate, MP3 is fundamentally not well suited to use cases that require exact seeking. There are two reasons for this:

  1. For exact seeking, a container format will ideally provide a precise time-to-byte mapping in a header. This mapping allows a player to map a requested seek time to the corresponding byte offset, and start requesting/parsing/playing media from that offset. The headers available for specifying this mapping in MP3 are, unfortunately, often imprecise. The sample you've provided uses a XING header, which specifies the mapping for 100 points that have a byte granularity equal to 1/256th of the length of the file in bytes. For your sample, this means a time-to-byte mapping is specified for points approximately 18 seconds apart, and each of these mappings may be off by ~20KB. So the mapping is both quite sparse and limited in accuracy.
  2. For container formats that don't provide a precise time-to-byte mapping (or any time-to-byte mapping at all), it's still possible to perform an exact seek if the container includes absolute sample timestamps in the stream. In this case a player can map the seek time to a best guess of the corresponding byte offset, start requesting media from that offset, parse the first absolute sample timestamp, and effectively perform a guided binary search into the media until it finds the right sample. Unfortunately MP3 does not include absolute sample timestamps in the stream, so this approach is not possible.
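
To make the numbers in point 1 concrete, here is a minimal Java sketch of how a player could estimate a byte offset from a XING table of contents. The class and method names are hypothetical. The 30 minute duration is an assumption back-derived from the ~18 second spacing quoted above, and the 10.8 MB size assumes the 48 kbps bitrate mentioned later in this thread; neither was measured from the sample. The interpolation follows the commonly documented XING scheme and is not necessarily ExoPlayer's exact code.

```java
/** Hypothetical helper illustrating XING table-of-contents seeking; not ExoPlayer code. */
class XingTocSketch {

  /** Spacing in milliseconds between adjacent TOC entries (the table has 100 entries). */
  static long tocSpacingMs(long durationMs) {
    return durationMs / 100;
  }

  /** Size in bytes of one TOC step: positions are stored in units of 1/256 of the stream. */
  static long tocStepBytes(long streamBytes) {
    return streamBytes / 256;
  }

  /**
   * Estimates the byte offset for a seek time by interpolating linearly between TOC entries.
   * toc[i] holds the position at i percent of the duration, scaled to the range 0..255.
   */
  static long estimateSeekPosition(int[] toc, long timeMs, long durationMs, long streamBytes) {
    double percent = Math.max(0.0, Math.min(100.0, 100.0 * timeMs / durationMs));
    int index = Math.min((int) percent, 99);
    double lower = toc[index];
    // Past the last entry, interpolate towards the end of the stream (256/256).
    double upper = (index == 99) ? 256.0 : toc[index + 1];
    double scaled = lower + (upper - lower) * (percent - index);
    return Math.round(scaled / 256.0 * streamBytes);
  }

  public static void main(String[] args) {
    long durationMs = 1_800_000L; // assumed: a 30 minute file
    long streamBytes = 10_800_000L; // assumed: 48 kbps * 1800 s / 8
    System.out.println("TOC spacing (ms): " + tocSpacingMs(durationMs));
    System.out.println("TOC step (bytes): " + tocStepBytes(streamBytes));
  }
}
```

With these assumed values the entries are 18 seconds apart and one TOC step is about 42 KB, so rounding a position to the nearest step can be off by roughly 20 KB, which matches the figures above.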

Ultimately, this means that the only way to perform an exact seek into this type of MP3 is to scan the entire file and manually build up a time-to-byte mapping in the player. This obviously doesn't scale well to large MP3 files, particularly if the user tries to seek to near the end of the stream shortly after starting playback, which would require the player to wait until it's downloaded and indexed the entire stream before performing the seek. For ExoPlayer we decided to optimize for seeking speed over accuracy in this case.
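
As a rough illustration of what "building up a time-to-byte mapping" involves, the following Java sketch records the start time and byte offset of every frame and then looks up a seek time with a binary search. It is a sketch under stated assumptions, not the planned ExoPlayer implementation: the class name is hypothetical, the frame sizes are assumed to have been parsed from the frame headers already, and every frame is assumed to hold 1152 samples (MPEG-1 Layer III).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of index-based seeking; frame sizes are assumed to be parsed already. */
class Mp3IndexSketch {
  // Assumption: MPEG-1 Layer III, 1152 samples per frame (MPEG-2/2.5 Layer III uses 576).
  static final int SAMPLES_PER_FRAME = 1152;

  private final List<Long> frameTimesUs = new ArrayList<>();
  private final List<Long> frameOffsets = new ArrayList<>();
  private final long firstFrameOffset;

  /** Walks the frames in order, recording the start time and byte offset of each one. */
  Mp3IndexSketch(int[] frameSizesBytes, int sampleRateHz, long firstFrameOffset) {
    this.firstFrameOffset = firstFrameOffset;
    long offset = firstFrameOffset;
    for (int i = 0; i < frameSizesBytes.length; i++) {
      frameTimesUs.add(i * (long) SAMPLES_PER_FRAME * 1_000_000L / sampleRateHz);
      frameOffsets.add(offset);
      offset += frameSizesBytes[i];
    }
  }

  /** Returns the byte offset of the last indexed frame that starts at or before timeUs. */
  long offsetForTimeUs(long timeUs) {
    if (frameOffsets.isEmpty()) {
      return firstFrameOffset;
    }
    int low = 0;
    int high = frameTimesUs.size() - 1;
    int best = 0;
    while (low <= high) {
      int mid = (low + high) >>> 1;
      if (frameTimesUs.get(mid) <= timeUs) {
        best = mid;
        low = mid + 1;
      } else {
        high = mid - 1;
      }
    }
    return frameOffsets.get(best);
  }
}
```

The lookup itself is cheap. The cost described above comes from the constructor's input: every frame up to the seek target has to be downloaded and parsed before its size is known.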

We do have plans to support exact seeking by building up an index; however, we'll most likely disable this option by default (if we do this, it'll be possible to enable it with a flag). I will keep this issue open to track this enhancement. If you control the media you're playing, I would suggest that you use a more suitable container format (e.g. MP4).

ojw28 changed the title from "MP3: SeekTo does not seek to exact location" to "Add option to enable exact (but inefficient) seeking into variable bitrate MP3s" on Dec 20, 2019
ojw28 assigned kim-vde and unassigned ojw28 on Dec 20, 2019
@natidykstein
Author

Thanks a lot for your detailed explanation.

Since we are the ones who extract the audio stream from the original mp4 video, we control the media.
If seeking in the streamed audio needs to be both fast and exact, is CBR mp3 better than mp4/m4a here? Are there any considerations regarding the codec?

As a side note - as someone who has been working with ExoPlayer for the last 5 years (and even had the opportunity to make a small contribution to the project :)) - I think you're doing an incredible job of making our lives much easier - keep up the good work!

@ojw28
Contributor

ojw28 commented Dec 20, 2019

MP4/M4A is always a better choice. IMO there aren't really any valid use cases for MP3 any more, unless you need to use/support it for legacy reasons.

p.s. Thanks! Happy to help :).

@natidykstein
Author

I've inspected the part of our code that extracts the audio using ffmpeg and noticed that we do use a CBR of 48kbps. To make sure, I analyzed the mp3 we're talking about and saw that this is indeed the case (see attached screenshot).
What's actually happening here? Is the XING header disrupting the seek calculations even though the mp3 itself is CBR?
[Screenshot: Annotation 2019-12-20 232159]

@ojw28
Contributor

ojw28 commented Dec 20, 2019

What's actually happening here? Is the XING header disrupting the seek calculations even though the mp3 itself is CBR?

That sounds quite likely.

@natidykstein
Author

  1. Should I strip the XING header? (Is an mp3 without a XING header still valid?)
  2. Can the player detect if the mp3 is indeed CBR and ignore the XING header when doing seek calculations? (This would be my preferred option)

@ojw28
Contributor

ojw28 commented Dec 21, 2019

Why isn't your preferred option to use a container format that's appropriate for your use case? Even the people who made MP3 don't think you should use it any more.

mp3 is still very popular amongst consumers. However, most state-of-the-art media services such as streaming or TV and radio broadcasting use modern ISO-MPEG codecs such as the AAC family or in the future MPEG-H. Those technologies, that have been developed with major contributions from Fraunhofer IIS, can deliver more features and a higher audio quality at much lower bitrates compared to mp3

My understanding of XING headers is that they're only for VBR content, so if your file is CBR I'm not sure why it's ended up with a XING header in the first place (if you do some research on XING headers, most references on the internet suggest that they're only used for VBR content). So yes, if you can generate the CBR MP3 without the XING header, I would expect that to work. We don't support your second suggestion.
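
For comparison, a CBR stream without a seek header needs no table at all, because time maps to bytes linearly. The Java sketch below shows that arithmetic. The class and method names are hypothetical, and this is an illustration of the idea rather than a description of ExoPlayer's internals.

```java
/** Hypothetical sketch of constant-bitrate seeking; not ExoPlayer code. */
class CbrSeekSketch {

  /**
   * Maps a seek time to a byte offset, assuming every second of audio occupies exactly
   * bitrateBitsPerSecond / 8 bytes after the first frame.
   */
  static long cbrSeekPosition(long timeMs, int bitrateBitsPerSecond, long firstFrameOffset) {
    return firstFrameOffset + timeMs * bitrateBitsPerSecond / 8 / 1000;
  }

  public static void main(String[] args) {
    // The seek from this issue: 1603000 ms into a 48 kbps stream, first frame at offset 0.
    System.out.println(cbrSeekPosition(1_603_000L, 48_000, 0L));
  }
}
```

For the 48 kbps file discussed here, a seek to 1603000 ms lands 9,618,000 bytes past the first frame. A player would then resynchronize on the next frame header, so the result is accurate to within about one frame.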

@natidykstein
Author

You're right - using a different container is probably the right approach.
My second suggestion is just an optimization/work-around that would fit my case exactly - but I completely understand if it doesn't seem justified as a general approach.

I guess we'll probably need to go over all of our already generated mp3s (there are a lot) and perform some kind of adjustment - removing the XING header or transcoding to a different container - and change the way we generate new mp3 files.

Thanks for the tip in the right direction.

@natidykstein
Author

Just a note on the competitive front - iPhone's AVPlayer seeking is precise on the same mp3 - so it probably ignores the XING header in this case.

icbaker pushed a commit that referenced this issue Jan 28, 2020
This seeker is only seeking to frames that have already been read by the
extractor for playback. Seeking to not-yet-read frames will be
implemented in another change.

Issue: #6787
PiperOrigin-RevId: 291888899
ojw28 pushed a commit that referenced this issue Jan 30, 2020
@ojw28
Contributor

ojw28 commented Feb 13, 2020

This is supported in the dev-v2 branch via a new Mp3Extractor.FLAG_ENABLE_INDEX_SEEKING flag. This will be included in 2.13.0.

ojw28 closed this as completed on Feb 13, 2020
@nishanBende

@ojw28 how do I enable or use this flag?
Is there some initializer where I can pass it?
Thanks

@kim-vde
Contributor

kim-vde commented Mar 4, 2020

It is enabled by using FLAG_ENABLE_INDEX_SEEKING, which can be set on a DefaultExtractorsFactory using setMp3ExtractorFlags.

The documentation has not been updated yet as this functionality has not been released yet.
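
A minimal configuration sketch of the above, assuming an ExoPlayer 2.x release that contains the flag. The class name `IndexSeekingExample`, the method name `buildMp3Source` and both parameters are hypothetical, and exact method names (for example how the media source is created from a URI) may differ between releases.

```java
import android.net.Uri;
import com.google.android.exoplayer2.MediaItem;
import com.google.android.exoplayer2.extractor.DefaultExtractorsFactory;
import com.google.android.exoplayer2.extractor.mp3.Mp3Extractor;
import com.google.android.exoplayer2.source.MediaSource;
import com.google.android.exoplayer2.source.ProgressiveMediaSource;
import com.google.android.exoplayer2.upstream.DataSource;

/** Hypothetical helper showing where the flag is passed; requires the ExoPlayer library. */
final class IndexSeekingExample {

  static MediaSource buildMp3Source(DataSource.Factory dataSourceFactory, Uri uri) {
    // Ask the MP3 extractor to build a time-to-byte index while it reads frames.
    DefaultExtractorsFactory extractorsFactory =
        new DefaultExtractorsFactory()
            .setMp3ExtractorFlags(Mp3Extractor.FLAG_ENABLE_INDEX_SEEKING);

    // Hand the configured extractors factory to the media source used for playback.
    return new ProgressiveMediaSource.Factory(dataSourceFactory, extractorsFactory)
        .createMediaSource(MediaItem.fromUri(uri));
  }
}
```

The returned media source is then prepared on the player as usual. As discussed earlier in this thread, exact seeking this way trades speed for accuracy.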

google locked and limited conversation to collaborators on Apr 14, 2020