"SubripDecoder" class throws an Exception when an .srt file omits HOURS from its timestamps (ex: 00:00,000) #7122
Comments
workaround for a problem with the ExoPlayer .srt file parser: google/ExoPlayer#7122
Thanks for the report! Do you have a reference to the SRT spec that states the hours are optional, but the other time parameters are all required? As far as I can tell it's not really formally specified anywhere, but it does seem like a bug that we have an optional group in the regex that is then assumed to be non-null. Since it's kind of a loose format, I'll make the hours properly optional.
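The failure mode described here can be sketched in isolation. This is a minimal illustration, not ExoPlayer's actual code: the class name `OptionalGroupDemo` and its helpers are hypothetical, and the pattern is a simplified stand-in with an optional hours group. It shows that `Matcher.group` returns null for an optional group that did not participate in the match, and that handing that null to `Long.parseLong` throws a `NumberFormatException`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of ExoPlayer.
class OptionalGroupDemo {
  // Simplified timecode: optional hours, then minutes, seconds, milliseconds.
  private static final Pattern TIMECODE =
      Pattern.compile("(?:(\\d+):)?(\\d+):(\\d+),(\\d+)");

  // Returns the raw hours group, which is null when the hours are omitted.
  static String hoursGroup(String timecode) {
    Matcher matcher = TIMECODE.matcher(timecode);
    if (!matcher.matches()) {
      throw new IllegalArgumentException("not a timecode: " + timecode);
    }
    return matcher.group(1);
  }

  // Mimics a parser that assumes the hours group is always present.
  static boolean unguardedParseThrows(String timecode) {
    try {
      Long.parseLong(hoursGroup(timecode));
      return false;
    } catch (NumberFormatException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(hoursGroup("01:02:03,004"));
    System.out.println(hoursGroup("02:03,004"));
    System.out.println(unguardedParseThrows("02:03,004"));
  }
}
```

A null check on the group before parsing, as in the proposal further down the thread, avoids the exception.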
I agree; I couldn't find any formal spec anywhere either. Since the format is loosely specified, .srt files in the wild seem to feel free to omit unwanted fields. For example, here is a similar discussion about whether or not the MILLISECONDS field could be made optional, and that project concluded: "why not?". I would propose that while you're touching this decoder:
// https://repl.it/@WarrenBank/ExoPlayer-SubripDecoder
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Main {
  public static void main(String[] args) {
    run_test("00:00:00,000 --> 00:00:01,000");
    run_test("   00:00,000 -->    00:01,000");
    run_test("   00:00     -->    00:01    ");
  }

  private static final String SUBRIP_TIMECODE = "(?:(\\d+):)?(\\d+):(\\d+)(?:,(\\d+))?";
  private static final Pattern SUBRIP_TIMING_LINE = Pattern.compile("\\s*(" + SUBRIP_TIMECODE + ")\\s*-->\\s*(" + SUBRIP_TIMECODE + ")\\s*");

  private static void run_test(String currentLine) {
    Matcher matcher = SUBRIP_TIMING_LINE.matcher(currentLine);
    if (matcher.matches()) {
      long start = parseTimecode(matcher, /* groupOffset= */ 1);
      long end = parseTimecode(matcher, /* groupOffset= */ 6);
      System.out.println("start: " + start);
      System.out.println("end: " + end);
    }
    else {
      System.out.println("no match");
    }
  }

  private static long parseTimecode(Matcher matcher, int groupOffset) {
    long timestampMs = 0;
    String groupVal;

    // HOURS field is optional
    groupVal = matcher.group(groupOffset + 1);
    if (groupVal != null)
      timestampMs += Long.parseLong(groupVal) * 60 * 60 * 1000;

    // MINUTES field is required
    groupVal = matcher.group(groupOffset + 2);
    if (groupVal != null)
      timestampMs += Long.parseLong(groupVal) * 60 * 1000;

    // SECONDS field is required
    groupVal = matcher.group(groupOffset + 3);
    if (groupVal != null)
      timestampMs += Long.parseLong(groupVal) * 1000;

    // MILLISECONDS field is optional
    groupVal = matcher.group(groupOffset + 4);
    if (groupVal != null)
      timestampMs += Long.parseLong(groupVal);

    // convert timecode from MILLISECONDS to MICROSECONDS
    return timestampMs * 1000;
  }
}
output (stdout):
Add a test for this case, and extend the existing tests to ensure the hour is parsed when it's present. issue:#7122 PiperOrigin-RevId: 302472213
Making milliseconds optional makes sense - I'll send a follow-up change.
issue:#7122 PiperOrigin-RevId: 303154493
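The kind of test mentioned in these commits can be sketched as plain Java checks. This is only an illustration under stated assumptions: `TimecodeChecks`, `parseTimecodeUs`, and `check` are hypothetical names, the pattern mirrors the one proposed earlier in the thread, and none of this is ExoPlayer's actual test code. It covers a timecode with hours present, one with hours omitted, and one with both hours and milliseconds omitted.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical check class; not ExoPlayer's actual test suite.
class TimecodeChecks {
  // Hours and milliseconds are optional; minutes and seconds are required.
  private static final Pattern TIMECODE =
      Pattern.compile("\\s*(?:(\\d+):)?(\\d+):(\\d+)(?:,(\\d+))?\\s*");

  // Parses a timecode to microseconds, treating absent optional fields as zero.
  static long parseTimecodeUs(String timecode) {
    Matcher m = TIMECODE.matcher(timecode);
    if (!m.matches()) {
      throw new IllegalArgumentException("not a timecode: " + timecode);
    }
    long ms = 0;
    if (m.group(1) != null) {
      ms += Long.parseLong(m.group(1)) * 60 * 60 * 1000;
    }
    ms += Long.parseLong(m.group(2)) * 60 * 1000;
    ms += Long.parseLong(m.group(3)) * 1000;
    if (m.group(4) != null) {
      ms += Long.parseLong(m.group(4));
    }
    return ms * 1000;
  }

  // Throws if the parsed value differs from the expected microseconds.
  static void check(String timecode, long expectedUs) {
    long actualUs = parseTimecodeUs(timecode);
    if (actualUs != expectedUs) {
      throw new AssertionError(timecode + ": expected " + expectedUs + " but got " + actualUs);
    }
  }

  public static void main(String[] args) {
    check("01:02:03,004", 3_723_004_000L); // hours present
    check("02:03,004", 123_004_000L);      // hours omitted
    check("02:03", 123_000_000L);          // hours and milliseconds omitted
    System.out.println("all checks passed");
  }
}
```

Including the hours-present case alongside the hours-omitted one guards against a fix that makes the field optional but stops parsing it when it is there.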
SUBRIP_TIMECODE makes this field optional
parseTimecode blindly assumes this field contains a non-null String value that can be parsed to a Long
minimal code to reproduce the problem:
output (stderr):
fixed:
output (stdout):