[options] Added workaround option to execute "n_function" #31187
If we're doing this, let's align it with the PhantomJS support in yt-dlp so that either package can be used. Other extractors might have a use for a JS interpreter (currently PhantomJS has a class in the OpenLoad extractor module). There should be a function that runs a JS function and returns the result, plus some way of determining which "engine" to use, based on options and defaults. What do you think?
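The "some way of determining which engine to use" part could look roughly like the sketch below. It is only an illustration of the idea: `pick_engine`, `executable_available`, and the candidate names are hypothetical and do not come from youtube-dl or yt-dlp.

```python
import shutil


def executable_available(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


def pick_engine(preferred=None, candidates=("phantomjs",),
                is_available=executable_available):
    """Pick the first usable external JS engine.

    `preferred` stands in for a user option; `candidates` are the defaults.
    Returns None when nothing is usable, so the caller can fall back to
    the built-in interpreter.
    """
    order = [preferred] if preferred else []
    order += [name for name in candidates if name != preferred]
    for name in order:
        if is_available(name):
            return name
    return None


if __name__ == "__main__":
    print(pick_engine())
```

Passing `is_available` as a parameter keeps the selection logic separate from how availability is detected, so an engine that is a Python package rather than an executable could supply its own check.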
You mean to create something like WebDriverJSwrapper?
P.S. PhantomJS is no longer maintained, and the executable that can be downloaded from the official website is statically linked against a 2016 build of WebKit. In other words, the WebKit vulnerabilities discovered since 2016 have not been fixed. Executing externally obtained JS is inherently dangerous, and even more so with a JS engine like this one. More fundamentally, PhantomJS can read and write files and execute processes from JS, so the game is over as soon as you load externally obtained JS. I don't think YouTube would do such a thing, but I am not confident enough to say that no video site would.
When using selenium's webdriver, the JS engine is usually the browser the user normally uses, so there should be no increased risk. Even the average user is likely to keep their browser updated, so the JS engine is generally current and has the same level of security as the browser. For example, if a new JS syntax is defined, you do not need to account for it as long as you use the browser through webdriver.
Anyway, if you are going to use an external JS engine, you should use one that is maintained. :p
The lack of a plausible alternative is why we have the built-in mini-JS interpreter. In some applications there is no "browser the user normally uses". Also mentioned here is this more plausible option: https:/amol-/dukpy
Yes, the best thing to do is to keep improving jsinterp until it can do whatever the browser can do, so I totally agree with you on that point.
Implemented at 3038610.
f = ('return ((e) => {{'
'const d = decodeURIComponent(e);'
'const p = d.lastIndexOf("}}");'
'const th = d.substring(0, p);'
'const bh = d.substring(p);'
'const m = "var {0};" + th + ";{0} = {1};" + bh;'
'const s = document.createElement("script");'
's.innerHTML = m;'
'document.body.append(s);'
'return {0}("{2}");'
'}})("{3}");').format(dummyfunc, funcname, n_param, compat_urllib_quote(jscode))
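To make the string surgery in that JS wrapper easier to follow, here is the same split-and-inject step written in plain Python. Note that `"}}"` in the format string becomes a single `}` after `.format()`, so the JS looks for the last closing brace of the player script. This is a sketch for illustration only: `inject_export`, the `__n_func` default, and the sample player source are made up.

```python
def inject_export(player_js, funcname, dummyfunc="__n_func"):
    """Mirror the wrapper's `th`/`bh` split in Python.

    Cuts the player source at its last closing brace and inserts an
    assignment just before it, so the inner function `funcname` is copied
    to the global variable `dummyfunc` when the script runs.
    """
    p = player_js.rfind("}")
    if p == -1:
        raise ValueError("no closing brace found in player JS")
    head, tail = player_js[:p], player_js[p:]
    return "var {0};{1};{0} = {2};{3}".format(dummyfunc, head, funcname, tail)


# Made-up stand-in for the real player script.
sample = "(function(g){var abc=function(a){return a};})(_yt_player);"
print(inject_export(sample, "abc"))
```

Because the assignment lands inside the outer anonymous function, it can see `abc` even though that name is never exported by the player script itself.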
Doesn't this run the entire js?
Yes, when the JS is loaded, an anonymous function is called with _yt_player as an argument.
This is wasteful, but it is what the browser normally does when browsing YouTube.
Also, the JS that makes the desired function accessible from the outside is inserted at the time of this call, so there is no need for complicated parsing. :)
If we want to use an external binary, wouldn't |
@pukkandan In addition, since nodejs was originally designed for writing regular programs in JS, it does not seem like a good idea to pass externally derived JS to it for execution (child_process, fs, etc.).
youtube-dl is often used in servers/containers/embedded devices where a browser is not normally available. And installing chrome/firefox on a headless system is not trivial PS: Also, |
Yes, so this is only a workaround.
Thanks, is this it?
Co-authored-by: Jouni Järvinen <[email protected]>
Now that Deno (addressing security issues associated with node.js) exists, I'm more favourably inclined to this functionality. If we can ensure that it works like yt-dlp (which does support Deno), it should be merged.
Description of your pull request and other information
Regarding the recent YouTube "n_function" update: if another new technique comes up that the current jsinterp can't handle, this patch gives users a workaround until jsinterp is fixed.
Of course, you need to have selenium installed ($ pip install selenium), along with the webdriver for the browser you want to use. :p
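For readers unfamiliar with selenium, running a JS snippet through webdriver looks roughly like this. It is a minimal sketch, not the code in this patch: `run_js_with_webdriver` and its `make_driver` parameter are hypothetical, and Firefox is only an assumed default that needs the matching driver (geckodriver) to be available.

```python
def run_js_with_webdriver(script, make_driver=None):
    """Run `script` in a real browser and return its result.

    `make_driver` is any zero-argument callable returning a WebDriver-like
    object; it defaults to selenium's Firefox driver (an assumption).
    """
    if make_driver is None:
        # Imported lazily so selenium is only required when actually used.
        from selenium import webdriver  # third-party: pip install selenium
        make_driver = webdriver.Firefox
    driver = make_driver()
    try:
        # execute_script wraps the code in a function body, so `return`
        # passes a value back to Python.
        return driver.execute_script(script)
    finally:
        # Always close the browser, even if the script raised.
        driver.quit()
```

With selenium and a browser driver installed, `run_js_with_webdriver("return 1 + 1;")` would start a browser, evaluate the snippet, and close the browser again.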