mirror of https://github.com/yt-dlp/yt-dlp.git synced 2026-01-15 03:11:18 +00:00

Compare commits

55 Commits

Author SHA1 Message Date
github-actions
adba24d207 [version] update
Created by: pukkandan

:ci skip all :ci run dl
2022-09-01 11:26:07 +00:00
pukkandan
5d7c7d6569 Release 2022.09.01 2022-09-01 16:49:04 +05:30
pukkandan
d2c8aadf79 [cleanup] Misc
Closes #4710, Closes #4754, Closes #4723
Authored by: pukkandan, MrRawes, DavidH-2022
2022-09-01 16:49:03 +05:30
pukkandan
1ac7f46184 Update to ytdl-commit-ed5c44e7
[compat] Replace deficient ChainMap class in Py3.3 and earlier
ed5c44e7b7
2022-09-01 16:46:32 +05:30
pukkandan
05deb747bb [jsinterp] Fix escape in regex 2022-09-01 16:46:32 +05:30
pukkandan
b505e8517a [extractor/youtube] Fallback regex for nsig code extraction 2022-09-01 16:46:32 +05:30
pukkandan
f2e9fa3ef7 [FormatSort] Fix aext for --prefer-free-formats
Closes #4735
2022-09-01 16:46:31 +05:30
satan1st
50a399326f [build] `make tar` should not follow `DESTDIR` (#4790)
Ref: https://www.gnu.org/prep/standards/html_node/DESTDIR.html
Authored by: satan1st
2022-09-01 16:46:17 +05:30
coletdjnz
1ff88b7aec [extractor/youtube] Add no-youtube-prefer-utc-upload-date compat option (#4771)
This option reverts 992f9a730b and 17322130a9 to prefer the non-UTC upload date in microformats.

Authored by: coletdjnz, pukkandan
2022-09-01 10:02:28 +00:00
bashonly
825d3ce386 [cookies] Improve container support (#4806)
Closes #4800
Authored by: bashonly, pukkandan, coletdjnz
2022-09-01 15:22:59 +05:30
bashonly
92aa6d6883 [extractor/triller] Add extractor (#4712)
Closes #4703
Authored by: bashonly
2022-09-01 15:20:54 +05:30
Elyse
b2a4db425b [VQQ] Add extractors (#4706)
Closes #1666
Authored by: elyse0
2022-09-01 12:42:34 +05:30
Yifu Yu
de49cdbe9d [extractor/bilibili] Extract flac with premium account (#4759)
Authored by: jackyyf
2022-08-31 23:22:16 +05:30
shirt
9f9c85dda4 [Build] Update pyinstaller 2022-08-31 13:12:26 -04:00
HobbyistDev
11734714c2 [extractor/eurosport] Add extractor (#4613)
Closes #2487
Authored by: HobbyistDev
2022-08-31 22:32:33 +05:30
pukkandan
b86ca447ce [extractor/mediaset] Fix embed extraction
Closes #4804
2022-08-31 22:24:41 +05:30
Tejas Arlimatti
f8c7ba9984 [extractor/epoch] Add extractor (#4772)
Closes #4714
Authored by: tejasa97
2022-08-31 22:16:26 +05:30
DepFA
76f2bb175d [extractor/stripchat] Don't modify input URL (#4781)
Authored by: dfaker
2022-08-31 21:10:59 +05:30
Elyse
f26af78a8a [jsinterp] Add charcodeAt and bitwise overflow (#4706)
Authored by: elyse0
2022-08-31 21:01:22 +05:30
Lesmiscore
bfbecd1174 [extractor/newspicks] Add extractor (#4725)
Authored by: Lesmiscore
2022-08-31 02:07:55 +09:00
bashonly
9bd13fe5bb [cookies] Support firefox container in --cookies-from-browser (#4753)
Authored by: bashonly
2022-08-30 22:24:46 +05:30
Jeff Huffman
459262ac97 [extractor/crunchyroll:beta] Use anonymous access (#4704)
Closes #4692
Authored by: tejing1
2022-08-30 22:04:13 +05:30
Lesmiscore
82ea226c61 Restore LD_LIBRARY_PATH when using PyInstaller (#4666)
Authored by: Lesmiscore
2022-08-31 01:24:14 +09:00
pukkandan
da4db748fa [utils] Add deprecation_warning
See https://github.com/yt-dlp/yt-dlp/pull/2173#issuecomment-1097021515
2022-08-30 21:03:07 +05:30
pukkandan
e1eabd7beb [downloader/external] Smarter detection of executable
Closes #4778
2022-08-30 18:13:38 +05:30
pukkandan
d81ba7d491 [jsinterp, extractor/youtube] Minor fixes 2022-08-30 18:13:37 +05:30
OHaiiBuzzle
5135ed3d4a [extractor/huya] Fix stream extraction (#4798)
Closes #4658
Authored by: ohaiibuzzle
2022-08-30 16:14:16 +05:30
pukkandan
c4b2df872d [jsinterp] Fix _separate
Ref: https://github.com/yt-dlp/yt-dlp/issues/4635#issuecomment-1231126941
2022-08-30 16:06:40 +05:30
Samantaz Fox
224b5a35f7 [extractor/youtube] Update iOS Innertube clients (#4792)
Authored by: SamantazFox
2022-08-29 03:36:55 +00:00
coletdjnz
50ac0e5416 [extractor/youtube] Use device-specific user agent (#4770)
Thwart latest fingerprinting attempt (see https://github.com/iv-org/invidious/issues/3230#issuecomment-1226887639)

Authored by: coletdjnz
2022-08-28 22:59:54 +00:00
Lesmiscore
e0992d5558 [extractor/IslamChannel] Add extractors (#4779)
Authored by: Lesmiscore
2022-08-28 01:37:25 +09:00
pukkandan
5e01315aa1 [cache, extractor/youtube] Invalidate old cache 2022-08-27 07:25:14 +05:30
pukkandan
4e4982ab5b [extractor/generic] Don't return JW player without formats
Closes #4765
2022-08-27 06:21:17 +05:30
cgrigis
89e4d86171 [extractor/arte] Bug fix (#4769)
Closes #4768
Authored by: cgrigis
2022-08-27 05:58:01 +05:30
Shreyas Minocha
a1af516259 [extractor/screencastomatic] Support --video-password (#4761)
Authored by: shreyasminocha
2022-08-26 08:59:45 +05:30
pukkandan
1d64a59547 [extractor/vimeo:user] Fix _VALID_URL
Closes #4758
2022-08-26 06:29:03 +05:30
pukkandan
ca7f8b8f31 Bugfix for 822d66e591
Closes #4760
2022-08-26 06:08:05 +05:30
pukkandan
164b03c486 [jsinterp] Fix bug in operator precedence
Fixes https://github.com/yt-dlp/yt-dlp/issues/4635#issuecomment-1226659543
2022-08-25 09:40:46 +05:30
pukkandan
e5458d1d88 Fix lazy extractor bug in fe7866d0ed
and add test

Fixes https://github.com/yt-dlp/yt-dlp/pull/3234#issuecomment-1225347071
2022-08-24 15:19:58 +05:30
pukkandan
b5e7a2e69d Add version to infojson 2022-08-24 13:03:45 +05:30
pukkandan
2516cafb28 Fix bug in fe7866d0ed 2022-08-24 08:21:39 +05:30
pukkandan
fd404bec7e Fix --break-per-url --max-downloads 2022-08-24 08:00:13 +05:30
pukkandan
fe7866d0ed Add option --use-extractors
Deprecates `--force-generic-extractor`

Closes #3234, Closes #2044

Related: #4307, #1791
2022-08-24 07:47:51 +05:30
pukkandan
5314b52192 [utils] Add orderedSet_from_options 2022-08-24 07:38:55 +05:30
pukkandan
13db4e7b9e [extractor/mixcloud] All formats are audio-only
Closes #4740
2022-08-23 04:11:27 +05:30
Joshua Lochner
07275b708b [extractor/medaltv] Fix extraction (#4739)
Authored by: xenova
2022-08-23 01:34:12 +05:30
Elyse
b85703d11a [extractor/rtbf] Fix jwt extraction (#4738)
Closes #4683
Authored by: elyse0
2022-08-23 00:15:46 +05:30
pukkandan
992dc6b486 [jsinterp] Implement timeout
Workaround for #4716
2022-08-22 06:19:06 +05:30
pukkandan
822d66e591 Fix bug in --alias 2022-08-22 04:37:23 +05:30
pukkandan
8d1ad6378f [extractor/BiliBiliSearch] Don't sort by date
Related #4682
2022-08-21 05:19:20 +05:30
pukkandan
2d1019542a [extractor/BiliBiliSearch] Fix infinite loop
Closes #4682
2022-08-21 05:19:20 +05:30
pukkandan
b25cac650f [extractor/youtube] Fix bug in format sorting 2022-08-21 00:56:27 +05:30
pukkandan
90a1df305b [test] Fix test_youtube_signature 2022-08-21 00:51:03 +05:30
pukkandan
0a6b4b82e9 [extractor/uktv] Improve _VALID_URL
Closes #4707
Authored by: dirkf
2022-08-20 05:00:45 +05:30
pukkandan
1704c47ba8 [extractor/bitchute] Mark errors as expected
Closes #4685
2022-08-20 04:53:05 +05:30
60 changed files with 1892 additions and 672 deletions

View File

@@ -18,7 +18,7 @@ body:
options: options:
- label: I'm reporting a broken site - label: I'm reporting a broken site
required: true required: true
- label: I've verified that I'm running yt-dlp version **2022.08.19** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit) - label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true required: true
@@ -62,7 +62,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube'] [debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i'] [debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.19 [9d339c4] (win32_exe) [debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs [debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs [debug] Checking exe version: ffprobe -bsfs
@@ -70,8 +70,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {} [debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.08.19, Current version: 2022.08.19 Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.08.19) yt-dlp is up to date (2022.09.01)
<more lines> <more lines>
render: shell render: shell
validations: validations:

View File

@@ -18,7 +18,7 @@ body:
options: options:
- label: I'm reporting a new site support request - label: I'm reporting a new site support request
required: true required: true
- label: I've verified that I'm running yt-dlp version **2022.08.19** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit) - label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true required: true
@@ -74,7 +74,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube'] [debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i'] [debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.19 [9d339c4] (win32_exe) [debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs [debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs [debug] Checking exe version: ffprobe -bsfs
@@ -82,8 +82,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {} [debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.08.19, Current version: 2022.08.19 Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.08.19) yt-dlp is up to date (2022.09.01)
<more lines> <more lines>
render: shell render: shell
validations: validations:

View File

@@ -18,7 +18,7 @@ body:
options: options:
- label: I'm requesting a site-specific feature - label: I'm requesting a site-specific feature
required: true required: true
- label: I've verified that I'm running yt-dlp version **2022.08.19** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit) - label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true required: true
@@ -70,7 +70,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube'] [debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i'] [debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.19 [9d339c4] (win32_exe) [debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs [debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs [debug] Checking exe version: ffprobe -bsfs
@@ -78,8 +78,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {} [debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.08.19, Current version: 2022.08.19 Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.08.19) yt-dlp is up to date (2022.09.01)
<more lines> <more lines>
render: shell render: shell
validations: validations:

View File

@@ -18,7 +18,7 @@ body:
options: options:
- label: I'm reporting a bug unrelated to a specific site - label: I'm reporting a bug unrelated to a specific site
required: true required: true
- label: I've verified that I'm running yt-dlp version **2022.08.19** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit) - label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true required: true
@@ -55,7 +55,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube'] [debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i'] [debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.19 [9d339c4] (win32_exe) [debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs [debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs [debug] Checking exe version: ffprobe -bsfs
@@ -63,8 +63,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {} [debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.08.19, Current version: 2022.08.19 Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.08.19) yt-dlp is up to date (2022.09.01)
<more lines> <more lines>
render: shell render: shell
validations: validations:

View File

@@ -20,7 +20,7 @@ body:
required: true required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme) - label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true required: true
- label: I've verified that I'm running yt-dlp version **2022.08.19** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit) - label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true required: true
@@ -51,7 +51,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube'] [debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i'] [debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.19 [9d339c4] (win32_exe) [debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs [debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs [debug] Checking exe version: ffprobe -bsfs
@@ -59,7 +59,7 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {} [debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.08.19, Current version: 2022.08.19 Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.08.19) yt-dlp is up to date (2022.09.01)
<more lines> <more lines>
render: shell render: shell

View File

@@ -26,7 +26,7 @@ body:
required: true required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme) - label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true required: true
- label: I've verified that I'm running yt-dlp version **2022.08.19** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit) - label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
required: true required: true
@@ -57,7 +57,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube'] [debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i'] [debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.19 [9d339c4] (win32_exe) [debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs [debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs [debug] Checking exe version: ffprobe -bsfs
@@ -65,7 +65,7 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {} [debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.08.19, Current version: 2022.08.19 Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.08.19) yt-dlp is up to date (2022.09.01)
<more lines> <more lines>
render: shell render: shell

View File

@@ -194,7 +194,7 @@ jobs:
- name: Install Requirements - name: Install Requirements
run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
python -m pip install --upgrade pip setuptools wheel py2exe python -m pip install --upgrade pip setuptools wheel py2exe
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-5.2-py3-none-any.whl" -r requirements.txt pip install "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-5.3-py3-none-any.whl" -r requirements.txt
- name: Prepare - name: Prepare
run: | run: |
@@ -230,7 +230,7 @@ jobs:
- name: Install Requirements - name: Install Requirements
run: | run: |
python -m pip install --upgrade pip setuptools wheel python -m pip install --upgrade pip setuptools wheel
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-5.2-py3-none-any.whl" -r requirements.txt pip install "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-5.3-py3-none-any.whl" -r requirements.txt
- name: Prepare - name: Prepare
run: | run: |

View File

@@ -299,3 +299,12 @@ bashonly
jacobtruman jacobtruman
masta79 masta79
palewire palewire
cgrigis
DavidH-2022
dfaker
jackyyf
ohaiibuzzle
SamantazFox
shreyasminocha
tejasa97
xenova

View File

@@ -11,6 +11,54 @@
--> -->
### 2022.09.01
* Add option `--use-extractors`
* Merge youtube-dl: Upto [commit/ed5c44e](https://github.com/ytdl-org/youtube-dl/commit/ed5c44e7)
* Add yt-dlp version to infojson
* Fix `--break-per-url --max-downloads`
* Fix bug in `--alias`
* [cookies] Support firefox container in `--cookies-from-browser` by [bashonly](https://github.com/bashonly), [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* [downloader/external] Smarter detection of executable
* [extractor/generic] Don't return JW player without formats
* [FormatSort] Fix `aext` for `--prefer-free-formats`
* [jsinterp] Various improvements by [pukkandan](https://github.com/pukkandan), [dirkf](https://github.com/dirkf), [elyse0](https://github.com/elyse0)
* [cache] Mechanism to invalidate old cache
* [utils] Add `deprecation_warning`
* [utils] Add `orderedSet_from_options`
* [utils] `Popen`: Restore `LD_LIBRARY_PATH` when using PyInstaller by [Lesmiscore](https://github.com/Lesmiscore)
* [build] `make tar` should not follow `DESTDIR` by [satan1st](https://github.com/satan1st)
* [build] Update pyinstaller by [shirt-dev](https://github.com/shirt-dev)
* [test] Fix `test_youtube_signature`
* [cleanup] Misc fixes and cleanup by [DavidH-2022](https://github.com/DavidH-2022), [MrRawes](https://github.com/MrRawes), [pukkandan](https://github.com/pukkandan)
* [extractor/epoch] Add extractor by [tejasa97](https://github.com/tejasa97)
* [extractor/eurosport] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/IslamChannel] Add extractors by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/newspicks] Add extractor by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/triller] Add extractor by [bashonly](https://github.com/bashonly)
* [extractor/VQQ] Add extractors by [elyse0](https://github.com/elyse0)
* [extractor/youtube] Improvements to nsig extraction
* [extractor/youtube] Fix bug in format sorting
* [extractor/youtube] Update iOS Innertube clients by [SamantazFox](https://github.com/SamantazFox)
* [extractor/youtube] Use device-specific user agent by [coletdjnz](https://github.com/coletdjnz)
* [extractor/youtube] Add `--compat-option no-youtube-prefer-utc-upload-date` by [coletdjnz](https://github.com/coletdjnz)
* [extractor/arte] Bug fix by [cgrigis](https://github.com/cgrigis)
* [extractor/bilibili] Extract `flac` with premium account by [jackyyf](https://github.com/jackyyf)
* [extractor/BiliBiliSearch] Don't sort by date
* [extractor/BiliBiliSearch] Fix infinite loop
* [extractor/bitchute] Mark errors as expected
* [extractor/crunchyroll:beta] Use anonymous access by [tejing1](https://github.com/tejing1)
* [extractor/huya] Fix stream extraction by [ohaiibuzzle](https://github.com/ohaiibuzzle)
* [extractor/medaltv] Fix extraction by [xenova](https://github.com/xenova)
* [extractor/mediaset] Fix embed extraction
* [extractor/mixcloud] All formats are audio-only
* [extractor/rtbf] Fix jwt extraction by [elyse0](https://github.com/elyse0)
* [extractor/screencastomatic] Support `--video-password` by [shreyasminocha](https://github.com/shreyasminocha)
* [extractor/stripchat] Don't modify input URL by [dfaker](https://github.com/dfaker)
* [extractor/uktv] Improve `_VALID_URL` by [dirkf](https://github.com/dirkf)
* [extractor/vimeo:user] Fix `_VALID_URL`
### 2022.08.19 ### 2022.08.19
* Fix bug in `--download-archive` * Fix bug in `--download-archive`

View File

@@ -33,7 +33,6 @@ completion-zsh: completions/zsh/_yt-dlp
lazy-extractors: yt_dlp/extractor/lazy_extractors.py lazy-extractors: yt_dlp/extractor/lazy_extractors.py
PREFIX ?= /usr/local PREFIX ?= /usr/local
DESTDIR ?= .
BINDIR ?= $(PREFIX)/bin BINDIR ?= $(PREFIX)/bin
MANDIR ?= $(PREFIX)/man MANDIR ?= $(PREFIX)/man
SHAREDIR ?= $(PREFIX)/share SHAREDIR ?= $(PREFIX)/share
@@ -134,7 +133,7 @@ yt_dlp/extractor/lazy_extractors.py: devscripts/make_lazy_extractors.py devscrip
$(PYTHON) devscripts/make_lazy_extractors.py $@ $(PYTHON) devscripts/make_lazy_extractors.py $@
yt-dlp.tar.gz: all yt-dlp.tar.gz: all
@tar -czf $(DESTDIR)/yt-dlp.tar.gz --transform "s|^|yt-dlp/|" --owner 0 --group 0 \ @tar -czf yt-dlp.tar.gz --transform "s|^|yt-dlp/|" --owner 0 --group 0 \
--exclude '*.DS_Store' \ --exclude '*.DS_Store' \
--exclude '*.kate-swp' \ --exclude '*.kate-swp' \
--exclude '*.pyc' \ --exclude '*.pyc' \

View File

@@ -71,7 +71,7 @@ yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on t
# NEW FEATURES # NEW FEATURES
* Merged with **youtube-dl v2021.12.17+ [commit/b0a60ce](https://github.com/ytdl-org/youtube-dl/commit/b0a60ce2032172aeaaf27fe3866ab72768f10cb2)**<!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl) * Merged with **youtube-dl v2021.12.17+ [commit/ed5c44e](https://github.com/ytdl-org/youtube-dl/commit/ed5c44e7b74ac77f87ca5ed6cb5e964a0c6a0678)**<!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* **[SponsorBlock Integration](#sponsorblock-options)**: You can mark/remove sponsor sections in youtube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API * **[SponsorBlock Integration](#sponsorblock-options)**: You can mark/remove sponsor sections in youtube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API
@@ -141,6 +141,7 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
* Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading * Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading
* Youtube channel URLs are automatically redirected to `/video`. Append a `/featured` to the URL to download only the videos in the home page. If the channel does not have a videos tab, we try to download the equivalent `UU` playlist instead. For all other tabs, if the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections * Youtube channel URLs are automatically redirected to `/video`. Append a `/featured` to the URL to download only the videos in the home page. If the channel does not have a videos tab, we try to download the equivalent `UU` playlist instead. For all other tabs, if the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections
* Unavailable videos are also listed for youtube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this * Unavailable videos are also listed for youtube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this
* The upload dates extracted from YouTube are in UTC [when available](https://github.com/yt-dlp/yt-dlp/blob/89e4d86171c7b7c997c77d4714542e0383bf0db0/yt_dlp/extractor/youtube.py#L3898-L3900). Use `--compat-options no-youtube-prefer-utc-upload-date` to prefer the non-UTC upload date.
* If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this * If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this
* Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead * Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead
* Some private fields such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this * Some private fields such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this
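
As a rough illustration of why the choice between a UTC and a non-UTC upload date matters, the sketch below formats one instant as a `YYYYMMDD` date in two timezones. The `format_upload_date` helper, the sample instant and the UTC-7 offset are assumptions chosen for illustration only; this is not yt-dlp's extraction code.

```python
from datetime import datetime, timedelta, timezone


def format_upload_date(moment, tz):
    """Hypothetical helper: render an aware datetime as YYYYMMDD in the given timezone."""
    return moment.astimezone(tz).strftime('%Y%m%d')


# Assumed sample instant: 02:00 UTC on 2022-09-01.
uploaded_at = datetime(2022, 9, 1, 2, 0, tzinfo=timezone.utc)

# Assumed non-UTC zone, seven hours behind UTC.
example_zone = timezone(timedelta(hours=-7))

utc_date = format_upload_date(uploaded_at, timezone.utc)
non_utc_date = format_upload_date(uploaded_at, example_zone)

print(utc_date, non_utc_date)
```

Under these assumptions the same upload falls on two different calendar days, which is the kind of discrepancy the compat option lets a user choose between.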
@@ -320,7 +321,7 @@ To build the standalone executable, you must have Python and `pyinstaller` (plus
On some systems, you may need to use `py` or `python` instead of `python3`. On some systems, you may need to use `py` or `python` instead of `python3`.
Note that pyinstaller [does not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment. Note that pyinstaller versions below 4.4 [do not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment.
**Important**: Running `pyinstaller` directly **without** using `pyinst.py` is **not** officially supported. This may or may not work correctly. **Important**: Running `pyinstaller` directly **without** using `pyinst.py` is **not** officially supported. This may or may not work correctly.
@@ -375,7 +376,13 @@ You can also fork the project on github and run your fork's [build workflow](.gi
--list-extractors List all supported extractors and exit --list-extractors List all supported extractors and exit
--extractor-descriptions Output descriptions of all supported --extractor-descriptions Output descriptions of all supported
extractors and exit extractors and exit
--force-generic-extractor Force extraction to use the generic extractor --use-extractors NAMES Extractor names to use separated by commas.
You can also use regexes, "all", "default"
and "end" (end URL matching); e.g. --ies
"holodex.*,end,youtube". Prefix the name
with a "-" to exclude it, e.g. --ies
default,-generic. Use --list-extractors for
a list of extractor names. (Alias: --ies)
--default-search PREFIX Use this prefix for unqualified URLs. E.g. --default-search PREFIX Use this prefix for unqualified URLs. E.g.
"gvsearch2:python" downloads two videos from "gvsearch2:python" downloads two videos from
google videos for the search term "python". google videos for the search term "python".
@@ -524,8 +531,8 @@ You can also fork the project on github and run your fork's [build workflow](.gi
a file that is in the archive a file that is in the archive
--break-on-reject Stop the download process when encountering --break-on-reject Stop the download process when encountering
a file that has been filtered out a file that has been filtered out
--break-per-input Make --break-on-existing, --break-on-reject --break-per-input --break-on-existing, --break-on-reject,
and --max-downloads act only on the current --max-downloads, and autonumber resets per
input URL input URL
--no-break-per-input --break-on-existing and similar options --no-break-per-input --break-on-existing and similar options
terminate the entire download queue terminate the entire download queue
@@ -700,18 +707,20 @@ You can also fork the project on github and run your fork's [build workflow](.gi
and dump cookie jar in and dump cookie jar in
--no-cookies Do not read/dump cookies from/to file --no-cookies Do not read/dump cookies from/to file
(default) (default)
--cookies-from-browser BROWSER[+KEYRING][:PROFILE] --cookies-from-browser BROWSER[+KEYRING][:PROFILE][::CONTAINER]
The name of the browser and (optionally) the The name of the browser to load cookies
name/path of the profile to load cookies from. Currently supported browsers are:
from, separated by a ":". Currently brave, chrome, chromium, edge, firefox,
supported browsers are: brave, chrome, opera, safari, vivaldi. Optionally, the
chromium, edge, firefox, opera, safari, KEYRING used for decrypting Chromium cookies
vivaldi. By default, the most recently on Linux, the name/path of the PROFILE to
accessed profile is used. The keyring used load cookies from, and the CONTAINER name
for decrypting Chromium cookies on Linux can (if Firefox) ("none" for no container) can
be (optionally) specified after the browser be given with their respective separators.
name separated by a "+". Currently supported By default, all containers of the most
keyrings are: basictext, gnomekeyring, kwallet recently accessed profile are used.
Currently supported keyrings are: basictext,
gnomekeyring, kwallet
--no-cookies-from-browser Do not load cookies from browser (default) --no-cookies-from-browser Do not load cookies from browser (default)
--cache-dir DIR Location in the filesystem where youtube-dl --cache-dir DIR Location in the filesystem where youtube-dl
can store some downloaded information (such can store some downloaded information (such
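
The `BROWSER[+KEYRING][:PROFILE][::CONTAINER]` syntax described in the hunk above can be illustrated with a small parser. This is a minimal sketch under stated assumptions, not yt-dlp's own option parsing: the function name `parse_browser_spec` and the regular expression are hypothetical, and the returned tuple follows the `(browser, profile, keyring, container)` order given in the `cookiesfrombrowser` docstring.

```python
import re

# Hypothetical pattern for BROWSER[+KEYRING][:PROFILE][::CONTAINER]
_SPEC_RE = re.compile(r'''(?x)
    (?P<browser>[^+:]+)             # browser name, e.g. firefox
    (?:\+(?P<keyring>[^:]+))?       # optional +KEYRING
    (?::(?!:)(?P<profile>.+?))?     # optional :PROFILE (a single colon only)
    (?:::(?P<container>.+))?        # optional ::CONTAINER
''')


def parse_browser_spec(spec):
    """Split a spec string into (browser, profile, keyring, container)."""
    mobj = _SPEC_RE.fullmatch(spec)
    if mobj is None:
        raise ValueError(f'invalid browser specification: {spec!r}')
    return (mobj.group('browser'), mobj.group('profile'),
            mobj.group('keyring'), mobj.group('container'))


print(parse_browser_spec('chrome'))
print(parse_browser_spec('chrome+gnomekeyring:Default'))
print(parse_browser_spec('firefox:default::none'))
```

Parts that are not given come back as `None`. The sketch returns the container string unchanged, so the special value `none` ("no container") would still need to be interpreted by the caller.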
@@ -1229,7 +1238,6 @@ The available fields are:
- `id` (string): Video identifier - `id` (string): Video identifier
- `title` (string): Video title - `title` (string): Video title
- `fulltitle` (string): Video title ignoring live timestamp and generic title - `fulltitle` (string): Video title ignoring live timestamp and generic title
- `url` (string): Video URL
- `ext` (string): Video filename extension - `ext` (string): Video filename extension
- `alt_title` (string): A secondary title of the video - `alt_title` (string): A secondary title of the video
- `description` (string): The description of the video - `description` (string): The description of the video
@@ -1264,26 +1272,6 @@ The available fields are:
- `availability` (string): Whether the video is "private", "premium_only", "subscriber_only", "needs_auth", "unlisted" or "public" - `availability` (string): Whether the video is "private", "premium_only", "subscriber_only", "needs_auth", "unlisted" or "public"
- `start_time` (numeric): Time in seconds where the reproduction should start, as specified in the URL - `start_time` (numeric): Time in seconds where the reproduction should start, as specified in the URL
- `end_time` (numeric): Time in seconds where the reproduction should end, as specified in the URL - `end_time` (numeric): Time in seconds where the reproduction should end, as specified in the URL
- `format` (string): A human-readable description of the format
- `format_id` (string): Format code specified by `--format`
- `format_note` (string): Additional info about the format
- `width` (numeric): Width of the video
- `height` (numeric): Height of the video
- `resolution` (string): Textual description of width and height
- `tbr` (numeric): Average bitrate of audio and video in KBit/s
- `abr` (numeric): Average audio bitrate in KBit/s
- `acodec` (string): Name of the audio codec in use
- `asr` (numeric): Audio sampling rate in Hertz
- `vbr` (numeric): Average video bitrate in KBit/s
- `fps` (numeric): Frame rate
- `dynamic_range` (string): The dynamic range of the video
- `audio_channels` (numeric): The number of audio channels
- `stretched_ratio` (float): `width:height` of the video's pixels, if not square
- `vcodec` (string): Name of the video codec in use
- `container` (string): Name of the container format
- `filesize` (numeric): The number of bytes, if known in advance
- `filesize_approx` (numeric): An estimate for the number of bytes
- `protocol` (string): The protocol that will be used for the actual download
- `extractor` (string): Name of the extractor - `extractor` (string): Name of the extractor
- `extractor_key` (string): Key name of the extractor - `extractor_key` (string): Key name of the extractor
- `epoch` (numeric): Unix epoch of when the information extraction was completed - `epoch` (numeric): Unix epoch of when the information extraction was completed
@@ -1302,6 +1290,8 @@ The available fields are:
- `webpage_url_basename` (string): The basename of the webpage URL - `webpage_url_basename` (string): The basename of the webpage URL
- `webpage_url_domain` (string): The domain of the webpage URL - `webpage_url_domain` (string): The domain of the webpage URL
- `original_url` (string): The URL given by the user (or same as `webpage_url` for playlist entries) - `original_url` (string): The URL given by the user (or same as `webpage_url` for playlist entries)
All the fields in [Filtering Formats](#filtering-formats) can also be used
Available for the video that belongs to some logical chapter or section: Available for the video that belongs to some logical chapter or section:
@@ -1383,13 +1373,13 @@ If you are using an output template inside a Windows batch file then you must es
#### Output template examples #### Output template examples
```bash ```bash
$ yt-dlp --get-filename -o "test video.%(ext)s" BaW_jenozKc $ yt-dlp --print filename -o "test video.%(ext)s" BaW_jenozKc
test video.webm # Literal name with correct extension test video.webm # Literal name with correct extension
$ yt-dlp --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc $ yt-dlp --print filename -o "%(title)s.%(ext)s" BaW_jenozKc
youtube-dl test video ''_ä↭𝕐.webm # All kinds of weird characters youtube-dl test video ''_ä↭𝕐.webm # All kinds of weird characters
$ yt-dlp --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc --restrict-filenames $ yt-dlp --print filename -o "%(title)s.%(ext)s" BaW_jenozKc --restrict-filenames
youtube-dl_test_video_.webm # Restricted file name youtube-dl_test_video_.webm # Restricted file name
# Download YouTube playlist videos in separate directory indexed by video order in a playlist # Download YouTube playlist videos in separate directory indexed by video order in a playlist
@@ -1478,6 +1468,7 @@ You can also filter the video formats by putting a condition in brackets, as in
The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals): The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals):
- `filesize`: The number of bytes, if known in advance - `filesize`: The number of bytes, if known in advance
- `filesize_approx`: An estimate for the number of bytes
- `width`: Width of the video, if known - `width`: Width of the video, if known
- `height`: Height of the video, if known - `height`: Height of the video, if known
- `tbr`: Average bitrate of audio and video in KBit/s - `tbr`: Average bitrate of audio and video in KBit/s
@@ -1485,16 +1476,23 @@ The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `
- `vbr`: Average video bitrate in KBit/s - `vbr`: Average video bitrate in KBit/s
- `asr`: Audio sampling rate in Hertz - `asr`: Audio sampling rate in Hertz
- `fps`: Frame rate - `fps`: Frame rate
- `audio_channels`: The number of audio channels
- `stretched_ratio`: `width:height` of the video's pixels, if not square
Filtering also works for the comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains), `~=` (matches regex) and the following string meta fields: Filtering also works for the comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains), `~=` (matches regex) and the following string meta fields:
- `url`: Video URL
- `ext`: File extension - `ext`: File extension
- `acodec`: Name of the audio codec in use - `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use - `vcodec`: Name of the video codec in use
- `container`: Name of the container format - `container`: Name of the container format
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`) - `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`)
- `format_id`: A short description of the format
- `language`: Language code - `language`: Language code
- `dynamic_range`: The dynamic range of the video
- `format_id`: A short description of the format
- `format`: A human-readable description of the format
- `format_note`: Additional info about the format
- `resolution`: Textual description of width and height
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain). The comparand of a string comparison needs to be quoted with either double or single quotes if it contains spaces or special characters other than `._-`. Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain). The comparand of a string comparison needs to be quoted with either double or single quotes if it contains spaces or special characters other than `._-`.
@@ -1521,7 +1519,7 @@ The available fields are:
- `acodec`: Audio Codec (`flac`/`alac` > `wav`/`aiff` > `opus` > `vorbis` > `aac` > `mp4a` > `mp3` > `eac3` > `ac3` > `dts` > other) - `acodec`: Audio Codec (`flac`/`alac` > `wav`/`aiff` > `opus` > `vorbis` > `aac` > `mp4a` > `mp3` > `eac3` > `ac3` > `dts` > other)
- `codec`: Equivalent to `vcodec,acodec` - `codec`: Equivalent to `vcodec,acodec`
- `vext`: Video Extension (`mp4` > `webm` > `flv` > other). If `--prefer-free-formats` is used, `webm` is preferred. - `vext`: Video Extension (`mp4` > `webm` > `flv` > other). If `--prefer-free-formats` is used, `webm` is preferred.
- `aext`: Audio Extension (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other). If `--prefer-free-formats` is used, the order changes to `opus` > `ogg` > `webm` > `m4a` > `mp3` > `aac`. - `aext`: Audio Extension (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other). If `--prefer-free-formats` is used, the order changes to `ogg` > `opus` > `webm` > `mp3` > `m4a` > `aac`
- `ext`: Equivalent to `vext,aext` - `ext`: Equivalent to `vext,aext`
- `filesize`: Exact filesize, if known in advance - `filesize`: Exact filesize, if known in advance
- `fs_approx`: Approximate filesize calculated from the manifests - `fs_approx`: Approximate filesize calculated from the manifests
@@ -2058,6 +2056,7 @@ While these options are redundant, they are still expected to be used due to the
#### Not recommended #### Not recommended
While these options still work, their use is not recommended since there are other alternatives to achieve the same While these options still work, their use is not recommended since there are other alternatives to achieve the same
--force-generic-extractor --ies generic,default
--exec-before-download CMD --exec "before_dl:CMD" --exec-before-download CMD --exec "before_dl:CMD"
--no-exec-before-download --no-exec --no-exec-before-download --no-exec
--all-formats -f all --all-formats -f all
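
The selection rules documented for `--use-extractors` in this README diff (comma-separated names, regexes, the `all` and `default` aliases, and a `-` prefix for exclusion) can be sketched as follows. This is a simplified illustration, not the `orderedSet_from_options` helper that the code change uses: `select_extractors` is a hypothetical name, the extractor names in the sample lists are made up for the example, and unlike a real implementation the sketch silently ignores patterns that match nothing.

```python
import re


def select_extractors(spec, all_names, default_names):
    """Resolve a comma-separated selection into an ordered list of names."""
    aliases = {'all': list(all_names), 'default': list(default_names)}
    selected = []
    for item in spec.split(','):
        exclude = item.startswith('-')
        if exclude:
            item = item[1:]
        if item in aliases:
            matches = aliases[item]
        else:
            # Anything that is not an alias is treated as a regex on the full name
            pattern = re.compile(item, re.IGNORECASE)
            matches = [name for name in all_names if pattern.fullmatch(name)]
        for name in matches:
            if exclude:
                if name in selected:
                    selected.remove(name)
            elif name not in selected:
                selected.append(name)
    return selected


# Illustrative names only; "end" stands for the end-of-matching marker
names = ['youtube', 'holodex', 'holodex:playlist', 'generic', 'end']
default = ['youtube', 'holodex', 'holodex:playlist', 'generic']

print(select_extractors('holodex.*,end,youtube', names, default))
print(select_extractors('default,-generic', names, default))
```

The order of the result matters because extractors are tried in the order they were added, which is why placing `end` before `youtube` in the first selection stops URL matching before the later name is reached.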


@@ -11,14 +11,17 @@ from ..utils import (
# These bloat the lazy_extractors, so allow them to passthrough silently # These bloat the lazy_extractors, so allow them to passthrough silently
ALLOWED_CLASSMETHODS = {'get_testcases', 'extract_from_webpage'} ALLOWED_CLASSMETHODS = {'get_testcases', 'extract_from_webpage'}
_WARNED = False
class LazyLoadMetaClass(type): class LazyLoadMetaClass(type):
def __getattr__(cls, name): def __getattr__(cls, name):
if '_real_class' not in cls.__dict__ and name not in ALLOWED_CLASSMETHODS: global _WARNED
write_string( if ('_real_class' not in cls.__dict__
'WARNING: Falling back to normal extractor since lazy extractor ' and name not in ALLOWED_CLASSMETHODS and not _WARNED):
f'{cls.__name__} does not have attribute {name}{bug_reports_message()}\n') _WARNED = True
write_string('WARNING: Falling back to normal extractor since lazy extractor '
f'{cls.__name__} does not have attribute {name}{bug_reports_message()}\n')
return getattr(cls.real_class, name) return getattr(cls.real_class, name)
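
The change above limits the fallback warning to a single emission per process by guarding it with a module-level flag. A minimal self-contained sketch of that warn-once pattern follows; `LazyMeta`, `Real`, `Lazy` and the `messages` list are hypothetical stand-ins, and the warning is collected in a list instead of being written with `write_string`.

```python
_WARNED = False
messages = []  # stands in for the real warning output


class LazyMeta(type):
    def __getattr__(cls, name):
        # Only called when normal attribute lookup on the class fails
        global _WARNED
        if not _WARNED:
            _WARNED = True
            messages.append(
                f'WARNING: lazy class {cls.__name__} does not have attribute {name}')
        return getattr(cls.real_class, name)


class Real:
    foo = 1
    bar = 2


class Lazy(metaclass=LazyMeta):
    real_class = Real  # the class that attribute access falls back to


print(Lazy.foo, Lazy.bar)  # both lookups fall back to Real
print(len(messages))       # only the first fallback produced a warning
```

A consequence of the single global flag is that only the first missing attribute is ever reported, even when later lookups fail on a different class.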


@@ -12,7 +12,9 @@ from inspect import getsource
from devscripts.utils import get_filename_args, read_file, write_file from devscripts.utils import get_filename_args, read_file, write_file
NO_ATTR = object() NO_ATTR = object()
STATIC_CLASS_PROPERTIES = ['IE_NAME', 'IE_DESC', 'SEARCH_KEY', '_VALID_URL', '_WORKING', '_NETRC_MACHINE', 'age_limit'] STATIC_CLASS_PROPERTIES = [
'IE_NAME', 'IE_DESC', 'SEARCH_KEY', '_VALID_URL', '_WORKING', '_ENABLED', '_NETRC_MACHINE', 'age_limit'
]
CLASS_METHODS = [ CLASS_METHODS = [
'ie_key', 'working', 'description', 'suitable', '_match_valid_url', '_match_id', 'get_temp_id', 'is_suitable' 'ie_key', 'working', 'description', 'suitable', '_match_valid_url', '_match_id', 'get_temp_id', 'is_suitable'
] ]


@@ -1,13 +1,13 @@
#!/usr/bin/env sh #!/usr/bin/env sh
if [ -z $1 ]; then if [ -z "$1" ]; then
test_set='test' test_set='test'
elif [ $1 = 'core' ]; then elif [ "$1" = 'core' ]; then
test_set="-m not download" test_set="-m not download"
elif [ $1 = 'download' ]; then elif [ "$1" = 'download' ]; then
test_set="-m download" test_set="-m download"
else else
echo 'Invalid test type "'$1'". Use "core" | "download"' echo 'Invalid test type "'"$1"'". Use "core" | "download"'
exit 1 exit 1
fi fi


@@ -364,6 +364,7 @@
- **Engadget** - **Engadget**
- **Epicon** - **Epicon**
- **EpiconSeries** - **EpiconSeries**
- **Epoch**
- **Eporner** - **Eporner**
- **EroProfile**: [<abbr title="netrc machine"><em>eroprofile</em></abbr>] - **EroProfile**: [<abbr title="netrc machine"><em>eroprofile</em></abbr>]
- **EroProfile:album** - **EroProfile:album**
@@ -377,6 +378,7 @@
- **EsriVideo** - **EsriVideo**
- **Europa** - **Europa**
- **EuropeanTour** - **EuropeanTour**
- **Eurosport**
- **EUScreen** - **EUScreen**
- **EWETV**: [<abbr title="netrc machine"><em>ewetv</em></abbr>] - **EWETV**: [<abbr title="netrc machine"><em>ewetv</em></abbr>]
- **EWETVLive**: [<abbr title="netrc machine"><em>ewetv</em></abbr>] - **EWETVLive**: [<abbr title="netrc machine"><em>ewetv</em></abbr>]
@@ -553,6 +555,8 @@
- **iq.com**: International version of iQiyi - **iq.com**: International version of iQiyi
- **iq.com:album** - **iq.com:album**
- **iqiyi**: [<abbr title="netrc machine"><em>iqiyi</em></abbr>] 爱奇艺 - **iqiyi**: [<abbr title="netrc machine"><em>iqiyi</em></abbr>] 爱奇艺
- **IslamChannel**
- **IslamChannelSeries**
- **ITProTV** - **ITProTV**
- **ITProTVCourse** - **ITProTVCourse**
- **ITTF** - **ITTF**
@@ -820,6 +824,7 @@
- **Newgrounds** - **Newgrounds**
- **Newgrounds:playlist** - **Newgrounds:playlist**
- **Newgrounds:user** - **Newgrounds:user**
- **NewsPicks**
- **Newstube** - **Newstube**
- **Newsy** - **Newsy**
- **NextMedia**: 蘋果日報 - **NextMedia**: 蘋果日報
@@ -1331,6 +1336,8 @@
- **ToypicsUser**: Toypics user profile - **ToypicsUser**: Toypics user profile
- **TrailerAddict**: (**Currently broken**) - **TrailerAddict**: (**Currently broken**)
- **TravelChannel** - **TravelChannel**
- **Triller**: [<abbr title="netrc machine"><em>triller</em></abbr>]
- **TrillerUser**: [<abbr title="netrc machine"><em>triller</em></abbr>]
- **Trilulilu** - **Trilulilu**
- **Trovo** - **Trovo**
- **TrovoChannelClip**: All Clips of a trovo.live channel; "trovoclip:" prefix - **TrovoChannelClip**: All Clips of a trovo.live channel; "trovoclip:" prefix
@@ -1506,6 +1513,8 @@
- **VoxMedia** - **VoxMedia**
- **VoxMediaVolume** - **VoxMediaVolume**
- **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **vqq:series**
- **vqq:video**
- **Vrak** - **Vrak**
- **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza - **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza
- **VrtNU**: [<abbr title="netrc machine"><em>vrtnu</em></abbr>] VrtNU.be - **VrtNU**: [<abbr title="netrc machine"><em>vrtnu</em></abbr>] VrtNU.be


@@ -668,7 +668,7 @@ class TestYoutubeDL(unittest.TestCase):
def test_prepare_outtmpl_and_filename(self): def test_prepare_outtmpl_and_filename(self):
def test(tmpl, expected, *, info=None, **params): def test(tmpl, expected, *, info=None, **params):
params['outtmpl'] = tmpl params['outtmpl'] = tmpl
ydl = YoutubeDL(params) ydl = FakeYDL(params)
ydl._num_downloads = 1 ydl._num_downloads = 1
self.assertEqual(ydl.validate_outtmpl(tmpl), None) self.assertEqual(ydl.validate_outtmpl(tmpl), None)


@@ -11,41 +11,46 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import contextlib import contextlib
import subprocess import subprocess
from yt_dlp.utils import encodeArgument from yt_dlp.utils import Popen
rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
LAZY_EXTRACTORS = 'yt_dlp/extractor/lazy_extractors.py'
try:
_DEV_NULL = subprocess.DEVNULL
except AttributeError:
_DEV_NULL = open(os.devnull, 'wb')
class TestExecution(unittest.TestCase): class TestExecution(unittest.TestCase):
def test_import(self): def run_yt_dlp(self, exe=(sys.executable, 'yt_dlp/__main__.py'), opts=('--version', )):
subprocess.check_call([sys.executable, '-c', 'import yt_dlp'], cwd=rootDir) stdout, stderr, returncode = Popen.run(
[*exe, '--ignore-config', *opts], cwd=rootDir, text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
def test_module_exec(self): print(stderr, file=sys.stderr)
subprocess.check_call([sys.executable, '-m', 'yt_dlp', '--ignore-config', '--version'], cwd=rootDir, stdout=_DEV_NULL) self.assertEqual(returncode, 0)
return stdout.strip(), stderr.strip()
def test_main_exec(self): def test_main_exec(self):
subprocess.check_call([sys.executable, 'yt_dlp/__main__.py', '--ignore-config', '--version'], cwd=rootDir, stdout=_DEV_NULL) self.run_yt_dlp()
def test_import(self):
self.run_yt_dlp(exe=(sys.executable, '-c', 'import yt_dlp'))
def test_module_exec(self):
self.run_yt_dlp(exe=(sys.executable, '-m', 'yt_dlp'))
def test_cmdline_umlauts(self): def test_cmdline_umlauts(self):
p = subprocess.Popen( _, stderr = self.run_yt_dlp(opts=('ä', '--version'))
[sys.executable, 'yt_dlp/__main__.py', '--ignore-config', encodeArgument('ä'), '--version'],
cwd=rootDir, stdout=_DEV_NULL, stderr=subprocess.PIPE)
_, stderr = p.communicate()
self.assertFalse(stderr) self.assertFalse(stderr)
def test_lazy_extractors(self): def test_lazy_extractors(self):
try: try:
subprocess.check_call([sys.executable, 'devscripts/make_lazy_extractors.py', 'yt_dlp/extractor/lazy_extractors.py'], cwd=rootDir, stdout=_DEV_NULL) subprocess.check_call([sys.executable, 'devscripts/make_lazy_extractors.py', LAZY_EXTRACTORS],
subprocess.check_call([sys.executable, 'test/test_all_urls.py'], cwd=rootDir, stdout=_DEV_NULL) cwd=rootDir, stdout=subprocess.DEVNULL)
self.assertTrue(os.path.exists(LAZY_EXTRACTORS))
_, stderr = self.run_yt_dlp(opts=('-s', 'test:'))
self.assertFalse(stderr)
subprocess.check_call([sys.executable, 'test/test_all_urls.py'], cwd=rootDir, stdout=subprocess.DEVNULL)
finally: finally:
with contextlib.suppress(OSError): with contextlib.suppress(OSError):
os.remove('yt_dlp/extractor/lazy_extractors.py') os.remove(LAZY_EXTRACTORS)
if __name__ == '__main__': if __name__ == '__main__':


@@ -71,6 +71,9 @@ class TestJSInterpreter(unittest.TestCase):
jsi = JSInterpreter('function f(){return 0 ?? 42;}') jsi = JSInterpreter('function f(){return 0 ?? 42;}')
self.assertEqual(jsi.call_function('f'), 0) self.assertEqual(jsi.call_function('f'), 0)
jsi = JSInterpreter('function f(){return "life, the universe and everything" < 42;}')
self.assertFalse(jsi.call_function('f'))
def test_array_access(self): def test_array_access(self):
jsi = JSInterpreter('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2.0] = 7; return x;}') jsi = JSInterpreter('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2.0] = 7; return x;}')
self.assertEqual(jsi.call_function('f'), [5, 2, 7]) self.assertEqual(jsi.call_function('f'), [5, 2, 7])
@@ -129,6 +132,11 @@ class TestJSInterpreter(unittest.TestCase):
self.assertEqual(jsi.call_function('x'), [20, 20, 30, 40, 50]) self.assertEqual(jsi.call_function('x'), [20, 20, 30, 40, 50])
def test_builtins(self): def test_builtins(self):
jsi = JSInterpreter('''
function x() { return NaN }
''')
self.assertTrue(math.isnan(jsi.call_function('x')))
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { return new Date('Wednesday 31 December 1969 18:01:26 MDT') - 0; } function x() { return new Date('Wednesday 31 December 1969 18:01:26 MDT') - 0; }
''') ''')
@@ -188,6 +196,30 @@ class TestJSInterpreter(unittest.TestCase):
''') ''')
self.assertEqual(jsi.call_function('x'), 10) self.assertEqual(jsi.call_function('x'), 10)
def test_catch(self):
jsi = JSInterpreter('''
function x() { try{throw 10} catch(e){return 5} }
''')
self.assertEqual(jsi.call_function('x'), 5)
def test_finally(self):
jsi = JSInterpreter('''
function x() { try{throw 10} finally {return 42} }
''')
self.assertEqual(jsi.call_function('x'), 42)
jsi = JSInterpreter('''
function x() { try{throw 10} catch(e){return 5} finally {return 42} }
''')
self.assertEqual(jsi.call_function('x'), 42)
def test_nested_try(self):
jsi = JSInterpreter('''
function x() {try {
try{throw 10} finally {throw 42}
} catch(e){return 5} }
''')
self.assertEqual(jsi.call_function('x'), 5)
def test_for_loop_continue(self): def test_for_loop_continue(self):
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { a=0; for (i=0; i-10; i++) { continue; a++ } return a } function x() { a=0; for (i=0; i-10; i++) { continue; a++ } return a }
@@ -200,6 +232,14 @@ class TestJSInterpreter(unittest.TestCase):
''') ''')
self.assertEqual(jsi.call_function('x'), 0) self.assertEqual(jsi.call_function('x'), 0)
def test_for_loop_try(self):
jsi = JSInterpreter('''
function x() {
for (i=0; i-10; i++) { try { if (i == 5) throw i} catch {return 10} finally {break} };
return 42 }
''')
self.assertEqual(jsi.call_function('x'), 42)
def test_literal_list(self): def test_literal_list(self):
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { return [1, 2, "asdf", [5, 6, 7]][3] } function x() { return [1, 2, "asdf", [5, 6, 7]][3] }
@@ -347,6 +387,27 @@ class TestJSInterpreter(unittest.TestCase):
''') ''')
self.assertEqual(jsi.call_function('x').flags & re.I, re.I) self.assertEqual(jsi.call_function('x').flags & re.I, re.I)
jsi = JSInterpreter(R'''
function x() { let a=/,][}",],()}(\[)/; return a; }
''')
self.assertEqual(jsi.call_function('x').pattern, r',][}",],()}(\[)')
def test_char_code_at(self):
jsi = JSInterpreter('function x(i){return "test".charCodeAt(i)}')
self.assertEqual(jsi.call_function('x', 0), 116)
self.assertEqual(jsi.call_function('x', 1), 101)
self.assertEqual(jsi.call_function('x', 2), 115)
self.assertEqual(jsi.call_function('x', 3), 116)
self.assertEqual(jsi.call_function('x', 4), None)
self.assertEqual(jsi.call_function('x', 'not_a_number'), 116)
def test_bitwise_operators_overflow(self):
jsi = JSInterpreter('function x(){return -524999584 << 5}')
self.assertEqual(jsi.call_function('x'), 379882496)
jsi = JSInterpreter('function x(){return 1236566549 << 5}')
self.assertEqual(jsi.call_function('x'), 915423904)
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()
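
The expected values in `test_bitwise_operators_overflow` follow from JavaScript's rule that bitwise operators work on signed 32-bit integers, so a left shift wraps around instead of growing without bound as it does in Python. The sketch below reproduces that arithmetic; `to_int32` and `js_shl` are hypothetical helper names and are not taken from the `JSInterpreter` implementation.

```python
def to_int32(value):
    """Wrap an arbitrary Python int into the signed 32-bit range."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def js_shl(a, b):
    """Emulate JavaScript's `a << b` for integer operands."""
    # JavaScript uses only the low five bits of the shift count
    return to_int32(to_int32(a) << (b & 31))


print(js_shl(-524999584, 5))  # 379882496, as asserted in the test above
print(js_shl(1236566549, 5))  # 915423904, as asserted in the test above
```

Without the wrap-around, Python would evaluate `-524999584 << 5` as `-16799986688`, which is why an interpreter written in Python has to truncate the result explicitly.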


@@ -110,6 +110,22 @@ _NSIG_TESTS = [
'https://www.youtube.com/s/player/1f7d5369/player_ias.vflset/en_US/base.js', 'https://www.youtube.com/s/player/1f7d5369/player_ias.vflset/en_US/base.js',
'batNX7sYqIJdkJ', 'IhOkL_zxbkOZBw', 'batNX7sYqIJdkJ', 'IhOkL_zxbkOZBw',
), ),
(
'https://www.youtube.com/s/player/009f1d77/player_ias.vflset/en_US/base.js',
'5dwFHw8aFWQUQtffRq', 'audescmLUzI3jw',
),
(
'https://www.youtube.com/s/player/dc0c6770/player_ias.vflset/en_US/base.js',
'5EHDMgYLV6HPGk_Mu-kk', 'n9lUJLHbxUI0GQ',
),
(
'https://www.youtube.com/s/player/113ca41c/player_ias.vflset/en_US/base.js',
'cgYl-tlYkhjT7A', 'hI7BBr2zUgcmMg',
),
(
'https://www.youtube.com/s/player/c57c113c/player_ias.vflset/en_US/base.js',
'M92UUMHa8PdvPd3wyM', '3hPqLJsiNZx7yA',
),
] ]


@@ -29,6 +29,7 @@ from .cookies import load_cookies
from .downloader import FFmpegFD, get_suitable_downloader, shorten_protocol_name from .downloader import FFmpegFD, get_suitable_downloader, shorten_protocol_name
from .downloader.rtmp import rtmpdump_version from .downloader.rtmp import rtmpdump_version
from .extractor import gen_extractor_classes, get_info_extractor from .extractor import gen_extractor_classes, get_info_extractor
from .extractor.common import UnsupportedURLIE
from .extractor.openload import PhantomJSwrapper from .extractor.openload import PhantomJSwrapper
from .minicurses import format_text from .minicurses import format_text
from .postprocessor import _PLUGIN_CLASSES as plugin_postprocessors from .postprocessor import _PLUGIN_CLASSES as plugin_postprocessors
@@ -47,7 +48,7 @@ from .postprocessor import (
get_postprocessor, get_postprocessor,
) )
from .postprocessor.ffmpeg import resolve_mapping as resolve_recode_mapping from .postprocessor.ffmpeg import resolve_mapping as resolve_recode_mapping
from .update import detect_variant from .update import REPOSITORY, current_git_head, detect_variant
from .utils import ( from .utils import (
DEFAULT_OUTTMPL, DEFAULT_OUTTMPL,
IDENTITY, IDENTITY,
@@ -89,6 +90,7 @@ from .utils import (
args_to_str, args_to_str,
bug_reports_message, bug_reports_message,
date_from_str, date_from_str,
deprecation_warning,
determine_ext, determine_ext,
determine_protocol, determine_protocol,
encode_compat_str, encode_compat_str,
@@ -115,6 +117,7 @@ from .utils import (
network_exceptions, network_exceptions,
number_of_digits, number_of_digits,
orderedSet, orderedSet,
orderedSet_from_options,
parse_filesize, parse_filesize,
preferredencoding, preferredencoding,
prepend_extension, prepend_extension,
@@ -236,7 +239,7 @@ class YoutubeDL:
Default is 'only_download' for CLI, but False for API Default is 'only_download' for CLI, but False for API
skip_playlist_after_errors: Number of allowed failures until the rest of skip_playlist_after_errors: Number of allowed failures until the rest of
the playlist is skipped the playlist is skipped
force_generic_extractor: Force downloader to use the generic extractor allowed_extractors: List of regexes to match against extractor names that are allowed
overwrites: Overwrite all video and metadata files if True, overwrites: Overwrite all video and metadata files if True,
overwrite only non-video files if None overwrite only non-video files if None
and don't overwrite any file if False and don't overwrite any file if False
@@ -301,8 +304,9 @@ class YoutubeDL:
should act on each input URL as opposed to for the entire queue should act on each input URL as opposed to for the entire queue
cookiefile: File name or text stream from where cookies should be read and dumped to cookiefile: File name or text stream from where cookies should be read and dumped to
cookiesfrombrowser: A tuple containing the name of the browser, the profile cookiesfrombrowser: A tuple containing the name of the browser, the profile
name/path from where cookies are loaded, and the name of the name/path from where cookies are loaded, the name of the keyring,
keyring, e.g. ('chrome', ) or ('vivaldi', 'default', 'BASICTEXT') and the container name, e.g. ('chrome', ) or
('vivaldi', 'default', 'BASICTEXT') or ('firefox', 'default', None, 'Meta')
legacyserverconnect: Explicitly allow HTTPS connection to servers that do not legacyserverconnect: Explicitly allow HTTPS connection to servers that do not
support RFC 5746 secure renegotiation support RFC 5746 secure renegotiation
nocheckcertificate: Do not verify SSL certificates nocheckcertificate: Do not verify SSL certificates
@@ -476,6 +480,8 @@ class YoutubeDL:
The following options are deprecated and may be removed in the future: The following options are deprecated and may be removed in the future:
force_generic_extractor: Force downloader to use the generic extractor
- Use allowed_extractors = ['generic', 'default']
playliststart: - Use playlist_items playliststart: - Use playlist_items
Playlist item to start at. Playlist item to start at.
playlistend: - Use playlist_items playlistend: - Use playlist_items
@@ -627,7 +633,7 @@ class YoutubeDL:
for msg in self.params.get('_warnings', []): for msg in self.params.get('_warnings', []):
self.report_warning(msg) self.report_warning(msg)
for msg in self.params.get('_deprecation_warnings', []): for msg in self.params.get('_deprecation_warnings', []):
self.deprecation_warning(msg) self.deprecated_feature(msg)
self.params['compat_opts'] = set(self.params.get('compat_opts', ())) self.params['compat_opts'] = set(self.params.get('compat_opts', ()))
if 'list-formats' in self.params['compat_opts']: if 'list-formats' in self.params['compat_opts']:
@@ -757,13 +763,6 @@ class YoutubeDL:
     self._ies_instances[ie_key] = ie
     ie.set_downloader(self)
-def _get_info_extractor_class(self, ie_key):
-    ie = self._ies.get(ie_key)
-    if ie is None:
-        ie = get_info_extractor(ie_key)
-        self.add_info_extractor(ie)
-    return ie
 def get_info_extractor(self, ie_key):
     """
     Get an instance of an IE with name ie_key, it will try to get one from
@@ -780,8 +779,19 @@ class YoutubeDL:
""" """
     Add the InfoExtractors returned by gen_extractors to the end of the list
     """
-    for ie in gen_extractor_classes():
-        self.add_info_extractor(ie)
+    all_ies = {ie.IE_NAME.lower(): ie for ie in gen_extractor_classes()}
+    all_ies['end'] = UnsupportedURLIE()
+    try:
+        ie_names = orderedSet_from_options(
+            self.params.get('allowed_extractors', ['default']), {
+                'all': list(all_ies),
+                'default': [name for name, ie in all_ies.items() if ie._ENABLED],
+            }, use_regex=True)
+    except re.error as e:
+        raise ValueError(f'Wrong regex for allowed_extractors: {e.pattern}')
+    for name in ie_names:
+        self.add_info_extractor(all_ies[name])
+    self.write_debug(f'Loaded {len(ie_names)} extractors')
 def add_post_processor(self, pp, when='post_process'):
     """Add a PostProcessor object to the end of the chain."""
@@ -827,9 +837,11 @@ class YoutubeDL:
 def to_stdout(self, message, skip_eol=False, quiet=None):
     """Print message to stdout"""
     if quiet is not None:
-        self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument quiet. Use "YoutubeDL.to_screen" instead')
+        self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument quiet. '
+                                 'Use "YoutubeDL.to_screen" instead')
     if skip_eol is not False:
-        self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument skip_eol. Use "YoutubeDL.to_screen" instead')
+        self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument skip_eol. '
+                                 'Use "YoutubeDL.to_screen" instead')
     self._write_string(f'{self._bidi_workaround(message)}\n', self._out_files.out)
 def to_screen(self, message, skip_eol=False, quiet=None):
@@ -965,11 +977,14 @@ class YoutubeDL:
         return
     self.to_stderr(f'{self._format_err("WARNING:", self.Styles.WARNING)} {message}', only_once)
-def deprecation_warning(self, message):
+def deprecation_warning(self, message, *, stacklevel=0):
+    deprecation_warning(
+        message, stacklevel=stacklevel + 1, printer=self.report_error, is_error=False)
+def deprecated_feature(self, message):
     if self.params.get('logger') is not None:
-        self.params['logger'].warning(f'DeprecationWarning: {message}')
-    else:
-        self.to_stderr(f'{self._format_err("DeprecationWarning:", self.Styles.ERROR)} {message}', True)
+        self.params['logger'].warning(f'Deprecated Feature: {message}')
+    self.to_stderr(f'{self._format_err("Deprecated Feature:", self.Styles.ERROR)} {message}', True)
 def report_error(self, message, *args, **kwargs):
     '''
@@ -1029,7 +1044,7 @@ class YoutubeDL:
def get_output_path(self, dir_type='', filename=None): def get_output_path(self, dir_type='', filename=None):
paths = self.params.get('paths', {}) paths = self.params.get('paths', {})
assert isinstance(paths, dict) assert isinstance(paths, dict), '"paths" parameter must be a dictionary'
path = os.path.join( path = os.path.join(
expand_path(paths.get('home', '').strip()), expand_path(paths.get('home', '').strip()),
expand_path(paths.get(dir_type, '').strip()) if dir_type else '', expand_path(paths.get(dir_type, '').strip()) if dir_type else '',
@@ -1412,11 +1427,11 @@ class YoutubeDL:
     ie_key = 'Generic'
 if ie_key:
-    ies = {ie_key: self._get_info_extractor_class(ie_key)}
+    ies = {ie_key: self._ies[ie_key]} if ie_key in self._ies else {}
 else:
     ies = self._ies
-for ie_key, ie in ies.items():
+for key, ie in ies.items():
     if not ie.suitable(url):
         continue
@@ -1425,14 +1440,16 @@ class YoutubeDL:
                                 'and will probably not work.')
         temp_id = ie.get_temp_id(url)
-        if temp_id is not None and self.in_download_archive({'id': temp_id, 'ie_key': ie_key}):
-            self.to_screen(f'[{ie_key}] {temp_id}: has already been recorded in the archive')
+        if temp_id is not None and self.in_download_archive({'id': temp_id, 'ie_key': key}):
+            self.to_screen(f'[{key}] {temp_id}: has already been recorded in the archive')
             if self.params.get('break_on_existing', False):
                 raise ExistingVideoReached()
             break
-        return self.__extract_info(url, self.get_info_extractor(ie_key), download, extra_info, process)
+        return self.__extract_info(url, self.get_info_extractor(key), download, extra_info, process)
     else:
-        self.report_error('no suitable InfoExtractor for URL %s' % url)
+        extractors_restricted = self.params.get('allowed_extractors') not in (None, ['default'])
+        self.report_error(f'No suitable extractor{format_field(ie_key, None, " (%s)")} found for URL {url}',
+                          tb=False if extractors_restricted else None)
 def _handle_extraction_exceptions(func):
     @functools.wraps(func)
@@ -2511,9 +2528,6 @@ class YoutubeDL:
'--live-from-start is passed, but there are no formats that can be downloaded from the start. ' '--live-from-start is passed, but there are no formats that can be downloaded from the start. '
'If you want to download from the current time, use --no-live-from-start')) 'If you want to download from the current time, use --no-live-from-start'))
if not formats:
self.raise_no_formats(info_dict)
def is_wellformed(f): def is_wellformed(f):
url = f.get('url') url = f.get('url')
if not url: if not url:
@@ -2526,7 +2540,10 @@ class YoutubeDL:
return True return True
# Filter out malformed formats for better extraction robustness # Filter out malformed formats for better extraction robustness
formats = list(filter(is_wellformed, formats)) formats = list(filter(is_wellformed, formats or []))
if not formats:
self.raise_no_formats(info_dict)
formats_dict = {} formats_dict = {}
@@ -2728,42 +2745,26 @@ class YoutubeDL:
 if lang not in available_subs:
     available_subs[lang] = cap_info
-if (not self.params.get('writesubtitles') and not
-        self.params.get('writeautomaticsub') or not
-        available_subs):
+if not available_subs or (
+        not self.params.get('writesubtitles')
+        and not self.params.get('writeautomaticsub')):
     return None
 all_sub_langs = tuple(available_subs.keys())
 if self.params.get('allsubtitles', False):
     requested_langs = all_sub_langs
 elif self.params.get('subtitleslangs', False):
-    # A list is used so that the order of languages will be the same as
-    # given in subtitleslangs. See https://github.com/yt-dlp/yt-dlp/issues/1041
-    requested_langs = []
-    for lang_re in self.params.get('subtitleslangs'):
-        discard = lang_re[0] == '-'
-        if discard:
-            lang_re = lang_re[1:]
-        if lang_re == 'all':
-            if discard:
-                requested_langs = []
-            else:
-                requested_langs.extend(all_sub_langs)
-            continue
-        current_langs = filter(re.compile(lang_re + '$').match, all_sub_langs)
-        if discard:
-            for lang in current_langs:
-                while lang in requested_langs:
-                    requested_langs.remove(lang)
-        else:
-            requested_langs.extend(current_langs)
-    requested_langs = orderedSet(requested_langs)
+    try:
+        requested_langs = orderedSet_from_options(
+            self.params.get('subtitleslangs'), {'all': all_sub_langs}, use_regex=True)
+    except re.error as e:
+        raise ValueError(f'Wrong regex for subtitlelangs: {e.pattern}')
 elif normal_sub_langs:
     requested_langs = ['en'] if 'en' in normal_sub_langs else normal_sub_langs[:1]
 else:
     requested_langs = ['en'] if 'en' in all_sub_langs else all_sub_langs[:1]
 if requested_langs:
-    self.write_debug('Downloading subtitles: %s' % ', '.join(requested_langs))
+    self.to_screen(f'[info] {video_id}: Downloading subtitles: {", ".join(requested_langs)}')
 formats_query = self.params.get('subtitlesformat', 'best')
 formats_preference = formats_query.split('/') if formats_query else []
@@ -3271,6 +3272,7 @@ class YoutubeDL:
self.to_screen(f'[info] {e}') self.to_screen(f'[info] {e}')
if not self.params.get('break_per_url'): if not self.params.get('break_per_url'):
raise raise
self._num_downloads = 0
else: else:
if self.params.get('dump_single_json', False): if self.params.get('dump_single_json', False):
self.post_extract(res) self.post_extract(res)
@@ -3319,6 +3321,12 @@ class YoutubeDL:
return info_dict return info_dict
info_dict.setdefault('epoch', int(time.time())) info_dict.setdefault('epoch', int(time.time()))
info_dict.setdefault('_type', 'video') info_dict.setdefault('_type', 'video')
info_dict.setdefault('_version', {
'version': __version__,
'current_git_head': current_git_head(),
'release_git_head': RELEASE_GIT_HEAD,
'repository': REPOSITORY,
})
if remove_private_keys: if remove_private_keys:
reject = lambda k, v: v is None or k.startswith('__') or k in { reject = lambda k, v: v is None or k.startswith('__') or k in {
@@ -3683,7 +3691,8 @@ class YoutubeDL:
if VARIANT not in (None, 'pip'): if VARIANT not in (None, 'pip'):
source += '*' source += '*'
write_debug(join_nonempty( write_debug(join_nonempty(
'yt-dlp version', __version__, f'{"yt-dlp" if REPOSITORY == "yt-dlp/yt-dlp" else REPOSITORY} version',
__version__,
f'[{RELEASE_GIT_HEAD}]' if RELEASE_GIT_HEAD else '', f'[{RELEASE_GIT_HEAD}]' if RELEASE_GIT_HEAD else '',
'' if source == 'unknown' else f'({source})', '' if source == 'unknown' else f'({source})',
delim=' ')) delim=' '))
@@ -3699,18 +3708,8 @@ class YoutubeDL:
 if self.params['compat_opts']:
     write_debug('Compatibility options: %s' % ', '.join(self.params['compat_opts']))
-if source == 'source':
-    try:
-        stdout, _, _ = Popen.run(
-            ['git', 'rev-parse', '--short', 'HEAD'],
-            text=True, cwd=os.path.dirname(os.path.abspath(__file__)),
-            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
-        if re.fullmatch('[0-9a-f]+', stdout.strip()):
-            write_debug(f'Git HEAD: {stdout.strip()}')
-    except Exception:
-        with contextlib.suppress(Exception):
-            sys.exc_clear()
+if current_git_head():
+    write_debug(f'Git HEAD: {current_git_head()}')
 write_debug(system_identifier())
 exe_versions, ffmpeg_features = FFmpegPostProcessor.get_versions_and_features(self)
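
The subtitle-language loop removed in the hunk above documents the selection rules that the new `orderedSet_from_options` call is expected to cover: a leading `-` discards matches, `all` is an alias for every available entry, and anything else is treated as a regex. The following is a minimal sketch of those rules reconstructed from the removed loop only; `select_languages` is a hypothetical name, and this is not the actual helper in `yt_dlp.utils`.

```python
import re


def select_languages(requested, available):
    """Resolve patterns such as ['all', '-live_chat'] against `available`.

    Hypothetical stand-in modelled on the removed inline loop; order of
    first selection is preserved and duplicates are dropped.
    """
    selected = []
    for pattern in requested:
        discard = pattern.startswith('-')
        if discard:
            pattern = pattern[1:]
        if pattern == 'all':
            matches = list(available)
        else:
            # The removed code anchored the regex with '$'; fullmatch is used here
            matches = [lang for lang in available if re.fullmatch(pattern, lang)]
        if discard:
            selected = [lang for lang in selected if lang not in matches]
        else:
            selected.extend(lang for lang in matches if lang not in selected)
    return selected


print(select_languages(['all', '-live_chat'], ['en', 'ja', 'live_chat']))
```

The same pattern list shape is what `allowed_extractors` passes in the hunk for `add_default_info_extractors`, with an additional `default` alias.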


@@ -63,6 +63,8 @@ from .utils import (
) )
from .YoutubeDL import YoutubeDL from .YoutubeDL import YoutubeDL
_IN_CLI = False
def _exit(status=0, *args): def _exit(status=0, *args):
for msg in args: for msg in args:
@@ -344,10 +346,16 @@ def validate_options(opts):
 # Cookies from browser
 if opts.cookiesfrombrowser:
-    mobj = re.match(r'(?P<name>[^+:]+)(\s*\+\s*(?P<keyring>[^:]+))?(\s*:(?P<profile>.+))?', opts.cookiesfrombrowser)
+    container = None
+    mobj = re.fullmatch(r'''(?x)
+        (?P<name>[^+:]+)
+        (?:\s*\+\s*(?P<keyring>[^:]+))?
+        (?:\s*:\s*(?P<profile>.+?))?
+        (?:\s*::\s*(?P<container>.+))?
+    ''', opts.cookiesfrombrowser)
     if mobj is None:
         raise ValueError(f'invalid cookies from browser arguments: {opts.cookiesfrombrowser}')
-    browser_name, keyring, profile = mobj.group('name', 'keyring', 'profile')
+    browser_name, keyring, profile, container = mobj.group('name', 'keyring', 'profile', 'container')
     browser_name = browser_name.lower()
     if browser_name not in SUPPORTED_BROWSERS:
         raise ValueError(f'unsupported browser specified for cookies: "{browser_name}". '
@@ -357,7 +365,7 @@ def validate_options(opts):
if keyring not in SUPPORTED_KEYRINGS: if keyring not in SUPPORTED_KEYRINGS:
raise ValueError(f'unsupported keyring specified for cookies: "{keyring}". ' raise ValueError(f'unsupported keyring specified for cookies: "{keyring}". '
f'Supported keyrings are: {", ".join(sorted(SUPPORTED_KEYRINGS))}') f'Supported keyrings are: {", ".join(sorted(SUPPORTED_KEYRINGS))}')
opts.cookiesfrombrowser = (browser_name, profile, keyring) opts.cookiesfrombrowser = (browser_name, profile, keyring, container)
# MetadataParser # MetadataParser
def metadataparser_actions(f): def metadataparser_actions(f):
@@ -766,6 +774,7 @@ def parse_options(argv=None):
'windowsfilenames': opts.windowsfilenames, 'windowsfilenames': opts.windowsfilenames,
'ignoreerrors': opts.ignoreerrors, 'ignoreerrors': opts.ignoreerrors,
'force_generic_extractor': opts.force_generic_extractor, 'force_generic_extractor': opts.force_generic_extractor,
'allowed_extractors': opts.allowed_extractors or ['default'],
'ratelimit': opts.ratelimit, 'ratelimit': opts.ratelimit,
'throttledratelimit': opts.throttledratelimit, 'throttledratelimit': opts.throttledratelimit,
'overwrites': opts.overwrites, 'overwrites': opts.overwrites,
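
To illustrate how the new `cookiesfrombrowser` pattern in `validate_options` splits a specification into its four parts, here is a small standalone sketch. The regex is copied from the hunk above; `parse_browser_spec` is a hypothetical wrapper, and the browser and keyring validation done by the real code is left out.

```python
import re

# Pattern copied from the diff: BROWSER[+KEYRING][:PROFILE][::CONTAINER]
_SPEC_RE = re.compile(r'''(?x)
    (?P<name>[^+:]+)
    (?:\s*\+\s*(?P<keyring>[^:]+))?
    (?:\s*:\s*(?P<profile>.+?))?
    (?:\s*::\s*(?P<container>.+))?
''')


def parse_browser_spec(spec):
    """Return (browser_name, profile, keyring, container), the tuple order used in the diff."""
    mobj = _SPEC_RE.fullmatch(spec)
    if mobj is None:
        raise ValueError(f'invalid cookies from browser arguments: {spec}')
    name, keyring, profile, container = mobj.group('name', 'keyring', 'profile', 'container')
    return name.lower(), profile, keyring, container


print(parse_browser_spec('firefox:default::Meta'))
```

Only specifications that give both a profile and a container, or neither, are exercised here; other combinations are not covered by this sketch.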


@@ -14,4 +14,5 @@ if __package__ is None and not hasattr(sys, 'frozen'):
import yt_dlp import yt_dlp
if __name__ == '__main__': if __name__ == '__main__':
yt_dlp._IN_CLI = True
yt_dlp.main() yt_dlp.main()


@@ -6,7 +6,8 @@ import re
import shutil import shutil
import traceback import traceback
from .utils import expand_path, write_json_file from .utils import expand_path, traverse_obj, version_tuple, write_json_file
from .version import __version__
class Cache: class Cache:
@@ -45,12 +46,20 @@ class Cache:
             if ose.errno != errno.EEXIST:
                 raise
         self._ydl.write_debug(f'Saving {section}.{key} to cache')
-        write_json_file(data, fn)
+        write_json_file({'yt-dlp_version': __version__, 'data': data}, fn)
     except Exception:
         tb = traceback.format_exc()
         self._ydl.report_warning(f'Writing cache to {fn!r} failed: {tb}')
-def load(self, section, key, dtype='json', default=None):
+def _validate(self, data, min_ver):
+    version = traverse_obj(data, 'yt-dlp_version')
+    if not version:  # Backward compatibility
+        data, version = {'data': data}, '2022.08.19'
+    if not min_ver or version_tuple(version) >= version_tuple(min_ver):
+        return data['data']
+    self._ydl.write_debug(f'Discarding old cache from version {version} (needs {min_ver})')
+def load(self, section, key, dtype='json', default=None, *, min_ver=None):
     assert dtype in ('json',)
     if not self.enabled:
@@ -61,8 +70,8 @@ class Cache:
 try:
     with open(cache_fn, encoding='utf-8') as cachef:
         self._ydl.write_debug(f'Loading {section}.{key} from cache')
-        return json.load(cachef)
+        return self._validate(json.load(cachef), min_ver)
-except ValueError:
+except (ValueError, KeyError):
     try:
         file_size = os.path.getsize(cache_fn)
     except OSError as oe:
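
The cache change wraps stored data in a `{'yt-dlp_version': ..., 'data': ...}` envelope and lets callers reject entries written by older versions. A minimal standalone sketch of that check follows, assuming dotted numeric version strings. `validate_cache` and the simplified `version_tuple` are stand-ins written for this example, not the functions from `yt_dlp.utils`.

```python
def version_tuple(version):
    # Simplified stand-in: assumes purely numeric, dot-separated versions
    return tuple(int(part) for part in version.split('.'))


def validate_cache(data, min_ver=None, legacy_version='2022.08.19'):
    """Return the cached payload, or None if it predates `min_ver`."""
    version = data.get('yt-dlp_version') if isinstance(data, dict) else None
    if not version:
        # Files written before the envelope existed hold the bare payload
        data, version = {'data': data}, legacy_version
    if not min_ver or version_tuple(version) >= version_tuple(min_ver):
        return data['data']
    return None


print(validate_cache({'yt-dlp_version': '2022.09.01', 'data': [1, 2]}, '2022.08.31'))
```

Treating unversioned files as belonging to a fixed older release, as the diff does, keeps existing caches usable unless a caller explicitly asks for something newer.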


@@ -3,6 +3,7 @@ import contextlib
import http.cookiejar import http.cookiejar
import json import json
import os import os
import re
import shutil import shutil
import struct import struct
import subprocess import subprocess
@@ -24,7 +25,13 @@ from .dependencies import (
sqlite3, sqlite3,
) )
from .minicurses import MultilinePrinter, QuietMultilinePrinter from .minicurses import MultilinePrinter, QuietMultilinePrinter
from .utils import Popen, YoutubeDLCookieJar, error_to_str, expand_path from .utils import (
Popen,
YoutubeDLCookieJar,
error_to_str,
expand_path,
try_call,
)
CHROMIUM_BASED_BROWSERS = {'brave', 'chrome', 'chromium', 'edge', 'opera', 'vivaldi'} CHROMIUM_BASED_BROWSERS = {'brave', 'chrome', 'chromium', 'edge', 'opera', 'vivaldi'}
SUPPORTED_BROWSERS = CHROMIUM_BASED_BROWSERS | {'firefox', 'safari'} SUPPORTED_BROWSERS = CHROMIUM_BASED_BROWSERS | {'firefox', 'safari'}
@@ -85,8 +92,9 @@ def _create_progress_bar(logger):
def load_cookies(cookie_file, browser_specification, ydl): def load_cookies(cookie_file, browser_specification, ydl):
cookie_jars = [] cookie_jars = []
if browser_specification is not None: if browser_specification is not None:
browser_name, profile, keyring = _parse_browser_specification(*browser_specification) browser_name, profile, keyring, container = _parse_browser_specification(*browser_specification)
cookie_jars.append(extract_cookies_from_browser(browser_name, profile, YDLLogger(ydl), keyring=keyring)) cookie_jars.append(
extract_cookies_from_browser(browser_name, profile, YDLLogger(ydl), keyring=keyring, container=container))
if cookie_file is not None: if cookie_file is not None:
is_filename = YoutubeDLCookieJar.is_path(cookie_file) is_filename = YoutubeDLCookieJar.is_path(cookie_file)
@@ -101,9 +109,9 @@ def load_cookies(cookie_file, browser_specification, ydl):
return _merge_cookie_jars(cookie_jars) return _merge_cookie_jars(cookie_jars)
def extract_cookies_from_browser(browser_name, profile=None, logger=YDLLogger(), *, keyring=None): def extract_cookies_from_browser(browser_name, profile=None, logger=YDLLogger(), *, keyring=None, container=None):
if browser_name == 'firefox': if browser_name == 'firefox':
return _extract_firefox_cookies(profile, logger) return _extract_firefox_cookies(profile, container, logger)
elif browser_name == 'safari': elif browser_name == 'safari':
return _extract_safari_cookies(profile, logger) return _extract_safari_cookies(profile, logger)
elif browser_name in CHROMIUM_BASED_BROWSERS: elif browser_name in CHROMIUM_BASED_BROWSERS:
@@ -112,7 +120,7 @@ def extract_cookies_from_browser(browser_name, profile=None, logger=YDLLogger(),
raise ValueError(f'unknown browser: {browser_name}') raise ValueError(f'unknown browser: {browser_name}')
def _extract_firefox_cookies(profile, logger): def _extract_firefox_cookies(profile, container, logger):
logger.info('Extracting cookies from firefox') logger.info('Extracting cookies from firefox')
if not sqlite3: if not sqlite3:
logger.warning('Cannot extract cookies from firefox without sqlite3 support. ' logger.warning('Cannot extract cookies from firefox without sqlite3 support. '
@@ -131,11 +139,36 @@ def _extract_firefox_cookies(profile, logger):
     raise FileNotFoundError(f'could not find firefox cookies database in {search_root}')
 logger.debug(f'Extracting cookies from: "{cookie_database_path}"')
+container_id = None
+if container not in (None, 'none'):
+    containers_path = os.path.join(os.path.dirname(cookie_database_path), 'containers.json')
+    if not os.path.isfile(containers_path) or not os.access(containers_path, os.R_OK):
+        raise FileNotFoundError(f'could not read containers.json in {search_root}')
+    with open(containers_path) as containers:
+        identities = json.load(containers).get('identities', [])
+    container_id = next((context.get('userContextId') for context in identities if container in (
+        context.get('name'),
+        try_call(lambda: re.fullmatch(r'userContext([^\.]+)\.label', context['l10nID']).group())
+    )), None)
+    if not isinstance(container_id, int):
+        raise ValueError(f'could not find firefox container "{container}" in containers.json')
 with tempfile.TemporaryDirectory(prefix='yt_dlp') as tmpdir:
     cursor = None
     try:
         cursor = _open_database_copy(cookie_database_path, tmpdir)
-        cursor.execute('SELECT host, name, value, path, expiry, isSecure FROM moz_cookies')
+        if isinstance(container_id, int):
+            logger.debug(
+                f'Only loading cookies from firefox container "{container}", ID {container_id}')
+            cursor.execute(
+                'SELECT host, name, value, path, expiry, isSecure FROM moz_cookies WHERE originAttributes LIKE ? OR originAttributes LIKE ?',
+                (f'%userContextId={container_id}', f'%userContextId={container_id}&%'))
+        elif container == 'none':
+            logger.debug('Only loading cookies not belonging to any container')
+            cursor.execute(
+                'SELECT host, name, value, path, expiry, isSecure FROM moz_cookies WHERE NOT INSTR(originAttributes,"userContextId=")')
+        else:
+            cursor.execute('SELECT host, name, value, path, expiry, isSecure FROM moz_cookies')
         jar = YoutubeDLCookieJar()
         with _create_progress_bar(logger) as progress_bar:
             table = cursor.fetchall()
@@ -948,11 +981,11 @@ def _is_path(value):
     return os.path.sep in value
-def _parse_browser_specification(browser_name, profile=None, keyring=None):
+def _parse_browser_specification(browser_name, profile=None, keyring=None, container=None):
     if browser_name not in SUPPORTED_BROWSERS:
         raise ValueError(f'unsupported browser: "{browser_name}"')
     if keyring not in (None, *SUPPORTED_KEYRINGS):
         raise ValueError(f'unsupported keyring: "{keyring}"')
     if profile is not None and _is_path(profile):
         profile = os.path.expanduser(profile)
-    return browser_name, profile, keyring
+    return browser_name, profile, keyring, container
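
The container filter added to `_extract_firefox_cookies` relies on two `LIKE` patterns so that `userContextId=2` does not also match `userContextId=21`. The sketch below tries the same two queries against an in-memory table. The table has only two columns and the sample `originAttributes` values are made up for illustration; they are not taken from a real Firefox profile.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE moz_cookies (name TEXT, originAttributes TEXT)')
conn.executemany('INSERT INTO moz_cookies VALUES (?, ?)', [
    ('plain', ''),                                                 # no container
    ('work', '^userContextId=2'),                                  # container 2
    ('work_fpi', '^userContextId=2&firstPartyDomain=example.com'), # container 2 plus another attribute
    ('other', '^userContextId=21'),                                # container 21
])


def cookies_for_container(conn, container_id):
    # Match the ID at the end of the attribute string, or followed by '&'
    rows = conn.execute(
        'SELECT name FROM moz_cookies WHERE originAttributes LIKE ? OR originAttributes LIKE ?',
        (f'%userContextId={container_id}', f'%userContextId={container_id}&%'))
    return sorted(row[0] for row in rows)


def cookies_without_container(conn):
    # INSTR returns 0 when the substring is absent
    rows = conn.execute(
        "SELECT name FROM moz_cookies WHERE NOT INSTR(originAttributes, 'userContextId=')")
    return sorted(row[0] for row in rows)


print(cookies_for_container(conn, 2))
print(cookies_without_container(conn))
```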


@@ -92,6 +92,7 @@ class FileDownloader:
for func in ( for func in (
'deprecation_warning', 'deprecation_warning',
'deprecated_feature',
'report_error', 'report_error',
'report_file_already_downloaded', 'report_file_already_downloaded',
'report_warning', 'report_warning',


@@ -515,16 +515,14 @@ _BY_NAME = {
     if name.endswith('FD') and name not in ('ExternalFD', 'FragmentFD')
 }
-_BY_EXE = {klass.EXE_NAME: klass for klass in _BY_NAME.values()}
 def list_external_downloaders():
     return sorted(_BY_NAME.keys())
 def get_external_downloader(external_downloader):
-    """ Given the name of the executable, see whether we support the given
-    downloader . """
-    # Drop .exe extension on Windows
+    """ Given the name of the executable, see whether we support the given downloader """
     bn = os.path.splitext(os.path.basename(external_downloader))[0]
-    return _BY_NAME.get(bn, _BY_EXE.get(bn))
+    return _BY_NAME.get(bn) or next((
+        klass for klass in _BY_NAME.values() if klass.EXE_NAME in bn
+    ), None)
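
The effect of the new lookup in `get_external_downloader` is that an exact name match is tried first, and otherwise any downloader whose `EXE_NAME` occurs inside the executable's base name is accepted. A minimal sketch with two hypothetical stand-in classes (the real `_BY_NAME` table is built from the downloader classes in the module):

```python
import os


class Aria2cFD:  # hypothetical stand-in for a downloader class
    EXE_NAME = 'aria2c'


class FFmpegFD:  # hypothetical stand-in for a downloader class
    EXE_NAME = 'ffmpeg'


_BY_NAME = {'aria2c': Aria2cFD, 'ffmpeg': FFmpegFD}


def get_external_downloader(external_downloader):
    # Strip the directory and the extension (e.g. '.exe') before matching
    bn = os.path.splitext(os.path.basename(external_downloader))[0]
    return _BY_NAME.get(bn) or next(
        (klass for klass in _BY_NAME.values() if klass.EXE_NAME in bn), None)


print(get_external_downloader('/opt/bin/ffmpeg-static'))
```

Substring matching is more permissive than the removed `_BY_EXE` dictionary lookup, so renamed or suffixed binaries are recognised.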


@@ -65,8 +65,8 @@ class FragmentFD(FileDownloader):
""" """
def report_retry_fragment(self, err, frag_index, count, retries): def report_retry_fragment(self, err, frag_index, count, retries):
self.deprecation_warning( self.deprecation_warning('yt_dlp.downloader.FragmentFD.report_retry_fragment is deprecated. '
'yt_dlp.downloader.FragmentFD.report_retry_fragment is deprecated. Use yt_dlp.downloader.FileDownloader.report_retry instead') 'Use yt_dlp.downloader.FileDownloader.report_retry instead')
return self.report_retry(err, count, retries, frag_index) return self.report_retry(err, count, retries, frag_index)
def report_skip_fragment(self, frag_index, err=None): def report_skip_fragment(self, frag_index, err=None):


@@ -1,5 +1,28 @@
# flake8: noqa: F401 # flake8: noqa: F401
from .youtube import ( # Youtube is moved to the top to improve performance
YoutubeIE,
YoutubeClipIE,
YoutubeFavouritesIE,
YoutubeNotificationsIE,
YoutubeHistoryIE,
YoutubeTabIE,
YoutubeLivestreamEmbedIE,
YoutubePlaylistIE,
YoutubeRecommendedIE,
YoutubeSearchDateIE,
YoutubeSearchIE,
YoutubeSearchURLIE,
YoutubeMusicSearchURLIE,
YoutubeSubscriptionsIE,
YoutubeStoriesIE,
YoutubeTruncatedIDIE,
YoutubeTruncatedURLIE,
YoutubeYtBeIE,
YoutubeYtUserIE,
YoutubeWatchLaterIE,
)
from .abc import ( from .abc import (
ABCIE, ABCIE,
ABCIViewIE, ABCIViewIE,
@@ -470,6 +493,7 @@ from .epicon import (
EpiconIE, EpiconIE,
EpiconSeriesIE, EpiconSeriesIE,
) )
from .epoch import EpochIE
from .eporner import EpornerIE from .eporner import EpornerIE
from .eroprofile import ( from .eroprofile import (
EroProfileIE, EroProfileIE,
@@ -491,6 +515,7 @@ from .espn import (
from .esri import EsriVideoIE from .esri import EsriVideoIE
from .europa import EuropaIE from .europa import EuropaIE
from .europeantour import EuropeanTourIE from .europeantour import EuropeanTourIE
from .eurosport import EurosportIE
from .euscreen import EUScreenIE from .euscreen import EUScreenIE
from .expotv import ExpoTVIE from .expotv import ExpoTVIE
from .expressen import ExpressenIE from .expressen import ExpressenIE
@@ -720,6 +745,10 @@ from .iqiyi import (
IqIE, IqIE,
IqAlbumIE IqAlbumIE
) )
from .islamchannel import (
IslamChannelIE,
IslamChannelSeriesIE,
)
from .itprotv import ( from .itprotv import (
ITProTVIE, ITProTVIE,
ITProTVCourseIE ITProTVCourseIE
@@ -1079,6 +1108,7 @@ from .newgrounds import (
NewgroundsPlaylistIE, NewgroundsPlaylistIE,
NewgroundsUserIE, NewgroundsUserIE,
) )
from .newspicks import NewsPicksIE
from .newstube import NewstubeIE from .newstube import NewstubeIE
from .newsy import NewsyIE from .newsy import NewsyIE
from .nextmedia import ( from .nextmedia import (
@@ -1728,6 +1758,12 @@ from .telequebec import (
from .teletask import TeleTaskIE from .teletask import TeleTaskIE
from .telewebion import TelewebionIE from .telewebion import TelewebionIE
from .tempo import TempoIE from .tempo import TempoIE
from .tencent import (
VQQSeriesIE,
VQQVideoIE,
WeTvEpisodeIE,
WeTvSeriesIE,
)
from .tennistv import TennisTVIE from .tennistv import TennisTVIE
from .tenplay import TenPlayIE from .tenplay import TenPlayIE
from .testurl import TestURLIE from .testurl import TestURLIE
@@ -1787,6 +1823,10 @@ from .toongoggles import ToonGogglesIE
from .toutv import TouTvIE from .toutv import TouTvIE
from .toypics import ToypicsUserIE, ToypicsIE from .toypics import ToypicsUserIE, ToypicsIE
from .traileraddict import TrailerAddictIE from .traileraddict import TrailerAddictIE
from .triller import (
TrillerIE,
TrillerUserIE,
)
from .trilulilu import TriluliluIE from .trilulilu import TriluliluIE
from .trovo import ( from .trovo import (
TrovoIE, TrovoIE,
@@ -2092,7 +2132,6 @@ from .weibo import (
WeiboMobileIE WeiboMobileIE
) )
from .weiqitv import WeiqiTVIE from .weiqitv import WeiqiTVIE
from .wetv import WeTvEpisodeIE, WeTvSeriesIE
from .wikimedia import WikimediaIE from .wikimedia import WikimediaIE
from .willow import WillowIE from .willow import WillowIE
from .wimtv import WimTVIE from .wimtv import WimTVIE
@@ -2175,28 +2214,6 @@ from .younow import (
from .youporn import YouPornIE from .youporn import YouPornIE
from .yourporn import YourPornIE from .yourporn import YourPornIE
from .yourupload import YourUploadIE from .yourupload import YourUploadIE
from .youtube import (
YoutubeIE,
YoutubeClipIE,
YoutubeFavouritesIE,
YoutubeNotificationsIE,
YoutubeHistoryIE,
YoutubeTabIE,
YoutubeLivestreamEmbedIE,
YoutubePlaylistIE,
YoutubeRecommendedIE,
YoutubeSearchDateIE,
YoutubeSearchIE,
YoutubeSearchURLIE,
YoutubeMusicSearchURLIE,
YoutubeSubscriptionsIE,
YoutubeStoriesIE,
YoutubeTruncatedIDIE,
YoutubeTruncatedURLIE,
YoutubeYtBeIE,
YoutubeYtUserIE,
YoutubeWatchLaterIE,
)
from .zapiks import ZapiksIE from .zapiks import ZapiksIE
from .zattoo import ( from .zattoo import (
BBVTVIE, BBVTVIE,


@@ -95,24 +95,24 @@ class ArteTVIE(ArteTVBaseIE):
# all obtained by exhaustive testing # all obtained by exhaustive testing
_COUNTRIES_MAP = { _COUNTRIES_MAP = {
'DE_FR': { 'DE_FR': (
'BL', 'DE', 'FR', 'GF', 'GP', 'MF', 'MQ', 'NC', 'BL', 'DE', 'FR', 'GF', 'GP', 'MF', 'MQ', 'NC',
'PF', 'PM', 'RE', 'WF', 'YT', 'PF', 'PM', 'RE', 'WF', 'YT',
}, ),
# with both of the below 'BE' sometimes works, sometimes doesn't # with both of the below 'BE' sometimes works, sometimes doesn't
'EUR_DE_FR': { 'EUR_DE_FR': (
'AT', 'BL', 'CH', 'DE', 'FR', 'GF', 'GP', 'LI', 'AT', 'BL', 'CH', 'DE', 'FR', 'GF', 'GP', 'LI',
'MC', 'MF', 'MQ', 'NC', 'PF', 'PM', 'RE', 'WF', 'MC', 'MF', 'MQ', 'NC', 'PF', 'PM', 'RE', 'WF',
'YT', 'YT',
}, ),
'SAT': { 'SAT': (
'AD', 'AT', 'AX', 'BG', 'BL', 'CH', 'CY', 'CZ', 'AD', 'AT', 'AX', 'BG', 'BL', 'CH', 'CY', 'CZ',
'DE', 'DK', 'EE', 'ES', 'FI', 'FR', 'GB', 'GF', 'DE', 'DK', 'EE', 'ES', 'FI', 'FR', 'GB', 'GF',
'GR', 'HR', 'HU', 'IE', 'IS', 'IT', 'KN', 'LI', 'GR', 'HR', 'HU', 'IE', 'IS', 'IT', 'KN', 'LI',
'LT', 'LU', 'LV', 'MC', 'MF', 'MQ', 'MT', 'NC', 'LT', 'LU', 'LV', 'MC', 'MF', 'MQ', 'MT', 'NC',
'NL', 'NO', 'PF', 'PL', 'PM', 'PT', 'RE', 'RO', 'NL', 'NO', 'PF', 'PL', 'PM', 'PT', 'RE', 'RO',
'SE', 'SI', 'SK', 'SM', 'VA', 'WF', 'YT', 'SE', 'SI', 'SK', 'SM', 'VA', 'WF', 'YT',
}, ),
} }
def _real_extract(self, url): def _real_extract(self, url):


@@ -218,6 +218,9 @@ class BiliBiliIE(InfoExtractor):
durl = traverse_obj(video_info, ('dash', 'video')) durl = traverse_obj(video_info, ('dash', 'video'))
audios = traverse_obj(video_info, ('dash', 'audio')) or [] audios = traverse_obj(video_info, ('dash', 'audio')) or []
flac_audio = traverse_obj(video_info, ('dash', 'flac', 'audio'))
if flac_audio:
audios.append(flac_audio)
entries = [] entries = []
RENDITIONS = ('qn=80&quality=80&type=', 'quality=2&type=mp4') RENDITIONS = ('qn=80&quality=80&type=', 'quality=2&type=mp4')
@@ -620,14 +623,15 @@ class BiliBiliSearchIE(SearchInfoExtractor):
'keyword': query, 'keyword': query,
'page': page_num, 'page': page_num,
'context': '', 'context': '',
'order': 'pubdate',
'duration': 0, 'duration': 0,
'tids_2': '', 'tids_2': '',
'__refresh__': 'true', '__refresh__': 'true',
'search_type': 'video', 'search_type': 'video',
'tids': 0, 'tids': 0,
'highlight': 1, 'highlight': 1,
})['data'].get('result') or [] })['data'].get('result')
if not videos:
break
for video in videos: for video in videos:
yield self.url_result(video['arcurl'], 'BiliBili', str(video['aid'])) yield self.url_result(video['arcurl'], 'BiliBili', str(video['aid']))


@@ -65,10 +65,12 @@ class BitChuteIE(InfoExtractor):
error = self._html_search_regex(r'<h1 class="page-title">([^<]+)</h1>', webpage, 'error', default='Cannot find video') error = self._html_search_regex(r'<h1 class="page-title">([^<]+)</h1>', webpage, 'error', default='Cannot find video')
if error == 'Video Unavailable': if error == 'Video Unavailable':
raise GeoRestrictedError(error) raise GeoRestrictedError(error)
raise ExtractorError(error) raise ExtractorError(error, expected=True)
formats = entries[0]['formats'] formats = entries[0]['formats']
self._check_formats(formats, video_id) self._check_formats(formats, video_id)
if not formats:
raise self.raise_no_formats('Video is unavailable', expected=True, video_id=video_id)
self._sort_formats(formats) self._sort_formats(formats)
description = self._html_search_regex( description = self._html_search_regex(


@@ -480,6 +480,9 @@ class InfoExtractor:
will be used by geo restriction bypass mechanism similarly will be used by geo restriction bypass mechanism similarly
to _GEO_COUNTRIES. to _GEO_COUNTRIES.
The _ENABLED attribute should be set to False for IEs that
are disabled by default and must be explicitly enabled.
The _WORKING attribute should be set to False for broken IEs The _WORKING attribute should be set to False for broken IEs
in order to warn the users and skip the tests. in order to warn the users and skip the tests.
""" """
@@ -491,6 +494,7 @@ class InfoExtractor:
_GEO_COUNTRIES = None _GEO_COUNTRIES = None
_GEO_IP_BLOCKS = None _GEO_IP_BLOCKS = None
_WORKING = True _WORKING = True
_ENABLED = True
_NETRC_MACHINE = None _NETRC_MACHINE = None
IE_DESC = None IE_DESC = None
SEARCH_KEY = None SEARCH_KEY = None
@@ -1689,7 +1693,7 @@ class InfoExtractor:
'order_free': ('webm', 'mp4', 'flv', '', 'none')}, 'order_free': ('webm', 'mp4', 'flv', '', 'none')},
'aext': {'type': 'ordered', 'field': 'audio_ext', 'aext': {'type': 'ordered', 'field': 'audio_ext',
'order': ('m4a', 'aac', 'mp3', 'ogg', 'opus', 'webm', '', 'none'), 'order': ('m4a', 'aac', 'mp3', 'ogg', 'opus', 'webm', '', 'none'),
'order_free': ('opus', 'ogg', 'webm', 'm4a', 'mp3', 'aac', '', 'none')}, 'order_free': ('ogg', 'opus', 'webm', 'mp3', 'm4a', 'aac', '', 'none')},
'hidden': {'visible': False, 'forced': True, 'type': 'extractor', 'max': -1000}, 'hidden': {'visible': False, 'forced': True, 'type': 'extractor', 'max': -1000},
'aud_or_vid': {'visible': False, 'forced': True, 'type': 'multiple', 'aud_or_vid': {'visible': False, 'forced': True, 'type': 'multiple',
'field': ('vcodec', 'acodec'), 'field': ('vcodec', 'acodec'),
@@ -1762,9 +1766,8 @@ class InfoExtractor:
if field not in self.settings: if field not in self.settings:
if key in ('forced', 'priority'): if key in ('forced', 'priority'):
return False return False
self.ydl.deprecation_warning( self.ydl.deprecated_feature(f'Using arbitrary fields ({field}) for format sorting is '
f'Using arbitrary fields ({field}) for format sorting is deprecated ' 'deprecated and may be removed in a future version')
'and may be removed in a future version')
self.settings[field] = {} self.settings[field] = {}
propObj = self.settings[field] propObj = self.settings[field]
if key not in propObj: if key not in propObj:
@@ -1849,9 +1852,8 @@ class InfoExtractor:
if self._get_field_setting(field, 'type') == 'alias': if self._get_field_setting(field, 'type') == 'alias':
alias, field = field, self._get_field_setting(field, 'field') alias, field = field, self._get_field_setting(field, 'field')
if self._get_field_setting(alias, 'deprecated'): if self._get_field_setting(alias, 'deprecated'):
self.ydl.deprecation_warning( self.ydl.deprecated_feature(f'Format sorting alias {alias} is deprecated and may '
f'Format sorting alias {alias} is deprecated ' f'be removed in a future version. Please use {field} instead')
f'and may be removed in a future version. Please use {field} instead')
reverse = match.group('reverse') is not None reverse = match.group('reverse') is not None
closest = match.group('separator') == '~' closest = match.group('separator') == '~'
limit_text = match.group('limit') limit_text = match.group('limit')
@@ -3258,7 +3260,7 @@ class InfoExtractor:
'subtitles': {}, 'subtitles': {},
} }
media_attributes = extract_attributes(media_tag) media_attributes = extract_attributes(media_tag)
src = strip_or_none(media_attributes.get('src')) src = strip_or_none(dict_get(media_attributes, ('src', 'data-video-src', 'data-src', 'data-source')))
if src: if src:
f = parse_content_type(media_attributes.get('type')) f = parse_content_type(media_attributes.get('type'))
_, formats = _media_formats(src, media_type, f) _, formats = _media_formats(src, media_type, f)
@@ -3269,7 +3271,7 @@ class InfoExtractor:
s_attr = extract_attributes(source_tag) s_attr = extract_attributes(source_tag)
# data-video-src and data-src are non standard but seen # data-video-src and data-src are non standard but seen
# several times in the wild # several times in the wild
src = strip_or_none(dict_get(s_attr, ('src', 'data-video-src', 'data-src'))) src = strip_or_none(dict_get(s_attr, ('src', 'data-video-src', 'data-src', 'data-source')))
if not src: if not src:
continue continue
f = parse_content_type(s_attr.get('type')) f = parse_content_type(s_attr.get('type'))
@@ -3872,7 +3874,7 @@ class InfoExtractor:
def _extract_from_webpage(cls, url, webpage): def _extract_from_webpage(cls, url, webpage):
for embed_url in orderedSet( for embed_url in orderedSet(
cls._extract_embed_urls(url, webpage) or [], lazy=True): cls._extract_embed_urls(url, webpage) or [], lazy=True):
yield cls.url_result(embed_url, cls) yield cls.url_result(embed_url, None if cls._VALID_URL is False else cls)
@classmethod @classmethod
def _extract_embed_urls(cls, url, webpage): def _extract_embed_urls(cls, url, webpage):
@@ -3941,3 +3943,12 @@ class SearchInfoExtractor(InfoExtractor):
@classproperty @classproperty
def SEARCH_KEY(cls): def SEARCH_KEY(cls):
return cls._SEARCH_KEY return cls._SEARCH_KEY
class UnsupportedURLIE(InfoExtractor):
_VALID_URL = '.*'
_ENABLED = False
IE_DESC = False
def _real_extract(self, url):
raise UnsupportedError(url)
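
Note on the `src` lookups in the HTML5 media hunks above: they fall back through several attribute names (`src`, `data-video-src`, `data-src`, `data-source`) and take the first usable one. The sketch below illustrates that fallback with a stand-in helper. `first_present` is a hypothetical name written for this note, not yt-dlp's `dict_get`, and its exact skipping rules are an assumption.

```python
def first_present(attrs, keys, default=None):
    """Return the value of the first key in `keys` that is set and non-empty.

    Hypothetical stand-in for a "first matching key" lookup; not yt-dlp code.
    """
    for key in keys:
        value = attrs.get(key)
        if value is not None and value != '':
            return value
    return default


# Attribute names taken from the diff; the dictionaries are made-up examples.
SRC_KEYS = ('src', 'data-video-src', 'data-src', 'data-source')
```

With this helper, a tag that only carries `data-source` still yields a URL, while a tag with none of the attributes yields `None` and can be skipped.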


@@ -720,15 +720,20 @@ class CrunchyrollBetaBaseIE(CrunchyrollBaseIE):
def _get_params(self, lang): def _get_params(self, lang):
if not CrunchyrollBetaBaseIE.params: if not CrunchyrollBetaBaseIE.params:
if self._get_cookies(f'https://beta.crunchyroll.com/{lang}').get('etp_rt'):
grant_type, key = 'etp_rt_cookie', 'accountAuthClientId'
else:
grant_type, key = 'client_id', 'anonClientId'
initial_state, app_config = self._get_beta_embedded_json(self._download_webpage( initial_state, app_config = self._get_beta_embedded_json(self._download_webpage(
f'https://beta.crunchyroll.com/{lang}', None, note='Retrieving main page'), None) f'https://beta.crunchyroll.com/{lang}', None, note='Retrieving main page'), None)
api_domain = app_config['cxApiParams']['apiDomain'] api_domain = app_config['cxApiParams']['apiDomain']
basic_token = str(base64.b64encode(('%s:' % app_config['cxApiParams']['accountAuthClientId']).encode('ascii')), 'ascii')
auth_response = self._download_json( auth_response = self._download_json(
f'{api_domain}/auth/v1/token', None, note='Authenticating with cookie', f'{api_domain}/auth/v1/token', None, note=f'Authenticating with grant_type={grant_type}',
headers={ headers={
'Authorization': 'Basic ' + basic_token 'Authorization': 'Basic ' + str(base64.b64encode(('%s:' % app_config['cxApiParams'][key]).encode('ascii')), 'ascii')
}, data='grant_type=etp_rt_cookie'.encode('ascii')) }, data=f'grant_type={grant_type}'.encode('ascii'))
policy_response = self._download_json( policy_response = self._download_json(
f'{api_domain}/index/v2', None, note='Retrieving signed policy', f'{api_domain}/index/v2', None, note='Retrieving signed policy',
headers={ headers={
@@ -747,21 +752,6 @@ class CrunchyrollBetaBaseIE(CrunchyrollBaseIE):
CrunchyrollBetaBaseIE.params = (api_domain, bucket, params) CrunchyrollBetaBaseIE.params = (api_domain, bucket, params)
return CrunchyrollBetaBaseIE.params return CrunchyrollBetaBaseIE.params
def _redirect_from_beta(self, url, lang, internal_id, display_id, is_episode, iekey):
initial_state, app_config = self._get_beta_embedded_json(self._download_webpage(url, display_id), display_id)
content_data = initial_state['content']['byId'][internal_id]
if is_episode:
video_id = content_data['external_id'].split('.')[1]
series_id = content_data['episode_metadata']['series_slug_title']
else:
series_id = content_data['slug_title']
series_id = re.sub(r'-{2,}', '-', series_id)
url = f'https://www.crunchyroll.com/{lang}{series_id}'
if is_episode:
url = url + f'/{display_id}-{video_id}'
self.to_screen(f'{display_id}: Not logged in. Redirecting to non-beta site - {url}')
return self.url_result(url, iekey, display_id)
class CrunchyrollBetaIE(CrunchyrollBetaBaseIE): class CrunchyrollBetaIE(CrunchyrollBetaBaseIE):
IE_NAME = 'crunchyroll:beta' IE_NAME = 'crunchyroll:beta'
@@ -800,10 +790,6 @@ class CrunchyrollBetaIE(CrunchyrollBetaBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id') lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id')
if not self._get_cookies(url).get('etp_rt'):
return self._redirect_from_beta(url, lang, internal_id, display_id, True, CrunchyrollIE.ie_key())
api_domain, bucket, params = self._get_params(lang) api_domain, bucket, params = self._get_params(lang)
episode_response = self._download_json( episode_response = self._download_json(
@@ -897,10 +883,6 @@ class CrunchyrollBetaShowIE(CrunchyrollBetaBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id') lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id')
if not self._get_cookies(url).get('etp_rt'):
return self._redirect_from_beta(url, lang, internal_id, display_id, False, CrunchyrollShowPlaylistIE.ie_key())
api_domain, bucket, params = self._get_params(lang) api_domain, bucket, params = self._get_params(lang)
series_response = self._download_json( series_response = self._download_json(
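
Note on the token request in `_get_params` above: the `Authorization` header is HTTP Basic with the client ID as the user name and an empty password, and the grant type depends on whether the `etp_rt` cookie is present. A minimal sketch of those two pieces using only the standard library; the function names are illustrative and not part of the commit.

```python
import base64


def basic_auth_header(client_id):
    """Build an HTTP Basic header from a client ID and an empty password."""
    token = base64.b64encode(f'{client_id}:'.encode('ascii')).decode('ascii')
    return {'Authorization': f'Basic {token}'}


def token_request_body(has_etp_rt_cookie):
    """Choose the grant type as the hunk above does and encode the form body."""
    grant_type = 'etp_rt_cookie' if has_etp_rt_cookie else 'client_id'
    return f'grant_type={grant_type}'.encode('ascii')
```

Keeping the two choices (which client ID, which grant type) next to each other makes it harder to send an anonymous client ID with the cookie grant, or the reverse.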

yt_dlp/extractor/epoch.py Normal file

@@ -0,0 +1,46 @@
from .common import InfoExtractor
class EpochIE(InfoExtractor):
_VALID_URL = r'https?://www\.theepochtimes\.com/[\w-]+_(?P<id>\d+)\.html'
_TESTS = [
{
'url': 'https://www.theepochtimes.com/they-can-do-audio-video-physical-surveillance-on-you-24h-365d-a-year-rex-lee-on-intrusive-apps_4661688.html',
'info_dict': {
'id': 'a3dd732c-4750-4bc8-8156-69180668bda1',
'ext': 'mp4',
'title': 'They Can Do Audio, Video, Physical Surveillance on You 24H/365D a Year: Rex Lee on Intrusive Apps',
}
},
{
'url': 'https://www.theepochtimes.com/the-communist-partys-cyberattacks-on-america-explained-rex-lee-talks-tech-hybrid-warfare_4342413.html',
'info_dict': {
'id': '276c7f46-3bbf-475d-9934-b9bbe827cf0a',
'ext': 'mp4',
'title': 'The Communist Partys Cyberattacks on America Explained; Rex Lee Talks Tech Hybrid Warfare',
}
},
{
'url': 'https://www.theepochtimes.com/kash-patel-a-6-year-saga-of-government-corruption-from-russiagate-to-mar-a-lago_4690250.html',
'info_dict': {
'id': 'aa9ceecd-a127-453d-a2de-7153d6fd69b6',
'ext': 'mp4',
'title': 'Kash Patel: A 6-Year-Saga of Government Corruption, From Russiagate to Mar-a-Lago',
}
},
]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
youmaker_video_id = self._search_regex(r'data-trailer="[\w-]+" data-id="([\w-]+)"', webpage, 'url')
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
f'http://vs1.youmaker.com/assets/{youmaker_video_id}/playlist.m3u8', video_id, 'mp4', m3u8_id='hls')
return {
'id': youmaker_video_id,
'formats': formats,
'subtitles': subtitles,
'title': self._html_extract_title(webpage)
}
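
Note on URL patterns such as `_VALID_URL`: an unescaped `.` in a regular expression matches any character, so literal dots in host names and file extensions are normally written as `\.`. A minimal sketch of the difference using the standard `re` module; `LOOSE`, `STRICT`, `video_id` and the sample URLs are illustrative names and values, not part of the commit.

```python
import re

# Variant with unescaped dots after "www" and before "html".
LOOSE = r'https?://www.theepochtimes\.com/[\w-]+_(?P<id>\d+).html'
# Variant with every literal dot escaped.
STRICT = r'https?://www\.theepochtimes\.com/[\w-]+_(?P<id>\d+)\.html'

GOOD_URL = 'https://www.theepochtimes.com/some-article_4661688.html'
ODD_URL = 'https://wwwxtheepochtimes.com/some-article_4661688xhtml'


def video_id(pattern, url):
    """Return the captured numeric id, or None when the URL does not match."""
    m = re.match(pattern, url)
    return m.group('id') if m else None
```

Both variants accept `GOOD_URL`, but only the loose one also accepts `ODD_URL`, where the dots have been replaced by other characters.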


@@ -0,0 +1,99 @@
from .common import InfoExtractor
from ..utils import traverse_obj
class EurosportIE(InfoExtractor):
_VALID_URL = r'https?://www\.eurosport\.com/\w+/[\w-]+/\d+/[\w-]+_(?P<id>vid\d+)'
_TESTS = [{
'url': 'https://www.eurosport.com/tennis/roland-garros/2022/highlights-rafael-nadal-brushes-aside-caper-ruud-to-win-record-extending-14th-french-open-title_vid1694147/video.shtml',
'info_dict': {
'id': '2480939',
'ext': 'mp4',
'title': 'Highlights: Rafael Nadal brushes aside Caper Ruud to win record-extending 14th French Open title',
'description': 'md5:b564db73ecfe4b14ebbd8e62a3692c76',
'thumbnail': 'https://imgresizer.eurosport.com/unsafe/1280x960/smart/filters:format(jpeg)/origin-imgresizer.eurosport.com/2022/06/05/3388285-69245968-2560-1440.png',
'duration': 195.0,
'display_id': 'vid1694147',
'timestamp': 1654446698,
'upload_date': '20220605',
}
}, {
'url': 'https://www.eurosport.com/tennis/roland-garros/2022/watch-the-top-five-shots-from-men-s-final-as-rafael-nadal-beats-casper-ruud-to-seal-14th-french-open_vid1694283/video.shtml',
'info_dict': {
'id': '2481254',
'ext': 'mp4',
'title': 'md5:149dcc5dfb38ab7352acc008cc9fb071',
'duration': 130.0,
'thumbnail': 'https://imgresizer.eurosport.com/unsafe/1280x960/smart/filters:format(jpeg)/origin-imgresizer.eurosport.com/2022/06/05/3388422-69248708-2560-1440.png',
'description': 'md5:a0c8a7f6b285e48ae8ddbe7aa85cfee6',
'display_id': 'vid1694283',
'timestamp': 1654456090,
'upload_date': '20220605',
}
}, {
# geo-fenced, but can be bypassed with XFF
'url': 'https://www.eurosport.com/cycling/tour-de-france-femmes/2022/incredible-ride-marlen-reusser-storms-to-stage-4-win-at-tour-de-france-femmes_vid1722221/video.shtml',
'info_dict': {
'id': '2582552',
'ext': 'mp4',
'title': 'Incredible ride! - Marlen Reusser storms to Stage 4 win at Tour de France Femmes',
'duration': 188.0,
'display_id': 'vid1722221',
'timestamp': 1658936167,
'thumbnail': 'https://imgresizer.eurosport.com/unsafe/1280x960/smart/filters:format(jpeg)/origin-imgresizer.eurosport.com/2022/07/27/3423347-69852108-2560-1440.jpg',
'description': 'md5:32bbe3a773ac132c57fb1e8cca4b7c71',
'upload_date': '20220727',
}
}]
_TOKEN = None
# actually defined in https://netsport.eurosport.io/?variables={"databaseId":<databaseId>,"playoutType":"VDP"}&extensions={"persistedQuery":{"version":1 ..
# but this method requires getting the sha256 hash
_GEO_COUNTRIES = ['DE', 'NL', 'EU', 'IT', 'FR'] # Not a complete list, but it should work
def _real_initialize(self):
if EurosportIE._TOKEN is None:
EurosportIE._TOKEN = self._download_json(
'https://eu3-prod-direct.eurosport.com/token?realm=eurosport', None,
'Trying to get token')['data']['attributes']['token']
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
json_data = self._download_json(
f'https://eu3-prod-direct.eurosport.com/playback/v2/videoPlaybackInfo/sourceSystemId/eurosport-{display_id}',
display_id, query={'usePreAuth': True}, headers={'Authorization': f'Bearer {EurosportIE._TOKEN}'})['data']
json_ld_data = self._search_json_ld(webpage, display_id)
formats, subtitles = [], {}
for stream_type in json_data['attributes']['streaming']:
if stream_type == 'hls':
fmts, subs = self._extract_m3u8_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id, ext='mp4')
elif stream_type == 'dash':
fmts, subs = self._extract_mpd_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id)
elif stream_type == 'mss':
fmts, subs = self._extract_ism_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id)
else:
# Unknown stream type: skip it instead of reusing (or never defining) fmts/subs
continue
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
self._sort_formats(formats)
return {
'id': json_data['id'],
'title': json_ld_data.get('title') or self._og_search_title(webpage),
'display_id': display_id,
'formats': formats,
'subtitles': subtitles,
'thumbnails': json_ld_data.get('thumbnails'),
'description': (json_ld_data.get('description')
or self._html_search_meta(['og:description', 'description'], webpage)),
'duration': json_ld_data.get('duration'),
'timestamp': json_ld_data.get('timestamp'),
}


@@ -3,7 +3,6 @@ import re
import urllib.parse import urllib.parse
import xml.etree.ElementTree import xml.etree.ElementTree
from . import gen_extractor_classes
from .common import InfoExtractor # isort: split from .common import InfoExtractor # isort: split
from .brightcove import BrightcoveLegacyIE, BrightcoveNewIE from .brightcove import BrightcoveLegacyIE, BrightcoveNewIE
from .commonprotocols import RtmpIE from .commonprotocols import RtmpIE
@@ -26,6 +25,7 @@ from ..utils import (
parse_resolution, parse_resolution,
smuggle_url, smuggle_url,
str_or_none, str_or_none,
traverse_obj,
try_call, try_call,
unescapeHTML, unescapeHTML,
unified_timestamp, unified_timestamp,
@@ -2805,7 +2805,7 @@ class GenericIE(InfoExtractor):
self._downloader.write_debug('Looking for embeds') self._downloader.write_debug('Looking for embeds')
embeds = [] embeds = []
for ie in gen_extractor_classes(): for ie in self._downloader._ies.values():
gen = ie.extract_from_webpage(self._downloader, url, webpage) gen = ie.extract_from_webpage(self._downloader, url, webpage)
current_embeds = [] current_embeds = []
try: try:
@@ -2840,8 +2840,9 @@ class GenericIE(InfoExtractor):
try: try:
info = self._parse_jwplayer_data( info = self._parse_jwplayer_data(
jwplayer_data, video_id, require_title=False, base_url=url) jwplayer_data, video_id, require_title=False, base_url=url)
self.report_detected('JW Player data') if traverse_obj(info, 'formats', ('entries', ..., 'formats')):
return merge_dicts(info, info_dict) self.report_detected('JW Player data')
return merge_dicts(info, info_dict)
except ExtractorError: except ExtractorError:
# See https://github.com/ytdl-org/youtube-dl/pull/16735 # See https://github.com/ytdl-org/youtube-dl/pull/16735
pass pass


@@ -6,7 +6,6 @@ from ..compat import compat_urlparse, compat_b64decode
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none, int_or_none,
js_to_json,
str_or_none, str_or_none,
try_get, try_get,
unescapeHTML, unescapeHTML,
@@ -55,11 +54,7 @@ class HuyaLiveIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id=video_id) webpage = self._download_webpage(url, video_id=video_id)
json_stream = self._search_regex(r'"stream":\s+"([a-zA-Z0-9+=/]+)"', webpage, 'stream', default=None) stream_data = self._search_json(r'stream:\s+', webpage, 'stream', video_id=video_id, default=None)
if not json_stream:
raise ExtractorError('Video is offline', expected=True)
stream_data = self._parse_json(compat_b64decode(json_stream).decode(), video_id=video_id,
transform_source=js_to_json)
room_info = try_get(stream_data, lambda x: x['data'][0]['gameLiveInfo']) room_info = try_get(stream_data, lambda x: x['data'][0]['gameLiveInfo'])
if not room_info: if not room_info:
raise ExtractorError('Can not extract the room info', expected=True) raise ExtractorError('Can not extract the room info', expected=True)
@@ -67,6 +62,8 @@ class HuyaLiveIE(InfoExtractor):
screen_type = room_info.get('screenType') screen_type = room_info.get('screenType')
live_source_type = room_info.get('liveSourceType') live_source_type = room_info.get('liveSourceType')
stream_info_list = stream_data['data'][0]['gameStreamInfoList'] stream_info_list = stream_data['data'][0]['gameStreamInfoList']
if not stream_info_list:
raise ExtractorError('Video is offline', expected=True)
formats = [] formats = []
for stream_info in stream_info_list: for stream_info in stream_info_list:
stream_url = stream_info.get('sFlvUrl') stream_url = stream_info.get('sFlvUrl')


@@ -0,0 +1,82 @@
import re
from .common import InfoExtractor
from ..utils import traverse_obj, urljoin
class IslamChannelIE(InfoExtractor):
_VALID_URL = r'https?://watch\.islamchannel\.tv/watch/(?P<id>\d+)'
_TESTS = [{
'url': 'https://watch.islamchannel.tv/watch/38604310',
'info_dict': {
'id': '38604310',
'title': 'Omar - Young Omar',
'description': 'md5:5cc7ddecef064ea7afe52eb5e0e33b55',
'thumbnail': r're:https?://.+',
'ext': 'mp4',
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
thumbnail = self._search_regex(
r'data-poster="([^"]+)"', webpage, 'data poster', fatal=False) or \
self._html_search_meta(('og:image', 'twitter:image'), webpage)
headers = {
'Token': self._search_regex(r'data-token="([^"]+)"', webpage, 'data token'),
'Token-Expiry': self._search_regex(r'data-expiry="([^"]+)"', webpage, 'data expiry'),
'Uvid': video_id,
}
show_stream = self._download_json(
f'https://v2-streams-elb.simplestreamcdn.com/api/show/stream/{video_id}', video_id,
query={
'key': self._search_regex(r'data-key="([^"]+)"', webpage, 'data key'),
'platform': 'chrome',
}, headers=headers)
# TODO: show_stream['stream'] and show_stream['drm'] may contain something interesting
streams = self._download_json(
traverse_obj(show_stream, ('response', 'tokenization', 'url')), video_id,
headers=headers)
formats, subs = self._extract_m3u8_formats_and_subtitles(traverse_obj(streams, ('Streams', 'Adaptive')), video_id, 'mp4')
self._sort_formats(formats)
return {
'id': video_id,
'title': self._html_search_meta(('og:title', 'twitter:title'), webpage),
'description': self._html_search_meta(('og:description', 'twitter:description', 'description'), webpage),
'formats': formats,
'subtitles': subs,
'thumbnails': [{
'id': 'unscaled',
'url': thumbnail.split('?')[0],
'ext': 'jpg',
'preference': 2,
}, {
'id': 'orig',
'url': thumbnail,
'ext': 'jpg',
'preference': 1,
}] if thumbnail else None,
}
class IslamChannelSeriesIE(InfoExtractor):
_VALID_URL = r'https?://watch\.islamchannel\.tv/series/(?P<id>[a-f\d-]+)'
_TESTS = [{
'url': 'https://watch.islamchannel.tv/series/a6cccef3-3ef1-11eb-bc19-06b69c2357cd',
'info_dict': {
'id': 'a6cccef3-3ef1-11eb-bc19-06b69c2357cd',
},
'playlist_mincount': 31,
}]
def _real_extract(self, url):
pl_id = self._match_id(url)
webpage = self._download_webpage(url, pl_id)
return self.playlist_from_matches(
re.finditer(r'<a\s+href="(/watch/\d+)"[^>]+?data-video-type="show">', webpage),
pl_id, getter=lambda x: urljoin(url, x.group(1)), ie=IslamChannelIE)


@@ -8,15 +8,33 @@ from ..utils import (
float_or_none, float_or_none,
int_or_none, int_or_none,
str_or_none, str_or_none,
try_get, traverse_obj,
) )
class MedalTVIE(InfoExtractor): class MedalTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?medal\.tv/clips/(?P<id>[^/?#&]+)' _VALID_URL = r'https?://(?:www\.)?medal\.tv/(?P<path>games/[^/?#&]+/clips)/(?P<id>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://medal.tv/clips/2mA60jWAGQCBH', 'url': 'https://medal.tv/games/valorant/clips/jTBFnLKdLy15K',
'md5': '7b07b064331b1cf9e8e5c52a06ae68fa', 'md5': '6930f8972914b6b9fdc2bb3918098ba0',
'info_dict': {
'id': 'jTBFnLKdLy15K',
'ext': 'mp4',
'title': "Mornu's clutch",
'description': '',
'uploader': 'Aciel',
'timestamp': 1651628243,
'upload_date': '20220504',
'uploader_id': '19335460',
'uploader_url': 'https://medal.tv/users/19335460',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 13,
}
}, {
'url': 'https://medal.tv/games/cod%20cold%20war/clips/2mA60jWAGQCBH',
'md5': '3d19d426fe0b2d91c26e412684e66a06',
'info_dict': { 'info_dict': {
'id': '2mA60jWAGQCBH', 'id': '2mA60jWAGQCBH',
'ext': 'mp4', 'ext': 'mp4',
@@ -26,9 +44,15 @@ class MedalTVIE(InfoExtractor):
'timestamp': 1603165266, 'timestamp': 1603165266,
'upload_date': '20201020', 'upload_date': '20201020',
'uploader_id': '10619174', 'uploader_id': '10619174',
'thumbnail': 'https://cdn.medal.tv/10619174/thumbnail-34934644-720p.jpg?t=1080p&c=202042&missing',
'uploader_url': 'https://medal.tv/users/10619174',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 23,
} }
}, { }, {
'url': 'https://medal.tv/clips/2um24TWdty0NA', 'url': 'https://medal.tv/games/cod%20cold%20war/clips/2um24TWdty0NA',
'md5': 'b6dc76b78195fff0b4f8bf4a33ec2148', 'md5': 'b6dc76b78195fff0b4f8bf4a33ec2148',
'info_dict': { 'info_dict': {
'id': '2um24TWdty0NA', 'id': '2um24TWdty0NA',
@@ -39,25 +63,42 @@ class MedalTVIE(InfoExtractor):
'timestamp': 1605580939, 'timestamp': 1605580939,
'upload_date': '20201117', 'upload_date': '20201117',
'uploader_id': '5156321', 'uploader_id': '5156321',
'thumbnail': 'https://cdn.medal.tv/5156321/thumbnail-36787208-360p.jpg?t=1080p&c=202046&missing',
'uploader_url': 'https://medal.tv/users/5156321',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 9,
} }
}, { }, {
'url': 'https://medal.tv/clips/37rMeFpryCC-9', 'url': 'https://medal.tv/games/valorant/clips/37rMeFpryCC-9',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'https://medal.tv/clips/2WRj40tpY_EU9', 'url': 'https://medal.tv/games/valorant/clips/2WRj40tpY_EU9',
'only_matching': True, 'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
path = self._match_valid_url(url).group('path')
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
hydration_data = self._parse_json(self._search_regex( next_data = self._search_json(
r'<script[^>]*>\s*(?:var\s*)?hydrationData\s*=\s*({.+?})\s*</script>', '<script[^>]*__NEXT_DATA__[^>]*>', webpage,
webpage, 'hydration data', default='{}'), video_id) 'next data', video_id, end_pattern='</script>', fatal=False)
clip = try_get( build_id = next_data.get('buildId')
hydration_data, lambda x: x['clips'][video_id], dict) or {} if not build_id:
raise ExtractorError(
'Could not find build ID.', video_id=video_id)
locale = next_data.get('locale', 'en')
api_response = self._download_json(
f'https://medal.tv/_next/data/{build_id}/{locale}/{path}/{video_id}.json', video_id)
clip = traverse_obj(api_response, ('pageProps', 'clip')) or {}
if not clip: if not clip:
raise ExtractorError( raise ExtractorError(
'Could not find video information.', video_id=video_id) 'Could not find video information.', video_id=video_id)
@@ -113,9 +154,8 @@ class MedalTVIE(InfoExtractor):
# Necessary because the id of the author is not known in advance. # Necessary because the id of the author is not known in advance.
# Won't raise an issue if no profile can be found as this is optional. # Won't raise an issue if no profile can be found as this is optional.
author = try_get( author = traverse_obj(api_response, ('pageProps', 'profile')) or {}
hydration_data, lambda x: list(x['profiles'].values())[0], dict) or {} author_id = str_or_none(author.get('userId'))
author_id = str_or_none(author.get('id'))
author_url = format_field(author_id, None, 'https://medal.tv/users/%s') author_url = format_field(author_id, None, 'https://medal.tv/users/%s')
return { return {


@@ -172,31 +172,27 @@ class MediasetIE(ThePlatformBaseIE):
}] }]
def _extract_from_webpage(self, url, webpage): def _extract_from_webpage(self, url, webpage):
def _qs(url):
return parse_qs(url)
def _program_guid(qs): def _program_guid(qs):
return qs.get('programGuid', [None])[0] return qs.get('programGuid', [None])[0]
entries = []
for mobj in re.finditer( for mobj in re.finditer(
r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//(?:www\.)?video\.mediaset\.it/player/playerIFrame(?:Twitter)?\.shtml.*?)\1', r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//(?:www\.)?video\.mediaset\.it/player/playerIFrame(?:Twitter)?\.shtml.*?)\1',
webpage): webpage):
embed_url = mobj.group('url') embed_url = mobj.group('url')
embed_qs = _qs(embed_url) embed_qs = parse_qs(embed_url)
program_guid = _program_guid(embed_qs) program_guid = _program_guid(embed_qs)
if program_guid: if program_guid:
entries.append(embed_url) yield self.url_result(embed_url)
continue continue
video_id = embed_qs.get('id', [None])[0] video_id = embed_qs.get('id', [None])[0]
if not video_id: if not video_id:
continue continue
urlh = self._request_webpage(embed_url, video_id, note='Following embed URL redirect') urlh = self._request_webpage(embed_url, video_id, note='Following embed URL redirect')
embed_url = urlh.geturl() embed_url = urlh.geturl()
program_guid = _program_guid(_qs(embed_url)) program_guid = _program_guid(parse_qs(embed_url))
if program_guid: if program_guid:
entries.append(embed_url) yield self.url_result(embed_url)
return entries
def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None): def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None):
for video in smil.findall(self._xpath_ns('.//video', namespace)): for video in smil.findall(self._xpath_ns('.//video', namespace)):


@@ -159,6 +159,7 @@ class MixcloudIE(MixcloudBaseIE):
formats.append({ formats.append({
'format_id': 'http', 'format_id': 'http',
'url': decrypted, 'url': decrypted,
'vcodec': 'none',
'downloader_options': { 'downloader_options': {
# Mixcloud starts throttling at >~5M # Mixcloud starts throttling at >~5M
'http_chunk_size': 5242880, 'http_chunk_size': 5242880,


@@ -0,0 +1,54 @@
import re
from .common import InfoExtractor
from ..utils import ExtractorError
class NewsPicksIE(InfoExtractor):
_VALID_URL = r'https://newspicks\.com/movie-series/(?P<channel_id>\d+)\?movieId=(?P<id>\d+)'
_TESTS = [{
'url': 'https://newspicks.com/movie-series/11?movieId=1813',
'info_dict': {
'id': '1813',
'title': '日本の課題を破壊せよ【ゲスト:成田悠輔】',
'description': 'md5:09397aad46d6ded6487ff13f138acadf',
'channel': 'HORIE ONE',
'channel_id': '11',
'release_date': '20220117',
'thumbnail': r're:https://.+jpg',
'ext': 'mp4',
},
}]
def _real_extract(self, url):
video_id, channel_id = self._match_valid_url(url).group('id', 'channel_id')
webpage = self._download_webpage(url, video_id)
entries = self._parse_html5_media_entries(
url, webpage.replace('movie-for-pc', 'movie'), video_id, 'hls')
if not entries:
raise ExtractorError('No HTML5 media elements found')
info = entries[0]
self._sort_formats(info['formats'])
title = self._html_search_meta('og:title', webpage, fatal=False)
description = self._html_search_meta(
('og:description', 'twitter:title'), webpage, fatal=False)
channel = self._html_search_regex(
rf'value="{channel_id}".+?<div\s+class="title">(.+?)</div', webpage, 'channel name', fatal=False)
if not title or not channel:
title, channel = re.split(r'\s*\|\s*', self._html_extract_title(webpage))
release_date = self._search_regex(
r'<span\s+class="on-air-date">\s*(\d+)年(\d+)月(\d+)日\s*</span>',
webpage, 'release date', fatal=False, group=(1, 2, 3))
info.update({
'id': video_id,
'title': title,
'description': description,
'channel': channel,
'channel_id': channel_id,
'release_date': ('%04d%02d%02d' % tuple(map(int, release_date))) if release_date else None,
})
return info
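
Note on the `release_date` handling above: the three captured groups (year, month, day) are converted to integers and zero-padded into a `YYYYMMDD` string. A minimal standalone sketch of that conversion; `to_release_date` is an illustrative name, and the sample markup in the usage note is made up.

```python
import re


def to_release_date(text):
    """Convert a 'YYYY年M月D日' date found in `text` to 'YYYYMMDD', or None."""
    m = re.search(r'(\d+)年(\d+)月(\d+)日', text)
    if not m:
        return None
    # Zero-pad month and day so that e.g. January becomes "01".
    return '%04d%02d%02d' % tuple(map(int, m.groups()))
```

For example, a span containing `2022年1月17日` converts to `20220117`, which is the `release_date` expected by the test case in this file.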


@@ -52,6 +52,8 @@ class PhantomJSwrapper:
 This class is experimental.
 """
+INSTALL_HINT = 'Please download it from https://phantomjs.org/download.html'
 _BASE_JS = R'''
 phantom.onError = function(msg, trace) {{
 var msgStack = ['PHANTOM ERROR: ' + msg];
@@ -110,8 +112,7 @@ class PhantomJSwrapper:
 self.exe = check_executable('phantomjs', ['-v'])
 if not self.exe:
-raise ExtractorError(
-'PhantomJS not found, Please download it from https://phantomjs.org/download.html', expected=True)
+raise ExtractorError(f'PhantomJS not found, {self.INSTALL_HINT}', expected=True)
 self.extractor = extractor
@@ -219,7 +220,7 @@ class PhantomJSwrapper:
 return html, stdout
-def execute(self, jscode, video_id=None, note='Executing JS'):
+def execute(self, jscode, video_id=None, *, note='Executing JS'):
 """Execute JS and return stdout"""
 if 'phantom.exit();' not in jscode:
 jscode += ';\nphantom.exit();'
@@ -231,8 +232,12 @@ class PhantomJSwrapper:
 cmd = [self.exe, '--ssl-protocol=any', self._TMP_FILES['script'].name]
 self.extractor.write_debug(f'PhantomJS command line: {shell_quote(cmd)}')
-stdout, stderr, returncode = Popen.run(cmd, text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+try:
+stdout, stderr, returncode = Popen.run(cmd, timeout=self.options['timeout'] / 1000,
+text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+except Exception as e:
+raise ExtractorError(f'{note} failed: Unable to run PhantomJS binary', cause=e)
 if returncode:
-raise ExtractorError(f'Executing JS failed:\n{stderr.strip()}')
+raise ExtractorError(f'{note} failed with returncode {returncode}:\n{stderr.strip()}')
 return stdout
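
Note on the hunk above: the new code passes a timeout (converted from milliseconds to seconds) to the process runner and wraps launch failures in an extractor error. A minimal standalone sketch of that pattern follows. It uses the standard library's `subprocess.run` in place of yt-dlp's `Popen.run` helper, and `JSRunError` and `run_with_timeout` are hypothetical names introduced only for this illustration.

```python
import subprocess
import sys


class JSRunError(Exception):
    """Hypothetical stand-in for yt-dlp's ExtractorError."""


def run_with_timeout(cmd, timeout_ms, note='Executing JS'):
    """Run cmd and turn launch failures, timeouts and non-zero exits into JSRunError."""
    try:
        proc = subprocess.run(
            cmd, timeout=timeout_ms / 1000,  # milliseconds -> seconds
            text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except Exception as e:  # e.g. FileNotFoundError or subprocess.TimeoutExpired
        raise JSRunError(f'{note} failed: Unable to run binary') from e
    if proc.returncode:
        raise JSRunError(
            f'{note} failed with returncode {proc.returncode}:\n{proc.stderr.strip()}')
    return proc.stdout


# Use the current Python interpreter as a harmless stand-in for the PhantomJS binary
output = run_with_timeout([sys.executable, '-c', 'print("ok")'], timeout_ms=30000)
```

Including `note` and the return code in the message, as the diff does, makes it easier to tell which step failed when several scripts are executed in one extraction.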

View File

@@ -11,6 +11,7 @@ from ..utils import (
 int_or_none,
 strip_or_none,
 traverse_obj,
+try_call,
 unified_timestamp,
 )
@@ -255,7 +256,7 @@ class RTBFIE(RedBeeBaseIE):
 if not login_token:
 self.raise_login_required()
-session_jwt = self._download_json(
+session_jwt = try_call(lambda: self._get_cookies(url)['rtbf_jwt'].value) or self._download_json(
 'https://login.rtbf.be/accounts.getJWT', media_id, query={
 'login_token': login_token.value,
 'APIKey': self._GIGYA_API_KEY,
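
Note on the hunk above: the change prefers a JWT already present in the cookies and only falls back to the network request when the cookie lookup fails. A rough sketch of that "try the cheap lookup, else fetch" pattern is below. `try_call_sketch` is a simplified, hypothetical imitation written for this note, not yt-dlp's actual `try_call`, and `fetch_jwt` is a made-up placeholder for the API call.

```python
def try_call_sketch(*funcs, default=None):
    """Return the result of the first callable that does not raise a lookup-style error."""
    for func in funcs:
        try:
            return func()
        except (AttributeError, KeyError, TypeError, IndexError, ValueError):
            continue
    return default


calls = []


def fetch_jwt():
    # Placeholder for the accounts.getJWT request; records that it was invoked
    calls.append('network')
    return 'jwt-from-api'


# No cookie present: the lookup raises KeyError, so the fallback runs
empty_cookies = {}
token = try_call_sketch(lambda: empty_cookies['rtbf_jwt']) or fetch_jwt()

# Cookie present: the fallback is short-circuited by `or`
cookies_with_jwt = {'rtbf_jwt': 'jwt-from-cookie'}
cached = try_call_sketch(lambda: cookies_with_jwt['rtbf_jwt']) or fetch_jwt()
```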

View File

@@ -1,10 +1,12 @@
 from .common import InfoExtractor
 from ..utils import (
+ExtractorError,
 get_element_by_class,
 int_or_none,
 remove_start,
 strip_or_none,
 unified_strdate,
+urlencode_postdata,
 )
@@ -34,6 +36,28 @@ class ScreencastOMaticIE(InfoExtractor):
 video_id = self._match_id(url)
 webpage = self._download_webpage(
 'https://screencast-o-matic.com/player/' + video_id, video_id)
+if (self._html_extract_title(webpage) == 'Protected Content'
+or 'This video is private and requires a password' in webpage):
+password = self.get_param('videopassword')
+if not password:
+raise ExtractorError('Password protected video, use --video-password <password>', expected=True)
+form = self._search_regex(
+r'(?is)<form[^>]*>(?P<form>.+?)</form>', webpage, 'login form', group='form')
+form_data = self._hidden_inputs(form)
+form_data.update({
+'scPassword': password,
+})
+webpage = self._download_webpage(
+'https://screencast-o-matic.com/player/password', video_id, 'Logging in',
+data=urlencode_postdata(form_data))
+if '<small class="text-danger">Invalid password</small>' in webpage:
+raise ExtractorError('Unable to login: Invalid password', expected=True)
 info = self._parse_html5_media_entries(url, webpage, video_id)[0]
 info.update({
 'id': video_id,
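
Note on the hunk above: the password flow extracts the form body, collects its hidden inputs, adds the password field and posts the URL-encoded result. The sketch below imitates those steps with the standard library only. `hidden_inputs_sketch` is a simplified, hypothetical replacement for `_hidden_inputs`, and the sample page with its `token` field is invented for illustration.

```python
import re
from urllib.parse import urlencode


def hidden_inputs_sketch(html):
    """Collect name/value pairs from <input type="hidden"> tags (simplified parsing)."""
    fields = {}
    for tag in re.findall(r'<input\b[^>]*>', html, flags=re.I):
        attrs = dict(re.findall(r'([\w-]+)\s*=\s*"([^"]*)"', tag))
        if attrs.get('type', '').lower() == 'hidden' and 'name' in attrs:
            fields[attrs['name']] = attrs.get('value', '')
    return fields


# Invented sample markup; the real page layout may differ
page = '''<form action="/player/password" method="post">
<input type="hidden" name="token" value="abc123">
<input type="password" name="scPassword">
</form>'''

# Same form-matching regex as in the diff
form = re.search(r'(?is)<form[^>]*>(?P<form>.+?)</form>', page).group('form')
form_data = hidden_inputs_sketch(form)
form_data['scPassword'] = 'hunter2'
body = urlencode(form_data).encode('ascii')
```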

View File

@@ -29,9 +29,7 @@ class StripchatIE(InfoExtractor):
 def _real_extract(self, url):
 video_id = self._match_id(url)
-webpage = self._download_webpage(
-'https://stripchat.com/%s/' % video_id, video_id,
-headers=self.geo_verification_headers())
+webpage = self._download_webpage(url, video_id, headers=self.geo_verification_headers())
 data = self._parse_json(
 self._search_regex(

369
yt_dlp/extractor/tencent.py Normal file
View File

@@ -0,0 +1,369 @@
import functools
import random
import re
import string
import time
from .common import InfoExtractor
from ..aes import aes_cbc_encrypt_bytes
from ..utils import (
ExtractorError,
determine_ext,
int_or_none,
js_to_json,
traverse_obj,
urljoin,
)
class TencentBaseIE(InfoExtractor):
"""Subclasses must set _API_URL, _APP_VERSION, _PLATFORM, _HOST, _REFERER"""
def _get_ckey(self, video_id, url, guid):
ua = self.get_param('http_headers')['User-Agent']
payload = (f'{video_id}|{int(time.time())}|mg3c3b04ba|{self._APP_VERSION}|{guid}|'
f'{self._PLATFORM}|{url[:48]}|{ua.lower()[:48]}||Mozilla|Netscape|Windows x86_64|00|')
return aes_cbc_encrypt_bytes(
bytes(f'|{sum(map(ord, payload))}|{payload}', 'utf-8'),
b'Ok\xda\xa3\x9e/\x8c\xb0\x7f^r-\x9e\xde\xf3\x14',
b'\x01PJ\xf3V\xe6\x19\xcf.B\xbb\xa6\x8c?p\xf9',
padding_mode='whitespace').hex().upper()
def _get_video_api_response(self, video_url, video_id, series_id, subtitle_format, video_format, video_quality):
guid = ''.join([random.choice(string.digits + string.ascii_lowercase) for _ in range(16)])
ckey = self._get_ckey(video_id, video_url, guid)
query = {
'vid': video_id,
'cid': series_id,
'cKey': ckey,
'encryptVer': '8.1',
'spcaptiontype': '1' if subtitle_format == 'vtt' else '0',
'sphls': '2' if video_format == 'hls' else '0',
'dtype': '3' if video_format == 'hls' else '0',
'defn': video_quality,
'spsrt': '2', # Enable subtitles
'sphttps': '1', # Enable HTTPS
'otype': 'json',
'spwm': '1',
# For SHD
'host': self._HOST,
'referer': self._REFERER,
'ehost': video_url,
'appVer': self._APP_VERSION,
'platform': self._PLATFORM,
# For VQQ
'guid': guid,
'flowid': ''.join(random.choice(string.digits + string.ascii_lowercase) for _ in range(32)),
}
return self._search_json(r'QZOutputJson=', self._download_webpage(
self._API_URL, video_id, query=query), 'api_response', video_id)
def _extract_video_formats_and_subtitles(self, api_response, video_id):
video_response = api_response['vl']['vi'][0]
video_width, video_height = video_response.get('vw'), video_response.get('vh')
formats, subtitles = [], {}
for video_format in video_response['ul']['ui']:
if video_format.get('hls'):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
video_format['url'] + video_format['hls']['pt'], video_id, 'mp4', fatal=False)
for f in fmts:
f.update({'width': video_width, 'height': video_height})
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
else:
formats.append({
'url': f'{video_format["url"]}{video_response["fn"]}?vkey={video_response["fvkey"]}',
'width': video_width,
'height': video_height,
'ext': 'mp4',
})
return formats, subtitles
def _extract_video_native_subtitles(self, api_response, subtitles_format):
subtitles = {}
for subtitle in traverse_obj(api_response, ('sfl', 'fi')) or ():
subtitles.setdefault(subtitle['lang'].lower(), []).append({
'url': subtitle['url'],
'ext': subtitles_format,
'protocol': 'm3u8_native' if determine_ext(subtitle['url']) == 'm3u8' else 'http',
})
return subtitles
def _extract_all_video_formats_and_subtitles(self, url, video_id, series_id):
formats, subtitles = [], {}
for video_format, subtitle_format, video_quality in (
# '': 480p, 'shd': 720p, 'fhd': 1080p
('mp4', 'srt', ''), ('hls', 'vtt', 'shd'), ('hls', 'vtt', 'fhd')):
api_response = self._get_video_api_response(
url, video_id, series_id, subtitle_format, video_format, video_quality)
if api_response.get('em') != 0 and api_response.get('exem') != 0:
if '您所在区域暂无此内容版权' in (api_response.get('msg') or ''):
self.raise_geo_restricted()
raise ExtractorError(f'Tencent said: {api_response.get("msg")}')
fmts, subs = self._extract_video_formats_and_subtitles(api_response, video_id)
native_subtitles = self._extract_video_native_subtitles(api_response, subtitle_format)
formats.extend(fmts)
self._merge_subtitles(subs, native_subtitles, target=subtitles)
self._sort_formats(formats)
return formats, subtitles
def _get_clean_title(self, title):
return re.sub(
r'\s*[_\-]\s*(?:Watch online|腾讯视频|(?:高清)?1080P在线观看平台).*?$',
'', title or '').strip() or None
class VQQBaseIE(TencentBaseIE):
_VALID_URL_BASE = r'https?://v\.qq\.com'
_API_URL = 'https://h5vv6.video.qq.com/getvinfo'
_APP_VERSION = '3.5.57'
_PLATFORM = '10901'
_HOST = 'v.qq.com'
_REFERER = 'v.qq.com'
def _get_webpage_metadata(self, webpage, video_id):
return self._parse_json(
self._search_regex(
r'(?s)<script[^>]*>[^<]*window\.__pinia\s*=\s*([^<]+)</script>',
webpage, 'pinia data', fatal=False),
video_id, transform_source=js_to_json, fatal=False)
class VQQVideoIE(VQQBaseIE):
IE_NAME = 'vqq:video'
_VALID_URL = VQQBaseIE._VALID_URL_BASE + r'/x/(?:page|cover/(?P<series_id>\w+))/(?P<id>\w+)'
_TESTS = [{
'url': 'https://v.qq.com/x/page/q326831cny0.html',
'md5': '826ef93682df09e3deac4a6e6e8cdb6e',
'info_dict': {
'id': 'q326831cny0',
'ext': 'mp4',
'title': '我是选手:雷霆裂阵,终极时刻',
'description': 'md5:e7ed70be89244017dac2a835a10aeb1e',
'thumbnail': r're:^https?://[^?#]+q326831cny0',
},
}, {
'url': 'https://v.qq.com/x/page/o3013za7cse.html',
'md5': 'b91cbbeada22ef8cc4b06df53e36fa21',
'info_dict': {
'id': 'o3013za7cse',
'ext': 'mp4',
'title': '欧阳娜娜VLOG',
'description': 'md5:29fe847497a98e04a8c3826e499edd2e',
'thumbnail': r're:^https?://[^?#]+o3013za7cse',
},
}, {
'url': 'https://v.qq.com/x/cover/7ce5noezvafma27/a00269ix3l8.html',
'md5': '71459c5375c617c265a22f083facce67',
'info_dict': {
'id': 'a00269ix3l8',
'ext': 'mp4',
'title': '鸡毛飞上天 第01集',
'description': 'md5:8cae3534327315b3872fbef5e51b5c5b',
'thumbnail': r're:^https?://[^?#]+7ce5noezvafma27',
'series': '鸡毛飞上天',
},
}, {
'url': 'https://v.qq.com/x/cover/mzc00200p29k31e/s0043cwsgj0.html',
'md5': '96b9fd4a189fdd4078c111f21d7ac1bc',
'info_dict': {
'id': 's0043cwsgj0',
'ext': 'mp4',
'title': '第1集如何快乐吃糖',
'description': 'md5:1d8c3a0b8729ae3827fa5b2d3ebd5213',
'thumbnail': r're:^https?://[^?#]+s0043cwsgj0',
'series': '青年理工工作者生活研究所',
},
}]
def _real_extract(self, url):
video_id, series_id = self._match_valid_url(url).group('id', 'series_id')
webpage = self._download_webpage(url, video_id)
webpage_metadata = self._get_webpage_metadata(webpage, video_id)
formats, subtitles = self._extract_all_video_formats_and_subtitles(url, video_id, series_id)
return {
'id': video_id,
'title': self._get_clean_title(self._og_search_title(webpage)
or traverse_obj(webpage_metadata, ('global', 'videoInfo', 'title'))),
'description': (self._og_search_description(webpage)
or traverse_obj(webpage_metadata, ('global', 'videoInfo', 'desc'))),
'formats': formats,
'subtitles': subtitles,
'thumbnail': (self._og_search_thumbnail(webpage)
or traverse_obj(webpage_metadata, ('global', 'videoInfo', 'pic160x90'))),
'series': traverse_obj(webpage_metadata, ('global', 'coverInfo', 'title')),
}
class VQQSeriesIE(VQQBaseIE):
IE_NAME = 'vqq:series'
_VALID_URL = VQQBaseIE._VALID_URL_BASE + r'/x/cover/(?P<id>\w+)\.html/?(?:[?#]|$)'
_TESTS = [{
'url': 'https://v.qq.com/x/cover/7ce5noezvafma27.html',
'info_dict': {
'id': '7ce5noezvafma27',
'title': '鸡毛飞上天',
'description': 'md5:8cae3534327315b3872fbef5e51b5c5b',
},
'playlist_count': 55,
}, {
'url': 'https://v.qq.com/x/cover/oshd7r0vy9sfq8e.html',
'info_dict': {
'id': 'oshd7r0vy9sfq8e',
'title': '恋爱细胞2',
'description': 'md5:9d8a2245679f71ca828534b0f95d2a03',
},
'playlist_count': 12,
}]
def _real_extract(self, url):
series_id = self._match_id(url)
webpage = self._download_webpage(url, series_id)
webpage_metadata = self._get_webpage_metadata(webpage, series_id)
episode_paths = [f'/x/cover/{series_id}/{video_id}.html' for video_id in re.findall(
r'<div[^>]+data-vid="(?P<video_id>[^"]+)"[^>]+class="[^"]+episode-item-rect--number',
webpage)]
return self.playlist_from_matches(
episode_paths, series_id, ie=VQQVideoIE, getter=functools.partial(urljoin, url),
title=self._get_clean_title(traverse_obj(webpage_metadata, ('coverInfo', 'title'))
or self._og_search_title(webpage)),
description=(traverse_obj(webpage_metadata, ('coverInfo', 'description'))
or self._og_search_description(webpage)))
class WeTvBaseIE(TencentBaseIE):
_VALID_URL_BASE = r'https?://(?:www\.)?wetv\.vip/(?:[^?#]+/)?play'
_API_URL = 'https://play.wetv.vip/getvinfo'
_APP_VERSION = '3.5.57'
_PLATFORM = '4830201'
_HOST = 'wetv.vip'
_REFERER = 'wetv.vip'
def _get_webpage_metadata(self, webpage, video_id):
return self._parse_json(
traverse_obj(self._search_nextjs_data(webpage, video_id), ('props', 'pageProps', 'data')),
video_id, fatal=False)
class WeTvEpisodeIE(WeTvBaseIE):
IE_NAME = 'wetv:episode'
_VALID_URL = WeTvBaseIE._VALID_URL_BASE + r'/(?P<series_id>\w+)(?:-[^?#]+)?/(?P<id>\w+)(?:-[^?#]+)?'
_TESTS = [{
'url': 'https://wetv.vip/en/play/air11ooo2rdsdi3-Cute-Programmer/v0040pr89t9-EP1-Cute-Programmer',
'md5': '0c70fdfaa5011ab022eebc598e64bbbe',
'info_dict': {
'id': 'v0040pr89t9',
'ext': 'mp4',
'title': 'EP1: Cute Programmer',
'description': 'md5:e87beab3bf9f392d6b9e541a63286343',
'thumbnail': r're:^https?://[^?#]+air11ooo2rdsdi3',
'series': 'Cute Programmer',
'episode': 'Episode 1',
'episode_number': 1,
'duration': 2835,
},
}, {
'url': 'https://wetv.vip/en/play/u37kgfnfzs73kiu/p0039b9nvik',
'md5': '3b3c15ca4b9a158d8d28d5aa9d7c0a49',
'info_dict': {
'id': 'p0039b9nvik',
'ext': 'mp4',
'title': 'EP1: You Are My Glory',
'description': 'md5:831363a4c3b4d7615e1f3854be3a123b',
'thumbnail': r're:^https?://[^?#]+u37kgfnfzs73kiu',
'series': 'You Are My Glory',
'episode': 'Episode 1',
'episode_number': 1,
'duration': 2454,
},
}, {
'url': 'https://wetv.vip/en/play/lcxgwod5hapghvw-WeTV-PICK-A-BOO/i0042y00lxp-Zhao-Lusi-Describes-The-First-Experiences-She-Had-In-Who-Rules-The-World-%7C-WeTV-PICK-A-BOO',
'md5': '71133f5c2d5d6cad3427e1b010488280',
'info_dict': {
'id': 'i0042y00lxp',
'ext': 'mp4',
'title': 'md5:f7a0857dbe5fbbe2e7ad630b92b54e6a',
'description': 'md5:76260cb9cdc0ef76826d7ca9d92fadfa',
'thumbnail': r're:^https?://[^?#]+lcxgwod5hapghvw',
'series': 'WeTV PICK-A-BOO',
'episode': 'Episode 0',
'episode_number': 0,
'duration': 442,
},
}]
def _real_extract(self, url):
video_id, series_id = self._match_valid_url(url).group('id', 'series_id')
webpage = self._download_webpage(url, video_id)
webpage_metadata = self._get_webpage_metadata(webpage, video_id)
formats, subtitles = self._extract_all_video_formats_and_subtitles(url, video_id, series_id)
return {
'id': video_id,
'title': self._get_clean_title(self._og_search_title(webpage)
or traverse_obj(webpage_metadata, ('coverInfo', 'title'))),
'description': (traverse_obj(webpage_metadata, ('coverInfo', 'description'))
or self._og_search_description(webpage)),
'formats': formats,
'subtitles': subtitles,
'thumbnail': self._og_search_thumbnail(webpage),
'duration': int_or_none(traverse_obj(webpage_metadata, ('videoInfo', 'duration'))),
'series': traverse_obj(webpage_metadata, ('coverInfo', 'title')),
'episode_number': int_or_none(traverse_obj(webpage_metadata, ('videoInfo', 'episode'))),
}
class WeTvSeriesIE(WeTvBaseIE):
_VALID_URL = WeTvBaseIE._VALID_URL_BASE + r'/(?P<id>\w+)(?:-[^/?#]+)?/?(?:[?#]|$)'
_TESTS = [{
'url': 'https://wetv.vip/play/air11ooo2rdsdi3-Cute-Programmer',
'info_dict': {
'id': 'air11ooo2rdsdi3',
'title': 'Cute Programmer',
'description': 'md5:e87beab3bf9f392d6b9e541a63286343',
},
'playlist_count': 30,
}, {
'url': 'https://wetv.vip/en/play/u37kgfnfzs73kiu-You-Are-My-Glory',
'info_dict': {
'id': 'u37kgfnfzs73kiu',
'title': 'You Are My Glory',
'description': 'md5:831363a4c3b4d7615e1f3854be3a123b',
},
'playlist_count': 32,
}]
def _real_extract(self, url):
series_id = self._match_id(url)
webpage = self._download_webpage(url, series_id)
webpage_metadata = self._get_webpage_metadata(webpage, series_id)
episode_paths = ([f'/play/{series_id}/{episode["vid"]}' for episode in (traverse_obj(webpage_metadata, 'videoList') or [])]
or re.findall(r'<a[^>]+class="play-video__link"[^>]+href="(?P<path>[^"]+)', webpage))
return self.playlist_from_matches(
episode_paths, series_id, ie=WeTvEpisodeIE, getter=functools.partial(urljoin, url),
title=self._get_clean_title(traverse_obj(webpage_metadata, ('coverInfo', 'title'))
or self._og_search_title(webpage)),
description=(traverse_obj(webpage_metadata, ('coverInfo', 'description'))
or self._og_search_description(webpage)))
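
Note on `_get_ckey` in the file above: before encryption, the plaintext is framed as `|<checksum>|<payload>`, where the checksum is the sum of the code points of the payload. The sketch below reproduces only that framing step and omits the AES-CBC encryption, so its output is not a usable cKey. `build_ckey_plaintext` and its `now` parameter are hypothetical names added so the example is deterministic.

```python
import time


def build_ckey_plaintext(video_id, url, guid, app_version, platform, user_agent, now=None):
    """Build the pre-encryption bytes: '|<sum of code points>|<payload>'."""
    timestamp = int(time.time() if now is None else now)
    payload = (f'{video_id}|{timestamp}|mg3c3b04ba|{app_version}|{guid}|'
               f'{platform}|{url[:48]}|{user_agent.lower()[:48]}||Mozilla|Netscape|Windows x86_64|00|')
    checksum = sum(map(ord, payload))
    return f'|{checksum}|{payload}'.encode('utf-8')


plaintext = build_ckey_plaintext(
    'p0039b9nvik', 'https://wetv.vip/en/play/u37kgfnfzs73kiu/p0039b9nvik',
    guid='0' * 16, app_version='3.5.57', platform='4830201',
    user_agent='Mozilla/5.0', now=0)

# Split the frame back apart to inspect it
_, checksum_text, payload_text = plaintext.decode('utf-8').split('|', 2)
```

Both the URL and the user agent are truncated to 48 characters in the payload, so long URLs contribute only their prefix.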

View File

@@ -8,12 +8,14 @@ class TestURLIE(InfoExtractor):
 """ Allows addressing of the test cases as test:yout.*be_1 """
 IE_DESC = False # Do not list
-_VALID_URL = r'test(?:url)?:(?P<extractor>.+?)(?:_(?P<num>[0-9]+))?$'
+_VALID_URL = r'test(?:url)?:(?P<extractor>.*?)(?:_(?P<num>[0-9]+))?$'
 def _real_extract(self, url):
 from . import gen_extractor_classes
 extractor_id, num = self._match_valid_url(url).group('extractor', 'num')
+if not extractor_id:
+return {'id': ':test', 'title': '', 'url': url}
 rex = re.compile(extractor_id, flags=re.IGNORECASE)
 matching_extractors = [e for e in gen_extractor_classes() if rex.search(e.IE_NAME)]
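
Note on the hunk above: switching `.+?` to `.*?` lets a bare `test:` URL match with an empty extractor name, which the new early return then handles. A quick check of both patterns, using only the `re` module and the two regexes from the diff:

```python
import re

OLD = re.compile(r'test(?:url)?:(?P<extractor>.+?)(?:_(?P<num>[0-9]+))?$')
NEW = re.compile(r'test(?:url)?:(?P<extractor>.*?)(?:_(?P<num>[0-9]+))?$')

# A bare "test:" has no extractor name: rejected by the old pattern, accepted by the new one
old_bare = OLD.match('test:')
new_bare = NEW.match('test:')

# Named test cases with an index still parse the same way
new_named = NEW.match('test:youtube_1')
```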

304
yt_dlp/extractor/triller.py Normal file
View File

@@ -0,0 +1,304 @@
import itertools
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
str_or_none,
traverse_obj,
unified_strdate,
unified_timestamp,
url_basename,
)
class TrillerBaseIE(InfoExtractor):
_NETRC_MACHINE = 'triller'
_AUTH_TOKEN = None
_API_BASE_URL = 'https://social.triller.co/v1.5'
def _perform_login(self, username, password):
if self._AUTH_TOKEN:
return
user_check = self._download_json(
f'{self._API_BASE_URL}/api/user/is-valid-username', None, note='Checking username',
fatal=False, expected_status=400, headers={
'Content-Type': 'application/json',
'Origin': 'https://triller.co',
}, data=json.dumps({'username': username}, separators=(',', ':')).encode('utf-8'))
if (user_check or {}).get('status'): # endpoint returns "status":false if username exists
raise ExtractorError('Unable to login: Invalid username', expected=True)
credentials = {
'username': username,
'password': password,
}
login = self._download_json(
f'{self._API_BASE_URL}/user/auth', None, note='Logging in',
fatal=False, expected_status=400, headers={
'Content-Type': 'application/json',
'Origin': 'https://triller.co',
}, data=json.dumps(credentials, separators=(',', ':')).encode('utf-8'))
if not (login or {}).get('auth_token'):
if (login or {}).get('error') == 1008:
raise ExtractorError('Unable to login: Incorrect password', expected=True)
raise ExtractorError('Unable to login')
self._AUTH_TOKEN = login['auth_token']
def _get_comments(self, video_id, limit=15):
comment_info = self._download_json(
f'{self._API_BASE_URL}/api/videos/{video_id}/comments_v2',
video_id, fatal=False, note='Downloading comments API JSON',
headers={'Origin': 'https://triller.co'}, query={'limit': limit}) or {}
if not comment_info.get('comments'):
return
for comment_dict in comment_info['comments']:
yield {
'author': traverse_obj(comment_dict, ('author', 'username')),
'author_id': traverse_obj(comment_dict, ('author', 'user_id')),
'id': comment_dict.get('id'),
'text': comment_dict.get('body'),
'timestamp': unified_timestamp(comment_dict.get('timestamp')),
}
def _check_user_info(self, user_info):
if not user_info:
self.report_warning('Unable to extract user info')
elif user_info.get('private') and not user_info.get('followed_by_me'):
raise ExtractorError('This video is private', expected=True)
elif traverse_obj(user_info, 'blocked_by_user', 'blocking_user'):
raise ExtractorError('The author of the video is blocked', expected=True)
return user_info
def _parse_video_info(self, video_info, username, user_info=None):
video_uuid = video_info.get('video_uuid')
video_id = video_info.get('id')
formats = []
video_url = traverse_obj(video_info, 'video_url', 'stream_url')
if video_url:
formats.append({
'url': video_url,
'ext': 'mp4',
'vcodec': 'h264',
'width': video_info.get('width'),
'height': video_info.get('height'),
'format_id': url_basename(video_url).split('.')[0],
'filesize': video_info.get('filesize'),
})
video_set = video_info.get('video_set') or []
for video in video_set:
resolution = video.get('resolution') or ''
formats.append({
'url': video['url'],
'ext': 'mp4',
'vcodec': video.get('codec'),
'vbr': int_or_none(video.get('bitrate'), 1000),
'width': int_or_none(resolution.partition('x')[0]),
'height': int_or_none(resolution.partition('x')[2]),
'format_id': url_basename(video['url']).split('.')[0],
})
audio_url = video_info.get('audio_url')
if audio_url:
formats.append({
'url': audio_url,
'ext': 'm4a',
'format_id': url_basename(audio_url).split('.')[0],
})
manifest_url = video_info.get('transcoded_url')
if manifest_url:
formats.extend(self._extract_m3u8_formats(
manifest_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
self._sort_formats(formats)
comment_count = int_or_none(video_info.get('comment_count'))
user_info = user_info or traverse_obj(video_info, 'user', default={})
return {
'id': str_or_none(video_id) or video_uuid,
'title': video_info.get('description') or f'Video by {username}',
'thumbnail': video_info.get('thumbnail_url'),
'description': video_info.get('description'),
'uploader': str_or_none(username),
'uploader_id': str_or_none(user_info.get('user_id')),
'creator': str_or_none(user_info.get('name')),
'timestamp': unified_timestamp(video_info.get('timestamp')),
'upload_date': unified_strdate(video_info.get('timestamp')),
'duration': int_or_none(video_info.get('duration')),
'view_count': int_or_none(video_info.get('play_count')),
'like_count': int_or_none(video_info.get('likes_count')),
'artist': str_or_none(video_info.get('song_artist')),
'track': str_or_none(video_info.get('song_title')),
'webpage_url': f'https://triller.co/@{username}/video/{video_uuid}',
'uploader_url': f'https://triller.co/@{username}',
'extractor_key': TrillerIE.ie_key(),
'extractor': TrillerIE.IE_NAME,
'formats': formats,
'comment_count': comment_count,
'__post_extractor': self.extract_comments(video_id, comment_count),
}
class TrillerIE(TrillerBaseIE):
_VALID_URL = r'''(?x)
https?://(?:www\.)?triller\.co/
@(?P<username>[\w\._]+)/video/
(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})
'''
_TESTS = [{
'url': 'https://triller.co/@theestallion/video/2358fcd7-3df2-4c77-84c8-1d091610a6cf',
'md5': '228662d783923b60d78395fedddc0a20',
'info_dict': {
'id': '71595734',
'ext': 'mp4',
'title': 'md5:9a2bf9435c5c4292678996a464669416',
'thumbnail': r're:^https://uploads\.cdn\.triller\.co/.+\.jpg$',
'description': 'md5:9a2bf9435c5c4292678996a464669416',
'uploader': 'theestallion',
'uploader_id': '18992236',
'creator': 'Megan Thee Stallion',
'timestamp': 1660598222,
'upload_date': '20220815',
'duration': 47,
'height': 3840,
'width': 2160,
'view_count': int,
'like_count': int,
'artist': 'Megan Thee Stallion',
'track': 'Her',
'webpage_url': 'https://triller.co/@theestallion/video/2358fcd7-3df2-4c77-84c8-1d091610a6cf',
'uploader_url': 'https://triller.co/@theestallion',
'comment_count': int,
}
}, {
'url': 'https://triller.co/@charlidamelio/video/46c6fcfa-aa9e-4503-a50c-68444f44cddc',
'md5': '874055f462af5b0699b9dbb527a505a0',
'info_dict': {
'id': '71621339',
'ext': 'mp4',
'title': 'md5:4c91ea82760fe0fffb71b8c3aa7295fc',
'thumbnail': r're:^https://uploads\.cdn\.triller\.co/.+\.jpg$',
'description': 'md5:4c91ea82760fe0fffb71b8c3aa7295fc',
'uploader': 'charlidamelio',
'uploader_id': '1875551',
'creator': 'charli damelio',
'timestamp': 1660773354,
'upload_date': '20220817',
'duration': 16,
'height': 1920,
'width': 1080,
'view_count': int,
'like_count': int,
'artist': 'Dixie',
'track': 'Someone to Blame',
'webpage_url': 'https://triller.co/@charlidamelio/video/46c6fcfa-aa9e-4503-a50c-68444f44cddc',
'uploader_url': 'https://triller.co/@charlidamelio',
'comment_count': int,
}
}]
def _real_extract(self, url):
username, video_uuid = self._match_valid_url(url).group('username', 'id')
video_info = traverse_obj(self._download_json(
f'{self._API_BASE_URL}/api/videos/{video_uuid}',
video_uuid, note='Downloading video info API JSON',
errnote='Unable to download video info API JSON',
headers={
'Origin': 'https://triller.co',
}), ('videos', 0))
if not video_info:
raise ExtractorError('No video info found in API response')
user_info = self._check_user_info(video_info.get('user') or {})
return self._parse_video_info(video_info, username, user_info)
class TrillerUserIE(TrillerBaseIE):
_VALID_URL = r'https?://(?:www\.)?triller\.co/@(?P<id>[\w\._]+)/?(?:$|[#?])'
_TESTS = [{
# first videos request only returns 2 videos
'url': 'https://triller.co/@theestallion',
'playlist_mincount': 9,
'info_dict': {
'id': '18992236',
'title': 'theestallion',
'thumbnail': r're:^https://uploads\.cdn\.triller\.co/.+\.jpg$',
}
}, {
'url': 'https://triller.co/@charlidamelio',
'playlist_mincount': 25,
'info_dict': {
'id': '1875551',
'title': 'charlidamelio',
'thumbnail': r're:^https://uploads\.cdn\.triller\.co/.+\.jpg$',
}
}]
def _real_initialize(self):
if not self._AUTH_TOKEN:
guest = self._download_json(
f'{self._API_BASE_URL}/user/create_guest',
None, note='Creating guest session', data=b'', headers={
'Origin': 'https://triller.co',
}, query={
'platform': 'Web',
'app_version': '',
})
if not guest.get('auth_token'):
raise ExtractorError('Unable to fetch required auth token for user extraction')
self._AUTH_TOKEN = guest['auth_token']
def _extract_video_list(self, username, user_id, limit=6):
query = {
'limit': limit,
}
for page in itertools.count(1):
for retry in self.RetryManager():
try:
video_list = self._download_json(
f'{self._API_BASE_URL}/api/users/{user_id}/videos',
username, note=f'Downloading user video list page {page}',
errnote='Unable to download user video list', headers={
'Authorization': f'Bearer {self._AUTH_TOKEN}',
'Origin': 'https://triller.co',
}, query=query)
except ExtractorError as e:
if isinstance(e.cause, json.JSONDecodeError) and e.cause.pos == 0:
retry.error = e
continue
raise
if not video_list.get('videos'):
break
yield from video_list['videos']
query['before_time'] = traverse_obj(video_list, ('videos', -1, 'timestamp'))
if not query['before_time']:
break
def _entries(self, videos, username, user_info):
for video in videos:
yield self._parse_video_info(video, username, user_info)
def _real_extract(self, url):
username = self._match_id(url)
user_info = self._check_user_info(self._download_json(
f'{self._API_BASE_URL}/api/users/by_username/{username}',
username, note='Downloading user info',
errnote='Failed to download user info', headers={
'Authorization': f'Bearer {self._AUTH_TOKEN}',
'Origin': 'https://triller.co',
}).get('user', {}))
user_id = str_or_none(user_info.get('user_id'))
videos = self._extract_video_list(username, user_id)
thumbnail = user_info.get('avatar_url')
return self.playlist_result(
self._entries(videos, username, user_info), user_id, username, thumbnail=thumbnail)
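
Note on `_extract_video_list` in the file above: pagination is cursor-based. Each request carries `before_time` set to the timestamp of the last video from the previous page, and the loop stops on an empty page or a missing timestamp. The sketch below shows that loop against an in-memory fake. `iter_videos` and `fake_fetch` are hypothetical names, and the assumption that the API returns only videos strictly older than `before_time` is a guess made for the fake, not something the diff confirms.

```python
def iter_videos(fetch_page, limit=6):
    """Yield videos page by page, using the last timestamp as the next cursor."""
    query = {'limit': limit}
    while True:
        videos = fetch_page(dict(query)).get('videos')
        if not videos:
            break
        yield from videos
        query['before_time'] = videos[-1].get('timestamp')
        if not query['before_time']:
            break


# In-memory stand-in for the videos endpoint, newest first
ALL_VIDEOS = [{'id': i, 'timestamp': 100 - i} for i in range(5)]


def fake_fetch(query):
    before = query.get('before_time')
    older = [v for v in ALL_VIDEOS if before is None or v['timestamp'] < before]
    return {'videos': older[:query['limit']]}


ids = [video['id'] for video in iter_videos(fake_fetch, limit=2)]
```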

View File

@@ -2,7 +2,7 @@ from .common import InfoExtractor
 class UKTVPlayIE(InfoExtractor):
-_VALID_URL = r'https?://uktvplay\.uktv\.co\.uk/(?:.+?\?.*?\bvideo=|([^/]+/)*watch-online/)(?P<id>\d+)'
+_VALID_URL = r'https?://uktvplay\.(?:uktv\.)?co\.uk/(?:.+?\?.*?\bvideo=|([^/]+/)*watch-online/)(?P<id>\d+)'
 _TESTS = [{
 'url': 'https://uktvplay.uktv.co.uk/shows/world-at-war/c/200/watch-online/?video=2117008346001',
 'info_dict': {
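
Note on the hunk above: making `uktv\.` optional lets the pattern accept both host forms. A quick check with the new regex follows. The first URL is the test URL from the diff; the second, with the shorter `uktvplay.co.uk` host, is a hypothetical variant constructed for this note.

```python
import re

NEW = r'https?://uktvplay\.(?:uktv\.)?co\.uk/(?:.+?\?.*?\bvideo=|([^/]+/)*watch-online/)(?P<id>\d+)'

long_host = re.match(
    NEW, 'https://uktvplay.uktv.co.uk/shows/world-at-war/c/200/watch-online/?video=2117008346001')
# Hypothetical URL using the shorter host form
short_host = re.match(
    NEW, 'https://uktvplay.co.uk/shows/world-at-war/c/200/watch-online/?video=2117008346001')
```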

View File

@@ -1131,7 +1131,7 @@ class VimeoChannelIE(VimeoBaseInfoExtractor):
 class VimeoUserIE(VimeoChannelIE):
 IE_NAME = 'vimeo:user'
-_VALID_URL = r'https://vimeo\.com/(?!(?:[0-9]+|watchlater)(?:$|[?#/]))(?P<id>[^/]+)(?:/videos|[#?]|$)'
+_VALID_URL = r'https://vimeo\.com/(?!(?:[0-9]+|watchlater)(?:$|[?#/]))(?P<id>[^/]+)(?:/videos)?/?(?:$|[?#])'
 _TITLE_RE = r'<a[^>]+?class="user">([^<>]+?)</a>'
 _TESTS = [{
 'url': 'https://vimeo.com/nkistudio/videos',
@@ -1140,6 +1140,9 @@ class VimeoUserIE(VimeoChannelIE):
 'id': 'nkistudio',
 },
 'playlist_mincount': 66,
+}, {
+'url': 'https://vimeo.com/nkistudio/',
+'only_matching': True,
 }]
 _BASE_URL_TEMPL = 'https://vimeo.com/%s'
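
Note on the hunk above: the new pattern tolerates a trailing slash after the user name, which the old one rejected. The check below runs both regexes from the diff against the two test URLs; the numeric URL is an invented example to show that the negative lookahead still excludes video IDs.

```python
import re

OLD = r'https://vimeo\.com/(?!(?:[0-9]+|watchlater)(?:$|[?#/]))(?P<id>[^/]+)(?:/videos|[#?]|$)'
NEW = r'https://vimeo\.com/(?!(?:[0-9]+|watchlater)(?:$|[?#/]))(?P<id>[^/]+)(?:/videos)?/?(?:$|[?#])'

old_slash = re.match(OLD, 'https://vimeo.com/nkistudio/')
new_slash = re.match(NEW, 'https://vimeo.com/nkistudio/')
new_videos = re.match(NEW, 'https://vimeo.com/nkistudio/videos')
# Purely numeric paths are video pages, not users (invented ID)
new_numeric = re.match(NEW, 'https://vimeo.com/123456')
```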

View File

@@ -1,208 +0,0 @@
import functools
import re
import time
from .common import InfoExtractor
from ..aes import aes_cbc_encrypt_bytes
from ..utils import determine_ext, int_or_none, traverse_obj, urljoin
class WeTvBaseIE(InfoExtractor):
_VALID_URL_BASE = r'https?://(?:www\.)?wetv\.vip/(?:[^?#]+/)?play'
def _get_ckey(self, video_id, url, app_version, platform):
ua = self.get_param('http_headers')['User-Agent']
payload = (f'{video_id}|{int(time.time())}|mg3c3b04ba|{app_version}|0000000000000000|'
f'{platform}|{url[:48]}|{ua.lower()[:48]}||Mozilla|Netscape|Win32|00|')
return aes_cbc_encrypt_bytes(
bytes(f'|{sum(map(ord, payload))}|{payload}', 'utf-8'),
b'Ok\xda\xa3\x9e/\x8c\xb0\x7f^r-\x9e\xde\xf3\x14',
b'\x01PJ\xf3V\xe6\x19\xcf.B\xbb\xa6\x8c?p\xf9',
padding_mode='whitespace').hex()
def _get_video_api_response(self, video_url, video_id, series_id, subtitle_format, video_format, video_quality):
app_version = '3.5.57'
platform = '4830201'
ckey = self._get_ckey(video_id, video_url, app_version, platform)
query = {
'vid': video_id,
'cid': series_id,
'cKey': ckey,
'encryptVer': '8.1',
'spcaptiontype': '1' if subtitle_format == 'vtt' else '0', # 0 - SRT, 1 - VTT
'sphls': '1' if video_format == 'hls' else '0', # 0 - MP4, 1 - HLS
'defn': video_quality, # '': 480p, 'shd': 720p, 'fhd': 1080p
'spsrt': '1', # Enable subtitles
'sphttps': '1', # Enable HTTPS
'otype': 'json', # Response format: xml, json,
'dtype': '1',
'spwm': '1',
'host': 'wetv.vip', # These three values are needed for SHD
'referer': 'wetv.vip',
'ehost': video_url,
'appVer': app_version,
'platform': platform,
}
return self._search_json(r'QZOutputJson=', self._download_webpage(
'https://play.wetv.vip/getvinfo', video_id, query=query), 'api_response', video_id)
def _get_webpage_metadata(self, webpage, video_id):
return self._parse_json(
traverse_obj(self._search_nextjs_data(webpage, video_id), ('props', 'pageProps', 'data')),
video_id, fatal=False)
class WeTvEpisodeIE(WeTvBaseIE):
IE_NAME = 'wetv:episode'
_VALID_URL = WeTvBaseIE._VALID_URL_BASE + r'/(?P<series_id>\w+)(?:-[^?#]+)?/(?P<id>\w+)(?:-[^?#]+)?'
_TESTS = [{
'url': 'https://wetv.vip/en/play/air11ooo2rdsdi3-Cute-Programmer/v0040pr89t9-EP1-Cute-Programmer',
'md5': 'a046f565c9dce9b263a0465a422cd7bf',
'info_dict': {
'id': 'v0040pr89t9',
'ext': 'mp4',
'title': 'EP1: Cute Programmer',
'description': 'md5:e87beab3bf9f392d6b9e541a63286343',
'thumbnail': r're:^https?://[^?#]+air11ooo2rdsdi3',
'series': 'Cute Programmer',
'episode': 'Episode 1',
'episode_number': 1,
'duration': 2835,
},
}, {
'url': 'https://wetv.vip/en/play/u37kgfnfzs73kiu/p0039b9nvik',
'md5': '4d9d69bcfd11da61f4aae64fc6b316b3',
'info_dict': {
'id': 'p0039b9nvik',
'ext': 'mp4',
'title': 'EP1: You Are My Glory',
'description': 'md5:831363a4c3b4d7615e1f3854be3a123b',
'thumbnail': r're:^https?://[^?#]+u37kgfnfzs73kiu',
'series': 'You Are My Glory',
'episode': 'Episode 1',
'episode_number': 1,
'duration': 2454,
},
}, {
'url': 'https://wetv.vip/en/play/lcxgwod5hapghvw-WeTV-PICK-A-BOO/i0042y00lxp-Zhao-Lusi-Describes-The-First-Experiences-She-Had-In-Who-Rules-The-World-%7C-WeTV-PICK-A-BOO',
'md5': '71133f5c2d5d6cad3427e1b010488280',
'info_dict': {
'id': 'i0042y00lxp',
'ext': 'mp4',
'title': 'md5:f7a0857dbe5fbbe2e7ad630b92b54e6a',
'description': 'md5:76260cb9cdc0ef76826d7ca9d92fadfa',
'thumbnail': r're:^https?://[^?#]+lcxgwod5hapghvw',
'series': 'WeTV PICK-A-BOO',
'episode': 'Episode 0',
'episode_number': 0,
'duration': 442,
},
}]
def _extract_video_formats_and_subtitles(self, api_response, video_id, video_quality):
video_response = api_response['vl']['vi'][0]
video_width = video_response.get('vw')
video_height = video_response.get('vh')
formats, subtitles = [], {}
for video_format in video_response['ul']['ui']:
if video_format.get('hls'):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
video_format['url'] + video_format['hls']['pname'], video_id, 'mp4', fatal=False)
for f in fmts:
f['width'] = video_width
f['height'] = video_height
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
else:
formats.append({
'url': f'{video_format["url"]}{video_response["fn"]}?vkey={video_response["fvkey"]}',
'width': video_width,
'height': video_height,
'ext': 'mp4',
})
return formats, subtitles
def _extract_video_subtitles(self, api_response, subtitles_format):
subtitles = {}
for subtitle in traverse_obj(api_response, ('sfl', 'fi')):
subtitles.setdefault(subtitle['lang'].lower(), []).append({
'url': subtitle['url'],
'ext': subtitles_format,
'protocol': 'm3u8_native' if determine_ext(subtitle['url']) == 'm3u8' else 'http',
})
return subtitles
def _real_extract(self, url):
video_id, series_id = self._match_valid_url(url).group('id', 'series_id')
webpage = self._download_webpage(url, video_id)
formats, subtitles = [], {}
for video_format, subtitle_format, video_quality in (('mp4', 'srt', ''), ('hls', 'vtt', 'shd'), ('hls', 'vtt', 'fhd')):
api_response = self._get_video_api_response(url, video_id, series_id, subtitle_format, video_format, video_quality)
fmts, subs = self._extract_video_formats_and_subtitles(api_response, video_id, video_quality)
native_subtitles = self._extract_video_subtitles(api_response, subtitle_format)
formats.extend(fmts)
self._merge_subtitles(subs, native_subtitles, target=subtitles)
self._sort_formats(formats)
webpage_metadata = self._get_webpage_metadata(webpage, video_id)
return {
'id': video_id,
'title': (self._og_search_title(webpage)
or traverse_obj(webpage_metadata, ('coverInfo', 'description'))),
'description': (self._og_search_description(webpage)
or traverse_obj(webpage_metadata, ('coverInfo', 'description'))),
'formats': formats,
'subtitles': subtitles,
'thumbnail': self._og_search_thumbnail(webpage),
'duration': int_or_none(traverse_obj(webpage_metadata, ('videoInfo', 'duration'))),
'series': traverse_obj(webpage_metadata, ('coverInfo', 'title')),
'episode_number': int_or_none(traverse_obj(webpage_metadata, ('videoInfo', 'episode'))),
}
class WeTvSeriesIE(WeTvBaseIE):
_VALID_URL = WeTvBaseIE._VALID_URL_BASE + r'/(?P<id>\w+)(?:-[^/?#]+)?/?(?:[?#]|$)'
_TESTS = [{
'url': 'https://wetv.vip/play/air11ooo2rdsdi3-Cute-Programmer',
'info_dict': {
'id': 'air11ooo2rdsdi3',
'title': 'Cute Programmer',
'description': 'md5:e87beab3bf9f392d6b9e541a63286343',
},
'playlist_count': 30,
}, {
'url': 'https://wetv.vip/en/play/u37kgfnfzs73kiu-You-Are-My-Glory',
'info_dict': {
'id': 'u37kgfnfzs73kiu',
'title': 'You Are My Glory',
'description': 'md5:831363a4c3b4d7615e1f3854be3a123b',
},
'playlist_count': 32,
}]
def _real_extract(self, url):
series_id = self._match_id(url)
webpage = self._download_webpage(url, series_id)
webpage_metadata = self._get_webpage_metadata(webpage, series_id)
episode_paths = (re.findall(r'<a[^>]+class="play-video__link"[^>]+href="(?P<path>[^"]+)', webpage)
or [f'/{series_id}/{episode["vid"]}' for episode in webpage_metadata.get('videoList')])
return self.playlist_from_matches(
episode_paths, series_id, ie=WeTvEpisodeIE, getter=functools.partial(urljoin, url),
title=traverse_obj(webpage_metadata, ('coverInfo', 'title')) or self._og_search_title(webpage),
description=traverse_obj(webpage_metadata, ('coverInfo', 'description')) or self._og_search_description(webpage))
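
The subtitle grouping done by `_extract_video_subtitles` above can be tried in isolation. This is a minimal sketch, not the extractor's code: `guess_ext` is a hypothetical stand-in for yt-dlp's `determine_ext`, the sample response is invented, and unlike the original it tolerates a response with no `sfl`/`fi` entries.

```python
from urllib.parse import urlparse


def guess_ext(url):
    """Hypothetical stand-in for determine_ext: extension of the URL path."""
    path = urlparse(url).path
    return path.rpartition('.')[2] if '.' in path else None


def group_subtitles(api_response, subtitles_format):
    """Group subtitle entries by lower-cased language code."""
    subtitles = {}
    # Assumption: a missing 'sfl' or 'fi' key means "no subtitles"
    entries = (api_response.get('sfl') or {}).get('fi') or []
    for subtitle in entries:
        subtitles.setdefault(subtitle['lang'].lower(), []).append({
            'url': subtitle['url'],
            'ext': subtitles_format,
            # HLS playlists need the m3u8 downloader; everything else is plain HTTP
            'protocol': 'm3u8_native' if guess_ext(subtitle['url']) == 'm3u8' else 'http',
        })
    return subtitles


# Invented sample data, shaped like the fields the extractor reads
sample_response = {'sfl': {'fi': [
    {'lang': 'EN', 'url': 'https://example.com/subs/a.m3u8?token=1'},
    {'lang': 'en', 'url': 'https://example.com/subs/b.srt'},
]}}
grouped = group_subtitles(sample_response, 'vtt')
```

Both entries end up under the single key `en`, because the language code is lower-cased before grouping.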


@@ -110,8 +110,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': { 'INNERTUBE_CONTEXT': {
'client': { 'client': {
'clientName': 'ANDROID', 'clientName': 'ANDROID',
'clientVersion': '17.29.34', 'clientVersion': '17.31.35',
'androidSdkVersion': 30 'androidSdkVersion': 30,
'userAgent': 'com.google.android.youtube/17.31.35 (Linux; U; Android 11) gzip'
} }
}, },
'INNERTUBE_CONTEXT_CLIENT_NAME': 3, 'INNERTUBE_CONTEXT_CLIENT_NAME': 3,
@@ -122,8 +123,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': { 'INNERTUBE_CONTEXT': {
'client': { 'client': {
'clientName': 'ANDROID_EMBEDDED_PLAYER', 'clientName': 'ANDROID_EMBEDDED_PLAYER',
'clientVersion': '17.29.34', 'clientVersion': '17.31.35',
'androidSdkVersion': 30 'androidSdkVersion': 30,
'userAgent': 'com.google.android.youtube/17.31.35 (Linux; U; Android 11) gzip'
}, },
}, },
'INNERTUBE_CONTEXT_CLIENT_NAME': 55, 'INNERTUBE_CONTEXT_CLIENT_NAME': 55,
@@ -135,7 +137,8 @@ INNERTUBE_CLIENTS = {
'client': { 'client': {
'clientName': 'ANDROID_MUSIC', 'clientName': 'ANDROID_MUSIC',
'clientVersion': '5.16.51', 'clientVersion': '5.16.51',
'androidSdkVersion': 30 'androidSdkVersion': 30,
'userAgent': 'com.google.android.apps.youtube.music/5.16.51 (Linux; U; Android 11) gzip'
} }
}, },
'INNERTUBE_CONTEXT_CLIENT_NAME': 21, 'INNERTUBE_CONTEXT_CLIENT_NAME': 21,
@@ -146,8 +149,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': { 'INNERTUBE_CONTEXT': {
'client': { 'client': {
'clientName': 'ANDROID_CREATOR', 'clientName': 'ANDROID_CREATOR',
'clientVersion': '22.28.100', 'clientVersion': '22.30.100',
'androidSdkVersion': 30 'androidSdkVersion': 30,
'userAgent': 'com.google.android.apps.youtube.creator/22.30.100 (Linux; U; Android 11) gzip'
}, },
}, },
'INNERTUBE_CONTEXT_CLIENT_NAME': 14, 'INNERTUBE_CONTEXT_CLIENT_NAME': 14,
@@ -160,8 +164,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': { 'INNERTUBE_CONTEXT': {
'client': { 'client': {
'clientName': 'IOS', 'clientName': 'IOS',
'clientVersion': '17.30.1', 'clientVersion': '17.33.2',
'deviceModel': 'iPhone14,3', 'deviceModel': 'iPhone14,3',
'userAgent': 'com.google.ios.youtube/17.33.2 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
} }
}, },
'INNERTUBE_CONTEXT_CLIENT_NAME': 5, 'INNERTUBE_CONTEXT_CLIENT_NAME': 5,
@@ -171,8 +176,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': { 'INNERTUBE_CONTEXT': {
'client': { 'client': {
'clientName': 'IOS_MESSAGES_EXTENSION', 'clientName': 'IOS_MESSAGES_EXTENSION',
'clientVersion': '17.30.1', 'clientVersion': '17.33.2',
'deviceModel': 'iPhone14,3', 'deviceModel': 'iPhone14,3',
'userAgent': 'com.google.ios.youtube/17.33.2 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
}, },
}, },
'INNERTUBE_CONTEXT_CLIENT_NAME': 66, 'INNERTUBE_CONTEXT_CLIENT_NAME': 66,
@@ -183,7 +189,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': { 'INNERTUBE_CONTEXT': {
'client': { 'client': {
'clientName': 'IOS_MUSIC', 'clientName': 'IOS_MUSIC',
'clientVersion': '5.18', 'clientVersion': '5.21',
'deviceModel': 'iPhone14,3',
'userAgent': 'com.google.ios.youtubemusic/5.21 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
}, },
}, },
'INNERTUBE_CONTEXT_CLIENT_NAME': 26, 'INNERTUBE_CONTEXT_CLIENT_NAME': 26,
@@ -193,7 +201,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': { 'INNERTUBE_CONTEXT': {
'client': { 'client': {
'clientName': 'IOS_CREATOR', 'clientName': 'IOS_CREATOR',
'clientVersion': '22.29.101', 'clientVersion': '22.33.101',
'deviceModel': 'iPhone14,3',
'userAgent': 'com.google.ios.ytcreator/22.33.101 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
}, },
}, },
'INNERTUBE_CONTEXT_CLIENT_NAME': 15, 'INNERTUBE_CONTEXT_CLIENT_NAME': 15,
@@ -555,7 +565,8 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
'Origin': origin, 'Origin': origin,
'X-Youtube-Identity-Token': identity_token or self._extract_identity_token(ytcfg), 'X-Youtube-Identity-Token': identity_token or self._extract_identity_token(ytcfg),
'X-Goog-PageId': account_syncid or self._extract_account_syncid(ytcfg), 'X-Goog-PageId': account_syncid or self._extract_account_syncid(ytcfg),
'X-Goog-Visitor-Id': visitor_data or self._extract_visitor_data(ytcfg) 'X-Goog-Visitor-Id': visitor_data or self._extract_visitor_data(ytcfg),
'User-Agent': self._ytcfg_get_safe(ytcfg, lambda x: x['INNERTUBE_CONTEXT']['client']['userAgent'], default_client=default_client)
} }
if session_index is None: if session_index is None:
session_index = self._extract_session_index(ytcfg) session_index = self._extract_session_index(ytcfg)
@@ -2148,6 +2159,35 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'comment_count': int, 'comment_count': int,
'channel_follower_count': int 'channel_follower_count': int
} }
}, {
# Same video as above, but with --compat-opt no-youtube-prefer-utc-upload-date
'url': 'https://www.youtube.com/watch?v=2NUZ8W2llS4',
'info_dict': {
'id': '2NUZ8W2llS4',
'ext': 'mp4',
'title': 'The NP that test your phone performance 🙂',
'description': 'md5:144494b24d4f9dfacb97c1bbef5de84d',
'uploader': 'Leon Nguyen',
'uploader_id': 'VNSXIII',
'uploader_url': 'http://www.youtube.com/user/VNSXIII',
'channel_id': 'UCRqNBSOHgilHfAczlUmlWHA',
'channel_url': 'https://www.youtube.com/channel/UCRqNBSOHgilHfAczlUmlWHA',
'duration': 21,
'view_count': int,
'age_limit': 0,
'categories': ['Gaming'],
'tags': 'count:23',
'playable_in_embed': True,
'live_status': 'not_live',
'upload_date': '20220102',
'like_count': int,
'availability': 'public',
'channel': 'Leon Nguyen',
'thumbnail': 'https://i.ytimg.com/vi_webp/2NUZ8W2llS4/maxresdefault.webp',
'comment_count': int,
'channel_follower_count': int
},
'params': {'compat_opts': ['no-youtube-prefer-utc-upload-date']}
}, { }, {
# date text is premiered video, ensure upload date in UTC (published 1641172509) # date text is premiered video, ensure upload date in UTC (published 1641172509)
'url': 'https://www.youtube.com/watch?v=mzZzzBU6lrM', 'url': 'https://www.youtube.com/watch?v=mzZzzBU6lrM',
@@ -2621,7 +2661,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
raise ExtractorError('Cannot decrypt nsig without player_url') raise ExtractorError('Cannot decrypt nsig without player_url')
player_url = urljoin('https://www.youtube.com', player_url) player_url = urljoin('https://www.youtube.com', player_url)
jsi, player_id, func_code = self._extract_n_function_code(video_id, player_url) try:
jsi, player_id, func_code = self._extract_n_function_code(video_id, player_url)
except ExtractorError as e:
raise ExtractorError('Unable to extract nsig function code', cause=e)
if self.get_param('youtube_print_sig_code'): if self.get_param('youtube_print_sig_code'):
self.to_screen(f'Extracted nsig function from {player_id}:\n{func_code[1]}\n') self.to_screen(f'Extracted nsig function from {player_id}:\n{func_code[1]}\n')
@@ -2630,7 +2673,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
ret = extract_nsig(jsi, func_code)(s) ret = extract_nsig(jsi, func_code)(s)
except JSInterpreter.Exception as e: except JSInterpreter.Exception as e:
try: try:
jsi = PhantomJSwrapper(self) jsi = PhantomJSwrapper(self, timeout=5000)
except ExtractorError: except ExtractorError:
raise e raise e
self.report_warning( self.report_warning(
@@ -2646,24 +2689,40 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
self.write_debug(f'Decrypted nsig {s} => {ret}') self.write_debug(f'Decrypted nsig {s} => {ret}')
return ret return ret
def _extract_n_function_name(self, jscode):
funcname, idx = self._search_regex(
r'\.get\("n"\)\)&&\(b=(?P<nfunc>[a-zA-Z0-9$]+)(?:\[(?P<idx>\d+)\])?\([a-zA-Z0-9]\)',
jscode, 'Initial JS player n function name', group=('nfunc', 'idx'))
if not idx:
return funcname
return json.loads(js_to_json(self._search_regex(
rf'var {re.escape(funcname)}\s*=\s*(\[.+?\]);', jscode,
f'Initial JS player n function list ({funcname}.{idx})')))[int(idx)]
def _extract_n_function_code(self, video_id, player_url): def _extract_n_function_code(self, video_id, player_url):
player_id = self._extract_player_info(player_url) player_id = self._extract_player_info(player_url)
func_code = self.cache.load('youtube-nsig', player_id) func_code = self.cache.load('youtube-nsig', player_id, min_ver='2022.09.1')
jscode = func_code or self._load_player(video_id, player_url) jscode = func_code or self._load_player(video_id, player_url)
jsi = JSInterpreter(jscode) jsi = JSInterpreter(jscode)
if func_code: if func_code:
return jsi, player_id, func_code return jsi, player_id, func_code
funcname, idx = self._search_regex( func_name = self._extract_n_function_name(jscode)
r'\.get\("n"\)\)&&\(b=(?P<nfunc>[a-zA-Z0-9$]+)(?:\[(?P<idx>\d+)\])?\([a-zA-Z0-9]\)',
jscode, 'Initial JS player n function name', group=('nfunc', 'idx')) # For redundancy
if idx: func_code = self._search_regex(
funcname = json.loads(js_to_json(self._search_regex( r'''(?xs)%s\s*=\s*function\s*\((?P<var>[\w$]+)\)\s*
rf'var {re.escape(funcname)}\s*=\s*(\[.+?\]);', jscode, # NB: The end of the regex is intentionally kept strict
f'Initial JS player n function list ({funcname}.{idx})')))[int(idx)] {(?P<code>.+?}\s*return\ [\w$]+.join\(""\))};''' % func_name,
jscode, 'nsig function', group=('var', 'code'), default=None)
if func_code:
func_code = ([func_code[0]], func_code[1])
else:
self.write_debug('Extracting nsig function with jsinterp')
func_code = jsi.extract_function_code(func_name)
func_code = jsi.extract_function_code(funcname)
self.cache.store('youtube-nsig', player_id, func_code) self.cache.store('youtube-nsig', player_id, func_code)
return jsi, player_id, func_code return jsi, player_id, func_code
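
The name lookup that this hunk factors out into `_extract_n_function_name` can be sketched standalone. The first regex is copied from the diff. The array parsing is a simplification: the original passes the array through `js_to_json` and `json.loads`, while this sketch splits on commas. The JavaScript snippets are invented.

```python
import re

# Regex copied from the diff: captures the function name and an optional array index
N_FUNC_RE = re.compile(
    r'\.get\("n"\)\)&&\(b=(?P<nfunc>[a-zA-Z0-9$]+)(?:\[(?P<idx>\d+)\])?\([a-zA-Z0-9]\)')


def find_n_function_name(jscode):
    m = N_FUNC_RE.search(jscode)
    if m is None:
        raise ValueError('n function name not found')
    funcname, idx = m.group('nfunc', 'idx')
    if not idx:
        # Direct call: b=name(a)
        return funcname
    # Indirect call: b=name[idx](a), so resolve `var name = [f0, f1, ...];`
    list_m = re.search(rf'var {re.escape(funcname)}\s*=\s*\[(.+?)\];', jscode)
    if list_m is None:
        raise ValueError(f'function list {funcname} not found')
    # Simplification: assumes a flat list of bare identifiers
    names = [name.strip() for name in list_m.group(1).split(',')]
    return names[int(idx)]


# Invented player snippets for illustration
direct = 'a.get("n"))&&(b=xyz(c),a.set("n",b))'
indirect = 'var Wka=[foo,bar];a.get("n"))&&(b=Wka[1](c),a.set("n",b))'
```

`re.escape` matters here because function names in minified players may contain `$`.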
@@ -2945,8 +3004,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# YouTube comments have a max depth of 2 # YouTube comments have a max depth of 2
max_depth = int_or_none(get_single_config_arg('max_comment_depth')) max_depth = int_or_none(get_single_config_arg('max_comment_depth'))
if max_depth: if max_depth:
self._downloader.deprecation_warning( self._downloader.deprecated_feature('[youtube] max_comment_depth extractor argument is deprecated. '
'[youtube] max_comment_depth extractor argument is deprecated. Set max replies in the max-comments extractor argument instead.') 'Set max replies in the max-comments extractor argument instead')
if max_depth == 1 and parent: if max_depth == 1 and parent:
return return
@@ -3068,7 +3127,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def _is_unplayable(player_response): def _is_unplayable(player_response):
return traverse_obj(player_response, ('playabilityStatus', 'status')) == 'UNPLAYABLE' return traverse_obj(player_response, ('playabilityStatus', 'status')) == 'UNPLAYABLE'
def _extract_player_response(self, client, video_id, master_ytcfg, player_ytcfg, player_url, initial_pr): _STORY_PLAYER_PARAMS = '8AEB'
def _extract_player_response(self, client, video_id, master_ytcfg, player_ytcfg, player_url, initial_pr, smuggled_data):
session_index = self._extract_session_index(player_ytcfg, master_ytcfg) session_index = self._extract_session_index(player_ytcfg, master_ytcfg)
syncid = self._extract_account_syncid(player_ytcfg, master_ytcfg, initial_pr) syncid = self._extract_account_syncid(player_ytcfg, master_ytcfg, initial_pr)
@@ -3078,8 +3139,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
yt_query = { yt_query = {
'videoId': video_id, 'videoId': video_id,
'params': '8AEB' # enable stories
} }
if smuggled_data.get('is_story') or _split_innertube_client(client)[0] == 'android':
yt_query['params'] = self._STORY_PLAYER_PARAMS
yt_query.update(self._generate_player_context(sts)) yt_query.update(self._generate_player_context(sts))
return self._extract_response( return self._extract_response(
item_id=video_id, ep='player', query=yt_query, item_id=video_id, ep='player', query=yt_query,
@@ -3112,7 +3175,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
return orderedSet(requested_clients) return orderedSet(requested_clients)
def _extract_player_responses(self, clients, video_id, webpage, master_ytcfg): def _extract_player_responses(self, clients, video_id, webpage, master_ytcfg, smuggled_data):
initial_pr = None initial_pr = None
if webpage: if webpage:
initial_pr = self._search_json( initial_pr = self._search_json(
@@ -3162,7 +3225,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
try: try:
pr = initial_pr if client == 'web' and initial_pr else self._extract_player_response( pr = initial_pr if client == 'web' and initial_pr else self._extract_player_response(
client, video_id, player_ytcfg or master_ytcfg, player_ytcfg, player_url if require_js_player else None, initial_pr) client, video_id, player_ytcfg or master_ytcfg, player_ytcfg, player_url if require_js_player else None, initial_pr, smuggled_data)
except ExtractorError as e: except ExtractorError as e:
if last_error: if last_error:
self.report_warning(last_error) self.report_warning(last_error)
@@ -3196,7 +3259,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def _extract_formats_and_subtitles(self, streaming_data, video_id, player_url, is_live, duration): def _extract_formats_and_subtitles(self, streaming_data, video_id, player_url, is_live, duration):
itags, stream_ids = {}, [] itags, stream_ids = {}, []
itag_qualities, res_qualities = {}, {0: -1} itag_qualities, res_qualities = {}, {0: None}
q = qualities([ q = qualities([
# Normally tiny is the smallest video-only formats. But # Normally tiny is the smallest video-only formats. But
# audio-only formats with unknown quality may get tagged as tiny # audio-only formats with unknown quality may get tagged as tiny
@@ -3264,7 +3327,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
except ExtractorError as e: except ExtractorError as e:
phantomjs_hint = '' phantomjs_hint = ''
if isinstance(e, JSInterpreter.Exception): if isinstance(e, JSInterpreter.Exception):
phantomjs_hint = f' Install {self._downloader._format_err("PhantomJS", self._downloader.Styles.EMPHASIS)} to workaround the issue\n' phantomjs_hint = (f' Install {self._downloader._format_err("PhantomJS", self._downloader.Styles.EMPHASIS)} '
f'to workaround the issue. {PhantomJSwrapper.INSTALL_HINT}\n')
self.report_warning( self.report_warning(
f'nsig extraction failed: You may experience throttling for some formats\n{phantomjs_hint}' f'nsig extraction failed: You may experience throttling for some formats\n{phantomjs_hint}'
f' n = {query["n"][0]} ; player = {player_url}', video_id=video_id, only_once=True) f' n = {query["n"][0]} ; player = {player_url}', video_id=video_id, only_once=True)
@@ -3354,7 +3418,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
f['format_id'] = itag f['format_id'] = itag
itags[itag] = proto itags[itag] = proto
f['quality'] = itag_qualities.get(try_get(f, lambda f: f['format_id'].split('-')[0]), -1) f['quality'] = q(itag_qualities.get(try_get(f, lambda f: f['format_id'].split('-')[0]), -1))
if f['quality'] == -1 and f.get('height'): if f['quality'] == -1 and f.get('height'):
f['quality'] = q(res_qualities[min(res_qualities, key=lambda x: abs(x - f['height']))]) f['quality'] = q(res_qualities[min(res_qualities, key=lambda x: abs(x - f['height']))])
return True return True
@@ -3425,14 +3489,17 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def _download_player_responses(self, url, smuggled_data, video_id, webpage_url): def _download_player_responses(self, url, smuggled_data, video_id, webpage_url):
webpage = None webpage = None
if 'webpage' not in self._configuration_arg('player_skip'): if 'webpage' not in self._configuration_arg('player_skip'):
query = {'bpctr': '9999999999', 'has_verified': '1'}
if smuggled_data.get('is_story'):
query['pp'] = self._STORY_PLAYER_PARAMS
webpage = self._download_webpage( webpage = self._download_webpage(
webpage_url + '&bpctr=9999999999&has_verified=1&pp=8AEB', video_id, fatal=False) webpage_url, video_id, fatal=False, query=query)
master_ytcfg = self.extract_ytcfg(video_id, webpage) or self._get_default_ytcfg() master_ytcfg = self.extract_ytcfg(video_id, webpage) or self._get_default_ytcfg()
player_responses, player_url = self._extract_player_responses( player_responses, player_url = self._extract_player_responses(
self._get_requested_clients(url, smuggled_data), self._get_requested_clients(url, smuggled_data),
video_id, webpage, master_ytcfg) video_id, webpage, master_ytcfg, smuggled_data)
return webpage, master_ytcfg, player_responses, player_url return webpage, master_ytcfg, player_responses, player_url
@@ -3898,7 +3965,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
upload_date = ( upload_date = (
unified_strdate(get_first(microformats, 'uploadDate')) unified_strdate(get_first(microformats, 'uploadDate'))
or unified_strdate(search_meta('uploadDate'))) or unified_strdate(search_meta('uploadDate')))
if not upload_date or (not info.get('is_live') and not info.get('was_live') and info.get('live_status') != 'is_upcoming'): if not upload_date or (
not info.get('is_live')
and not info.get('was_live')
and info.get('live_status') != 'is_upcoming'
and 'no-youtube-prefer-utc-upload-date' not in self.get_param('compat_opts', [])
):
upload_date = strftime_or_none(self._extract_time_text(vpir, 'dateText')[0], '%Y%m%d') or upload_date upload_date = strftime_or_none(self._extract_time_text(vpir, 'dateText')[0], '%Y%m%d') or upload_date
info['upload_date'] = upload_date info['upload_date'] = upload_date
@@ -6005,7 +6077,7 @@ class YoutubeStoriesIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
playlist_id = f'RLTD{self._match_id(url)}' playlist_id = f'RLTD{self._match_id(url)}'
return self.url_result( return self.url_result(
f'https://www.youtube.com/playlist?list={playlist_id}&playnext=1', smuggle_url(f'https://www.youtube.com/playlist?list={playlist_id}&playnext=1', {'is_story': True}),
ie=YoutubeTabIE, video_id=playlist_id) ie=YoutubeTabIE, video_id=playlist_id)


@@ -18,10 +18,11 @@ from .utils import (
def _js_bit_op(op): def _js_bit_op(op):
def zeroise(x):
return 0 if x in (None, JS_Undefined) else x
def wrapped(a, b): def wrapped(a, b):
def zeroise(x): return op(zeroise(a), zeroise(b)) & 0xffffffff
return 0 if x in (None, JS_Undefined) else x
return op(zeroise(a), zeroise(b))
return wrapped return wrapped
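
The effect of the added `& 0xffffffff` can be checked with a small standalone version of the wrapper. `JS_UNDEFINED` is a hypothetical stand-in for the interpreter's `JS_Undefined` sentinel. The mask keeps only the low 32 bits of the result; this sketch does not try to reproduce JavaScript's signed interpretation of those bits.

```python
import operator

JS_UNDEFINED = object()  # hypothetical stand-in for JS_Undefined


def js_bit_op(op):
    def zeroise(x):
        # null/undefined operands behave like 0 in bitwise operations
        return 0 if x in (None, JS_UNDEFINED) else x

    def wrapped(a, b):
        # Python ints are unbounded, so truncate the result to 32 bits
        return op(zeroise(a), zeroise(b)) & 0xffffffff

    return wrapped


js_or = js_bit_op(operator.or_)
js_and = js_bit_op(operator.and_)
js_lshift = js_bit_op(operator.lshift)
```

Without the mask, `js_lshift(1, 40)` would return a 41-bit integer; with it, the bits above position 31 are discarded.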
@@ -71,6 +72,8 @@ def _js_comp_op(op):
def wrapped(a, b): def wrapped(a, b):
if JS_Undefined in (a, b): if JS_Undefined in (a, b):
return False return False
if isinstance(a, str) or isinstance(b, str):
return op(str(a or 0), str(b or 0))
return op(a or 0, b or 0) return op(a or 0, b or 0)
return wrapped return wrapped
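
A standalone sketch of the comparison wrapper after this change, again with a hypothetical `JS_UNDEFINED` sentinel. As written in the diff, the comparison becomes lexicographic as soon as either operand is a string.

```python
import operator

JS_UNDEFINED = object()  # hypothetical stand-in for JS_Undefined


def js_comp_op(op):
    def wrapped(a, b):
        # Any comparison involving undefined is false
        if JS_UNDEFINED in (a, b):
            return False
        # New branch: if either side is a string, compare both as strings
        if isinstance(a, str) or isinstance(b, str):
            return op(str(a or 0), str(b or 0))
        # Otherwise compare numerically, treating null as 0
        return op(a or 0, b or 0)

    return wrapped


js_lt = js_comp_op(operator.lt)
js_ge = js_comp_op(operator.ge)
```

Before the change, a mixed call such as `js_lt('a', 1)` would have reached `op('a', 1)` and raised `TypeError` in Python 3.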
@@ -98,8 +101,8 @@ _OPERATORS = { # None => Defined in JSInterpreter._operator
'&': _js_bit_op(operator.and_), '&': _js_bit_op(operator.and_),
'===': operator.is_, '===': operator.is_,
'==': _js_eq_op(operator.eq),
'!==': operator.is_not, '!==': operator.is_not,
'==': _js_eq_op(operator.eq),
'!=': _js_eq_op(operator.ne), '!=': _js_eq_op(operator.ne),
'<=': _js_comp_op(operator.le), '<=': _js_comp_op(operator.le),
@@ -172,7 +175,14 @@ class Debugger:
def interpret_statement(self, stmt, local_vars, allow_recursion, *args, **kwargs): def interpret_statement(self, stmt, local_vars, allow_recursion, *args, **kwargs):
if cls.ENABLED and stmt.strip(): if cls.ENABLED and stmt.strip():
cls.write(stmt, level=allow_recursion) cls.write(stmt, level=allow_recursion)
ret, should_ret = f(self, stmt, local_vars, allow_recursion, *args, **kwargs) try:
ret, should_ret = f(self, stmt, local_vars, allow_recursion, *args, **kwargs)
except Exception as e:
if cls.ENABLED:
if isinstance(e, ExtractorError):
e = e.orig_msg
cls.write('=> Raises:', e, '<-|', stmt, level=allow_recursion)
raise
if cls.ENABLED and stmt.strip(): if cls.ENABLED and stmt.strip():
cls.write(['->', '=>'][should_ret], repr(ret), '<-|', stmt, level=allow_recursion) cls.write(['->', '=>'][should_ret], repr(ret), '<-|', stmt, level=allow_recursion)
return ret, should_ret return ret, should_ret
@@ -226,7 +236,7 @@ class JSInterpreter:
@staticmethod @staticmethod
def _separate(expr, delim=',', max_split=None): def _separate(expr, delim=',', max_split=None):
OP_CHARS = '+-*/%&|^=<>!,;' OP_CHARS = '+-*/%&|^=<>!,;{}:'
if not expr: if not expr:
return return
counters = {k: 0 for k in _MATCHING_PARENS.values()} counters = {k: 0 for k in _MATCHING_PARENS.values()}
@@ -237,13 +247,14 @@ class JSInterpreter:
counters[_MATCHING_PARENS[char]] += 1 counters[_MATCHING_PARENS[char]] += 1
elif not in_quote and char in counters: elif not in_quote and char in counters:
counters[char] -= 1 counters[char] -= 1
elif not escaping and char in _QUOTES and in_quote in (char, None): elif not escaping:
if in_quote or after_op or char != '/': if char in _QUOTES and in_quote in (char, None):
in_quote = None if in_quote and not in_regex_char_group else char if in_quote or after_op or char != '/':
elif in_quote == '/' and char in '[]': in_quote = None if in_quote and not in_regex_char_group else char
in_regex_char_group = char == '[' elif in_quote == '/' and char in '[]':
in_regex_char_group = char == '['
escaping = not escaping and in_quote and char == '\\' escaping = not escaping and in_quote and char == '\\'
after_op = not in_quote and char in OP_CHARS or (char == ' ' and after_op) after_op = not in_quote and char in OP_CHARS or (char.isspace() and after_op)
if char != delim[pos] or any(counters.values()) or in_quote: if char != delim[pos] or any(counters.values()) or in_quote:
pos = 0 pos = 0
@@ -259,7 +270,9 @@ class JSInterpreter:
yield expr[start:] yield expr[start:]
@classmethod @classmethod
def _separate_at_paren(cls, expr, delim): def _separate_at_paren(cls, expr, delim=None):
if delim is None:
delim = expr and _MATCHING_PARENS[expr[0]]
separated = list(cls._separate(expr, delim, 1)) separated = list(cls._separate(expr, delim, 1))
if len(separated) < 2: if len(separated) < 2:
raise cls.Exception(f'No terminating paren {delim}', expr) raise cls.Exception(f'No terminating paren {delim}', expr)
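
The new default for `delim` derives the closing bracket from the first character of the expression. The following sketch shows only that idea. It is much simpler than the real `_separate`: it ignores quotes, regex literals, and escapes, and it assumes a non-empty `expr` that starts with an opening bracket.

```python
MATCHING_PARENS = {'(': ')', '{': '}', '[': ']'}


def separate_at_paren(expr, delim=None):
    """Split expr into (inside of the leading bracket pair, remainder)."""
    if delim is None:
        # Infer the closer from the opener, as the changed signature does
        delim = MATCHING_PARENS[expr[0]]
    depth = 0
    for i, char in enumerate(expr):
        if char in MATCHING_PARENS:
            depth += 1
        elif char in MATCHING_PARENS.values():
            depth -= 1
            if depth == 0 and char == delim:
                return expr[1:i].strip(), expr[i + 1:].strip()
    raise ValueError(f'No terminating paren {delim}')
```

Callers that already know the expression starts with `(`, `{`, or `[` no longer need to repeat the matching closer, which is what most of the one-line changes in the following hunks amount to.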
@@ -338,7 +351,7 @@ class JSInterpreter:
if expr.startswith('new '): if expr.startswith('new '):
obj = expr[4:] obj = expr[4:]
if obj.startswith('Date('): if obj.startswith('Date('):
left, right = self._separate_at_paren(obj[4:], ')') left, right = self._separate_at_paren(obj[4:])
expr = unified_timestamp( expr = unified_timestamp(
self.interpret_expression(left, local_vars, allow_recursion), False) self.interpret_expression(left, local_vars, allow_recursion), False)
if not expr: if not expr:
@@ -352,8 +365,8 @@ class JSInterpreter:
return None, should_return return None, should_return
if expr.startswith('{'): if expr.startswith('{'):
inner, outer = self._separate_at_paren(expr, '}') inner, outer = self._separate_at_paren(expr)
# Look for Map first # try for object expression (Map)
sub_expressions = [list(self._separate(sub_expr.strip(), ':', 1)) for sub_expr in self._separate(inner)] sub_expressions = [list(self._separate(sub_expr.strip(), ':', 1)) for sub_expr in self._separate(inner)]
if all(len(sub_expr) == 2 for sub_expr in sub_expressions): if all(len(sub_expr) == 2 for sub_expr in sub_expressions):
def dict_item(key, val): def dict_item(key, val):
@@ -371,7 +384,7 @@ class JSInterpreter:
expr = self._dump(inner, local_vars) + outer expr = self._dump(inner, local_vars) + outer
if expr.startswith('('): if expr.startswith('('):
inner, outer = self._separate_at_paren(expr, ')') inner, outer = self._separate_at_paren(expr)
inner, should_abort = self.interpret_statement(inner, local_vars, allow_recursion) inner, should_abort = self.interpret_statement(inner, local_vars, allow_recursion)
if not outer or should_abort: if not outer or should_abort:
return inner, should_abort or should_return return inner, should_abort or should_return
@@ -379,53 +392,62 @@ class JSInterpreter:
expr = self._dump(inner, local_vars) + outer expr = self._dump(inner, local_vars) + outer
if expr.startswith('['): if expr.startswith('['):
inner, outer = self._separate_at_paren(expr, ']') inner, outer = self._separate_at_paren(expr)
name = self._named_object(local_vars, [ name = self._named_object(local_vars, [
self.interpret_expression(item, local_vars, allow_recursion) self.interpret_expression(item, local_vars, allow_recursion)
for item in self._separate(inner)]) for item in self._separate(inner)])
expr = name + outer expr = name + outer
m = re.match(rf'''(?x) m = re.match(r'''(?x)
(?P<try>try|finally)\s*| (?P<try>try)\s*\{|
(?P<catch>catch\s*(?P<err>\(\s*{_NAME_RE}\s*\)))| (?P<switch>switch)\s*\(|
(?P<switch>switch)\s*\(| (?P<for>for)\s*\(
(?P<for>for)\s*\(|''', expr) ''', expr)
if m and m.group('try'): md = m.groupdict() if m else {}
if expr[m.end()] == '{': if md.get('try'):
try_expr, expr = self._separate_at_paren(expr[m.end():], '}') try_expr, expr = self._separate_at_paren(expr[m.end() - 1:])
else: err = None
try_expr, expr = expr[m.end() - 1:], ''
try: try:
ret, should_abort = self.interpret_statement(try_expr, local_vars, allow_recursion) ret, should_abort = self.interpret_statement(try_expr, local_vars, allow_recursion)
if should_abort: if should_abort:
return ret, True return ret, True
except JS_Throw as e:
local_vars[self._EXC_NAME] = e.error
except Exception as e: except Exception as e:
# XXX: This works for now, but makes debugging future issues very hard # XXX: This works for now, but makes debugging future issues very hard
local_vars[self._EXC_NAME] = e err = e
ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion)
return ret, should_abort or should_return
elif m and m.group('catch'): pending = (None, False)
catch_expr, expr = self._separate_at_paren(expr[m.end():], '}') m = re.match(r'catch\s*(?P<err>\(\s*{_NAME_RE}\s*\))?\{{'.format(**globals()), expr)
if self._EXC_NAME in local_vars: if m:
catch_vars = local_vars.new_child({m.group('err'): local_vars.pop(self._EXC_NAME)}) sub_expr, expr = self._separate_at_paren(expr[m.end() - 1:])
ret, should_abort = self.interpret_statement(catch_expr, catch_vars, allow_recursion) if err:
catch_vars = {}
if m.group('err'):
catch_vars[m.group('err')] = err.error if isinstance(err, JS_Throw) else err
catch_vars = local_vars.new_child(catch_vars)
err, pending = None, self.interpret_statement(sub_expr, catch_vars, allow_recursion)
m = re.match(r'finally\s*\{', expr)
if m:
sub_expr, expr = self._separate_at_paren(expr[m.end() - 1:])
ret, should_abort = self.interpret_statement(sub_expr, local_vars, allow_recursion)
if should_abort: if should_abort:
return ret, True return ret, True
ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion) ret, should_abort = pending
return ret, should_abort or should_return if should_abort:
return ret, True
elif m and m.group('for'): if err:
constructor, remaining = self._separate_at_paren(expr[m.end() - 1:], ')') raise err
elif md.get('for'):
constructor, remaining = self._separate_at_paren(expr[m.end() - 1:])
if remaining.startswith('{'): if remaining.startswith('{'):
body, expr = self._separate_at_paren(remaining, '}') body, expr = self._separate_at_paren(remaining)
else: else:
switch_m = re.match(r'switch\s*\(', remaining) # FIXME switch_m = re.match(r'switch\s*\(', remaining) # FIXME
if switch_m: if switch_m:
switch_val, remaining = self._separate_at_paren(remaining[switch_m.end() - 1:], ')') switch_val, remaining = self._separate_at_paren(remaining[switch_m.end() - 1:])
body, expr = self._separate_at_paren(remaining, '}') body, expr = self._separate_at_paren(remaining, '}')
body = 'switch(%s){%s}' % (switch_val, body) body = 'switch(%s){%s}' % (switch_val, body)
else: else:
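
The reworked `try` handling in this hunk follows a fixed order: run the `try` body, remember any error, run `catch` only if there was an error, then run `finally`, and re-raise if the error was never handled. The sketch below models just that ordering with plain callables. `run_try_statement` and its parameters are hypothetical names, and early returns (`should_abort`) are left out.

```python
def run_try_statement(try_body, catch_body=None, finally_body=None):
    err = None
    pending = None
    try:
        try_body()
    except Exception as e:
        # Remember the error instead of handling it immediately
        err = e
    if catch_body is not None and err is not None:
        # The catch block consumes the error; its result is kept as pending
        caught, err = err, None
        pending = catch_body(caught)
    if finally_body is not None:
        finally_body()
    if err is not None:
        # No catch block handled the error: propagate it after finally
        raise err
    return pending


log = []


def failing():
    raise ValueError('boom')


handled = run_try_statement(
    failing,
    catch_body=lambda e: log.append(f'caught {e}') or 'handled',
    finally_body=lambda: log.append('finally'))
```

With a `catch` body the error is swallowed and `finally` still runs; without one, `finally` runs first and the original error is raised afterwards.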
@@ -444,11 +466,9 @@ class JSInterpreter:
except JS_Continue: except JS_Continue:
pass pass
self.interpret_expression(increment, local_vars, allow_recursion) self.interpret_expression(increment, local_vars, allow_recursion)
ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion)
return ret, should_abort or should_return
elif m and m.group('switch'): elif md.get('switch'):
switch_val, remaining = self._separate_at_paren(expr[m.end() - 1:], ')') switch_val, remaining = self._separate_at_paren(expr[m.end() - 1:])
switch_val = self.interpret_expression(switch_val, local_vars, allow_recursion) switch_val = self.interpret_expression(switch_val, local_vars, allow_recursion)
body, expr = self._separate_at_paren(remaining, '}') body, expr = self._separate_at_paren(remaining, '}')
items = body.replace('default:', 'case default:').split('case ')[1:] items = body.replace('default:', 'case default:').split('case ')[1:]
@@ -471,6 +491,8 @@ class JSInterpreter:
break break
if matched: if matched:
break break
if md:
ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion) ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion)
return ret, should_abort or should_return return ret, should_abort or should_return
@@ -504,7 +526,7 @@ class JSInterpreter:
(?P<op>{"|".join(map(re.escape, set(_OPERATORS) - _COMP_OPERATORS))})? (?P<op>{"|".join(map(re.escape, set(_OPERATORS) - _COMP_OPERATORS))})?
=(?!=)(?P<expr>.*)$ =(?!=)(?P<expr>.*)$
)|(?P<return> )|(?P<return>
(?!if|return|true|false|null|undefined)(?P<name>{_NAME_RE})$ (?!if|return|true|false|null|undefined|NaN)(?P<name>{_NAME_RE})$
)|(?P<indexing> )|(?P<indexing>
(?P<in>{_NAME_RE})\[(?P<idx>.+)\]$ (?P<in>{_NAME_RE})\[(?P<idx>.+)\]$
)|(?P<attribute> )|(?P<attribute>
@@ -539,6 +561,8 @@ class JSInterpreter:
raise JS_Continue() raise JS_Continue()
elif expr == 'undefined': elif expr == 'undefined':
return JS_Undefined, should_return return JS_Undefined, should_return
elif expr == 'NaN':
return float('NaN'), should_return
elif m and m.group('return'): elif m and m.group('return'):
return local_vars.get(m.group('name'), JS_Undefined), should_return return local_vars.get(m.group('name'), JS_Undefined), should_return
@@ -573,7 +597,7 @@ class JSInterpreter:
member = self.interpret_expression(m.group('member2'), local_vars, allow_recursion) member = self.interpret_expression(m.group('member2'), local_vars, allow_recursion)
arg_str = expr[m.end():] arg_str = expr[m.end():]
if arg_str.startswith('('): if arg_str.startswith('('):
arg_str, remaining = self._separate_at_paren(arg_str, ')') arg_str, remaining = self._separate_at_paren(arg_str)
else: else:
arg_str, remaining = None, arg_str arg_str, remaining = None, arg_str
@@ -683,6 +707,13 @@ class JSInterpreter:
return obj.index(idx, start) return obj.index(idx, start)
except ValueError: except ValueError:
return -1 return -1
elif member == 'charCodeAt':
assertion(isinstance(obj, str), 'must be applied on a string')
assertion(len(argvals) == 1, 'takes exactly one argument')
idx = argvals[0] if isinstance(argvals[0], int) else 0
if idx >= len(obj):
return None
return ord(obj[idx])
idx = int(member) if isinstance(obj, list) else member idx = int(member) if isinstance(obj, list) else member
return obj[idx](argvals, allow_recursion=allow_recursion) return obj[idx](argvals, allow_recursion=allow_recursion)
@@ -751,7 +782,7 @@ class JSInterpreter:
\((?P<args>[^)]*)\)\s* \((?P<args>[^)]*)\)\s*
(?P<code>{.+})''' % {'name': re.escape(funcname)}, (?P<code>{.+})''' % {'name': re.escape(funcname)},
self.code) self.code)
code, _ = self._separate_at_paren(func_m.group('code'), '}') code, _ = self._separate_at_paren(func_m.group('code'))
if func_m is None: if func_m is None:
raise self.Exception(f'Could not find JS function "{funcname}"') raise self.Exception(f'Could not find JS function "{funcname}"')
return [x.strip() for x in func_m.group('args').split(',')], code return [x.strip() for x in func_m.group('args').split(',')], code
@@ -766,7 +797,7 @@ class JSInterpreter:
if mobj is None: if mobj is None:
break break
start, body_start = mobj.span() start, body_start = mobj.span()
body, remaining = self._separate_at_paren(code[body_start - 1:], '}') body, remaining = self._separate_at_paren(code[body_start - 1:])
name = self._named_object(local_vars, self.extract_function_from_code( name = self._named_object(local_vars, self.extract_function_from_code(
[x.strip() for x in mobj.group('args').split(',')], [x.strip() for x in mobj.group('args').split(',')],
body, local_vars, *global_stack)) body, local_vars, *global_stack))
@@ -784,7 +815,7 @@ class JSInterpreter:
global_stack[0].update(itertools.zip_longest(argnames, args, fillvalue=None)) global_stack[0].update(itertools.zip_longest(argnames, args, fillvalue=None))
global_stack[0].update(kwargs) global_stack[0].update(kwargs)
var_stack = LocalNameSpace(*global_stack) var_stack = LocalNameSpace(*global_stack)
ret, should_abort = self.interpret_statement(code.replace('\n', ''), var_stack, allow_recursion - 1) ret, should_abort = self.interpret_statement(code.replace('\n', ' '), var_stack, allow_recursion - 1)
if should_abort: if should_abort:
return ret return ret
return resf return resf
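
The `charCodeAt` branch added in this file can be tried in isolation. What follows is a minimal sketch, assuming a standalone helper named `js_char_code_at` (a hypothetical name, not part of yt-dlp) that mirrors the logic in the diff: a non-integer index falls back to 0, and an index past the end of the string yields `None`, where JavaScript itself would give `NaN`.

```python
def js_char_code_at(obj, *argvals):
    """Hypothetical standalone mirror of the charCodeAt branch in the diff."""
    if not isinstance(obj, str):
        raise TypeError('must be applied on a string')
    if len(argvals) != 1:
        raise TypeError('takes exactly one argument')
    # Non-integer indices are treated as 0, as in the diff
    idx = argvals[0] if isinstance(argvals[0], int) else 0
    if idx >= len(obj):
        # The diff returns None here; JavaScript would return NaN
        return None
    return ord(obj[idx])


print(js_char_code_at('abc', 1))
print(js_char_code_at('abc', 5))
```

The sketch raises `TypeError` where the interpreter uses its own `assertion` helper; that substitution is only there to keep the example self-contained.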


@@ -25,10 +25,12 @@ from .utils import (
OUTTMPL_TYPES, OUTTMPL_TYPES,
POSTPROCESS_WHEN, POSTPROCESS_WHEN,
Config, Config,
deprecation_warning,
expand_path, expand_path,
format_field, format_field,
get_executable_path, get_executable_path,
join_nonempty, join_nonempty,
orderedSet_from_options,
remove_end, remove_end,
write_string, write_string,
) )
@@ -163,6 +165,7 @@ class _YoutubeDLHelpFormatter(optparse.IndentedHelpFormatter):
class _YoutubeDLOptionParser(optparse.OptionParser): class _YoutubeDLOptionParser(optparse.OptionParser):
# optparse is deprecated since python 3.2. So assume a stable interface even for private methods # optparse is deprecated since python 3.2. So assume a stable interface even for private methods
ALIAS_DEST = '_triggered_aliases'
ALIAS_TRIGGER_LIMIT = 100 ALIAS_TRIGGER_LIMIT = 100
def __init__(self): def __init__(self):
@@ -174,6 +177,7 @@ class _YoutubeDLOptionParser(optparse.OptionParser):
formatter=_YoutubeDLHelpFormatter(), formatter=_YoutubeDLHelpFormatter(),
conflict_handler='resolve', conflict_handler='resolve',
) )
self.set_default(self.ALIAS_DEST, collections.defaultdict(int))
_UNKNOWN_OPTION = (optparse.BadOptionError, optparse.AmbiguousOptionError) _UNKNOWN_OPTION = (optparse.BadOptionError, optparse.AmbiguousOptionError)
_BAD_OPTION = optparse.OptionValueError _BAD_OPTION = optparse.OptionValueError
@@ -232,30 +236,16 @@ def create_parser():
current + value if append is True else value + current) current + value if append is True else value + current)
def _set_from_options_callback( def _set_from_options_callback(
option, opt_str, value, parser, delim=',', allowed_values=None, aliases={}, option, opt_str, value, parser, allowed_values, delim=',', aliases={},
process=lambda x: x.lower().strip()): process=lambda x: x.lower().strip()):
current = set(getattr(parser.values, option.dest)) values = [process(value)] if delim is None else map(process, value.split(delim))
values = [process(value)] if delim is None else list(map(process, value.split(delim)[::-1])) try:
while values: requested = orderedSet_from_options(values, collections.ChainMap(aliases, {'all': allowed_values}),
actual_val = val = values.pop() start=getattr(parser.values, option.dest))
if not val: except ValueError as e:
raise optparse.OptionValueError(f'Invalid {option.metavar} for {opt_str}: {value}') raise optparse.OptionValueError(f'wrong {option.metavar} for {opt_str}: {e.args[0]}')
if val == 'all':
current.update(allowed_values)
elif val == '-all':
current = set()
elif val in aliases:
values.extend(aliases[val])
else:
if val[0] == '-':
val = val[1:]
current.discard(val)
else:
current.update([val])
if allowed_values is not None and val not in allowed_values:
raise optparse.OptionValueError(f'wrong {option.metavar} for {opt_str}: {actual_val}')
setattr(parser.values, option.dest, current) setattr(parser.values, option.dest, set(requested))
def _dict_from_options_callback( def _dict_from_options_callback(
option, opt_str, value, parser, option, opt_str, value, parser,
@@ -305,8 +295,7 @@ def create_parser():
aliases = (x if x.startswith('-') else f'--{x}' for x in map(str.strip, aliases.split(','))) aliases = (x if x.startswith('-') else f'--{x}' for x in map(str.strip, aliases.split(',')))
try: try:
alias_group.add_option( alias_group.add_option(
*aliases, help=opts, nargs=nargs, type='str' if nargs else None, *aliases, help=opts, nargs=nargs, dest=parser.ALIAS_DEST, type='str' if nargs else None,
dest='_triggered_aliases', default=collections.defaultdict(int),
metavar=' '.join(f'ARG{i}' for i in range(nargs)), action='callback', metavar=' '.join(f'ARG{i}' for i in range(nargs)), action='callback',
callback=_alias_callback, callback_kwargs={'opts': opts, 'nargs': nargs}) callback=_alias_callback, callback_kwargs={'opts': opts, 'nargs': nargs})
except Exception as err: except Exception as err:
@@ -365,10 +354,20 @@ def create_parser():
'--extractor-descriptions', '--extractor-descriptions',
action='store_true', dest='list_extractor_descriptions', default=False, action='store_true', dest='list_extractor_descriptions', default=False,
help='Output descriptions of all supported extractors and exit') help='Output descriptions of all supported extractors and exit')
general.add_option(
'--use-extractors', '--ies',
action='callback', dest='allowed_extractors', metavar='NAMES', type='str',
default=[], callback=_list_from_options_callback,
help=(
'Extractor names to use separated by commas. '
'You can also use regexes, "all", "default" and "end" (end URL matching); '
'e.g. --ies "holodex.*,end,youtube". '
'Prefix the name with a "-" to exclude it, e.g. --ies default,-generic. '
'Use --list-extractors for a list of extractor names. (Alias: --ies)'))
general.add_option( general.add_option(
'--force-generic-extractor', '--force-generic-extractor',
action='store_true', dest='force_generic_extractor', default=False, action='store_true', dest='force_generic_extractor', default=False,
help='Force extraction to use the generic extractor') help=optparse.SUPPRESS_HELP)
general.add_option( general.add_option(
'--default-search', '--default-search',
dest='default_search', metavar='PREFIX', dest='default_search', metavar='PREFIX',
@@ -443,11 +442,12 @@ def create_parser():
'allowed_values': { 'allowed_values': {
'filename', 'filename-sanitization', 'format-sort', 'abort-on-error', 'format-spec', 'no-playlist-metafiles', 'filename', 'filename-sanitization', 'format-sort', 'abort-on-error', 'format-spec', 'no-playlist-metafiles',
'multistreams', 'no-live-chat', 'playlist-index', 'list-formats', 'no-direct-merge', 'multistreams', 'no-live-chat', 'playlist-index', 'list-formats', 'no-direct-merge',
'no-youtube-channel-redirect', 'no-youtube-unavailable-videos', 'no-attach-info-json', 'embed-metadata', 'no-attach-info-json', 'embed-metadata', 'embed-thumbnail-atomicparsley',
'embed-thumbnail-atomicparsley', 'seperate-video-versions', 'no-clean-infojson', 'no-keep-subs', 'no-certifi', 'seperate-video-versions', 'no-clean-infojson', 'no-keep-subs', 'no-certifi',
'no-youtube-channel-redirect', 'no-youtube-unavailable-videos', 'no-youtube-prefer-utc-upload-date',
}, 'aliases': { }, 'aliases': {
'youtube-dl': ['-multistreams', 'all'], 'youtube-dl': ['all', '-multistreams'],
'youtube-dlc': ['-no-youtube-channel-redirect', '-no-live-chat', 'all'], 'youtube-dlc': ['all', '-no-youtube-channel-redirect', '-no-live-chat'],
} }
}, help=( }, help=(
'Options that can help keep compatibility with youtube-dl or youtube-dlc ' 'Options that can help keep compatibility with youtube-dl or youtube-dlc '
@@ -634,7 +634,7 @@ def create_parser():
selection.add_option( selection.add_option(
'--break-per-input', '--break-per-input',
action='store_true', dest='break_per_url', default=False, action='store_true', dest='break_per_url', default=False,
help='Make --break-on-existing, --break-on-reject and --max-downloads act only on the current input URL') help='--break-on-existing, --break-on-reject, --max-downloads, and autonumber resets per input URL')
selection.add_option( selection.add_option(
'--no-break-per-input', '--no-break-per-input',
action='store_false', dest='break_per_url', action='store_false', dest='break_per_url',
@@ -1401,14 +1401,15 @@ def create_parser():
help='Do not read/dump cookies from/to file (default)') help='Do not read/dump cookies from/to file (default)')
filesystem.add_option( filesystem.add_option(
'--cookies-from-browser', '--cookies-from-browser',
dest='cookiesfrombrowser', metavar='BROWSER[+KEYRING][:PROFILE]', dest='cookiesfrombrowser', metavar='BROWSER[+KEYRING][:PROFILE][::CONTAINER]',
help=( help=(
'The name of the browser and (optionally) the name/path of ' 'The name of the browser to load cookies from. '
'the profile to load cookies from, separated by a ":". '
f'Currently supported browsers are: {", ".join(sorted(SUPPORTED_BROWSERS))}. ' f'Currently supported browsers are: {", ".join(sorted(SUPPORTED_BROWSERS))}. '
'By default, the most recently accessed profile is used. ' 'Optionally, the KEYRING used for decrypting Chromium cookies on Linux, '
'The keyring used for decrypting Chromium cookies on Linux can be ' 'the name/path of the PROFILE to load cookies from, '
'(optionally) specified after the browser name separated by a "+". ' 'and the CONTAINER name (if Firefox) ("none" for no container) '
'can be given with their respective separators. '
'By default, all containers of the most recently accessed profile are used. '
f'Currently supported keyrings are: {", ".join(map(str.lower, sorted(SUPPORTED_KEYRINGS)))}')) f'Currently supported keyrings are: {", ".join(map(str.lower, sorted(SUPPORTED_KEYRINGS)))}'))
filesystem.add_option( filesystem.add_option(
'--no-cookies-from-browser', '--no-cookies-from-browser',
@@ -1866,7 +1867,6 @@ def create_parser():
def _hide_login_info(opts): def _hide_login_info(opts):
write_string( deprecation_warning(f'"{__name__}._hide_login_info" is deprecated and may be removed '
'DeprecationWarning: "yt_dlp.options._hide_login_info" is deprecated and may be removed in a future version. ' 'in a future version. Use "yt_dlp.utils.Config.hide_login_info" instead')
'Use "yt_dlp.utils.Config.hide_login_info" instead\n')
return Config.hide_login_info(opts) return Config.hide_login_info(opts)
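
The new `--cookies-from-browser` metavar, `BROWSER[+KEYRING][:PROFILE][::CONTAINER]`, describes a small grammar. The sketch below shows one way such a value could be split into its parts. The regex and the name `parse_browser_spec` are assumptions made for illustration; this diff does not show how yt-dlp actually parses the option.

```python
import re

# Assumed grammar: BROWSER, then optional "+KEYRING", ":PROFILE", "::CONTAINER"
_SPEC_RE = re.compile(r'''(?x)
    (?P<browser>[^+:]+)
    (?:\+(?P<keyring>[^:]+))?
    (?::(?!:)(?P<profile>.+?))?
    (?:::(?P<container>.+))?
''')


def parse_browser_spec(spec):
    """Hypothetical parser for BROWSER[+KEYRING][:PROFILE][::CONTAINER]."""
    m = _SPEC_RE.fullmatch(spec)
    if m is None:
        raise ValueError(f'invalid browser specification: {spec!r}')
    # Parts that were not given come back as None
    return m.groupdict()


print(parse_browser_spec('firefox:default::none'))
```

The `(?!:)` lookahead keeps a leading `::` from being read as an empty profile, so `firefox::work` sets only the container.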


@@ -7,10 +7,10 @@ from ..utils import (
PostProcessingError, PostProcessingError,
RetryManager, RetryManager,
_configuration_args, _configuration_args,
deprecation_warning,
encodeFilename, encodeFilename,
network_exceptions, network_exceptions,
sanitized_Request, sanitized_Request,
write_string,
) )
@@ -73,10 +73,14 @@ class PostProcessor(metaclass=PostProcessorMetaClass):
if self._downloader: if self._downloader:
return self._downloader.report_warning(text, *args, **kwargs) return self._downloader.report_warning(text, *args, **kwargs)
def deprecation_warning(self, text): def deprecation_warning(self, msg):
warn = getattr(self._downloader, 'deprecation_warning', deprecation_warning)
return warn(msg, stacklevel=1)
def deprecated_feature(self, msg):
if self._downloader: if self._downloader:
return self._downloader.deprecation_warning(text) return self._downloader.deprecated_feature(msg)
write_string(f'DeprecationWarning: {text}') return deprecation_warning(msg, stacklevel=1)
def report_error(self, text, *args, **kwargs): def report_error(self, text, *args, **kwargs):
self.deprecation_warning('"yt_dlp.postprocessor.PostProcessor.report_error" is deprecated. ' self.deprecation_warning('"yt_dlp.postprocessor.PostProcessor.report_error" is deprecated. '


@@ -15,6 +15,7 @@ from ..utils import (
Popen, Popen,
PostProcessingError, PostProcessingError,
_get_exe_version_output, _get_exe_version_output,
deprecation_warning,
detect_exe_version, detect_exe_version,
determine_ext, determine_ext,
dfxp2srt, dfxp2srt,
@@ -30,7 +31,6 @@ from ..utils import (
traverse_obj, traverse_obj,
variadic, variadic,
write_json_file, write_json_file,
write_string,
) )
EXT_TO_OUT_FORMATS = { EXT_TO_OUT_FORMATS = {
@@ -187,8 +187,8 @@ class FFmpegPostProcessor(PostProcessor):
else: else:
self.probe_basename = basename self.probe_basename = basename
if basename == self._ffmpeg_to_avconv[kind]: if basename == self._ffmpeg_to_avconv[kind]:
self.deprecation_warning( self.deprecated_feature(f'Support for {self._ffmpeg_to_avconv[kind]} is deprecated and '
f'Support for {self._ffmpeg_to_avconv[kind]} is deprecated and may be removed in a future version. Use {kind} instead') f'may be removed in a future version. Use {kind} instead')
return version return version
@functools.cached_property @functools.cached_property
@@ -1064,7 +1064,7 @@ class FFmpegThumbnailsConvertorPP(FFmpegPostProcessor):
@classmethod @classmethod
def is_webp(cls, path): def is_webp(cls, path):
write_string(f'DeprecationWarning: {cls.__module__}.{cls.__name__}.is_webp is deprecated') deprecation_warning(f'{cls.__module__}.{cls.__name__}.is_webp is deprecated')
return imghdr.what(path) == 'webp' return imghdr.what(path) == 'webp'
def fixup_webp(self, info, idx=-1): def fixup_webp(self, info, idx=-1):


@@ -1,4 +1,5 @@
import atexit import atexit
import contextlib
import hashlib import hashlib
import json import json
import os import os
@@ -13,6 +14,7 @@ from .compat import compat_realpath, compat_shlex_quote
from .utils import ( from .utils import (
Popen, Popen,
cached_method, cached_method,
deprecation_warning,
shell_quote, shell_quote,
system_identifier, system_identifier,
traverse_obj, traverse_obj,
@@ -50,6 +52,19 @@ def detect_variant():
return VARIANT or _get_variant_and_executable_path()[0] return VARIANT or _get_variant_and_executable_path()[0]
@functools.cache
def current_git_head():
if detect_variant() != 'source':
return
with contextlib.suppress(Exception):
stdout, _, _ = Popen.run(
['git', 'rev-parse', '--short', 'HEAD'],
text=True, cwd=os.path.dirname(os.path.abspath(__file__)),
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if re.fullmatch('[0-9a-f]+', stdout.strip()):
return stdout.strip()
_FILE_SUFFIXES = { _FILE_SUFFIXES = {
'zip': '', 'zip': '',
'py2exe': '_min.exe', 'py2exe': '_min.exe',
@@ -288,11 +303,8 @@ def run_update(ydl):
def update_self(to_screen, verbose, opener): def update_self(to_screen, verbose, opener):
import traceback import traceback
from .utils import write_string deprecation_warning(f'"{__name__}.update_self" is deprecated and may be removed '
f'in a future version. Use "{__name__}.run_update(ydl)" instead')
write_string(
'DeprecationWarning: "yt_dlp.update.update_self" is deprecated and may be removed in a future version. '
'Use "yt_dlp.update.run_update(ydl)" instead\n')
printfn = to_screen printfn = to_screen


@@ -828,8 +828,8 @@ def escapeHTML(text):
def process_communicate_or_kill(p, *args, **kwargs): def process_communicate_or_kill(p, *args, **kwargs):
write_string('DeprecationWarning: yt_dlp.utils.process_communicate_or_kill is deprecated ' deprecation_warning(f'"{__name__}.process_communicate_or_kill" is deprecated and may be removed '
'and may be removed in a future version. Use yt_dlp.utils.Popen.communicate_or_kill instead') f'in a future version. Use "{__name__}.Popen.communicate_or_kill" instead')
return Popen.communicate_or_kill(p, *args, **kwargs) return Popen.communicate_or_kill(p, *args, **kwargs)
@@ -840,12 +840,35 @@ class Popen(subprocess.Popen):
else: else:
_startupinfo = None _startupinfo = None
def __init__(self, *args, text=False, **kwargs): @staticmethod
def _fix_pyinstaller_ld_path(env):
"""Restore LD_LIBRARY_PATH when using PyInstaller
Ref: https://github.com/pyinstaller/pyinstaller/blob/develop/doc/runtime-information.rst#ld_library_path--libpath-considerations
https://github.com/yt-dlp/yt-dlp/issues/4573
"""
if not hasattr(sys, '_MEIPASS'):
return
def _fix(key):
orig = env.get(f'{key}_ORIG')
if orig is None:
env.pop(key, None)
else:
env[key] = orig
_fix('LD_LIBRARY_PATH') # Linux
_fix('DYLD_LIBRARY_PATH') # macOS
def __init__(self, *args, env=None, text=False, **kwargs):
if env is None:
env = os.environ.copy()
self._fix_pyinstaller_ld_path(env)
if text is True: if text is True:
kwargs['universal_newlines'] = True # For 3.6 compatibility kwargs['universal_newlines'] = True # For 3.6 compatibility
kwargs.setdefault('encoding', 'utf-8') kwargs.setdefault('encoding', 'utf-8')
kwargs.setdefault('errors', 'replace') kwargs.setdefault('errors', 'replace')
super().__init__(*args, **kwargs, startupinfo=self._startupinfo) super().__init__(*args, env=env, **kwargs, startupinfo=self._startupinfo)
def communicate_or_kill(self, *args, **kwargs): def communicate_or_kill(self, *args, **kwargs):
try: try:
@@ -860,9 +883,9 @@ class Popen(subprocess.Popen):
self.wait(timeout=timeout) self.wait(timeout=timeout)
@classmethod @classmethod
def run(cls, *args, **kwargs): def run(cls, *args, timeout=None, **kwargs):
with cls(*args, **kwargs) as proc: with cls(*args, **kwargs) as proc:
stdout, stderr = proc.communicate_or_kill() stdout, stderr = proc.communicate_or_kill(timeout=timeout)
return stdout or '', stderr or '', proc.returncode return stdout or '', stderr or '', proc.returncode
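
The environment handling added to `Popen.__init__` can be illustrated without spawning a process. This is a rough sketch, assuming a free function named `restore_library_paths` (hypothetical) whose `bundled` flag stands in for the `hasattr(sys, '_MEIPASS')` check in the diff. The restore logic itself follows `_fix_pyinstaller_ld_path`.

```python
def restore_library_paths(env, bundled=True):
    """Hypothetical standalone version of Popen._fix_pyinstaller_ld_path.

    `bundled` replaces the hasattr(sys, '_MEIPASS') check so that the
    behaviour can be shown outside a PyInstaller bundle.
    """
    if not bundled:
        return env
    for key in ('LD_LIBRARY_PATH', 'DYLD_LIBRARY_PATH'):  # Linux, macOS
        orig = env.get(f'{key}_ORIG')
        if orig is None:
            # No saved value: the variable was not set before bundling
            env.pop(key, None)
        else:
            # Put back the value saved under <key>_ORIG
            env[key] = orig
    return env


env = {
    'LD_LIBRARY_PATH': '/tmp/_MEIabc',
    'LD_LIBRARY_PATH_ORIG': '/usr/local/lib',
    'DYLD_LIBRARY_PATH': '/tmp/_MEIabc',
}
print(restore_library_paths(env))
```

As in the diff, the function works on the mapping it is given, which is why `Popen.__init__` copies `os.environ` first when no `env` is passed.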
@@ -1934,7 +1957,7 @@ class DateRange:
def platform_name(): def platform_name():
""" Returns the platform name as a str """ """ Returns the platform name as a str """
write_string('DeprecationWarning: yt_dlp.utils.platform_name is deprecated, use platform.platform instead') deprecation_warning(f'"{__name__}.platform_name" is deprecated, use "platform.platform" instead')
return platform.platform() return platform.platform()
@@ -1980,6 +2003,23 @@ def write_string(s, out=None, encoding=None):
out.flush() out.flush()
def deprecation_warning(msg, *, printer=None, stacklevel=0, **kwargs):
from . import _IN_CLI
if _IN_CLI:
if msg in deprecation_warning._cache:
return
deprecation_warning._cache.add(msg)
if printer:
return printer(f'{msg}{bug_reports_message()}', **kwargs)
return write_string(f'ERROR: {msg}{bug_reports_message()}\n', **kwargs)
else:
import warnings
warnings.warn(DeprecationWarning(msg), stacklevel=stacklevel + 3)
deprecation_warning._cache = set()
def bytes_to_intlist(bs): def bytes_to_intlist(bs):
if not bs: if not bs:
return [] return []
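
The `deprecation_warning` helper above prints each message at most once when running as a CLI and otherwise defers to the `warnings` module. Here is a simplified sketch of that pattern, assuming a hypothetical function `warn_deprecated` with an explicit `in_cli` flag in place of the `_IN_CLI` import; the output prefix and the return values are illustrative only.

```python
import warnings


def warn_deprecated(msg, *, in_cli, printer=print):
    """Hypothetical simplification of deprecation_warning from the diff."""
    if in_cli:
        # CLI mode: show each distinct message only once
        if msg in warn_deprecated._cache:
            return False
        warn_deprecated._cache.add(msg)
        printer(f'Deprecated: {msg}')
        return True
    # Library mode: let the caller's warning filters decide what happens
    warnings.warn(DeprecationWarning(msg), stacklevel=2)
    return True


warn_deprecated._cache = set()

warn_deprecated('old_func is deprecated', in_cli=True)
warn_deprecated('old_func is deprecated', in_cli=True)  # suppressed as a repeat
```

Storing the cache as a function attribute follows the diff and keeps the state next to the function without a module-level global.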
@@ -4862,8 +4902,8 @@ def decode_base_n(string, n=None, table=None):
def decode_base(value, digits): def decode_base(value, digits):
write_string('DeprecationWarning: yt_dlp.utils.decode_base is deprecated ' deprecation_warning(f'{__name__}.decode_base is deprecated and may be removed '
'and may be removed in a future version. Use yt_dlp.decode_base_n instead') f'in a future version. Use {__name__}.decode_base_n instead')
return decode_base_n(value, table=digits) return decode_base_n(value, table=digits)
@@ -5332,8 +5372,8 @@ def traverse_obj(
def traverse_dict(dictn, keys, casesense=True): def traverse_dict(dictn, keys, casesense=True):
write_string('DeprecationWarning: yt_dlp.utils.traverse_dict is deprecated ' deprecation_warning(f'"{__name__}.traverse_dict" is deprecated and may be removed '
'and may be removed in a future version. Use yt_dlp.utils.traverse_obj instead') f'in a future version. Use "{__name__}.traverse_obj" instead')
return traverse_obj(dictn, keys, casesense=casesense, is_user_input=True, traverse_string=True) return traverse_obj(dictn, keys, casesense=casesense, is_user_input=True, traverse_string=True)
@@ -5785,6 +5825,36 @@ def truncate_string(s, left, right=0):
return f'{s[:left-3]}...{s[-right:]}' return f'{s[:left-3]}...{s[-right:]}'
def orderedSet_from_options(options, alias_dict, *, use_regex=False, start=None):
assert 'all' in alias_dict, '"all" alias is required'
requested = list(start or [])
for val in options:
discard = val.startswith('-')
if discard:
val = val[1:]
if val in alias_dict:
val = alias_dict[val] if not discard else [
i[1:] if i.startswith('-') else f'-{i}' for i in alias_dict[val]]
# NB: Do not allow regex in aliases for performance
requested = orderedSet_from_options(val, alias_dict, start=requested)
continue
current = (filter(re.compile(val, re.I).fullmatch, alias_dict['all']) if use_regex
else [val] if val in alias_dict['all'] else None)
if current is None:
raise ValueError(val)
if discard:
for item in current:
while item in requested:
requested.remove(item)
else:
requested.extend(current)
return orderedSet(requested)
# Deprecated # Deprecated
has_certifi = bool(certifi) has_certifi = bool(certifi)
has_websockets = bool(websockets) has_websockets = bool(websockets)
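
To see how `orderedSet_from_options` resolves aliases and `-` prefixes, the function can be run on a toy alias table. The body below follows the diff; `ordered_set` is an assumed stand-in for yt-dlp's `orderedSet` (order-preserving de-duplication), and the alias names are made up for the example.

```python
import re


def ordered_set(iterable):
    """Assumed stand-in for orderedSet: drop duplicates, keep first-seen order."""
    return list(dict.fromkeys(iterable))


def ordered_set_from_options(options, alias_dict, *, use_regex=False, start=None):
    assert 'all' in alias_dict, '"all" alias is required'
    requested = list(start or [])
    for val in options:
        discard = val.startswith('-')
        if discard:
            val = val[1:]
        if val in alias_dict:
            # Expand the alias; discarding an alias flips the sign of each member
            expanded = alias_dict[val] if not discard else [
                i[1:] if i.startswith('-') else f'-{i}' for i in alias_dict[val]]
            requested = ordered_set_from_options(expanded, alias_dict, start=requested)
            continue
        current = (filter(re.compile(val, re.I).fullmatch, alias_dict['all']) if use_regex
                   else [val] if val in alias_dict['all'] else None)
        if current is None:
            raise ValueError(val)
        if discard:
            for item in current:
                while item in requested:
                    requested.remove(item)
        else:
            requested.extend(current)
    return ordered_set(requested)


aliases = {'all': ['a', 'b', 'c'], 'ab': ['a', 'b']}
print(ordered_set_from_options(['all', '-b'], aliases))   # ['a', 'c']
print(ordered_set_from_options(['all', '-ab'], aliases))  # ['c']
```

Because options are applied left to right, `['all', '-multistreams']` and `['-multistreams', 'all']` give different results, which is presumably why the `youtube-dl` and `youtube-dlc` compat aliases in `options.py` were reordered in the same change.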


@@ -1,8 +1,8 @@
# Autogenerated by devscripts/update-version.py # Autogenerated by devscripts/update-version.py
__version__ = '2022.08.19' __version__ = '2022.09.01'
RELEASE_GIT_HEAD = '48c88e088' RELEASE_GIT_HEAD = '5d7c7d656'
VARIANT = None VARIANT = None