mirror of https://github.com/yt-dlp/yt-dlp.git synced 2026-01-17 12:21:52 +00:00

Compare commits

74 Commits

Author SHA1 Message Date
github-actions
adba24d207 [version] update
Created by: pukkandan

:ci skip all :ci run dl
2022-09-01 11:26:07 +00:00
pukkandan
5d7c7d6569 Release 2022.09.01 2022-09-01 16:49:04 +05:30
pukkandan
d2c8aadf79 [cleanup] Misc
Closes #4710, Closes #4754, Closes #4723
Authored by: pukkandan, MrRawes, DavidH-2022
2022-09-01 16:49:03 +05:30
pukkandan
1ac7f46184 Update to ytdl-commit-ed5c44e7
[compat] Replace deficient ChainMap class in Py3.3 and earlier
ed5c44e7b7
2022-09-01 16:46:32 +05:30
pukkandan
05deb747bb [jsinterp] Fix escape in regex 2022-09-01 16:46:32 +05:30
pukkandan
b505e8517a [extractor/youtube] Fallback regex for nsig code extraction 2022-09-01 16:46:32 +05:30
pukkandan
f2e9fa3ef7 [FormatSort] Fix aext for --prefer-free-formats
Closes #4735
2022-09-01 16:46:31 +05:30
satan1st
50a399326f [build] `make tar` should not follow `DESTDIR` (#4790)
Ref: https://www.gnu.org/prep/standards/html_node/DESTDIR.html
Authored by: satan1st
2022-09-01 16:46:17 +05:30
coletdjnz
1ff88b7aec [extractor/youtube] Add no-youtube-prefer-utc-upload-date compat option (#4771)
This option reverts 992f9a730b and 17322130a9 to prefer the non-UTC upload date in microformats.

Authored by: coletdjnz, pukkandan
2022-09-01 10:02:28 +00:00
bashonly
825d3ce386 [cookies] Improve container support (#4806)
Closes #4800
Authored by: bashonly, pukkandan, coletdjnz
2022-09-01 15:22:59 +05:30
bashonly
92aa6d6883 [extractor/triller] Add extractor (#4712)
Closes #4703
Authored by: bashonly
2022-09-01 15:20:54 +05:30
Elyse
b2a4db425b [VQQ] Add extractors (#4706)
Closes #1666
Authored by: elyse0
2022-09-01 12:42:34 +05:30
Yifu Yu
de49cdbe9d [extractor/bilibili] Extract flac with premium account (#4759)
Authored by: jackyyf
2022-08-31 23:22:16 +05:30
shirt
9f9c85dda4 [Build] Update pyinstaller 2022-08-31 13:12:26 -04:00
HobbyistDev
11734714c2 [extractor/eurosport] Add extractor (#4613)
Closes #2487
Authored by: HobbyistDev
2022-08-31 22:32:33 +05:30
pukkandan
b86ca447ce [extractor/mediaset] Fix embed extraction
Closes #4804
2022-08-31 22:24:41 +05:30
Tejas Arlimatti
f8c7ba9984 [extractor/epoch] Add extractor (#4772)
Closes #4714
Authored by: tejasa97
2022-08-31 22:16:26 +05:30
DepFA
76f2bb175d [extractor/stripchat] Don't modify input URL (#4781)
Authored by: dfaker
2022-08-31 21:10:59 +05:30
Elyse
f26af78a8a [jsinterp] Add charcodeAt and bitwise overflow (#4706)
Authored by: elyse0
2022-08-31 21:01:22 +05:30
Lesmiscore
bfbecd1174 [extractor/newspicks] Add extractor (#4725)
Authored by: Lesmiscore
2022-08-31 02:07:55 +09:00
bashonly
9bd13fe5bb [cookies] Support firefox container in --cookies-from-browser (#4753)
Authored by: bashonly
2022-08-30 22:24:46 +05:30
Jeff Huffman
459262ac97 [extractor/crunchyroll:beta] Use anonymous access (#4704)
Closes #4692
Authored by: tejing1
2022-08-30 22:04:13 +05:30
Lesmiscore
82ea226c61 Restore LD_LIBRARY_PATH when using PyInstaller (#4666)
Authored by: Lesmiscore
2022-08-31 01:24:14 +09:00
pukkandan
da4db748fa [utils] Add deprecation_warning
See https://github.com/yt-dlp/yt-dlp/pull/2173#issuecomment-1097021515
2022-08-30 21:03:07 +05:30
pukkandan
e1eabd7beb [downloader/external] Smarter detection of executable
Closes #4778
2022-08-30 18:13:38 +05:30
pukkandan
d81ba7d491 [jsinterp, extractor/youtube] Minor fixes 2022-08-30 18:13:37 +05:30
OHaiiBuzzle
5135ed3d4a [extractor/huya] Fix stream extraction (#4798)
Closes #4658
Authored by: ohaiibuzzle
2022-08-30 16:14:16 +05:30
pukkandan
c4b2df872d [jsinterp] Fix _separate
Ref: https://github.com/yt-dlp/yt-dlp/issues/4635#issuecomment-1231126941
2022-08-30 16:06:40 +05:30
Samantaz Fox
224b5a35f7 [extractor/youtube] Update iOS Innertube clients (#4792)
Authored by: SamantazFox
2022-08-29 03:36:55 +00:00
coletdjnz
50ac0e5416 [extractor/youtube] Use device-specific user agent (#4770)
Thwart latest fingerprinting attempt (see https://github.com/iv-org/invidious/issues/3230#issuecomment-1226887639)

Authored by: coletdjnz
2022-08-28 22:59:54 +00:00
Lesmiscore
e0992d5558 [extractor/IslamChannel] Add extractors (#4779)
Authored by: Lesmiscore
2022-08-28 01:37:25 +09:00
pukkandan
5e01315aa1 [cache, extractor/youtube] Invalidate old cache 2022-08-27 07:25:14 +05:30
pukkandan
4e4982ab5b [extractor/generic] Don't return JW player without formats
Closes #4765
2022-08-27 06:21:17 +05:30
cgrigis
89e4d86171 [extractor/arte] Bug fix (#4769)
Closes #4768
Authored by: cgrigis
2022-08-27 05:58:01 +05:30
Shreyas Minocha
a1af516259 [extractor/screencastomatic] Support --video-password (#4761)
Authored by: shreyasminocha
2022-08-26 08:59:45 +05:30
pukkandan
1d64a59547 [extractor/vimeo:user] Fix _VALID_URL
Closes #4758
2022-08-26 06:29:03 +05:30
pukkandan
ca7f8b8f31 Bugfix for 822d66e591
Closes #4760
2022-08-26 06:08:05 +05:30
pukkandan
164b03c486 [jsinterp] Fix bug in operator precedence
Fixes https://github.com/yt-dlp/yt-dlp/issues/4635#issuecomment-1226659543
2022-08-25 09:40:46 +05:30
pukkandan
e5458d1d88 Fix lazy extractor bug in fe7866d0ed
and add test

Fixes https://github.com/yt-dlp/yt-dlp/pull/3234#issuecomment-1225347071
2022-08-24 15:19:58 +05:30
pukkandan
b5e7a2e69d Add version to infojson 2022-08-24 13:03:45 +05:30
pukkandan
2516cafb28 Fix bug in fe7866d0ed 2022-08-24 08:21:39 +05:30
pukkandan
fd404bec7e Fix --break-per-url --max-downloads 2022-08-24 08:00:13 +05:30
pukkandan
fe7866d0ed Add option --use-extractors
Deprecates `--force-generic-extractor`

Closes #3234, Closes #2044

Related: #4307, #1791
2022-08-24 07:47:51 +05:30
pukkandan
5314b52192 [utils] Add orderedSet_from_options 2022-08-24 07:38:55 +05:30
pukkandan
13db4e7b9e [extractor/mixcloud] All formats are audio-only
Closes #4740
2022-08-23 04:11:27 +05:30
Joshua Lochner
07275b708b [extractor/medaltv] Fix extraction (#4739)
Authored by: xenova
2022-08-23 01:34:12 +05:30
Elyse
b85703d11a [extractor/rtbf] Fix jwt extraction (#4738)
Closes #4683
Authored by: elyse0
2022-08-23 00:15:46 +05:30
pukkandan
992dc6b486 [jsinterp] Implement timeout
Workaround for #4716
2022-08-22 06:19:06 +05:30
pukkandan
822d66e591 Fix bug in --alias 2022-08-22 04:37:23 +05:30
pukkandan
8d1ad6378f [extractor/BiliBiliSearch] Don't sort by date
Related #4682
2022-08-21 05:19:20 +05:30
pukkandan
2d1019542a [extractor/BiliBiliSearch] Fix infinite loop
Closes #4682
2022-08-21 05:19:20 +05:30
pukkandan
b25cac650f [extractor/youtube] Fix bug in format sorting 2022-08-21 00:56:27 +05:30
pukkandan
90a1df305b [test] Fix test_youtube_signature 2022-08-21 00:51:03 +05:30
pukkandan
0a6b4b82e9 [extractor/uktv] Improve _VALID_URL
Closes #4707
Authored by: dirkf
2022-08-20 05:00:45 +05:30
pukkandan
1704c47ba8 [extractor/bitchute] Mark errors as expected
Closes #4685
2022-08-20 04:53:05 +05:30
github-actions
b76e9cedb3 [version] update
Created by: pukkandan

:ci skip all :ci run dl
2022-08-19 00:11:11 +00:00
pukkandan
48c88e088c Release 2022.08.19 2022-08-19 05:08:22 +05:30
pukkandan
a831c2ea90 [cleanup] Misc 2022-08-19 05:08:21 +05:30
pukkandan
be13a6e525 [jsinterp] Bring on-par with youtube-dl
Code from: https://github.com/ytdl-org/youtube-dl/pull/31175, https://github.com/ytdl-org/youtube-dl/pull/31182

Authored by: pukkandan, dirkf
2022-08-19 05:08:21 +05:30
bashonly
8a3da4c68c [extractor/instagram] Fix bugs in 7d3b98be4c (#4701)
Authored by: bashonly
2022-08-19 03:45:49 +05:30
nixxo
4d37d4a77c [extractor/rai] Minor fix (#4700)
Closes #4691, #4690
2022-08-19 02:28:59 +05:30
bashonly
7d3b98be4c [extractor/instagram] Fix extraction (#4696)
Closes #4657, #4532, #4475
Authored by: bashonly, pritam20ps05
2022-08-19 02:27:46 +05:30
Elyse
2b3e43e247 [extractor/rtbf] Fix stream extractor (#4671)
Closes #4656
Authored by: elyse0
2022-08-19 01:42:04 +05:30
Alexander Seiler
f60ef66371 [extractor/zattoo] Fix Zattoo resellers (#4675)
Closes #4630
Authored by: goggle
2022-08-19 01:27:51 +05:30
pukkandan
25836db6be [extractor/youtube] Add fallback to phantomjs
Related #4635
2022-08-18 21:35:18 +05:30
pukkandan
587021cd9f [phantomjs] Add function to execute JS without a DOM
Authored by: MinePlayersPE, pukkandan
2022-08-18 21:34:47 +05:30
pukkandan
580ce00782 [youtube] Improve signature caching
and refactor related functions
2022-08-18 21:33:30 +05:30
ChillingPepper
2f1a299c50 [extractor/SovietsCloset] Fix extractor (#4688)
Closes #4200 
Authored by: ChillingPepper
2022-08-18 16:44:45 +05:30
pukkandan
f6ca640b12 [jsinterp] Fix for youtube player 1f7d5369
Closes #4635 again
2022-08-18 16:38:35 +05:30
pukkandan
3ce2933693 [youtube] Fix error reporting of "Incomplete data"
Related: #4669
2022-08-16 22:01:48 +05:30
pukkandan
c200096c03 Fix bug in --download-archive
Closes #4668
2022-08-16 22:00:51 +05:30
pukkandan
6d3e7424bf [jsinterp] Fix for youtube player c81bbb4a 2022-08-16 06:53:45 +05:30
pukkandan
5c6d2ef9d1 [youtube] Improve format sorting for IOS formats
When no itag/resolution is available for reference, use the closest resolution
2022-08-15 14:04:05 +05:30
Lesmiscore
460eb9c50e [build] Exclude devscripts from installs
Closes #4667
2022-08-15 13:51:35 +05:30
66 changed files with 3006 additions and 949 deletions

View File

@@ -18,7 +18,7 @@ body:
 options:
 - label: I'm reporting a broken site
 required: true
-- label: I've verified that I'm running yt-dlp version **2022.08.14** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
+- label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
 required: true
 - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
 required: true
@@ -62,7 +62,7 @@ body:
 [debug] Command-line config: ['-vU', 'test:youtube']
 [debug] Portable config "yt-dlp.conf": ['-i']
 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
-[debug] yt-dlp version 2022.08.14 [9d339c4] (win32_exe)
+[debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
 [debug] Checking exe version: ffmpeg -bsfs
 [debug] Checking exe version: ffprobe -bsfs
@@ -70,8 +70,8 @@ body:
 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
 [debug] Proxy map: {}
 [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
-Latest version: 2022.08.14, Current version: 2022.08.14
-yt-dlp is up to date (2022.08.14)
+Latest version: 2022.09.01, Current version: 2022.09.01
+yt-dlp is up to date (2022.09.01)
 <more lines>
 render: shell
 validations:
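
The checklist in this template asks reporters to confirm they are running version 2022.09.01 or later. As a minimal sketch of that check, assuming version strings are purely dot-separated integers like the ones shown in the sample log (the helpers `parse_version` and `is_at_least` are hypothetical names for illustration, not part of yt-dlp), the comparison could look like this:

```python
def parse_version(version: str) -> tuple:
    """Split a date-style version such as '2022.09.01' into integers.

    Assumption: every dot-separated component is a plain integer.
    """
    return tuple(int(part) for part in version.split("."))


def is_at_least(current: str, required: str) -> bool:
    """Return True when `current` is the same as or newer than `required`."""
    # Tuples compare element by element, so (2022, 9, 1) > (2022, 8, 14).
    return parse_version(current) >= parse_version(required)


print(is_at_least("2022.08.14", "2022.09.01"))  # False
print(is_at_least("2022.09.01", "2022.09.01"))  # True
```

Comparing integer tuples rather than raw strings avoids surprises if a component is ever written without zero padding.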

View File

@@ -18,7 +18,7 @@ body:
 options:
 - label: I'm reporting a new site support request
 required: true
-- label: I've verified that I'm running yt-dlp version **2022.08.14** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
+- label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
 required: true
 - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
 required: true
@@ -74,7 +74,7 @@ body:
 [debug] Command-line config: ['-vU', 'test:youtube']
 [debug] Portable config "yt-dlp.conf": ['-i']
 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
-[debug] yt-dlp version 2022.08.14 [9d339c4] (win32_exe)
+[debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
 [debug] Checking exe version: ffmpeg -bsfs
 [debug] Checking exe version: ffprobe -bsfs
@@ -82,8 +82,8 @@ body:
 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
 [debug] Proxy map: {}
 [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
-Latest version: 2022.08.14, Current version: 2022.08.14
-yt-dlp is up to date (2022.08.14)
+Latest version: 2022.09.01, Current version: 2022.09.01
+yt-dlp is up to date (2022.09.01)
 <more lines>
 render: shell
 validations:

View File

@@ -18,7 +18,7 @@ body:
 options:
 - label: I'm requesting a site-specific feature
 required: true
-- label: I've verified that I'm running yt-dlp version **2022.08.14** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
+- label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
 required: true
 - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
 required: true
@@ -70,7 +70,7 @@ body:
 [debug] Command-line config: ['-vU', 'test:youtube']
 [debug] Portable config "yt-dlp.conf": ['-i']
 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
-[debug] yt-dlp version 2022.08.14 [9d339c4] (win32_exe)
+[debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
 [debug] Checking exe version: ffmpeg -bsfs
 [debug] Checking exe version: ffprobe -bsfs
@@ -78,8 +78,8 @@ body:
 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
 [debug] Proxy map: {}
 [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
-Latest version: 2022.08.14, Current version: 2022.08.14
-yt-dlp is up to date (2022.08.14)
+Latest version: 2022.09.01, Current version: 2022.09.01
+yt-dlp is up to date (2022.09.01)
 <more lines>
 render: shell
 validations:

View File

@@ -18,7 +18,7 @@ body:
 options:
 - label: I'm reporting a bug unrelated to a specific site
 required: true
-- label: I've verified that I'm running yt-dlp version **2022.08.14** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
+- label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
 required: true
 - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
 required: true
@@ -55,7 +55,7 @@ body:
 [debug] Command-line config: ['-vU', 'test:youtube']
 [debug] Portable config "yt-dlp.conf": ['-i']
 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
-[debug] yt-dlp version 2022.08.14 [9d339c4] (win32_exe)
+[debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
 [debug] Checking exe version: ffmpeg -bsfs
 [debug] Checking exe version: ffprobe -bsfs
@@ -63,8 +63,8 @@ body:
 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
 [debug] Proxy map: {}
 [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
-Latest version: 2022.08.14, Current version: 2022.08.14
-yt-dlp is up to date (2022.08.14)
+Latest version: 2022.09.01, Current version: 2022.09.01
+yt-dlp is up to date (2022.09.01)
 <more lines>
 render: shell
 validations:

View File

@@ -20,7 +20,7 @@ body:
 required: true
 - label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
 required: true
-- label: I've verified that I'm running yt-dlp version **2022.08.14** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
+- label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
 required: true
 - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
 required: true
@@ -51,7 +51,7 @@ body:
 [debug] Command-line config: ['-vU', 'test:youtube']
 [debug] Portable config "yt-dlp.conf": ['-i']
 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
-[debug] yt-dlp version 2022.08.14 [9d339c4] (win32_exe)
+[debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
 [debug] Checking exe version: ffmpeg -bsfs
 [debug] Checking exe version: ffprobe -bsfs
@@ -59,7 +59,7 @@ body:
 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
 [debug] Proxy map: {}
 [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
-Latest version: 2022.08.14, Current version: 2022.08.14
-yt-dlp is up to date (2022.08.14)
+Latest version: 2022.09.01, Current version: 2022.09.01
+yt-dlp is up to date (2022.09.01)
 <more lines>
 render: shell

View File

@@ -26,7 +26,7 @@ body:
 required: true
 - label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
 required: true
-- label: I've verified that I'm running yt-dlp version **2022.08.14** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
+- label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
 required: true
 - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
 required: true
@@ -57,7 +57,7 @@ body:
 [debug] Command-line config: ['-vU', 'test:youtube']
 [debug] Portable config "yt-dlp.conf": ['-i']
 [debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
-[debug] yt-dlp version 2022.08.14 [9d339c4] (win32_exe)
+[debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
 [debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
 [debug] Checking exe version: ffmpeg -bsfs
 [debug] Checking exe version: ffprobe -bsfs
@@ -65,7 +65,7 @@ body:
 [debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
 [debug] Proxy map: {}
 [debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
-Latest version: 2022.08.14, Current version: 2022.08.14
-yt-dlp is up to date (2022.08.14)
+Latest version: 2022.09.01, Current version: 2022.09.01
+yt-dlp is up to date (2022.09.01)
 <more lines>
 render: shell

View File

@@ -194,7 +194,7 @@ jobs:
 - name: Install Requirements
 run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
 python -m pip install --upgrade pip setuptools wheel py2exe
-pip install "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-5.2-py3-none-any.whl" -r requirements.txt
+pip install "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-5.3-py3-none-any.whl" -r requirements.txt
 - name: Prepare
 run: |
@@ -230,7 +230,7 @@ jobs:
 - name: Install Requirements
 run: |
 python -m pip install --upgrade pip setuptools wheel
-pip install "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-5.2-py3-none-any.whl" -r requirements.txt
+pip install "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-5.3-py3-none-any.whl" -r requirements.txt
 - name: Prepare
 run: |

View File

@@ -299,3 +299,12 @@ bashonly
 jacobtruman
 masta79
 palewire
+cgrigis
+DavidH-2022
+dfaker
+jackyyf
+ohaiibuzzle
+SamantazFox
+shreyasminocha
+tejasa97
+xenov

View File

@@ -11,6 +11,71 @@
 -->
+### 2022.09.01
+* Add option `--use-extractors`
+* Merge youtube-dl: Upto [commit/ed5c44e](https://github.com/ytdl-org/youtube-dl/commit/ed5c44e7)
+* Add yt-dlp version to infojson
+* Fix `--break-per-url --max-downloads`
+* Fix bug in `--alias`
+* [cookies] Support firefox container in `--cookies-from-browser` by [bashonly](https://github.com/bashonly), [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
+* [downloader/external] Smarter detection of executable
+* [extractor/generic] Don't return JW player without formats
+* [FormatSort] Fix `aext` for `--prefer-free-formats`
+* [jsinterp] Various improvements by [pukkandan](https://github.com/pukkandan), [dirkf](https://github.com/dirkf), [elyse0](https://github.com/elyse0)
+* [cache] Mechanism to invalidate old cache
+* [utils] Add `deprecation_warning`
+* [utils] Add `orderedSet_from_options`
+* [utils] `Popen`: Restore `LD_LIBRARY_PATH` when using PyInstaller by [Lesmiscore](https://github.com/Lesmiscore)
+* [build] `make tar` should not follow `DESTDIR` by [satan1st](https://github.com/satan1st)
+* [build] Update pyinstaller by [shirt-dev](https://github.com/shirt-dev)
+* [test] Fix `test_youtube_signature`
+* [cleanup] Misc fixes and cleanup by [DavidH-2022](https://github.com/DavidH-2022), [MrRawes](https://github.com/MrRawes), [pukkandan](https://github.com/pukkandan)
+* [extractor/epoch] Add extractor by [tejasa97](https://github.com/tejasa97)
+* [extractor/eurosport] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
+* [extractor/IslamChannel] Add extractors by [Lesmiscore](https://github.com/Lesmiscore)
+* [extractor/newspicks] Add extractor by [Lesmiscore](https://github.com/Lesmiscore)
+* [extractor/triller] Add extractor by [bashonly](https://github.com/bashonly)
+* [extractor/VQQ] Add extractors by [elyse0](https://github.com/elyse0)
+* [extractor/youtube] Improvements to nsig extraction
+* [extractor/youtube] Fix bug in format sorting
+* [extractor/youtube] Update iOS Innertube clients by [SamantazFox](https://github.com/SamantazFox)
+* [extractor/youtube] Use device-specific user agent by [coletdjnz](https://github.com/coletdjnz)
+* [extractor/youtube] Add `--compat-option no-youtube-prefer-utc-upload-date` by [coletdjnz](https://github.com/coletdjnz)
+* [extractor/arte] Bug fix by [cgrigis](https://github.com/cgrigis)
+* [extractor/bilibili] Extract `flac` with premium account by [jackyyf](https://github.com/jackyyf)
+* [extractor/BiliBiliSearch] Don't sort by date
+* [extractor/BiliBiliSearch] Fix infinite loop
+* [extractor/bitchute] Mark errors as expected
+* [extractor/crunchyroll:beta] Use anonymous access by [tejing1](https://github.com/tejing1)
+* [extractor/huya] Fix stream extraction by [ohaiibuzzle](https://github.com/ohaiibuzzle)
+* [extractor/medaltv] Fix extraction by [xenova](https://github.com/xenova)
+* [extractor/mediaset] Fix embed extraction
+* [extractor/mixcloud] All formats are audio-only
+* [extractor/rtbf] Fix jwt extraction by [elyse0](https://github.com/elyse0)
+* [extractor/screencastomatic] Support `--video-password` by [shreyasminocha](https://github.com/shreyasminocha)
+* [extractor/stripchat] Don't modify input URL by [dfaker](https://github.com/dfaker)
+* [extractor/uktv] Improve `_VALID_URL` by [dirkf](https://github.com/dirkf)
+* [extractor/vimeo:user] Fix `_VALID_URL`
+### 2022.08.19
+* Fix bug in `--download-archive`
+* [jsinterp] **Fix for new youtube players** and related improvements by [dirkf](https://github.com/dirkf), [pukkandan](https://github.com/pukkandan)
+* [phantomjs] Add function to execute JS without a DOM by [MinePlayersPE](https://github.com/MinePlayersPE), [pukkandan](https://github.com/pukkandan)
+* [build] Exclude devscripts from installs by [Lesmiscore](https://github.com/Lesmiscore)
+* [cleanup] Misc fixes and cleanup
+* [extractor/youtube] **Add fallback to phantomjs** for nsig
+* [extractor/youtube] Fix error reporting of "Incomplete data"
+* [extractor/youtube] Improve format sorting for IOS formats
+* [extractor/youtube] Improve signature caching
+* [extractor/instagram] Fix extraction by [bashonly](https://github.com/bashonly), [pritam20ps05](https://github.com/pritam20ps05)
+* [extractor/rai] Minor fix by [nixxo](https://github.com/nixxo)
+* [extractor/rtbf] Fix stream extractor by [elyse0](https://github.com/elyse0)
+* [extractor/SovietsCloset] Fix extractor by [ChillingPepper](https://github.com/ChillingPepper)
+* [extractor/zattoo] Fix Zattoo resellers by [goggle](https://github.com/goggle)
 ### 2022.08.14
 * Merge youtube-dl: Upto [commit/d231b56](https://github.com/ytdl-org/youtube-dl/commit/d231b56)
@@ -19,8 +84,7 @@
 * [extractor] Fix format sorting of `channels`
 * [ffmpeg] Disable avconv unless `--prefer-avconv`
 * [ffmpeg] Smarter detection of ffprobe filename
-* [patreon] Ignore erroneous media attachments by [coletdjnz](https://github.com/coletdjnz)
-* [postprocessor/embedthumbnail] Detect `libatomicparsley.so`
+* [embedthumbnail] Detect `libatomicparsley.so`
 * [ThumbnailsConvertor] Fix conversion after `fixup_webp`
 * [utils] Fix `get_compatible_ext`
 * [build] Fix changelog
@@ -30,6 +94,7 @@
 * [cleanup] Misc fixes and cleanup
 * [extractor/moview] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
 * [extractor/parler] Add extractor by [palewire](https://github.com/palewire)
+* [extractor/patreon] Ignore erroneous media attachments by [coletdjnz](https://github.com/coletdjnz)
 * [extractor/truth] Add extractor by [palewire](https://github.com/palewire)
 * [extractor/aenetworks] Add formats parameter by [jacobtruman](https://github.com/jacobtruman)
 * [extractor/crunchyroll] Improve `_VALID_URL`s

View File

@@ -33,7 +33,6 @@ completion-zsh: completions/zsh/_yt-dlp
 lazy-extractors: yt_dlp/extractor/lazy_extractors.py
 PREFIX ?= /usr/local
-DESTDIR ?= .
 BINDIR ?= $(PREFIX)/bin
 MANDIR ?= $(PREFIX)/man
 SHAREDIR ?= $(PREFIX)/share
@@ -134,7 +133,7 @@ yt_dlp/extractor/lazy_extractors.py: devscripts/make_lazy_extractors.py devscrip
 $(PYTHON) devscripts/make_lazy_extractors.py $@
 yt-dlp.tar.gz: all
-@tar -czf $(DESTDIR)/yt-dlp.tar.gz --transform "s|^|yt-dlp/|" --owner 0 --group 0 \
+@tar -czf yt-dlp.tar.gz --transform "s|^|yt-dlp/|" --owner 0 --group 0 \
 --exclude '*.DS_Store' \
 --exclude '*.kate-swp' \
 --exclude '*.pyc' \


@@ -71,7 +71,7 @@ yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on t
# NEW FEATURES # NEW FEATURES
* Merged with **youtube-dl v2021.12.17+ [commit/d231b56](https://github.com/ytdl-org/youtube-dl/commit/d231b56717c73ee597d2e077d11b69ed48a1b02d)**<!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl) * Merged with **youtube-dl v2021.12.17+ [commit/ed5c44e](https://github.com/ytdl-org/youtube-dl/commit/ed5c44e7b74ac77f87ca5ed6cb5e964a0c6a0678)**<!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* **[SponsorBlock Integration](#sponsorblock-options)**: You can mark/remove sponsor sections in youtube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API * **[SponsorBlock Integration](#sponsorblock-options)**: You can mark/remove sponsor sections in youtube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API
@@ -141,6 +141,7 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
* Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading * Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading
* Youtube channel URLs are automatically redirected to `/video`. Append a `/featured` to the URL to download only the videos in the home page. If the channel does not have a videos tab, we try to download the equivalent `UU` playlist instead. For all other tabs, if the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections * Youtube channel URLs are automatically redirected to `/video`. Append a `/featured` to the URL to download only the videos in the home page. If the channel does not have a videos tab, we try to download the equivalent `UU` playlist instead. For all other tabs, if the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections
* Unavailable videos are also listed for youtube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this * Unavailable videos are also listed for youtube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this
* The upload dates extracted from YouTube are in UTC [when available](https://github.com/yt-dlp/yt-dlp/blob/89e4d86171c7b7c997c77d4714542e0383bf0db0/yt_dlp/extractor/youtube.py#L3898-L3900). Use `--compat-options no-youtube-prefer-utc-upload-date` to prefer the non-UTC upload date.
* If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this * If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this
* Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead * Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead
* Some private fields such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this * Some private fields such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this
@@ -320,7 +321,7 @@ To build the standalone executable, you must have Python and `pyinstaller` (plus
On some systems, you may need to use `py` or `python` instead of `python3`. On some systems, you may need to use `py` or `python` instead of `python3`.
Note that pyinstaller [does not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment. Note that pyinstaller with versions below 4.4 [do not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment.
**Important**: Running `pyinstaller` directly **without** using `pyinst.py` is **not** officially supported. This may or may not work correctly. **Important**: Running `pyinstaller` directly **without** using `pyinst.py` is **not** officially supported. This may or may not work correctly.
@@ -329,7 +330,7 @@ You will need the build tools `python` (3.6+), `zip`, `make` (GNU), `pandoc`\* a
After installing these, simply run `make`. After installing these, simply run `make`.
You can also run `make yt-dlp` instead to compile only the binary without updating any of the additional files. (The dependencies marked with **\*** are not needed for this) You can also run `make yt-dlp` instead to compile only the binary without updating any of the additional files. (The build tools marked with **\*** are not needed for this)
### Standalone Py2Exe Builds (Windows) ### Standalone Py2Exe Builds (Windows)
@@ -375,7 +376,13 @@ You can also fork the project on github and run your fork's [build workflow](.gi
--list-extractors List all supported extractors and exit --list-extractors List all supported extractors and exit
--extractor-descriptions Output descriptions of all supported --extractor-descriptions Output descriptions of all supported
extractors and exit extractors and exit
--force-generic-extractor Force extraction to use the generic extractor --use-extractors NAMES Extractor names to use separated by commas.
You can also use regexes, "all", "default"
and "end" (end URL matching); e.g. --ies
"holodex.*,end,youtube". Prefix the name
with a "-" to exclude it, e.g. --ies
default,-generic. Use --list-extractors for
a list of extractor names. (Alias: --ies)
--default-search PREFIX Use this prefix for unqualified URLs. E.g. --default-search PREFIX Use this prefix for unqualified URLs. E.g.
"gvsearch2:python" downloads two videos from "gvsearch2:python" downloads two videos from
google videos for the search term "python". google videos for the search term "python".
@@ -524,8 +531,8 @@ You can also fork the project on github and run your fork's [build workflow](.gi
a file that is in the archive a file that is in the archive
--break-on-reject Stop the download process when encountering --break-on-reject Stop the download process when encountering
a file that has been filtered out a file that has been filtered out
--break-per-input Make --break-on-existing, --break-on-reject --break-per-input --break-on-existing, --break-on-reject,
and --max-downloads act only on the current --max-downloads, and autonumber resets per
input URL input URL
--no-break-per-input --break-on-existing and similar options --no-break-per-input --break-on-existing and similar options
terminates the entire download queue terminates the entire download queue
@@ -700,18 +707,20 @@ You can also fork the project on github and run your fork's [build workflow](.gi
and dump cookie jar in and dump cookie jar in
--no-cookies Do not read/dump cookies from/to file --no-cookies Do not read/dump cookies from/to file
(default) (default)
--cookies-from-browser BROWSER[+KEYRING][:PROFILE] --cookies-from-browser BROWSER[+KEYRING][:PROFILE][::CONTAINER]
The name of the browser and (optionally) the The name of the browser to load cookies
name/path of the profile to load cookies from. Currently supported browsers are:
from, separated by a ":". Currently brave, chrome, chromium, edge, firefox,
supported browsers are: brave, chrome, opera, safari, vivaldi. Optionally, the
chromium, edge, firefox, opera, safari, KEYRING used for decrypting Chromium cookies
vivaldi. By default, the most recently on Linux, the name/path of the PROFILE to
accessed profile is used. The keyring used load cookies from, and the CONTAINER name
for decrypting Chromium cookies on Linux can (if Firefox) ("none" for no container) can
be (optionally) specified after the browser be given with their respective separators.
name separated by a "+". Currently supported By default, all containers of the most
keyrings are: basictext, gnomekeyring, kwallet recently accessed profile are used.
Currently supported keyrings are: basictext,
gnomekeyring, kwallet
--no-cookies-from-browser Do not load cookies from browser (default) --no-cookies-from-browser Do not load cookies from browser (default)
--cache-dir DIR Location in the filesystem where youtube-dl --cache-dir DIR Location in the filesystem where youtube-dl
can store some downloaded information (such can store some downloaded information (such
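The `BROWSER[+KEYRING][:PROFILE][::CONTAINER]` syntax in the hunk above can be illustrated with a small parser. This is only a sketch under stated assumptions: `SPEC_RE` and `parse_browser_spec` are hypothetical names, and the regular expression is an illustration of the documented separators (`+`, `:`, `::`), not necessarily the one yt-dlp uses.

```python
import re

# Hypothetical pattern for BROWSER[+KEYRING][:PROFILE][::CONTAINER].
# A single ":" introduces the profile; "::" introduces the container.
SPEC_RE = re.compile(r'''(?x)
    (?P<browser>[^+:]+)
    (?:\+(?P<keyring>[^:]+))?
    (?::(?!:)(?P<profile>.+?))?
    (?:::(?P<container>.+))?
''')


def parse_browser_spec(spec):
    """Split a cookies-from-browser style spec into its named parts."""
    match = SPEC_RE.fullmatch(spec)
    if match is None:
        raise ValueError(f'invalid browser specification: {spec!r}')
    # Parts that were not given are reported as None
    return match.groupdict()


if __name__ == '__main__':
    for example in ('firefox', 'chrome+gnomekeyring:Profile 1', 'firefox::none'):
        print(example, '->', parse_browser_spec(example))
```

The negative lookahead `(?!:)` keeps a leading `::` from being read as an empty profile, so `firefox::none` yields a container and no profile.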
@@ -1229,7 +1238,6 @@ The available fields are:
- `id` (string): Video identifier - `id` (string): Video identifier
- `title` (string): Video title - `title` (string): Video title
- `fulltitle` (string): Video title ignoring live timestamp and generic title - `fulltitle` (string): Video title ignoring live timestamp and generic title
- `url` (string): Video URL
- `ext` (string): Video filename extension - `ext` (string): Video filename extension
- `alt_title` (string): A secondary title of the video - `alt_title` (string): A secondary title of the video
- `description` (string): The description of the video - `description` (string): The description of the video
@@ -1264,26 +1272,6 @@ The available fields are:
- `availability` (string): Whether the video is "private", "premium_only", "subscriber_only", "needs_auth", "unlisted" or "public" - `availability` (string): Whether the video is "private", "premium_only", "subscriber_only", "needs_auth", "unlisted" or "public"
- `start_time` (numeric): Time in seconds where the reproduction should start, as specified in the URL - `start_time` (numeric): Time in seconds where the reproduction should start, as specified in the URL
- `end_time` (numeric): Time in seconds where the reproduction should end, as specified in the URL - `end_time` (numeric): Time in seconds where the reproduction should end, as specified in the URL
- `format` (string): A human-readable description of the format
- `format_id` (string): Format code specified by `--format`
- `format_note` (string): Additional info about the format
- `width` (numeric): Width of the video
- `height` (numeric): Height of the video
- `resolution` (string): Textual description of width and height
- `tbr` (numeric): Average bitrate of audio and video in KBit/s
- `abr` (numeric): Average audio bitrate in KBit/s
- `acodec` (string): Name of the audio codec in use
- `asr` (numeric): Audio sampling rate in Hertz
- `vbr` (numeric): Average video bitrate in KBit/s
- `fps` (numeric): Frame rate
- `dynamic_range` (string): The dynamic range of the video
- `audio_channels` (numeric): The number of audio channels
- `stretched_ratio` (float): `width:height` of the video's pixels, if not square
- `vcodec` (string): Name of the video codec in use
- `container` (string): Name of the container format
- `filesize` (numeric): The number of bytes, if known in advance
- `filesize_approx` (numeric): An estimate for the number of bytes
- `protocol` (string): The protocol that will be used for the actual download
- `extractor` (string): Name of the extractor - `extractor` (string): Name of the extractor
- `extractor_key` (string): Key name of the extractor - `extractor_key` (string): Key name of the extractor
- `epoch` (numeric): Unix epoch of when the information extraction was completed - `epoch` (numeric): Unix epoch of when the information extraction was completed
@@ -1302,6 +1290,8 @@ The available fields are:
- `webpage_url_basename` (string): The basename of the webpage URL - `webpage_url_basename` (string): The basename of the webpage URL
- `webpage_url_domain` (string): The domain of the webpage URL - `webpage_url_domain` (string): The domain of the webpage URL
- `original_url` (string): The URL given by the user (or same as `webpage_url` for playlist entries) - `original_url` (string): The URL given by the user (or same as `webpage_url` for playlist entries)
All the fields in [Filtering Formats](#filtering-formats) can also be used
Available for the video that belongs to some logical chapter or section: Available for the video that belongs to some logical chapter or section:
@@ -1383,13 +1373,13 @@ If you are using an output template inside a Windows batch file then you must es
#### Output template examples #### Output template examples
```bash ```bash
$ yt-dlp --get-filename -o "test video.%(ext)s" BaW_jenozKc $ yt-dlp --print filename -o "test video.%(ext)s" BaW_jenozKc
test video.webm # Literal name with correct extension test video.webm # Literal name with correct extension
$ yt-dlp --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc $ yt-dlp --print filename -o "%(title)s.%(ext)s" BaW_jenozKc
youtube-dl test video ''_ä↭𝕐.webm # All kinds of weird characters youtube-dl test video ''_ä↭𝕐.webm # All kinds of weird characters
$ yt-dlp --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc --restrict-filenames $ yt-dlp --print filename -o "%(title)s.%(ext)s" BaW_jenozKc --restrict-filenames
youtube-dl_test_video_.webm # Restricted file name youtube-dl_test_video_.webm # Restricted file name
# Download YouTube playlist videos in separate directory indexed by video order in a playlist # Download YouTube playlist videos in separate directory indexed by video order in a playlist
@@ -1478,6 +1468,7 @@ You can also filter the video formats by putting a condition in brackets, as in
The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals): The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals):
- `filesize`: The number of bytes, if known in advance - `filesize`: The number of bytes, if known in advance
- `filesize_approx`: An estimate for the number of bytes
- `width`: Width of the video, if known - `width`: Width of the video, if known
- `height`: Height of the video, if known - `height`: Height of the video, if known
- `tbr`: Average bitrate of audio and video in KBit/s - `tbr`: Average bitrate of audio and video in KBit/s
@@ -1485,16 +1476,23 @@ The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `
- `vbr`: Average video bitrate in KBit/s - `vbr`: Average video bitrate in KBit/s
- `asr`: Audio sampling rate in Hertz - `asr`: Audio sampling rate in Hertz
- `fps`: Frame rate - `fps`: Frame rate
- `audio_channels`: The number of audio channels
- `stretched_ratio`: `width:height` of the video's pixels, if not square
Filtering also works with the comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains), `~=` (matches regex) on the following string meta fields: Filtering also works with the comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains), `~=` (matches regex) on the following string meta fields:
- `url`: Video URL
- `ext`: File extension - `ext`: File extension
- `acodec`: Name of the audio codec in use - `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use - `vcodec`: Name of the video codec in use
- `container`: Name of the container format - `container`: Name of the container format
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`) - `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`)
- `format_id`: A short description of the format
- `language`: Language code - `language`: Language code
- `dynamic_range`: The dynamic range of the video
- `format_id`: A short description of the format
- `format`: A human-readable description of the format
- `format_note`: Additional info about the format
- `resolution`: Textual description of width and height
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain). The comparand of a string comparison needs to be quoted with either double or single quotes if it contains spaces or special characters other than `._-`. Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain). The comparand of a string comparison needs to be quoted with either double or single quotes if it contains spaces or special characters other than `._-`.
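The string comparisons described in this hunk, including the `!` negation prefix, can be sketched as a small lookup table. This is a minimal illustration, not yt-dlp's implementation: `STRING_OPERATORS` and `string_filter` are hypothetical names, and treating `~=` as an unanchored `re.search` is an assumption.

```python
import re

# Hypothetical table mapping each documented operator to a predicate
STRING_OPERATORS = {
    '=': lambda value, arg: value == arg,
    '^=': lambda value, arg: value.startswith(arg),
    '$=': lambda value, arg: value.endswith(arg),
    '*=': lambda value, arg: arg in value,
    # Assumption: the regex may match anywhere in the value
    '~=': lambda value, arg: re.search(arg, value) is not None,
}


def string_filter(op, value, comparand):
    """Apply a string comparison; a leading "!" negates the result."""
    negate = op.startswith('!')
    base_op = op[1:] if negate else op
    result = STRING_OPERATORS[base_op](value, comparand)
    return not result if negate else result


if __name__ == '__main__':
    print(string_filter('^=', 'avc1.64001f', 'avc1'))
    print(string_filter('!*=', 'm3u8_native', 'dash'))
```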
@@ -1521,7 +1519,7 @@ The available fields are:
- `acodec`: Audio Codec (`flac`/`alac` > `wav`/`aiff` > `opus` > `vorbis` > `aac` > `mp4a` > `mp3` > `eac3` > `ac3` > `dts` > other) - `acodec`: Audio Codec (`flac`/`alac` > `wav`/`aiff` > `opus` > `vorbis` > `aac` > `mp4a` > `mp3` > `eac3` > `ac3` > `dts` > other)
- `codec`: Equivalent to `vcodec,acodec` - `codec`: Equivalent to `vcodec,acodec`
- `vext`: Video Extension (`mp4` > `webm` > `flv` > other). If `--prefer-free-formats` is used, `webm` is preferred. - `vext`: Video Extension (`mp4` > `webm` > `flv` > other). If `--prefer-free-formats` is used, `webm` is preferred.
- `aext`: Audio Extension (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other). If `--prefer-free-formats` is used, the order changes to `opus` > `ogg` > `webm` > `m4a` > `mp3` > `aac`. - `aext`: Audio Extension (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other). If `--prefer-free-formats` is used, the order changes to `ogg` > `opus` > `webm` > `mp3` > `m4a` > `aac`
- `ext`: Equivalent to `vext,aext` - `ext`: Equivalent to `vext,aext`
- `filesize`: Exact filesize, if known in advance - `filesize`: Exact filesize, if known in advance
- `fs_approx`: Approximate filesize calculated from the manifests - `fs_approx`: Approximate filesize calculated from the manifests
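A preference list such as the default `aext` order quoted above (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other) can be turned into a sort key by ranking each extension by its position in the list. The sketch below assumes nothing about how yt-dlp sorts formats internally; `DEFAULT_AEXT_ORDER` and `aext_rank` are hypothetical names.

```python
# Default audio extension preference as listed in the README text
DEFAULT_AEXT_ORDER = ('m4a', 'aac', 'mp3', 'ogg', 'opus', 'webm')


def aext_rank(ext, order=DEFAULT_AEXT_ORDER):
    """Return a sort key: lower is more preferred, unknown extensions come last."""
    try:
        return order.index(ext)
    except ValueError:
        return len(order)


if __name__ == '__main__':
    # sorted() is stable, so unknown extensions keep their relative order
    print(sorted(['webm', 'flac', 'aac', 'm4a'], key=aext_rank))
```

Passing a different tuple as `order` models an alternative preference, such as the one applied with `--prefer-free-formats`.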
@@ -2058,6 +2056,7 @@ While these options are redundant, they are still expected to be used due to the
#### Not recommended #### Not recommended
While these options still work, their use is not recommended since there are other alternatives to achieve the same While these options still work, their use is not recommended since there are other alternatives to achieve the same
--force-generic-extractor --ies generic,default
--exec-before-download CMD --exec "before_dl:CMD" --exec-before-download CMD --exec "before_dl:CMD"
--no-exec-before-download --no-exec --no-exec-before-download --no-exec
--all-formats -f all --all-formats -f all


@@ -11,14 +11,17 @@ from ..utils import (
# These bloat the lazy_extractors, so allow them to passthrough silently # These bloat the lazy_extractors, so allow them to passthrough silently
ALLOWED_CLASSMETHODS = {'get_testcases', 'extract_from_webpage'} ALLOWED_CLASSMETHODS = {'get_testcases', 'extract_from_webpage'}
_WARNED = False
class LazyLoadMetaClass(type): class LazyLoadMetaClass(type):
def __getattr__(cls, name): def __getattr__(cls, name):
if '_real_class' not in cls.__dict__ and name not in ALLOWED_CLASSMETHODS: global _WARNED
write_string( if ('_real_class' not in cls.__dict__
'WARNING: Falling back to normal extractor since lazy extractor ' and name not in ALLOWED_CLASSMETHODS and not _WARNED):
f'{cls.__name__} does not have attribute {name}{bug_reports_message()}\n') _WARNED = True
write_string('WARNING: Falling back to normal extractor since lazy extractor '
f'{cls.__name__} does not have attribute {name}{bug_reports_message()}\n')
return getattr(cls.real_class, name) return getattr(cls.real_class, name)


@@ -12,7 +12,9 @@ from inspect import getsource
from devscripts.utils import get_filename_args, read_file, write_file from devscripts.utils import get_filename_args, read_file, write_file
NO_ATTR = object() NO_ATTR = object()
STATIC_CLASS_PROPERTIES = ['IE_NAME', 'IE_DESC', 'SEARCH_KEY', '_VALID_URL', '_WORKING', '_NETRC_MACHINE', 'age_limit'] STATIC_CLASS_PROPERTIES = [
'IE_NAME', 'IE_DESC', 'SEARCH_KEY', '_VALID_URL', '_WORKING', '_ENABLED', '_NETRC_MACHINE', 'age_limit'
]
CLASS_METHODS = [ CLASS_METHODS = [
'ie_key', 'working', 'description', 'suitable', '_match_valid_url', '_match_id', 'get_temp_id', 'is_suitable' 'ie_key', 'working', 'description', 'suitable', '_match_valid_url', '_match_id', 'get_temp_id', 'is_suitable'
] ]


@@ -1,13 +1,13 @@
#!/usr/bin/env sh #!/usr/bin/env sh
if [ -z $1 ]; then if [ -z "$1" ]; then
test_set='test' test_set='test'
elif [ $1 = 'core' ]; then elif [ "$1" = 'core' ]; then
test_set="-m not download" test_set="-m not download"
elif [ $1 = 'download' ]; then elif [ "$1" = 'download' ]; then
test_set="-m download" test_set="-m download"
else else
echo 'Invalid test type "'$1'". Use "core" | "download"' echo 'Invalid test type "'"$1"'". Use "core" | "download"'
exit 1 exit 1
fi fi


@@ -81,7 +81,7 @@ def version_to_list(version):
def dependency_options(): def dependency_options():
# Due to the current implementation, these are auto-detected, but explicitly add them just in case # Due to the current implementation, these are auto-detected, but explicitly add them just in case
dependencies = [pycryptodome_module(), 'mutagen', 'brotli', 'certifi', 'websockets'] dependencies = [pycryptodome_module(), 'mutagen', 'brotli', 'certifi', 'websockets']
excluded_modules = ['test', 'ytdlp_plugins', 'youtube_dl', 'youtube_dlc'] excluded_modules = ('youtube_dl', 'youtube_dlc', 'test', 'ytdlp_plugins', 'devscripts')
yield from (f'--hidden-import={module}' for module in dependencies) yield from (f'--hidden-import={module}' for module in dependencies)
yield '--collect-submodules=websockets' yield '--collect-submodules=websockets'


@@ -28,7 +28,7 @@ REQUIREMENTS = read_file('requirements.txt').splitlines()
def packages(): def packages():
if setuptools_available: if setuptools_available:
return find_packages(exclude=('youtube_dl', 'youtube_dlc', 'test', 'ytdlp_plugins')) return find_packages(exclude=('youtube_dl', 'youtube_dlc', 'test', 'ytdlp_plugins', 'devscripts'))
return [ return [
'yt_dlp', 'yt_dlp.extractor', 'yt_dlp.downloader', 'yt_dlp.postprocessor', 'yt_dlp.compat', 'yt_dlp', 'yt_dlp.extractor', 'yt_dlp.downloader', 'yt_dlp.postprocessor', 'yt_dlp.compat',


@@ -128,6 +128,8 @@
- **bbc.co.uk:iplayer:group** - **bbc.co.uk:iplayer:group**
- **bbc.co.uk:playlist** - **bbc.co.uk:playlist**
- **BBVTV**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>] - **BBVTV**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>]
- **BBVTVLive**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>]
- **BBVTVRecordings**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>]
- **Beatport** - **Beatport**
- **Beeg** - **Beeg**
- **BehindKink** - **BehindKink**
@@ -348,6 +350,8 @@
- **ehftv** - **ehftv**
- **eHow** - **eHow**
- **EinsUndEinsTV**: [<abbr title="netrc machine"><em>1und1tv</em></abbr>] - **EinsUndEinsTV**: [<abbr title="netrc machine"><em>1und1tv</em></abbr>]
- **EinsUndEinsTVLive**: [<abbr title="netrc machine"><em>1und1tv</em></abbr>]
- **EinsUndEinsTVRecordings**: [<abbr title="netrc machine"><em>1und1tv</em></abbr>]
- **Einthusan** - **Einthusan**
- **eitb.tv** - **eitb.tv**
- **EllenTube** - **EllenTube**
@@ -360,6 +364,7 @@
- **Engadget** - **Engadget**
- **Epicon** - **Epicon**
- **EpiconSeries** - **EpiconSeries**
- **Epoch**
- **Eporner** - **Eporner**
- **EroProfile**: [<abbr title="netrc machine"><em>eroprofile</em></abbr>] - **EroProfile**: [<abbr title="netrc machine"><em>eroprofile</em></abbr>]
- **EroProfile:album** - **EroProfile:album**
@@ -373,8 +378,11 @@
- **EsriVideo** - **EsriVideo**
- **Europa** - **Europa**
- **EuropeanTour** - **EuropeanTour**
- **Eurosport**
- **EUScreen** - **EUScreen**
- **EWETV**: [<abbr title="netrc machine"><em>ewetv</em></abbr>] - **EWETV**: [<abbr title="netrc machine"><em>ewetv</em></abbr>]
- **EWETVLive**: [<abbr title="netrc machine"><em>ewetv</em></abbr>]
- **EWETVRecordings**: [<abbr title="netrc machine"><em>ewetv</em></abbr>]
- **ExpoTV** - **ExpoTV**
- **Expressen** - **Expressen**
- **ExtremeTube** - **ExtremeTube**
@@ -454,6 +462,8 @@
- **GiantBomb** - **GiantBomb**
- **Giga** - **Giga**
- **GlattvisionTV**: [<abbr title="netrc machine"><em>glattvisiontv</em></abbr>] - **GlattvisionTV**: [<abbr title="netrc machine"><em>glattvisiontv</em></abbr>]
- **GlattvisionTVLive**: [<abbr title="netrc machine"><em>glattvisiontv</em></abbr>]
- **GlattvisionTVRecordings**: [<abbr title="netrc machine"><em>glattvisiontv</em></abbr>]
- **Glide**: Glide mobile video messages (glide.me) - **Glide**: Glide mobile video messages (glide.me)
- **Globo**: [<abbr title="netrc machine"><em>globo</em></abbr>] - **Globo**: [<abbr title="netrc machine"><em>globo</em></abbr>]
- **GloboArticle** - **GloboArticle**
@@ -545,6 +555,8 @@
- **iq.com**: International version of iQiyi - **iq.com**: International version of iQiyi
- **iq.com:album** - **iq.com:album**
- **iqiyi**: [<abbr title="netrc machine"><em>iqiyi</em></abbr>] 爱奇艺 - **iqiyi**: [<abbr title="netrc machine"><em>iqiyi</em></abbr>] 爱奇艺
- **IslamChannel**
- **IslamChannelSeries**
- **ITProTV** - **ITProTV**
- **ITProTVCourse** - **ITProTVCourse**
- **ITTF** - **ITTF**
@@ -715,6 +727,8 @@
- **MLSSoccer** - **MLSSoccer**
- **Mnet** - **Mnet**
- **MNetTV**: [<abbr title="netrc machine"><em>mnettv</em></abbr>] - **MNetTV**: [<abbr title="netrc machine"><em>mnettv</em></abbr>]
- **MNetTVLive**: [<abbr title="netrc machine"><em>mnettv</em></abbr>]
- **MNetTVRecordings**: [<abbr title="netrc machine"><em>mnettv</em></abbr>]
- **MochaVideo** - **MochaVideo**
- **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net - **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
- **Mofosex** - **Mofosex**
@@ -801,13 +815,16 @@
- **netease:program**: 网易云音乐 - 电台节目 - **netease:program**: 网易云音乐 - 电台节目
- **netease:singer**: 网易云音乐 - 歌手 - **netease:singer**: 网易云音乐 - 歌手
- **netease:song**: 网易云音乐 - **netease:song**: 网易云音乐
- **NetPlus**: [<abbr title="netrc machine"><em>netplus</em></abbr>] - **NetPlusTV**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **NetPlusTVLive**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **NetPlusTVRecordings**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **Netverse** - **Netverse**
- **NetversePlaylist** - **NetversePlaylist**
- **Netzkino** - **Netzkino**
- **Newgrounds** - **Newgrounds**
- **Newgrounds:playlist** - **Newgrounds:playlist**
- **Newgrounds:user** - **Newgrounds:user**
- **NewsPicks**
- **Newstube** - **Newstube**
- **Newsy** - **Newsy**
- **NextMedia**: 蘋果日報 - **NextMedia**: 蘋果日報
@@ -906,6 +923,8 @@
- **orf:radio** - **orf:radio**
- **orf:tvthek**: ORF TVthek - **orf:tvthek**: ORF TVthek
- **OsnatelTV**: [<abbr title="netrc machine"><em>osnateltv</em></abbr>] - **OsnatelTV**: [<abbr title="netrc machine"><em>osnateltv</em></abbr>]
- **OsnatelTVLive**: [<abbr title="netrc machine"><em>osnateltv</em></abbr>]
- **OsnatelTVRecordings**: [<abbr title="netrc machine"><em>osnateltv</em></abbr>]
- **OutsideTV** - **OutsideTV**
- **PacktPub**: [<abbr title="netrc machine"><em>packtpub</em></abbr>] - **PacktPub**: [<abbr title="netrc machine"><em>packtpub</em></abbr>]
- **PacktPubCourse** - **PacktPubCourse**
@@ -1013,6 +1032,8 @@
- **qqmusic:singer**: QQ音乐 - 歌手 - **qqmusic:singer**: QQ音乐 - 歌手
- **qqmusic:toplist**: QQ音乐 - 排行榜 - **qqmusic:toplist**: QQ音乐 - 排行榜
- **QuantumTV**: [<abbr title="netrc machine"><em>quantumtv</em></abbr>] - **QuantumTV**: [<abbr title="netrc machine"><em>quantumtv</em></abbr>]
- **QuantumTVLive**: [<abbr title="netrc machine"><em>quantumtv</em></abbr>]
- **QuantumTVRecordings**: [<abbr title="netrc machine"><em>quantumtv</em></abbr>]
- **Qub** - **Qub**
- **R7** - **R7**
- **R7Article** - **R7Article**
@@ -1121,7 +1142,11 @@
- **safari:course**: [<abbr title="netrc machine"><em>safari</em></abbr>] safaribooksonline.com online courses - **safari:course**: [<abbr title="netrc machine"><em>safari</em></abbr>] safaribooksonline.com online courses
- **Saitosan** - **Saitosan**
- **SAKTV**: [<abbr title="netrc machine"><em>saktv</em></abbr>] - **SAKTV**: [<abbr title="netrc machine"><em>saktv</em></abbr>]
- **SAKTVLive**: [<abbr title="netrc machine"><em>saktv</em></abbr>]
- **SAKTVRecordings**: [<abbr title="netrc machine"><em>saktv</em></abbr>]
- **SaltTV**: [<abbr title="netrc machine"><em>salttv</em></abbr>] - **SaltTV**: [<abbr title="netrc machine"><em>salttv</em></abbr>]
- **SaltTVLive**: [<abbr title="netrc machine"><em>salttv</em></abbr>]
- **SaltTVRecordings**: [<abbr title="netrc machine"><em>salttv</em></abbr>]
- **SampleFocus** - **SampleFocus**
- **Sapo**: SAPO Vídeos - **Sapo**: SAPO Vídeos
- **savefrom.net** - **savefrom.net**
@@ -1311,6 +1336,8 @@
- **ToypicsUser**: Toypics user profile - **ToypicsUser**: Toypics user profile
- **TrailerAddict**: (**Currently broken**) - **TrailerAddict**: (**Currently broken**)
- **TravelChannel** - **TravelChannel**
- **Triller**: [<abbr title="netrc machine"><em>triller</em></abbr>]
- **TrillerUser**: [<abbr title="netrc machine"><em>triller</em></abbr>]
- **Trilulilu** - **Trilulilu**
- **Trovo** - **Trovo**
- **TrovoChannelClip**: All Clips of a trovo.live channel; "trovoclip:" prefix - **TrovoChannelClip**: All Clips of a trovo.live channel; "trovoclip:" prefix
@@ -1486,6 +1513,8 @@
- **VoxMedia** - **VoxMedia**
- **VoxMediaVolume** - **VoxMediaVolume**
- **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **vqq:series**
- **vqq:video**
- **Vrak** - **Vrak**
- **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza - **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza
- **VrtNU**: [<abbr title="netrc machine"><em>vrtnu</em></abbr>] VrtNU.be - **VrtNU**: [<abbr title="netrc machine"><em>vrtnu</em></abbr>] VrtNU.be
@@ -1494,6 +1523,8 @@
- **VShare** - **VShare**
- **VTM** - **VTM**
- **VTXTV**: [<abbr title="netrc machine"><em>vtxtv</em></abbr>] - **VTXTV**: [<abbr title="netrc machine"><em>vtxtv</em></abbr>]
- **VTXTVLive**: [<abbr title="netrc machine"><em>vtxtv</em></abbr>]
- **VTXTVRecordings**: [<abbr title="netrc machine"><em>vtxtv</em></abbr>]
- **VuClip** - **VuClip**
- **Vupload** - **Vupload**
- **VVVVID** - **VVVVID**
@@ -1503,6 +1534,8 @@
- **Wakanim** - **Wakanim**
- **Walla** - **Walla**
- **WalyTV**: [<abbr title="netrc machine"><em>walytv</em></abbr>] - **WalyTV**: [<abbr title="netrc machine"><em>walytv</em></abbr>]
- **WalyTVLive**: [<abbr title="netrc machine"><em>walytv</em></abbr>]
- **WalyTVRecordings**: [<abbr title="netrc machine"><em>walytv</em></abbr>]
- **wasdtv:clip** - **wasdtv:clip**
- **wasdtv:record** - **wasdtv:record**
- **wasdtv:stream** - **wasdtv:stream**

@@ -668,7 +668,7 @@ class TestYoutubeDL(unittest.TestCase):
     def test_prepare_outtmpl_and_filename(self):
         def test(tmpl, expected, *, info=None, **params):
             params['outtmpl'] = tmpl
-            ydl = YoutubeDL(params)
+            ydl = FakeYDL(params)
             ydl._num_downloads = 1
             self.assertEqual(ydl.validate_outtmpl(tmpl), None)

@@ -11,41 +11,46 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 import contextlib
 import subprocess
-from yt_dlp.utils import encodeArgument
+from yt_dlp.utils import Popen
 rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+LAZY_EXTRACTORS = 'yt_dlp/extractor/lazy_extractors.py'
-try:
-    _DEV_NULL = subprocess.DEVNULL
-except AttributeError:
-    _DEV_NULL = open(os.devnull, 'wb')
 class TestExecution(unittest.TestCase):
-    def test_import(self):
-        subprocess.check_call([sys.executable, '-c', 'import yt_dlp'], cwd=rootDir)
-    def test_module_exec(self):
-        subprocess.check_call([sys.executable, '-m', 'yt_dlp', '--ignore-config', '--version'], cwd=rootDir, stdout=_DEV_NULL)
+    def run_yt_dlp(self, exe=(sys.executable, 'yt_dlp/__main__.py'), opts=('--version', )):
+        stdout, stderr, returncode = Popen.run(
+            [*exe, '--ignore-config', *opts], cwd=rootDir, text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+        print(stderr, file=sys.stderr)
+        self.assertEqual(returncode, 0)
+        return stdout.strip(), stderr.strip()
     def test_main_exec(self):
-        subprocess.check_call([sys.executable, 'yt_dlp/__main__.py', '--ignore-config', '--version'], cwd=rootDir, stdout=_DEV_NULL)
+        self.run_yt_dlp()
+    def test_import(self):
+        self.run_yt_dlp(exe=(sys.executable, '-c', 'import yt_dlp'))
+    def test_module_exec(self):
+        self.run_yt_dlp(exe=(sys.executable, '-m', 'yt_dlp'))
     def test_cmdline_umlauts(self):
-        p = subprocess.Popen(
-            [sys.executable, 'yt_dlp/__main__.py', '--ignore-config', encodeArgument('ä'), '--version'],
-            cwd=rootDir, stdout=_DEV_NULL, stderr=subprocess.PIPE)
-        _, stderr = p.communicate()
+        _, stderr = self.run_yt_dlp(opts=('ä', '--version'))
         self.assertFalse(stderr)
     def test_lazy_extractors(self):
         try:
-            subprocess.check_call([sys.executable, 'devscripts/make_lazy_extractors.py', 'yt_dlp/extractor/lazy_extractors.py'], cwd=rootDir, stdout=_DEV_NULL)
-            subprocess.check_call([sys.executable, 'test/test_all_urls.py'], cwd=rootDir, stdout=_DEV_NULL)
+            subprocess.check_call([sys.executable, 'devscripts/make_lazy_extractors.py', LAZY_EXTRACTORS],
+                                  cwd=rootDir, stdout=subprocess.DEVNULL)
+            self.assertTrue(os.path.exists(LAZY_EXTRACTORS))
+            _, stderr = self.run_yt_dlp(opts=('-s', 'test:'))
+            self.assertFalse(stderr)
+            subprocess.check_call([sys.executable, 'test/test_all_urls.py'], cwd=rootDir, stdout=subprocess.DEVNULL)
         finally:
             with contextlib.suppress(OSError):
-                os.remove('yt_dlp/extractor/lazy_extractors.py')
+                os.remove(LAZY_EXTRACTORS)
 if __name__ == '__main__':

@@ -7,8 +7,10 @@ import unittest
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+import math
+import re
-from yt_dlp.jsinterp import JSInterpreter
+from yt_dlp.jsinterp import JS_Undefined, JSInterpreter
 class TestJSInterpreter(unittest.TestCase):
@@ -66,6 +68,12 @@ class TestJSInterpreter(unittest.TestCase):
jsi = JSInterpreter('function f(){return 0 && 1 || 2;}') jsi = JSInterpreter('function f(){return 0 && 1 || 2;}')
self.assertEqual(jsi.call_function('f'), 2) self.assertEqual(jsi.call_function('f'), 2)
jsi = JSInterpreter('function f(){return 0 ?? 42;}')
self.assertEqual(jsi.call_function('f'), 0)
jsi = JSInterpreter('function f(){return "life, the universe and everything" < 42;}')
self.assertFalse(jsi.call_function('f'))
def test_array_access(self): def test_array_access(self):
jsi = JSInterpreter('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2.0] = 7; return x;}') jsi = JSInterpreter('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2.0] = 7; return x;}')
self.assertEqual(jsi.call_function('f'), [5, 2, 7]) self.assertEqual(jsi.call_function('f'), [5, 2, 7])
@@ -124,6 +132,11 @@ class TestJSInterpreter(unittest.TestCase):
self.assertEqual(jsi.call_function('x'), [20, 20, 30, 40, 50]) self.assertEqual(jsi.call_function('x'), [20, 20, 30, 40, 50])
def test_builtins(self): def test_builtins(self):
jsi = JSInterpreter('''
function x() { return NaN }
''')
self.assertTrue(math.isnan(jsi.call_function('x')))
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { return new Date('Wednesday 31 December 1969 18:01:26 MDT') - 0; } function x() { return new Date('Wednesday 31 December 1969 18:01:26 MDT') - 0; }
''') ''')
@@ -183,6 +196,30 @@ class TestJSInterpreter(unittest.TestCase):
''') ''')
self.assertEqual(jsi.call_function('x'), 10) self.assertEqual(jsi.call_function('x'), 10)
def test_catch(self):
jsi = JSInterpreter('''
function x() { try{throw 10} catch(e){return 5} }
''')
self.assertEqual(jsi.call_function('x'), 5)
def test_finally(self):
jsi = JSInterpreter('''
function x() { try{throw 10} finally {return 42} }
''')
self.assertEqual(jsi.call_function('x'), 42)
jsi = JSInterpreter('''
function x() { try{throw 10} catch(e){return 5} finally {return 42} }
''')
self.assertEqual(jsi.call_function('x'), 42)
def test_nested_try(self):
jsi = JSInterpreter('''
function x() {try {
try{throw 10} finally {throw 42}
} catch(e){return 5} }
''')
self.assertEqual(jsi.call_function('x'), 5)
def test_for_loop_continue(self): def test_for_loop_continue(self):
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { a=0; for (i=0; i-10; i++) { continue; a++ } return a } function x() { a=0; for (i=0; i-10; i++) { continue; a++ } return a }
@@ -195,6 +232,14 @@ class TestJSInterpreter(unittest.TestCase):
''') ''')
self.assertEqual(jsi.call_function('x'), 0) self.assertEqual(jsi.call_function('x'), 0)
def test_for_loop_try(self):
jsi = JSInterpreter('''
function x() {
for (i=0; i-10; i++) { try { if (i == 5) throw i} catch {return 10} finally {break} };
return 42 }
''')
self.assertEqual(jsi.call_function('x'), 42)
def test_literal_list(self): def test_literal_list(self):
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { return [1, 2, "asdf", [5, 6, 7]][3] } function x() { return [1, 2, "asdf", [5, 6, 7]][3] }
@@ -212,6 +257,11 @@ class TestJSInterpreter(unittest.TestCase):
''') ''')
self.assertEqual(jsi.call_function('x'), 7) self.assertEqual(jsi.call_function('x'), 7)
jsi = JSInterpreter('''
function x() { return (l=[0,1,2,3], function(a, b){return a+b})((l[1], l[2]), l[3]) }
''')
self.assertEqual(jsi.call_function('x'), 5)
def test_void(self): def test_void(self):
jsi = JSInterpreter(''' jsi = JSInterpreter('''
function x() { return void 42; } function x() { return void 42; }
@@ -224,6 +274,140 @@ class TestJSInterpreter(unittest.TestCase):
''') ''')
self.assertEqual(jsi.call_function('x')([]), 1) self.assertEqual(jsi.call_function('x')([]), 1)
def test_null(self):
jsi = JSInterpreter('''
function x() { return null; }
''')
self.assertEqual(jsi.call_function('x'), None)
jsi = JSInterpreter('''
function x() { return [null > 0, null < 0, null == 0, null === 0]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False, False, False])
jsi = JSInterpreter('''
function x() { return [null >= 0, null <= 0]; }
''')
self.assertEqual(jsi.call_function('x'), [True, True])
def test_undefined(self):
jsi = JSInterpreter('''
function x() { return undefined === undefined; }
''')
self.assertEqual(jsi.call_function('x'), True)
jsi = JSInterpreter('''
function x() { return undefined; }
''')
self.assertEqual(jsi.call_function('x'), JS_Undefined)
jsi = JSInterpreter('''
function x() { let v; return v; }
''')
self.assertEqual(jsi.call_function('x'), JS_Undefined)
jsi = JSInterpreter('''
function x() { return [undefined === undefined, undefined == undefined, undefined < undefined, undefined > undefined]; }
''')
self.assertEqual(jsi.call_function('x'), [True, True, False, False])
jsi = JSInterpreter('''
function x() { return [undefined === 0, undefined == 0, undefined < 0, undefined > 0]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False, False, False])
jsi = JSInterpreter('''
function x() { return [undefined >= 0, undefined <= 0]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False])
jsi = JSInterpreter('''
function x() { return [undefined > null, undefined < null, undefined == null, undefined === null]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False, True, False])
jsi = JSInterpreter('''
function x() { return [undefined === null, undefined == null, undefined < null, undefined > null]; }
''')
self.assertEqual(jsi.call_function('x'), [False, True, False, False])
jsi = JSInterpreter('''
function x() { let v; return [42+v, v+42, v**42, 42**v, 0**v]; }
''')
for y in jsi.call_function('x'):
self.assertTrue(math.isnan(y))
jsi = JSInterpreter('''
function x() { let v; return v**0; }
''')
self.assertEqual(jsi.call_function('x'), 1)
jsi = JSInterpreter('''
function x() { let v; return [v>42, v<=42, v&&42, 42&&v]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False, JS_Undefined, JS_Undefined])
jsi = JSInterpreter('function x(){return undefined ?? 42; }')
self.assertEqual(jsi.call_function('x'), 42)
def test_object(self):
jsi = JSInterpreter('''
function x() { return {}; }
''')
self.assertEqual(jsi.call_function('x'), {})
jsi = JSInterpreter('''
function x() { let a = {m1: 42, m2: 0 }; return [a["m1"], a.m2]; }
''')
self.assertEqual(jsi.call_function('x'), [42, 0])
jsi = JSInterpreter('''
function x() { let a; return a?.qq; }
''')
self.assertEqual(jsi.call_function('x'), JS_Undefined)
jsi = JSInterpreter('''
function x() { let a = {m1: 42, m2: 0 }; return a?.qq; }
''')
self.assertEqual(jsi.call_function('x'), JS_Undefined)
def test_regex(self):
jsi = JSInterpreter('''
function x() { let a=/,,[/,913,/](,)}/; }
''')
self.assertEqual(jsi.call_function('x'), None)
jsi = JSInterpreter('''
function x() { let a=/,,[/,913,/](,)}/; return a; }
''')
self.assertIsInstance(jsi.call_function('x'), re.Pattern)
jsi = JSInterpreter('''
function x() { let a=/,,[/,913,/](,)}/i; return a; }
''')
self.assertEqual(jsi.call_function('x').flags & re.I, re.I)
jsi = JSInterpreter(R'''
function x() { let a=/,][}",],()}(\[)/; return a; }
''')
self.assertEqual(jsi.call_function('x').pattern, r',][}",],()}(\[)')
def test_char_code_at(self):
jsi = JSInterpreter('function x(i){return "test".charCodeAt(i)}')
self.assertEqual(jsi.call_function('x', 0), 116)
self.assertEqual(jsi.call_function('x', 1), 101)
self.assertEqual(jsi.call_function('x', 2), 115)
self.assertEqual(jsi.call_function('x', 3), 116)
self.assertEqual(jsi.call_function('x', 4), None)
self.assertEqual(jsi.call_function('x', 'not_a_number'), 116)
def test_bitwise_operators_overflow(self):
jsi = JSInterpreter('function x(){return -524999584 << 5}')
self.assertEqual(jsi.call_function('x'), 379882496)
jsi = JSInterpreter('function x(){return 1236566549 << 5}')
self.assertEqual(jsi.call_function('x'), 915423904)
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()
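
The two `test_bitwise_operators_overflow` cases added in this file expect JavaScript's 32-bit wraparound for `<<`. As a rough illustration only (this is not the `JSInterpreter` implementation, and `js_shl` is a hypothetical helper name), the expected values can be reproduced in plain Python by masking to 32 bits and reinterpreting the result as a signed integer:

```python
def js_shl(a, b):
    """Emulate JavaScript's `a << b` on 32-bit signed integers (hypothetical helper)."""
    # JavaScript only uses the low 5 bits of the shift count
    result = (a << (b & 31)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed one
    return result - 0x100000000 if result >= 0x80000000 else result


print(js_shl(-524999584, 5))  # 379882496, the value asserted in the test
print(js_shl(1236566549, 5))  # 915423904, the value asserted in the test
```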

@@ -102,6 +102,30 @@ _NSIG_TESTS = [
'https://www.youtube.com/s/player/4c3f79c5/player_ias.vflset/en_US/base.js', 'https://www.youtube.com/s/player/4c3f79c5/player_ias.vflset/en_US/base.js',
'TDCstCG66tEAO5pR9o', 'dbxNtZ14c-yWyw', 'TDCstCG66tEAO5pR9o', 'dbxNtZ14c-yWyw',
), ),
(
'https://www.youtube.com/s/player/c81bbb4a/player_ias.vflset/en_US/base.js',
'gre3EcLurNY2vqp94', 'Z9DfGxWP115WTg',
),
(
'https://www.youtube.com/s/player/1f7d5369/player_ias.vflset/en_US/base.js',
'batNX7sYqIJdkJ', 'IhOkL_zxbkOZBw',
),
(
'https://www.youtube.com/s/player/009f1d77/player_ias.vflset/en_US/base.js',
'5dwFHw8aFWQUQtffRq', 'audescmLUzI3jw',
),
(
'https://www.youtube.com/s/player/dc0c6770/player_ias.vflset/en_US/base.js',
'5EHDMgYLV6HPGk_Mu-kk', 'n9lUJLHbxUI0GQ',
),
(
'https://www.youtube.com/s/player/113ca41c/player_ias.vflset/en_US/base.js',
'cgYl-tlYkhjT7A', 'hI7BBr2zUgcmMg',
),
(
'https://www.youtube.com/s/player/c57c113c/player_ias.vflset/en_US/base.js',
'M92UUMHa8PdvPd3wyM', '3hPqLJsiNZx7yA',
),
] ]

@@ -29,6 +29,7 @@ from .cookies import load_cookies
from .downloader import FFmpegFD, get_suitable_downloader, shorten_protocol_name from .downloader import FFmpegFD, get_suitable_downloader, shorten_protocol_name
from .downloader.rtmp import rtmpdump_version from .downloader.rtmp import rtmpdump_version
from .extractor import gen_extractor_classes, get_info_extractor from .extractor import gen_extractor_classes, get_info_extractor
from .extractor.common import UnsupportedURLIE
from .extractor.openload import PhantomJSwrapper from .extractor.openload import PhantomJSwrapper
from .minicurses import format_text from .minicurses import format_text
from .postprocessor import _PLUGIN_CLASSES as plugin_postprocessors from .postprocessor import _PLUGIN_CLASSES as plugin_postprocessors
@@ -47,7 +48,7 @@ from .postprocessor import (
get_postprocessor, get_postprocessor,
) )
from .postprocessor.ffmpeg import resolve_mapping as resolve_recode_mapping from .postprocessor.ffmpeg import resolve_mapping as resolve_recode_mapping
from .update import detect_variant from .update import REPOSITORY, current_git_head, detect_variant
from .utils import ( from .utils import (
DEFAULT_OUTTMPL, DEFAULT_OUTTMPL,
IDENTITY, IDENTITY,
@@ -89,6 +90,7 @@ from .utils import (
args_to_str, args_to_str,
bug_reports_message, bug_reports_message,
date_from_str, date_from_str,
deprecation_warning,
determine_ext, determine_ext,
determine_protocol, determine_protocol,
encode_compat_str, encode_compat_str,
@@ -115,6 +117,7 @@ from .utils import (
network_exceptions, network_exceptions,
number_of_digits, number_of_digits,
orderedSet, orderedSet,
orderedSet_from_options,
parse_filesize, parse_filesize,
preferredencoding, preferredencoding,
prepend_extension, prepend_extension,
@@ -236,7 +239,7 @@ class YoutubeDL:
Default is 'only_download' for CLI, but False for API Default is 'only_download' for CLI, but False for API
skip_playlist_after_errors: Number of allowed failures until the rest of skip_playlist_after_errors: Number of allowed failures until the rest of
the playlist is skipped the playlist is skipped
force_generic_extractor: Force downloader to use the generic extractor allowed_extractors: List of regexes to match against extractor names that are allowed
overwrites: Overwrite all video and metadata files if True, overwrites: Overwrite all video and metadata files if True,
overwrite only non-video files if None overwrite only non-video files if None
and don't overwrite any file if False and don't overwrite any file if False
@@ -301,8 +304,9 @@ class YoutubeDL:
                        should act on each input URL as opposed to for the entire queue
     cookiefile: File name or text stream from where cookies should be read and dumped to
     cookiesfrombrowser: A tuple containing the name of the browser, the profile
-                       name/path from where cookies are loaded, and the name of the
-                       keyring, e.g. ('chrome', ) or ('vivaldi', 'default', 'BASICTEXT')
+                       name/path from where cookies are loaded, the name of the keyring,
+                       and the container name, e.g. ('chrome', ) or
+                       ('vivaldi', 'default', 'BASICTEXT') or ('firefox', 'default', None, 'Meta')
     legacyserverconnect: Explicitly allow HTTPS connection to servers that do not
                        support RFC 5746 secure renegotiation
     nocheckcertificate: Do not verify SSL certificates
@@ -444,6 +448,7 @@ class YoutubeDL:
* index: Section number (Optional) * index: Section number (Optional)
force_keyframes_at_cuts: Re-encode the video when downloading ranges to get precise cuts force_keyframes_at_cuts: Re-encode the video when downloading ranges to get precise cuts
noprogress: Do not print the progress bar noprogress: Do not print the progress bar
live_from_start: Whether to download livestreams videos from the start
The following parameters are not used by YoutubeDL itself, they are used by The following parameters are not used by YoutubeDL itself, they are used by
the downloader (see yt_dlp/downloader/common.py): the downloader (see yt_dlp/downloader/common.py):
@@ -475,6 +480,8 @@ class YoutubeDL:
The following options are deprecated and may be removed in the future: The following options are deprecated and may be removed in the future:
force_generic_extractor: Force downloader to use the generic extractor
- Use allowed_extractors = ['generic', 'default']
playliststart: - Use playlist_items playliststart: - Use playlist_items
Playlist item to start at. Playlist item to start at.
playlistend: - Use playlist_items playlistend: - Use playlist_items
@@ -626,7 +633,7 @@ class YoutubeDL:
         for msg in self.params.get('_warnings', []):
             self.report_warning(msg)
         for msg in self.params.get('_deprecation_warnings', []):
-            self.deprecation_warning(msg)
+            self.deprecated_feature(msg)
         self.params['compat_opts'] = set(self.params.get('compat_opts', ()))
         if 'list-formats' in self.params['compat_opts']:
@@ -756,13 +763,6 @@ class YoutubeDL:
self._ies_instances[ie_key] = ie self._ies_instances[ie_key] = ie
ie.set_downloader(self) ie.set_downloader(self)
def _get_info_extractor_class(self, ie_key):
ie = self._ies.get(ie_key)
if ie is None:
ie = get_info_extractor(ie_key)
self.add_info_extractor(ie)
return ie
def get_info_extractor(self, ie_key): def get_info_extractor(self, ie_key):
""" """
Get an instance of an IE with name ie_key, it will try to get one from Get an instance of an IE with name ie_key, it will try to get one from
@@ -779,8 +779,19 @@ class YoutubeDL:
""" """
         Add the InfoExtractors returned by gen_extractors to the end of the list
         """
-        for ie in gen_extractor_classes():
-            self.add_info_extractor(ie)
+        all_ies = {ie.IE_NAME.lower(): ie for ie in gen_extractor_classes()}
+        all_ies['end'] = UnsupportedURLIE()
+        try:
+            ie_names = orderedSet_from_options(
+                self.params.get('allowed_extractors', ['default']), {
+                    'all': list(all_ies),
+                    'default': [name for name, ie in all_ies.items() if ie._ENABLED],
+                }, use_regex=True)
+        except re.error as e:
+            raise ValueError(f'Wrong regex for allowed_extractors: {e.pattern}')
+        for name in ie_names:
+            self.add_info_extractor(all_ies[name])
+        self.write_debug(f'Loaded {len(ie_names)} extractors')
     def add_post_processor(self, pp, when='post_process'):
         """Add a PostProcessor object to the end of the chain."""
@@ -826,9 +837,11 @@ class YoutubeDL:
     def to_stdout(self, message, skip_eol=False, quiet=None):
         """Print message to stdout"""
         if quiet is not None:
-            self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument quiet. Use "YoutubeDL.to_screen" instead')
+            self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument quiet. '
+                                     'Use "YoutubeDL.to_screen" instead')
         if skip_eol is not False:
-            self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument skip_eol. Use "YoutubeDL.to_screen" instead')
+            self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument skip_eol. '
+                                     'Use "YoutubeDL.to_screen" instead')
         self._write_string(f'{self._bidi_workaround(message)}\n', self._out_files.out)
     def to_screen(self, message, skip_eol=False, quiet=None):
@@ -964,11 +977,14 @@ class YoutubeDL:
             return
         self.to_stderr(f'{self._format_err("WARNING:", self.Styles.WARNING)} {message}', only_once)
-    def deprecation_warning(self, message):
+    def deprecation_warning(self, message, *, stacklevel=0):
+        deprecation_warning(
+            message, stacklevel=stacklevel + 1, printer=self.report_error, is_error=False)
+    def deprecated_feature(self, message):
         if self.params.get('logger') is not None:
-            self.params['logger'].warning(f'DeprecationWarning: {message}')
-        else:
-            self.to_stderr(f'{self._format_err("DeprecationWarning:", self.Styles.ERROR)} {message}', True)
+            self.params['logger'].warning(f'Deprecated Feature: {message}')
+        self.to_stderr(f'{self._format_err("Deprecated Feature:", self.Styles.ERROR)} {message}', True)
     def report_error(self, message, *args, **kwargs):
         '''
@@ -1028,7 +1044,7 @@ class YoutubeDL:
def get_output_path(self, dir_type='', filename=None): def get_output_path(self, dir_type='', filename=None):
paths = self.params.get('paths', {}) paths = self.params.get('paths', {})
assert isinstance(paths, dict) assert isinstance(paths, dict), '"paths" parameter must be a dictionary'
path = os.path.join( path = os.path.join(
expand_path(paths.get('home', '').strip()), expand_path(paths.get('home', '').strip()),
expand_path(paths.get(dir_type, '').strip()) if dir_type else '', expand_path(paths.get(dir_type, '').strip()) if dir_type else '',
@@ -1411,11 +1427,11 @@ class YoutubeDL:
             ie_key = 'Generic'
         if ie_key:
-            ies = {ie_key: self._get_info_extractor_class(ie_key)}
+            ies = {ie_key: self._ies[ie_key]} if ie_key in self._ies else {}
         else:
             ies = self._ies
-        for ie_key, ie in ies.items():
+        for key, ie in ies.items():
             if not ie.suitable(url):
                 continue
@@ -1424,14 +1440,16 @@ class YoutubeDL:
                                     'and will probably not work.')
             temp_id = ie.get_temp_id(url)
-            if temp_id is not None and self.in_download_archive({'id': temp_id, 'ie_key': ie_key}):
-                self.to_screen(f'[{ie_key}] {temp_id}: has already been recorded in the archive')
+            if temp_id is not None and self.in_download_archive({'id': temp_id, 'ie_key': key}):
+                self.to_screen(f'[{key}] {temp_id}: has already been recorded in the archive')
                 if self.params.get('break_on_existing', False):
                     raise ExistingVideoReached()
                 break
-            return self.__extract_info(url, self.get_info_extractor(ie_key), download, extra_info, process)
+            return self.__extract_info(url, self.get_info_extractor(key), download, extra_info, process)
         else:
-            self.report_error('no suitable InfoExtractor for URL %s' % url)
+            extractors_restricted = self.params.get('allowed_extractors') not in (None, ['default'])
+            self.report_error(f'No suitable extractor{format_field(ie_key, None, " (%s)")} found for URL {url}',
+                              tb=False if extractors_restricted else None)
     def _handle_extraction_exceptions(func):
         @functools.wraps(func)
@@ -2510,9 +2528,6 @@ class YoutubeDL:
                     '--live-from-start is passed, but there are no formats that can be downloaded from the start. '
                     'If you want to download from the current time, use --no-live-from-start'))
-        if not formats:
-            self.raise_no_formats(info_dict)
         def is_wellformed(f):
             url = f.get('url')
             if not url:
@@ -2525,7 +2540,10 @@ class YoutubeDL:
             return True
         # Filter out malformed formats for better extraction robustness
-        formats = list(filter(is_wellformed, formats))
+        formats = list(filter(is_wellformed, formats or []))
+        if not formats:
+            self.raise_no_formats(info_dict)
         formats_dict = {}
@@ -2727,42 +2745,26 @@ class YoutubeDL:
                 if lang not in available_subs:
                     available_subs[lang] = cap_info
-        if (not self.params.get('writesubtitles') and not
-                self.params.get('writeautomaticsub') or not
-                available_subs):
+        if not available_subs or (
+                not self.params.get('writesubtitles')
+                and not self.params.get('writeautomaticsub')):
             return None
         all_sub_langs = tuple(available_subs.keys())
         if self.params.get('allsubtitles', False):
             requested_langs = all_sub_langs
         elif self.params.get('subtitleslangs', False):
-            # A list is used so that the order of languages will be the same as
-            # given in subtitleslangs. See https://github.com/yt-dlp/yt-dlp/issues/1041
-            requested_langs = []
-            for lang_re in self.params.get('subtitleslangs'):
-                discard = lang_re[0] == '-'
-                if discard:
-                    lang_re = lang_re[1:]
-                if lang_re == 'all':
-                    if discard:
-                        requested_langs = []
-                    else:
-                        requested_langs.extend(all_sub_langs)
-                    continue
-                current_langs = filter(re.compile(lang_re + '$').match, all_sub_langs)
-                if discard:
-                    for lang in current_langs:
-                        while lang in requested_langs:
-                            requested_langs.remove(lang)
-                else:
-                    requested_langs.extend(current_langs)
-            requested_langs = orderedSet(requested_langs)
+            try:
+                requested_langs = orderedSet_from_options(
+                    self.params.get('subtitleslangs'), {'all': all_sub_langs}, use_regex=True)
+            except re.error as e:
+                raise ValueError(f'Wrong regex for subtitlelangs: {e.pattern}')
         elif normal_sub_langs:
             requested_langs = ['en'] if 'en' in normal_sub_langs else normal_sub_langs[:1]
         else:
             requested_langs = ['en'] if 'en' in all_sub_langs else all_sub_langs[:1]
         if requested_langs:
-            self.write_debug('Downloading subtitles: %s' % ', '.join(requested_langs))
+            self.to_screen(f'[info] {video_id}: Downloading subtitles: {", ".join(requested_langs)}')
         formats_query = self.params.get('subtitlesformat', 'best')
         formats_preference = formats_query.split('/') if formats_query else []
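
The loop removed in this hunk spells out the selection rules that the call to `orderedSet_from_options` replaces: a leading `-` discards matching languages, `all` stands for every available language, and any other entry is treated as a regex anchored at the end. The following is a standalone sketch of that removed logic only. The function name `select_langs` is hypothetical, it is not part of yt-dlp, and the real helper may differ in details:

```python
import re


def select_langs(patterns, available):
    """Sketch of the removed subtitleslangs loop (hypothetical helper, not yt-dlp API)."""
    requested = []
    for pattern in patterns:
        # A leading '-' means "remove matching languages from the selection"
        discard = pattern.startswith('-')
        if discard:
            pattern = pattern[1:]
        if pattern == 'all':
            if discard:
                requested = []
            else:
                requested.extend(available)
            continue
        matches = [lang for lang in available if re.compile(pattern + '$').match(lang)]
        if discard:
            requested = [lang for lang in requested if lang not in matches]
        else:
            requested.extend(matches)
    # De-duplicate while keeping first-seen order, so the result follows the given order
    return list(dict.fromkeys(requested))


print(select_langs(['all', '-live_chat'], ['en', 'de', 'live_chat']))  # ['en', 'de']
```

Keeping a list rather than a set is what preserves the order in which the languages were requested, which the removed comment names as the reason for the original design.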
@@ -3270,6 +3272,7 @@ class YoutubeDL:
self.to_screen(f'[info] {e}') self.to_screen(f'[info] {e}')
if not self.params.get('break_per_url'): if not self.params.get('break_per_url'):
raise raise
self._num_downloads = 0
else: else:
if self.params.get('dump_single_json', False): if self.params.get('dump_single_json', False):
self.post_extract(res) self.post_extract(res)
@@ -3318,6 +3321,12 @@ class YoutubeDL:
return info_dict return info_dict
info_dict.setdefault('epoch', int(time.time())) info_dict.setdefault('epoch', int(time.time()))
info_dict.setdefault('_type', 'video') info_dict.setdefault('_type', 'video')
info_dict.setdefault('_version', {
'version': __version__,
'current_git_head': current_git_head(),
'release_git_head': RELEASE_GIT_HEAD,
'repository': REPOSITORY,
})
if remove_private_keys: if remove_private_keys:
reject = lambda k, v: v is None or k.startswith('__') or k in { reject = lambda k, v: v is None or k.startswith('__') or k in {
@@ -3443,7 +3452,7 @@ class YoutubeDL:
return False return False
vid_ids = [self._make_archive_id(info_dict)] vid_ids = [self._make_archive_id(info_dict)]
vid_ids.extend(info_dict.get('_old_archive_ids', [])) vid_ids.extend(info_dict.get('_old_archive_ids') or [])
return any(id_ in self.archive for id_ in vid_ids) return any(id_ in self.archive for id_ in vid_ids)
def record_download_archive(self, info_dict): def record_download_archive(self, info_dict):
@@ -3682,7 +3691,8 @@ class YoutubeDL:
         if VARIANT not in (None, 'pip'):
             source += '*'
         write_debug(join_nonempty(
-            'yt-dlp version', __version__,
+            f'{"yt-dlp" if REPOSITORY == "yt-dlp/yt-dlp" else REPOSITORY} version',
+            __version__,
             f'[{RELEASE_GIT_HEAD}]' if RELEASE_GIT_HEAD else '',
             '' if source == 'unknown' else f'({source})',
             delim=' '))
@@ -3698,18 +3708,8 @@ class YoutubeDL:
         if self.params['compat_opts']:
             write_debug('Compatibility options: %s' % ', '.join(self.params['compat_opts']))
-        if source == 'source':
-            try:
-                stdout, _, _ = Popen.run(
-                    ['git', 'rev-parse', '--short', 'HEAD'],
-                    text=True, cwd=os.path.dirname(os.path.abspath(__file__)),
-                    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
-                if re.fullmatch('[0-9a-f]+', stdout.strip()):
-                    write_debug(f'Git HEAD: {stdout.strip()}')
-            except Exception:
-                with contextlib.suppress(Exception):
-                    sys.exc_clear()
+        if current_git_head():
+            write_debug(f'Git HEAD: {current_git_head()}')
         write_debug(system_identifier())
         exe_versions, ffmpeg_features = FFmpegPostProcessor.get_versions_and_features(self)

@@ -63,6 +63,8 @@ from .utils import (
) )
from .YoutubeDL import YoutubeDL from .YoutubeDL import YoutubeDL
_IN_CLI = False
def _exit(status=0, *args): def _exit(status=0, *args):
for msg in args: for msg in args:
@@ -344,10 +346,16 @@ def validate_options(opts):
     # Cookies from browser
     if opts.cookiesfrombrowser:
-        mobj = re.match(r'(?P<name>[^+:]+)(\s*\+\s*(?P<keyring>[^:]+))?(\s*:(?P<profile>.+))?', opts.cookiesfrombrowser)
+        container = None
+        mobj = re.fullmatch(r'''(?x)
+            (?P<name>[^+:]+)
+            (?:\s*\+\s*(?P<keyring>[^:]+))?
+            (?:\s*:\s*(?P<profile>.+?))?
+            (?:\s*::\s*(?P<container>.+))?
+        ''', opts.cookiesfrombrowser)
         if mobj is None:
             raise ValueError(f'invalid cookies from browser arguments: {opts.cookiesfrombrowser}')
-        browser_name, keyring, profile = mobj.group('name', 'keyring', 'profile')
+        browser_name, keyring, profile, container = mobj.group('name', 'keyring', 'profile', 'container')
         browser_name = browser_name.lower()
         if browser_name not in SUPPORTED_BROWSERS:
             raise ValueError(f'unsupported browser specified for cookies: "{browser_name}". '
@@ -357,7 +365,7 @@ def validate_options(opts):
if keyring not in SUPPORTED_KEYRINGS: if keyring not in SUPPORTED_KEYRINGS:
raise ValueError(f'unsupported keyring specified for cookies: "{keyring}". ' raise ValueError(f'unsupported keyring specified for cookies: "{keyring}". '
f'Supported keyrings are: {", ".join(sorted(SUPPORTED_KEYRINGS))}') f'Supported keyrings are: {", ".join(sorted(SUPPORTED_KEYRINGS))}')
opts.cookiesfrombrowser = (browser_name, profile, keyring) opts.cookiesfrombrowser = (browser_name, profile, keyring, container)
# MetadataParser # MetadataParser
def metadataparser_actions(f): def metadataparser_actions(f):
@@ -766,6 +774,7 @@ def parse_options(argv=None):
'windowsfilenames': opts.windowsfilenames, 'windowsfilenames': opts.windowsfilenames,
'ignoreerrors': opts.ignoreerrors, 'ignoreerrors': opts.ignoreerrors,
'force_generic_extractor': opts.force_generic_extractor, 'force_generic_extractor': opts.force_generic_extractor,
'allowed_extractors': opts.allowed_extractors or ['default'],
'ratelimit': opts.ratelimit, 'ratelimit': opts.ratelimit,
'throttledratelimit': opts.throttledratelimit, 'throttledratelimit': opts.throttledratelimit,
'overwrites': opts.overwrites, 'overwrites': opts.overwrites,
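
Review note: the new `--cookies-from-browser` pattern in the `validate_options` hunk above can be exercised on its own. What follows is a minimal sketch, not the project's code: `parse_browser_spec` is a hypothetical helper that only wraps the regex from the hunk and returns the tuple in the `(browser_name, profile, keyring, container)` order that `validate_options` now stores. The browser and keyring whitelist checks are left out.

```python
import re

# Same pattern as in the hunk above; everything else here is illustrative.
_SPEC_RE = re.compile(r'''(?x)
    (?P<name>[^+:]+)
    (?:\s*\+\s*(?P<keyring>[^:]+))?
    (?:\s*:\s*(?P<profile>.+?))?
    (?:\s*::\s*(?P<container>.+))?
''')


def parse_browser_spec(spec):
    """Hypothetical helper: split BROWSER[+KEYRING][:PROFILE][::CONTAINER]."""
    mobj = _SPEC_RE.fullmatch(spec)
    if mobj is None:
        raise ValueError(f'invalid cookies from browser arguments: {spec}')
    name, keyring, profile, container = mobj.group('name', 'keyring', 'profile', 'container')
    # Mirror the storage order used by validate_options after this change
    return name.lower(), profile, keyring, container
```

With this pattern, `firefox:default::Work` splits into profile `default` and container `Work`. A spec without an explicit profile, such as `firefox::Work`, appears to be parsed with `:Work` as the profile, because the optional profile group is tried before the container group; that case may deserve a test.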

@@ -14,4 +14,5 @@ if __package__ is None and not hasattr(sys, 'frozen'):
import yt_dlp import yt_dlp
if __name__ == '__main__': if __name__ == '__main__':
yt_dlp._IN_CLI = True
yt_dlp.main() yt_dlp.main()

@@ -6,7 +6,8 @@ import re
import shutil import shutil
import traceback import traceback
from .utils import expand_path, write_json_file from .utils import expand_path, traverse_obj, version_tuple, write_json_file
from .version import __version__
class Cache: class Cache:
@@ -45,12 +46,20 @@ class Cache:
if ose.errno != errno.EEXIST: if ose.errno != errno.EEXIST:
raise raise
self._ydl.write_debug(f'Saving {section}.{key} to cache') self._ydl.write_debug(f'Saving {section}.{key} to cache')
write_json_file(data, fn) write_json_file({'yt-dlp_version': __version__, 'data': data}, fn)
except Exception: except Exception:
tb = traceback.format_exc() tb = traceback.format_exc()
self._ydl.report_warning(f'Writing cache to {fn!r} failed: {tb}') self._ydl.report_warning(f'Writing cache to {fn!r} failed: {tb}')
def load(self, section, key, dtype='json', default=None): def _validate(self, data, min_ver):
version = traverse_obj(data, 'yt-dlp_version')
if not version: # Backward compatibility
data, version = {'data': data}, '2022.08.19'
if not min_ver or version_tuple(version) >= version_tuple(min_ver):
return data['data']
self._ydl.write_debug(f'Discarding old cache from version {version} (needs {min_ver})')
def load(self, section, key, dtype='json', default=None, *, min_ver=None):
assert dtype in ('json',) assert dtype in ('json',)
if not self.enabled: if not self.enabled:
@@ -61,8 +70,8 @@ class Cache:
try: try:
with open(cache_fn, encoding='utf-8') as cachef: with open(cache_fn, encoding='utf-8') as cachef:
self._ydl.write_debug(f'Loading {section}.{key} from cache') self._ydl.write_debug(f'Loading {section}.{key} from cache')
return json.load(cachef) return self._validate(json.load(cachef), min_ver)
except ValueError: except (ValueError, KeyError):
try: try:
file_size = os.path.getsize(cache_fn) file_size = os.path.getsize(cache_fn)
except OSError as oe: except OSError as oe:
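
Review note: the version gate added in `Cache._validate` can be illustrated outside the class. This is a sketch under stated assumptions: `version_tuple` below is a simplified stand-in for the helper imported from `.utils`, the `isinstance` check approximates `traverse_obj(data, 'yt-dlp_version')`, and the function name `validate` is made up for the example.

```python
def version_tuple(v):
    # Simplified stand-in for the real helper: '2022.09.01' -> (2022, 9, 1)
    return tuple(int(part) for part in v.split('.'))


def validate(data, min_ver, legacy_version='2022.08.19'):
    """Return the cached payload, or None if it was written by a version older than min_ver."""
    version = data.get('yt-dlp_version') if isinstance(data, dict) else None
    if not version:
        # Cache files written before this change have no wrapper; treat them
        # as coming from the last release without versioned caches.
        data, version = {'data': data}, legacy_version
    if not min_ver or version_tuple(version) >= version_tuple(min_ver):
        return data['data']
    return None  # too old: the caller falls back to its default
```

A caller that changed its cache layout would then pass `min_ver` set to the first release using the new layout, and older entries are discarded rather than misread.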

@@ -3,6 +3,7 @@ import contextlib
import http.cookiejar import http.cookiejar
import json import json
import os import os
import re
import shutil import shutil
import struct import struct
import subprocess import subprocess
@@ -24,7 +25,13 @@ from .dependencies import (
sqlite3, sqlite3,
) )
from .minicurses import MultilinePrinter, QuietMultilinePrinter from .minicurses import MultilinePrinter, QuietMultilinePrinter
from .utils import Popen, YoutubeDLCookieJar, error_to_str, expand_path from .utils import (
Popen,
YoutubeDLCookieJar,
error_to_str,
expand_path,
try_call,
)
CHROMIUM_BASED_BROWSERS = {'brave', 'chrome', 'chromium', 'edge', 'opera', 'vivaldi'} CHROMIUM_BASED_BROWSERS = {'brave', 'chrome', 'chromium', 'edge', 'opera', 'vivaldi'}
SUPPORTED_BROWSERS = CHROMIUM_BASED_BROWSERS | {'firefox', 'safari'} SUPPORTED_BROWSERS = CHROMIUM_BASED_BROWSERS | {'firefox', 'safari'}
@@ -85,8 +92,9 @@ def _create_progress_bar(logger):
def load_cookies(cookie_file, browser_specification, ydl): def load_cookies(cookie_file, browser_specification, ydl):
cookie_jars = [] cookie_jars = []
if browser_specification is not None: if browser_specification is not None:
browser_name, profile, keyring = _parse_browser_specification(*browser_specification) browser_name, profile, keyring, container = _parse_browser_specification(*browser_specification)
cookie_jars.append(extract_cookies_from_browser(browser_name, profile, YDLLogger(ydl), keyring=keyring)) cookie_jars.append(
extract_cookies_from_browser(browser_name, profile, YDLLogger(ydl), keyring=keyring, container=container))
if cookie_file is not None: if cookie_file is not None:
is_filename = YoutubeDLCookieJar.is_path(cookie_file) is_filename = YoutubeDLCookieJar.is_path(cookie_file)
@@ -101,9 +109,9 @@ def load_cookies(cookie_file, browser_specification, ydl):
return _merge_cookie_jars(cookie_jars) return _merge_cookie_jars(cookie_jars)
def extract_cookies_from_browser(browser_name, profile=None, logger=YDLLogger(), *, keyring=None): def extract_cookies_from_browser(browser_name, profile=None, logger=YDLLogger(), *, keyring=None, container=None):
if browser_name == 'firefox': if browser_name == 'firefox':
return _extract_firefox_cookies(profile, logger) return _extract_firefox_cookies(profile, container, logger)
elif browser_name == 'safari': elif browser_name == 'safari':
return _extract_safari_cookies(profile, logger) return _extract_safari_cookies(profile, logger)
elif browser_name in CHROMIUM_BASED_BROWSERS: elif browser_name in CHROMIUM_BASED_BROWSERS:
@@ -112,7 +120,7 @@ def extract_cookies_from_browser(browser_name, profile=None, logger=YDLLogger(),
raise ValueError(f'unknown browser: {browser_name}') raise ValueError(f'unknown browser: {browser_name}')
def _extract_firefox_cookies(profile, logger): def _extract_firefox_cookies(profile, container, logger):
logger.info('Extracting cookies from firefox') logger.info('Extracting cookies from firefox')
if not sqlite3: if not sqlite3:
logger.warning('Cannot extract cookies from firefox without sqlite3 support. ' logger.warning('Cannot extract cookies from firefox without sqlite3 support. '
@@ -131,11 +139,36 @@ def _extract_firefox_cookies(profile, logger):
raise FileNotFoundError(f'could not find firefox cookies database in {search_root}') raise FileNotFoundError(f'could not find firefox cookies database in {search_root}')
logger.debug(f'Extracting cookies from: "{cookie_database_path}"') logger.debug(f'Extracting cookies from: "{cookie_database_path}"')
container_id = None
if container not in (None, 'none'):
containers_path = os.path.join(os.path.dirname(cookie_database_path), 'containers.json')
if not os.path.isfile(containers_path) or not os.access(containers_path, os.R_OK):
raise FileNotFoundError(f'could not read containers.json in {search_root}')
with open(containers_path) as containers:
identities = json.load(containers).get('identities', [])
container_id = next((context.get('userContextId') for context in identities if container in (
context.get('name'),
try_call(lambda: re.fullmatch(r'userContext([^\.]+)\.label', context['l10nID']).group())
)), None)
if not isinstance(container_id, int):
raise ValueError(f'could not find firefox container "{container}" in containers.json')
with tempfile.TemporaryDirectory(prefix='yt_dlp') as tmpdir: with tempfile.TemporaryDirectory(prefix='yt_dlp') as tmpdir:
cursor = None cursor = None
try: try:
cursor = _open_database_copy(cookie_database_path, tmpdir) cursor = _open_database_copy(cookie_database_path, tmpdir)
cursor.execute('SELECT host, name, value, path, expiry, isSecure FROM moz_cookies') if isinstance(container_id, int):
logger.debug(
f'Only loading cookies from firefox container "{container}", ID {container_id}')
cursor.execute(
'SELECT host, name, value, path, expiry, isSecure FROM moz_cookies WHERE originAttributes LIKE ? OR originAttributes LIKE ?',
(f'%userContextId={container_id}', f'%userContextId={container_id}&%'))
elif container == 'none':
logger.debug('Only loading cookies not belonging to any container')
cursor.execute(
'SELECT host, name, value, path, expiry, isSecure FROM moz_cookies WHERE NOT INSTR(originAttributes,"userContextId=")')
else:
cursor.execute('SELECT host, name, value, path, expiry, isSecure FROM moz_cookies')
jar = YoutubeDLCookieJar() jar = YoutubeDLCookieJar()
with _create_progress_bar(logger) as progress_bar: with _create_progress_bar(logger) as progress_bar:
table = cursor.fetchall() table = cursor.fetchall()
@@ -948,11 +981,11 @@ def _is_path(value):
return os.path.sep in value return os.path.sep in value
def _parse_browser_specification(browser_name, profile=None, keyring=None): def _parse_browser_specification(browser_name, profile=None, keyring=None, container=None):
if browser_name not in SUPPORTED_BROWSERS: if browser_name not in SUPPORTED_BROWSERS:
raise ValueError(f'unsupported browser: "{browser_name}"') raise ValueError(f'unsupported browser: "{browser_name}"')
if keyring not in (None, *SUPPORTED_KEYRINGS): if keyring not in (None, *SUPPORTED_KEYRINGS):
raise ValueError(f'unsupported keyring: "{keyring}"') raise ValueError(f'unsupported keyring: "{keyring}"')
if profile is not None and _is_path(profile): if profile is not None and _is_path(profile):
profile = os.path.expanduser(profile) profile = os.path.expanduser(profile)
return browser_name, profile, keyring return browser_name, profile, keyring, container
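
Review note: the two container filters added to `_extract_firefox_cookies` can be tried against an in-memory database. This is a minimal sketch, assuming a cut-down `moz_cookies` table with only two columns and made-up `originAttributes` values; the helper names are hypothetical. The string literal in the second query uses single quotes here, which is the standard SQL form.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE moz_cookies (name TEXT, originAttributes TEXT)')
conn.executemany('INSERT INTO moz_cookies VALUES (?, ?)', [
    ('plain', ''),                                          # no container
    ('work', '^userContextId=2'),                           # container 2
    ('work_fp', '^userContextId=2&firstPartyDomain=example.com'),
    ('other', '^userContextId=21'),                         # must not match ID 2
])


def cookies_in_container(container_id):
    # Two LIKE patterns: the ID at the end of the string, or followed by '&'
    rows = conn.execute(
        'SELECT name FROM moz_cookies WHERE originAttributes LIKE ? OR originAttributes LIKE ?',
        (f'%userContextId={container_id}', f'%userContextId={container_id}&%'))
    return sorted(name for (name,) in rows)


def cookies_without_container():
    rows = conn.execute(
        "SELECT name FROM moz_cookies WHERE NOT INSTR(originAttributes, 'userContextId=')")
    return sorted(name for (name,) in rows)
```

The pair of patterns is what keeps container 2 from also picking up container 21, which a single `%userContextId=2%` pattern would do.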

@@ -92,6 +92,7 @@ class FileDownloader:
for func in ( for func in (
'deprecation_warning', 'deprecation_warning',
'deprecated_feature',
'report_error', 'report_error',
'report_file_already_downloaded', 'report_file_already_downloaded',
'report_warning', 'report_warning',

@@ -515,16 +515,14 @@ _BY_NAME = {
if name.endswith('FD') and name not in ('ExternalFD', 'FragmentFD') if name.endswith('FD') and name not in ('ExternalFD', 'FragmentFD')
} }
_BY_EXE = {klass.EXE_NAME: klass for klass in _BY_NAME.values()}
def list_external_downloaders(): def list_external_downloaders():
return sorted(_BY_NAME.keys()) return sorted(_BY_NAME.keys())
def get_external_downloader(external_downloader): def get_external_downloader(external_downloader):
""" Given the name of the executable, see whether we support the given """ Given the name of the executable, see whether we support the given downloader """
downloader . """
# Drop .exe extension on Windows
bn = os.path.splitext(os.path.basename(external_downloader))[0] bn = os.path.splitext(os.path.basename(external_downloader))[0]
return _BY_NAME.get(bn, _BY_EXE.get(bn)) return _BY_NAME.get(bn) or next((
klass for klass in _BY_NAME.values() if klass.EXE_NAME in bn
), None)
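
Review note: the lookup in `get_external_downloader` changes from an exact match on the executable name to a substring match. A small sketch of the new behaviour, assuming two stand-in classes and a hand-written `_BY_NAME` table in place of the real downloader classes:

```python
import os


class CurlFD:  # stand-in, not the real downloader class
    EXE_NAME = 'curl'


class Aria2cFD:  # stand-in, not the real downloader class
    EXE_NAME = 'aria2c'


_BY_NAME = {'curl': CurlFD, 'aria2c': Aria2cFD}


def get_external_downloader(external_downloader):
    # Strip the directory and the extension (e.g. '.exe' on Windows)
    bn = os.path.splitext(os.path.basename(external_downloader))[0]
    # Exact name first, then the first class whose EXE_NAME occurs in the basename
    return _BY_NAME.get(bn) or next(
        (klass for klass in _BY_NAME.values() if klass.EXE_NAME in bn), None)
```

This lets renamed binaries such as `my-aria2c-build` resolve to the right class, at the cost of the result depending on dictionary order if one `EXE_NAME` is a substring of another.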

@@ -65,8 +65,8 @@ class FragmentFD(FileDownloader):
""" """
def report_retry_fragment(self, err, frag_index, count, retries): def report_retry_fragment(self, err, frag_index, count, retries):
self.deprecation_warning( self.deprecation_warning('yt_dlp.downloader.FragmentFD.report_retry_fragment is deprecated. '
'yt_dlp.downloader.FragmentFD.report_retry_fragment is deprecated. Use yt_dlp.downloader.FileDownloader.report_retry instead') 'Use yt_dlp.downloader.FileDownloader.report_retry instead')
return self.report_retry(err, count, retries, frag_index) return self.report_retry(err, count, retries, frag_index)
def report_skip_fragment(self, frag_index, err=None): def report_skip_fragment(self, frag_index, err=None):

@@ -1,5 +1,28 @@
# flake8: noqa: F401 # flake8: noqa: F401
from .youtube import ( # Youtube is moved to the top to improve performance
YoutubeIE,
YoutubeClipIE,
YoutubeFavouritesIE,
YoutubeNotificationsIE,
YoutubeHistoryIE,
YoutubeTabIE,
YoutubeLivestreamEmbedIE,
YoutubePlaylistIE,
YoutubeRecommendedIE,
YoutubeSearchDateIE,
YoutubeSearchIE,
YoutubeSearchURLIE,
YoutubeMusicSearchURLIE,
YoutubeSubscriptionsIE,
YoutubeStoriesIE,
YoutubeTruncatedIDIE,
YoutubeTruncatedURLIE,
YoutubeYtBeIE,
YoutubeYtUserIE,
YoutubeWatchLaterIE,
)
from .abc import ( from .abc import (
ABCIE, ABCIE,
ABCIViewIE, ABCIViewIE,
@@ -470,6 +493,7 @@ from .epicon import (
EpiconIE, EpiconIE,
EpiconSeriesIE, EpiconSeriesIE,
) )
from .epoch import EpochIE
from .eporner import EpornerIE from .eporner import EpornerIE
from .eroprofile import ( from .eroprofile import (
EroProfileIE, EroProfileIE,
@@ -491,6 +515,7 @@ from .espn import (
from .esri import EsriVideoIE from .esri import EsriVideoIE
from .europa import EuropaIE from .europa import EuropaIE
from .europeantour import EuropeanTourIE from .europeantour import EuropeanTourIE
from .eurosport import EurosportIE
from .euscreen import EUScreenIE from .euscreen import EUScreenIE
from .expotv import ExpoTVIE from .expotv import ExpoTVIE
from .expressen import ExpressenIE from .expressen import ExpressenIE
@@ -720,6 +745,10 @@ from .iqiyi import (
IqIE, IqIE,
IqAlbumIE IqAlbumIE
) )
from .islamchannel import (
IslamChannelIE,
IslamChannelSeriesIE,
)
from .itprotv import ( from .itprotv import (
ITProTVIE, ITProTVIE,
ITProTVCourseIE ITProTVCourseIE
@@ -1079,6 +1108,7 @@ from .newgrounds import (
NewgroundsPlaylistIE, NewgroundsPlaylistIE,
NewgroundsUserIE, NewgroundsUserIE,
) )
from .newspicks import NewsPicksIE
from .newstube import NewstubeIE from .newstube import NewstubeIE
from .newsy import NewsyIE from .newsy import NewsyIE
from .nextmedia import ( from .nextmedia import (
@@ -1728,6 +1758,12 @@ from .telequebec import (
from .teletask import TeleTaskIE from .teletask import TeleTaskIE
from .telewebion import TelewebionIE from .telewebion import TelewebionIE
from .tempo import TempoIE from .tempo import TempoIE
from .tencent import (
VQQSeriesIE,
VQQVideoIE,
WeTvEpisodeIE,
WeTvSeriesIE,
)
from .tennistv import TennisTVIE from .tennistv import TennisTVIE
from .tenplay import TenPlayIE from .tenplay import TenPlayIE
from .testurl import TestURLIE from .testurl import TestURLIE
@@ -1787,6 +1823,10 @@ from .toongoggles import ToonGogglesIE
from .toutv import TouTvIE from .toutv import TouTvIE
from .toypics import ToypicsUserIE, ToypicsIE from .toypics import ToypicsUserIE, ToypicsIE
from .traileraddict import TrailerAddictIE from .traileraddict import TrailerAddictIE
from .triller import (
TrillerIE,
TrillerUserIE,
)
from .trilulilu import TriluliluIE from .trilulilu import TriluliluIE
from .trovo import ( from .trovo import (
TrovoIE, TrovoIE,
@@ -2092,7 +2132,6 @@ from .weibo import (
WeiboMobileIE WeiboMobileIE
) )
from .weiqitv import WeiqiTVIE from .weiqitv import WeiqiTVIE
from .wetv import WeTvEpisodeIE, WeTvSeriesIE
from .wikimedia import WikimediaIE from .wikimedia import WikimediaIE
from .willow import WillowIE from .willow import WillowIE
from .wimtv import WimTVIE from .wimtv import WimTVIE
@@ -2175,42 +2214,44 @@ from .younow import (
from .youporn import YouPornIE from .youporn import YouPornIE
from .yourporn import YourPornIE from .yourporn import YourPornIE
from .yourupload import YourUploadIE from .yourupload import YourUploadIE
from .youtube import (
YoutubeIE,
YoutubeClipIE,
YoutubeFavouritesIE,
YoutubeNotificationsIE,
YoutubeHistoryIE,
YoutubeTabIE,
YoutubeLivestreamEmbedIE,
YoutubePlaylistIE,
YoutubeRecommendedIE,
YoutubeSearchDateIE,
YoutubeSearchIE,
YoutubeSearchURLIE,
YoutubeMusicSearchURLIE,
YoutubeSubscriptionsIE,
YoutubeStoriesIE,
YoutubeTruncatedIDIE,
YoutubeTruncatedURLIE,
YoutubeYtBeIE,
YoutubeYtUserIE,
YoutubeWatchLaterIE,
)
from .zapiks import ZapiksIE from .zapiks import ZapiksIE
from .zattoo import ( from .zattoo import (
BBVTVIE, BBVTVIE,
BBVTVLiveIE,
BBVTVRecordingsIE,
EinsUndEinsTVIE, EinsUndEinsTVIE,
EinsUndEinsTVLiveIE,
EinsUndEinsTVRecordingsIE,
EWETVIE, EWETVIE,
EWETVLiveIE,
EWETVRecordingsIE,
GlattvisionTVIE, GlattvisionTVIE,
GlattvisionTVLiveIE,
GlattvisionTVRecordingsIE,
MNetTVIE, MNetTVIE,
NetPlusIE, MNetTVLiveIE,
MNetTVRecordingsIE,
NetPlusTVIE,
NetPlusTVLiveIE,
NetPlusTVRecordingsIE,
OsnatelTVIE, OsnatelTVIE,
OsnatelTVLiveIE,
OsnatelTVRecordingsIE,
QuantumTVIE, QuantumTVIE,
QuantumTVLiveIE,
QuantumTVRecordingsIE,
SaltTVIE, SaltTVIE,
SaltTVLiveIE,
SaltTVRecordingsIE,
SAKTVIE, SAKTVIE,
SAKTVLiveIE,
SAKTVRecordingsIE,
VTXTVIE, VTXTVIE,
VTXTVLiveIE,
VTXTVRecordingsIE,
WalyTVIE, WalyTVIE,
WalyTVLiveIE,
WalyTVRecordingsIE,
ZattooIE, ZattooIE,
ZattooLiveIE, ZattooLiveIE,
ZattooMoviesIE, ZattooMoviesIE,

@@ -95,24 +95,24 @@ class ArteTVIE(ArteTVBaseIE):
# all obtained by exhaustive testing # all obtained by exhaustive testing
_COUNTRIES_MAP = { _COUNTRIES_MAP = {
'DE_FR': { 'DE_FR': (
'BL', 'DE', 'FR', 'GF', 'GP', 'MF', 'MQ', 'NC', 'BL', 'DE', 'FR', 'GF', 'GP', 'MF', 'MQ', 'NC',
'PF', 'PM', 'RE', 'WF', 'YT', 'PF', 'PM', 'RE', 'WF', 'YT',
}, ),
# with both of the below 'BE' sometimes works, sometimes doesn't # with both of the below 'BE' sometimes works, sometimes doesn't
'EUR_DE_FR': { 'EUR_DE_FR': (
'AT', 'BL', 'CH', 'DE', 'FR', 'GF', 'GP', 'LI', 'AT', 'BL', 'CH', 'DE', 'FR', 'GF', 'GP', 'LI',
'MC', 'MF', 'MQ', 'NC', 'PF', 'PM', 'RE', 'WF', 'MC', 'MF', 'MQ', 'NC', 'PF', 'PM', 'RE', 'WF',
'YT', 'YT',
}, ),
'SAT': { 'SAT': (
'AD', 'AT', 'AX', 'BG', 'BL', 'CH', 'CY', 'CZ', 'AD', 'AT', 'AX', 'BG', 'BL', 'CH', 'CY', 'CZ',
'DE', 'DK', 'EE', 'ES', 'FI', 'FR', 'GB', 'GF', 'DE', 'DK', 'EE', 'ES', 'FI', 'FR', 'GB', 'GF',
'GR', 'HR', 'HU', 'IE', 'IS', 'IT', 'KN', 'LI', 'GR', 'HR', 'HU', 'IE', 'IS', 'IT', 'KN', 'LI',
'LT', 'LU', 'LV', 'MC', 'MF', 'MQ', 'MT', 'NC', 'LT', 'LU', 'LV', 'MC', 'MF', 'MQ', 'MT', 'NC',
'NL', 'NO', 'PF', 'PL', 'PM', 'PT', 'RE', 'RO', 'NL', 'NO', 'PF', 'PL', 'PM', 'PT', 'RE', 'RO',
'SE', 'SI', 'SK', 'SM', 'VA', 'WF', 'YT', 'SE', 'SI', 'SK', 'SM', 'VA', 'WF', 'YT',
}, ),
} }
def _real_extract(self, url): def _real_extract(self, url):

@@ -218,6 +218,9 @@ class BiliBiliIE(InfoExtractor):
durl = traverse_obj(video_info, ('dash', 'video')) durl = traverse_obj(video_info, ('dash', 'video'))
audios = traverse_obj(video_info, ('dash', 'audio')) or [] audios = traverse_obj(video_info, ('dash', 'audio')) or []
flac_audio = traverse_obj(video_info, ('dash', 'flac', 'audio'))
if flac_audio:
audios.append(flac_audio)
entries = [] entries = []
RENDITIONS = ('qn=80&quality=80&type=', 'quality=2&type=mp4') RENDITIONS = ('qn=80&quality=80&type=', 'quality=2&type=mp4')
@@ -620,14 +623,15 @@ class BiliBiliSearchIE(SearchInfoExtractor):
'keyword': query, 'keyword': query,
'page': page_num, 'page': page_num,
'context': '', 'context': '',
'order': 'pubdate',
'duration': 0, 'duration': 0,
'tids_2': '', 'tids_2': '',
'__refresh__': 'true', '__refresh__': 'true',
'search_type': 'video', 'search_type': 'video',
'tids': 0, 'tids': 0,
'highlight': 1, 'highlight': 1,
})['data'].get('result') or [] })['data'].get('result')
if not videos:
break
for video in videos: for video in videos:
yield self.url_result(video['arcurl'], 'BiliBili', str(video['aid'])) yield self.url_result(video['arcurl'], 'BiliBili', str(video['aid']))

@@ -65,10 +65,12 @@ class BitChuteIE(InfoExtractor):
error = self._html_search_regex(r'<h1 class="page-title">([^<]+)</h1>', webpage, 'error', default='Cannot find video') error = self._html_search_regex(r'<h1 class="page-title">([^<]+)</h1>', webpage, 'error', default='Cannot find video')
if error == 'Video Unavailable': if error == 'Video Unavailable':
raise GeoRestrictedError(error) raise GeoRestrictedError(error)
raise ExtractorError(error) raise ExtractorError(error, expected=True)
formats = entries[0]['formats'] formats = entries[0]['formats']
self._check_formats(formats, video_id) self._check_formats(formats, video_id)
if not formats:
self.raise_no_formats('Video is unavailable', expected=True, video_id=video_id)
self._sort_formats(formats) self._sort_formats(formats)
description = self._html_search_regex( description = self._html_search_regex(

@@ -480,6 +480,9 @@ class InfoExtractor:
will be used by geo restriction bypass mechanism similarly will be used by geo restriction bypass mechanism similarly
to _GEO_COUNTRIES. to _GEO_COUNTRIES.
The _ENABLED attribute should be set to False for IEs that
are disabled by default and must be explicitly enabled.
The _WORKING attribute should be set to False for broken IEs The _WORKING attribute should be set to False for broken IEs
in order to warn the users and skip the tests. in order to warn the users and skip the tests.
""" """
@@ -491,6 +494,7 @@ class InfoExtractor:
_GEO_COUNTRIES = None _GEO_COUNTRIES = None
_GEO_IP_BLOCKS = None _GEO_IP_BLOCKS = None
_WORKING = True _WORKING = True
_ENABLED = True
_NETRC_MACHINE = None _NETRC_MACHINE = None
IE_DESC = None IE_DESC = None
SEARCH_KEY = None SEARCH_KEY = None
@@ -1689,7 +1693,7 @@ class InfoExtractor:
'order_free': ('webm', 'mp4', 'flv', '', 'none')}, 'order_free': ('webm', 'mp4', 'flv', '', 'none')},
'aext': {'type': 'ordered', 'field': 'audio_ext', 'aext': {'type': 'ordered', 'field': 'audio_ext',
'order': ('m4a', 'aac', 'mp3', 'ogg', 'opus', 'webm', '', 'none'), 'order': ('m4a', 'aac', 'mp3', 'ogg', 'opus', 'webm', '', 'none'),
'order_free': ('opus', 'ogg', 'webm', 'm4a', 'mp3', 'aac', '', 'none')}, 'order_free': ('ogg', 'opus', 'webm', 'mp3', 'm4a', 'aac', '', 'none')},
'hidden': {'visible': False, 'forced': True, 'type': 'extractor', 'max': -1000}, 'hidden': {'visible': False, 'forced': True, 'type': 'extractor', 'max': -1000},
'aud_or_vid': {'visible': False, 'forced': True, 'type': 'multiple', 'aud_or_vid': {'visible': False, 'forced': True, 'type': 'multiple',
'field': ('vcodec', 'acodec'), 'field': ('vcodec', 'acodec'),
@@ -1762,9 +1766,8 @@ class InfoExtractor:
if field not in self.settings: if field not in self.settings:
if key in ('forced', 'priority'): if key in ('forced', 'priority'):
return False return False
self.ydl.deprecation_warning( self.ydl.deprecated_feature(f'Using arbitrary fields ({field}) for format sorting is '
f'Using arbitrary fields ({field}) for format sorting is deprecated ' 'deprecated and may be removed in a future version')
'and may be removed in a future version')
self.settings[field] = {} self.settings[field] = {}
propObj = self.settings[field] propObj = self.settings[field]
if key not in propObj: if key not in propObj:
@@ -1849,9 +1852,8 @@ class InfoExtractor:
if self._get_field_setting(field, 'type') == 'alias': if self._get_field_setting(field, 'type') == 'alias':
alias, field = field, self._get_field_setting(field, 'field') alias, field = field, self._get_field_setting(field, 'field')
if self._get_field_setting(alias, 'deprecated'): if self._get_field_setting(alias, 'deprecated'):
self.ydl.deprecation_warning( self.ydl.deprecated_feature(f'Format sorting alias {alias} is deprecated and may '
f'Format sorting alias {alias} is deprecated ' f'be removed in a future version. Please use {field} instead')
f'and may be removed in a future version. Please use {field} instead')
reverse = match.group('reverse') is not None reverse = match.group('reverse') is not None
closest = match.group('separator') == '~' closest = match.group('separator') == '~'
limit_text = match.group('limit') limit_text = match.group('limit')
@@ -3258,7 +3260,7 @@ class InfoExtractor:
'subtitles': {}, 'subtitles': {},
} }
media_attributes = extract_attributes(media_tag) media_attributes = extract_attributes(media_tag)
src = strip_or_none(media_attributes.get('src')) src = strip_or_none(dict_get(media_attributes, ('src', 'data-video-src', 'data-src', 'data-source')))
if src: if src:
f = parse_content_type(media_attributes.get('type')) f = parse_content_type(media_attributes.get('type'))
_, formats = _media_formats(src, media_type, f) _, formats = _media_formats(src, media_type, f)
@@ -3269,7 +3271,7 @@ class InfoExtractor:
s_attr = extract_attributes(source_tag) s_attr = extract_attributes(source_tag)
# data-video-src and data-src are non standard but seen # data-video-src and data-src are non standard but seen
# several times in the wild # several times in the wild
src = strip_or_none(dict_get(s_attr, ('src', 'data-video-src', 'data-src'))) src = strip_or_none(dict_get(s_attr, ('src', 'data-video-src', 'data-src', 'data-source')))
if not src: if not src:
continue continue
f = parse_content_type(s_attr.get('type')) f = parse_content_type(s_attr.get('type'))
@@ -3872,7 +3874,7 @@ class InfoExtractor:
def _extract_from_webpage(cls, url, webpage): def _extract_from_webpage(cls, url, webpage):
for embed_url in orderedSet( for embed_url in orderedSet(
cls._extract_embed_urls(url, webpage) or [], lazy=True): cls._extract_embed_urls(url, webpage) or [], lazy=True):
yield cls.url_result(embed_url, cls) yield cls.url_result(embed_url, None if cls._VALID_URL is False else cls)
@classmethod @classmethod
def _extract_embed_urls(cls, url, webpage): def _extract_embed_urls(cls, url, webpage):
@@ -3941,3 +3943,12 @@ class SearchInfoExtractor(InfoExtractor):
@classproperty @classproperty
def SEARCH_KEY(cls): def SEARCH_KEY(cls):
return cls._SEARCH_KEY return cls._SEARCH_KEY
class UnsupportedURLIE(InfoExtractor):
_VALID_URL = '.*'
_ENABLED = False
IE_DESC = False
def _real_extract(self, url):
raise UnsupportedError(url)

@@ -720,15 +720,20 @@ class CrunchyrollBetaBaseIE(CrunchyrollBaseIE):
def _get_params(self, lang): def _get_params(self, lang):
if not CrunchyrollBetaBaseIE.params: if not CrunchyrollBetaBaseIE.params:
if self._get_cookies(f'https://beta.crunchyroll.com/{lang}').get('etp_rt'):
grant_type, key = 'etp_rt_cookie', 'accountAuthClientId'
else:
grant_type, key = 'client_id', 'anonClientId'
initial_state, app_config = self._get_beta_embedded_json(self._download_webpage( initial_state, app_config = self._get_beta_embedded_json(self._download_webpage(
f'https://beta.crunchyroll.com/{lang}', None, note='Retrieving main page'), None) f'https://beta.crunchyroll.com/{lang}', None, note='Retrieving main page'), None)
api_domain = app_config['cxApiParams']['apiDomain'] api_domain = app_config['cxApiParams']['apiDomain']
basic_token = str(base64.b64encode(('%s:' % app_config['cxApiParams']['accountAuthClientId']).encode('ascii')), 'ascii')
auth_response = self._download_json( auth_response = self._download_json(
f'{api_domain}/auth/v1/token', None, note='Authenticating with cookie', f'{api_domain}/auth/v1/token', None, note=f'Authenticating with grant_type={grant_type}',
headers={ headers={
'Authorization': 'Basic ' + basic_token 'Authorization': 'Basic ' + str(base64.b64encode(('%s:' % app_config['cxApiParams'][key]).encode('ascii')), 'ascii')
}, data='grant_type=etp_rt_cookie'.encode('ascii')) }, data=f'grant_type={grant_type}'.encode('ascii'))
policy_response = self._download_json( policy_response = self._download_json(
f'{api_domain}/index/v2', None, note='Retrieving signed policy', f'{api_domain}/index/v2', None, note='Retrieving signed policy',
headers={ headers={
@@ -747,21 +752,6 @@ class CrunchyrollBetaBaseIE(CrunchyrollBaseIE):
CrunchyrollBetaBaseIE.params = (api_domain, bucket, params) CrunchyrollBetaBaseIE.params = (api_domain, bucket, params)
return CrunchyrollBetaBaseIE.params return CrunchyrollBetaBaseIE.params
def _redirect_from_beta(self, url, lang, internal_id, display_id, is_episode, iekey):
initial_state, app_config = self._get_beta_embedded_json(self._download_webpage(url, display_id), display_id)
content_data = initial_state['content']['byId'][internal_id]
if is_episode:
video_id = content_data['external_id'].split('.')[1]
series_id = content_data['episode_metadata']['series_slug_title']
else:
series_id = content_data['slug_title']
series_id = re.sub(r'-{2,}', '-', series_id)
url = f'https://www.crunchyroll.com/{lang}{series_id}'
if is_episode:
url = url + f'/{display_id}-{video_id}'
self.to_screen(f'{display_id}: Not logged in. Redirecting to non-beta site - {url}')
return self.url_result(url, iekey, display_id)
class CrunchyrollBetaIE(CrunchyrollBetaBaseIE): class CrunchyrollBetaIE(CrunchyrollBetaBaseIE):
IE_NAME = 'crunchyroll:beta' IE_NAME = 'crunchyroll:beta'
@@ -800,10 +790,6 @@ class CrunchyrollBetaIE(CrunchyrollBetaBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id') lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id')
if not self._get_cookies(url).get('etp_rt'):
return self._redirect_from_beta(url, lang, internal_id, display_id, True, CrunchyrollIE.ie_key())
api_domain, bucket, params = self._get_params(lang) api_domain, bucket, params = self._get_params(lang)
episode_response = self._download_json( episode_response = self._download_json(
@@ -897,10 +883,6 @@ class CrunchyrollBetaShowIE(CrunchyrollBetaBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id') lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id')
if not self._get_cookies(url).get('etp_rt'):
return self._redirect_from_beta(url, lang, internal_id, display_id, False, CrunchyrollShowPlaylistIE.ie_key())
api_domain, bucket, params = self._get_params(lang) api_domain, bucket, params = self._get_params(lang)
series_response = self._download_json( series_response = self._download_json(
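
Review note: the `_get_params` change picks the grant type and the client ID key depending on whether the `etp_rt` cookie is present. The selection and the Basic token construction can be sketched in isolation; `build_token_request` and the client ID values used with it are hypothetical, and only the key names and grant types come from the hunk.

```python
import base64


def build_token_request(has_etp_rt_cookie, cx_api_params):
    """Return (headers, body) for the token request, following the hunk above."""
    if has_etp_rt_cookie:
        grant_type, key = 'etp_rt_cookie', 'accountAuthClientId'
    else:
        grant_type, key = 'client_id', 'anonClientId'
    # HTTP Basic credentials with the client ID as user name and an empty password
    basic_token = base64.b64encode(f'{cx_api_params[key]}:'.encode('ascii')).decode('ascii')
    headers = {'Authorization': f'Basic {basic_token}'}
    return headers, f'grant_type={grant_type}'.encode('ascii')
```

This also shows why the `_redirect_from_beta` fallback could be dropped: an anonymous client no longer needs to be sent to the non-beta site, since it authenticates with its own client ID.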

46
yt_dlp/extractor/epoch.py Normal file
@@ -0,0 +1,46 @@
from .common import InfoExtractor
class EpochIE(InfoExtractor):
_VALID_URL = r'https?://www\.theepochtimes\.com/[\w-]+_(?P<id>\d+)\.html'
_TESTS = [
{
'url': 'https://www.theepochtimes.com/they-can-do-audio-video-physical-surveillance-on-you-24h-365d-a-year-rex-lee-on-intrusive-apps_4661688.html',
'info_dict': {
'id': 'a3dd732c-4750-4bc8-8156-69180668bda1',
'ext': 'mp4',
'title': 'They Can Do Audio, Video, Physical Surveillance on You 24H/365D a Year: Rex Lee on Intrusive Apps',
}
},
{
'url': 'https://www.theepochtimes.com/the-communist-partys-cyberattacks-on-america-explained-rex-lee-talks-tech-hybrid-warfare_4342413.html',
'info_dict': {
'id': '276c7f46-3bbf-475d-9934-b9bbe827cf0a',
'ext': 'mp4',
'title': 'The Communist Partys Cyberattacks on America Explained; Rex Lee Talks Tech Hybrid Warfare',
}
},
{
'url': 'https://www.theepochtimes.com/kash-patel-a-6-year-saga-of-government-corruption-from-russiagate-to-mar-a-lago_4690250.html',
'info_dict': {
'id': 'aa9ceecd-a127-453d-a2de-7153d6fd69b6',
'ext': 'mp4',
'title': 'Kash Patel: A 6-Year-Saga of Government Corruption, From Russiagate to Mar-a-Lago',
}
},
]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
youmaker_video_id = self._search_regex(r'data-trailer="[\w-]+" data-id="([\w-]+)"', webpage, 'url')
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
f'http://vs1.youmaker.com/assets/{youmaker_video_id}/playlist.m3u8', video_id, 'mp4', m3u8_id='hls')
return {
'id': youmaker_video_id,
'formats': formats,
'subtitles': subtitles,
'title': self._html_extract_title(webpage)
}

@@ -0,0 +1,99 @@
from .common import InfoExtractor
from ..utils import traverse_obj
class EurosportIE(InfoExtractor):
_VALID_URL = r'https?://www\.eurosport\.com/\w+/[\w-]+/\d+/[\w-]+_(?P<id>vid\d+)'
_TESTS = [{
'url': 'https://www.eurosport.com/tennis/roland-garros/2022/highlights-rafael-nadal-brushes-aside-caper-ruud-to-win-record-extending-14th-french-open-title_vid1694147/video.shtml',
'info_dict': {
'id': '2480939',
'ext': 'mp4',
'title': 'Highlights: Rafael Nadal brushes aside Caper Ruud to win record-extending 14th French Open title',
'description': 'md5:b564db73ecfe4b14ebbd8e62a3692c76',
'thumbnail': 'https://imgresizer.eurosport.com/unsafe/1280x960/smart/filters:format(jpeg)/origin-imgresizer.eurosport.com/2022/06/05/3388285-69245968-2560-1440.png',
'duration': 195.0,
'display_id': 'vid1694147',
'timestamp': 1654446698,
'upload_date': '20220605',
}
}, {
'url': 'https://www.eurosport.com/tennis/roland-garros/2022/watch-the-top-five-shots-from-men-s-final-as-rafael-nadal-beats-casper-ruud-to-seal-14th-french-open_vid1694283/video.shtml',
'info_dict': {
'id': '2481254',
'ext': 'mp4',
'title': 'md5:149dcc5dfb38ab7352acc008cc9fb071',
'duration': 130.0,
'thumbnail': 'https://imgresizer.eurosport.com/unsafe/1280x960/smart/filters:format(jpeg)/origin-imgresizer.eurosport.com/2022/06/05/3388422-69248708-2560-1440.png',
'description': 'md5:a0c8a7f6b285e48ae8ddbe7aa85cfee6',
'display_id': 'vid1694283',
'timestamp': 1654456090,
'upload_date': '20220605',
}
}, {
# geo-fenced, but can be bypassed by xff
'url': 'https://www.eurosport.com/cycling/tour-de-france-femmes/2022/incredible-ride-marlen-reusser-storms-to-stage-4-win-at-tour-de-france-femmes_vid1722221/video.shtml',
'info_dict': {
'id': '2582552',
'ext': 'mp4',
'title': 'Incredible ride! - Marlen Reusser storms to Stage 4 win at Tour de France Femmes',
'duration': 188.0,
'display_id': 'vid1722221',
'timestamp': 1658936167,
'thumbnail': 'https://imgresizer.eurosport.com/unsafe/1280x960/smart/filters:format(jpeg)/origin-imgresizer.eurosport.com/2022/07/27/3423347-69852108-2560-1440.jpg',
'description': 'md5:32bbe3a773ac132c57fb1e8cca4b7c71',
'upload_date': '20220727',
}
}]
_TOKEN = None
# actually defined in https://netsport.eurosport.io/?variables={"databaseId":<databaseId>,"playoutType":"VDP"}&extensions={"persistedQuery":{"version":1 ..
# but this method requires getting the sha256 hash
_GEO_COUNTRIES = ['DE', 'NL', 'EU', 'IT', 'FR'] # Not a complete list, but it should work
def _real_initialize(self):
if EurosportIE._TOKEN is None:
EurosportIE._TOKEN = self._download_json(
'https://eu3-prod-direct.eurosport.com/token?realm=eurosport', None,
'Trying to get token')['data']['attributes']['token']
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
json_data = self._download_json(
f'https://eu3-prod-direct.eurosport.com/playback/v2/videoPlaybackInfo/sourceSystemId/eurosport-{display_id}',
display_id, query={'usePreAuth': True}, headers={'Authorization': f'Bearer {EurosportIE._TOKEN}'})['data']
json_ld_data = self._search_json_ld(webpage, display_id)
formats, subtitles = [], {}
for stream_type in json_data['attributes']['streaming']:
if stream_type == 'hls':
fmts, subs = self._extract_m3u8_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id, ext='mp4')
elif stream_type == 'dash':
fmts, subs = self._extract_mpd_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id)
elif stream_type == 'mss':
fmts, subs = self._extract_ism_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
self._sort_formats(formats)
return {
'id': json_data['id'],
'title': json_ld_data.get('title') or self._og_search_title(webpage),
'display_id': display_id,
'formats': formats,
'subtitles': subtitles,
'thumbnails': json_ld_data.get('thumbnails'),
'description': (json_ld_data.get('description')
or self._html_search_meta(['og:description', 'description'], webpage)),
'duration': json_ld_data.get('duration'),
'timestamp': json_ld_data.get('timestamp'),
}
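The per-protocol branching in `_real_extract` above can also be expressed as a lookup table. The following is only a sketch under stated assumptions: `parse_hls`, `parse_dash` and `parse_mss` are hypothetical stand-ins for the extractor's `_extract_*_formats_and_subtitles` helpers, and the `streaming` argument imitates the shape read from `json_data['attributes']['streaming']`. The sketch explicitly skips stream types it does not recognise.

```python
# Hypothetical stand-ins for the extractor's manifest helpers; each returns
# a (formats, subtitles) pair like the _extract_*_formats_and_subtitles methods.
def parse_hls(url):
    return [{'format_id': 'hls', 'url': url}], {}


def parse_dash(url):
    return [{'format_id': 'dash', 'url': url}], {}


def parse_mss(url):
    return [{'format_id': 'mss', 'url': url}], {}


PARSERS = {'hls': parse_hls, 'dash': parse_dash, 'mss': parse_mss}


def collect_formats(streaming):
    """Merge formats and subtitles from every recognised stream type."""
    formats, subtitles = [], {}
    for stream_type, stream in streaming.items():
        parser = PARSERS.get(stream_type)
        url = (stream or {}).get('url')
        if not parser or not url:
            continue  # unknown protocol or missing URL: nothing to add
        fmts, subs = parser(url)
        formats.extend(fmts)
        for lang, tracks in subs.items():
            subtitles.setdefault(lang, []).extend(tracks)
    return formats, subtitles
```

Adding another protocol then only needs a new entry in `PARSERS`.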

@@ -3,7 +3,6 @@ import re
import urllib.parse import urllib.parse
import xml.etree.ElementTree import xml.etree.ElementTree
from . import gen_extractor_classes
from .common import InfoExtractor # isort: split from .common import InfoExtractor # isort: split
from .brightcove import BrightcoveLegacyIE, BrightcoveNewIE from .brightcove import BrightcoveLegacyIE, BrightcoveNewIE
from .commonprotocols import RtmpIE from .commonprotocols import RtmpIE
@@ -26,6 +25,7 @@ from ..utils import (
parse_resolution, parse_resolution,
smuggle_url, smuggle_url,
str_or_none, str_or_none,
traverse_obj,
try_call, try_call,
unescapeHTML, unescapeHTML,
unified_timestamp, unified_timestamp,
@@ -2805,7 +2805,7 @@ class GenericIE(InfoExtractor):
self._downloader.write_debug('Looking for embeds') self._downloader.write_debug('Looking for embeds')
embeds = [] embeds = []
for ie in gen_extractor_classes(): for ie in self._downloader._ies.values():
gen = ie.extract_from_webpage(self._downloader, url, webpage) gen = ie.extract_from_webpage(self._downloader, url, webpage)
current_embeds = [] current_embeds = []
try: try:
@@ -2840,8 +2840,9 @@ class GenericIE(InfoExtractor):
try: try:
info = self._parse_jwplayer_data( info = self._parse_jwplayer_data(
jwplayer_data, video_id, require_title=False, base_url=url) jwplayer_data, video_id, require_title=False, base_url=url)
self.report_detected('JW Player data') if traverse_obj(info, 'formats', ('entries', ..., 'formats')):
return merge_dicts(info, info_dict) self.report_detected('JW Player data')
return merge_dicts(info, info_dict)
except ExtractorError: except ExtractorError:
# See https://github.com/ytdl-org/youtube-dl/pull/16735 # See https://github.com/ytdl-org/youtube-dl/pull/16735
pass pass

@@ -6,7 +6,6 @@ from ..compat import compat_urlparse, compat_b64decode
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none, int_or_none,
js_to_json,
str_or_none, str_or_none,
try_get, try_get,
unescapeHTML, unescapeHTML,
@@ -55,11 +54,7 @@ class HuyaLiveIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id=video_id) webpage = self._download_webpage(url, video_id=video_id)
json_stream = self._search_regex(r'"stream":\s+"([a-zA-Z0-9+=/]+)"', webpage, 'stream', default=None) stream_data = self._search_json(r'stream:\s+', webpage, 'stream', video_id=video_id, default=None)
if not json_stream:
raise ExtractorError('Video is offline', expected=True)
stream_data = self._parse_json(compat_b64decode(json_stream).decode(), video_id=video_id,
transform_source=js_to_json)
room_info = try_get(stream_data, lambda x: x['data'][0]['gameLiveInfo']) room_info = try_get(stream_data, lambda x: x['data'][0]['gameLiveInfo'])
if not room_info: if not room_info:
raise ExtractorError('Can not extract the room info', expected=True) raise ExtractorError('Can not extract the room info', expected=True)
@@ -67,6 +62,8 @@ class HuyaLiveIE(InfoExtractor):
screen_type = room_info.get('screenType') screen_type = room_info.get('screenType')
live_source_type = room_info.get('liveSourceType') live_source_type = room_info.get('liveSourceType')
stream_info_list = stream_data['data'][0]['gameStreamInfoList'] stream_info_list = stream_data['data'][0]['gameStreamInfoList']
if not stream_info_list:
raise ExtractorError('Video is offline', expected=True)
formats = [] formats = []
for stream_info in stream_info_list: for stream_info in stream_info_list:
stream_url = stream_info.get('sFlvUrl') stream_url = stream_info.get('sFlvUrl')

@@ -39,37 +39,42 @@ class InstagramBaseIE(InfoExtractor):
_NETRC_MACHINE = 'instagram' _NETRC_MACHINE = 'instagram'
_IS_LOGGED_IN = False _IS_LOGGED_IN = False
_API_BASE_URL = 'https://i.instagram.com/api/v1'
_LOGIN_URL = 'https://www.instagram.com/accounts/login'
_API_HEADERS = {
'X-IG-App-ID': '936619743392459',
'X-ASBD-ID': '198387',
'X-IG-WWW-Claim': '0',
'Origin': 'https://www.instagram.com',
'Accept': '*/*',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36',
}
def _perform_login(self, username, password): def _perform_login(self, username, password):
if self._IS_LOGGED_IN: if self._IS_LOGGED_IN:
return return
login_webpage = self._download_webpage( login_webpage = self._download_webpage(
'https://www.instagram.com/accounts/login/', None, self._LOGIN_URL, None, note='Downloading login webpage', errnote='Failed to download login webpage')
note='Downloading login webpage', errnote='Failed to download login webpage')
shared_data = self._parse_json( shared_data = self._parse_json(self._search_regex(
self._search_regex( r'window\._sharedData\s*=\s*({.+?});', login_webpage, 'shared data', default='{}'), None)
r'window\._sharedData\s*=\s*({.+?});',
login_webpage, 'shared data', default='{}'),
None)
login = self._download_json('https://www.instagram.com/accounts/login/ajax/', None, note='Logging in', headers={ login = self._download_json(
'Accept': '*/*', f'{self._LOGIN_URL}/ajax/', None, note='Logging in', headers={
'X-IG-App-ID': '936619743392459', **self._API_HEADERS,
'X-ASBD-ID': '198387', 'X-Requested-With': 'XMLHttpRequest',
'X-IG-WWW-Claim': '0', 'X-CSRFToken': shared_data['config']['csrf_token'],
'X-Requested-With': 'XMLHttpRequest', 'X-Instagram-AJAX': shared_data['rollout_hash'],
'X-CSRFToken': shared_data['config']['csrf_token'], 'Referer': 'https://www.instagram.com/',
'X-Instagram-AJAX': shared_data['rollout_hash'], }, data=urlencode_postdata({
'Referer': 'https://www.instagram.com/', 'enc_password': f'#PWD_INSTAGRAM_BROWSER:0:{int(time.time())}:{password}',
}, data=urlencode_postdata({ 'username': username,
'enc_password': f'#PWD_INSTAGRAM_BROWSER:0:{int(time.time())}:{password}', 'queryParams': '{}',
'username': username, 'optIntoOneTap': 'false',
'queryParams': '{}', 'stopDeletionNonce': '',
'optIntoOneTap': 'false', 'trustedDeviceRecords': '{}',
'stopDeletionNonce': '', }))
'trustedDeviceRecords': '{}',
}))
if not login.get('authenticated'): if not login.get('authenticated'):
if login.get('message'): if login.get('message'):
@@ -134,7 +139,7 @@ class InstagramBaseIE(InfoExtractor):
} }
def _extract_product_media(self, product_media): def _extract_product_media(self, product_media):
media_id = product_media.get('code') or product_media.get('id') media_id = product_media.get('code') or _pk_to_id(product_media.get('pk'))
vcodec = product_media.get('video_codec') vcodec = product_media.get('video_codec')
dash_manifest_raw = product_media.get('video_dash_manifest') dash_manifest_raw = product_media.get('video_dash_manifest')
videos_list = product_media.get('video_versions') videos_list = product_media.get('video_versions')
@@ -179,7 +184,7 @@ class InstagramBaseIE(InfoExtractor):
user_info = product_info.get('user') or {} user_info = product_info.get('user') or {}
info_dict = { info_dict = {
'id': product_info.get('code') or product_info.get('id'), 'id': product_info.get('code') or _pk_to_id(product_info.get('pk')),
'title': product_info.get('title') or f'Video by {user_info.get("username")}', 'title': product_info.get('title') or f'Video by {user_info.get("username")}',
'description': traverse_obj(product_info, ('caption', 'text'), expected_type=str_or_none), 'description': traverse_obj(product_info, ('caption', 'text'), expected_type=str_or_none),
'timestamp': int_or_none(product_info.get('taken_at')), 'timestamp': int_or_none(product_info.get('taken_at')),
@@ -360,49 +365,74 @@ class InstagramIE(InstagramBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
video_id, url = self._match_valid_url(url).group('id', 'url') video_id, url = self._match_valid_url(url).group('id', 'url')
general_info = self._download_json( media, webpage = {}, ''
f'https://www.instagram.com/graphql/query/?query_hash=9f8827793ef34641b2fb195d4d41151c'
f'&variables=%7B"shortcode":"{video_id}",' api_check = self._download_json(
'"parent_comment_count":10,"has_threaded_comments":true}', video_id, fatal=False, errnote=False, f'{self._API_BASE_URL}/web/get_ruling_for_content/?content_type=MEDIA&target_id={_id_to_pk(video_id)}',
headers={ video_id, headers=self._API_HEADERS, fatal=False, note='Setting up session', errnote=False) or {}
'Accept': '*', csrf_token = self._get_cookies('https://www.instagram.com').get('csrftoken')
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
'Authority': 'www.instagram.com', if not csrf_token:
'Referer': 'https://www.instagram.com', self.report_warning('No csrf token set by Instagram API', video_id)
'x-ig-app-id': '936619743392459', elif api_check.get('status') != 'ok':
}) self.report_warning('Instagram API is not granting access', video_id)
media = traverse_obj(general_info, ('data', 'shortcode_media')) or {} else:
if self._get_cookies(url).get('sessionid'):
media.update(traverse_obj(self._download_json(
f'{self._API_BASE_URL}/media/{_id_to_pk(video_id)}/info/', video_id,
fatal=False, note='Downloading video info', headers={
**self._API_HEADERS,
'X-CSRFToken': csrf_token.value,
}), ('items', 0)) or {})
if media:
return self._extract_product(media)
variables = {
'shortcode': video_id,
'child_comment_count': 3,
'fetch_comment_count': 40,
'parent_comment_count': 24,
'has_threaded_comments': True,
}
general_info = self._download_json(
'https://www.instagram.com/graphql/query/', video_id, fatal=False,
headers={
**self._API_HEADERS,
'X-CSRFToken': csrf_token.value,
'X-Requested-With': 'XMLHttpRequest',
'Referer': url,
}, query={
'query_hash': '9f8827793ef34641b2fb195d4d41151c',
'variables': json.dumps(variables, separators=(',', ':')),
})
media.update(traverse_obj(general_info, ('data', 'shortcode_media')) or {})
if not media: if not media:
self.report_warning('General metadata extraction failed', video_id) self.report_warning('General metadata extraction failed (some metadata might be missing).', video_id)
webpage, urlh = self._download_webpage_handle(url, video_id)
shared_data = self._search_json(
r'window\._sharedData\s*=', webpage, 'shared data', video_id, fatal=False) or {}
info = self._download_json( if shared_data and self._LOGIN_URL not in urlh.geturl():
f'https://i.instagram.com/api/v1/media/{_id_to_pk(video_id)}/info/', video_id, media.update(traverse_obj(
fatal=False, note='Downloading video info', errnote=False, headers={ shared_data, ('entry_data', 'PostPage', 0, 'graphql', 'shortcode_media'),
'Accept': '*', ('entry_data', 'PostPage', 0, 'media'), expected_type=dict) or {})
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36', else:
'Authority': 'www.instagram.com', self.report_warning('Main webpage is locked behind the login page. Retrying with embed webpage')
'Referer': 'https://www.instagram.com', webpage = self._download_webpage(
'x-ig-app-id': '936619743392459', f'{url}/embed/', video_id, note='Downloading embed webpage', fatal=False)
}) additional_data = self._search_json(
if info: r'window\.__additionalDataLoaded\s*\(\s*[^,]+,\s*', webpage, 'additional data', video_id, fatal=False)
media.update(info['items'][0]) if not additional_data:
return self._extract_product(media) self.raise_login_required('Requested content is not available, rate-limit reached or login required')
webpage = self._download_webpage( product_item = traverse_obj(additional_data, ('items', 0), expected_type=dict)
f'https://www.instagram.com/p/{video_id}/embed/', video_id, if product_item:
note='Downloading embed webpage', fatal=False) media.update(product_item)
if not webpage: return self._extract_product(media)
self.raise_login_required('Requested content was not found, the content might be private')
additional_data = self._search_json( media.update(traverse_obj(
r'window\.__additionalDataLoaded\s*\(\s*[^,]+,\s*', webpage, 'additional data', video_id, fatal=False) additional_data, ('graphql', 'shortcode_media'), 'shortcode_media', expected_type=dict) or {})
product_item = traverse_obj(additional_data, ('items', 0), expected_type=dict)
if product_item:
media.update(product_item)
return self._extract_product(media)
media.update(traverse_obj(
additional_data, ('graphql', 'shortcode_media'), 'shortcode_media', expected_type=dict) or {})
username = traverse_obj(media, ('owner', 'username')) or self._search_regex( username = traverse_obj(media, ('owner', 'username')) or self._search_regex(
r'"owner"\s*:\s*{\s*"username"\s*:\s*"(.+?)"', webpage, 'username', fatal=False) r'"owner"\s*:\s*{\s*"username"\s*:\s*"(.+?)"', webpage, 'username', fatal=False)
@@ -649,12 +679,8 @@ class InstagramStoryIE(InstagramBaseIE):
story_info_url = user_id if username != 'highlights' else f'highlight:{story_id}' story_info_url = user_id if username != 'highlights' else f'highlight:{story_id}'
videos = traverse_obj(self._download_json( videos = traverse_obj(self._download_json(
f'https://i.instagram.com/api/v1/feed/reels_media/?reel_ids={story_info_url}', f'{self._API_BASE_URL}/feed/reels_media/?reel_ids={story_info_url}',
story_id, errnote=False, fatal=False, headers={ story_id, errnote=False, fatal=False, headers=self._API_HEADERS), 'reels')
'X-IG-App-ID': 936619743392459,
'X-ASBD-ID': 198387,
'X-IG-WWW-Claim': 0,
}), 'reels')
if not videos: if not videos:
self.raise_login_required('You need to log in to access this content') self.raise_login_required('You need to log in to access this content')
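The reworked GraphQL request in the `InstagramIE` hunk above passes its `variables` as compact JSON inside the query string. A minimal sketch of that serialization step is shown below. It assumes the standard library's `urlencode` as a stand-in for the `query=` argument of `_download_json`, and `build_graphql_url` is a hypothetical helper name used only for illustration.

```python
import json
from urllib.parse import urlencode


def build_graphql_url(shortcode):
    """Build the query URL with compact JSON variables (illustrative only)."""
    variables = {
        'shortcode': shortcode,
        'child_comment_count': 3,
        'fetch_comment_count': 40,
        'parent_comment_count': 24,
        'has_threaded_comments': True,
    }
    query = {
        'query_hash': '9f8827793ef34641b2fb195d4d41151c',
        # separators=(',', ':') drops the spaces json.dumps adds by default
        'variables': json.dumps(variables, separators=(',', ':')),
    }
    return 'https://www.instagram.com/graphql/query/?' + urlencode(query)
```

Building the dictionary first and serializing it afterwards avoids the hand-escaped `%7B"shortcode":...` string that the old code assembled inline.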

@@ -0,0 +1,82 @@
import re
from .common import InfoExtractor
from ..utils import traverse_obj, urljoin
class IslamChannelIE(InfoExtractor):
_VALID_URL = r'https?://watch\.islamchannel\.tv/watch/(?P<id>\d+)'
_TESTS = [{
'url': 'https://watch.islamchannel.tv/watch/38604310',
'info_dict': {
'id': '38604310',
'title': 'Omar - Young Omar',
'description': 'md5:5cc7ddecef064ea7afe52eb5e0e33b55',
'thumbnail': r're:https?://.+',
'ext': 'mp4',
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
thumbnail = self._search_regex(
r'data-poster="([^"]+)"', webpage, 'data poster', fatal=False) or \
self._html_search_meta(('og:image', 'twitter:image'), webpage)
headers = {
'Token': self._search_regex(r'data-token="([^"]+)"', webpage, 'data token'),
'Token-Expiry': self._search_regex(r'data-expiry="([^"]+)"', webpage, 'data expiry'),
'Uvid': video_id,
}
show_stream = self._download_json(
f'https://v2-streams-elb.simplestreamcdn.com/api/show/stream/{video_id}', video_id,
query={
'key': self._search_regex(r'data-key="([^"]+)"', webpage, 'data key'),
'platform': 'chrome',
}, headers=headers)
# TODO: show_stream['stream'] and show_stream['drm'] may contain something interesting
streams = self._download_json(
traverse_obj(show_stream, ('response', 'tokenization', 'url')), video_id,
headers=headers)
formats, subs = self._extract_m3u8_formats_and_subtitles(traverse_obj(streams, ('Streams', 'Adaptive')), video_id, 'mp4')
self._sort_formats(formats)
return {
'id': video_id,
'title': self._html_search_meta(('og:title', 'twitter:title'), webpage),
'description': self._html_search_meta(('og:description', 'twitter:description', 'description'), webpage),
'formats': formats,
'subtitles': subs,
'thumbnails': [{
'id': 'unscaled',
'url': thumbnail.split('?')[0],
'ext': 'jpg',
'preference': 2,
}, {
'id': 'orig',
'url': thumbnail,
'ext': 'jpg',
'preference': 1,
}] if thumbnail else None,
}
class IslamChannelSeriesIE(InfoExtractor):
_VALID_URL = r'https?://watch\.islamchannel\.tv/series/(?P<id>[a-f\d-]+)'
_TESTS = [{
'url': 'https://watch.islamchannel.tv/series/a6cccef3-3ef1-11eb-bc19-06b69c2357cd',
'info_dict': {
'id': 'a6cccef3-3ef1-11eb-bc19-06b69c2357cd',
},
'playlist_mincount': 31,
}]
def _real_extract(self, url):
pl_id = self._match_id(url)
webpage = self._download_webpage(url, pl_id)
return self.playlist_from_matches(
re.finditer(r'<a\s+href="(/watch/\d+)"[^>]+?data-video-type="show">', webpage),
pl_id, getter=lambda x: urljoin(url, x.group(1)), ie=IslamChannelIE)
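The series extractor above collects relative `/watch/<id>` links and resolves them against the page URL. The sketch below mirrors that step with the standard library only. It assumes `urllib.parse.urljoin` behaves like the `urljoin` utility imported in the diff for this case, the sample markup in the usage note is invented, and `find_episode_urls` is a hypothetical name.

```python
import re
from urllib.parse import urljoin


def find_episode_urls(page_url, webpage):
    """Return absolute watch URLs for every anchor marked as a show."""
    return [
        urljoin(page_url, mobj.group(1))
        for mobj in re.finditer(
            r'<a\s+href="(/watch/\d+)"[^>]+?data-video-type="show">', webpage)
    ]
```

For example, calling it with `'https://watch.islamchannel.tv/series/x'` and markup containing `<a href="/watch/38604310" class="card" data-video-type="show">` yields a list with `https://watch.islamchannel.tv/watch/38604310`.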

@@ -8,15 +8,33 @@ from ..utils import (
float_or_none, float_or_none,
int_or_none, int_or_none,
str_or_none, str_or_none,
try_get, traverse_obj,
) )
class MedalTVIE(InfoExtractor): class MedalTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?medal\.tv/clips/(?P<id>[^/?#&]+)' _VALID_URL = r'https?://(?:www\.)?medal\.tv/(?P<path>games/[^/?#&]+/clips)/(?P<id>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://medal.tv/clips/2mA60jWAGQCBH', 'url': 'https://medal.tv/games/valorant/clips/jTBFnLKdLy15K',
'md5': '7b07b064331b1cf9e8e5c52a06ae68fa', 'md5': '6930f8972914b6b9fdc2bb3918098ba0',
'info_dict': {
'id': 'jTBFnLKdLy15K',
'ext': 'mp4',
'title': "Mornu's clutch",
'description': '',
'uploader': 'Aciel',
'timestamp': 1651628243,
'upload_date': '20220504',
'uploader_id': '19335460',
'uploader_url': 'https://medal.tv/users/19335460',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 13,
}
}, {
'url': 'https://medal.tv/games/cod%20cold%20war/clips/2mA60jWAGQCBH',
'md5': '3d19d426fe0b2d91c26e412684e66a06',
'info_dict': { 'info_dict': {
'id': '2mA60jWAGQCBH', 'id': '2mA60jWAGQCBH',
'ext': 'mp4', 'ext': 'mp4',
@@ -26,9 +44,15 @@ class MedalTVIE(InfoExtractor):
'timestamp': 1603165266, 'timestamp': 1603165266,
'upload_date': '20201020', 'upload_date': '20201020',
'uploader_id': '10619174', 'uploader_id': '10619174',
'thumbnail': 'https://cdn.medal.tv/10619174/thumbnail-34934644-720p.jpg?t=1080p&c=202042&missing',
'uploader_url': 'https://medal.tv/users/10619174',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 23,
} }
}, { }, {
'url': 'https://medal.tv/clips/2um24TWdty0NA', 'url': 'https://medal.tv/games/cod%20cold%20war/clips/2um24TWdty0NA',
'md5': 'b6dc76b78195fff0b4f8bf4a33ec2148', 'md5': 'b6dc76b78195fff0b4f8bf4a33ec2148',
'info_dict': { 'info_dict': {
'id': '2um24TWdty0NA', 'id': '2um24TWdty0NA',
@@ -39,25 +63,42 @@ class MedalTVIE(InfoExtractor):
'timestamp': 1605580939, 'timestamp': 1605580939,
'upload_date': '20201117', 'upload_date': '20201117',
'uploader_id': '5156321', 'uploader_id': '5156321',
'thumbnail': 'https://cdn.medal.tv/5156321/thumbnail-36787208-360p.jpg?t=1080p&c=202046&missing',
'uploader_url': 'https://medal.tv/users/5156321',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 9,
} }
}, { }, {
'url': 'https://medal.tv/clips/37rMeFpryCC-9', 'url': 'https://medal.tv/games/valorant/clips/37rMeFpryCC-9',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'https://medal.tv/clips/2WRj40tpY_EU9', 'url': 'https://medal.tv/games/valorant/clips/2WRj40tpY_EU9',
'only_matching': True, 'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
path = self._match_valid_url(url).group('path')
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
hydration_data = self._parse_json(self._search_regex( next_data = self._search_json(
r'<script[^>]*>\s*(?:var\s*)?hydrationData\s*=\s*({.+?})\s*</script>', '<script[^>]*__NEXT_DATA__[^>]*>', webpage,
webpage, 'hydration data', default='{}'), video_id) 'next data', video_id, end_pattern='</script>', fatal=False)
clip = try_get( build_id = next_data.get('buildId')
hydration_data, lambda x: x['clips'][video_id], dict) or {} if not build_id:
raise ExtractorError(
'Could not find build ID.', video_id=video_id)
locale = next_data.get('locale', 'en')
api_response = self._download_json(
f'https://medal.tv/_next/data/{build_id}/{locale}/{path}/{video_id}.json', video_id)
clip = traverse_obj(api_response, ('pageProps', 'clip')) or {}
if not clip: if not clip:
raise ExtractorError( raise ExtractorError(
'Could not find video information.', video_id=video_id) 'Could not find video information.', video_id=video_id)
@@ -113,9 +154,8 @@ class MedalTVIE(InfoExtractor):
# Necessary because the id of the author is not known in advance. # Necessary because the id of the author is not known in advance.
# Won't raise an issue if no profile can be found as this is optional. # Won't raise an issue if no profile can be found as this is optional.
author = try_get( author = traverse_obj(api_response, ('pageProps', 'profile')) or {}
hydration_data, lambda x: list(x['profiles'].values())[0], dict) or {} author_id = str_or_none(author.get('userId'))
author_id = str_or_none(author.get('id'))
author_url = format_field(author_id, None, 'https://medal.tv/users/%s') author_url = format_field(author_id, None, 'https://medal.tv/users/%s')
return { return {

@@ -172,31 +172,27 @@ class MediasetIE(ThePlatformBaseIE):
}] }]
def _extract_from_webpage(self, url, webpage): def _extract_from_webpage(self, url, webpage):
def _qs(url):
return parse_qs(url)
def _program_guid(qs): def _program_guid(qs):
return qs.get('programGuid', [None])[0] return qs.get('programGuid', [None])[0]
entries = []
for mobj in re.finditer( for mobj in re.finditer(
r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//(?:www\.)?video\.mediaset\.it/player/playerIFrame(?:Twitter)?\.shtml.*?)\1', r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//(?:www\.)?video\.mediaset\.it/player/playerIFrame(?:Twitter)?\.shtml.*?)\1',
webpage): webpage):
embed_url = mobj.group('url') embed_url = mobj.group('url')
embed_qs = _qs(embed_url) embed_qs = parse_qs(embed_url)
program_guid = _program_guid(embed_qs) program_guid = _program_guid(embed_qs)
if program_guid: if program_guid:
entries.append(embed_url) yield self.url_result(embed_url)
continue continue
video_id = embed_qs.get('id', [None])[0] video_id = embed_qs.get('id', [None])[0]
if not video_id: if not video_id:
continue continue
urlh = self._request_webpage(embed_url, video_id, note='Following embed URL redirect') urlh = self._request_webpage(embed_url, video_id, note='Following embed URL redirect')
embed_url = urlh.geturl() embed_url = urlh.geturl()
program_guid = _program_guid(_qs(embed_url)) program_guid = _program_guid(parse_qs(embed_url))
if program_guid: if program_guid:
entries.append(embed_url) yield self.url_result(embed_url)
return entries
def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None): def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None):
for video in smil.findall(self._xpath_ns('.//video', namespace)): for video in smil.findall(self._xpath_ns('.//video', namespace)):
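The `_extract_from_webpage` change above turns a list-building function into a generator that yields each embed as soon as it is found. The sketch below illustrates only the first branch (a `programGuid` already present in the iframe URL) and leaves out the redirect-following step, which needs network access. It assumes the standard library's `parse_qs` on the split query string in place of the `parse_qs` utility used in the diff, and `iter_mediaset_embeds` is a hypothetical name.

```python
import re
from urllib.parse import parse_qs, urlsplit


def program_guid(url):
    """Return the programGuid query parameter, or None if absent."""
    return parse_qs(urlsplit(url).query).get('programGuid', [None])[0]


def iter_mediaset_embeds(webpage):
    """Lazily yield iframe URLs that already carry a programGuid."""
    for mobj in re.finditer(
            r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//(?:www\.)?video\.mediaset\.it'
            r'/player/playerIFrame(?:Twitter)?\.shtml.*?)\1', webpage):
        embed_url = mobj.group('url')
        if program_guid(embed_url):
            yield embed_url
```

A caller that only needs the first embed can stop iterating early, which is not possible when the whole list is built up front.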

@@ -159,6 +159,7 @@ class MixcloudIE(MixcloudBaseIE):
formats.append({ formats.append({
'format_id': 'http', 'format_id': 'http',
'url': decrypted, 'url': decrypted,
'vcodec': 'none',
'downloader_options': { 'downloader_options': {
# Mixcloud starts throttling at >~5M # Mixcloud starts throttling at >~5M
'http_chunk_size': 5242880, 'http_chunk_size': 5242880,

@@ -0,0 +1,54 @@
import re
from .common import InfoExtractor
from ..utils import ExtractorError
class NewsPicksIE(InfoExtractor):
_VALID_URL = r'https://newspicks\.com/movie-series/(?P<channel_id>\d+)\?movieId=(?P<id>\d+)'
_TESTS = [{
'url': 'https://newspicks.com/movie-series/11?movieId=1813',
'info_dict': {
'id': '1813',
'title': '日本の課題を破壊せよ【ゲスト:成田悠輔】',
'description': 'md5:09397aad46d6ded6487ff13f138acadf',
'channel': 'HORIE ONE',
'channel_id': '11',
'release_date': '20220117',
'thumbnail': r're:https://.+jpg',
'ext': 'mp4',
},
}]
def _real_extract(self, url):
video_id, channel_id = self._match_valid_url(url).group('id', 'channel_id')
webpage = self._download_webpage(url, video_id)
entries = self._parse_html5_media_entries(
url, webpage.replace('movie-for-pc', 'movie'), video_id, 'hls')
if not entries:
raise ExtractorError('No HTML5 media elements found')
info = entries[0]
self._sort_formats(info['formats'])
title = self._html_search_meta('og:title', webpage, fatal=False)
description = self._html_search_meta(
('og:description', 'twitter:title'), webpage, fatal=False)
channel = self._html_search_regex(
r'value="11".+?<div\s+class="title">(.+?)</div', webpage, 'channel name', fatal=False)
if not title or not channel:
title, channel = re.split(r'\s*\|\s*', self._html_extract_title(webpage))
release_date = self._search_regex(
r'<span\s+class="on-air-date">\s*(\d+)年(\d+)月(\d+)日\s*</span>',
webpage, 'release date', fatal=False, group=(1, 2, 3))
info.update({
'id': video_id,
'title': title,
'description': description,
'channel': channel,
'channel_id': channel_id,
'release_date': ('%04d%02d%02d' % tuple(map(int, release_date))) if release_date else None,
})
return info
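The `release_date` handling in the extractor above converts a Japanese-style date such as `2022年1月17日` into the `YYYYMMDD` form. A small self-contained sketch of that conversion is given below; `format_release_date` is a hypothetical helper name and the regular expression is a simplified version of the one in the diff.

```python
import re


def format_release_date(text):
    """Convert 'Y年M月D日' into 'YYYYMMDD', or return None if no date is found."""
    mobj = re.search(r'(\d+)年(\d+)月(\d+)日', text)
    if not mobj:
        return None
    # Zero-pad month and day so the result is always eight digits
    return '%04d%02d%02d' % tuple(map(int, mobj.groups()))
```

With the date from the test case, `format_release_date('2022年1月17日')` gives `'20220117'`.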

@@ -1,3 +1,4 @@
import collections
import contextlib import contextlib
import json import json
import os import os
@@ -9,8 +10,10 @@ from ..utils import (
ExtractorError, ExtractorError,
Popen, Popen,
check_executable, check_executable,
format_field,
get_exe_version, get_exe_version,
is_outdated_version, is_outdated_version,
shell_quote,
) )
@@ -49,7 +52,9 @@ class PhantomJSwrapper:
This class is experimental. This class is experimental.
""" """
_TEMPLATE = r''' INSTALL_HINT = 'Please download it from https://phantomjs.org/download.html'
_BASE_JS = R'''
phantom.onError = function(msg, trace) {{ phantom.onError = function(msg, trace) {{
var msgStack = ['PHANTOM ERROR: ' + msg]; var msgStack = ['PHANTOM ERROR: ' + msg];
if(trace && trace.length) {{ if(trace && trace.length) {{
@@ -62,6 +67,9 @@ class PhantomJSwrapper:
console.error(msgStack.join('\n')); console.error(msgStack.join('\n'));
phantom.exit(1); phantom.exit(1);
}}; }};
'''
_TEMPLATE = R'''
var page = require('webpage').create(); var page = require('webpage').create();
var fs = require('fs'); var fs = require('fs');
var read = {{ mode: 'r', charset: 'utf-8' }}; var read = {{ mode: 'r', charset: 'utf-8' }};
@@ -104,8 +112,7 @@ class PhantomJSwrapper:
self.exe = check_executable('phantomjs', ['-v']) self.exe = check_executable('phantomjs', ['-v'])
if not self.exe: if not self.exe:
raise ExtractorError( raise ExtractorError(f'PhantomJS not found, {self.INSTALL_HINT}', expected=True)
'PhantomJS not found, Please download it from https://phantomjs.org/download.html', expected=True)
self.extractor = extractor self.extractor = extractor
@@ -116,14 +123,18 @@ class PhantomJSwrapper:
'Your copy of PhantomJS is outdated, update it to version ' 'Your copy of PhantomJS is outdated, update it to version '
'%s or newer if you encounter any errors.' % required_version) '%s or newer if you encounter any errors.' % required_version)
self.options = {
'timeout': timeout,
}
for name in self._TMP_FILE_NAMES: for name in self._TMP_FILE_NAMES:
             tmp = tempfile.NamedTemporaryFile(delete=False)
             tmp.close()
             self._TMP_FILES[name] = tmp
+        self.options = collections.ChainMap({
+            'timeout': timeout,
+        }, {
+            x: self._TMP_FILES[x].name.replace('\\', '\\\\').replace('"', '\\"')
+            for x in self._TMP_FILE_NAMES
+        })
     def __del__(self):
         for name in self._TMP_FILE_NAMES:
             with contextlib.suppress(OSError, KeyError):
@@ -194,31 +205,39 @@ class PhantomJSwrapper:
         self._save_cookies(url)
-        replaces = self.options
-        replaces['url'] = url
         user_agent = headers.get('User-Agent') or self.extractor.get_param('http_headers')['User-Agent']
-        replaces['ua'] = user_agent.replace('"', '\\"')
-        replaces['jscode'] = jscode
+        jscode = self._TEMPLATE.format_map(self.options.new_child({
+            'url': url,
+            'ua': user_agent.replace('"', '\\"'),
+            'jscode': jscode,
+        }))
-        for x in self._TMP_FILE_NAMES:
-            replaces[x] = self._TMP_FILES[x].name.replace('\\', '\\\\').replace('"', '\\"')
-        with open(self._TMP_FILES['script'].name, 'wb') as f:
-            f.write(self._TEMPLATE.format(**replaces).encode('utf-8'))
-        if video_id is None:
-            self.extractor.to_screen(f'{note2}')
-        else:
-            self.extractor.to_screen(f'{video_id}: {note2}')
-        stdout, stderr, returncode = Popen.run(
-            [self.exe, '--ssl-protocol=any', self._TMP_FILES['script'].name],
-            text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
-        if returncode:
-            raise ExtractorError(f'Executing JS failed:\n{stderr}')
+        # `note` is keyword-only in execute(), so it must be passed by name
+        stdout = self.execute(jscode, video_id, note=note2)
         with open(self._TMP_FILES['html'].name, 'rb') as f:
             html = f.read().decode('utf-8')
         self._load_cookies()
         return html, stdout
+    def execute(self, jscode, video_id=None, *, note='Executing JS'):
+        """Execute JS and return stdout"""
+        if 'phantom.exit();' not in jscode:
+            jscode += ';\nphantom.exit();'
+        jscode = self._BASE_JS + jscode
+        with open(self._TMP_FILES['script'].name, 'w', encoding='utf-8') as f:
+            f.write(jscode)
+        self.extractor.to_screen(f'{format_field(video_id, None, "%s: ")}{note}')
+        cmd = [self.exe, '--ssl-protocol=any', self._TMP_FILES['script'].name]
+        self.extractor.write_debug(f'PhantomJS command line: {shell_quote(cmd)}')
+        try:
+            stdout, stderr, returncode = Popen.run(cmd, timeout=self.options['timeout'] / 1000,
+                                                   text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+        except Exception as e:
+            raise ExtractorError(f'{note} failed: Unable to run PhantomJS binary', cause=e)
+        if returncode:
+            raise ExtractorError(f'{note} failed with returncode {returncode}:\n{stderr.strip()}')
+        return stdout
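
Reviewer note: the PhantomJS change stops mutating a shared `replaces` dict and instead layers per-call values on top of `self.options` with `ChainMap.new_child`. The following is a minimal standalone sketch of that pattern, not yt-dlp code; `TEMPLATE`, `options`, `render`, and the option values are made up for illustration.

```python
import collections

# Long-lived options, analogous in spirit to self.options in the diff (values are made up).
options = collections.ChainMap({'timeout': 10000}, {'script': '/tmp/script.js'})

# Hypothetical template; the real _TEMPLATE is a much larger JS snippet.
TEMPLATE = 'page.open("{url}", timeout={timeout}, script="{script}")'


def render(url):
    # new_child() puts a fresh dict in front of the chain, so the per-call
    # 'url' entry never leaks into the shared `options` mapping.
    return TEMPLATE.format_map(options.new_child({'url': url}))


rendered = render('https://example.com')
print(rendered)
```

Because the per-call values live only in the child mapping, repeated calls cannot see a stale `url` from an earlier call, which is presumably the motivation for the change.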

@@ -156,7 +156,7 @@ class RaiBaseIE(InfoExtractor):
             br = int_or_none(tbr)
             if len(fmts) == 1 and not br:
                 br = fmts[0].get('tbr')
-            if br or 0 > 300:
+            if br and br > 300:
                 tbr = compat_str(math.floor(br / 100) * 100)
             else:
                 tbr = '250'
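
Reviewer note: the Rai one-liner is an operator-precedence fix. In Python, comparisons bind tighter than `or`, so `br or 0 > 300` is evaluated as `br or (0 > 300)`, which is just the truthiness of `br`. A small sketch of the difference, using hypothetical helper names `old_check` and `new_check` that do not exist in yt-dlp:

```python
def old_check(br):
    # Evaluated as `br or (0 > 300)`: true for any non-zero bitrate.
    return bool(br or 0 > 300)


def new_check(br):
    # True only when a bitrate is present and actually exceeds 300.
    return bool(br and br > 300)


for br in (None, 0, 250, 800):
    print(br, old_check(br), new_check(br))
```

With the old condition, a bitrate of 250 would take the `math.floor(br / 100) * 100` branch instead of falling back to `'250'`.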

@@ -11,6 +11,7 @@ from ..utils import (
     int_or_none,
     strip_or_none,
     traverse_obj,
+    try_call,
     unified_timestamp,
 )
@@ -69,6 +70,10 @@ class RedBeeBaseIE(InfoExtractor):
                 fmts, subs = self._extract_m3u8_formats_and_subtitles(
                     format['mediaLocator'], asset_id, fatal=False)
+            if format.get('drm'):
+                for f in fmts:
+                    f['has_drm'] = True
             formats.extend(fmts)
             self._merge_subtitles(subs, target=subtitles)
@@ -251,7 +256,7 @@ class RTBFIE(RedBeeBaseIE):
         if not login_token:
             self.raise_login_required()
-        session_jwt = self._download_json(
+        session_jwt = try_call(lambda: self._get_cookies(url)['rtbf_jwt'].value) or self._download_json(
             'https://login.rtbf.be/accounts.getJWT', media_id, query={
                 'login_token': login_token.value,
                 'APIKey': self._GIGYA_API_KEY,
@@ -269,8 +274,17 @@ class RTBFIE(RedBeeBaseIE):
         embed_page = self._download_webpage(
             'https://www.rtbf.be/auvio/embed/' + ('direct' if live else 'media'),
             media_id, query={'id': media_id})
-        data = self._parse_json(self._html_search_regex(
-            r'data-media="([^"]+)"', embed_page, 'media data'), media_id)
+        media_data = self._html_search_regex(r'data-media="([^"]+)"', embed_page, 'media data', fatal=False)
+        if not media_data:
+            if re.search(r'<div[^>]+id="js-error-expired"[^>]+class="(?![^"]*hidden)', embed_page):
+                raise ExtractorError('Livestream has ended.', expected=True)
+            if re.search(r'<div[^>]+id="js-sso-connect"[^>]+class="(?![^"]*hidden)', embed_page):
+                self.raise_login_required()
+            raise ExtractorError('Could not find media data')
+        data = self._parse_json(media_data, media_id)
         error = data.get('error')
         if error:
@@ -280,15 +294,20 @@ class RTBFIE(RedBeeBaseIE):
         if provider in self._PROVIDERS:
             return self.url_result(data['url'], self._PROVIDERS[provider])
-        title = data['subtitle']
+        title = traverse_obj(data, 'subtitle', 'title')
         is_live = data.get('isLive')
         height_re = r'-(\d+)p\.'
-        formats = []
+        formats, subtitles = [], {}
-        m3u8_url = data.get('urlHlsAes128') or data.get('urlHls')
+        # The old api still returns m3u8 and mpd manifest for livestreams, but these are 'fake'
+        # since all they contain is a 20s video that is completely unrelated.
+        # https://github.com/yt-dlp/yt-dlp/issues/4656#issuecomment-1214461092
+        m3u8_url = None if data.get('isLive') else traverse_obj(data, 'urlHlsAes128', 'urlHls')
         if m3u8_url:
-            formats.extend(self._extract_m3u8_formats(
-                m3u8_url, media_id, 'mp4', m3u8_id='hls', fatal=False))
+            fmts, subs = self._extract_m3u8_formats_and_subtitles(
+                m3u8_url, media_id, 'mp4', m3u8_id='hls', fatal=False)
+            formats.extend(fmts)
+            self._merge_subtitles(subs, target=subtitles)
         fix_url = lambda x: x.replace('//rtbf-vod.', '//rtbf.') if '/geo/drm/' in x else x
         http_url = data.get('url')
@@ -319,10 +338,12 @@ class RTBFIE(RedBeeBaseIE):
                     'height': height,
                 })
-        mpd_url = data.get('urlDash')
+        mpd_url = None if data.get('isLive') else data.get('urlDash')
         if mpd_url and (self.get_param('allow_unplayable_formats') or not data.get('drm')):
-            formats.extend(self._extract_mpd_formats(
-                mpd_url, media_id, mpd_id='dash', fatal=False))
+            fmts, subs = self._extract_mpd_formats_and_subtitles(
+                mpd_url, media_id, mpd_id='dash', fatal=False)
+            formats.extend(fmts)
+            self._merge_subtitles(subs, target=subtitles)
         audio_url = data.get('urlAudio')
         if audio_url:
@@ -332,7 +353,6 @@ class RTBFIE(RedBeeBaseIE):
                 'vcodec': 'none',
             })
-        subtitles = {}
         for track in (data.get('tracks') or {}).values():
             sub_url = track.get('url')
             if not sub_url:
@@ -342,7 +362,7 @@ class RTBFIE(RedBeeBaseIE):
             })
         if not formats:
-            fmts, subs = self._get_formats_and_subtitles(url, media_id)
+            fmts, subs = self._get_formats_and_subtitles(url, f'live_{media_id}' if is_live else media_id)
             formats.extend(fmts)
             self._merge_subtitles(subs, target=subtitles)

@@ -1,10 +1,12 @@
 from .common import InfoExtractor
 from ..utils import (
+    ExtractorError,
     get_element_by_class,
     int_or_none,
     remove_start,
     strip_or_none,
     unified_strdate,
+    urlencode_postdata,
 )
@@ -34,6 +36,28 @@ class ScreencastOMaticIE(InfoExtractor):
         video_id = self._match_id(url)
         webpage = self._download_webpage(
             'https://screencast-o-matic.com/player/' + video_id, video_id)
+        if (self._html_extract_title(webpage) == 'Protected Content'
+                or 'This video is private and requires a password' in webpage):
+            password = self.get_param('videopassword')
+            if not password:
+                raise ExtractorError('Password protected video, use --video-password <password>', expected=True)
+            form = self._search_regex(
+                r'(?is)<form[^>]*>(?P<form>.+?)</form>', webpage, 'login form', group='form')
+            form_data = self._hidden_inputs(form)
+            form_data.update({
+                'scPassword': password,
+            })
+            webpage = self._download_webpage(
+                'https://screencast-o-matic.com/player/password', video_id, 'Logging in',
+                data=urlencode_postdata(form_data))
+            if '<small class="text-danger">Invalid password</small>' in webpage:
+                raise ExtractorError('Unable to login: Invalid password', expected=True)
         info = self._parse_html5_media_entries(url, webpage, video_id)[0]
         info.update({
             'id': video_id,

@@ -44,7 +44,7 @@ class SovietsClosetIE(SovietsClosetBaseIE):
     _TESTS = [
         {
             'url': 'https://sovietscloset.com/video/1337',
-            'md5': '11e58781c4ca5b283307aa54db5b3f93',
+            'md5': 'bd012b04b261725510ca5383074cdd55',
             'info_dict': {
                 'id': '1337',
                 'ext': 'mp4',
@@ -69,11 +69,11 @@ class SovietsClosetIE(SovietsClosetBaseIE):
         },
         {
             'url': 'https://sovietscloset.com/video/1105',
-            'md5': '578b1958a379e7110ba38697042e9efb',
+            'md5': '89fa928f183893cb65a0b7be846d8a90',
             'info_dict': {
                 'id': '1105',
                 'ext': 'mp4',
-                'title': 'Arma 3 - Zeus Games #3',
+                'title': 'Arma 3 - Zeus Games #5',
                 'uploader': 'SovietWomble',
                 'thumbnail': r're:^https?://.*\.b-cdn\.net/c0e5e76f-3a93-40b4-bf01-12343c2eec5d/thumbnail\.jpg$',
                 'uploader': 'SovietWomble',
@@ -89,8 +89,8 @@ class SovietsClosetIE(SovietsClosetBaseIE):
                 'availability': 'public',
                 'series': 'Arma 3',
                 'season': 'Zeus Games',
-                'episode_number': 3,
-                'episode': 'Episode 3',
+                'episode_number': 5,
+                'episode': 'Episode 5',
             },
         },
     ]
@@ -122,7 +122,7 @@ class SovietsClosetIE(SovietsClosetBaseIE):
         video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)
-        static_assets_base = self._search_regex(r'staticAssetsBase:\"(.*?)\"', webpage, 'staticAssetsBase')
+        static_assets_base = self._search_regex(r'(/_nuxt/static/\d+)', webpage, 'staticAssetsBase')
         static_assets_base = f'https://sovietscloset.com{static_assets_base}'
         stream = self.parse_nuxt_jsonp(f'{static_assets_base}/video/{video_id}/payload.js', video_id, 'video')['stream']
@@ -181,7 +181,7 @@ class SovietsClosetPlaylistIE(SovietsClosetBaseIE):
         webpage = self._download_webpage(url, playlist_id)
-        static_assets_base = self._search_regex(r'staticAssetsBase:\"(.*?)\"', webpage, 'staticAssetsBase')
+        static_assets_base = self._search_regex(r'(/_nuxt/static/\d+)', webpage, 'staticAssetsBase')
         static_assets_base = f'https://sovietscloset.com{static_assets_base}'
         sovietscloset = self.parse_nuxt_jsonp(f'{static_assets_base}/payload.js', playlist_id, 'global')['games']

@@ -29,9 +29,7 @@ class StripchatIE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)
-        webpage = self._download_webpage(
-            'https://stripchat.com/%s/' % video_id, video_id,
-            headers=self.geo_verification_headers())
+        webpage = self._download_webpage(url, video_id, headers=self.geo_verification_headers())
         data = self._parse_json(
             self._search_regex(

369
yt_dlp/extractor/tencent.py Normal file
@@ -0,0 +1,369 @@
import functools
import random
import re
import string
import time
from .common import InfoExtractor
from ..aes import aes_cbc_encrypt_bytes
from ..utils import (
ExtractorError,
determine_ext,
int_or_none,
js_to_json,
traverse_obj,
urljoin,
)
class TencentBaseIE(InfoExtractor):
"""Subclasses must set _API_URL, _APP_VERSION, _PLATFORM, _HOST, _REFERER"""
def _get_ckey(self, video_id, url, guid):
ua = self.get_param('http_headers')['User-Agent']
payload = (f'{video_id}|{int(time.time())}|mg3c3b04ba|{self._APP_VERSION}|{guid}|'
f'{self._PLATFORM}|{url[:48]}|{ua.lower()[:48]}||Mozilla|Netscape|Windows x86_64|00|')
return aes_cbc_encrypt_bytes(
bytes(f'|{sum(map(ord, payload))}|{payload}', 'utf-8'),
b'Ok\xda\xa3\x9e/\x8c\xb0\x7f^r-\x9e\xde\xf3\x14',
b'\x01PJ\xf3V\xe6\x19\xcf.B\xbb\xa6\x8c?p\xf9',
padding_mode='whitespace').hex().upper()
def _get_video_api_response(self, video_url, video_id, series_id, subtitle_format, video_format, video_quality):
guid = ''.join([random.choice(string.digits + string.ascii_lowercase) for _ in range(16)])
ckey = self._get_ckey(video_id, video_url, guid)
query = {
'vid': video_id,
'cid': series_id,
'cKey': ckey,
'encryptVer': '8.1',
'spcaptiontype': '1' if subtitle_format == 'vtt' else '0',
'sphls': '2' if video_format == 'hls' else '0',
'dtype': '3' if video_format == 'hls' else '0',
'defn': video_quality,
'spsrt': '2', # Enable subtitles
'sphttps': '1', # Enable HTTPS
'otype': 'json',
'spwm': '1',
# For SHD
'host': self._HOST,
'referer': self._REFERER,
'ehost': video_url,
'appVer': self._APP_VERSION,
'platform': self._PLATFORM,
# For VQQ
'guid': guid,
'flowid': ''.join(random.choice(string.digits + string.ascii_lowercase) for _ in range(32)),
}
return self._search_json(r'QZOutputJson=', self._download_webpage(
self._API_URL, video_id, query=query), 'api_response', video_id)
def _extract_video_formats_and_subtitles(self, api_response, video_id):
video_response = api_response['vl']['vi'][0]
video_width, video_height = video_response.get('vw'), video_response.get('vh')
formats, subtitles = [], {}
for video_format in video_response['ul']['ui']:
if video_format.get('hls'):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
video_format['url'] + video_format['hls']['pt'], video_id, 'mp4', fatal=False)
for f in fmts:
f.update({'width': video_width, 'height': video_height})
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
else:
formats.append({
'url': f'{video_format["url"]}{video_response["fn"]}?vkey={video_response["fvkey"]}',
'width': video_width,
'height': video_height,
'ext': 'mp4',
})
return formats, subtitles
def _extract_video_native_subtitles(self, api_response, subtitles_format):
subtitles = {}
for subtitle in traverse_obj(api_response, ('sfl', 'fi')) or ():
subtitles.setdefault(subtitle['lang'].lower(), []).append({
'url': subtitle['url'],
'ext': subtitles_format,
'protocol': 'm3u8_native' if determine_ext(subtitle['url']) == 'm3u8' else 'http',
})
return subtitles
def _extract_all_video_formats_and_subtitles(self, url, video_id, series_id):
formats, subtitles = [], {}
for video_format, subtitle_format, video_quality in (
# '': 480p, 'shd': 720p, 'fhd': 1080p
('mp4', 'srt', ''), ('hls', 'vtt', 'shd'), ('hls', 'vtt', 'fhd')):
api_response = self._get_video_api_response(
url, video_id, series_id, subtitle_format, video_format, video_quality)
if api_response.get('em') != 0 and api_response.get('exem') != 0:
if '您所在区域暂无此内容版权' in (api_response.get('msg') or ''):  # 'msg' may be absent
self.raise_geo_restricted()
raise ExtractorError(f'Tencent said: {api_response.get("msg")}')
fmts, subs = self._extract_video_formats_and_subtitles(api_response, video_id)
native_subtitles = self._extract_video_native_subtitles(api_response, subtitle_format)
formats.extend(fmts)
self._merge_subtitles(subs, native_subtitles, target=subtitles)
self._sort_formats(formats)
return formats, subtitles
def _get_clean_title(self, title):
return re.sub(
r'\s*[_\-]\s*(?:Watch online|腾讯视频|(?:高清)?1080P在线观看平台).*?$',
'', title or '').strip() or None
class VQQBaseIE(TencentBaseIE):
_VALID_URL_BASE = r'https?://v\.qq\.com'
_API_URL = 'https://h5vv6.video.qq.com/getvinfo'
_APP_VERSION = '3.5.57'
_PLATFORM = '10901'
_HOST = 'v.qq.com'
_REFERER = 'v.qq.com'
def _get_webpage_metadata(self, webpage, video_id):
return self._parse_json(
self._search_regex(
r'(?s)<script[^>]*>[^<]*window\.__pinia\s*=\s*([^<]+)</script>',
webpage, 'pinia data', fatal=False),
video_id, transform_source=js_to_json, fatal=False)
class VQQVideoIE(VQQBaseIE):
IE_NAME = 'vqq:video'
_VALID_URL = VQQBaseIE._VALID_URL_BASE + r'/x/(?:page|cover/(?P<series_id>\w+))/(?P<id>\w+)'
_TESTS = [{
'url': 'https://v.qq.com/x/page/q326831cny0.html',
'md5': '826ef93682df09e3deac4a6e6e8cdb6e',
'info_dict': {
'id': 'q326831cny0',
'ext': 'mp4',
'title': '我是选手:雷霆裂阵,终极时刻',
'description': 'md5:e7ed70be89244017dac2a835a10aeb1e',
'thumbnail': r're:^https?://[^?#]+q326831cny0',
},
}, {
'url': 'https://v.qq.com/x/page/o3013za7cse.html',
'md5': 'b91cbbeada22ef8cc4b06df53e36fa21',
'info_dict': {
'id': 'o3013za7cse',
'ext': 'mp4',
'title': '欧阳娜娜VLOG',
'description': 'md5:29fe847497a98e04a8c3826e499edd2e',
'thumbnail': r're:^https?://[^?#]+o3013za7cse',
},
}, {
'url': 'https://v.qq.com/x/cover/7ce5noezvafma27/a00269ix3l8.html',
'md5': '71459c5375c617c265a22f083facce67',
'info_dict': {
'id': 'a00269ix3l8',
'ext': 'mp4',
'title': '鸡毛飞上天 第01集',
'description': 'md5:8cae3534327315b3872fbef5e51b5c5b',
'thumbnail': r're:^https?://[^?#]+7ce5noezvafma27',
'series': '鸡毛飞上天',
},
}, {
'url': 'https://v.qq.com/x/cover/mzc00200p29k31e/s0043cwsgj0.html',
'md5': '96b9fd4a189fdd4078c111f21d7ac1bc',
'info_dict': {
'id': 's0043cwsgj0',
'ext': 'mp4',
'title': '第1集如何快乐吃糖',
'description': 'md5:1d8c3a0b8729ae3827fa5b2d3ebd5213',
'thumbnail': r're:^https?://[^?#]+s0043cwsgj0',
'series': '青年理工工作者生活研究所',
},
}]
def _real_extract(self, url):
video_id, series_id = self._match_valid_url(url).group('id', 'series_id')
webpage = self._download_webpage(url, video_id)
webpage_metadata = self._get_webpage_metadata(webpage, video_id)
formats, subtitles = self._extract_all_video_formats_and_subtitles(url, video_id, series_id)
return {
'id': video_id,
'title': self._get_clean_title(self._og_search_title(webpage)
or traverse_obj(webpage_metadata, ('global', 'videoInfo', 'title'))),
'description': (self._og_search_description(webpage)
or traverse_obj(webpage_metadata, ('global', 'videoInfo', 'desc'))),
'formats': formats,
'subtitles': subtitles,
'thumbnail': (self._og_search_thumbnail(webpage)
or traverse_obj(webpage_metadata, ('global', 'videoInfo', 'pic160x90'))),
'series': traverse_obj(webpage_metadata, ('global', 'coverInfo', 'title')),
}
class VQQSeriesIE(VQQBaseIE):
IE_NAME = 'vqq:series'
_VALID_URL = VQQBaseIE._VALID_URL_BASE + r'/x/cover/(?P<id>\w+)\.html/?(?:[?#]|$)'
_TESTS = [{
'url': 'https://v.qq.com/x/cover/7ce5noezvafma27.html',
'info_dict': {
'id': '7ce5noezvafma27',
'title': '鸡毛飞上天',
'description': 'md5:8cae3534327315b3872fbef5e51b5c5b',
},
'playlist_count': 55,
}, {
'url': 'https://v.qq.com/x/cover/oshd7r0vy9sfq8e.html',
'info_dict': {
'id': 'oshd7r0vy9sfq8e',
'title': '恋爱细胞2',
'description': 'md5:9d8a2245679f71ca828534b0f95d2a03',
},
'playlist_count': 12,
}]
def _real_extract(self, url):
series_id = self._match_id(url)
webpage = self._download_webpage(url, series_id)
webpage_metadata = self._get_webpage_metadata(webpage, series_id)
episode_paths = [f'/x/cover/{series_id}/{video_id}.html' for video_id in re.findall(
r'<div[^>]+data-vid="(?P<video_id>[^"]+)"[^>]+class="[^"]+episode-item-rect--number',
webpage)]
return self.playlist_from_matches(
episode_paths, series_id, ie=VQQVideoIE, getter=functools.partial(urljoin, url),
title=self._get_clean_title(traverse_obj(webpage_metadata, ('coverInfo', 'title'))
or self._og_search_title(webpage)),
description=(traverse_obj(webpage_metadata, ('coverInfo', 'description'))
or self._og_search_description(webpage)))
class WeTvBaseIE(TencentBaseIE):
_VALID_URL_BASE = r'https?://(?:www\.)?wetv\.vip/(?:[^?#]+/)?play'
_API_URL = 'https://play.wetv.vip/getvinfo'
_APP_VERSION = '3.5.57'
_PLATFORM = '4830201'
_HOST = 'wetv.vip'
_REFERER = 'wetv.vip'
def _get_webpage_metadata(self, webpage, video_id):
return self._parse_json(
traverse_obj(self._search_nextjs_data(webpage, video_id), ('props', 'pageProps', 'data')),
video_id, fatal=False)
class WeTvEpisodeIE(WeTvBaseIE):
IE_NAME = 'wetv:episode'
_VALID_URL = WeTvBaseIE._VALID_URL_BASE + r'/(?P<series_id>\w+)(?:-[^?#]+)?/(?P<id>\w+)(?:-[^?#]+)?'
_TESTS = [{
'url': 'https://wetv.vip/en/play/air11ooo2rdsdi3-Cute-Programmer/v0040pr89t9-EP1-Cute-Programmer',
'md5': '0c70fdfaa5011ab022eebc598e64bbbe',
'info_dict': {
'id': 'v0040pr89t9',
'ext': 'mp4',
'title': 'EP1: Cute Programmer',
'description': 'md5:e87beab3bf9f392d6b9e541a63286343',
'thumbnail': r're:^https?://[^?#]+air11ooo2rdsdi3',
'series': 'Cute Programmer',
'episode': 'Episode 1',
'episode_number': 1,
'duration': 2835,
},
}, {
'url': 'https://wetv.vip/en/play/u37kgfnfzs73kiu/p0039b9nvik',
'md5': '3b3c15ca4b9a158d8d28d5aa9d7c0a49',
'info_dict': {
'id': 'p0039b9nvik',
'ext': 'mp4',
'title': 'EP1: You Are My Glory',
'description': 'md5:831363a4c3b4d7615e1f3854be3a123b',
'thumbnail': r're:^https?://[^?#]+u37kgfnfzs73kiu',
'series': 'You Are My Glory',
'episode': 'Episode 1',
'episode_number': 1,
'duration': 2454,
},
}, {
'url': 'https://wetv.vip/en/play/lcxgwod5hapghvw-WeTV-PICK-A-BOO/i0042y00lxp-Zhao-Lusi-Describes-The-First-Experiences-She-Had-In-Who-Rules-The-World-%7C-WeTV-PICK-A-BOO',
'md5': '71133f5c2d5d6cad3427e1b010488280',
'info_dict': {
'id': 'i0042y00lxp',
'ext': 'mp4',
'title': 'md5:f7a0857dbe5fbbe2e7ad630b92b54e6a',
'description': 'md5:76260cb9cdc0ef76826d7ca9d92fadfa',
'thumbnail': r're:^https?://[^?#]+lcxgwod5hapghvw',
'series': 'WeTV PICK-A-BOO',
'episode': 'Episode 0',
'episode_number': 0,
'duration': 442,
},
}]
def _real_extract(self, url):
video_id, series_id = self._match_valid_url(url).group('id', 'series_id')
webpage = self._download_webpage(url, video_id)
webpage_metadata = self._get_webpage_metadata(webpage, video_id)
formats, subtitles = self._extract_all_video_formats_and_subtitles(url, video_id, series_id)
return {
'id': video_id,
'title': self._get_clean_title(self._og_search_title(webpage)
or traverse_obj(webpage_metadata, ('coverInfo', 'title'))),
'description': (traverse_obj(webpage_metadata, ('coverInfo', 'description'))
or self._og_search_description(webpage)),
'formats': formats,
'subtitles': subtitles,
'thumbnail': self._og_search_thumbnail(webpage),
'duration': int_or_none(traverse_obj(webpage_metadata, ('videoInfo', 'duration'))),
'series': traverse_obj(webpage_metadata, ('coverInfo', 'title')),
'episode_number': int_or_none(traverse_obj(webpage_metadata, ('videoInfo', 'episode'))),
}
class WeTvSeriesIE(WeTvBaseIE):
_VALID_URL = WeTvBaseIE._VALID_URL_BASE + r'/(?P<id>\w+)(?:-[^/?#]+)?/?(?:[?#]|$)'
_TESTS = [{
'url': 'https://wetv.vip/play/air11ooo2rdsdi3-Cute-Programmer',
'info_dict': {
'id': 'air11ooo2rdsdi3',
'title': 'Cute Programmer',
'description': 'md5:e87beab3bf9f392d6b9e541a63286343',
},
'playlist_count': 30,
}, {
'url': 'https://wetv.vip/en/play/u37kgfnfzs73kiu-You-Are-My-Glory',
'info_dict': {
'id': 'u37kgfnfzs73kiu',
'title': 'You Are My Glory',
'description': 'md5:831363a4c3b4d7615e1f3854be3a123b',
},
'playlist_count': 32,
}]
def _real_extract(self, url):
series_id = self._match_id(url)
webpage = self._download_webpage(url, series_id)
webpage_metadata = self._get_webpage_metadata(webpage, series_id)
episode_paths = ([f'/play/{series_id}/{episode["vid"]}' for episode in traverse_obj(webpage_metadata, 'videoList') or []]
or re.findall(r'<a[^>]+class="play-video__link"[^>]+href="(?P<path>[^"]+)', webpage))
return self.playlist_from_matches(
episode_paths, series_id, ie=WeTvEpisodeIE, getter=functools.partial(urljoin, url),
title=self._get_clean_title(traverse_obj(webpage_metadata, ('coverInfo', 'title'))
or self._og_search_title(webpage)),
description=(traverse_obj(webpage_metadata, ('coverInfo', 'description'))
or self._og_search_description(webpage)))
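
Reviewer note: `_get_clean_title` in the new `tencent.py` strips site branding from page titles. Below is a standalone sketch that copies the regex from the diff into a free function (`clean_title` is an illustrative name) so its behaviour can be checked without the extractor; the sample titles are invented.

```python
import re


def clean_title(title):
    # Same pattern as _get_clean_title in the diff: drop a trailing
    # "- Watch online ..." / "_腾讯视频 ..." style suffix, then trim.
    return re.sub(
        r'\s*[_\-]\s*(?:Watch online|腾讯视频|(?:高清)?1080P在线观看平台).*?$',
        '', title or '').strip() or None


print(clean_title('Cute Programmer - Watch online - WeTV'))
print(clean_title('鸡毛飞上天_腾讯视频'))
print(clean_title('A-B'))
print(clean_title(None))
```

A hyphen or underscore inside the title is left alone unless it is directly followed by one of the known branding strings.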

@@ -8,12 +8,14 @@ class TestURLIE(InfoExtractor):
     """ Allows addressing of the test cases as test:yout.*be_1 """
     IE_DESC = False  # Do not list
-    _VALID_URL = r'test(?:url)?:(?P<extractor>.+?)(?:_(?P<num>[0-9]+))?$'
+    _VALID_URL = r'test(?:url)?:(?P<extractor>.*?)(?:_(?P<num>[0-9]+))?$'
     def _real_extract(self, url):
         from . import gen_extractor_classes
         extractor_id, num = self._match_valid_url(url).group('extractor', 'num')
+        if not extractor_id:
+            return {'id': ':test', 'title': '', 'url': url}
         rex = re.compile(extractor_id, flags=re.IGNORECASE)
         matching_extractors = [e for e in gen_extractor_classes() if rex.search(e.IE_NAME)]

304
yt_dlp/extractor/triller.py Normal file
@@ -0,0 +1,304 @@
import itertools
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
str_or_none,
traverse_obj,
unified_strdate,
unified_timestamp,
url_basename,
)
class TrillerBaseIE(InfoExtractor):
_NETRC_MACHINE = 'triller'
_AUTH_TOKEN = None
_API_BASE_URL = 'https://social.triller.co/v1.5'
def _perform_login(self, username, password):
if self._AUTH_TOKEN:
return
user_check = self._download_json(
f'{self._API_BASE_URL}/api/user/is-valid-username', None, note='Checking username',
fatal=False, expected_status=400, headers={
'Content-Type': 'application/json',
'Origin': 'https://triller.co',
}, data=json.dumps({'username': username}, separators=(',', ':')).encode('utf-8')) or {}
if user_check.get('status'): # endpoint returns "status":false if username exists
raise ExtractorError('Unable to login: Invalid username', expected=True)
credentials = {
'username': username,
'password': password,
}
login = self._download_json(
f'{self._API_BASE_URL}/user/auth', None, note='Logging in',
fatal=False, expected_status=400, headers={
'Content-Type': 'application/json',
'Origin': 'https://triller.co',
}, data=json.dumps(credentials, separators=(',', ':')).encode('utf-8')) or {}
if not login.get('auth_token'):
if login.get('error') == 1008:
raise ExtractorError('Unable to login: Incorrect password', expected=True)
raise ExtractorError('Unable to login')
self._AUTH_TOKEN = login['auth_token']
def _get_comments(self, video_id, limit=15):
comment_info = self._download_json(
f'{self._API_BASE_URL}/api/videos/{video_id}/comments_v2',
video_id, fatal=False, note='Downloading comments API JSON',
headers={'Origin': 'https://triller.co'}, query={'limit': limit}) or {}
if not comment_info.get('comments'):
return
for comment_dict in comment_info['comments']:
yield {
'author': traverse_obj(comment_dict, ('author', 'username')),
'author_id': traverse_obj(comment_dict, ('author', 'user_id')),
'id': comment_dict.get('id'),
'text': comment_dict.get('body'),
'timestamp': unified_timestamp(comment_dict.get('timestamp')),
}
def _check_user_info(self, user_info):
if not user_info:
self.report_warning('Unable to extract user info')
elif user_info.get('private') and not user_info.get('followed_by_me'):
raise ExtractorError('This video is private', expected=True)
elif traverse_obj(user_info, 'blocked_by_user', 'blocking_user'):
raise ExtractorError('The author of the video is blocked', expected=True)
return user_info
def _parse_video_info(self, video_info, username, user_info=None):
video_uuid = video_info.get('video_uuid')
video_id = video_info.get('id')
formats = []
video_url = traverse_obj(video_info, 'video_url', 'stream_url')
if video_url:
formats.append({
'url': video_url,
'ext': 'mp4',
'vcodec': 'h264',
'width': video_info.get('width'),
'height': video_info.get('height'),
'format_id': url_basename(video_url).split('.')[0],
'filesize': video_info.get('filesize'),
})
video_set = video_info.get('video_set') or []
for video in video_set:
resolution = video.get('resolution') or ''
formats.append({
'url': video['url'],
'ext': 'mp4',
'vcodec': video.get('codec'),
'vbr': int_or_none(video.get('bitrate'), 1000),
'width': int_or_none(resolution.partition('x')[0]),
'height': int_or_none(resolution.partition('x')[2]),  # no IndexError when 'resolution' is missing
'format_id': url_basename(video['url']).split('.')[0],
})
audio_url = video_info.get('audio_url')
if audio_url:
formats.append({
'url': audio_url,
'ext': 'm4a',
'format_id': url_basename(audio_url).split('.')[0],
})
manifest_url = video_info.get('transcoded_url')
if manifest_url:
formats.extend(self._extract_m3u8_formats(
manifest_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
self._sort_formats(formats)
comment_count = int_or_none(video_info.get('comment_count'))
user_info = user_info or traverse_obj(video_info, 'user', default={})
return {
'id': str_or_none(video_id) or video_uuid,
'title': video_info.get('description') or f'Video by {username}',
'thumbnail': video_info.get('thumbnail_url'),
'description': video_info.get('description'),
'uploader': str_or_none(username),
'uploader_id': str_or_none(user_info.get('user_id')),
'creator': str_or_none(user_info.get('name')),
'timestamp': unified_timestamp(video_info.get('timestamp')),
'upload_date': unified_strdate(video_info.get('timestamp')),
'duration': int_or_none(video_info.get('duration')),
'view_count': int_or_none(video_info.get('play_count')),
'like_count': int_or_none(video_info.get('likes_count')),
'artist': str_or_none(video_info.get('song_artist')),
'track': str_or_none(video_info.get('song_title')),
'webpage_url': f'https://triller.co/@{username}/video/{video_uuid}',
'uploader_url': f'https://triller.co/@{username}',
'extractor_key': TrillerIE.ie_key(),
'extractor': TrillerIE.IE_NAME,
'formats': formats,
'comment_count': comment_count,
'__post_extractor': self.extract_comments(video_id, comment_count),
}
class TrillerIE(TrillerBaseIE):
_VALID_URL = r'''(?x)
https?://(?:www\.)?triller\.co/
@(?P<username>[\w\._]+)/video/
(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})
'''
_TESTS = [{
'url': 'https://triller.co/@theestallion/video/2358fcd7-3df2-4c77-84c8-1d091610a6cf',
'md5': '228662d783923b60d78395fedddc0a20',
'info_dict': {
'id': '71595734',
'ext': 'mp4',
'title': 'md5:9a2bf9435c5c4292678996a464669416',
'thumbnail': r're:^https://uploads\.cdn\.triller\.co/.+\.jpg$',
'description': 'md5:9a2bf9435c5c4292678996a464669416',
'uploader': 'theestallion',
'uploader_id': '18992236',
'creator': 'Megan Thee Stallion',
'timestamp': 1660598222,
'upload_date': '20220815',
'duration': 47,
'height': 3840,
'width': 2160,
'view_count': int,
'like_count': int,
'artist': 'Megan Thee Stallion',
'track': 'Her',
'webpage_url': 'https://triller.co/@theestallion/video/2358fcd7-3df2-4c77-84c8-1d091610a6cf',
'uploader_url': 'https://triller.co/@theestallion',
'comment_count': int,
}
}, {
'url': 'https://triller.co/@charlidamelio/video/46c6fcfa-aa9e-4503-a50c-68444f44cddc',
'md5': '874055f462af5b0699b9dbb527a505a0',
'info_dict': {
'id': '71621339',
'ext': 'mp4',
'title': 'md5:4c91ea82760fe0fffb71b8c3aa7295fc',
'thumbnail': r're:^https://uploads\.cdn\.triller\.co/.+\.jpg$',
'description': 'md5:4c91ea82760fe0fffb71b8c3aa7295fc',
'uploader': 'charlidamelio',
'uploader_id': '1875551',
'creator': 'charli damelio',
'timestamp': 1660773354,
'upload_date': '20220817',
'duration': 16,
'height': 1920,
'width': 1080,
'view_count': int,
'like_count': int,
'artist': 'Dixie',
'track': 'Someone to Blame',
'webpage_url': 'https://triller.co/@charlidamelio/video/46c6fcfa-aa9e-4503-a50c-68444f44cddc',
'uploader_url': 'https://triller.co/@charlidamelio',
'comment_count': int,
}
}]
def _real_extract(self, url):
username, video_uuid = self._match_valid_url(url).group('username', 'id')
video_info = traverse_obj(self._download_json(
f'{self._API_BASE_URL}/api/videos/{video_uuid}',
video_uuid, note='Downloading video info API JSON',
errnote='Unable to download video info API JSON',
headers={
'Origin': 'https://triller.co',
}), ('videos', 0))
if not video_info:
raise ExtractorError('No video info found in API response')
user_info = self._check_user_info(video_info.get('user') or {})
return self._parse_video_info(video_info, username, user_info)


class TrillerUserIE(TrillerBaseIE):
    _VALID_URL = r'https?://(?:www\.)?triller\.co/@(?P<id>[\w\._]+)/?(?:$|[#?])'
    _TESTS = [{
        # first videos request only returns 2 videos
        'url': 'https://triller.co/@theestallion',
        'playlist_mincount': 9,
        'info_dict': {
            'id': '18992236',
            'title': 'theestallion',
            'thumbnail': r're:^https://uploads\.cdn\.triller\.co/.+\.jpg$',
        }
    }, {
        'url': 'https://triller.co/@charlidamelio',
        'playlist_mincount': 25,
        'info_dict': {
            'id': '1875551',
            'title': 'charlidamelio',
            'thumbnail': r're:^https://uploads\.cdn\.triller\.co/.+\.jpg$',
        }
    }]

    def _real_initialize(self):
        if not self._AUTH_TOKEN:
            guest = self._download_json(
                f'{self._API_BASE_URL}/user/create_guest',
                None, note='Creating guest session', data=b'', headers={
                    'Origin': 'https://triller.co',
                }, query={
                    'platform': 'Web',
                    'app_version': '',
                })
            if not guest.get('auth_token'):
                raise ExtractorError('Unable to fetch required auth token for user extraction')

            self._AUTH_TOKEN = guest['auth_token']

    def _extract_video_list(self, username, user_id, limit=6):
        query = {
            'limit': limit,
        }
        for page in itertools.count(1):
            for retry in self.RetryManager():
                try:
                    video_list = self._download_json(
                        f'{self._API_BASE_URL}/api/users/{user_id}/videos',
                        username, note=f'Downloading user video list page {page}',
                        errnote='Unable to download user video list', headers={
                            'Authorization': f'Bearer {self._AUTH_TOKEN}',
                            'Origin': 'https://triller.co',
                        }, query=query)
                except ExtractorError as e:
                    if isinstance(e.cause, json.JSONDecodeError) and e.cause.pos == 0:
                        retry.error = e
                        continue
                    raise
            if not video_list.get('videos'):
                break
            yield from video_list['videos']
            query['before_time'] = traverse_obj(video_list, ('videos', -1, 'timestamp'))
            if not query['before_time']:
                break

    def _entries(self, videos, username, user_info):
        for video in videos:
            yield self._parse_video_info(video, username, user_info)

    def _real_extract(self, url):
        username = self._match_id(url)
        user_info = self._check_user_info(self._download_json(
            f'{self._API_BASE_URL}/api/users/by_username/{username}',
            username, note='Downloading user info',
            errnote='Failed to download user info', headers={
                'Authorization': f'Bearer {self._AUTH_TOKEN}',
                'Origin': 'https://triller.co',
            }).get('user', {}))

        user_id = str_or_none(user_info.get('user_id'))
        videos = self._extract_video_list(username, user_id)
        thumbnail = user_info.get('avatar_url')

        return self.playlist_result(
            self._entries(videos, username, user_info), user_id, username, thumbnail=thumbnail)
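
The `_extract_video_list` method pages through a user's videos by sending the timestamp of the last item received as `before_time`, and stops when a page comes back empty or no cursor is available. A minimal, self-contained sketch of that cursor pattern is given here. `fetch_page`, `iter_videos` and the in-memory `FAKE_VIDEOS` list are hypothetical stand-ins for the API call and are not part of yt-dlp or the Triller API; the retry handling is left out.

```python
import itertools

# Hypothetical stand-in data: newest first, each item carries a timestamp.
FAKE_VIDEOS = [{'id': i, 'timestamp': 1000 - i} for i in range(10)]


def fetch_page(limit, before_time=None):
    """Hypothetical API call: return up to `limit` videos older than `before_time`."""
    older = [v for v in FAKE_VIDEOS
             if before_time is None or v['timestamp'] < before_time]
    return {'videos': older[:limit]}


def iter_videos(limit=6):
    """Yield every video, using the last item's timestamp as the next cursor."""
    before_time = None
    for _page in itertools.count(1):
        videos = fetch_page(limit, before_time).get('videos')
        if not videos:
            break  # an empty page means there is nothing left
        yield from videos
        before_time = videos[-1].get('timestamp')
        if not before_time:
            break  # no cursor available, so stop instead of looping forever


all_ids = [v['id'] for v in iter_videos(limit=4)]
```

Because the function is a generator, a caller can stop consuming it early without the remaining pages ever being requested.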

@@ -2,7 +2,7 @@ from .common import InfoExtractor
 class UKTVPlayIE(InfoExtractor):
-    _VALID_URL = r'https?://uktvplay\.uktv\.co\.uk/(?:.+?\?.*?\bvideo=|([^/]+/)*watch-online/)(?P<id>\d+)'
+    _VALID_URL = r'https?://uktvplay\.(?:uktv\.)?co\.uk/(?:.+?\?.*?\bvideo=|([^/]+/)*watch-online/)(?P<id>\d+)'
     _TESTS = [{
         'url': 'https://uktvplay.uktv.co.uk/shows/world-at-war/c/200/watch-online/?video=2117008346001',
         'info_dict': {

@@ -1131,7 +1131,7 @@ class VimeoChannelIE(VimeoBaseInfoExtractor):
 class VimeoUserIE(VimeoChannelIE):
     IE_NAME = 'vimeo:user'
-    _VALID_URL = r'https://vimeo\.com/(?!(?:[0-9]+|watchlater)(?:$|[?#/]))(?P<id>[^/]+)(?:/videos|[#?]|$)'
+    _VALID_URL = r'https://vimeo\.com/(?!(?:[0-9]+|watchlater)(?:$|[?#/]))(?P<id>[^/]+)(?:/videos)?/?(?:$|[?#])'
     _TITLE_RE = r'<a[^>]+?class="user">([^<>]+?)</a>'
     _TESTS = [{
         'url': 'https://vimeo.com/nkistudio/videos',
@@ -1140,6 +1140,9 @@ class VimeoUserIE(VimeoChannelIE):
             'id': 'nkistudio',
         },
         'playlist_mincount': 66,
+    }, {
+        'url': 'https://vimeo.com/nkistudio/',
+        'only_matching': True,
     }]
     _BASE_URL_TEMPL = 'https://vimeo.com/%s'

@@ -1,208 +0,0 @@
import functools
import re
import time
from .common import InfoExtractor
from ..aes import aes_cbc_encrypt_bytes
from ..utils import determine_ext, int_or_none, traverse_obj, urljoin


class WeTvBaseIE(InfoExtractor):
    _VALID_URL_BASE = r'https?://(?:www\.)?wetv\.vip/(?:[^?#]+/)?play'

    def _get_ckey(self, video_id, url, app_version, platform):
        ua = self.get_param('http_headers')['User-Agent']

        payload = (f'{video_id}|{int(time.time())}|mg3c3b04ba|{app_version}|0000000000000000|'
                   f'{platform}|{url[:48]}|{ua.lower()[:48]}||Mozilla|Netscape|Win32|00|')

        return aes_cbc_encrypt_bytes(
            bytes(f'|{sum(map(ord, payload))}|{payload}', 'utf-8'),
            b'Ok\xda\xa3\x9e/\x8c\xb0\x7f^r-\x9e\xde\xf3\x14',
            b'\x01PJ\xf3V\xe6\x19\xcf.B\xbb\xa6\x8c?p\xf9',
            padding_mode='whitespace').hex()

    def _get_video_api_response(self, video_url, video_id, series_id, subtitle_format, video_format, video_quality):
        app_version = '3.5.57'
        platform = '4830201'

        ckey = self._get_ckey(video_id, video_url, app_version, platform)
        query = {
            'vid': video_id,
            'cid': series_id,
            'cKey': ckey,
            'encryptVer': '8.1',
            'spcaptiontype': '1' if subtitle_format == 'vtt' else '0',  # 0 - SRT, 1 - VTT
            'sphls': '1' if video_format == 'hls' else '0',  # 0 - MP4, 1 - HLS
            'defn': video_quality,  # '': 480p, 'shd': 720p, 'fhd': 1080p
            'spsrt': '1',  # Enable subtitles
            'sphttps': '1',  # Enable HTTPS
            'otype': 'json',  # Response format: xml, json,
            'dtype': '1',
            'spwm': '1',
            'host': 'wetv.vip',  # These three values are needed for SHD
            'referer': 'wetv.vip',
            'ehost': video_url,
            'appVer': app_version,
            'platform': platform,
        }

        return self._search_json(r'QZOutputJson=', self._download_webpage(
            'https://play.wetv.vip/getvinfo', video_id, query=query), 'api_response', video_id)

    def _get_webpage_metadata(self, webpage, video_id):
        return self._parse_json(
            traverse_obj(self._search_nextjs_data(webpage, video_id), ('props', 'pageProps', 'data')),
            video_id, fatal=False)


class WeTvEpisodeIE(WeTvBaseIE):
    IE_NAME = 'wetv:episode'
    _VALID_URL = WeTvBaseIE._VALID_URL_BASE + r'/(?P<series_id>\w+)(?:-[^?#]+)?/(?P<id>\w+)(?:-[^?#]+)?'

    _TESTS = [{
        'url': 'https://wetv.vip/en/play/air11ooo2rdsdi3-Cute-Programmer/v0040pr89t9-EP1-Cute-Programmer',
        'md5': 'a046f565c9dce9b263a0465a422cd7bf',
        'info_dict': {
            'id': 'v0040pr89t9',
            'ext': 'mp4',
            'title': 'EP1: Cute Programmer',
            'description': 'md5:e87beab3bf9f392d6b9e541a63286343',
            'thumbnail': r're:^https?://[^?#]+air11ooo2rdsdi3',
            'series': 'Cute Programmer',
            'episode': 'Episode 1',
            'episode_number': 1,
            'duration': 2835,
        },
    }, {
        'url': 'https://wetv.vip/en/play/u37kgfnfzs73kiu/p0039b9nvik',
        'md5': '4d9d69bcfd11da61f4aae64fc6b316b3',
        'info_dict': {
            'id': 'p0039b9nvik',
            'ext': 'mp4',
            'title': 'EP1: You Are My Glory',
            'description': 'md5:831363a4c3b4d7615e1f3854be3a123b',
            'thumbnail': r're:^https?://[^?#]+u37kgfnfzs73kiu',
            'series': 'You Are My Glory',
            'episode': 'Episode 1',
            'episode_number': 1,
            'duration': 2454,
        },
    }, {
        'url': 'https://wetv.vip/en/play/lcxgwod5hapghvw-WeTV-PICK-A-BOO/i0042y00lxp-Zhao-Lusi-Describes-The-First-Experiences-She-Had-In-Who-Rules-The-World-%7C-WeTV-PICK-A-BOO',
        'md5': '71133f5c2d5d6cad3427e1b010488280',
        'info_dict': {
            'id': 'i0042y00lxp',
            'ext': 'mp4',
            'title': 'md5:f7a0857dbe5fbbe2e7ad630b92b54e6a',
            'description': 'md5:76260cb9cdc0ef76826d7ca9d92fadfa',
            'thumbnail': r're:^https?://[^?#]+lcxgwod5hapghvw',
            'series': 'WeTV PICK-A-BOO',
            'episode': 'Episode 0',
            'episode_number': 0,
            'duration': 442,
        },
    }]

    def _extract_video_formats_and_subtitles(self, api_response, video_id, video_quality):
        video_response = api_response['vl']['vi'][0]
        video_width = video_response.get('vw')
        video_height = video_response.get('vh')

        formats, subtitles = [], {}
        for video_format in video_response['ul']['ui']:
            if video_format.get('hls'):
                fmts, subs = self._extract_m3u8_formats_and_subtitles(
                    video_format['url'] + video_format['hls']['pname'], video_id, 'mp4', fatal=False)
                for f in fmts:
                    f['width'] = video_width
                    f['height'] = video_height

                formats.extend(fmts)
                self._merge_subtitles(subs, target=subtitles)
            else:
                formats.append({
                    'url': f'{video_format["url"]}{video_response["fn"]}?vkey={video_response["fvkey"]}',
                    'width': video_width,
                    'height': video_height,
                    'ext': 'mp4',
                })

        return formats, subtitles

    def _extract_video_subtitles(self, api_response, subtitles_format):
        subtitles = {}
        for subtitle in traverse_obj(api_response, ('sfl', 'fi')):
            subtitles.setdefault(subtitle['lang'].lower(), []).append({
                'url': subtitle['url'],
                'ext': subtitles_format,
                'protocol': 'm3u8_native' if determine_ext(subtitle['url']) == 'm3u8' else 'http',
            })

        return subtitles

    def _real_extract(self, url):
        video_id, series_id = self._match_valid_url(url).group('id', 'series_id')
        webpage = self._download_webpage(url, video_id)

        formats, subtitles = [], {}
        for video_format, subtitle_format, video_quality in (('mp4', 'srt', ''), ('hls', 'vtt', 'shd'), ('hls', 'vtt', 'fhd')):
            api_response = self._get_video_api_response(url, video_id, series_id, subtitle_format, video_format, video_quality)

            fmts, subs = self._extract_video_formats_and_subtitles(api_response, video_id, video_quality)
            native_subtitles = self._extract_video_subtitles(api_response, subtitle_format)

            formats.extend(fmts)
            self._merge_subtitles(subs, native_subtitles, target=subtitles)

        self._sort_formats(formats)
        webpage_metadata = self._get_webpage_metadata(webpage, video_id)

        return {
            'id': video_id,
            'title': (self._og_search_title(webpage)
                      or traverse_obj(webpage_metadata, ('coverInfo', 'description'))),
            'description': (self._og_search_description(webpage)
                            or traverse_obj(webpage_metadata, ('coverInfo', 'description'))),
            'formats': formats,
            'subtitles': subtitles,
            'thumbnail': self._og_search_thumbnail(webpage),
            'duration': int_or_none(traverse_obj(webpage_metadata, ('videoInfo', 'duration'))),
            'series': traverse_obj(webpage_metadata, ('coverInfo', 'title')),
            'episode_number': int_or_none(traverse_obj(webpage_metadata, ('videoInfo', 'episode'))),
        }


class WeTvSeriesIE(WeTvBaseIE):
    _VALID_URL = WeTvBaseIE._VALID_URL_BASE + r'/(?P<id>\w+)(?:-[^/?#]+)?/?(?:[?#]|$)'

    _TESTS = [{
        'url': 'https://wetv.vip/play/air11ooo2rdsdi3-Cute-Programmer',
        'info_dict': {
            'id': 'air11ooo2rdsdi3',
            'title': 'Cute Programmer',
            'description': 'md5:e87beab3bf9f392d6b9e541a63286343',
        },
        'playlist_count': 30,
    }, {
        'url': 'https://wetv.vip/en/play/u37kgfnfzs73kiu-You-Are-My-Glory',
        'info_dict': {
            'id': 'u37kgfnfzs73kiu',
            'title': 'You Are My Glory',
            'description': 'md5:831363a4c3b4d7615e1f3854be3a123b',
        },
        'playlist_count': 32,
    }]

    def _real_extract(self, url):
        series_id = self._match_id(url)
        webpage = self._download_webpage(url, series_id)
        webpage_metadata = self._get_webpage_metadata(webpage, series_id)

        episode_paths = (re.findall(r'<a[^>]+class="play-video__link"[^>]+href="(?P<path>[^"]+)', webpage)
                         or [f'/{series_id}/{episode["vid"]}' for episode in webpage_metadata.get('videoList')])

        return self.playlist_from_matches(
            episode_paths, series_id, ie=WeTvEpisodeIE, getter=functools.partial(urljoin, url),
            title=traverse_obj(webpage_metadata, ('coverInfo', 'title')) or self._og_search_title(webpage),
            description=traverse_obj(webpage_metadata, ('coverInfo', 'description')) or self._og_search_description(webpage))
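
The removed `_get_ckey` method builds a pipe-delimited payload, prefixes it with the sum of the payload's character code points, and only then encrypts the result. The sketch below reproduces just the plaintext assembly so the checksum framing is easier to see. The function name `build_ckey_plaintext` and its `now` parameter are hypothetical additions for illustration, and the AES-CBC step (done by yt-dlp's internal `aes_cbc_encrypt_bytes`) is deliberately left out.

```python
import time


def build_ckey_plaintext(video_id, url, user_agent, app_version, platform, now=None):
    """Assemble the checksum-prefixed plaintext that would later be encrypted."""
    # `now` is a hypothetical hook so the output can be made reproducible.
    timestamp = int(time.time()) if now is None else int(now)
    payload = (f'{video_id}|{timestamp}|mg3c3b04ba|{app_version}|0000000000000000|'
               f'{platform}|{url[:48]}|{user_agent.lower()[:48]}||Mozilla|Netscape|Win32|00|')
    checksum = sum(map(ord, payload))  # sum of the payload's code points
    return f'|{checksum}|{payload}'.encode('utf-8')


plaintext = build_ckey_plaintext(
    'v0040pr89t9', 'https://wetv.vip/en/play/example', 'ExampleAgent/1.0',
    app_version='3.5.57', platform='4830201', now=0)
```

Note that the URL and the lowercased user agent are each truncated to 48 characters before being placed in the payload, as in the original method.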

@@ -17,6 +17,7 @@ import urllib.error
 import urllib.parse
 from .common import InfoExtractor, SearchInfoExtractor
+from .openload import PhantomJSwrapper
 from ..compat import functools
 from ..jsinterp import JSInterpreter
 from ..utils import (
@@ -109,8 +110,9 @@ INNERTUBE_CLIENTS = {
         'INNERTUBE_CONTEXT': {
             'client': {
                 'clientName': 'ANDROID',
-                'clientVersion': '17.29.34',
-                'androidSdkVersion': 30
+                'clientVersion': '17.31.35',
+                'androidSdkVersion': 30,
+                'userAgent': 'com.google.android.youtube/17.31.35 (Linux; U; Android 11) gzip'
             }
         },
         'INNERTUBE_CONTEXT_CLIENT_NAME': 3,
@@ -121,8 +123,9 @@ INNERTUBE_CLIENTS = {
         'INNERTUBE_CONTEXT': {
             'client': {
                 'clientName': 'ANDROID_EMBEDDED_PLAYER',
-                'clientVersion': '17.29.34',
-                'androidSdkVersion': 30
+                'clientVersion': '17.31.35',
+                'androidSdkVersion': 30,
+                'userAgent': 'com.google.android.youtube/17.31.35 (Linux; U; Android 11) gzip'
             },
         },
         'INNERTUBE_CONTEXT_CLIENT_NAME': 55,
@@ -134,7 +137,8 @@ INNERTUBE_CLIENTS = {
             'client': {
                 'clientName': 'ANDROID_MUSIC',
                 'clientVersion': '5.16.51',
-                'androidSdkVersion': 30
+                'androidSdkVersion': 30,
+                'userAgent': 'com.google.android.apps.youtube.music/5.16.51 (Linux; U; Android 11) gzip'
             }
         },
         'INNERTUBE_CONTEXT_CLIENT_NAME': 21,
@@ -145,8 +149,9 @@ INNERTUBE_CLIENTS = {
         'INNERTUBE_CONTEXT': {
             'client': {
                 'clientName': 'ANDROID_CREATOR',
-                'clientVersion': '22.28.100',
-                'androidSdkVersion': 30
+                'clientVersion': '22.30.100',
+                'androidSdkVersion': 30,
+                'userAgent': 'com.google.android.apps.youtube.creator/22.30.100 (Linux; U; Android 11) gzip'
             },
         },
         'INNERTUBE_CONTEXT_CLIENT_NAME': 14,
@@ -159,8 +164,9 @@ INNERTUBE_CLIENTS = {
         'INNERTUBE_CONTEXT': {
             'client': {
                 'clientName': 'IOS',
-                'clientVersion': '17.30.1',
+                'clientVersion': '17.33.2',
                 'deviceModel': 'iPhone14,3',
+                'userAgent': 'com.google.ios.youtube/17.33.2 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
             }
         },
         'INNERTUBE_CONTEXT_CLIENT_NAME': 5,
@@ -170,8 +176,9 @@ INNERTUBE_CLIENTS = {
         'INNERTUBE_CONTEXT': {
             'client': {
                 'clientName': 'IOS_MESSAGES_EXTENSION',
-                'clientVersion': '17.30.1',
+                'clientVersion': '17.33.2',
                 'deviceModel': 'iPhone14,3',
+                'userAgent': 'com.google.ios.youtube/17.33.2 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
             },
         },
         'INNERTUBE_CONTEXT_CLIENT_NAME': 66,
@@ -182,7 +189,9 @@ INNERTUBE_CLIENTS = {
         'INNERTUBE_CONTEXT': {
             'client': {
                 'clientName': 'IOS_MUSIC',
-                'clientVersion': '5.18',
+                'clientVersion': '5.21',
+                'deviceModel': 'iPhone14,3',
+                'userAgent': 'com.google.ios.youtubemusic/5.21 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
             },
         },
         'INNERTUBE_CONTEXT_CLIENT_NAME': 26,
@@ -192,7 +201,9 @@ INNERTUBE_CLIENTS = {
         'INNERTUBE_CONTEXT': {
             'client': {
                 'clientName': 'IOS_CREATOR',
-                'clientVersion': '22.29.101',
+                'clientVersion': '22.33.101',
+                'deviceModel': 'iPhone14,3',
+                'userAgent': 'com.google.ios.ytcreator/22.33.101 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
             },
         },
         'INNERTUBE_CONTEXT_CLIENT_NAME': 15,
@@ -554,7 +565,8 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
             'Origin': origin,
             'X-Youtube-Identity-Token': identity_token or self._extract_identity_token(ytcfg),
             'X-Goog-PageId': account_syncid or self._extract_account_syncid(ytcfg),
-            'X-Goog-Visitor-Id': visitor_data or self._extract_visitor_data(ytcfg)
+            'X-Goog-Visitor-Id': visitor_data or self._extract_visitor_data(ytcfg),
+            'User-Agent': self._ytcfg_get_safe(ytcfg, lambda x: x['INNERTUBE_CONTEXT']['client']['userAgent'], default_client=default_client)
         }
         if session_index is None:
             session_index = self._extract_session_index(ytcfg)
@@ -809,7 +821,7 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
             # Youtube sometimes sends incomplete data
             # See: https://github.com/ytdl-org/youtube-dl/issues/28194
             if not traverse_obj(response, *variadic(check_get_keys)):
-                retry.error = ExtractorError('Incomplete data received')
+                retry.error = ExtractorError('Incomplete data received', expected=True)
                 continue
             return response
@@ -867,7 +879,7 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
                           else None),
             'live_status': ('is_upcoming' if scheduled_timestamp is not None
                             else 'was_live' if 'streamed' in time_text.lower()
-                            else 'is_live' if overlay_style is not None and overlay_style == 'LIVE' or 'live now' in badges
+                            else 'is_live' if overlay_style == 'LIVE' or 'live now' in badges
                             else None),
             'release_timestamp': scheduled_timestamp,
             'availability': self._availability(needs_premium='premium' in badges, needs_subscription='members only' in badges)
@@ -2147,6 +2159,35 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                 'comment_count': int,
                 'channel_follower_count': int
             }
+        }, {
+            # Same video as above, but with --compat-opt no-youtube-prefer-utc-upload-date
+            'url': 'https://www.youtube.com/watch?v=2NUZ8W2llS4',
+            'info_dict': {
+                'id': '2NUZ8W2llS4',
+                'ext': 'mp4',
+                'title': 'The NP that test your phone performance 🙂',
+                'description': 'md5:144494b24d4f9dfacb97c1bbef5de84d',
+                'uploader': 'Leon Nguyen',
+                'uploader_id': 'VNSXIII',
+                'uploader_url': 'http://www.youtube.com/user/VNSXIII',
+                'channel_id': 'UCRqNBSOHgilHfAczlUmlWHA',
+                'channel_url': 'https://www.youtube.com/channel/UCRqNBSOHgilHfAczlUmlWHA',
+                'duration': 21,
+                'view_count': int,
+                'age_limit': 0,
+                'categories': ['Gaming'],
+                'tags': 'count:23',
+                'playable_in_embed': True,
+                'live_status': 'not_live',
+                'upload_date': '20220102',
+                'like_count': int,
+                'availability': 'public',
+                'channel': 'Leon Nguyen',
+                'thumbnail': 'https://i.ytimg.com/vi_webp/2NUZ8W2llS4/maxresdefault.webp',
+                'comment_count': int,
+                'channel_follower_count': int
+            },
+            'params': {'compat_opts': ['no-youtube-prefer-utc-upload-date']}
         }, {
             # date text is premiered video, ensure upload date in UTC (published 1641172509)
             'url': 'https://www.youtube.com/watch?v=mzZzzBU6lrM',
@@ -2512,20 +2553,17 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
         assert os.path.basename(func_id) == func_id
         self.write_debug(f'Extracting signature function {func_id}')
-        cache_spec = self.cache.load('youtube-sigfuncs', func_id)
-        if cache_spec is not None:
-            return lambda s: ''.join(s[i] for i in cache_spec)
-        code = self._load_player(video_id, player_url)
+        cache_spec, code = self.cache.load('youtube-sigfuncs', func_id), None
+        if not cache_spec:
+            code = self._load_player(video_id, player_url)
         if code:
             res = self._parse_sig_js(code)
             test_string = ''.join(map(chr, range(len(example_sig))))
-            cache_res = res(test_string)
-            cache_spec = [ord(c) for c in cache_res]
+            cache_spec = [ord(c) for c in res(test_string)]
             self.cache.store('youtube-sigfuncs', func_id, cache_spec)
-            return res
+        return lambda s: ''.join(s[i] for i in cache_spec)
     def _print_sig_code(self, func, example_sig):
         if not self.get_param('youtube_print_sig_code'):
@@ -2593,18 +2631,29 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
         initial_function = jsi.extract_function(funcname)
         return lambda s: initial_function([s])
+    def _cached(self, func, *cache_id):
+        def inner(*args, **kwargs):
+            if cache_id not in self._player_cache:
+                try:
+                    self._player_cache[cache_id] = func(*args, **kwargs)
+                except ExtractorError as e:
+                    self._player_cache[cache_id] = e
+                except Exception as e:
+                    self._player_cache[cache_id] = ExtractorError(traceback.format_exc(), cause=e)
+            ret = self._player_cache[cache_id]
+            if isinstance(ret, Exception):
+                raise ret
+            return ret
+        return inner
     def _decrypt_signature(self, s, video_id, player_url):
         """Turn the encrypted s field into a working signature"""
-        try:
-            player_id = (player_url, self._signature_cache_id(s))
-            if player_id not in self._player_cache:
-                func = self._extract_signature_function(video_id, player_url, s)
-                self._player_cache[player_id] = func
-            func = self._player_cache[player_id]
-            self._print_sig_code(func, s)
-            return func(s)
-        except Exception as e:
-            raise ExtractorError(traceback.format_exc(), cause=e, video_id=video_id)
+        extract_sig = self._cached(
+            self._extract_signature_function, 'sig', player_url, self._signature_cache_id(s))
+        func = extract_sig(video_id, player_url, s)
+        self._print_sig_code(func, s)
+        return func(s)
     def _decrypt_nsig(self, s, video_id, player_url):
         """Turn the encrypted n field into a working signature"""
@@ -2612,49 +2661,87 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
             raise ExtractorError('Cannot decrypt nsig without player_url')
         player_url = urljoin('https://www.youtube.com', player_url)
-        sig_id = ('nsig_value', s)
-        if sig_id in self._player_cache:
-            return self._player_cache[sig_id]
         try:
-            player_id = ('nsig', player_url)
-            if player_id not in self._player_cache:
-                self._player_cache[player_id] = self._extract_n_function(video_id, player_url)
-            func = self._player_cache[player_id]
-            self._player_cache[sig_id] = func(s)
-            self.write_debug(f'Decrypted nsig {s} => {self._player_cache[sig_id]}')
-            return self._player_cache[sig_id]
-        except Exception as e:
-            raise ExtractorError(traceback.format_exc(), cause=e, video_id=video_id)
-    def _extract_n_function_name(self, jscode):
-        nfunc, idx = self._search_regex(
-            r'\.get\("n"\)\)&&\(b=(?P<nfunc>[a-zA-Z0-9$]+)(?:\[(?P<idx>\d+)\])?\([a-zA-Z0-9]\)',
-            jscode, 'Initial JS player n function name', group=('nfunc', 'idx'))
-        if not idx:
-            return nfunc
-        return json.loads(js_to_json(self._search_regex(
-            rf'var {re.escape(nfunc)}\s*=\s*(\[.+?\]);', jscode,
-            f'Initial JS player n function list ({nfunc}.{idx})')))[int(idx)]
-    def _extract_n_function(self, video_id, player_url):
-        player_id = self._extract_player_info(player_url)
-        func_code = self.cache.load('youtube-nsig', player_id)
-        if func_code:
-            jsi = JSInterpreter(func_code)
-        else:
-            jscode = self._load_player(video_id, player_url)
-            funcname = self._extract_n_function_name(jscode)
-            jsi = JSInterpreter(jscode)
-            func_code = jsi.extract_function_code(funcname)
-            self.cache.store('youtube-nsig', player_id, func_code)
+            jsi, player_id, func_code = self._extract_n_function_code(video_id, player_url)
+        except ExtractorError as e:
+            raise ExtractorError('Unable to extract nsig function code', cause=e)
         if self.get_param('youtube_print_sig_code'):
             self.to_screen(f'Extracted nsig function from {player_id}:\n{func_code[1]}\n')
+        try:
+            extract_nsig = self._cached(self._extract_n_function_from_code, 'nsig func', player_url)
+            ret = extract_nsig(jsi, func_code)(s)
+        except JSInterpreter.Exception as e:
+            try:
+                jsi = PhantomJSwrapper(self, timeout=5000)
+            except ExtractorError:
+                raise e
+            self.report_warning(
+                f'Native nsig extraction failed: Trying with PhantomJS\n'
+                f' n = {s} ; player = {player_url}', video_id)
+            self.write_debug(e)
+            args, func_body = func_code
+            ret = jsi.execute(
+                f'console.log(function({", ".join(args)}) {{ {func_body} }}({s!r}));',
+                video_id=video_id, note='Executing signature code').strip()
+        self.write_debug(f'Decrypted nsig {s} => {ret}')
+        return ret
+    def _extract_n_function_name(self, jscode):
+        funcname, idx = self._search_regex(
+            r'\.get\("n"\)\)&&\(b=(?P<nfunc>[a-zA-Z0-9$]+)(?:\[(?P<idx>\d+)\])?\([a-zA-Z0-9]\)',
+            jscode, 'Initial JS player n function name', group=('nfunc', 'idx'))
+        if not idx:
+            return funcname
+        return json.loads(js_to_json(self._search_regex(
+            rf'var {re.escape(funcname)}\s*=\s*(\[.+?\]);', jscode,
+            f'Initial JS player n function list ({funcname}.{idx})')))[int(idx)]
+    def _extract_n_function_code(self, video_id, player_url):
+        player_id = self._extract_player_info(player_url)
+        func_code = self.cache.load('youtube-nsig', player_id, min_ver='2022.09.1')
+        jscode = func_code or self._load_player(video_id, player_url)
+        jsi = JSInterpreter(jscode)
+        if func_code:
+            return jsi, player_id, func_code
+        func_name = self._extract_n_function_name(jscode)
+        # For redundancy
+        func_code = self._search_regex(
+            r'''(?xs)%s\s*=\s*function\s*\((?P<var>[\w$]+)\)\s*
+                     # NB: The end of the regex is intentionally kept strict
+                     {(?P<code>.+?}\s*return\ [\w$]+.join\(""\))};''' % func_name,
+            jscode, 'nsig function', group=('var', 'code'), default=None)
+        if func_code:
+            func_code = ([func_code[0]], func_code[1])
+        else:
+            self.write_debug('Extracting nsig function with jsinterp')
+            func_code = jsi.extract_function_code(func_name)
+        self.cache.store('youtube-nsig', player_id, func_code)
+        return jsi, player_id, func_code
+    def _extract_n_function_from_code(self, jsi, func_code):
         func = jsi.extract_function_from_code(*func_code)
-        return lambda s: func([s])
+        def extract_nsig(s):
+            try:
+                ret = func([s])
+            except JSInterpreter.Exception:
+                raise
+            except Exception as e:
+                raise JSInterpreter.Exception(traceback.format_exc(), cause=e)
+            if ret.startswith('enhanced_except_'):
+                raise JSInterpreter.Exception('Signature function returned an exception')
+            return ret
+        return extract_nsig
     def _extract_signature_timestamp(self, video_id, player_url, ytcfg=None, fatal=False):
         """
@@ -2917,8 +3004,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
         # YouTube comments have a max depth of 2
         max_depth = int_or_none(get_single_config_arg('max_comment_depth'))
         if max_depth:
-            self._downloader.deprecation_warning(
-                '[youtube] max_comment_depth extractor argument is deprecated. Set max replies in the max-comments extractor argument instead.')
+            self._downloader.deprecated_feature('[youtube] max_comment_depth extractor argument is deprecated. '
+                                                'Set max replies in the max-comments extractor argument instead')
         if max_depth == 1 and parent:
             return
@@ -3040,7 +3127,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
     def _is_unplayable(player_response):
         return traverse_obj(player_response, ('playabilityStatus', 'status')) == 'UNPLAYABLE'
-    def _extract_player_response(self, client, video_id, master_ytcfg, player_ytcfg, player_url, initial_pr):
+    _STORY_PLAYER_PARAMS = '8AEB'
+    def _extract_player_response(self, client, video_id, master_ytcfg, player_ytcfg, player_url, initial_pr, smuggled_data):
         session_index = self._extract_session_index(player_ytcfg, master_ytcfg)
         syncid = self._extract_account_syncid(player_ytcfg, master_ytcfg, initial_pr)
@@ -3050,8 +3139,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
         yt_query = {
             'videoId': video_id,
-            'params': '8AEB'  # enable stories
         }
+        if smuggled_data.get('is_story') or _split_innertube_client(client)[0] == 'android':
+            yt_query['params'] = self._STORY_PLAYER_PARAMS
         yt_query.update(self._generate_player_context(sts))
         return self._extract_response(
             item_id=video_id, ep='player', query=yt_query,
@@ -3084,7 +3175,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
         return orderedSet(requested_clients)
-    def _extract_player_responses(self, clients, video_id, webpage, master_ytcfg):
+    def _extract_player_responses(self, clients, video_id, webpage, master_ytcfg, smuggled_data):
         initial_pr = None
         if webpage:
             initial_pr = self._search_json(
@@ -3134,7 +3225,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
             try:
                 pr = initial_pr if client == 'web' and initial_pr else self._extract_player_response(
-                    client, video_id, player_ytcfg or master_ytcfg, player_ytcfg, player_url if require_js_player else None, initial_pr)
+                    client, video_id, player_ytcfg or master_ytcfg, player_ytcfg, player_url if require_js_player else None, initial_pr, smuggled_data)
             except ExtractorError as e:
                 if last_error:
                     self.report_warning(last_error)
@@ -3168,7 +3259,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
     def _extract_formats_and_subtitles(self, streaming_data, video_id, player_url, is_live, duration):
         itags, stream_ids = {}, []
-        itag_qualities, res_qualities = {}, {}
+        itag_qualities, res_qualities = {}, {0: None}
         q = qualities([
             # Normally tiny is the smallest video-only formats. But
             # audio-only formats with unknown quality may get tagged as tiny
@@ -3220,7 +3311,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                         self._decrypt_signature(encrypted_sig, video_id, player_url)
                     )
                 except ExtractorError as e:
-                    self.report_warning('Signature extraction failed: Some formats may be missing', only_once=True)
+                    self.report_warning('Signature extraction failed: Some formats may be missing',
+                                        video_id=video_id, only_once=True)
                     self.write_debug(e, only_once=True)
                     continue
@@ -3228,12 +3320,18 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
             throttled = False
             if query.get('n'):
                 try:
+                    decrypt_nsig = self._cached(self._decrypt_nsig, 'nsig', query['n'][0])
                     fmt_url = update_url_query(fmt_url, {
-                        'n': self._decrypt_nsig(query['n'][0], video_id, player_url)})
+                        'n': decrypt_nsig(query['n'][0], video_id, player_url)
+                    })
                 except ExtractorError as e:
+                    phantomjs_hint = ''
+                    if isinstance(e, JSInterpreter.Exception):
+                        phantomjs_hint = (f' Install {self._downloader._format_err("PhantomJS", self._downloader.Styles.EMPHASIS)} '
+                                          f'to workaround the issue. {PhantomJSwrapper.INSTALL_HINT}\n')
                     self.report_warning(
-                        'nsig extraction failed: You may experience throttling for some formats\n'
-                        f'n = {query["n"][0]} ; player = {player_url}', only_once=True)
+                        f'nsig extraction failed: You may experience throttling for some formats\n{phantomjs_hint}'
+                        f' n = {query["n"][0]} ; player = {player_url}', video_id=video_id, only_once=True)
                     self.write_debug(e, only_once=True)
                     throttled = True
@@ -3320,10 +3418,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
f['format_id'] = itag f['format_id'] = itag
itags[itag] = proto itags[itag] = proto
f['quality'] = next(( f['quality'] = q(itag_qualities.get(try_get(f, lambda f: f['format_id'].split('-')[0]), -1))
q(qdict[val]) if f['quality'] == -1 and f.get('height'):
for val, qdict in ((f.get('format_id', '').split('-')[0], itag_qualities), (f.get('height'), res_qualities)) f['quality'] = q(res_qualities[min(res_qualities, key=lambda x: abs(x - f['height']))])
if val in qdict), -1)
return True return True
subtitles = {} subtitles = {}
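The new side of this hunk no longer requires the format's height to be an exact key of `res_qualities`; it picks the closest known resolution instead. A minimal sketch of that lookup, assuming a hypothetical `res_qualities` mapping (the real one is built elsewhere in the extractor and is not shown in this hunk) and leaving out the `q(...)` ranking call:

```python
def nearest_resolution_quality(height, res_qualities):
    """Return the quality label of the known resolution closest to `height`."""
    nearest = min(res_qualities, key=lambda known: abs(known - height))
    return res_qualities[nearest]


# Hypothetical mapping of resolution height -> quality label (illustration only)
res_qualities = {144: 'tiny', 360: 'medium', 720: 'hd720', 1080: 'hd1080'}

# 1088 is not a key, so an exact-match lookup would fall through to -1;
# the nearest-key lookup resolves it to the 1080 entry.
print(nearest_resolution_quality(1088, res_qualities))  # hd1080
```

When two resolutions are equally close, `min` returns whichever key it encounters first, so the result then depends on the mapping's insertion order.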
@@ -3392,14 +3489,17 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def _download_player_responses(self, url, smuggled_data, video_id, webpage_url): def _download_player_responses(self, url, smuggled_data, video_id, webpage_url):
webpage = None webpage = None
if 'webpage' not in self._configuration_arg('player_skip'): if 'webpage' not in self._configuration_arg('player_skip'):
query = {'bpctr': '9999999999', 'has_verified': '1'}
if smuggled_data.get('is_story'):
query['pp'] = self._STORY_PLAYER_PARAMS
webpage = self._download_webpage( webpage = self._download_webpage(
webpage_url + '&bpctr=9999999999&has_verified=1&pp=8AEB', video_id, fatal=False) webpage_url, video_id, fatal=False, query=query)
master_ytcfg = self.extract_ytcfg(video_id, webpage) or self._get_default_ytcfg() master_ytcfg = self.extract_ytcfg(video_id, webpage) or self._get_default_ytcfg()
player_responses, player_url = self._extract_player_responses( player_responses, player_url = self._extract_player_responses(
self._get_requested_clients(url, smuggled_data), self._get_requested_clients(url, smuggled_data),
video_id, webpage, master_ytcfg) video_id, webpage, master_ytcfg, smuggled_data)
return webpage, master_ytcfg, player_responses, player_url return webpage, master_ytcfg, player_responses, player_url
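The hunk above replaces string concatenation onto `webpage_url` with a `query` dict, and sends `pp` only when the smuggled data marks the URL as a story. A small sketch of the resulting query strings, assuming the value of `_STORY_PLAYER_PARAMS` equals the previously hard-coded `8AEB` (the constant's definition is not part of this hunk) and using a hypothetical helper name:

```python
from urllib.parse import urlencode

# Assumption: placeholder taken from the old hard-coded '&pp=8AEB'
STORY_PLAYER_PARAMS = '8AEB'


def build_watch_query(smuggled_data):
    """Hypothetical helper mirroring the query construction on the new side."""
    query = {'bpctr': '9999999999', 'has_verified': '1'}
    if smuggled_data.get('is_story'):
        query['pp'] = STORY_PLAYER_PARAMS
    return query


print(urlencode(build_watch_query({})))
print(urlencode(build_watch_query({'is_story': True})))
```

Passing a dict also avoids assuming that `webpage_url` already contains a `?`, which the old `+ '&bpctr=...'` concatenation relied on.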
@@ -3865,7 +3965,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
upload_date = ( upload_date = (
unified_strdate(get_first(microformats, 'uploadDate')) unified_strdate(get_first(microformats, 'uploadDate'))
or unified_strdate(search_meta('uploadDate'))) or unified_strdate(search_meta('uploadDate')))
if not upload_date or (not info.get('is_live') and not info.get('was_live') and info.get('live_status') != 'is_upcoming'): if not upload_date or (
not info.get('is_live')
and not info.get('was_live')
and info.get('live_status') != 'is_upcoming'
and 'no-youtube-prefer-utc-upload-date' not in self.get_param('compat_opts', [])
):
upload_date = strftime_or_none(self._extract_time_text(vpir, 'dateText')[0], '%Y%m%d') or upload_date upload_date = strftime_or_none(self._extract_time_text(vpir, 'dateText')[0], '%Y%m%d') or upload_date
info['upload_date'] = upload_date info['upload_date'] = upload_date
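The condition rewritten in this hunk decides whether the page's `dateText` may override the microformat upload date. Isolated as a standalone predicate (the function name is hypothetical; `info` and `compat_opts` stand in for the extractor's state), it might look like this:

```python
def prefers_date_text(upload_date, info, compat_opts):
    """True when the dateText-derived date should be tried before `upload_date`."""
    return not upload_date or (
        not info.get('is_live')
        and not info.get('was_live')
        and info.get('live_status') != 'is_upcoming'
        and 'no-youtube-prefer-utc-upload-date' not in compat_opts
    )


# Regular video: dateText is tried first
print(prefers_date_text('20220901', {}, []))
# Same video with the compat option set: the microformat date is kept
print(prefers_date_text('20220901', {}, ['no-youtube-prefer-utc-upload-date']))
```

A missing `upload_date` still falls back to `dateText` regardless of the compat option, because the `not upload_date` branch is checked first.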
@@ -5972,7 +6077,7 @@ class YoutubeStoriesIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
playlist_id = f'RLTD{self._match_id(url)}' playlist_id = f'RLTD{self._match_id(url)}'
return self.url_result( return self.url_result(
f'https://www.youtube.com/playlist?list={playlist_id}&playnext=1', smuggle_url(f'https://www.youtube.com/playlist?list={playlist_id}&playnext=1', {'is_story': True}),
ie=YoutubeTabIE, video_id=playlist_id) ie=YoutubeTabIE, video_id=playlist_id)


@@ -236,32 +236,24 @@ class ZattooPlatformBaseIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
video_id, record_id = self._match_valid_url(url).groups() video_id, record_id = self._match_valid_url(url).groups()
return self._extract_video(video_id, record_id) return getattr(self, f'_extract_{self._TYPE}')(video_id or record_id)
def _make_valid_url(host): def _create_valid_url(host, match, qs, base_re=None):
return rf'https?://(?:www\.)?{re.escape(host)}/watch/[^/]+?/(?P<id>[0-9]+)[^/]+(?:/(?P<recid>[0-9]+))?' match_base = fr'|{base_re}/(?P<vid1>{match})' if base_re else '(?P<vid1>)'
return rf'''(?x)https?://(?:www\.)?{re.escape(host)}/(?:
[^?#]+\?(?:[^#]+&)?{qs}=(?P<vid2>{match})
{match_base}
)'''
class ZattooBaseIE(ZattooPlatformBaseIE): class ZattooBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'zattoo' _NETRC_MACHINE = 'zattoo'
_HOST = 'zattoo.com' _HOST = 'zattoo.com'
@staticmethod
def _create_valid_url(match, qs, base_re=None):
match_base = fr'|{base_re}/(?P<vid1>{match})' if base_re else '(?P<vid1>)'
return rf'''(?x)https?://(?:www\.)?zattoo\.com/(?:
[^?#]+\?(?:[^#]+&)?{qs}=(?P<vid2>{match})
{match_base}
)'''
def _real_extract(self, url):
vid1, vid2 = self._match_valid_url(url).group('vid1', 'vid2')
return getattr(self, f'_extract_{self._TYPE}')(vid1 or vid2)
class ZattooIE(ZattooBaseIE): class ZattooIE(ZattooBaseIE):
_VALID_URL = ZattooBaseIE._create_valid_url(r'\d+', 'program', '(?:program|watch)/[^/]+') _VALID_URL = _create_valid_url(ZattooBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video' _TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://zattoo.com/program/zdf/250170418', 'url': 'https://zattoo.com/program/zdf/250170418',
@@ -288,7 +280,7 @@ class ZattooIE(ZattooBaseIE):
class ZattooLiveIE(ZattooBaseIE): class ZattooLiveIE(ZattooBaseIE):
_VALID_URL = ZattooBaseIE._create_valid_url(r'[^/?&#]+', 'channel', 'live') _VALID_URL = _create_valid_url(ZattooBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live' _TYPE = 'live'
_TESTS = [{ _TESTS = [{
'url': 'https://zattoo.com/channels/german?channel=srf_zwei', 'url': 'https://zattoo.com/channels/german?channel=srf_zwei',
@@ -304,7 +296,7 @@ class ZattooLiveIE(ZattooBaseIE):
class ZattooMoviesIE(ZattooBaseIE): class ZattooMoviesIE(ZattooBaseIE):
_VALID_URL = ZattooBaseIE._create_valid_url(r'\w+', 'movie_id', 'vod/movies') _VALID_URL = _create_valid_url(ZattooBaseIE._HOST, r'\w+', 'movie_id', 'vod/movies')
_TYPE = 'ondemand' _TYPE = 'ondemand'
_TESTS = [{ _TESTS = [{
'url': 'https://zattoo.com/vod/movies/7521', 'url': 'https://zattoo.com/vod/movies/7521',
@@ -316,7 +308,7 @@ class ZattooMoviesIE(ZattooBaseIE):
class ZattooRecordingsIE(ZattooBaseIE): class ZattooRecordingsIE(ZattooBaseIE):
_VALID_URL = ZattooBaseIE._create_valid_url(r'\d+', 'recording') _VALID_URL = _create_valid_url('zattoo.com', r'\d+', 'recording')
_TYPE = 'record' _TYPE = 'record'
_TESTS = [{ _TESTS = [{
'url': 'https://zattoo.com/recordings?recording=193615508', 'url': 'https://zattoo.com/recordings?recording=193615508',
@@ -327,139 +319,547 @@ class ZattooRecordingsIE(ZattooBaseIE):
}] }]
class NetPlusIE(ZattooPlatformBaseIE): class NetPlusTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'netplus' _NETRC_MACHINE = 'netplus'
_HOST = 'netplus.tv' _HOST = 'netplus.tv'
_API_HOST = 'www.%s' % _HOST _API_HOST = 'www.%s' % _HOST
_VALID_URL = _make_valid_url(_HOST)
class NetPlusTVIE(NetPlusTVBaseIE):
_VALID_URL = _create_valid_url(NetPlusTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://www.netplus.tv/watch/abc/123-abc', 'url': 'https://netplus.tv/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://netplus.tv/guide/german?channel=srf1&program=169860555',
'only_matching': True, 'only_matching': True,
}] }]
class MNetTVIE(ZattooPlatformBaseIE): class NetPlusTVLiveIE(NetPlusTVBaseIE):
_VALID_URL = _create_valid_url(NetPlusTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://netplus.tv/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://netplus.tv/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if NetPlusTVIE.suitable(url) else super().suitable(url)
class NetPlusTVRecordingsIE(NetPlusTVBaseIE):
_VALID_URL = _create_valid_url(NetPlusTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://netplus.tv/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://netplus.tv/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class MNetTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'mnettv' _NETRC_MACHINE = 'mnettv'
_HOST = 'tvplus.m-net.de' _HOST = 'tvplus.m-net.de'
_VALID_URL = _make_valid_url(_HOST)
class MNetTVIE(MNetTVBaseIE):
_VALID_URL = _create_valid_url(MNetTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://tvplus.m-net.de/watch/abc/123-abc', 'url': 'https://tvplus.m-net.de/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://tvplus.m-net.de/guide/german?channel=srf1&program=169860555',
'only_matching': True, 'only_matching': True,
}] }]
class WalyTVIE(ZattooPlatformBaseIE): class MNetTVLiveIE(MNetTVBaseIE):
_VALID_URL = _create_valid_url(MNetTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://tvplus.m-net.de/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://tvplus.m-net.de/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if MNetTVIE.suitable(url) else super().suitable(url)
class MNetTVRecordingsIE(MNetTVBaseIE):
_VALID_URL = _create_valid_url(MNetTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://tvplus.m-net.de/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://tvplus.m-net.de/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class WalyTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'walytv' _NETRC_MACHINE = 'walytv'
_HOST = 'player.waly.tv' _HOST = 'player.waly.tv'
_VALID_URL = _make_valid_url(_HOST)
class WalyTVIE(WalyTVBaseIE):
_VALID_URL = _create_valid_url(WalyTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://player.waly.tv/watch/abc/123-abc', 'url': 'https://player.waly.tv/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://player.waly.tv/guide/german?channel=srf1&program=169860555',
'only_matching': True, 'only_matching': True,
}] }]
class BBVTVIE(ZattooPlatformBaseIE): class WalyTVLiveIE(WalyTVBaseIE):
_VALID_URL = _create_valid_url(WalyTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://player.waly.tv/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://player.waly.tv/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if WalyTVIE.suitable(url) else super().suitable(url)
class WalyTVRecordingsIE(WalyTVBaseIE):
_VALID_URL = _create_valid_url(WalyTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://player.waly.tv/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://player.waly.tv/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class BBVTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'bbvtv' _NETRC_MACHINE = 'bbvtv'
_HOST = 'bbv-tv.net' _HOST = 'bbv-tv.net'
_API_HOST = 'www.%s' % _HOST _API_HOST = 'www.%s' % _HOST
_VALID_URL = _make_valid_url(_HOST)
class BBVTVIE(BBVTVBaseIE):
_VALID_URL = _create_valid_url(BBVTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://www.bbv-tv.net/watch/abc/123-abc', 'url': 'https://bbv-tv.net/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://bbv-tv.net/guide/german?channel=srf1&program=169860555',
'only_matching': True, 'only_matching': True,
}] }]
class VTXTVIE(ZattooPlatformBaseIE): class BBVTVLiveIE(BBVTVBaseIE):
_VALID_URL = _create_valid_url(BBVTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://bbv-tv.net/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://bbv-tv.net/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if BBVTVIE.suitable(url) else super().suitable(url)
class BBVTVRecordingsIE(BBVTVBaseIE):
_VALID_URL = _create_valid_url(BBVTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://bbv-tv.net/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://bbv-tv.net/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class VTXTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'vtxtv' _NETRC_MACHINE = 'vtxtv'
_HOST = 'vtxtv.ch' _HOST = 'vtxtv.ch'
_API_HOST = 'www.%s' % _HOST _API_HOST = 'www.%s' % _HOST
_VALID_URL = _make_valid_url(_HOST)
class VTXTVIE(VTXTVBaseIE):
_VALID_URL = _create_valid_url(VTXTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://www.vtxtv.ch/watch/abc/123-abc', 'url': 'https://vtxtv.ch/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://vtxtv.ch/guide/german?channel=srf1&program=169860555',
'only_matching': True, 'only_matching': True,
}] }]
class GlattvisionTVIE(ZattooPlatformBaseIE): class VTXTVLiveIE(VTXTVBaseIE):
_VALID_URL = _create_valid_url(VTXTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://vtxtv.ch/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://vtxtv.ch/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if VTXTVIE.suitable(url) else super().suitable(url)
class VTXTVRecordingsIE(VTXTVBaseIE):
_VALID_URL = _create_valid_url(VTXTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://vtxtv.ch/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://vtxtv.ch/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class GlattvisionTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'glattvisiontv' _NETRC_MACHINE = 'glattvisiontv'
_HOST = 'iptv.glattvision.ch' _HOST = 'iptv.glattvision.ch'
_VALID_URL = _make_valid_url(_HOST)
class GlattvisionTVIE(GlattvisionTVBaseIE):
_VALID_URL = _create_valid_url(GlattvisionTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://iptv.glattvision.ch/watch/abc/123-abc', 'url': 'https://iptv.glattvision.ch/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://iptv.glattvision.ch/guide/german?channel=srf1&program=169860555',
'only_matching': True, 'only_matching': True,
}] }]
class SAKTVIE(ZattooPlatformBaseIE): class GlattvisionTVLiveIE(GlattvisionTVBaseIE):
_VALID_URL = _create_valid_url(GlattvisionTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://iptv.glattvision.ch/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://iptv.glattvision.ch/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if GlattvisionTVIE.suitable(url) else super().suitable(url)
class GlattvisionTVRecordingsIE(GlattvisionTVBaseIE):
_VALID_URL = _create_valid_url(GlattvisionTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://iptv.glattvision.ch/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://iptv.glattvision.ch/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class SAKTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'saktv' _NETRC_MACHINE = 'saktv'
_HOST = 'saktv.ch' _HOST = 'saktv.ch'
_API_HOST = 'www.%s' % _HOST _API_HOST = 'www.%s' % _HOST
_VALID_URL = _make_valid_url(_HOST)
class SAKTVIE(SAKTVBaseIE):
_VALID_URL = _create_valid_url(SAKTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://www.saktv.ch/watch/abc/123-abc', 'url': 'https://saktv.ch/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://saktv.ch/guide/german?channel=srf1&program=169860555',
'only_matching': True, 'only_matching': True,
}] }]
class EWETVIE(ZattooPlatformBaseIE): class SAKTVLiveIE(SAKTVBaseIE):
_VALID_URL = _create_valid_url(SAKTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://saktv.ch/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://saktv.ch/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if SAKTVIE.suitable(url) else super().suitable(url)
class SAKTVRecordingsIE(SAKTVBaseIE):
_VALID_URL = _create_valid_url(SAKTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://saktv.ch/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://saktv.ch/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class EWETVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'ewetv' _NETRC_MACHINE = 'ewetv'
_HOST = 'tvonline.ewe.de' _HOST = 'tvonline.ewe.de'
_VALID_URL = _make_valid_url(_HOST)
class EWETVIE(EWETVBaseIE):
_VALID_URL = _create_valid_url(EWETVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://tvonline.ewe.de/watch/abc/123-abc', 'url': 'https://tvonline.ewe.de/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://tvonline.ewe.de/guide/german?channel=srf1&program=169860555',
'only_matching': True, 'only_matching': True,
}] }]
class QuantumTVIE(ZattooPlatformBaseIE): class EWETVLiveIE(EWETVBaseIE):
_VALID_URL = _create_valid_url(EWETVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://tvonline.ewe.de/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://tvonline.ewe.de/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if EWETVIE.suitable(url) else super().suitable(url)
class EWETVRecordingsIE(EWETVBaseIE):
_VALID_URL = _create_valid_url(EWETVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://tvonline.ewe.de/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://tvonline.ewe.de/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class QuantumTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'quantumtv' _NETRC_MACHINE = 'quantumtv'
_HOST = 'quantum-tv.com' _HOST = 'quantum-tv.com'
_API_HOST = 'www.%s' % _HOST _API_HOST = 'www.%s' % _HOST
_VALID_URL = _make_valid_url(_HOST)
class QuantumTVIE(QuantumTVBaseIE):
_VALID_URL = _create_valid_url(QuantumTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://www.quantum-tv.com/watch/abc/123-abc', 'url': 'https://quantum-tv.com/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://quantum-tv.com/guide/german?channel=srf1&program=169860555',
'only_matching': True, 'only_matching': True,
}] }]
class OsnatelTVIE(ZattooPlatformBaseIE): class QuantumTVLiveIE(QuantumTVBaseIE):
_VALID_URL = _create_valid_url(QuantumTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://quantum-tv.com/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://quantum-tv.com/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if QuantumTVIE.suitable(url) else super().suitable(url)
class QuantumTVRecordingsIE(QuantumTVBaseIE):
_VALID_URL = _create_valid_url(QuantumTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://quantum-tv.com/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://quantum-tv.com/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class OsnatelTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'osnateltv' _NETRC_MACHINE = 'osnateltv'
_HOST = 'tvonline.osnatel.de' _HOST = 'tvonline.osnatel.de'
_VALID_URL = _make_valid_url(_HOST)
class OsnatelTVIE(OsnatelTVBaseIE):
_VALID_URL = _create_valid_url(OsnatelTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://tvonline.osnatel.de/watch/abc/123-abc', 'url': 'https://tvonline.osnatel.de/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://tvonline.osnatel.de/guide/german?channel=srf1&program=169860555',
'only_matching': True, 'only_matching': True,
}] }]
class EinsUndEinsTVIE(ZattooPlatformBaseIE): class OsnatelTVLiveIE(OsnatelTVBaseIE):
_VALID_URL = _create_valid_url(OsnatelTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://tvonline.osnatel.de/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://tvonline.osnatel.de/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if OsnatelTVIE.suitable(url) else super().suitable(url)
class OsnatelTVRecordingsIE(OsnatelTVBaseIE):
_VALID_URL = _create_valid_url(OsnatelTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://tvonline.osnatel.de/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://tvonline.osnatel.de/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class EinsUndEinsTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = '1und1tv' _NETRC_MACHINE = '1und1tv'
_HOST = '1und1.tv' _HOST = '1und1.tv'
_API_HOST = 'www.%s' % _HOST _API_HOST = 'www.%s' % _HOST
_VALID_URL = _make_valid_url(_HOST)
class EinsUndEinsTVIE(EinsUndEinsTVBaseIE):
_VALID_URL = _create_valid_url(EinsUndEinsTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://www.1und1.tv/watch/abc/123-abc', 'url': 'https://1und1.tv/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://1und1.tv/guide/german?channel=srf1&program=169860555',
'only_matching': True, 'only_matching': True,
}] }]
class SaltTVIE(ZattooPlatformBaseIE): class EinsUndEinsTVLiveIE(EinsUndEinsTVBaseIE):
_VALID_URL = _create_valid_url(EinsUndEinsTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://1und1.tv/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://1und1.tv/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if EinsUndEinsTVIE.suitable(url) else super().suitable(url)
class EinsUndEinsTVRecordingsIE(EinsUndEinsTVBaseIE):
_VALID_URL = _create_valid_url(EinsUndEinsTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://1und1.tv/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://1und1.tv/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class SaltTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'salttv' _NETRC_MACHINE = 'salttv'
_HOST = 'tv.salt.ch' _HOST = 'tv.salt.ch'
_VALID_URL = _make_valid_url(_HOST)
class SaltTVIE(SaltTVBaseIE):
_VALID_URL = _create_valid_url(SaltTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{ _TESTS = [{
'url': 'https://tv.salt.ch/watch/abc/123-abc', 'url': 'https://tv.salt.ch/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://tv.salt.ch/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class SaltTVLiveIE(SaltTVBaseIE):
_VALID_URL = _create_valid_url(SaltTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://tv.salt.ch/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://tv.salt.ch/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if SaltTVIE.suitable(url) else super().suitable(url)
class SaltTVRecordingsIE(SaltTVBaseIE):
_VALID_URL = _create_valid_url(SaltTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://tv.salt.ch/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://tv.salt.ch/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True, 'only_matching': True,
}] }]
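To see how the shared `_create_valid_url` helper behaves, the sketch below copies it from the new side of the diff and runs it against two of the `_TESTS` URLs. The `extract_id` wrapper and the module-level constants are hypothetical and only serve the illustration. It also suggests why each `*LiveIE` class overrides `suitable()`: the live pattern matches guide URLs that carry both `channel=` and `program=`.

```python
import re


def _create_valid_url(host, match, qs, base_re=None):
    # Copied from the new side of the diff above
    match_base = fr'|{base_re}/(?P<vid1>{match})' if base_re else '(?P<vid1>)'
    return rf'''(?x)https?://(?:www\.)?{re.escape(host)}/(?:
        [^?#]+\?(?:[^#]+&)?{qs}=(?P<vid2>{match})
        {match_base}
    )'''


HOST = 'netplus.tv'
VIDEO_RE = _create_valid_url(HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
LIVE_RE = _create_valid_url(HOST, r'[^/?&#]+', 'channel', 'live')


def extract_id(pattern, url):
    """Hypothetical wrapper: return whichever ID group matched, or None."""
    m = re.match(pattern, url)
    if not m:
        return None
    vid1, vid2 = m.group('vid1', 'vid2')
    return vid1 or vid2


guide_url = 'https://netplus.tv/guide/german?channel=srf1&program=169860555'

print(extract_id(VIDEO_RE, 'https://netplus.tv/program/daserste/210177916'))  # path form
print(extract_id(VIDEO_RE, guide_url))  # query-string form
print(extract_id(LIVE_RE, guide_url))   # the live pattern matches this URL too
```

Because the last call also succeeds, a guide URL is claimed by both patterns; deferring to the video extractor in `suitable()` resolves that overlap.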


@@ -16,50 +16,72 @@ from .utils import (
write_string, write_string,
) )
_NAME_RE = r'[a-zA-Z_$][\w$]*'
# Ref: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Operator_Precedence def _js_bit_op(op):
_OPERATORS = { # None => Defined in JSInterpreter._operator def zeroise(x):
'?': None, return 0 if x in (None, JS_Undefined) else x
'||': None, def wrapped(a, b):
'&&': None, return op(zeroise(a), zeroise(b)) & 0xffffffff
'&': operator.and_,
'|': operator.or_,
'^': operator.xor,
'===': operator.is_, return wrapped
'!==': operator.is_not,
'==': operator.eq,
'!=': operator.ne,
'<=': operator.le,
'>=': operator.ge,
'<': operator.lt,
'>': operator.gt,
'>>': operator.rshift,
'<<': operator.lshift,
'+': operator.add,
'-': operator.sub,
'*': operator.mul,
'/': operator.truediv,
'%': operator.mod,
'**': operator.pow,
}
_COMP_OPERATORS = {'===', '!==', '==', '!=', '<=', '>=', '<', '>'}
_MATCHING_PARENS = dict(zip('({[', ')}]'))
_QUOTES = '\'"'
def _ternary(cndn, if_true=True, if_false=False): def _js_arith_op(op):
def wrapped(a, b):
if JS_Undefined in (a, b):
return float('nan')
return op(a or 0, b or 0)
return wrapped
def _js_div(a, b):
if JS_Undefined in (a, b) or not (a and b):
return float('nan')
return (a or 0) / b if b else float('inf')
def _js_mod(a, b):
if JS_Undefined in (a, b) or not b:
return float('nan')
return (a or 0) % b
def _js_exp(a, b):
if not b:
return 1 # even 0 ** 0 !!
elif JS_Undefined in (a, b):
return float('nan')
return (a or 0) ** b
def _js_eq_op(op):
def wrapped(a, b):
if {a, b} <= {None, JS_Undefined}:
return op(a, a)
return op(a, b)
return wrapped
def _js_comp_op(op):
def wrapped(a, b):
if JS_Undefined in (a, b):
return False
if isinstance(a, str) or isinstance(b, str):
return op(str(a or 0), str(b or 0))
return op(a or 0, b or 0)
return wrapped
def _js_ternary(cndn, if_true=True, if_false=False):
"""Simulate JS's ternary operator (cndn?if_true:if_false)""" """Simulate JS's ternary operator (cndn?if_true:if_false)"""
if cndn in (False, None, 0, ''): if cndn in (False, None, 0, '', JS_Undefined):
return if_false return if_false
with contextlib.suppress(TypeError): with contextlib.suppress(TypeError):
if math.isnan(cndn): # NB: NaN cannot be checked by membership if math.isnan(cndn): # NB: NaN cannot be checked by membership
@@ -67,6 +89,50 @@ def _ternary(cndn, if_true=True, if_false=False):
return if_true return if_true
# Ref: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Operator_Precedence
_OPERATORS = { # None => Defined in JSInterpreter._operator
'?': None,
'??': None,
'||': None,
'&&': None,
'|': _js_bit_op(operator.or_),
'^': _js_bit_op(operator.xor),
'&': _js_bit_op(operator.and_),
'===': operator.is_,
'!==': operator.is_not,
'==': _js_eq_op(operator.eq),
'!=': _js_eq_op(operator.ne),
'<=': _js_comp_op(operator.le),
'>=': _js_comp_op(operator.ge),
'<': _js_comp_op(operator.lt),
'>': _js_comp_op(operator.gt),
'>>': _js_bit_op(operator.rshift),
'<<': _js_bit_op(operator.lshift),
'+': _js_arith_op(operator.add),
'-': _js_arith_op(operator.sub),
'*': _js_arith_op(operator.mul),
'/': _js_div,
'%': _js_mod,
'**': _js_exp,
}
_COMP_OPERATORS = {'===', '!==', '==', '!=', '<=', '>=', '<', '>'}
_NAME_RE = r'[a-zA-Z_$][\w$]*'
_MATCHING_PARENS = dict(zip(*zip('()', '{}', '[]')))
_QUOTES = '\'"/'
class JS_Undefined:
pass
class JS_Break(ExtractorError): class JS_Break(ExtractorError):
def __init__(self): def __init__(self):
ExtractorError.__init__(self, 'Invalid break') ExtractorError.__init__(self, 'Invalid break')
@@ -77,6 +143,12 @@ class JS_Continue(ExtractorError):
ExtractorError.__init__(self, 'Invalid continue') ExtractorError.__init__(self, 'Invalid continue')
class JS_Throw(ExtractorError):
def __init__(self, e):
self.error = e
ExtractorError.__init__(self, f'Uncaught exception {e}')
class LocalNameSpace(collections.ChainMap): class LocalNameSpace(collections.ChainMap):
def __setitem__(self, key, value): def __setitem__(self, key, value):
for scope in self.maps: for scope in self.maps:
@@ -103,7 +175,14 @@ class Debugger:
def interpret_statement(self, stmt, local_vars, allow_recursion, *args, **kwargs): def interpret_statement(self, stmt, local_vars, allow_recursion, *args, **kwargs):
if cls.ENABLED and stmt.strip(): if cls.ENABLED and stmt.strip():
cls.write(stmt, level=allow_recursion) cls.write(stmt, level=allow_recursion)
ret, should_ret = f(self, stmt, local_vars, allow_recursion, *args, **kwargs) try:
ret, should_ret = f(self, stmt, local_vars, allow_recursion, *args, **kwargs)
except Exception as e:
if cls.ENABLED:
if isinstance(e, ExtractorError):
e = e.orig_msg
cls.write('=> Raises:', e, '<-|', stmt, level=allow_recursion)
raise
if cls.ENABLED and stmt.strip(): if cls.ENABLED and stmt.strip():
cls.write(['->', '=>'][should_ret], repr(ret), '<-|', stmt, level=allow_recursion) cls.write(['->', '=>'][should_ret], repr(ret), '<-|', stmt, level=allow_recursion)
return ret, should_ret return ret, should_ret
@@ -113,6 +192,21 @@ class Debugger:
class JSInterpreter: class JSInterpreter:
__named_object_counter = 0 __named_object_counter = 0
_RE_FLAGS = {
# special knowledge: Python's re flags are bitmask values, current max 128
# invent new bitmask values well above that for literal parsing
# TODO: new pattern class to execute matches with these flags
'd': 1024, # Generate indices for substring matches
'g': 2048, # Global search
'i': re.I, # Case-insensitive search
'm': re.M, # Multi-line search
's': re.S, # Allows . to match newline characters
'u': re.U, # Treat a pattern as a sequence of unicode code points
'y': 4096, # Perform a "sticky" search that matches starting at the current position in the target string
}
_EXC_NAME = '__yt_dlp_exception__'
def __init__(self, code, objects=None): def __init__(self, code, objects=None):
self.code, self._functions = code, {} self.code, self._functions = code, {}
self._objects = {} if objects is None else objects self._objects = {} if objects is None else objects
@@ -129,21 +223,38 @@ class JSInterpreter:
namespace[name] = obj namespace[name] = obj
return name return name
@classmethod
def _regex_flags(cls, expr):
flags = 0
if not expr:
return flags, expr
for idx, ch in enumerate(expr):
if ch not in cls._RE_FLAGS:
break
flags |= cls._RE_FLAGS[ch]
return flags, expr[idx + 1:]
@staticmethod @staticmethod
def _separate(expr, delim=',', max_split=None): def _separate(expr, delim=',', max_split=None):
OP_CHARS = '+-*/%&|^=<>!,;{}:'
if not expr: if not expr:
return return
counters = {k: 0 for k in _MATCHING_PARENS.values()} counters = {k: 0 for k in _MATCHING_PARENS.values()}
start, splits, pos, delim_len = 0, 0, 0, len(delim) - 1 start, splits, pos, delim_len = 0, 0, 0, len(delim) - 1
in_quote, escaping = None, False in_quote, escaping, after_op, in_regex_char_group = None, False, True, False
for idx, char in enumerate(expr): for idx, char in enumerate(expr):
if not in_quote and char in _MATCHING_PARENS: if not in_quote and char in _MATCHING_PARENS:
counters[_MATCHING_PARENS[char]] += 1 counters[_MATCHING_PARENS[char]] += 1
elif not in_quote and char in counters: elif not in_quote and char in counters:
counters[char] -= 1 counters[char] -= 1
elif not escaping and char in _QUOTES and in_quote in (char, None): elif not escaping:
in_quote = None if in_quote else char if char in _QUOTES and in_quote in (char, None):
if in_quote or after_op or char != '/':
in_quote = None if in_quote and not in_regex_char_group else char
elif in_quote == '/' and char in '[]':
in_regex_char_group = char == '['
escaping = not escaping and in_quote and char == '\\' escaping = not escaping and in_quote and char == '\\'
after_op = not in_quote and char in OP_CHARS or (char.isspace() and after_op)
if char != delim[pos] or any(counters.values()) or in_quote: if char != delim[pos] or any(counters.values()) or in_quote:
pos = 0 pos = 0
@@ -159,7 +270,9 @@ class JSInterpreter:
yield expr[start:] yield expr[start:]
@classmethod @classmethod
def _separate_at_paren(cls, expr, delim): def _separate_at_paren(cls, expr, delim=None):
if delim is None:
delim = expr and _MATCHING_PARENS[expr[0]]
separated = list(cls._separate(expr, delim, 1)) separated = list(cls._separate(expr, delim, 1))
if len(separated) < 2: if len(separated) < 2:
raise cls.Exception(f'No terminating paren {delim}', expr) raise cls.Exception(f'No terminating paren {delim}', expr)
@@ -167,10 +280,13 @@ class JSInterpreter:
def _operator(self, op, left_val, right_expr, expr, local_vars, allow_recursion): def _operator(self, op, left_val, right_expr, expr, local_vars, allow_recursion):
if op in ('||', '&&'): if op in ('||', '&&'):
if (op == '&&') ^ _ternary(left_val): if (op == '&&') ^ _js_ternary(left_val):
return left_val # short circuiting return left_val # short circuiting
elif op == '??':
if left_val not in (None, JS_Undefined):
return left_val
elif op == '?': elif op == '?':
right_expr = _ternary(left_val, *self._separate(right_expr, ':', 1)) right_expr = _js_ternary(left_val, *self._separate(right_expr, ':', 1))
right_val = self.interpret_expression(right_expr, local_vars, allow_recursion) right_val = self.interpret_expression(right_expr, local_vars, allow_recursion)
if not _OPERATORS.get(op): if not _OPERATORS.get(op):
@@ -181,12 +297,14 @@ class JSInterpreter:
except Exception as e: except Exception as e:
raise self.Exception(f'Failed to evaluate {left_val!r} {op} {right_val!r}', expr, cause=e) raise self.Exception(f'Failed to evaluate {left_val!r} {op} {right_val!r}', expr, cause=e)
def _index(self, obj, idx): def _index(self, obj, idx, allow_undefined=False):
if idx == 'length': if idx == 'length':
return len(obj) return len(obj)
try: try:
return obj[int(idx)] if isinstance(obj, list) else obj[idx] return obj[int(idx)] if isinstance(obj, list) else obj[idx]
except Exception as e: except Exception as e:
if allow_undefined:
return JS_Undefined
raise self.Exception(f'Cannot get index {idx}', repr(obj), cause=e) raise self.Exception(f'Cannot get index {idx}', repr(obj), cause=e)
def _dump(self, obj, namespace): def _dump(self, obj, namespace):
@@ -210,16 +328,22 @@ class JSInterpreter:
if should_return: if should_return:
return ret, should_return return ret, should_return
m = re.match(r'(?P<var>(?:var|const|let)\s)|return(?:\s+|$)', stmt) m = re.match(r'(?P<var>(?:var|const|let)\s)|return(?:\s+|(?=["\'])|$)|(?P<throw>throw\s+)', stmt)
if m: if m:
expr = stmt[len(m.group(0)):].strip() expr = stmt[len(m.group(0)):].strip()
if m.group('throw'):
raise JS_Throw(self.interpret_expression(expr, local_vars, allow_recursion))
should_return = not m.group('var') should_return = not m.group('var')
if not expr: if not expr:
return None, should_return return None, should_return
if expr[0] in _QUOTES: if expr[0] in _QUOTES:
inner, outer = self._separate(expr, expr[0], 1) inner, outer = self._separate(expr, expr[0], 1)
inner = json.loads(js_to_json(f'{inner}{expr[0]}', strict=True)) if expr[0] == '/':
flags, outer = self._regex_flags(outer)
inner = re.compile(inner[1:], flags=flags)
else:
inner = json.loads(js_to_json(f'{inner}{expr[0]}', strict=True))
if not outer: if not outer:
return inner, should_return return inner, should_return
expr = self._named_object(local_vars, inner) + outer expr = self._named_object(local_vars, inner) + outer
@@ -227,7 +351,7 @@ class JSInterpreter:
if expr.startswith('new '): if expr.startswith('new '):
obj = expr[4:] obj = expr[4:]
if obj.startswith('Date('): if obj.startswith('Date('):
left, right = self._separate_at_paren(obj[4:], ')') left, right = self._separate_at_paren(obj[4:])
expr = unified_timestamp( expr = unified_timestamp(
self.interpret_expression(left, local_vars, allow_recursion), False) self.interpret_expression(left, local_vars, allow_recursion), False)
if not expr: if not expr:
@@ -241,7 +365,18 @@ class JSInterpreter:
return None, should_return return None, should_return
if expr.startswith('{'): if expr.startswith('{'):
inner, outer = self._separate_at_paren(expr, '}') inner, outer = self._separate_at_paren(expr)
# try for object expression (Map)
sub_expressions = [list(self._separate(sub_expr.strip(), ':', 1)) for sub_expr in self._separate(inner)]
if all(len(sub_expr) == 2 for sub_expr in sub_expressions):
def dict_item(key, val):
val = self.interpret_expression(val, local_vars, allow_recursion)
if re.match(_NAME_RE, key):
return key, val
return self.interpret_expression(key, local_vars, allow_recursion), val
return dict(dict_item(k, v) for k, v in sub_expressions), should_return
inner, should_abort = self.interpret_statement(inner, local_vars, allow_recursion) inner, should_abort = self.interpret_statement(inner, local_vars, allow_recursion)
if not outer or should_abort: if not outer or should_abort:
return inner, should_abort or should_return return inner, should_abort or should_return
@@ -249,7 +384,7 @@ class JSInterpreter:
expr = self._dump(inner, local_vars) + outer expr = self._dump(inner, local_vars) + outer
if expr.startswith('('): if expr.startswith('('):
inner, outer = self._separate_at_paren(expr, ')') inner, outer = self._separate_at_paren(expr)
inner, should_abort = self.interpret_statement(inner, local_vars, allow_recursion) inner, should_abort = self.interpret_statement(inner, local_vars, allow_recursion)
if not outer or should_abort: if not outer or should_abort:
return inner, should_abort or should_return return inner, should_abort or should_return
@@ -257,38 +392,62 @@ class JSInterpreter:
expr = self._dump(inner, local_vars) + outer expr = self._dump(inner, local_vars) + outer
if expr.startswith('['): if expr.startswith('['):
inner, outer = self._separate_at_paren(expr, ']') inner, outer = self._separate_at_paren(expr)
name = self._named_object(local_vars, [ name = self._named_object(local_vars, [
self.interpret_expression(item, local_vars, allow_recursion) self.interpret_expression(item, local_vars, allow_recursion)
for item in self._separate(inner)]) for item in self._separate(inner)])
expr = name + outer expr = name + outer
m = re.match(r'(?P<try>try|finally)\s*|(?:(?P<catch>catch)|(?P<for>for)|(?P<switch>switch))\s*\(', expr) m = re.match(r'''(?x)
if m and m.group('try'): (?P<try>try)\s*\{|
if expr[m.end()] == '{': (?P<switch>switch)\s*\(|
try_expr, expr = self._separate_at_paren(expr[m.end():], '}') (?P<for>for)\s*\(
else: ''', expr)
try_expr, expr = expr[m.end() - 1:], '' md = m.groupdict() if m else {}
ret, should_abort = self.interpret_statement(try_expr, local_vars, allow_recursion) if md.get('try'):
try_expr, expr = self._separate_at_paren(expr[m.end() - 1:])
err = None
try:
ret, should_abort = self.interpret_statement(try_expr, local_vars, allow_recursion)
if should_abort:
return ret, True
except Exception as e:
# XXX: This works for now, but makes debugging future issues very hard
err = e
pending = (None, False)
m = re.match(r'catch\s*(?P<err>\(\s*{_NAME_RE}\s*\))?\{{'.format(**globals()), expr)
if m:
sub_expr, expr = self._separate_at_paren(expr[m.end() - 1:])
if err:
catch_vars = {}
if m.group('err'):
catch_vars[m.group('err')] = err.error if isinstance(err, JS_Throw) else err
catch_vars = local_vars.new_child(catch_vars)
err, pending = None, self.interpret_statement(sub_expr, catch_vars, allow_recursion)
m = re.match(r'finally\s*\{', expr)
if m:
sub_expr, expr = self._separate_at_paren(expr[m.end() - 1:])
ret, should_abort = self.interpret_statement(sub_expr, local_vars, allow_recursion)
if should_abort:
return ret, True
ret, should_abort = pending
if should_abort: if should_abort:
return ret, True return ret, True
ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion)
return ret, should_abort or should_return
elif m and m.group('catch'): if err:
# We ignore the catch block raise err
_, expr = self._separate_at_paren(expr, '}')
ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion)
return ret, should_abort or should_return
elif m and m.group('for'): elif md.get('for'):
constructor, remaining = self._separate_at_paren(expr[m.end() - 1:], ')') constructor, remaining = self._separate_at_paren(expr[m.end() - 1:])
if remaining.startswith('{'): if remaining.startswith('{'):
body, expr = self._separate_at_paren(remaining, '}') body, expr = self._separate_at_paren(remaining)
else: else:
switch_m = re.match(r'switch\s*\(', remaining) # FIXME switch_m = re.match(r'switch\s*\(', remaining) # FIXME
if switch_m: if switch_m:
switch_val, remaining = self._separate_at_paren(remaining[switch_m.end() - 1:], ')') switch_val, remaining = self._separate_at_paren(remaining[switch_m.end() - 1:])
body, expr = self._separate_at_paren(remaining, '}') body, expr = self._separate_at_paren(remaining, '}')
body = 'switch(%s){%s}' % (switch_val, body) body = 'switch(%s){%s}' % (switch_val, body)
else: else:
@@ -296,7 +455,7 @@ class JSInterpreter:
start, cndn, increment = self._separate(constructor, ';') start, cndn, increment = self._separate(constructor, ';')
self.interpret_expression(start, local_vars, allow_recursion) self.interpret_expression(start, local_vars, allow_recursion)
while True: while True:
if not _ternary(self.interpret_expression(cndn, local_vars, allow_recursion)): if not _js_ternary(self.interpret_expression(cndn, local_vars, allow_recursion)):
break break
try: try:
ret, should_abort = self.interpret_statement(body, local_vars, allow_recursion) ret, should_abort = self.interpret_statement(body, local_vars, allow_recursion)
@@ -307,11 +466,9 @@ class JSInterpreter:
except JS_Continue: except JS_Continue:
pass pass
self.interpret_expression(increment, local_vars, allow_recursion) self.interpret_expression(increment, local_vars, allow_recursion)
ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion)
return ret, should_abort or should_return
elif m and m.group('switch'): elif md.get('switch'):
switch_val, remaining = self._separate_at_paren(expr[m.end() - 1:], ')') switch_val, remaining = self._separate_at_paren(expr[m.end() - 1:])
switch_val = self.interpret_expression(switch_val, local_vars, allow_recursion) switch_val = self.interpret_expression(switch_val, local_vars, allow_recursion)
body, expr = self._separate_at_paren(remaining, '}') body, expr = self._separate_at_paren(remaining, '}')
items = body.replace('default:', 'case default:').split('case ')[1:] items = body.replace('default:', 'case default:').split('case ')[1:]
@@ -334,16 +491,19 @@ class JSInterpreter:
break break
if matched: if matched:
break break
if md:
ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion) ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion)
return ret, should_abort or should_return return ret, should_abort or should_return
# Comma separated statements # Comma separated statements
sub_expressions = list(self._separate(expr)) sub_expressions = list(self._separate(expr))
expr = sub_expressions.pop().strip() if sub_expressions else '' if len(sub_expressions) > 1:
for sub_expr in sub_expressions: for sub_expr in sub_expressions:
ret, should_abort = self.interpret_statement(sub_expr, local_vars, allow_recursion) ret, should_abort = self.interpret_statement(sub_expr, local_vars, allow_recursion)
if should_abort: if should_abort:
return ret, True return ret, True
return ret, False
for m in re.finditer(rf'''(?x) for m in re.finditer(rf'''(?x)
(?P<pre_sign>\+\+|--)(?P<var1>{_NAME_RE})| (?P<pre_sign>\+\+|--)(?P<var1>{_NAME_RE})|
@@ -364,13 +524,13 @@ class JSInterpreter:
(?P<assign> (?P<assign>
(?P<out>{_NAME_RE})(?:\[(?P<index>[^\]]+?)\])?\s* (?P<out>{_NAME_RE})(?:\[(?P<index>[^\]]+?)\])?\s*
(?P<op>{"|".join(map(re.escape, set(_OPERATORS) - _COMP_OPERATORS))})? (?P<op>{"|".join(map(re.escape, set(_OPERATORS) - _COMP_OPERATORS))})?
=(?P<expr>.*)$ =(?!=)(?P<expr>.*)$
)|(?P<return> )|(?P<return>
(?!if|return|true|false|null|undefined)(?P<name>{_NAME_RE})$ (?!if|return|true|false|null|undefined|NaN)(?P<name>{_NAME_RE})$
)|(?P<indexing> )|(?P<indexing>
(?P<in>{_NAME_RE})\[(?P<idx>.+)\]$ (?P<in>{_NAME_RE})\[(?P<idx>.+)\]$
)|(?P<attribute> )|(?P<attribute>
(?P<var>{_NAME_RE})(?:\.(?P<member>[^(]+)|\[(?P<member2>[^\]]+)\])\s* (?P<var>{_NAME_RE})(?:(?P<nullish>\?)?\.(?P<member>[^(]+)|\[(?P<member2>[^\]]+)\])\s*
)|(?P<function> )|(?P<function>
(?P<fname>{_NAME_RE})\((?P<args>.*)\)$ (?P<fname>{_NAME_RE})\((?P<args>.*)\)$
)''', expr) )''', expr)
@@ -381,7 +541,7 @@ class JSInterpreter:
local_vars[m.group('out')] = self._operator( local_vars[m.group('out')] = self._operator(
m.group('op'), left_val, m.group('expr'), expr, local_vars, allow_recursion) m.group('op'), left_val, m.group('expr'), expr, local_vars, allow_recursion)
return local_vars[m.group('out')], should_return return local_vars[m.group('out')], should_return
elif left_val is None: elif left_val in (None, JS_Undefined):
raise self.Exception(f'Cannot index undefined variable {m.group("out")}', expr) raise self.Exception(f'Cannot index undefined variable {m.group("out")}', expr)
idx = self.interpret_expression(m.group('index'), local_vars, allow_recursion) idx = self.interpret_expression(m.group('index'), local_vars, allow_recursion)
@@ -389,7 +549,7 @@ class JSInterpreter:
raise self.Exception(f'List index {idx} must be integer', expr) raise self.Exception(f'List index {idx} must be integer', expr)
idx = int(idx) idx = int(idx)
left_val[idx] = self._operator( left_val[idx] = self._operator(
m.group('op'), left_val[idx], m.group('expr'), expr, local_vars, allow_recursion) m.group('op'), self._index(left_val, idx), m.group('expr'), expr, local_vars, allow_recursion)
return left_val[idx], should_return return left_val[idx], should_return
elif expr.isdigit(): elif expr.isdigit():
@@ -399,9 +559,13 @@ class JSInterpreter:
raise JS_Break() raise JS_Break()
elif expr == 'continue': elif expr == 'continue':
raise JS_Continue() raise JS_Continue()
elif expr == 'undefined':
return JS_Undefined, should_return
elif expr == 'NaN':
return float('NaN'), should_return
elif m and m.group('return'): elif m and m.group('return'):
return local_vars[m.group('name')], should_return return local_vars.get(m.group('name'), JS_Undefined), should_return
with contextlib.suppress(ValueError): with contextlib.suppress(ValueError):
return json.loads(js_to_json(expr, strict=True)), should_return return json.loads(js_to_json(expr, strict=True)), should_return
@@ -414,25 +578,26 @@ class JSInterpreter:
for op in _OPERATORS: for op in _OPERATORS:
separated = list(self._separate(expr, op)) separated = list(self._separate(expr, op))
right_expr = separated.pop() right_expr = separated.pop()
while op in '<>*-' and len(separated) > 1 and not separated[-1].strip(): while True:
separated.pop() if op in '?<>*-' and len(separated) > 1 and not separated[-1].strip():
separated.pop()
elif not (separated and op == '?' and right_expr.startswith('.')):
break
right_expr = f'{op}{right_expr}' right_expr = f'{op}{right_expr}'
if op != '-': if op != '-':
right_expr = f'{separated.pop()}{op}{right_expr}' right_expr = f'{separated.pop()}{op}{right_expr}'
if not separated: if not separated:
continue continue
left_val = self.interpret_expression(op.join(separated), local_vars, allow_recursion) left_val = self.interpret_expression(op.join(separated), local_vars, allow_recursion)
return self._operator(op, 0 if left_val is None else left_val, return self._operator(op, left_val, right_expr, expr, local_vars, allow_recursion), should_return
right_expr, expr, local_vars, allow_recursion), should_return
if m and m.group('attribute'): if m and m.group('attribute'):
variable = m.group('var') variable, member, nullish = m.group('var', 'member', 'nullish')
member = m.group('member')
if not member: if not member:
member = self.interpret_expression(m.group('member2'), local_vars, allow_recursion) member = self.interpret_expression(m.group('member2'), local_vars, allow_recursion)
arg_str = expr[m.end():] arg_str = expr[m.end():]
if arg_str.startswith('('): if arg_str.startswith('('):
arg_str, remaining = self._separate_at_paren(arg_str, ')') arg_str, remaining = self._separate_at_paren(arg_str)
else: else:
arg_str, remaining = None, arg_str arg_str, remaining = None, arg_str
@@ -454,12 +619,19 @@ class JSInterpreter:
obj = local_vars.get(variable, types.get(variable, NO_DEFAULT)) obj = local_vars.get(variable, types.get(variable, NO_DEFAULT))
if obj is NO_DEFAULT: if obj is NO_DEFAULT:
if variable not in self._objects: if variable not in self._objects:
self._objects[variable] = self.extract_object(variable) try:
obj = self._objects[variable] self._objects[variable] = self.extract_object(variable)
except self.Exception:
if not nullish:
raise
obj = self._objects.get(variable, JS_Undefined)
if nullish and obj is JS_Undefined:
return JS_Undefined
# Member access # Member access
if arg_str is None: if arg_str is None:
return self._index(obj, member) return self._index(obj, member, nullish)
# Function call # Function call
argvals = [ argvals = [
@@ -535,6 +707,13 @@ class JSInterpreter:
return obj.index(idx, start) return obj.index(idx, start)
except ValueError: except ValueError:
return -1 return -1
elif member == 'charCodeAt':
assertion(isinstance(obj, str), 'must be applied on a string')
assertion(len(argvals) == 1, 'takes exactly one argument')
idx = argvals[0] if isinstance(argvals[0], int) else 0
if idx >= len(obj):
return None
return ord(obj[idx])
idx = int(member) if isinstance(obj, list) else member idx = int(member) if isinstance(obj, list) else member
return obj[idx](argvals, allow_recursion=allow_recursion) return obj[idx](argvals, allow_recursion=allow_recursion)
@@ -603,7 +782,7 @@ class JSInterpreter:
\((?P<args>[^)]*)\)\s* \((?P<args>[^)]*)\)\s*
(?P<code>{.+})''' % {'name': re.escape(funcname)}, (?P<code>{.+})''' % {'name': re.escape(funcname)},
self.code) self.code)
code, _ = self._separate_at_paren(func_m.group('code'), '}') code, _ = self._separate_at_paren(func_m.group('code'))
if func_m is None: if func_m is None:
raise self.Exception(f'Could not find JS function "{funcname}"') raise self.Exception(f'Could not find JS function "{funcname}"')
return [x.strip() for x in func_m.group('args').split(',')], code return [x.strip() for x in func_m.group('args').split(',')], code
@@ -618,7 +797,7 @@ class JSInterpreter:
if mobj is None: if mobj is None:
break break
start, body_start = mobj.span() start, body_start = mobj.span()
body, remaining = self._separate_at_paren(code[body_start - 1:], '}') body, remaining = self._separate_at_paren(code[body_start - 1:])
name = self._named_object(local_vars, self.extract_function_from_code( name = self._named_object(local_vars, self.extract_function_from_code(
[x.strip() for x in mobj.group('args').split(',')], [x.strip() for x in mobj.group('args').split(',')],
body, local_vars, *global_stack)) body, local_vars, *global_stack))
@@ -636,7 +815,7 @@ class JSInterpreter:
global_stack[0].update(itertools.zip_longest(argnames, args, fillvalue=None)) global_stack[0].update(itertools.zip_longest(argnames, args, fillvalue=None))
global_stack[0].update(kwargs) global_stack[0].update(kwargs)
var_stack = LocalNameSpace(*global_stack) var_stack = LocalNameSpace(*global_stack)
ret, should_abort = self.interpret_statement(code.replace('\n', ''), var_stack, allow_recursion - 1) ret, should_abort = self.interpret_statement(code.replace('\n', ' '), var_stack, allow_recursion - 1)
if should_abort: if should_abort:
return ret return ret
return resf return resf
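
The interpreter hunks above add a `JS_Undefined` sentinel, the `??` operator and `?.` member access. The following is a minimal standalone sketch of those semantics, not the interpreter's actual code: `UNDEFINED`, `nullish_coalesce` and `optional_index` are hypothetical names chosen for illustration.

```python
class _Undefined:
    """Sentinel standing in for JavaScript's `undefined` (hypothetical name)."""

    def __bool__(self):
        return False

    def __repr__(self):
        return 'undefined'


UNDEFINED = _Undefined()


def nullish_coalesce(left, right_thunk):
    # `a ?? b`: keep the left value unless it is null/undefined.
    # Falsy values such as 0 or '' are kept, unlike with `a || b`.
    if left is not None and left is not UNDEFINED:
        return left
    return right_thunk()  # the right side is only evaluated when needed


def optional_index(obj, key):
    # `obj?.key`: yield undefined instead of raising when the object
    # is missing or the lookup fails.
    if obj is None or obj is UNDEFINED:
        return UNDEFINED
    try:
        return obj[key]
    except (KeyError, IndexError, TypeError):
        return UNDEFINED
```

Passing the right-hand side as a callable mirrors the short-circuit behaviour in `_operator`, where the right expression is only interpreted after the left value has been checked.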


@@ -25,10 +25,12 @@ from .utils import (
OUTTMPL_TYPES, OUTTMPL_TYPES,
POSTPROCESS_WHEN, POSTPROCESS_WHEN,
Config, Config,
deprecation_warning,
expand_path, expand_path,
format_field, format_field,
get_executable_path, get_executable_path,
join_nonempty, join_nonempty,
orderedSet_from_options,
remove_end, remove_end,
write_string, write_string,
) )
@@ -163,6 +165,7 @@ class _YoutubeDLHelpFormatter(optparse.IndentedHelpFormatter):
class _YoutubeDLOptionParser(optparse.OptionParser): class _YoutubeDLOptionParser(optparse.OptionParser):
# optparse is deprecated since python 3.2. So assume a stable interface even for private methods # optparse is deprecated since python 3.2. So assume a stable interface even for private methods
ALIAS_DEST = '_triggered_aliases'
ALIAS_TRIGGER_LIMIT = 100 ALIAS_TRIGGER_LIMIT = 100
def __init__(self): def __init__(self):
@@ -174,6 +177,7 @@ class _YoutubeDLOptionParser(optparse.OptionParser):
formatter=_YoutubeDLHelpFormatter(), formatter=_YoutubeDLHelpFormatter(),
conflict_handler='resolve', conflict_handler='resolve',
) )
self.set_default(self.ALIAS_DEST, collections.defaultdict(int))
_UNKNOWN_OPTION = (optparse.BadOptionError, optparse.AmbiguousOptionError) _UNKNOWN_OPTION = (optparse.BadOptionError, optparse.AmbiguousOptionError)
_BAD_OPTION = optparse.OptionValueError _BAD_OPTION = optparse.OptionValueError
@@ -232,30 +236,16 @@ def create_parser():
current + value if append is True else value + current) current + value if append is True else value + current)
def _set_from_options_callback( def _set_from_options_callback(
option, opt_str, value, parser, delim=',', allowed_values=None, aliases={}, option, opt_str, value, parser, allowed_values, delim=',', aliases={},
process=lambda x: x.lower().strip()): process=lambda x: x.lower().strip()):
current = set(getattr(parser.values, option.dest)) values = [process(value)] if delim is None else map(process, value.split(delim))
values = [process(value)] if delim is None else list(map(process, value.split(delim)[::-1])) try:
while values: requested = orderedSet_from_options(values, collections.ChainMap(aliases, {'all': allowed_values}),
actual_val = val = values.pop() start=getattr(parser.values, option.dest))
if not val: except ValueError as e:
raise optparse.OptionValueError(f'Invalid {option.metavar} for {opt_str}: {value}') raise optparse.OptionValueError(f'wrong {option.metavar} for {opt_str}: {e.args[0]}')
if val == 'all':
current.update(allowed_values)
elif val == '-all':
current = set()
elif val in aliases:
values.extend(aliases[val])
else:
if val[0] == '-':
val = val[1:]
current.discard(val)
else:
current.update([val])
if allowed_values is not None and val not in allowed_values:
raise optparse.OptionValueError(f'wrong {option.metavar} for {opt_str}: {actual_val}')
setattr(parser.values, option.dest, current) setattr(parser.values, option.dest, set(requested))
def _dict_from_options_callback( def _dict_from_options_callback(
option, opt_str, value, parser, option, opt_str, value, parser,
@@ -305,8 +295,7 @@ def create_parser():
aliases = (x if x.startswith('-') else f'--{x}' for x in map(str.strip, aliases.split(','))) aliases = (x if x.startswith('-') else f'--{x}' for x in map(str.strip, aliases.split(',')))
try: try:
alias_group.add_option( alias_group.add_option(
*aliases, help=opts, nargs=nargs, type='str' if nargs else None, *aliases, help=opts, nargs=nargs, dest=parser.ALIAS_DEST, type='str' if nargs else None,
dest='_triggered_aliases', default=collections.defaultdict(int),
metavar=' '.join(f'ARG{i}' for i in range(nargs)), action='callback', metavar=' '.join(f'ARG{i}' for i in range(nargs)), action='callback',
callback=_alias_callback, callback_kwargs={'opts': opts, 'nargs': nargs}) callback=_alias_callback, callback_kwargs={'opts': opts, 'nargs': nargs})
except Exception as err: except Exception as err:
@@ -365,10 +354,20 @@ def create_parser():
'--extractor-descriptions', '--extractor-descriptions',
action='store_true', dest='list_extractor_descriptions', default=False, action='store_true', dest='list_extractor_descriptions', default=False,
help='Output descriptions of all supported extractors and exit') help='Output descriptions of all supported extractors and exit')
general.add_option(
'--use-extractors', '--ies',
action='callback', dest='allowed_extractors', metavar='NAMES', type='str',
default=[], callback=_list_from_options_callback,
help=(
'Extractor names to use separated by commas. '
'You can also use regexes, "all", "default" and "end" (end URL matching); '
'e.g. --ies "holodex.*,end,youtube". '
'Prefix the name with a "-" to exclude it, e.g. --ies default,-generic. '
'Use --list-extractors for a list of extractor names. (Alias: --ies)'))
general.add_option( general.add_option(
'--force-generic-extractor', '--force-generic-extractor',
action='store_true', dest='force_generic_extractor', default=False, action='store_true', dest='force_generic_extractor', default=False,
help='Force extraction to use the generic extractor') help=optparse.SUPPRESS_HELP)
general.add_option( general.add_option(
'--default-search', '--default-search',
dest='default_search', metavar='PREFIX', dest='default_search', metavar='PREFIX',
@@ -443,11 +442,12 @@ def create_parser():
'allowed_values': { 'allowed_values': {
'filename', 'filename-sanitization', 'format-sort', 'abort-on-error', 'format-spec', 'no-playlist-metafiles', 'filename', 'filename-sanitization', 'format-sort', 'abort-on-error', 'format-spec', 'no-playlist-metafiles',
'multistreams', 'no-live-chat', 'playlist-index', 'list-formats', 'no-direct-merge', 'multistreams', 'no-live-chat', 'playlist-index', 'list-formats', 'no-direct-merge',
'no-youtube-channel-redirect', 'no-youtube-unavailable-videos', 'no-attach-info-json', 'embed-metadata', 'no-attach-info-json', 'embed-metadata', 'embed-thumbnail-atomicparsley',
'embed-thumbnail-atomicparsley', 'seperate-video-versions', 'no-clean-infojson', 'no-keep-subs', 'no-certifi', 'seperate-video-versions', 'no-clean-infojson', 'no-keep-subs', 'no-certifi',
'no-youtube-channel-redirect', 'no-youtube-unavailable-videos', 'no-youtube-prefer-utc-upload-date',
}, 'aliases': { }, 'aliases': {
'youtube-dl': ['-multistreams', 'all'], 'youtube-dl': ['all', '-multistreams'],
'youtube-dlc': ['-no-youtube-channel-redirect', '-no-live-chat', 'all'], 'youtube-dlc': ['all', '-no-youtube-channel-redirect', '-no-live-chat'],
} }
}, help=( }, help=(
'Options that can help keep compatibility with youtube-dl or youtube-dlc ' 'Options that can help keep compatibility with youtube-dl or youtube-dlc '
@@ -634,7 +634,7 @@ def create_parser():
selection.add_option( selection.add_option(
'--break-per-input', '--break-per-input',
action='store_true', dest='break_per_url', default=False, action='store_true', dest='break_per_url', default=False,
help='Make --break-on-existing, --break-on-reject and --max-downloads act only on the current input URL') help='--break-on-existing, --break-on-reject, --max-downloads, and autonumber resets per input URL')
selection.add_option( selection.add_option(
'--no-break-per-input', '--no-break-per-input',
action='store_false', dest='break_per_url', action='store_false', dest='break_per_url',
@@ -1401,14 +1401,15 @@ def create_parser():
help='Do not read/dump cookies from/to file (default)') help='Do not read/dump cookies from/to file (default)')
filesystem.add_option( filesystem.add_option(
'--cookies-from-browser', '--cookies-from-browser',
dest='cookiesfrombrowser', metavar='BROWSER[+KEYRING][:PROFILE]', dest='cookiesfrombrowser', metavar='BROWSER[+KEYRING][:PROFILE][::CONTAINER]',
help=( help=(
'The name of the browser and (optionally) the name/path of ' 'The name of the browser to load cookies from. '
'the profile to load cookies from, separated by a ":". '
f'Currently supported browsers are: {", ".join(sorted(SUPPORTED_BROWSERS))}. ' f'Currently supported browsers are: {", ".join(sorted(SUPPORTED_BROWSERS))}. '
'By default, the most recently accessed profile is used. ' 'Optionally, the KEYRING used for decrypting Chromium cookies on Linux, '
'The keyring used for decrypting Chromium cookies on Linux can be ' 'the name/path of the PROFILE to load cookies from, '
'(optionally) specified after the browser name separated by a "+". ' 'and the CONTAINER name (if Firefox) ("none" for no container) '
'can be given with their respective separators. '
'By default, all containers of the most recently accessed profile are used. '
f'Currently supported keyrings are: {", ".join(map(str.lower, sorted(SUPPORTED_KEYRINGS)))}')) f'Currently supported keyrings are: {", ".join(map(str.lower, sorted(SUPPORTED_KEYRINGS)))}'))
filesystem.add_option( filesystem.add_option(
'--no-cookies-from-browser', '--no-cookies-from-browser',
@@ -1866,7 +1867,6 @@ def create_parser():
def _hide_login_info(opts): def _hide_login_info(opts):
write_string( deprecation_warning(f'"{__name__}._hide_login_info" is deprecated and may be removed '
'DeprecationWarning: "yt_dlp.options._hide_login_info" is deprecated and may be removed in a future version. ' 'in a future version. Use "yt_dlp.utils.Config.hide_login_info" instead')
'Use "yt_dlp.utils.Config.hide_login_info" instead\n')
return Config.hide_login_info(opts) return Config.hide_login_info(opts)
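
The compat-options hunk above reorders the alias lists so that `'all'` comes first (`['all', '-multistreams']`), which suggests the values are now applied strictly left to right. The sketch below illustrates why that order matters. It is a simplified, hypothetical resolver (`resolve_options` is an assumed name), not the real `orderedSet_from_options` from `yt_dlp.utils`, and it does not support negated aliases or regexes.

```python
def resolve_options(values, aliases, allowed, start=()):
    """Hypothetical, simplified left-to-right option resolver."""
    requested = list(start)
    for val in values:
        discard = val.startswith('-')
        name = val[1:] if discard else val
        if name == 'all':
            expanded = sorted(allowed)
        elif name in aliases and not discard:
            # Expand the alias in place, continuing from the current state
            requested = resolve_options(aliases[name], aliases, allowed, requested)
            continue
        elif name in allowed:
            expanded = [name]
        else:
            raise ValueError(val)
        if discard:
            requested = [item for item in requested if item not in expanded]
        else:
            requested.extend(item for item in expanded if item not in requested)
    return requested


allowed = {'filename', 'multistreams', 'no-live-chat'}
new_order = resolve_options(['youtube-dl'], {'youtube-dl': ['all', '-multistreams']}, allowed)
old_order = resolve_options(['youtube-dl'], {'youtube-dl': ['-multistreams', 'all']}, allowed)
```

With left-to-right processing, `new_order` excludes `multistreams`, while `old_order` re-adds it because `'all'` is applied after the removal.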


@@ -7,10 +7,10 @@ from ..utils import (
PostProcessingError, PostProcessingError,
RetryManager, RetryManager,
_configuration_args, _configuration_args,
deprecation_warning,
encodeFilename, encodeFilename,
network_exceptions, network_exceptions,
sanitized_Request, sanitized_Request,
write_string,
) )
@@ -73,10 +73,14 @@ class PostProcessor(metaclass=PostProcessorMetaClass):
if self._downloader: if self._downloader:
return self._downloader.report_warning(text, *args, **kwargs) return self._downloader.report_warning(text, *args, **kwargs)
def deprecation_warning(self, text): def deprecation_warning(self, msg):
warn = getattr(self._downloader, 'deprecation_warning', deprecation_warning)
return warn(msg, stacklevel=1)
def deprecated_feature(self, msg):
if self._downloader: if self._downloader:
return self._downloader.deprecation_warning(text) return self._downloader.deprecated_feature(msg)
write_string(f'DeprecationWarning: {text}') return deprecation_warning(msg, stacklevel=1)
def report_error(self, text, *args, **kwargs): def report_error(self, text, *args, **kwargs):
self.deprecation_warning('"yt_dlp.postprocessor.PostProcessor.report_error" is deprecated. ' self.deprecation_warning('"yt_dlp.postprocessor.PostProcessor.report_error" is deprecated. '


@@ -15,6 +15,7 @@ from ..utils import (
Popen, Popen,
PostProcessingError, PostProcessingError,
_get_exe_version_output, _get_exe_version_output,
deprecation_warning,
detect_exe_version, detect_exe_version,
determine_ext, determine_ext,
dfxp2srt, dfxp2srt,
@@ -30,7 +31,6 @@ from ..utils import (
traverse_obj, traverse_obj,
variadic, variadic,
write_json_file, write_json_file,
write_string,
) )
EXT_TO_OUT_FORMATS = { EXT_TO_OUT_FORMATS = {
@@ -187,8 +187,8 @@ class FFmpegPostProcessor(PostProcessor):
else: else:
self.probe_basename = basename self.probe_basename = basename
if basename == self._ffmpeg_to_avconv[kind]: if basename == self._ffmpeg_to_avconv[kind]:
self.deprecation_warning( self.deprecated_feature(f'Support for {self._ffmpeg_to_avconv[kind]} is deprecated and '
f'Support for {self._ffmpeg_to_avconv[kind]} is deprecated and may be removed in a future version. Use {kind} instead') f'may be removed in a future version. Use {kind} instead')
return version return version
@functools.cached_property @functools.cached_property
@@ -1064,7 +1064,7 @@ class FFmpegThumbnailsConvertorPP(FFmpegPostProcessor):
@classmethod @classmethod
def is_webp(cls, path): def is_webp(cls, path):
write_string(f'DeprecationWarning: {cls.__module__}.{cls.__name__}.is_webp is deprecated') deprecation_warning(f'{cls.__module__}.{cls.__name__}.is_webp is deprecated')
return imghdr.what(path) == 'webp' return imghdr.what(path) == 'webp'
def fixup_webp(self, info, idx=-1): def fixup_webp(self, info, idx=-1):


@@ -1,4 +1,5 @@
import atexit import atexit
import contextlib
import hashlib import hashlib
import json import json
import os import os
@@ -13,6 +14,7 @@ from .compat import compat_realpath, compat_shlex_quote
from .utils import ( from .utils import (
Popen, Popen,
cached_method, cached_method,
deprecation_warning,
shell_quote, shell_quote,
system_identifier, system_identifier,
traverse_obj, traverse_obj,
@@ -50,6 +52,19 @@ def detect_variant():
return VARIANT or _get_variant_and_executable_path()[0] return VARIANT or _get_variant_and_executable_path()[0]
@functools.cache
def current_git_head():
if detect_variant() != 'source':
return
with contextlib.suppress(Exception):
stdout, _, _ = Popen.run(
['git', 'rev-parse', '--short', 'HEAD'],
text=True, cwd=os.path.dirname(os.path.abspath(__file__)),
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if re.fullmatch('[0-9a-f]+', stdout.strip()):
return stdout.strip()
_FILE_SUFFIXES = { _FILE_SUFFIXES = {
'zip': '', 'zip': '',
'py2exe': '_min.exe', 'py2exe': '_min.exe',
@@ -288,11 +303,8 @@ def run_update(ydl):
 def update_self(to_screen, verbose, opener):
     import traceback
-    from .utils import write_string
-    write_string(
-        'DeprecationWarning: "yt_dlp.update.update_self" is deprecated and may be removed in a future version. '
-        'Use "yt_dlp.update.run_update(ydl)" instead\n')
+    deprecation_warning(f'"{__name__}.update_self" is deprecated and may be removed '
+                        f'in a future version. Use "{__name__}.run_update(ydl)" instead')
     printfn = to_screen


@@ -828,8 +828,8 @@ def escapeHTML(text):
 def process_communicate_or_kill(p, *args, **kwargs):
-    write_string('DeprecationWarning: yt_dlp.utils.process_communicate_or_kill is deprecated '
-                 'and may be removed in a future version. Use yt_dlp.utils.Popen.communicate_or_kill instead')
+    deprecation_warning(f'"{__name__}.process_communicate_or_kill" is deprecated and may be removed '
+                        f'in a future version. Use "{__name__}.Popen.communicate_or_kill" instead')
     return Popen.communicate_or_kill(p, *args, **kwargs)
@@ -840,12 +840,35 @@ class Popen(subprocess.Popen):
     else:
         _startupinfo = None
-    def __init__(self, *args, text=False, **kwargs):
+    @staticmethod
+    def _fix_pyinstaller_ld_path(env):
+        """Restore LD_LIBRARY_PATH when using PyInstaller
+            Ref: https://github.com/pyinstaller/pyinstaller/blob/develop/doc/runtime-information.rst#ld_library_path--libpath-considerations
+                 https://github.com/yt-dlp/yt-dlp/issues/4573
+        """
+        if not hasattr(sys, '_MEIPASS'):
+            return
+        def _fix(key):
+            orig = env.get(f'{key}_ORIG')
+            if orig is None:
+                env.pop(key, None)
+            else:
+                env[key] = orig
+        _fix('LD_LIBRARY_PATH')  # Linux
+        _fix('DYLD_LIBRARY_PATH')  # macOS
+    def __init__(self, *args, env=None, text=False, **kwargs):
+        if env is None:
+            env = os.environ.copy()
+        self._fix_pyinstaller_ld_path(env)
         if text is True:
             kwargs['universal_newlines'] = True  # For 3.6 compatibility
             kwargs.setdefault('encoding', 'utf-8')
             kwargs.setdefault('errors', 'replace')
-        super().__init__(*args, **kwargs, startupinfo=self._startupinfo)
+        super().__init__(*args, env=env, **kwargs, startupinfo=self._startupinfo)
     def communicate_or_kill(self, *args, **kwargs):
         try:
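
Note on the `Popen` hunk above: the environment fix can be sketched outside the class as follows. This is a minimal illustration and not the project's code. The helper name `restore_library_paths` and its `frozen` parameter are hypothetical, added so the behaviour can be exercised without a PyInstaller bundle. It assumes, as the docstring's reference describes, that the bootloader saves the original value under `<NAME>_ORIG`.

```python
import sys


def restore_library_paths(env, frozen=None):
    """Return a copy of env with the library-path overrides undone.

    `frozen` is a hypothetical switch for testing; by default it is
    detected from sys._MEIPASS, which only exists inside a bundle.
    """
    if frozen is None:
        frozen = hasattr(sys, '_MEIPASS')
    env = dict(env)  # never mutate the caller's mapping
    if not frozen:
        return env
    for key in ('LD_LIBRARY_PATH', 'DYLD_LIBRARY_PATH'):  # Linux, macOS
        orig = env.get(f'{key}_ORIG')
        if orig is None:
            # No saved value: the variable was unset before bundling
            env.pop(key, None)
        else:
            env[key] = orig
    return env


bundled_env = {
    'LD_LIBRARY_PATH': '/tmp/_MEI123',
    'LD_LIBRARY_PATH_ORIG': '/usr/lib',
    'DYLD_LIBRARY_PATH': '/tmp/_MEI123',
}
child_env = restore_library_paths(bundled_env, frozen=True)
```

Unlike the method in the diff, which edits `env` in place, this sketch returns a copy; that choice is only for ease of testing.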
@@ -860,9 +883,9 @@ class Popen(subprocess.Popen):
             self.wait(timeout=timeout)
     @classmethod
-    def run(cls, *args, **kwargs):
+    def run(cls, *args, timeout=None, **kwargs):
         with cls(*args, **kwargs) as proc:
-            stdout, stderr = proc.communicate_or_kill()
+            stdout, stderr = proc.communicate_or_kill(timeout=timeout)
             return stdout or '', stderr or '', proc.returncode
@@ -1934,7 +1957,7 @@ class DateRange:
 def platform_name():
     """ Returns the platform name as a str """
-    write_string('DeprecationWarning: yt_dlp.utils.platform_name is deprecated, use platform.platform instead')
+    deprecation_warning(f'"{__name__}.platform_name" is deprecated, use "platform.platform" instead')
     return platform.platform()
@@ -1980,6 +2003,23 @@ def write_string(s, out=None, encoding=None):
     out.flush()
+def deprecation_warning(msg, *, printer=None, stacklevel=0, **kwargs):
+    from . import _IN_CLI
+    if _IN_CLI:
+        if msg in deprecation_warning._cache:
+            return
+        deprecation_warning._cache.add(msg)
+        if printer:
+            return printer(f'{msg}{bug_reports_message()}', **kwargs)
+        return write_string(f'ERROR: {msg}{bug_reports_message()}\n', **kwargs)
+    else:
+        import warnings
+        warnings.warn(DeprecationWarning(msg), stacklevel=stacklevel + 3)
+deprecation_warning._cache = set()
 def bytes_to_intlist(bs):
     if not bs:
         return []
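
Note on the `deprecation_warning` hunk above: the two code paths (print once per message in CLI mode, otherwise raise a standard `DeprecationWarning`) can be tried in isolation with the sketch below. It is an approximation under stated assumptions, not the function from the diff. `warn_deprecated`, `_seen` and the explicit `in_cli` argument are hypothetical stand-ins for the module-level `_IN_CLI` flag and the `_cache` attribute, and the bug-report suffix is omitted.

```python
import sys
import warnings

_seen = set()  # hypothetical stand-in for deprecation_warning._cache


def warn_deprecated(msg, *, in_cli, printer=None, stacklevel=0):
    """Emit msg once in CLI mode, or as a DeprecationWarning otherwise.

    Returns True if something was emitted, False if it was suppressed.
    """
    if in_cli:
        if msg in _seen:
            return False  # already shown to the user once
        _seen.add(msg)
        if printer is None:
            sys.stderr.write(f'{msg}\n')
        else:
            printer(msg)
        return True
    # stacklevel=2 attributes the warning to the caller of warn_deprecated
    warnings.warn(DeprecationWarning(msg), stacklevel=stacklevel + 2)
    return True


shown = []
first = warn_deprecated('x is deprecated', in_cli=True, printer=shown.append)
second = warn_deprecated('x is deprecated', in_cli=True, printer=shown.append)
```

In library mode the de-duplication is left to the `warnings` filters, which is why only the CLI branch consults the cache.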
@@ -4862,8 +4902,8 @@ def decode_base_n(string, n=None, table=None):
 def decode_base(value, digits):
-    write_string('DeprecationWarning: yt_dlp.utils.decode_base is deprecated '
-                 'and may be removed in a future version. Use yt_dlp.decode_base_n instead')
+    deprecation_warning(f'{__name__}.decode_base is deprecated and may be removed '
+                        f'in a future version. Use {__name__}.decode_base_n instead')
     return decode_base_n(value, table=digits)
@@ -5332,8 +5372,8 @@ def traverse_obj(
 def traverse_dict(dictn, keys, casesense=True):
-    write_string('DeprecationWarning: yt_dlp.utils.traverse_dict is deprecated '
-                 'and may be removed in a future version. Use yt_dlp.utils.traverse_obj instead')
+    deprecation_warning(f'"{__name__}.traverse_dict" is deprecated and may be removed '
+                        f'in a future version. Use "{__name__}.traverse_obj" instead')
     return traverse_obj(dictn, keys, casesense=casesense, is_user_input=True, traverse_string=True)
@@ -5764,7 +5804,7 @@ class RetryManager:
         if not count:
             return warn(e)
         elif isinstance(e, ExtractorError):
-            e = remove_end(str(e.cause) or e.orig_msg, '.')
+            e = remove_end(str_or_none(e.cause) or e.orig_msg, '.')
         warn(f'{e}. Retrying{format_field(suffix, None, " %s")} ({count}/{retries})...')
         delay = float_or_none(sleep_func(n=count - 1)) if callable(sleep_func) else sleep_func
@@ -5785,6 +5825,36 @@ def truncate_string(s, left, right=0):
     return f'{s[:left-3]}...{s[-right:]}'
+def orderedSet_from_options(options, alias_dict, *, use_regex=False, start=None):
+    assert 'all' in alias_dict, '"all" alias is required'
+    requested = list(start or [])
+    for val in options:
+        discard = val.startswith('-')
+        if discard:
+            val = val[1:]
+        if val in alias_dict:
+            val = alias_dict[val] if not discard else [
+                i[1:] if i.startswith('-') else f'-{i}' for i in alias_dict[val]]
+            # NB: Do not allow regex in aliases for performance
+            requested = orderedSet_from_options(val, alias_dict, start=requested)
+            continue
+        current = (filter(re.compile(val, re.I).fullmatch, alias_dict['all']) if use_regex
+                   else [val] if val in alias_dict['all'] else None)
+        if current is None:
+            raise ValueError(val)
+        if discard:
+            for item in current:
+                while item in requested:
+                    requested.remove(item)
+        else:
+            requested.extend(current)
+    return orderedSet(requested)
 # Deprecated
 has_certifi = bool(certifi)
 has_websockets = bool(websockets)
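
Note on the `orderedSet_from_options` hunk above: the alias expansion and the `-` prefix for removal are easier to follow with concrete input. The sketch below is a reduced reimplementation for illustration only. `resolve_options` is a hypothetical name, the regex branch is left out, and `list(dict.fromkeys(...))` is assumed to be an adequate stand-in for the project's `orderedSet`.

```python
def resolve_options(options, alias_dict, start=None):
    """Expand aliases and apply '-name' removals, preserving order."""
    assert 'all' in alias_dict, '"all" alias is required'
    requested = list(start or [])
    for val in options:
        discard = val.startswith('-')
        if discard:
            val = val[1:]
        if val in alias_dict:
            # A negated alias flips the sign of each of its members
            expanded = alias_dict[val] if not discard else [
                i[1:] if i.startswith('-') else f'-{i}' for i in alias_dict[val]]
            requested = resolve_options(expanded, alias_dict, start=requested)
            continue
        if val not in alias_dict['all']:
            raise ValueError(val)
        if discard:
            requested = [item for item in requested if item != val]
        else:
            requested.append(val)
    return list(dict.fromkeys(requested))  # de-duplicate, keep first seen


# Hypothetical alias table for illustration
aliases = {'all': ['a', 'b', 'c'], 'default': ['a', 'b']}
picked = resolve_options(['default', '-a', 'c'], aliases)
remaining = resolve_options(['all', '-default'], aliases)
```

Here `picked` is `['b', 'c']` and `remaining` is `['c']`: options are applied left to right, so a later `-name` removes what an earlier alias added.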


@@ -1,8 +1,8 @@
 # Autogenerated by devscripts/update-version.py
-__version__ = '2022.08.14'
+__version__ = '2022.09.01'
-RELEASE_GIT_HEAD = '55937202b'
+RELEASE_GIT_HEAD = '5d7c7d656'
 VARIANT = None