1
0
mirror of https://github.com/yt-dlp/yt-dlp.git synced 2026-01-12 09:51:15 +00:00

Compare commits

..

74 Commits

Author SHA1 Message Date
github-actions
adba24d207 [version] update
Created by: pukkandan

:ci skip all :ci run dl
2022-09-01 11:26:07 +00:00
pukkandan
5d7c7d6569 Release 2022.09.01 2022-09-01 16:49:04 +05:30
pukkandan
d2c8aadf79 [cleanup] Misc
Closes #4710, Closes #4754, Closes #4723
Authored by: pukkandan, MrRawes, DavidH-2022
2022-09-01 16:49:03 +05:30
pukkandan
1ac7f46184 Update to ytdl-commit-ed5c44e7
[compat] Replace deficient ChainMap class in Py3.3 and earlier
ed5c44e7b7
2022-09-01 16:46:32 +05:30
pukkandan
05deb747bb [jsinterp] Fix escape in regex 2022-09-01 16:46:32 +05:30
pukkandan
b505e8517a [extractor/youtube] Fallback regex for nsig code extraction 2022-09-01 16:46:32 +05:30
pukkandan
f2e9fa3ef7 [FormatSort] Fix aext for --prefer-free-formats
Closes #4735
2022-09-01 16:46:31 +05:30
satan1st
50a399326f [build] make tar' should not follow DESTDIR` (#4790)
Ref: https://www.gnu.org/prep/standards/html_node/DESTDIR.html
Authored by: satan1st
2022-09-01 16:46:17 +05:30
coletdjnz
1ff88b7aec [extractor/youtube] Add no-youtube-prefer-utc-upload-date compat option (#4771)
This option reverts 992f9a730b and 17322130a9 to prefer the non-UTC upload date in microformats.

Authored by: coletdjnz, pukkandan
2022-09-01 10:02:28 +00:00
bashonly
825d3ce386 [cookies] Improve container support (#4806)
Closes #4800
Authored by: bashonly, pukkandan, coletdjnz
2022-09-01 15:22:59 +05:30
bashonly
92aa6d6883 [extractor/triller] Add extractor (#4712)
Closes #4703
Authored by: bashonly
2022-09-01 15:20:54 +05:30
Elyse
b2a4db425b [VQQ] Add extractors (#4706)
Closes #1666
Authored by: elyse0
2022-09-01 12:42:34 +05:30
Yifu Yu
de49cdbe9d [extractor/bilibili] Extract flac with premium account (#4759)
Authored by: jackyyf
2022-08-31 23:22:16 +05:30
shirt
9f9c85dda4 [Build] Update pyinstaller 2022-08-31 13:12:26 -04:00
HobbyistDev
11734714c2 [extractor/eurosport] Add extractor (#4613)
Closes #2487
Authored by: HobbyistDev
2022-08-31 22:32:33 +05:30
pukkandan
b86ca447ce [extractor/mediaset] Fix embed extraction
Closes #4804
2022-08-31 22:24:41 +05:30
Tejas Arlimatti
f8c7ba9984 [extractor/epoch] Add extractor (#4772)
Closes #4714
Authored by: tejasa97
2022-08-31 22:16:26 +05:30
DepFA
76f2bb175d [extractor/stripchat] Don't modify input URL (#4781)
Authored by: dfaker
2022-08-31 21:10:59 +05:30
Elyse
f26af78a8a [jsinterp] Add charcodeAt and bitwise overflow (#4706)
Authored by: elyse0
2022-08-31 21:01:22 +05:30
Lesmiscore
bfbecd1174 [extractor/newspicks] Add extractor (#4725)
Authored by: Lesmiscore
2022-08-31 02:07:55 +09:00
bashonly
9bd13fe5bb [cookies] Support firefox container in --cookies-from-browser (#4753)
Authored by: bashonly
2022-08-30 22:24:46 +05:30
Jeff Huffman
459262ac97 [extractor/crunchyroll:beta] Use anonymous access (#4704)
Closes #4692
Authored by: tejing1
2022-08-30 22:04:13 +05:30
Lesmiscore
82ea226c61 Restore LD_LIBRARY_PATH when using PyInstaller (#4666)
Authored by: Lesmiscore
2022-08-31 01:24:14 +09:00
pukkandan
da4db748fa [utils] Add deprecation_warning
See https://github.com/yt-dlp/yt-dlp/pull/2173#issuecomment-1097021515
2022-08-30 21:03:07 +05:30
pukkandan
e1eabd7beb [downloader/external] Smarter detection of executable
Closes #4778
2022-08-30 18:13:38 +05:30
pukkandan
d81ba7d491 [jsinterp, extractor/youtube] Minor fixes 2022-08-30 18:13:37 +05:30
OHaiiBuzzle
5135ed3d4a [extractor/huya] Fix stream extraction (#4798)
Closes #4658
Authored by: ohaiibuzzle
2022-08-30 16:14:16 +05:30
pukkandan
c4b2df872d [jsinterp] Fix _separate
Ref: https://github.com/yt-dlp/yt-dlp/issues/4635#issuecomment-1231126941
2022-08-30 16:06:40 +05:30
Samantaz Fox
224b5a35f7 [extractor/youtube] Update iOS Innertube clients (#4792)
Authored by: SamantazFox
2022-08-29 03:36:55 +00:00
coletdjnz
50ac0e5416 [extractor/youtube] Use device-specific user agent (#4770)
Thwart latest fingerprinting attempt (see https://github.com/iv-org/invidious/issues/3230#issuecomment-1226887639)

Authored by: coletdjnz
2022-08-28 22:59:54 +00:00
Lesmiscore
e0992d5558 [extractor/IslamChannel] Add extractors (#4779)
Authored by: Lesmiscore
2022-08-28 01:37:25 +09:00
pukkandan
5e01315aa1 [cache, extractor/youtube] Invalidate old cache 2022-08-27 07:25:14 +05:30
pukkandan
4e4982ab5b [extractor/generic] Don't return JW player without formats
CLoses #4765
2022-08-27 06:21:17 +05:30
cgrigis
89e4d86171 [extractor/arte] Bug fix (#4769)
Closes #4768
Authored by: cgrigis
2022-08-27 05:58:01 +05:30
Shreyas Minocha
a1af516259 [extractor/screencastomatic] Support --video-password (#4761)
Authored by: shreyasminocha
2022-08-26 08:59:45 +05:30
pukkandan
1d64a59547 [extractor/vimeo:user] Fix _VALID_URL
Closes #4758
2022-08-26 06:29:03 +05:30
pukkandan
ca7f8b8f31 Bugfix for 822d66e591
Closes #4760
2022-08-26 06:08:05 +05:30
pukkandan
164b03c486 [jsinterp] Fix bug in operator precedence
Fixes https://github.com/yt-dlp/yt-dlp/issues/4635#issuecomment-1226659543
2022-08-25 09:40:46 +05:30
pukkandan
e5458d1d88 Fix lazy extractor bug in fe7866d0ed
and add test

Fixes https://github.com/yt-dlp/yt-dlp/pull/3234#issuecomment-1225347071
2022-08-24 15:19:58 +05:30
pukkandan
b5e7a2e69d Add version to infojson 2022-08-24 13:03:45 +05:30
pukkandan
2516cafb28 Fix bug in fe7866d0ed 2022-08-24 08:21:39 +05:30
pukkandan
fd404bec7e Fix --break-per-url --max-downloads 2022-08-24 08:00:13 +05:30
pukkandan
fe7866d0ed Add option --use-extractors
Deprecates `--force-generic-extractor`

Closes #3234, Closes #2044

Related: #4307, #1791
2022-08-24 07:47:51 +05:30
pukkandan
5314b52192 [utils] Add orderedSet_from_options 2022-08-24 07:38:55 +05:30
pukkandan
13db4e7b9e [extractor/mixcloud] All formats are audio-only
Closes #4740
2022-08-23 04:11:27 +05:30
Joshua Lochner
07275b708b [extractor/medaltv] Fix extraction (#4739)
Authored by: xenova
2022-08-23 01:34:12 +05:30
Elyse
b85703d11a [extractor/rtbf] Fix jwt extraction (#4738)
Closes #4683
Authored by: elyse0
2022-08-23 00:15:46 +05:30
pukkandan
992dc6b486 [jsinterp] Implement timeout
Workaround for #4716
2022-08-22 06:19:06 +05:30
pukkandan
822d66e591 Fix bug in --alias 2022-08-22 04:37:23 +05:30
pukkandan
8d1ad6378f [extractor/BiliBiliSearch] Don't sort by date
Related #4682
2022-08-21 05:19:20 +05:30
pukkandan
2d1019542a [extractor/BiliBiliSearch] Fix infinite loop
Closes #4682
2022-08-21 05:19:20 +05:30
pukkandan
b25cac650f [extractor/youtube] Fix bug in format sorting 2022-08-21 00:56:27 +05:30
pukkandan
90a1df305b [test] Fix test_youtube_signature 2022-08-21 00:51:03 +05:30
pukkandan
0a6b4b82e9 [extractor/uktv] Improve _VALID_URL
Closes #4707
Authored by: dirkf
2022-08-20 05:00:45 +05:30
pukkandan
1704c47ba8 [extractor/bitchute] Mark errors as expected
Closes #4685
2022-08-20 04:53:05 +05:30
github-actions
b76e9cedb3 [version] update
Created by: pukkandan

:ci skip all :ci run dl
2022-08-19 00:11:11 +00:00
pukkandan
48c88e088c Release 2022.08.19 2022-08-19 05:08:22 +05:30
pukkandan
a831c2ea90 [cleanup] Misc 2022-08-19 05:08:21 +05:30
pukkandan
be13a6e525 [jsinterp] Bring on-par with youtube-dl
Code from: https://github.com/ytdl-org/youtube-dl/pull/31175, https://github.com/ytdl-org/youtube-dl/pull/31182

Authored by pukkandan, dirkf
2022-08-19 05:08:21 +05:30
bashonly
8a3da4c68c [extractor/instagram] Fix bugs in 7d3b98be4c (#4701)
Authored by: bashonly
2022-08-19 03:45:49 +05:30
nixxo
4d37d4a77c [extractor/rai] Minor fix (#4700)
Closes #4691, #4690
2022-08-19 02:28:59 +05:30
bashonly
7d3b98be4c [extractor/instagram] Fix extraction (#4696)
Closes #4657, #4532, #4475
Authored by: bashonly, pritam20ps05
2022-08-19 02:27:46 +05:30
Elyse
2b3e43e247 [extractor/rtbf] Fix stream extractor (#4671)
Closes #4656
Authored by: elyse0
2022-08-19 01:42:04 +05:30
Alexander Seiler
f60ef66371 [extractor/zattoo] Fix Zattoo resellers (#4675)
Closes #4630
Authored by: goggle
2022-08-19 01:27:51 +05:30
pukkandan
25836db6be [extractor/youtube] Add fallback to phantomjs
Related #4635
2022-08-18 21:35:18 +05:30
pukkandan
587021cd9f [phantomjs] Add function to execute JS without a DOM
Authored by: MinePlayersPE, pukkandan
2022-08-18 21:34:47 +05:30
pukkandan
580ce00782 [youtube] Improve signature caching
and refactor related functions
2022-08-18 21:33:30 +05:30
ChillingPepper
2f1a299c50 [extractor/SovietsCloset] Fix extractor (#4688)
Closes #4200 
Authored by: ChillingPepper
2022-08-18 16:44:45 +05:30
pukkandan
f6ca640b12 [jsinterp] Fix for youtube player 1f7d5369
Closes #4635 again
2022-08-18 16:38:35 +05:30
pukkandan
3ce2933693 [youtube] Fix error reporting of "Incomplete data"
Related: #4669
2022-08-16 22:01:48 +05:30
pukkandan
c200096c03 Fix bug in --download-archive
Closes #4668
2022-08-16 22:00:51 +05:30
pukkandan
6d3e7424bf [jsinterp] Fix for youtube player c81bbb4a 2022-08-16 06:53:45 +05:30
pukkandan
5c6d2ef9d1 [youtube] Improve format sorting for IOS formats
When no itag/resolution is available for reference, use the closest resolution
2022-08-15 14:04:05 +05:30
Lesmiscore
460eb9c50e [build] Exclude devscripts from installs
Closes #4667
2022-08-15 13:51:35 +05:30
66 changed files with 3006 additions and 949 deletions

View File

@@ -18,7 +18,7 @@ body:
options:
- label: I'm reporting a broken site
required: true
- label: I've verified that I'm running yt-dlp version **2022.08.14** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@@ -62,7 +62,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.14 [9d339c4] (win32_exe)
[debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -70,8 +70,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.08.14, Current version: 2022.08.14
yt-dlp is up to date (2022.08.14)
Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.09.01)
<more lines>
render: shell
validations:

View File

@@ -18,7 +18,7 @@ body:
options:
- label: I'm reporting a new site support request
required: true
- label: I've verified that I'm running yt-dlp version **2022.08.14** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@@ -74,7 +74,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.14 [9d339c4] (win32_exe)
[debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -82,8 +82,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.08.14, Current version: 2022.08.14
yt-dlp is up to date (2022.08.14)
Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.09.01)
<more lines>
render: shell
validations:

View File

@@ -18,7 +18,7 @@ body:
options:
- label: I'm requesting a site-specific feature
required: true
- label: I've verified that I'm running yt-dlp version **2022.08.14** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@@ -70,7 +70,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.14 [9d339c4] (win32_exe)
[debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -78,8 +78,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.08.14, Current version: 2022.08.14
yt-dlp is up to date (2022.08.14)
Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.09.01)
<more lines>
render: shell
validations:

View File

@@ -18,7 +18,7 @@ body:
options:
- label: I'm reporting a bug unrelated to a specific site
required: true
- label: I've verified that I'm running yt-dlp version **2022.08.14** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@@ -55,7 +55,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.14 [9d339c4] (win32_exe)
[debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -63,8 +63,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.08.14, Current version: 2022.08.14
yt-dlp is up to date (2022.08.14)
Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.09.01)
<more lines>
render: shell
validations:

View File

@@ -20,7 +20,7 @@ body:
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **2022.08.14** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
@@ -51,7 +51,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.14 [9d339c4] (win32_exe)
[debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -59,7 +59,7 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.08.14, Current version: 2022.08.14
yt-dlp is up to date (2022.08.14)
Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.09.01)
<more lines>
render: shell

View File

@@ -26,7 +26,7 @@ body:
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **2022.08.14** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2022.09.01** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
required: true
@@ -57,7 +57,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.14 [9d339c4] (win32_exe)
[debug] yt-dlp version 2022.09.01 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -65,7 +65,7 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.08.14, Current version: 2022.08.14
yt-dlp is up to date (2022.08.14)
Latest version: 2022.09.01, Current version: 2022.09.01
yt-dlp is up to date (2022.09.01)
<more lines>
render: shell

View File

@@ -194,7 +194,7 @@ jobs:
- name: Install Requirements
run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
python -m pip install --upgrade pip setuptools wheel py2exe
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-5.2-py3-none-any.whl" -r requirements.txt
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-5.3-py3-none-any.whl" -r requirements.txt
- name: Prepare
run: |
@@ -230,7 +230,7 @@ jobs:
- name: Install Requirements
run: |
python -m pip install --upgrade pip setuptools wheel
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-5.2-py3-none-any.whl" -r requirements.txt
pip install "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-5.3-py3-none-any.whl" -r requirements.txt
- name: Prepare
run: |

View File

@@ -299,3 +299,12 @@ bashonly
jacobtruman
masta79
palewire
cgrigis
DavidH-2022
dfaker
jackyyf
ohaiibuzzle
SamantazFox
shreyasminocha
tejasa97
xenov

View File

@@ -11,6 +11,71 @@
-->
### 2022.09.01
* Add option `--use-extractors`
* Merge youtube-dl: Upto [commit/ed5c44e](https://github.com/ytdl-org/youtube-dl/commit/ed5c44e7)
* Add yt-dlp version to infojson
* Fix `--break-per-url --max-downloads`
* Fix bug in `--alias`
* [cookies] Support firefox container in `--cookies-from-browser` by [bashonly](https://github.com/bashonly), [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* [downloader/external] Smarter detection of executable
* [extractor/generic] Don't return JW player without formats
* [FormatSort] Fix `aext` for `--prefer-free-formats`
* [jsinterp] Various improvements by [pukkandan](https://github.com/pukkandan), [dirkf](https://github.com/dirkf), [elyse0](https://github.com/elyse0)
* [cache] Mechanism to invalidate old cache
* [utils] Add `deprecation_warning`
* [utils] Add `orderedSet_from_options`
* [utils] `Popen`: Restore `LD_LIBRARY_PATH` when using PyInstaller by [Lesmiscore](https://github.com/Lesmiscore)
* [build] `make tar` should not follow `DESTDIR` by [satan1st](https://github.com/satan1st)
* [build] Update pyinstaller by [shirt-dev](https://github.com/shirt-dev)
* [test] Fix `test_youtube_signature`
* [cleanup] Misc fixes and cleanup by [DavidH-2022](https://github.com/DavidH-2022), [MrRawes](https://github.com/MrRawes), [pukkandan](https://github.com/pukkandan)
* [extractor/epoch] Add extractor by [tejasa97](https://github.com/tejasa97)
* [extractor/eurosport] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/IslamChannel] Add extractors by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/newspicks] Add extractor by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/triller] Add extractor by [bashonly](https://github.com/bashonly)
* [extractor/VQQ] Add extractors by [elyse0](https://github.com/elyse0)
* [extractor/youtube] Improvements to nsig extraction
* [extractor/youtube] Fix bug in format sorting
* [extractor/youtube] Update iOS Innertube clients by [SamantazFox](https://github.com/SamantazFox)
* [extractor/youtube] Use device-specific user agent by [coletdjnz](https://github.com/coletdjnz)
* [extractor/youtube] Add `--compat-option no-youtube-prefer-utc-upload-date` by [coletdjnz](https://github.com/coletdjnz)
* [extractor/arte] Bug fix by [cgrigis](https://github.com/cgrigis)
* [extractor/bilibili] Extract `flac` with premium account by [jackyyf](https://github.com/jackyyf)
* [extractor/BiliBiliSearch] Don't sort by date
* [extractor/BiliBiliSearch] Fix infinite loop
* [extractor/bitchute] Mark errors as expected
* [extractor/crunchyroll:beta] Use anonymous access by [tejing1](https://github.com/tejing1)
* [extractor/huya] Fix stream extraction by [ohaiibuzzle](https://github.com/ohaiibuzzle)
* [extractor/medaltv] Fix extraction by [xenova](https://github.com/xenova)
* [extractor/mediaset] Fix embed extraction
* [extractor/mixcloud] All formats are audio-only
* [extractor/rtbf] Fix jwt extraction by [elyse0](https://github.com/elyse0)
* [extractor/screencastomatic] Support `--video-password` by [shreyasminocha](https://github.com/shreyasminocha)
* [extractor/stripchat] Don't modify input URL by [dfaker](https://github.com/dfaker)
* [extractor/uktv] Improve `_VALID_URL` by [dirkf](https://github.com/dirkf)
* [extractor/vimeo:user] Fix `_VALID_URL`
### 2022.08.19
* Fix bug in `--download-archive`
* [jsinterp] **Fix for new youtube players** and related improvements by [dirkf](https://github.com/dirkf), [pukkandan](https://github.com/pukkandan)
* [phantomjs] Add function to execute JS without a DOM by [MinePlayersPE](https://github.com/MinePlayersPE), [pukkandan](https://github.com/pukkandan)
* [build] Exclude devscripts from installs by [Lesmiscore](https://github.com/Lesmiscore)
* [cleanup] Misc fixes and cleanup
* [extractor/youtube] **Add fallback to phantomjs** for nsig
* [extractor/youtube] Fix error reporting of "Incomplete data"
* [extractor/youtube] Improve format sorting for IOS formats
* [extractor/youtube] Improve signature caching
* [extractor/instagram] Fix extraction by [bashonly](https://github.com/bashonly), [pritam20ps05](https://github.com/pritam20ps05)
* [extractor/rai] Minor fix by [nixxo](https://github.com/nixxo)
* [extractor/rtbf] Fix stream extractor by [elyse0](https://github.com/elyse0)
* [extractor/SovietsCloset] Fix extractor by [ChillingPepper](https://github.com/ChillingPepper)
* [extractor/zattoo] Fix Zattoo resellers by [goggle](https://github.com/goggle)
### 2022.08.14
* Merge youtube-dl: Upto [commit/d231b56](https://github.com/ytdl-org/youtube-dl/commit/d231b56)
@@ -19,8 +84,7 @@
* [extractor] Fix format sorting of `channels`
* [ffmpeg] Disable avconv unless `--prefer-avconv`
* [ffmpeg] Smarter detection of ffprobe filename
* [patreon] Ignore erroneous media attachments by [coletdjnz](https://github.com/coletdjnz)
* [postprocessor/embedthumbnail] Detect `libatomicparsley.so`
* [embedthumbnail] Detect `libatomicparsley.so`
* [ThumbnailsConvertor] Fix conversion after `fixup_webp`
* [utils] Fix `get_compatible_ext`
* [build] Fix changelog
@@ -30,6 +94,7 @@
* [cleanup] Misc fixes and cleanup
* [extractor/moview] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/parler] Add extractor by [palewire](https://github.com/palewire)
* [extractor/patreon] Ignore erroneous media attachments by [coletdjnz](https://github.com/coletdjnz)
* [extractor/truth] Add extractor by [palewire](https://github.com/palewire)
* [extractor/aenetworks] Add formats parameter by [jacobtruman](https://github.com/jacobtruman)
* [extractor/crunchyroll] Improve `_VALID_URL`s

View File

@@ -33,7 +33,6 @@ completion-zsh: completions/zsh/_yt-dlp
lazy-extractors: yt_dlp/extractor/lazy_extractors.py
PREFIX ?= /usr/local
DESTDIR ?= .
BINDIR ?= $(PREFIX)/bin
MANDIR ?= $(PREFIX)/man
SHAREDIR ?= $(PREFIX)/share
@@ -134,7 +133,7 @@ yt_dlp/extractor/lazy_extractors.py: devscripts/make_lazy_extractors.py devscrip
$(PYTHON) devscripts/make_lazy_extractors.py $@
yt-dlp.tar.gz: all
@tar -czf $(DESTDIR)/yt-dlp.tar.gz --transform "s|^|yt-dlp/|" --owner 0 --group 0 \
@tar -czf yt-dlp.tar.gz --transform "s|^|yt-dlp/|" --owner 0 --group 0 \
--exclude '*.DS_Store' \
--exclude '*.kate-swp' \
--exclude '*.pyc' \

View File

@@ -71,7 +71,7 @@ yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on t
# NEW FEATURES
* Merged with **youtube-dl v2021.12.17+ [commit/d231b56](https://github.com/ytdl-org/youtube-dl/commit/d231b56717c73ee597d2e077d11b69ed48a1b02d)**<!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* Merged with **youtube-dl v2021.12.17+ [commit/ed5c44e](https://github.com/ytdl-org/youtube-dl/commit/ed5c44e7b74ac77f87ca5ed6cb5e964a0c6a0678)**<!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* **[SponsorBlock Integration](#sponsorblock-options)**: You can mark/remove sponsor sections in youtube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API
@@ -141,6 +141,7 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
* Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading
* Youtube channel URLs are automatically redirected to `/video`. Append a `/featured` to the URL to download only the videos in the home page. If the channel does not have a videos tab, we try to download the equivalent `UU` playlist instead. For all other tabs, if the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections
* Unavailable videos are also listed for youtube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this
* The upload dates extracted from YouTube are in UTC [when available](https://github.com/yt-dlp/yt-dlp/blob/89e4d86171c7b7c997c77d4714542e0383bf0db0/yt_dlp/extractor/youtube.py#L3898-L3900). Use `--compat-options no-youtube-prefer-utc-upload-date` to prefer the non-UTC upload date.
* If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this
* Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead
* Some private fields such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this
@@ -320,7 +321,7 @@ To build the standalone executable, you must have Python and `pyinstaller` (plus
On some systems, you may need to use `py` or `python` instead of `python3`.
Note that pyinstaller [does not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment.
Note that pyinstaller with versions below 4.4 [do not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment.
**Important**: Running `pyinstaller` directly **without** using `pyinst.py` is **not** officially supported. This may or may not work correctly.
@@ -329,7 +330,7 @@ You will need the build tools `python` (3.6+), `zip`, `make` (GNU), `pandoc`\* a
After installing these, simply run `make`.
You can also run `make yt-dlp` instead to compile only the binary without updating any of the additional files. (The dependencies marked with **\*** are not needed for this)
You can also run `make yt-dlp` instead to compile only the binary without updating any of the additional files. (The build tools marked with **\*** are not needed for this)
### Standalone Py2Exe Builds (Windows)
@@ -375,7 +376,13 @@ You can also fork the project on github and run your fork's [build workflow](.gi
--list-extractors List all supported extractors and exit
--extractor-descriptions Output descriptions of all supported
extractors and exit
--force-generic-extractor Force extraction to use the generic extractor
--use-extractors NAMES Extractor names to use separated by commas.
You can also use regexes, "all", "default"
and "end" (end URL matching); e.g. --ies
"holodex.*,end,youtube". Prefix the name
with a "-" to exclude it, e.g. --ies
default,-generic. Use --list-extractors for
a list of extractor names. (Alias: --ies)
--default-search PREFIX Use this prefix for unqualified URLs. E.g.
"gvsearch2:python" downloads two videos from
google videos for the search term "python".
@@ -524,8 +531,8 @@ You can also fork the project on github and run your fork's [build workflow](.gi
a file that is in the archive
--break-on-reject Stop the download process when encountering
a file that has been filtered out
--break-per-input Make --break-on-existing, --break-on-reject
and --max-downloads act only on the current
--break-per-input --break-on-existing, --break-on-reject,
--max-downloads, and autonumber resets per
input URL
--no-break-per-input --break-on-existing and similar options
terminates the entire download queue
@@ -700,18 +707,20 @@ You can also fork the project on github and run your fork's [build workflow](.gi
and dump cookie jar in
--no-cookies Do not read/dump cookies from/to file
(default)
--cookies-from-browser BROWSER[+KEYRING][:PROFILE]
The name of the browser and (optionally) the
name/path of the profile to load cookies
from, separated by a ":". Currently
supported browsers are: brave, chrome,
chromium, edge, firefox, opera, safari,
vivaldi. By default, the most recently
accessed profile is used. The keyring used
for decrypting Chromium cookies on Linux can
be (optionally) specified after the browser
name separated by a "+". Currently supported
keyrings are: basictext, gnomekeyring, kwallet
--cookies-from-browser BROWSER[+KEYRING][:PROFILE][::CONTAINER]
The name of the browser to load cookies
from. Currently supported browsers are:
brave, chrome, chromium, edge, firefox,
opera, safari, vivaldi. Optionally, the
KEYRING used for decrypting Chromium cookies
on Linux, the name/path of the PROFILE to
load cookies from, and the CONTAINER name
(if Firefox) ("none" for no container) can
be given with their respective seperators.
By default, all containers of the most
recently accessed profile are used.
Currently supported keyrings are: basictext,
gnomekeyring, kwallet
--no-cookies-from-browser Do not load cookies from browser (default)
--cache-dir DIR Location in the filesystem where youtube-dl
can store some downloaded information (such
@@ -1229,7 +1238,6 @@ The available fields are:
- `id` (string): Video identifier
- `title` (string): Video title
- `fulltitle` (string): Video title ignoring live timestamp and generic title
- `url` (string): Video URL
- `ext` (string): Video filename extension
- `alt_title` (string): A secondary title of the video
- `description` (string): The description of the video
@@ -1264,26 +1272,6 @@ The available fields are:
- `availability` (string): Whether the video is "private", "premium_only", "subscriber_only", "needs_auth", "unlisted" or "public"
- `start_time` (numeric): Time in seconds where the reproduction should start, as specified in the URL
- `end_time` (numeric): Time in seconds where the reproduction should end, as specified in the URL
- `format` (string): A human-readable description of the format
- `format_id` (string): Format code specified by `--format`
- `format_note` (string): Additional info about the format
- `width` (numeric): Width of the video
- `height` (numeric): Height of the video
- `resolution` (string): Textual description of width and height
- `tbr` (numeric): Average bitrate of audio and video in KBit/s
- `abr` (numeric): Average audio bitrate in KBit/s
- `acodec` (string): Name of the audio codec in use
- `asr` (numeric): Audio sampling rate in Hertz
- `vbr` (numeric): Average video bitrate in KBit/s
- `fps` (numeric): Frame rate
- `dynamic_range` (string): The dynamic range of the video
- `audio_channels` (numeric): The number of audio channels
- `stretched_ratio` (float): `width:height` of the video's pixels, if not square
- `vcodec` (string): Name of the video codec in use
- `container` (string): Name of the container format
- `filesize` (numeric): The number of bytes, if known in advance
- `filesize_approx` (numeric): An estimate for the number of bytes
- `protocol` (string): The protocol that will be used for the actual download
- `extractor` (string): Name of the extractor
- `extractor_key` (string): Key name of the extractor
- `epoch` (numeric): Unix epoch of when the information extraction was completed
@@ -1302,6 +1290,8 @@ The available fields are:
- `webpage_url_basename` (string): The basename of the webpage URL
- `webpage_url_domain` (string): The domain of the webpage URL
- `original_url` (string): The URL given by the user (or same as `webpage_url` for playlist entries)
All the fields in [Filtering Formats](#filtering-formats) can also be used
Available for the video that belongs to some logical chapter or section:
@@ -1383,13 +1373,13 @@ If you are using an output template inside a Windows batch file then you must es
#### Output template examples
```bash
$ yt-dlp --get-filename -o "test video.%(ext)s" BaW_jenozKc
$ yt-dlp --print filename -o "test video.%(ext)s" BaW_jenozKc
test video.webm # Literal name with correct extension
$ yt-dlp --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc
$ yt-dlp --print filename -o "%(title)s.%(ext)s" BaW_jenozKc
youtube-dl test video ''_ä↭𝕐.webm # All kinds of weird characters
$ yt-dlp --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc --restrict-filenames
$ yt-dlp --print filename -o "%(title)s.%(ext)s" BaW_jenozKc --restrict-filenames
youtube-dl_test_video_.webm # Restricted file name
# Download YouTube playlist videos in separate directory indexed by video order in a playlist
@@ -1478,6 +1468,7 @@ You can also filter the video formats by putting a condition in brackets, as in
The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals):
- `filesize`: The number of bytes, if known in advance
- `filesize_approx`: An estimate for the number of bytes
- `width`: Width of the video, if known
- `height`: Height of the video, if known
- `tbr`: Average bitrate of audio and video in KBit/s
@@ -1485,16 +1476,23 @@ The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `
- `vbr`: Average video bitrate in KBit/s
- `asr`: Audio sampling rate in Hertz
- `fps`: Frame rate
- `audio_channels`: The number of audio channels
- `stretched_ratio`: `width:height` of the video's pixels, if not square
Also filtering work for comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains), `~=` (matches regex) and following string meta fields:
- `url`: Video URL
- `ext`: File extension
- `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use
- `container`: Name of the container format
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`)
- `format_id`: A short description of the format
- `language`: Language code
- `dynamic_range`: The dynamic range of the video
- `format_id`: A short description of the format
- `format`: A human-readable description of the format
- `format_note`: Additional info about the format
- `resolution`: Textual description of width and height
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain). The comparand of a string comparison needs to be quoted with either double or single quotes if it contains spaces or special characters other than `._-`.
@@ -1521,7 +1519,7 @@ The available fields are:
- `acodec`: Audio Codec (`flac`/`alac` > `wav`/`aiff` > `opus` > `vorbis` > `aac` > `mp4a` > `mp3` > `eac3` > `ac3` > `dts` > other)
- `codec`: Equivalent to `vcodec,acodec`
- `vext`: Video Extension (`mp4` > `webm` > `flv` > other). If `--prefer-free-formats` is used, `webm` is preferred.
- `aext`: Audio Extension (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other). If `--prefer-free-formats` is used, the order changes to `opus` > `ogg` > `webm` > `m4a` > `mp3` > `aac`.
- `aext`: Audio Extension (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other). If `--prefer-free-formats` is used, the order changes to `ogg` > `opus` > `webm` > `mp3` > `m4a` > `aac`
- `ext`: Equivalent to `vext,aext`
- `filesize`: Exact filesize, if known in advance
- `fs_approx`: Approximate filesize calculated from the manifests
@@ -2058,6 +2056,7 @@ While these options are redundant, they are still expected to be used due to the
#### Not recommended
While these options still work, their use is not recommended since there are other alternatives to achieve the same
--force-generic-extractor --ies generic,default
--exec-before-download CMD --exec "before_dl:CMD"
--no-exec-before-download --no-exec
--all-formats -f all

View File

@@ -11,14 +11,17 @@ from ..utils import (
# These bloat the lazy_extractors, so allow them to passthrough silently
ALLOWED_CLASSMETHODS = {'get_testcases', 'extract_from_webpage'}
_WARNED = False
class LazyLoadMetaClass(type):
def __getattr__(cls, name):
if '_real_class' not in cls.__dict__ and name not in ALLOWED_CLASSMETHODS:
write_string(
'WARNING: Falling back to normal extractor since lazy extractor '
f'{cls.__name__} does not have attribute {name}{bug_reports_message()}\n')
global _WARNED
if ('_real_class' not in cls.__dict__
and name not in ALLOWED_CLASSMETHODS and not _WARNED):
_WARNED = True
write_string('WARNING: Falling back to normal extractor since lazy extractor '
f'{cls.__name__} does not have attribute {name}{bug_reports_message()}\n')
return getattr(cls.real_class, name)

View File

@@ -12,7 +12,9 @@ from inspect import getsource
from devscripts.utils import get_filename_args, read_file, write_file
NO_ATTR = object()
STATIC_CLASS_PROPERTIES = ['IE_NAME', 'IE_DESC', 'SEARCH_KEY', '_VALID_URL', '_WORKING', '_NETRC_MACHINE', 'age_limit']
STATIC_CLASS_PROPERTIES = [
'IE_NAME', 'IE_DESC', 'SEARCH_KEY', '_VALID_URL', '_WORKING', '_ENABLED', '_NETRC_MACHINE', 'age_limit'
]
CLASS_METHODS = [
'ie_key', 'working', 'description', 'suitable', '_match_valid_url', '_match_id', 'get_temp_id', 'is_suitable'
]

View File

@@ -1,13 +1,13 @@
#!/usr/bin/env sh
if [ -z $1 ]; then
if [ -z "$1" ]; then
test_set='test'
elif [ $1 = 'core' ]; then
elif [ "$1" = 'core' ]; then
test_set="-m not download"
elif [ $1 = 'download' ]; then
elif [ "$1" = 'download' ]; then
test_set="-m download"
else
echo 'Invalid test type "'$1'". Use "core" | "download"'
echo 'Invalid test type "'"$1"'". Use "core" | "download"'
exit 1
fi

View File

@@ -81,7 +81,7 @@ def version_to_list(version):
def dependency_options():
# Due to the current implementation, these are auto-detected, but explicitly add them just in case
dependencies = [pycryptodome_module(), 'mutagen', 'brotli', 'certifi', 'websockets']
excluded_modules = ['test', 'ytdlp_plugins', 'youtube_dl', 'youtube_dlc']
excluded_modules = ('youtube_dl', 'youtube_dlc', 'test', 'ytdlp_plugins', 'devscripts')
yield from (f'--hidden-import={module}' for module in dependencies)
yield '--collect-submodules=websockets'

View File

@@ -28,7 +28,7 @@ REQUIREMENTS = read_file('requirements.txt').splitlines()
def packages():
if setuptools_available:
return find_packages(exclude=('youtube_dl', 'youtube_dlc', 'test', 'ytdlp_plugins'))
return find_packages(exclude=('youtube_dl', 'youtube_dlc', 'test', 'ytdlp_plugins', 'devscripts'))
return [
'yt_dlp', 'yt_dlp.extractor', 'yt_dlp.downloader', 'yt_dlp.postprocessor', 'yt_dlp.compat',

View File

@@ -128,6 +128,8 @@
- **bbc.co.uk:iplayer:group**
- **bbc.co.uk:playlist**
- **BBVTV**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>]
- **BBVTVLive**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>]
- **BBVTVRecordings**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>]
- **Beatport**
- **Beeg**
- **BehindKink**
@@ -348,6 +350,8 @@
- **ehftv**
- **eHow**
- **EinsUndEinsTV**: [<abbr title="netrc machine"><em>1und1tv</em></abbr>]
- **EinsUndEinsTVLive**: [<abbr title="netrc machine"><em>1und1tv</em></abbr>]
- **EinsUndEinsTVRecordings**: [<abbr title="netrc machine"><em>1und1tv</em></abbr>]
- **Einthusan**
- **eitb.tv**
- **EllenTube**
@@ -360,6 +364,7 @@
- **Engadget**
- **Epicon**
- **EpiconSeries**
- **Epoch**
- **Eporner**
- **EroProfile**: [<abbr title="netrc machine"><em>eroprofile</em></abbr>]
- **EroProfile:album**
@@ -373,8 +378,11 @@
- **EsriVideo**
- **Europa**
- **EuropeanTour**
- **Eurosport**
- **EUScreen**
- **EWETV**: [<abbr title="netrc machine"><em>ewetv</em></abbr>]
- **EWETVLive**: [<abbr title="netrc machine"><em>ewetv</em></abbr>]
- **EWETVRecordings**: [<abbr title="netrc machine"><em>ewetv</em></abbr>]
- **ExpoTV**
- **Expressen**
- **ExtremeTube**
@@ -454,6 +462,8 @@
- **GiantBomb**
- **Giga**
- **GlattvisionTV**: [<abbr title="netrc machine"><em>glattvisiontv</em></abbr>]
- **GlattvisionTVLive**: [<abbr title="netrc machine"><em>glattvisiontv</em></abbr>]
- **GlattvisionTVRecordings**: [<abbr title="netrc machine"><em>glattvisiontv</em></abbr>]
- **Glide**: Glide mobile video messages (glide.me)
- **Globo**: [<abbr title="netrc machine"><em>globo</em></abbr>]
- **GloboArticle**
@@ -545,6 +555,8 @@
- **iq.com**: International version of iQiyi
- **iq.com:album**
- **iqiyi**: [<abbr title="netrc machine"><em>iqiyi</em></abbr>] 爱奇艺
- **IslamChannel**
- **IslamChannelSeries**
- **ITProTV**
- **ITProTVCourse**
- **ITTF**
@@ -715,6 +727,8 @@
- **MLSSoccer**
- **Mnet**
- **MNetTV**: [<abbr title="netrc machine"><em>mnettv</em></abbr>]
- **MNetTVLive**: [<abbr title="netrc machine"><em>mnettv</em></abbr>]
- **MNetTVRecordings**: [<abbr title="netrc machine"><em>mnettv</em></abbr>]
- **MochaVideo**
- **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
- **Mofosex**
@@ -801,13 +815,16 @@
- **netease:program**: 网易云音乐 - 电台节目
- **netease:singer**: 网易云音乐 - 歌手
- **netease:song**: 网易云音乐
- **NetPlus**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **NetPlusTV**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **NetPlusTVLive**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **NetPlusTVRecordings**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **Netverse**
- **NetversePlaylist**
- **Netzkino**
- **Newgrounds**
- **Newgrounds:playlist**
- **Newgrounds:user**
- **NewsPicks**
- **Newstube**
- **Newsy**
- **NextMedia**: 蘋果日報
@@ -906,6 +923,8 @@
- **orf:radio**
- **orf:tvthek**: ORF TVthek
- **OsnatelTV**: [<abbr title="netrc machine"><em>osnateltv</em></abbr>]
- **OsnatelTVLive**: [<abbr title="netrc machine"><em>osnateltv</em></abbr>]
- **OsnatelTVRecordings**: [<abbr title="netrc machine"><em>osnateltv</em></abbr>]
- **OutsideTV**
- **PacktPub**: [<abbr title="netrc machine"><em>packtpub</em></abbr>]
- **PacktPubCourse**
@@ -1013,6 +1032,8 @@
- **qqmusic:singer**: QQ音乐 - 歌手
- **qqmusic:toplist**: QQ音乐 - 排行榜
- **QuantumTV**: [<abbr title="netrc machine"><em>quantumtv</em></abbr>]
- **QuantumTVLive**: [<abbr title="netrc machine"><em>quantumtv</em></abbr>]
- **QuantumTVRecordings**: [<abbr title="netrc machine"><em>quantumtv</em></abbr>]
- **Qub**
- **R7**
- **R7Article**
@@ -1121,7 +1142,11 @@
- **safari:course**: [<abbr title="netrc machine"><em>safari</em></abbr>] safaribooksonline.com online courses
- **Saitosan**
- **SAKTV**: [<abbr title="netrc machine"><em>saktv</em></abbr>]
- **SAKTVLive**: [<abbr title="netrc machine"><em>saktv</em></abbr>]
- **SAKTVRecordings**: [<abbr title="netrc machine"><em>saktv</em></abbr>]
- **SaltTV**: [<abbr title="netrc machine"><em>salttv</em></abbr>]
- **SaltTVLive**: [<abbr title="netrc machine"><em>salttv</em></abbr>]
- **SaltTVRecordings**: [<abbr title="netrc machine"><em>salttv</em></abbr>]
- **SampleFocus**
- **Sapo**: SAPO Vídeos
- **savefrom.net**
@@ -1311,6 +1336,8 @@
- **ToypicsUser**: Toypics user profile
- **TrailerAddict**: (**Currently broken**)
- **TravelChannel**
- **Triller**: [<abbr title="netrc machine"><em>triller</em></abbr>]
- **TrillerUser**: [<abbr title="netrc machine"><em>triller</em></abbr>]
- **Trilulilu**
- **Trovo**
- **TrovoChannelClip**: All Clips of a trovo.live channel; "trovoclip:" prefix
@@ -1486,6 +1513,8 @@
- **VoxMedia**
- **VoxMediaVolume**
- **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **vqq:series**
- **vqq:video**
- **Vrak**
- **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza
- **VrtNU**: [<abbr title="netrc machine"><em>vrtnu</em></abbr>] VrtNU.be
@@ -1494,6 +1523,8 @@
- **VShare**
- **VTM**
- **VTXTV**: [<abbr title="netrc machine"><em>vtxtv</em></abbr>]
- **VTXTVLive**: [<abbr title="netrc machine"><em>vtxtv</em></abbr>]
- **VTXTVRecordings**: [<abbr title="netrc machine"><em>vtxtv</em></abbr>]
- **VuClip**
- **Vupload**
- **VVVVID**
@@ -1503,6 +1534,8 @@
- **Wakanim**
- **Walla**
- **WalyTV**: [<abbr title="netrc machine"><em>walytv</em></abbr>]
- **WalyTVLive**: [<abbr title="netrc machine"><em>walytv</em></abbr>]
- **WalyTVRecordings**: [<abbr title="netrc machine"><em>walytv</em></abbr>]
- **wasdtv:clip**
- **wasdtv:record**
- **wasdtv:stream**

View File

@@ -668,7 +668,7 @@ class TestYoutubeDL(unittest.TestCase):
def test_prepare_outtmpl_and_filename(self):
def test(tmpl, expected, *, info=None, **params):
params['outtmpl'] = tmpl
ydl = YoutubeDL(params)
ydl = FakeYDL(params)
ydl._num_downloads = 1
self.assertEqual(ydl.validate_outtmpl(tmpl), None)

View File

@@ -11,41 +11,46 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import contextlib
import subprocess
from yt_dlp.utils import encodeArgument
from yt_dlp.utils import Popen
rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
try:
_DEV_NULL = subprocess.DEVNULL
except AttributeError:
_DEV_NULL = open(os.devnull, 'wb')
LAZY_EXTRACTORS = 'yt_dlp/extractor/lazy_extractors.py'
class TestExecution(unittest.TestCase):
def test_import(self):
subprocess.check_call([sys.executable, '-c', 'import yt_dlp'], cwd=rootDir)
def test_module_exec(self):
subprocess.check_call([sys.executable, '-m', 'yt_dlp', '--ignore-config', '--version'], cwd=rootDir, stdout=_DEV_NULL)
def run_yt_dlp(self, exe=(sys.executable, 'yt_dlp/__main__.py'), opts=('--version', )):
stdout, stderr, returncode = Popen.run(
[*exe, '--ignore-config', *opts], cwd=rootDir, text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
print(stderr, file=sys.stderr)
self.assertEqual(returncode, 0)
return stdout.strip(), stderr.strip()
def test_main_exec(self):
subprocess.check_call([sys.executable, 'yt_dlp/__main__.py', '--ignore-config', '--version'], cwd=rootDir, stdout=_DEV_NULL)
self.run_yt_dlp()
def test_import(self):
self.run_yt_dlp(exe=(sys.executable, '-c', 'import yt_dlp'))
def test_module_exec(self):
self.run_yt_dlp(exe=(sys.executable, '-m', 'yt_dlp'))
def test_cmdline_umlauts(self):
p = subprocess.Popen(
[sys.executable, 'yt_dlp/__main__.py', '--ignore-config', encodeArgument('ä'), '--version'],
cwd=rootDir, stdout=_DEV_NULL, stderr=subprocess.PIPE)
_, stderr = p.communicate()
_, stderr = self.run_yt_dlp(opts=('ä', '--version'))
self.assertFalse(stderr)
def test_lazy_extractors(self):
try:
subprocess.check_call([sys.executable, 'devscripts/make_lazy_extractors.py', 'yt_dlp/extractor/lazy_extractors.py'], cwd=rootDir, stdout=_DEV_NULL)
subprocess.check_call([sys.executable, 'test/test_all_urls.py'], cwd=rootDir, stdout=_DEV_NULL)
subprocess.check_call([sys.executable, 'devscripts/make_lazy_extractors.py', LAZY_EXTRACTORS],
cwd=rootDir, stdout=subprocess.DEVNULL)
self.assertTrue(os.path.exists(LAZY_EXTRACTORS))
_, stderr = self.run_yt_dlp(opts=('-s', 'test:'))
self.assertFalse(stderr)
subprocess.check_call([sys.executable, 'test/test_all_urls.py'], cwd=rootDir, stdout=subprocess.DEVNULL)
finally:
with contextlib.suppress(OSError):
os.remove('yt_dlp/extractor/lazy_extractors.py')
os.remove(LAZY_EXTRACTORS)
if __name__ == '__main__':

View File

@@ -7,8 +7,10 @@ import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import math
import re
from yt_dlp.jsinterp import JSInterpreter
from yt_dlp.jsinterp import JS_Undefined, JSInterpreter
class TestJSInterpreter(unittest.TestCase):
@@ -66,6 +68,12 @@ class TestJSInterpreter(unittest.TestCase):
jsi = JSInterpreter('function f(){return 0 && 1 || 2;}')
self.assertEqual(jsi.call_function('f'), 2)
jsi = JSInterpreter('function f(){return 0 ?? 42;}')
self.assertEqual(jsi.call_function('f'), 0)
jsi = JSInterpreter('function f(){return "life, the universe and everything" < 42;}')
self.assertFalse(jsi.call_function('f'))
def test_array_access(self):
jsi = JSInterpreter('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2.0] = 7; return x;}')
self.assertEqual(jsi.call_function('f'), [5, 2, 7])
@@ -124,6 +132,11 @@ class TestJSInterpreter(unittest.TestCase):
self.assertEqual(jsi.call_function('x'), [20, 20, 30, 40, 50])
def test_builtins(self):
jsi = JSInterpreter('''
function x() { return NaN }
''')
self.assertTrue(math.isnan(jsi.call_function('x')))
jsi = JSInterpreter('''
function x() { return new Date('Wednesday 31 December 1969 18:01:26 MDT') - 0; }
''')
@@ -183,6 +196,30 @@ class TestJSInterpreter(unittest.TestCase):
''')
self.assertEqual(jsi.call_function('x'), 10)
def test_catch(self):
jsi = JSInterpreter('''
function x() { try{throw 10} catch(e){return 5} }
''')
self.assertEqual(jsi.call_function('x'), 5)
def test_finally(self):
jsi = JSInterpreter('''
function x() { try{throw 10} finally {return 42} }
''')
self.assertEqual(jsi.call_function('x'), 42)
jsi = JSInterpreter('''
function x() { try{throw 10} catch(e){return 5} finally {return 42} }
''')
self.assertEqual(jsi.call_function('x'), 42)
def test_nested_try(self):
jsi = JSInterpreter('''
function x() {try {
try{throw 10} finally {throw 42}
} catch(e){return 5} }
''')
self.assertEqual(jsi.call_function('x'), 5)
def test_for_loop_continue(self):
jsi = JSInterpreter('''
function x() { a=0; for (i=0; i-10; i++) { continue; a++ } return a }
@@ -195,6 +232,14 @@ class TestJSInterpreter(unittest.TestCase):
''')
self.assertEqual(jsi.call_function('x'), 0)
def test_for_loop_try(self):
jsi = JSInterpreter('''
function x() {
for (i=0; i-10; i++) { try { if (i == 5) throw i} catch {return 10} finally {break} };
return 42 }
''')
self.assertEqual(jsi.call_function('x'), 42)
def test_literal_list(self):
jsi = JSInterpreter('''
function x() { return [1, 2, "asdf", [5, 6, 7]][3] }
@@ -212,6 +257,11 @@ class TestJSInterpreter(unittest.TestCase):
''')
self.assertEqual(jsi.call_function('x'), 7)
jsi = JSInterpreter('''
function x() { return (l=[0,1,2,3], function(a, b){return a+b})((l[1], l[2]), l[3]) }
''')
self.assertEqual(jsi.call_function('x'), 5)
def test_void(self):
jsi = JSInterpreter('''
function x() { return void 42; }
@@ -224,6 +274,140 @@ class TestJSInterpreter(unittest.TestCase):
''')
self.assertEqual(jsi.call_function('x')([]), 1)
def test_null(self):
jsi = JSInterpreter('''
function x() { return null; }
''')
self.assertEqual(jsi.call_function('x'), None)
jsi = JSInterpreter('''
function x() { return [null > 0, null < 0, null == 0, null === 0]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False, False, False])
jsi = JSInterpreter('''
function x() { return [null >= 0, null <= 0]; }
''')
self.assertEqual(jsi.call_function('x'), [True, True])
def test_undefined(self):
jsi = JSInterpreter('''
function x() { return undefined === undefined; }
''')
self.assertEqual(jsi.call_function('x'), True)
jsi = JSInterpreter('''
function x() { return undefined; }
''')
self.assertEqual(jsi.call_function('x'), JS_Undefined)
jsi = JSInterpreter('''
function x() { let v; return v; }
''')
self.assertEqual(jsi.call_function('x'), JS_Undefined)
jsi = JSInterpreter('''
function x() { return [undefined === undefined, undefined == undefined, undefined < undefined, undefined > undefined]; }
''')
self.assertEqual(jsi.call_function('x'), [True, True, False, False])
jsi = JSInterpreter('''
function x() { return [undefined === 0, undefined == 0, undefined < 0, undefined > 0]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False, False, False])
jsi = JSInterpreter('''
function x() { return [undefined >= 0, undefined <= 0]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False])
jsi = JSInterpreter('''
function x() { return [undefined > null, undefined < null, undefined == null, undefined === null]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False, True, False])
jsi = JSInterpreter('''
function x() { return [undefined === null, undefined == null, undefined < null, undefined > null]; }
''')
self.assertEqual(jsi.call_function('x'), [False, True, False, False])
jsi = JSInterpreter('''
function x() { let v; return [42+v, v+42, v**42, 42**v, 0**v]; }
''')
for y in jsi.call_function('x'):
self.assertTrue(math.isnan(y))
jsi = JSInterpreter('''
function x() { let v; return v**0; }
''')
self.assertEqual(jsi.call_function('x'), 1)
jsi = JSInterpreter('''
function x() { let v; return [v>42, v<=42, v&&42, 42&&v]; }
''')
self.assertEqual(jsi.call_function('x'), [False, False, JS_Undefined, JS_Undefined])
jsi = JSInterpreter('function x(){return undefined ?? 42; }')
self.assertEqual(jsi.call_function('x'), 42)
def test_object(self):
jsi = JSInterpreter('''
function x() { return {}; }
''')
self.assertEqual(jsi.call_function('x'), {})
jsi = JSInterpreter('''
function x() { let a = {m1: 42, m2: 0 }; return [a["m1"], a.m2]; }
''')
self.assertEqual(jsi.call_function('x'), [42, 0])
jsi = JSInterpreter('''
function x() { let a; return a?.qq; }
''')
self.assertEqual(jsi.call_function('x'), JS_Undefined)
jsi = JSInterpreter('''
function x() { let a = {m1: 42, m2: 0 }; return a?.qq; }
''')
self.assertEqual(jsi.call_function('x'), JS_Undefined)
def test_regex(self):
jsi = JSInterpreter('''
function x() { let a=/,,[/,913,/](,)}/; }
''')
self.assertEqual(jsi.call_function('x'), None)
jsi = JSInterpreter('''
function x() { let a=/,,[/,913,/](,)}/; return a; }
''')
self.assertIsInstance(jsi.call_function('x'), re.Pattern)
jsi = JSInterpreter('''
function x() { let a=/,,[/,913,/](,)}/i; return a; }
''')
self.assertEqual(jsi.call_function('x').flags & re.I, re.I)
jsi = JSInterpreter(R'''
function x() { let a=/,][}",],()}(\[)/; return a; }
''')
self.assertEqual(jsi.call_function('x').pattern, r',][}",],()}(\[)')
def test_char_code_at(self):
jsi = JSInterpreter('function x(i){return "test".charCodeAt(i)}')
self.assertEqual(jsi.call_function('x', 0), 116)
self.assertEqual(jsi.call_function('x', 1), 101)
self.assertEqual(jsi.call_function('x', 2), 115)
self.assertEqual(jsi.call_function('x', 3), 116)
self.assertEqual(jsi.call_function('x', 4), None)
self.assertEqual(jsi.call_function('x', 'not_a_number'), 116)
def test_bitwise_operators_overflow(self):
jsi = JSInterpreter('function x(){return -524999584 << 5}')
self.assertEqual(jsi.call_function('x'), 379882496)
jsi = JSInterpreter('function x(){return 1236566549 << 5}')
self.assertEqual(jsi.call_function('x'), 915423904)
if __name__ == '__main__':
unittest.main()

View File

@@ -102,6 +102,30 @@ _NSIG_TESTS = [
'https://www.youtube.com/s/player/4c3f79c5/player_ias.vflset/en_US/base.js',
'TDCstCG66tEAO5pR9o', 'dbxNtZ14c-yWyw',
),
(
'https://www.youtube.com/s/player/c81bbb4a/player_ias.vflset/en_US/base.js',
'gre3EcLurNY2vqp94', 'Z9DfGxWP115WTg',
),
(
'https://www.youtube.com/s/player/1f7d5369/player_ias.vflset/en_US/base.js',
'batNX7sYqIJdkJ', 'IhOkL_zxbkOZBw',
),
(
'https://www.youtube.com/s/player/009f1d77/player_ias.vflset/en_US/base.js',
'5dwFHw8aFWQUQtffRq', 'audescmLUzI3jw',
),
(
'https://www.youtube.com/s/player/dc0c6770/player_ias.vflset/en_US/base.js',
'5EHDMgYLV6HPGk_Mu-kk', 'n9lUJLHbxUI0GQ',
),
(
'https://www.youtube.com/s/player/113ca41c/player_ias.vflset/en_US/base.js',
'cgYl-tlYkhjT7A', 'hI7BBr2zUgcmMg',
),
(
'https://www.youtube.com/s/player/c57c113c/player_ias.vflset/en_US/base.js',
'M92UUMHa8PdvPd3wyM', '3hPqLJsiNZx7yA',
),
]

View File

@@ -29,6 +29,7 @@ from .cookies import load_cookies
from .downloader import FFmpegFD, get_suitable_downloader, shorten_protocol_name
from .downloader.rtmp import rtmpdump_version
from .extractor import gen_extractor_classes, get_info_extractor
from .extractor.common import UnsupportedURLIE
from .extractor.openload import PhantomJSwrapper
from .minicurses import format_text
from .postprocessor import _PLUGIN_CLASSES as plugin_postprocessors
@@ -47,7 +48,7 @@ from .postprocessor import (
get_postprocessor,
)
from .postprocessor.ffmpeg import resolve_mapping as resolve_recode_mapping
from .update import detect_variant
from .update import REPOSITORY, current_git_head, detect_variant
from .utils import (
DEFAULT_OUTTMPL,
IDENTITY,
@@ -89,6 +90,7 @@ from .utils import (
args_to_str,
bug_reports_message,
date_from_str,
deprecation_warning,
determine_ext,
determine_protocol,
encode_compat_str,
@@ -115,6 +117,7 @@ from .utils import (
network_exceptions,
number_of_digits,
orderedSet,
orderedSet_from_options,
parse_filesize,
preferredencoding,
prepend_extension,
@@ -236,7 +239,7 @@ class YoutubeDL:
Default is 'only_download' for CLI, but False for API
skip_playlist_after_errors: Number of allowed failures until the rest of
the playlist is skipped
force_generic_extractor: Force downloader to use the generic extractor
allowed_extractors: List of regexes to match against extractor names that are allowed
overwrites: Overwrite all video and metadata files if True,
overwrite only non-video files if None
and don't overwrite any file if False
@@ -301,8 +304,9 @@ class YoutubeDL:
should act on each input URL as opposed to for the entire queue
cookiefile: File name or text stream from where cookies should be read and dumped to
cookiesfrombrowser: A tuple containing the name of the browser, the profile
name/path from where cookies are loaded, and the name of the
keyring, e.g. ('chrome', ) or ('vivaldi', 'default', 'BASICTEXT')
name/path from where cookies are loaded, the name of the keyring,
and the container name, e.g. ('chrome', ) or
('vivaldi', 'default', 'BASICTEXT') or ('firefox', 'default', None, 'Meta')
legacyserverconnect: Explicitly allow HTTPS connection to servers that do not
support RFC 5746 secure renegotiation
nocheckcertificate: Do not verify SSL certificates
@@ -444,6 +448,7 @@ class YoutubeDL:
* index: Section number (Optional)
force_keyframes_at_cuts: Re-encode the video when downloading ranges to get precise cuts
noprogress: Do not print the progress bar
live_from_start: Whether to download livestreams videos from the start
The following parameters are not used by YoutubeDL itself, they are used by
the downloader (see yt_dlp/downloader/common.py):
@@ -475,6 +480,8 @@ class YoutubeDL:
The following options are deprecated and may be removed in the future:
force_generic_extractor: Force downloader to use the generic extractor
- Use allowed_extractors = ['generic', 'default']
playliststart: - Use playlist_items
Playlist item to start at.
playlistend: - Use playlist_items
@@ -626,7 +633,7 @@ class YoutubeDL:
for msg in self.params.get('_warnings', []):
self.report_warning(msg)
for msg in self.params.get('_deprecation_warnings', []):
self.deprecation_warning(msg)
self.deprecated_feature(msg)
self.params['compat_opts'] = set(self.params.get('compat_opts', ()))
if 'list-formats' in self.params['compat_opts']:
@@ -756,13 +763,6 @@ class YoutubeDL:
self._ies_instances[ie_key] = ie
ie.set_downloader(self)
def _get_info_extractor_class(self, ie_key):
ie = self._ies.get(ie_key)
if ie is None:
ie = get_info_extractor(ie_key)
self.add_info_extractor(ie)
return ie
def get_info_extractor(self, ie_key):
"""
Get an instance of an IE with name ie_key, it will try to get one from
@@ -779,8 +779,19 @@ class YoutubeDL:
"""
Add the InfoExtractors returned by gen_extractors to the end of the list
"""
for ie in gen_extractor_classes():
self.add_info_extractor(ie)
all_ies = {ie.IE_NAME.lower(): ie for ie in gen_extractor_classes()}
all_ies['end'] = UnsupportedURLIE()
try:
ie_names = orderedSet_from_options(
self.params.get('allowed_extractors', ['default']), {
'all': list(all_ies),
'default': [name for name, ie in all_ies.items() if ie._ENABLED],
}, use_regex=True)
except re.error as e:
raise ValueError(f'Wrong regex for allowed_extractors: {e.pattern}')
for name in ie_names:
self.add_info_extractor(all_ies[name])
self.write_debug(f'Loaded {len(ie_names)} extractors')
def add_post_processor(self, pp, when='post_process'):
"""Add a PostProcessor object to the end of the chain."""
@@ -826,9 +837,11 @@ class YoutubeDL:
def to_stdout(self, message, skip_eol=False, quiet=None):
"""Print message to stdout"""
if quiet is not None:
self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument quiet. Use "YoutubeDL.to_screen" instead')
self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument quiet. '
'Use "YoutubeDL.to_screen" instead')
if skip_eol is not False:
self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument skip_eol. Use "YoutubeDL.to_screen" instead')
self.deprecation_warning('"YoutubeDL.to_stdout" no longer accepts the argument skip_eol. '
'Use "YoutubeDL.to_screen" instead')
self._write_string(f'{self._bidi_workaround(message)}\n', self._out_files.out)
def to_screen(self, message, skip_eol=False, quiet=None):
@@ -964,11 +977,14 @@ class YoutubeDL:
return
self.to_stderr(f'{self._format_err("WARNING:", self.Styles.WARNING)} {message}', only_once)
def deprecation_warning(self, message):
def deprecation_warning(self, message, *, stacklevel=0):
deprecation_warning(
message, stacklevel=stacklevel + 1, printer=self.report_error, is_error=False)
def deprecated_feature(self, message):
if self.params.get('logger') is not None:
self.params['logger'].warning(f'DeprecationWarning: {message}')
else:
self.to_stderr(f'{self._format_err("DeprecationWarning:", self.Styles.ERROR)} {message}', True)
self.params['logger'].warning(f'Deprecated Feature: {message}')
self.to_stderr(f'{self._format_err("Deprecated Feature:", self.Styles.ERROR)} {message}', True)
def report_error(self, message, *args, **kwargs):
'''
@@ -1028,7 +1044,7 @@ class YoutubeDL:
def get_output_path(self, dir_type='', filename=None):
paths = self.params.get('paths', {})
assert isinstance(paths, dict)
assert isinstance(paths, dict), '"paths" parameter must be a dictionary'
path = os.path.join(
expand_path(paths.get('home', '').strip()),
expand_path(paths.get(dir_type, '').strip()) if dir_type else '',
@@ -1411,11 +1427,11 @@ class YoutubeDL:
ie_key = 'Generic'
if ie_key:
ies = {ie_key: self._get_info_extractor_class(ie_key)}
ies = {ie_key: self._ies[ie_key]} if ie_key in self._ies else {}
else:
ies = self._ies
for ie_key, ie in ies.items():
for key, ie in ies.items():
if not ie.suitable(url):
continue
@@ -1424,14 +1440,16 @@ class YoutubeDL:
'and will probably not work.')
temp_id = ie.get_temp_id(url)
if temp_id is not None and self.in_download_archive({'id': temp_id, 'ie_key': ie_key}):
self.to_screen(f'[{ie_key}] {temp_id}: has already been recorded in the archive')
if temp_id is not None and self.in_download_archive({'id': temp_id, 'ie_key': key}):
self.to_screen(f'[{key}] {temp_id}: has already been recorded in the archive')
if self.params.get('break_on_existing', False):
raise ExistingVideoReached()
break
return self.__extract_info(url, self.get_info_extractor(ie_key), download, extra_info, process)
return self.__extract_info(url, self.get_info_extractor(key), download, extra_info, process)
else:
self.report_error('no suitable InfoExtractor for URL %s' % url)
extractors_restricted = self.params.get('allowed_extractors') not in (None, ['default'])
self.report_error(f'No suitable extractor{format_field(ie_key, None, " (%s)")} found for URL {url}',
tb=False if extractors_restricted else None)
def _handle_extraction_exceptions(func):
@functools.wraps(func)
@@ -2510,9 +2528,6 @@ class YoutubeDL:
'--live-from-start is passed, but there are no formats that can be downloaded from the start. '
'If you want to download from the current time, use --no-live-from-start'))
if not formats:
self.raise_no_formats(info_dict)
def is_wellformed(f):
url = f.get('url')
if not url:
@@ -2525,7 +2540,10 @@ class YoutubeDL:
return True
# Filter out malformed formats for better extraction robustness
formats = list(filter(is_wellformed, formats))
formats = list(filter(is_wellformed, formats or []))
if not formats:
self.raise_no_formats(info_dict)
formats_dict = {}
@@ -2727,42 +2745,26 @@ class YoutubeDL:
if lang not in available_subs:
available_subs[lang] = cap_info
if (not self.params.get('writesubtitles') and not
self.params.get('writeautomaticsub') or not
available_subs):
if not available_subs or (
not self.params.get('writesubtitles')
and not self.params.get('writeautomaticsub')):
return None
all_sub_langs = tuple(available_subs.keys())
if self.params.get('allsubtitles', False):
requested_langs = all_sub_langs
elif self.params.get('subtitleslangs', False):
# A list is used so that the order of languages will be the same as
# given in subtitleslangs. See https://github.com/yt-dlp/yt-dlp/issues/1041
requested_langs = []
for lang_re in self.params.get('subtitleslangs'):
discard = lang_re[0] == '-'
if discard:
lang_re = lang_re[1:]
if lang_re == 'all':
if discard:
requested_langs = []
else:
requested_langs.extend(all_sub_langs)
continue
current_langs = filter(re.compile(lang_re + '$').match, all_sub_langs)
if discard:
for lang in current_langs:
while lang in requested_langs:
requested_langs.remove(lang)
else:
requested_langs.extend(current_langs)
requested_langs = orderedSet(requested_langs)
try:
requested_langs = orderedSet_from_options(
self.params.get('subtitleslangs'), {'all': all_sub_langs}, use_regex=True)
except re.error as e:
raise ValueError(f'Wrong regex for subtitlelangs: {e.pattern}')
elif normal_sub_langs:
requested_langs = ['en'] if 'en' in normal_sub_langs else normal_sub_langs[:1]
else:
requested_langs = ['en'] if 'en' in all_sub_langs else all_sub_langs[:1]
if requested_langs:
self.write_debug('Downloading subtitles: %s' % ', '.join(requested_langs))
self.to_screen(f'[info] {video_id}: Downloading subtitles: {", ".join(requested_langs)}')
formats_query = self.params.get('subtitlesformat', 'best')
formats_preference = formats_query.split('/') if formats_query else []
@@ -3270,6 +3272,7 @@ class YoutubeDL:
self.to_screen(f'[info] {e}')
if not self.params.get('break_per_url'):
raise
self._num_downloads = 0
else:
if self.params.get('dump_single_json', False):
self.post_extract(res)
@@ -3318,6 +3321,12 @@ class YoutubeDL:
return info_dict
info_dict.setdefault('epoch', int(time.time()))
info_dict.setdefault('_type', 'video')
info_dict.setdefault('_version', {
'version': __version__,
'current_git_head': current_git_head(),
'release_git_head': RELEASE_GIT_HEAD,
'repository': REPOSITORY,
})
if remove_private_keys:
reject = lambda k, v: v is None or k.startswith('__') or k in {
@@ -3443,7 +3452,7 @@ class YoutubeDL:
return False
vid_ids = [self._make_archive_id(info_dict)]
vid_ids.extend(info_dict.get('_old_archive_ids', []))
vid_ids.extend(info_dict.get('_old_archive_ids') or [])
return any(id_ in self.archive for id_ in vid_ids)
def record_download_archive(self, info_dict):
@@ -3682,7 +3691,8 @@ class YoutubeDL:
if VARIANT not in (None, 'pip'):
source += '*'
write_debug(join_nonempty(
'yt-dlp version', __version__,
f'{"yt-dlp" if REPOSITORY == "yt-dlp/yt-dlp" else REPOSITORY} version',
__version__,
f'[{RELEASE_GIT_HEAD}]' if RELEASE_GIT_HEAD else '',
'' if source == 'unknown' else f'({source})',
delim=' '))
@@ -3698,18 +3708,8 @@ class YoutubeDL:
if self.params['compat_opts']:
write_debug('Compatibility options: %s' % ', '.join(self.params['compat_opts']))
if source == 'source':
try:
stdout, _, _ = Popen.run(
['git', 'rev-parse', '--short', 'HEAD'],
text=True, cwd=os.path.dirname(os.path.abspath(__file__)),
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if re.fullmatch('[0-9a-f]+', stdout.strip()):
write_debug(f'Git HEAD: {stdout.strip()}')
except Exception:
with contextlib.suppress(Exception):
sys.exc_clear()
if current_git_head():
write_debug(f'Git HEAD: {current_git_head()}')
write_debug(system_identifier())
exe_versions, ffmpeg_features = FFmpegPostProcessor.get_versions_and_features(self)

View File

@@ -63,6 +63,8 @@ from .utils import (
)
from .YoutubeDL import YoutubeDL
_IN_CLI = False
def _exit(status=0, *args):
for msg in args:
@@ -344,10 +346,16 @@ def validate_options(opts):
# Cookies from browser
if opts.cookiesfrombrowser:
mobj = re.match(r'(?P<name>[^+:]+)(\s*\+\s*(?P<keyring>[^:]+))?(\s*:(?P<profile>.+))?', opts.cookiesfrombrowser)
container = None
mobj = re.fullmatch(r'''(?x)
(?P<name>[^+:]+)
(?:\s*\+\s*(?P<keyring>[^:]+))?
(?:\s*:\s*(?P<profile>.+?))?
(?:\s*::\s*(?P<container>.+))?
''', opts.cookiesfrombrowser)
if mobj is None:
raise ValueError(f'invalid cookies from browser arguments: {opts.cookiesfrombrowser}')
browser_name, keyring, profile = mobj.group('name', 'keyring', 'profile')
browser_name, keyring, profile, container = mobj.group('name', 'keyring', 'profile', 'container')
browser_name = browser_name.lower()
if browser_name not in SUPPORTED_BROWSERS:
raise ValueError(f'unsupported browser specified for cookies: "{browser_name}". '
@@ -357,7 +365,7 @@ def validate_options(opts):
if keyring not in SUPPORTED_KEYRINGS:
raise ValueError(f'unsupported keyring specified for cookies: "{keyring}". '
f'Supported keyrings are: {", ".join(sorted(SUPPORTED_KEYRINGS))}')
opts.cookiesfrombrowser = (browser_name, profile, keyring)
opts.cookiesfrombrowser = (browser_name, profile, keyring, container)
# MetadataParser
def metadataparser_actions(f):
@@ -766,6 +774,7 @@ def parse_options(argv=None):
'windowsfilenames': opts.windowsfilenames,
'ignoreerrors': opts.ignoreerrors,
'force_generic_extractor': opts.force_generic_extractor,
'allowed_extractors': opts.allowed_extractors or ['default'],
'ratelimit': opts.ratelimit,
'throttledratelimit': opts.throttledratelimit,
'overwrites': opts.overwrites,

View File

@@ -14,4 +14,5 @@ if __package__ is None and not hasattr(sys, 'frozen'):
import yt_dlp
if __name__ == '__main__':
yt_dlp._IN_CLI = True
yt_dlp.main()

View File

@@ -6,7 +6,8 @@ import re
import shutil
import traceback
from .utils import expand_path, write_json_file
from .utils import expand_path, traverse_obj, version_tuple, write_json_file
from .version import __version__
class Cache:
@@ -45,12 +46,20 @@ class Cache:
if ose.errno != errno.EEXIST:
raise
self._ydl.write_debug(f'Saving {section}.{key} to cache')
write_json_file(data, fn)
write_json_file({'yt-dlp_version': __version__, 'data': data}, fn)
except Exception:
tb = traceback.format_exc()
self._ydl.report_warning(f'Writing cache to {fn!r} failed: {tb}')
def load(self, section, key, dtype='json', default=None):
def _validate(self, data, min_ver):
version = traverse_obj(data, 'yt-dlp_version')
if not version: # Backward compatibility
data, version = {'data': data}, '2022.08.19'
if not min_ver or version_tuple(version) >= version_tuple(min_ver):
return data['data']
self._ydl.write_debug(f'Discarding old cache from version {version} (needs {min_ver})')
def load(self, section, key, dtype='json', default=None, *, min_ver=None):
assert dtype in ('json',)
if not self.enabled:
@@ -61,8 +70,8 @@ class Cache:
try:
with open(cache_fn, encoding='utf-8') as cachef:
self._ydl.write_debug(f'Loading {section}.{key} from cache')
return json.load(cachef)
except ValueError:
return self._validate(json.load(cachef), min_ver)
except (ValueError, KeyError):
try:
file_size = os.path.getsize(cache_fn)
except OSError as oe:

View File

@@ -3,6 +3,7 @@ import contextlib
import http.cookiejar
import json
import os
import re
import shutil
import struct
import subprocess
@@ -24,7 +25,13 @@ from .dependencies import (
sqlite3,
)
from .minicurses import MultilinePrinter, QuietMultilinePrinter
from .utils import Popen, YoutubeDLCookieJar, error_to_str, expand_path
from .utils import (
Popen,
YoutubeDLCookieJar,
error_to_str,
expand_path,
try_call,
)
CHROMIUM_BASED_BROWSERS = {'brave', 'chrome', 'chromium', 'edge', 'opera', 'vivaldi'}
SUPPORTED_BROWSERS = CHROMIUM_BASED_BROWSERS | {'firefox', 'safari'}
@@ -85,8 +92,9 @@ def _create_progress_bar(logger):
def load_cookies(cookie_file, browser_specification, ydl):
cookie_jars = []
if browser_specification is not None:
browser_name, profile, keyring = _parse_browser_specification(*browser_specification)
cookie_jars.append(extract_cookies_from_browser(browser_name, profile, YDLLogger(ydl), keyring=keyring))
browser_name, profile, keyring, container = _parse_browser_specification(*browser_specification)
cookie_jars.append(
extract_cookies_from_browser(browser_name, profile, YDLLogger(ydl), keyring=keyring, container=container))
if cookie_file is not None:
is_filename = YoutubeDLCookieJar.is_path(cookie_file)
@@ -101,9 +109,9 @@ def load_cookies(cookie_file, browser_specification, ydl):
return _merge_cookie_jars(cookie_jars)
def extract_cookies_from_browser(browser_name, profile=None, logger=YDLLogger(), *, keyring=None):
def extract_cookies_from_browser(browser_name, profile=None, logger=YDLLogger(), *, keyring=None, container=None):
if browser_name == 'firefox':
return _extract_firefox_cookies(profile, logger)
return _extract_firefox_cookies(profile, container, logger)
elif browser_name == 'safari':
return _extract_safari_cookies(profile, logger)
elif browser_name in CHROMIUM_BASED_BROWSERS:
@@ -112,7 +120,7 @@ def extract_cookies_from_browser(browser_name, profile=None, logger=YDLLogger(),
raise ValueError(f'unknown browser: {browser_name}')
def _extract_firefox_cookies(profile, logger):
def _extract_firefox_cookies(profile, container, logger):
logger.info('Extracting cookies from firefox')
if not sqlite3:
logger.warning('Cannot extract cookies from firefox without sqlite3 support. '
@@ -131,11 +139,36 @@ def _extract_firefox_cookies(profile, logger):
raise FileNotFoundError(f'could not find firefox cookies database in {search_root}')
logger.debug(f'Extracting cookies from: "{cookie_database_path}"')
container_id = None
if container not in (None, 'none'):
containers_path = os.path.join(os.path.dirname(cookie_database_path), 'containers.json')
if not os.path.isfile(containers_path) or not os.access(containers_path, os.R_OK):
raise FileNotFoundError(f'could not read containers.json in {search_root}')
with open(containers_path) as containers:
identities = json.load(containers).get('identities', [])
container_id = next((context.get('userContextId') for context in identities if container in (
context.get('name'),
try_call(lambda: re.fullmatch(r'userContext([^\.]+)\.label', context['l10nID']).group())
)), None)
if not isinstance(container_id, int):
raise ValueError(f'could not find firefox container "{container}" in containers.json')
with tempfile.TemporaryDirectory(prefix='yt_dlp') as tmpdir:
cursor = None
try:
cursor = _open_database_copy(cookie_database_path, tmpdir)
cursor.execute('SELECT host, name, value, path, expiry, isSecure FROM moz_cookies')
if isinstance(container_id, int):
logger.debug(
f'Only loading cookies from firefox container "{container}", ID {container_id}')
cursor.execute(
'SELECT host, name, value, path, expiry, isSecure FROM moz_cookies WHERE originAttributes LIKE ? OR originAttributes LIKE ?',
(f'%userContextId={container_id}', f'%userContextId={container_id}&%'))
elif container == 'none':
logger.debug('Only loading cookies not belonging to any container')
cursor.execute(
'SELECT host, name, value, path, expiry, isSecure FROM moz_cookies WHERE NOT INSTR(originAttributes,"userContextId=")')
else:
cursor.execute('SELECT host, name, value, path, expiry, isSecure FROM moz_cookies')
jar = YoutubeDLCookieJar()
with _create_progress_bar(logger) as progress_bar:
table = cursor.fetchall()
@@ -948,11 +981,11 @@ def _is_path(value):
return os.path.sep in value
def _parse_browser_specification(browser_name, profile=None, keyring=None):
def _parse_browser_specification(browser_name, profile=None, keyring=None, container=None):
if browser_name not in SUPPORTED_BROWSERS:
raise ValueError(f'unsupported browser: "{browser_name}"')
if keyring not in (None, *SUPPORTED_KEYRINGS):
raise ValueError(f'unsupported keyring: "{keyring}"')
if profile is not None and _is_path(profile):
profile = os.path.expanduser(profile)
return browser_name, profile, keyring
return browser_name, profile, keyring, container

View File

@@ -92,6 +92,7 @@ class FileDownloader:
for func in (
'deprecation_warning',
'deprecated_feature',
'report_error',
'report_file_already_downloaded',
'report_warning',

View File

@@ -515,16 +515,14 @@ _BY_NAME = {
if name.endswith('FD') and name not in ('ExternalFD', 'FragmentFD')
}
_BY_EXE = {klass.EXE_NAME: klass for klass in _BY_NAME.values()}
def list_external_downloaders():
return sorted(_BY_NAME.keys())
def get_external_downloader(external_downloader):
""" Given the name of the executable, see whether we support the given
downloader . """
# Drop .exe extension on Windows
""" Given the name of the executable, see whether we support the given downloader """
bn = os.path.splitext(os.path.basename(external_downloader))[0]
return _BY_NAME.get(bn, _BY_EXE.get(bn))
return _BY_NAME.get(bn) or next((
klass for klass in _BY_NAME.values() if klass.EXE_NAME in bn
), None)

View File

@@ -65,8 +65,8 @@ class FragmentFD(FileDownloader):
"""
def report_retry_fragment(self, err, frag_index, count, retries):
self.deprecation_warning(
'yt_dlp.downloader.FragmentFD.report_retry_fragment is deprecated. Use yt_dlp.downloader.FileDownloader.report_retry instead')
self.deprecation_warning('yt_dlp.downloader.FragmentFD.report_retry_fragment is deprecated. '
'Use yt_dlp.downloader.FileDownloader.report_retry instead')
return self.report_retry(err, count, retries, frag_index)
def report_skip_fragment(self, frag_index, err=None):

View File

@@ -1,5 +1,28 @@
# flake8: noqa: F401
from .youtube import ( # Youtube is moved to the top to improve performance
YoutubeIE,
YoutubeClipIE,
YoutubeFavouritesIE,
YoutubeNotificationsIE,
YoutubeHistoryIE,
YoutubeTabIE,
YoutubeLivestreamEmbedIE,
YoutubePlaylistIE,
YoutubeRecommendedIE,
YoutubeSearchDateIE,
YoutubeSearchIE,
YoutubeSearchURLIE,
YoutubeMusicSearchURLIE,
YoutubeSubscriptionsIE,
YoutubeStoriesIE,
YoutubeTruncatedIDIE,
YoutubeTruncatedURLIE,
YoutubeYtBeIE,
YoutubeYtUserIE,
YoutubeWatchLaterIE,
)
from .abc import (
ABCIE,
ABCIViewIE,
@@ -470,6 +493,7 @@ from .epicon import (
EpiconIE,
EpiconSeriesIE,
)
from .epoch import EpochIE
from .eporner import EpornerIE
from .eroprofile import (
EroProfileIE,
@@ -491,6 +515,7 @@ from .espn import (
from .esri import EsriVideoIE
from .europa import EuropaIE
from .europeantour import EuropeanTourIE
from .eurosport import EurosportIE
from .euscreen import EUScreenIE
from .expotv import ExpoTVIE
from .expressen import ExpressenIE
@@ -720,6 +745,10 @@ from .iqiyi import (
IqIE,
IqAlbumIE
)
from .islamchannel import (
IslamChannelIE,
IslamChannelSeriesIE,
)
from .itprotv import (
ITProTVIE,
ITProTVCourseIE
@@ -1079,6 +1108,7 @@ from .newgrounds import (
NewgroundsPlaylistIE,
NewgroundsUserIE,
)
from .newspicks import NewsPicksIE
from .newstube import NewstubeIE
from .newsy import NewsyIE
from .nextmedia import (
@@ -1728,6 +1758,12 @@ from .telequebec import (
from .teletask import TeleTaskIE
from .telewebion import TelewebionIE
from .tempo import TempoIE
from .tencent import (
VQQSeriesIE,
VQQVideoIE,
WeTvEpisodeIE,
WeTvSeriesIE,
)
from .tennistv import TennisTVIE
from .tenplay import TenPlayIE
from .testurl import TestURLIE
@@ -1787,6 +1823,10 @@ from .toongoggles import ToonGogglesIE
from .toutv import TouTvIE
from .toypics import ToypicsUserIE, ToypicsIE
from .traileraddict import TrailerAddictIE
from .triller import (
TrillerIE,
TrillerUserIE,
)
from .trilulilu import TriluliluIE
from .trovo import (
TrovoIE,
@@ -2092,7 +2132,6 @@ from .weibo import (
WeiboMobileIE
)
from .weiqitv import WeiqiTVIE
from .wetv import WeTvEpisodeIE, WeTvSeriesIE
from .wikimedia import WikimediaIE
from .willow import WillowIE
from .wimtv import WimTVIE
@@ -2175,42 +2214,44 @@ from .younow import (
from .youporn import YouPornIE
from .yourporn import YourPornIE
from .yourupload import YourUploadIE
from .youtube import (
YoutubeIE,
YoutubeClipIE,
YoutubeFavouritesIE,
YoutubeNotificationsIE,
YoutubeHistoryIE,
YoutubeTabIE,
YoutubeLivestreamEmbedIE,
YoutubePlaylistIE,
YoutubeRecommendedIE,
YoutubeSearchDateIE,
YoutubeSearchIE,
YoutubeSearchURLIE,
YoutubeMusicSearchURLIE,
YoutubeSubscriptionsIE,
YoutubeStoriesIE,
YoutubeTruncatedIDIE,
YoutubeTruncatedURLIE,
YoutubeYtBeIE,
YoutubeYtUserIE,
YoutubeWatchLaterIE,
)
from .zapiks import ZapiksIE
from .zattoo import (
BBVTVIE,
BBVTVLiveIE,
BBVTVRecordingsIE,
EinsUndEinsTVIE,
EinsUndEinsTVLiveIE,
EinsUndEinsTVRecordingsIE,
EWETVIE,
EWETVLiveIE,
EWETVRecordingsIE,
GlattvisionTVIE,
GlattvisionTVLiveIE,
GlattvisionTVRecordingsIE,
MNetTVIE,
NetPlusIE,
MNetTVLiveIE,
MNetTVRecordingsIE,
NetPlusTVIE,
NetPlusTVLiveIE,
NetPlusTVRecordingsIE,
OsnatelTVIE,
OsnatelTVLiveIE,
OsnatelTVRecordingsIE,
QuantumTVIE,
QuantumTVLiveIE,
QuantumTVRecordingsIE,
SaltTVIE,
SaltTVLiveIE,
SaltTVRecordingsIE,
SAKTVIE,
SAKTVLiveIE,
SAKTVRecordingsIE,
VTXTVIE,
VTXTVLiveIE,
VTXTVRecordingsIE,
WalyTVIE,
WalyTVLiveIE,
WalyTVRecordingsIE,
ZattooIE,
ZattooLiveIE,
ZattooMoviesIE,

View File

@@ -95,24 +95,24 @@ class ArteTVIE(ArteTVBaseIE):
# all obtained by exhaustive testing
_COUNTRIES_MAP = {
'DE_FR': {
'DE_FR': (
'BL', 'DE', 'FR', 'GF', 'GP', 'MF', 'MQ', 'NC',
'PF', 'PM', 'RE', 'WF', 'YT',
},
),
# with both of the below 'BE' sometimes works, sometimes doesn't
'EUR_DE_FR': {
'EUR_DE_FR': (
'AT', 'BL', 'CH', 'DE', 'FR', 'GF', 'GP', 'LI',
'MC', 'MF', 'MQ', 'NC', 'PF', 'PM', 'RE', 'WF',
'YT',
},
'SAT': {
),
'SAT': (
'AD', 'AT', 'AX', 'BG', 'BL', 'CH', 'CY', 'CZ',
'DE', 'DK', 'EE', 'ES', 'FI', 'FR', 'GB', 'GF',
'GR', 'HR', 'HU', 'IE', 'IS', 'IT', 'KN', 'LI',
'LT', 'LU', 'LV', 'MC', 'MF', 'MQ', 'MT', 'NC',
'NL', 'NO', 'PF', 'PL', 'PM', 'PT', 'RE', 'RO',
'SE', 'SI', 'SK', 'SM', 'VA', 'WF', 'YT',
},
),
}
def _real_extract(self, url):

View File

@@ -218,6 +218,9 @@ class BiliBiliIE(InfoExtractor):
durl = traverse_obj(video_info, ('dash', 'video'))
audios = traverse_obj(video_info, ('dash', 'audio')) or []
flac_audio = traverse_obj(video_info, ('dash', 'flac', 'audio'))
if flac_audio:
audios.append(flac_audio)
entries = []
RENDITIONS = ('qn=80&quality=80&type=', 'quality=2&type=mp4')
@@ -620,14 +623,15 @@ class BiliBiliSearchIE(SearchInfoExtractor):
'keyword': query,
'page': page_num,
'context': '',
'order': 'pubdate',
'duration': 0,
'tids_2': '',
'__refresh__': 'true',
'search_type': 'video',
'tids': 0,
'highlight': 1,
})['data'].get('result') or []
})['data'].get('result')
if not videos:
break
for video in videos:
yield self.url_result(video['arcurl'], 'BiliBili', str(video['aid']))

View File

@@ -65,10 +65,12 @@ class BitChuteIE(InfoExtractor):
error = self._html_search_regex(r'<h1 class="page-title">([^<]+)</h1>', webpage, 'error', default='Cannot find video')
if error == 'Video Unavailable':
raise GeoRestrictedError(error)
raise ExtractorError(error)
raise ExtractorError(error, expected=True)
formats = entries[0]['formats']
self._check_formats(formats, video_id)
if not formats:
raise self.raise_no_formats('Video is unavailable', expected=True, video_id=video_id)
self._sort_formats(formats)
description = self._html_search_regex(

View File

@@ -480,6 +480,9 @@ class InfoExtractor:
will be used by geo restriction bypass mechanism similarly
to _GEO_COUNTRIES.
The _ENABLED attribute should be set to False for IEs that
are disabled by default and must be explicitly enabled.
The _WORKING attribute should be set to False for broken IEs
in order to warn the users and skip the tests.
"""
@@ -491,6 +494,7 @@ class InfoExtractor:
_GEO_COUNTRIES = None
_GEO_IP_BLOCKS = None
_WORKING = True
_ENABLED = True
_NETRC_MACHINE = None
IE_DESC = None
SEARCH_KEY = None
@@ -1689,7 +1693,7 @@ class InfoExtractor:
'order_free': ('webm', 'mp4', 'flv', '', 'none')},
'aext': {'type': 'ordered', 'field': 'audio_ext',
'order': ('m4a', 'aac', 'mp3', 'ogg', 'opus', 'webm', '', 'none'),
'order_free': ('opus', 'ogg', 'webm', 'm4a', 'mp3', 'aac', '', 'none')},
'order_free': ('ogg', 'opus', 'webm', 'mp3', 'm4a', 'aac', '', 'none')},
'hidden': {'visible': False, 'forced': True, 'type': 'extractor', 'max': -1000},
'aud_or_vid': {'visible': False, 'forced': True, 'type': 'multiple',
'field': ('vcodec', 'acodec'),
@@ -1762,9 +1766,8 @@ class InfoExtractor:
if field not in self.settings:
if key in ('forced', 'priority'):
return False
self.ydl.deprecation_warning(
f'Using arbitrary fields ({field}) for format sorting is deprecated '
'and may be removed in a future version')
self.ydl.deprecated_feature(f'Using arbitrary fields ({field}) for format sorting is '
'deprecated and may be removed in a future version')
self.settings[field] = {}
propObj = self.settings[field]
if key not in propObj:
@@ -1849,9 +1852,8 @@ class InfoExtractor:
if self._get_field_setting(field, 'type') == 'alias':
alias, field = field, self._get_field_setting(field, 'field')
if self._get_field_setting(alias, 'deprecated'):
self.ydl.deprecation_warning(
f'Format sorting alias {alias} is deprecated '
f'and may be removed in a future version. Please use {field} instead')
self.ydl.deprecated_feature(f'Format sorting alias {alias} is deprecated and may '
'be removed in a future version. Please use {field} instead')
reverse = match.group('reverse') is not None
closest = match.group('separator') == '~'
limit_text = match.group('limit')
@@ -3258,7 +3260,7 @@ class InfoExtractor:
'subtitles': {},
}
media_attributes = extract_attributes(media_tag)
src = strip_or_none(media_attributes.get('src'))
src = strip_or_none(dict_get(media_attributes, ('src', 'data-video-src', 'data-src', 'data-source')))
if src:
f = parse_content_type(media_attributes.get('type'))
_, formats = _media_formats(src, media_type, f)
@@ -3269,7 +3271,7 @@ class InfoExtractor:
s_attr = extract_attributes(source_tag)
# data-video-src and data-src are non standard but seen
# several times in the wild
src = strip_or_none(dict_get(s_attr, ('src', 'data-video-src', 'data-src')))
src = strip_or_none(dict_get(s_attr, ('src', 'data-video-src', 'data-src', 'data-source')))
if not src:
continue
f = parse_content_type(s_attr.get('type'))
@@ -3872,7 +3874,7 @@ class InfoExtractor:
def _extract_from_webpage(cls, url, webpage):
for embed_url in orderedSet(
cls._extract_embed_urls(url, webpage) or [], lazy=True):
yield cls.url_result(embed_url, cls)
yield cls.url_result(embed_url, None if cls._VALID_URL is False else cls)
@classmethod
def _extract_embed_urls(cls, url, webpage):
@@ -3941,3 +3943,12 @@ class SearchInfoExtractor(InfoExtractor):
@classproperty
def SEARCH_KEY(cls):
return cls._SEARCH_KEY
class UnsupportedURLIE(InfoExtractor):
_VALID_URL = '.*'
_ENABLED = False
IE_DESC = False
def _real_extract(self, url):
raise UnsupportedError(url)

View File

@@ -720,15 +720,20 @@ class CrunchyrollBetaBaseIE(CrunchyrollBaseIE):
def _get_params(self, lang):
if not CrunchyrollBetaBaseIE.params:
if self._get_cookies(f'https://beta.crunchyroll.com/{lang}').get('etp_rt'):
grant_type, key = 'etp_rt_cookie', 'accountAuthClientId'
else:
grant_type, key = 'client_id', 'anonClientId'
initial_state, app_config = self._get_beta_embedded_json(self._download_webpage(
f'https://beta.crunchyroll.com/{lang}', None, note='Retrieving main page'), None)
api_domain = app_config['cxApiParams']['apiDomain']
basic_token = str(base64.b64encode(('%s:' % app_config['cxApiParams']['accountAuthClientId']).encode('ascii')), 'ascii')
auth_response = self._download_json(
f'{api_domain}/auth/v1/token', None, note='Authenticating with cookie',
f'{api_domain}/auth/v1/token', None, note=f'Authenticating with grant_type={grant_type}',
headers={
'Authorization': 'Basic ' + basic_token
}, data='grant_type=etp_rt_cookie'.encode('ascii'))
'Authorization': 'Basic ' + str(base64.b64encode(('%s:' % app_config['cxApiParams'][key]).encode('ascii')), 'ascii')
}, data=f'grant_type={grant_type}'.encode('ascii'))
policy_response = self._download_json(
f'{api_domain}/index/v2', None, note='Retrieving signed policy',
headers={
@@ -747,21 +752,6 @@ class CrunchyrollBetaBaseIE(CrunchyrollBaseIE):
CrunchyrollBetaBaseIE.params = (api_domain, bucket, params)
return CrunchyrollBetaBaseIE.params
def _redirect_from_beta(self, url, lang, internal_id, display_id, is_episode, iekey):
initial_state, app_config = self._get_beta_embedded_json(self._download_webpage(url, display_id), display_id)
content_data = initial_state['content']['byId'][internal_id]
if is_episode:
video_id = content_data['external_id'].split('.')[1]
series_id = content_data['episode_metadata']['series_slug_title']
else:
series_id = content_data['slug_title']
series_id = re.sub(r'-{2,}', '-', series_id)
url = f'https://www.crunchyroll.com/{lang}{series_id}'
if is_episode:
url = url + f'/{display_id}-{video_id}'
self.to_screen(f'{display_id}: Not logged in. Redirecting to non-beta site - {url}')
return self.url_result(url, iekey, display_id)
class CrunchyrollBetaIE(CrunchyrollBetaBaseIE):
IE_NAME = 'crunchyroll:beta'
@@ -800,10 +790,6 @@ class CrunchyrollBetaIE(CrunchyrollBetaBaseIE):
def _real_extract(self, url):
lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id')
if not self._get_cookies(url).get('etp_rt'):
return self._redirect_from_beta(url, lang, internal_id, display_id, True, CrunchyrollIE.ie_key())
api_domain, bucket, params = self._get_params(lang)
episode_response = self._download_json(
@@ -897,10 +883,6 @@ class CrunchyrollBetaShowIE(CrunchyrollBetaBaseIE):
def _real_extract(self, url):
lang, internal_id, display_id = self._match_valid_url(url).group('lang', 'id', 'display_id')
if not self._get_cookies(url).get('etp_rt'):
return self._redirect_from_beta(url, lang, internal_id, display_id, False, CrunchyrollShowPlaylistIE.ie_key())
api_domain, bucket, params = self._get_params(lang)
series_response = self._download_json(

46
yt_dlp/extractor/epoch.py Normal file
View File

@@ -0,0 +1,46 @@
from .common import InfoExtractor
class EpochIE(InfoExtractor):
_VALID_URL = r'https?://www.theepochtimes\.com/[\w-]+_(?P<id>\d+).html'
_TESTS = [
{
'url': 'https://www.theepochtimes.com/they-can-do-audio-video-physical-surveillance-on-you-24h-365d-a-year-rex-lee-on-intrusive-apps_4661688.html',
'info_dict': {
'id': 'a3dd732c-4750-4bc8-8156-69180668bda1',
'ext': 'mp4',
'title': 'They Can Do Audio, Video, Physical Surveillance on You 24H/365D a Year: Rex Lee on Intrusive Apps',
}
},
{
'url': 'https://www.theepochtimes.com/the-communist-partys-cyberattacks-on-america-explained-rex-lee-talks-tech-hybrid-warfare_4342413.html',
'info_dict': {
'id': '276c7f46-3bbf-475d-9934-b9bbe827cf0a',
'ext': 'mp4',
'title': 'The Communist Partys Cyberattacks on America Explained; Rex Lee Talks Tech Hybrid Warfare',
}
},
{
'url': 'https://www.theepochtimes.com/kash-patel-a-6-year-saga-of-government-corruption-from-russiagate-to-mar-a-lago_4690250.html',
'info_dict': {
'id': 'aa9ceecd-a127-453d-a2de-7153d6fd69b6',
'ext': 'mp4',
'title': 'Kash Patel: A 6-Year-Saga of Government Corruption, From Russiagate to Mar-a-Lago',
}
},
]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
youmaker_video_id = self._search_regex(r'data-trailer="[\w-]+" data-id="([\w-]+)"', webpage, 'url')
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
f'http://vs1.youmaker.com/assets/{youmaker_video_id}/playlist.m3u8', video_id, 'mp4', m3u8_id='hls')
return {
'id': youmaker_video_id,
'formats': formats,
'subtitles': subtitles,
'title': self._html_extract_title(webpage)
}

View File

@@ -0,0 +1,99 @@
from .common import InfoExtractor
from ..utils import traverse_obj
class EurosportIE(InfoExtractor):
_VALID_URL = r'https?://www\.eurosport\.com/\w+/[\w-]+/\d+/[\w-]+_(?P<id>vid\d+)'
_TESTS = [{
'url': 'https://www.eurosport.com/tennis/roland-garros/2022/highlights-rafael-nadal-brushes-aside-caper-ruud-to-win-record-extending-14th-french-open-title_vid1694147/video.shtml',
'info_dict': {
'id': '2480939',
'ext': 'mp4',
'title': 'Highlights: Rafael Nadal brushes aside Caper Ruud to win record-extending 14th French Open title',
'description': 'md5:b564db73ecfe4b14ebbd8e62a3692c76',
'thumbnail': 'https://imgresizer.eurosport.com/unsafe/1280x960/smart/filters:format(jpeg)/origin-imgresizer.eurosport.com/2022/06/05/3388285-69245968-2560-1440.png',
'duration': 195.0,
'display_id': 'vid1694147',
'timestamp': 1654446698,
'upload_date': '20220605',
}
}, {
'url': 'https://www.eurosport.com/tennis/roland-garros/2022/watch-the-top-five-shots-from-men-s-final-as-rafael-nadal-beats-casper-ruud-to-seal-14th-french-open_vid1694283/video.shtml',
'info_dict': {
'id': '2481254',
'ext': 'mp4',
'title': 'md5:149dcc5dfb38ab7352acc008cc9fb071',
'duration': 130.0,
'thumbnail': 'https://imgresizer.eurosport.com/unsafe/1280x960/smart/filters:format(jpeg)/origin-imgresizer.eurosport.com/2022/06/05/3388422-69248708-2560-1440.png',
'description': 'md5:a0c8a7f6b285e48ae8ddbe7aa85cfee6',
'display_id': 'vid1694283',
'timestamp': 1654456090,
'upload_date': '20220605',
}
}, {
# geo-fence but can bypassed by xff
'url': 'https://www.eurosport.com/cycling/tour-de-france-femmes/2022/incredible-ride-marlen-reusser-storms-to-stage-4-win-at-tour-de-france-femmes_vid1722221/video.shtml',
'info_dict': {
'id': '2582552',
'ext': 'mp4',
'title': 'Incredible ride! - Marlen Reusser storms to Stage 4 win at Tour de France Femmes',
'duration': 188.0,
'display_id': 'vid1722221',
'timestamp': 1658936167,
'thumbnail': 'https://imgresizer.eurosport.com/unsafe/1280x960/smart/filters:format(jpeg)/origin-imgresizer.eurosport.com/2022/07/27/3423347-69852108-2560-1440.jpg',
'description': 'md5:32bbe3a773ac132c57fb1e8cca4b7c71',
'upload_date': '20220727',
}
}]
_TOKEN = None
# actually defined in https://netsport.eurosport.io/?variables={"databaseId":<databaseId>,"playoutType":"VDP"}&extensions={"persistedQuery":{"version":1 ..
# but this method require to get sha256 hash
_GEO_COUNTRIES = ['DE', 'NL', 'EU', 'IT', 'FR'] # Not complete list but it should work
def _real_initialize(self):
if EurosportIE._TOKEN is None:
EurosportIE._TOKEN = self._download_json(
'https://eu3-prod-direct.eurosport.com/token?realm=eurosport', None,
'Trying to get token')['data']['attributes']['token']
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
json_data = self._download_json(
f'https://eu3-prod-direct.eurosport.com/playback/v2/videoPlaybackInfo/sourceSystemId/eurosport-{display_id}',
display_id, query={'usePreAuth': True}, headers={'Authorization': f'Bearer {EurosportIE._TOKEN}'})['data']
json_ld_data = self._search_json_ld(webpage, display_id)
formats, subtitles = [], {}
for stream_type in json_data['attributes']['streaming']:
if stream_type == 'hls':
fmts, subs = self._extract_m3u8_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id, ext='mp4')
elif stream_type == 'dash':
fmts, subs = self._extract_mpd_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id)
elif stream_type == 'mss':
fmts, subs = self._extract_ism_formats_and_subtitles(
traverse_obj(json_data, ('attributes', 'streaming', stream_type, 'url')), display_id)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
self._sort_formats(formats)
return {
'id': json_data['id'],
'title': json_ld_data.get('title') or self._og_search_title(webpage),
'display_id': display_id,
'formats': formats,
'subtitles': subtitles,
'thumbnails': json_ld_data.get('thumbnails'),
'description': (json_ld_data.get('description')
or self._html_search_meta(['og:description', 'description'], webpage)),
'duration': json_ld_data.get('duration'),
'timestamp': json_ld_data.get('timestamp'),
}

View File

@@ -3,7 +3,6 @@ import re
import urllib.parse
import xml.etree.ElementTree
from . import gen_extractor_classes
from .common import InfoExtractor # isort: split
from .brightcove import BrightcoveLegacyIE, BrightcoveNewIE
from .commonprotocols import RtmpIE
@@ -26,6 +25,7 @@ from ..utils import (
parse_resolution,
smuggle_url,
str_or_none,
traverse_obj,
try_call,
unescapeHTML,
unified_timestamp,
@@ -2805,7 +2805,7 @@ class GenericIE(InfoExtractor):
self._downloader.write_debug('Looking for embeds')
embeds = []
for ie in gen_extractor_classes():
for ie in self._downloader._ies.values():
gen = ie.extract_from_webpage(self._downloader, url, webpage)
current_embeds = []
try:
@@ -2840,8 +2840,9 @@ class GenericIE(InfoExtractor):
try:
info = self._parse_jwplayer_data(
jwplayer_data, video_id, require_title=False, base_url=url)
self.report_detected('JW Player data')
return merge_dicts(info, info_dict)
if traverse_obj(info, 'formats', ('entries', ..., 'formats')):
self.report_detected('JW Player data')
return merge_dicts(info, info_dict)
except ExtractorError:
# See https://github.com/ytdl-org/youtube-dl/pull/16735
pass

View File

@@ -6,7 +6,6 @@ from ..compat import compat_urlparse, compat_b64decode
from ..utils import (
ExtractorError,
int_or_none,
js_to_json,
str_or_none,
try_get,
unescapeHTML,
@@ -55,11 +54,7 @@ class HuyaLiveIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id=video_id)
json_stream = self._search_regex(r'"stream":\s+"([a-zA-Z0-9+=/]+)"', webpage, 'stream', default=None)
if not json_stream:
raise ExtractorError('Video is offline', expected=True)
stream_data = self._parse_json(compat_b64decode(json_stream).decode(), video_id=video_id,
transform_source=js_to_json)
stream_data = self._search_json(r'stream:\s+', webpage, 'stream', video_id=video_id, default=None)
room_info = try_get(stream_data, lambda x: x['data'][0]['gameLiveInfo'])
if not room_info:
raise ExtractorError('Can not extract the room info', expected=True)
@@ -67,6 +62,8 @@ class HuyaLiveIE(InfoExtractor):
screen_type = room_info.get('screenType')
live_source_type = room_info.get('liveSourceType')
stream_info_list = stream_data['data'][0]['gameStreamInfoList']
if not stream_info_list:
raise ExtractorError('Video is offline', expected=True)
formats = []
for stream_info in stream_info_list:
stream_url = stream_info.get('sFlvUrl')

View File

@@ -39,37 +39,42 @@ class InstagramBaseIE(InfoExtractor):
_NETRC_MACHINE = 'instagram'
_IS_LOGGED_IN = False
_API_BASE_URL = 'https://i.instagram.com/api/v1'
_LOGIN_URL = 'https://www.instagram.com/accounts/login'
_API_HEADERS = {
'X-IG-App-ID': '936619743392459',
'X-ASBD-ID': '198387',
'X-IG-WWW-Claim': '0',
'Origin': 'https://www.instagram.com',
'Accept': '*/*',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36',
}
def _perform_login(self, username, password):
if self._IS_LOGGED_IN:
return
login_webpage = self._download_webpage(
'https://www.instagram.com/accounts/login/', None,
note='Downloading login webpage', errnote='Failed to download login webpage')
self._LOGIN_URL, None, note='Downloading login webpage', errnote='Failed to download login webpage')
shared_data = self._parse_json(
self._search_regex(
r'window\._sharedData\s*=\s*({.+?});',
login_webpage, 'shared data', default='{}'),
None)
shared_data = self._parse_json(self._search_regex(
r'window\._sharedData\s*=\s*({.+?});', login_webpage, 'shared data', default='{}'), None)
login = self._download_json('https://www.instagram.com/accounts/login/ajax/', None, note='Logging in', headers={
'Accept': '*/*',
'X-IG-App-ID': '936619743392459',
'X-ASBD-ID': '198387',
'X-IG-WWW-Claim': '0',
'X-Requested-With': 'XMLHttpRequest',
'X-CSRFToken': shared_data['config']['csrf_token'],
'X-Instagram-AJAX': shared_data['rollout_hash'],
'Referer': 'https://www.instagram.com/',
}, data=urlencode_postdata({
'enc_password': f'#PWD_INSTAGRAM_BROWSER:0:{int(time.time())}:{password}',
'username': username,
'queryParams': '{}',
'optIntoOneTap': 'false',
'stopDeletionNonce': '',
'trustedDeviceRecords': '{}',
}))
login = self._download_json(
f'{self._LOGIN_URL}/ajax/', None, note='Logging in', headers={
**self._API_HEADERS,
'X-Requested-With': 'XMLHttpRequest',
'X-CSRFToken': shared_data['config']['csrf_token'],
'X-Instagram-AJAX': shared_data['rollout_hash'],
'Referer': 'https://www.instagram.com/',
}, data=urlencode_postdata({
'enc_password': f'#PWD_INSTAGRAM_BROWSER:0:{int(time.time())}:{password}',
'username': username,
'queryParams': '{}',
'optIntoOneTap': 'false',
'stopDeletionNonce': '',
'trustedDeviceRecords': '{}',
}))
if not login.get('authenticated'):
if login.get('message'):
@@ -134,7 +139,7 @@ class InstagramBaseIE(InfoExtractor):
}
def _extract_product_media(self, product_media):
media_id = product_media.get('code') or product_media.get('id')
media_id = product_media.get('code') or _pk_to_id(product_media.get('pk'))
vcodec = product_media.get('video_codec')
dash_manifest_raw = product_media.get('video_dash_manifest')
videos_list = product_media.get('video_versions')
@@ -179,7 +184,7 @@ class InstagramBaseIE(InfoExtractor):
user_info = product_info.get('user') or {}
info_dict = {
'id': product_info.get('code') or product_info.get('id'),
'id': product_info.get('code') or _pk_to_id(product_info.get('pk')),
'title': product_info.get('title') or f'Video by {user_info.get("username")}',
'description': traverse_obj(product_info, ('caption', 'text'), expected_type=str_or_none),
'timestamp': int_or_none(product_info.get('taken_at')),
@@ -360,49 +365,74 @@ class InstagramIE(InstagramBaseIE):
def _real_extract(self, url):
video_id, url = self._match_valid_url(url).group('id', 'url')
general_info = self._download_json(
f'https://www.instagram.com/graphql/query/?query_hash=9f8827793ef34641b2fb195d4d41151c'
f'&variables=%7B"shortcode":"{video_id}",'
'"parent_comment_count":10,"has_threaded_comments":true}', video_id, fatal=False, errnote=False,
headers={
'Accept': '*',
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
'Authority': 'www.instagram.com',
'Referer': 'https://www.instagram.com',
'x-ig-app-id': '936619743392459',
})
media = traverse_obj(general_info, ('data', 'shortcode_media')) or {}
media, webpage = {}, ''
api_check = self._download_json(
f'{self._API_BASE_URL}/web/get_ruling_for_content/?content_type=MEDIA&target_id={_id_to_pk(video_id)}',
video_id, headers=self._API_HEADERS, fatal=False, note='Setting up session', errnote=False) or {}
csrf_token = self._get_cookies('https://www.instagram.com').get('csrftoken')
if not csrf_token:
self.report_warning('No csrf token set by Instagram API', video_id)
elif api_check.get('status') != 'ok':
self.report_warning('Instagram API is not granting access', video_id)
else:
if self._get_cookies(url).get('sessionid'):
media.update(traverse_obj(self._download_json(
f'{self._API_BASE_URL}/media/{_id_to_pk(video_id)}/info/', video_id,
fatal=False, note='Downloading video info', headers={
**self._API_HEADERS,
'X-CSRFToken': csrf_token.value,
}), ('items', 0)) or {})
if media:
return self._extract_product(media)
variables = {
'shortcode': video_id,
'child_comment_count': 3,
'fetch_comment_count': 40,
'parent_comment_count': 24,
'has_threaded_comments': True,
}
general_info = self._download_json(
'https://www.instagram.com/graphql/query/', video_id, fatal=False,
headers={
**self._API_HEADERS,
'X-CSRFToken': csrf_token.value,
'X-Requested-With': 'XMLHttpRequest',
'Referer': url,
}, query={
'query_hash': '9f8827793ef34641b2fb195d4d41151c',
'variables': json.dumps(variables, separators=(',', ':')),
})
media.update(traverse_obj(general_info, ('data', 'shortcode_media')) or {})
if not media:
self.report_warning('General metadata extraction failed', video_id)
self.report_warning('General metadata extraction failed (some metadata might be missing).', video_id)
webpage, urlh = self._download_webpage_handle(url, video_id)
shared_data = self._search_json(
r'window\._sharedData\s*=', webpage, 'shared data', video_id, fatal=False) or {}
info = self._download_json(
f'https://i.instagram.com/api/v1/media/{_id_to_pk(video_id)}/info/', video_id,
fatal=False, note='Downloading video info', errnote=False, headers={
'Accept': '*',
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
'Authority': 'www.instagram.com',
'Referer': 'https://www.instagram.com',
'x-ig-app-id': '936619743392459',
})
if info:
media.update(info['items'][0])
return self._extract_product(media)
if shared_data and self._LOGIN_URL not in urlh.geturl():
media.update(traverse_obj(
shared_data, ('entry_data', 'PostPage', 0, 'graphql', 'shortcode_media'),
('entry_data', 'PostPage', 0, 'media'), expected_type=dict) or {})
else:
self.report_warning('Main webpage is locked behind the login page. Retrying with embed webpage')
webpage = self._download_webpage(
f'{url}/embed/', video_id, note='Downloading embed webpage', fatal=False)
additional_data = self._search_json(
r'window\.__additionalDataLoaded\s*\(\s*[^,]+,\s*', webpage, 'additional data', video_id, fatal=False)
if not additional_data:
self.raise_login_required('Requested content is not available, rate-limit reached or login required')
webpage = self._download_webpage(
f'https://www.instagram.com/p/{video_id}/embed/', video_id,
note='Downloading embed webpage', fatal=False)
if not webpage:
self.raise_login_required('Requested content was not found, the content might be private')
product_item = traverse_obj(additional_data, ('items', 0), expected_type=dict)
if product_item:
media.update(product_item)
return self._extract_product(media)
additional_data = self._search_json(
r'window\.__additionalDataLoaded\s*\(\s*[^,]+,\s*', webpage, 'additional data', video_id, fatal=False)
product_item = traverse_obj(additional_data, ('items', 0), expected_type=dict)
if product_item:
media.update(product_item)
return self._extract_product(media)
media.update(traverse_obj(
additional_data, ('graphql', 'shortcode_media'), 'shortcode_media', expected_type=dict) or {})
media.update(traverse_obj(
additional_data, ('graphql', 'shortcode_media'), 'shortcode_media', expected_type=dict) or {})
username = traverse_obj(media, ('owner', 'username')) or self._search_regex(
r'"owner"\s*:\s*{\s*"username"\s*:\s*"(.+?)"', webpage, 'username', fatal=False)
@@ -649,12 +679,8 @@ class InstagramStoryIE(InstagramBaseIE):
story_info_url = user_id if username != 'highlights' else f'highlight:{story_id}'
videos = traverse_obj(self._download_json(
f'https://i.instagram.com/api/v1/feed/reels_media/?reel_ids={story_info_url}',
story_id, errnote=False, fatal=False, headers={
'X-IG-App-ID': 936619743392459,
'X-ASBD-ID': 198387,
'X-IG-WWW-Claim': 0,
}), 'reels')
f'{self._API_BASE_URL}/feed/reels_media/?reel_ids={story_info_url}',
story_id, errnote=False, fatal=False, headers=self._API_HEADERS), 'reels')
if not videos:
self.raise_login_required('You need to log in to access this content')

View File

@@ -0,0 +1,82 @@
import re
from .common import InfoExtractor
from ..utils import traverse_obj, urljoin
class IslamChannelIE(InfoExtractor):
_VALID_URL = r'https?://watch\.islamchannel\.tv/watch/(?P<id>\d+)'
_TESTS = [{
'url': 'https://watch.islamchannel.tv/watch/38604310',
'info_dict': {
'id': '38604310',
'title': 'Omar - Young Omar',
'description': 'md5:5cc7ddecef064ea7afe52eb5e0e33b55',
'thumbnail': r're:https?://.+',
'ext': 'mp4',
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
thumbnail = self._search_regex(
r'data-poster="([^"]+)"', webpage, 'data poster', fatal=False) or \
self._html_search_meta(('og:image', 'twitter:image'), webpage)
headers = {
'Token': self._search_regex(r'data-token="([^"]+)"', webpage, 'data token'),
'Token-Expiry': self._search_regex(r'data-expiry="([^"]+)"', webpage, 'data expiry'),
'Uvid': video_id,
}
show_stream = self._download_json(
f'https://v2-streams-elb.simplestreamcdn.com/api/show/stream/{video_id}', video_id,
query={
'key': self._search_regex(r'data-key="([^"]+)"', webpage, 'data key'),
'platform': 'chrome',
}, headers=headers)
# TODO: show_stream['stream'] and show_stream['drm'] may contain something interesting
streams = self._download_json(
traverse_obj(show_stream, ('response', 'tokenization', 'url')), video_id,
headers=headers)
formats, subs = self._extract_m3u8_formats_and_subtitles(traverse_obj(streams, ('Streams', 'Adaptive')), video_id, 'mp4')
self._sort_formats(formats)
return {
'id': video_id,
'title': self._html_search_meta(('og:title', 'twitter:title'), webpage),
'description': self._html_search_meta(('og:description', 'twitter:description', 'description'), webpage),
'formats': formats,
'subtitles': subs,
'thumbnails': [{
'id': 'unscaled',
'url': thumbnail.split('?')[0],
'ext': 'jpg',
'preference': 2,
}, {
'id': 'orig',
'url': thumbnail,
'ext': 'jpg',
'preference': 1,
}] if thumbnail else None,
}
class IslamChannelSeriesIE(InfoExtractor):
_VALID_URL = r'https?://watch\.islamchannel\.tv/series/(?P<id>[a-f\d-]+)'
_TESTS = [{
'url': 'https://watch.islamchannel.tv/series/a6cccef3-3ef1-11eb-bc19-06b69c2357cd',
'info_dict': {
'id': 'a6cccef3-3ef1-11eb-bc19-06b69c2357cd',
},
'playlist_mincount': 31,
}]
def _real_extract(self, url):
pl_id = self._match_id(url)
webpage = self._download_webpage(url, pl_id)
return self.playlist_from_matches(
re.finditer(r'<a\s+href="(/watch/\d+)"[^>]+?data-video-type="show">', webpage),
pl_id, getter=lambda x: urljoin(url, x.group(1)), ie=IslamChannelIE)

View File

@@ -8,15 +8,33 @@ from ..utils import (
float_or_none,
int_or_none,
str_or_none,
try_get,
traverse_obj,
)
class MedalTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?medal\.tv/clips/(?P<id>[^/?#&]+)'
_VALID_URL = r'https?://(?:www\.)?medal\.tv/(?P<path>games/[^/?#&]+/clips)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://medal.tv/clips/2mA60jWAGQCBH',
'md5': '7b07b064331b1cf9e8e5c52a06ae68fa',
'url': 'https://medal.tv/games/valorant/clips/jTBFnLKdLy15K',
'md5': '6930f8972914b6b9fdc2bb3918098ba0',
'info_dict': {
'id': 'jTBFnLKdLy15K',
'ext': 'mp4',
'title': "Mornu's clutch",
'description': '',
'uploader': 'Aciel',
'timestamp': 1651628243,
'upload_date': '20220504',
'uploader_id': '19335460',
'uploader_url': 'https://medal.tv/users/19335460',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 13,
}
}, {
'url': 'https://medal.tv/games/cod%20cold%20war/clips/2mA60jWAGQCBH',
'md5': '3d19d426fe0b2d91c26e412684e66a06',
'info_dict': {
'id': '2mA60jWAGQCBH',
'ext': 'mp4',
@@ -26,9 +44,15 @@ class MedalTVIE(InfoExtractor):
'timestamp': 1603165266,
'upload_date': '20201020',
'uploader_id': '10619174',
'thumbnail': 'https://cdn.medal.tv/10619174/thumbnail-34934644-720p.jpg?t=1080p&c=202042&missing',
'uploader_url': 'https://medal.tv/users/10619174',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 23,
}
}, {
'url': 'https://medal.tv/clips/2um24TWdty0NA',
'url': 'https://medal.tv/games/cod%20cold%20war/clips/2um24TWdty0NA',
'md5': 'b6dc76b78195fff0b4f8bf4a33ec2148',
'info_dict': {
'id': '2um24TWdty0NA',
@@ -39,25 +63,42 @@ class MedalTVIE(InfoExtractor):
'timestamp': 1605580939,
'upload_date': '20201117',
'uploader_id': '5156321',
'thumbnail': 'https://cdn.medal.tv/5156321/thumbnail-36787208-360p.jpg?t=1080p&c=202046&missing',
'uploader_url': 'https://medal.tv/users/5156321',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 9,
}
}, {
'url': 'https://medal.tv/clips/37rMeFpryCC-9',
'url': 'https://medal.tv/games/valorant/clips/37rMeFpryCC-9',
'only_matching': True,
}, {
'url': 'https://medal.tv/clips/2WRj40tpY_EU9',
'url': 'https://medal.tv/games/valorant/clips/2WRj40tpY_EU9',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
path = self._match_valid_url(url).group('path')
webpage = self._download_webpage(url, video_id)
hydration_data = self._parse_json(self._search_regex(
r'<script[^>]*>\s*(?:var\s*)?hydrationData\s*=\s*({.+?})\s*</script>',
webpage, 'hydration data', default='{}'), video_id)
next_data = self._search_json(
'<script[^>]*__NEXT_DATA__[^>]*>', webpage,
'next data', video_id, end_pattern='</script>', fatal=False)
clip = try_get(
hydration_data, lambda x: x['clips'][video_id], dict) or {}
build_id = next_data.get('buildId')
if not build_id:
raise ExtractorError(
'Could not find build ID.', video_id=video_id)
locale = next_data.get('locale', 'en')
api_response = self._download_json(
f'https://medal.tv/_next/data/{build_id}/{locale}/{path}/{video_id}.json', video_id)
clip = traverse_obj(api_response, ('pageProps', 'clip')) or {}
if not clip:
raise ExtractorError(
'Could not find video information.', video_id=video_id)
@@ -113,9 +154,8 @@ class MedalTVIE(InfoExtractor):
# Necessary because the id of the author is not known in advance.
# Won't raise an issue if no profile can be found as this is optional.
author = try_get(
hydration_data, lambda x: list(x['profiles'].values())[0], dict) or {}
author_id = str_or_none(author.get('id'))
author = traverse_obj(api_response, ('pageProps', 'profile')) or {}
author_id = str_or_none(author.get('userId'))
author_url = format_field(author_id, None, 'https://medal.tv/users/%s')
return {

View File

@@ -172,31 +172,27 @@ class MediasetIE(ThePlatformBaseIE):
}]
def _extract_from_webpage(self, url, webpage):
def _qs(url):
return parse_qs(url)
def _program_guid(qs):
return qs.get('programGuid', [None])[0]
entries = []
for mobj in re.finditer(
r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//(?:www\.)?video\.mediaset\.it/player/playerIFrame(?:Twitter)?\.shtml.*?)\1',
webpage):
embed_url = mobj.group('url')
embed_qs = _qs(embed_url)
embed_qs = parse_qs(embed_url)
program_guid = _program_guid(embed_qs)
if program_guid:
entries.append(embed_url)
yield self.url_result(embed_url)
continue
video_id = embed_qs.get('id', [None])[0]
if not video_id:
continue
urlh = self._request_webpage(embed_url, video_id, note='Following embed URL redirect')
embed_url = urlh.geturl()
program_guid = _program_guid(_qs(embed_url))
program_guid = _program_guid(parse_qs(embed_url))
if program_guid:
entries.append(embed_url)
return entries
yield self.url_result(embed_url)
def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None):
for video in smil.findall(self._xpath_ns('.//video', namespace)):

View File

@@ -159,6 +159,7 @@ class MixcloudIE(MixcloudBaseIE):
formats.append({
'format_id': 'http',
'url': decrypted,
'vcodec': 'none',
'downloader_options': {
# Mixcloud starts throttling at >~5M
'http_chunk_size': 5242880,

View File

@@ -0,0 +1,54 @@
import re
from .common import InfoExtractor
from ..utils import ExtractorError
class NewsPicksIE(InfoExtractor):
_VALID_URL = r'https://newspicks\.com/movie-series/(?P<channel_id>\d+)\?movieId=(?P<id>\d+)'
_TESTS = [{
'url': 'https://newspicks.com/movie-series/11?movieId=1813',
'info_dict': {
'id': '1813',
'title': '日本の課題を破壊せよ【ゲスト:成田悠輔】',
'description': 'md5:09397aad46d6ded6487ff13f138acadf',
'channel': 'HORIE ONE',
'channel_id': '11',
'release_date': '20220117',
'thumbnail': r're:https://.+jpg',
'ext': 'mp4',
},
}]
def _real_extract(self, url):
video_id, channel_id = self._match_valid_url(url).group('id', 'channel_id')
webpage = self._download_webpage(url, video_id)
entries = self._parse_html5_media_entries(
url, webpage.replace('movie-for-pc', 'movie'), video_id, 'hls')
if not entries:
raise ExtractorError('No HTML5 media elements found')
info = entries[0]
self._sort_formats(info['formats'])
title = self._html_search_meta('og:title', webpage, fatal=False)
description = self._html_search_meta(
('og:description', 'twitter:title'), webpage, fatal=False)
channel = self._html_search_regex(
r'value="11".+?<div\s+class="title">(.+?)</div', webpage, 'channel name', fatal=False)
if not title or not channel:
title, channel = re.split(r'\s*|\s*', self._html_extract_title(webpage))
release_date = self._search_regex(
r'<span\s+class="on-air-date">\s*(\d+)年(\d+)月(\d+)日\s*</span>',
webpage, 'release date', fatal=False, group=(1, 2, 3))
info.update({
'id': video_id,
'title': title,
'description': description,
'channel': channel,
'channel_id': channel_id,
'release_date': ('%04d%02d%02d' % tuple(map(int, release_date))) if release_date else None,
})
return info

View File

@@ -1,3 +1,4 @@
import collections
import contextlib
import json
import os
@@ -9,8 +10,10 @@ from ..utils import (
ExtractorError,
Popen,
check_executable,
format_field,
get_exe_version,
is_outdated_version,
shell_quote,
)
@@ -49,7 +52,9 @@ class PhantomJSwrapper:
This class is experimental.
"""
_TEMPLATE = r'''
INSTALL_HINT = 'Please download it from https://phantomjs.org/download.html'
_BASE_JS = R'''
phantom.onError = function(msg, trace) {{
var msgStack = ['PHANTOM ERROR: ' + msg];
if(trace && trace.length) {{
@@ -62,6 +67,9 @@ class PhantomJSwrapper:
console.error(msgStack.join('\n'));
phantom.exit(1);
}};
'''
_TEMPLATE = R'''
var page = require('webpage').create();
var fs = require('fs');
var read = {{ mode: 'r', charset: 'utf-8' }};
@@ -104,8 +112,7 @@ class PhantomJSwrapper:
self.exe = check_executable('phantomjs', ['-v'])
if not self.exe:
raise ExtractorError(
'PhantomJS not found, Please download it from https://phantomjs.org/download.html', expected=True)
raise ExtractorError(f'PhantomJS not found, {self.INSTALL_HINT}', expected=True)
self.extractor = extractor
@@ -116,14 +123,18 @@ class PhantomJSwrapper:
'Your copy of PhantomJS is outdated, update it to version '
'%s or newer if you encounter any errors.' % required_version)
self.options = {
'timeout': timeout,
}
for name in self._TMP_FILE_NAMES:
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.close()
self._TMP_FILES[name] = tmp
self.options = collections.ChainMap({
'timeout': timeout,
}, {
x: self._TMP_FILES[x].name.replace('\\', '\\\\').replace('"', '\\"')
for x in self._TMP_FILE_NAMES
})
def __del__(self):
for name in self._TMP_FILE_NAMES:
with contextlib.suppress(OSError, KeyError):
@@ -194,31 +205,39 @@ class PhantomJSwrapper:
self._save_cookies(url)
replaces = self.options
replaces['url'] = url
user_agent = headers.get('User-Agent') or self.extractor.get_param('http_headers')['User-Agent']
replaces['ua'] = user_agent.replace('"', '\\"')
replaces['jscode'] = jscode
jscode = self._TEMPLATE.format_map(self.options.new_child({
'url': url,
'ua': user_agent.replace('"', '\\"'),
'jscode': jscode,
}))
for x in self._TMP_FILE_NAMES:
replaces[x] = self._TMP_FILES[x].name.replace('\\', '\\\\').replace('"', '\\"')
stdout = self.execute(jscode, video_id, note2)
with open(self._TMP_FILES['script'].name, 'wb') as f:
f.write(self._TEMPLATE.format(**replaces).encode('utf-8'))
if video_id is None:
self.extractor.to_screen(f'{note2}')
else:
self.extractor.to_screen(f'{video_id}: {note2}')
stdout, stderr, returncode = Popen.run(
[self.exe, '--ssl-protocol=any', self._TMP_FILES['script'].name],
text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if returncode:
raise ExtractorError(f'Executing JS failed:\n{stderr}')
with open(self._TMP_FILES['html'].name, 'rb') as f:
html = f.read().decode('utf-8')
self._load_cookies()
return html, stdout
def execute(self, jscode, video_id=None, *, note='Executing JS'):
"""Execute JS and return stdout"""
if 'phantom.exit();' not in jscode:
jscode += ';\nphantom.exit();'
jscode = self._BASE_JS + jscode
with open(self._TMP_FILES['script'].name, 'w', encoding='utf-8') as f:
f.write(jscode)
self.extractor.to_screen(f'{format_field(video_id, None, "%s: ")}{note}')
cmd = [self.exe, '--ssl-protocol=any', self._TMP_FILES['script'].name]
self.extractor.write_debug(f'PhantomJS command line: {shell_quote(cmd)}')
try:
stdout, stderr, returncode = Popen.run(cmd, timeout=self.options['timeout'] / 1000,
text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
except Exception as e:
raise ExtractorError(f'{note} failed: Unable to run PhantomJS binary', cause=e)
if returncode:
raise ExtractorError(f'{note} failed with returncode {returncode}:\n{stderr.strip()}')
return stdout

View File

@@ -156,7 +156,7 @@ class RaiBaseIE(InfoExtractor):
br = int_or_none(tbr)
if len(fmts) == 1 and not br:
br = fmts[0].get('tbr')
if br or 0 > 300:
if br and br > 300:
tbr = compat_str(math.floor(br / 100) * 100)
else:
tbr = '250'

View File

@@ -11,6 +11,7 @@ from ..utils import (
int_or_none,
strip_or_none,
traverse_obj,
try_call,
unified_timestamp,
)
@@ -69,6 +70,10 @@ class RedBeeBaseIE(InfoExtractor):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
format['mediaLocator'], asset_id, fatal=False)
if format.get('drm'):
for f in fmts:
f['has_drm'] = True
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
@@ -251,7 +256,7 @@ class RTBFIE(RedBeeBaseIE):
if not login_token:
self.raise_login_required()
session_jwt = self._download_json(
session_jwt = try_call(lambda: self._get_cookies(url)['rtbf_jwt'].value) or self._download_json(
'https://login.rtbf.be/accounts.getJWT', media_id, query={
'login_token': login_token.value,
'APIKey': self._GIGYA_API_KEY,
@@ -269,8 +274,17 @@ class RTBFIE(RedBeeBaseIE):
embed_page = self._download_webpage(
'https://www.rtbf.be/auvio/embed/' + ('direct' if live else 'media'),
media_id, query={'id': media_id})
data = self._parse_json(self._html_search_regex(
r'data-media="([^"]+)"', embed_page, 'media data'), media_id)
media_data = self._html_search_regex(r'data-media="([^"]+)"', embed_page, 'media data', fatal=False)
if not media_data:
if re.search(r'<div[^>]+id="js-error-expired"[^>]+class="(?![^"]*hidden)', embed_page):
raise ExtractorError('Livestream has ended.', expected=True)
if re.search(r'<div[^>]+id="js-sso-connect"[^>]+class="(?![^"]*hidden)', embed_page):
self.raise_login_required()
raise ExtractorError('Could not find media data')
data = self._parse_json(media_data, media_id)
error = data.get('error')
if error:
@@ -280,15 +294,20 @@ class RTBFIE(RedBeeBaseIE):
if provider in self._PROVIDERS:
return self.url_result(data['url'], self._PROVIDERS[provider])
title = data['subtitle']
title = traverse_obj(data, 'subtitle', 'title')
is_live = data.get('isLive')
height_re = r'-(\d+)p\.'
formats = []
formats, subtitles = [], {}
m3u8_url = data.get('urlHlsAes128') or data.get('urlHls')
# The old api still returns m3u8 and mpd manifest for livestreams, but these are 'fake'
# since all they contain is a 20s video that is completely unrelated.
# https://github.com/yt-dlp/yt-dlp/issues/4656#issuecomment-1214461092
m3u8_url = None if data.get('isLive') else traverse_obj(data, 'urlHlsAes128', 'urlHls')
if m3u8_url:
formats.extend(self._extract_m3u8_formats(
m3u8_url, media_id, 'mp4', m3u8_id='hls', fatal=False))
fmts, subs = self._extract_m3u8_formats_and_subtitles(
m3u8_url, media_id, 'mp4', m3u8_id='hls', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
fix_url = lambda x: x.replace('//rtbf-vod.', '//rtbf.') if '/geo/drm/' in x else x
http_url = data.get('url')
@@ -319,10 +338,12 @@ class RTBFIE(RedBeeBaseIE):
'height': height,
})
mpd_url = data.get('urlDash')
mpd_url = None if data.get('isLive') else data.get('urlDash')
if mpd_url and (self.get_param('allow_unplayable_formats') or not data.get('drm')):
formats.extend(self._extract_mpd_formats(
mpd_url, media_id, mpd_id='dash', fatal=False))
fmts, subs = self._extract_mpd_formats_and_subtitles(
mpd_url, media_id, mpd_id='dash', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
audio_url = data.get('urlAudio')
if audio_url:
@@ -332,7 +353,6 @@ class RTBFIE(RedBeeBaseIE):
'vcodec': 'none',
})
subtitles = {}
for track in (data.get('tracks') or {}).values():
sub_url = track.get('url')
if not sub_url:
@@ -342,7 +362,7 @@ class RTBFIE(RedBeeBaseIE):
})
if not formats:
fmts, subs = self._get_formats_and_subtitles(url, media_id)
fmts, subs = self._get_formats_and_subtitles(url, f'live_{media_id}' if is_live else media_id)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)

View File

@@ -1,10 +1,12 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
get_element_by_class,
int_or_none,
remove_start,
strip_or_none,
unified_strdate,
urlencode_postdata,
)
@@ -34,6 +36,28 @@ class ScreencastOMaticIE(InfoExtractor):
video_id = self._match_id(url)
webpage = self._download_webpage(
'https://screencast-o-matic.com/player/' + video_id, video_id)
if (self._html_extract_title(webpage) == 'Protected Content'
or 'This video is private and requires a password' in webpage):
password = self.get_param('videopassword')
if not password:
raise ExtractorError('Password protected video, use --video-password <password>', expected=True)
form = self._search_regex(
r'(?is)<form[^>]*>(?P<form>.+?)</form>', webpage, 'login form', group='form')
form_data = self._hidden_inputs(form)
form_data.update({
'scPassword': password,
})
webpage = self._download_webpage(
'https://screencast-o-matic.com/player/password', video_id, 'Logging in',
data=urlencode_postdata(form_data))
if '<small class="text-danger">Invalid password</small>' in webpage:
raise ExtractorError('Unable to login: Invalid password', expected=True)
info = self._parse_html5_media_entries(url, webpage, video_id)[0]
info.update({
'id': video_id,

View File

@@ -44,7 +44,7 @@ class SovietsClosetIE(SovietsClosetBaseIE):
_TESTS = [
{
'url': 'https://sovietscloset.com/video/1337',
'md5': '11e58781c4ca5b283307aa54db5b3f93',
'md5': 'bd012b04b261725510ca5383074cdd55',
'info_dict': {
'id': '1337',
'ext': 'mp4',
@@ -69,11 +69,11 @@ class SovietsClosetIE(SovietsClosetBaseIE):
},
{
'url': 'https://sovietscloset.com/video/1105',
'md5': '578b1958a379e7110ba38697042e9efb',
'md5': '89fa928f183893cb65a0b7be846d8a90',
'info_dict': {
'id': '1105',
'ext': 'mp4',
'title': 'Arma 3 - Zeus Games #3',
'title': 'Arma 3 - Zeus Games #5',
'uploader': 'SovietWomble',
'thumbnail': r're:^https?://.*\.b-cdn\.net/c0e5e76f-3a93-40b4-bf01-12343c2eec5d/thumbnail\.jpg$',
'uploader': 'SovietWomble',
@@ -89,8 +89,8 @@ class SovietsClosetIE(SovietsClosetBaseIE):
'availability': 'public',
'series': 'Arma 3',
'season': 'Zeus Games',
'episode_number': 3,
'episode': 'Episode 3',
'episode_number': 5,
'episode': 'Episode 5',
},
},
]
@@ -122,7 +122,7 @@ class SovietsClosetIE(SovietsClosetBaseIE):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
static_assets_base = self._search_regex(r'staticAssetsBase:\"(.*?)\"', webpage, 'staticAssetsBase')
static_assets_base = self._search_regex(r'(/_nuxt/static/\d+)', webpage, 'staticAssetsBase')
static_assets_base = f'https://sovietscloset.com{static_assets_base}'
stream = self.parse_nuxt_jsonp(f'{static_assets_base}/video/{video_id}/payload.js', video_id, 'video')['stream']
@@ -181,7 +181,7 @@ class SovietsClosetPlaylistIE(SovietsClosetBaseIE):
webpage = self._download_webpage(url, playlist_id)
static_assets_base = self._search_regex(r'staticAssetsBase:\"(.*?)\"', webpage, 'staticAssetsBase')
static_assets_base = self._search_regex(r'(/_nuxt/static/\d+)', webpage, 'staticAssetsBase')
static_assets_base = f'https://sovietscloset.com{static_assets_base}'
sovietscloset = self.parse_nuxt_jsonp(f'{static_assets_base}/payload.js', playlist_id, 'global')['games']

View File

@@ -29,9 +29,7 @@ class StripchatIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
'https://stripchat.com/%s/' % video_id, video_id,
headers=self.geo_verification_headers())
webpage = self._download_webpage(url, video_id, headers=self.geo_verification_headers())
data = self._parse_json(
self._search_regex(

369
yt_dlp/extractor/tencent.py Normal file
View File

@@ -0,0 +1,369 @@
import functools
import random
import re
import string
import time
from .common import InfoExtractor
from ..aes import aes_cbc_encrypt_bytes
from ..utils import (
ExtractorError,
determine_ext,
int_or_none,
js_to_json,
traverse_obj,
urljoin,
)
class TencentBaseIE(InfoExtractor):
"""Subclasses must set _API_URL, _APP_VERSION, _PLATFORM, _HOST, _REFERER"""
def _get_ckey(self, video_id, url, guid):
ua = self.get_param('http_headers')['User-Agent']
payload = (f'{video_id}|{int(time.time())}|mg3c3b04ba|{self._APP_VERSION}|{guid}|'
f'{self._PLATFORM}|{url[:48]}|{ua.lower()[:48]}||Mozilla|Netscape|Windows x86_64|00|')
return aes_cbc_encrypt_bytes(
bytes(f'|{sum(map(ord, payload))}|{payload}', 'utf-8'),
b'Ok\xda\xa3\x9e/\x8c\xb0\x7f^r-\x9e\xde\xf3\x14',
b'\x01PJ\xf3V\xe6\x19\xcf.B\xbb\xa6\x8c?p\xf9',
padding_mode='whitespace').hex().upper()
def _get_video_api_response(self, video_url, video_id, series_id, subtitle_format, video_format, video_quality):
guid = ''.join([random.choice(string.digits + string.ascii_lowercase) for _ in range(16)])
ckey = self._get_ckey(video_id, video_url, guid)
query = {
'vid': video_id,
'cid': series_id,
'cKey': ckey,
'encryptVer': '8.1',
'spcaptiontype': '1' if subtitle_format == 'vtt' else '0',
'sphls': '2' if video_format == 'hls' else '0',
'dtype': '3' if video_format == 'hls' else '0',
'defn': video_quality,
'spsrt': '2', # Enable subtitles
'sphttps': '1', # Enable HTTPS
'otype': 'json',
'spwm': '1',
# For SHD
'host': self._HOST,
'referer': self._REFERER,
'ehost': video_url,
'appVer': self._APP_VERSION,
'platform': self._PLATFORM,
# For VQQ
'guid': guid,
'flowid': ''.join(random.choice(string.digits + string.ascii_lowercase) for _ in range(32)),
}
return self._search_json(r'QZOutputJson=', self._download_webpage(
self._API_URL, video_id, query=query), 'api_response', video_id)
def _extract_video_formats_and_subtitles(self, api_response, video_id):
video_response = api_response['vl']['vi'][0]
video_width, video_height = video_response.get('vw'), video_response.get('vh')
formats, subtitles = [], {}
for video_format in video_response['ul']['ui']:
if video_format.get('hls'):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
video_format['url'] + video_format['hls']['pt'], video_id, 'mp4', fatal=False)
for f in fmts:
f.update({'width': video_width, 'height': video_height})
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
else:
formats.append({
'url': f'{video_format["url"]}{video_response["fn"]}?vkey={video_response["fvkey"]}',
'width': video_width,
'height': video_height,
'ext': 'mp4',
})
return formats, subtitles
def _extract_video_native_subtitles(self, api_response, subtitles_format):
subtitles = {}
for subtitle in traverse_obj(api_response, ('sfl', 'fi')) or ():
subtitles.setdefault(subtitle['lang'].lower(), []).append({
'url': subtitle['url'],
'ext': subtitles_format,
'protocol': 'm3u8_native' if determine_ext(subtitle['url']) == 'm3u8' else 'http',
})
return subtitles
def _extract_all_video_formats_and_subtitles(self, url, video_id, series_id):
formats, subtitles = [], {}
for video_format, subtitle_format, video_quality in (
# '': 480p, 'shd': 720p, 'fhd': 1080p
('mp4', 'srt', ''), ('hls', 'vtt', 'shd'), ('hls', 'vtt', 'fhd')):
api_response = self._get_video_api_response(
url, video_id, series_id, subtitle_format, video_format, video_quality)
if api_response.get('em') != 0 and api_response.get('exem') != 0:
if '您所在区域暂无此内容版权' in api_response.get('msg'):
self.raise_geo_restricted()
raise ExtractorError(f'Tencent said: {api_response.get("msg")}')
fmts, subs = self._extract_video_formats_and_subtitles(api_response, video_id)
native_subtitles = self._extract_video_native_subtitles(api_response, subtitle_format)
formats.extend(fmts)
self._merge_subtitles(subs, native_subtitles, target=subtitles)
self._sort_formats(formats)
return formats, subtitles
def _get_clean_title(self, title):
return re.sub(
r'\s*[_\-]\s*(?:Watch online|腾讯视频|(?:高清)?1080P在线观看平台).*?$',
'', title or '').strip() or None
class VQQBaseIE(TencentBaseIE):
_VALID_URL_BASE = r'https?://v\.qq\.com'
_API_URL = 'https://h5vv6.video.qq.com/getvinfo'
_APP_VERSION = '3.5.57'
_PLATFORM = '10901'
_HOST = 'v.qq.com'
_REFERER = 'v.qq.com'
def _get_webpage_metadata(self, webpage, video_id):
return self._parse_json(
self._search_regex(
r'(?s)<script[^>]*>[^<]*window\.__pinia\s*=\s*([^<]+)</script>',
webpage, 'pinia data', fatal=False),
video_id, transform_source=js_to_json, fatal=False)
class VQQVideoIE(VQQBaseIE):
IE_NAME = 'vqq:video'
_VALID_URL = VQQBaseIE._VALID_URL_BASE + r'/x/(?:page|cover/(?P<series_id>\w+))/(?P<id>\w+)'
_TESTS = [{
'url': 'https://v.qq.com/x/page/q326831cny0.html',
'md5': '826ef93682df09e3deac4a6e6e8cdb6e',
'info_dict': {
'id': 'q326831cny0',
'ext': 'mp4',
'title': '我是选手:雷霆裂阵,终极时刻',
'description': 'md5:e7ed70be89244017dac2a835a10aeb1e',
'thumbnail': r're:^https?://[^?#]+q326831cny0',
},
}, {
'url': 'https://v.qq.com/x/page/o3013za7cse.html',
'md5': 'b91cbbeada22ef8cc4b06df53e36fa21',
'info_dict': {
'id': 'o3013za7cse',
'ext': 'mp4',
'title': '欧阳娜娜VLOG',
'description': 'md5:29fe847497a98e04a8c3826e499edd2e',
'thumbnail': r're:^https?://[^?#]+o3013za7cse',
},
}, {
'url': 'https://v.qq.com/x/cover/7ce5noezvafma27/a00269ix3l8.html',
'md5': '71459c5375c617c265a22f083facce67',
'info_dict': {
'id': 'a00269ix3l8',
'ext': 'mp4',
'title': '鸡毛飞上天 第01集',
'description': 'md5:8cae3534327315b3872fbef5e51b5c5b',
'thumbnail': r're:^https?://[^?#]+7ce5noezvafma27',
'series': '鸡毛飞上天',
},
}, {
'url': 'https://v.qq.com/x/cover/mzc00200p29k31e/s0043cwsgj0.html',
'md5': '96b9fd4a189fdd4078c111f21d7ac1bc',
'info_dict': {
'id': 's0043cwsgj0',
'ext': 'mp4',
'title': '第1集如何快乐吃糖',
'description': 'md5:1d8c3a0b8729ae3827fa5b2d3ebd5213',
'thumbnail': r're:^https?://[^?#]+s0043cwsgj0',
'series': '青年理工工作者生活研究所',
},
}]
def _real_extract(self, url):
video_id, series_id = self._match_valid_url(url).group('id', 'series_id')
webpage = self._download_webpage(url, video_id)
webpage_metadata = self._get_webpage_metadata(webpage, video_id)
formats, subtitles = self._extract_all_video_formats_and_subtitles(url, video_id, series_id)
return {
'id': video_id,
'title': self._get_clean_title(self._og_search_title(webpage)
or traverse_obj(webpage_metadata, ('global', 'videoInfo', 'title'))),
'description': (self._og_search_description(webpage)
or traverse_obj(webpage_metadata, ('global', 'videoInfo', 'desc'))),
'formats': formats,
'subtitles': subtitles,
'thumbnail': (self._og_search_thumbnail(webpage)
or traverse_obj(webpage_metadata, ('global', 'videoInfo', 'pic160x90'))),
'series': traverse_obj(webpage_metadata, ('global', 'coverInfo', 'title')),
}
class VQQSeriesIE(VQQBaseIE):
IE_NAME = 'vqq:series'
_VALID_URL = VQQBaseIE._VALID_URL_BASE + r'/x/cover/(?P<id>\w+)\.html/?(?:[?#]|$)'
_TESTS = [{
'url': 'https://v.qq.com/x/cover/7ce5noezvafma27.html',
'info_dict': {
'id': '7ce5noezvafma27',
'title': '鸡毛飞上天',
'description': 'md5:8cae3534327315b3872fbef5e51b5c5b',
},
'playlist_count': 55,
}, {
'url': 'https://v.qq.com/x/cover/oshd7r0vy9sfq8e.html',
'info_dict': {
'id': 'oshd7r0vy9sfq8e',
'title': '恋爱细胞2',
'description': 'md5:9d8a2245679f71ca828534b0f95d2a03',
},
'playlist_count': 12,
}]
def _real_extract(self, url):
series_id = self._match_id(url)
webpage = self._download_webpage(url, series_id)
webpage_metadata = self._get_webpage_metadata(webpage, series_id)
episode_paths = [f'/x/cover/{series_id}/{video_id}.html' for video_id in re.findall(
r'<div[^>]+data-vid="(?P<video_id>[^"]+)"[^>]+class="[^"]+episode-item-rect--number',
webpage)]
return self.playlist_from_matches(
episode_paths, series_id, ie=VQQVideoIE, getter=functools.partial(urljoin, url),
title=self._get_clean_title(traverse_obj(webpage_metadata, ('coverInfo', 'title'))
or self._og_search_title(webpage)),
description=(traverse_obj(webpage_metadata, ('coverInfo', 'description'))
or self._og_search_description(webpage)))
class WeTvBaseIE(TencentBaseIE):
_VALID_URL_BASE = r'https?://(?:www\.)?wetv\.vip/(?:[^?#]+/)?play'
_API_URL = 'https://play.wetv.vip/getvinfo'
_APP_VERSION = '3.5.57'
_PLATFORM = '4830201'
_HOST = 'wetv.vip'
_REFERER = 'wetv.vip'
def _get_webpage_metadata(self, webpage, video_id):
return self._parse_json(
traverse_obj(self._search_nextjs_data(webpage, video_id), ('props', 'pageProps', 'data')),
video_id, fatal=False)
class WeTvEpisodeIE(WeTvBaseIE):
IE_NAME = 'wetv:episode'
_VALID_URL = WeTvBaseIE._VALID_URL_BASE + r'/(?P<series_id>\w+)(?:-[^?#]+)?/(?P<id>\w+)(?:-[^?#]+)?'
_TESTS = [{
'url': 'https://wetv.vip/en/play/air11ooo2rdsdi3-Cute-Programmer/v0040pr89t9-EP1-Cute-Programmer',
'md5': '0c70fdfaa5011ab022eebc598e64bbbe',
'info_dict': {
'id': 'v0040pr89t9',
'ext': 'mp4',
'title': 'EP1: Cute Programmer',
'description': 'md5:e87beab3bf9f392d6b9e541a63286343',
'thumbnail': r're:^https?://[^?#]+air11ooo2rdsdi3',
'series': 'Cute Programmer',
'episode': 'Episode 1',
'episode_number': 1,
'duration': 2835,
},
}, {
'url': 'https://wetv.vip/en/play/u37kgfnfzs73kiu/p0039b9nvik',
'md5': '3b3c15ca4b9a158d8d28d5aa9d7c0a49',
'info_dict': {
'id': 'p0039b9nvik',
'ext': 'mp4',
'title': 'EP1: You Are My Glory',
'description': 'md5:831363a4c3b4d7615e1f3854be3a123b',
'thumbnail': r're:^https?://[^?#]+u37kgfnfzs73kiu',
'series': 'You Are My Glory',
'episode': 'Episode 1',
'episode_number': 1,
'duration': 2454,
},
}, {
'url': 'https://wetv.vip/en/play/lcxgwod5hapghvw-WeTV-PICK-A-BOO/i0042y00lxp-Zhao-Lusi-Describes-The-First-Experiences-She-Had-In-Who-Rules-The-World-%7C-WeTV-PICK-A-BOO',
'md5': '71133f5c2d5d6cad3427e1b010488280',
'info_dict': {
'id': 'i0042y00lxp',
'ext': 'mp4',
'title': 'md5:f7a0857dbe5fbbe2e7ad630b92b54e6a',
'description': 'md5:76260cb9cdc0ef76826d7ca9d92fadfa',
'thumbnail': r're:^https?://[^?#]+lcxgwod5hapghvw',
'series': 'WeTV PICK-A-BOO',
'episode': 'Episode 0',
'episode_number': 0,
'duration': 442,
},
}]
def _real_extract(self, url):
video_id, series_id = self._match_valid_url(url).group('id', 'series_id')
webpage = self._download_webpage(url, video_id)
webpage_metadata = self._get_webpage_metadata(webpage, video_id)
formats, subtitles = self._extract_all_video_formats_and_subtitles(url, video_id, series_id)
return {
'id': video_id,
'title': self._get_clean_title(self._og_search_title(webpage)
or traverse_obj(webpage_metadata, ('coverInfo', 'title'))),
'description': (traverse_obj(webpage_metadata, ('coverInfo', 'description'))
or self._og_search_description(webpage)),
'formats': formats,
'subtitles': subtitles,
'thumbnail': self._og_search_thumbnail(webpage),
'duration': int_or_none(traverse_obj(webpage_metadata, ('videoInfo', 'duration'))),
'series': traverse_obj(webpage_metadata, ('coverInfo', 'title')),
'episode_number': int_or_none(traverse_obj(webpage_metadata, ('videoInfo', 'episode'))),
}
class WeTvSeriesIE(WeTvBaseIE):
_VALID_URL = WeTvBaseIE._VALID_URL_BASE + r'/(?P<id>\w+)(?:-[^/?#]+)?/?(?:[?#]|$)'
_TESTS = [{
'url': 'https://wetv.vip/play/air11ooo2rdsdi3-Cute-Programmer',
'info_dict': {
'id': 'air11ooo2rdsdi3',
'title': 'Cute Programmer',
'description': 'md5:e87beab3bf9f392d6b9e541a63286343',
},
'playlist_count': 30,
}, {
'url': 'https://wetv.vip/en/play/u37kgfnfzs73kiu-You-Are-My-Glory',
'info_dict': {
'id': 'u37kgfnfzs73kiu',
'title': 'You Are My Glory',
'description': 'md5:831363a4c3b4d7615e1f3854be3a123b',
},
'playlist_count': 32,
}]
def _real_extract(self, url):
series_id = self._match_id(url)
webpage = self._download_webpage(url, series_id)
webpage_metadata = self._get_webpage_metadata(webpage, series_id)
episode_paths = ([f'/play/{series_id}/{episode["vid"]}' for episode in webpage_metadata.get('videoList')]
or re.findall(r'<a[^>]+class="play-video__link"[^>]+href="(?P<path>[^"]+)', webpage))
return self.playlist_from_matches(
episode_paths, series_id, ie=WeTvEpisodeIE, getter=functools.partial(urljoin, url),
title=self._get_clean_title(traverse_obj(webpage_metadata, ('coverInfo', 'title'))
or self._og_search_title(webpage)),
description=(traverse_obj(webpage_metadata, ('coverInfo', 'description'))
or self._og_search_description(webpage)))

View File

@@ -8,12 +8,14 @@ class TestURLIE(InfoExtractor):
""" Allows addressing of the test cases as test:yout.*be_1 """
IE_DESC = False # Do not list
_VALID_URL = r'test(?:url)?:(?P<extractor>.+?)(?:_(?P<num>[0-9]+))?$'
_VALID_URL = r'test(?:url)?:(?P<extractor>.*?)(?:_(?P<num>[0-9]+))?$'
def _real_extract(self, url):
from . import gen_extractor_classes
extractor_id, num = self._match_valid_url(url).group('extractor', 'num')
if not extractor_id:
return {'id': ':test', 'title': '', 'url': url}
rex = re.compile(extractor_id, flags=re.IGNORECASE)
matching_extractors = [e for e in gen_extractor_classes() if rex.search(e.IE_NAME)]

304
yt_dlp/extractor/triller.py Normal file
View File

@@ -0,0 +1,304 @@
import itertools
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
str_or_none,
traverse_obj,
unified_strdate,
unified_timestamp,
url_basename,
)
class TrillerBaseIE(InfoExtractor):
_NETRC_MACHINE = 'triller'
_AUTH_TOKEN = None
_API_BASE_URL = 'https://social.triller.co/v1.5'
def _perform_login(self, username, password):
if self._AUTH_TOKEN:
return
user_check = self._download_json(
f'{self._API_BASE_URL}/api/user/is-valid-username', None, note='Checking username',
fatal=False, expected_status=400, headers={
'Content-Type': 'application/json',
'Origin': 'https://triller.co',
}, data=json.dumps({'username': username}, separators=(',', ':')).encode('utf-8'))
if user_check.get('status'): # endpoint returns "status":false if username exists
raise ExtractorError('Unable to login: Invalid username', expected=True)
credentials = {
'username': username,
'password': password,
}
login = self._download_json(
f'{self._API_BASE_URL}/user/auth', None, note='Logging in',
fatal=False, expected_status=400, headers={
'Content-Type': 'application/json',
'Origin': 'https://triller.co',
}, data=json.dumps(credentials, separators=(',', ':')).encode('utf-8'))
if not login.get('auth_token'):
if login.get('error') == 1008:
raise ExtractorError('Unable to login: Incorrect password', expected=True)
raise ExtractorError('Unable to login')
self._AUTH_TOKEN = login['auth_token']
def _get_comments(self, video_id, limit=15):
comment_info = self._download_json(
f'{self._API_BASE_URL}/api/videos/{video_id}/comments_v2',
video_id, fatal=False, note='Downloading comments API JSON',
headers={'Origin': 'https://triller.co'}, query={'limit': limit}) or {}
if not comment_info.get('comments'):
return
for comment_dict in comment_info['comments']:
yield {
'author': traverse_obj(comment_dict, ('author', 'username')),
'author_id': traverse_obj(comment_dict, ('author', 'user_id')),
'id': comment_dict.get('id'),
'text': comment_dict.get('body'),
'timestamp': unified_timestamp(comment_dict.get('timestamp')),
}
def _check_user_info(self, user_info):
if not user_info:
self.report_warning('Unable to extract user info')
elif user_info.get('private') and not user_info.get('followed_by_me'):
raise ExtractorError('This video is private', expected=True)
elif traverse_obj(user_info, 'blocked_by_user', 'blocking_user'):
raise ExtractorError('The author of the video is blocked', expected=True)
return user_info
def _parse_video_info(self, video_info, username, user_info=None):
video_uuid = video_info.get('video_uuid')
video_id = video_info.get('id')
formats = []
video_url = traverse_obj(video_info, 'video_url', 'stream_url')
if video_url:
formats.append({
'url': video_url,
'ext': 'mp4',
'vcodec': 'h264',
'width': video_info.get('width'),
'height': video_info.get('height'),
'format_id': url_basename(video_url).split('.')[0],
'filesize': video_info.get('filesize'),
})
video_set = video_info.get('video_set') or []
for video in video_set:
resolution = video.get('resolution') or ''
formats.append({
'url': video['url'],
'ext': 'mp4',
'vcodec': video.get('codec'),
'vbr': int_or_none(video.get('bitrate'), 1000),
'width': int_or_none(resolution.split('x')[0]),
'height': int_or_none(resolution.split('x')[1]),
'format_id': url_basename(video['url']).split('.')[0],
})
audio_url = video_info.get('audio_url')
if audio_url:
formats.append({
'url': audio_url,
'ext': 'm4a',
'format_id': url_basename(audio_url).split('.')[0],
})
manifest_url = video_info.get('transcoded_url')
if manifest_url:
formats.extend(self._extract_m3u8_formats(
manifest_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
self._sort_formats(formats)
comment_count = int_or_none(video_info.get('comment_count'))
user_info = user_info or traverse_obj(video_info, 'user', default={})
return {
'id': str_or_none(video_id) or video_uuid,
'title': video_info.get('description') or f'Video by {username}',
'thumbnail': video_info.get('thumbnail_url'),
'description': video_info.get('description'),
'uploader': str_or_none(username),
'uploader_id': str_or_none(user_info.get('user_id')),
'creator': str_or_none(user_info.get('name')),
'timestamp': unified_timestamp(video_info.get('timestamp')),
'upload_date': unified_strdate(video_info.get('timestamp')),
'duration': int_or_none(video_info.get('duration')),
'view_count': int_or_none(video_info.get('play_count')),
'like_count': int_or_none(video_info.get('likes_count')),
'artist': str_or_none(video_info.get('song_artist')),
'track': str_or_none(video_info.get('song_title')),
'webpage_url': f'https://triller.co/@{username}/video/{video_uuid}',
'uploader_url': f'https://triller.co/@{username}',
'extractor_key': TrillerIE.ie_key(),
'extractor': TrillerIE.IE_NAME,
'formats': formats,
'comment_count': comment_count,
'__post_extractor': self.extract_comments(video_id, comment_count),
}
class TrillerIE(TrillerBaseIE):
_VALID_URL = r'''(?x)
https?://(?:www\.)?triller\.co/
@(?P<username>[\w\._]+)/video/
(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})
'''
_TESTS = [{
'url': 'https://triller.co/@theestallion/video/2358fcd7-3df2-4c77-84c8-1d091610a6cf',
'md5': '228662d783923b60d78395fedddc0a20',
'info_dict': {
'id': '71595734',
'ext': 'mp4',
'title': 'md5:9a2bf9435c5c4292678996a464669416',
'thumbnail': r're:^https://uploads\.cdn\.triller\.co/.+\.jpg$',
'description': 'md5:9a2bf9435c5c4292678996a464669416',
'uploader': 'theestallion',
'uploader_id': '18992236',
'creator': 'Megan Thee Stallion',
'timestamp': 1660598222,
'upload_date': '20220815',
'duration': 47,
'height': 3840,
'width': 2160,
'view_count': int,
'like_count': int,
'artist': 'Megan Thee Stallion',
'track': 'Her',
'webpage_url': 'https://triller.co/@theestallion/video/2358fcd7-3df2-4c77-84c8-1d091610a6cf',
'uploader_url': 'https://triller.co/@theestallion',
'comment_count': int,
}
}, {
'url': 'https://triller.co/@charlidamelio/video/46c6fcfa-aa9e-4503-a50c-68444f44cddc',
'md5': '874055f462af5b0699b9dbb527a505a0',
'info_dict': {
'id': '71621339',
'ext': 'mp4',
'title': 'md5:4c91ea82760fe0fffb71b8c3aa7295fc',
'thumbnail': r're:^https://uploads\.cdn\.triller\.co/.+\.jpg$',
'description': 'md5:4c91ea82760fe0fffb71b8c3aa7295fc',
'uploader': 'charlidamelio',
'uploader_id': '1875551',
'creator': 'charli damelio',
'timestamp': 1660773354,
'upload_date': '20220817',
'duration': 16,
'height': 1920,
'width': 1080,
'view_count': int,
'like_count': int,
'artist': 'Dixie',
'track': 'Someone to Blame',
'webpage_url': 'https://triller.co/@charlidamelio/video/46c6fcfa-aa9e-4503-a50c-68444f44cddc',
'uploader_url': 'https://triller.co/@charlidamelio',
'comment_count': int,
}
}]
def _real_extract(self, url):
username, video_uuid = self._match_valid_url(url).group('username', 'id')
video_info = traverse_obj(self._download_json(
f'{self._API_BASE_URL}/api/videos/{video_uuid}',
video_uuid, note='Downloading video info API JSON',
errnote='Unable to download video info API JSON',
headers={
'Origin': 'https://triller.co',
}), ('videos', 0))
if not video_info:
raise ExtractorError('No video info found in API response')
user_info = self._check_user_info(video_info.get('user') or {})
return self._parse_video_info(video_info, username, user_info)
class TrillerUserIE(TrillerBaseIE):
_VALID_URL = r'https?://(?:www\.)?triller\.co/@(?P<id>[\w\._]+)/?(?:$|[#?])'
_TESTS = [{
# first videos request only returns 2 videos
'url': 'https://triller.co/@theestallion',
'playlist_mincount': 9,
'info_dict': {
'id': '18992236',
'title': 'theestallion',
'thumbnail': r're:^https://uploads\.cdn\.triller\.co/.+\.jpg$',
}
}, {
'url': 'https://triller.co/@charlidamelio',
'playlist_mincount': 25,
'info_dict': {
'id': '1875551',
'title': 'charlidamelio',
'thumbnail': r're:^https://uploads\.cdn\.triller\.co/.+\.jpg$',
}
}]
def _real_initialize(self):
if not self._AUTH_TOKEN:
guest = self._download_json(
f'{self._API_BASE_URL}/user/create_guest',
None, note='Creating guest session', data=b'', headers={
'Origin': 'https://triller.co',
}, query={
'platform': 'Web',
'app_version': '',
})
if not guest.get('auth_token'):
raise ExtractorError('Unable to fetch required auth token for user extraction')
self._AUTH_TOKEN = guest['auth_token']
def _extract_video_list(self, username, user_id, limit=6):
query = {
'limit': limit,
}
for page in itertools.count(1):
for retry in self.RetryManager():
try:
video_list = self._download_json(
f'{self._API_BASE_URL}/api/users/{user_id}/videos',
username, note=f'Downloading user video list page {page}',
errnote='Unable to download user video list', headers={
'Authorization': f'Bearer {self._AUTH_TOKEN}',
'Origin': 'https://triller.co',
}, query=query)
except ExtractorError as e:
if isinstance(e.cause, json.JSONDecodeError) and e.cause.pos == 0:
retry.error = e
continue
raise
if not video_list.get('videos'):
break
yield from video_list['videos']
query['before_time'] = traverse_obj(video_list, ('videos', -1, 'timestamp'))
if not query['before_time']:
break
def _entries(self, videos, username, user_info):
for video in videos:
yield self._parse_video_info(video, username, user_info)
def _real_extract(self, url):
username = self._match_id(url)
user_info = self._check_user_info(self._download_json(
f'{self._API_BASE_URL}/api/users/by_username/{username}',
username, note='Downloading user info',
errnote='Failed to download user info', headers={
'Authorization': f'Bearer {self._AUTH_TOKEN}',
'Origin': 'https://triller.co',
}).get('user', {}))
user_id = str_or_none(user_info.get('user_id'))
videos = self._extract_video_list(username, user_id)
thumbnail = user_info.get('avatar_url')
return self.playlist_result(
self._entries(videos, username, user_info), user_id, username, thumbnail=thumbnail)

View File

@@ -2,7 +2,7 @@ from .common import InfoExtractor
class UKTVPlayIE(InfoExtractor):
_VALID_URL = r'https?://uktvplay\.uktv\.co\.uk/(?:.+?\?.*?\bvideo=|([^/]+/)*watch-online/)(?P<id>\d+)'
_VALID_URL = r'https?://uktvplay\.(?:uktv\.)?co\.uk/(?:.+?\?.*?\bvideo=|([^/]+/)*watch-online/)(?P<id>\d+)'
_TESTS = [{
'url': 'https://uktvplay.uktv.co.uk/shows/world-at-war/c/200/watch-online/?video=2117008346001',
'info_dict': {

View File

@@ -1131,7 +1131,7 @@ class VimeoChannelIE(VimeoBaseInfoExtractor):
class VimeoUserIE(VimeoChannelIE):
IE_NAME = 'vimeo:user'
_VALID_URL = r'https://vimeo\.com/(?!(?:[0-9]+|watchlater)(?:$|[?#/]))(?P<id>[^/]+)(?:/videos|[#?]|$)'
_VALID_URL = r'https://vimeo\.com/(?!(?:[0-9]+|watchlater)(?:$|[?#/]))(?P<id>[^/]+)(?:/videos)?/?(?:$|[?#])'
_TITLE_RE = r'<a[^>]+?class="user">([^<>]+?)</a>'
_TESTS = [{
'url': 'https://vimeo.com/nkistudio/videos',
@@ -1140,6 +1140,9 @@ class VimeoUserIE(VimeoChannelIE):
'id': 'nkistudio',
},
'playlist_mincount': 66,
}, {
'url': 'https://vimeo.com/nkistudio/',
'only_matching': True,
}]
_BASE_URL_TEMPL = 'https://vimeo.com/%s'

View File

@@ -1,208 +0,0 @@
import functools
import re
import time
from .common import InfoExtractor
from ..aes import aes_cbc_encrypt_bytes
from ..utils import determine_ext, int_or_none, traverse_obj, urljoin
class WeTvBaseIE(InfoExtractor):
_VALID_URL_BASE = r'https?://(?:www\.)?wetv\.vip/(?:[^?#]+/)?play'
def _get_ckey(self, video_id, url, app_version, platform):
ua = self.get_param('http_headers')['User-Agent']
payload = (f'{video_id}|{int(time.time())}|mg3c3b04ba|{app_version}|0000000000000000|'
f'{platform}|{url[:48]}|{ua.lower()[:48]}||Mozilla|Netscape|Win32|00|')
return aes_cbc_encrypt_bytes(
bytes(f'|{sum(map(ord, payload))}|{payload}', 'utf-8'),
b'Ok\xda\xa3\x9e/\x8c\xb0\x7f^r-\x9e\xde\xf3\x14',
b'\x01PJ\xf3V\xe6\x19\xcf.B\xbb\xa6\x8c?p\xf9',
padding_mode='whitespace').hex()
def _get_video_api_response(self, video_url, video_id, series_id, subtitle_format, video_format, video_quality):
app_version = '3.5.57'
platform = '4830201'
ckey = self._get_ckey(video_id, video_url, app_version, platform)
query = {
'vid': video_id,
'cid': series_id,
'cKey': ckey,
'encryptVer': '8.1',
'spcaptiontype': '1' if subtitle_format == 'vtt' else '0', # 0 - SRT, 1 - VTT
'sphls': '1' if video_format == 'hls' else '0', # 0 - MP4, 1 - HLS
'defn': video_quality, # '': 480p, 'shd': 720p, 'fhd': 1080p
'spsrt': '1', # Enable subtitles
'sphttps': '1', # Enable HTTPS
'otype': 'json', # Response format: xml, json,
'dtype': '1',
'spwm': '1',
'host': 'wetv.vip', # These three values are needed for SHD
'referer': 'wetv.vip',
'ehost': video_url,
'appVer': app_version,
'platform': platform,
}
return self._search_json(r'QZOutputJson=', self._download_webpage(
'https://play.wetv.vip/getvinfo', video_id, query=query), 'api_response', video_id)
def _get_webpage_metadata(self, webpage, video_id):
return self._parse_json(
traverse_obj(self._search_nextjs_data(webpage, video_id), ('props', 'pageProps', 'data')),
video_id, fatal=False)
class WeTvEpisodeIE(WeTvBaseIE):
IE_NAME = 'wetv:episode'
_VALID_URL = WeTvBaseIE._VALID_URL_BASE + r'/(?P<series_id>\w+)(?:-[^?#]+)?/(?P<id>\w+)(?:-[^?#]+)?'
_TESTS = [{
'url': 'https://wetv.vip/en/play/air11ooo2rdsdi3-Cute-Programmer/v0040pr89t9-EP1-Cute-Programmer',
'md5': 'a046f565c9dce9b263a0465a422cd7bf',
'info_dict': {
'id': 'v0040pr89t9',
'ext': 'mp4',
'title': 'EP1: Cute Programmer',
'description': 'md5:e87beab3bf9f392d6b9e541a63286343',
'thumbnail': r're:^https?://[^?#]+air11ooo2rdsdi3',
'series': 'Cute Programmer',
'episode': 'Episode 1',
'episode_number': 1,
'duration': 2835,
},
}, {
'url': 'https://wetv.vip/en/play/u37kgfnfzs73kiu/p0039b9nvik',
'md5': '4d9d69bcfd11da61f4aae64fc6b316b3',
'info_dict': {
'id': 'p0039b9nvik',
'ext': 'mp4',
'title': 'EP1: You Are My Glory',
'description': 'md5:831363a4c3b4d7615e1f3854be3a123b',
'thumbnail': r're:^https?://[^?#]+u37kgfnfzs73kiu',
'series': 'You Are My Glory',
'episode': 'Episode 1',
'episode_number': 1,
'duration': 2454,
},
}, {
'url': 'https://wetv.vip/en/play/lcxgwod5hapghvw-WeTV-PICK-A-BOO/i0042y00lxp-Zhao-Lusi-Describes-The-First-Experiences-She-Had-In-Who-Rules-The-World-%7C-WeTV-PICK-A-BOO',
'md5': '71133f5c2d5d6cad3427e1b010488280',
'info_dict': {
'id': 'i0042y00lxp',
'ext': 'mp4',
'title': 'md5:f7a0857dbe5fbbe2e7ad630b92b54e6a',
'description': 'md5:76260cb9cdc0ef76826d7ca9d92fadfa',
'thumbnail': r're:^https?://[^?#]+lcxgwod5hapghvw',
'series': 'WeTV PICK-A-BOO',
'episode': 'Episode 0',
'episode_number': 0,
'duration': 442,
},
}]
def _extract_video_formats_and_subtitles(self, api_response, video_id, video_quality):
video_response = api_response['vl']['vi'][0]
video_width = video_response.get('vw')
video_height = video_response.get('vh')
formats, subtitles = [], {}
for video_format in video_response['ul']['ui']:
if video_format.get('hls'):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
video_format['url'] + video_format['hls']['pname'], video_id, 'mp4', fatal=False)
for f in fmts:
f['width'] = video_width
f['height'] = video_height
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
else:
formats.append({
'url': f'{video_format["url"]}{video_response["fn"]}?vkey={video_response["fvkey"]}',
'width': video_width,
'height': video_height,
'ext': 'mp4',
})
return formats, subtitles
def _extract_video_subtitles(self, api_response, subtitles_format):
subtitles = {}
for subtitle in traverse_obj(api_response, ('sfl', 'fi')):
subtitles.setdefault(subtitle['lang'].lower(), []).append({
'url': subtitle['url'],
'ext': subtitles_format,
'protocol': 'm3u8_native' if determine_ext(subtitle['url']) == 'm3u8' else 'http',
})
return subtitles
def _real_extract(self, url):
video_id, series_id = self._match_valid_url(url).group('id', 'series_id')
webpage = self._download_webpage(url, video_id)
formats, subtitles = [], {}
for video_format, subtitle_format, video_quality in (('mp4', 'srt', ''), ('hls', 'vtt', 'shd'), ('hls', 'vtt', 'fhd')):
api_response = self._get_video_api_response(url, video_id, series_id, subtitle_format, video_format, video_quality)
fmts, subs = self._extract_video_formats_and_subtitles(api_response, video_id, video_quality)
native_subtitles = self._extract_video_subtitles(api_response, subtitle_format)
formats.extend(fmts)
self._merge_subtitles(subs, native_subtitles, target=subtitles)
self._sort_formats(formats)
webpage_metadata = self._get_webpage_metadata(webpage, video_id)
return {
'id': video_id,
'title': (self._og_search_title(webpage)
or traverse_obj(webpage_metadata, ('coverInfo', 'description'))),
'description': (self._og_search_description(webpage)
or traverse_obj(webpage_metadata, ('coverInfo', 'description'))),
'formats': formats,
'subtitles': subtitles,
'thumbnail': self._og_search_thumbnail(webpage),
'duration': int_or_none(traverse_obj(webpage_metadata, ('videoInfo', 'duration'))),
'series': traverse_obj(webpage_metadata, ('coverInfo', 'title')),
'episode_number': int_or_none(traverse_obj(webpage_metadata, ('videoInfo', 'episode'))),
}
class WeTvSeriesIE(WeTvBaseIE):
_VALID_URL = WeTvBaseIE._VALID_URL_BASE + r'/(?P<id>\w+)(?:-[^/?#]+)?/?(?:[?#]|$)'
_TESTS = [{
'url': 'https://wetv.vip/play/air11ooo2rdsdi3-Cute-Programmer',
'info_dict': {
'id': 'air11ooo2rdsdi3',
'title': 'Cute Programmer',
'description': 'md5:e87beab3bf9f392d6b9e541a63286343',
},
'playlist_count': 30,
}, {
'url': 'https://wetv.vip/en/play/u37kgfnfzs73kiu-You-Are-My-Glory',
'info_dict': {
'id': 'u37kgfnfzs73kiu',
'title': 'You Are My Glory',
'description': 'md5:831363a4c3b4d7615e1f3854be3a123b',
},
'playlist_count': 32,
}]
def _real_extract(self, url):
series_id = self._match_id(url)
webpage = self._download_webpage(url, series_id)
webpage_metadata = self._get_webpage_metadata(webpage, series_id)
episode_paths = (re.findall(r'<a[^>]+class="play-video__link"[^>]+href="(?P<path>[^"]+)', webpage)
or [f'/{series_id}/{episode["vid"]}' for episode in webpage_metadata.get('videoList')])
return self.playlist_from_matches(
episode_paths, series_id, ie=WeTvEpisodeIE, getter=functools.partial(urljoin, url),
title=traverse_obj(webpage_metadata, ('coverInfo', 'title')) or self._og_search_title(webpage),
description=traverse_obj(webpage_metadata, ('coverInfo', 'description')) or self._og_search_description(webpage))

View File

@@ -17,6 +17,7 @@ import urllib.error
import urllib.parse
from .common import InfoExtractor, SearchInfoExtractor
from .openload import PhantomJSwrapper
from ..compat import functools
from ..jsinterp import JSInterpreter
from ..utils import (
@@ -109,8 +110,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'ANDROID',
'clientVersion': '17.29.34',
'androidSdkVersion': 30
'clientVersion': '17.31.35',
'androidSdkVersion': 30,
'userAgent': 'com.google.android.youtube/17.31.35 (Linux; U; Android 11) gzip'
}
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 3,
@@ -121,8 +123,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'ANDROID_EMBEDDED_PLAYER',
'clientVersion': '17.29.34',
'androidSdkVersion': 30
'clientVersion': '17.31.35',
'androidSdkVersion': 30,
'userAgent': 'com.google.android.youtube/17.31.35 (Linux; U; Android 11) gzip'
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 55,
@@ -134,7 +137,8 @@ INNERTUBE_CLIENTS = {
'client': {
'clientName': 'ANDROID_MUSIC',
'clientVersion': '5.16.51',
'androidSdkVersion': 30
'androidSdkVersion': 30,
'userAgent': 'com.google.android.apps.youtube.music/5.16.51 (Linux; U; Android 11) gzip'
}
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 21,
@@ -145,8 +149,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'ANDROID_CREATOR',
'clientVersion': '22.28.100',
'androidSdkVersion': 30
'clientVersion': '22.30.100',
'androidSdkVersion': 30,
'userAgent': 'com.google.android.apps.youtube.creator/22.30.100 (Linux; U; Android 11) gzip'
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 14,
@@ -159,8 +164,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'IOS',
'clientVersion': '17.30.1',
'clientVersion': '17.33.2',
'deviceModel': 'iPhone14,3',
'userAgent': 'com.google.ios.youtube/17.33.2 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
}
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 5,
@@ -170,8 +176,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'IOS_MESSAGES_EXTENSION',
'clientVersion': '17.30.1',
'clientVersion': '17.33.2',
'deviceModel': 'iPhone14,3',
'userAgent': 'com.google.ios.youtube/17.33.2 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 66,
@@ -182,7 +189,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'IOS_MUSIC',
'clientVersion': '5.18',
'clientVersion': '5.21',
'deviceModel': 'iPhone14,3',
'userAgent': 'com.google.ios.youtubemusic/5.21 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 26,
@@ -192,7 +201,9 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'IOS_CREATOR',
'clientVersion': '22.29.101',
'clientVersion': '22.33.101',
'deviceModel': 'iPhone14,3',
'userAgent': 'com.google.ios.ytcreator/22.33.101 (iPhone14,3; U; CPU iOS 15_6 like Mac OS X)'
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 15,
@@ -554,7 +565,8 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
'Origin': origin,
'X-Youtube-Identity-Token': identity_token or self._extract_identity_token(ytcfg),
'X-Goog-PageId': account_syncid or self._extract_account_syncid(ytcfg),
'X-Goog-Visitor-Id': visitor_data or self._extract_visitor_data(ytcfg)
'X-Goog-Visitor-Id': visitor_data or self._extract_visitor_data(ytcfg),
'User-Agent': self._ytcfg_get_safe(ytcfg, lambda x: x['INNERTUBE_CONTEXT']['client']['userAgent'], default_client=default_client)
}
if session_index is None:
session_index = self._extract_session_index(ytcfg)
@@ -809,7 +821,7 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
# Youtube sometimes sends incomplete data
# See: https://github.com/ytdl-org/youtube-dl/issues/28194
if not traverse_obj(response, *variadic(check_get_keys)):
retry.error = ExtractorError('Incomplete data received')
retry.error = ExtractorError('Incomplete data received', expected=True)
continue
return response
@@ -867,7 +879,7 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
else None),
'live_status': ('is_upcoming' if scheduled_timestamp is not None
else 'was_live' if 'streamed' in time_text.lower()
else 'is_live' if overlay_style is not None and overlay_style == 'LIVE' or 'live now' in badges
else 'is_live' if overlay_style == 'LIVE' or 'live now' in badges
else None),
'release_timestamp': scheduled_timestamp,
'availability': self._availability(needs_premium='premium' in badges, needs_subscription='members only' in badges)
@@ -2147,6 +2159,35 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'comment_count': int,
'channel_follower_count': int
}
}, {
# Same video as above, but with --compat-opt no-youtube-prefer-utc-upload-date
'url': 'https://www.youtube.com/watch?v=2NUZ8W2llS4',
'info_dict': {
'id': '2NUZ8W2llS4',
'ext': 'mp4',
'title': 'The NP that test your phone performance 🙂',
'description': 'md5:144494b24d4f9dfacb97c1bbef5de84d',
'uploader': 'Leon Nguyen',
'uploader_id': 'VNSXIII',
'uploader_url': 'http://www.youtube.com/user/VNSXIII',
'channel_id': 'UCRqNBSOHgilHfAczlUmlWHA',
'channel_url': 'https://www.youtube.com/channel/UCRqNBSOHgilHfAczlUmlWHA',
'duration': 21,
'view_count': int,
'age_limit': 0,
'categories': ['Gaming'],
'tags': 'count:23',
'playable_in_embed': True,
'live_status': 'not_live',
'upload_date': '20220102',
'like_count': int,
'availability': 'public',
'channel': 'Leon Nguyen',
'thumbnail': 'https://i.ytimg.com/vi_webp/2NUZ8W2llS4/maxresdefault.webp',
'comment_count': int,
'channel_follower_count': int
},
'params': {'compat_opts': ['no-youtube-prefer-utc-upload-date']}
}, {
# date text is premiered video, ensure upload date in UTC (published 1641172509)
'url': 'https://www.youtube.com/watch?v=mzZzzBU6lrM',
@@ -2512,20 +2553,17 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
assert os.path.basename(func_id) == func_id
self.write_debug(f'Extracting signature function {func_id}')
cache_spec = self.cache.load('youtube-sigfuncs', func_id)
if cache_spec is not None:
return lambda s: ''.join(s[i] for i in cache_spec)
cache_spec, code = self.cache.load('youtube-sigfuncs', func_id), None
code = self._load_player(video_id, player_url)
if not cache_spec:
code = self._load_player(video_id, player_url)
if code:
res = self._parse_sig_js(code)
test_string = ''.join(map(chr, range(len(example_sig))))
cache_res = res(test_string)
cache_spec = [ord(c) for c in cache_res]
cache_spec = [ord(c) for c in res(test_string)]
self.cache.store('youtube-sigfuncs', func_id, cache_spec)
return res
return lambda s: ''.join(s[i] for i in cache_spec)
def _print_sig_code(self, func, example_sig):
if not self.get_param('youtube_print_sig_code'):
@@ -2593,18 +2631,29 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
initial_function = jsi.extract_function(funcname)
return lambda s: initial_function([s])
def _cached(self, func, *cache_id):
def inner(*args, **kwargs):
if cache_id not in self._player_cache:
try:
self._player_cache[cache_id] = func(*args, **kwargs)
except ExtractorError as e:
self._player_cache[cache_id] = e
except Exception as e:
self._player_cache[cache_id] = ExtractorError(traceback.format_exc(), cause=e)
ret = self._player_cache[cache_id]
if isinstance(ret, Exception):
raise ret
return ret
return inner
def _decrypt_signature(self, s, video_id, player_url):
"""Turn the encrypted s field into a working signature"""
try:
player_id = (player_url, self._signature_cache_id(s))
if player_id not in self._player_cache:
func = self._extract_signature_function(video_id, player_url, s)
self._player_cache[player_id] = func
func = self._player_cache[player_id]
self._print_sig_code(func, s)
return func(s)
except Exception as e:
raise ExtractorError(traceback.format_exc(), cause=e, video_id=video_id)
extract_sig = self._cached(
self._extract_signature_function, 'sig', player_url, self._signature_cache_id(s))
func = extract_sig(video_id, player_url, s)
self._print_sig_code(func, s)
return func(s)
def _decrypt_nsig(self, s, video_id, player_url):
"""Turn the encrypted n field into a working signature"""
@@ -2612,49 +2661,87 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
raise ExtractorError('Cannot decrypt nsig without player_url')
player_url = urljoin('https://www.youtube.com', player_url)
sig_id = ('nsig_value', s)
if sig_id in self._player_cache:
return self._player_cache[sig_id]
try:
player_id = ('nsig', player_url)
if player_id not in self._player_cache:
self._player_cache[player_id] = self._extract_n_function(video_id, player_url)
func = self._player_cache[player_id]
self._player_cache[sig_id] = func(s)
self.write_debug(f'Decrypted nsig {s} => {self._player_cache[sig_id]}')
return self._player_cache[sig_id]
except Exception as e:
raise ExtractorError(traceback.format_exc(), cause=e, video_id=video_id)
def _extract_n_function_name(self, jscode):
nfunc, idx = self._search_regex(
r'\.get\("n"\)\)&&\(b=(?P<nfunc>[a-zA-Z0-9$]+)(?:\[(?P<idx>\d+)\])?\([a-zA-Z0-9]\)',
jscode, 'Initial JS player n function name', group=('nfunc', 'idx'))
if not idx:
return nfunc
return json.loads(js_to_json(self._search_regex(
rf'var {re.escape(nfunc)}\s*=\s*(\[.+?\]);', jscode,
f'Initial JS player n function list ({nfunc}.{idx})')))[int(idx)]
def _extract_n_function(self, video_id, player_url):
player_id = self._extract_player_info(player_url)
func_code = self.cache.load('youtube-nsig', player_id)
if func_code:
jsi = JSInterpreter(func_code)
else:
jscode = self._load_player(video_id, player_url)
funcname = self._extract_n_function_name(jscode)
jsi = JSInterpreter(jscode)
func_code = jsi.extract_function_code(funcname)
self.cache.store('youtube-nsig', player_id, func_code)
jsi, player_id, func_code = self._extract_n_function_code(video_id, player_url)
except ExtractorError as e:
raise ExtractorError('Unable to extract nsig function code', cause=e)
if self.get_param('youtube_print_sig_code'):
self.to_screen(f'Extracted nsig function from {player_id}:\n{func_code[1]}\n')
try:
extract_nsig = self._cached(self._extract_n_function_from_code, 'nsig func', player_url)
ret = extract_nsig(jsi, func_code)(s)
except JSInterpreter.Exception as e:
try:
jsi = PhantomJSwrapper(self, timeout=5000)
except ExtractorError:
raise e
self.report_warning(
f'Native nsig extraction failed: Trying with PhantomJS\n'
f' n = {s} ; player = {player_url}', video_id)
self.write_debug(e)
args, func_body = func_code
ret = jsi.execute(
f'console.log(function({", ".join(args)}) {{ {func_body} }}({s!r}));',
video_id=video_id, note='Executing signature code').strip()
self.write_debug(f'Decrypted nsig {s} => {ret}')
return ret
def _extract_n_function_name(self, jscode):
funcname, idx = self._search_regex(
r'\.get\("n"\)\)&&\(b=(?P<nfunc>[a-zA-Z0-9$]+)(?:\[(?P<idx>\d+)\])?\([a-zA-Z0-9]\)',
jscode, 'Initial JS player n function name', group=('nfunc', 'idx'))
if not idx:
return funcname
return json.loads(js_to_json(self._search_regex(
rf'var {re.escape(funcname)}\s*=\s*(\[.+?\]);', jscode,
f'Initial JS player n function list ({funcname}.{idx})')))[int(idx)]
def _extract_n_function_code(self, video_id, player_url):
player_id = self._extract_player_info(player_url)
func_code = self.cache.load('youtube-nsig', player_id, min_ver='2022.09.1')
jscode = func_code or self._load_player(video_id, player_url)
jsi = JSInterpreter(jscode)
if func_code:
return jsi, player_id, func_code
func_name = self._extract_n_function_name(jscode)
# For redundancy
func_code = self._search_regex(
r'''(?xs)%s\s*=\s*function\s*\((?P<var>[\w$]+)\)\s*
# NB: The end of the regex is intentionally kept strict
{(?P<code>.+?}\s*return\ [\w$]+.join\(""\))};''' % func_name,
jscode, 'nsig function', group=('var', 'code'), default=None)
if func_code:
func_code = ([func_code[0]], func_code[1])
else:
self.write_debug('Extracting nsig function with jsinterp')
func_code = jsi.extract_function_code(func_name)
self.cache.store('youtube-nsig', player_id, func_code)
return jsi, player_id, func_code
def _extract_n_function_from_code(self, jsi, func_code):
func = jsi.extract_function_from_code(*func_code)
return lambda s: func([s])
def extract_nsig(s):
try:
ret = func([s])
except JSInterpreter.Exception:
raise
except Exception as e:
raise JSInterpreter.Exception(traceback.format_exc(), cause=e)
if ret.startswith('enhanced_except_'):
raise JSInterpreter.Exception('Signature function returned an exception')
return ret
return extract_nsig
def _extract_signature_timestamp(self, video_id, player_url, ytcfg=None, fatal=False):
"""
@@ -2917,8 +3004,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# YouTube comments have a max depth of 2
max_depth = int_or_none(get_single_config_arg('max_comment_depth'))
if max_depth:
self._downloader.deprecation_warning(
'[youtube] max_comment_depth extractor argument is deprecated. Set max replies in the max-comments extractor argument instead.')
self._downloader.deprecated_feature('[youtube] max_comment_depth extractor argument is deprecated. '
'Set max replies in the max-comments extractor argument instead')
if max_depth == 1 and parent:
return
@@ -3040,7 +3127,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def _is_unplayable(player_response):
return traverse_obj(player_response, ('playabilityStatus', 'status')) == 'UNPLAYABLE'
def _extract_player_response(self, client, video_id, master_ytcfg, player_ytcfg, player_url, initial_pr):
_STORY_PLAYER_PARAMS = '8AEB'
def _extract_player_response(self, client, video_id, master_ytcfg, player_ytcfg, player_url, initial_pr, smuggled_data):
session_index = self._extract_session_index(player_ytcfg, master_ytcfg)
syncid = self._extract_account_syncid(player_ytcfg, master_ytcfg, initial_pr)
@@ -3050,8 +3139,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
yt_query = {
'videoId': video_id,
'params': '8AEB' # enable stories
}
if smuggled_data.get('is_story') or _split_innertube_client(client)[0] == 'android':
yt_query['params'] = self._STORY_PLAYER_PARAMS
yt_query.update(self._generate_player_context(sts))
return self._extract_response(
item_id=video_id, ep='player', query=yt_query,
@@ -3084,7 +3175,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
return orderedSet(requested_clients)
def _extract_player_responses(self, clients, video_id, webpage, master_ytcfg):
def _extract_player_responses(self, clients, video_id, webpage, master_ytcfg, smuggled_data):
initial_pr = None
if webpage:
initial_pr = self._search_json(
@@ -3134,7 +3225,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
try:
pr = initial_pr if client == 'web' and initial_pr else self._extract_player_response(
client, video_id, player_ytcfg or master_ytcfg, player_ytcfg, player_url if require_js_player else None, initial_pr)
client, video_id, player_ytcfg or master_ytcfg, player_ytcfg, player_url if require_js_player else None, initial_pr, smuggled_data)
except ExtractorError as e:
if last_error:
self.report_warning(last_error)
@@ -3168,7 +3259,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def _extract_formats_and_subtitles(self, streaming_data, video_id, player_url, is_live, duration):
itags, stream_ids = {}, []
itag_qualities, res_qualities = {}, {}
itag_qualities, res_qualities = {}, {0: None}
q = qualities([
# Normally tiny is the smallest video-only formats. But
# audio-only formats with unknown quality may get tagged as tiny
@@ -3220,7 +3311,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
self._decrypt_signature(encrypted_sig, video_id, player_url)
)
except ExtractorError as e:
self.report_warning('Signature extraction failed: Some formats may be missing', only_once=True)
self.report_warning('Signature extraction failed: Some formats may be missing',
video_id=video_id, only_once=True)
self.write_debug(e, only_once=True)
continue
@@ -3228,12 +3320,18 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
throttled = False
if query.get('n'):
try:
decrypt_nsig = self._cached(self._decrypt_nsig, 'nsig', query['n'][0])
fmt_url = update_url_query(fmt_url, {
'n': self._decrypt_nsig(query['n'][0], video_id, player_url)})
'n': decrypt_nsig(query['n'][0], video_id, player_url)
})
except ExtractorError as e:
phantomjs_hint = ''
if isinstance(e, JSInterpreter.Exception):
phantomjs_hint = (f' Install {self._downloader._format_err("PhantomJS", self._downloader.Styles.EMPHASIS)} '
f'to workaround the issue. {PhantomJSwrapper.INSTALL_HINT}\n')
self.report_warning(
'nsig extraction failed: You may experience throttling for some formats\n'
f'n = {query["n"][0]} ; player = {player_url}', only_once=True)
f'nsig extraction failed: You may experience throttling for some formats\n{phantomjs_hint}'
f' n = {query["n"][0]} ; player = {player_url}', video_id=video_id, only_once=True)
self.write_debug(e, only_once=True)
throttled = True
@@ -3320,10 +3418,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
f['format_id'] = itag
itags[itag] = proto
f['quality'] = next((
q(qdict[val])
for val, qdict in ((f.get('format_id', '').split('-')[0], itag_qualities), (f.get('height'), res_qualities))
if val in qdict), -1)
f['quality'] = q(itag_qualities.get(try_get(f, lambda f: f['format_id'].split('-')[0]), -1))
if f['quality'] == -1 and f.get('height'):
f['quality'] = q(res_qualities[min(res_qualities, key=lambda x: abs(x - f['height']))])
return True
subtitles = {}
@@ -3392,14 +3489,17 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def _download_player_responses(self, url, smuggled_data, video_id, webpage_url):
webpage = None
if 'webpage' not in self._configuration_arg('player_skip'):
query = {'bpctr': '9999999999', 'has_verified': '1'}
if smuggled_data.get('is_story'):
query['pp'] = self._STORY_PLAYER_PARAMS
webpage = self._download_webpage(
webpage_url + '&bpctr=9999999999&has_verified=1&pp=8AEB', video_id, fatal=False)
webpage_url, video_id, fatal=False, query=query)
master_ytcfg = self.extract_ytcfg(video_id, webpage) or self._get_default_ytcfg()
player_responses, player_url = self._extract_player_responses(
self._get_requested_clients(url, smuggled_data),
video_id, webpage, master_ytcfg)
video_id, webpage, master_ytcfg, smuggled_data)
return webpage, master_ytcfg, player_responses, player_url
@@ -3865,7 +3965,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
upload_date = (
unified_strdate(get_first(microformats, 'uploadDate'))
or unified_strdate(search_meta('uploadDate')))
if not upload_date or (not info.get('is_live') and not info.get('was_live') and info.get('live_status') != 'is_upcoming'):
if not upload_date or (
not info.get('is_live')
and not info.get('was_live')
and info.get('live_status') != 'is_upcoming'
and 'no-youtube-prefer-utc-upload-date' not in self.get_param('compat_opts', [])
):
upload_date = strftime_or_none(self._extract_time_text(vpir, 'dateText')[0], '%Y%m%d') or upload_date
info['upload_date'] = upload_date
@@ -5972,7 +6077,7 @@ class YoutubeStoriesIE(InfoExtractor):
def _real_extract(self, url):
playlist_id = f'RLTD{self._match_id(url)}'
return self.url_result(
f'https://www.youtube.com/playlist?list={playlist_id}&playnext=1',
smuggle_url(f'https://www.youtube.com/playlist?list={playlist_id}&playnext=1', {'is_story': True}),
ie=YoutubeTabIE, video_id=playlist_id)

View File

@@ -236,32 +236,24 @@ class ZattooPlatformBaseIE(InfoExtractor):
def _real_extract(self, url):
video_id, record_id = self._match_valid_url(url).groups()
return self._extract_video(video_id, record_id)
return getattr(self, f'_extract_{self._TYPE}')(video_id or record_id)
def _make_valid_url(host):
return rf'https?://(?:www\.)?{re.escape(host)}/watch/[^/]+?/(?P<id>[0-9]+)[^/]+(?:/(?P<recid>[0-9]+))?'
def _create_valid_url(host, match, qs, base_re=None):
match_base = fr'|{base_re}/(?P<vid1>{match})' if base_re else '(?P<vid1>)'
return rf'''(?x)https?://(?:www\.)?{re.escape(host)}/(?:
[^?#]+\?(?:[^#]+&)?{qs}=(?P<vid2>{match})
{match_base}
)'''
class ZattooBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'zattoo'
_HOST = 'zattoo.com'
@staticmethod
def _create_valid_url(match, qs, base_re=None):
match_base = fr'|{base_re}/(?P<vid1>{match})' if base_re else '(?P<vid1>)'
return rf'''(?x)https?://(?:www\.)?zattoo\.com/(?:
[^?#]+\?(?:[^#]+&)?{qs}=(?P<vid2>{match})
{match_base}
)'''
def _real_extract(self, url):
vid1, vid2 = self._match_valid_url(url).group('vid1', 'vid2')
return getattr(self, f'_extract_{self._TYPE}')(vid1 or vid2)
class ZattooIE(ZattooBaseIE):
_VALID_URL = ZattooBaseIE._create_valid_url(r'\d+', 'program', '(?:program|watch)/[^/]+')
_VALID_URL = _create_valid_url(ZattooBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://zattoo.com/program/zdf/250170418',
@@ -288,7 +280,7 @@ class ZattooIE(ZattooBaseIE):
class ZattooLiveIE(ZattooBaseIE):
_VALID_URL = ZattooBaseIE._create_valid_url(r'[^/?&#]+', 'channel', 'live')
_VALID_URL = _create_valid_url(ZattooBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://zattoo.com/channels/german?channel=srf_zwei',
@@ -304,7 +296,7 @@ class ZattooLiveIE(ZattooBaseIE):
class ZattooMoviesIE(ZattooBaseIE):
_VALID_URL = ZattooBaseIE._create_valid_url(r'\w+', 'movie_id', 'vod/movies')
_VALID_URL = _create_valid_url(ZattooBaseIE._HOST, r'\w+', 'movie_id', 'vod/movies')
_TYPE = 'ondemand'
_TESTS = [{
'url': 'https://zattoo.com/vod/movies/7521',
@@ -316,7 +308,7 @@ class ZattooMoviesIE(ZattooBaseIE):
class ZattooRecordingsIE(ZattooBaseIE):
_VALID_URL = ZattooBaseIE._create_valid_url(r'\d+', 'recording')
_VALID_URL = _create_valid_url('zattoo.com', r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://zattoo.com/recordings?recording=193615508',
@@ -327,139 +319,547 @@ class ZattooRecordingsIE(ZattooBaseIE):
}]
class NetPlusIE(ZattooPlatformBaseIE):
class NetPlusTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'netplus'
_HOST = 'netplus.tv'
_API_HOST = 'www.%s' % _HOST
_VALID_URL = _make_valid_url(_HOST)
class NetPlusTVIE(NetPlusTVBaseIE):
_VALID_URL = _create_valid_url(NetPlusTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://www.netplus.tv/watch/abc/123-abc',
'url': 'https://netplus.tv/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://netplus.tv/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class MNetTVIE(ZattooPlatformBaseIE):
class NetPlusTVLiveIE(NetPlusTVBaseIE):
_VALID_URL = _create_valid_url(NetPlusTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://netplus.tv/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://netplus.tv/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if NetPlusTVIE.suitable(url) else super().suitable(url)
class NetPlusTVRecordingsIE(NetPlusTVBaseIE):
_VALID_URL = _create_valid_url(NetPlusTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://netplus.tv/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://netplus.tv/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class MNetTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'mnettv'
_HOST = 'tvplus.m-net.de'
_VALID_URL = _make_valid_url(_HOST)
class MNetTVIE(MNetTVBaseIE):
_VALID_URL = _create_valid_url(MNetTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://tvplus.m-net.de/watch/abc/123-abc',
'url': 'https://tvplus.m-net.de/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://tvplus.m-net.de/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class WalyTVIE(ZattooPlatformBaseIE):
class MNetTVLiveIE(MNetTVBaseIE):
_VALID_URL = _create_valid_url(MNetTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://tvplus.m-net.de/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://tvplus.m-net.de/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if MNetTVIE.suitable(url) else super().suitable(url)
class MNetTVRecordingsIE(MNetTVBaseIE):
_VALID_URL = _create_valid_url(MNetTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://tvplus.m-net.de/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://tvplus.m-net.de/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class WalyTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'walytv'
_HOST = 'player.waly.tv'
_VALID_URL = _make_valid_url(_HOST)
class WalyTVIE(WalyTVBaseIE):
_VALID_URL = _create_valid_url(WalyTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://player.waly.tv/watch/abc/123-abc',
'url': 'https://player.waly.tv/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://player.waly.tv/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class BBVTVIE(ZattooPlatformBaseIE):
class WalyTVLiveIE(WalyTVBaseIE):
_VALID_URL = _create_valid_url(WalyTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://player.waly.tv/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://player.waly.tv/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if WalyTVIE.suitable(url) else super().suitable(url)
class WalyTVRecordingsIE(WalyTVBaseIE):
_VALID_URL = _create_valid_url(WalyTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://player.waly.tv/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://player.waly.tv/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class BBVTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'bbvtv'
_HOST = 'bbv-tv.net'
_API_HOST = 'www.%s' % _HOST
_VALID_URL = _make_valid_url(_HOST)
class BBVTVIE(BBVTVBaseIE):
_VALID_URL = _create_valid_url(BBVTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://www.bbv-tv.net/watch/abc/123-abc',
'url': 'https://bbv-tv.net/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://bbv-tv.net/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class VTXTVIE(ZattooPlatformBaseIE):
class BBVTVLiveIE(BBVTVBaseIE):
_VALID_URL = _create_valid_url(BBVTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://bbv-tv.net/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://bbv-tv.net/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if BBVTVIE.suitable(url) else super().suitable(url)
class BBVTVRecordingsIE(BBVTVBaseIE):
_VALID_URL = _create_valid_url(BBVTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://bbv-tv.net/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://bbv-tv.net/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class VTXTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'vtxtv'
_HOST = 'vtxtv.ch'
_API_HOST = 'www.%s' % _HOST
_VALID_URL = _make_valid_url(_HOST)
class VTXTVIE(VTXTVBaseIE):
_VALID_URL = _create_valid_url(VTXTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://www.vtxtv.ch/watch/abc/123-abc',
'url': 'https://vtxtv.ch/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://vtxtv.ch/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class GlattvisionTVIE(ZattooPlatformBaseIE):
class VTXTVLiveIE(VTXTVBaseIE):
_VALID_URL = _create_valid_url(VTXTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://vtxtv.ch/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://vtxtv.ch/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if VTXTVIE.suitable(url) else super().suitable(url)
class VTXTVRecordingsIE(VTXTVBaseIE):
_VALID_URL = _create_valid_url(VTXTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://vtxtv.ch/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://vtxtv.ch/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class GlattvisionTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'glattvisiontv'
_HOST = 'iptv.glattvision.ch'
_VALID_URL = _make_valid_url(_HOST)
class GlattvisionTVIE(GlattvisionTVBaseIE):
_VALID_URL = _create_valid_url(GlattvisionTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://iptv.glattvision.ch/watch/abc/123-abc',
'url': 'https://iptv.glattvision.ch/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://iptv.glattvision.ch/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class SAKTVIE(ZattooPlatformBaseIE):
class GlattvisionTVLiveIE(GlattvisionTVBaseIE):
_VALID_URL = _create_valid_url(GlattvisionTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://iptv.glattvision.ch/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://iptv.glattvision.ch/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if GlattvisionTVIE.suitable(url) else super().suitable(url)
class GlattvisionTVRecordingsIE(GlattvisionTVBaseIE):
_VALID_URL = _create_valid_url(GlattvisionTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://iptv.glattvision.ch/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://iptv.glattvision.ch/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class SAKTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'saktv'
_HOST = 'saktv.ch'
_API_HOST = 'www.%s' % _HOST
_VALID_URL = _make_valid_url(_HOST)
class SAKTVIE(SAKTVBaseIE):
_VALID_URL = _create_valid_url(SAKTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://www.saktv.ch/watch/abc/123-abc',
'url': 'https://saktv.ch/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://saktv.ch/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class EWETVIE(ZattooPlatformBaseIE):
class SAKTVLiveIE(SAKTVBaseIE):
_VALID_URL = _create_valid_url(SAKTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://saktv.ch/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://saktv.ch/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if SAKTVIE.suitable(url) else super().suitable(url)
class SAKTVRecordingsIE(SAKTVBaseIE):
_VALID_URL = _create_valid_url(SAKTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://saktv.ch/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://saktv.ch/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class EWETVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'ewetv'
_HOST = 'tvonline.ewe.de'
_VALID_URL = _make_valid_url(_HOST)
class EWETVIE(EWETVBaseIE):
_VALID_URL = _create_valid_url(EWETVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://tvonline.ewe.de/watch/abc/123-abc',
'url': 'https://tvonline.ewe.de/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://tvonline.ewe.de/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class QuantumTVIE(ZattooPlatformBaseIE):
class EWETVLiveIE(EWETVBaseIE):
_VALID_URL = _create_valid_url(EWETVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://tvonline.ewe.de/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://tvonline.ewe.de/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if EWETVIE.suitable(url) else super().suitable(url)
class EWETVRecordingsIE(EWETVBaseIE):
_VALID_URL = _create_valid_url(EWETVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://tvonline.ewe.de/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://tvonline.ewe.de/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class QuantumTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'quantumtv'
_HOST = 'quantum-tv.com'
_API_HOST = 'www.%s' % _HOST
_VALID_URL = _make_valid_url(_HOST)
class QuantumTVIE(QuantumTVBaseIE):
_VALID_URL = _create_valid_url(QuantumTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://www.quantum-tv.com/watch/abc/123-abc',
'url': 'https://quantum-tv.com/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://quantum-tv.com/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class OsnatelTVIE(ZattooPlatformBaseIE):
class QuantumTVLiveIE(QuantumTVBaseIE):
_VALID_URL = _create_valid_url(QuantumTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://quantum-tv.com/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://quantum-tv.com/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if QuantumTVIE.suitable(url) else super().suitable(url)
class QuantumTVRecordingsIE(QuantumTVBaseIE):
_VALID_URL = _create_valid_url(QuantumTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://quantum-tv.com/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://quantum-tv.com/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class OsnatelTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'osnateltv'
_HOST = 'tvonline.osnatel.de'
_VALID_URL = _make_valid_url(_HOST)
class OsnatelTVIE(OsnatelTVBaseIE):
_VALID_URL = _create_valid_url(OsnatelTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://tvonline.osnatel.de/watch/abc/123-abc',
'url': 'https://tvonline.osnatel.de/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://tvonline.osnatel.de/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class EinsUndEinsTVIE(ZattooPlatformBaseIE):
class OsnatelTVLiveIE(OsnatelTVBaseIE):
_VALID_URL = _create_valid_url(OsnatelTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://tvonline.osnatel.de/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://tvonline.osnatel.de/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if OsnatelTVIE.suitable(url) else super().suitable(url)
class OsnatelTVRecordingsIE(OsnatelTVBaseIE):
_VALID_URL = _create_valid_url(OsnatelTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://tvonline.osnatel.de/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://tvonline.osnatel.de/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class EinsUndEinsTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = '1und1tv'
_HOST = '1und1.tv'
_API_HOST = 'www.%s' % _HOST
_VALID_URL = _make_valid_url(_HOST)
class EinsUndEinsTVIE(EinsUndEinsTVBaseIE):
_VALID_URL = _create_valid_url(EinsUndEinsTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://www.1und1.tv/watch/abc/123-abc',
'url': 'https://1und1.tv/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://1und1.tv/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class SaltTVIE(ZattooPlatformBaseIE):
class EinsUndEinsTVLiveIE(EinsUndEinsTVBaseIE):
_VALID_URL = _create_valid_url(EinsUndEinsTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://1und1.tv/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://1und1.tv/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if EinsUndEinsTVIE.suitable(url) else super().suitable(url)
class EinsUndEinsTVRecordingsIE(EinsUndEinsTVBaseIE):
_VALID_URL = _create_valid_url(EinsUndEinsTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://1und1.tv/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://1und1.tv/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]
class SaltTVBaseIE(ZattooPlatformBaseIE):
_NETRC_MACHINE = 'salttv'
_HOST = 'tv.salt.ch'
_VALID_URL = _make_valid_url(_HOST)
class SaltTVIE(SaltTVBaseIE):
_VALID_URL = _create_valid_url(SaltTVBaseIE._HOST, r'\d+', 'program', '(?:program|watch)/[^/]+')
_TYPE = 'video'
_TESTS = [{
'url': 'https://tv.salt.ch/watch/abc/123-abc',
'url': 'https://tv.salt.ch/program/daserste/210177916',
'only_matching': True,
}, {
'url': 'https://tv.salt.ch/guide/german?channel=srf1&program=169860555',
'only_matching': True,
}]
class SaltTVLiveIE(SaltTVBaseIE):
_VALID_URL = _create_valid_url(SaltTVBaseIE._HOST, r'[^/?&#]+', 'channel', 'live')
_TYPE = 'live'
_TESTS = [{
'url': 'https://tv.salt.ch/channels/german?channel=srf_zwei',
'only_matching': True,
}, {
'url': 'https://tv.salt.ch/live/srf1',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if SaltTVIE.suitable(url) else super().suitable(url)
class SaltTVRecordingsIE(SaltTVBaseIE):
_VALID_URL = _create_valid_url(SaltTVBaseIE._HOST, r'\d+', 'recording')
_TYPE = 'record'
_TESTS = [{
'url': 'https://tv.salt.ch/recordings?recording=193615508',
'only_matching': True,
}, {
'url': 'https://tv.salt.ch/tc/ptc_recordings_all_recordings?recording=193615420',
'only_matching': True,
}]

View File

@@ -16,50 +16,72 @@ from .utils import (
write_string,
)
_NAME_RE = r'[a-zA-Z_$][\w$]*'
# Ref: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Operator_Precedence
_OPERATORS = { # None => Defined in JSInterpreter._operator
'?': None,
def _js_bit_op(op):
def zeroise(x):
return 0 if x in (None, JS_Undefined) else x
'||': None,
'&&': None,
'&': operator.and_,
'|': operator.or_,
'^': operator.xor,
def wrapped(a, b):
return op(zeroise(a), zeroise(b)) & 0xffffffff
'===': operator.is_,
'!==': operator.is_not,
'==': operator.eq,
'!=': operator.ne,
'<=': operator.le,
'>=': operator.ge,
'<': operator.lt,
'>': operator.gt,
'>>': operator.rshift,
'<<': operator.lshift,
'+': operator.add,
'-': operator.sub,
'*': operator.mul,
'/': operator.truediv,
'%': operator.mod,
'**': operator.pow,
}
_COMP_OPERATORS = {'===', '!==', '==', '!=', '<=', '>=', '<', '>'}
_MATCHING_PARENS = dict(zip('({[', ')}]'))
_QUOTES = '\'"'
return wrapped
def _ternary(cndn, if_true=True, if_false=False):
def _js_arith_op(op):
def wrapped(a, b):
if JS_Undefined in (a, b):
return float('nan')
return op(a or 0, b or 0)
return wrapped
def _js_div(a, b):
if JS_Undefined in (a, b) or not (a and b):
return float('nan')
return (a or 0) / b if b else float('inf')
def _js_mod(a, b):
if JS_Undefined in (a, b) or not b:
return float('nan')
return (a or 0) % b
def _js_exp(a, b):
if not b:
return 1 # even 0 ** 0 !!
elif JS_Undefined in (a, b):
return float('nan')
return (a or 0) ** b
def _js_eq_op(op):
def wrapped(a, b):
if {a, b} <= {None, JS_Undefined}:
return op(a, a)
return op(a, b)
return wrapped
def _js_comp_op(op):
def wrapped(a, b):
if JS_Undefined in (a, b):
return False
if isinstance(a, str) or isinstance(b, str):
return op(str(a or 0), str(b or 0))
return op(a or 0, b or 0)
return wrapped
def _js_ternary(cndn, if_true=True, if_false=False):
"""Simulate JS's ternary operator (cndn?if_true:if_false)"""
if cndn in (False, None, 0, ''):
if cndn in (False, None, 0, '', JS_Undefined):
return if_false
with contextlib.suppress(TypeError):
if math.isnan(cndn): # NB: NaN cannot be checked by membership
@@ -67,6 +89,50 @@ def _ternary(cndn, if_true=True, if_false=False):
return if_true
# Ref: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Operator_Precedence
_OPERATORS = { # None => Defined in JSInterpreter._operator
'?': None,
'??': None,
'||': None,
'&&': None,
'|': _js_bit_op(operator.or_),
'^': _js_bit_op(operator.xor),
'&': _js_bit_op(operator.and_),
'===': operator.is_,
'!==': operator.is_not,
'==': _js_eq_op(operator.eq),
'!=': _js_eq_op(operator.ne),
'<=': _js_comp_op(operator.le),
'>=': _js_comp_op(operator.ge),
'<': _js_comp_op(operator.lt),
'>': _js_comp_op(operator.gt),
'>>': _js_bit_op(operator.rshift),
'<<': _js_bit_op(operator.lshift),
'+': _js_arith_op(operator.add),
'-': _js_arith_op(operator.sub),
'*': _js_arith_op(operator.mul),
'/': _js_div,
'%': _js_mod,
'**': _js_exp,
}
_COMP_OPERATORS = {'===', '!==', '==', '!=', '<=', '>=', '<', '>'}
_NAME_RE = r'[a-zA-Z_$][\w$]*'
_MATCHING_PARENS = dict(zip(*zip('()', '{}', '[]')))
_QUOTES = '\'"/'
class JS_Undefined:
pass
class JS_Break(ExtractorError):
def __init__(self):
ExtractorError.__init__(self, 'Invalid break')
@@ -77,6 +143,12 @@ class JS_Continue(ExtractorError):
ExtractorError.__init__(self, 'Invalid continue')
class JS_Throw(ExtractorError):
def __init__(self, e):
self.error = e
ExtractorError.__init__(self, f'Uncaught exception {e}')
class LocalNameSpace(collections.ChainMap):
def __setitem__(self, key, value):
for scope in self.maps:
@@ -103,7 +175,14 @@ class Debugger:
def interpret_statement(self, stmt, local_vars, allow_recursion, *args, **kwargs):
if cls.ENABLED and stmt.strip():
cls.write(stmt, level=allow_recursion)
ret, should_ret = f(self, stmt, local_vars, allow_recursion, *args, **kwargs)
try:
ret, should_ret = f(self, stmt, local_vars, allow_recursion, *args, **kwargs)
except Exception as e:
if cls.ENABLED:
if isinstance(e, ExtractorError):
e = e.orig_msg
cls.write('=> Raises:', e, '<-|', stmt, level=allow_recursion)
raise
if cls.ENABLED and stmt.strip():
cls.write(['->', '=>'][should_ret], repr(ret), '<-|', stmt, level=allow_recursion)
return ret, should_ret
@@ -113,6 +192,21 @@ class Debugger:
class JSInterpreter:
__named_object_counter = 0
_RE_FLAGS = {
# special knowledge: Python's re flags are bitmask values, current max 128
# invent new bitmask values well above that for literal parsing
# TODO: new pattern class to execute matches with these flags
'd': 1024, # Generate indices for substring matches
'g': 2048, # Global search
'i': re.I, # Case-insensitive search
'm': re.M, # Multi-line search
's': re.S, # Allows . to match newline characters
'u': re.U, # Treat a pattern as a sequence of unicode code points
'y': 4096, # Perform a "sticky" search that matches starting at the current position in the target string
}
_EXC_NAME = '__yt_dlp_exception__'
def __init__(self, code, objects=None):
self.code, self._functions = code, {}
self._objects = {} if objects is None else objects
@@ -129,21 +223,38 @@ class JSInterpreter:
namespace[name] = obj
return name
@classmethod
def _regex_flags(cls, expr):
flags = 0
if not expr:
return flags, expr
for idx, ch in enumerate(expr):
if ch not in cls._RE_FLAGS:
break
flags |= cls._RE_FLAGS[ch]
return flags, expr[idx + 1:]
@staticmethod
def _separate(expr, delim=',', max_split=None):
OP_CHARS = '+-*/%&|^=<>!,;{}:'
if not expr:
return
counters = {k: 0 for k in _MATCHING_PARENS.values()}
start, splits, pos, delim_len = 0, 0, 0, len(delim) - 1
in_quote, escaping = None, False
in_quote, escaping, after_op, in_regex_char_group = None, False, True, False
for idx, char in enumerate(expr):
if not in_quote and char in _MATCHING_PARENS:
counters[_MATCHING_PARENS[char]] += 1
elif not in_quote and char in counters:
counters[char] -= 1
elif not escaping and char in _QUOTES and in_quote in (char, None):
in_quote = None if in_quote else char
elif not escaping:
if char in _QUOTES and in_quote in (char, None):
if in_quote or after_op or char != '/':
in_quote = None if in_quote and not in_regex_char_group else char
elif in_quote == '/' and char in '[]':
in_regex_char_group = char == '['
escaping = not escaping and in_quote and char == '\\'
after_op = not in_quote and char in OP_CHARS or (char.isspace() and after_op)
if char != delim[pos] or any(counters.values()) or in_quote:
pos = 0
@@ -159,7 +270,9 @@ class JSInterpreter:
yield expr[start:]
@classmethod
def _separate_at_paren(cls, expr, delim):
def _separate_at_paren(cls, expr, delim=None):
if delim is None:
delim = expr and _MATCHING_PARENS[expr[0]]
separated = list(cls._separate(expr, delim, 1))
if len(separated) < 2:
raise cls.Exception(f'No terminating paren {delim}', expr)
@@ -167,10 +280,13 @@ class JSInterpreter:
def _operator(self, op, left_val, right_expr, expr, local_vars, allow_recursion):
if op in ('||', '&&'):
if (op == '&&') ^ _ternary(left_val):
if (op == '&&') ^ _js_ternary(left_val):
return left_val # short circuiting
elif op == '??':
if left_val not in (None, JS_Undefined):
return left_val
elif op == '?':
right_expr = _ternary(left_val, *self._separate(right_expr, ':', 1))
right_expr = _js_ternary(left_val, *self._separate(right_expr, ':', 1))
right_val = self.interpret_expression(right_expr, local_vars, allow_recursion)
if not _OPERATORS.get(op):
@@ -181,12 +297,14 @@ class JSInterpreter:
except Exception as e:
raise self.Exception(f'Failed to evaluate {left_val!r} {op} {right_val!r}', expr, cause=e)
def _index(self, obj, idx):
def _index(self, obj, idx, allow_undefined=False):
if idx == 'length':
return len(obj)
try:
return obj[int(idx)] if isinstance(obj, list) else obj[idx]
except Exception as e:
if allow_undefined:
return JS_Undefined
raise self.Exception(f'Cannot get index {idx}', repr(obj), cause=e)
def _dump(self, obj, namespace):
@@ -210,16 +328,22 @@ class JSInterpreter:
if should_return:
return ret, should_return
m = re.match(r'(?P<var>(?:var|const|let)\s)|return(?:\s+|$)', stmt)
m = re.match(r'(?P<var>(?:var|const|let)\s)|return(?:\s+|(?=["\'])|$)|(?P<throw>throw\s+)', stmt)
if m:
expr = stmt[len(m.group(0)):].strip()
if m.group('throw'):
raise JS_Throw(self.interpret_expression(expr, local_vars, allow_recursion))
should_return = not m.group('var')
if not expr:
return None, should_return
if expr[0] in _QUOTES:
inner, outer = self._separate(expr, expr[0], 1)
inner = json.loads(js_to_json(f'{inner}{expr[0]}', strict=True))
if expr[0] == '/':
flags, outer = self._regex_flags(outer)
inner = re.compile(inner[1:], flags=flags)
else:
inner = json.loads(js_to_json(f'{inner}{expr[0]}', strict=True))
if not outer:
return inner, should_return
expr = self._named_object(local_vars, inner) + outer
@@ -227,7 +351,7 @@ class JSInterpreter:
if expr.startswith('new '):
obj = expr[4:]
if obj.startswith('Date('):
left, right = self._separate_at_paren(obj[4:], ')')
left, right = self._separate_at_paren(obj[4:])
expr = unified_timestamp(
self.interpret_expression(left, local_vars, allow_recursion), False)
if not expr:
@@ -241,7 +365,18 @@ class JSInterpreter:
return None, should_return
if expr.startswith('{'):
inner, outer = self._separate_at_paren(expr, '}')
inner, outer = self._separate_at_paren(expr)
# try for object expression (Map)
sub_expressions = [list(self._separate(sub_expr.strip(), ':', 1)) for sub_expr in self._separate(inner)]
if all(len(sub_expr) == 2 for sub_expr in sub_expressions):
def dict_item(key, val):
val = self.interpret_expression(val, local_vars, allow_recursion)
if re.match(_NAME_RE, key):
return key, val
return self.interpret_expression(key, local_vars, allow_recursion), val
return dict(dict_item(k, v) for k, v in sub_expressions), should_return
inner, should_abort = self.interpret_statement(inner, local_vars, allow_recursion)
if not outer or should_abort:
return inner, should_abort or should_return
@@ -249,7 +384,7 @@ class JSInterpreter:
expr = self._dump(inner, local_vars) + outer
if expr.startswith('('):
inner, outer = self._separate_at_paren(expr, ')')
inner, outer = self._separate_at_paren(expr)
inner, should_abort = self.interpret_statement(inner, local_vars, allow_recursion)
if not outer or should_abort:
return inner, should_abort or should_return
@@ -257,38 +392,62 @@ class JSInterpreter:
expr = self._dump(inner, local_vars) + outer
if expr.startswith('['):
inner, outer = self._separate_at_paren(expr, ']')
inner, outer = self._separate_at_paren(expr)
name = self._named_object(local_vars, [
self.interpret_expression(item, local_vars, allow_recursion)
for item in self._separate(inner)])
expr = name + outer
m = re.match(r'(?P<try>try|finally)\s*|(?:(?P<catch>catch)|(?P<for>for)|(?P<switch>switch))\s*\(', expr)
if m and m.group('try'):
if expr[m.end()] == '{':
try_expr, expr = self._separate_at_paren(expr[m.end():], '}')
else:
try_expr, expr = expr[m.end() - 1:], ''
ret, should_abort = self.interpret_statement(try_expr, local_vars, allow_recursion)
m = re.match(r'''(?x)
(?P<try>try)\s*\{|
(?P<switch>switch)\s*\(|
(?P<for>for)\s*\(
''', expr)
md = m.groupdict() if m else {}
if md.get('try'):
try_expr, expr = self._separate_at_paren(expr[m.end() - 1:])
err = None
try:
ret, should_abort = self.interpret_statement(try_expr, local_vars, allow_recursion)
if should_abort:
return ret, True
except Exception as e:
# XXX: This works for now, but makes debugging future issues very hard
err = e
pending = (None, False)
m = re.match(r'catch\s*(?P<err>\(\s*{_NAME_RE}\s*\))?\{{'.format(**globals()), expr)
if m:
sub_expr, expr = self._separate_at_paren(expr[m.end() - 1:])
if err:
catch_vars = {}
if m.group('err'):
catch_vars[m.group('err')] = err.error if isinstance(err, JS_Throw) else err
catch_vars = local_vars.new_child(catch_vars)
err, pending = None, self.interpret_statement(sub_expr, catch_vars, allow_recursion)
m = re.match(r'finally\s*\{', expr)
if m:
sub_expr, expr = self._separate_at_paren(expr[m.end() - 1:])
ret, should_abort = self.interpret_statement(sub_expr, local_vars, allow_recursion)
if should_abort:
return ret, True
ret, should_abort = pending
if should_abort:
return ret, True
ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion)
return ret, should_abort or should_return
elif m and m.group('catch'):
# We ignore the catch block
_, expr = self._separate_at_paren(expr, '}')
ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion)
return ret, should_abort or should_return
if err:
raise err
elif m and m.group('for'):
constructor, remaining = self._separate_at_paren(expr[m.end() - 1:], ')')
elif md.get('for'):
constructor, remaining = self._separate_at_paren(expr[m.end() - 1:])
if remaining.startswith('{'):
body, expr = self._separate_at_paren(remaining, '}')
body, expr = self._separate_at_paren(remaining)
else:
switch_m = re.match(r'switch\s*\(', remaining) # FIXME
if switch_m:
switch_val, remaining = self._separate_at_paren(remaining[switch_m.end() - 1:], ')')
switch_val, remaining = self._separate_at_paren(remaining[switch_m.end() - 1:])
body, expr = self._separate_at_paren(remaining, '}')
body = 'switch(%s){%s}' % (switch_val, body)
else:
@@ -296,7 +455,7 @@ class JSInterpreter:
start, cndn, increment = self._separate(constructor, ';')
self.interpret_expression(start, local_vars, allow_recursion)
while True:
if not _ternary(self.interpret_expression(cndn, local_vars, allow_recursion)):
if not _js_ternary(self.interpret_expression(cndn, local_vars, allow_recursion)):
break
try:
ret, should_abort = self.interpret_statement(body, local_vars, allow_recursion)
@@ -307,11 +466,9 @@ class JSInterpreter:
except JS_Continue:
pass
self.interpret_expression(increment, local_vars, allow_recursion)
ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion)
return ret, should_abort or should_return
elif m and m.group('switch'):
switch_val, remaining = self._separate_at_paren(expr[m.end() - 1:], ')')
elif md.get('switch'):
switch_val, remaining = self._separate_at_paren(expr[m.end() - 1:])
switch_val = self.interpret_expression(switch_val, local_vars, allow_recursion)
body, expr = self._separate_at_paren(remaining, '}')
items = body.replace('default:', 'case default:').split('case ')[1:]
@@ -334,16 +491,19 @@ class JSInterpreter:
break
if matched:
break
if md:
ret, should_abort = self.interpret_statement(expr, local_vars, allow_recursion)
return ret, should_abort or should_return
# Comma separated statements
sub_expressions = list(self._separate(expr))
expr = sub_expressions.pop().strip() if sub_expressions else ''
for sub_expr in sub_expressions:
ret, should_abort = self.interpret_statement(sub_expr, local_vars, allow_recursion)
if should_abort:
return ret, True
if len(sub_expressions) > 1:
for sub_expr in sub_expressions:
ret, should_abort = self.interpret_statement(sub_expr, local_vars, allow_recursion)
if should_abort:
return ret, True
return ret, False
for m in re.finditer(rf'''(?x)
(?P<pre_sign>\+\+|--)(?P<var1>{_NAME_RE})|
@@ -364,13 +524,13 @@ class JSInterpreter:
(?P<assign>
(?P<out>{_NAME_RE})(?:\[(?P<index>[^\]]+?)\])?\s*
(?P<op>{"|".join(map(re.escape, set(_OPERATORS) - _COMP_OPERATORS))})?
=(?P<expr>.*)$
=(?!=)(?P<expr>.*)$
)|(?P<return>
(?!if|return|true|false|null|undefined)(?P<name>{_NAME_RE})$
(?!if|return|true|false|null|undefined|NaN)(?P<name>{_NAME_RE})$
)|(?P<indexing>
(?P<in>{_NAME_RE})\[(?P<idx>.+)\]$
)|(?P<attribute>
(?P<var>{_NAME_RE})(?:\.(?P<member>[^(]+)|\[(?P<member2>[^\]]+)\])\s*
(?P<var>{_NAME_RE})(?:(?P<nullish>\?)?\.(?P<member>[^(]+)|\[(?P<member2>[^\]]+)\])\s*
)|(?P<function>
(?P<fname>{_NAME_RE})\((?P<args>.*)\)$
)''', expr)
@@ -381,7 +541,7 @@ class JSInterpreter:
local_vars[m.group('out')] = self._operator(
m.group('op'), left_val, m.group('expr'), expr, local_vars, allow_recursion)
return local_vars[m.group('out')], should_return
elif left_val is None:
elif left_val in (None, JS_Undefined):
raise self.Exception(f'Cannot index undefined variable {m.group("out")}', expr)
idx = self.interpret_expression(m.group('index'), local_vars, allow_recursion)
@@ -389,7 +549,7 @@ class JSInterpreter:
raise self.Exception(f'List index {idx} must be integer', expr)
idx = int(idx)
left_val[idx] = self._operator(
m.group('op'), left_val[idx], m.group('expr'), expr, local_vars, allow_recursion)
m.group('op'), self._index(left_val, idx), m.group('expr'), expr, local_vars, allow_recursion)
return left_val[idx], should_return
elif expr.isdigit():
@@ -399,9 +559,13 @@ class JSInterpreter:
raise JS_Break()
elif expr == 'continue':
raise JS_Continue()
elif expr == 'undefined':
return JS_Undefined, should_return
elif expr == 'NaN':
return float('NaN'), should_return
elif m and m.group('return'):
return local_vars[m.group('name')], should_return
return local_vars.get(m.group('name'), JS_Undefined), should_return
with contextlib.suppress(ValueError):
return json.loads(js_to_json(expr, strict=True)), should_return
@@ -414,25 +578,26 @@ class JSInterpreter:
for op in _OPERATORS:
separated = list(self._separate(expr, op))
right_expr = separated.pop()
while op in '<>*-' and len(separated) > 1 and not separated[-1].strip():
separated.pop()
while True:
if op in '?<>*-' and len(separated) > 1 and not separated[-1].strip():
separated.pop()
elif not (separated and op == '?' and right_expr.startswith('.')):
break
right_expr = f'{op}{right_expr}'
if op != '-':
right_expr = f'{separated.pop()}{op}{right_expr}'
if not separated:
continue
left_val = self.interpret_expression(op.join(separated), local_vars, allow_recursion)
return self._operator(op, 0 if left_val is None else left_val,
right_expr, expr, local_vars, allow_recursion), should_return
return self._operator(op, left_val, right_expr, expr, local_vars, allow_recursion), should_return
if m and m.group('attribute'):
variable = m.group('var')
member = m.group('member')
variable, member, nullish = m.group('var', 'member', 'nullish')
if not member:
member = self.interpret_expression(m.group('member2'), local_vars, allow_recursion)
arg_str = expr[m.end():]
if arg_str.startswith('('):
arg_str, remaining = self._separate_at_paren(arg_str, ')')
arg_str, remaining = self._separate_at_paren(arg_str)
else:
arg_str, remaining = None, arg_str
@@ -454,12 +619,19 @@ class JSInterpreter:
obj = local_vars.get(variable, types.get(variable, NO_DEFAULT))
if obj is NO_DEFAULT:
if variable not in self._objects:
self._objects[variable] = self.extract_object(variable)
obj = self._objects[variable]
try:
self._objects[variable] = self.extract_object(variable)
except self.Exception:
if not nullish:
raise
obj = self._objects.get(variable, JS_Undefined)
if nullish and obj is JS_Undefined:
return JS_Undefined
# Member access
if arg_str is None:
return self._index(obj, member)
return self._index(obj, member, nullish)
# Function call
argvals = [
@@ -535,6 +707,13 @@ class JSInterpreter:
return obj.index(idx, start)
except ValueError:
return -1
elif member == 'charCodeAt':
assertion(isinstance(obj, str), 'must be applied on a string')
assertion(len(argvals) == 1, 'takes exactly one argument')
idx = argvals[0] if isinstance(argvals[0], int) else 0
if idx >= len(obj):
return None
return ord(obj[idx])
idx = int(member) if isinstance(obj, list) else member
return obj[idx](argvals, allow_recursion=allow_recursion)
@@ -603,7 +782,7 @@ class JSInterpreter:
\((?P<args>[^)]*)\)\s*
(?P<code>{.+})''' % {'name': re.escape(funcname)},
self.code)
code, _ = self._separate_at_paren(func_m.group('code'), '}')
code, _ = self._separate_at_paren(func_m.group('code'))
if func_m is None:
raise self.Exception(f'Could not find JS function "{funcname}"')
return [x.strip() for x in func_m.group('args').split(',')], code
@@ -618,7 +797,7 @@ class JSInterpreter:
if mobj is None:
break
start, body_start = mobj.span()
body, remaining = self._separate_at_paren(code[body_start - 1:], '}')
body, remaining = self._separate_at_paren(code[body_start - 1:])
name = self._named_object(local_vars, self.extract_function_from_code(
[x.strip() for x in mobj.group('args').split(',')],
body, local_vars, *global_stack))
@@ -636,7 +815,7 @@ class JSInterpreter:
global_stack[0].update(itertools.zip_longest(argnames, args, fillvalue=None))
global_stack[0].update(kwargs)
var_stack = LocalNameSpace(*global_stack)
ret, should_abort = self.interpret_statement(code.replace('\n', ''), var_stack, allow_recursion - 1)
ret, should_abort = self.interpret_statement(code.replace('\n', ' '), var_stack, allow_recursion - 1)
if should_abort:
return ret
return resf

View File

@@ -25,10 +25,12 @@ from .utils import (
OUTTMPL_TYPES,
POSTPROCESS_WHEN,
Config,
deprecation_warning,
expand_path,
format_field,
get_executable_path,
join_nonempty,
orderedSet_from_options,
remove_end,
write_string,
)
@@ -163,6 +165,7 @@ class _YoutubeDLHelpFormatter(optparse.IndentedHelpFormatter):
class _YoutubeDLOptionParser(optparse.OptionParser):
# optparse is deprecated since python 3.2. So assume a stable interface even for private methods
ALIAS_DEST = '_triggered_aliases'
ALIAS_TRIGGER_LIMIT = 100
def __init__(self):
@@ -174,6 +177,7 @@ class _YoutubeDLOptionParser(optparse.OptionParser):
formatter=_YoutubeDLHelpFormatter(),
conflict_handler='resolve',
)
self.set_default(self.ALIAS_DEST, collections.defaultdict(int))
_UNKNOWN_OPTION = (optparse.BadOptionError, optparse.AmbiguousOptionError)
_BAD_OPTION = optparse.OptionValueError
@@ -232,30 +236,16 @@ def create_parser():
current + value if append is True else value + current)
def _set_from_options_callback(
option, opt_str, value, parser, delim=',', allowed_values=None, aliases={},
option, opt_str, value, parser, allowed_values, delim=',', aliases={},
process=lambda x: x.lower().strip()):
current = set(getattr(parser.values, option.dest))
values = [process(value)] if delim is None else list(map(process, value.split(delim)[::-1]))
while values:
actual_val = val = values.pop()
if not val:
raise optparse.OptionValueError(f'Invalid {option.metavar} for {opt_str}: {value}')
if val == 'all':
current.update(allowed_values)
elif val == '-all':
current = set()
elif val in aliases:
values.extend(aliases[val])
else:
if val[0] == '-':
val = val[1:]
current.discard(val)
else:
current.update([val])
if allowed_values is not None and val not in allowed_values:
raise optparse.OptionValueError(f'wrong {option.metavar} for {opt_str}: {actual_val}')
values = [process(value)] if delim is None else map(process, value.split(delim))
try:
requested = orderedSet_from_options(values, collections.ChainMap(aliases, {'all': allowed_values}),
start=getattr(parser.values, option.dest))
except ValueError as e:
raise optparse.OptionValueError(f'wrong {option.metavar} for {opt_str}: {e.args[0]}')
setattr(parser.values, option.dest, current)
setattr(parser.values, option.dest, set(requested))
def _dict_from_options_callback(
option, opt_str, value, parser,
@@ -305,8 +295,7 @@ def create_parser():
aliases = (x if x.startswith('-') else f'--{x}' for x in map(str.strip, aliases.split(',')))
try:
alias_group.add_option(
*aliases, help=opts, nargs=nargs, type='str' if nargs else None,
dest='_triggered_aliases', default=collections.defaultdict(int),
*aliases, help=opts, nargs=nargs, dest=parser.ALIAS_DEST, type='str' if nargs else None,
metavar=' '.join(f'ARG{i}' for i in range(nargs)), action='callback',
callback=_alias_callback, callback_kwargs={'opts': opts, 'nargs': nargs})
except Exception as err:
@@ -365,10 +354,20 @@ def create_parser():
'--extractor-descriptions',
action='store_true', dest='list_extractor_descriptions', default=False,
help='Output descriptions of all supported extractors and exit')
general.add_option(
'--use-extractors', '--ies',
action='callback', dest='allowed_extractors', metavar='NAMES', type='str',
default=[], callback=_list_from_options_callback,
help=(
'Extractor names to use separated by commas. '
'You can also use regexes, "all", "default" and "end" (end URL matching); '
'e.g. --ies "holodex.*,end,youtube". '
'Prefix the name with a "-" to exclude it, e.g. --ies default,-generic. '
'Use --list-extractors for a list of extractor names. (Alias: --ies)'))
general.add_option(
'--force-generic-extractor',
action='store_true', dest='force_generic_extractor', default=False,
help='Force extraction to use the generic extractor')
help=optparse.SUPPRESS_HELP)
general.add_option(
'--default-search',
dest='default_search', metavar='PREFIX',
@@ -443,11 +442,12 @@ def create_parser():
'allowed_values': {
'filename', 'filename-sanitization', 'format-sort', 'abort-on-error', 'format-spec', 'no-playlist-metafiles',
'multistreams', 'no-live-chat', 'playlist-index', 'list-formats', 'no-direct-merge',
'no-youtube-channel-redirect', 'no-youtube-unavailable-videos', 'no-attach-info-json', 'embed-metadata',
'embed-thumbnail-atomicparsley', 'seperate-video-versions', 'no-clean-infojson', 'no-keep-subs', 'no-certifi',
'no-attach-info-json', 'embed-metadata', 'embed-thumbnail-atomicparsley',
'seperate-video-versions', 'no-clean-infojson', 'no-keep-subs', 'no-certifi',
'no-youtube-channel-redirect', 'no-youtube-unavailable-videos', 'no-youtube-prefer-utc-upload-date',
}, 'aliases': {
'youtube-dl': ['-multistreams', 'all'],
'youtube-dlc': ['-no-youtube-channel-redirect', '-no-live-chat', 'all'],
'youtube-dl': ['all', '-multistreams'],
'youtube-dlc': ['all', '-no-youtube-channel-redirect', '-no-live-chat'],
}
}, help=(
'Options that can help keep compatibility with youtube-dl or youtube-dlc '
@@ -634,7 +634,7 @@ def create_parser():
selection.add_option(
'--break-per-input',
action='store_true', dest='break_per_url', default=False,
help='Make --break-on-existing, --break-on-reject and --max-downloads act only on the current input URL')
help='--break-on-existing, --break-on-reject, --max-downloads, and autonumber resets per input URL')
selection.add_option(
'--no-break-per-input',
action='store_false', dest='break_per_url',
@@ -1401,14 +1401,15 @@ def create_parser():
help='Do not read/dump cookies from/to file (default)')
filesystem.add_option(
'--cookies-from-browser',
dest='cookiesfrombrowser', metavar='BROWSER[+KEYRING][:PROFILE]',
dest='cookiesfrombrowser', metavar='BROWSER[+KEYRING][:PROFILE][::CONTAINER]',
help=(
'The name of the browser and (optionally) the name/path of '
'the profile to load cookies from, separated by a ":". '
'The name of the browser to load cookies from. '
f'Currently supported browsers are: {", ".join(sorted(SUPPORTED_BROWSERS))}. '
'By default, the most recently accessed profile is used. '
'The keyring used for decrypting Chromium cookies on Linux can be '
'(optionally) specified after the browser name separated by a "+". '
'Optionally, the KEYRING used for decrypting Chromium cookies on Linux, '
'the name/path of the PROFILE to load cookies from, '
'and the CONTAINER name (if Firefox) ("none" for no container) '
'can be given with their respective seperators. '
'By default, all containers of the most recently accessed profile are used. '
f'Currently supported keyrings are: {", ".join(map(str.lower, sorted(SUPPORTED_KEYRINGS)))}'))
filesystem.add_option(
'--no-cookies-from-browser',
@@ -1866,7 +1867,6 @@ def create_parser():
def _hide_login_info(opts):
write_string(
'DeprecationWarning: "yt_dlp.options._hide_login_info" is deprecated and may be removed in a future version. '
'Use "yt_dlp.utils.Config.hide_login_info" instead\n')
deprecation_warning(f'"{__name__}._hide_login_info" is deprecated and may be removed '
'in a future version. Use "yt_dlp.utils.Config.hide_login_info" instead')
return Config.hide_login_info(opts)

View File

@@ -7,10 +7,10 @@ from ..utils import (
PostProcessingError,
RetryManager,
_configuration_args,
deprecation_warning,
encodeFilename,
network_exceptions,
sanitized_Request,
write_string,
)
@@ -73,10 +73,14 @@ class PostProcessor(metaclass=PostProcessorMetaClass):
if self._downloader:
return self._downloader.report_warning(text, *args, **kwargs)
def deprecation_warning(self, text):
def deprecation_warning(self, msg):
warn = getattr(self._downloader, 'deprecation_warning', deprecation_warning)
return warn(msg, stacklevel=1)
def deprecated_feature(self, msg):
if self._downloader:
return self._downloader.deprecation_warning(text)
write_string(f'DeprecationWarning: {text}')
return self._downloader.deprecated_feature(msg)
return deprecation_warning(msg, stacklevel=1)
def report_error(self, text, *args, **kwargs):
self.deprecation_warning('"yt_dlp.postprocessor.PostProcessor.report_error" is deprecated. '

View File

@@ -15,6 +15,7 @@ from ..utils import (
Popen,
PostProcessingError,
_get_exe_version_output,
deprecation_warning,
detect_exe_version,
determine_ext,
dfxp2srt,
@@ -30,7 +31,6 @@ from ..utils import (
traverse_obj,
variadic,
write_json_file,
write_string,
)
EXT_TO_OUT_FORMATS = {
@@ -187,8 +187,8 @@ class FFmpegPostProcessor(PostProcessor):
else:
self.probe_basename = basename
if basename == self._ffmpeg_to_avconv[kind]:
self.deprecation_warning(
f'Support for {self._ffmpeg_to_avconv[kind]} is deprecated and may be removed in a future version. Use {kind} instead')
self.deprecated_feature(f'Support for {self._ffmpeg_to_avconv[kind]} is deprecated and '
f'may be removed in a future version. Use {kind} instead')
return version
@functools.cached_property
@@ -1064,7 +1064,7 @@ class FFmpegThumbnailsConvertorPP(FFmpegPostProcessor):
@classmethod
def is_webp(cls, path):
write_string(f'DeprecationWarning: {cls.__module__}.{cls.__name__}.is_webp is deprecated')
deprecation_warning(f'{cls.__module__}.{cls.__name__}.is_webp is deprecated')
return imghdr.what(path) == 'webp'
def fixup_webp(self, info, idx=-1):

View File

@@ -1,4 +1,5 @@
import atexit
import contextlib
import hashlib
import json
import os
@@ -13,6 +14,7 @@ from .compat import compat_realpath, compat_shlex_quote
from .utils import (
Popen,
cached_method,
deprecation_warning,
shell_quote,
system_identifier,
traverse_obj,
@@ -50,6 +52,19 @@ def detect_variant():
return VARIANT or _get_variant_and_executable_path()[0]
@functools.cache
def current_git_head():
if detect_variant() != 'source':
return
with contextlib.suppress(Exception):
stdout, _, _ = Popen.run(
['git', 'rev-parse', '--short', 'HEAD'],
text=True, cwd=os.path.dirname(os.path.abspath(__file__)),
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if re.fullmatch('[0-9a-f]+', stdout.strip()):
return stdout.strip()
_FILE_SUFFIXES = {
'zip': '',
'py2exe': '_min.exe',
@@ -288,11 +303,8 @@ def run_update(ydl):
def update_self(to_screen, verbose, opener):
import traceback
from .utils import write_string
write_string(
'DeprecationWarning: "yt_dlp.update.update_self" is deprecated and may be removed in a future version. '
'Use "yt_dlp.update.run_update(ydl)" instead\n')
deprecation_warning(f'"{__name__}.update_self" is deprecated and may be removed '
f'in a future version. Use "{__name__}.run_update(ydl)" instead')
printfn = to_screen

View File

@@ -828,8 +828,8 @@ def escapeHTML(text):
def process_communicate_or_kill(p, *args, **kwargs):
write_string('DeprecationWarning: yt_dlp.utils.process_communicate_or_kill is deprecated '
'and may be removed in a future version. Use yt_dlp.utils.Popen.communicate_or_kill instead')
deprecation_warning(f'"{__name__}.process_communicate_or_kill" is deprecated and may be removed '
f'in a future version. Use "{__name__}.Popen.communicate_or_kill" instead')
return Popen.communicate_or_kill(p, *args, **kwargs)
@@ -840,12 +840,35 @@ class Popen(subprocess.Popen):
else:
_startupinfo = None
def __init__(self, *args, text=False, **kwargs):
@staticmethod
def _fix_pyinstaller_ld_path(env):
"""Restore LD_LIBRARY_PATH when using PyInstaller
Ref: https://github.com/pyinstaller/pyinstaller/blob/develop/doc/runtime-information.rst#ld_library_path--libpath-considerations
https://github.com/yt-dlp/yt-dlp/issues/4573
"""
if not hasattr(sys, '_MEIPASS'):
return
def _fix(key):
orig = env.get(f'{key}_ORIG')
if orig is None:
env.pop(key, None)
else:
env[key] = orig
_fix('LD_LIBRARY_PATH') # Linux
_fix('DYLD_LIBRARY_PATH') # macOS
def __init__(self, *args, env=None, text=False, **kwargs):
if env is None:
env = os.environ.copy()
self._fix_pyinstaller_ld_path(env)
if text is True:
kwargs['universal_newlines'] = True # For 3.6 compatibility
kwargs.setdefault('encoding', 'utf-8')
kwargs.setdefault('errors', 'replace')
super().__init__(*args, **kwargs, startupinfo=self._startupinfo)
super().__init__(*args, env=env, **kwargs, startupinfo=self._startupinfo)
def communicate_or_kill(self, *args, **kwargs):
try:
@@ -860,9 +883,9 @@ class Popen(subprocess.Popen):
self.wait(timeout=timeout)
@classmethod
def run(cls, *args, **kwargs):
def run(cls, *args, timeout=None, **kwargs):
with cls(*args, **kwargs) as proc:
stdout, stderr = proc.communicate_or_kill()
stdout, stderr = proc.communicate_or_kill(timeout=timeout)
return stdout or '', stderr or '', proc.returncode
@@ -1934,7 +1957,7 @@ class DateRange:
def platform_name():
""" Returns the platform name as a str """
write_string('DeprecationWarning: yt_dlp.utils.platform_name is deprecated, use platform.platform instead')
deprecation_warning(f'"{__name__}.platform_name" is deprecated, use "platform.platform" instead')
return platform.platform()
@@ -1980,6 +2003,23 @@ def write_string(s, out=None, encoding=None):
out.flush()
def deprecation_warning(msg, *, printer=None, stacklevel=0, **kwargs):
from . import _IN_CLI
if _IN_CLI:
if msg in deprecation_warning._cache:
return
deprecation_warning._cache.add(msg)
if printer:
return printer(f'{msg}{bug_reports_message()}', **kwargs)
return write_string(f'ERROR: {msg}{bug_reports_message()}\n', **kwargs)
else:
import warnings
warnings.warn(DeprecationWarning(msg), stacklevel=stacklevel + 3)
deprecation_warning._cache = set()
def bytes_to_intlist(bs):
if not bs:
return []
@@ -4862,8 +4902,8 @@ def decode_base_n(string, n=None, table=None):
def decode_base(value, digits):
write_string('DeprecationWarning: yt_dlp.utils.decode_base is deprecated '
'and may be removed in a future version. Use yt_dlp.decode_base_n instead')
deprecation_warning(f'{__name__}.decode_base is deprecated and may be removed '
f'in a future version. Use {__name__}.decode_base_n instead')
return decode_base_n(value, table=digits)
@@ -5332,8 +5372,8 @@ def traverse_obj(
def traverse_dict(dictn, keys, casesense=True):
write_string('DeprecationWarning: yt_dlp.utils.traverse_dict is deprecated '
'and may be removed in a future version. Use yt_dlp.utils.traverse_obj instead')
deprecation_warning(f'"{__name__}.traverse_dict" is deprecated and may be removed '
f'in a future version. Use "{__name__}.traverse_obj" instead')
return traverse_obj(dictn, keys, casesense=casesense, is_user_input=True, traverse_string=True)
@@ -5764,7 +5804,7 @@ class RetryManager:
if not count:
return warn(e)
elif isinstance(e, ExtractorError):
e = remove_end(str(e.cause) or e.orig_msg, '.')
e = remove_end(str_or_none(e.cause) or e.orig_msg, '.')
warn(f'{e}. Retrying{format_field(suffix, None, " %s")} ({count}/{retries})...')
delay = float_or_none(sleep_func(n=count - 1)) if callable(sleep_func) else sleep_func
@@ -5785,6 +5825,36 @@ def truncate_string(s, left, right=0):
return f'{s[:left-3]}...{s[-right:]}'
def orderedSet_from_options(options, alias_dict, *, use_regex=False, start=None):
assert 'all' in alias_dict, '"all" alias is required'
requested = list(start or [])
for val in options:
discard = val.startswith('-')
if discard:
val = val[1:]
if val in alias_dict:
val = alias_dict[val] if not discard else [
i[1:] if i.startswith('-') else f'-{i}' for i in alias_dict[val]]
# NB: Do not allow regex in aliases for performance
requested = orderedSet_from_options(val, alias_dict, start=requested)
continue
current = (filter(re.compile(val, re.I).fullmatch, alias_dict['all']) if use_regex
else [val] if val in alias_dict['all'] else None)
if current is None:
raise ValueError(val)
if discard:
for item in current:
while item in requested:
requested.remove(item)
else:
requested.extend(current)
return orderedSet(requested)
# Deprecated
has_certifi = bool(certifi)
has_websockets = bool(websockets)

View File

@@ -1,8 +1,8 @@
# Autogenerated by devscripts/update-version.py
__version__ = '2022.08.14'
__version__ = '2022.09.01'
RELEASE_GIT_HEAD = '55937202b'
RELEASE_GIT_HEAD = '5d7c7d656'
VARIANT = None