1
0
mirror of https://github.com/yt-dlp/yt-dlp.git synced 2026-01-12 01:41:26 +00:00

Compare commits

...

153 Commits

Author SHA1 Message Date
github-actions
7287ab92f6 [version] update
Created by: pukkandan

:ci skip all :ci run dl
2023-01-06 21:21:26 +00:00
pukkandan
6becd2508c Release 2023.01.06 2023-01-07 02:48:35 +05:30
pukkandan
edfc7725b1 [cleanup] Misc 2023-01-07 02:48:34 +05:30
JChris246
b382c1fc6a [xanimu] Add extractor (#5969)
Authored by: JChris246
Closes #5810
2023-01-07 01:39:37 +05:30
Christoph Flathmann
8a6b167723 [extractor/crunchyroll:show] Add language to entries (#5687)
Authored by: Chrissi2812
2023-01-07 01:05:03 +05:30
mzhou
253ac4ba6a [extractor/youtube] Retry manifest refresh for live-from-start (#5670)
Avoids ending download early when live stream is temporarily offline.
Best used with somewhat large `--retry-sleep extractor:` and `--extractor-retries`

Authored by: mzhou
2023-01-07 01:00:42 +05:30
George Schizas
84e0e33a19 [extractor/reddit] Add subreddit as channel_id (#5685)
Authored by: gschizas
Closes #5684
2023-01-07 00:57:02 +05:30
Frederik Nordahl Jul Sabroe
ab4cbeff00 [extractor/drtv] Add series extractors (#5644)
Authored by: FrederikNS
Closes #3567
2023-01-07 00:37:52 +05:30
Simon Sawicki
773c272d66 Fix config locations (#5933)
Bug in 8e40b9d1ec
Closes #5953

Authored by: Grub4k, coletdjnz, pukkandan
2023-01-07 00:31:00 +05:30
Jacob Truman
c3366fdfd0 [extractor/nbc] Update graphql query (#5952)
Closes #5918
Authored by: jacobtruman
2023-01-07 00:14:35 +05:30
Simon Sawicki
5be214abed [update] Fix updater file removal on windows (#5970)
Reverts 2fb0f85868
Closes #5632
Authored by: Grub4K
2023-01-06 22:31:18 +05:30
HobbyistDev
d37422f1db [extractor/biliIntl] Add fallback to video_data (#5971)
Authored by: HobbyistDev
2023-01-06 11:52:25 +05:30
JC-Chung
933ed882e9 [extractor/tiktok] Add TikTokLive extractor (#5637)
Closes #3698
Authored by: JC-Chung
2023-01-05 11:23:34 +00:00
HobbyistDev
a1d9aca338 [extractor/aitube] Add extractor (#5946)
Closes #5627
Authored by: HobbyistDev
2023-01-04 17:03:36 +05:30
HobbyistDev
91d54e9b99 [extractor/volejtv] Add extractor (#5943)
Authored by: HobbyistDev
Closes #5883
2023-01-04 13:20:23 +05:30
HobbyistDev
76c3ceccfb [extractor/biliintl] Add /media to VALID_URL (#5939)
Authored by: HobbyistDev
2023-01-03 23:29:52 +05:30
pukkandan
ad68b16a1e [downloader/aria2c] Disable native progress
Closes #5931, closes #5928, Re-opens #2038
2023-01-03 17:25:56 +05:30
pukkandan
f079514957 [utils] windows_enable_vt_mode: Better error handling
Closes #5927
2023-01-03 15:59:49 +05:30
pukkandan
e9df3d42c4 [build] Add minimal pyproject.toml 2023-01-03 11:25:01 +05:30
pukkandan
d80ca5deaa [utils] mimetype2ext: weba is not standard
Fix bug in fbb7383306, 2647c933b8
Closes #5935
2023-01-03 11:25:01 +05:30
OndrejBakan
1a3cd8ec35 [extractor/joj] Fix extractor (#5934)
Authored by: OndrejBakan, pukkandan
2023-01-03 11:05:05 +05:30
github-actions
990dd7b00f [version] update
Created by: pukkandan

:ci skip all :ci run dl
2023-01-02 14:44:06 +00:00
pukkandan
d83b0ad809 Release 2023.01.02 2023-01-02 20:07:07 +05:30
pukkandan
08e29b9f1f [cleanup] Misc
Closes #5576, closes #5887
2023-01-02 19:40:15 +05:30
pukkandan
8e174ba7de [docs] Improvements
Closes #5846, closes #5774
2023-01-02 19:40:13 +05:30
bashonly
05997b6e98 [extractor/generic] Decode unicode-escaped embed URLs (#5919)
Authored by: bashonly
Closes #5854
2023-01-02 19:36:01 +05:30
Simon Sawicki
32a84bcf4e Update to ytdl-commit-195f22f6
[generic] Improve KVS (etc) extraction
195f22f679

Closes #3716
Authored by: Grub4k, pukkandan
2023-01-02 19:15:36 +05:30
Matthew
8300774c4a Add --enable-file-urls (#5917)
Closes https://github.com/yt-dlp/yt-dlp/issues/3675

Authored by: coletdjnz
2023-01-02 06:05:13 +00:00
bashonly
d7f9871469 [extractor/iqiyi] Fix Iq JS regex (#5922)
Closes #5702
Authored by: bashonly
2023-01-02 05:50:37 +00:00
bashonly
13f930abc0 [extractor/fifa] Fix Preplay extraction (#5921)
Closes #5839
Authored by: dirkf
2023-01-02 05:46:06 +00:00
bashonly
b23b503e22 [extractor/odnoklassniki] Extract subtitles (#5920)
Closes #5744
Authored by: bashonly
2023-01-02 05:44:54 +00:00
Matthew
e756f45ba0 Improve handling for overriding extractors with plugins (#5916)
* Extractors replaced with plugin extractors now show in debug output
* Better testcase handling
* Added documentation
Authored by: coletdjnz, pukkandan
2023-01-02 04:55:11 +00:00
Lesmiscore
8c53322cda [downloader/aria2c] Native progress for aria2c via RPC (#3724)
Authored by: Lesmiscore, pukkandan

Closes #2038
2023-01-02 02:16:25 +09:00
pukkandan
193fb150b7 Fix bug in 119e40ef64 2023-01-01 17:01:48 +05:30
pukkandan
26fdfc3704 [extractor/biliintl:series] Make partial download of series faster 2023-01-01 14:41:47 +05:30
pukkandan
78d25e0b7c [extractor/embedly] Handle vimeo embeds
Closes #3360
2023-01-01 14:15:23 +05:30
pukkandan
2a06bb4eb6 Add --compat-options 2021,2022
Use these to guard against future compat changes. This allows devs to
change defaults and make other potentially breaking changes more easily.
If you need everything to work exactly as-is, put this in your config
2023-01-01 14:11:15 +05:30
pukkandan
88fb942577 Add message when there are no subtitles/thumbnails
Closes #5551
2023-01-01 14:11:15 +05:30
pukkandan
1cdda32998 [utils] get_exe_version: Detect broken executables
Authored by: dirkf, pukkandan
Closes #5561
2023-01-01 14:11:14 +05:30
coletdjnz
3e01ce744a [extractor/generic] Use Accept-Encoding: identity for initial request
The existing comment seems to imply this was the desired behavior from the beginning.

Partial fix for https://github.com/yt-dlp/yt-dlp/issues/5855, https://github.com/yt-dlp/yt-dlp/issues/5851, https://github.com/yt-dlp/yt-dlp/issues/4748
2023-01-01 18:41:35 +13:00
Matthew
8e40b9d1ec Improve plugin architecture (#5553)
to make plugins easier to develop and use:
* Plugins are now loaded as namespace packages.
* Plugins can be loaded in any distribution of yt-dlp (binary, pip, source, etc.).
* Plugin packages can be installed and managed via pip, or dropped into any of the documented locations.
* Users do not need to edit any code files to install plugins.
* Backwards-compatible with previous plugin architecture.

As a side-effect, yt-dlp will now search in a few more locations for config files.

Closes https://github.com/yt-dlp/yt-dlp/issues/1389

Authored by: flashdagger, coletdjnz, pukkandan, Grub4K
Co-authored-by: Marcel <flashdagger@googlemail.com>
Co-authored-by: pukkandan <pukkandan.ytdlp@gmail.com>
Co-authored-by: Simon Sawicki <accounts@grub4k.xyz>
2023-01-01 04:29:22 +00:00
pukkandan
2fb0f85868 [update] Workaround #5632 2022-12-31 11:04:02 +05:30
Stel Abrego
a0e526ed4d [extractor/bandcamp] Add album_artist (#5537)
Closes #5536 
Authored by: stelcodes
2022-12-31 10:28:33 +05:30
pukkandan
8d1ddb0805 [extractor/udemy] Fix lectures that have no URL and detect DRM
Closes #5662
2022-12-31 10:02:50 +05:30
pukkandan
9bb856998b [extractor/youtube] Extract DRC formats 2022-12-30 15:50:17 +05:30
pukkandan
fbb7383306 Add weba to known extensions 2022-12-30 15:32:47 +05:30
pukkandan
ec54bd43f3 Fix bug in writing playlist info-json
Closes #4889
2022-12-30 14:07:15 +05:30
pukkandan
f74371a97d [extractor/bilibili] Fix --no-playlist for anthology
Closes #5797
2022-12-30 12:10:57 +05:30
ChillingPepper
d5f043d127 [utils] js_to_json: Fix bug in f55523c (#5771)
Authored by: ChillingPepper, pukkandan
2022-12-30 12:08:38 +05:30
pukkandan
fe74d5b592 Let --parse/replace-in-metadata run at any post-processing stage
Closes #5808, #456
2022-12-30 11:19:39 +05:30
pukkandan
119e40ef64 Add pre-processor stage video
Related: #456, #5808
2022-12-30 11:18:45 +05:30
pukkandan
4455918e7f [extractor/stv] Detect DRM
Closes #5320
2022-12-30 11:17:10 +05:30
Anant Murmu
efa944f4bc [cleanup] Use random.choices (#5800)
Authored by: freezboltz
2022-12-30 08:13:49 +05:30
nosoop
e107c2b8cf [extractor/soundcloud] Support user permalink (#5842)
Closes #5841
Authored by: nosoop
2022-12-30 00:16:43 +05:30
Lesmiscore
ca2f6e14e6 [extractor/BiliLive] Fix extractor
- Remove unnecessary group in `_VALID_URL`
- This extractor always returns livestreams
2022-12-30 03:01:22 +09:00
bashonly
c1edb853b0 [extractor/kick] Add extractor (#5736)
Closes #5722
Authored by: bashonly
2022-12-29 17:31:01 +00:00
bashonly
2647c933b8 [extractor/wistia] Improve extension detection (#5415)
Closes #5053
Authored by: bashonly, Grub4k, pukkandan
2022-12-29 16:32:54 +00:00
bashonly
53006b35ea [extractor/amazon] Add AmazonReviews extractor (#5857)
Closes #5766
Authored by: bashonly
2022-12-29 15:04:09 +00:00
bashonly
4b183d4962 [extractor/videoken] Add extractors (#5824)
Closes #5818
Authored by: bashonly
2022-12-29 14:29:08 +00:00
bashonly
3d667e0047 [extractor/slideslive] Support embeds and slides (#5784)
Authored by: bashonly, Grub4K, pukkandan
2022-12-29 12:03:03 +00:00
Sam
9a9006ba20 [extractor/twitcasting] Fix videos with password (#5894)
Closes #5888
Authored by: bashonly, Spicadox
2022-12-29 11:15:38 +00:00
HobbyistDev
153e88a751 [extractor/netverse] Add NetverseSearch extractor (#5838)
Authored by: HobbyistDev
2022-12-29 13:42:07 +05:30
JChris246
9fcd8ad1f2 [extractor/spankbang] Fix extractor (#5791)
Authored by: JChris246
Closes #5731
2022-12-29 13:38:22 +05:30
monnef
6b71d186dd [extractor/curiositystream] Fix auth (#5730)
Authored by: mnn
2022-12-29 13:17:23 +05:30
lkw123
074b2fae90 [extractor/kankanews] Add extractor (#5729)
Authored by: synthpop123
2022-12-29 13:08:49 +05:30
Kurt Bestor
06a9d68eb8 [extractor/youku] Fix extractor (#5622)
Closes #4456
Authored by: KurtBestor
2022-12-29 12:48:55 +05:30
Damiano Amatruda
a4d6ead30f [extractor/ciscowebex] Support password-protected videos (#5601)
Authored by: damianoamatruda
2022-12-29 12:24:19 +05:30
lauren n. liberda
d1b5f3d79c [extractor/polskieradio] Adapt to next.js redesigns (#5416)
Authored by: selfisekai
2022-12-28 02:17:25 +05:30
lauren n. liberda
da8d2de208 [extractor/cda] Support premium and misc improvements (#5529)
* Fix cache for non-ASCII key
* Improve error messages
* Better UA for fingerprint bypass

Authored by: selfisekai
2022-12-28 01:27:26 +05:30
chris
15e9e578c0 [extractor/ArteTV] Extract chapters (#5879)
Authored by: iw0nderhow, bashonly
2022-12-28 01:22:58 +05:30
Bobscorn
0ef3d47027 [extractor/beatbump] Add extractors (#5304)
Authored by: Bobscorn, pukkandan
Closes #4653
2022-12-27 12:34:56 +05:30
barsnick
247c8dd4f5 [extractor/urplay] Support for audio-only formats (#4606)
Closes #4605
Authored by: barsnick
2022-12-27 12:04:01 +05:30
HobbyistDev
032f22020c [extractor/trtcocuk] Add extractor (#5009)
Closes #2635
Authored by: HobbyistDev
2022-12-27 11:55:09 +05:30
pukkandan
4af47a0003 Fix 9012d20b23 2022-12-27 11:45:22 +05:30
pukkandan
9012d20b23 [extractor/mixch] Support --wait-for-video 2022-12-27 03:01:22 +05:30
Giulio Muscarello
d61ef7f343 [extractor/ARD] Add vtt subtitles (#5835)
Authored by: CapacitorSet
2022-12-24 16:19:10 +05:30
skbeh
1c226ccdd4 [extractor/bilibili] Improve _VALID_URL (#5820)
Authored by: skbeh
2022-12-24 16:17:37 +05:30
pukkandan
8791e78ccc Fix original_url in playlists 2022-12-23 01:44:20 +05:30
pukkandan
69f5fe45b9 [FFmpegVideoConvertor] Add gif to --recode-video 2022-12-23 01:44:20 +05:30
pukkandan
0b5546c723 [extractor] Let _extract_format functions obey --ignore-no-formats 2022-12-23 01:44:18 +05:30
bashonly
1fc089143c [extractor/reddit] Extract crossposted media (#5801)
Closes #5798
Authored by: bashonly
2022-12-21 00:55:47 +00:00
Lesmiscore
5424dbaf91 Deprioritize HEVC-over-FLV formats (#5823)
Authored by: Lesmiscore
2022-12-19 11:36:14 +09:00
Matthew
c733555106 [extractor/youtube:tab] Extract metadata from channel items (#5569)
Authored by: coletdjnz
2022-12-12 23:08:14 +00:00
HobbyistDev
81388c0954 [extractor/oneplace] Add OnePlacePodcast extractor (#5549)
Closes #5543
Authored by: HobbyistDev
2022-12-10 19:10:24 +05:30
Denis
df10bad267 [extractor/rutube] Support private videos (#5761)
Authored by: mexus
2022-12-10 18:47:01 +05:30
HobbyistDev
f0f3fa028b [extractor/netverse] Extract comments (#5568)
Authored by: HobbyistDev
2022-12-10 14:17:06 +05:30
HobbyistDev
22697a84f6 [extractor/europarl] Add EuroParlWebstream Extractor (#5547)
Authored by: HobbyistDev
Closes #4933
2022-12-10 14:14:43 +05:30
HobbyistDev
3ac5476430 [extractor/nosnl] Add support for /video (#5590)
Authored by: HobbyistDev
2022-12-10 14:04:55 +05:30
HobbyistDev
e318b5b87a [extractor/airtv] Add extractor (#5533)
Authored by: HobbyistDev
Closes #5132
2022-12-10 13:59:13 +05:30
bashonly
f549b18512 [extractor/pinterest] Fix extractor (#5739)
Closes #1772
Authored by: bashonly
2022-12-09 23:46:04 +00:00
bashonly
7c5e1701f6 [extractor/foxsports] Fix extractor (#5719)
Closes #5714
Authored by: bashonly
2022-12-09 23:43:10 +00:00
bashonly
16bed382fd [extractor/twitter] Heed --no-playlist for multi-video tweets (#5757)
Closes #5752
Authored by: bashonly, Grub4K
2022-12-09 23:41:45 +00:00
bashonly
3cf50fa8e9 [downloader/ffmpeg] Fix headers for video+audio formats (#5659)
Authored by: bashonly, Grub4K
2022-12-09 23:36:38 +00:00
bashonly
f69b0554eb [extractor/slideslive] Fix extractor (#5737)
Closes #1532
Authored by: bashonly, Grub4K
2022-12-09 23:25:37 +00:00
pukkandan
e74a3c6dcc [extractor/hotstar] Improve format metadata 2022-12-09 15:23:59 +05:30
pukkandan
7108221662 Add ac4 to known codecs
Note: ffmpeg does not currently support this format

Related #5738
2022-12-09 15:23:59 +05:30
nixxo
10dc85924a [extractor/mediaset] Better embed detection and error messages (#5664)
Authored by: nixxo
2022-12-09 12:50:37 +05:30
Vita
b05f0a50e0 [extractor/yle_areena] Support restricted videos (#5735)
* and improve metadata

Closes #5734
Authored by: docbender
2022-12-09 11:33:36 +05:30
Elyse
3d79ebc8b7 [extractor/mediastream] Add extractor (#5640)
Closes #5532, closes #4431, closes #4425
Authored by: elyse0, HobbyistDev

Co-authored-by: HobbyistDev <tesutonihon4@gmail.com>
2022-12-09 02:47:21 +05:30
pukkandan
b44cd29851 [jsinterp] Escape regex that looks like nested set
Closes #5749
2022-12-08 22:43:38 +05:30
milkknife
85a802969e [extractor/webcamerapl] Add extractor (#5715)
Authored by: milkknife
2022-12-08 22:26:36 +05:30
nixxo
72f96c5566 [extractor/la7] Improve extractor (#5538)
Authored by: nixxo
Closes #5360
2022-12-08 22:22:19 +05:30
MMM
839e2a62ae [extractor/rumble] Add RumbleIE extractor (#5515)
Closes #2846
Authored by: flashdagger
2022-12-08 22:02:17 +05:30
HobbyistDev
28b8f57b4b [extractor/noice] Add NoicePodcast extractor (#5621)
Authored by: HobbyistDev
2022-12-08 19:28:36 +05:30
lkw123
dfc186d422 [extractor/xiami] Remove extractors (#5711)
Authored by: synthpop123
2022-12-08 18:13:29 +05:30
David Turner
42ec478fc4 [extractor/plutotv] Fix videos with non-zero start (#5745)
Authored by: digitall
2022-12-08 18:08:52 +05:30
pukkandan
7991ae57a8 [extractor/sibnet] Separate from VKIE
Fixes bfd973ece3 (commitcomment-91834251)
2022-12-08 17:20:02 +05:30
pukkandan
935bac1e4d Fix --cookies-from-browser CLI parsing
Closes #5716
2022-12-06 00:35:53 +05:30
bashonly
c4cbd3bebd [extractor/tiktok] Update _VALID_URL, add api_hostname arg (#5708)
Closes #5706
Authored by: bashonly
2022-12-04 22:30:31 +00:00
pukkandan
c53a18f016 [utils] windows_enable_vt_mode: Proper implementation
Authored by: Grub4K
2022-12-05 01:06:56 +05:30
pukkandan
71df9b7fd5 [cleanup] Misc 2022-12-03 19:52:31 +05:30
Benjamin Ryan
c9f5ce5118 [extractor/tiktok] Update API hostname (#5690)
Closes #5688
Authored by: redraskal
2022-12-02 09:38:00 +00:00
bashonly
ddf1e22d48 [extractor/swearnet] Fix description bug (#5681)
Bug in 049565df2e
Closes #5643
Authoried by: bashonly
2022-12-01 11:24:43 +00:00
bashonly
0e96b408b9 [extractor/reddit] Extract video embeds in text posts (#5677)
Closes #5612
Authored by: bashonly
2022-12-01 04:04:32 +00:00
bashonly
ba72399723 [extractor/tiktok] Fix subs, DouyinIE, improve _VALID_URL (#5676)
Closes #5665, Closes #2267
Authored by: bashonly
2022-12-01 04:00:32 +00:00
pukkandan
9bcfe33be7 [utils] Make ExtractorError mutable 2022-11-30 06:10:26 +05:30
pukkandan
71eb82d1b2 [extractor/youtube] Subtitles cannot be translated to und
Closes #5674
2022-11-30 05:18:18 +05:30
pukkandan
a9d069f5b8 [extractor/amazonminitv] Cleanup 48652590ec 2022-11-29 07:54:07 +05:30
alexia
48652590ec [extractor/amazonminitv] Add extractors (#3628)
Authored by: nyuszika7h, GautamMKGarg
2022-11-28 11:36:18 +09:00
marieell
86f557b636 [extractor/youporn] Fix metadata (#2768)
Authored by: marieell
2022-11-26 08:00:25 +05:30
pukkandan
c0caa80515 [extractor/naver] Treat fan subtitles as separate language
Closes #5467
2022-11-25 16:10:30 +05:30
Mudassir Chapra
0d95d8b00a [extractor/gronkh] Fix _VALID_URL (#5628)
Closes #5531
Authored by: muddi900
2022-11-24 15:34:45 +00:00
Elan Ruusamäe
9d52bf65ff [extractor/kanal2] Add extractor (#5575)
Authored by: glensc, pukkandan, bashonly
2022-11-22 18:09:57 +00:00
bashonly
d761dfd059 [extractor/naver] Improve _VALID_URL for NaverNowIE (#5620)
Authored by: bashonly
2022-11-22 03:42:16 +00:00
bashonly
27c0f899c8 [extractor/screencastify] Add extractor (#5604)
Closes #5603
Authored by: bashonly
2022-11-22 00:40:02 +00:00
bashonly
7ff2fafe47 [extractor/vimeo] Add VimeoProIE (#5596)
* Add support for VimeoPro URLs not containing a Vimeo video ID
* Add support for password-protected VimeoPro pages
Closes #5594
Authored by: bashonly, pukkandan
2022-11-21 00:55:57 +00:00
bashonly
3b021eacef [extractor/generic] Add fragment_query extractor arg for DASH and HLS (#5528)
* `fragment_query`: passthrough any query in generic mpd/m3u8 manifest URLs to their fragments
* Add support for `extra_param_to_segment_url` to DASH downloader
Authored by: bashonly, pukkandan
2022-11-21 00:51:45 +00:00
Marcel
f352a09778 [webvtt] Handle premature EOF
Closes #2867, closes #5600
Authored by: flashdagger
2022-11-20 14:14:42 +05:30
chengzhicn
02b2f9fa7d [extractor/reddit] Add vcodec to fallback format (#5591)
Authored by: chengzhicn
2022-11-20 01:44:21 +05:30
pukkandan
29ca408219 [FormatSort] Add mov to vext
Closes #5581
2022-11-19 09:04:01 +05:30
pukkandan
8486540257 [extractor/unsupported] Add more URLs
Closes #5557, Closes #2744, Closes #5578
2022-11-19 08:42:06 +05:30
bashonly
ed027fd9d8 [extractor/generic] Fix JSON LD manifest extraction (#5577)
Closes #5572
Authored by: bashonly, pukkandan
2022-11-18 02:04:03 +00:00
bashonly
352e7d9873 [extractor/twitter] Refresh guest token when expired (#5560)
Closes #5548
Authored by: bashonly, Grub4K
2022-11-18 02:00:11 +00:00
nixxo
9a0416c6a5 [extractor/twitter:spaces] Add 'Referer' to m3u8 (#5580)
Closes #5565
Authored by: nixxo
2022-11-18 06:42:02 +05:30
bashonly
f5a9e9df0d [extractor/brightcove] Add BrightcoveNewBaseIE and fix embed extraction (#5558)
* Move Brightcove embed extraction and tests into the IEs
* Split `BrightcoveNewBaseIE` from `BrightcoveNewIE`
* Fix bug in ade1fa70cb with the "wrong" spelling of `referrer` being smuggled

Closes #5539
2022-11-17 19:11:35 +00:00
bashonly
f96a3fb7d3 [extractor/redgifs] Fix bug in 8c188d5d09 (#5559) 2022-11-17 19:09:40 +00:00
Bnyro
bc87dac75f [extractor/youtube] Add piped.video (#5571)
Closes #5518
Authored by: Bnyro
2022-11-17 18:45:38 +05:30
pukkandan
9f14daf22b [extractor] Deprecate _sort_formats 2022-11-17 11:40:17 +05:30
pukkandan
784320c98c Implement universal format sorting
Closes #5566
2022-11-17 11:05:49 +05:30
pukkandan
d0d74b7197 [utils] Move format sorting code into utils 2022-11-17 11:04:38 +05:30
pukkandan
64c464a144 [utils] Move FileDownloader.parse_bytes into utils 2022-11-17 08:40:34 +05:30
pukkandan
4de88a6a36 [extractor/generic] Don't report redirect to https 2022-11-17 02:12:07 +05:30
pukkandan
105bfd90f5 Add new field aspect_ratio
Closes #5402
2022-11-16 06:57:09 +05:30
pukkandan
6368e2e639 [cleanup] Misc
Closes #5541
2022-11-16 06:57:07 +05:30
pukkandan
a4894d3e25 [extractor/youtube] Consider language in format de-duplication 2022-11-15 05:23:46 +05:30
pukkandan
d7b460d0e5 Make early reject of --match-filter stricter
Closes #5509
2022-11-13 10:56:06 +05:30
pukkandan
171a31dbe8 [extractor] Add a way to distinguish IEs that returns only videos 2022-11-13 10:56:04 +05:30
pukkandan
83cc7b8aae [utils] classproperty: Add cache support 2022-11-13 08:29:49 +05:30
Elyse
0a4b2f4180 [extractor/tencent] Fix geo-restricted video (#5505)
Closes #5230
Authored by: elyse0
2022-11-12 12:43:13 +05:30
pukkandan
a8c754cc00 [extractor/youtube] Fix bug in handling of music URLs
Bug in bd7e919a75
Closes #5502
2022-11-12 00:02:13 +05:30
pukkandan
bc5c2f8a2c Fix bugs in PlaylistEntries 2022-11-12 00:02:12 +05:30
Audrey
d965856235 [extractor/Veoh] Add user extractor (#5242)
Authored by: tntmod54321
2022-11-11 23:28:54 +05:30
pukkandan
08270da5c3 [extractor/youtube] Fix ytuser: 2022-11-11 16:29:52 +05:30
690 changed files with 7850 additions and 3466 deletions

View File

@@ -18,7 +18,7 @@ body:
options:
- label: I'm reporting a broken site
required: true
- label: I've verified that I'm running yt-dlp version **2022.11.11** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.01.06** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@@ -62,7 +62,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.11.11 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.01.06 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -70,8 +70,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
<more lines>
render: shell
validations:

View File

@@ -18,7 +18,7 @@ body:
options:
- label: I'm reporting a new site support request
required: true
- label: I've verified that I'm running yt-dlp version **2022.11.11** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.01.06** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@@ -74,7 +74,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.11.11 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.01.06 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -82,8 +82,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
<more lines>
render: shell
validations:

View File

@@ -18,7 +18,7 @@ body:
options:
- label: I'm requesting a site-specific feature
required: true
- label: I've verified that I'm running yt-dlp version **2022.11.11** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.01.06** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@@ -70,7 +70,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.11.11 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.01.06 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -78,8 +78,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
<more lines>
render: shell
validations:

View File

@@ -18,7 +18,7 @@ body:
options:
- label: I'm reporting a bug unrelated to a specific site
required: true
- label: I've verified that I'm running yt-dlp version **2022.11.11** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.01.06** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
@@ -55,7 +55,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.11.11 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.01.06 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -63,8 +63,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
<more lines>
render: shell
validations:

View File

@@ -20,7 +20,7 @@ body:
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **2022.11.11** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.01.06** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
@@ -51,7 +51,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.11.11 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.01.06 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -59,7 +59,7 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
<more lines>
render: shell

View File

@@ -26,7 +26,7 @@ body:
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **2022.11.11** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.01.06** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
required: true
@@ -57,7 +57,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.11.11 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.01.06 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -65,7 +65,7 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
<more lines>
render: shell

View File

@@ -2,8 +2,6 @@
### Description of your *pull request* and other information
</details>
<!--
Explanation of your *pull request* in arbitrary form goes here. Please **make sure the description explains the purpose and effect** of your *pull request* and is worded well enough to be understood. Provide as much **context and examples** as possible
@@ -41,3 +39,5 @@ Fixes #
- [ ] New extractor ([Piracy websites will not be accepted](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy))
- [ ] Core bug fix/improvement
- [ ] New feature (It is strongly [recommended to open an issue first](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#adding-new-feature-or-making-overarching-changes))
</details>

View File

@@ -12,13 +12,13 @@ jobs:
fail-fast: false
matrix:
os: [ubuntu-latest]
# CPython 3.9 is in quick-test
python-version: ['3.7', '3.10', 3.11-dev, pypy-3.7, pypy-3.8]
# CPython 3.11 is in quick-test
python-version: ['3.8', '3.9', '3.10', pypy-3.7, pypy-3.8]
run-tests-ext: [sh]
include:
# atleast one of each CPython/PyPy tests must be in windows
- os: windows-latest
python-version: '3.8'
python-version: '3.7'
run-tests-ext: bat
- os: windows-latest
python-version: pypy-3.9
@@ -33,5 +33,6 @@ jobs:
run: pip install pytest
- name: Run tests
continue-on-error: False
run: ./devscripts/run_tests.${{ matrix.run-tests-ext }} core
# Linter is in quick-test
run: |
python3 -m yt_dlp -v || true # Print debug head
./devscripts/run_tests.${{ matrix.run-tests-ext }} core

View File

@@ -10,24 +10,23 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
- name: Set up Python 3.11
uses: actions/setup-python@v4
with:
python-version: 3.9
python-version: '3.11'
- name: Install test requirements
run: pip install pytest pycryptodomex
- name: Run tests
run: ./devscripts/run_tests.sh core
run: |
python3 -m yt_dlp -v || true
./devscripts/run_tests.sh core
flake8:
name: Linter
if: "!contains(github.event.head_commit.message, 'ci skip all')"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: 3.9
- uses: actions/setup-python@v4
- name: Install flake8
run: pip install flake8
- name: Make lazy extractors

10
.gitignore vendored
View File

@@ -30,6 +30,7 @@ cookies
*.f4v
*.flac
*.flv
*.gif
*.jpeg
*.jpg
*.m4a
@@ -71,6 +72,7 @@ dist/
zip/
tmp/
venv/
.venv/
completions/
# Misc
@@ -119,9 +121,5 @@ yt-dlp.zip
*/extractor/lazy_extractors.py
# Plugins
ytdlp_plugins/extractor/*
!ytdlp_plugins/extractor/__init__.py
!ytdlp_plugins/extractor/sample.py
ytdlp_plugins/postprocessor/*
!ytdlp_plugins/postprocessor/__init__.py
!ytdlp_plugins/postprocessor/sample.py
ytdlp_plugins/
yt-dlp-plugins

View File

@@ -351,8 +351,9 @@ Say you extracted a list of thumbnails into `thumbnail_data` and want to iterate
```python
thumbnail_data = data.get('thumbnails') or []
thumbnails = [{
'url': item['url']
} for item in thumbnail_data] # correct
'url': item['url'],
'height': item.get('h'),
} for item in thumbnail_data if item.get('url')] # correct
```
and not like:
@@ -360,12 +361,27 @@ and not like:
```python
thumbnail_data = data.get('thumbnails')
thumbnails = [{
'url': item['url']
'url': item['url'],
'height': item.get('h'),
} for item in thumbnail_data] # incorrect
```
In this case, `thumbnail_data` will be `None` if the field was not found and this will cause the loop `for item in thumbnail_data` to raise a fatal error. Using `or []` avoids this error and results in setting an empty list in `thumbnails` instead.
Alternately, this can be further simplified by using `traverse_obj`
```python
thumbnails = [{
'url': item['url'],
'height': item.get('h'),
} for item in traverse_obj(data, ('thumbnails', lambda _, v: v['url']))]
```
or, even better,
```python
thumbnails = traverse_obj(data, ('thumbnails', ..., {'url': 'url', 'height': 'h'}))
```
### Provide fallbacks

View File

@@ -3,6 +3,7 @@ shirt-dev (collaborator)
coletdjnz/colethedj (collaborator)
Ashish0804 (collaborator)
nao20010128nao/Lesmiscore (collaborator)
bashonly (collaborator)
h-h-h-h
pauldubois98
nixxo
@@ -295,7 +296,6 @@ Mehavoid
winterbird-code
yashkc2025
aldoridhoni
bashonly
jacobtruman
masta79
palewire
@@ -357,3 +357,27 @@ SG5
the-marenga
tkgmomosheep
vitkhab
glensc
synthpop123
tntmod54321
milkknife
Bnyro
CapacitorSet
stelcodes
skbeh
muddi900
digitall
chengzhicn
mexus
JChris246
redraskal
Spicadox
barsnick
docbender
KurtBestor
Chrissi2812
FrederikNS
gschizas
JC-Chung
mzhou
OndrejBakan

View File

@@ -11,6 +11,157 @@
-->
### 2023.01.06
* Fix config locations by [Grub4k](https://github.com/Grub4k), [coletdjnz](https://github.com/coletdjnz), [pukkandan](https://github.com/pukkandan)
* [downloader/aria2c] Disable native progress
* [utils] `mimetype2ext`: `weba` is not standard
* [utils] `windows_enable_vt_mode`: Better error handling
* [build] Add minimal `pyproject.toml`
* [update] Fix updater file removal on windows by [Grub4K](https://github.com/Grub4K)
* [cleanup] Misc fixes and cleanup
* [extractor/aitube] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/drtv] Add series extractors by [FrederikNS](https://github.com/FrederikNS)
* [extractor/volejtv] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/xanimu] Add extractor by [JChris246](https://github.com/JChris246)
* [extractor/youtube] Retry manifest refresh for live-from-start by [mzhou](https://github.com/mzhou)
* [extractor/biliintl] Add `/media` to `VALID_URL` by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/biliIntl] Add fallback to `video_data` by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/crunchyroll:show] Add `language` to entries by [Chrissi2812](https://github.com/Chrissi2812)
* [extractor/joj] Fix extractor by [OndrejBakan](https://github.com/OndrejBakan), [pukkandan](https://github.com/pukkandan)
* [extractor/nbc] Update graphql query by [jacobtruman](https://github.com/jacobtruman)
* [extractor/reddit] Add subreddit as `channel_id` by [gschizas](https://github.com/gschizas)
* [extractor/tiktok] Add `TikTokLive` extractor by [JC-Chung](https://github.com/JC-Chung)
### 2023.01.02
* **Improve plugin architecture** by [Grub4K](https://github.com/Grub4K), [coletdjnz](https://github.com/coletdjnz), [flashdagger](https://github.com/flashdagger), [pukkandan](https://github.com/pukkandan)
* Plugins can be loaded in any distribution of yt-dlp (binary, pip, source, etc.) and can be distributed and installed as packages. See [the readme](https://github.com/yt-dlp/yt-dlp/tree/05997b6e98e638d97d409c65bb5eb86da68f3b64#plugins) for more information
* Add `--compat-options 2021,2022`
* This allows devs to change defaults and make other potentially breaking changes more easily. If you need everything to work exactly as-is, put Use `--compat 2022` in your config to guard against future compat changes.
* [downloader/aria2c] Native progress for aria2c via RPC by [Lesmiscore](https://github.com/Lesmiscore), [pukkandan](https://github.com/pukkandan)
* Merge youtube-dl: Upto [commit/195f22f](https://github.com/ytdl-org/youtube-dl/commit/195f22f6) by [Grub4k](https://github.com/Grub4k), [pukkandan](https://github.com/pukkandan)
* Add pre-processor stage `video`
* Let `--parse/replace-in-metadata` run at any post-processing stage
* Add `--enable-file-urls` by [coletdjnz](https://github.com/coletdjnz)
* Add new field `aspect_ratio`
* Add `ac4` to known codecs
* Add `weba` to known extensions
* [FFmpegVideoConvertor] Add `gif` to `--recode-video`
* Add message when there are no subtitles/thumbnails
* Deprioritize HEVC-over-FLV formats by [Lesmiscore](https://github.com/Lesmiscore)
* Make early reject of `--match-filter` stricter
* Fix `--cookies-from-browser` CLI parsing
* Fix `original_url` in playlists
* Fix bug in writing playlist info-json
* Fix bugs in `PlaylistEntries`
* [downloader/ffmpeg] Fix headers for video+audio formats by [Grub4K](https://github.com/Grub4K), [bashonly](https://github.com/bashonly)
* [extractor] Add a way to distinguish IEs that returns only videos
* [extractor] Implement universal format sorting and deprecate `_sort_formats`
* [extractor] Let `_extract_format` functions obey `--ignore-no-formats`
* [extractor/generic] Add `fragment_query` extractor arg for DASH and HLS by [bashonly](https://github.com/bashonly), [pukkandan](https://github.com/pukkandan)
* [extractor/generic] Decode unicode-escaped embed URLs by [bashonly](https://github.com/bashonly)
* [extractor/generic] Don't report redirect to https
* [extractor/generic] Fix JSON LD manifest extraction by [bashonly](https://github.com/bashonly), [pukkandan](https://github.com/pukkandan)
* [extractor/generic] Use `Accept-Encoding: identity` for initial request by [coletdjnz](https://github.com/coletdjnz)
* [FormatSort] Add `mov` to `vext`
* [jsinterp] Escape regex that looks like nested set
* [webvtt] Handle premature EOF by [flashdagger](https://github.com/flashdagger)
* [utils] `classproperty`: Add cache support
* [utils] `get_exe_version`: Detect broken executables by [dirkf](https://github.com/dirkf), [pukkandan](https://github.com/pukkandan)
* [utils] `js_to_json`: Fix bug in [f55523c](https://github.com/yt-dlp/yt-dlp/commit/f55523c) by [ChillingPepper](https://github.com/ChillingPepper), [pukkandan](https://github.com/pukkandan)
* [utils] Make `ExtractorError` mutable
* [utils] Move `FileDownloader.parse_bytes` into utils
* [utils] Move format sorting code into `utils`
* [utils] `windows_enable_vt_mode`: Proper implementation by [Grub4K](https://github.com/Grub4K)
* [update] Workaround [#5632](https://github.com/yt-dlp/yt-dlp/issues/5632)
* [docs] Improvements
* [cleanup] Misc fixes and cleanup
* [cleanup] Use `random.choices` by [freezboltz](https://github.com/freezboltz)
* [extractor/airtv] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/amazonminitv] Add extractors by [GautamMKGarg](https://github.com/GautamMKGarg), [nyuszika7h](https://github.com/nyuszika7h)
* [extractor/beatbump] Add extractors by [Bobscorn](https://github.com/Bobscorn), [pukkandan](https://github.com/pukkandan)
* [extractor/europarl] Add EuroParlWebstream extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/kanal2] Add extractor by [bashonly](https://github.com/bashonly), [glensc](https://github.com/glensc), [pukkandan](https://github.com/pukkandan)
* [extractor/kankanews] Add extractor by [synthpop123](https://github.com/synthpop123)
* [extractor/kick] Add extractor by [bashonly](https://github.com/bashonly)
* [extractor/mediastream] Add extractor by [HobbyistDev](https://github.com/HobbyistDev), [elyse0](https://github.com/elyse0)
* [extractor/noice] Add NoicePodcast extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/oneplace] Add OnePlacePodcast extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/rumble] Add RumbleIE extractor by [flashdagger](https://github.com/flashdagger)
* [extractor/screencastify] Add extractor by [bashonly](https://github.com/bashonly)
* [extractor/trtcocuk] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/Veoh] Add user extractor by [tntmod54321](https://github.com/tntmod54321)
* [extractor/videoken] Add extractors by [bashonly](https://github.com/bashonly)
* [extractor/webcamerapl] Add extractor by [milkknife](https://github.com/milkknife)
* [extractor/amazon] Add `AmazonReviews` extractor by [bashonly](https://github.com/bashonly)
* [extractor/netverse] Add `NetverseSearch` extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/vimeo] Add `VimeoProIE` by [bashonly](https://github.com/bashonly), [pukkandan](https://github.com/pukkandan)
* [extractor/xiami] Remove extractors by [synthpop123](https://github.com/synthpop123)
* [extractor/youtube] Add `piped.video` by [Bnyro](https://github.com/Bnyro)
* [extractor/youtube] Consider language in format de-duplication
* [extractor/youtube] Extract DRC formats
* [extractor/youtube] Fix `ytuser:`
* [extractor/youtube] Fix bug in handling of music URLs
* [extractor/youtube] Subtitles cannot be translated to `und`
* [extractor/youtube:tab] Extract metadata from channel items by [coletdjnz](https://github.com/coletdjnz)
* [extractor/ARD] Add vtt subtitles by [CapacitorSet](https://github.com/CapacitorSet)
* [extractor/ArteTV] Extract chapters by [bashonly](https://github.com/bashonly), [iw0nderhow](https://github.com/iw0nderhow)
* [extractor/bandcamp] Add `album_artist` by [stelcodes](https://github.com/stelcodes)
* [extractor/bilibili] Fix `--no-playlist` for anthology
* [extractor/bilibili] Improve `_VALID_URL` by [skbeh](https://github.com/skbeh)
* [extractor/biliintl:series] Make partial download of series faster
* [extractor/BiliLive] Fix extractor
* [extractor/brightcove] Add `BrightcoveNewBaseIE` and fix embed extraction
* [extractor/cda] Support premium and misc improvements by [selfisekai](https://github.com/selfisekai)
* [extractor/ciscowebex] Support password-protected videos by [damianoamatruda](https://github.com/damianoamatruda)
* [extractor/curiositystream] Fix auth by [mnn](https://github.com/mnn)
* [extractor/embedly] Handle vimeo embeds
* [extractor/fifa] Fix Preplay extraction by [dirkf](https://github.com/dirkf)
* [extractor/foxsports] Fix extractor by [bashonly](https://github.com/bashonly)
* [extractor/gronkh] Fix `_VALID_URL` by [muddi900](https://github.com/muddi900)
* [extractor/hotstar] Improve format metadata
* [extractor/iqiyi] Fix `Iq` JS regex by [bashonly](https://github.com/bashonly)
* [extractor/la7] Improve extractor by [nixxo](https://github.com/nixxo)
* [extractor/mediaset] Better embed detection and error messages by [nixxo](https://github.com/nixxo)
* [extractor/mixch] Support `--wait-for-video`
* [extractor/naver] Improve `_VALID_URL` for `NaverNowIE` by [bashonly](https://github.com/bashonly)
* [extractor/naver] Treat fan subtitles as separate language
* [extractor/netverse] Extract comments by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/nosnl] Add support for /video by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/odnoklassniki] Extract subtitles by [bashonly](https://github.com/bashonly)
* [extractor/pinterest] Fix extractor by [bashonly](https://github.com/bashonly)
* [extractor/plutotv] Fix videos with non-zero start by [digitall](https://github.com/digitall)
* [extractor/polskieradio] Adapt to next.js redesigns by [selfisekai](https://github.com/selfisekai)
* [extractor/reddit] Add vcodec to fallback format by [chengzhicn](https://github.com/chengzhicn)
* [extractor/reddit] Extract crossposted media by [bashonly](https://github.com/bashonly)
* [extractor/reddit] Extract video embeds in text posts by [bashonly](https://github.com/bashonly)
* [extractor/rutube] Support private videos by [mexus](https://github.com/mexus)
* [extractor/sibnet] Separate from VKIE
* [extractor/slideslive] Fix extractor by [Grub4K](https://github.com/Grub4K), [bashonly](https://github.com/bashonly)
* [extractor/slideslive] Support embeds and slides by [Grub4K](https://github.com/Grub4K), [bashonly](https://github.com/bashonly), [pukkandan](https://github.com/pukkandan)
* [extractor/soundcloud] Support user permalink by [nosoop](https://github.com/nosoop)
* [extractor/spankbang] Fix extractor by [JChris246](https://github.com/JChris246)
* [extractor/stv] Detect DRM
* [extractor/swearnet] Fix description bug
* [extractor/tencent] Fix geo-restricted video by [elyse0](https://github.com/elyse0)
* [extractor/tiktok] Fix subs, `DouyinIE`, improve `_VALID_URL` by [bashonly](https://github.com/bashonly)
* [extractor/tiktok] Update `_VALID_URL`, add `api_hostname` arg by [bashonly](https://github.com/bashonly)
* [extractor/tiktok] Update API hostname by [redraskal](https://github.com/redraskal)
* [extractor/twitcasting] Fix videos with password by [Spicadox](https://github.com/Spicadox), [bashonly](https://github.com/bashonly)
* [extractor/twitter] Heed `--no-playlist` for multi-video tweets by [Grub4K](https://github.com/Grub4K), [bashonly](https://github.com/bashonly)
* [extractor/twitter] Refresh guest token when expired by [Grub4K](https://github.com/Grub4K), [bashonly](https://github.com/bashonly)
* [extractor/twitter:spaces] Add `Referer` to m3u8 by [nixxo](https://github.com/nixxo)
* [extractor/udemy] Fix lectures that have no URL and detect DRM
* [extractor/unsupported] Add more URLs
* [extractor/urplay] Support for audio-only formats by [barsnick](https://github.com/barsnick)
* [extractor/wistia] Improve extension detection by [Grub4k](https://github.com/Grub4k), [bashonly](https://github.com/bashonly), [pukkandan](https://github.com/pukkandan)
* [extractor/yle_areena] Support restricted videos by [docbender](https://github.com/docbender)
* [extractor/youku] Fix extractor by [KurtBestor](https://github.com/KurtBestor)
* [extractor/youporn] Fix metadata by [marieell](https://github.com/marieell)
* [extractor/redgifs] Fix bug in [8c188d5](https://github.com/yt-dlp/yt-dlp/commit/8c188d5d09177ed213a05c900d3523867c5897fd)
### 2022.11.11
* Merge youtube-dl: Upto [commit/de39d12](https://github.com/ytdl-org/youtube-dl/commit/de39d128)

View File

@@ -42,7 +42,7 @@ You can also find lists of all [contributors of yt-dlp](CONTRIBUTORS) and [autho
* Improved/fixed support for HiDive, HotStar, Hungama, LBRY, LinkedInLearning, Mxplayer, SonyLiv, TV2, Vimeo, VLive etc
## [Lesmiscore](https://github.com/Lesmiscore) (nao20010128nao)
## [Lesmiscore](https://github.com/Lesmiscore) <sub><sup>(nao20010128nao)</sup></sub>
**Bitcoin**: bc1qfd02r007cutfdjwjmyy9w23rjvtls6ncve7r3s
**Monacoin**: mona1q3tf7dzvshrhfe3md379xtvt2n22duhglv5dskr
@@ -50,3 +50,10 @@ You can also find lists of all [contributors of yt-dlp](CONTRIBUTORS) and [autho
* Download live from start to end for YouTube
* Added support for new websites AbemaTV, mildom, PixivSketch, skeb, radiko, voicy, mirrativ, openrec, whowatch, damtomo, 17.live, mixch etc
* Improved/fixed support for fc2, YahooJapanNews, tver, iwara etc
## [bashonly](https://github.com/bashonly)
* `--cookies-from-browser` support for Firefox containers
* Added support for new websites Genius, Kick, NBCStations, Triller, VideoKen etc
* Improved/fixed support for Anvato, Brightcove, Instagram, ParamountPlus, Reddit, SlidesLive, TikTok, Twitter, Vimeo etc

View File

@@ -17,8 +17,8 @@ pypi-files: AUTHORS Changelog.md LICENSE README.md README.txt supportedsites \
clean-test:
rm -rf test/testdata/sigs/player-*.js tmp/ *.annotations.xml *.aria2 *.description *.dump *.frag \
*.frag.aria2 *.frag.urls *.info.json *.live_chat.json *.meta *.part* *.tmp *.temp *.unknown_video *.ytdl \
*.3gp *.ape *.ass *.avi *.desktop *.f4v *.flac *.flv *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 *.mp4 \
*.mpga *.oga *.ogg *.opus *.png *.sbv *.srt *.swf *.swp *.tt *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp
*.3gp *.ape *.ass *.avi *.desktop *.f4v *.flac *.flv *.gif *.jpeg *.jpg *.m4a *.m4v *.mhtml *.mkv *.mov *.mp3 \
*.mp4 *.mpga *.oga *.ogg *.opus *.png *.sbv *.srt *.swf *.swp *.tt *.ttml *.url *.vtt *.wav *.webloc *.webm *.webp
clean-dist:
rm -rf yt-dlp.1.temp.md yt-dlp.1 README.txt MANIFEST build/ dist/ .coverage cover/ yt-dlp.tar.gz completions/ \
yt_dlp/extractor/lazy_extractors.py *.spec CONTRIBUTING.md.tmp yt-dlp yt-dlp.exe yt_dlp.egg-info/ AUTHORS .mailmap

218
README.md
View File

@@ -10,7 +10,7 @@
[![Discord](https://img.shields.io/discord/807245652072857610?color=blue&labelColor=555555&label=&logo=discord&style=for-the-badge)](https://discord.gg/H5MNcFW63r "Discord")
[![Supported Sites](https://img.shields.io/badge/-Supported_Sites-brightgreen.svg?style=for-the-badge)](supportedsites.md "Supported Sites")
[![License: Unlicense](https://img.shields.io/badge/-Unlicense-blue.svg?style=for-the-badge)](LICENSE "License")
[![CI Status](https://img.shields.io/github/workflow/status/yt-dlp/yt-dlp/Core%20Tests/master?label=Tests&style=for-the-badge)](https://github.com/yt-dlp/yt-dlp/actions "CI Status")
[![CI Status](https://img.shields.io/github/actions/workflow/status/yt-dlp/yt-dlp/core.yml?branch=master&label=Tests&style=for-the-badge)](https://github.com/yt-dlp/yt-dlp/actions "CI Status")
[![Commits](https://img.shields.io/github/commit-activity/m/yt-dlp/yt-dlp?label=commits&style=for-the-badge)](https://github.com/yt-dlp/yt-dlp/commits "Commit History")
[![Last Commit](https://img.shields.io/github/last-commit/yt-dlp/yt-dlp/master?label=&style=for-the-badge&display_timestamp=committer)](https://github.com/yt-dlp/yt-dlp/commits "Commit History")
@@ -61,6 +61,8 @@ yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on t
* [Modifying metadata examples](#modifying-metadata-examples)
* [EXTRACTOR ARGUMENTS](#extractor-arguments)
* [PLUGINS](#plugins)
* [Installing Plugins](#installing-plugins)
* [Developing Plugins](#developing-plugins)
* [EMBEDDING YT-DLP](#embedding-yt-dlp)
* [Embedding examples](#embedding-examples)
* [DEPRECATED OPTIONS](#deprecated-options)
@@ -74,13 +76,13 @@ yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on t
# NEW FEATURES
* Merged with **youtube-dl v2021.12.17+ [commit/de39d12](https://github.com/ytdl-org/youtube-dl/commit/de39d128)** <!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* Merged with **youtube-dl v2021.12.17+ [commit/195f22f](https://github.com/ytdl-org/youtube-dl/commit/195f22f)** <!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* **[SponsorBlock Integration](#sponsorblock-options)**: You can mark/remove sponsor sections in YouTube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API
* **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection than what is possible by simply using `--format` ([examples](#format-selection-examples))
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--write-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, playlist infojson etc. Note that the NicoNico livestreams are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--write-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, playlist infojson etc. Note that NicoNico livestreams are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
* **YouTube improvements**:
* Supports Clips, Stories (`ytstories:<channel UCID>`), Search (including filters)**\***, YouTube Music Search, Channel-specific search, Search prefixes (`ytsearch:`, `ytsearchdate:`)**\***, Mixes, YouTube Music Albums/Channels ([except self-uploaded music](https://github.com/yt-dlp/yt-dlp/issues/723)), and Feeds (`:ytfav`, `:ytwatchlater`, `:ytsubs`, `:ythistory`, `:ytrec`, `:ytnotif`)
@@ -151,12 +153,15 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
* When `--embed-subs` and `--write-subs` are used together, the subtitles are written to disk and also embedded in the media file. You can use just `--embed-subs` to embed the subs and automatically delete the separate file. See [#630 (comment)](https://github.com/yt-dlp/yt-dlp/issues/630#issuecomment-893659460) for more info. `--compat-options no-keep-subs` can be used to revert this
* `certifi` will be used for SSL root certificates, if installed. If you want to use system certificates (e.g. self-signed), use `--compat-options no-certifi`
* yt-dlp's sanitization of invalid characters in filenames is different/smarter than in youtube-dl. You can use `--compat-options filename-sanitization` to revert to youtube-dl's behavior
* yt-dlp tries to parse the external downloader outputs into the standard progress output if possible (Currently implemented: [~~aria2c~~](https://github.com/yt-dlp/yt-dlp/issues/5931)). You can use `--compat-options no-external-downloader-progress` to get the downloader output as-is
For ease of use, a few more compat options are available:
* `--compat-options all`: Use all compat options (Do NOT use)
* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams`
* `--compat-options youtube-dlc`: Same as `--compat-options all,-no-live-chat,-no-youtube-channel-redirect`
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization,no-youtube-prefer-utc-upload-date`
* `--compat-options 2022`: Same as `--compat-options no-external-downloader-progress`. Use this to enable all future compat options
# INSTALLATION
@@ -179,7 +184,7 @@ You can use `yt-dlp -U` to update if you are [using the release binaries](#relea
If you [installed with PIP](https://github.com/yt-dlp/yt-dlp/wiki/Installation#with-pip), simply re-run the same command that was used to install the program
For other third-party package managers, see [the wiki](https://github.com/yt-dlp/yt-dlp/wiki/Installation) or refer their documentation
For other third-party package managers, see [the wiki](https://github.com/yt-dlp/yt-dlp/wiki/Installation#third-party-package-managers) or refer their documentation
<!-- MANPAGE: BEGIN EXCLUDED SECTION -->
@@ -217,7 +222,7 @@ File|Description
<!-- MANPAGE: END EXCLUDED SECTION -->
Note: The manpages, shell completion files etc. are available in the [source tarball](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.tar.gz)
**Note**: The manpages, shell completion files etc. are available in the [source tarball](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp.tar.gz)
## DEPENDENCIES
Python versions 3.7+ (CPython and PyPy) are supported. Other versions and implementations may or may not work correctly.
@@ -233,8 +238,9 @@ While all the other dependencies are optional, `ffmpeg` and `ffprobe` are highly
* [**ffmpeg** and **ffprobe**](https://www.ffmpeg.org) - Required for [merging separate video and audio files](#format-selection) as well as for various [post-processing](#post-processing-options) tasks. License [depends on the build](https://www.ffmpeg.org/legal.html)
<!-- TODO: ffmpeg has merged this patch. Remove this note once there is new release -->
**Note**: There are some regressions in newer ffmpeg versions that causes various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds
There are bugs in ffmpeg that causes various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for some of these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds
**Important**: What you need is ffmpeg *binary*, **NOT** [the python package of the same name](https://pypi.org/project/ffmpeg)
### Networking
* [**certifi**](https://github.com/certifi/python-certifi)\* - Provides Mozilla's root certificate bundle. Licensed under [MPLv2](https://github.com/certifi/python-certifi/blob/master/LICENSE)
@@ -281,7 +287,7 @@ On some systems, you may need to use `py` or `python` instead of `python3`.
`pyinst.py` accepts any arguments that can be passed to `pyinstaller`, such as `--onefile/-F` or `--onedir/-D`, which is further [documented here](https://pyinstaller.org/en/stable/usage.html#what-to-generate).
Note that pyinstaller with versions below 4.4 [do not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment.
**Note**: Pyinstaller versions below 4.4 [do not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment.
**Important**: Running `pyinstaller` directly **without** using `pyinst.py` is **not** officially supported. This may or may not work correctly.
@@ -414,6 +420,8 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
--source-address IP Client-side IP address to bind to
-4, --force-ipv4 Make all connections via IPv4
-6, --force-ipv6 Make all connections via IPv6
--enable-file-urls Enable file:// URLs. This is disabled by
default for security reasons.
## Geo-restriction:
--geo-verification-proxy URL Use this proxy to verify the IP address for
@@ -432,23 +440,25 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
explicitly provided IP block in CIDR notation
## Video Selection:
-I, --playlist-items ITEM_SPEC Comma separated playlist_index of the videos
-I, --playlist-items ITEM_SPEC Comma separated playlist_index of the items
to download. You can specify a range using
"[START]:[STOP][:STEP]". For backward
compatibility, START-STOP is also supported.
Use negative indices to count from the right
and negative STEP to download in reverse
order. E.g. "-I 1:3,7,-5::2" used on a
playlist of size 15 will download the videos
playlist of size 15 will download the items
at index 1,2,3,7,11,13,15
--min-filesize SIZE Do not download any videos smaller than
--min-filesize SIZE Abort download if filesize is smaller than
SIZE, e.g. 50k or 44.6M
--max-filesize SIZE Abort download if filesize is larger than
SIZE, e.g. 50k or 44.6M
--max-filesize SIZE Do not download any videos larger than SIZE,
e.g. 50k or 44.6M
--date DATE Download only videos uploaded on this date.
The date can be "YYYYMMDD" or in the format
[now|today|yesterday][-N[day|week|month|year]].
E.g. --date today-2weeks
E.g. "--date today-2weeks" downloads
only videos uploaded on the same day two
weeks ago
--datebefore DATE Download only videos uploaded on or before
this date. The date formats accepted is the
same as --date
@@ -491,9 +501,9 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
a file that is in the archive
--break-on-reject Stop the download process when encountering
a file that has been filtered out
--break-per-input --break-on-existing, --break-on-reject,
--max-downloads, and autonumber resets per
input URL
--break-per-input Alters --max-downloads, --break-on-existing,
--break-on-reject, and autonumber to reset
per input URL
--no-break-per-input --break-on-existing and similar options
terminates the entire download queue
--skip-playlist-after-errors N Number of allowed failures until the rest of
@@ -525,8 +535,8 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
linear=1::2 --retry-sleep fragment:exp=1:20
--skip-unavailable-fragments Skip unavailable fragments for DASH,
hlsnative and ISM downloads (default)
(Alias: --no-abort-on-unavailable-fragment)
--abort-on-unavailable-fragment
(Alias: --no-abort-on-unavailable-fragments)
--abort-on-unavailable-fragments
Abort download if a fragment is unavailable
(Alias: --no-skip-unavailable-fragments)
--keep-fragments Keep downloaded fragments on disk after
@@ -725,7 +735,7 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
screen, optionally prefixed with when to
print it, separated by a ":". Supported
values of "WHEN" are the same as that of
--use-postprocessor, and "video" (default).
--use-postprocessor (default: video).
Implies --quiet. Implies --simulate unless
--no-simulate or later stages of WHEN are
used. This option can be used multiple times
@@ -893,11 +903,11 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
specific bitrate like 128K (default 5)
--remux-video FORMAT Remux the video into another container if
necessary (currently supported: avi, flv,
mkv, mov, mp4, webm, aac, aiff, alac, flac,
m4a, mka, mp3, ogg, opus, vorbis, wav). If
target container does not support the
video/audio codec, remuxing will fail. You
can specify multiple rules; e.g.
gif, mkv, mov, mp4, webm, aac, aiff, alac,
flac, m4a, mka, mp3, ogg, opus, vorbis,
wav). If target container does not support
the video/audio codec, remuxing will fail.
You can specify multiple rules; e.g.
"aac>m4a/mov>mp4/mkv" will remux aac to m4a,
mov to mp4 and anything else to mkv
--recode-video FORMAT Re-encode the video into another format if
@@ -952,13 +962,18 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
mkv/mka video files
--no-embed-info-json Do not embed the infojson as an attachment
to the video file
--parse-metadata FROM:TO Parse additional metadata like title/artist
--parse-metadata [WHEN:]FROM:TO
Parse additional metadata like title/artist
from other fields; see "MODIFYING METADATA"
for details
--replace-in-metadata FIELDS REGEX REPLACE
for details. Supported values of "WHEN" are
the same as that of --use-postprocessor
(default: pre_process)
--replace-in-metadata [WHEN:]FIELDS REGEX REPLACE
Replace text in a metadata field using the
given regex. This option can be used
multiple times
multiple times. Supported values of "WHEN"
are the same as that of --use-postprocessor
(default: pre_process)
--xattrs Write metadata to the video file's xattrs
(using dublin core and xdg standards)
--concat-playlist POLICY Concatenate videos in a playlist. One of
@@ -979,18 +994,18 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
--ffmpeg-location PATH Location of the ffmpeg binary; either the
path to the binary or its containing directory
--exec [WHEN:]CMD Execute a command, optionally prefixed with
when to execute it (after_move if
unspecified), separated by a ":". Supported
values of "WHEN" are the same as that of
--use-postprocessor. Same syntax as the
output template can be used to pass any
field as arguments to the command. After
download, an additional field "filepath"
that contains the final path of the
downloaded file is also available, and if no
fields are passed, %(filepath)q is appended
to the end of the command. This option can
be used multiple times
when to execute it, separated by a ":".
Supported values of "WHEN" are the same as
that of --use-postprocessor (default:
after_move). Same syntax as the output
template can be used to pass any field as
arguments to the command. After download, an
additional field "filepath" that contains
the final path of the downloaded file is
also available, and if no fields are passed,
%(filepath,_filename|)q is appended to the
end of the command. This option can be used
multiple times
--no-exec Remove any previously defined --exec
--convert-subs FORMAT Convert the subtitles to another format
(currently supported: ass, lrc, srt, vtt)
@@ -1028,14 +1043,16 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
postprocessor is invoked. It can be one of
"pre_process" (after video extraction),
"after_filter" (after video passes filter),
"before_dl" (before each video download),
"post_process" (after each video download;
default), "after_move" (after moving video
file to it's final locations), "after_video"
(after downloading and processing all
formats of a video), or "playlist" (at end
of playlist). This option can be used
multiple times to add different postprocessors
"video" (after --format; before
--print/--output), "before_dl" (before each
video download), "post_process" (after each
video download; default), "after_move"
(after moving video file to it's final
locations), "after_video" (after downloading
and processing all formats of a video), or
"playlist" (at end of playlist). This option
can be used multiple times to add different
postprocessors
## SponsorBlock Options:
Make chapter entries for, or remove various segments (sponsor,
@@ -1046,10 +1063,10 @@ Make chapter entries for, or remove various segments (sponsor,
for, separated by commas. Available
categories are sponsor, intro, outro,
selfpromo, preview, filler, interaction,
music_offtopic, poi_highlight, chapter, all and
default (=all). You can prefix the category
with a "-" to exclude it. See [1] for
description of the categories. E.g.
music_offtopic, poi_highlight, chapter, all
and default (=all). You can prefix the
category with a "-" to exclude it. See [1]
for description of the categories. E.g.
--sponsorblock-mark all,-preview
[1] https://wiki.sponsor.ajay.app/w/Segment_Categories
--sponsorblock-remove CATS SponsorBlock categories to be removed from
@@ -1058,7 +1075,7 @@ Make chapter entries for, or remove various segments (sponsor,
remove takes precedence. The syntax and
available categories are the same as for
--sponsorblock-mark except that "default"
refers to "all,-filler" and poi_highlight and
refers to "all,-filler" and poi_highlight,
chapter are not available
--sponsorblock-chapter-title TEMPLATE
An output template for the title of the
@@ -1102,16 +1119,22 @@ You can configure yt-dlp by placing any supported command line option to a confi
* `yt-dlp.conf` in the home path given by `-P`
* If `-P` is not given, the current directory is searched
1. **User Configuration**:
* `${XDG_CONFIG_HOME}/yt-dlp/config` (recommended on Linux/macOS)
* `${XDG_CONFIG_HOME}/yt-dlp.conf`
* `${XDG_CONFIG_HOME}/yt-dlp/config` (recommended on Linux/macOS)
* `${XDG_CONFIG_HOME}/yt-dlp/config.txt`
* `${APPDATA}/yt-dlp.conf`
* `${APPDATA}/yt-dlp/config` (recommended on Windows)
* `${APPDATA}/yt-dlp/config.txt`
* `~/yt-dlp.conf`
* `~/yt-dlp.conf.txt`
* `~/.yt-dlp/config`
* `~/.yt-dlp/config.txt`
See also: [Notes about environment variables](#notes-about-environment-variables)
1. **System Configuration**:
* `/etc/yt-dlp.conf`
* `/etc/yt-dlp/config`
* `/etc/yt-dlp/config.txt`
E.g. with the following configuration file yt-dlp will always extract the audio, not copy the mtime, use a proxy and save all videos under `YouTube` directory in your home directory:
```
@@ -1130,7 +1153,7 @@ E.g. with the following configuration file yt-dlp will always extract the audio,
-o ~/YouTube/%(title)s.%(ext)s
```
Note that options in configuration file are just the same options aka switches used in regular command line calls; thus there **must be no whitespace** after `-` or `--`, e.g. `-o` or `--proxy` but not `- o` or `-- proxy`. They must also be quoted when necessary as-if it were a UNIX shell.
**Note**: Options in configuration file are just the same options aka switches used in regular command line calls; thus there **must be no whitespace** after `-` or `--`, e.g. `-o` or `--proxy` but not `- o` or `-- proxy`. They must also be quoted when necessary as-if it were a UNIX shell.
You can use `--ignore-config` if you want to disable all configuration files for a particular yt-dlp run. If `--ignore-config` is found inside any configuration file, no further configuration will be loaded. For example, having the option in the portable configuration file prevents loading of home, user, and system configurations. Additionally, (for backward compatibility) if `--ignore-config` is found inside the system configuration file, the user configuration is not loaded.
@@ -1206,7 +1229,7 @@ Additionally, you can set different output templates for the various metadata fi
<a id="outtmpl-postprocess-note"></a>
Note: Due to post-processing (i.e. merging etc.), the actual output filename might differ. Use `--print after_move:filepath` to get the name after all post-processing is complete.
**Note**: Due to post-processing (i.e. merging etc.), the actual output filename might differ. Use `--print after_move:filepath` to get the name after all post-processing is complete.
The available fields are:
@@ -1327,7 +1350,7 @@ Available only in `--sponsorblock-chapter-title`:
Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. E.g. for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `yt-dlp test video` and id `BaW_jenozKc`, this will result in a `yt-dlp test video-BaW_jenozKc.mp4` file created in the current directory.
Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with placeholder value provided with `--output-na-placeholder` (`NA` by default).
**Note**: Some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with placeholder value provided with `--output-na-placeholder` (`NA` by default).
**Tip**: Look at the `-j` output to identify which fields are available for the particular URL
@@ -1442,6 +1465,7 @@ The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `
- `filesize_approx`: An estimate for the number of bytes
- `width`: Width of the video, if known
- `height`: Height of the video, if known
- `aspect_ratio`: Aspect ratio of the video, if known
- `tbr`: Average bitrate of audio and video in KBit/s
- `abr`: Average audio bitrate in KBit/s
- `vbr`: Average video bitrate in KBit/s
@@ -1467,7 +1491,7 @@ Also filtering work for comparisons `=` (equals), `^=` (starts with), `$=` (ends
Any string comparison may be prefixed with negation `!` in order to produce an opposite comparison, e.g. `!*=` (does not contain). The comparand of a string comparison needs to be quoted with either double or single quotes if it contains spaces or special characters other than `._-`.
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the website. Any other field made available by the extractor can also be used for filtering.
**Note**: None of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the website. Any other field made available by the extractor can also be used for filtering.
Formats for which the value is not known are excluded unless you put a question mark (`?`) after the operator. You can combine format filters, so `-f "[height<=?720][tbr>500]"` selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s. You can also use the filters with `all` to download all formats that satisfy the filter, e.g. `-f "all[vcodec=none]"` selects all audio-only formats.
@@ -1487,9 +1511,9 @@ The available fields are:
- `source`: The preference of the source
- `proto`: Protocol used for download (`https`/`ftps` > `http`/`ftp` > `m3u8_native`/`m3u8` > `http_dash_segments`> `websocket_frag` > `mms`/`rtsp` > `f4f`/`f4m`)
- `vcodec`: Video Codec (`av01` > `vp9.2` > `vp9` > `h265` > `h264` > `vp8` > `h263` > `theora` > other)
- `acodec`: Audio Codec (`flac`/`alac` > `wav`/`aiff` > `opus` > `vorbis` > `aac` > `mp4a` > `mp3` > `eac3` > `ac3` > `dts` > other)
- `acodec`: Audio Codec (`flac`/`alac` > `wav`/`aiff` > `opus` > `vorbis` > `aac` > `mp4a` > `mp3` `ac4` > > `eac3` > `ac3` > `dts` > other)
- `codec`: Equivalent to `vcodec,acodec`
- `vext`: Video Extension (`mp4` > `webm` > `flv` > other). If `--prefer-free-formats` is used, `webm` is preferred.
- `vext`: Video Extension (`mp4` > `mov` > `webm` > `flv` > other). If `--prefer-free-formats` is used, `webm` is preferred.
- `aext`: Audio Extension (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other). If `--prefer-free-formats` is used, the order changes to `ogg` > `opus` > `webm` > `mp3` > `m4a` > `aac`
- `ext`: Equivalent to `vext,aext`
- `filesize`: Exact filesize, if known in advance
@@ -1565,7 +1589,7 @@ $ yt-dlp -S "+size,+br"
$ yt-dlp -f "bv*[ext=mp4]+ba[ext=m4a]/b[ext=mp4] / bv*+ba/b"
# Download the best video with the best extension
# (For video, mp4 > webm > flv. For audio, m4a > aac > mp3 ...)
# (For video, mp4 > mov > webm > flv. For audio, m4a > aac > mp3 ...)
$ yt-dlp -S "ext"
@@ -1720,7 +1744,7 @@ Some extractors accept additional arguments which can be passed using `--extract
The following extractors use this feature:
#### youtube
* `lang`: Language code to prefer translated metadata of this language (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
* `player_client`: Clients to extract video data from. The main clients are `web`, `android` and `ios` with variants `_music`, `_embedded`, `_embedscreen`, `_creator` (e.g. `web_embedded`); and `mweb` and `tv_embedded` (agegate bypass) with no variants. By default, `android,web` is used, but `tv_embedded` and `creator` variants are added as required for age-gated videos. Similarly, the music variants are added for `music.youtube.com` urls. You can use `all` to use all the clients, and `default` for the default clients.
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details
@@ -1735,6 +1759,9 @@ The following extractors use this feature:
* `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details)
* `approximate_date`: Extract approximate `upload_date` and `timestamp` in flat-playlist. This may cause date-based filters to be slightly off
#### generic
* `fragment_query`: Passthrough any query in mpd/m3u8 manifest URLs to their fragments. Does not apply to ffmpeg
#### funimation
* `language`: Audio languages to extract, e.g. `funimation:language=english,japanese`
* `version`: The video version to extract - `uncut` or `simulcast`
@@ -1761,6 +1788,7 @@ The following extractors use this feature:
* `dr`: dynamic range to ignore - one or more of `sdr`, `hdr10`, `dv`
#### tiktok
* `api_hostname`: Hostname to use for mobile API requests, e.g. `api-h2.tiktokv.com`
* `app_version`: App version to call mobile APIs with - should be set along with `manifest_app_version`, e.g. `20.2.1`
* `manifest_app_version`: Numeric app version to call mobile APIs with, e.g. `221`
@@ -1770,26 +1798,78 @@ The following extractors use this feature:
#### twitter
* `force_graphql`: Force usage of the GraphQL API. By default it will only be used if login cookies are provided
NOTE: These options may be changed/removed in the future without concern for backward compatibility
**Note**: These options may be changed/removed in the future without concern for backward compatibility
<!-- MANPAGE: MOVE "INSTALLATION" SECTION HERE -->
# PLUGINS
Plugins are loaded from `<root-dir>/ytdlp_plugins/<type>/__init__.py`; where `<root-dir>` is the directory of the binary (`<root-dir>/yt-dlp`), or the root directory of the module if you are running directly from source-code (`<root dir>/yt_dlp/__main__.py`). Plugins are currently not supported for the `pip` version
Note that **all** plugins are imported even if not invoked, and that **there are no checks** performed on plugin code. **Use plugins at your own risk and only if you trust the code!**
Plugins can be of `<type>`s `extractor` or `postprocessor`. Extractor plugins do not need to be enabled from the CLI and are automatically invoked when the input URL is suitable for it. Postprocessor plugins can be invoked using `--use-postprocessor NAME`.
Plugins can be of `<type>`s `extractor` or `postprocessor`.
- Extractor plugins do not need to be enabled from the CLI and are automatically invoked when the input URL is suitable for it.
- Extractor plugins take priority over builtin extractors.
- Postprocessor plugins can be invoked using `--use-postprocessor NAME`.
See [ytdlp_plugins](ytdlp_plugins) for example plugins.
Note that **all** plugins are imported even if not invoked, and that **there are no checks** performed on plugin code. Use plugins at your own risk and only if you trust the code
Plugins are loaded from the namespace packages `yt_dlp_plugins.extractor` and `yt_dlp_plugins.postprocessor`.
If you are a plugin author, add [ytdlp-plugins](https://github.com/topics/ytdlp-plugins) as a topic to your repository for discoverability
In other words, the file structure on the disk looks something like:
yt_dlp_plugins/
extractor/
myplugin.py
postprocessor/
myplugin.py
yt-dlp looks for these `yt_dlp_plugins` namespace folders in many locations (see below) and loads in plugins from **all** of them.
See the [wiki for some known plugins](https://github.com/yt-dlp/yt-dlp/wiki/Plugins)
## Installing Plugins
Plugins can be installed using various methods and locations.
1. **Configuration directories**:
Plugin packages (containing a `yt_dlp_plugins` namespace folder) can be dropped into the following standard [configuration locations](#configuration):
* **User Plugins**
* `${XDG_CONFIG_HOME}/yt-dlp/plugins/<package name>/yt_dlp_plugins/` (recommended on Linux/macOS)
* `${XDG_CONFIG_HOME}/yt-dlp-plugins/<package name>/yt_dlp_plugins/`
* `${APPDATA}/yt-dlp/plugins/<package name>/yt_dlp_plugins/` (recommended on Windows)
* `${APPDATA}/yt-dlp-plugins/<package name>/yt_dlp_plugins/`
* `~/.yt-dlp/plugins/<package name>/yt_dlp_plugins/`
* `~/yt-dlp-plugins/<package name>/yt_dlp_plugins/`
* **System Plugins**
* `/etc/yt-dlp/plugins/<package name>/yt_dlp_plugins/`
* `/etc/yt-dlp-plugins/<package name>/yt_dlp_plugins/`
2. **Executable location**: Plugin packages can similarly be installed in a `yt-dlp-plugins` directory under the executable location:
* Binary: where `<root-dir>/yt-dlp.exe`, `<root-dir>/yt-dlp-plugins/<package name>/yt_dlp_plugins/`
* Source: where `<root-dir>/yt_dlp/__main__.py`, `<root-dir>/yt-dlp-plugins/<package name>/yt_dlp_plugins/`
3. **pip and other locations in `PYTHONPATH`**
* Plugin packages can be installed and managed using `pip`. See [yt-dlp-sample-plugins](https://github.com/yt-dlp/yt-dlp-sample-plugins) for an example.
* Note: plugin files between plugin packages installed with pip must have unique filenames.
* Any path in `PYTHONPATH` is searched in for the `yt_dlp_plugins` namespace folder.
* Note: This does not apply for Pyinstaller/py2exe builds.
`.zip`, `.egg` and `.whl` archives containing a `yt_dlp_plugins` namespace folder in their root are also supported as plugin packages.
* e.g. `${XDG_CONFIG_HOME}/yt-dlp/plugins/mypluginpkg.zip` where `mypluginpkg.zip` contains `yt_dlp_plugins/<type>/myplugin.py`
Run yt-dlp with `--verbose` to check if the plugin has been loaded.
## Developing Plugins
See the [yt-dlp-sample-plugins](https://github.com/yt-dlp/yt-dlp-sample-plugins) repo for a template plugin package and the [Plugin Development](https://github.com/yt-dlp/yt-dlp/wiki/Plugin-Development) section of the wiki for a plugin development guide.
All public classes with a name ending in `IE`/`PP` are imported from each file for extractors and postprocessors repectively. This respects underscore prefix (e.g. `_MyBasePluginIE` is private) and `__all__`. Modules can similarly be excluded by prefixing the module name with an underscore (e.g. `_myplugin.py`).
To replace an existing extractor with a subclass of one, set the `plugin_name` class keyword argument (e.g. `class MyPluginIE(ABuiltInIE, plugin_name='myplugin')` will replace `ABuiltInIE` with `MyPluginIE`). Since the extractor replaces the parent, you should exclude the subclass extractor from being imported separately by making it private using one of the methods described above.
If you are a plugin author, add [yt-dlp-plugins](https://github.com/topics/yt-dlp-plugins) as a topic to your repository for discoverability.
See the [Developer Instructions](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#developer-instructions) on how to write and test an extractor.
# EMBEDDING YT-DLP

View File

@@ -10,7 +10,7 @@ from ..utils import (
)
# These bloat the lazy_extractors, so allow them to passthrough silently
ALLOWED_CLASSMETHODS = {'get_testcases', 'extract_from_webpage'}
ALLOWED_CLASSMETHODS = {'extract_from_webpage', 'get_testcases', 'get_webpage_testcases'}
_WARNED = False

View File

@@ -14,10 +14,17 @@ from devscripts.utils import get_filename_args, read_file, write_file
NO_ATTR = object()
STATIC_CLASS_PROPERTIES = [
'IE_NAME', 'IE_DESC', 'SEARCH_KEY', '_VALID_URL', '_WORKING', '_ENABLED', '_NETRC_MACHINE', 'age_limit'
'IE_NAME', '_ENABLED', '_VALID_URL', # Used for URL matching
'_WORKING', 'IE_DESC', '_NETRC_MACHINE', 'SEARCH_KEY', # Used for --extractor-descriptions
'age_limit', # Used for --age-limit (evaluated)
'_RETURN_TYPE', # Accessed in CLI only with instance (evaluated)
]
CLASS_METHODS = [
'ie_key', 'working', 'description', 'suitable', '_match_valid_url', '_match_id', 'get_temp_id', 'is_suitable'
'ie_key', 'suitable', '_match_valid_url', # Used for URL matching
'working', 'get_temp_id', '_match_id', # Accessed just before instance creation
'description', # Used for --extractor-descriptions
'is_suitable', # Used for --age-limit
'supports_login', 'is_single_video', # Accessed in CLI only with instance
]
IE_TEMPLATE = '''
class {name}({bases}):
@@ -33,8 +40,12 @@ def main():
_ALL_CLASSES = get_all_ies() # Must be before import
import yt_dlp.plugins
from yt_dlp.extractor.common import InfoExtractor, SearchInfoExtractor
# Filter out plugins
_ALL_CLASSES = [cls for cls in _ALL_CLASSES if not cls.__module__.startswith(f'{yt_dlp.plugins.PACKAGE_NAME}.')]
DummyInfoExtractor = type('InfoExtractor', (InfoExtractor,), {'IE_NAME': NO_ATTR})
module_src = '\n'.join((
MODULE_TEMPLATE,

5
pyproject.toml Normal file
View File

@@ -0,0 +1,5 @@
[build-system]
build-backend = 'setuptools.build_meta'
# https://github.com/yt-dlp/yt-dlp/issues/5941
# https://github.com/pypa/distutils/issues/17
requires = ['setuptools > 50']

View File

@@ -26,12 +26,12 @@ markers =
[tox:tox]
skipsdist = true
envlist = py{36,37,38,39,310},pypy{36,37,38,39}
envlist = py{36,37,38,39,310,311},pypy{36,37,38,39}
skip_missing_interpreters = true
[testenv] # tox
deps =
pytest
pytest
commands = pytest {posargs:"-m not download"}
passenv = HOME # For test_compat_expanduser
setenv =

View File

@@ -1,8 +1,12 @@
#!/usr/bin/env python3
import os.path
import subprocess
# Allow execution from anywhere
import os
import sys
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
import subprocess
import warnings
try:

View File

@@ -51,6 +51,8 @@
- **afreecatv:live**: [<abbr title="netrc machine"><em>afreecatv</em></abbr>] afreecatv.com
- **afreecatv:user**
- **AirMozilla**
- **AirTV**
- **AitubeKZVideo**
- **AliExpressLive**
- **AlJazeera**
- **Allocine**
@@ -60,6 +62,10 @@
- **Alura**: [<abbr title="netrc machine"><em>alura</em></abbr>]
- **AluraCourse**: [<abbr title="netrc machine"><em>aluracourse</em></abbr>]
- **Amara**
- **AmazonMiniTV**
- **amazonminitv:season**: Amazon MiniTV Series, "minitv:season:" prefix
- **amazonminitv:series**
- **AmazonReviews**
- **AmazonStore**
- **AMCNetworks**
- **AmericasTestKitchen**
@@ -130,6 +136,8 @@
- **BBVTV**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>]
- **BBVTVLive**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>]
- **BBVTVRecordings**: [<abbr title="netrc machine"><em>bbvtv</em></abbr>]
- **BeatBumpPlaylist**
- **BeatBumpVideo**
- **Beatport**
- **Beeg**
- **BehindKink**
@@ -157,7 +165,7 @@
- **BilibiliSpacePlaylist**
- **BilibiliSpaceVideo**
- **BiliIntl**: [<abbr title="netrc machine"><em>biliintl</em></abbr>]
- **BiliIntlSeries**: [<abbr title="netrc machine"><em>biliintl</em></abbr>]
- **biliIntl:series**: [<abbr title="netrc machine"><em>biliintl</em></abbr>]
- **BiliLive**
- **BioBioChileTV**
- **Biography**
@@ -345,6 +353,8 @@
- **DrTuber**
- **drtv**
- **drtv:live**
- **drtv:season**
- **drtv:series**
- **DTube**
- **duboku**: www.duboku.io
- **duboku:list**: www.duboku.io entire series
@@ -387,6 +397,7 @@
- **ESPNCricInfo**
- **EsriVideo**
- **Europa**
- **EuroParlWebstream**
- **EuropeanTour**
- **Eurosport**
- **EUScreen**
@@ -599,6 +610,8 @@
- **JWPlatform**
- **Kakao**
- **Kaltura**
- **Kanal2**
- **KankaNews**
- **Karaoketv**
- **KarriereVideos**
- **Katsomo**
@@ -607,8 +620,10 @@
- **Ketnet**
- **khanacademy**
- **khanacademy:unit**
- **Kick**
- **Kicker**
- **KickStarter**
- **KickVOD**
- **KinjaEmbed**
- **KinoPoisk**
- **KompasVideo**
@@ -709,6 +724,7 @@
- **Mediasite**
- **MediasiteCatalog**
- **MediasiteNamedCatalog**
- **MediaStream**
- **MediaWorksNZVOD**
- **Medici**
- **megaphone.fm**: megaphone.fm embedded players
@@ -845,6 +861,7 @@
- **NetPlusTVRecordings**: [<abbr title="netrc machine"><em>netplus</em></abbr>]
- **Netverse**
- **NetversePlaylist**
- **NetverseSearch**: "netsearch:" prefix
- **Netzkino**
- **Newgrounds**
- **Newgrounds:playlist**
@@ -887,6 +904,7 @@
- **njoy:embed**
- **NJPWWorld**: [<abbr title="netrc machine"><em>njpwworld</em></abbr>] 新日本プロレスワールド
- **NobelPrize**
- **NoicePodcast**
- **NonkTube**
- **NoodleMagazine**
- **Noovo**
@@ -933,6 +951,7 @@
- **on24**: ON24
- **OnDemandKorea**
- **OneFootball**
- **OnePlacePodcast**
- **onet.pl**
- **onet.tv**
- **onet.tv:channel**
@@ -1022,11 +1041,13 @@
- **PokerGoCollection**: [<abbr title="netrc machine"><em>pokergo</em></abbr>]
- **PolsatGo**
- **PolskieRadio**
- **polskieradio:audition**
- **polskieradio:category**
- **polskieradio:kierowcow**
- **polskieradio:legacy**
- **polskieradio:player**
- **polskieradio:podcast**
- **polskieradio:podcast:list**
- **PolskieRadioCategory**
- **Popcorntimes**
- **PopcornTV**
- **PornCom**
@@ -1155,6 +1176,7 @@
- **rtvslo.si**
- **RUHD**
- **Rule34Video**
- **Rumble**
- **RumbleChannel**
- **RumbleEmbed**
- **Ruptly**
@@ -1189,6 +1211,7 @@
- **screen.yahoo:search**: Yahoo screen search; "yvsearch:" prefix
- **Screen9**
- **Screencast**
- **Screencastify**
- **ScreencastOMatic**
- **ScrippsNetworks**
- **scrippsnetworks:watch**
@@ -1212,6 +1235,7 @@
- **ShugiinItvLive**: 衆議院インターネット審議中継
- **ShugiinItvLiveRoom**: 衆議院インターネット審議中継 (中継)
- **ShugiinItvVod**: 衆議院インターネット審議中継 (ビデオライブラリ)
- **SibnetEmbed**
- **simplecast**
- **simplecast:episode**
- **simplecast:podcast**
@@ -1227,7 +1251,7 @@
- **skynewsarabia:video**
- **SkyNewsAU**
- **Slideshare**
- **SlidesLive**: (**Currently broken**)
- **SlidesLive**
- **Slutload**
- **Smotrim**
- **Snotr**
@@ -1241,6 +1265,7 @@
- **soundcloud:set**: [<abbr title="netrc machine"><em>soundcloud</em></abbr>]
- **soundcloud:trackstation**: [<abbr title="netrc machine"><em>soundcloud</em></abbr>]
- **soundcloud:user**: [<abbr title="netrc machine"><em>soundcloud</em></abbr>]
- **soundcloud:user:permalink**: [<abbr title="netrc machine"><em>soundcloud</em></abbr>]
- **SoundcloudEmbed**
- **soundgasm**
- **soundgasm:profile**
@@ -1352,10 +1377,14 @@
- **ThisAmericanLife**
- **ThisAV**
- **ThisOldHouse**
- **ThisVid**
- **ThisVidMember**
- **ThisVidPlaylist**
- **ThreeSpeak**
- **ThreeSpeakUser**
- **TikTok**
- **tiktok:effect**: (**Currently broken**)
- **tiktok:live**
- **tiktok:sound**: (**Currently broken**)
- **tiktok:tag**: (**Currently broken**)
- **tiktok:user**: (**Currently broken**)
@@ -1383,6 +1412,7 @@
- **TrovoChannelClip**: All Clips of a trovo.live channel; "trovoclip:" prefix
- **TrovoChannelVod**: All VODs of a trovo.live channel; "trovovod:" prefix
- **TrovoVod**
- **TrtCocukVideo**
- **TrueID**
- **TruNews**
- **Truth**
@@ -1483,6 +1513,7 @@
- **VeeHD**
- **Veo**
- **Veoh**
- **veoh:user**
- **Vesti**: Вести.Ru
- **Vevo**
- **VevoPlaylist**
@@ -1502,6 +1533,11 @@
- **video.sky.it:live**
- **VideoDetective**
- **videofy.me**
- **VideoKen**
- **VideoKenCategory**
- **VideoKenPlayer**
- **VideoKenPlaylist**
- **VideoKenTopic**
- **videomore**
- **videomore:season**
- **videomore:video**
@@ -1521,6 +1557,7 @@
- **vimeo:group**: [<abbr title="netrc machine"><em>vimeo</em></abbr>]
- **vimeo:likes**: [<abbr title="netrc machine"><em>vimeo</em></abbr>] Vimeo user likes
- **vimeo:ondemand**: [<abbr title="netrc machine"><em>vimeo</em></abbr>]
- **vimeo:pro**: [<abbr title="netrc machine"><em>vimeo</em></abbr>]
- **vimeo:review**: [<abbr title="netrc machine"><em>vimeo</em></abbr>] Review pages on vimeo
- **vimeo:user**: [<abbr title="netrc machine"><em>vimeo</em></abbr>]
- **vimeo:watchlater**: [<abbr title="netrc machine"><em>vimeo</em></abbr>] Vimeo watch later list, ":vimeowatchlater" keyword (requires authentication)
@@ -1549,6 +1586,7 @@
- **VoiceRepublic**
- **voicy**
- **voicy:channel**
- **VolejTV**
- **Voot**
- **VootSeries**
- **VoxMedia**
@@ -1591,6 +1629,7 @@
- **WDRElefant**
- **WDRPage**
- **web.archive:youtube**: web.archive.org saved youtube videos, "ytarchive:" prefix
- **Webcamerapl**
- **Webcaster**
- **WebcasterFeed**
- **WebOfStories**
@@ -1604,6 +1643,7 @@
- **wikimedia.org**
- **Willow**
- **WimTV**
- **WinSportsVideo**
- **Wistia**
- **WistiaChannel**
- **WistiaPlaylist**
@@ -1618,16 +1658,13 @@
- **WWE**
- **wyborcza:video**
- **WyborczaPodcast**
- **Xanimu**
- **XBef**
- **XboxClips**
- **XFileShare**: XFileShare based sites: Aparat, ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, WolfStream, XVideoSharing
- **XHamster**
- **XHamsterEmbed**
- **XHamsterUser**
- **xiami:album**: 虾米音乐 - 专辑
- **xiami:artist**: 虾米音乐 - 歌手
- **xiami:collection**: 虾米音乐 - 精选集
- **xiami:song**: 虾米音乐
- **ximalaya**: 喜马拉雅FM
- **ximalaya:album**: 喜马拉雅FM 专辑
- **xinpianchang**: xinpianchang.com

View File

@@ -44,5 +44,6 @@
"writesubtitles": false,
"allsubtitles": false,
"listsubtitles": false,
"fixup": "never"
"fixup": "never",
"allow_playlist_files": false
}

View File

@@ -41,7 +41,9 @@ class InfoExtractorTestRequestHandler(http.server.BaseHTTPRequestHandler):
class DummyIE(InfoExtractor):
pass
def _sort_formats(self, formats, field_preference=[]):
self._downloader.sort_formats(
{'formats': formats, '_format_sort_fields': field_preference})
class TestInfoExtractor(unittest.TestCase):

View File

@@ -68,8 +68,7 @@ class TestFormatSelection(unittest.TestCase):
{'ext': 'mp4', 'height': 460, 'url': TEST_URL},
]
info_dict = _make_result(formats)
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['ext'], 'webm')
@@ -82,8 +81,7 @@ class TestFormatSelection(unittest.TestCase):
{'ext': 'mp4', 'height': 1080, 'url': TEST_URL},
]
info_dict['formats'] = formats
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['ext'], 'mp4')
@@ -97,8 +95,7 @@ class TestFormatSelection(unittest.TestCase):
{'ext': 'flv', 'height': 720, 'url': TEST_URL},
]
info_dict['formats'] = formats
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['ext'], 'mp4')
@@ -110,15 +107,14 @@ class TestFormatSelection(unittest.TestCase):
{'ext': 'webm', 'height': 720, 'url': TEST_URL},
]
info_dict['formats'] = formats
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['ext'], 'webm')
def test_format_selection(self):
formats = [
{'format_id': '35', 'ext': 'mp4', 'preference': 1, 'url': TEST_URL},
{'format_id': '35', 'ext': 'mp4', 'preference': 0, 'url': TEST_URL},
{'format_id': 'example-with-dashes', 'ext': 'webm', 'preference': 1, 'url': TEST_URL},
{'format_id': '45', 'ext': 'webm', 'preference': 2, 'url': TEST_URL},
{'format_id': '47', 'ext': 'webm', 'preference': 3, 'url': TEST_URL},
@@ -186,22 +182,19 @@ class TestFormatSelection(unittest.TestCase):
info_dict = _make_result(formats)
ydl = YDL({'format': 'best'})
ie = YoutubeIE(ydl)
ie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(copy.deepcopy(info_dict))
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'aac-64')
ydl = YDL({'format': 'mp3'})
ie = YoutubeIE(ydl)
ie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(copy.deepcopy(info_dict))
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'mp3-64')
ydl = YDL({'prefer_free_formats': True})
ie = YoutubeIE(ydl)
ie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(copy.deepcopy(info_dict))
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'ogg-64')
@@ -346,8 +339,7 @@ class TestFormatSelection(unittest.TestCase):
info_dict = _make_result(list(formats_order), extractor='youtube')
ydl = YDL({'format': 'bestvideo+bestaudio'})
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], '248+172')
@@ -355,40 +347,35 @@ class TestFormatSelection(unittest.TestCase):
info_dict = _make_result(list(formats_order), extractor='youtube')
ydl = YDL({'format': 'bestvideo[height>=999999]+bestaudio/best'})
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], '38')
info_dict = _make_result(list(formats_order), extractor='youtube')
ydl = YDL({'format': 'bestvideo/best,bestaudio'})
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(info_dict)
downloaded_ids = [info['format_id'] for info in ydl.downloaded_info_dicts]
self.assertEqual(downloaded_ids, ['137', '141'])
info_dict = _make_result(list(formats_order), extractor='youtube')
ydl = YDL({'format': '(bestvideo[ext=mp4],bestvideo[ext=webm])+bestaudio'})
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(info_dict)
downloaded_ids = [info['format_id'] for info in ydl.downloaded_info_dicts]
self.assertEqual(downloaded_ids, ['137+141', '248+141'])
info_dict = _make_result(list(formats_order), extractor='youtube')
ydl = YDL({'format': '(bestvideo[ext=mp4],bestvideo[ext=webm])[height<=720]+bestaudio'})
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(info_dict)
downloaded_ids = [info['format_id'] for info in ydl.downloaded_info_dicts]
self.assertEqual(downloaded_ids, ['136+141', '247+141'])
info_dict = _make_result(list(formats_order), extractor='youtube')
ydl = YDL({'format': '(bestvideo[ext=none]/bestvideo[ext=webm])+bestaudio'})
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(info_dict)
downloaded_ids = [info['format_id'] for info in ydl.downloaded_info_dicts]
self.assertEqual(downloaded_ids, ['248+141'])
@@ -396,16 +383,14 @@ class TestFormatSelection(unittest.TestCase):
for f1, f2 in zip(formats_order, formats_order[1:]):
info_dict = _make_result([f1, f2], extractor='youtube')
ydl = YDL({'format': 'best/bestvideo'})
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], f1['format_id'])
info_dict = _make_result([f2, f1], extractor='youtube')
ydl = YDL({'format': 'best/bestvideo'})
yie = YoutubeIE(ydl)
yie._sort_formats(info_dict['formats'])
ydl.sort_formats(info_dict)
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], f1['format_id'])
@@ -480,7 +465,7 @@ class TestFormatSelection(unittest.TestCase):
for f in formats:
f['url'] = 'http://_/'
f['ext'] = 'unknown'
info_dict = _make_result(formats)
info_dict = _make_result(formats, _format_sort_fields=('id', ))
ydl = YDL({'format': 'best[filesize<3000]'})
ydl.process_ie_result(info_dict)

227
test/test_config.py Normal file
View File

@@ -0,0 +1,227 @@
#!/usr/bin/env python3
# Allow direct execution
import os
import sys
import unittest
import unittest.mock
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import contextlib
import itertools
from pathlib import Path
from yt_dlp.compat import compat_expanduser
from yt_dlp.options import create_parser, parseOpts
from yt_dlp.utils import Config, get_executable_path
ENVIRON_DEFAULTS = {
'HOME': None,
'XDG_CONFIG_HOME': '/_xdg_config_home/',
'USERPROFILE': 'C:/Users/testing/',
'APPDATA': 'C:/Users/testing/AppData/Roaming/',
'HOMEDRIVE': 'C:/',
'HOMEPATH': 'Users/testing/',
}
@contextlib.contextmanager
def set_environ(**kwargs):
saved_environ = os.environ.copy()
for name, value in {**ENVIRON_DEFAULTS, **kwargs}.items():
if value is None:
os.environ.pop(name, None)
else:
os.environ[name] = value
yield
os.environ.clear()
os.environ.update(saved_environ)
def _generate_expected_groups():
xdg_config_home = os.getenv('XDG_CONFIG_HOME') or compat_expanduser('~/.config')
appdata_dir = os.getenv('appdata')
home_dir = compat_expanduser('~')
return {
'Portable': [
Path(get_executable_path(), 'yt-dlp.conf'),
],
'Home': [
Path('yt-dlp.conf'),
],
'User': [
Path(xdg_config_home, 'yt-dlp.conf'),
Path(xdg_config_home, 'yt-dlp', 'config'),
Path(xdg_config_home, 'yt-dlp', 'config.txt'),
*((
Path(appdata_dir, 'yt-dlp.conf'),
Path(appdata_dir, 'yt-dlp', 'config'),
Path(appdata_dir, 'yt-dlp', 'config.txt'),
) if appdata_dir else ()),
Path(home_dir, 'yt-dlp.conf'),
Path(home_dir, 'yt-dlp.conf.txt'),
Path(home_dir, '.yt-dlp', 'config'),
Path(home_dir, '.yt-dlp', 'config.txt'),
],
'System': [
Path('/etc/yt-dlp.conf'),
Path('/etc/yt-dlp/config'),
Path('/etc/yt-dlp/config.txt'),
]
}
class TestConfig(unittest.TestCase):
maxDiff = None
@set_environ()
def test_config__ENVIRON_DEFAULTS_sanity(self):
expected = make_expected()
self.assertCountEqual(
set(expected), expected,
'ENVIRON_DEFAULTS produces non unique names')
def test_config_all_environ_values(self):
for name, value in ENVIRON_DEFAULTS.items():
for new_value in (None, '', '.', value or '/some/dir'):
with set_environ(**{name: new_value}):
self._simple_grouping_test()
def test_config_default_expected_locations(self):
files, _ = self._simple_config_test()
self.assertEqual(
files, make_expected(),
'Not all expected locations have been checked')
def test_config_default_grouping(self):
self._simple_grouping_test()
def _simple_grouping_test(self):
expected_groups = make_expected_groups()
for name, group in expected_groups.items():
for index, existing_path in enumerate(group):
result, opts = self._simple_config_test(existing_path)
expected = expected_from_expected_groups(expected_groups, existing_path)
self.assertEqual(
result, expected,
f'The checked locations do not match the expected ({name}, {index})')
self.assertEqual(
opts.outtmpl['default'], '1',
f'The used result value was incorrect ({name}, {index})')
def _simple_config_test(self, *stop_paths):
encountered = 0
paths = []
def read_file(filename, default=[]):
nonlocal encountered
path = Path(filename)
paths.append(path)
if path in stop_paths:
encountered += 1
return ['-o', f'{encountered}']
with ConfigMock(read_file):
_, opts, _ = parseOpts([], False)
return paths, opts
@set_environ()
def test_config_early_exit_commandline(self):
self._early_exit_test(0, '--ignore-config')
@set_environ()
def test_config_early_exit_files(self):
for index, _ in enumerate(make_expected(), 1):
self._early_exit_test(index)
def _early_exit_test(self, allowed_reads, *args):
reads = 0
def read_file(filename, default=[]):
nonlocal reads
reads += 1
if reads > allowed_reads:
self.fail('The remaining config was not ignored')
elif reads == allowed_reads:
return ['--ignore-config']
with ConfigMock(read_file):
parseOpts(args, False)
@set_environ()
def test_config_override_commandline(self):
self._override_test(0, '-o', 'pass')
@set_environ()
def test_config_override_files(self):
for index, _ in enumerate(make_expected(), 1):
self._override_test(index)
def _override_test(self, start_index, *args):
index = 0
def read_file(filename, default=[]):
nonlocal index
index += 1
if index > start_index:
return ['-o', 'fail']
elif index == start_index:
return ['-o', 'pass']
with ConfigMock(read_file):
_, opts, _ = parseOpts(args, False)
self.assertEqual(
opts.outtmpl['default'], 'pass',
'The earlier group did not override the later ones')
@contextlib.contextmanager
def ConfigMock(read_file=None):
with unittest.mock.patch('yt_dlp.options.Config') as mock:
mock.return_value = Config(create_parser())
if read_file is not None:
mock.read_file = read_file
yield mock
def make_expected(*filepaths):
return expected_from_expected_groups(_generate_expected_groups(), *filepaths)
def make_expected_groups(*filepaths):
return _filter_expected_groups(_generate_expected_groups(), filepaths)
def expected_from_expected_groups(expected_groups, *filepaths):
return list(itertools.chain.from_iterable(
_filter_expected_groups(expected_groups, filepaths).values()))
def _filter_expected_groups(expected, filepaths):
if not filepaths:
return expected
result = {}
for group, paths in expected.items():
new_paths = []
for path in paths:
new_paths.append(path)
if path in filepaths:
break
result[group] = new_paths
return result
if __name__ == '__main__':
unittest.main()

73
test/test_plugins.py Normal file
View File

@@ -0,0 +1,73 @@
import importlib
import os
import shutil
import sys
import unittest
from pathlib import Path
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
TEST_DATA_DIR = Path(os.path.dirname(os.path.abspath(__file__)), 'testdata')
sys.path.append(str(TEST_DATA_DIR))
importlib.invalidate_caches()
from yt_dlp.plugins import PACKAGE_NAME, directories, load_plugins
class TestPlugins(unittest.TestCase):
TEST_PLUGIN_DIR = TEST_DATA_DIR / PACKAGE_NAME
def test_directories_containing_plugins(self):
self.assertIn(self.TEST_PLUGIN_DIR, map(Path, directories()))
def test_extractor_classes(self):
for module_name in tuple(sys.modules):
if module_name.startswith(f'{PACKAGE_NAME}.extractor'):
del sys.modules[module_name]
plugins_ie = load_plugins('extractor', 'IE')
self.assertIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
self.assertIn('NormalPluginIE', plugins_ie.keys())
# don't load modules with underscore prefix
self.assertFalse(
f'{PACKAGE_NAME}.extractor._ignore' in sys.modules.keys(),
'loaded module beginning with underscore')
self.assertNotIn('IgnorePluginIE', plugins_ie.keys())
# Don't load extractors with underscore prefix
self.assertNotIn('_IgnoreUnderscorePluginIE', plugins_ie.keys())
# Don't load extractors not specified in __all__ (if supplied)
self.assertNotIn('IgnoreNotInAllPluginIE', plugins_ie.keys())
self.assertIn('InAllPluginIE', plugins_ie.keys())
def test_postprocessor_classes(self):
plugins_pp = load_plugins('postprocessor', 'PP')
self.assertIn('NormalPluginPP', plugins_pp.keys())
def test_importing_zipped_module(self):
zip_path = TEST_DATA_DIR / 'zipped_plugins.zip'
shutil.make_archive(str(zip_path)[:-4], 'zip', str(zip_path)[:-4])
sys.path.append(str(zip_path)) # add zip to search paths
importlib.invalidate_caches() # reset the import caches
try:
for plugin_type in ('extractor', 'postprocessor'):
package = importlib.import_module(f'{PACKAGE_NAME}.{plugin_type}')
self.assertIn(zip_path / PACKAGE_NAME / plugin_type, map(Path, package.__path__))
plugins_ie = load_plugins('extractor', 'IE')
self.assertIn('ZippedPluginIE', plugins_ie.keys())
plugins_pp = load_plugins('postprocessor', 'PP')
self.assertIn('ZippedPluginPP', plugins_pp.keys())
finally:
sys.path.remove(str(zip_path))
os.remove(zip_path)
importlib.invalidate_caches() # reset the import caches
if __name__ == '__main__':
unittest.main()

View File

@@ -954,6 +954,85 @@ class TestUtil(unittest.TestCase):
)
self.assertEqual(escape_url('http://vimeo.com/56015672#at=0'), 'http://vimeo.com/56015672#at=0')
def test_js_to_json_vars_strings(self):
self.assertDictEqual(
json.loads(js_to_json(
'''{
'null': a,
'nullStr': b,
'true': c,
'trueStr': d,
'false': e,
'falseStr': f,
'unresolvedVar': g,
}''',
{
'a': 'null',
'b': '"null"',
'c': 'true',
'd': '"true"',
'e': 'false',
'f': '"false"',
'g': 'var',
}
)),
{
'null': None,
'nullStr': 'null',
'true': True,
'trueStr': 'true',
'false': False,
'falseStr': 'false',
'unresolvedVar': 'var'
}
)
self.assertDictEqual(
json.loads(js_to_json(
'''{
'int': a,
'intStr': b,
'float': c,
'floatStr': d,
}''',
{
'a': '123',
'b': '"123"',
'c': '1.23',
'd': '"1.23"',
}
)),
{
'int': 123,
'intStr': '123',
'float': 1.23,
'floatStr': '1.23',
}
)
self.assertDictEqual(
json.loads(js_to_json(
'''{
'object': a,
'objectStr': b,
'array': c,
'arrayStr': d,
}''',
{
'a': '{}',
'b': '"{}"',
'c': '[]',
'd': '"[]"',
}
)),
{
'object': {},
'objectStr': '{}',
'array': [],
'arrayStr': '[]',
}
)
def test_js_to_json_realworld(self):
inp = '''{
'clip':{'provider':'pseudo'}
@@ -1874,6 +1953,8 @@ Line 1
vcodecs=[None], acodecs=[None], vexts=['webm'], aexts=['m4a']), 'mkv')
self.assertEqual(get_compatible_ext(
vcodecs=[None], acodecs=[None], vexts=['webm'], aexts=['webm']), 'webm')
self.assertEqual(get_compatible_ext(
vcodecs=[None], acodecs=[None], vexts=['webm'], aexts=['weba']), 'webm')
self.assertEqual(get_compatible_ext(
vcodecs=['h264'], acodecs=['mp4a'], vexts=['mov'], aexts=['m4a']), 'mp4')

View File

@@ -0,0 +1,5 @@
from yt_dlp.extractor.common import InfoExtractor
class IgnorePluginIE(InfoExtractor):
pass

View File

@@ -0,0 +1,12 @@
from yt_dlp.extractor.common import InfoExtractor
class IgnoreNotInAllPluginIE(InfoExtractor):
pass
class InAllPluginIE(InfoExtractor):
pass
__all__ = ['InAllPluginIE']

View File

@@ -0,0 +1,9 @@
from yt_dlp.extractor.common import InfoExtractor
class NormalPluginIE(InfoExtractor):
pass
class _IgnoreUnderscorePluginIE(InfoExtractor):
pass

View File

@@ -0,0 +1,5 @@
from yt_dlp.postprocessor.common import PostProcessor
class NormalPluginPP(PostProcessor):
pass

View File

@@ -0,0 +1,5 @@
from yt_dlp.extractor.common import InfoExtractor
class ZippedPluginIE(InfoExtractor):
pass

View File

@@ -0,0 +1,5 @@
from yt_dlp.postprocessor.common import PostProcessor
class ZippedPluginPP(PostProcessor):
pass

View File

@@ -32,7 +32,8 @@ from .extractor import gen_extractor_classes, get_info_extractor
from .extractor.common import UnsupportedURLIE
from .extractor.openload import PhantomJSwrapper
from .minicurses import format_text
from .postprocessor import _PLUGIN_CLASSES as plugin_postprocessors
from .plugins import directories as plugin_directories
from .postprocessor import _PLUGIN_CLASSES as plugin_pps
from .postprocessor import (
EmbedThumbnailPP,
FFmpegFixupDuplicateMoovPP,
@@ -67,6 +68,7 @@ from .utils import (
EntryNotInPlaylist,
ExistingVideoReached,
ExtractorError,
FormatSorter,
GeoRestrictedError,
HEADRequest,
ISO3166Utils,
@@ -316,6 +318,7 @@ class YoutubeDL:
If not provided and the key is encrypted, yt-dlp will ask interactively
prefer_insecure: Use HTTP instead of HTTPS to retrieve information.
(Only supported by some extractors)
enable_file_urls: Enable file:// URLs. This is disabled by default for security reasons.
http_headers: A dictionary of custom headers to be used for all requests
proxy: URL of the proxy server to use
geo_verification_proxy: URL of the proxy to use for IP address verification
@@ -547,7 +550,7 @@ class YoutubeDL:
_format_fields = {
# NB: Keep in sync with the docstring of extractor/common.py
'url', 'manifest_url', 'manifest_stream_number', 'ext', 'format', 'format_id', 'format_note',
'width', 'height', 'resolution', 'dynamic_range', 'tbr', 'abr', 'acodec', 'asr', 'audio_channels',
'width', 'height', 'aspect_ratio', 'resolution', 'dynamic_range', 'tbr', 'abr', 'acodec', 'asr', 'audio_channels',
'vbr', 'fps', 'vcodec', 'container', 'filesize', 'filesize_approx', 'rows', 'columns',
'player_url', 'protocol', 'fragment_base_url', 'fragments', 'is_from_start',
'preference', 'language', 'language_preference', 'quality', 'source_preference',
@@ -583,7 +586,6 @@ class YoutubeDL:
self._playlist_urls = set()
self.cache = Cache(self)
windows_enable_vt_mode()
stdout = sys.stderr if self.params.get('logtostderr') else sys.stdout
self._out_files = Namespace(
out=stdout,
@@ -592,6 +594,12 @@ class YoutubeDL:
console=None if compat_os_name == 'nt' else next(
filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None)
)
try:
windows_enable_vt_mode()
except Exception as e:
self.write_debug(f'Failed to enable VT mode: {e}')
self._allow_colors = Namespace(**{
type_: not self.params.get('no_color') and supports_terminal_sequences(stream)
for type_, stream in self._out_files.items_ if type_ != 'console'
@@ -1067,7 +1075,7 @@ class YoutubeDL:
# correspondingly that is not what we want since we need to keep
# '%%' intact for template dict substitution step. Working around
# with boundary-alike separator hack.
sep = ''.join([random.choice(ascii_letters) for _ in range(32)])
sep = ''.join(random.choices(ascii_letters, k=32))
outtmpl = outtmpl.replace('%%', f'%{sep}%').replace('$$', f'${sep}$')
# outtmpl should be expand_path'ed before template dict substitution
@@ -1357,11 +1365,19 @@ class YoutubeDL:
return self.get_output_path(dir_type, filename)
def _match_entry(self, info_dict, incomplete=False, silent=False):
""" Returns None if the file should be downloaded """
"""Returns None if the file should be downloaded"""
_type = info_dict.get('_type', 'video')
assert incomplete or _type == 'video', 'Only video result can be considered complete'
video_title = info_dict.get('title', info_dict.get('id', 'entry'))
def check_filter():
if _type in ('playlist', 'multi_video'):
return
elif _type in ('url', 'url_transparent') and not try_call(
lambda: self.get_info_extractor(info_dict['ie_key']).is_single_video(info_dict['url'])):
return
if 'title' in info_dict:
# This can happen when we're just evaluating the playlist
title = info_dict['title']
@@ -1373,6 +1389,7 @@ class YoutubeDL:
if rejecttitle:
if re.search(rejecttitle, title, re.IGNORECASE):
return '"' + title + '" title matched reject pattern "' + rejecttitle + '"'
date = info_dict.get('upload_date')
if date is not None:
dateRange = self.params.get('daterange', DateRange())
@@ -1616,8 +1633,8 @@ class YoutubeDL:
if result_type in ('url', 'url_transparent'):
ie_result['url'] = sanitize_url(
ie_result['url'], scheme='http' if self.params.get('prefer_insecure') else 'https')
if ie_result.get('original_url'):
extra_info.setdefault('original_url', ie_result['original_url'])
if ie_result.get('original_url') and not extra_info.get('original_url'):
extra_info = {'original_url': ie_result['original_url'], **extra_info}
extract_flat = self.params.get('extract_flat', False)
if ((extract_flat == 'in_playlist' and 'playlist' in extra_info)
@@ -1816,7 +1833,7 @@ class YoutubeDL:
elif self.params.get('playlistrandom'):
random.shuffle(entries)
self.to_screen(f'[{ie_result["extractor"]}] Playlist {title}: Downloading {n_entries} videos'
self.to_screen(f'[{ie_result["extractor"]}] Playlist {title}: Downloading {n_entries} items'
f'{format_field(ie_result, "playlist_count", " of %s")}')
keep_resolved_entries = self.params.get('extract_flat') != 'discard'
@@ -1849,14 +1866,13 @@ class YoutubeDL:
resolved_entries[i] = (playlist_index, NO_DEFAULT)
continue
self.to_screen('[download] Downloading video %s of %s' % (
self.to_screen('[download] Downloading item %s of %s' % (
self._format_screen(i + 1, self.Styles.ID), self._format_screen(n_entries, self.Styles.EMPHASIS)))
extra.update({
entry_result = self.__process_iterable_entry(entry, download, collections.ChainMap({
'playlist_index': playlist_index,
'playlist_autonumber': i + 1,
})
entry_result = self.__process_iterable_entry(entry, download, extra)
}, extra))
if not entry_result:
failures += 1
if failures >= max_failures:
@@ -1867,8 +1883,11 @@ class YoutubeDL:
resolved_entries[i] = (playlist_index, entry_result)
# Update with processed data
ie_result['requested_entries'] = [i for i, e in resolved_entries if e is not NO_DEFAULT]
ie_result['entries'] = [e for _, e in resolved_entries if e is not NO_DEFAULT]
ie_result['requested_entries'] = [i for i, e in resolved_entries if e is not NO_DEFAULT]
if ie_result['requested_entries'] == try_call(lambda: list(range(1, ie_result['playlist_count'] + 1))):
# Do not set for full playlist
ie_result.pop('requested_entries')
# Write the updated info to json
if _infojson_written is True and self._write_info_json(
@@ -2174,6 +2193,7 @@ class YoutubeDL:
'vcodec': the_only_video.get('vcodec'),
'vbr': the_only_video.get('vbr'),
'stretched_ratio': the_only_video.get('stretched_ratio'),
'aspect_ratio': the_only_video.get('aspect_ratio'),
})
if the_only_audio:
@@ -2448,6 +2468,18 @@ class YoutubeDL:
if err:
self.report_error(err, tb=False)
def sort_formats(self, info_dict):
formats = self._get_formats(info_dict)
if not formats:
return
# Backward compatibility with InfoExtractor._sort_formats
field_preference = formats[0].pop('__sort_fields', None)
if field_preference:
info_dict['_format_sort_fields'] = field_preference
formats.sort(key=FormatSorter(
self, info_dict.get('_format_sort_fields', [])).calculate_preference)
def process_video_result(self, info_dict, download=True):
assert info_dict.get('_type', 'video') == 'video'
self._num_videos += 1
@@ -2533,6 +2565,7 @@ class YoutubeDL:
info_dict['requested_subtitles'] = self.process_subtitles(
info_dict['id'], subtitles, automatic_captions)
self.sort_formats(info_dict)
formats = self._get_formats(info_dict)
# or None ensures --clean-infojson removes it
@@ -2616,6 +2649,8 @@ class YoutubeDL:
format['resolution'] = self.format_resolution(format, default=None)
if format.get('dynamic_range') is None and format.get('vcodec') != 'none':
format['dynamic_range'] = 'SDR'
if format.get('aspect_ratio') is None:
format['aspect_ratio'] = try_call(lambda: round(format['width'] / format['height'], 2))
if (info_dict.get('duration') and format.get('tbr')
and not format.get('filesize') and not format.get('filesize_approx')):
format['filesize_approx'] = int(info_dict['duration'] * format['tbr'] * (1024 / 8))
@@ -2942,14 +2977,22 @@ class YoutubeDL:
if 'format' not in info_dict and 'ext' in info_dict:
info_dict['format'] = info_dict['ext']
# This is mostly just for backward compatibility of process_info
# As a side-effect, this allows for format-specific filters
if self._match_entry(info_dict) is not None:
info_dict['__write_download_archive'] = 'ignore'
return
# Does nothing under normal operation - for backward compatibility of process_info
self.post_extract(info_dict)
def replace_info_dict(new_info):
nonlocal info_dict
if new_info == info_dict:
return
info_dict.clear()
info_dict.update(new_info)
new_info, _ = self.pre_process(info_dict, 'video')
replace_info_dict(new_info)
self._num_downloads += 1
# info_dict['_filename'] needs to be set for backward compatibility
@@ -3063,13 +3106,6 @@ class YoutubeDL:
for link_type, should_write in write_links.items()):
return
def replace_info_dict(new_info):
nonlocal info_dict
if new_info == info_dict:
return
info_dict.clear()
info_dict.update(new_info)
new_info, files_to_move = self.pre_process(info_dict, 'before_dl', files_to_move)
replace_info_dict(new_info)
@@ -3096,7 +3132,7 @@ class YoutubeDL:
fd, success = None, True
if info_dict.get('protocol') or info_dict.get('url'):
fd = get_suitable_downloader(info_dict, self.params, to_stdout=temp_filename == '-')
if fd is not FFmpegFD and (
if fd is not FFmpegFD and 'no-direct-merge' not in self.params['compat_opts'] and (
info_dict.get('section_start') or info_dict.get('section_end')):
msg = ('This format cannot be partially downloaded' if FFmpegFD.available()
else 'You have requested downloading the video partially, but ffmpeg is not installed')
@@ -3361,6 +3397,7 @@ class YoutubeDL:
reject = lambda k, v: v is None or k.startswith('__') or k in {
'requested_downloads', 'requested_formats', 'requested_subtitles', 'requested_entries',
'entries', 'filepath', '_filename', 'infojson_filename', 'original_url', 'playlist_autonumber',
'_format_sort_fields',
}
else:
reject = lambda k, v: False
@@ -3430,7 +3467,8 @@ class YoutubeDL:
return infodict
def run_all_pps(self, key, info, *, additional_pps=None):
self._forceprint(key, info)
if key != 'video':
self._forceprint(key, info)
for pp in (additional_pps or []) + self._pps[key]:
info = self.run_pp(pp, info)
return info
@@ -3699,7 +3737,10 @@ class YoutubeDL:
# These imports can be slow. So import them only as needed
from .extractor.extractors import _LAZY_LOADER
from .extractor.extractors import _PLUGIN_CLASSES as plugin_extractors
from .extractor.extractors import (
_PLUGIN_CLASSES as plugin_ies,
_PLUGIN_OVERRIDES as plugin_ie_overrides
)
def get_encoding(stream):
ret = str(getattr(stream, 'encoding', 'missing (%s)' % type(stream).__name__))
@@ -3744,10 +3785,6 @@ class YoutubeDL:
write_debug('Lazy loading extractors is forcibly disabled')
else:
write_debug('Lazy loading extractors is disabled')
if plugin_extractors or plugin_postprocessors:
write_debug('Plugins: %s' % [
'%s%s' % (klass.__name__, '' if klass.__name__ == name else f' as {name}')
for name, klass in itertools.chain(plugin_extractors.items(), plugin_postprocessors.items())])
if self.params['compat_opts']:
write_debug('Compatibility options: %s' % ', '.join(self.params['compat_opts']))
@@ -3781,6 +3818,21 @@ class YoutubeDL:
proxy_map.update(handler.proxies)
write_debug(f'Proxy map: {proxy_map}')
for plugin_type, plugins in {'Extractor': plugin_ies, 'Post-Processor': plugin_pps}.items():
display_list = ['%s%s' % (
klass.__name__, '' if klass.__name__ == name else f' as {name}')
for name, klass in plugins.items()]
if plugin_type == 'Extractor':
display_list.extend(f'{plugins[-1].IE_NAME.partition("+")[2]} ({parent.__name__})'
for parent, plugins in plugin_ie_overrides.items())
if not display_list:
continue
write_debug(f'{plugin_type} Plugins: {", ".join(sorted(display_list))}')
plugin_dirs = plugin_directories()
if plugin_dirs:
write_debug(f'Plugin directories: {plugin_dirs}')
# Not implemented
if False and self.params.get('call_home'):
ipaddr = self.urlopen('https://yt-dl.org/ip').read().decode()
@@ -3830,9 +3882,12 @@ class YoutubeDL:
# https://github.com/ytdl-org/youtube-dl/issues/8227)
file_handler = urllib.request.FileHandler()
def file_open(*args, **kwargs):
raise urllib.error.URLError('file:// scheme is explicitly disabled in yt-dlp for security reasons')
file_handler.file_open = file_open
if not self.params.get('enable_file_urls'):
def file_open(*args, **kwargs):
raise urllib.error.URLError(
'file:// URLs are explicitly disabled in yt-dlp for security reasons. '
'Use --enable-file-urls to enable at your own risk.')
file_handler.file_open = file_open
opener = urllib.request.build_opener(
proxy_handler, https_handler, cookie_processor, ydlh, redirect_handler, data_handler, file_handler)
@@ -3894,7 +3949,7 @@ class YoutubeDL:
elif not self.params.get('overwrites', True) and os.path.exists(descfn):
self.to_screen(f'[info] {label.title()} description is already present')
elif ie_result.get('description') is None:
self.report_warning(f'There\'s no {label} description to write')
self.to_screen(f'[info] There\'s no {label} description to write')
return False
else:
try:
@@ -3910,15 +3965,18 @@ class YoutubeDL:
''' Write subtitles to file and return list of (sub_filename, final_sub_filename); or None if error'''
ret = []
subtitles = info_dict.get('requested_subtitles')
if not subtitles or not (self.params.get('writesubtitles') or self.params.get('writeautomaticsub')):
if not (self.params.get('writesubtitles') or self.params.get('writeautomaticsub')):
# subtitles download errors are already managed as troubles in relevant IE
# that way it will silently go on when used with unsupporting IE
return ret
elif not subtitles:
self.to_screen('[info] There\'s no subtitles for the requested languages')
return ret
sub_filename_base = self.prepare_filename(info_dict, 'subtitle')
if not sub_filename_base:
self.to_screen('[info] Skipping writing video subtitles')
return ret
for sub_lang, sub_info in subtitles.items():
sub_format = sub_info['ext']
sub_filename = subtitles_filename(filename, sub_lang, sub_format, info_dict.get('ext'))
@@ -3965,6 +4023,9 @@ class YoutubeDL:
thumbnails, ret = [], []
if write_all or self.params.get('writethumbnail', False):
thumbnails = info_dict.get('thumbnails') or []
if not thumbnails:
self.to_screen(f'[info] There\'s no {label} thumbnails to download')
return ret
multiple = write_all and len(thumbnails) > 1
if thumb_filename_base is None:

View File

@@ -16,11 +16,9 @@ import sys
from .compat import compat_shlex_quote
from .cookies import SUPPORTED_BROWSERS, SUPPORTED_KEYRINGS
from .downloader import FileDownloader
from .downloader.external import get_external_downloader
from .extractor import list_extractor_classes
from .extractor.adobepass import MSO_INFO
from .extractor.common import InfoExtractor
from .options import parseOpts
from .postprocessor import (
FFmpegExtractAudioPP,
@@ -40,6 +38,7 @@ from .utils import (
DateRange,
DownloadCancelled,
DownloadError,
FormatSorter,
GeoUtils,
PlaylistEntries,
SameFileError,
@@ -50,6 +49,7 @@ from .utils import (
format_field,
int_or_none,
match_filter_func,
parse_bytes,
parse_duration,
preferredencoding,
read_batch_urls,
@@ -91,12 +91,11 @@ def get_urls(urls, batchfile, verbose):
def print_extractor_information(opts, urls):
# Importing GenericIE is currently slow since it imports other extractors
# TODO: Move this back to module level after generalization of embed detection
from .extractor.generic import GenericIE
out = ''
if opts.list_extractors:
# Importing GenericIE is currently slow since it imports YoutubeIE
from .extractor.generic import GenericIE
urls = dict.fromkeys(urls, False)
for ie in list_extractor_classes(opts.age_limit):
out += ie.IE_NAME + (' (CURRENTLY BROKEN)' if not ie.working() else '') + '\n'
@@ -152,7 +151,7 @@ def set_compat_opts(opts):
else:
opts.embed_infojson = False
if 'format-sort' in opts.compat_opts:
opts.format_sort.extend(InfoExtractor.FormatSort.ytdl_default)
opts.format_sort.extend(FormatSorter.ytdl_default)
_video_multistreams_set = set_default_compat('multistreams', 'allow_multiple_video_streams', False, remove_compat=False)
_audio_multistreams_set = set_default_compat('multistreams', 'allow_multiple_audio_streams', False, remove_compat=False)
if _video_multistreams_set is False and _audio_multistreams_set is False:
@@ -227,7 +226,7 @@ def validate_options(opts):
# Format sort
for f in opts.format_sort:
validate_regex('format sorting', f, InfoExtractor.FormatSort.regex)
validate_regex('format sorting', f, FormatSorter.regex)
# Postprocessor formats
validate_regex('merge output format', opts.merge_output_format,
@@ -281,19 +280,19 @@ def validate_options(opts):
raise ValueError(f'invalid {key} retry sleep expression {expr!r}')
# Bytes
def parse_bytes(name, value):
def validate_bytes(name, value):
if value is None:
return None
numeric_limit = FileDownloader.parse_bytes(value)
numeric_limit = parse_bytes(value)
validate(numeric_limit is not None, 'rate limit', value)
return numeric_limit
opts.ratelimit = parse_bytes('rate limit', opts.ratelimit)
opts.throttledratelimit = parse_bytes('throttled rate limit', opts.throttledratelimit)
opts.min_filesize = parse_bytes('min filesize', opts.min_filesize)
opts.max_filesize = parse_bytes('max filesize', opts.max_filesize)
opts.buffersize = parse_bytes('buffer size', opts.buffersize)
opts.http_chunk_size = parse_bytes('http chunk size', opts.http_chunk_size)
opts.ratelimit = validate_bytes('rate limit', opts.ratelimit)
opts.throttledratelimit = validate_bytes('throttled rate limit', opts.throttledratelimit)
opts.min_filesize = validate_bytes('min filesize', opts.min_filesize)
opts.max_filesize = validate_bytes('max filesize', opts.max_filesize)
opts.buffersize = validate_bytes('buffer size', opts.buffersize)
opts.http_chunk_size = validate_bytes('http chunk size', opts.http_chunk_size)
# Output templates
def validate_outtmpl(tmpl, msg):
@@ -333,7 +332,7 @@ def validate_options(opts):
mobj = range_ != '-' and re.fullmatch(r'([^-]+)?\s*-\s*([^-]+)?', range_)
dur = mobj and (parse_timestamp(mobj.group(1) or '0'), parse_timestamp(mobj.group(2) or 'inf'))
if None in (dur or [None]):
raise ValueError(f'invalid {name} time range "{regex}". Must be of the form *start-end')
raise ValueError(f'invalid {name} time range "{regex}". Must be of the form "*start-end"')
ranges.append(dur)
continue
try:
@@ -351,7 +350,7 @@ def validate_options(opts):
mobj = re.fullmatch(r'''(?x)
(?P<name>[^+:]+)
(?:\s*\+\s*(?P<keyring>[^:]+))?
(?:\s*:\s*(?P<profile>.+?))?
(?:\s*:\s*(?!:)(?P<profile>.+?))?
(?:\s*::\s*(?P<container>.+))?
''', opts.cookiesfrombrowser)
if mobj is None:
@@ -387,10 +386,12 @@ def validate_options(opts):
raise ValueError(f'{cmd} is invalid; {err}')
yield action
parse_metadata = opts.parse_metadata or []
if opts.metafromtitle is not None:
parse_metadata.append('title:%s' % opts.metafromtitle)
opts.parse_metadata = list(itertools.chain(*map(metadataparser_actions, parse_metadata)))
opts.parse_metadata.setdefault('pre_process', []).append('title:%s' % opts.metafromtitle)
opts.parse_metadata = {
k: list(itertools.chain(*map(metadataparser_actions, v)))
for k, v in opts.parse_metadata.items()
}
# Other options
if opts.playlist_items is not None:
@@ -562,11 +563,11 @@ def validate_options(opts):
def get_postprocessors(opts):
yield from opts.add_postprocessors
if opts.parse_metadata:
for when, actions in opts.parse_metadata.items():
yield {
'key': 'MetadataParser',
'actions': opts.parse_metadata,
'when': 'pre_process'
'actions': actions,
'when': when
}
sponsorblock_query = opts.sponsorblock_mark | opts.sponsorblock_remove
if sponsorblock_query:
@@ -702,7 +703,7 @@ def parse_options(argv=None):
postprocessors = list(get_postprocessors(opts))
print_only = bool(opts.forceprint) and all(k not in opts.forceprint for k in POSTPROCESS_WHEN[2:])
print_only = bool(opts.forceprint) and all(k not in opts.forceprint for k in POSTPROCESS_WHEN[3:])
any_getting = any(getattr(opts, k) for k in (
'dumpjson', 'dump_single_json', 'getdescription', 'getduration', 'getfilename',
'getformat', 'getid', 'getthumbnail', 'gettitle', 'geturl'
@@ -854,6 +855,7 @@ def parse_options(argv=None):
'legacyserverconnect': opts.legacy_server_connect,
'nocheckcertificate': opts.no_check_certificate,
'prefer_insecure': opts.prefer_insecure,
'enable_file_urls': opts.enable_file_urls,
'http_headers': opts.headers,
'proxy': opts.proxy,
'socket_timeout': opts.socket_timeout,

View File

@@ -5,6 +5,7 @@ import os
import re
import shutil
import traceback
import urllib.parse
from .utils import expand_path, traverse_obj, version_tuple, write_json_file
from .version import __version__
@@ -22,11 +23,9 @@ class Cache:
return expand_path(res)
def _get_cache_fn(self, section, key, dtype):
assert re.match(r'^[a-zA-Z0-9_.-]+$', section), \
'invalid section %r' % section
assert re.match(r'^[a-zA-Z0-9_.-]+$', key), 'invalid key %r' % key
return os.path.join(
self._get_root_dir(), section, f'{key}.{dtype}')
assert re.match(r'^[\w.-]+$', section), f'invalid section {section!r}'
key = urllib.parse.quote(key, safe='').replace('%', ',') # encode non-ascii characters
return os.path.join(self._get_root_dir(), section, f'{key}.{dtype}')
@property
def enabled(self):

View File

@@ -15,15 +15,16 @@ from ..minicurses import (
from ..utils import (
IDENTITY,
NO_DEFAULT,
NUMBER_RE,
LockingUnsupportedError,
Namespace,
RetryManager,
classproperty,
decodeArgument,
deprecation_warning,
encodeFilename,
format_bytes,
join_nonempty,
parse_bytes,
remove_start,
sanitize_open,
shell_quote,
@@ -180,12 +181,9 @@ class FileDownloader:
@staticmethod
def parse_bytes(bytestr):
"""Parse a string indicating a byte quantity into an integer."""
matchobj = re.match(rf'(?i)^({NUMBER_RE})([kMGTPEZY]?)$', bytestr)
if matchobj is None:
return None
number = float(matchobj.group(1))
multiplier = 1024.0 ** 'bkmgtpezy'.index(matchobj.group(2).lower())
return int(round(number * multiplier))
deprecation_warning('yt_dlp.FileDownloader.parse_bytes is deprecated and '
'may be removed in the future. Use yt_dlp.utils.parse_bytes instead')
return parse_bytes(bytestr)
def slow_down(self, start_time, now, byte_counter):
"""Sleep if the download speed is over the rate limit."""

View File

@@ -1,8 +1,9 @@
import time
import urllib.parse
from . import get_suitable_downloader
from .fragment import FragmentFD
from ..utils import urljoin
from ..utils import update_url_query, urljoin
class DashSegmentsFD(FragmentFD):
@@ -40,7 +41,12 @@ class DashSegmentsFD(FragmentFD):
self._prepare_and_start_frag_download(ctx, fmt)
ctx['start'] = real_start
fragments_to_download = self._get_fragments(fmt, ctx)
extra_query = None
extra_param_to_segment_url = info_dict.get('extra_param_to_segment_url')
if extra_param_to_segment_url:
extra_query = urllib.parse.parse_qs(extra_param_to_segment_url)
fragments_to_download = self._get_fragments(fmt, ctx, extra_query)
if real_downloader:
self.to_screen(
@@ -57,7 +63,7 @@ class DashSegmentsFD(FragmentFD):
fragments = fragments(ctx) if callable(fragments) else fragments
return [next(iter(fragments))] if self.params.get('test') else fragments
def _get_fragments(self, fmt, ctx):
def _get_fragments(self, fmt, ctx, extra_query):
fragment_base_url = fmt.get('fragment_base_url')
fragments = self._resolve_fragments(fmt['fragments'], ctx)
@@ -70,6 +76,8 @@ class DashSegmentsFD(FragmentFD):
if not fragment_url:
assert fragment_base_url
fragment_url = urljoin(fragment_base_url, fragment['path'])
if extra_query:
fragment_url = update_url_query(fragment_url, extra_query)
yield {
'frag_index': frag_index,

View File

@@ -1,9 +1,11 @@
import enum
import json
import os.path
import re
import subprocess
import sys
import time
import uuid
from .fragment import FragmentFD
from ..compat import functools
@@ -20,8 +22,10 @@ from ..utils import (
determine_ext,
encodeArgument,
encodeFilename,
find_available_port,
handle_youtubedl_headers,
remove_end,
sanitized_Request,
traverse_obj,
)
@@ -60,7 +64,6 @@ class ExternalFD(FragmentFD):
}
if filename != '-':
fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen(f'\r[{self.get_basename()}] Downloaded {fsize} bytes')
self.try_rename(tmpfilename, filename)
status.update({
'downloaded_bytes': fsize,
@@ -129,8 +132,7 @@ class ExternalFD(FragmentFD):
self._debug_cmd(cmd)
if 'fragments' not in info_dict:
_, stderr, returncode = Popen.run(
cmd, text=True, stderr=subprocess.PIPE if self._CAPTURE_STDERR else None)
_, stderr, returncode = self._call_process(cmd, info_dict)
if returncode and stderr:
self.to_stderr(stderr)
return returncode
@@ -140,7 +142,7 @@ class ExternalFD(FragmentFD):
retry_manager = RetryManager(self.params.get('fragment_retries'), self.report_retry,
frag_index=None, fatal=not skip_unavailable_fragments)
for retry in retry_manager:
_, stderr, returncode = Popen.run(cmd, text=True, stderr=subprocess.PIPE)
_, stderr, returncode = self._call_process(cmd, info_dict)
if not returncode:
break
# TODO: Decide whether to retry based on error code
@@ -172,6 +174,9 @@ class ExternalFD(FragmentFD):
self.try_remove(encodeFilename('%s.frag.urls' % tmpfilename))
return 0
def _call_process(self, cmd, info_dict):
return Popen.run(cmd, text=True, stderr=subprocess.PIPE)
class CurlFD(ExternalFD):
AVAILABLE_OPT = '-V'
@@ -256,6 +261,15 @@ class Aria2cFD(ExternalFD):
def _aria2c_filename(fn):
return fn if os.path.isabs(fn) else f'.{os.path.sep}{fn}'
def _call_downloader(self, tmpfilename, info_dict):
# FIXME: Disabled due to https://github.com/yt-dlp/yt-dlp/issues/5931
if False and 'no-external-downloader-progress' not in self.params.get('compat_opts', []):
info_dict['__rpc'] = {
'port': find_available_port() or 19190,
'secret': str(uuid.uuid4()),
}
return super()._call_downloader(tmpfilename, info_dict)
def _make_cmd(self, tmpfilename, info_dict):
cmd = [self.exe, '-c',
'--console-log-level=warn', '--summary-interval=0', '--download-result=hide',
@@ -276,6 +290,12 @@ class Aria2cFD(ExternalFD):
cmd += self._bool_option('--show-console-readout', 'noprogress', 'false', 'true', '=')
cmd += self._configuration_args()
if '__rpc' in info_dict:
cmd += [
'--enable-rpc',
f'--rpc-listen-port={info_dict["__rpc"]["port"]}',
f'--rpc-secret={info_dict["__rpc"]["secret"]}']
# aria2c strips out spaces from the beginning/end of filenames and paths.
# We work around this issue by adding a "./" to the beginning of the
# filename and relative path, and adding a "/" at the end of the path.
@@ -304,6 +324,88 @@ class Aria2cFD(ExternalFD):
cmd += ['--', info_dict['url']]
return cmd
def aria2c_rpc(self, rpc_port, rpc_secret, method, params=()):
# Does not actually need to be UUID, just unique
sanitycheck = str(uuid.uuid4())
d = json.dumps({
'jsonrpc': '2.0',
'id': sanitycheck,
'method': method,
'params': [f'token:{rpc_secret}', *params],
}).encode('utf-8')
request = sanitized_Request(
f'http://localhost:{rpc_port}/jsonrpc',
data=d, headers={
'Content-Type': 'application/json',
'Content-Length': f'{len(d)}',
'Ytdl-request-proxy': '__noproxy__',
})
with self.ydl.urlopen(request) as r:
resp = json.load(r)
assert resp.get('id') == sanitycheck, 'Something went wrong with RPC server'
return resp['result']
def _call_process(self, cmd, info_dict):
if '__rpc' not in info_dict:
return super()._call_process(cmd, info_dict)
send_rpc = functools.partial(self.aria2c_rpc, info_dict['__rpc']['port'], info_dict['__rpc']['secret'])
started = time.time()
fragmented = 'fragments' in info_dict
frag_count = len(info_dict['fragments']) if fragmented else 1
status = {
'filename': info_dict.get('_filename'),
'status': 'downloading',
'elapsed': 0,
'downloaded_bytes': 0,
'fragment_count': frag_count if fragmented else None,
'fragment_index': 0 if fragmented else None,
}
self._hook_progress(status, info_dict)
def get_stat(key, *obj, average=False):
val = tuple(filter(None, map(float, traverse_obj(obj, (..., ..., key))))) or [0]
return sum(val) / (len(val) if average else 1)
with Popen(cmd, text=True, stdout=subprocess.DEVNULL, stderr=subprocess.PIPE) as p:
# Add a small sleep so that RPC client can receive response,
# or the connection stalls infinitely
time.sleep(0.2)
retval = p.poll()
while retval is None:
# We don't use tellStatus as we won't know the GID without reading stdout
# Ref: https://aria2.github.io/manual/en/html/aria2c.html#aria2.tellActive
active = send_rpc('aria2.tellActive')
completed = send_rpc('aria2.tellStopped', [0, frag_count])
downloaded = get_stat('totalLength', completed) + get_stat('completedLength', active)
speed = get_stat('downloadSpeed', active)
total = frag_count * get_stat('totalLength', active, completed, average=True)
if total < downloaded:
total = None
status.update({
'downloaded_bytes': int(downloaded),
'speed': speed,
'total_bytes': None if fragmented else total,
'total_bytes_estimate': total,
'eta': (total - downloaded) / (speed or 1),
'fragment_index': min(frag_count, len(completed) + 1) if fragmented else None,
'elapsed': time.time() - started
})
self._hook_progress(status, info_dict)
if not active and len(completed) >= frag_count:
send_rpc('aria2.shutdown')
retval = p.wait()
break
time.sleep(0.1)
retval = p.poll()
return '', p.stderr.read(), retval
class HttpieFD(ExternalFD):
AVAILABLE_OPT = '--version'
@@ -342,7 +444,6 @@ class FFmpegFD(ExternalFD):
and cls.can_download(info_dict))
def _call_downloader(self, tmpfilename, info_dict):
urls = [f['url'] for f in info_dict.get('requested_formats', [])] or [info_dict['url']]
ffpp = FFmpegPostProcessor(downloader=self)
if not ffpp.available:
self.report_error('m3u8 download detected but ffmpeg could not be found. Please install')
@@ -372,16 +473,6 @@ class FFmpegFD(ExternalFD):
# http://trac.ffmpeg.org/ticket/6125#comment:10
args += ['-seekable', '1' if seekable else '0']
http_headers = None
if info_dict.get('http_headers'):
youtubedl_headers = handle_youtubedl_headers(info_dict['http_headers'])
http_headers = [
# Trailing \r\n after each HTTP header is important to prevent warning from ffmpeg/avconv:
# [http @ 00000000003d2fa0] No trailing CRLF found in HTTP header.
'-headers',
''.join(f'{key}: {val}\r\n' for key, val in youtubedl_headers.items())
]
env = None
proxy = self.params.get('proxy')
if proxy:
@@ -434,21 +525,26 @@ class FFmpegFD(ExternalFD):
start_time, end_time = info_dict.get('section_start') or 0, info_dict.get('section_end')
for i, url in enumerate(urls):
if http_headers is not None and re.match(r'^https?://', url):
args += http_headers
selected_formats = info_dict.get('requested_formats') or [info_dict]
for i, fmt in enumerate(selected_formats):
if fmt.get('http_headers') and re.match(r'^https?://', fmt['url']):
headers_dict = handle_youtubedl_headers(fmt['http_headers'])
# Trailing \r\n after each HTTP header is important to prevent warning from ffmpeg/avconv:
# [http @ 00000000003d2fa0] No trailing CRLF found in HTTP header.
args.extend(['-headers', ''.join(f'{key}: {val}\r\n' for key, val in headers_dict.items())])
if start_time:
args += ['-ss', str(start_time)]
if end_time:
args += ['-t', str(end_time - start_time)]
args += self._configuration_args((f'_i{i + 1}', '_i')) + ['-i', url]
args += self._configuration_args((f'_i{i + 1}', '_i')) + ['-i', fmt['url']]
if not (start_time or end_time) or not self.params.get('force_keyframes_at_cuts'):
args += ['-c', 'copy']
if info_dict.get('requested_formats') or protocol == 'http_dash_segments':
for (i, fmt) in enumerate(info_dict.get('requested_formats') or [info_dict]):
for i, fmt in enumerate(selected_formats):
stream_number = fmt.get('manifest_stream_number', 0)
args.extend(['-map', f'{i}:{stream_number}'])
@@ -488,8 +584,9 @@ class FFmpegFD(ExternalFD):
args.append(encodeFilename(ffpp._ffmpeg_filename_argument(tmpfilename), True))
self._debug_cmd(args)
piped = any(fmt['url'] in ('-', 'pipe:') for fmt in selected_formats)
with Popen(args, stdin=subprocess.PIPE, env=env) as proc:
if url in ('-', 'pipe:'):
if piped:
self.on_process_started(proc, proc.stdin)
try:
retval = proc.wait()
@@ -499,7 +596,7 @@ class FFmpegFD(ExternalFD):
# produces a file that is playable (this is mostly useful for live
# streams). Note that Windows is not affected and produces playable
# files (see https://github.com/ytdl-org/youtube-dl/issues/8300).
if isinstance(e, KeyboardInterrupt) and sys.platform != 'win32' and url not in ('-', 'pipe:'):
if isinstance(e, KeyboardInterrupt) and sys.platform != 'win32' and not piped:
proc.communicate_or_kill(b'q')
else:
proc.kill(timeout=None)

View File

@@ -78,6 +78,8 @@ from .agora import (
WyborczaVideoIE,
)
from .airmozilla import AirMozillaIE
from .airtv import AirTVIE
from .aitube import AitubeKZVideoIE
from .aljazeera import AlJazeeraIE
from .alphaporno import AlphaPornoIE
from .amara import AmaraIE
@@ -86,7 +88,15 @@ from .alura import (
AluraCourseIE
)
from .amcnetworks import AMCNetworksIE
from .amazon import AmazonStoreIE
from .amazon import (
AmazonStoreIE,
AmazonReviewsIE,
)
from .amazonminitv import (
AmazonMiniTVIE,
AmazonMiniTVSeasonIE,
AmazonMiniTVSeriesIE,
)
from .americastestkitchen import (
AmericasTestKitchenIE,
AmericasTestKitchenSeasonIE,
@@ -178,6 +188,10 @@ from .bbc import (
from .beeg import BeegIE
from .behindkink import BehindKinkIE
from .bellmedia import BellMediaIE
from .beatbump import (
BeatBumpVideoIE,
BeatBumpPlaylistIE,
)
from .beatport import BeatportIE
from .berufetv import BerufeTVIE
from .bet import BetIE
@@ -461,6 +475,8 @@ from .drtuber import DrTuberIE
from .drtv import (
DRTVIE,
DRTVLiveIE,
DRTVSeasonIE,
DRTVSeriesIE,
)
from .dtube import DTubeIE
from .dvtv import DVTVIE
@@ -531,7 +547,7 @@ from .espn import (
ESPNCricInfoIE,
)
from .esri import EsriVideoIE
from .europa import EuropaIE
from .europa import EuropaIE, EuroParlWebstreamIE
from .europeantour import EuropeanTourIE
from .eurosport import EurosportIE
from .euscreen import EUScreenIE
@@ -820,6 +836,8 @@ from .joj import JojIE
from .jwplatform import JWPlatformIE
from .kakao import KakaoIE
from .kaltura import KalturaIE
from .kanal2 import Kanal2IE
from .kankanews import KankaNewsIE
from .karaoketv import KaraoketvIE
from .karrierevideos import KarriereVideosIE
from .keezmovies import KeezMoviesIE
@@ -829,6 +847,10 @@ from .khanacademy import (
KhanAcademyIE,
KhanAcademyUnitIE,
)
from .kick import (
KickIE,
KickVODIE,
)
from .kicker import KickerIE
from .kickstarter import KickStarterIE
from .kinja import KinjaEmbedIE
@@ -976,6 +998,10 @@ from .mediasite import (
MediasiteCatalogIE,
MediasiteNamedCatalogIE,
)
from .mediastream import (
MediaStreamIE,
WinSportsVideoIE,
)
from .mediaworksnz import MediaWorksNZVODIE
from .medici import MediciIE
from .megaphone import MegaphoneIE
@@ -1144,6 +1170,7 @@ from .neteasemusic import (
from .netverse import (
NetverseIE,
NetversePlaylistIE,
NetverseSearchIE,
)
from .newgrounds import (
NewgroundsIE,
@@ -1205,6 +1232,7 @@ from .nintendo import NintendoIE
from .nitter import NitterIE
from .njpwworld import NJPWWorldIE
from .nobelprize import NobelPrizeIE
from .noice import NoicePodcastIE
from .nonktube import NonkTubeIE
from .noodlemagazine import NoodleMagazineIE
from .noovo import NoovoIE
@@ -1270,6 +1298,7 @@ from .on24 import On24IE
from .ondemandkorea import OnDemandKoreaIE
from .onefootball import OneFootballIE
from .onenewsnz import OneNewsNZIE
from .oneplace import OnePlacePodcastIE
from .onet import (
OnetIE,
OnetChannelIE,
@@ -1392,6 +1421,8 @@ from .pokergo import (
from .polsatgo import PolsatGoIE
from .polskieradio import (
PolskieRadioIE,
PolskieRadioLegacyIE,
PolskieRadioAuditionIE,
PolskieRadioCategoryIE,
PolskieRadioPlayerIE,
PolskieRadioPodcastIE,
@@ -1561,6 +1592,7 @@ from .ruhd import RUHDIE
from .rule34video import Rule34VideoIE
from .rumble import (
RumbleEmbedIE,
RumbleIE,
RumbleChannelIE,
)
from .rutube import (
@@ -1603,6 +1635,7 @@ from .savefrom import SaveFromIE
from .sbs import SBSIE
from .screen9 import Screen9IE
from .screencast import ScreencastIE
from .screencastify import ScreencastifyIE
from .screencastomatic import ScreencastOMaticIE
from .scrippsnetworks import (
ScrippsNetworksWatchIE,
@@ -1632,6 +1665,7 @@ from .shared import (
VivoIE,
)
from .sharevideos import ShareVideosEmbedIE
from .sibnet import SibnetEmbedIE
from .shemaroome import ShemarooMeIE
from .showroomlive import ShowRoomLiveIE
from .simplecast import (
@@ -1679,6 +1713,7 @@ from .soundcloud import (
SoundcloudSetIE,
SoundcloudRelatedIE,
SoundcloudUserIE,
SoundcloudUserPermalinkIE,
SoundcloudTrackStationIE,
SoundcloudPlaylistIE,
SoundcloudSearchIE,
@@ -1840,6 +1875,11 @@ from .theweatherchannel import TheWeatherChannelIE
from .thisamericanlife import ThisAmericanLifeIE
from .thisav import ThisAVIE
from .thisoldhouse import ThisOldHouseIE
from .thisvid import (
ThisVidIE,
ThisVidMemberIE,
ThisVidPlaylistIE,
)
from .threespeak import (
ThreeSpeakIE,
ThreeSpeakUserIE,
@@ -1852,6 +1892,7 @@ from .tiktok import (
TikTokEffectIE,
TikTokTagIE,
TikTokVMIE,
TikTokLiveIE,
DouyinIE,
)
from .tinypic import TinyPicIE
@@ -1889,6 +1930,7 @@ from .trovo import (
TrovoChannelVodIE,
TrovoChannelClipIE,
)
from .trtcocuk import TrtCocukVideoIE
from .trueid import TrueIDIE
from .trunews import TruNewsIE
from .truth import TruthIE
@@ -2043,7 +2085,10 @@ from .varzesh3 import Varzesh3IE
from .vbox7 import Vbox7IE
from .veehd import VeeHDIE
from .veo import VeoIE
from .veoh import VeohIE
from .veoh import (
VeohIE,
VeohUserIE
)
from .vesti import VestiIE
from .vevo import (
VevoIE,
@@ -2069,6 +2114,13 @@ from .videocampus_sachsen import (
)
from .videodetective import VideoDetectiveIE
from .videofyme import VideofyMeIE
from .videoken import (
VideoKenIE,
VideoKenPlayerIE,
VideoKenPlaylistIE,
VideoKenCategoryIE,
VideoKenTopicIE,
)
from .videomore import (
VideomoreIE,
VideomoreVideoIE,
@@ -2093,6 +2145,7 @@ from .vimeo import (
VimeoGroupsIE,
VimeoLikesIE,
VimeoOndemandIE,
VimeoProIE,
VimeoReviewIE,
VimeoUserIE,
VimeoWatchLaterIE,
@@ -2135,6 +2188,7 @@ from .voicy import (
VoicyIE,
VoicyChannelIE,
)
from .volejtv import VolejTVIE
from .voot import (
VootIE,
VootSeriesIE,
@@ -2180,6 +2234,7 @@ from .wdr import (
WDRElefantIE,
WDRMobileIE,
)
from .webcamerapl import WebcameraplIE
from .webcaster import (
WebcasterIE,
WebcasterFeedIE,
@@ -2216,6 +2271,7 @@ from .wsj import (
WSJArticleIE,
)
from .wwe import WWEIE
from .xanimu import XanimuIE
from .xbef import XBefIE
from .xboxclips import XboxClipsIE
from .xfileshare import XFileShareIE
@@ -2224,12 +2280,6 @@ from .xhamster import (
XHamsterEmbedIE,
XHamsterUserIE,
)
from .xiami import (
XiamiSongIE,
XiamiAlbumIE,
XiamiArtistIE,
XiamiCollectionIE
)
from .ximalaya import (
XimalayaIE,
XimalayaAlbumIE

View File

@@ -155,8 +155,6 @@ class ABCIE(InfoExtractor):
'format_id': format_id
})
self._sort_formats(formats)
return {
'id': video_id,
'title': self._og_search_title(webpage),
@@ -221,7 +219,6 @@ class ABCIViewIE(InfoExtractor):
entry_protocol='m3u8_native', m3u8_id='hls', fatal=False)
if formats:
break
self._sort_formats(formats)
subtitles = {}
src_vtt = stream.get('captions', {}).get('src-vtt')

View File

@@ -78,7 +78,6 @@ class ABCOTVSIE(InfoExtractor):
'url': mp4_url,
'width': 640,
})
self._sort_formats(formats)
image = video.get('image') or {}
@@ -119,7 +118,6 @@ class ABCOTVSClipsIE(InfoExtractor):
title = video_data['title']
formats = self._extract_m3u8_formats(
video_data['videoURL'].split('?')[0], video_id, 'mp4')
self._sort_formats(formats)
return {
'id': video_id,

View File

@@ -27,7 +27,6 @@ class AcFunVideoBaseIE(InfoExtractor):
**parse_codecs(video.get('codecs', ''))
})
self._sort_formats(formats)
return {
'id': video_id,
'formats': formats,

View File

@@ -168,7 +168,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
}, data=b'')['token']
links_url = try_get(options, lambda x: x['video']['url']) or (video_base_url + 'link')
self._K = ''.join([random.choice('0123456789abcdef') for _ in range(16)])
self._K = ''.join(random.choices('0123456789abcdef', k=16))
message = bytes_to_intlist(json.dumps({
'k': self._K,
't': token,
@@ -235,7 +235,6 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
for f in m3u8_formats:
f['language'] = 'fr'
formats.extend(m3u8_formats)
self._sort_formats(formats)
video = (self._download_json(
self._API_BASE_URL + 'video/%s' % video_id, video_id,

View File

@@ -1352,7 +1352,7 @@ MSO_INFO = {
}
class AdobePassIE(InfoExtractor):
class AdobePassIE(InfoExtractor): # XXX: Conventionally, base classes should end with BaseIE/InfoExtractor
_SERVICE_PROVIDER_TEMPLATE = 'https://sp.auth.adobe.com/adobe-services/%s'
_USER_AGENT = 'Mozilla/5.0 (X11; Linux i686; rv:47.0) Gecko/20100101 Firefox/47.0'
_MVPD_CACHE = 'ap-mvpd'

View File

@@ -70,7 +70,6 @@ class AdobeTVBaseIE(InfoExtractor):
})
s3_extracted = True
formats.append(f)
self._sort_formats(formats)
return {
'id': video_id,
@@ -269,7 +268,6 @@ class AdobeTVVideoIE(AdobeTVBaseIE):
'width': int_or_none(source.get('width') or None),
'url': source_src,
})
self._sort_formats(formats)
# For both metadata and downloaded files the duration varies among
# formats. I just pick the max one

View File

@@ -180,7 +180,6 @@ class AdultSwimIE(TurnerBaseIE):
info['subtitles'].setdefault('en', []).append({
'url': asset_url,
})
self._sort_formats(info['formats'])
return info
else:

View File

@@ -8,7 +8,7 @@ from ..utils import (
)
class AENetworksBaseIE(ThePlatformIE):
class AENetworksBaseIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
_BASE_URL_REGEX = r'''(?x)https?://
(?:(?:www|play|watch)\.)?
(?P<domain>
@@ -62,7 +62,6 @@ class AENetworksBaseIE(ThePlatformIE):
subtitles = self._merge_subtitles(subtitles, tp_subtitles)
if last_e and not formats:
raise last_e
self._sort_formats(formats)
return {
'id': video_id,
'formats': formats,
@@ -304,7 +303,6 @@ class HistoryTopicIE(AENetworksBaseIE):
class HistoryPlayerIE(AENetworksBaseIE):
IE_NAME = 'history:player'
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:history|biography)\.com)/player/(?P<id>\d+)'
_TESTS = []
def _real_extract(self, url):
domain, video_id = self._match_valid_url(url).groups()

View File

@@ -338,7 +338,6 @@ class AfreecaTVIE(InfoExtractor):
}]
if not formats and not self.get_param('ignore_no_formats'):
continue
self._sort_formats(formats)
file_info = common_entry.copy()
file_info.update({
'id': format_id,
@@ -380,7 +379,7 @@ class AfreecaTVIE(InfoExtractor):
return info
class AfreecaTVLiveIE(AfreecaTVIE):
class AfreecaTVLiveIE(AfreecaTVIE): # XXX: Do not subclass from concrete IE
IE_NAME = 'afreecatv:live'
_VALID_URL = r'https?://play\.afreeca(?:tv)?\.com/(?P<id>[^/]+)(?:/(?P<bno>\d+))?'
@@ -464,8 +463,6 @@ class AfreecaTVLiveIE(AfreecaTVIE):
'quality': quality_key(quality_str),
})
self._sort_formats(formats)
station_info = self._download_json(
'https://st.afreecatv.com/api/get_station_status.php', broadcast_no,
query={'szBjId': broadcaster_id}, fatal=False,

View File

@@ -55,7 +55,6 @@ class WyborczaVideoIE(InfoExtractor):
if meta['files'].get('dash'):
formats.extend(self._extract_mpd_formats(base_url + meta['files']['dash'], video_id))
self._sort_formats(formats)
return {
'id': video_id,
'formats': formats,
@@ -179,7 +178,6 @@ class TokFMPodcastIE(InfoExtractor):
'acodec': ext,
})
self._sort_formats(formats)
return {
'id': media_id,
'formats': formats,

96
yt_dlp/extractor/airtv.py Normal file
View File

@@ -0,0 +1,96 @@
from .common import InfoExtractor
from .youtube import YoutubeIE
from ..utils import (
determine_ext,
int_or_none,
mimetype2ext,
parse_iso8601,
traverse_obj
)
class AirTVIE(InfoExtractor):
_VALID_URL = r'https?://www\.air\.tv/watch\?v=(?P<id>\w+)'
_TESTS = [{
# without youtube_id
'url': 'https://www.air.tv/watch?v=W87jcWleSn2hXZN47zJZsQ',
'info_dict': {
'id': 'W87jcWleSn2hXZN47zJZsQ',
'ext': 'mp4',
'release_date': '20221003',
'release_timestamp': 1664792603,
'channel_id': 'vgfManQlRQKgoFQ8i8peFQ',
'title': 'md5:c12d49ed367c3dadaa67659aff43494c',
'upload_date': '20221003',
'duration': 151,
'view_count': int,
'thumbnail': 'https://cdn-sp-gcs.air.tv/videos/W/8/W87jcWleSn2hXZN47zJZsQ/b13fc56464f47d9d62a36d110b9b5a72-4096x2160_9.jpg',
'timestamp': 1664792603,
}
}, {
# with youtube_id
'url': 'https://www.air.tv/watch?v=sv57EC8tRXG6h8dNXFUU1Q',
'info_dict': {
'id': '2ZTqmpee-bQ',
'ext': 'mp4',
'comment_count': int,
'tags': 'count:11',
'channel_follower_count': int,
'like_count': int,
'uploader': 'Newsflare',
'thumbnail': 'https://i.ytimg.com/vi_webp/2ZTqmpee-bQ/maxresdefault.webp',
'availability': 'public',
'title': 'Geese Chase Alligator Across Golf Course',
'uploader_id': 'NewsflareBreaking',
'channel_url': 'https://www.youtube.com/channel/UCzSSoloGEz10HALUAbYhngQ',
'description': 'md5:99b21d9cea59330149efbd9706e208f5',
'age_limit': 0,
'channel_id': 'UCzSSoloGEz10HALUAbYhngQ',
'uploader_url': 'http://www.youtube.com/user/NewsflareBreaking',
'view_count': int,
'categories': ['News & Politics'],
'live_status': 'not_live',
'playable_in_embed': True,
'channel': 'Newsflare',
'duration': 37,
'upload_date': '20180511',
}
}]
def _get_formats_and_subtitle(self, json_data, video_id):
formats, subtitles = [], {}
for source in traverse_obj(json_data, 'sources', 'sources_desktop', ...):
ext = determine_ext(source.get('src'), mimetype2ext(source.get('type')))
if ext == 'm3u8':
fmts, subs = self._extract_m3u8_formats_and_subtitles(source.get('src'), video_id)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
else:
formats.append({'url': source.get('src'), 'ext': ext})
return formats, subtitles
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
nextjs_json = self._search_nextjs_data(webpage, display_id)['props']['pageProps']['initialState']['videos'][display_id]
if nextjs_json.get('youtube_id'):
return self.url_result(
f'https://www.youtube.com/watch?v={nextjs_json.get("youtube_id")}', YoutubeIE)
formats, subtitles = self._get_formats_and_subtitle(nextjs_json, display_id)
return {
'id': display_id,
'title': nextjs_json.get('title') or self._html_search_meta('og:title', webpage),
'formats': formats,
'subtitles': subtitles,
'description': nextjs_json.get('description') or None,
'duration': int_or_none(nextjs_json.get('duration')),
'thumbnails': [
{'url': thumbnail}
for thumbnail in traverse_obj(nextjs_json, ('default_thumbnails', ...))],
'channel_id': traverse_obj(nextjs_json, 'channel', 'channel_slug'),
'timestamp': parse_iso8601(nextjs_json.get('created')),
'release_timestamp': parse_iso8601(nextjs_json.get('published')),
'view_count': int_or_none(nextjs_json.get('views')),
}

View File

@@ -0,0 +1,60 @@
from .common import InfoExtractor
from ..utils import int_or_none, merge_dicts
class AitubeKZVideoIE(InfoExtractor):
_VALID_URL = r'https?://aitube\.kz/(?:video|embed/)\?(?:[^\?]+)?id=(?P<id>[\w-]+)'
_TESTS = [{
# id paramater as first parameter
'url': 'https://aitube.kz/video?id=9291d29b-c038-49a1-ad42-3da2051d353c&playlistId=d55b1f5f-ef2a-4f23-b646-2a86275b86b7&season=1',
'info_dict': {
'id': '9291d29b-c038-49a1-ad42-3da2051d353c',
'ext': 'mp4',
'duration': 2174.0,
'channel_id': '94962f73-013b-432c-8853-1bd78ca860fe',
'like_count': int,
'channel': 'ASTANA TV',
'comment_count': int,
'view_count': int,
'description': 'Смотреть любимые сериалы и видео, поделиться видео и сериалами с друзьями и близкими',
'thumbnail': 'https://cdn.static02.aitube.kz/kz.aitudala.aitube.staticaccess/files/ddf2a2ff-bee3-409b-b5f2-2a8202bba75b',
'upload_date': '20221102',
'timestamp': 1667370519,
'title': 'Ангел хранитель 1 серия',
'channel_follower_count': int,
}
}, {
# embed url
'url': 'https://aitube.kz/embed/?id=9291d29b-c038-49a1-ad42-3da2051d353c',
'only_matching': True,
}, {
# id parameter is not as first paramater
'url': 'https://aitube.kz/video?season=1&id=9291d29b-c038-49a1-ad42-3da2051d353c&playlistId=d55b1f5f-ef2a-4f23-b646-2a86275b86b7',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
nextjs_data = self._search_nextjs_data(webpage, video_id)['props']['pageProps']['videoInfo']
json_ld_data = self._search_json_ld(webpage, video_id)
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
f'https://api-http.aitube.kz/kz.aitudala.aitube.staticaccess/video/{video_id}/video', video_id)
return merge_dicts({
'id': video_id,
'title': nextjs_data.get('title') or self._html_search_meta(['name', 'og:title'], webpage),
'description': nextjs_data.get('description'),
'formats': formats,
'subtitles': subtitles,
'view_count': (nextjs_data.get('viewCount')
or int_or_none(self._html_search_meta('ya:ovs:views_total', webpage))),
'like_count': nextjs_data.get('likeCount'),
'channel': nextjs_data.get('channelTitle'),
'channel_id': nextjs_data.get('channelId'),
'thumbnail': nextjs_data.get('coverUrl'),
'comment_count': nextjs_data.get('commentCount'),
'channel_follower_count': int_or_none(nextjs_data.get('channelSubscriberCount')),
}, json_ld_data)

View File

@@ -112,8 +112,6 @@ class AllocineIE(InfoExtractor):
})
duration, view_count, timestamp = [None] * 3
self._sort_formats(formats)
return {
'id': video_id,
'display_id': display_id,

View File

@@ -22,7 +22,6 @@ class Alsace20TVBaseIE(InfoExtractor):
self._extract_smil_formats(fmt_url, video_id, fatal=False)
if '/smil:_' in fmt_url
else self._extract_mpd_formats(fmt_url, video_id, mpd_id=res, fatal=False))
self._sort_formats(formats)
webpage = (url and self._download_webpage(url, video_id, fatal=False)) or ''
thumbnail = url_or_none(dict_get(info, ('image', 'preview', )) or self._og_search_thumbnail(webpage))

View File

@@ -63,8 +63,6 @@ class AluraIE(InfoExtractor):
f['height'] = int('720' if m.group('res') == 'hd' else '480')
formats.extend(video_format)
self._sort_formats(formats)
return {
'id': video_id,
'title': video_title,
@@ -113,7 +111,7 @@ class AluraIE(InfoExtractor):
raise ExtractorError('Unable to log in')
class AluraCourseIE(AluraIE):
class AluraCourseIE(AluraIE): # XXX: Do not subclass from concrete IE
_VALID_URL = r'https?://(?:cursos\.)?alura\.com\.br/course/(?P<id>[^/]+)'
_LOGIN_URL = 'https://cursos.alura.com.br/loginForm?urlAfterLogin=/loginForm'

View File

@@ -1,5 +1,17 @@
import re
from .common import InfoExtractor
from ..utils import ExtractorError, int_or_none
from ..utils import (
ExtractorError,
clean_html,
float_or_none,
get_element_by_attribute,
get_element_by_class,
int_or_none,
js_to_json,
traverse_obj,
url_or_none,
)
class AmazonStoreIE(InfoExtractor):
@@ -9,7 +21,7 @@ class AmazonStoreIE(InfoExtractor):
'url': 'https://www.amazon.co.uk/dp/B098XNCHLD/',
'info_dict': {
'id': 'B098XNCHLD',
'title': 'md5:dae240564cbb2642170c02f7f0d7e472',
'title': str,
},
'playlist_mincount': 1,
'playlist': [{
@@ -20,28 +32,32 @@ class AmazonStoreIE(InfoExtractor):
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 34,
},
}]
}],
'expected_warnings': ['Unable to extract data'],
}, {
'url': 'https://www.amazon.in/Sony-WH-1000XM4-Cancelling-Headphones-Bluetooth/dp/B0863TXGM3',
'info_dict': {
'id': 'B0863TXGM3',
'title': 'md5:d1d3352428f8f015706c84b31e132169',
'title': str,
},
'playlist_mincount': 4,
'expected_warnings': ['Unable to extract data'],
}, {
'url': 'https://www.amazon.com/dp/B0845NXCXF/',
'info_dict': {
'id': 'B0845NXCXF',
'title': 'md5:f3fa12779bf62ddb6a6ec86a360a858e',
'title': str,
},
'playlist-mincount': 1,
'expected_warnings': ['Unable to extract data'],
}, {
'url': 'https://www.amazon.es/Samsung-Smartphone-s-AMOLED-Quad-c%C3%A1mara-espa%C3%B1ola/dp/B08WX337PQ',
'info_dict': {
'id': 'B08WX337PQ',
'title': 'md5:f3fa12779bf62ddb6a6ec86a360a858e',
'title': str,
},
'playlist_mincount': 1,
'expected_warnings': ['Unable to extract data'],
}]
def _real_extract(self, url):
@@ -52,7 +68,7 @@ class AmazonStoreIE(InfoExtractor):
try:
data_json = self._search_json(
r'var\s?obj\s?=\s?jQuery\.parseJSON\(\'', webpage, 'data', id,
transform_source=lambda x: x.replace(R'\\u', R'\u'))
transform_source=js_to_json)
except ExtractorError as e:
retry.error = e
@@ -66,3 +82,89 @@ class AmazonStoreIE(InfoExtractor):
'width': int_or_none(video.get('videoWidth')),
} for video in (data_json.get('videos') or []) if video.get('isVideo') and video.get('url')]
return self.playlist_result(entries, playlist_id=id, playlist_title=data_json.get('title'))
class AmazonReviewsIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?amazon\.(?:[a-z]{2,3})(?:\.[a-z]{2})?/gp/customer-reviews/(?P<id>[^/&#$?]+)'
_TESTS = [{
'url': 'https://www.amazon.com/gp/customer-reviews/R10VE9VUSY19L3/ref=cm_cr_arp_d_rvw_ttl',
'info_dict': {
'id': 'R10VE9VUSY19L3',
'ext': 'mp4',
'title': 'Get squad #Suspicious',
'description': 'md5:7012695052f440a1e064e402d87e0afb',
'uploader': 'Kimberly Cronkright',
'average_rating': 1.0,
'thumbnail': r're:^https?://.*\.jpg$',
},
'expected_warnings': ['Review body was not found in webpage'],
}, {
'url': 'https://www.amazon.com/gp/customer-reviews/R10VE9VUSY19L3/ref=cm_cr_arp_d_rvw_ttl?language=es_US',
'info_dict': {
'id': 'R10VE9VUSY19L3',
'ext': 'mp4',
'title': 'Get squad #Suspicious',
'description': 'md5:7012695052f440a1e064e402d87e0afb',
'uploader': 'Kimberly Cronkright',
'average_rating': 1.0,
'thumbnail': r're:^https?://.*\.jpg$',
},
'expected_warnings': ['Review body was not found in webpage'],
}, {
'url': 'https://www.amazon.in/gp/customer-reviews/RV1CO8JN5VGXV/',
'info_dict': {
'id': 'RV1CO8JN5VGXV',
'ext': 'mp4',
'title': 'Not sure about its durability',
'description': 'md5:1a252c106357f0a3109ebf37d2e87494',
'uploader': 'Shoaib Gulzar',
'average_rating': 2.0,
'thumbnail': r're:^https?://.*\.jpg$',
},
'expected_warnings': ['Review body was not found in webpage'],
}]
def _real_extract(self, url):
video_id = self._match_id(url)
for retry in self.RetryManager():
webpage = self._download_webpage(url, video_id)
review_body = get_element_by_attribute('data-hook', 'review-body', webpage)
if not review_body:
retry.error = ExtractorError('Review body was not found in webpage', expected=True)
formats, subtitles = [], {}
manifest_url = self._search_regex(
r'data-video-url="([^"]+)"', review_body, 'm3u8 url', default=None)
if url_or_none(manifest_url):
fmts, subtitles = self._extract_m3u8_formats_and_subtitles(
manifest_url, video_id, 'mp4', fatal=False)
formats.extend(fmts)
video_url = self._search_regex(
r'<input[^>]+\bvalue="([^"]+)"[^>]+\bclass="video-url"', review_body, 'mp4 url', default=None)
if url_or_none(video_url):
formats.append({
'url': video_url,
'ext': 'mp4',
'format_id': 'http-mp4',
})
if not formats:
self.raise_no_formats('No video found for this customer review', expected=True)
return {
'id': video_id,
'title': (clean_html(get_element_by_attribute('data-hook', 'review-title', webpage))
or self._html_extract_title(webpage)),
'description': clean_html(traverse_obj(re.findall(
r'<span(?:\s+class="cr-original-review-content")?>(.+?)</span>', review_body), -1)),
'uploader': clean_html(get_element_by_class('a-profile-name', webpage)),
'average_rating': float_or_none(clean_html(get_element_by_attribute(
'data-hook', 'review-star-rating', webpage) or '').partition(' ')[0]),
'thumbnail': self._search_regex(
r'data-thumbnail-url="([^"]+)"', review_body, 'thumbnail', default=None),
'formats': formats,
'subtitles': subtitles,
}

View File

@@ -0,0 +1,290 @@
import json
from .common import InfoExtractor
from ..utils import ExtractorError, int_or_none, traverse_obj, try_get
class AmazonMiniTVBaseIE(InfoExtractor):
def _real_initialize(self):
self._download_webpage(
'https://www.amazon.in/minitv', None,
note='Fetching guest session cookies')
AmazonMiniTVBaseIE.session_id = self._get_cookies('https://www.amazon.in')['session-id'].value
def _call_api(self, asin, data=None, note=None):
device = {'clientId': 'ATVIN', 'deviceLocale': 'en_GB'}
if data:
data['variables'].update({
'contentType': 'VOD',
'sessionIdToken': self.session_id,
**device,
})
resp = self._download_json(
f'https://www.amazon.in/minitv/api/web/{"graphql" if data else "prs"}',
asin, note=note, headers={'Content-Type': 'application/json'},
data=json.dumps(data).encode() if data else None,
query=None if data else {
'deviceType': 'A1WMMUXPCUJL4N',
'contentId': asin,
**device,
})
if resp.get('errors'):
raise ExtractorError(f'MiniTV said: {resp["errors"][0]["message"]}')
elif not data:
return resp
return resp['data'][data['operationName']]
class AmazonMiniTVIE(AmazonMiniTVBaseIE):
_VALID_URL = r'(?:https?://(?:www\.)?amazon\.in/minitv/tp/|amazonminitv:(?:amzn1\.dv\.gti\.)?)(?P<id>[a-f0-9-]+)'
_TESTS = [{
'url': 'https://www.amazon.in/minitv/tp/75fe3a75-b8fe-4499-8100-5c9424344840?referrer=https%3A%2F%2Fwww.amazon.in%2Fminitv',
'info_dict': {
'id': 'amzn1.dv.gti.75fe3a75-b8fe-4499-8100-5c9424344840',
'ext': 'mp4',
'title': 'May I Kiss You?',
'language': 'Hindi',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:a549bfc747973e04feb707833474e59d',
'release_timestamp': 1644710400,
'release_date': '20220213',
'duration': 846,
'chapters': 'count:2',
'series': 'Couple Goals',
'series_id': 'amzn1.dv.gti.56521d46-b040-4fd5-872e-3e70476a04b0',
'season': 'Season 3',
'season_number': 3,
'season_id': 'amzn1.dv.gti.20331016-d9b9-4968-b991-c89fa4927a36',
'episode': 'May I Kiss You?',
'episode_number': 2,
'episode_id': 'amzn1.dv.gti.75fe3a75-b8fe-4499-8100-5c9424344840',
},
}, {
'url': 'https://www.amazon.in/minitv/tp/280d2564-584f-452f-9c98-7baf906e01ab?referrer=https%3A%2F%2Fwww.amazon.in%2Fminitv',
'info_dict': {
'id': 'amzn1.dv.gti.280d2564-584f-452f-9c98-7baf906e01ab',
'ext': 'mp4',
'title': 'Jahaan',
'language': 'Hindi',
'thumbnail': r're:^https?://.*\.jpg',
'description': 'md5:05eb765a77bf703f322f120ec6867339',
'release_timestamp': 1647475200,
'release_date': '20220317',
'duration': 783,
'chapters': [],
},
}, {
'url': 'https://www.amazon.in/minitv/tp/280d2564-584f-452f-9c98-7baf906e01ab',
'only_matching': True,
}, {
'url': 'amazonminitv:amzn1.dv.gti.280d2564-584f-452f-9c98-7baf906e01ab',
'only_matching': True,
}, {
'url': 'amazonminitv:280d2564-584f-452f-9c98-7baf906e01ab',
'only_matching': True,
}]
_GRAPHQL_QUERY_CONTENT = '''
query content($sessionIdToken: String!, $deviceLocale: String, $contentId: ID!, $contentType: ContentType!, $clientId: String) {
content(
applicationContextInput: {deviceLocale: $deviceLocale, sessionIdToken: $sessionIdToken, clientId: $clientId}
contentId: $contentId
contentType: $contentType
) {
contentId
name
... on Episode {
contentId
vodType
name
images
description {
synopsis
contentLengthInSeconds
}
publicReleaseDateUTC
audioTracks
seasonId
seriesId
seriesName
seasonNumber
episodeNumber
timecode {
endCreditsTime
}
}
... on MovieContent {
contentId
vodType
name
description {
synopsis
contentLengthInSeconds
}
images
publicReleaseDateUTC
audioTracks
}
}
}'''
def _real_extract(self, url):
asin = f'amzn1.dv.gti.{self._match_id(url)}'
prs = self._call_api(asin, note='Downloading playback info')
formats, subtitles = [], {}
for type_, asset in prs['playbackAssets'].items():
if not traverse_obj(asset, 'manifestUrl'):
continue
if type_ == 'hls':
m3u8_fmts, m3u8_subs = self._extract_m3u8_formats_and_subtitles(
asset['manifestUrl'], asin, ext='mp4', entry_protocol='m3u8_native',
m3u8_id=type_, fatal=False)
formats.extend(m3u8_fmts)
subtitles = self._merge_subtitles(subtitles, m3u8_subs)
elif type_ == 'dash':
mpd_fmts, mpd_subs = self._extract_mpd_formats_and_subtitles(
asset['manifestUrl'], asin, mpd_id=type_, fatal=False)
formats.extend(mpd_fmts)
subtitles = self._merge_subtitles(subtitles, mpd_subs)
else:
self.report_warning(f'Unknown asset type: {type_}')
title_info = self._call_api(
asin, note='Downloading title info', data={
'operationName': 'content',
'variables': {'contentId': asin},
'query': self._GRAPHQL_QUERY_CONTENT,
})
credits_time = try_get(title_info, lambda x: x['timecode']['endCreditsTime'] / 1000)
is_episode = title_info.get('vodType') == 'EPISODE'
return {
'id': asin,
'title': title_info.get('name'),
'formats': formats,
'subtitles': subtitles,
'language': traverse_obj(title_info, ('audioTracks', 0)),
'thumbnails': [{
'id': type_,
'url': url,
} for type_, url in (title_info.get('images') or {}).items()],
'description': traverse_obj(title_info, ('description', 'synopsis')),
'release_timestamp': int_or_none(try_get(title_info, lambda x: x['publicReleaseDateUTC'] / 1000)),
'duration': traverse_obj(title_info, ('description', 'contentLengthInSeconds')),
'chapters': [{
'start_time': credits_time,
'title': 'End Credits',
}] if credits_time else [],
'series': title_info.get('seriesName'),
'series_id': title_info.get('seriesId'),
'season_number': title_info.get('seasonNumber'),
'season_id': title_info.get('seasonId'),
'episode': title_info.get('name') if is_episode else None,
'episode_number': title_info.get('episodeNumber'),
'episode_id': asin if is_episode else None,
}
class AmazonMiniTVSeasonIE(AmazonMiniTVBaseIE):
IE_NAME = 'amazonminitv:season'
_VALID_URL = r'amazonminitv:season:(?:amzn1\.dv\.gti\.)?(?P<id>[a-f0-9-]+)'
IE_DESC = 'Amazon MiniTV Series, "minitv:season:" prefix'
_TESTS = [{
'url': 'amazonminitv:season:amzn1.dv.gti.0aa996eb-6a1b-4886-a342-387fbd2f1db0',
'playlist_mincount': 6,
'info_dict': {
'id': 'amzn1.dv.gti.0aa996eb-6a1b-4886-a342-387fbd2f1db0',
},
}, {
'url': 'amazonminitv:season:0aa996eb-6a1b-4886-a342-387fbd2f1db0',
'only_matching': True,
}]
_GRAPHQL_QUERY = '''
query getEpisodes($sessionIdToken: String!, $clientId: String, $episodeOrSeasonId: ID!, $deviceLocale: String) {
getEpisodes(
applicationContextInput: {sessionIdToken: $sessionIdToken, deviceLocale: $deviceLocale, clientId: $clientId}
episodeOrSeasonId: $episodeOrSeasonId
) {
episodes {
... on Episode {
contentId
name
images
seriesName
seasonId
seriesId
seasonNumber
episodeNumber
description {
synopsis
contentLengthInSeconds
}
publicReleaseDateUTC
}
}
}
}
'''
def _entries(self, asin):
season_info = self._call_api(
asin, note='Downloading season info', data={
'operationName': 'getEpisodes',
'variables': {'episodeOrSeasonId': asin},
'query': self._GRAPHQL_QUERY,
})
for episode in season_info['episodes']:
yield self.url_result(
f'amazonminitv:{episode["contentId"]}', AmazonMiniTVIE, episode['contentId'])
def _real_extract(self, url):
asin = f'amzn1.dv.gti.{self._match_id(url)}'
return self.playlist_result(self._entries(asin), asin)
class AmazonMiniTVSeriesIE(AmazonMiniTVBaseIE):
IE_NAME = 'amazonminitv:series'
_VALID_URL = r'amazonminitv:series:(?:amzn1\.dv\.gti\.)?(?P<id>[a-f0-9-]+)'
_TESTS = [{
'url': 'amazonminitv:series:amzn1.dv.gti.56521d46-b040-4fd5-872e-3e70476a04b0',
'playlist_mincount': 3,
'info_dict': {
'id': 'amzn1.dv.gti.56521d46-b040-4fd5-872e-3e70476a04b0',
},
}, {
'url': 'amazonminitv:series:56521d46-b040-4fd5-872e-3e70476a04b0',
'only_matching': True,
}]
_GRAPHQL_QUERY = '''
query getSeasons($sessionIdToken: String!, $deviceLocale: String, $episodeOrSeasonOrSeriesId: ID!, $clientId: String) {
getSeasons(
applicationContextInput: {deviceLocale: $deviceLocale, sessionIdToken: $sessionIdToken, clientId: $clientId}
episodeOrSeasonOrSeriesId: $episodeOrSeasonOrSeriesId
) {
seasons {
seasonId
}
}
}
'''
def _entries(self, asin):
season_info = self._call_api(
asin, note='Downloading series info', data={
'operationName': 'getSeasons',
'variables': {'episodeOrSeasonOrSeriesId': asin},
'query': self._GRAPHQL_QUERY,
})
for season in season_info['seasons']:
yield self.url_result(f'amazonminitv:season:{season["seasonId"]}', AmazonMiniTVSeasonIE, season['seasonId'])
def _real_extract(self, url):
asin = f'amzn1.dv.gti.{self._match_id(url)}'
return self.playlist_result(self._entries(asin), asin)

View File

@@ -9,7 +9,7 @@ from ..utils import (
)
class AMCNetworksIE(ThePlatformIE):
class AMCNetworksIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
_VALID_URL = r'https?://(?:www\.)?(?P<site>amc|bbcamerica|ifc|(?:we|sundance)tv)\.com/(?P<id>(?:movies|shows(?:/[^/]+)+)/[^/?#&]+)'
_TESTS = [{
'url': 'https://www.bbcamerica.com/shows/the-graham-norton-show/videos/tina-feys-adorable-airline-themed-family-dinner--51631',
@@ -106,7 +106,6 @@ class AMCNetworksIE(ThePlatformIE):
media_url = update_url_query(media_url, query)
formats, subtitles = self._extract_theplatform_smil(
media_url, video_id)
self._sort_formats(formats)
thumbnails = []
thumbnail_urls = [properties.get('imageDesktop')]

View File

@@ -10,7 +10,7 @@ from ..utils import (
)
class AMPIE(InfoExtractor):
class AMPIE(InfoExtractor): # XXX: Conventionally, base classes should end with BaseIE/InfoExtractor
# parse Akamai Adaptive Media Player feed
def _extract_feed_info(self, url):
feed = self._download_json(
@@ -84,8 +84,6 @@ class AMPIE(InfoExtractor):
'ext': ext,
})
self._sort_formats(formats)
timestamp = unified_timestamp(item.get('pubDate'), ' ') or parse_iso8601(item.get('dc-date'))
return {

View File

@@ -19,7 +19,6 @@ class Ant1NewsGrBaseIE(InfoExtractor):
raise ExtractorError('no source found for %s' % video_id)
formats, subs = (self._extract_m3u8_formats_and_subtitles(source, video_id, 'mp4')
if determine_ext(source) == 'm3u8' else ([{'url': source}], {}))
self._sort_formats(formats)
thumbnails = scale_thumbnails_to_max_format_width(
formats, [{'url': info['thumb']}], r'(?<=/imgHandler/)\d+')
return {

View File

@@ -354,8 +354,6 @@ class AnvatoIE(InfoExtractor):
})
formats.append(a_format)
self._sort_formats(formats)
subtitles = {}
for caption in video_data.get('captions', []):
a_caption = {

View File

@@ -9,7 +9,7 @@ from ..utils import (
)
class AolIE(YahooIE):
class AolIE(YahooIE): # XXX: Do not subclass from concrete IE
IE_NAME = 'aol.com'
_VALID_URL = r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>\d{9}|[0-9a-f]{24}|[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12})'
@@ -119,7 +119,6 @@ class AolIE(YahooIE):
'height': int_or_none(qs.get('h', [None])[0]),
})
formats.append(f)
self._sort_formats(formats)
return {
'id': video_id,

View File

@@ -72,7 +72,6 @@ class APAIE(InfoExtractor):
'format_id': format_id,
'height': height,
})
self._sort_formats(formats)
return {
'id': video_id,

View File

@@ -73,7 +73,6 @@ class AparatIE(InfoExtractor):
r'(\d+)[pP]', label or '', 'height',
default=None)),
})
self._sort_formats(formats)
info = self._search_json_ld(webpage, video_id, default={})

View File

@@ -120,7 +120,6 @@ class AppleTrailersIE(InfoExtractor):
'height': int_or_none(size_data.get('height')),
'language': version[:2],
})
self._sort_formats(formats)
entries.append({
'id': movie + '-' + re.sub(r'[^a-zA-Z0-9]', '', clip_title).lower(),
@@ -185,8 +184,6 @@ class AppleTrailersIE(InfoExtractor):
'height': int_or_none(format['height']),
})
self._sort_formats(formats)
playlist.append({
'_type': 'video',
'id': video_id,

View File

@@ -312,7 +312,7 @@ class ArchiveOrgIE(InfoExtractor):
})
for entry in entries.values():
self._sort_formats(entry['formats'], ('source', ))
entry['_format_sort_fields'] = ('source', )
if len(entries) == 1:
# If there's only one item, use it as the main info dict

View File

@@ -144,7 +144,6 @@ class ArcPublishingIE(InfoExtractor):
'url': s_url,
'quality': -10,
})
self._sort_formats(formats)
subtitles = {}
for subtitle in (try_get(video, lambda x: x['subtitles']['urls'], list) or []):

View File

@@ -40,14 +40,15 @@ class ARDMediathekBaseIE(InfoExtractor):
'This video is not available due to geoblocking',
countries=self._GEO_COUNTRIES, metadata_available=True)
self._sort_formats(formats)
subtitles = {}
subtitle_url = media_info.get('_subtitleUrl')
if subtitle_url:
subtitles['de'] = [{
'ext': 'ttml',
'url': subtitle_url,
}, {
'ext': 'vtt',
'url': subtitle_url.replace('/ebutt/', '/webvtt/') + '.vtt',
}]
return {
@@ -262,7 +263,6 @@ class ARDMediathekIE(ARDMediathekBaseIE):
'format_id': fid,
'url': furl,
})
self._sort_formats(formats)
info = {
'formats': formats,
}
@@ -289,16 +289,16 @@ class ARDMediathekIE(ARDMediathekBaseIE):
class ARDIE(InfoExtractor):
_VALID_URL = r'(?P<mainurl>https?://(?:www\.)?daserste\.de/(?:[^/?#&]+/)+(?P<id>[^/?#&]+))\.html'
_TESTS = [{
# available till 7.01.2022
'url': 'https://www.daserste.de/information/talk/maischberger/videos/maischberger-die-woche-video100.html',
'md5': '867d8aa39eeaf6d76407c5ad1bb0d4c1',
# available till 7.12.2023
'url': 'https://www.daserste.de/information/talk/maischberger/videos/maischberger-video-424.html',
'md5': 'a438f671e87a7eba04000336a119ccc4',
'info_dict': {
'id': 'maischberger-die-woche-video100',
'display_id': 'maischberger-die-woche-video100',
'id': 'maischberger-video-424',
'display_id': 'maischberger-video-424',
'ext': 'mp4',
'duration': 3687.0,
'title': 'maischberger. die woche vom 7. Januar 2021',
'upload_date': '20210107',
'duration': 4452.0,
'title': 'maischberger am 07.12.2022',
'upload_date': '20221207',
'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
@@ -371,7 +371,6 @@ class ARDIE(InfoExtractor):
continue
f['url'] = format_url
formats.append(f)
self._sort_formats(formats)
_SUB_FORMATS = (
('./dataTimedText', 'ttml'),

View File

@@ -136,7 +136,6 @@ class ArkenaIE(InfoExtractor):
elif mime_type == 'application/vnd.ms-sstr+xml':
formats.extend(self._extract_ism_formats(
href, video_id, ism_id='mss', fatal=False))
self._sort_formats(formats)
return {
'id': video_id,

View File

@@ -73,7 +73,6 @@ class ArnesIE(InfoExtractor):
'width': int_or_none(media.get('width')),
'height': int_or_none(media.get('height')),
})
self._sort_formats(formats)
channel = video.get('channel') or {}
channel_id = channel.get('url')

View File

@@ -65,6 +65,21 @@ class ArteTVIE(ArteTVBaseIE):
}, {
'url': 'https://api.arte.tv/api/player/v2/config/de/LIVE',
'only_matching': True,
}, {
'url': 'https://www.arte.tv/de/videos/110203-006-A/zaz/',
'info_dict': {
'id': '110203-006-A',
'chapters': 'count:16',
'description': 'md5:cf592f1df52fe52007e3f8eac813c084',
'alt_title': 'Zaz',
'title': 'Baloise Session 2022',
'timestamp': 1668445200,
'duration': 4054,
'thumbnail': 'https://api-cdn.arte.tv/img/v2/image/ubQjmVCGyRx3hmBuZEK9QZ/940x530',
'upload_date': '20221114',
'ext': 'mp4',
},
'expected_warnings': ['geo restricted']
}]
_GEO_BYPASS = True
@@ -180,13 +195,8 @@ class ArteTVIE(ArteTVBaseIE):
else:
self.report_warning(f'Skipping stream with unknown protocol {stream["protocol"]}')
# TODO: chapters from stream['segments']?
# The JS also looks for chapters in config['data']['attributes']['chapters'],
# but I am yet to find a video having those
formats.extend(secondary_formats)
self._remove_duplicate_formats(formats)
self._sort_formats(formats)
metadata = config['data']['attributes']['metadata']
@@ -206,6 +216,11 @@ class ArteTVIE(ArteTVBaseIE):
{'url': image['url'], 'id': image.get('caption')}
for image in metadata.get('images') or [] if url_or_none(image.get('url'))
],
# TODO: chapters may also be in stream['segments']?
'chapters': traverse_obj(config, ('data', 'attributes', 'chapters', 'elements', ..., {
'start_time': 'startTime',
'title': 'title',
})) or None,
}

View File

@@ -84,7 +84,6 @@ class AtresPlayerIE(InfoExtractor):
elif src_type == 'application/dash+xml':
formats, subtitles = self._extract_mpd_formats(
src, video_id, mpd_id='dash', fatal=False)
self._sort_formats(formats)
heartbeat = episode.get('heartbeat') or {}
omniture = episode.get('omniture') or {}

View File

@@ -49,7 +49,6 @@ class ATVAtIE(InfoExtractor):
'url': source_url,
'format_id': protocol,
})
self._sort_formats(formats)
return {
'id': clip_id,

View File

@@ -76,7 +76,6 @@ class AudiMediaIE(InfoExtractor):
'format_id': 'http-%s' % bitrate,
})
formats.append(f)
self._sort_formats(formats)
return {
'id': video_id,

View File

@@ -168,7 +168,7 @@ class AudiusIE(AudiusBaseIE):
}
class AudiusTrackIE(AudiusIE):
class AudiusTrackIE(AudiusIE): # XXX: Do not subclass from concrete IE
_VALID_URL = r'''(?x)(?:audius:)(?:https?://(?:www\.)?.+/v1/tracks/)?(?P<track_id>\w+)'''
IE_NAME = 'audius:track'
IE_DESC = 'Audius track ID or API link. Prepend with "audius:"'
@@ -243,7 +243,7 @@ class AudiusPlaylistIE(AudiusBaseIE):
playlist_data.get('description'))
class AudiusProfileIE(AudiusPlaylistIE):
class AudiusProfileIE(AudiusPlaylistIE): # XXX: Do not subclass from concrete IE
IE_NAME = 'audius:artist'
IE_DESC = 'Audius.co profile/artist pages'
_VALID_URL = r'https?://(?:www)?audius\.co/(?P<id>[^\/]+)/?(?:[?#]|$)'

View File

@@ -6,7 +6,7 @@ from .common import InfoExtractor
from ..compat import compat_urllib_parse_urlencode
class AWSIE(InfoExtractor):
class AWSIE(InfoExtractor): # XXX: Conventionally, base classes should end with BaseIE/InfoExtractor
_AWS_ALGORITHM = 'AWS4-HMAC-SHA256'
_AWS_REGION = 'us-east-1'

View File

@@ -80,8 +80,6 @@ class BanByeIE(BanByeBaseIE):
'url': f'{self._CDN_BASE}/video/{video_id}/{quality}.mp4',
} for quality in data['quality']]
self._sort_formats(formats)
return {
'id': video_id,
'title': data.get('title'),

View File

@@ -1,8 +1,8 @@
from .brightcove import BrightcoveNewIE
from .brightcove import BrightcoveNewBaseIE
from ..utils import extract_attributes
class BandaiChannelIE(BrightcoveNewIE):
class BandaiChannelIE(BrightcoveNewBaseIE):
IE_NAME = 'bandaichannel'
_VALID_URL = r'https?://(?:www\.)?b-ch\.com/titles/(?P<id>\d+/\d+)'
_TESTS = [{

View File

@@ -29,11 +29,18 @@ class BandcampIE(InfoExtractor):
'info_dict': {
'id': '1812978515',
'ext': 'mp3',
'title': "youtube-dl \"'/\\ä↭ - youtube-dl \"'/\\ä↭ - youtube-dl test song \"'/\\ä↭",
'title': 'youtube-dl "\'/\\ä↭ - youtube-dl "\'/\\ä↭ - youtube-dl test song "\'/\\ä↭',
'duration': 9.8485,
'uploader': 'youtube-dl "\'/\\ä↭',
'uploader': 'youtube-dl "\'/\\ä↭',
'upload_date': '20121129',
'timestamp': 1354224127,
'track': 'youtube-dl "\'/\\ä↭ - youtube-dl test song "\'/\\ä↭',
'album_artist': 'youtube-dl "\'/\\ä↭',
'track_id': '1812978515',
'artist': 'youtube-dl "\'/\\ä↭',
'uploader_url': 'https://youtube-dl.bandcamp.com',
'uploader_id': 'youtube-dl',
'thumbnail': 'https://f4.bcbits.com/img/a3216802731_5.jpg',
},
'_skip': 'There is a limit of 200 free downloads / month for the test song'
}, {
@@ -41,7 +48,8 @@ class BandcampIE(InfoExtractor):
'url': 'http://benprunty.bandcamp.com/track/lanius-battle',
'info_dict': {
'id': '2650410135',
'ext': 'aiff',
'ext': 'm4a',
'acodec': r're:[fa]lac',
'title': 'Ben Prunty - Lanius (Battle)',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Ben Prunty',
@@ -54,7 +62,10 @@ class BandcampIE(InfoExtractor):
'track_number': 1,
'track_id': '2650410135',
'artist': 'Ben Prunty',
'album_artist': 'Ben Prunty',
'album': 'FTL: Advanced Edition Soundtrack',
'uploader_url': 'https://benprunty.bandcamp.com',
'uploader_id': 'benprunty',
},
}, {
# no free download, mp3 128
@@ -75,7 +86,34 @@ class BandcampIE(InfoExtractor):
'track_number': 5,
'track_id': '2584466013',
'artist': 'Mastodon',
'album_artist': 'Mastodon',
'album': 'Call of the Mastodon',
'uploader_url': 'https://relapsealumni.bandcamp.com',
'uploader_id': 'relapsealumni',
},
}, {
# track from compilation album (artist/album_artist difference)
'url': 'https://diskotopia.bandcamp.com/track/safehouse',
'md5': '19c5337bca1428afa54129f86a2f6a69',
'info_dict': {
'id': '1978174799',
'ext': 'mp3',
'title': 'submerse - submerse - Safehouse',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'submerse',
'timestamp': 1480779297,
'upload_date': '20161203',
'release_timestamp': 1481068800,
'release_date': '20161207',
'duration': 154.066,
'track': 'submerse - Safehouse',
'track_number': 3,
'track_id': '1978174799',
'artist': 'submerse',
'album_artist': 'Diskotopia',
'album': 'DSK F/W 2016-2017 Free Compilation',
'uploader_url': 'https://diskotopia.bandcamp.com',
'uploader_id': 'diskotopia',
},
}]
@@ -121,6 +159,9 @@ class BandcampIE(InfoExtractor):
embed = self._extract_data_attr(webpage, title, 'embed', False)
current = tralbum.get('current') or {}
artist = embed.get('artist') or current.get('artist') or tralbum.get('artist')
album_artist = self._html_search_regex(
r'<h3 class="albumTitle">[\S\s]*?by\s*<span>\s*<a href="[^>]+">\s*([^>]+?)\s*</a>',
webpage, 'album artist', fatal=False)
timestamp = unified_timestamp(
current.get('publish_date') or tralbum.get('album_publish_date'))
@@ -184,8 +225,6 @@ class BandcampIE(InfoExtractor):
'acodec': format_id.split('-')[0],
})
self._sort_formats(formats)
title = '%s - %s' % (artist, track) if artist else track
if not duration:
@@ -207,11 +246,12 @@ class BandcampIE(InfoExtractor):
'track_id': track_id,
'artist': artist,
'album': embed.get('album_title'),
'album_artist': album_artist,
'formats': formats,
}
class BandcampAlbumIE(BandcampIE):
class BandcampAlbumIE(BandcampIE): # XXX: Do not subclass from concrete IE
IE_NAME = 'Bandcamp:album'
_VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com/album/(?P<id>[^/?#&]+)'
@@ -314,7 +354,7 @@ class BandcampAlbumIE(BandcampIE):
}
class BandcampWeeklyIE(BandcampIE):
class BandcampWeeklyIE(BandcampIE): # XXX: Do not subclass from concrete IE
IE_NAME = 'Bandcamp:weekly'
_VALID_URL = r'https?://(?:www\.)?bandcamp\.com/?\?(?:.*?&)?show=(?P<id>\d+)'
_TESTS = [{
@@ -363,7 +403,6 @@ class BandcampWeeklyIE(BandcampIE):
'ext': ext,
'vcodec': 'none',
})
self._sort_formats(formats)
title = show.get('audio_title') or 'Bandcamp Weekly'
subtitle = show.get('subtitle')

View File

@@ -135,7 +135,6 @@ query GetCommentReplies($id: String!) {
formats.extend(self._extract_m3u8_formats(
video_info.get('streamUrl'), video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls', live=True))
self._sort_formats(formats)
return {
'id': video_id,

View File

@@ -575,8 +575,6 @@ class BBCCoUkIE(InfoExtractor):
else:
programme_id, title, description, duration, formats, subtitles = self._download_playlist(group_id)
self._sort_formats(formats)
return {
'id': programme_id,
'title': title,
@@ -588,7 +586,7 @@ class BBCCoUkIE(InfoExtractor):
}
class BBCIE(BBCCoUkIE):
class BBCIE(BBCCoUkIE): # XXX: Do not subclass from concrete IE
IE_NAME = 'bbc'
IE_DESC = 'BBC'
_VALID_URL = r'''(?x)
@@ -890,7 +888,6 @@ class BBCIE(BBCCoUkIE):
def _extract_from_playlist_sxml(self, url, playlist_id, timestamp):
programme_id, title, description, duration, formats, subtitles = \
self._process_legacy_playlist_url(url, playlist_id)
self._sort_formats(formats)
return {
'id': programme_id,
'title': title,
@@ -954,7 +951,6 @@ class BBCIE(BBCCoUkIE):
duration = int_or_none(items[0].get('duration'))
programme_id = items[0].get('vpid')
formats, subtitles = self._download_media_selector(programme_id)
self._sort_formats(formats)
entries.append({
'id': programme_id,
'title': title,
@@ -991,7 +987,6 @@ class BBCIE(BBCCoUkIE):
continue
raise
if entry:
self._sort_formats(entry['formats'])
entries.append(entry)
if entries:
@@ -1015,7 +1010,6 @@ class BBCIE(BBCCoUkIE):
if programme_id:
formats, subtitles = self._download_media_selector(programme_id)
self._sort_formats(formats)
# digitalData may be missing (e.g. http://www.bbc.com/autos/story/20130513-hyundais-rock-star)
digital_data = self._parse_json(
self._search_regex(
@@ -1047,7 +1041,6 @@ class BBCIE(BBCCoUkIE):
if version_id:
title = smp_data['title']
formats, subtitles = self._download_media_selector(version_id)
self._sort_formats(formats)
image_url = smp_data.get('holdingImageURL')
display_date = init_data.get('displayDate')
topic_title = init_data.get('topicTitle')
@@ -1089,7 +1082,6 @@ class BBCIE(BBCCoUkIE):
continue
title = lead_media.get('title') or self._og_search_title(webpage)
formats, subtitles = self._download_media_selector(programme_id)
self._sort_formats(formats)
description = lead_media.get('summary')
uploader = lead_media.get('masterBrand')
uploader_id = lead_media.get('mid')
@@ -1118,7 +1110,6 @@ class BBCIE(BBCCoUkIE):
if current_programme and programme_id and current_programme.get('type') == 'playable_item':
title = current_programme.get('titles', {}).get('tertiary') or playlist_title
formats, subtitles = self._download_media_selector(programme_id)
self._sort_formats(formats)
synopses = current_programme.get('synopses') or {}
network = current_programme.get('network') or {}
duration = int_or_none(
@@ -1151,7 +1142,6 @@ class BBCIE(BBCCoUkIE):
clip_title = clip.get('title')
if clip_vpid and clip_title:
formats, subtitles = self._download_media_selector(clip_vpid)
self._sort_formats(formats)
return {
'id': clip_vpid,
'title': clip_title,
@@ -1173,7 +1163,6 @@ class BBCIE(BBCCoUkIE):
if not programme_id:
continue
formats, subtitles = self._download_media_selector(programme_id)
self._sort_formats(formats)
entries.append({
'id': programme_id,
'title': playlist_title,
@@ -1205,7 +1194,6 @@ class BBCIE(BBCCoUkIE):
if not (item_id and item_title):
continue
formats, subtitles = self._download_media_selector(item_id)
self._sort_formats(formats)
item_desc = None
blocks = try_get(media, lambda x: x['summary']['blocks'], list)
if blocks:
@@ -1306,7 +1294,6 @@ class BBCIE(BBCCoUkIE):
formats, subtitles = self._extract_from_media_meta(media_meta, playlist_id)
if not formats and not self.get_param('ignore_no_formats'):
continue
self._sort_formats(formats)
video_id = media_meta.get('externalId')
if not video_id:

View File

@@ -0,0 +1,101 @@
from .common import InfoExtractor
from .youtube import YoutubeIE, YoutubeTabIE
class BeatBumpVideoIE(InfoExtractor):
_VALID_URL = r'https://beatbump\.ml/listen\?id=(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://beatbump.ml/listen?id=MgNrAu2pzNs',
'md5': '5ff3fff41d3935b9810a9731e485fe66',
'info_dict': {
'id': 'MgNrAu2pzNs',
'ext': 'mp4',
'uploader_url': 'http://www.youtube.com/channel/UC-pWHpBjdGG69N9mM2auIAA',
'artist': 'Stephen',
'thumbnail': 'https://i.ytimg.com/vi_webp/MgNrAu2pzNs/maxresdefault.webp',
'channel_url': 'https://www.youtube.com/channel/UC-pWHpBjdGG69N9mM2auIAA',
'upload_date': '20190312',
'categories': ['Music'],
'playable_in_embed': True,
'duration': 169,
'like_count': int,
'alt_title': 'Voyeur Girl',
'view_count': int,
'track': 'Voyeur Girl',
'uploader': 'Stephen - Topic',
'title': 'Voyeur Girl',
'channel_follower_count': int,
'uploader_id': 'UC-pWHpBjdGG69N9mM2auIAA',
'age_limit': 0,
'availability': 'public',
'live_status': 'not_live',
'album': 'it\'s too much love to know my dear',
'channel': 'Stephen',
'comment_count': int,
'description': 'md5:7ae382a65843d6df2685993e90a8628f',
'tags': 'count:11',
'creator': 'Stephen',
'channel_id': 'UC-pWHpBjdGG69N9mM2auIAA',
}
}]
def _real_extract(self, url):
id_ = self._match_id(url)
return self.url_result(f'https://music.youtube.com/watch?v={id_}', YoutubeIE, id_)
class BeatBumpPlaylistIE(InfoExtractor):
_VALID_URL = r'https://beatbump\.ml/(?:release\?id=|artist/|playlist/)(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://beatbump.ml/release?id=MPREb_gTAcphH99wE',
'playlist_count': 50,
'info_dict': {
'id': 'OLAK5uy_l1m0thk3g31NmIIz_vMIbWtyv7eZixlH0',
'availability': 'unlisted',
'view_count': int,
'title': 'Album - Royalty Free Music Library V2 (50 Songs)',
'description': '',
'tags': [],
'modified_date': '20221223',
}
}, {
'url': 'https://beatbump.ml/artist/UC_aEa8K-EOJ3D6gOs7HcyNg',
'playlist_mincount': 1,
'params': {'flatplaylist': True},
'info_dict': {
'id': 'UC_aEa8K-EOJ3D6gOs7HcyNg',
'uploader_url': 'https://www.youtube.com/channel/UC_aEa8K-EOJ3D6gOs7HcyNg',
'channel_url': 'https://www.youtube.com/channel/UC_aEa8K-EOJ3D6gOs7HcyNg',
'uploader_id': 'UC_aEa8K-EOJ3D6gOs7HcyNg',
'channel_follower_count': int,
'title': 'NoCopyrightSounds - Videos',
'uploader': 'NoCopyrightSounds',
'description': 'md5:cd4fd53d81d363d05eee6c1b478b491a',
'channel': 'NoCopyrightSounds',
'tags': 'count:12',
'channel_id': 'UC_aEa8K-EOJ3D6gOs7HcyNg',
},
}, {
'url': 'https://beatbump.ml/playlist/VLPLRBp0Fe2GpgmgoscNFLxNyBVSFVdYmFkq',
'playlist_mincount': 1,
'params': {'flatplaylist': True},
'info_dict': {
'id': 'PLRBp0Fe2GpgmgoscNFLxNyBVSFVdYmFkq',
'uploader_url': 'https://www.youtube.com/@NoCopyrightSounds',
'description': 'Providing you with copyright free / safe music for gaming, live streaming, studying and more!',
'view_count': int,
'channel_url': 'https://www.youtube.com/@NoCopyrightSounds',
'uploader_id': 'UC_aEa8K-EOJ3D6gOs7HcyNg',
'title': 'NCS : All Releases 💿',
'uploader': 'NoCopyrightSounds',
'availability': 'public',
'channel': 'NoCopyrightSounds',
'tags': [],
'modified_date': '20221225',
'channel_id': 'UC_aEa8K-EOJ3D6gOs7HcyNg',
}
}]
def _real_extract(self, url):
id_ = self._match_id(url)
return self.url_result(f'https://music.youtube.com/browse/{id_}', YoutubeTabIE, id_)

View File

@@ -74,7 +74,6 @@ class BeatportIE(InfoExtractor):
fmt['abr'] = 96
fmt['asr'] = 44100
formats.append(fmt)
self._sort_formats(formats)
images = []
for name, info in track['images'].items():

View File

@@ -76,8 +76,6 @@ class BeegIE(InfoExtractor):
f['height'] = height
formats.extend(current_formats)
self._sort_formats(formats)
return {
'id': video_id,
'display_id': first_fact.get('id'),

View File

@@ -42,7 +42,7 @@ class BFMTVIE(BFMTVBaseIE):
return self._brightcove_url_result(video_block['videoid'], video_block)
class BFMTVLiveIE(BFMTVIE):
class BFMTVLiveIE(BFMTVIE): # XXX: Do not subclass from concrete IE
IE_NAME = 'bfmtv:live'
_VALID_URL = BFMTVBaseIE._VALID_URL_BASE + '(?P<id>(?:[^/]+/)?en-direct)'
_TESTS = [{

View File

@@ -63,8 +63,6 @@ class BigflixIE(InfoExtractor):
'url': decode_url(file_url),
})
self._sort_formats(formats)
description = self._html_search_meta('description', webpage)
return {

View File

@@ -16,13 +16,16 @@ from ..utils import (
format_field,
int_or_none,
make_archive_id,
merge_dicts,
mimetype2ext,
parse_count,
parse_qs,
qualities,
smuggle_url,
srt_subtitles_timecode,
str_or_none,
traverse_obj,
unsmuggle_url,
url_or_none,
urlencode_postdata,
)
@@ -65,9 +68,8 @@ class BilibiliBaseIE(InfoExtractor):
missing_formats = format_names.keys() - set(traverse_obj(formats, (..., 'quality')))
if missing_formats:
self.to_screen(f'Format(s) {", ".join(format_names[i] for i in missing_formats)} are missing; '
'you have to login or become premium member to download them')
f'you have to login or become premium member to download them. {self._login_hint()}')
self._sort_formats(formats)
return formats
def json2srt(self, json_data):
@@ -304,7 +306,8 @@ class BiliBiliIE(BilibiliBaseIE):
getter=lambda entry: f'https://www.bilibili.com/video/{video_id}?p={entry["page"]}')
if is_anthology:
title += f' p{part_id:02d} {traverse_obj(page_list_json, ((part_id or 1) - 1, "part")) or ""}'
part_id = part_id or 1
title += f' p{part_id:02d} {traverse_obj(page_list_json, (part_id - 1, "part")) or ""}'
aid = video_data.get('aid')
old_video_id = format_field(aid, None, f'%s_part{part_id or 1}')
@@ -879,19 +882,14 @@ class BiliIntlBaseIE(InfoExtractor):
'filesize': aud.get('size'),
})
self._sort_formats(formats)
return formats
def _extract_video_info(self, video_data, *, ep_id=None, aid=None):
def _parse_video_metadata(self, video_data):
return {
'id': ep_id or aid,
'title': video_data.get('title_display') or video_data.get('title'),
'thumbnail': video_data.get('cover'),
'episode_number': int_or_none(self._search_regex(
r'^E(\d+)(?:$| - )', video_data.get('title_display') or '', 'episode number', default=None)),
'formats': self._get_formats(ep_id=ep_id, aid=aid),
'subtitles': self._get_subtitles(ep_id=ep_id, aid=aid),
'extractor_key': BiliIntlIE.ie_key(),
}
def _perform_login(self, username, password):
@@ -937,6 +935,10 @@ class BiliIntlIE(BiliIntlBaseIE):
'title': 'E2 - The First Night',
'thumbnail': r're:^https://pic\.bstarstatic\.com/ogv/.+\.png$',
'episode_number': 2,
'upload_date': '20201009',
'episode': 'Episode 2',
'timestamp': 1602259500,
'description': 'md5:297b5a17155eb645e14a14b385ab547e',
}
}, {
# Non-Bstation page
@@ -947,6 +949,10 @@ class BiliIntlIE(BiliIntlBaseIE):
'title': 'E3 - Who?',
'thumbnail': r're:^https://pic\.bstarstatic\.com/ogv/.+\.png$',
'episode_number': 3,
'description': 'md5:e1a775e71a35c43f141484715470ad09',
'episode': 'Episode 3',
'upload_date': '20211219',
'timestamp': 1639928700,
}
}, {
# Subtitle with empty content
@@ -959,6 +965,17 @@ class BiliIntlIE(BiliIntlBaseIE):
'episode_number': 140,
},
'skip': 'According to the copyright owner\'s request, you may only watch the video after you log in.'
}, {
'url': 'https://www.bilibili.tv/en/video/2041863208',
'info_dict': {
'id': '2041863208',
'ext': 'mp4',
'timestamp': 1670874843,
'description': 'Scheduled for April 2023.\nStudio: ufotable',
'thumbnail': r're:https?://pic[-\.]bstarstatic.+/ugc/.+\.jpg$',
'upload_date': '20221212',
'title': 'Kimetsu no Yaiba Season 3 Official Trailer - Bstation',
}
}, {
'url': 'https://www.biliintl.com/en/play/34613/341736',
'only_matching': True,
@@ -976,42 +993,78 @@ class BiliIntlIE(BiliIntlBaseIE):
'only_matching': True,
}]
def _real_extract(self, url):
season_id, ep_id, aid = self._match_valid_url(url).group('season_id', 'ep_id', 'aid')
video_id = ep_id or aid
def _make_url(video_id, series_id=None):
if series_id:
return f'https://www.bilibili.tv/en/play/{series_id}/{video_id}'
return f'https://www.bilibili.tv/en/video/{video_id}'
def _extract_video_metadata(self, url, video_id, season_id):
url, smuggled_data = unsmuggle_url(url, {})
if smuggled_data.get('title'):
return smuggled_data
webpage = self._download_webpage(url, video_id)
# Bstation layout
initial_data = (
self._search_json(r'window\.__INITIAL_(?:DATA|STATE)__\s*=', webpage, 'preload state', video_id, default={})
or self._search_nuxt_data(webpage, video_id, '__initialState', fatal=False, traverse=None))
video_data = traverse_obj(
initial_data, ('OgvVideo', 'epDetail'), ('UgcVideo', 'videoData'), ('ugc', 'archive'), expected_type=dict)
initial_data, ('OgvVideo', 'epDetail'), ('UgcVideo', 'videoData'), ('ugc', 'archive'), expected_type=dict) or {}
if season_id and not video_data:
# Non-Bstation layout, read through episode list
season_json = self._call_api(f'/web/v2/ogv/play/episodes?season_id={season_id}&platform=web', video_id)
video_data = traverse_obj(season_json,
('sections', ..., 'episodes', lambda _, v: str(v['episode_id']) == ep_id),
expected_type=dict, get_all=False)
return self._extract_video_info(video_data or {}, ep_id=ep_id, aid=aid)
video_data = traverse_obj(season_json, (
'sections', ..., 'episodes', lambda _, v: str(v['episode_id']) == video_id
), expected_type=dict, get_all=False)
# XXX: webpage metadata may not accurate, it just used to not crash when video_data not found
return merge_dicts(
self._parse_video_metadata(video_data), self._search_json_ld(webpage, video_id), {
'title': self._html_search_meta('og:title', webpage),
'description': self._html_search_meta('og:description', webpage)
})
def _real_extract(self, url):
season_id, ep_id, aid = self._match_valid_url(url).group('season_id', 'ep_id', 'aid')
video_id = ep_id or aid
return {
'id': video_id,
**self._extract_video_metadata(url, video_id, season_id),
'formats': self._get_formats(ep_id=ep_id, aid=aid),
'subtitles': self.extract_subtitles(ep_id=ep_id, aid=aid),
}
class BiliIntlSeriesIE(BiliIntlBaseIE):
_VALID_URL = r'https?://(?:www\.)?bili(?:bili\.tv|intl\.com)/(?:[a-zA-Z]{2}/)?play/(?P<id>\d+)/?(?:[?#]|$)'
IE_NAME = 'biliIntl:series'
_VALID_URL = r'https?://(?:www\.)?bili(?:bili\.tv|intl\.com)/(?:[a-zA-Z]{2}/)?(?:play|media)/(?P<id>\d+)/?(?:[?#]|$)'
_TESTS = [{
'url': 'https://www.bilibili.tv/en/play/34613',
'playlist_mincount': 15,
'info_dict': {
'id': '34613',
'title': 'Fly Me to the Moon',
'description': 'md5:a861ee1c4dc0acfad85f557cc42ac627',
'categories': ['Romance', 'Comedy', 'Slice of life'],
'title': 'TONIKAWA: Over the Moon For You',
'description': 'md5:297b5a17155eb645e14a14b385ab547e',
'categories': ['Slice of life', 'Comedy', 'Romance'],
'thumbnail': r're:^https://pic\.bstarstatic\.com/ogv/.+\.png$',
'view_count': int,
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.bilibili.tv/en/media/1048837',
'info_dict': {
'id': '1048837',
'title': 'SPY×FAMILY',
'description': 'md5:b4434eb1a9a97ad2bccb779514b89f17',
'categories': ['Adventure', 'Action', 'Comedy'],
'thumbnail': r're:^https://pic\.bstarstatic\.com/ogv/.+\.jpg$',
'view_count': int,
},
'playlist_mincount': 25,
}, {
'url': 'https://www.biliintl.com/en/play/34613',
'only_matching': True,
@@ -1022,9 +1075,12 @@ class BiliIntlSeriesIE(BiliIntlBaseIE):
def _entries(self, series_id):
series_json = self._call_api(f'/web/v2/ogv/play/episodes?season_id={series_id}&platform=web', series_id)
for episode in traverse_obj(series_json, ('sections', ..., 'episodes', ...), expected_type=dict, default=[]):
episode_id = str(episode.get('episode_id'))
yield self._extract_video_info(episode, ep_id=episode_id)
for episode in traverse_obj(series_json, ('sections', ..., 'episodes', ...), expected_type=dict):
episode_id = str(episode['episode_id'])
yield self.url_result(smuggle_url(
BiliIntlIE._make_url(episode_id, series_id),
self._parse_video_metadata(episode)
), BiliIntlIE, episode_id)
def _real_extract(self, url):
series_id = self._match_id(url)
@@ -1036,7 +1092,7 @@ class BiliIntlSeriesIE(BiliIntlBaseIE):
class BiliLiveIE(InfoExtractor):
_VALID_URL = r'https?://live.bilibili.com/(?P<id>\d+)'
_VALID_URL = r'https?://live.bilibili.com/(?:blanc/)?(?P<id>\d+)'
_TESTS = [{
'url': 'https://live.bilibili.com/196',
@@ -1052,6 +1108,9 @@ class BiliLiveIE(InfoExtractor):
}, {
'url': 'https://live.bilibili.com/196?broadcast_type=0&is_room_feed=1?spm_id_from=333.999.space_home.strengthen_live_card.click',
'only_matching': True
}, {
'url': 'https://live.bilibili.com/blanc/196',
'only_matching': True
}]
_FORMATS = {
@@ -1105,7 +1164,6 @@ class BiliLiveIE(InfoExtractor):
})
for fmt in traverse_obj(stream_data, ('playurl_info', 'playurl', 'stream', ..., 'format', ...)) or []:
formats.extend(self._parse_formats(qn, fmt))
self._sort_formats(formats)
return {
'id': room_id,
@@ -1114,6 +1172,7 @@ class BiliLiveIE(InfoExtractor):
'thumbnail': room_data.get('user_cover'),
'timestamp': stream_data.get('live_time'),
'formats': formats,
'is_live': True,
'http_headers': {
'Referer': url,
},

View File

@@ -86,7 +86,6 @@ class BIQLEIE(InfoExtractor):
'height': int_or_none(height),
'ext': ext,
})
self._sort_formats(formats)
thumbnails = []
for k, v in item.items():

View File

@@ -117,7 +117,6 @@ class BitChuteIE(InfoExtractor):
self.raise_no_formats(
'Video is unavailable. Please make sure this video is playable in the browser '
'before reporting this issue.', expected=True, video_id=video_id)
self._sort_formats(formats)
return {
'id': video_id,

View File

@@ -45,7 +45,6 @@ class BitwaveStreamIE(InfoExtractor):
formats = self._extract_m3u8_formats(
channel['data']['url'], username,
'mp4')
self._sort_formats(formats)
return {
'id': username,

View File

@@ -67,7 +67,6 @@ class BloombergIE(InfoExtractor):
else:
formats.extend(self._extract_f4m_formats(
stream_url, video_id, f4m_id='hds', fatal=False))
self._sort_formats(formats)
return {
'id': video_id,

View File

@@ -21,8 +21,6 @@ class BokeCCBaseIE(InfoExtractor):
'quality': int(quality.attrib['value']),
} for quality in info_xml.findall('./video/quality')]
self._sort_formats(formats)
return formats

View File

@@ -57,7 +57,6 @@ class BongaCamsIE(InfoExtractor):
formats = self._extract_m3u8_formats(
'%s/hls/stream_%s/playlist.m3u8' % (server_url, uploader_id),
channel_id, 'mp4', m3u8_id='hls', live=True)
self._sort_formats(formats)
return {
'id': channel_id,

View File

@@ -67,7 +67,6 @@ class BooyahClipsIE(BooyahBaseIE):
'height': video_data.get('resolution'),
'preference': -10,
}))
self._sort_formats(formats)
return {
'id': video_id,

View File

@@ -79,8 +79,6 @@ class BoxIE(InfoExtractor):
'url': update_url_query(authenticated_download_url, query),
})
self._sort_formats(formats)
creator = f.get('created_by') or {}
return {

View File

@@ -48,8 +48,6 @@ class BpbIE(InfoExtractor):
'format_id': '%s-%s' % (quality, determine_ext(video_url)),
})
self._sort_formats(formats)
return {
'id': video_id,
'formats': formats,

View File

@@ -157,7 +157,6 @@ class BRIE(InfoExtractor):
'format_id': 'rtmp-%s' % asset_type,
})
formats.append(rtmp_format_info)
self._sort_formats(formats)
return formats
def _extract_thumbnails(self, variants, base_url):
@@ -272,7 +271,6 @@ class BRMediathekIE(InfoExtractor):
'tbr': tbr,
'filesize': int_or_none(node.get('fileSize')),
})
self._sort_formats(formats)
subtitles = {}
for edge in clip.get('captionFiles', {}).get('edges', []):

View File

@@ -63,7 +63,6 @@ class BreakIE(InfoExtractor):
'format_id': 'http-%d' % bitrate if bitrate else 'http',
'tbr': bitrate,
})
self._sort_formats(formats)
title = self._search_regex(
(r'title["\']\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1',

View File

@@ -24,7 +24,6 @@ class BreitBartIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
formats = self._extract_m3u8_formats(f'https://cdn.jwplayer.com/manifests/{video_id}.m3u8', video_id, ext='mp4')
self._sort_formats(formats)
return {
'id': video_id,
'title': self._generic_title('', webpage),

Some files were not shown because too many files have changed in this diff Show More