mirror of https://github.com/yt-dlp/yt-dlp.git synced 2026-01-16 11:51:19 +00:00

Compare commits


108 Commits

Author SHA1 Message Date
github-actions[bot]
6eaa574c82 Release 2025.03.26
Created by: bashonly

:ci skip all
2025-03-26 00:04:51 +00:00
sepro
ecee97b4fa [ie/youtube] Only cache nsig code on successful decoding (#12750)
Authored by: seproDev, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-03-25 23:47:45 +00:00
sepro
a550dfc904 [ie/youtube] Fix signature and nsig extraction for player 4fcd6e4a (#12748)
Closes #12746
Authored by: seproDev
2025-03-25 23:40:58 +00:00
github-actions[bot]
336b33e72f Release 2025.03.25
Created by: bashonly

:ci skip all
2025-03-25 00:07:18 +00:00
sepro
9dde546e7e [cleanup] Misc (#12694)
Authored by: seproDev
2025-03-25 00:05:02 +00:00
Abdulmohsen
66e0bab814 [ie/TVer] Fix extractor (#12659)
Closes #12643, Closes #12282
Authored by: arabcoders, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-03-25 00:00:22 +00:00
doe1080
801afeac91 [ie/streaks] Add extractor (#12679)
Authored by: doe1080
2025-03-24 23:12:09 +00:00
bashonly
86ab79e1a5 [ie] Fix sorting of HLS audio formats by GROUP-ID (#12714)
Closes #11178
Authored by: bashonly
2025-03-24 22:38:22 +00:00
Subrat Lima
3396eb50dc [ie/17live:vod] Add extractor (#12723)
Closes #12570
Authored by: subrat-lima
2025-03-24 22:26:45 +00:00
fireattack
5086d4aed6 [ie/generic] Fix MPD base URL parsing (#12718)
Closes #12709
Authored by: fireattack
2025-03-24 22:24:09 +00:00
sepro
9491b44032 [utils] js_to_json: Make function less fatal (#12715)
Authored by: seproDev
2025-03-24 22:28:47 +01:00
doe1080
b7fbb5a0a1 [ie/vrsquare] Add extractors (#12515)
Authored by: doe1080
2025-03-24 22:28:09 +01:00
bashonly
4054a2b623 [ie/youtube] Fix PhantomJS nsig fallback (#12728)
Also fixes the NSigDeno plugin

Closes #12724
Authored by: bashonly
2025-03-24 21:22:25 +00:00
bashonly
b9c979461b [ie/youtube] Fix signature and nsig extraction for player 363db69b (#12725)
Closes #12724
Authored by: bashonly
2025-03-24 21:18:51 +00:00
bashonly
9d5e6de2e7 [ie/9now.com.au] Fix extractor (#12702)
Closes #12591
Authored by: bashonly
2025-03-23 16:35:46 +00:00
Simon Sawicki
9bf23902ce [rh:curl_cffi] Support curl_cffi 0.10.x (#12670)
Authored by: Grub4K
2025-03-23 00:15:20 +01:00
sepro
be5af3f9e9 [ie/deezer] Remove extractors (#12704)
Authored by: seproDev
2025-03-22 22:53:20 +01:00
sepro
fe4f14b836 [ie/viki] Remove extractors (#12703)
Closes #2907, Closes #2869
Authored by: seproDev
2025-03-22 22:34:07 +01:00
Simon Sawicki
b872ffec50 [core] Fix attribute error on failed VT init (#12696)
Authored by: Grub4K
2025-03-22 21:03:28 +01:00
bashonly
e2dfccaf80 [ie/chzzk:video] Fix extraction (#12692)
Closes #12487
Authored by: dirkf, bashonly

Co-authored-by: dirkf <fieldhouse@gmx.net>
2025-03-22 16:44:05 +00:00
github-actions[bot]
b4488a9e12 Release 2025.03.21
Created by: bashonly

:ci skip all
2025-03-21 23:49:09 +00:00
Simon Sawicki
f36e4b6e65 [cleanup] Misc (#12526)
Authored by: Grub4K, seproDev, gamer191, dirkf

Co-authored-by: sepro <sepro@sepr0.com>
2025-03-21 23:41:56 +00:00
D Trombett
983095485c [ie/loco] Add extractor (#12667)
Closes #12496
Authored by: DTrombett
2025-03-21 23:24:13 +00:00
Michaël De Boey
bbada3ec07 [ie/ketnet] Remove extractor (#12628)
Authored by: MichaelDeBoey
2025-03-21 23:19:36 +00:00
Michiel Sikma
8305df0001 [ie/soop] Fix timestamp extraction (#12609)
Closes #12606
Authored by: msikma
2025-03-21 23:16:30 +00:00
bashonly
7223d29569 [ie/mitele] Fix extractor (#12689)
Closes #12655
Authored by: bashonly
2025-03-21 23:14:46 +00:00
bashonly
f5fb2229e6 [ie/BilibiliPlaylist] Fix extractor (#12690)
Closes #12651
Authored by: bashonly
2025-03-21 23:04:58 +00:00
JChris246
89a68c4857 [ie/jamendo] Fix thumbnail extraction (#12622)
Closes #11779
Authored by: JChris246, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-03-21 23:04:34 +00:00
sepro
9b868518a1 [ie/youtube] Fix nsig and signature extraction for player 643afba4 (#12684)
Closes #12677, Closes #12682
Authored by: seproDev, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-03-21 20:58:10 +00:00
D Trombett
2ee3a0aff9 [ie/tv8.it] Add live and playlist extractors (#12569)
Closes #12542
Authored by: DTrombett
2025-03-16 23:10:16 +01:00
Arc8ne
01a8be4c23 [ie/Canalsurmas] Add extractor (#12497)
Closes #5516
Authored by: Arc8ne
2025-03-16 23:03:10 +01:00
Refael Ackermann
ebac65aa9e [ie/NBCStations] Fix extractor (#12534)
Authored by: refack
2025-03-16 21:41:32 +00:00
thedenv
4815dac131 [ie/msn] Rework extractor (#12513)
Closes #3225
Authored by: thedenv, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-03-16 19:54:46 +01:00
Simon Sawicki
95f8df2f79 [networking] Always add unsupported suffix on version mismatch (#12626)
Authored by: Grub4K
2025-03-16 12:45:44 +01:00
coletdjnz
e67d786c7c [ie/youtube] Warn on DRM formats (#12593)
Authored by: coletdjnz
2025-03-16 10:28:16 +13:00
sepro
d9a53cc1e6 [ie/reddit] Truncate title (#12567)
Authored by: seproDev
2025-03-15 22:16:00 +01:00
sepro
83b119dadb [ie/tiktok] Truncate title (#12566)
Authored by: seproDev
2025-03-15 22:15:29 +01:00
sepro
06f6de78db [ie/twitter] Truncate title (#12560)
Authored by: seproDev
2025-03-15 22:15:03 +01:00
sepro
3380febe99 [ie/youtube] Player client maintenance (#12603)
Authored by: seproDev
2025-03-15 21:57:56 +01:00
rysson
be0d819e11 [ie/cda] Fix login support (#12552)
Closes #10306
Authored by: rysson
2025-03-15 21:47:50 +01:00
Michaël De Boey
df9ebeec00 [ie/vrtmax] Rework extractor (#12479)
Closes #7997, Closes #8174, Closes #9375
Authored by: MichaelDeBoey, bergoid, seproDev

Co-authored-by: bergoid <bergoid@users.noreply.github.com>
Co-authored-by: sepro <sepro@sepr0.com>
2025-03-15 21:29:22 +01:00
fireattack
17504f2535 [ie/openrec] Fix _VALID_URL (#12608)
Authored by: fireattack
2025-03-15 17:14:01 +01:00
coletdjnz
4432a9390c [ie/youtube] Split into package (#12557)
Authored by: coletdjnz
2025-03-13 17:37:33 +13:00
sepro
05c8023a27 [ie/vk] Improve metadata extraction (#12510)
Closes #12509
Authored by: seproDev
2025-03-07 22:14:38 +01:00
bashonly
bd0a668169 [ie/pinterest] Fix extractor (#12538)
Closes #12529
Authored by: mikf

Co-authored-by: Mike Fährmann <mike_faehrmann@web.de>
2025-03-05 06:38:23 +00:00
bashonly
b8b4754704 [ie/twitter] Fix syndication token generation (#12537)
Fix 14cd7f3443

Authored by: bashonly
2025-03-05 06:22:52 +00:00
u-spec-png
9d70abe4de [ie/N1] Fix extraction of newer articles (#12514)
Authored by: u-spec-png
2025-03-04 01:51:23 +01:00
sepro
8eb9c1bf3b [ie/RTP] Rework extractor (#11638)
Closes #4661, Closes #10393, Closes #11244
Authored by: seproDev, vallovic, red-acid, pferreir, somini

Co-authored-by: vallovic <vallovic@gmail.com>
Co-authored-by: red-acid <161967284+red-acid@users.noreply.github.com>
Co-authored-by: Pedro Ferreira <pedro@dete.st>
Co-authored-by: somini <dev@somini.xyz>
2025-03-04 00:46:18 +01:00
fries1234
42b7440963 [ie/tvw] Add extractor (#12271)
Authored by: fries1234
2025-03-03 23:25:30 +01:00
sepro
172d5fcd77 [ie/MagellanTV] Fix extractor (#12505)
Closes #12498
Authored by: seproDev
2025-03-03 22:55:03 +01:00
Simon Sawicki
7d18fed8f1 [networking] Add keep_header_casing extension (#11652)
Authored by: coletdjnz, Grub4K

Co-authored-by: coletdjnz <coletdjnz@protonmail.com>
2025-03-03 00:10:01 +01:00
coletdjnz
79ec2fdff7 [ie/youtube] Warn on missing formats due to SSAP (#12483)
See https://github.com/yt-dlp/yt-dlp/issues/12482

Authored by: coletdjnz
2025-02-28 19:33:31 +13:00
sepro
3042afb5fe [ie/CultureUnplugged] Extend _VALID_URL (#12486)
Closes #12477
Authored by: seproDev
2025-02-26 19:39:50 +01:00
sepro
ad60137c14 [ie/Dailymotion] Improve embed detection (#12464)
Closes #12453
Authored by: seproDev
2025-02-26 19:36:33 +01:00
4ft35t
0bb3978862 [ie/weibo] Support playlists (#12284)
Closes #12283
Authored by: 4ft35t
2025-02-23 19:16:06 +00:00
XPA
7508e34f20 [ie/niconico] Fix format sorting (#12442)
Authored by: xpadev-net
2025-02-23 19:07:08 +00:00
bashonly
9807181cfb [ie/lbry] Make m3u8 format extraction non-fatal (#12463)
Closes #12459
Authored by: bashonly
2025-02-23 18:24:48 +00:00
bashonly
7126b47260 [ie/lbry] Raise appropriate error for non-media files (#12462)
Closes #12182
Authored by: bashonly
2025-02-23 17:59:22 +00:00
bashonly
eb1417786a [ie/gem.cbc.ca] Fix login support (#12414)
Closes #12406
Authored by: bashonly
2025-02-23 09:56:47 +00:00
bashonly
6933f5670c [ie/playsuisse] Fix login support (#12444)
Closes #12425
Authored by: bashonly
2025-02-23 09:22:51 +00:00
Alexander Seiler
26a502fc72 [ie/azmedien] Fix extractor (#12375)
Authored by: goggle
2025-02-23 09:14:35 +00:00
Ben Faerber
652827d5a0 [ie/softwhiteunderbelly] Add extractor (#12281)
Authored by: benfaerber
2025-02-23 09:11:58 +00:00
Pedro Belo
0e1697232f [ie/globo] Fix subtitles extraction (#12270)
Authored by: pedro
2025-02-23 08:57:27 +00:00
Kenshin9977
9f77e04c76 Fix external downloader availability when using --ffmpeg-location (#12318)
This fix is only applicable to the CLI option

Authored by: Kenshin9977
2025-02-23 08:50:43 +00:00
Simon Sawicki
c034d65548 Fix lazy extractor state (Fix 4445f37a7a) (#12452)
Authored by: coletdjnz, Grub4K, pukkandan
2025-02-23 09:44:27 +01:00
bashonly
480125560a [ie/instagram] Improve error handling (#12410)
Closes #5967, Closes #6294, Closes #7328, Closes #8452
Authored by: bashonly
2025-02-23 08:35:22 +00:00
bashonly
a59abe0636 [ie/instagram] Fix extraction of older private posts (#12451)
Authored by: bashonly
2025-02-23 08:31:00 +00:00
Chris Ellsworth
a90641c836 [ie/instagram] Add app_id extractor-arg (#12359)
Authored by: chrisellsworth
2025-02-23 08:16:04 +00:00
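
A hedged aside (not part of the commit): extractor-args are passed on the command line as IE:ARG=VALUE, so the new option would be used roughly as below; the app ID value and URL are illustrative placeholders.

  # Hypothetical invocation; substitute a real app ID and post URL
  yt-dlp --extractor-args "instagram:app_id=APP_ID" "https://www.instagram.com/p/EXAMPLE/"
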
fireattack
65c3c58c0a [ie/instagram:story] Support --no-playlist (#12397)
Closes #12395
Authored by: fireattack
2025-02-23 07:24:21 +00:00
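
A sketch of the new behavior, assuming an illustrative story URL:

  # With --no-playlist, only the story item targeted by the URL is downloaded
  # instead of the user's full story playlist
  yt-dlp --no-playlist "https://www.instagram.com/stories/USERNAME/EXAMPLE/"
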
bashonly
99ea297875 [ie/tiktok] Improve error handling (#12445)
Closes #8678
Authored by: bashonly
2025-02-23 06:53:13 +00:00
bashonly
6deeda5c11 [ie/soundcloud] Fix thumbnail extraction (#12447)
Closes #11835, Closes #12435
Authored by: bashonly
2025-02-23 06:20:53 +00:00
Refael Ackermann
7f3006eb0c [ie/wsj] Support opinion URLs and impersonation (#12431)
Authored by: refack
2025-02-23 00:40:53 +00:00
coletdjnz
4445f37a7a [core] Load plugins on demand (#11305)
- Adds `--no-plugin-dirs` to disable plugin loading
- `--plugin-dirs` now supports post-processors

Authored by: coletdjnz, Grub4K, pukkandan
2025-02-23 11:00:46 +13:00
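
A minimal sketch of the two flags named above; the plugin directory and URL are illustrative:

  # Load plugins (now including post-processors) from an extra directory
  yt-dlp --plugin-dirs ./my-plugins "https://example.com/video"
  # Disable plugin loading entirely
  yt-dlp --no-plugin-dirs "https://example.com/video"
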
sepro
3a1583ca75 [ie/BunnyCdn] Add extractor (#11586)
Also adds BunnyCdnFD

Authored by: seproDev, Grub4K

Co-authored-by: Simon Sawicki <contact@grub4k.xyz>
2025-02-21 22:39:41 +01:00
Simon Sawicki
a3e0c7d3b2 [test] Show all differences for expect_value and expect_dict (#12334)
Authored by: Grub4K
2025-02-21 21:29:07 +01:00
Simon Sawicki
f7a1f2d813 [core] Support emitting ConEmu progress codes (#10649)
Authored by: Grub4K
2025-02-20 20:33:31 +01:00
bashonly
9deed13d7c [ie/soundcloud] Extract tags (#12420)
Authored by: bashonly
2025-02-20 15:51:08 +00:00
bashonly
c2e6e1d5f7 [ie/niconico:live] Fix thumbnail extraction (#12419)
Closes #12417
Authored by: bashonly
2025-02-20 15:39:06 +00:00
github-actions[bot]
9c3e8b1696 Release 2025.02.19
Created by: bashonly

:ci skip all
2025-02-19 02:42:18 +00:00
bashonly
4985a40417 [cleanup] Misc (#12238)
Authored by: StefanLobbenmeier, dirkf, Grub4K

Co-authored-by: Stefan Lobbenmeier <Stefan.Lobbenmeier@gmail.com>
Co-authored-by: dirkf <fieldhouse@gmx.net>
Co-authored-by: Simon Sawicki <contact@grub4k.xyz>
2025-02-19 02:29:29 +00:00
sepro
01a63629a2 [docs] Add note to supportedsites.md (#12382)
Authored by: seproDev
2025-02-19 02:27:49 +00:00
bashonly
be69468752 [fd/hls] Support --write-pages for m3u8 media playlists (#12333)
Authored by: bashonly
2025-02-19 02:23:42 +00:00
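
For context, a hedged sketch of the option this extends (the URL is an illustrative placeholder):

  # --write-pages dumps fetched intermediary pages to files for debugging;
  # with this change that also covers m3u8 media playlists
  yt-dlp -v --write-pages "https://example.com/hls-stream"
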
bashonly
5271ef48c6 [ie/gem.cbc.ca] Fix extractors (#12404)
Does not fix broken login support

Closes #11848
Authored by: bashonly, dirkf

Co-authored-by: dirkf <fieldhouse@gmx.net>
2025-02-19 02:20:50 +00:00
coletdjnz
d48e612609 [ie/youtube] Retry on more critical requests (#12339)
Authored by: coletdjnz
2025-02-19 00:39:51 +00:00
bashonly
5c4c2ddfaa [ie/francetvinfo.fr] Fix extractor (#12402)
Closes #12366
Authored by: bashonly
2025-02-19 00:28:34 +00:00
bashonly
ec17fb16e8 [ie/youtube] nsig workaround for tce player JS (#12401)
Closes #12398
Authored by: bashonly
2025-02-19 00:24:12 +00:00
bashonly
e7882b682b [ie/3sat] Fix extractor (#12403)
Fix 241ace4f10

Closes #12391
Authored by: bashonly
2025-02-19 00:19:02 +00:00
bashonly
6ca23ffaa4 [ie/reddit] Bypass gated subreddit warning (#12335)
Closes #12331
Authored by: bashonly
2025-02-11 21:32:25 +00:00
Laurent FAVOLE
f53553087d [ie/Digiview] Add extractor (#9902)
Authored by: lfavole
2025-02-11 21:04:20 +01:00
bashonly
4ecb833472 [misc] Clarify that the issue template cannot be removed (#12332)
Fix 517ddf3c3f

Authored by: bashonly
2025-02-11 00:40:21 +00:00
Mozi
2081634474 [test:download] Validate and sort info dict fields (#12299)
Authored by: pzhlkj6612, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-02-10 23:22:21 +00:00
bashonly
c987be0acb [fd/hls] Support hls_media_playlist_data format field (#12322)
Authored by: bashonly
2025-02-10 23:08:10 +00:00
Patrick Robertson
14cd7f3443 [ie/twitter] Fix syndication token generation (#12107)
Authored by: pjrobertson, Grub4K

Co-authored-by: Simon Sawicki <contact@grub4k.xyz>
2025-02-10 19:00:00 +00:00
sepro
4ca8c44a07 [jsinterp] Improve zeroise (#12313)
Authored by: seproDev
2025-02-09 22:37:23 +01:00
Stefan Lobbenmeier
241ace4f10 [ie/zdf] Extract more metadata (#9565)
Closes #9564
Authored by: StefanLobbenmeier
2025-02-09 19:19:28 +00:00
bashonly
1295bbedd4 [ie/francetv:site] Fix livestream extraction (#12316)
Closes #12310
Authored by: bashonly
2025-02-09 02:21:48 +00:00
Julien Valentin
19edaa44fc [ie/generic] Extract live_status for DASH manifest URLs (#12256)
* Also removes the content-type check for dash+xml/mpd.
This was added in cf1f13b817,
but is a no-op since the regex pattern was never changed accordingly.
And it looks like it was unwanted anyways per 28ad7df65d

Closes #12255
Authored by: mp3butcher
2025-02-08 23:28:54 +00:00
entourage8
10b7ff68e9 [fd/hls] Fix BYTERANGE logic (#11972)
Closes #3578, Closes #3810, Closes #9400
Authored by: entourage8
2025-02-08 21:43:12 +00:00
Simon Sawicki
0d9f061d38 [jsinterp] Add js_number_to_string (#12110)
Authored by: Grub4K
2025-02-08 18:48:36 +01:00
sepro
517ddf3c3f [misc] Improve Issue/PR templates (#11499)
Authored by: seproDev
2025-02-08 17:00:38 +01:00
bashonly
03c3d70577 [ie/cwtv:movie] Add extractor (#12227)
Closes #12113
Authored by: bashonly
2025-01-30 19:58:10 +00:00
dove
f8d0161455 [ie/globo] Fix extractor (#11795)
Closes #9512, Closes #11541, Closes #11772
Authored by: slipinthedove, YoshiTabletopGamer

Co-authored-by: YoshiTabletopGamer <88633614+YoshiTabletopGamer@users.noreply.github.com>
2025-01-29 23:55:40 +00:00
alard
d59f14a0a7 [ie/goplay] Fix extractor (#12237)
Authored by: alard
2025-01-29 23:38:36 +00:00
bashonly
817483ccc6 [ie/francetv:site] Fix extractor (#12236)
Closes #12209
Authored by: bashonly
2025-01-29 23:23:29 +00:00
bashonly
861aeec449 [ie/dropbox] Fix extraction (#12228)
Closes #12109
Authored by: bashonly
2025-01-29 16:56:06 +00:00
barsnick
57c717fee4 [ie/acast] Support shows.acast.com URLs (#12223)
Authored by: barsnick
2025-01-28 23:41:02 +00:00
Roland Hieber
9fb8ab2ff6 [ie/pbs] Support www.thirteen.org URLs (#11191)
Authored by: rohieb
2025-01-28 23:38:26 +00:00
arantius
18a28514e3 [ie/cwtv] Fix extractor (#12207)
Closes #12108
Authored by: arantius
2025-01-28 23:26:37 +00:00
152 changed files with 9308 additions and 6533 deletions

View File

@@ -2,13 +2,11 @@ name: Broken site support
 description: Report issue with yt-dlp on a supported site
 labels: [triage, site-bug]
 body:
-  - type: checkboxes
+  - type: markdown
     attributes:
-      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
-      description: Fill all fields even if you think it is irrelevant for the issue
-      options:
-        - label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
-          required: true
+      value: |
+        > [!IMPORTANT]
+        > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
   - type: checkboxes
     id: checklist
     attributes:
@@ -24,9 +22,7 @@ body:
           required: true
         - label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766), [the FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ), and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%3Aissue%20-label%3Aspam%20%20) for similar issues **including closed ones**. DO NOT post duplicates
           required: true
         - label: I've read about [sharing account credentials](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#are-you-willing-to-share-account-details-if-needed) and I'm willing to share it if required
   - type: input
@@ -47,6 +43,8 @@ body:
     id: verbose
     attributes:
       label: Provide verbose output that clearly demonstrates the problem
+      description: |
+        This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
       options:
         - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
           required: true
@@ -78,11 +76,3 @@ body:
       render: shell
     validations:
       required: true
-  - type: markdown
-    attributes:
-      value: |
-        > [!CAUTION]
-        > ### GitHub is experiencing a high volume of malicious spam comments.
-        > ### If you receive any replies asking you download a file, do NOT follow the download links!
-        >
-        > Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

View File

@@ -2,13 +2,11 @@ name: Site support request
 description: Request support for a new site
 labels: [triage, site-request]
 body:
-  - type: checkboxes
+  - type: markdown
     attributes:
-      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
-      description: Fill all fields even if you think it is irrelevant for the issue
-      options:
-        - label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
-          required: true
+      value: |
+        > [!IMPORTANT]
+        > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
   - type: checkboxes
     id: checklist
     attributes:
@@ -24,9 +22,7 @@ body:
           required: true
         - label: I've checked that none of provided URLs [violate any copyrights](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy) or contain any [DRM](https://en.wikipedia.org/wiki/Digital_rights_management) to the best of my knowledge
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%3Aissue%20-label%3Aspam%20%20) for similar requests **including closed ones**. DO NOT post duplicates
           required: true
         - label: I've read about [sharing account credentials](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#are-you-willing-to-share-account-details-if-needed) and am willing to share it if required
   - type: input
@@ -59,6 +55,8 @@ body:
     id: verbose
     attributes:
       label: Provide verbose output that clearly demonstrates the problem
+      description: |
+        This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
       options:
         - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
           required: true
@@ -90,11 +88,3 @@ body:
       render: shell
     validations:
       required: true
-  - type: markdown
-    attributes:
-      value: |
-        > [!CAUTION]
-        > ### GitHub is experiencing a high volume of malicious spam comments.
-        > ### If you receive any replies asking you download a file, do NOT follow the download links!
-        >
-        > Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

View File

@@ -1,14 +1,12 @@
 name: Site feature request
-description: Request a new functionality for a supported site
+description: Request new functionality for a site supported by yt-dlp
 labels: [triage, site-enhancement]
 body:
-  - type: checkboxes
+  - type: markdown
     attributes:
-      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
-      description: Fill all fields even if you think it is irrelevant for the issue
-      options:
-        - label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
-          required: true
+      value: |
+        > [!IMPORTANT]
+        > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
   - type: checkboxes
     id: checklist
     attributes:
@@ -22,9 +20,7 @@ body:
           required: true
         - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%3Aissue%20-label%3Aspam%20%20) for similar requests **including closed ones**. DO NOT post duplicates
           required: true
         - label: I've read about [sharing account credentials](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#are-you-willing-to-share-account-details-if-needed) and I'm willing to share it if required
   - type: input
@@ -55,6 +51,8 @@ body:
     id: verbose
     attributes:
       label: Provide verbose output that clearly demonstrates the problem
+      description: |
+        This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
       options:
         - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
           required: true
@@ -86,11 +84,3 @@ body:
       render: shell
     validations:
       required: true
-  - type: markdown
-    attributes:
-      value: |
-        > [!CAUTION]
-        > ### GitHub is experiencing a high volume of malicious spam comments.
-        > ### If you receive any replies asking you download a file, do NOT follow the download links!
-        >
-        > Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

View File

@@ -2,13 +2,11 @@ name: Core bug report
 description: Report a bug unrelated to any particular site or extractor
 labels: [triage, bug]
 body:
-  - type: checkboxes
+  - type: markdown
     attributes:
-      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
-      description: Fill all fields even if you think it is irrelevant for the issue
-      options:
-        - label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
-          required: true
+      value: |
+        > [!IMPORTANT]
+        > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
   - type: checkboxes
     id: checklist
     attributes:
@@ -20,13 +18,7 @@ body:
           required: true
         - label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels))
           required: true
-        - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
-          required: true
-        - label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
-          required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766), [the FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ), and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%3Aissue%20-label%3Aspam%20%20) for similar issues **including closed ones**. DO NOT post duplicates
           required: true
   - type: textarea
     id: description
@@ -40,6 +32,8 @@ body:
     id: verbose
     attributes:
       label: Provide verbose output that clearly demonstrates the problem
+      description: |
+        This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
       options:
         - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
           required: true
@@ -71,11 +65,3 @@ body:
       render: shell
     validations:
       required: true
-  - type: markdown
-    attributes:
-      value: |
-        > [!CAUTION]
-        > ### GitHub is experiencing a high volume of malicious spam comments.
-        > ### If you receive any replies asking you download a file, do NOT follow the download links!
-        >
-        > Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

View File

@@ -1,14 +1,12 @@
name: Feature request name: Feature request
description: Request a new functionality unrelated to any particular site or extractor description: Request a new feature unrelated to any particular site or extractor
labels: [triage, enhancement] labels: [triage, enhancement]
body: body:
- type: checkboxes - type: markdown
attributes: attributes:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE value: |
description: Fill all fields even if you think it is irrelevant for the issue > [!IMPORTANT]
options: > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
- label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
required: true
- type: checkboxes - type: checkboxes
id: checklist id: checklist
attributes: attributes:
@@ -22,9 +20,7 @@ body:
required: true required: true
- label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels)) - label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels))
required: true required: true
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%3Aissue%20-label%3Aspam%20%20) for similar requests **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true required: true
- type: textarea - type: textarea
id: description id: description
@@ -38,6 +34,8 @@ body:
id: verbose id: verbose
attributes: attributes:
label: Provide verbose output that clearly demonstrates the problem label: Provide verbose output that clearly demonstrates the problem
description: |
This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
options: options:
- label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`) - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
- label: "If using API, add `'verbose': True` to `YoutubeDL` params instead" - label: "If using API, add `'verbose': True` to `YoutubeDL` params instead"
@@ -65,11 +63,3 @@ body:
[youtube] Extracting URL: https://www.youtube.com/watch?v=BaW_jenozKc [youtube] Extracting URL: https://www.youtube.com/watch?v=BaW_jenozKc
<more lines> <more lines>
render: shell render: shell
- type: markdown
attributes:
value: |
> [!CAUTION]
> ### GitHub is experiencing a high volume of malicious spam comments.
> ### If you receive any replies asking you download a file, do NOT follow the download links!
>
> Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

View File

@@ -1,14 +1,12 @@
name: Ask question name: Ask question
description: Ask yt-dlp related question description: Ask a question about using yt-dlp
labels: [question] labels: [question]
body: body:
- type: checkboxes - type: markdown
attributes: attributes:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE value: |
description: Fill all fields even if you think it is irrelevant for the issue > [!IMPORTANT]
options: > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
- label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
required: true
- type: markdown - type: markdown
attributes: attributes:
value: | value: |
@@ -28,9 +26,7 @@ body:
required: true required: true
- label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels)) - label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels))
required: true required: true
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766), [the FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ), and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%3Aissue%20-label%3Aspam%20%20) for similar questions **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true required: true
- type: textarea - type: textarea
id: question id: question
@@ -44,6 +40,8 @@ body:
id: verbose id: verbose
attributes: attributes:
label: Provide verbose output that clearly demonstrates the problem label: Provide verbose output that clearly demonstrates the problem
description: |
This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
options: options:
- label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`) - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
- label: "If using API, add `'verbose': True` to `YoutubeDL` params instead" - label: "If using API, add `'verbose': True` to `YoutubeDL` params instead"
@@ -71,11 +69,3 @@ body:
[youtube] Extracting URL: https://www.youtube.com/watch?v=BaW_jenozKc [youtube] Extracting URL: https://www.youtube.com/watch?v=BaW_jenozKc
<more lines> <more lines>
render: shell render: shell
- type: markdown
attributes:
value: |
> [!CAUTION]
> ### GitHub is experiencing a high volume of malicious spam comments.
> ### If you receive any replies asking you download a file, do NOT follow the download links!
>
> Note that this issue may be temporarily locked as an anti-spam measure after it is opened.

View File

@@ -1,8 +1,5 @@
 blank_issues_enabled: false
 contact_links:
-  - name: Get help from the community on Discord
+  - name: Get help on Discord
     url: https://discord.gg/H5MNcFW63r
-    about: Join the yt-dlp Discord for community-powered support!
-  - name: Matrix Bridge to the Discord server
-    url: https://matrix.to/#/#yt-dlp:matrix.org
-    about: For those who do not want to use Discord
+    about: Join the yt-dlp Discord server for support and discussion

View File

@@ -18,9 +18,7 @@ body:
           required: true
         - label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766), [the FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ), and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue%%20-label%%3Aspam%%20%%20) for similar issues **including closed ones**. DO NOT post duplicates
           required: true
         - label: I've read about [sharing account credentials](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#are-you-willing-to-share-account-details-if-needed) and I'm willing to share it if required
   - type: input

View File

@@ -18,9 +18,7 @@ body:
           required: true
         - label: I've checked that none of provided URLs [violate any copyrights](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy) or contain any [DRM](https://en.wikipedia.org/wiki/Digital_rights_management) to the best of my knowledge
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue%%20-label%%3Aspam%%20%%20) for similar requests **including closed ones**. DO NOT post duplicates
           required: true
         - label: I've read about [sharing account credentials](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#are-you-willing-to-share-account-details-if-needed) and am willing to share it if required
   - type: input

View File

@@ -1,5 +1,5 @@
 name: Site feature request
-description: Request a new functionality for a supported site
+description: Request new functionality for a site supported by yt-dlp
 labels: [triage, site-enhancement]
 body:
 %(no_skip)s
@@ -16,9 +16,7 @@ body:
           required: true
         - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue%%20-label%%3Aspam%%20%%20) for similar requests **including closed ones**. DO NOT post duplicates
           required: true
         - label: I've read about [sharing account credentials](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#are-you-willing-to-share-account-details-if-needed) and I'm willing to share it if required
   - type: input

View File

@@ -14,13 +14,7 @@ body:
           required: true
         - label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels))
           required: true
-        - label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
-          required: true
-        - label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
-          required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766), [the FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ), and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue%%20-label%%3Aspam%%20%%20) for similar issues **including closed ones**. DO NOT post duplicates
           required: true
   - type: textarea
     id: description

View File

@@ -1,5 +1,5 @@
 name: Feature request
-description: Request a new functionality unrelated to any particular site or extractor
+description: Request a new feature unrelated to any particular site or extractor
 labels: [triage, enhancement]
 body:
 %(no_skip)s
@@ -16,9 +16,7 @@ body:
           required: true
         - label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels))
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue%%20-label%%3Aspam%%20%%20) for similar requests **including closed ones**. DO NOT post duplicates
           required: true
   - type: textarea
     id: description

View File

@@ -1,5 +1,5 @@
 name: Ask question
-description: Ask yt-dlp related question
+description: Ask a question about using yt-dlp
 labels: [question]
 body:
 %(no_skip)s
@@ -22,9 +22,7 @@ body:
           required: true
         - label: I've verified that I have **updated yt-dlp to nightly or master** ([update instructions](https://github.com/yt-dlp/yt-dlp#update-channels))
           required: true
-        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
-          required: true
-        - label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
+        - label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766), [the FAQ](https://github.com/yt-dlp/yt-dlp/wiki/FAQ), and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=is%%3Aissue%%20-label%%3Aspam%%20%%20) for similar questions **including closed ones**. DO NOT post duplicates
           required: true
   - type: textarea
     id: question

View File

@@ -1,14 +1,17 @@
**IMPORTANT**: PRs without the template will be CLOSED <!--
**IMPORTANT**: PRs without the template will be CLOSED
Due to the high volume of pull requests, it may be a while before your PR is reviewed.
Please try to keep your pull request focused on a single bugfix or new feature.
Pull requests with a vast scope and/or very large diff will take much longer to review.
It is recommended for new contributors to stick to smaller pull requests, so you can receive much more immediate feedback as you familiarize yourself with the codebase.
PLEASE AVOID FORCE-PUSHING after opening a PR, as it makes reviewing more difficult.
-->
### Description of your *pull request* and other information ### Description of your *pull request* and other information
<!-- ADD DETAILED DESCRIPTION HERE
Explanation of your *pull request* in arbitrary form goes here. Please **make sure the description explains the purpose and effect** of your *pull request* and is worded well enough to be understood. Provide as much **context and examples** as possible
-->
ADD DESCRIPTION HERE
Fixes # Fixes #
@@ -16,24 +19,22 @@ Fixes #
<details open><summary>Template</summary> <!-- OPEN is intentional --> <details open><summary>Template</summary> <!-- OPEN is intentional -->
<!-- <!--
# PLEASE FOLLOW THE GUIDE BELOW
# PLEASE FOLLOW THE GUIDE BELOW - You will be asked some questions, please read them **carefully** and answer honestly
- Put an `x` into all the boxes `[ ]` relevant to your *pull request* (like [x])
- You will be asked some questions, please read them **carefully** and answer honestly - Use *Preview* tab to see what your *pull request* will actually look like
- Put an `x` into all the boxes `[ ]` relevant to your *pull request* (like [x])
- Use *Preview* tab to see how your *pull request* will actually look like
--> -->
### Before submitting a *pull request* make sure you have: ### Before submitting a *pull request* make sure you have:
- [ ] At least skimmed through [contributing guidelines](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#developer-instructions) including [yt-dlp coding conventions](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#yt-dlp-coding-conventions) - [ ] At least skimmed through [contributing guidelines](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#developer-instructions) including [yt-dlp coding conventions](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#yt-dlp-coding-conventions)
- [ ] [Searched](https://github.com/yt-dlp/yt-dlp/search?q=is%3Apr&type=Issues) the bugtracker for similar pull requests - [ ] [Searched](https://github.com/yt-dlp/yt-dlp/search?q=is%3Apr&type=Issues) the bugtracker for similar pull requests
### In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under [Unlicense](http://unlicense.org/). Check all of the following options that apply: ### In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under [Unlicense](http://unlicense.org/). Check those that apply and remove the others:
- [ ] I am the original author of this code and I am willing to release it under [Unlicense](http://unlicense.org/) - [ ] I am the original author of the code in this PR, and I am willing to release it under [Unlicense](http://unlicense.org/)
- [ ] I am not the original author of this code but it is in public domain or released under [Unlicense](http://unlicense.org/) (provide reliable evidence) - [ ] I am not the original author of the code in this PR, but it is in the public domain or released under [Unlicense](http://unlicense.org/) (provide reliable evidence)
### What is the purpose of your *pull request*? ### What is the purpose of your *pull request*? Check those that apply and remove the others:
- [ ] Fix or improvement to an extractor (Make sure to add/update tests) - [ ] Fix or improvement to an extractor (Make sure to add/update tests)
- [ ] New extractor ([Piracy websites will not be accepted](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy)) - [ ] New extractor ([Piracy websites will not be accepted](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy))
- [ ] Core bug fix/improvement - [ ] Core bug fix/improvement

View File

@@ -33,7 +33,7 @@ jobs:
 
     # Initializes the CodeQL tools for scanning.
     - name: Initialize CodeQL
-      uses: github/codeql-action/init@v2
+      uses: github/codeql-action/init@v3
       with:
         languages: ${{ matrix.language }}
         # If you wish to specify custom queries, you can do so here or in a config file.
@@ -47,7 +47,7 @@ jobs:
     # Autobuild attempts to build any compiled languages (C/C++, C#, Go, Java, or Swift).
     # If this step fails, then you should remove it and run the build manually (see below)
    - name: Autobuild
-      uses: github/codeql-action/autobuild@v2
+      uses: github/codeql-action/autobuild@v3
 
     # Command-line programs to run using the OS shell.
     # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
@@ -60,6 +60,6 @@ jobs:
     #   ./location_of_script_within_repo/buildscript.sh
 
     - name: Perform CodeQL Analysis
-      uses: github/codeql-action/analyze@v2
+      uses: github/codeql-action/analyze@v3
       with:
         category: "/language:${{matrix.language}}"

View File

@@ -736,3 +736,25 @@ NecroRomnt
 pjrobertson
 subsense
 test20140
+arantius
+entourage8
+lfavole
+mp3butcher
+slipinthedove
+YoshiTabletopGamer
+Arc8ne
+benfaerber
+chrisellsworth
+fries1234
+Kenshin9977
+MichaelDeBoey
+msikma
+pedro
+pferreir
+red-acid
+refack
+rysson
+somini
+thedenv
+vallovic
+arabcoders

View File

@@ -4,6 +4,156 @@
 # To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
 -->
+### 2025.03.26
+#### Extractor changes
+- **youtube**
+    - [Fix signature and nsig extraction for player `4fcd6e4a`](https://github.com/yt-dlp/yt-dlp/commit/a550dfc904a02843a26369ae50dbb7c0febfb30e) ([#12748](https://github.com/yt-dlp/yt-dlp/issues/12748)) by [seproDev](https://github.com/seproDev)
+    - [Only cache nsig code on successful decoding](https://github.com/yt-dlp/yt-dlp/commit/ecee97b4fa90d51c48f9154c3a6d5a8ffe46cd5c) ([#12750](https://github.com/yt-dlp/yt-dlp/issues/12750)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
+### 2025.03.25
+#### Core changes
+- [Fix attribute error on failed VT init](https://github.com/yt-dlp/yt-dlp/commit/b872ffec50fd50f790a5a490e006a369a28a3df3) ([#12696](https://github.com/yt-dlp/yt-dlp/issues/12696)) by [Grub4K](https://github.com/Grub4K)
+- **utils**: `js_to_json`: [Make function less fatal](https://github.com/yt-dlp/yt-dlp/commit/9491b44032b330e05bd5eaa546187005d1e8538e) ([#12715](https://github.com/yt-dlp/yt-dlp/issues/12715)) by [seproDev](https://github.com/seproDev)
+#### Extractor changes
+- [Fix sorting of HLS audio formats by `GROUP-ID`](https://github.com/yt-dlp/yt-dlp/commit/86ab79e1a5182092321102adf6ca34195803b878) ([#12714](https://github.com/yt-dlp/yt-dlp/issues/12714)) by [bashonly](https://github.com/bashonly)
+- **17live**: vod: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/3396eb50dcd245b49c0f4aecd6e80ec914095d16) ([#12723](https://github.com/yt-dlp/yt-dlp/issues/12723)) by [subrat-lima](https://github.com/subrat-lima)
+- **9now.com.au**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/9d5e6de2e7a47226d1f72c713ad45c88ba01db68) ([#12702](https://github.com/yt-dlp/yt-dlp/issues/12702)) by [bashonly](https://github.com/bashonly)
+- **chzzk**: video: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/e2dfccaf808b406d5bcb7dd04ae9ce420752dd6f) ([#12692](https://github.com/yt-dlp/yt-dlp/issues/12692)) by [bashonly](https://github.com/bashonly), [dirkf](https://github.com/dirkf)
+- **deezer**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/be5af3f9e91747768c2b41157851bfbe14c663f7) ([#12704](https://github.com/yt-dlp/yt-dlp/issues/12704)) by [seproDev](https://github.com/seproDev)
+- **generic**: [Fix MPD base URL parsing](https://github.com/yt-dlp/yt-dlp/commit/5086d4aed6aeb3908c62f49e2d8f74cc0cb05110) ([#12718](https://github.com/yt-dlp/yt-dlp/issues/12718)) by [fireattack](https://github.com/fireattack)
+- **streaks**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/801afeac91f97dc0b58cd39cc7e8c50f619dc4e1) ([#12679](https://github.com/yt-dlp/yt-dlp/issues/12679)) by [doe1080](https://github.com/doe1080)
+- **tver**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/66e0bab814e4a52ef3e12d81123ad992a29df50e) ([#12659](https://github.com/yt-dlp/yt-dlp/issues/12659)) by [arabcoders](https://github.com/arabcoders), [bashonly](https://github.com/bashonly)
+- **viki**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/fe4f14b8369038e7c58f7de546d76de1ce3a91ce) ([#12703](https://github.com/yt-dlp/yt-dlp/issues/12703)) by [seproDev](https://github.com/seproDev)
+- **vrsquare**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/b7fbb5a0a16a8e8d3e29c29e26ebed677d0d6ea3) ([#12515](https://github.com/yt-dlp/yt-dlp/issues/12515)) by [doe1080](https://github.com/doe1080)
+- **youtube**
+    - [Fix PhantomJS nsig fallback](https://github.com/yt-dlp/yt-dlp/commit/4054a2b623bd1e277b49d2e9abc3d112a4b1c7be) ([#12728](https://github.com/yt-dlp/yt-dlp/issues/12728)) by [bashonly](https://github.com/bashonly)
+    - [Fix signature and nsig extraction for player `363db69b`](https://github.com/yt-dlp/yt-dlp/commit/b9c979461b244713bf42691a5bc02834e2ba4b2c) ([#12725](https://github.com/yt-dlp/yt-dlp/issues/12725)) by [bashonly](https://github.com/bashonly)
+#### Networking changes
+- **Request Handler**: curl_cffi: [Support `curl_cffi` 0.10.x](https://github.com/yt-dlp/yt-dlp/commit/9bf23902ceb948b9685ce1dab575491571720fc6) ([#12670](https://github.com/yt-dlp/yt-dlp/issues/12670)) by [Grub4K](https://github.com/Grub4K)
+#### Misc. changes
+- **cleanup**: Miscellaneous: [9dde546](https://github.com/yt-dlp/yt-dlp/commit/9dde546e7ee3e1515d88ee3af08b099351455dc0) by [seproDev](https://github.com/seproDev)
+### 2025.03.21
+#### Core changes
+- [Fix external downloader availability when using `--ffmpeg-location`](https://github.com/yt-dlp/yt-dlp/commit/9f77e04c76e36e1cbbf49bc9eb385fa6ef804b67) ([#12318](https://github.com/yt-dlp/yt-dlp/issues/12318)) by [Kenshin9977](https://github.com/Kenshin9977)
+- [Load plugins on demand](https://github.com/yt-dlp/yt-dlp/commit/4445f37a7a66b248dbd8376c43137e6e441f138e) ([#11305](https://github.com/yt-dlp/yt-dlp/issues/11305)) by [coletdjnz](https://github.com/coletdjnz), [Grub4K](https://github.com/Grub4K), [pukkandan](https://github.com/pukkandan) (With fixes in [c034d65](https://github.com/yt-dlp/yt-dlp/commit/c034d655487be668222ef9476a16f374584e49a7))
+- [Support emitting ConEmu progress codes](https://github.com/yt-dlp/yt-dlp/commit/f7a1f2d8132967a62b0f6d5665c6d2dde2d42c09) ([#10649](https://github.com/yt-dlp/yt-dlp/issues/10649)) by [Grub4K](https://github.com/Grub4K)
+#### Extractor changes
+- **azmedien**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/26a502fc727d0e91b2db6bf4a112823bcc672e85) ([#12375](https://github.com/yt-dlp/yt-dlp/issues/12375)) by [goggle](https://github.com/goggle)
+- **bilibiliplaylist**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f5fb2229e66cf59d5bf16065bc041b42a28354a0) ([#12690](https://github.com/yt-dlp/yt-dlp/issues/12690)) by [bashonly](https://github.com/bashonly)
+- **bunnycdn**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/3a1583ca75fb523cbad0e5e174387ea7b477d175) ([#11586](https://github.com/yt-dlp/yt-dlp/issues/11586)) by [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
+- **canalsurmas**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/01a8be4c23f186329d85f9c78db34a55f3294ac5) ([#12497](https://github.com/yt-dlp/yt-dlp/issues/12497)) by [Arc8ne](https://github.com/Arc8ne)
+- **cda**: [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/be0d819e1103195043f6743650781f0d4d343f6d) ([#12552](https://github.com/yt-dlp/yt-dlp/issues/12552)) by [rysson](https://github.com/rysson)
+- **cultureunplugged**: [Extend `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/3042afb5fe342d3a00de76704cd7de611acc350e) ([#12486](https://github.com/yt-dlp/yt-dlp/issues/12486)) by [seproDev](https://github.com/seproDev)
+- **dailymotion**: [Improve embed detection](https://github.com/yt-dlp/yt-dlp/commit/ad60137c141efa5023fbc0ac8579eaefe8b3d8cc) ([#12464](https://github.com/yt-dlp/yt-dlp/issues/12464)) by [seproDev](https://github.com/seproDev)
+- **gem.cbc.ca**: [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/eb1417786a3027b1e7290ec37ef6aaece50ebed0) ([#12414](https://github.com/yt-dlp/yt-dlp/issues/12414)) by [bashonly](https://github.com/bashonly)
+- **globo**: [Fix subtitles extraction](https://github.com/yt-dlp/yt-dlp/commit/0e1697232fcbba7551f983fd1ba93bb445cbb08b) ([#12270](https://github.com/yt-dlp/yt-dlp/issues/12270)) by [pedro](https://github.com/pedro)
+- **instagram**
+    - [Add `app_id` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/a90641c8363fa0c10800b36eb6b01ee22d3a9409) ([#12359](https://github.com/yt-dlp/yt-dlp/issues/12359)) by [chrisellsworth](https://github.com/chrisellsworth)
+    - [Fix extraction of older private posts](https://github.com/yt-dlp/yt-dlp/commit/a59abe0636dc49b22a67246afe35613571b86f05) ([#12451](https://github.com/yt-dlp/yt-dlp/issues/12451)) by [bashonly](https://github.com/bashonly)
+    - [Improve error handling](https://github.com/yt-dlp/yt-dlp/commit/480125560a3b9972d29ae0da850aba8109e6bd41) ([#12410](https://github.com/yt-dlp/yt-dlp/issues/12410)) by [bashonly](https://github.com/bashonly)
+    - story: [Support `--no-playlist`](https://github.com/yt-dlp/yt-dlp/commit/65c3c58c0a67463a150920203cec929045c95a24) ([#12397](https://github.com/yt-dlp/yt-dlp/issues/12397)) by [fireattack](https://github.com/fireattack)
+- **jamendo**: [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/89a68c4857ddbaf937ff22f12648baaf6b5af840) ([#12622](https://github.com/yt-dlp/yt-dlp/issues/12622)) by [bashonly](https://github.com/bashonly), [JChris246](https://github.com/JChris246)
- **ketnet**: [Remove extractor](https://github.com/yt-dlp/yt-dlp/commit/bbada3ec0779422cde34f1ce3dcf595da463b493) ([#12628](https://github.com/yt-dlp/yt-dlp/issues/12628)) by [MichaelDeBoey](https://github.com/MichaelDeBoey)
- **lbry**
- [Make m3u8 format extraction non-fatal](https://github.com/yt-dlp/yt-dlp/commit/9807181cfbf87bfa732f415c30412bdbd77cbf81) ([#12463](https://github.com/yt-dlp/yt-dlp/issues/12463)) by [bashonly](https://github.com/bashonly)
- [Raise appropriate error for non-media files](https://github.com/yt-dlp/yt-dlp/commit/7126b472601814b7fd8c9de02069e8fff1764891) ([#12462](https://github.com/yt-dlp/yt-dlp/issues/12462)) by [bashonly](https://github.com/bashonly)
- **loco**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/983095485c731240aae27c950cb8c24a50827b56) ([#12667](https://github.com/yt-dlp/yt-dlp/issues/12667)) by [DTrombett](https://github.com/DTrombett)
- **magellantv**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/172d5fcd778bf2605db7647ebc56b29ed18d24ac) ([#12505](https://github.com/yt-dlp/yt-dlp/issues/12505)) by [seproDev](https://github.com/seproDev)
- **mitele**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/7223d29569a48a35ad132a508c115973866838d3) ([#12689](https://github.com/yt-dlp/yt-dlp/issues/12689)) by [bashonly](https://github.com/bashonly)
- **msn**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/4815dac131d42c51e12c1d05232db0bbbf607329) ([#12513](https://github.com/yt-dlp/yt-dlp/issues/12513)) by [seproDev](https://github.com/seproDev), [thedenv](https://github.com/thedenv)
- **n1**: [Fix extraction of newer articles](https://github.com/yt-dlp/yt-dlp/commit/9d70abe4de401175cbbaaa36017806f16b2df9af) ([#12514](https://github.com/yt-dlp/yt-dlp/issues/12514)) by [u-spec-png](https://github.com/u-spec-png)
- **nbcstations**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/ebac65aa9e0bf9a97c24d00f7977900d2577364b) ([#12534](https://github.com/yt-dlp/yt-dlp/issues/12534)) by [refack](https://github.com/refack)
- **niconico**
- [Fix format sorting](https://github.com/yt-dlp/yt-dlp/commit/7508e34f203e97389f1d04db92140b13401dd724) ([#12442](https://github.com/yt-dlp/yt-dlp/issues/12442)) by [xpadev-net](https://github.com/xpadev-net)
- live: [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/c2e6e1d5f77f3b720a6266f2869eb750d20e5dc1) ([#12419](https://github.com/yt-dlp/yt-dlp/issues/12419)) by [bashonly](https://github.com/bashonly)
- **openrec**: [Fix `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/17504f253564cfad86244de2b6346d07d2300ca5) ([#12608](https://github.com/yt-dlp/yt-dlp/issues/12608)) by [fireattack](https://github.com/fireattack)
- **pinterest**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/bd0a66816934de70312eea1e71c59c13b401dc3a) ([#12538](https://github.com/yt-dlp/yt-dlp/issues/12538)) by [mikf](https://github.com/mikf)
- **playsuisse**: [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/6933f5670cea9c3e2fb16c1caa1eda54d13122c5) ([#12444](https://github.com/yt-dlp/yt-dlp/issues/12444)) by [bashonly](https://github.com/bashonly)
- **reddit**: [Truncate title](https://github.com/yt-dlp/yt-dlp/commit/d9a53cc1e6fd912daf500ca4f19e9ca88994dbf9) ([#12567](https://github.com/yt-dlp/yt-dlp/issues/12567)) by [seproDev](https://github.com/seproDev)
- **rtp**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/8eb9c1bf3b9908cca22ef043602aa24fb9f352c6) ([#11638](https://github.com/yt-dlp/yt-dlp/issues/11638)) by [pferreir](https://github.com/pferreir), [red-acid](https://github.com/red-acid), [seproDev](https://github.com/seproDev), [somini](https://github.com/somini), [vallovic](https://github.com/vallovic)
- **softwhiteunderbelly**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/652827d5a076c9483c36654ad2cf3fe46219baf4) ([#12281](https://github.com/yt-dlp/yt-dlp/issues/12281)) by [benfaerber](https://github.com/benfaerber)
- **soop**: [Fix timestamp extraction](https://github.com/yt-dlp/yt-dlp/commit/8305df00012ff8138a6ff95279d06b54ac607f63) ([#12609](https://github.com/yt-dlp/yt-dlp/issues/12609)) by [msikma](https://github.com/msikma)
- **soundcloud**
- [Extract tags](https://github.com/yt-dlp/yt-dlp/commit/9deed13d7cce6d3647379e50589c92de89227509) ([#12420](https://github.com/yt-dlp/yt-dlp/issues/12420)) by [bashonly](https://github.com/bashonly)
- [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/6deeda5c11f34f613724fa0627879f0d607ba1b4) ([#12447](https://github.com/yt-dlp/yt-dlp/issues/12447)) by [bashonly](https://github.com/bashonly)
- **tiktok**
- [Improve error handling](https://github.com/yt-dlp/yt-dlp/commit/99ea2978757a431eeb2a265b3395ccbe4ce202cf) ([#12445](https://github.com/yt-dlp/yt-dlp/issues/12445)) by [bashonly](https://github.com/bashonly)
- [Truncate title](https://github.com/yt-dlp/yt-dlp/commit/83b119dadb0f267f1fb66bf7ed74c097349de79e) ([#12566](https://github.com/yt-dlp/yt-dlp/issues/12566)) by [seproDev](https://github.com/seproDev)
- **tv8.it**: [Add live and playlist extractors](https://github.com/yt-dlp/yt-dlp/commit/2ee3a0aff9be2be3bea60640d3d8a0febaf0acb6) ([#12569](https://github.com/yt-dlp/yt-dlp/issues/12569)) by [DTrombett](https://github.com/DTrombett)
- **tvw**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/42b7440963866e31ff84a5b89030d1c596fa2e6e) ([#12271](https://github.com/yt-dlp/yt-dlp/issues/12271)) by [fries1234](https://github.com/fries1234)
- **twitter**
- [Fix syndication token generation](https://github.com/yt-dlp/yt-dlp/commit/b8b47547049f5ebc3dd680fc7de70ed0ca9c0d70) ([#12537](https://github.com/yt-dlp/yt-dlp/issues/12537)) by [bashonly](https://github.com/bashonly)
- [Truncate title](https://github.com/yt-dlp/yt-dlp/commit/06f6de78db2eceeabd062ab1a3023e0ff9d4df53) ([#12560](https://github.com/yt-dlp/yt-dlp/issues/12560)) by [seproDev](https://github.com/seproDev)
- **vk**: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/05c8023a27dd37c49163c0498bf98e3e3c1cb4b9) ([#12510](https://github.com/yt-dlp/yt-dlp/issues/12510)) by [seproDev](https://github.com/seproDev)
- **vrtmax**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/df9ebeec00d658693252978d1ffb885e67aa6ab6) ([#12479](https://github.com/yt-dlp/yt-dlp/issues/12479)) by [bergoid](https://github.com/bergoid), [MichaelDeBoey](https://github.com/MichaelDeBoey), [seproDev](https://github.com/seproDev)
- **weibo**: [Support playlists](https://github.com/yt-dlp/yt-dlp/commit/0bb39788626002a8a67e925580227952c563c8b9) ([#12284](https://github.com/yt-dlp/yt-dlp/issues/12284)) by [4ft35t](https://github.com/4ft35t)
- **wsj**: [Support opinion URLs and impersonation](https://github.com/yt-dlp/yt-dlp/commit/7f3006eb0c0659982bb956d71b0bc806bcb0a5f2) ([#12431](https://github.com/yt-dlp/yt-dlp/issues/12431)) by [refack](https://github.com/refack)
- **youtube**
- [Fix nsig and signature extraction for player `643afba4`](https://github.com/yt-dlp/yt-dlp/commit/9b868518a15599f3d7ef5a1c730dda164c30da9b) ([#12684](https://github.com/yt-dlp/yt-dlp/issues/12684)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
- [Player client maintenance](https://github.com/yt-dlp/yt-dlp/commit/3380febe9984c21c79c3147c1d390a4cf339bc4c) ([#12603](https://github.com/yt-dlp/yt-dlp/issues/12603)) by [seproDev](https://github.com/seproDev)
- [Split into package](https://github.com/yt-dlp/yt-dlp/commit/4432a9390c79253ac830702b226d2e558b636725) ([#12557](https://github.com/yt-dlp/yt-dlp/issues/12557)) by [coletdjnz](https://github.com/coletdjnz)
- [Warn on DRM formats](https://github.com/yt-dlp/yt-dlp/commit/e67d786c7cc87bd449d22e0ddef08306891c1173) ([#12593](https://github.com/yt-dlp/yt-dlp/issues/12593)) by [coletdjnz](https://github.com/coletdjnz)
- [Warn on missing formats due to SSAP](https://github.com/yt-dlp/yt-dlp/commit/79ec2fdff75c8c1bb89b550266849ad4dec48dd3) ([#12483](https://github.com/yt-dlp/yt-dlp/issues/12483)) by [coletdjnz](https://github.com/coletdjnz)
#### Networking changes
- [Add `keep_header_casing` extension](https://github.com/yt-dlp/yt-dlp/commit/7d18fed8f1983fe6de4ddc810dfb2761ba5744ac) ([#11652](https://github.com/yt-dlp/yt-dlp/issues/11652)) by [coletdjnz](https://github.com/coletdjnz), [Grub4K](https://github.com/Grub4K)
- [Always add unsupported suffix on version mismatch](https://github.com/yt-dlp/yt-dlp/commit/95f8df2f796d0048119615200758199aedcd7cf4) ([#12626](https://github.com/yt-dlp/yt-dlp/issues/12626)) by [Grub4K](https://github.com/Grub4K)
#### Misc. changes
- **cleanup**: Miscellaneous: [f36e4b6](https://github.com/yt-dlp/yt-dlp/commit/f36e4b6e65cb8403791aae2f520697115cb88dec) by [dirkf](https://github.com/dirkf), [gamer191](https://github.com/gamer191), [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
- **test**: [Show all differences for `expect_value` and `expect_dict`](https://github.com/yt-dlp/yt-dlp/commit/a3e0c7d3b267abdf3933b709704a28d43bb46503) ([#12334](https://github.com/yt-dlp/yt-dlp/issues/12334)) by [Grub4K](https://github.com/Grub4K)
### 2025.02.19
#### Core changes
- **jsinterp**
- [Add `js_number_to_string`](https://github.com/yt-dlp/yt-dlp/commit/0d9f061d38c3a4da61972e2adad317079f2f1c84) ([#12110](https://github.com/yt-dlp/yt-dlp/issues/12110)) by [Grub4K](https://github.com/Grub4K)
- [Improve zeroise](https://github.com/yt-dlp/yt-dlp/commit/4ca8c44a073d5aa3a3e3112c35b2b23d6ce25ac6) ([#12313](https://github.com/yt-dlp/yt-dlp/issues/12313)) by [seproDev](https://github.com/seproDev)
#### Extractor changes
- **acast**: [Support shows.acast.com URLs](https://github.com/yt-dlp/yt-dlp/commit/57c717fee4bfbc9309845bbb48901b72e4b69304) ([#12223](https://github.com/yt-dlp/yt-dlp/issues/12223)) by [barsnick](https://github.com/barsnick)
- **cwtv**
- [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/18a28514e306e822eab4f3a79c76d515bf076406) ([#12207](https://github.com/yt-dlp/yt-dlp/issues/12207)) by [arantius](https://github.com/arantius)
- movie: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/03c3d705778c07739e0034b51490877cffdc0983) ([#12227](https://github.com/yt-dlp/yt-dlp/issues/12227)) by [bashonly](https://github.com/bashonly)
- **digiview**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/f53553087d3fde9dcd61d6e9f98caf09db1d8ef2) ([#9902](https://github.com/yt-dlp/yt-dlp/issues/9902)) by [lfavole](https://github.com/lfavole)
- **dropbox**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/861aeec449c8f3c062d962945b234ff0341f61f3) ([#12228](https://github.com/yt-dlp/yt-dlp/issues/12228)) by [bashonly](https://github.com/bashonly)
- **francetv**
- site
- [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/817483ccc68aed6049ed9c4a2ffae44ca82d2b1c) ([#12236](https://github.com/yt-dlp/yt-dlp/issues/12236)) by [bashonly](https://github.com/bashonly)
- [Fix livestream extraction](https://github.com/yt-dlp/yt-dlp/commit/1295bbedd45fa8d9bc3f7a194864ae280297848e) ([#12316](https://github.com/yt-dlp/yt-dlp/issues/12316)) by [bashonly](https://github.com/bashonly)
- **francetvinfo.fr**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/5c4c2ddfaa47988b4d50c1ad4988badc0b4f30c2) ([#12402](https://github.com/yt-dlp/yt-dlp/issues/12402)) by [bashonly](https://github.com/bashonly)
- **gem.cbc.ca**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/5271ef48c6f61c145e03e18e960995d2e651d205) ([#12404](https://github.com/yt-dlp/yt-dlp/issues/12404)) by [bashonly](https://github.com/bashonly), [dirkf](https://github.com/dirkf)
- **generic**: [Extract `live_status` for DASH manifest URLs](https://github.com/yt-dlp/yt-dlp/commit/19edaa44fcd375f54e63d6227b092f5252d3e889) ([#12256](https://github.com/yt-dlp/yt-dlp/issues/12256)) by [mp3butcher](https://github.com/mp3butcher)
- **globo**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f8d0161455f00add65585ca1a476a7b5d56f5f96) ([#11795](https://github.com/yt-dlp/yt-dlp/issues/11795)) by [slipinthedove](https://github.com/slipinthedove), [YoshiTabletopGamer](https://github.com/YoshiTabletopGamer)
- **goplay**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/d59f14a0a7a8b55e6bf468237def62b73ab4a517) ([#12237](https://github.com/yt-dlp/yt-dlp/issues/12237)) by [alard](https://github.com/alard)
- **pbs**: [Support www.thirteen.org URLs](https://github.com/yt-dlp/yt-dlp/commit/9fb8ab2ff67fb699f60cce09163a580976e90c0e) ([#11191](https://github.com/yt-dlp/yt-dlp/issues/11191)) by [rohieb](https://github.com/rohieb)
- **reddit**: [Bypass gated subreddit warning](https://github.com/yt-dlp/yt-dlp/commit/6ca23ffaa4663cb552f937f0b1e9769b66db11bd) ([#12335](https://github.com/yt-dlp/yt-dlp/issues/12335)) by [bashonly](https://github.com/bashonly)
- **twitter**: [Fix syndication token generation](https://github.com/yt-dlp/yt-dlp/commit/14cd7f3443c6da4d49edaefcc12da9dee86e243e) ([#12107](https://github.com/yt-dlp/yt-dlp/issues/12107)) by [Grub4K](https://github.com/Grub4K), [pjrobertson](https://github.com/pjrobertson)
- **youtube**
- [Retry on more critical requests](https://github.com/yt-dlp/yt-dlp/commit/d48e612609d012abbea3785be4d26d78a014abb2) ([#12339](https://github.com/yt-dlp/yt-dlp/issues/12339)) by [coletdjnz](https://github.com/coletdjnz)
- [nsig workaround for `tce` player JS](https://github.com/yt-dlp/yt-dlp/commit/ec17fb16e8d69d4e3e10fb73bf3221be8570dfee) ([#12401](https://github.com/yt-dlp/yt-dlp/issues/12401)) by [bashonly](https://github.com/bashonly)
- **zdf**: [Extract more metadata](https://github.com/yt-dlp/yt-dlp/commit/241ace4f104d50fdf7638f9203927aefcf57a1f7) ([#9565](https://github.com/yt-dlp/yt-dlp/issues/9565)) by [StefanLobbenmeier](https://github.com/StefanLobbenmeier) (With fixes in [e7882b6](https://github.com/yt-dlp/yt-dlp/commit/e7882b682b959e476d8454911655b3e9b14c79b2) by [bashonly](https://github.com/bashonly))
#### Downloader changes
- **hls**
- [Fix `BYTERANGE` logic](https://github.com/yt-dlp/yt-dlp/commit/10b7ff68e98f17655e31952f6e17120b2d7dda96) ([#11972](https://github.com/yt-dlp/yt-dlp/issues/11972)) by [entourage8](https://github.com/entourage8)
- [Support `--write-pages` for m3u8 media playlists](https://github.com/yt-dlp/yt-dlp/commit/be69468752ff598cacee57bb80533deab2367a5d) ([#12333](https://github.com/yt-dlp/yt-dlp/issues/12333)) by [bashonly](https://github.com/bashonly)
- [Support `hls_media_playlist_data` format field](https://github.com/yt-dlp/yt-dlp/commit/c987be0acb6872c6561f28aa28171e803393d851) ([#12322](https://github.com/yt-dlp/yt-dlp/issues/12322)) by [bashonly](https://github.com/bashonly)
#### Misc. changes
- [Improve Issue/PR templates](https://github.com/yt-dlp/yt-dlp/commit/517ddf3c3f12560ab93e3d36244dc82db9f97818) ([#11499](https://github.com/yt-dlp/yt-dlp/issues/11499)) by [seproDev](https://github.com/seproDev) (With fixes in [4ecb833](https://github.com/yt-dlp/yt-dlp/commit/4ecb833472c90e078567b561fb7c089f1aa9587b) by [bashonly](https://github.com/bashonly))
- **cleanup**: Miscellaneous: [4985a40](https://github.com/yt-dlp/yt-dlp/commit/4985a4041770eaa0016271809a1fd950dc809a55) by [dirkf](https://github.com/dirkf), [Grub4K](https://github.com/Grub4K), [StefanLobbenmeier](https://github.com/StefanLobbenmeier)
- **docs**: [Add note to `supportedsites.md`](https://github.com/yt-dlp/yt-dlp/commit/01a63629a21781458dcbd38779898e117678f5ff) ([#12382](https://github.com/yt-dlp/yt-dlp/issues/12382)) by [seproDev](https://github.com/seproDev)
- **test**: download: [Validate and sort info dict fields](https://github.com/yt-dlp/yt-dlp/commit/208163447408c78673b08c172beafe5c310fb167) ([#12299](https://github.com/yt-dlp/yt-dlp/issues/12299)) by [bashonly](https://github.com/bashonly), [pzhlkj6612](https://github.com/pzhlkj6612)
### 2025.01.26
#### Core changes

View File

@@ -6,7 +6,6 @@
 [![Release version](https://img.shields.io/github/v/release/yt-dlp/yt-dlp?color=brightgreen&label=Download&style=for-the-badge)](#installation "Installation")
 [![PyPI](https://img.shields.io/badge/-PyPI-blue.svg?logo=pypi&labelColor=555555&style=for-the-badge)](https://pypi.org/project/yt-dlp "PyPI")
 [![Donate](https://img.shields.io/badge/_-Donate-red.svg?logo=githubsponsors&labelColor=555555&style=for-the-badge)](Collaborators.md#collaborators "Donate")
-[![Matrix](https://img.shields.io/matrix/yt-dlp:matrix.org?color=brightgreen&labelColor=555555&label=&logo=element&style=for-the-badge)](https://matrix.to/#/#yt-dlp:matrix.org "Matrix")
 [![Discord](https://img.shields.io/discord/807245652072857610?color=blue&labelColor=555555&label=&logo=discord&style=for-the-badge)](https://discord.gg/H5MNcFW63r "Discord")
 [![Supported Sites](https://img.shields.io/badge/-Supported_Sites-brightgreen.svg?style=for-the-badge)](supportedsites.md "Supported Sites")
 [![License: Unlicense](https://img.shields.io/badge/-Unlicense-blue.svg?style=for-the-badge)](LICENSE "License")
@@ -338,10 +337,11 @@ If you fork the project on GitHub, you can run your fork's [build workflow](.git
     --plugin-dirs PATH              Path to an additional directory to search
                                     for plugins. This option can be used
                                     multiple times to add multiple directories.
-                                    Note that this currently only works for
-                                    extractor plugins; postprocessor plugins can
-                                    only be loaded from the default plugin
-                                    directories
+                                    Use "default" to search the default plugin
+                                    directories (default)
+    --no-plugin-dirs                Clear plugin directories to search,
+                                    including defaults and those provided by
+                                    previous --plugin-dirs
     --flat-playlist                 Do not extract a playlist's URL result
                                     entries; some entry metadata may be missing
                                     and downloading may be bypassed
@@ -1526,7 +1526,7 @@ The available fields are:
 - `hasvid`: Gives priority to formats that have a video stream
 - `hasaud`: Gives priority to formats that have an audio stream
 - `ie_pref`: The format preference
-- `lang`: The language preference
+- `lang`: The language preference as determined by the extractor (e.g. original language preferred over audio description)
 - `quality`: The quality of the format
 - `source`: The preference of the source
 - `proto`: Protocol used for download (`https`/`ftps` > `http`/`ftp` > `m3u8_native`/`m3u8` > `http_dash_segments`> `websocket_frag` > `mms`/`rtsp` > `f4f`/`f4m`)
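
These sort fields map directly onto the `format_sort` option of the Python API. A minimal sketch (the URL is a placeholder), roughly equivalent to passing `-S lang,quality` on the command line:

```python
import yt_dlp

# Prefer the extractor-determined language preference, then higher quality;
# roughly equivalent to `-S lang,quality` on the command line
opts = {'format_sort': ['lang', 'quality']}
with yt_dlp.YoutubeDL(opts) as ydl:
    info = ydl.extract_info('https://example.com/watch/123', download=False)
    print(info.get('format_id'))
```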
@@ -1769,7 +1769,7 @@ The following extractors use this feature:
 #### youtube
 * `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
 * `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
-* `player_client`: Clients to extract video data from. The main clients are `web`, `ios` and `android`, with variants `_music` and `_creator` (e.g. `ios_creator`); and `mweb`, `android_vr`, `web_safari`, `web_embedded`, `tv` and `tv_embedded` with no variants. By default, `tv,ios,web` is used, or `tv,web` is used when authenticating with cookies. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as the `_creator` variants, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
+* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_vr`, `tv` and `tv_embedded`. By default, `tv,ios,web` is used, or `tv,web` is used when authenticating with cookies. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
 * `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details
 * `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
 * `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
@@ -1812,6 +1812,9 @@ The following extractors use this feature:
 * `vcodec`: vcodec to ignore - one or more of `h264`, `h265`, `dvh265`
 * `dr`: dynamic range to ignore - one or more of `sdr`, `hdr10`, `dv`
 
+#### instagram
+* `app_id`: The value of the `X-IG-App-ID` header used for API requests. Default is the web app ID, `936619743392459`
+
 #### niconicochannelplus
 * `max_comments`: Maximum number of comments to extract - default is `120`
@@ -1863,6 +1866,9 @@ The following extractors use this feature:
 #### sonylivseries
 * `sort_order`: Episode sort order for series extraction - one of `asc` (ascending, oldest first) or `desc` (descending, newest first). Default is `asc`
 
+#### tver
+* `backend`: Backend API to use for extraction - one of `streaks` (default) or `brightcove` (deprecated)
+
 **Note**: These options may be changed/removed in the future without concern for backward compatibility
 
 <!-- MANPAGE: MOVE "INSTALLATION" SECTION HERE -->
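
The extractor options tabled above can also be supplied through the `extractor_args` option of the Python API. A minimal sketch (the URL and argument values are illustrative only):

```python
import yt_dlp

opts = {
    'extractor_args': {
        # Use the default youtube clients but exclude `ios`
        'youtube': {'player_client': ['default', '-ios']},
        # Fall back to the deprecated brightcove backend for tver
        'tver': {'backend': ['brightcove']},
    },
}
with yt_dlp.YoutubeDL(opts) as ydl:
    ydl.download(['https://example.com/watch/123'])
```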

View File

@@ -11,11 +11,13 @@ import re
 from devscripts.utils import get_filename_args, read_file, write_file
 
-VERBOSE_TMPL = '''
+VERBOSE = '''
   - type: checkboxes
     id: verbose
     attributes:
       label: Provide verbose output that clearly demonstrates the problem
+      description: |
+        This is mandatory unless absolutely impossible to provide. If you are unable to provide the output, please explain why.
       options:
         - label: Run **your** yt-dlp command with **-vU** flag added (`yt-dlp -vU <your command line>`)
           required: true
@@ -47,31 +49,23 @@ VERBOSE_TMPL = '''
       render: shell
     validations:
       required: true
-  - type: markdown
-    attributes:
-      value: |
-        > [!CAUTION]
-        > ### GitHub is experiencing a high volume of malicious spam comments.
-        > ### If you receive any replies asking you download a file, do NOT follow the download links!
-        >
-        > Note that this issue may be temporarily locked as an anti-spam measure after it is opened.
 '''.strip()
 
 NO_SKIP = '''
-  - type: checkboxes
+  - type: markdown
     attributes:
-      label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
-      description: Fill all fields even if you think it is irrelevant for the issue
-      options:
-        - label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\\* field
-          required: true
+      value: |
+        > [!IMPORTANT]
+        > Not providing the required (*) information or removing the template will result in your issue being closed and ignored.
 '''.strip()
 
 
 def main():
-    fields = {'no_skip': NO_SKIP}
-    fields['verbose'] = VERBOSE_TMPL % fields
-    fields['verbose_optional'] = re.sub(r'(\n\s+validations:)?\n\s+required: true', '', fields['verbose'])
+    fields = {
+        'no_skip': NO_SKIP,
+        'verbose': VERBOSE,
+        'verbose_optional': re.sub(r'(\n\s+validations:)?\n\s+required: true', '', VERBOSE),
+    }
 
     infile, outfile = get_filename_args(has_infile=True)
     write_file(outfile, read_file(infile) % fields)
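
The final substitution relies on plain `%`-style mapping formatting: each `%(name)s` placeholder in the template file is looked up in `fields`. A standalone sketch (template content assumed):

```python
# Standalone sketch of the %-style mapping substitution used by the script
template = 'body:\n%(no_skip)s\n%(verbose)s\n'
fields = {'no_skip': '  - type: markdown', 'verbose': '  - type: checkboxes'}
print(template % fields)  # each %(name)s placeholder is filled from the dict
```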

View File

@@ -10,6 +10,9 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from inspect import getsource
 from devscripts.utils import get_filename_args, read_file, write_file
+from yt_dlp.extractor import import_extractors
+from yt_dlp.extractor.common import InfoExtractor, SearchInfoExtractor
+from yt_dlp.globals import extractors
 
 NO_ATTR = object()
 STATIC_CLASS_PROPERTIES = [
@@ -38,8 +41,7 @@ def main():
     lazy_extractors_filename = get_filename_args(default_outfile='yt_dlp/extractor/lazy_extractors.py')
 
-    from yt_dlp.extractor.extractors import _ALL_CLASSES
-    from yt_dlp.extractor.common import InfoExtractor, SearchInfoExtractor
+    import_extractors()
 
     DummyInfoExtractor = type('InfoExtractor', (InfoExtractor,), {'IE_NAME': NO_ATTR})
     module_src = '\n'.join((
@@ -47,7 +49,7 @@ def main():
         ' _module = None',
         *extra_ie_code(DummyInfoExtractor),
         '\nclass LazyLoadSearchExtractor(LazyLoadExtractor):\n pass\n',
-        *build_ies(_ALL_CLASSES, (InfoExtractor, SearchInfoExtractor), DummyInfoExtractor),
+        *build_ies(list(extractors.value.values()), (InfoExtractor, SearchInfoExtractor), DummyInfoExtractor),
     ))
 
     write_file(lazy_extractors_filename, f'{module_src}\n')
@@ -73,7 +75,7 @@ def build_ies(ies, bases, attr_base):
         if ie in ies:
             names.append(ie.__name__)
 
-    yield f'\n_ALL_CLASSES = [{", ".join(names)}]'
+    yield '\n_CLASS_LOOKUP = {%s}' % ', '.join(f'{name!r}: {name}' for name in names)
 
 
 def sort_ies(ies, ignored_bases):
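
The generated lookup line now maps extractor class names to the classes themselves instead of emitting a flat `_ALL_CLASSES` list. A small sketch (extractor names assumed) of what the emitted source line looks like:

```python
# Sketch: the generated line maps extractor names to their classes
names = ['YoutubeIE', 'GenericIE']  # assumed example names
line = '\n_CLASS_LOOKUP = {%s}' % ', '.join(f'{name!r}: {name}' for name in names)
assert line == "\n_CLASS_LOOKUP = {'YoutubeIE': YoutubeIE, 'GenericIE': GenericIE}"
```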

View File

@@ -10,10 +10,21 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from devscripts.utils import get_filename_args, write_file
 from yt_dlp.extractor import list_extractor_classes
 
+TEMPLATE = '''\
+# Supported sites
+
+Below is a list of all extractors that are currently included with yt-dlp.
+If a site is not listed here, it might still be supported by yt-dlp's embed extraction or generic extractor.
+Not all sites listed here are guaranteed to work; websites are constantly changing and sometimes this breaks yt-dlp's support for them.
+The only reliable way to check if a site is supported is to try it.
+
+{ie_list}
+'''
+
 
 def main():
     out = '\n'.join(ie.description() for ie in list_extractor_classes() if ie.IE_DESC is not False)
-    write_file(get_filename_args(), f'# Supported sites\n{out}\n')
+    write_file(get_filename_args(), TEMPLATE.format(ie_list=out))
 
 
 if __name__ == '__main__':

View File

@@ -25,7 +25,8 @@ def parse_args():
 
 def run_tests(*tests, pattern=None, ci=False):
-    run_core = 'core' in tests or (not pattern and not tests)
+    # XXX: hatch uses `tests` if no arguments are passed
+    run_core = 'core' in tests or 'tests' in tests or (not pattern and not tests)
     run_download = 'download' in tests
 
     pytest_args = args.pytest_args or os.getenv('HATCH_TEST_ARGS', '')

View File

@@ -55,8 +55,7 @@ default = [
     "websockets>=13.0",
 ]
 curl-cffi = [
-    "curl-cffi==0.5.10; os_name=='nt' and implementation_name=='cpython'",
-    "curl-cffi>=0.5.10,!=0.6.*,<0.7.2; os_name!='nt' and implementation_name=='cpython'",
+    "curl-cffi>=0.5.10,!=0.6.*,!=0.7.*,!=0.8.*,!=0.9.*,<0.11; implementation_name=='cpython'",
 ]
 secretstorage = [
     "cffi",
@@ -76,7 +75,7 @@ dev = [
 ]
 static-analysis = [
     "autopep8~=2.0",
-    "ruff~=0.9.0",
+    "ruff~=0.11.0",
 ]
 test = [
     "pytest~=8.1",
@@ -384,9 +383,14 @@ select = [
     "W391",
     "W504",
 ]
+exclude = "*/extractor/lazy_extractors.py,*venv*,*/test/testdata/sigs/player-*.js,.idea,.vscode"
 
 [tool.pytest.ini_options]
-addopts = "-ra -v --strict-markers"
+addopts = [
+    "-ra",  # summary: all except passed
+    "--verbose",
+    "--strict-markers",
+]
 markers = [
     "download",
 ]
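
The widened `curl-cffi` pin now admits the 0.10.x line while excluding 0.6 through 0.9. A quick check of that specifier with the third-party `packaging` library (an assumption for illustration, not a dependency of this config):

```python
from packaging.requirements import Requirement

req = Requirement(
    "curl-cffi>=0.5.10,!=0.6.*,!=0.7.*,!=0.8.*,!=0.9.*,<0.11; implementation_name=='cpython'")
assert req.specifier.contains('0.10.0')     # newly supported 0.10.x line
assert not req.specifier.contains('0.7.1')  # excluded range
```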

View File

@@ -1,6 +1,13 @@
 # Supported sites
+
+Below is a list of all extractors that are currently included with yt-dlp.
+If a site is not listed here, it might still be supported by yt-dlp's embed extraction or generic extractor.
+Not all sites listed here are guaranteed to work; websites are constantly changing and sometimes this breaks yt-dlp's support for them.
+The only reliable way to check if a site is supported is to try it.
+
 - **17live**
 - **17live:clip**
+- **17live:vod**
 - **1News**: 1news.co.nz article videos
 - **1tv**: Первый канал
 - **20min**
@@ -194,7 +201,7 @@
 - **blogger.com**
 - **Bloomberg**
 - **Bluesky**
-- **BokeCC**
+- **BokeCC**: CC视频
 - **BongaCams**
 - **Boosty**
 - **BostonGlobe**
@@ -218,6 +225,7 @@
 - **bt:vestlendingen**: Bergens Tidende - Vestlendingen
 - **Bundesliga**
 - **Bundestag**
+- **BunnyCdn**
 - **BusinessInsider**
 - **BuzzFeed**
 - **BYUtv**: (**Currently broken**)
@@ -236,6 +244,7 @@
 - **CanalAlpha**
 - **canalc2.tv**
 - **Canalplus**: mycanal.fr and piwiplus.fr
+- **Canalsurmas**
 - **CaracolTvPlay**: [*caracoltv-play*](## "netrc machine")
 - **CartoonNetwork**
 - **cbc.ca**
@@ -314,7 +323,8 @@
 - **curiositystream**: [*curiositystream*](## "netrc machine")
 - **curiositystream:collections**: [*curiositystream*](## "netrc machine")
 - **curiositystream:series**: [*curiositystream*](## "netrc machine")
-- **CWTV**
+- **cwtv**
+- **cwtv:movie**
 - **Cybrary**: [*cybrary*](## "netrc machine")
 - **CybraryCourse**: [*cybrary*](## "netrc machine")
 - **DacastPlaylist**
@@ -338,8 +348,6 @@
 - **daystar:clip**
 - **DBTV**
 - **DctpTv**
-- **DeezerAlbum**
-- **DeezerPlaylist**
 - **democracynow**
 - **DestinationAmerica**
 - **DetikEmbed**
@@ -349,6 +357,7 @@
 - **DigitalConcertHall**: [*digitalconcerthall*](## "netrc machine") DigitalConcertHall extractor
 - **DigitallySpeaking**
 - **Digiteka**
+- **Digiview**
 - **DiscogsReleasePlaylist**
 - **DiscoveryLife**
 - **DiscoveryNetworksDe**
@@ -465,9 +474,9 @@
 - **fptplay**: fptplay.vn
 - **FranceCulture**
 - **FranceInter**
-- **FranceTV**
+- **francetv**
+- **francetv:site**
 - **francetvinfo.fr**
-- **FranceTVSite**
 - **Freesound**
 - **freespeech.org**
 - **freetv:series**
@@ -499,7 +508,7 @@
 - **GediDigital**
 - **gem.cbc.ca**: [*cbcgem*](## "netrc machine")
 - **gem.cbc.ca:live**
-- **gem.cbc.ca:playlist**
+- **gem.cbc.ca:playlist**: [*cbcgem*](## "netrc machine")
 - **Genius**
 - **GeniusLyrics**
 - **Germanupa**: germanupa.de
@@ -601,10 +610,10 @@
 - **Inc**
 - **IndavideoEmbed**
 - **InfoQ**
-- **Instagram**: [*instagram*](## "netrc machine")
-- **instagram:story**: [*instagram*](## "netrc machine")
-- **instagram:tag**: [*instagram*](## "netrc machine") Instagram hashtag search URLs
-- **instagram:user**: [*instagram*](## "netrc machine") Instagram user profile (**Currently broken**)
+- **Instagram**
+- **instagram:story**
+- **instagram:tag**: Instagram hashtag search URLs
+- **instagram:user**: Instagram user profile (**Currently broken**)
 - **InstagramIOS**: IOS instagram:// URL
 - **Internazionale**
 - **InternetVideoArchive**
@@ -653,7 +662,6 @@
 - **KelbyOne**: (**Currently broken**)
 - **Kenh14Playlist**
 - **Kenh14Video**
-- **Ketnet**
 - **khanacademy**
 - **khanacademy:unit**
 - **kick:clips**
@@ -725,6 +733,7 @@
 - **Livestreamfails**
 - **Lnk**
 - **loc**: Library of Congress
+- **Loco**
 - **loom**
 - **loom:folder**
 - **LoveHomePorn**
@@ -819,11 +828,11 @@
 - **MotherlessUploader**
 - **Motorsport**: motorsport.com (**Currently broken**)
 - **MovieFap**
-- **Moviepilot**
+- **moviepilot**: Moviepilot trailer
 - **MoviewPlay**
 - **Moviezine**
 - **MovingImage**
-- **MSN**: (**Currently broken**)
+- **MSN**
 - **mtg**: MTG services
 - **mtv**
 - **mtv.de**: (**Currently broken**)
@@ -1297,8 +1306,8 @@
 - **sejm**
 - **Sen**
 - **SenalColombiaLive**: (**Currently broken**)
-- **SenateGov**
-- **SenateISVP**
+- **senate.gov**
+- **senate.gov:isvp**
 - **SendtoNews**: (**Currently broken**)
 - **Servus**
 - **Sexu**: (**Currently broken**)
@@ -1334,6 +1343,7 @@
 - **Smotrim**
 - **SnapchatSpotlight**
 - **Snotr**
+- **SoftWhiteUnderbelly**: [*softwhiteunderbelly*](## "netrc machine")
 - **Sohu**
 - **SohuV**
 - **SonyLIV**: [*sonyliv*](## "netrc machine")
@@ -1390,6 +1400,7 @@
 - **StoryFire**
 - **StoryFireSeries**
 - **StoryFireUser**
+- **Streaks**
 - **Streamable**
 - **StreamCZ**
 - **StreetVoice**
@@ -1528,6 +1539,8 @@
 - **tv5unis**
 - **tv5unis:video**
 - **tv8.it**
+- **tv8.it:live**: TV8 Live
+- **tv8.it:playlist**: TV8 Playlist
 - **TVANouvelles**
 - **TVANouvellesArticle**
 - **tvaplus**: TVA+
@@ -1548,6 +1561,7 @@
 - **tvp:vod:series**
 - **TVPlayer**
 - **TVPlayHome**
+- **Tvw**
 - **Tweakers**
 - **TwitCasting**
 - **TwitCastingLive**
@@ -1629,8 +1643,6 @@
 - **viewlift**
 - **viewlift:embed**
 - **Viidea**
-- **viki**: [*viki*](## "netrc machine")
-- **viki:channel**: [*viki*](## "netrc machine")
 - **vimeo**: [*vimeo*](## "netrc machine")
 - **vimeo:album**: [*vimeo*](## "netrc machine")
 - **vimeo:channel**: [*vimeo*](## "netrc machine")
@@ -1668,8 +1680,12 @@
 - **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
 - **vqq:series**
 - **vqq:video**
+- **vrsquare**: VR SQUARE
+- **vrsquare:channel**
+- **vrsquare:search**
+- **vrsquare:section**
 - **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza
-- **VrtNU**: [*vrtnu*](## "netrc machine") VRT MAX
+- **vrtmax**: [*vrtnu*](## "netrc machine") VRT MAX (formerly VRT NU)
 - **VTM**: (**Currently broken**)
 - **VTV**
 - **VTVGo**

View File

@@ -101,87 +101,109 @@ def getwebpagetestcases():
 md5 = lambda s: hashlib.md5(s.encode()).hexdigest()
 
 
-def expect_value(self, got, expected, field):
-    if isinstance(expected, str) and expected.startswith('re:'):
-        match_str = expected[len('re:'):]
-        match_rex = re.compile(match_str)
-
-        self.assertTrue(
-            isinstance(got, str),
-            f'Expected a {str.__name__} object, but got {type(got).__name__} for field {field}')
-        self.assertTrue(
-            match_rex.match(got),
-            f'field {field} (value: {got!r}) should match {match_str!r}')
-    elif isinstance(expected, str) and expected.startswith('startswith:'):
-        start_str = expected[len('startswith:'):]
-        self.assertTrue(
-            isinstance(got, str),
-            f'Expected a {str.__name__} object, but got {type(got).__name__} for field {field}')
-        self.assertTrue(
-            got.startswith(start_str),
-            f'field {field} (value: {got!r}) should start with {start_str!r}')
-    elif isinstance(expected, str) and expected.startswith('contains:'):
-        contains_str = expected[len('contains:'):]
-        self.assertTrue(
-            isinstance(got, str),
-            f'Expected a {str.__name__} object, but got {type(got).__name__} for field {field}')
-        self.assertTrue(
-            contains_str in got,
-            f'field {field} (value: {got!r}) should contain {contains_str!r}')
-    elif isinstance(expected, type):
-        self.assertTrue(
-            isinstance(got, expected),
-            f'Expected type {expected!r} for field {field}, but got value {got!r} of type {type(got)!r}')
-    elif isinstance(expected, dict) and isinstance(got, dict):
-        expect_dict(self, got, expected)
-    elif isinstance(expected, list) and isinstance(got, list):
-        self.assertEqual(
-            len(expected), len(got),
-            f'Expect a list of length {len(expected)}, but got a list of length {len(got)} for field {field}')
-        for index, (item_got, item_expected) in enumerate(zip(got, expected)):
-            type_got = type(item_got)
-            type_expected = type(item_expected)
-            self.assertEqual(
-                type_expected, type_got,
-                f'Type mismatch for list item at index {index} for field {field}, '
-                f'expected {type_expected!r}, got {type_got!r}')
-            expect_value(self, item_got, item_expected, field)
-    else:
-        if isinstance(expected, str) and expected.startswith('md5:'):
-            self.assertTrue(
-                isinstance(got, str),
-                f'Expected field {field} to be a unicode object, but got value {got!r} of type {type(got)!r}')
-            got = 'md5:' + md5(got)
-        elif isinstance(expected, str) and re.match(r'^(?:min|max)?count:\d+', expected):
-            self.assertTrue(
-                isinstance(got, (list, dict)),
-                f'Expected field {field} to be a list or a dict, but it is of type {type(got).__name__}')
-            op, _, expected_num = expected.partition(':')
-            expected_num = int(expected_num)
-            if op == 'mincount':
-                assert_func = assertGreaterEqual
-                msg_tmpl = 'Expected %d items in field %s, but only got %d'
-            elif op == 'maxcount':
-                assert_func = assertLessEqual
-                msg_tmpl = 'Expected maximum %d items in field %s, but got %d'
-            elif op == 'count':
-                assert_func = assertEqual
-                msg_tmpl = 'Expected exactly %d items in field %s, but got %d'
-            else:
-                assert False
-            assert_func(
-                self, len(got), expected_num,
-                msg_tmpl % (expected_num, field, len(got)))
-            return
-        self.assertEqual(
-            expected, got,
-            f'Invalid value for field {field}, expected {expected!r}, got {got!r}')
+def _iter_differences(got, expected, field):
+    if isinstance(expected, str):
+        op, _, val = expected.partition(':')
+        if op in ('mincount', 'maxcount', 'count'):
+            if not isinstance(got, (list, dict)):
+                yield field, f'expected either {list.__name__} or {dict.__name__}, got {type(got).__name__}'
+                return
+
+            expected_num = int(val)
+            got_num = len(got)
+            if op == 'mincount':
+                if got_num < expected_num:
+                    yield field, f'expected at least {val} items, got {got_num}'
+                return
+
+            if op == 'maxcount':
+                if got_num > expected_num:
+                    yield field, f'expected at most {val} items, got {got_num}'
+                return
+
+            assert op == 'count'
+            if got_num != expected_num:
+                yield field, f'expected exactly {val} items, got {got_num}'
+            return
+
+        if not isinstance(got, str):
+            yield field, f'expected {str.__name__}, got {type(got).__name__}'
+            return
+
+        if op == 're':
+            if not re.match(val, got):
+                yield field, f'should match {val!r}, got {got!r}'
+            return
+
+        if op == 'startswith':
+            if not got.startswith(val):
+                yield field, f'should start with {val!r}, got {got!r}'
+            return
+
+        if op == 'contains':
+            if val not in got:
+                yield field, f'should contain {val!r}, got {got!r}'
+            return
+
+        if op == 'md5':
+            hash_val = md5(got)
+            if hash_val != val:
+                yield field, f'expected hash {val}, got {hash_val}'
+            return
+
+        if got != expected:
+            yield field, f'expected {expected!r}, got {got!r}'
+        return
+
+    if isinstance(expected, dict) and isinstance(got, dict):
+        for key, expected_val in expected.items():
+            if key not in got:
+                yield field, f'missing key: {key!r}'
+                continue
+
+            field_name = key if field is None else f'{field}.{key}'
+            yield from _iter_differences(got[key], expected_val, field_name)
+        return
+
+    if isinstance(expected, type):
+        if not isinstance(got, expected):
+            yield field, f'expected {expected.__name__}, got {type(got).__name__}'
+        return
+
+    if isinstance(expected, list) and isinstance(got, list):
+        # TODO: clever diffing algorithm lmao
+        if len(expected) != len(got):
+            yield field, f'expected length of {len(expected)}, got {len(got)}'
+            return
+
+        for index, (got_val, expected_val) in enumerate(zip(got, expected)):
+            field_name = str(index) if field is None else f'{field}.{index}'
+            yield from _iter_differences(got_val, expected_val, field_name)
+        return
+
+    if got != expected:
+        yield field, f'expected {expected!r}, got {got!r}'
+
+
+def _expect_value(message, got, expected, field):
+    mismatches = list(_iter_differences(got, expected, field))
+    if not mismatches:
+        return
+
+    fields = [field for field, _ in mismatches if field is not None]
+    return ''.join((
+        message, f' ({", ".join(fields)})' if fields else '',
+        *(f'\n\t{field}: {message}' for field, message in mismatches)))
+
+
+def expect_value(self, got, expected, field):
+    if message := _expect_value('values differ', got, expected, field):
+        self.fail(message)
 
 
 def expect_dict(self, got_dict, expected_dict):
-    for info_field, expected in expected_dict.items():
-        got = got_dict.get(info_field)
-        expect_value(self, got, expected, info_field)
+    if message := _expect_value('dictionaries differ', got_dict, expected_dict, None):
+        self.fail(message)
 
 
 def sanitize_got_info_dict(got_dict):
@@ -237,6 +259,20 @@ def sanitize_got_info_dict(got_dict):
 
 def expect_info_dict(self, got_dict, expected_dict):
+    ALLOWED_KEYS_SORT_ORDER = (
+        # NB: Keep in sync with the docstring of extractor/common.py
+        'id', 'ext', 'direct', 'display_id', 'title', 'alt_title', 'description', 'media_type',
+        'uploader', 'uploader_id', 'uploader_url', 'channel', 'channel_id', 'channel_url', 'channel_is_verified',
+        'channel_follower_count', 'comment_count', 'view_count', 'concurrent_view_count',
+        'like_count', 'dislike_count', 'repost_count', 'average_rating', 'age_limit', 'duration', 'thumbnail', 'heatmap',
+        'chapters', 'chapter', 'chapter_number', 'chapter_id', 'start_time', 'end_time', 'section_start', 'section_end',
+        'categories', 'tags', 'cast', 'composers', 'artists', 'album_artists', 'creators', 'genres',
+        'track', 'track_number', 'track_id', 'album', 'album_type', 'disc_number',
+        'series', 'series_id', 'season', 'season_number', 'season_id', 'episode', 'episode_number', 'episode_id',
+        'timestamp', 'upload_date', 'release_timestamp', 'release_date', 'release_year', 'modified_timestamp', 'modified_date',
+        'playable_in_embed', 'availability', 'live_status', 'location', 'license', '_old_archive_ids',
+    )
+
     expect_dict(self, got_dict, expected_dict)
     # Check for the presence of mandatory fields
     if got_dict.get('_type') not in ('playlist', 'multi_video'):
@@ -252,7 +288,13 @@ def expect_info_dict(self, got_dict, expected_dict):
     test_info_dict = sanitize_got_info_dict(got_dict)
 
-    missing_keys = set(test_info_dict.keys()) - set(expected_dict.keys())
+    # Check for invalid/misspelled field names being returned by the extractor
+    invalid_keys = sorted(test_info_dict.keys() - ALLOWED_KEYS_SORT_ORDER)
+    self.assertFalse(invalid_keys, f'Invalid fields returned by the extractor: {", ".join(invalid_keys)}')
+
+    missing_keys = sorted(
+        test_info_dict.keys() - expected_dict.keys(),
+        key=lambda x: ALLOWED_KEYS_SORT_ORDER.index(x))
     if missing_keys:
         def _repr(v):
             if isinstance(v, str):
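
The `op:value` string specs handled above (`re:`, `startswith:`, `contains:`, `md5:`, and the `mincount:`/`maxcount:`/`count:` family) can be summarized with a simplified standalone re-implementation; this is a sketch of the semantics, not the helper itself:

```python
import hashlib
import re

def check(got, expected):
    # Simplified stand-in for the 'op:value' specs of _iter_differences
    op, _, val = expected.partition(':')
    if op == 'mincount':
        return len(got) >= int(val)
    if op == 'maxcount':
        return len(got) <= int(val)
    if op == 'count':
        return len(got) == int(val)
    if op == 're':
        return re.match(val, got) is not None
    if op == 'startswith':
        return got.startswith(val)
    if op == 'contains':
        return val in got
    if op == 'md5':
        return hashlib.md5(got.encode()).hexdigest() == val
    return got == expected

assert check(['a', 'b', 'c'], 'mincount:2')
assert check('yt-dlp test video', 're:^yt-dlp')
assert not check('hello', 'startswith:world')
```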

View File

@@ -638,6 +638,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
             'img_bipbop_adv_example_fmp4',
             'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
             [{
+                # 60kbps (bitrate not provided in m3u8); sorted as worst because it's grouped with lowest bitrate video track
                 'format_id': 'aud1-English',
                 'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a1/prog_index.m3u8',
                 'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
@@ -645,15 +646,9 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
                 'language': 'en',
                 'ext': 'mp4',
                 'protocol': 'm3u8_native',
                 'audio_ext': 'mp4',
+                'source_preference': 0,
             }, {
-                'format_id': 'aud2-English',
-                'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a2/prog_index.m3u8',
-                'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
-                'language': 'en',
-                'ext': 'mp4',
-                'protocol': 'm3u8_native',
-                'audio_ext': 'mp4',
-            }, {
+                # 192kbps (bitrate not provided in m3u8)
                 'format_id': 'aud3-English',
                 'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a3/prog_index.m3u8',
                 'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
@@ -661,6 +656,17 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
                 'ext': 'mp4',
                 'protocol': 'm3u8_native',
                 'audio_ext': 'mp4',
+                'source_preference': 1,
+            }, {
+                # 384kbps (bitrate not provided in m3u8); sorted as best because it's grouped with the highest bitrate video track
+                'format_id': 'aud2-English',
+                'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a2/prog_index.m3u8',
+                'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
+                'language': 'en',
+                'ext': 'mp4',
+                'protocol': 'm3u8_native',
+                'audio_ext': 'mp4',
+                'source_preference': 2,
             }, {
                 'format_id': '530',
                 'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/v2/prog_index.m3u8',
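
The expected values above encode the fix for sorting HLS audio by `GROUP-ID` (#12714): audio renditions without an explicit bitrate inherit a `source_preference` from the rank of the video variants that reference their group. A minimal sketch (field names and bitrates assumed) of that ranking:

```python
# Sketch: rank audio GROUP-IDs by the bitrate of the video variant using them
variants = [
    {'audio_group_id': 'aud1', 'video_bitrate': 530_000},
    {'audio_group_id': 'aud2', 'video_bitrate': 7_200_000},
    {'audio_group_id': 'aud3', 'video_bitrate': 2_000_000},
]
ranked = sorted(variants, key=lambda v: v['video_bitrate'])
source_preference = {v['audio_group_id']: rank for rank, v in enumerate(ranked)}
assert source_preference == {'aud1': 0, 'aud3': 1, 'aud2': 2}
```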

View File

@@ -6,6 +6,8 @@ import sys
 import unittest
 from unittest.mock import patch
 
+from yt_dlp.globals import all_plugins_loaded
+
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
@@ -1427,6 +1429,12 @@ class TestYoutubeDL(unittest.TestCase):
         self.assertFalse(result.get('cookies'), msg='Cookies set in cookies field for wrong domain')
         self.assertFalse(ydl.cookiejar.get_cookie_header(fmt['url']), msg='Cookies set in cookiejar for wrong domain')
 
+    def test_load_plugins_compat(self):
+        # Should try to reload plugins if they haven't already been loaded
+        all_plugins_loaded.value = False
+        FakeYDL().close()
+        assert all_plugins_loaded.value
+
 
 if __name__ == '__main__':
     unittest.main()
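This compat test pins down the shim added to YoutubeDL.__init__ later in this diff: constructing a YoutubeDL instance loads all plugins if nothing has loaded them yet, so API users keep the old implicit behaviour. The guarded call amounts to:

    from yt_dlp.globals import all_plugins_loaded
    from yt_dlp.plugins import load_all_plugins

    if not all_plugins_loaded.value:  # API caller never loaded plugins explicitly
        load_all_plugins()            # loads every registered plugin spec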


@@ -331,10 +331,6 @@ class TestHTTPConnectProxy:
         assert proxy_info['proxy'] == server_address
         assert 'Proxy-Authorization' in proxy_info['headers']
 
-    @pytest.mark.skip_handler(
-        'Requests',
-        'bug in urllib3 causes unclosed socket: https://github.com/urllib3/urllib3/issues/3374',
-    )
     def test_http_connect_bad_auth(self, handler, ctx):
         with ctx.http_server(HTTPConnectProxyHandler, username='test', password='test') as server_address:
             with handler(verify=False, proxies={ctx.REQUEST_PROTO: f'http://test:bad@{server_address}'}) as rh:


@@ -9,7 +9,7 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 import math
 
-from yt_dlp.jsinterp import JS_Undefined, JSInterpreter
+from yt_dlp.jsinterp import JS_Undefined, JSInterpreter, js_number_to_string
 
 
 class NaN:
@@ -93,6 +93,16 @@ class TestJSInterpreter(unittest.TestCase):
         self._test('function f(){return 0 ?? 42;}', 0)
         self._test('function f(){return "life, the universe and everything" < 42;}', False)
         self._test('function f(){return 0 - 7 * - 6;}', 42)
+        self._test('function f(){return true << "5";}', 32)
+        self._test('function f(){return true << true;}', 2)
+        self._test('function f(){return "19" & "21.9";}', 17)
+        self._test('function f(){return "19" & false;}', 0)
+        self._test('function f(){return "11.0" >> "2.1";}', 2)
+        self._test('function f(){return 5 ^ 9;}', 12)
+        self._test('function f(){return 0.0 << NaN}', 0)
+        self._test('function f(){return null << undefined}', 0)
+        # TODO: Does not work due to number too large
+        # self._test('function f(){return 21 << 4294967297}', 42)
 
     def test_array_access(self):
         self._test('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2.0] = 7; return x;}', [5, 2, 7])
@@ -374,7 +384,7 @@ class TestJSInterpreter(unittest.TestCase):
     @unittest.skip('Not implemented')
     def test_packed(self):
         jsi = JSInterpreter('''function f(p,a,c,k,e,d){while(c--)if(k[c])p=p.replace(new RegExp('\\b'+c.toString(a)+'\\b','g'),k[c]);return p}''')
-        self.assertEqual(jsi.call_function('f', '''h 7=g("1j");7.7h({7g:[{33:"w://7f-7e-7d-7c.v.7b/7a/79/78/77/76.74?t=73&s=2s&e=72&f=2t&71=70.0.0.1&6z=6y&6x=6w"}],6v:"w://32.v.u/6u.31",16:"r%",15:"r%",6t:"6s",6r:"",6q:"l",6p:"l",6o:"6n",6m:\'6l\',6k:"6j",9:[{33:"/2u?b=6i&n=50&6h=w://32.v.u/6g.31",6f:"6e"}],1y:{6d:1,6c:\'#6b\',6a:\'#69\',68:"67",66:30,65:r,},"64":{63:"%62 2m%m%61%5z%5y%5x.u%5w%5v%5u.2y%22 2k%m%1o%22 5t%m%1o%22 5s%m%1o%22 2j%m%5r%22 16%m%5q%22 15%m%5p%22 5o%2z%5n%5m%2z",5l:"w://v.u/d/1k/5k.2y",5j:[]},\'5i\':{"5h":"5g"},5f:"5e",5d:"w://v.u",5c:{},5b:l,1x:[0.25,0.50,0.75,1,1.25,1.5,2]});h 1m,1n,5a;h 59=0,58=0;h 7=g("1j");h 2x=0,57=0,56=0;$.55({54:{\'53-52\':\'2i-51\'}});7.j(\'4z\',6(x){c(5>0&&x.1l>=5&&1n!=1){1n=1;$(\'q.4y\').4x(\'4w\')}});7.j(\'13\',6(x){2x=x.1l});7.j(\'2g\',6(x){2w(x)});7.j(\'4v\',6(){$(\'q.2v\').4u()});6 2w(x){$(\'q.2v\').4t();c(1m)19;1m=1;17=0;c(4s.4r===l){17=1}$.4q(\'/2u?b=4p&2l=1k&4o=2t-4n-4m-2s-4l&4k=&4j=&4i=&17=\'+17,6(2r){$(\'#4h\').4g(2r)});$(\'.3-8-4f-4e:4d("4c")\').2h(6(e){2q();g().4b(0);g().4a(l)});6 2q(){h $14=$("<q />").2p({1l:"49",16:"r%",15:"r%",48:0,2n:0,2o:47,46:"45(10%, 10%, 10%, 0.4)","44-43":"42"});$("<41 />").2p({16:"60%",15:"60%",2o:40,"3z-2n":"3y"}).3x({\'2m\':\'/?b=3w&2l=1k\',\'2k\':\'0\',\'2j\':\'2i\'}).2f($14);$14.2h(6(){$(3v).3u();g().2g()});$14.2f($(\'#1j\'))}g().13(0);}6 3t(){h 9=7.1b(2e);2d.2c(9);c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==2e){2d.2c(\'!!=\'+i);7.1p(i)}}}}7.j(\'3s\',6(){g().1h("/2a/3r.29","3q 10 28",6(){g().13(g().27()+10)},"2b");$("q[26=2b]").23().21(\'.3-20-1z\');g().1h("/2a/3p.29","3o 10 28",6(){h 12=g().27()-10;c(12<0)12=0;g().13(12)},"24");$("q[26=24]").23().21(\'.3-20-1z\');});6 1i(){}7.j(\'3n\',6(){1i()});7.j(\'3m\',6(){1i()});7.j("k",6(y){h 9=7.1b();c(9.n<2)19;$(\'.3-8-3l-3k\').3j(6(){$(\'#3-8-a-k\').1e(\'3-8-a-z\');$(\'.3-a-k\').p(\'o-1f\',\'11\')});7.1h("/3i/3h.3g","3f 3e",6(){$(\'.3-1w\').3d(\'3-8-1v\');$(\'.3-8-1y, .3-8-1x\').p(\'o-1g\',\'11\');c($(\'.3-1w\').3c(\'3-8-1v\')){$(\'.3-a-k\').p(\'o-1g\',\'l\');$(\'.3-a-k\').p(\'o-1f\',\'l\');$(\'.3-8-a\').1e(\'3-8-a-z\');$(\'.3-8-a:1u\').3b(\'3-8-a-z\')}3a{$(\'.3-a-k\').p(\'o-1g\',\'11\');$(\'.3-a-k\').p(\'o-1f\',\'11\');$(\'.3-8-a:1u\').1e(\'3-8-a-z\')}},"39");7.j("38",6(y){1d.37(\'1c\',y.9[y.36].1a)});c(1d.1t(\'1c\')){35("1s(1d.1t(\'1c\'));",34)}});h 18;6 1s(1q){h 
-9=7.1b();c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==1q){c(i==18){19}18=i;7.1p(i)}}}}',36,270,'|||jw|||function|player|settings|tracks|submenu||if||||jwplayer|var||on|audioTracks|true|3D|length|aria|attr|div|100|||sx|filemoon|https||event|active||false|tt|seek|dd|height|width|adb|current_audio|return|name|getAudioTracks|default_audio|localStorage|removeClass|expanded|checked|addButton|callMeMaybe|vplayer|0fxcyc2ajhp1|position|vvplay|vvad|220|setCurrentAudioTrack|audio_name|for|audio_set|getItem|last|open|controls|playbackRates|captions|rewind|icon|insertAfter||detach|ff00||button|getPosition|sec|png|player8|ff11|log|console|track_name|appendTo|play|click|no|scrolling|frameborder|file_code|src|top|zIndex|css|showCCform|data|1662367683|383371|dl|video_ad|doPlay|prevt|mp4|3E||jpg|thumbs|file|300|setTimeout|currentTrack|setItem|audioTrackChanged|dualSound|else|addClass|hasClass|toggleClass|Track|Audio|svg|dualy|images|mousedown|buttons|topbar|playAttemptFailed|beforePlay|Rewind|fr|Forward|ff|ready|set_audio_track|remove|this|upload_srt|prop|50px|margin|1000001|iframe|center|align|text|rgba|background|1000000|left|absolute|pause|setCurrentCaptions|Upload|contains|item|content|html|fviews|referer|prem|embed|3e57249ef633e0d03bf76ceb8d8a4b65|216|83|hash|view|get|TokenZir|window|hide|show|complete|slow|fadeIn|video_ad_fadein|time||cache|Cache|Content|headers|ajaxSetup|v2done|tott|vastdone2|vastdone1|vvbefore|playbackRateControls|cast|aboutlink|FileMoon|abouttext|UHD|1870|qualityLabels|sites|GNOME_POWER|link|2Fiframe|3C|allowfullscreen|22360|22640|22no|marginheight|marginwidth|2FGNOME_POWER|2F0fxcyc2ajhp1|2Fe|2Ffilemoon|2F|3A||22https|3Ciframe|code|sharing|fontOpacity|backgroundOpacity|Tahoma|fontFamily|303030|backgroundColor|FFFFFF|color|userFontScale|thumbnails|kind|0fxcyc2ajhp10000|url|get_slides|start|startparam|none|preload|html5|primary|hlshtml|androidhls|duration|uniform|stretching|0fxcyc2ajhp1_xt|image|2048|sp|6871|asn|127|srv|43200|_g3XlBcu2lmD9oDexD2NLWSmah2Nu3XcDrl93m9PwXY|m3u8||master|0fxcyc2ajhp1_x|00076|01|hls2|to|s01|delivery|storage|moon|sources|setup'''.split('|')))
+        self.assertEqual(jsi.call_function('f', '''h 7=g("1j");7.7h({7g:[{33:"w://7f-7e-7d-7c.v.7b/7a/79/78/77/76.74?t=73&s=2s&e=72&f=2t&71=70.0.0.1&6z=6y&6x=6w"}],6v:"w://32.v.u/6u.31",16:"r%",15:"r%",6t:"6s",6r:"",6q:"l",6p:"l",6o:"6n",6m:\'6l\',6k:"6j",9:[{33:"/2u?b=6i&n=50&6h=w://32.v.u/6g.31",6f:"6e"}],1y:{6d:1,6c:\'#6b\',6a:\'#69\',68:"67",66:30,65:r,},"64":{63:"%62 2m%m%61%5z%5y%5x.u%5w%5v%5u.2y%22 2k%m%1o%22 5t%m%1o%22 5s%m%1o%22 2j%m%5r%22 16%m%5q%22 15%m%5p%22 5o%2z%5n%5m%2z",5l:"w://v.u/d/1k/5k.2y",5j:[]},\'5i\':{"5h":"5g"},5f:"5e",5d:"w://v.u",5c:{},5b:l,1x:[0.25,0.50,0.75,1,1.25,1.5,2]});h 1m,1n,5a;h 59=0,58=0;h 7=g("1j");h 2x=0,57=0,56=0;$.55({54:{\'53-52\':\'2i-51\'}});7.j(\'4z\',6(x){c(5>0&&x.1l>=5&&1n!=1){1n=1;$(\'q.4y\').4x(\'4w\')}});7.j(\'13\',6(x){2x=x.1l});7.j(\'2g\',6(x){2w(x)});7.j(\'4v\',6(){$(\'q.2v\').4u()});6 2w(x){$(\'q.2v\').4t();c(1m)19;1m=1;17=0;c(4s.4r===l){17=1}$.4q(\'/2u?b=4p&2l=1k&4o=2t-4n-4m-2s-4l&4k=&4j=&4i=&17=\'+17,6(2r){$(\'#4h\').4g(2r)});$(\'.3-8-4f-4e:4d("4c")\').2h(6(e){2q();g().4b(0);g().4a(l)});6 2q(){h $14=$("<q />").2p({1l:"49",16:"r%",15:"r%",48:0,2n:0,2o:47,46:"45(10%, 10%, 10%, 0.4)","44-43":"42"});$("<41 />").2p({16:"60%",15:"60%",2o:40,"3z-2n":"3y"}).3x({\'2m\':\'/?b=3w&2l=1k\',\'2k\':\'0\',\'2j\':\'2i\'}).2f($14);$14.2h(6(){$(3v).3u();g().2g()});$14.2f($(\'#1j\'))}g().13(0);}6 3t(){h 
+9=7.1b(2e);2d.2c(9);c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==2e){2d.2c(\'!!=\'+i);7.1p(i)}}}}7.j(\'3s\',6(){g().1h("/2a/3r.29","3q 10 28",6(){g().13(g().27()+10)},"2b");$("q[26=2b]").23().21(\'.3-20-1z\');g().1h("/2a/3p.29","3o 10 28",6(){h 12=g().27()-10;c(12<0)12=0;g().13(12)},"24");$("q[26=24]").23().21(\'.3-20-1z\');});6 1i(){}7.j(\'3n\',6(){1i()});7.j(\'3m\',6(){1i()});7.j("k",6(y){h 9=7.1b();c(9.n<2)19;$(\'.3-8-3l-3k\').3j(6(){$(\'#3-8-a-k\').1e(\'3-8-a-z\');$(\'.3-a-k\').p(\'o-1f\',\'11\')});7.1h("/3i/3h.3g","3f 3e",6(){$(\'.3-1w\').3d(\'3-8-1v\');$(\'.3-8-1y, .3-8-1x\').p(\'o-1g\',\'11\');c($(\'.3-1w\').3c(\'3-8-1v\')){$(\'.3-a-k\').p(\'o-1g\',\'l\');$(\'.3-a-k\').p(\'o-1f\',\'l\');$(\'.3-8-a\').1e(\'3-8-a-z\');$(\'.3-8-a:1u\').3b(\'3-8-a-z\')}3a{$(\'.3-a-k\').p(\'o-1g\',\'11\');$(\'.3-a-k\').p(\'o-1f\',\'11\');$(\'.3-8-a:1u\').1e(\'3-8-a-z\')}},"39");7.j("38",6(y){1d.37(\'1c\',y.9[y.36].1a)});c(1d.1t(\'1c\')){35("1s(1d.1t(\'1c\'));",34)}});h 18;6 1s(1q){h 9=7.1b();c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==1q){c(i==18){19}18=i;7.1p(i)}}}}',36,270,'|||jw|||function|player|settings|tracks|submenu||if||||jwplayer|var||on|audioTracks|true|3D|length|aria|attr|div|100|||sx|filemoon|https||event|active||false|tt|seek|dd|height|width|adb|current_audio|return|name|getAudioTracks|default_audio|localStorage|removeClass|expanded|checked|addButton|callMeMaybe|vplayer|0fxcyc2ajhp1|position|vvplay|vvad|220|setCurrentAudioTrack|audio_name|for|audio_set|getItem|last|open|controls|playbackRates|captions|rewind|icon|insertAfter||detach|ff00||button|getPosition|sec|png|player8|ff11|log|console|track_name|appendTo|play|click|no|scrolling|frameborder|file_code|src|top|zIndex|css|showCCform|data|1662367683|383371|dl|video_ad|doPlay|prevt|mp4|3E||jpg|thumbs|file|300|setTimeout|currentTrack|setItem|audioTrackChanged|dualSound|else|addClass|hasClass|toggleClass|Track|Audio|svg|dualy|images|mousedown|buttons|topbar|playAttemptFailed|beforePlay|Rewind|fr|Forward|ff|ready|set_audio_track|remove|this|upload_srt|prop|50px|margin|1000001|iframe|center|align|text|rgba|background|1000000|left|absolute|pause|setCurrentCaptions|Upload|contains|item|content|html|fviews|referer|prem|embed|3e57249ef633e0d03bf76ceb8d8a4b65|216|83|hash|view|get|TokenZir|window|hide|show|complete|slow|fadeIn|video_ad_fadein|time||cache|Cache|Content|headers|ajaxSetup|v2done|tott|vastdone2|vastdone1|vvbefore|playbackRateControls|cast|aboutlink|FileMoon|abouttext|UHD|1870|qualityLabels|sites|GNOME_POWER|link|2Fiframe|3C|allowfullscreen|22360|22640|22no|marginheight|marginwidth|2FGNOME_POWER|2F0fxcyc2ajhp1|2Fe|2Ffilemoon|2F|3A||22https|3Ciframe|code|sharing|fontOpacity|backgroundOpacity|Tahoma|fontFamily|303030|backgroundColor|FFFFFF|color|userFontScale|thumbnails|kind|0fxcyc2ajhp10000|url|get_slides|start|startparam|none|preload|html5|primary|hlshtml|androidhls|duration|uniform|stretching|0fxcyc2ajhp1_xt|image|2048|sp|6871|asn|127|srv|43200|_g3XlBcu2lmD9oDexD2NLWSmah2Nu3XcDrl93m9PwXY|m3u8||master|0fxcyc2ajhp1_x|00076|01|hls2|to|s01|delivery|storage|moon|sources|setup'''.split('|')))  # noqa: SIM905
     def test_join(self):
         test_input = list('test')
@@ -431,6 +441,37 @@ class TestJSInterpreter(unittest.TestCase):
         self._test('function f(){return "012345678".slice(-1, 1)}', '')
         self._test('function f(){return "012345678".slice(-3, -1)}', '67')
 
+    def test_js_number_to_string(self):
+        for test, radix, expected in [
+            (0, None, '0'),
+            (-0, None, '0'),
+            (0.0, None, '0'),
+            (-0.0, None, '0'),
+            (math.nan, None, 'NaN'),
+            (-math.nan, None, 'NaN'),
+            (math.inf, None, 'Infinity'),
+            (-math.inf, None, '-Infinity'),
+            (10 ** 21.5, 8, '526665530627250154000000'),
+            (6, 2, '110'),
+            (254, 16, 'fe'),
+            (-10, 2, '-1010'),
+            (-0xff, 2, '-11111111'),
+            (0.1 + 0.2, 16, '0.4cccccccccccd'),
+            (1234.1234, 10, '1234.1234'),
+            # (1000000000000000128, 10, '1000000000000000100')
+        ]:
+            assert js_number_to_string(test, radix) == expected
+
+    def test_extract_function(self):
+        jsi = JSInterpreter('function a(b) { return b + 1; }')
+        func = jsi.extract_function('a')
+        self.assertEqual(func([2]), 3)
+
+    def test_extract_function_with_global_stack(self):
+        jsi = JSInterpreter('function c(d) { return d + e + f + g; }')
+        func = jsi.extract_function('c', {'e': 10}, {'f': 100, 'g': 1000})
+        self.assertEqual(func([1]), 1111)
+
 
 if __name__ == '__main__':
     unittest.main()
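js_number_to_string mirrors ECMAScript's number-to-string conversion, which explains the expectations in the table above: negative zero renders as '0', non-decimal radixes use lowercase digits, and decimal output uses the shortest round-tripping form. A quick usage sketch matching the test table:

    import math

    from yt_dlp.jsinterp import js_number_to_string

    print(js_number_to_string(254, 16))        # 'fe'
    print(js_number_to_string(-0.0))           # '0' (negative zero collapses)
    print(js_number_to_string(math.inf))       # 'Infinity'
    print(js_number_to_string(0.1 + 0.2, 16))  # '0.4cccccccccccd'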


@@ -614,7 +614,6 @@ class TestHTTPRequestHandler(TestRequestHandlerBase):
                 rh, Request(f'http://127.0.0.1:{self.http_port}/source_address')).read().decode()
             assert source_address == data
 
-    # Not supported by CurlCFFI
     @pytest.mark.skip_handler('CurlCFFI', 'not supported by curl-cffi')
     def test_gzip_trailing_garbage(self, handler):
         with handler() as rh:
@@ -720,6 +719,15 @@ class TestHTTPRequestHandler(TestRequestHandlerBase):
                 rh, Request(
                     f'http://127.0.0.1:{self.http_port}/headers', proxies={'all': 'http://10.255.255.255'})).close()
 
+    @pytest.mark.skip_handlers_if(lambda _, handler: handler not in ['Urllib', 'CurlCFFI'], 'handler does not support keep_header_casing')
+    def test_keep_header_casing(self, handler):
+        with handler() as rh:
+            res = validate_and_send(
+                rh, Request(
+                    f'http://127.0.0.1:{self.http_port}/headers', headers={'X-test-heaDer': 'test'}, extensions={'keep_header_casing': True})).read().decode()
+
+            assert 'X-test-heaDer: test' in res
+
 
 @pytest.mark.parametrize('handler', ['Urllib', 'Requests', 'CurlCFFI'], indirect=True)
 class TestClientCertificate:
@@ -1289,6 +1297,7 @@ class TestRequestHandlerValidation:
             ({'legacy_ssl': False}, False),
             ({'legacy_ssl': True}, False),
             ({'legacy_ssl': 'notabool'}, AssertionError),
+            ({'keep_header_casing': True}, UnsupportedRequest),
         ]),
        ('Requests', 'http', [
            ({'cookiejar': 'notacookiejar'}, AssertionError),
@@ -1299,6 +1308,9 @@ class TestRequestHandlerValidation:
             ({'legacy_ssl': False}, False),
             ({'legacy_ssl': True}, False),
             ({'legacy_ssl': 'notabool'}, AssertionError),
+            ({'keep_header_casing': False}, False),
+            ({'keep_header_casing': True}, False),
+            ({'keep_header_casing': 'notabool'}, AssertionError),
         ]),
         ('CurlCFFI', 'http', [
             ({'cookiejar': 'notacookiejar'}, AssertionError),
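Requests can now opt in to preserving header casing end-to-end; handlers that don't implement the new extension reject it at validation time with UnsupportedRequest. A usage sketch based on the test above (placeholder URL):

    from yt_dlp.networking import Request

    req = Request(
        'http://127.0.0.1:8080/headers',        # placeholder test server
        headers={'X-test-heaDer': 'test'},      # casing to preserve on the wire
        extensions={'keep_header_casing': True},
    )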


@@ -10,22 +10,71 @@ TEST_DATA_DIR = Path(os.path.dirname(os.path.abspath(__file__)), 'testdata')
 sys.path.append(str(TEST_DATA_DIR))
 importlib.invalidate_caches()
 
-from yt_dlp.utils import Config
-from yt_dlp.plugins import PACKAGE_NAME, directories, load_plugins
+from yt_dlp.plugins import (
+    PACKAGE_NAME,
+    PluginSpec,
+    directories,
+    load_plugins,
+    load_all_plugins,
+    register_plugin_spec,
+)
+
+from yt_dlp.globals import (
+    extractors,
+    postprocessors,
+    plugin_dirs,
+    plugin_ies,
+    plugin_pps,
+    all_plugins_loaded,
+    plugin_specs,
+)
+
+
+EXTRACTOR_PLUGIN_SPEC = PluginSpec(
+    module_name='extractor',
+    suffix='IE',
+    destination=extractors,
+    plugin_destination=plugin_ies,
+)
+
+POSTPROCESSOR_PLUGIN_SPEC = PluginSpec(
+    module_name='postprocessor',
+    suffix='PP',
+    destination=postprocessors,
+    plugin_destination=plugin_pps,
+)
+
+
+def reset_plugins():
+    plugin_ies.value = {}
+    plugin_pps.value = {}
+    plugin_dirs.value = ['default']
+    plugin_specs.value = {}
+    all_plugins_loaded.value = False
+    # Clearing override plugins is probably difficult
+    for module_name in tuple(sys.modules):
+        for plugin_type in ('extractor', 'postprocessor'):
+            if module_name.startswith(f'{PACKAGE_NAME}.{plugin_type}.'):
+                del sys.modules[module_name]
+    importlib.invalidate_caches()
+
 
 class TestPlugins(unittest.TestCase):
 
     TEST_PLUGIN_DIR = TEST_DATA_DIR / PACKAGE_NAME
 
+    def setUp(self):
+        reset_plugins()
+
+    def tearDown(self):
+        reset_plugins()
+
     def test_directories_containing_plugins(self):
         self.assertIn(self.TEST_PLUGIN_DIR, map(Path, directories()))
 
     def test_extractor_classes(self):
-        for module_name in tuple(sys.modules):
-            if module_name.startswith(f'{PACKAGE_NAME}.extractor'):
-                del sys.modules[module_name]
-        plugins_ie = load_plugins('extractor', 'IE')
+        plugins_ie = load_plugins(EXTRACTOR_PLUGIN_SPEC)
 
         self.assertIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
         self.assertIn('NormalPluginIE', plugins_ie.keys())
@@ -35,17 +84,29 @@ class TestPlugins(unittest.TestCase):
             f'{PACKAGE_NAME}.extractor._ignore' in sys.modules,
             'loaded module beginning with underscore')
         self.assertNotIn('IgnorePluginIE', plugins_ie.keys())
+        self.assertNotIn('IgnorePluginIE', plugin_ies.value)
 
         # Don't load extractors with underscore prefix
         self.assertNotIn('_IgnoreUnderscorePluginIE', plugins_ie.keys())
+        self.assertNotIn('_IgnoreUnderscorePluginIE', plugin_ies.value)
 
         # Don't load extractors not specified in __all__ (if supplied)
         self.assertNotIn('IgnoreNotInAllPluginIE', plugins_ie.keys())
+        self.assertNotIn('IgnoreNotInAllPluginIE', plugin_ies.value)
         self.assertIn('InAllPluginIE', plugins_ie.keys())
+        self.assertIn('InAllPluginIE', plugin_ies.value)
+
+        # Don't load override extractors
+        self.assertNotIn('OverrideGenericIE', plugins_ie.keys())
+        self.assertNotIn('OverrideGenericIE', plugin_ies.value)
+        self.assertNotIn('_UnderscoreOverrideGenericIE', plugins_ie.keys())
+        self.assertNotIn('_UnderscoreOverrideGenericIE', plugin_ies.value)
 
     def test_postprocessor_classes(self):
-        plugins_pp = load_plugins('postprocessor', 'PP')
+        plugins_pp = load_plugins(POSTPROCESSOR_PLUGIN_SPEC)
         self.assertIn('NormalPluginPP', plugins_pp.keys())
+        self.assertIn(f'{PACKAGE_NAME}.postprocessor.normal', sys.modules.keys())
+        self.assertIn('NormalPluginPP', plugin_pps.value)
 
     def test_importing_zipped_module(self):
         zip_path = TEST_DATA_DIR / 'zipped_plugins.zip'
@@ -58,10 +119,10 @@ class TestPlugins(unittest.TestCase):
                 package = importlib.import_module(f'{PACKAGE_NAME}.{plugin_type}')
                 self.assertIn(zip_path / PACKAGE_NAME / plugin_type, map(Path, package.__path__))
 
-            plugins_ie = load_plugins('extractor', 'IE')
+            plugins_ie = load_plugins(EXTRACTOR_PLUGIN_SPEC)
             self.assertIn('ZippedPluginIE', plugins_ie.keys())
 
-            plugins_pp = load_plugins('postprocessor', 'PP')
+            plugins_pp = load_plugins(POSTPROCESSOR_PLUGIN_SPEC)
             self.assertIn('ZippedPluginPP', plugins_pp.keys())
 
         finally:
@@ -69,23 +130,116 @@ class TestPlugins(unittest.TestCase):
             os.remove(zip_path)
             importlib.invalidate_caches()  # reset the import caches
 
-    def test_plugin_dirs(self):
-        # Internal plugin dirs hack for CLI --plugin-dirs
-        # To be replaced with proper system later
-        custom_plugin_dir = TEST_DATA_DIR / 'plugin_packages'
-        Config._plugin_dirs = [str(custom_plugin_dir)]
-        importlib.invalidate_caches()  # reset the import caches
+    def test_reloading_plugins(self):
+        reload_plugins_path = TEST_DATA_DIR / 'reload_plugins'
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+        load_plugins(POSTPROCESSOR_PLUGIN_SPEC)
+
+        # Remove default folder and add reload_plugin path
+        sys.path.remove(str(TEST_DATA_DIR))
+        sys.path.append(str(reload_plugins_path))
+        importlib.invalidate_caches()
         try:
-            package = importlib.import_module(f'{PACKAGE_NAME}.extractor')
-            self.assertIn(custom_plugin_dir / 'testpackage' / PACKAGE_NAME / 'extractor', map(Path, package.__path__))
+            for plugin_type in ('extractor', 'postprocessor'):
+                package = importlib.import_module(f'{PACKAGE_NAME}.{plugin_type}')
+                self.assertIn(reload_plugins_path / PACKAGE_NAME / plugin_type, map(Path, package.__path__))
 
-            plugins_ie = load_plugins('extractor', 'IE')
-            self.assertIn('PackagePluginIE', plugins_ie.keys())
+            plugins_ie = load_plugins(EXTRACTOR_PLUGIN_SPEC)
+            self.assertIn('NormalPluginIE', plugins_ie.keys())
+            self.assertTrue(
+                plugins_ie['NormalPluginIE'].REPLACED,
+                msg='Reloading has not replaced original extractor plugin')
+            self.assertTrue(
+                extractors.value['NormalPluginIE'].REPLACED,
+                msg='Reloading has not replaced original extractor plugin globally')
+
+            plugins_pp = load_plugins(POSTPROCESSOR_PLUGIN_SPEC)
+            self.assertIn('NormalPluginPP', plugins_pp.keys())
+            self.assertTrue(plugins_pp['NormalPluginPP'].REPLACED,
+                            msg='Reloading has not replaced original postprocessor plugin')
+            self.assertTrue(
+                postprocessors.value['NormalPluginPP'].REPLACED,
+                msg='Reloading has not replaced original postprocessor plugin globally')
         finally:
-            Config._plugin_dirs = []
-            importlib.invalidate_caches()  # reset the import caches
+            sys.path.remove(str(reload_plugins_path))
+            sys.path.append(str(TEST_DATA_DIR))
+            importlib.invalidate_caches()
+
+    def test_extractor_override_plugin(self):
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+
+        from yt_dlp.extractor.generic import GenericIE
+
+        self.assertEqual(GenericIE.TEST_FIELD, 'override')
+        self.assertEqual(GenericIE.SECONDARY_TEST_FIELD, 'underscore-override')
+
+        self.assertEqual(GenericIE.IE_NAME, 'generic+override+underscore-override')
+        importlib.invalidate_caches()
+        # test that loading a second time doesn't wrap a second time
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+        from yt_dlp.extractor.generic import GenericIE
+        self.assertEqual(GenericIE.IE_NAME, 'generic+override+underscore-override')
+
+    def test_load_all_plugin_types(self):
+        # no plugin specs registered
+        load_all_plugins()
+
+        self.assertNotIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
+        self.assertNotIn(f'{PACKAGE_NAME}.postprocessor.normal', sys.modules.keys())
+
+        register_plugin_spec(EXTRACTOR_PLUGIN_SPEC)
+        register_plugin_spec(POSTPROCESSOR_PLUGIN_SPEC)
+        load_all_plugins()
+        self.assertTrue(all_plugins_loaded.value)
+
+        self.assertIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
+        self.assertIn(f'{PACKAGE_NAME}.postprocessor.normal', sys.modules.keys())
+
+    def test_no_plugin_dirs(self):
+        register_plugin_spec(EXTRACTOR_PLUGIN_SPEC)
+        register_plugin_spec(POSTPROCESSOR_PLUGIN_SPEC)
+
+        plugin_dirs.value = []
+        load_all_plugins()
+
+        self.assertNotIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
+        self.assertNotIn(f'{PACKAGE_NAME}.postprocessor.normal', sys.modules.keys())
+
+    def test_set_plugin_dirs(self):
+        custom_plugin_dir = str(TEST_DATA_DIR / 'plugin_packages')
+        plugin_dirs.value = [custom_plugin_dir]
+
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+
+        self.assertIn(f'{PACKAGE_NAME}.extractor.package', sys.modules.keys())
+        self.assertIn('PackagePluginIE', plugin_ies.value)
+
+    def test_invalid_plugin_dir(self):
+        plugin_dirs.value = ['invalid_dir']
+        with self.assertRaises(ValueError):
+            load_plugins(EXTRACTOR_PLUGIN_SPEC)
+
+    def test_append_plugin_dirs(self):
+        custom_plugin_dir = str(TEST_DATA_DIR / 'plugin_packages')
+
+        self.assertEqual(plugin_dirs.value, ['default'])
+        plugin_dirs.value.append(custom_plugin_dir)
+        self.assertEqual(plugin_dirs.value, ['default', custom_plugin_dir])
+
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+
+        self.assertIn(f'{PACKAGE_NAME}.extractor.package', sys.modules.keys())
+        self.assertIn('PackagePluginIE', plugin_ies.value)
+
+    def test_get_plugin_spec(self):
+        register_plugin_spec(EXTRACTOR_PLUGIN_SPEC)
+        register_plugin_spec(POSTPROCESSOR_PLUGIN_SPEC)
+
+        self.assertEqual(plugin_specs.value.get('extractor'), EXTRACTOR_PLUGIN_SPEC)
+        self.assertEqual(plugin_specs.value.get('postprocessor'), POSTPROCESSOR_PLUGIN_SPEC)
+        self.assertIsNone(plugin_specs.value.get('invalid'))
+
 
 if __name__ == '__main__':
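Taken together, these tests pin down the reworked plugin API: a PluginSpec describes where a plugin type's modules live and where loaded classes are recorded, and register_plugin_spec makes the spec visible to load_all_plugins. Wiring up a spec looks like this (mirroring the postprocessor spec defined above):

    from yt_dlp.globals import plugin_pps, postprocessors
    from yt_dlp.plugins import PluginSpec, register_plugin_spec

    spec = PluginSpec(
        module_name='postprocessor',    # scans yt_dlp_plugins.postprocessor
        suffix='PP',                    # classes ending in 'PP' are collected
        destination=postprocessors,     # merged into the global registry
        plugin_destination=plugin_pps,  # plugin-only view, as asserted in the tests
    )
    register_plugin_spec(spec)          # load_all_plugins() will now pick it up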


@@ -23,7 +23,6 @@ from yt_dlp.extractor import (
     TedTalkIE,
     ThePlatformFeedIE,
     ThePlatformIE,
-    VikiIE,
     VimeoIE,
     WallaIE,
     YoutubeIE,
@@ -331,20 +330,6 @@ class TestRaiPlaySubtitles(BaseTestSubtitles):
         self.assertEqual(md5(subtitles['it']), '4b3264186fbb103508abe5311cfcb9cd')
 
-@is_download_test
-@unittest.skip('IE broken - DRM only')
-class TestVikiSubtitles(BaseTestSubtitles):
-    url = 'http://www.viki.com/videos/1060846v-punch-episode-18'
-    IE = VikiIE
-
-    def test_allsubtitles(self):
-        self.DL.params['writesubtitles'] = True
-        self.DL.params['allsubtitles'] = True
-        subtitles = self.getSubtitles()
-        self.assertEqual(set(subtitles.keys()), {'en'})
-        self.assertEqual(md5(subtitles['en']), '53cb083a5914b2d84ef1ab67b880d18a')
-
 
 @is_download_test
 class TestThePlatformSubtitles(BaseTestSubtitles):
     # from http://www.3playmedia.com/services-features/tools/integrations/theplatform/


@@ -3,19 +3,20 @@
 # Allow direct execution
 import os
 import sys
-import unittest
-import unittest.mock
-import warnings
-import datetime as dt
 
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 
 import contextlib
+import datetime as dt
 import io
 import itertools
 import json
+import pickle
 import subprocess
+import unittest
+import unittest.mock
+import warnings
 import xml.etree.ElementTree
 
 from yt_dlp.compat import (
@@ -218,11 +219,8 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(sanitize_filename('_BD_eEpuzXw', is_id=True), '_BD_eEpuzXw')
         self.assertEqual(sanitize_filename('N0Y__7-UOdI', is_id=True), 'N0Y__7-UOdI')
 
+    @unittest.mock.patch('sys.platform', 'win32')
     def test_sanitize_path(self):
-        with unittest.mock.patch('sys.platform', 'win32'):
-            self._test_sanitize_path()
-
-    def _test_sanitize_path(self):
         self.assertEqual(sanitize_path('abc'), 'abc')
         self.assertEqual(sanitize_path('abc/def'), 'abc\\def')
         self.assertEqual(sanitize_path('abc\\def'), 'abc\\def')
@@ -253,10 +251,8 @@ class TestUtil(unittest.TestCase):
         # Check with nt._path_normpath if available
         try:
-            import nt
-
-            nt_path_normpath = getattr(nt, '_path_normpath', None)
-        except Exception:
+            from nt import _path_normpath as nt_path_normpath
+        except ImportError:
             nt_path_normpath = None
 
         for test, expected in [
@@ -1264,6 +1260,7 @@ class TestUtil(unittest.TestCase):
     def test_js_to_json_malformed(self):
         self.assertEqual(js_to_json('42a1'), '42"a1"')
         self.assertEqual(js_to_json('42a-1'), '42"a"-1')
+        self.assertEqual(js_to_json('{a: `${e("")}`}'), '{"a": "\\"e\\"(\\"\\")"}')
 
     def test_js_to_json_template_literal(self):
         self.assertEqual(js_to_json('`Hello ${name}`', {'name': '"world"'}), '"Hello world"')
@@ -2087,21 +2084,26 @@ Line 1
         headers = HTTPHeaderDict()
         headers['ytdl-test'] = b'0'
         self.assertEqual(list(headers.items()), [('Ytdl-Test', '0')])
+        self.assertEqual(list(headers.sensitive().items()), [('ytdl-test', '0')])
         headers['ytdl-test'] = 1
         self.assertEqual(list(headers.items()), [('Ytdl-Test', '1')])
+        self.assertEqual(list(headers.sensitive().items()), [('ytdl-test', '1')])
         headers['Ytdl-test'] = '2'
         self.assertEqual(list(headers.items()), [('Ytdl-Test', '2')])
+        self.assertEqual(list(headers.sensitive().items()), [('Ytdl-test', '2')])
         self.assertTrue('ytDl-Test' in headers)
         self.assertEqual(str(headers), str(dict(headers)))
         self.assertEqual(repr(headers), str(dict(headers)))
 
         headers.update({'X-dlp': 'data'})
         self.assertEqual(set(headers.items()), {('Ytdl-Test', '2'), ('X-Dlp', 'data')})
+        self.assertEqual(set(headers.sensitive().items()), {('Ytdl-test', '2'), ('X-dlp', 'data')})
         self.assertEqual(dict(headers), {'Ytdl-Test': '2', 'X-Dlp': 'data'})
         self.assertEqual(len(headers), 2)
         self.assertEqual(headers.copy(), headers)
-        headers2 = HTTPHeaderDict({'X-dlp': 'data3'}, **headers, **{'X-dlp': 'data2'})
+        headers2 = HTTPHeaderDict({'X-dlp': 'data3'}, headers, **{'X-dlP': 'data2'})
         self.assertEqual(set(headers2.items()), {('Ytdl-Test', '2'), ('X-Dlp', 'data2')})
+        self.assertEqual(set(headers2.sensitive().items()), {('Ytdl-test', '2'), ('X-dlP', 'data2')})
         self.assertEqual(len(headers2), 2)
         headers2.clear()
         self.assertEqual(len(headers2), 0)
@@ -2109,16 +2111,23 @@ Line 1
         # ensure we prefer latter headers
         headers3 = HTTPHeaderDict({'Ytdl-TeSt': 1}, {'Ytdl-test': 2})
         self.assertEqual(set(headers3.items()), {('Ytdl-Test', '2')})
+        self.assertEqual(set(headers3.sensitive().items()), {('Ytdl-test', '2')})
         del headers3['ytdl-tesT']
         self.assertEqual(dict(headers3), {})
 
         headers4 = HTTPHeaderDict({'ytdl-test': 'data;'})
         self.assertEqual(set(headers4.items()), {('Ytdl-Test', 'data;')})
+        self.assertEqual(set(headers4.sensitive().items()), {('ytdl-test', 'data;')})
 
         # common mistake: strip whitespace from values
         # https://github.com/yt-dlp/yt-dlp/issues/8729
         headers5 = HTTPHeaderDict({'ytdl-test': ' data; '})
         self.assertEqual(set(headers5.items()), {('Ytdl-Test', 'data;')})
+        self.assertEqual(set(headers5.sensitive().items()), {('ytdl-test', 'data;')})
+
+        # test if picklable
+        headers6 = HTTPHeaderDict(a=1, b=2)
+        self.assertEqual(pickle.loads(pickle.dumps(headers6)), headers6)
 
     def test_extract_basic_auth(self):
         assert extract_basic_auth('http://:foo.bar') == ('http://:foo.bar', None)
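In short, HTTPHeaderDict keeps case-insensitive lookup and title-cased iteration, while the new sensitive() view retains the most recently written casing (which is what keep_header_casing sends on the wire), and instances are now picklable. For example:

    from yt_dlp.utils.networking import HTTPHeaderDict

    headers = HTTPHeaderDict({'x-TeSt': 'val'})
    assert headers['X-TEST'] == 'val'                    # lookup ignores case
    assert list(headers.items()) == [('X-Test', 'val')]  # iteration title-cases
    assert list(headers.sensitive().items()) == [('x-TeSt', 'val')]  # original casing kept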


@@ -44,7 +44,7 @@ def websocket_handler(websocket):
             return websocket.send('2')
     elif isinstance(message, str):
         if message == 'headers':
-            return websocket.send(json.dumps(dict(websocket.request.headers)))
+            return websocket.send(json.dumps(dict(websocket.request.headers.raw_items())))
         elif message == 'path':
             return websocket.send(websocket.request.path)
         elif message == 'source_address':
@@ -266,18 +266,18 @@ class TestWebsSocketRequestHandlerConformance:
         with handler(cookiejar=cookiejar) as rh:
             ws = ws_validate_and_send(rh, Request(self.ws_base_url))
             ws.send('headers')
-            assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
+            assert HTTPHeaderDict(json.loads(ws.recv()))['cookie'] == 'test=ytdlp'
             ws.close()
 
         with handler() as rh:
             ws = ws_validate_and_send(rh, Request(self.ws_base_url))
             ws.send('headers')
-            assert 'cookie' not in json.loads(ws.recv())
+            assert 'cookie' not in HTTPHeaderDict(json.loads(ws.recv()))
             ws.close()
 
             ws = ws_validate_and_send(rh, Request(self.ws_base_url, extensions={'cookiejar': cookiejar}))
             ws.send('headers')
-            assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
+            assert HTTPHeaderDict(json.loads(ws.recv()))['cookie'] == 'test=ytdlp'
             ws.close()
 
     @pytest.mark.skip_handler('Websockets', 'Set-Cookie not supported by websockets')
@@ -287,7 +287,7 @@ class TestWebsSocketRequestHandlerConformance:
             ws_validate_and_send(rh, Request(f'{self.ws_base_url}/get_cookie', extensions={'cookiejar': YoutubeDLCookieJar()}))
             ws = ws_validate_and_send(rh, Request(self.ws_base_url, extensions={'cookiejar': YoutubeDLCookieJar()}))
             ws.send('headers')
-            assert 'cookie' not in json.loads(ws.recv())
+            assert 'cookie' not in HTTPHeaderDict(json.loads(ws.recv()))
             ws.close()
 
     @pytest.mark.skip_handler('Websockets', 'Set-Cookie not supported by websockets')
@@ -298,12 +298,12 @@ class TestWebsSocketRequestHandlerConformance:
             ws_validate_and_send(rh, Request(f'{self.ws_base_url}/get_cookie'))
             ws = ws_validate_and_send(rh, Request(self.ws_base_url))
             ws.send('headers')
-            assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
+            assert HTTPHeaderDict(json.loads(ws.recv()))['cookie'] == 'test=ytdlp'
             ws.close()
 
             cookiejar.clear_session_cookies()
             ws = ws_validate_and_send(rh, Request(self.ws_base_url))
             ws.send('headers')
-            assert 'cookie' not in json.loads(ws.recv())
+            assert 'cookie' not in HTTPHeaderDict(json.loads(ws.recv()))
             ws.close()
 
     def test_source_address(self, handler):
@@ -341,6 +341,14 @@ class TestWebsSocketRequestHandlerConformance:
             assert headers['test3'] == 'test3'
             ws.close()
 
+    def test_keep_header_casing(self, handler):
+        with handler(headers=HTTPHeaderDict({'x-TeSt1': 'test'})) as rh:
+            ws = ws_validate_and_send(rh, Request(self.ws_base_url, headers={'x-TeSt2': 'test'}, extensions={'keep_header_casing': True}))
+            ws.send('headers')
+            headers = json.loads(ws.recv())
+            assert 'x-TeSt1' in headers
+            assert 'x-TeSt2' in headers
+
     @pytest.mark.parametrize('client_cert', (
         {'client_certificate': os.path.join(MTLS_CERT_DIR, 'clientwithkey.crt')},
         {


@@ -78,6 +78,21 @@ _SIG_TESTS = [
         '2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
         '0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xxAj7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJ2OySqa0q',
     ),
+    (
+        'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js',
+        '2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
+        'AAOAOq0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xx8j7vgpDL0QwbdV06sCIEzpWqMGkFR20CFOS21Tp-7vj_EMu-m37KtXJoOy1',
+    ),
+    (
+        'https://www.youtube.com/s/player/363db69b/player_ias.vflset/en_US/base.js',
+        '2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
+        '0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpz2ICs6EVdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
+    ),
+    (
+        'https://www.youtube.com/s/player/4fcd6e4a/player_ias.vflset/en_US/base.js',
+        '2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
+        'wAOAOq0QJ8ARAIgXmPlOPSBkkUs1bYFYlJCfe29xx8q7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
+    ),
 ]
 
 _NSIG_TESTS = [
@@ -201,6 +216,42 @@ _NSIG_TESTS = [
         'https://www.youtube.com/s/player/2f1832d2/player_ias.vflset/en_US/base.js',
         'YWt1qdbe8SAfkoPHW5d', 'RrRjWQOJmBiP',
     ),
+    (
+        'https://www.youtube.com/s/player/9c6dfc4a/player_ias.vflset/en_US/base.js',
+        'jbu7ylIosQHyJyJV', 'uwI0ESiynAmhNg',
+    ),
+    (
+        'https://www.youtube.com/s/player/e7567ecf/player_ias_tce.vflset/en_US/base.js',
+        'Sy4aDGc0VpYRR9ew_', '5UPOT1VhoZxNLQ',
+    ),
+    (
+        'https://www.youtube.com/s/player/d50f54ef/player_ias_tce.vflset/en_US/base.js',
+        'Ha7507LzRmH3Utygtj', 'XFTb2HoeOE5MHg',
+    ),
+    (
+        'https://www.youtube.com/s/player/074a8365/player_ias_tce.vflset/en_US/base.js',
+        'Ha7507LzRmH3Utygtj', 'ufTsrE0IVYrkl8v',
+    ),
+    (
+        'https://www.youtube.com/s/player/643afba4/player_ias.vflset/en_US/base.js',
+        'N5uAlLqm0eg1GyHO', 'dCBQOejdq5s-ww',
+    ),
+    (
+        'https://www.youtube.com/s/player/69f581a5/tv-player-ias.vflset/tv-player-ias.js',
+        '-qIP447rVlTTwaZjY', 'KNcGOksBAvwqQg',
+    ),
+    (
+        'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js',
+        'ir9-V6cdbCiyKxhr', '2PL7ZDYAALMfmA',
+    ),
+    (
+        'https://www.youtube.com/s/player/363db69b/player_ias.vflset/en_US/base.js',
+        'eWYu5d5YeY_4LyEDc', 'XJQqf-N7Xra3gg',
+    ),
+    (
+        'https://www.youtube.com/s/player/4fcd6e4a/player_ias.vflset/en_US/base.js',
+        'o_L251jm8yhZkWtBW', 'lXoxI3XvToqn6A',
+    ),
 ]
@@ -214,6 +265,8 @@ class TestPlayerInfo(unittest.TestCase):
             ('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-en_US.vflset/base.js', '64dddad9'),
             ('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-de_DE.vflset/base.js', '64dddad9'),
             ('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-tablet-en_US.vflset/base.js', '64dddad9'),
+            ('https://www.youtube.com/s/player/e7567ecf/player_ias_tce.vflset/en_US/base.js', 'e7567ecf'),
+            ('https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js', '643afba4'),
             # obsolete
             ('https://www.youtube.com/yts/jsbin/player_ias-vfle4-e03/en_US/base.js', 'vfle4-e03'),
             ('https://www.youtube.com/yts/jsbin/player_ias-vfl49f_g4/en_US/base.js', 'vfl49f_g4'),
@@ -246,7 +299,7 @@ def t_factory(name, sig_func, url_pattern):
     def make_tfunc(url, sig_input, expected_sig):
         m = url_pattern.match(url)
         assert m, f'{url!r} should follow URL format'
-        test_id = m.group('id')
+        test_id = re.sub(r'[/.-]', '_', m.group('id') or m.group('compat_id'))
 
         def test_func(self):
             basename = f'player-{name}-{test_id}.js'
@@ -275,17 +328,22 @@ def n_sig(jscode, sig_input):
     ie = YoutubeIE(FakeYDL())
     funcname = ie._extract_n_function_name(jscode)
     jsi = JSInterpreter(jscode)
-    func = jsi.extract_function_from_code(*ie._fixup_n_function_code(*jsi.extract_function_code(funcname)))
+    func = jsi.extract_function_from_code(*ie._fixup_n_function_code(*jsi.extract_function_code(funcname), jscode))
     return func([sig_input])
 
 
 make_sig_test = t_factory(
-    'signature', signature, re.compile(r'.*(?:-|/player/)(?P<id>[a-zA-Z0-9_-]+)(?:/.+\.js|(?:/watch_as3|/html5player)?\.[a-z]+)$'))
+    'signature', signature,
+    re.compile(r'''(?x)
+        .+(?:
+            /player/(?P<id>[a-zA-Z0-9_/.-]+)|
+            /html5player-(?:en_US-)?(?P<compat_id>[a-zA-Z0-9_-]+)(?:/watch_as3|/html5player)?
+        )\.js$'''))
 for test_spec in _SIG_TESTS:
     make_sig_test(*test_spec)
 
 make_nsig_test = t_factory(
-    'nsig', n_sig, re.compile(r'.+/player/(?P<id>[a-zA-Z0-9_-]+)/.+.js$'))
+    'nsig', n_sig, re.compile(r'.+/player/(?P<id>[a-zA-Z0-9_/.-]+)\.js$'))
 
 for test_spec in _NSIG_TESTS:
     make_nsig_test(*test_spec)
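Since player ids may now contain '/', '.' and '-' (for example the tv-player paths added above), t_factory flattens them before building cache filenames and test method names. A quick check of that sanitization against the widened nsig pattern:

    import re

    m = re.match(r'.+/player/(?P<id>[a-zA-Z0-9_/.-]+)\.js$',
                 'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js')
    test_id = re.sub(r'[/.-]', '_', m.group('id'))
    print(test_id)  # 643afba4_tv_player_ias_vflset_tv_player_ias -> 'player-nsig-<test_id>.js'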


@@ -2,4 +2,5 @@ from yt_dlp.extractor.common import InfoExtractor
 
 
 class PackagePluginIE(InfoExtractor):
+    _VALID_URL = 'package'
     pass


@@ -0,0 +1,10 @@
+from yt_dlp.extractor.common import InfoExtractor
+
+
+class NormalPluginIE(InfoExtractor):
+    _VALID_URL = 'normal'
+    REPLACED = True
+
+
+class _IgnoreUnderscorePluginIE(InfoExtractor):
+    pass


@@ -0,0 +1,5 @@
+from yt_dlp.postprocessor.common import PostProcessor
+
+
+class NormalPluginPP(PostProcessor):
+    REPLACED = True


@@ -6,6 +6,7 @@ class IgnoreNotInAllPluginIE(InfoExtractor):
 
 
 class InAllPluginIE(InfoExtractor):
+    _VALID_URL = 'inallpluginie'
     pass


@@ -2,8 +2,10 @@ from yt_dlp.extractor.common import InfoExtractor
 
 
 class NormalPluginIE(InfoExtractor):
-    pass
+    _VALID_URL = 'normalpluginie'
+    REPLACED = False
 
 
 class _IgnoreUnderscorePluginIE(InfoExtractor):
+    _VALID_URL = 'ignoreunderscorepluginie'
     pass


@@ -0,0 +1,5 @@
+from yt_dlp.extractor.generic import GenericIE
+
+
+class OverrideGenericIE(GenericIE, plugin_name='override'):
+    TEST_FIELD = 'override'


@@ -0,0 +1,5 @@
+from yt_dlp.extractor.generic import GenericIE
+
+
+class _UnderscoreOverrideGenericIE(GenericIE, plugin_name='underscore-override'):
+    SECONDARY_TEST_FIELD = 'underscore-override'


@@ -2,4 +2,4 @@ from yt_dlp.postprocessor.common import PostProcessor
 
 
 class NormalPluginPP(PostProcessor):
-    pass
+    REPLACED = False


@@ -2,4 +2,5 @@ from yt_dlp.extractor.common import InfoExtractor
 
 
 class ZippedPluginIE(InfoExtractor):
+    _VALID_URL = 'zippedpluginie'
     pass


@@ -30,9 +30,18 @@ from .compat import urllib_req_to_req
 from .cookies import CookieLoadError, LenientSimpleCookie, load_cookies
 from .downloader import FFmpegFD, get_suitable_downloader, shorten_protocol_name
 from .downloader.rtmp import rtmpdump_version
-from .extractor import gen_extractor_classes, get_info_extractor
+from .extractor import gen_extractor_classes, get_info_extractor, import_extractors
 from .extractor.common import UnsupportedURLIE
 from .extractor.openload import PhantomJSwrapper
+from .globals import (
+    IN_CLI,
+    LAZY_EXTRACTORS,
+    plugin_ies,
+    plugin_ies_overrides,
+    plugin_pps,
+    all_plugins_loaded,
+    plugin_dirs,
+)
 from .minicurses import format_text
 from .networking import HEADRequest, Request, RequestDirector
 from .networking.common import _REQUEST_HANDLERS, _RH_PREFERENCES
@@ -44,8 +53,7 @@ from .networking.exceptions import (
     network_exceptions,
 )
 from .networking.impersonate import ImpersonateRequestHandler
-from .plugins import directories as plugin_directories
-from .postprocessor import _PLUGIN_CLASSES as plugin_pps
+from .plugins import directories as plugin_directories, load_all_plugins
 from .postprocessor import (
     EmbedThumbnailPP,
     FFmpegFixupDuplicateMoovPP,
@@ -157,7 +165,7 @@ from .utils import (
     write_json_file,
     write_string,
 )
-from .utils._utils import _UnsafeExtensionError, _YDLLogger
+from .utils._utils import _UnsafeExtensionError, _YDLLogger, _ProgressState
 from .utils.networking import (
     HTTPHeaderDict,
     clean_headers,
@@ -598,7 +606,7 @@ class YoutubeDL:
         # NB: Keep in sync with the docstring of extractor/common.py
         'url', 'manifest_url', 'manifest_stream_number', 'ext', 'format', 'format_id', 'format_note',
         'width', 'height', 'aspect_ratio', 'resolution', 'dynamic_range', 'tbr', 'abr', 'acodec', 'asr', 'audio_channels',
-        'vbr', 'fps', 'vcodec', 'container', 'filesize', 'filesize_approx', 'rows', 'columns',
+        'vbr', 'fps', 'vcodec', 'container', 'filesize', 'filesize_approx', 'rows', 'columns', 'hls_media_playlist_data',
         'player_url', 'protocol', 'fragment_base_url', 'fragments', 'is_from_start', 'is_dash_periods', 'request_data',
         'preference', 'language', 'language_preference', 'quality', 'source_preference', 'cookies',
         'http_headers', 'stretched_ratio', 'no_resume', 'has_drm', 'extra_param_to_segment_url', 'extra_param_to_key_url',
@@ -642,13 +650,15 @@ class YoutubeDL:
         self.cache = Cache(self)
         self.__header_cookies = []
 
+        # compat for API: load plugins if they have not already
+        if not all_plugins_loaded.value:
+            load_all_plugins()
+
         stdout = sys.stderr if self.params.get('logtostderr') else sys.stdout
         self._out_files = Namespace(
             out=stdout,
             error=sys.stderr,
             screen=sys.stderr if self.params.get('quiet') else stdout,
-            console=None if os.name == 'nt' else next(
-                filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None),
         )
 
         try:
@@ -656,6 +666,9 @@ class YoutubeDL:
         except Exception as e:
             self.write_debug(f'Failed to enable VT mode: {e}')
 
+        # hehe "immutable" namespace
+        self._out_files.console = next(filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None)
+
         if self.params.get('no_color'):
             if self.params.get('color') is not None:
                 self.params.setdefault('_warnings', []).append(
@@ -956,21 +969,22 @@ class YoutubeDL:
             self._write_string(f'{self._bidi_workaround(message)}\n', self._out_files.error, only_once=only_once)
 
     def _send_console_code(self, code):
-        if os.name == 'nt' or not self._out_files.console:
-            return
+        if not supports_terminal_sequences(self._out_files.console):
+            return False
         self._write_string(code, self._out_files.console)
+        return True
 
-    def to_console_title(self, message):
-        if not self.params.get('consoletitle', False):
+    def to_console_title(self, message=None, progress_state=None, percent=None):
+        if not self.params.get('consoletitle'):
             return
-        message = remove_terminal_sequences(message)
-        if os.name == 'nt':
-            if ctypes.windll.kernel32.GetConsoleWindow():
-                # c_wchar_p() might not be necessary if `message` is
-                # already of type unicode()
-                ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
-        else:
-            self._send_console_code(f'\033]0;{message}\007')
+
+        if message:
+            success = self._send_console_code(f'\033]0;{remove_terminal_sequences(message)}\007')
+            if not success and os.name == 'nt' and ctypes.windll.kernel32.GetConsoleWindow():
+                ctypes.windll.kernel32.SetConsoleTitleW(message)
+
+        if isinstance(progress_state, _ProgressState):
+            self._send_console_code(progress_state.get_ansi_escape(percent))
 
     def save_console_title(self):
         if not self.params.get('consoletitle') or self.params.get('simulate'):
@@ -984,6 +998,7 @@ class YoutubeDL:
 
     def __enter__(self):
         self.save_console_title()
+        self.to_console_title(progress_state=_ProgressState.INDETERMINATE)
         return self
 
     def save_cookies(self):
@@ -992,6 +1007,7 @@ class YoutubeDL:
 
     def __exit__(self, *args):
         self.restore_console_title()
+        self.to_console_title(progress_state=_ProgressState.HIDDEN)
         self.close()
 
     def close(self):
@@ -3993,15 +4009,6 @@ class YoutubeDL:
         if not self.params.get('verbose'):
             return
 
-        from . import _IN_CLI  # Must be delayed import
-
-        # These imports can be slow. So import them only as needed
-        from .extractor.extractors import _LAZY_LOADER
-        from .extractor.extractors import (
-            _PLUGIN_CLASSES as plugin_ies,
-            _PLUGIN_OVERRIDES as plugin_ie_overrides,
-        )
-
         def get_encoding(stream):
             ret = str(getattr(stream, 'encoding', f'missing ({type(stream).__name__})'))
             additional_info = []
@@ -4040,17 +4047,18 @@ class YoutubeDL:
             _make_label(ORIGIN, CHANNEL.partition('@')[2] or __version__, __version__),
             f'[{RELEASE_GIT_HEAD[:9]}]' if RELEASE_GIT_HEAD else '',
             '' if source == 'unknown' else f'({source})',
-            '' if _IN_CLI else 'API' if klass == YoutubeDL else f'API:{self.__module__}.{klass.__qualname__}',
+            '' if IN_CLI.value else 'API' if klass == YoutubeDL else f'API:{self.__module__}.{klass.__qualname__}',
             delim=' '))
 
-        if not _IN_CLI:
+        if not IN_CLI.value:
             write_debug(f'params: {self.params}')
 
-        if not _LAZY_LOADER:
-            if os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'):
-                write_debug('Lazy loading extractors is forcibly disabled')
-            else:
-                write_debug('Lazy loading extractors is disabled')
+        import_extractors()
+        lazy_extractors = LAZY_EXTRACTORS.value
+        if lazy_extractors is None:
+            write_debug('Lazy loading extractors is disabled')
+        elif not lazy_extractors:
+            write_debug('Lazy loading extractors is forcibly disabled')
 
         if self.params['compat_opts']:
             write_debug('Compatibility options: {}'.format(', '.join(self.params['compat_opts'])))
@@ -4079,24 +4087,27 @@ class YoutubeDL:
         write_debug(f'Proxy map: {self.proxies}')
         write_debug(f'Request Handlers: {", ".join(rh.RH_NAME for rh in self._request_director.handlers.values())}')
 
-        if os.environ.get('YTDLP_NO_PLUGINS'):
-            write_debug('Plugins are forcibly disabled')
-            return
-
-        for plugin_type, plugins in {'Extractor': plugin_ies, 'Post-Processor': plugin_pps}.items():
-            display_list = ['{}{}'.format(
-                klass.__name__, '' if klass.__name__ == name else f' as {name}')
-                for name, klass in plugins.items()]
+        for plugin_type, plugins in (('Extractor', plugin_ies), ('Post-Processor', plugin_pps)):
+            display_list = [
+                klass.__name__ if klass.__name__ == name else f'{klass.__name__} as {name}'
+                for name, klass in plugins.value.items()]
             if plugin_type == 'Extractor':
                 display_list.extend(f'{plugins[-1].IE_NAME.partition("+")[2]} ({parent.__name__})'
-                                    for parent, plugins in plugin_ie_overrides.items())
+                                    for parent, plugins in plugin_ies_overrides.value.items())
             if not display_list:
                 continue
             write_debug(f'{plugin_type} Plugins: {", ".join(sorted(display_list))}')
 
-        plugin_dirs = plugin_directories()
-        if plugin_dirs:
-            write_debug(f'Plugin directories: {plugin_dirs}')
+        plugin_dirs_msg = 'none'
+        if not plugin_dirs.value:
+            plugin_dirs_msg = 'none (disabled)'
+        else:
+            found_plugin_directories = plugin_directories()
+            if found_plugin_directories:
+                plugin_dirs_msg = ', '.join(found_plugin_directories)
+
+        write_debug(f'Plugin directories: {plugin_dirs_msg}')
 
     @functools.cached_property
     def proxies(self):
@@ -4141,7 +4152,7 @@ class YoutubeDL:
             (target, rh.RH_NAME)
             for rh in self._request_director.handlers.values()
             if isinstance(rh, ImpersonateRequestHandler)
-            for target in rh.supported_targets
+            for target in reversed(rh.supported_targets)
         ]
 
     def _impersonate_target_available(self, target):
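Note on the `IN_CLI.value` / `plugin_dirs.value` accesses above: they come from the new `yt_dlp.globals` module, which stores shared state in small mutable boxes so that plugins and the API see the same values. A minimal sketch of the idea, assuming a simple box design (the class body here is illustrative, not the module's exact contents):

# Hypothetical sketch of the indirection box pattern used by yt_dlp.globals
class Indirect:
    def __init__(self, initial):
        self.value = initial  # state is read and reassigned via .value

    def __repr__(self):
        return f'{type(self).__name__}({self.value!r})'

IN_CLI = Indirect(False)         # flipped to True by the CLI entry point
plugin_dirs = Indirect(['default'])

# Consumers import the box, not the value, so later updates are visible everywhere
IN_CLI.value = True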


@@ -19,7 +19,9 @@ from .downloader.external import get_external_downloader
 from .extractor import list_extractor_classes
 from .extractor.adobepass import MSO_INFO
 from .networking.impersonate import ImpersonateTarget
+from .globals import IN_CLI, plugin_dirs
 from .options import parseOpts
+from .plugins import load_all_plugins as _load_all_plugins
 from .postprocessor import (
     FFmpegExtractAudioPP,
     FFmpegMergerPP,
@@ -33,7 +35,6 @@ from .postprocessor import (
 )
 from .update import Updater
 from .utils import (
-    Config,
     NO_DEFAULT,
     POSTPROCESS_WHEN,
     DateRange,
@@ -66,8 +67,6 @@ from .utils.networking import std_headers
 from .utils._utils import _UnsafeExtensionError
 from .YoutubeDL import YoutubeDL
 
-_IN_CLI = False
-
 
 def _exit(status=0, *args):
     for msg in args:
@@ -295,18 +294,20 @@ def validate_options(opts):
             raise ValueError(f'invalid {key} retry sleep expression {expr!r}')
 
     # Bytes
-    def validate_bytes(name, value):
+    def validate_bytes(name, value, strict_positive=False):
         if value is None:
             return None
         numeric_limit = parse_bytes(value)
-        validate(numeric_limit is not None, 'rate limit', value)
+        validate(numeric_limit is not None, name, value)
+        if strict_positive:
+            validate_positive(name, numeric_limit, True)
         return numeric_limit
 
-    opts.ratelimit = validate_bytes('rate limit', opts.ratelimit)
+    opts.ratelimit = validate_bytes('rate limit', opts.ratelimit, True)
     opts.throttledratelimit = validate_bytes('throttled rate limit', opts.throttledratelimit)
     opts.min_filesize = validate_bytes('min filesize', opts.min_filesize)
     opts.max_filesize = validate_bytes('max filesize', opts.max_filesize)
-    opts.buffersize = validate_bytes('buffer size', opts.buffersize)
+    opts.buffersize = validate_bytes('buffer size', opts.buffersize, True)
     opts.http_chunk_size = validate_bytes('http chunk size', opts.http_chunk_size)
 
     # Output templates
@@ -431,6 +432,10 @@ def validate_options(opts):
     }
 
     # Other options
+    opts.plugin_dirs = opts.plugin_dirs
+    if opts.plugin_dirs is None:
+        opts.plugin_dirs = ['default']
+
     if opts.playlist_items is not None:
         try:
             tuple(PlaylistEntries.parse_playlist_items(opts.playlist_items))
@@ -971,11 +976,6 @@ def _real_main(argv=None):
     parser, opts, all_urls, ydl_opts = parse_options(argv)
 
-    # HACK: Set the plugin dirs early on
-    # TODO(coletdjnz): remove when plugin globals system is implemented
-    if opts.plugin_dirs is not None:
-        Config._plugin_dirs = list(map(expand_path, opts.plugin_dirs))
-
     # Dump user agent
     if opts.dump_user_agent:
         ua = traverse_obj(opts.headers, 'User-Agent', casesense=False, default=std_headers['User-Agent'])
@@ -990,6 +990,11 @@ def _real_main(argv=None):
     if opts.ffmpeg_location:
         FFmpegPostProcessor._ffmpeg_location.set(opts.ffmpeg_location)
 
+    # load all plugins into the global lookup
+    plugin_dirs.value = opts.plugin_dirs
+    if plugin_dirs.value:
+        _load_all_plugins()
+
     with YoutubeDL(ydl_opts) as ydl:
         pre_process = opts.update_self or opts.rm_cachedir
         actual_use = all_urls or opts.load_info_filename
@@ -1016,8 +1021,9 @@ def _real_main(argv=None):
         # List of simplified targets we know are supported,
         # to help users know what dependencies may be required.
         (ImpersonateTarget('chrome'), 'curl_cffi'),
-        (ImpersonateTarget('edge'), 'curl_cffi'),
         (ImpersonateTarget('safari'), 'curl_cffi'),
+        (ImpersonateTarget('firefox'), 'curl_cffi>=0.10'),
+        (ImpersonateTarget('edge'), 'curl_cffi'),
     ]
 
     available_targets = ydl._get_available_impersonate_targets()
@@ -1033,12 +1039,12 @@ def _real_main(argv=None):
     for known_target, known_handler in known_targets:
         if not any(
-                known_target in target and handler == known_handler
+                known_target in target and known_handler.startswith(handler)
                 for target, handler in available_targets
         ):
-            rows.append([
+            rows.insert(0, [
                 ydl._format_out(text, ydl.Styles.SUPPRESS)
-                for text in make_row(known_target, f'{known_handler} (not available)')
+                for text in make_row(known_target, f'{known_handler} (unavailable)')
             ])
 
     ydl.to_screen('[info] Available impersonate targets')
@@ -1089,8 +1095,7 @@ def _real_main(argv=None):
 def main(argv=None):
-    global _IN_CLI
-    _IN_CLI = True
+    IN_CLI.value = True
     try:
         _exit(*variadic(_real_main(argv)))
     except (CookieLoadError, DownloadError):
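For context on the byte-size options being validated above: `validate_bytes` delegates to `yt_dlp.utils.parse_bytes`, which accepts a number with an optional binary-prefix suffix and returns None for unparsable input, which `validate()` then rejects. A quick illustration (the helper is real; the calls are just examples):

from yt_dlp.utils import parse_bytes

print(parse_bytes('64K'))   # 65536 -- binary multiplier, 64 * 1024
print(parse_bytes('10M'))   # 10485760
print(parse_bytes('abc'))   # None -> rejected by validate()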


@@ -83,7 +83,7 @@ def aes_ecb_encrypt(data, key, iv=None):
     @returns {int[]} encrypted data
     """
     expanded_key = key_expansion(key)
-    block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
+    block_count = ceil(len(data) / BLOCK_SIZE_BYTES)
 
     encrypted_data = []
     for i in range(block_count):
@@ -103,7 +103,7 @@ def aes_ecb_decrypt(data, key, iv=None):
     @returns {int[]} decrypted data
     """
     expanded_key = key_expansion(key)
-    block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
+    block_count = ceil(len(data) / BLOCK_SIZE_BYTES)
 
     encrypted_data = []
     for i in range(block_count):
@@ -134,7 +134,7 @@ def aes_ctr_encrypt(data, key, iv):
     @returns {int[]} encrypted data
     """
     expanded_key = key_expansion(key)
-    block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
+    block_count = ceil(len(data) / BLOCK_SIZE_BYTES)
     counter = iter_vector(iv)
 
     encrypted_data = []
@@ -158,7 +158,7 @@ def aes_cbc_decrypt(data, key, iv):
     @returns {int[]} decrypted data
     """
     expanded_key = key_expansion(key)
-    block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
+    block_count = ceil(len(data) / BLOCK_SIZE_BYTES)
 
     decrypted_data = []
     previous_cipher_block = iv
@@ -183,7 +183,7 @@ def aes_cbc_encrypt(data, key, iv, *, padding_mode='pkcs7'):
     @returns {int[]} encrypted data
     """
     expanded_key = key_expansion(key)
-    block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
+    block_count = ceil(len(data) / BLOCK_SIZE_BYTES)
 
     encrypted_data = []
     previous_cipher_block = iv
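Side note on this simplification: under Python 3, true division plus `math.ceil` gives the block count directly, so the Python 2-era `int(ceil(float(...)))` wrapping was redundant. A quick self-check of the arithmetic (plain Python, independent of yt-dlp):

from math import ceil

BLOCK_SIZE_BYTES = 16  # AES block size

for length in (0, 1, 15, 16, 17, 32, 33):
    # ceil-division identity: ceil(a / b) == -(-a // b) for positive b
    assert ceil(length / BLOCK_SIZE_BYTES) == -(-length // BLOCK_SIZE_BYTES)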


@@ -35,6 +35,7 @@ from .rtmp import RtmpFD
 from .rtsp import RtspFD
 from .websocket import WebSocketFragmentFD
 from .youtube_live_chat import YoutubeLiveChatFD
+from .bunnycdn import BunnyCdnFD
 
 PROTOCOL_MAP = {
     'rtmp': RtmpFD,
@@ -55,6 +56,7 @@ PROTOCOL_MAP = {
     'websocket_frag': WebSocketFragmentFD,
     'youtube_live_chat': YoutubeLiveChatFD,
     'youtube_live_chat_replay': YoutubeLiveChatFD,
+    'bunnycdn': BunnyCdnFD,
 }


@@ -0,0 +1,50 @@
+import hashlib
+import random
+import threading
+
+from .common import FileDownloader
+from . import HlsFD
+from ..networking import Request
+from ..networking.exceptions import network_exceptions
+
+
+class BunnyCdnFD(FileDownloader):
+    """
+    Downloads from BunnyCDN with required pings
+    Note, this is not a part of public API, and will be removed without notice.
+    DO NOT USE
+    """
+
+    def real_download(self, filename, info_dict):
+        self.to_screen(f'[{self.FD_NAME}] Downloading from BunnyCDN')
+
+        fd = HlsFD(self.ydl, self.params)
+
+        stop_event = threading.Event()
+        ping_thread = threading.Thread(target=self.ping_thread, args=(stop_event,), kwargs=info_dict['_bunnycdn_ping_data'])
+        ping_thread.start()
+
+        try:
+            return fd.real_download(filename, info_dict)
+        finally:
+            stop_event.set()
+
+    def ping_thread(self, stop_event, url, headers, secret, context_id):
+        # Site sends ping every 4 seconds, but this throttles the download. Pinging every 2 seconds seems to work.
+        ping_interval = 2
+        # Hard coded resolution as it doesn't seem to matter
+        res = 1080
+        paused = 'false'
+        current_time = 0
+
+        while not stop_event.wait(ping_interval):
+            current_time += ping_interval
+
+            time = current_time + round(random.random(), 6)
+            md5_hash = hashlib.md5(f'{secret}_{context_id}_{time}_{paused}_{res}'.encode()).hexdigest()
+            ping_url = f'{url}?hash={md5_hash}&time={time}&paused={paused}&resolution={res}'
+
+            try:
+                self.ydl.urlopen(Request(ping_url, headers=headers)).read()
+            except network_exceptions as e:
+                self.to_screen(f'[{self.FD_NAME}] Ping failed: {e}')
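For clarity, the ping URL built in `ping_thread` can be reproduced standalone. The values below are made-up placeholders; only the hash construction mirrors the downloader above:

import hashlib

def build_ping_url(url, secret, context_id, time, paused='false', res=1080):
    # Same digest recipe as BunnyCdnFD.ping_thread: secret_context_time_paused_res
    md5_hash = hashlib.md5(f'{secret}_{context_id}_{time}_{paused}_{res}'.encode()).hexdigest()
    return f'{url}?hash={md5_hash}&time={time}&paused={paused}&resolution={res}'

print(build_ping_url('https://example.invalid/ping', 'secret', 'ctx', 2.123456))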


@@ -31,6 +31,7 @@ from ..utils import (
     timetuple_from_msec,
     try_call,
 )
+from ..utils._utils import _ProgressState
 
 
 class FileDownloader:
@@ -333,7 +334,7 @@ class FileDownloader:
             progress_dict), s.get('progress_idx') or 0)
         self.to_console_title(self.ydl.evaluate_outtmpl(
             progress_template.get('download-title') or 'yt-dlp %(progress._default_template)s',
-            progress_dict))
+            progress_dict), _ProgressState.from_dict(s), s.get('_percent'))
 
     def _format_progress(self, *args, **kwargs):
         return self.ydl._format_text(
@@ -357,6 +358,7 @@ class FileDownloader:
                 '_speed_str': self.format_speed(speed).strip(),
                 '_total_bytes_str': _format_bytes('total_bytes'),
                 '_elapsed_str': self.format_seconds(s.get('elapsed')),
+                '_percent': 100.0,
                 '_percent_str': self.format_percent(100),
             })
             self._report_progress_status(s, join_nonempty(
@@ -375,13 +377,15 @@ class FileDownloader:
                 return
             self._progress_delta_time += update_delta
 
+        progress = try_call(
+            lambda: 100 * s['downloaded_bytes'] / s['total_bytes'],
+            lambda: 100 * s['downloaded_bytes'] / s['total_bytes_estimate'],
+            lambda: s['downloaded_bytes'] == 0 and 0)
         s.update({
             '_eta_str': self.format_eta(s.get('eta')).strip(),
             '_speed_str': self.format_speed(s.get('speed')),
-            '_percent_str': self.format_percent(try_call(
-                lambda: 100 * s['downloaded_bytes'] / s['total_bytes'],
-                lambda: 100 * s['downloaded_bytes'] / s['total_bytes_estimate'],
-                lambda: s['downloaded_bytes'] == 0 and 0)),
+            '_percent': progress,
+            '_percent_str': self.format_percent(progress),
             '_total_bytes_str': _format_bytes('total_bytes'),
             '_total_bytes_estimate_str': _format_bytes('total_bytes_estimate'),
             '_downloaded_bytes_str': _format_bytes('downloaded_bytes'),
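The hoisted `progress` value is a nice demonstration of `yt_dlp.utils.try_call`: the lambdas are attempted in order and the first one that doesn't raise wins (KeyError, ZeroDivisionError and friends are swallowed), so a missing `total_bytes` quietly falls back to the estimate. A small illustration with a hypothetical status dict:

from yt_dlp.utils import try_call

s = {'downloaded_bytes': 512, 'total_bytes_estimate': 2048}  # hypothetical status dict
progress = try_call(
    lambda: 100 * s['downloaded_bytes'] / s['total_bytes'],           # KeyError -> try next
    lambda: 100 * s['downloaded_bytes'] / s['total_bytes_estimate'],  # succeeds
    lambda: s['downloaded_bytes'] == 0 and 0)
assert progress == 25.0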


@@ -457,8 +457,6 @@ class FFmpegFD(ExternalFD):
     @classmethod
     def available(cls, path=None):
-        # TODO: Fix path for ffmpeg
-        # Fixme: This may be wrong when --ffmpeg-location is used
         return FFmpegPostProcessor().available
 
     def on_process_started(self, proc, stdin):


@@ -16,6 +16,7 @@ from ..utils import (
     update_url_query,
     urljoin,
 )
+from ..utils._utils import _request_dump_filename
 
 
 class HlsFD(FragmentFD):
@@ -72,11 +73,23 @@ class HlsFD(FragmentFD):
     def real_download(self, filename, info_dict):
         man_url = info_dict['url']
-        self.to_screen(f'[{self.FD_NAME}] Downloading m3u8 manifest')
-        urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url))
-        man_url = urlh.url
-        s = urlh.read().decode('utf-8', 'ignore')
+
+        s = info_dict.get('hls_media_playlist_data')
+        if s:
+            self.to_screen(f'[{self.FD_NAME}] Using m3u8 manifest from extracted info')
+        else:
+            self.to_screen(f'[{self.FD_NAME}] Downloading m3u8 manifest')
+            urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url))
+            man_url = urlh.url
+            s_bytes = urlh.read()
+            if self.params.get('write_pages'):
+                dump_filename = _request_dump_filename(
+                    man_url, info_dict['id'], None,
+                    trim_length=self.params.get('trim_file_name'))
+                self.to_screen(f'[{self.FD_NAME}] Saving request to {dump_filename}')
+                with open(dump_filename, 'wb') as outf:
+                    outf.write(s_bytes)
+            s = s_bytes.decode('utf-8', 'ignore')
 
         can_download, message = self.can_download(s, info_dict, self.params.get('allow_unplayable_formats')), None
         if can_download:
@@ -177,6 +190,7 @@ class HlsFD(FragmentFD):
         if external_aes_iv:
             external_aes_iv = binascii.unhexlify(remove_start(external_aes_iv, '0x').zfill(32))
         byte_range = {}
+        byte_range_offset = 0
         discontinuity_count = 0
         frag_index = 0
         ad_frag_next = False
@@ -204,6 +218,11 @@ class HlsFD(FragmentFD):
                         })
                         media_sequence += 1
 
+                        # If the byte_range is truthy, reset it after appending a fragment that uses it
+                        if byte_range:
+                            byte_range_offset = byte_range['end']
+                            byte_range = {}
+
                     elif line.startswith('#EXT-X-MAP'):
                         if format_index and discontinuity_count != format_index:
                             continue
@@ -217,10 +236,12 @@ class HlsFD(FragmentFD):
                         if extra_segment_query:
                             frag_url = update_url_query(frag_url, extra_segment_query)
 
+                        map_byte_range = {}
+
                         if map_info.get('BYTERANGE'):
                             splitted_byte_range = map_info.get('BYTERANGE').split('@')
-                            sub_range_start = int(splitted_byte_range[1]) if len(splitted_byte_range) == 2 else byte_range['end']
-                            byte_range = {
+                            sub_range_start = int(splitted_byte_range[1]) if len(splitted_byte_range) == 2 else 0
+                            map_byte_range = {
                                 'start': sub_range_start,
                                 'end': sub_range_start + int(splitted_byte_range[0]),
                             }
@@ -229,7 +250,7 @@ class HlsFD(FragmentFD):
                             'frag_index': frag_index,
                             'url': frag_url,
                             'decrypt_info': decrypt_info,
-                            'byte_range': byte_range,
+                            'byte_range': map_byte_range,
                             'media_sequence': media_sequence,
                         })
                         media_sequence += 1
@@ -257,7 +278,7 @@ class HlsFD(FragmentFD):
                         media_sequence = int(line[22:])
                     elif line.startswith('#EXT-X-BYTERANGE'):
                         splitted_byte_range = line[17:].split('@')
-                        sub_range_start = int(splitted_byte_range[1]) if len(splitted_byte_range) == 2 else byte_range['end']
+                        sub_range_start = int(splitted_byte_range[1]) if len(splitted_byte_range) == 2 else byte_range_offset
                         byte_range = {
                             'start': sub_range_start,
                             'end': sub_range_start + int(splitted_byte_range[0]),
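A worked example of the offset bookkeeping these hunks fix. Per the HLS spec, `#EXT-X-BYTERANGE:<n>[@<o>]` without an explicit `@o` offset continues where the previous fragment's sub-range ended, which is what `byte_range_offset` now tracks instead of letting the `EXT-X-MAP` handling clobber it:

# '#EXT-X-BYTERANGE:' is 17 characters, hence line[17:]
first = '#EXT-X-BYTERANGE:1000@2000'[17:].split('@')   # ['1000', '2000']
start = int(first[1])                                  # 2000, explicit offset
end = start + int(first[0])                            # 3000

second = '#EXT-X-BYTERANGE:500'[17:].split('@')        # ['500'] - no offset
start = end                                            # continues at 3000
end = start + int(second[0])                           # 3500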


@@ -1,16 +1,25 @@
 from ..compat.compat_utils import passthrough_module
+from ..globals import extractors as _extractors_context
+from ..globals import plugin_ies as _plugin_ies_context
+from ..plugins import PluginSpec, register_plugin_spec
 
 passthrough_module(__name__, '.extractors')
 del passthrough_module
 
+register_plugin_spec(PluginSpec(
+    module_name='extractor',
+    suffix='IE',
+    destination=_extractors_context,
+    plugin_destination=_plugin_ies_context,
+))
+
 
 def gen_extractor_classes():
     """ Return a list of supported extractors.
     The order does matter; the first extractor matched is the one handling the URL.
     """
-    from .extractors import _ALL_CLASSES
-
-    return _ALL_CLASSES
+    import_extractors()
+    return list(_extractors_context.value.values())
 
 
 def gen_extractors():
@@ -37,6 +46,9 @@ def list_extractors(age_limit=None):
 def get_info_extractor(ie_name):
     """Returns the info extractor class with the given ie_name"""
-    from . import extractors
-
-    return getattr(extractors, f'{ie_name}IE')
+    import_extractors()
+    return _extractors_context.value[f'{ie_name}IE']
+
+
+def import_extractors():
+    from . import extractors  # noqa: F401
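Usage is unchanged for API consumers; the lookup now goes through the shared registry after a one-time `import_extractors()` call rather than module attribute access:

from yt_dlp.extractor import get_info_extractor

YoutubeIE = get_info_extractor('Youtube')  # triggers import_extractors() lazily
print(YoutubeIE.ie_key())  # 'Youtube'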


@@ -312,6 +312,7 @@ from .brilliantpala import (
 )
 from .bundesliga import BundesligaIE
 from .bundestag import BundestagIE
+from .bunnycdn import BunnyCdnIE
 from .businessinsider import BusinessInsiderIE
 from .buzzfeed import BuzzFeedIE
 from .byutv import BYUtvIE
@@ -335,6 +336,7 @@ from .canal1 import Canal1IE
 from .canalalpha import CanalAlphaIE
 from .canalc2 import Canalc2IE
 from .canalplus import CanalplusIE
+from .canalsurmas import CanalsurmasIE
 from .caracoltv import CaracolTvPlayIE
 from .cartoonnetwork import CartoonNetworkIE
 from .cbc import (
@@ -454,7 +456,10 @@ from .curiositystream import (
     CuriosityStreamIE,
     CuriosityStreamSeriesIE,
 )
-from .cwtv import CWTVIE
+from .cwtv import (
+    CWTVIE,
+    CWTVMovieIE,
+)
 from .cybrary import (
     CybraryCourseIE,
     CybraryIE,
@@ -491,10 +496,6 @@ from .daum import (
 from .daystar import DaystarClipIE
 from .dbtv import DBTVIE
 from .dctp import DctpTvIE
-from .deezer import (
-    DeezerAlbumIE,
-    DeezerPlaylistIE,
-)
 from .democracynow import DemocracynowIE
 from .detik import DetikEmbedIE
 from .deuxm import (
@@ -505,6 +506,7 @@ from .dfb import DFBIE
 from .dhm import DHMIE
 from .digitalconcerthall import DigitalConcertHallIE
 from .digiteka import DigitekaIE
+from .digiview import DigiviewIE
 from .discogs import DiscogsReleasePlaylistIE
 from .disney import DisneyIE
 from .dispeak import DigitallySpeakingIE
@@ -837,6 +839,7 @@ from .icareus import IcareusIE
 from .ichinanalive import (
     IchinanaLiveClipIE,
     IchinanaLiveIE,
+    IchinanaLiveVODIE,
 )
 from .idolplus import IdolPlusIE
 from .ign import (
@@ -1049,6 +1052,7 @@ from .livestream import (
 )
 from .livestreamfails import LivestreamfailsIE
 from .lnk import LnkIE
+from .loco import LocoIE
 from .loom import (
     LoomFolderIE,
     LoomIE,
@@ -1877,6 +1881,8 @@ from .skyit import (
     SkyItVideoIE,
     SkyItVideoLiveIE,
     TV8ItIE,
+    TV8ItLiveIE,
+    TV8ItPlaylistIE,
 )
 from .skylinewebcams import SkylineWebcamsIE
 from .skynewsarabia import (
@@ -1890,6 +1896,7 @@ from .slutload import SlutloadIE
 from .smotrim import SmotrimIE
 from .snapchat import SnapchatSpotlightIE
 from .snotr import SnotrIE
+from .softwhiteunderbelly import SoftWhiteUnderbellyIE
 from .sohu import (
     SohuIE,
     SohuVIE,
@@ -1979,6 +1986,7 @@ from .storyfire import (
     StoryFireSeriesIE,
     StoryFireUserIE,
 )
+from .streaks import StreaksIE
 from .streamable import StreamableIE
 from .streamcz import StreamCZIE
 from .streetvoice import StreetVoiceIE
@@ -2218,6 +2226,7 @@ from .tvplay import (
     TVPlayIE,
 )
 from .tvplayer import TVPlayerIE
+from .tvw import TvwIE
 from .tweakers import TweakersIE
 from .twentymin import TwentyMinutenIE
 from .twentythreevideo import TwentyThreeVideoIE
@@ -2341,10 +2350,6 @@ from .viewlift import (
     ViewLiftIE,
 )
 from .viidea import ViideaIE
-from .viki import (
-    VikiChannelIE,
-    VikiIE,
-)
 from .vimeo import (
     VHXEmbedIE,
     VimeoAlbumIE,
@@ -2389,10 +2394,15 @@ from .voxmedia import (
     VoxMediaIE,
     VoxMediaVolumeIE,
 )
+from .vrsquare import (
+    VrSquareChannelIE,
+    VrSquareIE,
+    VrSquareSearchIE,
+    VrSquareSectionIE,
+)
 from .vrt import (
     VRTIE,
     DagelijkseKostIE,
-    KetnetIE,
     Radio1BeIE,
     VrtNUIE,
 )


@@ -43,14 +43,14 @@ class ACastIE(ACastBaseIE):
     _VALID_URL = r'''(?x:
                     https?://
                     (?:
-                        (?:(?:embed|www)\.)?acast\.com/|
+                        (?:(?:embed|www|shows)\.)?acast\.com/|
                         play\.acast\.com/s/
                     )
-                    (?P<channel>[^/]+)/(?P<id>[^/#?"]+)
+                    (?P<channel>[^/?#]+)/(?:episodes/)?(?P<id>[^/#?"]+)
                 )'''
     _EMBED_REGEX = [rf'(?x)<iframe[^>]+\bsrc=[\'"](?P<url>{_VALID_URL})']
     _TESTS = [{
-        'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna',
+        'url': 'https://shows.acast.com/sparpodcast/episodes/2.raggarmordet-rosterurdetforflutna',
         'info_dict': {
             'id': '2a92b283-1a75-4ad8-8396-499c641de0d9',
             'ext': 'mp3',
@@ -59,7 +59,7 @@ class ACastIE(ACastBaseIE):
             'timestamp': 1477346700,
             'upload_date': '20161024',
             'duration': 2766,
-            'creator': 'Third Ear Studio',
+            'creators': ['Third Ear Studio'],
             'series': 'Spår',
             'episode': '2. Raggarmordet - Röster ur det förflutna',
             'thumbnail': 'https://assets.pippa.io/shows/616ebe1886d7b1398620b943/616ebe33c7e6e70013cae7da.jpg',
@@ -74,6 +74,9 @@ class ACastIE(ACastBaseIE):
     }, {
         'url': 'https://play.acast.com/s/rattegangspodden/s04e09styckmordetihelenelund-del2-2',
         'only_matching': True,
+    }, {
+        'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna',
+        'only_matching': True,
     }, {
         'url': 'https://play.acast.com/s/sparpodcast/2a92b283-1a75-4ad8-8396-499c641de0d9',
         'only_matching': True,
@@ -110,7 +113,7 @@ class ACastChannelIE(ACastBaseIE):
     _VALID_URL = r'''(?x)
                     https?://
                     (?:
-                        (?:www\.)?acast\.com/|
+                        (?:(?:www|shows)\.)?acast\.com/|
                         play\.acast\.com/s/
                     )
                     (?P<id>[^/#?]+)
@@ -120,12 +123,15 @@ class ACastChannelIE(ACastBaseIE):
         'info_dict': {
             'id': '4efc5294-5385-4847-98bd-519799ce5786',
             'title': 'Today in Focus',
-            'description': 'md5:c09ce28c91002ce4ffce71d6504abaae',
+            'description': 'md5:feca253de9947634605080cd9eeea2bf',
         },
         'playlist_mincount': 200,
     }, {
         'url': 'http://play.acast.com/s/ft-banking-weekly',
         'only_matching': True,
+    }, {
+        'url': 'https://shows.acast.com/sparpodcast',
+        'only_matching': True,
     }]
 
     @classmethod
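The character-class change is the important part of these pattern edits: `[^/?#]+` stops the channel match at a query string or fragment, and the optional `episodes/` segment covers the new shows.acast.com URL layout. A simplified, self-contained check of just that fragment of the pattern:

import re

m = re.match(
    r'(?P<channel>[^/?#]+)/(?:episodes/)?(?P<id>[^/#?"]+)',
    'sparpodcast/episodes/2.raggarmordet-rosterurdetforflutna?ref=home')
assert m.group('channel') == 'sparpodcast'
assert m.group('id') == '2.raggarmordet-rosterurdetforflutna'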


@@ -1,3 +1,4 @@
+import datetime as dt
 import functools
 
 from .common import InfoExtractor
@@ -10,7 +11,7 @@ from ..utils import (
     filter_dict,
     int_or_none,
     orderedSet,
-    unified_timestamp,
+    parse_iso8601,
     url_or_none,
     urlencode_postdata,
     urljoin,
@@ -87,9 +88,9 @@ class AfreecaTVIE(AfreecaTVBaseIE):
             'uploader_id': 'rlantnghks',
             'uploader': '페이즈으',
             'duration': 10840,
-            'thumbnail': r're:https?://videoimg\.sooplive\.co\.kr/.+',
+            'thumbnail': r're:https?://videoimg\.(?:sooplive\.co\.kr|afreecatv\.com)/.+',
             'upload_date': '20230108',
-            'timestamp': 1673218805,
+            'timestamp': 1673186405,
             'title': '젠지 페이즈',
         },
         'params': {
@@ -102,7 +103,7 @@ class AfreecaTVIE(AfreecaTVBaseIE):
             'id': '20170411_BE689A0E_190960999_1_2_h',
             'ext': 'mp4',
             'title': '혼자사는여자집',
-            'thumbnail': r're:https?://(?:video|st)img\.sooplive\.co\.kr/.+',
+            'thumbnail': r're:https?://(?:video|st)img\.(?:sooplive\.co\.kr|afreecatv\.com)/.+',
             'uploader': '♥이슬이',
             'uploader_id': 'dasl8121',
             'upload_date': '20170411',
@@ -119,7 +120,7 @@ class AfreecaTVIE(AfreecaTVBaseIE):
             'id': '20180327_27901457_202289533_1',
             'ext': 'mp4',
             'title': '[생]빨개요♥ (part 1)',
-            'thumbnail': r're:https?://(?:video|st)img\.sooplive\.co\.kr/.+',
+            'thumbnail': r're:https?://(?:video|st)img\.(?:sooplive\.co\.kr|afreecatv\.com)/.+',
             'uploader': '[SA]서아',
             'uploader_id': 'bjdyrksu',
             'upload_date': '20180327',
@@ -187,7 +188,7 @@ class AfreecaTVIE(AfreecaTVBaseIE):
             'formats': formats,
             **traverse_obj(file_element, {
                 'duration': ('duration', {int_or_none(scale=1000)}),
-                'timestamp': ('file_start', {unified_timestamp}),
+                'timestamp': ('file_start', {parse_iso8601(delimiter=' ', timezone=dt.timedelta(hours=9))}),
             }),
         })
@@ -370,7 +371,7 @@ class AfreecaTVLiveIE(AfreecaTVBaseIE):
             'title': channel_info.get('TITLE') or station_info.get('station_title'),
             'uploader': channel_info.get('BJNICK') or station_info.get('station_name'),
             'uploader_id': broadcaster_id,
-            'timestamp': unified_timestamp(station_info.get('broad_start')),
+            'timestamp': parse_iso8601(station_info.get('broad_start'), delimiter=' ', timezone=dt.timedelta(hours=9)),
             'formats': formats,
             'is_live': True,
             'http_headers': {'Referer': url},
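The timestamp fix: the site reports naive local datetimes (KST, UTC+9), so `unified_timestamp` was silently reading them as UTC. Passing a fixed offset to `parse_iso8601` makes the conversion explicit:

import datetime as dt

from yt_dlp.utils import parse_iso8601

# '2023-01-08 23:00:05' in KST is 14:00:05 UTC
print(parse_iso8601('2023-01-08 23:00:05', delimiter=' ', timezone=dt.timedelta(hours=9)))
# 1673186405, matching the corrected test expectation above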


@@ -1,7 +1,6 @@
-import json
-
 from .common import InfoExtractor
 from .kaltura import KalturaIE
+from ..utils.traversal import require, traverse_obj
 
 
 class AZMedienIE(InfoExtractor):
@@ -9,15 +8,15 @@ class AZMedienIE(InfoExtractor):
     _VALID_URL = r'''(?x)
                     https?://
                     (?:www\.|tv\.)?
-                    (?P<host>
+                    (?:
                         telezueri\.ch|
                         telebaern\.tv|
                         telem1\.ch|
                         tvo-online\.ch
                     )/
-                    [^/]+/
+                    [^/?#]+/
                     (?P<id>
-                        [^/]+-(?P<article_id>\d+)
+                        [^/?#]+-\d+
                     )
                     (?:
                         \#video=
@@ -47,19 +46,17 @@ class AZMedienIE(InfoExtractor):
         'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1',
         'only_matching': True,
     }]
-    _API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/a4016f65fe62b81dc6664dd9f4910e4ab40383be'
     _PARTNER_ID = '1719221'
 
     def _real_extract(self, url):
-        host, display_id, article_id, entry_id = self._match_valid_url(url).groups()
+        display_id, entry_id = self._match_valid_url(url).groups()
 
         if not entry_id:
-            entry_id = self._download_json(
-                self._API_TEMPL % (host, host.split('.')[0]), display_id, query={
-                    'variables': json.dumps({
-                        'contextId': 'NewsArticle:' + article_id,
-                    }),
-                })['data']['context']['mainAsset']['video']['kaltura']['kalturaId']
+            webpage = self._download_webpage(url, display_id)
+            data = self._search_json(
+                r'window\.__APOLLO_STATE__\s*=', webpage, 'video data', display_id)
+            entry_id = traverse_obj(data, (
+                lambda _, v: v['__typename'] == 'KalturaData', 'kalturaId', any, {require('kaltura id')}))
 
         return self.url_result(
             f'kaltura:{self._PARTNER_ID}:{entry_id}',
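The new extraction walks Apollo's normalized cache: a dict-matching lambda branches over all records, `any` takes the first hit, and `require` turns a miss into a clear error. A sketch with a made-up cache:

from yt_dlp.utils.traversal import require, traverse_obj

data = {  # hypothetical window.__APOLLO_STATE__ contents
    'ROOT_QUERY': {'__typename': 'Query'},
    'KalturaData:42': {'__typename': 'KalturaData', 'kalturaId': '0_abc123'},
}
entry_id = traverse_obj(data, (
    lambda _, v: v['__typename'] == 'KalturaData', 'kalturaId', any, {require('kaltura id')}))
assert entry_id == '0_abc123'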


@@ -86,7 +86,7 @@ class BandlabBaseIE(InfoExtractor):
             'webpage_url': (
                 'id', ({value(url)}, {format_field(template='https://www.bandlab.com/post/%s')}), filter, any),
             'url': ('video', 'url', {url_or_none}),
-            'title': ('caption', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=50)}),
+            'title': ('caption', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=72)}),
             'description': ('caption', {str}),
             'thumbnail': ('video', 'picture', 'url', {url_or_none}),
             'view_count': ('video', 'counters', 'plays', {int_or_none}),
@@ -120,7 +120,7 @@ class BandlabIE(BandlabBaseIE):
             'duration': 54.629999999999995,
             'title': 'sweet black',
             'upload_date': '20231210',
-            'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
+            'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
             'genres': ['Lofi'],
             'uploader': 'ender milze',
             'comment_count': int,
@@ -142,7 +142,7 @@ class BandlabIE(BandlabBaseIE):
             'duration': 54.629999999999995,
             'title': 'sweet black',
             'upload_date': '20231210',
-            'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
+            'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
             'genres': ['Lofi'],
             'uploader': 'ender milze',
             'comment_count': int,
@@ -158,7 +158,7 @@ class BandlabIE(BandlabBaseIE):
             'comment_count': int,
             'genres': ['Other'],
             'uploader_id': 'user8353034818103753',
-            'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/51b18363-da23-4b9b-a29c-2933a3e561ca/',
+            'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/songs/51b18363-da23-4b9b-a29c-2933a3e561ca/',
             'timestamp': 1709625771,
             'track': 'PodcastMaerchen4b',
             'duration': 468.14,
@@ -178,7 +178,7 @@ class BandlabIE(BandlabBaseIE):
             'id': '110343fc-148b-ea11-96d2-0003ffd1fc09',
             'ext': 'm4a',
             'timestamp': 1588273294,
-            'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/users/b612e533-e4f7-4542-9f50-3fcfd8dd822c/',
+            'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/users/b612e533-e4f7-4542-9f50-3fcfd8dd822c/',
             'description': 'Final Revision.',
             'title': 'Replay ( Instrumental)',
             'uploader': 'David R Sparks',
@@ -200,7 +200,7 @@ class BandlabIE(BandlabBaseIE):
             'id': '5cdf9036-3857-ef11-991a-6045bd36e0d9',
             'ext': 'mp4',
             'duration': 44.705,
-            'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/videos/67c6cef1-cef6-40d3-831e-a55bc1dcb972/',
+            'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/videos/67c6cef1-cef6-40d3-831e-a55bc1dcb972/',
             'comment_count': int,
             'title': 'backing vocals',
             'uploader_id': 'marliashya',
@@ -224,7 +224,7 @@ class BandlabIE(BandlabBaseIE):
             'view_count': int,
             'track': 'Positronic Meltdown',
             'duration': 318.55,
-            'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/87165bc3-5439-496e-b1f7-a9f13b541ff2/',
+            'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/songs/87165bc3-5439-496e-b1f7-a9f13b541ff2/',
             'description': 'Checkout my tracks at AOMX http://aomxsounds.com/',
             'uploader_id': 'microfreaks',
             'title': 'Positronic Meltdown',
@@ -246,7 +246,7 @@ class BandlabIE(BandlabBaseIE):
             'comment_count': int,
             'uploader': 'Sorakime',
             'uploader_id': 'sorakime',
-            'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/users/572a351a-0f3a-4c6a-ac39-1a5defdeeb1c/',
+            'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/users/572a351a-0f3a-4c6a-ac39-1a5defdeeb1c/',
             'timestamp': 1691162128,
             'upload_date': '20230804',
             'media_type': 'track',


@@ -1596,16 +1596,16 @@ class BilibiliPlaylistIE(BilibiliSpaceListBaseIE):
         webpage = self._download_webpage(url, list_id)
         initial_state = self._search_json(r'window\.__INITIAL_STATE__\s*=', webpage, 'initial state', list_id)
 
-        if traverse_obj(initial_state, ('error', 'code', {int_or_none})) != 200:
-            error_code = traverse_obj(initial_state, ('error', 'trueCode', {int_or_none}))
-            error_message = traverse_obj(initial_state, ('error', 'message', {str_or_none}))
+        error = traverse_obj(initial_state, (('error', 'listError'), all, lambda _, v: v['code'], any))
+        if error and error['code'] != 200:
+            error_code = error.get('trueCode')
             if error_code == -400 and list_id == 'watchlater':
                 self.raise_login_required('You need to login to access your watchlater playlist')
             elif error_code == -403:
                 self.raise_login_required('This is a private playlist. You need to login as its owner')
             elif error_code == 11010:
                 raise ExtractorError('Playlist is no longer available', expected=True)
-            raise ExtractorError(f'Could not access playlist: {error_code} {error_message}')
+            raise ExtractorError(f'Could not access playlist: {error_code} {error.get("message")}')
 
         query = {
             'ps': 20,


@@ -53,7 +53,7 @@ class BlueskyIE(InfoExtractor):
             'channel_id': 'did:plc:z72i7hdynmk6r22z27h6tvur',
             'channel_url': 'https://bsky.app/profile/did:plc:z72i7hdynmk6r22z27h6tvur',
             'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
-            'title': 'Bluesky now has video! Update your app to versi...',
+            'title': 'Bluesky now has video! Update your app to version 1.91 or refresh on ...',
             'alt_title': 'Bluesky video feature announcement',
             'description': r're:(?s)Bluesky now has video! .{239}',
             'upload_date': '20240911',
@@ -172,7 +172,7 @@ class BlueskyIE(InfoExtractor):
             'channel_id': 'did:plc:z72i7hdynmk6r22z27h6tvur',
             'channel_url': 'https://bsky.app/profile/did:plc:z72i7hdynmk6r22z27h6tvur',
             'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
-            'title': 'Bluesky now has video! Update your app to versi...',
+            'title': 'Bluesky now has video! Update your app to version 1.91 or refresh on ...',
             'alt_title': 'Bluesky video feature announcement',
             'description': r're:(?s)Bluesky now has video! .{239}',
             'upload_date': '20240911',
@@ -191,7 +191,7 @@ class BlueskyIE(InfoExtractor):
         'info_dict': {
             'id': '3l7rdfxhyds2f',
             'ext': 'mp4',
-            'uploader': 'cinnamon',
+            'uploader': 'cinnamon 🐇 🏳️‍⚧️',
             'uploader_id': 'cinny.bun.how',
             'uploader_url': 'https://bsky.app/profile/cinny.bun.how',
             'channel_id': 'did:plc:7x6rtuenkuvxq3zsvffp2ide',
@@ -255,7 +255,7 @@ class BlueskyIE(InfoExtractor):
         'info_dict': {
             'id': '3l77u64l7le2e',
             'ext': 'mp4',
-            'title': 'hearing people on twitter say that bluesky isn\'...',
+            'title': "hearing people on twitter say that bluesky isn't funny yet so post t...",
             'like_count': int,
             'uploader_id': 'thafnine.net',
             'uploader_url': 'https://bsky.app/profile/thafnine.net',
@@ -387,7 +387,7 @@ class BlueskyIE(InfoExtractor):
             'age_limit': (
                 'labels', ..., 'val', {lambda x: 18 if x in ('sexual', 'porn', 'graphic-media') else None}, any),
             'description': (*record_path, 'text', {str}, filter),
-            'title': (*record_path, 'text', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=50)}),
+            'title': (*record_path, 'text', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=72)}),
         }),
     })
     return entries
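Both this and the Bandlab title bump come from the same knob: `truncate_string(left=N)` keeps the first N-3 characters and appends '...' when the input exceeds N, which is why the test titles above grew. Assuming the helper keeps its current semantics:

from yt_dlp.utils import truncate_string

title = 'a' * 100
assert truncate_string(title, left=72) == 'a' * 69 + '...'
assert truncate_string('short title', left=72) == 'short title'  # returned unchanged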


@@ -24,7 +24,7 @@ class BokeCCBaseIE(InfoExtractor):
 class BokeCCIE(BokeCCBaseIE):
-    _IE_DESC = 'CC视频'
+    IE_DESC = 'CC视频'
     _VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)'
 
     _TESTS = [{


@@ -0,0 +1,178 @@
+import json
+
+from .common import InfoExtractor
+from ..networking import HEADRequest
+from ..utils import (
+    ExtractorError,
+    extract_attributes,
+    int_or_none,
+    parse_qs,
+    smuggle_url,
+    unsmuggle_url,
+    url_or_none,
+    urlhandle_detect_ext,
+)
+from ..utils.traversal import find_element, traverse_obj
+
+
+class BunnyCdnIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:iframe\.mediadelivery\.net|video\.bunnycdn\.com)/(?:embed|play)/(?P<library_id>\d+)/(?P<id>[\da-f-]+)'
+    _EMBED_REGEX = [rf'<iframe[^>]+src=[\'"](?P<url>{_VALID_URL}[^\'"]*)[\'"]']
+    _TESTS = [{
+        'url': 'https://iframe.mediadelivery.net/embed/113933/e73edec1-e381-4c8b-ae73-717a140e0924',
+        'info_dict': {
+            'id': 'e73edec1-e381-4c8b-ae73-717a140e0924',
+            'ext': 'mp4',
+            'title': 'mistress morgana (3).mp4',
+            'description': '',
+            'timestamp': 1693251673,
+            'thumbnail': r're:^https?://.*\.b-cdn\.net/e73edec1-e381-4c8b-ae73-717a140e0924/thumbnail\.jpg',
+            'duration': 7.0,
+            'upload_date': '20230828',
+        },
+        'params': {'skip_download': True},
+    }, {
+        'url': 'https://iframe.mediadelivery.net/play/136145/32e34c4b-0d72-437c-9abb-05e67657da34',
+        'info_dict': {
+            'id': '32e34c4b-0d72-437c-9abb-05e67657da34',
+            'ext': 'mp4',
+            'timestamp': 1691145748,
+            'thumbnail': r're:^https?://.*\.b-cdn\.net/32e34c4b-0d72-437c-9abb-05e67657da34/thumbnail_9172dc16\.jpg',
+            'duration': 106.0,
+            'description': 'md5:981a3e899a5c78352b21ed8b2f1efd81',
+            'upload_date': '20230804',
+            'title': 'Sanela ist Teil der #arbeitsmarktkraft',
+        },
+        'params': {'skip_download': True},
+    }, {
+        # Stream requires activation and pings
+        'url': 'https://iframe.mediadelivery.net/embed/200867/2e8545ec-509d-4571-b855-4cf0235ccd75',
+        'info_dict': {
+            'id': '2e8545ec-509d-4571-b855-4cf0235ccd75',
+            'ext': 'mp4',
+            'timestamp': 1708497752,
+            'title': 'netflix part 1',
+            'duration': 3959.0,
+            'description': '',
+            'upload_date': '20240221',
+            'thumbnail': r're:^https?://.*\.b-cdn\.net/2e8545ec-509d-4571-b855-4cf0235ccd75/thumbnail\.jpg',
+        },
+        'params': {'skip_download': True},
+    }]
+    _WEBPAGE_TESTS = [{
+        # Stream requires Referer
+        'url': 'https://conword.io/',
+        'info_dict': {
+            'id': '3a5d863e-9cd6-447e-b6ef-e289af50b349',
+            'ext': 'mp4',
+            'title': 'Conword bei der Stadt Köln und Stadt Dortmund',
+            'description': '',
+            'upload_date': '20231031',
+            'duration': 31.0,
+            'thumbnail': 'https://video.watchuh.com/3a5d863e-9cd6-447e-b6ef-e289af50b349/thumbnail.jpg',
+            'timestamp': 1698783879,
+        },
+        'params': {'skip_download': True},
+    }, {
+        # URL requires token and expires
+        'url': 'https://www.stockphotos.com/video/moscow-subway-the-train-is-arriving-at-the-park-kultury-station-10017830',
+        'info_dict': {
+            'id': '0b02fa20-4e8c-4140-8f87-f64d820a3386',
+            'ext': 'mp4',
+            'thumbnail': r're:^https?://.*\.b-cdn\.net/0b02fa20-4e8c-4140-8f87-f64d820a3386/thumbnail\.jpg',
+            'title': 'Moscow subway. The train is arriving at the Park Kultury station.',
+            'upload_date': '20240531',
+            'duration': 18.0,
+            'timestamp': 1717152269,
+            'description': '',
+        },
+        'params': {'skip_download': True},
+    }]
+
+    @classmethod
+    def _extract_embed_urls(cls, url, webpage):
+        for embed_url in super()._extract_embed_urls(url, webpage):
+            yield smuggle_url(embed_url, {'Referer': url})
+
+    def _real_extract(self, url):
+        url, smuggled_data = unsmuggle_url(url, {})
+        video_id, library_id = self._match_valid_url(url).group('id', 'library_id')
+        webpage = self._download_webpage(
+            f'https://iframe.mediadelivery.net/embed/{library_id}/{video_id}', video_id,
+            headers=traverse_obj(smuggled_data, {'Referer': 'Referer'}),
+            query=traverse_obj(parse_qs(url), {'token': 'token', 'expires': 'expires'}))
+
+        if (html_title := self._html_extract_title(webpage, default=None)) == '403':
+            raise ExtractorError(
+                'This video is inaccessible. Setting a Referer header '
+                'might be required to access the video', expected=True)
+        elif html_title == '404':
+            raise ExtractorError('This video does not exist', expected=True)
+
+        headers = {'Referer': url}
+
+        info = traverse_obj(self._parse_html5_media_entries(url, webpage, video_id, _headers=headers), 0) or {}
+        formats = info.get('formats') or []
+        subtitles = info.get('subtitles') or {}
+
+        original_url = self._search_regex(
+            r'(?:var|const|let)\s+originalUrl\s*=\s*["\']([^"\']+)["\']', webpage, 'original url', default=None)
+        if url_or_none(original_url):
+            urlh = self._request_webpage(
+                HEADRequest(original_url), video_id=video_id, note='Checking original',
+                headers=headers, fatal=False, expected_status=(403, 404))
+            if urlh and urlh.status == 200:
+                formats.append({
+                    'url': original_url,
+                    'format_id': 'source',
+                    'quality': 1,
+                    'http_headers': headers,
+                    'ext': urlhandle_detect_ext(urlh, default='mp4'),
+                    'filesize': int_or_none(urlh.get_header('Content-Length')),
+                })
+
+        # MediaCage Streams require activation and pings
+        src_url = self._search_regex(
+            r'\.setAttribute\([\'"]src[\'"],\s*[\'"]([^\'"]+)[\'"]\)', webpage, 'src url', default=None)
+        activation_url = self._search_regex(
+            r'loadUrl\([\'"]([^\'"]+/activate)[\'"]', webpage, 'activation url', default=None)
+        ping_url = self._search_regex(
+            r'loadUrl\([\'"]([^\'"]+/ping)[\'"]', webpage, 'ping url', default=None)
+        secret = traverse_obj(parse_qs(src_url), ('secret', 0))
+        context_id = traverse_obj(parse_qs(src_url), ('contextId', 0))
+        ping_data = {}
+        if src_url and activation_url and ping_url and secret and context_id:
+            self._download_webpage(
+                activation_url, video_id, headers=headers, note='Downloading activation data')
+
+            fmts, subs = self._extract_m3u8_formats_and_subtitles(
+                src_url, video_id, 'mp4', headers=headers, m3u8_id='hls', fatal=False)
+            for fmt in fmts:
+                fmt.update({
+                    'protocol': 'bunnycdn',
+                    'http_headers': headers,
+                })
+            formats.extend(fmts)
+            self._merge_subtitles(subs, target=subtitles)
+
+            ping_data = {
+                '_bunnycdn_ping_data': {
+                    'url': ping_url,
+                    'headers': headers,
+                    'secret': secret,
+                    'context_id': context_id,
+                },
+            }
+
+        return {
+            'id': video_id,
+            'formats': formats,
+            'subtitles': subtitles,
+            **traverse_obj(webpage, ({find_element(id='main-video', html=True)}, {extract_attributes}, {
+                'title': ('data-plyr-config', {json.loads}, 'title', {str}),
+                'thumbnail': ('data-poster', {url_or_none}),
+            })),
+            **ping_data,
+            **self._search_json_ld(webpage, video_id, fatal=False),
+        }
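The Referer plumbing between `_extract_embed_urls` and `_real_extract` relies on yt-dlp's URL smuggling helpers, which tuck extra data into the URL fragment and recover it later:

from yt_dlp.utils import smuggle_url, unsmuggle_url

url = smuggle_url(
    'https://iframe.mediadelivery.net/embed/113933/e73edec1-e381-4c8b-ae73-717a140e0924',
    {'Referer': 'https://example.com/'})
clean_url, data = unsmuggle_url(url, {})
assert data == {'Referer': 'https://example.com/'}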


@@ -0,0 +1,84 @@
+import json
+import time
+
+from .common import InfoExtractor
+from ..utils import (
+    determine_ext,
+    float_or_none,
+    jwt_decode_hs256,
+    parse_iso8601,
+    url_or_none,
+    variadic,
+)
+from ..utils.traversal import traverse_obj
+
+
+class CanalsurmasIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?canalsurmas\.es/videos/(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'https://www.canalsurmas.es/videos/44006-el-gran-queo-1-lora-del-rio-sevilla-20072014',
+        'md5': '861f86fdc1221175e15523047d0087ef',
+        'info_dict': {
+            'id': '44006',
+            'ext': 'mp4',
+            'title': 'Lora del Río (Sevilla)',
+            'description': 'md5:3d9ee40a9b1b26ed8259e6b71ed27b8b',
+            'thumbnail': 'https://cdn2.rtva.interactvty.com/content_cards/00f3e8f67b0a4f3b90a4a14618a48b0d.jpg',
+            'timestamp': 1648123182,
+            'upload_date': '20220324',
+        },
+    }]
+    _API_BASE = 'https://api-rtva.interactvty.com'
+    _access_token = None
+
+    @staticmethod
+    def _is_jwt_expired(token):
+        return jwt_decode_hs256(token)['exp'] - time.time() < 300
+
+    def _call_api(self, endpoint, video_id, fields=None):
+        if not self._access_token or self._is_jwt_expired(self._access_token):
+            self._access_token = self._download_json(
+                f'{self._API_BASE}/jwt/token/', None,
+                'Downloading access token', 'Failed to download access token',
+                headers={'Content-Type': 'application/json'},
+                data=json.dumps({
+                    'username': 'canalsur_demo',
+                    'password': 'dsUBXUcI',
+                }).encode())['access']
+
+        return self._download_json(
+            f'{self._API_BASE}/api/2.0/contents/{endpoint}/{video_id}/', video_id,
+            f'Downloading {endpoint} API JSON', f'Failed to download {endpoint} API JSON',
+            headers={'Authorization': f'jwtok {self._access_token}'},
+            query={'optional_fields': ','.join(variadic(fields))} if fields else None)
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        video_info = self._call_api('content', video_id, fields=[
+            'description', 'image', 'duration', 'created_at', 'tags',
+        ])
+        stream_info = self._call_api('content_resources', video_id, 'media_url')
+
+        formats, subtitles = [], {}
+        for stream_url in traverse_obj(stream_info, ('results', ..., 'media_url', {url_or_none})):
+            if determine_ext(stream_url) == 'm3u8':
+                fmts, subs = self._extract_m3u8_formats_and_subtitles(
+                    stream_url, video_id, m3u8_id='hls', fatal=False)
+                formats.extend(fmts)
+                self._merge_subtitles(subs, target=subtitles)
+            else:
+                formats.append({'url': stream_url})
+
+        return {
+            'id': video_id,
+            'formats': formats,
+            'subtitles': subtitles,
+            **traverse_obj(video_info, {
+                'title': ('name', {str.strip}),
+                'description': ('description', {str}),
+                'thumbnail': ('image', {url_or_none}),
+                'duration': ('duration', {float_or_none}),
+                'timestamp': ('created_at', {parse_iso8601}),
+                'tags': ('tags', ..., {str}),
+            }),
+        }
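`_is_jwt_expired` treats a token as stale five minutes before its `exp` claim. `jwt_decode_hs256` only base64-decodes the payload (no signature verification), which is all a client needs to schedule a refresh. The same check, restated standalone:

import time

from yt_dlp.utils import jwt_decode_hs256

def is_jwt_expired(token, margin=300):
    # 'exp' is a Unix timestamp; refresh when within `margin` seconds of it
    return jwt_decode_hs256(token)['exp'] - time.time() < margin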


@@ -1,29 +1,32 @@
-import base64
 import functools
-import json
 import re
 import time
 import urllib.parse
 
 from .common import InfoExtractor
 from ..networking import HEADRequest
+from ..networking.exceptions import HTTPError
 from ..utils import (
     ExtractorError,
     float_or_none,
     int_or_none,
     js_to_json,
+    jwt_decode_hs256,
     mimetype2ext,
     orderedSet,
+    parse_age_limit,
     parse_iso8601,
     replace_extension,
     smuggle_url,
     strip_or_none,
-    traverse_obj,
     try_get,
+    unified_timestamp,
     update_url,
     url_basename,
     url_or_none,
+    urlencode_postdata,
 )
+from ..utils.traversal import require, traverse_obj, trim_str
 
 
 class CBCIE(InfoExtractor):
@@ -516,9 +519,43 @@ class CBCPlayerPlaylistIE(InfoExtractor):
         return self.playlist_result(entries(), playlist_id)
 
 
-class CBCGemIE(InfoExtractor):
+class CBCGemBaseIE(InfoExtractor):
+    _NETRC_MACHINE = 'cbcgem'
+    _GEO_COUNTRIES = ['CA']
+
+    def _call_show_api(self, item_id, display_id=None):
+        return self._download_json(
+            f'https://services.radio-canada.ca/ott/catalog/v2/gem/show/{item_id}',
+            display_id or item_id, query={'device': 'web'})
+
+    def _extract_item_info(self, item_info):
+        episode_number = None
+        title = traverse_obj(item_info, ('title', {str}))
+        if title and (mobj := re.match(r'(?P<episode>\d+)\. (?P<title>.+)', title)):
+            episode_number = int_or_none(mobj.group('episode'))
+            title = mobj.group('title')
+
+        return {
+            'episode_number': episode_number,
+            **traverse_obj(item_info, {
+                'id': ('url', {str}),
+                'episode_id': ('url', {str}),
+                'description': ('description', {str}),
+                'thumbnail': ('images', 'card', 'url', {url_or_none}, {update_url(query=None)}),
+                'episode_number': ('episodeNumber', {int_or_none}),
+                'duration': ('metadata', 'duration', {int_or_none}),
+                'release_timestamp': ('metadata', 'airDate', {unified_timestamp}),
+                'timestamp': ('metadata', 'availabilityDate', {unified_timestamp}),
+                'age_limit': ('metadata', 'rating', {trim_str(start='C')}, {parse_age_limit}),
+            }),
+            'episode': title,
+            'title': title,
+        }
+
+
+class CBCGemIE(CBCGemBaseIE):
     IE_NAME = 'gem.cbc.ca'
-    _VALID_URL = r'https?://gem\.cbc\.ca/(?:media/)?(?P<id>[0-9a-z-]+/s[0-9]+[a-z][0-9]+)'
+    _VALID_URL = r'https?://gem\.cbc\.ca/(?:media/)?(?P<id>[0-9a-z-]+/s(?P<season>[0-9]+)[a-z][0-9]+)'
     _TESTS = [{
         # This is a normal, public, TV show video
         'url': 'https://gem.cbc.ca/media/schitts-creek/s06e01',
@@ -529,7 +566,7 @@ class CBCGemIE(InfoExtractor):
             'description': 'md5:929868d20021c924020641769eb3e7f1',
             'thumbnail': r're:https://images\.radio-canada\.ca/[^#?]+/cbc_schitts_creek_season_06e01_thumbnail_v01\.jpg',
             'duration': 1324,
-            'categories': ['comedy'],
+            'genres': ['Comédie et humour'],
             'series': 'Schitt\'s Creek',
             'season': 'Season 6',
             'season_number': 6,
@@ -537,9 +574,10 @@ class CBCGemIE(InfoExtractor):
             'episode_number': 1,
             'episode_id': 'schitts-creek/s06e01',
             'upload_date': '20210618',
-            'timestamp': 1623988800,
+            'timestamp': 1623974400,
             'release_date': '20200107',
-            'release_timestamp': 1578427200,
+            'release_timestamp': 1578355200,
+            'age_limit': 14,
         },
         'params': {'format': 'bv'},
     }, {
@@ -557,12 +595,13 @@ class CBCGemIE(InfoExtractor):
             'episode_number': 1,
             'episode': 'The Cup Runneth Over',
             'episode_id': 'schitts-creek/s01e01',
-            'duration': 1309,
-            'categories': ['comedy'],
+            'duration': 1308,
+            'genres': ['Comédie et humour'],
             'upload_date': '20210617',
-            'timestamp': 1623902400,
-            'release_date': '20151124',
-            'release_timestamp': 1448323200,
+            'timestamp': 1623888000,
+            'release_date': '20151123',
+            'release_timestamp': 1448236800,
+            'age_limit': 14,
         },
         'params': {'format': 'bv'},
     }, {
@@ -570,82 +609,107 @@ class CBCGemIE(InfoExtractor):
         'only_matching': True,
     }]
 
-    _GEO_COUNTRIES = ['CA']
-    _TOKEN_API_KEY = '3f4beddd-2061-49b0-ae80-6f1f2ed65b37'
-    _NETRC_MACHINE = 'cbcgem'
+    _CLIENT_ID = 'fc05b0ee-3865-4400-a3cc-3da82c330c23'
+    _refresh_token = None
+    _access_token = None
     _claims_token = None
 
-    def _new_claims_token(self, email, password):
-        data = json.dumps({
-            'email': email,
-            'password': password,
-        }).encode()
-        headers = {'content-type': 'application/json'}
-        query = {'apikey': self._TOKEN_API_KEY}
-        resp = self._download_json('https://api.loginradius.com/identity/v2/auth/login',
-                                   None, data=data, headers=headers, query=query)
-        access_token = resp['access_token']
+    @functools.cached_property
+    def _ropc_settings(self):
+        return self._download_json(
+            'https://services.radio-canada.ca/ott/catalog/v1/gem/settings', None,
+            'Downloading site settings', query={'device': 'web'})['identityManagement']['ropc']
 
-        query = {
-            'access_token': access_token,
-            'apikey': self._TOKEN_API_KEY,
-            'jwtapp': 'jwt',
-        }
-        resp = self._download_json('https://cloud-api.loginradius.com/sso/jwt/api/token',
-                                   None, headers=headers, query=query)
-        sig = resp['signature']
+    def _is_jwt_expired(self, token):
+        return jwt_decode_hs256(token)['exp'] - time.time() < 300
 
-        data = json.dumps({'jwt': sig}).encode()
-        headers = {'content-type': 'application/json', 'ott-device-type': 'web'}
-        resp = self._download_json('https://services.radio-canada.ca/ott/cbc-api/v2/token',
-                                   None, data=data, headers=headers, expected_status=426)
-        cbc_access_token = resp['accessToken']
+    def _call_oauth_api(self, oauth_data, note='Refreshing access token'):
+        response = self._download_json(
+            self._ropc_settings['url'], None, note, data=urlencode_postdata({
+                'client_id': self._CLIENT_ID,
+                **oauth_data,
+                'scope': self._ropc_settings['scopes'],
+            }))
+        self._refresh_token = response['refresh_token']
+        self._access_token = response['access_token']
+        self.cache.store(self._NETRC_MACHINE, 'token_data', [self._refresh_token, self._access_token])
 
-        headers = {'content-type': 'application/json', 'ott-device-type': 'web', 'ott-access-token': cbc_access_token}
-        resp = self._download_json('https://services.radio-canada.ca/ott/cbc-api/v2/profile',
-                                   None, headers=headers, expected_status=426)
-        return resp['claimsToken']
+    def _perform_login(self, username, password):
+        if not self._refresh_token:
+            self._refresh_token, self._access_token = self.cache.load(
+                self._NETRC_MACHINE, 'token_data', default=[None, None])
 
-    def _get_claims_token_expiry(self):
-        # Token is a JWT
-        # JWT is decoded here and 'exp' field is extracted
-        # It is a Unix timestamp for when the token expires
-        b64_data = self._claims_token.split('.')[1]
-        data = base64.urlsafe_b64decode(b64_data + '==')
-        return json.loads(data)['exp']
+        if self._refresh_token and self._access_token:
+            self.write_debug('Using cached refresh token')
+            if not self._claims_token:
+                self._claims_token = self.cache.load(self._NETRC_MACHINE, 'claims_token')
+            return
 
-    def claims_token_expired(self):
-        exp = self._get_claims_token_expiry()
-        # It will expire in less than 10 seconds, or has already expired
-        return exp - time.time() < 10
+        try:
+            self._call_oauth_api({
+                'grant_type': 'password',
+                'username': username,
+                'password': password,
+            }, note='Logging in')
+        except ExtractorError as e:
+            if isinstance(e.cause, HTTPError) and e.cause.status == 400:
+                raise ExtractorError('Invalid username and/or password', expected=True)
+            raise
 
-    def claims_token_valid(self):
-        return self._claims_token is not None and not self.claims_token_expired()
+    def _fetch_access_token(self):
+        if self._is_jwt_expired(self._access_token):
+            try:
+                self._call_oauth_api({
+                    'grant_type': 'refresh_token',
+                    'refresh_token': self._refresh_token,
+                })
+            except ExtractorError:
+                self._refresh_token, self._access_token = None, None
self.cache.store(self._NETRC_MACHINE, 'token_data', [None, None])
self.report_warning('Refresh token has been invalidated; retrying with credentials')
self._perform_login(*self._get_login_info())
def _get_claims_token(self, email, password): return self._access_token
if not self.claims_token_valid():
self._claims_token = self._new_claims_token(email, password) def _fetch_claims_token(self):
if not self._get_login_info()[0]:
return None
if not self._claims_token or self._is_jwt_expired(self._claims_token):
self._claims_token = self._download_json(
'https://services.radio-canada.ca/ott/subscription/v2/gem/Subscriber/profile',
None, 'Downloading claims token', query={'device': 'web'},
headers={'Authorization': f'Bearer {self._fetch_access_token()}'})['claimsToken']
self.cache.store(self._NETRC_MACHINE, 'claims_token', self._claims_token) self.cache.store(self._NETRC_MACHINE, 'claims_token', self._claims_token)
else:
self.write_debug('Using cached claims token')
return self._claims_token return self._claims_token
def _real_initialize(self):
if self.claims_token_valid():
return
self._claims_token = self.cache.load(self._NETRC_MACHINE, 'claims_token')
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id, season_number = self._match_valid_url(url).group('id', 'season')
video_info = self._download_json( video_info = self._call_show_api(video_id)
f'https://services.radio-canada.ca/ott/cbc-api/v2/assets/{video_id}', item_info = traverse_obj(video_info, (
video_id, expected_status=426) 'content', ..., 'lineups', ..., 'items',
lambda _, v: v['url'] == video_id, any, {require('item info')}))
email, password = self._get_login_info() headers = {}
if email and password: if claims_token := self._fetch_claims_token():
claims_token = self._get_claims_token(email, password) headers['x-claims-token'] = claims_token
headers = {'x-claims-token': claims_token}
else: m3u8_info = self._download_json(
headers = {} 'https://services.radio-canada.ca/media/validation/v2/',
m3u8_info = self._download_json(video_info['playSession']['url'], video_id, headers=headers) video_id, headers=headers, query={
'appCode': 'gem',
'connectionType': 'hd',
'deviceType': 'ipad',
'multibitrate': 'true',
'output': 'json',
'tech': 'hls',
'manifestVersion': '2',
'manifestType': 'desktop',
'idMedia': item_info['idMedia'],
})
if m3u8_info.get('errorCode') == 1: if m3u8_info.get('errorCode') == 1:
self.raise_geo_restricted(countries=['CA']) self.raise_geo_restricted(countries=['CA'])
@@ -671,26 +735,20 @@ class CBCGemIE(InfoExtractor):
fmt['preference'] = -2 fmt['preference'] = -2
return { return {
'season_number': int_or_none(season_number),
**traverse_obj(video_info, {
'series': ('title', {str}),
'season_number': ('structuredMetadata', 'partofSeason', 'seasonNumber', {int_or_none}),
'genres': ('structuredMetadata', 'genre', ..., {str}),
}),
**self._extract_item_info(item_info),
'id': video_id, 'id': video_id,
'episode_id': video_id, 'episode_id': video_id,
'formats': formats, 'formats': formats,
**traverse_obj(video_info, {
'title': ('title', {str}),
'episode': ('title', {str}),
'description': ('description', {str}),
'thumbnail': ('image', {url_or_none}),
'series': ('series', {str}),
'season_number': ('season', {int_or_none}),
'episode_number': ('episode', {int_or_none}),
'duration': ('duration', {int_or_none}),
'categories': ('category', {str}, all),
'release_timestamp': ('airDate', {int_or_none(scale=1000)}),
'timestamp': ('availableDate', {int_or_none(scale=1000)}),
}),
} }
class CBCGemPlaylistIE(InfoExtractor): class CBCGemPlaylistIE(CBCGemBaseIE):
IE_NAME = 'gem.cbc.ca:playlist' IE_NAME = 'gem.cbc.ca:playlist'
_VALID_URL = r'https?://gem\.cbc\.ca/(?:media/)?(?P<id>(?P<show>[0-9a-z-]+)/s(?P<season>[0-9]+))/?(?:[?#]|$)' _VALID_URL = r'https?://gem\.cbc\.ca/(?:media/)?(?P<id>(?P<show>[0-9a-z-]+)/s(?P<season>[0-9]+))/?(?:[?#]|$)'
_TESTS = [{ _TESTS = [{
@@ -700,70 +758,35 @@ class CBCGemPlaylistIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': 'schitts-creek/s06', 'id': 'schitts-creek/s06',
'title': 'Season 6', 'title': 'Season 6',
'description': 'md5:6a92104a56cbeb5818cc47884d4326a2',
'series': 'Schitt\'s Creek', 'series': 'Schitt\'s Creek',
'season_number': 6, 'season_number': 6,
'season': 'Season 6', 'season': 'Season 6',
'thumbnail': 'https://images.radio-canada.ca/v1/synps-cbc/season/perso/cbc_schitts_creek_season_06_carousel_v03.jpg?impolicy=ott&im=Resize=(_Size_)&quality=75',
}, },
}, { }, {
'url': 'https://gem.cbc.ca/schitts-creek/s06', 'url': 'https://gem.cbc.ca/schitts-creek/s06',
'only_matching': True, 'only_matching': True,
}] }]
_API_BASE = 'https://services.radio-canada.ca/ott/cbc-api/v2/shows/'
def _entries(self, season_info):
for episode in traverse_obj(season_info, ('items', lambda _, v: v['url'])):
yield self.url_result(
f'https://gem.cbc.ca/media/{episode["url"]}', CBCGemIE,
**self._extract_item_info(episode))
def _real_extract(self, url): def _real_extract(self, url):
match = self._match_valid_url(url) season_id, show, season = self._match_valid_url(url).group('id', 'show', 'season')
season_id = match.group('id') show_info = self._call_show_api(show, display_id=season_id)
show = match.group('show') season_info = traverse_obj(show_info, (
show_info = self._download_json(self._API_BASE + show, season_id, expected_status=426) 'content', ..., 'lineups',
season = int(match.group('season')) lambda _, v: v['seasonNumber'] == int(season), any, {require('season info')}))
season_info = next((s for s in show_info['seasons'] if s.get('season') == season), None) return self.playlist_result(
self._entries(season_info), season_id,
if season_info is None: **traverse_obj(season_info, {
raise ExtractorError(f'Couldn\'t find season {season} of {show}') 'title': ('title', {str}),
'season': ('title', {str}),
episodes = [] 'season_number': ('seasonNumber', {int_or_none}),
for episode in season_info['assets']: }), series=traverse_obj(show_info, ('title', {str})))
episodes.append({
'_type': 'url_transparent',
'ie_key': 'CBCGem',
'url': 'https://gem.cbc.ca/media/' + episode['id'],
'id': episode['id'],
'title': episode.get('title'),
'description': episode.get('description'),
'thumbnail': episode.get('image'),
'series': episode.get('series'),
'season_number': episode.get('season'),
'season': season_info['title'],
'season_id': season_info.get('id'),
'episode_number': episode.get('episode'),
'episode': episode.get('title'),
'episode_id': episode['id'],
'duration': episode.get('duration'),
'categories': [episode.get('category')],
})
thumbnail = None
tn_uri = season_info.get('image')
# the-national was observed to use a "data:image/png;base64"
# URI for their 'image' value. The image was 1x1, and is
# probably just a placeholder, so it is ignored.
if tn_uri is not None and not tn_uri.startswith('data:'):
thumbnail = tn_uri
return {
'_type': 'playlist',
'entries': episodes,
'id': season_id,
'title': season_info['title'],
'description': season_info.get('description'),
'thumbnail': thumbnail,
'series': show_info.get('title'),
'season_number': season_info.get('season'),
'season': season_info['title'],
}
class CBCGemLiveIE(InfoExtractor): class CBCGemLiveIE(InfoExtractor):
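
Note: the reworked Gem login replaces the old LoginRadius round-trips with a standard OAuth ROPC grant and treats every token as a JWT that is refreshed shortly before expiry. A minimal stdlib sketch of that expiry check (the 300-second leeway mirrors `_is_jwt_expired` above; the helper names and the demo token construction are my own):

```python
import base64
import json
import time


def jwt_payload(token):
    # Decode the JWT payload without verifying the signature, which is all an
    # expiry check needs (yt_dlp.utils.jwt_decode_hs256 works the same way)
    b64 = token.split('.')[1]
    return json.loads(base64.urlsafe_b64decode(b64 + '=' * (-len(b64) % 4)))


def is_jwt_expired(token, leeway=300):
    # Treat anything with <5 minutes of validity left as expired,
    # matching the _is_jwt_expired() check in the diff above
    return jwt_payload(token)['exp'] - time.time() < leeway


# Demo with a hand-built, unsigned token that expires in 60 seconds
payload = base64.urlsafe_b64encode(json.dumps({'exp': time.time() + 60}).encode()).decode().rstrip('=')
print(is_jwt_expired(f'e30.{payload}.sig'))  # True: inside the 300s leeway
```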


@@ -121,10 +121,7 @@ class CDAIE(InfoExtractor):
         }, **kwargs)

     def _perform_login(self, username, password):
-        app_version = random.choice((
-            '1.2.88 build 15306',
-            '1.2.174 build 18469',
-        ))
+        app_version = '1.2.255 build 21541'
         android_version = random.randrange(8, 14)
         phone_model = random.choice((
             # x-kom.pl top selling Android smartphones, as of 2022-12-26
@@ -190,7 +187,7 @@ class CDAIE(InfoExtractor):
         meta = self._download_json(
             f'{self._BASE_API_URL}/video/{video_id}', video_id, headers=self._API_HEADERS)['video']

-        uploader = traverse_obj(meta, 'author', 'login')
+        uploader = traverse_obj(meta, ('author', 'login', {str}))

         formats = [{
             'url': quality['file'],


@@ -21,7 +21,7 @@ class CHZZKLiveIE(InfoExtractor):
             'channel': '진짜도현',
             'channel_id': 'c68b8ef525fb3d2fa146344d84991753',
             'channel_is_verified': False,
-            'thumbnail': r're:^https?://.*\.jpg$',
+            'thumbnail': r're:https?://.+/.+\.jpg',
             'timestamp': 1705510344,
             'upload_date': '20240117',
             'live_status': 'is_live',
@@ -98,7 +98,7 @@ class CHZZKVideoIE(InfoExtractor):
             'channel': '침착맨',
             'channel_id': 'bb382c2c0cc9fa7c86ab3b037fb5799c',
             'channel_is_verified': False,
-            'thumbnail': r're:^https?://.*\.jpg$',
+            'thumbnail': r're:https?://.+/.+\.jpg',
             'duration': 15577,
             'timestamp': 1702970505.417,
             'upload_date': '20231219',
@@ -115,7 +115,7 @@ class CHZZKVideoIE(InfoExtractor):
             'channel': '라디유radiyu',
             'channel_id': '68f895c59a1043bc5019b5e08c83a5c5',
             'channel_is_verified': False,
-            'thumbnail': r're:^https?://.*\.jpg$',
+            'thumbnail': r're:https?://.+/.+\.jpg',
             'duration': 95,
             'timestamp': 1703102631.722,
             'upload_date': '20231220',
@@ -131,12 +131,30 @@ class CHZZKVideoIE(InfoExtractor):
             'channel': '강지',
             'channel_id': 'b5ed5db484d04faf4d150aedd362f34b',
             'channel_is_verified': True,
-            'thumbnail': r're:^https?://.*\.jpg$',
+            'thumbnail': r're:https?://.+/.+\.jpg',
             'duration': 4433,
             'timestamp': 1703307460.214,
             'upload_date': '20231223',
             'view_count': int,
         },
+    }, {
+        # video_status == 'NONE' but is downloadable
+        'url': 'https://chzzk.naver.com/video/6325166',
+        'info_dict': {
+            'id': '6325166',
+            'ext': 'mp4',
+            'title': '와이프 숙제빼주기',
+            'channel': '이 다',
+            'channel_id': '0076a519f147ee9fd0959bf02f9571ca',
+            'channel_is_verified': False,
+            'view_count': int,
+            'duration': 28167,
+            'thumbnail': r're:https?://.+/.+\.jpg',
+            'timestamp': 1742139216.86,
+            'upload_date': '20250316',
+            'live_status': 'was_live',
+        },
+        'params': {'skip_download': 'm3u8'},
     }]

     def _real_extract(self, url):
@@ -147,11 +165,7 @@ class CHZZKVideoIE(InfoExtractor):
         live_status = 'was_live' if video_meta.get('liveOpenDate') else 'not_live'
         video_status = video_meta.get('vodStatus')

-        if video_status == 'UPLOAD':
-            playback = self._parse_json(video_meta['liveRewindPlaybackJson'], video_id)
-            formats, subtitles = self._extract_m3u8_formats_and_subtitles(
-                playback['media'][0]['path'], video_id, 'mp4', m3u8_id='hls')
-        elif video_status == 'ABR_HLS':
+        if video_status == 'ABR_HLS':
             formats, subtitles = self._extract_mpd_formats_and_subtitles(
                 f'https://apis.naver.com/neonplayer/vodplay/v1/playback/{video_meta["videoId"]}',
                 video_id, query={
@@ -161,10 +175,17 @@ class CHZZKVideoIE(InfoExtractor):
                 'cpl': 'en_US',
             })
         else:
-            self.raise_no_formats(
-                f'Unknown video status detected: "{video_status}"', expected=True, video_id=video_id)
-            formats, subtitles = [], {}
-            live_status = 'post_live' if live_status == 'was_live' else None
+            fatal = video_status == 'UPLOAD'
+            playback = self._parse_json(video_meta['liveRewindPlaybackJson'], video_id, fatal=fatal)
+            formats, subtitles = self._extract_m3u8_formats_and_subtitles(
+                traverse_obj(playback, ('media', 0, 'path')), video_id, 'mp4', m3u8_id='hls', fatal=fatal)
+            if formats and video_status != 'UPLOAD':
+                self.write_debug(f'Video found with status: "{video_status}"')
+            elif not formats:
+                self.raise_no_formats(
+                    f'Unknown video status detected: "{video_status}"', expected=True, video_id=video_id)
+                formats, subtitles = [], {}
+                live_status = 'post_live' if live_status == 'was_live' else None

         return {
             'id': video_id,
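
The rewritten branch above folds the old `UPLOAD` path into the fallback: the rewind-playback JSON is now tried for every unrecognized status, but a failure is only fatal for the known-good `UPLOAD` status. A hedged re-statement of that control flow in isolation (the function and parameter names here are illustrative, not yt-dlp API):

```python
def resolve_playback(video_meta, video_status, parse_json, extract_m3u8):
    # Only a known-good status ('UPLOAD') is allowed to raise; any other
    # status attempts the same extraction non-fatally and may yield nothing
    fatal = video_status == 'UPLOAD'
    try:
        playback = parse_json(video_meta['liveRewindPlaybackJson'])
        return extract_m3u8(playback['media'][0]['path'])
    except (KeyError, IndexError, TypeError, ValueError):
        if fatal:
            raise
        return []
```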


@@ -2,7 +2,6 @@ import base64
 import collections
 import functools
 import getpass
-import hashlib
 import http.client
 import http.cookiejar
 import http.cookies
@@ -30,6 +29,7 @@ from ..compat import (
 from ..cookies import LenientSimpleCookie
 from ..downloader.f4m import get_base_url, remove_encrypted_media
 from ..downloader.hls import HlsFD
+from ..globals import plugin_ies_overrides
 from ..networking import HEADRequest, Request
 from ..networking.exceptions import (
     HTTPError,
@@ -78,7 +78,7 @@ from ..utils import (
     parse_iso8601,
     parse_m3u8_attributes,
     parse_resolution,
-    sanitize_filename,
+    qualities,
     sanitize_url,
     smuggle_url,
     str_or_none,
@@ -100,6 +100,7 @@ from ..utils import (
     xpath_text,
     xpath_with_ns,
 )
+from ..utils._utils import _request_dump_filename


 class InfoExtractor:
@@ -201,6 +202,11 @@ class InfoExtractor:
                                 fragment_base_url
                     * "duration" (optional, int or float)
                     * "filesize" (optional, int)
+                    * hls_media_playlist_data
+                                The M3U8 media playlist data as a string.
+                                Only use if the data must be modified during extraction and
+                                the native HLS downloader should bypass requesting the URL.
+                                Does not apply if ffmpeg is used as external downloader
                     * is_from_start  Is a live format that can be downloaded
                                 from the start. Boolean
                     * preference    Order number of this format. If this field is
@@ -1017,23 +1023,6 @@ class InfoExtractor:
                 'Visit http://blocklist.rkn.gov.ru/ for a block reason.',
                 expected=True)

-    def _request_dump_filename(self, url, video_id, data=None):
-        if data is not None:
-            data = hashlib.md5(data).hexdigest()
-        basen = join_nonempty(video_id, data, url, delim='_')
-        trim_length = self.get_param('trim_file_name') or 240
-        if len(basen) > trim_length:
-            h = '___' + hashlib.md5(basen.encode()).hexdigest()
-            basen = basen[:trim_length - len(h)] + h
-        filename = sanitize_filename(f'{basen}.dump', restricted=True)
-        # Working around MAX_PATH limitation on Windows (see
-        # http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx)
-        if os.name == 'nt':
-            absfilepath = os.path.abspath(filename)
-            if len(absfilepath) > 259:
-                filename = fR'\\?\{absfilepath}'
-        return filename
-
     def __decode_webpage(self, webpage_bytes, encoding, headers):
         if not encoding:
             encoding = self._guess_encoding_from_content(headers.get('Content-Type', ''), webpage_bytes)
@@ -1062,7 +1051,9 @@ class InfoExtractor:
         if self.get_param('write_pages'):
             if isinstance(url_or_request, Request):
                 data = self._create_request(url_or_request, data).data
-            filename = self._request_dump_filename(urlh.url, video_id, data)
+            filename = _request_dump_filename(
+                urlh.url, video_id, data,
+                trim_length=self.get_param('trim_file_name'))
             self.to_screen(f'Saving request to {filename}')
             with open(filename, 'wb') as outf:
                 outf.write(webpage_bytes)
@@ -1123,7 +1114,9 @@ class InfoExtractor:
                       impersonate=None, require_impersonation=False):
         if self.get_param('load_pages'):
             url_or_request = self._create_request(url_or_request, data, headers, query)
-            filename = self._request_dump_filename(url_or_request.url, video_id, url_or_request.data)
+            filename = _request_dump_filename(
+                url_or_request.url, video_id, url_or_request.data,
+                trim_length=self.get_param('trim_file_name'))
             self.to_screen(f'Loading request from {filename}')
             try:
                 with open(filename, 'rb') as dumpf:
@@ -2185,6 +2178,8 @@ class InfoExtractor:
                 media_url = media.get('URI')
                 if media_url:
                     manifest_url = format_url(media_url)
+                    is_audio = media_type == 'AUDIO'
+                    is_alternate = media.get('DEFAULT') == 'NO' or media.get('AUTOSELECT') == 'NO'
                     formats.extend({
                         'format_id': join_nonempty(m3u8_id, group_id, name, idx),
                         'format_note': name,
@@ -2197,7 +2192,11 @@ class InfoExtractor:
                         'preference': preference,
                         'quality': quality,
                         'has_drm': has_drm,
-                        'vcodec': 'none' if media_type == 'AUDIO' else None,
+                        'vcodec': 'none' if is_audio else None,
+                        # Alternate audio formats (e.g. audio description) should be deprioritized
+                        'source_preference': -2 if is_audio and is_alternate else None,
+                        # Save this to assign source_preference based on associated video stream
+                        '_audio_group_id': group_id if is_audio and not is_alternate else None,
                     } for idx in _extract_m3u8_playlist_indices(manifest_url))

         def build_stream_name():
@@ -2292,6 +2291,8 @@ class InfoExtractor:
                     # ignore references to rendition groups and treat them
                     # as complete formats.
                     if audio_group_id and codecs and f.get('vcodec') != 'none':
+                        # Save this to determine quality of audio formats that only have a GROUP-ID
+                        f['_audio_group_id'] = audio_group_id
                         audio_group = groups.get(audio_group_id)
                         if audio_group and audio_group[0].get('URI'):
                             # TODO: update acodec for audio only formats with
@@ -2314,6 +2315,28 @@ class InfoExtractor:
                     formats.append(http_f)

                 last_stream_inf = {}
+
+        # Some audio-only formats only have a GROUP-ID without any other quality/bitrate/codec info
+        # Each audio GROUP-ID corresponds with one or more video formats' AUDIO attribute
+        # For sorting purposes, set source_preference based on the quality of the video formats they are grouped with
+        # See https://github.com/yt-dlp/yt-dlp/issues/11178
+        audio_groups_by_quality = orderedSet(f['_audio_group_id'] for f in sorted(
+            traverse_obj(formats, lambda _, v: v.get('vcodec') != 'none' and v['_audio_group_id']),
+            key=lambda x: (x.get('tbr') or 0, x.get('width') or 0)))
+        audio_quality_map = {
+            audio_groups_by_quality[0]: 'low',
+            audio_groups_by_quality[-1]: 'high',
+        } if len(audio_groups_by_quality) > 1 else None
+        audio_preference = qualities(audio_groups_by_quality)
+        for fmt in formats:
+            audio_group_id = fmt.pop('_audio_group_id', None)
+            if not audio_quality_map or not audio_group_id or fmt.get('vcodec') != 'none':
+                continue
+            # Use source_preference since quality and preference are set by params
+            fmt['source_preference'] = audio_preference(audio_group_id)
+            fmt['format_note'] = join_nonempty(
+                fmt.get('format_note'), audio_quality_map.get(audio_group_id), delim=', ')
+
         return formats, subtitles

     def _extract_m3u8_vod_duration(
@@ -2943,8 +2966,7 @@ class InfoExtractor:
                     segment_duration = None
                     if 'total_number' not in representation_ms_info and 'segment_duration' in representation_ms_info:
                         segment_duration = float_or_none(representation_ms_info['segment_duration'], representation_ms_info['timescale'])
-                        representation_ms_info['total_number'] = int(math.ceil(
-                            float_or_none(period_duration, segment_duration, default=0)))
+                        representation_ms_info['total_number'] = math.ceil(float_or_none(period_duration, segment_duration, default=0))
                     representation_ms_info['fragments'] = [{
                         media_location_key: media_template % {
                             'Number': segment_number,
@@ -3963,14 +3985,18 @@
     def __init_subclass__(cls, *, plugin_name=None, **kwargs):
         if plugin_name:
             mro = inspect.getmro(cls)
-            super_class = cls.__wrapped__ = mro[mro.index(cls) + 1]
-            cls.PLUGIN_NAME, cls.ie_key = plugin_name, super_class.ie_key
-            cls.IE_NAME = f'{super_class.IE_NAME}+{plugin_name}'
+            next_mro_class = super_class = mro[mro.index(cls) + 1]
+
             while getattr(super_class, '__wrapped__', None):
                 super_class = super_class.__wrapped__
-            setattr(sys.modules[super_class.__module__], super_class.__name__, cls)
-            _PLUGIN_OVERRIDES[super_class].append(cls)
+
+            if not any(override.PLUGIN_NAME == plugin_name for override in plugin_ies_overrides.value[super_class]):
+                cls.__wrapped__ = next_mro_class
+                cls.PLUGIN_NAME, cls.ie_key = plugin_name, next_mro_class.ie_key
+                cls.IE_NAME = f'{next_mro_class.IE_NAME}+{plugin_name}'
+
+                setattr(sys.modules[super_class.__module__], super_class.__name__, cls)
+                plugin_ies_overrides.value[super_class].append(cls)
+
         return super().__init_subclass__(**kwargs)
@@ -4026,6 +4052,3 @@ class UnsupportedURLIE(InfoExtractor):

     def _real_extract(self, url):
         raise UnsupportedError(url)
-
-
-_PLUGIN_OVERRIDES = collections.defaultdict(list)
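
The GROUP-ID handling added above can be exercised in isolation: rank each audio group by the best video stream that references it, then hand that rank back to the audio-only formats as `source_preference`. A self-contained sketch with made-up format dicts (only the `qualities`-style ranking is reproduced here, not the full extractor plumbing):

```python
# Video formats carry tbr/width plus the audio group they reference; audio
# formats get a source_preference derived from their group's video quality
formats = [
    {'format_id': 'v-lo', 'vcodec': 'avc1', 'tbr': 800, 'width': 640, '_audio_group_id': 'aud-lo'},
    {'format_id': 'v-hi', 'vcodec': 'avc1', 'tbr': 4000, 'width': 1920, '_audio_group_id': 'aud-hi'},
    {'format_id': 'a-lo', 'vcodec': 'none', '_audio_group_id': 'aud-lo'},
    {'format_id': 'a-hi', 'vcodec': 'none', '_audio_group_id': 'aud-hi'},
]

# Collect the groups ordered worst-to-best by their associated video streams
groups = []
for fmt in sorted((f for f in formats if f['vcodec'] != 'none'),
                  key=lambda f: (f.get('tbr') or 0, f.get('width') or 0)):
    if fmt['_audio_group_id'] not in groups:
        groups.append(fmt['_audio_group_id'])

for fmt in formats:
    group = fmt.pop('_audio_group_id', None)
    if group and fmt['vcodec'] == 'none':
        # A higher index means a better associated video stream,
        # so that audio group wins the source_preference tiebreak
        fmt['source_preference'] = groups.index(group)

assert [f.get('source_preference') for f in formats] == [None, None, 0, 1]
```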


@@ -3,7 +3,7 @@ from ..utils import int_or_none


 class CultureUnpluggedIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?cultureunplugged\.com/documentary/watch-online/play/(?P<id>\d+)(?:/(?P<display_id>[^/]+))?'
+    _VALID_URL = r'https?://(?:www\.)?cultureunplugged\.com/(?:documentary/watch-online/)?play/(?P<id>\d+)(?:/(?P<display_id>[^/#?]+))?'
     _TESTS = [{
         'url': 'http://www.cultureunplugged.com/documentary/watch-online/play/53662/The-Next--Best-West',
         'md5': 'ac6c093b089f7d05e79934dcb3d228fc',
@@ -12,12 +12,25 @@ class CultureUnpluggedIE(InfoExtractor):
             'display_id': 'The-Next--Best-West',
             'ext': 'mp4',
             'title': 'The Next, Best West',
-            'description': 'md5:0423cd00833dea1519cf014e9d0903b1',
+            'description': 'md5:770033a3b7c2946a3bcfb7f1c6fb7045',
             'thumbnail': r're:^https?://.*\.jpg$',
-            'creator': 'Coldstream Creative',
+            'creators': ['Coldstream Creative'],
             'duration': 2203,
             'view_count': int,
         },
+    }, {
+        'url': 'https://www.cultureunplugged.com/play/2833/Koi-Sunta-Hai--Journeys-with-Kumar---Kabir--Someone-is-Listening-',
+        'md5': 'dc2014bc470dfccba389a1c934fa29fa',
+        'info_dict': {
+            'id': '2833',
+            'display_id': 'Koi-Sunta-Hai--Journeys-with-Kumar---Kabir--Someone-is-Listening-',
+            'ext': 'mp4',
+            'title': 'Koi Sunta Hai: Journeys with Kumar & Kabir (Someone is Listening)',
+            'description': 'md5:fa94ac934927c98660362b8285b2cda5',
+            'view_count': int,
+            'thumbnail': 'https://s3.amazonaws.com/cdn.cultureunplugged.com/thumbnails_16_9/lg/2833.jpg',
+            'creators': ['Srishti'],
+        },
     }, {
         'url': 'http://www.cultureunplugged.com/documentary/watch-online/play/53662',
         'only_matching': True,


@@ -1,35 +1,40 @@
+import re
+
 from .common import InfoExtractor
 from ..utils import (
     ExtractorError,
     int_or_none,
     parse_age_limit,
     parse_iso8601,
+    parse_qs,
     smuggle_url,
     str_or_none,
     update_url_query,
 )
+from ..utils.traversal import traverse_obj


 class CWTVIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?cw(?:tv(?:pr)?|seed)\.com/(?:shows/)?(?:[^/]+/)+[^?]*\?.*\b(?:play|watch)=(?P<id>[a-z0-9]{8}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{12})'
+    IE_NAME = 'cwtv'
+    _VALID_URL = r'https?://(?:www\.)?cw(?:tv(?:pr)?|seed)\.com/(?:shows/)?(?:[^/]+/)+[^?]*\?.*\b(?:play|watch|guid)=(?P<id>[a-z0-9]{8}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{4}-[a-z0-9]{12})'
     _TESTS = [{
-        'url': 'https://www.cwtv.com/shows/all-american-homecoming/ready-or-not/?play=d848488f-f62a-40fd-af1f-6440b1821aab',
+        'url': 'https://www.cwtv.com/shows/continuum/a-stitch-in-time/?play=9149a1e1-4cb2-46d7-81b2-47d35bbd332b',
         'info_dict': {
-            'id': 'd848488f-f62a-40fd-af1f-6440b1821aab',
+            'id': '9149a1e1-4cb2-46d7-81b2-47d35bbd332b',
             'ext': 'mp4',
-            'title': 'Ready Or Not',
-            'description': 'Simone is concerned about changes taking place at Bringston; JR makes a decision about his future.',
-            'thumbnail': r're:^https?://.*\.jpe?g$',
-            'duration': 2547,
-            'timestamp': 1720519200,
+            'title': 'A Stitch in Time',
+            'description': r're:(?s)City Protective Services officer Kiera Cameron is transported from 2077.+',
+            'thumbnail': r're:https?://.+\.jpe?g',
+            'duration': 2632,
+            'timestamp': 1736928000,
             'uploader': 'CWTV',
-            'chapters': 'count:6',
-            'series': 'All American: Homecoming',
-            'season_number': 3,
+            'chapters': 'count:5',
+            'series': 'Continuum',
+            'season_number': 1,
             'episode_number': 1,
-            'age_limit': 0,
-            'upload_date': '20240709',
-            'season': 'Season 3',
+            'age_limit': 14,
+            'upload_date': '20250115',
+            'season': 'Season 1',
             'episode': 'Episode 1',
         },
         'params': {
@@ -42,7 +47,7 @@ class CWTVIE(InfoExtractor):
             'id': '6b15e985-9345-4f60-baf8-56e96be57c63',
             'ext': 'mp4',
             'title': 'Legends of Yesterday',
-            'description': 'Oliver and Barry Allen take Kendra Saunders and Carter Hall to a remote location to keep them hidden from Vandal Savage while they figure out how to defeat him.',
+            'description': r're:(?s)Oliver and Barry Allen take Kendra Saunders and Carter Hall to a remote.+',
             'duration': 2665,
             'series': 'Arrow',
             'season_number': 4,
@@ -71,7 +76,7 @@ class CWTVIE(InfoExtractor):
             'timestamp': 1444107300,
             'age_limit': 14,
             'uploader': 'CWTV',
-            'thumbnail': r're:^https?://.*\.jpe?g$',
+            'thumbnail': r're:https?://.+\.jpe?g',
             'chapters': 'count:4',
             'episode': 'Episode 20',
             'season': 'Season 11',
@@ -89,14 +94,17 @@ class CWTVIE(InfoExtractor):
     }, {
         'url': 'http://cwtv.com/shows/arrow/legends-of-yesterday/?watch=6b15e985-9345-4f60-baf8-56e96be57c63',
         'only_matching': True,
+    }, {
+        'url': 'http://www.cwtv.com/movies/play/?guid=0a8e8b5b-1356-41d5-9a6a-4eda1a6feb6c',
+        'only_matching': True,
     }]

     def _real_extract(self, url):
         video_id = self._match_id(url)
         data = self._download_json(
-            f'https://images.cwtv.com/feed/mobileapp/video-meta/apiversion_12/guid_{video_id}', video_id)
-        if data.get('result') != 'ok':
-            raise ExtractorError(data['msg'], expected=True)
+            f'https://images.cwtv.com/feed/app-2/video-meta/apiversion_22/device_android/guid_{video_id}', video_id)
+        if traverse_obj(data, 'result') != 'ok':
+            raise ExtractorError(traverse_obj(data, (('error_msg', 'msg'), {str}, any)), expected=True)
         video_data = data['video']
         title = video_data['title']
         mpx_url = update_url_query(
@@ -123,3 +131,50 @@ class CWTVIE(InfoExtractor):
             'ie_key': 'ThePlatform',
             'thumbnail': video_data.get('large_thumbnail'),
         }
+
+
+class CWTVMovieIE(InfoExtractor):
+    IE_NAME = 'cwtv:movie'
+    _VALID_URL = r'https?://(?:www\.)?cwtv\.com/shows/(?P<id>[\w-]+)/?\?(?:[^#]+&)?viewContext=Movies'
+    _TESTS = [{
+        'url': 'https://www.cwtv.com/shows/the-crush/?viewContext=Movies+Swimlane',
+        'info_dict': {
+            'id': '0a8e8b5b-1356-41d5-9a6a-4eda1a6feb6c',
+            'ext': 'mp4',
+            'title': 'The Crush',
+            'upload_date': '20241112',
+            'description': 'md5:1549acd90dff4a8273acd7284458363e',
+            'chapters': 'count:9',
+            'timestamp': 1731398400,
+            'age_limit': 16,
+            'duration': 5337,
+            'series': 'The Crush',
+            'season': 'Season 1',
+            'uploader': 'CWTV',
+            'season_number': 1,
+            'episode': 'Episode 1',
+            'episode_number': 1,
+            'thumbnail': r're:https?://.+\.jpe?g',
+        },
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        },
+    }]
+    _UUID_RE = r'[\da-f]{8}-(?:[\da-f]{4}-){3}[\da-f]{12}'
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+        app_url = (
+            self._html_search_meta('al:ios:url', webpage, default=None)
+            or self._html_search_meta('al:android:url', webpage, default=None))
+        video_id = (
+            traverse_obj(parse_qs(app_url), ('video_id', 0, {lambda x: re.fullmatch(self._UUID_RE, x)}, 0))
+            or self._search_regex([
+                rf'CWTV\.Site\.curPlayingGUID\s*=\s*["\']({self._UUID_RE})',
+                rf'CWTV\.Site\.viewInAppURL\s*=\s*["\']/shows/[\w-]+/watch-in-app/\?play=({self._UUID_RE})',
+            ], webpage, 'video ID'))
+
+        return self.url_result(
+            f'https://www.cwtv.com/shows/{display_id}/{display_id}/?play={video_id}', CWTVIE, video_id)
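
The new movie extractor recovers the playable GUID from the page's mobile-app deep link rather than a player config. The same lookup works with plain stdlib tools; a sketch using a hypothetical deep-link value in the shape the `al:ios:url`/`al:android:url` meta tags carry:

```python
import re
import urllib.parse

UUID_RE = r'[\da-f]{8}-(?:[\da-f]{4}-){3}[\da-f]{12}'

# Hypothetical app deep link; the scheme and path are illustrative
app_url = 'cwtv://shows/the-crush?video_id=0a8e8b5b-1356-41d5-9a6a-4eda1a6feb6c'

query = urllib.parse.parse_qs(urllib.parse.urlparse(app_url).query)
candidate = (query.get('video_id') or [''])[0]
# Validate the candidate strictly so stray query values cannot slip through
video_id = candidate if re.fullmatch(UUID_RE, candidate) else None
print(video_id)  # 0a8e8b5b-1356-41d5-9a6a-4eda1a6feb6c
```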


@@ -100,7 +100,7 @@ class DailymotionBaseInfoExtractor(InfoExtractor):

 class DailymotionIE(DailymotionBaseInfoExtractor):
     _VALID_URL = r'''(?ix)
-                https?://
+                (?:https?:)?//
                     (?:
                         dai\.ly/|
                         (?:
@@ -116,7 +116,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
                     (?P<id>[^/?_&#]+)(?:[\w-]*\?playlist=(?P<playlist_id>x[0-9a-z]+))?
     '''
     IE_NAME = 'dailymotion'
-    _EMBED_REGEX = [r'<(?:(?:embed|iframe)[^>]+?src=|input[^>]+id=[\'"]dmcloudUrlEmissionSelect[\'"][^>]+value=)(["\'])(?P<url>(?:https?:)?//(?:www\.)?dailymotion\.com/(?:embed|swf)/video/.+?)\1']
+    _EMBED_REGEX = [rf'(?ix)<(?:(?:embed|iframe)[^>]+?src=|input[^>]+id=[\'"]dmcloudUrlEmissionSelect[\'"][^>]+value=)["\'](?P<url>{_VALID_URL[5:]})']
     _TESTS = [{
         'url': 'http://www.dailymotion.com/video/x5kesuj_office-christmas-party-review-jason-bateman-olivia-munn-t-j-miller_news',
         'md5': '074b95bdee76b9e3654137aee9c79dfe',
@@ -308,6 +308,25 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
             'description': 'Que lindura',
             'tags': [],
         },
+    }, {
+        # //geo.dailymotion.com/player/xysxq.html?video=k2Y4Mjp7krAF9iCuINM
+        'url': 'https://lcp.fr/programmes/avant-la-catastrophe-la-naissance-de-la-dictature-nazie-1933-1936-346819',
+        'info_dict': {
+            'id': 'k2Y4Mjp7krAF9iCuINM',
+            'ext': 'mp4',
+            'title': 'Avant la catastrophe la naissance de la dictature nazie 1933 -1936',
+            'description': 'md5:7b620d5e26edbe45f27bbddc1c0257c1',
+            'uploader': 'LCP Assemblée nationale',
+            'uploader_id': 'xbz33d',
+            'view_count': int,
+            'like_count': int,
+            'age_limit': 0,
+            'duration': 3220,
+            'thumbnail': 'https://s1.dmcdn.net/v/Xvumk1djJBUZfjj2a/x1080',
+            'tags': [],
+            'timestamp': 1739919947,
+            'upload_date': '20250218',
+        },
     }]
     _GEO_BYPASS = False
     _COMMON_MEDIA_FIELDS = '''description
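
The new `_EMBED_REGEX` reuses `_VALID_URL` by slicing off its leading inline-flags prefix, since Python only allows `(?ix)` at the very start of a pattern and the embed regex supplies its own. A simplified sketch of the trick (the host and HTML snippet are made up):

```python
import re

# A verbose, protocol-relative URL pattern with inline flags up front
VALID_URL = r'(?ix)(?:https?:)?//example\.com/video/(?P<id>\w+)'
# Strip the 5-character '(?ix)' prefix so the pattern can be embedded
# inside another regex that declares its own flags
EMBED_REGEX = rf'(?ix)<iframe[^>]+src=["\'](?P<url>{VALID_URL[5:]})'

html = '<iframe src="//example.com/video/x123"></iframe>'
print(re.search(EMBED_REGEX, html).group('url'))  # //example.com/video/x123
```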


@@ -1,142 +0,0 @@
-import json
-
-from .common import InfoExtractor
-from ..utils import (
-    ExtractorError,
-    int_or_none,
-    orderedSet,
-)
-
-
-class DeezerBaseInfoExtractor(InfoExtractor):
-    def get_data(self, url):
-        if not self.get_param('test'):
-            self.report_warning('For now, this extractor only supports the 30 second previews. Patches welcome!')
-
-        mobj = self._match_valid_url(url)
-        data_id = mobj.group('id')
-
-        webpage = self._download_webpage(url, data_id)
-        geoblocking_msg = self._html_search_regex(
-            r'<p class="soon-txt">(.*?)</p>', webpage, 'geoblocking message',
-            default=None)
-        if geoblocking_msg is not None:
-            raise ExtractorError(
-                f'Deezer said: {geoblocking_msg}', expected=True)
-
-        data_json = self._search_regex(
-            (r'__DZR_APP_STATE__\s*=\s*({.+?})\s*</script>',
-             r'naboo\.display\(\'[^\']+\',\s*(.*?)\);\n'),
-            webpage, 'data JSON')
-        data = json.loads(data_json)
-        return data_id, webpage, data
-
-
-class DeezerPlaylistIE(DeezerBaseInfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?deezer\.com/(../)?playlist/(?P<id>[0-9]+)'
-    _TEST = {
-        'url': 'http://www.deezer.com/playlist/176747451',
-        'info_dict': {
-            'id': '176747451',
-            'title': 'Best!',
-            'uploader': 'anonymous',
-            'thumbnail': r're:^https?://(e-)?cdns-images\.dzcdn\.net/images/cover/.*\.jpg$',
-        },
-        'playlist_count': 29,
-    }
-
-    def _real_extract(self, url):
-        playlist_id, webpage, data = self.get_data(url)
-
-        playlist_title = data.get('DATA', {}).get('TITLE')
-        playlist_uploader = data.get('DATA', {}).get('PARENT_USERNAME')
-        playlist_thumbnail = self._search_regex(
-            r'<img id="naboo_playlist_image".*?src="([^"]+)"', webpage,
-            'playlist thumbnail')
-
-        entries = []
-        for s in data.get('SONGS', {}).get('data'):
-            formats = [{
-                'format_id': 'preview',
-                'url': s.get('MEDIA', [{}])[0].get('HREF'),
-                'preference': -100,  # Only the first 30 seconds
-                'ext': 'mp3',
-            }]
-            artists = ', '.join(
-                orderedSet(a.get('ART_NAME') for a in s.get('ARTISTS')))
-            entries.append({
-                'id': s.get('SNG_ID'),
-                'duration': int_or_none(s.get('DURATION')),
-                'title': '{} - {}'.format(artists, s.get('SNG_TITLE')),
-                'uploader': s.get('ART_NAME'),
-                'uploader_id': s.get('ART_ID'),
-                'age_limit': 16 if s.get('EXPLICIT_LYRICS') == '1' else 0,
-                'formats': formats,
-            })
-
-        return {
-            '_type': 'playlist',
-            'id': playlist_id,
-            'title': playlist_title,
-            'uploader': playlist_uploader,
-            'thumbnail': playlist_thumbnail,
-            'entries': entries,
-        }
-
-
-class DeezerAlbumIE(DeezerBaseInfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?deezer\.com/(../)?album/(?P<id>[0-9]+)'
-    _TEST = {
-        'url': 'https://www.deezer.com/fr/album/67505622',
-        'info_dict': {
-            'id': '67505622',
-            'title': 'Last Week',
-            'uploader': 'Home Brew',
-            'thumbnail': r're:^https?://(e-)?cdns-images\.dzcdn\.net/images/cover/.*\.jpg$',
-        },
-        'playlist_count': 7,
-    }
-
-    def _real_extract(self, url):
-        album_id, webpage, data = self.get_data(url)
-
-        album_title = data.get('DATA', {}).get('ALB_TITLE')
-        album_uploader = data.get('DATA', {}).get('ART_NAME')
-        album_thumbnail = self._search_regex(
-            r'<img id="naboo_album_image".*?src="([^"]+)"', webpage,
-            'album thumbnail')
-
-        entries = []
-        for s in data.get('SONGS', {}).get('data'):
-            formats = [{
-                'format_id': 'preview',
-                'url': s.get('MEDIA', [{}])[0].get('HREF'),
-                'preference': -100,  # Only the first 30 seconds
-                'ext': 'mp3',
-            }]
-            artists = ', '.join(
-                orderedSet(a.get('ART_NAME') for a in s.get('ARTISTS')))
-            entries.append({
-                'id': s.get('SNG_ID'),
-                'duration': int_or_none(s.get('DURATION')),
-                'title': '{} - {}'.format(artists, s.get('SNG_TITLE')),
-                'uploader': s.get('ART_NAME'),
-                'uploader_id': s.get('ART_ID'),
-                'age_limit': 16 if s.get('EXPLICIT_LYRICS') == '1' else 0,
-                'formats': formats,
-                'track': s.get('SNG_TITLE'),
-                'track_number': int_or_none(s.get('TRACK_NUMBER')),
-                'track_id': s.get('SNG_ID'),
-                'artist': album_uploader,
-                'album': album_title,
-                'album_artist': album_uploader,
-            })
-
-        return {
-            '_type': 'playlist',
-            'id': album_id,
-            'title': album_title,
-            'uploader': album_uploader,
-            'thumbnail': album_thumbnail,
-            'entries': entries,
-        }


@@ -0,0 +1,130 @@
+from .common import InfoExtractor
+from .youtube import YoutubeIE
+from ..utils import clean_html, int_or_none, traverse_obj, url_or_none, urlencode_postdata
+
+
+class DigiviewIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?ladigitale\.dev/digiview/#/v/(?P<id>[0-9a-f]+)'
+    _TESTS = [{
+        # normal video
+        'url': 'https://ladigitale.dev/digiview/#/v/67a8e50aee2ec',
+        'info_dict': {
+            'id': '67a8e50aee2ec',
+            'ext': 'mp4',
+            'title': 'Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film',
+            'thumbnail': 'https://i.ytimg.com/vi/aqz-KE-bpKQ/hqdefault.jpg',
+            'upload_date': '20141110',
+            'playable_in_embed': True,
+            'duration': 635,
+            'view_count': int,
+            'comment_count': int,
+            'channel': 'Blender',
+            'license': 'Creative Commons Attribution license (reuse allowed)',
+            'like_count': int,
+            'tags': 'count:8',
+            'live_status': 'not_live',
+            'channel_id': 'UCSMOQeBJ2RAnuFungnQOxLg',
+            'channel_follower_count': int,
+            'channel_url': 'https://www.youtube.com/channel/UCSMOQeBJ2RAnuFungnQOxLg',
+            'uploader_id': '@BlenderOfficial',
+            'description': 'md5:8f3ed18a53a1bb36cbb3b70a15782fd0',
+            'categories': ['Film & Animation'],
+            'channel_is_verified': True,
+            'heatmap': 'count:100',
+            'section_end': 635,
+            'uploader': 'Blender',
+            'timestamp': 1415628355,
+            'uploader_url': 'https://www.youtube.com/@BlenderOfficial',
+            'age_limit': 0,
+            'section_start': 0,
+            'availability': 'public',
+        },
+    }, {
+        # cut video
+        'url': 'https://ladigitale.dev/digiview/#/v/67a8e51d0dd58',
+        'info_dict': {
+            'id': '67a8e51d0dd58',
+            'ext': 'mp4',
+            'title': 'Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film',
+            'thumbnail': 'https://i.ytimg.com/vi/aqz-KE-bpKQ/hqdefault.jpg',
+            'upload_date': '20141110',
+            'playable_in_embed': True,
+            'duration': 5,
+            'view_count': int,
+            'comment_count': int,
+            'channel': 'Blender',
+            'license': 'Creative Commons Attribution license (reuse allowed)',
+            'like_count': int,
+            'tags': 'count:8',
+            'live_status': 'not_live',
+            'channel_id': 'UCSMOQeBJ2RAnuFungnQOxLg',
+            'channel_follower_count': int,
+            'channel_url': 'https://www.youtube.com/channel/UCSMOQeBJ2RAnuFungnQOxLg',
+            'uploader_id': '@BlenderOfficial',
+            'description': 'md5:8f3ed18a53a1bb36cbb3b70a15782fd0',
+            'categories': ['Film & Animation'],
+            'channel_is_verified': True,
+            'heatmap': 'count:100',
+            'section_end': 10,
+            'uploader': 'Blender',
+            'timestamp': 1415628355,
+            'uploader_url': 'https://www.youtube.com/@BlenderOfficial',
+            'age_limit': 0,
+            'section_start': 5,
+            'availability': 'public',
+        },
+    }, {
+        # changed title
+        'url': 'https://ladigitale.dev/digiview/#/v/67a8ea5644d7a',
+        'info_dict': {
+            'id': '67a8ea5644d7a',
+            'ext': 'mp4',
+            'title': 'Big Buck Bunny (with title changed)',
+            'thumbnail': 'https://i.ytimg.com/vi/aqz-KE-bpKQ/hqdefault.jpg',
+            'upload_date': '20141110',
+            'playable_in_embed': True,
+            'duration': 5,
+            'view_count': int,
+            'comment_count': int,
+            'channel': 'Blender',
+            'license': 'Creative Commons Attribution license (reuse allowed)',
+            'like_count': int,
+            'tags': 'count:8',
+            'live_status': 'not_live',
+            'channel_id': 'UCSMOQeBJ2RAnuFungnQOxLg',
+            'channel_follower_count': int,
+            'channel_url': 'https://www.youtube.com/channel/UCSMOQeBJ2RAnuFungnQOxLg',
+            'uploader_id': '@BlenderOfficial',
+            'description': 'md5:8f3ed18a53a1bb36cbb3b70a15782fd0',
+            'categories': ['Film & Animation'],
+            'channel_is_verified': True,
+            'heatmap': 'count:100',
+            'section_end': 15,
+            'uploader': 'Blender',
+            'timestamp': 1415628355,
+            'uploader_url': 'https://www.youtube.com/@BlenderOfficial',
+            'age_limit': 0,
+            'section_start': 10,
+            'availability': 'public',
+        },
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        video_data = self._download_json(
+            'https://ladigitale.dev/digiview/inc/recuperer_video.php', video_id,
+            data=urlencode_postdata({'id': video_id}))
+
+        clip_id = video_data['videoId']
+        return self.url_result(
+            f'https://www.youtube.com/watch?v={clip_id}',
+            YoutubeIE, video_id, url_transparent=True,
+            **traverse_obj(video_data, {
+                'section_start': ('debut', {int_or_none}),
+                'section_end': ('fin', {int_or_none}),
+                'description': ('description', {clean_html}, filter),
+                'title': ('titre', {str}),
+                'thumbnail': ('vignette', {url_or_none}),
+                'view_count': ('vues', {int_or_none}),
+            }),
+        )
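
The new extractor never downloads media itself: it resolves to a plain YouTube watch URL and keeps its own title and clip boundaries by returning a `url_transparent` result, whose fields override whatever `YoutubeIE` extracts. A sketch of that result shape (the field values are illustrative; `debut`/`fin` are the clip bounds from the Digiview API response):

```python
def build_result(video_data, clip_id):
    # Fields set here take precedence over the delegated extractor's output
    return {
        '_type': 'url_transparent',
        'ie_key': 'Youtube',
        'url': f'https://www.youtube.com/watch?v={clip_id}',
        'title': video_data.get('titre'),
        'section_start': video_data.get('debut'),
        'section_end': video_data.get('fin'),
    }
```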


@@ -1,10 +1,24 @@
-from .zdf import ZDFIE
+from .zdf import ZDFBaseIE


-class DreiSatIE(ZDFIE):  # XXX: Do not subclass from concrete IE
+class DreiSatIE(ZDFBaseIE):
     IE_NAME = '3sat'
     _VALID_URL = r'https?://(?:www\.)?3sat\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)\.html'
     _TESTS = [{
+        'url': 'https://www.3sat.de/dokumentation/reise/traumziele-suedostasiens-die-philippinen-und-vietnam-102.html',
+        'info_dict': {
+            'id': '231124_traumziele_philippinen_und_vietnam_dokreise',
+            'ext': 'mp4',
+            'title': 'Traumziele Südostasiens (1/2): Die Philippinen und Vietnam',
+            'description': 'md5:26329ce5197775b596773b939354079d',
+            'duration': 2625.0,
+            'thumbnail': 'https://www.3sat.de/assets/traumziele-suedostasiens-die-philippinen-und-vietnam-100~2400x1350?cb=1699870351148',
+            'episode': 'Traumziele Südostasiens (1/2): Die Philippinen und Vietnam',
+            'episode_id': 'POS_cc7ff51c-98cf-4d12-b99d-f7a551de1c95',
+            'timestamp': 1738593000,
+            'upload_date': '20250203',
+        },
+    }, {
         # Same as https://www.zdf.de/dokumentation/ab-18/10-wochen-sommer-102.html
         'url': 'https://www.3sat.de/film/ab-18/10-wochen-sommer-108.html',
         'md5': '0aff3e7bc72c8813f5e0fae333316a1d',
@@ -17,6 +31,7 @@ class DreiSatIE(ZDFIE):  # XXX: Do not subclass from concrete IE
             'timestamp': 1608604200,
             'upload_date': '20201222',
         },
+        'skip': '410 Gone',
     }, {
         'url': 'https://www.3sat.de/gesellschaft/schweizweit/waidmannsheil-100.html',
         'info_dict': {
@@ -30,6 +45,7 @@ class DreiSatIE(ZDFIE):  # XXX: Do not subclass from concrete IE
         'params': {
             'skip_download': True,
         },
+        'skip': '404 Not Found',
     }, {
         # Same as https://www.zdf.de/filme/filme-sonstige/der-hauptmann-112.html
         'url': 'https://www.3sat.de/film/spielfilm/der-hauptmann-100.html',
@@ -39,3 +55,14 @@ class DreiSatIE(ZDFIE):  # XXX: Do not subclass from concrete IE
         'url': 'https://www.3sat.de/wissen/nano/nano-21-mai-2019-102.html',
         'only_matching': True,
     }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id, fatal=False)
+        if webpage:
+            player = self._extract_player(webpage, url, fatal=False)
+            if player:
+                return self._extract_regular(url, player, video_id)
+
+        return self._extract_mobile(video_id)


@@ -82,7 +82,7 @@ class DropboxIE(InfoExtractor):
             has_anonymous_download = self._search_regex(
                 r'(anonymous:\tanonymous)', part, 'anonymous', default=False)
             transcode_url = self._search_regex(
-                r'\n.(https://[^\x03\x08\x12\n]+\.m3u8)', part, 'transcode url', default=None)
+                r'\n.?(https://[^\x03\x08\x12\n]+\.m3u8)', part, 'transcode url', default=None)
             if not transcode_url:
                 continue
             formats, subtitles = self._extract_m3u8_formats_and_subtitles(transcode_url, video_id, 'mp4')
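
The one-character change above matters: `\n.` demanded exactly one junk byte between the newline and the URL, while `\n.?` also matches payloads where the URL follows the newline directly. A quick demonstration with made-up payloads:

```python
import re

PATTERN = r'\n.?(https://[^\x03\x08\x12\n]+\.m3u8)'

# First blob has a junk byte after the newline, second does not;
# the old mandatory '.' only matched the first one
for blob in ('\n\x02https://host.example/video.m3u8',
             '\nhttps://host.example/video.m3u8'):
    m = re.search(PATTERN, blob)
    print(m and m.group(1))  # both print the URL with '.?'
```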


@@ -1,28 +1,37 @@
-import contextlib
+import inspect
 import os

-from ..plugins import load_plugins
+from ..globals import LAZY_EXTRACTORS
+from ..globals import extractors as _extractors_context

-# NB: Must be before other imports so that plugins can be correctly injected
-_PLUGIN_CLASSES = load_plugins('extractor', 'IE')
-
-_LAZY_LOADER = False
-if not os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'):
-    with contextlib.suppress(ImportError):
-        from .lazy_extractors import *  # noqa: F403
-        from .lazy_extractors import _ALL_CLASSES
-        _LAZY_LOADER = True
+_CLASS_LOOKUP = None
+if os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'):
+    LAZY_EXTRACTORS.value = False
+else:
+    try:
+        from .lazy_extractors import _CLASS_LOOKUP
+        LAZY_EXTRACTORS.value = True
+    except ImportError:
+        LAZY_EXTRACTORS.value = None

-if not _LAZY_LOADER:
-    from ._extractors import *  # noqa: F403
-    _ALL_CLASSES = [  # noqa: F811
-        klass
-        for name, klass in globals().items()
+if not _CLASS_LOOKUP:
+    from . import _extractors
+
+    _CLASS_LOOKUP = {
+        name: value
+        for name, value in inspect.getmembers(_extractors)
         if name.endswith('IE') and name != 'GenericIE'
-    ]
-    _ALL_CLASSES.append(GenericIE)  # noqa: F405
+    }
+    _CLASS_LOOKUP['GenericIE'] = _extractors.GenericIE

-globals().update(_PLUGIN_CLASSES)
-_ALL_CLASSES[:0] = _PLUGIN_CLASSES.values()
+# We want to append to the main lookup
+_current = _extractors_context.value
+for name, ie in _CLASS_LOOKUP.items():
+    _current.setdefault(name, ie)

-from .common import _PLUGIN_OVERRIDES  # noqa: F401
+
+def __getattr__(name):
+    value = _CLASS_LOOKUP.get(name)
+    if not value:
+        raise AttributeError(f'module {__name__} has no attribute {name}')
+    return value
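
The module-level `__getattr__` here is PEP 562: instead of star-importing every extractor into module globals, attribute access is served lazily from a lookup table. A minimal standalone sketch of the same pattern (the module and class names are made up):

```python
# lazy_mod.py — PEP 562 sketch: attributes come from a lookup table
_CLASS_LOOKUP = {'FooIE': type('FooIE', (), {})}


def __getattr__(name):
    # Called only when 'name' is not found among real module globals
    value = _CLASS_LOOKUP.get(name)
    if not value:
        raise AttributeError(f'module {__name__} has no attribute {name}')
    return value
```

With this in place, `from lazy_mod import FooIE` succeeds without the class ever living in the module's `globals()`, which is what lets the table be swapped for the generated `lazy_extractors` lookup without changing import sites.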


@@ -1,3 +1,4 @@
+import json
 import re
 import urllib.parse

@@ -5,8 +6,10 @@ from .common import InfoExtractor
 from .dailymotion import DailymotionIE
 from ..networking import HEADRequest
 from ..utils import (
+    ExtractorError,
     clean_html,
     determine_ext,
+    extract_attributes,
     filter_dict,
     format_field,
     int_or_none,
@@ -16,7 +19,7 @@ from ..utils import (
     unsmuggle_url,
     url_or_none,
 )
-from ..utils.traversal import traverse_obj
+from ..utils.traversal import find_element, traverse_obj


 class FranceTVBaseInfoExtractor(InfoExtractor):
@@ -29,6 +32,7 @@ class FranceTVBaseInfoExtractor(InfoExtractor):


 class FranceTVIE(InfoExtractor):
+    IE_NAME = 'francetv'
     _VALID_URL = r'francetv:(?P<id>[^@#]+)'
     _GEO_COUNTRIES = ['FR']
     _GEO_BYPASS = False
@@ -248,18 +252,19 @@ class FranceTVIE(InfoExtractor):


 class FranceTVSiteIE(FranceTVBaseInfoExtractor):
+    IE_NAME = 'francetv:site'
     _VALID_URL = r'https?://(?:(?:www\.)?france\.tv|mobile\.france\.tv)/(?:[^/]+/)*(?P<id>[^/]+)\.html'
     _TESTS = [{
         'url': 'https://www.france.tv/france-2/13h15-le-dimanche/140921-les-mysteres-de-jesus.html',
         'info_dict': {
-            'id': 'c5bda21d-2c6f-4470-8849-3d8327adb2ba',
+            'id': 'ec217ecc-0733-48cf-ac06-af1347b849d1',  # old: c5bda21d-2c6f-4470-8849-3d8327adb2ba'
             'ext': 'mp4',
             'title': '13h15, le dimanche... - Les mystères de Jésus',
-            'timestamp': 1514118300,
-            'duration': 2880,
+            'timestamp': 1502623500,
+            'duration': 2580,
             'thumbnail': r're:^https?://.*\.jpg$',
-            'upload_date': '20171224',
+            'upload_date': '20170813',
         },
         'params': {
             'skip_download': True,
@@ -282,6 +287,7 @@ class FranceTVSiteIE(FranceTVBaseInfoExtractor):
             'thumbnail': r're:^https?://.*\.jpg$',
             'duration': 1441,
         },
+        'skip': 'No longer available',
     }, {
         # geo-restricted livestream (workflow == 'token-akamai')
         'url': 'https://www.france.tv/france-4/direct.html',
@@ -336,19 +342,33 @@ class FranceTVSiteIE(FranceTVBaseInfoExtractor):
         'only_matching': True,
     }]

+    # XXX: For parsing next.js v15+ data; see also yt_dlp.extractor.goplay
+    def _find_json(self, s):
+        return self._search_json(
+            r'\w+\s*:\s*', s, 'next js data', None, contains_pattern=r'\[(?s:.+)\]', default=None)
+
     def _real_extract(self, url):
         display_id = self._match_id(url)

         webpage = self._download_webpage(url, display_id)

-        video_id = self._search_regex(
-            r'(?:data-main-video\s*=|videoId["\']?\s*[:=])\s*(["\'])(?P<id>(?:(?!\1).)+)\1',
-            webpage, 'video id', default=None, group='id')
+        nextjs_data = traverse_obj(
+            re.findall(r'<script[^>]*>\s*self\.__next_f\.push\(\s*(\[.+?\])\s*\);?\s*</script>', webpage),
+            (..., {json.loads}, ..., {self._find_json}, ..., 'children', ..., ..., 'children', ..., ..., 'children'))
+
+        if traverse_obj(nextjs_data, (..., ..., 'children', ..., 'isLive', {bool}, any)):
+            # For livestreams we need the id of the stream instead of the currently airing episode id
+            video_id = traverse_obj(nextjs_data, (
+                ..., ..., 'children', ..., 'children', ..., 'children', ..., 'children', ..., ...,
+                'children', ..., ..., 'children', ..., ..., 'children', (..., (..., ...)),
+                'options', 'id', {str}, any))
+        else:
+            video_id = traverse_obj(nextjs_data, (
+                ..., ..., ..., 'children',
+                lambda _, v: v['video']['url'] == urllib.parse.urlparse(url).path,
+                'video', ('playerReplayId', 'siId'), {str}, any))

         if not video_id:
-            video_id = self._html_search_regex(
-                r'(?:href=|player\.setVideo\(\s*)"http://videos?\.francetv\.fr/video/([^@"]+@[^"]+)"',
-                webpage, 'video ID')
+            raise ExtractorError('Unable to extract video ID')

         return self._make_url_result(video_id, url=url)

@@ -441,11 +461,16 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
                 self.url_result(dailymotion_url, DailymotionIE.ie_key())
                 for dailymotion_url in dailymotion_urls])

-        video_id = self._search_regex(
-            (r'player\.load[^;]+src:\s*["\']([^"\']+)',
-             r'id-video=([^@]+@[^"]+)',
-             r'<a[^>]+href="(?:https?:)?//videos\.francetv\.fr/video/([^@]+@[^"]+)"',
-             r'(?:data-id|<figure[^<]+\bid)=["\']([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'),
-            webpage, 'video id')
+        video_id = (
+            traverse_obj(webpage, (
+                {find_element(tag='button', attr='data-cy', value='francetv-player-wrapper', html=True)},
+                {extract_attributes}, 'id'))
+            or self._search_regex(
+                (r'player\.load[^;]+src:\s*["\']([^"\']+)',
+                 r'id-video=([^@]+@[^"]+)',
+                 r'<a[^>]+href="(?:https?:)?//videos\.francetv\.fr/video/([^@]+@[^"]+)"',
+                 r'(?:data-id|<figure[^<]+\bid)=["\']([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'),
+                webpage, 'video id')
+        )

         return self._make_url_result(video_id, url=url)
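
The new extraction path scrapes Next.js "flight" data: every `self.__next_f.push([...])` script carries a JSON array whose string payloads contain further JSON. A trimmed, stdlib-only sketch of the first two decoding steps (the page snippet and key names are made up; the real extractor walks much deeper with `traverse_obj`):

```python
import json
import re

# Made-up snippet in the shape Next.js v15 emits: the pushed array's second
# element is a string of the form "<key>:<json>"
webpage = '''<script>self.__next_f.push([1,"5:[\\"video\\",{\\"siId\\":\\"abc123\\"}]"])</script>'''

payloads = [
    json.loads(m) for m in
    re.findall(r'<script[^>]*>\s*self\.__next_f\.push\(\s*(\[.+?\])\s*\);?\s*</script>', webpage)
]
# Split off the row key, then parse the embedded JSON a second time
key, _, body = payloads[0][1].partition(':')
print(json.loads(body))  # ['video', {'siId': 'abc123'}]
```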


@@ -16,6 +16,7 @@ from ..utils import (
     MEDIA_EXTENSIONS,
     ExtractorError,
     UnsupportedError,
+    base_url,
     determine_ext,
     determine_protocol,
     dict_get,
@@ -293,6 +294,19 @@ class GenericIE(InfoExtractor):
             'timestamp': 1378272859.0,
         },
     },
+    # Live DASH MPD
+    {
+        'url': 'https://livesim2.dashif.org/livesim2/ato_10/testpic_2s/Manifest.mpd',
+        'info_dict': {
+            'id': 'Manifest',
+            'ext': 'mp4',
+            'title': r're:Manifest \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
+            'live_status': 'is_live',
+        },
+        'params': {
+            'skip_download': 'livestream',
+        },
+    },
     # m3u8 served with Content-Type: audio/x-mpegURL; charset=utf-8
     {
         'url': 'http://once.unicornmedia.com/now/master/playlist/bb0b18ba-64f5-4b1b-a29f-0ac252f06b68/77a785f3-5188-4806-b788-0893a61634ed/93677179-2d99-4ef4-9e17-fe70d49abfbf/content.m3u8',
@@ -2436,10 +2450,9 @@ class GenericIE(InfoExtractor):
         subtitles = {}
         if format_id.endswith('mpegurl') or ext == 'm3u8':
             formats, subtitles = self._extract_m3u8_formats_and_subtitles(url, video_id, 'mp4', headers=headers)
-        elif format_id.endswith(('mpd', 'dash+xml')) or ext == 'mpd':
-            formats, subtitles = self._extract_mpd_formats_and_subtitles(url, video_id, headers=headers)
         elif format_id == 'f4m' or ext == 'f4m':
             formats = self._extract_f4m_formats(url, video_id, headers=headers)
+        # Don't check for DASH/mpd here, do it later w/ first_bytes. Same number of requests either way
         else:
             formats = [{
                 'format_id': format_id,
@@ -2519,8 +2532,9 @@ class GenericIE(InfoExtractor):
         elif re.match(r'(?i)^(?:{[^}]+})?MPD$', doc.tag):
             info_dict['formats'], info_dict['subtitles'] = self._parse_mpd_formats_and_subtitles(
                 doc,
-                mpd_base_url=full_response.url.rpartition('/')[0],
+                mpd_base_url=base_url(full_response.url),
                 mpd_url=url)
+            info_dict['live_status'] = 'is_live' if doc.get('type') == 'dynamic' else None
             self._extra_manifest_info(info_dict, url)
             self.report_detected('DASH manifest')
             return info_dict
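Note: the `base_url()` swap above matters whenever the manifest URL carries a query string, since `str.rpartition('/')` happily splits on a slash inside the query. A small sketch of the difference (the regex mirrors what yt-dlp's `base_url` utility does; the URL is made up):

import re

def mpd_base_url(url):
    # everything up to (and including) the last '/' before any '?' or '#'
    return re.match(r'https?://[^?#]+/', url).group()

url = 'https://cdn.example.com/dash/stream/manifest.mpd?token=a/b/c'
print(url.rpartition('/')[0])  # ...manifest.mpd?token=a/b  (wrong base)
print(mpd_base_url(url))       # https://cdn.example.com/dash/stream/  (correct)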

yt_dlp/extractor/gigya.py (deleted)

@@ -1,19 +0,0 @@
-from .common import InfoExtractor
-from ..utils import (
-    ExtractorError,
-    urlencode_postdata,
-)
-
-
-class GigyaBaseIE(InfoExtractor):
-    def _gigya_login(self, auth_data):
-        auth_info = self._download_json(
-            'https://accounts.eu1.gigya.com/accounts.login', None,
-            note='Logging in', errnote='Unable to log in',
-            data=urlencode_postdata(auth_data))
-
-        error_message = auth_info.get('errorDetails') or auth_info.get('errorMessage')
-        if error_message:
-            raise ExtractorError(
-                f'Unable to login: {error_message}', expected=True)
-        return auth_info

yt_dlp/extractor/globo.py

@@ -1,32 +1,48 @@
-import base64
-import hashlib
 import json
-import random
 import re
+import uuid

 from .common import InfoExtractor
-from ..networking import HEADRequest
 from ..utils import (
-    ExtractorError,
+    determine_ext,
+    filter_dict,
     float_or_none,
+    int_or_none,
     orderedSet,
     str_or_none,
     try_get,
+    url_or_none,
 )
+from ..utils.traversal import subs_list_to_dict, traverse_obj


 class GloboIE(InfoExtractor):
-    _VALID_URL = r'(?:globo:|https?://.+?\.globo\.com/(?:[^/]+/)*(?:v/(?:[^/]+/)?|videos/))(?P<id>\d{7,})'
+    _VALID_URL = r'(?:globo:|https?://[^/?#]+?\.globo\.com/(?:[^/?#]+/))(?P<id>\d{7,})'
     _NETRC_MACHINE = 'globo'
+    _VIDEO_VIEW = '''
+    query getVideoView($videoId: ID!) {
+      video(id: $videoId) {
+        duration
+        description
+        relatedEpisodeNumber
+        relatedSeasonNumber
+        headline
+        title {
+          originProgramId
+          headline
+        }
+      }
+    }
+    '''
     _TESTS = [{
-        'url': 'http://g1.globo.com/carros/autoesporte/videos/t/exclusivos-do-g1/v/mercedes-benz-gla-passa-por-teste-de-colisao-na-europa/3607726/',
+        'url': 'https://globoplay.globo.com/v/3607726/',
         'info_dict': {
             'id': '3607726',
             'ext': 'mp4',
             'title': 'Mercedes-Benz GLA passa por teste de colisão na Europa',
             'duration': 103.204,
-            'uploader': 'G1',
-            'uploader_id': '2015',
+            'uploader': 'G1 ao vivo',
+            'uploader_id': '4209',
         },
         'params': {
             'skip_download': True,
@@ -38,39 +54,46 @@ class GloboIE(InfoExtractor):
             'ext': 'mp4',
             'title': 'Acidentes de trânsito estão entre as maiores causas de queda de energia em SP',
             'duration': 137.973,
-            'uploader': 'Rede Globo',
-            'uploader_id': '196',
+            'uploader': 'Bom Dia Brasil',
+            'uploader_id': '810',
         },
         'params': {
             'skip_download': True,
         },
     }, {
-        'url': 'http://canalbrasil.globo.com/programas/sangue-latino/videos/3928201.html',
-        'only_matching': True,
-    }, {
-        'url': 'http://globosatplay.globo.com/globonews/v/4472924/',
-        'only_matching': True,
-    }, {
-        'url': 'http://globotv.globo.com/t/programa/v/clipe-sexo-e-as-negas-adeus/3836166/',
-        'only_matching': True,
-    }, {
-        'url': 'http://globotv.globo.com/canal-brasil/sangue-latino/t/todos-os-videos/v/ator-e-diretor-argentino-ricado-darin-fala-sobre-utopias-e-suas-perdas/3928201/',
-        'only_matching': True,
-    }, {
-        'url': 'http://canaloff.globo.com/programas/desejar-profundo/videos/4518560.html',
-        'only_matching': True,
-    }, {
         'url': 'globo:3607726',
         'only_matching': True,
-    }, {
-        'url': 'https://globoplay.globo.com/v/10248083/',
+    },
+    {
+        'url': 'globo:8013907',  # needs subscription to globoplay
         'info_dict': {
-            'id': '10248083',
+            'id': '8013907',
             'ext': 'mp4',
-            'title': 'Melhores momentos: Equador 1 x 1 Brasil pelas Eliminatórias da Copa do Mundo 2022',
-            'duration': 530.964,
-            'uploader': 'SporTV',
-            'uploader_id': '698',
+            'title': 'Capítulo de 14/08/1989',
+            'episode': 'Episode 1',
+            'episode_number': 1,
+            'uploader': 'Tieta',
+            'uploader_id': '11895',
+            'duration': 2858.389,
+            'subtitles': 'count:1',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    },
+    {
+        'url': 'globo:12824146',
+        'info_dict': {
+            'id': '12824146',
+            'ext': 'mp4',
+            'title': 'Acordo de damas',
+            'episode': 'Episode 1',
+            'episode_number': 1,
+            'uploader': 'Rensga Hits!',
+            'uploader_id': '20481',
+            'duration': 1953.994,
+            'season': 'Season 2',
+            'season_number': 2,
         },
         'params': {
             'skip_download': True,
@@ -80,98 +103,71 @@ class GloboIE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)

-        self._request_webpage(
-            HEADRequest('https://globo-ab.globo.com/v2/selected-alternatives?experiments=player-isolated-experiment-02&skipImpressions=true'),
-            video_id, 'Getting cookies')
-
-        video = self._download_json(
-            f'http://api.globovideos.com/videos/{video_id}/playlist',
-            video_id)['videos'][0]
-        if not self.get_param('allow_unplayable_formats') and video.get('encrypted') is True:
-            self.report_drm(video_id)
-
-        title = video['title']
+        info = self._download_json(
+            'https://cloud-jarvis.globo.com/graphql', video_id,
+            query={
+                'operationName': 'getVideoView',
+                'variables': json.dumps({'videoId': video_id}),
+                'query': self._VIDEO_VIEW,
+            }, headers={
+                'content-type': 'application/json',
+                'x-platform-id': 'web',
+                'x-device-id': 'desktop',
+                'x-client-version': '2024.12-5',
+            })['data']['video']

         formats = []
-        security = self._download_json(
-            'https://playback.video.globo.com/v2/video-session', video_id, f'Downloading security hash for {video_id}',
-            headers={'content-type': 'application/json'}, data=json.dumps({
-                'player_type': 'desktop',
+        video = self._download_json(
+            'https://playback.video.globo.com/v4/video-session', video_id,
+            f'Downloading resource info for {video_id}',
+            headers={'Content-Type': 'application/json'},
+            data=json.dumps(filter_dict({
+                'player_type': 'mirakulo_8k_hdr',
                 'video_id': video_id,
                 'quality': 'max',
                 'content_protection': 'widevine',
-                'vsid': '581b986b-4c40-71f0-5a58-803e579d5fa2',
-                'tz': '-3.0:00',
-            }).encode())
-
-        self._request_webpage(HEADRequest(security['sources'][0]['url_template']), video_id, 'Getting locksession cookie')
-
-        security_hash = security['sources'][0]['token']
-        if not security_hash:
-            message = security.get('message')
-            if message:
-                raise ExtractorError(
-                    f'{self.IE_NAME} returned error: {message}', expected=True)
-
-        hash_code = security_hash[:2]
-        padding = '%010d' % random.randint(1, 10000000000)
-        if hash_code in ('04', '14'):
-            received_time = security_hash[3:13]
-            received_md5 = security_hash[24:]
-            hash_prefix = security_hash[:23]
-        elif hash_code in ('02', '12', '03', '13'):
-            received_time = security_hash[2:12]
-            received_md5 = security_hash[22:]
-            padding += '1'
-            hash_prefix = '05' + security_hash[:22]
-
-        padded_sign_time = str(int(received_time) + 86400) + padding
-        md5_data = (received_md5 + padded_sign_time + '0xAC10FD').encode()
-        signed_md5 = base64.urlsafe_b64encode(hashlib.md5(md5_data).digest()).decode().strip('=')
-        signed_hash = hash_prefix + padded_sign_time + signed_md5
-        source = security['sources'][0]['url_parts']
-        resource_url = source['scheme'] + '://' + source['domain'] + source['path']
-        signed_url = '{}?h={}&k=html5&a={}'.format(resource_url, signed_hash, 'F' if video.get('subscriber_only') else 'A')
-
-        fmts, subtitles = self._extract_m3u8_formats_and_subtitles(
-            signed_url, video_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls', fatal=False)
-        formats.extend(fmts)
-
-        for resource in video['resources']:
-            if resource.get('type') == 'subtitle':
-                subtitles.setdefault(resource.get('language') or 'por', []).append({
-                    'url': resource.get('url'),
-                })
-        subs = try_get(security, lambda x: x['source']['subtitles'], expected_type=dict) or {}
-        for sub_lang, sub_url in subs.items():
-            if sub_url:
-                subtitles.setdefault(sub_lang or 'por', []).append({
-                    'url': sub_url,
-                })
-        subs = try_get(security, lambda x: x['source']['subtitles_webvtt'], expected_type=dict) or {}
-        for sub_lang, sub_url in subs.items():
-            if sub_url:
-                subtitles.setdefault(sub_lang or 'por', []).append({
-                    'url': sub_url,
-                })
-
-        duration = float_or_none(video.get('duration'), 1000)
-        uploader = video.get('channel')
-        uploader_id = str_or_none(video.get('channel_id'))
+                'vsid': f'{uuid.uuid4()}',
+                'consumption': 'streaming',
+                'capabilities': {'low_latency': True},
+                'tz': '-03:00',
+                'Authorization': try_get(self._get_cookies('https://globo.com'),
+                                         lambda x: f'Bearer {x["GLBID"].value}'),
+                'version': 1,
+            })).encode())
+
+        if traverse_obj(video, ('resource', 'drm_protection_enabled', {bool})):
+            self.report_drm(video_id)
+
+        main_source = video['sources'][0]
+
+        # 4k streams are exclusively outputted in dash, so we need to filter these out
+        if determine_ext(main_source['url']) == 'mpd':
+            formats, subtitles = self._extract_mpd_formats_and_subtitles(main_source['url'], video_id, mpd_id='dash')
+        else:
+            formats, subtitles = self._extract_m3u8_formats_and_subtitles(
+                main_source['url'], video_id, 'mp4', m3u8_id='hls')
+
+        self._merge_subtitles(traverse_obj(main_source, ('text', ..., ('caption', 'subtitle'), {
+            'url': ('srt', 'url', {url_or_none}),
+        }, all, {subs_list_to_dict(lang='pt-BR')})), target=subtitles)

         return {
             'id': video_id,
-            'title': title,
-            'duration': duration,
-            'uploader': uploader,
-            'uploader_id': uploader_id,
+            **traverse_obj(info, {
+                'title': ('headline', {str}),
+                'duration': ('duration', {float_or_none(scale=1000)}),
+                'uploader': ('title', 'headline', {str}),
+                'uploader_id': ('title', 'originProgramId', {str_or_none}),
+                'episode_number': ('relatedEpisodeNumber', {int_or_none}),
+                'season_number': ('relatedSeasonNumber', {int_or_none}),
+            }),
             'formats': formats,
             'subtitles': subtitles,
         }


 class GloboArticleIE(InfoExtractor):
-    _VALID_URL = r'https?://.+?\.globo\.com/(?:[^/]+/)*(?P<id>[^/.]+)(?:\.html)?'
+    _VALID_URL = r'https?://(?!globoplay).+?\.globo\.com/(?:[^/?#]+/)*(?P<id>[^/?#.]+)(?:\.html)?'

     _VIDEOID_REGEXES = [
         r'\bdata-video-id=["\'](\d{7,})["\']',
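Note: the replacement metadata call above is GraphQL over GET, with the operation name, variables and query document sent as URL parameters. A rough stdlib-only sketch of the same request shape (endpoint and header values copied from the diff; the response layout is an assumption that may change server-side):

import json
import urllib.parse
import urllib.request

def fetch_video_view(video_id, query_doc):
    params = urllib.parse.urlencode({
        'operationName': 'getVideoView',
        'variables': json.dumps({'videoId': video_id}),
        'query': query_doc,
    })
    req = urllib.request.Request(
        f'https://cloud-jarvis.globo.com/graphql?{params}',
        headers={'content-type': 'application/json', 'x-platform-id': 'web',
                 'x-device-id': 'desktop', 'x-client-version': '2024.12-5'})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())['data']['video']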

yt_dlp/extractor/goplay.py

@@ -12,7 +12,6 @@ from .common import InfoExtractor
 from ..utils import (
     ExtractorError,
     int_or_none,
-    js_to_json,
     remove_end,
     traverse_obj,
 )
@@ -76,6 +75,7 @@ class GoPlayIE(InfoExtractor):
         if not self._id_token:
             raise self.raise_login_required(method='password')

+    # XXX: For parsing next.js v15+ data; see also yt_dlp.extractor.francetv
     def _find_json(self, s):
         return self._search_json(
             r'\w+\s*:\s*', s, 'next js data', None, contains_pattern=r'\[(?s:.+)\]', default=None)
@@ -86,9 +86,10 @@ class GoPlayIE(InfoExtractor):
         nextjs_data = traverse_obj(
             re.findall(r'<script[^>]*>\s*self\.__next_f\.push\(\s*(\[.+?\])\s*\);?\s*</script>', webpage),
-            (..., {js_to_json}, {json.loads}, ..., {self._find_json}, ...))
+            (..., {json.loads}, ..., {self._find_json}, ...))
+
         meta = traverse_obj(nextjs_data, (
-            ..., lambda _, v: v['meta']['path'] == urllib.parse.urlparse(url).path, 'meta', any))
+            ..., ..., 'children', ..., ..., 'children',
+            lambda _, v: v['video']['path'] == urllib.parse.urlparse(url).path, 'video', any))

         video_id = meta['uuid']
         info_dict = traverse_obj(meta, {
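Note: the `lambda _, v: v['video']['path'] == ...` step above is a predicate that selects, among all traversed candidate nodes, the one whose recorded path equals the path of the page URL. The same selection is easy to express without the traversal helper; a small sketch on made-up node data:

import urllib.parse

def pick_video_node(nodes, page_url):
    """Return the first node whose video path matches the page URL's path."""
    want = urllib.parse.urlparse(page_url).path
    for node in nodes:
        video = node.get('video') or {}
        if video.get('path') == want:
            return video
    return None

nodes = [{'video': {'path': '/video/a', 'uuid': '1'}},
         {'video': {'path': '/video/b', 'uuid': '2'}}]
print(pick_video_node(nodes, 'https://www.goplay.be/video/b')['uuid'])  # 2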

yt_dlp/extractor/hse.py

@@ -6,7 +6,7 @@ from ..utils import (
 )


-class HSEShowBaseInfoExtractor(InfoExtractor):
+class HSEShowBaseIE(InfoExtractor):
     _GEO_COUNTRIES = ['DE']

     def _extract_redux_data(self, url, video_id):
@@ -28,7 +28,7 @@ class HSEShowBaseInfoExtractor(InfoExtractor):
         return formats, subtitles


-class HSEShowIE(HSEShowBaseInfoExtractor):
+class HSEShowIE(HSEShowBaseIE):
     _VALID_URL = r'https?://(?:www\.)?hse\.de/dpl/c/tv-shows/(?P<id>[0-9]+)'
     _TESTS = [{
         'url': 'https://www.hse.de/dpl/c/tv-shows/505350',
@@ -64,7 +64,7 @@ class HSEShowIE(HSEShowBaseInfoExtractor):
     }


-class HSEProductIE(HSEShowBaseInfoExtractor):
+class HSEProductIE(HSEShowBaseIE):
     _VALID_URL = r'https?://(?:www\.)?hse\.de/dpl/p/product/(?P<id>[0-9]+)'
     _TESTS = [{
         'url': 'https://www.hse.de/dpl/p/product/408630',

yt_dlp/extractor/ichinanalive.py

@@ -1,5 +1,13 @@
 from .common import InfoExtractor
-from ..utils import ExtractorError, str_or_none, traverse_obj, unified_strdate
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+    str_or_none,
+    traverse_obj,
+    unified_strdate,
+    url_or_none,
+)


 class IchinanaLiveIE(InfoExtractor):
@@ -157,3 +165,51 @@ class IchinanaLiveClipIE(InfoExtractor):
             'description': view_data.get('caption'),
             'upload_date': unified_strdate(str_or_none(view_data.get('createdAt'))),
         }
+
+
+class IchinanaLiveVODIE(InfoExtractor):
+    IE_NAME = '17live:vod'
+    _VALID_URL = r'https?://(?:www\.)?17\.live/ja/vod/[^/?#]+/(?P<id>[^/?#]+)'
+    _TESTS = [{
+        'url': 'https://17.live/ja/vod/27323042/2cf84520-e65e-4b22-891e-1d3a00b0f068',
+        'md5': '3299b930d7457b069639486998a89580',
+        'info_dict': {
+            'id': '2cf84520-e65e-4b22-891e-1d3a00b0f068',
+            'ext': 'mp4',
+            'title': 'md5:b5f8cbf497d54cc6a60eb3b480182f01',
+            'uploader': 'md5:29fb12122ab94b5a8495586e7c3085a5',
+            'uploader_id': '27323042',
+            'channel': '🌟オールナイトニッポン アーカイブ🌟',
+            'channel_id': '2b4f85f1-d61e-429d-a901-68d32bdd8645',
+            'like_count': int,
+            'view_count': int,
+            'thumbnail': r're:https?://.+/.+\.(?:jpe?g|png)',
+            'duration': 549,
+            'description': 'md5:116f326579700f00eaaf5581aae1192e',
+            'timestamp': 1741058645,
+            'upload_date': '20250304',
+        },
+    }, {
+        'url': 'https://17.live/ja/vod/27323042/0de11bac-9bea-40b8-9eab-0239a7d88079',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        json_data = self._download_json(f'https://wap-api.17app.co/api/v1/vods/{video_id}', video_id)
+
+        return traverse_obj(json_data, {
+            'id': ('vodID', {str}),
+            'title': ('title', {str}),
+            'formats': ('vodURL', {lambda x: self._extract_m3u8_formats(x, video_id)}),
+            'uploader': ('userInfo', 'displayName', {str}),
+            'uploader_id': ('userInfo', 'roomID', {int}, {str_or_none}),
+            'channel': ('userInfo', 'name', {str}),
+            'channel_id': ('userInfo', 'userID', {str}),
+            'like_count': ('likeCount', {int_or_none}),
+            'view_count': ('viewCount', {int_or_none}),
+            'thumbnail': ('imageURL', {url_or_none}),
+            'duration': ('duration', {int_or_none}),
+            'description': ('description', {str}),
+            'timestamp': ('createdAt', {int_or_none}),
+        })
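Note: the new extractor maps the VOD JSON onto info-dict fields with a `traverse_obj` dict template. In that idiom a bare type inside a set (`{str}`, `{int}`) acts as an isinstance filter, while a callable such as `int_or_none` coerces and drops `None` results. A tiny example against sample data (field values invented):

from yt_dlp.utils import int_or_none, str_or_none
from yt_dlp.utils.traversal import traverse_obj

sample = {'vodID': 'abc-123', 'userInfo': {'roomID': 27323042}, 'viewCount': '1024'}
info = traverse_obj(sample, {
    'id': ('vodID', {str}),                                       # kept only if already a str
    'uploader_id': ('userInfo', 'roomID', {int}, {str_or_none}),  # filter by type, then coerce
    'view_count': ('viewCount', {int_or_none}),                   # callable converts the string
})
print(info)  # {'id': 'abc-123', 'uploader_id': '27323042', 'view_count': 1024}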

yt_dlp/extractor/instagram.py

@@ -2,12 +2,12 @@ import hashlib
 import itertools
 import json
 import re
-import time

 from .common import InfoExtractor
 from ..networking.exceptions import HTTPError
 from ..utils import (
     ExtractorError,
+    bug_reports_message,
     decode_base_n,
     encode_base_n,
     filter_dict,
@@ -15,12 +15,12 @@ from ..utils import (
     format_field,
     get_element_by_attribute,
     int_or_none,
+    join_nonempty,
     lowercase_escape,
     str_or_none,
     str_to_int,
     traverse_obj,
     url_or_none,
-    urlencode_postdata,
 )

 _ENCODING_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_'
@@ -28,63 +28,30 @@ _ENCODING_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz012345678

 def _pk_to_id(media_id):
     """Source: https://stackoverflow.com/questions/24437823/getting-instagram-post-url-from-media-id"""
-    return encode_base_n(int(media_id.split('_')[0]), table=_ENCODING_CHARS)
+    pk = int(str(media_id).split('_')[0])
+    return encode_base_n(pk, table=_ENCODING_CHARS)


 def _id_to_pk(shortcode):
-    """Covert a shortcode to a numeric value"""
-    return decode_base_n(shortcode[:11], table=_ENCODING_CHARS)
+    """Convert a shortcode to a numeric value"""
+    if len(shortcode) > 28:
+        shortcode = shortcode[:-28]
+    return decode_base_n(shortcode, table=_ENCODING_CHARS)


 class InstagramBaseIE(InfoExtractor):
-    _NETRC_MACHINE = 'instagram'
-    _IS_LOGGED_IN = False
-
     _API_BASE_URL = 'https://i.instagram.com/api/v1'
     _LOGIN_URL = 'https://www.instagram.com/accounts/login'
-    _API_HEADERS = {
-        'X-IG-App-ID': '936619743392459',
-        'X-ASBD-ID': '198387',
-        'X-IG-WWW-Claim': '0',
-        'Origin': 'https://www.instagram.com',
-        'Accept': '*/*',
-    }

-    def _perform_login(self, username, password):
-        if self._IS_LOGGED_IN:
-            return
-
-        login_webpage = self._download_webpage(
-            self._LOGIN_URL, None, note='Downloading login webpage', errnote='Failed to download login webpage')
-
-        shared_data = self._parse_json(self._search_regex(
-            r'window\._sharedData\s*=\s*({.+?});', login_webpage, 'shared data', default='{}'), None)
-
-        login = self._download_json(
-            f'{self._LOGIN_URL}/ajax/', None, note='Logging in', headers={
-                **self._API_HEADERS,
-                'X-Requested-With': 'XMLHttpRequest',
-                'X-CSRFToken': shared_data['config']['csrf_token'],
-                'X-Instagram-AJAX': shared_data['rollout_hash'],
-                'Referer': 'https://www.instagram.com/',
-            }, data=urlencode_postdata({
-                'enc_password': f'#PWD_INSTAGRAM_BROWSER:0:{int(time.time())}:{password}',
-                'username': username,
-                'queryParams': '{}',
-                'optIntoOneTap': 'false',
-                'stopDeletionNonce': '',
-                'trustedDeviceRecords': '{}',
-            }))
-
-        if not login.get('authenticated'):
-            if login.get('message'):
-                raise ExtractorError(f'Unable to login: {login["message"]}')
-            elif login.get('user'):
-                raise ExtractorError('Unable to login: Sorry, your password was incorrect. Please double-check your password.', expected=True)
-            elif login.get('user') is False:
-                raise ExtractorError('Unable to login: The username you entered doesn\'t belong to an account. Please check your username and try again.', expected=True)
-            raise ExtractorError('Unable to login')
-        InstagramBaseIE._IS_LOGGED_IN = True
+    @property
+    def _api_headers(self):
+        return {
+            'X-IG-App-ID': self._configuration_arg('app_id', ['936619743392459'], ie_key=InstagramIE)[0],
+            'X-ASBD-ID': '198387',
+            'X-IG-WWW-Claim': '0',
+            'Origin': 'https://www.instagram.com',
+            'Accept': '*/*',
+        }

     def _get_count(self, media, kind, *keys):
         return traverse_obj(
@@ -209,7 +176,7 @@ class InstagramBaseIE(InfoExtractor):
     def _get_comments(self, video_id):
         comments_info = self._download_json(
             f'{self._API_BASE_URL}/media/{_id_to_pk(video_id)}/comments/?can_support_threading=true&permalink_enabled=false', video_id,
-            fatal=False, errnote='Comments extraction failed', note='Downloading comments info', headers=self._API_HEADERS) or {}
+            fatal=False, errnote='Comments extraction failed', note='Downloading comments info', headers=self._api_headers) or {}

         comment_data = traverse_obj(comments_info, ('edge_media_to_parent_comment', 'edges'), 'comments')
         for comment_dict in comment_data or []:
@@ -402,14 +369,14 @@ class InstagramIE(InstagramBaseIE):
         info = traverse_obj(self._download_json(
             f'{self._API_BASE_URL}/media/{_id_to_pk(video_id)}/info/', video_id,
             fatal=False, errnote='Video info extraction failed',
-            note='Downloading video info', headers=self._API_HEADERS), ('items', 0))
+            note='Downloading video info', headers=self._api_headers), ('items', 0))
         if info:
             media.update(info)
             return self._extract_product(media)

         api_check = self._download_json(
             f'{self._API_BASE_URL}/web/get_ruling_for_content/?content_type=MEDIA&target_id={_id_to_pk(video_id)}',
-            video_id, headers=self._API_HEADERS, fatal=False, note='Setting up session', errnote=False) or {}
+            video_id, headers=self._api_headers, fatal=False, note='Setting up session', errnote=False) or {}
         csrf_token = self._get_cookies('https://www.instagram.com').get('csrftoken')

         if not csrf_token:
@@ -429,7 +396,7 @@ class InstagramIE(InstagramBaseIE):
         general_info = self._download_json(
             'https://www.instagram.com/graphql/query/', video_id, fatal=False, errnote=False,
             headers={
-                **self._API_HEADERS,
+                **self._api_headers,
                 'X-CSRFToken': csrf_token or '',
                 'X-Requested-With': 'XMLHttpRequest',
                 'Referer': url,
@@ -437,7 +404,6 @@ class InstagramIE(InstagramBaseIE):
                 'doc_id': '8845758582119845',
                 'variables': json.dumps(variables, separators=(',', ':')),
             })
-        media.update(traverse_obj(general_info, ('data', 'xdt_shortcode_media')) or {})

         if not general_info:
             self.report_warning('General metadata extraction failed (some metadata might be missing).', video_id)
@@ -466,6 +432,26 @@ class InstagramIE(InstagramBaseIE):
             media.update(traverse_obj(
                 additional_data, ('graphql', 'shortcode_media'), 'shortcode_media', expected_type=dict) or {})
+        else:
+            xdt_shortcode_media = traverse_obj(general_info, ('data', 'xdt_shortcode_media', {dict})) or {}
+            if not xdt_shortcode_media:
+                error = join_nonempty('title', 'description', delim=': ', from_dict=api_check)
+                if 'Restricted Video' in error:
+                    self.raise_login_required(error)
+                elif error:
+                    raise ExtractorError(error, expected=True)
+                elif len(video_id) > 28:
+                    # It's a private post (video_id == shortcode + 28 extra characters)
+                    # Only raise after getting empty response; sometimes "long"-shortcode posts are public
+                    self.raise_login_required(
+                        'This content is only available for registered users who follow this account')
+                raise ExtractorError(
+                    'Instagram sent an empty media response. Check if this post is accessible in your '
+                    f'browser without being logged-in. If it is not, then u{self._login_hint()[1:]}. '
+                    'Otherwise, if the post is accessible in browser without being logged-in'
+                    f'{bug_reports_message(before=",")}', expected=True)
+            media.update(xdt_shortcode_media)

         username = traverse_obj(media, ('owner', 'username')) or self._search_regex(
             r'"owner"\s*:\s*{\s*"username"\s*:\s*"(.+?)"', webpage, 'username', fatal=False)
@@ -485,8 +471,7 @@ class InstagramIE(InstagramBaseIE):
             return self.playlist_result(
                 self._extract_nodes(nodes, True), video_id,
                 format_field(username, None, 'Post by %s'), description)
-
-            video_url = self._og_search_video_url(webpage, secure=False)
+            raise ExtractorError('There is no video in this post', expected=True)

         formats = [{
             'url': video_url,
@@ -689,7 +674,7 @@ class InstagramTagIE(InstagramPlaylistBaseIE):

 class InstagramStoryIE(InstagramBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?instagram\.com/stories/(?P<user>[^/]+)/(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?instagram\.com/stories/(?P<user>[^/?#]+)(?:/(?P<id>\d+))?'
     IE_NAME = 'instagram:story'

     _TESTS = [{
@@ -699,25 +684,38 @@ class InstagramStoryIE(InstagramBaseIE):
             'title': 'Rare',
         },
         'playlist_mincount': 50,
+    }, {
+        'url': 'https://www.instagram.com/stories/fruits_zipper/3570766765028588805/',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.instagram.com/stories/fruits_zipper',
+        'only_matching': True,
     }]

     def _real_extract(self, url):
-        username, story_id = self._match_valid_url(url).groups()
-        story_info = self._download_webpage(url, story_id)
-        user_info = self._search_json(r'"user":', story_info, 'user info', story_id, fatal=False)
+        username, story_id = self._match_valid_url(url).group('user', 'id')
+        if username == 'highlights' and not story_id:  # story id is only mandatory for highlights
+            raise ExtractorError('Input URL is missing a highlight ID', expected=True)
+        display_id = story_id or username
+        story_info = self._download_webpage(url, display_id)
+        user_info = self._search_json(r'"user":', story_info, 'user info', display_id, fatal=False)
         if not user_info:
             self.raise_login_required('This content is unreachable')

         user_id = traverse_obj(user_info, 'pk', 'id', expected_type=str)
-        story_info_url = user_id if username != 'highlights' else f'highlight:{story_id}'
-        if not story_info_url:  # user id is only mandatory for non-highlights
-            raise ExtractorError('Unable to extract user id')
+        if username == 'highlights':
+            story_info_url = f'highlight:{story_id}'
+        else:
+            if not user_id:  # user id is only mandatory for non-highlights
+                raise ExtractorError('Unable to extract user id')
+            story_info_url = user_id

         videos = traverse_obj(self._download_json(
             f'{self._API_BASE_URL}/feed/reels_media/?reel_ids={story_info_url}',
-            story_id, errnote=False, fatal=False, headers=self._API_HEADERS), 'reels')
+            display_id, errnote=False, fatal=False, headers=self._api_headers), 'reels')
         if not videos:
             self.raise_login_required('You need to log in to access this content')
+        user_info = traverse_obj(videos, (user_id, 'user', {dict})) or {}

         full_name = traverse_obj(videos, (f'highlight:{story_id}', 'user', 'full_name'), (user_id, 'user', 'full_name'))
         story_title = traverse_obj(videos, (f'highlight:{story_id}', 'title'))
@@ -727,6 +725,7 @@ class InstagramStoryIE(InstagramBaseIE):
         highlights = traverse_obj(videos, (f'highlight:{story_id}', 'items'), (user_id, 'items'))
         info_data = []
         for highlight in highlights:
+            highlight.setdefault('user', {}).update(user_info)
             highlight_data = self._extract_product(highlight)
             if highlight_data.get('formats'):
                 info_data.append({
@@ -734,4 +733,7 @@ class InstagramStoryIE(InstagramBaseIE):
                 'uploader_id': user_id,
                 **filter_dict(highlight_data),
             })
+        if username != 'highlights' and story_id and not self._yes_playlist(username, story_id):
+            return traverse_obj(info_data, (lambda _, v: v['id'] == _pk_to_id(story_id), any))
         return self.playlist_result(info_data, playlist_id=story_id, playlist_title=story_title)
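Note: `_pk_to_id`/`_id_to_pk` above are plain base-64 positional encoding over Instagram's URL-safe alphabet; the new `[:-28]` handling strips the 28 extra characters that "long" private-post shortcodes append. A self-contained sketch of the round trip:

_ENCODING_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_'

def pk_to_shortcode(pk: int) -> str:
    out = ''
    while pk:
        pk, rem = divmod(pk, 64)
        out = _ENCODING_CHARS[rem] + out
    return out or _ENCODING_CHARS[0]

def shortcode_to_pk(shortcode: str) -> int:
    if len(shortcode) > 28:  # "long" shortcodes carry 28 extra characters
        shortcode = shortcode[:-28]
    n = 0
    for ch in shortcode:
        n = n * 64 + _ENCODING_CHARS.index(ch)
    return n

assert shortcode_to_pk(pk_to_shortcode(3570766765028588805)) == 3570766765028588805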

yt_dlp/extractor/jamendo.py

@@ -2,10 +2,12 @@ import hashlib
 import random

 from .common import InfoExtractor
+from ..networking import HEADRequest
 from ..utils import (
     clean_html,
     int_or_none,
     try_get,
+    urlhandle_detect_ext,
 )
@@ -27,7 +29,7 @@ class JamendoIE(InfoExtractor):
             'ext': 'flac',
             # 'title': 'Maya Filipič - Stories from Emona I',
             'title': 'Stories from Emona I',
-            'artist': 'Maya Filipič',
+            'artists': ['Maya Filipič'],
             'album': 'Between two worlds',
             'track': 'Stories from Emona I',
             'duration': 210,
@@ -93,9 +95,15 @@ class JamendoIE(InfoExtractor):
             if not cover_url or cover_url in urls:
                 continue
             urls.append(cover_url)
+            urlh = self._request_webpage(
+                HEADRequest(cover_url), track_id, 'Checking thumbnail extension',
+                errnote=False, fatal=False)
+            if not urlh:
+                continue
             size = int_or_none(cover_id.lstrip('size'))
             thumbnails.append({
                 'id': cover_id,
+                'ext': urlhandle_detect_ext(urlh, default='jpg'),
                 'url': cover_url,
                 'width': size,
                 'height': size,
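Note: the Jamendo change probes each cover URL with a HEAD request so the thumbnail gets a correct extension instead of a guessed one. A stdlib-only sketch of that probe (timeout and fallback values are arbitrary choices):

import urllib.request

def probe_thumbnail_ext(url, default='jpg'):
    req = urllib.request.Request(url, method='HEAD')
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            ctype = resp.headers.get_content_type()  # e.g. 'image/png'
    except OSError:
        return None  # skip this thumbnail, mirroring the fatal=False behavior
    return ctype.rpartition('/')[2] if ctype.startswith('image/') else default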

yt_dlp/extractor/lbry.py

@@ -26,6 +26,7 @@ class LBRYBaseIE(InfoExtractor):
     _CLAIM_ID_REGEX = r'[0-9a-f]{1,40}'
     _OPT_CLAIM_ID = f'[^$@:/?#&]+(?:[:#]{_CLAIM_ID_REGEX})?'
     _SUPPORTED_STREAM_TYPES = ['video', 'audio']
+    _UNSUPPORTED_STREAM_TYPES = ['binary']
     _PAGE_SIZE = 50

     def _call_api_proxy(self, method, display_id, params, resource):
@@ -336,12 +337,15 @@ class LBRYIE(LBRYBaseIE):
                     'vcodec': 'none' if stream_type == 'audio' else None,
                 })

+            final_url = None
             # HEAD request returns redirect response to m3u8 URL if available
-            final_url = self._request_webpage(
+            urlh = self._request_webpage(
                 HEADRequest(streaming_url), display_id, headers=headers,
-                note='Downloading streaming redirect url info').url
+                note='Downloading streaming redirect url info', fatal=False)
+            if urlh:
+                final_url = urlh.url

-        elif result.get('value_type') == 'stream':
+        elif result.get('value_type') == 'stream' and stream_type not in self._UNSUPPORTED_STREAM_TYPES:
             claim_id, is_live = result['signing_channel']['claim_id'], True
             live_data = self._download_json(
                 'https://api.odysee.live/livestream/is_live', claim_id,

yt_dlp/extractor/loco.py (new file, 87 lines)

@@ -0,0 +1,87 @@
+from .common import InfoExtractor
+from ..utils import int_or_none, url_or_none
+from ..utils.traversal import require, traverse_obj
+
+
+class LocoIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?loco\.com/(?P<type>streamers|stream)/(?P<id>[^/?#]+)'
+    _TESTS = [{
+        'url': 'https://loco.com/streamers/teuzinfps',
+        'info_dict': {
+            'id': 'teuzinfps',
+            'ext': 'mp4',
+            'title': r're:MS BOLADAO, RESENHA & GAMEPLAY ALTO NIVEL',
+            'description': 'bom e novo',
+            'uploader_id': 'RLUVE3S9JU',
+            'channel': 'teuzinfps',
+            'channel_follower_count': int,
+            'comment_count': int,
+            'view_count': int,
+            'concurrent_view_count': int,
+            'like_count': int,
+            'thumbnail': 'https://static.ivory.getloconow.com/default_thumb/743701a9-98ca-41ae-9a8b-70bd5da070ad.jpg',
+            'tags': ['MMORPG', 'Gameplay'],
+            'series': 'Tibia',
+            'timestamp': int,
+            'modified_timestamp': int,
+            'live_status': 'is_live',
+            'upload_date': str,
+            'modified_date': str,
+        },
+        'params': {
+            'skip_download': 'Livestream',
+        },
+    }, {
+        'url': 'https://loco.com/stream/c64916eb-10fb-46a9-9a19-8c4b7ed064e7',
+        'md5': '45ebc8a47ee1c2240178757caf8881b5',
+        'info_dict': {
+            'id': 'c64916eb-10fb-46a9-9a19-8c4b7ed064e7',
+            'ext': 'mp4',
+            'title': 'PAULINHO LOKO NA LOCO!',
+            'description': 'live on na loco',
+            'uploader_id': '2MDO7Z1DPM',
+            'channel': 'paulinholokobr',
+            'channel_follower_count': int,
+            'comment_count': int,
+            'view_count': int,
+            'concurrent_view_count': int,
+            'like_count': int,
+            'duration': 14491,
+            'thumbnail': 'https://static.ivory.getloconow.com/default_thumb/59b5970b-23c1-4518-9e96-17ce341299fe.jpg',
+            'tags': ['Gameplay'],
+            'series': 'GTA 5',
+            'timestamp': 1740612872,
+            'modified_timestamp': 1740613037,
+            'upload_date': '20250226',
+            'modified_date': '20250226',
+        },
+    }]
+
+    def _real_extract(self, url):
+        video_type, video_id = self._match_valid_url(url).group('type', 'id')
+        webpage = self._download_webpage(url, video_id)
+        stream = traverse_obj(self._search_nextjs_data(webpage, video_id), (
+            'props', 'pageProps', ('liveStreamData', 'stream'), {dict}, any, {require('stream info')}))
+
+        return {
+            'formats': self._extract_m3u8_formats(stream['conf']['hls'], video_id),
+            'id': video_id,
+            'is_live': video_type == 'streamers',
+            **traverse_obj(stream, {
+                'title': ('title', {str}),
+                'series': ('game_name', {str}),
+                'uploader_id': ('user_uid', {str}),
+                'channel': ('alias', {str}),
+                'description': ('description', {str}),
+                'concurrent_view_count': ('viewersCurrent', {int_or_none}),
+                'view_count': ('total_views', {int_or_none}),
+                'thumbnail': ('thumbnail_url_small', {url_or_none}),
+                'like_count': ('likes', {int_or_none}),
+                'tags': ('tags', ..., {str}),
+                'timestamp': ('started_at', {int_or_none(scale=1000)}),
+                'modified_timestamp': ('updated_at', {int_or_none(scale=1000)}),
+                'comment_count': ('comments_count', {int_or_none}),
+                'channel_follower_count': ('followers_count', {int_or_none}),
+                'duration': ('duration', {int_or_none}),
+            }),
+        }
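Note: `_search_nextjs_data` used above reads the classic `__NEXT_DATA__` JSON blob, which older Next.js pages embed whole in a single script tag, unlike the flight payloads handled further up. A minimal sketch, assuming well-formed markup:

import json
import re

def search_nextjs_data(webpage):
    m = re.search(
        r'<script[^>]+id=["\']__NEXT_DATA__["\'][^>]*>(.+?)</script>',
        webpage, re.DOTALL)
    return json.loads(m.group(1)) if m else None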

yt_dlp/extractor/magellantv.py

@@ -1,35 +1,36 @@
 from .common import InfoExtractor
-from ..utils import parse_age_limit, parse_duration, traverse_obj
+from ..utils import parse_age_limit, parse_duration, url_or_none
+from ..utils.traversal import traverse_obj


 class MagellanTVIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?magellantv\.com/(?:watch|video)/(?P<id>[\w-]+)'
     _TESTS = [{
-        'url': 'https://www.magellantv.com/watch/my-dads-on-death-row?type=v',
+        'url': 'https://www.magellantv.com/watch/incas-the-new-story?type=v',
         'info_dict': {
-            'id': 'my-dads-on-death-row',
+            'id': 'incas-the-new-story',
             'ext': 'mp4',
-            'title': 'My Dad\'s On Death Row',
-            'description': 'md5:33ba23b9f0651fc4537ed19b1d5b0d7a',
-            'duration': 3780.0,
+            'title': 'Incas: The New Story',
+            'description': 'md5:936c7f6d711c02dfb9db22a067b586fe',
             'age_limit': 14,
-            'tags': ['Justice', 'Reality', 'United States', 'True Crime'],
+            'duration': 3060.0,
+            'tags': ['Ancient History', 'Archaeology', 'Anthropology'],
         },
         'params': {'skip_download': 'm3u8'},
     }, {
-        'url': 'https://www.magellantv.com/video/james-bulger-the-new-revelations',
+        'url': 'https://www.magellantv.com/video/tortured-to-death-murdering-the-nanny',
         'info_dict': {
-            'id': 'james-bulger-the-new-revelations',
+            'id': 'tortured-to-death-murdering-the-nanny',
             'ext': 'mp4',
-            'title': 'James Bulger: The New Revelations',
-            'description': 'md5:7b97922038bad1d0fe8d0470d8a189f2',
+            'title': 'Tortured to Death: Murdering the Nanny',
+            'description': 'md5:d87033594fa218af2b1a8b49f52511e5',
+            'age_limit': 14,
             'duration': 2640.0,
-            'age_limit': 0,
-            'tags': ['Investigation', 'True Crime', 'Justice', 'Europe'],
+            'tags': ['True Crime', 'Murder'],
         },
         'params': {'skip_download': 'm3u8'},
     }, {
-        'url': 'https://www.magellantv.com/watch/celebration-nation',
+        'url': 'https://www.magellantv.com/watch/celebration-nation?type=s',
         'info_dict': {
             'id': 'celebration-nation',
             'ext': 'mp4',
@@ -43,10 +44,19 @@ class MagellanTVIE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)
-        data = traverse_obj(self._search_nextjs_data(webpage, video_id), (
-            'props', 'pageProps', 'reactContext',
-            (('video', 'detail'), ('series', 'currentEpisode')), {dict}), get_all=False)
-        formats, subtitles = self._extract_m3u8_formats_and_subtitles(data['jwpVideoUrl'], video_id)
+        context = self._search_nextjs_data(webpage, video_id)['props']['pageProps']['reactContext']
+        data = traverse_obj(context, ((('video', 'detail'), ('series', 'currentEpisode')), {dict}, any))
+
+        formats, subtitles = [], {}
+        for m3u8_url in set(traverse_obj(data, ((('manifests', ..., 'hls'), 'jwp_video_url'), {url_or_none}))):
+            fmts, subs = self._extract_m3u8_formats_and_subtitles(
+                m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
+            formats.extend(fmts)
+            self._merge_subtitles(subs, target=subtitles)
+        if not formats and (error := traverse_obj(context, ('errorDetailPage', 'errorMessage', {str}))):
+            if 'available in your country' in error:
+                self.raise_geo_restricted(msg=error)
+            self.raise_no_formats(f'{self.IE_NAME} said: {error}', expected=True)

         return {
             'id': video_id,

yt_dlp/extractor/medaltv.py

@@ -102,11 +102,10 @@ class MedalTVIE(InfoExtractor):
             item_id = item_id or '%dp' % height
             if item_id not in item_url:
                 return
-            width = int(round(aspect_ratio * height))

             container.append({
                 'url': item_url,
                 id_key: item_id,
-                'width': width,
+                'width': round(aspect_ratio * height),
                 'height': height,
             })

yt_dlp/extractor/mitele.py

@@ -1,5 +1,7 @@
 from .telecinco import TelecincoBaseIE
+from ..networking.exceptions import HTTPError
 from ..utils import (
+    ExtractorError,
     int_or_none,
     parse_iso8601,
 )
@@ -79,7 +81,17 @@ class MiTeleIE(TelecincoBaseIE):

     def _real_extract(self, url):
         display_id = self._match_id(url)
-        webpage = self._download_webpage(url, display_id)
+
+        try:  # yt-dlp's default user-agents are too old and blocked by akamai
+            webpage = self._download_webpage(url, display_id, headers={
+                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0',
+            })
+        except ExtractorError as e:
+            if not isinstance(e.cause, HTTPError) or e.cause.status != 403:
+                raise
+            # Retry with impersonation if hardcoded UA is insufficient to bypass akamai
+            webpage = self._download_webpage(url, display_id, impersonate=True)
+
         pre_player = self._search_json(
             r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=',
             webpage, 'Pre Player', display_id)['prePlayer']
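Note: the MiTele fix is a two-stage fallback: first retry with a modern hardcoded User-Agent, and only on a 403 fall back to full TLS impersonation. A rough standalone sketch of that control flow (the second stage uses curl_cffi's `impersonate=`, the same library yt-dlp's impersonation support builds on; error handling trimmed):

import urllib.error
import urllib.request

MODERN_UA = 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0'

def fetch(url):
    try:
        req = urllib.request.Request(url, headers={'User-Agent': MODERN_UA})
        with urllib.request.urlopen(req) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as e:
        if e.code != 403:
            raise
        # stage two: browser TLS-fingerprint impersonation (requires curl_cffi)
        from curl_cffi import requests as cc_requests
        return cc_requests.get(url, impersonate='chrome').text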

yt_dlp/extractor/moviepilot.py

@@ -3,8 +3,8 @@ from .dailymotion import DailymotionIE

 class MoviepilotIE(InfoExtractor):
-    _IE_NAME = 'moviepilot'
-    _IE_DESC = 'Moviepilot trailer'
+    IE_NAME = 'moviepilot'
+    IE_DESC = 'Moviepilot trailer'
     _VALID_URL = r'https?://(?:www\.)?moviepilot\.de/movies/(?P<id>[^/]+)'

     _TESTS = [{

yt_dlp/extractor/msn.py

@@ -1,167 +1,215 @@
-import re
-
 from .common import InfoExtractor
 from ..utils import (
     ExtractorError,
+    clean_html,
     determine_ext,
     int_or_none,
-    unescapeHTML,
+    parse_iso8601,
+    url_or_none,
 )
+from ..utils.traversal import traverse_obj


 class MSNIE(InfoExtractor):
-    _WORKING = False
-    _VALID_URL = r'https?://(?:(?:www|preview)\.)?msn\.com/(?:[^/]+/)+(?P<display_id>[^/]+)/[a-z]{2}-(?P<id>[\da-zA-Z]+)'
+    _VALID_URL = r'https?://(?:(?:www|preview)\.)?msn\.com/(?P<locale>[a-z]{2}-[a-z]{2})/(?:[^/?#]+/)+(?P<display_id>[^/?#]+)/[a-z]{2}-(?P<id>[\da-zA-Z]+)'
     _TESTS = [{
-        'url': 'https://www.msn.com/en-in/money/video/7-ways-to-get-rid-of-chest-congestion/vi-BBPxU6d',
-        'md5': '087548191d273c5c55d05028f8d2cbcd',
+        'url': 'https://www.msn.com/en-gb/video/news/president-macron-interrupts-trump-over-ukraine-funding/vi-AA1zMcD7',
         'info_dict': {
-            'id': 'BBPxU6d',
-            'display_id': '7-ways-to-get-rid-of-chest-congestion',
+            'id': 'AA1zMcD7',
             'ext': 'mp4',
-            'title': 'Seven ways to get rid of chest congestion',
-            'description': '7 Ways to Get Rid of Chest Congestion',
-            'duration': 88,
-            'uploader': 'Health',
-            'uploader_id': 'BBPrMqa',
+            'display_id': 'president-macron-interrupts-trump-over-ukraine-funding',
+            'title': 'President Macron interrupts Trump over Ukraine funding',
+            'description': 'md5:5fd3857ac25849e7a56cb25fbe1a2a8b',
+            'uploader': 'k! News UK',
+            'uploader_id': 'BB1hz5Rj',
+            'duration': 59,
+            'thumbnail': 'https://img-s-msn-com.akamaized.net/tenant/amp/entityid/AA1zMagX.img',
+            'tags': 'count:14',
+            'timestamp': 1740510914,
+            'upload_date': '20250225',
+            'release_timestamp': 1740513600,
+            'release_date': '20250225',
+            'modified_timestamp': 1741413241,
+            'modified_date': '20250308',
         },
     }, {
-        # Article, multiple Dailymotion Embeds
-        'url': 'https://www.msn.com/en-in/money/sports/hottest-football-wags-greatest-footballers-turned-managers-and-more/ar-BBpc7Nl',
+        'url': 'https://www.msn.com/en-gb/video/watch/films-success-saved-adam-pearsons-acting-career/vi-AA1znZGE?ocid=hpmsn',
         'info_dict': {
-            'id': 'BBpc7Nl',
+            'id': 'AA1znZGE',
+            'ext': 'mp4',
+            'display_id': 'films-success-saved-adam-pearsons-acting-career',
+            'title': "Films' success saved Adam Pearson's acting career",
+            'description': 'md5:98c05f7bd9ab4f9c423400f62f2d3da5',
+            'uploader': 'Sky News',
+            'uploader_id': 'AA2eki',
+            'duration': 52,
+            'thumbnail': 'https://img-s-msn-com.akamaized.net/tenant/amp/entityid/AA1zo7nU.img',
+            'timestamp': 1739993965,
+            'upload_date': '20250219',
+            'release_timestamp': 1739977753,
+            'release_date': '20250219',
+            'modified_timestamp': 1742076259,
+            'modified_date': '20250315',
         },
-        'playlist_mincount': 4,
     }, {
-        'url': 'http://www.msn.com/en-ae/news/offbeat/meet-the-nine-year-old-self-made-millionaire/ar-BBt6ZKf',
-        'only_matching': True,
-    }, {
-        'url': 'http://www.msn.com/en-ae/video/watch/obama-a-lot-of-people-will-be-disappointed/vi-AAhxUMH',
-        'only_matching': True,
-    }, {
-        # geo restricted
-        'url': 'http://www.msn.com/en-ae/foodanddrink/joinourtable/the-first-fart-makes-you-laugh-the-last-fart-makes-you-cry/vp-AAhzIBU',
-        'only_matching': True,
-    }, {
-        'url': 'http://www.msn.com/en-ae/entertainment/bollywood/watch-how-salman-khan-reacted-when-asked-if-he-would-apologize-for-his-raped-woman-comment/vi-AAhvzW6',
-        'only_matching': True,
-    }, {
-        # Vidible(AOL) Embed
-        'url': 'https://www.msn.com/en-us/money/other/jupiter-is-about-to-come-so-close-you-can-see-its-moons-with-binoculars/vi-AACqsHR',
-        'only_matching': True,
+        'url': 'https://www.msn.com/en-us/entertainment/news/rock-frontman-replacements-you-might-not-know-happened/vi-AA1yLVcD',
+        'info_dict': {
+            'id': 'AA1yLVcD',
+            'ext': 'mp4',
+            'display_id': 'rock-frontman-replacements-you-might-not-know-happened',
+            'title': 'Rock Frontman Replacements You Might Not Know Happened',
+            'description': 'md5:451a125496ff0c9f6816055bb1808da9',
+            'uploader': 'Grunge (Video)',
+            'uploader_id': 'BB1oveoV',
+            'duration': 596,
+            'thumbnail': 'https://img-s-msn-com.akamaized.net/tenant/amp/entityid/AA1yM4OJ.img',
+            'timestamp': 1739223456,
+            'upload_date': '20250210',
+            'release_timestamp': 1739219731,
+            'release_date': '20250210',
+            'modified_timestamp': 1741427272,
+            'modified_date': '20250308',
+        },
     }, {
         # Dailymotion Embed
-        'url': 'https://www.msn.com/es-ve/entretenimiento/watch/winston-salem-paire-refait-des-siennes-en-perdant-sa-raquette-au-service/vp-AAG704L',
-        'only_matching': True,
+        'url': 'https://www.msn.com/de-de/nachrichten/other/the-first-descendant-gameplay-trailer-zu-serena-der-neuen-gefl%C3%BCgelten-nachfahrin/vi-AA1B1d06',
+        'info_dict': {
+            'id': 'x9g6oli',
+            'ext': 'mp4',
+            'title': 'The First Descendant: Gameplay-Trailer zu Serena, der neuen geflügelten Nachfahrin',
+            'description': '',
+            'uploader': 'MeinMMO',
+            'uploader_id': 'x2mvqi4',
+            'view_count': int,
+            'like_count': int,
+            'age_limit': 0,
+            'duration': 60,
+            'thumbnail': 'https://s1.dmcdn.net/v/Y3fO61drj56vPB9SS/x1080',
+            'tags': ['MeinMMO', 'The First Descendant'],
+            'timestamp': 1742124877,
+            'upload_date': '20250316',
+        },
     }, {
-        # YouTube Embed
-        'url': 'https://www.msn.com/en-in/money/news/meet-vikram-%E2%80%94-chandrayaan-2s-lander/vi-AAGUr0v',
-        'only_matching': True,
+        # Youtube Embed
+        'url': 'https://www.msn.com/en-gb/video/webcontent/web-content/vi-AA1ybFaJ',
+        'info_dict': {
+            'id': 'kQSChWu95nE',
+            'ext': 'mp4',
+            'title': '7 Daily Habits to Nurture Your Personal Growth',
+            'description': 'md5:6f233c68341b74dee30c8c121924e827',
+            'uploader': 'TopThink',
+            'uploader_id': '@TopThink',
+            'uploader_url': 'https://www.youtube.com/@TopThink',
+            'channel': 'TopThink',
+            'channel_id': 'UCMlGmHokrQRp-RaNO7aq4Uw',
+            'channel_url': 'https://www.youtube.com/channel/UCMlGmHokrQRp-RaNO7aq4Uw',
+            'channel_is_verified': True,
+            'channel_follower_count': int,
+            'comment_count': int,
+            'view_count': int,
+            'like_count': int,
+            'age_limit': 0,
+            'duration': 705,
+            'thumbnail': 'https://i.ytimg.com/vi/kQSChWu95nE/maxresdefault.jpg',
+            'categories': ['Howto & Style'],
+            'tags': ['topthink', 'top think', 'personal growth'],
+            'timestamp': 1722711620,
+            'upload_date': '20240803',
+            'playable_in_embed': True,
+            'availability': 'public',
+            'live_status': 'not_live',
+        },
     }, {
-        # NBCSports Embed
-        'url': 'https://www.msn.com/en-us/money/football_nfl/week-13-preview-redskins-vs-panthers/vi-BBXsCDb',
-        'only_matching': True,
+        # Article with social embed
+        'url': 'https://www.msn.com/en-in/news/techandscience/watch-earth-sets-and-rises-behind-moon-in-breathtaking-blue-ghost-video/ar-AA1zKoAc',
+        'info_dict': {
+            'id': 'AA1zKoAc',
+            'title': 'Watch: Earth sets and rises behind Moon in breathtaking Blue Ghost video',
+            'description': 'md5:0ad51cfa77e42e7f0c46cf98a619dbbf',
+            'uploader': 'India Today',
+            'uploader_id': 'AAyFWG',
+            'tags': 'count:11',
+            'timestamp': 1740485034,
+            'upload_date': '20250225',
+            'release_timestamp': 1740484875,
+            'release_date': '20250225',
+            'modified_timestamp': 1740488561,
+            'modified_date': '20250225',
+        },
+        'playlist_count': 1,
     }]

     def _real_extract(self, url):
-        display_id, page_id = self._match_valid_url(url).groups()
+        locale, display_id, page_id = self._match_valid_url(url).group('locale', 'display_id', 'id')

-        webpage = self._download_webpage(url, display_id)
+        json_data = self._download_json(
+            f'https://assets.msn.com/content/view/v2/Detail/{locale}/{page_id}', page_id)

-        entries = []
-        for _, metadata in re.findall(r'data-metadata\s*=\s*(["\'])(?P<data>.+?)\1', webpage):
-            video = self._parse_json(unescapeHTML(metadata), display_id)
-
-            provider_id = video.get('providerId')
-            player_name = video.get('playerName')
-            if player_name and provider_id:
-                entry = None
-                if player_name == 'AOL':
-                    if provider_id.startswith('http'):
-                        provider_id = self._search_regex(
-                            r'https?://delivery\.vidible\.tv/video/redirect/([0-9a-f]{24})',
-                            provider_id, 'vidible id')
-                    entry = self.url_result(
-                        'aol-video:' + provider_id, 'Aol', provider_id)
-                elif player_name == 'Dailymotion':
-                    entry = self.url_result(
-                        'https://www.dailymotion.com/video/' + provider_id,
-                        'Dailymotion', provider_id)
-                elif player_name == 'YouTube':
-                    entry = self.url_result(
-                        provider_id, 'Youtube', provider_id)
-                elif player_name == 'NBCSports':
-                    entry = self.url_result(
-                        'http://vplayer.nbcsports.com/p/BxmELC/nbcsports_embed/select/media/' + provider_id,
-                        'NBCSportsVPlayer', provider_id)
-                if entry:
-                    entries.append(entry)
-                    continue
-
-            video_id = video['uuid']
-            title = video['title']
+        common_metadata = traverse_obj(json_data, {
+            'title': ('title', {str}),
+            'description': (('abstract', ('body', {clean_html})), {str}, filter, any),
+            'timestamp': ('createdDateTime', {parse_iso8601}),
+            'release_timestamp': ('publishedDateTime', {parse_iso8601}),
+            'modified_timestamp': ('updatedDateTime', {parse_iso8601}),
+            'thumbnail': ('thumbnail', 'image', 'url', {url_or_none}),
+            'duration': ('videoMetadata', 'playTime', {int_or_none}),
+            'tags': ('keywords', ..., {str}),
+            'uploader': ('provider', 'name', {str}),
+            'uploader_id': ('provider', 'id', {str}),
+        })
+
+        page_type = json_data['type']
+        source_url = traverse_obj(json_data, ('sourceHref', {url_or_none}))
+        if page_type == 'video':
+            if traverse_obj(json_data, ('thirdPartyVideoPlayer', 'enabled')) and source_url:
+                return self.url_result(source_url)

             formats = []
-            for file_ in video.get('videoFiles', []):
-                format_url = file_.get('url')
-                if not format_url:
-                    continue
-                if 'format=m3u8-aapl' in format_url:
-                    # m3u8_native should not be used here until
-                    # https://github.com/ytdl-org/youtube-dl/issues/9913 is fixed
-                    formats.extend(self._extract_m3u8_formats(
-                        format_url, display_id, 'mp4',
-                        m3u8_id='hls', fatal=False))
-                elif 'format=mpd-time-csf' in format_url:
-                    formats.extend(self._extract_mpd_formats(
-                        format_url, display_id, 'dash', fatal=False))
-                elif '.ism' in format_url:
-                    if format_url.endswith('.ism'):
-                        format_url += '/manifest'
-                    formats.extend(self._extract_ism_formats(
-                        format_url, display_id, 'mss', fatal=False))
-                else:
-                    format_id = file_.get('formatCode')
-                    formats.append({
-                        'url': format_url,
-                        'ext': 'mp4',
-                        'format_id': format_id,
-                        'width': int_or_none(file_.get('width')),
-                        'height': int_or_none(file_.get('height')),
-                        'vbr': int_or_none(self._search_regex(r'_(\d+)\.mp4', format_url, 'vbr', default=None)),
-                        'quality': 1 if format_id == '1001' else None,
-                    })
-
             subtitles = {}
-            for file_ in video.get('files', []):
-                format_url = file_.get('url')
-                format_code = file_.get('formatCode')
-                if not format_url or not format_code:
-                    continue
-                if str(format_code) == '3100':
-                    subtitles.setdefault(file_.get('culture', 'en'), []).append({
-                        'ext': determine_ext(format_url, 'ttml'),
-                        'url': format_url,
-                    })
+            for file in traverse_obj(json_data, ('videoMetadata', 'externalVideoFiles', lambda _, v: url_or_none(v['url']))):
+                file_url = file['url']
+                ext = determine_ext(file_url)
+                if ext == 'm3u8':
+                    fmts, subs = self._extract_m3u8_formats_and_subtitles(
+                        file_url, page_id, 'mp4', m3u8_id='hls', fatal=False)
+                    formats.extend(fmts)
+                    self._merge_subtitles(subs, target=subtitles)
+                elif ext == 'mpd':
+                    fmts, subs = self._extract_mpd_formats_and_subtitles(
+                        file_url, page_id, mpd_id='dash', fatal=False)
+                    formats.extend(fmts)
+                    self._merge_subtitles(subs, target=subtitles)
+                else:
+                    formats.append(
+                        traverse_obj(file, {
+                            'url': 'url',
+                            'format_id': ('format', {str}),
+                            'filesize': ('fileSize', {int_or_none}),
+                            'height': ('height', {int_or_none}),
+                            'width': ('width', {int_or_none}),
+                        }))
+            for caption in traverse_obj(json_data, ('videoMetadata', 'closedCaptions', lambda _, v: url_or_none(v['href']))):
+                lang = caption.get('locale') or 'en-us'
+                subtitles.setdefault(lang, []).append({
+                    'url': caption['href'],
+                    'ext': 'ttml',
+                })

-            entries.append({
-                'id': video_id,
-                'display_id': display_id,
-                'title': title,
-                'description': video.get('description'),
-                'thumbnail': video.get('headlineImage', {}).get('url'),
-                'duration': int_or_none(video.get('durationSecs')),
-                'uploader': video.get('sourceFriendly'),
-                'uploader_id': video.get('providerId'),
-                'creator': video.get('creator'),
-                'subtitles': subtitles,
-                'formats': formats,
-            })
+            return {
+                'id': page_id,
+                'display_id': display_id,
+                'formats': formats,
+                'subtitles': subtitles,
+                **common_metadata,
+            }
+        elif page_type == 'webcontent':
+            if not source_url:
+                raise ExtractorError('Could not find source URL')
+            return self.url_result(source_url)
+        elif page_type == 'article':
+            entries = []
+            for embed_url in traverse_obj(json_data, ('socialEmbeds', ..., 'postUrl', {url_or_none})):
+                entries.append(self.url_result(embed_url))

-        if not entries:
-            error = unescapeHTML(self._search_regex(
-                r'data-error=(["\'])(?P<error>.+?)\1',
-                webpage, 'error', group='error'))
-            raise ExtractorError(f'{self.IE_NAME} said: {error}', expected=True)
+            return self.playlist_result(entries, page_id, **common_metadata)

-        return self.playlist_result(entries, page_id)
+        raise ExtractorError(f'Unsupported page type: {page_type}')
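Note: the rewritten MSN extractor stops scraping the page entirely and instead hits a JSON detail endpoint, then branches on the reported page type ('video', 'webcontent' or 'article'). A minimal sketch of that first call (endpoint copied from the diff; treat the response schema as unstable):

import json
import urllib.request

def msn_detail(locale, page_id):
    # Detail endpoint as used by the new extractor above
    url = f'https://assets.msn.com/content/view/v2/Detail/{locale}/{page_id}'
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())

# data = msn_detail('en-gb', 'AA1zMcD7'); data['type'] decides the extraction branch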

yt_dlp/extractor/n1.py

@@ -4,7 +4,9 @@ from .common import InfoExtractor
 from ..utils import (
     extract_attributes,
     unified_timestamp,
+    url_or_none,
 )
+from ..utils.traversal import traverse_obj


 class N1InfoAssetIE(InfoExtractor):
@@ -35,9 +37,9 @@ class N1InfoIIE(InfoExtractor):
     IE_NAME = 'N1Info:article'
     _VALID_URL = r'https?://(?:(?:\w+\.)?n1info\.\w+|nova\.rs)/(?:[^/?#]+/){1,2}(?P<id>[^/?#]+)'
     _TESTS = [{
-        # Youtube embedded
+        # YouTube embedded
         'url': 'https://rs.n1info.com/sport-klub/tenis/kako-je-djokovic-propustio-istorijsku-priliku-video/',
-        'md5': '01ddb6646d0fd9c4c7d990aa77fe1c5a',
+        'md5': '987ce6fd72acfecc453281e066b87973',
         'info_dict': {
             'id': 'L5Hd4hQVUpk',
             'ext': 'mp4',
@@ -45,7 +47,26 @@ class N1InfoIIE(InfoExtractor):
             'title': 'Ozmo i USO21, ep. 13: Novak Đoković Danil Medvedev | Ključevi Poraza, Budućnost | SPORT KLUB TENIS',
             'description': 'md5:467f330af1effedd2e290f10dc31bb8e',
             'uploader': 'Sport Klub',
-            'uploader_id': 'sportklub',
+            'uploader_id': '@sportklub',
+            'uploader_url': 'https://www.youtube.com/@sportklub',
+            'channel': 'Sport Klub',
+            'channel_id': 'UChpzBje9Ro6CComXe3BgNaw',
+            'channel_url': 'https://www.youtube.com/channel/UChpzBje9Ro6CComXe3BgNaw',
+            'channel_is_verified': True,
+            'channel_follower_count': int,
+            'comment_count': int,
+            'view_count': int,
+            'like_count': int,
+            'age_limit': 0,
+            'duration': 1049,
+            'thumbnail': 'https://i.ytimg.com/vi/L5Hd4hQVUpk/maxresdefault.jpg',
+            'chapters': 'count:9',
+            'categories': ['Sports'],
+            'tags': 'count:10',
+            'timestamp': 1631522787,
+            'playable_in_embed': True,
+            'availability': 'public',
+            'live_status': 'not_live',
         },
     }, {
         'url': 'https://rs.n1info.com/vesti/djilas-los-plan-za-metro-nece-resiti-nijedan-saobracajni-problem/',
@@ -55,6 +76,7 @@ class N1InfoIIE(InfoExtractor):
             'title': 'Đilas: Predlog izgradnje metroa besmislen; SNS odbacuje navode',
             'upload_date': '20210924',
             'timestamp': 1632481347,
+            'thumbnail': 'http://n1info.rs/wp-content/themes/ucnewsportal-n1/dist/assets/images/placeholder-image-video.jpg',
         },
         'params': {
             'skip_download': True,
@@ -67,6 +89,7 @@ class N1InfoIIE(InfoExtractor):
             'title': 'Zadnji dnevi na kopališču Ilirija: “Ilirija ni umrla, ubili so jo”',
             'timestamp': 1632567630,
             'upload_date': '20210925',
+            'thumbnail': 'https://n1info.si/wp-content/uploads/2021/09/06/1630945843-tomaz3.png',
         },
         'params': {
             'skip_download': True,
@@ -81,6 +104,14 @@ class N1InfoIIE(InfoExtractor):
             'upload_date': '20210924',
             'timestamp': 1632448649.0,
             'uploader': 'YouLotWhatDontStop',
+            'display_id': 'pu9wbx',
+            'channel_id': 'serbia',
+            'comment_count': int,
+            'like_count': int,
+            'dislike_count': int,
+            'age_limit': 0,
+            'duration': 134,
+            'thumbnail': 'https://external-preview.redd.it/5nmmawSeGx60miQM3Iq-ueC9oyCLTLjjqX-qqY8uRsc.png?format=pjpg&auto=webp&s=2f973400b04d23f871b608b178e47fc01f9b8f1d',
         },
         'params': {
             'skip_download': True,
@@ -93,6 +124,7 @@ class N1InfoIIE(InfoExtractor):
             'title': 'Žaklina Tatalović Ani Brnabić: Pričate laži (VIDEO)',
             'upload_date': '20211102',
             'timestamp': 1635861677,
+            'thumbnail': 'https://nova.rs/wp-content/uploads/2021/11/02/1635860298-TNJG_Ana_Brnabic_i_Zaklina_Tatalovic_100_dana_Vlade_GP.jpg',
         },
     }, {
         'url': 'https://n1info.rs/vesti/cuta-biti-u-kosovskoj-mitrovici-znaci-da-te-docekaju-eksplozivnim-napravama/',
@@ -104,6 +136,16 @@ class N1InfoIIE(InfoExtractor):
             'timestamp': 1687290536,
             'thumbnail': 'https://cdn.brid.tv/live/partners/26827/snapshot/1332368_th_6492013a8356f_1687290170.jpg',
         },
+    }, {
+        'url': 'https://n1info.rs/vesti/vuciceva-turneja-po-srbiji-najavljuje-kontrarevoluciju-preti-svom-narodu-vredja-novinare/',
+        'info_dict': {
+            'id': '2025974',
+            'ext': 'mp4',
+            'title': 'Vučićeva turneja po Srbiji: Najavljuje kontrarevoluciju, preti svom narodu, vređa novinare',
+            'thumbnail': 'https://cdn-uc.brid.tv/live/partners/26827/snapshot/2025974_fhd_67c4a23280a81_1740939826.jpg',
+            'timestamp': 1740939936,
+            'upload_date': '20250302',
+        },
     }, {
         'url': 'https://hr.n1info.com/vijesti/pravobraniteljica-o-ubojstvu-u-zagrebu-radi-se-o-doista-nezapamcenoj-situaciji/',
         'only_matching': True,
@@ -115,11 +157,11 @@ class N1InfoIIE(InfoExtractor):
         title = self._html_search_regex(r'<h1[^>]+>(.+?)</h1>', webpage, 'title')
         timestamp = unified_timestamp(self._html_search_meta('article:published_time', webpage))
-        plugin_data = self._html_search_meta('BridPlugin', webpage)
+        plugin_data = re.findall(r'\$bp\("(?:Brid|TargetVideo)_\d+",\s(.+)\);', webpage)
         entries = []
         if plugin_data:
             site_id = self._html_search_regex(r'site:(\d+)', webpage, 'site id')
-            for video_data in re.findall(r'\$bp\("Brid_\d+", (.+)\);', webpage):
+            for video_data in plugin_data:
                 video_id = self._parse_json(video_data, title)['video']
                 entries.append({
                     'id': video_id,
@@ -140,7 +182,7 @@ class N1InfoIIE(InfoExtractor):
                     'url': video_data.get('data-url'),
                     'id': video_data.get('id'),
                     'title': title,
-                    'thumbnail': video_data.get('data-thumbnail'),
+                    'thumbnail': traverse_obj(video_data, (('data-thumbnail', 'data-default_thumbnail'), {url_or_none}, any)),
                     'timestamp': timestamp,
                     'ie_key': 'N1InfoAsset',
                 })
@@ -152,7 +194,7 @@ class N1InfoIIE(InfoExtractor):
             if url.startswith('https://www.youtube.com'):
                 entries.append(self.url_result(url, ie='Youtube'))
             elif url.startswith('https://www.redditmedia.com'):
-                entries.append(self.url_result(url, ie='RedditR'))
+                entries.append(self.url_result(url, ie='Reddit'))

         return {
             '_type': 'playlist',
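The widened regex above now also catches the TargetVideo variant of the player bootstrap call. A quick sketch against a made-up page snippet (not a real N1 page):

import re

# Made-up $bp() bootstrap call in the TargetVideo_N form
webpage = '$bp("TargetVideo_123", {"video": 2025974, "site": 26827});'

# Matches both Brid_N and TargetVideo_N calls and captures the JSON
# options object passed as the second argument
plugin_data = re.findall(r'\$bp\("(?:Brid|TargetVideo)_\d+",\s(.+)\);', webpage)
print(plugin_data)  # ['{"video": 2025974, "site": 26827}']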


@@ -736,7 +736,7 @@ class NBCStationsIE(InfoExtractor):
         webpage = self._download_webpage(url, video_id)
         nbc_data = self._search_json(
-            r'<script>\s*var\s+nbc\s*=', webpage, 'NBC JSON data', video_id)
+            r'(?:<script>\s*var\s+nbc\s*=|Object\.assign\(nbc,)', webpage, 'NBC JSON data', video_id)
         pdk_acct = nbc_data.get('pdkAcct') or 'Yh1nAC'
         fw_ssid = traverse_obj(nbc_data, ('video', 'fwSSID'))
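The extended pattern also covers pages that populate the config via Object.assign(nbc, {...}). A rough standalone approximation with a made-up snippet (_search_json balances nested braces properly; the non-greedy group below is only a flat-object stand-in):

import json
import re

# Made-up page snippet; a real page may nest objects, which this
# simplified regex would not handle
webpage = '<script>Object.assign(nbc, {"pdkAcct": "Yh1nAC"});</script>'

m = re.search(r'(?:<script>\s*var\s+nbc\s*=|Object\.assign\(nbc,)\s*(\{.*?\})', webpage)
print(json.loads(m.group(1)))  # {'pdkAcct': 'Yh1nAC'}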


@@ -13,11 +13,13 @@ from ..utils import (
     ExtractorError,
     OnDemandPagedList,
     clean_html,
+    determine_ext,
     float_or_none,
     int_or_none,
     join_nonempty,
     parse_duration,
     parse_iso8601,
+    parse_qs,
     parse_resolution,
     qualities,
     remove_start,
@@ -26,6 +28,7 @@ from ..utils import (
     try_get,
     unescapeHTML,
     update_url_query,
+    url_basename,
     url_or_none,
     urlencode_postdata,
     urljoin,
@@ -430,6 +433,7 @@ class NiconicoIE(InfoExtractor):
                 'format_id': ('id', {str}),
                 'abr': ('bitRate', {float_or_none(scale=1000)}),
                 'asr': ('samplingRate', {int_or_none}),
+                'quality': ('qualityLevel', {int_or_none}),
             }), get_all=False),
             'acodec': 'aac',
         }
@@ -441,7 +445,9 @@ class NiconicoIE(InfoExtractor):
         min_abr = min(traverse_obj(audios, (..., 'bitRate', {float_or_none})), default=0) / 1000
         for video_fmt in video_fmts:
             video_fmt['tbr'] -= min_abr
-            video_fmt['format_id'] = f'video-{video_fmt["tbr"]:.0f}'
+            video_fmt['format_id'] = url_basename(video_fmt['url']).rpartition('.')[0]
+            video_fmt['quality'] = traverse_obj(videos, (
+                lambda _, v: v['id'] == video_fmt['format_id'], 'qualityLevel', {int_or_none}, any)) or -1
             yield video_fmt

     def _real_extract(self, url):
@@ -1033,6 +1039,7 @@ class NiconicoLiveIE(InfoExtractor):
                 thumbnails.append({
                     'id': f'{name}_{width}x{height}',
                     'url': img_url,
+                    'ext': traverse_obj(parse_qs(img_url), ('image', 0, {determine_ext(default_ext='jpg')})),
                     **res,
                 })
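Two small helpers drive these changes: url_basename for the new format IDs and determine_ext, partially applied exactly as in the diff, for the thumbnail extension. A sketch with made-up URLs:

from yt_dlp.utils import determine_ext, parse_qs, url_basename
from yt_dlp.utils.traversal import traverse_obj

# format_id is now derived from the manifest filename instead of the bitrate
manifest_url = 'https://example.com/v/video-h264-1080p.m3u8'
print(url_basename(manifest_url).rpartition('.')[0])  # video-h264-1080p

# The thumbnail extension comes from the 'image' query parameter,
# falling back to 'jpg' when it cannot be determined
img_url = 'https://example.com/thumb?image=/path/custom.png'
print(traverse_obj(parse_qs(img_url), ('image', 0, {determine_ext(default_ext='jpg')})))  # png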


@@ -1,34 +1,46 @@
+import json
+import re
+
+from .brightcove import BrightcoveNewIE
 from .common import InfoExtractor
 from ..utils import (
-    ExtractorError,
     float_or_none,
     int_or_none,
-    smuggle_url,
+    parse_iso8601,
+    parse_resolution,
     str_or_none,
-    try_get,
-    unified_strdate,
-    unified_timestamp,
+    url_or_none,
 )
+from ..utils.traversal import require, traverse_obj, value


 class NineNowIE(InfoExtractor):
     IE_NAME = '9now.com.au'
-    _VALID_URL = r'https?://(?:www\.)?9now\.com\.au/(?:[^/]+/){2}(?P<id>[^/?#]+)'
-    _GEO_COUNTRIES = ['AU']
+    _VALID_URL = r'https?://(?:www\.)?9now\.com\.au/(?:[^/?#]+/){2}(?P<id>(?P<type>clip|episode)-[^/?#]+)'
+    _GEO_BYPASS = False
     _TESTS = [{
         # clip
-        'url': 'https://www.9now.com.au/afl-footy-show/2016/clip-ciql02091000g0hp5oktrnytc',
-        'md5': '17cf47d63ec9323e562c9957a968b565',
+        'url': 'https://www.9now.com.au/today/season-2025/clip-cm8hw9h5z00080hquqa5hszq7',
         'info_dict': {
-            'id': '16801',
+            'id': '6370295582112',
             'ext': 'mp4',
-            'title': 'St. Kilda\'s Joey Montagna on the potential for a player\'s strike',
-            'description': 'Is a boycott of the NAB Cup "on the table"?',
+            'title': 'Would Karl Stefanovic be able to land a plane?',
+            'description': 'The Today host\'s skills are put to the test with the latest simulation tech.',
             'uploader_id': '4460760524001',
-            'upload_date': '20160713',
-            'timestamp': 1468421266,
+            'duration': 197.376,
+            'tags': ['flights', 'technology', 'Karl Stefanovic'],
+            'season': 'Season 2025',
+            'season_number': 2025,
+            'series': 'TODAY',
+            'timestamp': 1742507988,
+            'upload_date': '20250320',
+            'release_timestamp': 1742507983,
+            'release_date': '20250320',
+            'thumbnail': r're:https?://.+/1920x0/.+\.jpg',
         },
+        'params': {
+            'skip_download': 'HLS/DASH fragments and mp4 URLs are geo-restricted; only available in AU',
+        },
-        'skip': 'Only available in Australia',
     }, {
         # episode
         'url': 'https://www.9now.com.au/afl-footy-show/2016/episode-19',
@@ -41,7 +53,7 @@ class NineNowIE(InfoExtractor):
         # episode of series
         'url': 'https://www.9now.com.au/lego-masters/season-3/episode-3',
         'info_dict': {
-            'id': '6249614030001',
+            'id': '6308830406112',
             'title': 'Episode 3',
             'ext': 'mp4',
             'season_number': 3,
@@ -50,72 +62,87 @@ class NineNowIE(InfoExtractor):
             'uploader_id': '4460760524001',
             'timestamp': 1619002200,
             'upload_date': '20210421',
+            'duration': 3574.085,
+            'thumbnail': r're:https?://.+/1920x0/.+\.jpg',
+            'tags': ['episode'],
+            'series': 'Lego Masters',
+            'season': 'Season 3',
+            'episode': 'Episode 3',
+            'release_timestamp': 1619002200,
+            'release_date': '20210421',
         },
-        'expected_warnings': ['Ignoring subtitle tracks'],
         'params': {
-            'skip_download': True,
+            'skip_download': 'HLS/DASH fragments and mp4 URLs are geo-restricted; only available in AU',
+        },
+    }, {
+        'url': 'https://www.9now.com.au/married-at-first-sight/season-12/episode-1',
+        'info_dict': {
+            'id': '6367798770112',
+            'ext': 'mp4',
+            'title': 'Episode 1',
+            'description': r're:The cultural sensation of Married At First Sight returns with our first weddings! .{90}$',
+            'uploader_id': '4460760524001',
+            'duration': 5415.079,
+            'thumbnail': r're:https?://.+/1920x0/.+\.png',
+            'tags': ['episode'],
+            'season': 'Season 12',
+            'season_number': 12,
+            'episode': 'Episode 1',
+            'episode_number': 1,
+            'series': 'Married at First Sight',
+            'timestamp': 1737973800,
+            'upload_date': '20250127',
+            'release_timestamp': 1737973800,
+            'release_date': '20250127',
+        },
+        'params': {
+            'skip_download': 'HLS/DASH fragments and mp4 URLs are geo-restricted; only available in AU',
         },
     }]
-    BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/4460760524001/default_default/index.html?videoId=%s'
+    BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/4460760524001/default_default/index.html?videoId={}'
+
+    # XXX: For parsing next.js v15+ data; see also yt_dlp.extractor.francetv and yt_dlp.extractor.goplay
+    def _find_json(self, s):
+        return self._search_json(
+            r'\w+\s*:\s*', s, 'next js data', None, contains_pattern=r'\[(?s:.+)\]', default=None)

     def _real_extract(self, url):
-        display_id = self._match_id(url)
+        display_id, video_type = self._match_valid_url(url).group('id', 'type')
         webpage = self._download_webpage(url, display_id)
-        page_data = self._parse_json(self._search_regex(
-            r'window\.__data\s*=\s*({.*?});', webpage,
-            'page data', default='{}'), display_id, fatal=False)
-        if not page_data:
-            page_data = self._parse_json(self._parse_json(self._search_regex(
-                r'window\.__data\s*=\s*JSON\.parse\s*\(\s*(".+?")\s*\)\s*;',
-                webpage, 'page data'), display_id), display_id)
-        for kind in ('episode', 'clip'):
-            current_key = page_data.get(kind, {}).get(
-                f'current{kind.capitalize()}Key')
-            if not current_key:
-                continue
-            cache = page_data.get(kind, {}).get(f'{kind}Cache', {})
-            if not cache:
-                continue
-            common_data = {
-                'episode': (cache.get(current_key) or next(iter(cache.values())))[kind],
-                'season': (cache.get(current_key) or next(iter(cache.values()))).get('season', None),
-            }
-            break
-        else:
-            raise ExtractorError('Unable to find video data')
+        common_data = traverse_obj(
+            re.findall(r'<script[^>]*>\s*self\.__next_f\.push\(\s*(\[.+?\])\s*\);?\s*</script>', webpage),
+            (..., {json.loads}, ..., {self._find_json},
+             lambda _, v: v['payload'][video_type]['slug'] == display_id,
+             'payload', any, {require('video data')}))

-        if not self.get_param('allow_unplayable_formats') and try_get(common_data, lambda x: x['episode']['video']['drm'], bool):
+        if traverse_obj(common_data, (video_type, 'video', 'drm', {bool})):
             self.report_drm(display_id)
-        brightcove_id = try_get(
-            common_data, lambda x: x['episode']['video']['brightcoveId'], str) or 'ref:{}'.format(common_data['episode']['video']['referenceId'])
-        video_id = str_or_none(try_get(common_data, lambda x: x['episode']['video']['id'])) or brightcove_id
-
-        title = try_get(common_data, lambda x: x['episode']['name'], str)
-        season_number = try_get(common_data, lambda x: x['season']['seasonNumber'], int)
-        episode_number = try_get(common_data, lambda x: x['episode']['episodeNumber'], int)
-        timestamp = unified_timestamp(try_get(common_data, lambda x: x['episode']['airDate'], str))
-        release_date = unified_strdate(try_get(common_data, lambda x: x['episode']['availability'], str))
-        thumbnails_data = try_get(common_data, lambda x: x['episode']['image']['sizes'], dict) or {}
-        thumbnails = [{
-            'id': thumbnail_id,
-            'url': thumbnail_url,
-            'width': int_or_none(thumbnail_id[1:]),
-        } for thumbnail_id, thumbnail_url in thumbnails_data.items()]
+        brightcove_id = traverse_obj(common_data, (
+            video_type, 'video', (
+                ('brightcoveId', {str}),
+                ('referenceId', {str}, {lambda x: f'ref:{x}' if x else None}),
+            ), any, {require('brightcove ID')}))

         return {
             '_type': 'url_transparent',
-            'url': smuggle_url(
-                self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
-                {'geo_countries': self._GEO_COUNTRIES}),
-            'id': video_id,
-            'title': title,
-            'description': try_get(common_data, lambda x: x['episode']['description'], str),
-            'duration': float_or_none(try_get(common_data, lambda x: x['episode']['video']['duration'], float), 1000),
-            'thumbnails': thumbnails,
-            'ie_key': 'BrightcoveNew',
-            'season_number': season_number,
-            'episode_number': episode_number,
-            'timestamp': timestamp,
-            'release_date': release_date,
+            'ie_key': BrightcoveNewIE.ie_key(),
+            'url': self.BRIGHTCOVE_URL_TEMPLATE.format(brightcove_id),
+            **traverse_obj(common_data, {
+                'id': (video_type, 'video', 'id', {int}, ({str_or_none}, {value(brightcove_id)}), any),
+                'title': (video_type, 'name', {str}),
+                'description': (video_type, 'description', {str}),
+                'duration': (video_type, 'video', 'duration', {float_or_none(scale=1000)}),
+                'tags': (video_type, 'tags', ..., 'name', {str}, all, filter),
+                'series': ('tvSeries', 'name', {str}),
+                'season_number': ('season', 'seasonNumber', {int_or_none}),
+                'episode_number': ('episode', 'episodeNumber', {int_or_none}),
+                'timestamp': ('episode', 'airDate', {parse_iso8601}),
+                'release_timestamp': (video_type, 'availability', {parse_iso8601}),
+                'thumbnails': (video_type, 'image', 'sizes', {dict.items}, lambda _, v: url_or_none(v[1]), {
+                    'id': 0,
+                    'url': 1,
+                    'width': (1, {parse_resolution}, 'width'),
+                }),
+            }),
         }
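The new _find_json/_real_extract pair digs the payload out of next.js v15 flight data: each script tag pushes a JSON chunk whose string items embed key:json fragments. A rough standalone sketch with a made-up chunk (_search_json is approximated here by a plain regex):

import json
import re

# Made-up flight chunk in the self.__next_f.push([...]) shape
webpage = (
    '<script>self.__next_f.push([1,"c:[\\"data\\",{\\"payload\\":'
    '{\\"episode\\":{\\"slug\\":\\"episode-3\\"}}}]\\n"])</script>'
)

for chunk in re.findall(
        r'<script[^>]*>\s*self\.__next_f\.push\(\s*(\[.+?\])\s*\);?\s*</script>', webpage):
    for item in json.loads(chunk):  # chunk is a JSON array like [1, 'c:[...]']
        if not isinstance(item, str):
            continue
        # crude stand-in for _search_json(r'\w+\s*:\s*', ..., contains_pattern=r'\[(?s:.+)\]')
        m = re.search(r'\w+\s*:\s*(\[(?s:.+)\])', item)
        if m:
            print(json.loads(m.group(1)))
            # -> ['data', {'payload': {'episode': {'slug': 'episode-3'}}}]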


@@ -67,7 +67,7 @@ class OpenRecBaseIE(InfoExtractor):

 class OpenRecIE(OpenRecBaseIE):
     IE_NAME = 'openrec'
-    _VALID_URL = r'https?://(?:www\.)?openrec\.tv/live/(?P<id>[^/]+)'
+    _VALID_URL = r'https?://(?:www\.)?openrec\.tv/live/(?P<id>[^/?#]+)'
     _TESTS = [{
         'url': 'https://www.openrec.tv/live/2p8v31qe4zy',
         'only_matching': True,
@@ -85,7 +85,7 @@ class OpenRecIE(OpenRecBaseIE):

 class OpenRecCaptureIE(OpenRecBaseIE):
     IE_NAME = 'openrec:capture'
-    _VALID_URL = r'https?://(?:www\.)?openrec\.tv/capture/(?P<id>[^/]+)'
+    _VALID_URL = r'https?://(?:www\.)?openrec\.tv/capture/(?P<id>[^/?#]+)'
     _TESTS = [{
         'url': 'https://www.openrec.tv/capture/l9nk2x4gn14',
         'only_matching': True,
@@ -129,7 +129,7 @@ class OpenRecCaptureIE(OpenRecBaseIE):

 class OpenRecMovieIE(OpenRecBaseIE):
     IE_NAME = 'openrec:movie'
-    _VALID_URL = r'https?://(?:www\.)?openrec\.tv/movie/(?P<id>[^/]+)'
+    _VALID_URL = r'https?://(?:www\.)?openrec\.tv/movie/(?P<id>[^/?#]+)'
     _TESTS = [{
         'url': 'https://www.openrec.tv/movie/nqz5xl5km8v',
         'info_dict': {
@@ -141,6 +141,9 @@ class OpenRecMovieIE(OpenRecBaseIE):
             'uploader_id': 'taiki_to_kazuhiro',
             'timestamp': 1638856800,
         },
+    }, {
+        'url': 'https://www.openrec.tv/movie/2p8vvex548y?playlist_id=98brq96vvsgn2nd',
+        'only_matching': True,
     }]

     def _real_extract(self, url):
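The [^/]+ to [^/?#]+ change keeps query strings and fragments out of the captured ID, which is what lets the new playlist_id test URL match cleanly. A quick comparison:

import re

url = 'https://www.openrec.tv/movie/2p8vvex548y?playlist_id=98brq96vvsgn2nd'

old = re.match(r'https?://(?:www\.)?openrec\.tv/movie/(?P<id>[^/]+)', url)
new = re.match(r'https?://(?:www\.)?openrec\.tv/movie/(?P<id>[^/?#]+)', url)
print(old.group('id'))  # 2p8vvex548y?playlist_id=98brq96vvsgn2nd  (query leaks in)
print(new.group('id'))  # 2p8vvex548y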


@@ -61,7 +61,7 @@ class PBSIE(InfoExtractor):
         (r'video\.wyomingpbs\.org', 'Wyoming PBS (KCWC)'),  # http://www.wyomingpbs.org
         (r'video\.cpt12\.org', 'Colorado Public Television / KBDI 12 (KBDI)'),  # http://www.cpt12.org/
         (r'video\.kbyueleven\.org', 'KBYU-TV (KBYU)'),  # http://www.kbyutv.org/
-        (r'video\.thirteen\.org', 'Thirteen/WNET New York (WNET)'),  # http://www.thirteen.org
+        (r'(?:video\.|www\.)thirteen\.org', 'Thirteen/WNET New York (WNET)'),  # http://www.thirteen.org
         (r'video\.wgbh\.org', 'WGBH/Channel 2 (WGBH)'),  # http://wgbh.org
         (r'video\.wgby\.org', 'WGBY (WGBY)'),  # http://www.wgby.org
         (r'watch\.njtvonline\.org', 'NJTV Public Media NJ (WNJT)'),  # http://www.njtvonline.org/
@@ -208,16 +208,40 @@ class PBSIE(InfoExtractor):
             'description': 'md5:31b664af3c65fd07fa460d306b837d00',
             'duration': 3190,
         },
+        'skip': 'dead URL',
+    },
+    {
+        'url': 'https://www.thirteen.org/programs/the-woodwrights-shop/carving-away-with-mary-may-tioglz/',
+        'info_dict': {
+            'id': '3004803331',
+            'ext': 'mp4',
+            'title': "The Woodwright's Shop - Carving Away with Mary May",
+            'description': 'md5:7cbaaaa8b9bcc78bd8f0e31911644e28',
+            'duration': 1606,
+            'display_id': 'carving-away-with-mary-may-tioglz',
+            'chapters': [],
+            'thumbnail': 'https://image.pbs.org/video-assets/NcnTxNl-asset-mezzanine-16x9-K0Keoyv.jpg',
+        },
     },
     {
         'url': 'http://www.pbs.org/wgbh/pages/frontline/losing-iraq/',
-        'md5': '6f722cb3c3982186d34b0f13374499c7',
+        'md5': '372b12b670070de39438b946474df92f',
         'info_dict': {
             'id': '2365297690',
             'ext': 'mp4',
             'title': 'FRONTLINE - Losing Iraq',
             'description': 'md5:5979a4d069b157f622d02bff62fbe654',
             'duration': 5050,
+            'chapters': [
+                {'start_time': 0.0, 'end_time': 1234.0, 'title': 'After Saddam, Chaos'},
+                {'start_time': 1233.0, 'end_time': 1719.0, 'title': 'The Insurgency Takes Root'},
+                {'start_time': 1718.0, 'end_time': 2461.0, 'title': 'A Light Footprint'},
+                {'start_time': 2460.0, 'end_time': 3589.0, 'title': 'The Surge '},
+                {'start_time': 3588.0, 'end_time': 4355.0, 'title': 'The Withdrawal '},
+                {'start_time': 4354.0, 'end_time': 5051.0, 'title': 'ISIS on the March '},
+            ],
+            'display_id': 'losing-iraq',
+            'thumbnail': 'https://image.pbs.org/video-assets/pbs/frontline/138098/images/mezzanine_401.jpg',
         },
     },
     {
@@ -477,6 +501,7 @@ class PBSIE(InfoExtractor):
             r"div\s*:\s*'videoembed'\s*,\s*mediaid\s*:\s*'(\d+)'",  # frontline video embed
             r'class="coveplayerid">([^<]+)<',  # coveplayer
             r'<section[^>]+data-coveid="(\d+)"',  # coveplayer from http://www.pbs.org/wgbh/frontline/film/real-csi/
+            r'\sclass="passportcoveplayer"[^>]*\sdata-media="(\d+)',  # https://www.thirteen.org/programs/the-woodwrights-shop/who-wrote-the-book-of-sloyd-fggvvq/
             r'<input type="hidden" id="pbs_video_id_[0-9]+" value="([0-9]+)"/>',  # jwplayer
             r"(?s)window\.PBS\.playerConfig\s*=\s*{.*?id\s*:\s*'([0-9]+)',",
             r'<div[^>]+\bdata-cove-id=["\'](\d+)"',  # http://www.pbs.org/wgbh/roadshow/watch/episode/2105-indianapolis-hour-2/
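The added player-ID pattern targets the passport player markup used on thirteen.org pages. Against a made-up tag in that shape:

import re

# Made-up markup in the form the new pattern targets
webpage = '<div class="passportcoveplayer" data-media="3004803331"></div>'

print(re.search(
    r'\sclass="passportcoveplayer"[^>]*\sdata-media="(\d+)', webpage).group(1))
# -> 3004803331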


@@ -23,9 +23,9 @@ class PinterestBaseIE(InfoExtractor):
     def _call_api(self, resource, video_id, options):
         return self._download_json(
             f'https://www.pinterest.com/resource/{resource}Resource/get/',
-            video_id, f'Download {resource} JSON metadata', query={
-                'data': json.dumps({'options': options}),
-            })['resource_response']
+            video_id, f'Download {resource} JSON metadata',
+            query={'data': json.dumps({'options': options})},
+            headers={'X-Pinterest-PWS-Handler': 'www/[username].js'})['resource_response']

     def _extract_video(self, data, extract_formats=True):
         video_id = data['id']


@@ -1,4 +1,7 @@
+import base64
+import hashlib
 import json
+import uuid

 from .common import InfoExtractor
 from ..utils import (
@@ -142,39 +145,73 @@ class PlaySuisseIE(InfoExtractor):
             id
             url
         }'''
-    _LOGIN_BASE_URL = 'https://login.srgssr.ch/srgssrlogin.onmicrosoft.com'
-    _LOGIN_PATH = 'B2C_1A__SignInV2'
+    _CLIENT_ID = '1e33f1bf-8bf3-45e4-bbd9-c9ad934b5fca'
+    _LOGIN_BASE = 'https://account.srgssr.ch'
     _ID_TOKEN = None

     def _perform_login(self, username, password):
-        login_page = self._download_webpage(
-            'https://www.playsuisse.ch/api/sso/login', None, note='Downloading login page',
-            query={'x': 'x', 'locale': 'de', 'redirectUrl': 'https://www.playsuisse.ch/'})
-        settings = self._search_json(r'var\s+SETTINGS\s*=', login_page, 'settings', None)
-        csrf_token = settings['csrf']
-        query = {'tx': settings['transId'], 'p': self._LOGIN_PATH}
+        code_verifier = uuid.uuid4().hex + uuid.uuid4().hex + uuid.uuid4().hex
+        code_challenge = base64.urlsafe_b64encode(
+            hashlib.sha256(code_verifier.encode()).digest()).decode().rstrip('=')

-        status = traverse_obj(self._download_json(
-            f'{self._LOGIN_BASE_URL}/{self._LOGIN_PATH}/SelfAsserted', None, 'Logging in',
-            query=query, headers={'X-CSRF-TOKEN': csrf_token}, data=urlencode_postdata({
-                'request_type': 'RESPONSE',
-                'signInName': username,
-                'password': password,
-            }), expected_status=400), ('status', {int_or_none}))
-        if status == 400:
-            raise ExtractorError('Invalid username or password', expected=True)
+        request_id = parse_qs(self._request_webpage(
+            f'{self._LOGIN_BASE}/authz-srv/authz', None, 'Requesting session ID', query={
+                'client_id': self._CLIENT_ID,
+                'redirect_uri': 'https://www.playsuisse.ch/auth',
+                'scope': 'email profile openid offline_access',
+                'response_type': 'code',
+                'code_challenge': code_challenge,
+                'code_challenge_method': 'S256',
+                'view_type': 'login',
+            }).url)['requestId'][0]

-        urlh = self._request_webpage(
-            f'{self._LOGIN_BASE_URL}/{self._LOGIN_PATH}/api/CombinedSigninAndSignup/confirmed',
-            None, 'Downloading ID token', query={
-                'rememberMe': 'false',
-                'csrf_token': csrf_token,
-                **query,
-                'diags': '',
-            })
+        try:
+            exchange_id = self._download_json(
+                f'{self._LOGIN_BASE}/verification-srv/v2/authenticate/initiate/password', None,
+                'Submitting username', headers={'content-type': 'application/json'}, data=json.dumps({
+                    'usage_type': 'INITIAL_AUTHENTICATION',
+                    'request_id': request_id,
+                    'medium_id': 'PASSWORD',
+                    'type': 'password',
+                    'identifier': username,
+                }).encode())['data']['exchange_id']['exchange_id']
+        except ExtractorError:
+            raise ExtractorError('Invalid username', expected=True)

+        try:
+            login_data = self._download_json(
+                f'{self._LOGIN_BASE}/verification-srv/v2/authenticate/authenticate/password', None,
+                'Submitting password', headers={'content-type': 'application/json'}, data=json.dumps({
+                    'requestId': request_id,
+                    'exchange_id': exchange_id,
+                    'type': 'password',
+                    'password': password,
+                }).encode())['data']
+        except ExtractorError:
+            raise ExtractorError('Invalid password', expected=True)

-        self._ID_TOKEN = traverse_obj(parse_qs(urlh.url), ('id_token', 0))
+        authorization_code = parse_qs(self._request_webpage(
+            f'{self._LOGIN_BASE}/login-srv/verification/login', None, 'Logging in',
+            data=urlencode_postdata({
+                'requestId': request_id,
+                'exchange_id': login_data['exchange_id']['exchange_id'],
+                'verificationType': 'password',
+                'sub': login_data['sub'],
+                'status_id': login_data['status_id'],
+                'rememberMe': True,
+                'lat': '',
+                'lon': '',
+            })).url)['code'][0]
+
+        self._ID_TOKEN = self._download_json(
+            f'{self._LOGIN_BASE}/proxy/token', None, 'Downloading token', data=b'', query={
+                'client_id': self._CLIENT_ID,
+                'redirect_uri': 'https://www.playsuisse.ch/auth',
+                'code': authorization_code,
+                'code_verifier': code_verifier,
+                'grant_type': 'authorization_code',
+            })['id_token']
+
         if not self._ID_TOKEN:
             raise ExtractorError('Login failed')
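The rewritten login is a standard OAuth 2.0 authorization-code flow with PKCE: the verifier never leaves the client, only its S256 challenge is sent with the initial /authz request, and the final token exchange must present the matching verifier. The challenge derivation in isolation, exactly as in the diff:

import base64
import hashlib
import uuid

# 96 hex chars of randomness as the PKCE code verifier
code_verifier = uuid.uuid4().hex + uuid.uuid4().hex + uuid.uuid4().hex

# S256 challenge: base64url-encoded SHA-256 digest with the padding stripped
code_challenge = base64.urlsafe_b64encode(
    hashlib.sha256(code_verifier.encode()).digest()).decode().rstrip('=')

print(len(code_verifier), code_challenge)  # 96, followed by a 43-char challenge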


@@ -22,7 +22,7 @@ from ..utils import (
 )


-class PolskieRadioBaseExtractor(InfoExtractor):
+class PolskieRadioBaseIE(InfoExtractor):
     def _extract_webpage_player_entries(self, webpage, playlist_id, base_data):
         media_urls = set()
@@ -47,7 +47,7 @@ class PolskieRadioBaseExtractor(InfoExtractor):
             yield entry


-class PolskieRadioLegacyIE(PolskieRadioBaseExtractor):
+class PolskieRadioLegacyIE(PolskieRadioBaseIE):
     # legacy sites
     IE_NAME = 'polskieradio:legacy'
     _VALID_URL = r'https?://(?:www\.)?polskieradio(?:24)?\.pl/\d+/\d+/[Aa]rtykul/(?P<id>\d+)'
@@ -127,7 +127,7 @@ class PolskieRadioLegacyIE(PolskieRadioBaseExtractor):
         return self.playlist_result(entries, playlist_id, title, description)


-class PolskieRadioIE(PolskieRadioBaseExtractor):
+class PolskieRadioIE(PolskieRadioBaseIE):
     # new next.js sites
     _VALID_URL = r'https?://(?:[^/]+\.)?(?:polskieradio(?:24)?|radiokierowcow)\.pl/artykul/(?P<id>\d+)'
     _TESTS = [{
@@ -519,7 +519,7 @@ class PolskieRadioPlayerIE(InfoExtractor):
     }


-class PolskieRadioPodcastBaseExtractor(InfoExtractor):
+class PolskieRadioPodcastBaseIE(InfoExtractor):
     _API_BASE = 'https://apipodcasts.polskieradio.pl/api'

     def _parse_episode(self, data):
@@ -539,7 +539,7 @@ class PolskieRadioPodcastBaseExtractor(InfoExtractor):
     }


-class PolskieRadioPodcastListIE(PolskieRadioPodcastBaseExtractor):
+class PolskieRadioPodcastListIE(PolskieRadioPodcastBaseIE):
     IE_NAME = 'polskieradio:podcast:list'
     _VALID_URL = r'https?://podcasty\.polskieradio\.pl/podcast/(?P<id>\d+)'
     _TESTS = [{
@@ -578,7 +578,7 @@ class PolskieRadioPodcastListIE(PolskieRadioPodcastBaseExtractor):
     }


-class PolskieRadioPodcastIE(PolskieRadioPodcastBaseExtractor):
+class PolskieRadioPodcastIE(PolskieRadioPodcastBaseIE):
     IE_NAME = 'polskieradio:podcast'
     _VALID_URL = r'https?://podcasty\.polskieradio\.pl/track/(?P<id>[a-f\d]{8}(?:-[a-f\d]{4}){4}[a-f\d]{8})'
     _TESTS = [{

Some files were not shown because too many files have changed in this diff.