1
0
mirror of https://github.com/yt-dlp/yt-dlp.git synced 2026-01-16 20:01:29 +00:00

Compare commits

...

146 Commits

Author SHA1 Message Date
github-actions[bot]
5ff7a43623 Release 2025.01.26
Created by: bashonly

:ci skip all
2025-01-26 03:54:22 +00:00
sepro
3b45319344 [cleanup] Misc (#12194)
Closes #12098, Closes #12133
Authored by: seproDev, bashonly, lonble, pjrobertson

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
Co-authored-by: Lonble <74650029+lonble@users.noreply.github.com>
Co-authored-by: Patrick Robertson <robertson.patrick@gmail.com>
2025-01-26 03:32:10 +00:00
nosoop
421bc72103 [ie/youtube] Extract media_type for livestreams (#11605)
Closes #11563
Authored by: nosoop
2025-01-26 03:27:12 +00:00
FestplattenSchnitzel
d4f5be1735 [ie/ViMP:Playlist] Add support for tags (#11688)
Authored by: FestplattenSchnitzel
2025-01-26 03:20:42 +00:00
bashonly
797d2472a2 [ie/TheaterComplexTownPPV] Support live URLs (#11720)
Closes #11718
Authored by: bashonly
2025-01-26 03:12:32 +00:00
knackku
3b99a0f0e0 [ie/xhamster] Various improvements (#11738)
Closes #7620
Authored by: knackku
2025-01-26 03:10:24 +00:00
middlingphys
c709cc41cb [ie/abematv] Support season extraction (#11771)
Closes #10602
Authored by: middlingphys
2025-01-26 03:05:40 +00:00
invertico
4850ce91d1 [ie/redgifs] Support /ifr/ URLs (#11805)
Authored by: invertico
2025-01-26 02:40:05 +00:00
msm595
e2e73b5c65 [ie/patreon] Extract attachment filename as alt_title (#12000)
Authored by: msm595
2025-01-26 02:36:16 +00:00
krandor
13825ab778 [ie/pbs] Fix extractor (#12024)
Closes #8703, Closes #9740, Closes #11514
Authored by: dirkf, krandor, n10dollar

Co-authored-by: dirkf <fieldhouse@gmx.net>
Co-authored-by: Neil <ntendolkar@berkeley.edu>
2025-01-26 02:25:35 +00:00
test20140
bc88b904cd [ie/niconico:series] Fix extractor (#11822)
Closes #7320, Closes #12001
Authored by: test20140
2025-01-26 01:47:15 +00:00
kibaa
76ac023ff0 [ie/youtube:tab] Improve shorts title extraction (#11991) (#11997)
Closes #11991
Authored by: d3d9, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-01-26 01:23:29 +00:00
kclauhk
b3007c44cd [ie/naver] Fix m3u8 formats extraction (#12037)
Closes #11953
Authored by: kclauhk
2025-01-26 01:16:26 +00:00
N/Ame
78912ed9c8 [ie/bilibili] Support space video list extraction without login (#12089)
Closes #12007
Authored by: grqz
2025-01-26 00:56:36 +00:00
InvalidUsernameException
bb69f5dab7 [ie/zdf] Fix extractors (#11041)
Closes #4782, Closes #10672
Authored by: InvalidUsernameException
2025-01-26 00:29:57 +00:00
gavin
6d304133ab [ie/soundcloud] Extract more metadata (#11945)
Authored by: 7x11x13
2025-01-25 22:52:48 +00:00
Jixun
9ff330948c [ie/vimeo] Fix thumbnail extraction (#12142)
Closes #11931
Authored by: jixunmoe
2025-01-25 21:42:34 +00:00
Simon Sawicki
fc12e724a3 [utils] sanitize_path: Fix some incorrect behavior (#11923)
Authored by: Grub4K
2025-01-25 22:32:00 +01:00
Konstantin Kulakov
61ae5dc34a [ie/1tv] Support sport1tv.ru domain (#11889)
Closes #11894
Authored by: kvk-2015
2025-01-25 22:21:45 +01:00
c-basalt
4651679104 [ie/bilibili] Support space /lists/ URLs (#11964)
Closes #11959
Authored by: c-basalt
2025-01-25 20:56:30 +00:00
sepro
ff44ed5306 [ie/crunchyroll] Remove extractors (#12195)
Closes #2561, Closes #5869, Closes #6278, Closes #7099, Closes #7414, Closes #7465, Closes #7976, Closes #8235, Closes #9867, Closes #10207
Authored by: seproDev
2025-01-25 20:57:08 +01:00
doe1080
cdcf1e8672 [ie/funimation] Remove extractors (#12167)
Closes #1569, Closes #2255, Closes #2517, Closes #2723, Closes #4318, Closes #4345, Closes #5326, Closes #6575, Closes #8644
Authored by: doe1080
2025-01-25 20:29:24 +01:00
Dioarya
f7d071e8aa [core] Fix float comparison values in format filters (#11880)
Closes #10115
Authored by: Dioarya, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-01-25 19:12:56 +00:00
Boof
45732e2590 [ie/nrk] Fix extraction (#12193)
Closes #12192
Authored by: hexahigh
2025-01-25 18:24:04 +00:00
gavin
7bfb4f72e4 [ie/soundcloud:user] Add /comments page support (#11999)
Authored by: 7x11x13
2025-01-25 18:48:06 +01:00
Subrat Lima
5d904b077d [ie/subsplash] Add extractors (#11054)
Closes #10922
Authored by: subrat-lima, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-01-25 18:14:45 +01:00
Roman
e7cc02b14d [ie/GoodGame] Fix extractor (#12173)
Authored by: NecroRomnt
2025-01-25 18:10:44 +01:00
bashonly
f0d4b8a5d6 [ie/youtube] Restore convenience workarounds (#12181)
Authored by: bashonly
2025-01-25 16:18:15 +00:00
coletdjnz
6b91d232e3 [ie/youtube] Use different PO token for GVS and Player (#12090)
Authored by: coletdjnz
2025-01-25 13:17:37 +13:00
Antoine Bollengier
de82acf876 [ie/youtube] Update ios player client (#12155)
Authored by: b5i
2025-01-23 22:52:32 +00:00
coletdjnz
326fb1ffaf [ie/youtube] Download tv client Innertube config (#12168)
Authored by: coletdjnz
2025-01-23 18:26:02 +13:00
August Wikerfors
ccda63934d [ie/Bluesky] Prefer source format (#12154)
Authored by: 0x9fff00
2025-01-21 22:59:39 +01:00
finch71
9676b05715 [ie/BiliBiliDynamic] Add extractor (#11838)
Closes #11726
Authored by: finch71, grqz

Co-authored-by: N/Ame <173015200+grqz@users.noreply.github.com>
2025-01-20 21:45:04 +01:00
sepro
f9f24ae376 [ie/XiaoHongShu] Extract more formats (#12147)
Authored by: seproDev
2025-01-20 19:55:30 +01:00
kclauhk
af2c821d74 [ie/piramidetv] Add extractors (#10777)
Closes #10706, Closes #10708
Authored by: kclauhk, HobbyistDev, seproDev

Co-authored-by: HobbyistDev <tesutonihon4@gmail.com>
Co-authored-by: sepro <sepro@sepr0.com>
2025-01-20 16:26:05 +01:00
Paul Wise
1ef3ee7500 [ie/nest] Add extractors (#11747)
Authored by: pabs3, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-01-20 16:13:24 +01:00
subsense
20c765d023 [ie/eggs] Add extractors (#11904)
Closes #11843
Authored by: subsense, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-01-20 16:08:11 +01:00
cotko
3fc4608656 [ie/rtvslo.si:show] Extract more metadata (#12136)
Authored by: cotko
2025-01-20 07:53:21 +01:00
Grabien
68221ecc87 [ie/senategov] Fix extractors (#9361)
Authored by: Grabien, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-01-20 00:01:22 +01:00
sepro
de30f652ff [ie/LBRY] Support signed URLs (#12138)
Authored by: seproDev
2025-01-19 17:52:31 +01:00
Boof
89198bb23b [ie/nrk] Extract more formats (#12069)
Closes #12053
Authored by: hexahigh
2025-01-19 14:13:40 +01:00
4ft35t
a567f97b62 [ie/Weibo] Extend _VALID_URL (#12088)
Closes #12086
Authored by: 4ft35t
2025-01-19 14:10:36 +01:00
bashonly
1643686104 [ie/dropout] Fix extraction (#12102)
Closes #12103
Authored by: bashonly
2025-01-16 02:40:13 +00:00
github-actions[bot]
bbc7591d3b Release 2025.01.15
Created by: bashonly

:ci skip all
2025-01-15 23:50:41 +00:00
bashonly
c8541f8b13 [ie/youtube] Do not use web_creator as a default client (#12087)
Closes #12085
Authored by: bashonly
2025-01-15 18:21:56 +00:00
github-actions[bot]
a3c0321825 Release 2025.01.12
Created by: bashonly

:ci skip all
2025-01-12 23:35:35 +00:00
Simon Sawicki
dade5e35c8 [cleanup] Misc (#11915)
Authored by: grqz, Grub4K, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
Co-authored-by: N/Ame <173015200+grqz@users.noreply.github.com>
2025-01-12 23:24:22 +00:00
Allen
e2ef4fece6 [ie/vine] Remove extractors (#11700)
Authored by: allendema
2025-01-12 19:43:16 +01:00
Mozi
1f489f4a45 [ie/DrTalks] Add extractor (#10831)
Closes #6390
Authored by: pzhlkj6612, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-01-12 19:42:02 +01:00
coletdjnz
75079f4e3f [ie/youtube] Refactor cookie auth (#11989)
Authored by: coletdjnz
2025-01-12 15:02:57 +13:00
coletdjnz
712d2abb32 [ie/youtube] Use tv instead of mweb client by default (#12059)
Authored by: coletdjnz
2025-01-12 15:01:13 +13:00
bashonly
8346b54915 Fix filename sanitization with --no-windows-filenames (#11988)
Fix bug in 6fc85f617a

Closes #11987
Authored by: bashonly
2025-01-11 19:05:23 +00:00
Paul Storkman
1f4e1e85a2 [core] Validate retries values are non-negative (#11927)
Closes #11926
Authored by: Strkmn
2025-01-11 19:51:16 +01:00
HobbyistDev
763ed06ee6 [ie/XiaoHongShu] Extend _VALID_URL (#11806)
Closes #11797
Authored by: HobbyistDev
2025-01-11 18:25:18 +01:00
voidptr_t
3c14e9191f [ie/PlVideo] Add extractor (#10657)
Closes #10311
Authored by: Sanceilaks, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-01-11 15:39:31 +01:00
coletdjnz
0b6b7742c2 [ie/youtube] Fix DASH formats incorrectly skipped in some situations (#11910)
Closes https://github.com/yt-dlp/yt-dlp/issues/11907
Authored by: coletdjnz
2024-12-26 14:19:17 +13:00
github-actions[bot]
3905f64920 Release 2024.12.23
Created by: bashonly

:ci skip all
2024-12-23 23:47:20 +00:00
bashonly
65cf46cddd [ie/youtube] Player client maintenance (#11893)
Closes #11867
Authored by: bashonly
2024-12-23 23:26:35 +00:00
coletdjnz
9f42e68a74 [ie/youtube] Skip iOS formats that require PO Token (#11890)
Partial fix for https://github.com/yt-dlp/yt-dlp/issues/11868

Authored by: coletdjnz
2024-12-24 12:03:28 +13:00
pukkandan
6fc85f617a Don't sanitize filename on Unix when --no-windows-filenames (#9591)
Closes #4547, Closes #8464
Authored by: pukkandan
2024-12-23 15:57:25 +05:30
bashonly
d298693b1b [ie/soundcloud] Various fixes (#11820)
- Fix original/download formats so that they are considered bestaudio
- Raise appropriate error if track is DRM-protected

Authored by: bashonly
2024-12-15 20:16:04 +00:00
bashonly
09a6c68712 [ie/youtube] Add age-gate workaround for some embeddable videos (#11821)
Closes #11296
Authored by: bashonly
2024-12-15 20:09:48 +00:00
bashonly
1a8851b689 [ie/youtube] Fix uploader_id extraction (#11818)
Closes #11816
Authored by: bashonly
2024-12-15 20:07:18 +00:00
bashonly
b91c3925c2 [update] Check 64-bitness when upgrading ARM builds (#11819)
Closes #11813
Authored by: bashonly
2024-12-15 19:55:30 +00:00
bashonly
3d3ee458c1 [update] Fix endless update loop for linux_exe builds (#11827)
Closes #11808
Authored by: bashonly
2024-12-15 19:47:50 +00:00
github-actions[bot]
2037a6414f Release 2024.12.13
Created by: bashonly

:ci skip all
2024-12-13 10:35:40 +00:00
sepro
5421669626 [cleanup] Make more playlist entries lazy (#11763)
Authored by: seproDev
2024-12-13 10:25:29 +00:00
bashonly
dc3c4fddcc [ie/youtube] Prioritize original language over auto-dubbed audio (#11803)
Closes #11753
Authored by: bashonly
2024-12-13 10:21:48 +00:00
bashonly
5460cd9189 [ie/youtube] Fix signature function extraction for 2f1832d2 (#11801)
Closes #11798
Authored by: bashonly
2024-12-13 09:43:08 +00:00
Crypto90
f6c73aad5f [ie/youtube:search_url] Fix playlist searches (#11782)
Closes #11666
Authored by: Crypto90
2024-12-12 13:54:11 +00:00
Pew
d5e2a379f2 [ie/youtube] Fix release_date extraction (#11759)
Authored by: MutantPiggieGolem1
2024-12-12 13:46:52 +00:00
bashonly
bc262bcad4 [ie/patreon:campaign] Support /c/ URLs (#11756)
Closes #11755
Authored by: bashonly
2024-12-12 13:44:19 +00:00
bashonly
f4d3e9e6dc [ie/soundcloud] Fix extraction (#11777)
Authored by: bashonly
2024-12-12 13:39:38 +00:00
github-actions[bot]
6fef824025 Release 2024.12.06
Created by: bashonly

:ci skip all
2024-12-06 16:07:07 +00:00
bashonly
4bd2655398 [ie/youtube] Raise if n function returns input value (#11752)
Improve a95ee6d880

Authored by: bashonly
2024-12-06 15:58:44 +00:00
bashonly
a95ee6d880 [ie/youtube] Fix n sig extraction for player 3bb1f723 (#11750)
Closes #11744
Authored by: bashonly
2024-12-06 15:35:18 +00:00
bashonly
4c85ccd136 [ie/youtube] Fix signature function extraction (#11751)
Closes #11748
Authored by: bashonly
2024-12-06 15:34:13 +00:00
bashonly
2feb28028e [ie/soundcloud] Fix formats extraction (#11742)
Authored by: bashonly
2024-12-06 15:02:30 +00:00
N/Ame
fca3eb5f8b [ie/bilibili] Fix HD formats extraction (#11734)
Fixes dc16876480

Closes #10554
Authored by: grqz
2024-12-04 23:11:55 +00:00
bashonly
2e49c789d3 [ie/youtube] Player client maintenance (#11724)
Closes #11686
Authored by: bashonly
2024-12-04 22:33:14 +00:00
wesson09
354cb4026c [cookies] Add --cookies-from-browser support for MS Store Firefox (#11731)
Authored by: wesson09
2024-12-04 18:41:58 +01:00
github-actions[bot]
cfa76f35d2 Release 2024.12.03
Created by: bashonly

:ci skip all
2024-12-03 20:30:33 +00:00
bashonly
2b67ac300a [cleanup] Misc (#11716)
Authored by: bashonly, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2024-12-03 20:22:21 +00:00
bashonly
c038a7b187 [ie/vk] Fix extractors (#11715)
Closes #5832, Closes #11471, Closes #11646, Closes #11670
Authored by: bashonly
2024-12-03 14:28:43 +00:00
Link
a13a336aa6 [ie/bilibili] Fix subtitles and chapters extraction (#11708)
Authored by: xiaomac
2024-12-03 04:08:46 +00:00
N/Ame
dc16876480 [ie/bilibili] Always try to extract HD formats (#10559)
Closes #10554
Authored by: grqz
2024-12-03 03:44:03 +00:00
N/Ame
f05a1cd149 [ie/bilibili] Fix supporter-only video extraction (#11711)
Fix bug in 239f5f36fe
Closes #11702
Authored by: grqz, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-12-03 01:19:22 +00:00
sepro
d8fb349086 [cleanup] Bump ruff to 0.8.x (#11608)
Authored by: seproDev
2024-12-02 16:29:30 +01:00
sepro
2bea793632 [ie/MicrosoftEmbed] Make format extraction non fatal (#11654)
Authored by: seproDev
2024-12-02 16:22:16 +01:00
Elan Ruusamäe
62cba8a1be [ie/duoplay] Fix extractor (#11588)
Authored by: glensc, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-12-01 22:33:11 +00:00
N/Ame
239f5f36fe [ie/bilibili] Fix extractor (#11667)
Closes #11665
Authored by: grqz
2024-12-01 21:55:18 +00:00
bashonly
0d146c1e36 [ie/youtube] Adjust player clients for site changes (#11663)
Closes #11640
Authored by: bashonly
2024-12-01 15:25:09 +00:00
DarkZeros
cd0f934604 [ie/mitele] Fix extractor (#11683)
Closes #11690
Authored by: DarkZeros
2024-12-01 14:21:57 +00:00
N/Ame
360aed810a [ie/instagram] Support share URLs (#11677)
Closes #11630
Authored by: grqz
2024-12-01 14:16:50 +00:00
bashonly
00dcde7286 [ie/dropbox] Fix password-protected video extraction (#11636)
Closes #11634
Authored by: bashonly
2024-11-27 01:47:28 +00:00
bashonly
910ecc4229 [ie/tiktok] Deprioritize animated thumbnails (#11645)
Closes #11641
Authored by: bashonly
2024-11-27 00:45:01 +00:00
bashonly
0a0d80800b [ie/dacast] Fix HLS AES formats extraction (#11644)
Closes #11643
Authored by: bashonly
2024-11-26 23:18:48 +00:00
Simon Sawicki
e0500cbf79 [ie] Handle fragmented formats in _remove_duplicate_formats (#11637)
Authored by: Grub4K
2024-11-27 00:05:07 +01:00
Jakob Kruse
4b5eec0aaa [ie/chaturbate] Fix support for non-public streams (#11624)
Fix bug in 720b3dc453

Closes #11623
Authored by: jkruse
2024-11-24 22:20:30 +00:00
sepro
fe70f20aed [ie/youtube:tab] Fix playlists tab extraction (#11615)
Closes #11524
Authored by: seproDev
2024-11-23 22:46:50 +01:00
coletdjnz
c7316373c0 [rh:websockets] Support websockets 14.0+ (#11616)
Authored by: coletdjnz
2024-11-24 10:30:00 +13:00
N/Ame
e0f1ae813b [ie/facebook] Support more groups URLs (#11576)
Authored by: grqz
2024-11-23 19:47:37 +00:00
sepro
7d6c259a03 Add playlist_webpage_url field (#11613)
Closes #10827
Authored by: seproDev
2024-11-23 20:42:35 +01:00
gitninja1234
16336c51d0 [ie/stripchat] Fix extractor (#11596)
Closes #11587
Authored by: gitninja1234
2024-11-23 19:40:45 +00:00
bashonly
ccf0a6b86b [cleanup] Misc (#11574)
Authored by: bashonly, pzhlkj6612

Co-authored-by: Mozi <29089388+pzhlkj6612@users.noreply.github.com>
2024-11-23 18:51:51 +00:00
github-actions[bot]
f919729538 Release 2024.11.18
Created by: bashonly

:ci skip all
2024-11-18 05:45:05 +00:00
bashonly
7ea2787920 [ie/reddit] Improve error handling (#11573)
Authored by: bashonly
2024-11-18 05:36:38 +00:00
bashonly
f7257588bd [ie/digitalconcerthall] Support login with access/refresh tokens (#11571)
Removes broken support for login with email and password
Removes obsolete `prefer_combined_hls` extractor-arg

Closes #11404, Closes #11436
Authored by: bashonly
2024-11-18 05:16:17 +00:00
bashonly
da252d9d32 [cleanup] Misc (#11554)
Closes #6884
Authored by: bashonly, Grub4K, seproDev

Co-authored-by: Simon Sawicki <contact@grub4k.xyz>
Co-authored-by: sepro <sepro@sepr0.com>
2024-11-17 23:25:05 +00:00
gillux
e079ffbda6 [ie/litv] Fix extractor (#11071)
Authored by: jiru
2024-11-17 21:37:15 +00:00
bashonly
2009cb27e1 [ie/SonyLIVSeries] Add sort_order extractor-arg (#11569)
Authored by: bashonly
2024-11-17 21:16:22 +00:00
Jackson Humphrey
f351440f1d [ie/ctvnews] Fix extractor (#11534)
Closes #8689
Authored by: jshumphrey, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-11-17 21:06:50 +00:00
qbnu
f9d98509a8 [ie/ctvnews] Fix playlist ID extraction (#8892)
Authored by: qbnu
2024-11-17 19:35:10 +00:00
sepro
37cd7660ea [ie/youtube:tab] Fix podcasts tab extraction (#11567)
Authored by: seproDev
2024-11-17 19:46:04 +01:00
ChocoLZS
d867f99622 [ie/PiaLive] Add extractor (#10811)
Authored by: ChocoLZS
2024-11-17 19:41:57 +01:00
doe1080
10fc719bc7 [cleanup] Remove dead extractors (#11566)
- Removes MildomClipIE, MildomIE, MildomUserVodIE, MildomVodIE
- Removes PokemonIE, PokemonWatchIE
- Removes VeohIE, VeohUserIE

Closes #3373, Closes #7059
Authored by: doe1080
2024-11-17 16:22:40 +00:00
krichbanana
eb15fd5a32 [ie/kenh14] Add extractor (#3996)
Closes #3937
Authored by: krichbanana, pzhlkj6612

Co-authored-by: Mozi <29089388+pzhlkj6612@users.noreply.github.com>
2024-11-17 14:12:26 +00:00
sepro
7cecd299e4 [ie/chaturbate] Don't break embed detection (#11565)
Bugfix for 720b3dc453

Authored by: seproDev
2024-11-17 13:32:12 +01:00
bashonly
52c0ffe40a [ie/youtube] Remove broken OAuth support (#11558)
Closes #11462
Authored by: bashonly
2024-11-16 23:40:21 +00:00
sepro
637d62a3a9 [ie/youtube] Player client maintenance (#11528)
Authored by: bashonly, seproDev

Co-authored-by: bashonly <bashonly@protonmail.com>
2024-11-17 00:31:04 +01:00
sepro
f95a92b3d0 [cleanup] Deprecate more compat functions (#11439)
Authored by: seproDev
2024-11-17 00:24:11 +01:00
Jackson Humphrey
1d253b0a27 [ie/patreon] Fix comments extraction (#11530)
Closes #11483
Authored by: jshumphrey, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-11-16 20:02:14 +00:00
powergold1
720b3dc453 [ie/chaturbate] Extract from API and support impersonation (#11555)
Closes #6546, Closes #10359
Authored by: powergold1
2024-11-16 19:55:40 +00:00
Jackson Humphrey
d215fba7ed [ie/RedGifsUser] Fix extraction (#11531)
Closes #7382, Closes #9131
Authored by: jshumphrey
2024-11-16 19:50:17 +00:00
Jackson Humphrey
8388ec256f [ie/spankbang] Support browser impersonation (#11542)
Closes #6545
Authored by: jshumphrey
2024-11-16 19:48:47 +00:00
sepro
6365e92589 [ie/bandlab] Add extractors (#11535)
Closes #7750
Authored by: seproDev
2024-11-16 17:56:43 +01:00
Alessandro Campolo
70c55cb08f [ie/RadioRadicale] Add extractor (#5607)
Authored by: a13ssandr0, pzhlkj6612

Co-authored-by: Mozi <29089388+pzhlkj6612@users.noreply.github.com>
2024-11-16 13:56:15 +01:00
bashonly
c699bafc50 [ie/soop] Fix thumbnail extraction (#11545)
Closes #11537

Authored by: bashonly
2024-11-15 22:51:55 +00:00
bashonly
eb64ae7d5d [ie] Allow ext override for thumbnails (#11545)
Authored by: bashonly
2024-11-15 22:51:55 +00:00
Simon Sawicki
c014fbcddc [utils] subs_list_to_dict: Add lang default parameter (#11508)
Authored by: Grub4K
2024-11-15 23:25:52 +01:00
Simon Sawicki
39d79c9b9c [utils] Fix join_nonempty, add **kwargs to unpack (#11559)
Authored by: Grub4K
2024-11-15 22:06:15 +01:00
Jackson Humphrey
f2a4983df7 [ie/archive.org] Fix comments extraction (#11527)
Closes #11526
Authored by: jshumphrey
2024-11-12 23:26:18 +00:00
bashonly
bacc31b05a [ie/facebook] Fix formats extraction (#11513)
Closes #11497
Authored by: bashonly
2024-11-12 23:23:10 +00:00
manav_chaudhary
a9f85670d0 [ie/Chaturbate] Support alternate domains (#10595)
Closes #10594
Authored by: manavchaudhary1
2024-11-11 23:41:56 +01:00
Sam
6b43a8d84b [ie/goplay] Fix extractor (#11466)
Closes #10857
Authored by: SamDecrock, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-11-11 22:03:31 +00:00
Hugo
2db8c2e7d5 [ie/CloudflareStream] Avoid extraction via videodelivery.net (#11478)
Closes #11477
Authored by: hugovdev
2024-11-11 22:00:05 +00:00
bashonly
f9c8deb4e5 [build] Bump PyInstaller version pin to >=6.11.1 (#11507)
Authored by: bashonly
2024-11-11 21:19:03 +00:00
Sakura286
0ec9bfed4d [ie/MixchMovie] Add extractor (#10897)
Closes #10765
Authored by: Sakura286
2024-11-11 21:40:29 +01:00
Subrat Lima
c673731061 [ie/spreaker] Support podcast and feed pages (#10968)
Closes #10925
Authored by: subrat-lima
2024-11-11 20:08:18 +01:00
sepro
e398217aae [ie/rutube] Rework extractors (#11480)
Closes #9694, Closes #10104, Closes #11117, Closes #11415, Closes #11476
Authored by: seproDev
2024-11-11 18:44:53 +01:00
Julio Napurí
c39016f66d [ie/spreaker] Support episode pages and access keys (#11489)
Authored by: julionc
2024-11-11 18:42:05 +01:00
sepro
b83ca24eb7 [core] Catch broken Cryptodome installations (#11486)
Authored by: seproDev
2024-11-10 00:53:49 +01:00
bashonly
240a7d43c8 [build] Pin websockets version to >=13.0,<14 (#11488)
websockets 14.0 causes CI test failures (a lot more of them)

Authored by: bashonly
2024-11-09 23:46:47 +00:00
bashonly
f13df591d4 [build] Enable attestations for trusted publishing (#11420)
Reverts 428ffb75aa

Authored by: bashonly
2024-11-09 23:26:02 +00:00
Steve Ovens
be3579aaf0 [ie/GameDevTV] Add extractor (#11368)
Authored by: stratus-ss, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-11-06 21:58:44 +00:00
bashonly
85fdc66b6e [ie/adobepass] Fix provider requests (#11472)
Fix bug in dcfeea4dd5

Closes #11469
Authored by: bashonly
2024-11-06 21:26:05 +00:00
139 changed files with 5109 additions and 3840 deletions

View File

@@ -411,7 +411,7 @@ jobs:
run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
python devscripts/install_deps.py -o --include build python devscripts/install_deps.py -o --include build
python devscripts/install_deps.py --include curl-cffi python devscripts/install_deps.py --include curl-cffi
python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-6.10.0-py3-none-any.whl" python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-6.11.1-py3-none-any.whl"
- name: Prepare - name: Prepare
run: | run: |
@@ -460,7 +460,7 @@ jobs:
run: | run: |
python devscripts/install_deps.py -o --include build python devscripts/install_deps.py -o --include build
python devscripts/install_deps.py python devscripts/install_deps.py
python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-6.10.0-py3-none-any.whl" python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-6.11.1-py3-none-any.whl"
- name: Prepare - name: Prepare
run: | run: |
@@ -504,7 +504,8 @@ jobs:
- windows32 - windows32
runs-on: ubuntu-latest runs-on: ubuntu-latest
steps: steps:
- uses: actions/download-artifact@v4 - name: Download artifacts
uses: actions/download-artifact@v4
with: with:
path: artifact path: artifact
pattern: build-bin-* pattern: build-bin-*

View File

@@ -28,3 +28,20 @@ jobs:
actions: write # For cleaning up cache actions: write # For cleaning up cache
id-token: write # mandatory for trusted publishing id-token: write # mandatory for trusted publishing
secrets: inherit secrets: inherit
publish_pypi:
needs: [release]
if: vars.MASTER_PYPI_PROJECT != ''
runs-on: ubuntu-latest
permissions:
id-token: write # mandatory for trusted publishing
steps:
- name: Download artifacts
uses: actions/download-artifact@v4
with:
path: dist
name: build-pypi
- name: Publish to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
verbose: true

View File

@@ -41,3 +41,20 @@ jobs:
actions: write # For cleaning up cache actions: write # For cleaning up cache
id-token: write # mandatory for trusted publishing id-token: write # mandatory for trusted publishing
secrets: inherit secrets: inherit
publish_pypi:
needs: [release]
if: vars.NIGHTLY_PYPI_PROJECT != ''
runs-on: ubuntu-latest
permissions:
id-token: write # mandatory for trusted publishing
steps:
- name: Download artifacts
uses: actions/download-artifact@v4
with:
path: dist
name: build-pypi
- name: Publish to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
verbose: true

View File

@@ -2,10 +2,6 @@ name: Release
on: on:
workflow_call: workflow_call:
inputs: inputs:
prerelease:
required: false
default: true
type: boolean
source: source:
required: false required: false
default: '' default: ''
@@ -18,6 +14,10 @@ on:
required: false required: false
default: '' default: ''
type: string type: string
prerelease:
required: false
default: true
type: boolean
workflow_dispatch: workflow_dispatch:
inputs: inputs:
source: source:
@@ -278,11 +278,20 @@ jobs:
make clean-cache make clean-cache
python -m build --no-isolation . python -m build --no-isolation .
- name: Upload artifacts
if: github.event_name != 'workflow_dispatch'
uses: actions/upload-artifact@v4
with:
name: build-pypi
path: |
dist/*
compression-level: 0
- name: Publish to PyPI - name: Publish to PyPI
if: github.event_name == 'workflow_dispatch'
uses: pypa/gh-action-pypi-publish@release/v1 uses: pypa/gh-action-pypi-publish@release/v1
with: with:
verbose: true verbose: true
attestations: false # Currently doesn't work w/ reusable workflows (breaks nightly)
publish: publish:
needs: [prepare, build] needs: [prepare, build]

1
.gitignore vendored
View File

@@ -92,6 +92,7 @@ updates_key.pem
*.class *.class
*.isorted *.isorted
*.stackdump *.stackdump
uv.lock
# Generated # Generated
AUTHORS AUTHORS

View File

@@ -695,3 +695,44 @@ KBelmin
kesor kesor
MellowKyler MellowKyler
Wesley107772 Wesley107772
a13ssandr0
ChocoLZS
doe1080
hugovdev
jshumphrey
julionc
manavchaudhary1
powergold1
Sakura286
SamDecrock
stratus-ss
subrat-lima
gitninja1234
jkruse
xiaomac
wesson09
Crypto90
MutantPiggieGolem1
Sanceilaks
Strkmn
0x9fff00
4ft35t
7x11x13
b5i
cotko
d3d9
Dioarya
finch71
hexahigh
InvalidUsernameException
jixunmoe
knackku
krandor
kvk-2015
lonble
msm595
n10dollar
NecroRomnt
pjrobertson
subsense
test20140

View File

@@ -4,6 +4,221 @@
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master # To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
--> -->
### 2025.01.26
#### Core changes
- [Fix float comparison values in format filters](https://github.com/yt-dlp/yt-dlp/commit/f7d071e8aa3bf67ed7e0f881e749ca9ab50b3f8f) ([#11880](https://github.com/yt-dlp/yt-dlp/issues/11880)) by [bashonly](https://github.com/bashonly), [Dioarya](https://github.com/Dioarya)
- **utils**: `sanitize_path`: [Fix some incorrect behavior](https://github.com/yt-dlp/yt-dlp/commit/fc12e724a3b4988cfc467d2981887dde48c26b69) ([#11923](https://github.com/yt-dlp/yt-dlp/issues/11923)) by [Grub4K](https://github.com/Grub4K)
#### Extractor changes
- **1tv**: [Support sport1tv.ru domain](https://github.com/yt-dlp/yt-dlp/commit/61ae5dc34ac775d6c122575e21ef2153b1273a2b) ([#11889](https://github.com/yt-dlp/yt-dlp/issues/11889)) by [kvk-2015](https://github.com/kvk-2015)
- **abematv**: [Support season extraction](https://github.com/yt-dlp/yt-dlp/commit/c709cc41cbc16edc846e0a431cfa8508396d4cb6) ([#11771](https://github.com/yt-dlp/yt-dlp/issues/11771)) by [middlingphys](https://github.com/middlingphys)
- **bilibili**
- [Support space `/lists/` URLs](https://github.com/yt-dlp/yt-dlp/commit/465167910407449354eb48e9861efd0819f53eb5) ([#11964](https://github.com/yt-dlp/yt-dlp/issues/11964)) by [c-basalt](https://github.com/c-basalt)
- [Support space video list extraction without login](https://github.com/yt-dlp/yt-dlp/commit/78912ed9c81f109169b828c397294a6cf8eacf41) ([#12089](https://github.com/yt-dlp/yt-dlp/issues/12089)) by [grqz](https://github.com/grqz)
- **bilibilidynamic**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/9676b05715b61c8c5dd5598871e60d8807fb1a86) ([#11838](https://github.com/yt-dlp/yt-dlp/issues/11838)) by [finch71](https://github.com/finch71), [grqz](https://github.com/grqz)
- **bluesky**: [Prefer source format](https://github.com/yt-dlp/yt-dlp/commit/ccda63934df7de2823f0834218c4254c7c4d2e4c) ([#12154](https://github.com/yt-dlp/yt-dlp/issues/12154)) by [0x9fff00](https://github.com/0x9fff00)
- **crunchyroll**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/ff44ed53061e065804da6275d182d7928cc03a5e) ([#12195](https://github.com/yt-dlp/yt-dlp/issues/12195)) by [seproDev](https://github.com/seproDev)
- **dropout**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/164368610456e2d96b279f8b120dea08f7b1d74f) ([#12102](https://github.com/yt-dlp/yt-dlp/issues/12102)) by [bashonly](https://github.com/bashonly)
- **eggs**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/20c765d02385a105c8ef13b6f7a737491d29c19a) ([#11904](https://github.com/yt-dlp/yt-dlp/issues/11904)) by [seproDev](https://github.com/seproDev), [subsense](https://github.com/subsense)
- **funimation**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/cdcf1e86726b8fa44f7e7126bbf1c18e1798d25c) ([#12167](https://github.com/yt-dlp/yt-dlp/issues/12167)) by [doe1080](https://github.com/doe1080)
- **goodgame**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/e7cc02b14d8d323f805d14325a9c95593a170d28) ([#12173](https://github.com/yt-dlp/yt-dlp/issues/12173)) by [NecroRomnt](https://github.com/NecroRomnt)
- **lbry**: [Support signed URLs](https://github.com/yt-dlp/yt-dlp/commit/de30f652ffb7623500215f5906844f2ae0d92c7b) ([#12138](https://github.com/yt-dlp/yt-dlp/issues/12138)) by [seproDev](https://github.com/seproDev)
- **naver**: [Fix m3u8 formats extraction](https://github.com/yt-dlp/yt-dlp/commit/b3007c44cdac38187fc6600de76959a7079a44d1) ([#12037](https://github.com/yt-dlp/yt-dlp/issues/12037)) by [kclauhk](https://github.com/kclauhk)
- **nest**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/1ef3ee7500c4ab8c26f7fdc5b0ad1da4d16eec8e) ([#11747](https://github.com/yt-dlp/yt-dlp/issues/11747)) by [pabs3](https://github.com/pabs3), [seproDev](https://github.com/seproDev)
- **niconico**: series: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/bc88b904cd02314da41ce1b2fdf046d0680fe965) ([#11822](https://github.com/yt-dlp/yt-dlp/issues/11822)) by [test20140](https://github.com/test20140)
- **nrk**
- [Extract more formats](https://github.com/yt-dlp/yt-dlp/commit/89198bb23b4d03e0473ac408bfb50d67c2f71165) ([#12069](https://github.com/yt-dlp/yt-dlp/issues/12069)) by [hexahigh](https://github.com/hexahigh)
- [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/45732e2590a1bd0bc9608f5eb68c59341ca84f02) ([#12193](https://github.com/yt-dlp/yt-dlp/issues/12193)) by [hexahigh](https://github.com/hexahigh)
- **patreon**: [Extract attachment filename as `alt_title`](https://github.com/yt-dlp/yt-dlp/commit/e2e73b5c65593ec0a5e685663e6ec0f4aaffc1f1) ([#12000](https://github.com/yt-dlp/yt-dlp/issues/12000)) by [msm595](https://github.com/msm595)
- **pbs**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/13825ab77815ee6e1603abbecbb9f3795057b93c) ([#12024](https://github.com/yt-dlp/yt-dlp/issues/12024)) by [dirkf](https://github.com/dirkf), [krandor](https://github.com/krandor), [n10dollar](https://github.com/n10dollar)
- **piramidetv**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/af2c821d74049b519895288aca23cee81fc4b049) ([#10777](https://github.com/yt-dlp/yt-dlp/issues/10777)) by [HobbyistDev](https://github.com/HobbyistDev), [kclauhk](https://github.com/kclauhk), [seproDev](https://github.com/seproDev)
- **redgifs**: [Support `/ifr/` URLs](https://github.com/yt-dlp/yt-dlp/commit/4850ce91d163579fa615c3c0d44c9bd64682c22b) ([#11805](https://github.com/yt-dlp/yt-dlp/issues/11805)) by [invertico](https://github.com/invertico)
- **rtvslo.si**: show: [Extract more metadata](https://github.com/yt-dlp/yt-dlp/commit/3fc46086562857d5493cbcff687f76e4e4ed303f) ([#12136](https://github.com/yt-dlp/yt-dlp/issues/12136)) by [cotko](https://github.com/cotko)
- **senategov**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/68221ecc87c6a3f3515757bac2a0f9674a38e3f2) ([#9361](https://github.com/yt-dlp/yt-dlp/issues/9361)) by [Grabien](https://github.com/Grabien), [seproDev](https://github.com/seproDev)
- **soundcloud**
- [Extract more metadata](https://github.com/yt-dlp/yt-dlp/commit/6d304133ab32bcd1eb78ff1467f1a41dd9b66c33) ([#11945](https://github.com/yt-dlp/yt-dlp/issues/11945)) by [7x11x13](https://github.com/7x11x13)
- user: [Add `/comments` page support](https://github.com/yt-dlp/yt-dlp/commit/7bfb4f72e490310d2681c7f4815218a2ebbc73ee) ([#11999](https://github.com/yt-dlp/yt-dlp/issues/11999)) by [7x11x13](https://github.com/7x11x13)
- **subsplash**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/5d904b077d2f58ae44bdf208d2dcfcc3ff8347f5) ([#11054](https://github.com/yt-dlp/yt-dlp/issues/11054)) by [seproDev](https://github.com/seproDev), [subrat-lima](https://github.com/subrat-lima)
- **theatercomplextownppv**: [Support `live` URLs](https://github.com/yt-dlp/yt-dlp/commit/797d2472a299692e01ad1500e8c3b7bc1daa7fe4) ([#11720](https://github.com/yt-dlp/yt-dlp/issues/11720)) by [bashonly](https://github.com/bashonly)
- **vimeo**: [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/9ff330948c92f6b2e1d9c928787362ab19cd6c62) ([#12142](https://github.com/yt-dlp/yt-dlp/issues/12142)) by [jixunmoe](https://github.com/jixunmoe)
- **vimp**: Playlist: [Add support for tags](https://github.com/yt-dlp/yt-dlp/commit/d4f5be1735c8feaeb3308666e0b878e9782f529d) ([#11688](https://github.com/yt-dlp/yt-dlp/issues/11688)) by [FestplattenSchnitzel](https://github.com/FestplattenSchnitzel)
- **weibo**: [Extend `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/a567f97b62ae9f6d6f5a9376c361512ab8dceda2) ([#12088](https://github.com/yt-dlp/yt-dlp/issues/12088)) by [4ft35t](https://github.com/4ft35t)
- **xhamster**: [Various improvements](https://github.com/yt-dlp/yt-dlp/commit/3b99a0f0e07f0120ab416f34a8f5ab75d4fdf1d1) ([#11738](https://github.com/yt-dlp/yt-dlp/issues/11738)) by [knackku](https://github.com/knackku)
- **xiaohongshu**: [Extract more formats](https://github.com/yt-dlp/yt-dlp/commit/f9f24ae376a9eaca777816479a4a29f6f0ce7681) ([#12147](https://github.com/yt-dlp/yt-dlp/issues/12147)) by [seproDev](https://github.com/seproDev)
- **youtube**
- [Download `tv` client Innertube config](https://github.com/yt-dlp/yt-dlp/commit/326fb1ffaf4e8349f1fe8ba2a81839652e044bff) ([#12168](https://github.com/yt-dlp/yt-dlp/issues/12168)) by [coletdjnz](https://github.com/coletdjnz)
- [Extract `media_type` for livestreams](https://github.com/yt-dlp/yt-dlp/commit/421bc72103d1faed473a451299cd17d6abb433bb) ([#11605](https://github.com/yt-dlp/yt-dlp/issues/11605)) by [nosoop](https://github.com/nosoop)
- [Restore convenience workarounds](https://github.com/yt-dlp/yt-dlp/commit/f0d4b8a5d6354b294bc9631cf15a7160b7bad5de) ([#12181](https://github.com/yt-dlp/yt-dlp/issues/12181)) by [bashonly](https://github.com/bashonly)
- [Update `ios` player client](https://github.com/yt-dlp/yt-dlp/commit/de82acf8769282ce321a86737ecc1d4bef0e82a7) ([#12155](https://github.com/yt-dlp/yt-dlp/issues/12155)) by [b5i](https://github.com/b5i)
- [Use different PO token for GVS and Player](https://github.com/yt-dlp/yt-dlp/commit/6b91d232e316efa406035915532eb126fbaeea38) ([#12090](https://github.com/yt-dlp/yt-dlp/issues/12090)) by [coletdjnz](https://github.com/coletdjnz)
- tab: [Improve shorts title extraction](https://github.com/yt-dlp/yt-dlp/commit/76ac023ff02f06e8c003d104f02a03deeddebdcd) ([#11997](https://github.com/yt-dlp/yt-dlp/issues/11997)) by [bashonly](https://github.com/bashonly), [d3d9](https://github.com/d3d9)
- **zdf**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/bb69f5dab79fb32c4ec0d50e05f7fa26d05d54ba) ([#11041](https://github.com/yt-dlp/yt-dlp/issues/11041)) by [InvalidUsernameException](https://github.com/InvalidUsernameException)
#### Misc. changes
- **cleanup**: Miscellaneous: [3b45319](https://github.com/yt-dlp/yt-dlp/commit/3b4531934465580be22937fecbb6e1a3a9e2334f) by [bashonly](https://github.com/bashonly), [lonble](https://github.com/lonble), [pjrobertson](https://github.com/pjrobertson), [seproDev](https://github.com/seproDev)
### 2025.01.15
#### Extractor changes
- **youtube**: [Do not use `web_creator` as a default client](https://github.com/yt-dlp/yt-dlp/commit/c8541f8b13e743fcfa06667530d13fee8686e22a) ([#12087](https://github.com/yt-dlp/yt-dlp/issues/12087)) by [bashonly](https://github.com/bashonly)
### 2025.01.12
#### Core changes
- [Fix filename sanitization with `--no-windows-filenames`](https://github.com/yt-dlp/yt-dlp/commit/8346b549150003df988538e54c9d8bc4de568979) ([#11988](https://github.com/yt-dlp/yt-dlp/issues/11988)) by [bashonly](https://github.com/bashonly)
- [Validate retries values are non-negative](https://github.com/yt-dlp/yt-dlp/commit/1f4e1e85a27c5b43e34d7706cfd88ffce1b56a4a) ([#11927](https://github.com/yt-dlp/yt-dlp/issues/11927)) by [Strkmn](https://github.com/Strkmn)
#### Extractor changes
- **drtalks**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/1f489f4a45691cac3f9e787d22a3a8a086229ba6) ([#10831](https://github.com/yt-dlp/yt-dlp/issues/10831)) by [pzhlkj6612](https://github.com/pzhlkj6612), [seproDev](https://github.com/seproDev)
- **plvideo**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/3c14e9191f3035b9a729d1d87bc0381f42de57cf) ([#10657](https://github.com/yt-dlp/yt-dlp/issues/10657)) by [Sanceilaks](https://github.com/Sanceilaks), [seproDev](https://github.com/seproDev)
- **vine**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/e2ef4fece6c9742d1733e3bae408c4787765f78c) ([#11700](https://github.com/yt-dlp/yt-dlp/issues/11700)) by [allendema](https://github.com/allendema)
- **xiaohongshu**: [Extend `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/763ed06ee69f13949397897bd42ff2ec3dc3d384) ([#11806](https://github.com/yt-dlp/yt-dlp/issues/11806)) by [HobbyistDev](https://github.com/HobbyistDev)
- **youtube**
- [Fix DASH formats incorrectly skipped in some situations](https://github.com/yt-dlp/yt-dlp/commit/0b6b7742c2e7f2a1fcb0b54ef3dd484bab404b3f) ([#11910](https://github.com/yt-dlp/yt-dlp/issues/11910)) by [coletdjnz](https://github.com/coletdjnz)
- [Refactor cookie auth](https://github.com/yt-dlp/yt-dlp/commit/75079f4e3f7dce49b61ef01da7adcd9876a0ca3b) ([#11989](https://github.com/yt-dlp/yt-dlp/issues/11989)) by [coletdjnz](https://github.com/coletdjnz)
- [Use `tv` instead of `mweb` client by default](https://github.com/yt-dlp/yt-dlp/commit/712d2abb32f59b2d246be2901255f84f1a4c30b3) ([#12059](https://github.com/yt-dlp/yt-dlp/issues/12059)) by [coletdjnz](https://github.com/coletdjnz)
#### Misc. changes
- **cleanup**: Miscellaneous: [dade5e3](https://github.com/yt-dlp/yt-dlp/commit/dade5e35c89adaad04408bfef766820dbca06ebe) by [grqz](https://github.com/grqz), [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
### 2024.12.23
#### Core changes
- [Don't sanitize filename on Unix when `--no-windows-filenames`](https://github.com/yt-dlp/yt-dlp/commit/6fc85f617a5850307fd5b258477070e6ee177796) ([#9591](https://github.com/yt-dlp/yt-dlp/issues/9591)) by [pukkandan](https://github.com/pukkandan)
- **update**
- [Check 64-bitness when upgrading ARM builds](https://github.com/yt-dlp/yt-dlp/commit/b91c3925c2059970daa801cb131c0c2f4f302e72) ([#11819](https://github.com/yt-dlp/yt-dlp/issues/11819)) by [bashonly](https://github.com/bashonly)
- [Fix endless update loop for `linux_exe` builds](https://github.com/yt-dlp/yt-dlp/commit/3d3ee458c1fe49dd5ebd7651a092119d23eb7000) ([#11827](https://github.com/yt-dlp/yt-dlp/issues/11827)) by [bashonly](https://github.com/bashonly)
#### Extractor changes
- **soundcloud**: [Various fixes](https://github.com/yt-dlp/yt-dlp/commit/d298693b1b266d198e8eeecb90ea17c4a031268f) ([#11820](https://github.com/yt-dlp/yt-dlp/issues/11820)) by [bashonly](https://github.com/bashonly)
- **youtube**
- [Add age-gate workaround for some embeddable videos](https://github.com/yt-dlp/yt-dlp/commit/09a6c687126f04e243fcb105a828787efddd1030) ([#11821](https://github.com/yt-dlp/yt-dlp/issues/11821)) by [bashonly](https://github.com/bashonly)
- [Fix `uploader_id` extraction](https://github.com/yt-dlp/yt-dlp/commit/1a8851b689763e5173b96f70f8a71df0e4a44b66) ([#11818](https://github.com/yt-dlp/yt-dlp/issues/11818)) by [bashonly](https://github.com/bashonly)
- [Player client maintenance](https://github.com/yt-dlp/yt-dlp/commit/65cf46cddd873fd229dbb0fc0689bca4c201c6b6) ([#11893](https://github.com/yt-dlp/yt-dlp/issues/11893)) by [bashonly](https://github.com/bashonly)
- [Skip iOS formats that require PO Token](https://github.com/yt-dlp/yt-dlp/commit/9f42e68a74f3f00b0253fe70763abd57cac4237b) ([#11890](https://github.com/yt-dlp/yt-dlp/issues/11890)) by [coletdjnz](https://github.com/coletdjnz)
### 2024.12.13
#### Extractor changes
- **patreon**: campaign: [Support /c/ URLs](https://github.com/yt-dlp/yt-dlp/commit/bc262bcad4d3683ceadf61a7eb87e233e72adef3) ([#11756](https://github.com/yt-dlp/yt-dlp/issues/11756)) by [bashonly](https://github.com/bashonly)
- **soundcloud**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/f4d3e9e6dc25077b79849a31a2f67f93fdc01e62) ([#11777](https://github.com/yt-dlp/yt-dlp/issues/11777)) by [bashonly](https://github.com/bashonly)
- **youtube**
- [Fix `release_date` extraction](https://github.com/yt-dlp/yt-dlp/commit/d5e2a379f2adcb28bc48c7d9e90716d7278f89d2) ([#11759](https://github.com/yt-dlp/yt-dlp/issues/11759)) by [MutantPiggieGolem1](https://github.com/MutantPiggieGolem1)
- [Fix signature function extraction for `2f1832d2`](https://github.com/yt-dlp/yt-dlp/commit/5460cd91891bf613a2065e2fc278d9903c37a127) ([#11801](https://github.com/yt-dlp/yt-dlp/issues/11801)) by [bashonly](https://github.com/bashonly)
- [Prioritize original language over auto-dubbed audio](https://github.com/yt-dlp/yt-dlp/commit/dc3c4fddcc653989dae71fc563d82a308fc898cc) ([#11803](https://github.com/yt-dlp/yt-dlp/issues/11803)) by [bashonly](https://github.com/bashonly)
- search_url: [Fix playlist searches](https://github.com/yt-dlp/yt-dlp/commit/f6c73aad5f1a67544bea137ebd9d1e22e0e56567) ([#11782](https://github.com/yt-dlp/yt-dlp/issues/11782)) by [Crypto90](https://github.com/Crypto90)
#### Misc. changes
- **cleanup**: [Make more playlist entries lazy](https://github.com/yt-dlp/yt-dlp/commit/54216696261bc07cacd9a837c501d9e0b7fed09e) ([#11763](https://github.com/yt-dlp/yt-dlp/issues/11763)) by [seproDev](https://github.com/seproDev)
### 2024.12.06
#### Core changes
- **cookies**: [Add `--cookies-from-browser` support for MS Store Firefox](https://github.com/yt-dlp/yt-dlp/commit/354cb4026cf2191e1a130ec2a627b95cabfbc60a) ([#11731](https://github.com/yt-dlp/yt-dlp/issues/11731)) by [wesson09](https://github.com/wesson09)
#### Extractor changes
- **bilibili**: [Fix HD formats extraction](https://github.com/yt-dlp/yt-dlp/commit/fca3eb5f8be08d5fab2e18b45b7281a12e566725) ([#11734](https://github.com/yt-dlp/yt-dlp/issues/11734)) by [grqz](https://github.com/grqz)
- **soundcloud**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/2feb28028ee48f2185d2d95076e62accb09b9e2e) ([#11742](https://github.com/yt-dlp/yt-dlp/issues/11742)) by [bashonly](https://github.com/bashonly)
- **youtube**
- [Fix `n` sig extraction for player `3bb1f723`](https://github.com/yt-dlp/yt-dlp/commit/a95ee6d8803fca9157adecf63732ab58bf87fd88) ([#11750](https://github.com/yt-dlp/yt-dlp/issues/11750)) by [bashonly](https://github.com/bashonly) (With fixes in [4bd2655](https://github.com/yt-dlp/yt-dlp/commit/4bd2655398aed450456197a6767639114a24eac2))
- [Fix signature function extraction](https://github.com/yt-dlp/yt-dlp/commit/4c85ccd1366c88cf93982f8350f58eed17355981) ([#11751](https://github.com/yt-dlp/yt-dlp/issues/11751)) by [bashonly](https://github.com/bashonly)
- [Player client maintenance](https://github.com/yt-dlp/yt-dlp/commit/2e49c789d3eebc39af8910705d65a98bca0e4c4f) ([#11724](https://github.com/yt-dlp/yt-dlp/issues/11724)) by [bashonly](https://github.com/bashonly)
### 2024.12.03
#### Core changes
- [Add `playlist_webpage_url` field](https://github.com/yt-dlp/yt-dlp/commit/7d6c259a03bc4707a319e5e8c6eff0278707874b) ([#11613](https://github.com/yt-dlp/yt-dlp/issues/11613)) by [seproDev](https://github.com/seproDev)
#### Extractor changes
- [Handle fragmented formats in `_remove_duplicate_formats`](https://github.com/yt-dlp/yt-dlp/commit/e0500cbf796323551bbabe5b8ed8c75a511ba47a) ([#11637](https://github.com/yt-dlp/yt-dlp/issues/11637)) by [Grub4K](https://github.com/Grub4K)
- **bilibili**
- [Always try to extract HD formats](https://github.com/yt-dlp/yt-dlp/commit/dc1687648077c5bf64863b307ecc5ab7e029bd8d) ([#10559](https://github.com/yt-dlp/yt-dlp/issues/10559)) by [grqz](https://github.com/grqz)
- [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/239f5f36fe04603bec59c8b975f6a792f10246db) ([#11667](https://github.com/yt-dlp/yt-dlp/issues/11667)) by [grqz](https://github.com/grqz) (With fixes in [f05a1cd](https://github.com/yt-dlp/yt-dlp/commit/f05a1cd1492fc98dc8d80d2081d632a1879913d2) by [bashonly](https://github.com/bashonly), [grqz](https://github.com/grqz))
- [Fix subtitles and chapters extraction](https://github.com/yt-dlp/yt-dlp/commit/a13a336aa6f906812701abec8101b73b73db8ff7) ([#11708](https://github.com/yt-dlp/yt-dlp/issues/11708)) by [xiaomac](https://github.com/xiaomac)
- **chaturbate**: [Fix support for non-public streams](https://github.com/yt-dlp/yt-dlp/commit/4b5eec0aaa7c02627f27a386591b735b90e681a8) ([#11624](https://github.com/yt-dlp/yt-dlp/issues/11624)) by [jkruse](https://github.com/jkruse)
- **dacast**: [Fix HLS AES formats extraction](https://github.com/yt-dlp/yt-dlp/commit/0a0d80800b9350d1a4c4b18d82cfb77ffbc3c507) ([#11644](https://github.com/yt-dlp/yt-dlp/issues/11644)) by [bashonly](https://github.com/bashonly)
- **dropbox**: [Fix password-protected video extraction](https://github.com/yt-dlp/yt-dlp/commit/00dcde728635633eee969ad4d498b9f233c4a94e) ([#11636](https://github.com/yt-dlp/yt-dlp/issues/11636)) by [bashonly](https://github.com/bashonly)
- **duoplay**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/62cba8a1bedbfc0ddde7267ae57b72bf5f7ea7b1) ([#11588](https://github.com/yt-dlp/yt-dlp/issues/11588)) by [bashonly](https://github.com/bashonly), [glensc](https://github.com/glensc)
- **facebook**: [Support more groups URLs](https://github.com/yt-dlp/yt-dlp/commit/e0f1ae813b36e783e2348ba2a1566e12f5cd8f6e) ([#11576](https://github.com/yt-dlp/yt-dlp/issues/11576)) by [grqz](https://github.com/grqz)
- **instagram**: [Support `share` URLs](https://github.com/yt-dlp/yt-dlp/commit/360aed810ad85db950df586282d256516c98cd2d) ([#11677](https://github.com/yt-dlp/yt-dlp/issues/11677)) by [grqz](https://github.com/grqz)
- **microsoftembed**: [Make format extraction non fatal](https://github.com/yt-dlp/yt-dlp/commit/2bea7936323ca4b6f3b9b1fdd892566223e30efa) ([#11654](https://github.com/yt-dlp/yt-dlp/issues/11654)) by [seproDev](https://github.com/seproDev)
- **mitele**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/cd0f934604587ed793e9177f6a127e5dcf99a7dd) ([#11683](https://github.com/yt-dlp/yt-dlp/issues/11683)) by [DarkZeros](https://github.com/DarkZeros)
- **stripchat**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/16336c51d0848a6868a4fa04e749fa03548b4913) ([#11596](https://github.com/yt-dlp/yt-dlp/issues/11596)) by [gitninja1234](https://github.com/gitninja1234)
- **tiktok**: [Deprioritize animated thumbnails](https://github.com/yt-dlp/yt-dlp/commit/910ecc422930bca14e2abe4986f5f92359e3cea8) ([#11645](https://github.com/yt-dlp/yt-dlp/issues/11645)) by [bashonly](https://github.com/bashonly)
- **vk**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/c038a7b187ba24360f14134842a7a2cf897c33b1) ([#11715](https://github.com/yt-dlp/yt-dlp/issues/11715)) by [bashonly](https://github.com/bashonly)
- **youtube**
- [Adjust player clients for site changes](https://github.com/yt-dlp/yt-dlp/commit/0d146c1e36f467af30e87b7af651bdee67b73500) ([#11663](https://github.com/yt-dlp/yt-dlp/issues/11663)) by [bashonly](https://github.com/bashonly)
- tab: [Fix playlists tab extraction](https://github.com/yt-dlp/yt-dlp/commit/fe70f20aedf528fdee332131bc9b6710e54e6f10) ([#11615](https://github.com/yt-dlp/yt-dlp/issues/11615)) by [seproDev](https://github.com/seproDev)
#### Networking changes
- **Request Handler**: websockets: [Support websockets 14.0+](https://github.com/yt-dlp/yt-dlp/commit/c7316373c0a886f65a07a51e50ee147bb3294c85) ([#11616](https://github.com/yt-dlp/yt-dlp/issues/11616)) by [coletdjnz](https://github.com/coletdjnz)
#### Misc. changes
- **cleanup**
- [Bump ruff to 0.8.x](https://github.com/yt-dlp/yt-dlp/commit/d8fb3490863653182864d2a53522f350d67a9ff8) ([#11608](https://github.com/yt-dlp/yt-dlp/issues/11608)) by [seproDev](https://github.com/seproDev)
- Miscellaneous
- [ccf0a6b](https://github.com/yt-dlp/yt-dlp/commit/ccf0a6b86b7f68a75463804fe485ec240b8635f0) by [bashonly](https://github.com/bashonly), [pzhlkj6612](https://github.com/pzhlkj6612)
- [2b67ac3](https://github.com/yt-dlp/yt-dlp/commit/2b67ac300ac8b44368fb121637d1743cea8c5b6b) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
### 2024.11.18
#### Important changes
- **Login with OAuth is no longer supported for YouTube**
Due to a change made by the site, yt-dlp is no longer able to support OAuth login for YouTube. [Read more](https://github.com/yt-dlp/yt-dlp/issues/11462#issuecomment-2471703090)
#### Core changes
- [Catch broken Cryptodome installations](https://github.com/yt-dlp/yt-dlp/commit/b83ca24eb72e1e558b0185bd73975586c0bc0546) ([#11486](https://github.com/yt-dlp/yt-dlp/issues/11486)) by [seproDev](https://github.com/seproDev)
- **utils**
- [Fix `join_nonempty`, add `**kwargs` to `unpack`](https://github.com/yt-dlp/yt-dlp/commit/39d79c9b9cf23411d935910685c40aa1a2fdb409) ([#11559](https://github.com/yt-dlp/yt-dlp/issues/11559)) by [Grub4K](https://github.com/Grub4K)
- `subs_list_to_dict`: [Add `lang` default parameter](https://github.com/yt-dlp/yt-dlp/commit/c014fbcddcb4c8f79d914ac5bb526758b540ea33) ([#11508](https://github.com/yt-dlp/yt-dlp/issues/11508)) by [Grub4K](https://github.com/Grub4K)
#### Extractor changes
- [Allow `ext` override for thumbnails](https://github.com/yt-dlp/yt-dlp/commit/eb64ae7d5def6df2aba74fb703e7f168fb299865) ([#11545](https://github.com/yt-dlp/yt-dlp/issues/11545)) by [bashonly](https://github.com/bashonly)
- **adobepass**: [Fix provider requests](https://github.com/yt-dlp/yt-dlp/commit/85fdc66b6e01d19a94b4f39b58e3c0cf23600902) ([#11472](https://github.com/yt-dlp/yt-dlp/issues/11472)) by [bashonly](https://github.com/bashonly)
- **archive.org**: [Fix comments extraction](https://github.com/yt-dlp/yt-dlp/commit/f2a4983df7a64c4e93b56f79dbd16a781bd90206) ([#11527](https://github.com/yt-dlp/yt-dlp/issues/11527)) by [jshumphrey](https://github.com/jshumphrey)
- **bandlab**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/6365e92589e4bc17b8fffb0125a716d144ad2137) ([#11535](https://github.com/yt-dlp/yt-dlp/issues/11535)) by [seproDev](https://github.com/seproDev)
- **chaturbate**
- [Extract from API and support impersonation](https://github.com/yt-dlp/yt-dlp/commit/720b3dc453c342bc2e8df7dbc0acaab4479de46c) ([#11555](https://github.com/yt-dlp/yt-dlp/issues/11555)) by [powergold1](https://github.com/powergold1) (With fixes in [7cecd29](https://github.com/yt-dlp/yt-dlp/commit/7cecd299e4a5ef1f0f044b2fedc26f17e41f15e3) by [seproDev](https://github.com/seproDev))
- [Support alternate domains](https://github.com/yt-dlp/yt-dlp/commit/a9f85670d03ab993dc589f21a9ffffcad61392d5) ([#10595](https://github.com/yt-dlp/yt-dlp/issues/10595)) by [manavchaudhary1](https://github.com/manavchaudhary1)
- **cloudflarestream**: [Avoid extraction via videodelivery.net](https://github.com/yt-dlp/yt-dlp/commit/2db8c2e7d57a1784b06057c48e3e91023720d195) ([#11478](https://github.com/yt-dlp/yt-dlp/issues/11478)) by [hugovdev](https://github.com/hugovdev)
- **ctvnews**
- [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f351440f1dc5b3dfbfc5737b037a869d946056fe) ([#11534](https://github.com/yt-dlp/yt-dlp/issues/11534)) by [bashonly](https://github.com/bashonly), [jshumphrey](https://github.com/jshumphrey)
- [Fix playlist ID extraction](https://github.com/yt-dlp/yt-dlp/commit/f9d98509a898737c12977b2e2117277bada2c196) ([#8892](https://github.com/yt-dlp/yt-dlp/issues/8892)) by [qbnu](https://github.com/qbnu)
- **digitalconcerthall**: [Support login with access/refresh tokens](https://github.com/yt-dlp/yt-dlp/commit/f7257588bdff5f0b0452635a66b253a783c97357) ([#11571](https://github.com/yt-dlp/yt-dlp/issues/11571)) by [bashonly](https://github.com/bashonly)
- **facebook**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/bacc31b05a04181b63100c481565256b14813a5e) ([#11513](https://github.com/yt-dlp/yt-dlp/issues/11513)) by [bashonly](https://github.com/bashonly)
- **gamedevtv**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/be3579aaf0c3b71a0a3195e1955415d5e4d6b3d8) ([#11368](https://github.com/yt-dlp/yt-dlp/issues/11368)) by [bashonly](https://github.com/bashonly), [stratus-ss](https://github.com/stratus-ss)
- **goplay**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/6b43a8d84b881d769b480ba6e20ec691e9d1b92d) ([#11466](https://github.com/yt-dlp/yt-dlp/issues/11466)) by [bashonly](https://github.com/bashonly), [SamDecrock](https://github.com/SamDecrock)
- **kenh14**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/eb15fd5a32d8b35ef515f7a3d1158c03025648ff) ([#3996](https://github.com/yt-dlp/yt-dlp/issues/3996)) by [krichbanana](https://github.com/krichbanana), [pzhlkj6612](https://github.com/pzhlkj6612)
- **litv**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/e079ffbda66de150c0a9ebef05e89f61bb4d5f76) ([#11071](https://github.com/yt-dlp/yt-dlp/issues/11071)) by [jiru](https://github.com/jiru)
- **mixchmovie**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/0ec9bfed4d4a52bfb4f8733da1acf0aeeae21e6b) ([#10897](https://github.com/yt-dlp/yt-dlp/issues/10897)) by [Sakura286](https://github.com/Sakura286)
- **patreon**: [Fix comments extraction](https://github.com/yt-dlp/yt-dlp/commit/1d253b0a27110d174c40faf8fb1c999d099e0cde) ([#11530](https://github.com/yt-dlp/yt-dlp/issues/11530)) by [bashonly](https://github.com/bashonly), [jshumphrey](https://github.com/jshumphrey)
- **pialive**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/d867f99622ef7fba690b08da56c39d739b822bb7) ([#10811](https://github.com/yt-dlp/yt-dlp/issues/10811)) by [ChocoLZS](https://github.com/ChocoLZS)
- **radioradicale**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/70c55cb08f780eab687e881ef42bb5c6007d290b) ([#5607](https://github.com/yt-dlp/yt-dlp/issues/5607)) by [a13ssandr0](https://github.com/a13ssandr0), [pzhlkj6612](https://github.com/pzhlkj6612)
- **reddit**: [Improve error handling](https://github.com/yt-dlp/yt-dlp/commit/7ea2787920cccc6b8ea30791993d114fbd564434) ([#11573](https://github.com/yt-dlp/yt-dlp/issues/11573)) by [bashonly](https://github.com/bashonly)
- **redgifsuser**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/d215fba7edb69d4fa665f43663756fd260b1489f) ([#11531](https://github.com/yt-dlp/yt-dlp/issues/11531)) by [jshumphrey](https://github.com/jshumphrey)
- **rutube**: [Rework extractors](https://github.com/yt-dlp/yt-dlp/commit/e398217aae19bb25f91797bfbe8a3243698d7f45) ([#11480](https://github.com/yt-dlp/yt-dlp/issues/11480)) by [seproDev](https://github.com/seproDev)
- **sonylivseries**: [Add `sort_order` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/2009cb27e17014787bf63eaa2ada51293d54f22a) ([#11569](https://github.com/yt-dlp/yt-dlp/issues/11569)) by [bashonly](https://github.com/bashonly)
- **soop**: [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/c699bafc5038b59c9afe8c2e69175fb66424c832) ([#11545](https://github.com/yt-dlp/yt-dlp/issues/11545)) by [bashonly](https://github.com/bashonly)
- **spankbang**: [Support browser impersonation](https://github.com/yt-dlp/yt-dlp/commit/8388ec256f7753b02488788e3cfa771f6e1db247) ([#11542](https://github.com/yt-dlp/yt-dlp/issues/11542)) by [jshumphrey](https://github.com/jshumphrey)
- **spreaker**
- [Support episode pages and access keys](https://github.com/yt-dlp/yt-dlp/commit/c39016f66df76d14284c705736ca73db8055d8de) ([#11489](https://github.com/yt-dlp/yt-dlp/issues/11489)) by [julionc](https://github.com/julionc)
- [Support podcast and feed pages](https://github.com/yt-dlp/yt-dlp/commit/c6737310619022248f5d0fd13872073cac168453) ([#10968](https://github.com/yt-dlp/yt-dlp/issues/10968)) by [subrat-lima](https://github.com/subrat-lima)
- **youtube**
- [Player client maintenance](https://github.com/yt-dlp/yt-dlp/commit/637d62a3a9fc723d68632c1af25c30acdadeeb85) ([#11528](https://github.com/yt-dlp/yt-dlp/issues/11528)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
- [Remove broken OAuth support](https://github.com/yt-dlp/yt-dlp/commit/52c0ffe40ad6e8404d93296f575007b05b04c686) ([#11558](https://github.com/yt-dlp/yt-dlp/issues/11558)) by [bashonly](https://github.com/bashonly)
- tab: [Fix podcasts tab extraction](https://github.com/yt-dlp/yt-dlp/commit/37cd7660eaff397c551ee18d80507702342b0c2b) ([#11567](https://github.com/yt-dlp/yt-dlp/issues/11567)) by [seproDev](https://github.com/seproDev)
#### Misc. changes
- **build**
- [Bump PyInstaller version pin to `>=6.11.1`](https://github.com/yt-dlp/yt-dlp/commit/f9c8deb4e5887ff5150e911ac0452e645f988044) ([#11507](https://github.com/yt-dlp/yt-dlp/issues/11507)) by [bashonly](https://github.com/bashonly)
- [Enable attestations for trusted publishing](https://github.com/yt-dlp/yt-dlp/commit/f13df591d4d7ca8e2f31b35c9c91e69ba9e9b013) ([#11420](https://github.com/yt-dlp/yt-dlp/issues/11420)) by [bashonly](https://github.com/bashonly)
- [Pin `websockets` version to >=13.0,<14](https://github.com/yt-dlp/yt-dlp/commit/240a7d43c8a67ffb86d44dc276805aa43c358dcc) ([#11488](https://github.com/yt-dlp/yt-dlp/issues/11488)) by [bashonly](https://github.com/bashonly)
- **cleanup**
- [Deprecate more compat functions](https://github.com/yt-dlp/yt-dlp/commit/f95a92b3d0169a784ee15a138fbe09d82b2754a1) ([#11439](https://github.com/yt-dlp/yt-dlp/issues/11439)) by [seproDev](https://github.com/seproDev)
- [Remove dead extractors](https://github.com/yt-dlp/yt-dlp/commit/10fc719bc7f1eef469389c5219102266ef411f29) ([#11566](https://github.com/yt-dlp/yt-dlp/issues/11566)) by [doe1080](https://github.com/doe1080)
- Miscellaneous: [da252d9](https://github.com/yt-dlp/yt-dlp/commit/da252d9d322af3e2178ac5eae324809502a0a862) by [bashonly](https://github.com/bashonly), [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
### 2024.11.04 ### 2024.11.04
#### Important changes #### Important changes

View File

@@ -342,8 +342,9 @@ If you fork the project on GitHub, you can run your fork's [build workflow](.git
extractor plugins; postprocessor plugins can extractor plugins; postprocessor plugins can
only be loaded from the default plugin only be loaded from the default plugin
directories directories
--flat-playlist Do not extract the videos of a playlist, --flat-playlist Do not extract a playlist's URL result
only list them entries; some entry metadata may be missing
and downloading may be bypassed
--no-flat-playlist Fully extract the videos of a playlist --no-flat-playlist Fully extract the videos of a playlist
(default) (default)
--live-from-start Download livestreams from the start. --live-from-start Download livestreams from the start.
@@ -612,8 +613,7 @@ If you fork the project on GitHub, you can run your fork's [build workflow](.git
--no-restrict-filenames Allow Unicode characters, "&" and spaces in --no-restrict-filenames Allow Unicode characters, "&" and spaces in
filenames (default) filenames (default)
--windows-filenames Force filenames to be Windows-compatible --windows-filenames Force filenames to be Windows-compatible
--no-windows-filenames Make filenames Windows-compatible only if --no-windows-filenames Sanitize filenames only minimally
using Windows (default)
--trim-filenames LENGTH Limit the filename length (excluding --trim-filenames LENGTH Limit the filename length (excluding
extension) to the specified number of extension) to the specified number of
characters characters
@@ -1293,6 +1293,7 @@ The available fields are:
- `playlist_uploader_id` (string): Nickname or id of the playlist uploader - `playlist_uploader_id` (string): Nickname or id of the playlist uploader
- `playlist_channel` (string): Display name of the channel that uploaded the playlist - `playlist_channel` (string): Display name of the channel that uploaded the playlist
- `playlist_channel_id` (string): Identifier of the channel that uploaded the playlist - `playlist_channel_id` (string): Identifier of the channel that uploaded the playlist
- `playlist_webpage_url` (string): URL of the playlist webpage
- `webpage_url` (string): A URL to the video webpage which, if given to yt-dlp, should yield the same result again - `webpage_url` (string): A URL to the video webpage which, if given to yt-dlp, should yield the same result again
- `webpage_url_basename` (string): The basename of the webpage URL - `webpage_url_basename` (string): The basename of the webpage URL
- `webpage_url_domain` (string): The domain of the webpage URL - `webpage_url_domain` (string): The domain of the webpage URL
@@ -1759,7 +1760,7 @@ $ yt-dlp --replace-in-metadata "title,uploader" "[ _]" "-"
# EXTRACTOR ARGUMENTS # EXTRACTOR ARGUMENTS
Some extractors accept additional arguments which can be passed using `--extractor-args KEY:ARGS`. `ARGS` is a `;` (semicolon) separated string of `ARG=VAL1,VAL2`. E.g. `--extractor-args "youtube:player-client=mediaconnect,web;formats=incomplete" --extractor-args "funimation:version=uncut"` Some extractors accept additional arguments which can be passed using `--extractor-args KEY:ARGS`. `ARGS` is a `;` (semicolon) separated string of `ARG=VAL1,VAL2`. E.g. `--extractor-args "youtube:player-client=tv,mweb;formats=incomplete" --extractor-args "twitter:api=syndication"`
Note: In CLI, `ARG` can use `-` instead of `_`; e.g. `youtube:player-client"` becomes `youtube:player_client"` Note: In CLI, `ARG` can use `-` instead of `_`; e.g. `youtube:player-client"` becomes `youtube:player_client"`
@@ -1768,19 +1769,19 @@ The following extractors use this feature:
#### youtube #### youtube
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes * `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively * `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
* `player_client`: Clients to extract video data from. The main clients are `web`, `ios` and `android`, with variants `_music` and `_creator` (e.g. `ios_creator`); and `mweb`, `mediaconnect`, `android_testsuite`, `android_vr`, `web_safari`, `web_embedded`, `tv` and `tv_embedded` with no variants. By default, `ios,mweb` is used, and `web_creator,mediaconnect` is added as needed for age-gated videos when account age verification is required. Similarly, the `_music` variants are added for `music.youtube.com` URLs. Some clients, such as `web` and `android`, require a `po_token` for their formats to be downloadable. Some clients, such as the `_creator` variants, will only work with authentication. You can use `all` to use all the clients, and `default` for the default clients. You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=all,-web` * `player_client`: Clients to extract video data from. The main clients are `web`, `ios` and `android`, with variants `_music` and `_creator` (e.g. `ios_creator`); and `mweb`, `android_vr`, `web_safari`, `web_embedded`, `tv` and `tv_embedded` with no variants. By default, `tv,ios,web` is used, or `tv,web` is used when authenticating with cookies. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as the `_creator` variants, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details * `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp. * `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side) * `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
* `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all` * `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all`
* E.g. `all,all,1000,10` will get a maximum of 1000 replies total, with up to 10 replies per thread. `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total * E.g. `all,all,1000,10` will get a maximum of 1000 replies total, with up to 10 replies per thread. `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total
* `formats`: Change the types of formats to return. `dashy` (convert HTTP to DASH), `duplicate` (identical content but different URLs or protocol; includes `dashy`), `incomplete` (cannot be downloaded completely - live dash and post-live m3u8) * `formats`: Change the types of formats to return. `dashy` (convert HTTP to DASH), `duplicate` (identical content but different URLs or protocol; includes `dashy`), `incomplete` (cannot be downloaded completely - live dash and post-live m3u8), `missing_pot` (include formats that require a PO Token but are missing one)
* `innertube_host`: Innertube API host to use for all API requests; e.g. `studio.youtube.com`, `youtubei.googleapis.com`. Note that cookies exported from one subdomain will not work on others * `innertube_host`: Innertube API host to use for all API requests; e.g. `studio.youtube.com`, `youtubei.googleapis.com`. Note that cookies exported from one subdomain will not work on others
* `innertube_key`: Innertube API key to use for all API requests. By default, no API key is used * `innertube_key`: Innertube API key to use for all API requests. By default, no API key is used
* `raise_incomplete_data`: `Incomplete Data Received` raises an error instead of reporting a warning * `raise_incomplete_data`: `Incomplete Data Received` raises an error instead of reporting a warning
* `data_sync_id`: Overrides the account Data Sync ID used in Innertube API requests. This may be needed if you are using an account with `youtube:player_skip=webpage,configs` or `youtubetab:skip=webpage` * `data_sync_id`: Overrides the account Data Sync ID used in Innertube API requests. This may be needed if you are using an account with `youtube:player_skip=webpage,configs` or `youtubetab:skip=webpage`
* `visitor_data`: Overrides the Visitor Data used in Innertube API requests. This should be used with `player_skip=webpage,configs` and without cookies. Note: this may have adverse effects if used improperly. If a session from a browser is wanted, you should pass cookies instead (which contain the Visitor ID) * `visitor_data`: Overrides the Visitor Data used in Innertube API requests. This should be used with `player_skip=webpage,configs` and without cookies. Note: this may have adverse effects if used improperly. If a session from a browser is wanted, you should pass cookies instead (which contain the Visitor ID)
* `po_token`: Proof of Origin (PO) Token(s) to use for requesting video playback. Comma seperated list of PO Tokens in the format `CLIENT+PO_TOKEN`, e.g. `youtube:po_token=web+XXX,android+YYY` * `po_token`: Proof of Origin (PO) Token(s) to use. Comma seperated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player=XXX,web_safari.gvs+YYY`. Context can be either `gvs` (Google Video Server URLs) or `player` (Innertube player request)
#### youtubetab (YouTube playlists, channels, feeds, etc.) #### youtubetab (YouTube playlists, channels, feeds, etc.)
* `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details) * `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details)
@@ -1794,13 +1795,6 @@ The following extractors use this feature:
* `is_live`: Bypass live HLS detection and manually set `live_status` - a value of `false` will set `not_live`, any other value (or no value) will set `is_live` * `is_live`: Bypass live HLS detection and manually set `live_status` - a value of `false` will set `not_live`, any other value (or no value) will set `is_live`
* `impersonate`: Target(s) to try and impersonate with the initial webpage request; e.g. `generic:impersonate=safari,chrome-110`. Use `generic:impersonate` to impersonate any available target, and use `generic:impersonate=false` to disable impersonation (default) * `impersonate`: Target(s) to try and impersonate with the initial webpage request; e.g. `generic:impersonate=safari,chrome-110`. Use `generic:impersonate` to impersonate any available target, and use `generic:impersonate=false` to disable impersonation (default)
#### funimation
* `language`: Audio languages to extract, e.g. `funimation:language=english,japanese`
* `version`: The video version to extract - `uncut` or `simulcast`
#### crunchyrollbeta (Crunchyroll)
* `hardsub`: One or more hardsub versions to extract (in order of preference), or `all` (default: `None` = no hardsubs will be extracted), e.g. `crunchyrollbeta:hardsub=en-US,de-DE`
#### vikichannel #### vikichannel
* `video_types`: Types of videos to download - one or more of `episodes`, `movies`, `clips`, `trailers` * `video_types`: Types of videos to download - one or more of `episodes`, `movies`, `clips`, `trailers`
@@ -1858,7 +1852,7 @@ The following extractors use this feature:
* `cdn`: One or more CDN IDs to use with the API call for stream URLs, e.g. `gcp_cdn`, `gs_cdn_pc_app`, `gs_cdn_mobile_web`, `gs_cdn_pc_web` * `cdn`: One or more CDN IDs to use with the API call for stream URLs, e.g. `gcp_cdn`, `gs_cdn_pc_app`, `gs_cdn_mobile_web`, `gs_cdn_pc_web`
#### soundcloud #### soundcloud
* `formats`: Formats to request from the API. Requested values should be in the format of `{protocol}_{extension}` (omitting the bitrate), e.g. `hls_opus,http_aac`. The `*` character functions as a wildcard, e.g. `*_mp3`, and can be passed by itself to request all formats. Known protocols include `http`, `hls` and `hls-aes`; known extensions include `aac`, `opus` and `mp3`. Original `download` formats are always extracted. Default is `http_aac,hls_aac,http_opus,hls_opus,http_mp3,hls_mp3` * `formats`: Formats to request from the API. Requested values should be in the format of `{protocol}_{codec}`, e.g. `hls_opus,http_aac`. The `*` character functions as a wildcard, e.g. `*_mp3`, and can be passed by itself to request all formats. Known protocols include `http`, `hls` and `hls-aes`; known codecs include `aac`, `opus` and `mp3`. Original `download` formats are always extracted. Default is `http_aac,hls_aac,http_opus,hls_opus,http_mp3,hls_mp3`
#### orfon (orf:on) #### orfon (orf:on)
* `prefer_segments_playlist`: Prefer a playlist of program segments instead of a single complete video when available. If individual segments are desired, use `--concat-playlist never --extractor-args "orfon:prefer_segments_playlist"` * `prefer_segments_playlist`: Prefer a playlist of program segments instead of a single complete video when available. If individual segments are desired, use `--concat-playlist never --extractor-args "orfon:prefer_segments_playlist"`
@@ -1866,8 +1860,8 @@ The following extractors use this feature:
#### bilibili #### bilibili
* `prefer_multi_flv`: Prefer extracting flv formats over mp4 for older videos that still provide legacy formats * `prefer_multi_flv`: Prefer extracting flv formats over mp4 for older videos that still provide legacy formats
#### digitalconcerthall #### sonylivseries
* `prefer_combined_hls`: Prefer extracting combined/pre-merged video and audio HLS formats. This will exclude 4K/HEVC video and lossless/FLAC audio formats, which are only available as split video/audio HLS formats * `sort_order`: Episode sort order for series extraction - one of `asc` (ascending, oldest first) or `desc` (descending, newest first). Default is `asc`
**Note**: These options may be changed/removed in the future without concern for backward compatibility **Note**: These options may be changed/removed in the future without concern for backward compatibility

View File

@@ -234,5 +234,16 @@
"when": "57212a5f97ce367590aaa5c3e9a135eead8f81f7", "when": "57212a5f97ce367590aaa5c3e9a135eead8f81f7",
"short": "[ie/vimeo] Fix API retries (#11351)", "short": "[ie/vimeo] Fix API retries (#11351)",
"authors": ["bashonly"] "authors": ["bashonly"]
},
{
"action": "add",
"when": "52c0ffe40ad6e8404d93296f575007b05b04c686",
"short": "[priority] **Login with OAuth is no longer supported for YouTube**\nDue to a change made by the site, yt-dlp is no longer able to support OAuth login for YouTube. [Read more](https://github.com/yt-dlp/yt-dlp/issues/11462#issuecomment-2471703090)"
},
{
"action": "change",
"when": "76ac023ff02f06e8c003d104f02a03deeddebdcd",
"short": "[ie/youtube:tab] Improve shorts title extraction (#11997)",
"authors": ["bashonly", "d3d9"]
} }
] ]

View File

@@ -11,13 +11,12 @@ import codecs
import subprocess import subprocess
from yt_dlp.aes import aes_encrypt, key_expansion from yt_dlp.aes import aes_encrypt, key_expansion
from yt_dlp.utils import intlist_to_bytes
secret_msg = b'Secret message goes here' secret_msg = b'Secret message goes here'
def hex_str(int_list): def hex_str(int_list):
return codecs.encode(intlist_to_bytes(int_list), 'hex') return codecs.encode(bytes(int_list), 'hex')
def openssl_encode(algo, key, iv): def openssl_encode(algo, key, iv):

View File

@@ -76,14 +76,14 @@ dev = [
] ]
static-analysis = [ static-analysis = [
"autopep8~=2.0", "autopep8~=2.0",
"ruff~=0.7.0", "ruff~=0.9.0",
] ]
test = [ test = [
"pytest~=8.1", "pytest~=8.1",
"pytest-rerunfailures~=14.0", "pytest-rerunfailures~=14.0",
] ]
pyinstaller = [ pyinstaller = [
"pyinstaller>=6.10.0", # Windows temp cleanup fixed in 6.10.0 "pyinstaller>=6.11.1", # Windows temp cleanup fixed in 6.11.1
] ]
[project.urls] [project.urls]
@@ -186,6 +186,7 @@ ignore = [
"E501", # line-too-long "E501", # line-too-long
"E731", # lambda-assignment "E731", # lambda-assignment
"E741", # ambiguous-variable-name "E741", # ambiguous-variable-name
"UP031", # printf-string-formatting
"UP036", # outdated-version-block "UP036", # outdated-version-block
"B006", # mutable-argument-default "B006", # mutable-argument-default
"B008", # function-call-in-default-argument "B008", # function-call-in-default-argument
@@ -194,6 +195,7 @@ ignore = [
"B023", # function-uses-loop-variable (false positives) "B023", # function-uses-loop-variable (false positives)
"B028", # no-explicit-stacklevel "B028", # no-explicit-stacklevel
"B904", # raise-without-from-inside-except "B904", # raise-without-from-inside-except
"A005", # stdlib-module-shadowing
"C401", # unnecessary-generator-set "C401", # unnecessary-generator-set
"C402", # unnecessary-generator-dict "C402", # unnecessary-generator-dict
"PIE790", # unnecessary-placeholder "PIE790", # unnecessary-placeholder
@@ -258,9 +260,6 @@ select = [
"A002", # builtin-argument-shadowing "A002", # builtin-argument-shadowing
"C408", # unnecessary-collection-call "C408", # unnecessary-collection-call
] ]
"yt_dlp/jsinterp.py" = [
"UP031", # printf-string-formatting
]
[tool.ruff.lint.isort] [tool.ruff.lint.isort]
known-first-party = [ known-first-party = [
@@ -313,6 +312,16 @@ banned-from = [
"yt_dlp.compat.compat_urllib_parse_urlparse".msg = "Use `urllib.parse.urlparse` instead." "yt_dlp.compat.compat_urllib_parse_urlparse".msg = "Use `urllib.parse.urlparse` instead."
"yt_dlp.compat.compat_shlex_quote".msg = "Use `yt_dlp.utils.shell_quote` instead." "yt_dlp.compat.compat_shlex_quote".msg = "Use `yt_dlp.utils.shell_quote` instead."
"yt_dlp.utils.error_to_compat_str".msg = "Use `str` instead." "yt_dlp.utils.error_to_compat_str".msg = "Use `str` instead."
"yt_dlp.utils.bytes_to_intlist".msg = "Use `list` instead."
"yt_dlp.utils.intlist_to_bytes".msg = "Use `bytes` instead."
"yt_dlp.utils.decodeArgument".msg = "Do not use"
"yt_dlp.utils.decodeFilename".msg = "Do not use"
"yt_dlp.utils.encodeFilename".msg = "Do not use"
"yt_dlp.compat.compat_os_name".msg = "Use `os.name` instead."
"yt_dlp.compat.compat_realpath".msg = "Use `os.path.realpath` instead."
"yt_dlp.compat.functools".msg = "Use `functools` instead."
"yt_dlp.utils.decodeOption".msg = "Do not use"
"yt_dlp.utils.compiled_regex_type".msg = "Use `re.Pattern` instead."
[tool.autopep8] [tool.autopep8]
max_line_length = 120 max_line_length = 120

View File

@@ -129,6 +129,8 @@
- **Bandcamp:album** - **Bandcamp:album**
- **Bandcamp:user** - **Bandcamp:user**
- **Bandcamp:weekly** - **Bandcamp:weekly**
- **Bandlab**
- **BandlabPlaylist**
- **BannedVideo** - **BannedVideo**
- **bbc**: [*bbc*](## "netrc machine") BBC - **bbc**: [*bbc*](## "netrc machine") BBC
- **bbc.co.uk**: [*bbc*](## "netrc machine") BBC iPlayer - **bbc.co.uk**: [*bbc*](## "netrc machine") BBC iPlayer
@@ -169,6 +171,7 @@
- **BilibiliCheese** - **BilibiliCheese**
- **BilibiliCheeseSeason** - **BilibiliCheeseSeason**
- **BilibiliCollectionList** - **BilibiliCollectionList**
- **BiliBiliDynamic**
- **BilibiliFavoritesList** - **BilibiliFavoritesList**
- **BiliBiliPlayer** - **BiliBiliPlayer**
- **BilibiliPlaylist** - **BilibiliPlaylist**
@@ -301,10 +304,6 @@
- **CrowdBunker** - **CrowdBunker**
- **CrowdBunkerChannel** - **CrowdBunkerChannel**
- **Crtvg** - **Crtvg**
- **crunchyroll**: [*crunchyroll*](## "netrc machine")
- **crunchyroll:artist**: [*crunchyroll*](## "netrc machine")
- **crunchyroll:music**: [*crunchyroll*](## "netrc machine")
- **crunchyroll:playlist**: [*crunchyroll*](## "netrc machine")
- **CSpan**: C-SPAN - **CSpan**: C-SPAN
- **CSpanCongress** - **CSpanCongress**
- **CtsNews**: 華視新聞 - **CtsNews**: 華視新聞
@@ -372,6 +371,7 @@
- **Dropbox** - **Dropbox**
- **Dropout**: [*dropout*](## "netrc machine") - **Dropout**: [*dropout*](## "netrc machine")
- **DropoutSeason** - **DropoutSeason**
- **DrTalks**
- **DrTuber** - **DrTuber**
- **drtv** - **drtv**
- **drtv:live** - **drtv:live**
@@ -390,6 +390,8 @@
- **Ebay** - **Ebay**
- **egghead:course**: egghead.io course - **egghead:course**: egghead.io course
- **egghead:lesson**: egghead.io lesson - **egghead:lesson**: egghead.io lesson
- **eggs:artist**
- **eggs:single**
- **EinsUndEinsTV**: [*1und1tv*](## "netrc machine") - **EinsUndEinsTV**: [*1und1tv*](## "netrc machine")
- **EinsUndEinsTVLive**: [*1und1tv*](## "netrc machine") - **EinsUndEinsTVLive**: [*1und1tv*](## "netrc machine")
- **EinsUndEinsTVRecordings**: [*1und1tv*](## "netrc machine") - **EinsUndEinsTVRecordings**: [*1und1tv*](## "netrc machine")
@@ -474,9 +476,6 @@
- **FrontendMastersCourse**: [*frontendmasters*](## "netrc machine") - **FrontendMastersCourse**: [*frontendmasters*](## "netrc machine")
- **FrontendMastersLesson**: [*frontendmasters*](## "netrc machine") - **FrontendMastersLesson**: [*frontendmasters*](## "netrc machine")
- **FujiTVFODPlus7** - **FujiTVFODPlus7**
- **Funimation**: [*funimation*](## "netrc machine")
- **funimation:page**: [*funimation*](## "netrc machine")
- **funimation:show**: [*funimation*](## "netrc machine")
- **Funk** - **Funk**
- **Funker530** - **Funker530**
- **Fux** - **Fux**
@@ -484,6 +483,7 @@
- **Gab** - **Gab**
- **GabTV** - **GabTV**
- **Gaia**: [*gaia*](## "netrc machine") - **Gaia**: [*gaia*](## "netrc machine")
- **GameDevTVDashboard**: [*gamedevtv*](## "netrc machine")
- **GameJolt** - **GameJolt**
- **GameJoltCommunity** - **GameJoltCommunity**
- **GameJoltGame** - **GameJoltGame**
@@ -651,6 +651,8 @@
- **Karaoketv** - **Karaoketv**
- **Katsomo**: (**Currently broken**) - **Katsomo**: (**Currently broken**)
- **KelbyOne**: (**Currently broken**) - **KelbyOne**: (**Currently broken**)
- **Kenh14Playlist**
- **Kenh14Video**
- **Ketnet** - **Ketnet**
- **khanacademy** - **khanacademy**
- **khanacademy:unit** - **khanacademy:unit**
@@ -784,10 +786,6 @@
- **MicrosoftLearnSession** - **MicrosoftLearnSession**
- **MicrosoftMedius** - **MicrosoftMedius**
- **microsoftstream**: Microsoft Stream - **microsoftstream**: Microsoft Stream
- **mildom**: Record ongoing live by specific user in Mildom
- **mildom:clip**: Clip in Mildom
- **mildom:user:vod**: Download all VODs from specific user in Mildom
- **mildom:vod**: VOD in Mildom
- **minds** - **minds**
- **minds:channel** - **minds:channel**
- **minds:group** - **minds:group**
@@ -798,6 +796,7 @@
- **MiTele**: mitele.es - **MiTele**: mitele.es
- **mixch** - **mixch**
- **mixch:archive** - **mixch:archive**
- **mixch:movie**
- **mixcloud** - **mixcloud**
- **mixcloud:playlist** - **mixcloud:playlist**
- **mixcloud:user** - **mixcloud:user**
@@ -889,6 +888,8 @@
- **nebula:video**: [*watchnebula*](## "netrc machine") - **nebula:video**: [*watchnebula*](## "netrc machine")
- **NekoHacker** - **NekoHacker**
- **NerdCubedFeed** - **NerdCubedFeed**
- **Nest**
- **NestClip**
- **netease:album**: 网易云音乐 - 专辑 - **netease:album**: 网易云音乐 - 专辑
- **netease:djradio**: 网易云音乐 - 电台 - **netease:djradio**: 网易云音乐 - 电台
- **netease:mv**: 网易云音乐 - MV - **netease:mv**: 网易云音乐 - MV
@@ -1060,14 +1061,16 @@
- **PhilharmonieDeParis**: Philharmonie de Paris - **PhilharmonieDeParis**: Philharmonie de Paris
- **phoenix.de** - **phoenix.de**
- **Photobucket** - **Photobucket**
- **PiaLive**
- **Piapro**: [*piapro*](## "netrc machine") - **Piapro**: [*piapro*](## "netrc machine")
- **PIAULIZAPortal**: ulizaportal.jp - PIA LIVE STREAM
- **Picarto** - **Picarto**
- **PicartoVod** - **PicartoVod**
- **Piksel** - **Piksel**
- **Pinkbike** - **Pinkbike**
- **Pinterest** - **Pinterest**
- **PinterestCollection** - **PinterestCollection**
- **PiramideTV**
- **PiramideTVChannel**
- **pixiv:sketch** - **pixiv:sketch**
- **pixiv:sketch:user** - **pixiv:sketch:user**
- **Pladform** - **Pladform**
@@ -1084,12 +1087,11 @@
- **pluralsight**: [*pluralsight*](## "netrc machine") - **pluralsight**: [*pluralsight*](## "netrc machine")
- **pluralsight:course** - **pluralsight:course**
- **PlutoTV**: (**Currently broken**) - **PlutoTV**: (**Currently broken**)
- **PlVideo**: Платформа
- **PodbayFM** - **PodbayFM**
- **PodbayFMChannel** - **PodbayFMChannel**
- **Podchaser** - **Podchaser**
- **podomatic**: (**Currently broken**) - **podomatic**: (**Currently broken**)
- **Pokemon**
- **PokemonWatch**
- **PokerGo**: [*pokergo*](## "netrc machine") - **PokerGo**: [*pokergo*](## "netrc machine")
- **PokerGoCollection**: [*pokergo*](## "netrc machine") - **PokerGoCollection**: [*pokergo*](## "netrc machine")
- **PolsatGo** - **PolsatGo**
@@ -1160,6 +1162,7 @@
- **RadioJavan**: (**Currently broken**) - **RadioJavan**: (**Currently broken**)
- **radiokapital** - **radiokapital**
- **radiokapital:show** - **radiokapital:show**
- **RadioRadicale**
- **RadioZetPodcast** - **RadioZetPodcast**
- **radlive** - **radlive**
- **radlive:channel** - **radlive:channel**
@@ -1367,9 +1370,7 @@
- **spotify**: Spotify episodes (**Currently broken**) - **spotify**: Spotify episodes (**Currently broken**)
- **spotify:show**: Spotify shows (**Currently broken**) - **spotify:show**: Spotify shows (**Currently broken**)
- **Spreaker** - **Spreaker**
- **SpreakerPage**
- **SpreakerShow** - **SpreakerShow**
- **SpreakerShowPage**
- **SpringboardPlatform** - **SpringboardPlatform**
- **Sprout** - **Sprout**
- **SproutVideo** - **SproutVideo**
@@ -1395,6 +1396,8 @@
- **StretchInternet** - **StretchInternet**
- **Stripchat** - **Stripchat**
- **stv:player** - **stv:player**
- **Subsplash**
- **subsplash:playlist**
- **Substack** - **Substack**
- **SunPorno** - **SunPorno**
- **sverigesradio:episode** - **sverigesradio:episode**
@@ -1570,6 +1573,8 @@
- **UFCTV**: [*ufctv*](## "netrc machine") - **UFCTV**: [*ufctv*](## "netrc machine")
- **ukcolumn**: (**Currently broken**) - **ukcolumn**: (**Currently broken**)
- **UKTVPlay** - **UKTVPlay**
- **UlizaPlayer**
- **UlizaPortal**: ulizaportal.jp
- **umg:de**: Universal Music Deutschland (**Currently broken**) - **umg:de**: Universal Music Deutschland (**Currently broken**)
- **Unistra** - **Unistra**
- **Unity**: (**Currently broken**) - **Unity**: (**Currently broken**)
@@ -1587,8 +1592,6 @@
- **Varzesh3**: (**Currently broken**) - **Varzesh3**: (**Currently broken**)
- **Vbox7** - **Vbox7**
- **Veo** - **Veo**
- **Veoh**
- **veoh:user**
- **Vesti**: Вести.Ru (**Currently broken**) - **Vesti**: Вести.Ru (**Currently broken**)
- **Vevo** - **Vevo**
- **VevoPlaylist** - **VevoPlaylist**
@@ -1642,8 +1645,6 @@
- **Vimm:stream** - **Vimm:stream**
- **ViMP** - **ViMP**
- **ViMP:Playlist** - **ViMP:Playlist**
- **Vine**
- **vine:user**
- **Viously** - **Viously**
- **Viqeo**: (**Currently broken**) - **Viqeo**: (**Currently broken**)
- **Viu** - **Viu**

View File

@@ -9,7 +9,6 @@ import types
import yt_dlp.extractor import yt_dlp.extractor
from yt_dlp import YoutubeDL from yt_dlp import YoutubeDL
from yt_dlp.compat import compat_os_name
from yt_dlp.utils import preferredencoding, try_call, write_string, find_available_port from yt_dlp.utils import preferredencoding, try_call, write_string, find_available_port
if 'pytest' in sys.modules: if 'pytest' in sys.modules:
@@ -49,7 +48,7 @@ def report_warning(message, *args, **kwargs):
Print the message to stderr, it will be prefixed with 'WARNING:' Print the message to stderr, it will be prefixed with 'WARNING:'
If stderr is a tty file the 'WARNING:' will be colored If stderr is a tty file the 'WARNING:' will be colored
""" """
if sys.stderr.isatty() and compat_os_name != 'nt': if sys.stderr.isatty() and os.name != 'nt':
_msg_header = '\033[0;33mWARNING:\033[0m' _msg_header = '\033[0;33mWARNING:\033[0m'
else: else:
_msg_header = 'WARNING:' _msg_header = 'WARNING:'

View File

@@ -15,7 +15,6 @@ import json
from test.helper import FakeYDL, assertRegexpMatches, try_rm from test.helper import FakeYDL, assertRegexpMatches, try_rm
from yt_dlp import YoutubeDL from yt_dlp import YoutubeDL
from yt_dlp.compat import compat_os_name
from yt_dlp.extractor import YoutubeIE from yt_dlp.extractor import YoutubeIE
from yt_dlp.extractor.common import InfoExtractor from yt_dlp.extractor.common import InfoExtractor
from yt_dlp.postprocessor.common import PostProcessor from yt_dlp.postprocessor.common import PostProcessor
@@ -487,11 +486,11 @@ class TestFormatSelection(unittest.TestCase):
def test_format_filtering(self): def test_format_filtering(self):
formats = [ formats = [
{'format_id': 'A', 'filesize': 500, 'width': 1000}, {'format_id': 'A', 'filesize': 500, 'width': 1000, 'aspect_ratio': 1.0},
{'format_id': 'B', 'filesize': 1000, 'width': 500}, {'format_id': 'B', 'filesize': 1000, 'width': 500, 'aspect_ratio': 1.33},
{'format_id': 'C', 'filesize': 1000, 'width': 400}, {'format_id': 'C', 'filesize': 1000, 'width': 400, 'aspect_ratio': 1.5},
{'format_id': 'D', 'filesize': 2000, 'width': 600}, {'format_id': 'D', 'filesize': 2000, 'width': 600, 'aspect_ratio': 1.78},
{'format_id': 'E', 'filesize': 3000}, {'format_id': 'E', 'filesize': 3000, 'aspect_ratio': 0.56},
{'format_id': 'F'}, {'format_id': 'F'},
{'format_id': 'G', 'filesize': 1000000}, {'format_id': 'G', 'filesize': 1000000},
] ]
@@ -550,6 +549,31 @@ class TestFormatSelection(unittest.TestCase):
ydl.process_ie_result(info_dict) ydl.process_ie_result(info_dict)
self.assertEqual(ydl.downloaded_info_dicts, []) self.assertEqual(ydl.downloaded_info_dicts, [])
ydl = YDL({'format': 'best[aspect_ratio=1]'})
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'A')
ydl = YDL({'format': 'all[aspect_ratio > 1.00]'})
ydl.process_ie_result(info_dict)
downloaded_ids = [info['format_id'] for info in ydl.downloaded_info_dicts]
self.assertEqual(downloaded_ids, ['D', 'C', 'B'])
ydl = YDL({'format': 'all[aspect_ratio < 1.00]'})
ydl.process_ie_result(info_dict)
downloaded_ids = [info['format_id'] for info in ydl.downloaded_info_dicts]
self.assertEqual(downloaded_ids, ['E'])
ydl = YDL({'format': 'best[aspect_ratio=1.5]'})
ydl.process_ie_result(info_dict)
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'C')
ydl = YDL({'format': 'all[aspect_ratio!=1]'})
ydl.process_ie_result(info_dict)
downloaded_ids = [info['format_id'] for info in ydl.downloaded_info_dicts]
self.assertEqual(downloaded_ids, ['E', 'D', 'C', 'B'])
@patch('yt_dlp.postprocessor.ffmpeg.FFmpegMergerPP.available', False) @patch('yt_dlp.postprocessor.ffmpeg.FFmpegMergerPP.available', False)
def test_default_format_spec_without_ffmpeg(self): def test_default_format_spec_without_ffmpeg(self):
ydl = YDL({}) ydl = YDL({})
@@ -762,6 +786,13 @@ class TestYoutubeDL(unittest.TestCase):
test('%(width)06d.%%(ext)s', 'NA.%(ext)s') test('%(width)06d.%%(ext)s', 'NA.%(ext)s')
test('%%(width)06d.%(ext)s', '%(width)06d.mp4') test('%%(width)06d.%(ext)s', '%(width)06d.mp4')
# Sanitization options
test('%(title3)s', (None, 'foobartest'))
test('%(title5)s', (None, 'aei_A'), restrictfilenames=True)
test('%(title3)s', (None, 'foo_bar_test'), windowsfilenames=False, restrictfilenames=True)
if sys.platform != 'win32':
test('%(title3)s', (None, 'foobar\\test'), windowsfilenames=False)
# ID sanitization # ID sanitization
test('%(id)s', '_abcd', info={'id': '_abcd'}) test('%(id)s', '_abcd', info={'id': '_abcd'})
test('%(some_id)s', '_abcd', info={'some_id': '_abcd'}) test('%(some_id)s', '_abcd', info={'some_id': '_abcd'})
@@ -839,8 +870,8 @@ class TestYoutubeDL(unittest.TestCase):
test('%(filesize)#D', '1Ki') test('%(filesize)#D', '1Ki')
test('%(height)5.2D', ' 1.08k') test('%(height)5.2D', ' 1.08k')
test('%(title4)#S', 'foo_bar_test') test('%(title4)#S', 'foo_bar_test')
test('%(title4).10S', ('foo bar ', 'foo bar' + ('#' if compat_os_name == 'nt' else ' '))) test('%(title4).10S', ('foo bar ', 'foo bar' + ('#' if os.name == 'nt' else ' ')))
if compat_os_name == 'nt': if os.name == 'nt':
test('%(title4)q', ('"foo ""bar"" test"', None)) test('%(title4)q', ('"foo ""bar"" test"', None))
test('%(formats.:.id)#q', ('"id 1" "id 2" "id 3"', None)) test('%(formats.:.id)#q', ('"id 1" "id 2" "id 3"', None))
test('%(formats.0.id)#q', ('"id 1"', None)) test('%(formats.0.id)#q', ('"id 1"', None))
@@ -903,9 +934,9 @@ class TestYoutubeDL(unittest.TestCase):
# Environment variable expansion for prepare_filename # Environment variable expansion for prepare_filename
os.environ['__yt_dlp_var'] = 'expanded' os.environ['__yt_dlp_var'] = 'expanded'
envvar = '%__yt_dlp_var%' if compat_os_name == 'nt' else '$__yt_dlp_var' envvar = '%__yt_dlp_var%' if os.name == 'nt' else '$__yt_dlp_var'
test(envvar, (envvar, 'expanded')) test(envvar, (envvar, 'expanded'))
if compat_os_name == 'nt': if os.name == 'nt':
test('%s%', ('%s%', '%s%')) test('%s%', ('%s%', '%s%'))
os.environ['s'] = 'expanded' os.environ['s'] = 'expanded'
test('%s%', ('%s%', 'expanded')) # %s% should be expanded before escaping %s test('%s%', ('%s%', 'expanded')) # %s% should be expanded before escaping %s

View File

@@ -27,7 +27,6 @@ from yt_dlp.aes import (
pad_block, pad_block,
) )
from yt_dlp.dependencies import Cryptodome from yt_dlp.dependencies import Cryptodome
from yt_dlp.utils import bytes_to_intlist, intlist_to_bytes
# the encrypted data can be generate with 'devscripts/generate_aes_testdata.py' # the encrypted data can be generate with 'devscripts/generate_aes_testdata.py'
@@ -40,33 +39,33 @@ class TestAES(unittest.TestCase):
def test_encrypt(self): def test_encrypt(self):
msg = b'message' msg = b'message'
key = list(range(16)) key = list(range(16))
encrypted = aes_encrypt(bytes_to_intlist(msg), key) encrypted = aes_encrypt(list(msg), key)
decrypted = intlist_to_bytes(aes_decrypt(encrypted, key)) decrypted = bytes(aes_decrypt(encrypted, key))
self.assertEqual(decrypted, msg) self.assertEqual(decrypted, msg)
def test_cbc_decrypt(self): def test_cbc_decrypt(self):
data = b'\x97\x92+\xe5\x0b\xc3\x18\x91ky9m&\xb3\xb5@\xe6\x27\xc2\x96.\xc8u\x88\xab9-[\x9e|\xf1\xcd' data = b'\x97\x92+\xe5\x0b\xc3\x18\x91ky9m&\xb3\xb5@\xe6\x27\xc2\x96.\xc8u\x88\xab9-[\x9e|\xf1\xcd'
decrypted = intlist_to_bytes(aes_cbc_decrypt(bytes_to_intlist(data), self.key, self.iv)) decrypted = bytes(aes_cbc_decrypt(list(data), self.key, self.iv))
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg) self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)
if Cryptodome.AES: if Cryptodome.AES:
decrypted = aes_cbc_decrypt_bytes(data, intlist_to_bytes(self.key), intlist_to_bytes(self.iv)) decrypted = aes_cbc_decrypt_bytes(data, bytes(self.key), bytes(self.iv))
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg) self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)
def test_cbc_encrypt(self): def test_cbc_encrypt(self):
data = bytes_to_intlist(self.secret_msg) data = list(self.secret_msg)
encrypted = intlist_to_bytes(aes_cbc_encrypt(data, self.key, self.iv)) encrypted = bytes(aes_cbc_encrypt(data, self.key, self.iv))
self.assertEqual( self.assertEqual(
encrypted, encrypted,
b'\x97\x92+\xe5\x0b\xc3\x18\x91ky9m&\xb3\xb5@\xe6\'\xc2\x96.\xc8u\x88\xab9-[\x9e|\xf1\xcd') b'\x97\x92+\xe5\x0b\xc3\x18\x91ky9m&\xb3\xb5@\xe6\'\xc2\x96.\xc8u\x88\xab9-[\x9e|\xf1\xcd')
def test_ctr_decrypt(self): def test_ctr_decrypt(self):
data = bytes_to_intlist(b'\x03\xc7\xdd\xd4\x8e\xb3\xbc\x1a*O\xdc1\x12+8Aio\xd1z\xb5#\xaf\x08') data = list(b'\x03\xc7\xdd\xd4\x8e\xb3\xbc\x1a*O\xdc1\x12+8Aio\xd1z\xb5#\xaf\x08')
decrypted = intlist_to_bytes(aes_ctr_decrypt(data, self.key, self.iv)) decrypted = bytes(aes_ctr_decrypt(data, self.key, self.iv))
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg) self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)
def test_ctr_encrypt(self): def test_ctr_encrypt(self):
data = bytes_to_intlist(self.secret_msg) data = list(self.secret_msg)
encrypted = intlist_to_bytes(aes_ctr_encrypt(data, self.key, self.iv)) encrypted = bytes(aes_ctr_encrypt(data, self.key, self.iv))
self.assertEqual( self.assertEqual(
encrypted, encrypted,
b'\x03\xc7\xdd\xd4\x8e\xb3\xbc\x1a*O\xdc1\x12+8Aio\xd1z\xb5#\xaf\x08') b'\x03\xc7\xdd\xd4\x8e\xb3\xbc\x1a*O\xdc1\x12+8Aio\xd1z\xb5#\xaf\x08')
@@ -75,19 +74,19 @@ class TestAES(unittest.TestCase):
data = b'\x159Y\xcf5eud\x90\x9c\x85&]\x14\x1d\x0f.\x08\xb4T\xe4/\x17\xbd' data = b'\x159Y\xcf5eud\x90\x9c\x85&]\x14\x1d\x0f.\x08\xb4T\xe4/\x17\xbd'
authentication_tag = b'\xe8&I\x80rI\x07\x9d}YWuU@:e' authentication_tag = b'\xe8&I\x80rI\x07\x9d}YWuU@:e'
decrypted = intlist_to_bytes(aes_gcm_decrypt_and_verify( decrypted = bytes(aes_gcm_decrypt_and_verify(
bytes_to_intlist(data), self.key, bytes_to_intlist(authentication_tag), self.iv[:12])) list(data), self.key, list(authentication_tag), self.iv[:12]))
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg) self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)
if Cryptodome.AES: if Cryptodome.AES:
decrypted = aes_gcm_decrypt_and_verify_bytes( decrypted = aes_gcm_decrypt_and_verify_bytes(
data, intlist_to_bytes(self.key), authentication_tag, intlist_to_bytes(self.iv[:12])) data, bytes(self.key), authentication_tag, bytes(self.iv[:12]))
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg) self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)
def test_gcm_aligned_decrypt(self): def test_gcm_aligned_decrypt(self):
data = b'\x159Y\xcf5eud\x90\x9c\x85&]\x14\x1d\x0f' data = b'\x159Y\xcf5eud\x90\x9c\x85&]\x14\x1d\x0f'
authentication_tag = b'\x08\xb1\x9d!&\x98\xd0\xeaRq\x90\xe6;\xb5]\xd8' authentication_tag = b'\x08\xb1\x9d!&\x98\xd0\xeaRq\x90\xe6;\xb5]\xd8'
decrypted = intlist_to_bytes(aes_gcm_decrypt_and_verify( decrypted = bytes(aes_gcm_decrypt_and_verify(
list(data), self.key, list(authentication_tag), self.iv[:12])) list(data), self.key, list(authentication_tag), self.iv[:12]))
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg[:16]) self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg[:16])
if Cryptodome.AES: if Cryptodome.AES:
@@ -96,38 +95,38 @@ class TestAES(unittest.TestCase):
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg[:16]) self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg[:16])
def test_decrypt_text(self): def test_decrypt_text(self):
password = intlist_to_bytes(self.key).decode() password = bytes(self.key).decode()
encrypted = base64.b64encode( encrypted = base64.b64encode(
intlist_to_bytes(self.iv[:8]) bytes(self.iv[:8])
+ b'\x17\x15\x93\xab\x8d\x80V\xcdV\xe0\t\xcdo\xc2\xa5\xd8ksM\r\xe27N\xae', + b'\x17\x15\x93\xab\x8d\x80V\xcdV\xe0\t\xcdo\xc2\xa5\xd8ksM\r\xe27N\xae',
).decode() ).decode()
decrypted = (aes_decrypt_text(encrypted, password, 16)) decrypted = (aes_decrypt_text(encrypted, password, 16))
self.assertEqual(decrypted, self.secret_msg) self.assertEqual(decrypted, self.secret_msg)
password = intlist_to_bytes(self.key).decode() password = bytes(self.key).decode()
encrypted = base64.b64encode( encrypted = base64.b64encode(
intlist_to_bytes(self.iv[:8]) bytes(self.iv[:8])
+ b'\x0b\xe6\xa4\xd9z\x0e\xb8\xb9\xd0\xd4i_\x85\x1d\x99\x98_\xe5\x80\xe7.\xbf\xa5\x83', + b'\x0b\xe6\xa4\xd9z\x0e\xb8\xb9\xd0\xd4i_\x85\x1d\x99\x98_\xe5\x80\xe7.\xbf\xa5\x83',
).decode() ).decode()
decrypted = (aes_decrypt_text(encrypted, password, 32)) decrypted = (aes_decrypt_text(encrypted, password, 32))
self.assertEqual(decrypted, self.secret_msg) self.assertEqual(decrypted, self.secret_msg)
def test_ecb_encrypt(self): def test_ecb_encrypt(self):
data = bytes_to_intlist(self.secret_msg) data = list(self.secret_msg)
encrypted = intlist_to_bytes(aes_ecb_encrypt(data, self.key)) encrypted = bytes(aes_ecb_encrypt(data, self.key))
self.assertEqual( self.assertEqual(
encrypted, encrypted,
b'\xaa\x86]\x81\x97>\x02\x92\x9d\x1bR[[L/u\xd3&\xd1(h\xde{\x81\x94\xba\x02\xae\xbd\xa6\xd0:') b'\xaa\x86]\x81\x97>\x02\x92\x9d\x1bR[[L/u\xd3&\xd1(h\xde{\x81\x94\xba\x02\xae\xbd\xa6\xd0:')
def test_ecb_decrypt(self): def test_ecb_decrypt(self):
data = bytes_to_intlist(b'\xaa\x86]\x81\x97>\x02\x92\x9d\x1bR[[L/u\xd3&\xd1(h\xde{\x81\x94\xba\x02\xae\xbd\xa6\xd0:') data = list(b'\xaa\x86]\x81\x97>\x02\x92\x9d\x1bR[[L/u\xd3&\xd1(h\xde{\x81\x94\xba\x02\xae\xbd\xa6\xd0:')
decrypted = intlist_to_bytes(aes_ecb_decrypt(data, self.key, self.iv)) decrypted = bytes(aes_ecb_decrypt(data, self.key, self.iv))
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg) self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)
def test_key_expansion(self): def test_key_expansion(self):
key = '4f6bdaa39e2f8cb07f5e722d9edef314' key = '4f6bdaa39e2f8cb07f5e722d9edef314'
self.assertEqual(key_expansion(bytes_to_intlist(bytearray.fromhex(key))), [ self.assertEqual(key_expansion(list(bytearray.fromhex(key))), [
0x4F, 0x6B, 0xDA, 0xA3, 0x9E, 0x2F, 0x8C, 0xB0, 0x7F, 0x5E, 0x72, 0x2D, 0x9E, 0xDE, 0xF3, 0x14, 0x4F, 0x6B, 0xDA, 0xA3, 0x9E, 0x2F, 0x8C, 0xB0, 0x7F, 0x5E, 0x72, 0x2D, 0x9E, 0xDE, 0xF3, 0x14,
0x53, 0x66, 0x20, 0xA8, 0xCD, 0x49, 0xAC, 0x18, 0xB2, 0x17, 0xDE, 0x35, 0x2C, 0xC9, 0x2D, 0x21, 0x53, 0x66, 0x20, 0xA8, 0xCD, 0x49, 0xAC, 0x18, 0xB2, 0x17, 0xDE, 0x35, 0x2C, 0xC9, 0x2D, 0x21,
0x8C, 0xBE, 0xDD, 0xD9, 0x41, 0xF7, 0x71, 0xC1, 0xF3, 0xE0, 0xAF, 0xF4, 0xDF, 0x29, 0x82, 0xD5, 0x8C, 0xBE, 0xDD, 0xD9, 0x41, 0xF7, 0x71, 0xC1, 0xF3, 0xE0, 0xAF, 0xF4, 0xDF, 0x29, 0x82, 0xD5,

View File

@@ -12,12 +12,7 @@ import struct
from yt_dlp import compat from yt_dlp import compat
from yt_dlp.compat import urllib # isort: split from yt_dlp.compat import urllib # isort: split
from yt_dlp.compat import ( from yt_dlp.compat import compat_etree_fromstring, compat_expanduser
compat_etree_fromstring,
compat_expanduser,
compat_urllib_parse_unquote, # noqa: TID251
compat_urllib_parse_urlencode, # noqa: TID251
)
from yt_dlp.compat.urllib.request import getproxies from yt_dlp.compat.urllib.request import getproxies
@@ -43,39 +38,6 @@ class TestCompat(unittest.TestCase):
finally: finally:
os.environ['HOME'] = old_home or '' os.environ['HOME'] = old_home or ''
def test_compat_urllib_parse_unquote(self):
self.assertEqual(compat_urllib_parse_unquote('abc%20def'), 'abc def')
self.assertEqual(compat_urllib_parse_unquote('%7e/abc+def'), '~/abc+def')
self.assertEqual(compat_urllib_parse_unquote(''), '')
self.assertEqual(compat_urllib_parse_unquote('%'), '%')
self.assertEqual(compat_urllib_parse_unquote('%%'), '%%')
self.assertEqual(compat_urllib_parse_unquote('%%%'), '%%%')
self.assertEqual(compat_urllib_parse_unquote('%2F'), '/')
self.assertEqual(compat_urllib_parse_unquote('%2f'), '/')
self.assertEqual(compat_urllib_parse_unquote('%E6%B4%A5%E6%B3%A2'), '津波')
self.assertEqual(
compat_urllib_parse_unquote('''<meta property="og:description" content="%E2%96%81%E2%96%82%E2%96%83%E2%96%84%25%E2%96%85%E2%96%86%E2%96%87%E2%96%88" />
%<a href="https://ar.wikipedia.org/wiki/%D8%AA%D8%B3%D9%88%D9%86%D8%A7%D9%85%D9%8A">%a'''),
'''<meta property="og:description" content="▁▂▃▄%▅▆▇█" />
%<a href="https://ar.wikipedia.org/wiki/تسونامي">%a''')
self.assertEqual(
compat_urllib_parse_unquote('''%28%5E%E2%97%A3_%E2%97%A2%5E%29%E3%81%A3%EF%B8%BB%E3%83%87%E2%95%90%E4%B8%80 %E2%87%80 %E2%87%80 %E2%87%80 %E2%87%80 %E2%87%80 %E2%86%B6%I%Break%25Things%'''),
'''(^◣_◢^)っ︻デ═一 ⇀ ⇀ ⇀ ⇀ ⇀ ↶%I%Break%Things%''')
def test_compat_urllib_parse_unquote_plus(self):
self.assertEqual(urllib.parse.unquote_plus('abc%20def'), 'abc def')
self.assertEqual(urllib.parse.unquote_plus('%7e/abc+def'), '~/abc def')
def test_compat_urllib_parse_urlencode(self):
self.assertEqual(compat_urllib_parse_urlencode({'abc': 'def'}), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode({'abc': b'def'}), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode({b'abc': 'def'}), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode({b'abc': b'def'}), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode([('abc', 'def')]), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode([('abc', b'def')]), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode([(b'abc', 'def')]), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode([(b'abc', b'def')]), 'abc=def')
def test_compat_etree_fromstring(self): def test_compat_etree_fromstring(self):
xml = ''' xml = '''
<root foo="bar" spam="中文"> <root foo="bar" spam="中文">

View File

@@ -15,7 +15,6 @@ import threading
from test.helper import http_server_port, try_rm from test.helper import http_server_port, try_rm
from yt_dlp import YoutubeDL from yt_dlp import YoutubeDL
from yt_dlp.downloader.http import HttpFD from yt_dlp.downloader.http import HttpFD
from yt_dlp.utils import encodeFilename
from yt_dlp.utils._utils import _YDLLogger as FakeLogger from yt_dlp.utils._utils import _YDLLogger as FakeLogger
TEST_DIR = os.path.dirname(os.path.abspath(__file__)) TEST_DIR = os.path.dirname(os.path.abspath(__file__))
@@ -82,12 +81,12 @@ class TestHttpFD(unittest.TestCase):
ydl = YoutubeDL(params) ydl = YoutubeDL(params)
downloader = HttpFD(ydl, params) downloader = HttpFD(ydl, params)
filename = 'testfile.mp4' filename = 'testfile.mp4'
try_rm(encodeFilename(filename)) try_rm(filename)
self.assertTrue(downloader.real_download(filename, { self.assertTrue(downloader.real_download(filename, {
'url': f'http://127.0.0.1:{self.port}/{ep}', 'url': f'http://127.0.0.1:{self.port}/{ep}',
}), ep) }), ep)
self.assertEqual(os.path.getsize(encodeFilename(filename)), TEST_SIZE, ep) self.assertEqual(os.path.getsize(filename), TEST_SIZE, ep)
try_rm(encodeFilename(filename)) try_rm(filename)
def download_all(self, params): def download_all(self, params):
for ep in ('regular', 'no-content-length', 'no-range', 'no-range-no-content-length'): for ep in ('regular', 'no-content-length', 'no-range', 'no-range-no-content-length'):

View File

@@ -216,7 +216,9 @@ class SocksWebSocketTestRequestHandler(SocksTestRequestHandler):
protocol = websockets.ServerProtocol() protocol = websockets.ServerProtocol()
connection = websockets.sync.server.ServerConnection(socket=self.request, protocol=protocol, close_timeout=0) connection = websockets.sync.server.ServerConnection(socket=self.request, protocol=protocol, close_timeout=0)
connection.handshake() connection.handshake()
connection.send(json.dumps(self.socks_info)) for message in connection:
if message == 'socks_info':
connection.send(json.dumps(self.socks_info))
connection.close() connection.close()

View File

@@ -481,7 +481,7 @@ class TestTraversalHelpers:
'id': 'name', 'id': 'name',
'data': 'content', 'data': 'content',
'url': 'url', 'url': 'url',
}, all, {subs_list_to_dict}]) == { }, all, {subs_list_to_dict(lang=None)}]) == {
'de': [{'url': 'https://example.com/subs/de.ass'}], 'de': [{'url': 'https://example.com/subs/de.ass'}],
'en': [{'data': 'content'}], 'en': [{'data': 'content'}],
}, 'subs with mandatory items missing should be filtered' }, 'subs with mandatory items missing should be filtered'
@@ -507,6 +507,54 @@ class TestTraversalHelpers:
{'url': 'https://example.com/subs/en1', 'ext': 'ext'}, {'url': 'https://example.com/subs/en1', 'ext': 'ext'},
{'url': 'https://example.com/subs/en2', 'ext': 'ext'}, {'url': 'https://example.com/subs/en2', 'ext': 'ext'},
]}, '`quality` key should sort subtitle list accordingly' ]}, '`quality` key should sort subtitle list accordingly'
assert traverse_obj([
{'name': 'de', 'url': 'https://example.com/subs/de.ass'},
{'name': 'de'},
{'name': 'en', 'content': 'content'},
{'url': 'https://example.com/subs/en'},
], [..., {
'id': 'name',
'url': 'url',
'data': 'content',
}, all, {subs_list_to_dict(lang='en')}]) == {
'de': [{'url': 'https://example.com/subs/de.ass'}],
'en': [
{'data': 'content'},
{'url': 'https://example.com/subs/en'},
],
}, 'optionally provided lang should be used if no id available'
assert traverse_obj([
{'name': 1, 'url': 'https://example.com/subs/de1'},
{'name': {}, 'url': 'https://example.com/subs/de2'},
{'name': 'de', 'ext': 1, 'url': 'https://example.com/subs/de3'},
{'name': 'de', 'ext': {}, 'url': 'https://example.com/subs/de4'},
], [..., {
'id': 'name',
'url': 'url',
'ext': 'ext',
}, all, {subs_list_to_dict(lang=None)}]) == {
'de': [
{'url': 'https://example.com/subs/de3'},
{'url': 'https://example.com/subs/de4'},
],
}, 'non str types should be ignored for id and ext'
assert traverse_obj([
{'name': 1, 'url': 'https://example.com/subs/de1'},
{'name': {}, 'url': 'https://example.com/subs/de2'},
{'name': 'de', 'ext': 1, 'url': 'https://example.com/subs/de3'},
{'name': 'de', 'ext': {}, 'url': 'https://example.com/subs/de4'},
], [..., {
'id': 'name',
'url': 'url',
'ext': 'ext',
}, all, {subs_list_to_dict(lang='de')}]) == {
'de': [
{'url': 'https://example.com/subs/de1'},
{'url': 'https://example.com/subs/de2'},
{'url': 'https://example.com/subs/de3'},
{'url': 'https://example.com/subs/de4'},
],
}, 'non str types should be replaced by default id'
def test_trim_str(self): def test_trim_str(self):
with pytest.raises(TypeError): with pytest.raises(TypeError):
@@ -525,7 +573,7 @@ class TestTraversalHelpers:
def test_unpack(self): def test_unpack(self):
assert unpack(lambda *x: ''.join(map(str, x)))([1, 2, 3]) == '123' assert unpack(lambda *x: ''.join(map(str, x)))([1, 2, 3]) == '123'
assert unpack(join_nonempty)([1, 2, 3]) == '1-2-3' assert unpack(join_nonempty)([1, 2, 3]) == '1-2-3'
assert unpack(join_nonempty(delim=' '))([1, 2, 3]) == '1 2 3' assert unpack(join_nonempty, delim=' ')([1, 2, 3]) == '1 2 3'
with pytest.raises(TypeError): with pytest.raises(TypeError):
unpack(join_nonempty)() unpack(join_nonempty)()
with pytest.raises(TypeError): with pytest.raises(TypeError):

View File

@@ -21,7 +21,6 @@ import xml.etree.ElementTree
from yt_dlp.compat import ( from yt_dlp.compat import (
compat_etree_fromstring, compat_etree_fromstring,
compat_HTMLParseError, compat_HTMLParseError,
compat_os_name,
) )
from yt_dlp.utils import ( from yt_dlp.utils import (
Config, Config,
@@ -49,7 +48,6 @@ from yt_dlp.utils import (
dfxp2srt, dfxp2srt,
encode_base_n, encode_base_n,
encode_compat_str, encode_compat_str,
encodeFilename,
expand_path, expand_path,
extract_attributes, extract_attributes,
extract_basic_auth, extract_basic_auth,
@@ -69,10 +67,8 @@ from yt_dlp.utils import (
get_elements_html_by_class, get_elements_html_by_class,
get_elements_text_and_html_by_attribute, get_elements_text_and_html_by_attribute,
int_or_none, int_or_none,
intlist_to_bytes,
iri_to_uri, iri_to_uri,
is_html, is_html,
join_nonempty,
js_to_json, js_to_json,
limit_length, limit_length,
locked_file, locked_file,
@@ -253,17 +249,36 @@ class TestUtil(unittest.TestCase):
self.assertEqual(sanitize_path('abc/def...'), 'abc\\def..#') self.assertEqual(sanitize_path('abc/def...'), 'abc\\def..#')
self.assertEqual(sanitize_path('abc.../def'), 'abc..#\\def') self.assertEqual(sanitize_path('abc.../def'), 'abc..#\\def')
self.assertEqual(sanitize_path('abc.../def...'), 'abc..#\\def..#') self.assertEqual(sanitize_path('abc.../def...'), 'abc..#\\def..#')
self.assertEqual(sanitize_path('../abc'), '..\\abc')
self.assertEqual(sanitize_path('../../abc'), '..\\..\\abc')
self.assertEqual(sanitize_path('./abc'), 'abc')
self.assertEqual(sanitize_path('./../abc'), '..\\abc')
self.assertEqual(sanitize_path('\\abc'), '\\abc')
self.assertEqual(sanitize_path('C:abc'), 'C:abc')
self.assertEqual(sanitize_path('C:abc\\..\\'), 'C:..')
self.assertEqual(sanitize_path('C:\\abc:%(title)s.%(ext)s'), 'C:\\abc#%(title)s.%(ext)s') self.assertEqual(sanitize_path('C:\\abc:%(title)s.%(ext)s'), 'C:\\abc#%(title)s.%(ext)s')
# Check with nt._path_normpath if available
try:
import nt
nt_path_normpath = getattr(nt, '_path_normpath', None)
except Exception:
nt_path_normpath = None
for test, expected in [
('C:\\', 'C:\\'),
('../abc', '..\\abc'),
('../../abc', '..\\..\\abc'),
('./abc', 'abc'),
('./../abc', '..\\abc'),
('\\abc', '\\abc'),
('C:abc', 'C:abc'),
('C:abc\\..\\', 'C:'),
('C:abc\\..\\def\\..\\..\\', 'C:..'),
('C:\\abc\\xyz///..\\def\\', 'C:\\abc\\def'),
('abc/../', '.'),
('./abc/../', '.'),
]:
result = sanitize_path(test)
assert result == expected, f'{test} was incorrectly resolved'
assert result == sanitize_path(result), f'{test} changed after sanitizing again'
if nt_path_normpath:
assert result == nt_path_normpath(test), f'{test} does not match nt._path_normpath'
def test_sanitize_url(self): def test_sanitize_url(self):
self.assertEqual(sanitize_url('//foo.bar'), 'http://foo.bar') self.assertEqual(sanitize_url('//foo.bar'), 'http://foo.bar')
self.assertEqual(sanitize_url('httpss://foo.bar'), 'https://foo.bar') self.assertEqual(sanitize_url('httpss://foo.bar'), 'https://foo.bar')
@@ -567,10 +582,10 @@ class TestUtil(unittest.TestCase):
self.assertEqual(res_data, {'a': 'b', 'c': 'd'}) self.assertEqual(res_data, {'a': 'b', 'c': 'd'})
def test_shell_quote(self): def test_shell_quote(self):
args = ['ffmpeg', '-i', encodeFilename('ñ€ß\'.mp4')] args = ['ffmpeg', '-i', 'ñ€ß\'.mp4']
self.assertEqual( self.assertEqual(
shell_quote(args), shell_quote(args),
"""ffmpeg -i 'ñ€ß'"'"'.mp4'""" if compat_os_name != 'nt' else '''ffmpeg -i "ñ€ß'.mp4"''') """ffmpeg -i 'ñ€ß'"'"'.mp4'""" if os.name != 'nt' else '''ffmpeg -i "ñ€ß'.mp4"''')
def test_float_or_none(self): def test_float_or_none(self):
self.assertEqual(float_or_none('42.42'), 42.42) self.assertEqual(float_or_none('42.42'), 42.42)
@@ -1310,15 +1325,10 @@ class TestUtil(unittest.TestCase):
self.assertEqual(clean_html('a:\n "b"'), 'a: "b"') self.assertEqual(clean_html('a:\n "b"'), 'a: "b"')
self.assertEqual(clean_html('a<br>\xa0b'), 'a\nb') self.assertEqual(clean_html('a<br>\xa0b'), 'a\nb')
def test_intlist_to_bytes(self):
self.assertEqual(
intlist_to_bytes([0, 1, 127, 128, 255]),
b'\x00\x01\x7f\x80\xff')
def test_args_to_str(self): def test_args_to_str(self):
self.assertEqual( self.assertEqual(
args_to_str(['foo', 'ba/r', '-baz', '2 be', '']), args_to_str(['foo', 'ba/r', '-baz', '2 be', '']),
'foo ba/r -baz \'2 be\' \'\'' if compat_os_name != 'nt' else 'foo ba/r -baz "2 be" ""', 'foo ba/r -baz \'2 be\' \'\'' if os.name != 'nt' else 'foo ba/r -baz "2 be" ""',
) )
def test_parse_filesize(self): def test_parse_filesize(self):
@@ -2118,7 +2128,7 @@ Line 1
assert extract_basic_auth('http://user:@foo.bar') == ('http://foo.bar', 'Basic dXNlcjo=') assert extract_basic_auth('http://user:@foo.bar') == ('http://foo.bar', 'Basic dXNlcjo=')
assert extract_basic_auth('http://user:pass@foo.bar') == ('http://foo.bar', 'Basic dXNlcjpwYXNz') assert extract_basic_auth('http://user:pass@foo.bar') == ('http://foo.bar', 'Basic dXNlcjpwYXNz')
@unittest.skipUnless(compat_os_name == 'nt', 'Only relevant on Windows') @unittest.skipUnless(os.name == 'nt', 'Only relevant on Windows')
def test_windows_escaping(self): def test_windows_escaping(self):
tests = [ tests = [
'test"&', 'test"&',
@@ -2158,10 +2168,6 @@ Line 1
assert int_or_none(v=10) == 10, 'keyword passed positional should call function' assert int_or_none(v=10) == 10, 'keyword passed positional should call function'
assert int_or_none(scale=0.1)(10) == 100, 'call after partial application should call the function' assert int_or_none(scale=0.1)(10) == 100, 'call after partial application should call the function'
assert callable(join_nonempty(delim=', ')), 'varargs positional should apply partially'
assert callable(join_nonempty()), 'varargs positional should apply partially'
assert join_nonempty(None, delim=', ') == '', 'passed varargs should call the function'
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()

View File

@@ -68,6 +68,16 @@ _SIG_TESTS = [
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA', '2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'AOq0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL2QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0', 'AOq0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL2QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
), ),
(
'https://www.youtube.com/s/player/3bb1f723/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'MyOSJXtKI3m-uME_jv7-pT12gOFC02RFkGoqWpzE0Cs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
),
(
'https://www.youtube.com/s/player/2f1832d2/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xxAj7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJ2OySqa0q',
),
] ]
_NSIG_TESTS = [ _NSIG_TESTS = [
@@ -183,6 +193,14 @@ _NSIG_TESTS = [
'https://www.youtube.com/s/player/b12cc44b/player_ias.vflset/en_US/base.js', 'https://www.youtube.com/s/player/b12cc44b/player_ias.vflset/en_US/base.js',
'keLa5R2U00sR9SQK', 'N1OGyujjEwMnLw', 'keLa5R2U00sR9SQK', 'N1OGyujjEwMnLw',
), ),
(
'https://www.youtube.com/s/player/3bb1f723/player_ias.vflset/en_US/base.js',
'gK15nzVyaXE9RsMP3z', 'ZFFWFLPWx9DEgQ',
),
(
'https://www.youtube.com/s/player/2f1832d2/player_ias.vflset/en_US/base.js',
'YWt1qdbe8SAfkoPHW5d', 'RrRjWQOJmBiP',
),
] ]
@@ -254,8 +272,11 @@ def signature(jscode, sig_input):
def n_sig(jscode, sig_input): def n_sig(jscode, sig_input):
funcname = YoutubeIE(FakeYDL())._extract_n_function_name(jscode) ie = YoutubeIE(FakeYDL())
return JSInterpreter(jscode).call_function(funcname, sig_input) funcname = ie._extract_n_function_name(jscode)
jsi = JSInterpreter(jscode)
func = jsi.extract_function_from_code(*ie._fixup_n_function_code(*jsi.extract_function_code(funcname)))
return func([sig_input])
make_sig_test = t_factory( make_sig_test = t_factory(

View File

@@ -26,7 +26,7 @@ import unicodedata
from .cache import Cache from .cache import Cache
from .compat import urllib # isort: split from .compat import urllib # isort: split
from .compat import compat_os_name, urllib_req_to_req from .compat import urllib_req_to_req
from .cookies import CookieLoadError, LenientSimpleCookie, load_cookies from .cookies import CookieLoadError, LenientSimpleCookie, load_cookies
from .downloader import FFmpegFD, get_suitable_downloader, shorten_protocol_name from .downloader import FFmpegFD, get_suitable_downloader, shorten_protocol_name
from .downloader.rtmp import rtmpdump_version from .downloader.rtmp import rtmpdump_version
@@ -109,7 +109,6 @@ from .utils import (
determine_ext, determine_ext,
determine_protocol, determine_protocol,
encode_compat_str, encode_compat_str,
encodeFilename,
escapeHTML, escapeHTML,
expand_path, expand_path,
extract_basic_auth, extract_basic_auth,
@@ -167,7 +166,7 @@ from .utils.networking import (
) )
from .version import CHANNEL, ORIGIN, RELEASE_GIT_HEAD, VARIANT, __version__ from .version import CHANNEL, ORIGIN, RELEASE_GIT_HEAD, VARIANT, __version__
if compat_os_name == 'nt': if os.name == 'nt':
import ctypes import ctypes
@@ -267,7 +266,9 @@ class YoutubeDL:
outtmpl_na_placeholder: Placeholder for unavailable meta fields. outtmpl_na_placeholder: Placeholder for unavailable meta fields.
restrictfilenames: Do not allow "&" and spaces in file names restrictfilenames: Do not allow "&" and spaces in file names
trim_file_name: Limit length of filename (extension excluded) trim_file_name: Limit length of filename (extension excluded)
windowsfilenames: Force the filenames to be windows compatible windowsfilenames: True: Force filenames to be Windows compatible
False: Sanitize filenames only minimally
This option has no effect when running on Windows
ignoreerrors: Do not stop on download/postprocessing errors. ignoreerrors: Do not stop on download/postprocessing errors.
Can be 'only_download' to ignore only download errors. Can be 'only_download' to ignore only download errors.
Default is 'only_download' for CLI, but False for API Default is 'only_download' for CLI, but False for API
@@ -282,7 +283,10 @@ class YoutubeDL:
lazy_playlist: Process playlist entries as they are received. lazy_playlist: Process playlist entries as they are received.
matchtitle: Download only matching titles. matchtitle: Download only matching titles.
rejecttitle: Reject downloads for matching titles. rejecttitle: Reject downloads for matching titles.
logger: Log messages to a logging.Logger instance. logger: A class having a `debug`, `warning` and `error` function where
each has a single string parameter, the message to be logged.
For compatibility reasons, both debug and info messages are passed to `debug`.
A debug message will have a prefix of `[debug] ` to discern it from info messages.
logtostderr: Print everything to stderr instead of stdout. logtostderr: Print everything to stderr instead of stdout.
consoletitle: Display progress in the console window's titlebar. consoletitle: Display progress in the console window's titlebar.
writedescription: Write the video description to a .description file writedescription: Write the video description to a .description file
@@ -643,7 +647,7 @@ class YoutubeDL:
out=stdout, out=stdout,
error=sys.stderr, error=sys.stderr,
screen=sys.stderr if self.params.get('quiet') else stdout, screen=sys.stderr if self.params.get('quiet') else stdout,
console=None if compat_os_name == 'nt' else next( console=None if os.name == 'nt' else next(
filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None), filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None),
) )
@@ -952,7 +956,7 @@ class YoutubeDL:
self._write_string(f'{self._bidi_workaround(message)}\n', self._out_files.error, only_once=only_once) self._write_string(f'{self._bidi_workaround(message)}\n', self._out_files.error, only_once=only_once)
def _send_console_code(self, code): def _send_console_code(self, code):
if compat_os_name == 'nt' or not self._out_files.console: if os.name == 'nt' or not self._out_files.console:
return return
self._write_string(code, self._out_files.console) self._write_string(code, self._out_files.console)
@@ -960,7 +964,7 @@ class YoutubeDL:
if not self.params.get('consoletitle', False): if not self.params.get('consoletitle', False):
return return
message = remove_terminal_sequences(message) message = remove_terminal_sequences(message)
if compat_os_name == 'nt': if os.name == 'nt':
if ctypes.windll.kernel32.GetConsoleWindow(): if ctypes.windll.kernel32.GetConsoleWindow():
# c_wchar_p() might not be necessary if `message` is # c_wchar_p() might not be necessary if `message` is
# already of type unicode() # already of type unicode()
@@ -1117,7 +1121,7 @@ class YoutubeDL:
def raise_no_formats(self, info, forced=False, *, msg=None): def raise_no_formats(self, info, forced=False, *, msg=None):
has_drm = info.get('_has_drm') has_drm = info.get('_has_drm')
ignored, expected = self.params.get('ignore_no_formats_error'), bool(msg) ignored, expected = self.params.get('ignore_no_formats_error'), bool(msg)
msg = msg or has_drm and 'This video is DRM protected' or 'No video formats found!' msg = msg or (has_drm and 'This video is DRM protected') or 'No video formats found!'
if forced or not ignored: if forced or not ignored:
raise ExtractorError(msg, video_id=info['id'], ie=info['extractor'], raise ExtractorError(msg, video_id=info['id'], ie=info['extractor'],
expected=has_drm or ignored or expected) expected=has_drm or ignored or expected)
@@ -1193,8 +1197,7 @@ class YoutubeDL:
def prepare_outtmpl(self, outtmpl, info_dict, sanitize=False): def prepare_outtmpl(self, outtmpl, info_dict, sanitize=False):
""" Make the outtmpl and info_dict suitable for substitution: ydl.escape_outtmpl(outtmpl) % info_dict """ Make the outtmpl and info_dict suitable for substitution: ydl.escape_outtmpl(outtmpl) % info_dict
@param sanitize Whether to sanitize the output as a filename. @param sanitize Whether to sanitize the output as a filename
For backward compatibility, a function can also be passed
""" """
info_dict.setdefault('epoch', int(time.time())) # keep epoch consistent once set info_dict.setdefault('epoch', int(time.time())) # keep epoch consistent once set
@@ -1310,14 +1313,23 @@ class YoutubeDL:
na = self.params.get('outtmpl_na_placeholder', 'NA') na = self.params.get('outtmpl_na_placeholder', 'NA')
def filename_sanitizer(key, value, restricted=self.params.get('restrictfilenames')): def filename_sanitizer(key, value, restricted):
return sanitize_filename(str(value), restricted=restricted, is_id=( return sanitize_filename(str(value), restricted=restricted, is_id=(
bool(re.search(r'(^|[_.])id(\.|$)', key)) bool(re.search(r'(^|[_.])id(\.|$)', key))
if 'filename-sanitization' in self.params['compat_opts'] if 'filename-sanitization' in self.params['compat_opts']
else NO_DEFAULT)) else NO_DEFAULT))
sanitizer = sanitize if callable(sanitize) else filename_sanitizer if callable(sanitize):
sanitize = bool(sanitize) self.deprecation_warning('Passing a callable "sanitize" to YoutubeDL.prepare_outtmpl is deprecated')
elif not sanitize:
pass
elif (sys.platform != 'win32' and not self.params.get('restrictfilenames')
and self.params.get('windowsfilenames') is False):
def sanitize(key, value):
return str(value).replace('/', '\u29F8').replace('\0', '')
else:
def sanitize(key, value):
return filename_sanitizer(key, value, restricted=self.params.get('restrictfilenames'))
def _dumpjson_default(obj): def _dumpjson_default(obj):
if isinstance(obj, (set, LazyList)): if isinstance(obj, (set, LazyList)):
@@ -1400,13 +1412,13 @@ class YoutubeDL:
if sanitize: if sanitize:
# If value is an object, sanitize might convert it to a string # If value is an object, sanitize might convert it to a string
# So we convert it to repr first # So we manually convert it before sanitizing
if fmt[-1] == 'r': if fmt[-1] == 'r':
value, fmt = repr(value), str_fmt value, fmt = repr(value), str_fmt
elif fmt[-1] == 'a': elif fmt[-1] == 'a':
value, fmt = ascii(value), str_fmt value, fmt = ascii(value), str_fmt
if fmt[-1] in 'csra': if fmt[-1] in 'csra':
value = sanitizer(last_field, value) value = sanitize(last_field, value)
key = '{}\0{}'.format(key.replace('%', '%\0'), outer_mobj.group('format')) key = '{}\0{}'.format(key.replace('%', '%\0'), outer_mobj.group('format'))
TMPL_DICT[key] = value TMPL_DICT[key] = value
@@ -1948,6 +1960,7 @@ class YoutubeDL:
'playlist_uploader_id': ie_result.get('uploader_id'), 'playlist_uploader_id': ie_result.get('uploader_id'),
'playlist_channel': ie_result.get('channel'), 'playlist_channel': ie_result.get('channel'),
'playlist_channel_id': ie_result.get('channel_id'), 'playlist_channel_id': ie_result.get('channel_id'),
'playlist_webpage_url': ie_result.get('webpage_url'),
**kwargs, **kwargs,
} }
if strict: if strict:
@@ -2108,7 +2121,7 @@ class YoutubeDL:
m = operator_rex.fullmatch(filter_spec) m = operator_rex.fullmatch(filter_spec)
if m: if m:
try: try:
comparison_value = int(m.group('value')) comparison_value = float(m.group('value'))
except ValueError: except ValueError:
comparison_value = parse_filesize(m.group('value')) comparison_value = parse_filesize(m.group('value'))
if comparison_value is None: if comparison_value is None:
@@ -2196,7 +2209,7 @@ class YoutubeDL:
def _default_format_spec(self, info_dict): def _default_format_spec(self, info_dict):
prefer_best = ( prefer_best = (
self.params['outtmpl']['default'] == '-' self.params['outtmpl']['default'] == '-'
or info_dict.get('is_live') and not self.params.get('live_from_start')) or (info_dict.get('is_live') and not self.params.get('live_from_start')))
def can_merge(): def can_merge():
merger = FFmpegMergerPP(self) merger = FFmpegMergerPP(self)
@@ -2365,7 +2378,7 @@ class YoutubeDL:
vexts=[f['ext'] for f in video_fmts], vexts=[f['ext'] for f in video_fmts],
aexts=[f['ext'] for f in audio_fmts], aexts=[f['ext'] for f in audio_fmts],
preferences=(try_call(lambda: self.params['merge_output_format'].split('/')) preferences=(try_call(lambda: self.params['merge_output_format'].split('/'))
or self.params.get('prefer_free_formats') and ('webm', 'mkv'))) or (self.params.get('prefer_free_formats') and ('webm', 'mkv'))))
filtered = lambda *keys: filter(None, (traverse_obj(fmt, *keys) for fmt in formats_info)) filtered = lambda *keys: filter(None, (traverse_obj(fmt, *keys) for fmt in formats_info))
@@ -3255,9 +3268,9 @@ class YoutubeDL:
if full_filename is None: if full_filename is None:
return return
if not self._ensure_dir_exists(encodeFilename(full_filename)): if not self._ensure_dir_exists(full_filename):
return return
if not self._ensure_dir_exists(encodeFilename(temp_filename)): if not self._ensure_dir_exists(temp_filename):
return return
if self._write_description('video', info_dict, if self._write_description('video', info_dict,
@@ -3289,16 +3302,16 @@ class YoutubeDL:
if self.params.get('writeannotations', False): if self.params.get('writeannotations', False):
annofn = self.prepare_filename(info_dict, 'annotation') annofn = self.prepare_filename(info_dict, 'annotation')
if annofn: if annofn:
if not self._ensure_dir_exists(encodeFilename(annofn)): if not self._ensure_dir_exists(annofn):
return return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(annofn)): if not self.params.get('overwrites', True) and os.path.exists(annofn):
self.to_screen('[info] Video annotations are already present') self.to_screen('[info] Video annotations are already present')
elif not info_dict.get('annotations'): elif not info_dict.get('annotations'):
self.report_warning('There are no annotations to write.') self.report_warning('There are no annotations to write.')
else: else:
try: try:
self.to_screen('[info] Writing video annotations to: ' + annofn) self.to_screen('[info] Writing video annotations to: ' + annofn)
with open(encodeFilename(annofn), 'w', encoding='utf-8') as annofile: with open(annofn, 'w', encoding='utf-8') as annofile:
annofile.write(info_dict['annotations']) annofile.write(info_dict['annotations'])
except (KeyError, TypeError): except (KeyError, TypeError):
self.report_warning('There are no annotations to write.') self.report_warning('There are no annotations to write.')
@@ -3314,14 +3327,14 @@ class YoutubeDL:
f'Cannot write internet shortcut file because the actual URL of "{info_dict["webpage_url"]}" is unknown') f'Cannot write internet shortcut file because the actual URL of "{info_dict["webpage_url"]}" is unknown')
return True return True
linkfn = replace_extension(self.prepare_filename(info_dict, 'link'), link_type, info_dict.get('ext')) linkfn = replace_extension(self.prepare_filename(info_dict, 'link'), link_type, info_dict.get('ext'))
if not self._ensure_dir_exists(encodeFilename(linkfn)): if not self._ensure_dir_exists(linkfn):
return False return False
if self.params.get('overwrites', True) and os.path.exists(encodeFilename(linkfn)): if self.params.get('overwrites', True) and os.path.exists(linkfn):
self.to_screen(f'[info] Internet shortcut (.{link_type}) is already present') self.to_screen(f'[info] Internet shortcut (.{link_type}) is already present')
return True return True
try: try:
self.to_screen(f'[info] Writing internet shortcut (.{link_type}) to: {linkfn}') self.to_screen(f'[info] Writing internet shortcut (.{link_type}) to: {linkfn}')
with open(encodeFilename(to_high_limit_path(linkfn)), 'w', encoding='utf-8', with open(to_high_limit_path(linkfn), 'w', encoding='utf-8',
newline='\r\n' if link_type == 'url' else '\n') as linkfile: newline='\r\n' if link_type == 'url' else '\n') as linkfile:
template_vars = {'url': url} template_vars = {'url': url}
if link_type == 'desktop': if link_type == 'desktop':
@@ -3352,7 +3365,7 @@ class YoutubeDL:
if self.params.get('skip_download'): if self.params.get('skip_download'):
info_dict['filepath'] = temp_filename info_dict['filepath'] = temp_filename
info_dict['__finaldir'] = os.path.dirname(os.path.abspath(encodeFilename(full_filename))) info_dict['__finaldir'] = os.path.dirname(os.path.abspath(full_filename))
info_dict['__files_to_move'] = files_to_move info_dict['__files_to_move'] = files_to_move
replace_info_dict(self.run_pp(MoveFilesAfterDownloadPP(self, False), info_dict)) replace_info_dict(self.run_pp(MoveFilesAfterDownloadPP(self, False), info_dict))
info_dict['__write_download_archive'] = self.params.get('force_write_download_archive') info_dict['__write_download_archive'] = self.params.get('force_write_download_archive')
@@ -3482,7 +3495,7 @@ class YoutubeDL:
self.report_file_already_downloaded(dl_filename) self.report_file_already_downloaded(dl_filename)
dl_filename = dl_filename or temp_filename dl_filename = dl_filename or temp_filename
info_dict['__finaldir'] = os.path.dirname(os.path.abspath(encodeFilename(full_filename))) info_dict['__finaldir'] = os.path.dirname(os.path.abspath(full_filename))
except network_exceptions as err: except network_exceptions as err:
self.report_error(f'unable to download video data: {err}') self.report_error(f'unable to download video data: {err}')
@@ -3541,8 +3554,8 @@ class YoutubeDL:
and info_dict.get('container') == 'm4a_dash', and info_dict.get('container') == 'm4a_dash',
'writing DASH m4a. Only some players support this container', 'writing DASH m4a. Only some players support this container',
FFmpegFixupM4aPP) FFmpegFixupM4aPP)
ffmpeg_fixup(downloader == 'hlsnative' and not self.params.get('hls_use_mpegts') ffmpeg_fixup((downloader == 'hlsnative' and not self.params.get('hls_use_mpegts'))
or info_dict.get('is_live') and self.params.get('hls_use_mpegts') is None, or (info_dict.get('is_live') and self.params.get('hls_use_mpegts') is None),
'Possible MPEG-TS in MP4 container or malformed AAC timestamps', 'Possible MPEG-TS in MP4 container or malformed AAC timestamps',
FFmpegFixupM3u8PP) FFmpegFixupM3u8PP)
ffmpeg_fixup(downloader == 'dashsegments' ffmpeg_fixup(downloader == 'dashsegments'
@@ -4297,7 +4310,7 @@ class YoutubeDL:
else: else:
try: try:
self.to_screen(f'[info] Writing {label} description to: {descfn}') self.to_screen(f'[info] Writing {label} description to: {descfn}')
with open(encodeFilename(descfn), 'w', encoding='utf-8') as descfile: with open(descfn, 'w', encoding='utf-8') as descfile:
descfile.write(ie_result['description']) descfile.write(ie_result['description'])
except OSError: except OSError:
self.report_error(f'Cannot write {label} description file {descfn}') self.report_error(f'Cannot write {label} description file {descfn}')
@@ -4381,7 +4394,9 @@ class YoutubeDL:
return None return None
for idx, t in list(enumerate(thumbnails))[::-1]: for idx, t in list(enumerate(thumbnails))[::-1]:
thumb_ext = (f'{t["id"]}.' if multiple else '') + determine_ext(t['url'], 'jpg') thumb_ext = t.get('ext') or determine_ext(t['url'], 'jpg')
if multiple:
thumb_ext = f'{t["id"]}.{thumb_ext}'
thumb_display_id = f'{label} thumbnail {t["id"]}' thumb_display_id = f'{label} thumbnail {t["id"]}'
thumb_filename = replace_extension(filename, thumb_ext, info_dict.get('ext')) thumb_filename = replace_extension(filename, thumb_ext, info_dict.get('ext'))
thumb_filename_final = replace_extension(thumb_filename_base, thumb_ext, info_dict.get('ext')) thumb_filename_final = replace_extension(thumb_filename_base, thumb_ext, info_dict.get('ext'))
@@ -4397,7 +4412,7 @@ class YoutubeDL:
try: try:
uf = self.urlopen(Request(t['url'], headers=t.get('http_headers', {}))) uf = self.urlopen(Request(t['url'], headers=t.get('http_headers', {})))
self.to_screen(f'[info] Writing {thumb_display_id} to: {thumb_filename}') self.to_screen(f'[info] Writing {thumb_display_id} to: {thumb_filename}')
with open(encodeFilename(thumb_filename), 'wb') as thumbf: with open(thumb_filename, 'wb') as thumbf:
shutil.copyfileobj(uf, thumbf) shutil.copyfileobj(uf, thumbf)
ret.append((thumb_filename, thumb_filename_final)) ret.append((thumb_filename, thumb_filename_final))
t['filepath'] = thumb_filename t['filepath'] = thumb_filename

View File

@@ -14,7 +14,6 @@ import os
import re import re
import traceback import traceback
from .compat import compat_os_name
from .cookies import SUPPORTED_BROWSERS, SUPPORTED_KEYRINGS, CookieLoadError from .cookies import SUPPORTED_BROWSERS, SUPPORTED_KEYRINGS, CookieLoadError
from .downloader.external import get_external_downloader from .downloader.external import get_external_downloader
from .extractor import list_extractor_classes from .extractor import list_extractor_classes
@@ -44,7 +43,6 @@ from .utils import (
GeoUtils, GeoUtils,
PlaylistEntries, PlaylistEntries,
SameFileError, SameFileError,
decodeOption,
download_range_func, download_range_func,
expand_path, expand_path,
float_or_none, float_or_none,
@@ -263,9 +261,11 @@ def validate_options(opts):
elif value in ('inf', 'infinite'): elif value in ('inf', 'infinite'):
return float('inf') return float('inf')
try: try:
return int(value) int_value = int(value)
except (TypeError, ValueError): except (TypeError, ValueError):
validate(False, f'{name} retry count', value) validate(False, f'{name} retry count', value)
validate_positive(f'{name} retry count', int_value)
return int_value
opts.retries = parse_retries('download', opts.retries) opts.retries = parse_retries('download', opts.retries)
opts.fragment_retries = parse_retries('fragment', opts.fragment_retries) opts.fragment_retries = parse_retries('fragment', opts.fragment_retries)
@@ -883,8 +883,8 @@ def parse_options(argv=None):
'listsubtitles': opts.listsubtitles, 'listsubtitles': opts.listsubtitles,
'subtitlesformat': opts.subtitlesformat, 'subtitlesformat': opts.subtitlesformat,
'subtitleslangs': opts.subtitleslangs, 'subtitleslangs': opts.subtitleslangs,
'matchtitle': decodeOption(opts.matchtitle), 'matchtitle': opts.matchtitle,
'rejecttitle': decodeOption(opts.rejecttitle), 'rejecttitle': opts.rejecttitle,
'max_downloads': opts.max_downloads, 'max_downloads': opts.max_downloads,
'prefer_free_formats': opts.prefer_free_formats, 'prefer_free_formats': opts.prefer_free_formats,
'trim_file_name': opts.trim_file_name, 'trim_file_name': opts.trim_file_name,
@@ -1053,7 +1053,7 @@ def _real_main(argv=None):
ydl.warn_if_short_id(args) ydl.warn_if_short_id(args)
# Show a useful error message and wait for keypress if not launched from shell on Windows # Show a useful error message and wait for keypress if not launched from shell on Windows
if not args and compat_os_name == 'nt' and getattr(sys, 'frozen', False): if not args and os.name == 'nt' and getattr(sys, 'frozen', False):
import ctypes.wintypes import ctypes.wintypes
import msvcrt import msvcrt
@@ -1064,7 +1064,7 @@ def _real_main(argv=None):
# If we only have a single process attached, then the executable was double clicked # If we only have a single process attached, then the executable was double clicked
# When using `pyinstaller` with `--onefile`, two processes get attached # When using `pyinstaller` with `--onefile`, two processes get attached
is_onefile = hasattr(sys, '_MEIPASS') and os.path.basename(sys._MEIPASS).startswith('_MEI') is_onefile = hasattr(sys, '_MEIPASS') and os.path.basename(sys._MEIPASS).startswith('_MEI')
if attached_processes == 1 or is_onefile and attached_processes == 2: if attached_processes == 1 or (is_onefile and attached_processes == 2):
print(parser._generate_error_message( print(parser._generate_error_message(
'Do not double-click the executable, instead call it from a command line.\n' 'Do not double-click the executable, instead call it from a command line.\n'
'Please read the README for further information on how to use yt-dlp: ' 'Please read the README for further information on how to use yt-dlp: '
@@ -1111,9 +1111,9 @@ def main(argv=None):
from .extractor import gen_extractors, list_extractors from .extractor import gen_extractors, list_extractors
__all__ = [ __all__ = [
'main',
'YoutubeDL', 'YoutubeDL',
'parse_options',
'gen_extractors', 'gen_extractors',
'list_extractors', 'list_extractors',
'main',
'parse_options',
] ]

View File

@@ -3,7 +3,6 @@ from math import ceil
from .compat import compat_ord from .compat import compat_ord
from .dependencies import Cryptodome from .dependencies import Cryptodome
from .utils import bytes_to_intlist, intlist_to_bytes
if Cryptodome.AES: if Cryptodome.AES:
def aes_cbc_decrypt_bytes(data, key, iv): def aes_cbc_decrypt_bytes(data, key, iv):
@@ -17,15 +16,15 @@ if Cryptodome.AES:
else: else:
def aes_cbc_decrypt_bytes(data, key, iv): def aes_cbc_decrypt_bytes(data, key, iv):
""" Decrypt bytes with AES-CBC using native implementation since pycryptodome is unavailable """ """ Decrypt bytes with AES-CBC using native implementation since pycryptodome is unavailable """
return intlist_to_bytes(aes_cbc_decrypt(*map(bytes_to_intlist, (data, key, iv)))) return bytes(aes_cbc_decrypt(*map(list, (data, key, iv))))
def aes_gcm_decrypt_and_verify_bytes(data, key, tag, nonce): def aes_gcm_decrypt_and_verify_bytes(data, key, tag, nonce):
""" Decrypt bytes with AES-GCM using native implementation since pycryptodome is unavailable """ """ Decrypt bytes with AES-GCM using native implementation since pycryptodome is unavailable """
return intlist_to_bytes(aes_gcm_decrypt_and_verify(*map(bytes_to_intlist, (data, key, tag, nonce)))) return bytes(aes_gcm_decrypt_and_verify(*map(list, (data, key, tag, nonce))))
def aes_cbc_encrypt_bytes(data, key, iv, **kwargs): def aes_cbc_encrypt_bytes(data, key, iv, **kwargs):
return intlist_to_bytes(aes_cbc_encrypt(*map(bytes_to_intlist, (data, key, iv)), **kwargs)) return bytes(aes_cbc_encrypt(*map(list, (data, key, iv)), **kwargs))
BLOCK_SIZE_BYTES = 16 BLOCK_SIZE_BYTES = 16
@@ -221,7 +220,7 @@ def aes_gcm_decrypt_and_verify(data, key, tag, nonce):
j0 = [*nonce, 0, 0, 0, 1] j0 = [*nonce, 0, 0, 0, 1]
else: else:
fill = (BLOCK_SIZE_BYTES - (len(nonce) % BLOCK_SIZE_BYTES)) % BLOCK_SIZE_BYTES + 8 fill = (BLOCK_SIZE_BYTES - (len(nonce) % BLOCK_SIZE_BYTES)) % BLOCK_SIZE_BYTES + 8
ghash_in = nonce + [0] * fill + bytes_to_intlist((8 * len(nonce)).to_bytes(8, 'big')) ghash_in = nonce + [0] * fill + list((8 * len(nonce)).to_bytes(8, 'big'))
j0 = ghash(hash_subkey, ghash_in) j0 = ghash(hash_subkey, ghash_in)
# TODO: add nonce support to aes_ctr_decrypt # TODO: add nonce support to aes_ctr_decrypt
@@ -234,9 +233,9 @@ def aes_gcm_decrypt_and_verify(data, key, tag, nonce):
s_tag = ghash( s_tag = ghash(
hash_subkey, hash_subkey,
data data
+ [0] * pad_len # pad + [0] * pad_len # pad
+ bytes_to_intlist((0 * 8).to_bytes(8, 'big') # length of associated data + list((0 * 8).to_bytes(8, 'big') # length of associated data
+ ((len(data) * 8).to_bytes(8, 'big'))), # length of data + ((len(data) * 8).to_bytes(8, 'big'))), # length of data
) )
if tag != aes_ctr_encrypt(s_tag, key, j0): if tag != aes_ctr_encrypt(s_tag, key, j0):
@@ -300,8 +299,8 @@ def aes_decrypt_text(data, password, key_size_bytes):
""" """
NONCE_LENGTH_BYTES = 8 NONCE_LENGTH_BYTES = 8
data = bytes_to_intlist(base64.b64decode(data)) data = list(base64.b64decode(data))
password = bytes_to_intlist(password.encode()) password = list(password.encode())
key = password[:key_size_bytes] + [0] * (key_size_bytes - len(password)) key = password[:key_size_bytes] + [0] * (key_size_bytes - len(password))
key = aes_encrypt(key[:BLOCK_SIZE_BYTES], key_expansion(key)) * (key_size_bytes // BLOCK_SIZE_BYTES) key = aes_encrypt(key[:BLOCK_SIZE_BYTES], key_expansion(key)) * (key_size_bytes // BLOCK_SIZE_BYTES)
@@ -310,7 +309,7 @@ def aes_decrypt_text(data, password, key_size_bytes):
cipher = data[NONCE_LENGTH_BYTES:] cipher = data[NONCE_LENGTH_BYTES:]
decrypted_data = aes_ctr_decrypt(cipher, key, nonce + [0] * (BLOCK_SIZE_BYTES - NONCE_LENGTH_BYTES)) decrypted_data = aes_ctr_decrypt(cipher, key, nonce + [0] * (BLOCK_SIZE_BYTES - NONCE_LENGTH_BYTES))
return intlist_to_bytes(decrypted_data) return bytes(decrypted_data)
RCON = (0x8d, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36) RCON = (0x8d, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36)
@@ -535,19 +534,17 @@ def ghash(subkey, data):
__all__ = [ __all__ = [
'aes_cbc_decrypt', 'aes_cbc_decrypt',
'aes_cbc_decrypt_bytes', 'aes_cbc_decrypt_bytes',
'aes_ctr_decrypt',
'aes_decrypt_text',
'aes_decrypt',
'aes_ecb_decrypt',
'aes_gcm_decrypt_and_verify',
'aes_gcm_decrypt_and_verify_bytes',
'aes_cbc_encrypt', 'aes_cbc_encrypt',
'aes_cbc_encrypt_bytes', 'aes_cbc_encrypt_bytes',
'aes_ctr_decrypt',
'aes_ctr_encrypt', 'aes_ctr_encrypt',
'aes_decrypt',
'aes_decrypt_text',
'aes_ecb_decrypt',
'aes_ecb_encrypt', 'aes_ecb_encrypt',
'aes_encrypt', 'aes_encrypt',
'aes_gcm_decrypt_and_verify',
'aes_gcm_decrypt_and_verify_bytes',
'key_expansion', 'key_expansion',
'pad_block', 'pad_block',
'pkcs7_padding', 'pkcs7_padding',

View File

@@ -1,5 +1,4 @@
import os import os
import sys
import xml.etree.ElementTree as etree import xml.etree.ElementTree as etree
from .compat_utils import passthrough_module from .compat_utils import passthrough_module
@@ -24,33 +23,14 @@ def compat_etree_fromstring(text):
return etree.XML(text, parser=etree.XMLParser(target=_TreeBuilder())) return etree.XML(text, parser=etree.XMLParser(target=_TreeBuilder()))
compat_os_name = os._name if os.name == 'java' else os.name
def compat_shlex_quote(s):
from ..utils import shell_quote
return shell_quote(s)
def compat_ord(c): def compat_ord(c):
return c if isinstance(c, int) else ord(c) return c if isinstance(c, int) else ord(c)
if compat_os_name == 'nt' and sys.version_info < (3, 8):
# os.path.realpath on Windows does not follow symbolic links
# prior to Python 3.8 (see https://bugs.python.org/issue9949)
def compat_realpath(path):
while os.path.islink(path):
path = os.path.abspath(os.readlink(path))
return os.path.realpath(path)
else:
compat_realpath = os.path.realpath
# Python 3.8+ does not honor %HOME% on windows, but this breaks compatibility with youtube-dl # Python 3.8+ does not honor %HOME% on windows, but this breaks compatibility with youtube-dl
# See https://github.com/yt-dlp/yt-dlp/issues/792 # See https://github.com/yt-dlp/yt-dlp/issues/792
# https://docs.python.org/3/library/os.path.html#os.path.expanduser # https://docs.python.org/3/library/os.path.html#os.path.expanduser
if compat_os_name in ('nt', 'ce'): if os.name in ('nt', 'ce'):
def compat_expanduser(path): def compat_expanduser(path):
HOME = os.environ.get('HOME') HOME = os.environ.get('HOME')
if not HOME: if not HOME:

View File

@@ -8,16 +8,14 @@ passthrough_module(__name__, '.._legacy', callback=lambda attr: warnings.warn(
DeprecationWarning(f'{__name__}.{attr} is deprecated'), stacklevel=6)) DeprecationWarning(f'{__name__}.{attr} is deprecated'), stacklevel=6))
del passthrough_module del passthrough_module
import base64 import functools # noqa: F401
import urllib.error import os
import urllib.parse
compat_str = str
compat_b64decode = base64.b64decode compat_os_name = os.name
compat_realpath = os.path.realpath
compat_urlparse = urllib.parse
compat_parse_qs = urllib.parse.parse_qs def compat_shlex_quote(s):
compat_urllib_parse_unquote = urllib.parse.unquote from ..utils import shell_quote
compat_urllib_parse_urlencode = urllib.parse.urlencode return shell_quote(s)
compat_urllib_parse_urlparse = urllib.parse.urlparse

View File

@@ -30,7 +30,7 @@ from asyncio import run as compat_asyncio_run # noqa: F401
from re import Pattern as compat_Pattern # noqa: F401 from re import Pattern as compat_Pattern # noqa: F401
from re import match as compat_Match # noqa: F401 from re import match as compat_Match # noqa: F401
from . import compat_expanduser, compat_HTMLParseError, compat_realpath from . import compat_expanduser, compat_HTMLParseError
from .compat_utils import passthrough_module from .compat_utils import passthrough_module
from ..dependencies import brotli as compat_brotli # noqa: F401 from ..dependencies import brotli as compat_brotli # noqa: F401
from ..dependencies import websockets as compat_websockets # noqa: F401 from ..dependencies import websockets as compat_websockets # noqa: F401
@@ -78,7 +78,7 @@ compat_kwargs = lambda kwargs: kwargs
compat_map = map compat_map = map
compat_numeric_types = (int, float, complex) compat_numeric_types = (int, float, complex)
compat_os_path_expanduser = compat_expanduser compat_os_path_expanduser = compat_expanduser
compat_os_path_realpath = compat_realpath compat_os_path_realpath = os.path.realpath
compat_print = print compat_print = print
compat_shlex_split = shlex.split compat_shlex_split = shlex.split
compat_socket_create_connection = socket.create_connection compat_socket_create_connection = socket.create_connection
@@ -104,5 +104,12 @@ compat_xml_parse_error = compat_xml_etree_ElementTree_ParseError = etree.ParseEr
compat_xpath = lambda xpath: xpath compat_xpath = lambda xpath: xpath
compat_zip = zip compat_zip = zip
workaround_optparse_bug9161 = lambda: None workaround_optparse_bug9161 = lambda: None
compat_str = str
compat_b64decode = base64.b64decode
compat_urlparse = urllib.parse
compat_parse_qs = urllib.parse.parse_qs
compat_urllib_parse_unquote = urllib.parse.unquote
compat_urllib_parse_urlencode = urllib.parse.urlencode
compat_urllib_parse_urlparse = urllib.parse.urlparse
legacy = [] legacy = []

View File

@@ -1,7 +0,0 @@
# flake8: noqa: F405
from functools import * # noqa: F403
from .compat_utils import passthrough_module
passthrough_module(__name__, 'functools')
del passthrough_module

View File

@@ -7,9 +7,9 @@ passthrough_module(__name__, 'urllib.request')
del passthrough_module del passthrough_module
from .. import compat_os_name import os
if compat_os_name == 'nt': if os.name == 'nt':
# On older Python versions, proxies are extracted from Windows registry erroneously. [1] # On older Python versions, proxies are extracted from Windows registry erroneously. [1]
# If the https proxy in the registry does not have a scheme, urllib will incorrectly add https:// to it. [2] # If the https proxy in the registry does not have a scheme, urllib will incorrectly add https:// to it. [2]
# It is unlikely that the user has actually set it to be https, so we should be fine to safely downgrade # It is unlikely that the user has actually set it to be https, so we should be fine to safely downgrade
@@ -37,4 +37,4 @@ if compat_os_name == 'nt':
def getproxies(): def getproxies():
return getproxies_environment() or getproxies_registry_patched() return getproxies_environment() or getproxies_registry_patched()
del compat_os_name del os

View File

@@ -25,7 +25,6 @@ from .aes import (
aes_gcm_decrypt_and_verify_bytes, aes_gcm_decrypt_and_verify_bytes,
unpad_pkcs7, unpad_pkcs7,
) )
from .compat import compat_os_name
from .dependencies import ( from .dependencies import (
_SECRETSTORAGE_UNAVAILABLE_REASON, _SECRETSTORAGE_UNAVAILABLE_REASON,
secretstorage, secretstorage,
@@ -196,7 +195,10 @@ def _extract_firefox_cookies(profile, container, logger):
def _firefox_browser_dirs(): def _firefox_browser_dirs():
if sys.platform in ('cygwin', 'win32'): if sys.platform in ('cygwin', 'win32'):
yield os.path.expandvars(R'%APPDATA%\Mozilla\Firefox\Profiles') yield from map(os.path.expandvars, (
R'%APPDATA%\Mozilla\Firefox\Profiles',
R'%LOCALAPPDATA%\Packages\Mozilla.Firefox_n80bbvh6b1yt2\LocalCache\Roaming\Mozilla\Firefox\Profiles',
))
elif sys.platform == 'darwin': elif sys.platform == 'darwin':
yield os.path.expanduser('~/Library/Application Support/Firefox/Profiles') yield os.path.expanduser('~/Library/Application Support/Firefox/Profiles')
@@ -343,7 +345,7 @@ def _extract_chrome_cookies(browser_name, profile, keyring, logger):
logger.debug(f'cookie version breakdown: {counts}') logger.debug(f'cookie version breakdown: {counts}')
return jar return jar
except PermissionError as error: except PermissionError as error:
if compat_os_name == 'nt' and error.errno == 13: if os.name == 'nt' and error.errno == 13:
message = 'Could not copy Chrome cookie database. See https://github.com/yt-dlp/yt-dlp/issues/7271 for more info' message = 'Could not copy Chrome cookie database. See https://github.com/yt-dlp/yt-dlp/issues/7271 for more info'
logger.error(message) logger.error(message)
raise DownloadError(message) # force exit raise DownloadError(message) # force exit
@@ -1277,8 +1279,8 @@ class YoutubeDLCookieJar(http.cookiejar.MozillaCookieJar):
def _really_save(self, f, ignore_discard, ignore_expires): def _really_save(self, f, ignore_discard, ignore_expires):
now = time.time() now = time.time()
for cookie in self: for cookie in self:
if (not ignore_discard and cookie.discard if ((not ignore_discard and cookie.discard)
or not ignore_expires and cookie.is_expired(now)): or (not ignore_expires and cookie.is_expired(now))):
continue continue
name, value = cookie.name, cookie.value name, value = cookie.name, cookie.value
if value is None: if value is None:

View File

@@ -24,7 +24,7 @@ try:
from Crypto.Cipher import AES, PKCS1_OAEP, Blowfish, PKCS1_v1_5 # noqa: F401 from Crypto.Cipher import AES, PKCS1_OAEP, Blowfish, PKCS1_v1_5 # noqa: F401
from Crypto.Hash import CMAC, SHA1 # noqa: F401 from Crypto.Hash import CMAC, SHA1 # noqa: F401
from Crypto.PublicKey import RSA # noqa: F401 from Crypto.PublicKey import RSA # noqa: F401
except ImportError: except (ImportError, OSError):
__version__ = f'broken {__version__}'.strip() __version__ = f'broken {__version__}'.strip()

View File

@@ -20,9 +20,7 @@ from ..utils import (
Namespace, Namespace,
RetryManager, RetryManager,
classproperty, classproperty,
decodeArgument,
deprecation_warning, deprecation_warning,
encodeFilename,
format_bytes, format_bytes,
join_nonempty, join_nonempty,
parse_bytes, parse_bytes,
@@ -219,7 +217,7 @@ class FileDownloader:
def temp_name(self, filename): def temp_name(self, filename):
"""Returns a temporary filename for the given filename.""" """Returns a temporary filename for the given filename."""
if self.params.get('nopart', False) or filename == '-' or \ if self.params.get('nopart', False) or filename == '-' or \
(os.path.exists(encodeFilename(filename)) and not os.path.isfile(encodeFilename(filename))): (os.path.exists(filename) and not os.path.isfile(filename)):
return filename return filename
return filename + '.part' return filename + '.part'
@@ -273,7 +271,7 @@ class FileDownloader:
"""Try to set the last-modified time of the given file.""" """Try to set the last-modified time of the given file."""
if last_modified_hdr is None: if last_modified_hdr is None:
return return
if not os.path.isfile(encodeFilename(filename)): if not os.path.isfile(filename):
return return
timestr = last_modified_hdr timestr = last_modified_hdr
if timestr is None: if timestr is None:
@@ -432,13 +430,13 @@ class FileDownloader:
""" """
nooverwrites_and_exists = ( nooverwrites_and_exists = (
not self.params.get('overwrites', True) not self.params.get('overwrites', True)
and os.path.exists(encodeFilename(filename)) and os.path.exists(filename)
) )
if not hasattr(filename, 'write'): if not hasattr(filename, 'write'):
continuedl_and_exists = ( continuedl_and_exists = (
self.params.get('continuedl', True) self.params.get('continuedl', True)
and os.path.isfile(encodeFilename(filename)) and os.path.isfile(filename)
and not self.params.get('nopart', False) and not self.params.get('nopart', False)
) )
@@ -448,7 +446,7 @@ class FileDownloader:
self._hook_progress({ self._hook_progress({
'filename': filename, 'filename': filename,
'status': 'finished', 'status': 'finished',
'total_bytes': os.path.getsize(encodeFilename(filename)), 'total_bytes': os.path.getsize(filename),
}, info_dict) }, info_dict)
self._finish_multiline_status() self._finish_multiline_status()
return True, False return True, False
@@ -489,9 +487,7 @@ class FileDownloader:
if not self.params.get('verbose', False): if not self.params.get('verbose', False):
return return
str_args = [decodeArgument(a) for a in args]
if exe is None: if exe is None:
exe = os.path.basename(str_args[0]) exe = os.path.basename(args[0])
self.write_debug(f'{exe} command line: {shell_quote(str_args)}') self.write_debug(f'{exe} command line: {shell_quote(args)}')

View File

@@ -23,7 +23,6 @@ from ..utils import (
cli_valueless_option, cli_valueless_option,
determine_ext, determine_ext,
encodeArgument, encodeArgument,
encodeFilename,
find_available_port, find_available_port,
remove_end, remove_end,
traverse_obj, traverse_obj,
@@ -67,7 +66,7 @@ class ExternalFD(FragmentFD):
'elapsed': time.time() - started, 'elapsed': time.time() - started,
} }
if filename != '-': if filename != '-':
fsize = os.path.getsize(encodeFilename(tmpfilename)) fsize = os.path.getsize(tmpfilename)
self.try_rename(tmpfilename, filename) self.try_rename(tmpfilename, filename)
status.update({ status.update({
'downloaded_bytes': fsize, 'downloaded_bytes': fsize,
@@ -184,9 +183,9 @@ class ExternalFD(FragmentFD):
dest.write(decrypt_fragment(fragment, src.read())) dest.write(decrypt_fragment(fragment, src.read()))
src.close() src.close()
if not self.params.get('keep_fragments', False): if not self.params.get('keep_fragments', False):
self.try_remove(encodeFilename(fragment_filename)) self.try_remove(fragment_filename)
dest.close() dest.close()
self.try_remove(encodeFilename(f'{tmpfilename}.frag.urls')) self.try_remove(f'{tmpfilename}.frag.urls')
return 0 return 0
def _call_process(self, cmd, info_dict): def _call_process(self, cmd, info_dict):
@@ -620,7 +619,7 @@ class FFmpegFD(ExternalFD):
args += self._configuration_args(('_o1', '_o', '')) args += self._configuration_args(('_o1', '_o', ''))
args = [encodeArgument(opt) for opt in args] args = [encodeArgument(opt) for opt in args]
args.append(encodeFilename(ffpp._ffmpeg_filename_argument(tmpfilename), True)) args.append(ffpp._ffmpeg_filename_argument(tmpfilename))
self._debug_cmd(args) self._debug_cmd(args)
piped = any(fmt['url'] in ('-', 'pipe:') for fmt in selected_formats) piped = any(fmt['url'] in ('-', 'pipe:') for fmt in selected_formats)

View File

@@ -9,10 +9,9 @@ import time
from .common import FileDownloader from .common import FileDownloader
from .http import HttpFD from .http import HttpFD
from ..aes import aes_cbc_decrypt_bytes, unpad_pkcs7 from ..aes import aes_cbc_decrypt_bytes, unpad_pkcs7
from ..compat import compat_os_name
from ..networking import Request from ..networking import Request
from ..networking.exceptions import HTTPError, IncompleteRead from ..networking.exceptions import HTTPError, IncompleteRead
from ..utils import DownloadError, RetryManager, encodeFilename, traverse_obj from ..utils import DownloadError, RetryManager, traverse_obj
from ..utils.networking import HTTPHeaderDict from ..utils.networking import HTTPHeaderDict
from ..utils.progress import ProgressCalculator from ..utils.progress import ProgressCalculator
@@ -152,7 +151,7 @@ class FragmentFD(FileDownloader):
if self.__do_ytdl_file(ctx): if self.__do_ytdl_file(ctx):
self._write_ytdl_file(ctx) self._write_ytdl_file(ctx)
if not self.params.get('keep_fragments', False): if not self.params.get('keep_fragments', False):
self.try_remove(encodeFilename(ctx['fragment_filename_sanitized'])) self.try_remove(ctx['fragment_filename_sanitized'])
del ctx['fragment_filename_sanitized'] del ctx['fragment_filename_sanitized']
def _prepare_frag_download(self, ctx): def _prepare_frag_download(self, ctx):
@@ -188,7 +187,7 @@ class FragmentFD(FileDownloader):
}) })
if self.__do_ytdl_file(ctx): if self.__do_ytdl_file(ctx):
ytdl_file_exists = os.path.isfile(encodeFilename(self.ytdl_filename(ctx['filename']))) ytdl_file_exists = os.path.isfile(self.ytdl_filename(ctx['filename']))
continuedl = self.params.get('continuedl', True) continuedl = self.params.get('continuedl', True)
if continuedl and ytdl_file_exists: if continuedl and ytdl_file_exists:
self._read_ytdl_file(ctx) self._read_ytdl_file(ctx)
@@ -390,7 +389,7 @@ class FragmentFD(FileDownloader):
def __exit__(self, exc_type, exc_val, exc_tb): def __exit__(self, exc_type, exc_val, exc_tb):
pass pass
if compat_os_name == 'nt': if os.name == 'nt':
def future_result(future): def future_result(future):
while True: while True:
try: try:

View File

@@ -119,12 +119,12 @@ class HlsFD(FragmentFD):
self.to_screen(f'[{self.FD_NAME}] Fragment downloads will be delegated to {real_downloader.get_basename()}') self.to_screen(f'[{self.FD_NAME}] Fragment downloads will be delegated to {real_downloader.get_basename()}')
def is_ad_fragment_start(s): def is_ad_fragment_start(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s return ((s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s)
or s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad')) or (s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad')))
def is_ad_fragment_end(s): def is_ad_fragment_end(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=master' in s return ((s.startswith('#ANVATO-SEGMENT-INFO') and 'type=master' in s)
or s.startswith('#UPLYNK-SEGMENT') and s.endswith(',segment')) or (s.startswith('#UPLYNK-SEGMENT') and s.endswith(',segment')))
fragments = [] fragments = []

View File

@@ -15,7 +15,6 @@ from ..utils import (
ThrottledDownload, ThrottledDownload,
XAttrMetadataError, XAttrMetadataError,
XAttrUnavailableError, XAttrUnavailableError,
encodeFilename,
int_or_none, int_or_none,
parse_http_range, parse_http_range,
try_call, try_call,
@@ -58,9 +57,8 @@ class HttpFD(FileDownloader):
if self.params.get('continuedl', True): if self.params.get('continuedl', True):
# Establish possible resume length # Establish possible resume length
if os.path.isfile(encodeFilename(ctx.tmpfilename)): if os.path.isfile(ctx.tmpfilename):
ctx.resume_len = os.path.getsize( ctx.resume_len = os.path.getsize(ctx.tmpfilename)
encodeFilename(ctx.tmpfilename))
ctx.is_resume = ctx.resume_len > 0 ctx.is_resume = ctx.resume_len > 0
@@ -241,7 +239,7 @@ class HttpFD(FileDownloader):
ctx.resume_len = byte_counter ctx.resume_len = byte_counter
else: else:
try: try:
ctx.resume_len = os.path.getsize(encodeFilename(ctx.tmpfilename)) ctx.resume_len = os.path.getsize(ctx.tmpfilename)
except FileNotFoundError: except FileNotFoundError:
ctx.resume_len = 0 ctx.resume_len = 0
raise RetryDownload(e) raise RetryDownload(e)

View File

@@ -8,7 +8,6 @@ from ..utils import (
Popen, Popen,
check_executable, check_executable,
encodeArgument, encodeArgument,
encodeFilename,
get_exe_version, get_exe_version,
) )
@@ -179,7 +178,7 @@ class RtmpFD(FileDownloader):
return False return False
while retval in (RD_INCOMPLETE, RD_FAILED) and not test and not live: while retval in (RD_INCOMPLETE, RD_FAILED) and not test and not live:
prevsize = os.path.getsize(encodeFilename(tmpfilename)) prevsize = os.path.getsize(tmpfilename)
self.to_screen(f'[rtmpdump] Downloaded {prevsize} bytes') self.to_screen(f'[rtmpdump] Downloaded {prevsize} bytes')
time.sleep(5.0) # This seems to be needed time.sleep(5.0) # This seems to be needed
args = [*basic_args, '--resume'] args = [*basic_args, '--resume']
@@ -187,7 +186,7 @@ class RtmpFD(FileDownloader):
args += ['--skip', '1'] args += ['--skip', '1']
args = [encodeArgument(a) for a in args] args = [encodeArgument(a) for a in args]
retval = run_rtmpdump(args) retval = run_rtmpdump(args)
cursize = os.path.getsize(encodeFilename(tmpfilename)) cursize = os.path.getsize(tmpfilename)
if prevsize == cursize and retval == RD_FAILED: if prevsize == cursize and retval == RD_FAILED:
break break
# Some rtmp streams seem abort after ~ 99.8%. Don't complain for those # Some rtmp streams seem abort after ~ 99.8%. Don't complain for those
@@ -196,7 +195,7 @@ class RtmpFD(FileDownloader):
retval = RD_SUCCESS retval = RD_SUCCESS
break break
if retval == RD_SUCCESS or (test and retval == RD_INCOMPLETE): if retval == RD_SUCCESS or (test and retval == RD_INCOMPLETE):
fsize = os.path.getsize(encodeFilename(tmpfilename)) fsize = os.path.getsize(tmpfilename)
self.to_screen(f'[rtmpdump] Downloaded {fsize} bytes') self.to_screen(f'[rtmpdump] Downloaded {fsize} bytes')
self.try_rename(tmpfilename, filename) self.try_rename(tmpfilename, filename)
self._hook_progress({ self._hook_progress({

View File

@@ -2,7 +2,7 @@ import os
import subprocess import subprocess
from .common import FileDownloader from .common import FileDownloader
from ..utils import check_executable, encodeFilename from ..utils import check_executable
class RtspFD(FileDownloader): class RtspFD(FileDownloader):
@@ -26,7 +26,7 @@ class RtspFD(FileDownloader):
retval = subprocess.call(args) retval = subprocess.call(args)
if retval == 0: if retval == 0:
fsize = os.path.getsize(encodeFilename(tmpfilename)) fsize = os.path.getsize(tmpfilename)
self.to_screen(f'\r[{args[0]}] {fsize} bytes') self.to_screen(f'\r[{args[0]}] {fsize} bytes')
self.try_rename(tmpfilename, filename) self.try_rename(tmpfilename, filename)
self._hook_progress({ self._hook_progress({

View File

@@ -123,8 +123,8 @@ class YoutubeLiveChatFD(FragmentFD):
data, data,
lambda x: x['continuationContents']['liveChatContinuation'], dict) or {} lambda x: x['continuationContents']['liveChatContinuation'], dict) or {}
func = (info_dict['protocol'] == 'youtube_live_chat' and parse_actions_live func = ((info_dict['protocol'] == 'youtube_live_chat' and parse_actions_live)
or frag_index == 1 and try_refresh_replay_beginning or (frag_index == 1 and try_refresh_replay_beginning)
or parse_actions_replay) or parse_actions_replay)
return (True, *func(live_chat_continuation)) return (True, *func(live_chat_continuation))
except HTTPError as err: except HTTPError as err:

View File

@@ -208,6 +208,10 @@ from .bandcamp import (
BandcampUserIE, BandcampUserIE,
BandcampWeeklyIE, BandcampWeeklyIE,
) )
from .bandlab import (
BandlabIE,
BandlabPlaylistIE,
)
from .bannedvideo import BannedVideoIE from .bannedvideo import BannedVideoIE
from .bbc import ( from .bbc import (
BBCIE, BBCIE,
@@ -252,6 +256,7 @@ from .bilibili import (
BilibiliCheeseIE, BilibiliCheeseIE,
BilibiliCheeseSeasonIE, BilibiliCheeseSeasonIE,
BilibiliCollectionListIE, BilibiliCollectionListIE,
BiliBiliDynamicIE,
BilibiliFavoritesListIE, BilibiliFavoritesListIE,
BiliBiliIE, BiliBiliIE,
BiliBiliPlayerIE, BiliBiliPlayerIE,
@@ -436,12 +441,6 @@ from .crowdbunker import (
CrowdBunkerIE, CrowdBunkerIE,
) )
from .crtvg import CrtvgIE from .crtvg import CrtvgIE
from .crunchyroll import (
CrunchyrollArtistIE,
CrunchyrollBetaIE,
CrunchyrollBetaShowIE,
CrunchyrollMusicIE,
)
from .cspan import ( from .cspan import (
CSpanCongressIE, CSpanCongressIE,
CSpanIE, CSpanIE,
@@ -551,6 +550,7 @@ from .dropout import (
DropoutIE, DropoutIE,
DropoutSeasonIE, DropoutSeasonIE,
) )
from .drtalks import DrTalksIE
from .drtuber import DrTuberIE from .drtuber import DrTuberIE
from .drtv import ( from .drtv import (
DRTVIE, DRTVIE,
@@ -580,6 +580,10 @@ from .egghead import (
EggheadCourseIE, EggheadCourseIE,
EggheadLessonIE, EggheadLessonIE,
) )
from .eggs import (
EggsArtistIE,
EggsIE,
)
from .eighttracks import EightTracksIE from .eighttracks import EightTracksIE
from .eitb import EitbIE from .eitb import EitbIE
from .elementorembed import ElementorEmbedIE from .elementorembed import ElementorEmbedIE
@@ -695,11 +699,6 @@ from .frontendmasters import (
FrontendMastersLessonIE, FrontendMastersLessonIE,
) )
from .fujitv import FujiTVFODPlus7IE from .fujitv import FujiTVFODPlus7IE
from .funimation import (
FunimationIE,
FunimationPageIE,
FunimationShowIE,
)
from .funk import FunkIE from .funk import FunkIE
from .funker530 import Funker530IE from .funker530 import Funker530IE
from .fuyintv import FuyinTVIE from .fuyintv import FuyinTVIE
@@ -708,6 +707,7 @@ from .gab import (
GabTVIE, GabTVIE,
) )
from .gaia import GaiaIE from .gaia import GaiaIE
from .gamedevtv import GameDevTVDashboardIE
from .gamejolt import ( from .gamejolt import (
GameJoltCommunityIE, GameJoltCommunityIE,
GameJoltGameIE, GameJoltGameIE,
@@ -941,6 +941,10 @@ from .kaltura import KalturaIE
from .kankanews import KankaNewsIE from .kankanews import KankaNewsIE
from .karaoketv import KaraoketvIE from .karaoketv import KaraoketvIE
from .kelbyone import KelbyOneIE from .kelbyone import KelbyOneIE
from .kenh14 import (
Kenh14PlaylistIE,
Kenh14VideoIE,
)
from .khanacademy import ( from .khanacademy import (
KhanAcademyIE, KhanAcademyIE,
KhanAcademyUnitIE, KhanAcademyUnitIE,
@@ -1130,12 +1134,6 @@ from .microsoftembed import (
MicrosoftMediusIE, MicrosoftMediusIE,
) )
from .microsoftstream import MicrosoftStreamIE from .microsoftstream import MicrosoftStreamIE
from .mildom import (
MildomClipIE,
MildomIE,
MildomUserVodIE,
MildomVodIE,
)
from .minds import ( from .minds import (
MindsChannelIE, MindsChannelIE,
MindsGroupIE, MindsGroupIE,
@@ -1155,6 +1153,7 @@ from .mitele import MiTeleIE
from .mixch import ( from .mixch import (
MixchArchiveIE, MixchArchiveIE,
MixchIE, MixchIE,
MixchMovieIE,
) )
from .mixcloud import ( from .mixcloud import (
MixcloudIE, MixcloudIE,
@@ -1274,6 +1273,10 @@ from .nebula import (
) )
from .nekohacker import NekoHackerIE from .nekohacker import NekoHackerIE
from .nerdcubed import NerdCubedFeedIE from .nerdcubed import NerdCubedFeedIE
from .nest import (
NestClipIE,
NestIE,
)
from .neteasemusic import ( from .neteasemusic import (
NetEaseMusicAlbumIE, NetEaseMusicAlbumIE,
NetEaseMusicDjRadioIE, NetEaseMusicDjRadioIE,
@@ -1516,8 +1519,8 @@ from .pgatour import PGATourIE
from .philharmoniedeparis import PhilharmonieDeParisIE from .philharmoniedeparis import PhilharmonieDeParisIE
from .phoenix import PhoenixIE from .phoenix import PhoenixIE
from .photobucket import PhotobucketIE from .photobucket import PhotobucketIE
from .pialive import PiaLiveIE
from .piapro import PiaproIE from .piapro import PiaproIE
from .piaulizaportal import PIAULIZAPortalIE
from .picarto import ( from .picarto import (
PicartoIE, PicartoIE,
PicartoVodIE, PicartoVodIE,
@@ -1528,6 +1531,10 @@ from .pinterest import (
PinterestCollectionIE, PinterestCollectionIE,
PinterestIE, PinterestIE,
) )
from .piramidetv import (
PiramideTVChannelIE,
PiramideTVIE,
)
from .pixivsketch import ( from .pixivsketch import (
PixivSketchIE, PixivSketchIE,
PixivSketchUserIE, PixivSketchUserIE,
@@ -1547,16 +1554,13 @@ from .pluralsight import (
PluralsightIE, PluralsightIE,
) )
from .plutotv import PlutoTVIE from .plutotv import PlutoTVIE
from .plvideo import PlVideoIE
from .podbayfm import ( from .podbayfm import (
PodbayFMChannelIE, PodbayFMChannelIE,
PodbayFMIE, PodbayFMIE,
) )
from .podchaser import PodchaserIE from .podchaser import PodchaserIE
from .podomatic import PodomaticIE from .podomatic import PodomaticIE
from .pokemon import (
PokemonIE,
PokemonWatchIE,
)
from .pokergo import ( from .pokergo import (
PokerGoCollectionIE, PokerGoCollectionIE,
PokerGoIE, PokerGoIE,
@@ -1647,6 +1651,7 @@ from .radiokapital import (
RadioKapitalIE, RadioKapitalIE,
RadioKapitalShowIE, RadioKapitalShowIE,
) )
from .radioradicale import RadioRadicaleIE
from .radiozet import RadioZetPodcastIE from .radiozet import RadioZetPodcastIE
from .radlive import ( from .radlive import (
RadLiveChannelIE, RadLiveChannelIE,
@@ -1938,9 +1943,7 @@ from .spotify import (
) )
from .spreaker import ( from .spreaker import (
SpreakerIE, SpreakerIE,
SpreakerPageIE,
SpreakerShowIE, SpreakerShowIE,
SpreakerShowPageIE,
) )
from .springboardplatform import SpringboardPlatformIE from .springboardplatform import SpringboardPlatformIE
from .sprout import SproutIE from .sprout import SproutIE
@@ -1982,6 +1985,10 @@ from .streetvoice import StreetVoiceIE
from .stretchinternet import StretchInternetIE from .stretchinternet import StretchInternetIE
from .stripchat import StripchatIE from .stripchat import StripchatIE
from .stv import STVPlayerIE from .stv import STVPlayerIE
from .subsplash import (
SubsplashIE,
SubsplashPlaylistIE,
)
from .substack import SubstackIE from .substack import SubstackIE
from .sunporno import SunPornoIE from .sunporno import SunPornoIE
from .sverigesradio import ( from .sverigesradio import (
@@ -2251,6 +2258,10 @@ from .ufctv import (
) )
from .ukcolumn import UkColumnIE from .ukcolumn import UkColumnIE
from .uktvplay import UKTVPlayIE from .uktvplay import UKTVPlayIE
from .uliza import (
UlizaPlayerIE,
UlizaPortalIE,
)
from .umg import UMGDeIE from .umg import UMGDeIE
from .unistra import UnistraIE from .unistra import UnistraIE
from .unity import UnityIE from .unity import UnityIE
@@ -2279,10 +2290,6 @@ from .utreon import UtreonIE
from .varzesh3 import Varzesh3IE from .varzesh3 import Varzesh3IE
from .vbox7 import Vbox7IE from .vbox7 import Vbox7IE
from .veo import VeoIE from .veo import VeoIE
from .veoh import (
VeohIE,
VeohUserIE,
)
from .vesti import VestiIE from .vesti import VestiIE
from .vevo import ( from .vevo import (
VevoIE, VevoIE,
@@ -2355,10 +2362,6 @@ from .vimm import (
VimmIE, VimmIE,
VimmRecordingIE, VimmRecordingIE,
) )
from .vine import (
VineIE,
VineUserIE,
)
from .viously import ViouslyIE from .viously import ViouslyIE
from .viqeo import ViqeoIE from .viqeo import ViqeoIE
from .viu import ( from .viu import (

View File

@@ -6,7 +6,6 @@ import hmac
import io import io
import json import json
import re import re
import struct
import time import time
import urllib.parse import urllib.parse
import uuid import uuid
@@ -18,10 +17,8 @@ from ..networking.exceptions import TransportError
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
OnDemandPagedList, OnDemandPagedList,
bytes_to_intlist,
decode_base_n, decode_base_n,
int_or_none, int_or_none,
intlist_to_bytes,
time_seconds, time_seconds,
traverse_obj, traverse_obj,
update_url_query, update_url_query,
@@ -72,15 +69,15 @@ class AbemaLicenseRH(RequestHandler):
}) })
res = decode_base_n(license_response['k'], table=self._STRTABLE) res = decode_base_n(license_response['k'], table=self._STRTABLE)
encvideokey = bytes_to_intlist(struct.pack('>QQ', res >> 64, res & 0xffffffffffffffff)) encvideokey = list(res.to_bytes(16, 'big'))
h = hmac.new( h = hmac.new(
binascii.unhexlify(self._HKEY), binascii.unhexlify(self._HKEY),
(license_response['cid'] + self.ie._DEVICE_ID).encode(), (license_response['cid'] + self.ie._DEVICE_ID).encode(),
digestmod=hashlib.sha256) digestmod=hashlib.sha256)
enckey = bytes_to_intlist(h.digest()) enckey = list(h.digest())
return intlist_to_bytes(aes_ecb_decrypt(encvideokey, enckey)) return bytes(aes_ecb_decrypt(encvideokey, enckey))
class AbemaTVBaseIE(InfoExtractor): class AbemaTVBaseIE(InfoExtractor):
@@ -424,14 +421,15 @@ class AbemaTVIE(AbemaTVBaseIE):
class AbemaTVTitleIE(AbemaTVBaseIE): class AbemaTVTitleIE(AbemaTVBaseIE):
_VALID_URL = r'https?://abema\.tv/video/title/(?P<id>[^?/]+)' _VALID_URL = r'https?://abema\.tv/video/title/(?P<id>[^?/#]+)/?(?:\?(?:[^#]+&)?s=(?P<season>[^&#]+))?'
_PAGE_SIZE = 25 _PAGE_SIZE = 25
_TESTS = [{ _TESTS = [{
'url': 'https://abema.tv/video/title/90-1597', 'url': 'https://abema.tv/video/title/90-1887',
'info_dict': { 'info_dict': {
'id': '90-1597', 'id': '90-1887',
'title': 'シャッフルアイランド', 'title': 'シャッフルアイランド',
'description': 'md5:61b2425308f41a5282a926edda66f178',
}, },
'playlist_mincount': 2, 'playlist_mincount': 2,
}, { }, {
@@ -439,41 +437,54 @@ class AbemaTVTitleIE(AbemaTVBaseIE):
'info_dict': { 'info_dict': {
'id': '193-132', 'id': '193-132',
'title': '真心が届く~僕とスターのオフィス・ラブ!?~', 'title': '真心が届く~僕とスターのオフィス・ラブ!?~',
'description': 'md5:9b59493d1f3a792bafbc7319258e7af8',
}, },
'playlist_mincount': 16, 'playlist_mincount': 16,
}, { }, {
'url': 'https://abema.tv/video/title/25-102', 'url': 'https://abema.tv/video/title/25-1nzan-whrxe',
'info_dict': { 'info_dict': {
'id': '25-102', 'id': '25-1nzan-whrxe',
'title': 'ソードアート・オンライン アリシゼーション', 'title': 'ソードアート・オンライン',
'description': 'md5:c094904052322e6978495532bdbf06e6',
}, },
'playlist_mincount': 24, 'playlist_mincount': 25,
}, {
'url': 'https://abema.tv/video/title/26-2mzbynr-cph?s=26-2mzbynr-cph_s40',
'info_dict': {
'title': '〈物語〉シリーズ',
'id': '26-2mzbynr-cph',
'description': 'md5:e67873de1c88f360af1f0a4b84847a52',
},
'playlist_count': 59,
}] }]
def _fetch_page(self, playlist_id, series_version, page): def _fetch_page(self, playlist_id, series_version, season_id, page):
query = {
'seriesVersion': series_version,
'offset': str(page * self._PAGE_SIZE),
'order': 'seq',
'limit': str(self._PAGE_SIZE),
}
if season_id:
query['seasonId'] = season_id
programs = self._call_api( programs = self._call_api(
f'v1/video/series/{playlist_id}/programs', playlist_id, f'v1/video/series/{playlist_id}/programs', playlist_id,
note=f'Downloading page {page + 1}', note=f'Downloading page {page + 1}',
query={ query=query)
'seriesVersion': series_version,
'offset': str(page * self._PAGE_SIZE),
'order': 'seq',
'limit': str(self._PAGE_SIZE),
})
yield from ( yield from (
self.url_result(f'https://abema.tv/video/episode/{x}') self.url_result(f'https://abema.tv/video/episode/{x}')
for x in traverse_obj(programs, ('programs', ..., 'id'))) for x in traverse_obj(programs, ('programs', ..., 'id')))
def _entries(self, playlist_id, series_version): def _entries(self, playlist_id, series_version, season_id):
return OnDemandPagedList( return OnDemandPagedList(
functools.partial(self._fetch_page, playlist_id, series_version), functools.partial(self._fetch_page, playlist_id, series_version, season_id),
self._PAGE_SIZE) self._PAGE_SIZE)
def _real_extract(self, url): def _real_extract(self, url):
playlist_id = self._match_id(url) playlist_id, season_id = self._match_valid_url(url).group('id', 'season')
series_info = self._call_api(f'v1/video/series/{playlist_id}', playlist_id) series_info = self._call_api(f'v1/video/series/{playlist_id}', playlist_id)
return self.playlist_result( return self.playlist_result(
self._entries(playlist_id, series_info['version']), playlist_id=playlist_id, self._entries(playlist_id, series_info['version'], season_id), playlist_id=playlist_id,
playlist_title=series_info.get('title'), playlist_title=series_info.get('title'),
playlist_description=series_info.get('content')) playlist_description=series_info.get('content'))

View File

@@ -11,11 +11,9 @@ from ..networking.exceptions import HTTPError
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
ass_subtitles_timecode, ass_subtitles_timecode,
bytes_to_intlist,
bytes_to_long, bytes_to_long,
float_or_none, float_or_none,
int_or_none, int_or_none,
intlist_to_bytes,
join_nonempty, join_nonempty,
long_to_bytes, long_to_bytes,
parse_iso8601, parse_iso8601,
@@ -198,16 +196,16 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
links_url = try_get(options, lambda x: x['video']['url']) or (video_base_url + 'link') links_url = try_get(options, lambda x: x['video']['url']) or (video_base_url + 'link')
self._K = ''.join(random.choices('0123456789abcdef', k=16)) self._K = ''.join(random.choices('0123456789abcdef', k=16))
message = bytes_to_intlist(json.dumps({ message = list(json.dumps({
'k': self._K, 'k': self._K,
't': token, 't': token,
})) }).encode())
# Sometimes authentication fails for no good reason, retry with # Sometimes authentication fails for no good reason, retry with
# a different random padding # a different random padding
links_data = None links_data = None
for _ in range(3): for _ in range(3):
padded_message = intlist_to_bytes(pkcs1pad(message, 128)) padded_message = bytes(pkcs1pad(message, 128))
n, e = self._RSA_KEY n, e = self._RSA_KEY
encrypted_message = long_to_bytes(pow(bytes_to_long(padded_message), e, n)) encrypted_message = long_to_bytes(pow(bytes_to_long(padded_message), e, n))
authorization = base64.b64encode(encrypted_message).decode() authorization = base64.b64encode(encrypted_message).decode()
@@ -234,7 +232,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
error = self._parse_json(e.cause.response.read(), video_id) error = self._parse_json(e.cause.response.read(), video_id)
message = error.get('message') message = error.get('message')
if e.cause.code == 403 and error.get('code') == 'player-bad-geolocation-country': if e.cause.status == 403 and error.get('code') == 'player-bad-geolocation-country':
self.raise_geo_restricted(msg=message) self.raise_geo_restricted(msg=message)
raise ExtractorError(message) raise ExtractorError(message)
else: else:

View File

@@ -1362,7 +1362,7 @@ class AdobePassIE(InfoExtractor): # XXX: Conventionally, base classes should en
def _download_webpage_handle(self, *args, **kwargs): def _download_webpage_handle(self, *args, **kwargs):
headers = self.geo_verification_headers() headers = self.geo_verification_headers()
headers.update(kwargs.get('headers', {})) headers.update(kwargs.get('headers') or {})
kwargs['headers'] = headers kwargs['headers'] = headers
return super()._download_webpage_handle( return super()._download_webpage_handle(
*args, **kwargs) *args, **kwargs)

View File

@@ -66,6 +66,14 @@ class AfreecaTVBaseIE(InfoExtractor):
extensions={'legacy_ssl': True}), display_id, extensions={'legacy_ssl': True}), display_id,
'Downloading API JSON', 'Unable to download API JSON') 'Downloading API JSON', 'Unable to download API JSON')
@staticmethod
def _fixup_thumb(thumb_url):
if not url_or_none(thumb_url):
return None
# Core would determine_ext as 'php' from the url, so we need to provide the real ext
# See: https://github.com/yt-dlp/yt-dlp/issues/11537
return [{'url': thumb_url, 'ext': 'jpg'}]
class AfreecaTVIE(AfreecaTVBaseIE): class AfreecaTVIE(AfreecaTVBaseIE):
IE_NAME = 'soop' IE_NAME = 'soop'
@@ -155,7 +163,7 @@ class AfreecaTVIE(AfreecaTVBaseIE):
'uploader': ('writer_nick', {str}), 'uploader': ('writer_nick', {str}),
'uploader_id': ('bj_id', {str}), 'uploader_id': ('bj_id', {str}),
'duration': ('total_file_duration', {int_or_none(scale=1000)}), 'duration': ('total_file_duration', {int_or_none(scale=1000)}),
'thumbnail': ('thumb', {url_or_none}), 'thumbnails': ('thumb', {self._fixup_thumb}),
}) })
entries = [] entries = []
@@ -226,8 +234,7 @@ class AfreecaTVCatchStoryIE(AfreecaTVBaseIE):
return self.playlist_result(self._entries(data), video_id) return self.playlist_result(self._entries(data), video_id)
@staticmethod def _entries(self, data):
def _entries(data):
# 'files' is always a list with 1 element # 'files' is always a list with 1 element
yield from traverse_obj(data, ( yield from traverse_obj(data, (
'data', lambda _, v: v['story_type'] == 'catch', 'data', lambda _, v: v['story_type'] == 'catch',
@@ -238,7 +245,7 @@ class AfreecaTVCatchStoryIE(AfreecaTVBaseIE):
'title': ('title', {str}), 'title': ('title', {str}),
'uploader': ('writer_nick', {str}), 'uploader': ('writer_nick', {str}),
'uploader_id': ('writer_id', {str}), 'uploader_id': ('writer_id', {str}),
'thumbnail': ('thumb', {url_or_none}), 'thumbnails': ('thumb', {self._fixup_thumb}),
'timestamp': ('write_timestamp', {int_or_none}), 'timestamp': ('write_timestamp', {int_or_none}),
})) }))

View File

@@ -8,10 +8,8 @@ import time
from .common import InfoExtractor from .common import InfoExtractor
from ..aes import aes_encrypt from ..aes import aes_encrypt
from ..utils import ( from ..utils import (
bytes_to_intlist,
determine_ext, determine_ext,
int_or_none, int_or_none,
intlist_to_bytes,
join_nonempty, join_nonempty,
smuggle_url, smuggle_url,
strip_jsonp, strip_jsonp,
@@ -234,8 +232,8 @@ class AnvatoIE(InfoExtractor):
server_time = self._server_time(access_key, video_id) server_time = self._server_time(access_key, video_id)
input_data = f'{server_time}~{md5_text(video_data_url)}~{md5_text(server_time)}' input_data = f'{server_time}~{md5_text(video_data_url)}~{md5_text(server_time)}'
auth_secret = intlist_to_bytes(aes_encrypt( auth_secret = bytes(aes_encrypt(
bytes_to_intlist(input_data[:64]), bytes_to_intlist(self._AUTH_KEY))) list(input_data[:64].encode()), list(self._AUTH_KEY)))
query = { query = {
'X-Anvato-Adst-Auth': base64.b64encode(auth_secret).decode('ascii'), 'X-Anvato-Adst-Auth': base64.b64encode(auth_secret).decode('ascii'),
'rtyp': 'fp', 'rtyp': 'fp',

View File

@@ -205,6 +205,26 @@ class ArchiveOrgIE(InfoExtractor):
}, },
}, },
], ],
}, {
# The reviewbody is None for one of the reviews; just need to extract data without crashing
'url': 'https://archive.org/details/gd95-04-02.sbd.11622.sbeok.shnf/gd95-04-02d1t04.shn',
'info_dict': {
'id': 'gd95-04-02.sbd.11622.sbeok.shnf/gd95-04-02d1t04.shn',
'ext': 'mp3',
'title': 'Stuck Inside of Mobile with the Memphis Blues Again',
'creators': ['Grateful Dead'],
'duration': 338.31,
'track': 'Stuck Inside of Mobile with the Memphis Blues Again',
'description': 'md5:764348a470b986f1217ffd38d6ac7b72',
'display_id': 'gd95-04-02d1t04.shn',
'location': 'Pyramid Arena',
'uploader': 'jon@archive.org',
'album': '1995-04-02 - Pyramid Arena',
'upload_date': '20040519',
'track_number': 4,
'release_date': '19950402',
'timestamp': 1084927901,
},
}] }]
@staticmethod @staticmethod
@@ -335,7 +355,7 @@ class ArchiveOrgIE(InfoExtractor):
info['comments'].append({ info['comments'].append({
'id': review.get('review_id'), 'id': review.get('review_id'),
'author': review.get('reviewer'), 'author': review.get('reviewer'),
'text': str_or_none(review.get('reviewtitle'), '') + '\n\n' + review.get('reviewbody'), 'text': join_nonempty('reviewtitle', 'reviewbody', from_dict=review, delim='\n\n'),
'timestamp': unified_timestamp(review.get('createdate')), 'timestamp': unified_timestamp(review.get('createdate')),
'parent': 'root'}) 'parent': 'root'})

437
yt_dlp/extractor/bandlab.py Normal file
View File

@@ -0,0 +1,437 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
float_or_none,
format_field,
int_or_none,
parse_iso8601,
parse_qs,
truncate_string,
url_or_none,
)
from ..utils.traversal import traverse_obj, value
class BandlabBaseIE(InfoExtractor):
def _call_api(self, endpoint, asset_id, **kwargs):
headers = kwargs.pop('headers', None) or {}
return self._download_json(
f'https://www.bandlab.com/api/v1.3/{endpoint}/{asset_id}',
asset_id, headers={
'accept': 'application/json',
'referer': 'https://www.bandlab.com/',
'x-client-id': 'BandLab-Web',
'x-client-version': '10.1.124',
**headers,
}, **kwargs)
def _parse_revision(self, revision_data, url=None):
return {
'vcodec': 'none',
'media_type': 'revision',
'extractor_key': BandlabIE.ie_key(),
'extractor': BandlabIE.IE_NAME,
**traverse_obj(revision_data, {
'webpage_url': (
'id', ({value(url)}, {format_field(template='https://www.bandlab.com/revision/%s')}), filter, any),
'id': (('revisionId', 'id'), {str}, any),
'title': ('song', 'name', {str}),
'track': ('song', 'name', {str}),
'url': ('mixdown', 'file', {url_or_none}),
'thumbnail': ('song', 'picture', 'url', {url_or_none}),
'description': ('description', {str}),
'uploader': ('creator', 'name', {str}),
'uploader_id': ('creator', 'username', {str}),
'timestamp': ('createdOn', {parse_iso8601}),
'duration': ('mixdown', 'duration', {float_or_none}),
'view_count': ('counters', 'plays', {int_or_none}),
'like_count': ('counters', 'likes', {int_or_none}),
'comment_count': ('counters', 'comments', {int_or_none}),
'genres': ('genres', ..., 'name', {str}),
}),
}
def _parse_track(self, track_data, url=None):
return {
'vcodec': 'none',
'media_type': 'track',
'extractor_key': BandlabIE.ie_key(),
'extractor': BandlabIE.IE_NAME,
**traverse_obj(track_data, {
'webpage_url': (
'id', ({value(url)}, {format_field(template='https://www.bandlab.com/post/%s')}), filter, any),
'id': (('revisionId', 'id'), {str}, any),
'url': ('track', 'sample', 'audioUrl', {url_or_none}),
'title': ('track', 'name', {str}),
'track': ('track', 'name', {str}),
'description': ('caption', {str}),
'thumbnail': ('track', 'picture', ('original', 'url'), {url_or_none}, any),
'view_count': ('counters', 'plays', {int_or_none}),
'like_count': ('counters', 'likes', {int_or_none}),
'comment_count': ('counters', 'comments', {int_or_none}),
'duration': ('track', 'sample', 'duration', {float_or_none}),
'uploader': ('creator', 'name', {str}),
'uploader_id': ('creator', 'username', {str}),
'timestamp': ('createdOn', {parse_iso8601}),
}),
}
def _parse_video(self, video_data, url=None):
return {
'media_type': 'video',
'extractor_key': BandlabIE.ie_key(),
'extractor': BandlabIE.IE_NAME,
**traverse_obj(video_data, {
'id': ('id', {str}),
'webpage_url': (
'id', ({value(url)}, {format_field(template='https://www.bandlab.com/post/%s')}), filter, any),
'url': ('video', 'url', {url_or_none}),
'title': ('caption', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=50)}),
'description': ('caption', {str}),
'thumbnail': ('video', 'picture', 'url', {url_or_none}),
'view_count': ('video', 'counters', 'plays', {int_or_none}),
'like_count': ('video', 'counters', 'likes', {int_or_none}),
'comment_count': ('counters', 'comments', {int_or_none}),
'duration': ('video', 'duration', {float_or_none}),
'uploader': ('creator', 'name', {str}),
'uploader_id': ('creator', 'username', {str}),
}),
}
class BandlabIE(BandlabBaseIE):
_VALID_URL = [
r'https?://(?:www\.)?bandlab.com/(?P<url_type>track|post|revision)/(?P<id>[\da-f_-]+)',
r'https?://(?:www\.)?bandlab.com/(?P<url_type>embed)/\?(?:[^#]*&)?id=(?P<id>[\da-f-]+)',
]
_EMBED_REGEX = [rf'<iframe[^>]+src=[\'"](?P<url>{_VALID_URL[1]})[\'"]']
_TESTS = [{
'url': 'https://www.bandlab.com/track/04b37e88dba24967b9dac8eb8567ff39_07d7f906fc96ee11b75e000d3a428fff',
'md5': '46f7b43367dd268bbcf0bbe466753b2c',
'info_dict': {
'id': '02d7f906-fc96-ee11-b75e-000d3a428fff',
'ext': 'm4a',
'uploader_id': 'ender_milze',
'track': 'sweet black',
'description': 'composed by juanjn3737',
'timestamp': 1702171963,
'view_count': int,
'like_count': int,
'duration': 54.629999999999995,
'title': 'sweet black',
'upload_date': '20231210',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
'genres': ['Lofi'],
'uploader': 'ender milze',
'comment_count': int,
'media_type': 'revision',
},
}, {
# Same track as above but post URL
'url': 'https://www.bandlab.com/post/07d7f906-fc96-ee11-b75e-000d3a428fff',
'md5': '46f7b43367dd268bbcf0bbe466753b2c',
'info_dict': {
'id': '02d7f906-fc96-ee11-b75e-000d3a428fff',
'ext': 'm4a',
'uploader_id': 'ender_milze',
'track': 'sweet black',
'description': 'composed by juanjn3737',
'timestamp': 1702171973,
'view_count': int,
'like_count': int,
'duration': 54.629999999999995,
'title': 'sweet black',
'upload_date': '20231210',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
'genres': ['Lofi'],
'uploader': 'ender milze',
'comment_count': int,
'media_type': 'revision',
},
}, {
# SharedKey Example
'url': 'https://www.bandlab.com/track/048916c2-c6da-ee11-85f9-6045bd2e11f9?sharedKey=0NNWX8qYAEmI38lWAzCNDA',
'md5': '15174b57c44440e2a2008be9cae00250',
'info_dict': {
'id': '038916c2-c6da-ee11-85f9-6045bd2e11f9',
'ext': 'm4a',
'comment_count': int,
'genres': ['Other'],
'uploader_id': 'user8353034818103753',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/51b18363-da23-4b9b-a29c-2933a3e561ca/',
'timestamp': 1709625771,
'track': 'PodcastMaerchen4b',
'duration': 468.14,
'view_count': int,
'description': 'Podcast: Neues aus der Märchenwelt',
'like_count': int,
'upload_date': '20240305',
'uploader': 'Erna Wageneder',
'title': 'PodcastMaerchen4b',
'media_type': 'revision',
},
}, {
# Different Revision selected
'url': 'https://www.bandlab.com/track/130343fc-148b-ea11-96d2-0003ffd1fc09?revId=110343fc-148b-ea11-96d2-0003ffd1fc09',
'md5': '74e055ef9325d63f37088772fbfe4454',
'info_dict': {
'id': '110343fc-148b-ea11-96d2-0003ffd1fc09',
'ext': 'm4a',
'timestamp': 1588273294,
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/users/b612e533-e4f7-4542-9f50-3fcfd8dd822c/',
'description': 'Final Revision.',
'title': 'Replay ( Instrumental)',
'uploader': 'David R Sparks',
'uploader_id': 'davesnothome69',
'view_count': int,
'comment_count': int,
'track': 'Replay ( Instrumental)',
'genres': ['Rock'],
'upload_date': '20200430',
'like_count': int,
'duration': 279.43,
'media_type': 'revision',
},
}, {
# Video
'url': 'https://www.bandlab.com/post/5cdf9036-3857-ef11-991a-6045bd36e0d9',
'md5': '8caa2ef28e86c1dacf167293cfdbeba9',
'info_dict': {
'id': '5cdf9036-3857-ef11-991a-6045bd36e0d9',
'ext': 'mp4',
'duration': 44.705,
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/videos/67c6cef1-cef6-40d3-831e-a55bc1dcb972/',
'comment_count': int,
'title': 'backing vocals',
'uploader_id': 'marliashya',
'uploader': 'auraa',
'like_count': int,
'description': 'backing vocals',
'media_type': 'video',
},
}, {
# Embed Example
'url': 'https://www.bandlab.com/embed/?blur=false&id=014de0a4-7d82-ea11-a94c-0003ffd19c0f',
'md5': 'a4ad05cb68c54faaed9b0a8453a8cf4a',
'info_dict': {
'id': '014de0a4-7d82-ea11-a94c-0003ffd19c0f',
'ext': 'm4a',
'comment_count': int,
'genres': ['Electronic'],
'uploader': 'Charlie Henson',
'timestamp': 1587328674,
'upload_date': '20200419',
'view_count': int,
'track': 'Positronic Meltdown',
'duration': 318.55,
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/87165bc3-5439-496e-b1f7-a9f13b541ff2/',
'description': 'Checkout my tracks at AOMX http://aomxsounds.com/',
'uploader_id': 'microfreaks',
'title': 'Positronic Meltdown',
'like_count': int,
'media_type': 'revision',
},
}, {
# Track without revisions available
'url': 'https://www.bandlab.com/track/55767ac51789ea11a94c0003ffd1fc09_2f007b0a37b94ec7a69bc25ae15108a5',
'md5': 'f05d68a3769952c2d9257c473e14c15f',
'info_dict': {
'id': '55767ac51789ea11a94c0003ffd1fc09_2f007b0a37b94ec7a69bc25ae15108a5',
'ext': 'm4a',
'track': 'insame',
'like_count': int,
'duration': 84.03,
'title': 'insame',
'view_count': int,
'comment_count': int,
'uploader': 'Sorakime',
'uploader_id': 'sorakime',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/users/572a351a-0f3a-4c6a-ac39-1a5defdeeb1c/',
'timestamp': 1691162128,
'upload_date': '20230804',
'media_type': 'track',
},
}, {
'url': 'https://www.bandlab.com/revision/014de0a4-7d82-ea11-a94c-0003ffd19c0f',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://phantomluigi.github.io/',
'info_dict': {
'id': 'e14223c3-7871-ef11-bdfd-000d3a980db3',
'ext': 'm4a',
'view_count': int,
'upload_date': '20240913',
'uploader_id': 'phantommusicofficial',
'timestamp': 1726194897,
'uploader': 'Phantom',
'comment_count': int,
'genres': ['Progresive Rock'],
'description': 'md5:a38cd668f7a2843295ef284114f18429',
'duration': 225.23,
'like_count': int,
'title': 'Vermilion Pt. 2 (Cover)',
'track': 'Vermilion Pt. 2 (Cover)',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/62b10750-7aef-4f42-ad08-1af52f577e97/',
'media_type': 'revision',
},
}]
def _real_extract(self, url):
display_id, url_type = self._match_valid_url(url).group('id', 'url_type')
qs = parse_qs(url)
revision_id = traverse_obj(qs, (('revId', 'id'), 0, any))
if url_type == 'revision':
revision_id = display_id
revision_data = None
if not revision_id:
post_data = self._call_api(
'posts', display_id, note='Downloading post data',
query=traverse_obj(qs, {'sharedKey': ('sharedKey', 0)}))
revision_id = traverse_obj(post_data, (('revisionId', ('revision', 'id')), {str}, any))
revision_data = traverse_obj(post_data, ('revision', {dict}))
if not revision_data and not revision_id:
post_type = post_data.get('type')
if post_type == 'Video':
return self._parse_video(post_data, url=url)
if post_type == 'Track':
return self._parse_track(post_data, url=url)
raise ExtractorError(f'Could not extract data for post type {post_type!r}')
if not revision_data:
revision_data = self._call_api(
'revisions', revision_id, note='Downloading revision data', query={'edit': 'false'})
return self._parse_revision(revision_data, url=url)
class BandlabPlaylistIE(BandlabBaseIE):
_VALID_URL = [
r'https?://(?:www\.)?bandlab.com/(?:[\w]+/)?(?P<type>albums|collections)/(?P<id>[\da-f-]+)',
r'https?://(?:www\.)?bandlab.com/(?P<type>embed)/collection/\?(?:[^#]*&)?id=(?P<id>[\da-f-]+)',
]
_EMBED_REGEX = [rf'<iframe[^>]+src=[\'"](?P<url>{_VALID_URL[1]})[\'"]']
_TESTS = [{
'url': 'https://www.bandlab.com/davesnothome69/albums/89b79ea6-de42-ed11-b495-00224845aac7',
'info_dict': {
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.3/albums/69507ff3-579a-45be-afca-9e87eddec944/',
'release_date': '20221003',
'title': 'Remnants',
'album': 'Remnants',
'like_count': int,
'album_type': 'LP',
'description': 'A collection of some feel good, rock hits.',
'comment_count': int,
'view_count': int,
'id': '89b79ea6-de42-ed11-b495-00224845aac7',
'uploader': 'David R Sparks',
'uploader_id': 'davesnothome69',
},
'playlist_count': 10,
}, {
'url': 'https://www.bandlab.com/slytheband/collections/955102d4-1040-ef11-86c3-000d3a42581b',
'info_dict': {
'id': '955102d4-1040-ef11-86c3-000d3a42581b',
'timestamp': 1720762659,
'view_count': int,
'title': 'My Shit 🖤',
'uploader_id': 'slytheband',
'uploader': '𝓢𝓛𝓨',
'upload_date': '20240712',
'like_count': int,
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/collections/2c64ca12-b180-4b76-8587-7a8da76bddc8/',
},
'playlist_count': 15,
}, {
# Embeds can contain both albums and collections with the same URL pattern. This is an album
'url': 'https://www.bandlab.com/embed/collection/?id=12cc6f7f-951b-ee11-907c-00224844f303',
'info_dict': {
'id': '12cc6f7f-951b-ee11-907c-00224844f303',
'release_date': '20230706',
'description': 'This is a collection of songs I created when I had an Amiga computer.',
'view_count': int,
'title': 'Mark Salud The Amiga Collection',
'uploader_id': 'mssirmooth1962',
'comment_count': int,
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.3/albums/d618bd7b-0537-40d5-bdd8-61b066e77d59/',
'like_count': int,
'uploader': 'Mark Salud',
'album': 'Mark Salud The Amiga Collection',
'album_type': 'LP',
},
'playlist_count': 24,
}, {
# Tracks without revision id
'url': 'https://www.bandlab.com/embed/collection/?id=e98aafb5-d932-ee11-b8f0-00224844c719',
'info_dict': {
'like_count': int,
'uploader_id': 'sorakime',
'comment_count': int,
'uploader': 'Sorakime',
'view_count': int,
'description': 'md5:4ec31c568a5f5a5a2b17572ea64c3825',
'release_date': '20230812',
'title': 'Art',
'album': 'Art',
'album_type': 'Album',
'id': 'e98aafb5-d932-ee11-b8f0-00224844c719',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.3/albums/20c890de-e94a-4422-828a-2da6377a13c8/',
},
'playlist_count': 13,
}, {
'url': 'https://www.bandlab.com/albums/89b79ea6-de42-ed11-b495-00224845aac7',
'only_matching': True,
}]
def _entries(self, album_data):
for post in traverse_obj(album_data, ('posts', lambda _, v: v['type'])):
post_type = post['type']
if post_type == 'Revision':
yield self._parse_revision(post.get('revision'))
elif post_type == 'Track':
yield self._parse_track(post)
elif post_type == 'Video':
yield self._parse_video(post)
else:
self.report_warning(f'Skipping unknown post type: "{post_type}"')
def _real_extract(self, url):
playlist_id, playlist_type = self._match_valid_url(url).group('id', 'type')
endpoints = {
'albums': ['albums'],
'collections': ['collections'],
'embed': ['collections', 'albums'],
}.get(playlist_type)
for endpoint in endpoints:
playlist_data = self._call_api(
endpoint, playlist_id, note=f'Downloading {endpoint[:-1]} data',
fatal=False, expected_status=404)
if not playlist_data.get('errorCode'):
playlist_type = endpoint
break
if error_code := playlist_data.get('errorCode'):
raise ExtractorError(f'Could not find playlist data. Error code: "{error_code}"')
return self.playlist_result(
self._entries(playlist_data), playlist_id,
**traverse_obj(playlist_data, {
'title': ('name', {str}),
'description': ('description', {str}),
'uploader': ('creator', 'name', {str}),
'uploader_id': ('creator', 'username', {str}),
'timestamp': ('createdOn', {parse_iso8601}),
'release_date': ('releaseDate', {lambda x: x.replace('-', '')}, filter),
'thumbnail': ('picture', ('original', 'url'), {url_or_none}, any),
'like_count': ('counters', 'likes', {int_or_none}),
'comment_count': ('counters', 'comments', {int_or_none}),
'view_count': ('counters', 'plays', {int_or_none}),
}),
**(traverse_obj(playlist_data, {
'album': ('name', {str}),
'album_type': ('type', {str}),
}) if playlist_type == 'albums' else {}))

View File

@@ -4,7 +4,9 @@ import hashlib
import itertools import itertools
import json import json
import math import math
import random
import re import re
import string
import time import time
import urllib.parse import urllib.parse
import uuid import uuid
@@ -18,7 +20,6 @@ from ..utils import (
InAdvancePagedList, InAdvancePagedList,
OnDemandPagedList, OnDemandPagedList,
bool_or_none, bool_or_none,
clean_html,
determine_ext, determine_ext,
filter_dict, filter_dict,
float_or_none, float_or_none,
@@ -63,7 +64,7 @@ class BilibiliBaseIE(InfoExtractor):
'support_formats', lambda _, v: v['quality'] not in parsed_qualities))], delim=', ') 'support_formats', lambda _, v: v['quality'] not in parsed_qualities))], delim=', ')
if missing_formats: if missing_formats:
self.to_screen( self.to_screen(
f'Format(s) {missing_formats} are missing; you have to login or ' f'Format(s) {missing_formats} are missing; you have to '
f'become a premium member to download them. {self._login_hint()}') f'become a premium member to download them. {self._login_hint()}')
def extract_formats(self, play_info): def extract_formats(self, play_info):
@@ -165,14 +166,18 @@ class BilibiliBaseIE(InfoExtractor):
params['w_rid'] = hashlib.md5(f'{query}{self._get_wbi_key(video_id)}'.encode()).hexdigest() params['w_rid'] = hashlib.md5(f'{query}{self._get_wbi_key(video_id)}'.encode()).hexdigest()
return params return params
def _download_playinfo(self, bvid, cid, headers=None, qn=None): def _download_playinfo(self, bvid, cid, headers=None, query=None):
params = {'bvid': bvid, 'cid': cid, 'fnval': 4048} params = {'bvid': bvid, 'cid': cid, 'fnval': 4048, **(query or {})}
if qn: if self.is_logged_in:
params['qn'] = qn params.pop('try_look', None)
if qn := params.get('qn'):
note = f'Downloading video format {qn} for cid {cid}'
else:
note = f'Downloading video formats for cid {cid}'
return self._download_json( return self._download_json(
'https://api.bilibili.com/x/player/wbi/playurl', bvid, 'https://api.bilibili.com/x/player/wbi/playurl', bvid,
query=self._sign_wbi(params, bvid), headers=headers, query=self._sign_wbi(params, bvid), headers=headers, note=note)['data']
note=f'Downloading video formats for cid {cid} {qn or ""}')['data']
def json2srt(self, json_data): def json2srt(self, json_data):
srt_data = '' srt_data = ''
@@ -191,7 +196,7 @@ class BilibiliBaseIE(InfoExtractor):
} }
video_info = self._download_json( video_info = self._download_json(
'https://api.bilibili.com/x/player/v2', video_id, 'https://api.bilibili.com/x/player/wbi/v2', video_id,
query={'aid': aid, 'cid': cid} if aid else {'bvid': video_id, 'cid': cid}, query={'aid': aid, 'cid': cid} if aid else {'bvid': video_id, 'cid': cid},
note=f'Extracting subtitle info {cid}', headers=self._HEADERS) note=f'Extracting subtitle info {cid}', headers=self._HEADERS)
if traverse_obj(video_info, ('data', 'need_login_subtitle')): if traverse_obj(video_info, ('data', 'need_login_subtitle')):
@@ -207,7 +212,7 @@ class BilibiliBaseIE(InfoExtractor):
def _get_chapters(self, aid, cid): def _get_chapters(self, aid, cid):
chapters = aid and cid and self._download_json( chapters = aid and cid and self._download_json(
'https://api.bilibili.com/x/player/v2', aid, query={'aid': aid, 'cid': cid}, 'https://api.bilibili.com/x/player/wbi/v2', aid, query={'aid': aid, 'cid': cid},
note='Extracting chapters', fatal=False, headers=self._HEADERS) note='Extracting chapters', fatal=False, headers=self._HEADERS)
return traverse_obj(chapters, ('data', 'view_points', ..., { return traverse_obj(chapters, ('data', 'view_points', ..., {
'title': 'content', 'title': 'content',
@@ -286,7 +291,7 @@ class BilibiliBaseIE(InfoExtractor):
('data', 'interaction', 'graph_version', {int_or_none})) ('data', 'interaction', 'graph_version', {int_or_none}))
cid_edges = self._get_divisions(video_id, graph_version, {1: {'cid': cid}}, 1) cid_edges = self._get_divisions(video_id, graph_version, {1: {'cid': cid}}, 1)
for cid, edges in cid_edges.items(): for cid, edges in cid_edges.items():
play_info = self._download_playinfo(video_id, cid, headers=headers) play_info = self._download_playinfo(video_id, cid, headers=headers, query={'try_look': 1})
yield { yield {
**metainfo, **metainfo,
'id': f'{video_id}_{cid}', 'id': f'{video_id}_{cid}',
@@ -639,40 +644,29 @@ class BiliBiliIE(BilibiliBaseIE):
headers['Referer'] = url headers['Referer'] = url
initial_state = self._search_json(r'window\.__INITIAL_STATE__\s*=', webpage, 'initial state', video_id) initial_state = self._search_json(r'window\.__INITIAL_STATE__\s*=', webpage, 'initial state', video_id)
if traverse_obj(initial_state, ('error', 'trueCode')) == -403:
self.raise_login_required()
if traverse_obj(initial_state, ('error', 'trueCode')) == -404:
raise ExtractorError(
'This video may be deleted or geo-restricted. '
'You might want to try a VPN or a proxy server (with --proxy)', expected=True)
is_festival = 'videoData' not in initial_state is_festival = 'videoData' not in initial_state
if is_festival: if is_festival:
video_data = initial_state['videoInfo'] video_data = initial_state['videoInfo']
else: else:
play_info_obj = self._search_json(
r'window\.__playinfo__\s*=', webpage, 'play info', video_id, fatal=False)
if not play_info_obj:
if traverse_obj(initial_state, ('error', 'trueCode')) == -403:
self.raise_login_required()
if traverse_obj(initial_state, ('error', 'trueCode')) == -404:
raise ExtractorError(
'This video may be deleted or geo-restricted. '
'You might want to try a VPN or a proxy server (with --proxy)', expected=True)
play_info = traverse_obj(play_info_obj, ('data', {dict}))
if not play_info:
if traverse_obj(play_info_obj, 'code') == 87007:
toast = get_element_by_class('tips-toast', webpage) or ''
msg = clean_html(
f'{get_element_by_class("belongs-to", toast) or ""}'
+ (get_element_by_class('level', toast) or ''))
raise ExtractorError(
f'This is a supporter-only video: {msg}. {self._login_hint()}', expected=True)
raise ExtractorError('Failed to extract play info')
video_data = initial_state['videoData'] video_data = initial_state['videoData']
video_id, title = video_data['bvid'], video_data.get('title') video_id, title = video_data['bvid'], video_data.get('title')
# Bilibili anthologies are similar to playlists but all videos share the same video ID as the anthology itself. # Bilibili anthologies are similar to playlists but all videos share the same video ID as the anthology itself.
page_list_json = not is_festival and traverse_obj( page_list_json = (not is_festival and traverse_obj(
self._download_json( self._download_json(
'https://api.bilibili.com/x/player/pagelist', video_id, 'https://api.bilibili.com/x/player/pagelist', video_id,
fatal=False, query={'bvid': video_id, 'jsonp': 'jsonp'}, fatal=False, query={'bvid': video_id, 'jsonp': 'jsonp'},
note='Extracting videos in anthology', headers=headers), note='Extracting videos in anthology', headers=headers),
'data', expected_type=list) or [] 'data', expected_type=list)) or []
is_anthology = len(page_list_json) > 1 is_anthology = len(page_list_json) > 1
part_id = int_or_none(parse_qs(url).get('p', [None])[-1]) part_id = int_or_none(parse_qs(url).get('p', [None])[-1])
@@ -691,8 +685,6 @@ class BiliBiliIE(BilibiliBaseIE):
festival_info = {} festival_info = {}
if is_festival: if is_festival:
play_info = self._download_playinfo(video_id, cid, headers=headers)
festival_info = traverse_obj(initial_state, { festival_info = traverse_obj(initial_state, {
'uploader': ('videoInfo', 'upName'), 'uploader': ('videoInfo', 'upName'),
'uploader_id': ('videoInfo', 'upMid', {str_or_none}), 'uploader_id': ('videoInfo', 'upMid', {str_or_none}),
@@ -727,62 +719,79 @@ class BiliBiliIE(BilibiliBaseIE):
self._get_interactive_entries(video_id, cid, metainfo, headers=headers), **metainfo, self._get_interactive_entries(video_id, cid, metainfo, headers=headers), **metainfo,
duration=traverse_obj(initial_state, ('videoData', 'duration', {int_or_none})), duration=traverse_obj(initial_state, ('videoData', 'duration', {int_or_none})),
__post_extractor=self.extract_comments(aid)) __post_extractor=self.extract_comments(aid))
else:
formats = self.extract_formats(play_info)
if not traverse_obj(play_info, ('dash')): play_info = None
# we only have legacy formats and need additional work if self.is_logged_in:
has_qn = lambda x: x in traverse_obj(formats, (..., 'quality')) play_info = traverse_obj(
for qn in traverse_obj(play_info, ('accept_quality', lambda _, v: not has_qn(v), {int})): self._search_json(r'window\.__playinfo__\s*=', webpage, 'play info', video_id, default=None),
formats.extend(traverse_obj( ('data', {dict}))
self.extract_formats(self._download_playinfo(video_id, cid, headers=headers, qn=qn)), if not play_info:
lambda _, v: not has_qn(v['quality']))) play_info = self._download_playinfo(video_id, cid, headers=headers, query={'try_look': 1})
self._check_missing_formats(play_info, formats) formats = self.extract_formats(play_info)
flv_formats = traverse_obj(formats, lambda _, v: v['fragments'])
if flv_formats and len(flv_formats) < len(formats):
# Flv and mp4 are incompatible due to `multi_video` workaround, so drop one
if not self._configuration_arg('prefer_multi_flv'):
dropped_fmts = ', '.join(
f'{f.get("format_note")} ({f.get("format_id")})' for f in flv_formats)
formats = traverse_obj(formats, lambda _, v: not v.get('fragments'))
if dropped_fmts:
self.to_screen(
f'Dropping incompatible flv format(s) {dropped_fmts} since mp4 is available. '
'To extract flv, pass --extractor-args "bilibili:prefer_multi_flv"')
else:
formats = traverse_obj(
# XXX: Filtering by extractor-arg is for testing purposes
formats, lambda _, v: v['quality'] == int(self._configuration_arg('prefer_multi_flv')[0]),
) or [max(flv_formats, key=lambda x: x['quality'])]
if traverse_obj(formats, (0, 'fragments')): if video_data.get('is_upower_exclusive'):
# We have flv formats, which are individual short videos with their own timestamps and metainfo high_level = traverse_obj(initial_state, ('elecFullInfo', 'show_info', 'high_level', {dict})) or {}
# Binary concatenation corrupts their timestamps, so we need a `multi_video` workaround msg = f'{join_nonempty("title", "sub_title", from_dict=high_level, delim="")}. {self._login_hint()}'
return { if not formats:
**metainfo, raise ExtractorError(f'This is a supporter-only video: {msg}', expected=True)
'_type': 'multi_video', if '试看' in traverse_obj(play_info, ('accept_description', ..., {str})):
'entries': [{ self.report_warning(
'id': f'{metainfo["id"]}_{idx}', f'This is a supporter-only video, only the preview will be extracted: {msg}',
'title': metainfo['title'], video_id=video_id)
'http_headers': metainfo['http_headers'],
'formats': [{ if not traverse_obj(play_info, 'dash'):
**fragment, # we only have legacy formats and need additional work
'format_id': formats[0].get('format_id'), has_qn = lambda x: x in traverse_obj(formats, (..., 'quality'))
}], for qn in traverse_obj(play_info, ('accept_quality', lambda _, v: not has_qn(v), {int})):
'subtitles': self.extract_subtitles(video_id, cid) if idx == 0 else None, formats.extend(traverse_obj(
'__post_extractor': self.extract_comments(aid) if idx == 0 else None, self.extract_formats(self._download_playinfo(video_id, cid, headers=headers, query={'qn': qn})),
} for idx, fragment in enumerate(formats[0]['fragments'])], lambda _, v: not has_qn(v['quality'])))
'duration': float_or_none(play_info.get('timelength'), scale=1000), self._check_missing_formats(play_info, formats)
} flv_formats = traverse_obj(formats, lambda _, v: v['fragments'])
else: if flv_formats and len(flv_formats) < len(formats):
return { # Flv and mp4 are incompatible due to `multi_video` workaround, so drop one
**metainfo, if not self._configuration_arg('prefer_multi_flv'):
'formats': formats, dropped_fmts = ', '.join(
'duration': float_or_none(play_info.get('timelength'), scale=1000), f'{f.get("format_note")} ({f.get("format_id")})' for f in flv_formats)
'chapters': self._get_chapters(aid, cid), formats = traverse_obj(formats, lambda _, v: not v.get('fragments'))
'subtitles': self.extract_subtitles(video_id, cid), if dropped_fmts:
'__post_extractor': self.extract_comments(aid), self.to_screen(
} f'Dropping incompatible flv format(s) {dropped_fmts} since mp4 is available. '
'To extract flv, pass --extractor-args "bilibili:prefer_multi_flv"')
else:
formats = traverse_obj(
# XXX: Filtering by extractor-arg is for testing purposes
formats, lambda _, v: v['quality'] == int(self._configuration_arg('prefer_multi_flv')[0]),
) or [max(flv_formats, key=lambda x: x['quality'])]
if traverse_obj(formats, (0, 'fragments')):
# We have flv formats, which are individual short videos with their own timestamps and metainfo
# Binary concatenation corrupts their timestamps, so we need a `multi_video` workaround
return {
**metainfo,
'_type': 'multi_video',
'entries': [{
'id': f'{metainfo["id"]}_{idx}',
'title': metainfo['title'],
'http_headers': metainfo['http_headers'],
'formats': [{
**fragment,
'format_id': formats[0].get('format_id'),
}],
'subtitles': self.extract_subtitles(video_id, cid) if idx == 0 else None,
'__post_extractor': self.extract_comments(aid) if idx == 0 else None,
} for idx, fragment in enumerate(formats[0]['fragments'])],
'duration': float_or_none(play_info.get('timelength'), scale=1000),
}
return {
**metainfo,
'formats': formats,
'duration': float_or_none(play_info.get('timelength'), scale=1000),
'chapters': self._get_chapters(aid, cid),
'subtitles': self.extract_subtitles(video_id, cid),
'__post_extractor': self.extract_comments(aid),
}
class BiliBiliBangumiIE(BilibiliBaseIE): class BiliBiliBangumiIE(BilibiliBaseIE):
@@ -860,10 +869,16 @@ class BiliBiliBangumiIE(BilibiliBaseIE):
self.raise_login_required('This video is for premium members only') self.raise_login_required('This video is for premium members only')
headers['Referer'] = url headers['Referer'] = url
play_info = self._download_json(
'https://api.bilibili.com/pgc/player/web/v2/playurl', episode_id, play_info = (
'Extracting episode', query={'fnval': '4048', 'ep_id': episode_id}, self._search_json(
headers=headers) r'playurlSSRData\s*=', webpage, 'embedded page info', episode_id,
end_pattern='\n', default=None)
or self._download_json(
'https://api.bilibili.com/pgc/player/web/v2/playurl', episode_id,
'Extracting episode', query={'fnval': 12240, 'ep_id': episode_id},
headers=headers))
premium_only = play_info.get('code') == -10403 premium_only = play_info.get('code') == -10403
play_info = traverse_obj(play_info, ('result', 'video_info', {dict})) or {} play_info = traverse_obj(play_info, ('result', 'video_info', {dict})) or {}
@@ -1164,28 +1179,26 @@ class BilibiliSpaceBaseIE(BilibiliBaseIE):
class BilibiliSpaceVideoIE(BilibiliSpaceBaseIE): class BilibiliSpaceVideoIE(BilibiliSpaceBaseIE):
_VALID_URL = r'https?://space\.bilibili\.com/(?P<id>\d+)(?P<video>/video)?/?(?:[?#]|$)' _VALID_URL = r'https?://space\.bilibili\.com/(?P<id>\d+)(?P<video>(?:/upload)?/video)?/?(?:[?#]|$)'
_TESTS = [{ _TESTS = [{
'url': 'https://space.bilibili.com/3985676/video', 'url': 'https://space.bilibili.com/3985676/video',
'info_dict': { 'info_dict': {
'id': '3985676', 'id': '3985676',
}, },
'playlist_mincount': 178, 'playlist_mincount': 178,
'skip': 'login required',
}, { }, {
'url': 'https://space.bilibili.com/313580179/video', 'url': 'https://space.bilibili.com/313580179/video',
'info_dict': { 'info_dict': {
'id': '313580179', 'id': '313580179',
}, },
'playlist_mincount': 92, 'playlist_mincount': 92,
'skip': 'login required',
}] }]
def _real_extract(self, url): def _real_extract(self, url):
playlist_id, is_video_url = self._match_valid_url(url).group('id', 'video') playlist_id, is_video_url = self._match_valid_url(url).group('id', 'video')
if not is_video_url: if not is_video_url:
self.to_screen('A channel URL was given. Only the channel\'s videos will be downloaded. ' self.to_screen('A channel URL was given. Only the channel\'s videos will be downloaded. '
'To download audios, add a "/audio" to the URL') 'To download audios, add a "/upload/audio" to the URL')
def fetch_page(page_idx): def fetch_page(page_idx):
query = { query = {
@@ -1198,6 +1211,12 @@ class BilibiliSpaceVideoIE(BilibiliSpaceBaseIE):
'ps': 30, 'ps': 30,
'tid': 0, 'tid': 0,
'web_location': 1550101, 'web_location': 1550101,
'dm_img_list': '[]',
'dm_img_str': base64.b64encode(
''.join(random.choices(string.printable, k=random.randint(16, 64))).encode())[:-2].decode(),
'dm_cover_img_str': base64.b64encode(
''.join(random.choices(string.printable, k=random.randint(32, 128))).encode())[:-2].decode(),
'dm_img_inter': '{"ds":[],"wh":[6093,6631,31],"of":[430,760,380]}',
} }
try: try:
@@ -1208,14 +1227,14 @@ class BilibiliSpaceVideoIE(BilibiliSpaceBaseIE):
except ExtractorError as e: except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 412: if isinstance(e.cause, HTTPError) and e.cause.status == 412:
raise ExtractorError( raise ExtractorError(
'Request is blocked by server (412), please add cookies, wait and try later.', expected=True) 'Request is blocked by server (412), please wait and try later.', expected=True)
raise raise
status_code = response['code'] status_code = response['code']
if status_code == -401: if status_code == -401:
raise ExtractorError( raise ExtractorError(
'Request is blocked by server (401), please add cookies, wait and try later.', expected=True) 'Request is blocked by server (401), please wait and try later.', expected=True)
elif status_code == -352 and not self.is_logged_in: elif status_code == -352:
self.raise_login_required('Request is rejected, you need to login to access playlist') raise ExtractorError('Request is rejected by server (352)', expected=True)
elif status_code != 0: elif status_code != 0:
raise ExtractorError(f'Request failed ({status_code}): {response.get("message") or "Unknown error"}') raise ExtractorError(f'Request failed ({status_code}): {response.get("message") or "Unknown error"}')
return response['data'] return response['data']
@@ -1237,9 +1256,9 @@ class BilibiliSpaceVideoIE(BilibiliSpaceBaseIE):
class BilibiliSpaceAudioIE(BilibiliSpaceBaseIE): class BilibiliSpaceAudioIE(BilibiliSpaceBaseIE):
_VALID_URL = r'https?://space\.bilibili\.com/(?P<id>\d+)/audio' _VALID_URL = r'https?://space\.bilibili\.com/(?P<id>\d+)/(?:upload/)?audio'
_TESTS = [{ _TESTS = [{
'url': 'https://space.bilibili.com/313580179/audio', 'url': 'https://space.bilibili.com/313580179/upload/audio',
'info_dict': { 'info_dict': {
'id': '313580179', 'id': '313580179',
}, },
@@ -1262,7 +1281,8 @@ class BilibiliSpaceAudioIE(BilibiliSpaceBaseIE):
} }
def get_entries(page_data): def get_entries(page_data):
for entry in page_data.get('data', []): # data is None when the playlist is empty
for entry in page_data.get('data') or []:
yield self.url_result(f'https://www.bilibili.com/audio/au{entry["id"]}', BilibiliAudioIE, entry['id']) yield self.url_result(f'https://www.bilibili.com/audio/au{entry["id"]}', BilibiliAudioIE, entry['id'])
metadata, paged_list = self._extract_playlist(fetch_page, get_metadata, get_entries) metadata, paged_list = self._extract_playlist(fetch_page, get_metadata, get_entries)
@@ -1286,30 +1306,43 @@ class BilibiliSpaceListBaseIE(BilibiliSpaceBaseIE):
class BilibiliCollectionListIE(BilibiliSpaceListBaseIE): class BilibiliCollectionListIE(BilibiliSpaceListBaseIE):
_VALID_URL = r'https?://space\.bilibili\.com/(?P<mid>\d+)/channel/collectiondetail/?\?sid=(?P<sid>\d+)' _VALID_URL = [
r'https?://space\.bilibili\.com/(?P<mid>\d+)/channel/collectiondetail/?\?sid=(?P<sid>\d+)',
r'https?://space\.bilibili\.com/(?P<mid>\d+)/lists/(?P<sid>\d+)',
]
_TESTS = [{ _TESTS = [{
'url': 'https://space.bilibili.com/2142762/channel/collectiondetail?sid=57445', 'url': 'https://space.bilibili.com/2142762/lists/3662502?type=season',
'info_dict': { 'info_dict': {
'id': '2142762_57445', 'id': '2142762_3662502',
'title': '【完结】《底特律 变人》全结局流程解说', 'title': '合集·《黑神话悟空》流程解说',
'description': '', 'description': '黑神话悟空 相关节目',
'uploader': '老戴在此', 'uploader': '老戴在此',
'uploader_id': '2142762', 'uploader_id': '2142762',
'timestamp': int, 'timestamp': int,
'upload_date': str, 'upload_date': str,
'thumbnail': 'https://archive.biliimg.com/bfs/archive/e0e543ae35ad3df863ea7dea526bc32e70f4c091.jpg', 'thumbnail': 'https://archive.biliimg.com/bfs/archive/22302e17dc849dd4533606d71bc89df162c3a9bf.jpg',
}, },
'playlist_mincount': 31, 'playlist_mincount': 62,
}, {
'url': 'https://space.bilibili.com/2142762/lists/3662502',
'only_matching': True,
}, {
'url': 'https://space.bilibili.com/2142762/channel/collectiondetail?sid=57445',
'only_matching': True,
}] }]
@classmethod
def suitable(cls, url):
return False if BilibiliSeriesListIE.suitable(url) else super().suitable(url)
def _real_extract(self, url): def _real_extract(self, url):
mid, sid = self._match_valid_url(url).group('mid', 'sid') mid, sid = self._match_valid_url(url).group('mid', 'sid')
playlist_id = f'{mid}_{sid}' playlist_id = f'{mid}_{sid}'
def fetch_page(page_idx): def fetch_page(page_idx):
return self._download_json( return self._download_json(
'https://api.bilibili.com/x/polymer/space/seasons_archives_list', 'https://api.bilibili.com/x/polymer/web-space/seasons_archives_list',
playlist_id, note=f'Downloading page {page_idx}', playlist_id, note=f'Downloading page {page_idx}', headers={'Referer': url},
query={'mid': mid, 'season_id': sid, 'page_num': page_idx + 1, 'page_size': 30})['data'] query={'mid': mid, 'season_id': sid, 'page_num': page_idx + 1, 'page_size': 30})['data']
def get_metadata(page_data): def get_metadata(page_data):
@@ -1336,9 +1369,12 @@ class BilibiliCollectionListIE(BilibiliSpaceListBaseIE):
class BilibiliSeriesListIE(BilibiliSpaceListBaseIE): class BilibiliSeriesListIE(BilibiliSpaceListBaseIE):
_VALID_URL = r'https?://space\.bilibili\.com/(?P<mid>\d+)/channel/seriesdetail/?\?\bsid=(?P<sid>\d+)' _VALID_URL = [
r'https?://space\.bilibili\.com/(?P<mid>\d+)/channel/seriesdetail/?\?\bsid=(?P<sid>\d+)',
r'https?://space\.bilibili\.com/(?P<mid>\d+)/lists/(?P<sid>\d+)/?\?(?:[^#]+&)?type=series(?:[&#]|$)',
]
_TESTS = [{ _TESTS = [{
'url': 'https://space.bilibili.com/1958703906/channel/seriesdetail?sid=547718&ctype=0', 'url': 'https://space.bilibili.com/1958703906/lists/547718?type=series',
'info_dict': { 'info_dict': {
'id': '1958703906_547718', 'id': '1958703906_547718',
'title': '直播回放', 'title': '直播回放',
@@ -1351,6 +1387,9 @@ class BilibiliSeriesListIE(BilibiliSpaceListBaseIE):
'modified_date': str, 'modified_date': str,
}, },
'playlist_mincount': 513, 'playlist_mincount': 513,
}, {
'url': 'https://space.bilibili.com/1958703906/channel/seriesdetail?sid=547718&ctype=0',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@@ -1369,7 +1408,7 @@ class BilibiliSeriesListIE(BilibiliSpaceListBaseIE):
def fetch_page(page_idx): def fetch_page(page_idx):
return self._download_json( return self._download_json(
'https://api.bilibili.com/x/series/archives', 'https://api.bilibili.com/x/series/archives',
playlist_id, note=f'Downloading page {page_idx}', playlist_id, note=f'Downloading page {page_idx}', headers={'Referer': url},
query={'mid': mid, 'series_id': sid, 'pn': page_idx + 1, 'ps': 30})['data'] query={'mid': mid, 'series_id': sid, 'pn': page_idx + 1, 'ps': 30})['data']
def get_metadata(page_data): def get_metadata(page_data):
@@ -1848,6 +1887,47 @@ class BiliBiliPlayerIE(InfoExtractor):
ie=BiliBiliIE.ie_key(), video_id=video_id) ie=BiliBiliIE.ie_key(), video_id=video_id)
class BiliBiliDynamicIE(InfoExtractor):
_VALID_URL = r'https?://(?:t\.bilibili\.com|(?:www\.)?bilibili\.com/opus)/(?P<id>\d+)'
_TESTS = [{
'url': 'https://t.bilibili.com/998134289197432852',
'info_dict': {
'id': 'BV1TAmBYVEJr',
'ext': 'mp4',
'uploader_id': '1192648858',
'comment_count': int,
'_old_archive_ids': ['bilibili 113457567568273_part1'],
'thumbnail': 'http://i2.hdslb.com/bfs/archive/50091efd965d9f13ff6814f7ad374f90ab21e77d.jpg',
'duration': 929.238,
'upload_date': '20241110',
'uploader': '何同学工作室',
'like_count': int,
'view_count': int,
'title': '美国小朋友就玩这个何同学工作室11月开箱',
'description': '本期产品信息:\n机器狗\n气味模拟器\nCloudboom Strike LS\n无弦吉他\n蓝牙磁带音箱\n神奇画板',
'timestamp': 1731232800,
'tags': list,
'chapters': list,
},
}]
def _real_extract(self, url):
post_id = self._match_id(url)
# Without the newer chrome UA, the API will return an error (-352)
post_data = self._download_json(
'https://api.bilibili.com/x/polymer/web-dynamic/v1/detail', post_id,
query={'id': post_id}, headers={
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
})
video_url = traverse_obj(post_data, (
'data', 'item', (None, 'orig'), 'modules', 'module_dynamic',
(('major', ('archive', 'pgc')), ('additional', ('reserve', 'common'))),
'jump_url', {url_or_none}, any, {self._proto_relative_url}))
if not video_url or (self.suitable(video_url) and post_id == self._match_id(video_url)):
raise ExtractorError('No valid video URL found', expected=True)
return self.url_result(video_url)
class BiliIntlBaseIE(InfoExtractor): class BiliIntlBaseIE(InfoExtractor):
_API_URL = 'https://api.bilibili.tv/intl/gateway' _API_URL = 'https://api.bilibili.tv/intl/gateway'
_NETRC_MACHINE = 'biliintl' _NETRC_MACHINE = 'biliintl'

View File

@@ -88,7 +88,7 @@ class BlueskyIE(InfoExtractor):
}, },
}, { }, {
'url': 'https://bsky.app/profile/de1.pds.tentacle.expert/post/3l3w4tnezek2e', 'url': 'https://bsky.app/profile/de1.pds.tentacle.expert/post/3l3w4tnezek2e',
'md5': '1af9c7fda061cf7593bbffca89e43d1c', 'md5': 'cc0110ed1f6b0247caac8234cc1e861d',
'info_dict': { 'info_dict': {
'id': '3l3w4tnezek2e', 'id': '3l3w4tnezek2e',
'ext': 'mp4', 'ext': 'mp4',
@@ -133,6 +133,8 @@ class BlueskyIE(InfoExtractor):
'channel_follower_count': int, 'channel_follower_count': int,
'categories': ['Entertainment'], 'categories': ['Entertainment'],
'tags': [], 'tags': [],
'chapters': list,
'heatmap': 'count:100',
}, },
'add_ie': ['Youtube'], 'add_ie': ['Youtube'],
}, { }, {
@@ -184,14 +186,14 @@ class BlueskyIE(InfoExtractor):
}, },
}, },
}, { }, {
'url': 'https://bsky.app/profile/alt.bun.how/post/3l7rdfxhyds2f', 'url': 'https://bsky.app/profile/cinny.bun.how/post/3l7rdfxhyds2f',
'md5': '8775118b235cf9fa6b5ad30f95cda75c', 'md5': '8775118b235cf9fa6b5ad30f95cda75c',
'info_dict': { 'info_dict': {
'id': '3l7rdfxhyds2f', 'id': '3l7rdfxhyds2f',
'ext': 'mp4', 'ext': 'mp4',
'uploader': 'cinnamon', 'uploader': 'cinnamon',
'uploader_id': 'alt.bun.how', 'uploader_id': 'cinny.bun.how',
'uploader_url': 'https://bsky.app/profile/alt.bun.how', 'uploader_url': 'https://bsky.app/profile/cinny.bun.how',
'channel_id': 'did:plc:7x6rtuenkuvxq3zsvffp2ide', 'channel_id': 'did:plc:7x6rtuenkuvxq3zsvffp2ide',
'channel_url': 'https://bsky.app/profile/did:plc:7x6rtuenkuvxq3zsvffp2ide', 'channel_url': 'https://bsky.app/profile/did:plc:7x6rtuenkuvxq3zsvffp2ide',
'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$', 'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
@@ -284,17 +286,19 @@ class BlueskyIE(InfoExtractor):
services, ('service', lambda _, x: x['type'] == 'AtprotoPersonalDataServer', services, ('service', lambda _, x: x['type'] == 'AtprotoPersonalDataServer',
'serviceEndpoint', {url_or_none}, any)) or 'https://bsky.social' 'serviceEndpoint', {url_or_none}, any)) or 'https://bsky.social'
def _real_extract(self, url): def _extract_post(self, handle, post_id):
handle, video_id = self._match_valid_url(url).group('handle', 'id') return self._download_json(
post = self._download_json(
'https://public.api.bsky.app/xrpc/app.bsky.feed.getPostThread', 'https://public.api.bsky.app/xrpc/app.bsky.feed.getPostThread',
video_id, query={ post_id, query={
'uri': f'at://{handle}/app.bsky.feed.post/{video_id}', 'uri': f'at://{handle}/app.bsky.feed.post/{post_id}',
'depth': 0, 'depth': 0,
'parentHeight': 0, 'parentHeight': 0,
})['thread']['post'] })['thread']['post']
def _real_extract(self, url):
handle, video_id = self._match_valid_url(url).group('handle', 'id')
post = self._extract_post(handle, video_id)
entries = [] entries = []
# app.bsky.embed.video.view/app.bsky.embed.external.view # app.bsky.embed.video.view/app.bsky.embed.external.view
entries.extend(self._extract_videos(post, video_id)) entries.extend(self._extract_videos(post, video_id))
@@ -341,6 +345,7 @@ class BlueskyIE(InfoExtractor):
formats.append({ formats.append({
'format_id': 'blob', 'format_id': 'blob',
'quality': 1,
'url': update_url_query( 'url': update_url_query(
self._BLOB_URL_TMPL.format(endpoint), {'did': did, 'cid': video_cid}), self._BLOB_URL_TMPL.format(endpoint), {'did': did, 'cid': video_cid}),
**traverse_obj(root, (*embed_path, 'aspectRatio', { **traverse_obj(root, (*embed_path, 'aspectRatio', {

View File

@@ -31,6 +31,7 @@ from ..utils import (
update_url_query, update_url_query,
url_or_none, url_or_none,
) )
from ..utils.traversal import traverse_obj
class BrightcoveLegacyIE(InfoExtractor): class BrightcoveLegacyIE(InfoExtractor):
@@ -935,8 +936,8 @@ class BrightcoveNewIE(BrightcoveNewBaseIE):
if content_type == 'playlist': if content_type == 'playlist':
return self.playlist_result( return self.playlist_result(
[self._parse_brightcove_metadata(vid, vid.get('id'), headers) (self._parse_brightcove_metadata(vid, vid['id'], headers)
for vid in json_data.get('videos', []) if vid.get('id')], for vid in traverse_obj(json_data, ('videos', lambda _, v: v['id']))),
json_data.get('id'), json_data.get('name'), json_data.get('id'), json_data.get('name'),
json_data.get('description')) json_data.get('description'))

View File

@@ -5,11 +5,12 @@ from ..utils import (
ExtractorError, ExtractorError,
lowercase_escape, lowercase_escape,
url_or_none, url_or_none,
urlencode_postdata,
) )
class ChaturbateIE(InfoExtractor): class ChaturbateIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^/]+\.)?chaturbate\.com/(?:fullvideo/?\?.*?\bb=)?(?P<id>[^/?&#]+)' _VALID_URL = r'https?://(?:[^/]+\.)?chaturbate\.(?P<tld>com|eu|global)/(?:fullvideo/?\?.*?\bb=)?(?P<id>[^/?&#]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.chaturbate.com/siswet19/', 'url': 'https://www.chaturbate.com/siswet19/',
'info_dict': { 'info_dict': {
@@ -29,16 +30,58 @@ class ChaturbateIE(InfoExtractor):
}, { }, {
'url': 'https://en.chaturbate.com/siswet19/', 'url': 'https://en.chaturbate.com/siswet19/',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://chaturbate.eu/siswet19/',
'only_matching': True,
}, {
'url': 'https://chaturbate.eu/fullvideo/?b=caylin',
'only_matching': True,
}, {
'url': 'https://chaturbate.global/siswet19/',
'only_matching': True,
}] }]
_ROOM_OFFLINE = 'Room is currently offline' _ERROR_MAP = {
'offline': 'Room is currently offline',
'private': 'Room is currently in a private show',
'away': 'Performer is currently away',
'password protected': 'Room is password protected',
'hidden': 'Hidden session in progress',
}
def _real_extract(self, url): def _extract_from_api(self, video_id, tld):
video_id = self._match_id(url) response = self._download_json(
f'https://chaturbate.{tld}/get_edge_hls_url_ajax/', video_id,
data=urlencode_postdata({'room_slug': video_id}),
headers={
**self.geo_verification_headers(),
'X-Requested-With': 'XMLHttpRequest',
'Accept': 'application/json',
}, fatal=False, impersonate=True) or {}
m3u8_url = response.get('url')
if not m3u8_url:
status = response.get('room_status')
if error := self._ERROR_MAP.get(status):
raise ExtractorError(error, expected=True)
if status == 'public':
self.raise_geo_restricted()
self.report_warning(f'Got status "{status}" from API; falling back to webpage extraction')
return None
return {
'id': video_id,
'title': video_id,
'thumbnail': f'https://roomimg.stream.highwebmedia.com/ri/{video_id}.jpg',
'is_live': True,
'age_limit': 18,
'formats': self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4', live=True),
}
def _extract_from_html(self, video_id, tld):
webpage = self._download_webpage( webpage = self._download_webpage(
f'https://chaturbate.com/{video_id}/', video_id, f'https://chaturbate.{tld}/{video_id}/', video_id,
headers=self.geo_verification_headers()) headers=self.geo_verification_headers(), impersonate=True)
found_m3u8_urls = [] found_m3u8_urls = []
@@ -76,8 +119,8 @@ class ChaturbateIE(InfoExtractor):
webpage, 'error', group='error', default=None) webpage, 'error', group='error', default=None)
if not error: if not error:
if any(p in webpage for p in ( if any(p in webpage for p in (
self._ROOM_OFFLINE, 'offline_tipping', 'tip_offline')): self._ERROR_MAP['offline'], 'offline_tipping', 'tip_offline')):
error = self._ROOM_OFFLINE error = self._ERROR_MAP['offline']
if error: if error:
raise ExtractorError(error, expected=True) raise ExtractorError(error, expected=True)
raise ExtractorError('Unable to find stream URL') raise ExtractorError('Unable to find stream URL')
@@ -104,3 +147,7 @@ class ChaturbateIE(InfoExtractor):
'is_live': True, 'is_live': True,
'formats': formats, 'formats': formats,
} }
def _real_extract(self, url):
video_id, tld = self._match_valid_url(url).group('id', 'tld')
return self._extract_from_api(video_id, tld) or self._extract_from_html(video_id, tld)

View File

@@ -8,7 +8,7 @@ class CloudflareStreamIE(InfoExtractor):
_DOMAIN_RE = r'(?:cloudflarestream\.com|(?:videodelivery|bytehighway)\.net)' _DOMAIN_RE = r'(?:cloudflarestream\.com|(?:videodelivery|bytehighway)\.net)'
_EMBED_RE = rf'(?:embed\.|{_SUBDOMAIN_RE}){_DOMAIN_RE}/embed/[^/?#]+\.js\?(?:[^#]+&)?video=' _EMBED_RE = rf'(?:embed\.|{_SUBDOMAIN_RE}){_DOMAIN_RE}/embed/[^/?#]+\.js\?(?:[^#]+&)?video='
_ID_RE = r'[\da-f]{32}|eyJ[\w-]+\.[\w-]+\.[\w-]+' _ID_RE = r'[\da-f]{32}|eyJ[\w-]+\.[\w-]+\.[\w-]+'
_VALID_URL = rf'https?://(?:{_SUBDOMAIN_RE}{_DOMAIN_RE}/|{_EMBED_RE})(?P<id>{_ID_RE})' _VALID_URL = rf'https?://(?:{_SUBDOMAIN_RE}(?P<domain>{_DOMAIN_RE})/|{_EMBED_RE})(?P<id>{_ID_RE})'
_EMBED_REGEX = [ _EMBED_REGEX = [
rf'<script[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//{_EMBED_RE}(?:{_ID_RE})(?:(?!\1).)*)\1', rf'<script[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//{_EMBED_RE}(?:{_ID_RE})(?:(?!\1).)*)\1',
rf'<iframe[^>]+\bsrc=["\'](?P<url>https?://{_SUBDOMAIN_RE}{_DOMAIN_RE}/[\da-f]{{32}})', rf'<iframe[^>]+\bsrc=["\'](?P<url>https?://{_SUBDOMAIN_RE}{_DOMAIN_RE}/[\da-f]{{32}})',
@@ -19,7 +19,7 @@ class CloudflareStreamIE(InfoExtractor):
'id': '31c9291ab41fac05471db4e73aa11717', 'id': '31c9291ab41fac05471db4e73aa11717',
'ext': 'mp4', 'ext': 'mp4',
'title': '31c9291ab41fac05471db4e73aa11717', 'title': '31c9291ab41fac05471db4e73aa11717',
'thumbnail': 'https://videodelivery.net/31c9291ab41fac05471db4e73aa11717/thumbnails/thumbnail.jpg', 'thumbnail': 'https://cloudflarestream.com/31c9291ab41fac05471db4e73aa11717/thumbnails/thumbnail.jpg',
}, },
'params': { 'params': {
'skip_download': 'm3u8', 'skip_download': 'm3u8',
@@ -30,7 +30,7 @@ class CloudflareStreamIE(InfoExtractor):
'id': '0e8e040aec776862e1d632a699edf59e', 'id': '0e8e040aec776862e1d632a699edf59e',
'ext': 'mp4', 'ext': 'mp4',
'title': '0e8e040aec776862e1d632a699edf59e', 'title': '0e8e040aec776862e1d632a699edf59e',
'thumbnail': 'https://videodelivery.net/0e8e040aec776862e1d632a699edf59e/thumbnails/thumbnail.jpg', 'thumbnail': 'https://cloudflarestream.com/0e8e040aec776862e1d632a699edf59e/thumbnails/thumbnail.jpg',
}, },
}, { }, {
'url': 'https://watch.cloudflarestream.com/9df17203414fd1db3e3ed74abbe936c1', 'url': 'https://watch.cloudflarestream.com/9df17203414fd1db3e3ed74abbe936c1',
@@ -54,7 +54,7 @@ class CloudflareStreamIE(InfoExtractor):
'id': 'eaef9dea5159cf968be84241b5cedfe7', 'id': 'eaef9dea5159cf968be84241b5cedfe7',
'ext': 'mp4', 'ext': 'mp4',
'title': 'eaef9dea5159cf968be84241b5cedfe7', 'title': 'eaef9dea5159cf968be84241b5cedfe7',
'thumbnail': 'https://videodelivery.net/eaef9dea5159cf968be84241b5cedfe7/thumbnails/thumbnail.jpg', 'thumbnail': 'https://cloudflarestream.com/eaef9dea5159cf968be84241b5cedfe7/thumbnails/thumbnail.jpg',
}, },
'params': { 'params': {
'skip_download': 'm3u8', 'skip_download': 'm3u8',
@@ -62,8 +62,9 @@ class CloudflareStreamIE(InfoExtractor):
}] }]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id, domain = self._match_valid_url(url).group('id', 'domain')
domain = 'bytehighway.net' if 'bytehighway.net/' in url else 'videodelivery.net' if domain != 'bytehighway.net':
domain = 'cloudflarestream.com'
base_url = f'https://{domain}/{video_id}/' base_url = f'https://{domain}/{video_id}/'
if '.' in video_id: if '.' in video_id:
video_id = self._parse_json(base64.urlsafe_b64decode( video_id = self._parse_json(base64.urlsafe_b64decode(

View File

@@ -25,7 +25,6 @@ import xml.etree.ElementTree
from ..compat import ( from ..compat import (
compat_etree_fromstring, compat_etree_fromstring,
compat_expanduser, compat_expanduser,
compat_os_name,
urllib_req_to_req, urllib_req_to_req,
) )
from ..cookies import LenientSimpleCookie from ..cookies import LenientSimpleCookie
@@ -279,6 +278,7 @@ class InfoExtractor:
thumbnails: A list of dictionaries, with the following entries: thumbnails: A list of dictionaries, with the following entries:
* "id" (optional, string) - Thumbnail format ID * "id" (optional, string) - Thumbnail format ID
* "url" * "url"
* "ext" (optional, string) - actual image extension if not given in URL
* "preference" (optional, int) - quality of the image * "preference" (optional, int) - quality of the image
* "width" (optional, int) * "width" (optional, int)
* "height" (optional, int) * "height" (optional, int)
@@ -1028,7 +1028,7 @@ class InfoExtractor:
filename = sanitize_filename(f'{basen}.dump', restricted=True) filename = sanitize_filename(f'{basen}.dump', restricted=True)
# Working around MAX_PATH limitation on Windows (see # Working around MAX_PATH limitation on Windows (see
# http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx) # http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx)
if compat_os_name == 'nt': if os.name == 'nt':
absfilepath = os.path.abspath(filename) absfilepath = os.path.abspath(filename)
if len(absfilepath) > 259: if len(absfilepath) > 259:
filename = fR'\\?\{absfilepath}' filename = fR'\\?\{absfilepath}'
@@ -1854,12 +1854,26 @@ class InfoExtractor:
@staticmethod @staticmethod
def _remove_duplicate_formats(formats): def _remove_duplicate_formats(formats):
format_urls = set() seen_urls = set()
seen_fragment_urls = set()
unique_formats = [] unique_formats = []
for f in formats: for f in formats:
if f['url'] not in format_urls: fragments = f.get('fragments')
format_urls.add(f['url']) if callable(fragments):
unique_formats.append(f) unique_formats.append(f)
elif fragments:
fragment_urls = frozenset(
fragment.get('url') or urljoin(f['fragment_base_url'], fragment['path'])
for fragment in fragments)
if fragment_urls not in seen_fragment_urls:
seen_fragment_urls.add(fragment_urls)
unique_formats.append(f)
elif f['url'] not in seen_urls:
seen_urls.add(f['url'])
unique_formats.append(f)
formats[:] = unique_formats formats[:] = unique_formats
def _is_valid_url(self, url, video_id, item='video', headers={}): def _is_valid_url(self, url, video_id, item='video', headers={}):
@@ -3767,7 +3781,7 @@ class InfoExtractor:
""" Merge subtitle dictionaries, language by language. """ """ Merge subtitle dictionaries, language by language. """
if target is None: if target is None:
target = {} target = {}
for d in dicts: for d in filter(None, dicts):
for lang, subs in d.items(): for lang, subs in d.items():
target[lang] = cls._merge_subtitle_items(target.get(lang, []), subs) target[lang] = cls._merge_subtitle_items(target.get(lang, []), subs)
return target return target
@@ -3789,7 +3803,7 @@ class InfoExtractor:
def mark_watched(self, *args, **kwargs): def mark_watched(self, *args, **kwargs):
if not self.get_param('mark_watched', False): if not self.get_param('mark_watched', False):
return return
if self.supports_login() and self._get_login_info()[0] is not None or self._cookies_passed: if (self.supports_login() and self._get_login_info()[0] is not None) or self._cookies_passed:
self._mark_watched(*args, **kwargs) self._mark_watched(*args, **kwargs)
def _mark_watched(self, *args, **kwargs): def _mark_watched(self, *args, **kwargs):

View File

@@ -1,692 +0,0 @@
import base64
import uuid
from .common import InfoExtractor
from ..networking import Request
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
float_or_none,
format_field,
int_or_none,
jwt_decode_hs256,
parse_age_limit,
parse_count,
parse_iso8601,
qualities,
time_seconds,
traverse_obj,
url_or_none,
urlencode_postdata,
)
class CrunchyrollBaseIE(InfoExtractor):
_BASE_URL = 'https://www.crunchyroll.com'
_API_BASE = 'https://api.crunchyroll.com'
_NETRC_MACHINE = 'crunchyroll'
_SWITCH_USER_AGENT = 'Crunchyroll/1.8.0 Nintendo Switch/12.3.12.0 UE4/4.27'
_REFRESH_TOKEN = None
_AUTH_HEADERS = None
_AUTH_EXPIRY = None
_API_ENDPOINT = None
_BASIC_AUTH = 'Basic ' + base64.b64encode(':'.join((
't-kdgp2h8c3jub8fn0fq',
'yfLDfMfrYvKXh4JXS1LEI2cCqu1v5Wan',
)).encode()).decode()
_IS_PREMIUM = None
_LOCALE_LOOKUP = {
'ar': 'ar-SA',
'de': 'de-DE',
'': 'en-US',
'es': 'es-419',
'es-es': 'es-ES',
'fr': 'fr-FR',
'it': 'it-IT',
'pt-br': 'pt-BR',
'pt-pt': 'pt-PT',
'ru': 'ru-RU',
'hi': 'hi-IN',
}
def _set_auth_info(self, response):
CrunchyrollBaseIE._IS_PREMIUM = 'cr_premium' in traverse_obj(response, ('access_token', {jwt_decode_hs256}, 'benefits', ...))
CrunchyrollBaseIE._AUTH_HEADERS = {'Authorization': response['token_type'] + ' ' + response['access_token']}
CrunchyrollBaseIE._AUTH_EXPIRY = time_seconds(seconds=traverse_obj(response, ('expires_in', {float_or_none}), default=300) - 10)
def _request_token(self, headers, data, note='Requesting token', errnote='Failed to request token'):
try:
return self._download_json(
f'{self._BASE_URL}/auth/v1/token', None, note=note, errnote=errnote,
headers=headers, data=urlencode_postdata(data), impersonate=True)
except ExtractorError as error:
if not isinstance(error.cause, HTTPError) or error.cause.status != 403:
raise
if target := error.cause.response.extensions.get('impersonate'):
raise ExtractorError(f'Got HTTP Error 403 when using impersonate target "{target}"')
raise ExtractorError(
'Request blocked by Cloudflare. '
'Install the required impersonation dependency if possible, '
'or else navigate to Crunchyroll in your browser, '
'then pass the fresh cookies (with --cookies-from-browser or --cookies) '
'and your browser\'s User-Agent (with --user-agent)', expected=True)
def _perform_login(self, username, password):
if not CrunchyrollBaseIE._REFRESH_TOKEN:
CrunchyrollBaseIE._REFRESH_TOKEN = self.cache.load(self._NETRC_MACHINE, username)
if CrunchyrollBaseIE._REFRESH_TOKEN:
return
try:
login_response = self._request_token(
headers={'Authorization': self._BASIC_AUTH}, data={
'username': username,
'password': password,
'grant_type': 'password',
'scope': 'offline_access',
}, note='Logging in', errnote='Failed to log in')
except ExtractorError as error:
if isinstance(error.cause, HTTPError) and error.cause.status == 401:
raise ExtractorError('Invalid username and/or password', expected=True)
raise
CrunchyrollBaseIE._REFRESH_TOKEN = login_response['refresh_token']
self.cache.store(self._NETRC_MACHINE, username, CrunchyrollBaseIE._REFRESH_TOKEN)
self._set_auth_info(login_response)
def _update_auth(self):
if CrunchyrollBaseIE._AUTH_HEADERS and CrunchyrollBaseIE._AUTH_EXPIRY > time_seconds():
return
auth_headers = {'Authorization': self._BASIC_AUTH}
if CrunchyrollBaseIE._REFRESH_TOKEN:
data = {
'refresh_token': CrunchyrollBaseIE._REFRESH_TOKEN,
'grant_type': 'refresh_token',
'scope': 'offline_access',
}
else:
data = {'grant_type': 'client_id'}
auth_headers['ETP-Anonymous-ID'] = uuid.uuid4()
try:
auth_response = self._request_token(auth_headers, data)
except ExtractorError as error:
username, password = self._get_login_info()
if not username or not isinstance(error.cause, HTTPError) or error.cause.status != 400:
raise
self.to_screen('Refresh token has expired. Re-logging in')
CrunchyrollBaseIE._REFRESH_TOKEN = None
self.cache.store(self._NETRC_MACHINE, username, None)
self._perform_login(username, password)
return
self._set_auth_info(auth_response)
def _locale_from_language(self, language):
config_locale = self._configuration_arg('metadata', ie_key=CrunchyrollBetaIE, casesense=True)
return config_locale[0] if config_locale else self._LOCALE_LOOKUP.get(language)
def _call_base_api(self, endpoint, internal_id, lang, note=None, query={}):
self._update_auth()
if not endpoint.startswith('/'):
endpoint = f'/{endpoint}'
query = query.copy()
locale = self._locale_from_language(lang)
if locale:
query['locale'] = locale
return self._download_json(
f'{self._BASE_URL}{endpoint}', internal_id, note or f'Calling API: {endpoint}',
headers=CrunchyrollBaseIE._AUTH_HEADERS, query=query)
def _call_api(self, path, internal_id, lang, note='api', query={}):
if not path.startswith(f'/content/v2/{self._API_ENDPOINT}/'):
path = f'/content/v2/{self._API_ENDPOINT}/{path}'
try:
result = self._call_base_api(
path, internal_id, lang, f'Downloading {note} JSON ({self._API_ENDPOINT})', query=query)
except ExtractorError as error:
if isinstance(error.cause, HTTPError) and error.cause.status == 404:
return None
raise
if not result:
raise ExtractorError(f'Unexpected response when downloading {note} JSON')
return result
def _extract_chapters(self, internal_id):
# if no skip events are available, a 403 xml error is returned
skip_events = self._download_json(
f'https://static.crunchyroll.com/skip-events/production/{internal_id}.json',
internal_id, note='Downloading chapter info', fatal=False, errnote=False)
if not skip_events:
return None
chapters = []
for event in ('recap', 'intro', 'credits', 'preview'):
start = traverse_obj(skip_events, (event, 'start', {float_or_none}))
end = traverse_obj(skip_events, (event, 'end', {float_or_none}))
# some chapters have no start and/or ending time, they will just be ignored
if start is None or end is None:
continue
chapters.append({'title': event.capitalize(), 'start_time': start, 'end_time': end})
return chapters
def _extract_stream(self, identifier, display_id=None):
if not display_id:
display_id = identifier
self._update_auth()
headers = {**CrunchyrollBaseIE._AUTH_HEADERS, 'User-Agent': self._SWITCH_USER_AGENT}
try:
stream_response = self._download_json(
f'https://cr-play-service.prd.crunchyrollsvc.com/v1/{identifier}/console/switch/play',
display_id, note='Downloading stream info', errnote='Failed to download stream info', headers=headers)
except ExtractorError as error:
if self.get_param('ignore_no_formats_error'):
self.report_warning(error.orig_msg)
return [], {}
elif isinstance(error.cause, HTTPError) and error.cause.status == 420:
raise ExtractorError(
'You have reached the rate-limit for active streams; try again later', expected=True)
raise
available_formats = {'': ('', '', stream_response['url'])}
for hardsub_lang, stream in traverse_obj(stream_response, ('hardSubs', {dict.items}, lambda _, v: v[1]['url'])):
available_formats[hardsub_lang] = (f'hardsub-{hardsub_lang}', hardsub_lang, stream['url'])
requested_hardsubs = [('' if val == 'none' else val) for val in (self._configuration_arg('hardsub') or ['none'])]
hardsub_langs = [lang for lang in available_formats if lang]
if hardsub_langs and 'all' not in requested_hardsubs:
full_format_langs = set(requested_hardsubs)
self.to_screen(f'Available hardsub languages: {", ".join(hardsub_langs)}')
self.to_screen(
'To extract formats of a hardsub language, use '
'"--extractor-args crunchyrollbeta:hardsub=<language_code or all>". '
'See https://github.com/yt-dlp/yt-dlp#crunchyrollbeta-crunchyroll for more info',
only_once=True)
else:
full_format_langs = set(map(str.lower, available_formats))
audio_locale = traverse_obj(stream_response, ('audioLocale', {str}))
hardsub_preference = qualities(requested_hardsubs[::-1])
formats, subtitles = [], {}
for format_id, hardsub_lang, stream_url in available_formats.values():
if hardsub_lang.lower() in full_format_langs:
adaptive_formats, dash_subs = self._extract_mpd_formats_and_subtitles(
stream_url, display_id, mpd_id=format_id, headers=CrunchyrollBaseIE._AUTH_HEADERS,
fatal=False, note=f'Downloading {f"{format_id} " if hardsub_lang else ""}MPD manifest')
self._merge_subtitles(dash_subs, target=subtitles)
else:
continue # XXX: Update this if meta mpd formats work; will be tricky with token invalidation
for f in adaptive_formats:
if f.get('acodec') != 'none':
f['language'] = audio_locale
f['quality'] = hardsub_preference(hardsub_lang.lower())
formats.extend(adaptive_formats)
for locale, subtitle in traverse_obj(stream_response, (('subtitles', 'captions'), {dict.items}, ...)):
subtitles.setdefault(locale, []).append(traverse_obj(subtitle, {'url': 'url', 'ext': 'format'}))
# Invalidate stream token to avoid rate-limit
error_msg = 'Unable to invalidate stream token; you may experience rate-limiting'
if stream_token := stream_response.get('token'):
self._request_webpage(Request(
f'https://cr-play-service.prd.crunchyrollsvc.com/v1/token/{identifier}/{stream_token}/inactive',
headers=headers, method='PATCH'), display_id, 'Invalidating stream token', error_msg, fatal=False)
else:
self.report_warning(error_msg)
return formats, subtitles
class CrunchyrollCmsBaseIE(CrunchyrollBaseIE):
_API_ENDPOINT = 'cms'
_CMS_EXPIRY = None
def _call_cms_api_signed(self, path, internal_id, lang, note='api'):
if not CrunchyrollCmsBaseIE._CMS_EXPIRY or CrunchyrollCmsBaseIE._CMS_EXPIRY <= time_seconds():
response = self._call_base_api('index/v2', None, lang, 'Retrieving signed policy')['cms_web']
CrunchyrollCmsBaseIE._CMS_QUERY = {
'Policy': response['policy'],
'Signature': response['signature'],
'Key-Pair-Id': response['key_pair_id'],
}
CrunchyrollCmsBaseIE._CMS_BUCKET = response['bucket']
CrunchyrollCmsBaseIE._CMS_EXPIRY = parse_iso8601(response['expires']) - 10
if not path.startswith('/cms/v2'):
path = f'/cms/v2{CrunchyrollCmsBaseIE._CMS_BUCKET}/{path}'
return self._call_base_api(
path, internal_id, lang, f'Downloading {note} JSON (signed cms)', query=CrunchyrollCmsBaseIE._CMS_QUERY)
class CrunchyrollBetaIE(CrunchyrollCmsBaseIE):
IE_NAME = 'crunchyroll'
_VALID_URL = r'''(?x)
https?://(?:beta\.|www\.)?crunchyroll\.com/
(?:(?P<lang>\w{2}(?:-\w{2})?)/)?
watch/(?!concert|musicvideo)(?P<id>\w+)'''
_TESTS = [{
# Premium only
'url': 'https://www.crunchyroll.com/watch/GY2P1Q98Y/to-the-future',
'info_dict': {
'id': 'GY2P1Q98Y',
'ext': 'mp4',
'duration': 1380.241,
'timestamp': 1459632600,
'description': 'md5:a022fbec4fbb023d43631032c91ed64b',
'title': 'World Trigger Episode 73 To the Future',
'upload_date': '20160402',
'series': 'World Trigger',
'series_id': 'GR757DMKY',
'season': 'World Trigger',
'season_id': 'GR9P39NJ6',
'season_number': 1,
'episode': 'To the Future',
'episode_number': 73,
'thumbnail': r're:^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
'chapters': 'count:2',
'age_limit': 14,
'like_count': int,
'dislike_count': int,
},
'params': {
'skip_download': 'm3u8',
'extractor_args': {'crunchyrollbeta': {'hardsub': ['de-DE']}},
'format': 'bv[format_id~=hardsub]',
},
}, {
# Premium only
'url': 'https://www.crunchyroll.com/watch/GYE5WKQGR',
'info_dict': {
'id': 'GYE5WKQGR',
'ext': 'mp4',
'duration': 366.459,
'timestamp': 1476788400,
'description': 'md5:74b67283ffddd75f6e224ca7dc031e76',
'title': 'SHELTER Porter Robinson presents Shelter the Animation',
'upload_date': '20161018',
'series': 'SHELTER',
'series_id': 'GYGG09WWY',
'season': 'SHELTER',
'season_id': 'GR09MGK4R',
'season_number': 1,
'episode': 'Porter Robinson presents Shelter the Animation',
'episode_number': 0,
'thumbnail': r're:^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
'age_limit': 14,
'like_count': int,
'dislike_count': int,
},
'params': {'skip_download': True},
}, {
'url': 'https://www.crunchyroll.com/watch/GJWU2VKK3/cherry-blossom-meeting-and-a-coming-blizzard',
'info_dict': {
'id': 'GJWU2VKK3',
'ext': 'mp4',
'duration': 1420.054,
'description': 'md5:2d1c67c0ec6ae514d9c30b0b99a625cd',
'title': 'The Ice Guy and His Cool Female Colleague Episode 1 Cherry Blossom Meeting and a Coming Blizzard',
'series': 'The Ice Guy and His Cool Female Colleague',
'series_id': 'GW4HM75NP',
'season': 'The Ice Guy and His Cool Female Colleague',
'season_id': 'GY9PC21VE',
'season_number': 1,
'episode': 'Cherry Blossom Meeting and a Coming Blizzard',
'episode_number': 1,
'chapters': 'count:2',
'thumbnail': r're:^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
'timestamp': 1672839000,
'upload_date': '20230104',
'age_limit': 14,
'like_count': int,
'dislike_count': int,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.crunchyroll.com/watch/GM8F313NQ',
'info_dict': {
'id': 'GM8F313NQ',
'ext': 'mp4',
'title': 'Garakowa -Restore the World-',
'description': 'md5:8d2f8b6b9dd77d87810882e7d2ee5608',
'duration': 3996.104,
'age_limit': 13,
'thumbnail': r're:^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
},
'params': {'skip_download': 'm3u8'},
'skip': 'no longer exists',
}, {
'url': 'https://www.crunchyroll.com/watch/G62PEZ2E6',
'info_dict': {
'id': 'G62PEZ2E6',
'description': 'md5:8d2f8b6b9dd77d87810882e7d2ee5608',
'age_limit': 13,
'duration': 65.138,
'title': 'Garakowa -Restore the World-',
},
'playlist_mincount': 5,
}, {
'url': 'https://www.crunchyroll.com/de/watch/GY2P1Q98Y',
'only_matching': True,
}, {
'url': 'https://beta.crunchyroll.com/pt-br/watch/G8WUN8VKP/the-ruler-of-conspiracy',
'only_matching': True,
}]
# We want to support lazy playlist filtering and movie listings cannot be inside a playlist
_RETURN_TYPE = 'video'
def _real_extract(self, url):
lang, internal_id = self._match_valid_url(url).group('lang', 'id')
# We need to use unsigned API call to allow ratings query string
response = traverse_obj(self._call_api(
f'objects/{internal_id}', internal_id, lang, 'object info', {'ratings': 'true'}), ('data', 0, {dict}))
if not response:
raise ExtractorError(f'No video with id {internal_id} could be found (possibly region locked?)', expected=True)
object_type = response.get('type')
if object_type == 'episode':
result = self._transform_episode_response(response)
elif object_type == 'movie':
result = self._transform_movie_response(response)
elif object_type == 'movie_listing':
first_movie_id = traverse_obj(response, ('movie_listing_metadata', 'first_movie_id'))
if not self._yes_playlist(internal_id, first_movie_id):
return self.url_result(f'{self._BASE_URL}/{lang}watch/{first_movie_id}', CrunchyrollBetaIE, first_movie_id)
def entries():
movies = self._call_api(f'movie_listings/{internal_id}/movies', internal_id, lang, 'movie list')
for movie_response in traverse_obj(movies, ('data', ...)):
yield self.url_result(
f'{self._BASE_URL}/{lang}watch/{movie_response["id"]}',
CrunchyrollBetaIE, **self._transform_movie_response(movie_response))
return self.playlist_result(entries(), **self._transform_movie_response(response))
else:
raise ExtractorError(f'Unknown object type {object_type}')
if not self._IS_PREMIUM and traverse_obj(response, (f'{object_type}_metadata', 'is_premium_only')):
message = f'This {object_type} is for premium members only'
if CrunchyrollBaseIE._REFRESH_TOKEN:
self.raise_no_formats(message, expected=True, video_id=internal_id)
else:
self.raise_login_required(message, method='password', metadata_available=True)
else:
result['formats'], result['subtitles'] = self._extract_stream(internal_id)
result['chapters'] = self._extract_chapters(internal_id)
def calculate_count(item):
return parse_count(''.join((item['displayed'], item.get('unit') or '')))
result.update(traverse_obj(response, ('rating', {
'like_count': ('up', {calculate_count}),
'dislike_count': ('down', {calculate_count}),
})))
return result
@staticmethod
def _transform_episode_response(data):
metadata = traverse_obj(data, (('episode_metadata', None), {dict}), get_all=False) or {}
return {
'id': data['id'],
'title': ' \u2013 '.join((
('{}{}'.format(
format_field(metadata, 'season_title'),
format_field(metadata, 'episode', ' Episode %s'))),
format_field(data, 'title'))),
**traverse_obj(data, {
'episode': ('title', {str}),
'description': ('description', {str}, {lambda x: x.replace(r'\r\n', '\n')}),
'thumbnails': ('images', 'thumbnail', ..., ..., {
'url': ('source', {url_or_none}),
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
}),
}),
**traverse_obj(metadata, {
'duration': ('duration_ms', {float_or_none(scale=1000)}),
'timestamp': ('upload_date', {parse_iso8601}),
'series': ('series_title', {str}),
'series_id': ('series_id', {str}),
'season': ('season_title', {str}),
'season_id': ('season_id', {str}),
'season_number': ('season_number', ({int}, {float_or_none})),
'episode_number': ('sequence_number', ({int}, {float_or_none})),
'age_limit': ('maturity_ratings', -1, {parse_age_limit}),
'language': ('audio_locale', {str}),
}, get_all=False),
}
@staticmethod
def _transform_movie_response(data):
metadata = traverse_obj(data, (('movie_metadata', 'movie_listing_metadata', None), {dict}), get_all=False) or {}
return {
'id': data['id'],
**traverse_obj(data, {
'title': ('title', {str}),
'description': ('description', {str}, {lambda x: x.replace(r'\r\n', '\n')}),
'thumbnails': ('images', 'thumbnail', ..., ..., {
'url': ('source', {url_or_none}),
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
}),
}),
**traverse_obj(metadata, {
'duration': ('duration_ms', {float_or_none(scale=1000)}),
'age_limit': ('maturity_ratings', -1, {parse_age_limit}),
}),
}
class CrunchyrollBetaShowIE(CrunchyrollCmsBaseIE):
IE_NAME = 'crunchyroll:playlist'
_VALID_URL = r'''(?x)
https?://(?:beta\.|www\.)?crunchyroll\.com/
(?P<lang>(?:\w{2}(?:-\w{2})?/)?)
series/(?P<id>\w+)'''
_TESTS = [{
'url': 'https://www.crunchyroll.com/series/GY19NQ2QR/Girl-Friend-BETA',
'info_dict': {
'id': 'GY19NQ2QR',
'title': 'Girl Friend BETA',
'description': 'md5:99c1b22ee30a74b536a8277ced8eb750',
# XXX: `thumbnail` does not get set from `thumbnails` in playlist
# 'thumbnail': r're:^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
'age_limit': 14,
},
'playlist_mincount': 10,
}, {
'url': 'https://beta.crunchyroll.com/it/series/GY19NQ2QR',
'only_matching': True,
}]
def _real_extract(self, url):
lang, internal_id = self._match_valid_url(url).group('lang', 'id')
def entries():
seasons_response = self._call_cms_api_signed(f'seasons?series_id={internal_id}', internal_id, lang, 'seasons')
for season in traverse_obj(seasons_response, ('items', ..., {dict})):
episodes_response = self._call_cms_api_signed(
f'episodes?season_id={season["id"]}', season['id'], lang, 'episode list')
for episode_response in traverse_obj(episodes_response, ('items', ..., {dict})):
yield self.url_result(
f'{self._BASE_URL}/{lang}watch/{episode_response["id"]}',
CrunchyrollBetaIE, **CrunchyrollBetaIE._transform_episode_response(episode_response))
return self.playlist_result(
entries(), internal_id,
**traverse_obj(self._call_api(f'series/{internal_id}', internal_id, lang, 'series'), ('data', 0, {
'title': ('title', {str}),
'description': ('description', {lambda x: x.replace(r'\r\n', '\n')}),
'age_limit': ('maturity_ratings', -1, {parse_age_limit}),
'thumbnails': ('images', ..., ..., ..., {
'url': ('source', {url_or_none}),
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
}),
})))
class CrunchyrollMusicIE(CrunchyrollBaseIE):
IE_NAME = 'crunchyroll:music'
_VALID_URL = r'''(?x)
https?://(?:www\.)?crunchyroll\.com/
(?P<lang>(?:\w{2}(?:-\w{2})?/)?)
watch/(?P<type>concert|musicvideo)/(?P<id>\w+)'''
_TESTS = [{
'url': 'https://www.crunchyroll.com/de/watch/musicvideo/MV5B02C79',
'info_dict': {
'ext': 'mp4',
'id': 'MV5B02C79',
'display_id': 'egaono-hana',
'title': 'Egaono Hana',
'track': 'Egaono Hana',
'artists': ['Goose house'],
'thumbnail': r're:(?i)^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
'genres': ['J-Pop'],
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.crunchyroll.com/watch/musicvideo/MV88BB7F2C',
'info_dict': {
'ext': 'mp4',
'id': 'MV88BB7F2C',
'display_id': 'crossing-field',
'title': 'Crossing Field',
'track': 'Crossing Field',
'artists': ['LiSA'],
'thumbnail': r're:(?i)^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
'genres': ['Anime'],
},
'params': {'skip_download': 'm3u8'},
'skip': 'no longer exists',
}, {
'url': 'https://www.crunchyroll.com/watch/concert/MC2E2AC135',
'info_dict': {
'ext': 'mp4',
'id': 'MC2E2AC135',
'display_id': 'live-is-smile-always-364joker-at-yokohama-arena',
'title': 'LiVE is Smile Always-364+JOKER- at YOKOHAMA ARENA',
'track': 'LiVE is Smile Always-364+JOKER- at YOKOHAMA ARENA',
'artists': ['LiSA'],
'thumbnail': r're:(?i)^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
'description': 'md5:747444e7e6300907b7a43f0a0503072e',
'genres': ['J-Pop'],
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.crunchyroll.com/de/watch/musicvideo/MV5B02C79/egaono-hana',
'only_matching': True,
}, {
'url': 'https://www.crunchyroll.com/watch/concert/MC2E2AC135/live-is-smile-always-364joker-at-yokohama-arena',
'only_matching': True,
}, {
'url': 'https://www.crunchyroll.com/watch/musicvideo/MV88BB7F2C/crossing-field',
'only_matching': True,
}]
_API_ENDPOINT = 'music'
def _real_extract(self, url):
lang, internal_id, object_type = self._match_valid_url(url).group('lang', 'id', 'type')
path, name = {
'concert': ('concerts', 'concert info'),
'musicvideo': ('music_videos', 'music video info'),
}[object_type]
response = traverse_obj(self._call_api(f'{path}/{internal_id}', internal_id, lang, name), ('data', 0, {dict}))
if not response:
raise ExtractorError(f'No video with id {internal_id} could be found (possibly region locked?)', expected=True)
result = self._transform_music_response(response)
if not self._IS_PREMIUM and response.get('isPremiumOnly'):
message = f'This {response.get("type") or "media"} is for premium members only'
if CrunchyrollBaseIE._REFRESH_TOKEN:
self.raise_no_formats(message, expected=True, video_id=internal_id)
else:
self.raise_login_required(message, method='password', metadata_available=True)
else:
result['formats'], _ = self._extract_stream(f'music/{internal_id}', internal_id)
return result
@staticmethod
def _transform_music_response(data):
return {
'id': data['id'],
**traverse_obj(data, {
'display_id': 'slug',
'title': 'title',
'track': 'title',
'artists': ('artist', 'name', all),
'description': ('description', {str}, {lambda x: x.replace(r'\r\n', '\n') or None}),
'thumbnails': ('images', ..., ..., {
'url': ('source', {url_or_none}),
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
}),
'genres': ('genres', ..., 'displayValue'),
'age_limit': ('maturity_ratings', -1, {parse_age_limit}),
}),
}
class CrunchyrollArtistIE(CrunchyrollBaseIE):
IE_NAME = 'crunchyroll:artist'
_VALID_URL = r'''(?x)
https?://(?:www\.)?crunchyroll\.com/
(?P<lang>(?:\w{2}(?:-\w{2})?/)?)
artist/(?P<id>\w{10})'''
_TESTS = [{
'url': 'https://www.crunchyroll.com/artist/MA179CB50D',
'info_dict': {
'id': 'MA179CB50D',
'title': 'LiSA',
'genres': ['Anime', 'J-Pop', 'Rock'],
'description': 'md5:16d87de61a55c3f7d6c454b73285938e',
},
'playlist_mincount': 83,
}, {
'url': 'https://www.crunchyroll.com/artist/MA179CB50D/lisa',
'only_matching': True,
}]
_API_ENDPOINT = 'music'
def _real_extract(self, url):
lang, internal_id = self._match_valid_url(url).group('lang', 'id')
response = traverse_obj(self._call_api(
f'artists/{internal_id}', internal_id, lang, 'artist info'), ('data', 0))
def entries():
for attribute, path in [('concerts', 'concert'), ('videos', 'musicvideo')]:
for internal_id in traverse_obj(response, (attribute, ...)):
yield self.url_result(f'{self._BASE_URL}/watch/{path}/{internal_id}', CrunchyrollMusicIE, internal_id)
return self.playlist_result(entries(), **self._transform_artist_response(response))
@staticmethod
def _transform_artist_response(data):
return {
'id': data['id'],
**traverse_obj(data, {
'title': 'name',
'description': ('description', {str}, {lambda x: x.replace(r'\r\n', '\n')}),
'thumbnails': ('images', ..., ..., {
'url': ('source', {url_or_none}),
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
}),
'genres': ('genres', ..., 'displayValue'),
}),
}

View File

@@ -1,14 +1,27 @@
import json
import re import re
import urllib.parse
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import orderedSet from .ninecninemedia import NineCNineMediaIE
from ..utils import extract_attributes, orderedSet
from ..utils.traversal import find_element, traverse_obj
class CTVNewsIE(InfoExtractor): class CTVNewsIE(InfoExtractor):
_VALID_URL = r'https?://(?:.+?\.)?ctvnews\.ca/(?:video\?(?:clip|playlist|bin)Id=|.*?)(?P<id>[0-9.]+)' _BASE_REGEX = r'https?://(?:[^.]+\.)?ctvnews\.ca/'
_VIDEO_ID_RE = r'(?P<id>\d{5,})'
_PLAYLIST_ID_RE = r'(?P<id>\d\.\d{5,})'
_VALID_URL = [
rf'{_BASE_REGEX}video/c{_VIDEO_ID_RE}',
rf'{_BASE_REGEX}video(?:-gallery)?/?\?clipId={_VIDEO_ID_RE}',
rf'{_BASE_REGEX}video/?\?(?:playlist|bin)Id={_PLAYLIST_ID_RE}',
rf'{_BASE_REGEX}(?!video/)[^?#]*?{_PLAYLIST_ID_RE}/?(?:$|[?#])',
rf'{_BASE_REGEX}(?!video/)[^?#]+\?binId={_PLAYLIST_ID_RE}',
]
_TESTS = [{ _TESTS = [{
'url': 'http://www.ctvnews.ca/video?clipId=901995', 'url': 'http://www.ctvnews.ca/video?clipId=901995',
'md5': '9b8624ba66351a23e0b6e1391971f9af', 'md5': 'b608f466c7fa24b9666c6439d766ab7e',
'info_dict': { 'info_dict': {
'id': '901995', 'id': '901995',
'ext': 'flv', 'ext': 'flv',
@@ -16,6 +29,33 @@ class CTVNewsIE(InfoExtractor):
'description': 'md5:958dd3b4f5bbbf0ed4d045c790d89285', 'description': 'md5:958dd3b4f5bbbf0ed4d045c790d89285',
'timestamp': 1467286284, 'timestamp': 1467286284,
'upload_date': '20160630', 'upload_date': '20160630',
'categories': [],
'season_number': 0,
'season': 'Season 0',
'tags': [],
'series': 'CTV News National | Archive | Stories 2',
'season_id': '57981',
'thumbnail': r're:https?://.*\.jpg$',
'duration': 764.631,
},
}, {
'url': 'https://barrie.ctvnews.ca/video/c3030933-here_s-what_s-making-news-for-nov--15?binId=1272429',
'md5': '8b8c2b33c5c1803e3c26bc74ff8694d5',
'info_dict': {
'id': '3030933',
'ext': 'flv',
'title': 'Heres whats making news for Nov. 15',
'description': 'Here are the top stories were working on for CTV News at 11 for Nov. 15',
'thumbnail': 'http://images2.9c9media.com/image_asset/2021_2_22_a602e68e-1514-410e-a67a-e1f7cccbacab_png_2000x1125.jpg',
'season_id': '58104',
'season_number': 0,
'tags': [],
'season': 'Season 0',
'categories': [],
'series': 'CTV News Barrie',
'upload_date': '20241116',
'duration': 42.943,
'timestamp': 1731722452,
}, },
}, { }, {
'url': 'http://www.ctvnews.ca/video?playlistId=1.2966224', 'url': 'http://www.ctvnews.ca/video?playlistId=1.2966224',
@@ -31,6 +71,72 @@ class CTVNewsIE(InfoExtractor):
'id': '1.2876780', 'id': '1.2876780',
}, },
'playlist_mincount': 100, 'playlist_mincount': 100,
}, {
'url': 'https://www.ctvnews.ca/it-s-been-23-years-since-toronto-called-in-the-army-after-a-major-snowstorm-1.5736957',
'info_dict':
{
'id': '1.5736957',
},
'playlist_mincount': 6,
}, {
'url': 'https://www.ctvnews.ca/business/respondents-to-bank-of-canada-questionnaire-largely-oppose-creating-a-digital-loonie-1.6665797',
'md5': '24bc4b88cdc17d8c3fc01dfc228ab72c',
'info_dict': {
'id': '2695026',
'ext': 'flv',
'season_id': '89852',
'series': 'From CTV News Channel',
'description': 'md5:796a985a23cacc7e1e2fafefd94afd0a',
'season': '2023',
'title': 'Bank of Canada asks public about digital currency',
'categories': [],
'tags': [],
'upload_date': '20230526',
'season_number': 2023,
'thumbnail': 'http://images2.9c9media.com/image_asset/2019_3_28_35f5afc3-10f6-4d92-b194-8b9a86f55c6a_png_1920x1080.jpg',
'timestamp': 1685105157,
'duration': 253.553,
},
}, {
'url': 'https://stox.ctvnews.ca/video-gallery?clipId=582589',
'md5': '135cc592df607d29dddc931f1b756ae2',
'info_dict': {
'id': '582589',
'ext': 'flv',
'categories': [],
'timestamp': 1427906183,
'season_number': 0,
'duration': 125.559,
'thumbnail': 'http://images2.9c9media.com/image_asset/2019_3_28_35f5afc3-10f6-4d92-b194-8b9a86f55c6a_png_1920x1080.jpg',
'series': 'CTV News Stox',
'description': 'CTV original footage of the rise and fall of the Berlin Wall.',
'title': 'Berlin Wall',
'season_id': '63817',
'season': 'Season 0',
'tags': [],
'upload_date': '20150401',
},
}, {
'url': 'https://ottawa.ctvnews.ca/features/regional-contact/regional-contact-archive?binId=1.1164587#3023759',
'md5': 'a14c0603557decc6531260791c23cc5e',
'info_dict': {
'id': '3023759',
'ext': 'flv',
'season_number': 2024,
'timestamp': 1731798000,
'season': '2024',
'episode': 'Episode 125',
'description': 'CTV News Ottawa at Six',
'duration': 2712.076,
'episode_number': 125,
'upload_date': '20241116',
'title': 'CTV News Ottawa at Six for Saturday, November 16, 2024',
'thumbnail': 'http://images2.9c9media.com/image_asset/2019_3_28_35f5afc3-10f6-4d92-b194-8b9a86f55c6a_png_1920x1080.jpg',
'categories': [],
'tags': [],
'series': 'CTV News Ottawa at Six',
'season_id': '92667',
},
}, { }, {
'url': 'http://www.ctvnews.ca/1.810401', 'url': 'http://www.ctvnews.ca/1.810401',
'only_matching': True, 'only_matching': True,
@@ -42,29 +148,35 @@ class CTVNewsIE(InfoExtractor):
'only_matching': True, 'only_matching': True,
}] }]
def _ninecninemedia_url_result(self, clip_id):
return self.url_result(f'9c9media:ctvnews_web:{clip_id}', NineCNineMediaIE, clip_id)
def _real_extract(self, url): def _real_extract(self, url):
page_id = self._match_id(url) page_id = self._match_id(url)
def ninecninemedia_url_result(clip_id): if mobj := re.fullmatch(self._VIDEO_ID_RE, urllib.parse.urlparse(url).fragment):
return { page_id = mobj.group('id')
'_type': 'url_transparent',
'id': clip_id,
'url': f'9c9media:ctvnews_web:{clip_id}',
'ie_key': 'NineCNineMedia',
}
if page_id.isdigit(): if re.fullmatch(self._VIDEO_ID_RE, page_id):
return ninecninemedia_url_result(page_id) return self._ninecninemedia_url_result(page_id)
else:
webpage = self._download_webpage(f'http://www.ctvnews.ca/{page_id}', page_id, query={ webpage = self._download_webpage(f'https://www.ctvnews.ca/{page_id}', page_id, query={
'ot': 'example.AjaxPageLayout.ot', 'ot': 'example.AjaxPageLayout.ot',
'maxItemsPerPage': 1000000, 'maxItemsPerPage': 1000000,
}) })
entries = [ninecninemedia_url_result(clip_id) for clip_id in orderedSet( entries = [self._ninecninemedia_url_result(clip_id)
re.findall(r'clip\.id\s*=\s*(\d+);', webpage))] for clip_id in orderedSet(re.findall(r'clip\.id\s*=\s*(\d+);', webpage))]
if not entries: if not entries:
webpage = self._download_webpage(url, page_id) webpage = self._download_webpage(url, page_id)
if 'getAuthStates("' in webpage: if 'getAuthStates("' in webpage:
entries = [ninecninemedia_url_result(clip_id) for clip_id in entries = [self._ninecninemedia_url_result(clip_id) for clip_id in
self._search_regex(r'getAuthStates\("([\d+,]+)"', webpage, 'clip ids').split(',')] self._search_regex(r'getAuthStates\("([\d+,]+)"', webpage, 'clip ids').split(',')]
return self.playlist_result(entries, page_id) else:
entries = [
self._ninecninemedia_url_result(clip_id) for clip_id in
traverse_obj(webpage, (
{find_element(tag='jasper-player-container', html=True)},
{extract_attributes}, 'axis-ids', {json.loads}, ..., 'axisId', {str}))
]
return self.playlist_result(entries, page_id)

View File

@@ -1,7 +1,4 @@
import time
from .common import InfoExtractor from .common import InfoExtractor
from ..networking import HEADRequest
from ..utils import int_or_none from ..utils import int_or_none
@@ -31,9 +28,6 @@ class CultureUnpluggedIE(InfoExtractor):
video_id = mobj.group('id') video_id = mobj.group('id')
display_id = mobj.group('display_id') or video_id display_id = mobj.group('display_id') or video_id
# request setClientTimezone.php to get PHPSESSID cookie which is need to get valid json data in the next request
self._request_webpage(HEADRequest(
'http://www.cultureunplugged.com/setClientTimezone.php?timeOffset=%d' % -(time.timezone / 3600)), display_id)
movie_data = self._download_json( movie_data = self._download_json(
f'http://www.cultureunplugged.com/movie-data/cu-{video_id}.json', display_id) f'http://www.cultureunplugged.com/movie-data/cu-{video_id}.json', display_id)

View File

@@ -1,3 +1,4 @@
import functools
import hashlib import hashlib
import re import re
import time import time
@@ -51,6 +52,15 @@ class DacastVODIE(DacastBaseIE):
'thumbnail': 'https://universe-files.dacast.com/26137208-5858-65c1-5e9a-9d6b6bd2b6c2', 'thumbnail': 'https://universe-files.dacast.com/26137208-5858-65c1-5e9a-9d6b6bd2b6c2',
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
}, { # /uspaes/ in hls_url
'url': 'https://iframe.dacast.com/vod/f9823fc6-faba-b98f-0d00-4a7b50a58c5b/348c5c84-b6af-4859-bb9d-1d01009c795b',
'info_dict': {
'id': '348c5c84-b6af-4859-bb9d-1d01009c795b',
'ext': 'mp4',
'title': 'pl1-edyta-rubas-211124.mp4',
'uploader_id': 'f9823fc6-faba-b98f-0d00-4a7b50a58c5b',
'thumbnail': 'https://universe-files.dacast.com/4d0bd042-a536-752d-fc34-ad2fa44bbcbb.png',
},
}] }]
_WEBPAGE_TESTS = [{ _WEBPAGE_TESTS = [{
'url': 'https://www.dacast.com/support/knowledgebase/how-can-i-embed-a-video-on-my-website/', 'url': 'https://www.dacast.com/support/knowledgebase/how-can-i-embed-a-video-on-my-website/',
@@ -74,6 +84,15 @@ class DacastVODIE(DacastBaseIE):
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
}] }]
@functools.cached_property
def _usp_signing_secret(self):
player_js = self._download_webpage(
'https://player.dacast.com/js/player.js', None, 'Downloading player JS')
# Rotates every so often, but hardcode a fallback in case of JS change/breakage before rotation
return self._search_regex(
r'\bUSP_SIGNING_SECRET\s*=\s*(["\'])(?P<secret>(?:(?!\1).)+)', player_js,
'usp signing secret', group='secret', fatal=False) or 'odnInCGqhvtyRTtIiddxtuRtawYYICZP'
def _real_extract(self, url): def _real_extract(self, url):
user_id, video_id = self._match_valid_url(url).group('user_id', 'id') user_id, video_id = self._match_valid_url(url).group('user_id', 'id')
query = {'contentId': f'{user_id}-vod-{video_id}', 'provider': 'universe'} query = {'contentId': f'{user_id}-vod-{video_id}', 'provider': 'universe'}
@@ -94,10 +113,10 @@ class DacastVODIE(DacastBaseIE):
if 'DRM_EXT' in hls_url: if 'DRM_EXT' in hls_url:
self.report_drm(video_id) self.report_drm(video_id)
elif '/uspaes/' in hls_url: elif '/uspaes/' in hls_url:
# From https://player.dacast.com/js/player.js # Ref: https://player.dacast.com/js/player.js
ts = int(time.time()) ts = int(time.time())
signature = hashlib.sha1( signature = hashlib.sha1(
f'{10413792000 - ts}{ts}YfaKtquEEpDeusCKbvYszIEZnWmBcSvw').digest().hex() f'{10413792000 - ts}{ts}{self._usp_signing_secret}'.encode()).digest().hex()
hls_aes['uri'] = f'https://keys.dacast.com/uspaes/{video_id}.key?s={signature}&ts={ts}' hls_aes['uri'] = f'https://keys.dacast.com/uspaes/{video_id}.key?s={signature}&ts={ts}'
for retry in self.RetryManager(): for retry in self.RetryManager():

View File

@@ -261,6 +261,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'tags': [], 'tags': [],
'view_count': int, 'view_count': int,
'like_count': int, 'like_count': int,
'thumbnail': r're:https://\w+.dmcdn.net/v/WnEY61cmvMxt2Fi6d/x1080',
}, },
}, { }, {
# https://geo.dailymotion.com/player/xf7zn.html?playlist=x7wdsj # https://geo.dailymotion.com/player/xf7zn.html?playlist=x7wdsj
@@ -288,6 +289,25 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'description': 'À bord du « véloto », lalternative à la voiture pour la campagne', 'description': 'À bord du « véloto », lalternative à la voiture pour la campagne',
'tags': ['biclou', 'vélo', 'véloto', 'campagne', 'voiture', 'environnement', 'véhicules intermédiaires'], 'tags': ['biclou', 'vélo', 'véloto', 'campagne', 'voiture', 'environnement', 'véhicules intermédiaires'],
}, },
}, {
# https://geo.dailymotion.com/player/xry80.html?video=x8vu47w
'url': 'https://www.metatube.com/en/videos/546765/This-frogs-decorates-Christmas-tree/',
'info_dict': {
'id': 'x8vu47w',
'ext': 'mp4',
'like_count': int,
'uploader': 'Metatube',
'thumbnail': r're:https://\w+.dmcdn.net/v/W1G_S1coGSFTfkTeR/x1080',
'upload_date': '20240326',
'view_count': int,
'timestamp': 1711496732,
'age_limit': 0,
'uploader_id': 'x2xpy74',
'title': 'Está lindas ranitas ponen su arbolito',
'duration': 28,
'description': 'Que lindura',
'tags': [],
},
}] }]
_GEO_BYPASS = False _GEO_BYPASS = False
_COMMON_MEDIA_FIELDS = '''description _COMMON_MEDIA_FIELDS = '''description
@@ -302,7 +322,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
yield from super()._extract_embed_urls(url, webpage) yield from super()._extract_embed_urls(url, webpage)
for mobj in re.finditer( for mobj in re.finditer(
r'(?s)DM\.player\([^,]+,\s*{.*?video[\'"]?\s*:\s*["\']?(?P<id>[0-9a-zA-Z]+).+?}\s*\);', webpage): r'(?s)DM\.player\([^,]+,\s*{.*?video[\'"]?\s*:\s*["\']?(?P<id>[0-9a-zA-Z]+).+?}\s*\);', webpage):
yield from 'https://www.dailymotion.com/embed/video/' + mobj.group('id') yield 'https://www.dailymotion.com/embed/video/' + mobj.group('id')
for mobj in re.finditer( for mobj in re.finditer(
r'(?s)<script [^>]*\bsrc=(["\'])(?:https?:)?//[\w-]+\.dailymotion\.com/player/(?:(?!\1).)+\1[^>]*>', webpage): r'(?s)<script [^>]*\bsrc=(["\'])(?:https?:)?//[\w-]+\.dailymotion\.com/player/(?:(?!\1).)+\1[^>]*>', webpage):
attrs = extract_attributes(mobj.group(0)) attrs = extract_attributes(mobj.group(0))

View File

@@ -1,7 +1,10 @@
import time
from .common import InfoExtractor from .common import InfoExtractor
from ..networking.exceptions import HTTPError from ..networking.exceptions import HTTPError
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
jwt_decode_hs256,
parse_codecs, parse_codecs,
try_get, try_get,
url_or_none, url_or_none,
@@ -13,9 +16,6 @@ from ..utils.traversal import traverse_obj
class DigitalConcertHallIE(InfoExtractor): class DigitalConcertHallIE(InfoExtractor):
IE_DESC = 'DigitalConcertHall extractor' IE_DESC = 'DigitalConcertHall extractor'
_VALID_URL = r'https?://(?:www\.)?digitalconcerthall\.com/(?P<language>[a-z]+)/(?P<type>film|concert|work)/(?P<id>[0-9]+)-?(?P<part>[0-9]+)?' _VALID_URL = r'https?://(?:www\.)?digitalconcerthall\.com/(?P<language>[a-z]+)/(?P<type>film|concert|work)/(?P<id>[0-9]+)-?(?P<part>[0-9]+)?'
_OAUTH_URL = 'https://api.digitalconcerthall.com/v2/oauth2/token'
_USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 Safari/605.1.15'
_ACCESS_TOKEN = None
_NETRC_MACHINE = 'digitalconcerthall' _NETRC_MACHINE = 'digitalconcerthall'
_TESTS = [{ _TESTS = [{
'note': 'Playlist with only one video', 'note': 'Playlist with only one video',
@@ -69,59 +69,157 @@ class DigitalConcertHallIE(InfoExtractor):
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
'playlist_count': 1, 'playlist_count': 1,
}] }]
_LOGIN_HINT = ('Use --username token --password ACCESS_TOKEN where ACCESS_TOKEN '
'is the "access_token_production" from your browser local storage')
_REFRESH_HINT = 'or else use a "refresh_token" with --username refresh --password REFRESH_TOKEN'
_OAUTH_URL = 'https://api.digitalconcerthall.com/v2/oauth2/token'
_CLIENT_ID = 'dch.webapp'
_CLIENT_SECRET = '2ySLN+2Fwb'
_USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 Safari/605.1.15'
_OAUTH_HEADERS = {
'Accept': 'application/json',
'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8',
'Origin': 'https://www.digitalconcerthall.com',
'Referer': 'https://www.digitalconcerthall.com/',
'User-Agent': _USER_AGENT,
}
_access_token = None
_access_token_expiry = 0
_refresh_token = None
def _perform_login(self, username, password): @property
login_token = self._download_json( def _access_token_is_expired(self):
self._OAUTH_URL, return self._access_token_expiry - 30 <= int(time.time())
None, 'Obtaining token', errnote='Unable to obtain token', data=urlencode_postdata({
def _set_access_token(self, value):
self._access_token = value
self._access_token_expiry = traverse_obj(value, ({jwt_decode_hs256}, 'exp', {int})) or 0
def _cache_tokens(self, /):
self.cache.store(self._NETRC_MACHINE, 'tokens', {
'access_token': self._access_token,
'refresh_token': self._refresh_token,
})
def _fetch_new_tokens(self, invalidate=False):
if invalidate:
self.report_warning('Access token has been invalidated')
self._set_access_token(None)
if not self._access_token_is_expired:
return
if not self._refresh_token:
self._set_access_token(None)
self._cache_tokens()
raise ExtractorError(
'Access token has expired or been invalidated. '
'Get a new "access_token_production" value from your browser '
f'and try again, {self._REFRESH_HINT}', expected=True)
# If we only have a refresh token, we need a temporary "initial token" for the refresh flow
bearer_token = self._access_token or self._download_json(
self._OAUTH_URL, None, 'Obtaining initial token', 'Unable to obtain initial token',
data=urlencode_postdata({
'affiliate': 'none', 'affiliate': 'none',
'grant_type': 'device', 'grant_type': 'device',
'device_vendor': 'unknown', 'device_vendor': 'unknown',
# device_model 'Safari' gets split streams of 4K/HEVC video and lossless/FLAC audio # device_model 'Safari' gets split streams of 4K/HEVC video and lossless/FLAC audio,
'device_model': 'unknown' if self._configuration_arg('prefer_combined_hls') else 'Safari', # but this is no longer effective since actual login is not possible anymore
'app_id': 'dch.webapp', 'device_model': 'unknown',
'app_id': self._CLIENT_ID,
'app_distributor': 'berlinphil', 'app_distributor': 'berlinphil',
'app_version': '1.84.0', 'app_version': '1.95.0',
'client_secret': '2ySLN+2Fwb', 'client_secret': self._CLIENT_SECRET,
}), headers={ }), headers=self._OAUTH_HEADERS)['access_token']
'Accept': 'application/json',
'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8',
'User-Agent': self._USER_AGENT,
})['access_token']
try: try:
login_response = self._download_json( response = self._download_json(
self._OAUTH_URL, self._OAUTH_URL, None, 'Refreshing token', 'Unable to refresh token',
None, note='Logging in', errnote='Unable to login', data=urlencode_postdata({ data=urlencode_postdata({
'grant_type': 'password', 'grant_type': 'refresh_token',
'username': username, 'refresh_token': self._refresh_token,
'password': password, 'client_id': self._CLIENT_ID,
'client_secret': self._CLIENT_SECRET,
}), headers={ }), headers={
'Accept': 'application/json', **self._OAUTH_HEADERS,
'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8', 'Authorization': f'Bearer {bearer_token}',
'Referer': 'https://www.digitalconcerthall.com',
'Authorization': f'Bearer {login_token}',
'User-Agent': self._USER_AGENT,
}) })
except ExtractorError as error: except ExtractorError as e:
if isinstance(error.cause, HTTPError) and error.cause.status == 401: if isinstance(e.cause, HTTPError) and e.cause.status == 401:
raise ExtractorError('Invalid username or password', expected=True) self._set_access_token(None)
self._refresh_token = None
self._cache_tokens()
raise ExtractorError('Your tokens have been invalidated', expected=True)
raise raise
self._ACCESS_TOKEN = login_response['access_token']
self._set_access_token(response['access_token'])
if refresh_token := traverse_obj(response, ('refresh_token', {str})):
self.write_debug('New refresh token granted')
self._refresh_token = refresh_token
self._cache_tokens()
def _perform_login(self, username, password):
self.report_login()
if username == 'refresh':
self._refresh_token = password
self._fetch_new_tokens()
if username == 'token':
if not traverse_obj(password, {jwt_decode_hs256}):
raise ExtractorError(
f'The access token passed to yt-dlp is not valid. {self._LOGIN_HINT}', expected=True)
self._set_access_token(password)
self._cache_tokens()
if username in ('refresh', 'token'):
if self.get_param('cachedir') is not False:
token_type = 'access' if username == 'token' else 'refresh'
self.to_screen(f'Your {token_type} token has been cached to disk. To use the cached '
'token next time, pass --username cache along with any password')
return
if username != 'cache':
raise ExtractorError(
'Login with username and password is no longer supported '
f'for this site. {self._LOGIN_HINT}, {self._REFRESH_HINT}', expected=True)
# Try cached access_token
cached_tokens = self.cache.load(self._NETRC_MACHINE, 'tokens', default={})
self._set_access_token(cached_tokens.get('access_token'))
self._refresh_token = cached_tokens.get('refresh_token')
if not self._access_token_is_expired:
return
# Try cached refresh_token
self._fetch_new_tokens(invalidate=True)
def _real_initialize(self): def _real_initialize(self):
if not self._ACCESS_TOKEN: if not self._access_token:
self.raise_login_required(method='password') self.raise_login_required(
'All content on this site is only available for registered users. '
f'{self._LOGIN_HINT}, {self._REFRESH_HINT}', method=None)
def _entries(self, items, language, type_, **kwargs): def _entries(self, items, language, type_, **kwargs):
for item in items: for item in items:
video_id = item['id'] video_id = item['id']
stream_info = self._download_json(
self._proto_relative_url(item['_links']['streams']['href']), video_id, headers={ for should_retry in (True, False):
'Accept': 'application/json', self._fetch_new_tokens(invalidate=not should_retry)
'Authorization': f'Bearer {self._ACCESS_TOKEN}', try:
'Accept-Language': language, stream_info = self._download_json(
'User-Agent': self._USER_AGENT, self._proto_relative_url(item['_links']['streams']['href']), video_id, headers={
}) 'Accept': 'application/json',
'Authorization': f'Bearer {self._access_token}',
'Accept-Language': language,
'User-Agent': self._USER_AGENT,
})
break
except ExtractorError as error:
if should_retry and isinstance(error.cause, HTTPError) and error.cause.status == 401:
continue
raise
formats = [] formats = []
for m3u8_url in traverse_obj(stream_info, ('channel', ..., 'stream', ..., 'url', {url_or_none})): for m3u8_url in traverse_obj(stream_info, ('channel', ..., 'stream', ..., 'url', {url_or_none})):
@@ -157,7 +255,6 @@ class DigitalConcertHallIE(InfoExtractor):
'Accept': 'application/json', 'Accept': 'application/json',
'Accept-Language': language, 'Accept-Language': language,
'User-Agent': self._USER_AGENT, 'User-Agent': self._USER_AGENT,
'Authorization': f'Bearer {self._ACCESS_TOKEN}',
}) })
videos = [vid_info] if type_ == 'film' else traverse_obj(vid_info, ('_embedded', ..., ...)) videos = [vid_info] if type_ == 'film' else traverse_obj(vid_info, ('_embedded', ..., ...))

View File

@@ -48,32 +48,30 @@ class DropboxIE(InfoExtractor):
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
fn = urllib.parse.unquote(url_basename(url)) fn = urllib.parse.unquote(url_basename(url))
title = os.path.splitext(fn)[0] title = os.path.splitext(fn)[0]
password = self.get_param('videopassword') content_id = None
for part in self._yield_decoded_parts(webpage): for part in self._yield_decoded_parts(webpage):
if '/sm/password' in part: if '/sm/password' in part:
webpage = self._download_webpage( content_id = self._search_regex(r'content_id=([\w.+=/-]+)', part, 'content ID')
update_url('https://www.dropbox.com/sm/password', query=part.partition('?')[2]), video_id)
break break
if (self._og_search_title(webpage, default=None) == 'Dropbox - Password Required' if content_id:
or 'Enter the password for this link' in webpage): password = self.get_param('videopassword')
if password: if not password:
response = self._download_json(
'https://www.dropbox.com/sm/auth', video_id, 'POSTing video password',
headers={'content-type': 'application/x-www-form-urlencoded; charset=UTF-8'},
data=urlencode_postdata({
'is_xhr': 'true',
't': self._get_cookies('https://www.dropbox.com')['t'].value,
'content_id': self._search_regex(r'content_id=([\w.+=/-]+)["\']', webpage, 'content id'),
'password': password,
'url': url,
}))
if response.get('status') != 'authed':
raise ExtractorError('Invalid password', expected=True)
elif not self._get_cookies('https://dropbox.com').get('sm_auth'):
raise ExtractorError('Password protected video, use --video-password <password>', expected=True) raise ExtractorError('Password protected video, use --video-password <password>', expected=True)
response = self._download_json(
'https://www.dropbox.com/sm/auth', video_id, 'POSTing video password',
data=urlencode_postdata({
'is_xhr': 'true',
't': self._get_cookies('https://www.dropbox.com')['t'].value,
'content_id': content_id,
'password': password,
'url': update_url(url, scheme='', netloc=''),
}))
if response.get('status') != 'authed':
raise ExtractorError('Invalid password', expected=True)
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
formats, subtitles = [], {} formats, subtitles = [], {}

View File

@@ -135,7 +135,7 @@ class DropoutIE(InfoExtractor):
self.raise_login_required(method='any') self.raise_login_required(method='any')
raise ExtractorError(login_err, expected=True) raise ExtractorError(login_err, expected=True)
embed_url = self._search_regex(r'embed_url:\s*["\'](.+?)["\']', webpage, 'embed url') embed_url = self._html_search_regex(r'embed_url:\s*["\'](.+?)["\']', webpage, 'embed url')
thumbnail = self._og_search_thumbnail(webpage) thumbnail = self._og_search_thumbnail(webpage)
watch_info = get_element_by_id('watch-info', webpage) or '' watch_info = get_element_by_id('watch-info', webpage) or ''

View File

@@ -0,0 +1,51 @@
from .brightcove import BrightcoveNewIE
from .common import InfoExtractor
from ..utils import url_or_none
from ..utils.traversal import traverse_obj
class DrTalksIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?drtalks\.com/videos/(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://drtalks.com/videos/six-pillars-of-resilience-tools-for-managing-stress-and-flourishing/',
'info_dict': {
'id': '6366193757112',
'ext': 'mp4',
'uploader_id': '6314452011001',
'tags': ['resilience'],
'description': 'md5:9c6805aee237ee6de8052461855b9dda',
'timestamp': 1734546659,
'thumbnail': 'https://drtalks.com/wp-content/uploads/2024/12/Episode-82-Eva-Selhub-DrTalks-Thumbs.jpg',
'title': 'Six Pillars of Resilience: Tools for Managing Stress and Flourishing',
'duration': 2800.682,
'upload_date': '20241218',
},
}, {
'url': 'https://drtalks.com/videos/the-pcos-puzzle-mastering-metabolic-health-with-marcelle-pick/',
'info_dict': {
'id': '6364699891112',
'ext': 'mp4',
'title': 'The PCOS Puzzle: Mastering Metabolic Health with Marcelle Pick',
'description': 'md5:e87cbe00ca50135d5702787fc4043aaa',
'thumbnail': 'https://drtalks.com/wp-content/uploads/2024/11/Episode-34-Marcelle-Pick-OBGYN-NP-DrTalks.jpg',
'duration': 3515.2,
'tags': ['pcos'],
'upload_date': '20241114',
'timestamp': 1731592119,
'uploader_id': '6314452011001',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
next_data = self._search_nextjs_data(webpage, video_id)['props']['pageProps']['data']['video']
return self.url_result(
next_data['videos']['brightcoveVideoLink'], BrightcoveNewIE, video_id,
url_transparent=True,
**traverse_obj(next_data, {
'title': ('title', {str}),
'description': ('videos', 'summury', {str}),
'thumbnail': ('featuredImage', 'node', 'sourceUrl', {url_or_none}),
}))

View File

@@ -5,15 +5,16 @@ from ..utils import (
get_element_text_and_html_by_tag, get_element_text_and_html_by_tag,
int_or_none, int_or_none,
join_nonempty, join_nonempty,
parse_qs,
str_or_none, str_or_none,
try_call, try_call,
unified_timestamp, unified_timestamp,
) )
from ..utils.traversal import traverse_obj from ..utils.traversal import traverse_obj, value
class DuoplayIE(InfoExtractor): class DuoplayIE(InfoExtractor):
_VALID_URL = r'https?://duoplay\.ee/(?P<id>\d+)/[\w-]+/?(?:\?(?:[^#]+&)?ep=(?P<ep>\d+))?' _VALID_URL = r'https?://duoplay\.ee/(?P<id>\d+)(?:[/?#]|$)'
_TESTS = [{ _TESTS = [{
'note': 'Siberi võmm S02E12', 'note': 'Siberi võmm S02E12',
'url': 'https://duoplay.ee/4312/siberi-vomm?ep=24', 'url': 'https://duoplay.ee/4312/siberi-vomm?ep=24',
@@ -34,15 +35,16 @@ class DuoplayIE(InfoExtractor):
'episode_number': 12, 'episode_number': 12,
'episode_id': '24', 'episode_id': '24',
}, },
'skip': 'No video found',
}, { }, {
'note': 'Empty title', 'note': 'Empty title',
'url': 'https://duoplay.ee/17/uhikarotid?ep=14', 'url': 'https://duoplay.ee/17/uhikarotid?ep=14',
'md5': '6aca68be71112314738dd17cced7f8bf', 'md5': 'cba9f5dabf2582b224d80ac44fb80e47',
'info_dict': { 'info_dict': {
'id': '17_14', 'id': '17_14',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Ühikarotid', 'title': 'Episode 14',
'thumbnail': r're:https://.+\.jpg(?:\?c=\d+)?$', 'thumbnail': r're:https?://.+\.jpg',
'description': 'md5:4719b418e058c209def41d48b601276e', 'description': 'md5:4719b418e058c209def41d48b601276e',
'upload_date': '20100916', 'upload_date': '20100916',
'timestamp': 1284661800, 'timestamp': 1284661800,
@@ -52,6 +54,8 @@ class DuoplayIE(InfoExtractor):
'season_number': 2, 'season_number': 2,
'episode_id': '14', 'episode_id': '14',
'release_year': 2010, 'release_year': 2010,
'episode': 'Episode 14',
'episode_number': 14,
}, },
}, { }, {
'note': 'Movie without expiry', 'note': 'Movie without expiry',
@@ -68,10 +72,32 @@ class DuoplayIE(InfoExtractor):
'timestamp': 1671054000, 'timestamp': 1671054000,
'release_year': 2018, 'release_year': 2018,
}, },
'skip': 'No video found',
}, {
'note': 'Episode url without show name',
'url': 'https://duoplay.ee/9644?ep=185',
'md5': '63f324b4fe2dbd8194dca16a6d52184a',
'info_dict': {
'id': '9644_185',
'ext': 'mp4',
'title': 'Episode 185',
'thumbnail': r're:https?://.+\.jpg',
'description': 'md5:ed25ba4e9e5d54bc291a4a0cdd241467',
'upload_date': '20241120',
'timestamp': 1732077000,
'episode': 'Episode 63',
'episode_id': '185',
'episode_number': 63,
'season': 'Season 2',
'season_number': 2,
'series': 'Telehommik',
'series_id': '9644',
},
}] }]
def _real_extract(self, url): def _real_extract(self, url):
telecast_id, episode = self._match_valid_url(url).group('id', 'ep') telecast_id = self._match_id(url)
episode = traverse_obj(parse_qs(url), ('ep', 0, {int_or_none}, {str_or_none}))
video_id = join_nonempty(telecast_id, episode, delim='_') video_id = join_nonempty(telecast_id, episode, delim='_')
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
video_player = try_call(lambda: extract_attributes( video_player = try_call(lambda: extract_attributes(
@@ -79,25 +105,33 @@ class DuoplayIE(InfoExtractor):
if not video_player or not video_player.get('manifest-url'): if not video_player or not video_player.get('manifest-url'):
raise ExtractorError('No video found', expected=True) raise ExtractorError('No video found', expected=True)
manifest_url = video_player['manifest-url']
session_token = self._download_json(
'https://sts.postimees.ee/session/register', video_id, 'Registering session',
'Unable to register session', headers={
'Accept': 'application/json',
'X-Original-URI': manifest_url,
})['session']
episode_attr = self._parse_json(video_player.get(':episode') or '', video_id, fatal=False) or {} episode_attr = self._parse_json(video_player.get(':episode') or '', video_id, fatal=False) or {}
return { return {
'id': video_id, 'id': video_id,
'formats': self._extract_m3u8_formats(video_player['manifest-url'], video_id, 'mp4'), 'formats': self._extract_m3u8_formats(manifest_url, video_id, 'mp4', query={'s': session_token}),
**traverse_obj(episode_attr, { **traverse_obj(episode_attr, {
'title': 'title', 'title': ('title', {str}),
'description': 'synopsis', 'description': ('synopsis', {str}),
'thumbnail': ('images', 'original'), 'thumbnail': ('images', 'original'),
'timestamp': ('airtime', {lambda x: unified_timestamp(x + ' +0200')}), 'timestamp': ('airtime', {lambda x: unified_timestamp(x + ' +0200')}),
'cast': ('cast', {lambda x: x.split(', ')}), 'cast': ('cast', filter, {lambda x: x.split(', ')}),
'release_year': ('year', {int_or_none}), 'release_year': ('year', {int_or_none}),
}), }),
**(traverse_obj(episode_attr, { **(traverse_obj(episode_attr, {
'title': (None, ('subtitle', ('episode_nr', {lambda x: f'Episode {x}' if x else None}))), 'title': (None, (('subtitle', {str}, filter), {value(f'Episode {episode}' if episode else None)})),
'series': 'title', 'series': ('title', {str}),
'series_id': ('telecast_id', {str_or_none}), 'series_id': ('telecast_id', {str_or_none}),
'season_number': ('season_id', {int_or_none}), 'season_number': ('season_id', {int_or_none}),
'episode': 'subtitle', 'episode': ('subtitle', {str}, filter),
'episode_number': ('episode_nr', {int_or_none}), 'episode_number': ('episode_nr', {int_or_none}),
'episode_id': ('episode_id', {str_or_none}), 'episode_id': ('episode_id', {str_or_none}),
}, get_all=False) if episode_attr.get('category') != 'movies' else {}), }, get_all=False) if episode_attr.get('category') != 'movies' else {}),

View File

@@ -162,7 +162,7 @@ class DVTVIE(InfoExtractor):
items = re.findall(r'(?s)playlist\.push\(({.+?})\);', webpage) items = re.findall(r'(?s)playlist\.push\(({.+?})\);', webpage)
if items: if items:
return self.playlist_result( return self.playlist_result(
[self._parse_video_metadata(i, video_id, timestamp) for i in items], (self._parse_video_metadata(i, video_id, timestamp) for i in items),
video_id, self._html_search_meta('twitter:title', webpage)) video_id, self._html_search_meta('twitter:title', webpage))
item = self._search_regex( item = self._search_regex(

155
yt_dlp/extractor/eggs.py Normal file
View File

@@ -0,0 +1,155 @@
import secrets
from .common import InfoExtractor
from .youtube import YoutubeIE
from ..utils import (
int_or_none,
parse_iso8601,
str_or_none,
url_or_none,
)
from ..utils.traversal import traverse_obj
class EggsBaseIE(InfoExtractor):
_API_HEADERS = {
'Accept': '*/*',
'apVersion': '8.2.00',
'deviceName': 'Android',
}
def _real_initialize(self):
self._API_HEADERS['deviceId'] = secrets.token_hex(8)
def _call_api(self, endpoint, video_id):
return self._download_json(
f'https://app-front-api.eggs.mu/v1/{endpoint}', video_id,
headers=self._API_HEADERS)
def _extract_music_info(self, data):
if yt_url := traverse_obj(data, ('youtubeUrl', {url_or_none})):
return self.url_result(yt_url, ie=YoutubeIE)
artist_name = traverse_obj(data, ('artist', 'artistName', {str_or_none}))
music_id = traverse_obj(data, ('musicId', {str_or_none}))
webpage_url = None
if artist_name and music_id:
webpage_url = f'https://eggs.mu/artist/{artist_name}/song/{music_id}'
return {
'id': music_id,
'vcodec': 'none',
'webpage_url': webpage_url,
'extractor_key': EggsIE.ie_key(),
'extractor': EggsIE.IE_NAME,
**traverse_obj(data, {
'title': ('musicTitle', {str}),
'url': ('musicDataPath', {url_or_none}),
'uploader': ('artist', 'displayName', {str}),
'uploader_id': ('artist', 'artistId', {str_or_none}),
'thumbnail': ('imageDataPath', {url_or_none}),
'view_count': ('numberOfMusicPlays', {int_or_none}),
'like_count': ('numberOfLikes', {int_or_none}),
'comment_count': ('numberOfComments', {int_or_none}),
'composers': ('composer', {str}, all),
'tags': ('tags', ..., {str}),
'timestamp': ('releaseDate', {parse_iso8601}),
'artist': ('artist', 'displayName', {str}),
})}
class EggsIE(EggsBaseIE):
IE_NAME = 'eggs:single'
_VALID_URL = r'https?://eggs\.mu/artist/[^/?#]+/song/(?P<id>[\da-f-]+)'
_TESTS = [{
'url': 'https://eggs.mu/artist/32_sunny_girl/song/0e95fd1d-4d61-4d5b-8b18-6092c551da90',
'info_dict': {
'id': '0e95fd1d-4d61-4d5b-8b18-6092c551da90',
'ext': 'm4a',
'title': 'シネマと信号',
'uploader': 'Sunny Girl',
'thumbnail': r're:https?://.*\.jpg(?:\?.*)?$',
'uploader_id': '1607',
'like_count': int,
'timestamp': 1731327327,
'composers': ['橘高連太郎'],
'view_count': int,
'comment_count': int,
'artists': ['Sunny Girl'],
'upload_date': '20241111',
'tags': ['SunnyGirl', 'シネマと信号'],
},
}, {
'url': 'https://eggs.mu/artist/KAMO_3pband/song/1d4bc45f-1af6-47a9-8b30-a70cae350b4f',
'info_dict': {
'id': '80cLKA2wnoA',
'ext': 'mp4',
'title': 'KAMO「いい女だから」Audio',
'uploader': 'KAMO',
'live_status': 'not_live',
'channel_id': 'UCsHLBw2__5Q9y55skXPotOg',
'channel_follower_count': int,
'description': 'md5:d260da711ecbec3e720293dc11401b87',
'availability': 'public',
'uploader_id': '@KAMO_band',
'upload_date': '20240925',
'thumbnail': 'https://i.ytimg.com/vi/80cLKA2wnoA/maxresdefault.jpg',
'comment_count': int,
'channel_url': 'https://www.youtube.com/channel/UCsHLBw2__5Q9y55skXPotOg',
'view_count': int,
'duration': 151,
'like_count': int,
'channel': 'KAMO',
'playable_in_embed': True,
'uploader_url': 'https://www.youtube.com/@KAMO_band',
'tags': [],
'timestamp': 1727271121,
'age_limit': 0,
'categories': ['People & Blogs'],
},
'add_ie': ['Youtube'],
'params': {'skip_download': 'Youtube'},
}]
def _real_extract(self, url):
song_id = self._match_id(url)
json_data = self._call_api(f'musics/{song_id}', song_id)
return self._extract_music_info(json_data)
class EggsArtistIE(EggsBaseIE):
IE_NAME = 'eggs:artist'
_VALID_URL = r'https?://eggs\.mu/artist/(?P<id>\w+)/?(?:[?#&]|$)'
_TESTS = [{
'url': 'https://eggs.mu/artist/32_sunny_girl',
'info_dict': {
'id': '32_sunny_girl',
'thumbnail': 'https://image-pro.eggs.mu/profile/1607.jpeg?updated_at=2024-04-03T20%3A06%3A00%2B09%3A00',
'description': 'Muddy Mine / 東京高田馬場CLUB PHASE / Gt.Vo 橘高 連太郎 / Ba.Cho 小野 ゆうき / Dr 大森 りゅうひこ',
'title': 'Sunny Girl',
},
'playlist_mincount': 18,
}, {
'url': 'https://eggs.mu/artist/KAMO_3pband',
'info_dict': {
'id': 'KAMO_3pband',
'description': '川崎発3ピースバンド',
'thumbnail': 'https://image-pro.eggs.mu/profile/35217.jpeg?updated_at=2024-11-27T16%3A31%3A50%2B09%3A00',
'title': 'KAMO',
},
'playlist_mincount': 2,
}]
def _real_extract(self, url):
artist_id = self._match_id(url)
artist_data = self._call_api(f'artists/{artist_id}', artist_id)
song_data = self._call_api(f'artists/{artist_id}/musics', artist_id)
return self.playlist_result(
traverse_obj(song_data, ('data', ..., {dict}, {self._extract_music_info})),
playlist_id=artist_id, **traverse_obj(artist_data, {
'title': ('displayName', {str}),
'description': ('profile', {str}),
'thumbnail': ('imageDataPath', {url_or_none}),
}))

View File

@@ -50,7 +50,7 @@ class FacebookIE(InfoExtractor):
[^/]+/videos/(?:[^/]+/)?| [^/]+/videos/(?:[^/]+/)?|
[^/]+/posts/| [^/]+/posts/|
events/(?:[^/]+/)?| events/(?:[^/]+/)?|
groups/[^/]+/(?:permalink|posts)/| groups/[^/]+/(?:permalink|posts)/(?:[\da-f]+/)?|
watchparty/ watchparty/
)| )|
facebook: facebook:
@@ -410,6 +410,9 @@ class FacebookIE(InfoExtractor):
'uploader': 'Comitato Liberi Pensatori', 'uploader': 'Comitato Liberi Pensatori',
'uploader_id': '100065709540881', 'uploader_id': '100065709540881',
}, },
}, {
'url': 'https://www.facebook.com/groups/1513990329015294/posts/d41d8cd9/2013209885760000/?app=fbl',
'only_matching': True,
}] }]
_SUPPORTED_PAGLETS_REGEX = r'(?:pagelet_group_mall|permalink_video_pagelet|hyperfeed_story_id_[0-9a-f]+)' _SUPPORTED_PAGLETS_REGEX = r'(?:pagelet_group_mall|permalink_video_pagelet|hyperfeed_story_id_[0-9a-f]+)'
_api_config = { _api_config = {
@@ -563,13 +566,13 @@ class FacebookIE(InfoExtractor):
return extract_video_data(try_get( return extract_video_data(try_get(
js_data, lambda x: x['jsmods']['instances'], list) or []) js_data, lambda x: x['jsmods']['instances'], list) or [])
def extract_dash_manifest(video, formats): def extract_dash_manifest(vid_data, formats, mpd_url=None):
dash_manifest = traverse_obj( dash_manifest = traverse_obj(
video, 'dash_manifest', 'playlist', 'dash_manifest_xml_string', expected_type=str) vid_data, 'dash_manifest', 'playlist', 'dash_manifest_xml_string', 'manifest_xml', expected_type=str)
if dash_manifest: if dash_manifest:
formats.extend(self._parse_mpd_formats( formats.extend(self._parse_mpd_formats(
compat_etree_fromstring(urllib.parse.unquote_plus(dash_manifest)), compat_etree_fromstring(urllib.parse.unquote_plus(dash_manifest)),
mpd_url=url_or_none(video.get('dash_manifest_url')))) mpd_url=url_or_none(vid_data.get('dash_manifest_url')) or mpd_url))
def process_formats(info): def process_formats(info):
# Downloads with browser's User-Agent are rate limited. Working around # Downloads with browser's User-Agent are rate limited. Working around
@@ -619,9 +622,12 @@ class FacebookIE(InfoExtractor):
video = video['creation_story'] video = video['creation_story']
video['owner'] = traverse_obj(video, ('short_form_video_context', 'video_owner')) video['owner'] = traverse_obj(video, ('short_form_video_context', 'video_owner'))
video.update(reel_info) video.update(reel_info)
fmt_data = traverse_obj(video, ('videoDeliveryLegacyFields', {dict})) or video
formats = [] formats = []
q = qualities(['sd', 'hd']) q = qualities(['sd', 'hd'])
# Legacy formats extraction
fmt_data = traverse_obj(video, ('videoDeliveryLegacyFields', {dict})) or video
for key, format_id in (('playable_url', 'sd'), ('playable_url_quality_hd', 'hd'), for key, format_id in (('playable_url', 'sd'), ('playable_url_quality_hd', 'hd'),
('playable_url_dash', ''), ('browser_native_hd_url', 'hd'), ('playable_url_dash', ''), ('browser_native_hd_url', 'hd'),
('browser_native_sd_url', 'sd')): ('browser_native_sd_url', 'sd')):
@@ -629,7 +635,7 @@ class FacebookIE(InfoExtractor):
if not playable_url: if not playable_url:
continue continue
if determine_ext(playable_url) == 'mpd': if determine_ext(playable_url) == 'mpd':
formats.extend(self._extract_mpd_formats(playable_url, video_id)) formats.extend(self._extract_mpd_formats(playable_url, video_id, fatal=False))
else: else:
formats.append({ formats.append({
'format_id': format_id, 'format_id': format_id,
@@ -638,6 +644,28 @@ class FacebookIE(InfoExtractor):
'url': playable_url, 'url': playable_url,
}) })
extract_dash_manifest(fmt_data, formats) extract_dash_manifest(fmt_data, formats)
# New videoDeliveryResponse formats extraction
fmt_data = traverse_obj(video, ('videoDeliveryResponseFragment', 'videoDeliveryResponseResult'))
mpd_urls = traverse_obj(fmt_data, ('dash_manifest_urls', ..., 'manifest_url', {url_or_none}))
dash_manifests = traverse_obj(fmt_data, ('dash_manifests', lambda _, v: v['manifest_xml']))
for idx, dash_manifest in enumerate(dash_manifests):
extract_dash_manifest(dash_manifest, formats, mpd_url=traverse_obj(mpd_urls, idx))
if not dash_manifests:
# Only extract from MPD URLs if the manifests are not already provided
for mpd_url in mpd_urls:
formats.extend(self._extract_mpd_formats(mpd_url, video_id, fatal=False))
for prog_fmt in traverse_obj(fmt_data, ('progressive_urls', lambda _, v: v['progressive_url'])):
format_id = traverse_obj(prog_fmt, ('metadata', 'quality', {str.lower}))
formats.append({
'format_id': format_id,
# sd, hd formats w/o resolution info should be deprioritized below DASH
'quality': q(format_id) - 3,
'url': prog_fmt['progressive_url'],
})
for m3u8_url in traverse_obj(fmt_data, ('hls_playlist_urls', ..., 'hls_playlist_url', {url_or_none})):
formats.extend(self._extract_m3u8_formats(m3u8_url, video_id, 'mp4', fatal=False, m3u8_id='hls'))
if not formats: if not formats:
# Do not append false positive entry w/o any formats # Do not append false positive entry w/o any formats
return return

View File

@@ -12,7 +12,7 @@ from ..utils import (
class FirstTVIE(InfoExtractor): class FirstTVIE(InfoExtractor):
IE_NAME = '1tv' IE_NAME = '1tv'
IE_DESC = 'Первый канал' IE_DESC = 'Первый канал'
_VALID_URL = r'https?://(?:www\.)?1tv\.ru/(?:[^/]+/)+(?P<id>[^/?#]+)' _VALID_URL = r'https?://(?:www\.)?(?:sport)?1tv\.ru/(?:[^/?#]+/)+(?P<id>[^/?#]+)'
_TESTS = [{ _TESTS = [{
# single format # single format
@@ -52,6 +52,9 @@ class FirstTVIE(InfoExtractor):
}, { }, {
'url': 'http://www.1tv.ru/shows/tochvtoch-supersezon/vystupleniya/evgeniy-dyatlov-vladimir-vysockiy-koni-priveredlivye-toch-v-toch-supersezon-fragment-vypuska-ot-06-11-2016', 'url': 'http://www.1tv.ru/shows/tochvtoch-supersezon/vystupleniya/evgeniy-dyatlov-vladimir-vysockiy-koni-priveredlivye-toch-v-toch-supersezon-fragment-vypuska-ot-06-11-2016',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.sport1tv.ru/sport/chempionat-rossii-po-figurnomu-kataniyu-2025',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):

View File

@@ -1,349 +0,0 @@
import random
import re
import string
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
determine_ext,
int_or_none,
join_nonempty,
js_to_json,
make_archive_id,
orderedSet,
qualities,
str_or_none,
traverse_obj,
try_get,
urlencode_postdata,
)
class FunimationBaseIE(InfoExtractor):
_NETRC_MACHINE = 'funimation'
_REGION = None
_TOKEN = None
def _get_region(self):
region_cookie = self._get_cookies('https://www.funimation.com').get('region')
region = region_cookie.value if region_cookie else self.get_param('geo_bypass_country')
return region or traverse_obj(
self._download_json(
'https://geo-service.prd.funimationsvc.com/geo/v1/region/check', None, fatal=False,
note='Checking geo-location', errnote='Unable to fetch geo-location information'),
'region') or 'US'
def _perform_login(self, username, password):
if self._TOKEN:
return
try:
data = self._download_json(
'https://prod-api-funimationnow.dadcdigital.com/api/auth/login/',
None, 'Logging in', data=urlencode_postdata({
'username': username,
'password': password,
}))
FunimationBaseIE._TOKEN = data['token']
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 401:
error = self._parse_json(e.cause.response.read().decode(), None)['error']
raise ExtractorError(error, expected=True)
raise
class FunimationPageIE(FunimationBaseIE):
IE_NAME = 'funimation:page'
_VALID_URL = r'https?://(?:www\.)?funimation(?:\.com|now\.uk)/(?:(?P<lang>[^/]+)/)?(?:shows|v)/(?P<show>[^/]+)/(?P<episode>[^/?#&]+)'
_TESTS = [{
'url': 'https://www.funimation.com/shows/attack-on-titan-junior-high/broadcast-dub-preview/',
'info_dict': {
'id': '210050',
'ext': 'mp4',
'title': 'Broadcast Dub Preview',
# Other metadata is tested in FunimationIE
},
'params': {
'skip_download': 'm3u8',
},
'add_ie': ['Funimation'],
}, {
# Not available in US
'url': 'https://www.funimation.com/shows/hacksign/role-play/',
'only_matching': True,
}, {
# with lang code
'url': 'https://www.funimation.com/en/shows/hacksign/role-play/',
'only_matching': True,
}, {
'url': 'https://www.funimationnow.uk/shows/puzzle-dragons-x/drop-impact/simulcast/',
'only_matching': True,
}, {
'url': 'https://www.funimation.com/v/a-certain-scientific-railgun/super-powered-level-5',
'only_matching': True,
}]
def _real_initialize(self):
if not self._REGION:
FunimationBaseIE._REGION = self._get_region()
def _real_extract(self, url):
locale, show, episode = self._match_valid_url(url).group('lang', 'show', 'episode')
video_id = traverse_obj(self._download_json(
f'https://title-api.prd.funimationsvc.com/v1/shows/{show}/episodes/{episode}',
f'{show}_{episode}', query={
'deviceType': 'web',
'region': self._REGION,
'locale': locale or 'en',
}), ('videoList', ..., 'id'), get_all=False)
return self.url_result(f'https://www.funimation.com/player/{video_id}', FunimationIE.ie_key(), video_id)
class FunimationIE(FunimationBaseIE):
_VALID_URL = r'https?://(?:www\.)?funimation\.com/player/(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.funimation.com/player/210051',
'info_dict': {
'id': '210050',
'display_id': 'broadcast-dub-preview',
'ext': 'mp4',
'title': 'Broadcast Dub Preview',
'thumbnail': r're:https?://.*\.(?:jpg|png)',
'episode': 'Broadcast Dub Preview',
'episode_id': '210050',
'season': 'Extras',
'season_id': '166038',
'season_number': 99,
'series': 'Attack on Titan: Junior High',
'description': '',
'duration': 155,
},
'params': {
'skip_download': 'm3u8',
},
}, {
'note': 'player_id should be extracted with the relevent compat-opt',
'url': 'https://www.funimation.com/player/210051',
'info_dict': {
'id': '210051',
'display_id': 'broadcast-dub-preview',
'ext': 'mp4',
'title': 'Broadcast Dub Preview',
'thumbnail': r're:https?://.*\.(?:jpg|png)',
'episode': 'Broadcast Dub Preview',
'episode_id': '210050',
'season': 'Extras',
'season_id': '166038',
'season_number': 99,
'series': 'Attack on Titan: Junior High',
'description': '',
'duration': 155,
},
'params': {
'skip_download': 'm3u8',
'compat_opts': ['seperate-video-versions'],
},
}]
@staticmethod
def _get_experiences(episode):
for lang, lang_data in episode.get('languages', {}).items():
for video_data in lang_data.values():
for version, f in video_data.items():
yield lang, version.title(), f
def _get_episode(self, webpage, experience_id=None, episode_id=None, fatal=True):
""" Extract the episode, season and show objects given either episode/experience id """
show = self._parse_json(
self._search_regex(
r'show\s*=\s*({.+?})\s*;', webpage, 'show data', fatal=fatal),
experience_id, transform_source=js_to_json, fatal=fatal) or []
for season in show.get('seasons', []):
for episode in season.get('episodes', []):
if episode_id is not None:
if str(episode.get('episodePk')) == episode_id:
return episode, season, show
continue
for _, _, f in self._get_experiences(episode):
if f.get('experienceId') == experience_id:
return episode, season, show
if fatal:
raise ExtractorError('Unable to find episode information')
else:
self.report_warning('Unable to find episode information')
return {}, {}, {}
def _real_extract(self, url):
initial_experience_id = self._match_id(url)
webpage = self._download_webpage(
url, initial_experience_id, note=f'Downloading player webpage for {initial_experience_id}')
episode, season, show = self._get_episode(webpage, experience_id=int(initial_experience_id))
episode_id = str(episode['episodePk'])
display_id = episode.get('slug') or episode_id
formats, subtitles, thumbnails, duration = [], {}, [], 0
requested_languages, requested_versions = self._configuration_arg('language'), self._configuration_arg('version')
language_preference = qualities((requested_languages or [''])[::-1])
source_preference = qualities((requested_versions or ['uncut', 'simulcast'])[::-1])
only_initial_experience = 'seperate-video-versions' in self.get_param('compat_opts', [])
for lang, version, fmt in self._get_experiences(episode):
experience_id = str(fmt['experienceId'])
if (only_initial_experience and experience_id != initial_experience_id
or requested_languages and lang.lower() not in requested_languages
or requested_versions and version.lower() not in requested_versions):
continue
thumbnails.append({'url': fmt.get('poster')})
duration = max(duration, fmt.get('duration', 0))
format_name = f'{version} {lang} ({experience_id})'
self.extract_subtitles(
subtitles, experience_id, display_id=display_id, format_name=format_name,
episode=episode if experience_id == initial_experience_id else episode_id)
headers = {}
if self._TOKEN:
headers['Authorization'] = f'Token {self._TOKEN}'
page = self._download_json(
f'https://www.funimation.com/api/showexperience/{experience_id}/',
display_id, headers=headers, expected_status=403, query={
'pinst_id': ''.join(random.choices(string.digits + string.ascii_letters, k=8)),
}, note=f'Downloading {format_name} JSON')
sources = page.get('items') or []
if not sources:
error = try_get(page, lambda x: x['errors'][0], dict)
if error:
self.report_warning('{} said: Error {} - {}'.format(
self.IE_NAME, error.get('code'), error.get('detail') or error.get('title')))
else:
self.report_warning('No sources found for format')
current_formats = []
for source in sources:
source_url = source.get('src')
source_type = source.get('videoType') or determine_ext(source_url)
if source_type == 'm3u8':
current_formats.extend(self._extract_m3u8_formats(
source_url, display_id, 'mp4', m3u8_id='{}-{}'.format(experience_id, 'hls'), fatal=False,
note=f'Downloading {format_name} m3u8 information'))
else:
current_formats.append({
'format_id': f'{experience_id}-{source_type}',
'url': source_url,
})
for f in current_formats:
# TODO: Convert language to code
f.update({
'language': lang,
'format_note': version,
'source_preference': source_preference(version.lower()),
'language_preference': language_preference(lang.lower()),
})
formats.extend(current_formats)
if not formats and (requested_languages or requested_versions):
self.raise_no_formats(
'There are no video formats matching the requested languages/versions', expected=True, video_id=display_id)
self._remove_duplicate_formats(formats)
return {
'id': episode_id,
'_old_archive_ids': [make_archive_id(self, initial_experience_id)],
'display_id': display_id,
'duration': duration,
'title': episode['episodeTitle'],
'description': episode.get('episodeSummary'),
'episode': episode.get('episodeTitle'),
'episode_number': int_or_none(episode.get('episodeId')),
'episode_id': episode_id,
'season': season.get('seasonTitle'),
'season_number': int_or_none(season.get('seasonId')),
'season_id': str_or_none(season.get('seasonPk')),
'series': show.get('showTitle'),
'formats': formats,
'thumbnails': thumbnails,
'subtitles': subtitles,
'_format_sort_fields': ('lang', 'source'),
}
def _get_subtitles(self, subtitles, experience_id, episode, display_id, format_name):
if isinstance(episode, str):
webpage = self._download_webpage(
f'https://www.funimation.com/player/{experience_id}/', display_id,
fatal=False, note=f'Downloading player webpage for {format_name}')
episode, _, _ = self._get_episode(webpage, episode_id=episode, fatal=False)
for _, version, f in self._get_experiences(episode):
for source in f.get('sources'):
for text_track in source.get('textTracks'):
if not text_track.get('src'):
continue
sub_type = text_track.get('type').upper()
sub_type = sub_type if sub_type != 'FULL' else None
current_sub = {
'url': text_track['src'],
'name': join_nonempty(version, text_track.get('label'), sub_type, delim=' '),
}
lang = join_nonempty(text_track.get('language', 'und'),
version if version != 'Simulcast' else None,
sub_type, delim='_')
if current_sub not in subtitles.get(lang, []):
subtitles.setdefault(lang, []).append(current_sub)
return subtitles
class FunimationShowIE(FunimationBaseIE):
IE_NAME = 'funimation:show'
_VALID_URL = r'(?P<url>https?://(?:www\.)?funimation(?:\.com|now\.uk)/(?P<locale>[^/]+)?/?shows/(?P<id>[^/?#&]+))/?(?:[?#]|$)'
_TESTS = [{
'url': 'https://www.funimation.com/en/shows/sk8-the-infinity',
'info_dict': {
'id': '1315000',
'title': 'SK8 the Infinity',
},
'playlist_count': 13,
'params': {
'skip_download': True,
},
}, {
# without lang code
'url': 'https://www.funimation.com/shows/ouran-high-school-host-club/',
'info_dict': {
'id': '39643',
'title': 'Ouran High School Host Club',
},
'playlist_count': 26,
'params': {
'skip_download': True,
},
}]
def _real_initialize(self):
if not self._REGION:
FunimationBaseIE._REGION = self._get_region()
def _real_extract(self, url):
base_url, locale, display_id = self._match_valid_url(url).groups()
show_info = self._download_json(
'https://title-api.prd.funimationsvc.com/v2/shows/{}?region={}&deviceType=web&locale={}'.format(
display_id, self._REGION, locale or 'en'), display_id)
items_info = self._download_json(
'https://prod-api-funimationnow.dadcdigital.com/api/funimation/episodes/?limit=99999&title_id={}'.format(
show_info.get('id')), display_id)
vod_items = traverse_obj(items_info, ('items', ..., lambda k, _: re.match(r'(?i)mostRecent[AS]vod', k), 'item'))
return {
'_type': 'playlist',
'id': str_or_none(show_info['id']),
'title': show_info['name'],
'entries': orderedSet(
self.url_result(
'{}/{}'.format(base_url, vod_item.get('episodeSlug')), FunimationPageIE.ie_key(),
vod_item.get('episodeId'), vod_item.get('episodeName'))
for vod_item in sorted(vod_items, key=lambda x: x.get('episodeOrder', -1))),
}

View File

@@ -0,0 +1,141 @@
import json
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
clean_html,
int_or_none,
join_nonempty,
parse_iso8601,
str_or_none,
url_or_none,
)
from ..utils.traversal import traverse_obj
class GameDevTVDashboardIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?gamedev\.tv/dashboard/courses/(?P<course_id>\d+)(?:/(?P<lecture_id>\d+))?'
_NETRC_MACHINE = 'gamedevtv'
_TESTS = [{
'url': 'https://www.gamedev.tv/dashboard/courses/25',
'info_dict': {
'id': '25',
'title': 'Complete Blender Creator 3: Learn 3D Modelling for Beginners',
'tags': ['blender', 'course', 'all', 'box modelling', 'sculpting'],
'categories': ['Blender', '3D Art'],
'thumbnail': 'https://gamedev-files.b-cdn.net/courses/qisc9pmu1jdc.jpg',
'upload_date': '20220516',
'timestamp': 1652694420,
'modified_date': '20241027',
'modified_timestamp': 1730049658,
},
'playlist_count': 100,
}, {
'url': 'https://www.gamedev.tv/dashboard/courses/63/2279',
'info_dict': {
'id': 'df04f4d8-68a4-4756-a71b-9ca9446c3a01',
'ext': 'mp4',
'modified_timestamp': 1701695752,
'upload_date': '20230504',
'episode': 'MagicaVoxel Community Course Introduction',
'series_id': '63',
'title': 'MagicaVoxel Community Course Introduction',
'timestamp': 1683195397,
'modified_date': '20231204',
'categories': ['3D Art', 'MagicaVoxel'],
'season': 'MagicaVoxel Community Course',
'tags': ['MagicaVoxel', 'all', 'course'],
'series': 'MagicaVoxel 3D Art Mini Course',
'duration': 1405,
'episode_number': 1,
'season_number': 1,
'season_id': '219',
'description': 'md5:a378738c5bbec1c785d76c067652d650',
'display_id': '63-219-2279',
'alt_title': '1_CC_MVX MagicaVoxel Community Course Introduction.mp4',
'thumbnail': 'https://vz-23691c65-6fa.b-cdn.net/df04f4d8-68a4-4756-a71b-9ca9446c3a01/thumbnail.jpg',
},
}]
_API_HEADERS = {}
def _perform_login(self, username, password):
try:
response = self._download_json(
'https://api.gamedev.tv/api/students/login', None, 'Logging in',
headers={'Content-Type': 'application/json'},
data=json.dumps({
'email': username,
'password': password,
'cart_items': [],
}).encode())
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 401:
raise ExtractorError('Invalid username/password', expected=True)
raise
self._API_HEADERS['Authorization'] = f'{response["token_type"]} {response["access_token"]}'
def _real_initialize(self):
if not self._API_HEADERS.get('Authorization'):
self.raise_login_required(
'This content is only available with purchase', method='password')
def _entries(self, data, course_id, course_info, selected_lecture):
for section in traverse_obj(data, ('sections', ..., {dict})):
section_info = traverse_obj(section, {
'season_id': ('id', {str_or_none}),
'season': ('title', {str}),
'season_number': ('order', {int_or_none}),
})
for lecture in traverse_obj(section, ('lectures', lambda _, v: url_or_none(v['video']['playListUrl']))):
if selected_lecture and str(lecture.get('id')) != selected_lecture:
continue
display_id = join_nonempty(course_id, section_info.get('season_id'), lecture.get('id'))
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
lecture['video']['playListUrl'], display_id, 'mp4', m3u8_id='hls')
yield {
**course_info,
**section_info,
'id': display_id, # fallback
'display_id': display_id,
'formats': formats,
'subtitles': subtitles,
'series': course_info.get('title'),
'series_id': course_id,
**traverse_obj(lecture, {
'id': ('video', 'guid', {str}),
'title': ('title', {str}),
'alt_title': ('video', 'title', {str}),
'description': ('description', {clean_html}),
'episode': ('title', {str}),
'episode_number': ('order', {int_or_none}),
'duration': ('video', 'duration_in_sec', {int_or_none}),
'timestamp': ('video', 'created_at', {parse_iso8601}),
'modified_timestamp': ('video', 'updated_at', {parse_iso8601}),
'thumbnail': ('video', 'thumbnailUrl', {url_or_none}),
}),
}
def _real_extract(self, url):
course_id, lecture_id = self._match_valid_url(url).group('course_id', 'lecture_id')
data = self._download_json(
f'https://api.gamedev.tv/api/courses/my/{course_id}', course_id,
headers=self._API_HEADERS)['data']
course_info = traverse_obj(data, {
'title': ('title', {str}),
'tags': ('tags', ..., 'name', {str}),
'categories': ('categories', ..., 'title', {str}),
'timestamp': ('created_at', {parse_iso8601}),
'modified_timestamp': ('updated_at', {parse_iso8601}),
'thumbnail': ('image', {url_or_none}),
})
entries = self._entries(data, course_id, course_info, lecture_id)
if lecture_id:
lecture = next(entries, None)
if not lecture:
raise ExtractorError('Lecture not found')
return lecture
return self.playlist_result(entries, course_id, **course_info)

View File

@@ -1,40 +1,48 @@
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
clean_html,
int_or_none, int_or_none,
str_or_none, str_or_none,
traverse_obj, traverse_obj,
url_or_none,
) )
class GoodGameIE(InfoExtractor): class GoodGameIE(InfoExtractor):
IE_NAME = 'goodgame:stream' IE_NAME = 'goodgame:stream'
_VALID_URL = r'https?://goodgame\.ru/channel/(?P<id>\w+)' _VALID_URL = r'https?://goodgame\.ru/(?!channel/)(?P<id>[\w.*-]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://goodgame.ru/channel/Pomi/#autoplay', 'url': 'https://goodgame.ru/TGW#autoplay',
'info_dict': { 'info_dict': {
'id': 'pomi', 'id': '7998',
'ext': 'mp4', 'ext': 'mp4',
'title': r're:Reynor vs Special \(1/2,bo3\) Wardi Spring EU \- playoff \(финальный день\) \d{4}-\d{2}-\d{2} \d{2}:\d{2}$', 'channel_id': '7998',
'channel_id': '1644', 'title': r're:шоуматч Happy \(NE\) vs Fortitude \(UD\), потом ладдер и дс \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'channel': 'Pomi', 'channel_url': 'https://goodgame.ru/TGW',
'channel_url': 'https://goodgame.ru/channel/Pomi/', 'thumbnail': 'https://hls.goodgame.ru/previews/7998_240.jpg',
'description': 'md5:4a87b775ee7b2b57bdccebe285bbe171', 'uploader': 'TGW',
'thumbnail': r're:^https?://.*\.jpg$', 'channel': 'JosephStalin',
'live_status': 'is_live', 'live_status': 'is_live',
'view_count': int, 'age_limit': 18,
'channel_follower_count': int,
'uploader_id': '2899',
'concurrent_view_count': int,
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
'skip': 'May not be online', }, {
'url': 'https://goodgame.ru/Mr.Gray',
'only_matching': True,
}, {
'url': 'https://goodgame.ru/HeDoPa3yMeHue*',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
channel_name = self._match_id(url) channel_name = self._match_id(url)
response = self._download_json(f'https://api2.goodgame.ru/v2/streams/{channel_name}', channel_name) response = self._download_json(f'https://goodgame.ru/api/4/users/{channel_name}/stream', channel_name)
player_id = response['channel']['gg_player_src'] player_id = response['streamkey']
formats, subtitles = [], {} formats, subtitles = [], {}
if response.get('status') == 'Live': if response.get('status'):
formats, subtitles = self._extract_m3u8_formats_and_subtitles( formats, subtitles = self._extract_m3u8_formats_and_subtitles(
f'https://hls.goodgame.ru/manifest/{player_id}_master.m3u8', f'https://hls.goodgame.ru/manifest/{player_id}_master.m3u8',
channel_name, 'mp4', live=True) channel_name, 'mp4', live=True)
@@ -45,13 +53,17 @@ class GoodGameIE(InfoExtractor):
'id': player_id, 'id': player_id,
'formats': formats, 'formats': formats,
'subtitles': subtitles, 'subtitles': subtitles,
'title': traverse_obj(response, ('channel', 'title')),
'channel': channel_name,
'channel_id': str_or_none(traverse_obj(response, ('channel', 'id'))),
'channel_url': response.get('url'),
'description': clean_html(traverse_obj(response, ('channel', 'description'))),
'thumbnail': traverse_obj(response, ('channel', 'thumb')),
'is_live': bool(formats), 'is_live': bool(formats),
'view_count': int_or_none(response.get('viewers')), **traverse_obj(response, {
'age_limit': 18 if traverse_obj(response, ('channel', 'adult')) else None, 'title': ('title', {str}),
'channel': ('channelkey', {str}),
'channel_id': ('id', {str_or_none}),
'channel_url': ('link', {url_or_none}),
'uploader': ('streamer', 'username', {str}),
'uploader_id': ('streamer', 'id', {str_or_none}),
'thumbnail': ('preview', {url_or_none}, {self._proto_relative_url}),
'concurrent_view_count': ('viewers', {int_or_none}),
'channel_follower_count': ('followers', {int_or_none}),
'age_limit': ('adult', {bool}, {lambda x: 18 if x else None}),
}),
} }

View File

@@ -5,56 +5,63 @@ import hashlib
import hmac import hmac
import json import json
import os import os
import re
import urllib.parse
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none,
js_to_json,
remove_end,
traverse_obj, traverse_obj,
unescapeHTML,
) )
class GoPlayIE(InfoExtractor): class GoPlayIE(InfoExtractor):
_VALID_URL = r'https?://(www\.)?goplay\.be/video/([^/]+/[^/]+/|)(?P<display_id>[^/#]+)' _VALID_URL = r'https?://(www\.)?goplay\.be/video/([^/?#]+/[^/?#]+/|)(?P<id>[^/#]+)'
_NETRC_MACHINE = 'goplay' _NETRC_MACHINE = 'goplay'
_TESTS = [{ _TESTS = [{
'url': 'https://www.goplay.be/video/de-container-cup/de-container-cup-s3/de-container-cup-s3-aflevering-2#autoplay', 'url': 'https://www.goplay.be/video/de-slimste-mens-ter-wereld/de-slimste-mens-ter-wereld-s22/de-slimste-mens-ter-wereld-s22-aflevering-1',
'info_dict': { 'info_dict': {
'id': '9c4214b8-e55d-4e4b-a446-f015f6c6f811', 'id': '2baa4560-87a0-421b-bffc-359914e3c387',
'ext': 'mp4', 'ext': 'mp4',
'title': 'S3 - Aflevering 2', 'title': 'S22 - Aflevering 1',
'series': 'De Container Cup', 'description': r're:In aflevering 1 nemen Daan Alferink, Tess Elst en Xander De Rycke .{66}',
'season': 'Season 3', 'series': 'De Slimste Mens ter Wereld',
'season_number': 3, 'episode': 'Episode 1',
'episode': 'Episode 2', 'season_number': 22,
'episode_number': 2, 'episode_number': 1,
'season': 'Season 22',
}, },
'params': {'skip_download': True},
'skip': 'This video is only available for registered users', 'skip': 'This video is only available for registered users',
}, { }, {
'url': 'https://www.goplay.be/video/a-family-for-thr-holidays-s1-aflevering-1#autoplay', 'url': 'https://www.goplay.be/video/1917',
'info_dict': { 'info_dict': {
'id': '74e3ed07-748c-49e4-85a0-393a93337dbf', 'id': '40cac41d-8d29-4ef5-aa11-75047b9f0907',
'ext': 'mp4', 'ext': 'mp4',
'title': 'A Family for the Holidays', 'title': '1917',
'description': r're:Op het hoogtepunt van de Eerste Wereldoorlog krijgen twee jonge .{94}',
}, },
'params': {'skip_download': True},
'skip': 'This video is only available for registered users', 'skip': 'This video is only available for registered users',
}, { }, {
'url': 'https://www.goplay.be/video/de-mol/de-mol-s11/de-mol-s11-aflevering-1#autoplay', 'url': 'https://www.goplay.be/video/de-mol/de-mol-s11/de-mol-s11-aflevering-1#autoplay',
'info_dict': { 'info_dict': {
'id': '03eb8f2f-153e-41cb-9805-0d3a29dab656', 'id': 'ecb79672-92b9-4cd9-a0d7-e2f0250681ee',
'ext': 'mp4', 'ext': 'mp4',
'title': 'S11 - Aflevering 1', 'title': 'S11 - Aflevering 1',
'description': r're:Tien kandidaten beginnen aan hun verovering van Amerika en ontmoeten .{102}',
'episode': 'Episode 1', 'episode': 'Episode 1',
'series': 'De Mol', 'series': 'De Mol',
'season_number': 11, 'season_number': 11,
'episode_number': 1, 'episode_number': 1,
'season': 'Season 11', 'season': 'Season 11',
}, },
'params': { 'params': {'skip_download': True},
'skip_download': True,
},
'skip': 'This video is only available for registered users', 'skip': 'This video is only available for registered users',
}] }]
@@ -69,27 +76,42 @@ class GoPlayIE(InfoExtractor):
if not self._id_token: if not self._id_token:
raise self.raise_login_required(method='password') raise self.raise_login_required(method='password')
def _real_extract(self, url): def _find_json(self, s):
url, display_id = self._match_valid_url(url).group(0, 'display_id') return self._search_json(
webpage = self._download_webpage(url, display_id) r'\w+\s*:\s*', s, 'next js data', None, contains_pattern=r'\[(?s:.+)\]', default=None)
video_data_json = self._html_search_regex(r'<div\s+data-hero="([^"]+)"', webpage, 'video_data')
video_data = self._parse_json(unescapeHTML(video_data_json), display_id).get('data')
movie = video_data.get('movie') def _real_extract(self, url):
if movie: display_id = self._match_id(url)
video_id = movie['videoUuid'] webpage = self._download_webpage(url, display_id)
info_dict = {
'title': movie.get('title'), nextjs_data = traverse_obj(
} re.findall(r'<script[^>]*>\s*self\.__next_f\.push\(\s*(\[.+?\])\s*\);?\s*</script>', webpage),
else: (..., {js_to_json}, {json.loads}, ..., {self._find_json}, ...))
episode = traverse_obj(video_data, ('playlists', ..., 'episodes', lambda _, v: v['pageInfo']['url'] == url), get_all=False) meta = traverse_obj(nextjs_data, (
video_id = episode['videoUuid'] ..., lambda _, v: v['meta']['path'] == urllib.parse.urlparse(url).path, 'meta', any))
info_dict = {
'title': episode.get('episodeTitle'), video_id = meta['uuid']
'series': traverse_obj(episode, ('program', 'title')), info_dict = traverse_obj(meta, {
'season_number': episode.get('seasonNumber'), 'title': ('title', {str}),
'episode_number': episode.get('episodeNumber'), 'description': ('description', {str.strip}),
} })
if traverse_obj(meta, ('program', 'subtype')) != 'movie':
for season_data in traverse_obj(nextjs_data, (..., 'children', ..., 'playlists', ...)):
episode_data = traverse_obj(
season_data, ('videos', lambda _, v: v['videoId'] == video_id, any))
if not episode_data:
continue
episode_title = traverse_obj(
episode_data, 'contextualTitle', 'episodeTitle', expected_type=str)
info_dict.update({
'title': episode_title or info_dict.get('title'),
'series': remove_end(info_dict.get('title'), f' - {episode_title}'),
'season_number': traverse_obj(season_data, ('season', {int_or_none})),
'episode_number': traverse_obj(episode_data, ('episodeNumber', {int_or_none})),
})
break
api = self._download_json( api = self._download_json(
f'https://api.goplay.be/web/v1/videos/long-form/{video_id}', f'https://api.goplay.be/web/v1/videos/long-form/{video_id}',

View File

@@ -254,7 +254,7 @@ class InstagramIOSIE(InfoExtractor):
class InstagramIE(InstagramBaseIE): class InstagramIE(InstagramBaseIE):
_VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com(?:/[^/]+)?/(?:p|tv|reels?(?!/audio/))/(?P<id>[^/?#&]+))' _VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com(?:/(?!share/)[^/?#]+)?/(?:p|tv|reels?(?!/audio/))/(?P<id>[^/?#&]+))'
_EMBED_REGEX = [r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//(?:www\.)?instagram\.com/p/[^/]+/embed.*?)\1'] _EMBED_REGEX = [r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//(?:www\.)?instagram\.com/p/[^/]+/embed.*?)\1']
_TESTS = [{ _TESTS = [{
'url': 'https://instagram.com/p/aye83DjauH/?foo=bar#abc', 'url': 'https://instagram.com/p/aye83DjauH/?foo=bar#abc',

160
yt_dlp/extractor/kenh14.py Normal file
View File

@@ -0,0 +1,160 @@
from .common import InfoExtractor
from ..utils import (
clean_html,
extract_attributes,
get_element_by_class,
get_element_html_by_attribute,
get_elements_html_by_class,
int_or_none,
parse_duration,
parse_iso8601,
remove_start,
strip_or_none,
unescapeHTML,
update_url,
url_or_none,
)
from ..utils.traversal import traverse_obj
class Kenh14VideoIE(InfoExtractor):
_VALID_URL = r'https?://video\.kenh14\.vn/(?:video/)?[\w-]+-(?P<id>[0-9]+)\.chn'
_TESTS = [{
'url': 'https://video.kenh14.vn/video/mo-hop-iphone-14-pro-max-nguon-unbox-therapy-316173.chn',
'md5': '1ed67f9c3a1e74acf15db69590cf6210',
'info_dict': {
'id': '316173',
'ext': 'mp4',
'title': 'Video mở hộp iPhone 14 Pro Max (Nguồn: Unbox Therapy)',
'description': 'Video mở hộp iPhone 14 Pro MaxVideo mở hộp iPhone 14 Pro Max (Nguồn: Unbox Therapy)',
'thumbnail': r're:^https?://videothumbs\.mediacdn\.vn/.*\.jpg$',
'tags': [],
'uploader': 'Unbox Therapy',
'upload_date': '20220517',
'view_count': int,
'duration': 722.86,
'timestamp': 1652764468,
},
}, {
'url': 'https://video.kenh14.vn/video-316174.chn',
'md5': '2b41877d2afaf4a3f487ceda8e5c7cbd',
'info_dict': {
'id': '316174',
'ext': 'mp4',
'title': 'Khoảnh khắc VĐV nằm gục khóc sau chiến thắng: 7 năm trời Việt Nam mới có HCV kiếm chém nữ, chỉ có 8 tháng để khổ luyện trước khi lên sàn đấu',
'description': 'md5:de86aa22e143e2b277bce8ec9c6f17dc',
'thumbnail': r're:^https?://videothumbs\.mediacdn\.vn/.*\.jpg$',
'tags': [],
'upload_date': '20220517',
'view_count': int,
'duration': 70.04,
'timestamp': 1652766021,
},
}, {
'url': 'https://video.kenh14.vn/0-344740.chn',
'md5': 'b843495d5e728142c8870c09b46df2a9',
'info_dict': {
'id': '344740',
'ext': 'mov',
'title': 'Kỳ Duyên đầy căng thẳng trong buổi ra quân đi Miss Universe, nghi thức tuyên thuệ lần đầu xuất hiện gây nhiều tranh cãi',
'description': 'md5:2a2dbb4a7397169fb21ee68f09160497',
'thumbnail': r're:^https?://kenh14cdn\.com/.*\.jpg$',
'tags': ['kỳ duyên', 'Kỳ Duyên tuyên thuệ', 'miss universe'],
'uploader': 'Quang Vũ',
'upload_date': '20241024',
'view_count': int,
'duration': 198.88,
'timestamp': 1729741590,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
attrs = extract_attributes(get_element_html_by_attribute('type', 'VideoStream', webpage) or '')
direct_url = attrs['data-vid']
metadata = self._download_json(
'https://api.kinghub.vn/video/api/v1/detailVideoByGet?FileName={}'.format(
remove_start(direct_url, 'kenh14cdn.com/')), video_id, fatal=False)
formats = [{'url': f'https://{direct_url}', 'format_id': 'http', 'quality': 1}]
subtitles = {}
video_data = self._download_json(
f'https://{direct_url}.json', video_id, note='Downloading video data', fatal=False)
if hls_url := traverse_obj(video_data, ('hls', {url_or_none})):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
hls_url, video_id, m3u8_id='hls', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
if dash_url := traverse_obj(video_data, ('mpd', {url_or_none})):
fmts, subs = self._extract_mpd_formats_and_subtitles(
dash_url, video_id, mpd_id='dash', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
return {
**traverse_obj(metadata, {
'duration': ('duration', {parse_duration}),
'uploader': ('author', {strip_or_none}),
'timestamp': ('uploadtime', {parse_iso8601(delimiter=' ')}),
'view_count': ('views', {int_or_none}),
}),
'id': video_id,
'title': (
traverse_obj(metadata, ('title', {strip_or_none}))
or clean_html(self._og_search_title(webpage))
or clean_html(get_element_by_class('vdbw-title', webpage))),
'formats': formats,
'subtitles': subtitles,
'description': (
clean_html(self._og_search_description(webpage))
or clean_html(get_element_by_class('vdbw-sapo', webpage))),
'thumbnail': (self._og_search_thumbnail(webpage) or attrs.get('data-thumb')),
'tags': traverse_obj(self._html_search_meta('keywords', webpage), (
{lambda x: x.split(';')}, ..., filter)),
}
class Kenh14PlaylistIE(InfoExtractor):
_VALID_URL = r'https?://video\.kenh14\.vn/playlist/[\w-]+-(?P<id>[0-9]+)\.chn'
_TESTS = [{
'url': 'https://video.kenh14.vn/playlist/tran-tinh-naked-love-mua-2-71.chn',
'info_dict': {
'id': '71',
'title': 'Trần Tình (Naked love) mùa 2',
'description': 'md5:e9522339304956dea931722dd72eddb2',
'thumbnail': r're:^https?://kenh14cdn\.com/.*\.png$',
},
'playlist_count': 9,
}, {
'url': 'https://video.kenh14.vn/playlist/0-72.chn',
'info_dict': {
'id': '72',
'title': 'Lau Lại Đầu Từ',
'description': 'Cùng xem xưa và nay có gì khác biệt nhé!',
'thumbnail': r're:^https?://kenh14cdn\.com/.*\.png$',
},
'playlist_count': 6,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
category_detail = get_element_by_class('category-detail', webpage) or ''
embed_info = traverse_obj(
self._yield_json_ld(webpage, playlist_id),
(lambda _, v: v['name'] and v['alternateName'], any)) or {}
return self.playlist_from_matches(
get_elements_html_by_class('video-item', webpage), playlist_id,
(clean_html(get_element_by_class('name', category_detail)) or unescapeHTML(embed_info.get('name'))),
getter=lambda x: 'https://video.kenh14.vn/video/video-{}.chn'.format(extract_attributes(x)['data-id']),
ie=Kenh14VideoIE, playlist_description=(
clean_html(get_element_by_class('description', category_detail))
or unescapeHTML(embed_info.get('alternateName'))),
thumbnail=traverse_obj(
self._og_search_thumbnail(webpage),
({url_or_none}, {update_url(query=None)})))

View File

@@ -39,7 +39,7 @@ class LaracastsBaseIE(InfoExtractor):
'description': ('body', {clean_html}), 'description': ('body', {clean_html}),
'thumbnail': ('largeThumbnail', {url_or_none}), 'thumbnail': ('largeThumbnail', {url_or_none}),
'duration': ('length', {int_or_none}), 'duration': ('length', {int_or_none}),
'date': ('dateSegments', 'published', {unified_strdate}), 'upload_date': ('dateSegments', 'published', {unified_strdate}),
})) }))
@@ -54,7 +54,7 @@ class LaracastsIE(LaracastsBaseIE):
'title': 'Hello, Laravel', 'title': 'Hello, Laravel',
'ext': 'mp4', 'ext': 'mp4',
'duration': 519, 'duration': 519,
'date': '20240312', 'upload_date': '20240312',
'thumbnail': 'https://laracasts.s3.amazonaws.com/videos/thumbnails/youtube/30-days-to-learn-laravel-11-1.png', 'thumbnail': 'https://laracasts.s3.amazonaws.com/videos/thumbnails/youtube/30-days-to-learn-laravel-11-1.png',
'description': 'md5:ddd658bb241975871d236555657e1dd1', 'description': 'md5:ddd658bb241975871d236555657e1dd1',
'season_number': 1, 'season_number': 1,

View File

@@ -310,7 +310,13 @@ class LBRYIE(LBRYBaseIE):
if stream_type in self._SUPPORTED_STREAM_TYPES: if stream_type in self._SUPPORTED_STREAM_TYPES:
claim_id, is_live = result['claim_id'], False claim_id, is_live = result['claim_id'], False
streaming_url = self._call_api_proxy( streaming_url = self._call_api_proxy(
'get', claim_id, {'uri': uri}, 'streaming url')['streaming_url'] 'get', claim_id, {
'uri': uri,
**traverse_obj(parse_qs(url), {
'signature': ('signature', 0),
'signature_ts': ('signature_ts', 0),
}),
}, 'streaming url')['streaming_url']
# GET request to v3 API returns original video/audio file if available # GET request to v3 API returns original video/audio file if available
direct_url = re.sub(r'/api/v\d+/', '/api/v3/', streaming_url) direct_url = re.sub(r'/api/v\d+/', '/api/v3/', streaming_url)

View File

@@ -1,30 +1,32 @@
import json import json
import uuid
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none, int_or_none,
join_nonempty,
smuggle_url, smuggle_url,
traverse_obj, traverse_obj,
try_call, try_call,
unsmuggle_url, unsmuggle_url,
urljoin,
) )
class LiTVIE(InfoExtractor): class LiTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?litv\.tv/(?:vod|promo)/[^/]+/(?:content\.do)?\?.*?\b(?:content_)?id=(?P<id>[^&]+)' _VALID_URL = r'https?://(?:www\.)?litv\.tv/(?:[^/?#]+/watch/|vod/[^/?#]+/content\.do\?content_id=)(?P<id>[\w-]+)'
_URL_TEMPLATE = 'https://www.litv.tv/%s/watch/%s'
_URL_TEMPLATE = 'https://www.litv.tv/vod/%s/content.do?content_id=%s' _GEO_COUNTRIES = ['TW']
_TESTS = [{ _TESTS = [{
'url': 'https://www.litv.tv/vod/drama/content.do?brc_id=root&id=VOD00041610&isUHEnabled=true&autoPlay=1', 'url': 'https://www.litv.tv/drama/watch/VOD00041610',
'info_dict': { 'info_dict': {
'id': 'VOD00041606', 'id': 'VOD00041606',
'title': '花千骨', 'title': '花千骨',
}, },
'playlist_count': 51, # 50 episodes + 1 trailer 'playlist_count': 51, # 50 episodes + 1 trailer
}, { }, {
'url': 'https://www.litv.tv/vod/drama/content.do?brc_id=root&id=VOD00041610&isUHEnabled=true&autoPlay=1', 'url': 'https://www.litv.tv/drama/watch/VOD00041610',
'md5': 'b90ff1e9f1d8f5cfcd0a44c3e2b34c7a', 'md5': 'b90ff1e9f1d8f5cfcd0a44c3e2b34c7a',
'info_dict': { 'info_dict': {
'id': 'VOD00041610', 'id': 'VOD00041610',
@@ -32,16 +34,15 @@ class LiTVIE(InfoExtractor):
'title': '花千骨第1集', 'title': '花千骨第1集',
'thumbnail': r're:https?://.*\.jpg$', 'thumbnail': r're:https?://.*\.jpg$',
'description': '《花千骨》陸劇線上看。十六年前,平靜的村莊內,一名女嬰隨異相出生,途徑此地的蜀山掌門清虛道長算出此女命運非同一般,她體內散發的異香易招惹妖魔。一念慈悲下,他在村莊周邊設下結界阻擋妖魔入侵,讓其年滿十六後去蜀山,並賜名花千骨。', 'description': '《花千骨》陸劇線上看。十六年前,平靜的村莊內,一名女嬰隨異相出生,途徑此地的蜀山掌門清虛道長算出此女命運非同一般,她體內散發的異香易招惹妖魔。一念慈悲下,他在村莊周邊設下結界阻擋妖魔入侵,讓其年滿十六後去蜀山,並賜名花千骨。',
'categories': ['奇幻', '愛情', '中國', '仙俠'], 'categories': ['奇幻', '愛情', '仙俠', '古裝'],
'episode': 'Episode 1', 'episode': 'Episode 1',
'episode_number': 1, 'episode_number': 1,
}, },
'params': { 'params': {
'noplaylist': True, 'noplaylist': True,
}, },
'skip': 'Georestricted to Taiwan',
}, { }, {
'url': 'https://www.litv.tv/promo/miyuezhuan/?content_id=VOD00044841&', 'url': 'https://www.litv.tv/drama/watch/VOD00044841',
'md5': '88322ea132f848d6e3e18b32a832b918', 'md5': '88322ea132f848d6e3e18b32a832b918',
'info_dict': { 'info_dict': {
'id': 'VOD00044841', 'id': 'VOD00044841',
@@ -55,94 +56,62 @@ class LiTVIE(InfoExtractor):
def _extract_playlist(self, playlist_data, content_type): def _extract_playlist(self, playlist_data, content_type):
all_episodes = [ all_episodes = [
self.url_result(smuggle_url( self.url_result(smuggle_url(
self._URL_TEMPLATE % (content_type, episode['contentId']), self._URL_TEMPLATE % (content_type, episode['content_id']),
{'force_noplaylist': True})) # To prevent infinite recursion {'force_noplaylist': True})) # To prevent infinite recursion
for episode in traverse_obj(playlist_data, ('seasons', ..., 'episode', lambda _, v: v['contentId']))] for episode in traverse_obj(playlist_data, ('seasons', ..., 'episodes', lambda _, v: v['content_id']))]
return self.playlist_result(all_episodes, playlist_data['contentId'], playlist_data.get('title')) return self.playlist_result(all_episodes, playlist_data['content_id'], playlist_data.get('title'))
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {}) url, smuggled_data = unsmuggle_url(url, {})
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
vod_data = self._search_nextjs_data(webpage, video_id)['props']['pageProps']
if self._search_regex( program_info = traverse_obj(vod_data, ('programInformation', {dict})) or {}
r'(?i)<meta\s[^>]*http-equiv="refresh"\s[^>]*content="[0-9]+;\s*url=https://www\.litv\.tv/"', playlist_data = traverse_obj(vod_data, ('seriesTree'))
webpage, 'meta refresh redirect', default=False, group=0): if playlist_data and self._yes_playlist(program_info.get('series_id'), video_id, smuggled_data):
raise ExtractorError('No such content found', expected=True) return self._extract_playlist(playlist_data, program_info.get('content_type'))
program_info = self._parse_json(self._search_regex( asset_id = traverse_obj(program_info, ('assets', 0, 'asset_id', {str}))
r'var\s+programInfo\s*=\s*([^;]+)', webpage, 'VOD data', default='{}'), if asset_id: # This is a VOD
video_id) media_type = 'vod'
else: # This is a live stream
asset_id = program_info['content_id']
media_type = program_info['content_type']
puid = try_call(lambda: self._get_cookies('https://www.litv.tv/')['PUID'].value)
if puid:
endpoint = 'get-urls'
else:
puid = str(uuid.uuid4())
endpoint = 'get-urls-no-auth'
video_data = self._download_json(
f'https://www.litv.tv/api/{endpoint}', video_id,
data=json.dumps({'AssetId': asset_id, 'MediaType': media_type, 'puid': puid}).encode(),
headers={'Content-Type': 'application/json'})
# In browsers `getProgramInfo` request is always issued. Usually this if error := traverse_obj(video_data, ('error', {dict})):
# endpoint gives the same result as the data embedded in the webpage. error_msg = traverse_obj(error, ('message', {str}))
# If, for some reason, there are no embedded data, we do an extra request. if error_msg and 'OutsideRegionError' in error_msg:
if 'assetId' not in program_info:
program_info = self._download_json(
'https://www.litv.tv/vod/ajax/getProgramInfo', video_id,
query={'contentId': video_id},
headers={'Accept': 'application/json'})
series_id = program_info['seriesId']
if self._yes_playlist(series_id, video_id, smuggled_data):
playlist_data = self._download_json(
'https://www.litv.tv/vod/ajax/getSeriesTree', video_id,
query={'seriesId': series_id}, headers={'Accept': 'application/json'})
return self._extract_playlist(playlist_data, program_info['contentType'])
video_data = self._parse_json(self._search_regex(
r'uiHlsUrl\s*=\s*testBackendData\(([^;]+)\);',
webpage, 'video data', default='{}'), video_id)
if not video_data:
payload = {'assetId': program_info['assetId']}
puid = try_call(lambda: self._get_cookies('https://www.litv.tv/')['PUID'].value)
if puid:
payload.update({
'type': 'auth',
'puid': puid,
})
endpoint = 'getUrl'
else:
payload.update({
'watchDevices': program_info['watchDevices'],
'contentType': program_info['contentType'],
})
endpoint = 'getMainUrlNoAuth'
video_data = self._download_json(
f'https://www.litv.tv/vod/ajax/{endpoint}', video_id,
data=json.dumps(payload).encode(),
headers={'Content-Type': 'application/json'})
if not video_data.get('fullpath'):
error_msg = video_data.get('errorMessage')
if error_msg == 'vod.error.outsideregionerror':
self.raise_geo_restricted('This video is available in Taiwan only') self.raise_geo_restricted('This video is available in Taiwan only')
if error_msg: elif error_msg:
raise ExtractorError(f'{self.IE_NAME} said: {error_msg}', expected=True) raise ExtractorError(f'{self.IE_NAME} said: {error_msg}', expected=True)
raise ExtractorError(f'Unexpected result from {self.IE_NAME}') raise ExtractorError(f'Unexpected error from {self.IE_NAME}')
formats = self._extract_m3u8_formats( formats = self._extract_m3u8_formats(
video_data['fullpath'], video_id, ext='mp4', video_data['result']['AssetURLs'][0], video_id, ext='mp4', m3u8_id='hls')
entry_protocol='m3u8_native', m3u8_id='hls')
for a_format in formats: for a_format in formats:
# LiTV HLS segments doesn't like compressions # LiTV HLS segments doesn't like compressions
a_format.setdefault('http_headers', {})['Accept-Encoding'] = 'identity' a_format.setdefault('http_headers', {})['Accept-Encoding'] = 'identity'
title = program_info['title'] + program_info.get('secondaryMark', '')
description = program_info.get('description')
thumbnail = program_info.get('imageFile')
categories = [item['name'] for item in program_info.get('category', [])]
episode = int_or_none(program_info.get('episode'))
return { return {
'id': video_id, 'id': video_id,
'formats': formats, 'formats': formats,
'title': title, 'title': join_nonempty('title', 'secondary_mark', delim='', from_dict=program_info),
'description': description, **traverse_obj(program_info, {
'thumbnail': thumbnail, 'description': ('description', {str}),
'categories': categories, 'thumbnail': ('picture', {urljoin('https://p-cdnstatic.svc.litv.tv/')}),
'episode_number': episode, 'categories': ('genres', ..., 'name', {str}),
'episode_number': ('episode', {int_or_none}),
}),
} }

View File

@@ -26,6 +26,7 @@ class MicrosoftEmbedIE(InfoExtractor):
'timestamp': 1631658316, 'timestamp': 1631658316,
'upload_date': '20210914', 'upload_date': '20210914',
}, },
'expected_warnings': ['Failed to parse XML: syntax error: line 1, column 0'],
}] }]
_API_URL = 'https://prod-video-cms-rt-microsoft-com.akamaized.net/vhs/api/videos/' _API_URL = 'https://prod-video-cms-rt-microsoft-com.akamaized.net/vhs/api/videos/'
@@ -36,11 +37,11 @@ class MicrosoftEmbedIE(InfoExtractor):
formats = [] formats = []
for source_type, source in metadata['streams'].items(): for source_type, source in metadata['streams'].items():
if source_type == 'smooth_Streaming': if source_type == 'smooth_Streaming':
formats.extend(self._extract_ism_formats(source['url'], video_id, 'mss')) formats.extend(self._extract_ism_formats(source['url'], video_id, 'mss', fatal=False))
elif source_type == 'apple_HTTP_Live_Streaming': elif source_type == 'apple_HTTP_Live_Streaming':
formats.extend(self._extract_m3u8_formats(source['url'], video_id, 'mp4')) formats.extend(self._extract_m3u8_formats(source['url'], video_id, 'mp4', fatal=False))
elif source_type == 'mPEG_DASH': elif source_type == 'mPEG_DASH':
formats.extend(self._extract_mpd_formats(source['url'], video_id)) formats.extend(self._extract_mpd_formats(source['url'], video_id, fatal=False))
else: else:
formats.append({ formats.append({
'format_id': source_type, 'format_id': source_type,

View File

@@ -1,291 +0,0 @@
import functools
import json
import uuid
from .common import InfoExtractor
from ..utils import (
ExtractorError,
OnDemandPagedList,
determine_ext,
dict_get,
float_or_none,
traverse_obj,
)
class MildomBaseIE(InfoExtractor):
_GUEST_ID = None
def _call_api(self, url, video_id, query=None, note='Downloading JSON metadata', body=None):
if not self._GUEST_ID:
self._GUEST_ID = f'pc-gp-{uuid.uuid4()}'
content = self._download_json(
url, video_id, note=note, data=json.dumps(body).encode() if body else None,
headers={'Content-Type': 'application/json'} if body else {},
query={
'__guest_id': self._GUEST_ID,
'__platform': 'web',
**(query or {}),
})
if content['code'] != 0:
raise ExtractorError(
f'Mildom says: {content["message"]} (code {content["code"]})',
expected=True)
return content['body']
class MildomIE(MildomBaseIE):
IE_NAME = 'mildom'
IE_DESC = 'Record ongoing live by specific user in Mildom'
_VALID_URL = r'https?://(?:(?:www|m)\.)mildom\.com/(?P<id>\d+)'
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(f'https://www.mildom.com/{video_id}', video_id)
enterstudio = self._call_api(
'https://cloudac.mildom.com/nonolive/gappserv/live/enterstudio', video_id,
note='Downloading live metadata', query={'user_id': video_id})
result_video_id = enterstudio.get('log_id', video_id)
servers = self._call_api(
'https://cloudac.mildom.com/nonolive/gappserv/live/liveserver', result_video_id,
note='Downloading live server list', query={
'user_id': video_id,
'live_server_type': 'hls',
})
playback_token = self._call_api(
'https://cloudac.mildom.com/nonolive/gappserv/live/token', result_video_id,
note='Obtaining live playback token', body={'host_id': video_id, 'type': 'hls'})
playback_token = traverse_obj(playback_token, ('data', ..., 'token'), get_all=False)
if not playback_token:
raise ExtractorError('Failed to obtain live playback token')
formats = self._extract_m3u8_formats(
f'{servers["stream_server"]}/{video_id}_master.m3u8?{playback_token}',
result_video_id, 'mp4', headers={
'Referer': 'https://www.mildom.com/',
'Origin': 'https://www.mildom.com',
})
for fmt in formats:
fmt.setdefault('http_headers', {})['Referer'] = 'https://www.mildom.com/'
return {
'id': result_video_id,
'title': self._html_search_meta('twitter:description', webpage, default=None) or traverse_obj(enterstudio, 'anchor_intro'),
'description': traverse_obj(enterstudio, 'intro', 'live_intro', expected_type=str),
'timestamp': float_or_none(enterstudio.get('live_start_ms'), scale=1000),
'uploader': self._html_search_meta('twitter:title', webpage, default=None) or traverse_obj(enterstudio, 'loginname'),
'uploader_id': video_id,
'formats': formats,
'is_live': True,
}
class MildomVodIE(MildomBaseIE):
IE_NAME = 'mildom:vod'
IE_DESC = 'VOD in Mildom'
_VALID_URL = r'https?://(?:(?:www|m)\.)mildom\.com/playback/(?P<user_id>\d+)/(?P<id>(?P=user_id)-[a-zA-Z0-9]+-?[0-9]*)'
_TESTS = [{
'url': 'https://www.mildom.com/playback/10882672/10882672-1597662269',
'info_dict': {
'id': '10882672-1597662269',
'ext': 'mp4',
'title': '始めてのミルダム配信じゃぃ!',
'thumbnail': r're:^https?://.*\.(png|jpg)$',
'upload_date': '20200817',
'duration': 4138.37,
'description': 'ゲームをしたくて!',
'timestamp': 1597662269.0,
'uploader_id': '10882672',
'uploader': 'kson組長(けいそん)',
},
}, {
'url': 'https://www.mildom.com/playback/10882672/10882672-1597758589870-477',
'info_dict': {
'id': '10882672-1597758589870-477',
'ext': 'mp4',
'title': '【kson】感染メイズ麻酔銃で無双する',
'thumbnail': r're:^https?://.*\.(png|jpg)$',
'timestamp': 1597759093.0,
'uploader': 'kson組長(けいそん)',
'duration': 4302.58,
'uploader_id': '10882672',
'description': 'このステージ絶対乗り越えたい',
'upload_date': '20200818',
},
}, {
'url': 'https://www.mildom.com/playback/10882672/10882672-buha9td2lrn97fk2jme0',
'info_dict': {
'id': '10882672-buha9td2lrn97fk2jme0',
'ext': 'mp4',
'title': '【kson組長】CART RACER!!!',
'thumbnail': r're:^https?://.*\.(png|jpg)$',
'uploader_id': '10882672',
'uploader': 'kson組長(けいそん)',
'upload_date': '20201104',
'timestamp': 1604494797.0,
'duration': 4657.25,
'description': 'WTF',
},
}]
def _real_extract(self, url):
user_id, video_id = self._match_valid_url(url).group('user_id', 'id')
webpage = self._download_webpage(f'https://www.mildom.com/playback/{user_id}/{video_id}', video_id)
autoplay = self._call_api(
'https://cloudac.mildom.com/nonolive/videocontent/playback/getPlaybackDetail', video_id,
note='Downloading playback metadata', query={
'v_id': video_id,
})['playback']
formats = [{
'url': autoplay['audio_url'],
'format_id': 'audio',
'protocol': 'm3u8_native',
'vcodec': 'none',
'acodec': 'aac',
'ext': 'm4a',
}]
for fmt in autoplay['video_link']:
formats.append({
'format_id': 'video-{}'.format(fmt['name']),
'url': fmt['url'],
'protocol': 'm3u8_native',
'width': fmt['level'] * autoplay['video_width'] // autoplay['video_height'],
'height': fmt['level'],
'vcodec': 'h264',
'acodec': 'aac',
'ext': 'mp4',
})
return {
'id': video_id,
'title': self._html_search_meta(('og:description', 'description'), webpage, default=None) or autoplay.get('title'),
'description': traverse_obj(autoplay, 'video_intro'),
'timestamp': float_or_none(autoplay.get('publish_time'), scale=1000),
'duration': float_or_none(autoplay.get('video_length'), scale=1000),
'thumbnail': dict_get(autoplay, ('upload_pic', 'video_pic')),
'uploader': traverse_obj(autoplay, ('author_info', 'login_name')),
'uploader_id': user_id,
'formats': formats,
}
class MildomClipIE(MildomBaseIE):
IE_NAME = 'mildom:clip'
IE_DESC = 'Clip in Mildom'
_VALID_URL = r'https?://(?:(?:www|m)\.)mildom\.com/clip/(?P<id>(?P<user_id>\d+)-[a-zA-Z0-9]+)'
_TESTS = [{
'url': 'https://www.mildom.com/clip/10042245-63921673e7b147ebb0806d42b5ba5ce9',
'info_dict': {
'id': '10042245-63921673e7b147ebb0806d42b5ba5ce9',
'title': '全然違ったよ',
'timestamp': 1619181890,
'duration': 59,
'thumbnail': r're:https?://.+',
'uploader': 'ざきんぽ',
'uploader_id': '10042245',
},
}, {
'url': 'https://www.mildom.com/clip/10111524-ebf4036e5aa8411c99fb3a1ae0902864',
'info_dict': {
'id': '10111524-ebf4036e5aa8411c99fb3a1ae0902864',
'title': 'かっこいい',
'timestamp': 1621094003,
'duration': 59,
'thumbnail': r're:https?://.+',
'uploader': '(ルーキー',
'uploader_id': '10111524',
},
}, {
'url': 'https://www.mildom.com/clip/10660174-2c539e6e277c4aaeb4b1fbe8d22cb902',
'info_dict': {
'id': '10660174-2c539e6e277c4aaeb4b1fbe8d22cb902',
'title': '',
'timestamp': 1614769431,
'duration': 31,
'thumbnail': r're:https?://.+',
'uploader': 'ドルゴルスレンギーン=ダグワドルジ',
'uploader_id': '10660174',
},
}]
def _real_extract(self, url):
user_id, video_id = self._match_valid_url(url).group('user_id', 'id')
webpage = self._download_webpage(f'https://www.mildom.com/clip/{video_id}', video_id)
clip_detail = self._call_api(
'https://cloudac-cf-jp.mildom.com/nonolive/videocontent/clip/detail', video_id,
note='Downloading playback metadata', query={
'clip_id': video_id,
})
return {
'id': video_id,
'title': self._html_search_meta(
('og:description', 'description'), webpage, default=None) or clip_detail.get('title'),
'timestamp': float_or_none(clip_detail.get('create_time')),
'duration': float_or_none(clip_detail.get('length')),
'thumbnail': clip_detail.get('cover'),
'uploader': traverse_obj(clip_detail, ('user_info', 'loginname')),
'uploader_id': user_id,
'url': clip_detail['url'],
'ext': determine_ext(clip_detail.get('url'), 'mp4'),
}
class MildomUserVodIE(MildomBaseIE):
IE_NAME = 'mildom:user:vod'
IE_DESC = 'Download all VODs from specific user in Mildom'
_VALID_URL = r'https?://(?:(?:www|m)\.)mildom\.com/profile/(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.mildom.com/profile/10093333',
'info_dict': {
'id': '10093333',
'title': 'Uploads from ねこばたけ',
},
'playlist_mincount': 732,
}, {
'url': 'https://www.mildom.com/profile/10882672',
'info_dict': {
'id': '10882672',
'title': 'Uploads from kson組長(けいそん)',
},
'playlist_mincount': 201,
}]
def _fetch_page(self, user_id, page):
page += 1
reply = self._call_api(
'https://cloudac.mildom.com/nonolive/videocontent/profile/playbackList',
user_id, note=f'Downloading page {page}', query={
'user_id': user_id,
'page': page,
'limit': '30',
})
if not reply:
return
for x in reply:
v_id = x.get('v_id')
if not v_id:
continue
yield self.url_result(f'https://www.mildom.com/playback/{user_id}/{v_id}')
def _real_extract(self, url):
user_id = self._match_id(url)
self.to_screen(f'This will download all VODs belonging to user. To download ongoing live video, use "https://www.mildom.com/{user_id}" instead')
profile = self._call_api(
'https://cloudac.mildom.com/nonolive/gappserv/user/profileV2', user_id,
query={'user_id': user_id}, note='Downloading user profile')['user_info']
return self.playlist_result(
OnDemandPagedList(functools.partial(self._fetch_page, user_id), 30),
user_id, f'Uploads from {profile["loginname"]}')

View File

@@ -80,9 +80,9 @@ class MiTeleIE(TelecincoBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
pre_player = self._parse_json(self._search_regex( pre_player = self._search_json(
r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=\s*({.+})', r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=',
webpage, 'Pre Player'), display_id)['prePlayer'] webpage, 'Pre Player', display_id)['prePlayer']
title = pre_player['title'] title = pre_player['title']
video_info = self._parse_content(pre_player['video'], url) video_info = self._parse_content(pre_player['video'], url)
content = pre_player.get('content') or {} content = pre_player.get('content') or {}

View File

@@ -12,7 +12,7 @@ from ..utils.traversal import traverse_obj
class MixchIE(InfoExtractor): class MixchIE(InfoExtractor):
IE_NAME = 'mixch' IE_NAME = 'mixch'
_VALID_URL = r'https?://(?:www\.)?mixch\.tv/u/(?P<id>\d+)' _VALID_URL = r'https?://mixch\.tv/u/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
'url': 'https://mixch.tv/u/16943797/live', 'url': 'https://mixch.tv/u/16943797/live',
@@ -74,7 +74,7 @@ class MixchIE(InfoExtractor):
class MixchArchiveIE(InfoExtractor): class MixchArchiveIE(InfoExtractor):
IE_NAME = 'mixch:archive' IE_NAME = 'mixch:archive'
_VALID_URL = r'https?://(?:www\.)?mixch\.tv/archive/(?P<id>\d+)' _VALID_URL = r'https?://mixch\.tv/archive/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
'url': 'https://mixch.tv/archive/421', 'url': 'https://mixch.tv/archive/421',
@@ -116,3 +116,56 @@ class MixchArchiveIE(InfoExtractor):
'formats': self._extract_m3u8_formats(info_json['archiveURL'], video_id), 'formats': self._extract_m3u8_formats(info_json['archiveURL'], video_id),
'thumbnail': traverse_obj(info_json, ('thumbnailURL', {url_or_none})), 'thumbnail': traverse_obj(info_json, ('thumbnailURL', {url_or_none})),
} }
class MixchMovieIE(InfoExtractor):
IE_NAME = 'mixch:movie'
_VALID_URL = r'https?://mixch\.tv/m/(?P<id>\w+)'
_TESTS = [{
'url': 'https://mixch.tv/m/Ve8KNkJ5',
'info_dict': {
'id': 'Ve8KNkJ5',
'title': '夏☀️\nムービーへのポイントは本イベントに加算されないので配信にてお願い致します🙇🏻\u200d♀️\n#TGCCAMPUS #ミス東大 #ミス東大2024 ',
'ext': 'mp4',
'uploader': 'ミス東大No.5 松藤百香🍑💫',
'uploader_id': '12299174',
'channel_follower_count': int,
'view_count': int,
'like_count': int,
'comment_count': int,
'timestamp': 1724070828,
'uploader_url': 'https://mixch.tv/u/12299174',
'live_status': 'not_live',
'upload_date': '20240819',
},
}, {
'url': 'https://mixch.tv/m/61DzpIKE',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_json(
f'https://mixch.tv/api-web/movies/{video_id}', video_id)
return {
'id': video_id,
'formats': [{
'format_id': 'mp4',
'url': data['movie']['file'],
'ext': 'mp4',
}],
**traverse_obj(data, {
'title': ('movie', 'title', {str}),
'thumbnail': ('movie', 'thumbnailURL', {url_or_none}),
'uploader': ('ownerInfo', 'name', {str}),
'uploader_id': ('ownerInfo', 'id', {int}, {str_or_none}),
'channel_follower_count': ('ownerInfo', 'fan', {int_or_none}),
'view_count': ('ownerInfo', 'view', {int_or_none}),
'like_count': ('movie', 'favCount', {int_or_none}),
'comment_count': ('movie', 'commentCount', {int_or_none}),
'timestamp': ('movie', 'published', {int_or_none}),
'uploader_url': ('ownerInfo', 'id', {lambda x: x and f'https://mixch.tv/u/{x}'}, filter),
}),
'live_status': 'not_live',
}

View File

@@ -72,6 +72,7 @@ class NaverBaseIE(InfoExtractor):
'abr': int_or_none(bitrate.get('audio')), 'abr': int_or_none(bitrate.get('audio')),
'filesize': int_or_none(stream.get('size')), 'filesize': int_or_none(stream.get('size')),
'protocol': 'm3u8_native' if stream_type == 'HLS' else None, 'protocol': 'm3u8_native' if stream_type == 'HLS' else None,
'extra_param_to_segment_url': urllib.parse.urlencode(query, doseq=True) if stream_type == 'HLS' else None,
}) })
extract_formats(get_list('video'), 'H264') extract_formats(get_list('video'), 'H264')
@@ -168,6 +169,26 @@ class NaverIE(NaverBaseIE):
'duration': 277, 'duration': 277,
'thumbnail': r're:^https?://.*\.jpg', 'thumbnail': r're:^https?://.*\.jpg',
}, },
}, {
'url': 'https://tv.naver.com/v/67838091',
'md5': '126ea384ab033bca59672c12cca7a6be',
'info_dict': {
'id': '67838091',
'ext': 'mp4',
'title': '[라인W 날씨] 내일 아침 서울 체감 -19도…호남·충남 대설',
'description': 'md5:fe026e25634c85845698aed4b59db5a7',
'timestamp': 1736347853,
'upload_date': '20250108',
'uploader': 'KBS뉴스',
'uploader_id': 'kbsnews',
'uploader_url': 'https://tv.naver.com/kbsnews',
'view_count': int,
'like_count': int,
'comment_count': int,
'duration': 69,
'thumbnail': r're:^https?://.*\.jpg',
},
'params': {'format': 'HLS_144P'},
}, { }, {
'url': 'http://tvcast.naver.com/v/81652', 'url': 'http://tvcast.naver.com/v/81652',
'only_matching': True, 'only_matching': True,

117
yt_dlp/extractor/nest.py Normal file
View File

@@ -0,0 +1,117 @@
from .common import InfoExtractor
from ..utils import ExtractorError, float_or_none, update_url_query, url_or_none
from ..utils.traversal import traverse_obj
class NestIE(InfoExtractor):
_VALID_URL = r'https?://video\.nest\.com/(?:embedded/)?live/(?P<id>\w+)'
_EMBED_REGEX = [rf'<iframe [^>]*\bsrc=[\'"](?P<url>{_VALID_URL})']
_TESTS = [{
'url': 'https://video.nest.com/embedded/live/4fvYdSo8AX?autoplay=0',
'info_dict': {
'id': '4fvYdSo8AX',
'ext': 'mp4',
'title': 'startswith:Outside ',
'alt_title': 'Outside',
'description': '<null>',
'location': 'Los Angeles',
'availability': 'public',
'thumbnail': r're:https?://',
'live_status': 'is_live',
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'https://video.nest.com/live/4fvYdSo8AX',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.pacificblue.biz/noyo-harbor-webcam/',
'info_dict': {
'id': '4fvYdSo8AX',
'ext': 'mp4',
'title': 'startswith:Outside ',
'alt_title': 'Outside',
'description': '<null>',
'location': 'Los Angeles',
'availability': 'public',
'thumbnail': r're:https?://',
'live_status': 'is_live',
},
'params': {
# m3u8 download
'skip_download': True,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
item = self._download_json(
'https://video.nest.com/api/dropcam/cameras.get_by_public_token',
video_id, query={'token': video_id})['items'][0]
uuid = item.get('uuid')
stream_domain = item.get('live_stream_host')
if not stream_domain or not uuid:
raise ExtractorError('Unable to construct playlist URL')
thumb_domain = item.get('nexus_api_nest_domain_host')
return {
'id': video_id,
**traverse_obj(item, {
'description': ('description', {str}),
'title': (('title', 'name', 'where'), {str}, filter, any),
'alt_title': ('name', {str}),
'location': ((('timezone', {lambda x: x.split('/')[1].replace('_', ' ')}), 'where'), {str}, filter, any),
}),
'thumbnail': update_url_query(
f'https://{thumb_domain}/get_image',
{'uuid': uuid, 'public': video_id}) if thumb_domain else None,
'availability': self._availability(is_private=item.get('is_public') is False),
'formats': self._extract_m3u8_formats(
f'https://{stream_domain}/nexus_aac/{uuid}/playlist.m3u8',
video_id, 'mp4', live=True, query={'public': video_id}),
'is_live': True,
}
class NestClipIE(InfoExtractor):
_VALID_URL = r'https?://video\.nest\.com/(?:embedded/)?clip/(?P<id>\w+)'
_EMBED_REGEX = [rf'<iframe [^>]*\bsrc=[\'"](?P<url>{_VALID_URL})']
_TESTS = [{
'url': 'https://video.nest.com/clip/f34c9dd237a44eca9a0001af685e3dff',
'info_dict': {
'id': 'f34c9dd237a44eca9a0001af685e3dff',
'ext': 'mp4',
'title': 'NestClip video #f34c9dd237a44eca9a0001af685e3dff',
'thumbnail': 'https://clips.dropcam.com/f34c9dd237a44eca9a0001af685e3dff.jpg',
'timestamp': 1735413474.468,
'upload_date': '20241228',
},
}, {
'url': 'https://video.nest.com/embedded/clip/34e0432adc3c46a98529443d8ad5aa76',
'info_dict': {
'id': '34e0432adc3c46a98529443d8ad5aa76',
'ext': 'mp4',
'title': 'Shootout at Veterans Boulevard at Fleur De Lis Drive',
'thumbnail': 'https://clips.dropcam.com/34e0432adc3c46a98529443d8ad5aa76.jpg',
'upload_date': '20230817',
'timestamp': 1692262897.191,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_json(
'https://video.nest.com/api/dropcam/videos.get_by_filename', video_id,
query={'filename': f'{video_id}.mp4'})
return {
'id': video_id,
**traverse_obj(data, ('items', 0, {
'title': ('title', {str}),
'thumbnail': ('thumbnail_url', {url_or_none}),
'url': ('download_url', {url_or_none}),
'timestamp': ('start_time', {float_or_none}),
})),
}

View File

@@ -592,8 +592,8 @@ class NiconicoPlaylistBaseIE(InfoExtractor):
@staticmethod @staticmethod
def _parse_owner(item): def _parse_owner(item):
return { return {
'uploader': traverse_obj(item, ('owner', 'name')), 'uploader': traverse_obj(item, ('owner', ('name', ('user', 'nickname')), {str}, any)),
'uploader_id': traverse_obj(item, ('owner', 'id')), 'uploader_id': traverse_obj(item, ('owner', 'id', {str})),
} }
def _fetch_page(self, list_id, page): def _fetch_page(self, list_id, page):
@@ -666,7 +666,7 @@ class NiconicoPlaylistIE(NiconicoPlaylistBaseIE):
mylist.get('name'), mylist.get('description'), **self._parse_owner(mylist)) mylist.get('name'), mylist.get('description'), **self._parse_owner(mylist))
class NiconicoSeriesIE(InfoExtractor): class NiconicoSeriesIE(NiconicoPlaylistBaseIE):
IE_NAME = 'niconico:series' IE_NAME = 'niconico:series'
_VALID_URL = r'https?://(?:(?:www\.|sp\.)?nicovideo\.jp(?:/user/\d+)?|nico\.ms)/series/(?P<id>\d+)' _VALID_URL = r'https?://(?:(?:www\.|sp\.)?nicovideo\.jp(?:/user/\d+)?|nico\.ms)/series/(?P<id>\d+)'
@@ -675,6 +675,9 @@ class NiconicoSeriesIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': '110226', 'id': '110226',
'title': 'ご立派ァ!のシリーズ', 'title': 'ご立派ァ!のシリーズ',
'description': '楽しそうな外人の吹き替えをさせたら終身名誉ホモガキの右に出る人はいませんね…',
'uploader': 'アルファるふぁ',
'uploader_id': '44113208',
}, },
'playlist_mincount': 10, 'playlist_mincount': 10,
}, { }, {
@@ -682,6 +685,9 @@ class NiconicoSeriesIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': '12312', 'id': '12312',
'title': 'バトルスピリッツ お勧めカード紹介(調整中)', 'title': 'バトルスピリッツ お勧めカード紹介(調整中)',
'description': '',
'uploader': '野鳥',
'uploader_id': '2275360',
}, },
'playlist_mincount': 103, 'playlist_mincount': 103,
}, { }, {
@@ -689,19 +695,21 @@ class NiconicoSeriesIE(InfoExtractor):
'only_matching': True, 'only_matching': True,
}] }]
def _call_api(self, list_id, resource, query):
return self._download_json(
f'https://nvapi.nicovideo.jp/v2/series/{list_id}', list_id,
f'Downloading {resource}', query=query,
headers=self._API_HEADERS)['data']
def _real_extract(self, url): def _real_extract(self, url):
list_id = self._match_id(url) list_id = self._match_id(url)
webpage = self._download_webpage(url, list_id) series = self._call_api(list_id, 'list', {
'pageSize': 1,
})['detail']
title = self._search_regex( return self.playlist_result(
(r'<title>「(.+)(全', self._entries(list_id), list_id,
r'<div class="TwitterShareButton"\s+data-text="(.+)\s+https:'), series.get('title'), series.get('description'), **self._parse_owner(series))
webpage, 'title', fatal=False)
if title:
title = unescapeHTML(title)
json_data = next(self._yield_json_ld(webpage, None, fatal=False))
return self.playlist_from_matches(
traverse_obj(json_data, ('itemListElement', ..., 'url')), list_id, title, ie=NiconicoIE)
class NiconicoHistoryIE(NiconicoPlaylistBaseIE): class NiconicoHistoryIE(NiconicoPlaylistBaseIE):

View File

@@ -12,6 +12,7 @@ from ..utils import (
parse_iso8601, parse_iso8601,
str_or_none, str_or_none,
try_get, try_get,
update_url_query,
url_or_none, url_or_none,
urljoin, urljoin,
) )
@@ -27,6 +28,12 @@ class NRKBaseIE(InfoExtractor):
)/''' )/'''
def _extract_nrk_formats(self, asset_url, video_id): def _extract_nrk_formats(self, asset_url, video_id):
asset_url = update_url_query(asset_url, {
# Remove 'adap' to return all streams (known values are: small, large, small_h265, large_h265)
'adap': [],
# Disable subtitles since they are fetched separately
's': 0,
})
if re.match(r'https?://[^/]+\.akamaihd\.net/i/', asset_url): if re.match(r'https?://[^/]+\.akamaihd\.net/i/', asset_url):
return self._extract_akamai_formats(asset_url, video_id) return self._extract_akamai_formats(asset_url, video_id)
asset_url = re.sub(r'(?:bw_(?:low|high)=\d+|no_audio_only)&?', '', asset_url) asset_url = re.sub(r'(?:bw_(?:low|high)=\d+|no_audio_only)&?', '', asset_url)
@@ -58,7 +65,10 @@ class NRKBaseIE(InfoExtractor):
return self._download_json( return self._download_json(
urljoin('https://psapi.nrk.no/', path), urljoin('https://psapi.nrk.no/', path),
video_id, note or f'Downloading {item} JSON', video_id, note or f'Downloading {item} JSON',
fatal=fatal, query=query) fatal=fatal, query=query, headers={
# Needed for working stream URLs, see https://github.com/yt-dlp/yt-dlp/issues/12192
'Accept': 'application/vnd.nrk.psapi+json; version=9; player=tv-player; device=player-core',
})
class NRKIE(NRKBaseIE): class NRKIE(NRKBaseIE):
@@ -77,13 +87,17 @@ class NRKIE(NRKBaseIE):
_TESTS = [{ _TESTS = [{
# video # video
'url': 'http://www.nrk.no/video/PS*150533', 'url': 'http://www.nrk.no/video/PS*150533',
'md5': 'f46be075326e23ad0e524edfcb06aeb6', 'md5': '2b88a652ad2e275591e61cf550887eec',
'info_dict': { 'info_dict': {
'id': '150533', 'id': '150533',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Dompap og andre fugler i Piip-Show', 'title': 'Dompap og andre fugler i Piip-Show',
'description': 'md5:d9261ba34c43b61c812cb6b0269a5c8f', 'description': 'md5:d9261ba34c43b61c812cb6b0269a5c8f',
'duration': 262, 'duration': 262,
'upload_date': '20140325',
'thumbnail': r're:^https?://gfx\.nrk\.no/.*$',
'timestamp': 1395751833,
'alt_title': 'md5:d9261ba34c43b61c812cb6b0269a5c8f',
}, },
}, { }, {
# audio # audio
@@ -95,6 +109,10 @@ class NRKIE(NRKBaseIE):
'title': 'Slik høres internett ut når du er blind', 'title': 'Slik høres internett ut når du er blind',
'description': 'md5:a621f5cc1bd75c8d5104cb048c6b8568', 'description': 'md5:a621f5cc1bd75c8d5104cb048c6b8568',
'duration': 20, 'duration': 20,
'timestamp': 1398429565,
'alt_title': 'Cathrine Lie Wathne er blind, og bruker hurtigtaster for å navigere seg rundt på ulike nettsider.',
'thumbnail': 'https://gfx.nrk.no/urxQMSXF-WnbfjBH5ke2igLGyN27EdJVWZ6FOsEAclhA',
'upload_date': '20140425',
}, },
}, { }, {
'url': 'nrk:ecc1b952-96dc-4a98-81b9-5296dc7a98d9', 'url': 'nrk:ecc1b952-96dc-4a98-81b9-5296dc7a98d9',
@@ -152,7 +170,7 @@ class NRKIE(NRKBaseIE):
return self._call_api(f'playback/{item}/{video_id}', video_id, item, query=query) return self._call_api(f'playback/{item}/{video_id}', video_id, item, query=query)
raise raise
# known values for preferredCdn: akamai, iponly, minicdn and telenor # known values for preferredCdn: akamai, globalconnect and telenor
manifest = call_playback_api('manifest', {'preferredCdn': 'akamai'}) manifest = call_playback_api('manifest', {'preferredCdn': 'akamai'})
video_id = try_get(manifest, lambda x: x['id'], str) or video_id video_id = try_get(manifest, lambda x: x['id'], str) or video_id
@@ -307,6 +325,13 @@ class NRKTVIE(InfoExtractor):
'ext': 'vtt', 'ext': 'vtt',
}], }],
}, },
'upload_date': '20170627',
'timestamp': 1498591822,
'thumbnail': 'https://gfx.nrk.no/myRSc4vuFlahB60P3n6swwRTQUZI1LqJZl9B7icZFgzA',
'alt_title': 'md5:46923a6e6510eefcce23d5ef2a58f2ce',
},
'params': {
'skip_download': True,
}, },
}, { }, {
'url': 'https://tv.nrk.no/serie/20-spoersmaal-tv/MUHH48000314/23-05-2014', 'url': 'https://tv.nrk.no/serie/20-spoersmaal-tv/MUHH48000314/23-05-2014',
@@ -321,6 +346,13 @@ class NRKTVIE(InfoExtractor):
'series': '20 spørsmål', 'series': '20 spørsmål',
'episode': '23. mai 2014', 'episode': '23. mai 2014',
'age_limit': 0, 'age_limit': 0,
'timestamp': 1584593700,
'thumbnail': 'https://gfx.nrk.no/u7uCe79SEfPVGRAGVp2_uAZnNc4mfz_kjXg6Bgek8lMQ',
'season_id': '126936',
'upload_date': '20200319',
'season': 'Season 2014',
'season_number': 2014,
'episode_number': 3,
}, },
}, { }, {
'url': 'https://tv.nrk.no/program/mdfp15000514', 'url': 'https://tv.nrk.no/program/mdfp15000514',

View File

@@ -343,7 +343,7 @@ class NYTimesCookingIE(NYTimesBaseIE):
if media_ids: if media_ids:
media_ids.append(lead_video_id) media_ids.append(lead_video_id)
return self.playlist_result( return self.playlist_result(
[self._extract_video(media_id) for media_id in media_ids], page_id, title, description) map(self._extract_video, media_ids), page_id, title, description)
return { return {
**self._extract_video(lead_video_id), **self._extract_video(lead_video_id),

View File

@@ -16,10 +16,10 @@ from ..utils import (
parse_iso8601, parse_iso8601,
smuggle_url, smuggle_url,
str_or_none, str_or_none,
traverse_obj,
url_or_none, url_or_none,
urljoin, urljoin,
) )
from ..utils.traversal import traverse_obj, value
class PatreonBaseIE(InfoExtractor): class PatreonBaseIE(InfoExtractor):
@@ -63,6 +63,7 @@ class PatreonIE(PatreonBaseIE):
'info_dict': { 'info_dict': {
'id': '743933', 'id': '743933',
'ext': 'mp3', 'ext': 'mp3',
'alt_title': 'cd166.mp3',
'title': 'Episode 166: David Smalley of Dogma Debate', 'title': 'Episode 166: David Smalley of Dogma Debate',
'description': 'md5:34d207dd29aa90e24f1b3f58841b81c7', 'description': 'md5:34d207dd29aa90e24f1b3f58841b81c7',
'uploader': 'Cognitive Dissonance Podcast', 'uploader': 'Cognitive Dissonance Podcast',
@@ -252,6 +253,27 @@ class PatreonIE(PatreonBaseIE):
'thumbnail': r're:^https?://.+', 'thumbnail': r're:^https?://.+',
}, },
'skip': 'Patron-only content', 'skip': 'Patron-only content',
}, {
# Contains a comment reply in the 'included' section
'url': 'https://www.patreon.com/posts/114721679',
'info_dict': {
'id': '114721679',
'ext': 'mp4',
'upload_date': '20241025',
'uploader': 'Japanalysis',
'like_count': int,
'thumbnail': r're:^https?://.+',
'comment_count': int,
'title': 'Karasawa Part 2',
'description': 'Part 2 of this video https://www.youtube.com/watch?v=Azms2-VTASk',
'uploader_url': 'https://www.patreon.com/japanalysis',
'uploader_id': '80504268',
'channel_url': 'https://www.patreon.com/japanalysis',
'channel_follower_count': int,
'timestamp': 1729897015,
'channel_id': '9346307',
},
'params': {'getcomments': True},
}] }]
_RETURN_TYPE = 'video' _RETURN_TYPE = 'video'
@@ -259,7 +281,7 @@ class PatreonIE(PatreonBaseIE):
video_id = self._match_id(url) video_id = self._match_id(url)
post = self._call_api( post = self._call_api(
f'posts/{video_id}', video_id, query={ f'posts/{video_id}', video_id, query={
'fields[media]': 'download_url,mimetype,size_bytes', 'fields[media]': 'download_url,mimetype,size_bytes,file_name',
'fields[post]': 'comment_count,content,embed,image,like_count,post_file,published_at,title,current_user_can_view', 'fields[post]': 'comment_count,content,embed,image,like_count,post_file,published_at,title,current_user_can_view',
'fields[user]': 'full_name,url', 'fields[user]': 'full_name,url',
'fields[post_tag]': 'value', 'fields[post_tag]': 'value',
@@ -296,6 +318,7 @@ class PatreonIE(PatreonBaseIE):
'ext': ext, 'ext': ext,
'filesize': size_bytes, 'filesize': size_bytes,
'url': download_url, 'url': download_url,
'alt_title': traverse_obj(media_attributes, ('file_name', {str})),
}) })
elif include_type == 'user': elif include_type == 'user':
@@ -404,26 +427,24 @@ class PatreonIE(PatreonBaseIE):
f'posts/{post_id}/comments', post_id, query=params, note=f'Downloading comments page {page}') f'posts/{post_id}/comments', post_id, query=params, note=f'Downloading comments page {page}')
cursor = None cursor = None
for comment in traverse_obj(response, (('data', ('included', lambda _, v: v['type'] == 'comment')), ...)): for comment in traverse_obj(response, (('data', 'included'), lambda _, v: v['type'] == 'comment' and v['id'])):
count += 1 count += 1
comment_id = comment.get('id')
attributes = comment.get('attributes') or {}
if comment_id is None:
continue
author_id = traverse_obj(comment, ('relationships', 'commenter', 'data', 'id')) author_id = traverse_obj(comment, ('relationships', 'commenter', 'data', 'id'))
author_info = traverse_obj(
response, ('included', lambda _, v: v['id'] == author_id and v['type'] == 'user', 'attributes'),
get_all=False, expected_type=dict, default={})
yield { yield {
'id': comment_id, **traverse_obj(comment, {
'text': attributes.get('body'), 'id': ('id', {str_or_none}),
'timestamp': parse_iso8601(attributes.get('created')), 'text': ('attributes', 'body', {str}),
'parent': traverse_obj(comment, ('relationships', 'parent', 'data', 'id'), default='root'), 'timestamp': ('attributes', 'created', {parse_iso8601}),
'author_is_uploader': attributes.get('is_by_creator'), 'parent': ('relationships', 'parent', 'data', ('id', {value('root')}), {str}, any),
'author_is_uploader': ('attributes', 'is_by_creator', {bool}),
}),
**traverse_obj(response, (
'included', lambda _, v: v['id'] == author_id and v['type'] == 'user', 'attributes', {
'author': ('full_name', {str}),
'author_thumbnail': ('image_url', {url_or_none}),
}), get_all=False),
'author_id': author_id, 'author_id': author_id,
'author': author_info.get('full_name'),
'author_thumbnail': author_info.get('image_url'),
} }
if count < traverse_obj(response, ('meta', 'count')): if count < traverse_obj(response, ('meta', 'count')):
@@ -438,7 +459,7 @@ class PatreonCampaignIE(PatreonBaseIE):
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?://(?:www\.)?patreon\.com/(?: https?://(?:www\.)?patreon\.com/(?:
(?:m|api/campaigns)/(?P<campaign_id>\d+)| (?:m|api/campaigns)/(?P<campaign_id>\d+)|
(?P<vanity>(?!creation[?/]|posts/|rss[?/])[\w-]+) (?:c/)?(?P<vanity>(?!creation[?/]|posts/|rss[?/])[\w-]+)
)(?:/posts)?/?(?:$|[?#])''' )(?:/posts)?/?(?:$|[?#])'''
_TESTS = [{ _TESTS = [{
'url': 'https://www.patreon.com/dissonancepod/', 'url': 'https://www.patreon.com/dissonancepod/',
@@ -490,6 +511,26 @@ class PatreonCampaignIE(PatreonBaseIE):
'thumbnail': r're:^https?://.*$', 'thumbnail': r're:^https?://.*$',
}, },
'playlist_mincount': 201, 'playlist_mincount': 201,
}, {
'url': 'https://www.patreon.com/c/OgSog',
'info_dict': {
'id': '8504388',
'title': 'OGSoG',
'description': r're:(?s)Hello and welcome to our Patreon page. We are Mari, Lasercorn, .+',
'channel': 'OGSoG',
'channel_id': '8504388',
'channel_url': 'https://www.patreon.com/OgSog',
'uploader_url': 'https://www.patreon.com/OgSog',
'uploader_id': '72323575',
'uploader': 'David Moss',
'thumbnail': r're:https?://.+/.+',
'channel_follower_count': int,
'age_limit': 0,
},
'playlist_mincount': 331,
}, {
'url': 'https://www.patreon.com/c/OgSog/posts',
'only_matching': True,
}, { }, {
'url': 'https://www.patreon.com/dissonancepod/posts', 'url': 'https://www.patreon.com/dissonancepod/posts',
'only_matching': True, 'only_matching': True,

View File

@@ -47,7 +47,7 @@ class PBSIE(InfoExtractor):
(r'video\.kpbs\.org', 'KPBS San Diego (KPBS)'), # http://www.kpbs.org/ (r'video\.kpbs\.org', 'KPBS San Diego (KPBS)'), # http://www.kpbs.org/
(r'video\.kqed\.org', 'KQED (KQED)'), # http://www.kqed.org (r'video\.kqed\.org', 'KQED (KQED)'), # http://www.kqed.org
(r'vids\.kvie\.org', 'KVIE Public Television (KVIE)'), # http://www.kvie.org (r'vids\.kvie\.org', 'KVIE Public Television (KVIE)'), # http://www.kvie.org
(r'video\.pbssocal\.org', 'PBS SoCal/KOCE (KOCE)'), # http://www.pbssocal.org/ (r'(?:video\.|www\.)pbssocal\.org', 'PBS SoCal/KOCE (KOCE)'), # http://www.pbssocal.org/
(r'video\.valleypbs\.org', 'ValleyPBS (KVPT)'), # http://www.valleypbs.org/ (r'video\.valleypbs\.org', 'ValleyPBS (KVPT)'), # http://www.valleypbs.org/
(r'video\.cptv\.org', 'CONNECTICUT PUBLIC TELEVISION (WEDH)'), # http://cptv.org (r'video\.cptv\.org', 'CONNECTICUT PUBLIC TELEVISION (WEDH)'), # http://cptv.org
(r'watch\.knpb\.org', 'KNPB Channel 5 (KNPB)'), # http://www.knpb.org/ (r'watch\.knpb\.org', 'KNPB Channel 5 (KNPB)'), # http://www.knpb.org/
@@ -185,12 +185,13 @@ class PBSIE(InfoExtractor):
_VALID_URL = r'''(?x)https?:// _VALID_URL = r'''(?x)https?://
(?: (?:
# Direct video URL # Player
(?:{})/(?:(?:vir|port)alplayer|video)/(?P<id>[0-9]+)(?:[?/]|$) | (?:video|player)\.pbs\.org/(?:widget/)?partnerplayer/(?P<player_id>[^/?#]+) |
# Article with embedded player (or direct video) # Direct video URL, or article with embedded player
(?:www\.)?pbs\.org/(?:[^/]+/){{1,5}}(?P<presumptive_id>[^/]+?)(?:\.html)?/?(?:$|[?\#]) | (?:{})/(?:
# Player (?:(?:vir|port)alplayer|video)/(?P<id>[0-9]+)(?:[?/#]|$) |
(?:video|player)\.pbs\.org/(?:widget/)?partnerplayer/(?P<player_id>[^/]+) (?:[^/?#]+/){{1,5}}(?P<presumptive_id>[^/?#]+?)(?:\.html)?/?(?:$|[?#])
)
) )
'''.format('|'.join(next(zip(*_STATIONS)))) '''.format('|'.join(next(zip(*_STATIONS))))
@@ -403,6 +404,19 @@ class PBSIE(InfoExtractor):
}, },
'expected_warnings': ['HTTP Error 403: Forbidden'], 'expected_warnings': ['HTTP Error 403: Forbidden'],
}, },
{
'url': 'https://www.pbssocal.org/shows/newshour/clip/capehart-johnson-1715984001',
'info_dict': {
'id': '3091549094',
'ext': 'mp4',
'title': 'PBS NewsHour - Capehart and Johnson on the unusual Biden-Trump debate plans',
'description': 'Capehart and Johnson on how the Biden-Trump debates could shape the campaign season',
'display_id': 'capehart-johnson-1715984001',
'duration': 593,
'thumbnail': 'https://image.pbs.org/video-assets/mF3oSVn-asset-mezzanine-16x9-QeXjXPy.jpg',
'chapters': [],
},
},
{ {
'url': 'http://player.pbs.org/widget/partnerplayer/2365297708/?start=0&end=0&chapterbar=false&endscreen=false&topbar=true', 'url': 'http://player.pbs.org/widget/partnerplayer/2365297708/?start=0&end=0&chapterbar=false&endscreen=false&topbar=true',
'only_matching': True, 'only_matching': True,
@@ -467,6 +481,7 @@ class PBSIE(InfoExtractor):
r"(?s)window\.PBS\.playerConfig\s*=\s*{.*?id\s*:\s*'([0-9]+)',", r"(?s)window\.PBS\.playerConfig\s*=\s*{.*?id\s*:\s*'([0-9]+)',",
r'<div[^>]+\bdata-cove-id=["\'](\d+)"', # http://www.pbs.org/wgbh/roadshow/watch/episode/2105-indianapolis-hour-2/ r'<div[^>]+\bdata-cove-id=["\'](\d+)"', # http://www.pbs.org/wgbh/roadshow/watch/episode/2105-indianapolis-hour-2/
r'<iframe[^>]+\bsrc=["\'](?:https?:)?//video\.pbs\.org/widget/partnerplayer/(\d+)', # https://www.pbs.org/wgbh/masterpiece/episodes/victoria-s2-e1/ r'<iframe[^>]+\bsrc=["\'](?:https?:)?//video\.pbs\.org/widget/partnerplayer/(\d+)', # https://www.pbs.org/wgbh/masterpiece/episodes/victoria-s2-e1/
r'\bhttps?://player\.pbs\.org/[\w-]+player/(\d+)', # last pattern to avoid false positives
] ]
media_id = self._search_regex( media_id = self._search_regex(

122
yt_dlp/extractor/pialive.py Normal file
View File

@@ -0,0 +1,122 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
clean_html,
extract_attributes,
get_element_by_class,
get_element_html_by_class,
multipart_encode,
str_or_none,
unified_timestamp,
url_or_none,
)
from ..utils.traversal import traverse_obj
class PiaLiveIE(InfoExtractor):
_VALID_URL = r'https?://player\.pia-live\.jp/stream/(?P<id>[\w-]+)'
_PLAYER_ROOT_URL = 'https://player.pia-live.jp/'
_PIA_LIVE_API_URL = 'https://api.pia-live.jp'
_API_KEY = 'kfds)FKFps-dms9e'
_TESTS = [{
'url': 'https://player.pia-live.jp/stream/4JagFBEIM14s_hK9aXHKf3k3F3bY5eoHFQxu68TC6krUDqGOwN4d61dCWQYOd6CTxl4hjya9dsfEZGsM4uGOUdax60lEI4twsXGXf7crmz8Gk__GhupTrWxA7RFRVt76',
'info_dict': {
'id': '88f3109a-f503-4d0f-a9f7-9f39ac745d84',
'display_id': '2431867_001',
'title': 'こながめでたい日2024の視聴ページ | PIA LIVE STREAM(ぴあライブストリーム)',
'live_status': 'was_live',
'comment_count': int,
},
'params': {
'getcomments': True,
'skip_download': True,
'ignore_no_formats_error': True,
},
'skip': 'The video is no longer available',
}, {
'url': 'https://player.pia-live.jp/stream/4JagFBEIM14s_hK9aXHKf3k3F3bY5eoHFQxu68TC6krJdu0GVBVbVy01IwpJ6J3qBEm3d9TCTt1d0eWpsZGj7DrOjVOmS7GAWGwyscMgiThopJvzgWC4H5b-7XQjAfRZ',
'info_dict': {
'id': '9ce8b8ba-f6d1-4d1f-83a0-18c3148ded93',
'display_id': '2431867_002',
'title': 'こながめでたい日2024の視聴ページ | PIA LIVE STREAM(ぴあライブストリーム)',
'live_status': 'was_live',
'comment_count': int,
},
'params': {
'getcomments': True,
'skip_download': True,
'ignore_no_formats_error': True,
},
'skip': 'The video is no longer available',
}]
def _extract_var(self, variable, html):
return self._search_regex(
rf'(?:var|const|let)\s+{variable}\s*=\s*(["\'])(?P<value>(?:(?!\1).)+)\1',
html, f'variable {variable}', group='value')
def _real_extract(self, url):
video_key = self._match_id(url)
webpage = self._download_webpage(url, video_key)
program_code = self._extract_var('programCode', webpage)
article_code = self._extract_var('articleCode', webpage)
title = self._html_extract_title(webpage)
if get_element_html_by_class('play-end', webpage):
raise ExtractorError('The video is no longer available', expected=True, video_id=program_code)
if start_info := clean_html(get_element_by_class('play-waiting__date', webpage)):
date, time = self._search_regex(
r'(?P<date>\d{4}/\d{1,2}/\d{1,2})\([月火水木金土日]\)(?P<time>\d{2}:\d{2})',
start_info, 'start_info', fatal=False, group=('date', 'time'))
if date and time:
release_timestamp_str = f'{date} {time} +09:00'
release_timestamp = unified_timestamp(release_timestamp_str)
self.raise_no_formats(f'The video will be available after {release_timestamp_str}', expected=True)
return {
'id': program_code,
'title': title,
'live_status': 'is_upcoming',
'release_timestamp': release_timestamp,
}
payload, content_type = multipart_encode({
'play_url': video_key,
'api_key': self._API_KEY,
})
api_data_and_headers = {
'data': payload,
'headers': {'Content-Type': content_type, 'Referer': self._PLAYER_ROOT_URL},
}
player_tag_list = self._download_json(
f'{self._PIA_LIVE_API_URL}/perf/player-tag-list/{program_code}', program_code,
'Fetching player tag list', 'Unable to fetch player tag list', **api_data_and_headers)
return self.url_result(
extract_attributes(player_tag_list['data']['movie_one_tag'])['src'],
url_transparent=True, title=title, display_id=program_code,
__post_extractor=self.extract_comments(program_code, article_code, api_data_and_headers))
def _get_comments(self, program_code, article_code, api_data_and_headers):
chat_room_url = traverse_obj(self._download_json(
f'{self._PIA_LIVE_API_URL}/perf/chat-tag-list/{program_code}/{article_code}', program_code,
'Fetching chat info', 'Unable to fetch chat info', fatal=False, **api_data_and_headers),
('data', 'chat_one_tag', {extract_attributes}, 'src', {url_or_none}))
if not chat_room_url:
return
comment_page = self._download_webpage(
chat_room_url, program_code, 'Fetching comment page', 'Unable to fetch comment page',
fatal=False, headers={'Referer': self._PLAYER_ROOT_URL})
if not comment_page:
return
yield from traverse_obj(self._search_json(
r'var\s+_history\s*=', comment_page, 'comment list',
program_code, contains_pattern=r'\[(?s:.+)\]', fatal=False), (..., {
'timestamp': (0, {int}),
'author_is_uploader': (1, {lambda x: x == 2}),
'author': (2, {str}),
'text': (3, {str}),
'id': (4, {str_or_none}),
}))

View File

@@ -1,70 +0,0 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
parse_qs,
time_seconds,
traverse_obj,
)
class PIAULIZAPortalIE(InfoExtractor):
IE_DESC = 'ulizaportal.jp - PIA LIVE STREAM'
_VALID_URL = r'https?://(?:www\.)?ulizaportal\.jp/pages/(?P<id>[\da-f]{8}-(?:[\da-f]{4}-){3}[\da-f]{12})'
_TESTS = [{
'url': 'https://ulizaportal.jp/pages/005f18b7-e810-5618-cb82-0987c5755d44',
'info_dict': {
'id': '005f18b7-e810-5618-cb82-0987c5755d44',
'title': 'プレゼンテーションプレイヤーのサンプル',
'live_status': 'not_live',
},
'params': {
'skip_download': True,
'ignore_no_formats_error': True,
},
}, {
'url': 'https://ulizaportal.jp/pages/005e1b23-fe93-5780-19a0-98e917cc4b7d?expires=4102412400&signature=f422a993b683e1068f946caf406d211c17d1ef17da8bef3df4a519502155aa91&version=1',
'info_dict': {
'id': '005e1b23-fe93-5780-19a0-98e917cc4b7d',
'title': '【確認用】視聴サンプルページULIZA',
'live_status': 'not_live',
},
'params': {
'skip_download': True,
'ignore_no_formats_error': True,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
expires = int_or_none(traverse_obj(parse_qs(url), ('expires', 0)))
if expires and expires <= time_seconds():
raise ExtractorError('The link is expired.', video_id=video_id, expected=True)
webpage = self._download_webpage(url, video_id)
player_data = self._download_webpage(
self._search_regex(
r'<script [^>]*\bsrc="(https://player-api\.p\.uliza\.jp/v1/players/[^"]+)"',
webpage, 'player data url'),
video_id, headers={'Referer': 'https://ulizaportal.jp/'},
note='Fetching player data', errnote='Unable to fetch player data')
formats = self._extract_m3u8_formats(
self._search_regex(
r'["\'](https://vms-api\.p\.uliza\.jp/v1/prog-index\.m3u8[^"\']+)', player_data,
'm3u8 url', default=None),
video_id, fatal=False)
m3u8_type = self._search_regex(
r'/hls/(dvr|video)/', traverse_obj(formats, (0, 'url')), 'm3u8 type', default=None)
return {
'id': video_id,
'title': self._html_extract_title(webpage),
'formats': formats,
'live_status': {
'video': 'is_live',
'dvr': 'was_live', # short-term archives
}.get(m3u8_type, 'not_live'), # VOD or long-term archives
}

View File

@@ -0,0 +1,99 @@
from .common import InfoExtractor
from ..utils import parse_iso8601, smuggle_url, unsmuggle_url, url_or_none
from ..utils.traversal import traverse_obj
class PiramideTVIE(InfoExtractor):
_VALID_URL = r'https?://piramide\.tv/video/(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://piramide.tv/video/wWtBAORdJUTh',
'info_dict': {
'id': 'wWtBAORdJUTh',
'ext': 'mp4',
'title': 'md5:79f9c8183ea6a35c836923142cf0abcc',
'description': '',
'thumbnail': 'https://cdn.jwplayer.com/v2/media/W86PgQDn/thumbnails/B9gpIxkH.jpg',
'channel': 'León Picarón',
'channel_id': 'leonpicaron',
'timestamp': 1696460362,
'upload_date': '20231004',
},
}, {
'url': 'https://piramide.tv/video/wcYn6li79NgN',
'info_dict': {
'id': 'wcYn6li79NgN',
'ext': 'mp4',
'title': 'ACEPTO TENER UN BEBE CON MI NOVIA\u2026? | Parte 1',
'description': '',
'channel': 'ARTA GAME',
'channel_id': 'arta_game',
'thumbnail': 'https://cdn.jwplayer.com/v2/media/cnEdGp5X/thumbnails/rHAaWfP7.jpg',
'timestamp': 1703434976,
'upload_date': '20231224',
},
}]
def _extract_video(self, video_id):
video_data = self._download_json(
f'https://hermes.piramide.tv/video/data/{video_id}', video_id, fatal=False)
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
f'https://cdn.piramide.tv/video/{video_id}/manifest.m3u8', video_id, fatal=False)
next_video = traverse_obj(video_data, ('video', 'next_video', 'id', {str}))
return next_video, {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
**traverse_obj(video_data, ('video', {
'id': ('id', {str}),
'title': ('title', {str}),
'description': ('description', {str}),
'thumbnail': ('media', 'thumbnail', {url_or_none}),
'channel': ('channel', 'name', {str}),
'channel_id': ('channel', 'id', {str}),
'timestamp': ('date', {parse_iso8601}),
})),
}
def _entries(self, video_id):
visited = set()
while True:
visited.add(video_id)
next_video, info = self._extract_video(video_id)
yield info
if not next_video or next_video in visited:
break
video_id = next_video
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})
video_id = self._match_id(url)
if self._yes_playlist(video_id, video_id, smuggled_data):
return self.playlist_result(self._entries(video_id), video_id)
return self._extract_video(video_id)[1]
class PiramideTVChannelIE(InfoExtractor):
_VALID_URL = r'https?://piramide\.tv/channel/(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://piramide.tv/channel/thekalo',
'playlist_mincount': 10,
'info_dict': {
'id': 'thekalo',
},
}]
def _entries(self, channel_name):
videos = self._download_json(
f'https://hermes.piramide.tv/channel/list/{channel_name}/date/100000', channel_name)
for video in traverse_obj(videos, ('videos', lambda _, v: v['id'])):
yield self.url_result(smuggle_url(
f'https://piramide.tv/video/{video["id"]}', {'force_noplaylist': True}),
**traverse_obj(video, {
'id': ('id', {str}),
'title': ('title', {str}),
'description': ('description', {str}),
}))
def _real_extract(self, url):
channel_name = self._match_id(url)
return self.playlist_result(self._entries(channel_name), channel_name)

View File

@@ -1,4 +1,5 @@
from .common import InfoExtractor from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
traverse_obj, traverse_obj,
@@ -110,8 +111,8 @@ class PixivSketchUserIE(PixivSketchBaseIE):
if not traverse_obj(data, 'is_broadcasting'): if not traverse_obj(data, 'is_broadcasting'):
try: try:
self._call_api(user_id, 'users/current.json', url, 'Investigating reason for request failure') self._call_api(user_id, 'users/current.json', url, 'Investigating reason for request failure')
except ExtractorError as ex: except ExtractorError as e:
if ex.cause and ex.cause.code == 401: if isinstance(e.cause, HTTPError) and e.cause.status == 401:
self.raise_login_required(f'Please log in, or use direct link like https://sketch.pixiv.net/@{user_id}/1234567890', method='cookies') self.raise_login_required(f'Please log in, or use direct link like https://sketch.pixiv.net/@{user_id}/1234567890', method='cookies')
raise ExtractorError('This user is offline', expected=True) raise ExtractorError('This user is offline', expected=True)

130
yt_dlp/extractor/plvideo.py Normal file
View File

@@ -0,0 +1,130 @@
from .common import InfoExtractor
from ..utils import (
float_or_none,
int_or_none,
parse_iso8601,
parse_resolution,
url_or_none,
)
from ..utils.traversal import traverse_obj
class PlVideoIE(InfoExtractor):
IE_DESC = 'Платформа'
_VALID_URL = r'https?://(?:www\.)?plvideo\.ru/(?:watch\?(?:[^#]+&)?v=|shorts/)(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://plvideo.ru/watch?v=Y5JzUzkcQTMK',
'md5': 'fe8e18aca892b3b31f3bf492169f8a26',
'info_dict': {
'id': 'Y5JzUzkcQTMK',
'ext': 'mp4',
'thumbnail': 'https://img.plvideo.ru/images/fp-2024-images/v/cover/37/dd/37dd00a4c96c77436ab737e85947abd7/original663a4a3bb713e5.33151959.jpg',
'title': 'Presidente de Cuba llega a Moscú en una visita de trabajo',
'channel': 'RT en Español',
'channel_id': 'ZH4EKqunVDvo',
'media_type': 'video',
'comment_count': int,
'tags': ['rusia', 'cuba', 'russia', 'miguel díaz-canel'],
'description': 'md5:a1a395d900d77a86542a91ee0826c115',
'release_timestamp': 1715096124,
'channel_is_verified': True,
'like_count': int,
'timestamp': 1715095911,
'duration': 44320,
'view_count': int,
'dislike_count': int,
'upload_date': '20240507',
'modified_date': '20240701',
'channel_follower_count': int,
'modified_timestamp': 1719824073,
},
}, {
'url': 'https://plvideo.ru/shorts/S3Uo9c-VLwFX',
'md5': '7d8fa2279406c69d2fd2a6fc548a9805',
'info_dict': {
'id': 'S3Uo9c-VLwFX',
'ext': 'mp4',
'channel': 'Romaatom',
'tags': 'count:22',
'dislike_count': int,
'upload_date': '20241130',
'description': 'md5:452e6de219bf2f32bb95806c51c3b364',
'duration': 58433,
'modified_date': '20241130',
'thumbnail': 'https://img.plvideo.ru/images/fp-2024-11-cover/S3Uo9c-VLwFX/f9318999-a941-482b-b700-2102a7049366.jpg',
'media_type': 'shorts',
'like_count': int,
'modified_timestamp': 1732961458,
'channel_is_verified': True,
'channel_id': 'erJyyTIbmUd1',
'timestamp': 1732961355,
'comment_count': int,
'title': 'Белоусов отменил приказы о кадровом резерве на гражданской службе',
'channel_follower_count': int,
'view_count': int,
'release_timestamp': 1732961458,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._download_json(
f'https://api.g1.plvideo.ru/v1/videos/{video_id}?Aud=18', video_id)
is_live = False
formats = []
subtitles = {}
automatic_captions = {}
for quality, data in traverse_obj(video_data, ('item', 'profiles', {dict.items}, lambda _, v: url_or_none(v[1]['hls']))):
formats.append({
'format_id': quality,
'ext': 'mp4',
'protocol': 'm3u8_native',
**traverse_obj(data, {
'url': 'hls',
'fps': ('fps', {float_or_none}),
'aspect_ratio': ('aspectRatio', {float_or_none}),
}),
**parse_resolution(quality),
})
if livestream_url := traverse_obj(video_data, ('item', 'livestream', 'url', {url_or_none})):
is_live = True
formats.extend(self._extract_m3u8_formats(livestream_url, video_id, 'mp4', live=True))
for lang, url in traverse_obj(video_data, ('item', 'subtitles', {dict.items}, lambda _, v: url_or_none(v[1]))):
if lang.endswith('-auto'):
automatic_captions.setdefault(lang[:-5], []).append({
'url': url,
})
else:
subtitles.setdefault(lang, []).append({
'url': url,
})
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
'automatic_captions': automatic_captions,
'is_live': is_live,
**traverse_obj(video_data, ('item', {
'id': ('id', {str}),
'title': ('title', {str}),
'description': ('description', {str}),
'thumbnail': ('cover', 'paths', 'original', 'src', {url_or_none}),
'duration': ('uploadFile', 'videoDuration', {int_or_none}),
'channel': ('channel', 'name', {str}),
'channel_id': ('channel', 'id', {str}),
'channel_follower_count': ('channel', 'stats', 'subscribers', {int_or_none}),
'channel_is_verified': ('channel', 'verified', {bool}),
'tags': ('tags', ..., {str}),
'timestamp': ('createdAt', {parse_iso8601}),
'release_timestamp': ('publishedAt', {parse_iso8601}),
'modified_timestamp': ('updatedAt', {parse_iso8601}),
'view_count': ('stats', 'viewTotalCount', {int_or_none}),
'like_count': ('stats', 'likeCount', {int_or_none}),
'dislike_count': ('stats', 'dislikeCount', {int_or_none}),
'comment_count': ('stats', 'commentCount', {int_or_none}),
'media_type': ('type', {str}),
})),
}

View File

@@ -1,136 +0,0 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
extract_attributes,
int_or_none,
js_to_json,
merge_dicts,
)
class PokemonIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?pokemon\.com/[a-z]{2}(?:.*?play=(?P<id>[a-z0-9]{32})|/(?:[^/]+/)+(?P<display_id>[^/?#&]+))'
_TESTS = [{
'url': 'https://www.pokemon.com/us/pokemon-episodes/20_30-the-ol-raise-and-switch/',
'md5': '2fe8eaec69768b25ef898cda9c43062e',
'info_dict': {
'id': 'afe22e30f01c41f49d4f1d9eab5cd9a4',
'ext': 'mp4',
'title': 'The Ol Raise and Switch!',
'description': 'md5:7db77f7107f98ba88401d3adc80ff7af',
},
'add_id': ['LimelightMedia'],
}, {
# no data-video-title
'url': 'https://www.pokemon.com/fr/episodes-pokemon/films-pokemon/pokemon-lascension-de-darkrai-2008',
'info_dict': {
'id': 'dfbaf830d7e54e179837c50c0c6cc0e1',
'ext': 'mp4',
'title': "Pokémon : L'ascension de Darkrai",
'description': 'md5:d1dbc9e206070c3e14a06ff557659fb5',
},
'add_id': ['LimelightMedia'],
'params': {
'skip_download': True,
},
}, {
'url': 'http://www.pokemon.com/uk/pokemon-episodes/?play=2e8b5c761f1d4a9286165d7748c1ece2',
'only_matching': True,
}, {
'url': 'http://www.pokemon.com/fr/episodes-pokemon/18_09-un-hiver-inattendu/',
'only_matching': True,
}, {
'url': 'http://www.pokemon.com/de/pokemon-folgen/01_20-bye-bye-smettbo/',
'only_matching': True,
}]
def _real_extract(self, url):
video_id, display_id = self._match_valid_url(url).groups()
webpage = self._download_webpage(url, video_id or display_id)
video_data = extract_attributes(self._search_regex(
r'(<[^>]+data-video-id="{}"[^>]*>)'.format(video_id if video_id else '[a-z0-9]{32}'),
webpage, 'video data element'))
video_id = video_data['data-video-id']
title = video_data.get('data-video-title') or self._html_search_meta(
'pkm-title', webpage, ' title', default=None) or self._search_regex(
r'<h1[^>]+\bclass=["\']us-title[^>]+>([^<]+)', webpage, 'title')
return {
'_type': 'url_transparent',
'id': video_id,
'url': f'limelight:media:{video_id}',
'title': title,
'description': video_data.get('data-video-summary'),
'thumbnail': video_data.get('data-video-poster'),
'series': 'Pokémon',
'season_number': int_or_none(video_data.get('data-video-season')),
'episode': title,
'episode_number': int_or_none(video_data.get('data-video-episode')),
'ie_key': 'LimelightMedia',
}
class PokemonWatchIE(InfoExtractor):
_VALID_URL = r'https?://watch\.pokemon\.com/[a-z]{2}-[a-z]{2}/(?:#/)?player(?:\.html)?\?id=(?P<id>[a-z0-9]{32})'
_API_URL = 'https://www.pokemon.com/api/pokemontv/v2/channels/{0:}'
_TESTS = [{
'url': 'https://watch.pokemon.com/en-us/player.html?id=8309a40969894a8e8d5bc1311e9c5667',
'md5': '62833938a31e61ab49ada92f524c42ff',
'info_dict': {
'id': '8309a40969894a8e8d5bc1311e9c5667',
'ext': 'mp4',
'title': 'Lillier and the Staff!',
'description': 'md5:338841b8c21b283d24bdc9b568849f04',
},
}, {
'url': 'https://watch.pokemon.com/en-us/#/player?id=3fe7752ba09141f0b0f7756d1981c6b2',
'only_matching': True,
}, {
'url': 'https://watch.pokemon.com/de-de/player.html?id=b3c402e111a4459eb47e12160ab0ba07',
'only_matching': True,
}]
def _extract_media(self, channel_array, video_id):
for channel in channel_array:
for media in channel.get('media'):
if media.get('id') == video_id:
return media
return None
def _real_extract(self, url):
video_id = self._match_id(url)
info = {
'_type': 'url',
'id': video_id,
'url': f'limelight:media:{video_id}',
'ie_key': 'LimelightMedia',
}
# API call can be avoided entirely if we are listing formats
if self.get_param('listformats', False):
return info
webpage = self._download_webpage(url, video_id)
build_vars = self._parse_json(self._search_regex(
r'(?s)buildVars\s*=\s*({.*?})', webpage, 'build vars'),
video_id, transform_source=js_to_json)
region = build_vars.get('region')
channel_array = self._download_json(self._API_URL.format(region), video_id)
video_data = self._extract_media(channel_array, video_id)
if video_data is None:
raise ExtractorError(
f'Video {video_id} does not exist', expected=True)
info['_type'] = 'url_transparent'
images = video_data.get('images')
return merge_dicts(info, {
'title': video_data.get('title'),
'description': video_data.get('description'),
'thumbnail': images.get('medium') or images.get('small'),
'series': 'Pokémon',
'season_number': int_or_none(video_data.get('season')),
'episode': video_data.get('title'),
'episode_number': int_or_none(video_data.get('episode')),
})

View File

@@ -0,0 +1,105 @@
from .common import InfoExtractor
from ..utils import url_or_none
from ..utils.traversal import traverse_obj
class RadioRadicaleIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?radioradicale\.it/scheda/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://www.radioradicale.it/scheda/471591',
'md5': 'eb0fbe43a601f1a361cbd00f3c45af4a',
'info_dict': {
'id': '471591',
'ext': 'mp4',
'title': 'md5:e8fbb8de57011a3255db0beca69af73d',
'description': 'md5:5e15a789a2fe4d67da8d1366996e89ef',
'location': 'Napoli',
'duration': 2852.0,
'timestamp': 1459987200,
'upload_date': '20160407',
'thumbnail': 'https://www.radioradicale.it/photo400/0/0/9/0/1/00901768.jpg',
},
}, {
'url': 'https://www.radioradicale.it/scheda/742783/parlamento-riunito-in-seduta-comune-11a-della-xix-legislatura',
'info_dict': {
'id': '742783',
'title': 'Parlamento riunito in seduta comune (11ª della XIX legislatura)',
'description': '-) Votazione per l\'elezione di un giudice della Corte Costituzionale (nono scrutinio)',
'location': 'CAMERA',
'duration': 5868.0,
'timestamp': 1730246400,
'upload_date': '20241030',
},
'playlist': [{
'md5': 'aa48de55dcc45478e4cd200f299aab7d',
'info_dict': {
'id': '742783-0',
'ext': 'mp4',
'title': 'Parlamento riunito in seduta comune (11ª della XIX legislatura)',
},
}, {
'md5': 'be915c189c70ad2920e5810f32260ff5',
'info_dict': {
'id': '742783-1',
'ext': 'mp4',
'title': 'Parlamento riunito in seduta comune (11ª della XIX legislatura)',
},
}, {
'md5': 'f0ee4047342baf8ed3128a8417ac5e0a',
'info_dict': {
'id': '742783-2',
'ext': 'mp4',
'title': 'Parlamento riunito in seduta comune (11ª della XIX legislatura)',
},
}],
}]
def _entries(self, videos_info, page_id):
for idx, video in enumerate(traverse_obj(
videos_info, ('playlist', lambda _, v: v['sources']))):
video_id = f'{page_id}-{idx}'
formats = []
subtitles = {}
for m3u8_url in traverse_obj(video, ('sources', ..., 'src', {url_or_none})):
fmts, subs = self._extract_m3u8_formats_and_subtitles(m3u8_url, video_id)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
for sub in traverse_obj(video, ('subtitles', ..., lambda _, v: url_or_none(v['src']))):
self._merge_subtitles({sub.get('srclang') or 'und': [{
'url': sub['src'],
'name': sub.get('label'),
}]}, target=subtitles)
yield {
'id': video_id,
'title': video.get('title'),
'formats': formats,
'subtitles': subtitles,
}
def _real_extract(self, url):
page_id = self._match_id(url)
webpage = self._download_webpage(url, page_id)
videos_info = self._search_json(
r'jQuery\.extend\(Drupal\.settings\s*,',
webpage, 'videos_info', page_id)['RRscheda']
entries = list(self._entries(videos_info, page_id))
common_info = {
'id': page_id,
'title': self._og_search_title(webpage),
'description': self._og_search_description(webpage),
'location': videos_info.get('luogo'),
**self._search_json_ld(webpage, page_id),
}
if len(entries) == 1:
return {
**entries[0],
**common_info,
}
return self.playlist_result(entries, multi_video=True, **common_info)

View File

@@ -259,6 +259,8 @@ class RedditIE(InfoExtractor):
f'https://www.reddit.com/{slug}/.json', video_id, expected_status=403) f'https://www.reddit.com/{slug}/.json', video_id, expected_status=403)
except ExtractorError as e: except ExtractorError as e:
if isinstance(e.cause, json.JSONDecodeError): if isinstance(e.cause, json.JSONDecodeError):
if self._get_cookies('https://www.reddit.com/').get('reddit_session'):
raise ExtractorError('Your IP address is unable to access the Reddit API', expected=True)
self.raise_login_required('Account authentication is required') self.raise_login_required('Account authentication is required')
raise raise

View File

@@ -114,7 +114,7 @@ class RedGifsBaseInfoExtractor(InfoExtractor):
class RedGifsIE(RedGifsBaseInfoExtractor): class RedGifsIE(RedGifsBaseInfoExtractor):
_VALID_URL = r'https?://(?:(?:www\.)?redgifs\.com/watch/|thumbs2\.redgifs\.com/)(?P<id>[^-/?#\.]+)' _VALID_URL = r'https?://(?:(?:www\.)?redgifs\.com/(?:watch|ifr)/|thumbs2\.redgifs\.com/)(?P<id>[^-/?#\.]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.redgifs.com/watch/squeakyhelplesswisent', 'url': 'https://www.redgifs.com/watch/squeakyhelplesswisent',
'info_dict': { 'info_dict': {
@@ -147,6 +147,22 @@ class RedGifsIE(RedGifsBaseInfoExtractor):
'age_limit': 18, 'age_limit': 18,
'tags': list, 'tags': list,
}, },
}, {
'url': 'https://www.redgifs.com/ifr/squeakyhelplesswisent',
'info_dict': {
'id': 'squeakyhelplesswisent',
'ext': 'mp4',
'title': 'Hotwife Legs Thick',
'timestamp': 1636287915,
'upload_date': '20211107',
'uploader': 'ignored52',
'duration': 16,
'view_count': int,
'like_count': int,
'categories': list,
'age_limit': 18,
'tags': list,
},
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@@ -213,7 +229,7 @@ class RedGifsSearchIE(RedGifsBaseInfoExtractor):
class RedGifsUserIE(RedGifsBaseInfoExtractor): class RedGifsUserIE(RedGifsBaseInfoExtractor):
IE_DESC = 'Redgifs user' IE_DESC = 'Redgifs user'
_VALID_URL = r'https?://(?:www\.)?redgifs\.com/users/(?P<username>[^/?#]+)(?:\?(?P<query>[^#]+))?' _VALID_URL = r'https?://(?:www\.)?redgifs\.com/users/(?P<username>[^/?#]+)(?:\?(?P<query>[^#]+))?'
_PAGE_SIZE = 30 _PAGE_SIZE = 80
_TESTS = [ _TESTS = [
{ {
'url': 'https://www.redgifs.com/users/lamsinka89', 'url': 'https://www.redgifs.com/users/lamsinka89',
@@ -222,7 +238,7 @@ class RedGifsUserIE(RedGifsBaseInfoExtractor):
'title': 'lamsinka89', 'title': 'lamsinka89',
'description': 'RedGifs user lamsinka89, ordered by recent', 'description': 'RedGifs user lamsinka89, ordered by recent',
}, },
'playlist_mincount': 100, 'playlist_mincount': 391,
}, },
{ {
'url': 'https://www.redgifs.com/users/lamsinka89?page=3', 'url': 'https://www.redgifs.com/users/lamsinka89?page=3',
@@ -231,7 +247,7 @@ class RedGifsUserIE(RedGifsBaseInfoExtractor):
'title': 'lamsinka89', 'title': 'lamsinka89',
'description': 'RedGifs user lamsinka89, ordered by recent', 'description': 'RedGifs user lamsinka89, ordered by recent',
}, },
'playlist_count': 30, 'playlist_count': 80,
}, },
{ {
'url': 'https://www.redgifs.com/users/lamsinka89?order=best&type=g', 'url': 'https://www.redgifs.com/users/lamsinka89?order=best&type=g',
@@ -240,7 +256,17 @@ class RedGifsUserIE(RedGifsBaseInfoExtractor):
'title': 'lamsinka89', 'title': 'lamsinka89',
'description': 'RedGifs user lamsinka89, ordered by best', 'description': 'RedGifs user lamsinka89, ordered by best',
}, },
'playlist_mincount': 100, 'playlist_mincount': 391,
},
{
'url': 'https://www.redgifs.com/users/ignored52',
'note': 'https://github.com/yt-dlp/yt-dlp/issues/7382',
'info_dict': {
'id': 'ignored52',
'title': 'ignored52',
'description': 'RedGifs user ignored52, ordered by recent',
},
'playlist_mincount': 121,
}, },
] ]

View File

@@ -176,6 +176,8 @@ class RTVSLOShowIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': '173250997', 'id': '173250997',
'title': 'Ekipa Bled', 'title': 'Ekipa Bled',
'description': 'md5:c88471e27a1268c448747a5325319ab7',
'thumbnail': 'https://img.rtvcdn.si/_up/ava/ava_misc/show_logos/173250997/logo_wide1.jpg',
}, },
'playlist_count': 18, 'playlist_count': 18,
}] }]
@@ -187,4 +189,7 @@ class RTVSLOShowIE(InfoExtractor):
return self.playlist_from_matches( return self.playlist_from_matches(
re.findall(r'<a [^>]*\bhref="(/arhiv/[^"]+)"', webpage), re.findall(r'<a [^>]*\bhref="(/arhiv/[^"]+)"', webpage),
playlist_id, self._html_extract_title(webpage), playlist_id, self._html_extract_title(webpage),
getter=urljoin('https://365.rtvslo.si'), ie=RTVSLOIE) getter=urljoin('https://365.rtvslo.si'), ie=RTVSLOIE,
description=self._og_search_description(webpage),
thumbnail=self._og_search_thumbnail(webpage),
)

View File

@@ -2,15 +2,21 @@ import itertools
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
UnsupportedError,
bool_or_none, bool_or_none,
determine_ext, determine_ext,
int_or_none, int_or_none,
js_to_json,
parse_qs, parse_qs,
traverse_obj, str_or_none,
try_get, try_get,
unified_timestamp, unified_timestamp,
url_or_none, url_or_none,
) )
from ..utils.traversal import (
subs_list_to_dict,
traverse_obj,
)
class RutubeBaseIE(InfoExtractor): class RutubeBaseIE(InfoExtractor):
@@ -19,7 +25,7 @@ class RutubeBaseIE(InfoExtractor):
query = {} query = {}
query['format'] = 'json' query['format'] = 'json'
return self._download_json( return self._download_json(
f'http://rutube.ru/api/video/{video_id}/', f'https://rutube.ru/api/video/{video_id}/',
video_id, 'Downloading video JSON', video_id, 'Downloading video JSON',
'Unable to download video JSON', query=query) 'Unable to download video JSON', query=query)
@@ -61,18 +67,21 @@ class RutubeBaseIE(InfoExtractor):
query = {} query = {}
query['format'] = 'json' query['format'] = 'json'
return self._download_json( return self._download_json(
f'http://rutube.ru/api/play/options/{video_id}/', f'https://rutube.ru/api/play/options/{video_id}/',
video_id, 'Downloading options JSON', video_id, 'Downloading options JSON',
'Unable to download options JSON', 'Unable to download options JSON',
headers=self.geo_verification_headers(), query=query) headers=self.geo_verification_headers(), query=query)
def _extract_formats(self, options, video_id): def _extract_formats_and_subtitles(self, options, video_id):
formats = [] formats = []
subtitles = {}
for format_id, format_url in options['video_balancer'].items(): for format_id, format_url in options['video_balancer'].items():
ext = determine_ext(format_url) ext = determine_ext(format_url)
if ext == 'm3u8': if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats( fmts, subs = self._extract_m3u8_formats_and_subtitles(
format_url, video_id, 'mp4', m3u8_id=format_id, fatal=False)) format_url, video_id, 'mp4', m3u8_id=format_id, fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
elif ext == 'f4m': elif ext == 'f4m':
formats.extend(self._extract_f4m_formats( formats.extend(self._extract_f4m_formats(
format_url, video_id, f4m_id=format_id, fatal=False)) format_url, video_id, f4m_id=format_id, fatal=False))
@@ -82,11 +91,19 @@ class RutubeBaseIE(InfoExtractor):
'format_id': format_id, 'format_id': format_id,
}) })
for hls_url in traverse_obj(options, ('live_streams', 'hls', ..., 'url', {url_or_none})): for hls_url in traverse_obj(options, ('live_streams', 'hls', ..., 'url', {url_or_none})):
formats.extend(self._extract_m3u8_formats(hls_url, video_id, ext='mp4', fatal=False)) fmts, subs = self._extract_m3u8_formats_and_subtitles(
return formats hls_url, video_id, 'mp4', fatal=False, m3u8_id='hls')
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
self._merge_subtitles(traverse_obj(options, ('captions', ..., {
'id': 'code',
'url': 'file',
'name': ('langTitle', {str}),
}, all, {subs_list_to_dict(lang='ru')})), target=subtitles)
return formats, subtitles
def _download_and_extract_formats(self, video_id, query=None): def _download_and_extract_formats_and_subtitles(self, video_id, query=None):
return self._extract_formats( return self._extract_formats_and_subtitles(
self._download_api_options(video_id, query=query), video_id) self._download_api_options(video_id, query=query), video_id)
@@ -97,8 +114,8 @@ class RutubeIE(RutubeBaseIE):
_EMBED_REGEX = [r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//rutube\.ru/(?:play/)?embed/[\da-z]{32}.*?)\1'] _EMBED_REGEX = [r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//rutube\.ru/(?:play/)?embed/[\da-z]{32}.*?)\1']
_TESTS = [{ _TESTS = [{
'url': 'http://rutube.ru/video/3eac3b4561676c17df9132a9a1e62e3e/', 'url': 'https://rutube.ru/video/3eac3b4561676c17df9132a9a1e62e3e/',
'md5': 'e33ac625efca66aba86cbec9851f2692', 'md5': '3d73fdfe5bb81b9aef139e22ef3de26a',
'info_dict': { 'info_dict': {
'id': '3eac3b4561676c17df9132a9a1e62e3e', 'id': '3eac3b4561676c17df9132a9a1e62e3e',
'ext': 'mp4', 'ext': 'mp4',
@@ -111,26 +128,25 @@ class RutubeIE(RutubeBaseIE):
'upload_date': '20131016', 'upload_date': '20131016',
'age_limit': 0, 'age_limit': 0,
'view_count': int, 'view_count': int,
'thumbnail': 'http://pic.rutubelist.ru/video/d2/a0/d2a0aec998494a396deafc7ba2c82add.jpg', 'thumbnail': 'https://pic.rutubelist.ru/video/d2/a0/d2a0aec998494a396deafc7ba2c82add.jpg',
'categories': ['Новости и СМИ'], 'categories': ['Новости и СМИ'],
'chapters': [], 'chapters': [],
}, },
'expected_warnings': ['Unable to download f4m'],
}, { }, {
'url': 'http://rutube.ru/play/embed/a10e53b86e8f349080f718582ce4c661', 'url': 'https://rutube.ru/play/embed/a10e53b86e8f349080f718582ce4c661',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'http://rutube.ru/embed/a10e53b86e8f349080f718582ce4c661', 'url': 'https://rutube.ru/embed/a10e53b86e8f349080f718582ce4c661',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'http://rutube.ru/video/3eac3b4561676c17df9132a9a1e62e3e/?pl_id=4252', 'url': 'https://rutube.ru/video/3eac3b4561676c17df9132a9a1e62e3e/?pl_id=4252',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'https://rutube.ru/video/10b3a03fc01d5bbcc632a2f3514e8aab/?pl_type=source', 'url': 'https://rutube.ru/video/10b3a03fc01d5bbcc632a2f3514e8aab/?pl_type=source',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'https://rutube.ru/video/private/884fb55f07a97ab673c7d654553e0f48/?p=x2QojCumHTS3rsKHWXN8Lg', 'url': 'https://rutube.ru/video/private/884fb55f07a97ab673c7d654553e0f48/?p=x2QojCumHTS3rsKHWXN8Lg',
'md5': 'd106225f15d625538fe22971158e896f', 'md5': '4fce7b4fcc7b1bcaa3f45eb1e1ad0dd7',
'info_dict': { 'info_dict': {
'id': '884fb55f07a97ab673c7d654553e0f48', 'id': '884fb55f07a97ab673c7d654553e0f48',
'ext': 'mp4', 'ext': 'mp4',
@@ -143,11 +159,10 @@ class RutubeIE(RutubeBaseIE):
'upload_date': '20221210', 'upload_date': '20221210',
'age_limit': 0, 'age_limit': 0,
'view_count': int, 'view_count': int,
'thumbnail': 'http://pic.rutubelist.ru/video/f2/d4/f2d42b54be0a6e69c1c22539e3152156.jpg', 'thumbnail': 'https://pic.rutubelist.ru/video/f2/d4/f2d42b54be0a6e69c1c22539e3152156.jpg',
'categories': ['Видеоигры'], 'categories': ['Видеоигры'],
'chapters': [], 'chapters': [],
}, },
'expected_warnings': ['Unable to download f4m'],
}, { }, {
'url': 'https://rutube.ru/video/c65b465ad0c98c89f3b25cb03dcc87c6/', 'url': 'https://rutube.ru/video/c65b465ad0c98c89f3b25cb03dcc87c6/',
'info_dict': { 'info_dict': {
@@ -156,17 +171,16 @@ class RutubeIE(RutubeBaseIE):
'chapters': 'count:4', 'chapters': 'count:4',
'categories': ['Бизнес и предпринимательство'], 'categories': ['Бизнес и предпринимательство'],
'description': 'md5:252feac1305257d8c1bab215cedde75d', 'description': 'md5:252feac1305257d8c1bab215cedde75d',
'thumbnail': 'http://pic.rutubelist.ru/video/71/8f/718f27425ea9706073eb80883dd3787b.png', 'thumbnail': 'https://pic.rutubelist.ru/video/71/8f/718f27425ea9706073eb80883dd3787b.png',
'duration': 782, 'duration': 782,
'age_limit': 0, 'age_limit': 0,
'uploader_id': '23491359', 'uploader_id': '23491359',
'timestamp': 1677153329, 'timestamp': 1677153329,
'view_count': int, 'view_count': int,
'upload_date': '20230223', 'upload_date': '20230223',
'title': 'Бизнес с нуля: найм сотрудников. Интервью с директором строительной компании', 'title': 'Бизнес с нуля: найм сотрудников. Интервью с директором строительной компании #1',
'uploader': 'Стас Быков', 'uploader': 'Стас Быков',
}, },
'expected_warnings': ['Unable to download f4m'],
}, { }, {
'url': 'https://rutube.ru/live/video/c58f502c7bb34a8fcdd976b221fca292/', 'url': 'https://rutube.ru/live/video/c58f502c7bb34a8fcdd976b221fca292/',
'info_dict': { 'info_dict': {
@@ -174,7 +188,7 @@ class RutubeIE(RutubeBaseIE):
'ext': 'mp4', 'ext': 'mp4',
'categories': ['Телепередачи'], 'categories': ['Телепередачи'],
'description': '', 'description': '',
'thumbnail': 'http://pic.rutubelist.ru/video/14/19/14190807c0c48b40361aca93ad0867c7.jpg', 'thumbnail': 'https://pic.rutubelist.ru/video/14/19/14190807c0c48b40361aca93ad0867c7.jpg',
'live_status': 'is_live', 'live_status': 'is_live',
'age_limit': 0, 'age_limit': 0,
'uploader_id': '23460655', 'uploader_id': '23460655',
@@ -184,6 +198,24 @@ class RutubeIE(RutubeBaseIE):
'title': r're:Первый канал. Прямой эфир \d{4}-\d{2}-\d{2} \d{2}:\d{2}$', 'title': r're:Первый канал. Прямой эфир \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'uploader': 'Первый канал', 'uploader': 'Первый канал',
}, },
}, {
'url': 'https://rutube.ru/play/embed/03a9cb54bac3376af4c5cb0f18444e01/',
'info_dict': {
'id': '03a9cb54bac3376af4c5cb0f18444e01',
'ext': 'mp4',
'age_limit': 0,
'description': '',
'title': 'Церемония начала торгов акциями ПАО «ЕвроТранс»',
'chapters': [],
'upload_date': '20240829',
'duration': 293,
'uploader': 'MOEX - Московская биржа',
'timestamp': 1724946628,
'thumbnail': 'https://pic.rutubelist.ru/video/2e/24/2e241fddb459baf0fa54acfca44874f4.jpg',
'view_count': int,
'uploader_id': '38420507',
'categories': ['Интервью'],
},
}, { }, {
'url': 'https://rutube.ru/video/5ab908fccfac5bb43ef2b1e4182256b0/', 'url': 'https://rutube.ru/video/5ab908fccfac5bb43ef2b1e4182256b0/',
'only_matching': True, 'only_matching': True,
@@ -192,40 +224,46 @@ class RutubeIE(RutubeBaseIE):
'only_matching': True, 'only_matching': True,
}] }]
@classmethod
def suitable(cls, url):
return False if RutubePlaylistIE.suitable(url) else super().suitable(url)
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
query = parse_qs(url) query = parse_qs(url)
info = self._download_and_extract_info(video_id, query) info = self._download_and_extract_info(video_id, query)
info['formats'] = self._download_and_extract_formats(video_id, query) formats, subtitles = self._download_and_extract_formats_and_subtitles(video_id, query)
return info return {
**info,
'formats': formats,
'subtitles': subtitles,
}
class RutubeEmbedIE(RutubeBaseIE): class RutubeEmbedIE(RutubeBaseIE):
IE_NAME = 'rutube:embed' IE_NAME = 'rutube:embed'
IE_DESC = 'Rutube embedded videos' IE_DESC = 'Rutube embedded videos'
_VALID_URL = r'https?://rutube\.ru/(?:video|play)/embed/(?P<id>[0-9]+)' _VALID_URL = r'https?://rutube\.ru/(?:video|play)/embed/(?P<id>[0-9]+)(?:[?#/]|$)'
_TESTS = [{ _TESTS = [{
'url': 'http://rutube.ru/video/embed/6722881?vk_puid37=&vk_puid38=', 'url': 'https://rutube.ru/video/embed/6722881?vk_puid37=&vk_puid38=',
'info_dict': { 'info_dict': {
'id': 'a10e53b86e8f349080f718582ce4c661', 'id': 'a10e53b86e8f349080f718582ce4c661',
'ext': 'mp4', 'ext': 'mp4',
'timestamp': 1387830582, 'timestamp': 1387830582,
'upload_date': '20131223', 'upload_date': '20131223',
'uploader_id': '297833', 'uploader_id': '297833',
'description': 'Видео группы ★http://vk.com/foxkidsreset★ музей Fox Kids и Jetix<br/><br/> восстановлено и сделано в шикоформате subziro89 http://vk.com/subziro89',
'uploader': 'subziro89 ILya', 'uploader': 'subziro89 ILya',
'title': 'Мистический городок Эйри в Индиан 5 серия озвучка subziro89', 'title': 'Мистический городок Эйри в Индиан 5 серия озвучка subziro89',
'age_limit': 0,
'duration': 1395,
'chapters': [],
'description': 'md5:a5acea57bbc3ccdc3cacd1f11a014b5b',
'view_count': int,
'thumbnail': 'https://pic.rutubelist.ru/video/d3/03/d3031f4670a6e6170d88fb3607948418.jpg',
'categories': ['Сериалы'],
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
}, { }, {
'url': 'http://rutube.ru/play/embed/8083783', 'url': 'https://rutube.ru/play/embed/8083783',
'only_matching': True, 'only_matching': True,
}, { }, {
# private video # private video
@@ -240,11 +278,12 @@ class RutubeEmbedIE(RutubeBaseIE):
query = parse_qs(url) query = parse_qs(url)
options = self._download_api_options(embed_id, query) options = self._download_api_options(embed_id, query)
video_id = options['effective_video'] video_id = options['effective_video']
formats = self._extract_formats(options, video_id) formats, subtitles = self._extract_formats_and_subtitles(options, video_id)
info = self._download_and_extract_info(video_id, query) info = self._download_and_extract_info(video_id, query)
info.update({ info.update({
'extractor_key': 'Rutube', 'extractor_key': 'Rutube',
'formats': formats, 'formats': formats,
'subtitles': subtitles,
}) })
return info return info
@@ -295,14 +334,14 @@ class RutubeTagsIE(RutubePlaylistBaseIE):
IE_DESC = 'Rutube tags' IE_DESC = 'Rutube tags'
_VALID_URL = r'https?://rutube\.ru/tags/video/(?P<id>\d+)' _VALID_URL = r'https?://rutube\.ru/tags/video/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
'url': 'http://rutube.ru/tags/video/1800/', 'url': 'https://rutube.ru/tags/video/1800/',
'info_dict': { 'info_dict': {
'id': '1800', 'id': '1800',
}, },
'playlist_mincount': 68, 'playlist_mincount': 68,
}] }]
_PAGE_TEMPLATE = 'http://rutube.ru/api/tags/video/%s/?page=%s&format=json' _PAGE_TEMPLATE = 'https://rutube.ru/api/tags/video/%s/?page=%s&format=json'
class RutubeMovieIE(RutubePlaylistBaseIE): class RutubeMovieIE(RutubePlaylistBaseIE):
@@ -310,8 +349,8 @@ class RutubeMovieIE(RutubePlaylistBaseIE):
IE_DESC = 'Rutube movies' IE_DESC = 'Rutube movies'
_VALID_URL = r'https?://rutube\.ru/metainfo/tv/(?P<id>\d+)' _VALID_URL = r'https?://rutube\.ru/metainfo/tv/(?P<id>\d+)'
_MOVIE_TEMPLATE = 'http://rutube.ru/api/metainfo/tv/%s/?format=json' _MOVIE_TEMPLATE = 'https://rutube.ru/api/metainfo/tv/%s/?format=json'
_PAGE_TEMPLATE = 'http://rutube.ru/api/metainfo/tv/%s/video?page=%s&format=json' _PAGE_TEMPLATE = 'https://rutube.ru/api/metainfo/tv/%s/video?page=%s&format=json'
def _real_extract(self, url): def _real_extract(self, url):
movie_id = self._match_id(url) movie_id = self._match_id(url)
@@ -327,62 +366,82 @@ class RutubePersonIE(RutubePlaylistBaseIE):
IE_DESC = 'Rutube person videos' IE_DESC = 'Rutube person videos'
_VALID_URL = r'https?://rutube\.ru/video/person/(?P<id>\d+)' _VALID_URL = r'https?://rutube\.ru/video/person/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
'url': 'http://rutube.ru/video/person/313878/', 'url': 'https://rutube.ru/video/person/313878/',
'info_dict': { 'info_dict': {
'id': '313878', 'id': '313878',
}, },
'playlist_mincount': 37, 'playlist_mincount': 36,
}] }]
_PAGE_TEMPLATE = 'http://rutube.ru/api/video/person/%s/?page=%s&format=json' _PAGE_TEMPLATE = 'https://rutube.ru/api/video/person/%s/?page=%s&format=json'
class RutubePlaylistIE(RutubePlaylistBaseIE): class RutubePlaylistIE(RutubePlaylistBaseIE):
IE_NAME = 'rutube:playlist' IE_NAME = 'rutube:playlist'
IE_DESC = 'Rutube playlists' IE_DESC = 'Rutube playlists'
_VALID_URL = r'https?://rutube\.ru/(?:video|(?:play/)?embed)/[\da-z]{32}/\?.*?\bpl_id=(?P<id>\d+)' _VALID_URL = r'https?://rutube\.ru/plst/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
'url': 'https://rutube.ru/video/cecd58ed7d531fc0f3d795d51cee9026/?pl_id=3097&pl_type=tag', 'url': 'https://rutube.ru/plst/308547/',
'info_dict': { 'info_dict': {
'id': '3097', 'id': '308547',
}, },
'playlist_count': 27, 'playlist_mincount': 22,
}, {
'url': 'https://rutube.ru/video/10b3a03fc01d5bbcc632a2f3514e8aab/?pl_id=4252&pl_type=source',
'only_matching': True,
}] }]
_PAGE_TEMPLATE = 'https://rutube.ru/api/playlist/custom/%s/videos?page=%s&format=json'
_PAGE_TEMPLATE = 'http://rutube.ru/api/playlist/%s/%s/?page=%s&format=json'
@classmethod
def suitable(cls, url):
from ..utils import int_or_none, parse_qs
if not super().suitable(url):
return False
params = parse_qs(url)
return params.get('pl_type', [None])[0] and int_or_none(params.get('pl_id', [None])[0])
def _next_page_url(self, page_num, playlist_id, item_kind):
return self._PAGE_TEMPLATE % (item_kind, playlist_id, page_num)
def _real_extract(self, url):
qs = parse_qs(url)
playlist_kind = qs['pl_type'][0]
playlist_id = qs['pl_id'][0]
return self._extract_playlist(playlist_id, item_kind=playlist_kind)
class RutubeChannelIE(RutubePlaylistBaseIE): class RutubeChannelIE(RutubePlaylistBaseIE):
IE_NAME = 'rutube:channel' IE_NAME = 'rutube:channel'
IE_DESC = 'Rutube channel' IE_DESC = 'Rutube channel'
_VALID_URL = r'https?://rutube\.ru/channel/(?P<id>\d+)/videos' _VALID_URL = r'https?://rutube\.ru/(?:channel/(?P<id>\d+)|u/(?P<slug>\w+))(?:/(?P<section>videos|shorts|playlists))?'
_TESTS = [{ _TESTS = [{
'url': 'https://rutube.ru/channel/639184/videos/', 'url': 'https://rutube.ru/channel/639184/videos/',
'info_dict': { 'info_dict': {
'id': '639184', 'id': '639184_videos',
}, },
'playlist_mincount': 133, 'playlist_mincount': 129,
}, {
'url': 'https://rutube.ru/channel/25902603/shorts/',
'info_dict': {
'id': '25902603_shorts',
},
'playlist_mincount': 277,
}, {
'url': 'https://rutube.ru/channel/25902603/',
'info_dict': {
'id': '25902603',
},
'playlist_mincount': 406,
}, {
'url': 'https://rutube.ru/u/rutube/videos/',
'info_dict': {
'id': '23704195_videos',
},
'playlist_mincount': 113,
}] }]
_PAGE_TEMPLATE = 'http://rutube.ru/api/video/person/%s/?page=%s&format=json' _PAGE_TEMPLATE = 'https://rutube.ru/api/video/person/%s/?page=%s&format=json&origin__type=%s'
def _next_page_url(self, page_num, playlist_id, section):
origin_type = {
'videos': 'rtb,rst,ifrm,rspa',
'shorts': 'rshorts',
None: '',
}.get(section)
return self._PAGE_TEMPLATE % (playlist_id, page_num, origin_type)
def _real_extract(self, url):
playlist_id, slug, section = self._match_valid_url(url).group('id', 'slug', 'section')
if section == 'playlists':
raise UnsupportedError(url)
if slug:
webpage = self._download_webpage(url, slug)
redux_state = self._search_json(
r'window\.reduxState\s*=', webpage, 'redux state', slug, transform_source=js_to_json)
playlist_id = traverse_obj(redux_state, (
'api', 'queries', lambda k, _: k.startswith('channelIdBySlug'),
'data', 'channel_id', {int}, {str_or_none}, any))
playlist = self._extract_playlist(playlist_id, section=section)
if section:
playlist['id'] = f'{playlist_id}_{section}'
return playlist

View File

@@ -4,43 +4,12 @@ import urllib.parse
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
parse_qs, UnsupportedError,
unsmuggle_url, make_archive_id,
remove_end,
url_or_none,
) )
from ..utils.traversal import traverse_obj
_COMMITTEES = {
'ag': ('76440', 'http://ag-f.akamaihd.net'),
'aging': ('76442', 'http://aging-f.akamaihd.net'),
'approps': ('76441', 'http://approps-f.akamaihd.net'),
'arch': ('', 'http://ussenate-f.akamaihd.net'),
'armed': ('76445', 'http://armed-f.akamaihd.net'),
'banking': ('76446', 'http://banking-f.akamaihd.net'),
'budget': ('76447', 'http://budget-f.akamaihd.net'),
'cecc': ('76486', 'http://srs-f.akamaihd.net'),
'commerce': ('80177', 'http://commerce1-f.akamaihd.net'),
'csce': ('75229', 'http://srs-f.akamaihd.net'),
'dpc': ('76590', 'http://dpc-f.akamaihd.net'),
'energy': ('76448', 'http://energy-f.akamaihd.net'),
'epw': ('76478', 'http://epw-f.akamaihd.net'),
'ethics': ('76449', 'http://ethics-f.akamaihd.net'),
'finance': ('76450', 'http://finance-f.akamaihd.net'),
'foreign': ('76451', 'http://foreign-f.akamaihd.net'),
'govtaff': ('76453', 'http://govtaff-f.akamaihd.net'),
'help': ('76452', 'http://help-f.akamaihd.net'),
'indian': ('76455', 'http://indian-f.akamaihd.net'),
'intel': ('76456', 'http://intel-f.akamaihd.net'),
'intlnarc': ('76457', 'http://intlnarc-f.akamaihd.net'),
'jccic': ('85180', 'http://jccic-f.akamaihd.net'),
'jec': ('76458', 'http://jec-f.akamaihd.net'),
'judiciary': ('76459', 'http://judiciary-f.akamaihd.net'),
'rpc': ('76591', 'http://rpc-f.akamaihd.net'),
'rules': ('76460', 'http://rules-f.akamaihd.net'),
'saa': ('76489', 'http://srs-f.akamaihd.net'),
'smbiz': ('76461', 'http://smbiz-f.akamaihd.net'),
'srs': ('75229', 'http://srs-f.akamaihd.net'),
'uscc': ('76487', 'http://srs-f.akamaihd.net'),
'vetaff': ('76462', 'http://vetaff-f.akamaihd.net'),
}
class SenateISVPIE(InfoExtractor): class SenateISVPIE(InfoExtractor):
@@ -53,31 +22,46 @@ class SenateISVPIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': 'judiciary031715', 'id': 'judiciary031715',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Integrated Senate Video Player', 'title': 'ISVP',
'thumbnail': r're:^https?://.*\.(?:jpg|png)$', 'thumbnail': r're:^https?://.*\.(?:jpg|png)$',
'_old_archive_ids': ['senategov judiciary031715'],
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
'expected_warnings': ['Failed to download m3u8 information'],
}, { }, {
'url': 'http://www.senate.gov/isvp/?type=live&comm=commerce&filename=commerce011514.mp4&auto_play=false', 'url': 'http://www.senate.gov/isvp/?type=live&comm=commerce&filename=commerce011514.mp4&auto_play=false',
'info_dict': { 'info_dict': {
'id': 'commerce011514', 'id': 'commerce011514',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Integrated Senate Video Player', 'title': 'Integrated Senate Video Player',
'_old_archive_ids': ['senategov commerce011514'],
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
'skip': 'This video is not available.',
}, { }, {
'url': 'http://www.senate.gov/isvp/?type=arch&comm=intel&filename=intel090613&hc_location=ufi', 'url': 'http://www.senate.gov/isvp/?type=arch&comm=intel&filename=intel090613&hc_location=ufi',
# checksum differs each time # checksum differs each time
'info_dict': { 'info_dict': {
'id': 'intel090613', 'id': 'intel090613',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Integrated Senate Video Player', 'title': 'ISVP',
'_old_archive_ids': ['senategov intel090613'],
},
'expected_warnings': ['Failed to download m3u8 information'],
}, {
'url': 'https://www.senate.gov/isvp/?auto_play=false&comm=help&filename=help090920&poster=https://www.help.senate.gov/assets/images/video-poster.png&stt=950',
'info_dict': {
'id': 'help090920',
'ext': 'mp4',
'title': 'ISVP',
'thumbnail': 'https://www.help.senate.gov/assets/images/video-poster.png',
'_old_archive_ids': ['senategov help090920'],
}, },
}, { }, {
# From http://www.c-span.org/video/?96791-1 # From http://www.c-span.org/video/?96791-1
@@ -85,60 +69,81 @@ class SenateISVPIE(InfoExtractor):
'only_matching': True, 'only_matching': True,
}] }]
_COMMITTEES = {
'ag': ('76440', 'https://ag-f.akamaihd.net', '2036803', 'agriculture'),
'aging': ('76442', 'https://aging-f.akamaihd.net', '2036801', 'aging'),
'approps': ('76441', 'https://approps-f.akamaihd.net', '2036802', 'appropriations'),
'arch': ('', 'https://ussenate-f.akamaihd.net', '', 'arch'),
'armed': ('76445', 'https://armed-f.akamaihd.net', '2036800', 'armedservices'),
'banking': ('76446', 'https://banking-f.akamaihd.net', '2036799', 'banking'),
'budget': ('76447', 'https://budget-f.akamaihd.net', '2036798', 'budget'),
'cecc': ('76486', 'https://srs-f.akamaihd.net', '2036782', 'srs_cecc'),
'commerce': ('80177', 'https://commerce1-f.akamaihd.net', '2036779', 'commerce'),
'csce': ('75229', 'https://srs-f.akamaihd.net', '2036777', 'srs_srs'),
'dpc': ('76590', 'https://dpc-f.akamaihd.net', '', 'dpc'),
'energy': ('76448', 'https://energy-f.akamaihd.net', '2036797', 'energy'),
'epw': ('76478', 'https://epw-f.akamaihd.net', '2036783', 'environment'),
'ethics': ('76449', 'https://ethics-f.akamaihd.net', '2036796', 'ethics'),
'finance': ('76450', 'https://finance-f.akamaihd.net', '2036795', 'finance_finance'),
'foreign': ('76451', 'https://foreign-f.akamaihd.net', '2036794', 'foreignrelations'),
'govtaff': ('76453', 'https://govtaff-f.akamaihd.net', '2036792', 'hsgac'),
'help': ('76452', 'https://help-f.akamaihd.net', '2036793', 'help'),
'indian': ('76455', 'https://indian-f.akamaihd.net', '2036791', 'indianaffairs'),
'intel': ('76456', 'https://intel-f.akamaihd.net', '2036790', 'intelligence'),
'intlnarc': ('76457', 'https://intlnarc-f.akamaihd.net', '', 'internationalnarcoticscaucus'),
'jccic': ('85180', 'https://jccic-f.akamaihd.net', '2036778', 'jccic'),
'jec': ('76458', 'https://jec-f.akamaihd.net', '2036789', 'jointeconomic'),
'judiciary': ('76459', 'https://judiciary-f.akamaihd.net', '2036788', 'judiciary'),
'rpc': ('76591', 'https://rpc-f.akamaihd.net', '', 'rpc'),
'rules': ('76460', 'https://rules-f.akamaihd.net', '2036787', 'rules'),
'saa': ('76489', 'https://srs-f.akamaihd.net', '2036780', 'srs_saa'),
'smbiz': ('76461', 'https://smbiz-f.akamaihd.net', '2036786', 'smallbusiness'),
'srs': ('75229', 'https://srs-f.akamaihd.net', '2031966', 'srs_srs'),
'uscc': ('76487', 'https://srs-f.akamaihd.net', '2036781', 'srs_uscc'),
'vetaff': ('76462', 'https://vetaff-f.akamaihd.net', '2036785', 'veteransaffairs'),
}
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})
qs = urllib.parse.parse_qs(self._match_valid_url(url).group('qs')) qs = urllib.parse.parse_qs(self._match_valid_url(url).group('qs'))
if not qs.get('filename') or not qs.get('type') or not qs.get('comm'): if not qs.get('filename') or not qs.get('comm'):
raise ExtractorError('Invalid URL', expected=True) raise ExtractorError('Invalid URL', expected=True)
filename = qs['filename'][0]
video_id = re.sub(r'.mp4$', '', qs['filename'][0]) video_id = remove_end(filename, '.mp4')
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
committee = qs['comm'][0]
if smuggled_data.get('force_title'): stream_num, stream_domain, stream_id, msl3 = self._COMMITTEES[committee]
title = smuggled_data['force_title']
else:
title = self._html_extract_title(webpage)
poster = qs.get('poster')
thumbnail = poster[0] if poster else None
video_type = qs['type'][0]
committee = video_type if video_type == 'arch' else qs['comm'][0]
stream_num, domain = _COMMITTEES[committee]
urls_alternatives = [f'https://www-senate-gov-media-srs.akamaized.net/hls/live/{stream_id}/{committee}/{filename}/master.m3u8',
f'https://www-senate-gov-msl3archive.akamaized.net/{msl3}/{filename}_1/master.m3u8',
f'{stream_domain}/i/{filename}_1@{stream_num}/master.m3u8',
f'{stream_domain}/i/{filename}.mp4/master.m3u8']
formats = [] formats = []
if video_type == 'arch': subtitles = {}
filename = video_id if '.' in video_id else video_id + '.mp4' for video_url in urls_alternatives:
m3u8_url = urllib.parse.urljoin(domain, 'i/' + filename + '/master.m3u8') formats, subtitles = self._extract_m3u8_formats_and_subtitles(video_url, video_id, ext='mp4', fatal=False)
formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4', m3u8_id='m3u8') if formats:
else: break
hdcore_sign = 'hdcore=3.1.0'
url_params = (domain, video_id, stream_num)
f4m_url = f'%s/z/%s_1@%s/manifest.f4m?{hdcore_sign}' % url_params
m3u8_url = '{}/i/{}_1@{}/master.m3u8'.format(*url_params)
for entry in self._extract_f4m_formats(f4m_url, video_id, f4m_id='f4m'):
# URLs without the extra param induce an 404 error
entry.update({'extra_param_to_segment_url': hdcore_sign})
formats.append(entry)
for entry in self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4', m3u8_id='m3u8'):
mobj = re.search(r'(?P<tag>(?:-p|-b)).m3u8', entry['url'])
if mobj:
entry['format_id'] += mobj.group('tag')
formats.append(entry)
return { return {
'id': video_id, 'id': video_id,
'title': title, 'title': self._html_extract_title(webpage),
'formats': formats, 'formats': formats,
'thumbnail': thumbnail, 'subtitles': subtitles,
'thumbnail': traverse_obj(qs, ('poster', 0, {url_or_none})),
'_old_archive_ids': [make_archive_id(SenateGovIE, video_id)],
} }
class SenateGovIE(InfoExtractor): class SenateGovIE(InfoExtractor):
_IE_NAME = 'senate.gov' _IE_NAME = 'senate.gov'
_VALID_URL = r'https?:\/\/(?:www\.)?(help|appropriations|judiciary|banking|armed-services|finance)\.senate\.gov' _SUBDOMAIN_RE = '|'.join(map(re.escape, (
'agriculture', 'aging', 'appropriations', 'armed-services', 'banking',
'budget', 'commerce', 'energy', 'epw', 'finance', 'foreign', 'help',
'intelligence', 'inaugural', 'judiciary', 'rules', 'sbc', 'veterans',
)))
_VALID_URL = rf'https?://(?:www\.)?(?:{_SUBDOMAIN_RE})\.senate\.gov'
_TESTS = [{ _TESTS = [{
'url': 'https://www.help.senate.gov/hearings/vaccines-saving-lives-ensuring-confidence-and-protecting-public-health', 'url': 'https://www.help.senate.gov/hearings/vaccines-saving-lives-ensuring-confidence-and-protecting-public-health',
'info_dict': { 'info_dict': {
@@ -147,6 +152,9 @@ class SenateGovIE(InfoExtractor):
'title': 'Vaccines: Saving Lives, Ensuring Confidence, and Protecting Public Health', 'title': 'Vaccines: Saving Lives, Ensuring Confidence, and Protecting Public Health',
'description': 'The U.S. Senate Committee on Health, Education, Labor & Pensions', 'description': 'The U.S. Senate Committee on Health, Education, Labor & Pensions',
'ext': 'mp4', 'ext': 'mp4',
'age_limit': 0,
'thumbnail': 'https://www.help.senate.gov/assets/images/sharelogo.jpg',
'_old_archive_ids': ['senategov help090920'],
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
}, { }, {
@@ -156,8 +164,12 @@ class SenateGovIE(InfoExtractor):
'display_id': 'watch?hearingid=B8A25434-5056-A066-6020-1F68CB75F0CD', 'display_id': 'watch?hearingid=B8A25434-5056-A066-6020-1F68CB75F0CD',
'title': 'Review of the FY2019 Budget Request for the U.S. Army', 'title': 'Review of the FY2019 Budget Request for the U.S. Army',
'ext': 'mp4', 'ext': 'mp4',
'age_limit': 0,
'thumbnail': 'https://www.appropriations.senate.gov/themes/appropriations/images/video-poster-flash-fit.png',
'_old_archive_ids': ['senategov appropsA051518'],
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
'expected_warnings': ['Failed to download m3u8 information'],
}, { }, {
'url': 'https://www.banking.senate.gov/hearings/21st-century-communities-public-transportation-infrastructure-investment-and-fast-act-reauthorization', 'url': 'https://www.banking.senate.gov/hearings/21st-century-communities-public-transportation-infrastructure-investment-and-fast-act-reauthorization',
'info_dict': { 'info_dict': {
@@ -166,32 +178,65 @@ class SenateGovIE(InfoExtractor):
'title': '21st Century Communities: Public Transportation Infrastructure Investment and FAST Act Reauthorization', 'title': '21st Century Communities: Public Transportation Infrastructure Investment and FAST Act Reauthorization',
'description': 'The Official website of The United States Committee on Banking, Housing, and Urban Affairs', 'description': 'The Official website of The United States Committee on Banking, Housing, and Urban Affairs',
'ext': 'mp4', 'ext': 'mp4',
'thumbnail': 'https://www.banking.senate.gov/themes/banking/images/sharelogo.jpg',
'age_limit': 0,
'_old_archive_ids': ['senategov banking041521'],
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.agriculture.senate.gov/hearings/hemp-production-and-the-2018-farm-bill',
'only_matching': True,
}, {
'url': 'https://www.aging.senate.gov/hearings/the-older-americans-act-the-local-impact-of-the-law-and-the-upcoming-reauthorization',
'only_matching': True,
}, {
'url': 'https://www.budget.senate.gov/hearings/improving-care-lowering-costs-achieving-health-care-efficiency',
'only_matching': True,
}, {
'url': 'https://www.commerce.senate.gov/2024/12/communications-networks-safety-and-security',
'only_matching': True,
}, {
'url': 'https://www.energy.senate.gov/hearings/2024/2/full-committee-hearing-to-examine',
'only_matching': True,
}, {
'url': 'https://www.epw.senate.gov/public/index.cfm/hearings?ID=F63083EA-2C13-498C-B548-341BED68C209',
'only_matching': True,
}, {
'url': 'https://www.foreign.senate.gov/hearings/american-diplomacy-and-global-leadership-review-of-the-fy25-state-department-budget-request',
'only_matching': True,
}, {
'url': 'https://www.intelligence.senate.gov/hearings/foreign-threats-elections-2024-%E2%80%93-roles-and-responsibilities-us-tech-providers',
'only_matching': True,
}, {
'url': 'https://www.inaugural.senate.gov/52nd-inaugural-ceremonies/',
'only_matching': True,
}, {
'url': 'https://www.rules.senate.gov/hearings/02/07/2023/business-meeting',
'only_matching': True,
}, {
'url': 'https://www.sbc.senate.gov/public/index.cfm/hearings?ID=5B13AA6B-8279-45AF-B54B-94156DC7A2AB',
'only_matching': True,
}, {
'url': 'https://www.veterans.senate.gov/2024/5/frontier-health-care-ensuring-veterans-access-no-matter-where-they-live',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._generic_id(url) display_id = self._generic_id(url)
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
parse_info = parse_qs(self._search_regex( url_info = next(SenateISVPIE.extract_from_webpage(self._downloader, url, webpage), None)
r'<iframe class="[^>"]*streaminghearing[^>"]*"\s[^>]*\bsrc="([^">]*)', webpage, 'hearing URL')) if not url_info:
raise UnsupportedError(url)
stream_num, stream_domain = _COMMITTEES[parse_info['comm'][-1]]
filename = parse_info['filename'][-1]
formats = self._extract_m3u8_formats(
f'{stream_domain}/i/{filename}_1@{stream_num}/master.m3u8',
display_id, ext='mp4')
title = self._html_search_regex( title = self._html_search_regex(
(*self._og_regexes('title'), r'(?s)<title>([^<]*?)</title>'), webpage, 'video title') (*self._og_regexes('title'), r'(?s)<title>([^<]*?)</title>'), webpage, 'video title', fatal=False)
return { return {
'id': re.sub(r'.mp4$', '', filename), **url_info,
'_type': 'url_transparent',
'display_id': display_id, 'display_id': display_id,
'title': re.sub(r'\s+', ' ', title.split('|')[0]).strip(), 'title': re.sub(r'\s+', ' ', title.split('|')[0]).strip(),
'description': self._og_search_description(webpage, default=None), 'description': self._og_search_description(webpage, default=None),
'thumbnail': self._og_search_thumbnail(webpage, default=None), 'thumbnail': self._og_search_thumbnail(webpage, default=None),
'age_limit': self._rta_search(webpage), 'age_limit': self._rta_search(webpage),
'formats': formats,
} }

View File

@@ -1,11 +1,9 @@
import base64 import base64
from .common import InfoExtractor from .common import InfoExtractor
from ..aes import aes_cbc_decrypt, unpad_pkcs7 from ..aes import aes_cbc_decrypt_bytes, unpad_pkcs7
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
bytes_to_intlist,
intlist_to_bytes,
unified_strdate, unified_strdate,
) )
@@ -68,10 +66,10 @@ class ShemarooMeIE(InfoExtractor):
data_json = self._download_json('https://www.shemaroome.com/users/user_all_lists', video_id, data=data.encode()) data_json = self._download_json('https://www.shemaroome.com/users/user_all_lists', video_id, data=data.encode())
if not data_json.get('status'): if not data_json.get('status'):
raise ExtractorError('Premium videos cannot be downloaded yet.', expected=True) raise ExtractorError('Premium videos cannot be downloaded yet.', expected=True)
url_data = bytes_to_intlist(base64.b64decode(data_json['new_play_url'])) url_data = base64.b64decode(data_json['new_play_url'])
key = bytes_to_intlist(base64.b64decode(data_json['key'])) key = base64.b64decode(data_json['key'])
iv = [0] * 16 iv = bytes(16)
m3u8_url = unpad_pkcs7(intlist_to_bytes(aes_cbc_decrypt(url_data, key, iv))).decode('ascii') m3u8_url = unpad_pkcs7(aes_cbc_decrypt_bytes(url_data, key, iv)).decode('ascii')
headers = {'stream_key': data_json['stream_key']} headers = {'stream_key': data_json['stream_key']}
formats, m3u8_subs = self._extract_m3u8_formats_and_subtitles(m3u8_url, video_id, fatal=False, headers=headers) formats, m3u8_subs = self._extract_m3u8_formats_and_subtitles(m3u8_url, video_id, fatal=False, headers=headers)
for fmt in formats: for fmt in formats:

Some files were not shown because too many files have changed in this diff Show More