mirror of https://github.com/yt-dlp/yt-dlp.git synced 2026-01-30 10:42:05 +00:00

Compare commits


130 Commits

Author SHA1 Message Date
github-actions[bot]
e4c120f315 Release 2026.01.29
Created by: bashonly

:ci skip all
2026-01-29 17:00:19 +00:00
bashonly
8b275536d9 [cleanup] Misc (#15749)
* Documentation fixes
* Bump PyInstaller to 6.18.0 for Windows builds

Authored by: bashonly
2026-01-29 16:55:27 +00:00
bashonly
88b35ff911 [ie/youtube] Update ejs to 0.4.0 (#15747)
Authored by: bashonly
2026-01-29 16:47:00 +00:00
bashonly
a65349443b [cleanup] Misc (#15430)
Authored by: bashonly, Grub4K, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
Co-authored-by: Simon Sawicki <contact@grub4k.dev>
2026-01-29 16:22:35 +00:00
bashonly
ba5e2227c8 [ie/vimeo] Add macos client (#15746)
Authored by: bashonly, gamer191
2026-01-29 16:19:59 +00:00
bashonly
309b03f2ad [ie/youtube] Fix default player clients (#15726)
* Add `ios_downgraded` player client
* Remove `android_sdkless` player client

Closes #15712
Authored by: bashonly
2026-01-29 06:57:13 +00:00
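
For reference, player clients can be chosen per-invocation with the `player_client` extractor-arg; a minimal sketch, assuming the new `ios_downgraded` client is user-selectable (URL is a placeholder):

```shell
# Force the newly added ios_downgraded player client
yt-dlp --extractor-args "youtube:player_client=ios_downgraded" "https://www.youtube.com/watch?v=EXAMPLE"
# Or run verbosely with the defaults to see which clients actually get used
yt-dlp -v --extractor-args "youtube:player_client=default" "https://www.youtube.com/watch?v=EXAMPLE"
```
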
bashonly
f70ebf97ea [ie/whyp] Fix extractor (#15721)
Closes #15719
Authored by: bashonly
2026-01-29 00:28:55 +00:00
N/Ame
5bf91072bc Fix concurrent formats downloading to stdout (#15617)
Authored by: grqz
2026-01-28 03:57:09 +00:00
N/Ame
1829a53a54 Fix interactive format/video selection when downloading to stdout (#15626)
Authored by: grqz
2026-01-28 01:11:19 +00:00
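
Both fixes concern piping downloads to stdout (`-o -`); a sketch of the affected invocations, assuming `-f -` is the interactive format selector:

```shell
# Interactive format/video selection while writing the download to stdout
yt-dlp -f - -o - "https://www.youtube.com/watch?v=EXAMPLE" > video.mp4

# Concurrent video+audio format download piped straight into a player
yt-dlp -f "bv+ba" -o - "https://www.youtube.com/watch?v=EXAMPLE" | mpv -
```
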
rdamas
1c739bf53e [ie/ERRArhiiv] Add extractor (#15667)
Closes #15663
Authored by: rdamas
2026-01-27 16:53:38 +00:00
bashonly
e08fdaaec2 [ie/franceinfo] Fix extraction (#15704)
Closes #15701
Authored by: bashonly
2026-01-27 15:40:47 +00:00
Romain Reignier
ac3a566434 [ie/franceinfo] Support new domain URLs (#15669)
Closes #13173
Authored by: romainreignier
2026-01-27 14:09:16 +00:00
Alexander Bocken
1f4b26c39f [ie/TheChosen] Support new URL format (#15687)
Closes #15686
Authored by: AlexBocken
2026-01-27 14:08:22 +00:00
bashonly
14998eef63 [ie/patreon] Extract inlined media (#15498)
Closes #15473
Authored by: bashonly
2026-01-27 12:52:49 +00:00
bashonly
a893774096 [ie/dailymotion] Support browser impersonation (#15697)
Fix 2b61a2a4b2

Closes #15526
Authored by: bashonly
2026-01-27 12:47:19 +00:00
nlurker
a810871608 [ie/pbs] Fix extraction (#15083)
Closes #13299
Authored by: nlurker
2026-01-27 12:45:19 +00:00
Md5Lukas
f9a06197f5 [ie/boosty] Improve metadata extraction (#15543)
Authored by: Sytm
2026-01-27 12:39:10 +00:00
Mivik
a421eb06d1 [ie/neteasemusic] Fix merged lyrics extraction (#15052)
Authored by: Mivik
2026-01-27 12:30:11 +00:00
wesson09
bc6ff877dd [ie/wat.tv] Improve DRM detection (#15659)
Closes #15647
Authored by: wesson09
2026-01-27 12:29:09 +00:00
Subrat Lima
1effa06dbf [ie/volejtv] Fix and add extractors (#13226)
Closes #13203
Authored by: subrat-lima
2026-01-27 12:22:55 +00:00
Ștefan-Gabriel Muscalu
f8b3fe33f6 [ie/facebook:ads] Fix extractor (#15582)
Closes #15577
Authored by: legraphista
2026-01-27 11:59:50 +00:00
christoph-heinrich
0e4d1e9de6 [ie/lbry] Support filtering of flat playlist results (#15695)
Closes #15683
Authored by: christoph-heinrich, dirkf

Co-authored-by: dirkf <1222880+dirkf@users.noreply.github.com>
2026-01-27 02:06:38 +00:00
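
A sketch of the use case, assuming the fields referenced by the filter are present in the flat-playlist entries (channel URL is a placeholder):

```shell
# Filter a channel's flat-playlist entries without extracting each video
yt-dlp --flat-playlist --match-filters "duration > 60" --print url "https://odysee.com/@SomeChannel"
```
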
christoph-heinrich
0dec80c02a [ie/RumbleChannel] Support filtering of flat playlist results (#15694)
Authored by: christoph-heinrich
2026-01-27 02:05:39 +00:00
bashonly
e3f0d8b731 [ie/tiktok] Solve JS challenges with native Python implementation (#15672)
Closes #15418
Authored by: bashonly, DTrombett

Co-authored-by: DTrombett <d@trombett.org>
2026-01-25 23:16:01 +00:00
bashonly
2b61a2a4b2 [ie/dailymotion] Fix extractor (#15682)
Closes #15526
Authored by: bashonly
2026-01-25 23:03:55 +00:00
rdamas
c8680b65f7 [ie/media.ccc.de] Fix extractor (#15608)
Closes #15607
Authored by: rdamas
2026-01-19 23:16:08 +00:00
Subrat Lima
457dd036af [ie/cbc] Fix extractors (#15631)
Closes #15584
Authored by: subrat-lima
2026-01-19 22:39:27 +00:00
bashonly
5382c6c81b Add --compat-options 2025 (#15499)
Authored by: bashonly
2026-01-19 20:16:33 +00:00
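
Usage mirrors the existing year aliases:

```shell
# Opt back into the pre-change defaults grouped under the 2025 alias
yt-dlp --compat-options 2025 "https://www.youtube.com/watch?v=EXAMPLE"
```
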
Nil Admirari
b16b06378a Add --format-sort-reset option (#13809)
Authored by: nihil-admirari
2026-01-19 17:40:08 +00:00
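
A minimal sketch, assuming the new option clears the default sort order so that a subsequent `-S` starts from scratch:

```shell
# Sort purely by the given keys, ignoring yt-dlp's built-in format-sort defaults
yt-dlp --format-sort-reset -S "res:1080,fps" "https://www.youtube.com/watch?v=EXAMPLE"
```
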
bashonly
0b08b833bf [build] Fix manually triggered nightly releases (#15615)
Fix 3763d0d4ab

Authored by: bashonly
2026-01-19 09:25:37 +00:00
bashonly
9ab4777b97 [rh:curl_cffi] Support curl_cffi 0.14.x (#15613)
Closes #11860
Authored by: bashonly
2026-01-18 23:40:37 +00:00
Karl Knechtel
dde5eab3b3 Support Deno installed via Python package (#15614)
* Add `deno` extra
* Check Python "scripts" path for runtime executables

Closes #15530
Authored by: zahlman, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2026-01-18 23:31:54 +00:00
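
With the new extra, a JS runtime can be provisioned entirely through pip; a sketch:

```shell
# Install yt-dlp along with the pip-packaged Deno runtime
python3 -m pip install -U "yt-dlp[default,deno]"
# Runtime discovery now also checks the Python "scripts" directory
yt-dlp -v "https://www.youtube.com/watch?v=EXAMPLE"
```
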
bashonly
23b8465063 [ie/youtube] Adjust default clients (#15601)
* Remove `tv` client from logged-out defaults due to #15583
* Remove all HTML5 clients from "JS-less" defaults due to #15569
* Prioritize `web` over `web_safari` until we request latter's config
* Bump all player client versions
* Do not warn for expected SABR-only responses (`web`/`web_safari`)
* Improve PO Token binding experiment debug output

Authored by: bashonly
2026-01-18 19:26:16 +00:00
bashonly
d20f58d721 [ie/youtube] Solve n challenges for manifest formats (#15602)
* Solve n challenges in HLS/DASH manifest URL path parameters
* Collect all challenges in advance to solve in bulk once per video
* Improve & always use the load/store helper methods for player cache

Closes #15569, Closes #15586, Closes #15587, Closes #15600
Authored by: bashonly
2026-01-18 16:34:13 +00:00
Simon Sawicki
e2ea6bd6ab [ie/youtube] Fix prioritization of youtube URL matching (#15596)
Authored by: Grub4K
2026-01-18 16:11:29 +01:00
Simon Sawicki
ede54330fb [utils] devalue: Fix calling reviver on cached value (#15568)
Authored by: Grub4K
2026-01-16 15:53:32 +01:00
bashonly
27afb31edc [ie/tarangplus] Sanitize m3u8 URLs (#15502)
Fix 260ba3abba

Closes #15501
Authored by: bashonly
2026-01-06 05:44:30 +00:00
InvalidUsernameException
48b845a296 [ie/zdf] Support sister sites URLs (#15370)
Closes #13319
Authored by: InvalidUsernameException
2026-01-06 04:56:18 +00:00
clayote
cec1f1df79 Fix --parse-metadata when TO is a single field name (#14577)
Closes #14576
Authored by: clayote, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2026-01-05 03:19:30 +00:00
0x∅
ba499ab0dc [ie/croatian.film] Add extractor (#15468)
Closes #15464
Authored by: 0xvd
2026-01-04 17:43:47 +00:00
0x∅
5a481d65fa [ie/hotstar] Extract from new API (#15480)
Closes #15479
Authored by: 0xvd
2026-01-04 04:52:37 +00:00
Cédric Luthi
6ae9e95687 [ie/tv5unis] Fix extractors (#15477)
Closes #12662
Authored by: 0xced
2026-01-04 01:02:29 +00:00
pomtnp
9c393e3f62 [ie/tiktok] Extract save_count (#15054)
Closes #15053
Authored by: pomtnp
2026-01-03 21:48:42 +00:00
Emi
87a265d820 [ie/tumblr] Extract timestamp (#15462)
Authored by: alch-emi
2026-01-03 20:54:29 +00:00
doe1080
4d4c7e1c69 [utils] js_to_json: Prevent false positives for octals (#15474)
Authored by: doe1080
2026-01-03 20:53:16 +00:00
João Victor Fernandes Oliveira
0066de5b7e [ie/zoom] Extract recordings with start times (#15475)
Authored by: JV-Fernandes
2026-01-03 20:30:38 +00:00
Oliver Pfeiffer
5026548d65 [ie/bigo] Support --wait-for-video (#15463)
Authored by: olipfei
2026-01-03 00:20:59 +00:00
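
`--wait-for-video` takes a retry interval in seconds (optionally a `MIN-MAX` range), now honored by the Bigo extractor as well:

```shell
# Poll every 30-120 seconds until the stream goes live (URL is a placeholder)
yt-dlp --wait-for-video 30-120 "https://www.bigo.tv/SomeStreamer"
```
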
0x∅
e15ca65874 [ie/twitch:videos] Raise error when channel is not found (#15458)
Closes #15450
Authored by: 0xvd
2026-01-03 00:17:38 +00:00
bashonly
3763d0d4ab [build] Improve nightly release check (#15455)
Authored by: bashonly
2026-01-02 16:02:58 +00:00
Subrat Lima
260ba3abba [ie/tarangplus] Add extractors (#13060)
Closes #13020
Authored by: subrat-lima
2026-01-02 00:15:25 +00:00
ptlydpr
878a41e283 [ie/pandatv] Add extractor (#13210)
Authored by: ptlydpr
2026-01-01 01:24:14 +01:00
bashonly
76c31a7a21 [ie/youtube] Fix comment subthreads extraction (#15448)
Fix d22436e5dc

Closes #15444
Authored by: bashonly
2025-12-31 09:56:26 +00:00
bashonly
ab3ff2d5dd [build] Harden CI/CD pipeline (#15387)
* NOTE: the release workflows' new handling of secrets
  may be a breaking change for forks that are using any secrets
  other than GPG_SIGNING_KEY or ARCHIVE_REPO_TOKEN.

  Previously, the release workflow would try to resolve a token
  secret name based on the `target` or `source` input,
  e.g. NIGHTLY_ARCHIVE_REPO_TOKEN or CUSTOM_ARCHIVE_REPO_TOKEN,
  and then fall back to using the ARCHIVE_REPO_TOKEN secret if the
  resolved token secret name was not found in the repository.

  This behavior has been replaced by the release workflow
  always using the ARCHIVE_REPO_TOKEN secret as the token
  for publishing releases to any external archive repository
  (see the example after this entry).

* Add zizmor CI job for auditing workflows

* Pin all actions to commit hashes instead of symbolic references

* Explicitly set GITHUB_TOKEN permissions at the job level

* Use actions/checkout with `persist-credentials: false` whenever possible

* Remove/replace template expansions in workflow scripts

* Remove all usage of actions/cache from build/release workflows

* Remove the cache-warmer.yml workflow

* Remove the unused download.yml workflow

* Set concurrency limits for any workflows that are triggered by PRs

* Avoid loading the entire secrets context

* Replace usage of `secrets: inherit` with explicit `secrets:` blocks

* Pin all external docker images used by the build workflow to hashes

* Explicitly set `shell: bash` for some steps to avoid pwsh or set pipefail

* Ensure any pwsh steps will fail on non-zero exit codes

Authored by: bashonly
2025-12-30 21:05:10 +00:00
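
For fork maintainers affected by the secrets change described in this entry, the single token the workflow now reads can be set with the gh CLI; a sketch (repository name is a placeholder):

```shell
# Replace any NIGHTLY_/CUSTOM_-prefixed token secrets with the one secret the release workflow uses
gh secret set ARCHIVE_REPO_TOKEN --repo OWNER/yt-dlp-fork --body "$ARCHIVE_TOKEN"
```
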
bashonly
468aa6a9b4 [ie/youtube] Fix tracking of parent comment among replies (#15439)
Fix d22436e5dc

Closes #15438
Authored by: bashonly
2025-12-30 20:53:33 +00:00
prettysunflower
6c918c5071 [ie/nebula:season] Support more URLs (#15436)
Authored by: prettysunflower
2025-12-30 21:41:19 +01:00
sepro
09078190b0 [ie/iqiyi] Remove broken login support (#15441)
Authored by: seproDev
2025-12-30 15:02:35 +01:00
sepro
4a772e5289 [ie/scte] Remove extractors (#15442)
Authored by: seproDev
2025-12-30 15:01:24 +01:00
cesbar
f24b9ac0c9 [utils] decode_packed_codes: Fix missing key handling (#15440)
Authored by: cesbar
2025-12-30 14:57:42 +01:00
bashonly
2a7e048a60 [ie/facebook] Remove broken login support (#15434)
Authored by: bashonly
2025-12-30 00:48:11 +00:00
bashonly
a6ba714005 [ie/twitter] Remove broken login support (#15432)
Closes #12616
Authored by: bashonly
2025-12-30 00:22:33 +00:00
bashonly
ce9a3591f8 [ie/twitter] Do not extract non-video posts from unified_cards (#15431)
Closes #15402
Authored by: bashonly
2025-12-30 00:20:44 +00:00
bashonly
d22436e5dc [ie/youtube] Support comment subthreads (#15419)
* Support newly rolled out comment "subthreads"
* Fix comments extraction: all replies were being missed
* Add a `max-depth` element to the `max_comments` extractor-arg
* Fully remove the deprecated `max_comment_depth` extractor-arg

Closes #15303
Authored by: bashonly
2025-12-29 21:46:29 +00:00
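
A sketch of capping reply depth; where the new `max-depth` element slots into `max_comments`'s comma-separated list is an assumption here (the arg historically took max-comments,max-parents,max-replies,max-replies-per-thread):

```shell
# Trailing "2" is the assumed position of the new max-depth element
yt-dlp --write-comments --extractor-args "youtube:max_comments=all,all,all,all,2" "https://www.youtube.com/watch?v=EXAMPLE"
```
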
bashonly
abf29e3e72 [ie/youtube] Fix skip_player=js extractor-arg (#15428)
Authored by: bashonly
2025-12-29 21:41:48 +00:00
Mike Fährmann
fcd47d2db3 [ie/picarto] Fix extraction when stream has no title (#15407)
Closes #14540
Authored by: mikf
2025-12-29 02:50:03 +00:00
bashonly
cea825e7e0 [ie/generic] Improve detection of blockage due to TLS fingerprint (#15426)
Authored by: bashonly
2025-12-29 01:02:09 +00:00
sepro
c0a7c594a9 [utils] mimetype2ext: Recognize more srt types (#15411)
Authored by: seproDev
2025-12-26 19:00:45 +01:00
sepro
6b23305822 [ie/manoto] Remove extractor (#15414)
Authored by: seproDev
2025-12-26 18:57:08 +01:00
sepro
6d92f87ddc [ie/cda] Support mobile URLs (#15398)
Closes #15397
Authored by: seproDev
2025-12-25 02:25:03 +01:00
sepro
9bf040dc6f [utils] random_user_agent: Bump versions (#15396)
Authored by: seproDev
2025-12-24 21:47:50 +01:00
doe1080
15263d049c [utils] unified_timestamp: Add tz_offset parameter (#15357)
Allows datetime strings without a timezone to be parsed with the correct offset

Authored by: doe1080
2025-12-20 19:52:53 +00:00
0x∅
0ea6cc6d82 [ie/netease:program] Support DJ URLs (#15365)
Closes #15364
Authored by: 0xvd
2025-12-20 10:09:22 +00:00
0x∅
e9d4b22b9b [ie/bandcamp:weekly] Fix extractor (#15208)
Closes #13963
Authored by: 0xvd, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-12-20 03:54:08 +00:00
0x∅
97fb78a5b9 [ie/yahoo] Fix extractor (#15314)
Closes #15211
Authored by: 0xvd, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-12-20 02:58:47 +00:00
0x∅
f5270705e8 [ie/nebula:season] Add extractor (#15347)
Closes #15343
Authored by: 0xvd, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-12-20 01:51:09 +00:00
bashonly
a6a8f6b6d6 [ci] Explicitly declare permissions and limit credentials (#15324)
Authored by: bashonly
2025-12-19 19:22:23 +00:00
bashonly
825648a740 [build] Bump official actions to latest versions (#15305)
* Bump actions/cache → v5
* Bump actions/upload-artifact → v6
* Bump actions/download-artifact → v7

Authored by: bashonly
2025-12-19 19:04:52 +00:00
bashonly
e0bb477732 Bypass interactive format selection if no formats are found (#15278)
Authored by: bashonly
2025-12-19 18:57:55 +00:00
delta
c0c9cac554 [ie/filmarchiv] Add extractor (#13490)
Closes #14821
Authored by: 4elta
2025-12-19 00:44:58 +00:00
0x∅
f0bc71abf6 [ie/tubitv] Support URLs with locales (#15205)
Closes #15176
Authored by: 0xvd
2025-12-19 00:26:53 +00:00
0x∅
8a4b626daf [ie/dropbox] Support videos in folders (#15313)
Closes #15312
Authored by: 0xvd
2025-12-19 00:24:13 +00:00
0x∅
f6dc7d5279 Accept float values for --sleep-subtitles (#15282)
Closes #15269
Authored by: 0xvd
2025-12-18 23:42:50 +00:00
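
`--sleep-subtitles` now accepts fractional seconds:

```shell
# Sleep half a second between subtitle downloads
yt-dlp --write-subs --sub-langs all --sleep-subtitles 0.5 "https://www.youtube.com/watch?v=EXAMPLE"
```
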
quietvoid
c5e55e0479 [ie/gofile] Fix extractor (#15296)
Authored by: quietvoid
2025-12-18 23:42:13 +00:00
doe1080
6d4984e64e [ie/nextmedia] Remove extractors (#15354)
Authored by: doe1080
2025-12-18 21:36:15 +00:00
doe1080
a27ec9efc6 [ie/netzkino] Rework extractor (#15351)
Authored by: doe1080
2025-12-18 21:32:54 +00:00
bashonly
ff61bef041 [ie/youtube:tab] Fix flat thumbnails extraction for shorts (#15331)
Closes #15329
Authored by: bashonly
2025-12-15 22:37:25 +00:00
sepro
04f2ec4b97 [ie/parti] Fix extractors (#15319)
Authored by: seproDev
2025-12-13 20:00:56 +01:00
0x∅
b6f24745bf [ie/telecinco] Fix extractor (#15311)
Closes #15240
Authored by: 0xvd, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-12-12 22:25:45 +00:00
norepro
f2ee2a46fc [ie/pornhub] Optimize metadata extraction (#15231)
Closes #14621
Authored by: norepro
2025-12-12 20:52:09 +00:00
bashonly
5f37f67d37 [ie/archive.org] Fix metadata extraction (#15286)
Closes #15280
Authored by: bashonly
2025-12-09 19:05:12 +00:00
github-actions[bot]
aa220d0aaa Release 2025.12.08
Created by: bashonly

:ci skip all
2025-12-08 00:06:43 +00:00
bashonly
7a52ff29d8 [cleanup] Misc (#15016)
Closes #15160, Closes #15184
Authored by: bashonly, seproDev, RezSat, oxyzenQ

Co-authored-by: sepro <sepro@sepr0.com>
Co-authored-by: Yehan Wasura <yehantest@gmail.com>
Co-authored-by: rezky_nightky <with.rezky@gmail.com>
2025-12-07 23:58:34 +00:00
bashonly
0c7e4cfcae [ie/youtube] Update ejs to 0.3.2 (#15267)
Authored by: bashonly
2025-12-07 23:51:49 +00:00
bashonly
29fe515d8d [devscripts] install_deps: Align options/terms with PEP 735 (#15200)
Authored by: bashonly
2025-12-07 23:39:05 +00:00
bashonly
1d43fa5af8 [ie/youtube] Improve message when no JS runtime is found (#15266)
Closes #15158
Authored by: bashonly
2025-12-07 23:37:03 +00:00
bashonly
fa16dc5241 [cookies] Fix --cookies-from-browser for new installs of Firefox 147+ (#15215)
Ref: https://bugzilla.mozilla.org/show_bug.cgi?id=259356

Authored by: bashonly, mbway

Co-authored-by: Matthew Broadway <mattdbway@gmail.com>
2025-12-07 23:20:02 +00:00
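
No invocation change is needed for the Firefox 147+ fix; the usual form keeps working on fresh profiles:

```shell
yt-dlp --cookies-from-browser firefox "https://www.youtube.com/watch?v=EXAMPLE"
```
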
garret1317
04050be583 [pp/FFmpegMetadata] Add more tag mappings (#14654)
Authored by: garret1317
2025-12-07 23:04:03 +00:00
Simon Sawicki
7bd79d9296 [ie/youtube] Allow ejs patch version to differ (#15263)
Authored by: Grub4K
2025-12-07 22:10:53 +00:00
0x∅
29e2570378 [ie/xhamster] Fix extractor (#15252)
Closes #15239
Authored by: 0xvd
2025-12-06 22:12:38 +00:00
sepro
c70b57c03e [ie/Alibaba] Add extractor (#15253)
Closes #13774
Authored by: seproDev
2025-12-06 22:24:03 +01:00
bashonly
025191fea6 [ie/sporteurope] Support new domain (#15251)
Closes #15250
Authored by: bashonly
2025-12-06 21:16:05 +00:00
bashonly
36b29bb353 [ie/loom] Fix extractor (#15236)
Closes #15141
Authored by: bashonly
2025-12-05 23:18:02 +00:00
sepro
7ec6b9bc40 [ie/web.archive:youtube] Fix extractor (#15234)
Closes #15233
Authored by: seproDev
2025-12-04 18:15:09 +01:00
WhatAmISupposedToPutHere
f7acf3c1f4 [ie/youtube] Add use_ad_playback_context extractor-arg (#15220)
Closes #15144
Authored by: WhatAmISupposedToPutHere
2025-12-03 23:26:20 +00:00
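
A sketch of enabling the new extractor-arg; the exact value syntax is an assumption (boolean extractor-args commonly take true/false):

```shell
yt-dlp --extractor-args "youtube:use_ad_playback_context=true" "https://www.youtube.com/watch?v=EXAMPLE"
```
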
bashonly
017d76edcf [ie/youtube] Revert 56ea3a00ea
Remove `request_no_ads` workaround (#15214)

Closes #15212
Authored by: bashonly
2025-12-01 05:01:22 +00:00
WhatAmISupposedToPutHere
56ea3a00ea [ie/youtube] Add request_no_ads extractor-arg (#15145)
Default is `true` for unauthenticated users.
Default is `false` if logged-in cookies have been passed to yt-dlp.
Using `true` results in a loss of premium formats.

Closes #15144
Authored by: WhatAmISupposedToPutHere
2025-12-01 01:02:58 +00:00
Zer0 Spectrum
2a777ecbd5 [ie/tubitv:series] Fix extractor (#15018)
Authored by: Zer0spectrum
2025-12-01 00:33:14 +00:00
thomasmllt
023e4db9af [ie/patreon:campaign] Fix extractor (#15108)
Closes #15094
Authored by: thomasmllt
2025-11-30 23:59:28 +00:00
Zer0 Spectrum
4433b3a217 [ie/fc2:live] Raise appropriate error when stream is offline (#15180)
Closes #15179
Authored by: Zer0spectrum
2025-11-30 23:54:17 +00:00
bashonly
419776ecf5 [ie/youtube] Extract all automatic caption languages (#15156)
Closes #14889, Closes #15150
Authored by: bashonly
2025-11-30 23:35:05 +00:00
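
With every automatic caption language extracted, the tracks can be listed and selected as usual:

```shell
# List all available (auto-)caption languages, then save specific auto tracks
yt-dlp --list-subs "https://www.youtube.com/watch?v=EXAMPLE"
yt-dlp --write-auto-subs --sub-langs "de,ja" --skip-download "https://www.youtube.com/watch?v=EXAMPLE"
```
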
bashonly
2801650268 [build] Bump PyInstaller minimum version requirement to 6.17.0 (#15199)
Ref: https://github.com/pyinstaller/pyinstaller/issues/9149

Authored by: bashonly
2025-11-29 21:18:49 +00:00
sepro
26c2545b87 [ie/S4C] Fix geo-restricted content (#15196)
Closes #15190
Authored by: seproDev
2025-11-28 23:14:03 +01:00
garret1317
12d411722a [ie/nhk] Fix extractors (#14528)
Closes #14223, Closes #14589
Authored by: garret1317
2025-11-24 11:27:43 +00:00
Simon Sawicki
e564b4a808 Respect PATHEXT when locating JS runtime on Windows (#15117)
Fixes #15043

Authored by: Grub4K
2025-11-24 01:56:43 +01:00
WhatAmISupposedToPutHere
715af0c636 [ie/youtube] Determine wait time from player response (#14646)
Closes #14645
Authored by: WhatAmISupposedToPutHere, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-11-23 00:49:36 +00:00
Sojiroh
0c696239ef [ie/WistiaChannel] Fix extractor (#14218)
Closes #14204
Authored by: Sojiroh
2025-11-21 23:08:20 +00:00
putridambassador121
3cb5e4db54 [ie/AGalega] Add extractor (#15105)
Closes #14758
Authored by: putridambassador121
2025-11-21 20:07:07 +01:00
Elioo
6842620d56 [ie/Digiteka] Rework extractor (#14903)
Closes #12454
Authored by: beliote
2025-11-20 20:01:07 +01:00
Michael D.
20f83f208e [ie/netapp] Add extractors (#15122)
Closes #14902
Authored by: darkstar
2025-11-20 19:56:25 +01:00
sepro
c2e7e9cdb2 [ie/URPlay] Fix extractor (#15120)
Closes #13028
Authored by: seproDev
2025-11-20 16:22:45 +01:00
bashonly
2c9f0c3456 [ie/sproutvideo] Fix extractor (#15113)
Closes #15112
Authored by: bashonly
2025-11-19 18:17:29 +00:00
bashonly
0eed3fe530 [pp/ffmpeg] Fix uncaught error if bad --ffmpeg-location is given (#15104)
Revert 9f77e04c76

Closes #12829
Authored by: bashonly
2025-11-19 00:23:00 +00:00
sepro
a4c72acc46 [ie/MedalTV] Rework extractor (#15103)
Closes #15102
Authored by: seproDev
2025-11-19 00:52:55 +01:00
bashonly
9daba4f442 [ie/thisoldhouse] Fix login support (#15097)
Closes #14931
Authored by: bashonly
2025-11-18 23:08:21 +00:00
Mr Flamel
854fded114 [ie/TheChosen] Add extractors (#14183)
Closes #11246
Authored by: mrFlamel
2025-11-17 00:17:55 +01:00
Anton Larionov
5f66ac71f6 [ie/mave:channel] Add extractor (#14915)
Authored by: anlar
2025-11-17 00:05:44 +01:00
bashonly
4cb5e191ef [ie/youtube] Detect "super resolution" AI-upscaled formats (#15050)
Closes #14923
Authored by: bashonly
2025-11-16 22:39:22 +00:00
bashonly
6ee6a6fc58 [rh:urllib] Do not read after close (#15049)
Fix regression introduced in 5767fb4ab1

Closes #15017
Authored by: bashonly
2025-11-16 19:07:48 +00:00
bashonly
23f1ab3469 [fd] Fix playback wait time for ffmpeg downloads (#15066)
Authored by: bashonly
2025-11-16 18:15:16 +00:00
Haytam001
af285016d2 [ie/yfanefa] Add extractor (#15032)
Closes #14974
Authored by: Haytam001
2025-11-16 12:02:13 +01:00
sepro
1dd84b9d1c [ie/SoundcloudPlaylist] Support new API URLs (#15071)
Closes #15068
Authored by: seproDev
2025-11-16 00:35:00 +01:00
127 changed files with 4252 additions and 2959 deletions

View File

@@ -1,5 +1,4 @@
config-variables:
- KEEP_CACHE_WARM
- PUSH_VERSION_COMMIT
- UPDATE_TO_VERIFICATION
- PYPI_PROJECT

View File

@@ -74,11 +74,11 @@ on:
default: true
type: boolean
permissions:
contents: read
permissions: {}
jobs:
process:
name: Process
runs-on: ubuntu-latest
outputs:
origin: ${{ steps.process_inputs.outputs.origin }}
@@ -146,7 +146,6 @@ jobs:
'runner': 'ubuntu-24.04-arm',
'qemu_platform': 'linux/arm/v7',
'onefile': False,
'cache_requirements': True,
'update_to': 'yt-dlp/yt-dlp@2023.03.04',
}],
'musllinux': [{
@@ -175,7 +174,6 @@ jobs:
exe.setdefault('qemu_platform', None)
exe.setdefault('onefile', True)
exe.setdefault('onedir', True)
exe.setdefault('cache_requirements', False)
exe.setdefault('python_version', os.environ['PYTHON_VERSION'])
exe.setdefault('update_to', os.environ['UPDATE_TO'])
if not any(INPUTS.get(key) for key in EXE_MAP):
@@ -186,8 +184,11 @@ jobs:
f.write(f'matrix={json.dumps(matrix)}')
unix:
needs: process
name: unix
needs: [process]
if: inputs.unix
permissions:
contents: read
runs-on: ubuntu-latest
env:
CHANNEL: ${{ inputs.channel }}
@@ -196,11 +197,12 @@ jobs:
UPDATE_TO: yt-dlp/yt-dlp@2025.09.05
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0 # Needed for changelog
persist-credentials: false
- uses: actions/setup-python@v6
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.10"
@@ -229,7 +231,7 @@ jobs:
[[ "${version}" != "${downgraded_version}" ]]
- name: Upload artifacts
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: build-bin-${{ github.job }}
path: |
@@ -239,8 +241,10 @@ jobs:
linux:
name: ${{ matrix.os }} (${{ matrix.arch }})
needs: [process]
if: inputs.linux || inputs.linux_armv7l || inputs.musllinux
needs: process
permissions:
contents: read
runs-on: ${{ matrix.runner }}
strategy:
fail-fast: false
@@ -257,26 +261,16 @@ jobs:
SKIP_ONEFILE_BUILD: ${{ (!matrix.onefile && '1') || '' }}
steps:
- uses: actions/checkout@v5
- name: Cache requirements
if: matrix.cache_requirements
id: cache-venv
uses: actions/cache@v4
env:
SEGMENT_DOWNLOAD_TIMEOUT_MINS: 1
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
path: |
venv
key: cache-reqs-${{ matrix.os }}_${{ matrix.arch }}-${{ github.ref }}-${{ needs.process.outputs.timestamp }}
restore-keys: |
cache-reqs-${{ matrix.os }}_${{ matrix.arch }}-${{ github.ref }}-
cache-reqs-${{ matrix.os }}_${{ matrix.arch }}-
persist-credentials: false
- name: Set up QEMU
if: matrix.qemu_platform
uses: docker/setup-qemu-action@v3
uses: docker/setup-qemu-action@c7c53464625b32c7a7e944ae62b3e17d2b600130 # v3.7.0
with:
image: tonistiigi/binfmt:qemu-v10.0.4-56@sha256:30cc9a4d03765acac9be2ed0afc23af1ad018aed2c28ea4be8c2eb9afe03fbd1
cache-image: false
platforms: ${{ matrix.qemu_platform }}
- name: Build executable
@@ -300,7 +294,7 @@ jobs:
docker compose up --build --exit-code-from "${SERVICE}" "${SERVICE}"
- name: Upload artifacts
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: build-bin-${{ matrix.os }}_${{ matrix.arch }}
path: |
@@ -308,7 +302,8 @@ jobs:
compression-level: 0
macos:
needs: process
name: macos
needs: [process]
if: inputs.macos
permissions:
contents: read
@@ -320,21 +315,11 @@ jobs:
UPDATE_TO: yt-dlp/yt-dlp@2025.09.05
steps:
- uses: actions/checkout@v5
# NB: Building universal2 does not work with python from actions/setup-python
- name: Cache requirements
id: cache-venv
uses: actions/cache@v4
env:
SEGMENT_DOWNLOAD_TIMEOUT_MINS: 1
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
path: |
~/yt-dlp-build-venv
key: cache-reqs-${{ github.job }}-${{ github.ref }}-${{ needs.process.outputs.timestamp }}
restore-keys: |
cache-reqs-${{ github.job }}-${{ github.ref }}-
cache-reqs-${{ github.job }}-
persist-credentials: false
# NB: Building universal2 does not work with python from actions/setup-python
- name: Install Requirements
run: |
@@ -343,14 +328,14 @@ jobs:
brew uninstall --ignore-dependencies python3
python3 -m venv ~/yt-dlp-build-venv
source ~/yt-dlp-build-venv/bin/activate
python3 devscripts/install_deps.py --only-optional-groups --include-group build
python3 devscripts/install_deps.py --print --include-group pyinstaller > requirements.txt
python3 devscripts/install_deps.py --omit-default --include-extra build
python3 devscripts/install_deps.py --print --include-extra pyinstaller > requirements.txt
# We need to ignore wheels otherwise we break universal2 builds
python3 -m pip install -U --no-binary :all: -r requirements.txt
# We need to fuse our own universal2 wheels for curl_cffi
python3 -m pip install -U 'delocate==0.11.0'
mkdir curl_cffi_whls curl_cffi_universal2
python3 devscripts/install_deps.py --print --only-optional-groups --include-group curl-cffi > requirements.txt
python3 devscripts/install_deps.py --print --omit-default --include-extra build-curl-cffi > requirements.txt
for platform in "macosx_11_0_arm64" "macosx_11_0_x86_64"; do
python3 -m pip download \
--only-binary=:all: \
@@ -399,7 +384,7 @@ jobs:
[[ "$version" != "$downgraded_version" ]]
- name: Upload artifacts
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: build-bin-${{ github.job }}
path: |
@@ -409,7 +394,7 @@ jobs:
windows:
name: windows (${{ matrix.arch }})
needs: process
needs: [process]
if: inputs.windows
permissions:
contents: read
@@ -422,23 +407,23 @@ jobs:
runner: windows-2025
python_version: '3.10'
platform_tag: win_amd64
pyi_version: '6.16.0'
pyi_tag: '2025.09.13.221251'
pyi_hash: b6496c7630c3afe66900cfa824e8234a8c2e2c81704bd7facd79586abc76c0e5
pyi_version: '6.18.0'
pyi_tag: '2026.01.29.160356'
pyi_hash: bb9cd0b0b233e4d031a295211cb8aa7c7f8b3c12ff33f1d57a40849ab4d3cf42
- arch: 'x86'
runner: windows-2025
python_version: '3.10'
platform_tag: win32
pyi_version: '6.16.0'
pyi_tag: '2025.09.13.221251'
pyi_hash: 2d881843580efdc54f3523507fc6d9c5b6051ee49c743a6d9b7003ac5758c226
pyi_version: '6.18.0'
pyi_tag: '2026.01.29.160356'
pyi_hash: aa8f260e735d94f1e2e1aac42e322f508eb54d0433de803c2998c337f72045e4
- arch: 'arm64'
runner: windows-11-arm
python_version: '3.13' # arm64 only has Python >= 3.11 available
platform_tag: win_arm64
pyi_version: '6.16.0'
pyi_tag: '2025.09.13.221251'
pyi_hash: 4250c9085e34a95c898f3ee2f764914fc36ec59f0d97c28e6a75fcf21f7b144f
pyi_version: '6.18.0'
pyi_tag: '2026.01.29.160356'
pyi_hash: 4bbca67d0cdfa860d92ac9cc7e4c2586fd393d1e814e3f1375b8c62d5cfb6771
env:
CHANNEL: ${{ inputs.channel }}
ORIGIN: ${{ needs.process.outputs.origin }}
@@ -450,26 +435,15 @@ jobs:
PYI_WHEEL: pyinstaller-${{ matrix.pyi_version }}-py3-none-${{ matrix.platform_tag }}.whl
steps:
- uses: actions/checkout@v5
- uses: actions/setup-python@v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: ${{ matrix.python_version }}
architecture: ${{ matrix.arch }}
- name: Cache requirements
id: cache-venv
if: matrix.arch == 'arm64'
uses: actions/cache@v4
env:
SEGMENT_DOWNLOAD_TIMEOUT_MINS: 1
with:
path: |
/yt-dlp-build-venv
key: ${{ env.BASE_CACHE_KEY }}-${{ github.ref }}-${{ needs.process.outputs.timestamp }}
restore-keys: |
${{ env.BASE_CACHE_KEY }}-${{ github.ref }}-
${{ env.BASE_CACHE_KEY }}-
- name: Install Requirements
env:
ARCH: ${{ matrix.arch }}
@@ -477,6 +451,8 @@ jobs:
PYI_HASH: ${{ matrix.pyi_hash }}
shell: pwsh
run: |
$ErrorActionPreference = "Stop"
$PSNativeCommandUseErrorActionPreference = $true
python -m venv /yt-dlp-build-venv
/yt-dlp-build-venv/Scripts/Activate.ps1
python -m pip install -U pip
@@ -484,22 +460,26 @@ jobs:
mkdir /pyi-wheels
python -m pip download -d /pyi-wheels --no-deps --require-hashes "pyinstaller@${Env:PYI_URL}#sha256=${Env:PYI_HASH}"
python -m pip install --force-reinstall -U "/pyi-wheels/${Env:PYI_WHEEL}"
python devscripts/install_deps.py --only-optional-groups --include-group build
python devscripts/install_deps.py --omit-default --include-extra build
if ("${Env:ARCH}" -eq "x86") {
python devscripts/install_deps.py
} else {
python devscripts/install_deps.py --include-group curl-cffi
python devscripts/install_deps.py --include-extra build-curl-cffi
}
- name: Prepare
shell: pwsh
run: |
$ErrorActionPreference = "Stop"
$PSNativeCommandUseErrorActionPreference = $true
python devscripts/update-version.py -c "${Env:CHANNEL}" -r "${Env:ORIGIN}" "${Env:VERSION}"
python devscripts/make_lazy_extractors.py
- name: Build
shell: pwsh
run: |
$ErrorActionPreference = "Stop"
$PSNativeCommandUseErrorActionPreference = $true
/yt-dlp-build-venv/Scripts/Activate.ps1
python -m bundle.pyinstaller
python -m bundle.pyinstaller --onedir
@@ -509,6 +489,8 @@ jobs:
if: vars.UPDATE_TO_VERIFICATION
shell: pwsh
run: |
$ErrorActionPreference = "Stop"
$PSNativeCommandUseErrorActionPreference = $true
$name = "yt-dlp${Env:SUFFIX}"
Copy-Item "./dist/${name}.exe" "./dist/${name}_downgraded.exe"
$version = & "./dist/${name}.exe" --version
@@ -519,7 +501,7 @@ jobs:
}
- name: Upload artifacts
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: build-bin-${{ github.job }}-${{ matrix.arch }}
path: |
@@ -528,23 +510,25 @@ jobs:
compression-level: 0
meta_files:
if: always() && !cancelled()
name: Metadata files
needs:
- process
- unix
- linux
- macos
- windows
if: always() && !failure() && !cancelled()
runs-on: ubuntu-latest
steps:
- name: Download artifacts
uses: actions/download-artifact@v5
uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 # v7.0.0
with:
path: artifact
pattern: build-bin-*
merge-multiple: true
- name: Make SHA2-SUMS files
shell: bash
run: |
cd ./artifact/
# make sure SHA sums are also printed to stdout
@@ -600,13 +584,13 @@ jobs:
GPG_SIGNING_KEY: ${{ secrets.GPG_SIGNING_KEY }}
if: env.GPG_SIGNING_KEY
run: |
gpg --batch --import <<< "${{ secrets.GPG_SIGNING_KEY }}"
gpg --batch --import <<< "${GPG_SIGNING_KEY}"
for signfile in ./SHA*SUMS; do
gpg --batch --detach-sign "$signfile"
done
- name: Upload artifacts
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: build-${{ github.job }}
path: |

View File

@@ -1,23 +0,0 @@
name: Keep cache warm
on:
workflow_dispatch:
schedule:
- cron: '0 22 1,6,11,16,21,27 * *'
jobs:
build:
if: |
vars.KEEP_CACHE_WARM || github.event_name == 'workflow_dispatch'
uses: ./.github/workflows/build.yml
with:
version: '999999'
channel: stable
origin: ${{ github.repository }}
unix: false
linux: false
linux_armv7l: true
musllinux: false
macos: true
windows: true
permissions:
contents: read

View File

@@ -16,8 +16,8 @@ on:
- yt_dlp/extractor/youtube/jsc/**.py
- yt_dlp/extractor/youtube/pot/**.py
- yt_dlp/utils/_jsruntime.py
permissions:
contents: read
permissions: {}
concurrency:
group: challenge-tests-${{ github.event.pull_request.number || github.ref }}
@@ -26,6 +26,8 @@ concurrency:
jobs:
tests:
name: Challenge Tests
permissions:
contents: read
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
@@ -35,26 +37,30 @@ jobs:
env:
QJS_VERSION: '2025-04-26' # Earliest version with rope strings
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v6
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: ${{ matrix.python-version }}
- name: Install Deno
uses: denoland/setup-deno@v2
uses: denoland/setup-deno@e95548e56dfa95d4e1a28d6f422fafe75c4c26fb # v2.0.3
with:
deno-version: '2.0.0' # minimum supported version
- name: Install Bun
uses: oven-sh/setup-bun@v2
uses: oven-sh/setup-bun@3d267786b128fe76c2f16a390aa2448b815359f3 # v2.1.2
with:
# minimum supported version is 1.0.31 but earliest available Windows version is 1.1.0
bun-version: ${{ (matrix.os == 'windows-latest' && '1.1.0') || '1.0.31' }}
no-cache: true
- name: Install Node
uses: actions/setup-node@v6
uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # v6.2.0
with:
node-version: '20.0' # minimum supported version
- name: Install QuickJS (Linux)
if: matrix.os == 'ubuntu-latest'
shell: bash
run: |
wget "https://bellard.org/quickjs/binary_releases/quickjs-linux-x86_64-${QJS_VERSION}.zip" -O quickjs.zip
unzip quickjs.zip qjs
@@ -63,15 +69,19 @@ jobs:
if: matrix.os == 'windows-latest'
shell: pwsh
run: |
$ErrorActionPreference = "Stop"
$PSNativeCommandUseErrorActionPreference = $true
Invoke-WebRequest "https://bellard.org/quickjs/binary_releases/quickjs-win-x86_64-${Env:QJS_VERSION}.zip" -OutFile quickjs.zip
unzip quickjs.zip
- name: Install test requirements
shell: bash
run: |
python ./devscripts/install_deps.py --print --only-optional-groups --include-group test > requirements.txt
python ./devscripts/install_deps.py --print --omit-default --include-extra test > requirements.txt
python ./devscripts/install_deps.py --print -c certifi -c requests -c urllib3 -c yt-dlp-ejs >> requirements.txt
python -m pip install -U -r requirements.txt
- name: Run tests
timeout-minutes: 15
shell: bash
run: |
python -m yt_dlp -v --js-runtimes node --js-runtimes bun --js-runtimes quickjs || true
python ./devscripts/run_tests.py test/test_jsc -k download

View File

@@ -2,64 +2,46 @@ name: "CodeQL"
on:
push:
branches: [ 'master', 'gh-pages', 'release' ]
branches: [ 'master' ]
pull_request:
# The branches below must be a subset of the branches above
branches: [ 'master' ]
schedule:
- cron: '59 11 * * 5'
permissions: {}
concurrency:
group: codeql-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
jobs:
analyze:
name: Analyze
name: Analyze (${{ matrix.language }})
runs-on: ubuntu-latest
permissions:
actions: read
actions: read # Needed by github/codeql-action if repository is private
contents: read
security-events: write
security-events: write # Needed to use github/codeql-action with Github Advanced Security
strategy:
fail-fast: false
matrix:
language: [ 'python' ]
# CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ]
# Use only 'java' to analyze code written in Java, Kotlin or both
# Use only 'javascript' to analyze code written in JavaScript, TypeScript or both
# Learn more about CodeQL language support at https://aka.ms/codeql-docs/language-support
language: [ 'actions', 'javascript-typescript', 'python' ]
steps:
- name: Checkout repository
uses: actions/checkout@v5
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
uses: github/codeql-action/init@5d4e8d1aca955e8d8589aabd499c5cae939e33c7 # v4.31.9
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.
# For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
# queries: security-extended,security-and-quality
# Autobuild attempts to build any compiled languages (C/C++, C#, Go, Java, or Swift).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@v3
# Command-line programs to run using the OS shell.
# 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
# If the Autobuild fails above, remove it and uncomment the following three lines.
# modify them (or add more) to build your code if your project, please refer to the EXAMPLE below for guidance.
# - run: |
# echo "Run, Build Application using script"
# ./location_of_script_within_repo/buildscript.sh
build-mode: none
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
uses: github/codeql-action/analyze@5d4e8d1aca955e8d8589aabd499c5cae939e33c7 # v4.31.9
with:
category: "/language:${{matrix.language}}"

View File

@@ -22,8 +22,8 @@ on:
- yt_dlp/extractor/__init__.py
- yt_dlp/extractor/common.py
- yt_dlp/extractor/extractors.py
permissions:
contents: read
permissions: {}
concurrency:
group: core-${{ github.event.pull_request.number || github.ref }}
@@ -32,7 +32,9 @@ concurrency:
jobs:
tests:
name: Core Tests
if: "!contains(github.event.head_commit.message, 'ci skip')"
if: ${{ !contains(github.event.head_commit.message, 'ci skip') }}
permissions:
contents: read
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
@@ -55,15 +57,16 @@ jobs:
- os: windows-latest
python-version: pypy-3.11
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
persist-credentials: false
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v6
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: ${{ matrix.python-version }}
- name: Install test requirements
run: python ./devscripts/install_deps.py --include-group test --include-group curl-cffi
run: python ./devscripts/install_deps.py --include-extra test --include-extra curl-cffi
- name: Run tests
timeout-minutes: 15
continue-on-error: False

View File

@@ -1,48 +0,0 @@
name: Download Tests
on: [push, pull_request]
permissions:
contents: read
jobs:
quick:
name: Quick Download Tests
if: "contains(github.event.head_commit.message, 'ci run dl')"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.10'
- name: Install test requirements
run: python ./devscripts/install_deps.py --include-group dev
- name: Run tests
continue-on-error: true
run: python ./devscripts/run_tests.py download
full:
name: Full Download Tests
if: "contains(github.event.head_commit.message, 'ci run dl all')"
runs-on: ${{ matrix.os }}
strategy:
fail-fast: true
matrix:
os: [ubuntu-latest]
python-version: ['3.11', '3.12', '3.13', '3.14', pypy-3.11]
include:
# atleast one of each CPython/PyPy tests must be in windows
- os: windows-latest
python-version: '3.10'
- os: windows-latest
python-version: pypy-3.11
steps:
- uses: actions/checkout@v5
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v6
with:
python-version: ${{ matrix.python-version }}
- name: Install test requirements
run: python ./devscripts/install_deps.py --include-group dev
- name: Run tests
continue-on-error: true
run: python ./devscripts/run_tests.py download

View File

@@ -3,13 +3,14 @@ on:
issues:
types: [opened]
permissions:
issues: write
permissions: {}
jobs:
lockdown:
name: Issue Lockdown
if: vars.ISSUE_LOCKDOWN
permissions:
issues: write # Needed to lock issues
runs-on: ubuntu-latest
steps:
- name: "Lock new issue"

View File

@@ -1,37 +1,51 @@
name: Quick Test
on: [push, pull_request]
permissions:
contents: read
permissions: {}
concurrency:
group: quick-test-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
jobs:
tests:
name: Core Test
if: "!contains(github.event.head_commit.message, 'ci skip all')"
if: ${{ !contains(github.event.head_commit.message, 'ci skip all') }}
permissions:
contents: read
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- name: Set up Python 3.10
uses: actions/setup-python@v6
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: '3.10'
- name: Install test requirements
run: python ./devscripts/install_deps.py --only-optional-groups --include-group test
shell: bash
run: python ./devscripts/install_deps.py --omit-default --include-extra test
- name: Run tests
timeout-minutes: 15
shell: bash
run: |
python3 -m yt_dlp -v || true
python3 ./devscripts/run_tests.py --pytest-args '--reruns 2 --reruns-delay 3.0' core
check:
name: Code check
if: "!contains(github.event.head_commit.message, 'ci skip all')"
if: ${{ !contains(github.event.head_commit.message, 'ci skip all') }}
permissions:
contents: read
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- uses: actions/setup-python@v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: '3.10'
- name: Install dev dependencies
run: python ./devscripts/install_deps.py --only-optional-groups --include-group static-analysis
run: python ./devscripts/install_deps.py --omit-default --include-extra static-analysis
- name: Make lazy extractors
run: python ./devscripts/make_lazy_extractors.py
- name: Run ruff
@@ -39,4 +53,5 @@ jobs:
- name: Run autopep8
run: autopep8 --diff .
- name: Check file mode
shell: bash
run: git ls-files --format="%(objectmode) %(path)" yt_dlp/ | ( ! grep -v "^100644" )

View File

@@ -14,35 +14,39 @@ on:
- ".github/workflows/release-master.yml"
concurrency:
group: release-master
permissions:
contents: read
permissions: {}
jobs:
release:
name: Publish Github release
if: vars.BUILD_MASTER
permissions:
contents: write # May be needed to publish release
id-token: write # Needed for trusted publishing
uses: ./.github/workflows/release.yml
with:
prerelease: true
source: ${{ (github.repository != 'yt-dlp/yt-dlp' && vars.MASTER_ARCHIVE_REPO) || 'master' }}
target: 'master'
permissions:
contents: write
id-token: write # mandatory for trusted publishing
secrets: inherit
secrets:
ARCHIVE_REPO_TOKEN: ${{ secrets.ARCHIVE_REPO_TOKEN }}
GPG_SIGNING_KEY: ${{ secrets.GPG_SIGNING_KEY }}
publish_pypi:
name: Publish to PyPI
needs: [release]
if: vars.MASTER_PYPI_PROJECT
runs-on: ubuntu-latest
permissions:
id-token: write # mandatory for trusted publishing
id-token: write # Needed for trusted publishing
runs-on: ubuntu-latest
steps:
- name: Download artifacts
uses: actions/download-artifact@v5
uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 # v7.0.0
with:
path: dist
name: build-pypi
- name: Publish to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
uses: pypa/gh-action-pypi-publish@ed0c53931b1dc9bd32cbe73a98c7f6766f8a527e # v1.13.0
with:
verbose: true

View File

@@ -2,21 +2,43 @@ name: Release (nightly)
on:
schedule:
- cron: '23 23 * * *'
permissions:
contents: read
workflow_dispatch:
permissions: {}
jobs:
check_nightly:
if: vars.BUILD_NIGHTLY
name: Check for new commits
if: github.event_name == 'workflow_dispatch' || vars.BUILD_NIGHTLY
permissions:
contents: read
runs-on: ubuntu-latest
outputs:
commit: ${{ steps.check_for_new_commits.outputs.commit }}
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
persist-credentials: false
- name: Retrieve HEAD commit hash
id: head
shell: bash
run: echo "head=$(git rev-parse HEAD)" | tee -a "${GITHUB_OUTPUT}"
- name: Cache nightly commit hash
uses: actions/cache@8b402f58fbc84540c8b491a91e594a4576fec3d7 # v5.0.2
env:
SEGMENT_DOWNLOAD_TIMEOUT_MINS: 1
with:
path: .nightly_commit_hash
key: release-nightly-${{ steps.head.outputs.head }}
restore-keys: |
release-nightly-
- name: Check for new commits
id: check_for_new_commits
shell: bash
run: |
relevant_files=(
"yt_dlp/*.py"
@@ -30,34 +52,54 @@ jobs:
".github/workflows/release.yml"
".github/workflows/release-nightly.yml"
)
echo "commit=$(git log --format=%H -1 --since="24 hours ago" -- "${relevant_files[@]}")" | tee "$GITHUB_OUTPUT"
if [[ -f .nightly_commit_hash ]]; then
limit_args=(
"$(cat .nightly_commit_hash)..HEAD"
)
else
limit_args=(
--since="24 hours ago"
)
fi
echo "commit=$(git log --format=%H -1 "${limit_args[@]}" -- "${relevant_files[@]}")" | tee -a "${GITHUB_OUTPUT}"
- name: Record new nightly commit hash
env:
HEAD: ${{ steps.head.outputs.head }}
shell: bash
run: echo "${HEAD}" | tee .nightly_commit_hash
release:
name: Publish Github release
needs: [check_nightly]
if: ${{ needs.check_nightly.outputs.commit }}
if: needs.check_nightly.outputs.commit
permissions:
contents: write # May be needed to publish release
id-token: write # Needed for trusted publishing
uses: ./.github/workflows/release.yml
with:
prerelease: true
source: ${{ (github.repository != 'yt-dlp/yt-dlp' && vars.NIGHTLY_ARCHIVE_REPO) || 'nightly' }}
target: 'nightly'
permissions:
contents: write
id-token: write # mandatory for trusted publishing
secrets: inherit
secrets:
ARCHIVE_REPO_TOKEN: ${{ secrets.ARCHIVE_REPO_TOKEN }}
GPG_SIGNING_KEY: ${{ secrets.GPG_SIGNING_KEY }}
publish_pypi:
name: Publish to PyPI
needs: [release]
if: vars.NIGHTLY_PYPI_PROJECT
runs-on: ubuntu-latest
permissions:
id-token: write # mandatory for trusted publishing
id-token: write # Needed for trusted publishing
runs-on: ubuntu-latest
steps:
- name: Download artifacts
uses: actions/download-artifact@v5
uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 # v7.0.0
with:
path: dist
name: build-pypi
- name: Publish to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
uses: pypa/gh-action-pypi-publish@ed0c53931b1dc9bd32cbe73a98c7f6766f8a527e # v1.13.0
with:
verbose: true

View File

@@ -22,6 +22,11 @@ on:
required: false
default: true
type: boolean
secrets:
ARCHIVE_REPO_TOKEN:
required: false
GPG_SIGNING_KEY:
required: false
workflow_dispatch:
inputs:
source:
@@ -56,30 +61,30 @@ on:
default: false
type: boolean
permissions:
contents: read
permissions: {}
jobs:
prepare:
name: Prepare
permissions:
contents: write
contents: write # Needed to git-push the release commit
runs-on: ubuntu-latest
outputs:
channel: ${{ steps.setup_variables.outputs.channel }}
version: ${{ steps.setup_variables.outputs.version }}
target_repo: ${{ steps.setup_variables.outputs.target_repo }}
target_repo_token: ${{ steps.setup_variables.outputs.target_repo_token }}
target_tag: ${{ steps.setup_variables.outputs.target_tag }}
pypi_project: ${{ steps.setup_variables.outputs.pypi_project }}
pypi_suffix: ${{ steps.setup_variables.outputs.pypi_suffix }}
head_sha: ${{ steps.get_target.outputs.head_sha }}
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
persist-credentials: true # Needed to git-push the release commit
- uses: actions/setup-python@v6
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.10" # Keep this in sync with test-workflows.yml
@@ -104,8 +109,6 @@ jobs:
TARGET_PYPI_SUFFIX: ${{ vars[format('{0}_pypi_suffix', steps.process_inputs.outputs.target_repo)] }}
SOURCE_ARCHIVE_REPO: ${{ vars[format('{0}_archive_repo', steps.process_inputs.outputs.source_repo)] }}
TARGET_ARCHIVE_REPO: ${{ vars[format('{0}_archive_repo', steps.process_inputs.outputs.target_repo)] }}
HAS_SOURCE_ARCHIVE_REPO_TOKEN: ${{ !!secrets[format('{0}_archive_repo_token', steps.process_inputs.outputs.source_repo)] }}
HAS_TARGET_ARCHIVE_REPO_TOKEN: ${{ !!secrets[format('{0}_archive_repo_token', steps.process_inputs.outputs.target_repo)] }}
HAS_ARCHIVE_REPO_TOKEN: ${{ !!secrets.ARCHIVE_REPO_TOKEN }}
run: |
python -m devscripts.setup_variables
@@ -127,8 +130,7 @@ jobs:
VERSION: ${{ steps.setup_variables.outputs.version }}
GITHUB_EVENT_SENDER_LOGIN: ${{ github.event.sender.login }}
GITHUB_EVENT_REF: ${{ github.event.ref }}
if: |
!inputs.prerelease && steps.setup_variables.outputs.target_repo == github.repository
if: steps.setup_variables.outputs.target_repo == github.repository && !inputs.prerelease
run: |
git config --global user.name "github-actions[bot]"
git config --global user.email "41898282+github-actions[bot]@users.noreply.github.com"
@@ -145,42 +147,45 @@ jobs:
- name: Update master
env:
GITHUB_EVENT_REF: ${{ github.event.ref }}
if: |
vars.PUSH_VERSION_COMMIT && !inputs.prerelease && steps.setup_variables.outputs.target_repo == github.repository
if: vars.PUSH_VERSION_COMMIT && !inputs.prerelease && steps.setup_variables.outputs.target_repo == github.repository
run: git push origin "${GITHUB_EVENT_REF}"
build:
needs: prepare
name: Build
needs: [prepare]
permissions:
contents: read
uses: ./.github/workflows/build.yml
with:
version: ${{ needs.prepare.outputs.version }}
channel: ${{ needs.prepare.outputs.channel }}
origin: ${{ needs.prepare.outputs.target_repo }}
linux_armv7l: ${{ inputs.linux_armv7l }}
permissions:
contents: read
secrets:
GPG_SIGNING_KEY: ${{ secrets.GPG_SIGNING_KEY }}
publish_pypi:
name: Publish to PyPI
needs: [prepare, build]
if: ${{ needs.prepare.outputs.pypi_project }}
runs-on: ubuntu-latest
if: needs.prepare.outputs.pypi_project
permissions:
id-token: write # mandatory for trusted publishing
contents: read
id-token: write # Needed for trusted publishing
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
- uses: actions/setup-python@v6
fetch-depth: 0 # Needed for changelog
persist-credentials: false
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.10"
- name: Install Requirements
run: |
sudo apt -y install pandoc man
python devscripts/install_deps.py --only-optional-groups --include-group build
python devscripts/install_deps.py --omit-default --include-extra build
- name: Prepare
env:
@@ -208,8 +213,8 @@ jobs:
python -m build --no-isolation .
- name: Upload artifacts
if: github.event_name != 'workflow_dispatch'
uses: actions/upload-artifact@v4
if: github.event.workflow != '.github/workflows/release.yml' # Reusable workflow_call
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: build-pypi
path: |
@@ -217,15 +222,16 @@ jobs:
compression-level: 0
- name: Publish to PyPI
if: github.event_name == 'workflow_dispatch'
uses: pypa/gh-action-pypi-publish@release/v1
if: github.event.workflow == '.github/workflows/release.yml' # Direct workflow_dispatch
uses: pypa/gh-action-pypi-publish@ed0c53931b1dc9bd32cbe73a98c7f6766f8a527e # v1.13.0
with:
verbose: true
publish:
name: Publish Github release
needs: [prepare, build]
permissions:
contents: write
contents: write # Needed by gh to publish release to Github
runs-on: ubuntu-latest
env:
TARGET_REPO: ${{ needs.prepare.outputs.target_repo }}
@@ -233,15 +239,16 @@ jobs:
VERSION: ${{ needs.prepare.outputs.version }}
HEAD_SHA: ${{ needs.prepare.outputs.head_sha }}
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
- uses: actions/download-artifact@v5
persist-credentials: false
- uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 # v7.0.0
with:
path: artifact
pattern: build-*
merge-multiple: true
- uses: actions/setup-python@v6
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.10"
@@ -282,12 +289,11 @@ jobs:
- name: Publish to archive repo
env:
GH_TOKEN: ${{ secrets[needs.prepare.outputs.target_repo_token] }}
GH_TOKEN: ${{ secrets.ARCHIVE_REPO_TOKEN }}
GH_REPO: ${{ needs.prepare.outputs.target_repo }}
TITLE_PREFIX: ${{ startswith(env.TARGET_REPO, 'yt-dlp/') && 'yt-dlp ' || '' }}
TITLE: ${{ inputs.target != env.TARGET_REPO && inputs.target || needs.prepare.outputs.channel }}
if: |
inputs.prerelease && env.GH_TOKEN && env.GH_REPO && env.GH_REPO != github.repository
if: inputs.prerelease && env.GH_TOKEN && env.GH_REPO && env.GH_REPO != github.repository
run: |
gh release create \
--notes-file ARCHIVE_NOTES \
@@ -298,8 +304,7 @@ jobs:
- name: Prune old release
env:
GH_TOKEN: ${{ github.token }}
if: |
env.TARGET_REPO == github.repository && env.TARGET_TAG != env.VERSION
if: env.TARGET_REPO == github.repository && env.TARGET_TAG != env.VERSION
run: |
gh release delete --yes --cleanup-tag "${TARGET_TAG}" || true
git tag --delete "${TARGET_TAG}" || true
@@ -312,8 +317,7 @@ jobs:
TITLE_PREFIX: ${{ github.repository == 'yt-dlp/yt-dlp' && 'yt-dlp ' || '' }}
TITLE: ${{ env.TARGET_TAG != env.VERSION && format('{0} ', env.TARGET_TAG) || '' }}
PRERELEASE: ${{ inputs.prerelease && '1' || '0' }}
if: |
env.TARGET_REPO == github.repository
if: env.TARGET_REPO == github.repository
run: |
gh_options=(
--notes-file "${NOTES_FILE}"

View File

@@ -4,14 +4,15 @@ on:
issue_comment:
types: [created, edited]
permissions:
issues: write
permissions: {}
jobs:
sanitize-comment:
name: Sanitize comment
if: vars.SANITIZE_COMMENT && !github.event.issue.pull_request
permissions:
issues: write # Needed by yt-dlp/sanitize-comment to edit comments
runs-on: ubuntu-latest
steps:
- name: Sanitize comment
uses: yt-dlp/sanitize-comment@v1
uses: yt-dlp/sanitize-comment@4536c691101b89f5373d50fe8a7980cae146346b # v1.0.0

View File

@@ -1,40 +1,54 @@
name: Test and lint workflows
on:
push:
branches: [master]
paths:
- .github/*.yml
- .github/workflows/*
- bundle/docker/linux/*.sh
- devscripts/setup_variables.py
- devscripts/setup_variables_tests.py
- devscripts/utils.py
pull_request:
branches: [master]
paths:
- .github/*.yml
- .github/workflows/*
- bundle/docker/linux/*.sh
- devscripts/setup_variables.py
- devscripts/setup_variables_tests.py
- devscripts/utils.py
permissions:
contents: read
permissions: {}
concurrency:
group: test-workflows-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
env:
ACTIONLINT_VERSION: "1.7.8"
ACTIONLINT_SHA256SUM: be92c2652ab7b6d08425428797ceabeb16e31a781c07bc388456b4e592f3e36a
ACTIONLINT_VERSION: "1.7.9"
ACTIONLINT_SHA256SUM: 233b280d05e100837f4af1433c7b40a5dcb306e3aa68fb4f17f8a7f45a7df7b4
ACTIONLINT_REPO: https://github.com/rhysd/actionlint
jobs:
check:
name: Check workflows
permissions:
contents: read
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- uses: actions/setup-python@v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.10" # Keep this in sync with release.yml's prepare job
- name: Install requirements
env:
ACTIONLINT_TARBALL: ${{ format('actionlint_{0}_linux_amd64.tar.gz', env.ACTIONLINT_VERSION) }}
shell: bash
run: |
python -m devscripts.install_deps --only-optional-groups --include-group test
python -m devscripts.install_deps --omit-default --include-extra test
sudo apt -y install shellcheck
python -m pip install -U pyflakes
curl -LO "${ACTIONLINT_REPO}/releases/download/v${ACTIONLINT_VERSION}/${ACTIONLINT_TARBALL}"
@@ -50,3 +64,20 @@ jobs:
- name: Test GHA devscripts
run: |
pytest -Werror --tb=short --color=yes devscripts/setup_variables_tests.py
zizmor:
name: Run zizmor
permissions:
contents: read
actions: read # Needed by zizmorcore/zizmor-action if repository is private
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
persist-credentials: false
- name: Run zizmor
uses: zizmorcore/zizmor-action@135698455da5c3b3e55f73f4419e481ab68cdd95 # v0.4.1
with:
advanced-security: false
persona: pedantic
version: v1.22.0

.github/zizmor.yml (new file)
View File

@@ -0,0 +1,15 @@
rules:
concurrency-limits:
ignore:
- build.yml # Can only be triggered by maintainers or cronjob
- issue-lockdown.yml # It *should* run for *every* new issue
- release-nightly.yml # Can only be triggered by once-daily cronjob
- release.yml # Can only be triggered by maintainers or cronjob
- sanitize-comment.yml # It *should* run for *every* new comment/edit
obfuscation:
ignore:
- release.yml # Not actual obfuscation
unpinned-uses:
config:
policies:
"*": hash-pin


@@ -177,7 +177,7 @@ While it is strongly recommended to use `hatch` for yt-dlp development, if you a
```shell
# To only install development dependencies:
$ python -m devscripts.install_deps --include-group dev
$ python -m devscripts.install_deps --include-extra dev
# Or, for an editable install plus dev dependencies:
$ python -m pip install -e ".[default,dev]"
@@ -763,7 +763,7 @@ Wrap all extracted numeric data into safe functions from [`yt_dlp/utils/`](yt_dl
Use `url_or_none` for safe URL processing.
Use `traverse_obj` and `try_call` (superseeds `dict_get` and `try_get`) for safe metadata extraction from parsed JSON.
Use `traverse_obj` and `try_call` (supersedes `dict_get` and `try_get`) for safe metadata extraction from parsed JSON.
Use `unified_strdate` for uniform `upload_date` or any `YYYYMMDD` meta field extraction, `unified_timestamp` for uniform `timestamp` extraction, `parse_filesize` for `filesize` extraction, `parse_count` for count meta fields extraction, `parse_resolution`, `parse_duration` for `duration` extraction, `parse_age_limit` for `age_limit` extraction.


@@ -828,9 +828,37 @@ krystophny
matyb08
pha1n0q
PierceLBrooks
sepro
TheQWERTYCodr
thomasmllt
w4grfw
WeidiDeng
Zer0spectrum
0xvd
1bnBattuta
beliote
darkstar
Haytam001
mrFlamel
oxyzenQ
putridambassador121
RezSat
WhatAmISupposedToPutHere
0xced
4elta
alch-emi
AlexBocken
cesbar
clayote
JV-Fernandes
legraphista
Mivik
nlurker
norepro
olipfei
pomtnp
prettysunflower
ptlydpr
quietvoid
romainreignier
Sytm
zahlman


@@ -4,6 +4,168 @@
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
-->
### 2026.01.29
#### Core changes
- [Accept float values for `--sleep-subtitles`](https://github.com/yt-dlp/yt-dlp/commit/f6dc7d5279bcb7f29839c700d54ac148b332d208) ([#15282](https://github.com/yt-dlp/yt-dlp/issues/15282)) by [0xvd](https://github.com/0xvd)
- [Add `--compat-options 2025`](https://github.com/yt-dlp/yt-dlp/commit/5382c6c81bb22a382e46adb646e1379ccfc462b6) ([#15499](https://github.com/yt-dlp/yt-dlp/issues/15499)) by [bashonly](https://github.com/bashonly)
- [Add `--format-sort-reset` option](https://github.com/yt-dlp/yt-dlp/commit/b16b06378a0805430699131ca6b786f971ae05b5) ([#13809](https://github.com/yt-dlp/yt-dlp/issues/13809)) by [nihil-admirari](https://github.com/nihil-admirari)
- [Bypass interactive format selection if no formats are found](https://github.com/yt-dlp/yt-dlp/commit/e0bb4777328a7d1eb96f2d0256fa33ae06b5930d) ([#15278](https://github.com/yt-dlp/yt-dlp/issues/15278)) by [bashonly](https://github.com/bashonly)
- [Fix `--parse-metadata` when `TO` is a single field name](https://github.com/yt-dlp/yt-dlp/commit/cec1f1df792fe521fff2d5ca54b5c70094b3d96a) ([#14577](https://github.com/yt-dlp/yt-dlp/issues/14577)) by [bashonly](https://github.com/bashonly), [clayote](https://github.com/clayote)
- [Fix concurrent formats downloading to stdout](https://github.com/yt-dlp/yt-dlp/commit/5bf91072bcfbb26e6618d668a0b3379a3a862f8c) ([#15617](https://github.com/yt-dlp/yt-dlp/issues/15617)) by [grqz](https://github.com/grqz)
- [Fix interactive format/video selection when downloading to stdout](https://github.com/yt-dlp/yt-dlp/commit/1829a53a543e63bf0391da572cefcd2526c0a806) ([#15626](https://github.com/yt-dlp/yt-dlp/issues/15626)) by [grqz](https://github.com/grqz)
- [Support Deno installed via Python package](https://github.com/yt-dlp/yt-dlp/commit/dde5eab3b3a356449b5c8c09506553b1c2842953) ([#15614](https://github.com/yt-dlp/yt-dlp/issues/15614)) by [bashonly](https://github.com/bashonly), [zahlman](https://github.com/zahlman)
- **utils**
- `decode_packed_codes`: [Fix missing key handling](https://github.com/yt-dlp/yt-dlp/commit/f24b9ac0c94aff3311ab0b935ce8103b5a3faeb1) ([#15440](https://github.com/yt-dlp/yt-dlp/issues/15440)) by [cesbar](https://github.com/cesbar)
- `devalue`: [Fix calling reviver on cached value](https://github.com/yt-dlp/yt-dlp/commit/ede54330fb38866936c63ebb96c490a2d4b1b58c) ([#15568](https://github.com/yt-dlp/yt-dlp/issues/15568)) by [Grub4K](https://github.com/Grub4K)
- `js_to_json`: [Prevent false positives for octals](https://github.com/yt-dlp/yt-dlp/commit/4d4c7e1c6930861f8388ce3cdd7a5335bf860e7d) ([#15474](https://github.com/yt-dlp/yt-dlp/issues/15474)) by [doe1080](https://github.com/doe1080)
- `mimetype2ext`: [Recognize more srt types](https://github.com/yt-dlp/yt-dlp/commit/c0a7c594a9e67ac2ee4cde38fa4842a0b2d675e8) ([#15411](https://github.com/yt-dlp/yt-dlp/issues/15411)) by [seproDev](https://github.com/seproDev)
- `random_user_agent`: [Bump versions](https://github.com/yt-dlp/yt-dlp/commit/9bf040dc6f348bf22abc71233446a0a5017e613c) ([#15396](https://github.com/yt-dlp/yt-dlp/issues/15396)) by [seproDev](https://github.com/seproDev)
- `unified_timestamp`: [Add `tz_offset` parameter](https://github.com/yt-dlp/yt-dlp/commit/15263d049cb3f47e921b414782490052feca3def) ([#15357](https://github.com/yt-dlp/yt-dlp/issues/15357)) by [doe1080](https://github.com/doe1080)
#### Extractor changes
- [Fix prioritization of Youtube URL matching](https://github.com/yt-dlp/yt-dlp/commit/e2ea6bd6ab639f910b99e55add18856974ff4c3a) ([#15596](https://github.com/yt-dlp/yt-dlp/issues/15596)) by [Grub4K](https://github.com/Grub4K)
- **archive.org**: [Fix metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/5f37f67d37b54bf9bd6fe7fa3083492d42f7a20a) ([#15286](https://github.com/yt-dlp/yt-dlp/issues/15286)) by [bashonly](https://github.com/bashonly)
- **bandcamp**: weekly: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/e9d4b22b9b09a30f31b557df740b01b09a8aefe8) ([#15208](https://github.com/yt-dlp/yt-dlp/issues/15208)) by [0xvd](https://github.com/0xvd), [bashonly](https://github.com/bashonly)
- **bigo**: [Support `--wait-for-video`](https://github.com/yt-dlp/yt-dlp/commit/5026548d65276732ec290751d97994e23bdecc20) ([#15463](https://github.com/yt-dlp/yt-dlp/issues/15463)) by [olipfei](https://github.com/olipfei)
- **boosty**: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/f9a06197f563a2ccadce2603e91ceec523e88d91) ([#15543](https://github.com/yt-dlp/yt-dlp/issues/15543)) by [Sytm](https://github.com/Sytm)
- **cbc**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/457dd036af907aa8b1b544b95311847abe470bf1) ([#15631](https://github.com/yt-dlp/yt-dlp/issues/15631)) by [subrat-lima](https://github.com/subrat-lima)
- **cda**: [Support mobile URLs](https://github.com/yt-dlp/yt-dlp/commit/6d92f87ddc40a31959097622ff01d4a7ca833a13) ([#15398](https://github.com/yt-dlp/yt-dlp/issues/15398)) by [seproDev](https://github.com/seproDev)
- **croatian.film**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/ba499ab0dcf2486d97f739e155264b305e0abd26) ([#15468](https://github.com/yt-dlp/yt-dlp/issues/15468)) by [0xvd](https://github.com/0xvd)
- **dailymotion**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/2b61a2a4b20b499d6497c9212207f72a52b922a6) ([#15682](https://github.com/yt-dlp/yt-dlp/issues/15682)) by [bashonly](https://github.com/bashonly) (With fixes in [a893774](https://github.com/yt-dlp/yt-dlp/commit/a8937740969b60df1c2a634e58ab959352c9504c))
- **dropbox**: [Support videos in folders](https://github.com/yt-dlp/yt-dlp/commit/8a4b626daf59d0ecb6117ed275cb43dd68768b85) ([#15313](https://github.com/yt-dlp/yt-dlp/issues/15313)) by [0xvd](https://github.com/0xvd)
- **errarhiiv**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/1c739bf53e673e06d2a43feddb5a31ee8496fa6e) ([#15667](https://github.com/yt-dlp/yt-dlp/issues/15667)) by [rdamas](https://github.com/rdamas)
- **facebook**
- [Remove broken login support](https://github.com/yt-dlp/yt-dlp/commit/2a7e048a60b76a245deeea734885bdce5e6571ae) ([#15434](https://github.com/yt-dlp/yt-dlp/issues/15434)) by [bashonly](https://github.com/bashonly)
- ads: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f8b3fe33f68495ade453602a201b33e3aa69ed1f) ([#15582](https://github.com/yt-dlp/yt-dlp/issues/15582)) by [legraphista](https://github.com/legraphista)
- **filmarchiv**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/c0c9cac55446f7bf48370ba60c06f9cf5bc48d15) ([#13490](https://github.com/yt-dlp/yt-dlp/issues/13490)) by [4elta](https://github.com/4elta)
- **franceinfo**
- [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/e08fdaaec2b253abb1e08899d1d13ec5072d76f2) ([#15704](https://github.com/yt-dlp/yt-dlp/issues/15704)) by [bashonly](https://github.com/bashonly)
- [Support new domain URLs](https://github.com/yt-dlp/yt-dlp/commit/ac3a566434c68cbf960dfb357c6c8a275e8bf8eb) ([#15669](https://github.com/yt-dlp/yt-dlp/issues/15669)) by [romainreignier](https://github.com/romainreignier)
- **generic**: [Improve detection of blockage due to TLS fingerprint](https://github.com/yt-dlp/yt-dlp/commit/cea825e7e0a1a93a1a355a86bbb2b9e77594f569) ([#15426](https://github.com/yt-dlp/yt-dlp/issues/15426)) by [bashonly](https://github.com/bashonly)
- **gofile**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/c5e55e04795636a2855a1be80cea0f6b2d0f0cc6) ([#15296](https://github.com/yt-dlp/yt-dlp/issues/15296)) by [quietvoid](https://github.com/quietvoid)
- **hotstar**: [Extract from new API](https://github.com/yt-dlp/yt-dlp/commit/5a481d65fa99862110bb84d10a2f15f0cb47cab3) ([#15480](https://github.com/yt-dlp/yt-dlp/issues/15480)) by [0xvd](https://github.com/0xvd)
- **iqiyi**: [Remove broken login support](https://github.com/yt-dlp/yt-dlp/commit/09078190b0f33d14ae2b402913c64b724acf4bcb) ([#15441](https://github.com/yt-dlp/yt-dlp/issues/15441)) by [seproDev](https://github.com/seproDev)
- **lbry**: [Support filtering of flat playlist results](https://github.com/yt-dlp/yt-dlp/commit/0e4d1e9de6250a80453d46f94b9fade5f10197a0) ([#15695](https://github.com/yt-dlp/yt-dlp/issues/15695)) by [christoph-heinrich](https://github.com/christoph-heinrich), [dirkf](https://github.com/dirkf)
- **manoto**: [Remove extractor](https://github.com/yt-dlp/yt-dlp/commit/6b23305822d406eff8e813244d95f328c22e821e) ([#15414](https://github.com/yt-dlp/yt-dlp/issues/15414)) by [seproDev](https://github.com/seproDev)
- **media.ccc.de**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/c8680b65f79cfeb23b342b70ffe1e233902f7933) ([#15608](https://github.com/yt-dlp/yt-dlp/issues/15608)) by [rdamas](https://github.com/rdamas)
- **nebula**
- season
- [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/f5270705e816a24caef7357a7ce8e17471899d73) ([#15347](https://github.com/yt-dlp/yt-dlp/issues/15347)) by [0xvd](https://github.com/0xvd), [bashonly](https://github.com/bashonly)
- [Support more URLs](https://github.com/yt-dlp/yt-dlp/commit/6c918c5071dec8290686a4d030a1f74da3d9debf) ([#15436](https://github.com/yt-dlp/yt-dlp/issues/15436)) by [prettysunflower](https://github.com/prettysunflower)
- **netease**: program: [Support DJ URLs](https://github.com/yt-dlp/yt-dlp/commit/0ea6cc6d82318e554ffa0b5eaf9da4f4379ccbe9) ([#15365](https://github.com/yt-dlp/yt-dlp/issues/15365)) by [0xvd](https://github.com/0xvd)
- **neteasemusic**: [Fix merged lyrics extraction](https://github.com/yt-dlp/yt-dlp/commit/a421eb06d111cfa75e42569dc42331e9f3d8f27b) ([#15052](https://github.com/yt-dlp/yt-dlp/issues/15052)) by [Mivik](https://github.com/Mivik)
- **netzkino**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/a27ec9efc63da1cfb2a390eb028549585dbb2f41) ([#15351](https://github.com/yt-dlp/yt-dlp/issues/15351)) by [doe1080](https://github.com/doe1080)
- **nextmedia**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/6d4984e64e893dd954e781046a3532eb7abbfa16) ([#15354](https://github.com/yt-dlp/yt-dlp/issues/15354)) by [doe1080](https://github.com/doe1080)
- **pandatv**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/878a41e283878ee34b052a395b1f9499f2b9ef81) ([#13210](https://github.com/yt-dlp/yt-dlp/issues/13210)) by [ptlydpr](https://github.com/ptlydpr)
- **parti**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/04f2ec4b976271e1e7ad3e650a0be2c4fd796ee0) ([#15319](https://github.com/yt-dlp/yt-dlp/issues/15319)) by [seproDev](https://github.com/seproDev)
- **patreon**: [Extract inlined media](https://github.com/yt-dlp/yt-dlp/commit/14998eef63a1462961a666d71318f804aca12220) ([#15498](https://github.com/yt-dlp/yt-dlp/issues/15498)) by [bashonly](https://github.com/bashonly)
- **pbs**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/a81087160812ec7a2059e1641a9785bfa4629023) ([#15083](https://github.com/yt-dlp/yt-dlp/issues/15083)) by [nlurker](https://github.com/nlurker)
- **picarto**: [Fix extraction when stream has no title](https://github.com/yt-dlp/yt-dlp/commit/fcd47d2db3f871c7b7d638773c36cc503119742d) ([#15407](https://github.com/yt-dlp/yt-dlp/issues/15407)) by [mikf](https://github.com/mikf)
- **pornhub**: [Optimize metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/f2ee2a46fc2a4efb6ed58ee9e67c506c6b72b843) ([#15231](https://github.com/yt-dlp/yt-dlp/issues/15231)) by [norepro](https://github.com/norepro)
- **rumblechannel**: [Support filtering of flat playlist results](https://github.com/yt-dlp/yt-dlp/commit/0dec80c02a0c9edcc52d33d3ac83435dd8bcaa08) ([#15694](https://github.com/yt-dlp/yt-dlp/issues/15694)) by [christoph-heinrich](https://github.com/christoph-heinrich)
- **scte**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/4a772e5289b939013202ad7707d5b989794ed287) ([#15442](https://github.com/yt-dlp/yt-dlp/issues/15442)) by [seproDev](https://github.com/seproDev)
- **tarangplus**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/260ba3abba2849aa175dd0bcfec308fc6ba6a678) ([#13060](https://github.com/yt-dlp/yt-dlp/issues/13060)) by [subrat-lima](https://github.com/subrat-lima) (With fixes in [27afb31](https://github.com/yt-dlp/yt-dlp/commit/27afb31edc492cb079f9bce9773498d08e568ff3) by [bashonly](https://github.com/bashonly))
- **telecinco**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/b6f24745bfb89ec0eaaa181a68203c2e81e58802) ([#15311](https://github.com/yt-dlp/yt-dlp/issues/15311)) by [0xvd](https://github.com/0xvd), [bashonly](https://github.com/bashonly)
- **thechosen**: [Support new URL format](https://github.com/yt-dlp/yt-dlp/commit/1f4b26c39fb09782cf03615d089e712975395d6d) ([#15687](https://github.com/yt-dlp/yt-dlp/issues/15687)) by [AlexBocken](https://github.com/AlexBocken)
- **tiktok**
- [Extract `save_count`](https://github.com/yt-dlp/yt-dlp/commit/9c393e3f6220d34d534bef7d9d345782003b58ad) ([#15054](https://github.com/yt-dlp/yt-dlp/issues/15054)) by [pomtnp](https://github.com/pomtnp)
- [Solve JS challenges with native Python implementation](https://github.com/yt-dlp/yt-dlp/commit/e3f0d8b731b40176bcc632bf92cfe5149402b202) ([#15672](https://github.com/yt-dlp/yt-dlp/issues/15672)) by [bashonly](https://github.com/bashonly), [DTrombett](https://github.com/DTrombett)
- **tubitv**: [Support URLs with locales](https://github.com/yt-dlp/yt-dlp/commit/f0bc71abf68480b3b65b27c2a60319bc88e5eea2) ([#15205](https://github.com/yt-dlp/yt-dlp/issues/15205)) by [0xvd](https://github.com/0xvd)
- **tumblr**: [Extract timestamp](https://github.com/yt-dlp/yt-dlp/commit/87a265d820fbf9e3ce47c149609100fc8e9e13c5) ([#15462](https://github.com/yt-dlp/yt-dlp/issues/15462)) by [alch-emi](https://github.com/alch-emi)
- **tv5unis**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/6ae9e9568701b9c960e817c6dc35bcd824719a80) ([#15477](https://github.com/yt-dlp/yt-dlp/issues/15477)) by [0xced](https://github.com/0xced)
- **twitch**: videos: [Raise error when channel is not found](https://github.com/yt-dlp/yt-dlp/commit/e15ca65874b2a8bcd7435696b8f01252c39512ba) ([#15458](https://github.com/yt-dlp/yt-dlp/issues/15458)) by [0xvd](https://github.com/0xvd)
- **twitter**
- [Do not extract non-video posts from `unified_card`s](https://github.com/yt-dlp/yt-dlp/commit/ce9a3591f8292aeb93ffdad10028bfcddda3976b) ([#15431](https://github.com/yt-dlp/yt-dlp/issues/15431)) by [bashonly](https://github.com/bashonly)
- [Remove broken login support](https://github.com/yt-dlp/yt-dlp/commit/a6ba7140051dbe1d63a1da4de263bb9c886c0a32) ([#15432](https://github.com/yt-dlp/yt-dlp/issues/15432)) by [bashonly](https://github.com/bashonly)
- **vimeo**: [Add `macos` client](https://github.com/yt-dlp/yt-dlp/commit/ba5e2227c8c49fa76d9d30332aad2416774ddb31) ([#15746](https://github.com/yt-dlp/yt-dlp/issues/15746)) by [bashonly](https://github.com/bashonly), [gamer191](https://github.com/gamer191)
- **volejtv**: [Fix and add extractors](https://github.com/yt-dlp/yt-dlp/commit/1effa06dbf4dfd2e307b445a55a465d897205213) ([#13226](https://github.com/yt-dlp/yt-dlp/issues/13226)) by [subrat-lima](https://github.com/subrat-lima)
- **wat.tv**: [Improve DRM detection](https://github.com/yt-dlp/yt-dlp/commit/bc6ff877dd371d405b11f0ab16634c4d4b5d645e) ([#15659](https://github.com/yt-dlp/yt-dlp/issues/15659)) by [wesson09](https://github.com/wesson09)
- **whyp**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f70ebf97ea7ef3b00c3e9213acf40d1b004c31d9) ([#15721](https://github.com/yt-dlp/yt-dlp/issues/15721)) by [bashonly](https://github.com/bashonly)
- **yahoo**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/97fb78a5b95a98a698f77281ea0c101bf090ed4c) ([#15314](https://github.com/yt-dlp/yt-dlp/issues/15314)) by [0xvd](https://github.com/0xvd), [bashonly](https://github.com/bashonly)
- **youtube**
- [Adjust default clients](https://github.com/yt-dlp/yt-dlp/commit/23b846506378a6a9c9a0958382d37f943f7cfa51) ([#15601](https://github.com/yt-dlp/yt-dlp/issues/15601)) by [bashonly](https://github.com/bashonly)
- [Fix `player_skip=js` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/abf29e3e72e8a4dcae61e2ceaf37ce8405af61ab) ([#15428](https://github.com/yt-dlp/yt-dlp/issues/15428)) by [bashonly](https://github.com/bashonly)
- [Fix default player clients](https://github.com/yt-dlp/yt-dlp/commit/309b03f2ad09fcfcf4ce81e757f8d3796bb56add) ([#15726](https://github.com/yt-dlp/yt-dlp/issues/15726)) by [bashonly](https://github.com/bashonly)
- [Solve n challenges for manifest formats](https://github.com/yt-dlp/yt-dlp/commit/d20f58d721fe45fe873e3389a0d17a72352aecec) ([#15602](https://github.com/yt-dlp/yt-dlp/issues/15602)) by [bashonly](https://github.com/bashonly)
- [Support comment subthreads](https://github.com/yt-dlp/yt-dlp/commit/d22436e5dc7c6808d931e27cbb967b1b2a33c17c) ([#15419](https://github.com/yt-dlp/yt-dlp/issues/15419)) by [bashonly](https://github.com/bashonly) (With fixes in [76c31a7](https://github.com/yt-dlp/yt-dlp/commit/76c31a7a216a3894884381c7775f838b811fde06), [468aa6a](https://github.com/yt-dlp/yt-dlp/commit/468aa6a9b431194949ede9eaad8f33e314a288d6))
- [Update ejs to 0.4.0](https://github.com/yt-dlp/yt-dlp/commit/88b35ff911a999e0b479417237010c305114ba08) ([#15747](https://github.com/yt-dlp/yt-dlp/issues/15747)) by [bashonly](https://github.com/bashonly)
- tab: [Fix flat thumbnails extraction for shorts](https://github.com/yt-dlp/yt-dlp/commit/ff61bef041d1f69fec1044f783fb938c005128af) ([#15331](https://github.com/yt-dlp/yt-dlp/issues/15331)) by [bashonly](https://github.com/bashonly)
- **zdf**: [Support sister sites URLs](https://github.com/yt-dlp/yt-dlp/commit/48b845a29623cbc814ad6c6b2ef285e3f3c0fe91) ([#15370](https://github.com/yt-dlp/yt-dlp/issues/15370)) by [InvalidUsernameException](https://github.com/InvalidUsernameException)
- **zoom**: [Extract recordings with start times](https://github.com/yt-dlp/yt-dlp/commit/0066de5b7e146a96e4cb4352f65dc3f1e283af4a) ([#15475](https://github.com/yt-dlp/yt-dlp/issues/15475)) by [JV-Fernandes](https://github.com/JV-Fernandes)
#### Networking changes
- **Request Handler**: curl_cffi: [Support `curl_cffi` 0.14.x](https://github.com/yt-dlp/yt-dlp/commit/9ab4777b97b5280ae1f53d1fe1b8ac542727238b) ([#15613](https://github.com/yt-dlp/yt-dlp/issues/15613)) by [bashonly](https://github.com/bashonly)
#### Misc. changes
- **build**
- [Bump official actions to latest versions](https://github.com/yt-dlp/yt-dlp/commit/825648a740867cbecd2e593963d7aaf3d568db84) ([#15305](https://github.com/yt-dlp/yt-dlp/issues/15305)) by [bashonly](https://github.com/bashonly)
- [Harden CI/CD pipeline](https://github.com/yt-dlp/yt-dlp/commit/ab3ff2d5dd220aa35805dadb6fae66ae9a0e2553) ([#15387](https://github.com/yt-dlp/yt-dlp/issues/15387)) by [bashonly](https://github.com/bashonly)
- [Improve nightly release check](https://github.com/yt-dlp/yt-dlp/commit/3763d0d4ab8bdbe433ce08e45e21f36ebdeb5db3) ([#15455](https://github.com/yt-dlp/yt-dlp/issues/15455)) by [bashonly](https://github.com/bashonly) (With fixes in [0b08b83](https://github.com/yt-dlp/yt-dlp/commit/0b08b833bfca6a0882f4741bb8fa46c1698c77e5))
- **ci**: [Explicitly declare permissions and limit credentials](https://github.com/yt-dlp/yt-dlp/commit/a6a8f6b6d6775caa031e5016b79db28c6aaadfcb) ([#15324](https://github.com/yt-dlp/yt-dlp/issues/15324)) by [bashonly](https://github.com/bashonly)
- **cleanup**
- Miscellaneous
- [a653494](https://github.com/yt-dlp/yt-dlp/commit/a65349443b959b8ab6bdec8e573777006d29b827) by [bashonly](https://github.com/bashonly), [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
- [8b27553](https://github.com/yt-dlp/yt-dlp/commit/8b275536d945c4b3d07b6c520677922c67a7c10f) by [bashonly](https://github.com/bashonly)
### 2025.12.08
#### Core changes
- [Respect `PATHEXT` when locating JS runtime on Windows](https://github.com/yt-dlp/yt-dlp/commit/e564b4a8080cff48fa0c28f20272c05085ee6130) ([#15117](https://github.com/yt-dlp/yt-dlp/issues/15117)) by [Grub4K](https://github.com/Grub4K)
- **cookies**: [Fix `--cookies-from-browser` for new installs of Firefox 147+](https://github.com/yt-dlp/yt-dlp/commit/fa16dc5241ac1552074feee48e1c2605dc36d352) ([#15215](https://github.com/yt-dlp/yt-dlp/issues/15215)) by [bashonly](https://github.com/bashonly), [mbway](https://github.com/mbway)
#### Extractor changes
- **agalega**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/3cb5e4db54d44fe82d4eee94ae2f37cbce2e7dfc) ([#15105](https://github.com/yt-dlp/yt-dlp/issues/15105)) by [putridambassador121](https://github.com/putridambassador121)
- **alibaba**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/c70b57c03e0c25767a5166620798297a2a4878fb) ([#15253](https://github.com/yt-dlp/yt-dlp/issues/15253)) by [seproDev](https://github.com/seproDev)
- **bitmovin**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/45a3b42bb917e99b0b5c155c272ebf4a82a5bf66) ([#15064](https://github.com/yt-dlp/yt-dlp/issues/15064)) by [seproDev](https://github.com/seproDev)
- **digiteka**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/6842620d56e4c4e6affb90c2f8dff8a36dee852c) ([#14903](https://github.com/yt-dlp/yt-dlp/issues/14903)) by [beliote](https://github.com/beliote)
- **fc2**: live: [Raise appropriate error when stream is offline](https://github.com/yt-dlp/yt-dlp/commit/4433b3a217c9f430dc057643bfd7b6769eff4a45) ([#15180](https://github.com/yt-dlp/yt-dlp/issues/15180)) by [Zer0spectrum](https://github.com/Zer0spectrum)
- **floatplane**: [Add subtitle support](https://github.com/yt-dlp/yt-dlp/commit/b333ef1b3f961e292a8bf7052c54b54c81587a17) ([#15069](https://github.com/yt-dlp/yt-dlp/issues/15069)) by [seproDev](https://github.com/seproDev)
- **jtbc**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/947e7883406e5ea43687d6e4ff721cc0162c9664) ([#15047](https://github.com/yt-dlp/yt-dlp/issues/15047)) by [seproDev](https://github.com/seproDev)
- **loom**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/36b29bb3532e008a2aaf3d36d1c6fc3944137930) ([#15236](https://github.com/yt-dlp/yt-dlp/issues/15236)) by [bashonly](https://github.com/bashonly)
- **mave**: channel: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/5f66ac71f6637f768cd251509b0a932d0ce56427) ([#14915](https://github.com/yt-dlp/yt-dlp/issues/14915)) by [anlar](https://github.com/anlar)
- **medaltv**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/a4c72acc462668a938827370bd77084a1cd4733b) ([#15103](https://github.com/yt-dlp/yt-dlp/issues/15103)) by [seproDev](https://github.com/seproDev)
- **netapp**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/20f83f208eae863250b35e2761adad88e91d85a1) ([#15122](https://github.com/yt-dlp/yt-dlp/issues/15122)) by [darkstar](https://github.com/darkstar)
- **nhk**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/12d411722a3d7a0382d1d230a904ecd4e20298b6) ([#14528](https://github.com/yt-dlp/yt-dlp/issues/14528)) by [garret1317](https://github.com/garret1317)
- **nowcanal**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/4e680db1505dafb93313b1d42ffcd3f230fcc92a) ([#14584](https://github.com/yt-dlp/yt-dlp/issues/14584)) by [pferreir](https://github.com/pferreir)
- **patreon**: campaign: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/023e4db9afe0630c608621846856a1ca876d8bab) ([#15108](https://github.com/yt-dlp/yt-dlp/issues/15108)) by [thomasmllt](https://github.com/thomasmllt)
- **rinsefm**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/d6aa8c235d2e7d9374f79ec73af23a3859c76bea) ([#15020](https://github.com/yt-dlp/yt-dlp/issues/15020)) by [1bnBattuta](https://github.com/1bnBattuta), [seproDev](https://github.com/seproDev)
- **s4c**: [Fix geo-restricted content](https://github.com/yt-dlp/yt-dlp/commit/26c2545b87e2b22f134d1f567ed4d4b0b91c3253) ([#15196](https://github.com/yt-dlp/yt-dlp/issues/15196)) by [seproDev](https://github.com/seproDev)
- **soundcloudplaylist**: [Support new API URLs](https://github.com/yt-dlp/yt-dlp/commit/1dd84b9d1c33e50de49866b0d93c2596897ce506) ([#15071](https://github.com/yt-dlp/yt-dlp/issues/15071)) by [seproDev](https://github.com/seproDev)
- **sporteurope**: [Support new domain](https://github.com/yt-dlp/yt-dlp/commit/025191fea655ac879ca6dc68df358c26456a6e46) ([#15251](https://github.com/yt-dlp/yt-dlp/issues/15251)) by [bashonly](https://github.com/bashonly)
- **sproutvideo**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/2c9f0c3456057aff0631d9ea6d3eda70ffd8aabe) ([#15113](https://github.com/yt-dlp/yt-dlp/issues/15113)) by [bashonly](https://github.com/bashonly)
- **thechosen**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/854fded114f3b7b33693c2d3418575d04014aa4b) ([#14183](https://github.com/yt-dlp/yt-dlp/issues/14183)) by [mrFlamel](https://github.com/mrFlamel)
- **thisoldhouse**: [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/9daba4f442139ee2537746398afc5ac30b51c28c) ([#15097](https://github.com/yt-dlp/yt-dlp/issues/15097)) by [bashonly](https://github.com/bashonly)
- **tubitv**: series: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/2a777ecbd598de19a4c691ba1f790ccbec9cdbc4) ([#15018](https://github.com/yt-dlp/yt-dlp/issues/15018)) by [Zer0spectrum](https://github.com/Zer0spectrum)
- **urplay**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/c2e7e9cdb2261adde01048d161914b156a3bad51) ([#15120](https://github.com/yt-dlp/yt-dlp/issues/15120)) by [seproDev](https://github.com/seproDev)
- **web.archive**: youtube: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/7ec6b9bc40ee8a21b11cce83a09a07a37014062e) ([#15234](https://github.com/yt-dlp/yt-dlp/issues/15234)) by [seproDev](https://github.com/seproDev)
- **wistiachannel**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/0c696239ef418776ac6ba20284bd2f3976a011b4) ([#14218](https://github.com/yt-dlp/yt-dlp/issues/14218)) by [Sojiroh](https://github.com/Sojiroh)
- **xhamster**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/29e257037862f3b2ad65e6e8d2972f9ed89389e3) ([#15252](https://github.com/yt-dlp/yt-dlp/issues/15252)) by [0xvd](https://github.com/0xvd)
- **yfanefa**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/af285016d2b14c4445109283e7c590b31542de88) ([#15032](https://github.com/yt-dlp/yt-dlp/issues/15032)) by [Haytam001](https://github.com/Haytam001)
- **youtube**
- [Add `use_ad_playback_context` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/f7acf3c1f42cc474927ecc452205d7877af36731) ([#15220](https://github.com/yt-dlp/yt-dlp/issues/15220)) by [WhatAmISupposedToPutHere](https://github.com/WhatAmISupposedToPutHere)
- [Allow `ejs` patch version to differ](https://github.com/yt-dlp/yt-dlp/commit/7bd79d92965fe9f84d7e1720eb6bb10fa9a10c77) ([#15263](https://github.com/yt-dlp/yt-dlp/issues/15263)) by [Grub4K](https://github.com/Grub4K)
- [Detect "super resolution" AI-upscaled formats](https://github.com/yt-dlp/yt-dlp/commit/4cb5e191efeebc3679f89c3c8ac819bcd511bb1f) ([#15050](https://github.com/yt-dlp/yt-dlp/issues/15050)) by [bashonly](https://github.com/bashonly)
- [Determine wait time from player response](https://github.com/yt-dlp/yt-dlp/commit/715af0c636b2b33fb3df1eb2ee37eac8262d43ac) ([#14646](https://github.com/yt-dlp/yt-dlp/issues/14646)) by [bashonly](https://github.com/bashonly), [WhatAmISupposedToPutHere](https://github.com/WhatAmISupposedToPutHere)
- [Extract all automatic caption languages](https://github.com/yt-dlp/yt-dlp/commit/419776ecf57269efb13095386a19ddc75c1f11b2) ([#15156](https://github.com/yt-dlp/yt-dlp/issues/15156)) by [bashonly](https://github.com/bashonly)
- [Improve message when no JS runtime is found](https://github.com/yt-dlp/yt-dlp/commit/1d43fa5af883f96af902a29544fc766f5e97fce6) ([#15266](https://github.com/yt-dlp/yt-dlp/issues/15266)) by [bashonly](https://github.com/bashonly)
- [Update ejs to 0.3.2](https://github.com/yt-dlp/yt-dlp/commit/0c7e4cfcaed95909d7c1c0a11b5a12881bcfdfd6) ([#15267](https://github.com/yt-dlp/yt-dlp/issues/15267)) by [bashonly](https://github.com/bashonly)
#### Downloader changes
- [Fix playback wait time for ffmpeg downloads](https://github.com/yt-dlp/yt-dlp/commit/23f1ab346927ab73ad510fd7ba105a69e5291c66) ([#15066](https://github.com/yt-dlp/yt-dlp/issues/15066)) by [bashonly](https://github.com/bashonly)
#### Postprocessor changes
- **ffmpeg**: [Fix uncaught error if bad --ffmpeg-location is given](https://github.com/yt-dlp/yt-dlp/commit/0eed3fe530d6ff4b668494c5b1d4d6fc1ade96f7) ([#15104](https://github.com/yt-dlp/yt-dlp/issues/15104)) by [bashonly](https://github.com/bashonly)
- **ffmpegmetadata**: [Add more tag mappings](https://github.com/yt-dlp/yt-dlp/commit/04050be583aae21f99932a674d1d2992ff016d5c) ([#14654](https://github.com/yt-dlp/yt-dlp/issues/14654)) by [garret1317](https://github.com/garret1317)
#### Networking changes
- **Request Handler**: urllib: [Do not read after close](https://github.com/yt-dlp/yt-dlp/commit/6ee6a6fc58d6254ef944bd311e6890e208a75e98) ([#15049](https://github.com/yt-dlp/yt-dlp/issues/15049)) by [bashonly](https://github.com/bashonly)
#### Misc. changes
- **build**: [Bump PyInstaller minimum version requirement to 6.17.0](https://github.com/yt-dlp/yt-dlp/commit/280165026886a1f1614ab527c34c66d71faa5d69) ([#15199](https://github.com/yt-dlp/yt-dlp/issues/15199)) by [bashonly](https://github.com/bashonly)
- **cleanup**: Miscellaneous: [7a52ff2](https://github.com/yt-dlp/yt-dlp/commit/7a52ff29d86efc8f3adeba977b2009ce40b8e52e) by [bashonly](https://github.com/bashonly), [oxyzenQ](https://github.com/oxyzenQ), [RezSat](https://github.com/RezSat), [seproDev](https://github.com/seproDev)
- **devscripts**: `install_deps`: [Align options/terms with PEP 735](https://github.com/yt-dlp/yt-dlp/commit/29fe515d8d3386b3406ff02bdabb967d6821bc02) ([#15200](https://github.com/yt-dlp/yt-dlp/issues/15200)) by [bashonly](https://github.com/bashonly)
### 2025.11.12
#### Important changes
@@ -64,7 +226,7 @@ yt-dlp now requires users to have an external JavaScript runtime (e.g. Deno) installed
- **build**: [Bump musllinux Python version to 3.14](https://github.com/yt-dlp/yt-dlp/commit/646904cd3a79429ec5fdc43f904b3f57ae213f34) ([#14623](https://github.com/yt-dlp/yt-dlp/issues/14623)) by [bashonly](https://github.com/bashonly)
- **cleanup**
- Miscellaneous
- [c63b4e2](https://github.com/yt-dlp/yt-dlp/commit/c63b4e2a2b81cc78397c8709ef53ffd29bada213) by [bashonly](https://github.com/bashonly), [matyb08](https://github.com/matyb08), [sepro](https://github.com/sepro)
- [c63b4e2](https://github.com/yt-dlp/yt-dlp/commit/c63b4e2a2b81cc78397c8709ef53ffd29bada213) by [bashonly](https://github.com/bashonly), [matyb08](https://github.com/matyb08), [seproDev](https://github.com/seproDev)
- [335653b](https://github.com/yt-dlp/yt-dlp/commit/335653be82d5ef999cfc2879d005397402eebec1) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
- **devscripts**: [Improve `install_deps` script](https://github.com/yt-dlp/yt-dlp/commit/73922e66e437fb4bb618bdc119a96375081bf508) ([#14766](https://github.com/yt-dlp/yt-dlp/issues/14766)) by [bashonly](https://github.com/bashonly)
- **test**: [Skip flaky tests if source unchanged](https://github.com/yt-dlp/yt-dlp/commit/ade8c2b36ff300edef87d48fd1ba835ac35c5b63) ([#14970](https://github.com/yt-dlp/yt-dlp/issues/14970)) by [bashonly](https://github.com/bashonly), [Grub4K](https://github.com/Grub4K)


@@ -8,9 +8,7 @@ You can also find lists of all [contributors of yt-dlp](CONTRIBUTORS) and [autho
Core Maintainers are responsible for reviewing and merging contributions, publishing releases, and steering the overall direction of the project.
**You can contact the core maintainers via `maintainers@yt-dlp.org`.**
This is **NOT** a support channel. [Open an issue](https://github.com/yt-dlp/yt-dlp/issues/new/choose) if you need help or want to report a bug.
**You can contact the core maintainers via `maintainers@yt-dlp.org`.** This email address is **NOT** a support channel. [Open an issue](https://github.com/yt-dlp/yt-dlp/issues/new/choose) if you need help or want to report a bug.
### [coletdjnz](https://github.com/coletdjnz)
@@ -18,6 +16,7 @@ This is **NOT** a support channel. [Open an issue](https://github.com/yt-dlp/yt-
* Overhauled the networking stack and implemented support for `requests` and `curl_cffi` (`--impersonate`) HTTP clients
* Reworked the plugin architecture to support installing plugins across all yt-dlp distributions (exe, pip, etc.)
* Implemented support for external JavaScript runtimes/engines
* Maintains support for YouTube
* Added and fixed support for various other sites
@@ -25,9 +24,10 @@ This is **NOT** a support channel. [Open an issue](https://github.com/yt-dlp/yt-
* Rewrote and maintains the build/release workflows and the self-updater: executables, automated/nightly/master releases, `--update-to`
* Overhauled external downloader cookie handling
* Helped in implementing support for external JavaScript runtimes/engines
* Added `--cookies-from-browser` support for Firefox containers
* Overhauled and maintains support for sites like Youtube, Vimeo, Twitter, TikTok, etc
* Added support for sites like Dacast, Kick, Loom, SproutVideo, Triller, Weverse, etc
* Maintains support for sites like YouTube, Vimeo, Twitter, TikTok, etc
* Added support for various sites
### [Grub4K](https://github.com/Grub4K)
@@ -37,16 +37,10 @@ This is **NOT** a support channel. [Open an issue](https://github.com/yt-dlp/yt-
* `--update-to`, self-updater rewrite, automated/nightly/master releases
* Reworked internals like `traverse_obj`, various core refactors and bugs fixes
* Implemented proper progress reporting for parallel downloads
* Implemented support for external JavaScript runtimes/engines
* Improved/fixed/added Bundestag, crunchyroll, pr0gramm, Twitter, WrestleUniverse etc
### [sepro](https://github.com/seproDev)
* UX improvements: Warn when ffmpeg is missing, warn when double-clicking exe
* Code cleanup: Remove dead extractors, mark extractors as broken, enable/apply ruff rules
* Improved/fixed/added ArdMediathek, DRTV, Floatplane, MagentaMusik, Naver, Nebula, OnDemandKorea, Vbox7 etc
## Inactive Core Maintainers
### [pukkandan](https://github.com/pukkandan)
@@ -75,6 +69,15 @@ This is **NOT** a support channel. [Open an issue](https://github.com/yt-dlp/yt-
* Added playlist/series downloads for Hotstar, ParamountPlus, Rumble, SonyLIV, Trovo, TubiTv, Voot etc
* Improved/fixed support for HiDive, HotStar, Hungama, LBRY, LinkedInLearning, Mxplayer, SonyLiv, TV2, Vimeo, VLive etc
### [sepro](https://github.com/seproDev)
* UX improvements: Warn when ffmpeg is missing, warn when double-clicking exe
* Helped in implementing support for external JavaScript runtimes/engines
* Code cleanup: Remove dead extractors, mark extractors as broken, enable/apply ruff rules
* Improved/fixed/added ArdMediathek, DRTV, Floatplane, MagentaMusik, Naver, Nebula, OnDemandKorea, Vbox7 etc
## Triage Maintainers
Triage Maintainers are frequent contributors who can manage issues and pull requests.


@@ -202,9 +202,9 @@ CONTRIBUTORS: Changelog.md
# The following EJS_-prefixed variables are auto-generated by devscripts/update_ejs.py
# DO NOT EDIT!
EJS_VERSION = 0.3.1
EJS_WHEEL_NAME = yt_dlp_ejs-0.3.1-py3-none-any.whl
EJS_WHEEL_HASH = sha256:a6e3548874db7c774388931752bb46c7f4642c044b2a189e56968f3d5ecab622
EJS_VERSION = 0.4.0
EJS_WHEEL_NAME = yt_dlp_ejs-0.4.0-py3-none-any.whl
EJS_WHEEL_HASH = sha256:19278cff397b243074df46342bb7616c404296aeaff01986b62b4e21823b0b9c
EJS_PY_FOLDERS = yt_dlp_ejs yt_dlp_ejs/yt yt_dlp_ejs/yt/solver
EJS_PY_FILES = yt_dlp_ejs/__init__.py yt_dlp_ejs/_version.py yt_dlp_ejs/yt/__init__.py yt_dlp_ejs/yt/solver/__init__.py
EJS_JS_FOLDERS = yt_dlp_ejs/yt/solver


@@ -203,7 +203,7 @@ Python versions 3.10+ (CPython) and 3.11+ (PyPy) are supported. Other versions a
On Windows, [Microsoft Visual C++ 2010 SP1 Redistributable Package (x86)](https://download.microsoft.com/download/1/6/5/165255E7-1014-4D0A-B094-B6A430A6BFFC/vcredist_x86.exe) is also necessary to run yt-dlp. You probably already have this, but if the executable throws an error due to missing `MSVCR100.dll` you need to install it manually.
-->
While all the other dependencies are optional, `ffmpeg`, `ffprobe`, `yt-dlp-ejs` and a JavaScript runtime are highly recommended
While all the other dependencies are optional, `ffmpeg`, `ffprobe`, `yt-dlp-ejs` and a supported JavaScript runtime/engine are highly recommended
### Strongly recommended
@@ -213,9 +213,9 @@ While all the other dependencies are optional, `ffmpeg`, `ffprobe`, `yt-dlp-ejs`
**Important**: What you need is ffmpeg *binary*, **NOT** [the Python package of the same name](https://pypi.org/project/ffmpeg)
* [**yt-dlp-ejs**](https://github.com/yt-dlp/ejs) - Required for deciphering YouTube n/sig values. Licensed under [Unlicense](https://github.com/yt-dlp/ejs/blob/main/LICENSE), bundles [MIT](https://github.com/davidbonnet/astring/blob/main/LICENSE) and [ISC](https://github.com/meriyah/meriyah/blob/main/LICENSE.md) components.
* [**yt-dlp-ejs**](https://github.com/yt-dlp/ejs) - Required for full YouTube support. Licensed under [Unlicense](https://github.com/yt-dlp/ejs/blob/main/LICENSE), bundles [MIT](https://github.com/davidbonnet/astring/blob/main/LICENSE) and [ISC](https://github.com/meriyah/meriyah/blob/main/LICENSE.md) components.
A JavaScript runtime like [**deno**](https://deno.land) (recommended), [**node.js**](https://nodejs.org), [**bun**](https://bun.sh), or [**QuickJS**](https://bellard.org/quickjs/) is also required to run yt-dlp-ejs. See [the wiki](https://github.com/yt-dlp/yt-dlp/wiki/EJS).
A JavaScript runtime/engine like [**deno**](https://deno.land) (recommended), [**node.js**](https://nodejs.org), [**bun**](https://bun.sh), or [**QuickJS**](https://bellard.org/quickjs/) is also required to run yt-dlp-ejs. See [the wiki](https://github.com/yt-dlp/yt-dlp/wiki/EJS).
### Networking
* [**certifi**](https://github.com/certifi/python-certifi)\* - Provides Mozilla's root certificate bundle. Licensed under [MPLv2](https://github.com/certifi/python-certifi/blob/master/LICENSE)
@@ -228,7 +228,7 @@ While all the other dependencies are optional, `ffmpeg`, `ffprobe`, `yt-dlp-ejs`
The following provide support for impersonating browser requests. This may be required for some sites that employ TLS fingerprinting.
* [**curl_cffi**](https://github.com/lexiforest/curl_cffi) (recommended) - Python binding for [curl-impersonate](https://github.com/lexiforest/curl-impersonate). Provides impersonation targets for Chrome, Edge and Safari. Licensed under [MIT](https://github.com/lexiforest/curl_cffi/blob/main/LICENSE)
* Can be installed with the `curl-cffi` group, e.g. `pip install "yt-dlp[default,curl-cffi]"`
* Can be installed with the `curl-cffi` extra, e.g. `pip install "yt-dlp[default,curl-cffi]"`
* Currently included in most builds *except* `yt-dlp` (Unix zipimport binary), `yt-dlp_x86` (Windows 32-bit) and `yt-dlp_musllinux_aarch64`
@@ -265,7 +265,7 @@ To build the standalone executable, you must have Python and `pyinstaller` (plus
You can run the following commands:
```
python devscripts/install_deps.py --include-group pyinstaller
python devscripts/install_deps.py --include-extra pyinstaller
python devscripts/make_lazy_extractors.py
python -m bundle.pyinstaller
```
@@ -483,7 +483,7 @@ Tip: Use `CTRL`+`F` (or `Command`+`F`) to search by keywords
two-letter ISO 3166-2 country code
## Video Selection:
-I, --playlist-items ITEM_SPEC Comma separated playlist_index of the items
-I, --playlist-items ITEM_SPEC Comma-separated playlist_index of the items
to download. You can specify a range using
"[START]:[STOP][:STEP]". For backward
compatibility, START-STOP is also supported.
@@ -858,6 +858,8 @@ Tip: Use `CTRL`+`F` (or `Command`+`F`) to search by keywords
for more details
-S, --format-sort SORTORDER Sort the formats by the fields given, see
"Sorting Formats" for more details
--format-sort-reset Disregard previous user specified sort order
and reset to the default
--format-sort-force Force user specified sort order to have
precedence over all fields, see "Sorting
Formats" for more details (Alias: --S-force)
@@ -1299,7 +1301,7 @@ The field names themselves (the part inside the parenthesis) can also have some
1. **Default**: A literal default value can be specified for when the field is empty using a `|` separator. This overrides `--output-na-placeholder`. E.g. `%(uploader|Unknown)s`
1. **More Conversions**: In addition to the normal format types `diouxXeEfFgGcrs`, yt-dlp additionally supports converting to `B` = **B**ytes, `j` = **j**son (flag `#` for pretty-printing, `+` for Unicode), `h` = HTML escaping, `l` = a comma separated **l**ist (flag `#` for `\n` newline-separated), `q` = a string **q**uoted for the terminal (flag `#` to split a list into different arguments), `D` = add **D**ecimal suffixes (e.g. 10M) (flag `#` to use 1024 as factor), and `S` = **S**anitize as filename (flag `#` for restricted)
1. **More Conversions**: In addition to the normal format types `diouxXeEfFgGcrs`, yt-dlp additionally supports converting to `B` = **B**ytes, `j` = **j**son (flag `#` for pretty-printing, `+` for Unicode), `h` = HTML escaping, `l` = a comma-separated **l**ist (flag `#` for `\n` newline-separated), `q` = a string **q**uoted for the terminal (flag `#` to split a list into different arguments), `D` = add **D**ecimal suffixes (e.g. 10M) (flag `#` to use 1024 as factor), and `S` = **S**anitize as filename (flag `#` for restricted)
1. **Unicode normalization**: The format type `U` can be used for NFC [Unicode normalization](https://docs.python.org/3/library/unicodedata.html#unicodedata.normalize). The alternate form flag (`#`) changes the normalization to NFD and the conversion flag `+` can be used for NFKC/NFKD compatibility equivalence normalization. E.g. `%(title)+.100U` is NFKC
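A couple of illustrative invocations of the conversions above (a sketch; `URL` is a placeholder):

```shell
# Default value plus filename sanitization in an output template
$ yt-dlp -o "%(uploader|Unknown)s - %(title)S.%(ext)s" URL
# Pretty-print the tags list as JSON (the `#` flag enables pretty-printing)
$ yt-dlp --print "%(tags)#j" URL
```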
@@ -1351,6 +1353,7 @@ The available fields are:
- `repost_count` (numeric): Number of reposts of the video
- `average_rating` (numeric): Average rating given by users, the scale used depends on the webpage
- `comment_count` (numeric): Number of comments on the video (For some extractors, comments are only downloaded at the end, and so this field cannot be used)
- `save_count` (numeric): Number of times the video has been saved or bookmarked
- `age_limit` (numeric): Age restriction for the video (years)
- `live_status` (string): One of "not_live", "is_live", "is_upcoming", "was_live", "post_live" (was live, but VOD is not yet processed)
- `is_live` (boolean): Whether this video is a live stream or a fixed-length video
@@ -1644,6 +1647,8 @@ Note that the default for hdr is `hdr:12`; i.e. Dolby Vision is not preferred. T
If your format selector is `worst`, the last item is selected after sorting. This means it will select the format that is worst in all respects. Most of the time, what you actually want is the video with the smallest filesize instead. So it is generally better to use `-f best -S +size,+br,+res,+fps`.
If you use the `-S`/`--format-sort` option multiple times, each subsequent sorting argument will be prepended to the previous one, and only the highest priority entry of any duplicated field will be preserved. E.g. `-S proto -S res` is equivalent to `-S res,proto`, and `-S res:720,fps -S vcodec,res:1080` is equivalent to `-S vcodec,res:1080,fps`. You can use `--format-sort-reset` to disregard any previously passed `-S`/`--format-sort` arguments and reset to the default order.
**Tip**: You can use `-v -F` to see how the formats have been sorted (worst to best).
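To make the prepending and reset rules concrete (a sketch; `URL` is a placeholder):

```shell
# These two invocations are equivalent per the rule above
$ yt-dlp -S proto -S res URL
$ yt-dlp -S res,proto URL
# Disregard any earlier -S arguments (e.g. from a config file) and start from the default order
$ yt-dlp --format-sort-reset -S +size URL
```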
## Format Selection examples
@@ -1798,8 +1803,8 @@ Metadata fields | From
`track` | `track_number`
`artist` | `artist`, `artists`, `creator`, `creators`, `uploader` or `uploader_id`
`composer` | `composer` or `composers`
`genre` | `genre` or `genres`
`album` | `album`
`genre` | `genre`, `genres`, `categories` or `tags`
`album` | `album` or `series`
`album_artist` | `album_artist` or `album_artists`
`disc` | `disc_number`
`show` | `series`
@@ -1820,6 +1825,9 @@ $ yt-dlp --parse-metadata "title:%(artist)s - %(title)s"
# Regex example
$ yt-dlp --parse-metadata "description:Artist - (?P<artist>.+)"
# Copy the episode field to the title field (with FROM and TO as single fields)
$ yt-dlp --parse-metadata "episode:title"
# Set title as "Series name S01E05"
$ yt-dlp --parse-metadata "%(series)s S%(season_number)02dE%(episode_number)02d:%(title)s"
@@ -1852,29 +1860,30 @@ The following extractors use this feature:
#### youtube
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube/_base.py](https://github.com/yt-dlp/yt-dlp/blob/415b4c9f955b1a0391204bd24a7132590e7b3bdb/yt_dlp/extractor/youtube/_base.py#L402-L409) for the list of supported content language codes
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_sdkless`, `android_vr`, `tv`, `tv_simply`, `tv_downgraded`, and `tv_embedded`. By default, `tv,android_sdkless,web` is used. If no JavaScript runtime is available, then `android_sdkless,web_safari,web` is used. If logged-in cookies are passed to yt-dlp, then `tv_downgraded,web_safari,web` is used for free accounts and `tv_downgraded,web_creator,web` is used for premium accounts. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `web_embedded` client is added for age-restricted videos but only works if the video is embeddable. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `ios_downgraded`, `android`, `android_vr`, `tv`, `tv_simply`, `tv_downgraded`, and `tv_embedded`. By default, `android_vr,ios_downgraded,web,web_safari` is used. If no JavaScript runtime/engine is available, then `android_vr,ios_downgraded` is used. If logged-in cookies are passed to yt-dlp, then `tv_downgraded,web,web_safari` is used for free accounts and `tv_downgraded,web_creator,web` is used for premium accounts. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `web_embedded` client is added for age-restricted videos but only works if the video is embeddable. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player), `initial_data` (skip initial data/next ep request). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause issues such as missing formats or metadata. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) and [#12826](https://github.com/yt-dlp/yt-dlp/issues/12826) for more details
* `webpage_skip`: Skip extraction of embedded webpage data. One or both of `player_response`, `initial_data`. These options are for testing purposes and don't skip any network requests
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
* `player_js_variant`: The player javascript variant to use for n/sig deciphering. The known variants are: `main`, `tcc`, `tce`, `es5`, `es6`, `tv`, `tv_es6`, `phone`, `tablet`. The default is `main`, and the others are for debugging purposes. You can use `actual` to go with what is prescribed by the site
* `player_js_version`: The player javascript version to use for n/sig deciphering, in the format of `signature_timestamp@hash` (e.g. `20348@0004de42`). The default is to use what is prescribed by the site, and can be selected with `actual`
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
* `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all`
* E.g. `all,all,1000,10` will get a maximum of 1000 replies total, with up to 10 replies per thread. `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total
* `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread,max-depth`. Default is `all,all,all,all,all`
* A `max-depth` value of `1` will discard all replies, regardless of the `max-replies` or `max-replies-per-thread` values given
* E.g. `all,all,1000,10,2` will get a maximum of 1000 replies total, with up to 10 replies per thread, and only 2 levels of depth (i.e. top-level comments plus their immediate replies). `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total
* `formats`: Change the types of formats to return. `dashy` (convert HTTP to DASH), `duplicate` (identical content but different URLs or protocol; includes `dashy`), `incomplete` (cannot be downloaded completely - live dash and post-live m3u8), `missing_pot` (include formats that require a PO Token but are missing one)
* `innertube_host`: Innertube API host to use for all API requests; e.g. `studio.youtube.com`, `youtubei.googleapis.com`. Note that cookies exported from one subdomain will not work on others
* `innertube_key`: Innertube API key to use for all API requests. By default, no API key is used
* `raise_incomplete_data`: `Incomplete Data Received` raises an error instead of reporting a warning
* `data_sync_id`: Overrides the account Data Sync ID used in Innertube API requests. This may be needed if you are using an account with `youtube:player_skip=webpage,configs` or `youtubetab:skip=webpage`
* `visitor_data`: Overrides the Visitor Data used in Innertube API requests. This should be used with `player_skip=webpage,configs` and without cookies. Note: this may have adverse effects if used improperly. If a session from a browser is wanted, you should pass cookies instead (which contain the Visitor ID)
* `po_token`: Proof of Origin (PO) Token(s) to use. Comma seperated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player=XXX,web_safari.gvs+YYY`. Context can be any of `gvs` (Google Video Server URLs), `player` (Innertube player request) or `subs` (Subtitles)
* `po_token`: Proof of Origin (PO) Token(s) to use. Comma-separated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player=XXX,web_safari.gvs+YYY`. Context can be any of `gvs` (Google Video Server URLs), `player` (Innertube player request) or `subs` (Subtitles)
* `pot_trace`: Enable debug logging for PO Token fetching. Either `true` or `false` (default)
* `fetch_pot`: Policy to use for fetching a PO Token from providers. One of `always` (always try to fetch a PO Token regardless of whether the client requires one for the given context), `never` (never fetch a PO Token), or `auto` (default; only fetch a PO Token if the client requires one for the given context)
* `playback_wait`: Duration (in seconds) to wait in between the extraction and download stages in order to ensure the formats are available. The default is `6` seconds
* `jsc_trace`: Enable debug logging for JS Challenge fetching. Either `true` or `false` (default)
* `use_ad_playback_context`: Skip preroll ads to eliminate the mandatory wait period before download. Do NOT use this when passing premium account cookies to yt-dlp, as it will result in a loss of premium formats. Only effective with the `web`, `web_safari`, `web_music` and `mweb` player clients. Either `true` or `false` (default)
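Combining several of the arguments above in a single invocation might look like this (a sketch; the values and `URL` are illustrative):

```shell
# Multiple arguments for the same extractor are separated by semicolons
$ yt-dlp --extractor-args "youtube:player_client=default,-ios;max_comments=all,all,1000,10,2" URL
```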
#### youtube-ejs
* `jitless`: Run suported Javascript engines in JIT-less mode. Supported runtimes are `deno`, `node` and `bun`. Provides better security at the cost of performance/speed. Do note that `node` and `bun` are still considered unsecure. Either `true` or `false` (default)
* `jitless`: Run supported Javascript engines in JIT-less mode. Supported runtimes are `deno`, `node` and `bun`. Provides better security at the cost of performance/speed. Do note that `node` and `bun` are still considered insecure. Either `true` or `false` (default)
#### youtubepot-webpo
* `bind_to_visitor_id`: Whether to use the Visitor ID instead of Visitor Data for caching WebPO tokens. Either `true` (default) or `false`
@@ -1963,7 +1972,7 @@ The following extractors use this feature:
* `backend`: Backend API to use for extraction - one of `streaks` (default) or `brightcove` (deprecated)
#### vimeo
* `client`: Client to extract video data from. The currently available clients are `android`, `ios`, and `web`. Only one client can be used. The `web` client is used by default. The `web` client only works with account cookies or login credentials. The `android` and `ios` clients only work with previously cached OAuth tokens
* `client`: Client to extract video data from. The currently available clients are `android`, `ios`, `macos` and `web`. Only one client can be used. The `macos` client is used by default, but the `web` client is used when logged-in. The `web` client only works with account cookies or login credentials. The `android` and `ios` clients only work with previously cached OAuth tokens
* `original_format_policy`: Policy for when to try extracting original formats. One of `always`, `never`, or `auto`. The default `auto` policy tries to avoid exceeding the web client's API rate-limit by only making an extra request when Vimeo publicizes the video's downloadability
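For instance, forcing the `web` client together with browser cookies (a sketch; the browser name and `URL` are placeholders):

```shell
# The web client only works with account cookies or login credentials
$ yt-dlp --extractor-args "vimeo:client=web" --cookies-from-browser firefox URL
```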
**Note**: These options may be changed/removed in the future without concern for backward compatibility
@@ -2329,7 +2338,7 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
* Passing `--simulate` (or calling `extract_info` with `download=False`) no longer alters the default format selection. See [#9843](https://github.com/yt-dlp/yt-dlp/issues/9843) for details.
* yt-dlp no longer applies the server modified time to downloaded files by default. Use `--mtime` or `--compat-options mtime-by-default` to revert this.
For ease of use, a few more compat options are available:
For convenience, several compat option aliases are available:
* `--compat-options all`: Use all compat options (**Do NOT use this!**)
* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
@@ -2337,7 +2346,10 @@ For ease of use, a few more compat options are available:
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization`
* `--compat-options 2022`: Same as `--compat-options 2023,playlist-match-filter,no-external-downloader-progress,prefer-legacy-http-handler,manifest-filesize-approx`
* `--compat-options 2023`: Same as `--compat-options 2024,prefer-vp9-sort`
* `--compat-options 2024`: Same as `--compat-options mtime-by-default`. Use this to enable all future compat options
* `--compat-options 2024`: Same as `--compat-options 2025,mtime-by-default`
* `--compat-options 2025`: Currently does nothing. Use this to enable all future compat options
Using one of the yearly compat option aliases will pin yt-dlp's default behavior to what it was at the *end* of that calendar year.
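To make the chaining concrete, here is a small illustrative resolver (not part of yt-dlp) that flattens a yearly alias using the expansions listed above:

```python
# Illustrative only: flattens the yearly alias chain documented above
ALIASES = {
    '2021': ['2022', 'no-certifi', 'filename-sanitization'],
    '2022': ['2023', 'playlist-match-filter', 'no-external-downloader-progress',
             'prefer-legacy-http-handler', 'manifest-filesize-approx'],
    '2023': ['2024', 'prefer-vp9-sort'],
    '2024': ['2025', 'mtime-by-default'],
    '2025': [],  # currently does nothing
}

def resolve(alias: str) -> set[str]:
    opts = set()
    for item in ALIASES.get(alias, ()):
        opts |= resolve(item) if item in ALIASES else {item}
    return opts

assert resolve('2023') == {'prefer-vp9-sort', 'mtime-by-default'}
```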
The following compat options restore vulnerable behavior from before security patches:

View File

@@ -26,7 +26,7 @@ services:
platforms:
- "linux/amd64"
args:
VERIFYIMAGE: quay.io/pypa/manylinux2014_x86_64:latest
VERIFYIMAGE: quay.io/pypa/manylinux2014_x86_64:2025.12.19-1@sha256:b716645f9aecd0c1418283af930804bbdbd68a73d855a60101c5aab8548d737d
environment:
EXE_NAME: ${EXE_NAME:?}
UPDATE_TO:
@@ -61,7 +61,7 @@ services:
platforms:
- "linux/arm64"
args:
VERIFYIMAGE: quay.io/pypa/manylinux2014_aarch64:latest
VERIFYIMAGE: quay.io/pypa/manylinux2014_aarch64:2025.12.19-1@sha256:36cbe6638c7c605c2b44a92e35751baa537ec8902112f790139d89c7e1ccd2a4
environment:
EXE_NAME: ${EXE_NAME:?}
UPDATE_TO:
@@ -97,7 +97,7 @@ services:
platforms:
- "linux/arm/v7"
args:
VERIFYIMAGE: arm32v7/debian:bullseye
VERIFYIMAGE: arm32v7/debian:bullseye@sha256:9d544bf6ff73e36b8df1b7e415f6c8ee40ed84a0f3a26970cac8ea88b0ccf2ac
environment:
EXE_NAME: ${EXE_NAME:?}
UPDATE_TO:
@@ -132,7 +132,7 @@ services:
platforms:
- "linux/amd64"
args:
VERIFYIMAGE: alpine:3.22
VERIFYIMAGE: alpine:3.23.2@sha256:865b95f46d98cf867a156fe4a135ad3fe50d2056aa3f25ed31662dff6da4eb62
environment:
EXE_NAME: ${EXE_NAME:?}
UPDATE_TO:
@@ -168,7 +168,7 @@ services:
platforms:
- "linux/arm64"
args:
VERIFYIMAGE: alpine:3.22
VERIFYIMAGE: alpine:3.23.2@sha256:865b95f46d98cf867a156fe4a135ad3fe50d2056aa3f25ed31662dff6da4eb62
environment:
EXE_NAME: ${EXE_NAME:?}
UPDATE_TO:

View File

@@ -6,43 +6,35 @@ if [[ -z "${PYTHON_VERSION:-}" ]]; then
echo "Defaulting to using Python ${PYTHON_VERSION}"
fi
function runpy {
"/opt/shared-cpython-${PYTHON_VERSION}/bin/python${PYTHON_VERSION}" "$@"
}
function venvpy {
"python${PYTHON_VERSION}" "$@"
}
INCLUDES=(
--include-group pyinstaller
--include-group secretstorage
--include-extra pyinstaller
--include-extra secretstorage
)
if [[ -z "${EXCLUDE_CURL_CFFI:-}" ]]; then
INCLUDES+=(--include-group curl-cffi)
INCLUDES+=(--include-extra build-curl-cffi)
fi
runpy -m venv /yt-dlp-build-venv
py"${PYTHON_VERSION}" -m venv /yt-dlp-build-venv
# shellcheck disable=SC1091
source /yt-dlp-build-venv/bin/activate
# Inside the venv we use venvpy instead of runpy
venvpy -m ensurepip --upgrade --default-pip
venvpy -m devscripts.install_deps --only-optional-groups --include-group build
venvpy -m devscripts.install_deps "${INCLUDES[@]}"
venvpy -m devscripts.make_lazy_extractors
venvpy devscripts/update-version.py -c "${CHANNEL}" -r "${ORIGIN}" "${VERSION}"
# Inside the venv we can use python instead of py3.13 or py3.14 etc
python -m devscripts.install_deps "${INCLUDES[@]}"
python -m devscripts.make_lazy_extractors
python devscripts/update-version.py -c "${CHANNEL}" -r "${ORIGIN}" "${VERSION}"
if [[ -z "${SKIP_ONEDIR_BUILD:-}" ]]; then
mkdir -p /build
venvpy -m bundle.pyinstaller --onedir --distpath=/build
python -m bundle.pyinstaller --onedir --distpath=/build
pushd "/build/${EXE_NAME}"
chmod +x "${EXE_NAME}"
venvpy -m zipfile -c "/yt-dlp/dist/${EXE_NAME}.zip" ./
python -m zipfile -c "/yt-dlp/dist/${EXE_NAME}.zip" ./
popd
fi
if [[ -z "${SKIP_ONEFILE_BUILD:-}" ]]; then
venvpy -m bundle.pyinstaller
python -m bundle.pyinstaller
chmod +x "./dist/${EXE_NAME}"
fi
deactivate

View File

@@ -319,5 +319,23 @@
"action": "add",
"when": "6224a3898821965a7d6a2cb9cc2de40a0fd6e6bc",
"short": "[priority] **An external JavaScript runtime is now required for full YouTube support**\nyt-dlp now requires users to have an external JavaScript runtime (e.g. Deno) installed in order to solve the JavaScript challenges presented by YouTube. [Read more](https://github.com/yt-dlp/yt-dlp/issues/15012)"
},
{
"action": "change",
"when": "c63b4e2a2b81cc78397c8709ef53ffd29bada213",
"short": "[cleanup] Misc (#14767)",
"authors": ["bashonly", "seproDev", "matyb08"]
},
{
"action": "change",
"when": "abf29e3e72e8a4dcae61e2ceaf37ce8405af61ab",
"short": "[ie/youtube] Fix `player_skip=js` extractor-arg (#15428)",
"authors": ["bashonly"]
},
{
"action": "change",
"when": "e2ea6bd6ab639f910b99e55add18856974ff4c3a",
"short": "[ie] Fix prioritization of Youtube URL matching (#15596)",
"authors": ["Grub4K"]
}
]

View File

@@ -25,16 +25,16 @@ def parse_args():
'-e', '--exclude-dependency', metavar='DEPENDENCY', action='append',
help='exclude a dependency (can be used multiple times)')
parser.add_argument(
'-i', '--include-group', metavar='GROUP', action='append',
help='include an optional dependency group (can be used multiple times)')
'-i', '--include-extra', metavar='EXTRA', action='append',
help='include an extra/optional-dependencies list (can be used multiple times)')
parser.add_argument(
'-c', '--cherry-pick', metavar='DEPENDENCY', action='append',
help=(
'only include a specific dependency from the resulting dependency list '
'(can be used multiple times)'))
parser.add_argument(
'-o', '--only-optional-groups', action='store_true',
help='omit default dependencies unless the "default" group is specified with --include-group')
'-o', '--omit-default', action='store_true',
help='omit the "default" extra unless it is explicitly included (it is included by default)')
parser.add_argument(
'-p', '--print', action='store_true',
help='only print requirements to stdout')
@@ -51,27 +51,27 @@ def uniq(arg) -> dict[str, None]:
def main():
args = parse_args()
project_table = parse_toml(read_file(args.input))['project']
recursive_pattern = re.compile(rf'{project_table["name"]}\[(?P<group_name>[\w-]+)\]')
optional_groups = project_table['optional-dependencies']
recursive_pattern = re.compile(rf'{project_table["name"]}\[(?P<extra_name>[\w-]+)\]')
extras = project_table['optional-dependencies']
excludes = uniq(args.exclude_dependency)
only_includes = uniq(args.cherry_pick)
include_groups = uniq(args.include_group)
include_extras = uniq(args.include_extra)
def yield_deps(group):
for dep in group:
def yield_deps(extra):
for dep in extra:
if mobj := recursive_pattern.fullmatch(dep):
yield from optional_groups.get(mobj.group('group_name'), ())
yield from extras.get(mobj.group('extra_name'), ())
else:
yield dep
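# Example (hypothetical extras contents): with
#   extras = {'secretstorage': ['cffi', 'secretstorage'],
#             'pyinstaller': ['yt-dlp[secretstorage]', 'pyinstaller>=6.17.0']}
# yield_deps(extras['pyinstaller']) yields 'cffi', 'secretstorage',
# then 'pyinstaller>=6.17.0', expanding the recursive 'yt-dlp[...]' entry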
targets = {}
if not args.only_optional_groups:
if not args.omit_default:
# legacy: 'dependencies' is empty now
targets.update(dict.fromkeys(project_table['dependencies']))
targets.update(dict.fromkeys(yield_deps(optional_groups['default'])))
targets.update(dict.fromkeys(yield_deps(extras['default'])))
for include in filter(None, map(optional_groups.get, include_groups)):
for include in filter(None, map(extras.get, include_extras)):
targets.update(dict.fromkeys(yield_deps(include)))
def target_filter(target):

View File

@@ -251,7 +251,13 @@ class CommitRange:
''', re.VERBOSE | re.DOTALL)
EXTRACTOR_INDICATOR_RE = re.compile(r'(?:Fix|Add)\s+Extractors?', re.IGNORECASE)
REVERT_RE = re.compile(r'(?:\[[^\]]+\]\s+)?(?i:Revert)\s+([\da-f]{40})')
FIXES_RE = re.compile(r'(?i:(?:bug\s*)?fix(?:es)?(?:\s+bugs?)?(?:\s+in|\s+for)?|Improve)\s+([\da-f]{40})')
FIXES_RE = re.compile(r'''
(?i:
(?:bug\s*)?fix(?:es)?(?:
\s+(?:bugs?|regression(?:\s+introduced)?)
)?(?:\s+(?:in|for|from|by))?
|Improve
)\s+([\da-f]{40})''', re.VERBOSE)
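# e.g. (hypothetical messages) the broadened pattern now also matches
# 'Fixes regression introduced by <40-hex-hash>' and 'fix for <40-hex-hash>',
# in addition to the previous 'Fix <hash>'/'Improve <hash>' forms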
UPSTREAM_MERGE_RE = re.compile(r'Update to ytdl-commit-([\da-f]+)')
def __init__(self, start, end, default_author=None):

View File

@@ -21,8 +21,6 @@ def setup_variables(environment):
SOURCE_PYPI_PROJECT, SOURCE_PYPI_SUFFIX,
TARGET_PYPI_PROJECT, TARGET_PYPI_SUFFIX,
SOURCE_ARCHIVE_REPO, TARGET_ARCHIVE_REPO,
HAS_SOURCE_ARCHIVE_REPO_TOKEN,
HAS_TARGET_ARCHIVE_REPO_TOKEN,
HAS_ARCHIVE_REPO_TOKEN
`INPUTS` must contain these keys:
@@ -37,8 +35,6 @@ def setup_variables(environment):
PROCESSED = json.loads(environment['PROCESSED'])
source_channel = None
does_not_have_needed_token = False
target_repo_token = None
pypi_project = None
pypi_suffix = None
@@ -81,28 +77,19 @@ def setup_variables(environment):
target_repo = REPOSITORY
if target_repo != REPOSITORY:
target_repo = environment['TARGET_ARCHIVE_REPO']
target_repo_token = f'{PROCESSED["target_repo"].upper()}_ARCHIVE_REPO_TOKEN'
if not json.loads(environment['HAS_TARGET_ARCHIVE_REPO_TOKEN']):
does_not_have_needed_token = True
pypi_project = environment['TARGET_PYPI_PROJECT'] or None
pypi_suffix = environment['TARGET_PYPI_SUFFIX'] or None
else:
target_tag = source_tag or version
if source_channel:
target_repo = source_channel
target_repo_token = f'{PROCESSED["source_repo"].upper()}_ARCHIVE_REPO_TOKEN'
if not json.loads(environment['HAS_SOURCE_ARCHIVE_REPO_TOKEN']):
does_not_have_needed_token = True
pypi_project = environment['SOURCE_PYPI_PROJECT'] or None
pypi_suffix = environment['SOURCE_PYPI_SUFFIX'] or None
else:
target_repo = REPOSITORY
if does_not_have_needed_token:
if not json.loads(environment['HAS_ARCHIVE_REPO_TOKEN']):
print(f'::error::Repository access secret {target_repo_token} not found')
return None
target_repo_token = 'ARCHIVE_REPO_TOKEN'
if target_repo != REPOSITORY and not json.loads(environment['HAS_ARCHIVE_REPO_TOKEN']):
return None
if target_repo == REPOSITORY and not INPUTS['prerelease']:
pypi_project = environment['PYPI_PROJECT'] or None
@@ -111,7 +98,6 @@ def setup_variables(environment):
'channel': resolved_source,
'version': version,
'target_repo': target_repo,
'target_repo_token': target_repo_token,
'target_tag': target_tag,
'pypi_project': pypi_project,
'pypi_suffix': pypi_suffix,
@@ -147,6 +133,7 @@ if __name__ == '__main__':
outputs = setup_variables(dict(os.environ))
if not outputs:
print('::error::Repository access secret ARCHIVE_REPO_TOKEN not found')
sys.exit(1)
print('::group::Output variables')

View File

@@ -9,8 +9,10 @@ import json
from devscripts.setup_variables import STABLE_REPOSITORY, process_inputs, setup_variables
from devscripts.utils import calculate_version
GENERATE_TEST_DATA = object()
def _test(github_repository, note, repo_vars, repo_secrets, inputs, expected=None, ignore_revision=False):
def _test(github_repository, note, repo_vars, repo_secrets, inputs, expected, ignore_revision=False):
inp = inputs.copy()
inp.setdefault('linux_armv7l', True)
inp.setdefault('prerelease', False)
@@ -33,16 +35,19 @@ def _test(github_repository, note, repo_vars, repo_secrets, inputs, expected=Non
'TARGET_PYPI_SUFFIX': variables.get(f'{target_repo}_PYPI_SUFFIX') or '',
'SOURCE_ARCHIVE_REPO': variables.get(f'{source_repo}_ARCHIVE_REPO') or '',
'TARGET_ARCHIVE_REPO': variables.get(f'{target_repo}_ARCHIVE_REPO') or '',
'HAS_SOURCE_ARCHIVE_REPO_TOKEN': json.dumps(bool(secrets.get(f'{source_repo}_ARCHIVE_REPO_TOKEN'))),
'HAS_TARGET_ARCHIVE_REPO_TOKEN': json.dumps(bool(secrets.get(f'{target_repo}_ARCHIVE_REPO_TOKEN'))),
'HAS_ARCHIVE_REPO_TOKEN': json.dumps(bool(secrets.get('ARCHIVE_REPO_TOKEN'))),
}
result = setup_variables(env)
if not expected:
if expected is GENERATE_TEST_DATA:
print(' {\n' + '\n'.join(f' {k!r}: {v!r},' for k, v in result.items()) + '\n }')
return
if expected is None:
assert result is None, f'expected error/None but got dict: {github_repository} {note}'
return
exp = expected.copy()
if ignore_revision:
assert len(result['version']) == len(exp['version']), f'revision missing: {github_repository} {note}'
@@ -77,7 +82,6 @@ def test_setup_variables():
'channel': 'stable',
'version': DEFAULT_VERSION,
'target_repo': STABLE_REPOSITORY,
'target_repo_token': None,
'target_tag': DEFAULT_VERSION,
'pypi_project': 'yt-dlp',
'pypi_suffix': None,
@@ -91,7 +95,6 @@ def test_setup_variables():
'channel': 'nightly',
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': 'yt-dlp/yt-dlp-nightly-builds',
'target_repo_token': 'ARCHIVE_REPO_TOKEN',
'target_tag': DEFAULT_VERSION_WITH_REVISION,
'pypi_project': 'yt-dlp',
'pypi_suffix': 'dev',
@@ -106,7 +109,6 @@ def test_setup_variables():
'channel': 'nightly',
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': 'yt-dlp/yt-dlp-nightly-builds',
'target_repo_token': 'ARCHIVE_REPO_TOKEN',
'target_tag': DEFAULT_VERSION_WITH_REVISION,
'pypi_project': 'yt-dlp',
'pypi_suffix': 'dev',
@@ -120,7 +122,6 @@ def test_setup_variables():
'channel': 'master',
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': 'yt-dlp/yt-dlp-master-builds',
'target_repo_token': 'ARCHIVE_REPO_TOKEN',
'target_tag': DEFAULT_VERSION_WITH_REVISION,
'pypi_project': None,
'pypi_suffix': None,
@@ -135,7 +136,6 @@ def test_setup_variables():
'channel': 'master',
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': 'yt-dlp/yt-dlp-master-builds',
'target_repo_token': 'ARCHIVE_REPO_TOKEN',
'target_tag': DEFAULT_VERSION_WITH_REVISION,
'pypi_project': None,
'pypi_suffix': None,
@@ -149,7 +149,6 @@ def test_setup_variables():
'channel': 'stable',
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': STABLE_REPOSITORY,
'target_repo_token': None,
'target_tag': 'experimental',
'pypi_project': None,
'pypi_suffix': None,
@@ -163,7 +162,6 @@ def test_setup_variables():
'channel': 'stable',
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': STABLE_REPOSITORY,
'target_repo_token': None,
'target_tag': 'experimental',
'pypi_project': None,
'pypi_suffix': None,
@@ -175,7 +173,6 @@ def test_setup_variables():
'channel': FORK_REPOSITORY,
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': FORK_REPOSITORY,
'target_repo_token': None,
'target_tag': DEFAULT_VERSION_WITH_REVISION,
'pypi_project': None,
'pypi_suffix': None,
@@ -186,7 +183,6 @@ def test_setup_variables():
'channel': FORK_REPOSITORY,
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': FORK_REPOSITORY,
'target_repo_token': None,
'target_tag': DEFAULT_VERSION_WITH_REVISION,
'pypi_project': None,
'pypi_suffix': None,
@@ -201,7 +197,6 @@ def test_setup_variables():
'channel': f'{FORK_REPOSITORY}@nightly',
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': FORK_REPOSITORY,
'target_repo_token': None,
'target_tag': 'nightly',
'pypi_project': None,
'pypi_suffix': None,
@@ -216,7 +211,6 @@ def test_setup_variables():
'channel': f'{FORK_REPOSITORY}@master',
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': FORK_REPOSITORY,
'target_repo_token': None,
'target_tag': 'master',
'pypi_project': None,
'pypi_suffix': None,
@@ -227,7 +221,6 @@ def test_setup_variables():
'channel': FORK_REPOSITORY,
'version': f'{DEFAULT_VERSION[:10]}.123',
'target_repo': FORK_REPOSITORY,
'target_repo_token': None,
'target_tag': f'{DEFAULT_VERSION[:10]}.123',
'pypi_project': None,
'pypi_suffix': None,
@@ -239,7 +232,6 @@ def test_setup_variables():
'channel': FORK_REPOSITORY,
'version': DEFAULT_VERSION,
'target_repo': FORK_REPOSITORY,
'target_repo_token': None,
'target_tag': DEFAULT_VERSION,
'pypi_project': None,
'pypi_suffix': None,
@@ -250,19 +242,16 @@ def test_setup_variables():
'channel': FORK_REPOSITORY,
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': FORK_REPOSITORY,
'target_repo_token': None,
'target_tag': DEFAULT_VERSION_WITH_REVISION,
'pypi_project': None,
'pypi_suffix': None,
}, ignore_revision=True)
_test(
FORK_REPOSITORY, 'fork w/NIGHTLY_ARCHIVE_REPO_TOKEN, nightly', {
FORK_REPOSITORY, 'fork, nightly', {
'NIGHTLY_ARCHIVE_REPO': f'{FORK_ORG}/yt-dlp-nightly-builds',
'PYPI_PROJECT': 'yt-dlp-test',
}, {
'NIGHTLY_ARCHIVE_REPO_TOKEN': '1',
}, {
}, BASE_REPO_SECRETS, {
'source': f'{FORK_ORG}/yt-dlp-nightly-builds',
'target': 'nightly',
'prerelease': True,
@@ -270,19 +259,16 @@ def test_setup_variables():
'channel': f'{FORK_ORG}/yt-dlp-nightly-builds',
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': f'{FORK_ORG}/yt-dlp-nightly-builds',
'target_repo_token': 'NIGHTLY_ARCHIVE_REPO_TOKEN',
'target_tag': DEFAULT_VERSION_WITH_REVISION,
'pypi_project': None,
'pypi_suffix': None,
}, ignore_revision=True)
_test(
FORK_REPOSITORY, 'fork w/MASTER_ARCHIVE_REPO_TOKEN, master', {
FORK_REPOSITORY, 'fork, master', {
'MASTER_ARCHIVE_REPO': f'{FORK_ORG}/yt-dlp-master-builds',
'MASTER_PYPI_PROJECT': 'yt-dlp-test',
'MASTER_PYPI_SUFFIX': 'dev',
}, {
'MASTER_ARCHIVE_REPO_TOKEN': '1',
}, {
}, BASE_REPO_SECRETS, {
'source': f'{FORK_ORG}/yt-dlp-master-builds',
'target': 'master',
'prerelease': True,
@@ -290,7 +276,6 @@ def test_setup_variables():
'channel': f'{FORK_ORG}/yt-dlp-master-builds',
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': f'{FORK_ORG}/yt-dlp-master-builds',
'target_repo_token': 'MASTER_ARCHIVE_REPO_TOKEN',
'target_tag': DEFAULT_VERSION_WITH_REVISION,
'pypi_project': 'yt-dlp-test',
'pypi_suffix': 'dev',
@@ -302,7 +287,6 @@ def test_setup_variables():
'channel': f'{FORK_REPOSITORY}@experimental',
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': FORK_REPOSITORY,
'target_repo_token': None,
'target_tag': 'experimental',
'pypi_project': None,
'pypi_suffix': None,
@@ -317,8 +301,15 @@ def test_setup_variables():
'channel': 'stable',
'version': DEFAULT_VERSION_WITH_REVISION,
'target_repo': FORK_REPOSITORY,
'target_repo_token': None,
'target_tag': 'experimental',
'pypi_project': None,
'pypi_suffix': None,
}, ignore_revision=True)
_test(
STABLE_REPOSITORY, 'official vars but no ARCHIVE_REPO_TOKEN, nightly',
BASE_REPO_VARS, {}, {
'source': 'nightly',
'target': 'nightly',
'prerelease': True,
}, None)

View File

@@ -9,10 +9,9 @@ authors = [
]
maintainers = [
{email = "maintainers@yt-dlp.org"},
{name = "Grub4K", email = "contact@grub4k.xyz"},
{name = "Grub4K", email = "contact@grub4k.dev"},
{name = "bashonly", email = "bashonly@protonmail.com"},
{name = "coletdjnz", email = "coletdjnz@protonmail.com"},
{name = "sepro", email = "sepro@sepr0.com"},
]
description = "A feature-rich command-line audio/video downloader"
readme = "README.md"
@@ -56,20 +55,27 @@ default = [
"requests>=2.32.2,<3",
"urllib3>=2.0.2,<3",
"websockets>=13.0",
"yt-dlp-ejs==0.3.1",
"yt-dlp-ejs==0.4.0",
]
curl-cffi = [
"curl-cffi>=0.5.10,!=0.6.*,!=0.7.*,!=0.8.*,!=0.9.*,<0.14; implementation_name=='cpython'",
"curl-cffi>=0.5.10,!=0.6.*,!=0.7.*,!=0.8.*,!=0.9.*,<0.15; implementation_name=='cpython'",
]
build-curl-cffi = [
"curl-cffi==0.13.0; sys_platform=='darwin' or (sys_platform=='linux' and platform_machine!='armv7l')",
"curl-cffi==0.14.0; sys_platform=='win32' or (sys_platform=='linux' and platform_machine=='armv7l')",
]
secretstorage = [
"cffi",
"secretstorage",
]
deno = [
"deno>=2.6.6", # v2.6.5 fixes compatibility, v2.6.6 adds integrity check
]
build = [
"build",
"hatchling>=1.27.0",
"pip",
"setuptools>=71.0.2,<81", # See https://github.com/pyinstaller/pyinstaller/issues/9149
"setuptools>=71.0.2",
"wheel",
]
dev = [
@@ -86,7 +92,7 @@ test = [
"pytest-rerunfailures~=14.0",
]
pyinstaller = [
"pyinstaller>=6.13.0", # Windows temp cleanup fixed in 6.13.0
"pyinstaller>=6.17.0", # 6.17.0+ needed for compat with setuptools 81+
]
[project.urls]

View File

@@ -50,8 +50,10 @@ The only reliable way to check if a site is supported is to try it.
- **aenetworks:collection**
- **aenetworks:show**
- **AeonCo**
- **agalega:videos**
- **AirTV**
- **AitubeKZVideo**
- **Alibaba**
- **AliExpressLive**
- **AlJazeera**
- **Allocine**
@@ -83,11 +85,10 @@ The only reliable way to check if a site is supported is to try it.
- **ant1newsgr:embed**: ant1news.gr embedded videos
- **antenna:watch**: antenna.gr and ant1news.gr videos
- **Anvato**
- **aol.com**: Yahoo screen and movies (**Currently broken**)
- **aol.com**: (**Currently broken**)
- **APA**
- **Aparat**
- **apple:music:connect**: Apple Music Connect
- **AppleDaily**: 臺灣蘋果日報
- **ApplePodcasts**
- **appletrailers**
- **appletrailers:section**
@@ -190,6 +191,7 @@ The only reliable way to check if a site is supported is to try it.
- **Biography**
- **BitChute**
- **BitChuteChannel**
- **Bitmovin**
- **BlackboardCollaborate**
- **BlackboardCollaborateLaunch**
- **BleacherReport**: (**Currently broken**)
@@ -303,6 +305,7 @@ The only reliable way to check if a site is supported is to try it.
- **cpac:playlist**
- **Cracked**
- **Craftsy**
- **croatian.film**
- **CrooksAndLiars**
- **CrowdBunker**
- **CrowdBunkerChannel**
@@ -412,6 +415,7 @@ The only reliable way to check if a site is supported is to try it.
- **Erocast**
- **EroProfile**: [*eroprofile*](## "netrc machine")
- **EroProfile:album**
- **ERRArhiiv**
- **ERRJupiter**
- **ertflix**: ERTFLIX videos
- **ertflix:codename**: ERTFLIX videos by codename
@@ -430,7 +434,7 @@ The only reliable way to check if a site is supported is to try it.
- **EWETVRecordings**: [*ewetv*](## "netrc machine")
- **Expressen**
- **EyedoTV**
- **facebook**: [*facebook*](## "netrc machine")
- **facebook**
- **facebook:ads**
- **facebook:reel**
- **FacebookPluginsVideo**
@@ -445,6 +449,7 @@ The only reliable way to check if a site is supported is to try it.
- **fc2:live**
- **Fczenit**
- **Fifa**
- **FilmArchiv**: FILMARCHIV ON
- **filmon**
- **filmon:channel**
- **Filmweb**
@@ -467,10 +472,10 @@ The only reliable way to check if a site is supported is to try it.
- **fptplay**: fptplay.vn
- **FrancaisFacile**
- **FranceCulture**
- **franceinfo**: franceinfo.fr (formerly francetvinfo.fr)
- **FranceInter**
- **francetv**
- **francetv:site**
- **francetvinfo.fr**
- **Freesound**
- **freespeech.org**
- **freetv:series**
@@ -618,7 +623,7 @@ The only reliable way to check if a site is supported is to try it.
- **IPrimaCNN**
- **iq.com**: International version of iQiyi
- **iq.com:album**
- **iqiyi**: [*iqiyi*](## "netrc machine") 爱奇艺
- **iqiyi**: 爱奇艺
- **IslamChannel**
- **IslamChannelSeries**
- **IsraelNationalNews**
@@ -731,7 +736,7 @@ The only reliable way to check if a site is supported is to try it.
- **loc**: Library of Congress
- **Loco**
- **loom**
- **loom:folder**
- **loom:folder**: (**Currently broken**)
- **LoveHomePorn**
- **LRTRadio**
- **LRTStream**
@@ -752,9 +757,6 @@ The only reliable way to check if a site is supported is to try it.
- **mangomolo:live**
- **mangomolo:video**
- **MangoTV**: 芒果TV
- **ManotoTV**: Manoto TV (Episode)
- **ManotoTVLive**: Manoto TV (Live)
- **ManotoTVShow**: Manoto TV (Show)
- **ManyVids**
- **MaoriTV**
- **Markiza**: (**Currently broken**)
@@ -762,7 +764,8 @@ The only reliable way to check if a site is supported is to try it.
- **massengeschmack.tv**
- **Masters**
- **MatchTV**
- **Mave**
- **mave**
- **mave:channel**
- **MBN**: mbn.co.kr (매일방송)
- **MDR**: MDR.DE
- **MedalTV**
@@ -889,12 +892,15 @@ The only reliable way to check if a site is supported is to try it.
- **NDTV**: (**Currently broken**)
- **nebula:channel**: [*watchnebula*](## "netrc machine")
- **nebula:media**: [*watchnebula*](## "netrc machine")
- **nebula:season**: [*watchnebula*](## "netrc machine")
- **nebula:subscriptions**: [*watchnebula*](## "netrc machine")
- **nebula:video**: [*watchnebula*](## "netrc machine")
- **NekoHacker**
- **NerdCubedFeed**
- **Nest**
- **NestClip**
- **NetAppCollection**
- **NetAppVideo**
- **netease:album**: 网易云音乐 - 专辑
- **netease:djradio**: 网易云音乐 - 电台
- **netease:mv**: 网易云音乐 - MV
@@ -908,15 +914,12 @@ The only reliable way to check if a site is supported is to try it.
- **Netverse**
- **NetversePlaylist**
- **NetverseSearch**: "netsearch:" prefix
- **Netzkino**: (**Currently broken**)
- **Netzkino**
- **Newgrounds**: [*newgrounds*](## "netrc machine")
- **Newgrounds:playlist**
- **Newgrounds:user**
- **NewsPicks**
- **Newsy**
- **NextMedia**: 蘋果日報
- **NextMediaActionNews**: 蘋果日報 - 動新聞
- **NextTV**: 壹電視 (**Currently broken**)
- **Nexx**
- **NexxEmbed**
- **nfb**: nfb.ca and onf.ca films and episodes
@@ -962,6 +965,7 @@ The only reliable way to check if a site is supported is to try it.
- **Nova**: TN.cz, Prásk.tv, Nova.cz, Novaplus.cz, FANDA.tv, Krásná.cz and Doma.cz
- **NovaEmbed**
- **NovaPlay**
- **NowCanal**
- **nowness**
- **nowness:playlist**
- **nowness:series**
@@ -1035,6 +1039,7 @@ The only reliable way to check if a site is supported is to try it.
- **PalcoMP3:artist**
- **PalcoMP3:song**
- **PalcoMP3:video**
- **PandaTv**: pandalive.co.kr (팬더티비)
- **Panopto**
- **PanoptoList**
- **PanoptoPlaylist**
@@ -1284,7 +1289,6 @@ The only reliable way to check if a site is supported is to try it.
- **sbs.co.kr:programs_vod**
- **schooltv**
- **ScienceChannel**
- **screen.yahoo:search**: Yahoo screen search; "yvsearch:" prefix
- **Screen9**
- **Screencast**
- **Screencastify**
@@ -1293,8 +1297,6 @@ The only reliable way to check if a site is supported is to try it.
- **ScrippsNetworks**
- **scrippsnetworks:watch**
- **Scrolller**
- **SCTE**: [*scte*](## "netrc machine") (**Currently broken**)
- **SCTECourse**: [*scte*](## "netrc machine") (**Currently broken**)
- **sejm**
- **Sen**
- **SenalColombiaLive**: (**Currently broken**)
@@ -1373,7 +1375,7 @@ The only reliable way to check if a site is supported is to try it.
- **Spiegel**
- **Sport5**
- **SportBox**: (**Currently broken**)
- **SportDeutschland**
- **sporteurope**
- **Spreaker**
- **SpreakerShow**
- **SpringboardPlatform**
@@ -1422,6 +1424,9 @@ The only reliable way to check if a site is supported is to try it.
- **TapTapAppIntl**
- **TapTapMoment**
- **TapTapPostIntl**
- **tarangplus:episodes**
- **tarangplus:playlist**
- **tarangplus:video**
- **Tass**: (**Currently broken**)
- **TBS**
- **TBSJPEpisode**
@@ -1461,6 +1466,8 @@ The only reliable way to check if a site is supported is to try it.
- **TFO**: (**Currently broken**)
- **theatercomplextown:ppv**: [*theatercomplextown*](## "netrc machine")
- **theatercomplextown:vod**: [*theatercomplextown*](## "netrc machine")
- **TheChosen**
- **TheChosenGroup**
- **TheGuardianPodcast**
- **TheGuardianPodcastPlaylist**
- **TheHighWire**
@@ -1532,8 +1539,8 @@ The only reliable way to check if a site is supported is to try it.
- **tv2playseries.hu**
- **TV4**: tv4.se and tv4play.se
- **TV5MONDE**
- **tv5unis**: (**Currently broken**)
- **tv5unis:video**: (**Currently broken**)
- **tv5unis**
- **tv5unis:video**
- **tv8.it**
- **tv8.it:live**: TV8 Live
- **tv8.it:playlist**: TV8 Playlist
@@ -1570,12 +1577,12 @@ The only reliable way to check if a site is supported is to try it.
- **twitch:videos:clips**: [*twitch*](## "netrc machine")
- **twitch:videos:collections**: [*twitch*](## "netrc machine")
- **twitch:vod**: [*twitch*](## "netrc machine")
- **twitter**: [*twitter*](## "netrc machine")
- **twitter:amplify**: [*twitter*](## "netrc machine")
- **twitter:broadcast**: [*twitter*](## "netrc machine")
- **twitter**
- **twitter:amplify**
- **twitter:broadcast**
- **twitter:card**
- **twitter:shortener**: [*twitter*](## "netrc machine")
- **twitter:spaces**: [*twitter*](## "netrc machine")
- **twitter:shortener**
- **twitter:spaces**
- **Txxx**
- **udemy**: [*udemy*](## "netrc machine")
- **udemy:course**: [*udemy*](## "netrc machine")
@@ -1672,7 +1679,9 @@ The only reliable way to check if a site is supported is to try it.
- **VODPlatform**
- **voicy**: (**Currently broken**)
- **voicy:channel**: (**Currently broken**)
- **VolejTV**
- **volejtv:category**
- **volejtv:club**
- **volejtv:match**
- **VoxMedia**
- **VoxMediaVolume**
- **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
@@ -1765,8 +1774,9 @@ The only reliable way to check if a site is supported is to try it.
- **XVideos**
- **xvideos:quickies**
- **XXXYMovies**
- **Yahoo**: Yahoo screen and movies
- **yahoo**
- **yahoo:japannews**: Yahoo! Japan News
- **yahoo:search**: "yvsearch:" prefix
- **YandexDisk**
- **yandexmusic:album**: Яндекс.Музыка - Альбом
- **yandexmusic:artist:albums**: Яндекс.Музыка - Артист - Альбомы
@@ -1778,6 +1788,7 @@ The only reliable way to check if a site is supported is to try it.
- **YapFiles**: (**Currently broken**)
- **Yappy**: (**Currently broken**)
- **YappyProfile**
- **yfanefa**
- **YleAreena**
- **YouJizz**
- **youku**: 优酷

View File

@@ -261,9 +261,9 @@ def sanitize_got_info_dict(got_dict):
def expect_info_dict(self, got_dict, expected_dict):
ALLOWED_KEYS_SORT_ORDER = (
# NB: Keep in sync with the docstring of extractor/common.py
'id', 'ext', 'direct', 'display_id', 'title', 'alt_title', 'description', 'media_type',
'ie_key', 'url', 'id', 'ext', 'direct', 'display_id', 'title', 'alt_title', 'description', 'media_type',
'uploader', 'uploader_id', 'uploader_url', 'channel', 'channel_id', 'channel_url', 'channel_is_verified',
'channel_follower_count', 'comment_count', 'view_count', 'concurrent_view_count',
'channel_follower_count', 'comment_count', 'view_count', 'concurrent_view_count', 'save_count',
'like_count', 'dislike_count', 'repost_count', 'average_rating', 'age_limit', 'duration', 'thumbnail', 'heatmap',
'chapters', 'chapter', 'chapter_number', 'chapter_id', 'start_time', 'end_time', 'section_start', 'section_end',
'categories', 'tags', 'cast', 'composers', 'artists', 'album_artists', 'creators', 'genres',

View File

@@ -227,9 +227,13 @@ class TestDevalue(unittest.TestCase):
{'a': 'b'}, 'revivers (indirect)')
self.assertEqual(
devalue.parse([['parse', 1], '{"a":0}'], revivers={'parse': lambda x: json.loads(x)}),
devalue.parse([['parse', 1], '{"a":0}'], revivers={'parse': json.loads}),
{'a': 0}, 'revivers (parse)')
self.assertEqual(
devalue.parse([{'a': 1, 'b': 3}, ['EmptyRef', 2], 'false', ['EmptyRef', 2]], revivers={'EmptyRef': json.loads}),
{'a': False, 'b': False}, msg='revivers (duplicate EmptyRef)')
if __name__ == '__main__':
unittest.main()

View File

@@ -1,44 +0,0 @@
#!/usr/bin/env python3
# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import FakeYDL, is_download_test
from yt_dlp.extractor import IqiyiIE
class WarningLogger:
def __init__(self):
self.messages = []
def warning(self, msg):
self.messages.append(msg)
def debug(self, msg):
pass
def error(self, msg):
pass
@is_download_test
class TestIqiyiSDKInterpreter(unittest.TestCase):
def test_iqiyi_sdk_interpreter(self):
"""
Test the functionality of IqiyiSDKInterpreter by trying to log in
If `sign` is incorrect, /validate call throws an HTTP 556 error
"""
logger = WarningLogger()
ie = IqiyiIE(FakeYDL({'logger': logger}))
ie._perform_login('foo', 'bar')
self.assertTrue('unable to log in:' in logger.messages[0])
if __name__ == '__main__':
unittest.main()

View File

@@ -88,6 +88,21 @@ CHALLENGES: list[Challenge] = [
'gN7a-hudCuAuPH6fByOk1_GNXN0yNMHShjZXS2VOgsEItAJz0tipeavEOmNdYN-wUtcEqD3bCXjc0iyKfAyZxCBGgIARwsSdQfJ2CJtt':
'MhudCuAuP-6fByOk1_GNXN7gNHHShjyXS2VOgsEItAJz0tipeav0OmNdYN-wUtcEqD3bCXjc0iyKfAyZxCBGgIARwsSdQfJ2CJtt',
}),
# c1c87fb0: tce variant broke sig solving; n and main variant are added only for regression testing
Challenge('c1c87fb0', Variant.main, JsChallengeType.N, {
'ZdZIqFPQK-Ty8wId': 'jCHBK5GuAFNa2',
}),
Challenge('c1c87fb0', Variant.main, JsChallengeType.SIG, {
'gN7a-hudCuAuPH6fByOk1_GNXN0yNMHShjZXS2VOgsEItAJz0tipeavEOmNdYN-wUtcEqD3bCXjc0iyKfAyZxCBGgIARwsSdQfJ2CJtt':
'ttJC2JfQdSswRAIgGBCxZyAfKyi0cjXCb3DqEctUw-NYdNmOEvaepit0zJAtIEsgOV2SXZjhSHMNy0NXNGa1kOyBf6HPuAuCduh-_',
}),
Challenge('c1c87fb0', Variant.tce, JsChallengeType.N, {
'ZdZIqFPQK-Ty8wId': 'jCHBK5GuAFNa2',
}),
Challenge('c1c87fb0', Variant.tce, JsChallengeType.SIG, {
'gN7a-hudCuAuPH6fByOk1_GNXN0yNMHShjZXS2VOgsEItAJz0tipeavEOmNdYN-wUtcEqD3bCXjc0iyKfAyZxCBGgIARwsSdQfJ2CJtt':
'ttJC2JfQdSswRAIgGBCxZyAfKyi0cjXCb3DqEctUw-NYdNmOEvaepit0zJAtIEsgOV2SXZjhSHMNy0NXNGa1kOyBf6HPuAuCduh-_',
}),
]
requests: list[JsChallengeRequest] = []

View File

@@ -755,6 +755,17 @@ class TestHTTPRequestHandler(TestRequestHandlerBase):
assert res.read(0) == b''
assert res.read() == b'<video src="/vid.mp4" /></html>'
def test_partial_read_greater_than_response_then_full_read(self, handler):
with handler() as rh:
for encoding in ('', 'gzip', 'deflate'):
res = validate_and_send(rh, Request(
f'http://127.0.0.1:{self.http_port}/content-encoding',
headers={'ytdl-encoding': encoding}))
assert res.headers.get('Content-Encoding') == encoding
assert res.read(512) == b'<html><video src="/vid.mp4" /></html>'
assert res.read(0) == b''
assert res.read() == b''
@pytest.mark.parametrize('handler', ['Urllib', 'Requests', 'CurlCFFI'], indirect=True)
@pytest.mark.handler_flaky('CurlCFFI', reason='segfaults')
@@ -920,6 +931,28 @@ class TestUrllibRequestHandler(TestRequestHandlerBase):
assert res.fp.fp is None
assert res.closed
def test_data_uri_partial_read_then_full_read(self, handler):
with handler() as rh:
res = validate_and_send(rh, Request('data:text/plain,hello%20world'))
assert res.read(6) == b'hello '
assert res.read(0) == b''
assert res.read() == b'world'
# Should automatically close the underlying file object
assert res.fp.closed
assert res.closed
def test_data_uri_partial_read_greater_than_response_then_full_read(self, handler):
with handler() as rh:
res = validate_and_send(rh, Request('data:text/plain,hello%20world'))
assert res.read(512) == b'hello world'
# Response and its underlying file object should already be closed now
assert res.fp.closed
assert res.closed
assert res.read(0) == b''
assert res.read() == b''
assert res.fp.closed
assert res.closed
def test_http_error_returns_content(self, handler):
# urllib HTTPError will try close the underlying response if reference to the HTTPError object is lost
def get_response():

View File

@@ -29,6 +29,11 @@ class TestMetadataFromField(unittest.TestCase):
MetadataParserPP.format_to_regex('%(title)s - %(artist)s'),
r'(?P<title>.+)\ \-\ (?P<artist>.+)')
self.assertEqual(MetadataParserPP.format_to_regex(r'(?P<x>.+)'), r'(?P<x>.+)')
self.assertEqual(MetadataParserPP.format_to_regex(r'text (?P<x>.+)'), r'text (?P<x>.+)')
self.assertEqual(MetadataParserPP.format_to_regex('x'), r'(?s)(?P<x>.+)')
self.assertEqual(MetadataParserPP.format_to_regex('Field_Name1'), r'(?s)(?P<Field_Name1>.+)')
self.assertEqual(MetadataParserPP.format_to_regex('é'), r'(?s)(?P<é>.+)')
self.assertEqual(MetadataParserPP.format_to_regex('invalid '), 'invalid ')
def test_field_to_template(self):
self.assertEqual(MetadataParserPP.field_to_template('title'), '%(title)s')

View File

@@ -489,6 +489,10 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unified_timestamp('Wednesday 31 December 1969 18:01:26 MDT'), 86)
self.assertEqual(unified_timestamp('12/31/1969 20:01:18 EDT', False), 78)
self.assertEqual(unified_timestamp('2026-01-01 00:00:00', tz_offset=0), 1767225600)
self.assertEqual(unified_timestamp('2026-01-01 00:00:00', tz_offset=8), 1767196800)
self.assertEqual(unified_timestamp('2026-01-01 00:00:00 +0800', tz_offset=-5), 1767196800)
def test_determine_ext(self):
self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4')
self.assertEqual(determine_ext('http://example.com/foo/bar/?download', None), None)
@@ -1276,6 +1280,9 @@ class TestUtil(unittest.TestCase):
on = js_to_json('[new Date("spam"), \'("eggs")\']')
self.assertEqual(json.loads(on), ['spam', '("eggs")'], msg='Date regex should match a single string')
on = js_to_json('[0.077, 7.06, 29.064, 169.0072]')
self.assertEqual(json.loads(on), [0.077, 7.06, 29.064, 169.0072])
def test_js_to_json_malformed(self):
self.assertEqual(js_to_json('42a1'), '42"a1"')
self.assertEqual(js_to_json('42a-1'), '42"a"-1')
@@ -1403,6 +1410,9 @@ class TestUtil(unittest.TestCase):
self.assertEqual(version_tuple('1'), (1,))
self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))
self.assertEqual(version_tuple('10.1-6'), (10, 1, 6)) # avconv style
self.assertEqual(version_tuple('invalid', lenient=True), (-1,))
self.assertEqual(version_tuple('1.2.3', lenient=True), (1, 2, 3))
self.assertEqual(version_tuple('12.34-something', lenient=True), (12, 34, -1))
def test_detect_exe_version(self):
self.assertEqual(detect_exe_version('''ffmpeg version 1.2.1

View File

@@ -40,7 +40,7 @@ TEST_DIR = os.path.dirname(os.path.abspath(__file__))
pytestmark = pytest.mark.handler_flaky(
'Websockets',
os.name != 'nt' and sys.implementation.name == 'pypy',
os.name == 'nt' or sys.implementation.name == 'pypy',
reason='segfaults',
)

View File

@@ -595,7 +595,7 @@ class YoutubeDL:
'width', 'height', 'asr', 'audio_channels', 'fps',
'tbr', 'abr', 'vbr', 'filesize', 'filesize_approx',
'timestamp', 'release_timestamp', 'available_at',
'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count', 'save_count',
'average_rating', 'comment_count', 'age_limit',
'start_time', 'end_time',
'chapter_number', 'season_number', 'episode_number',
@@ -1602,8 +1602,10 @@ class YoutubeDL:
if ret is NO_DEFAULT:
while True:
filename = self._format_screen(self.prepare_filename(info_dict), self.Styles.FILENAME)
reply = input(self._format_screen(
f'Download "{filename}"? (Y/n): ', self.Styles.EMPHASIS)).lower().strip()
self.to_screen(
self._format_screen(f'Download "{filename}"? (Y/n): ', self.Styles.EMPHASIS),
skip_eol=True)
reply = input().lower().strip()
if reply in {'y', ''}:
return None
elif reply == 'n':
@@ -3026,9 +3028,14 @@ class YoutubeDL:
format_selector = self.format_selector
while True:
if interactive_format_selection:
req_format = input(self._format_screen('\nEnter format selector ', self.Styles.EMPHASIS)
+ '(Press ENTER for default, or Ctrl+C to quit)'
+ self._format_screen(': ', self.Styles.EMPHASIS))
if not formats:
# Bypass interactive format selection if no formats & --ignore-no-formats-error
formats_to_download = None
break
self.to_screen(self._format_screen('\nEnter format selector ', self.Styles.EMPHASIS)
+ '(Press ENTER for default, or Ctrl+C to quit)'
+ self._format_screen(': ', self.Styles.EMPHASIS), skip_eol=True)
req_format = input()
try:
format_selector = self.build_format_selector(req_format) if req_format else None
except SyntaxError as err:
@@ -3474,11 +3481,12 @@ class YoutubeDL:
if dl_filename is not None:
self.report_file_already_downloaded(dl_filename)
elif fd:
for f in info_dict['requested_formats'] if fd != FFmpegFD else []:
f['filepath'] = fname = prepend_extension(
correct_ext(temp_filename, info_dict['ext']),
'f{}'.format(f['format_id']), info_dict['ext'])
downloaded.append(fname)
if fd != FFmpegFD and temp_filename != '-':
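# When writing to stdout ('-') there is no real file to split per-format,
# so per-format filepaths are only assigned for regular output files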
for f in info_dict['requested_formats']:
f['filepath'] = fname = prepend_extension(
correct_ext(temp_filename, info_dict['ext']),
'f{}'.format(f['format_id']), info_dict['ext'])
downloaded.append(fname)
info_dict['url'] = '\n'.join(f['url'] for f in info_dict['requested_formats'])
success, real_download = self.dl(temp_filename, info_dict)
info_dict['__real_download'] = real_download

View File

@@ -212,9 +212,16 @@ def _firefox_browser_dirs():
else:
yield from map(os.path.expanduser, (
# New installations of FF147+ respect the XDG base directory specification
# Ref: https://bugzilla.mozilla.org/show_bug.cgi?id=259356
os.path.join(_config_home(), 'mozilla/firefox'),
# Existing FF version<=146 installations
'~/.mozilla/firefox',
'~/snap/firefox/common/.mozilla/firefox',
# Flatpak XDG: https://docs.flatpak.org/en/latest/conventions.html#xdg-base-directories
'~/.var/app/org.mozilla.firefox/config/mozilla/firefox',
'~/.var/app/org.mozilla.firefox/.mozilla/firefox',
# Snap installations do not respect the XDG base directory specification
'~/snap/firefox/common/.mozilla/firefox',
))

View File

@@ -461,7 +461,8 @@ class FileDownloader:
min_sleep_interval = self.params.get('sleep_interval') or 0
max_sleep_interval = self.params.get('max_sleep_interval') or 0
if available_at := info_dict.get('available_at'):
requested_formats = info_dict.get('requested_formats') or [info_dict]
if available_at := max(f.get('available_at') or 0 for f in requested_formats):
forced_sleep_interval = available_at - int(time.time())
if forced_sleep_interval > min_sleep_interval:
sleep_note = 'as required by the site'

View File

@@ -457,6 +457,8 @@ class FFmpegFD(ExternalFD):
@classmethod
def available(cls, path=None):
# TODO: Fix path for ffmpeg
# FIXME: This may be wrong when --ffmpeg-location is used
return FFmpegPostProcessor().available
def on_process_started(self, proc, stdin):

View File

@@ -1,32 +1,4 @@
# flake8: noqa: F401
# isort: off
from .youtube import ( # Youtube is moved to the top to improve performance
YoutubeIE,
YoutubeClipIE,
YoutubeFavouritesIE,
YoutubeNotificationsIE,
YoutubeHistoryIE,
YoutubeTabIE,
YoutubeLivestreamEmbedIE,
YoutubePlaylistIE,
YoutubeRecommendedIE,
YoutubeSearchDateIE,
YoutubeSearchIE,
YoutubeSearchURLIE,
YoutubeMusicSearchURLIE,
YoutubeSubscriptionsIE,
YoutubeTruncatedIDIE,
YoutubeTruncatedURLIE,
YoutubeYtBeIE,
YoutubeYtUserIE,
YoutubeWatchLaterIE,
YoutubeShortsAudioPivotIE,
YoutubeConsentRedirectIE,
)
# isort: on
from .abc import (
ABCIE,
ABCIViewIE,
@@ -75,6 +47,7 @@ from .afreecatv import (
AfreecaTVLiveIE,
AfreecaTVUserIE,
)
from .agalega import AGalegaIE
from .agora import (
TokFMAuditionIE,
TokFMPodcastIE,
@@ -83,6 +56,7 @@ from .agora import (
)
from .airtv import AirTVIE
from .aitube import AitubeKZVideoIE
from .alibaba import AlibabaIE
from .aliexpress import AliExpressLiveIE
from .aljazeera import AlJazeeraIE
from .allocine import AllocineIE
@@ -429,6 +403,7 @@ from .cpac import (
)
from .cracked import CrackedIE
from .craftsy import CraftsyIE
from .croatianfilm import CroatianFilmIE
from .crooksandliars import CrooksAndLiarsIE
from .crowdbunker import (
CrowdBunkerChannelIE,
@@ -589,7 +564,10 @@ from .eroprofile import (
EroProfileAlbumIE,
EroProfileIE,
)
from .err import ERRJupiterIE
from .err import (
ERRArhiivIE,
ERRJupiterIE,
)
from .ertgr import (
ERTFlixCodenameIE,
ERTFlixIE,
@@ -636,6 +614,7 @@ from .fc2 import (
)
from .fczenit import FczenitIE
from .fifa import FifaIE
from .filmarchiv import FilmArchivIE
from .filmon import (
FilmOnChannelIE,
FilmOnIE,
@@ -691,6 +670,10 @@ from .frontendmasters import (
FrontendMastersIE,
FrontendMastersLessonIE,
)
from .frontro import (
TheChosenGroupIE,
TheChosenIE,
)
from .fujitv import FujiTVFODPlus7IE
from .funk import FunkIE
from .funker530 import Funker530IE
@@ -1080,11 +1063,6 @@ from .mangomolo import (
MangomoloLiveIE,
MangomoloVideoIE,
)
from .manoto import (
ManotoTVIE,
ManotoTVLiveIE,
ManotoTVShowIE,
)
from .manyvids import ManyVidsIE
from .maoritv import MaoriTVIE
from .markiza import (
@@ -1094,7 +1072,10 @@ from .markiza import (
from .massengeschmacktv import MassengeschmackTVIE
from .masters import MastersIE
from .matchtv import MatchTVIE
from .mave import MaveIE
from .mave import (
MaveChannelIE,
MaveIE,
)
from .mbn import MBNIE
from .mdr import MDRIE
from .medaltv import MedalTVIE
@@ -1269,6 +1250,7 @@ from .nebula import (
NebulaChannelIE,
NebulaClassIE,
NebulaIE,
NebulaSeasonIE,
NebulaSubscriptionsIE,
)
from .nekohacker import NekoHackerIE
@@ -1277,6 +1259,10 @@ from .nest import (
NestClipIE,
NestIE,
)
from .netapp import (
NetAppCollectionIE,
NetAppVideoIE,
)
from .neteasemusic import (
NetEaseMusicAlbumIE,
NetEaseMusicDjRadioIE,
@@ -1299,12 +1285,6 @@ from .newgrounds import (
)
from .newspicks import NewsPicksIE
from .newsy import NewsyIE
from .nextmedia import (
AppleDailyIE,
NextMediaActionNewsIE,
NextMediaIE,
NextTVIE,
)
from .nexx import (
NexxEmbedIE,
NexxIE,
@@ -1473,6 +1453,7 @@ from .palcomp3 import (
PalcoMP3IE,
PalcoMP3VideoIE,
)
from .pandatv import PandaTvIE
from .panopto import (
PanoptoIE,
PanoptoListIE,
@@ -1821,10 +1802,6 @@ from .scrippsnetworks import (
ScrippsNetworksWatchIE,
)
from .scrolller import ScrolllerIE
from .scte import (
SCTEIE,
SCTECourseIE,
)
from .sejmpl import SejmIE
from .sen import SenIE
from .senalcolombia import SenalColombiaLiveIE
@@ -2006,6 +1983,11 @@ from .taptap import (
TapTapMomentIE,
TapTapPostIntlIE,
)
from .tarangplus import (
TarangPlusEpisodesIE,
TarangPlusPlaylistIE,
TarangPlusVideoIE,
)
from .tass import TassIE
from .tbs import TBSIE
from .tbsjp import (
@@ -2381,7 +2363,11 @@ from .voicy import (
VoicyChannelIE,
VoicyIE,
)
from .volejtv import VolejTVIE
from .volejtv import (
VolejTVCategoryPlaylistIE,
VolejTVClubPlaylistIE,
VolejTVIE,
)
from .voxmedia import (
VoxMediaIE,
VoxMediaVolumeIE,
@@ -2523,6 +2509,7 @@ from .yappy import (
YappyIE,
YappyProfileIE,
)
from .yfanefa import YfanefaIE
from .yle_areena import YleAreenaIE
from .youjizz import YouJizzIE
from .youku import (
@@ -2543,6 +2530,29 @@ from .youporn import (
YouPornTagIE,
YouPornVideosIE,
)
from .youtube import (
YoutubeClipIE,
YoutubeConsentRedirectIE,
YoutubeFavouritesIE,
YoutubeHistoryIE,
YoutubeIE,
YoutubeLivestreamEmbedIE,
YoutubeMusicSearchURLIE,
YoutubeNotificationsIE,
YoutubePlaylistIE,
YoutubeRecommendedIE,
YoutubeSearchDateIE,
YoutubeSearchIE,
YoutubeSearchURLIE,
YoutubeShortsAudioPivotIE,
YoutubeSubscriptionsIE,
YoutubeTabIE,
YoutubeTruncatedIDIE,
YoutubeTruncatedURLIE,
YoutubeWatchLaterIE,
YoutubeYtBeIE,
YoutubeYtUserIE,
)
from .zaiko import (
ZaikoETicketIE,
ZaikoIE,

View File

@@ -0,0 +1,91 @@
import json
import time
from .common import InfoExtractor
from ..utils import jwt_decode_hs256, url_or_none
from ..utils.traversal import traverse_obj
class AGalegaBaseIE(InfoExtractor):
_access_token = None
@staticmethod
def _jwt_is_expired(token):
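# Treat the token as stale once fewer than 120 seconds of validity remain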
return jwt_decode_hs256(token)['exp'] - time.time() < 120
def _refresh_access_token(self, video_id):
AGalegaBaseIE._access_token = self._download_json(
'https://www.agalega.gal/api/fetch-api/jwt/token', video_id,
note='Downloading access token',
data=json.dumps({
'username': None,
'password': None,
'client': 'crtvg',
'checkExistsCookies': False,
}).encode())['access']
def _call_api(self, endpoint, display_id, note, fatal=True, query=None):
if not AGalegaBaseIE._access_token or self._jwt_is_expired(AGalegaBaseIE._access_token):
self._refresh_access_token(endpoint)
return self._download_json(
f'https://api-agalega.interactvty.com/api/2.0/contents/{endpoint}', display_id,
note=note, fatal=fatal, query=query,
headers={'Authorization': f'jwtok {AGalegaBaseIE._access_token}'})
class AGalegaIE(AGalegaBaseIE):
IE_NAME = 'agalega:videos'
_VALID_URL = r'https?://(?:www\.)?agalega\.gal/videos/(?:detail/)?(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://www.agalega.gal/videos/288664-lr-ninguencheconta',
'md5': '04533a66c5f863d08dd9724b11d1c223',
'info_dict': {
'id': '288664',
'title': 'Roberto e Ángel Martín atenden consultas dos espectadores',
'description': 'O cómico ademais fai un repaso dalgúns momentos da súa traxectoria profesional',
'thumbnail': 'https://crtvg-bucket.flumotion.cloud/content_cards/2ef32c3b9f6249d9868fd8f11d389d3d.png',
'ext': 'mp4',
},
}, {
'url': 'https://www.agalega.gal/videos/detail/296152-pulso-activo-7',
'md5': '26df7fdcf859f38ad92d837279d6b56d',
'info_dict': {
'id': '296152',
'title': 'Pulso activo | 18-11-2025',
'description': 'Anxo, Noemí, Silvia e Estrella comparten as sensacións da clase de Eddy.',
'thumbnail': 'https://crtvg-bucket.flumotion.cloud/content_cards/a6bb7da6c8994b82bf961ac6cad1707b.png',
'ext': 'mp4',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
content_data = self._call_api(
f'content/{video_id}/', video_id, note='Downloading content data', fatal=False,
query={
'optional_fields': 'image,is_premium,short_description,has_subtitle',
})
resource_data = self._call_api(
f'content_resources/{video_id}/', video_id, note='Downloading resource data',
query={
'optional_fields': 'media_url',
})
formats = []
subtitles = {}
for m3u8_url in traverse_obj(resource_data, ('results', ..., 'media_url', {url_or_none})):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
m3u8_url, video_id, ext='mp4', m3u8_id='hls')
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
**traverse_obj(content_data, {
'title': ('name', {str}),
'description': (('description', 'short_description'), {str}, any),
'thumbnail': ('image', {url_or_none}),
}),
}

View File

@@ -0,0 +1,42 @@
from .common import InfoExtractor
from ..utils import int_or_none, str_or_none, url_or_none
from ..utils.traversal import traverse_obj
class AlibabaIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?alibaba\.com/product-detail/[\w-]+_(?P<id>\d+)\.html'
_TESTS = [{
'url': 'https://www.alibaba.com/product-detail/Kids-Entertainment-Bouncer-Bouncy-Castle-Waterslide_1601271126969.html',
'info_dict': {
'id': '6000280444270',
'display_id': '1601271126969',
'ext': 'mp4',
'title': 'Kids Entertainment Bouncer Bouncy Castle Waterslide Juex Gonflables Commercial Inflatable Tropical Water Slide',
'duration': 30,
'thumbnail': 'https://sc04.alicdn.com/kf/Hc5bb391974454af18c7a4f91cbe4062bg.jpg_120x120.jpg',
},
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
product_data = self._search_json(
r'window\.detailData\s*=', webpage, 'detail data', display_id)['globalData']['product']
return {
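# Select the first media item that is a video and carries a videoId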
**traverse_obj(product_data, ('mediaItems', lambda _, v: v['type'] == 'video' and v['videoId'], any, {
'id': ('videoId', {int}, {str_or_none}),
'duration': ('duration', {int_or_none}),
'thumbnail': ('videoCoverUrl', {url_or_none}),
'formats': ('videoUrl', lambda _, v: url_or_none(v['videoUrl']), {
'url': 'videoUrl',
'format_id': ('definition', {str_or_none}),
'tbr': ('bitrate', {int_or_none}),
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
'filesize': ('length', {int_or_none}),
}),
})),
'title': traverse_obj(product_data, ('subject', {str})),
'display_id': display_id,
}

View File

@@ -279,7 +279,7 @@ class ArchiveOrgIE(InfoExtractor):
'url': 'https://archive.org/' + track['file'].lstrip('/'),
}
metadata = self._download_json('http://archive.org/metadata/' + identifier, identifier)
metadata = self._download_json(f'https://archive.org/metadata/{identifier}', identifier)
m = metadata['metadata']
identifier = m['identifier']
@@ -704,6 +704,24 @@ class YoutubeWebArchiveIE(InfoExtractor):
'thumbnail': 'https://web.archive.org/web/20160108040020if_/https://i.ytimg.com/vi/SQCom7wjGDs/maxresdefault.jpg',
'upload_date': '20160107',
},
}, {
# dmuxed formats
'url': 'https://web.archive.org/web/20240922160632/https://www.youtube.com/watch?v=z7hzvTL3k1k',
'info_dict': {
'id': 'z7hzvTL3k1k',
'ext': 'webm',
'title': 'Praise the Lord and Pass the Ammunition (BARRXN REMIX)',
'description': 'md5:45dbf2c71c23b0734c8dfb82dd1e94b6',
'uploader': 'Barrxn',
'uploader_id': 'TheRockstar6086',
'uploader_url': 'https://www.youtube.com/user/TheRockstar6086',
'channel_id': 'UCjJPGUTtvR9uizmawn2ThqA',
'channel_url': 'https://www.youtube.com/channel/UCjJPGUTtvR9uizmawn2ThqA',
'duration': 125,
'thumbnail': r're:https?://.*\.(jpg|webp)',
'upload_date': '20201207',
},
'params': {'format': 'bv'},
}, {
'url': 'https://web.archive.org/web/http://www.youtube.com/watch?v=kH-G_aIBlFw',
'only_matching': True,
@@ -1060,6 +1078,19 @@ class YoutubeWebArchiveIE(InfoExtractor):
capture_dates.extend([self._OLDEST_CAPTURE_DATE, self._NEWEST_CAPTURE_DATE])
return orderedSet(filter(None, capture_dates))
def _parse_fmt(self, fmt, extra_info=None):
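# The itag query parameter identifies the format; known itags supply
# codec/resolution metadata from self._FORMATS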
format_id = traverse_obj(fmt, ('url', {parse_qs}, 'itag', 0))
return {
'format_id': format_id,
**self._FORMATS.get(format_id, {}),
**traverse_obj(fmt, {
'url': ('url', {lambda x: f'https://web.archive.org/web/2id_/{x}'}),
'ext': ('ext', {str}),
'filesize': ('url', {parse_qs}, 'clen', 0, {int_or_none}),
}),
**(extra_info or {}),
}
def _real_extract(self, url):
video_id, url_date, url_date_2 = self._match_valid_url(url).group('id', 'date', 'date2')
url_date = url_date or url_date_2
@@ -1090,17 +1121,14 @@ class YoutubeWebArchiveIE(InfoExtractor):
info['thumbnails'] = self._extract_thumbnails(video_id)
formats = []
for fmt in traverse_obj(video_info, ('formats', lambda _, v: url_or_none(v['url']))):
format_id = traverse_obj(fmt, ('url', {parse_qs}, 'itag', 0))
formats.append({
'format_id': format_id,
**self._FORMATS.get(format_id, {}),
**traverse_obj(fmt, {
'url': ('url', {lambda x: f'https://web.archive.org/web/2id_/{x}'}),
'ext': ('ext', {str}),
'filesize': ('url', {parse_qs}, 'clen', 0, {int_or_none}),
}),
})
if video_info.get('dmux'):
for vf in traverse_obj(video_info, ('formats', 'video', lambda _, v: url_or_none(v['url']))):
formats.append(self._parse_fmt(vf, {'acodec': 'none'}))
for af in traverse_obj(video_info, ('formats', 'audio', lambda _, v: url_or_none(v['url']))):
formats.append(self._parse_fmt(af, {'vcodec': 'none'}))
else:
for fmt in traverse_obj(video_info, ('formats', lambda _, v: url_or_none(v['url']))):
formats.append(self._parse_fmt(fmt))
info['formats'] = formats
return info

View File

@@ -5,16 +5,18 @@ import time
from .common import InfoExtractor
from ..utils import (
KNOWN_EXTENSIONS,
ExtractorError,
clean_html,
extract_attributes,
float_or_none,
format_field,
int_or_none,
join_nonempty,
parse_filesize,
parse_qs,
str_or_none,
strftime_or_none,
try_get,
unified_strdate,
unified_timestamp,
update_url_query,
url_or_none,
@@ -411,70 +413,67 @@ class BandcampAlbumIE(BandcampIE): # XXX: Do not subclass from concrete IE
class BandcampWeeklyIE(BandcampIE): # XXX: Do not subclass from concrete IE
IE_NAME = 'Bandcamp:weekly'
_VALID_URL = r'https?://(?:www\.)?bandcamp\.com/?\?(?:.*?&)?show=(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?bandcamp\.com/radio/?\?(?:[^#]+&)?show=(?P<id>\d+)'
_TESTS = [{
'url': 'https://bandcamp.com/?show=224',
'url': 'https://bandcamp.com/radio?show=224',
'md5': '61acc9a002bed93986b91168aa3ab433',
'info_dict': {
'id': '224',
'ext': 'mp3',
'title': 'BC Weekly April 4th 2017 - Magic Moments',
'title': 'Bandcamp Weekly, 2017-04-04',
'description': 'md5:5d48150916e8e02d030623a48512c874',
'duration': 5829.77,
'release_date': '20170404',
'thumbnail': 'https://f4.bcbits.com/img/9982549_0.jpg',
'series': 'Bandcamp Weekly',
'episode': 'Magic Moments',
'episode_id': '224',
'release_timestamp': 1491264000,
'release_date': '20170404',
'duration': 5829.77,
},
'params': {
'format': 'mp3-128',
},
}, {
'url': 'https://bandcamp.com/?blah/blah@&show=228',
'url': 'https://bandcamp.com/radio/?foo=bar&show=224',
'only_matching': True,
}]
def _real_extract(self, url):
show_id = self._match_id(url)
webpage = self._download_webpage(url, show_id)
audio_data = self._download_json(
'https://bandcamp.com/api/bcradio_api/1/get_show',
show_id, 'Downloading radio show JSON',
data=json.dumps({'id': show_id}).encode(),
headers={'Content-Type': 'application/json'})['radioShowAudio']
blob = self._extract_data_attr(webpage, show_id, 'blob')
stream_url = audio_data['streamUrl']
format_id = traverse_obj(stream_url, ({parse_qs}, 'enc', -1))
encoding, _, bitrate_str = (format_id or '').partition('-')
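# e.g. enc=mp3-128 yields acodec 'mp3' and abr 128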
show = blob['bcw_data'][show_id]
webpage = self._download_webpage(url, show_id, fatal=False)
metadata = traverse_obj(
self._extract_data_attr(webpage, show_id, 'blob', fatal=False),
('appData', 'shows', lambda _, v: str(v['showId']) == show_id, any)) or {}
formats = []
for format_id, format_url in show['audio_stream'].items():
if not url_or_none(format_url):
continue
for known_ext in KNOWN_EXTENSIONS:
if known_ext in format_id:
ext = known_ext
break
else:
ext = None
formats.append({
'format_id': format_id,
'url': format_url,
'ext': ext,
'vcodec': 'none',
})
title = show.get('audio_title') or 'Bandcamp Weekly'
subtitle = show.get('subtitle')
if subtitle:
title += f' - {subtitle}'
series_title = audio_data.get('title') or metadata.get('title')
release_timestamp = unified_timestamp(audio_data.get('date')) or unified_timestamp(metadata.get('date'))
return {
'id': show_id,
'title': title,
'description': show.get('desc') or show.get('short_desc'),
'duration': float_or_none(show.get('audio_duration')),
'is_live': False,
'release_date': unified_strdate(show.get('published_date')),
'series': 'Bandcamp Weekly',
'episode': show.get('subtitle'),
'episode_id': show_id,
'formats': formats,
'title': join_nonempty(series_title, strftime_or_none(release_timestamp, '%Y-%m-%d'), delim=', '),
'series': series_title,
'thumbnail': format_field(metadata, 'imageId', 'https://f4.bcbits.com/img/%s_0.jpg', default=None),
'description': metadata.get('desc') or metadata.get('short_desc'),
'duration': float_or_none(audio_data.get('duration')),
'release_timestamp': release_timestamp,
'formats': [{
'url': stream_url,
'format_id': format_id,
'ext': encoding or 'mp3',
'acodec': encoding or None,
'vcodec': 'none',
'abr': int_or_none(bitrate_str),
}],
}
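# Sketch of the format-id derivation above, assuming a stream URL shaped
# like 'https://example.bcbits.com/stream/abc?enc=opus-128' (hypothetical URL):
#   traverse_obj(stream_url, ({parse_qs}, 'enc', -1))  -> 'opus-128'
#   'opus-128'.partition('-')                          -> ('opus', '-', '128')
# giving ext/acodec 'opus' and abr 128; with no 'enc' param, ext falls back to 'mp3'.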


@@ -1,5 +1,5 @@
from .common import InfoExtractor
from ..utils import ExtractorError, urlencode_postdata
from ..utils import ExtractorError, UserNotLive, urlencode_postdata
class BigoIE(InfoExtractor):
@@ -40,7 +40,7 @@ class BigoIE(InfoExtractor):
info = info_raw.get('data') or {}
if not info.get('alive'):
raise ExtractorError('This user is offline.', expected=True)
raise UserNotLive(video_id=user_id)
formats, subs = self._extract_m3u8_formats_and_subtitles(
info.get('hls_src'), user_id, 'mp4', 'm3u8')


@@ -21,21 +21,44 @@ class BoostyIE(InfoExtractor):
'url': 'https://boosty.to/kuplinov/posts/e55d050c-e3bb-4873-a7db-ac7a49b40c38',
'info_dict': {
'id': 'd7473824-352e-48e2-ae53-d4aa39459968',
'title': 'phasma_3',
'title': 'Бан? А! Бан! (Phasmophobia)',
'alt_title': 'Бан? А! Бан! (Phasmophobia)',
'channel': 'Kuplinov',
'channel_id': '7958701',
'timestamp': 1655031975,
'upload_date': '20220612',
'release_timestamp': 1655049000,
'release_date': '20220612',
'modified_timestamp': 1668680993,
'modified_date': '20221117',
'modified_timestamp': 1743328648,
'modified_date': '20250330',
'tags': ['куплинов', 'phasmophobia'],
'like_count': int,
'ext': 'mp4',
'duration': 105,
'view_count': int,
'thumbnail': r're:^https://i\.mycdn\.me/videoPreview\?',
'thumbnail': r're:^https://iv\.okcdn\.ru/videoPreview\?',
},
}, {
# single ok_video with truncated title
'url': 'https://boosty.to/kuplinov/posts/cc09b7f9-121e-40b8-9392-4a075ef2ce53',
'info_dict': {
'id': 'fb5ea762-6303-4557-9a17-157947326810',
'title': 'Какая там активность была? Не слышу! Повтори еще пару раз! (Phas',
'alt_title': 'Какая там активность была? Не слышу! Повтори еще пару раз! (Phasmophobia)',
'channel': 'Kuplinov',
'channel_id': '7958701',
'timestamp': 1655031930,
'upload_date': '20220612',
'release_timestamp': 1655048400,
'release_date': '20220612',
'modified_timestamp': 1743328616,
'modified_date': '20250330',
'tags': ['куплинов', 'phasmophobia'],
'like_count': int,
'ext': 'mp4',
'duration': 39,
'view_count': int,
'thumbnail': r're:^https://iv\.okcdn\.ru/videoPreview\?',
},
}, {
# multiple ok_video
@@ -109,36 +132,41 @@ class BoostyIE(InfoExtractor):
'thumbnail': r're:^https://i\.mycdn\.me/videoPreview\?',
},
}],
'skip': 'post has been deleted',
}, {
# single external video (youtube)
'url': 'https://boosty.to/denischuzhoy/posts/6094a487-bcec-4cf8-a453-43313b463c38',
'url': 'https://boosty.to/futuremusicproduction/posts/32a8cae2-3252-49da-b285-0e014bc6e565',
'info_dict': {
'id': 'EXelTnve5lY',
'title': 'Послание Президента Федеральному Собранию | Класс народа',
'upload_date': '20210425',
'channel': 'Денис Чужой',
'tags': 'count:10',
'id': '-37FW_YQ3B4',
'title': 'Afro | Deep House FREE FLP',
'media_type': 'video',
'upload_date': '20250829',
'timestamp': 1756466005,
'channel': 'Future Music Production',
'tags': 'count:0',
'like_count': int,
'ext': 'mp4',
'duration': 816,
'ext': 'm4a',
'duration': 170,
'view_count': int,
'thumbnail': r're:^https://i\.ytimg\.com/',
'age_limit': 0,
'availability': 'public',
'categories': list,
'channel_follower_count': int,
'channel_id': 'UCCzVNbWZfYpBfyofCCUD_0w',
'channel_is_verified': bool,
'channel_id': 'UCKVYrFBYmci1e-T8NeHw2qg',
'channel_url': r're:^https://www\.youtube\.com/',
'comment_count': int,
'description': str,
'heatmap': 'count:100',
'live_status': str,
'playable_in_embed': bool,
'uploader': str,
'uploader_id': str,
'uploader_url': r're:^https://www\.youtube\.com/',
},
'expected_warnings': [
'Remote components challenge solver script',
'n challenge solving failed',
],
}]
_MP4_TYPES = ('tiny', 'lowest', 'low', 'medium', 'high', 'full_hd', 'quad_hd', 'ultra_hd')
@@ -207,13 +235,14 @@ class BoostyIE(InfoExtractor):
video_id = item.get('id') or post_id
entries.append({
'id': video_id,
'alt_title': post_title,
'formats': self._extract_formats(item.get('playerUrls'), video_id),
**common_metadata,
**traverse_obj(item, {
'title': ('title', {str}),
'duration': ('duration', {int_or_none}),
'view_count': ('viewsCounter', {int_or_none}),
'thumbnail': (('previewUrl', 'defaultPreview'), {url_or_none}),
'thumbnail': (('preview', 'defaultPreview'), {url_or_none}),
}, get_all=False)})
if not entries and not post.get('hasAccess'):


@@ -105,7 +105,7 @@ class CBCIE(InfoExtractor):
# multiple CBC.APP.Caffeine.initInstance(...)
'url': 'http://www.cbc.ca/news/canada/calgary/dog-indoor-exercise-winter-1.3928238',
'info_dict': {
'title': 'Keep Rover active during the deep freeze with doggie pushups and other fun indoor tasks', # FIXME: actual title includes " | CBC News"
'title': 'Keep Rover active during the deep freeze with doggie pushups and other fun indoor tasks',
'id': 'dog-indoor-exercise-winter-1.3928238',
'description': 'md5:c18552e41726ee95bd75210d1ca9194c',
},
@@ -134,6 +134,13 @@ class CBCIE(InfoExtractor):
title = (self._og_search_title(webpage, default=None)
or self._html_search_meta('twitter:title', webpage, 'title', default=None)
or self._html_extract_title(webpage))
title = self._search_regex(
r'^(?P<title>.+?)(?:\s*[|-]\s*CBC.*)?$',
title, 'cleaned title', group='title', default=title)
data = self._search_json(
r'window\.__INITIAL_STATE__\s*=', webpage,
'initial state', display_id, default={}, transform_source=js_to_json)
entries = [
self._extract_player_init(player_init, display_id)
for player_init in re.findall(r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage)]
@@ -143,6 +150,11 @@ class CBCIE(InfoExtractor):
r'<div[^>]+\bid=["\']player-(\d+)',
r'guid["\']\s*:\s*["\'](\d+)'):
media_ids.extend(re.findall(media_id_re, webpage))
media_ids.extend(traverse_obj(data, (
'detail', 'content', 'body', ..., 'content',
lambda _, v: v['type'] == 'polopoly_media', 'content', 'sourceId', {str})))
if content_id := traverse_obj(data, ('app', 'contentId', {str})):
media_ids.append(content_id)
entries.extend([
self.url_result(f'cbcplayer:{media_id}', 'CBCPlayer', media_id)
for media_id in orderedSet(media_ids)])
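# Assumed __INITIAL_STATE__ shape for the traversals above (the keys come
# from the code; the id values are hypothetical):
#   {'detail': {'content': {'body': [{'content': [
#        {'type': 'polopoly_media', 'content': {'sourceId': '2685100'}},
#    ]}]}},
#    'app': {'contentId': '2685100'}}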
@@ -268,7 +280,7 @@ class CBCPlayerIE(InfoExtractor):
'duration': 2692.833,
'subtitles': {
'en-US': [{
'name': 'English Captions',
'name': r're:English',
'url': 'https://cbchls.akamaized.net/delivery/news-shows/2024/06/17/NAT_JUN16-00-55-00/NAT_JUN16_cc.vtt',
}],
},
@@ -322,6 +334,7 @@ class CBCPlayerIE(InfoExtractor):
'categories': ['Olympics Summer Soccer', 'Summer Olympics Replays', 'Summer Olympics Soccer Replays'],
'location': 'Canada',
},
'skip': 'Video no longer available',
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.cbc.ca/player/play/video/9.6459530',
@@ -380,7 +393,8 @@ class CBCPlayerIE(InfoExtractor):
video_id = self._match_id(url)
webpage = self._download_webpage(f'https://www.cbc.ca/player/play/{video_id}', video_id)
data = self._search_json(
r'window\.__INITIAL_STATE__\s*=', webpage, 'initial state', video_id)['video']['currentClip']
r'window\.__INITIAL_STATE__\s*=', webpage,
'initial state', video_id, transform_source=js_to_json)['video']['currentClip']
assets = traverse_obj(
data, ('media', 'assets', lambda _, v: url_or_none(v['key']) and v['type']))
@@ -492,12 +506,14 @@ class CBCPlayerPlaylistIE(InfoExtractor):
'info_dict': {
'id': 'news/tv shows/the national/latest broadcast',
},
'skip': 'Playlist no longer available',
}, {
'url': 'https://www.cbc.ca/player/news/Canada/North',
'playlist_mincount': 25,
'info_dict': {
'id': 'news/canada/north',
},
'skip': 'Playlist no longer available',
}]
def _real_extract(self, url):


@@ -18,23 +18,41 @@ class CCCIE(InfoExtractor):
'id': '1839',
'ext': 'mp4',
'title': 'Introduction to Processor Design',
'creator': 'byterazor',
'creators': ['byterazor'],
'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20131228',
'timestamp': 1388188800,
'duration': 3710,
'tags': list,
'display_id': '30C3_-_5443_-_en_-_saal_g_-_201312281830_-_introduction_to_processor_design_-_byterazor',
'view_count': int,
},
}, {
'url': 'https://media.ccc.de/v/32c3-7368-shopshifting#download',
'only_matching': True,
}, {
'url': 'https://media.ccc.de/v/39c3-schlechte-karten-it-sicherheit-im-jahr-null-der-epa-fur-alle',
'info_dict': {
'id': '16261',
'ext': 'mp4',
'title': 'Schlechte Karten - IT-Sicherheit im Jahr null der ePA für alle',
'display_id': '39c3-schlechte-karten-it-sicherheit-im-jahr-null-der-epa-fur-alle',
'description': 'md5:719a5a9a52630249d606219c55056cbf',
'view_count': int,
'duration': 3619,
'thumbnail': 'https://static.media.ccc.de/media/congress/2025/2403-2b5a6a8e-327e-594d-8f92-b91201d18a02.jpg',
'tags': list,
'creators': ['Bianca Kastl'],
'timestamp': 1767024900,
'upload_date': '20251229',
},
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
event_id = self._search_regex(r"data-id='(\d+)'", webpage, 'event id')
event_id = self._search_regex(r"data-id=(['\"])(?P<event_id>\d+)\1", webpage, 'event id', group='event_id')
event_data = self._download_json(f'https://media.ccc.de/public/events/{event_id}', event_id)
formats = []


@@ -27,7 +27,7 @@ from ..utils.traversal import traverse_obj
class CDAIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www\.)?cda\.pl/video|ebd\.cda\.pl/[0-9]+x[0-9]+)/(?P<id>[0-9a-z]+)'
_VALID_URL = r'https?://(?:(?:(?:www|m)\.)?cda\.pl/video|ebd\.cda\.pl/[0-9]+x[0-9]+)/(?P<id>[0-9a-z]+)'
_NETRC_MACHINE = 'cdapl'
_BASE_URL = 'https://www.cda.pl'
@@ -110,6 +110,9 @@ class CDAIE(InfoExtractor):
}, {
'url': 'http://ebd.cda.pl/0x0/5749950c',
'only_matching': True,
}, {
'url': 'https://m.cda.pl/video/617297677',
'only_matching': True,
}]
def _download_age_confirm_page(self, url, video_id, *args, **kwargs):
@@ -367,35 +370,35 @@ class CDAIE(InfoExtractor):
class CDAFolderIE(InfoExtractor):
_MAX_PAGE_SIZE = 36
_VALID_URL = r'https?://(?:www\.)?cda\.pl/(?P<channel>[\w-]+)/folder/(?P<id>\d+)'
_TESTS = [
{
'url': 'https://www.cda.pl/domino264/folder/31188385',
'info_dict': {
'id': '31188385',
'title': 'SERIA DRUGA',
},
'playlist_mincount': 13,
_VALID_URL = r'https?://(?:(?:www|m)\.)?cda\.pl/(?P<channel>[\w-]+)/folder/(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.cda.pl/domino264/folder/31188385',
'info_dict': {
'id': '31188385',
'title': 'SERIA DRUGA',
},
{
'url': 'https://www.cda.pl/smiechawaTV/folder/2664592/vfilm',
'info_dict': {
'id': '2664592',
'title': 'VideoDowcipy - wszystkie odcinki',
},
'playlist_mincount': 71,
'playlist_mincount': 13,
}, {
'url': 'https://www.cda.pl/smiechawaTV/folder/2664592/vfilm',
'info_dict': {
'id': '2664592',
'title': 'VideoDowcipy - wszystkie odcinki',
},
{
'url': 'https://www.cda.pl/DeliciousBeauty/folder/19129979/vfilm',
'info_dict': {
'id': '19129979',
'title': 'TESTY KOSMETYKÓW',
},
'playlist_mincount': 139,
}, {
'url': 'https://www.cda.pl/FILMY-SERIALE-ANIME-KRESKOWKI-BAJKI/folder/18493422',
'only_matching': True,
}]
'playlist_mincount': 71,
}, {
'url': 'https://www.cda.pl/DeliciousBeauty/folder/19129979/vfilm',
'info_dict': {
'id': '19129979',
'title': 'TESTY KOSMETYKÓW',
},
'playlist_mincount': 139,
}, {
'url': 'https://www.cda.pl/FILMY-SERIALE-ANIME-KRESKOWKI-BAJKI/folder/18493422',
'only_matching': True,
}, {
'url': 'https://m.cda.pl/smiechawaTV/folder/2664592/vfilm',
'only_matching': True,
}]
def _real_extract(self, url):
folder_id, channel = self._match_valid_url(url).group('id', 'channel')


@@ -348,6 +348,7 @@ class InfoExtractor:
duration: Length of the video in seconds, as an integer or float.
view_count: How many users have watched the video on the platform.
concurrent_view_count: How many users are currently watching the video on the platform.
save_count: Number of times the video has been saved or bookmarked
like_count: Number of positive ratings of the video
dislike_count: Number of negative ratings of the video
repost_count: Number of reposts of the video


@@ -0,0 +1,79 @@
from .common import InfoExtractor
from .vimeo import VimeoIE
from ..utils import (
ExtractorError,
join_nonempty,
)
from ..utils.traversal import traverse_obj
class CroatianFilmIE(InfoExtractor):
IE_NAME = 'croatian.film'
_VALID_URL = r'https?://(?:www\.)?croatian\.film/[a-z]{2}/[^/?#]+/(?P<id>\d+)'
_GEO_COUNTRIES = ['HR']
_TESTS = [{
'url': 'https://www.croatian.film/hr/films/72472',
'info_dict': {
'id': '1078340774',
'ext': 'mp4',
'title': '“ŠKAFETIN”, r. Paško Vukasović',
'uploader': 'croatian.film',
'uploader_id': 'user94192658',
'uploader_url': 'https://vimeo.com/user94192658',
'duration': 1357,
'thumbnail': 'https://i.vimeocdn.com/video/2008556407-40eb1315ec11be5fcb8dda4d7059675b0881e182b9fc730892e267db72cb57f5-d',
},
'params': {'skip_download': 'm3u8'},
'expected_warnings': ['Failed to parse XML: not well-formed'],
}, {
# geo-restricted but works with xff
'url': 'https://www.croatian.film/en/films/77144',
'info_dict': {
'id': '1144997795',
'ext': 'mp4',
'title': '“ROKO” r. Ivana Marinić Kragić',
'uploader': 'croatian.film',
'uploader_id': 'user94192658',
'uploader_url': 'https://vimeo.com/user94192658',
'duration': 1023,
'thumbnail': 'https://i.vimeocdn.com/video/2093793231-11c2928698ff8347489e679b4d563a576e7acd0681ce95b383a9a25f6adb5e8f-d',
},
'params': {'skip_download': 'm3u8'},
'expected_warnings': ['Failed to parse XML: not well-formed'],
}, {
'url': 'https://www.croatian.film/en/films/75904/watch',
'info_dict': {
'id': '1134883757',
'ext': 'mp4',
'title': '"CARPE DIEM" r. Nina Damjanović',
'uploader': 'croatian.film',
'uploader_id': 'user94192658',
'uploader_url': 'https://vimeo.com/user94192658',
'duration': 1123,
'thumbnail': 'https://i.vimeocdn.com/video/2080022187-bb691c470c28c4d979258cf235e594bf9a11c14b837a0784326c25c95edd83f9-d',
},
'params': {'skip_download': 'm3u8'},
'expected_warnings': ['Failed to parse XML: not well-formed'],
}]
def _real_extract(self, url):
display_id = self._match_id(url)
api_data = self._download_json(
f'https://api.croatian.film/api/videos/{display_id}',
display_id)
if errors := traverse_obj(api_data, ('errors', lambda _, v: v['code'])):
codes = traverse_obj(errors, (..., 'code', {str}))
if 'INVALID_COUNTRY' in codes:
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
raise ExtractorError(join_nonempty(
*(traverse_obj(errors, (..., 'details', {str})) or codes),
delim='; '))
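# Assumed API error shape for the handling above (hypothetical message text):
#   {'errors': [{'code': 'INVALID_COUNTRY',
#                'details': 'Only available in Croatia'}]}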
vimeo_id = self._search_regex(
r'/videos/(\d+)', api_data['video']['vimeoURL'], 'vimeo ID')
return self.url_result(
VimeoIE._smuggle_referrer(f'https://player.vimeo.com/video/{vimeo_id}', url),
VimeoIE, vimeo_id)


@@ -1,5 +1,6 @@
import functools
import json
import random
import re
import urllib.parse
@@ -363,6 +364,56 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
continue
yield update_url(player_url, query=query_string)
@staticmethod
def _generate_blockbuster_headers():
"""Randomize our HTTP header fingerprint to bust the HTTP Error 403 block"""
def random_letters(minimum, maximum):
# Omit vowels so we don't generate valid header names like 'authorization', etc
return ''.join(random.choices('bcdfghjklmnpqrstvwxz', k=random.randint(minimum, maximum)))
return {
random_letters(8, 24): random_letters(16, 32)
for _ in range(random.randint(2, 8))
}
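# Illustrative call (output is random every time; the pairs below are made up):
#   self._generate_blockbuster_headers()
#   -> {'qkrtzbmn': 'pwxzsdfghjklbcdf', 'tzvmnpqrs': 'bcdfghjklmnpqrstvw'}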
def _extract_dailymotion_m3u8_formats_and_subtitles(self, media_url, video_id, live=False):
"""See https://github.com/yt-dlp/yt-dlp/issues/15526"""
ERROR_NOTE = 'Unable to download m3u8 information'
last_error = None
for note, kwargs in (
('Downloading m3u8 information', {}),
('Retrying m3u8 download with randomized headers', {
'headers': self._generate_blockbuster_headers(),
}),
('Retrying m3u8 download with Chrome impersonation', {
'impersonate': 'chrome',
'require_impersonation': True,
}),
('Retrying m3u8 download with Firefox impersonation', {
'impersonate': 'firefox',
'require_impersonation': True,
}),
):
try:
m3u8_doc = self._download_webpage(media_url, video_id, note, ERROR_NOTE, **kwargs)
break
except ExtractorError as e:
last_error = e.orig_msg
self.write_debug(f'{video_id}: {last_error}')
else:
if 'impersonation' not in last_error:
self.report_warning(last_error, video_id=video_id)
last_error = None
return [], {}, last_error
formats, subtitles = self._parse_m3u8_formats_and_subtitles(
m3u8_doc, media_url, 'mp4', m3u8_id='hls', live=live, fatal=False)
return formats, subtitles, last_error
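# A minimal standalone sketch of the escalation pattern above (not part of
# the diff; 'fetch' and 'attempts' are hypothetical stand-ins):
def _fetch_with_escalation(fetch, attempts):
    last_error = None
    for kwargs in attempts:  # cheapest request first, impersonation last
        try:
            return fetch(**kwargs), None
        except Exception as e:  # the extractor narrows this to ExtractorError
            last_error = e
    return None, last_error  # every attempt failed; caller picks the severity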
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url)
video_id, is_playlist, playlist_id = self._match_valid_url(url).group('id', 'is_playlist', 'playlist_id')
@@ -416,6 +467,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
is_live = media.get('isOnAir')
formats = []
subtitles = {}
expected_error = None
for quality, media_list in metadata['qualities'].items():
for m in media_list:
@@ -424,8 +476,8 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
if not media_url or media_type == 'application/vnd.lumberjack.manifest':
continue
if media_type == 'application/x-mpegURL':
fmt, subs = self._extract_m3u8_formats_and_subtitles(
media_url, video_id, 'mp4', live=is_live, m3u8_id='hls', fatal=False)
fmt, subs, expected_error = self._extract_dailymotion_m3u8_formats_and_subtitles(
media_url, video_id, live=is_live)
formats.extend(fmt)
self._merge_subtitles(subs, target=subtitles)
else:
@@ -442,6 +494,10 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'width': width,
})
formats.append(f)
if not formats and expected_error:
self.raise_no_formats(expected_error, expected=True)
for f in formats:
f['url'] = f['url'].split('#')[0]
if not f.get('fps') and f['format_id'].endswith('@60'):


@@ -1,5 +1,6 @@
from .common import InfoExtractor
from ..utils import int_or_none
from ..utils import int_or_none, url_or_none
from ..utils.traversal import traverse_obj
class DigitekaIE(InfoExtractor):
@@ -25,74 +26,56 @@ class DigitekaIE(InfoExtractor):
)/(?P<id>[\d+a-z]+)'''
_EMBED_REGEX = [r'<(?:iframe|script)[^>]+src=["\'](?P<url>(?:https?:)?//(?:www\.)?ultimedia\.com/deliver/(?:generic|musique)(?:/[^/]+)*/(?:src|article)/[\d+a-z]+)']
_TESTS = [{
# news
'url': 'https://www.ultimedia.com/default/index/videogeneric/id/s8uk0r',
'md5': '276a0e49de58c7e85d32b057837952a2',
'url': 'https://www.ultimedia.com/default/index/videogeneric/id/3x5x55k',
'info_dict': {
'id': 's8uk0r',
'id': '3x5x55k',
'ext': 'mp4',
'title': 'Loi sur la fin de vie: le texte prévoit un renforcement des directives anticipées',
'title': 'Il est passionné de DS',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 74,
'upload_date': '20150317',
'timestamp': 1426604939,
'uploader_id': '3fszv',
'duration': 89,
'upload_date': '20251012',
'timestamp': 1760285363,
'uploader_id': '3pz33',
},
}, {
# music
'url': 'https://www.ultimedia.com/default/index/videomusic/id/xvpfp8',
'md5': '2ea3513813cf230605c7e2ffe7eca61c',
'info_dict': {
'id': 'xvpfp8',
'ext': 'mp4',
'title': 'Two - C\'est La Vie (clip)',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 233,
'upload_date': '20150224',
'timestamp': 1424760500,
'uploader_id': '3rfzk',
},
}, {
'url': 'https://www.digiteka.net/deliver/generic/iframe/mdtk/01637594/src/lqm3kl/zone/1/showtitle/1/autoplay/yes',
'only_matching': True,
'params': {'skip_download': True},
}]
_IFRAME_MD_ID = '01836272' # One static ID working for Ultimedia iframes
def _real_extract(self, url):
mobj = self._match_valid_url(url)
video_id = mobj.group('id')
video_type = mobj.group('embed_type') or mobj.group('site_type')
if video_type == 'music':
video_type = 'musique'
video_id = self._match_id(url)
deliver_info = self._download_json(
f'http://www.ultimedia.com/deliver/video?video={video_id}&topic={video_type}',
video_id)
yt_id = deliver_info.get('yt_id')
if yt_id:
return self.url_result(yt_id, 'Youtube')
jwconf = deliver_info['jwconf']
video_info = self._download_json(
f'https://www.ultimedia.com/player/getConf/{self._IFRAME_MD_ID}/1/{video_id}', video_id,
note='Downloading player configuration')['video']
formats = []
for source in jwconf['playlist'][0]['sources']:
formats.append({
'url': source['file'],
'format_id': source.get('label'),
})
subtitles = {}
title = deliver_info['title']
thumbnail = jwconf.get('image')
duration = int_or_none(deliver_info.get('duration'))
timestamp = int_or_none(deliver_info.get('release_time'))
uploader_id = deliver_info.get('owner_id')
if hls_url := traverse_obj(video_info, ('media_sources', 'hls', 'hls_auto', {url_or_none})):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
hls_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
for format_id, mp4_url in traverse_obj(video_info, ('media_sources', 'mp4', {dict.items}, ...)):
if not mp4_url:
continue
formats.append({
'url': mp4_url,
'format_id': format_id,
'height': int_or_none(format_id.partition('_')[2]),
'ext': 'mp4',
})
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'duration': duration,
'timestamp': timestamp,
'uploader_id': uploader_id,
'formats': formats,
'subtitles': subtitles,
**traverse_obj(video_info, {
'title': ('title', {str}),
'thumbnail': ('image', {url_or_none}),
'duration': ('duration', {int_or_none}),
'timestamp': ('creationDate', {int_or_none}),
'uploader_id': ('ownerId', {str}),
}),
}


@@ -14,7 +14,7 @@ from ..utils import (
class DropboxIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?dropbox\.com/(?:(?:e/)?scl/fi|sh?)/(?P<id>\w+)'
_VALID_URL = r'https?://(?:www\.)?dropbox\.com/(?:(?:e/)?scl/f[io]|sh?)/(?P<id>\w+)'
_TESTS = [
{
'url': 'https://www.dropbox.com/s/nelirfsxnmcfbfh/youtube-dl%20test%20video%20%27%C3%A4%22BaW_jenozKc.mp4?dl=0',
@@ -35,6 +35,9 @@ class DropboxIE(InfoExtractor):
}, {
'url': 'https://www.dropbox.com/e/scl/fi/r2kd2skcy5ylbbta5y1pz/DJI_0003.MP4?dl=0&rlkey=wcdgqangn7t3lnmmv6li9mu9h',
'only_matching': True,
}, {
'url': 'https://www.dropbox.com/scl/fo/zjfqse5txqfd7twa8iewj/AOfZzSYWUSKle2HD7XF7kzQ/A-BEAT%20C.mp4?rlkey=6tg3jkp4tv6a5vt58a6dag0mm&dl=0',
'only_matching': True,
},
]


@@ -2,6 +2,7 @@ from .common import InfoExtractor
from ..utils import (
clean_html,
int_or_none,
parse_iso8601,
str_or_none,
url_or_none,
)
@@ -222,3 +223,70 @@ class ERRJupiterIE(InfoExtractor):
'episode_id': ('id', {str_or_none}),
}) if data.get('type') == 'episode' else {}),
}
class ERRArhiivIE(InfoExtractor):
_VALID_URL = r'https?://arhiiv\.err\.ee/video/(?:vaata/)?(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://arhiiv.err.ee/video/kontsertpalad',
'info_dict': {
'id': 'kontsertpalad',
'ext': 'mp4',
'title': 'Kontsertpalad: 255 | L. Beethoveni sonaat c-moll, "Pateetiline"',
'description': 'md5:a70f4ff23c3618f3be63f704bccef063',
'series': 'Kontsertpalad',
'episode_id': 255,
'timestamp': 1666152162,
'upload_date': '20221019',
'release_year': 1970,
'modified_timestamp': 1718620982,
'modified_date': '20240617',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://arhiiv.err.ee/video/vaata/koalitsioonileppe-allkirjastamine',
'info_dict': {
'id': 'koalitsioonileppe-allkirjastamine',
'ext': 'mp4',
'title': 'Koalitsioonileppe allkirjastamine',
'timestamp': 1710728222,
'upload_date': '20240318',
'release_timestamp': 1611532800,
'release_date': '20210125',
},
'params': {'skip_download': 'm3u8'},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_json(
f'https://arhiiv.err.ee/api/v1/content/video/{video_id}', video_id)
formats, subtitles = [], {}
if hls_url := traverse_obj(data, ('media', 'src', 'hls', {url_or_none})):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
hls_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
if dash_url := traverse_obj(data, ('media', 'src', 'dash', {url_or_none})):
fmts, subs = self._extract_mpd_formats_and_subtitles(
dash_url, video_id, mpd_id='dash', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
**traverse_obj(data, ('info', {
'title': ('title', {str}),
'series': ('seriesTitle', {str}, filter),
'series_id': ('seriesId', {str}, filter),
'episode_id': ('episode', {int_or_none}),
'description': ('synopsis', {str}, filter),
'timestamp': ('uploadDate', {parse_iso8601}),
'modified_timestamp': ('dateModified', {parse_iso8601}),
'release_timestamp': ('date', {parse_iso8601}),
'release_year': ('year', {int_or_none}),
})),
}


@@ -1,4 +1,4 @@
import inspect
import itertools
import os
from ..globals import LAZY_EXTRACTORS
@@ -17,12 +17,18 @@ else:
if not _CLASS_LOOKUP:
from . import _extractors
_CLASS_LOOKUP = {
name: value
for name, value in inspect.getmembers(_extractors)
if name.endswith('IE') and name != 'GenericIE'
}
_CLASS_LOOKUP['GenericIE'] = _extractors.GenericIE
members = tuple(
(name, getattr(_extractors, name))
for name in dir(_extractors)
if name.endswith('IE')
)
_CLASS_LOOKUP = dict(itertools.chain(
# Add Youtube first to improve matching performance
((name, value) for name, value in members if '.youtube' in value.__module__),
# Add Generic last so that it is the fallback
((name, value) for name, value in members if name != 'GenericIE'),
(('GenericIE', _extractors.GenericIE),),
))
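# Why insertion order matters (a rough sketch, not yt-dlp's actual matching
# loop): extractors are tried in lookup order, so the YouTube classes go
# first for speed and GenericIE last as the fallback.
def _first_suitable(url, class_lookup):
    return next((ie for ie in class_lookup.values() if ie.suitable(url)), None)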
# We want to append to the main lookup
_current = _extractors_context.value


@@ -4,8 +4,7 @@ import urllib.parse
from .common import InfoExtractor
from ..compat import compat_etree_fromstring
from ..networking import Request
from ..networking.exceptions import network_exceptions
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
clean_html,
@@ -64,9 +63,6 @@ class FacebookIE(InfoExtractor):
class=(?P<q1>[\'"])[^\'"]*\bfb-(?:video|post)\b[^\'"]*(?P=q1)[^>]+
data-href=(?P<q2>[\'"])(?P<url>(?:https?:)?//(?:www\.)?facebook.com/.+?)(?P=q2)''',
]
_LOGIN_URL = 'https://www.facebook.com/login.php?next=http%3A%2F%2Ffacebook.com%2Fhome.php&login_attempt=1'
_CHECKPOINT_URL = 'https://www.facebook.com/checkpoint/?next=http%3A%2F%2Ffacebook.com%2Fhome.php&_fb_noscript=1'
_NETRC_MACHINE = 'facebook'
IE_NAME = 'facebook'
_VIDEO_PAGE_TEMPLATE = 'https://www.facebook.com/video/video.php?v=%s'
@@ -469,65 +465,6 @@ class FacebookIE(InfoExtractor):
'graphURI': '/api/graphql/',
}
def _perform_login(self, username, password):
login_page_req = Request(self._LOGIN_URL)
self._set_cookie('facebook.com', 'locale', 'en_US')
login_page = self._download_webpage(login_page_req, None,
note='Downloading login page',
errnote='Unable to download login page')
lsd = self._search_regex(
r'<input type="hidden" name="lsd" value="([^"]*)"',
login_page, 'lsd')
lgnrnd = self._search_regex(r'name="lgnrnd" value="([^"]*?)"', login_page, 'lgnrnd')
login_form = {
'email': username,
'pass': password,
'lsd': lsd,
'lgnrnd': lgnrnd,
'next': 'http://facebook.com/home.php',
'default_persistent': '0',
'legacy_return': '1',
'timezone': '-60',
'trynum': '1',
}
request = Request(self._LOGIN_URL, urlencode_postdata(login_form))
request.headers['Content-Type'] = 'application/x-www-form-urlencoded'
try:
login_results = self._download_webpage(request, None,
note='Logging in', errnote='unable to fetch login page')
if re.search(r'<form(.*)name="login"(.*)</form>', login_results) is not None:
error = self._html_search_regex(
r'(?s)<div[^>]+class=(["\']).*?login_error_box.*?\1[^>]*><div[^>]*>.*?</div><div[^>]*>(?P<error>.+?)</div>',
login_results, 'login error', default=None, group='error')
if error:
raise ExtractorError(f'Unable to login: {error}', expected=True)
self.report_warning('unable to log in: bad username/password, or exceeded login rate limit (~3/min). Check credentials or wait.')
return
fb_dtsg = self._search_regex(
r'name="fb_dtsg" value="(.+?)"', login_results, 'fb_dtsg', default=None)
h = self._search_regex(
r'name="h"\s+(?:\w+="[^"]+"\s+)*?value="([^"]+)"', login_results, 'h', default=None)
if not fb_dtsg or not h:
return
check_form = {
'fb_dtsg': fb_dtsg,
'h': h,
'name_action_selected': 'dont_save',
}
check_req = Request(self._CHECKPOINT_URL, urlencode_postdata(check_form))
check_req.headers['Content-Type'] = 'application/x-www-form-urlencoded'
check_response = self._download_webpage(check_req, None,
note='Confirming login')
if re.search(r'id="checkpointSubmitButton"', check_response) is not None:
self.report_warning('Unable to confirm login, you have to login in your browser and authorize the login.')
except network_exceptions as err:
self.report_warning(f'unable to log in: {err}')
return
def _extract_from_url(self, url, video_id):
webpage = self._download_webpage(
url.replace('://m.facebook.com/', '://www.facebook.com/'), video_id)
@@ -1081,6 +1018,7 @@ class FacebookAdsIE(InfoExtractor):
'upload_date': '20240812',
'like_count': int,
},
'skip': 'Invalid URL',
}, {
'url': 'https://www.facebook.com/ads/library/?id=893637265423481',
'info_dict': {
@@ -1095,6 +1033,33 @@ class FacebookAdsIE(InfoExtractor):
},
'playlist_count': 3,
'skip': 'Invalid URL',
}, {
'url': 'https://www.facebook.com/ads/library/?id=312304267031140',
'info_dict': {
'id': '312304267031140',
'title': 'Casper Wave Hybrid Mattress',
'uploader': 'Casper',
'uploader_id': '224110981099062',
'uploader_url': 'https://www.facebook.com/Casper/',
'timestamp': 1766299837,
'upload_date': '20251221',
'like_count': int,
},
'playlist_count': 2,
}, {
'url': 'https://www.facebook.com/ads/library/?id=874812092000430',
'info_dict': {
'id': '874812092000430',
'title': 'TikTok',
'uploader': 'Case \u00e0 Chocs',
'uploader_id': '112960472096793',
'uploader_url': 'https://www.facebook.com/Caseachocs/',
'timestamp': 1768498293,
'upload_date': '20260115',
'like_count': int,
'description': 'md5:f02a255fcf7dce6ed40e9494cf4bc49a',
},
'playlist_count': 3,
}, {
'url': 'https://es-la.facebook.com/ads/library/?id=901230958115569',
'only_matching': True,
@@ -1124,9 +1089,36 @@ class FacebookAdsIE(InfoExtractor):
})
return formats
def _download_fb_webpage_and_verify(self, url, video_id):
# See https://github.com/yt-dlp/yt-dlp/issues/15577
try:
return self._download_webpage(url, video_id)
except ExtractorError as e:
if (
not isinstance(e.cause, HTTPError)
or e.cause.status != 403
or e.cause.reason != 'Client challenge'
):
raise
error_page = self._webpage_read_content(e.cause.response, url, video_id)
self.write_debug('Received a client challenge response')
challenge_path = self._search_regex(
r'fetch\s*\(\s*["\'](/__rd_verify[^"\']+)["\']',
error_page, 'challenge path')
# Successful response will set the necessary cookie
self._request_webpage(
urljoin(url, challenge_path), video_id, 'Requesting verification cookie',
'Unable to get verification cookie', data=b'')
return self._download_webpage(url, video_id)
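# For reference, the challenge page is assumed to embed a script roughly
# like this (shape inferred from the regex above; exact markup varies):
#   <script>fetch("/__rd_verify?ch=...", {method: "POST"})</script>
# POSTing to that path (data=b'' above) sets the cookie that lets the
# plain retry succeed.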
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
webpage = self._download_fb_webpage_and_verify(url, video_id)
post_data = traverse_obj(
re.findall(r'data-sjs>({.*?ScheduledServerJS.*?})</script>', webpage), (..., {json.loads}))


@@ -5,6 +5,7 @@ from .common import InfoExtractor
from ..networking import Request
from ..utils import (
ExtractorError,
UserNotLive,
js_to_json,
traverse_obj,
update_url_query,
@@ -205,6 +206,9 @@ class FC2LiveIE(InfoExtractor):
'client_app': 'browser_hls',
'ipv6': '',
}), headers={'X-Requested-With': 'XMLHttpRequest'})
# A non-zero 'status' indicates the stream is not live, so check truthiness
if traverse_obj(control_server, ('status', {int})) and 'control_token' not in control_server:
raise UserNotLive(video_id=video_id)
self._set_cookie('live.fc2.com', 'l_ortkn', control_server['orz_raw'])
ws_url = update_url_query(control_server['url'], {'control_token': control_server['control_token']})


@@ -0,0 +1,52 @@
from .common import InfoExtractor
from ..utils import clean_html
from ..utils.traversal import (
find_element,
find_elements,
traverse_obj,
)
class FilmArchivIE(InfoExtractor):
IE_DESC = 'FILMARCHIV ON'
_VALID_URL = r'https?://(?:www\.)?filmarchiv\.at/de/filmarchiv-on/video/(?P<id>f_[0-9a-zA-Z]{5,})'
_TESTS = [{
'url': 'https://www.filmarchiv.at/de/filmarchiv-on/video/f_0305p7xKrXUPBwoNE9x6mh',
'md5': '54a6596f6a84624531866008a77fa27a',
'info_dict': {
'id': 'f_0305p7xKrXUPBwoNE9x6mh',
'ext': 'mp4',
'title': 'Der Wurstelprater zur Kaiserzeit',
'description': 'md5:9843f92df5cc9a4975cee7aabcf6e3b2',
'thumbnail': r're:https://cdn\.filmarchiv\.at/f_0305/p7xKrXUPBwoNE9x6mh_v1/poster\.jpg',
},
}, {
'url': 'https://www.filmarchiv.at/de/filmarchiv-on/video/f_0306vI3wO0tJIsfrqYFQXF',
'md5': '595385d7f54cb6529140ee8de7d1c3c7',
'info_dict': {
'id': 'f_0306vI3wO0tJIsfrqYFQXF',
'ext': 'mp4',
'title': 'Vor 70 Jahren: Wettgehen der Briefträger in Wien',
'description': 'md5:b2a2e4230923cd1969d471c552e62811',
'thumbnail': r're:https://cdn\.filmarchiv\.at/f_0306/vI3wO0tJIsfrqYFQXF_v1/poster\.jpg',
},
}]
def _real_extract(self, url):
media_id = self._match_id(url)
webpage = self._download_webpage(url, media_id)
path = '/'.join((media_id[:6], media_id[6:]))
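# e.g. 'f_0305p7xKrXUPBwoNE9x6mh' -> 'f_0305/p7xKrXUPBwoNE9x6mh'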
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
f'https://cdn.filmarchiv.at/{path}_v1_sv1/playlist.m3u8', media_id)
return {
'id': media_id,
'title': traverse_obj(webpage, ({find_element(tag='title-div')}, {clean_html})),
'description': traverse_obj(webpage, (
{find_elements(tag='div', attr='class', value=r'.*\bborder-base-content\b', regex=True)}, ...,
{find_elements(tag='div', attr='class', value=r'.*\bprose\b', html=False, regex=True)}, ...,
{clean_html}, any)),
'thumbnail': f'https://cdn.filmarchiv.at/{path}_v1/poster.jpg',
'formats': formats,
'subtitles': subtitles,
}


@@ -371,15 +371,16 @@ class FranceTVSiteIE(FranceTVBaseInfoExtractor):
class FranceTVInfoIE(FranceTVBaseInfoExtractor):
IE_NAME = 'francetvinfo.fr'
_VALID_URL = r'https?://(?:www|mobile|france3-regions)\.francetvinfo\.fr/(?:[^/]+/)*(?P<id>[^/?#&.]+)'
IE_NAME = 'franceinfo'
IE_DESC = 'franceinfo.fr (formerly francetvinfo.fr)'
_VALID_URL = r'https?://(?:www|mobile|france3-regions)\.france(?:tv)?info\.fr/(?:[^/?#]+/)*(?P<id>[^/?#&.]+)'
_TESTS = [{
'url': 'https://www.francetvinfo.fr/replay-jt/france-3/soir-3/jt-grand-soir-3-jeudi-22-aout-2019_3561461.html',
'info_dict': {
'id': 'd12458ee-5062-48fe-bfdd-a30d6a01b793',
'ext': 'mp4',
'title': 'Soir 3',
'title': 'Soir 3 - Émission du jeudi 22 août 2019',
'upload_date': '20190822',
'timestamp': 1566510730,
'thumbnail': r're:^https?://.*\.jpe?g$',
@@ -398,7 +399,7 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
'info_dict': {
'id': '7d204c9e-a2d3-11eb-9e4c-000d3a23d482',
'ext': 'mp4',
'title': 'Covid-19 : une situation catastrophique à New Dehli - Édition du mercredi 21 avril 2021',
'title': 'Journal 20h00 - Covid-19 : une situation catastrophique à New Dehli',
'thumbnail': r're:^https?://.*\.jpe?g$',
'duration': 76,
'timestamp': 1619028518,
@@ -438,6 +439,18 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
'thumbnail': r're:https://[^/?#]+/v/[^/?#]+/x1080',
},
'add_ie': ['Dailymotion'],
'skip': 'Broken Dailymotion link',
}, {
'url': 'https://www.franceinfo.fr/monde/usa/presidentielle/donald-trump/etats-unis-un-risque-d-embrasement-apres-la-mort-d-un-manifestant_7764542.html',
'info_dict': {
'id': 'f920fcc2-fa20-11f0-ac98-57a09c50f7ce',
'ext': 'mp4',
'title': 'Affaires sensibles - Manifestant tué Le risque d\'embrasement',
'duration': 118,
'thumbnail': r're:https?://.+/.+\.jpg',
'timestamp': 1769367756,
'upload_date': '20260125',
},
}, {
'url': 'http://france3-regions.francetvinfo.fr/limousin/emissions/jt-1213-limousin',
'only_matching': True,
@@ -445,6 +458,9 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
# "<figure id=" pattern (#28792)
'url': 'https://www.francetvinfo.fr/culture/patrimoine/incendie-de-notre-dame-de-paris/notre-dame-de-paris-de-l-incendie-de-la-cathedrale-a-sa-reconstruction_4372291.html',
'only_matching': True,
}, {
'url': 'https://www.franceinfo.fr/replay-jt/france-2/20-heures/robert-de-niro-portrait-d-un-monument-du-cinema_7245456.html',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -460,7 +476,7 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
video_id = (
traverse_obj(webpage, (
{find_element(tag='button', attr='data-cy', value='francetv-player-wrapper', html=True)},
{find_element(tag='(button|div)', attr='data-cy', value='francetv-player-wrapper', html=True, regex=True)},
{extract_attributes}, 'id'))
or self._search_regex(
(r'player\.load[^;]+src:\s*["\']([^"\']+)',

yt_dlp/extractor/frontro.py (new file)

@@ -0,0 +1,164 @@
import json
from .common import InfoExtractor
from ..utils import int_or_none, parse_iso8601, url_or_none
from ..utils.traversal import traverse_obj
class FrontoBaseIE(InfoExtractor):
def _get_auth_headers(self, url):
return traverse_obj(self._get_cookies(url), {
'authorization': ('frAccessToken', 'value', {lambda token: f'Bearer {token}' if token else None}),
})
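# i.e. a 'frAccessToken' cookie yields {'authorization': 'Bearer <token>'};
# without the cookie, traverse_obj drops the None value and an empty dict
# is returned, so requests simply go out unauthenticated.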
class FrontroVideoBaseIE(FrontoBaseIE):
_CHANNEL_ID = None
def _real_extract(self, url):
video_id = self._match_id(url)
metadata = self._download_json(
'https://api.frontrow.cc/query', video_id, data=json.dumps({
'operationName': 'Video',
'variables': {'channelID': self._CHANNEL_ID, 'videoID': video_id},
'query': '''query Video($channelID: ID!, $videoID: ID!) {
video(ChannelID: $channelID, VideoID: $videoID) {
... on Video {title description updatedAt thumbnail createdAt duration likeCount comments views url hasAccess}
}
}''',
}).encode(), headers={
'content-type': 'application/json',
**self._get_auth_headers(url),
})['data']['video']
if not traverse_obj(metadata, 'hasAccess'):
self.raise_login_required()
formats, subtitles = self._extract_m3u8_formats_and_subtitles(metadata['url'], video_id)
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
**traverse_obj(metadata, {
'title': ('title', {str}),
'description': ('description', {str}),
'thumbnail': ('thumbnail', {url_or_none}),
'timestamp': ('createdAt', {parse_iso8601}),
'modified_timestamp': ('updatedAt', {parse_iso8601}),
'duration': ('duration', {int_or_none}),
'like_count': ('likeCount', {int_or_none}),
'comment_count': ('comments', {int_or_none}),
'view_count': ('views', {int_or_none}),
}),
}
class FrontroGroupBaseIE(FrontoBaseIE):
_CHANNEL_ID = None
_VIDEO_EXTRACTOR = None
_VIDEO_URL_TMPL = None
def _real_extract(self, url):
group_id = self._match_id(url)
metadata = self._download_json(
'https://api.frontrow.cc/query', group_id, note='Downloading playlist metadata',
data=json.dumps({
'operationName': 'PaginatedStaticPageContainer',
'variables': {'channelID': self._CHANNEL_ID, 'first': 500, 'pageContainerID': group_id},
'query': '''query PaginatedStaticPageContainer($channelID: ID!, $pageContainerID: ID!) {
pageContainer(ChannelID: $channelID, PageContainerID: $pageContainerID) {
... on StaticPageContainer { id title updatedAt createdAt itemRefs {edges {node {
id contentItem { ... on ItemVideo { videoItem: item {
id
}}}
}}}
}
}
}''',
}).encode(), headers={
'content-type': 'application/json',
**self._get_auth_headers(url),
})['data']['pageContainer']
entries = []
for video_id in traverse_obj(metadata, (
'itemRefs', 'edges', ..., 'node', 'contentItem', 'videoItem', 'id', {str}),
):
entries.append(self.url_result(
self._VIDEO_URL_TMPL % video_id, self._VIDEO_EXTRACTOR, video_id))
return {
'_type': 'playlist',
'id': group_id,
'entries': entries,
**traverse_obj(metadata, {
'title': ('title', {str}),
'timestamp': ('createdAt', {parse_iso8601}),
'modified_timestamp': ('updatedAt', {parse_iso8601}),
}),
}
class TheChosenIE(FrontroVideoBaseIE):
_CHANNEL_ID = '12884901895'
_VALID_URL = r'https?://(?:www\.)?watch\.thechosen\.tv/watch/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://watch.thechosen.tv/watch/184683594325',
'md5': '3f878b689588c71b38ec9943c54ff5b0',
'info_dict': {
'id': '184683594325',
'ext': 'mp4',
'title': 'Season 3 Episode 2: Two by Two',
'description': 'md5:174c373756ecc8df46b403f4fcfbaf8c',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 4212,
'thumbnail': r're:https://fastly\.frontrowcdn\.com/channels/12884901895/VIDEO_THUMBNAIL/184683594325/',
'timestamp': 1698954546,
'upload_date': '20231102',
'modified_timestamp': int,
'modified_date': str,
},
}, {
'url': 'https://watch.thechosen.tv/watch/184683596189',
'md5': 'd581562f9d29ce82f5b7770415334151',
'info_dict': {
'id': '184683596189',
'ext': 'mp4',
'title': 'Season 4 Episode 8: Humble',
'description': 'md5:20a57bead43da1cf77cd5b0fe29bbc76',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 5092,
'thumbnail': r're:https://fastly\.frontrowcdn\.com/channels/12884901895/VIDEO_THUMBNAIL/184683596189/',
'timestamp': 1715019474,
'upload_date': '20240506',
'modified_timestamp': int,
'modified_date': str,
},
}]
class TheChosenGroupIE(FrontroGroupBaseIE):
_CHANNEL_ID = '12884901895'
_VIDEO_EXTRACTOR = TheChosenIE
_VIDEO_URL_TMPL = 'https://watch.thechosen.tv/watch/%s'
_VALID_URL = r'https?://(?:www\.)?watch\.thechosen\.tv/group/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://watch.thechosen.tv/group/309237658592',
'info_dict': {
'id': '309237658592',
'title': 'Season 3',
'timestamp': 1746203969,
'upload_date': '20250502',
'modified_timestamp': int,
'modified_date': str,
},
'playlist_count': 8,
}]


@@ -821,13 +821,17 @@ class GenericIE(InfoExtractor):
'Referer': smuggled_data.get('referer'),
}), impersonate=impersonate)
except ExtractorError as e:
if not (isinstance(e.cause, HTTPError) and e.cause.status == 403
and e.cause.response.get_header('cf-mitigated') == 'challenge'
and e.cause.response.extensions.get('impersonate') is None):
if not isinstance(e.cause, HTTPError) or e.cause.status != 403:
raise
res = e.cause.response
already_impersonating = res.extensions.get('impersonate') is not None
if already_impersonating or (
res.get_header('cf-mitigated') != 'challenge'
and b'<title>Attention Required! | Cloudflare</title>' not in res.read()
):
raise
cf_cookie_domain = traverse_obj(
LenientSimpleCookie(e.cause.response.get_header('set-cookie')),
('__cf_bm', 'domain'))
LenientSimpleCookie(res.get_header('set-cookie')), ('__cf_bm', 'domain'))
if cf_cookie_domain:
self.write_debug(f'Clearing __cf_bm cookie for {cf_cookie_domain}')
self.cookiejar.clear(domain=cf_cookie_domain, path='/', name='__cf_bm')


@@ -46,6 +46,7 @@ class GofileIE(InfoExtractor):
'videopassword': 'password',
},
}]
_STATIC_TOKEN = '4fd6sg89d7s6' # From https://gofile.io/dist/js/config.js
_TOKEN = None
def _real_initialize(self):
@@ -60,13 +61,16 @@ class GofileIE(InfoExtractor):
self._set_cookie('.gofile.io', 'accountToken', self._TOKEN)
def _entries(self, file_id):
query_params = {'wt': '4fd6sg89d7s6'} # From https://gofile.io/dist/js/alljs.js
password = self.get_param('videopassword')
if password:
query_params = {}
if password := self.get_param('videopassword'):
query_params['password'] = hashlib.sha256(password.encode()).hexdigest()
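# e.g. --video-password 'password' is sent as its SHA-256 hex digest:
# 5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8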
files = self._download_json(
f'https://api.gofile.io/contents/{file_id}', file_id, 'Getting filelist',
query=query_params, headers={'Authorization': f'Bearer {self._TOKEN}'})
query=query_params, headers={
'Authorization': f'Bearer {self._TOKEN}',
'X-Website-Token': self._STATIC_TOKEN,
})
status = files['status']
if status == 'error-passwordRequired':


@@ -27,7 +27,7 @@ class HotStarBaseIE(InfoExtractor):
_TOKEN_NAME = 'userUP'
_BASE_URL = 'https://www.hotstar.com'
_API_URL = 'https://api.hotstar.com'
_API_URL_V2 = 'https://apix.hotstar.com/v2'
_API_URL_V2 = 'https://www.hotstar.com/api/internal/bff/v2'
_AKAMAI_ENCRYPTION_KEY = b'\x05\xfc\x1a\x01\xca\xc9\x4b\xc4\x12\xfc\x53\x12\x07\x75\xf9\xee'
_FREE_HEADERS = {


@@ -9,14 +9,12 @@ from .openload import PhantomJSwrapper
from ..utils import (
ExtractorError,
clean_html,
decode_packed_codes,
float_or_none,
format_field,
get_element_by_attribute,
get_element_by_id,
int_or_none,
js_to_json,
ohdave_rsa_encrypt,
parse_age_limit,
parse_duration,
parse_iso8601,
@@ -33,143 +31,12 @@ def md5_text(text):
return hashlib.md5(text.encode()).hexdigest()
class IqiyiSDK:
def __init__(self, target, ip, timestamp):
self.target = target
self.ip = ip
self.timestamp = timestamp
@staticmethod
def split_sum(data):
return str(sum(int(p, 16) for p in data))
@staticmethod
def digit_sum(num):
if isinstance(num, int):
num = str(num)
return str(sum(map(int, num)))
def even_odd(self):
even = self.digit_sum(str(self.timestamp)[::2])
odd = self.digit_sum(str(self.timestamp)[1::2])
return even, odd
def preprocess(self, chunksize):
self.target = md5_text(self.target)
chunks = []
for i in range(32 // chunksize):
chunks.append(self.target[chunksize * i:chunksize * (i + 1)])
if 32 % chunksize:
chunks.append(self.target[32 - 32 % chunksize:])
return chunks, list(map(int, self.ip.split('.')))
def mod(self, modulus):
chunks, ip = self.preprocess(32)
self.target = chunks[0] + ''.join(str(p % modulus) for p in ip)
def split(self, chunksize):
modulus_map = {
4: 256,
5: 10,
8: 100,
}
chunks, ip = self.preprocess(chunksize)
ret = ''
for i in range(len(chunks)):
ip_part = str(ip[i] % modulus_map[chunksize]) if i < 4 else ''
if chunksize == 8:
ret += ip_part + chunks[i]
else:
ret += chunks[i] + ip_part
self.target = ret
def handle_input16(self):
self.target = md5_text(self.target)
self.target = self.split_sum(self.target[:16]) + self.target + self.split_sum(self.target[16:])
def handle_input8(self):
self.target = md5_text(self.target)
ret = ''
for i in range(4):
part = self.target[8 * i:8 * (i + 1)]
ret += self.split_sum(part) + part
self.target = ret
def handleSum(self):
self.target = md5_text(self.target)
self.target = self.split_sum(self.target) + self.target
def date(self, scheme):
self.target = md5_text(self.target)
d = time.localtime(self.timestamp)
strings = {
'y': str(d.tm_year),
'm': '%02d' % d.tm_mon,
'd': '%02d' % d.tm_mday,
}
self.target += ''.join(strings[c] for c in scheme)
def split_time_even_odd(self):
even, odd = self.even_odd()
self.target = odd + md5_text(self.target) + even
def split_time_odd_even(self):
even, odd = self.even_odd()
self.target = even + md5_text(self.target) + odd
def split_ip_time_sum(self):
chunks, ip = self.preprocess(32)
self.target = str(sum(ip)) + chunks[0] + self.digit_sum(self.timestamp)
def split_time_ip_sum(self):
chunks, ip = self.preprocess(32)
self.target = self.digit_sum(self.timestamp) + chunks[0] + str(sum(ip))
class IqiyiSDKInterpreter:
def __init__(self, sdk_code):
self.sdk_code = sdk_code
def run(self, target, ip, timestamp):
self.sdk_code = decode_packed_codes(self.sdk_code)
functions = re.findall(r'input=([a-zA-Z0-9]+)\(input', self.sdk_code)
sdk = IqiyiSDK(target, ip, timestamp)
other_functions = {
'handleSum': sdk.handleSum,
'handleInput8': sdk.handle_input8,
'handleInput16': sdk.handle_input16,
'splitTimeEvenOdd': sdk.split_time_even_odd,
'splitTimeOddEven': sdk.split_time_odd_even,
'splitIpTimeSum': sdk.split_ip_time_sum,
'splitTimeIpSum': sdk.split_time_ip_sum,
}
for function in functions:
if re.match(r'mod\d+', function):
sdk.mod(int(function[3:]))
elif re.match(r'date[ymd]{3}', function):
sdk.date(function[4:])
elif re.match(r'split\d+', function):
sdk.split(int(function[5:]))
elif function in other_functions:
other_functions[function]()
else:
raise ExtractorError(f'Unknown function {function}')
return sdk.target
class IqiyiIE(InfoExtractor):
IE_NAME = 'iqiyi'
IE_DESC = '爱奇艺'
_VALID_URL = r'https?://(?:(?:[^.]+\.)?iqiyi\.com|www\.pps\.tv)/.+\.html'
_NETRC_MACHINE = 'iqiyi'
_TESTS = [{
'url': 'http://www.iqiyi.com/v_19rrojlavg.html',
# MD5 checksum differs on my machine and Travis CI
@@ -234,57 +101,6 @@ class IqiyiIE(InfoExtractor):
'18': 7, # 1080p
}
@staticmethod
def _rsa_fun(data):
# public key extracted from http://static.iqiyi.com/js/qiyiV2/20160129180840/jobs/i18n/i18nIndex.js
N = 0xab86b6371b5318aaa1d3c9e612a9f1264f372323c8c0f19875b5fc3b3fd3afcc1e5bec527aa94bfa85bffc157e4245aebda05389a5357b75115ac94f074aefcd
e = 65537
return ohdave_rsa_encrypt(data, e, N)
def _perform_login(self, username, password):
data = self._download_json(
'http://kylin.iqiyi.com/get_token', None,
note='Get token for logging', errnote='Unable to get token for logging')
sdk = data['sdk']
timestamp = int(time.time())
target = (
f'/apis/reglogin/login.action?lang=zh_TW&area_code=null&email={username}'
f'&passwd={self._rsa_fun(password.encode())}&agenttype=1&from=undefined&keeplogin=0&piccode=&fromurl=&_pos=1')
interp = IqiyiSDKInterpreter(sdk)
sign = interp.run(target, data['ip'], timestamp)
validation_params = {
'target': target,
'server': 'BEA3AA1908656AABCCFF76582C4C6660',
'token': data['token'],
'bird_src': 'f8d91d57af224da7893dd397d52d811a',
'sign': sign,
'bird_t': timestamp,
}
validation_result = self._download_json(
'http://kylin.iqiyi.com/validate?' + urllib.parse.urlencode(validation_params), None,
note='Validate credentials', errnote='Unable to validate credentials')
MSG_MAP = {
'P00107': 'please login via the web interface and enter the CAPTCHA code',
'P00117': 'bad username or password',
}
code = validation_result['code']
if code != 'A00000':
msg = MSG_MAP.get(code)
if not msg:
msg = f'error {code}'
if validation_result.get('msg'):
msg += ': ' + validation_result['msg']
self.report_warning('unable to log in: ' + msg)
return False
return True
def get_raw_data(self, tvid, video_id):
tm = int(time.time() * 1000)


@@ -95,6 +95,7 @@ class LBRYBaseIE(InfoExtractor):
'_type': 'url',
'id': item['claim_id'],
'url': self._permanent_url(url, item['name'], item['claim_id']),
'ie_key': 'LBRY',
}
def _playlist_entries(self, url, display_id, claim_param, metadata):


@@ -8,12 +8,10 @@ from ..utils import (
ExtractorError,
determine_ext,
filter_dict,
get_first,
int_or_none,
parse_iso8601,
update_url,
url_or_none,
variadic,
)
from ..utils.traversal import traverse_obj
@@ -51,7 +49,7 @@ class LoomIE(InfoExtractor):
}, {
# m3u8 raw-url, mp4 transcoded-url, cdn url == raw-url, vtt sub and json subs
'url': 'https://www.loom.com/share/9458bcbf79784162aa62ffb8dd66201b',
'md5': '51737ec002969dd28344db4d60b9cbbb',
'md5': '7b6bfdef8181c4ffc376e18919a4dcc2',
'info_dict': {
'id': '9458bcbf79784162aa62ffb8dd66201b',
'ext': 'mp4',
@@ -71,12 +69,13 @@ class LoomIE(InfoExtractor):
'ext': 'webm',
'title': 'OMFG clown',
'description': 'md5:285c5ee9d62aa087b7e3271b08796815',
'uploader': 'MrPumkin B',
'uploader': 'Brailey Bragg',
'upload_date': '20210924',
'timestamp': 1632519618,
'duration': 210,
},
'params': {'skip_download': 'dash'},
'expected_warnings': ['Failed to parse JSON'], # transcoded-url no longer available
}, {
# password-protected
'url': 'https://www.loom.com/share/50e26e8aeb7940189dff5630f95ce1f4',
@@ -91,10 +90,11 @@ class LoomIE(InfoExtractor):
'duration': 35,
},
'params': {'videopassword': 'seniorinfants2'},
'expected_warnings': ['Failed to parse JSON'], # transcoded-url no longer available
}, {
# embed, transcoded-url endpoint sends empty JSON response, split video and audio HLS formats
'url': 'https://www.loom.com/embed/ddcf1c1ad21f451ea7468b1e33917e4e',
'md5': 'b321d261656848c184a94e3b93eae28d',
'md5': 'f983a0f02f24331738b2f43aecb05256',
'info_dict': {
'id': 'ddcf1c1ad21f451ea7468b1e33917e4e',
'ext': 'mp4',
@@ -119,11 +119,12 @@ class LoomIE(InfoExtractor):
'duration': 247,
'timestamp': 1676274030,
},
'skip': '404 Not Found',
}]
_GRAPHQL_VARIABLES = {
'GetVideoSource': {
'acceptableMimes': ['DASH', 'M3U8', 'MP4'],
'acceptableMimes': ['DASH', 'M3U8', 'MP4', 'WEBM'],
},
}
_GRAPHQL_QUERIES = {
@@ -192,6 +193,12 @@ class LoomIE(InfoExtractor):
id
nullableRawCdnUrl(acceptableMimes: $acceptableMimes, password: $password) {
url
credentials {
Policy
Signature
KeyPairId
__typename
}
__typename
}
__typename
@@ -240,9 +247,9 @@ class LoomIE(InfoExtractor):
}
}\n'''),
}
_APOLLO_GRAPHQL_VERSION = '0a1856c'
_APOLLO_GRAPHQL_VERSION = '45a5bd4'
def _call_graphql_api(self, operations, video_id, note=None, errnote=None):
def _call_graphql_api(self, operation_name, video_id, note=None, errnote=None, fatal=True):
password = self.get_param('videopassword')
return self._download_json(
'https://www.loom.com/graphql', video_id, note or 'Downloading GraphQL JSON',
@@ -252,7 +259,9 @@ class LoomIE(InfoExtractor):
'x-loom-request-source': f'loom_web_{self._APOLLO_GRAPHQL_VERSION}',
'apollographql-client-name': 'web',
'apollographql-client-version': self._APOLLO_GRAPHQL_VERSION,
}, data=json.dumps([{
'graphql-operation-name': operation_name,
'Origin': 'https://www.loom.com',
}, data=json.dumps({
'operationName': operation_name,
'variables': {
'videoId': video_id,
@@ -260,7 +269,7 @@ class LoomIE(InfoExtractor):
**self._GRAPHQL_VARIABLES.get(operation_name, {}),
},
'query': self._GRAPHQL_QUERIES[operation_name],
} for operation_name in variadic(operations)], separators=(',', ':')).encode())
}, separators=(',', ':')).encode(), fatal=fatal)
def _call_url_api(self, endpoint, video_id):
response = self._download_json(
@@ -275,7 +284,7 @@ class LoomIE(InfoExtractor):
}, separators=(',', ':')).encode())
return traverse_obj(response, ('url', {url_or_none}))
def _extract_formats(self, video_id, metadata, gql_data):
def _extract_formats(self, video_id, metadata, video_data):
formats = []
video_properties = traverse_obj(metadata, ('video_properties', {
'width': ('width', {int_or_none}),
@@ -330,7 +339,7 @@ class LoomIE(InfoExtractor):
transcoded_url = self._call_url_api('transcoded-url', video_id)
formats.extend(get_formats(transcoded_url, 'transcoded', quality=-1)) # transcoded quality
cdn_url = get_first(gql_data, ('data', 'getVideo', 'nullableRawCdnUrl', 'url', {url_or_none}))
cdn_url = traverse_obj(video_data, ('data', 'getVideo', 'nullableRawCdnUrl', 'url', {url_or_none}))
# cdn_url is usually a dupe, but the raw-url/transcoded-url endpoints could return errors
valid_urls = [update_url(url, query=None) for url in (raw_url, transcoded_url) if url]
if cdn_url and update_url(cdn_url, query=None) not in valid_urls:
@@ -338,10 +347,21 @@ class LoomIE(InfoExtractor):
return formats
def _get_subtitles(self, video_id):
subs_data = self._call_graphql_api(
'FetchVideoTranscript', video_id, 'Downloading GraphQL subtitles JSON', fatal=False)
return filter_dict({
'en': traverse_obj(subs_data, (
'data', 'fetchVideoTranscript',
('source_url', 'captions_source_url'), {
'url': {url_or_none},
})) or None,
})
def _real_extract(self, url):
video_id = self._match_id(url)
metadata = get_first(
self._call_graphql_api('GetVideoSSR', video_id, 'Downloading GraphQL metadata JSON'),
metadata = traverse_obj(
self._call_graphql_api('GetVideoSSR', video_id, 'Downloading GraphQL metadata JSON', fatal=False),
('data', 'getVideo', {dict})) or {}
if metadata.get('__typename') == 'VideoPasswordMissingOrIncorrect':
@@ -350,22 +370,19 @@ class LoomIE(InfoExtractor):
'This video is password-protected, use the --video-password option', expected=True)
raise ExtractorError('Invalid video password', expected=True)
gql_data = self._call_graphql_api(['FetchChapters', 'FetchVideoTranscript', 'GetVideoSource'], video_id)
video_data = self._call_graphql_api(
'GetVideoSource', video_id, 'Downloading GraphQL video JSON')
chapter_data = self._call_graphql_api(
'FetchChapters', video_id, 'Downloading GraphQL chapters JSON', fatal=False)
duration = traverse_obj(metadata, ('video_properties', 'duration', {int_or_none}))
return {
'id': video_id,
'duration': duration,
'chapters': self._extract_chapters_from_description(
get_first(gql_data, ('data', 'fetchVideoChapters', 'content', {str})), duration) or None,
'formats': self._extract_formats(video_id, metadata, gql_data),
'subtitles': filter_dict({
'en': traverse_obj(gql_data, (
..., 'data', 'fetchVideoTranscript',
('source_url', 'captions_source_url'), {
'url': {url_or_none},
})) or None,
}),
traverse_obj(chapter_data, ('data', 'fetchVideoChapters', 'content', {str})), duration) or None,
'formats': self._extract_formats(video_id, metadata, video_data),
'subtitles': self.extract_subtitles(video_id),
**traverse_obj(metadata, {
'title': ('name', {str}),
'description': ('description', {str}),
@@ -376,6 +393,7 @@ class LoomIE(InfoExtractor):
class LoomFolderIE(InfoExtractor):
_WORKING = False
IE_NAME = 'loom:folder'
_VALID_URL = r'https?://(?:www\.)?loom\.com/share/folder/(?P<id>[\da-f]{32})'
_TESTS = [{


@@ -1,128 +0,0 @@
from .common import InfoExtractor
from ..utils import clean_html, int_or_none, traverse_obj
_API_URL = 'https://dak1vd5vmi7x6.cloudfront.net/api/v1/publicrole/{}/{}?id={}'
class ManotoTVIE(InfoExtractor):
IE_DESC = 'Manoto TV (Episode)'
_VALID_URL = r'https?://(?:www\.)?manototv\.com/episode/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://www.manototv.com/episode/8475',
'info_dict': {
'id': '8475',
'series': 'خانه های رویایی با برادران اسکات',
'season_number': 7,
'episode_number': 25,
'episode_id': 'My Dream Home S7: Carol & John',
'duration': 3600,
'categories': ['سرگرمی'],
'title': 'کارول و جان',
'description': 'md5:d0fff1f8ba5c6775d312a00165d1a97e',
'thumbnail': r're:^https?://.*\.(jpeg|png|jpg)$',
'ext': 'mp4',
},
'params': {
'skip_download': 'm3u8',
},
}, {
'url': 'https://www.manototv.com/episode/12576',
'info_dict': {
'id': '12576',
'series': 'فیلم های ایرانی',
'episode_id': 'Seh Mah Taatili',
'duration': 5400,
'view_count': int,
'categories': ['سرگرمی'],
'title': 'سه ماه تعطیلی',
'description': 'سه ماه تعطیلی فیلمی به کارگردانی و نویسندگی شاپور قریب ساختهٔ سال ۱۳۵۶ است.',
'thumbnail': r're:^https?://.*\.(jpeg|png|jpg)$',
'ext': 'mp4',
},
'params': {
'skip_download': 'm3u8',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
episode_json = self._download_json(_API_URL.format('showmodule', 'episodedetails', video_id), video_id)
details = episode_json.get('details', {})
formats = self._extract_m3u8_formats(details.get('videoM3u8Url'), video_id, 'mp4')
return {
'id': video_id,
'series': details.get('showTitle'),
'season_number': int_or_none(details.get('analyticsSeasonNumber')),
'episode_number': int_or_none(details.get('episodeNumber')),
'episode_id': details.get('analyticsEpisodeTitle'),
'duration': int_or_none(details.get('durationInMinutes'), invscale=60),
'view_count': details.get('viewCount'),
'categories': [details.get('videoCategory')],
'title': details.get('episodeTitle'),
'description': clean_html(details.get('episodeDescription')),
'thumbnail': details.get('episodelandscapeImgIxUrl'),
'formats': formats,
}
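# Aside: int_or_none with invscale multiplies rather than divides, which is how
# the durationInMinutes field above becomes seconds. A quick check:
from yt_dlp.utils import int_or_none

print(int_or_none('90', invscale=60))  # -> 5400 (minutes -> seconds)
print(int_or_none(None, invscale=60))  # -> None (missing values pass through)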
class ManotoTVShowIE(InfoExtractor):
IE_DESC = 'Manoto TV (Show)'
_VALID_URL = r'https?://(?:www\.)?manototv\.com/show/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://www.manototv.com/show/2526',
'playlist_mincount': 68,
'info_dict': {
'id': '2526',
'title': 'فیلم های ایرانی',
'description': 'مجموعه ای از فیلم های سینمای کلاسیک ایران',
},
}]
def _real_extract(self, url):
show_id = self._match_id(url)
show_json = self._download_json(_API_URL.format('showmodule', 'details', show_id), show_id)
show_details = show_json.get('details', {})
title = show_details.get('showTitle')
description = show_details.get('showSynopsis')
series_json = self._download_json(_API_URL.format('showmodule', 'serieslist', show_id), show_id)
playlist_id = str(traverse_obj(series_json, ('details', 'list', 0, 'id')))
playlist_json = self._download_json(_API_URL.format('showmodule', 'episodelist', playlist_id), playlist_id)
playlist = traverse_obj(playlist_json, ('details', 'list')) or []
entries = [
self.url_result(
'https://www.manototv.com/episode/{}'.format(item['slideID']), ie=ManotoTVIE.ie_key(), video_id=item['slideID'])
for item in playlist]
return self.playlist_result(entries, show_id, title, description)
class ManotoTVLiveIE(InfoExtractor):
IE_DESC = 'Manoto TV (Live)'
_VALID_URL = r'https?://(?:www\.)?manototv\.com/live/'
_TEST = {
'url': 'https://www.manototv.com/live/',
'info_dict': {
'id': 'live',
'title': 'Manoto TV Live',
'ext': 'mp4',
'is_live': True,
},
'params': {
'skip_download': 'm3u8',
},
}
def _real_extract(self, url):
video_id = 'live'
json = self._download_json(_API_URL.format('livemodule', 'details', ''), video_id)
details = json.get('details', {})
video_url = details.get('liveUrl')
formats = self._extract_m3u8_formats(video_url, video_id, 'mp4', live=True)
return {
'id': video_id,
'title': 'Manoto TV Live',
'is_live': True,
'formats': formats,
}

View File

@@ -1,7 +1,9 @@
import re
import functools
import math
from .common import InfoExtractor
from ..utils import (
InAdvancePagedList,
clean_html,
int_or_none,
parse_iso8601,
@@ -10,15 +12,64 @@ from ..utils import (
from ..utils.traversal import require, traverse_obj
class MaveIE(InfoExtractor):
_VALID_URL = r'https?://(?P<channel>[\w-]+)\.mave\.digital/(?P<id>ep-\d+)'
class MaveBaseIE(InfoExtractor):
_API_BASE_URL = 'https://api.mave.digital/v1/website'
_API_BASE_STORAGE_URL = 'https://store.cloud.mts.ru/mave/'
def _load_channel_meta(self, channel_id, display_id):
return traverse_obj(self._download_json(
f'{self._API_BASE_URL}/{channel_id}/', display_id,
note='Downloading channel metadata'), 'podcast')
def _load_episode_meta(self, channel_id, episode_code, display_id):
return self._download_json(
f'{self._API_BASE_URL}/{channel_id}/episodes/{episode_code}',
display_id, note='Downloading episode metadata')
def _create_entry(self, channel_id, channel_meta, episode_meta):
episode_code = traverse_obj(episode_meta, ('code', {int}, {require('episode code')}))
return {
'display_id': f'{channel_id}-{episode_code}',
'extractor_key': MaveIE.ie_key(),
'extractor': MaveIE.IE_NAME,
'webpage_url': f'https://{channel_id}.mave.digital/ep-{episode_code}',
'channel_id': channel_id,
'channel_url': f'https://{channel_id}.mave.digital/',
'vcodec': 'none',
**traverse_obj(episode_meta, {
'id': ('id', {str}),
'url': ('audio', {urljoin(self._API_BASE_STORAGE_URL)}),
'title': ('title', {str}),
'description': ('description', {clean_html}),
'thumbnail': ('image', {urljoin(self._API_BASE_STORAGE_URL)}),
'duration': ('duration', {int_or_none}),
'season_number': ('season', {int_or_none}),
'episode_number': ('number', {int_or_none}),
'view_count': ('listenings', {int_or_none}),
'like_count': ('reactions', lambda _, v: v['type'] == 'like', 'count', {int_or_none}, any),
'dislike_count': ('reactions', lambda _, v: v['type'] == 'dislike', 'count', {int_or_none}, any),
'age_limit': ('is_explicit', {bool}, {lambda x: 18 if x else None}),
'timestamp': ('publish_date', {parse_iso8601}),
}),
**traverse_obj(channel_meta, {
'series_id': ('id', {str}),
'series': ('title', {str}),
'channel': ('title', {str}),
'uploader': ('author', {str}),
}),
}
class MaveIE(MaveBaseIE):
IE_NAME = 'mave'
_VALID_URL = r'https?://(?P<channel_id>[\w-]+)\.mave\.digital/ep-(?P<episode_code>\d+)'
_TESTS = [{
'url': 'https://ochenlichnoe.mave.digital/ep-25',
'md5': 'aa3e513ef588b4366df1520657cbc10c',
'info_dict': {
'id': '4035f587-914b-44b6-aa5a-d76685ad9bc2',
'ext': 'mp3',
'display_id': 'ochenlichnoe-ep-25',
'display_id': 'ochenlichnoe-25',
'title': 'Между мной и миром: психология самооценки',
'description': 'md5:4b7463baaccb6982f326bce5c700382a',
'uploader': 'Самарский университет',
@@ -45,7 +96,7 @@ class MaveIE(InfoExtractor):
'info_dict': {
'id': '41898bb5-ff57-4797-9236-37a8e537aa21',
'ext': 'mp3',
'display_id': 'budem-ep-12',
'display_id': 'budem-12',
'title': 'Екатерина Михайлова: "Горе от ума" не про женщин написана',
'description': 'md5:fa3bdd59ee829dfaf16e3efcb13f1d19',
'uploader': 'Полина Цветкова+Евгения Акопова',
@@ -68,40 +119,72 @@ class MaveIE(InfoExtractor):
'upload_date': '20241230',
},
}]
_API_BASE_URL = 'https://api.mave.digital/'
def _real_extract(self, url):
channel_id, slug = self._match_valid_url(url).group('channel', 'id')
display_id = f'{channel_id}-{slug}'
webpage = self._download_webpage(url, display_id)
data = traverse_obj(
self._search_nuxt_json(webpage, display_id),
('data', lambda _, v: v['activeEpisodeData'], any, {require('podcast data')}))
channel_id, episode_code = self._match_valid_url(url).group(
'channel_id', 'episode_code')
display_id = f'{channel_id}-{episode_code}'
channel_meta = self._load_channel_meta(channel_id, display_id)
episode_meta = self._load_episode_meta(channel_id, episode_code, display_id)
return self._create_entry(channel_id, channel_meta, episode_meta)
class MaveChannelIE(MaveBaseIE):
IE_NAME = 'mave:channel'
_VALID_URL = r'https?://(?P<id>[\w-]+)\.mave\.digital/?(?:$|[?#])'
_TESTS = [{
'url': 'https://budem.mave.digital/',
'info_dict': {
'id': 'budem',
'title': 'Все там будем',
'description': 'md5:f04ae12a42be0f1d765c5e326b41987a',
},
'playlist_mincount': 15,
}, {
'url': 'https://ochenlichnoe.mave.digital/',
'info_dict': {
'id': 'ochenlichnoe',
'title': 'Очень личное',
'description': 'md5:ee36a6a52546b91b487fe08c552fdbb2',
},
'playlist_mincount': 20,
}, {
'url': 'https://geekcity.mave.digital/',
'info_dict': {
'id': 'geekcity',
'title': 'Мужчины в трико',
'description': 'md5:4164d425d60a0d97abdce9d1f6f8e049',
},
'playlist_mincount': 80,
}]
_PAGE_SIZE = 50
def _entries(self, channel_id, channel_meta, page_num):
page_data = self._download_json(
f'{self._API_BASE_URL}/{channel_id}/episodes', channel_id, query={
'view': 'all',
'page': page_num + 1,
'sort': 'newest',
'format': 'all',
}, note=f'Downloading page {page_num + 1}')
for ep in traverse_obj(page_data, ('episodes', lambda _, v: v['audio'] and v['id'])):
yield self._create_entry(channel_id, channel_meta, ep)
def _real_extract(self, url):
channel_id = self._match_id(url)
channel_meta = self._load_channel_meta(channel_id, channel_id)
return {
'display_id': display_id,
'channel_id': channel_id,
'channel_url': f'https://{channel_id}.mave.digital/',
'vcodec': 'none',
'thumbnail': re.sub(r'_\d+(?=\.(?:jpg|png))', '', self._og_search_thumbnail(webpage, default='')) or None,
**traverse_obj(data, ('activeEpisodeData', {
'url': ('audio', {urljoin(self._API_BASE_URL)}),
'id': ('id', {str}),
'_type': 'playlist',
'id': channel_id,
**traverse_obj(channel_meta, {
'title': ('title', {str}),
'description': ('description', {clean_html}),
'duration': ('duration', {int_or_none}),
'season_number': ('season', {int_or_none}),
'episode_number': ('number', {int_or_none}),
'view_count': ('listenings', {int_or_none}),
'like_count': ('reactions', lambda _, v: v['type'] == 'like', 'count', {int_or_none}, any),
'dislike_count': ('reactions', lambda _, v: v['type'] == 'dislike', 'count', {int_or_none}, any),
'age_limit': ('is_explicit', {bool}, {lambda x: 18 if x else None}),
'timestamp': ('publish_date', {parse_iso8601}),
})),
**traverse_obj(data, ('podcast', 'podcast', {
'series_id': ('id', {str}),
'series': ('title', {str}),
'channel': ('title', {str}),
'uploader': ('author', {str}),
})),
'description': ('description', {str}),
}),
'entries': InAdvancePagedList(
functools.partial(self._entries, channel_id, channel_meta),
math.ceil(channel_meta['episodes_count'] / self._PAGE_SIZE), self._PAGE_SIZE),
}
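# A rough sketch of the InAdvancePagedList wiring above, using made-up data:
# the page count is known up front from episodes_count, and a page is fetched
# lazily only when one of its entries is requested.
import math
from yt_dlp.utils import InAdvancePagedList

EPISODES = [f'ep-{i}' for i in range(1, 24)]  # pretend API data, 23 episodes
PAGE_SIZE = 10

def fetch_page(page_num):  # stands in for the per-page API request
    yield from EPISODES[page_num * PAGE_SIZE:(page_num + 1) * PAGE_SIZE]

pages = InAdvancePagedList(fetch_page, math.ceil(len(EPISODES) / PAGE_SIZE), PAGE_SIZE)
print(pages.getslice(8, 12))  # -> ['ep-9', 'ep-10', 'ep-11', 'ep-12']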

View File

@@ -1,14 +1,9 @@
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
float_or_none,
format_field,
int_or_none,
str_or_none,
traverse_obj,
url_or_none,
)
from ..utils.traversal import traverse_obj
class MedalTVIE(InfoExtractor):
@@ -30,25 +25,8 @@ class MedalTVIE(InfoExtractor):
'view_count': int,
'like_count': int,
'duration': 13,
},
}, {
'url': 'https://medal.tv/games/cod-cold-war/clips/2mA60jWAGQCBH',
'md5': 'fc7a3e4552ae8993c1c4006db46be447',
'info_dict': {
'id': '2mA60jWAGQCBH',
'ext': 'mp4',
'title': 'Quad Cold',
'description': 'Medal,https://medal.tv/desktop/',
'uploader': 'MowgliSB',
'timestamp': 1603165266,
'upload_date': '20201020',
'uploader_id': '10619174',
'thumbnail': 'https://cdn.medal.tv/10619174/thumbnail-34934644-720p.jpg?t=1080p&c=202042&missing',
'uploader_url': 'https://medal.tv/users/10619174',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 23,
'thumbnail': r're:https://cdn\.medal\.tv/ugcp/content-thumbnail/.*\.jpg',
'tags': ['headshot', 'valorant', '4k', 'clutch', 'mornu'],
},
}, {
'url': 'https://medal.tv/games/cod-cold-war/clips/2um24TWdty0NA',
@@ -57,12 +35,12 @@ class MedalTVIE(InfoExtractor):
'id': '2um24TWdty0NA',
'ext': 'mp4',
'title': 'u tk me i tk u bigger',
'description': 'Medal,https://medal.tv/desktop/',
'uploader': 'Mimicc',
'description': '',
'uploader': 'zahl',
'timestamp': 1605580939,
'upload_date': '20201117',
'uploader_id': '5156321',
'thumbnail': 'https://cdn.medal.tv/5156321/thumbnail-36787208-360p.jpg?t=1080p&c=202046&missing',
'thumbnail': r're:https://cdn\.medal\.tv/source/.*\.png',
'uploader_url': 'https://medal.tv/users/5156321',
'comment_count': int,
'view_count': int,
@@ -70,91 +48,77 @@ class MedalTVIE(InfoExtractor):
'duration': 9,
},
}, {
'url': 'https://medal.tv/games/valorant/clips/37rMeFpryCC-9',
'only_matching': True,
}, {
# API requires auth
'url': 'https://medal.tv/games/valorant/clips/2WRj40tpY_EU9',
'md5': '6c6bb6569777fd8b4ef7b33c09de8dcf',
'info_dict': {
'id': '2WRj40tpY_EU9',
'ext': 'mp4',
'title': '1v5 clutch',
'description': '',
'uploader': 'adny',
'uploader_id': '6256941',
'uploader_url': 'https://medal.tv/users/6256941',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 25,
'thumbnail': r're:https://cdn\.medal\.tv/source/.*\.jpg',
'timestamp': 1612896680,
'upload_date': '20210209',
},
'expected_warnings': ['Video formats are not available through API'],
}, {
'url': 'https://medal.tv/games/valorant/clips/37rMeFpryCC-9',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id, query={'mobilebypass': 'true'})
hydration_data = self._search_json(
r'<script[^>]*>[^<]*\bhydrationData\s*=', webpage,
'next data', video_id, end_pattern='</script>', fatal=False)
clip = traverse_obj(hydration_data, ('clips', ...), get_all=False)
if not clip:
raise ExtractorError(
'Could not find video information.', video_id=video_id)
title = clip['contentTitle']
source_width = int_or_none(clip.get('sourceWidth'))
source_height = int_or_none(clip.get('sourceHeight'))
aspect_ratio = source_width / source_height if source_width and source_height else 16 / 9
def add_item(container, item_url, height, id_key='format_id', item_id=None):
item_id = item_id or '%dp' % height
if item_id not in item_url:
return
container.append({
'url': item_url,
id_key: item_id,
'width': round(aspect_ratio * height),
'height': height,
})
content_data = self._download_json(
f'https://medal.tv/api/content/{video_id}', video_id,
headers={'Accept': 'application/json'})
formats = []
thumbnails = []
for k, v in clip.items():
if not (v and isinstance(v, str)):
continue
mobj = re.match(r'(contentUrl|thumbnail)(?:(\d+)p)?$', k)
if not mobj:
continue
prefix = mobj.group(1)
height = int_or_none(mobj.group(2))
if prefix == 'contentUrl':
add_item(
formats, v, height or source_height,
item_id=None if height else 'source')
elif prefix == 'thumbnail':
add_item(thumbnails, v, height, 'id')
error = clip.get('error')
if not formats and error:
if error == 404:
self.raise_no_formats(
'That clip does not exist.',
expected=True, video_id=video_id)
else:
self.raise_no_formats(
f'An unknown error occurred ({error}).',
video_id=video_id)
# Necessary because the id of the author is not known in advance.
# Won't raise an issue if no profile can be found as this is optional.
author = traverse_obj(hydration_data, ('profiles', ...), get_all=False) or {}
author_id = str_or_none(author.get('userId'))
author_url = format_field(author_id, None, 'https://medal.tv/users/%s')
if m3u8_url := url_or_none(content_data.get('contentUrlHls')):
formats.extend(self._extract_m3u8_formats(m3u8_url, video_id, 'mp4', m3u8_id='hls'))
if http_url := url_or_none(content_data.get('contentUrl')):
formats.append({
'url': http_url,
'format_id': 'http-source',
'ext': 'mp4',
'quality': 1,
})
formats = [fmt for fmt in formats if 'video/privacy-protected-guest' not in fmt['url']]
if not formats:
# Fallback, does not require auth
self.report_warning('Video formats are not available through API, falling back to social video URL')
urlh = self._request_webpage(
f'https://medal.tv/api/content/{video_id}/socialVideoUrl', video_id,
note='Checking social video URL')
formats.append({
'url': urlh.url,
'format_id': 'social-video',
'ext': 'mp4',
'quality': -1,
})
return {
'id': video_id,
'title': title,
'formats': formats,
'thumbnails': thumbnails,
'description': clip.get('contentDescription'),
'uploader': author.get('displayName'),
'timestamp': float_or_none(clip.get('created'), 1000),
'uploader_id': author_id,
'uploader_url': author_url,
'duration': int_or_none(clip.get('videoLengthSeconds')),
'view_count': int_or_none(clip.get('views')),
'like_count': int_or_none(clip.get('likes')),
'comment_count': int_or_none(clip.get('comments')),
**traverse_obj(content_data, {
'title': ('contentTitle', {str}),
'description': ('contentDescription', {str}),
'timestamp': ('created', {int_or_none(scale=1000)}),
'duration': ('videoLengthSeconds', {int_or_none}),
'view_count': ('views', {int_or_none}),
'like_count': ('likes', {int_or_none}),
'comment_count': ('comments', {int_or_none}),
'uploader': ('poster', 'displayName', {str}),
'uploader_id': ('poster', 'userId', {str}),
'uploader_url': ('poster', 'userId', {str}, filter, {lambda x: x and f'https://medal.tv/users/{x}'}),
'tags': ('tags', ..., {str}),
'thumbnail': ('thumbnailUrl', {url_or_none}),
}),
}
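# int_or_none supports partial application, which is what makes the
# {int_or_none(scale=1000)} transform above usable inside traverse_obj;
# scale=1000 converts Medal's millisecond timestamps to seconds.
from yt_dlp.utils import int_or_none

to_seconds = int_or_none(scale=1000)  # partially-applied callable
print(to_seconds(1605580939000))      # -> 1605580939
print(to_seconds('not a number'))     # -> None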

View File

@@ -478,3 +478,64 @@ class NebulaChannelIE(NebulaBaseIE):
playlist_id=collection_slug,
playlist_title=channel.get('title'),
playlist_description=channel.get('description'))
class NebulaSeasonIE(NebulaBaseIE):
IE_NAME = 'nebula:season'
_VALID_URL = rf'{_BASE_URL_RE}/(?P<series>[\w-]+)/season/(?P<season_number>[\w-]+)'
_TESTS = [{
'url': 'https://nebula.tv/jetlag/season/15',
'info_dict': {
'id': 'jetlag_15',
'title': 'Tag: All Stars',
'description': 'md5:5aa5b8abf3de71756448dc44ffebb674',
},
'playlist_count': 8,
}, {
'url': 'https://nebula.tv/jetlag/season/14',
'info_dict': {
'id': 'jetlag_14',
'title': 'Snake',
'description': 'md5:6da9040f1c2ac559579738bfb6919d1e',
},
'playlist_count': 8,
}, {
'url': 'https://nebula.tv/jetlag/season/13-5',
'info_dict': {
'id': 'jetlag_13-5',
'title': 'Hide + Seek Across NYC',
'description': 'md5:5b87bb9acc6dcdff289bb4c71a2ad59f',
},
'playlist_count': 3,
}]
def _build_url_result(self, item):
url = (
traverse_obj(item, ('share_url', {url_or_none}))
or urljoin('https://nebula.tv/', item.get('app_path'))
or f'https://nebula.tv/videos/{item["slug"]}')
return self.url_result(
smuggle_url(url, {'id': item['id']}),
NebulaIE, url_transparent=True,
**self._extract_video_metadata(item))
def _entries(self, data):
for episode in traverse_obj(data, ('episodes', lambda _, v: v['video']['id'], 'video')):
yield self._build_url_result(episode)
for extra in traverse_obj(data, ('extras', ..., 'items', lambda _, v: v['id'])):
yield self._build_url_result(extra)
for trailer in traverse_obj(data, ('trailers', lambda _, v: v['id'])):
yield self._build_url_result(trailer)
def _real_extract(self, url):
series, season_id = self._match_valid_url(url).group('series', 'season_number')
playlist_id = f'{series}_{season_id}'
data = self._call_api(
f'https://content.api.nebula.app/content/{series}/season/{season_id}', playlist_id)
return self.playlist_result(
self._entries(data), playlist_id,
**traverse_obj(data, {
'title': ('title', {str}),
'description': ('description', {str}),
}))
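# How the smuggled id above travels: smuggle_url tucks a JSON blob into the URL
# fragment and unsmuggle_url recovers it on the other side (values made up).
from yt_dlp.utils import smuggle_url, unsmuggle_url

url = smuggle_url('https://nebula.tv/videos/jetlag-example', {'id': 'video_episode_123'})
clean_url, data = unsmuggle_url(url, default={})
print(clean_url)  # -> 'https://nebula.tv/videos/jetlag-example'
print(data)       # -> {'id': 'video_episode_123'}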

View File

@@ -0,0 +1,79 @@
from .brightcove import BrightcoveNewIE
from .common import InfoExtractor
from ..utils import parse_iso8601
from ..utils.traversal import require, traverse_obj
class NetAppBaseIE(InfoExtractor):
_BC_URL = 'https://players.brightcove.net/6255154784001/default_default/index.html?videoId={}'
@staticmethod
def _parse_metadata(item):
return traverse_obj(item, {
'title': ('name', {str}),
'description': ('description', {str}),
'timestamp': ('createdAt', {parse_iso8601}),
})
class NetAppVideoIE(NetAppBaseIE):
_VALID_URL = r'https?://media\.netapp\.com/video-detail/(?P<id>[0-9a-f-]+)'
_TESTS = [{
'url': 'https://media.netapp.com/video-detail/da25fc01-82ad-5284-95bc-26920200a222/seamless-storage-for-modern-kubernetes-deployments',
'info_dict': {
'id': '1843620950167202073',
'ext': 'mp4',
'title': 'Seamless storage for modern Kubernetes deployments',
'description': 'md5:1ee39e315243fe71fb90af2796037248',
'uploader_id': '6255154784001',
'duration': 2159.41,
'thumbnail': r're:https://house-fastly-signed-us-east-1-prod\.brightcovecdn\.com/image/.*\.jpg',
'tags': 'count:15',
'timestamp': 1758213949,
'upload_date': '20250918',
},
}, {
'url': 'https://media.netapp.com/video-detail/45593e5d-cf1c-5996-978c-c9081906e69f/unleash-ai-innovation-with-your-data-with-the-netapp-platform',
'only_matching': True,
}]
def _real_extract(self, url):
video_uuid = self._match_id(url)
metadata = self._download_json(
f'https://api.media.netapp.com/client/detail/{video_uuid}', video_uuid)
brightcove_video_id = traverse_obj(metadata, (
'sections', lambda _, v: v['type'] == 'Player', 'video', {str}, any, {require('brightcove video id')}))
video_item = traverse_obj(metadata, ('sections', lambda _, v: v['type'] == 'VideoDetail', any))
return self.url_result(
self._BC_URL.format(brightcove_video_id), BrightcoveNewIE, brightcove_video_id,
url_transparent=True, **self._parse_metadata(video_item))
class NetAppCollectionIE(NetAppBaseIE):
_VALID_URL = r'https?://media\.netapp\.com/collection/(?P<id>[0-9a-f-]+)'
_TESTS = [{
'url': 'https://media.netapp.com/collection/9820e190-f2a6-47ac-9c0a-98e5e64234a4',
'info_dict': {
'title': 'Featured sessions',
'id': '9820e190-f2a6-47ac-9c0a-98e5e64234a4',
},
'playlist_count': 4,
}]
def _entries(self, metadata):
for item in traverse_obj(metadata, ('items', lambda _, v: v['brightcoveVideoId'])):
brightcove_video_id = item['brightcoveVideoId']
yield self.url_result(
self._BC_URL.format(brightcove_video_id), BrightcoveNewIE, brightcove_video_id,
url_transparent=True, **self._parse_metadata(item))
def _real_extract(self, url):
collection_uuid = self._match_id(url)
metadata = self._download_json(
f'https://api.media.netapp.com/client/collection/{collection_uuid}', collection_uuid)
return self.playlist_result(self._entries(metadata), collection_uuid, playlist_title=metadata.get('name'))
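# The Player-section lookup in NetAppVideoIE relies on traversal's require() to
# turn a missing value into a loud failure instead of a silent None. Sketch
# with an invented payload:
from yt_dlp.utils.traversal import require, traverse_obj

data = {'sections': [{'type': 'Player', 'video': '1843620950167202073'}]}
print(traverse_obj(data, (
    'sections', lambda _, v: v['type'] == 'Player', 'video', {str}, any,
    {require('brightcove video id')})))  # -> '1843620950167202073'
# Without a Player section, the same call raises an ExtractorError naming the
# missing 'brightcove video id'.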

View File

@@ -156,18 +156,36 @@ class NetEaseMusicIE(NetEaseMusicBaseIE):
'id': '17241424',
'ext': 'mp3',
'title': 'Opus 28',
'upload_date': '20080211',
'timestamp': 1202745600,
'upload_date': '20060912',
'timestamp': 1158076800,
'duration': 263,
'thumbnail': r're:^http.*\.jpg',
'album': 'Piano Solos Vol. 2',
'album': 'Piano Solos, Vol. 2',
'album_artist': 'Dustin O\'Halloran',
'average_rating': int,
'description': '[00:05.00]纯音乐,请欣赏\n',
'description': 'md5:b566b92c55ca348df65d206c5d689576',
'album_artists': ['Dustin O\'Halloran'],
'creators': ['Dustin O\'Halloran'],
'subtitles': {'lyrics': [{'ext': 'lrc'}]},
},
}, {
'url': 'https://music.163.com/#/song?id=2755669231',
'info_dict': {
'id': '2755669231',
'ext': 'mp3',
'title': '十二月-Departure',
'upload_date': '20251111',
'timestamp': 1762876800,
'duration': 188,
'thumbnail': r're:^http.*\.jpg',
'album': '',
'album_artist': 'ひとひら',
'average_rating': int,
'description': 'md5:deee249c8c9c3e2c54ecdab36e87d174',
'album_artists': ['ひとひら'],
'creators': ['ひとひら'],
'subtitles': {'lyrics': [{'ext': 'lrc', 'data': 'md5:d32b4425a5d6c9fa249ca6e803dd0401'}]},
},
}, {
'url': 'https://y.music.163.com/m/song?app_version=8.8.45&id=95670&uct2=sKnvS4+0YStsWkqsPhFijw%3D%3D&dlt=0846',
'md5': 'b896be78d8d34bd7bb665b26710913ff',
@@ -241,9 +259,16 @@ class NetEaseMusicIE(NetEaseMusicBaseIE):
'lyrics': [{'data': original, 'ext': 'lrc'}],
}
lyrics_expr = r'(\[[0-9]{2}:[0-9]{2}\.[0-9]{2,}\])([^\n]+)'
original_ts_texts = re.findall(lyrics_expr, original)
translation_ts_dict = dict(re.findall(lyrics_expr, translated))
def collect_lyrics(lrc):
lyrics_expr = r'\[([0-9]{2}):([0-9]{2})[:.]([0-9]{2,})\]([^\n]+)'
matches = re.findall(lyrics_expr, lrc)
return (
(f'[{minute}:{sec}.{msec}]', text)
for minute, sec, msec, text in matches
)
original_ts_texts = collect_lyrics(original)
translation_ts_dict = dict(collect_lyrics(translated))
merged = '\n'.join(
join_nonempty(f'{timestamp}{text}', translation_ts_dict.get(timestamp, ''), delim=' / ')
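# A self-contained sketch of the merge above on a tiny made-up LRC pair. The
# normalized timestamp is the join key, and [mm:ss:xx] timestamps (colon before
# the centiseconds) are rewritten to [mm:ss.xx] so both variants line up:
import re
from yt_dlp.utils import join_nonempty

def collect_lyrics(lrc):
    return ((f'[{m}:{s}.{ms}]', text) for m, s, ms, text in re.findall(
        r'\[([0-9]{2}):([0-9]{2})[:.]([0-9]{2,})\]([^\n]+)', lrc))

original = '[00:05.00]十二月\n[00:10:00]Departure'
translated = '[00:05.00]December'
translation_ts_dict = dict(collect_lyrics(translated))
print('\n'.join(
    join_nonempty(f'{ts}{text}', translation_ts_dict.get(ts, ''), delim=' / ')
    for ts, text in collect_lyrics(original)))
# [00:05.00]十二月 / December
# [00:10.00]Departure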
@@ -528,7 +553,7 @@ class NetEaseMusicMvIE(NetEaseMusicBaseIE):
class NetEaseMusicProgramIE(NetEaseMusicBaseIE):
IE_NAME = 'netease:program'
IE_DESC = '网易云音乐 - 电台节目'
_VALID_URL = r'https?://music\.163\.com/(?:#/)?program\?id=(?P<id>[0-9]+)'
_VALID_URL = r'https?://music\.163\.com/(?:#/)?(?:dj|program)\?id=(?P<id>[0-9]+)'
_TESTS = [{
'url': 'http://music.163.com/#/program?id=10109055',
'info_dict': {
@@ -572,6 +597,9 @@ class NetEaseMusicProgramIE(NetEaseMusicBaseIE):
'params': {
'noplaylist': True,
},
}, {
'url': 'https://music.163.com/#/dj?id=3706179315',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -2,84 +2,59 @@ from .common import InfoExtractor
from ..utils import (
clean_html,
int_or_none,
js_to_json,
parse_iso8601,
url_or_none,
urljoin,
)
from ..utils.traversal import traverse_obj
class NetzkinoIE(InfoExtractor):
_WORKING = False
_VALID_URL = r'https?://(?:www\.)?netzkino\.de/\#!/[^/]+/(?P<id>[^/]+)'
_GEO_COUNTRIES = ['DE']
_VALID_URL = r'https?://(?:www\.)?netzkino\.de/details/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.netzkino.de/#!/scifikino/rakete-zum-mond',
'md5': '92a3f8b76f8d7220acce5377ea5d4873',
'url': 'https://www.netzkino.de/details/snow-beast',
'md5': '1a4c90fe40d3ccabce163287e45e56dd',
'info_dict': {
'id': 'rakete-zum-mond',
'id': 'snow-beast',
'ext': 'mp4',
'title': 'Rakete zum Mond \u2013 Jules Verne',
'description': 'md5:f0a8024479618ddbfa450ff48ffa6c60',
'upload_date': '20120813',
'thumbnail': r're:https?://.*\.jpg$',
'timestamp': 1344858571,
'title': 'Snow Beast',
'age_limit': 12,
},
'params': {
'skip_download': 'Download only works from Germany',
},
}, {
'url': 'https://www.netzkino.de/#!/filme/dr-jekyll-mrs-hyde-2',
'md5': 'c7728b2dadd04ff6727814847a51ef03',
'info_dict': {
'id': 'dr-jekyll-mrs-hyde-2',
'ext': 'mp4',
'title': 'Dr. Jekyll & Mrs. Hyde 2',
'description': 'md5:c2e9626ebd02de0a794b95407045d186',
'upload_date': '20190130',
'thumbnail': r're:https?://.*\.jpg$',
'timestamp': 1548849437,
'age_limit': 18,
},
'params': {
'skip_download': 'Download only works from Germany',
'alt_title': 'Snow Beast',
'cast': 'count:3',
'categories': 'count:7',
'creators': 'count:2',
'description': 'md5:e604a954a7f827a80e96a3a97d48b269',
'location': 'US',
'release_year': 2011,
'thumbnail': r're:https?://.+\.jpg',
},
}]
def _real_extract(self, url):
mobj = self._match_valid_url(url)
video_id = mobj.group('id')
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
next_js_data = self._search_nextjs_data(webpage, video_id)
api_url = f'https://api.netzkino.de.simplecache.net/capi-2.0a/movies/{video_id}.json?d=www'
info = self._download_json(api_url, video_id)
custom_fields = info['custom_fields']
production_js = self._download_webpage(
'http://www.netzkino.de/beta/dist/production.min.js', video_id,
note='Downloading player code')
avo_js = self._search_regex(
r'var urlTemplate=(\{.*?"\})',
production_js, 'URL templates')
templates = self._parse_json(
avo_js, video_id, transform_source=js_to_json)
suffix = {
'hds': '.mp4/manifest.f4m',
'hls': '.mp4/master.m3u8',
'pmd': '.mp4',
}
film_fn = custom_fields['Streaming'][0]
formats = [{
'format_id': key,
'ext': 'mp4',
'url': tpl.replace('{}', film_fn) + suffix[key],
} for key, tpl in templates.items()]
query = traverse_obj(next_js_data, (
'props', '__dehydratedState', 'queries', ..., 'state',
'data', 'data', lambda _, v: v['__typename'] == 'CmsMovie', any))
if 'DRM' in traverse_obj(query, ('licenses', 'nodes', ..., 'properties', {str})):
self.report_drm(video_id)
return {
'id': video_id,
'formats': formats,
'title': info['title'],
'age_limit': int_or_none(custom_fields.get('FSK')[0]),
'timestamp': parse_iso8601(info.get('date'), delimiter=' '),
'description': clean_html(info.get('content')),
'thumbnail': info.get('thumbnail'),
**traverse_obj(query, {
'title': ('originalTitle', {clean_html}),
'age_limit': ('fskRating', {int_or_none}),
'alt_title': ('originalTitle', {clean_html}, filter),
'cast': ('cast', 'nodes', ..., 'person', 'name', {clean_html}, filter),
'creators': (('directors', 'writers'), 'nodes', ..., 'person', 'name', {clean_html}, filter),
'categories': ('categories', 'nodes', ..., 'category', 'title', {clean_html}, filter),
'description': ('longSynopsis', {clean_html}, filter),
'duration': ('runtimeInSeconds', {int_or_none}),
'location': ('productionCountry', {clean_html}, filter),
'release_year': ('productionYear', {int_or_none}),
'thumbnail': ('coverImage', 'masterUrl', {url_or_none}),
'url': ('videoSource', 'pmdUrl', {urljoin('https://pmd.netzkino-seite.netzkino.de/')}),
}),
}
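# The DRM gate above works because traverse_obj flattens every matching license
# property into one list, so a plain membership test suffices. With an invented
# payload:
from yt_dlp.utils.traversal import traverse_obj

query = {'licenses': {'nodes': [{'properties': 'DRM'}, {'properties': 'GEO_DE'}]}}
print(traverse_obj(query, ('licenses', 'nodes', ..., 'properties', {str})))
# -> ['DRM', 'GEO_DE'], and 'DRM' in that list triggers report_drm()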

View File

@@ -1,238 +0,0 @@
import urllib.parse
from .common import InfoExtractor
from ..utils import (
clean_html,
get_element_by_class,
int_or_none,
parse_iso8601,
remove_start,
unified_timestamp,
)
class NextMediaIE(InfoExtractor):
IE_DESC = '蘋果日報'
_VALID_URL = r'https?://hk\.apple\.nextmedia\.com/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)'
_TESTS = [{
'url': 'http://hk.apple.nextmedia.com/realtime/news/20141108/53109199',
'md5': 'dff9fad7009311c421176d1ac90bfe4f',
'info_dict': {
'id': '53109199',
'ext': 'mp4',
'title': '【佔領金鐘】50外國領事議員撐場 讚學生勇敢香港有希望',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:28222b9912b6665a21011b034c70fcc7',
'timestamp': 1415456273,
'upload_date': '20141108',
},
}]
_URL_PATTERN = r'\{ url: \'(.+)\' \}'
def _real_extract(self, url):
news_id = self._match_id(url)
page = self._download_webpage(url, news_id)
return self._extract_from_nextmedia_page(news_id, url, page)
def _extract_from_nextmedia_page(self, news_id, url, page):
redirection_url = self._search_regex(
r'window\.location\.href\s*=\s*([\'"])(?P<url>(?!\1).+)\1',
page, 'redirection URL', default=None, group='url')
if redirection_url:
return self.url_result(urllib.parse.urljoin(url, redirection_url))
title = self._fetch_title(page)
video_url = self._search_regex(self._URL_PATTERN, page, 'video url')
attrs = {
'id': news_id,
'title': title,
'url': video_url, # ext can be inferred from url
'thumbnail': self._fetch_thumbnail(page),
'description': self._fetch_description(page),
}
timestamp = self._fetch_timestamp(page)
if timestamp:
attrs['timestamp'] = timestamp
else:
attrs['upload_date'] = self._fetch_upload_date(url)
return attrs
def _fetch_title(self, page):
return self._og_search_title(page)
def _fetch_thumbnail(self, page):
return self._og_search_thumbnail(page)
def _fetch_timestamp(self, page):
date_created = self._search_regex('"dateCreated":"([^"]+)"', page, 'created time')
return parse_iso8601(date_created)
def _fetch_upload_date(self, url):
return self._search_regex(self._VALID_URL, url, 'upload date', group='date')
def _fetch_description(self, page):
return self._og_search_property('description', page)
class NextMediaActionNewsIE(NextMediaIE): # XXX: Do not subclass from concrete IE
IE_DESC = '蘋果日報 - 動新聞'
_VALID_URL = r'https?://hk\.dv\.nextmedia\.com/actionnews/[^/]+/(?P<date>\d+)/(?P<id>\d+)/\d+'
_TESTS = [{
'url': 'http://hk.dv.nextmedia.com/actionnews/hit/20150121/19009428/20061460',
'md5': '05fce8ffeed7a5e00665d4b7cf0f9201',
'info_dict': {
'id': '19009428',
'ext': 'mp4',
'title': '【壹週刊】細10年男友偷食 50歲邵美琪再失戀',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:cd802fad1f40fd9ea178c1e2af02d659',
'timestamp': 1421791200,
'upload_date': '20150120',
},
}]
def _real_extract(self, url):
news_id = self._match_id(url)
actionnews_page = self._download_webpage(url, news_id)
article_url = self._og_search_url(actionnews_page)
article_page = self._download_webpage(article_url, news_id)
return self._extract_from_nextmedia_page(news_id, url, article_page)
class AppleDailyIE(NextMediaIE): # XXX: Do not subclass from concrete IE
IE_DESC = '臺灣蘋果日報'
_VALID_URL = r'https?://(www|ent)\.appledaily\.com\.tw/[^/]+/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)(/.*)?'
_TESTS = [{
'url': 'http://ent.appledaily.com.tw/enews/article/entertainment/20150128/36354694',
'md5': 'a843ab23d150977cc55ef94f1e2c1e4d',
'info_dict': {
'id': '36354694',
'ext': 'mp4',
'title': '周亭羽走過摩鐵陰霾2男陪吃 九把刀孤寒看醫生',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:2acd430e59956dc47cd7f67cb3c003f4',
'upload_date': '20150128',
},
}, {
'url': 'http://www.appledaily.com.tw/realtimenews/article/strange/20150128/550549/%E4%B8%8D%E6%BB%BF%E8%A2%AB%E8%B8%A9%E8%85%B3%E3%80%80%E5%B1%B1%E6%9D%B1%E5%85%A9%E5%A4%A7%E5%AA%BD%E4%B8%80%E8%B7%AF%E6%89%93%E4%B8%8B%E8%BB%8A',
'md5': '86b4e9132d158279c7883822d94ccc49',
'info_dict': {
'id': '550549',
'ext': 'mp4',
'title': '不滿被踩腳 山東兩大媽一路打下車',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:175b4260c1d7c085993474217e4ab1b4',
'upload_date': '20150128',
},
}, {
'url': 'http://www.appledaily.com.tw/animation/realtimenews/new/20150128/5003671',
'md5': '03df296d95dedc2d5886debbb80cb43f',
'info_dict': {
'id': '5003671',
'ext': 'mp4',
'title': '20正妹熱舞 《刀龍傳說Online》火辣上市',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:23c0aac567dc08c9c16a3161a2c2e3cd',
'upload_date': '20150128',
},
'skip': 'redirect to http://www.appledaily.com.tw/animation/',
}, {
# No thumbnail
'url': 'http://www.appledaily.com.tw/animation/realtimenews/new/20150128/5003673/',
'md5': 'b06182cd386ea7bc6115ec7ff0f72aeb',
'info_dict': {
'id': '5003673',
'ext': 'mp4',
'title': '半夜尿尿 好像會看到___',
'description': 'md5:61d2da7fe117fede148706cdb85ac066',
'upload_date': '20150128',
},
'expected_warnings': [
'video thumbnail',
],
'skip': 'redirect to http://www.appledaily.com.tw/animation/',
}, {
'url': 'http://www.appledaily.com.tw/appledaily/article/supplement/20140417/35770334/',
'md5': 'eaa20e6b9df418c912d7f5dec2ba734d',
'info_dict': {
'id': '35770334',
'ext': 'mp4',
'title': '咖啡占卜測 XU裝熟指數',
'thumbnail': r're:^https?://.*\.jpg$',
'description': 'md5:7b859991a6a4fedbdf3dd3b66545c748',
'upload_date': '20140417',
},
}, {
'url': 'http://www.appledaily.com.tw/actionnews/appledaily/7/20161003/960588/',
'only_matching': True,
}, {
# Redirected from http://ent.appledaily.com.tw/enews/article/entertainment/20150128/36354694
'url': 'http://ent.appledaily.com.tw/section/article/headline/20150128/36354694',
'only_matching': True,
}]
_URL_PATTERN = r'\{url: \'(.+)\'\}'
def _fetch_title(self, page):
return (self._html_search_regex(r'<h1 id="h1">([^<>]+)</h1>', page, 'news title', default=None)
or self._html_search_meta('description', page, 'news title'))
def _fetch_thumbnail(self, page):
return self._html_search_regex(r"setInitialImage\(\'([^']+)'\)", page, 'video thumbnail', fatal=False)
def _fetch_timestamp(self, page):
return None
def _fetch_description(self, page):
return self._html_search_meta('description', page, 'news description')
class NextTVIE(InfoExtractor):
_WORKING = False
_ENABLED = None # XXX: pass through to GenericIE
IE_DESC = '壹電視'
_VALID_URL = r'https?://(?:www\.)?nexttv\.com\.tw/(?:[^/]+/)+(?P<id>\d+)'
_TEST = {
'url': 'http://www.nexttv.com.tw/news/realtime/politics/11779671',
'info_dict': {
'id': '11779671',
'ext': 'mp4',
'title': '「超收稅」近4千億 藍議員籲發消費券',
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1484825400,
'upload_date': '20170119',
'view_count': int,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(
r'<h1[^>]*>([^<]+)</h1>', webpage, 'title')
data = self._hidden_inputs(webpage)
video_url = data['ntt-vod-src-detailview']
date_str = get_element_by_class('date', webpage)
timestamp = unified_timestamp(date_str + '+0800') if date_str else None
view_count = int_or_none(remove_start(
clean_html(get_element_by_class('click', webpage)), '點閱:'))
return {
'id': video_id,
'title': title,
'url': video_url,
'thumbnail': data.get('ntt-vod-img-src'),
'timestamp': timestamp,
'view_count': view_count,
}

View File

@@ -23,96 +23,38 @@ from ..utils import (
class NhkBaseIE(InfoExtractor):
_API_URL_TEMPLATE = 'https://nwapi.nhk.jp/nhkworld/%sod%slist/v7b/%s/%s/%s/all%s.json'
_API_URL_TEMPLATE = 'https://api.nhkworld.jp/showsapi/v1/{lang}/{content_format}_{page_type}/{m_id}{extra_page}'
_BASE_URL_REGEX = r'https?://www3\.nhk\.or\.jp/nhkworld/(?P<lang>[a-z]{2})/'
def _call_api(self, m_id, lang, is_video, is_episode, is_clip):
content_format = 'video' if is_video else 'audio'
content_type = 'clips' if is_clip else 'episodes'
if not is_episode:
extra_page = f'/{content_format}_{content_type}'
page_type = 'programs'
else:
extra_page = ''
page_type = content_type
return self._download_json(
self._API_URL_TEMPLATE % (
'v' if is_video else 'r',
'clip' if is_clip else 'esd',
'episode' if is_episode else 'program',
m_id, lang, '/all' if is_video else ''),
m_id, query={'apikey': 'EJfK8jdS57GqlupFgAfAAwr573q01y6k'})['data']['episodes'] or []
def _get_api_info(self, refresh=True):
if not refresh:
return self.cache.load('nhk', 'api_info')
self.cache.store('nhk', 'api_info', {})
movie_player_js = self._download_webpage(
'https://movie-a.nhk.or.jp/world/player/js/movie-player.js', None,
note='Downloading stream API information')
api_info = {
'url': self._search_regex(
r'prod:[^;]+\bapiUrl:\s*[\'"]([^\'"]+)[\'"]', movie_player_js, None, 'stream API url'),
'token': self._search_regex(
r'prod:[^;]+\btoken:\s*[\'"]([^\'"]+)[\'"]', movie_player_js, None, 'stream API token'),
}
self.cache.store('nhk', 'api_info', api_info)
return api_info
def _extract_stream_info(self, vod_id):
for refresh in (False, True):
api_info = self._get_api_info(refresh)
if not api_info:
continue
api_url = api_info.pop('url')
meta = traverse_obj(
self._download_json(
api_url, vod_id, 'Downloading stream url info', fatal=False, query={
**api_info,
'type': 'json',
'optional_id': vod_id,
'active_flg': 1,
}), ('meta', 0))
stream_url = traverse_obj(
meta, ('movie_url', ('mb_auto', 'auto_sp', 'auto_pc'), {url_or_none}), get_all=False)
if stream_url:
formats, subtitles = self._extract_m3u8_formats_and_subtitles(stream_url, vod_id)
return {
**traverse_obj(meta, {
'duration': ('duration', {int_or_none}),
'timestamp': ('publication_date', {unified_timestamp}),
'release_timestamp': ('insert_date', {unified_timestamp}),
'modified_timestamp': ('update_date', {unified_timestamp}),
}),
'formats': formats,
'subtitles': subtitles,
}
raise ExtractorError('Unable to extract stream url')
self._API_URL_TEMPLATE.format(
lang=lang, content_format=content_format, page_type=page_type,
m_id=m_id, extra_page=extra_page),
join_nonempty(m_id, lang))
def _extract_episode_info(self, url, episode=None):
fetch_episode = episode is None
lang, m_type, episode_id = NhkVodIE._match_valid_url(url).group('lang', 'type', 'id')
is_video = m_type != 'audio'
if is_video:
episode_id = episode_id[:4] + '-' + episode_id[4:]
if fetch_episode:
episode = self._call_api(
episode_id, lang, is_video, True, episode_id[:4] == '9999')[0]
episode_id, lang, is_video, is_episode=True, is_clip=episode_id[:4] == '9999')
def get_clean_field(key):
return clean_html(episode.get(key + '_clean') or episode.get(key))
video_id = join_nonempty('id', 'lang', from_dict=episode)
title = get_clean_field('sub_title')
series = get_clean_field('title')
thumbnails = []
for s, w, h in [('', 640, 360), ('_l', 1280, 720)]:
img_path = episode.get('image' + s)
if not img_path:
continue
thumbnails.append({
'id': f'{h}p',
'height': h,
'width': w,
'url': 'https://www3.nhk.or.jp' + img_path,
})
title = episode.get('title')
series = traverse_obj(episode, (('video_program', 'audio_program'), any, 'title'))
episode_name = title
if series and title:
@@ -125,37 +67,52 @@ class NhkBaseIE(InfoExtractor):
episode_name = None
info = {
'id': episode_id + '-' + lang,
'id': video_id,
'title': title,
'description': get_clean_field('description'),
'thumbnails': thumbnails,
'series': series,
'episode': episode_name,
**traverse_obj(episode, {
'description': ('description', {str}),
'release_timestamp': ('first_broadcasted_at', {unified_timestamp}),
'categories': ('categories', ..., 'name', {str}),
'tags': ('tags', ..., 'name', {str}),
'thumbnails': ('images', lambda _, v: v['url'], {
'url': ('url', {urljoin(url)}),
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
}),
'webpage_url': ('url', {urljoin(url)}),
}),
'extractor_key': NhkVodIE.ie_key(),
'extractor': NhkVodIE.IE_NAME,
}
if is_video:
vod_id = episode['vod_id']
info.update({
**self._extract_stream_info(vod_id),
'id': vod_id,
})
# XXX: We are assuming that 'video' and 'audio' are mutually exclusive
stream_info = traverse_obj(episode, (('video', 'audio'), {dict}, any)) or {}
if not stream_info.get('url'):
self.raise_no_formats('Stream not found; it has most likely expired', expected=True)
else:
if fetch_episode:
stream_url = stream_info['url']
if is_video:
formats, subtitles = self._extract_m3u8_formats_and_subtitles(stream_url, video_id)
info.update({
'formats': formats,
'subtitles': subtitles,
**traverse_obj(stream_info, ({
'duration': ('duration', {int_or_none}),
'timestamp': ('published_at', {unified_timestamp}),
})),
})
else:
# From https://www3.nhk.or.jp/nhkworld/common/player/radio/inline/rod.html
audio_path = remove_end(episode['audio']['audio'], '.m4a')
audio_path = remove_end(stream_url, '.m4a')
info['formats'] = self._extract_m3u8_formats(
f'{urljoin("https://vod-stream.nhk.jp", audio_path)}/index.m3u8',
episode_id, 'm4a', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False)
for f in info['formats']:
f['language'] = lang
else:
info.update({
'_type': 'url_transparent',
'ie_key': NhkVodIE.ie_key(),
'url': url,
})
return info
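# How the radio branch above derives its HLS manifest: strip the .m4a suffix,
# resolve against the VOD host, and append index.m3u8 (the path is made up):
from yt_dlp.utils import remove_end, urljoin

audio_path = remove_end('/nhkworld/audio/livinginjapan-20240901-1-en.m4a', '.m4a')
print(f'{urljoin("https://vod-stream.nhk.jp", audio_path)}/index.m3u8')
# -> 'https://vod-stream.nhk.jp/nhkworld/audio/livinginjapan-20240901-1-en/index.m3u8'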
@@ -168,29 +125,29 @@ class NhkVodIE(NhkBaseIE):
# Content available only for a limited period of time. Visit
# https://www3.nhk.or.jp/nhkworld/en/ondemand/ for working samples.
_TESTS = [{
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2049126/',
'url': 'https://www3.nhk.or.jp/nhkworld/en/shows/2049165/',
'info_dict': {
'id': 'nw_vod_v_en_2049_126_20230413233000_01_1681398302',
'id': '2049165-en',
'ext': 'mp4',
'title': 'Japan Railway Journal - The Tohoku Shinkansen: Full Speed Ahead',
'description': 'md5:49f7c5b206e03868a2fdf0d0814b92f6',
'title': 'Japan Railway Journal - Choshi Electric Railway: Fighting to Get Back on Track',
'description': 'md5:ab57df2fca7f04245148c2e787bb203d',
'thumbnail': r're:https://.+/.+\.jpg',
'episode': 'The Tohoku Shinkansen: Full Speed Ahead',
'episode': 'Choshi Electric Railway: Fighting to Get Back on Track',
'series': 'Japan Railway Journal',
'modified_timestamp': 1707217907,
'timestamp': 1681428600,
'release_timestamp': 1693883728,
'duration': 1679,
'upload_date': '20230413',
'modified_date': '20240206',
'release_date': '20230905',
'duration': 1680,
'categories': ['Biz & Tech'],
'tags': ['Akita', 'Chiba', 'Trains', 'Transcript', 'All (Japan Navigator)'],
'timestamp': 1759055880,
'upload_date': '20250928',
'release_timestamp': 1758810600,
'release_date': '20250925',
},
}, {
# video clip
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/9999011/',
'md5': '153c3016dfd252ba09726588149cf0e7',
'info_dict': {
'id': 'lpZXIwaDE6_Z-976CPsFdxyICyWUzlT5',
'id': '9999011-en',
'ext': 'mp4',
'title': 'Dining with the Chef - Chef Saito\'s Family recipe: MENCHI-KATSU',
'description': 'md5:5aee4a9f9d81c26281862382103b0ea5',
@@ -198,24 +155,23 @@ class NhkVodIE(NhkBaseIE):
'series': 'Dining with the Chef',
'episode': 'Chef Saito\'s Family recipe: MENCHI-KATSU',
'duration': 148,
'upload_date': '20190816',
'release_date': '20230902',
'release_timestamp': 1693619292,
'modified_timestamp': 1707217907,
'modified_date': '20240206',
'timestamp': 1565997540,
'categories': ['Food'],
'tags': ['Washoku'],
'timestamp': 1548212400,
'upload_date': '20190123',
},
}, {
# radio
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/audio/livinginjapan-20231001-1/',
'url': 'https://www3.nhk.or.jp/nhkworld/en/shows/audio/livinginjapan-20240901-1/',
'info_dict': {
'id': 'livinginjapan-20231001-1-en',
'id': 'livinginjapan-20240901-1-en',
'ext': 'm4a',
'title': 'Living in Japan - Tips for Travelers to Japan / Ramen Vending Machines',
'title': 'Living in Japan - Weekend Hiking / Self-protection from crime',
'series': 'Living in Japan',
'description': 'md5:0a0e2077d8f07a03071e990a6f51bfab',
'description': 'md5:4d0e14ab73bdbfedb60a53b093954ed6',
'thumbnail': r're:https://.+/.+\.jpg',
'episode': 'Tips for Travelers to Japan / Ramen Vending Machines',
'episode': 'Weekend Hiking / Self-protection from crime',
'categories': ['Interactive'],
},
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2015173/',
@@ -256,96 +212,51 @@ class NhkVodIE(NhkBaseIE):
},
'skip': 'expires 2023-10-15',
}, {
# a one-off (single-episode series). title from the api is just '<p></p>'
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/3004952/',
# a one-off (single-episode series). title from the api is just null
'url': 'https://www3.nhk.or.jp/nhkworld/en/shows/3026036/',
'info_dict': {
'id': 'nw_vod_v_en_3004_952_20230723091000_01_1690074552',
'id': '3026036-en',
'ext': 'mp4',
'title': 'Barakan Discovers - AMAMI OSHIMA: Isson\'s Treasure Isla',
'description': 'md5:5db620c46a0698451cc59add8816b797',
'thumbnail': r're:https://.+/.+\.jpg',
'release_date': '20230905',
'timestamp': 1690103400,
'duration': 2939,
'release_timestamp': 1693898699,
'upload_date': '20230723',
'modified_timestamp': 1707217907,
'modified_date': '20240206',
'episode': 'AMAMI OSHIMA: Isson\'s Treasure Isla',
'series': 'Barakan Discovers',
'title': 'STATELESS: The Japanese Left Behind in the Philippines',
'description': 'md5:9a2fd51cdfa9f52baae28569e0053786',
'duration': 2955,
'thumbnail': 'https://www3.nhk.or.jp/nhkworld/en/shows/3026036/images/wide_l_QPtWpt4lzVhm3NzPAMIIF35MCg4CdNwcikPaTS5Q.jpg',
'categories': ['Documentary', 'Culture & Lifestyle'],
'tags': ['Transcript', 'Documentary 360', 'The Pursuit of PEACE'],
'timestamp': 1758931800,
'upload_date': '20250927',
'release_timestamp': 1758931800,
'release_date': '20250927',
},
}, {
# /ondemand/video/ url with alphabetical character in 5th position of id
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/9999a07/',
'info_dict': {
'id': 'nw_c_en_9999-a07',
'id': '9999a07-en',
'ext': 'mp4',
'episode': 'Mini-Dramas on SDGs: Ep 1 Close the Gender Gap [Director\'s Cut]',
'series': 'Mini-Dramas on SDGs',
'modified_date': '20240206',
'title': 'Mini-Dramas on SDGs - Mini-Dramas on SDGs: Ep 1 Close the Gender Gap [Director\'s Cut]',
'description': 'md5:3f9dcb4db22fceb675d90448a040d3f6',
'timestamp': 1621962360,
'duration': 189,
'release_date': '20230903',
'modified_timestamp': 1707217907,
'timestamp': 1621911600,
'duration': 190,
'upload_date': '20210525',
'thumbnail': r're:https://.+/.+\.jpg',
'release_timestamp': 1693713487,
'categories': ['Current Affairs', 'Entertainment'],
},
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/9999d17/',
'info_dict': {
'id': 'nw_c_en_9999-d17',
'id': '9999d17-en',
'ext': 'mp4',
'title': 'Flowers of snow blossom - The 72 Pentads of Yamato',
'description': 'Todays focus: Snow',
'release_timestamp': 1693792402,
'release_date': '20230904',
'upload_date': '20220128',
'timestamp': 1643370960,
'thumbnail': r're:https://.+/.+\.jpg',
'duration': 136,
'series': '',
'modified_date': '20240206',
'modified_timestamp': 1707217907,
},
}, {
# new /shows/ url format
'url': 'https://www3.nhk.or.jp/nhkworld/en/shows/2032307/',
'info_dict': {
'id': 'nw_vod_v_en_2032_307_20240321113000_01_1710990282',
'ext': 'mp4',
'title': 'Japanology Plus - 20th Anniversary Special Part 1',
'description': 'md5:817d41fc8e54339ad2a916161ea24faf',
'episode': '20th Anniversary Special Part 1',
'series': 'Japanology Plus',
'thumbnail': r're:https://.+/.+\.jpg',
'duration': 1680,
'timestamp': 1711020600,
'upload_date': '20240321',
'release_timestamp': 1711022683,
'release_date': '20240321',
'modified_timestamp': 1711031012,
'modified_date': '20240321',
},
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/en/shows/3020025/',
'info_dict': {
'id': 'nw_vod_v_en_3020_025_20230325144000_01_1679723944',
'ext': 'mp4',
'title': '100 Ideas to Save the World - Working Styles Evolve',
'description': 'md5:9e6c7778eaaf4f7b4af83569649f84d9',
'episode': 'Working Styles Evolve',
'series': '100 Ideas to Save the World',
'thumbnail': r're:https://.+/.+\.jpg',
'duration': 899,
'upload_date': '20230325',
'timestamp': 1679755200,
'release_date': '20230905',
'release_timestamp': 1693880540,
'modified_date': '20240206',
'modified_timestamp': 1707217907,
'categories': ['Culture & Lifestyle', 'Science & Nature'],
'tags': ['Nara', 'Temples & Shrines', 'Winter', 'Snow'],
'timestamp': 1643339040,
'upload_date': '20220128',
},
}, {
# new /shows/audio/ url format
@@ -373,6 +284,7 @@ class NhkVodProgramIE(NhkBaseIE):
'id': 'sumo',
'title': 'GRAND SUMO Highlights',
'description': 'md5:fc20d02dc6ce85e4b72e0273aa52fdbf',
'series': 'GRAND SUMO Highlights',
},
'playlist_mincount': 1,
}, {
@@ -381,6 +293,7 @@ class NhkVodProgramIE(NhkBaseIE):
'id': 'japanrailway',
'title': 'Japan Railway Journal',
'description': 'md5:ea39d93af7d05835baadf10d1aae0e3f',
'series': 'Japan Railway Journal',
},
'playlist_mincount': 12,
}, {
@@ -390,6 +303,7 @@ class NhkVodProgramIE(NhkBaseIE):
'id': 'japanrailway',
'title': 'Japan Railway Journal',
'description': 'md5:ea39d93af7d05835baadf10d1aae0e3f',
'series': 'Japan Railway Journal',
},
'playlist_mincount': 12,
}, {
@@ -399,17 +313,9 @@ class NhkVodProgramIE(NhkBaseIE):
'id': 'livinginjapan',
'title': 'Living in Japan',
'description': 'md5:665bb36ec2a12c5a7f598ee713fc2b54',
'series': 'Living in Japan',
},
'playlist_mincount': 12,
}, {
# /tv/ program url
'url': 'https://www3.nhk.or.jp/nhkworld/en/tv/designtalksplus/',
'info_dict': {
'id': 'designtalksplus',
'title': 'DESIGN TALKS plus',
'description': 'md5:47b3b3a9f10d4ac7b33b53b70a7d2837',
},
'playlist_mincount': 20,
'playlist_mincount': 11,
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/en/shows/10yearshayaomiyazaki/',
'only_matching': True,
@@ -430,9 +336,8 @@ class NhkVodProgramIE(NhkBaseIE):
program_id, lang, m_type != 'audio', False, episode_type == 'clip')
def entries():
for episode in episodes:
if episode_path := episode.get('url'):
yield self._extract_episode_info(urljoin(url, episode_path), episode)
for episode in traverse_obj(episodes, ('items', lambda _, v: v['url'])):
yield self._extract_episode_info(urljoin(url, episode['url']), episode)
html = self._download_webpage(url, program_id)
program_title = self._extract_meta_from_class_elements([
@@ -446,7 +351,7 @@ class NhkVodProgramIE(NhkBaseIE):
'tAudioProgramMain__info', # /shows/audio/programs/
'p-program-description'], html) # /tv/
return self.playlist_result(entries(), program_id, program_title, program_description)
return self.playlist_result(entries(), program_id, program_title, program_description, series=program_title)
class NhkForSchoolBangumiIE(InfoExtractor):

View File

@@ -0,0 +1,83 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
UserNotLive,
filter_dict,
int_or_none,
join_nonempty,
parse_iso8601,
url_or_none,
urlencode_postdata,
)
from ..utils.traversal import traverse_obj
class PandaTvIE(InfoExtractor):
IE_DESC = 'pandalive.co.kr (팬더티비)'
_VALID_URL = r'https?://(?:www\.|m\.)?pandalive\.co\.kr/play/(?P<id>\w+)'
_TESTS = [{
'url': 'https://www.pandalive.co.kr/play/bebenim',
'info_dict': {
'id': 'bebenim',
'ext': 'mp4',
'channel': '릴리ෆ',
'title': r're:앙앙❤ \d{4}-\d{2}-\d{2} \d{2}:\d{2}',
'thumbnail': r're:https://cdn\.pandalive\.co\.kr/ivs/v1/.+/thumb\.jpg',
'concurrent_view_count': int,
'like_count': int,
'live_status': 'is_live',
'upload_date': str,
},
'skip': 'The channel is not currently live',
}]
def _real_extract(self, url):
channel_id = self._match_id(url)
video_meta = self._download_json(
'https://api.pandalive.co.kr/v1/live/play', channel_id,
'Downloading video metadata', 'Unable to download video metadata',
data=urlencode_postdata(filter_dict({
'action': 'watch',
'userId': channel_id,
'password': self.get_param('videopassword'),
})), expected_status=400)
if error_code := traverse_obj(video_meta, ('errorData', 'code', {str})):
if error_code == 'castEnd':
raise UserNotLive(video_id=channel_id)
elif error_code == 'needAdult':
self.raise_login_required('Adult verification is required for this stream')
elif error_code == 'needLogin':
self.raise_login_required('Login is required for this stream')
elif error_code == 'needCoinPurchase':
raise ExtractorError('Coin purchase is required for this stream', expected=True)
elif error_code == 'needUnlimitItem':
raise ExtractorError('Ticket purchase is required for this stream', expected=True)
elif error_code == 'needPw':
raise ExtractorError('Password protected video, use --video-password <password>', expected=True)
elif error_code == 'wrongPw':
raise ExtractorError('Wrong password', expected=True)
else:
error_msg = video_meta.get('message')
raise ExtractorError(join_nonempty(
'API returned error code', error_code,
error_msg and 'with error message:', error_msg,
delim=' '))
http_headers = {'Origin': 'https://www.pandalive.co.kr'}
return {
'id': channel_id,
'is_live': True,
'formats': self._extract_m3u8_formats(
video_meta['PlayList']['hls'][0]['url'], channel_id, 'mp4', headers=http_headers, live=True),
'http_headers': http_headers,
**traverse_obj(video_meta, ('media', {
'title': ('title', {str}),
'release_timestamp': ('startTime', {parse_iso8601(delim=' ')}),
'thumbnail': ('ivsThumbnail', {url_or_none}),
'channel': ('userNick', {str}),
'concurrent_view_count': ('user', {int_or_none}),
'like_count': ('likeCnt', {int_or_none}),
})),
}
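# parse_iso8601 also supports partial application; delim=' ' accepts PandaTV's
# space-separated timestamps (date and time split by ' ' rather than 'T'):
from yt_dlp.utils import parse_iso8601

print(parse_iso8601('2025-01-29 17:00:19', delim=' '))  # -> 1738170019 (UTC assumed)
print(parse_iso8601(None))                              # -> None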

View File

@@ -6,7 +6,10 @@ from ..utils.traversal import traverse_obj
class PartiBaseIE(InfoExtractor):
def _call_api(self, path, video_id, note=None):
return self._download_json(
f'https://api-backend.parti.com/parti_v2/profile/{path}', video_id, note)
f'https://prod-api.parti.com/parti_v2/profile/{path}', video_id, note, headers={
'Origin': 'https://parti.com',
'Referer': 'https://parti.com/',
})
class PartiVideoIE(PartiBaseIE):
@@ -20,7 +23,7 @@ class PartiVideoIE(PartiBaseIE):
'title': 'NOW LIVE ',
'upload_date': '20250327',
'categories': ['Gaming'],
'thumbnail': 'https://assets.parti.com/351424_eb9e5250-2821-484a-9c5f-ca99aa666c87.png',
'thumbnail': 'https://media.parti.com/351424_eb9e5250-2821-484a-9c5f-ca99aa666c87.png',
'channel': 'ItZTMGG',
'timestamp': 1743044379,
},
@@ -34,7 +37,7 @@ class PartiVideoIE(PartiBaseIE):
return {
'id': video_id,
'formats': self._extract_m3u8_formats(
urljoin('https://watch.parti.com', data['livestream_recording']), video_id, 'mp4'),
urljoin('https://media.parti.com/', data['livestream_recording']), video_id, 'mp4'),
**traverse_obj(data, {
'title': ('event_title', {str}),
'channel': ('user_name', {str}),
@@ -47,32 +50,27 @@ class PartiVideoIE(PartiBaseIE):
class PartiLivestreamIE(PartiBaseIE):
IE_NAME = 'parti:livestream'
_VALID_URL = r'https?://(?:www\.)?parti\.com/creator/(?P<service>[\w]+)/(?P<id>[\w/-]+)'
_VALID_URL = r'https?://(?:www\.)?parti\.com/(?!video/)(?P<id>[\w/-]+)'
_TESTS = [{
'url': 'https://parti.com/creator/parti/Capt_Robs_Adventures',
'url': 'https://parti.com/247CryptoTracker',
'info_dict': {
'id': 'Capt_Robs_Adventures',
'ext': 'mp4',
'id': '247CryptoTracker',
'description': 'md5:a78051f3d7e66e6a64c6b1eaf59fd364',
'title': r"re:I'm Live on Parti \d{4}-\d{2}-\d{2} \d{2}:\d{2}",
'view_count': int,
'thumbnail': r're:https://assets\.parti\.com/.+\.png',
'timestamp': 1743879776,
'upload_date': '20250405',
'thumbnail': r're:https://media\.parti\.com/stream-screenshots/.+\.png',
'live_status': 'is_live',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://parti.com/creator/discord/sazboxgaming/0',
'only_matching': True,
}]
def _real_extract(self, url):
service, creator_slug = self._match_valid_url(url).group('service', 'id')
creator_slug = self._match_id(url)
encoded_creator_slug = creator_slug.replace('/', '%23')
creator_id = self._call_api(
f'get_user_by_social_media/{service}/{encoded_creator_slug}',
creator_slug, note='Fetching user ID')
f'user_id_from_name/{encoded_creator_slug}',
creator_slug, note='Fetching user ID')['user_id']
data = self._call_api(
f'get_livestream_channel_info/{creator_id}', creator_id,
@@ -85,11 +83,7 @@ class PartiLivestreamIE(PartiBaseIE):
return {
'id': creator_slug,
'formats': self._extract_m3u8_formats(
channel_info['playback_url'], creator_slug, live=True, query={
'token': channel_info['playback_auth_token'],
'player_version': '1.17.0',
}),
'formats': self._extract_m3u8_formats(channel_info['playback_url'], creator_slug, live=True),
'is_live': True,
**traverse_obj(data, {
'title': ('livestream_event_info', 'event_name', {str}),

View File

@@ -1,6 +1,5 @@
import functools
import itertools
import urllib.parse
from .common import InfoExtractor
from .sproutvideo import VidsIoIE
@@ -11,15 +10,23 @@ from ..utils import (
ExtractorError,
clean_html,
determine_ext,
extract_attributes,
float_or_none,
int_or_none,
mimetype2ext,
parse_iso8601,
smuggle_url,
str_or_none,
update_url_query,
url_or_none,
urljoin,
)
from ..utils.traversal import require, traverse_obj, value
from ..utils.traversal import (
find_elements,
require,
traverse_obj,
value,
)
class PatreonBaseIE(InfoExtractor):
@@ -121,6 +128,7 @@ class PatreonIE(PatreonBaseIE):
'channel_is_verified': True,
'chapters': 'count:4',
'timestamp': 1423689666,
'media_type': 'video',
},
'params': {
'noplaylist': True,
@@ -161,7 +169,7 @@ class PatreonIE(PatreonBaseIE):
'uploader_url': 'https://www.patreon.com/loish',
'description': 'md5:e2693e97ee299c8ece47ffdb67e7d9d2',
'title': 'VIDEO // sketchbook flipthrough',
'uploader': 'Loish ',
'uploader': 'Loish',
'tags': ['sketchbook', 'video'],
'channel_id': '1641751',
'channel_url': 'https://www.patreon.com/loish',
@@ -274,8 +282,73 @@ class PatreonIE(PatreonBaseIE):
'channel_id': '9346307',
},
'params': {'getcomments': True},
}, {
# Inlined media in post; uses _extract_from_media_api
'url': 'https://www.patreon.com/posts/scottfalco-146966245',
'info_dict': {
'id': '146966245',
'ext': 'mp4',
'title': 'scottfalco 1080',
'description': 'md5:a3f29bbd0a46b4821ec3400957c98aa2',
'uploader': 'Insanimate',
'uploader_id': '2828146',
'uploader_url': 'https://www.patreon.com/Insanimate',
'channel_id': '6260877',
'channel_url': 'https://www.patreon.com/Insanimate',
'channel_follower_count': int,
'comment_count': int,
'like_count': int,
'duration': 7.833333,
'timestamp': 1767061800,
'upload_date': '20251230',
},
}]
_RETURN_TYPE = 'video'
_HTTP_HEADERS = {
# Must be all-lowercase 'referer' so we can smuggle it to Generic, SproutVideo, and Vimeo.
# patreon.com URLs redirect to www.patreon.com; this matters when requesting mux.com m3u8s
'referer': 'https://www.patreon.com/',
}
def _extract_from_media_api(self, media_id):
attributes = traverse_obj(
self._call_api(f'media/{media_id}', media_id, fatal=False),
('data', 'attributes', {dict}))
if not attributes:
return None
info_dict = traverse_obj(attributes, {
'title': ('file_name', {lambda x: x.rpartition('.')[0]}),
'timestamp': ('created_at', {parse_iso8601}),
'duration': ('display', 'duration', {float_or_none}),
})
info_dict['id'] = media_id
playback_url = traverse_obj(
attributes, ('display', (None, 'viewer_playback_data'), 'url', {url_or_none}, any))
download_url = traverse_obj(attributes, ('download_url', {url_or_none}))
if playback_url and mimetype2ext(attributes.get('mimetype')) == 'm3u8':
info_dict['formats'], info_dict['subtitles'] = self._extract_m3u8_formats_and_subtitles(
playback_url, media_id, 'mp4', fatal=False, headers=self._HTTP_HEADERS)
for f in info_dict['formats']:
f['http_headers'] = self._HTTP_HEADERS
if transcript_url := traverse_obj(attributes, ('display', 'transcript_url', {url_or_none})):
info_dict['subtitles'].setdefault('en', []).append({
'url': transcript_url,
'ext': 'vtt',
})
elif playback_url or download_url:
info_dict['formats'] = [{
# If playback_url is available, download_url is a duplicate lower resolution format
'url': playback_url or download_url,
'vcodec': 'none' if attributes.get('media_type') != 'video' else None,
}]
if not info_dict.get('formats'):
return None
return info_dict
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -299,6 +372,7 @@ class PatreonIE(PatreonBaseIE):
'comment_count': ('comment_count', {int_or_none}),
})
seen_media_ids = set()
entries = []
idx = 0
for include in traverse_obj(post, ('included', lambda _, v: v['type'])):
@@ -320,6 +394,8 @@ class PatreonIE(PatreonBaseIE):
'url': download_url,
'alt_title': traverse_obj(media_attributes, ('file_name', {str})),
})
if media_id := traverse_obj(include, ('id', {str})):
seen_media_ids.add(media_id)
elif include_type == 'user':
info.update(traverse_obj(include, {
@@ -340,34 +416,29 @@ class PatreonIE(PatreonBaseIE):
'channel_follower_count': ('attributes', 'patron_count', {int_or_none}),
}))
# Must be all-lowercase 'referer' so we can smuggle it to Generic, SproutVideo, and Vimeo.
# patreon.com URLs redirect to www.patreon.com; this matters when requesting mux.com m3u8s
headers = {'referer': 'https://www.patreon.com/'}
if embed_url := traverse_obj(attributes, ('embed', 'url', {url_or_none})):
# Convert useless vimeo.com URLs to useful player.vimeo.com embed URLs
vimeo_id, vimeo_hash = self._search_regex(
r'//vimeo\.com/(\d+)(?:/([\da-f]+))?', embed_url,
'vimeo id', group=(1, 2), default=(None, None))
if vimeo_id:
embed_url = update_url_query(
f'https://player.vimeo.com/video/{vimeo_id}',
{'h': vimeo_hash or []})
if VimeoIE.suitable(embed_url):
entry = self.url_result(
VimeoIE._smuggle_referrer(embed_url, self._HTTP_HEADERS['referer']),
VimeoIE, url_transparent=True)
else:
entry = self.url_result(smuggle_url(embed_url, self._HTTP_HEADERS))
# handle Vimeo embeds
if traverse_obj(attributes, ('embed', 'provider')) == 'Vimeo':
v_url = urllib.parse.unquote(self._html_search_regex(
r'(https(?:%3A%2F%2F|://)player\.vimeo\.com.+app_id(?:=|%3D)+\d+)',
traverse_obj(attributes, ('embed', 'html', {str})), 'vimeo url', fatal=False) or '')
if url_or_none(v_url) and self._request_webpage(
v_url, video_id, 'Checking Vimeo embed URL', headers=headers,
fatal=False, errnote=False, expected_status=429): # 429 is TLS fingerprint rejection
entries.append(self.url_result(
VimeoIE._smuggle_referrer(v_url, headers['referer']),
VimeoIE, url_transparent=True))
embed_url = traverse_obj(attributes, ('embed', 'url', {url_or_none}))
if embed_url and (urlh := self._request_webpage(
embed_url, video_id, 'Checking embed URL', headers=headers,
fatal=False, errnote=False, expected_status=403)):
# Vimeo's Cloudflare anti-bot protection will return HTTP status 200 for 404, so we need
# to check for "Sorry, we couldn&rsquo;t find that page" in the meta description tag
meta_description = clean_html(self._html_search_meta(
'description', self._webpage_read_content(urlh, embed_url, video_id, fatal=False), default=None))
# Password-protected vids.io embeds return 403 errors w/o --video-password or session cookie
if ((urlh.status != 403 and meta_description != 'Sorry, we couldn’t find that page')
or VidsIoIE.suitable(embed_url)):
entries.append(self.url_result(smuggle_url(embed_url, headers)))
if urlh := self._request_webpage(
embed_url, video_id, 'Checking embed URL', headers=self._HTTP_HEADERS,
fatal=False, errnote=False, expected_status=(403, 429), # Ignore Vimeo 429's
):
# Password-protected vids.io embeds return 403 errors w/o --video-password or session cookie
if VidsIoIE.suitable(embed_url) or urlh.status != 403:
entries.append(entry)
post_file = traverse_obj(attributes, ('post_file', {dict}))
if post_file:
@@ -381,13 +452,27 @@ class PatreonIE(PatreonBaseIE):
})
elif name == 'video' or determine_ext(post_file.get('url')) == 'm3u8':
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
post_file['url'], video_id, headers=headers)
post_file['url'], video_id, headers=self._HTTP_HEADERS)
for f in formats:
f['http_headers'] = self._HTTP_HEADERS
entries.append({
'id': video_id,
'formats': formats,
'subtitles': subtitles,
'http_headers': headers,
})
if media_id := traverse_obj(post_file, ('media_id', {int}, {str_or_none})):
seen_media_ids.add(media_id)
for media_id in traverse_obj(attributes, (
'content', {find_elements(attr='data-media-id', value=r'\d+', regex=True, html=True)},
..., {extract_attributes}, 'data-media-id',
)):
# Inlined media may be duplicates of what was extracted above
if media_id in seen_media_ids:
continue
if media := self._extract_from_media_api(media_id):
entries.append(media)
seen_media_ids.add(media_id)
can_view_post = traverse_obj(attributes, 'current_user_can_view')
comments = None
@@ -598,7 +683,8 @@ class PatreonCampaignIE(PatreonBaseIE):
'props', 'pageProps', 'bootstrapEnvelope', 'pageBootstrap', 'campaign', 'data', 'id', {str}))
if not campaign_id:
campaign_id = traverse_obj(self._search_nextjs_v13_data(webpage, vanity), (
lambda _, v: v['type'] == 'campaign', 'id', {str}, any, {require('campaign ID')}))
((..., 'value', 'campaign', 'data'), lambda _, v: v['type'] == 'campaign'),
'id', {str}, any, {require('campaign ID')}))
params = {
'json-api-use-default-includes': 'false',
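
The inlined-media pass pairs a find_elements traversal with a seen-IDs set so attachments already extracted from the post's 'included' or post_file data are not fetched twice. A reduced sketch using the same yt-dlp helpers; post_html and already_seen are made-up example values:

from yt_dlp.utils import extract_attributes
from yt_dlp.utils.traversal import find_elements, traverse_obj

post_html = '<div data-media-id="123"></div><div data-media-id="456"></div>'
already_seen = {'123'}  # IDs collected by the earlier passes

for media_id in traverse_obj(post_html, (
        {find_elements(attr='data-media-id', value=r'\d+', regex=True, html=True)},
        ..., {extract_attributes}, 'data-media-id')):
    if media_id not in already_seen:  # inlined media may duplicate the above
        already_seen.add(media_id)
        print(f'would call the media/{media_id} API here')  # cf. _extract_from_media_api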

View File: yt_dlp/extractor/pbs.py

@@ -453,6 +453,23 @@ class PBSIE(InfoExtractor):
'url': 'https://player.pbs.org/portalplayer/3004638221/?uid=',
'only_matching': True,
},
{
# Next.js v13+, see https://github.com/yt-dlp/yt-dlp/issues/13299
'url': 'https://www.pbs.org/video/caregiving',
'info_dict': {
'id': '3101776876',
'ext': 'mp4',
'title': 'Caregiving - Caregiving',
'description': 'A documentary revealing America’s caregiving crisis through intimate stories and expert insight.',
'display_id': 'caregiving',
'duration': 6783,
'thumbnail': 'https://image.pbs.org/video-assets/BSrSkcc-asset-mezzanine-16x9-nlcxQts.jpg',
'chapters': [],
},
'params': {
'skip_download': True,
},
},
]
_ERRORS = {
101: 'We\'re sorry, but this video is not yet available.',
@@ -506,6 +523,7 @@ class PBSIE(InfoExtractor):
r"(?s)window\.PBS\.playerConfig\s*=\s*{.*?id\s*:\s*'([0-9]+)',",
r'<div[^>]+\bdata-cove-id=["\'](\d+)"', # http://www.pbs.org/wgbh/roadshow/watch/episode/2105-indianapolis-hour-2/
r'<iframe[^>]+\bsrc=["\'](?:https?:)?//video\.pbs\.org/widget/partnerplayer/(\d+)', # https://www.pbs.org/wgbh/masterpiece/episodes/victoria-s2-e1/
r'\\"videoTPMediaId\\":\\\"(\d+)\\"', # Next.js v13, e.g. https://www.pbs.org/video/caregiving
r'\bhttps?://player\.pbs\.org/[\w-]+player/(\d+)', # last pattern to avoid false positives
]
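
The new videoTPMediaId pattern targets JSON embedded inside a JavaScript string in the Next.js flight payload, so quotes arrive backslash-escaped in the raw HTML. A quick check against a fabricated fragment:

import re

fragment = r'self.__next_f.push([1,"{\"videoTPMediaId\":\"3101776876\"}"])'
video_id = re.search(r'\\"videoTPMediaId\\":\\\"(\d+)\\"', fragment).group(1)
assert video_id == '3101776876'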

View File: yt_dlp/extractor/picarto.py

@@ -4,6 +4,7 @@ from .common import InfoExtractor
from ..utils import (
ExtractorError,
str_or_none,
strip_or_none,
traverse_obj,
update_url,
)
@@ -50,7 +51,6 @@ class PicartoIE(InfoExtractor):
if metadata.get('online') == 0:
raise ExtractorError('Stream is offline', expected=True)
title = metadata['title']
cdn_data = self._download_json(''.join((
update_url(data['getLoadBalancerUrl']['url'], scheme='https'),
@@ -79,7 +79,7 @@ class PicartoIE(InfoExtractor):
return {
'id': channel_id,
'title': title.strip(),
'title': strip_or_none(metadata.get('title')),
'is_live': True,
'channel': channel_id,
'channel_id': metadata.get('id'),
@@ -159,7 +159,7 @@ class PicartoVodIE(InfoExtractor):
'id': video_id,
**traverse_obj(data, {
'id': ('id', {str_or_none}),
'title': ('title', {str}),
'title': ('title', {str.strip}),
'thumbnail': 'video_recording_image_url',
'channel': ('channel', 'name', {str}),
'age_limit': ('adult', {lambda x: 18 if x else 0}),

View File: yt_dlp/extractor/pornhub.py

@@ -24,6 +24,7 @@ from ..utils import (
url_or_none,
urlencode_postdata,
)
from ..utils.traversal import find_elements, traverse_obj
class PornHubBaseIE(InfoExtractor):
@@ -137,23 +138,24 @@ class PornHubIE(PornHubBaseIE):
_EMBED_REGEX = [r'<iframe[^>]+?src=["\'](?P<url>(?:https?:)?//(?:www\.)?pornhub(?:premium)?\.(?:com|net|org)/embed/[\da-z]+)']
_TESTS = [{
'url': 'http://www.pornhub.com/view_video.php?viewkey=648719015',
'md5': 'a6391306d050e4547f62b3f485dd9ba9',
'md5': '4d4a4e9178b655776f86cf89ecaf0edf',
'info_dict': {
'id': '648719015',
'ext': 'mp4',
'title': 'Seductive Indian beauty strips down and fingers her pink pussy',
'uploader': 'Babes',
'uploader': 'BABES-COM',
'uploader_id': '/users/babes-com',
'upload_date': '20130628',
'timestamp': 1372447216,
'duration': 361,
'view_count': int,
'like_count': int,
'dislike_count': int,
'comment_count': int,
'age_limit': 18,
'tags': list,
'categories': list,
'cast': list,
'thumbnail': r're:https?://.+',
},
}, {
# non-ASCII title
@@ -480,13 +482,6 @@ class PornHubIE(PornHubBaseIE):
comment_count = self._extract_count(
r'All Comments\s*<span>\(([\d,.]+)\)', webpage, 'comment')
def extract_list(meta_key):
div = self._search_regex(
rf'(?s)<div[^>]+\bclass=["\'].*?\b{meta_key}Wrapper[^>]*>(.+?)</div>',
webpage, meta_key, default=None)
if div:
return [clean_html(x).strip() for x in re.findall(r'(?s)<a[^>]+\bhref=[^>]+>.+?</a>', div)]
info = self._search_json_ld(webpage, video_id, default={})
# description provided in JSON-LD is irrelevant
info['description'] = None
@@ -505,9 +500,11 @@ class PornHubIE(PornHubBaseIE):
'comment_count': comment_count,
'formats': formats,
'age_limit': 18,
'tags': extract_list('tags'),
'categories': extract_list('categories'),
'cast': extract_list('pornstars'),
**traverse_obj(webpage, {
'tags': ({find_elements(attr='data-label', value='tag')}, ..., {clean_html}),
'categories': ({find_elements(attr='data-label', value='category')}, ..., {clean_html}),
'cast': ({find_elements(attr='data-label', value='pornstar')}, ..., {clean_html}),
}),
'subtitles': subtitles,
}, info)
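
The data-label traversal replaces the old extract_list() regex helper with yt-dlp's find_elements. A reduced example over stand-in markup (the real watch page is far larger):

from yt_dlp.utils import clean_html
from yt_dlp.utils.traversal import find_elements, traverse_obj

webpage = '<a data-label="tag" href="#">solo</a><a data-label="category" href="#">amateur</a>'
info = traverse_obj(webpage, {
    'tags': ({find_elements(attr='data-label', value='tag')}, ..., {clean_html}),
    'categories': ({find_elements(attr='data-label', value='category')}, ..., {clean_html}),
})
assert info == {'tags': ['solo'], 'categories': ['amateur']}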

View File: yt_dlp/extractor/rumble.py

@@ -405,7 +405,7 @@ class RumbleChannelIE(InfoExtractor):
for video_url in traverse_obj(
get_elements_html_by_class('videostream__link', webpage), (..., {extract_attributes}, 'href'),
):
yield self.url_result(urljoin('https://rumble.com', video_url))
yield self.url_result(urljoin('https://rumble.com', video_url), RumbleIE)
def _real_extract(self, url):
url, playlist_id = self._match_valid_url(url).groups()

View File: yt_dlp/extractor/s4c.py

@@ -15,14 +15,15 @@ class S4CIE(InfoExtractor):
'thumbnail': 'https://www.s4c.cymru/amg/1920x1080/Y_Swn_2023S4C_099_ii.jpg',
},
}, {
'url': 'https://www.s4c.cymru/clic/programme/856636948',
# Geo restricted to the UK
'url': 'https://www.s4c.cymru/clic/programme/886303048',
'info_dict': {
'id': '856636948',
'id': '886303048',
'ext': 'mp4',
'title': 'Am Dro',
'title': 'Pennod 1',
'description': 'md5:7e3f364b70f61fcdaa8b4cb4a3eb3e7a',
'duration': 2880,
'description': 'md5:100d8686fc9a632a0cb2db52a3433ffe',
'thumbnail': 'https://www.s4c.cymru/amg/1920x1080/Am_Dro_2022-23S4C_P6_4005.jpg',
'thumbnail': 'https://www.s4c.cymru/amg/1920x1080/Stad_2025S4C_P1_210053.jpg',
},
}]
@@ -51,7 +52,7 @@ class S4CIE(InfoExtractor):
'https://player-api.s4c-cdn.co.uk/streaming-urls/prod', video_id, query={
'mode': 'od',
'application': 'clic',
'region': 'WW',
'region': 'UK' if player_config.get('application') == 's4chttpl' else 'WW',
'extra': 'false',
'thirdParty': 'false',
'filename': player_config['filename'],

View File: yt_dlp/extractor/scte.py

@@ -1,137 +0,0 @@
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
decode_packed_codes,
urlencode_postdata,
)
class SCTEBaseIE(InfoExtractor):
_LOGIN_URL = 'https://www.scte.org/SCTE/Sign_In.aspx'
_NETRC_MACHINE = 'scte'
def _perform_login(self, username, password):
login_popup = self._download_webpage(
self._LOGIN_URL, None, 'Downloading login popup')
def is_logged(webpage):
return any(re.search(p, webpage) for p in (
r'class=["\']welcome\b', r'>Sign Out<'))
# already logged in
if is_logged(login_popup):
return
login_form = self._hidden_inputs(login_popup)
login_form.update({
'ctl01$TemplateBody$WebPartManager1$gwpciNewContactSignInCommon$ciNewContactSignInCommon$signInUserName': username,
'ctl01$TemplateBody$WebPartManager1$gwpciNewContactSignInCommon$ciNewContactSignInCommon$signInPassword': password,
'ctl01$TemplateBody$WebPartManager1$gwpciNewContactSignInCommon$ciNewContactSignInCommon$RememberMe': 'on',
})
response = self._download_webpage(
self._LOGIN_URL, None, 'Logging in',
data=urlencode_postdata(login_form))
if '|pageRedirect|' not in response and not is_logged(response):
error = self._html_search_regex(
r'(?s)<[^>]+class=["\']AsiError["\'][^>]*>(.+?)</',
response, 'error message', default=None)
if error:
raise ExtractorError(f'Unable to login: {error}', expected=True)
raise ExtractorError('Unable to log in')
class SCTEIE(SCTEBaseIE):
_WORKING = False
_VALID_URL = r'https?://learning\.scte\.org/mod/scorm/view\.php?.*?\bid=(?P<id>\d+)'
_TESTS = [{
'url': 'https://learning.scte.org/mod/scorm/view.php?id=31484',
'info_dict': {
'title': 'Introduction to DOCSIS Engineering Professional',
'id': '31484',
},
'playlist_count': 5,
'skip': 'Requires account credentials',
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._search_regex(r'<h1>(.+?)</h1>', webpage, 'title')
context_id = self._search_regex(r'context-(\d+)', webpage, video_id)
content_base = f'https://learning.scte.org/pluginfile.php/{context_id}/mod_scorm/content/8/'
context = decode_packed_codes(self._download_webpage(
f'{content_base}mobile/data.js', video_id))
data = self._parse_xml(
self._search_regex(
r'CreateData\(\s*"(.+?)"', context, 'data').replace(r"\'", "'"),
video_id)
entries = []
for asset in data.findall('.//asset'):
asset_url = asset.get('url')
if not asset_url or not asset_url.endswith('.mp4'):
continue
asset_id = self._search_regex(
r'video_([^_]+)_', asset_url, 'asset id', default=None)
if not asset_id:
continue
entries.append({
'id': asset_id,
'title': title,
'url': content_base + asset_url,
})
return self.playlist_result(entries, video_id, title)
class SCTECourseIE(SCTEBaseIE):
_WORKING = False
_VALID_URL = r'https?://learning\.scte\.org/(?:mod/sub)?course/view\.php?.*?\bid=(?P<id>\d+)'
_TESTS = [{
'url': 'https://learning.scte.org/mod/subcourse/view.php?id=31491',
'only_matching': True,
}, {
'url': 'https://learning.scte.org/course/view.php?id=3639',
'only_matching': True,
}, {
'url': 'https://learning.scte.org/course/view.php?id=3073',
'only_matching': True,
}]
def _real_extract(self, url):
course_id = self._match_id(url)
webpage = self._download_webpage(url, course_id)
title = self._search_regex(
r'<h1>(.+?)</h1>', webpage, 'title', default=None)
entries = []
for mobj in re.finditer(
r'''(?x)
<a[^>]+
href=(["\'])
(?P<url>
https?://learning\.scte\.org/mod/
(?P<kind>scorm|subcourse)/view\.php?(?:(?!\1).)*?
\bid=\d+
)
''',
webpage):
item_url = mobj.group('url')
if item_url == url:
continue
ie = (SCTEIE.ie_key() if mobj.group('kind') == 'scorm'
else SCTECourseIE.ie_key())
entries.append(self.url_result(item_url, ie=ie))
return self.playlist_result(entries, course_id, title)

View File: yt_dlp/extractor/soundcloud.py

@@ -1064,7 +1064,7 @@ class SoundcloudRelatedIE(SoundcloudPagedPlaylistBaseIE):
class SoundcloudPlaylistIE(SoundcloudPlaylistBaseIE):
_VALID_URL = r'https?://api(?:-v2)?\.soundcloud\.com/playlists/(?P<id>[0-9]+)(?:/?\?secret_token=(?P<token>[^&]+?))?$'
_VALID_URL = r'https?://api(?:-v2)?\.soundcloud\.com/playlists/(?:soundcloud(?:%3A|:)playlists(?:%3A|:))?(?P<id>[0-9]+)(?:/?\?secret_token=(?P<token>[^&]+?))?$'
IE_NAME = 'soundcloud:playlist'
_TESTS = [{
'url': 'https://api.soundcloud.com/playlists/4110309',
@@ -1079,6 +1079,12 @@ class SoundcloudPlaylistIE(SoundcloudPlaylistBaseIE):
'album': 'TILT Brass - Bowery Poetry Club, August \'03 [Non-Site SCR 02]',
},
'playlist_count': 6,
}, {
'url': 'https://api.soundcloud.com/playlists/soundcloud%3Aplaylists%3A1759227795',
'only_matching': True,
}, {
'url': 'https://api.soundcloud.com/playlists/soundcloud:playlists:2104769627?secret_token=s-wmpCLuExeYX',
'only_matching': True,
}]
def _real_extract(self, url):
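
A quick self-check that the widened _VALID_URL accepts a plain numeric ID as well as raw and percent-encoded soundcloud:playlists: URNs:

import re

_VALID_URL = r'https?://api(?:-v2)?\.soundcloud\.com/playlists/(?:soundcloud(?:%3A|:)playlists(?:%3A|:))?(?P<id>[0-9]+)(?:/?\?secret_token=(?P<token>[^&]+?))?$'

for url in (
    'https://api.soundcloud.com/playlists/4110309',
    'https://api.soundcloud.com/playlists/soundcloud%3Aplaylists%3A1759227795',
    'https://api.soundcloud.com/playlists/soundcloud:playlists:2104769627?secret_token=s-wmpCLuExeYX',
):
    assert re.match(_VALID_URL, url).group('id')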

View File: yt_dlp/extractor/sportdeutschland.py

@@ -8,10 +8,11 @@ from ..utils import (
class SportDeutschlandIE(InfoExtractor):
_VALID_URL = r'https?://(?:player\.)?sportdeutschland\.tv/(?P<id>(?:[^/?#]+/)?[^?#/&]+)'
IE_NAME = 'sporteurope'
_VALID_URL = r'https?://(?:player\.)?sporteurope\.tv/(?P<id>(?:[^/?#]+/)?[^?#/&]+)'
_TESTS = [{
# Single-part video, direct link
'url': 'https://sportdeutschland.tv/rostock-griffins/gfl2-rostock-griffins-vs-elmshorn-fighting-pirates',
'url': 'https://sporteurope.tv/rostock-griffins/gfl2-rostock-griffins-vs-elmshorn-fighting-pirates',
'md5': '35c11a19395c938cdd076b93bda54cde',
'info_dict': {
'id': '9f27a97d-1544-4d0b-aa03-48d92d17a03a',
@@ -19,9 +20,9 @@ class SportDeutschlandIE(InfoExtractor):
'title': 'GFL2: Rostock Griffins vs. Elmshorn Fighting Pirates',
'display_id': 'rostock-griffins/gfl2-rostock-griffins-vs-elmshorn-fighting-pirates',
'channel': 'Rostock Griffins',
'channel_url': 'https://sportdeutschland.tv/rostock-griffins',
'channel_url': 'https://sporteurope.tv/rostock-griffins',
'live_status': 'was_live',
'description': 'md5:60cb00067e55dafa27b0933a43d72862',
'description': r're:Video-Livestream des Spiels Rostock Griffins vs\. Elmshorn Fighting Pirates.+',
'channel_id': '9635f21c-3f67-4584-9ce4-796e9a47276b',
'timestamp': 1749913117,
'upload_date': '20250614',
@@ -29,16 +30,16 @@ class SportDeutschlandIE(InfoExtractor):
},
}, {
# Single-part video, embedded player link
'url': 'https://player.sportdeutschland.tv/9e9619c4-7d77-43c4-926d-49fb57dc06dc',
'url': 'https://player.sporteurope.tv/9e9619c4-7d77-43c4-926d-49fb57dc06dc',
'info_dict': {
'id': '9f27a97d-1544-4d0b-aa03-48d92d17a03a',
'ext': 'mp4',
'title': 'GFL2: Rostock Griffins vs. Elmshorn Fighting Pirates',
'display_id': '9e9619c4-7d77-43c4-926d-49fb57dc06dc',
'channel': 'Rostock Griffins',
'channel_url': 'https://sportdeutschland.tv/rostock-griffins',
'channel_url': 'https://sporteurope.tv/rostock-griffins',
'live_status': 'was_live',
'description': 'md5:60cb00067e55dafa27b0933a43d72862',
'description': r're:Video-Livestream des Spiels Rostock Griffins vs\. Elmshorn Fighting Pirates.+',
'channel_id': '9635f21c-3f67-4584-9ce4-796e9a47276b',
'timestamp': 1749913117,
'upload_date': '20250614',
@@ -47,7 +48,7 @@ class SportDeutschlandIE(InfoExtractor):
'params': {'skip_download': True},
}, {
# Multi-part video
'url': 'https://sportdeutschland.tv/rhine-ruhr-2025-fisu-world-university-games/volleyball-w-japan-vs-brasilien-halbfinale-2',
'url': 'https://sporteurope.tv/rhine-ruhr-2025-fisu-world-university-games/volleyball-w-japan-vs-brasilien-halbfinale-2',
'info_dict': {
'id': '9f63d737-2444-4e3a-a1ea-840df73fd481',
'display_id': 'rhine-ruhr-2025-fisu-world-university-games/volleyball-w-japan-vs-brasilien-halbfinale-2',
@@ -55,7 +56,7 @@ class SportDeutschlandIE(InfoExtractor):
'description': 'md5:0a17da15e48a687e6019639c3452572b',
'channel': 'Rhine-Ruhr 2025 FISU World University Games',
'channel_id': '9f5216be-a49d-470b-9a30-4fe9df993334',
'channel_url': 'https://sportdeutschland.tv/rhine-ruhr-2025-fisu-world-university-games',
'channel_url': 'https://sporteurope.tv/rhine-ruhr-2025-fisu-world-university-games',
'live_status': 'was_live',
},
'playlist_count': 2,
@@ -66,7 +67,7 @@ class SportDeutschlandIE(InfoExtractor):
'title': 'Volleyball w: Japan vs. Braslien - Halbfinale 2 Part 1',
'channel': 'Rhine-Ruhr 2025 FISU World University Games',
'channel_id': '9f5216be-a49d-470b-9a30-4fe9df993334',
'channel_url': 'https://sportdeutschland.tv/rhine-ruhr-2025-fisu-world-university-games',
'channel_url': 'https://sporteurope.tv/rhine-ruhr-2025-fisu-world-university-games',
'duration': 14773.0,
'timestamp': 1753085197,
'upload_date': '20250721',
@@ -79,16 +80,17 @@ class SportDeutschlandIE(InfoExtractor):
'title': 'Volleyball w: Japan vs. Braslien - Halbfinale 2 Part 2',
'channel': 'Rhine-Ruhr 2025 FISU World University Games',
'channel_id': '9f5216be-a49d-470b-9a30-4fe9df993334',
'channel_url': 'https://sportdeutschland.tv/rhine-ruhr-2025-fisu-world-university-games',
'channel_url': 'https://sporteurope.tv/rhine-ruhr-2025-fisu-world-university-games',
'duration': 14773.0,
'timestamp': 1753128421,
'upload_date': '20250721',
'live_status': 'was_live',
},
}],
'skip': '404 Not Found',
}, {
# Livestream
'url': 'https://sportdeutschland.tv/dtb/gymnastik-international-tag-1',
'url': 'https://sporteurope.tv/dtb/gymnastik-international-tag-1',
'info_dict': {
'id': '95d71b8a-370a-4b87-ad16-94680da18528',
'ext': 'mp4',
@@ -96,7 +98,7 @@ class SportDeutschlandIE(InfoExtractor):
'display_id': 'dtb/gymnastik-international-tag-1',
'channel_id': '936ecef1-2f4a-4e08-be2f-68073cb7ecab',
'channel': 'Deutscher Turner-Bund',
'channel_url': 'https://sportdeutschland.tv/dtb',
'channel_url': 'https://sporteurope.tv/dtb',
'description': 'md5:07a885dde5838a6f0796ee21dc3b0c52',
'live_status': 'is_live',
},
@@ -106,9 +108,9 @@ class SportDeutschlandIE(InfoExtractor):
def _process_video(self, asset_id, video):
is_live = video['type'] == 'mux_live'
token = self._download_json(
f'https://api.sportdeutschland.tv/api/web/personal/asset-token/{asset_id}',
f'https://api.sporteurope.tv/api/web/personal/asset-token/{asset_id}',
video['id'], query={'type': video['type'], 'playback_id': video['src']},
headers={'Referer': 'https://sportdeutschland.tv/'})['token']
headers={'Referer': 'https://sporteurope.tv/'})['token']
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
f'https://stream.mux.com/{video["src"]}.m3u8?token={token}', video['id'], live=is_live)
@@ -126,7 +128,7 @@ class SportDeutschlandIE(InfoExtractor):
def _real_extract(self, url):
display_id = self._match_id(url)
meta = self._download_json(
f'https://api.sportdeutschland.tv/api/stateless/frontend/assets/{display_id}',
f'https://api.sporteurope.tv/api/stateless/frontend/assets/{display_id}',
display_id, query={'access_token': 'true'})
info = {
@@ -139,7 +141,7 @@ class SportDeutschlandIE(InfoExtractor):
'channel_id': ('profile', 'id'),
'is_live': 'currently_live',
'was_live': 'was_live',
'channel_url': ('profile', 'slug', {lambda x: f'https://sportdeutschland.tv/{x}'}),
'channel_url': ('profile', 'slug', {lambda x: f'https://sporteurope.tv/{x}'}),
}, get_all=False),
}

View File: yt_dlp/extractor/sproutvideo.py

@@ -101,8 +101,8 @@ class SproutVideoIE(InfoExtractor):
webpage = self._download_webpage(
url, video_id, headers=traverse_obj(smuggled_data, {'Referer': 'referer'}))
data = self._search_json(
r'(?:var|const|let)\s+(?:dat|(?:player|video)Info|)\s*=\s*["\']', webpage, 'player info',
video_id, contains_pattern=r'[A-Za-z0-9+/=]+', end_pattern=r'["\'];',
r'(?:window\.|(?:var|const|let)\s+)(?:dat|(?:player|video)Info|)\s*=\s*["\']', webpage,
'player info', video_id, contains_pattern=r'[A-Za-z0-9+/=]+', end_pattern=r'["\'];',
transform_source=lambda x: base64.b64decode(x).decode())
# SproutVideo may send player info for 'SMPTE Color Monitor Test' [a791d7b71b12ecc52e]
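
The broadened prefix now also matches bare window.-scoped assignments, not just var/const/let declarations. For instance (the base64 payloads here are dummies):

import re

PLAYER_INFO_RE = r'(?:window\.|(?:var|const|let)\s+)(?:dat|(?:player|video)Info|)\s*=\s*["\']'
assert re.search(PLAYER_INFO_RE, 'window.dat = "eyJmb28iOiJiYXIifQ==";')
assert re.search(PLAYER_INFO_RE, "var playerInfo = 'eyJmb28iOiJiYXIifQ==';")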

View File: yt_dlp/extractor/tarangplus.py

@@ -0,0 +1,244 @@
import base64
import binascii
import functools
import re
import urllib.parse
from .common import InfoExtractor
from ..dependencies import Cryptodome
from ..utils import (
ExtractorError,
OnDemandPagedList,
clean_html,
extract_attributes,
url_or_none,
urljoin,
)
from ..utils.traversal import (
find_element,
find_elements,
require,
traverse_obj,
)
class TarangPlusBaseIE(InfoExtractor):
_BASE_URL = 'https://tarangplus.in'
class TarangPlusVideoIE(TarangPlusBaseIE):
IE_NAME = 'tarangplus:video'
_VALID_URL = r'https?://(?:www\.)?tarangplus\.in/(?:movies|[^#?/]+/[^#?/]+)/(?!episodes)(?P<id>[^#?/]+)'
_TESTS = [{
'url': 'https://tarangplus.in/tarangaplus-originals/khitpit/khitpit-ep-10',
'md5': '78ce056cee755687b8a48199909ecf53',
'info_dict': {
'id': '67b8206719521d054c0059b7',
'display_id': 'khitpit-ep-10',
'ext': 'mp4',
'title': 'Khitpit Ep-10',
'description': 'md5:a45b805cb628e15c853d78b0406eab48',
'thumbnail': r're:https?://.+/.+\.jpg',
'duration': 756.0,
'timestamp': 1740355200,
'upload_date': '20250224',
'media_type': 'episode',
'categories': ['Originals'],
},
}, {
'url': 'https://tarangplus.in/tarang-serials/bada-bohu/bada-bohu-ep-233',
'md5': 'b4f9beb15172559bb362203b4f48382e',
'info_dict': {
'id': '680b9d6c19521d054c007782',
'display_id': 'bada-bohu-ep-233',
'ext': 'mp4',
'title': 'Bada Bohu | Ep -233',
'description': 'md5:e6b8e7edc9e60b92c1b390f8789ecd69',
'thumbnail': r're:https?://.+/.+\.jpg',
'duration': 1392.0,
'timestamp': 1745539200,
'upload_date': '20250425',
'media_type': 'episode',
'categories': ['Prime'],
},
}, {
# Decrypted m3u8 URL has trailing control characters that need to be stripped
'url': 'https://tarangplus.in/tarangaplus-originals/ichha/ichha-teaser-1',
'md5': '16ee43fe21ad8b6e652ec65eba38a64e',
'info_dict': {
'id': '5f0f252d3326af0720000342',
'ext': 'mp4',
'display_id': 'ichha-teaser-1',
'title': 'Ichha Teaser',
'description': 'md5:c724b0b0669a2cefdada3711cec792e6',
'media_type': 'episode',
'duration': 21.0,
'thumbnail': r're:https?://.+/.+\.jpg',
'categories': ['Originals'],
'timestamp': 1758153600,
'upload_date': '20250918',
},
}, {
'url': 'https://tarangplus.in/short/ai-maa/ai-maa',
'only_matching': True,
}, {
'url': 'https://tarangplus.in/shows/tarang-cine-utsav-2024/tarang-cine-utsav-2024-seg-1',
'only_matching': True,
}, {
'url': 'https://tarangplus.in/music-videos/chori-chori-bohu-chori-songs/nijara-laguchu-dhire-dhire',
'only_matching': True,
}, {
'url': 'https://tarangplus.in/kids-shows/chhota-jaga/chhota-jaga-ep-33-jamidar-ra-khajana-adaya',
'only_matching': True,
}, {
'url': 'https://tarangplus.in/movies/swayambara',
'only_matching': True,
}]
def decrypt(self, data, key):
if not Cryptodome.AES:
raise ExtractorError('pycryptodomex not found. Please install', expected=True)
iv = binascii.unhexlify('00000000000000000000000000000000')
cipher = Cryptodome.AES.new(base64.b64decode(key), Cryptodome.AES.MODE_CBC, iv)
return cipher.decrypt(base64.b64decode(data)).decode('utf-8')
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
hidden_inputs_data = self._hidden_inputs(webpage)
json_ld_data = self._search_json_ld(webpage, display_id)
json_ld_data.pop('url', None)
iframe_url = traverse_obj(webpage, (
{find_element(tag='iframe', attr='src', value=r'.+[?&]contenturl=.+', html=True, regex=True)},
{extract_attributes}, 'src', {require('iframe URL')}))
# Can't use parse_qs here since it would decode the encrypted base64 `+` chars to spaces
content = self._search_regex(r'[?&]contenturl=(.+)', iframe_url, 'content')
encrypted_data, _, attrs = content.partition('|')
metadata = {
m.group('k'): m.group('v')
for m in re.finditer(r'(?:^|\|)(?P<k>[a-z_]+)=(?P<v>(?:(?!\|[a-z_]+=).)+)', attrs)
}
m3u8_url = urllib.parse.unquote(
self.decrypt(encrypted_data, metadata['key'])).rstrip('\x0e\x0f')
return {
'id': display_id, # Fallback
'display_id': display_id,
**json_ld_data,
**traverse_obj(metadata, {
'id': ('content_id', {str}),
'title': ('title', {str}),
'thumbnail': ('image', {url_or_none}),
}),
**traverse_obj(hidden_inputs_data, {
'id': ('content_id', {str}),
'media_type': ('theme_type', {str}),
'categories': ('genre', {str}, filter, all, filter),
}),
'formats': self._extract_m3u8_formats(m3u8_url, display_id),
}
class TarangPlusEpisodesIE(TarangPlusBaseIE):
IE_NAME = 'tarangplus:episodes'
_VALID_URL = r'https?://(?:www\.)?tarangplus\.in/(?P<type>[^#?/]+)/(?P<id>[^#?/]+)/episodes/?(?:$|[?#])'
_TESTS = [{
'url': 'https://tarangplus.in/tarangaplus-originals/balijatra/episodes',
'info_dict': {
'id': 'balijatra',
'title': 'Balijatra',
},
'playlist_mincount': 7,
}, {
'url': 'https://tarangplus.in/tarang-serials/bada-bohu/episodes',
'info_dict': {
'id': 'bada-bohu',
'title': 'Bada Bohu',
},
'playlist_mincount': 236,
}, {
'url': 'https://tarangplus.in/shows/dr-nonsense/episodes',
'info_dict': {
'id': 'dr-nonsense',
'title': 'Dr. Nonsense',
},
'playlist_mincount': 15,
}]
_PAGE_SIZE = 20
def _entries(self, playlist_url, playlist_id, page):
data = self._download_json(
playlist_url, playlist_id, f'Downloading playlist JSON page {page + 1}',
query={'page_no': page})
for item in traverse_obj(data, ('items', ..., {str})):
yield self.url_result(
urljoin(self._BASE_URL, item.split('$')[3]), TarangPlusVideoIE)
def _real_extract(self, url):
url_type, display_id = self._match_valid_url(url).group('type', 'id')
series_url = f'{self._BASE_URL}/{url_type}/{display_id}'
webpage = self._download_webpage(series_url, display_id)
entries = OnDemandPagedList(
functools.partial(self._entries, f'{series_url}/episodes', display_id),
self._PAGE_SIZE)
return self.playlist_result(
entries, display_id, self._hidden_inputs(webpage).get('title'))
class TarangPlusPlaylistIE(TarangPlusBaseIE):
IE_NAME = 'tarangplus:playlist'
_VALID_URL = r'https?://(?:www\.)?tarangplus\.in/(?P<id>[^#?/]+)/all/?(?:$|[?#])'
_TESTS = [{
'url': 'https://tarangplus.in/chhota-jaga/all',
'info_dict': {
'id': 'chhota-jaga',
'title': 'Chhota Jaga',
},
'playlist_mincount': 33,
}, {
'url': 'https://tarangplus.in/kids-yali-show/all',
'info_dict': {
'id': 'kids-yali-show',
'title': 'Yali',
},
'playlist_mincount': 10,
}, {
'url': 'https://tarangplus.in/trailer/all',
'info_dict': {
'id': 'trailer',
'title': 'Trailer',
},
'playlist_mincount': 57,
}, {
'url': 'https://tarangplus.in/latest-songs/all',
'info_dict': {
'id': 'latest-songs',
'title': 'Latest Songs',
},
'playlist_mincount': 46,
}, {
'url': 'https://tarangplus.in/premium-serials-episodes/all',
'info_dict': {
'id': 'premium-serials-episodes',
'title': 'Primetime Latest Episodes',
},
'playlist_mincount': 100,
}]
def _entries(self, webpage):
for url_path in traverse_obj(webpage, (
{find_elements(cls='item')}, ...,
{find_elements(tag='a', attr='href', value='/.+', html=True, regex=True)},
..., {extract_attributes}, 'href',
)):
yield self.url_result(urljoin(self._BASE_URL, url_path), TarangPlusVideoIE)
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
return self.playlist_result(
self._entries(webpage), display_id,
traverse_obj(webpage, ({find_element(id='al_title')}, {clean_html})))
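
decrypt() above is plain AES-CBC with an all-zero IV over base64 payloads. A standalone sketch of the same steps using pycryptodomex directly (the extractor instead goes through ..dependencies.Cryptodome so the dependency stays optional):

import base64
import urllib.parse

from Cryptodome.Cipher import AES  # pycryptodomex

def decrypt_content_url(encrypted_b64, key_b64):
    cipher = AES.new(base64.b64decode(key_b64), AES.MODE_CBC, iv=bytes(16))
    decrypted = cipher.decrypt(base64.b64decode(encrypted_b64)).decode('utf-8')
    # Percent-decode, then drop the trailing control bytes some URLs carry
    return urllib.parse.unquote(decrypted).rstrip('\x0e\x0f')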

View File: yt_dlp/extractor/telecinco.py

@@ -6,20 +6,21 @@ from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
clean_html,
extract_attributes,
int_or_none,
join_nonempty,
str_or_none,
traverse_obj,
update_url,
url_or_none,
)
from ..utils.traversal import traverse_obj
class TelecincoBaseIE(InfoExtractor):
def _parse_content(self, content, url):
video_id = content['dataMediaId']
video_id = content['dataMediaId'][1]
config = self._download_json(
content['dataConfig'], video_id, 'Downloading config JSON')
content['dataConfig'][1], video_id, 'Downloading config JSON')
services = config['services']
caronte = self._download_json(services['caronte'], video_id)
if traverse_obj(caronte, ('dls', 0, 'drm', {bool})):
@@ -57,9 +58,9 @@ class TelecincoBaseIE(InfoExtractor):
'id': video_id,
'title': traverse_obj(config, ('info', 'title', {str})),
'formats': formats,
'thumbnail': (traverse_obj(content, ('dataPoster', {url_or_none}))
'thumbnail': (traverse_obj(content, ('dataPoster', 1, {url_or_none}))
or traverse_obj(config, 'poster', 'imageUrl', expected_type=url_or_none)),
'duration': traverse_obj(content, ('dataDuration', {int_or_none})),
'duration': traverse_obj(content, ('dataDuration', 1, {int_or_none})),
'http_headers': headers,
}
@@ -137,30 +138,45 @@ class TelecincoIE(TelecincoBaseIE):
'url': 'http://www.cuatro.com/chesterinlove/a-carta/chester-chester_in_love-chester_edu_2_2331030022.html',
'only_matching': True,
}]
_ASTRO_ISLAND_RE = re.compile(r'<astro-island\b[^>]+>')
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id, impersonate=True)
article = self._search_json(
r'window\.\$REACTBASE_STATE\.article(?:_multisite)?\s*=',
webpage, 'article', display_id)['article']
description = traverse_obj(article, ('leadParagraph', {clean_html}, filter))
if article.get('editorialType') != 'VID':
props_list = traverse_obj(webpage, (
{self._ASTRO_ISLAND_RE.findall}, ...,
{extract_attributes}, 'props', {json.loads}))
description = traverse_obj(props_list, (..., 'leadParagraph', 1, {clean_html}, any, filter))
main_content = traverse_obj(props_list, (..., ('content', ('articleData', 1, 'opening')), 1, {dict}, any))
if traverse_obj(props_list, (..., 'editorialType', 1, {str}, any)) != 'VID': # e.g. 'ART'
entries = []
for p in traverse_obj(article, ((('opening', all), 'body'), lambda _, v: v['content'])):
content = p['content']
type_ = p.get('type')
if type_ == 'paragraph' and isinstance(content, str):
for p in traverse_obj(props_list, (..., 'articleData', 1, ('opening', ('body', 1, ...)), 1, {dict})):
type_ = traverse_obj(p, ('type', 1, {str}))
content = traverse_obj(p, ('content', 1, {str} if type_ == 'paragraph' else {dict}))
if not content:
continue
if type_ == 'paragraph':
description = join_nonempty(description, content, delim='')
elif type_ == 'video' and isinstance(content, dict):
elif type_ == 'video':
entries.append(self._parse_content(content, url))
else:
self.report_warning(
f'Skipping unsupported content type "{type_}"', display_id, only_once=True)
return self.playlist_result(
entries, str_or_none(article.get('id')),
traverse_obj(article, ('title', {str})), clean_html(description))
entries,
traverse_obj(props_list, (..., 'id', 1, {int}, {str_or_none}, any)) or display_id,
traverse_obj(main_content, ('dataTitle', 1, {str})),
clean_html(description))
info = self._parse_content(article['opening']['content'], url)
if not main_content:
raise ExtractorError('Unable to extract main content from webpage')
info = self._parse_content(main_content, url)
info['description'] = description
return info
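
The page is now rendered with Astro, so component state lives in the props attribute of <astro-island> wrappers and, judging by the added (..., 1, ...) index steps, every field is serialized as a [type, value] pair. A reduced illustration over hypothetical markup:

import json
import re

from yt_dlp.utils import extract_attributes
from yt_dlp.utils.traversal import traverse_obj

ASTRO_ISLAND_RE = re.compile(r'<astro-island\b[^>]+>')
webpage = '<astro-island uid="x" props=\'{"editorialType":["raw","VID"]}\'></astro-island>'
props_list = traverse_obj(webpage, (
    {ASTRO_ISLAND_RE.findall}, ..., {extract_attributes}, 'props', {json.loads}))
assert traverse_obj(props_list, (..., 'editorialType', 1, {str}, any)) == 'VID'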

View File: yt_dlp/extractor/thisoldhouse.py

@@ -1,18 +1,17 @@
import json
import urllib.parse
from .brightcove import BrightcoveNewIE
from .common import InfoExtractor
from .zype import ZypeIE
from ..networking import HEADRequest
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
filter_dict,
parse_qs,
smuggle_url,
try_call,
urlencode_postdata,
)
from ..utils.traversal import traverse_obj
class ThisOldHouseIE(InfoExtractor):
@@ -77,46 +76,43 @@ class ThisOldHouseIE(InfoExtractor):
'only_matching': True,
}]
_LOGIN_URL = 'https://login.thisoldhouse.com/usernamepassword/login'
def _perform_login(self, username, password):
self._request_webpage(
HEADRequest('https://www.thisoldhouse.com/insider'), None, 'Requesting session cookies')
urlh = self._request_webpage(
'https://www.thisoldhouse.com/wp-login.php', None, 'Requesting login info',
errnote='Unable to login', query={'redirect_to': 'https://www.thisoldhouse.com/insider'})
login_page = self._download_webpage(
'https://www.thisoldhouse.com/insider-login', None, 'Downloading login page')
hidden_inputs = self._hidden_inputs(login_page)
response = self._download_json(
'https://www.thisoldhouse.com/wp-admin/admin-ajax.php', None, 'Logging in',
headers={
'Accept': 'application/json',
'X-Requested-With': 'XMLHttpRequest',
}, data=urlencode_postdata(filter_dict({
'action': 'onebill_subscriber_login',
'email': username,
'password': password,
'pricingPlanTerm': hidden_inputs['pricing_plan_term'],
'utm_parameters': hidden_inputs.get('utm_parameters'),
'nonce': hidden_inputs['mdcr_onebill_login_nonce'],
})))
try:
auth_form = self._download_webpage(
self._LOGIN_URL, None, 'Submitting credentials', headers={
'Content-Type': 'application/json',
'Referer': urlh.url,
}, data=json.dumps(filter_dict({
**{('client_id' if k == 'client' else k): v[0] for k, v in parse_qs(urlh.url).items()},
'tenant': 'thisoldhouse',
'username': username,
'password': password,
'popup_options': {},
'sso': True,
'_csrf': try_call(lambda: self._get_cookies(self._LOGIN_URL)['_csrf'].value),
'_intstate': 'deprecated',
}), separators=(',', ':')).encode())
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 401:
message = traverse_obj(response, ('data', 'message', {str}))
if not response['success']:
if message and 'Something went wrong' in message:
raise ExtractorError('Invalid username or password', expected=True)
raise
self._request_webpage(
'https://login.thisoldhouse.com/login/callback', None, 'Completing login',
data=urlencode_postdata(self._hidden_inputs(auth_form)))
raise ExtractorError(message or 'Login was unsuccessful')
if message and 'Your subscription is not active' in message:
self.report_warning(
f'{self.IE_NAME} said your subscription is not active. '
f'If your subscription is active, this could be caused by too many sign-ins, '
f'and you should instead try using {self._login_hint(method="cookies")[4:]}')
else:
self.write_debug(f'{self.IE_NAME} said: {message}')
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
if 'To Unlock This content' in webpage:
self.raise_login_required(
'This video is only available for subscribers. '
'Note that --cookies-from-browser may not work due to this site using session cookies')
webpage, urlh = self._download_webpage_handle(url, display_id)
# If login response says inactive subscription, site redirects to frontpage for Insider content
if 'To Unlock This content' in webpage or urllib.parse.urlparse(urlh.url).path in ('', '/'):
self.raise_login_required('This video is only available for subscribers')
video_url, video_id = self._search_regex(
r'<iframe[^>]+src=[\'"]((?:https?:)?//(?:www\.)?thisoldhouse\.(?:chorus\.build|com)/videos/zype/([0-9a-f]{24})[^\'"]*)[\'"]',
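
One easy-to-miss step in the new login flow: the Auth0 payload is seeded from the query string of the wp-login.php redirect URL, renaming only the client parameter to client_id. In isolation, with a made-up redirect URL:

from yt_dlp.utils import parse_qs

redirect_url = 'https://login.thisoldhouse.com/authorize?client=abc123&response_type=code'
payload = {('client_id' if k == 'client' else k): v[0]
           for k, v in parse_qs(redirect_url).items()}
assert payload == {'client_id': 'abc123', 'response_type': 'code'}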

View File: yt_dlp/extractor/tiktok.py

@@ -1,4 +1,6 @@
import base64
import functools
import hashlib
import itertools
import json
import random
@@ -15,6 +17,7 @@ from ..utils import (
UnsupportedError,
UserNotLive,
determine_ext,
extract_attributes,
filter_dict,
format_field,
int_or_none,
@@ -25,13 +28,13 @@ from ..utils import (
qualities,
srt_subtitles_timecode,
str_or_none,
traverse_obj,
truncate_string,
try_call,
try_get,
url_or_none,
urlencode_postdata,
)
from ..utils.traversal import find_element, require, traverse_obj
class TikTokBaseIE(InfoExtractor):
@@ -217,38 +220,94 @@ class TikTokBaseIE(InfoExtractor):
raise ExtractorError('Unable to extract aweme detail info', video_id=aweme_id)
return self._parse_aweme_video_app(aweme_detail)
def _solve_challenge_and_set_cookie(self, webpage):
challenge_data = traverse_obj(webpage, (
{find_element(id='cs', html=True)}, {extract_attributes}, 'class',
filter, {lambda x: f'{x}==='}, {base64.b64decode}, {json.loads}))
if not challenge_data:
if 'Please wait...' in webpage:
raise ExtractorError('Unable to extract challenge data')
raise ExtractorError('Unexpected response from webpage request')
self.to_screen('Solving JS challenge using native Python implementation')
expected_digest = traverse_obj(challenge_data, (
'v', 'c', {str}, {base64.b64decode},
{require('challenge expected digest')}))
base_hash = traverse_obj(challenge_data, (
'v', 'a', {str}, {base64.b64decode},
{hashlib.sha256}, {require('challenge base hash')}))
for i in range(1_000_001):
number = str(i).encode()
test_hash = base_hash.copy()
test_hash.update(number)
if test_hash.digest() == expected_digest:
challenge_data['d'] = base64.b64encode(number).decode()
break
else:
raise ExtractorError('Unable to solve JS challenge')
cookie_value = base64.b64encode(
json.dumps(challenge_data, separators=(',', ':')).encode()).decode()
# At time of writing, the cookie name was _wafchallengeid
cookie_name = traverse_obj(webpage, (
{find_element(id='wci', html=True)}, {extract_attributes},
'class', {require('challenge cookie name')}))
# Actual JS sets Max-Age=1, but we need to adjust for --sleep-requests and Python slowness
expire_time = int(time.time()) + (self.get_param('sleep_interval_requests') or 0) + 2
self._set_cookie('.tiktok.com', cookie_name, cookie_value, expire_time=expire_time)
def _extract_web_data_and_status(self, url, video_id, fatal=True):
video_data, status = {}, -1
res = self._download_webpage_handle(url, video_id, fatal=fatal, impersonate=True)
if res is False:
def get_webpage(note='Downloading webpage'):
res = self._download_webpage_handle(url, video_id, note, fatal=fatal, impersonate=True)
if res is False:
return False
webpage, urlh = res
if urllib.parse.urlparse(urlh.url).path == '/login':
message = 'TikTok is requiring login for access to this content'
if fatal:
self.raise_login_required(message)
self.report_warning(f'{message}. {self._login_hint()}', video_id=video_id)
return False
return webpage
webpage = get_webpage()
if webpage is False:
return video_data, status
webpage, urlh = res
if urllib.parse.urlparse(urlh.url).path == '/login':
message = 'TikTok is requiring login for access to this content'
universal_data = self._get_universal_data(webpage, video_id)
if not universal_data:
try:
self._solve_challenge_and_set_cookie(webpage)
except ExtractorError as e:
if fatal:
raise
self.report_warning(e.orig_msg, video_id=video_id)
return video_data, status
webpage = get_webpage(note='Downloading webpage with challenge cookie')
if webpage is False:
return video_data, status
universal_data = self._get_universal_data(webpage, video_id)
if not universal_data:
message = 'Unable to extract universal data for rehydration'
if fatal:
self.raise_login_required(message)
self.report_warning(f'{message}. {self._login_hint()}')
raise ExtractorError(message)
self.report_warning(message, video_id=video_id)
return video_data, status
if universal_data := self._get_universal_data(webpage, video_id):
self.write_debug('Found universal data for rehydration')
status = traverse_obj(universal_data, ('webapp.video-detail', 'statusCode', {int})) or 0
video_data = traverse_obj(universal_data, ('webapp.video-detail', 'itemInfo', 'itemStruct', {dict}))
elif sigi_data := self._get_sigi_state(webpage, video_id):
self.write_debug('Found sigi state data')
status = traverse_obj(sigi_data, ('VideoPage', 'statusCode', {int})) or 0
video_data = traverse_obj(sigi_data, ('ItemModule', video_id, {dict}))
elif next_data := self._search_nextjs_data(webpage, video_id, default={}):
self.write_debug('Found next.js data')
status = traverse_obj(next_data, ('props', 'pageProps', 'statusCode', {int})) or 0
video_data = traverse_obj(next_data, ('props', 'pageProps', 'itemInfo', 'itemStruct', {dict}))
elif fatal:
raise ExtractorError('Unable to extract webpage video data')
status = traverse_obj(universal_data, ('webapp.video-detail', 'statusCode', {int})) or 0
video_data = traverse_obj(universal_data, ('webapp.video-detail', 'itemInfo', 'itemStruct', {dict}))
if not traverse_obj(video_data, ('video', {dict})) and traverse_obj(video_data, ('isContentClassified', {bool})):
message = 'This post may not be comfortable for some audiences. Log in for access'
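
Stripped of extractor plumbing, the solver above is a straight SHA-256 proof of work: hash the server-supplied base payload once, then probe numeric suffixes while reusing hash-state copies so each candidate only hashes its own few bytes. A self-contained sketch, assuming challenge_data is the JSON decoded from the #cs element's class attribute:

import base64
import hashlib
import json

def solve_challenge(challenge_data):
    expected_digest = base64.b64decode(challenge_data['v']['c'])
    base_hash = hashlib.sha256(base64.b64decode(challenge_data['v']['a']))
    for i in range(1_000_001):
        test_hash = base_hash.copy()  # copies internal hash state, not the input
        test_hash.update(str(i).encode())
        if test_hash.digest() == expected_digest:
            challenge_data['d'] = base64.b64encode(str(i).encode()).decode()
            # The cookie value is the solved challenge, re-serialized and base64'd
            return base64.b64encode(
                json.dumps(challenge_data, separators=(',', ':')).encode()).decode()
    raise ValueError('no solution within the search bound')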
@@ -454,6 +513,7 @@ class TikTokBaseIE(InfoExtractor):
'like_count': 'digg_count',
'repost_count': 'share_count',
'comment_count': 'comment_count',
'save_count': 'collect_count',
}, expected_type=int_or_none),
**author_info,
'channel_url': format_field(author_info, 'channel_id', self._UPLOADER_URL_FORMAT, default=None),
@@ -607,6 +667,7 @@ class TikTokBaseIE(InfoExtractor):
'like_count': 'diggCount',
'repost_count': 'shareCount',
'comment_count': 'commentCount',
'save_count': 'collectCount',
}), expected_type=int_or_none),
'thumbnails': [
{
@@ -646,6 +707,7 @@ class TikTokIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
'artist': 'Ysrbeats',
'album': 'Lehanga',
'track': 'Lehanga',
@@ -675,6 +737,7 @@ class TikTokIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
'artists': ['Evan Todd', 'Jessica Keenan Wynn', 'Alice Lee', 'Barrett Wilbert Weed', 'Jon Eidson'],
'track': 'Big Fun',
},
@@ -702,6 +765,7 @@ class TikTokIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
},
}, {
# Sponsored video, only available with feed workaround
@@ -725,6 +789,7 @@ class TikTokIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
},
'skip': 'This video is unavailable',
}, {
@@ -751,6 +816,7 @@ class TikTokIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
},
}, {
# hydration JSON is sent in a <script> element
@@ -773,6 +839,7 @@ class TikTokIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
},
'skip': 'This video is unavailable',
}, {
@@ -798,6 +865,7 @@ class TikTokIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
'thumbnail': r're:^https://.+\.(?:webp|jpe?g)',
},
}, {
@@ -824,6 +892,7 @@ class TikTokIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
'thumbnail': r're:^https://.+',
'thumbnails': 'count:3',
},
@@ -851,6 +920,7 @@ class TikTokIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
'thumbnail': r're:^https://.+\.webp',
},
'skip': 'Unavailable via feed API, only audio available via web',
@@ -879,6 +949,7 @@ class TikTokIE(TikTokBaseIE):
'like_count': int,
'comment_count': int,
'repost_count': int,
'save_count': int,
'thumbnail': r're:^https://.+\.(?:webp|jpe?g)',
},
}, {
@@ -1288,6 +1359,7 @@ class DouyinIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
'thumbnail': r're:https?://.+\.jpe?g',
},
}, {
@@ -1312,6 +1384,7 @@ class DouyinIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
'thumbnail': r're:https?://.+\.jpe?g',
},
}, {
@@ -1336,6 +1409,7 @@ class DouyinIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
'thumbnail': r're:https?://.+\.jpe?g',
},
}, {
@@ -1353,6 +1427,7 @@ class DouyinIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
},
'skip': 'No longer available',
}, {
@@ -1377,6 +1452,7 @@ class DouyinIE(TikTokBaseIE):
'like_count': int,
'repost_count': int,
'comment_count': int,
'save_count': int,
'thumbnail': r're:https?://.+\.jpe?g',
},
}]
@@ -1437,6 +1513,7 @@ class TikTokVMIE(InfoExtractor):
'view_count': int,
'like_count': int,
'comment_count': int,
'save_count': int,
'thumbnail': r're:https://.+\.webp.*',
'uploader_url': 'https://www.tiktok.com/@MS4wLjABAAAAdZ_NcPPgMneaGrW0hN8O_J_bwLshwNNERRF5DxOw2HKIzk0kdlLrR8RkVl1ksrMO',
'duration': 29,

View File: yt_dlp/extractor/tubitv.py

@@ -15,7 +15,7 @@ from ..utils import (
class TubiTvIE(InfoExtractor):
IE_NAME = 'tubitv'
_VALID_URL = r'https?://(?:www\.)?tubitv\.com/(?P<type>video|movies|tv-shows)/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?tubitv\.com/(?:[a-z]{2}-[a-z]{2}/)?(?P<type>video|movies|tv-shows)/(?P<id>\d+)'
_LOGIN_URL = 'http://tubitv.com/login'
_NETRC_MACHINE = 'tubitv'
_TESTS = [{
@@ -73,6 +73,9 @@ class TubiTvIE(InfoExtractor):
'release_year': 1979,
},
'skip': 'Content Unavailable',
}, {
'url': 'https://tubitv.com/es-mx/tv-shows/477363/s01-e03-jacob-dos-dos-y-la-tarjets-de-hockey-robada',
'only_matching': True,
}]
# DRM formats are included only to raise appropriate error
@@ -182,13 +185,13 @@ class TubiTvShowIE(InfoExtractor):
webpage = self._download_webpage(show_url, playlist_id)
data = self._search_json(
r'window\.__data\s*=', webpage, 'data', playlist_id,
transform_source=js_to_json)['video']
r'window\.__REACT_QUERY_STATE__\s*=', webpage, 'data', playlist_id,
transform_source=js_to_json)['queries'][0]['state']['data']
# v['number'] is already a decimal string, but stringify to protect against API changes
path = [lambda _, v: str(v['number']) == selected_season] if selected_season else [..., {dict}]
for season in traverse_obj(data, ('byId', lambda _, v: v['type'] == 's', 'seasons', *path)):
for season in traverse_obj(data, ('seasons', *path)):
season_number = int_or_none(season.get('number'))
for episode in traverse_obj(season, ('episodes', lambda _, v: v['id'])):
episode_id = episode['id']
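
The season-filter path works unchanged against the new payload shape; a reduced example with made-up data:

from yt_dlp.utils.traversal import traverse_obj

data = {'seasons': [
    {'number': '1', 'episodes': [{'id': '100'}, {'id': '101'}]},
    {'number': '2', 'episodes': [{'id': '200'}]},
]}
selected_season = '2'
path = [lambda _, v: str(v['number']) == selected_season] if selected_season else [..., {dict}]
assert traverse_obj(data, ('seasons', *path, 'episodes', ..., 'id')) == ['200']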

View File: yt_dlp/extractor/tumblr.py

@@ -20,6 +20,8 @@ class TumblrIE(InfoExtractor):
'id': '54196191430',
'ext': 'mp4',
'title': 'md5:dfac39636969fe6bf1caa2d50405f069',
'timestamp': 1372531260,
'upload_date': '20130629',
'description': 'md5:390ab77358960235b6937ab3b8528956',
'uploader_id': 'tatianamaslanydaily',
'uploader_url': 'https://tatianamaslanydaily.tumblr.com/',
@@ -39,6 +41,8 @@ class TumblrIE(InfoExtractor):
'ext': 'mp4',
'title': 'Mona\xa0“talking” in\xa0“english”',
'description': 'md5:082a3a621530cb786ad2b7592a6d9e2c',
'timestamp': 1597865276,
'upload_date': '20200819',
'uploader_id': 'maskofthedragon',
'uploader_url': 'https://maskofthedragon.tumblr.com/',
'thumbnail': r're:^https?://.*\.jpg',
@@ -76,6 +80,8 @@ class TumblrIE(InfoExtractor):
'id': '159704441298',
'ext': 'mp4',
'title': 'md5:ba79365861101f4911452728d2950561',
'timestamp': 1492489550,
'upload_date': '20170418',
'description': 'md5:773738196cea76b6996ec71e285bdabc',
'uploader_id': 'jujanon',
'uploader_url': 'https://jujanon.tumblr.com/',
@@ -93,6 +99,8 @@ class TumblrIE(InfoExtractor):
'id': '180294460076',
'ext': 'mp4',
'title': 'duality of bird',
'timestamp': 1542651819,
'upload_date': '20181119',
'description': 'duality of bird',
'uploader_id': 'todaysbird',
'uploader_url': 'https://todaysbird.tumblr.com/',
@@ -238,6 +246,8 @@ class TumblrIE(InfoExtractor):
'info_dict': {
'id': '730460905855467520',
'uploader_id': 'felixcosm',
'upload_date': '20231006',
'timestamp': 1696621805,
'repost_count': int,
'tags': 'count:15',
'description': 'md5:2eb3482a3c6987280cbefb6839068f32',
@@ -327,6 +337,8 @@ class TumblrIE(InfoExtractor):
'url': 'https://www.tumblr.com/anyaboz/765332564457209856/my-music-video-for-selkie-by-nobodys-wolf-child',
'info_dict': {
'id': '765332564457209856',
'timestamp': 1729878010,
'upload_date': '20241025',
'uploader_id': 'anyaboz',
'repost_count': int,
'age_limit': 0,
@@ -445,6 +457,8 @@ class TumblrIE(InfoExtractor):
'uploader_id': uploader_id,
'uploader_url': f'https://{uploader_id}.tumblr.com/' if uploader_id else None,
**traverse_obj(post_json, {
# Try oldest post in reblog chain, fall back to timestamp of the post itself
'timestamp': ((('trail', 0, 'post'), None), 'timestamp', {int_or_none}, any),
'like_count': ('like_count', {int_or_none}),
'repost_count': ('reblog_count', {int_or_none}),
'tags': ('tags', ..., {str}),
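
The added timestamp path prefers the oldest post in the reblog trail and falls back to the post's own timestamp; illustrated with made-up payloads:

from yt_dlp.utils import int_or_none
from yt_dlp.utils.traversal import traverse_obj

path = ((('trail', 0, 'post'), None), 'timestamp', {int_or_none}, any)
reblog = {'timestamp': 1700000000, 'trail': [{'post': {'timestamp': 1600000000}}]}
original = {'timestamp': 1700000000}
assert traverse_obj(reblog, path) == 1600000000   # oldest post in the chain wins
assert traverse_obj(original, path) == 1700000000  # no trail: the post itself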

View File: yt_dlp/extractor/tv5unis.py

@@ -1,14 +1,18 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
join_nonempty,
make_archive_id,
parse_age_limit,
smuggle_url,
try_get,
remove_end,
)
from ..utils.traversal import traverse_obj
class TV5UnisBaseIE(InfoExtractor):
_GEO_COUNTRIES = ['CA']
_GEO_BYPASS = False
def _real_extract(self, url):
groups = self._match_valid_url(url).groups()
@@ -16,96 +20,136 @@ class TV5UnisBaseIE(InfoExtractor):
'https://api.tv5unis.ca/graphql', groups[0], query={
'query': '''{
%s(%s) {
title
summary
tags
duration
seasonNumber
episodeNumber
collection {
title
}
episodeNumber
rating {
name
}
seasonNumber
tags
title
videoElement {
__typename
... on Video {
mediaId
encodings {
hls {
url
}
}
}
... on RestrictedVideo {
code
reason
}
}
}
}''' % (self._GQL_QUERY_NAME, self._gql_args(groups)), # noqa: UP031
})['data'][self._GQL_QUERY_NAME]
media_id = product['videoElement']['mediaId']
video = product['videoElement']
if video is None:
raise ExtractorError('This content is no longer available', expected=True)
if video.get('__typename') == 'RestrictedVideo':
code = video.get('code')
if code == 1001:
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
reason = video.get('reason')
raise ExtractorError(join_nonempty(
'This video is restricted',
code is not None and f', error code {code}',
reason and f': {remove_end(reason, ".")}',
delim=''))
media_id = video['mediaId']
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
video['encodings']['hls']['url'], media_id, 'mp4')
return {
'_type': 'url_transparent',
'id': media_id,
'title': product.get('title'),
'url': smuggle_url('limelight:media:' + media_id, {'geo_countries': self._GEO_COUNTRIES}),
'age_limit': parse_age_limit(try_get(product, lambda x: x['rating']['name'])),
'tags': product.get('tags'),
'series': try_get(product, lambda x: x['collection']['title']),
'season_number': int_or_none(product.get('seasonNumber')),
'episode_number': int_or_none(product.get('episodeNumber')),
'ie_key': 'LimelightMedia',
'_old_archive_ids': [make_archive_id('LimelightMedia', media_id)],
'formats': formats,
'subtitles': subtitles,
**traverse_obj(product, {
'title': ('title', {str}),
'description': ('summary', {str}),
'tags': ('tags', ..., {str}),
'duration': ('duration', {int_or_none}),
'season_number': ('seasonNumber', {int_or_none}),
'episode_number': ('episodeNumber', {int_or_none}),
'series': ('collection', 'title', {str}),
'age_limit': ('rating', 'name', {parse_age_limit}),
}),
}
class TV5UnisVideoIE(TV5UnisBaseIE):
_WORKING = False
IE_NAME = 'tv5unis:video'
_VALID_URL = r'https?://(?:www\.)?tv5unis\.ca/videos/[^/]+/(?P<id>\d+)'
_TEST = {
'url': 'https://www.tv5unis.ca/videos/bande-annonces/71843',
'md5': '3d794164928bda97fb87a17e89923d9b',
_VALID_URL = r'https?://(?:www\.)?tv5unis\.ca/videos/[^/?#]+/(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.tv5unis.ca/videos/bande-annonces/144041',
'md5': '24a247c96119d77fe1bae8b440457dfa',
'info_dict': {
'id': 'a883684aecb2486cad9bdc7bbe17f861',
'id': '56862325352147149dce0ae139afced6',
'_old_archive_ids': ['limelightmedia 56862325352147149dce0ae139afced6'],
'ext': 'mp4',
'title': 'Watatatow',
'duration': 10.01,
'title': 'Antigone',
'description': r"re:En aidant son frère .+ dicté par l'amour et la solidarité.",
'duration': 61,
},
}
}]
_GQL_QUERY_NAME = 'productById'
@staticmethod
def _gql_args(groups):
return f'id: {groups}'
return f'id: {groups[0]}'
 class TV5UnisIE(TV5UnisBaseIE):
-    _WORKING = False
     IE_NAME = 'tv5unis'
-    _VALID_URL = r'https?://(?:www\.)?tv5unis\.ca/videos/(?P<id>[^/]+)(?:/saisons/(?P<season_number>\d+)/episodes/(?P<episode_number>\d+))?/?(?:[?#&]|$)'
+    _VALID_URL = r'https?://(?:www\.)?tv5unis\.ca/videos/(?P<id>[^/?#]+)(?:/saisons/(?P<season_number>\d+)/episodes/(?P<episode_number>\d+))?/?(?:[?#&]|$)'
     _TESTS = [{
-        'url': 'https://www.tv5unis.ca/videos/watatatow/saisons/6/episodes/1',
-        'md5': 'a479907d2e531a73e1f8dc48d6388d02',
+        # geo-restricted to Canada; xff is ineffective
+        'url': 'https://www.tv5unis.ca/videos/watatatow/saisons/11/episodes/1',
+        'md5': '43beebd47eefb1c5caf9a47a3fc35589',
         'info_dict': {
-            'id': 'e5ee23a586c44612a56aad61accf16ef',
+            'id': '2c06e4af20f0417b86c2536825287690',
+            '_old_archive_ids': ['limelightmedia 2c06e4af20f0417b86c2536825287690'],
             'ext': 'mp4',
-            'title': 'Je ne peux pas lui résister',
-            'description': "Atys, le nouveau concierge de l'école, a réussi à ébranler la confiance de Mado en affirmant qu'une médaille, ce n'est que du métal. Comme Mado essaie de lui prouver que ses valeurs sont solides, il veut la mettre à l'épreuve...",
+            'title': "L'homme éléphant",
+            'description': r're:Paul-André et Jean-Yves, .+ quand elle parle du feu au Spot.',
+            'subtitles': {
+                'fr': 'count:1',
+            },
-            'duration': 1370,
+            'duration': 1440,
             'age_limit': 8,
-            'tags': 'count:3',
+            'tags': 'count:4',
             'series': 'Watatatow',
-            'season_number': 6,
+            'season': 'Season 11',
+            'season_number': 11,
+            'episode': 'Episode 1',
             'episode_number': 1,
         },
     }, {
-        'url': 'https://www.tv5unis.ca/videos/le-voyage-de-fanny',
-        'md5': '9ca80ebb575c681d10cae1adff3d4774',
+        # geo-restricted to Canada; xff is ineffective
+        'url': 'https://www.tv5unis.ca/videos/boite-a-savon',
+        'md5': '7898e868e8c540f03844660e0aab6bbe',
         'info_dict': {
-            'id': '726188eefe094d8faefb13381d42bc06',
+            'id': '4de6d0c6467b4511a0c04b92037a9f15',
+            '_old_archive_ids': ['limelightmedia 4de6d0c6467b4511a0c04b92037a9f15'],
             'ext': 'mp4',
-            'title': 'Le voyage de Fanny',
-            'description': "Fanny, 12 ans, cachée dans un foyer loin de ses parents, s'occupe de ses deux soeurs. Devant fuir, Fanny prend la tête d'un groupe de huit enfants et s'engage dans un dangereux périple à travers la France occupée pour rejoindre la frontière suisse.",
+            'title': 'Boîte à savon',
+            'description': r're:Dans le petit village de Broche-à-foin, .+ celle qui fait battre son coeur.',
+            'subtitles': {
+                'fr': 'count:1',
+            },
-            'duration': 5587.034,
-            'tags': 'count:4',
+            'duration': 1200,
+            'tags': 'count:5',
         },
     }]
     _GQL_QUERY_NAME = 'productByRootProductSlug'
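
Two small fixes recur in the hunks above: `[^/]+` becomes `[^/?#]+` so a query string or fragment can never bleed into the matched path segment, and `_gql_args` now indexes into `groups`, which, assuming the (unshown) base class passes `mobj.groups()` along as the signature suggests, is a tuple even when only one group is captured:

    import re

    _VALID_URL = r'https?://(?:www\.)?tv5unis\.ca/videos/[^/?#]+/(?P<id>\d+)'
    m = re.match(_VALID_URL, 'https://www.tv5unis.ca/videos/bande-annonces/144041')
    groups = m.groups()   # ('144041',) -- a 1-tuple, not a string
    f'id: {groups}'       # "id: ('144041',)" -- the old, malformed GraphQL argument
    f'id: {groups[0]}'    # 'id: 144041'      -- the fixed argument
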


@@ -680,6 +680,10 @@ class TwitchPlaylistBaseIE(TwitchBaseIE):
                 }],
                 f'Downloading {self._NODE_KIND}s GraphQL page {page_num}',
                 fatal=False)
+            # Avoid extracting random/unrelated entries when channel_name doesn't exist
+            # See https://github.com/yt-dlp/yt-dlp/issues/15450
+            if traverse_obj(page, (0, 'data', 'user', 'id', {str})) == '':
+                raise ExtractorError(f'Channel "{channel_name}" not found', expected=True)
             if not page:
                 break
             edges = try_get(
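
The guard added above assumes Twitch's GraphQL endpoint answers queries for nonexistent channels with a user object carrying an empty-string id, rather than with no user at all; under that assumption the check cleanly separates "channel missing" from "page empty". A rough illustration with invented response shapes:

    from yt_dlp.utils.traversal import traverse_obj

    # Both shapes below are made up for illustration
    missing = [{'data': {'user': {'id': ''}}}]           # nonexistent channel
    present = [{'data': {'user': {'id': '141981764'}}}]  # existing channel

    assert traverse_obj(missing, (0, 'data', 'user', 'id', {str})) == ''   # -> raise
    assert traverse_obj(present, (0, 'data', 'user', 'id', {str})) != ''   # -> continue
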


@@ -32,67 +32,11 @@ from ..utils.traversal import require, traverse_obj
 class TwitterBaseIE(InfoExtractor):
     _NETRC_MACHINE = 'twitter'
     _API_BASE = 'https://api.x.com/1.1/'
     _GRAPHQL_API_BASE = 'https://x.com/i/api/graphql/'
     _BASE_REGEX = r'https?://(?:(?:www|m(?:obile)?)\.)?(?:(?:twitter|x)\.com|twitter3e4tixl4xyajtrzo62zg5vztmjuricljdp2c5kshju4avyoid\.onion)/'
     _AUTH = 'AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs%3D1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA'
     _LEGACY_AUTH = 'AAAAAAAAAAAAAAAAAAAAAIK1zgAAAAAA2tUWuhGZ2JceoId5GwYWU5GspY4%3DUq7gzFoCZs1QfwGoVdvSac3IniczZEYXIcDyumCauIXpcAPorE'
-    _flow_token = None
-
-    _LOGIN_INIT_DATA = json.dumps({
-        'input_flow_data': {
-            'flow_context': {
-                'debug_overrides': {},
-                'start_location': {
-                    'location': 'unknown',
-                },
-            },
-        },
-        'subtask_versions': {
-            'action_list': 2,
-            'alert_dialog': 1,
-            'app_download_cta': 1,
-            'check_logged_in_account': 1,
-            'choice_selection': 3,
-            'contacts_live_sync_permission_prompt': 0,
-            'cta': 7,
-            'email_verification': 2,
-            'end_flow': 1,
-            'enter_date': 1,
-            'enter_email': 2,
-            'enter_password': 5,
-            'enter_phone': 2,
-            'enter_recaptcha': 1,
-            'enter_text': 5,
-            'enter_username': 2,
-            'generic_urt': 3,
-            'in_app_notification': 1,
-            'interest_picker': 3,
-            'js_instrumentation': 1,
-            'menu_dialog': 1,
-            'notifications_permission_prompt': 2,
-            'open_account': 2,
-            'open_home_timeline': 1,
-            'open_link': 1,
-            'phone_verification': 4,
-            'privacy_options': 1,
-            'security_key': 3,
-            'select_avatar': 4,
-            'select_banner': 2,
-            'settings_list': 7,
-            'show_code': 1,
-            'sign_up': 2,
-            'sign_up_review': 4,
-            'tweet_selection_urt': 1,
-            'update_users': 1,
-            'upload_media': 1,
-            'user_recommendations_list': 4,
-            'user_recommendations_urt': 1,
-            'wait_spinner': 3,
-            'web_modal': 1,
-        },
-    }, separators=(',', ':')).encode()

     def _extract_variant_formats(self, variant, video_id):
         variant_url = variant.get('url')
@@ -172,135 +116,6 @@ class TwitterBaseIE(InfoExtractor):
             'x-csrf-token': try_call(lambda: self._get_cookies(self._API_BASE)['ct0'].value),
         })

-    def _call_login_api(self, note, headers, query={}, data=None):
-        response = self._download_json(
-            f'{self._API_BASE}onboarding/task.json', None, note,
-            headers=headers, query=query, data=data, expected_status=400)
-        error = traverse_obj(response, ('errors', 0, 'message', {str}))
-        if error:
-            raise ExtractorError(f'Login failed, Twitter API says: {error}', expected=True)
-        elif traverse_obj(response, 'status') != 'success':
-            raise ExtractorError('Login was unsuccessful')
-
-        subtask = traverse_obj(
-            response, ('subtasks', ..., 'subtask_id', {str}), get_all=False)
-        if not subtask:
-            raise ExtractorError('Twitter API did not return next login subtask')
-
-        self._flow_token = response['flow_token']
-        return subtask
-
-    def _perform_login(self, username, password):
-        if self.is_logged_in:
-            return
-
-        guest_token = self._fetch_guest_token(None)
-        headers = {
-            **self._set_base_headers(),
-            'content-type': 'application/json',
-            'x-guest-token': guest_token,
-            'x-twitter-client-language': 'en',
-            'x-twitter-active-user': 'yes',
-            'Referer': 'https://x.com/',
-            'Origin': 'https://x.com',
-        }
-
-        def build_login_json(*subtask_inputs):
-            return json.dumps({
-                'flow_token': self._flow_token,
-                'subtask_inputs': subtask_inputs,
-            }, separators=(',', ':')).encode()
-
-        def input_dict(subtask_id, text):
-            return {
-                'subtask_id': subtask_id,
-                'enter_text': {
-                    'text': text,
-                    'link': 'next_link',
-                },
-            }
-
-        next_subtask = self._call_login_api(
-            'Downloading flow token', headers, query={'flow_name': 'login'}, data=self._LOGIN_INIT_DATA)
-
-        while not self.is_logged_in:
-            if next_subtask == 'LoginJsInstrumentationSubtask':
-                next_subtask = self._call_login_api(
-                    'Submitting JS instrumentation response', headers, data=build_login_json({
-                        'subtask_id': next_subtask,
-                        'js_instrumentation': {
-                            'response': '{}',
-                            'link': 'next_link',
-                        },
-                    }))
-            elif next_subtask == 'LoginEnterUserIdentifierSSO':
-                next_subtask = self._call_login_api(
-                    'Submitting username', headers, data=build_login_json({
-                        'subtask_id': next_subtask,
-                        'settings_list': {
-                            'setting_responses': [{
-                                'key': 'user_identifier',
-                                'response_data': {
-                                    'text_data': {
-                                        'result': username,
-                                    },
-                                },
-                            }],
-                            'link': 'next_link',
-                        },
-                    }))
-            elif next_subtask == 'LoginEnterAlternateIdentifierSubtask':
-                next_subtask = self._call_login_api(
-                    'Submitting alternate identifier', headers,
-                    data=build_login_json(input_dict(next_subtask, self._get_tfa_info(
-                        'one of username, phone number or email that was not used as --username'))))
-            elif next_subtask == 'LoginEnterPassword':
-                next_subtask = self._call_login_api(
-                    'Submitting password', headers, data=build_login_json({
-                        'subtask_id': next_subtask,
-                        'enter_password': {
-                            'password': password,
-                            'link': 'next_link',
-                        },
-                    }))
-            elif next_subtask == 'AccountDuplicationCheck':
-                next_subtask = self._call_login_api(
-                    'Submitting account duplication check', headers, data=build_login_json({
-                        'subtask_id': next_subtask,
-                        'check_logged_in_account': {
-                            'link': 'AccountDuplicationCheck_false',
-                        },
-                    }))
-            elif next_subtask == 'LoginTwoFactorAuthChallenge':
-                next_subtask = self._call_login_api(
-                    'Submitting 2FA token', headers, data=build_login_json(input_dict(
-                        next_subtask, self._get_tfa_info('two-factor authentication token'))))
-            elif next_subtask == 'LoginAcid':
-                next_subtask = self._call_login_api(
-                    'Submitting confirmation code', headers, data=build_login_json(input_dict(
-                        next_subtask, self._get_tfa_info('confirmation code sent to your email or phone'))))
-            elif next_subtask == 'ArkoseLogin':
-                self.raise_login_required('Twitter is requiring captcha for this login attempt', method='cookies')
-            elif next_subtask == 'DenyLoginSubtask':
-                self.raise_login_required('Twitter rejected this login attempt as suspicious', method='cookies')
-            elif next_subtask == 'LoginSuccessSubtask':
-                raise ExtractorError('Twitter API did not grant auth token cookie')
-            else:
-                raise ExtractorError(f'Unrecognized subtask ID "{next_subtask}"')
-
-        self.report_login()

     def _call_api(self, path, video_id, query={}, graphql=False):
         headers = self._set_base_headers(legacy=not graphql and self._selected_api == 'legacy')
         headers.update({
@@ -416,6 +231,7 @@ class TwitterCardIE(InfoExtractor):
             'live_status': 'not_live',
         },
         'add_ie': ['Youtube'],
+        'skip': 'The page does not exist',
     },
     {
         'url': 'https://twitter.com/i/videos/tweet/705235433198714880',
@@ -617,6 +433,7 @@ class TwitterIE(TwitterBaseIE):
             'comment_count': int,
             '_old_archive_ids': ['twitter 852138619213144067'],
         },
+        'skip': 'Suspended',
     }, {
         'url': 'https://twitter.com/i/web/status/910031516746514432',
         'info_dict': {
@@ -763,10 +580,10 @@ class TwitterIE(TwitterBaseIE):
         'url': 'https://twitter.com/UltimaShadowX/status/1577719286659006464',
         'info_dict': {
             'id': '1577719286659006464',
-            'title': 'Ultima - Test',
+            'title': r're:Ultima.* - Test$',
             'description': 'Test https://t.co/Y3KEZD7Dad',
             'channel_id': '168922496',
-            'uploader': 'Ultima',
+            'uploader': r're:Ultima.*',
             'uploader_id': 'UltimaShadowX',
             'uploader_url': 'https://twitter.com/UltimaShadowX',
             'upload_date': '20221005',
@@ -895,11 +712,12 @@ class TwitterIE(TwitterBaseIE):
             'uploader': r're:Monique Camarra.+?',
             'uploader_id': 'MoniqueCamarra',
             'live_status': 'was_live',
-            'release_timestamp': 1658417414,
+            'release_timestamp': 1658417305,
             'description': r're:Twitter Space participated by Sergej Sumlenny.+',
             'timestamp': 1658407771,
             'release_date': '20220721',
             'upload_date': '20220721',
+            'thumbnail': 'https://pbs.twimg.com/profile_images/1920514378006188033/xQs6J_yI_400x400.jpg',
         },
         'add_ie': ['TwitterSpaces'],
         'params': {'skip_download': 'm3u8'},
@@ -1010,10 +828,10 @@ class TwitterIE(TwitterBaseIE):
             'description': 'This is a genius ad by Apple. \U0001f525\U0001f525\U0001f525\U0001f525\U0001f525 https://t.co/cNsA0MoOml',
             'thumbnail': 'https://pbs.twimg.com/ext_tw_video_thumb/1600009362759733248/pu/img/XVhFQivj75H_YxxV.jpg?name=orig',
             'age_limit': 0,
-            'uploader': 'Boy Called Mün',
+            'uploader': 'D U N I Y A',
             'repost_count': int,
             'upload_date': '20221206',
-            'title': 'Boy Called Mün - This is a genius ad by Apple. \U0001f525\U0001f525\U0001f525\U0001f525\U0001f525',
+            'title': 'D U N I Y A - This is a genius ad by Apple. \U0001f525\U0001f525\U0001f525\U0001f525\U0001f525',
             'comment_count': int,
             'like_count': int,
             'tags': [],
@@ -1068,6 +886,7 @@ class TwitterIE(TwitterBaseIE):
             'comment_count': int,
             '_old_archive_ids': ['twitter 1695424220702888009'],
         },
+        'skip': 'Suspended',
     }, {
         # retweeted_status w/ legacy API
         'url': 'https://twitter.com/playstrumpcard/status/1695424220702888009',
@@ -1092,6 +911,7 @@ class TwitterIE(TwitterBaseIE):
             '_old_archive_ids': ['twitter 1695424220702888009'],
         },
         'params': {'extractor_args': {'twitter': {'api': ['legacy']}}},
+        'skip': 'Suspended',
     }, {
         # Broadcast embedded in tweet
         'url': 'https://twitter.com/JessicaDobsonWX/status/1731121063248175384',
@@ -1135,7 +955,6 @@ class TwitterIE(TwitterBaseIE):
     }, {
         # "stale tweet" with typename "TweetWithVisibilityResults"
         'url': 'https://twitter.com/RobertKennedyJr/status/1724884212803834154',
-        'md5': '511377ff8dfa7545307084dca4dce319',
         'info_dict': {
             'id': '1724883339285544960',
             'ext': 'mp4',
@@ -1182,6 +1001,30 @@ class TwitterIE(TwitterBaseIE):
             'age_limit': 0,
             '_old_archive_ids': ['twitter 1790637656616943991'],
         },
+    }, {
+        # unified_card with 2 items of type video and photo
+        'url': 'https://x.com/TopHeroes_/status/2001950365332455490',
+        'info_dict': {
+            'id': '2001841416071450628',
+            'ext': 'mp4',
+            'display_id': '2001950365332455490',
+            'title': 'Top Heroes - Forgot to close My heroes solo level up in my phone ✨Unlock the fog,...',
+            'description': r're:Forgot to close My heroes solo level up in my phone ✨Unlock the fog.+',
+            'uploader': 'Top Heroes',
+            'uploader_id': 'TopHeroes_',
+            'uploader_url': 'https://twitter.com/TopHeroes_',
+            'channel_id': '1737324725620326400',
+            'comment_count': int,
+            'like_count': int,
+            'repost_count': int,
+            'age_limit': 0,
+            'duration': 30.278,
+            'thumbnail': 'https://pbs.twimg.com/amplify_video_thumb/2001841416071450628/img/hpy5KpJh4pO17b65.jpg?name=orig',
+            'tags': [],
+            'timestamp': 1766137136,
+            'upload_date': '20251219',
+            '_old_archive_ids': ['twitter 2001950365332455490'],
+        },
     }, {
         # onion route
         'url': 'https://twitter3e4tixl4xyajtrzo62zg5vztmjuricljdp2c5kshju4avyoid.onion/TwitterBlue/status/1484226494708662273',
@@ -1422,14 +1265,14 @@ class TwitterIE(TwitterBaseIE):
         if not card:
             return
-        self.write_debug(f'Extracting from card info: {card.get("url")}')
+        card_name = card['name'].split(':')[-1]
+        self.write_debug(f'Extracting from {card_name} card info: {card.get("url")}')
         binding_values = card['binding_values']

         def get_binding_value(k):
             o = binding_values.get(k) or {}
             return try_get(o, lambda x: x[x['type'].lower() + '_value'])

-        card_name = card['name'].split(':')[-1]
         if card_name == 'player':
             yield {
                 '_type': 'url',
@@ -1461,7 +1304,7 @@ class TwitterIE(TwitterBaseIE):
         elif card_name == 'unified_card':
             unified_card = self._parse_json(get_binding_value('unified_card'), twid)
             yield from map(extract_from_video_info, traverse_obj(
-                unified_card, ('media_entities', ...), expected_type=dict))
+                unified_card, ('media_entities', lambda _, v: v['type'] == 'video')))
         # amplify, promo_video_website, promo_video_convo, appplayer,
         # video_direct_message, poll2choice_video, poll3choice_video,
         # poll4choice_video, ...
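
The reworked traversal above swaps the blanket `expected_type=dict` (which let photo entries through to the video path) for a predicate branch that keeps only entries whose `type` is `video`. A minimal sketch, assuming yt-dlp's traverse_obj semantics, where a `lambda _, v` callable is applied per (key, value) pair and non-matching entries are dropped; the sample card below is invented:

    from yt_dlp.utils.traversal import traverse_obj

    unified_card = {'media_entities': [
        {'type': 'photo', 'media_url_https': 'https://example.invalid/photo.jpg'},
        {'type': 'video', 'id_str': '2001841416071450628'},
    ]}

    videos = traverse_obj(unified_card, ('media_entities', lambda _, v: v['type'] == 'video'))
    assert videos == [{'type': 'video', 'id_str': '2001841416071450628'}]
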

Some files were not shown because too many files have changed in this diff.