Mirror of https://github.com/yt-dlp/yt-dlp.git (synced 2026-01-13 02:11:18 +00:00)

Compare commits: 2025.02.19 ... 2025.03.21 (58 commits)
| SHA1 |
|---|
| b4488a9e12 |
| f36e4b6e65 |
| 983095485c |
| bbada3ec07 |
| 8305df0001 |
| 7223d29569 |
| f5fb2229e6 |
| 89a68c4857 |
| 9b868518a1 |
| 2ee3a0aff9 |
| 01a8be4c23 |
| ebac65aa9e |
| 4815dac131 |
| 95f8df2f79 |
| e67d786c7c |
| d9a53cc1e6 |
| 83b119dadb |
| 06f6de78db |
| 3380febe99 |
| be0d819e11 |
| df9ebeec00 |
| 17504f2535 |
| 4432a9390c |
| 05c8023a27 |
| bd0a668169 |
| b8b4754704 |
| 9d70abe4de |
| 8eb9c1bf3b |
| 42b7440963 |
| 172d5fcd77 |
| 7d18fed8f1 |
| 79ec2fdff7 |
| 3042afb5fe |
| ad60137c14 |
| 0bb3978862 |
| 7508e34f20 |
| 9807181cfb |
| 7126b47260 |
| eb1417786a |
| 6933f5670c |
| 26a502fc72 |
| 652827d5a0 |
| 0e1697232f |
| 9f77e04c76 |
| c034d65548 |
| 480125560a |
| a59abe0636 |
| a90641c836 |
| 65c3c58c0a |
| 99ea297875 |
| 6deeda5c11 |
| 7f3006eb0c |
| 4445f37a7a |
| 3a1583ca75 |
| a3e0c7d3b2 |
| f7a1f2d813 |
| 9deed13d7c |
| c2e6e1d5f7 |
CONTRIBUTORS (15 changes)

@@ -742,3 +742,18 @@ lfavole
 mp3butcher
 slipinthedove
 YoshiTabletopGamer
+Arc8ne
+benfaerber
+chrisellsworth
+fries1234
+Kenshin9977
+MichaelDeBoey
+msikma
+pedro
+pferreir
+red-acid
+refack
+rysson
+somini
+thedenv
+vallovic
Changelog.md (73 changes)

@@ -4,6 +4,79 @@
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
-->

### 2025.03.21

#### Core changes
- [Fix external downloader availability when using `--ffmpeg-location`](https://github.com/yt-dlp/yt-dlp/commit/9f77e04c76e36e1cbbf49bc9eb385fa6ef804b67) ([#12318](https://github.com/yt-dlp/yt-dlp/issues/12318)) by [Kenshin9977](https://github.com/Kenshin9977)
- [Load plugins on demand](https://github.com/yt-dlp/yt-dlp/commit/4445f37a7a66b248dbd8376c43137e6e441f138e) ([#11305](https://github.com/yt-dlp/yt-dlp/issues/11305)) by [coletdjnz](https://github.com/coletdjnz), [Grub4K](https://github.com/Grub4K), [pukkandan](https://github.com/pukkandan) (With fixes in [c034d65](https://github.com/yt-dlp/yt-dlp/commit/c034d655487be668222ef9476a16f374584e49a7))
- [Support emitting ConEmu progress codes](https://github.com/yt-dlp/yt-dlp/commit/f7a1f2d8132967a62b0f6d5665c6d2dde2d42c09) ([#10649](https://github.com/yt-dlp/yt-dlp/issues/10649)) by [Grub4K](https://github.com/Grub4K)

#### Extractor changes
- **azmedien**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/26a502fc727d0e91b2db6bf4a112823bcc672e85) ([#12375](https://github.com/yt-dlp/yt-dlp/issues/12375)) by [goggle](https://github.com/goggle)
- **bilibiliplaylist**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f5fb2229e66cf59d5bf16065bc041b42a28354a0) ([#12690](https://github.com/yt-dlp/yt-dlp/issues/12690)) by [bashonly](https://github.com/bashonly)
- **bunnycdn**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/3a1583ca75fb523cbad0e5e174387ea7b477d175) ([#11586](https://github.com/yt-dlp/yt-dlp/issues/11586)) by [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
- **canalsurmas**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/01a8be4c23f186329d85f9c78db34a55f3294ac5) ([#12497](https://github.com/yt-dlp/yt-dlp/issues/12497)) by [Arc8ne](https://github.com/Arc8ne)
- **cda**: [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/be0d819e1103195043f6743650781f0d4d343f6d) ([#12552](https://github.com/yt-dlp/yt-dlp/issues/12552)) by [rysson](https://github.com/rysson)
- **cultureunplugged**: [Extend `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/3042afb5fe342d3a00de76704cd7de611acc350e) ([#12486](https://github.com/yt-dlp/yt-dlp/issues/12486)) by [seproDev](https://github.com/seproDev)
- **dailymotion**: [Improve embed detection](https://github.com/yt-dlp/yt-dlp/commit/ad60137c141efa5023fbc0ac8579eaefe8b3d8cc) ([#12464](https://github.com/yt-dlp/yt-dlp/issues/12464)) by [seproDev](https://github.com/seproDev)
- **gem.cbc.ca**: [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/eb1417786a3027b1e7290ec37ef6aaece50ebed0) ([#12414](https://github.com/yt-dlp/yt-dlp/issues/12414)) by [bashonly](https://github.com/bashonly)
- **globo**: [Fix subtitles extraction](https://github.com/yt-dlp/yt-dlp/commit/0e1697232fcbba7551f983fd1ba93bb445cbb08b) ([#12270](https://github.com/yt-dlp/yt-dlp/issues/12270)) by [pedro](https://github.com/pedro)
- **instagram**
    - [Add `app_id` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/a90641c8363fa0c10800b36eb6b01ee22d3a9409) ([#12359](https://github.com/yt-dlp/yt-dlp/issues/12359)) by [chrisellsworth](https://github.com/chrisellsworth)
    - [Fix extraction of older private posts](https://github.com/yt-dlp/yt-dlp/commit/a59abe0636dc49b22a67246afe35613571b86f05) ([#12451](https://github.com/yt-dlp/yt-dlp/issues/12451)) by [bashonly](https://github.com/bashonly)
    - [Improve error handling](https://github.com/yt-dlp/yt-dlp/commit/480125560a3b9972d29ae0da850aba8109e6bd41) ([#12410](https://github.com/yt-dlp/yt-dlp/issues/12410)) by [bashonly](https://github.com/bashonly)
    - story: [Support `--no-playlist`](https://github.com/yt-dlp/yt-dlp/commit/65c3c58c0a67463a150920203cec929045c95a24) ([#12397](https://github.com/yt-dlp/yt-dlp/issues/12397)) by [fireattack](https://github.com/fireattack)
- **jamendo**: [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/89a68c4857ddbaf937ff22f12648baaf6b5af840) ([#12622](https://github.com/yt-dlp/yt-dlp/issues/12622)) by [bashonly](https://github.com/bashonly), [JChris246](https://github.com/JChris246)
- **ketnet**: [Remove extractor](https://github.com/yt-dlp/yt-dlp/commit/bbada3ec0779422cde34f1ce3dcf595da463b493) ([#12628](https://github.com/yt-dlp/yt-dlp/issues/12628)) by [MichaelDeBoey](https://github.com/MichaelDeBoey)
- **lbry**
    - [Make m3u8 format extraction non-fatal](https://github.com/yt-dlp/yt-dlp/commit/9807181cfbf87bfa732f415c30412bdbd77cbf81) ([#12463](https://github.com/yt-dlp/yt-dlp/issues/12463)) by [bashonly](https://github.com/bashonly)
    - [Raise appropriate error for non-media files](https://github.com/yt-dlp/yt-dlp/commit/7126b472601814b7fd8c9de02069e8fff1764891) ([#12462](https://github.com/yt-dlp/yt-dlp/issues/12462)) by [bashonly](https://github.com/bashonly)
- **loco**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/983095485c731240aae27c950cb8c24a50827b56) ([#12667](https://github.com/yt-dlp/yt-dlp/issues/12667)) by [DTrombett](https://github.com/DTrombett)
- **magellantv**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/172d5fcd778bf2605db7647ebc56b29ed18d24ac) ([#12505](https://github.com/yt-dlp/yt-dlp/issues/12505)) by [seproDev](https://github.com/seproDev)
- **mitele**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/7223d29569a48a35ad132a508c115973866838d3) ([#12689](https://github.com/yt-dlp/yt-dlp/issues/12689)) by [bashonly](https://github.com/bashonly)
- **msn**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/4815dac131d42c51e12c1d05232db0bbbf607329) ([#12513](https://github.com/yt-dlp/yt-dlp/issues/12513)) by [seproDev](https://github.com/seproDev), [thedenv](https://github.com/thedenv)
- **n1**: [Fix extraction of newer articles](https://github.com/yt-dlp/yt-dlp/commit/9d70abe4de401175cbbaaa36017806f16b2df9af) ([#12514](https://github.com/yt-dlp/yt-dlp/issues/12514)) by [u-spec-png](https://github.com/u-spec-png)
- **nbcstations**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/ebac65aa9e0bf9a97c24d00f7977900d2577364b) ([#12534](https://github.com/yt-dlp/yt-dlp/issues/12534)) by [refack](https://github.com/refack)
- **niconico**
    - [Fix format sorting](https://github.com/yt-dlp/yt-dlp/commit/7508e34f203e97389f1d04db92140b13401dd724) ([#12442](https://github.com/yt-dlp/yt-dlp/issues/12442)) by [xpadev-net](https://github.com/xpadev-net)
    - live: [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/c2e6e1d5f77f3b720a6266f2869eb750d20e5dc1) ([#12419](https://github.com/yt-dlp/yt-dlp/issues/12419)) by [bashonly](https://github.com/bashonly)
- **openrec**: [Fix `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/17504f253564cfad86244de2b6346d07d2300ca5) ([#12608](https://github.com/yt-dlp/yt-dlp/issues/12608)) by [fireattack](https://github.com/fireattack)
- **pinterest**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/bd0a66816934de70312eea1e71c59c13b401dc3a) ([#12538](https://github.com/yt-dlp/yt-dlp/issues/12538)) by [mikf](https://github.com/mikf)
- **playsuisse**: [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/6933f5670cea9c3e2fb16c1caa1eda54d13122c5) ([#12444](https://github.com/yt-dlp/yt-dlp/issues/12444)) by [bashonly](https://github.com/bashonly)
- **reddit**: [Truncate title](https://github.com/yt-dlp/yt-dlp/commit/d9a53cc1e6fd912daf500ca4f19e9ca88994dbf9) ([#12567](https://github.com/yt-dlp/yt-dlp/issues/12567)) by [seproDev](https://github.com/seproDev)
- **rtp**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/8eb9c1bf3b9908cca22ef043602aa24fb9f352c6) ([#11638](https://github.com/yt-dlp/yt-dlp/issues/11638)) by [pferreir](https://github.com/pferreir), [red-acid](https://github.com/red-acid), [seproDev](https://github.com/seproDev), [somini](https://github.com/somini), [vallovic](https://github.com/vallovic)
- **softwhiteunderbelly**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/652827d5a076c9483c36654ad2cf3fe46219baf4) ([#12281](https://github.com/yt-dlp/yt-dlp/issues/12281)) by [benfaerber](https://github.com/benfaerber)
- **soop**: [Fix timestamp extraction](https://github.com/yt-dlp/yt-dlp/commit/8305df00012ff8138a6ff95279d06b54ac607f63) ([#12609](https://github.com/yt-dlp/yt-dlp/issues/12609)) by [msikma](https://github.com/msikma)
- **soundcloud**
    - [Extract tags](https://github.com/yt-dlp/yt-dlp/commit/9deed13d7cce6d3647379e50589c92de89227509) ([#12420](https://github.com/yt-dlp/yt-dlp/issues/12420)) by [bashonly](https://github.com/bashonly)
    - [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/6deeda5c11f34f613724fa0627879f0d607ba1b4) ([#12447](https://github.com/yt-dlp/yt-dlp/issues/12447)) by [bashonly](https://github.com/bashonly)
- **tiktok**
    - [Improve error handling](https://github.com/yt-dlp/yt-dlp/commit/99ea2978757a431eeb2a265b3395ccbe4ce202cf) ([#12445](https://github.com/yt-dlp/yt-dlp/issues/12445)) by [bashonly](https://github.com/bashonly)
    - [Truncate title](https://github.com/yt-dlp/yt-dlp/commit/83b119dadb0f267f1fb66bf7ed74c097349de79e) ([#12566](https://github.com/yt-dlp/yt-dlp/issues/12566)) by [seproDev](https://github.com/seproDev)
- **tv8.it**: [Add live and playlist extractors](https://github.com/yt-dlp/yt-dlp/commit/2ee3a0aff9be2be3bea60640d3d8a0febaf0acb6) ([#12569](https://github.com/yt-dlp/yt-dlp/issues/12569)) by [DTrombett](https://github.com/DTrombett)
- **tvw**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/42b7440963866e31ff84a5b89030d1c596fa2e6e) ([#12271](https://github.com/yt-dlp/yt-dlp/issues/12271)) by [fries1234](https://github.com/fries1234)
- **twitter**
    - [Fix syndication token generation](https://github.com/yt-dlp/yt-dlp/commit/b8b47547049f5ebc3dd680fc7de70ed0ca9c0d70) ([#12537](https://github.com/yt-dlp/yt-dlp/issues/12537)) by [bashonly](https://github.com/bashonly)
    - [Truncate title](https://github.com/yt-dlp/yt-dlp/commit/06f6de78db2eceeabd062ab1a3023e0ff9d4df53) ([#12560](https://github.com/yt-dlp/yt-dlp/issues/12560)) by [seproDev](https://github.com/seproDev)
- **vk**: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/05c8023a27dd37c49163c0498bf98e3e3c1cb4b9) ([#12510](https://github.com/yt-dlp/yt-dlp/issues/12510)) by [seproDev](https://github.com/seproDev)
- **vrtmax**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/df9ebeec00d658693252978d1ffb885e67aa6ab6) ([#12479](https://github.com/yt-dlp/yt-dlp/issues/12479)) by [bergoid](https://github.com/bergoid), [MichaelDeBoey](https://github.com/MichaelDeBoey), [seproDev](https://github.com/seproDev)
- **weibo**: [Support playlists](https://github.com/yt-dlp/yt-dlp/commit/0bb39788626002a8a67e925580227952c563c8b9) ([#12284](https://github.com/yt-dlp/yt-dlp/issues/12284)) by [4ft35t](https://github.com/4ft35t)
- **wsj**: [Support opinion URLs and impersonation](https://github.com/yt-dlp/yt-dlp/commit/7f3006eb0c0659982bb956d71b0bc806bcb0a5f2) ([#12431](https://github.com/yt-dlp/yt-dlp/issues/12431)) by [refack](https://github.com/refack)
- **youtube**
    - [Fix nsig and signature extraction for player `643afba4`](https://github.com/yt-dlp/yt-dlp/commit/9b868518a15599f3d7ef5a1c730dda164c30da9b) ([#12684](https://github.com/yt-dlp/yt-dlp/issues/12684)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
    - [Player client maintenance](https://github.com/yt-dlp/yt-dlp/commit/3380febe9984c21c79c3147c1d390a4cf339bc4c) ([#12603](https://github.com/yt-dlp/yt-dlp/issues/12603)) by [seproDev](https://github.com/seproDev)
    - [Split into package](https://github.com/yt-dlp/yt-dlp/commit/4432a9390c79253ac830702b226d2e558b636725) ([#12557](https://github.com/yt-dlp/yt-dlp/issues/12557)) by [coletdjnz](https://github.com/coletdjnz)
    - [Warn on DRM formats](https://github.com/yt-dlp/yt-dlp/commit/e67d786c7cc87bd449d22e0ddef08306891c1173) ([#12593](https://github.com/yt-dlp/yt-dlp/issues/12593)) by [coletdjnz](https://github.com/coletdjnz)
    - [Warn on missing formats due to SSAP](https://github.com/yt-dlp/yt-dlp/commit/79ec2fdff75c8c1bb89b550266849ad4dec48dd3) ([#12483](https://github.com/yt-dlp/yt-dlp/issues/12483)) by [coletdjnz](https://github.com/coletdjnz)

#### Networking changes
- [Add `keep_header_casing` extension](https://github.com/yt-dlp/yt-dlp/commit/7d18fed8f1983fe6de4ddc810dfb2761ba5744ac) ([#11652](https://github.com/yt-dlp/yt-dlp/issues/11652)) by [coletdjnz](https://github.com/coletdjnz), [Grub4K](https://github.com/Grub4K)
- [Always add unsupported suffix on version mismatch](https://github.com/yt-dlp/yt-dlp/commit/95f8df2f796d0048119615200758199aedcd7cf4) ([#12626](https://github.com/yt-dlp/yt-dlp/issues/12626)) by [Grub4K](https://github.com/Grub4K)

#### Misc. changes
- **cleanup**: Miscellaneous: [f36e4b6](https://github.com/yt-dlp/yt-dlp/commit/f36e4b6e65cb8403791aae2f520697115cb88dec) by [dirkf](https://github.com/dirkf), [gamer191](https://github.com/gamer191), [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
- **test**: [Show all differences for `expect_value` and `expect_dict`](https://github.com/yt-dlp/yt-dlp/commit/a3e0c7d3b267abdf3933b709704a28d43bb46503) ([#12334](https://github.com/yt-dlp/yt-dlp/issues/12334)) by [Grub4K](https://github.com/Grub4K)

### 2025.02.19

#### Core changes
README.md (14 changes)

@@ -337,10 +337,11 @@ If you fork the project on GitHub, you can run your fork's [build workflow](.git
     --plugin-dirs PATH              Path to an additional directory to search
                                     for plugins. This option can be used
                                     multiple times to add multiple directories.
-                                    Note that this currently only works for
-                                    extractor plugins; postprocessor plugins can
-                                    only be loaded from the default plugin
-                                    directories
+                                    Use "default" to search the default plugin
+                                    directories (default)
+    --no-plugin-dirs                Clear plugin directories to search,
+                                    including defaults and those provided by
+                                    previous --plugin-dirs
     --flat-playlist                 Do not extract a playlist's URL result
                                     entries; some entry metadata may be missing
                                     and downloading may be bypassed

@@ -1768,7 +1769,7 @@ The following extractors use this feature:
 #### youtube
 * `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
 * `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
-* `player_client`: Clients to extract video data from. The main clients are `web`, `ios` and `android`, with variants `_music` and `_creator` (e.g. `ios_creator`); and `mweb`, `android_vr`, `web_safari`, `web_embedded`, `tv` and `tv_embedded` with no variants. By default, `tv,ios,web` is used, or `tv,web` is used when authenticating with cookies. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as the `_creator` variants, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
+* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_vr`, `tv` and `tv_embedded`. By default, `tv,ios,web` is used, or `tv,web` is used when authenticating with cookies. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
 * `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details
 * `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
 * `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
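The extractor-args documented above can also be supplied through the embedding API. A minimal sketch (the video URL is a placeholder, not taken from the diff):

```python
import yt_dlp

# Equivalent of: yt-dlp --extractor-args "youtube:player_client=default,-ios"
ydl_opts = {
    'extractor_args': {'youtube': {'player_client': ['default', '-ios']}},
}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    # download=False extracts metadata only
    info = ydl.extract_info('https://www.youtube.com/watch?v=<id>', download=False)
```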
@@ -1811,6 +1812,9 @@ The following extractors use this feature:
 * `vcodec`: vcodec to ignore - one or more of `h264`, `h265`, `dvh265`
 * `dr`: dynamic range to ignore - one or more of `sdr`, `hdr10`, `dv`
 
+#### instagram
+* `app_id`: The value of the `X-IG-App-ID` header used for API requests. Default is the web app ID, `936619743392459`
+
 #### niconicochannelplus
 * `max_comments`: Maximum number of comments to extract - default is `120`
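The new `app_id` argument uses the same mechanism; a sketch of the API equivalent (the value shown is the documented default):

```python
# Equivalent of: yt-dlp --extractor-args "instagram:app_id=936619743392459"
ydl_opts = {
    'extractor_args': {'instagram': {'app_id': ['936619743392459']}},
}
```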
devscripts/make_lazy_extractors.py

@@ -10,6 +10,9 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from inspect import getsource
 
 from devscripts.utils import get_filename_args, read_file, write_file
+from yt_dlp.extractor import import_extractors
+from yt_dlp.extractor.common import InfoExtractor, SearchInfoExtractor
+from yt_dlp.globals import extractors
 
 NO_ATTR = object()
 STATIC_CLASS_PROPERTIES = [

@@ -38,8 +41,7 @@ def main():
 
     lazy_extractors_filename = get_filename_args(default_outfile='yt_dlp/extractor/lazy_extractors.py')
 
-    from yt_dlp.extractor.extractors import _ALL_CLASSES
-    from yt_dlp.extractor.common import InfoExtractor, SearchInfoExtractor
+    import_extractors()
 
     DummyInfoExtractor = type('InfoExtractor', (InfoExtractor,), {'IE_NAME': NO_ATTR})
     module_src = '\n'.join((

@@ -47,7 +49,7 @@ def main():
         '    _module = None',
         *extra_ie_code(DummyInfoExtractor),
         '\nclass LazyLoadSearchExtractor(LazyLoadExtractor):\n    pass\n',
-        *build_ies(_ALL_CLASSES, (InfoExtractor, SearchInfoExtractor), DummyInfoExtractor),
+        *build_ies(list(extractors.value.values()), (InfoExtractor, SearchInfoExtractor), DummyInfoExtractor),
     ))
 
     write_file(lazy_extractors_filename, f'{module_src}\n')

@@ -73,7 +75,7 @@ def build_ies(ies, bases, attr_base):
         if ie in ies:
             names.append(ie.__name__)
 
-    yield f'\n_ALL_CLASSES = [{", ".join(names)}]'
+    yield '\n_CLASS_LOOKUP = {%s}' % ', '.join(f'{name!r}: {name}' for name in names)
 
 
 def sort_ies(ies, ignored_bases):
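With the `build_ies` change, the generated `yt_dlp/extractor/lazy_extractors.py` ends in a name-to-class mapping rather than the old flat `_ALL_CLASSES` list. The emitted line has roughly this shape (extractor names illustrative):

```python
# Shape of the line generated by make_lazy_extractors.py (names are examples)
_CLASS_LOOKUP = {'YoutubeIE': YoutubeIE, 'GenericIE': GenericIE}
```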
pyproject.toml

@@ -76,7 +76,7 @@ dev = [
 ]
 static-analysis = [
     "autopep8~=2.0",
-    "ruff~=0.9.0",
+    "ruff~=0.11.0",
 ]
 test = [
     "pytest~=8.1",

@@ -384,9 +384,14 @@ select = [
     "W391",
     "W504",
 ]
 exclude = "*/extractor/lazy_extractors.py,*venv*,*/test/testdata/sigs/player-*.js,.idea,.vscode"
 
 [tool.pytest.ini_options]
-addopts = "-ra -v --strict-markers"
+addopts = [
+    "-ra",  # summary: all except passed
+    "--verbose",
+    "--strict-markers",
+]
 markers = [
     "download",
 ]
supportedsites.md

@@ -224,6 +224,7 @@ The only reliable way to check if a site is supported is to try it.
 - **bt:vestlendingen**: Bergens Tidende - Vestlendingen
 - **Bundesliga**
 - **Bundestag**
+- **BunnyCdn**
 - **BusinessInsider**
 - **BuzzFeed**
 - **BYUtv**: (**Currently broken**)

@@ -242,6 +243,7 @@ The only reliable way to check if a site is supported is to try it.
 - **CanalAlpha**
 - **canalc2.tv**
 - **Canalplus**: mycanal.fr and piwiplus.fr
+- **Canalsurmas**
 - **CaracolTvPlay**: [*caracoltv-play*](## "netrc machine")
 - **CartoonNetwork**
 - **cbc.ca**

@@ -609,10 +611,10 @@ The only reliable way to check if a site is supported is to try it.
 - **Inc**
 - **IndavideoEmbed**
 - **InfoQ**
-- **Instagram**: [*instagram*](## "netrc machine")
-- **instagram:story**: [*instagram*](## "netrc machine")
-- **instagram:tag**: [*instagram*](## "netrc machine") Instagram hashtag search URLs
-- **instagram:user**: [*instagram*](## "netrc machine") Instagram user profile (**Currently broken**)
+- **Instagram**
+- **instagram:story**
+- **instagram:tag**: Instagram hashtag search URLs
+- **instagram:user**: Instagram user profile (**Currently broken**)
 - **InstagramIOS**: IOS instagram:// URL
 - **Internazionale**
 - **InternetVideoArchive**

@@ -661,7 +663,6 @@ The only reliable way to check if a site is supported is to try it.
 - **KelbyOne**: (**Currently broken**)
 - **Kenh14Playlist**
 - **Kenh14Video**
-- **Ketnet**
 - **khanacademy**
 - **khanacademy:unit**
 - **kick:clips**

@@ -733,6 +734,7 @@ The only reliable way to check if a site is supported is to try it.
 - **Livestreamfails**
 - **Lnk**
 - **loc**: Library of Congress
+- **Loco**
 - **loom**
 - **loom:folder**
 - **LoveHomePorn**

@@ -831,7 +833,7 @@ The only reliable way to check if a site is supported is to try it.
 - **MoviewPlay**
 - **Moviezine**
 - **MovingImage**
-- **MSN**: (**Currently broken**)
+- **MSN**
 - **mtg**: MTG services
 - **mtv**
 - **mtv.de**: (**Currently broken**)

@@ -1342,6 +1344,7 @@ The only reliable way to check if a site is supported is to try it.
 - **Smotrim**
 - **SnapchatSpotlight**
 - **Snotr**
+- **SoftWhiteUnderbelly**: [*softwhiteunderbelly*](## "netrc machine")
 - **Sohu**
 - **SohuV**
 - **SonyLIV**: [*sonyliv*](## "netrc machine")

@@ -1536,6 +1539,8 @@ The only reliable way to check if a site is supported is to try it.
 - **tv5unis**
 - **tv5unis:video**
 - **tv8.it**
+- **tv8.it:live**: TV8 Live
+- **tv8.it:playlist**: TV8 Playlist
 - **TVANouvelles**
 - **TVANouvellesArticle**
 - **tvaplus**: TVA+

@@ -1556,6 +1561,7 @@ The only reliable way to check if a site is supported is to try it.
 - **tvp:vod:series**
 - **TVPlayer**
 - **TVPlayHome**
+- **Tvw**
 - **Tweakers**
 - **TwitCasting**
 - **TwitCastingLive**

@@ -1677,7 +1683,7 @@ The only reliable way to check if a site is supported is to try it.
 - **vqq:series**
 - **vqq:video**
 - **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza
-- **VrtNU**: [*vrtnu*](## "netrc machine") VRT MAX
+- **vrtmax**: [*vrtnu*](## "netrc machine") VRT MAX (formerly VRT NU)
 - **VTM**: (**Currently broken**)
 - **VTV**
 - **VTVGo**
test/helper.py (172 changes)

@@ -101,87 +101,109 @@ def getwebpagetestcases():
 md5 = lambda s: hashlib.md5(s.encode()).hexdigest()
 
 
-def expect_value(self, got, expected, field):
-    if isinstance(expected, str) and expected.startswith('re:'):
-        match_str = expected[len('re:'):]
-        match_rex = re.compile(match_str)
-
-        self.assertTrue(
-            isinstance(got, str),
-            f'Expected a {str.__name__} object, but got {type(got).__name__} for field {field}')
-        self.assertTrue(
-            match_rex.match(got),
-            f'field {field} (value: {got!r}) should match {match_str!r}')
-    elif isinstance(expected, str) and expected.startswith('startswith:'):
-        start_str = expected[len('startswith:'):]
-        self.assertTrue(
-            isinstance(got, str),
-            f'Expected a {str.__name__} object, but got {type(got).__name__} for field {field}')
-        self.assertTrue(
-            got.startswith(start_str),
-            f'field {field} (value: {got!r}) should start with {start_str!r}')
-    elif isinstance(expected, str) and expected.startswith('contains:'):
-        contains_str = expected[len('contains:'):]
-        self.assertTrue(
-            isinstance(got, str),
-            f'Expected a {str.__name__} object, but got {type(got).__name__} for field {field}')
-        self.assertTrue(
-            contains_str in got,
-            f'field {field} (value: {got!r}) should contain {contains_str!r}')
-    elif isinstance(expected, type):
-        self.assertTrue(
-            isinstance(got, expected),
-            f'Expected type {expected!r} for field {field}, but got value {got!r} of type {type(got)!r}')
-    elif isinstance(expected, dict) and isinstance(got, dict):
-        expect_dict(self, got, expected)
-    elif isinstance(expected, list) and isinstance(got, list):
-        self.assertEqual(
-            len(expected), len(got),
-            f'Expect a list of length {len(expected)}, but got a list of length {len(got)} for field {field}')
-        for index, (item_got, item_expected) in enumerate(zip(got, expected)):
-            type_got = type(item_got)
-            type_expected = type(item_expected)
-            self.assertEqual(
-                type_expected, type_got,
-                f'Type mismatch for list item at index {index} for field {field}, '
-                f'expected {type_expected!r}, got {type_got!r}')
-            expect_value(self, item_got, item_expected, field)
-    else:
-        if isinstance(expected, str) and expected.startswith('md5:'):
-            self.assertTrue(
-                isinstance(got, str),
-                f'Expected field {field} to be a unicode object, but got value {got!r} of type {type(got)!r}')
-            got = 'md5:' + md5(got)
-        elif isinstance(expected, str) and re.match(r'^(?:min|max)?count:\d+', expected):
-            self.assertTrue(
-                isinstance(got, (list, dict)),
-                f'Expected field {field} to be a list or a dict, but it is of type {type(got).__name__}')
-            op, _, expected_num = expected.partition(':')
-            expected_num = int(expected_num)
-            if op == 'mincount':
-                assert_func = assertGreaterEqual
-                msg_tmpl = 'Expected %d items in field %s, but only got %d'
-            elif op == 'maxcount':
-                assert_func = assertLessEqual
-                msg_tmpl = 'Expected maximum %d items in field %s, but got %d'
-            elif op == 'count':
-                assert_func = assertEqual
-                msg_tmpl = 'Expected exactly %d items in field %s, but got %d'
-            else:
-                assert False
-            assert_func(
-                self, len(got), expected_num,
-                msg_tmpl % (expected_num, field, len(got)))
-            return
-        self.assertEqual(
-            expected, got,
-            f'Invalid value for field {field}, expected {expected!r}, got {got!r}')
-
-
-def expect_dict(self, got_dict, expected_dict):
-    for info_field, expected in expected_dict.items():
-        got = got_dict.get(info_field)
-        expect_value(self, got, expected, info_field)
+def _iter_differences(got, expected, field):
+    if isinstance(expected, str):
+        op, _, val = expected.partition(':')
+        if op in ('mincount', 'maxcount', 'count'):
+            if not isinstance(got, (list, dict)):
+                yield field, f'expected either {list.__name__} or {dict.__name__}, got {type(got).__name__}'
+                return
+
+            expected_num = int(val)
+            got_num = len(got)
+            if op == 'mincount':
+                if got_num < expected_num:
+                    yield field, f'expected at least {val} items, got {got_num}'
+                return
+
+            if op == 'maxcount':
+                if got_num > expected_num:
+                    yield field, f'expected at most {val} items, got {got_num}'
+                return
+
+            assert op == 'count'
+            if got_num != expected_num:
+                yield field, f'expected exactly {val} items, got {got_num}'
+            return
+
+        if not isinstance(got, str):
+            yield field, f'expected {str.__name__}, got {type(got).__name__}'
+            return
+
+        if op == 're':
+            if not re.match(val, got):
+                yield field, f'should match {val!r}, got {got!r}'
+            return
+
+        if op == 'startswith':
+            if not got.startswith(val):
+                yield field, f'should start with {val!r}, got {got!r}'
+            return
+
+        if op == 'contains':
+            if val not in got:
+                yield field, f'should contain {val!r}, got {got!r}'
+            return
+
+        if op == 'md5':
+            hash_val = md5(got)
+            if hash_val != val:
+                yield field, f'expected hash {val}, got {hash_val}'
+            return
+
+        if got != expected:
+            yield field, f'expected {expected!r}, got {got!r}'
+        return
+
+    if isinstance(expected, dict) and isinstance(got, dict):
+        for key, expected_val in expected.items():
+            if key not in got:
+                yield field, f'missing key: {key!r}'
+                continue
+
+            field_name = key if field is None else f'{field}.{key}'
+            yield from _iter_differences(got[key], expected_val, field_name)
+        return
+
+    if isinstance(expected, type):
+        if not isinstance(got, expected):
+            yield field, f'expected {expected.__name__}, got {type(got).__name__}'
+        return
+
+    if isinstance(expected, list) and isinstance(got, list):
+        # TODO: clever diffing algorithm lmao
+        if len(expected) != len(got):
+            yield field, f'expected length of {len(expected)}, got {len(got)}'
+            return
+
+        for index, (got_val, expected_val) in enumerate(zip(got, expected)):
+            field_name = str(index) if field is None else f'{field}.{index}'
+            yield from _iter_differences(got_val, expected_val, field_name)
+        return
+
+    if got != expected:
+        yield field, f'expected {expected!r}, got {got!r}'
+
+
+def _expect_value(message, got, expected, field):
+    mismatches = list(_iter_differences(got, expected, field))
+    if not mismatches:
+        return
+
+    fields = [field for field, _ in mismatches if field is not None]
+    return ''.join((
+        message, f' ({", ".join(fields)})' if fields else '',
+        *(f'\n\t{field}: {message}' for field, message in mismatches)))
+
+
+def expect_value(self, got, expected, field):
+    if message := _expect_value('values differ', got, expected, field):
+        self.fail(message)
+
+
+def expect_dict(self, got_dict, expected_dict):
+    if message := _expect_value('dictionaries differ', got_dict, expected_dict, None):
+        self.fail(message)
 
 
 def sanitize_got_info_dict(got_dict):
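The string operators handled by `_iter_differences` (`re:`, `startswith:`, `contains:`, `md5:` and `count:`/`mincount:`/`maxcount:`) form a small matching DSL shared by `expect_value` and `expect_dict`. A minimal usage sketch (field names and values are illustrative; `self` is a `unittest.TestCase`):

```python
expect_dict(self, {
    'id': 'abc123',
    'title': 'Example video',
    'tags': ['music', 'live'],
}, {
    'id': str,                      # type check
    'title': 'startswith:Example',  # string operator
    'tags': 'mincount:1',           # length check for lists/dicts
})
```

With the rewrite, all mismatches are collected and reported in a single failure message instead of stopping at the first failing assertion.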
test/test_YoutubeDL.py

@@ -6,6 +6,8 @@ import sys
 import unittest
 from unittest.mock import patch
 
+from yt_dlp.globals import all_plugins_loaded
+
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

@@ -1427,6 +1429,12 @@ class TestYoutubeDL(unittest.TestCase):
         self.assertFalse(result.get('cookies'), msg='Cookies set in cookies field for wrong domain')
         self.assertFalse(ydl.cookiejar.get_cookie_header(fmt['url']), msg='Cookies set in cookiejar for wrong domain')
 
+    def test_load_plugins_compat(self):
+        # Should try to reload plugins if they haven't already been loaded
+        all_plugins_loaded.value = False
+        FakeYDL().close()
+        assert all_plugins_loaded.value
+
 
 if __name__ == '__main__':
     unittest.main()
||||
@@ -331,10 +331,6 @@ class TestHTTPConnectProxy:
|
||||
assert proxy_info['proxy'] == server_address
|
||||
assert 'Proxy-Authorization' in proxy_info['headers']
|
||||
|
||||
@pytest.mark.skip_handler(
|
||||
'Requests',
|
||||
'bug in urllib3 causes unclosed socket: https://github.com/urllib3/urllib3/issues/3374',
|
||||
)
|
||||
def test_http_connect_bad_auth(self, handler, ctx):
|
||||
with ctx.http_server(HTTPConnectProxyHandler, username='test', password='test') as server_address:
|
||||
with handler(verify=False, proxies={ctx.REQUEST_PROTO: f'http://test:bad@{server_address}'}) as rh:
|
||||
|
||||
@@ -384,7 +384,7 @@ class TestJSInterpreter(unittest.TestCase):
|
||||
@unittest.skip('Not implemented')
|
||||
def test_packed(self):
|
||||
jsi = JSInterpreter('''function f(p,a,c,k,e,d){while(c--)if(k[c])p=p.replace(new RegExp('\\b'+c.toString(a)+'\\b','g'),k[c]);return p}''')
|
||||
self.assertEqual(jsi.call_function('f', '''h 7=g("1j");7.7h({7g:[{33:"w://7f-7e-7d-7c.v.7b/7a/79/78/77/76.74?t=73&s=2s&e=72&f=2t&71=70.0.0.1&6z=6y&6x=6w"}],6v:"w://32.v.u/6u.31",16:"r%",15:"r%",6t:"6s",6r:"",6q:"l",6p:"l",6o:"6n",6m:\'6l\',6k:"6j",9:[{33:"/2u?b=6i&n=50&6h=w://32.v.u/6g.31",6f:"6e"}],1y:{6d:1,6c:\'#6b\',6a:\'#69\',68:"67",66:30,65:r,},"64":{63:"%62 2m%m%61%5z%5y%5x.u%5w%5v%5u.2y%22 2k%m%1o%22 5t%m%1o%22 5s%m%1o%22 2j%m%5r%22 16%m%5q%22 15%m%5p%22 5o%2z%5n%5m%2z",5l:"w://v.u/d/1k/5k.2y",5j:[]},\'5i\':{"5h":"5g"},5f:"5e",5d:"w://v.u",5c:{},5b:l,1x:[0.25,0.50,0.75,1,1.25,1.5,2]});h 1m,1n,5a;h 59=0,58=0;h 7=g("1j");h 2x=0,57=0,56=0;$.55({54:{\'53-52\':\'2i-51\'}});7.j(\'4z\',6(x){c(5>0&&x.1l>=5&&1n!=1){1n=1;$(\'q.4y\').4x(\'4w\')}});7.j(\'13\',6(x){2x=x.1l});7.j(\'2g\',6(x){2w(x)});7.j(\'4v\',6(){$(\'q.2v\').4u()});6 2w(x){$(\'q.2v\').4t();c(1m)19;1m=1;17=0;c(4s.4r===l){17=1}$.4q(\'/2u?b=4p&2l=1k&4o=2t-4n-4m-2s-4l&4k=&4j=&4i=&17=\'+17,6(2r){$(\'#4h\').4g(2r)});$(\'.3-8-4f-4e:4d("4c")\').2h(6(e){2q();g().4b(0);g().4a(l)});6 2q(){h $14=$("<q />").2p({1l:"49",16:"r%",15:"r%",48:0,2n:0,2o:47,46:"45(10%, 10%, 10%, 0.4)","44-43":"42"});$("<41 />").2p({16:"60%",15:"60%",2o:40,"3z-2n":"3y"}).3x({\'2m\':\'/?b=3w&2l=1k\',\'2k\':\'0\',\'2j\':\'2i\'}).2f($14);$14.2h(6(){$(3v).3u();g().2g()});$14.2f($(\'#1j\'))}g().13(0);}6 3t(){h 9=7.1b(2e);2d.2c(9);c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==2e){2d.2c(\'!!=\'+i);7.1p(i)}}}}7.j(\'3s\',6(){g().1h("/2a/3r.29","3q 10 28",6(){g().13(g().27()+10)},"2b");$("q[26=2b]").23().21(\'.3-20-1z\');g().1h("/2a/3p.29","3o 10 28",6(){h 12=g().27()-10;c(12<0)12=0;g().13(12)},"24");$("q[26=24]").23().21(\'.3-20-1z\');});6 1i(){}7.j(\'3n\',6(){1i()});7.j(\'3m\',6(){1i()});7.j("k",6(y){h 9=7.1b();c(9.n<2)19;$(\'.3-8-3l-3k\').3j(6(){$(\'#3-8-a-k\').1e(\'3-8-a-z\');$(\'.3-a-k\').p(\'o-1f\',\'11\')});7.1h("/3i/3h.3g","3f 3e",6(){$(\'.3-1w\').3d(\'3-8-1v\');$(\'.3-8-1y, .3-8-1x\').p(\'o-1g\',\'11\');c($(\'.3-1w\').3c(\'3-8-1v\')){$(\'.3-a-k\').p(\'o-1g\',\'l\');$(\'.3-a-k\').p(\'o-1f\',\'l\');$(\'.3-8-a\').1e(\'3-8-a-z\');$(\'.3-8-a:1u\').3b(\'3-8-a-z\')}3a{$(\'.3-a-k\').p(\'o-1g\',\'11\');$(\'.3-a-k\').p(\'o-1f\',\'11\');$(\'.3-8-a:1u\').1e(\'3-8-a-z\')}},"39");7.j("38",6(y){1d.37(\'1c\',y.9[y.36].1a)});c(1d.1t(\'1c\')){35("1s(1d.1t(\'1c\'));",34)}});h 18;6 1s(1q){h 
9=7.1b();c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==1q){c(i==18){19}18=i;7.1p(i)}}}}',36,270,'|||jw|||function|player|settings|tracks|submenu||if||||jwplayer|var||on|audioTracks|true|3D|length|aria|attr|div|100|||sx|filemoon|https||event|active||false|tt|seek|dd|height|width|adb|current_audio|return|name|getAudioTracks|default_audio|localStorage|removeClass|expanded|checked|addButton|callMeMaybe|vplayer|0fxcyc2ajhp1|position|vvplay|vvad|220|setCurrentAudioTrack|audio_name|for|audio_set|getItem|last|open|controls|playbackRates|captions|rewind|icon|insertAfter||detach|ff00||button|getPosition|sec|png|player8|ff11|log|console|track_name|appendTo|play|click|no|scrolling|frameborder|file_code|src|top|zIndex|css|showCCform|data|1662367683|383371|dl|video_ad|doPlay|prevt|mp4|3E||jpg|thumbs|file|300|setTimeout|currentTrack|setItem|audioTrackChanged|dualSound|else|addClass|hasClass|toggleClass|Track|Audio|svg|dualy|images|mousedown|buttons|topbar|playAttemptFailed|beforePlay|Rewind|fr|Forward|ff|ready|set_audio_track|remove|this|upload_srt|prop|50px|margin|1000001|iframe|center|align|text|rgba|background|1000000|left|absolute|pause|setCurrentCaptions|Upload|contains|item|content|html|fviews|referer|prem|embed|3e57249ef633e0d03bf76ceb8d8a4b65|216|83|hash|view|get|TokenZir|window|hide|show|complete|slow|fadeIn|video_ad_fadein|time||cache|Cache|Content|headers|ajaxSetup|v2done|tott|vastdone2|vastdone1|vvbefore|playbackRateControls|cast|aboutlink|FileMoon|abouttext|UHD|1870|qualityLabels|sites|GNOME_POWER|link|2Fiframe|3C|allowfullscreen|22360|22640|22no|marginheight|marginwidth|2FGNOME_POWER|2F0fxcyc2ajhp1|2Fe|2Ffilemoon|2F|3A||22https|3Ciframe|code|sharing|fontOpacity|backgroundOpacity|Tahoma|fontFamily|303030|backgroundColor|FFFFFF|color|userFontScale|thumbnails|kind|0fxcyc2ajhp10000|url|get_slides|start|startparam|none|preload|html5|primary|hlshtml|androidhls|duration|uniform|stretching|0fxcyc2ajhp1_xt|image|2048|sp|6871|asn|127|srv|43200|_g3XlBcu2lmD9oDexD2NLWSmah2Nu3XcDrl93m9PwXY|m3u8||master|0fxcyc2ajhp1_x|00076|01|hls2|to|s01|delivery|storage|moon|sources|setup'''.split('|')))
|
||||
self.assertEqual(jsi.call_function('f', '''h 7=g("1j");7.7h({7g:[{33:"w://7f-7e-7d-7c.v.7b/7a/79/78/77/76.74?t=73&s=2s&e=72&f=2t&71=70.0.0.1&6z=6y&6x=6w"}],6v:"w://32.v.u/6u.31",16:"r%",15:"r%",6t:"6s",6r:"",6q:"l",6p:"l",6o:"6n",6m:\'6l\',6k:"6j",9:[{33:"/2u?b=6i&n=50&6h=w://32.v.u/6g.31",6f:"6e"}],1y:{6d:1,6c:\'#6b\',6a:\'#69\',68:"67",66:30,65:r,},"64":{63:"%62 2m%m%61%5z%5y%5x.u%5w%5v%5u.2y%22 2k%m%1o%22 5t%m%1o%22 5s%m%1o%22 2j%m%5r%22 16%m%5q%22 15%m%5p%22 5o%2z%5n%5m%2z",5l:"w://v.u/d/1k/5k.2y",5j:[]},\'5i\':{"5h":"5g"},5f:"5e",5d:"w://v.u",5c:{},5b:l,1x:[0.25,0.50,0.75,1,1.25,1.5,2]});h 1m,1n,5a;h 59=0,58=0;h 7=g("1j");h 2x=0,57=0,56=0;$.55({54:{\'53-52\':\'2i-51\'}});7.j(\'4z\',6(x){c(5>0&&x.1l>=5&&1n!=1){1n=1;$(\'q.4y\').4x(\'4w\')}});7.j(\'13\',6(x){2x=x.1l});7.j(\'2g\',6(x){2w(x)});7.j(\'4v\',6(){$(\'q.2v\').4u()});6 2w(x){$(\'q.2v\').4t();c(1m)19;1m=1;17=0;c(4s.4r===l){17=1}$.4q(\'/2u?b=4p&2l=1k&4o=2t-4n-4m-2s-4l&4k=&4j=&4i=&17=\'+17,6(2r){$(\'#4h\').4g(2r)});$(\'.3-8-4f-4e:4d("4c")\').2h(6(e){2q();g().4b(0);g().4a(l)});6 2q(){h $14=$("<q />").2p({1l:"49",16:"r%",15:"r%",48:0,2n:0,2o:47,46:"45(10%, 10%, 10%, 0.4)","44-43":"42"});$("<41 />").2p({16:"60%",15:"60%",2o:40,"3z-2n":"3y"}).3x({\'2m\':\'/?b=3w&2l=1k\',\'2k\':\'0\',\'2j\':\'2i\'}).2f($14);$14.2h(6(){$(3v).3u();g().2g()});$14.2f($(\'#1j\'))}g().13(0);}6 3t(){h 9=7.1b(2e);2d.2c(9);c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==2e){2d.2c(\'!!=\'+i);7.1p(i)}}}}7.j(\'3s\',6(){g().1h("/2a/3r.29","3q 10 28",6(){g().13(g().27()+10)},"2b");$("q[26=2b]").23().21(\'.3-20-1z\');g().1h("/2a/3p.29","3o 10 28",6(){h 12=g().27()-10;c(12<0)12=0;g().13(12)},"24");$("q[26=24]").23().21(\'.3-20-1z\');});6 1i(){}7.j(\'3n\',6(){1i()});7.j(\'3m\',6(){1i()});7.j("k",6(y){h 9=7.1b();c(9.n<2)19;$(\'.3-8-3l-3k\').3j(6(){$(\'#3-8-a-k\').1e(\'3-8-a-z\');$(\'.3-a-k\').p(\'o-1f\',\'11\')});7.1h("/3i/3h.3g","3f 3e",6(){$(\'.3-1w\').3d(\'3-8-1v\');$(\'.3-8-1y, .3-8-1x\').p(\'o-1g\',\'11\');c($(\'.3-1w\').3c(\'3-8-1v\')){$(\'.3-a-k\').p(\'o-1g\',\'l\');$(\'.3-a-k\').p(\'o-1f\',\'l\');$(\'.3-8-a\').1e(\'3-8-a-z\');$(\'.3-8-a:1u\').3b(\'3-8-a-z\')}3a{$(\'.3-a-k\').p(\'o-1g\',\'11\');$(\'.3-a-k\').p(\'o-1f\',\'11\');$(\'.3-8-a:1u\').1e(\'3-8-a-z\')}},"39");7.j("38",6(y){1d.37(\'1c\',y.9[y.36].1a)});c(1d.1t(\'1c\')){35("1s(1d.1t(\'1c\'));",34)}});h 18;6 1s(1q){h 
9=7.1b();c(9.n>1){1r(i=0;i<9.n;i++){c(9[i].1a==1q){c(i==18){19}18=i;7.1p(i)}}}}',36,270,'|||jw|||function|player|settings|tracks|submenu||if||||jwplayer|var||on|audioTracks|true|3D|length|aria|attr|div|100|||sx|filemoon|https||event|active||false|tt|seek|dd|height|width|adb|current_audio|return|name|getAudioTracks|default_audio|localStorage|removeClass|expanded|checked|addButton|callMeMaybe|vplayer|0fxcyc2ajhp1|position|vvplay|vvad|220|setCurrentAudioTrack|audio_name|for|audio_set|getItem|last|open|controls|playbackRates|captions|rewind|icon|insertAfter||detach|ff00||button|getPosition|sec|png|player8|ff11|log|console|track_name|appendTo|play|click|no|scrolling|frameborder|file_code|src|top|zIndex|css|showCCform|data|1662367683|383371|dl|video_ad|doPlay|prevt|mp4|3E||jpg|thumbs|file|300|setTimeout|currentTrack|setItem|audioTrackChanged|dualSound|else|addClass|hasClass|toggleClass|Track|Audio|svg|dualy|images|mousedown|buttons|topbar|playAttemptFailed|beforePlay|Rewind|fr|Forward|ff|ready|set_audio_track|remove|this|upload_srt|prop|50px|margin|1000001|iframe|center|align|text|rgba|background|1000000|left|absolute|pause|setCurrentCaptions|Upload|contains|item|content|html|fviews|referer|prem|embed|3e57249ef633e0d03bf76ceb8d8a4b65|216|83|hash|view|get|TokenZir|window|hide|show|complete|slow|fadeIn|video_ad_fadein|time||cache|Cache|Content|headers|ajaxSetup|v2done|tott|vastdone2|vastdone1|vvbefore|playbackRateControls|cast|aboutlink|FileMoon|abouttext|UHD|1870|qualityLabels|sites|GNOME_POWER|link|2Fiframe|3C|allowfullscreen|22360|22640|22no|marginheight|marginwidth|2FGNOME_POWER|2F0fxcyc2ajhp1|2Fe|2Ffilemoon|2F|3A||22https|3Ciframe|code|sharing|fontOpacity|backgroundOpacity|Tahoma|fontFamily|303030|backgroundColor|FFFFFF|color|userFontScale|thumbnails|kind|0fxcyc2ajhp10000|url|get_slides|start|startparam|none|preload|html5|primary|hlshtml|androidhls|duration|uniform|stretching|0fxcyc2ajhp1_xt|image|2048|sp|6871|asn|127|srv|43200|_g3XlBcu2lmD9oDexD2NLWSmah2Nu3XcDrl93m9PwXY|m3u8||master|0fxcyc2ajhp1_x|00076|01|hls2|to|s01|delivery|storage|moon|sources|setup'''.split('|'))) # noqa: SIM905
|
||||
|
||||
def test_join(self):
|
||||
test_input = list('test')
|
||||
@@ -462,6 +462,16 @@ class TestJSInterpreter(unittest.TestCase):
|
||||
]:
|
||||
assert js_number_to_string(test, radix) == expected
|
||||
|
||||
def test_extract_function(self):
|
||||
jsi = JSInterpreter('function a(b) { return b + 1; }')
|
||||
func = jsi.extract_function('a')
|
||||
self.assertEqual(func([2]), 3)
|
||||
|
||||
def test_extract_function_with_global_stack(self):
|
||||
jsi = JSInterpreter('function c(d) { return d + e + f + g; }')
|
||||
func = jsi.extract_function('c', {'e': 10}, {'f': 100, 'g': 1000})
|
||||
self.assertEqual(func([1]), 1111)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
unittest.main()
|
||||
|
||||
test/test_networking.py

@@ -720,6 +720,15 @@ class TestHTTPRequestHandler(TestRequestHandlerBase):
                 rh, Request(
                     f'http://127.0.0.1:{self.http_port}/headers', proxies={'all': 'http://10.255.255.255'})).close()
 
+    @pytest.mark.skip_handlers_if(lambda _, handler: handler not in ['Urllib', 'CurlCFFI'], 'handler does not support keep_header_casing')
+    def test_keep_header_casing(self, handler):
+        with handler() as rh:
+            res = validate_and_send(
+                rh, Request(
+                    f'http://127.0.0.1:{self.http_port}/headers', headers={'X-test-heaDer': 'test'}, extensions={'keep_header_casing': True})).read().decode()
+
+        assert 'X-test-heaDer: test' in res
+
 
 @pytest.mark.parametrize('handler', ['Urllib', 'Requests', 'CurlCFFI'], indirect=True)
 class TestClientCertificate:

@@ -1289,6 +1298,7 @@ class TestRequestHandlerValidation:
             ({'legacy_ssl': False}, False),
             ({'legacy_ssl': True}, False),
             ({'legacy_ssl': 'notabool'}, AssertionError),
+            ({'keep_header_casing': True}, UnsupportedRequest),
         ]),
         ('Requests', 'http', [
            ({'cookiejar': 'notacookiejar'}, AssertionError),

@@ -1299,6 +1309,9 @@ class TestRequestHandlerValidation:
             ({'legacy_ssl': False}, False),
             ({'legacy_ssl': True}, False),
             ({'legacy_ssl': 'notabool'}, AssertionError),
+            ({'keep_header_casing': False}, False),
+            ({'keep_header_casing': True}, False),
+            ({'keep_header_casing': 'notabool'}, AssertionError),
         ]),
         ('CurlCFFI', 'http', [
             ({'cookiejar': 'notacookiejar'}, AssertionError),
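As the test exercises, `keep_header_casing` is a per-request extension that preserves the exact capitalization of header names on the wire; the wire-level test above runs only against the Urllib and CurlCFFI handlers. A sketch of constructing such a request (URL and header name are illustrative):

```python
from yt_dlp.networking import Request

req = Request(
    'http://127.0.0.1:8080/headers',
    headers={'X-test-heaDer': 'test'},
    extensions={'keep_header_casing': True},  # send the header name as written
)
```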
test/test_plugins.py

@@ -10,22 +10,71 @@ TEST_DATA_DIR = Path(os.path.dirname(os.path.abspath(__file__)), 'testdata')
 sys.path.append(str(TEST_DATA_DIR))
 importlib.invalidate_caches()
 
-from yt_dlp.utils import Config
-from yt_dlp.plugins import PACKAGE_NAME, directories, load_plugins
+from yt_dlp.plugins import (
+    PACKAGE_NAME,
+    PluginSpec,
+    directories,
+    load_plugins,
+    load_all_plugins,
+    register_plugin_spec,
+)
+
+from yt_dlp.globals import (
+    extractors,
+    postprocessors,
+    plugin_dirs,
+    plugin_ies,
+    plugin_pps,
+    all_plugins_loaded,
+    plugin_specs,
+)
+
+
+EXTRACTOR_PLUGIN_SPEC = PluginSpec(
+    module_name='extractor',
+    suffix='IE',
+    destination=extractors,
+    plugin_destination=plugin_ies,
+)
+
+POSTPROCESSOR_PLUGIN_SPEC = PluginSpec(
+    module_name='postprocessor',
+    suffix='PP',
+    destination=postprocessors,
+    plugin_destination=plugin_pps,
+)
+
+
+def reset_plugins():
+    plugin_ies.value = {}
+    plugin_pps.value = {}
+    plugin_dirs.value = ['default']
+    plugin_specs.value = {}
+    all_plugins_loaded.value = False
+    # Clearing override plugins is probably difficult
+    for module_name in tuple(sys.modules):
+        for plugin_type in ('extractor', 'postprocessor'):
+            if module_name.startswith(f'{PACKAGE_NAME}.{plugin_type}.'):
+                del sys.modules[module_name]
+
+    importlib.invalidate_caches()
 
 
 class TestPlugins(unittest.TestCase):
 
     TEST_PLUGIN_DIR = TEST_DATA_DIR / PACKAGE_NAME
 
+    def setUp(self):
+        reset_plugins()
+
+    def tearDown(self):
+        reset_plugins()
+
     def test_directories_containing_plugins(self):
         self.assertIn(self.TEST_PLUGIN_DIR, map(Path, directories()))
 
     def test_extractor_classes(self):
         for module_name in tuple(sys.modules):
             if module_name.startswith(f'{PACKAGE_NAME}.extractor'):
                 del sys.modules[module_name]
-        plugins_ie = load_plugins('extractor', 'IE')
+        plugins_ie = load_plugins(EXTRACTOR_PLUGIN_SPEC)
 
         self.assertIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
         self.assertIn('NormalPluginIE', plugins_ie.keys())

@@ -35,17 +84,29 @@ class TestPlugins(unittest.TestCase):
             f'{PACKAGE_NAME}.extractor._ignore' in sys.modules,
             'loaded module beginning with underscore')
         self.assertNotIn('IgnorePluginIE', plugins_ie.keys())
+        self.assertNotIn('IgnorePluginIE', plugin_ies.value)
 
         # Don't load extractors with underscore prefix
         self.assertNotIn('_IgnoreUnderscorePluginIE', plugins_ie.keys())
+        self.assertNotIn('_IgnoreUnderscorePluginIE', plugin_ies.value)
 
         # Don't load extractors not specified in __all__ (if supplied)
         self.assertNotIn('IgnoreNotInAllPluginIE', plugins_ie.keys())
+        self.assertNotIn('IgnoreNotInAllPluginIE', plugin_ies.value)
         self.assertIn('InAllPluginIE', plugins_ie.keys())
+        self.assertIn('InAllPluginIE', plugin_ies.value)
 
         # Don't load override extractors
         self.assertNotIn('OverrideGenericIE', plugins_ie.keys())
+        self.assertNotIn('OverrideGenericIE', plugin_ies.value)
         self.assertNotIn('_UnderscoreOverrideGenericIE', plugins_ie.keys())
+        self.assertNotIn('_UnderscoreOverrideGenericIE', plugin_ies.value)
 
     def test_postprocessor_classes(self):
-        plugins_pp = load_plugins('postprocessor', 'PP')
+        plugins_pp = load_plugins(POSTPROCESSOR_PLUGIN_SPEC)
         self.assertIn('NormalPluginPP', plugins_pp.keys())
+        self.assertIn(f'{PACKAGE_NAME}.postprocessor.normal', sys.modules.keys())
+        self.assertIn('NormalPluginPP', plugin_pps.value)
 
     def test_importing_zipped_module(self):
         zip_path = TEST_DATA_DIR / 'zipped_plugins.zip'

@@ -58,10 +119,10 @@ class TestPlugins(unittest.TestCase):
             package = importlib.import_module(f'{PACKAGE_NAME}.{plugin_type}')
             self.assertIn(zip_path / PACKAGE_NAME / plugin_type, map(Path, package.__path__))
 
-            plugins_ie = load_plugins('extractor', 'IE')
+            plugins_ie = load_plugins(EXTRACTOR_PLUGIN_SPEC)
             self.assertIn('ZippedPluginIE', plugins_ie.keys())
 
-            plugins_pp = load_plugins('postprocessor', 'PP')
+            plugins_pp = load_plugins(POSTPROCESSOR_PLUGIN_SPEC)
             self.assertIn('ZippedPluginPP', plugins_pp.keys())
 
         finally:

@@ -69,23 +130,116 @@ class TestPlugins(unittest.TestCase):
             os.remove(zip_path)
             importlib.invalidate_caches()  # reset the import caches
 
-    def test_plugin_dirs(self):
-        # Internal plugin dirs hack for CLI --plugin-dirs
-        # To be replaced with proper system later
-        custom_plugin_dir = TEST_DATA_DIR / 'plugin_packages'
-        Config._plugin_dirs = [str(custom_plugin_dir)]
-        importlib.invalidate_caches()  # reset the import caches
-        try:
-            package = importlib.import_module(f'{PACKAGE_NAME}.extractor')
-            self.assertIn(custom_plugin_dir / 'testpackage' / PACKAGE_NAME / 'extractor', map(Path, package.__path__))
-
-            plugins_ie = load_plugins('extractor', 'IE')
-            self.assertIn('PackagePluginIE', plugins_ie.keys())
-
-        finally:
-            Config._plugin_dirs = []
-            importlib.invalidate_caches()  # reset the import caches
+    def test_reloading_plugins(self):
+        reload_plugins_path = TEST_DATA_DIR / 'reload_plugins'
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+        load_plugins(POSTPROCESSOR_PLUGIN_SPEC)
+
+        # Remove default folder and add reload_plugin path
+        sys.path.remove(str(TEST_DATA_DIR))
+        sys.path.append(str(reload_plugins_path))
+        importlib.invalidate_caches()
+        try:
+            for plugin_type in ('extractor', 'postprocessor'):
+                package = importlib.import_module(f'{PACKAGE_NAME}.{plugin_type}')
+                self.assertIn(reload_plugins_path / PACKAGE_NAME / plugin_type, map(Path, package.__path__))
+
+            plugins_ie = load_plugins(EXTRACTOR_PLUGIN_SPEC)
+            self.assertIn('NormalPluginIE', plugins_ie.keys())
+            self.assertTrue(
+                plugins_ie['NormalPluginIE'].REPLACED,
+                msg='Reloading has not replaced original extractor plugin')
+            self.assertTrue(
+                extractors.value['NormalPluginIE'].REPLACED,
+                msg='Reloading has not replaced original extractor plugin globally')
+
+            plugins_pp = load_plugins(POSTPROCESSOR_PLUGIN_SPEC)
+            self.assertIn('NormalPluginPP', plugins_pp.keys())
+            self.assertTrue(plugins_pp['NormalPluginPP'].REPLACED,
+                            msg='Reloading has not replaced original postprocessor plugin')
+            self.assertTrue(
+                postprocessors.value['NormalPluginPP'].REPLACED,
+                msg='Reloading has not replaced original postprocessor plugin globally')
+
+        finally:
+            sys.path.remove(str(reload_plugins_path))
+            sys.path.append(str(TEST_DATA_DIR))
+            importlib.invalidate_caches()
+
+    def test_extractor_override_plugin(self):
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+
+        from yt_dlp.extractor.generic import GenericIE
+
+        self.assertEqual(GenericIE.TEST_FIELD, 'override')
+        self.assertEqual(GenericIE.SECONDARY_TEST_FIELD, 'underscore-override')
+
+        self.assertEqual(GenericIE.IE_NAME, 'generic+override+underscore-override')
+        importlib.invalidate_caches()
+        # test that loading a second time doesn't wrap a second time
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+        from yt_dlp.extractor.generic import GenericIE
+        self.assertEqual(GenericIE.IE_NAME, 'generic+override+underscore-override')
+
+    def test_load_all_plugin_types(self):
+
+        # no plugin specs registered
+        load_all_plugins()
+
+        self.assertNotIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
+        self.assertNotIn(f'{PACKAGE_NAME}.postprocessor.normal', sys.modules.keys())
+
+        register_plugin_spec(EXTRACTOR_PLUGIN_SPEC)
+        register_plugin_spec(POSTPROCESSOR_PLUGIN_SPEC)
+        load_all_plugins()
+        self.assertTrue(all_plugins_loaded.value)
+
+        self.assertIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
+        self.assertIn(f'{PACKAGE_NAME}.postprocessor.normal', sys.modules.keys())
+
+    def test_no_plugin_dirs(self):
+        register_plugin_spec(EXTRACTOR_PLUGIN_SPEC)
+        register_plugin_spec(POSTPROCESSOR_PLUGIN_SPEC)
+
+        plugin_dirs.value = []
+        load_all_plugins()
+
+        self.assertNotIn(f'{PACKAGE_NAME}.extractor.normal', sys.modules.keys())
+        self.assertNotIn(f'{PACKAGE_NAME}.postprocessor.normal', sys.modules.keys())
+
+    def test_set_plugin_dirs(self):
+        custom_plugin_dir = str(TEST_DATA_DIR / 'plugin_packages')
+        plugin_dirs.value = [custom_plugin_dir]
+
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+
+        self.assertIn(f'{PACKAGE_NAME}.extractor.package', sys.modules.keys())
+        self.assertIn('PackagePluginIE', plugin_ies.value)
+
+    def test_invalid_plugin_dir(self):
+        plugin_dirs.value = ['invalid_dir']
+        with self.assertRaises(ValueError):
+            load_plugins(EXTRACTOR_PLUGIN_SPEC)
+
+    def test_append_plugin_dirs(self):
+        custom_plugin_dir = str(TEST_DATA_DIR / 'plugin_packages')
+
+        self.assertEqual(plugin_dirs.value, ['default'])
+        plugin_dirs.value.append(custom_plugin_dir)
+        self.assertEqual(plugin_dirs.value, ['default', custom_plugin_dir])
+
+        load_plugins(EXTRACTOR_PLUGIN_SPEC)
+
+        self.assertIn(f'{PACKAGE_NAME}.extractor.package', sys.modules.keys())
+        self.assertIn('PackagePluginIE', plugin_ies.value)
+
+    def test_get_plugin_spec(self):
+        register_plugin_spec(EXTRACTOR_PLUGIN_SPEC)
+        register_plugin_spec(POSTPROCESSOR_PLUGIN_SPEC)
+
+        self.assertEqual(plugin_specs.value.get('extractor'), EXTRACTOR_PLUGIN_SPEC)
+        self.assertEqual(plugin_specs.value.get('postprocessor'), POSTPROCESSOR_PLUGIN_SPEC)
+        self.assertIsNone(plugin_specs.value.get('invalid'))
 
 
 if __name__ == '__main__':
@@ -3,19 +3,20 @@
# Allow direct execution
import os
import sys
import unittest
import unittest.mock
import warnings
import datetime as dt

sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))


import contextlib
import datetime as dt
import io
import itertools
import json
import pickle
import subprocess
import unittest
import unittest.mock
import warnings
import xml.etree.ElementTree

from yt_dlp.compat import (
@@ -218,11 +219,8 @@ class TestUtil(unittest.TestCase):
self.assertEqual(sanitize_filename('_BD_eEpuzXw', is_id=True), '_BD_eEpuzXw')
self.assertEqual(sanitize_filename('N0Y__7-UOdI', is_id=True), 'N0Y__7-UOdI')

@unittest.mock.patch('sys.platform', 'win32')
def test_sanitize_path(self):
with unittest.mock.patch('sys.platform', 'win32'):
self._test_sanitize_path()

def _test_sanitize_path(self):
self.assertEqual(sanitize_path('abc'), 'abc')
self.assertEqual(sanitize_path('abc/def'), 'abc\\def')
self.assertEqual(sanitize_path('abc\\def'), 'abc\\def')
@@ -253,10 +251,8 @@ class TestUtil(unittest.TestCase):

# Check with nt._path_normpath if available
try:
import nt

nt_path_normpath = getattr(nt, '_path_normpath', None)
except Exception:
from nt import _path_normpath as nt_path_normpath
except ImportError:
nt_path_normpath = None

for test, expected in [
@@ -2087,21 +2083,26 @@ Line 1
headers = HTTPHeaderDict()
headers['ytdl-test'] = b'0'
self.assertEqual(list(headers.items()), [('Ytdl-Test', '0')])
self.assertEqual(list(headers.sensitive().items()), [('ytdl-test', '0')])
headers['ytdl-test'] = 1
self.assertEqual(list(headers.items()), [('Ytdl-Test', '1')])
self.assertEqual(list(headers.sensitive().items()), [('ytdl-test', '1')])
headers['Ytdl-test'] = '2'
self.assertEqual(list(headers.items()), [('Ytdl-Test', '2')])
self.assertEqual(list(headers.sensitive().items()), [('Ytdl-test', '2')])
self.assertTrue('ytDl-Test' in headers)
self.assertEqual(str(headers), str(dict(headers)))
self.assertEqual(repr(headers), str(dict(headers)))

headers.update({'X-dlp': 'data'})
self.assertEqual(set(headers.items()), {('Ytdl-Test', '2'), ('X-Dlp', 'data')})
self.assertEqual(set(headers.sensitive().items()), {('Ytdl-test', '2'), ('X-dlp', 'data')})
self.assertEqual(dict(headers), {'Ytdl-Test': '2', 'X-Dlp': 'data'})
self.assertEqual(len(headers), 2)
self.assertEqual(headers.copy(), headers)
headers2 = HTTPHeaderDict({'X-dlp': 'data3'}, **headers, **{'X-dlp': 'data2'})
headers2 = HTTPHeaderDict({'X-dlp': 'data3'}, headers, **{'X-dlP': 'data2'})
self.assertEqual(set(headers2.items()), {('Ytdl-Test', '2'), ('X-Dlp', 'data2')})
self.assertEqual(set(headers2.sensitive().items()), {('Ytdl-test', '2'), ('X-dlP', 'data2')})
self.assertEqual(len(headers2), 2)
headers2.clear()
self.assertEqual(len(headers2), 0)
@@ -2109,16 +2110,23 @@ Line 1
# ensure we prefer latter headers
headers3 = HTTPHeaderDict({'Ytdl-TeSt': 1}, {'Ytdl-test': 2})
self.assertEqual(set(headers3.items()), {('Ytdl-Test', '2')})
self.assertEqual(set(headers3.sensitive().items()), {('Ytdl-test', '2')})
del headers3['ytdl-tesT']
self.assertEqual(dict(headers3), {})

headers4 = HTTPHeaderDict({'ytdl-test': 'data;'})
self.assertEqual(set(headers4.items()), {('Ytdl-Test', 'data;')})
self.assertEqual(set(headers4.sensitive().items()), {('ytdl-test', 'data;')})

# common mistake: strip whitespace from values
# https://github.com/yt-dlp/yt-dlp/issues/8729
headers5 = HTTPHeaderDict({'ytdl-test': ' data; '})
self.assertEqual(set(headers5.items()), {('Ytdl-Test', 'data;')})
self.assertEqual(set(headers5.sensitive().items()), {('ytdl-test', 'data;')})

# test if picklable
headers6 = HTTPHeaderDict(a=1, b=2)
self.assertEqual(pickle.loads(pickle.dumps(headers6)), headers6)

def test_extract_basic_auth(self):
assert extract_basic_auth('http://:foo.bar') == ('http://:foo.bar', None)

@@ -44,7 +44,7 @@ def websocket_handler(websocket):
return websocket.send('2')
elif isinstance(message, str):
if message == 'headers':
return websocket.send(json.dumps(dict(websocket.request.headers)))
return websocket.send(json.dumps(dict(websocket.request.headers.raw_items())))
elif message == 'path':
return websocket.send(websocket.request.path)
elif message == 'source_address':
@@ -266,18 +266,18 @@ class TestWebsSocketRequestHandlerConformance:
with handler(cookiejar=cookiejar) as rh:
ws = ws_validate_and_send(rh, Request(self.ws_base_url))
ws.send('headers')
assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
assert HTTPHeaderDict(json.loads(ws.recv()))['cookie'] == 'test=ytdlp'
ws.close()

with handler() as rh:
ws = ws_validate_and_send(rh, Request(self.ws_base_url))
ws.send('headers')
assert 'cookie' not in json.loads(ws.recv())
assert 'cookie' not in HTTPHeaderDict(json.loads(ws.recv()))
ws.close()

ws = ws_validate_and_send(rh, Request(self.ws_base_url, extensions={'cookiejar': cookiejar}))
ws.send('headers')
assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
assert HTTPHeaderDict(json.loads(ws.recv()))['cookie'] == 'test=ytdlp'
ws.close()

@pytest.mark.skip_handler('Websockets', 'Set-Cookie not supported by websockets')
@@ -287,7 +287,7 @@ class TestWebsSocketRequestHandlerConformance:
ws_validate_and_send(rh, Request(f'{self.ws_base_url}/get_cookie', extensions={'cookiejar': YoutubeDLCookieJar()}))
ws = ws_validate_and_send(rh, Request(self.ws_base_url, extensions={'cookiejar': YoutubeDLCookieJar()}))
ws.send('headers')
assert 'cookie' not in json.loads(ws.recv())
assert 'cookie' not in HTTPHeaderDict(json.loads(ws.recv()))
ws.close()

@pytest.mark.skip_handler('Websockets', 'Set-Cookie not supported by websockets')
@@ -298,12 +298,12 @@ class TestWebsSocketRequestHandlerConformance:
ws_validate_and_send(rh, Request(f'{self.ws_base_url}/get_cookie'))
ws = ws_validate_and_send(rh, Request(self.ws_base_url))
ws.send('headers')
assert json.loads(ws.recv())['cookie'] == 'test=ytdlp'
assert HTTPHeaderDict(json.loads(ws.recv()))['cookie'] == 'test=ytdlp'
ws.close()
cookiejar.clear_session_cookies()
ws = ws_validate_and_send(rh, Request(self.ws_base_url))
ws.send('headers')
assert 'cookie' not in json.loads(ws.recv())
assert 'cookie' not in HTTPHeaderDict(json.loads(ws.recv()))
ws.close()

def test_source_address(self, handler):
@@ -341,6 +341,14 @@ class TestWebsSocketRequestHandlerConformance:
assert headers['test3'] == 'test3'
ws.close()

def test_keep_header_casing(self, handler):
with handler(headers=HTTPHeaderDict({'x-TeSt1': 'test'})) as rh:
ws = ws_validate_and_send(rh, Request(self.ws_base_url, headers={'x-TeSt2': 'test'}, extensions={'keep_header_casing': True}))
ws.send('headers')
headers = json.loads(ws.recv())
assert 'x-TeSt1' in headers
assert 'x-TeSt2' in headers

@pytest.mark.parametrize('client_cert', (
{'client_certificate': os.path.join(MTLS_CERT_DIR, 'clientwithkey.crt')},
{

@@ -78,6 +78,11 @@ _SIG_TESTS = [
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xxAj7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJ2OySqa0q',
),
(
'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'AAOAOq0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xx8j7vgpDL0QwbdV06sCIEzpWqMGkFR20CFOS21Tp-7vj_EMu-m37KtXJoOy1',
),
]

_NSIG_TESTS = [
@@ -205,6 +210,30 @@ _NSIG_TESTS = [
'https://www.youtube.com/s/player/9c6dfc4a/player_ias.vflset/en_US/base.js',
'jbu7ylIosQHyJyJV', 'uwI0ESiynAmhNg',
),
(
'https://www.youtube.com/s/player/e7567ecf/player_ias_tce.vflset/en_US/base.js',
'Sy4aDGc0VpYRR9ew_', '5UPOT1VhoZxNLQ',
),
(
'https://www.youtube.com/s/player/d50f54ef/player_ias_tce.vflset/en_US/base.js',
'Ha7507LzRmH3Utygtj', 'XFTb2HoeOE5MHg',
),
(
'https://www.youtube.com/s/player/074a8365/player_ias_tce.vflset/en_US/base.js',
'Ha7507LzRmH3Utygtj', 'ufTsrE0IVYrkl8v',
),
(
'https://www.youtube.com/s/player/643afba4/player_ias.vflset/en_US/base.js',
'N5uAlLqm0eg1GyHO', 'dCBQOejdq5s-ww',
),
(
'https://www.youtube.com/s/player/69f581a5/tv-player-ias.vflset/tv-player-ias.js',
'-qIP447rVlTTwaZjY', 'KNcGOksBAvwqQg',
),
(
'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js',
'ir9-V6cdbCiyKxhr', '2PL7ZDYAALMfmA',
),
]


@@ -218,6 +247,8 @@ class TestPlayerInfo(unittest.TestCase):
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-en_US.vflset/base.js', '64dddad9'),
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-phone-de_DE.vflset/base.js', '64dddad9'),
('https://www.youtube.com/s/player/64dddad9/player-plasma-ias-tablet-en_US.vflset/base.js', '64dddad9'),
('https://www.youtube.com/s/player/e7567ecf/player_ias_tce.vflset/en_US/base.js', 'e7567ecf'),
('https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js', '643afba4'),
# obsolete
('https://www.youtube.com/yts/jsbin/player_ias-vfle4-e03/en_US/base.js', 'vfle4-e03'),
('https://www.youtube.com/yts/jsbin/player_ias-vfl49f_g4/en_US/base.js', 'vfl49f_g4'),
@@ -250,7 +281,7 @@ def t_factory(name, sig_func, url_pattern):
def make_tfunc(url, sig_input, expected_sig):
m = url_pattern.match(url)
assert m, f'{url!r} should follow URL format'
test_id = m.group('id')
test_id = re.sub(r'[/.-]', '_', m.group('id') or m.group('compat_id'))

def test_func(self):
basename = f'player-{name}-{test_id}.js'
@@ -279,17 +310,22 @@ def n_sig(jscode, sig_input):
ie = YoutubeIE(FakeYDL())
funcname = ie._extract_n_function_name(jscode)
jsi = JSInterpreter(jscode)
func = jsi.extract_function_from_code(*ie._fixup_n_function_code(*jsi.extract_function_code(funcname)))
func = jsi.extract_function_from_code(*ie._fixup_n_function_code(*jsi.extract_function_code(funcname), jscode))
return func([sig_input])


make_sig_test = t_factory(
'signature', signature, re.compile(r'.*(?:-|/player/)(?P<id>[a-zA-Z0-9_-]+)(?:/.+\.js|(?:/watch_as3|/html5player)?\.[a-z]+)$'))
'signature', signature,
re.compile(r'''(?x)
.+(?:
/player/(?P<id>[a-zA-Z0-9_/.-]+)|
/html5player-(?:en_US-)?(?P<compat_id>[a-zA-Z0-9_-]+)(?:/watch_as3|/html5player)?
)\.js$'''))
for test_spec in _SIG_TESTS:
make_sig_test(*test_spec)

make_nsig_test = t_factory(
'nsig', n_sig, re.compile(r'.+/player/(?P<id>[a-zA-Z0-9_-]+)/.+.js$'))
'nsig', n_sig, re.compile(r'.+/player/(?P<id>[a-zA-Z0-9_/.-]+)\.js$'))
for test_spec in _NSIG_TESTS:
make_nsig_test(*test_spec)

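A quick illustration of how the broadened player-URL pattern above interacts with the new test-id sanitization; the URL is taken from the nsig tests above, and the derived id is what the test function name would use (a sketch, not part of the diff):

import re

url_pattern = re.compile(r'.+/player/(?P<id>[a-zA-Z0-9_/.-]+)\.js$')  # new nsig pattern
m = url_pattern.match('https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js')
# slashes, dots and dashes in the matched id are mapped to underscores
test_id = re.sub(r'[/.-]', '_', m.group('id'))
assert test_id == '643afba4_tv_player_ias_vflset_tv_player_ias'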
@@ -2,4 +2,5 @@ from yt_dlp.extractor.common import InfoExtractor


class PackagePluginIE(InfoExtractor):
_VALID_URL = 'package'
pass

10
test/testdata/reload_plugins/yt_dlp_plugins/extractor/normal.py
vendored
Normal file
@@ -0,0 +1,10 @@
from yt_dlp.extractor.common import InfoExtractor


class NormalPluginIE(InfoExtractor):
_VALID_URL = 'normal'
REPLACED = True


class _IgnoreUnderscorePluginIE(InfoExtractor):
pass
5
test/testdata/reload_plugins/yt_dlp_plugins/postprocessor/normal.py
vendored
Normal file
@@ -0,0 +1,5 @@
from yt_dlp.postprocessor.common import PostProcessor


class NormalPluginPP(PostProcessor):
REPLACED = True
@@ -6,6 +6,7 @@ class IgnoreNotInAllPluginIE(InfoExtractor):


class InAllPluginIE(InfoExtractor):
_VALID_URL = 'inallpluginie'
pass


@@ -2,8 +2,10 @@ from yt_dlp.extractor.common import InfoExtractor


class NormalPluginIE(InfoExtractor):
pass
_VALID_URL = 'normalpluginie'
REPLACED = False


class _IgnoreUnderscorePluginIE(InfoExtractor):
_VALID_URL = 'ignoreunderscorepluginie'
pass

5
test/testdata/yt_dlp_plugins/extractor/override.py
vendored
Normal file
@@ -0,0 +1,5 @@
from yt_dlp.extractor.generic import GenericIE


class OverrideGenericIE(GenericIE, plugin_name='override'):
TEST_FIELD = 'override'
5
test/testdata/yt_dlp_plugins/extractor/overridetwo.py
vendored
Normal file
@@ -0,0 +1,5 @@
from yt_dlp.extractor.generic import GenericIE


class _UnderscoreOverrideGenericIE(GenericIE, plugin_name='underscore-override'):
SECONDARY_TEST_FIELD = 'underscore-override'
@@ -2,4 +2,4 @@ from yt_dlp.postprocessor.common import PostProcessor


class NormalPluginPP(PostProcessor):
pass
REPLACED = False

@@ -2,4 +2,5 @@ from yt_dlp.extractor.common import InfoExtractor


class ZippedPluginIE(InfoExtractor):
_VALID_URL = 'zippedpluginie'
pass

@@ -30,9 +30,18 @@ from .compat import urllib_req_to_req
from .cookies import CookieLoadError, LenientSimpleCookie, load_cookies
from .downloader import FFmpegFD, get_suitable_downloader, shorten_protocol_name
from .downloader.rtmp import rtmpdump_version
from .extractor import gen_extractor_classes, get_info_extractor
from .extractor import gen_extractor_classes, get_info_extractor, import_extractors
from .extractor.common import UnsupportedURLIE
from .extractor.openload import PhantomJSwrapper
from .globals import (
IN_CLI,
LAZY_EXTRACTORS,
plugin_ies,
plugin_ies_overrides,
plugin_pps,
all_plugins_loaded,
plugin_dirs,
)
from .minicurses import format_text
from .networking import HEADRequest, Request, RequestDirector
from .networking.common import _REQUEST_HANDLERS, _RH_PREFERENCES
@@ -44,8 +53,7 @@ from .networking.exceptions import (
network_exceptions,
)
from .networking.impersonate import ImpersonateRequestHandler
from .plugins import directories as plugin_directories
from .postprocessor import _PLUGIN_CLASSES as plugin_pps
from .plugins import directories as plugin_directories, load_all_plugins
from .postprocessor import (
EmbedThumbnailPP,
FFmpegFixupDuplicateMoovPP,
@@ -157,7 +165,7 @@ from .utils import (
write_json_file,
write_string,
)
from .utils._utils import _UnsafeExtensionError, _YDLLogger
from .utils._utils import _UnsafeExtensionError, _YDLLogger, _ProgressState
from .utils.networking import (
HTTPHeaderDict,
clean_headers,
@@ -642,20 +650,23 @@ class YoutubeDL:
self.cache = Cache(self)
self.__header_cookies = []

stdout = sys.stderr if self.params.get('logtostderr') else sys.stdout
self._out_files = Namespace(
out=stdout,
error=sys.stderr,
screen=sys.stderr if self.params.get('quiet') else stdout,
console=None if os.name == 'nt' else next(
filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None),
)
# compat for API: load plugins if they have not already
if not all_plugins_loaded.value:
load_all_plugins()

try:
windows_enable_vt_mode()
except Exception as e:
self.write_debug(f'Failed to enable VT mode: {e}')

stdout = sys.stderr if self.params.get('logtostderr') else sys.stdout
self._out_files = Namespace(
out=stdout,
error=sys.stderr,
screen=sys.stderr if self.params.get('quiet') else stdout,
console=next(filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None),
)

if self.params.get('no_color'):
if self.params.get('color') is not None:
self.params.setdefault('_warnings', []).append(
@@ -956,21 +967,22 @@ class YoutubeDL:
self._write_string(f'{self._bidi_workaround(message)}\n', self._out_files.error, only_once=only_once)

def _send_console_code(self, code):
if os.name == 'nt' or not self._out_files.console:
return
if not supports_terminal_sequences(self._out_files.console):
return False
self._write_string(code, self._out_files.console)
return True

def to_console_title(self, message):
if not self.params.get('consoletitle', False):
def to_console_title(self, message=None, progress_state=None, percent=None):
if not self.params.get('consoletitle'):
return
message = remove_terminal_sequences(message)
if os.name == 'nt':
if ctypes.windll.kernel32.GetConsoleWindow():
# c_wchar_p() might not be necessary if `message` is
# already of type unicode()
ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
else:
self._send_console_code(f'\033]0;{message}\007')

if message:
success = self._send_console_code(f'\033]0;{remove_terminal_sequences(message)}\007')
if not success and os.name == 'nt' and ctypes.windll.kernel32.GetConsoleWindow():
ctypes.windll.kernel32.SetConsoleTitleW(message)

if isinstance(progress_state, _ProgressState):
self._send_console_code(progress_state.get_ansi_escape(percent))

def save_console_title(self):
if not self.params.get('consoletitle') or self.params.get('simulate'):
@@ -984,6 +996,7 @@ class YoutubeDL:

def __enter__(self):
self.save_console_title()
self.to_console_title(progress_state=_ProgressState.INDETERMINATE)
return self

def save_cookies(self):
@@ -992,6 +1005,7 @@ class YoutubeDL:

def __exit__(self, *args):
self.restore_console_title()
self.to_console_title(progress_state=_ProgressState.HIDDEN)
self.close()

def close(self):

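For context on the `_ProgressState` calls above (the ConEmu progress support added in this release): a minimal sketch of the escape sequence such a state presumably maps to, assuming the conventional `OSC 9;4` form understood by ConEmu and Windows Terminal. The helper name below is illustrative, not the actual `_ProgressState` API:

# Sketch of the ConEmu/Windows Terminal taskbar-progress escape (assumed OSC 9;4).
# States: 0 = hidden, 1 = normal (uses percent), 3 = indeterminate.
def progress_escape(state, percent=None):
    return f'\033]9;4;{state};{0 if percent is None else int(percent)}\007'

# e.g. writing progress_escape(1, 42) to the console shows 42% progress;
# progress_escape(0) hides it again, matching the __exit__ call above.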
@@ -3993,15 +4007,6 @@ class YoutubeDL:
if not self.params.get('verbose'):
return

from . import _IN_CLI  # Must be delayed import

# These imports can be slow. So import them only as needed
from .extractor.extractors import _LAZY_LOADER
from .extractor.extractors import (
_PLUGIN_CLASSES as plugin_ies,
_PLUGIN_OVERRIDES as plugin_ie_overrides,
)

def get_encoding(stream):
ret = str(getattr(stream, 'encoding', f'missing ({type(stream).__name__})'))
additional_info = []
@@ -4040,17 +4045,18 @@ class YoutubeDL:
_make_label(ORIGIN, CHANNEL.partition('@')[2] or __version__, __version__),
f'[{RELEASE_GIT_HEAD[:9]}]' if RELEASE_GIT_HEAD else '',
'' if source == 'unknown' else f'({source})',
'' if _IN_CLI else 'API' if klass == YoutubeDL else f'API:{self.__module__}.{klass.__qualname__}',
'' if IN_CLI.value else 'API' if klass == YoutubeDL else f'API:{self.__module__}.{klass.__qualname__}',
delim=' '))

if not _IN_CLI:
if not IN_CLI.value:
write_debug(f'params: {self.params}')

if not _LAZY_LOADER:
if os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'):
write_debug('Lazy loading extractors is forcibly disabled')
else:
write_debug('Lazy loading extractors is disabled')
import_extractors()
lazy_extractors = LAZY_EXTRACTORS.value
if lazy_extractors is None:
write_debug('Lazy loading extractors is disabled')
elif not lazy_extractors:
write_debug('Lazy loading extractors is forcibly disabled')
if self.params['compat_opts']:
write_debug('Compatibility options: {}'.format(', '.join(self.params['compat_opts'])))

@@ -4079,24 +4085,27 @@ class YoutubeDL:

write_debug(f'Proxy map: {self.proxies}')
write_debug(f'Request Handlers: {", ".join(rh.RH_NAME for rh in self._request_director.handlers.values())}')
if os.environ.get('YTDLP_NO_PLUGINS'):
write_debug('Plugins are forcibly disabled')
return

for plugin_type, plugins in {'Extractor': plugin_ies, 'Post-Processor': plugin_pps}.items():
display_list = ['{}{}'.format(
klass.__name__, '' if klass.__name__ == name else f' as {name}')
for name, klass in plugins.items()]
for plugin_type, plugins in (('Extractor', plugin_ies), ('Post-Processor', plugin_pps)):
display_list = [
klass.__name__ if klass.__name__ == name else f'{klass.__name__} as {name}'
for name, klass in plugins.value.items()]
if plugin_type == 'Extractor':
display_list.extend(f'{plugins[-1].IE_NAME.partition("+")[2]} ({parent.__name__})'
for parent, plugins in plugin_ie_overrides.items())
for parent, plugins in plugin_ies_overrides.value.items())
if not display_list:
continue
write_debug(f'{plugin_type} Plugins: {", ".join(sorted(display_list))}')

plugin_dirs = plugin_directories()
if plugin_dirs:
write_debug(f'Plugin directories: {plugin_dirs}')
plugin_dirs_msg = 'none'
if not plugin_dirs.value:
plugin_dirs_msg = 'none (disabled)'
else:
found_plugin_directories = plugin_directories()
if found_plugin_directories:
plugin_dirs_msg = ', '.join(found_plugin_directories)

write_debug(f'Plugin directories: {plugin_dirs_msg}')

@functools.cached_property
def proxies(self):

@@ -19,7 +19,9 @@ from .downloader.external import get_external_downloader
from .extractor import list_extractor_classes
from .extractor.adobepass import MSO_INFO
from .networking.impersonate import ImpersonateTarget
from .globals import IN_CLI, plugin_dirs
from .options import parseOpts
from .plugins import load_all_plugins as _load_all_plugins
from .postprocessor import (
FFmpegExtractAudioPP,
FFmpegMergerPP,
@@ -33,7 +35,6 @@ from .postprocessor import (
)
from .update import Updater
from .utils import (
Config,
NO_DEFAULT,
POSTPROCESS_WHEN,
DateRange,
@@ -66,8 +67,6 @@ from .utils.networking import std_headers
from .utils._utils import _UnsafeExtensionError
from .YoutubeDL import YoutubeDL

_IN_CLI = False


def _exit(status=0, *args):
for msg in args:
@@ -433,6 +432,10 @@ def validate_options(opts):
}

# Other options
if opts.plugin_dirs is None:
opts.plugin_dirs = ['default']

if opts.playlist_items is not None:
try:
tuple(PlaylistEntries.parse_playlist_items(opts.playlist_items))
@@ -973,11 +976,6 @@ def _real_main(argv=None):

parser, opts, all_urls, ydl_opts = parse_options(argv)

# HACK: Set the plugin dirs early on
# TODO(coletdjnz): remove when plugin globals system is implemented
if opts.plugin_dirs is not None:
Config._plugin_dirs = list(map(expand_path, opts.plugin_dirs))

# Dump user agent
if opts.dump_user_agent:
ua = traverse_obj(opts.headers, 'User-Agent', casesense=False, default=std_headers['User-Agent'])
@@ -992,6 +990,11 @@ def _real_main(argv=None):
if opts.ffmpeg_location:
FFmpegPostProcessor._ffmpeg_location.set(opts.ffmpeg_location)

# load all plugins into the global lookup
plugin_dirs.value = opts.plugin_dirs
if plugin_dirs.value:
_load_all_plugins()

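The two lines above are the CLI path; for embedders, the equivalent flow through the new globals would presumably look like the sketch below. The extra plugin path is illustrative, and 'default' is the sentinel for the default search locations, as seen in validate_options above:

from yt_dlp.globals import plugin_dirs
from yt_dlp.plugins import load_all_plugins

plugin_dirs.value = ['default', '/opt/my-plugins']  # illustrative extra directory
load_all_plugins()  # populate the global extractor/postprocessor registries

# Note: YoutubeDL.__init__ also calls load_all_plugins() as an API-compat
# fallback when nothing has been loaded yet, as shown earlier in this diff.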
with YoutubeDL(ydl_opts) as ydl:
pre_process = opts.update_self or opts.rm_cachedir
actual_use = all_urls or opts.load_info_filename
@@ -1091,8 +1094,7 @@ def _real_main(argv=None):


def main(argv=None):
global _IN_CLI
_IN_CLI = True
IN_CLI.value = True
try:
_exit(*variadic(_real_main(argv)))
except (CookieLoadError, DownloadError):

@@ -83,7 +83,7 @@ def aes_ecb_encrypt(data, key, iv=None):
@returns {int[]} encrypted data
"""
expanded_key = key_expansion(key)
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
block_count = ceil(len(data) / BLOCK_SIZE_BYTES)

encrypted_data = []
for i in range(block_count):
@@ -103,7 +103,7 @@ def aes_ecb_decrypt(data, key, iv=None):
@returns {int[]} decrypted data
"""
expanded_key = key_expansion(key)
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
block_count = ceil(len(data) / BLOCK_SIZE_BYTES)

encrypted_data = []
for i in range(block_count):
@@ -134,7 +134,7 @@ def aes_ctr_encrypt(data, key, iv):
@returns {int[]} encrypted data
"""
expanded_key = key_expansion(key)
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
block_count = ceil(len(data) / BLOCK_SIZE_BYTES)
counter = iter_vector(iv)

encrypted_data = []
@@ -158,7 +158,7 @@ def aes_cbc_decrypt(data, key, iv):
@returns {int[]} decrypted data
"""
expanded_key = key_expansion(key)
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
block_count = ceil(len(data) / BLOCK_SIZE_BYTES)

decrypted_data = []
previous_cipher_block = iv
@@ -183,7 +183,7 @@ def aes_cbc_encrypt(data, key, iv, *, padding_mode='pkcs7'):
@returns {int[]} encrypted data
"""
expanded_key = key_expansion(key)
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
block_count = ceil(len(data) / BLOCK_SIZE_BYTES)

encrypted_data = []
previous_cipher_block = iv

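The repeated change above drops the Python 2-era wrapping: under Python 3, `/` is true division and `math.ceil` already returns an int, so the two expressions agree. A quick check:

from math import ceil

data_len, block = 10, 16
assert int(ceil(float(data_len) / block)) == ceil(data_len / block) == 1
# an all-integer equivalent, for reference: -(-data_len // block)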
@@ -35,6 +35,7 @@ from .rtmp import RtmpFD
from .rtsp import RtspFD
from .websocket import WebSocketFragmentFD
from .youtube_live_chat import YoutubeLiveChatFD
from .bunnycdn import BunnyCdnFD

PROTOCOL_MAP = {
'rtmp': RtmpFD,
@@ -55,6 +56,7 @@ PROTOCOL_MAP = {
'websocket_frag': WebSocketFragmentFD,
'youtube_live_chat': YoutubeLiveChatFD,
'youtube_live_chat_replay': YoutubeLiveChatFD,
'bunnycdn': BunnyCdnFD,
}

50
yt_dlp/downloader/bunnycdn.py
Normal file
@@ -0,0 +1,50 @@
import hashlib
import random
import threading

from .common import FileDownloader
from . import HlsFD
from ..networking import Request
from ..networking.exceptions import network_exceptions


class BunnyCdnFD(FileDownloader):
"""
Downloads from BunnyCDN with required pings
Note, this is not a part of public API, and will be removed without notice.
DO NOT USE
"""

def real_download(self, filename, info_dict):
self.to_screen(f'[{self.FD_NAME}] Downloading from BunnyCDN')

fd = HlsFD(self.ydl, self.params)

stop_event = threading.Event()
ping_thread = threading.Thread(target=self.ping_thread, args=(stop_event,), kwargs=info_dict['_bunnycdn_ping_data'])
ping_thread.start()

try:
return fd.real_download(filename, info_dict)
finally:
stop_event.set()

def ping_thread(self, stop_event, url, headers, secret, context_id):
# Site sends ping every 4 seconds, but this throttles the download. Pinging every 2 seconds seems to work.
ping_interval = 2
# Hard coded resolution as it doesn't seem to matter
res = 1080
paused = 'false'
current_time = 0

while not stop_event.wait(ping_interval):
current_time += ping_interval

time = current_time + round(random.random(), 6)
md5_hash = hashlib.md5(f'{secret}_{context_id}_{time}_{paused}_{res}'.encode()).hexdigest()
ping_url = f'{url}?hash={md5_hash}&time={time}&paused={paused}&resolution={res}'

try:
self.ydl.urlopen(Request(ping_url, headers=headers)).read()
except network_exceptions as e:
self.to_screen(f'[{self.FD_NAME}] Ping failed: {e}')
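For context, BunnyCdnFD is selected through the 'bunnycdn' protocol added to PROTOCOL_MAP above, and `ping_thread` receives its keyword arguments from the `_bunnycdn_ping_data` dict that the BunnyCDN extractor (later in this diff) places in the info dict. An illustrative fragment of that contract, with placeholder field values:

info_dict = {
    'protocol': 'bunnycdn',
    '_bunnycdn_ping_data': {
        'url': 'https://iframe.mediadelivery.net/.../ping',  # placeholder ping endpoint
        'headers': {'Referer': 'https://iframe.mediadelivery.net/...'},
        'secret': '<secret>',         # from the stream src query string
        'context_id': '<contextId>',  # from the stream src query string
    },
}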
@@ -31,6 +31,7 @@ from ..utils import (
timetuple_from_msec,
try_call,
)
from ..utils._utils import _ProgressState


class FileDownloader:
@@ -333,7 +334,7 @@ class FileDownloader:
progress_dict), s.get('progress_idx') or 0)
self.to_console_title(self.ydl.evaluate_outtmpl(
progress_template.get('download-title') or 'yt-dlp %(progress._default_template)s',
progress_dict))
progress_dict), _ProgressState.from_dict(s), s.get('_percent'))

def _format_progress(self, *args, **kwargs):
return self.ydl._format_text(
@@ -357,6 +358,7 @@ class FileDownloader:
'_speed_str': self.format_speed(speed).strip(),
'_total_bytes_str': _format_bytes('total_bytes'),
'_elapsed_str': self.format_seconds(s.get('elapsed')),
'_percent': 100.0,
'_percent_str': self.format_percent(100),
})
self._report_progress_status(s, join_nonempty(
@@ -375,13 +377,15 @@ class FileDownloader:
return
self._progress_delta_time += update_delta

progress = try_call(
lambda: 100 * s['downloaded_bytes'] / s['total_bytes'],
lambda: 100 * s['downloaded_bytes'] / s['total_bytes_estimate'],
lambda: s['downloaded_bytes'] == 0 and 0)
s.update({
'_eta_str': self.format_eta(s.get('eta')).strip(),
'_speed_str': self.format_speed(s.get('speed')),
'_percent_str': self.format_percent(try_call(
lambda: 100 * s['downloaded_bytes'] / s['total_bytes'],
lambda: 100 * s['downloaded_bytes'] / s['total_bytes_estimate'],
lambda: s['downloaded_bytes'] == 0 and 0)),
'_percent': progress,
'_percent_str': self.format_percent(progress),
'_total_bytes_str': _format_bytes('total_bytes'),
'_total_bytes_estimate_str': _format_bytes('total_bytes_estimate'),
'_downloaded_bytes_str': _format_bytes('downloaded_bytes'),

@@ -457,8 +457,6 @@ class FFmpegFD(ExternalFD):

@classmethod
def available(cls, path=None):
# TODO: Fix path for ffmpeg
# Fixme: This may be wrong when --ffmpeg-location is used
return FFmpegPostProcessor().available

def on_process_started(self, proc, stdin):

@@ -1,16 +1,25 @@
from ..compat.compat_utils import passthrough_module
from ..globals import extractors as _extractors_context
from ..globals import plugin_ies as _plugin_ies_context
from ..plugins import PluginSpec, register_plugin_spec

passthrough_module(__name__, '.extractors')
del passthrough_module

register_plugin_spec(PluginSpec(
module_name='extractor',
suffix='IE',
destination=_extractors_context,
plugin_destination=_plugin_ies_context,
))


def gen_extractor_classes():
""" Return a list of supported extractors.
The order does matter; the first extractor matched is the one handling the URL.
"""
from .extractors import _ALL_CLASSES

return _ALL_CLASSES
import_extractors()
return list(_extractors_context.value.values())


def gen_extractors():
@@ -37,6 +46,9 @@ def list_extractors(age_limit=None):

def get_info_extractor(ie_name):
"""Returns the info extractor class with the given ie_name"""
from . import extractors
import_extractors()
return _extractors_context.value[f'{ie_name}IE']

return getattr(extractors, f'{ie_name}IE')

def import_extractors():
from . import extractors  # noqa: F401

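After this refactor, name-based lookup goes through the global registry instead of module attributes; usage stays the same, roughly:

from yt_dlp.extractor import get_info_extractor

YoutubeIE = get_info_extractor('Youtube')  # resolves 'YoutubeIE' from the registry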
@@ -312,6 +312,7 @@ from .brilliantpala import (
)
from .bundesliga import BundesligaIE
from .bundestag import BundestagIE
from .bunnycdn import BunnyCdnIE
from .businessinsider import BusinessInsiderIE
from .buzzfeed import BuzzFeedIE
from .byutv import BYUtvIE
@@ -335,6 +336,7 @@ from .canal1 import Canal1IE
from .canalalpha import CanalAlphaIE
from .canalc2 import Canalc2IE
from .canalplus import CanalplusIE
from .canalsurmas import CanalsurmasIE
from .caracoltv import CaracolTvPlayIE
from .cartoonnetwork import CartoonNetworkIE
from .cbc import (
@@ -1053,6 +1055,7 @@ from .livestream import (
)
from .livestreamfails import LivestreamfailsIE
from .lnk import LnkIE
from .loco import LocoIE
from .loom import (
LoomFolderIE,
LoomIE,
@@ -1881,6 +1884,8 @@ from .skyit import (
SkyItVideoIE,
SkyItVideoLiveIE,
TV8ItIE,
TV8ItLiveIE,
TV8ItPlaylistIE,
)
from .skylinewebcams import SkylineWebcamsIE
from .skynewsarabia import (
@@ -1894,6 +1899,7 @@ from .slutload import SlutloadIE
from .smotrim import SmotrimIE
from .snapchat import SnapchatSpotlightIE
from .snotr import SnotrIE
from .softwhiteunderbelly import SoftWhiteUnderbellyIE
from .sohu import (
SohuIE,
SohuVIE,
@@ -2222,6 +2228,7 @@ from .tvplay import (
TVPlayIE,
)
from .tvplayer import TVPlayerIE
from .tvw import TvwIE
from .tweakers import TweakersIE
from .twentymin import TwentyMinutenIE
from .twentythreevideo import TwentyThreeVideoIE
@@ -2396,7 +2403,6 @@ from .voxmedia import (
from .vrt import (
VRTIE,
DagelijkseKostIE,
KetnetIE,
Radio1BeIE,
VrtNUIE,
)

@@ -1,3 +1,4 @@
import datetime as dt
import functools

from .common import InfoExtractor
@@ -10,7 +11,7 @@ from ..utils import (
filter_dict,
int_or_none,
orderedSet,
unified_timestamp,
parse_iso8601,
url_or_none,
urlencode_postdata,
urljoin,
@@ -87,9 +88,9 @@ class AfreecaTVIE(AfreecaTVBaseIE):
'uploader_id': 'rlantnghks',
'uploader': '페이즈으',
'duration': 10840,
'thumbnail': r're:https?://videoimg\.sooplive\.co\.kr/.+',
'thumbnail': r're:https?://videoimg\.(?:sooplive\.co\.kr|afreecatv\.com)/.+',
'upload_date': '20230108',
'timestamp': 1673218805,
'timestamp': 1673186405,
'title': '젠지 페이즈',
},
'params': {
@@ -102,7 +103,7 @@ class AfreecaTVIE(AfreecaTVBaseIE):
'id': '20170411_BE689A0E_190960999_1_2_h',
'ext': 'mp4',
'title': '혼자사는여자집',
'thumbnail': r're:https?://(?:video|st)img\.sooplive\.co\.kr/.+',
'thumbnail': r're:https?://(?:video|st)img\.(?:sooplive\.co\.kr|afreecatv\.com)/.+',
'uploader': '♥이슬이',
'uploader_id': 'dasl8121',
'upload_date': '20170411',
@@ -119,7 +120,7 @@ class AfreecaTVIE(AfreecaTVBaseIE):
'id': '20180327_27901457_202289533_1',
'ext': 'mp4',
'title': '[생]빨개요♥ (part 1)',
'thumbnail': r're:https?://(?:video|st)img\.sooplive\.co\.kr/.+',
'thumbnail': r're:https?://(?:video|st)img\.(?:sooplive\.co\.kr|afreecatv\.com)/.+',
'uploader': '[SA]서아',
'uploader_id': 'bjdyrksu',
'upload_date': '20180327',
@@ -187,7 +188,7 @@ class AfreecaTVIE(AfreecaTVBaseIE):
'formats': formats,
**traverse_obj(file_element, {
'duration': ('duration', {int_or_none(scale=1000)}),
'timestamp': ('file_start', {unified_timestamp}),
'timestamp': ('file_start', {parse_iso8601(delimiter=' ', timezone=dt.timedelta(hours=9))}),
}),
})

@@ -370,7 +371,7 @@ class AfreecaTVLiveIE(AfreecaTVBaseIE):
'title': channel_info.get('TITLE') or station_info.get('station_title'),
'uploader': channel_info.get('BJNICK') or station_info.get('station_name'),
'uploader_id': broadcaster_id,
'timestamp': unified_timestamp(station_info.get('broad_start')),
'timestamp': parse_iso8601(station_info.get('broad_start'), delimiter=' ', timezone=dt.timedelta(hours=9)),
'formats': formats,
'is_live': True,
'http_headers': {'Referer': url},

@@ -1,7 +1,6 @@
import json

from .common import InfoExtractor
from .kaltura import KalturaIE
from ..utils.traversal import require, traverse_obj


class AZMedienIE(InfoExtractor):
@@ -9,15 +8,15 @@ class AZMedienIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:www\.|tv\.)?
(?P<host>
(?:
telezueri\.ch|
telebaern\.tv|
telem1\.ch|
tvo-online\.ch
)/
[^/]+/
[^/?#]+/
(?P<id>
[^/]+-(?P<article_id>\d+)
[^/?#]+-\d+
)
(?:
\#video=
@@ -47,19 +46,17 @@ class AZMedienIE(InfoExtractor):
'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1',
'only_matching': True,
}]
_API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/a4016f65fe62b81dc6664dd9f4910e4ab40383be'
_PARTNER_ID = '1719221'

def _real_extract(self, url):
host, display_id, article_id, entry_id = self._match_valid_url(url).groups()
display_id, entry_id = self._match_valid_url(url).groups()

if not entry_id:
entry_id = self._download_json(
self._API_TEMPL % (host, host.split('.')[0]), display_id, query={
'variables': json.dumps({
'contextId': 'NewsArticle:' + article_id,
}),
})['data']['context']['mainAsset']['video']['kaltura']['kalturaId']
webpage = self._download_webpage(url, display_id)
data = self._search_json(
r'window\.__APOLLO_STATE__\s*=', webpage, 'video data', display_id)
entry_id = traverse_obj(data, (
lambda _, v: v['__typename'] == 'KalturaData', 'kalturaId', any, {require('kaltura id')}))

return self.url_result(
f'kaltura:{self._PARTNER_ID}:{entry_id}',

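The rewritten extractor reads the Kaltura id out of the page's Apollo state instead of calling the GraphQL API; an illustrative shape of the data it traverses (only __typename and kalturaId are relied upon above; the cache key is invented for the sketch):

from yt_dlp.utils.traversal import traverse_obj

data = {
    'SomeCacheKey': {'__typename': 'KalturaData', 'kalturaId': '0_abc123'},
}
entry_id = traverse_obj(data, (
    lambda _, v: v['__typename'] == 'KalturaData', 'kalturaId', any))  # -> '0_abc123'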
@@ -86,7 +86,7 @@ class BandlabBaseIE(InfoExtractor):
'webpage_url': (
'id', ({value(url)}, {format_field(template='https://www.bandlab.com/post/%s')}), filter, any),
'url': ('video', 'url', {url_or_none}),
'title': ('caption', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=50)}),
'title': ('caption', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=72)}),
'description': ('caption', {str}),
'thumbnail': ('video', 'picture', 'url', {url_or_none}),
'view_count': ('video', 'counters', 'plays', {int_or_none}),
@@ -120,7 +120,7 @@ class BandlabIE(BandlabBaseIE):
'duration': 54.629999999999995,
'title': 'sweet black',
'upload_date': '20231210',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
'genres': ['Lofi'],
'uploader': 'ender milze',
'comment_count': int,
@@ -142,7 +142,7 @@ class BandlabIE(BandlabBaseIE):
'duration': 54.629999999999995,
'title': 'sweet black',
'upload_date': '20231210',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/songs/fa082beb-b856-4730-9170-a57e4e32cc2c/',
'genres': ['Lofi'],
'uploader': 'ender milze',
'comment_count': int,
@@ -158,7 +158,7 @@ class BandlabIE(BandlabBaseIE):
'comment_count': int,
'genres': ['Other'],
'uploader_id': 'user8353034818103753',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/51b18363-da23-4b9b-a29c-2933a3e561ca/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/songs/51b18363-da23-4b9b-a29c-2933a3e561ca/',
'timestamp': 1709625771,
'track': 'PodcastMaerchen4b',
'duration': 468.14,
@@ -178,7 +178,7 @@ class BandlabIE(BandlabBaseIE):
'id': '110343fc-148b-ea11-96d2-0003ffd1fc09',
'ext': 'm4a',
'timestamp': 1588273294,
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/users/b612e533-e4f7-4542-9f50-3fcfd8dd822c/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/users/b612e533-e4f7-4542-9f50-3fcfd8dd822c/',
'description': 'Final Revision.',
'title': 'Replay ( Instrumental)',
'uploader': 'David R Sparks',
@@ -200,7 +200,7 @@ class BandlabIE(BandlabBaseIE):
'id': '5cdf9036-3857-ef11-991a-6045bd36e0d9',
'ext': 'mp4',
'duration': 44.705,
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/videos/67c6cef1-cef6-40d3-831e-a55bc1dcb972/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/videos/67c6cef1-cef6-40d3-831e-a55bc1dcb972/',
'comment_count': int,
'title': 'backing vocals',
'uploader_id': 'marliashya',
@@ -224,7 +224,7 @@ class BandlabIE(BandlabBaseIE):
'view_count': int,
'track': 'Positronic Meltdown',
'duration': 318.55,
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/songs/87165bc3-5439-496e-b1f7-a9f13b541ff2/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/songs/87165bc3-5439-496e-b1f7-a9f13b541ff2/',
'description': 'Checkout my tracks at AOMX http://aomxsounds.com/',
'uploader_id': 'microfreaks',
'title': 'Positronic Meltdown',
@@ -246,7 +246,7 @@ class BandlabIE(BandlabBaseIE):
'comment_count': int,
'uploader': 'Sorakime',
'uploader_id': 'sorakime',
'thumbnail': 'https://bandlabimages.azureedge.net/v1.0/users/572a351a-0f3a-4c6a-ac39-1a5defdeeb1c/',
'thumbnail': 'https://bl-prod-images.azureedge.net/v1.0/users/572a351a-0f3a-4c6a-ac39-1a5defdeeb1c/',
'timestamp': 1691162128,
'upload_date': '20230804',
'media_type': 'track',

@@ -1596,16 +1596,16 @@ class BilibiliPlaylistIE(BilibiliSpaceListBaseIE):

webpage = self._download_webpage(url, list_id)
initial_state = self._search_json(r'window\.__INITIAL_STATE__\s*=', webpage, 'initial state', list_id)
if traverse_obj(initial_state, ('error', 'code', {int_or_none})) != 200:
error_code = traverse_obj(initial_state, ('error', 'trueCode', {int_or_none}))
error_message = traverse_obj(initial_state, ('error', 'message', {str_or_none}))
error = traverse_obj(initial_state, (('error', 'listError'), all, lambda _, v: v['code'], any))
if error and error['code'] != 200:
error_code = error.get('trueCode')
if error_code == -400 and list_id == 'watchlater':
self.raise_login_required('You need to login to access your watchlater playlist')
elif error_code == -403:
self.raise_login_required('This is a private playlist. You need to login as its owner')
elif error_code == 11010:
raise ExtractorError('Playlist is no longer available', expected=True)
raise ExtractorError(f'Could not access playlist: {error_code} {error_message}')
raise ExtractorError(f'Could not access playlist: {error_code} {error.get("message")}')

query = {
'ps': 20,

@@ -53,7 +53,7 @@ class BlueskyIE(InfoExtractor):
'channel_id': 'did:plc:z72i7hdynmk6r22z27h6tvur',
'channel_url': 'https://bsky.app/profile/did:plc:z72i7hdynmk6r22z27h6tvur',
'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
'title': 'Bluesky now has video! Update your app to versi...',
'title': 'Bluesky now has video! Update your app to version 1.91 or refresh on ...',
'alt_title': 'Bluesky video feature announcement',
'description': r're:(?s)Bluesky now has video! .{239}',
'upload_date': '20240911',
@@ -172,7 +172,7 @@ class BlueskyIE(InfoExtractor):
'channel_id': 'did:plc:z72i7hdynmk6r22z27h6tvur',
'channel_url': 'https://bsky.app/profile/did:plc:z72i7hdynmk6r22z27h6tvur',
'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
'title': 'Bluesky now has video! Update your app to versi...',
'title': 'Bluesky now has video! Update your app to version 1.91 or refresh on ...',
'alt_title': 'Bluesky video feature announcement',
'description': r're:(?s)Bluesky now has video! .{239}',
'upload_date': '20240911',
@@ -191,7 +191,7 @@ class BlueskyIE(InfoExtractor):
'info_dict': {
'id': '3l7rdfxhyds2f',
'ext': 'mp4',
'uploader': 'cinnamon',
'uploader': 'cinnamon 🐇 🏳️⚧️',
'uploader_id': 'cinny.bun.how',
'uploader_url': 'https://bsky.app/profile/cinny.bun.how',
'channel_id': 'did:plc:7x6rtuenkuvxq3zsvffp2ide',
@@ -255,7 +255,7 @@ class BlueskyIE(InfoExtractor):
'info_dict': {
'id': '3l77u64l7le2e',
'ext': 'mp4',
'title': 'hearing people on twitter say that bluesky isn\'...',
'title': "hearing people on twitter say that bluesky isn't funny yet so post t...",
'like_count': int,
'uploader_id': 'thafnine.net',
'uploader_url': 'https://bsky.app/profile/thafnine.net',
@@ -387,7 +387,7 @@ class BlueskyIE(InfoExtractor):
'age_limit': (
'labels', ..., 'val', {lambda x: 18 if x in ('sexual', 'porn', 'graphic-media') else None}, any),
'description': (*record_path, 'text', {str}, filter),
'title': (*record_path, 'text', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=50)}),
'title': (*record_path, 'text', {lambda x: x.replace('\n', ' ')}, {truncate_string(left=72)}),
}),
})
return entries

178
yt_dlp/extractor/bunnycdn.py
Normal file
@@ -0,0 +1,178 @@
import json

from .common import InfoExtractor
from ..networking import HEADRequest
from ..utils import (
ExtractorError,
extract_attributes,
int_or_none,
parse_qs,
smuggle_url,
unsmuggle_url,
url_or_none,
urlhandle_detect_ext,
)
from ..utils.traversal import find_element, traverse_obj


class BunnyCdnIE(InfoExtractor):
_VALID_URL = r'https?://(?:iframe\.mediadelivery\.net|video\.bunnycdn\.com)/(?:embed|play)/(?P<library_id>\d+)/(?P<id>[\da-f-]+)'
_EMBED_REGEX = [rf'<iframe[^>]+src=[\'"](?P<url>{_VALID_URL}[^\'"]*)[\'"]']
_TESTS = [{
'url': 'https://iframe.mediadelivery.net/embed/113933/e73edec1-e381-4c8b-ae73-717a140e0924',
'info_dict': {
'id': 'e73edec1-e381-4c8b-ae73-717a140e0924',
'ext': 'mp4',
'title': 'mistress morgana (3).mp4',
'description': '',
'timestamp': 1693251673,
'thumbnail': r're:^https?://.*\.b-cdn\.net/e73edec1-e381-4c8b-ae73-717a140e0924/thumbnail\.jpg',
'duration': 7.0,
'upload_date': '20230828',
},
'params': {'skip_download': True},
}, {
'url': 'https://iframe.mediadelivery.net/play/136145/32e34c4b-0d72-437c-9abb-05e67657da34',
'info_dict': {
'id': '32e34c4b-0d72-437c-9abb-05e67657da34',
'ext': 'mp4',
'timestamp': 1691145748,
'thumbnail': r're:^https?://.*\.b-cdn\.net/32e34c4b-0d72-437c-9abb-05e67657da34/thumbnail_9172dc16\.jpg',
'duration': 106.0,
'description': 'md5:981a3e899a5c78352b21ed8b2f1efd81',
'upload_date': '20230804',
'title': 'Sanela ist Teil der #arbeitsmarktkraft',
},
'params': {'skip_download': True},
}, {
# Stream requires activation and pings
'url': 'https://iframe.mediadelivery.net/embed/200867/2e8545ec-509d-4571-b855-4cf0235ccd75',
'info_dict': {
'id': '2e8545ec-509d-4571-b855-4cf0235ccd75',
'ext': 'mp4',
'timestamp': 1708497752,
'title': 'netflix part 1',
'duration': 3959.0,
'description': '',
'upload_date': '20240221',
'thumbnail': r're:^https?://.*\.b-cdn\.net/2e8545ec-509d-4571-b855-4cf0235ccd75/thumbnail\.jpg',
},
'params': {'skip_download': True},
}]
_WEBPAGE_TESTS = [{
# Stream requires Referer
'url': 'https://conword.io/',
'info_dict': {
'id': '3a5d863e-9cd6-447e-b6ef-e289af50b349',
'ext': 'mp4',
'title': 'Conword bei der Stadt Köln und Stadt Dortmund',
'description': '',
'upload_date': '20231031',
'duration': 31.0,
'thumbnail': 'https://video.watchuh.com/3a5d863e-9cd6-447e-b6ef-e289af50b349/thumbnail.jpg',
'timestamp': 1698783879,
},
'params': {'skip_download': True},
}, {
# URL requires token and expires
'url': 'https://www.stockphotos.com/video/moscow-subway-the-train-is-arriving-at-the-park-kultury-station-10017830',
'info_dict': {
'id': '0b02fa20-4e8c-4140-8f87-f64d820a3386',
'ext': 'mp4',
'thumbnail': r're:^https?://.*\.b-cdn\.net/0b02fa20-4e8c-4140-8f87-f64d820a3386/thumbnail\.jpg',
'title': 'Moscow subway. The train is arriving at the Park Kultury station.',
'upload_date': '20240531',
'duration': 18.0,
'timestamp': 1717152269,
'description': '',
},
'params': {'skip_download': True},
}]

@classmethod
def _extract_embed_urls(cls, url, webpage):
for embed_url in super()._extract_embed_urls(url, webpage):
yield smuggle_url(embed_url, {'Referer': url})

def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})

video_id, library_id = self._match_valid_url(url).group('id', 'library_id')
webpage = self._download_webpage(
f'https://iframe.mediadelivery.net/embed/{library_id}/{video_id}', video_id,
headers=traverse_obj(smuggled_data, {'Referer': 'Referer'}),
query=traverse_obj(parse_qs(url), {'token': 'token', 'expires': 'expires'}))

if (html_title := self._html_extract_title(webpage, default=None)) == '403':
raise ExtractorError(
'This video is inaccessible. Setting a Referer header '
'might be required to access the video', expected=True)
elif html_title == '404':
raise ExtractorError('This video does not exist', expected=True)

headers = {'Referer': url}

info = traverse_obj(self._parse_html5_media_entries(url, webpage, video_id, _headers=headers), 0) or {}
formats = info.get('formats') or []
subtitles = info.get('subtitles') or {}

original_url = self._search_regex(
r'(?:var|const|let)\s+originalUrl\s*=\s*["\']([^"\']+)["\']', webpage, 'original url', default=None)
if url_or_none(original_url):
urlh = self._request_webpage(
HEADRequest(original_url), video_id=video_id, note='Checking original',
headers=headers, fatal=False, expected_status=(403, 404))
if urlh and urlh.status == 200:
formats.append({
'url': original_url,
'format_id': 'source',
'quality': 1,
'http_headers': headers,
'ext': urlhandle_detect_ext(urlh, default='mp4'),
'filesize': int_or_none(urlh.get_header('Content-Length')),
})

# MediaCage Streams require activation and pings
src_url = self._search_regex(
r'\.setAttribute\([\'"]src[\'"],\s*[\'"]([^\'"]+)[\'"]\)', webpage, 'src url', default=None)
activation_url = self._search_regex(
r'loadUrl\([\'"]([^\'"]+/activate)[\'"]', webpage, 'activation url', default=None)
ping_url = self._search_regex(
r'loadUrl\([\'"]([^\'"]+/ping)[\'"]', webpage, 'ping url', default=None)
secret = traverse_obj(parse_qs(src_url), ('secret', 0))
context_id = traverse_obj(parse_qs(src_url), ('contextId', 0))
ping_data = {}
if src_url and activation_url and ping_url and secret and context_id:
self._download_webpage(
activation_url, video_id, headers=headers, note='Downloading activation data')

fmts, subs = self._extract_m3u8_formats_and_subtitles(
src_url, video_id, 'mp4', headers=headers, m3u8_id='hls', fatal=False)
for fmt in fmts:
fmt.update({
'protocol': 'bunnycdn',
'http_headers': headers,
})
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)

ping_data = {
'_bunnycdn_ping_data': {
'url': ping_url,
'headers': headers,
'secret': secret,
'context_id': context_id,
},
}

return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
**traverse_obj(webpage, ({find_element(id='main-video', html=True)}, {extract_attributes}, {
'title': ('data-plyr-config', {json.loads}, 'title', {str}),
'thumbnail': ('data-poster', {url_or_none}),
})),
**ping_data,
**self._search_json_ld(webpage, video_id, fatal=False),
}
84
yt_dlp/extractor/canalsurmas.py
Normal file
@@ -0,0 +1,84 @@
import json
import time

from .common import InfoExtractor
from ..utils import (
determine_ext,
float_or_none,
jwt_decode_hs256,
parse_iso8601,
url_or_none,
variadic,
)
from ..utils.traversal import traverse_obj


class CanalsurmasIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?canalsurmas\.es/videos/(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.canalsurmas.es/videos/44006-el-gran-queo-1-lora-del-rio-sevilla-20072014',
'md5': '861f86fdc1221175e15523047d0087ef',
'info_dict': {
'id': '44006',
'ext': 'mp4',
'title': 'Lora del Río (Sevilla)',
'description': 'md5:3d9ee40a9b1b26ed8259e6b71ed27b8b',
'thumbnail': 'https://cdn2.rtva.interactvty.com/content_cards/00f3e8f67b0a4f3b90a4a14618a48b0d.jpg',
'timestamp': 1648123182,
'upload_date': '20220324',
},
}]
_API_BASE = 'https://api-rtva.interactvty.com'
_access_token = None

@staticmethod
def _is_jwt_expired(token):
return jwt_decode_hs256(token)['exp'] - time.time() < 300

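The helper above refreshes the token five minutes before its 'exp' claim runs out; schematically (payload illustrative):

import time

claims = {'exp': time.time() + 200}                # decoded JWT payload, illustrative
needs_refresh = claims['exp'] - time.time() < 300  # True: expires within 5 minutes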
def _call_api(self, endpoint, video_id, fields=None):
|
||||
if not self._access_token or self._is_jwt_expired(self._access_token):
|
||||
self._access_token = self._download_json(
|
||||
f'{self._API_BASE}/jwt/token/', None,
|
||||
'Downloading access token', 'Failed to download access token',
|
||||
headers={'Content-Type': 'application/json'},
|
||||
data=json.dumps({
|
||||
'username': 'canalsur_demo',
|
||||
'password': 'dsUBXUcI',
|
||||
}).encode())['access']
|
||||
|
||||
return self._download_json(
|
||||
f'{self._API_BASE}/api/2.0/contents/{endpoint}/{video_id}/', video_id,
|
||||
f'Downloading {endpoint} API JSON', f'Failed to download {endpoint} API JSON',
|
||||
headers={'Authorization': f'jwtok {self._access_token}'},
|
||||
query={'optional_fields': ','.join(variadic(fields))} if fields else None)
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
video_info = self._call_api('content', video_id, fields=[
|
||||
'description', 'image', 'duration', 'created_at', 'tags',
|
||||
])
|
||||
stream_info = self._call_api('content_resources', video_id, 'media_url')
|
||||
|
||||
formats, subtitles = [], {}
|
||||
for stream_url in traverse_obj(stream_info, ('results', ..., 'media_url', {url_or_none})):
|
||||
if determine_ext(stream_url) == 'm3u8':
|
||||
fmts, subs = self._extract_m3u8_formats_and_subtitles(
|
||||
stream_url, video_id, m3u8_id='hls', fatal=False)
|
||||
formats.extend(fmts)
|
||||
self._merge_subtitles(subs, target=subtitles)
|
||||
else:
|
||||
formats.append({'url': stream_url})
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
**traverse_obj(video_info, {
|
||||
'title': ('name', {str.strip}),
|
||||
'description': ('description', {str}),
|
||||
'thumbnail': ('image', {url_or_none}),
|
||||
'duration': ('duration', {float_or_none}),
|
||||
'timestamp': ('created_at', {parse_iso8601}),
|
||||
'tags': ('tags', ..., {str}),
|
||||
}),
|
||||
}
|
||||
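The new extractor above and the CBC Gem rework below share the same token-freshness idiom: `jwt_decode_hs256(token)['exp'] - time.time() < 300`, i.e. treat a JWT as expired five minutes early. A minimal stdlib sketch of what decoding the `exp` claim involves, assuming a standard base64url-encoded JWT payload (illustrative only; yt-dlp's own helper is `jwt_decode_hs256`, and the CBC diff below does the same by hand in `_get_claims_token_expiry`):

import base64
import json
import time

def jwt_exp(token):
    # The payload is the second dot-separated segment; re-pad for base64url
    payload = token.split('.')[1]
    data = json.loads(base64.urlsafe_b64decode(payload + '=' * (-len(payload) % 4)))
    return data['exp']  # Unix timestamp at which the token expires

def is_jwt_expired(token, leeway=300):
    # Consider the token expired `leeway` seconds before its real expiry
    return jwt_exp(token) - time.time() < leeway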
@@ -1,17 +1,17 @@
import base64
import functools
import json
import re
import time
import urllib.parse

from .common import InfoExtractor
from ..networking import HEADRequest
from ..networking.exceptions import HTTPError
from ..utils import (
    ExtractorError,
    float_or_none,
    int_or_none,
    js_to_json,
    jwt_decode_hs256,
    mimetype2ext,
    orderedSet,
    parse_age_limit,

@@ -24,6 +24,7 @@ from ..utils import (
    update_url,
    url_basename,
    url_or_none,
    urlencode_postdata,
)
from ..utils.traversal import require, traverse_obj, trim_str

@@ -608,66 +609,82 @@ class CBCGemIE(CBCGemBaseIE):
        'only_matching': True,
    }]

    _TOKEN_API_KEY = '3f4beddd-2061-49b0-ae80-6f1f2ed65b37'
    _CLIENT_ID = 'fc05b0ee-3865-4400-a3cc-3da82c330c23'
    _refresh_token = None
    _access_token = None
    _claims_token = None

    def _new_claims_token(self, email, password):
        data = json.dumps({
            'email': email,
            'password': password,
        }).encode()
        headers = {'content-type': 'application/json'}
        query = {'apikey': self._TOKEN_API_KEY}
        resp = self._download_json('https://api.loginradius.com/identity/v2/auth/login',
                                   None, data=data, headers=headers, query=query)
        access_token = resp['access_token']
    @functools.cached_property
    def _ropc_settings(self):
        return self._download_json(
            'https://services.radio-canada.ca/ott/catalog/v1/gem/settings', None,
            'Downloading site settings', query={'device': 'web'})['identityManagement']['ropc']

        query = {
            'access_token': access_token,
            'apikey': self._TOKEN_API_KEY,
            'jwtapp': 'jwt',
        }
        resp = self._download_json('https://cloud-api.loginradius.com/sso/jwt/api/token',
                                   None, headers=headers, query=query)
        sig = resp['signature']
    def _is_jwt_expired(self, token):
        return jwt_decode_hs256(token)['exp'] - time.time() < 300

        data = json.dumps({'jwt': sig}).encode()
        headers = {'content-type': 'application/json', 'ott-device-type': 'web'}
        resp = self._download_json('https://services.radio-canada.ca/ott/cbc-api/v2/token',
                                   None, data=data, headers=headers, expected_status=426)
        cbc_access_token = resp['accessToken']
    def _call_oauth_api(self, oauth_data, note='Refreshing access token'):
        response = self._download_json(
            self._ropc_settings['url'], None, note, data=urlencode_postdata({
                'client_id': self._CLIENT_ID,
                **oauth_data,
                'scope': self._ropc_settings['scopes'],
            }))
        self._refresh_token = response['refresh_token']
        self._access_token = response['access_token']
        self.cache.store(self._NETRC_MACHINE, 'token_data', [self._refresh_token, self._access_token])

        headers = {'content-type': 'application/json', 'ott-device-type': 'web', 'ott-access-token': cbc_access_token}
        resp = self._download_json('https://services.radio-canada.ca/ott/cbc-api/v2/profile',
                                   None, headers=headers, expected_status=426)
        return resp['claimsToken']
    def _perform_login(self, username, password):
        if not self._refresh_token:
            self._refresh_token, self._access_token = self.cache.load(
                self._NETRC_MACHINE, 'token_data', default=[None, None])

    def _get_claims_token_expiry(self):
        # Token is a JWT
        # JWT is decoded here and 'exp' field is extracted
        # It is a Unix timestamp for when the token expires
        b64_data = self._claims_token.split('.')[1]
        data = base64.urlsafe_b64decode(b64_data + '==')
        return json.loads(data)['exp']

    def claims_token_expired(self):
        exp = self._get_claims_token_expiry()
        # It will expire in less than 10 seconds, or has already expired
        return exp - time.time() < 10

    def claims_token_valid(self):
        return self._claims_token is not None and not self.claims_token_expired()

    def _get_claims_token(self, email, password):
        if not self.claims_token_valid():
            self._claims_token = self._new_claims_token(email, password)
            self.cache.store(self._NETRC_MACHINE, 'claims_token', self._claims_token)
        return self._claims_token

    def _real_initialize(self):
        if self.claims_token_valid():
        if self._refresh_token and self._access_token:
            self.write_debug('Using cached refresh token')
            if not self._claims_token:
                self._claims_token = self.cache.load(self._NETRC_MACHINE, 'claims_token')
            return
        self._claims_token = self.cache.load(self._NETRC_MACHINE, 'claims_token')

        try:
            self._call_oauth_api({
                'grant_type': 'password',
                'username': username,
                'password': password,
            }, note='Logging in')
        except ExtractorError as e:
            if isinstance(e.cause, HTTPError) and e.cause.status == 400:
                raise ExtractorError('Invalid username and/or password', expected=True)
            raise

    def _fetch_access_token(self):
        if self._is_jwt_expired(self._access_token):
            try:
                self._call_oauth_api({
                    'grant_type': 'refresh_token',
                    'refresh_token': self._refresh_token,
                })
            except ExtractorError:
                self._refresh_token, self._access_token = None, None
                self.cache.store(self._NETRC_MACHINE, 'token_data', [None, None])
                self.report_warning('Refresh token has been invalidated; retrying with credentials')
                self._perform_login(*self._get_login_info())

        return self._access_token

    def _fetch_claims_token(self):
        if not self._get_login_info()[0]:
            return None

        if not self._claims_token or self._is_jwt_expired(self._claims_token):
            self._claims_token = self._download_json(
                'https://services.radio-canada.ca/ott/subscription/v2/gem/Subscriber/profile',
                None, 'Downloading claims token', query={'device': 'web'},
                headers={'Authorization': f'Bearer {self._fetch_access_token()}'})['claimsToken']
            self.cache.store(self._NETRC_MACHINE, 'claims_token', self._claims_token)
        else:
            self.write_debug('Using cached claims token')

        return self._claims_token

    def _real_extract(self, url):
        video_id, season_number = self._match_valid_url(url).group('id', 'season')

@@ -675,14 +692,10 @@ class CBCGemIE(CBCGemBaseIE):
        item_info = traverse_obj(video_info, (
            'content', ..., 'lineups', ..., 'items',
            lambda _, v: v['url'] == video_id, any, {require('item info')}))
        media_id = item_info['idMedia']

        email, password = self._get_login_info()
        if email and password:
            claims_token = self._get_claims_token(email, password)
            headers = {'x-claims-token': claims_token}
        else:
            headers = {}
        headers = {}
        if claims_token := self._fetch_claims_token():
            headers['x-claims-token'] = claims_token

        m3u8_info = self._download_json(
            'https://services.radio-canada.ca/media/validation/v2/',

@@ -695,7 +708,7 @@ class CBCGemIE(CBCGemBaseIE):
                'tech': 'hls',
                'manifestVersion': '2',
                'manifestType': 'desktop',
                'idMedia': media_id,
                'idMedia': item_info['idMedia'],
            })

        if m3u8_info.get('errorCode') == 1:
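The new `_call_oauth_api` above is a standard OAuth 2.0 resource-owner-password-credentials (ROPC) exchange: the same endpoint is POSTed with either a `password` grant (first login) or a `refresh_token` grant (renewal). A rough self-contained sketch of that exchange with placeholder names (the real code goes through `self._download_json` and the cached `_ropc_settings`, so none of the identifiers below are yt-dlp APIs):

import json
import urllib.parse
import urllib.request

def ropc_token(token_url, client_id, scope, **grant):
    # grant is either {'grant_type': 'password', 'username': ..., 'password': ...}
    # or {'grant_type': 'refresh_token', 'refresh_token': ...}
    data = urllib.parse.urlencode({'client_id': client_id, **grant, 'scope': scope}).encode()
    with urllib.request.urlopen(urllib.request.Request(token_url, data=data)) as resp:
        return json.load(resp)  # expected to contain 'access_token' and 'refresh_token'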
@@ -121,10 +121,7 @@ class CDAIE(InfoExtractor):
        }, **kwargs)

    def _perform_login(self, username, password):
        app_version = random.choice((
            '1.2.88 build 15306',
            '1.2.174 build 18469',
        ))
        app_version = '1.2.255 build 21541'
        android_version = random.randrange(8, 14)
        phone_model = random.choice((
            # x-kom.pl top selling Android smartphones, as of 2022-12-26

@@ -190,7 +187,7 @@ class CDAIE(InfoExtractor):
        meta = self._download_json(
            f'{self._BASE_API_URL}/video/{video_id}', video_id, headers=self._API_HEADERS)['video']

        uploader = traverse_obj(meta, 'author', 'login')
        uploader = traverse_obj(meta, ('author', 'login', {str}))

        formats = [{
            'url': quality['file'],
@@ -29,6 +29,7 @@ from ..compat import (
from ..cookies import LenientSimpleCookie
from ..downloader.f4m import get_base_url, remove_encrypted_media
from ..downloader.hls import HlsFD
from ..globals import plugin_ies_overrides
from ..networking import HEADRequest, Request
from ..networking.exceptions import (
    HTTPError,

@@ -2934,8 +2935,7 @@ class InfoExtractor:
                    segment_duration = None
                    if 'total_number' not in representation_ms_info and 'segment_duration' in representation_ms_info:
                        segment_duration = float_or_none(representation_ms_info['segment_duration'], representation_ms_info['timescale'])
                        representation_ms_info['total_number'] = int(math.ceil(
                            float_or_none(period_duration, segment_duration, default=0)))
                        representation_ms_info['total_number'] = math.ceil(float_or_none(period_duration, segment_duration, default=0))
                    representation_ms_info['fragments'] = [{
                        media_location_key: media_template % {
                            'Number': segment_number,

@@ -3954,14 +3954,18 @@ class InfoExtractor:
    def __init_subclass__(cls, *, plugin_name=None, **kwargs):
        if plugin_name:
            mro = inspect.getmro(cls)
            super_class = cls.__wrapped__ = mro[mro.index(cls) + 1]
            cls.PLUGIN_NAME, cls.ie_key = plugin_name, super_class.ie_key
            cls.IE_NAME = f'{super_class.IE_NAME}+{plugin_name}'
            next_mro_class = super_class = mro[mro.index(cls) + 1]

            while getattr(super_class, '__wrapped__', None):
                super_class = super_class.__wrapped__
            setattr(sys.modules[super_class.__module__], super_class.__name__, cls)
            _PLUGIN_OVERRIDES[super_class].append(cls)

            if not any(override.PLUGIN_NAME == plugin_name for override in plugin_ies_overrides.value[super_class]):
                cls.__wrapped__ = next_mro_class
                cls.PLUGIN_NAME, cls.ie_key = plugin_name, next_mro_class.ie_key
                cls.IE_NAME = f'{next_mro_class.IE_NAME}+{plugin_name}'

                setattr(sys.modules[super_class.__module__], super_class.__name__, cls)
                plugin_ies_overrides.value[super_class].append(cls)
        return super().__init_subclass__(**kwargs)


@@ -4017,6 +4021,3 @@ class UnsupportedURLIE(InfoExtractor):

    def _real_extract(self, url):
        raise UnsupportedError(url)


_PLUGIN_OVERRIDES = collections.defaultdict(list)
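The reworked `__init_subclass__` above keeps the existing declaration side of plugin overrides: a plugin still subclasses an extractor with the `plugin_name` class keyword, but the override is now recorded in the `plugin_ies_overrides` global (and deduplicated by `PLUGIN_NAME`) instead of the removed module-level `_PLUGIN_OVERRIDES`. A minimal sketch of such a declaration; the class and module names are hypothetical:

# e.g. in a plugin package: yt_dlp_plugins/extractor/example.py
from yt_dlp.extractor.generic import GenericIE

class ExampleGenericIE(GenericIE, plugin_name='example'):
    # Per the code above, IE_NAME becomes 'generic+example' and this class
    # is installed in place of GenericIE in its defining module
    ...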
@@ -3,7 +3,7 @@ from ..utils import int_or_none


class CultureUnpluggedIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?cultureunplugged\.com/documentary/watch-online/play/(?P<id>\d+)(?:/(?P<display_id>[^/]+))?'
    _VALID_URL = r'https?://(?:www\.)?cultureunplugged\.com/(?:documentary/watch-online/)?play/(?P<id>\d+)(?:/(?P<display_id>[^/#?]+))?'
    _TESTS = [{
        'url': 'http://www.cultureunplugged.com/documentary/watch-online/play/53662/The-Next--Best-West',
        'md5': 'ac6c093b089f7d05e79934dcb3d228fc',

@@ -12,12 +12,25 @@ class CultureUnpluggedIE(InfoExtractor):
            'display_id': 'The-Next--Best-West',
            'ext': 'mp4',
            'title': 'The Next, Best West',
            'description': 'md5:0423cd00833dea1519cf014e9d0903b1',
            'description': 'md5:770033a3b7c2946a3bcfb7f1c6fb7045',
            'thumbnail': r're:^https?://.*\.jpg$',
            'creator': 'Coldstream Creative',
            'creators': ['Coldstream Creative'],
            'duration': 2203,
            'view_count': int,
        },
    }, {
        'url': 'https://www.cultureunplugged.com/play/2833/Koi-Sunta-Hai--Journeys-with-Kumar---Kabir--Someone-is-Listening-',
        'md5': 'dc2014bc470dfccba389a1c934fa29fa',
        'info_dict': {
            'id': '2833',
            'display_id': 'Koi-Sunta-Hai--Journeys-with-Kumar---Kabir--Someone-is-Listening-',
            'ext': 'mp4',
            'title': 'Koi Sunta Hai: Journeys with Kumar & Kabir (Someone is Listening)',
            'description': 'md5:fa94ac934927c98660362b8285b2cda5',
            'view_count': int,
            'thumbnail': 'https://s3.amazonaws.com/cdn.cultureunplugged.com/thumbnails_16_9/lg/2833.jpg',
            'creators': ['Srishti'],
        },
    }, {
        'url': 'http://www.cultureunplugged.com/documentary/watch-online/play/53662',
        'only_matching': True,
@@ -100,7 +100,7 @@ class DailymotionBaseInfoExtractor(InfoExtractor):

class DailymotionIE(DailymotionBaseInfoExtractor):
    _VALID_URL = r'''(?ix)
        https?://
        (?:https?:)?//
        (?:
            dai\.ly/|
            (?:

@@ -116,7 +116,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
        (?P<id>[^/?_&#]+)(?:[\w-]*\?playlist=(?P<playlist_id>x[0-9a-z]+))?
    '''
    IE_NAME = 'dailymotion'
    _EMBED_REGEX = [r'<(?:(?:embed|iframe)[^>]+?src=|input[^>]+id=[\'"]dmcloudUrlEmissionSelect[\'"][^>]+value=)(["\'])(?P<url>(?:https?:)?//(?:www\.)?dailymotion\.com/(?:embed|swf)/video/.+?)\1']
    _EMBED_REGEX = [rf'(?ix)<(?:(?:embed|iframe)[^>]+?src=|input[^>]+id=[\'"]dmcloudUrlEmissionSelect[\'"][^>]+value=)["\'](?P<url>{_VALID_URL[5:]})']
    _TESTS = [{
        'url': 'http://www.dailymotion.com/video/x5kesuj_office-christmas-party-review-jason-bateman-olivia-munn-t-j-miller_news',
        'md5': '074b95bdee76b9e3654137aee9c79dfe',

@@ -308,6 +308,25 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
            'description': 'Que lindura',
            'tags': [],
        },
    }, {
        # //geo.dailymotion.com/player/xysxq.html?video=k2Y4Mjp7krAF9iCuINM
        'url': 'https://lcp.fr/programmes/avant-la-catastrophe-la-naissance-de-la-dictature-nazie-1933-1936-346819',
        'info_dict': {
            'id': 'k2Y4Mjp7krAF9iCuINM',
            'ext': 'mp4',
            'title': 'Avant la catastrophe la naissance de la dictature nazie 1933 -1936',
            'description': 'md5:7b620d5e26edbe45f27bbddc1c0257c1',
            'uploader': 'LCP Assemblée nationale',
            'uploader_id': 'xbz33d',
            'view_count': int,
            'like_count': int,
            'age_limit': 0,
            'duration': 3220,
            'thumbnail': 'https://s1.dmcdn.net/v/Xvumk1djJBUZfjj2a/x1080',
            'tags': [],
            'timestamp': 1739919947,
            'upload_date': '20250218',
        },
    }]
    _GEO_BYPASS = False
    _COMMON_MEDIA_FIELDS = '''description
@@ -1,28 +1,37 @@
import contextlib
import inspect
import os

from ..plugins import load_plugins
from ..globals import LAZY_EXTRACTORS
from ..globals import extractors as _extractors_context

# NB: Must be before other imports so that plugins can be correctly injected
_PLUGIN_CLASSES = load_plugins('extractor', 'IE')
_CLASS_LOOKUP = None
if os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'):
    LAZY_EXTRACTORS.value = False
else:
    try:
        from .lazy_extractors import _CLASS_LOOKUP
        LAZY_EXTRACTORS.value = True
    except ImportError:
        LAZY_EXTRACTORS.value = None

_LAZY_LOADER = False
if not os.environ.get('YTDLP_NO_LAZY_EXTRACTORS'):
    with contextlib.suppress(ImportError):
        from .lazy_extractors import *  # noqa: F403
        from .lazy_extractors import _ALL_CLASSES
        _LAZY_LOADER = True
if not _CLASS_LOOKUP:
    from . import _extractors

if not _LAZY_LOADER:
    from ._extractors import *  # noqa: F403
    _ALL_CLASSES = [  # noqa: F811
        klass
        for name, klass in globals().items()
    _CLASS_LOOKUP = {
        name: value
        for name, value in inspect.getmembers(_extractors)
        if name.endswith('IE') and name != 'GenericIE'
    ]
    _ALL_CLASSES.append(GenericIE)  # noqa: F405
    }
    _CLASS_LOOKUP['GenericIE'] = _extractors.GenericIE

globals().update(_PLUGIN_CLASSES)
_ALL_CLASSES[:0] = _PLUGIN_CLASSES.values()
# We want to append to the main lookup
_current = _extractors_context.value
for name, ie in _CLASS_LOOKUP.items():
    _current.setdefault(name, ie)

from .common import _PLUGIN_OVERRIDES  # noqa: F401

def __getattr__(name):
    value = _CLASS_LOOKUP.get(name)
    if not value:
        raise AttributeError(f'module {__name__} has no attribute {name}')
    return value
@@ -1,19 +0,0 @@
from .common import InfoExtractor
from ..utils import (
    ExtractorError,
    urlencode_postdata,
)


class GigyaBaseIE(InfoExtractor):
    def _gigya_login(self, auth_data):
        auth_info = self._download_json(
            'https://accounts.eu1.gigya.com/accounts.login', None,
            note='Logging in', errnote='Unable to log in',
            data=urlencode_postdata(auth_data))

        error_message = auth_info.get('errorDetails') or auth_info.get('errorMessage')
        if error_message:
            raise ExtractorError(
                f'Unable to login: {error_message}', expected=True)
        return auth_info
@@ -69,8 +69,13 @@ class GloboIE(InfoExtractor):
        'info_dict': {
            'id': '8013907',
            'ext': 'mp4',
            'title': 'Capítulo de 14⧸08⧸1989',
            'title': 'Capítulo de 14/08/1989',
            'episode': 'Episode 1',
            'episode_number': 1,
            'uploader': 'Tieta',
            'uploader_id': '11895',
            'duration': 2858.389,
            'subtitles': 'count:1',
        },
        'params': {
            'skip_download': True,

@@ -82,7 +87,12 @@ class GloboIE(InfoExtractor):
            'id': '12824146',
            'ext': 'mp4',
            'title': 'Acordo de damas',
            'episode': 'Episode 1',
            'episode_number': 1,
            'uploader': 'Rensga Hits!',
            'uploader_id': '20481',
            'duration': 1953.994,
            'season': 'Season 2',
            'season_number': 2,
        },
        'params': {

@@ -136,9 +146,10 @@ class GloboIE(InfoExtractor):
        else:
            formats, subtitles = self._extract_m3u8_formats_and_subtitles(
                main_source['url'], video_id, 'mp4', m3u8_id='hls')
            self._merge_subtitles(traverse_obj(main_source, ('text', ..., {
                'url': ('subtitle', 'srt', 'url', {url_or_none}),
            }, all, {subs_list_to_dict(lang='en')})), target=subtitles)

        self._merge_subtitles(traverse_obj(main_source, ('text', ..., ('caption', 'subtitle'), {
            'url': ('srt', 'url', {url_or_none}),
        }, all, {subs_list_to_dict(lang='pt-BR')})), target=subtitles)

        return {
            'id': video_id,
@@ -2,12 +2,12 @@ import hashlib
import itertools
import json
import re
import time

from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
    ExtractorError,
    bug_reports_message,
    decode_base_n,
    encode_base_n,
    filter_dict,

@@ -15,12 +15,12 @@ from ..utils import (
    format_field,
    get_element_by_attribute,
    int_or_none,
    join_nonempty,
    lowercase_escape,
    str_or_none,
    str_to_int,
    traverse_obj,
    url_or_none,
    urlencode_postdata,
)

_ENCODING_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_'

@@ -28,63 +28,30 @@ _ENCODING_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_'

def _pk_to_id(media_id):
    """Source: https://stackoverflow.com/questions/24437823/getting-instagram-post-url-from-media-id"""
    return encode_base_n(int(media_id.split('_')[0]), table=_ENCODING_CHARS)
    pk = int(str(media_id).split('_')[0])
    return encode_base_n(pk, table=_ENCODING_CHARS)


def _id_to_pk(shortcode):
    """Covert a shortcode to a numeric value"""
    return decode_base_n(shortcode[:11], table=_ENCODING_CHARS)
    """Convert a shortcode to a numeric value"""
    if len(shortcode) > 28:
        shortcode = shortcode[:-28]
    return decode_base_n(shortcode, table=_ENCODING_CHARS)


class InstagramBaseIE(InfoExtractor):
    _NETRC_MACHINE = 'instagram'
    _IS_LOGGED_IN = False

    _API_BASE_URL = 'https://i.instagram.com/api/v1'
    _LOGIN_URL = 'https://www.instagram.com/accounts/login'
    _API_HEADERS = {
        'X-IG-App-ID': '936619743392459',
        'X-ASBD-ID': '198387',
        'X-IG-WWW-Claim': '0',
        'Origin': 'https://www.instagram.com',
        'Accept': '*/*',
    }

    def _perform_login(self, username, password):
        if self._IS_LOGGED_IN:
            return

        login_webpage = self._download_webpage(
            self._LOGIN_URL, None, note='Downloading login webpage', errnote='Failed to download login webpage')

        shared_data = self._parse_json(self._search_regex(
            r'window\._sharedData\s*=\s*({.+?});', login_webpage, 'shared data', default='{}'), None)

        login = self._download_json(
            f'{self._LOGIN_URL}/ajax/', None, note='Logging in', headers={
                **self._API_HEADERS,
                'X-Requested-With': 'XMLHttpRequest',
                'X-CSRFToken': shared_data['config']['csrf_token'],
                'X-Instagram-AJAX': shared_data['rollout_hash'],
                'Referer': 'https://www.instagram.com/',
            }, data=urlencode_postdata({
                'enc_password': f'#PWD_INSTAGRAM_BROWSER:0:{int(time.time())}:{password}',
                'username': username,
                'queryParams': '{}',
                'optIntoOneTap': 'false',
                'stopDeletionNonce': '',
                'trustedDeviceRecords': '{}',
            }))

        if not login.get('authenticated'):
            if login.get('message'):
                raise ExtractorError(f'Unable to login: {login["message"]}')
            elif login.get('user'):
                raise ExtractorError('Unable to login: Sorry, your password was incorrect. Please double-check your password.', expected=True)
            elif login.get('user') is False:
                raise ExtractorError('Unable to login: The username you entered doesn\'t belong to an account. Please check your username and try again.', expected=True)
            raise ExtractorError('Unable to login')
        InstagramBaseIE._IS_LOGGED_IN = True
    @property
    def _api_headers(self):
        return {
            'X-IG-App-ID': self._configuration_arg('app_id', ['936619743392459'], ie_key=InstagramIE)[0],
            'X-ASBD-ID': '198387',
            'X-IG-WWW-Claim': '0',
            'Origin': 'https://www.instagram.com',
            'Accept': '*/*',
        }

    def _get_count(self, media, kind, *keys):
        return traverse_obj(
@@ -209,7 +176,7 @@ class InstagramBaseIE(InfoExtractor):
    def _get_comments(self, video_id):
        comments_info = self._download_json(
            f'{self._API_BASE_URL}/media/{_id_to_pk(video_id)}/comments/?can_support_threading=true&permalink_enabled=false', video_id,
            fatal=False, errnote='Comments extraction failed', note='Downloading comments info', headers=self._API_HEADERS) or {}
            fatal=False, errnote='Comments extraction failed', note='Downloading comments info', headers=self._api_headers) or {}

        comment_data = traverse_obj(comments_info, ('edge_media_to_parent_comment', 'edges'), 'comments')
        for comment_dict in comment_data or []:

@@ -402,14 +369,14 @@ class InstagramIE(InstagramBaseIE):
        info = traverse_obj(self._download_json(
            f'{self._API_BASE_URL}/media/{_id_to_pk(video_id)}/info/', video_id,
            fatal=False, errnote='Video info extraction failed',
            note='Downloading video info', headers=self._API_HEADERS), ('items', 0))
            note='Downloading video info', headers=self._api_headers), ('items', 0))
        if info:
            media.update(info)
            return self._extract_product(media)

        api_check = self._download_json(
            f'{self._API_BASE_URL}/web/get_ruling_for_content/?content_type=MEDIA&target_id={_id_to_pk(video_id)}',
            video_id, headers=self._API_HEADERS, fatal=False, note='Setting up session', errnote=False) or {}
            video_id, headers=self._api_headers, fatal=False, note='Setting up session', errnote=False) or {}
        csrf_token = self._get_cookies('https://www.instagram.com').get('csrftoken')

        if not csrf_token:

@@ -429,7 +396,7 @@ class InstagramIE(InstagramBaseIE):
        general_info = self._download_json(
            'https://www.instagram.com/graphql/query/', video_id, fatal=False, errnote=False,
            headers={
                **self._API_HEADERS,
                **self._api_headers,
                'X-CSRFToken': csrf_token or '',
                'X-Requested-With': 'XMLHttpRequest',
                'Referer': url,

@@ -437,7 +404,6 @@ class InstagramIE(InstagramBaseIE):
                'doc_id': '8845758582119845',
                'variables': json.dumps(variables, separators=(',', ':')),
            })
        media.update(traverse_obj(general_info, ('data', 'xdt_shortcode_media')) or {})

        if not general_info:
            self.report_warning('General metadata extraction failed (some metadata might be missing).', video_id)

@@ -466,6 +432,26 @@ class InstagramIE(InstagramBaseIE):
            media.update(traverse_obj(
                additional_data, ('graphql', 'shortcode_media'), 'shortcode_media', expected_type=dict) or {})

        else:
            xdt_shortcode_media = traverse_obj(general_info, ('data', 'xdt_shortcode_media', {dict})) or {}
            if not xdt_shortcode_media:
                error = join_nonempty('title', 'description', delim=': ', from_dict=api_check)
                if 'Restricted Video' in error:
                    self.raise_login_required(error)
                elif error:
                    raise ExtractorError(error, expected=True)
                elif len(video_id) > 28:
                    # It's a private post (video_id == shortcode + 28 extra characters)
                    # Only raise after getting empty response; sometimes "long"-shortcode posts are public
                    self.raise_login_required(
                        'This content is only available for registered users who follow this account')
                raise ExtractorError(
                    'Instagram sent an empty media response. Check if this post is accessible in your '
                    f'browser without being logged-in. If it is not, then u{self._login_hint()[1:]}. '
                    'Otherwise, if the post is accessible in browser without being logged-in'
                    f'{bug_reports_message(before=",")}', expected=True)
            media.update(xdt_shortcode_media)

        username = traverse_obj(media, ('owner', 'username')) or self._search_regex(
            r'"owner"\s*:\s*{\s*"username"\s*:\s*"(.+?)"', webpage, 'username', fatal=False)

@@ -485,8 +471,7 @@ class InstagramIE(InstagramBaseIE):
            return self.playlist_result(
                self._extract_nodes(nodes, True), video_id,
                format_field(username, None, 'Post by %s'), description)

        video_url = self._og_search_video_url(webpage, secure=False)
        raise ExtractorError('There is no video in this post', expected=True)

        formats = [{
            'url': video_url,

@@ -689,7 +674,7 @@ class InstagramTagIE(InstagramPlaylistBaseIE):


class InstagramStoryIE(InstagramBaseIE):
    _VALID_URL = r'https?://(?:www\.)?instagram\.com/stories/(?P<user>[^/]+)/(?P<id>\d+)'
    _VALID_URL = r'https?://(?:www\.)?instagram\.com/stories/(?P<user>[^/?#]+)(?:/(?P<id>\d+))?'
    IE_NAME = 'instagram:story'

    _TESTS = [{

@@ -699,25 +684,38 @@ class InstagramStoryIE(InstagramBaseIE):
            'title': 'Rare',
        },
        'playlist_mincount': 50,
    }, {
        'url': 'https://www.instagram.com/stories/fruits_zipper/3570766765028588805/',
        'only_matching': True,
    }, {
        'url': 'https://www.instagram.com/stories/fruits_zipper',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        username, story_id = self._match_valid_url(url).groups()
        story_info = self._download_webpage(url, story_id)
        user_info = self._search_json(r'"user":', story_info, 'user info', story_id, fatal=False)
        username, story_id = self._match_valid_url(url).group('user', 'id')
        if username == 'highlights' and not story_id:  # story id is only mandatory for highlights
            raise ExtractorError('Input URL is missing a highlight ID', expected=True)
        display_id = story_id or username
        story_info = self._download_webpage(url, display_id)
        user_info = self._search_json(r'"user":', story_info, 'user info', display_id, fatal=False)
        if not user_info:
            self.raise_login_required('This content is unreachable')

        user_id = traverse_obj(user_info, 'pk', 'id', expected_type=str)
        story_info_url = user_id if username != 'highlights' else f'highlight:{story_id}'
        if not story_info_url:  # user id is only mandatory for non-highlights
            raise ExtractorError('Unable to extract user id')
        if username == 'highlights':
            story_info_url = f'highlight:{story_id}'
        else:
            if not user_id:  # user id is only mandatory for non-highlights
                raise ExtractorError('Unable to extract user id')
            story_info_url = user_id

        videos = traverse_obj(self._download_json(
            f'{self._API_BASE_URL}/feed/reels_media/?reel_ids={story_info_url}',
            story_id, errnote=False, fatal=False, headers=self._API_HEADERS), 'reels')
            display_id, errnote=False, fatal=False, headers=self._api_headers), 'reels')
        if not videos:
            self.raise_login_required('You need to log in to access this content')
        user_info = traverse_obj(videos, (user_id, 'user', {dict})) or {}

        full_name = traverse_obj(videos, (f'highlight:{story_id}', 'user', 'full_name'), (user_id, 'user', 'full_name'))
        story_title = traverse_obj(videos, (f'highlight:{story_id}', 'title'))

@@ -727,6 +725,7 @@ class InstagramStoryIE(InstagramBaseIE):
        highlights = traverse_obj(videos, (f'highlight:{story_id}', 'items'), (user_id, 'items'))
        info_data = []
        for highlight in highlights:
            highlight.setdefault('user', {}).update(user_info)
            highlight_data = self._extract_product(highlight)
            if highlight_data.get('formats'):
                info_data.append({

@@ -734,4 +733,7 @@ class InstagramStoryIE(InstagramBaseIE):
                    'uploader_id': user_id,
                    **filter_dict(highlight_data),
                })
        if username != 'highlights' and story_id and not self._yes_playlist(username, story_id):
            return traverse_obj(info_data, (lambda _, v: v['id'] == _pk_to_id(story_id), any))

        return self.playlist_result(info_data, playlist_id=story_id, playlist_title=story_title)
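The `_pk_to_id`/`_id_to_pk` pair in this diff is a plain positional base-64 encoding over Instagram's URL-safe alphabet, mapping a numeric media pk to a shortcode and back. A minimal sketch of the two helpers, assuming the same 64-character table (yt-dlp's real implementations are `encode_base_n`/`decode_base_n` in `yt_dlp.utils`):

_ENCODING_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_'

def encode_base_n(num, table=_ENCODING_CHARS):
    # Repeatedly take the least-significant base-64 digit
    out = ''
    while num:
        num, rem = divmod(num, len(table))
        out = table[rem] + out
    return out or table[0]

def decode_base_n(s, table=_ENCODING_CHARS):
    # Horner's rule over the digit positions
    num = 0
    for char in s:
        num = num * len(table) + table.index(char)
    return num

assert decode_base_n(encode_base_n(44006)) == 44006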
@@ -2,10 +2,12 @@ import hashlib
import random

from .common import InfoExtractor
from ..networking import HEADRequest
from ..utils import (
    clean_html,
    int_or_none,
    try_get,
    urlhandle_detect_ext,
)


@@ -27,7 +29,7 @@ class JamendoIE(InfoExtractor):
            'ext': 'flac',
            # 'title': 'Maya Filipič - Stories from Emona I',
            'title': 'Stories from Emona I',
            'artist': 'Maya Filipič',
            'artists': ['Maya Filipič'],
            'album': 'Between two worlds',
            'track': 'Stories from Emona I',
            'duration': 210,

@@ -93,9 +95,15 @@ class JamendoIE(InfoExtractor):
            if not cover_url or cover_url in urls:
                continue
            urls.append(cover_url)
            urlh = self._request_webpage(
                HEADRequest(cover_url), track_id, 'Checking thumbnail extension',
                errnote=False, fatal=False)
            if not urlh:
                continue
            size = int_or_none(cover_id.lstrip('size'))
            thumbnails.append({
                'id': cover_id,
                'ext': urlhandle_detect_ext(urlh, default='jpg'),
                'url': cover_url,
                'width': size,
                'height': size,
@@ -26,6 +26,7 @@ class LBRYBaseIE(InfoExtractor):
    _CLAIM_ID_REGEX = r'[0-9a-f]{1,40}'
    _OPT_CLAIM_ID = f'[^$@:/?#&]+(?:[:#]{_CLAIM_ID_REGEX})?'
    _SUPPORTED_STREAM_TYPES = ['video', 'audio']
    _UNSUPPORTED_STREAM_TYPES = ['binary']
    _PAGE_SIZE = 50

    def _call_api_proxy(self, method, display_id, params, resource):

@@ -336,12 +337,15 @@ class LBRYIE(LBRYBaseIE):
                    'vcodec': 'none' if stream_type == 'audio' else None,
                })

            final_url = None
            # HEAD request returns redirect response to m3u8 URL if available
            final_url = self._request_webpage(
            urlh = self._request_webpage(
                HEADRequest(streaming_url), display_id, headers=headers,
                note='Downloading streaming redirect url info').url
                note='Downloading streaming redirect url info', fatal=False)
            if urlh:
                final_url = urlh.url

        elif result.get('value_type') == 'stream':
        elif result.get('value_type') == 'stream' and stream_type not in self._UNSUPPORTED_STREAM_TYPES:
            claim_id, is_live = result['signing_channel']['claim_id'], True
            live_data = self._download_json(
                'https://api.odysee.live/livestream/is_live', claim_id,
yt_dlp/extractor/loco.py (new file, 87 lines)
@@ -0,0 +1,87 @@
from .common import InfoExtractor
from ..utils import int_or_none, url_or_none
from ..utils.traversal import require, traverse_obj


class LocoIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?loco\.com/(?P<type>streamers|stream)/(?P<id>[^/?#]+)'
    _TESTS = [{
        'url': 'https://loco.com/streamers/teuzinfps',
        'info_dict': {
            'id': 'teuzinfps',
            'ext': 'mp4',
            'title': r're:MS BOLADAO, RESENHA & GAMEPLAY ALTO NIVEL',
            'description': 'bom e novo',
            'uploader_id': 'RLUVE3S9JU',
            'channel': 'teuzinfps',
            'channel_follower_count': int,
            'comment_count': int,
            'view_count': int,
            'concurrent_view_count': int,
            'like_count': int,
            'thumbnail': 'https://static.ivory.getloconow.com/default_thumb/743701a9-98ca-41ae-9a8b-70bd5da070ad.jpg',
            'tags': ['MMORPG', 'Gameplay'],
            'series': 'Tibia',
            'timestamp': int,
            'modified_timestamp': int,
            'live_status': 'is_live',
            'upload_date': str,
            'modified_date': str,
        },
        'params': {
            'skip_download': 'Livestream',
        },
    }, {
        'url': 'https://loco.com/stream/c64916eb-10fb-46a9-9a19-8c4b7ed064e7',
        'md5': '45ebc8a47ee1c2240178757caf8881b5',
        'info_dict': {
            'id': 'c64916eb-10fb-46a9-9a19-8c4b7ed064e7',
            'ext': 'mp4',
            'title': 'PAULINHO LOKO NA LOCO!',
            'description': 'live on na loco',
            'uploader_id': '2MDO7Z1DPM',
            'channel': 'paulinholokobr',
            'channel_follower_count': int,
            'comment_count': int,
            'view_count': int,
            'concurrent_view_count': int,
            'like_count': int,
            'duration': 14491,
            'thumbnail': 'https://static.ivory.getloconow.com/default_thumb/59b5970b-23c1-4518-9e96-17ce341299fe.jpg',
            'tags': ['Gameplay'],
            'series': 'GTA 5',
            'timestamp': 1740612872,
            'modified_timestamp': 1740613037,
            'upload_date': '20250226',
            'modified_date': '20250226',
        },
    }]

    def _real_extract(self, url):
        video_type, video_id = self._match_valid_url(url).group('type', 'id')
        webpage = self._download_webpage(url, video_id)
        stream = traverse_obj(self._search_nextjs_data(webpage, video_id), (
            'props', 'pageProps', ('liveStreamData', 'stream'), {dict}, any, {require('stream info')}))

        return {
            'formats': self._extract_m3u8_formats(stream['conf']['hls'], video_id),
            'id': video_id,
            'is_live': video_type == 'streamers',
            **traverse_obj(stream, {
                'title': ('title', {str}),
                'series': ('game_name', {str}),
                'uploader_id': ('user_uid', {str}),
                'channel': ('alias', {str}),
                'description': ('description', {str}),
                'concurrent_view_count': ('viewersCurrent', {int_or_none}),
                'view_count': ('total_views', {int_or_none}),
                'thumbnail': ('thumbnail_url_small', {url_or_none}),
                'like_count': ('likes', {int_or_none}),
                'tags': ('tags', ..., {str}),
                'timestamp': ('started_at', {int_or_none(scale=1000)}),
                'modified_timestamp': ('updated_at', {int_or_none(scale=1000)}),
                'comment_count': ('comments_count', {int_or_none}),
                'channel_follower_count': ('followers_count', {int_or_none}),
                'duration': ('duration', {int_or_none}),
            }),
        }
@@ -1,35 +1,36 @@
from .common import InfoExtractor
from ..utils import parse_age_limit, parse_duration, traverse_obj
from ..utils import parse_age_limit, parse_duration, url_or_none
from ..utils.traversal import traverse_obj


class MagellanTVIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?magellantv\.com/(?:watch|video)/(?P<id>[\w-]+)'
    _TESTS = [{
        'url': 'https://www.magellantv.com/watch/my-dads-on-death-row?type=v',
        'url': 'https://www.magellantv.com/watch/incas-the-new-story?type=v',
        'info_dict': {
            'id': 'my-dads-on-death-row',
            'id': 'incas-the-new-story',
            'ext': 'mp4',
            'title': 'My Dad\'s On Death Row',
            'description': 'md5:33ba23b9f0651fc4537ed19b1d5b0d7a',
            'duration': 3780.0,
            'title': 'Incas: The New Story',
            'description': 'md5:936c7f6d711c02dfb9db22a067b586fe',
            'age_limit': 14,
            'tags': ['Justice', 'Reality', 'United States', 'True Crime'],
            'duration': 3060.0,
            'tags': ['Ancient History', 'Archaeology', 'Anthropology'],
        },
        'params': {'skip_download': 'm3u8'},
    }, {
        'url': 'https://www.magellantv.com/video/james-bulger-the-new-revelations',
        'url': 'https://www.magellantv.com/video/tortured-to-death-murdering-the-nanny',
        'info_dict': {
            'id': 'james-bulger-the-new-revelations',
            'id': 'tortured-to-death-murdering-the-nanny',
            'ext': 'mp4',
            'title': 'James Bulger: The New Revelations',
            'description': 'md5:7b97922038bad1d0fe8d0470d8a189f2',
            'title': 'Tortured to Death: Murdering the Nanny',
            'description': 'md5:d87033594fa218af2b1a8b49f52511e5',
            'age_limit': 14,
            'duration': 2640.0,
            'age_limit': 0,
            'tags': ['Investigation', 'True Crime', 'Justice', 'Europe'],
            'tags': ['True Crime', 'Murder'],
        },
        'params': {'skip_download': 'm3u8'},
    }, {
        'url': 'https://www.magellantv.com/watch/celebration-nation',
        'url': 'https://www.magellantv.com/watch/celebration-nation?type=s',
        'info_dict': {
            'id': 'celebration-nation',
            'ext': 'mp4',

@@ -43,10 +44,19 @@ class MagellanTVIE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        data = traverse_obj(self._search_nextjs_data(webpage, video_id), (
            'props', 'pageProps', 'reactContext',
            (('video', 'detail'), ('series', 'currentEpisode')), {dict}), get_all=False)
        formats, subtitles = self._extract_m3u8_formats_and_subtitles(data['jwpVideoUrl'], video_id)
        context = self._search_nextjs_data(webpage, video_id)['props']['pageProps']['reactContext']
        data = traverse_obj(context, ((('video', 'detail'), ('series', 'currentEpisode')), {dict}, any))

        formats, subtitles = [], {}
        for m3u8_url in set(traverse_obj(data, ((('manifests', ..., 'hls'), 'jwp_video_url'), {url_or_none}))):
            fmts, subs = self._extract_m3u8_formats_and_subtitles(
                m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
            formats.extend(fmts)
            self._merge_subtitles(subs, target=subtitles)
        if not formats and (error := traverse_obj(context, ('errorDetailPage', 'errorMessage', {str}))):
            if 'available in your country' in error:
                self.raise_geo_restricted(msg=error)
            self.raise_no_formats(f'{self.IE_NAME} said: {error}', expected=True)

        return {
            'id': video_id,
@@ -102,11 +102,10 @@ class MedalTVIE(InfoExtractor):
            item_id = item_id or '%dp' % height
            if item_id not in item_url:
                return
            width = int(round(aspect_ratio * height))
            container.append({
                'url': item_url,
                id_key: item_id,
                'width': width,
                'width': round(aspect_ratio * height),
                'height': height,
            })
@@ -1,5 +1,7 @@
from .telecinco import TelecincoBaseIE
from ..networking.exceptions import HTTPError
from ..utils import (
    ExtractorError,
    int_or_none,
    parse_iso8601,
)

@@ -79,7 +81,17 @@ class MiTeleIE(TelecincoBaseIE):

    def _real_extract(self, url):
        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)

        try:  # yt-dlp's default user-agents are too old and blocked by akamai
            webpage = self._download_webpage(url, display_id, headers={
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0',
            })
        except ExtractorError as e:
            if not isinstance(e.cause, HTTPError) or e.cause.status != 403:
                raise
            # Retry with impersonation if hardcoded UA is insufficient to bypass akamai
            webpage = self._download_webpage(url, display_id, impersonate=True)

        pre_player = self._search_json(
            r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=',
            webpage, 'Pre Player', display_id)['prePlayer']
@@ -1,167 +1,215 @@
import re

from .common import InfoExtractor
from ..utils import (
    ExtractorError,
    clean_html,
    determine_ext,
    int_or_none,
    unescapeHTML,
    parse_iso8601,
    url_or_none,
)
from ..utils.traversal import traverse_obj


class MSNIE(InfoExtractor):
    _WORKING = False
    _VALID_URL = r'https?://(?:(?:www|preview)\.)?msn\.com/(?:[^/]+/)+(?P<display_id>[^/]+)/[a-z]{2}-(?P<id>[\da-zA-Z]+)'
    _VALID_URL = r'https?://(?:(?:www|preview)\.)?msn\.com/(?P<locale>[a-z]{2}-[a-z]{2})/(?:[^/?#]+/)+(?P<display_id>[^/?#]+)/[a-z]{2}-(?P<id>[\da-zA-Z]+)'
    _TESTS = [{
        'url': 'https://www.msn.com/en-in/money/video/7-ways-to-get-rid-of-chest-congestion/vi-BBPxU6d',
        'md5': '087548191d273c5c55d05028f8d2cbcd',
        'url': 'https://www.msn.com/en-gb/video/news/president-macron-interrupts-trump-over-ukraine-funding/vi-AA1zMcD7',
        'info_dict': {
            'id': 'BBPxU6d',
            'display_id': '7-ways-to-get-rid-of-chest-congestion',
            'id': 'AA1zMcD7',
            'ext': 'mp4',
            'title': 'Seven ways to get rid of chest congestion',
            'description': '7 Ways to Get Rid of Chest Congestion',
            'duration': 88,
            'uploader': 'Health',
            'uploader_id': 'BBPrMqa',
            'display_id': 'president-macron-interrupts-trump-over-ukraine-funding',
            'title': 'President Macron interrupts Trump over Ukraine funding',
            'description': 'md5:5fd3857ac25849e7a56cb25fbe1a2a8b',
            'uploader': 'k! News UK',
            'uploader_id': 'BB1hz5Rj',
            'duration': 59,
            'thumbnail': 'https://img-s-msn-com.akamaized.net/tenant/amp/entityid/AA1zMagX.img',
            'tags': 'count:14',
            'timestamp': 1740510914,
            'upload_date': '20250225',
            'release_timestamp': 1740513600,
            'release_date': '20250225',
            'modified_timestamp': 1741413241,
            'modified_date': '20250308',
        },
    }, {
        # Article, multiple Dailymotion Embeds
        'url': 'https://www.msn.com/en-in/money/sports/hottest-football-wags-greatest-footballers-turned-managers-and-more/ar-BBpc7Nl',
        'url': 'https://www.msn.com/en-gb/video/watch/films-success-saved-adam-pearsons-acting-career/vi-AA1znZGE?ocid=hpmsn',
        'info_dict': {
            'id': 'BBpc7Nl',
            'id': 'AA1znZGE',
            'ext': 'mp4',
            'display_id': 'films-success-saved-adam-pearsons-acting-career',
            'title': "Films' success saved Adam Pearson's acting career",
            'description': 'md5:98c05f7bd9ab4f9c423400f62f2d3da5',
            'uploader': 'Sky News',
            'uploader_id': 'AA2eki',
            'duration': 52,
            'thumbnail': 'https://img-s-msn-com.akamaized.net/tenant/amp/entityid/AA1zo7nU.img',
            'timestamp': 1739993965,
            'upload_date': '20250219',
            'release_timestamp': 1739977753,
            'release_date': '20250219',
            'modified_timestamp': 1742076259,
            'modified_date': '20250315',
        },
        'playlist_mincount': 4,
    }, {
        'url': 'http://www.msn.com/en-ae/news/offbeat/meet-the-nine-year-old-self-made-millionaire/ar-BBt6ZKf',
        'only_matching': True,
    }, {
        'url': 'http://www.msn.com/en-ae/video/watch/obama-a-lot-of-people-will-be-disappointed/vi-AAhxUMH',
        'only_matching': True,
    }, {
        # geo restricted
        'url': 'http://www.msn.com/en-ae/foodanddrink/joinourtable/the-first-fart-makes-you-laugh-the-last-fart-makes-you-cry/vp-AAhzIBU',
        'only_matching': True,
    }, {
        'url': 'http://www.msn.com/en-ae/entertainment/bollywood/watch-how-salman-khan-reacted-when-asked-if-he-would-apologize-for-his-‘raped-woman’-comment/vi-AAhvzW6',
        'only_matching': True,
    }, {
        # Vidible(AOL) Embed
        'url': 'https://www.msn.com/en-us/money/other/jupiter-is-about-to-come-so-close-you-can-see-its-moons-with-binoculars/vi-AACqsHR',
        'only_matching': True,
        'url': 'https://www.msn.com/en-us/entertainment/news/rock-frontman-replacements-you-might-not-know-happened/vi-AA1yLVcD',
        'info_dict': {
            'id': 'AA1yLVcD',
            'ext': 'mp4',
            'display_id': 'rock-frontman-replacements-you-might-not-know-happened',
            'title': 'Rock Frontman Replacements You Might Not Know Happened',
            'description': 'md5:451a125496ff0c9f6816055bb1808da9',
            'uploader': 'Grunge (Video)',
            'uploader_id': 'BB1oveoV',
            'duration': 596,
            'thumbnail': 'https://img-s-msn-com.akamaized.net/tenant/amp/entityid/AA1yM4OJ.img',
            'timestamp': 1739223456,
            'upload_date': '20250210',
            'release_timestamp': 1739219731,
            'release_date': '20250210',
            'modified_timestamp': 1741427272,
            'modified_date': '20250308',
        },
    }, {
        # Dailymotion Embed
        'url': 'https://www.msn.com/es-ve/entretenimiento/watch/winston-salem-paire-refait-des-siennes-en-perdant-sa-raquette-au-service/vp-AAG704L',
        'only_matching': True,
        'url': 'https://www.msn.com/de-de/nachrichten/other/the-first-descendant-gameplay-trailer-zu-serena-der-neuen-gefl%C3%BCgelten-nachfahrin/vi-AA1B1d06',
        'info_dict': {
            'id': 'x9g6oli',
            'ext': 'mp4',
            'title': 'The First Descendant: Gameplay-Trailer zu Serena, der neuen geflügelten Nachfahrin',
            'description': '',
            'uploader': 'MeinMMO',
            'uploader_id': 'x2mvqi4',
            'view_count': int,
            'like_count': int,
            'age_limit': 0,
            'duration': 60,
            'thumbnail': 'https://s1.dmcdn.net/v/Y3fO61drj56vPB9SS/x1080',
            'tags': ['MeinMMO', 'The First Descendant'],
            'timestamp': 1742124877,
            'upload_date': '20250316',
        },
    }, {
        # YouTube Embed
        'url': 'https://www.msn.com/en-in/money/news/meet-vikram-%E2%80%94-chandrayaan-2s-lander/vi-AAGUr0v',
        'only_matching': True,
        # Youtube Embed
        'url': 'https://www.msn.com/en-gb/video/webcontent/web-content/vi-AA1ybFaJ',
        'info_dict': {
            'id': 'kQSChWu95nE',
            'ext': 'mp4',
            'title': '7 Daily Habits to Nurture Your Personal Growth',
            'description': 'md5:6f233c68341b74dee30c8c121924e827',
            'uploader': 'TopThink',
            'uploader_id': '@TopThink',
            'uploader_url': 'https://www.youtube.com/@TopThink',
            'channel': 'TopThink',
            'channel_id': 'UCMlGmHokrQRp-RaNO7aq4Uw',
            'channel_url': 'https://www.youtube.com/channel/UCMlGmHokrQRp-RaNO7aq4Uw',
            'channel_is_verified': True,
            'channel_follower_count': int,
            'comment_count': int,
            'view_count': int,
            'like_count': int,
            'age_limit': 0,
            'duration': 705,
            'thumbnail': 'https://i.ytimg.com/vi/kQSChWu95nE/maxresdefault.jpg',
            'categories': ['Howto & Style'],
            'tags': ['topthink', 'top think', 'personal growth'],
            'timestamp': 1722711620,
            'upload_date': '20240803',
            'playable_in_embed': True,
            'availability': 'public',
            'live_status': 'not_live',
        },
    }, {
        # NBCSports Embed
        'url': 'https://www.msn.com/en-us/money/football_nfl/week-13-preview-redskins-vs-panthers/vi-BBXsCDb',
        'only_matching': True,
        # Article with social embed
        'url': 'https://www.msn.com/en-in/news/techandscience/watch-earth-sets-and-rises-behind-moon-in-breathtaking-blue-ghost-video/ar-AA1zKoAc',
        'info_dict': {
            'id': 'AA1zKoAc',
            'title': 'Watch: Earth sets and rises behind Moon in breathtaking Blue Ghost video',
            'description': 'md5:0ad51cfa77e42e7f0c46cf98a619dbbf',
            'uploader': 'India Today',
            'uploader_id': 'AAyFWG',
            'tags': 'count:11',
            'timestamp': 1740485034,
            'upload_date': '20250225',
            'release_timestamp': 1740484875,
            'release_date': '20250225',
            'modified_timestamp': 1740488561,
            'modified_date': '20250225',
        },
        'playlist_count': 1,
    }]

|
||||
display_id, page_id = self._match_valid_url(url).groups()
|
||||
locale, display_id, page_id = self._match_valid_url(url).group('locale', 'display_id', 'id')
|
||||
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
json_data = self._download_json(
|
||||
f'https://assets.msn.com/content/view/v2/Detail/{locale}/{page_id}', page_id)
|
||||
|
||||
entries = []
|
||||
for _, metadata in re.findall(r'data-metadata\s*=\s*(["\'])(?P<data>.+?)\1', webpage):
|
||||
video = self._parse_json(unescapeHTML(metadata), display_id)
|
||||
|
||||
provider_id = video.get('providerId')
|
||||
player_name = video.get('playerName')
|
||||
if player_name and provider_id:
|
||||
entry = None
|
||||
if player_name == 'AOL':
|
||||
if provider_id.startswith('http'):
|
||||
provider_id = self._search_regex(
|
||||
r'https?://delivery\.vidible\.tv/video/redirect/([0-9a-f]{24})',
|
||||
provider_id, 'vidible id')
|
||||
entry = self.url_result(
|
||||
'aol-video:' + provider_id, 'Aol', provider_id)
|
||||
elif player_name == 'Dailymotion':
|
||||
entry = self.url_result(
|
||||
'https://www.dailymotion.com/video/' + provider_id,
|
||||
'Dailymotion', provider_id)
|
||||
elif player_name == 'YouTube':
|
||||
entry = self.url_result(
|
||||
provider_id, 'Youtube', provider_id)
|
||||
elif player_name == 'NBCSports':
|
||||
entry = self.url_result(
|
||||
                    'http://vplayer.nbcsports.com/p/BxmELC/nbcsports_embed/select/media/' + provider_id,
                    'NBCSportsVPlayer', provider_id)
                if entry:
                    entries.append(entry)
                continue

            video_id = video['uuid']
            title = video['title']
        common_metadata = traverse_obj(json_data, {
            'title': ('title', {str}),
            'description': (('abstract', ('body', {clean_html})), {str}, filter, any),
            'timestamp': ('createdDateTime', {parse_iso8601}),
            'release_timestamp': ('publishedDateTime', {parse_iso8601}),
            'modified_timestamp': ('updatedDateTime', {parse_iso8601}),
            'thumbnail': ('thumbnail', 'image', 'url', {url_or_none}),
            'duration': ('videoMetadata', 'playTime', {int_or_none}),
            'tags': ('keywords', ..., {str}),
            'uploader': ('provider', 'name', {str}),
            'uploader_id': ('provider', 'id', {str}),
        })

        page_type = json_data['type']
        source_url = traverse_obj(json_data, ('sourceHref', {url_or_none}))
        if page_type == 'video':
            if traverse_obj(json_data, ('thirdPartyVideoPlayer', 'enabled')) and source_url:
                return self.url_result(source_url)
            formats = []
            for file_ in video.get('videoFiles', []):
                format_url = file_.get('url')
                if not format_url:
                    continue
                if 'format=m3u8-aapl' in format_url:
                    # m3u8_native should not be used here until
                    # https://github.com/ytdl-org/youtube-dl/issues/9913 is fixed
                    formats.extend(self._extract_m3u8_formats(
                        format_url, display_id, 'mp4',
                        m3u8_id='hls', fatal=False))
                elif 'format=mpd-time-csf' in format_url:
                    formats.extend(self._extract_mpd_formats(
                        format_url, display_id, 'dash', fatal=False))
                elif '.ism' in format_url:
                    if format_url.endswith('.ism'):
                        format_url += '/manifest'
                    formats.extend(self._extract_ism_formats(
                        format_url, display_id, 'mss', fatal=False))
                else:
                    format_id = file_.get('formatCode')
                    formats.append({
                        'url': format_url,
                        'ext': 'mp4',
                        'format_id': format_id,
                        'width': int_or_none(file_.get('width')),
                        'height': int_or_none(file_.get('height')),
                        'vbr': int_or_none(self._search_regex(r'_(\d+)\.mp4', format_url, 'vbr', default=None)),
                        'quality': 1 if format_id == '1001' else None,
                    })

            subtitles = {}
            for file_ in video.get('files', []):
                format_url = file_.get('url')
                format_code = file_.get('formatCode')
                if not format_url or not format_code:
                    continue
                if str(format_code) == '3100':
                    subtitles.setdefault(file_.get('culture', 'en'), []).append({
                        'ext': determine_ext(format_url, 'ttml'),
                        'url': format_url,
                    })
            for file in traverse_obj(json_data, ('videoMetadata', 'externalVideoFiles', lambda _, v: url_or_none(v['url']))):
                file_url = file['url']
                ext = determine_ext(file_url)
                if ext == 'm3u8':
                    fmts, subs = self._extract_m3u8_formats_and_subtitles(
                        file_url, page_id, 'mp4', m3u8_id='hls', fatal=False)
                    formats.extend(fmts)
                    self._merge_subtitles(subs, target=subtitles)
                elif ext == 'mpd':
                    fmts, subs = self._extract_mpd_formats_and_subtitles(
                        file_url, page_id, mpd_id='dash', fatal=False)
                    formats.extend(fmts)
                    self._merge_subtitles(subs, target=subtitles)
                else:
                    formats.append(
                        traverse_obj(file, {
                            'url': 'url',
                            'format_id': ('format', {str}),
                            'filesize': ('fileSize', {int_or_none}),
                            'height': ('height', {int_or_none}),
                            'width': ('width', {int_or_none}),
                        }))
            for caption in traverse_obj(json_data, ('videoMetadata', 'closedCaptions', lambda _, v: url_or_none(v['href']))):
                lang = caption.get('locale') or 'en-us'
                subtitles.setdefault(lang, []).append({
                    'url': caption['href'],
                    'ext': 'ttml',
                })

            entries.append({
                'id': video_id,
            return {
                'id': page_id,
                'display_id': display_id,
                'title': title,
                'description': video.get('description'),
                'thumbnail': video.get('headlineImage', {}).get('url'),
                'duration': int_or_none(video.get('durationSecs')),
                'uploader': video.get('sourceFriendly'),
                'uploader_id': video.get('providerId'),
                'creator': video.get('creator'),
                'subtitles': subtitles,
                'formats': formats,
            })
                'subtitles': subtitles,
                **common_metadata,
            }
        elif page_type == 'webcontent':
            if not source_url:
                raise ExtractorError('Could not find source URL')
            return self.url_result(source_url)
        elif page_type == 'article':
            entries = []
            for embed_url in traverse_obj(json_data, ('socialEmbeds', ..., 'postUrl', {url_or_none})):
                entries.append(self.url_result(embed_url))

        if not entries:
            error = unescapeHTML(self._search_regex(
                r'data-error=(["\'])(?P<error>.+?)\1',
                webpage, 'error', group='error'))
            raise ExtractorError(f'{self.IE_NAME} said: {error}', expected=True)
            return self.playlist_result(entries, page_id, **common_metadata)

        return self.playlist_result(entries, page_id)
        raise ExtractorError(f'Unsupported page type: {page_type}')
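The rewritten MSN extraction above leans on yt-dlp's declarative traverse_obj mapping: a dict of output keys to traversal paths, where a {callable} in a path post-processes the matched value. A minimal sketch of that pattern, using made-up sample JSON rather than MSN's actual schema:

import json
from yt_dlp.utils import parse_iso8601, traverse_obj

json_data = {'title': 'Clip', 'createdDateTime': '2024-01-01T00:00:00Z', 'provider': {'name': 'Example'}}
metadata = traverse_obj(json_data, {
    'title': ('title', {str}),                          # plain string passthrough
    'timestamp': ('createdDateTime', {parse_iso8601}),  # ISO 8601 -> unix timestamp
    'uploader': ('provider', 'name', {str}),            # nested key lookup
})
assert metadata == {'title': 'Clip', 'timestamp': 1704067200, 'uploader': 'Example'}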
@@ -4,7 +4,9 @@ from .common import InfoExtractor
from ..utils import (
    extract_attributes,
    unified_timestamp,
    url_or_none,
)
from ..utils.traversal import traverse_obj


class N1InfoAssetIE(InfoExtractor):
@@ -35,9 +37,9 @@ class N1InfoIIE(InfoExtractor):
    IE_NAME = 'N1Info:article'
    _VALID_URL = r'https?://(?:(?:\w+\.)?n1info\.\w+|nova\.rs)/(?:[^/?#]+/){1,2}(?P<id>[^/?#]+)'
    _TESTS = [{
        # Youtube embedded
        # YouTube embedded
        'url': 'https://rs.n1info.com/sport-klub/tenis/kako-je-djokovic-propustio-istorijsku-priliku-video/',
        'md5': '01ddb6646d0fd9c4c7d990aa77fe1c5a',
        'md5': '987ce6fd72acfecc453281e066b87973',
        'info_dict': {
            'id': 'L5Hd4hQVUpk',
            'ext': 'mp4',
@@ -45,7 +47,26 @@ class N1InfoIIE(InfoExtractor):
            'title': 'Ozmo i USO21, ep. 13: Novak Đoković – Danil Medvedev | Ključevi Poraza, Budućnost | SPORT KLUB TENIS',
            'description': 'md5:467f330af1effedd2e290f10dc31bb8e',
            'uploader': 'Sport Klub',
            'uploader_id': 'sportklub',
            'uploader_id': '@sportklub',
            'uploader_url': 'https://www.youtube.com/@sportklub',
            'channel': 'Sport Klub',
            'channel_id': 'UChpzBje9Ro6CComXe3BgNaw',
            'channel_url': 'https://www.youtube.com/channel/UChpzBje9Ro6CComXe3BgNaw',
            'channel_is_verified': True,
            'channel_follower_count': int,
            'comment_count': int,
            'view_count': int,
            'like_count': int,
            'age_limit': 0,
            'duration': 1049,
            'thumbnail': 'https://i.ytimg.com/vi/L5Hd4hQVUpk/maxresdefault.jpg',
            'chapters': 'count:9',
            'categories': ['Sports'],
            'tags': 'count:10',
            'timestamp': 1631522787,
            'playable_in_embed': True,
            'availability': 'public',
            'live_status': 'not_live',
        },
    }, {
        'url': 'https://rs.n1info.com/vesti/djilas-los-plan-za-metro-nece-resiti-nijedan-saobracajni-problem/',
@@ -55,6 +76,7 @@ class N1InfoIIE(InfoExtractor):
            'title': 'Đilas: Predlog izgradnje metroa besmislen; SNS odbacuje navode',
            'upload_date': '20210924',
            'timestamp': 1632481347,
            'thumbnail': 'http://n1info.rs/wp-content/themes/ucnewsportal-n1/dist/assets/images/placeholder-image-video.jpg',
        },
        'params': {
            'skip_download': True,
@@ -67,6 +89,7 @@ class N1InfoIIE(InfoExtractor):
            'title': 'Zadnji dnevi na kopališču Ilirija: “Ilirija ni umrla, ubili so jo”',
            'timestamp': 1632567630,
            'upload_date': '20210925',
            'thumbnail': 'https://n1info.si/wp-content/uploads/2021/09/06/1630945843-tomaz3.png',
        },
        'params': {
            'skip_download': True,
@@ -81,6 +104,14 @@ class N1InfoIIE(InfoExtractor):
            'upload_date': '20210924',
            'timestamp': 1632448649.0,
            'uploader': 'YouLotWhatDontStop',
            'display_id': 'pu9wbx',
            'channel_id': 'serbia',
            'comment_count': int,
            'like_count': int,
            'dislike_count': int,
            'age_limit': 0,
            'duration': 134,
            'thumbnail': 'https://external-preview.redd.it/5nmmawSeGx60miQM3Iq-ueC9oyCLTLjjqX-qqY8uRsc.png?format=pjpg&auto=webp&s=2f973400b04d23f871b608b178e47fc01f9b8f1d',
        },
        'params': {
            'skip_download': True,
@@ -93,6 +124,7 @@ class N1InfoIIE(InfoExtractor):
            'title': 'Žaklina Tatalović Ani Brnabić: Pričate laži (VIDEO)',
            'upload_date': '20211102',
            'timestamp': 1635861677,
            'thumbnail': 'https://nova.rs/wp-content/uploads/2021/11/02/1635860298-TNJG_Ana_Brnabic_i_Zaklina_Tatalovic_100_dana_Vlade_GP.jpg',
        },
    }, {
        'url': 'https://n1info.rs/vesti/cuta-biti-u-kosovskoj-mitrovici-znaci-da-te-docekaju-eksplozivnim-napravama/',
@@ -104,6 +136,16 @@ class N1InfoIIE(InfoExtractor):
            'timestamp': 1687290536,
            'thumbnail': 'https://cdn.brid.tv/live/partners/26827/snapshot/1332368_th_6492013a8356f_1687290170.jpg',
        },
    }, {
        'url': 'https://n1info.rs/vesti/vuciceva-turneja-po-srbiji-najavljuje-kontrarevoluciju-preti-svom-narodu-vredja-novinare/',
        'info_dict': {
            'id': '2025974',
            'ext': 'mp4',
            'title': 'Vučićeva turneja po Srbiji: Najavljuje kontrarevoluciju, preti svom narodu, vređa novinare',
            'thumbnail': 'https://cdn-uc.brid.tv/live/partners/26827/snapshot/2025974_fhd_67c4a23280a81_1740939826.jpg',
            'timestamp': 1740939936,
            'upload_date': '20250302',
        },
    }, {
        'url': 'https://hr.n1info.com/vijesti/pravobraniteljica-o-ubojstvu-u-zagrebu-radi-se-o-doista-nezapamcenoj-situaciji/',
        'only_matching': True,
@@ -115,11 +157,11 @@ class N1InfoIIE(InfoExtractor):

        title = self._html_search_regex(r'<h1[^>]+>(.+?)</h1>', webpage, 'title')
        timestamp = unified_timestamp(self._html_search_meta('article:published_time', webpage))
        plugin_data = self._html_search_meta('BridPlugin', webpage)
        plugin_data = re.findall(r'\$bp\("(?:Brid|TargetVideo)_\d+",\s(.+)\);', webpage)
        entries = []
        if plugin_data:
            site_id = self._html_search_regex(r'site:(\d+)', webpage, 'site id')
            for video_data in re.findall(r'\$bp\("Brid_\d+", (.+)\);', webpage):
            for video_data in plugin_data:
                video_id = self._parse_json(video_data, title)['video']
                entries.append({
                    'id': video_id,
@@ -140,7 +182,7 @@ class N1InfoIIE(InfoExtractor):
                'url': video_data.get('data-url'),
                'id': video_data.get('id'),
                'title': title,
                'thumbnail': video_data.get('data-thumbnail'),
                'thumbnail': traverse_obj(video_data, (('data-thumbnail', 'data-default_thumbnail'), {url_or_none}, any)),
                'timestamp': timestamp,
                'ie_key': 'N1InfoAsset',
            })
@@ -152,7 +194,7 @@ class N1InfoIIE(InfoExtractor):
            if url.startswith('https://www.youtube.com'):
                entries.append(self.url_result(url, ie='Youtube'))
            elif url.startswith('https://www.redditmedia.com'):
                entries.append(self.url_result(url, ie='RedditR'))
                entries.append(self.url_result(url, ie='Reddit'))

        return {
            '_type': 'playlist',
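The N1 fix widens the player-data regex so both Brid and TargetVideo embeds are captured before JSON-parsing the second `$bp(...)` argument. A small self-contained check of that regex (the sample markup is illustrative, not a real N1 page):

import json
import re

webpage = '<script>$bp("TargetVideo_123", {"video": "abc123"});</script>'
plugin_data = re.findall(r'\$bp\("(?:Brid|TargetVideo)_\d+",\s(.+)\);', webpage)
assert json.loads(plugin_data[0])['video'] == 'abc123'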
@@ -736,7 +736,7 @@ class NBCStationsIE(InfoExtractor):
        webpage = self._download_webpage(url, video_id)

        nbc_data = self._search_json(
            r'<script>\s*var\s+nbc\s*=', webpage, 'NBC JSON data', video_id)
            r'(?:<script>\s*var\s+nbc\s*=|Object\.assign\(nbc,)', webpage, 'NBC JSON data', video_id)
        pdk_acct = nbc_data.get('pdkAcct') or 'Yh1nAC'
        fw_ssid = traverse_obj(nbc_data, ('video', 'fwSSID'))
@@ -13,11 +13,13 @@ from ..utils import (
    ExtractorError,
    OnDemandPagedList,
    clean_html,
    determine_ext,
    float_or_none,
    int_or_none,
    join_nonempty,
    parse_duration,
    parse_iso8601,
    parse_qs,
    parse_resolution,
    qualities,
    remove_start,
@@ -26,6 +28,7 @@ from ..utils import (
    try_get,
    unescapeHTML,
    update_url_query,
    url_basename,
    url_or_none,
    urlencode_postdata,
    urljoin,
@@ -430,6 +433,7 @@ class NiconicoIE(InfoExtractor):
                'format_id': ('id', {str}),
                'abr': ('bitRate', {float_or_none(scale=1000)}),
                'asr': ('samplingRate', {int_or_none}),
                'quality': ('qualityLevel', {int_or_none}),
            }), get_all=False),
            'acodec': 'aac',
        }
@@ -441,7 +445,9 @@ class NiconicoIE(InfoExtractor):
        min_abr = min(traverse_obj(audios, (..., 'bitRate', {float_or_none})), default=0) / 1000
        for video_fmt in video_fmts:
            video_fmt['tbr'] -= min_abr
            video_fmt['format_id'] = f'video-{video_fmt["tbr"]:.0f}'
            video_fmt['format_id'] = url_basename(video_fmt['url']).rpartition('.')[0]
            video_fmt['quality'] = traverse_obj(videos, (
                lambda _, v: v['id'] == video_fmt['format_id'], 'qualityLevel', {int_or_none}, any)) or -1
            yield video_fmt

    def _real_extract(self, url):
@@ -1033,6 +1039,7 @@ class NiconicoLiveIE(InfoExtractor):
            thumbnails.append({
                'id': f'{name}_{width}x{height}',
                'url': img_url,
                'ext': traverse_obj(parse_qs(img_url), ('image', 0, {determine_ext(default_ext='jpg')})),
                **res,
            })
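The Niconico change derives format_id from the stream URL's basename instead of the computed tbr, so it can be matched back against the API's qualityLevel entries. Roughly, assuming an illustrative URL:

from yt_dlp.utils import url_basename

video_url = 'https://example.com/streams/video-h264-1080p.mp4?session=xyz'
format_id = url_basename(video_url).rpartition('.')[0]  # url_basename drops the query string
assert format_id == 'video-h264-1080p'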
@@ -67,7 +67,7 @@ class OpenRecBaseIE(InfoExtractor):

class OpenRecIE(OpenRecBaseIE):
    IE_NAME = 'openrec'
    _VALID_URL = r'https?://(?:www\.)?openrec\.tv/live/(?P<id>[^/]+)'
    _VALID_URL = r'https?://(?:www\.)?openrec\.tv/live/(?P<id>[^/?#]+)'
    _TESTS = [{
        'url': 'https://www.openrec.tv/live/2p8v31qe4zy',
        'only_matching': True,
@@ -85,7 +85,7 @@ class OpenRecIE(OpenRecBaseIE):

class OpenRecCaptureIE(OpenRecBaseIE):
    IE_NAME = 'openrec:capture'
    _VALID_URL = r'https?://(?:www\.)?openrec\.tv/capture/(?P<id>[^/]+)'
    _VALID_URL = r'https?://(?:www\.)?openrec\.tv/capture/(?P<id>[^/?#]+)'
    _TESTS = [{
        'url': 'https://www.openrec.tv/capture/l9nk2x4gn14',
        'only_matching': True,
@@ -129,7 +129,7 @@ class OpenRecCaptureIE(OpenRecBaseIE):

class OpenRecMovieIE(OpenRecBaseIE):
    IE_NAME = 'openrec:movie'
    _VALID_URL = r'https?://(?:www\.)?openrec\.tv/movie/(?P<id>[^/]+)'
    _VALID_URL = r'https?://(?:www\.)?openrec\.tv/movie/(?P<id>[^/?#]+)'
    _TESTS = [{
        'url': 'https://www.openrec.tv/movie/nqz5xl5km8v',
        'info_dict': {
@@ -141,6 +141,9 @@ class OpenRecMovieIE(OpenRecBaseIE):
            'uploader_id': 'taiki_to_kazuhiro',
            'timestamp': 1638856800,
        },
    }, {
        'url': 'https://www.openrec.tv/movie/2p8vvex548y?playlist_id=98brq96vvsgn2nd',
        'only_matching': True,
    }]

    def _real_extract(self, url):
@@ -23,9 +23,9 @@ class PinterestBaseIE(InfoExtractor):
    def _call_api(self, resource, video_id, options):
        return self._download_json(
            f'https://www.pinterest.com/resource/{resource}Resource/get/',
            video_id, f'Download {resource} JSON metadata', query={
                'data': json.dumps({'options': options}),
            })['resource_response']
            video_id, f'Download {resource} JSON metadata',
            query={'data': json.dumps({'options': options})},
            headers={'X-Pinterest-PWS-Handler': 'www/[username].js'})['resource_response']

    def _extract_video(self, data, extract_formats=True):
        video_id = data['id']
@@ -1,4 +1,7 @@
import base64
import hashlib
import json
import uuid

from .common import InfoExtractor
from ..utils import (
@@ -142,39 +145,73 @@ class PlaySuisseIE(InfoExtractor):
            id
            url
        }'''
    _LOGIN_BASE_URL = 'https://login.srgssr.ch/srgssrlogin.onmicrosoft.com'
    _LOGIN_PATH = 'B2C_1A__SignInV2'
    _CLIENT_ID = '1e33f1bf-8bf3-45e4-bbd9-c9ad934b5fca'
    _LOGIN_BASE = 'https://account.srgssr.ch'
    _ID_TOKEN = None

    def _perform_login(self, username, password):
        login_page = self._download_webpage(
            'https://www.playsuisse.ch/api/sso/login', None, note='Downloading login page',
            query={'x': 'x', 'locale': 'de', 'redirectUrl': 'https://www.playsuisse.ch/'})
        settings = self._search_json(r'var\s+SETTINGS\s*=', login_page, 'settings', None)
        code_verifier = uuid.uuid4().hex + uuid.uuid4().hex + uuid.uuid4().hex
        code_challenge = base64.urlsafe_b64encode(
            hashlib.sha256(code_verifier.encode()).digest()).decode().rstrip('=')

        csrf_token = settings['csrf']
        query = {'tx': settings['transId'], 'p': self._LOGIN_PATH}
        request_id = parse_qs(self._request_webpage(
            f'{self._LOGIN_BASE}/authz-srv/authz', None, 'Requesting session ID', query={
                'client_id': self._CLIENT_ID,
                'redirect_uri': 'https://www.playsuisse.ch/auth',
                'scope': 'email profile openid offline_access',
                'response_type': 'code',
                'code_challenge': code_challenge,
                'code_challenge_method': 'S256',
                'view_type': 'login',
            }).url)['requestId'][0]

        status = traverse_obj(self._download_json(
            f'{self._LOGIN_BASE_URL}/{self._LOGIN_PATH}/SelfAsserted', None, 'Logging in',
            query=query, headers={'X-CSRF-TOKEN': csrf_token}, data=urlencode_postdata({
                'request_type': 'RESPONSE',
                'signInName': username,
                'password': password,
            }), expected_status=400), ('status', {int_or_none}))
        if status == 400:
            raise ExtractorError('Invalid username or password', expected=True)
        try:
            exchange_id = self._download_json(
                f'{self._LOGIN_BASE}/verification-srv/v2/authenticate/initiate/password', None,
                'Submitting username', headers={'content-type': 'application/json'}, data=json.dumps({
                    'usage_type': 'INITIAL_AUTHENTICATION',
                    'request_id': request_id,
                    'medium_id': 'PASSWORD',
                    'type': 'password',
                    'identifier': username,
                }).encode())['data']['exchange_id']['exchange_id']
        except ExtractorError:
            raise ExtractorError('Invalid username', expected=True)

        urlh = self._request_webpage(
            f'{self._LOGIN_BASE_URL}/{self._LOGIN_PATH}/api/CombinedSigninAndSignup/confirmed',
            None, 'Downloading ID token', query={
                'rememberMe': 'false',
                'csrf_token': csrf_token,
                **query,
                'diags': '',
            })
        try:
            login_data = self._download_json(
                f'{self._LOGIN_BASE}/verification-srv/v2/authenticate/authenticate/password', None,
                'Submitting password', headers={'content-type': 'application/json'}, data=json.dumps({
                    'requestId': request_id,
                    'exchange_id': exchange_id,
                    'type': 'password',
                    'password': password,
                }).encode())['data']
        except ExtractorError:
            raise ExtractorError('Invalid password', expected=True)

        authorization_code = parse_qs(self._request_webpage(
            f'{self._LOGIN_BASE}/login-srv/verification/login', None, 'Logging in',
            data=urlencode_postdata({
                'requestId': request_id,
                'exchange_id': login_data['exchange_id']['exchange_id'],
                'verificationType': 'password',
                'sub': login_data['sub'],
                'status_id': login_data['status_id'],
                'rememberMe': True,
                'lat': '',
                'lon': '',
            })).url)['code'][0]

        self._ID_TOKEN = self._download_json(
            f'{self._LOGIN_BASE}/proxy/token', None, 'Downloading token', data=b'', query={
                'client_id': self._CLIENT_ID,
                'redirect_uri': 'https://www.playsuisse.ch/auth',
                'code': authorization_code,
                'code_verifier': code_verifier,
                'grant_type': 'authorization_code',
            })['id_token']

        self._ID_TOKEN = traverse_obj(parse_qs(urlh.url), ('id_token', 0))
        if not self._ID_TOKEN:
            raise ExtractorError('Login failed')
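The new PlaySuisse login is a standard OAuth 2.0 authorization-code flow with PKCE: the verifier is three concatenated UUID hex strings, and the S256 challenge is the base64url-encoded SHA-256 digest with padding stripped. The transform in isolation:

import base64
import hashlib
import uuid

code_verifier = uuid.uuid4().hex + uuid.uuid4().hex + uuid.uuid4().hex  # 96 hex chars
code_challenge = base64.urlsafe_b64encode(
    hashlib.sha256(code_verifier.encode()).digest()).decode().rstrip('=')
assert len(code_challenge) == 43  # 32-byte digest -> 43 base64url chars without padding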
@@ -8,6 +8,7 @@ from ..utils import (
    int_or_none,
    parse_qs,
    traverse_obj,
    truncate_string,
    try_get,
    unescapeHTML,
    update_url_query,
@@ -26,6 +27,7 @@ class RedditIE(InfoExtractor):
            'ext': 'mp4',
            'display_id': '6rrwyj',
            'title': 'That small heart attack.',
            'alt_title': 'That small heart attack.',
            'thumbnail': r're:^https?://.*\.(?:jpg|png)',
            'thumbnails': 'count:4',
            'timestamp': 1501941939,
@@ -49,7 +51,8 @@ class RedditIE(InfoExtractor):
            'id': 'gyh95hiqc0b11',
            'ext': 'mp4',
            'display_id': '90bu6w',
            'title': 'Heat index was 110 degrees so we offered him a cold drink. He went for a full body soak instead',
            'title': 'Heat index was 110 degrees so we offered him a cold drink. He went fo...',
            'alt_title': 'Heat index was 110 degrees so we offered him a cold drink. He went for a full body soak instead',
            'thumbnail': r're:^https?://.*\.(?:jpg|png)',
            'thumbnails': 'count:7',
            'timestamp': 1532051078,
@@ -69,7 +72,8 @@ class RedditIE(InfoExtractor):
            'id': 'zasobba6wp071',
            'ext': 'mp4',
            'display_id': 'nip71r',
            'title': 'I plan to make more stickers and prints! Check them out on my Etsy! Or get them through my Patreon. Links below.',
            'title': 'I plan to make more stickers and prints! Check them out on my Etsy! O...',
            'alt_title': 'I plan to make more stickers and prints! Check them out on my Etsy! Or get them through my Patreon. Links below.',
            'thumbnail': r're:^https?://.*\.(?:jpg|png)',
            'thumbnails': 'count:5',
            'timestamp': 1621709093,
@@ -91,7 +95,17 @@ class RedditIE(InfoExtractor):
        'playlist_count': 2,
        'info_dict': {
            'id': 'wzqkxp',
            'title': 'md5:72d3d19402aa11eff5bd32fc96369b37',
            'title': '[Finale] Kamen Rider Revice Episode 50 "Family to the End, Until the ...',
            'alt_title': '[Finale] Kamen Rider Revice Episode 50 "Family to the End, Until the Day We Meet Again" Discussion',
            'description': 'md5:5b7deb328062b164b15704c5fd67c335',
            'uploader': 'TheTwelveYearOld',
            'channel_id': 'KamenRider',
            'comment_count': int,
            'like_count': int,
            'dislike_count': int,
            'age_limit': 0,
            'timestamp': 1661676059.0,
            'upload_date': '20220828',
        },
    }, {
        # crossposted reddit-hosted media
@@ -102,6 +116,7 @@ class RedditIE(InfoExtractor):
            'ext': 'mp4',
            'display_id': 'zjjw82',
            'title': 'Cringe',
            'alt_title': 'Cringe',
            'uploader': 'Otaku-senpai69420',
            'thumbnail': r're:^https?://.*\.(?:jpg|png)',
            'upload_date': '20221212',
@@ -122,6 +137,7 @@ class RedditIE(InfoExtractor):
            'ext': 'mp4',
            'display_id': '124pp33',
            'title': 'Harmless prank of some old friends',
            'alt_title': 'Harmless prank of some old friends',
            'uploader': 'Dudezila',
            'channel_id': 'ContagiousLaughter',
            'duration': 17,
@@ -142,6 +158,7 @@ class RedditIE(InfoExtractor):
            'ext': 'mp4',
            'display_id': '12fujy3',
            'title': 'Based Hasan?',
            'alt_title': 'Based Hasan?',
            'uploader': 'KingNigelXLII',
            'channel_id': 'GenZedong',
            'duration': 16,
@@ -161,6 +178,7 @@ class RedditIE(InfoExtractor):
            'ext': 'mp4',
            'display_id': '1cl9h0u',
            'title': 'The insurance claim will be interesting',
            'alt_title': 'The insurance claim will be interesting',
            'uploader': 'darrenpauli',
            'channel_id': 'Unexpected',
            'duration': 53,
@@ -183,6 +201,7 @@ class RedditIE(InfoExtractor):
            'ext': 'mp4',
            'display_id': '1cxwzso',
            'title': 'Tottenham [1] - 0 Newcastle United - James Maddison 31\'',
            'alt_title': 'Tottenham [1] - 0 Newcastle United - James Maddison 31\'',
            'uploader': 'Woodstovia',
            'channel_id': 'soccer',
            'duration': 30,
@@ -206,6 +225,7 @@ class RedditIE(InfoExtractor):
            'ext': 'mp4',
            'display_id': 'degtjo',
            'title': 'When the K hits',
            'alt_title': 'When the K hits',
            'uploader': '[deleted]',
            'channel_id': 'ketamine',
            'comment_count': int,
@@ -304,14 +324,6 @@ class RedditIE(InfoExtractor):
        data = data[0]['data']['children'][0]['data']
        video_url = data['url']

        over_18 = data.get('over_18')
        if over_18 is True:
            age_limit = 18
        elif over_18 is False:
            age_limit = 0
        else:
            age_limit = None

        thumbnails = []

        def add_thumbnail(src):
@@ -337,15 +349,19 @@ class RedditIE(InfoExtractor):
                add_thumbnail(resolution)

        info = {
            'title': data.get('title'),
            'thumbnails': thumbnails,
            'timestamp': float_or_none(data.get('created_utc')),
            'uploader': data.get('author'),
            'channel_id': data.get('subreddit'),
            'like_count': int_or_none(data.get('ups')),
            'dislike_count': int_or_none(data.get('downs')),
            'comment_count': int_or_none(data.get('num_comments')),
            'age_limit': age_limit,
            'age_limit': {True: 18, False: 0}.get(data.get('over_18')),
            **traverse_obj(data, {
                'title': ('title', {truncate_string(left=72)}),
                'alt_title': ('title', {str}),
                'description': ('selftext', {str}, filter),
                'timestamp': ('created_utc', {float_or_none}),
                'uploader': ('author', {str}),
                'channel_id': ('subreddit', {str}),
                'like_count': ('ups', {int_or_none}),
                'dislike_count': ('downs', {int_or_none}),
                'comment_count': ('num_comments', {int_or_none}),
            }),
        }

        parsed_url = urllib.parse.urlparse(video_url)
@@ -371,7 +387,7 @@ class RedditIE(InfoExtractor):
                    **info,
                })
            if entries:
                return self.playlist_result(entries, video_id, info.get('title'))
                return self.playlist_result(entries, video_id, **info)
            raise ExtractorError('No media found', expected=True)

        # Check if media is hosted on reddit:
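Two idioms from the Reddit change are worth noting: truncate_string(left=72) shortens over-long titles with a "..." suffix (the full title moves to alt_title), and the {True: 18, False: 0}.get(...) lookup maps over_18 to an age limit while leaving it None when the field is absent. For example:

from yt_dlp.utils import truncate_string

assert truncate_string('x' * 100, left=72) == 'x' * 69 + '...'
assert {True: 18, False: 0}.get(True) == 18
assert {True: 18, False: 0}.get(None) is None  # missing over_18 -> age_limit stays None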
@@ -3,12 +3,20 @@ import json
import re
import urllib.parse

from .common import InfoExtractor
from ..utils import js_to_json
from .common import InfoExtractor, Request
from ..utils import (
    determine_ext,
    int_or_none,
    js_to_json,
    parse_duration,
    parse_iso8601,
    url_or_none,
)
from ..utils.traversal import traverse_obj


class RTPIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?rtp\.pt/play/(?:(?:estudoemcasa|palco|zigzag)/)?p(?P<program_id>[0-9]+)/(?P<id>[^/?#]+)'
    _VALID_URL = r'https?://(?:www\.)?rtp\.pt/play/(?:[^/#?]+/)?p(?P<program_id>\d+)/(?P<id>e\d+)'
    _TESTS = [{
        'url': 'http://www.rtp.pt/play/p405/e174042/paixoes-cruzadas',
        'md5': 'e736ce0c665e459ddb818546220b4ef8',
@@ -16,99 +24,173 @@ class RTPIE(InfoExtractor):
            'id': 'e174042',
            'ext': 'mp3',
            'title': 'Paixões Cruzadas',
            'description': 'As paixões musicais de António Cartaxo e António Macedo',
            'description': 'md5:af979e58ba0ab73f78435fc943fdb070',
            'thumbnail': r're:^https?://.*\.jpg',
            'series': 'Paixões Cruzadas',
            'duration': 2950.0,
            'modified_timestamp': 1553693464,
            'modified_date': '20190327',
            'timestamp': 1417219200,
            'upload_date': '20141129',
        },
    }, {
        'url': 'https://www.rtp.pt/play/zigzag/p13166/e757904/25-curiosidades-25-de-abril',
        'md5': '9a81ed53f2b2197cfa7ed455b12f8ade',
        'md5': '5b4859940e3adef61247a77dfb76046a',
        'info_dict': {
            'id': 'e757904',
            'ext': 'mp4',
            'title': '25 Curiosidades, 25 de Abril',
            'description': 'Estudar ou não estudar - Em cada um dos episódios descobrimos uma curiosidade acerca de como era viver em Portugal antes da revolução do 25 de abr',
            'title': 'Estudar ou não estudar',
            'description': 'md5:3bfd7eb8bebfd5711a08df69c9c14c35',
            'thumbnail': r're:^https?://.*\.jpg',
            'timestamp': 1711958401,
            'duration': 146.0,
            'upload_date': '20240401',
            'modified_timestamp': 1712242991,
            'series': '25 Curiosidades, 25 de Abril',
            'episode_number': 2,
            'episode': 'Estudar ou não estudar',
            'modified_date': '20240404',
        },
    }, {
        'url': 'http://www.rtp.pt/play/p831/a-quimica-das-coisas',
        'only_matching': True,
    }, {
        'url': 'https://www.rtp.pt/play/estudoemcasa/p7776/portugues-1-ano',
        'only_matching': True,
    }, {
        'url': 'https://www.rtp.pt/play/palco/p13785/l7nnon',
        'only_matching': True,
        # Episode not accessible through API
        'url': 'https://www.rtp.pt/play/estudoemcasa/p7776/e500050/portugues-1-ano',
        'md5': '57660c0b46db9f22118c52cbd65975e4',
        'info_dict': {
            'id': 'e500050',
            'ext': 'mp4',
            'title': 'Português - 1.º ano',
            'duration': 1669.0,
            'description': 'md5:be68925c81269f8c6886589f25fe83ea',
            'upload_date': '20201020',
            'timestamp': 1603180799,
            'thumbnail': 'https://cdn-images.rtp.pt/EPG/imagens/39482_59449_64850.png?v=3&w=860',
        },
    }]

    _USER_AGENT = 'rtpplay/2.0.66 (pt.rtp.rtpplay; build:2066; iOS 15.8.3) Alamofire/5.9.1'
    _AUTH_TOKEN = None

    def _fetch_auth_token(self):
        if self._AUTH_TOKEN:
            return self._AUTH_TOKEN
        self._AUTH_TOKEN = traverse_obj(self._download_json(Request(
            'https://rtpplayapi.rtp.pt/play/api/2/token-manager',
            headers={
                'Accept': '*/*',
                'rtp-play-auth': 'RTPPLAY_MOBILE_IOS',
                'rtp-play-auth-hash': 'fac9c328b2f27e26e03d7f8942d66c05b3e59371e16c2a079f5c83cc801bd3ee',
                'rtp-play-auth-timestamp': '2145973229682',
                'User-Agent': self._USER_AGENT,
            }, extensions={'keep_header_casing': True}), None,
            note='Fetching guest auth token', errnote='Could not fetch guest auth token',
            fatal=False), ('token', 'token', {str}))
        return self._AUTH_TOKEN

    @staticmethod
    def _cleanup_media_url(url):
        if urllib.parse.urlparse(url).netloc == 'streaming-ondemand.rtp.pt':
            return None
        return url.replace('/drm-fps/', '/hls/').replace('/drm-dash/', '/dash/')

    def _extract_formats(self, media_urls, episode_id):
        formats = []
        subtitles = {}
        for media_url in set(traverse_obj(media_urls, (..., {url_or_none}, {self._cleanup_media_url}))):
            ext = determine_ext(media_url)
            if ext == 'm3u8':
                fmts, subs = self._extract_m3u8_formats_and_subtitles(
                    media_url, episode_id, m3u8_id='hls', fatal=False)
                formats.extend(fmts)
                self._merge_subtitles(subs, target=subtitles)
            elif ext == 'mpd':
                fmts, subs = self._extract_mpd_formats_and_subtitles(
                    media_url, episode_id, mpd_id='dash', fatal=False)
                formats.extend(fmts)
                self._merge_subtitles(subs, target=subtitles)
            else:
                formats.append({
                    'url': media_url,
                    'format_id': 'http',
                })
        return formats, subtitles

    def _extract_from_api(self, program_id, episode_id):
        auth_token = self._fetch_auth_token()
        if not auth_token:
            return
        episode_data = traverse_obj(self._download_json(
            f'https://www.rtp.pt/play/api/1/get-episode/{program_id}/{episode_id[1:]}', episode_id,
            query={'include_assets': 'true', 'include_webparams': 'true'},
            headers={
                'Accept': '*/*',
                'Authorization': f'Bearer {auth_token}',
                'User-Agent': self._USER_AGENT,
            }, fatal=False), 'result', {dict})
        if not episode_data:
            return
        asset_urls = traverse_obj(episode_data, ('assets', 0, 'asset_url', {dict}))
        media_urls = traverse_obj(asset_urls, (
            ((('hls', 'dash'), 'stream_url'), ('multibitrate', ('url_hls', 'url_dash'))),))
        formats, subtitles = self._extract_formats(media_urls, episode_id)

        for sub_data in traverse_obj(asset_urls, ('subtitles', 'vtt_list', lambda _, v: url_or_none(v['file']))):
            subtitles.setdefault(sub_data.get('code') or 'pt', []).append({
                'url': sub_data['file'],
                'name': sub_data.get('language'),
            })

        return {
            'id': episode_id,
            'formats': formats,
            'subtitles': subtitles,
            'thumbnail': traverse_obj(episode_data, ('assets', 0, 'asset_thumbnail', {url_or_none})),
            **traverse_obj(episode_data, ('episode', {
                'title': (('episode_title', 'program_title'), {str}, filter, any),
                'alt_title': ('episode_subtitle', {str}, filter),
                'description': (('episode_description', 'episode_summary'), {str}, filter, any),
                'timestamp': ('episode_air_date', {parse_iso8601(delimiter=' ')}),
                'modified_timestamp': ('episode_lastchanged', {parse_iso8601(delimiter=' ')}),
                'duration': ('episode_duration_complete', {parse_duration}),
                'episode': ('episode_title', {str}, filter),
                'episode_number': ('episode_number', {int_or_none}),
                'season': ('program_season', {str}, filter),
                'series': ('program_title', {str}, filter),
            })),
        }

    _RX_OBFUSCATION = re.compile(r'''(?xs)
        atob\s*\(\s*decodeURIComponent\s*\(\s*
        (\[[0-9A-Za-z%,'"]*\])
        \s*\.\s*join\(\s*(?:""|'')\s*\)\s*\)\s*\)
    ''')

    def __unobfuscate(self, data, *, video_id):
        if data.startswith('{'):
            data = self._RX_OBFUSCATION.sub(
                lambda m: json.dumps(
                    base64.b64decode(urllib.parse.unquote(
                        ''.join(self._parse_json(m.group(1), video_id)),
                    )).decode('iso-8859-1')),
                data)
        return js_to_json(data)
    def __unobfuscate(self, data):
        return self._RX_OBFUSCATION.sub(
            lambda m: json.dumps(
                base64.b64decode(urllib.parse.unquote(
                    ''.join(json.loads(m.group(1))),
                )).decode('iso-8859-1')),
            data)

    def _real_extract(self, url):
        video_id = self._match_id(url)

        webpage = self._download_webpage(url, video_id)
        title = self._html_search_meta(
            'twitter:title', webpage, display_name='title', fatal=True)

        f, config = self._search_regex(
            r'''(?sx)
                (?:var\s+f\s*=\s*(?P<f>".*?"|{[^;]+?});\s*)?
                var\s+player1\s+=\s+new\s+RTPPlayer\s*\((?P<config>{(?:(?!\*/).)+?})\);(?!\s*\*/)
            ''', webpage,
            'player config', group=('f', 'config'))

        config = self._parse_json(
            config, video_id,
            lambda data: self.__unobfuscate(data, video_id=video_id))
        f = config['file'] if not f else self._parse_json(
            f, video_id,
            lambda data: self.__unobfuscate(data, video_id=video_id))
    def _extract_from_html(self, url, episode_id):
        webpage = self._download_webpage(url, episode_id)

        formats = []
        if isinstance(f, dict):
            f_hls = f.get('hls')
            if f_hls is not None:
                formats.extend(self._extract_m3u8_formats(
                    f_hls, video_id, 'mp4', 'm3u8_native', m3u8_id='hls'))

            f_dash = f.get('dash')
            if f_dash is not None:
                formats.extend(self._extract_mpd_formats(f_dash, video_id, mpd_id='dash'))
        else:
            formats.append({
                'format_id': 'f',
                'url': f,
                'vcodec': 'none' if config.get('mediaType') == 'audio' else None,
            })

        subtitles = {}

        vtt = config.get('vtt')
        if vtt is not None:
            for lcode, lname, url in vtt:
                subtitles.setdefault(lcode, []).append({
                    'name': lname,
                    'url': url,
                })
        media_urls = traverse_obj(re.findall(r'(?:var\s+f\s*=|RTPPlayer\({[^}]+file:)\s*({[^}]+}|"[^"]+")', webpage), (
            -1, (({self.__unobfuscate}, {js_to_json}, {json.loads}, {dict.values}, ...), {json.loads})))
        formats, subtitles = self._extract_formats(media_urls, episode_id)

        return {
            'id': video_id,
            'title': title,
            'id': episode_id,
            'formats': formats,
            'description': self._html_search_meta(['description', 'twitter:description'], webpage),
            'thumbnail': config.get('poster') or self._og_search_thumbnail(webpage),
            'subtitles': subtitles,
            'description': self._html_search_meta(['og:description', 'twitter:description'], webpage, default=None),
            'thumbnail': self._html_search_meta(['og:image', 'twitter:image'], webpage, default=None),
            **self._search_json_ld(webpage, episode_id, default={}),
            'title': self._html_search_meta(['og:title', 'twitter:title'], webpage, default=None),
        }

    def _real_extract(self, url):
        program_id, episode_id = self._match_valid_url(url).group('program_id', 'id')
        return self._extract_from_api(program_id, episode_id) or self._extract_from_html(url, episode_id)
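RTP's HTML fallback still has to undo the page's atob(decodeURIComponent([...].join(''))) obfuscation; the regex substitution above replaces each such call with the decoded JSON string. A self-contained round-trip, with a synthetic payload standing in for RTP's real data:

import base64
import json
import re
import urllib.parse

_RX_OBFUSCATION = re.compile(r'''(?xs)
    atob\s*\(\s*decodeURIComponent\s*\(\s*
    (\[[0-9A-Za-z%,'"]*\])
    \s*\.\s*join\(\s*(?:""|'')\s*\)\s*\)\s*\)
''')

payload = urllib.parse.quote(base64.b64encode(b'{"file": "video.mp4"}').decode())
data = f'atob(decodeURIComponent(["{payload}"].join("")))'
unobfuscated = _RX_OBFUSCATION.sub(
    lambda m: json.dumps(
        base64.b64decode(urllib.parse.unquote(
            ''.join(json.loads(m.group(1))),
        )).decode('iso-8859-1')),
    data)
assert json.loads(json.loads(unobfuscated))['file'] == 'video.mp4'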
@@ -2,16 +2,18 @@ import urllib.parse

from .common import InfoExtractor
from ..utils import (
    clean_html,
    dict_get,
    int_or_none,
    parse_duration,
    unified_timestamp,
    url_or_none,
    urljoin,
)
from ..utils.traversal import traverse_obj


class SkyItPlayerIE(InfoExtractor):
    IE_NAME = 'player.sky.it'
    _VALID_URL = r'https?://player\.sky\.it/player/(?:external|social)\.html\?.*?\bid=(?P<id>\d+)'
class SkyItBaseIE(InfoExtractor):
    _GEO_BYPASS = False
    _DOMAIN = 'sky'
    _PLAYER_TMPL = 'https://player.sky.it/player/external.html?id=%s&domain=%s'
@@ -33,7 +35,6 @@ class SkyItPlayerIE(InfoExtractor):
            SkyItPlayerIE.ie_key(), video_id)

    def _parse_video(self, video, video_id):
        title = video['title']
        is_live = video.get('type') == 'live'
        hls_url = video.get(('streaming' if is_live else 'hls') + '_url')
        if not hls_url and video.get('geoblock' if is_live else 'geob'):
@@ -43,7 +44,7 @@ class SkyItPlayerIE(InfoExtractor):

        return {
            'id': video_id,
            'title': title,
            'title': video.get('title'),
            'formats': formats,
            'thumbnail': dict_get(video, ('video_still', 'video_still_medium', 'thumb')),
            'description': video.get('short_desc') or None,
@@ -52,6 +53,11 @@ class SkyItPlayerIE(InfoExtractor):
            'is_live': is_live,
        }


class SkyItPlayerIE(SkyItBaseIE):
    IE_NAME = 'player.sky.it'
    _VALID_URL = r'https?://player\.sky\.it/player/(?:external|social)\.html\?.*?\bid=(?P<id>\d+)'

    def _real_extract(self, url):
        video_id = self._match_id(url)
        domain = urllib.parse.parse_qs(urllib.parse.urlparse(
@@ -67,7 +73,7 @@ class SkyItPlayerIE(InfoExtractor):
        return self._parse_video(video, video_id)


class SkyItVideoIE(SkyItPlayerIE):  # XXX: Do not subclass from concrete IE
class SkyItVideoIE(SkyItBaseIE):
    IE_NAME = 'video.sky.it'
    _VALID_URL = r'https?://(?:masterchef|video|xfactor)\.sky\.it(?:/[^/]+)*/video/[0-9a-z-]+-(?P<id>\d+)'
    _TESTS = [{
@@ -96,7 +102,7 @@ class SkyItVideoIE(SkyItPlayerIE):  # XXX: Do not subclass from concrete IE
        return self._player_url_result(video_id)


class SkyItVideoLiveIE(SkyItPlayerIE):  # XXX: Do not subclass from concrete IE
class SkyItVideoLiveIE(SkyItBaseIE):
    IE_NAME = 'video.sky.it:live'
    _VALID_URL = r'https?://video\.sky\.it/diretta/(?P<id>[^/?&#]+)'
    _TEST = {
@@ -124,7 +130,7 @@ class SkyItVideoLiveIE(SkyItPlayerIE):  # XXX: Do not subclass from concrete IE
        return self._parse_video(livestream, asset_id)


class SkyItIE(SkyItPlayerIE):  # XXX: Do not subclass from concrete IE
class SkyItIE(SkyItBaseIE):
    IE_NAME = 'sky.it'
    _VALID_URL = r'https?://(?:sport|tg24)\.sky\.it(?:/[^/]+)*/\d{4}/\d{2}/\d{2}/(?P<id>[^/?&#]+)'
    _TESTS = [{
@@ -223,3 +229,80 @@ class TV8ItIE(SkyItVideoIE):  # XXX: Do not subclass from concrete IE
        'params': {'skip_download': 'm3u8'},
    }]
    _DOMAIN = 'mtv8'


class TV8ItLiveIE(SkyItBaseIE):
    IE_NAME = 'tv8.it:live'
    IE_DESC = 'TV8 Live'
    _VALID_URL = r'https?://(?:www\.)?tv8\.it/streaming'
    _TESTS = [{
        'url': 'https://tv8.it/streaming',
        'info_dict': {
            'id': 'tv8',
            'ext': 'mp4',
            'title': str,
            'description': str,
            'is_live': True,
            'live_status': 'is_live',
        },
    }]

    def _real_extract(self, url):
        video_id = 'tv8'
        livestream = self._download_json(
            'https://apid.sky.it/vdp/v1/getLivestream', video_id,
            'Downloading manifest JSON', query={'id': '7'})
        metadata = self._download_json('https://tv8.it/api/getStreaming', video_id, fatal=False)

        return {
            **self._parse_video(livestream, video_id),
            **traverse_obj(metadata, ('info', {
                'title': ('title', 'text', {str}),
                'description': ('description', 'html', {clean_html}),
            })),
        }


class TV8ItPlaylistIE(InfoExtractor):
    IE_NAME = 'tv8.it:playlist'
    IE_DESC = 'TV8 Playlist'
    _VALID_URL = r'https?://(?:www\.)?tv8\.it/(?!video)[^/#?]+/(?P<id>[^/#?]+)'
    _TESTS = [{
        'url': 'https://tv8.it/intrattenimento/tv8-gialappas-night',
        'playlist_mincount': 32,
        'info_dict': {
            'id': 'tv8-gialappas-night',
            'title': 'Tv8 Gialappa\'s Night',
            'description': 'md5:c876039d487d9cf40229b768872718ed',
            'thumbnail': r're:https://static\.sky\.it/.+\.(png|jpe?g|webp)',
        },
    }, {
        'url': 'https://tv8.it/sport/uefa-europa-league',
        'playlist_mincount': 11,
        'info_dict': {
            'id': 'uefa-europa-league',
            'title': 'UEFA Europa League',
            'description': 'md5:9ab1832b7a8b1705b1f590e13a36bc6a',
            'thumbnail': r're:https://static\.sky\.it/.+\.(png|jpe?g|webp)',
        },
    }]

    def _real_extract(self, url):
        playlist_id = self._match_id(url)
        webpage = self._download_webpage(url, playlist_id)
        data = self._search_nextjs_data(webpage, playlist_id)['props']['pageProps']['data']
        entries = [self.url_result(
            urljoin('https://tv8.it', card['href']), ie=TV8ItIE,
            **traverse_obj(card, {
                'description': ('extraData', 'videoDesc', {str}),
                'id': ('extraData', 'asset_id', {str}),
                'thumbnail': ('image', 'src', {url_or_none}),
                'title': ('title', 'typography', 'text', {str}),
            }))
            for card in traverse_obj(data, ('lastContent', 'cards', lambda _, v: v['href']))]

        return self.playlist_result(entries, playlist_id, **traverse_obj(data, ('card', 'desktop', {
            'description': ('description', 'html', {clean_html}),
            'thumbnail': ('image', 'src', {url_or_none}),
            'title': ('title', 'text', {str}),
        })))
yt_dlp/extractor/softwhiteunderbelly.py (new file, 87 lines)
@@ -0,0 +1,87 @@
from .common import InfoExtractor
from .vimeo import VHXEmbedIE
from ..utils import (
    ExtractorError,
    clean_html,
    update_url,
    urlencode_postdata,
)
from ..utils.traversal import find_element, traverse_obj


class SoftWhiteUnderbellyIE(InfoExtractor):
    _LOGIN_URL = 'https://www.softwhiteunderbelly.com/login'
    _NETRC_MACHINE = 'softwhiteunderbelly'
    _VALID_URL = r'https?://(?:www\.)?softwhiteunderbelly\.com/videos/(?P<id>[\w-]+)'
    _TESTS = [{
        'url': 'https://www.softwhiteunderbelly.com/videos/kenneth-final1',
        'note': 'A single Soft White Underbelly Episode',
        'md5': '8e79f29ec1f1bda6da2e0b998fcbebb8',
        'info_dict': {
            'id': '3201266',
            'ext': 'mp4',
            'display_id': 'kenneth-final1',
            'title': 'Appalachian Man interview-Kenneth',
            'description': 'Soft White Underbelly interview and portrait of Kenneth, an Appalachian man in Clay County, Kentucky.',
            'thumbnail': 'https://vhx.imgix.net/softwhiteunderbelly/assets/249f6db0-2b39-49a4-979b-f8dad4681825.jpg',
            'uploader_url': 'https://vimeo.com/user80538407',
            'uploader': 'OTT Videos',
            'uploader_id': 'user80538407',
            'duration': 512,
        },
        'expected_warnings': ['Failed to parse XML: not well-formed'],
    }, {
        'url': 'https://www.softwhiteunderbelly.com/videos/tj-2-final-2160p',
        'note': 'A single Soft White Underbelly Episode',
        'md5': '286bd8851b4824c62afb369e6f307036',
        'info_dict': {
            'id': '3506029',
            'ext': 'mp4',
            'display_id': 'tj-2-final-2160p',
            'title': 'Fentanyl Addict interview-TJ (follow up)',
            'description': 'Soft White Underbelly follow up interview and portrait of TJ, a fentanyl addict on Skid Row.',
            'thumbnail': 'https://vhx.imgix.net/softwhiteunderbelly/assets/c883d531-5da0-4faf-a2e2-8eba97e5adfc.jpg',
            'duration': 817,
            'uploader': 'OTT Videos',
            'uploader_url': 'https://vimeo.com/user80538407',
            'uploader_id': 'user80538407',
        },
        'expected_warnings': ['Failed to parse XML: not well-formed'],
    }]

    def _perform_login(self, username, password):
        signin_page = self._download_webpage(self._LOGIN_URL, None, 'Fetching authenticity token')
        self._download_webpage(
            self._LOGIN_URL, None, 'Logging in',
            data=urlencode_postdata({
                'email': username,
                'password': password,
                'authenticity_token': self._html_search_regex(
                    r'name=["\']authenticity_token["\']\s+value=["\']([^"\']+)', signin_page, 'authenticity_token'),
                'utf8': True,
            }),
        )

    def _real_extract(self, url):
        display_id = self._match_id(url)

        webpage = self._download_webpage(url, display_id)
        if '<div id="watch-unauthorized"' in webpage:
            if self._get_cookies('https://www.softwhiteunderbelly.com').get('_session'):
                raise ExtractorError('This account is not subscribed to this content', expected=True)
            self.raise_login_required()

        embed_url, embed_id = self._html_search_regex(
            r'embed_url:\s*["\'](?P<url>https?://embed\.vhx\.tv/videos/(?P<id>\d+)[^"\']*)',
            webpage, 'embed url', group=('url', 'id'))

        return {
            '_type': 'url_transparent',
            'ie_key': VHXEmbedIE.ie_key(),
            'url': VHXEmbedIE._smuggle_referrer(embed_url, 'https://www.softwhiteunderbelly.com'),
            'id': embed_id,
            'display_id': display_id,
            'title': traverse_obj(webpage, ({find_element(id='watch-info')}, {find_element(cls='video-title')}, {clean_html})),
            'description': self._html_search_meta('description', webpage, default=None),
            'thumbnail': update_url(self._og_search_thumbnail(webpage) or '', query=None) or None,
        }
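The login helper mirrors a typical Rails/VHX form post: scrape the CSRF authenticity_token from the sign-in page, then resubmit it with the credentials. The token extraction on its own (sample markup, not the real page):

import re

signin_page = '<input type="hidden" name="authenticity_token" value="tok123">'
token = re.search(
    r'name=["\']authenticity_token["\']\s+value=["\']([^"\']+)', signin_page).group(1)
assert token == 'tok123'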
@@ -52,7 +52,8 @@ class SoundcloudBaseIE(InfoExtractor):
    _API_VERIFY_AUTH_TOKEN = 'https://api-auth.soundcloud.com/connect/session%s'
    _HEADERS = {}

    _IMAGE_REPL_RE = r'-([0-9a-z]+)\.jpg'
    _IMAGE_REPL_RE = r'-[0-9a-z]+\.(?P<ext>jpg|png)'
    _TAGS_RE = re.compile(r'"([^"]+)"|([^ ]+)')

    _ARTWORK_MAP = {
        'mini': 16,
@@ -331,12 +332,14 @@ class SoundcloudBaseIE(InfoExtractor):
        thumbnails = []
        artwork_url = info.get('artwork_url')
        thumbnail = artwork_url or user.get('avatar_url')
        if isinstance(thumbnail, str):
            if re.search(self._IMAGE_REPL_RE, thumbnail):
        if url_or_none(thumbnail):
            if mobj := re.search(self._IMAGE_REPL_RE, thumbnail):
                for image_id, size in self._ARTWORK_MAP.items():
                    # Soundcloud serves JPEG regardless of URL's ext *except* for "original" thumb
                    ext = mobj.group('ext') if image_id == 'original' else 'jpg'
                    i = {
                        'id': image_id,
                        'url': re.sub(self._IMAGE_REPL_RE, f'-{image_id}.jpg', thumbnail),
                        'url': re.sub(self._IMAGE_REPL_RE, f'-{image_id}.{ext}', thumbnail),
                    }
                    if image_id == 'tiny' and not artwork_url:
                        size = 18
@@ -372,6 +375,7 @@ class SoundcloudBaseIE(InfoExtractor):
            'comment_count': extract_count('comment'),
            'repost_count': extract_count('reposts'),
            'genres': traverse_obj(info, ('genre', {str}, filter, all, filter)),
            'tags': traverse_obj(info, ('tag_list', {self._TAGS_RE.findall}, ..., ..., filter)),
            'artists': traverse_obj(info, ('publisher_metadata', 'artist', {str}, filter, all, filter)),
            'formats': formats if not extract_flat else None,
        }
@@ -425,6 +429,7 @@ class SoundcloudIE(SoundcloudBaseIE):
                'repost_count': int,
                'thumbnail': 'https://i1.sndcdn.com/artworks-000031955188-rwb18x-original.jpg',
                'uploader_url': 'https://soundcloud.com/ethmusic',
                'tags': 'count:14',
            },
        },
        # geo-restricted
@@ -440,7 +445,7 @@ class SoundcloudIE(SoundcloudBaseIE):
                'uploader_id': '9615865',
                'timestamp': 1337635207,
                'upload_date': '20120521',
                'duration': 227.155,
                'duration': 227.103,
                'license': 'all-rights-reserved',
                'view_count': int,
                'like_count': int,
@@ -450,6 +455,7 @@ class SoundcloudIE(SoundcloudBaseIE):
                'thumbnail': 'https://i1.sndcdn.com/artworks-v8bFHhXm7Au6-0-original.jpg',
                'genres': ['Alternative'],
                'artists': ['The Royal Concept'],
                'tags': [],
            },
        },
        # private link
@@ -475,6 +481,7 @@ class SoundcloudIE(SoundcloudBaseIE):
                'uploader_url': 'https://soundcloud.com/jaimemf',
                'thumbnail': 'https://a1.sndcdn.com/images/default_avatar_large.png',
                'genres': ['youtubedl'],
                'tags': [],
            },
        },
        # private link (alt format)
@@ -500,15 +507,16 @@ class SoundcloudIE(SoundcloudBaseIE):
                'uploader_url': 'https://soundcloud.com/jaimemf',
                'thumbnail': 'https://a1.sndcdn.com/images/default_avatar_large.png',
                'genres': ['youtubedl'],
                'tags': [],
            },
        },
        # downloadable song
        {
            'url': 'https://soundcloud.com/the80m/the-following',
            'md5': '9ffcddb08c87d74fb5808a3c183a1d04',
            'md5': 'ecb87d7705d5f53e6c02a63760573c75',  # wav: '9ffcddb08c87d74fb5808a3c183a1d04'
            'info_dict': {
                'id': '343609555',
                'ext': 'wav',
                'ext': 'opus',  # wav original available with auth
                'title': 'The Following',
                'track': 'The Following',
                'description': '',
@@ -526,15 +534,18 @@ class SoundcloudIE(SoundcloudBaseIE):
                'view_count': int,
                'genres': ['Dance & EDM'],
                'artists': ['80M'],
                'tags': ['80M', 'EDM', 'Dance', 'Music'],
            },
            'expected_warnings': ['Original download format is only available for registered users'],
        },
        # private link, downloadable format
        # tags with spaces (e.g. "Uplifting Trance", "Ori Uplift")
        {
            'url': 'https://soundcloud.com/oriuplift/uponly-238-no-talking-wav/s-AyZUd',
            'md5': '64a60b16e617d41d0bef032b7f55441e',
            'md5': '2e1530d0e9986a833a67cb34fc90ece0',  # wav: '64a60b16e617d41d0bef032b7f55441e'
            'info_dict': {
                'id': '340344461',
                'ext': 'wav',
                'ext': 'opus',  # wav original available with auth
                'title': 'Uplifting Only 238 [No Talking] (incl. Alex Feed Guestmix) (Aug 31, 2017) [wav]',
                'track': 'Uplifting Only 238 [No Talking] (incl. Alex Feed Guestmix) (Aug 31, 2017) [wav]',
                'description': 'md5:fa20ee0fca76a3d6df8c7e57f3715366',
@@ -552,7 +563,9 @@ class SoundcloudIE(SoundcloudBaseIE):
                'uploader_url': 'https://soundcloud.com/oriuplift',
                'genres': ['Trance'],
                'artists': ['Ori Uplift'],
                'tags': ['Orchestral', 'Emotional', 'Uplifting Trance', 'Trance', 'Ori Uplift', 'UpOnly'],
            },
            'expected_warnings': ['Original download format is only available for registered users'],
        },
        # no album art, use avatar pic for thumbnail
        {
@@ -577,6 +590,7 @@ class SoundcloudIE(SoundcloudBaseIE):
                'repost_count': int,
                'uploader_url': 'https://soundcloud.com/garyvee',
                'artists': ['MadReal'],
                'tags': [],
            },
            'params': {
                'skip_download': True,
@@ -604,8 +618,47 @@ class SoundcloudIE(SoundcloudBaseIE):
                'repost_count': int,
                'genres': ['Piano'],
                'uploader_url': 'https://soundcloud.com/giovannisarani',
                'tags': 'count:10',
            },
        },
        # .png "original" artwork, 160kbps m4a HLS format
        {
            'url': 'https://soundcloud.com/skorxh/audio-dealer',
            'info_dict': {
                'id': '2011421339',
                'ext': 'm4a',
                'title': 'audio dealer',
                'description': '',
                'uploader': '$KORCH',
                'uploader_id': '150292288',
                'uploader_url': 'https://soundcloud.com/skorxh',
                'comment_count': int,
                'view_count': int,
                'like_count': int,
                'repost_count': int,
                'duration': 213.469,
                'tags': [],
                'artists': ['$KORXH'],
                'track': 'audio dealer',
                'timestamp': 1737143201,
                'upload_date': '20250117',
                'license': 'all-rights-reserved',
                'thumbnail': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-original.png',
                'thumbnails': [
                    {'id': 'mini', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-mini.jpg'},
                    {'id': 'tiny', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-tiny.jpg'},
                    {'id': 'small', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-small.jpg'},
                    {'id': 'badge', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-badge.jpg'},
                    {'id': 't67x67', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-t67x67.jpg'},
                    {'id': 'large', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-large.jpg'},
                    {'id': 't300x300', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-t300x300.jpg'},
                    {'id': 'crop', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-crop.jpg'},
                    {'id': 't500x500', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-t500x500.jpg'},
                    {'id': 'original', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-original.png'},
                ],
            },
            'params': {'skip_download': 'm3u8', 'format': 'hls_aac_160k'},
        },
        {
            # AAC HQ format available (account with active subscription needed)
            'url': 'https://soundcloud.com/wandw/the-chainsmokers-ft-daya-dont-let-me-down-ww-remix-1',
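The SoundCloud thumbnail fix keeps the original file extension only for the "original" artwork size, since the CDN serves JPEG for every other variant. A sketch of the substitution with a made-up artwork URL:

import re

_IMAGE_REPL_RE = r'-[0-9a-z]+\.(?P<ext>jpg|png)'
thumbnail = 'https://i1.sndcdn.com/artworks-abc123-original.png'
mobj = re.search(_IMAGE_REPL_RE, thumbnail)
for image_id in ('t500x500', 'original'):
    ext = mobj.group('ext') if image_id == 'original' else 'jpg'
    print(re.sub(_IMAGE_REPL_RE, f'-{image_id}.{ext}', thumbnail))
# -> https://i1.sndcdn.com/artworks-abc123-t500x500.jpg
# -> https://i1.sndcdn.com/artworks-abc123-original.png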
@@ -1,5 +1,6 @@
from .bunnycdn import BunnyCdnIE
from .common import InfoExtractor
from ..utils import try_get, unified_timestamp
from ..utils import make_archive_id, try_get, unified_timestamp


class SovietsClosetBaseIE(InfoExtractor):
@@ -43,7 +44,7 @@ class SovietsClosetIE(SovietsClosetBaseIE):
            'url': 'https://sovietscloset.com/video/1337',
            'md5': 'bd012b04b261725510ca5383074cdd55',
            'info_dict': {
                'id': '1337',
                'id': '2f0cfbf4-3588-43a9-a7d6-7c9ea3755e67',
                'ext': 'mp4',
                'title': 'The Witcher #13',
                'thumbnail': r're:^https?://.*\.b-cdn\.net/2f0cfbf4-3588-43a9-a7d6-7c9ea3755e67/thumbnail\.jpg$',
@@ -55,20 +56,23 @@ class SovietsClosetIE(SovietsClosetBaseIE):
                'upload_date': '20170413',
                'uploader_id': 'SovietWomble',
                'uploader_url': 'https://www.twitch.tv/SovietWomble',
                'duration': 7007,
                'duration': 7008,
                'was_live': True,
                'availability': 'public',
                'series': 'The Witcher',
                'season': 'Misc',
                'episode_number': 13,
                'episode': 'Episode 13',
                'creators': ['SovietWomble'],
                'description': '',
                '_old_archive_ids': ['sovietscloset 1337'],
            },
        },
        {
            'url': 'https://sovietscloset.com/video/1105',
            'md5': '89fa928f183893cb65a0b7be846d8a90',
            'info_dict': {
                'id': '1105',
                'id': 'c0e5e76f-3a93-40b4-bf01-12343c2eec5d',
                'ext': 'mp4',
                'title': 'Arma 3 - Zeus Games #5',
                'uploader': 'SovietWomble',
@@ -80,39 +84,20 @@ class SovietsClosetIE(SovietsClosetBaseIE):
                'upload_date': '20160420',
                'uploader_id': 'SovietWomble',
                'uploader_url': 'https://www.twitch.tv/SovietWomble',
                'duration': 8804,
                'duration': 8805,
                'was_live': True,
                'availability': 'public',
                'series': 'Arma 3',
                'season': 'Zeus Games',
                'episode_number': 5,
                'episode': 'Episode 5',
                'creators': ['SovietWomble'],
                'description': '',
                '_old_archive_ids': ['sovietscloset 1105'],
            },
        },
    ]

    def _extract_bunnycdn_iframe(self, video_id, bunnycdn_id):
        iframe = self._download_webpage(
            f'https://iframe.mediadelivery.net/embed/5105/{bunnycdn_id}',
            video_id, note='Downloading BunnyCDN iframe', headers=self.MEDIADELIVERY_REFERER)

        m3u8_url = self._search_regex(r'(https?://.*?\.m3u8)', iframe, 'm3u8 url')
        thumbnail_url = self._search_regex(r'(https?://.*?thumbnail\.jpg)', iframe, 'thumbnail url')

        m3u8_formats = self._extract_m3u8_formats(m3u8_url, video_id, headers=self.MEDIADELIVERY_REFERER)

        if not m3u8_formats:
            duration = None
        else:
            duration = self._extract_m3u8_vod_duration(
                m3u8_formats[0]['url'], video_id, headers=self.MEDIADELIVERY_REFERER)

        return {
            'formats': m3u8_formats,
            'thumbnail': thumbnail_url,
            'duration': duration,
        }

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
@@ -122,13 +107,13 @@ class SovietsClosetIE(SovietsClosetBaseIE):

        stream = self.parse_nuxt_jsonp(f'{static_assets_base}/video/{video_id}/payload.js', video_id, 'video')['stream']

        return {
        return self.url_result(
            f'https://iframe.mediadelivery.net/embed/5105/{stream["bunnyId"]}', ie=BunnyCdnIE, url_transparent=True,
            **self.video_meta(
                video_id=video_id, game_name=stream['game']['name'],
                category_name=try_get(stream, lambda x: x['subcategory']['name'], str),
                episode_number=stream.get('number'), stream_date=stream.get('date')),
            **self._extract_bunnycdn_iframe(video_id, stream['bunnyId']),
        }
            _old_archive_ids=[make_archive_id(self, video_id)])


class SovietsClosetPlaylistIE(SovietsClosetBaseIE):
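Because the extractor now returns BunnyCDN video IDs, _old_archive_ids keeps --download-archive entries recorded under the old numeric IDs working. make_archive_id simply lowercases the IE key and prepends it:

from yt_dlp.utils import make_archive_id

# Accepts either an extractor instance or its key as a string
assert make_archive_id('SovietsCloset', '1337') == 'sovietscloset 1337'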
@@ -46,7 +46,7 @@ class TelecincoBaseIE(InfoExtractor):
            error_code = traverse_obj(
                self._webpage_read_content(error.cause.response, caronte['cerbero'], video_id, fatal=False),
                ({json.loads}, 'code', {int}))
            if error_code == 4038:
            if error_code in (4038, 40313):
                self.raise_geo_restricted(countries=['ES'])
            raise
@@ -26,6 +26,7 @@ from ..utils import (
    srt_subtitles_timecode,
    str_or_none,
    traverse_obj,
    truncate_string,
    try_call,
    try_get,
    url_or_none,
@@ -249,6 +250,12 @@ class TikTokBaseIE(InfoExtractor):
        elif fatal:
            raise ExtractorError('Unable to extract webpage video data')

        if not traverse_obj(video_data, ('video', {dict})) and traverse_obj(video_data, ('isContentClassified', {bool})):
            message = 'This post may not be comfortable for some audiences. Log in for access'
            if fatal:
                self.raise_login_required(message)
            self.report_warning(f'{message}. {self._login_hint()}', video_id=video_id)

        return video_data, status

    def _get_subtitles(self, aweme_detail, aweme_id, user_name):
@@ -438,7 +445,7 @@ class TikTokBaseIE(InfoExtractor):
        return {
            'id': aweme_id,
            **traverse_obj(aweme_detail, {
                'title': ('desc', {str}),
                'title': ('desc', {truncate_string(left=72)}),
                'description': ('desc', {str}),
                'timestamp': ('create_time', {int_or_none}),
            }),
@@ -589,7 +596,7 @@ class TikTokBaseIE(InfoExtractor):
                'duration': ('duration', {int_or_none}),
            })),
            **traverse_obj(aweme_detail, {
                'title': ('desc', {str}),
                'title': ('desc', {truncate_string(left=72)}),
                'description': ('desc', {str}),
                # audio-only slideshows have a video duration of 0 and an actual audio duration
                'duration': ('video', 'duration', {int_or_none}, filter),
@@ -650,7 +657,7 @@ class TikTokIE(TikTokBaseIE):
        'info_dict': {
            'id': '6742501081818877190',
            'ext': 'mp4',
            'title': 'md5:5e2a23877420bb85ce6521dbee39ba94',
            'title': 'Tag 1 Friend reverse this Video and look what happens 🤩😱 @skyandtami ...',
            'description': 'md5:5e2a23877420bb85ce6521dbee39ba94',
            'duration': 27,
            'height': 1024,
@@ -854,7 +861,7 @@ class TikTokIE(TikTokBaseIE):
        'info_dict': {
            'id': '7253412088251534594',
            'ext': 'm4a',
            'title': 'я ред флаг простите #переписка #щитпост #тревожныйтиппривязанности #рекомендации ',
            'title': 'я ред флаг простите #переписка #щитпост #тревожныйтиппривязанности #р...',
            'description': 'я ред флаг простите #переписка #щитпост #тревожныйтиппривязанности #рекомендации ',
            'uploader': 'hara_yoimiya',
            'uploader_id': '6582536342634676230',
@@ -895,8 +902,12 @@ class TikTokIE(TikTokBaseIE):

        if video_data and status == 0:
            return self._parse_aweme_video_web(video_data, url, video_id)
        elif status == 10216:
            raise ExtractorError('This video is private', expected=True)
        elif status in (10216, 10222):
            # 10216: private post; 10222: private account
            self.raise_login_required(
                'You do not have permission to view this post. Log into an account that has access')
        elif status == 10204:
            raise ExtractorError('Your IP address is blocked from accessing this post', expected=True)
        raise ExtractorError(f'Video not available, status code {status}', video_id=video_id)
117
yt_dlp/extractor/tvw.py
Normal file
@@ -0,0 +1,117 @@
import json

from .common import InfoExtractor
from ..utils import clean_html, remove_end, unified_timestamp, url_or_none
from ..utils.traversal import traverse_obj


class TvwIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?tvw\.org/video/(?P<id>[^/?#]+)'

    _TESTS = [{
        'url': 'https://tvw.org/video/billy-frank-jr-statue-maquette-unveiling-ceremony-2024011211/',
        'md5': '9ceb94fe2bb7fd726f74f16356825703',
        'info_dict': {
            'id': '2024011211',
            'ext': 'mp4',
            'title': 'Billy Frank Jr. Statue Maquette Unveiling Ceremony',
            'thumbnail': r're:^https?://.*\.(?:jpe?g|png)$',
            'description': 'md5:58a8150017d985b4f377e11ee8f6f36e',
            'timestamp': 1704902400,
            'upload_date': '20240110',
            'location': 'Legislative Building',
            'display_id': 'billy-frank-jr-statue-maquette-unveiling-ceremony-2024011211',
            'categories': ['General Interest'],
        },
    }, {
        'url': 'https://tvw.org/video/ebeys-landing-state-park-2024081007/',
        'md5': '71e87dae3deafd65d75ff3137b9a32fc',
        'info_dict': {
            'id': '2024081007',
            'ext': 'mp4',
            'title': 'Ebey\'s Landing State Park',
            'thumbnail': r're:^https?://.*\.(?:jpe?g|png)$',
            'description': 'md5:50c5bd73bde32fa6286a008dbc853386',
            'timestamp': 1724310900,
            'upload_date': '20240822',
            'location': 'Ebey’s Landing State Park',
            'display_id': 'ebeys-landing-state-park-2024081007',
            'categories': ['Washington State Parks'],
        },
    }, {
        'url': 'https://tvw.org/video/home-warranties-workgroup-2',
        'md5': 'f678789bf94d07da89809f213cf37150',
        'info_dict': {
            'id': '1999121000',
            'ext': 'mp4',
            'title': 'Home Warranties Workgroup',
            'thumbnail': r're:^https?://.*\.(?:jpe?g|png)$',
            'description': 'md5:861396cc523c9641d0dce690bc5c35f3',
            'timestamp': 946389600,
            'upload_date': '19991228',
            'display_id': 'home-warranties-workgroup-2',
            'categories': ['Legislative'],
        },
    }, {
        'url': 'https://tvw.org/video/washington-to-washington-a-new-space-race-2022041111/?eventID=2022041111',
        'md5': '6f5551090b351aba10c0d08a881b4f30',
        'info_dict': {
            'id': '2022041111',
            'ext': 'mp4',
            'title': 'Washington to Washington - A New Space Race',
            'thumbnail': r're:^https?://.*\.(?:jpe?g|png)$',
            'description': 'md5:f65a24eec56107afbcebb3aa5cd26341',
            'timestamp': 1650394800,
            'upload_date': '20220419',
            'location': 'Hayner Media Center',
            'display_id': 'washington-to-washington-a-new-space-race-2022041111',
            'categories': ['Washington to Washington', 'General Interest'],
        },
    }]

    def _real_extract(self, url):
        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)

        client_id = self._html_search_meta('clientID', webpage, fatal=True)
        video_id = self._html_search_meta('eventID', webpage, fatal=True)

        video_data = self._download_json(
            'https://api.v3.invintus.com/v2/Event/getDetailed', video_id,
            headers={
                'authorization': 'embedder',
                'wsc-api-key': '7WhiEBzijpritypp8bqcU7pfU9uicDR',
            },
            data=json.dumps({
                'clientID': client_id,
                'eventID': video_id,
                'showStreams': True,
            }).encode())['data']

        formats = []
        subtitles = {}
        for stream_url in traverse_obj(video_data, ('streamingURIs', ..., {url_or_none})):
            fmts, subs = self._extract_m3u8_formats_and_subtitles(
                stream_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
            formats.extend(fmts)
            self._merge_subtitles(subs, target=subtitles)
        if caption_url := traverse_obj(video_data, ('captionPath', {url_or_none})):
            subtitles.setdefault('en', []).append({'url': caption_url, 'ext': 'vtt'})

        return {
            'id': video_id,
            'display_id': display_id,
            'formats': formats,
            'subtitles': subtitles,
            'title': remove_end(self._og_search_title(webpage, default=None), ' - TVW'),
            'description': self._og_search_description(webpage, default=None),
            **traverse_obj(video_data, {
                'title': ('title', {str}),
                'description': ('description', {clean_html}),
                'categories': ('categories', ..., {str}),
                'thumbnail': ('videoThumbnail', {url_or_none}),
                'timestamp': ('startDateTime', {unified_timestamp}),
                'location': ('locationName', {str}),
                'is_live': ('eventStatus', {lambda x: x == 'live'}),
            }),
        }

@@ -21,6 +21,7 @@ from ..utils import (
    str_or_none,
    strip_or_none,
    traverse_obj,
    truncate_string,
    try_call,
    try_get,
    unified_timestamp,
@@ -358,6 +359,7 @@ class TwitterCardIE(InfoExtractor):
            'display_id': '560070183650213889',
            'uploader_url': 'https://twitter.com/Twitter',
        },
        'skip': 'This content is no longer available.',
    },
    {
        'url': 'https://twitter.com/i/cards/tfw/v1/623160978427936768',
@@ -365,7 +367,7 @@ class TwitterCardIE(InfoExtractor):
        'info_dict': {
            'id': '623160978427936768',
            'ext': 'mp4',
            'title': "NASA - Fly over Pluto's icy Norgay Mountains and Sputnik Plain in this @NASANewHorizons #PlutoFlyby video.",
            'title': "NASA - Fly over Pluto's icy Norgay Mountains and Sputnik Plain in this @NASA...",
            'description': "Fly over Pluto's icy Norgay Mountains and Sputnik Plain in this @NASANewHorizons #PlutoFlyby video. https://t.co/BJYgOjSeGA",
            'uploader': 'NASA',
            'uploader_id': 'NASA',
@@ -377,12 +379,14 @@ class TwitterCardIE(InfoExtractor):
            'like_count': int,
            'repost_count': int,
            'tags': ['PlutoFlyby'],
            'channel_id': '11348282',
            '_old_archive_ids': ['twitter 623160978427936768'],
        },
        'params': {'format': '[protocol=https]'},
    },
    {
        'url': 'https://twitter.com/i/cards/tfw/v1/654001591733886977',
        'md5': 'b6d9683dd3f48e340ded81c0e917ad46',
        'md5': 'fb08fbd69595cbd8818f0b2f2a94474d',
        'info_dict': {
            'id': 'dq4Oj5quskI',
            'ext': 'mp4',
@@ -390,12 +394,12 @@ class TwitterCardIE(InfoExtractor):
            'description': 'md5:a831e97fa384863d6e26ce48d1c43376',
            'upload_date': '20111013',
            'uploader': 'OMG! UBUNTU!',
            'uploader_id': 'omgubuntu',
            'uploader_id': '@omgubuntu',
            'channel_url': 'https://www.youtube.com/channel/UCIiSwcm9xiFb3Y4wjzR41eQ',
            'channel_id': 'UCIiSwcm9xiFb3Y4wjzR41eQ',
            'channel_follower_count': int,
            'chapters': 'count:8',
            'uploader_url': 'http://www.youtube.com/user/omgubuntu',
            'uploader_url': 'https://www.youtube.com/@omgubuntu',
            'duration': 138,
            'categories': ['Film & Animation'],
            'age_limit': 0,
@@ -407,6 +411,9 @@ class TwitterCardIE(InfoExtractor):
            'tags': 'count:12',
            'channel': 'OMG! UBUNTU!',
            'playable_in_embed': True,
            'heatmap': 'count:100',
            'timestamp': 1318500227,
            'live_status': 'not_live',
        },
        'add_ie': ['Youtube'],
    },
@@ -548,13 +555,14 @@ class TwitterIE(TwitterBaseIE):
            'age_limit': 0,
            '_old_archive_ids': ['twitter 700207533655363584'],
        },
        'skip': 'Tweet has been deleted',
    }, {
        'url': 'https://twitter.com/captainamerica/status/719944021058060289',
        'info_dict': {
            'id': '717462543795523584',
            'display_id': '719944021058060289',
            'ext': 'mp4',
            'title': 'Captain America - @King0fNerd Are you sure you made the right choice? Find out in theaters.',
            'title': 'Captain America - @King0fNerd Are you sure you made the right choice? Find out in theat...',
            'description': '@King0fNerd Are you sure you made the right choice? Find out in theaters. https://t.co/GpgYi9xMJI',
            'channel_id': '701615052',
            'uploader_id': 'CaptainAmerica',
@@ -591,7 +599,7 @@ class TwitterIE(TwitterBaseIE):
        'info_dict': {
            'id': '852077943283097602',
            'ext': 'mp4',
            'title': 'عالم الأخبار - كلمة تاريخية بجلسة الجناسي التاريخية.. النائب خالد مؤنس العتيبي للمعارضين : اتقوا الله .. الظلم ظلمات يوم القيامة',
            'title': 'عالم الأخبار - كلمة تاريخية بجلسة الجناسي التاريخية.. النائب خالد مؤنس العتيبي للمعا...',
            'description': 'كلمة تاريخية بجلسة الجناسي التاريخية.. النائب خالد مؤنس العتيبي للمعارضين : اتقوا الله .. الظلم ظلمات يوم القيامة https://t.co/xg6OhpyKfN',
            'channel_id': '2526757026',
            'uploader': 'عالم الأخبار',
@@ -615,7 +623,7 @@ class TwitterIE(TwitterBaseIE):
            'id': '910030238373089285',
            'display_id': '910031516746514432',
            'ext': 'mp4',
            'title': 'Préfet de Guadeloupe - [Direct] #Maria Le centre se trouve actuellement au sud de Basse-Terre. Restez confinés. Réfugiez-vous dans la pièce la + sûre.',
            'title': 'Préfet de Guadeloupe - [Direct] #Maria Le centre se trouve actuellement au sud de Basse-Terr...',
            'thumbnail': r're:^https?://.*\.jpg',
            'description': '[Direct] #Maria Le centre se trouve actuellement au sud de Basse-Terre. Restez confinés. Réfugiez-vous dans la pièce la + sûre. https://t.co/mwx01Rs4lo',
            'channel_id': '2319432498',
@@ -707,7 +715,7 @@ class TwitterIE(TwitterBaseIE):
            'id': '1349774757969989634',
            'display_id': '1349794411333394432',
            'ext': 'mp4',
            'title': 'md5:d1c4941658e4caaa6cb579260d85dcba',
            'title': "Brooklyn Nets - WATCH: Sean Marks' full media session after our acquisition of 8-time...",
            'thumbnail': r're:^https?://.*\.jpg',
            'description': 'md5:71ead15ec44cee55071547d6447c6a3e',
            'channel_id': '18552281',
@@ -733,7 +741,7 @@ class TwitterIE(TwitterBaseIE):
            'id': '1577855447914409984',
            'display_id': '1577855540407197696',
            'ext': 'mp4',
            'title': 'md5:466a3a8b049b5f5a13164ce915484b51',
            'title': 'Oshtru - gm ✨️ now I can post image and video. nice update.',
            'description': 'md5:b9c3699335447391d11753ab21c70a74',
            'upload_date': '20221006',
            'channel_id': '143077138',
@@ -755,10 +763,10 @@ class TwitterIE(TwitterBaseIE):
        'url': 'https://twitter.com/UltimaShadowX/status/1577719286659006464',
        'info_dict': {
            'id': '1577719286659006464',
            'title': 'Ultima Reload - Test',
            'title': 'Ultima - Test',
            'description': 'Test https://t.co/Y3KEZD7Dad',
            'channel_id': '168922496',
            'uploader': 'Ultima Reload',
            'uploader': 'Ultima',
            'uploader_id': 'UltimaShadowX',
            'uploader_url': 'https://twitter.com/UltimaShadowX',
            'upload_date': '20221005',
@@ -777,7 +785,7 @@ class TwitterIE(TwitterBaseIE):
            'id': '1575559336759263233',
            'display_id': '1575560063510810624',
            'ext': 'mp4',
            'title': 'md5:eec26382babd0f7c18f041db8ae1c9c9',
            'title': 'Max Olson - Absolutely heartbreaking footage captured by our surge probe of catas...',
            'thumbnail': r're:^https?://.*\.jpg',
            'description': 'md5:95aea692fda36a12081b9629b02daa92',
            'channel_id': '1094109584',
@@ -901,18 +909,18 @@ class TwitterIE(TwitterBaseIE):
        'playlist_mincount': 2,
        'info_dict': {
            'id': '1600649710662213632',
            'title': 'md5:be05989b0722e114103ed3851a0ffae2',
            'title': "Jocelyn Laidlaw - How Kirstie Alley's tragic death inspired me to share more about my c...",
            'timestamp': 1670459604.0,
            'description': 'md5:591c19ce66fadc2359725d5cd0d1052c',
            'comment_count': int,
            'uploader_id': 'CTVJLaidlaw',
            'uploader_id': 'JocelynVLaidlaw',
            'channel_id': '80082014',
            'repost_count': int,
            'tags': ['colorectalcancer', 'cancerjourney', 'imnotaquitter'],
            'upload_date': '20221208',
            'age_limit': 0,
            'uploader': 'Jocelyn Laidlaw',
            'uploader_url': 'https://twitter.com/CTVJLaidlaw',
            'uploader_url': 'https://twitter.com/JocelynVLaidlaw',
            'like_count': int,
        },
    }, {
@@ -921,17 +929,17 @@ class TwitterIE(TwitterBaseIE):
        'info_dict': {
            'id': '1600649511827013632',
            'ext': 'mp4',
            'title': 'md5:7662a0a27ce6faa3e5b160340f3cfab1',
            'title': "Jocelyn Laidlaw - How Kirstie Alley's tragic death inspired me to share more about my c... #1",
            'thumbnail': r're:^https?://.+\.jpg',
            'timestamp': 1670459604.0,
            'channel_id': '80082014',
            'uploader_id': 'CTVJLaidlaw',
            'uploader_id': 'JocelynVLaidlaw',
            'uploader': 'Jocelyn Laidlaw',
            'repost_count': int,
            'comment_count': int,
            'tags': ['colorectalcancer', 'cancerjourney', 'imnotaquitter'],
            'duration': 102.226,
            'uploader_url': 'https://twitter.com/CTVJLaidlaw',
            'uploader_url': 'https://twitter.com/JocelynVLaidlaw',
            'display_id': '1600649710662213632',
            'like_count': int,
            'description': 'md5:591c19ce66fadc2359725d5cd0d1052c',
@@ -990,6 +998,7 @@ class TwitterIE(TwitterBaseIE):
            '_old_archive_ids': ['twitter 1599108751385972737'],
        },
        'params': {'noplaylist': True},
        'skip': 'Tweet is limited',
    }, {
        'url': 'https://twitter.com/MunTheShinobi/status/1600009574919962625',
        'info_dict': {
@@ -1001,10 +1010,10 @@ class TwitterIE(TwitterBaseIE):
            'description': 'This is a genius ad by Apple. \U0001f525\U0001f525\U0001f525\U0001f525\U0001f525 https://t.co/cNsA0MoOml',
            'thumbnail': 'https://pbs.twimg.com/ext_tw_video_thumb/1600009362759733248/pu/img/XVhFQivj75H_YxxV.jpg?name=orig',
            'age_limit': 0,
            'uploader': 'Mün',
            'uploader': 'Boy Called Mün',
            'repost_count': int,
            'upload_date': '20221206',
            'title': 'Mün - This is a genius ad by Apple. \U0001f525\U0001f525\U0001f525\U0001f525\U0001f525',
            'title': 'Boy Called Mün - This is a genius ad by Apple. \U0001f525\U0001f525\U0001f525\U0001f525\U0001f525',
            'comment_count': int,
            'like_count': int,
            'tags': [],
@@ -1042,7 +1051,7 @@ class TwitterIE(TwitterBaseIE):
            'id': '1694928337846538240',
            'ext': 'mp4',
            'display_id': '1695424220702888009',
            'title': 'md5:e8daa9527bc2b947121395494f786d9d',
            'title': 'Benny Johnson - Donald Trump driving through the urban, poor neighborhoods of Atlanta...',
            'description': 'md5:004f2d37fd58737724ec75bc7e679938',
            'channel_id': '15212187',
            'uploader': 'Benny Johnson',
@@ -1066,7 +1075,7 @@ class TwitterIE(TwitterBaseIE):
            'id': '1694928337846538240',
            'ext': 'mp4',
            'display_id': '1695424220702888009',
            'title': 'md5:e8daa9527bc2b947121395494f786d9d',
            'title': 'Benny Johnson - Donald Trump driving through the urban, poor neighborhoods of Atlanta...',
            'description': 'md5:004f2d37fd58737724ec75bc7e679938',
            'channel_id': '15212187',
            'uploader': 'Benny Johnson',
@@ -1101,6 +1110,7 @@ class TwitterIE(TwitterBaseIE):
            'view_count': int,
        },
        'add_ie': ['TwitterBroadcast'],
        'skip': 'Broadcast no longer exists',
    }, {
        # Animated gif and quote tweet video
        'url': 'https://twitter.com/BAKKOOONN/status/1696256659889565950',
@@ -1129,7 +1139,7 @@ class TwitterIE(TwitterBaseIE):
        'info_dict': {
            'id': '1724883339285544960',
            'ext': 'mp4',
            'title': 'md5:cc56716f9ed0b368de2ba54c478e493c',
            'title': 'Robert F. Kennedy Jr - A beautifully crafted short film by Mikki Willis about my independent...',
            'description': 'md5:9dc14f5b0f1311fc7caf591ae253a164',
            'display_id': '1724884212803834154',
            'channel_id': '337808606',
@@ -1150,7 +1160,7 @@ class TwitterIE(TwitterBaseIE):
    }, {
        # x.com
        'url': 'https://x.com/historyinmemes/status/1790637656616943991',
        'md5': 'daca3952ba0defe2cfafb1276d4c1ea5',
        'md5': '4549eda363fecfe37439c455923cba2c',
        'info_dict': {
            'id': '1790637589910654976',
            'ext': 'mp4',
@@ -1334,7 +1344,7 @@ class TwitterIE(TwitterBaseIE):
    def _generate_syndication_token(self, twid):
        # ((Number(twid) / 1e15) * Math.PI).toString(36).replace(/(0+|\.)/g, '')
        translation = str.maketrans(dict.fromkeys('0.'))
        return js_number_to_string((int(twid) / 1e15) * math.PI, 36).translate(translation)
        return js_number_to_string((int(twid) / 1e15) * math.pi, 36).translate(translation)

    def _call_syndication_api(self, twid):
        self.report_warning(
@@ -1390,7 +1400,7 @@ class TwitterIE(TwitterBaseIE):
        title = description = traverse_obj(
            status, (('full_text', 'text'), {lambda x: x.replace('\n', ' ')}), get_all=False) or ''
        # strip 'https -_t.co_BJYgOjSeGA' junk from filenames
        title = re.sub(r'\s+(https?://[^ ]+)', '', title)
        title = truncate_string(re.sub(r'\s+(https?://[^ ]+)', '', title), left=72)
        user = status.get('user') or {}
        uploader = user.get('name')
        if uploader:

@@ -116,6 +116,7 @@ class VKIE(VKBaseIE):
            'id': '-77521_162222515',
            'ext': 'mp4',
            'title': 'ProtivoGunz - Хуёвая песня',
            'description': 'Видео из официальной группы Noize MC\nhttp://vk.com/noizemc',
            'uploader': 're:(?:Noize MC|Alexander Ilyashenko).*',
            'uploader_id': '39545378',
            'duration': 195,
@@ -165,6 +166,7 @@ class VKIE(VKBaseIE):
            'id': '-93049196_456239755',
            'ext': 'mp4',
            'title': '8 серия (озвучка)',
            'description': 'Видео из официальной группы Noize MC\nhttp://vk.com/noizemc',
            'duration': 8383,
            'comment_count': int,
            'uploader': 'Dizi2021',
@@ -240,6 +242,7 @@ class VKIE(VKBaseIE):
            'upload_date': '20221005',
            'uploader': 'Шальная Императрица',
            'uploader_id': '-74006511',
            'description': 'md5:f9315f7786fa0e84e75e4f824a48b056',
        },
    },
    {
@@ -278,6 +281,25 @@ class VKIE(VKBaseIE):
        },
        'skip': 'No formats found',
    },
    {
        'note': 'video has chapters',
        'url': 'https://vkvideo.ru/video-18403220_456239696',
        'info_dict': {
            'id': '-18403220_456239696',
            'ext': 'mp4',
            'title': 'Трамп отменяет гранты // DeepSeek - Революция в ИИ // Илон Маск читер',
            'description': 'md5:b112ea9de53683b6d03d29076f62eec2',
            'uploader': 'Руслан Усачев',
            'uploader_id': '-18403220',
            'comment_count': int,
            'like_count': int,
            'duration': 1983,
            'thumbnail': r're:https?://.+\.jpg',
            'chapters': 'count:21',
            'timestamp': 1738252883,
            'upload_date': '20250130',
        },
    },
    {
        # live stream, hls and rtmp links, most likely already finished live
        # stream by the time you are reading this comment
@@ -449,7 +471,6 @@ class VKIE(VKBaseIE):
            return self.url_result(opts_url)

        data = player['params'][0]
        title = unescapeHTML(data['md_title'])

        # 2 = live
        # 3 = post live (finished live)
@@ -507,17 +528,29 @@ class VKIE(VKBaseIE):
        return {
            'id': video_id,
            'formats': formats,
            'title': title,
            'thumbnail': data.get('jpg'),
            'uploader': data.get('md_author'),
            'uploader_id': str_or_none(data.get('author_id') or mv_data.get('authorId')),
            'duration': int_or_none(data.get('duration') or mv_data.get('duration')),
            'subtitles': subtitles,
            **traverse_obj(mv_data, {
                'title': ('title', {unescapeHTML}),
                'description': ('desc', {clean_html}, filter),
                'duration': ('duration', {int_or_none}),
                'like_count': ('likes', {int_or_none}),
                'comment_count': ('commcount', {int_or_none}),
            }),
            **traverse_obj(data, {
                'title': ('md_title', {unescapeHTML}),
                'description': ('description', {clean_html}, filter),
                'thumbnail': ('jpg', {url_or_none}),
                'uploader': ('md_author', {str}),
                'uploader_id': (('author_id', 'authorId'), {str_or_none}, any),
                'duration': ('duration', {int_or_none}),
                'chapters': ('time_codes', lambda _, v: isinstance(v['time'], int), {
                    'title': ('text', {str}),
                    'start_time': 'time',
                }),
            }),
            'timestamp': timestamp,
            'view_count': view_count,
            'like_count': int_or_none(mv_data.get('likes')),
            'comment_count': int_or_none(mv_data.get('commcount')),
            'is_live': is_live,
            'subtitles': subtitles,
            '_format_sort_fields': ('res', 'source'),
        }


@@ -2,31 +2,33 @@ import json
import time
import urllib.parse

from .gigya import GigyaBaseIE
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
    ExtractorError,
    clean_html,
    extract_attributes,
    filter_dict,
    float_or_none,
    get_element_by_class,
    get_element_html_by_class,
    int_or_none,
    join_nonempty,
    jwt_decode_hs256,
    jwt_encode_hs256,
    make_archive_id,
    merge_dicts,
    parse_age_limit,
    parse_duration,
    parse_iso8601,
    str_or_none,
    strip_or_none,
    traverse_obj,
    try_call,
    url_or_none,
    urlencode_postdata,
)


class VRTBaseIE(GigyaBaseIE):
class VRTBaseIE(InfoExtractor):
    _GEO_BYPASS = False
    _PLAYER_INFO = {
        'platform': 'desktop',
@@ -37,11 +39,11 @@ class VRTBaseIE(GigyaBaseIE):
        'device': 'undefined (undefined)',
        'os': {
            'name': 'Windows',
            'version': 'x86_64',
            'version': '10',
        },
        'player': {
            'name': 'VRT web player',
            'version': '2.7.4-prod-2023-04-19T06:05:45',
            'version': '5.1.1-prod-2025-02-14T08:44:16"',
        },
    }
    # From https://player.vrt.be/vrtnws/js/main.js & https://player.vrt.be/ketnet/js/main.8cdb11341bcb79e4cd44.js
@@ -90,20 +92,21 @@ class VRTBaseIE(GigyaBaseIE):
    def _call_api(self, video_id, client='null', id_token=None, version='v2'):
        player_info = {'exp': (round(time.time(), 3) + 900), **self._PLAYER_INFO}
        player_token = self._download_json(
            'https://media-services-public.vrt.be/vualto-video-aggregator-web/rest/external/v2/tokens',
            video_id, 'Downloading player token', headers={
            f'https://media-services-public.vrt.be/vualto-video-aggregator-web/rest/external/{version}/tokens',
            video_id, 'Downloading player token', 'Failed to download player token', headers={
                **self.geo_verification_headers(),
                'Content-Type': 'application/json',
            }, data=json.dumps({
                'identityToken': id_token or {},
                'identityToken': id_token or '',
                'playerInfo': jwt_encode_hs256(player_info, self._JWT_SIGNING_KEY, headers={
                    'kid': self._JWT_KEY_ID,
                }).decode(),
            }, separators=(',', ':')).encode())['vrtPlayerToken']

        return self._download_json(
            f'https://media-services-public.vrt.be/media-aggregator/{version}/media-items/{video_id}',
            video_id, 'Downloading API JSON', query={
            # The URL below redirects to https://media-services-public.vrt.be/media-aggregator/{version}/media-items/{video_id}
            f'https://media-services-public.vrt.be/vualto-video-aggregator-web/rest/external/{version}/videos/{video_id}',
            video_id, 'Downloading API JSON', 'Failed to download API JSON', query={
                'vrtPlayerToken': player_token,
                'client': client,
            }, expected_status=400)
@@ -177,215 +180,286 @@ class VRTIE(VRTBaseIE):


class VrtNUIE(VRTBaseIE):
    IE_DESC = 'VRT MAX'
    _VALID_URL = r'https?://(?:www\.)?vrt\.be/vrtnu/a-z/(?:[^/]+/){2}(?P<id>[^/?#&]+)'
    IE_NAME = 'vrtmax'
    IE_DESC = 'VRT MAX (formerly VRT NU)'
    _VALID_URL = r'https?://(?:www\.)?vrt\.be/(?:vrtnu|vrtmax)/a-z/(?:[^/]+/){2}(?P<id>[^/?#&]+)'
    _TESTS = [{
        # CONTENT_IS_AGE_RESTRICTED
        'url': 'https://www.vrt.be/vrtnu/a-z/de-ideale-wereld/2023-vj/de-ideale-wereld-d20230116/',
        'url': 'https://www.vrt.be/vrtmax/a-z/ket---doc/trailer/ket---doc-trailer-s6/',
        'info_dict': {
            'id': 'pbs-pub-855b00a8-6ce2-4032-ac4f-1fcf3ae78524$vid-d2243aa1-ec46-4e34-a55b-92568459906f',
            'id': 'pbs-pub-c8a78645-5d3e-468a-89ec-6f3ed5534bd5$vid-242ddfe9-18f5-4e16-ab45-09b122a19251',
            'ext': 'mp4',
            'title': 'Tom Waes',
            'description': 'Satirisch actualiteitenmagazine met Ella Leyers. Tom Waes is te gast.',
            'timestamp': 1673905125,
            'release_timestamp': 1673905125,
            'series': 'De ideale wereld',
            'season_id': '1672830988794',
            'episode': 'Aflevering 1',
            'episode_number': 1,
            'episode_id': '1672830988861',
            'display_id': 'de-ideale-wereld-d20230116',
            'channel': 'VRT',
            'duration': 1939.0,
            'thumbnail': 'https://images.vrt.be/orig/2023/01/10/1bb39cb3-9115-11ed-b07d-02b7b76bf47f.jpg',
            'release_date': '20230116',
            'upload_date': '20230116',
            'age_limit': 12,
            'channel': 'ketnet',
            'description': 'Neem een kijkje in de bijzondere wereld van deze Ketnetters.',
            'display_id': 'ket---doc-trailer-s6',
            'duration': 30.0,
            'episode': 'Reeks 6 volledig vanaf 3 maart',
            'episode_id': '1739450401467',
            'season': 'Trailer',
            'season_id': '1739450401467',
            'series': 'Ket & Doc',
            'thumbnail': 'https://images.vrt.be/orig/2025/02/21/63f07122-5bbd-4ca1-b42e-8565c6cd95df.jpg',
            'timestamp': 1740373200,
            'title': 'Reeks 6 volledig vanaf 3 maart',
            'upload_date': '20250224',
            '_old_archive_ids': [
                'canvas pbs-pub-c8a78645-5d3e-468a-89ec-6f3ed5534bd5$vid-242ddfe9-18f5-4e16-ab45-09b122a19251',
                'ketnet pbs-pub-c8a78645-5d3e-468a-89ec-6f3ed5534bd5$vid-242ddfe9-18f5-4e16-ab45-09b122a19251',
            ],
        },
    }, {
        'url': 'https://www.vrt.be/vrtnu/a-z/buurman--wat-doet-u-nu-/6/buurman--wat-doet-u-nu--s6-trailer/',
        'url': 'https://www.vrt.be/vrtmax/a-z/meisjes/6/meisjes-s6a5/',
        'info_dict': {
            'id': 'pbs-pub-ad4050eb-d9e5-48c2-9ec8-b6c355032361$vid-0465537a-34a8-4617-8352-4d8d983b4eee',
            'id': 'pbs-pub-97b541ab-e05c-43b9-9a40-445702ef7189$vid-5e306921-a9aa-4fa9-9f39-5b82c8f1028e',
            'ext': 'mp4',
            'title': 'Trailer seizoen 6 \'Buurman, wat doet u nu?\'',
            'description': 'md5:197424726c61384b4e5c519f16c0cf02',
            'timestamp': 1652940000,
            'release_timestamp': 1652940000,
            'series': 'Buurman, wat doet u nu?',
            'season': 'Seizoen 6',
            'channel': 'ketnet',
            'description': 'md5:713793f15cbf677f66200b36b7b1ec5a',
            'display_id': 'meisjes-s6a5',
            'duration': 1336.02,
            'episode': 'Week 5',
            'episode_id': '1684157692901',
            'episode_number': 5,
            'season': '6',
            'season_id': '1684157692901',
            'season_number': 6,
            'season_id': '1652344200907',
            'episode': 'Aflevering 0',
            'episode_number': 0,
            'episode_id': '1652951873524',
            'display_id': 'buurman--wat-doet-u-nu--s6-trailer',
            'channel': 'VRT',
            'duration': 33.13,
            'thumbnail': 'https://images.vrt.be/orig/2022/05/23/3c234d21-da83-11ec-b07d-02b7b76bf47f.jpg',
            'release_date': '20220519',
            'upload_date': '20220519',
            'series': 'Meisjes',
            'thumbnail': 'https://images.vrt.be/orig/2023/05/14/bf526ae0-f1d9-11ed-91d7-02b7b76bf47f.jpg',
            'timestamp': 1685251800,
            'title': 'Week 5',
            'upload_date': '20230528',
            '_old_archive_ids': [
                'canvas pbs-pub-97b541ab-e05c-43b9-9a40-445702ef7189$vid-5e306921-a9aa-4fa9-9f39-5b82c8f1028e',
                'ketnet pbs-pub-97b541ab-e05c-43b9-9a40-445702ef7189$vid-5e306921-a9aa-4fa9-9f39-5b82c8f1028e',
            ],
        },
    }, {
        'url': 'https://www.vrt.be/vrtnu/a-z/taboe/3/taboe-s3a4/',
        'info_dict': {
            'id': 'pbs-pub-f50faa3a-1778-46b6-9117-4ba85f197703$vid-547507fe-1c8b-4394-b361-21e627cbd0fd',
            'ext': 'mp4',
            'channel': 'een',
            'description': 'md5:bf61345a95eca9393a95de4a7a54b5c6',
            'display_id': 'taboe-s3a4',
            'duration': 2882.02,
            'episode': 'Mensen met het syndroom van Gilles de la Tourette',
            'episode_id': '1739055911734',
            'episode_number': 4,
            'season': '3',
            'season_id': '1739055911734',
            'season_number': 3,
            'series': 'Taboe',
            'thumbnail': 'https://images.vrt.be/orig/2025/02/19/8198496c-d1ae-4bca-9a48-761cf3ea3ff2.jpg',
            'timestamp': 1740286800,
            'title': 'Mensen met het syndroom van Gilles de la Tourette',
            'upload_date': '20250223',
            '_old_archive_ids': [
                'canvas pbs-pub-f50faa3a-1778-46b6-9117-4ba85f197703$vid-547507fe-1c8b-4394-b361-21e627cbd0fd',
                'ketnet pbs-pub-f50faa3a-1778-46b6-9117-4ba85f197703$vid-547507fe-1c8b-4394-b361-21e627cbd0fd',
            ],
        },
        'params': {'skip_download': 'm3u8'},
    }]
    _NETRC_MACHINE = 'vrtnu'
    _authenticated = False

    _TOKEN_COOKIE_DOMAIN = '.www.vrt.be'
    _ACCESS_TOKEN_COOKIE_NAME = 'vrtnu-site_profile_at'
    _REFRESH_TOKEN_COOKIE_NAME = 'vrtnu-site_profile_rt'
    _VIDEO_TOKEN_COOKIE_NAME = 'vrtnu-site_profile_vt'
    _VIDEO_PAGE_QUERY = '''
    query VideoPage($pageId: ID!) {
        page(id: $pageId) {
            ... on EpisodePage {
                episode {
                    ageRaw
                    description
                    durationRaw
                    episodeNumberRaw
                    id
                    name
                    onTimeRaw
                    program {
                        title
                    }
                    season {
                        id
                        titleRaw
                    }
                    title
                    brand
                }
                ldjson
                player {
                    image {
                        templateUrl
                    }
                    modes {
                        streamId
                    }
                }
            }
        }
    }
    '''

    def _fetch_tokens(self):
        has_credentials = self._get_login_info()[0]
        access_token = self._get_vrt_cookie(self._ACCESS_TOKEN_COOKIE_NAME)
        video_token = self._get_vrt_cookie(self._VIDEO_TOKEN_COOKIE_NAME)

        if (access_token and not self._is_jwt_token_expired(access_token)
                and video_token and not self._is_jwt_token_expired(video_token)):
            return access_token, video_token

        if has_credentials:
            access_token, video_token = self.cache.load(self._NETRC_MACHINE, 'token_data', default=(None, None))

            if (access_token and not self._is_jwt_token_expired(access_token)
                    and video_token and not self._is_jwt_token_expired(video_token)):
                self.write_debug('Restored tokens from cache')
                self._set_cookie(self._TOKEN_COOKIE_DOMAIN, self._ACCESS_TOKEN_COOKIE_NAME, access_token)
                self._set_cookie(self._TOKEN_COOKIE_DOMAIN, self._VIDEO_TOKEN_COOKIE_NAME, video_token)
                return access_token, video_token

        if not self._get_vrt_cookie(self._REFRESH_TOKEN_COOKIE_NAME):
            return None, None

        self._request_webpage(
            'https://www.vrt.be/vrtmax/sso/refresh', None,
            note='Refreshing tokens', errnote='Failed to refresh tokens', fatal=False)

        access_token = self._get_vrt_cookie(self._ACCESS_TOKEN_COOKIE_NAME)
        video_token = self._get_vrt_cookie(self._VIDEO_TOKEN_COOKIE_NAME)

        if not access_token or not video_token:
            self.cache.store(self._NETRC_MACHINE, 'refresh_token', None)
            self.cookiejar.clear(self._TOKEN_COOKIE_DOMAIN, '/vrtmax/sso', self._REFRESH_TOKEN_COOKIE_NAME)
            msg = 'Refreshing of tokens failed'
            if not has_credentials:
                self.report_warning(msg)
                return None, None
            self.report_warning(f'{msg}. Re-logging in')
            return self._perform_login(*self._get_login_info())

        if has_credentials:
            self.cache.store(self._NETRC_MACHINE, 'token_data', (access_token, video_token))

        return access_token, video_token

    def _get_vrt_cookie(self, cookie_name):
        # Refresh token cookie is scoped to /vrtmax/sso, others are scoped to /
        return try_call(lambda: self._get_cookies('https://www.vrt.be/vrtmax/sso')[cookie_name].value)

    @staticmethod
    def _is_jwt_token_expired(token):
        return jwt_decode_hs256(token)['exp'] - time.time() < 300

    def _perform_login(self, username, password):
        auth_info = self._gigya_login({
            'APIKey': '3_0Z2HujMtiWq_pkAjgnS2Md2E11a1AwZjYiBETtwNE-EoEHDINgtnvcAOpNgmrVGy',
            'targetEnv': 'jssdk',
            'loginID': username,
            'password': password,
            'authMode': 'cookie',
        })
        refresh_token = self._get_vrt_cookie(self._REFRESH_TOKEN_COOKIE_NAME)
        if refresh_token and not self._is_jwt_token_expired(refresh_token):
            self.write_debug('Using refresh token from logged-in cookies; skipping login with credentials')
            return

        if auth_info.get('errorDetails'):
            raise ExtractorError(f'Unable to login. VrtNU said: {auth_info["errorDetails"]}', expected=True)
        refresh_token = self.cache.load(self._NETRC_MACHINE, 'refresh_token', default=None)
        if refresh_token and not self._is_jwt_token_expired(refresh_token):
            self.write_debug('Restored refresh token from cache')
            self._set_cookie(self._TOKEN_COOKIE_DOMAIN, self._REFRESH_TOKEN_COOKIE_NAME, refresh_token, path='/vrtmax/sso')
            return

        # Sometimes authentication fails for no good reason, retry
        for retry in self.RetryManager():
            if retry.attempt > 1:
                self._sleep(1, None)
            try:
                self._request_webpage(
                    'https://token.vrt.be/vrtnuinitlogin', None, note='Requesting XSRF Token',
                    errnote='Could not get XSRF Token', query={
                        'provider': 'site',
                        'destination': 'https://www.vrt.be/vrtnu/',
                    })
                self._request_webpage(
                    'https://login.vrt.be/perform_login', None,
                    note='Performing login', errnote='Login failed',
                    query={'client_id': 'vrtnu-site'}, data=urlencode_postdata({
                        'UID': auth_info['UID'],
                        'UIDSignature': auth_info['UIDSignature'],
                        'signatureTimestamp': auth_info['signatureTimestamp'],
                        '_csrf': self._get_cookies('https://login.vrt.be').get('OIDCXSRF').value,
                    }))
            except ExtractorError as e:
                if isinstance(e.cause, HTTPError) and e.cause.status == 401:
                    retry.error = e
                    continue
                raise
        self._request_webpage(
            'https://www.vrt.be/vrtmax/sso/login', None,
            note='Getting session cookies', errnote='Failed to get session cookies')

        self._authenticated = True
        login_data = self._download_json(
            'https://login.vrt.be/perform_login', None, data=json.dumps({
                'clientId': 'vrtnu-site',
                'loginID': username,
                'password': password,
            }).encode(), headers={
                'Content-Type': 'application/json',
                'Oidcxsrf': self._get_cookies('https://login.vrt.be')['OIDCXSRF'].value,
            }, note='Logging in', errnote='Login failed', expected_status=403)
        if login_data.get('errorCode'):
            raise ExtractorError(f'Login failed: {login_data.get("errorMessage")}', expected=True)

        self._request_webpage(
            login_data['redirectUrl'], None,
            note='Getting access token', errnote='Failed to get access token')

        access_token = self._get_vrt_cookie(self._ACCESS_TOKEN_COOKIE_NAME)
        video_token = self._get_vrt_cookie(self._VIDEO_TOKEN_COOKIE_NAME)
        refresh_token = self._get_vrt_cookie(self._REFRESH_TOKEN_COOKIE_NAME)

        if not all((access_token, video_token, refresh_token)):
            raise ExtractorError('Unable to extract token cookie values')

        self.cache.store(self._NETRC_MACHINE, 'token_data', (access_token, video_token))
        self.cache.store(self._NETRC_MACHINE, 'refresh_token', refresh_token)

        return access_token, video_token

    def _real_extract(self, url):
        display_id = self._match_id(url)
        parsed_url = urllib.parse.urlparse(url)
        details = self._download_json(
            f'{parsed_url.scheme}://{parsed_url.netloc}{parsed_url.path.rstrip("/")}.model.json',
            display_id, 'Downloading asset JSON', 'Unable to download asset JSON')['details']
        access_token, video_token = self._fetch_tokens()

        watch_info = traverse_obj(details, (
            'actions', lambda _, v: v['type'] == 'watch-episode', {dict}), get_all=False) or {}
        video_id = join_nonempty(
            'episodePublicationId', 'episodeVideoId', delim='$', from_dict=watch_info)
        if '$' not in video_id:
            raise ExtractorError('Unable to extract video ID')
        metadata = self._download_json(
            f'https://www.vrt.be/vrtnu-api/graphql{"" if access_token else "/public"}/v1',
            display_id, 'Downloading asset JSON', 'Unable to download asset JSON',
            data=json.dumps({
                'operationName': 'VideoPage',
                'query': self._VIDEO_PAGE_QUERY,
                'variables': {'pageId': urllib.parse.urlparse(url).path},
            }).encode(),
            headers=filter_dict({
                'Authorization': f'Bearer {access_token}' if access_token else None,
                'Content-Type': 'application/json',
                'x-vrt-client-name': 'WEB',
                'x-vrt-client-version': '1.5.9',
                'x-vrt-zone': 'default',
            }))['data']['page']

        vrtnutoken = self._download_json(
            'https://token.vrt.be/refreshtoken', video_id, note='Retrieving vrtnutoken',
            errnote='Token refresh failed')['vrtnutoken'] if self._authenticated else None
        video_id = metadata['player']['modes'][0]['streamId']

        video_info = self._call_api(video_id, 'vrtnu-web@PROD', vrtnutoken)
        try:
            streaming_info = self._call_api(video_id, 'vrtnu-web@PROD', id_token=video_token)
        except ExtractorError as e:
            if not video_token and isinstance(e.cause, HTTPError) and e.cause.status == 404:
                self.raise_login_required()
            raise

        if 'title' not in video_info:
            code = video_info.get('code')
            if code in ('AUTHENTICATION_REQUIRED', 'CONTENT_IS_AGE_RESTRICTED'):
                self.raise_login_required(code, method='password')
            elif code in ('INVALID_LOCATION', 'CONTENT_AVAILABLE_ONLY_IN_BE'):
        formats, subtitles = self._extract_formats_and_subtitles(streaming_info, video_id)

        code = traverse_obj(streaming_info, ('code', {str}))
        if not formats and code:
            if code in ('CONTENT_AVAILABLE_ONLY_FOR_BE_RESIDENTS', 'CONTENT_AVAILABLE_ONLY_IN_BE', 'CONTENT_UNAVAILABLE_VIA_PROXY'):
                self.raise_geo_restricted(countries=['BE'])
            elif code == 'CONTENT_AVAILABLE_ONLY_FOR_BE_RESIDENTS_AND_EXPATS':
                if not self._authenticated:
                    self.raise_login_required(code, method='password')
                self.raise_geo_restricted(countries=['BE'])
            raise ExtractorError(code, expected=True)

        formats, subtitles = self._extract_formats_and_subtitles(video_info, video_id)
            elif code in ('CONTENT_AVAILABLE_ONLY_FOR_BE_RESIDENTS_AND_EXPATS', 'CONTENT_IS_AGE_RESTRICTED', 'CONTENT_REQUIRES_AUTHENTICATION'):
                self.raise_login_required()
            else:
                self.raise_no_formats(f'Unable to extract formats: {code}')

        return {
            **traverse_obj(details, {
                'title': 'title',
                'description': ('description', {clean_html}),
                'timestamp': ('data', 'episode', 'onTime', 'raw', {parse_iso8601}),
                'release_timestamp': ('data', 'episode', 'onTime', 'raw', {parse_iso8601}),
                'series': ('data', 'program', 'title'),
                'season': ('data', 'season', 'title', 'value'),
                'season_number': ('data', 'season', 'title', 'raw', {int_or_none}),
                'season_id': ('data', 'season', 'id', {str_or_none}),
                'episode': ('data', 'episode', 'number', 'value', {str_or_none}),
                'episode_number': ('data', 'episode', 'number', 'raw', {int_or_none}),
                'episode_id': ('data', 'episode', 'id', {str_or_none}),
                'age_limit': ('data', 'episode', 'age', 'raw', {parse_age_limit}),
            }),
            'duration': float_or_none(streaming_info.get('duration'), 1000),
            'thumbnail': url_or_none(streaming_info.get('posterImageUrl')),
            **self._json_ld(traverse_obj(metadata, ('ldjson', ..., {json.loads})), video_id, fatal=False),
            **traverse_obj(metadata, ('episode', {
                'title': ('title', {str}),
                'description': ('description', {str}),
                'timestamp': ('onTimeRaw', {parse_iso8601}),
                'series': ('program', 'title', {str}),
                'season': ('season', 'titleRaw', {str}),
                'season_number': ('season', 'titleRaw', {int_or_none}),
                'season_id': ('id', {str_or_none}),
                'episode': ('title', {str}),
                'episode_number': ('episodeNumberRaw', {int_or_none}),
                'episode_id': ('id', {str_or_none}),
                'age_limit': ('ageRaw', {parse_age_limit}),
                'channel': ('brand', {str}),
                'duration': ('durationRaw', {parse_duration}),
            })),
            'id': video_id,
            'display_id': display_id,
            'channel': 'VRT',
            'formats': formats,
            'duration': float_or_none(video_info.get('duration'), 1000),
            'thumbnail': url_or_none(video_info.get('posterImageUrl')),
            'subtitles': subtitles,
            '_old_archive_ids': [make_archive_id('Canvas', video_id)],
        }


class KetnetIE(VRTBaseIE):
    _VALID_URL = r'https?://(?:www\.)?ketnet\.be/(?P<id>(?:[^/]+/)*[^/?#&]+)'
    _TESTS = [{
        'url': 'https://www.ketnet.be/kijken/m/meisjes/6/meisjes-s6a5',
        'info_dict': {
            'id': 'pbs-pub-39f8351c-a0a0-43e6-8394-205d597d6162$vid-5e306921-a9aa-4fa9-9f39-5b82c8f1028e',
            'ext': 'mp4',
            'title': 'Meisjes',
            'episode': 'Reeks 6: Week 5',
            'season': 'Reeks 6',
            'series': 'Meisjes',
            'timestamp': 1685251800,
            'upload_date': '20230528',
        },
        'params': {'skip_download': 'm3u8'},
    }]

    def _real_extract(self, url):
        display_id = self._match_id(url)

        video = self._download_json(
            'https://senior-bff.ketnet.be/graphql', display_id, query={
                'query': '''{
  video(id: "content/ketnet/nl/%s.model.json") {
    description
    episodeNr
    imageUrl
    mediaReference
    programTitle
    publicationDate
    seasonTitle
    subtitleVideodetail
    titleVideodetail
  }
}''' % display_id,  # noqa: UP031
            })['data']['video']

        video_id = urllib.parse.unquote(video['mediaReference'])
        data = self._call_api(video_id, 'ketnet@PROD', version='v1')
        formats, subtitles = self._extract_formats_and_subtitles(data, video_id)

        return {
            'id': video_id,
            'formats': formats,
            'subtitles': subtitles,
            '_old_archive_ids': [make_archive_id('Canvas', video_id)],
            **traverse_obj(video, {
                'title': ('titleVideodetail', {str}),
                'description': ('description', {str}),
                'thumbnail': ('thumbnail', {url_or_none}),
                'timestamp': ('publicationDate', {parse_iso8601}),
                'series': ('programTitle', {str}),
                'season': ('seasonTitle', {str}),
                'episode': ('subtitleVideodetail', {str}),
                'episode_number': ('episodeNr', {int_or_none}),
            }),
            '_old_archive_ids': [make_archive_id('Canvas', video_id),
                                 make_archive_id('Ketnet', video_id)],
        }


@@ -12,6 +12,7 @@ from ..utils import (
    str_or_none,
    strip_jsonp,
    traverse_obj,
    truncate_string,
    url_or_none,
    urlencode_postdata,
    urljoin,
@@ -96,7 +97,8 @@ class WeiboBaseIE(InfoExtractor):
            })
        return formats

    def _parse_video_info(self, video_info, video_id=None):
    def _parse_video_info(self, video_info):
        video_id = traverse_obj(video_info, (('id', 'id_str', 'mid'), {str_or_none}, any))
        return {
            'id': video_id,
            'extractor_key': WeiboIE.ie_key(),
@@ -105,9 +107,10 @@ class WeiboBaseIE(InfoExtractor):
            'http_headers': {'Referer': 'https://weibo.com/'},
            '_old_archive_ids': [make_archive_id('WeiboMobile', video_id)],
            **traverse_obj(video_info, {
                'id': (('id', 'id_str', 'mid'), {str_or_none}),
                'display_id': ('mblogid', {str_or_none}),
                'title': ('page_info', 'media_info', ('video_title', 'kol_title', 'name'), {str}, filter),
                'title': ('page_info', 'media_info', ('video_title', 'kol_title', 'name'),
                          {lambda x: x.replace('\n', ' ')}, {truncate_string(left=72)}, filter),
                'alt_title': ('page_info', 'media_info', ('video_title', 'kol_title', 'name'), {str}, filter),
                'description': ('text_raw', {str}),
                'duration': ('page_info', 'media_info', 'duration', {int_or_none}),
                'timestamp': ('page_info', 'media_info', 'video_publish_time', {int_or_none}),
@@ -129,9 +132,11 @@ class WeiboIE(WeiboBaseIE):
        'url': 'https://weibo.com/7827771738/N4xlMvjhI',
        'info_dict': {
            'id': '4910815147462302',
            '_old_archive_ids': ['weibomobile 4910815147462302'],
            'ext': 'mp4',
            'display_id': 'N4xlMvjhI',
            'title': '【睡前消息暑假版第一期:拉泰国一把 对中国有好处】',
            'alt_title': '【睡前消息暑假版第一期:拉泰国一把 对中国有好处】',
            'description': 'md5:e2637a7673980d68694ea7c43cf12a5f',
            'duration': 918,
            'timestamp': 1686312819,
@@ -149,9 +154,11 @@ class WeiboIE(WeiboBaseIE):
        'url': 'https://m.weibo.cn/status/4189191225395228',
        'info_dict': {
            'id': '4189191225395228',
            '_old_archive_ids': ['weibomobile 4189191225395228'],
            'ext': 'mp4',
            'display_id': 'FBqgOmDxO',
            'title': '柴犬柴犬的秒拍视频',
            'alt_title': '柴犬柴犬的秒拍视频',
            'description': 'md5:80f461ab5cdae6bbdb70efbf5a1db24f',
            'duration': 53,
            'timestamp': 1514264429,
@@ -166,34 +173,35 @@ class WeiboIE(WeiboBaseIE):
        },
    }, {
        'url': 'https://m.weibo.cn/detail/4189191225395228',
        'info_dict': {
            'id': '4189191225395228',
            'ext': 'mp4',
            'display_id': 'FBqgOmDxO',
            'title': '柴犬柴犬的秒拍视频',
            'description': '午睡当然是要甜甜蜜蜜的啦![坏笑] Instagram:shibainu.gaku http://t.cn/RHbmjzW ',
            'duration': 53,
            'timestamp': 1514264429,
            'upload_date': '20171226',
            'thumbnail': r're:https://.*\.jpg',
            'uploader': '柴犬柴犬',
            'uploader_id': '5926682210',
            'uploader_url': 'https://weibo.com/u/5926682210',
            'view_count': int,
            'like_count': int,
            'repost_count': int,
        },
        'only_matching': True,
    }, {
        'url': 'https://weibo.com/0/4224132150961381',
        'note': 'no playback_list example',
        'only_matching': True,
    }, {
        'url': 'https://m.weibo.cn/detail/5120561132606436',
        'info_dict': {
            'id': '5120561132606436',
        },
        'playlist_count': 9,
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)

        return self._parse_video_info(self._weibo_download_json(
            f'https://weibo.com/ajax/statuses/show?id={video_id}', video_id))
        meta = self._weibo_download_json(f'https://weibo.com/ajax/statuses/show?id={video_id}', video_id)
        mix_media_info = traverse_obj(meta, ('mix_media_info', 'items', ...))
        if not mix_media_info:
            return self._parse_video_info(meta)

        return self.playlist_result(self._entries(mix_media_info), video_id)

    def _entries(self, mix_media_info):
        for media_info in traverse_obj(mix_media_info, lambda _, v: v['type'] != 'pic'):
            yield self._parse_video_info(traverse_obj(media_info, {
                'id': ('data', 'object_id'),
                'page_info': {'media_info': ('data', 'media_info', {dict})},
            }))


class WeiboVideoIE(WeiboBaseIE):
@@ -205,6 +213,7 @@ class WeiboVideoIE(WeiboBaseIE):
            'ext': 'mp4',
            'display_id': 'LEZDodaiW',
            'title': '呃,稍微了解了一下靡烟miya,感觉这东西也太二了',
            'alt_title': '呃,稍微了解了一下靡烟miya,感觉这东西也太二了',
            'description': '呃,稍微了解了一下靡烟miya,感觉这东西也太二了 http://t.cn/A6aerGsM \u200b\u200b\u200b',
            'duration': 76,
            'timestamp': 1659344278,
@@ -216,6 +225,7 @@ class WeiboVideoIE(WeiboBaseIE):
            'view_count': int,
            'like_count': int,
            'repost_count': int,
            '_old_archive_ids': ['weibomobile 4797700463137878'],
        },
    }]


@@ -100,8 +100,8 @@ class WSJIE(InfoExtractor):


class WSJArticleIE(InfoExtractor):
    _VALID_URL = r'(?i)https?://(?:www\.)?wsj\.com/articles/(?P<id>[^/?#&]+)'
    _TEST = {
    _VALID_URL = r'(?i)https?://(?:www\.)?wsj\.com/(?:articles|opinion)/(?P<id>[^/?#&]+)'
    _TESTS = [{
        'url': 'https://www.wsj.com/articles/dont-like-china-no-pandas-for-you-1490366939?',
        'info_dict': {
            'id': '4B13FA62-1D8C-45DB-8EA1-4105CB20B362',
@@ -110,11 +110,20 @@ class WSJArticleIE(InfoExtractor):
            'uploader_id': 'ralcaraz',
            'title': 'Bao Bao the Panda Leaves for China',
        },
    }
    }, {
        'url': 'https://www.wsj.com/opinion/hamas-hostages-caskets-bibas-family-israel-gaza-29da083b',
        'info_dict': {
            'id': 'CE68D629-8DB8-4CD3-B30A-92112C102054',
            'ext': 'mp4',
            'upload_date': '20241007',
            'uploader_id': 'Tinnes, David',
            'title': 'WSJ Opinion: "Get the Jew": The Crown Heights Riot Revisited',
        },
    }]

    def _real_extract(self, url):
        article_id = self._match_id(url)
        webpage = self._download_webpage(url, article_id)
        webpage = self._download_webpage(url, article_id, impersonate=True)
        video_id = self._search_regex(
            r'(?:id=["\']video|video-|iframe\.html\?guid=|data-src=["\'])([a-fA-F0-9-]{36})',
            webpage, 'video id')

50
yt_dlp/extractor/youtube/__init__.py
Normal file
@@ -0,0 +1,50 @@
# flake8: noqa: F401
from ._base import YoutubeBaseInfoExtractor
from ._clip import YoutubeClipIE
from ._mistakes import YoutubeTruncatedIDIE, YoutubeTruncatedURLIE
from ._notifications import YoutubeNotificationsIE
from ._redirect import (
    YoutubeConsentRedirectIE,
    YoutubeFavouritesIE,
    YoutubeFeedsInfoExtractor,
    YoutubeHistoryIE,
    YoutubeLivestreamEmbedIE,
    YoutubeRecommendedIE,
    YoutubeShortsAudioPivotIE,
    YoutubeSubscriptionsIE,
    YoutubeWatchLaterIE,
    YoutubeYtBeIE,
    YoutubeYtUserIE,
)
from ._search import YoutubeMusicSearchURLIE, YoutubeSearchDateIE, YoutubeSearchIE, YoutubeSearchURLIE
from ._tab import YoutubePlaylistIE, YoutubeTabBaseInfoExtractor, YoutubeTabIE
from ._video import YoutubeIE

# Hack to allow plugin overrides work
for _cls in [
    YoutubeBaseInfoExtractor,
    YoutubeClipIE,
    YoutubeTruncatedIDIE,
    YoutubeTruncatedURLIE,
    YoutubeNotificationsIE,
    YoutubeConsentRedirectIE,
    YoutubeFavouritesIE,
    YoutubeFeedsInfoExtractor,
    YoutubeHistoryIE,
    YoutubeLivestreamEmbedIE,
    YoutubeRecommendedIE,
    YoutubeShortsAudioPivotIE,
    YoutubeSubscriptionsIE,
    YoutubeWatchLaterIE,
    YoutubeYtBeIE,
    YoutubeYtUserIE,
    YoutubeMusicSearchURLIE,
    YoutubeSearchDateIE,
    YoutubeSearchIE,
    YoutubeSearchURLIE,
    YoutubePlaylistIE,
    YoutubeTabBaseInfoExtractor,
    YoutubeTabIE,
    YoutubeIE,
]:
    _cls.__module__ = 'yt_dlp.extractor.youtube'
1075
yt_dlp/extractor/youtube/_base.py
Normal file
File diff suppressed because it is too large
66
yt_dlp/extractor/youtube/_clip.py
Normal file
@@ -0,0 +1,66 @@
from ._tab import YoutubeTabBaseInfoExtractor
from ._video import YoutubeIE
from ...utils import ExtractorError, traverse_obj


class YoutubeClipIE(YoutubeTabBaseInfoExtractor):
    IE_NAME = 'youtube:clip'
    _VALID_URL = r'https?://(?:www\.)?youtube\.com/clip/(?P<id>[^/?#]+)'
    _TESTS = [{
        # FIXME: Other metadata should be extracted from the clip, not from the base video
        'url': 'https://www.youtube.com/clip/UgytZKpehg-hEMBSn3F4AaABCQ',
        'info_dict': {
            'id': 'UgytZKpehg-hEMBSn3F4AaABCQ',
            'ext': 'mp4',
            'section_start': 29.0,
            'section_end': 39.7,
            'duration': 10.7,
            'age_limit': 0,
            'availability': 'public',
            'categories': ['Gaming'],
            'channel': 'Scott The Woz',
            'channel_id': 'UC4rqhyiTs7XyuODcECvuiiQ',
            'channel_url': 'https://www.youtube.com/channel/UC4rqhyiTs7XyuODcECvuiiQ',
            'description': 'md5:7a4517a17ea9b4bd98996399d8bb36e7',
            'like_count': int,
            'playable_in_embed': True,
            'tags': 'count:17',
            'thumbnail': 'https://i.ytimg.com/vi_webp/ScPX26pdQik/maxresdefault.webp',
            'title': 'Mobile Games on Console - Scott The Woz',
            'upload_date': '20210920',
            'uploader': 'Scott The Woz',
            'uploader_id': '@ScottTheWoz',
            'uploader_url': 'https://www.youtube.com/@ScottTheWoz',
            'view_count': int,
            'live_status': 'not_live',
            'channel_follower_count': int,
            'chapters': 'count:20',
            'comment_count': int,
            'heatmap': 'count:100',
        },
    }]

    def _real_extract(self, url):
        clip_id = self._match_id(url)
        _, data = self._extract_webpage(url, clip_id)

        video_id = traverse_obj(data, ('currentVideoEndpoint', 'watchEndpoint', 'videoId'))
        if not video_id:
            raise ExtractorError('Unable to find video ID')

        clip_data = traverse_obj(data, (
            'engagementPanels', ..., 'engagementPanelSectionListRenderer', 'content', 'clipSectionRenderer',
            'contents', ..., 'clipAttributionRenderer', 'onScrubExit', 'commandExecutorCommand', 'commands', ...,
            'openPopupAction', 'popup', 'notificationActionRenderer', 'actionButton', 'buttonRenderer', 'command',
            'commandExecutorCommand', 'commands', ..., 'loopCommand'), get_all=False)

        return {
            '_type': 'url_transparent',
            'url': f'https://www.youtube.com/watch?v={video_id}',
            'ie_key': YoutubeIE.ie_key(),
            'id': clip_id,
            'section_start': int(clip_data['startTimeMs']) / 1000,
            'section_end': int(clip_data['endTimeMs']) / 1000,
            '_format_sort_fields': (  # https protocol is prioritized for ffmpeg compatibility
                'proto:https', 'quality', 'res', 'fps', 'hdr:12', 'source', 'vcodec', 'channels', 'acodec', 'lang'),
        }
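`YoutubeClipIE` leans heavily on `traverse_obj`, where `...` fans out over every element of a list and `get_all=False` returns only the first match. A runnable illustration with made-up renderer data:

```python
from yt_dlp.utils import traverse_obj

data = {'engagementPanels': [
    {'unrelated': {}},
    {'panel': {'contents': [{'clip': {'startTimeMs': '29000', 'endTimeMs': '39700'}}]}},
]}

# `...` branches over list items; get_all=False stops at the first hit
clip = traverse_obj(data, ('engagementPanels', ..., 'panel', 'contents', ..., 'clip'), get_all=False)
assert clip == {'startTimeMs': '29000', 'endTimeMs': '39700'}
print(int(clip['startTimeMs']) / 1000)  # 29.0 -- the same arithmetic as section_start
```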
69  yt_dlp/extractor/youtube/_mistakes.py  Normal file
@@ -0,0 +1,69 @@
from ._base import YoutubeBaseInfoExtractor
from ...utils import ExtractorError


class YoutubeTruncatedURLIE(YoutubeBaseInfoExtractor):
    IE_NAME = 'youtube:truncated_url'
    IE_DESC = False  # Do not list
    _VALID_URL = r'''(?x)
        (?:https?://)?
        (?:\w+\.)?[yY][oO][uU][tT][uU][bB][eE](?:-nocookie)?\.com/
        (?:watch\?(?:
            feature=[a-z_]+|
            annotation_id=annotation_[^&]+|
            x-yt-cl=[0-9]+|
            hl=[^&]*|
            t=[0-9]+
        )?
        |
        attribution_link\?a=[^&]+
        )
        $
    '''

    _TESTS = [{
        'url': 'https://www.youtube.com/watch?annotation_id=annotation_3951667041',
        'only_matching': True,
    }, {
        'url': 'https://www.youtube.com/watch?',
        'only_matching': True,
    }, {
        'url': 'https://www.youtube.com/watch?x-yt-cl=84503534',
        'only_matching': True,
    }, {
        'url': 'https://www.youtube.com/watch?feature=foo',
        'only_matching': True,
    }, {
        'url': 'https://www.youtube.com/watch?hl=en-GB',
        'only_matching': True,
    }, {
        'url': 'https://www.youtube.com/watch?t=2372',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        raise ExtractorError(
            'Did you forget to quote the URL? Remember that & is a meta '
            'character in most shells, so you want to put the URL in quotes, '
            'like yt-dlp '
            '"https://www.youtube.com/watch?feature=foo&v=BaW_jenozKc" '
            ' or simply yt-dlp BaW_jenozKc .',
            expected=True)


class YoutubeTruncatedIDIE(YoutubeBaseInfoExtractor):
    IE_NAME = 'youtube:truncated_id'
    IE_DESC = False  # Do not list
    _VALID_URL = r'https?://(?:www\.)?youtube\.com/watch\?v=(?P<id>[0-9A-Za-z_-]{1,10})$'

    _TESTS = [{
        'url': 'https://www.youtube.com/watch?v=N_708QY7Ob',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
        raise ExtractorError(
            f'Incomplete YouTube ID {video_id}. URL {url} looks truncated.',
            expected=True)
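Both classes exist to turn common URL mangling into readable errors. Valid YouTube video IDs are 11 characters, so the `{1,10}` quantifier plus the `$` anchor in `YoutubeTruncatedIDIE` matches only IDs that have lost their tail; a quick check of that boundary:

```python
import re

_VALID_URL = r'https?://(?:www\.)?youtube\.com/watch\?v=(?P<id>[0-9A-Za-z_-]{1,10})$'

assert re.match(_VALID_URL, 'https://www.youtube.com/watch?v=N_708QY7Ob')       # 10 chars: truncated
assert not re.match(_VALID_URL, 'https://www.youtube.com/watch?v=N_708QY7Obc')  # 11 chars: a full ID
```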
98  yt_dlp/extractor/youtube/_notifications.py  Normal file
@@ -0,0 +1,98 @@
import itertools
import re

from ._tab import YoutubeTabBaseInfoExtractor, YoutubeTabIE
from ._video import YoutubeIE
from ...utils import traverse_obj


class YoutubeNotificationsIE(YoutubeTabBaseInfoExtractor):
    IE_NAME = 'youtube:notif'
    IE_DESC = 'YouTube notifications; ":ytnotif" keyword (requires cookies)'
    _VALID_URL = r':ytnotif(?:ication)?s?'
    _LOGIN_REQUIRED = True
    _TESTS = [{
        'url': ':ytnotif',
        'only_matching': True,
    }, {
        'url': ':ytnotifications',
        'only_matching': True,
    }]

    def _extract_notification_menu(self, response, continuation_list):
        notification_list = traverse_obj(
            response,
            ('actions', 0, 'openPopupAction', 'popup', 'multiPageMenuRenderer', 'sections', 0, 'multiPageMenuNotificationSectionRenderer', 'items'),
            ('actions', 0, 'appendContinuationItemsAction', 'continuationItems'),
            expected_type=list) or []
        continuation_list[0] = None
        for item in notification_list:
            entry = self._extract_notification_renderer(item.get('notificationRenderer'))
            if entry:
                yield entry
            continuation = item.get('continuationItemRenderer')
            if continuation:
                continuation_list[0] = continuation

    def _extract_notification_renderer(self, notification):
        video_id = traverse_obj(
            notification, ('navigationEndpoint', 'watchEndpoint', 'videoId'), expected_type=str)
        url = f'https://www.youtube.com/watch?v={video_id}'
        channel_id = None
        if not video_id:
            browse_ep = traverse_obj(
                notification, ('navigationEndpoint', 'browseEndpoint'), expected_type=dict)
            channel_id = self.ucid_or_none(traverse_obj(browse_ep, 'browseId', expected_type=str))
            post_id = self._search_regex(
                r'/post/(.+)', traverse_obj(browse_ep, 'canonicalBaseUrl', expected_type=str),
                'post id', default=None)
            if not channel_id or not post_id:
                return
            # The direct /post url redirects to this in the browser
            url = f'https://www.youtube.com/channel/{channel_id}/community?lb={post_id}'

        channel = traverse_obj(
            notification, ('contextualMenu', 'menuRenderer', 'items', 1, 'menuServiceItemRenderer', 'text', 'runs', 1, 'text'),
            expected_type=str)
        notification_title = self._get_text(notification, 'shortMessage')
        if notification_title:
            notification_title = notification_title.replace('\xad', '')  # remove soft hyphens
        # TODO: handle recommended videos
        title = self._search_regex(
            rf'{re.escape(channel or "")}[^:]+: (.+)', notification_title,
            'video title', default=None)
        timestamp = (self._parse_time_text(self._get_text(notification, 'sentTimeText'))
                     if self._configuration_arg('approximate_date', ie_key=YoutubeTabIE)
                     else None)
        return {
            '_type': 'url',
            'url': url,
            'ie_key': (YoutubeIE if video_id else YoutubeTabIE).ie_key(),
            'video_id': video_id,
            'title': title,
            'channel_id': channel_id,
            'channel': channel,
            'uploader': channel,
            'thumbnails': self._extract_thumbnails(notification, 'videoThumbnail'),
            'timestamp': timestamp,
        }

    def _notification_menu_entries(self, ytcfg):
        continuation_list = [None]
        response = None
        for page in itertools.count(1):
            ctoken = traverse_obj(
                continuation_list, (0, 'continuationEndpoint', 'getNotificationMenuEndpoint', 'ctoken'), expected_type=str)
            response = self._extract_response(
                item_id=f'page {page}', query={'ctoken': ctoken} if ctoken else {}, ytcfg=ytcfg,
                ep='notification/get_notification_menu', check_get_keys='actions',
                headers=self.generate_api_headers(ytcfg=ytcfg, visitor_data=self._extract_visitor_data(response)))
            yield from self._extract_notification_menu(response, continuation_list)
            if not continuation_list[0]:
                break

    def _real_extract(self, url):
        display_id = 'notifications'
        ytcfg = self._download_ytcfg('web', display_id) if not self.skip_webpage else {}
        self._report_playlist_authcheck(ytcfg)
        return self.playlist_result(self._notification_menu_entries(ytcfg), display_id, display_id)
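The single-element `continuation_list` is a small mutability trick: the generator writes the next continuation token into the shared list while the paging loop keeps requesting until no token is left. The same pattern in isolation (all names hypothetical):

```python
import itertools


def fetch_page(token):  # hypothetical stand-in for _extract_response
    pages = {None: (['a', 'b'], 'token-1'), 'token-1': (['c'], None)}
    return pages[token]


def paginate():
    continuation = [None]  # shared cell, mutated by each iteration
    for _page in itertools.count(1):
        items, continuation[0] = fetch_page(continuation[0])
        yield from items
        if not continuation[0]:
            break


assert list(paginate()) == ['a', 'b', 'c']
```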
247  yt_dlp/extractor/youtube/_redirect.py  Normal file
@@ -0,0 +1,247 @@
import base64
import urllib.parse

from ._base import YoutubeBaseInfoExtractor
from ._tab import YoutubeTabIE
from ...utils import ExtractorError, classproperty, parse_qs, update_url_query, url_or_none


class YoutubeYtBeIE(YoutubeBaseInfoExtractor):
    IE_DESC = 'youtu.be'
    _VALID_URL = rf'https?://youtu\.be/(?P<id>[0-9A-Za-z_-]{{11}})/*?.*?\blist=(?P<playlist_id>{YoutubeBaseInfoExtractor._PLAYLIST_ID_RE})'
    _TESTS = [{
        'url': 'https://youtu.be/yeWKywCrFtk?list=PL2qgrgXsNUG5ig9cat4ohreBjYLAPC0J5',
        'info_dict': {
            'id': 'yeWKywCrFtk',
            'ext': 'mp4',
            'title': 'Small Scale Baler and Braiding Rugs',
            'uploader': 'Backus-Page House Museum',
            'uploader_id': '@backuspagemuseum',
            'uploader_url': r're:https?://(?:www\.)?youtube\.com/@backuspagemuseum',
            'upload_date': '20161008',
            'description': 'md5:800c0c78d5eb128500bffd4f0b4f2e8a',
            'categories': ['Nonprofits & Activism'],
            'tags': list,
            'like_count': int,
            'age_limit': 0,
            'playable_in_embed': True,
            'thumbnail': r're:^https?://.*\.webp',
            'channel': 'Backus-Page House Museum',
            'channel_id': 'UCEfMCQ9bs3tjvjy1s451zaw',
            'live_status': 'not_live',
            'view_count': int,
            'channel_url': 'https://www.youtube.com/channel/UCEfMCQ9bs3tjvjy1s451zaw',
            'availability': 'public',
            'duration': 59,
            'comment_count': int,
            'channel_follower_count': int,
        },
        'params': {
            'noplaylist': True,
            'skip_download': True,
        },
    }, {
        'url': 'https://youtu.be/uWyaPkt-VOI?list=PL9D9FC436B881BA21',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        mobj = self._match_valid_url(url)
        video_id = mobj.group('id')
        playlist_id = mobj.group('playlist_id')
        return self.url_result(
            update_url_query('https://www.youtube.com/watch', {
                'v': video_id,
                'list': playlist_id,
                'feature': 'youtu.be',
            }), ie=YoutubeTabIE.ie_key(), video_id=playlist_id)


class YoutubeLivestreamEmbedIE(YoutubeBaseInfoExtractor):
    IE_DESC = 'YouTube livestream embeds'
    _VALID_URL = r'https?://(?:\w+\.)?youtube\.com/embed/live_stream/?\?(?:[^#]+&)?channel=(?P<id>[^&#]+)'
    _TESTS = [{
        'url': 'https://www.youtube.com/embed/live_stream?channel=UC2_KI6RB__jGdlnK6dvFEZA',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        channel_id = self._match_id(url)
        return self.url_result(
            f'https://www.youtube.com/channel/{channel_id}/live',
            ie=YoutubeTabIE.ie_key(), video_id=channel_id)


class YoutubeYtUserIE(YoutubeBaseInfoExtractor):
    IE_DESC = 'YouTube user videos; "ytuser:" prefix'
    IE_NAME = 'youtube:user'
    _VALID_URL = r'ytuser:(?P<id>.+)'
    _TESTS = [{
        'url': 'ytuser:phihag',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        user_id = self._match_id(url)
        return self.url_result(f'https://www.youtube.com/user/{user_id}', YoutubeTabIE, user_id)


class YoutubeFavouritesIE(YoutubeBaseInfoExtractor):
    IE_NAME = 'youtube:favorites'
    IE_DESC = 'YouTube liked videos; ":ytfav" keyword (requires cookies)'
    _VALID_URL = r':ytfav(?:ou?rite)?s?'
    _LOGIN_REQUIRED = True
    _TESTS = [{
        'url': ':ytfav',
        'only_matching': True,
    }, {
        'url': ':ytfavorites',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        return self.url_result(
            'https://www.youtube.com/playlist?list=LL',
            ie=YoutubeTabIE.ie_key())


class YoutubeFeedsInfoExtractor(YoutubeBaseInfoExtractor):
    """
    Base class for feed extractors
    Subclasses must re-define the _FEED_NAME property.
    """
    _LOGIN_REQUIRED = True
    _FEED_NAME = 'feeds'

    @classproperty
    def IE_NAME(cls):
        return f'youtube:{cls._FEED_NAME}'

    def _real_extract(self, url):
        return self.url_result(
            f'https://www.youtube.com/feed/{self._FEED_NAME}', ie=YoutubeTabIE.ie_key())


class YoutubeWatchLaterIE(YoutubeBaseInfoExtractor):
    IE_NAME = 'youtube:watchlater'
    IE_DESC = 'Youtube watch later list; ":ytwatchlater" keyword (requires cookies)'
    _VALID_URL = r':ytwatchlater'
    _TESTS = [{
        'url': ':ytwatchlater',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        return self.url_result(
            'https://www.youtube.com/playlist?list=WL', ie=YoutubeTabIE.ie_key())


class YoutubeRecommendedIE(YoutubeFeedsInfoExtractor):
    IE_DESC = 'YouTube recommended videos; ":ytrec" keyword'
    _VALID_URL = r'https?://(?:www\.)?youtube\.com/?(?:[?#]|$)|:ytrec(?:ommended)?'
    _FEED_NAME = 'recommended'
    _LOGIN_REQUIRED = False
    _TESTS = [{
        'url': ':ytrec',
        'only_matching': True,
    }, {
        'url': ':ytrecommended',
        'only_matching': True,
    }, {
        'url': 'https://youtube.com',
        'only_matching': True,
    }]


class YoutubeSubscriptionsIE(YoutubeFeedsInfoExtractor):
    IE_DESC = 'YouTube subscriptions feed; ":ytsubs" keyword (requires cookies)'
    _VALID_URL = r':ytsub(?:scription)?s?'
    _FEED_NAME = 'subscriptions'
    _TESTS = [{
        'url': ':ytsubs',
        'only_matching': True,
    }, {
        'url': ':ytsubscriptions',
        'only_matching': True,
    }]


class YoutubeHistoryIE(YoutubeFeedsInfoExtractor):
    IE_DESC = 'Youtube watch history; ":ythis" keyword (requires cookies)'
    _VALID_URL = r':ythis(?:tory)?'
    _FEED_NAME = 'history'
    _TESTS = [{
        'url': ':ythistory',
        'only_matching': True,
    }]


class YoutubeShortsAudioPivotIE(YoutubeBaseInfoExtractor):
    IE_DESC = 'YouTube Shorts audio pivot (Shorts using audio of a given video)'
    IE_NAME = 'youtube:shorts:pivot:audio'
    _VALID_URL = r'https?://(?:www\.)?youtube\.com/source/(?P<id>[\w-]{11})/shorts'
    _TESTS = [{
        'url': 'https://www.youtube.com/source/Lyj-MZSAA9o/shorts',
        'only_matching': True,
    }]

    @staticmethod
    def _generate_audio_pivot_params(video_id):
        """
        Generates sfv_audio_pivot browse params for this video id
        """
        pb_params = b'\xf2\x05+\n)\x12\'\n\x0b%b\x12\x0b%b\x1a\x0b%b' % ((video_id.encode(),) * 3)
        return urllib.parse.quote(base64.b64encode(pb_params).decode())

    def _real_extract(self, url):
        video_id = self._match_id(url)
        return self.url_result(
            f'https://www.youtube.com/feed/sfv_audio_pivot?bp={self._generate_audio_pivot_params(video_id)}',
            ie=YoutubeTabIE)


class YoutubeConsentRedirectIE(YoutubeBaseInfoExtractor):
    IE_NAME = 'youtube:consent'
    IE_DESC = False  # Do not list
    _VALID_URL = r'https?://consent\.youtube\.com/m\?'
    _TESTS = [{
        'url': 'https://consent.youtube.com/m?continue=https%3A%2F%2Fwww.youtube.com%2Flive%2FqVv6vCqciTM%3Fcbrd%3D1&gl=NL&m=0&pc=yt&hl=en&src=1',
        'info_dict': {
            'id': 'qVv6vCqciTM',
            'ext': 'mp4',
            'age_limit': 0,
            'uploader_id': '@sana_natori',
            'comment_count': int,
            'chapters': 'count:13',
            'upload_date': '20221223',
            'thumbnail': 'https://i.ytimg.com/vi/qVv6vCqciTM/maxresdefault.jpg',
            'channel_url': 'https://www.youtube.com/channel/UCIdEIHpS0TdkqRkHL5OkLtA',
            'uploader_url': 'https://www.youtube.com/@sana_natori',
            'like_count': int,
            'release_date': '20221223',
            'tags': ['Vtuber', '月ノ美兎', '名取さな', 'にじさんじ', 'クリスマス', '3D配信'],
            'title': '【 #インターネット女クリスマス 】3Dで歌ってはしゃぐインターネットの女たち【月ノ美兎/名取さな】',
            'view_count': int,
            'playable_in_embed': True,
            'duration': 4438,
            'availability': 'public',
            'channel_follower_count': int,
            'channel_id': 'UCIdEIHpS0TdkqRkHL5OkLtA',
            'categories': ['Entertainment'],
            'live_status': 'was_live',
            'release_timestamp': 1671793345,
            'channel': 'さなちゃんねる',
            'description': 'md5:6aebf95cc4a1d731aebc01ad6cc9806d',
            'uploader': 'さなちゃんねる',
            'channel_is_verified': True,
            'heatmap': 'count:100',
        },
        'add_ie': ['Youtube'],
        'params': {'skip_download': 'Youtube'},
    }]

    def _real_extract(self, url):
        redirect_url = url_or_none(parse_qs(url).get('continue', [None])[-1])
        if not redirect_url:
            raise ExtractorError('Invalid cookie consent redirect URL', expected=True)
        return self.url_result(redirect_url)
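`_generate_audio_pivot_params` splices the video ID into a fixed binary protobuf template three times (`\x0b` is the 11-byte length prefix), then base64- and percent-encodes the result into the `bp` browse parameter. The encoding steps as a standalone snippet:

```python
import base64
import urllib.parse


def audio_pivot_params(video_id: str) -> str:
    # \x0b == 11, the length prefix for an 11-character video ID
    pb_params = b'\xf2\x05+\n)\x12\'\n\x0b%b\x12\x0b%b\x1a\x0b%b' % ((video_id.encode(),) * 3)
    return urllib.parse.quote(base64.b64encode(pb_params).decode())


print('https://www.youtube.com/feed/sfv_audio_pivot?bp=' + audio_pivot_params('Lyj-MZSAA9o'))
```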
167  yt_dlp/extractor/youtube/_search.py  Normal file
@@ -0,0 +1,167 @@
import urllib.parse

from ._tab import YoutubeTabBaseInfoExtractor
from ..common import SearchInfoExtractor
from ...utils import join_nonempty, parse_qs


class YoutubeSearchIE(YoutubeTabBaseInfoExtractor, SearchInfoExtractor):
    IE_DESC = 'YouTube search'
    IE_NAME = 'youtube:search'
    _SEARCH_KEY = 'ytsearch'
    _SEARCH_PARAMS = 'EgIQAfABAQ=='  # Videos only
    _TESTS = [{
        'url': 'ytsearch5:youtube-dl test video',
        'playlist_count': 5,
        'info_dict': {
            'id': 'youtube-dl test video',
            'title': 'youtube-dl test video',
        },
    }, {
        'note': 'Suicide/self-harm search warning',
        'url': 'ytsearch1:i hate myself and i wanna die',
        'playlist_count': 1,
        'info_dict': {
            'id': 'i hate myself and i wanna die',
            'title': 'i hate myself and i wanna die',
        },
    }]


class YoutubeSearchDateIE(YoutubeTabBaseInfoExtractor, SearchInfoExtractor):
    IE_NAME = YoutubeSearchIE.IE_NAME + ':date'
    _SEARCH_KEY = 'ytsearchdate'
    IE_DESC = 'YouTube search, newest videos first'
    _SEARCH_PARAMS = 'CAISAhAB8AEB'  # Videos only, sorted by date
    _TESTS = [{
        'url': 'ytsearchdate5:youtube-dl test video',
        'playlist_count': 5,
        'info_dict': {
            'id': 'youtube-dl test video',
            'title': 'youtube-dl test video',
        },
    }]


class YoutubeSearchURLIE(YoutubeTabBaseInfoExtractor):
    IE_DESC = 'YouTube search URLs with sorting and filter support'
    IE_NAME = YoutubeSearchIE.IE_NAME + '_url'
    _VALID_URL = r'https?://(?:www\.)?youtube\.com/(?:results|search)\?([^#]+&)?(?:search_query|q)=(?:[^&]+)(?:[&#]|$)'
    _TESTS = [{
        'url': 'https://www.youtube.com/results?baz=bar&search_query=youtube-dl+test+video&filters=video&lclk=video',
        'playlist_mincount': 5,
        'info_dict': {
            'id': 'youtube-dl test video',
            'title': 'youtube-dl test video',
        },
    }, {
        'url': 'https://www.youtube.com/results?search_query=python&sp=EgIQAg%253D%253D',
        'playlist_mincount': 5,
        'info_dict': {
            'id': 'python',
            'title': 'python',
        },
    }, {
        'url': 'https://www.youtube.com/results?search_query=%23cats',
        'playlist_mincount': 1,
        'info_dict': {
            'id': '#cats',
            'title': '#cats',
            # The test suite does not have support for nested playlists
            # 'entries': [{
            #     'url': r're:https://(www\.)?youtube\.com/hashtag/cats',
            #     'title': '#cats',
            # }],
        },
    }, {
        # Channel results
        'url': 'https://www.youtube.com/results?search_query=kurzgesagt&sp=EgIQAg%253D%253D',
        'info_dict': {
            'id': 'kurzgesagt',
            'title': 'kurzgesagt',
        },
        'playlist': [{
            'info_dict': {
                '_type': 'url',
                'id': 'UCsXVk37bltHxD1rDPwtNM8Q',
                'url': 'https://www.youtube.com/channel/UCsXVk37bltHxD1rDPwtNM8Q',
                'ie_key': 'YoutubeTab',
                'channel': 'Kurzgesagt – In a Nutshell',
                'description': 'md5:4ae48dfa9505ffc307dad26342d06bfc',
                'title': 'Kurzgesagt – In a Nutshell',
                'channel_id': 'UCsXVk37bltHxD1rDPwtNM8Q',
                # No longer available for search as it is set to the handle.
                # 'playlist_count': int,
                'channel_url': 'https://www.youtube.com/channel/UCsXVk37bltHxD1rDPwtNM8Q',
                'thumbnails': list,
                'uploader_id': '@kurzgesagt',
                'uploader_url': 'https://www.youtube.com/@kurzgesagt',
                'uploader': 'Kurzgesagt – In a Nutshell',
                'channel_is_verified': True,
                'channel_follower_count': int,
            },
        }],
        'params': {'extract_flat': True, 'playlist_items': '1'},
        'playlist_mincount': 1,
    }, {
        'url': 'https://www.youtube.com/results?q=test&sp=EgQIBBgB',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        qs = parse_qs(url)
        query = (qs.get('search_query') or qs.get('q'))[0]
        return self.playlist_result(self._search_results(query, qs.get('sp', (None,))[0]), query, query)


class YoutubeMusicSearchURLIE(YoutubeTabBaseInfoExtractor):
    IE_DESC = 'YouTube music search URLs with selectable sections, e.g. #songs'
    IE_NAME = 'youtube:music:search_url'
    _VALID_URL = r'https?://music\.youtube\.com/search\?([^#]+&)?(?:search_query|q)=(?:[^&]+)(?:[&#]|$)'
    _TESTS = [{
        'url': 'https://music.youtube.com/search?q=royalty+free+music',
        'playlist_count': 16,
        'info_dict': {
            'id': 'royalty free music',
            'title': 'royalty free music',
        },
    }, {
        'url': 'https://music.youtube.com/search?q=royalty+free+music&sp=EgWKAQIIAWoKEAoQAxAEEAkQBQ%3D%3D',
        'playlist_mincount': 30,
        'info_dict': {
            'id': 'royalty free music - songs',
            'title': 'royalty free music - songs',
        },
        'params': {'extract_flat': 'in_playlist'},
    }, {
        'url': 'https://music.youtube.com/search?q=royalty+free+music#community+playlists',
        'playlist_mincount': 30,
        'info_dict': {
            'id': 'royalty free music - community playlists',
            'title': 'royalty free music - community playlists',
        },
        'params': {'extract_flat': 'in_playlist'},
    }]

    _SECTIONS = {
        'albums': 'EgWKAQIYAWoKEAoQAxAEEAkQBQ==',
        'artists': 'EgWKAQIgAWoKEAoQAxAEEAkQBQ==',
        'community playlists': 'EgeKAQQoAEABagoQChADEAQQCRAF',
        'featured playlists': 'EgeKAQQoADgBagwQAxAJEAQQDhAKEAU==',
        'songs': 'EgWKAQIIAWoKEAoQAxAEEAkQBQ==',
        'videos': 'EgWKAQIQAWoKEAoQAxAEEAkQBQ==',
    }

    def _real_extract(self, url):
        qs = parse_qs(url)
        query = (qs.get('search_query') or qs.get('q'))[0]
        params = qs.get('sp', (None,))[0]
        if params:
            section = next((k for k, v in self._SECTIONS.items() if v == params), params)
        else:
            section = urllib.parse.unquote_plus(([*url.split('#'), ''])[1]).lower()
            params = self._SECTIONS.get(section)
            if not params:
                section = None
        title = join_nonempty(query, section, delim=' - ')
        return self.playlist_result(self._search_results(query, params, default_client='web_music'), title, title)
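The `_SEARCH_KEY` prefixes make these extractors addressable without a URL: `ytsearch5:query` caps results at five, `ytsearchdate` sorts by date. A rough sketch of driving them through the public API (flat extraction keeps it lightweight):

```python
from yt_dlp import YoutubeDL

with YoutubeDL({'extract_flat': True, 'quiet': True}) as ydl:
    info = ydl.extract_info('ytsearch3:royalty free music', download=False)

# The query doubles as playlist id and title, as in the tests above
print(info['id'], len(info['entries']))
```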
2348  yt_dlp/extractor/youtube/_tab.py  Normal file
File diff suppressed because it is too large
30  yt_dlp/globals.py  Normal file
@@ -0,0 +1,30 @@
from collections import defaultdict

# Please Note: Due to necessary changes and the complex nature involved in the plugin/globals system,
# no backwards compatibility is guaranteed for the plugin system API.
# However, we will still try our best.


class Indirect:
    def __init__(self, initial, /):
        self.value = initial

    def __repr__(self, /):
        return f'{type(self).__name__}({self.value!r})'


postprocessors = Indirect({})
extractors = Indirect({})

# Plugins
all_plugins_loaded = Indirect(False)
plugin_specs = Indirect({})
plugin_dirs = Indirect(['default'])

plugin_ies = Indirect({})
plugin_pps = Indirect({})
plugin_ies_overrides = Indirect(defaultdict(list))

# Misc
IN_CLI = Indirect(False)
LAZY_EXTRACTORS = Indirect(None)  # `False`=force, `None`=disabled, `True`=enabled
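`Indirect` is a one-slot mutable reference: modules import the container once, and any later rebinding of `.value` is observed through every alias. That is what lets plugin loading swap extractor and postprocessor tables at runtime without re-imports. The semantics in miniature:

```python
from yt_dlp.globals import plugin_dirs

ref = plugin_dirs                 # importers hold the container, not the list
assert ref.value == ['default']

plugin_dirs.value = ['default', '/opt/my-plugins']  # rebind once...
assert ref.value[-1] == '/opt/my-plugins'           # ...visible via every alias
print(repr(ref))                  # Indirect(['default', '/opt/my-plugins'])
```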
yt_dlp/jsinterp.py
@@ -301,7 +301,7 @@ class JSInterpreter:
        OP_CHARS = '+-*/%&|^=<>!,;{}:['
        if not expr:
            return
        counters = {k: 0 for k in _MATCHING_PARENS.values()}
        counters = dict.fromkeys(_MATCHING_PARENS.values(), 0)
        start, splits, pos, delim_len = 0, 0, 0, len(delim) - 1
        in_quote, escaping, after_op, in_regex_char_group = None, False, True, False
        for idx, char in enumerate(expr):
@@ -890,9 +890,9 @@ class JSInterpreter:
        code, _ = self._separate_at_paren(func_m.group('code'))
        return [x.strip() for x in func_m.group('args').split(',')], code

    def extract_function(self, funcname):
    def extract_function(self, funcname, *global_stack):
        return function_with_repr(
            self.extract_function_from_code(*self.extract_function_code(funcname)),
            self.extract_function_from_code(*self.extract_function_code(funcname), *global_stack),
            f'F<{funcname}>')

    def extract_function_from_code(self, argnames, code, *global_stack):
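The second hunk threads an optional `*global_stack` through `extract_function` into `extract_function_from_code`, so callers can seed outer-scope variables. Basic interpreter usage, independent of that new parameter:

```python
from yt_dlp.jsinterp import JSInterpreter

jsi = JSInterpreter('function add(a, b) { return a + b; }')
add = jsi.extract_function('add')   # the wrapper is called with an argument list
assert add([40, 2]) == 42
assert jsi.call_function('add', 1, 2) == 3
```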
yt_dlp/networking/_requests.py
@@ -21,9 +21,11 @@ if urllib3 is None:
urllib3_version = tuple(int_or_none(x, default=0) for x in urllib3.__version__.split('.'))

if urllib3_version < (1, 26, 17):
    urllib3._yt_dlp__version = f'{urllib3.__version__} (unsupported)'
    raise ImportError('Only urllib3 >= 1.26.17 is supported')

if requests.__build__ < 0x023202:
    requests._yt_dlp__version = f'{requests.__version__} (unsupported)'
    raise ImportError('Only requests >= 2.32.2 is supported')

import requests.adapters
@@ -296,6 +298,7 @@ class RequestsRH(RequestHandler, InstanceStoreMixin):
        extensions.pop('cookiejar', None)
        extensions.pop('timeout', None)
        extensions.pop('legacy_ssl', None)
        extensions.pop('keep_header_casing', None)

    def _create_instance(self, cookiejar, legacy_ssl_support=None):
        session = RequestsSession()
@@ -312,11 +315,12 @@ class RequestsRH(RequestHandler, InstanceStoreMixin):
        session.trust_env = False  # no need, we already load proxies from env
        return session

    def _send(self, request):

        headers = self._merge_headers(request.headers)
    def _prepare_headers(self, _, headers):
        add_accept_encoding_header(headers, SUPPORTED_ENCODINGS)

    def _send(self, request):

        headers = self._get_headers(request)
        max_redirects_exceeded = False

        session = self._get_instance(
yt_dlp/networking/_urllib.py
@@ -379,13 +379,15 @@ class UrllibRH(RequestHandler, InstanceStoreMixin):
        opener.addheaders = []
        return opener

    def _send(self, request):
        headers = self._merge_headers(request.headers)
    def _prepare_headers(self, _, headers):
        add_accept_encoding_header(headers, SUPPORTED_ENCODINGS)

    def _send(self, request):
        headers = self._get_headers(request)
        urllib_req = urllib.request.Request(
            url=request.url,
            data=request.data,
            headers=dict(headers),
            headers=headers,
            method=request.method,
        )
yt_dlp/networking/_websockets.py
@@ -34,6 +34,7 @@ import websockets.version

websockets_version = tuple(map(int_or_none, websockets.version.version.split('.')))
if websockets_version < (13, 0):
    websockets._yt_dlp__version = f'{websockets.version.version} (unsupported)'
    raise ImportError('Only websockets>=13.0 is supported')

import websockets.sync.client
@@ -116,6 +117,7 @@ class WebsocketsRH(WebSocketRequestHandler):
        extensions.pop('timeout', None)
        extensions.pop('cookiejar', None)
        extensions.pop('legacy_ssl', None)
        extensions.pop('keep_header_casing', None)

    def close(self):
        # Remove the logging handler that contains a reference to our logger
@@ -123,15 +125,16 @@ class WebsocketsRH(WebSocketRequestHandler):
        for name, handler in self.__logging_handlers.items():
            logging.getLogger(name).removeHandler(handler)

    def _send(self, request):
        timeout = self._calculate_timeout(request)
        headers = self._merge_headers(request.headers)
    def _prepare_headers(self, request, headers):
        if 'cookie' not in headers:
            cookiejar = self._get_cookiejar(request)
            cookie_header = cookiejar.get_cookie_header(request.url)
            if cookie_header:
                headers['cookie'] = cookie_header

    def _send(self, request):
        timeout = self._calculate_timeout(request)
        headers = self._get_headers(request)
        wsuri = parse_uri(request.url)
        create_conn_kwargs = {
            'source_address': (self.source_address, 0) if self.source_address else None,
yt_dlp/networking/common.py
@@ -206,6 +206,7 @@ class RequestHandler(abc.ABC):
    - `cookiejar`: Cookiejar to use for this request.
    - `timeout`: socket timeout to use for this request.
    - `legacy_ssl`: Enable legacy SSL options for this request. See legacy_ssl_support.
    - `keep_header_casing`: Keep the casing of headers when sending the request.
    To enable these, add extensions.pop('<extension>', None) to _check_extensions

    Apart from the url protocol, proxies dict may contain the following keys:
@@ -259,6 +260,23 @@ class RequestHandler(abc.ABC):
    def _merge_headers(self, request_headers):
        return HTTPHeaderDict(self.headers, request_headers)

    def _prepare_headers(self, request: Request, headers: HTTPHeaderDict) -> None:  # noqa: B027
        """Additional operations to prepare headers before building. To be extended by subclasses.
        @param request: Request object
        @param headers: Merged headers to prepare
        """

    def _get_headers(self, request: Request) -> dict[str, str]:
        """
        Get headers for external use.
        Subclasses may define a _prepare_headers method to modify headers after merge but before building.
        """
        headers = self._merge_headers(request.headers)
        self._prepare_headers(request, headers)
        if request.extensions.get('keep_header_casing'):
            return headers.sensitive()
        return dict(headers)

    def _calculate_timeout(self, request):
        return float(request.extensions.get('timeout') or self.timeout)
@@ -317,6 +335,7 @@ class RequestHandler(abc.ABC):
        assert isinstance(extensions.get('cookiejar'), (YoutubeDLCookieJar, NoneType))
        assert isinstance(extensions.get('timeout'), (float, int, NoneType))
        assert isinstance(extensions.get('legacy_ssl'), (bool, NoneType))
        assert isinstance(extensions.get('keep_header_casing'), (bool, NoneType))

    def _validate(self, request):
        self._check_url_scheme(request)
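The `_prepare_headers`/`_get_headers` split gives each handler one hook where merged headers are finalized, and the new `keep_header_casing` extension opts a single request out of the usual title-casing by returning `headers.sensitive()`. A hedged sketch of the caller side (the URL is a placeholder):

```python
from yt_dlp import YoutubeDL
from yt_dlp.networking import Request

with YoutubeDL() as ydl:
    # Without the extension, the header would be sent as 'X-Goog-Token';
    # with it, _get_headers() preserves 'X-GoOg-Token' exactly as written.
    response = ydl.urlopen(Request(
        'https://example.com/headers',  # placeholder endpoint
        headers={'X-GoOg-Token': 'abc'},
        extensions={'keep_header_casing': True},
    ))
    print(response.status)
```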
yt_dlp/networking/impersonate.py
@@ -5,11 +5,11 @@ from abc import ABC
from dataclasses import dataclass
from typing import Any

from .common import RequestHandler, register_preference
from .common import RequestHandler, register_preference, Request
from .exceptions import UnsupportedRequest
from ..compat.types import NoneType
from ..utils import classproperty, join_nonempty
from ..utils.networking import std_headers
from ..utils.networking import std_headers, HTTPHeaderDict


@dataclass(order=True, frozen=True)
@@ -123,7 +123,17 @@ class ImpersonateRequestHandler(RequestHandler, ABC):
        """Get the requested target for the request"""
        return self._resolve_target(request.extensions.get('impersonate') or self.impersonate)

    def _get_impersonate_headers(self, request):
    def _prepare_impersonate_headers(self, request: Request, headers: HTTPHeaderDict) -> None:  # noqa: B027
        """Additional operations to prepare headers before building. To be extended by subclasses.
        @param request: Request object
        @param headers: Merged headers to prepare
        """

    def _get_impersonate_headers(self, request: Request) -> dict[str, str]:
        """
        Get headers for external impersonation use.
        Subclasses may define a _prepare_impersonate_headers method to modify headers after merge but before building.
        """
        headers = self._merge_headers(request.headers)
        if self._get_request_target(request) is not None:
            # remove all headers present in std_headers
@@ -131,7 +141,11 @@ class ImpersonateRequestHandler(RequestHandler, ABC):
            for k, v in std_headers.items():
                if headers.get(k) == v:
                    headers.pop(k)
        return headers

        self._prepare_impersonate_headers(request, headers)
        if request.extensions.get('keep_header_casing'):
            return headers.sensitive()
        return dict(headers)


@register_preference(ImpersonateRequestHandler)
yt_dlp/options.py
@@ -398,7 +398,7 @@ def create_parser():
            '(Alias: --no-config)'))
    general.add_option(
        '--no-config-locations',
        action='store_const', dest='config_locations', const=[],
        action='store_const', dest='config_locations', const=None,
        help=(
            'Do not load any custom configuration files (default). When given inside a '
            'configuration file, ignore all previous --config-locations defined in the current file'))
@@ -410,12 +410,21 @@ def create_parser():
            '("-" for stdin). Can be used multiple times and inside other configuration files'))
    general.add_option(
        '--plugin-dirs',
        dest='plugin_dirs', metavar='PATH', action='append',
        metavar='PATH',
        dest='plugin_dirs',
        action='callback',
        callback=_list_from_options_callback,
        type='str',
        callback_kwargs={'delim': None},
        default=['default'],
        help=(
            'Path to an additional directory to search for plugins. '
            'This option can be used multiple times to add multiple directories. '
            'Note that this currently only works for extractor plugins; '
            'postprocessor plugins can only be loaded from the default plugin directories'))
            'Use "default" to search the default plugin directories (default)'))
    general.add_option(
        '--no-plugin-dirs',
        dest='plugin_dirs', action='store_const', const=[],
        help='Clear plugin directories to search, including defaults and those provided by previous --plugin-dirs')
    general.add_option(
        '--flat-playlist',
        action='store_const', dest='extract_flat', const='in_playlist', default=False,
yt_dlp/plugins.py
@@ -1,4 +1,5 @@
import contextlib
import dataclasses
import functools
import importlib
import importlib.abc
@@ -14,17 +15,48 @@ import zipimport
from pathlib import Path
from zipfile import ZipFile

from .globals import (
    Indirect,
    plugin_dirs,
    all_plugins_loaded,
    plugin_specs,
)

from .utils import (
    Config,
    get_executable_path,
    get_system_config_dirs,
    get_user_config_dirs,
    merge_dicts,
    orderedSet,
    write_string,
)

PACKAGE_NAME = 'yt_dlp_plugins'
COMPAT_PACKAGE_NAME = 'ytdlp_plugins'
_BASE_PACKAGE_PATH = Path(__file__).parent


# Please Note: Due to necessary changes and the complex nature involved,
# no backwards compatibility is guaranteed for the plugin system API.
# However, we will still try our best.

__all__ = [
    'COMPAT_PACKAGE_NAME',
    'PACKAGE_NAME',
    'PluginSpec',
    'directories',
    'load_all_plugins',
    'load_plugins',
    'register_plugin_spec',
]


@dataclasses.dataclass
class PluginSpec:
    module_name: str
    suffix: str
    destination: Indirect
    plugin_destination: Indirect


class PluginLoader(importlib.abc.Loader):
@@ -44,7 +76,42 @@ def dirs_in_zip(archive):
        pass
    except Exception as e:
        write_string(f'WARNING: Could not read zip file {archive}: {e}\n')
    return set()
    return ()


def default_plugin_paths():
    def _get_package_paths(*root_paths, containing_folder):
        for config_dir in orderedSet(map(Path, root_paths), lazy=True):
            # We need to filter the base path added when running __main__.py directly
            if config_dir == _BASE_PACKAGE_PATH:
                continue
            with contextlib.suppress(OSError):
                yield from (config_dir / containing_folder).iterdir()

    # Load from yt-dlp config folders
    yield from _get_package_paths(
        *get_user_config_dirs('yt-dlp'),
        *get_system_config_dirs('yt-dlp'),
        containing_folder='plugins',
    )

    # Load from yt-dlp-plugins folders
    yield from _get_package_paths(
        get_executable_path(),
        *get_user_config_dirs(''),
        *get_system_config_dirs(''),
        containing_folder='yt-dlp-plugins',
    )

    # Load from PYTHONPATH directories
    yield from (path for path in map(Path, sys.path) if path != _BASE_PACKAGE_PATH)


def candidate_plugin_paths(candidate):
    candidate_path = Path(candidate)
    if not candidate_path.is_dir():
        raise ValueError(f'Invalid plugin directory: {candidate_path}')
    yield from candidate_path.iterdir()


class PluginFinder(importlib.abc.MetaPathFinder):
@@ -56,40 +123,16 @@ class PluginFinder(importlib.abc.MetaPathFinder):

    def __init__(self, *packages):
        self._zip_content_cache = {}
        self.packages = set(itertools.chain.from_iterable(
            itertools.accumulate(name.split('.'), lambda a, b: '.'.join((a, b)))
            for name in packages))
        self.packages = set(
            itertools.chain.from_iterable(
                itertools.accumulate(name.split('.'), lambda a, b: '.'.join((a, b)))
                for name in packages))

    def search_locations(self, fullname):
        candidate_locations = []

        def _get_package_paths(*root_paths, containing_folder='plugins'):
            for config_dir in orderedSet(map(Path, root_paths), lazy=True):
                with contextlib.suppress(OSError):
                    yield from (config_dir / containing_folder).iterdir()

        # Load from yt-dlp config folders
        candidate_locations.extend(_get_package_paths(
            *get_user_config_dirs('yt-dlp'),
            *get_system_config_dirs('yt-dlp'),
            containing_folder='plugins'))

        # Load from yt-dlp-plugins folders
        candidate_locations.extend(_get_package_paths(
            get_executable_path(),
            *get_user_config_dirs(''),
            *get_system_config_dirs(''),
            containing_folder='yt-dlp-plugins'))

        candidate_locations.extend(map(Path, sys.path))  # PYTHONPATH
        with contextlib.suppress(ValueError):  # Added when running __main__.py directly
            candidate_locations.remove(Path(__file__).parent)

        # TODO(coletdjnz): remove when plugin globals system is implemented
        if Config._plugin_dirs:
            candidate_locations.extend(_get_package_paths(
                *Config._plugin_dirs,
                containing_folder=''))
        candidate_locations = itertools.chain.from_iterable(
            default_plugin_paths() if candidate == 'default' else candidate_plugin_paths(candidate)
            for candidate in plugin_dirs.value
        )

        parts = Path(*fullname.split('.'))
        for path in orderedSet(candidate_locations, lazy=True):
@@ -109,7 +152,8 @@ class PluginFinder(importlib.abc.MetaPathFinder):

        search_locations = list(map(str, self.search_locations(fullname)))
        if not search_locations:
            return None
            # Prevent using built-in meta finders for searching plugins.
            raise ModuleNotFoundError(fullname)

        spec = importlib.machinery.ModuleSpec(fullname, PluginLoader(), is_package=True)
        spec.submodule_search_locations = search_locations
@@ -123,8 +167,10 @@ class PluginFinder(importlib.abc.MetaPathFinder):


def directories():
    spec = importlib.util.find_spec(PACKAGE_NAME)
    return spec.submodule_search_locations if spec else []
    with contextlib.suppress(ModuleNotFoundError):
        if spec := importlib.util.find_spec(PACKAGE_NAME):
            return list(spec.submodule_search_locations)
    return []


def iter_modules(subpackage):
@@ -134,19 +180,23 @@ def iter_modules(subpackage):
        yield from pkgutil.iter_modules(path=pkg.__path__, prefix=f'{fullname}.')


def load_module(module, module_name, suffix):
def get_regular_classes(module, module_name, suffix):
    # Find standard public plugin classes (not overrides)
    return inspect.getmembers(module, lambda obj: (
        inspect.isclass(obj)
        and obj.__name__.endswith(suffix)
        and obj.__module__.startswith(module_name)
        and not obj.__name__.startswith('_')
        and obj.__name__ in getattr(module, '__all__', [obj.__name__])))
        and obj.__name__ in getattr(module, '__all__', [obj.__name__])
        and getattr(obj, 'PLUGIN_NAME', None) is None
    ))


def load_plugins(name, suffix):
    classes = {}
    if os.environ.get('YTDLP_NO_PLUGINS'):
        return classes
def load_plugins(plugin_spec: PluginSpec):
    name, suffix = plugin_spec.module_name, plugin_spec.suffix
    regular_classes = {}
    if os.environ.get('YTDLP_NO_PLUGINS') or not plugin_dirs.value:
        return regular_classes

    for finder, module_name, _ in iter_modules(name):
        if any(x.startswith('_') for x in module_name.split('.')):
@@ -163,24 +213,42 @@ def load_plugins(name, suffix):
            sys.modules[module_name] = module
            spec.loader.exec_module(module)
        except Exception:
            write_string(f'Error while importing module {module_name!r}\n{traceback.format_exc(limit=-1)}')
            write_string(
                f'Error while importing module {module_name!r}\n{traceback.format_exc(limit=-1)}',
            )
            continue
        classes.update(load_module(module, module_name, suffix))
        regular_classes.update(get_regular_classes(module, module_name, suffix))

    # Compat: old plugin system using __init__.py
    # Note: plugins imported this way do not show up in directories()
    # nor are considered part of the yt_dlp_plugins namespace package
    with contextlib.suppress(FileNotFoundError):
        spec = importlib.util.spec_from_file_location(
            name, Path(get_executable_path(), COMPAT_PACKAGE_NAME, name, '__init__.py'))
        plugins = importlib.util.module_from_spec(spec)
        sys.modules[spec.name] = plugins
        spec.loader.exec_module(plugins)
        classes.update(load_module(plugins, spec.name, suffix))
    if 'default' in plugin_dirs.value:
        with contextlib.suppress(FileNotFoundError):
            spec = importlib.util.spec_from_file_location(
                name,
                Path(get_executable_path(), COMPAT_PACKAGE_NAME, name, '__init__.py'),
            )
            plugins = importlib.util.module_from_spec(spec)
            sys.modules[spec.name] = plugins
            spec.loader.exec_module(plugins)
            regular_classes.update(get_regular_classes(plugins, spec.name, suffix))

    return classes
    # Add the classes into the global plugin lookup for that type
    plugin_spec.plugin_destination.value = regular_classes
    # We want to prepend to the main lookup for that type
    plugin_spec.destination.value = merge_dicts(regular_classes, plugin_spec.destination.value)

    return regular_classes


sys.meta_path.insert(0, PluginFinder(f'{PACKAGE_NAME}.extractor', f'{PACKAGE_NAME}.postprocessor'))
def load_all_plugins():
    for plugin_spec in plugin_specs.value.values():
        load_plugins(plugin_spec)
    all_plugins_loaded.value = True


__all__ = ['COMPAT_PACKAGE_NAME', 'PACKAGE_NAME', 'directories', 'load_plugins']

def register_plugin_spec(plugin_spec: PluginSpec):
    # If the plugin spec for a module is already registered, it will not be added again
    if plugin_spec.module_name not in plugin_specs.value:
        plugin_specs.value[plugin_spec.module_name] = plugin_spec
        sys.meta_path.insert(0, PluginFinder(f'{PACKAGE_NAME}.{plugin_spec.module_name}'))
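`register_plugin_spec` is now the single entry point for declaring where a component's plugins land: the spec names the plugin submodule and suffix and carries the two `Indirect` destinations, while `load_all_plugins` replays every registered spec. A condensed sketch mirroring what the postprocessor package does below:

```python
from yt_dlp.globals import plugin_pps, postprocessors
from yt_dlp.plugins import PluginSpec, load_all_plugins, register_plugin_spec

register_plugin_spec(PluginSpec(
    module_name='postprocessor',    # searched as yt_dlp_plugins.postprocessor
    suffix='PP',                    # only classes named *PP are collected
    destination=postprocessors,     # main lookup; plugins are prepended
    plugin_destination=plugin_pps,  # plugin-only lookup
))
load_all_plugins()                  # imports plugin modules, fills both tables
print(sorted(plugin_pps.value))     # names of any discovered plugin PPs
```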
yt_dlp/postprocessor/__init__.py
@@ -33,15 +33,38 @@ from .movefilesafterdownload import MoveFilesAfterDownloadPP
from .sponskrub import SponSkrubPP
from .sponsorblock import SponsorBlockPP
from .xattrpp import XAttrMetadataPP
from ..plugins import load_plugins
from ..globals import plugin_pps, postprocessors
from ..plugins import PACKAGE_NAME, register_plugin_spec, PluginSpec
from ..utils import deprecation_warning

_PLUGIN_CLASSES = load_plugins('postprocessor', 'PP')

def __getattr__(name):
    lookup = plugin_pps.value
    if name in lookup:
        deprecation_warning(
            f'Importing a plugin Post-Processor from {__name__} is deprecated. '
            f'Please import {PACKAGE_NAME}.postprocessor.{name} instead.')
        return lookup[name]

    raise AttributeError(f'module {__name__!r} has no attribute {name!r}')


def get_postprocessor(key):
    return globals()[key + 'PP']
    return postprocessors.value[key + 'PP']


globals().update(_PLUGIN_CLASSES)
__all__ = [name for name in globals() if name.endswith('PP')]
__all__.extend(('FFmpegPostProcessor', 'PostProcessor'))
register_plugin_spec(PluginSpec(
    module_name='postprocessor',
    suffix='PP',
    destination=postprocessors,
    plugin_destination=plugin_pps,
))

_default_pps = {
    name: value
    for name, value in globals().items()
    if name.endswith('PP') or name in ('FFmpegPostProcessor', 'PostProcessor')
}
postprocessors.value.update(_default_pps)

__all__ = list(_default_pps.values())
yt_dlp/postprocessor/common.py
@@ -10,6 +10,7 @@ from ..utils import (
    _configuration_args,
    deprecation_warning,
)
from ..utils._utils import _ProgressState


class PostProcessorMetaClass(type):
@@ -189,7 +190,7 @@ class PostProcessor(metaclass=PostProcessorMetaClass):

        self._downloader.to_console_title(self._downloader.evaluate_outtmpl(
            progress_template.get('postprocess-title') or 'yt-dlp %(progress._default_template)s',
            progress_dict))
            progress_dict), _ProgressState.from_dict(s), s.get('_percent'))

    def _retry_download(self, err, count, retries):
        # While this is not an extractor, it behaves similar to one and
yt_dlp/postprocessor/ffmpeg.py
@@ -202,7 +202,7 @@ class FFmpegPostProcessor(PostProcessor):

    @property
    def available(self):
        return self.basename is not None
        return bool(self._ffmpeg_location.get()) or self.basename is not None

    @property
    def executable(self):
@@ -743,7 +743,7 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
            if value not in ('', None):
                value = ', '.join(map(str, variadic(value)))
                value = value.replace('\0', '')  # nul character cannot be passed in command line
                metadata['common'].update({meta_f: value for meta_f in variadic(meta_list)})
                metadata['common'].update(dict.fromkeys(variadic(meta_list), value))

        # Info on media metadata/metadata supported by ffmpeg:
        # https://wiki.multimedia.cx/index.php/FFmpeg_Metadata
yt_dlp/update.py
@@ -117,7 +117,7 @@ _FILE_SUFFIXES = {
}

_NON_UPDATEABLE_REASONS = {
    **{variant: None for variant in _FILE_SUFFIXES},  # Updatable
    **dict.fromkeys(_FILE_SUFFIXES),  # Updatable
    **{variant: f'Auto-update is not supported for unpackaged {name} executable; Re-download the latest release'
       for variant, name in {'win32_dir': 'Windows', 'darwin_dir': 'MacOS', 'linux_dir': 'Linux'}.items()},
    'py2exe': 'py2exe is no longer supported by yt-dlp; This executable cannot be updated',
yt_dlp/utils/_utils.py
@@ -8,6 +8,7 @@ import contextlib
import datetime as dt
import email.header
import email.utils
import enum
import errno
import functools
import hashlib
@@ -51,6 +52,7 @@ from ..compat import (
    compat_HTMLParseError,
)
from ..dependencies import xattr
from ..globals import IN_CLI

__name__ = __name__.rsplit('.', 1)[0]  # noqa: A001: Pretend to be the parent module
@@ -1486,8 +1488,7 @@ def write_string(s, out=None, encoding=None):

# TODO: Use global logger
def deprecation_warning(msg, *, printer=None, stacklevel=0, **kwargs):
    from .. import _IN_CLI
    if _IN_CLI:
    if IN_CLI.value:
        if msg in deprecation_warning._cache:
            return
        deprecation_warning._cache.add(msg)
@@ -3246,7 +3247,7 @@ def _match_one(filter_part, dct, incomplete):
        op = lambda attr, value: not unnegated_op(attr, value)
    else:
        op = unnegated_op
    comparison_value = m['quotedstrval'] or m['strval'] or m['intval']
    comparison_value = m['quotedstrval'] or m['strval']
    if m['quote']:
        comparison_value = comparison_value.replace(r'\{}'.format(m['quote']), m['quote'])
    actual_value = dct.get(m['key'])
@@ -4890,10 +4891,6 @@ class Config:
    filename = None
    __initialized = False

    # Internal only, do not use! Hack to enable --plugin-dirs
    # TODO(coletdjnz): remove when plugin globals system is implemented
    _plugin_dirs = None

    def __init__(self, parser, label=None):
        self.parser, self.label = parser, label
        self._loaded_paths, self.configs = set(), []
@@ -5677,3 +5674,32 @@ class _YDLLogger:
    def stderr(self, message):
        if self._ydl:
            self._ydl.to_stderr(message)


class _ProgressState(enum.Enum):
    """
    Represents a state for a progress bar.

    See: https://conemu.github.io/en/AnsiEscapeCodes.html#ConEmu_specific_OSC
    """

    HIDDEN = 0
    INDETERMINATE = 3
    VISIBLE = 1
    WARNING = 4
    ERROR = 2

    @classmethod
    def from_dict(cls, s, /):
        if s['status'] == 'finished':
            return cls.INDETERMINATE

        # Not currently used
        if s['status'] == 'error':
            return cls.ERROR

        return cls.INDETERMINATE if s.get('_percent') is None else cls.VISIBLE

    def get_ansi_escape(self, /, percent=None):
        percent = 0 if percent is None else int(percent)
        return f'\033]9;4;{self.value};{percent}\007'
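`get_ansi_escape` emits the ConEmu `OSC 9;4` sequence that Windows Terminal also understands: the first parameter is the state (0 hidden, 1 visible, 2 error, 3 indeterminate, 4 warning), the second the percentage. The same sequence rendered standalone:

```python
import sys


# OSC 9;4 progress: ESC ] 9 ; 4 ; <state> ; <percent> BEL
def progress_escape(state: int, percent: int = 0) -> str:
    return f'\033]9;4;{state};{percent}\007'


sys.stdout.write(progress_escape(1, 42))  # show a 42% taskbar progress bar
sys.stdout.write(progress_escape(0))      # hide it again
```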
yt_dlp/utils/networking.py
@@ -1,9 +1,16 @@
from __future__ import annotations

import collections
import collections.abc
import random
import typing
import urllib.parse
import urllib.request

from ._utils import remove_start
if typing.TYPE_CHECKING:
    T = typing.TypeVar('T')

from ._utils import NO_DEFAULT, remove_start


def random_user_agent():
@@ -51,32 +58,141 @@ def random_user_agent():
    return _USER_AGENT_TPL % random.choice(_CHROME_VERSIONS)


class HTTPHeaderDict(collections.UserDict, dict):
class HTTPHeaderDict(dict):
    """
    Store and access keys case-insensitively.
    The constructor can take multiple dicts, in which keys in the latter are prioritised.

    Retains a case sensitive mapping of the headers, which can be accessed via `.sensitive()`.
    """
    def __new__(cls, *args: typing.Any, **kwargs: typing.Any) -> typing.Self:
        obj = dict.__new__(cls, *args, **kwargs)
        obj.__sensitive_map = {}
        return obj

    def __init__(self, *args, **kwargs):
    def __init__(self, /, *args, **kwargs):
        super().__init__()
        for dct in args:
            if dct is not None:
                self.update(dct)
        self.update(kwargs)
        self.__sensitive_map = {}

    def __setitem__(self, key, value):
        if isinstance(value, bytes):
            value = value.decode('latin-1')
        super().__setitem__(key.title(), str(value).strip())
        for dct in filter(None, args):
            self.update(dct)
        if kwargs:
            self.update(kwargs)

    def __getitem__(self, key):
    def sensitive(self, /) -> dict[str, str]:
        return {
            self.__sensitive_map[key]: value
            for key, value in self.items()
        }

    def __contains__(self, key: str, /) -> bool:
        return super().__contains__(key.title() if isinstance(key, str) else key)

    def __delitem__(self, key: str, /) -> None:
        key = key.title()
        del self.__sensitive_map[key]
        super().__delitem__(key)

    def __getitem__(self, key, /) -> str:
        return super().__getitem__(key.title())

    def __delitem__(self, key):
        super().__delitem__(key.title())
    def __ior__(self, other, /):
        if isinstance(other, type(self)):
            other = other.sensitive()
        if isinstance(other, dict):
            self.update(other)
            return
        return NotImplemented

    def __contains__(self, key):
        return super().__contains__(key.title() if isinstance(key, str) else key)
    def __or__(self, other, /) -> typing.Self:
        if isinstance(other, type(self)):
            other = other.sensitive()
        if isinstance(other, dict):
            return type(self)(self.sensitive(), other)
        return NotImplemented

    def __ror__(self, other, /) -> typing.Self:
        if isinstance(other, type(self)):
            other = other.sensitive()
        if isinstance(other, dict):
            return type(self)(other, self.sensitive())
        return NotImplemented

    def __setitem__(self, key: str, value, /) -> None:
        if isinstance(value, bytes):
            value = value.decode('latin-1')
        key_title = key.title()
        self.__sensitive_map[key_title] = key
        super().__setitem__(key_title, str(value).strip())

    def clear(self, /) -> None:
        self.__sensitive_map.clear()
        super().clear()

    def copy(self, /) -> typing.Self:
        return type(self)(self.sensitive())

    @typing.overload
    def get(self, key: str, /) -> str | None: ...

    @typing.overload
    def get(self, key: str, /, default: T) -> str | T: ...

    def get(self, key, /, default=NO_DEFAULT):
        key = key.title()
        if default is NO_DEFAULT:
            return super().get(key)
        return super().get(key, default)

    @typing.overload
    def pop(self, key: str, /) -> str: ...

    @typing.overload
    def pop(self, key: str, /, default: T) -> str | T: ...

    def pop(self, key, /, default=NO_DEFAULT):
        key = key.title()
        if default is NO_DEFAULT:
            self.__sensitive_map.pop(key)
            return super().pop(key)
        self.__sensitive_map.pop(key, default)
        return super().pop(key, default)

    def popitem(self) -> tuple[str, str]:
        self.__sensitive_map.popitem()
        return super().popitem()

    @typing.overload
    def setdefault(self, key: str, /) -> str: ...

    @typing.overload
    def setdefault(self, key: str, /, default) -> str: ...

    def setdefault(self, key, /, default=None) -> str:
        key = key.title()
        if key in self.__sensitive_map:
            return super().__getitem__(key)

        self[key] = default or ''
        return self[key]

    def update(self, other, /, **kwargs) -> None:
        if isinstance(other, type(self)):
            other = other.sensitive()
        if isinstance(other, collections.abc.Mapping):
            for key, value in other.items():
                self[key] = value

        elif hasattr(other, 'keys'):
            for key in other.keys():  # noqa: SIM118
                self[key] = other[key]

        else:
            for key, value in other:
                self[key] = value

        for key, value in kwargs.items():
            self[key] = value


std_headers = HTTPHeaderDict({
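The rewritten `HTTPHeaderDict` stores keys title-cased for case-insensitive access while `__sensitive_map` remembers each key's original spelling, which `.sensitive()` reproduces (and `keep_header_casing` ultimately consumes). Its observable behavior in a few lines:

```python
from yt_dlp.utils.networking import HTTPHeaderDict

headers = HTTPHeaderDict({'x-yt-TOKEN': 'abc'}, {'accept': b'*/*'})
assert headers['X-YT-token'] == 'abc'  # lookups are case-insensitive
assert dict(headers) == {'X-Yt-Token': 'abc', 'Accept': '*/*'}
assert headers.sensitive() == {'x-yt-TOKEN': 'abc', 'accept': '*/*'}  # original casing
```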
Some files were not shown because too many files have changed in this diff