mirror of
https://github.com/yt-dlp/yt-dlp.git
synced 2026-01-12 09:51:15 +00:00
Compare commits
36 Commits
2024.12.03...2025.01.12
| Author | SHA1 | Date |
|---|---|---|
| | a3c0321825 | |
| | dade5e35c8 | |
| | e2ef4fece6 | |
| | 1f489f4a45 | |
| | 75079f4e3f | |
| | 712d2abb32 | |
| | 8346b54915 | |
| | 1f4e1e85a2 | |
| | 763ed06ee6 | |
| | 3c14e9191f | |
| | 0b6b7742c2 | |
| | 3905f64920 | |
| | 65cf46cddd | |
| | 9f42e68a74 | |
| | 6fc85f617a | |
| | d298693b1b | |
| | 09a6c68712 | |
| | 1a8851b689 | |
| | b91c3925c2 | |
| | 3d3ee458c1 | |
| | 2037a6414f | |
| | 5421669626 | |
| | dc3c4fddcc | |
| | 5460cd9189 | |
| | f6c73aad5f | |
| | d5e2a379f2 | |
| | bc262bcad4 | |
| | f4d3e9e6dc | |
| | 6fef824025 | |
| | 4bd2655398 | |
| | a95ee6d880 | |
| | 4c85ccd136 | |
| | 2feb28028e | |
| | fca3eb5f8b | |
| | 2e49c789d3 | |
| | 354cb4026c | |
@@ -710,3 +710,8 @@ subrat-lima
 gitninja1234
 jkruse
 xiaomac
+wesson09
+Crypto90
+MutantPiggieGolem1
+Sanceilaks
+Strkmn
62
Changelog.md
@@ -4,6 +4,68 @@
|
||||
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
|
||||
-->
|
||||
|
||||
### 2025.01.12
|
||||
|
||||
#### Core changes
|
||||
- [Fix filename sanitization with `--no-windows-filenames`](https://github.com/yt-dlp/yt-dlp/commit/8346b549150003df988538e54c9d8bc4de568979) ([#11988](https://github.com/yt-dlp/yt-dlp/issues/11988)) by [bashonly](https://github.com/bashonly)
|
||||
- [Validate retries values are non-negative](https://github.com/yt-dlp/yt-dlp/commit/1f4e1e85a27c5b43e34d7706cfd88ffce1b56a4a) ([#11927](https://github.com/yt-dlp/yt-dlp/issues/11927)) by [Strkmn](https://github.com/Strkmn)
|
||||
|
||||
#### Extractor changes
|
||||
- **drtalks**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/1f489f4a45691cac3f9e787d22a3a8a086229ba6) ([#10831](https://github.com/yt-dlp/yt-dlp/issues/10831)) by [pzhlkj6612](https://github.com/pzhlkj6612), [seproDev](https://github.com/seproDev)
|
||||
- **plvideo**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/3c14e9191f3035b9a729d1d87bc0381f42de57cf) ([#10657](https://github.com/yt-dlp/yt-dlp/issues/10657)) by [Sanceilaks](https://github.com/Sanceilaks), [seproDev](https://github.com/seproDev)
|
||||
- **vine**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/e2ef4fece6c9742d1733e3bae408c4787765f78c) ([#11700](https://github.com/yt-dlp/yt-dlp/issues/11700)) by [allendema](https://github.com/allendema)
|
||||
- **xiaohongshu**: [Extend `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/763ed06ee69f13949397897bd42ff2ec3dc3d384) ([#11806](https://github.com/yt-dlp/yt-dlp/issues/11806)) by [HobbyistDev](https://github.com/HobbyistDev)
|
||||
- **youtube**
|
||||
- [Fix DASH formats incorrectly skipped in some situations](https://github.com/yt-dlp/yt-dlp/commit/0b6b7742c2e7f2a1fcb0b54ef3dd484bab404b3f) ([#11910](https://github.com/yt-dlp/yt-dlp/issues/11910)) by [coletdjnz](https://github.com/coletdjnz)
|
||||
- [Refactor cookie auth](https://github.com/yt-dlp/yt-dlp/commit/75079f4e3f7dce49b61ef01da7adcd9876a0ca3b) ([#11989](https://github.com/yt-dlp/yt-dlp/issues/11989)) by [coletdjnz](https://github.com/coletdjnz)
|
||||
- [Use `tv` instead of `mweb` client by default](https://github.com/yt-dlp/yt-dlp/commit/712d2abb32f59b2d246be2901255f84f1a4c30b3) ([#12059](https://github.com/yt-dlp/yt-dlp/issues/12059)) by [coletdjnz](https://github.com/coletdjnz)
|
||||
|
||||
#### Misc. changes
|
||||
- **cleanup**: Miscellaneous: [dade5e3](https://github.com/yt-dlp/yt-dlp/commit/dade5e35c89adaad04408bfef766820dbca06ebe) by [grqz](https://github.com/grqz), [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
|
||||
|
||||
### 2024.12.23
|
||||
|
||||
#### Core changes
|
||||
- [Don't sanitize filename on Unix when `--no-windows-filenames`](https://github.com/yt-dlp/yt-dlp/commit/6fc85f617a5850307fd5b258477070e6ee177796) ([#9591](https://github.com/yt-dlp/yt-dlp/issues/9591)) by [pukkandan](https://github.com/pukkandan)
|
||||
- **update**
|
||||
- [Check 64-bitness when upgrading ARM builds](https://github.com/yt-dlp/yt-dlp/commit/b91c3925c2059970daa801cb131c0c2f4f302e72) ([#11819](https://github.com/yt-dlp/yt-dlp/issues/11819)) by [bashonly](https://github.com/bashonly)
|
||||
- [Fix endless update loop for `linux_exe` builds](https://github.com/yt-dlp/yt-dlp/commit/3d3ee458c1fe49dd5ebd7651a092119d23eb7000) ([#11827](https://github.com/yt-dlp/yt-dlp/issues/11827)) by [bashonly](https://github.com/bashonly)
|
||||
|
||||
#### Extractor changes
|
||||
- **soundcloud**: [Various fixes](https://github.com/yt-dlp/yt-dlp/commit/d298693b1b266d198e8eeecb90ea17c4a031268f) ([#11820](https://github.com/yt-dlp/yt-dlp/issues/11820)) by [bashonly](https://github.com/bashonly)
|
||||
- **youtube**
|
||||
- [Add age-gate workaround for some embeddable videos](https://github.com/yt-dlp/yt-dlp/commit/09a6c687126f04e243fcb105a828787efddd1030) ([#11821](https://github.com/yt-dlp/yt-dlp/issues/11821)) by [bashonly](https://github.com/bashonly)
|
||||
- [Fix `uploader_id` extraction](https://github.com/yt-dlp/yt-dlp/commit/1a8851b689763e5173b96f70f8a71df0e4a44b66) ([#11818](https://github.com/yt-dlp/yt-dlp/issues/11818)) by [bashonly](https://github.com/bashonly)
|
||||
- [Player client maintenance](https://github.com/yt-dlp/yt-dlp/commit/65cf46cddd873fd229dbb0fc0689bca4c201c6b6) ([#11893](https://github.com/yt-dlp/yt-dlp/issues/11893)) by [bashonly](https://github.com/bashonly)
|
||||
- [Skip iOS formats that require PO Token](https://github.com/yt-dlp/yt-dlp/commit/9f42e68a74f3f00b0253fe70763abd57cac4237b) ([#11890](https://github.com/yt-dlp/yt-dlp/issues/11890)) by [coletdjnz](https://github.com/coletdjnz)
|
||||
|
||||
### 2024.12.13
|
||||
|
||||
#### Extractor changes
|
||||
- **patreon**: campaign: [Support /c/ URLs](https://github.com/yt-dlp/yt-dlp/commit/bc262bcad4d3683ceadf61a7eb87e233e72adef3) ([#11756](https://github.com/yt-dlp/yt-dlp/issues/11756)) by [bashonly](https://github.com/bashonly)
|
||||
- **soundcloud**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/f4d3e9e6dc25077b79849a31a2f67f93fdc01e62) ([#11777](https://github.com/yt-dlp/yt-dlp/issues/11777)) by [bashonly](https://github.com/bashonly)
|
||||
- **youtube**
|
||||
- [Fix `release_date` extraction](https://github.com/yt-dlp/yt-dlp/commit/d5e2a379f2adcb28bc48c7d9e90716d7278f89d2) ([#11759](https://github.com/yt-dlp/yt-dlp/issues/11759)) by [MutantPiggieGolem1](https://github.com/MutantPiggieGolem1)
|
||||
- [Fix signature function extraction for `2f1832d2`](https://github.com/yt-dlp/yt-dlp/commit/5460cd91891bf613a2065e2fc278d9903c37a127) ([#11801](https://github.com/yt-dlp/yt-dlp/issues/11801)) by [bashonly](https://github.com/bashonly)
|
||||
- [Prioritize original language over auto-dubbed audio](https://github.com/yt-dlp/yt-dlp/commit/dc3c4fddcc653989dae71fc563d82a308fc898cc) ([#11803](https://github.com/yt-dlp/yt-dlp/issues/11803)) by [bashonly](https://github.com/bashonly)
|
||||
- search_url: [Fix playlist searches](https://github.com/yt-dlp/yt-dlp/commit/f6c73aad5f1a67544bea137ebd9d1e22e0e56567) ([#11782](https://github.com/yt-dlp/yt-dlp/issues/11782)) by [Crypto90](https://github.com/Crypto90)
|
||||
|
||||
#### Misc. changes
|
||||
- **cleanup**: [Make more playlist entries lazy](https://github.com/yt-dlp/yt-dlp/commit/54216696261bc07cacd9a837c501d9e0b7fed09e) ([#11763](https://github.com/yt-dlp/yt-dlp/issues/11763)) by [seproDev](https://github.com/seproDev)
|
||||
|
||||
### 2024.12.06
|
||||
|
||||
#### Core changes
|
||||
- **cookies**: [Add `--cookies-from-browser` support for MS Store Firefox](https://github.com/yt-dlp/yt-dlp/commit/354cb4026cf2191e1a130ec2a627b95cabfbc60a) ([#11731](https://github.com/yt-dlp/yt-dlp/issues/11731)) by [wesson09](https://github.com/wesson09)
|
||||
|
||||
#### Extractor changes
|
||||
- **bilibili**: [Fix HD formats extraction](https://github.com/yt-dlp/yt-dlp/commit/fca3eb5f8be08d5fab2e18b45b7281a12e566725) ([#11734](https://github.com/yt-dlp/yt-dlp/issues/11734)) by [grqz](https://github.com/grqz)
|
||||
- **soundcloud**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/2feb28028ee48f2185d2d95076e62accb09b9e2e) ([#11742](https://github.com/yt-dlp/yt-dlp/issues/11742)) by [bashonly](https://github.com/bashonly)
|
||||
- **youtube**
|
||||
- [Fix `n` sig extraction for player `3bb1f723`](https://github.com/yt-dlp/yt-dlp/commit/a95ee6d8803fca9157adecf63732ab58bf87fd88) ([#11750](https://github.com/yt-dlp/yt-dlp/issues/11750)) by [bashonly](https://github.com/bashonly) (With fixes in [4bd2655](https://github.com/yt-dlp/yt-dlp/commit/4bd2655398aed450456197a6767639114a24eac2))
|
||||
- [Fix signature function extraction](https://github.com/yt-dlp/yt-dlp/commit/4c85ccd1366c88cf93982f8350f58eed17355981) ([#11751](https://github.com/yt-dlp/yt-dlp/issues/11751)) by [bashonly](https://github.com/bashonly)
|
||||
- [Player client maintenance](https://github.com/yt-dlp/yt-dlp/commit/2e49c789d3eebc39af8910705d65a98bca0e4c4f) ([#11724](https://github.com/yt-dlp/yt-dlp/issues/11724)) by [bashonly](https://github.com/bashonly)
|
||||
|
||||
### 2024.12.03
|
||||
|
||||
#### Core changes
|
||||
|
||||
@@ -613,8 +613,7 @@ If you fork the project on GitHub, you can run your fork's [build workflow](.git
     --no-restrict-filenames         Allow Unicode characters, "&" and spaces in
                                     filenames (default)
     --windows-filenames             Force filenames to be Windows-compatible
-    --no-windows-filenames          Make filenames Windows-compatible only if
-                                    using Windows (default)
+    --no-windows-filenames          Sanitize filenames only minimally
     --trim-filenames LENGTH         Limit the filename length (excluding
                                     extension) to the specified number of
                                     characters
@@ -1770,13 +1769,13 @@ The following extractors use this feature:
 #### youtube
 * `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
 * `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
-* `player_client`: Clients to extract video data from. The main clients are `web`, `ios` and `android`, with variants `_music` and `_creator` (e.g. `ios_creator`); and `mweb`, `android_vr`, `web_safari`, `web_embedded`, `tv` and `tv_embedded` with no variants. By default, `ios,mweb` is used, or `web_creator,mweb` is used when authenticating with cookies. The `_music` variants are added for `music.youtube.com` URLs. Some clients, such as `web` and `android`, require a `po_token` for their formats to be downloadable. Some clients, such as the `_creator` variants, will only work with authentication. Not all clients support authentication via cookies. You can use `all` to use all the clients, and `default` for the default clients. You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=all,-web`
+* `player_client`: Clients to extract video data from. The main clients are `web`, `ios` and `android`, with variants `_music` and `_creator` (e.g. `ios_creator`); and `mweb`, `android_vr`, `web_safari`, `web_embedded`, `tv` and `tv_embedded` with no variants. By default, `ios,tv` is used, or `web_creator,tv` is used when authenticating with cookies. The `_music` variants are added for `music.youtube.com` URLs. Some clients, such as `web` and `android`, require a `po_token` for their formats to be downloadable. Some clients, such as the `_creator` variants, will only work with authentication. Not all clients support authentication via cookies. You can use `all` to use all the clients, and `default` for the default clients. You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=all,-web`
 * `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details
 * `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
 * `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
 * `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all`
     * E.g. `all,all,1000,10` will get a maximum of 1000 replies total, with up to 10 replies per thread. `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total
-* `formats`: Change the types of formats to return. `dashy` (convert HTTP to DASH), `duplicate` (identical content but different URLs or protocol; includes `dashy`), `incomplete` (cannot be downloaded completely - live dash and post-live m3u8)
+* `formats`: Change the types of formats to return. `dashy` (convert HTTP to DASH), `duplicate` (identical content but different URLs or protocol; includes `dashy`), `incomplete` (cannot be downloaded completely - live dash and post-live m3u8), `missing_pot` (include formats that require a PO Token but are missing one)
 * `innertube_host`: Innertube API host to use for all API requests; e.g. `studio.youtube.com`, `youtubei.googleapis.com`. Note that cookies exported from one subdomain will not work on others
 * `innertube_key`: Innertube API key to use for all API requests. By default, no API key is used
 * `raise_incomplete_data`: `Incomplete Data Received` raises an error instead of reporting a warning
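The `player_client` text above describes a small selection grammar: `all`, `default`, and a `-` prefix for exclusions. The following is only a sketch of how such a spec could be resolved, not yt-dlp's actual implementation. `resolve_player_clients`, `KNOWN_CLIENTS` and `DEFAULT_CLIENTS` are hypothetical names, and the client tuple is an illustrative subset taken from the README wording.

```python
# Defaults as stated in the updated README text (no cookies); illustrative only.
DEFAULT_CLIENTS = ('ios', 'tv')
# Assumption: an illustrative subset of the names listed in the README,
# not yt-dlp's real client table.
KNOWN_CLIENTS = ('web', 'ios', 'android', 'mweb', 'android_vr',
                 'web_safari', 'web_embedded', 'tv', 'tv_embedded')


def resolve_player_clients(spec):
    """Expand `all`/`default` and drop `-name` exclusions from a comma-separated spec."""
    selected, excluded = [], set()
    for name in (part.strip() for part in spec.split(',')):
        if not name:
            continue
        if name.startswith('-'):
            excluded.add(name[1:])
        elif name == 'all':
            selected.extend(KNOWN_CLIENTS)
        elif name == 'default':
            selected.extend(DEFAULT_CLIENTS)
        else:
            selected.append(name)
    # dict.fromkeys removes duplicates while keeping first-seen order
    return [c for c in dict.fromkeys(selected) if c not in excluded]


print(resolve_player_clients('all,-web'))
print(resolve_player_clients('default'))
```

On the command line, the equivalent selection is passed as `--extractor-args "youtube:player_client=all,-web"`, as in the README example.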
@@ -1860,7 +1859,7 @@ The following extractors use this feature:
 * `cdn`: One or more CDN IDs to use with the API call for stream URLs, e.g. `gcp_cdn`, `gs_cdn_pc_app`, `gs_cdn_mobile_web`, `gs_cdn_pc_web`
 
 #### soundcloud
-* `formats`: Formats to request from the API. Requested values should be in the format of `{protocol}_{extension}` (omitting the bitrate), e.g. `hls_opus,http_aac`. The `*` character functions as a wildcard, e.g. `*_mp3`, and can be passed by itself to request all formats. Known protocols include `http`, `hls` and `hls-aes`; known extensions include `aac`, `opus` and `mp3`. Original `download` formats are always extracted. Default is `http_aac,hls_aac,http_opus,hls_opus,http_mp3,hls_mp3`
+* `formats`: Formats to request from the API. Requested values should be in the format of `{protocol}_{codec}`, e.g. `hls_opus,http_aac`. The `*` character functions as a wildcard, e.g. `*_mp3`, and can be passed by itself to request all formats. Known protocols include `http`, `hls` and `hls-aes`; known codecs include `aac`, `opus` and `mp3`. Original `download` formats are always extracted. Default is `http_aac,hls_aac,http_opus,hls_opus,http_mp3,hls_mp3`
 
 #### orfon (orf:on)
 * `prefer_segments_playlist`: Prefer a playlist of program segments instead of a single complete video when available. If individual segments are desired, use `--concat-playlist never --extractor-args "orfon:prefer_segments_playlist"`
@@ -76,7 +76,7 @@ dev = [
 ]
 static-analysis = [
     "autopep8~=2.0",
-    "ruff~=0.8.0",
+    "ruff~=0.9.0",
 ]
 test = [
     "pytest~=8.1",
@@ -195,6 +195,7 @@ ignore = [
     "B023",  # function-uses-loop-variable (false positives)
     "B028",  # no-explicit-stacklevel
     "B904",  # raise-without-from-inside-except
+    "A005",  # stdlib-module-shadowing
     "C401",  # unnecessary-generator-set
     "C402",  # unnecessary-generator-dict
     "PIE790",  # unnecessary-placeholder
@@ -374,6 +374,7 @@
  - **Dropbox**
  - **Dropout**: [*dropout*](## "netrc machine")
  - **DropoutSeason**
+ - **DrTalks**
  - **DrTuber**
  - **drtv**
  - **drtv:live**
@@ -1086,6 +1087,7 @@
  - **pluralsight**: [*pluralsight*](## "netrc machine")
  - **pluralsight:course**
  - **PlutoTV**: (**Currently broken**)
+ - **PlVideo**: Платформа
  - **PodbayFM**
  - **PodbayFMChannel**
  - **Podchaser**
@@ -1641,8 +1643,6 @@
  - **Vimm:stream**
  - **ViMP**
  - **ViMP:Playlist**
- - **Vine**
- - **vine:user**
  - **Viously**
  - **Viqeo**: (**Currently broken**)
  - **Viu**
@@ -761,6 +761,13 @@ class TestYoutubeDL(unittest.TestCase):
|
||||
test('%(width)06d.%%(ext)s', 'NA.%(ext)s')
|
||||
test('%%(width)06d.%(ext)s', '%(width)06d.mp4')
|
||||
|
||||
# Sanitization options
|
||||
test('%(title3)s', (None, 'foo⧸bar⧹test'))
|
||||
test('%(title5)s', (None, 'aei_A'), restrictfilenames=True)
|
||||
test('%(title3)s', (None, 'foo_bar_test'), windowsfilenames=False, restrictfilenames=True)
|
||||
if sys.platform != 'win32':
|
||||
test('%(title3)s', (None, 'foo⧸bar\\test'), windowsfilenames=False)
|
||||
|
||||
# ID sanitization
|
||||
test('%(id)s', '_abcd', info={'id': '_abcd'})
|
||||
test('%(some_id)s', '_abcd', info={'some_id': '_abcd'})
|
||||
|
||||
@@ -68,6 +68,16 @@ _SIG_TESTS = [
|
||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
||||
'AOq0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL2QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
|
||||
),
|
||||
(
|
||||
'https://www.youtube.com/s/player/3bb1f723/player_ias.vflset/en_US/base.js',
|
||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
||||
'MyOSJXtKI3m-uME_jv7-pT12gOFC02RFkGoqWpzE0Cs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
||||
),
|
||||
(
|
||||
'https://www.youtube.com/s/player/2f1832d2/player_ias.vflset/en_US/base.js',
|
||||
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
|
||||
'0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xxAj7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJ2OySqa0q',
|
||||
),
|
||||
]
|
||||
|
||||
_NSIG_TESTS = [
|
||||
@@ -183,6 +193,14 @@ _NSIG_TESTS = [
|
||||
'https://www.youtube.com/s/player/b12cc44b/player_ias.vflset/en_US/base.js',
|
||||
'keLa5R2U00sR9SQK', 'N1OGyujjEwMnLw',
|
||||
),
|
||||
(
|
||||
'https://www.youtube.com/s/player/3bb1f723/player_ias.vflset/en_US/base.js',
|
||||
'gK15nzVyaXE9RsMP3z', 'ZFFWFLPWx9DEgQ',
|
||||
),
|
||||
(
|
||||
'https://www.youtube.com/s/player/2f1832d2/player_ias.vflset/en_US/base.js',
|
||||
'YWt1qdbe8SAfkoPHW5d', 'RrRjWQOJmBiP',
|
||||
),
|
||||
]
|
||||
|
||||
|
||||
@@ -254,8 +272,11 @@ def signature(jscode, sig_input):
|
||||
|
||||
|
||||
def n_sig(jscode, sig_input):
|
||||
funcname = YoutubeIE(FakeYDL())._extract_n_function_name(jscode)
|
||||
return JSInterpreter(jscode).call_function(funcname, sig_input)
|
||||
ie = YoutubeIE(FakeYDL())
|
||||
funcname = ie._extract_n_function_name(jscode)
|
||||
jsi = JSInterpreter(jscode)
|
||||
func = jsi.extract_function_from_code(*ie._fixup_n_function_code(*jsi.extract_function_code(funcname)))
|
||||
return func([sig_input])
|
||||
|
||||
|
||||
make_sig_test = t_factory(
|
||||
|
||||
@@ -266,7 +266,9 @@ class YoutubeDL:
     outtmpl_na_placeholder: Placeholder for unavailable meta fields.
     restrictfilenames: Do not allow "&" and spaces in file names
     trim_file_name:    Limit length of filename (extension excluded)
-    windowsfilenames:  Force the filenames to be windows compatible
+    windowsfilenames:  True: Force filenames to be Windows compatible
+                       False: Sanitize filenames only minimally
+                       This option has no effect when running on Windows
     ignoreerrors:      Do not stop on download/postprocessing errors.
                        Can be 'only_download' to ignore only download errors.
                        Default is 'only_download' for CLI, but False for API
@@ -281,7 +283,10 @@ class YoutubeDL:
     lazy_playlist:     Process playlist entries as they are received.
     matchtitle:        Download only matching titles.
     rejecttitle:       Reject downloads for matching titles.
-    logger:            Log messages to a logging.Logger instance.
+    logger:            A class having a `debug`, `warning` and `error` function where
+                       each has a single string parameter, the message to be logged.
+                       For compatibility reasons, both debug and info messages are passed to `debug`.
+                       A debug message will have a prefix of `[debug] ` to discern it from info messages.
     logtostderr:       Print everything to stderr instead of stdout.
     consoletitle:      Display progress in the console window's titlebar.
     writedescription:  Write the video description to a .description file
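The revised `logger` description above implies a small duck-typed interface. The following is a minimal sketch of an object that satisfies it, assuming only what the docstring states: three methods taking one string each, with info messages arriving through `debug` and real debug messages carrying the `[debug] ` prefix. `CollectingLogger` is a hypothetical name used for illustration.

```python
class CollectingLogger:
    """Collects messages instead of printing them (illustrative only)."""

    def __init__(self):
        self.debug_messages = []
        self.info_messages = []
        self.warnings = []
        self.errors = []

    def debug(self, msg):
        # Both debug and info messages arrive here; the prefix tells them apart.
        if msg.startswith('[debug] '):
            self.debug_messages.append(msg)
        else:
            self.info_messages.append(msg)

    def warning(self, msg):
        self.warnings.append(msg)

    def error(self, msg):
        self.errors.append(msg)


logger = CollectingLogger()
logger.debug('[debug] Command-line config: []')
logger.debug('[download] Destination: video.mp4')
logger.warning('Incomplete data received')
```

An instance like this would be passed as the `logger` entry of the options dict given to `YoutubeDL`.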
@@ -1192,8 +1197,7 @@ class YoutubeDL:
 
     def prepare_outtmpl(self, outtmpl, info_dict, sanitize=False):
         """ Make the outtmpl and info_dict suitable for substitution: ydl.escape_outtmpl(outtmpl) % info_dict
-        @param sanitize    Whether to sanitize the output as a filename.
-                           For backward compatibility, a function can also be passed
+        @param sanitize    Whether to sanitize the output as a filename
         """
 
         info_dict.setdefault('epoch', int(time.time()))  # keep epoch consistent once set
@@ -1309,14 +1313,23 @@ class YoutubeDL:
 
         na = self.params.get('outtmpl_na_placeholder', 'NA')
 
-        def filename_sanitizer(key, value, restricted=self.params.get('restrictfilenames')):
+        def filename_sanitizer(key, value, restricted):
             return sanitize_filename(str(value), restricted=restricted, is_id=(
                 bool(re.search(r'(^|[_.])id(\.|$)', key))
                 if 'filename-sanitization' in self.params['compat_opts']
                 else NO_DEFAULT))
 
-        sanitizer = sanitize if callable(sanitize) else filename_sanitizer
-        sanitize = bool(sanitize)
+        if callable(sanitize):
+            self.deprecation_warning('Passing a callable "sanitize" to YoutubeDL.prepare_outtmpl is deprecated')
+        elif not sanitize:
+            pass
+        elif (sys.platform != 'win32' and not self.params.get('restrictfilenames')
+                and self.params.get('windowsfilenames') is False):
+            def sanitize(key, value):
+                return str(value).replace('/', '\u29F8').replace('\0', '')
+        else:
+            def sanitize(key, value):
+                return filename_sanitizer(key, value, restricted=self.params.get('restrictfilenames'))
 
         def _dumpjson_default(obj):
             if isinstance(obj, (set, LazyList)):
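The minimal branch added in this hunk applies only two substitutions. A standalone sketch of that behavior follows; `minimal_sanitize` is a hypothetical name, and the surrounding platform and option checks from the hunk are deliberately left out.

```python
def minimal_sanitize(value):
    """Swap '/' for U+29F8 (big solidus) and drop NUL characters; leave the rest alone."""
    return str(value).replace('/', '\u29F8').replace('\0', '')


# Backslashes survive, unlike in the Windows-compatible sanitization path.
print(minimal_sanitize('foo/bar\\test'))
```

This matches the expectation in the added test case, where `windowsfilenames=False` on a non-Windows platform keeps the backslash in `foo⧸bar\test`.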
@@ -1399,13 +1412,13 @@ class YoutubeDL:
 
             if sanitize:
                 # If value is an object, sanitize might convert it to a string
-                # So we convert it to repr first
+                # So we manually convert it before sanitizing
                 if fmt[-1] == 'r':
                     value, fmt = repr(value), str_fmt
                 elif fmt[-1] == 'a':
                     value, fmt = ascii(value), str_fmt
                 if fmt[-1] in 'csra':
-                    value = sanitizer(last_field, value)
+                    value = sanitize(last_field, value)
 
             key = '{}\0{}'.format(key.replace('%', '%\0'), outer_mobj.group('format'))
             TMPL_DICT[key] = value
@@ -261,9 +261,11 @@ def validate_options(opts):
         elif value in ('inf', 'infinite'):
             return float('inf')
         try:
-            return int(value)
+            int_value = int(value)
         except (TypeError, ValueError):
             validate(False, f'{name} retry count', value)
+        validate_positive(f'{name} retry count', int_value)
+        return int_value
 
     opts.retries = parse_retries('download', opts.retries)
     opts.fragment_retries = parse_retries('fragment', opts.fragment_retries)
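The retry parsing in this hunk can be sketched without yt-dlp's `validate` helpers. The version below is an approximation under stated assumptions: it raises `ValueError` where the real code calls `validate(...)` and `validate_positive(...)`, and the `None` passthrough is assumed because the first branch of the function is not shown in the hunk.

```python
def parse_retries(name, value):
    """Parse a retry count: 'inf'/'infinite', or a non-negative integer."""
    if value is None:  # assumption: the branch above the hunk is not visible
        return None
    if value in ('inf', 'infinite'):
        return float('inf')
    try:
        int_value = int(value)
    except (TypeError, ValueError):
        raise ValueError(f'invalid {name} retry count: {value!r}') from None
    # The fix: negative counts are rejected instead of being returned as-is.
    if int_value < 0:
        raise ValueError(f'invalid {name} retry count: {value!r}')
    return int_value


print(parse_retries('download', '10'))
print(parse_retries('fragment', 'infinite'))
```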
@@ -195,7 +195,10 @@ def _extract_firefox_cookies(profile, container, logger):
 
 def _firefox_browser_dirs():
     if sys.platform in ('cygwin', 'win32'):
-        yield os.path.expandvars(R'%APPDATA%\Mozilla\Firefox\Profiles')
+        yield from map(os.path.expandvars, (
+            R'%APPDATA%\Mozilla\Firefox\Profiles',
+            R'%LOCALAPPDATA%\Packages\Mozilla.Firefox_n80bbvh6b1yt2\LocalCache\Roaming\Mozilla\Firefox\Profiles',
+        ))
 
     elif sys.platform == 'darwin':
         yield os.path.expanduser('~/Library/Application Support/Firefox/Profiles')
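To see what the two Windows search roots expand to, here is a platform-independent sketch. `os.path.expandvars` only substitutes `%NAME%` on Windows, so the hypothetical helper `expand_percent_vars` stands in for it, and the environment values are made-up examples rather than real paths.

```python
def expand_percent_vars(template, env):
    """Stand-in for os.path.expandvars on Windows: substitute %NAME% from `env`."""
    for name, value in env.items():
        template = template.replace(f'%{name}%', value)
    return template


def firefox_profile_roots(env):
    """Yield both the classic and the MS Store Firefox profile roots."""
    yield from (expand_percent_vars(template, env) for template in (
        R'%APPDATA%\Mozilla\Firefox\Profiles',
        R'%LOCALAPPDATA%\Packages\Mozilla.Firefox_n80bbvh6b1yt2\LocalCache\Roaming\Mozilla\Firefox\Profiles',
    ))


# Hypothetical environment values, for illustration only.
example_env = {
    'APPDATA': R'C:\Users\me\AppData\Roaming',
    'LOCALAPPDATA': R'C:\Users\me\AppData\Local',
}
roots = list(firefox_profile_roots(example_env))
for root in roots:
    print(root)
```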
@@ -555,6 +555,7 @@ from .dropout import (
|
||||
DropoutIE,
|
||||
DropoutSeasonIE,
|
||||
)
|
||||
from .drtalks import DrTalksIE
|
||||
from .drtuber import DrTuberIE
|
||||
from .drtv import (
|
||||
DRTVIE,
|
||||
@@ -1551,6 +1552,7 @@ from .pluralsight import (
|
||||
PluralsightIE,
|
||||
)
|
||||
from .plutotv import PlutoTVIE
|
||||
from .plvideo import PlVideoIE
|
||||
from .podbayfm import (
|
||||
PodbayFMChannelIE,
|
||||
PodbayFMIE,
|
||||
@@ -2354,10 +2356,6 @@ from .vimm import (
|
||||
VimmIE,
|
||||
VimmRecordingIE,
|
||||
)
|
||||
from .vine import (
|
||||
VineIE,
|
||||
VineUserIE,
|
||||
)
|
||||
from .viously import ViouslyIE
|
||||
from .viqeo import ViqeoIE
|
||||
from .viu import (
|
||||
|
||||
@@ -681,12 +681,6 @@ class BiliBiliIE(BilibiliBaseIE):
|
||||
old_video_id = format_field(aid, None, f'%s_part{part_id or 1}')
|
||||
cid = traverse_obj(video_data, ('pages', part_id - 1, 'cid')) if part_id else video_data.get('cid')
|
||||
|
||||
play_info = (
|
||||
traverse_obj(
|
||||
self._search_json(r'window\.__playinfo__\s*=', webpage, 'play info', video_id, default=None),
|
||||
('data', {dict}))
|
||||
or self._download_playinfo(video_id, cid, headers=headers, query={'try_look': 1}))
|
||||
|
||||
festival_info = {}
|
||||
if is_festival:
|
||||
festival_info = traverse_obj(initial_state, {
|
||||
@@ -724,6 +718,13 @@ class BiliBiliIE(BilibiliBaseIE):
|
||||
duration=traverse_obj(initial_state, ('videoData', 'duration', {int_or_none})),
|
||||
__post_extractor=self.extract_comments(aid))
|
||||
|
||||
play_info = None
|
||||
if self.is_logged_in:
|
||||
play_info = traverse_obj(
|
||||
self._search_json(r'window\.__playinfo__\s*=', webpage, 'play info', video_id, default=None),
|
||||
('data', {dict}))
|
||||
if not play_info:
|
||||
play_info = self._download_playinfo(video_id, cid, headers=headers, query={'try_look': 1})
|
||||
formats = self.extract_formats(play_info)
|
||||
|
||||
if video_data.get('is_upower_exclusive'):
|
||||
|
||||
@@ -31,6 +31,7 @@ from ..utils import (
     update_url_query,
     url_or_none,
 )
+from ..utils.traversal import traverse_obj
 
 
 class BrightcoveLegacyIE(InfoExtractor):
@@ -935,8 +936,8 @@ class BrightcoveNewIE(BrightcoveNewBaseIE):
 
         if content_type == 'playlist':
             return self.playlist_result(
-                [self._parse_brightcove_metadata(vid, vid.get('id'), headers)
-                 for vid in json_data.get('videos', []) if vid.get('id')],
+                (self._parse_brightcove_metadata(vid, vid['id'], headers)
+                 for vid in traverse_obj(json_data, ('videos', lambda _, v: v['id']))),
                 json_data.get('id'), json_data.get('name'),
                 json_data.get('description'))
 
51
yt_dlp/extractor/drtalks.py
Normal file
@@ -0,0 +1,51 @@
|
||||
from .brightcove import BrightcoveNewIE
|
||||
from .common import InfoExtractor
|
||||
from ..utils import url_or_none
|
||||
from ..utils.traversal import traverse_obj
|
||||
|
||||
|
||||
class DrTalksIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?drtalks\.com/videos/(?P<id>[\w-]+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://drtalks.com/videos/six-pillars-of-resilience-tools-for-managing-stress-and-flourishing/',
|
||||
'info_dict': {
|
||||
'id': '6366193757112',
|
||||
'ext': 'mp4',
|
||||
'uploader_id': '6314452011001',
|
||||
'tags': ['resilience'],
|
||||
'description': 'md5:9c6805aee237ee6de8052461855b9dda',
|
||||
'timestamp': 1734546659,
|
||||
'thumbnail': 'https://drtalks.com/wp-content/uploads/2024/12/Episode-82-Eva-Selhub-DrTalks-Thumbs.jpg',
|
||||
'title': 'Six Pillars of Resilience: Tools for Managing Stress and Flourishing',
|
||||
'duration': 2800.682,
|
||||
'upload_date': '20241218',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://drtalks.com/videos/the-pcos-puzzle-mastering-metabolic-health-with-marcelle-pick/',
|
||||
'info_dict': {
|
||||
'id': '6364699891112',
|
||||
'ext': 'mp4',
|
||||
'title': 'The PCOS Puzzle: Mastering Metabolic Health with Marcelle Pick',
|
||||
'description': 'md5:e87cbe00ca50135d5702787fc4043aaa',
|
||||
'thumbnail': 'https://drtalks.com/wp-content/uploads/2024/11/Episode-34-Marcelle-Pick-OBGYN-NP-DrTalks.jpg',
|
||||
'duration': 3515.2,
|
||||
'tags': ['pcos'],
|
||||
'upload_date': '20241114',
|
||||
'timestamp': 1731592119,
|
||||
'uploader_id': '6314452011001',
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
next_data = self._search_nextjs_data(webpage, video_id)['props']['pageProps']['data']['video']
|
||||
|
||||
return self.url_result(
|
||||
next_data['videos']['brightcoveVideoLink'], BrightcoveNewIE, video_id,
|
||||
url_transparent=True,
|
||||
**traverse_obj(next_data, {
|
||||
'title': ('title', {str}),
|
||||
'description': ('videos', 'summury', {str}),
|
||||
'thumbnail': ('featuredImage', 'node', 'sourceUrl', {url_or_none}),
|
||||
}))
|
||||
@@ -162,7 +162,7 @@ class DVTVIE(InfoExtractor):
         items = re.findall(r'(?s)playlist\.push\(({.+?})\);', webpage)
         if items:
             return self.playlist_result(
-                [self._parse_video_metadata(i, video_id, timestamp) for i in items],
+                (self._parse_video_metadata(i, video_id, timestamp) for i in items),
                 video_id, self._html_search_meta('twitter:title', webpage))
 
         item = self._search_regex(
@@ -343,7 +343,7 @@ class NYTimesCookingIE(NYTimesBaseIE):
         if media_ids:
             media_ids.append(lead_video_id)
             return self.playlist_result(
-                [self._extract_video(media_id) for media_id in media_ids], page_id, title, description)
+                map(self._extract_video, media_ids), page_id, title, description)
 
         return {
             **self._extract_video(lead_video_id),
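The Brightcove, DVTV and NYTimes hunks all swap a list comprehension for a generator expression or `map()`, so playlist entries are produced on demand. A small sketch of the difference follows; `extract_video` is a hypothetical stand-in for the per-entry extraction helpers and does no real work.

```python
calls = []  # records the order in which entries are actually extracted


def extract_video(media_id):
    """Hypothetical stand-in for an extractor helper that would hit the network."""
    calls.append(media_id)
    return {'id': media_id}


# Eager: the list comprehension extracts every entry before returning.
eager_entries = [extract_video(m) for m in ('a', 'b')]
calls_after_eager = list(calls)

# Lazy: map() returns an iterator, so nothing is extracted yet.
lazy_entries = map(extract_video, ('c', 'd'))
calls_after_lazy_setup = list(calls)

# Work happens only when the consumer pulls an entry.
first_lazy = next(lazy_entries)
print(calls)
```

With lazy entries, options that stop early (for example selecting only the first playlist items) can avoid extracting entries that are never requested.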
@@ -457,7 +457,7 @@ class PatreonCampaignIE(PatreonBaseIE):
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://(?:www\.)?patreon\.com/(?:
|
||||
(?:m|api/campaigns)/(?P<campaign_id>\d+)|
|
||||
(?P<vanity>(?!creation[?/]|posts/|rss[?/])[\w-]+)
|
||||
(?:c/)?(?P<vanity>(?!creation[?/]|posts/|rss[?/])[\w-]+)
|
||||
)(?:/posts)?/?(?:$|[?#])'''
|
||||
_TESTS = [{
|
||||
'url': 'https://www.patreon.com/dissonancepod/',
|
||||
@@ -509,6 +509,26 @@ class PatreonCampaignIE(PatreonBaseIE):
|
||||
'thumbnail': r're:^https?://.*$',
|
||||
},
|
||||
'playlist_mincount': 201,
|
||||
}, {
|
||||
'url': 'https://www.patreon.com/c/OgSog',
|
||||
'info_dict': {
|
||||
'id': '8504388',
|
||||
'title': 'OGSoG',
|
||||
'description': r're:(?s)Hello and welcome to our Patreon page. We are Mari, Lasercorn, .+',
|
||||
'channel': 'OGSoG',
|
||||
'channel_id': '8504388',
|
||||
'channel_url': 'https://www.patreon.com/OgSog',
|
||||
'uploader_url': 'https://www.patreon.com/OgSog',
|
||||
'uploader_id': '72323575',
|
||||
'uploader': 'David Moss',
|
||||
'thumbnail': r're:https?://.+/.+',
|
||||
'channel_follower_count': int,
|
||||
'age_limit': 0,
|
||||
},
|
||||
'playlist_mincount': 331,
|
||||
}, {
|
||||
'url': 'https://www.patreon.com/c/OgSog/posts',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.patreon.com/dissonancepod/posts',
|
||||
'only_matching': True,
|
||||
|
||||
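The effect of the `(?:c/)?` prefix added to `PatreonCampaignIE._VALID_URL` can be checked outside yt-dlp. The following is a minimal sketch using only the standard `re` module; the pattern is copied from the hunk above, while `match_campaign` is a hypothetical helper name and the `/m/4767637` and `/posts/foo-123` URLs are made-up inputs.

```python
import re

# Pattern copied from the updated PatreonCampaignIE._VALID_URL shown in the hunk above.
VALID_URL = r'''(?x)
    https?://(?:www\.)?patreon\.com/(?:
        (?:m|api/campaigns)/(?P<campaign_id>\d+)|
        (?:c/)?(?P<vanity>(?!creation[?/]|posts/|rss[?/])[\w-]+)
    )(?:/posts)?/?(?:$|[?#])'''


def match_campaign(url):
    """Hypothetical helper: return (campaign_id, vanity), or None if the URL does not match."""
    mobj = re.match(VALID_URL, url)
    if not mobj:
        return None
    return mobj.group('campaign_id'), mobj.group('vanity')


# The new /c/<vanity> form and the older bare-vanity form should both resolve to a vanity name.
print(match_campaign('https://www.patreon.com/c/OgSog'))
print(match_campaign('https://www.patreon.com/dissonancepod/'))
```

Without the optional `c/` group, the `vanity` group could only capture `c`, and the trailing `(?:$|[?#])` anchor would then reject the rest of the path.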
130
yt_dlp/extractor/plvideo.py
Normal file
@@ -0,0 +1,130 @@
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
parse_iso8601,
|
||||
parse_resolution,
|
||||
url_or_none,
|
||||
)
|
||||
from ..utils.traversal import traverse_obj
|
||||
|
||||
|
||||
class PlVideoIE(InfoExtractor):
|
||||
IE_DESC = 'Платформа'
|
||||
_VALID_URL = r'https?://(?:www\.)?plvideo\.ru/(?:watch\?(?:[^#]+&)?v=|shorts/)(?P<id>[\w-]+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://plvideo.ru/watch?v=Y5JzUzkcQTMK',
|
||||
'md5': 'fe8e18aca892b3b31f3bf492169f8a26',
|
||||
'info_dict': {
|
||||
'id': 'Y5JzUzkcQTMK',
|
||||
'ext': 'mp4',
|
||||
'thumbnail': 'https://img.plvideo.ru/images/fp-2024-images/v/cover/37/dd/37dd00a4c96c77436ab737e85947abd7/original663a4a3bb713e5.33151959.jpg',
|
||||
'title': 'Presidente de Cuba llega a Moscú en una visita de trabajo',
|
||||
'channel': 'RT en Español',
|
||||
'channel_id': 'ZH4EKqunVDvo',
|
||||
'media_type': 'video',
|
||||
'comment_count': int,
|
||||
'tags': ['rusia', 'cuba', 'russia', 'miguel díaz-canel'],
|
||||
'description': 'md5:a1a395d900d77a86542a91ee0826c115',
|
||||
'released_timestamp': 1715096124,
|
||||
'channel_is_verified': True,
|
||||
'like_count': int,
|
||||
'timestamp': 1715095911,
|
||||
'duration': 44320,
|
||||
'view_count': int,
|
||||
'dislike_count': int,
|
||||
'upload_date': '20240507',
|
||||
'modified_date': '20240701',
|
||||
'channel_follower_count': int,
|
||||
'modified_timestamp': 1719824073,
|
||||
},
|
||||
}, {
|
||||
'url': 'https://plvideo.ru/shorts/S3Uo9c-VLwFX',
|
||||
'md5': '7d8fa2279406c69d2fd2a6fc548a9805',
|
||||
'info_dict': {
|
||||
'id': 'S3Uo9c-VLwFX',
|
||||
'ext': 'mp4',
|
||||
'channel': 'Romaatom',
|
||||
'tags': 'count:22',
|
||||
'dislike_count': int,
|
||||
'upload_date': '20241130',
|
||||
'description': 'md5:452e6de219bf2f32bb95806c51c3b364',
|
||||
'duration': 58433,
|
||||
'modified_date': '20241130',
|
||||
'thumbnail': 'https://img.plvideo.ru/images/fp-2024-11-cover/S3Uo9c-VLwFX/f9318999-a941-482b-b700-2102a7049366.jpg',
|
||||
'media_type': 'shorts',
|
||||
'like_count': int,
|
||||
'modified_timestamp': 1732961458,
|
||||
'channel_is_verified': True,
|
||||
'channel_id': 'erJyyTIbmUd1',
|
||||
'timestamp': 1732961355,
|
||||
'comment_count': int,
|
||||
'title': 'Белоусов отменил приказы о кадровом резерве на гражданской службе',
|
||||
'channel_follower_count': int,
|
||||
'view_count': int,
|
||||
'released_timestamp': 1732961458,
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
video_data = self._download_json(
|
||||
f'https://api.g1.plvideo.ru/v1/videos/{video_id}?Aud=18', video_id)
|
||||
|
||||
is_live = False
|
||||
formats = []
|
||||
subtitles = {}
|
||||
automatic_captions = {}
|
||||
for quality, data in traverse_obj(video_data, ('item', 'profiles', {dict.items}, lambda _, v: url_or_none(v[1]['hls']))):
|
||||
formats.append({
|
||||
'format_id': quality,
|
||||
'ext': 'mp4',
|
||||
'protocol': 'm3u8_native',
|
||||
**traverse_obj(data, {
|
||||
'url': 'hls',
|
||||
'fps': ('fps', {float_or_none}),
|
||||
'aspect_ratio': ('aspectRatio', {float_or_none}),
|
||||
}),
|
||||
**parse_resolution(quality),
|
||||
})
|
||||
if livestream_url := traverse_obj(video_data, ('item', 'livestream', 'url', {url_or_none})):
|
||||
is_live = True
|
||||
formats.extend(self._extract_m3u8_formats(livestream_url, video_id, 'mp4', live=True))
|
||||
for lang, url in traverse_obj(video_data, ('item', 'subtitles', {dict.items}, lambda _, v: url_or_none(v[1]))):
|
||||
if lang.endswith('-auto'):
|
||||
automatic_captions.setdefault(lang[:-5], []).append({
|
||||
'url': url,
|
||||
})
|
||||
else:
|
||||
subtitles.setdefault(lang, []).append({
|
||||
'url': url,
|
||||
})
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
'automatic_captions': automatic_captions,
|
||||
'is_live': is_live,
|
||||
**traverse_obj(video_data, ('item', {
|
||||
'id': ('id', {str}),
|
||||
'title': ('title', {str}),
|
||||
'description': ('description', {str}),
|
||||
'thumbnail': ('cover', 'paths', 'original', 'src', {url_or_none}),
|
||||
'duration': ('uploadFile', 'videoDuration', {int_or_none}),
|
||||
'channel': ('channel', 'name', {str}),
|
||||
'channel_id': ('channel', 'id', {str}),
|
||||
'channel_follower_count': ('channel', 'stats', 'subscribers', {int_or_none}),
|
||||
'channel_is_verified': ('channel', 'verified', {bool}),
|
||||
'tags': ('tags', ..., {str}),
|
||||
'timestamp': ('createdAt', {parse_iso8601}),
|
||||
'released_timestamp': ('publishedAt', {parse_iso8601}),
|
||||
'modified_timestamp': ('updatedAt', {parse_iso8601}),
|
||||
'view_count': ('stats', 'viewTotalCount', {int_or_none}),
|
||||
'like_count': ('stats', 'likeCount', {int_or_none}),
|
||||
'dislike_count': ('stats', 'dislikeCount', {int_or_none}),
|
||||
'comment_count': ('stats', 'commentCount', {int_or_none}),
|
||||
'media_type': ('type', {str}),
|
||||
})),
|
||||
}
|
||||
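The subtitle handling in `PlVideoIE._real_extract` routes any language key ending in `-auto` to `automatic_captions` and everything else to `subtitles`. A minimal standalone sketch of that split follows; `split_subtitles` is a hypothetical name, and the input keys and URLs are invented for illustration rather than taken from the PlVideo API.

```python
def split_subtitles(tracks):
    """Split a {lang: url} mapping into (subtitles, automatic_captions)."""
    subtitles, automatic_captions = {}, {}
    for lang, url in tracks.items():
        if lang.endswith('-auto'):
            # Drop the 5-character '-auto' suffix to recover the plain language code
            automatic_captions.setdefault(lang[:-5], []).append({'url': url})
        else:
            subtitles.setdefault(lang, []).append({'url': url})
    return subtitles, automatic_captions


# Hypothetical input: one manual track and one auto-generated track
subs, auto = split_subtitles({
    'ru': 'https://example.com/ru.vtt',
    'en-auto': 'https://example.com/en.vtt',
})
```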
@@ -7,7 +7,6 @@ from .common import InfoExtractor, SearchInfoExtractor
|
||||
from ..networking import HEADRequest
|
||||
from ..networking.exceptions import HTTPError
|
||||
from ..utils import (
|
||||
-KNOWN_EXTENSIONS,
|
||||
ExtractorError,
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
@@ -211,6 +210,7 @@ class SoundcloudBaseIE(InfoExtractor):
|
||||
|
||||
format_urls = set()
|
||||
formats = []
|
||||
+has_drm = False
|
||||
query = {'client_id': self._CLIENT_ID}
|
||||
if secret_token:
|
||||
query['secret_token'] = secret_token
|
||||
@@ -246,55 +246,24 @@ class SoundcloudBaseIE(InfoExtractor):
|
||||
'url': format_url,
|
||||
'quality': 10,
|
||||
'format_note': 'Original',
|
||||
'vcodec': 'none',
|
||||
})
|
||||
|
||||
def invalid_url(url):
|
||||
return not url or url in format_urls
|
||||
|
||||
def add_format(f, protocol, is_preview=False):
|
||||
mobj = re.search(r'\.(?P<abr>\d+)\.(?P<ext>[0-9a-z]{3,4})(?=[/?])', stream_url)
|
||||
if mobj:
|
||||
for k, v in mobj.groupdict().items():
|
||||
if not f.get(k):
|
||||
f[k] = v
|
||||
format_id_list = []
|
||||
if protocol:
|
||||
format_id_list.append(protocol)
|
||||
ext = f.get('ext')
|
||||
if ext == 'aac':
|
||||
f.update({
|
||||
'abr': 256,
|
||||
'quality': 5,
|
||||
'format_note': 'Premium',
|
||||
})
|
||||
for k in ('ext', 'abr'):
|
||||
v = str_or_none(f.get(k))
|
||||
if v:
|
||||
format_id_list.append(v)
|
||||
preview = is_preview or re.search(r'/(?:preview|playlist)/0/30/', f['url'])
|
||||
if preview:
|
||||
format_id_list.append('preview')
|
||||
abr = f.get('abr')
|
||||
if abr:
|
||||
f['abr'] = int(abr)
|
||||
if protocol in ('hls', 'hls-aes'):
|
||||
protocol = 'm3u8' if ext == 'aac' else 'm3u8_native'
|
||||
else:
|
||||
protocol = 'http'
|
||||
f.update({
|
||||
'format_id': '_'.join(format_id_list),
|
||||
'protocol': protocol,
|
||||
'preference': -10 if preview else None,
|
||||
})
|
||||
formats.append(f)
|
||||
|
||||
# New API
|
||||
for t in traverse_obj(info, ('media', 'transcodings', lambda _, v: url_or_none(v['url']))):
|
||||
for t in traverse_obj(info, ('media', 'transcodings', lambda _, v: url_or_none(v['url']) and v['preset'])):
|
||||
if extract_flat:
|
||||
break
|
||||
format_url = t['url']
|
||||
preset = t['preset']
|
||||
preset_base = preset.partition('_')[0]
|
||||
|
||||
protocol = traverse_obj(t, ('format', 'protocol', {str}))
|
||||
protocol = traverse_obj(t, ('format', 'protocol', {str})) or 'http'
|
||||
if protocol.startswith(('ctr-', 'cbc-')):
|
||||
has_drm = True
|
||||
continue
|
||||
if protocol == 'progressive':
|
||||
protocol = 'http'
|
||||
if protocol != 'hls' and '/hls' in format_url:
|
||||
@@ -302,35 +271,60 @@ class SoundcloudBaseIE(InfoExtractor):
|
||||
if protocol == 'encrypted-hls' or '/encrypted-hls' in format_url:
|
||||
protocol = 'hls-aes'
|
||||
|
||||
ext = None
|
||||
if preset := traverse_obj(t, ('preset', {str_or_none})):
|
||||
ext = preset.split('_')[0]
|
||||
if ext not in KNOWN_EXTENSIONS:
|
||||
ext = mimetype2ext(traverse_obj(t, ('format', 'mime_type', {str})))
|
||||
|
||||
identifier = join_nonempty(protocol, ext, delim='_')
|
||||
if not self._is_requested(identifier):
|
||||
self.write_debug(f'"{identifier}" is not a requested format, skipping')
|
||||
short_identifier = f'{protocol}_{preset_base}'
|
||||
if preset_base == 'abr':
|
||||
self.write_debug(f'Skipping broken "{short_identifier}" format')
|
||||
continue
|
||||
if not self._is_requested(short_identifier):
|
||||
self.write_debug(f'"{short_identifier}" is not a requested format, skipping')
|
||||
continue
|
||||
|
||||
# XXX: if not extract_flat, 429 error must be caught where _extract_info_dict is called
|
||||
stream_url = traverse_obj(self._call_api(
|
||||
format_url, track_id, f'Downloading {identifier} format info JSON',
|
||||
format_url, track_id, f'Downloading {short_identifier} format info JSON',
|
||||
query=query, headers=self._HEADERS), ('url', {url_or_none}))
|
||||
|
||||
if invalid_url(stream_url):
|
||||
continue
|
||||
format_urls.add(stream_url)
|
||||
add_format({
|
||||
|
||||
mime_type = traverse_obj(t, ('format', 'mime_type', {str}))
|
||||
codec = self._search_regex(r'codecs="([^"]+)"', mime_type, 'codec', default=None)
|
||||
ext = {
|
||||
'mp4a': 'm4a',
|
||||
'opus': 'opus',
|
||||
}.get(codec[:4] if codec else None) or mimetype2ext(mime_type, default=None)
|
||||
if not ext or ext == 'm3u8':
|
||||
ext = preset_base
|
||||
|
||||
is_premium = t.get('quality') == 'hq'
|
||||
abr = int_or_none(
|
||||
self._search_regex(r'(\d+)k$', preset, 'abr', default=None)
|
||||
or self._search_regex(r'\.(\d+)\.(?:opus|mp3)[/?]', stream_url, 'abr', default=None)
|
||||
or (256 if (is_premium and 'aac' in preset) else None))
|
||||
|
||||
is_preview = (t.get('snipped')
|
||||
or '/preview/' in format_url
|
||||
or re.search(r'/(?:preview|playlist)/0/30/', stream_url))
|
||||
|
||||
formats.append({
|
||||
'format_id': join_nonempty(protocol, preset, is_preview and 'preview', delim='_'),
|
||||
'url': stream_url,
|
||||
'ext': ext,
|
||||
}, protocol, t.get('snipped') or '/preview/' in format_url)
|
||||
'acodec': codec,
|
||||
'vcodec': 'none',
|
||||
'abr': abr,
|
||||
'protocol': 'm3u8_native' if protocol in ('hls', 'hls-aes') else 'http',
|
||||
'container': 'm4a_dash' if ext == 'm4a' else None,
|
||||
'quality': 5 if is_premium else 0 if (abr and abr >= 160) else -1,
|
||||
'format_note': 'Premium' if is_premium else None,
|
||||
'preference': -10 if is_preview else None,
|
||||
})
|
||||
|
||||
for f in formats:
|
||||
f['vcodec'] = 'none'
|
||||
|
||||
if not formats and info.get('policy') == 'BLOCK':
|
||||
self.raise_geo_restricted(metadata_available=True)
|
||||
if not formats:
|
||||
if has_drm:
|
||||
self.report_drm(track_id)
|
||||
if info.get('policy') == 'BLOCK':
|
||||
self.raise_geo_restricted(metadata_available=True)
|
||||
|
||||
user = info.get('user') or {}
|
||||
|
||||
|
||||
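The new SoundCloud code derives `abr` from three sources in order: a trailing `<n>k` in the preset name, a `.<n>.opus` or `.<n>.mp3` segment in the stream URL, and a fixed 256 for premium AAC. The sketch below reproduces that fallback order with plain `re` so it can be tried in isolation; `guess_abr` is a hypothetical name, and the preset strings and URLs in the usage lines are illustrative assumptions.

```python
import re


def guess_abr(preset, stream_url, is_premium=False):
    """Return an audio bitrate in kbps, or None, using the fallback order from the hunk above."""
    # 1) Preset names such as '..._160k' carry the bitrate as a suffix
    mobj = re.search(r'(\d+)k$', preset)
    if mobj:
        return int(mobj.group(1))
    # 2) Otherwise look for '.<abr>.opus' or '.<abr>.mp3' in the resolved stream URL
    mobj = re.search(r'\.(\d+)\.(?:opus|mp3)[/?]', stream_url)
    if mobj:
        return int(mobj.group(1))
    # 3) Premium AAC is assumed to be 256 kbps when nothing else is available
    if is_premium and 'aac' in preset:
        return 256
    return None


print(guess_abr('aac_160k', 'https://example.com/stream'))
print(guess_abr('opus_0_0', 'https://example.com/media.64.opus/playlist.m3u8'))
```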
@@ -189,26 +189,6 @@ class TumblrIE(InfoExtractor):
|
||||
'release_date': '20140227',
|
||||
},
|
||||
'add_ie': ['Vimeo'],
|
||||
}, {
|
||||
'url': 'http://sutiblr.tumblr.com/post/139638707273',
|
||||
'md5': '2dd184b3669e049ba40563a7d423f95c',
|
||||
'info_dict': {
|
||||
'id': 'ir7qBEIKqvq',
|
||||
'ext': 'mp4',
|
||||
'title': 'Vine by sutiblr',
|
||||
'alt_title': 'Vine by sutiblr',
|
||||
'uploader': 'sutiblr',
|
||||
'uploader_id': '1198993975374495744',
|
||||
'upload_date': '20160220',
|
||||
'like_count': int,
|
||||
'comment_count': int,
|
||||
'repost_count': int,
|
||||
'thumbnail': r're:^https?://.*\.jpg',
|
||||
'timestamp': 1455940159,
|
||||
'view_count': int,
|
||||
},
|
||||
'add_ie': ['Vine'],
|
||||
'skip': 'Vine is unavailable',
|
||||
}, {
|
||||
'url': 'https://silami.tumblr.com/post/84250043974/my-bad-river-flows-in-you-impression-on-maschine',
|
||||
'md5': '3c92d7c3d867f14ccbeefa2119022277',
|
||||
@@ -366,7 +346,6 @@ class TumblrIE(InfoExtractor):
|
||||
_providers = {
|
||||
'instagram': 'Instagram',
|
||||
'vimeo': 'Vimeo',
|
||||
-'vine': 'Vine',
|
||||
'youtube': 'Youtube',
|
||||
'dailymotion': 'Dailymotion',
|
||||
'tiktok': 'TikTok',
|
||||
|
||||
@@ -409,26 +409,6 @@ class TwitterCardIE(InfoExtractor):
|
||||
},
|
||||
'add_ie': ['Youtube'],
|
||||
},
|
||||
{
|
||||
'url': 'https://twitter.com/i/cards/tfw/v1/665289828897005568',
|
||||
'info_dict': {
|
||||
'id': 'iBb2x00UVlv',
|
||||
'ext': 'mp4',
|
||||
'upload_date': '20151113',
|
||||
'uploader_id': '1189339351084113920',
|
||||
'uploader': 'ArsenalTerje',
|
||||
'title': 'Vine by ArsenalTerje',
|
||||
'timestamp': 1447451307,
|
||||
'alt_title': 'Vine by ArsenalTerje',
|
||||
'comment_count': int,
|
||||
'like_count': int,
|
||||
'thumbnail': r're:^https?://[^?#]+\.jpg',
|
||||
'view_count': int,
|
||||
'repost_count': int,
|
||||
},
|
||||
'add_ie': ['Vine'],
|
||||
'params': {'skip_download': 'm3u8'},
|
||||
},
|
||||
{
|
||||
'url': 'https://twitter.com/i/videos/tweet/705235433198714880',
|
||||
'md5': '884812a2adc8aaf6fe52b15ccbfa3b88',
|
||||
@@ -567,25 +547,6 @@ class TwitterIE(TwitterBaseIE):
|
||||
'age_limit': 0,
|
||||
'_old_archive_ids': ['twitter 700207533655363584'],
|
||||
},
|
||||
}, {
|
||||
'url': 'https://twitter.com/Filmdrunk/status/713801302971588609',
|
||||
'md5': '89a15ed345d13b86e9a5a5e051fa308a',
|
||||
'info_dict': {
|
||||
'id': 'MIOxnrUteUd',
|
||||
'ext': 'mp4',
|
||||
'title': 'Dr.Pepperの飲み方 #japanese #バカ #ドクペ #電動ガン',
|
||||
'uploader': 'TAKUMA',
|
||||
'uploader_id': '1004126642786242560',
|
||||
'timestamp': 1402826626,
|
||||
'upload_date': '20140615',
|
||||
'thumbnail': r're:^https?://.*\.jpg',
|
||||
'alt_title': 'Vine by TAKUMA',
|
||||
'comment_count': int,
|
||||
'repost_count': int,
|
||||
'like_count': int,
|
||||
'view_count': int,
|
||||
},
|
||||
'add_ie': ['Vine'],
|
||||
}, {
|
||||
'url': 'https://twitter.com/captainamerica/status/719944021058060289',
|
||||
'info_dict': {
|
||||
|
||||
@@ -421,5 +421,5 @@ class VidyardIE(VidyardBaseIE):
|
||||
return self._process_video_json(video_json['chapters'][0], video_id)
|
||||
|
||||
return self.playlist_result(
|
||||
-[self._process_video_json(chapter, video_id) for chapter in video_json['chapters']],
+(self._process_video_json(chapter, video_id) for chapter in video_json['chapters']),
|
||||
str(video_json['playerUuid']), video_json.get('name'))
|
||||
|
||||
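Several hunks in this comparison swap a list comprehension for a generator expression or `map` when building playlist entries. The diff does not state the motivation; one observable difference is that the per-entry work is deferred until the entries are consumed. The sketch below shows only that general Python behaviour; `make_entries` and `process` are hypothetical names, not yt-dlp APIs.

```python
def make_entries(items, process, lazy):
    """Build playlist-style entries either eagerly (list) or lazily (generator)."""
    if lazy:
        # Nothing is processed until the caller iterates over the result
        return (process(i) for i in items)
    # Every item is processed immediately, before the list is returned
    return [process(i) for i in items]


calls = []


def process(item):
    calls.append(item)
    return {'id': item}


entries = make_entries(['a', 'b'], process, lazy=True)
print(calls)          # no calls recorded yet
first = next(entries)
print(calls)          # only the first item has been processed
```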
@@ -1,150 +0,0 @@
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
format_field,
|
||||
int_or_none,
|
||||
unified_timestamp,
|
||||
)
|
||||
|
||||
|
||||
class VineIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?vine\.co/(?:v|oembed)/(?P<id>\w+)'
|
||||
_EMBED_REGEX = [r'<iframe[^>]+src=[\'"](?P<url>(?:https?:)?//(?:www\.)?vine\.co/v/[^/]+/embed/(?:simple|postcard))']
|
||||
_TESTS = [{
|
||||
'url': 'https://vine.co/v/b9KOOWX7HUx',
|
||||
'md5': '2f36fed6235b16da96ce9b4dc890940d',
|
||||
'info_dict': {
|
||||
'id': 'b9KOOWX7HUx',
|
||||
'ext': 'mp4',
|
||||
'title': 'Chicken.',
|
||||
'alt_title': 'Vine by Jack',
|
||||
'timestamp': 1368997951,
|
||||
'upload_date': '20130519',
|
||||
'uploader': 'Jack',
|
||||
'uploader_id': '76',
|
||||
'view_count': int,
|
||||
'like_count': int,
|
||||
'comment_count': int,
|
||||
'repost_count': int,
|
||||
},
|
||||
}, {
|
||||
'url': 'https://vine.co/v/e192BnZnZ9V',
|
||||
'info_dict': {
|
||||
'id': 'e192BnZnZ9V',
|
||||
'ext': 'mp4',
|
||||
'title': 'ยิ้ม~ เขิน~ อาย~ น่าร้ากอ้ะ >//< @n_whitewo @orlameena #lovesicktheseries #lovesickseason2',
|
||||
'alt_title': 'Vine by Pimry_zaa',
|
||||
'timestamp': 1436057405,
|
||||
'upload_date': '20150705',
|
||||
'uploader': 'Pimry_zaa',
|
||||
'uploader_id': '1135760698325307392',
|
||||
'view_count': int,
|
||||
'like_count': int,
|
||||
'comment_count': int,
|
||||
'repost_count': int,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'https://vine.co/v/MYxVapFvz2z',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://vine.co/v/bxVjBbZlPUH',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://vine.co/oembed/MYxVapFvz2z.json',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
data = self._download_json(
|
||||
f'https://archive.vine.co/posts/{video_id}.json', video_id)
|
||||
|
||||
def video_url(kind):
|
||||
for url_suffix in ('Url', 'URL'):
|
||||
format_url = data.get(f'video{kind}{url_suffix}')
|
||||
if format_url:
|
||||
return format_url
|
||||
|
||||
formats = []
|
||||
for quality, format_id in enumerate(('low', '', 'dash')):
|
||||
format_url = video_url(format_id.capitalize())
|
||||
if not format_url:
|
||||
continue
|
||||
# DASH link returns plain mp4
|
||||
if format_id == 'dash' and determine_ext(format_url) == 'mpd':
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
format_url, video_id, mpd_id='dash', fatal=False))
|
||||
else:
|
||||
formats.append({
|
||||
'url': format_url,
|
||||
'format_id': format_id or 'standard',
|
||||
'quality': quality,
|
||||
})
|
||||
self._check_formats(formats, video_id)
|
||||
|
||||
username = data.get('username')
|
||||
|
||||
alt_title = format_field(username, None, 'Vine by %s')
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': data.get('description') or alt_title or 'Vine video',
|
||||
'alt_title': alt_title,
|
||||
'thumbnail': data.get('thumbnailUrl'),
|
||||
'timestamp': unified_timestamp(data.get('created')),
|
||||
'uploader': username,
|
||||
'uploader_id': data.get('userIdStr'),
|
||||
'view_count': int_or_none(data.get('loops')),
|
||||
'like_count': int_or_none(data.get('likes')),
|
||||
'comment_count': int_or_none(data.get('comments')),
|
||||
'repost_count': int_or_none(data.get('reposts')),
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
|
||||
class VineUserIE(InfoExtractor):
|
||||
IE_NAME = 'vine:user'
|
||||
_VALID_URL = r'https?://vine\.co/(?P<u>u/)?(?P<user>[^/]+)'
|
||||
_VINE_BASE_URL = 'https://vine.co/'
|
||||
_TESTS = [{
|
||||
'url': 'https://vine.co/itsruthb',
|
||||
'info_dict': {
|
||||
'id': 'itsruthb',
|
||||
'title': 'Ruth B',
|
||||
'description': '| Instagram/Twitter: itsruthb | still a lost boy from neverland',
|
||||
},
|
||||
'playlist_mincount': 611,
|
||||
}, {
|
||||
'url': 'https://vine.co/u/942914934646415360',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@classmethod
|
||||
def suitable(cls, url):
|
||||
return False if VineIE.suitable(url) else super().suitable(url)
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = self._match_valid_url(url)
|
||||
user = mobj.group('user')
|
||||
u = mobj.group('u')
|
||||
|
||||
profile_url = '{}api/users/profiles/{}{}'.format(
|
||||
self._VINE_BASE_URL, 'vanity/' if not u else '', user)
|
||||
profile_data = self._download_json(
|
||||
profile_url, user, note='Downloading user profile data')
|
||||
|
||||
data = profile_data['data']
|
||||
user_id = data.get('userId') or data['userIdStr']
|
||||
profile = self._download_json(
|
||||
f'https://archive.vine.co/profiles/{user_id}.json', user_id)
|
||||
entries = [
|
||||
self.url_result(
|
||||
f'https://vine.co/v/{post_id}', ie='Vine', video_id=post_id)
|
||||
for post_id in profile['posts']
|
||||
if post_id and isinstance(post_id, str)]
|
||||
return self.playlist_result(
|
||||
entries, user, profile.get('username'), profile.get('description'))
|
||||
@@ -10,7 +10,7 @@ from ..utils.traversal import traverse_obj
|
||||
|
||||
|
||||
class XiaoHongShuIE(InfoExtractor):
|
||||
-_VALID_URL = r'https?://www\.xiaohongshu\.com/explore/(?P<id>[\da-f]+)'
+_VALID_URL = r'https?://www\.xiaohongshu\.com/(?:explore|discovery/item)/(?P<id>[\da-f]+)'
|
||||
IE_DESC = '小红书'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.xiaohongshu.com/explore/6411cf99000000001300b6d9',
|
||||
@@ -25,6 +25,18 @@ class XiaoHongShuIE(InfoExtractor):
|
||||
'duration': 101.726,
|
||||
'thumbnail': r're:https?://sns-webpic-qc\.xhscdn\.com/\d+/[a-z0-9]+/[\w]+',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://www.xiaohongshu.com/discovery/item/674051740000000007027a15?xsec_token=CBgeL8Dxd1ZWBhwqRd568gAZ_iwG-9JIf9tnApNmteU2E=',
|
||||
'info_dict': {
|
||||
'id': '674051740000000007027a15',
|
||||
'ext': 'mp4',
|
||||
'title': '相互喜欢就可以了',
|
||||
'uploader_id': '63439913000000001901f49a',
|
||||
'duration': 28.073,
|
||||
'description': '#广州[话题]# #深圳[话题]# #香港[话题]# #街头采访[话题]# #是你喜欢的类型[话题]#',
|
||||
'thumbnail': r're:https?://sns-webpic-qc\.xhscdn\.com/\d+/[\da-f]+/[^/]+',
|
||||
'tags': ['广州', '深圳', '香港', '街头采访', '是你喜欢的类型'],
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
||||
@@ -32,7 +32,6 @@ from ..utils import (
|
||||
classproperty,
|
||||
clean_html,
|
||||
datetime_from_str,
|
||||
-dict_get,
|
||||
filesize_from_tbr,
|
||||
filter_dict,
|
||||
float_or_none,
|
||||
@@ -78,7 +77,7 @@ INNERTUBE_CLIENTS = {
|
||||
'INNERTUBE_CONTEXT': {
|
||||
'client': {
|
||||
'clientName': 'WEB',
|
||||
-'clientVersion': '2.20240726.00.00',
+'clientVersion': '2.20241126.01.00',
|
||||
},
|
||||
},
|
||||
'INNERTUBE_CONTEXT_CLIENT_NAME': 1,
|
||||
@@ -90,7 +89,7 @@ INNERTUBE_CLIENTS = {
|
||||
'INNERTUBE_CONTEXT': {
|
||||
'client': {
|
||||
'clientName': 'WEB',
|
||||
-'clientVersion': '2.20240726.00.00',
+'clientVersion': '2.20241126.01.00',
|
||||
'userAgent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.5 Safari/605.1.15,gzip(gfe)',
|
||||
},
|
||||
},
|
||||
@@ -102,7 +101,7 @@ INNERTUBE_CLIENTS = {
|
||||
'INNERTUBE_CONTEXT': {
|
||||
'client': {
|
||||
'clientName': 'WEB_EMBEDDED_PLAYER',
|
||||
-'clientVersion': '1.20240723.01.00',
+'clientVersion': '1.20241201.00.00',
|
||||
},
|
||||
},
|
||||
'INNERTUBE_CONTEXT_CLIENT_NAME': 56,
|
||||
@@ -113,7 +112,7 @@ INNERTUBE_CLIENTS = {
|
||||
'INNERTUBE_CONTEXT': {
|
||||
'client': {
|
||||
'clientName': 'WEB_REMIX',
|
||||
-'clientVersion': '1.20240724.00.00',
+'clientVersion': '1.20241127.01.00',
|
||||
},
|
||||
},
|
||||
'INNERTUBE_CONTEXT_CLIENT_NAME': 67,
|
||||
@@ -124,7 +123,7 @@ INNERTUBE_CLIENTS = {
|
||||
'INNERTUBE_CONTEXT': {
|
||||
'client': {
|
||||
'clientName': 'WEB_CREATOR',
|
||||
-'clientVersion': '1.20240723.03.00',
+'clientVersion': '1.20241203.01.00',
|
||||
},
|
||||
},
|
||||
'INNERTUBE_CONTEXT_CLIENT_NAME': 62,
|
||||
@@ -162,7 +161,6 @@ INNERTUBE_CLIENTS = {
|
||||
'REQUIRE_JS_PLAYER': False,
|
||||
'REQUIRE_PO_TOKEN': True,
|
||||
'REQUIRE_AUTH': True,
|
||||
'SUPPORTS_COOKIES': True,
|
||||
},
|
||||
# This client now requires sign-in for every video
|
||||
'android_creator': {
|
||||
@@ -197,7 +195,6 @@ INNERTUBE_CLIENTS = {
|
||||
},
|
||||
'INNERTUBE_CONTEXT_CLIENT_NAME': 28,
|
||||
'REQUIRE_JS_PLAYER': False,
|
||||
'SUPPORTS_COOKIES': True,
|
||||
},
|
||||
# iOS clients have HLS live streams. Setting device model to get 60fps formats.
|
||||
# See: https://github.com/TeamNewPipe/NewPipeExtractor/issues/680#issuecomment-1002724558
|
||||
@@ -214,6 +211,7 @@ INNERTUBE_CLIENTS = {
|
||||
},
|
||||
},
|
||||
'INNERTUBE_CONTEXT_CLIENT_NAME': 5,
|
||||
'REQUIRE_PO_TOKEN': True,
|
||||
'REQUIRE_JS_PLAYER': False,
|
||||
},
|
||||
# This client now requires sign-in for every video
|
||||
@@ -232,7 +230,6 @@ INNERTUBE_CLIENTS = {
|
||||
'INNERTUBE_CONTEXT_CLIENT_NAME': 26,
|
||||
'REQUIRE_JS_PLAYER': False,
|
||||
'REQUIRE_AUTH': True,
|
||||
'SUPPORTS_COOKIES': True,
|
||||
},
|
||||
# This client now requires sign-in for every video
|
||||
'ios_creator': {
|
||||
@@ -257,17 +254,20 @@ INNERTUBE_CLIENTS = {
|
||||
'INNERTUBE_CONTEXT': {
|
||||
'client': {
|
||||
'clientName': 'MWEB',
|
||||
'clientVersion': '2.20240726.01.00',
|
||||
'clientVersion': '2.20241202.07.00',
|
||||
# mweb previously did not require PO Token with this UA
|
||||
'userAgent': 'Mozilla/5.0 (iPad; CPU OS 16_7_10 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Mobile/15E148 Safari/604.1,gzip(gfe)',
|
||||
},
|
||||
},
|
||||
'INNERTUBE_CONTEXT_CLIENT_NAME': 2,
|
||||
'REQUIRE_PO_TOKEN': True,
|
||||
'SUPPORTS_COOKIES': True,
|
||||
},
|
||||
'tv': {
|
||||
'INNERTUBE_CONTEXT': {
|
||||
'client': {
|
||||
'clientName': 'TVHTML5',
|
||||
'clientVersion': '7.20240724.13.00',
|
||||
'clientVersion': '7.20241201.18.00',
|
||||
},
|
||||
},
|
||||
'INNERTUBE_CONTEXT_CLIENT_NAME': 7,
|
||||
@@ -517,11 +517,12 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
|
||||
return self._search_regex(rf'^({self._YT_CHANNEL_UCID_RE})$', ucid, 'UC-id', default=None)
|
||||
|
||||
def handle_or_none(self, handle):
|
||||
-return self._search_regex(rf'^({self._YT_HANDLE_RE})$', handle, '@-handle', default=None)
+return self._search_regex(rf'^({self._YT_HANDLE_RE})$', urllib.parse.unquote(handle or ''),
+'@-handle', default=None)
|
||||
|
||||
def handle_from_url(self, url):
|
||||
return self._search_regex(rf'^(?:https?://(?:www\.)?youtube\.com)?/({self._YT_HANDLE_RE})',
|
||||
-url, 'channel handle', default=None)
+urllib.parse.unquote(url or ''), 'channel handle', default=None)
|
||||
|
||||
def ucid_from_url(self, url):
|
||||
return self._search_regex(rf'^(?:https?://(?:www\.)?youtube\.com)?/({self._YT_CHANNEL_UCID_RE})',
|
||||
@@ -566,9 +567,15 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
|
||||
pref.update({'hl': self._preferred_lang or 'en', 'tz': 'UTC'})
|
||||
self._set_cookie('.youtube.com', name='PREF', value=urllib.parse.urlencode(pref))
|
||||
|
||||
def _initialize_cookie_auth(self):
|
||||
yt_sapisid, yt_1psapisid, yt_3psapisid = self._get_sid_cookies()
|
||||
if yt_sapisid or yt_1psapisid or yt_3psapisid:
|
||||
self.write_debug('Found YouTube account cookies')
|
||||
|
||||
def _real_initialize(self):
|
||||
self._initialize_pref()
|
||||
self._initialize_consent()
|
||||
self._initialize_cookie_auth()
|
||||
self._check_login_required()
|
||||
|
||||
def _perform_login(self, username, password):
|
||||
@@ -626,32 +633,63 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
|
||||
client_context.update({'hl': self._preferred_lang or 'en', 'timeZone': 'UTC', 'utcOffsetMinutes': 0})
|
||||
return context
|
||||
|
||||
_SAPISID = None
|
||||
@staticmethod
|
||||
def _make_sid_authorization(scheme, sid, origin, additional_parts):
|
||||
timestamp = str(round(time.time()))
|
||||
|
||||
def _generate_sapisidhash_header(self, origin='https://www.youtube.com'):
|
||||
time_now = round(time.time())
|
||||
if self._SAPISID is None:
|
||||
yt_cookies = self._get_cookies('https://www.youtube.com')
|
||||
# Sometimes SAPISID cookie isn't present but __Secure-3PAPISID is.
|
||||
# See: https://github.com/yt-dlp/yt-dlp/issues/393
|
||||
sapisid_cookie = dict_get(
|
||||
yt_cookies, ('__Secure-3PAPISID', 'SAPISID'))
|
||||
if sapisid_cookie and sapisid_cookie.value:
|
||||
self._SAPISID = sapisid_cookie.value
|
||||
self.write_debug('Extracted SAPISID cookie')
|
||||
# SAPISID cookie is required if not already present
|
||||
if not yt_cookies.get('SAPISID'):
|
||||
self.write_debug('Copying __Secure-3PAPISID cookie to SAPISID cookie')
|
||||
self._set_cookie(
|
||||
'.youtube.com', 'SAPISID', self._SAPISID, secure=True, expire_time=time_now + 3600)
|
||||
else:
|
||||
self._SAPISID = False
|
||||
if not self._SAPISID:
|
||||
hash_parts = []
|
||||
if additional_parts:
|
||||
hash_parts.append(':'.join(additional_parts.values()))
|
||||
hash_parts.extend([timestamp, sid, origin])
|
||||
sidhash = hashlib.sha1(' '.join(hash_parts).encode()).hexdigest()
|
||||
|
||||
parts = [timestamp, sidhash]
|
||||
if additional_parts:
|
||||
parts.append(''.join(additional_parts))
|
||||
|
||||
return f'{scheme} {"_".join(parts)}'
|
||||
|
||||
def _get_sid_cookies(self):
|
||||
"""
|
||||
Get SAPISID, 1PSAPISID, 3PSAPISID cookie values
|
||||
@returns sapisid, 1psapisid, 3psapisid
|
||||
"""
|
||||
yt_cookies = self._get_cookies('https://www.youtube.com')
|
||||
yt_sapisid = try_call(lambda: yt_cookies['SAPISID'].value)
|
||||
yt_3papisid = try_call(lambda: yt_cookies['__Secure-3PAPISID'].value)
|
||||
yt_1papisid = try_call(lambda: yt_cookies['__Secure-1PAPISID'].value)
|
||||
|
||||
# Sometimes SAPISID cookie isn't present but __Secure-3PAPISID is.
|
||||
# YouTube also falls back to __Secure-3PAPISID if SAPISID is missing.
|
||||
# See: https://github.com/yt-dlp/yt-dlp/issues/393
|
||||
|
||||
return yt_sapisid or yt_3papisid, yt_1papisid, yt_3papisid
|
||||
|
||||
def _get_sid_authorization_header(self, origin='https://www.youtube.com', user_session_id=None):
|
||||
"""
|
||||
Generate API Session ID Authorization for Innertube requests. Assumes all requests are secure (https).
|
||||
@param origin: Origin URL
|
||||
@param user_session_id: Optional User Session ID
|
||||
@return: Authorization header value
|
||||
"""
|
||||
|
||||
authorizations = []
|
||||
additional_parts = {}
|
||||
if user_session_id:
|
||||
additional_parts['u'] = user_session_id
|
||||
|
||||
yt_sapisid, yt_1psapisid, yt_3psapisid = self._get_sid_cookies()
|
||||
|
||||
for scheme, sid in (('SAPISIDHASH', yt_sapisid),
|
||||
('SAPISID1PHASH', yt_1psapisid),
|
||||
('SAPISID3PHASH', yt_3psapisid)):
|
||||
if sid:
|
||||
authorizations.append(self._make_sid_authorization(scheme, sid, origin, additional_parts))
|
||||
|
||||
if not authorizations:
|
||||
return None
|
||||
# SAPISIDHASH algorithm from https://stackoverflow.com/a/32065323
|
||||
sapisidhash = hashlib.sha1(
|
||||
f'{time_now} {self._SAPISID} {origin}'.encode()).hexdigest()
|
||||
return f'SAPISIDHASH {time_now}_{sapisidhash}'
|
||||
|
||||
return ' '.join(authorizations)
|
||||
|
||||
def _call_api(self, ep, query, video_id, fatal=True, headers=None,
|
||||
note='Downloading API JSON', errnote='Unable to download API page',
|
||||
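The new `_make_sid_authorization` helper builds each `SAPISIDHASH`-style token from a timestamp, the cookie value, the origin, and an optional user session ID. Below is a standalone sketch of the same construction that follows the logic visible in the hunk above; the `timestamp` parameter is an addition made here so the output is reproducible, and the cookie value and session ID in the usage lines are placeholders.

```python
import hashlib
import time


def make_sid_authorization(scheme, sid, origin, additional_parts=None, timestamp=None):
    """Return '<scheme> <timestamp>_<sha1>[_<keys>]' following the hunk above."""
    additional_parts = additional_parts or {}
    # `timestamp` is injectable here only to make the sketch deterministic
    timestamp = str(round(time.time())) if timestamp is None else str(timestamp)

    hash_parts = []
    if additional_parts:
        # Values of the additional parts (e.g. the user session ID) are hashed first
        hash_parts.append(':'.join(additional_parts.values()))
    hash_parts.extend([timestamp, sid, origin])
    sidhash = hashlib.sha1(' '.join(hash_parts).encode()).hexdigest()

    parts = [timestamp, sidhash]
    if additional_parts:
        # Keys of the additional parts (e.g. 'u') are appended to the token
        parts.append(''.join(additional_parts))
    return f'{scheme} {"_".join(parts)}'


# Placeholder cookie value and session ID, for illustration only
print(make_sid_authorization(
    'SAPISIDHASH', 'cookie-value', 'https://www.youtube.com',
    {'u': 'user-session-id'}, timestamp=1700000000))
```

In the diff, `_get_sid_authorization_header` joins one such token per available cookie (`SAPISIDHASH`, `SAPISID1PHASH`, `SAPISID3PHASH`) with spaces to form the final header value.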
@@ -687,26 +725,48 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
if session_index is not None:
return session_index

def _data_sync_id_to_delegated_session_id(self, data_sync_id):
if not data_sync_id:
return
# datasyncid is of the form "channel_syncid||user_syncid" for secondary channel
# and just "user_syncid||" for primary channel. We only want the channel_syncid
channel_syncid, _, user_syncid = data_sync_id.partition('||')
if user_syncid:
return channel_syncid

def _extract_account_syncid(self, *args):
@staticmethod
def _parse_data_sync_id(data_sync_id):
"""
Extract current session ID required to download private playlists of secondary channels
Parse data_sync_id into delegated_session_id and user_session_id.

data_sync_id is of the form "delegated_session_id||user_session_id" for secondary channel
and just "user_session_id||" for primary channel.

@param data_sync_id: data_sync_id string
@return: Tuple of (delegated_session_id, user_session_id)
"""
if not data_sync_id:
return None, None
first, _, second = data_sync_id.partition('||')
if second:
return first, second
return None, first

def _extract_delegated_session_id(self, *args):
"""
Extract current delegated session ID required to download private playlists of secondary channels
@params response and/or ytcfg
@return: delegated session ID
"""
# ytcfg includes channel_syncid if on secondary channel
if delegated_sid := traverse_obj(args, (..., 'DELEGATED_SESSION_ID', {str}, any)):
return delegated_sid

data_sync_id = self._extract_data_sync_id(*args)
return self._data_sync_id_to_delegated_session_id(data_sync_id)
return self._parse_data_sync_id(data_sync_id)[0]

def _extract_user_session_id(self, *args):
"""
Extract current user session ID
@params response and/or ytcfg
@return: user session ID
"""
if user_sid := traverse_obj(args, (..., 'USER_SESSION_ID', {str}, any)):
return user_sid

data_sync_id = self._extract_data_sync_id(*args)
return self._parse_data_sync_id(data_sync_id)[1]

def _extract_data_sync_id(self, *args):
"""
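The new `_parse_data_sync_id` helper in the hunk above can be tried in isolation. The following is a minimal standalone sketch that mirrors the logic shown in the diff; the free-function name `parse_data_sync_id` and the sample ID strings are illustrative assumptions and are not part of yt-dlp.

```python
def parse_data_sync_id(data_sync_id):
    """Split a data_sync_id into (delegated_session_id, user_session_id)."""
    if not data_sync_id:
        return None, None
    first, _, second = data_sync_id.partition('||')
    if second:
        # Secondary channel: "delegated_session_id||user_session_id"
        return first, second
    # Primary channel: "user_session_id||"
    return None, first


# Made-up IDs, for illustration only
print(parse_data_sync_id('CHANNEL123||USER456'))
print(parse_data_sync_id('USER456||'))
print(parse_data_sync_id(None))
```

Returning a tuple lets callers pick the delegated ID with `[0]` and the user ID with `[1]`, which is how the two `_extract_*_session_id` methods in the diff use it.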
@@ -733,7 +793,7 @@ class YoutubeBaseInfoExtractor(InfoExtractor):

@functools.cached_property
def is_authenticated(self):
return bool(self._generate_sapisidhash_header())
return bool(self._get_sid_authorization_header())

def extract_ytcfg(self, video_id, webpage):
if not webpage:
@@ -743,25 +803,28 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
r'ytcfg\.set\s*\(\s*({.+?})\s*\)\s*;', webpage, 'ytcfg',
default='{}'), video_id, fatal=False) or {}

def _generate_cookie_auth_headers(self, *, ytcfg=None, account_syncid=None, session_index=None, origin=None, **kwargs):
def _generate_cookie_auth_headers(self, *, ytcfg=None, delegated_session_id=None, user_session_id=None, session_index=None, origin=None, **kwargs):
headers = {}
account_syncid = account_syncid or self._extract_account_syncid(ytcfg)
if account_syncid:
headers['X-Goog-PageId'] = account_syncid
delegated_session_id = delegated_session_id or self._extract_delegated_session_id(ytcfg)
if delegated_session_id:
headers['X-Goog-PageId'] = delegated_session_id
if session_index is None:
session_index = self._extract_session_index(ytcfg)
if account_syncid or session_index is not None:
if delegated_session_id or session_index is not None:
headers['X-Goog-AuthUser'] = session_index if session_index is not None else 0

auth = self._generate_sapisidhash_header(origin)
auth = self._get_sid_authorization_header(origin, user_session_id=user_session_id or self._extract_user_session_id(ytcfg))
if auth is not None:
headers['Authorization'] = auth
headers['X-Origin'] = origin

if traverse_obj(ytcfg, 'LOGGED_IN', expected_type=bool):
headers['X-Youtube-Bootstrap-Logged-In'] = 'true'

return headers

def generate_api_headers(
self, *, ytcfg=None, account_syncid=None, session_index=None,
self, *, ytcfg=None, delegated_session_id=None, user_session_id=None, session_index=None,
visitor_data=None, api_hostname=None, default_client='web', **kwargs):

origin = 'https://' + (self._select_api_hostname(api_hostname, default_client))
@@ -772,7 +835,12 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
'Origin': origin,
'X-Goog-Visitor-Id': visitor_data or self._extract_visitor_data(ytcfg),
'User-Agent': self._ytcfg_get_safe(ytcfg, lambda x: x['INNERTUBE_CONTEXT']['client']['userAgent'], default_client=default_client),
**self._generate_cookie_auth_headers(ytcfg=ytcfg, account_syncid=account_syncid, session_index=session_index, origin=origin),
**self._generate_cookie_auth_headers(
ytcfg=ytcfg,
delegated_session_id=delegated_session_id,
user_session_id=user_session_id,
session_index=session_index,
origin=origin),
}
return filter_dict(headers)

@@ -1355,8 +1423,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'401': {'ext': 'mp4', 'height': 2160, 'format_note': 'DASH video', 'vcodec': 'av01.0.12M.08'},
}
_SUBTITLE_FORMATS = ('json3', 'srv1', 'srv2', 'srv3', 'ttml', 'vtt')
_DEFAULT_CLIENTS = ('ios', 'mweb')
_DEFAULT_AUTHED_CLIENTS = ('web_creator', 'mweb')
_DEFAULT_CLIENTS = ('ios', 'tv')
_DEFAULT_AUTHED_CLIENTS = ('web_creator', 'tv')

_GEO_BYPASS = False

@@ -1494,7 +1562,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
},
# Age-gate videos. See https://github.com/yt-dlp/yt-dlp/pull/575#issuecomment-888837000
{
'note': 'Embed allowed age-gate video',
'note': 'Embed allowed age-gate video; works with web_embedded',
'url': 'https://youtube.com/watch?v=HtVdAasjOgU',
'info_dict': {
'id': 'HtVdAasjOgU',
@@ -1524,7 +1592,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'heatmap': 'count:100',
'timestamp': 1401991663,
},
'skip': 'Age-restricted; requires authentication',
},
{
'note': 'Age-gate video with embed allowed in public site',
@@ -2800,6 +2867,35 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'extractor_args': {'youtube': {'player_client': ['ios'], 'player_skip': ['webpage']}},
},
},
{
# uploader_id has non-ASCII characters that are percent-encoded in YT's JSON
'url': 'https://www.youtube.com/shorts/18NGQq7p3LY',
'info_dict': {
'id': '18NGQq7p3LY',
'ext': 'mp4',
'title': '아이브 이서 장원영 리즈 삐끼삐끼 챌린지',
'description': '',
'uploader': 'ㅇㅇ',
'uploader_id': '@으아-v1k',
'uploader_url': 'https://www.youtube.com/@으아-v1k',
'channel': 'ㅇㅇ',
'channel_id': 'UCC25oTm2J7ZVoi5TngOHg9g',
'channel_url': 'https://www.youtube.com/channel/UCC25oTm2J7ZVoi5TngOHg9g',
'thumbnail': r're:https?://.+/.+\.jpg',
'playable_in_embed': True,
'age_limit': 0,
'duration': 3,
'timestamp': 1724306170,
'upload_date': '20240822',
'availability': 'public',
'live_status': 'not_live',
'view_count': int,
'like_count': int,
'channel_follower_count': int,
'categories': ['People & Blogs'],
'tags': [],
},
},
]

_WEBPAGE_TESTS = [
@@ -3118,19 +3214,26 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
self.to_screen('Extracted signature function:\n' + code)

def _parse_sig_js(self, jscode):
# Examples where `sig` is funcname:
# sig=function(a){a=a.split(""); ... ;return a.join("")};
# ;c&&(c=sig(decodeURIComponent(c)),a.set(b,encodeURIComponent(c)));return a};
# {var l=f,m=h.sp,n=sig(decodeURIComponent(h.s));l.set(m,encodeURIComponent(n))}
# sig=function(J){J=J.split(""); ... ;return J.join("")};
# ;N&&(N=sig(decodeURIComponent(N)),J.set(R,encodeURIComponent(N)));return J};
# {var H=u,k=f.sp,v=sig(decodeURIComponent(f.s));H.set(k,encodeURIComponent(v))}
funcname = self._search_regex(
(r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
(r'\b(?P<var>[a-zA-Z0-9_$]+)&&\((?P=var)=(?P<sig>[a-zA-Z0-9_$]{2,})\(decodeURIComponent\((?P=var)\)\)',
r'(?P<sig>[a-zA-Z0-9_$]+)\s*=\s*function\(\s*(?P<arg>[a-zA-Z0-9_$]+)\s*\)\s*{\s*(?P=arg)\s*=\s*(?P=arg)\.split\(\s*""\s*\)\s*;\s*[^}]+;\s*return\s+(?P=arg)\.join\(\s*""\s*\)',
r'(?:\b|[^a-zA-Z0-9_$])(?P<sig>[a-zA-Z0-9_$]{2,})\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)(?:;[a-zA-Z0-9_$]{2}\.[a-zA-Z0-9_$]{2}\(a,\d+\))?',
# Old patterns
r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
r'\bm=(?P<sig>[a-zA-Z0-9$]{2,})\(decodeURIComponent\(h\.s\)\)',
r'\bc&&\(c=(?P<sig>[a-zA-Z0-9$]{2,})\(decodeURIComponent\(c\)\)',
r'(?:\b|[^a-zA-Z0-9$])(?P<sig>[a-zA-Z0-9$]{2,})\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)(?:;[a-zA-Z0-9$]{2}\.[a-zA-Z0-9$]{2}\(a,\d+\))?',
r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
# Obsolete patterns
r'("|\')signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<sig>[a-zA-Z0-9$]+)\(',
r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('),
jscode, 'Initial JS player signature function name', group='sig')

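To see why the hunk above moves to a pattern with a named backreference, it may help to run the newest pattern against a fragment shaped like the example comment in the diff. This is only a sketch: `sample_js` is a made-up fragment based on that comment, not real player code, and the variable names `SIG_PATTERN` and `OLD_PATTERN` are introduced here for illustration.

```python
import re

# Newest pattern from the hunk above: the variable name is captured once
# and then required to repeat via the (?P=var) backreference.
SIG_PATTERN = (
    r'\b(?P<var>[a-zA-Z0-9_$]+)&&\((?P=var)=(?P<sig>[a-zA-Z0-9_$]{2,})'
    r'\(decodeURIComponent\((?P=var)\)\)')

# Older pattern from the same hunk, which hardcodes the letters c/s and a/d/f.
OLD_PATTERN = (
    r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*'
    r'(?P<sig>[a-zA-Z0-9$]+)\(')

# Hypothetical fragment modelled on the comment in the diff
sample_js = ';N&&(N=sig(decodeURIComponent(N)),J.set(R,encodeURIComponent(N)));return J};'

new_match = re.search(SIG_PATTERN, sample_js)
old_match = re.search(OLD_PATTERN, sample_js)
print(new_match.group('sig') if new_match else None)
print(old_match)
```

Because the new pattern does not depend on specific single-letter variable names, it keeps matching when the minifier renames locals, which is the situation the updated example comments describe.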
@@ -3204,6 +3307,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# * a.D&&(b="nn"[+a.D],c=a.get(b))&&(c=narray[idx](c),a.set(b,c),narray.length||nfunc("")
# * a.D&&(PL(a),b=a.j.n||null)&&(b=narray[0](b),a.set("n",b),narray.length||nfunc("")
# * a.D&&(b="nn"[+a.D],vL(a),c=a.j[b]||null)&&(c=narray[idx](c),a.set(b,c),narray.length||nfunc("")
# * J.J="";J.url="";J.Z&&(R="nn"[+J.Z],mW(J),N=J.K[R]||null)&&(N=narray[idx](N),J.set(R,N))}};
funcname, idx = self._search_regex(
r'''(?x)
(?:
@@ -3220,7 +3324,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
)\)&&\(c=|
\b(?P<var>[a-zA-Z0-9_$]+)=
)(?P<nfunc>[a-zA-Z0-9_$]+)(?:\[(?P<idx>\d+)\])?\([a-zA-Z]\)
(?(var),[a-zA-Z0-9_$]+\.set\("n"\,(?P=var)\),(?P=nfunc)\.length)''',
(?(var),[a-zA-Z0-9_$]+\.set\((?:"n+"|[a-zA-Z0-9_$]+)\,(?P=var)\))''',
jscode, 'n function name', group=('nfunc', 'idx'), default=(None, None))
if not funcname:
self.report_warning(join_nonempty(
@@ -3229,7 +3333,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
return self._search_regex(
r'''(?xs)
;\s*(?P<name>[a-zA-Z0-9_$]+)\s*=\s*function\([a-zA-Z0-9_$]+\)
\s*\{(?:(?!};).)+?["']enhanced_except_''',
\s*\{(?:(?!};).)+?return\s*(?P<q>["'])[\w-]+_w8_(?P=q)\s*\+\s*[a-zA-Z0-9_$]+''',
jscode, 'Initial JS player n function name', group='name')
elif not idx:
return funcname
@@ -3238,6 +3342,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
rf'var {re.escape(funcname)}\s*=\s*(\[.+?\])\s*[,;]', jscode,
f'Initial JS player n function list ({funcname}.{idx})')))[int(idx)]

def _fixup_n_function_code(self, argnames, code):
return argnames, re.sub(
rf';\s*if\s*\(\s*typeof\s+[a-zA-Z0-9_$]+\s*===?\s*(["\'])undefined\1\s*\)\s*return\s+{argnames[0]};',
';', code)

def _extract_n_function_code(self, video_id, player_url):
player_id = self._extract_player_info(player_url)
func_code = self.cache.load('youtube-nsig', player_id, min_ver='2024.07.09')
@@ -3249,7 +3358,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):

func_name = self._extract_n_function_name(jscode, player_url=player_url)

func_code = jsi.extract_function_code(func_name)
# XXX: Workaround for the `typeof` gotcha
func_code = self._fixup_n_function_code(*jsi.extract_function_code(func_name))

self.cache.store('youtube-nsig', player_id, func_code)
return jsi, player_id, func_code
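The `typeof` workaround in the two hunks above is a single `re.sub` that strips an early-return guard from the extracted n function before it is cached. A minimal standalone sketch follows, reusing the regex from the diff; the free-function name `fixup_n_function_code`, the argument name `a`, and the JavaScript body in `code` are hypothetical and chosen only to demonstrate the substitution.

```python
import re


def fixup_n_function_code(argnames, code):
    # Same substitution as _fixup_n_function_code in the diff: drop
    # `;if(typeof X==="undefined")return <first arg>;` and keep a bare `;`.
    return argnames, re.sub(
        rf';\s*if\s*\(\s*typeof\s+[a-zA-Z0-9_$]+\s*===?\s*(["\'])undefined\1\s*\)\s*return\s+{argnames[0]};',
        ';', code)


# Hypothetical function body, for illustration only
argnames = ['a']
code = 'var b=a.split("");if(typeof XY==="undefined")return a;b.reverse();return b.join("")'

_, fixed = fixup_n_function_code(argnames, code)
print(fixed)
```

With the guard left in place, an interpreter that treats the checked global as undefined would return the input unchanged; removing the guard lets the rest of the body run.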
@@ -3265,7 +3375,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
except Exception as e:
raise JSInterpreter.Exception(traceback.format_exc(), cause=e)

if ret.startswith('enhanced_except_'):
if ret.startswith('enhanced_except_') or ret.endswith(s):
raise JSInterpreter.Exception('Signature function returned an exception')
return ret

@@ -3793,9 +3903,13 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
default_client=client,
visitor_data=visitor_data,
session_index=self._extract_session_index(master_ytcfg, player_ytcfg),
account_syncid=(
self._data_sync_id_to_delegated_session_id(data_sync_id)
or self._extract_account_syncid(master_ytcfg, initial_pr, player_ytcfg)
delegated_session_id=(
self._parse_data_sync_id(data_sync_id)[0]
or self._extract_delegated_session_id(master_ytcfg, initial_pr, player_ytcfg)
),
user_session_id=(
self._parse_data_sync_id(data_sync_id)[1]
or self._extract_user_session_id(master_ytcfg, initial_pr, player_ytcfg)
),
)

@@ -3929,13 +4043,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
)

require_po_token = self._get_default_ytcfg(client).get('REQUIRE_PO_TOKEN')
if not po_token and require_po_token:
if not po_token and require_po_token and 'missing_pot' in self._configuration_arg('formats'):
self.report_warning(
f'No PO Token provided for {client} client, '
f'which is required for working {client} formats. '
f'You can manually pass a PO Token for this client with '
f'--extractor-args "youtube:po_token={client}+XXX"',
only_once=True)
f'which may be required for working {client} formats. This client will be deprioritized', only_once=True)
deprioritize_pr = True

pr = initial_pr if client == 'web' else None
@@ -3968,15 +4079,24 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
else:
prs.append(pr)

# web_embedded can work around age-gate and age-verification for some embeddable videos
if self._is_agegated(pr) and variant != 'web_embedded':
append_client(f'web_embedded.{base_client}')
# Unauthenticated users will only get web_embedded client formats if age-gated
if self._is_agegated(pr) and not self.is_authenticated:
self.to_screen(
f'{video_id}: This video is age-restricted; some formats may be missing '
f'without authentication. {self._login_hint()}', only_once=True)

''' This code is pointless while web_creator is in _DEFAULT_AUTHED_CLIENTS
# EU countries require age-verification for accounts to access age-restricted videos
# If account is not age-verified, _is_agegated() will be truthy for non-embedded clients
if self.is_authenticated and self._is_agegated(pr):
embedding_is_disabled = variant == 'web_embedded' and self._is_unplayable(pr)
if self.is_authenticated and (self._is_agegated(pr) or embedding_is_disabled):
self.to_screen(
f'{video_id}: This video is age-restricted and YouTube is requiring '
'account age-verification; some formats may be missing', only_once=True)
# web_creator can work around the age-verification requirement
# android_vr may also be able to work around age-verification
# tv_embedded may(?) still work around age-verification if the video is embeddable
append_client('web_creator')
'''
@@ -3999,6 +4119,21 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
or (live_status == 'post_live' and (duration or 0) > 2 * 3600)):
return live_status

def _report_pot_format_skipped(self, video_id, client_name, proto):
msg = (
f'{video_id}: {client_name} client {proto} formats require a PO Token which was not provided. '
'They will be skipped as they may yield HTTP Error 403. '
f'You can manually pass a PO Token for this client with --extractor-args "youtube:po_token={client_name}+XXX". '
'For more information, refer to https://github.com/yt-dlp/yt-dlp/wiki/Extractors#po-token-guide . '
'To enable these broken formats anyway, pass --extractor-args "youtube:formats=missing_pot"')

# Only raise a warning for non-default clients, to not confuse users.
# iOS HLS formats still work without PO Token, so we don't need to warn about them.
if client_name in (*self._DEFAULT_CLIENTS, *self._DEFAULT_AUTHED_CLIENTS):
self.write_debug(msg, only_once=True)
else:
self.report_warning(msg, only_once=True)

def _extract_formats_and_subtitles(self, streaming_data, video_id, player_url, live_status, duration):
CHUNK_SIZE = 10 << 20
PREFERRED_LANG_VALUE = 10
@@ -4052,10 +4187,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if height:
res_qualities[height] = quality

display_name = audio_track.get('displayName') or ''
is_original = 'original' in display_name.lower()
is_descriptive = 'descriptive' in display_name.lower()
is_default = audio_track.get('audioIsDefault')
is_descriptive = 'descriptive' in (audio_track.get('displayName') or '').lower()
language_code = audio_track.get('id', '').split('.')[0]
if language_code and is_default:
if language_code and (is_original or (is_default and not original_language)):
original_language = language_code

# FORMAT_STREAM_TYPE_OTF(otf=1) requires downloading the init fragment
@@ -4123,11 +4260,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
fmt_url = update_url_query(fmt_url, {'pot': po_token})

# Clients that require PO Token return videoplayback URLs that may return 403
is_broken = (not po_token and self._get_default_ytcfg(client_name).get('REQUIRE_PO_TOKEN'))
if is_broken:
self.report_warning(
f'{video_id}: {client_name} client formats require a PO Token which was not provided. '
'They will be deprioritized as they may yield HTTP Error 403', only_once=True)
require_po_token = (not po_token and self._get_default_ytcfg(client_name).get('REQUIRE_PO_TOKEN'))
if require_po_token and 'missing_pot' not in self._configuration_arg('formats'):
self._report_pot_format_skipped(video_id, client_name, 'https')
continue

name = fmt.get('qualityLabel') or quality.replace('audio_quality_', '') or ''
fps = int_or_none(fmt.get('fps')) or 0
@@ -4136,11 +4272,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'filesize': int_or_none(fmt.get('contentLength')),
'format_id': f'{itag}{"-drc" if fmt.get("isDrc") else ""}',
'format_note': join_nonempty(
join_nonempty(audio_track.get('displayName'), is_default and ' (default)', delim=''),
join_nonempty(display_name, is_default and ' (default)', delim=''),
name, fmt.get('isDrc') and 'DRC',
try_get(fmt, lambda x: x['projectionType'].replace('RECTANGULAR', '').lower()),
try_get(fmt, lambda x: x['spatialAudioType'].replace('SPATIAL_AUDIO_TYPE_', '').lower()),
is_damaged and 'DAMAGED', is_broken and 'BROKEN',
is_damaged and 'DAMAGED', require_po_token and 'MISSING POT',
(self.get_param('verbose') or all_formats) and short_client_name(client_name),
delim=', '),
# Format 22 is likely to be damaged. See https://github.com/yt-dlp/yt-dlp/issues/3372
@@ -4155,9 +4291,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'url': fmt_url,
'width': int_or_none(fmt.get('width')),
'language': join_nonempty(language_code, 'desc' if is_descriptive else '') or None,
'language_preference': PREFERRED_LANG_VALUE if is_default else -10 if is_descriptive else -1,
'language_preference': PREFERRED_LANG_VALUE if is_original else 5 if is_default else -10 if is_descriptive else -1,
# Strictly de-prioritize broken, damaged and 3gp formats
'preference': -20 if is_broken else -10 if is_damaged else -2 if itag == '17' else None,
'preference': -20 if require_po_token else -10 if is_damaged else -2 if itag == '17' else None,
}
mime_mobj = re.match(
r'((?:[^/]+)/(?:[^;]+))(?:;\s*codecs="([^"]+)")?', fmt.get('mimeType') or '')
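The chained conditional for `language_preference` in the hunk above ranks audio tracks in a fixed order: original first, then default, with descriptive tracks last. The sketch below restates that ranking as a small function. The function name, its signature, and the sample display names are assumptions made for illustration; only the constant `PREFERRED_LANG_VALUE = 10` and the numeric values come from the diff.

```python
PREFERRED_LANG_VALUE = 10  # same constant as in the diff


def language_preference(display_name, is_default):
    """Rank an audio track: original > default > other > descriptive."""
    name = (display_name or '').lower()
    is_original = 'original' in name
    is_descriptive = 'descriptive' in name
    return (PREFERRED_LANG_VALUE if is_original
            else 5 if is_default
            else -10 if is_descriptive
            else -1)


# Hypothetical display names, for illustration only
for name, default in [('English original', False), ('French', True),
                      ('English descriptive', False), ('Spanish', False)]:
    print(name, language_preference(name, default))
```

Note that the original check comes first, so a track that is both original and default still gets the highest value.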
@@ -4207,7 +4343,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
key = (proto, f.get('language'))
if not all_formats and key in itags[itag]:
return False
itags[itag].add(key)

if f.get('source_preference') is None:
f['source_preference'] = -1
@@ -4215,12 +4350,14 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# Clients that require PO Token return videoplayback URLs that may return 403
# hls does not currently require PO Token
if (not po_token and self._get_default_ytcfg(client_name).get('REQUIRE_PO_TOKEN')) and proto != 'hls':
self.report_warning(
f'{video_id}: {client_name} client {proto} formats require a PO Token which was not provided. '
'They will be deprioritized as they may yield HTTP Error 403', only_once=True)
f['format_note'] = join_nonempty(f.get('format_note'), 'BROKEN', delim=' ')
if 'missing_pot' not in self._configuration_arg('formats'):
self._report_pot_format_skipped(video_id, client_name, proto)
return False
f['format_note'] = join_nonempty(f.get('format_note'), 'MISSING POT', delim=' ')
f['source_preference'] -= 20

itags[itag].add(key)

if itag and all_formats:
f['format_id'] = f'{itag}-{proto}'
elif any(p != proto for p, _ in itags[itag]):
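The hunk above changes the handling of formats that lack a PO Token from "deprioritize" to "skip unless `missing_pot` is requested". A rough sketch of that decision follows. The function `keep_format` and its keyword parameters are hypothetical, and plain string joining stands in for yt-dlp's `join_nonempty`; only the `'MISSING POT'` note, the `hls` exemption, the `-1` default and the `-20` penalty are taken from the diff.

```python
def keep_format(fmt, *, needs_pot, has_pot, proto, formats_arg):
    """Return False if the format should be skipped; otherwise annotate and keep it."""
    if needs_pot and not has_pot and proto != 'hls':
        if 'missing_pot' not in formats_arg:
            return False  # default behaviour: the format is dropped
        # Opt-in behaviour: keep the format, but label and deprioritize it
        fmt['format_note'] = ' '.join(
            part for part in (fmt.get('format_note'), 'MISSING POT') if part)
        fmt['source_preference'] = fmt.get('source_preference', -1) - 20
    return True


# Illustrative format dicts
skipped = {'format_note': 'DASH video'}
kept = {'format_note': 'DASH video'}
print(keep_format(skipped, needs_pot=True, has_pot=False, proto='dash', formats_arg=[]))
print(keep_format(kept, needs_pot=True, has_pot=False, proto='dash', formats_arg=['missing_pot']), kept)
```

In the diff itself the skip path also calls `_report_pot_format_skipped`, which logs at debug level for default clients and as a warning otherwise; that reporting is left out of this sketch.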
@@ -4674,7 +4811,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
(?=(?P<artist>[^\n]+))(?P=artist)\n+
(?=(?P<album>[^\n]+))(?P=album)\n
(?:.+?℗\s*(?P<release_year>\d{4})(?!\d))?
(?:.+?Released on\s*:\s*(?P<release_date>\d{4}-\d{2}-\d{2}))?
(?:.+?Released\ on\s*:\s*(?P<release_date>\d{4}-\d{2}-\d{2}))?
(.+?\nArtist\s*:\s*
(?=(?P<clean_artist>[^\n]+))(?P=clean_artist)\n
)?.+\nAuto-generated\ by\ YouTube\.\s*$
@@ -5267,6 +5404,7 @@ class YoutubeTabBaseInfoExtractor(YoutubeBaseInfoExtractor):
'channelRenderer': lambda x: self._grid_entries({'items': [{'channelRenderer': x}]}),
'hashtagTileRenderer': lambda x: [self._hashtag_tile_entry(x)],
'richGridRenderer': lambda x: self._extract_entries(x, continuation_list),
'lockupViewModel': lambda x: [self._extract_lockup_view_model(x)],
}
for key, renderer in isr_content.items():
if key not in known_renderers:
@@ -5283,7 +5421,7 @@ class YoutubeTabBaseInfoExtractor(YoutubeBaseInfoExtractor):
if not continuation_list[0]:
continuation_list[0] = self._extract_continuation(parent_renderer)

def _entries(self, tab, item_id, ytcfg, account_syncid, visitor_data):
def _entries(self, tab, item_id, ytcfg, delegated_session_id, visitor_data):
continuation_list = [None]
extract_entries = lambda x: self._extract_entries(x, continuation_list)
tab_content = try_get(tab, lambda x: x['content'], dict)
@@ -5304,7 +5442,7 @@ class YoutubeTabBaseInfoExtractor(YoutubeBaseInfoExtractor):
break
seen_continuations.add(continuation_token)
headers = self.generate_api_headers(
ytcfg=ytcfg, account_syncid=account_syncid, visitor_data=visitor_data)
ytcfg=ytcfg, delegated_session_id=delegated_session_id, visitor_data=visitor_data)
response = self._extract_response(
item_id=f'{item_id} page {page_num}',
query=continuation, headers=headers, ytcfg=ytcfg,
@@ -5374,7 +5512,7 @@ class YoutubeTabBaseInfoExtractor(YoutubeBaseInfoExtractor):
return self.playlist_result(
self._entries(
selected_tab, metadata['id'], ytcfg,
self._extract_account_syncid(ytcfg, data),
self._extract_delegated_session_id(ytcfg, data),
self._extract_visitor_data(data, ytcfg)),
**metadata)

@@ -5526,7 +5664,7 @@ class YoutubeTabBaseInfoExtractor(YoutubeBaseInfoExtractor):
watch_endpoint = try_get(
playlist, lambda x: x['contents'][-1]['playlistPanelVideoRenderer']['navigationEndpoint']['watchEndpoint'])
headers = self.generate_api_headers(
ytcfg=ytcfg, account_syncid=self._extract_account_syncid(ytcfg, data),
ytcfg=ytcfg, delegated_session_id=self._extract_delegated_session_id(ytcfg, data),
visitor_data=self._extract_visitor_data(response, data, ytcfg))
query = {
'playlistId': playlist_id,
@@ -5624,7 +5762,7 @@ class YoutubeTabBaseInfoExtractor(YoutubeBaseInfoExtractor):
if not is_playlist:
return
headers = self.generate_api_headers(
ytcfg=ytcfg, account_syncid=self._extract_account_syncid(ytcfg, data),
ytcfg=ytcfg, delegated_session_id=self._extract_delegated_session_id(ytcfg, data),
visitor_data=self._extract_visitor_data(data, ytcfg))
query = {
'params': 'wgYCCAA=',

@@ -1370,12 +1370,12 @@ def create_parser():
help='Allow Unicode characters, "&" and spaces in filenames (default)')
filesystem.add_option(
'--windows-filenames',
action='store_true', dest='windowsfilenames', default=False,
action='store_true', dest='windowsfilenames', default=None,
help='Force filenames to be Windows-compatible')
filesystem.add_option(
'--no-windows-filenames',
action='store_false', dest='windowsfilenames',
help='Make filenames Windows-compatible only if using Windows (default)')
help='Sanitize filenames only minimally')
filesystem.add_option(
'--trim-filenames', '--trim-file-names', metavar='LENGTH',
dest='trim_file_name', default=0, type=int,

@@ -65,9 +65,14 @@ def _get_variant_and_executable_path():
machine = '_legacy' if version_tuple(platform.mac_ver()[0]) < (10, 15) else ''
else:
machine = f'_{platform.machine().lower()}'
is_64bits = sys.maxsize > 2**32
# Ref: https://en.wikipedia.org/wiki/Uname#Examples
if machine[1:] in ('x86', 'x86_64', 'amd64', 'i386', 'i686'):
machine = '_x86' if platform.architecture()[0][:2] == '32' else ''
machine = '_x86' if not is_64bits else ''
# platform.machine() on 32-bit raspbian OS may return 'aarch64', so check "64-bitness"
# See: https://github.com/yt-dlp/yt-dlp/issues/11813
elif machine[1:] == 'aarch64' and not is_64bits:
machine = '_armv7l'
# sys.executable returns a /tmp/ path for staticx builds (linux_static)
# Ref: https://staticx.readthedocs.io/en/latest/usage.html#run-time-information
if static_exe_path := os.getenv('STATICX_PROG_PATH'):
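The updater change above replaces the `platform.architecture()` check with a pointer-width test (`sys.maxsize > 2**32`) and adds a fallback for 32-bit userlands that report `aarch64`. The sketch below isolates the non-macOS branch as a pure function so the mapping can be checked without depending on the host. The function name `machine_suffix` and passing `is_64bits` as a parameter are assumptions made for testability; the diff computes both values inline.

```python
import platform
import sys


def machine_suffix(machine, is_64bits):
    """Map a platform.machine() value to the variant suffix used in the hunk above."""
    machine = f'_{machine.lower()}'
    if machine[1:] in ('x86', 'x86_64', 'amd64', 'i386', 'i686'):
        machine = '_x86' if not is_64bits else ''
    elif machine[1:] == 'aarch64' and not is_64bits:
        # 32-bit interpreter on a kernel that reports aarch64
        machine = '_armv7l'
    return machine


# Value for the current interpreter (output depends on the host)
print(machine_suffix(platform.machine(), sys.maxsize > 2**32))
```

Checking `sys.maxsize` describes the running interpreter rather than the kernel, which is what matters when choosing which binary build to download.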
@@ -525,11 +530,16 @@ class Updater:
@functools.cached_property
def cmd(self):
"""The command-line to run the executable, if known"""
argv = None
# There is no sys.orig_argv in py < 3.10. Also, it can be [] when frozen
if getattr(sys, 'orig_argv', None):
return sys.orig_argv
argv = sys.orig_argv
elif getattr(sys, 'frozen', False):
return sys.argv
argv = sys.argv
# linux_static exe's argv[0] will be /tmp/staticx-NNNN/yt-dlp_linux if we don't fixup here
if argv and os.getenv('STATICX_PROG_PATH'):
argv = [self.filename, *argv[1:]]
return argv

def restart(self):
"""Restart the executable"""

@@ -1,8 +1,8 @@
# Autogenerated by devscripts/update-version.py

__version__ = '2024.12.03'
__version__ = '2025.01.12'

RELEASE_GIT_HEAD = '2b67ac300ac8b44368fb121637d1743cea8c5b6b'
RELEASE_GIT_HEAD = 'dade5e35c89adaad04408bfef766820dbca06ebe'

VARIANT = None

@@ -12,4 +12,4 @@ CHANNEL = 'stable'

ORIGIN = 'yt-dlp/yt-dlp'

_pkg_version = '2024.12.03'
_pkg_version = '2025.01.12'
