1
0
mirror of https://github.com/yt-dlp/yt-dlp.git synced 2026-01-13 10:21:30 +00:00

Compare commits

...

17 Commits

Author SHA1 Message Date
github-actions[bot]
3905f64920 Release 2024.12.23
Created by: bashonly

:ci skip all
2024-12-23 23:47:20 +00:00
bashonly
65cf46cddd [ie/youtube] Player client maintenance (#11893)
Closes #11867
Authored by: bashonly
2024-12-23 23:26:35 +00:00
coletdjnz
9f42e68a74 [ie/youtube] Skip iOS formats that require PO Token (#11890)
Partial fix for https://github.com/yt-dlp/yt-dlp/issues/11868

Authored by: coletdjnz
2024-12-24 12:03:28 +13:00
pukkandan
6fc85f617a Don't sanitize filename on Unix when --no-windows-filenames (#9591)
Closes #4547, Closes #8464
Authored by: pukkandan
2024-12-23 15:57:25 +05:30
bashonly
d298693b1b [ie/soundcloud] Various fixes (#11820)
- Fix original/download formats so that they are considered bestaudio
- Raise appropriate error if track is DRM-protected

Authored by: bashonly
2024-12-15 20:16:04 +00:00
bashonly
09a6c68712 [ie/youtube] Add age-gate workaround for some embeddable videos (#11821)
Closes #11296
Authored by: bashonly
2024-12-15 20:09:48 +00:00
bashonly
1a8851b689 [ie/youtube] Fix uploader_id extraction (#11818)
Closes #11816
Authored by: bashonly
2024-12-15 20:07:18 +00:00
bashonly
b91c3925c2 [update] Check 64-bitness when upgrading ARM builds (#11819)
Closes #11813
Authored by: bashonly
2024-12-15 19:55:30 +00:00
bashonly
3d3ee458c1 [update] Fix endless update loop for linux_exe builds (#11827)
Closes #11808
Authored by: bashonly
2024-12-15 19:47:50 +00:00
github-actions[bot]
2037a6414f Release 2024.12.13
Created by: bashonly

:ci skip all
2024-12-13 10:35:40 +00:00
sepro
5421669626 [cleanup] Make more playlist entries lazy (#11763)
Authored by: seproDev
2024-12-13 10:25:29 +00:00
bashonly
dc3c4fddcc [ie/youtube] Prioritize original language over auto-dubbed audio (#11803)
Closes #11753
Authored by: bashonly
2024-12-13 10:21:48 +00:00
bashonly
5460cd9189 [ie/youtube] Fix signature function extraction for 2f1832d2 (#11801)
Closes #11798
Authored by: bashonly
2024-12-13 09:43:08 +00:00
Crypto90
f6c73aad5f [ie/youtube:search_url] Fix playlist searches (#11782)
Closes #11666
Authored by: Crypto90
2024-12-12 13:54:11 +00:00
Pew
d5e2a379f2 [ie/youtube] Fix release_date extraction (#11759)
Authored by: MutantPiggieGolem1
2024-12-12 13:46:52 +00:00
bashonly
bc262bcad4 [ie/patreon:campaign] Support /c/ URLs (#11756)
Closes #11755
Authored by: bashonly
2024-12-12 13:44:19 +00:00
bashonly
f4d3e9e6dc [ie/soundcloud] Fix extraction (#11777)
Authored by: bashonly
2024-12-12 13:39:38 +00:00
16 changed files with 206 additions and 60 deletions

View File

@@ -711,3 +711,5 @@ gitninja1234
jkruse
xiaomac
wesson09
Crypto90
MutantPiggieGolem1

View File

@@ -4,6 +4,36 @@
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
-->
### 2024.12.23
#### Core changes
- [Don't sanitize filename on Unix when `--no-windows-filenames`](https://github.com/yt-dlp/yt-dlp/commit/6fc85f617a5850307fd5b258477070e6ee177796) ([#9591](https://github.com/yt-dlp/yt-dlp/issues/9591)) by [pukkandan](https://github.com/pukkandan)
- **update**
- [Check 64-bitness when upgrading ARM builds](https://github.com/yt-dlp/yt-dlp/commit/b91c3925c2059970daa801cb131c0c2f4f302e72) ([#11819](https://github.com/yt-dlp/yt-dlp/issues/11819)) by [bashonly](https://github.com/bashonly)
- [Fix endless update loop for `linux_exe` builds](https://github.com/yt-dlp/yt-dlp/commit/3d3ee458c1fe49dd5ebd7651a092119d23eb7000) ([#11827](https://github.com/yt-dlp/yt-dlp/issues/11827)) by [bashonly](https://github.com/bashonly)
#### Extractor changes
- **soundcloud**: [Various fixes](https://github.com/yt-dlp/yt-dlp/commit/d298693b1b266d198e8eeecb90ea17c4a031268f) ([#11820](https://github.com/yt-dlp/yt-dlp/issues/11820)) by [bashonly](https://github.com/bashonly)
- **youtube**
- [Add age-gate workaround for some embeddable videos](https://github.com/yt-dlp/yt-dlp/commit/09a6c687126f04e243fcb105a828787efddd1030) ([#11821](https://github.com/yt-dlp/yt-dlp/issues/11821)) by [bashonly](https://github.com/bashonly)
- [Fix `uploader_id` extraction](https://github.com/yt-dlp/yt-dlp/commit/1a8851b689763e5173b96f70f8a71df0e4a44b66) ([#11818](https://github.com/yt-dlp/yt-dlp/issues/11818)) by [bashonly](https://github.com/bashonly)
- [Player client maintenance](https://github.com/yt-dlp/yt-dlp/commit/65cf46cddd873fd229dbb0fc0689bca4c201c6b6) ([#11893](https://github.com/yt-dlp/yt-dlp/issues/11893)) by [bashonly](https://github.com/bashonly)
- [Skip iOS formats that require PO Token](https://github.com/yt-dlp/yt-dlp/commit/9f42e68a74f3f00b0253fe70763abd57cac4237b) ([#11890](https://github.com/yt-dlp/yt-dlp/issues/11890)) by [coletdjnz](https://github.com/coletdjnz)
### 2024.12.13
#### Extractor changes
- **patreon**: campaign: [Support /c/ URLs](https://github.com/yt-dlp/yt-dlp/commit/bc262bcad4d3683ceadf61a7eb87e233e72adef3) ([#11756](https://github.com/yt-dlp/yt-dlp/issues/11756)) by [bashonly](https://github.com/bashonly)
- **soundcloud**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/f4d3e9e6dc25077b79849a31a2f67f93fdc01e62) ([#11777](https://github.com/yt-dlp/yt-dlp/issues/11777)) by [bashonly](https://github.com/bashonly)
- **youtube**
- [Fix `release_date` extraction](https://github.com/yt-dlp/yt-dlp/commit/d5e2a379f2adcb28bc48c7d9e90716d7278f89d2) ([#11759](https://github.com/yt-dlp/yt-dlp/issues/11759)) by [MutantPiggieGolem1](https://github.com/MutantPiggieGolem1)
- [Fix signature function extraction for `2f1832d2`](https://github.com/yt-dlp/yt-dlp/commit/5460cd91891bf613a2065e2fc278d9903c37a127) ([#11801](https://github.com/yt-dlp/yt-dlp/issues/11801)) by [bashonly](https://github.com/bashonly)
- [Prioritize original language over auto-dubbed audio](https://github.com/yt-dlp/yt-dlp/commit/dc3c4fddcc653989dae71fc563d82a308fc898cc) ([#11803](https://github.com/yt-dlp/yt-dlp/issues/11803)) by [bashonly](https://github.com/bashonly)
- search_url: [Fix playlist searches](https://github.com/yt-dlp/yt-dlp/commit/f6c73aad5f1a67544bea137ebd9d1e22e0e56567) ([#11782](https://github.com/yt-dlp/yt-dlp/issues/11782)) by [Crypto90](https://github.com/Crypto90)
#### Misc. changes
- **cleanup**: [Make more playlist entries lazy](https://github.com/yt-dlp/yt-dlp/commit/54216696261bc07cacd9a837c501d9e0b7fed09e) ([#11763](https://github.com/yt-dlp/yt-dlp/issues/11763)) by [seproDev](https://github.com/seproDev)
### 2024.12.06
#### Core changes

View File

@@ -613,8 +613,7 @@ If you fork the project on GitHub, you can run your fork's [build workflow](.git
--no-restrict-filenames Allow Unicode characters, "&" and spaces in
filenames (default)
--windows-filenames Force filenames to be Windows-compatible
--no-windows-filenames Make filenames Windows-compatible only if
using Windows (default)
--no-windows-filenames Sanitize filenames only minimally
--trim-filenames LENGTH Limit the filename length (excluding
extension) to the specified number of
characters
@@ -1776,7 +1775,7 @@ The following extractors use this feature:
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
* `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all`
* E.g. `all,all,1000,10` will get a maximum of 1000 replies total, with up to 10 replies per thread. `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total
* `formats`: Change the types of formats to return. `dashy` (convert HTTP to DASH), `duplicate` (identical content but different URLs or protocol; includes `dashy`), `incomplete` (cannot be downloaded completely - live dash and post-live m3u8)
* `formats`: Change the types of formats to return. `dashy` (convert HTTP to DASH), `duplicate` (identical content but different URLs or protocol; includes `dashy`), `incomplete` (cannot be downloaded completely - live dash and post-live m3u8), `missing_pot` (include formats that require a PO Token but are missing one)
* `innertube_host`: Innertube API host to use for all API requests; e.g. `studio.youtube.com`, `youtubei.googleapis.com`. Note that cookies exported from one subdomain will not work on others
* `innertube_key`: Innertube API key to use for all API requests. By default, no API key is used
* `raise_incomplete_data`: `Incomplete Data Received` raises an error instead of reporting a warning

View File

@@ -761,6 +761,13 @@ class TestYoutubeDL(unittest.TestCase):
test('%(width)06d.%%(ext)s', 'NA.%(ext)s')
test('%%(width)06d.%(ext)s', '%(width)06d.mp4')
# Sanitization options
test('%(title3)s', (None, 'foobartest'))
test('%(title5)s', (None, 'aei_A'), restrictfilenames=True)
test('%(title3)s', (None, 'foo_bar_test'), windowsfilenames=False, restrictfilenames=True)
if sys.platform != 'win32':
test('%(title3)s', (None, 'foobar\\test'), windowsfilenames=False)
# ID sanitization
test('%(id)s', '_abcd', info={'id': '_abcd'})
test('%(some_id)s', '_abcd', info={'some_id': '_abcd'})

View File

@@ -73,6 +73,11 @@ _SIG_TESTS = [
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'MyOSJXtKI3m-uME_jv7-pT12gOFC02RFkGoqWpzE0Cs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
),
(
'https://www.youtube.com/s/player/2f1832d2/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xxAj7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJ2OySqa0q',
),
]
_NSIG_TESTS = [
@@ -192,6 +197,10 @@ _NSIG_TESTS = [
'https://www.youtube.com/s/player/3bb1f723/player_ias.vflset/en_US/base.js',
'gK15nzVyaXE9RsMP3z', 'ZFFWFLPWx9DEgQ',
),
(
'https://www.youtube.com/s/player/2f1832d2/player_ias.vflset/en_US/base.js',
'YWt1qdbe8SAfkoPHW5d', 'RrRjWQOJmBiP',
),
]

View File

@@ -266,7 +266,9 @@ class YoutubeDL:
outtmpl_na_placeholder: Placeholder for unavailable meta fields.
restrictfilenames: Do not allow "&" and spaces in file names
trim_file_name: Limit length of filename (extension excluded)
windowsfilenames: Force the filenames to be windows compatible
windowsfilenames: True: Force filenames to be Windows compatible
False: Sanitize filenames only minimally
This option has no effect when running on Windows
ignoreerrors: Do not stop on download/postprocessing errors.
Can be 'only_download' to ignore only download errors.
Default is 'only_download' for CLI, but False for API
@@ -1192,8 +1194,7 @@ class YoutubeDL:
def prepare_outtmpl(self, outtmpl, info_dict, sanitize=False):
""" Make the outtmpl and info_dict suitable for substitution: ydl.escape_outtmpl(outtmpl) % info_dict
@param sanitize Whether to sanitize the output as a filename.
For backward compatibility, a function can also be passed
@param sanitize Whether to sanitize the output as a filename
"""
info_dict.setdefault('epoch', int(time.time())) # keep epoch consistent once set
@@ -1309,14 +1310,23 @@ class YoutubeDL:
na = self.params.get('outtmpl_na_placeholder', 'NA')
def filename_sanitizer(key, value, restricted=self.params.get('restrictfilenames')):
def filename_sanitizer(key, value, restricted):
return sanitize_filename(str(value), restricted=restricted, is_id=(
bool(re.search(r'(^|[_.])id(\.|$)', key))
if 'filename-sanitization' in self.params['compat_opts']
else NO_DEFAULT))
sanitizer = sanitize if callable(sanitize) else filename_sanitizer
sanitize = bool(sanitize)
if callable(sanitize):
self.deprecation_warning('Passing a callable "sanitize" to YoutubeDL.prepare_outtmpl is deprecated')
elif not sanitize:
pass
elif (sys.platform != 'win32' and not self.params.get('restrictfilenames')
and self.params.get('windowsfilenames') is False):
def sanitize(key, value):
return value.replace('/', '\u29F8').replace('\0', '')
else:
def sanitize(key, value):
return filename_sanitizer(key, value, restricted=self.params.get('restrictfilenames'))
def _dumpjson_default(obj):
if isinstance(obj, (set, LazyList)):
@@ -1399,13 +1409,13 @@ class YoutubeDL:
if sanitize:
# If value is an object, sanitize might convert it to a string
# So we convert it to repr first
# So we manually convert it before sanitizing
if fmt[-1] == 'r':
value, fmt = repr(value), str_fmt
elif fmt[-1] == 'a':
value, fmt = ascii(value), str_fmt
if fmt[-1] in 'csra':
value = sanitizer(last_field, value)
value = sanitize(last_field, value)
key = '{}\0{}'.format(key.replace('%', '%\0'), outer_mobj.group('format'))
TMPL_DICT[key] = value

View File

@@ -31,6 +31,7 @@ from ..utils import (
update_url_query,
url_or_none,
)
from ..utils.traversal import traverse_obj
class BrightcoveLegacyIE(InfoExtractor):
@@ -935,8 +936,8 @@ class BrightcoveNewIE(BrightcoveNewBaseIE):
if content_type == 'playlist':
return self.playlist_result(
[self._parse_brightcove_metadata(vid, vid.get('id'), headers)
for vid in json_data.get('videos', []) if vid.get('id')],
(self._parse_brightcove_metadata(vid, vid['id'], headers)
for vid in traverse_obj(json_data, ('videos', lambda _, v: v['id']))),
json_data.get('id'), json_data.get('name'),
json_data.get('description'))

View File

@@ -162,7 +162,7 @@ class DVTVIE(InfoExtractor):
items = re.findall(r'(?s)playlist\.push\(({.+?})\);', webpage)
if items:
return self.playlist_result(
[self._parse_video_metadata(i, video_id, timestamp) for i in items],
(self._parse_video_metadata(i, video_id, timestamp) for i in items),
video_id, self._html_search_meta('twitter:title', webpage))
item = self._search_regex(

View File

@@ -343,7 +343,7 @@ class NYTimesCookingIE(NYTimesBaseIE):
if media_ids:
media_ids.append(lead_video_id)
return self.playlist_result(
[self._extract_video(media_id) for media_id in media_ids], page_id, title, description)
map(self._extract_video, media_ids), page_id, title, description)
return {
**self._extract_video(lead_video_id),

View File

@@ -457,7 +457,7 @@ class PatreonCampaignIE(PatreonBaseIE):
_VALID_URL = r'''(?x)
https?://(?:www\.)?patreon\.com/(?:
(?:m|api/campaigns)/(?P<campaign_id>\d+)|
(?P<vanity>(?!creation[?/]|posts/|rss[?/])[\w-]+)
(?:c/)?(?P<vanity>(?!creation[?/]|posts/|rss[?/])[\w-]+)
)(?:/posts)?/?(?:$|[?#])'''
_TESTS = [{
'url': 'https://www.patreon.com/dissonancepod/',
@@ -509,6 +509,26 @@ class PatreonCampaignIE(PatreonBaseIE):
'thumbnail': r're:^https?://.*$',
},
'playlist_mincount': 201,
}, {
'url': 'https://www.patreon.com/c/OgSog',
'info_dict': {
'id': '8504388',
'title': 'OGSoG',
'description': r're:(?s)Hello and welcome to our Patreon page. We are Mari, Lasercorn, .+',
'channel': 'OGSoG',
'channel_id': '8504388',
'channel_url': 'https://www.patreon.com/OgSog',
'uploader_url': 'https://www.patreon.com/OgSog',
'uploader_id': '72323575',
'uploader': 'David Moss',
'thumbnail': r're:https?://.+/.+',
'channel_follower_count': int,
'age_limit': 0,
},
'playlist_mincount': 331,
}, {
'url': 'https://www.patreon.com/c/OgSog/posts',
'only_matching': True,
}, {
'url': 'https://www.patreon.com/dissonancepod/posts',
'only_matching': True,

View File

@@ -210,6 +210,7 @@ class SoundcloudBaseIE(InfoExtractor):
format_urls = set()
formats = []
has_drm = False
query = {'client_id': self._CLIENT_ID}
if secret_token:
query['secret_token'] = secret_token
@@ -245,6 +246,7 @@ class SoundcloudBaseIE(InfoExtractor):
'url': format_url,
'quality': 10,
'format_note': 'Original',
'vcodec': 'none',
})
def invalid_url(url):
@@ -259,6 +261,9 @@ class SoundcloudBaseIE(InfoExtractor):
preset_base = preset.partition('_')[0]
protocol = traverse_obj(t, ('format', 'protocol', {str})) or 'http'
if protocol.startswith(('ctr-', 'cbc-')):
has_drm = True
continue
if protocol == 'progressive':
protocol = 'http'
if protocol != 'hls' and '/hls' in format_url:
@@ -315,8 +320,11 @@ class SoundcloudBaseIE(InfoExtractor):
'preference': -10 if is_preview else None,
})
if not formats and info.get('policy') == 'BLOCK':
self.raise_geo_restricted(metadata_available=True)
if not formats:
if has_drm:
self.report_drm(track_id)
if info.get('policy') == 'BLOCK':
self.raise_geo_restricted(metadata_available=True)
user = info.get('user') or {}

View File

@@ -421,5 +421,5 @@ class VidyardIE(VidyardBaseIE):
return self._process_video_json(video_json['chapters'][0], video_id)
return self.playlist_result(
[self._process_video_json(chapter, video_id) for chapter in video_json['chapters']],
(self._process_video_json(chapter, video_id) for chapter in video_json['chapters']),
str(video_json['playerUuid']), video_json.get('name'))

View File

@@ -162,7 +162,6 @@ INNERTUBE_CLIENTS = {
'REQUIRE_JS_PLAYER': False,
'REQUIRE_PO_TOKEN': True,
'REQUIRE_AUTH': True,
'SUPPORTS_COOKIES': True,
},
# This client now requires sign-in for every video
'android_creator': {
@@ -197,7 +196,6 @@ INNERTUBE_CLIENTS = {
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 28,
'REQUIRE_JS_PLAYER': False,
'SUPPORTS_COOKIES': True,
},
# iOS clients have HLS live streams. Setting device model to get 60fps formats.
# See: https://github.com/TeamNewPipe/NewPipeExtractor/issues/680#issuecomment-1002724558
@@ -214,6 +212,7 @@ INNERTUBE_CLIENTS = {
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 5,
'REQUIRE_PO_TOKEN': True,
'REQUIRE_JS_PLAYER': False,
},
# This client now requires sign-in for every video
@@ -232,7 +231,6 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT_CLIENT_NAME': 26,
'REQUIRE_JS_PLAYER': False,
'REQUIRE_AUTH': True,
'SUPPORTS_COOKIES': True,
},
# This client now requires sign-in for every video
'ios_creator': {
@@ -518,11 +516,12 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
return self._search_regex(rf'^({self._YT_CHANNEL_UCID_RE})$', ucid, 'UC-id', default=None)
def handle_or_none(self, handle):
return self._search_regex(rf'^({self._YT_HANDLE_RE})$', handle, '@-handle', default=None)
return self._search_regex(rf'^({self._YT_HANDLE_RE})$', urllib.parse.unquote(handle or ''),
'@-handle', default=None)
def handle_from_url(self, url):
return self._search_regex(rf'^(?:https?://(?:www\.)?youtube\.com)?/({self._YT_HANDLE_RE})',
url, 'channel handle', default=None)
urllib.parse.unquote(url or ''), 'channel handle', default=None)
def ucid_from_url(self, url):
return self._search_regex(rf'^(?:https?://(?:www\.)?youtube\.com)?/({self._YT_CHANNEL_UCID_RE})',
@@ -1495,7 +1494,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
},
# Age-gate videos. See https://github.com/yt-dlp/yt-dlp/pull/575#issuecomment-888837000
{
'note': 'Embed allowed age-gate video',
'note': 'Embed allowed age-gate video; works with web_embedded',
'url': 'https://youtube.com/watch?v=HtVdAasjOgU',
'info_dict': {
'id': 'HtVdAasjOgU',
@@ -1525,7 +1524,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'heatmap': 'count:100',
'timestamp': 1401991663,
},
'skip': 'Age-restricted; requires authentication',
},
{
'note': 'Age-gate video with embed allowed in public site',
@@ -2801,6 +2799,35 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'extractor_args': {'youtube': {'player_client': ['ios'], 'player_skip': ['webpage']}},
},
},
{
# uploader_id has non-ASCII characters that are percent-encoded in YT's JSON
'url': 'https://www.youtube.com/shorts/18NGQq7p3LY',
'info_dict': {
'id': '18NGQq7p3LY',
'ext': 'mp4',
'title': '아이브 이서 장원영 리즈 삐끼삐끼 챌린지',
'description': '',
'uploader': 'ㅇㅇ',
'uploader_id': '@으아-v1k',
'uploader_url': 'https://www.youtube.com/@으아-v1k',
'channel': 'ㅇㅇ',
'channel_id': 'UCC25oTm2J7ZVoi5TngOHg9g',
'channel_url': 'https://www.youtube.com/channel/UCC25oTm2J7ZVoi5TngOHg9g',
'thumbnail': r're:https?://.+/.+\.jpg',
'playable_in_embed': True,
'age_limit': 0,
'duration': 3,
'timestamp': 1724306170,
'upload_date': '20240822',
'availability': 'public',
'live_status': 'not_live',
'view_count': int,
'like_count': int,
'channel_follower_count': int,
'categories': ['People & Blogs'],
'tags': [],
},
},
]
_WEBPAGE_TESTS = [
@@ -3127,9 +3154,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# ;N&&(N=sig(decodeURIComponent(N)),J.set(R,encodeURIComponent(N)));return J};
# {var H=u,k=f.sp,v=sig(decodeURIComponent(f.s));H.set(k,encodeURIComponent(v))}
funcname = self._search_regex(
(r'\b(?P<var>[a-zA-Z0-9$]+)&&\((?P=var)=(?P<sig>[a-zA-Z0-9$]{2,})\(decodeURIComponent\((?P=var)\)\)',
r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*(?P<arg>[a-zA-Z0-9$]+)\s*\)\s*{\s*(?P=arg)\s*=\s*(?P=arg)\.split\(\s*""\s*\)\s*;\s*[^}]+;\s*return\s+(?P=arg)\.join\(\s*""\s*\)',
r'(?:\b|[^a-zA-Z0-9$])(?P<sig>[a-zA-Z0-9$]{2,})\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)(?:;[a-zA-Z0-9$]{2}\.[a-zA-Z0-9$]{2}\(a,\d+\))?',
(r'\b(?P<var>[a-zA-Z0-9_$]+)&&\((?P=var)=(?P<sig>[a-zA-Z0-9_$]{2,})\(decodeURIComponent\((?P=var)\)\)',
r'(?P<sig>[a-zA-Z0-9_$]+)\s*=\s*function\(\s*(?P<arg>[a-zA-Z0-9_$]+)\s*\)\s*{\s*(?P=arg)\s*=\s*(?P=arg)\.split\(\s*""\s*\)\s*;\s*[^}]+;\s*return\s+(?P=arg)\.join\(\s*""\s*\)',
r'(?:\b|[^a-zA-Z0-9_$])(?P<sig>[a-zA-Z0-9_$]{2,})\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)(?:;[a-zA-Z0-9_$]{2}\.[a-zA-Z0-9_$]{2}\(a,\d+\))?',
# Old patterns
r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
@@ -3944,13 +3971,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
)
require_po_token = self._get_default_ytcfg(client).get('REQUIRE_PO_TOKEN')
if not po_token and require_po_token:
if not po_token and require_po_token and 'missing_pot' in self._configuration_arg('formats'):
self.report_warning(
f'No PO Token provided for {client} client, '
f'which is required for working {client} formats. '
f'You can manually pass a PO Token for this client with '
f'--extractor-args "youtube:po_token={client}+XXX"',
only_once=True)
f'which may be required for working {client} formats. This client will be deprioritized', only_once=True)
deprioritize_pr = True
pr = initial_pr if client == 'web' else None
@@ -3983,15 +4007,24 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
else:
prs.append(pr)
# web_embedded can work around age-gate and age-verification for some embeddable videos
if self._is_agegated(pr) and variant != 'web_embedded':
append_client(f'web_embedded.{base_client}')
# Unauthenticated users will only get web_embedded client formats if age-gated
if self._is_agegated(pr) and not self.is_authenticated:
self.to_screen(
f'{video_id}: This video is age-restricted; some formats may be missing '
f'without authentication. {self._login_hint()}', only_once=True)
''' This code is pointless while web_creator is in _DEFAULT_AUTHED_CLIENTS
# EU countries require age-verification for accounts to access age-restricted videos
# If account is not age-verified, _is_agegated() will be truthy for non-embedded clients
if self.is_authenticated and self._is_agegated(pr):
embedding_is_disabled = variant == 'web_embedded' and self._is_unplayable(pr)
if self.is_authenticated and (self._is_agegated(pr) or embedding_is_disabled):
self.to_screen(
f'{video_id}: This video is age-restricted and YouTube is requiring '
'account age-verification; some formats may be missing', only_once=True)
# web_creator can work around the age-verification requirement
# android_vr may also be able to work around age-verification
# tv_embedded may(?) still work around age-verification if the video is embeddable
append_client('web_creator')
'''
@@ -4014,6 +4047,21 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
or (live_status == 'post_live' and (duration or 0) > 2 * 3600)):
return live_status
def _report_pot_format_skipped(self, video_id, client_name, proto):
msg = (
f'{video_id}: {client_name} client {proto} formats require a PO Token which was not provided. '
'They will be skipped as they may yield HTTP Error 403. '
f'You can manually pass a PO Token for this client with --extractor-args "youtube:po_token={client_name}+XXX. '
'For more information, refer to https://github.com/yt-dlp/yt-dlp/wiki/Extractors#po-token-guide . '
'To enable these broken formats anyway, pass --extractor-args "youtube:formats=missing_pot"')
# Only raise a warning for non-default clients, to not confuse users.
# iOS HLS formats still work without PO Token, so we don't need to warn about them.
if client_name in (*self._DEFAULT_CLIENTS, *self._DEFAULT_AUTHED_CLIENTS):
self.write_debug(msg, only_once=True)
else:
self.report_warning(msg, only_once=True)
def _extract_formats_and_subtitles(self, streaming_data, video_id, player_url, live_status, duration):
CHUNK_SIZE = 10 << 20
PREFERRED_LANG_VALUE = 10
@@ -4067,10 +4115,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if height:
res_qualities[height] = quality
display_name = audio_track.get('displayName') or ''
is_original = 'original' in display_name.lower()
is_descriptive = 'descriptive' in display_name.lower()
is_default = audio_track.get('audioIsDefault')
is_descriptive = 'descriptive' in (audio_track.get('displayName') or '').lower()
language_code = audio_track.get('id', '').split('.')[0]
if language_code and is_default:
if language_code and (is_original or (is_default and not original_language)):
original_language = language_code
# FORMAT_STREAM_TYPE_OTF(otf=1) requires downloading the init fragment
@@ -4138,11 +4188,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
fmt_url = update_url_query(fmt_url, {'pot': po_token})
# Clients that require PO Token return videoplayback URLs that may return 403
is_broken = (not po_token and self._get_default_ytcfg(client_name).get('REQUIRE_PO_TOKEN'))
if is_broken:
self.report_warning(
f'{video_id}: {client_name} client formats require a PO Token which was not provided. '
'They will be deprioritized as they may yield HTTP Error 403', only_once=True)
require_po_token = (not po_token and self._get_default_ytcfg(client_name).get('REQUIRE_PO_TOKEN'))
if require_po_token and 'missing_pot' not in self._configuration_arg('formats'):
self._report_pot_format_skipped(video_id, client_name, 'https')
continue
name = fmt.get('qualityLabel') or quality.replace('audio_quality_', '') or ''
fps = int_or_none(fmt.get('fps')) or 0
@@ -4151,11 +4200,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'filesize': int_or_none(fmt.get('contentLength')),
'format_id': f'{itag}{"-drc" if fmt.get("isDrc") else ""}',
'format_note': join_nonempty(
join_nonempty(audio_track.get('displayName'), is_default and ' (default)', delim=''),
join_nonempty(display_name, is_default and ' (default)', delim=''),
name, fmt.get('isDrc') and 'DRC',
try_get(fmt, lambda x: x['projectionType'].replace('RECTANGULAR', '').lower()),
try_get(fmt, lambda x: x['spatialAudioType'].replace('SPATIAL_AUDIO_TYPE_', '').lower()),
is_damaged and 'DAMAGED', is_broken and 'BROKEN',
is_damaged and 'DAMAGED', require_po_token and 'MISSING POT',
(self.get_param('verbose') or all_formats) and short_client_name(client_name),
delim=', '),
# Format 22 is likely to be damaged. See https://github.com/yt-dlp/yt-dlp/issues/3372
@@ -4170,9 +4219,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'url': fmt_url,
'width': int_or_none(fmt.get('width')),
'language': join_nonempty(language_code, 'desc' if is_descriptive else '') or None,
'language_preference': PREFERRED_LANG_VALUE if is_default else -10 if is_descriptive else -1,
'language_preference': PREFERRED_LANG_VALUE if is_original else 5 if is_default else -10 if is_descriptive else -1,
# Strictly de-prioritize broken, damaged and 3gp formats
'preference': -20 if is_broken else -10 if is_damaged else -2 if itag == '17' else None,
'preference': -20 if require_po_token else -10 if is_damaged else -2 if itag == '17' else None,
}
mime_mobj = re.match(
r'((?:[^/]+)/(?:[^;]+))(?:;\s*codecs="([^"]+)")?', fmt.get('mimeType') or '')
@@ -4230,10 +4279,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# Clients that require PO Token return videoplayback URLs that may return 403
# hls does not currently require PO Token
if (not po_token and self._get_default_ytcfg(client_name).get('REQUIRE_PO_TOKEN')) and proto != 'hls':
self.report_warning(
f'{video_id}: {client_name} client {proto} formats require a PO Token which was not provided. '
'They will be deprioritized as they may yield HTTP Error 403', only_once=True)
f['format_note'] = join_nonempty(f.get('format_note'), 'BROKEN', delim=' ')
if 'missing_pot' not in self._configuration_arg('formats'):
self._report_pot_format_skipped(video_id, client_name, proto)
return False
f['format_note'] = join_nonempty(f.get('format_note'), 'MISSING POT', delim=' ')
f['source_preference'] -= 20
if itag and all_formats:
@@ -4689,7 +4738,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
(?=(?P<artist>[^\n]+))(?P=artist)\n+
(?=(?P<album>[^\n]+))(?P=album)\n
(?:.+?\s*(?P<release_year>\d{4})(?!\d))?
(?:.+?Released on\s*:\s*(?P<release_date>\d{4}-\d{2}-\d{2}))?
(?:.+?Released\ on\s*:\s*(?P<release_date>\d{4}-\d{2}-\d{2}))?
(.+?\nArtist\s*:\s*
(?=(?P<clean_artist>[^\n]+))(?P=clean_artist)\n
)?.+\nAuto-generated\ by\ YouTube\.\s*$
@@ -5282,6 +5331,7 @@ class YoutubeTabBaseInfoExtractor(YoutubeBaseInfoExtractor):
'channelRenderer': lambda x: self._grid_entries({'items': [{'channelRenderer': x}]}),
'hashtagTileRenderer': lambda x: [self._hashtag_tile_entry(x)],
'richGridRenderer': lambda x: self._extract_entries(x, continuation_list),
'lockupViewModel': lambda x: [self._extract_lockup_view_model(x)],
}
for key, renderer in isr_content.items():
if key not in known_renderers:

View File

@@ -1370,12 +1370,12 @@ def create_parser():
help='Allow Unicode characters, "&" and spaces in filenames (default)')
filesystem.add_option(
'--windows-filenames',
action='store_true', dest='windowsfilenames', default=False,
action='store_true', dest='windowsfilenames', default=None,
help='Force filenames to be Windows-compatible')
filesystem.add_option(
'--no-windows-filenames',
action='store_false', dest='windowsfilenames',
help='Make filenames Windows-compatible only if using Windows (default)')
help='Sanitize filenames only minimally')
filesystem.add_option(
'--trim-filenames', '--trim-file-names', metavar='LENGTH',
dest='trim_file_name', default=0, type=int,

View File

@@ -65,9 +65,14 @@ def _get_variant_and_executable_path():
machine = '_legacy' if version_tuple(platform.mac_ver()[0]) < (10, 15) else ''
else:
machine = f'_{platform.machine().lower()}'
is_64bits = sys.maxsize > 2**32
# Ref: https://en.wikipedia.org/wiki/Uname#Examples
if machine[1:] in ('x86', 'x86_64', 'amd64', 'i386', 'i686'):
machine = '_x86' if platform.architecture()[0][:2] == '32' else ''
machine = '_x86' if not is_64bits else ''
# platform.machine() on 32-bit raspbian OS may return 'aarch64', so check "64-bitness"
# See: https://github.com/yt-dlp/yt-dlp/issues/11813
elif machine[1:] == 'aarch64' and not is_64bits:
machine = '_armv7l'
# sys.executable returns a /tmp/ path for staticx builds (linux_static)
# Ref: https://staticx.readthedocs.io/en/latest/usage.html#run-time-information
if static_exe_path := os.getenv('STATICX_PROG_PATH'):
@@ -525,11 +530,16 @@ class Updater:
@functools.cached_property
def cmd(self):
"""The command-line to run the executable, if known"""
argv = None
# There is no sys.orig_argv in py < 3.10. Also, it can be [] when frozen
if getattr(sys, 'orig_argv', None):
return sys.orig_argv
argv = sys.orig_argv
elif getattr(sys, 'frozen', False):
return sys.argv
argv = sys.argv
# linux_static exe's argv[0] will be /tmp/staticx-NNNN/yt-dlp_linux if we don't fixup here
if argv and os.getenv('STATICX_PROG_PATH'):
argv = [self.filename, *argv[1:]]
return argv
def restart(self):
"""Restart the executable"""

View File

@@ -1,8 +1,8 @@
# Autogenerated by devscripts/update-version.py
__version__ = '2024.12.06'
__version__ = '2024.12.23'
RELEASE_GIT_HEAD = '4bd2655398aed450456197a6767639114a24eac2'
RELEASE_GIT_HEAD = '65cf46cddd873fd229dbb0fc0689bca4c201c6b6'
VARIANT = None
@@ -12,4 +12,4 @@ CHANNEL = 'stable'
ORIGIN = 'yt-dlp/yt-dlp'
_pkg_version = '2024.12.06'
_pkg_version = '2024.12.23'